|
1 | 1 | \ingroup module_hidden
|
2 | 2 | \defgroup goto-programs goto-programs
|
| 3 | + |
3 | 4 | # Folder goto-programs
|
4 | 5 |
|
| 6 | +\author Kareem Khazem, Martin Brain |
| 7 | + |
| 8 | +\section overview Overview |
| 9 | +Goto programs are the intermediate representation of the CPROVER tool |
| 10 | +chain. They are language independent and similar to many of the compiler |
| 11 | +intermediate languages. Section \ref goto-programs "goto-programs" describes the |
| 12 | +`goto_programt` and `goto_functionst` data structures in detail. However |
| 13 | +it useful to understand some of the basic concepts. Each function is a |
| 14 | +list of instructions, each of which has a type (one of 18 kinds of |
| 15 | +instruction), a code expression, a guard expression and potentially some |
| 16 | +targets for the next instruction. They are not natively in static |
| 17 | +single-assign (SSA) form. Transitions are nondeterministic (although in |
| 18 | +practise the guards on the transitions normally cover form a disjoint |
| 19 | +cover of all possibilities). Local variables have non-deterministic |
| 20 | +values if they are not initialised. Variables and data within the |
| 21 | +program is commonly one of three types (parameterised by width): |
| 22 | +`unsignedbv_typet`, `signedbv_typet` and `floatbv_typet`, see |
| 23 | +`util/std_types.h` for more information. Goto programs can be serialised |
| 24 | +in a binary (wrapped in ELF headers) format or in XML (see the various |
| 25 | +`_serialization` files). |
| 26 | + |
| 27 | +The `cbmc` option `–show-goto-programs` is often a good starting point |
| 28 | +as it outputs goto-programs in a human readable form. However there are |
| 29 | +a few things to be aware of. Functions have an internal name (for |
| 30 | +example `c::f00`) and a ‘pretty name’ (for example `f00`) and which is |
| 31 | +used depends on whether it is internal or being presented to the user. |
| 32 | +The `main` method is the ‘logical’ main which is not necessarily the |
| 33 | +main method from the code. In the output `NONDET` is use to represent a |
| 34 | +nondeterministic assignment to a variable. Likewise `IF` as a beautified |
| 35 | +`GOTO` instruction where the guard expression is used as the condition. |
| 36 | +`RETURN` instructions may be dropped if they precede an `END_FUNCTION` |
| 37 | +instruction. The comment lines are generated from the `locationt` field |
| 38 | +of the `instructiont` structure. |
| 39 | + |
| 40 | +`goto-programs/` is one of the few places in the CPROVER codebase that |
| 41 | +templates are used. The intention is to allow the general architecture |
| 42 | +of program and functions to be used for other formalisms. At the moment |
| 43 | +most of the templates have a single instantiation; for example |
| 44 | +`goto_functionst` and `goto_function_templatet` and `goto_programt` and |
| 45 | +`goto_program_templatet`. |
| 46 | + |
| 47 | +\section data_structures Data Structures |
| 48 | + |
| 49 | +FIXME: This text is partially outdated. |
| 50 | + |
| 51 | +The common starting point for working with goto-programs is the |
| 52 | +`read_goto_binary` function which populates an object of |
| 53 | +`goto_functionst` type. This is defined in `goto_functions.h` and is an |
| 54 | +instantiation of the template `goto_functions_templatet` which is |
| 55 | +contained in `goto_functions_template.h`. They are wrappers around a map |
| 56 | +from strings to `goto_programt`’s and iteration macros are provided. |
| 57 | +Note that `goto_function_templatet` (no `s`) is defined in the same |
| 58 | +header as `goto_functions_templatet` and is gives the C type for the |
| 59 | +function and Boolean which indicates whether the body is available |
| 60 | +(before linking this might not always be true). Also note the slightly |
| 61 | +counter-intuitive naming; `goto_functionst` instances are the top level |
| 62 | +structure representing the program and contain `goto_programt` instances |
| 63 | +which represent the individual functions. At the time of writing |
| 64 | +`goto_functionst` is the only instantiation of the template |
| 65 | +`goto_functions_templatet` but other could be produced if a different |
| 66 | +data-structures / kinds of models were needed for functions. |
| 67 | + |
| 68 | +`goto_programt` is also an instantiation of a template. In a similar |
| 69 | +fashion it is `goto_program_templatet` and allows the types of the guard |
| 70 | +and expression used in instructions to be parameterised. Again, this is |
| 71 | +currently the only use of the template. As such there are only really |
| 72 | +helper functions in `goto_program.h` and thus `goto_program_template.h` |
| 73 | +is probably the key file that describes the representation of (C) |
| 74 | +functions in the goto-program format. It is reasonably stable and |
| 75 | +reasonably documented and thus is a good place to start looking at the |
| 76 | +code. |
| 77 | + |
| 78 | +An instance of `goto_program_templatet` is effectively a list of |
| 79 | +instructions (and inner template called `instructiont`). It is important |
| 80 | +to use the copy and insertion functions that are provided as iterators |
| 81 | +are used to link instructions to their predecessors and targets and |
| 82 | +careless manipulation of the list could break these. Likewise there are |
| 83 | +helper macros for iterating over the instructions in an instance of |
| 84 | +`goto_program_templatet` and the use of these is good style and strongly |
| 85 | +encouraged. |
| 86 | + |
| 87 | +Individual instructions are instances of type `instructiont`. They |
| 88 | +represent one step in the function. Each has a type, an instance of |
| 89 | +`goto_program_instruction_typet` which denotes what kind of instruction |
| 90 | +it is. They can be computational (such as `ASSIGN` or `FUNCTION_CALL`), |
| 91 | +logical (such as `ASSUME` and `ASSERT`) or informational (such as |
| 92 | +`LOCATION` and `DEAD`). At the time of writing there are 18 possible |
| 93 | +values for `goto_program_instruction_typet` / kinds of instruction. |
| 94 | +Instructions also have a guard field (the condition under which it is |
| 95 | +executed) and a code field (what the instruction does). These may be |
| 96 | +empty depending on the kind of instruction. In the default |
| 97 | +instantiations these are of type `exprt` and `codet` respectively and |
| 98 | +thus covered by the previous discussion of `irept` and its descendents. |
| 99 | +The next instructions (remembering that transitions are guarded by |
| 100 | +non-deterministic) are given by the list `targets` (with the |
| 101 | +corresponding list of labels `labels`) and the corresponding set of |
| 102 | +previous instructions is get by `incoming_edges`. Finally `instructiont` |
| 103 | +have informational `function` and `location` fields that indicate where |
| 104 | +they are in the code. |
5 | 105 |
|
6 |
| -\author Kareem Khazem |
7 | 106 |
|
8 | 107 | \section goto-conversion Goto Conversion
|
9 | 108 |
|
|
0 commit comments