Skip to content

Commit 472f0ac

Browse files
Populating module documentation md files
by distributiong M. Brain's brief module descriptions from the old wiki
1 parent 2154f54 commit 472f0ac

File tree

24 files changed

+423
-544
lines changed

24 files changed

+423
-544
lines changed

doc/architectural/cprover-architecture-overview.md

+51-497
Large diffs are not rendered by default.

src/analyses/README.md

+4
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,10 @@
11
\ingroup module_hidden
22
\defgroup analyses analyses
3+
34
# Folder analyses
45

6+
This contains the abstract interpretation framework `ai.h` and several
7+
static analyses that instantiate it.
8+
59
FIXME: put here a good introduction describing what is contained
610
in this folder.

src/ansi-c/README.md

+41-22
Original file line numberDiff line numberDiff line change
@@ -2,30 +2,26 @@
22
\defgroup ansi-c ansi-c
33
# Folder ansi-c
44

5-
\author Kareem Khazem
5+
\author Kareem Khazem, Martin Brain
66

7-
CodeWarrior C Compilers Reference 3.2:
8-
9-
http://cache.freescale.com/files/soft_dev_tools/doc/ref_manual/CCOMPILERRM.pdf
7+
\section overview Overview
108

11-
http://cache.freescale.com/files/soft_dev_tools/doc/ref_manual/ASMX86RM.pdf
9+
Contains the front-end for ANSI C, plus a variety of common extensions.
10+
This parses the file, performs some basic sanity checks (this is one
11+
area in which the UI could be improved; patches most welcome) and then
12+
produces a goto-program (see below). The parser is a traditional Flex /
13+
Bison system.
1214

13-
ARM 4.1 Compiler Reference:
15+
`internal_addition.c` contains the implementation of various ‘magic’
16+
functions that are that allow control of the analysis from the source
17+
code level. These include assertions, assumptions, atomic blocks, memory
18+
fences and rounding modes.
1419

15-
http://infocenter.arm.com/help/topic/com.arm.doc.dui0491c/DUI0491C_arm_compiler_reference.pdf
16-
17-
18-
Parsing performance considerations:
19-
20-
* Measured on trunk/regression/ansi-c/windows_h_VS_2012/main.i
21-
22-
* 13%: Copying into i_preprocessed
23-
24-
* 5%: ansi_c_parser.read()
25-
26-
* 53%: yyansi_clex()
27-
28-
* 29%: parser (without typechecking)
20+
The `library/` subdirectory contains versions of some of the C standard
21+
header files that make use of the CPROVER built-in functions. This
22+
allows CPROVER programs to be ‘aware’ of the functionality and model it
23+
correctly. Examples include `stdio.c`, `string.c`, `setjmp.c` and
24+
various threading interfaces.
2925

3026
\section preprocessing Preprocessing & Parsing
3127

@@ -48,8 +44,6 @@ digraph G {
4844
\enddot
4945

5046

51-
52-
---
5347
\section type-checking Type-checking
5448

5549
In the \ref ansi-c and \ref java_bytecode directories.
@@ -136,3 +130,28 @@ called symbols. Thus, for example:
136130
parameter and return types of the function. The value of the symbol is
137131
the function's body (a \ref codet), and the symbol is stored in the
138132
symbol table with `foo` as the key.
133+
134+
135+
\section performance Parsing performance considerations
136+
137+
* Measured on trunk/regression/ansi-c/windows_h_VS_2012/main.i
138+
139+
* 13%: Copying into i_preprocessed
140+
141+
* 5%: ansi_c_parser.read()
142+
143+
* 53%: yyansi_clex()
144+
145+
* 29%: parser (without typechecking)
146+
147+
\section references Compiler References
148+
149+
CodeWarrior C Compilers Reference 3.2:
150+
151+
http://cache.freescale.com/files/soft_dev_tools/doc/ref_manual/CCOMPILERRM.pdf
152+
153+
http://cache.freescale.com/files/soft_dev_tools/doc/ref_manual/ASMX86RM.pdf
154+
155+
ARM 4.1 Compiler Reference:
156+
157+
http://infocenter.arm.com/help/topic/com.arm.doc.dui0491c/DUI0491C_arm_compiler_reference.pdf

src/big-int/README.md

+3
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,10 @@
11
\ingroup module_hidden
22
\defgroup big-int big-int
3+
34
# Folder big-int
45

6+
\author Martin Brain
7+
58
CPROVER is distributed with its own multi-precision arithmetic library;
69
mainly for historical and portability reasons. The library is externally
710
developed and thus `big-int` contains the source as it is distributed.

src/cbmc/README.md

+6-1
Original file line numberDiff line numberDiff line change
@@ -2,4 +2,9 @@
22
\defgroup cbmc cbmc
33
# Folder CBMC
44

5-
The CBMC handles the code related to interacting with CBMC.
5+
This contains the first full application. CBMC is a bounded model
6+
checker that uses the front ends (`ansi-c`, `cpp`, goto-program or
7+
others) to create a goto-program, `goto-symex` to unwind the loops the
8+
given number of times and to produce and equation system and finally
9+
`solvers` to find a counter-example (technically, `goto-symex` is then
10+
used to construct the counter-example trace).

src/cpp/README.md

+13-1
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,17 @@
11
\ingroup module_hidden
22
\defgroup cpp cpp
3+
34
# Folder cpp
45

5-
The C++ Language front-end is for processing C++.
6+
\author Martin Brain
7+
8+
This directory contains the C++ front-end. It supports the subset of C++
9+
commonly found in embedded and system applications. Consequentially it
10+
doesn’t have full support for templates and many of the more advanced
11+
and obscure C++ features. The subset of the language that can be handled
12+
is being extended over time so bug reports of programs that cannot be
13+
parsed are useful.
14+
15+
The functionality is very similar to the ANSI C front end; parsing the
16+
code and converting to goto-programs. It makes use of code from
17+
`langapi` and `ansi-c`.

src/goto-analyzer/README.md

+4-2
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,8 @@
11
\ingroup module_hidden
22
\defgroup goto-analyzer goto-analyzer
3+
34
# Folder goto-analyzer
45

5-
`goto-analyzer/` is a module stores information related to interacting with
6-
goto-analyzer. These files are medium risk to change and change frequently.
6+
`goto-analyzer/` is a tool performing static analyses on goto
7+
programs. It provides the front end for many of the static analyses
8+
in the \ref analyses directory.

src/goto-cc/README.md

+4
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,10 @@
11
\ingroup module_hidden
22
\defgroup goto-cc goto-cc
3+
34
# Folder goto-cc
45

6+
\author Martin Brain
7+
58
`goto-cc` is a compiler replacement that just performs the first step of
69
the process; converting C or C++ programs to goto-binaries. It is
710
intended to be dropped in to an existing build procedure in place of the
@@ -11,3 +14,4 @@ the `goto-cc/` binary. If it is called `goto-cc` then it emulates GCC
1114
flags, `goto-armcc` emulates the ARM compiler, `goto-cl` emulates VCC
1215
and `goto-cw` emulates the Code Warrior compiler. The output of this
1316
tool can then be used with `cbmc` or `goto-instrument`.
17+

src/goto-diff/README.md

+4-2
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,9 @@
11
\ingroup module_hidden
22
\defgroup goto-diff goto-diff
3+
34
# Folder goto-diff
45

6+
`goto-diff/` is a tool that offers functionality similar to the `diff`
7+
tool, but for GOTO programs.
8+
59

6-
`goto-diff/` is a module has files which change frequently and are medium
7-
risk.

src/goto-instrument/README.md

+3
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,10 @@
11
\ingroup module_hidden
22
\defgroup goto-instrument goto-instrument
3+
34
# Folder goto-instrument
45

6+
\author Martin Brain
7+
58
The `goto-instrument/` directory contains a number of tools, one per
69
file, that are built into the `goto-instrument` program. All of them
710
take in a goto-program (produced by `goto-cc`) and either modify it or

src/goto-programs/README.md

+100-1
Original file line numberDiff line numberDiff line change
@@ -1,9 +1,108 @@
11
\ingroup module_hidden
22
\defgroup goto-programs goto-programs
3+
34
# Folder goto-programs
45

6+
\author Kareem Khazem, Martin Brain
7+
8+
\section overview Overview
9+
Goto programs are the intermediate representation of the CPROVER tool
10+
chain. They are language independent and similar to many of the compiler
11+
intermediate languages. Section \ref goto-programs "goto-programs" describes the
12+
`goto_programt` and `goto_functionst` data structures in detail. However
13+
it useful to understand some of the basic concepts. Each function is a
14+
list of instructions, each of which has a type (one of 18 kinds of
15+
instruction), a code expression, a guard expression and potentially some
16+
targets for the next instruction. They are not natively in static
17+
single-assign (SSA) form. Transitions are nondeterministic (although in
18+
practise the guards on the transitions normally cover form a disjoint
19+
cover of all possibilities). Local variables have non-deterministic
20+
values if they are not initialised. Variables and data within the
21+
program is commonly one of three types (parameterised by width):
22+
`unsignedbv_typet`, `signedbv_typet` and `floatbv_typet`, see
23+
`util/std_types.h` for more information. Goto programs can be serialised
24+
in a binary (wrapped in ELF headers) format or in XML (see the various
25+
`_serialization` files).
26+
27+
The `cbmc` option `–show-goto-programs` is often a good starting point
28+
as it outputs goto-programs in a human readable form. However there are
29+
a few things to be aware of. Functions have an internal name (for
30+
example `c::f00`) and a ‘pretty name’ (for example `f00`) and which is
31+
used depends on whether it is internal or being presented to the user.
32+
The `main` method is the ‘logical’ main which is not necessarily the
33+
main method from the code. In the output `NONDET` is use to represent a
34+
nondeterministic assignment to a variable. Likewise `IF` as a beautified
35+
`GOTO` instruction where the guard expression is used as the condition.
36+
`RETURN` instructions may be dropped if they precede an `END_FUNCTION`
37+
instruction. The comment lines are generated from the `locationt` field
38+
of the `instructiont` structure.
39+
40+
`goto-programs/` is one of the few places in the CPROVER codebase that
41+
templates are used. The intention is to allow the general architecture
42+
of program and functions to be used for other formalisms. At the moment
43+
most of the templates have a single instantiation; for example
44+
`goto_functionst` and `goto_function_templatet` and `goto_programt` and
45+
`goto_program_templatet`.
46+
47+
\section data_structures Data Structures
48+
49+
FIXME: This text is partially outdated.
50+
51+
The common starting point for working with goto-programs is the
52+
`read_goto_binary` function which populates an object of
53+
`goto_functionst` type. This is defined in `goto_functions.h` and is an
54+
instantiation of the template `goto_functions_templatet` which is
55+
contained in `goto_functions_template.h`. They are wrappers around a map
56+
from strings to `goto_programt`’s and iteration macros are provided.
57+
Note that `goto_function_templatet` (no `s`) is defined in the same
58+
header as `goto_functions_templatet` and is gives the C type for the
59+
function and Boolean which indicates whether the body is available
60+
(before linking this might not always be true). Also note the slightly
61+
counter-intuitive naming; `goto_functionst` instances are the top level
62+
structure representing the program and contain `goto_programt` instances
63+
which represent the individual functions. At the time of writing
64+
`goto_functionst` is the only instantiation of the template
65+
`goto_functions_templatet` but other could be produced if a different
66+
data-structures / kinds of models were needed for functions.
67+
68+
`goto_programt` is also an instantiation of a template. In a similar
69+
fashion it is `goto_program_templatet` and allows the types of the guard
70+
and expression used in instructions to be parameterised. Again, this is
71+
currently the only use of the template. As such there are only really
72+
helper functions in `goto_program.h` and thus `goto_program_template.h`
73+
is probably the key file that describes the representation of (C)
74+
functions in the goto-program format. It is reasonably stable and
75+
reasonably documented and thus is a good place to start looking at the
76+
code.
77+
78+
An instance of `goto_program_templatet` is effectively a list of
79+
instructions (and inner template called `instructiont`). It is important
80+
to use the copy and insertion functions that are provided as iterators
81+
are used to link instructions to their predecessors and targets and
82+
careless manipulation of the list could break these. Likewise there are
83+
helper macros for iterating over the instructions in an instance of
84+
`goto_program_templatet` and the use of these is good style and strongly
85+
encouraged.
86+
87+
Individual instructions are instances of type `instructiont`. They
88+
represent one step in the function. Each has a type, an instance of
89+
`goto_program_instruction_typet` which denotes what kind of instruction
90+
it is. They can be computational (such as `ASSIGN` or `FUNCTION_CALL`),
91+
logical (such as `ASSUME` and `ASSERT`) or informational (such as
92+
`LOCATION` and `DEAD`). At the time of writing there are 18 possible
93+
values for `goto_program_instruction_typet` / kinds of instruction.
94+
Instructions also have a guard field (the condition under which it is
95+
executed) and a code field (what the instruction does). These may be
96+
empty depending on the kind of instruction. In the default
97+
instantiations these are of type `exprt` and `codet` respectively and
98+
thus covered by the previous discussion of `irept` and its descendents.
99+
The next instructions (remembering that transitions are guarded by
100+
non-deterministic) are given by the list `targets` (with the
101+
corresponding list of labels `labels`) and the corresponding set of
102+
previous instructions is get by `incoming_edges`. Finally `instructiont`
103+
have informational `function` and `location` fields that indicate where
104+
they are in the code.
5105

6-
\author Kareem Khazem
7106

8107
\section goto-conversion Goto Conversion
9108

src/goto-symex/README.md

+18-2
Original file line numberDiff line numberDiff line change
@@ -1,9 +1,25 @@
11
\ingroup module_hidden
22
\defgroup goto-symex goto-symex
3-
# Folder goto-symex
43

4+
# Folder goto-symex
55

6-
\author Kareem Khazem
6+
\author Kareem Khazem, Martin Brain
7+
8+
This directory contains a symbolic evaluation system for goto-programs.
9+
This takes a goto-program and translates it to an equation system by
10+
traversing the program, branching and merging and unwinding loops as
11+
needed. Each reverse goto has a separate counter (the actual counting is
12+
handled by `cbmc`, see the `–unwind` and `–unwind-set` options). When a
13+
counter limit is reach, an assertion can be added to explicitly show
14+
when analysis is incomplete. The symbolic execution includes constant
15+
folding so loops that have a constant number of iterations will be
16+
handled completely (assuming the unwinding limit is sufficient).
17+
18+
The output of the symbolic execution is a system of equations; an object
19+
containing a list of `symex_target_elements`, each of which are
20+
equalities between `expr` expressions. See `symex_target_equation.h`.
21+
The output is in static, single assignment (SSA) form, which is *not*
22+
the case for goto-programs.
723

824
\section symbolic-execution Symbolic Execution
925

src/java_bytecode/README.md

+2-1
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,7 @@
11
\ingroup module_hidden
22
\defgroup java_bytecode java_bytecode
3+
34
# Folder java_bytecode
45

56

6-
This module provide a front end for Java.
7+
This module provides a bytecode-based front end for Java.

src/jsil/README.md

+2-2
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
\ingroup module_hidden
22
\defgroup jsil jsil
3-
# Folder jsil
43

4+
# Folder jsil
55

6-
`jsil/` is a module that focuses on type checking.
6+
`jsil/` contains a JavaScript front end.

src/json/README.md

+1-1
Original file line numberDiff line numberDiff line change
@@ -2,4 +2,4 @@
22
\defgroup json json
33
# Folder json
44

5-
`json/` is a utility that processes json.
5+
`json/` contains a JSON parser.

src/langapi/README.md

+7-1
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,12 @@
11
\ingroup module_hidden
22
\defgroup langapi langapi
3+
34
# Folder langapi
45

6+
\author Martin Brain
57

6-
`langapi/` is a language front end.
8+
`langapi/` contains the basic interfaces and support classes for programming
9+
language front ends. Developers only really need look at this if they
10+
are adding support for a new language. It’s main users are the
11+
language front-ends such as `ansi-c/` and
12+
`cpp/`.

src/linking/README.md

+7-1
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,11 @@
11
\ingroup module_hidden
22
\defgroup linking linking
3+
34
# Folder linking
45

5-
linking docs: todo
6+
\author Martin Brain
7+
8+
This allows multiple ‘object
9+
files’ (goto-programs) to be linked into one ‘executable’ (another
10+
goto-program), thus allowing existing build systems to be used to build
11+
complete goto-program binaries.

src/memory-models/README.md

+2-2
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
\ingroup module_hidden
22
\defgroup memory-models memory-models
3-
# Folder memory-models
43

4+
# Folder memory-models
55

6-
`memory-models` is a tool that works with memory.
6+
`memory-models` contains tools related to weak memory models.

src/miniz/README.md

+2-2
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
\ingroup module_hidden
22
\defgroup miniz miniz
3-
Folder miniz
43

4+
# Folder miniz
55

6-
`miniz/` is a utility for minimizing things.
6+
`miniz/` contains a minimal ZIP compression library.

0 commit comments

Comments
 (0)