Skip to content

Commit 424ef50

Browse files
author
Daniel Kroening
authored
Merge pull request #1090 from karkhaz/kk-neu-documentation
CBMC documentation campaign
2 parents 4577401 + 58f6dde commit 424ef50

File tree

163 files changed

+4420
-14314
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

163 files changed

+4420
-14314
lines changed

doc/CPPLINT.cfg

+1
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
exclude_files=.*

doc/architectural/cbmc-guide.md

+601
Large diffs are not rendered by default.

doc/architectural/front-page.md

+46
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,46 @@
1+
CProver Documentation
2+
=====================
3+
4+
\author Kareem Khazem
5+
6+
These pages contain user tutorials, automatically-generated API
7+
documentation, and higher-level architectural overviews for the
8+
CProver codebase. Users can download CProver tools from the
9+
<a href="http://www.cprover.org/">CProver website</a>; contributors
10+
should use the <a href="https://github.com/diffblue/cbmc">repository</a>
11+
hosted on GitHub.
12+
13+
### For users:
14+
15+
* The \ref cprover-manual "CProver Manual" details the capabilities of
16+
CBMC and SATABS and describes how to install and use these tools. It
17+
also covers the underlying theory and prerequisite concepts behind how
18+
these tools work.
19+
20+
### For contributors:
21+
22+
* If you already know exactly what you're looking for, the API reference
23+
is generated from the codebase. You can search for classes and class
24+
members in the search bar at top-right or use one of the links in the
25+
sidebar.
26+
27+
* For higher-level architectural information, each of the pages under
28+
the "Modules" link in the sidebar gives an overview of a directory in
29+
the CProver codebase.
30+
31+
* The \ref module_cbmc "CBMC guided tour" is a good start for new
32+
contributors to CBMC. It describes the stages through which CBMC
33+
transforms source files into bug reports and counterexamples, linking
34+
to the relevant documentation for each stage.
35+
36+
* The \subpage cbmc-hacking "CBMC hacking HOWTO" helps new contributors
37+
to CProver to get their feet wet through a series of programming
38+
exercises---mostly modifying goto-instrument, and thus learning to
39+
manipulate the main data structures used within CBMC.
40+
41+
* The \subpage cbmc-guide "CBMC guide" is a single document describing
42+
the layout of the codebase and many of the important data structures.
43+
It probably contains more information than the module pages at the
44+
moment, but may be somewhat out-of-date.
45+
46+
\defgroup module_hidden _hidden

doc/architectural/howto.md

+243
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,243 @@
1+
\ingroup module_hidden
2+
\page cbmc-hacking CBMC Hacking HOWTO
3+
4+
\author Kareem Khazem
5+
6+
This is an introduction to hacking on the `cprover` codebase. It is not
7+
intended as a user guide to `CBMC` or related tools. It is structured
8+
as a series of programming exercises that aim to acclimatise the reader
9+
to the basic data structures and workflow needed for contributing to
10+
`CBMC`.
11+
12+
13+
## Initial setup
14+
15+
Clone the [CBMC repository][cbmc-repo] and build it:
16+
17+
git clone https://github.com/diffblue/cbmc.git
18+
cd cbmc/src
19+
make minisat2-download
20+
make
21+
22+
Ensure that [graphviz][graphviz] is installed on your system (in
23+
particular, you should be able to run a program called `dot`). Install
24+
[Doxygen][doxygen] and generate doxygen documentation:
25+
26+
# In the src directory
27+
doxygen doxyfile
28+
# View the documentation in a web browser
29+
firefox doxy/html/index.html
30+
31+
If you've never used doxygen documentation before, get familiar with the
32+
layout. Open the generated HTML page in a web browser; search for the
33+
class `goto_programt` in the search bar, and jump to the documentation
34+
for that class; and read through the copious documentation.
35+
36+
The build writes executable programs into several of the source
37+
directories. In this tutorial, we'll be using binaries inside the
38+
`cbmc`, `goto-instrument`, and `goto-cc` directories. Add these
39+
directories to your `$PATH`:
40+
41+
# Assuming you cloned CBMC into ~/code
42+
export PATH=~/code/cbmc/src/goto-instrument:~/code/cbmc/src/goto-cc:~/code/cbmc/src/cbmc:$PATH
43+
# Add to your shell's startup configuration file so that you don't have to run that command every time.
44+
echo 'export PATH=~/code/cbmc/src/goto-instrument:~/code/cbmc/src/goto-cc:~/code/cbmc/src/cbmc:$PATH' >> .bashrc
45+
46+
Optional: install an image viewer that can read images on stdin.
47+
I use [feh][feh].
48+
49+
[cbmc-repo]: https://github.com/diffblue/cbmc/
50+
[doxygen]: http://www.stack.nl/~dimitri/doxygen/
51+
[graphviz]: http://www.graphviz.org/
52+
[feh]: https://feh.finalrewind.org/
53+
54+
55+
56+
## Whirlwind tour of the tools
57+
58+
CBMC's code is located under the `cbmc` directory. Even if you plan to
59+
contribute only to CBMC, it is important to be familiar with several
60+
other of cprover's auxiliary tools.
61+
62+
63+
### Compiling with `goto-cc`
64+
65+
There should be an executable file called `goto-cc` in the `goto-cc`
66+
directory; make a symbolic link to it called `goto-gcc`:
67+
68+
cd cbmc/src/goto-cc
69+
ln -s "$(pwd)/goto-cc" goto-gcc
70+
71+
Find or write a moderately-interesting C program; we'll call it `main.c`.
72+
Run the following commands:
73+
74+
goto-gcc -o main.goto main.c
75+
cc -o main.exe main.c
76+
77+
Invoke `./main.goto` and `./main.exe` and observe that they run identically.
78+
The version that was compiled with `goto-gcc` is larger, though:
79+
80+
du -hs *.{goto,exe}
81+
82+
Programs compiled with `goto-gcc` are mostly identical to their `clang`-
83+
or `gcc`-compiled counterparts, but contain additional object code in
84+
cprover's intermediate representation. The intermediate representation
85+
is (informally) called a *goto-program*.
86+
87+
88+
### Viewing goto-programs
89+
90+
`goto-instrument` is a Swiss army knife for viewing goto-programs and
91+
performing single program analyses on them. Run the following command:
92+
93+
goto-instrument --show-goto-functions main.goto
94+
95+
Many of the instructions in the goto-program intermediate representation
96+
are similar to their C counterparts. `if` and `goto` statements replace
97+
structured programming constructs.
98+
99+
Find or write a small C program (2 or 3 functions, each containing a few
100+
varied statements). Compile it using `goto-gcc` as above into an object
101+
file called `main`. If you installed `feh`, try the following command
102+
to dump a control-flow graph:
103+
104+
goto-instrument --dot main | tail -n +2 | dot -Tpng | feh -
105+
106+
If you didn't install `feh`, you can write the diagram to the file and
107+
then view it:
108+
109+
goto-instrument --dot main | tail -n +2 | dot -Tpng > main.png
110+
Now open main.png with an image viewer
111+
112+
(the invocation of `tail` is used to filter out the first line of
113+
`goto-instrument` output. If `goto-instrument` writes more or less
114+
debug output by the time you read this, read the output of
115+
`goto-instrument --dot main` and change the invocation of `tail`
116+
accordingly.)
117+
118+
There are a few other views of goto-programs. Run `goto-instrument -h`
119+
and try the various switches under the "Diagnosis" section.
120+
121+
122+
123+
## Learning about goto-programs
124+
125+
In this section, you will learn about the basic goto-program data
126+
structures. Reading from and manipulating these data structures form
127+
the core of writing an analysis for CBMC.
128+
129+
130+
### First steps with `goto-instrument`
131+
132+
<div class=memdoc>
133+
**Task:** Write a simple C program with a few functions, each containing
134+
a few statements. Compile the program with `goto-gcc` into a binary
135+
called `main`.
136+
</div>
137+
138+
139+
The entry point of `goto-instrument` is in `goto_instrument_main.cpp`.
140+
Follow the control flow into `goto_instrument_parse_optionst::doit()`, located in `goto_instrument_parse_options.cpp`.
141+
At some point in that function, there will be a long sequence of `if` statements.
142+
143+
<div class=memdoc>
144+
**Task:** Add a `--greet` switch to `goto-instrument`, taking an optional
145+
argument, with the following behaviour:
146+
147+
$ goto-instrument --greet main
148+
hello, world!
149+
$ goto-instrument --greet Leperina main
150+
hello, Leperina!
151+
152+
You will also need to add the `greet` option to the
153+
`goto_instrument_parse_options.h` file in order for this to work.
154+
Notice that in the `.h` file, options that take an argument are followed
155+
by a colon (like `(property):`), while simple switches have no colon.
156+
Make sure that you `return 0;` after printing the message.
157+
</div>
158+
159+
The idea behind `goto-instrument` is that it parses a goto-program and
160+
then performs one single analysis on that goto-program, and then
161+
returns. Each of the switches in `doit` function of
162+
`goto_instrument_parse_options` does something different with the
163+
goto-program that was supplied on the command line.
164+
165+
166+
### Goto-program basics
167+
168+
At this point in `goto-instrument_parse_options` (where the `if`
169+
statements are), the goto-program will have been loaded into the object
170+
`goto_functions`, of type `goto_functionst`. This has a field called
171+
`function_map`, a map from function names to functions.
172+
173+
174+
<div class="memdoc">
175+
**Task:** Add a `--print-function-names` switch to `goto-instrument`
176+
that prints out the name of every function in the goto-binary. Are
177+
there any functions that you didn't expect to see?
178+
</div>
179+
180+
The following is quite difficult to follow from doxygen, but: the value
181+
type of `function_map` is `goto_function_templatet<goto_programt>`.
182+
183+
184+
<div class=memdoc>
185+
**Task:** Read the documentation for `goto_function_templatet<bodyT>`
186+
and `goto_programt`.
187+
</div>
188+
189+
Each goto_programt object contains a list of
190+
\ref goto_program_templatet::instructiont called
191+
`instructions`. Each instruction has a field called `code`, which has
192+
type \ref codet.
193+
194+
<div class=memdoc>
195+
**Task:** Add a `--pretty-program` switch to `goto-instrument`. This
196+
switch should use the `codet::pretty()` function to pretty-print every
197+
\ref codet in the entire program. The strings that `pretty()` generates
198+
for a codet look like this:
199+
200+
index
201+
* type: unsignedbv
202+
* width: 8
203+
* #c_type: char
204+
0: symbol
205+
* type: array
206+
* size: nil
207+
* type:
208+
* #source_location:
209+
* file: src/main.c
210+
* line: 18
211+
* function:
212+
* working_directory: /some/dir
213+
0: unsignedbv
214+
* width: 8
215+
* #c_type: char
216+
...
217+
</div>
218+
219+
The sub-nodes of a particular node in the pretty representation are
220+
numbered, starting from 0. They can be accessed through the `op0()`,
221+
`op1()` and `op2()` methods in the `exprt` class.
222+
223+
Every node in the pretty representation has an identifier, accessed
224+
through the `id()` function. The file `util/irep_ids.def` lists the
225+
possible values of these identifiers; have a quick scan through that
226+
file. In the pretty representation above, the following facts are true
227+
of that particular node:
228+
229+
- `node.id() == ID_index`
230+
- `node.type().id() == ID_unsignedbv`
231+
- `node.op0().id() == ID_symbol`
232+
- `node.op0().type().id() == ID_array`
233+
234+
The fact that the `op0()` child has a `symbol` ID menas that you could
235+
cast it to a `symbol_exprt` (which is a subtype of `exprt`) using the
236+
function `to_symbol_expr`.
237+
238+
<div class=memdoc>
239+
**Task:** Add flags to `goto-instrument` to print out the following information:
240+
* the name of every function that is *called* in the program;
241+
* the value of every constant in the program;
242+
* the value of every symbol in the program.
243+
</div>
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.

0 commit comments

Comments
 (0)