Proposal of Interface between Kani/CProver tools #7042


Closed
NlightNFotis opened this issue Jul 28, 2022 · 36 comments
Assignees
Labels
aws Bugs or features of importance to AWS CBMC users aws-high Kani Bugs or features of importance to Kani Rust Verifier RFC Request for comment

Comments

@NlightNFotis
Contributor

NlightNFotis commented Jul 28, 2022

Hello,

We are looking to solicit some feedback from the Kani Team for the following proposal
on future integration between Kani and CBMC.

Do let us know about any inaccuracies in the description of the status quo, or any clarifications/enhancements to the proposals.

Thank you,


Current Integration between Kani and CBMC

The bulk of the code of interest lives under kani/kani-driver. The following steps are traced from kani/kani-driver/src/main.rs.

  1. Translate .rs file into *.symtab.json
    1. cargo_build in kani/kani-driver/src/call_cargo.rs
  2. Iterate through all the *.symtab.json files produced in step 1, and convert them (link all of the symbol table files) into *.symtab.out by calling symtab2gb.
    1. In symbol_table_to_gotoc at kani/kani-driver/src/call_symtab.rs
    2. The output at this level is one (or possibly multiple) GOTO-binaries.
  3. Link the binaries produced at step 2, into one GOTO-binary.
    1. In link_goto_binary at kani/kani-driver/src/call_goto_cc.rs.
    2. The output at this level is a single GOTO-binary.
  4. Perform instrumentation (with goto-instrument) run of file
    1. But after a function has been set as the entry point - designated harness - by goto-cc
    2. In run_goto_instrument at kani/kani-driver/src/call_goto_instrument.rs
    3. Perform goto-model validation, then perform series of instrumentations (drop unused functions, rewrite back edges, etc)
  5. Kani runs cbmc on binary (harness) and collects report
    1. At check_harness in kani/kani-driver/src/main.rs
    2. check_harness calls cbmc through call to run_cbmc of kani/kani-driver/src/call_cbmc.rs
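The five steps above can be sketched as the sequence of external process invocations that kani-driver effectively performs. Note the file names and exact flags below are illustrative assumptions, not copied from Kani's source:

```rust
// Sketch of the current Kani -> CBMC pipeline as external process calls.
// All paths and flags here are illustrative, not Kani's exact ones.
fn pipeline_commands(symtabs: &[&str], harness: &str) -> Vec<Vec<String>> {
    let mut cmds = Vec::new();
    // Step 2: convert each *.symtab.json into a GOTO binary via symtab2gb.
    for st in symtabs {
        cmds.push(vec![
            "symtab2gb".to_string(),
            st.to_string(),
            "--out".to_string(),
            st.replace(".symtab.json", ".symtab.out"),
        ]);
    }
    // Step 3: link all GOTO binaries into one, with goto-cc designating the harness
    // as the entry point (step 4.1).
    let mut link: Vec<String> = vec!["goto-cc".to_string()];
    link.extend(symtabs.iter().map(|st| st.replace(".symtab.json", ".symtab.out")));
    link.extend([
        "--function".to_string(),
        harness.to_string(),
        "-o".to_string(),
        "linked.out".to_string(),
    ]);
    cmds.push(link);
    // Step 4: instrument (drop unused functions, rewrite back edges, etc.).
    cmds.push(vec![
        "goto-instrument".to_string(),
        "--drop-unused-functions".to_string(),
        "linked.out".to_string(),
        "instrumented.out".to_string(),
    ]);
    // Step 5: run cbmc on the instrumented binary and collect the report.
    cmds.push(vec!["cbmc".to_string(), "instrumented.out".to_string()]);
    cmds
}

fn main() {
    for cmd in pipeline_commands(&["a.symtab.json", "b.symtab.json"], "my_harness") {
        println!("{}", cmd.join(" "));
    }
}
```

Each of these command lines corresponds to a separate process spawn today, which is exactly the overhead the proposal below aims to remove.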

Considerations

  1. The Kani Team probably want something like a Rust <-> C++ bridge, kind of like https://cxx.rs, so we can assume that they can call into the C++ functions.
    1. This works by building the C++ and Rust files, along with the bridge. Demo: https://github.com/dtolnay/cxx/tree/master/demo
    2. For a first draft API, we probably need to make these calls easier to enter into.
  2. goto_convert demands exposing the concepts of a symbol_tablet, goto_modelt, and message_handler
    1. Maybe not the message_handler, looks like an implementation detail that could be hidden.
    2. The goto-binary is effectively a serialised version of the goto_modelt
    3. Kani does have its own model of a CBMC symbol table at kani/cprover_bindings/src/goto_program/symbol_table.rs
  3. In theory, linking with goto-cc could be done programmatically, but at the moment it is too stateful (it depends on the state of compilet, which in turn depends on the initial configuration done by the goto-gcc personality).
  4. However, the call to goto-cc --function might be skippable if a call to ansi-c-entry-point can be made.
    1. A bit hard, as for now it also depends on configt, but in theory nothing binds it to it - could be refactored?
  5. Effectively, CBMC::do_it is performing some initialisation that all leads to a selection of a verifier, which is a multi or single-path-symex-checkert, then initialised with some options and the goto-model, and instructed to run, and report on the run results.
    1. The initialisation depends on a goto-program produced through get_goto_program, which in turn calls process_goto_program, which performs a number of transformations on the goto-program, for instance removing inline assembly or skips, performing a reachability slice, etc.
    2. Most of these transformations are also performed by goto-instrument, but there are some problems:
      1. Non-uniform interface between these transformation passes.
      2. Non-delineated boundaries between these transformations/unknown invariants between these different transformation passes.
  6. There are two options for linking. Either calling the linkingt class, like symtab2gb does, which operates on a symbol table level, or calling link_goto_model, which operates at the level of two goto-models.

Proposals

Long-term Goal

  • Standardisation of most operations around a goto-modelt.
    • cbmc, goto-instrument, etc., are going to be loading a goto-modelt in memory.
    • All transformations, analysis, linking, etc. will be working on the goto-modelt.
      • Auxiliary functionality to translate a symbol_tablet into a goto-modelt (through goto_convert, which is also going to be exposed) will enable current producers of symbol_tablet (that now depend on symtab2gb) to easily convert into a goto-modelt for further manipulation/transformations.
    • Subsequent transformations to the goto-modelt (say, for instance, remove_skips) are going to be maintained as a versioned patch-set over the initial model.
      • This would allow rollbacks of transformations to be both possible and efficient in the future.
      • This would also support scripting of the transformations and associated error handling without the need to start from scratch when a transformation has corrupted the goto-model.
  • All tools standardise their interface around a config and goto-modelt.
    • This allows external tools to invoke them with a provided configuration and goto-modelt.
  • This will allow the CPROVER tools to transition, on a long-term basis, into being a library around goto-modelt.
    • This would allow users to write their own transformations/tools over goto-modelt, both as parts of the CPROVER library, and as external tools.
      • The existing tools cbmc, goto-instrument, etc, could then be repurposed to be thin CLI wrappers around the library.
    • External tools will be able to load a goto-modelt via the library, and extract a new one (along with analysis/verification reports, traces, etc.) from it.
  • The clean-up of interfaces and internals of these tools will also offer the extra benefit of increased reliability/robustness for these tools when they are being used/invoked in binary form.
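To illustrate the "thin CLI wrapper around a library" idea, here is a minimal sketch. All types and function names (GotoModel, load_model, remove_unused, goto_instrument_main) are hypothetical placeholders, not existing CBMC symbols:

```rust
// Sketch: all real work lives in library functions over an in-memory model;
// the binary itself only parses options and dispatches. Names are invented.
#[derive(Debug, Clone)]
struct GotoModel {
    functions: Vec<String>, // placeholder for symbol table + goto-functions
}

fn load_model(_path: &str) -> GotoModel {
    // Stub: a real version would deserialise a GOTO binary from disk.
    GotoModel { functions: vec!["main".to_string(), "unused_stub".to_string()] }
}

// A library transformation pass, operating purely on the in-memory model.
fn remove_unused(model: &mut GotoModel, used: &[&str]) {
    model.functions.retain(|f| used.contains(&f.as_str()));
}

// The existing binaries become thin wrappers: parse options, load, call passes.
fn goto_instrument_main(args: &[&str]) -> GotoModel {
    let mut model = load_model(args.get(0).unwrap_or(&"a.out"));
    if args.contains(&"--drop-unused-functions") {
        remove_unused(&mut model, &["main"]);
    }
    model
}

fn main() {
    let model = goto_instrument_main(&["a.out", "--drop-unused-functions"]);
    println!("{:?}", model.functions);
}
```

The point of the sketch is that an external tool could call `load_model` and `remove_unused` directly, bypassing the CLI wrapper entirely.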

Step 1: API for invoking verification with CBMC

  • We will start moving towards the ideals expressed at the Long-term Goal above, by attempting to refactor cbmc (the binary) in a way that exposes a uniform verification interface.
    • Right now, cbmc's entry point is the function doit, which for the most part does configuration management (by parsing command line arguments, etc.) and loading/preprocessing of a goto-modelt, which it then passes on to a verifier (selected based on command line arguments).
    • The aim is to expose these verifiers through a more uniform interface, and make sure that they can be invoked on a goto-modelt and a configt passed to them, with no other dependencies.
  • This allows us to decouple the main analysis engine from global state (which it now depends on, set up while the cbmc binary bootstraps) and make it easier to invoke, both from within the CPROVER tools, and from external tools.
  • At the end of this step, we will have exposed cbmc in a way that enables programmatic access of the verification engine, instead of depending on the call to a binary version of it.
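A minimal sketch of what such a uniform verifier interface could look like. The real interface would be C++; Rust is used here for brevity, and GotoModel, Config, Verifier and SinglePathChecker are hypothetical stand-ins, not CBMC's actual classes:

```rust
// Hypothetical uniform verifier interface: any verifier is invoked with only
// a goto-model and a config, with no other (global) dependencies.
struct GotoModel;
struct Config {
    unwind: u32,
    multi_path: bool,
}

#[derive(Debug, PartialEq)]
enum Verdict {
    Success,
    Failure,
}

trait Verifier {
    fn run(&self, model: &GotoModel, config: &Config) -> Verdict;
}

struct SinglePathChecker;
impl Verifier for SinglePathChecker {
    fn run(&self, _model: &GotoModel, config: &Config) -> Verdict {
        // Stub: a real checker would run symbolic execution; here we only
        // demonstrate that everything it needs arrives via its parameters.
        if config.unwind > 0 { Verdict::Success } else { Verdict::Failure }
    }
}

// Verifier selection driven by the config, mirroring what doit does today
// based on command line arguments.
fn select_verifier(config: &Config) -> Box<dyn Verifier> {
    let _ = config.multi_path; // a multi-path checker would be picked here
    Box::new(SinglePathChecker)
}

fn main() {
    let config = Config { unwind: 10, multi_path: false };
    let verdict = select_verifier(&config).run(&GotoModel, &config);
    println!("{:?}", verdict);
}
```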

Step 2: API extended with goto-model transformations

  • As part of the first step, we will have characterised the schema of the goto-modelt required to be passed in to the verifiers.
    • This means that the minimum set of invariants required to analyse a goto-modelt are going to be enforced before any analysis.
  • This allows us to work backwards from that base, and start isolating other transformations on the goto-modelt (for example, the ones provided by goto-instrument) while always preserving the sequence of invariants demanded by these transformations, and ultimately, the verification engine.
    • We can also iron out the rough edges on their interfaces (make them uniform - right now some operate on the level of goto-functions, others on the level of goto-modelt, there are differences in the arities of these functions, etc).
  • At the end of this step, we will have similarly exposed goto-instrument in a way that enables performing the instrumentations that it provides in a programmatic way, without needing to depend on multiple calls to the binary.
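One way to make the pass interfaces uniform, sketched under the assumption that every pass takes and returns a whole model (names like remove_skips here are only loose analogues of the real passes):

```rust
// Sketch of a uniform transformation-pass interface: every pass has the same
// signature over the whole goto-model, so passes compose regardless of
// whether today they operate on goto-functions or on the model.
#[derive(Clone, Debug, PartialEq)]
struct GotoModel {
    instructions: Vec<String>,
}

type Pass = fn(GotoModel) -> Result<GotoModel, String>;

fn remove_skips(mut m: GotoModel) -> Result<GotoModel, String> {
    m.instructions.retain(|i| i != "SKIP");
    Ok(m)
}

fn rewrite_back_edges(m: GotoModel) -> Result<GotoModel, String> {
    Ok(m) // stub: a real pass would rewrite loop back edges here
}

// With uniform signatures, running a pipeline is just a fold over the model,
// and each pass can enforce its invariants before handing the model on.
fn run_passes(model: GotoModel, passes: &[Pass]) -> Result<GotoModel, String> {
    passes.iter().try_fold(model, |m, pass| pass(m))
}

fn main() {
    let model = GotoModel { instructions: vec!["DECL x".to_string(), "SKIP".to_string()] };
    let out = run_passes(model, &[remove_skips, rewrite_back_edges]).unwrap();
    println!("{:?}", out.instructions);
}
```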

Step 3: API extended with compilation and linking

  • After we can get all transformations to work and tools to be centred around a goto-modelt, we can then move on to adding construction of goto-modelts, with compilation and linking supported by an API.
  • At the end of this step, we will have exposed the core functionality provided by goto-cc so that it can be manipulated programmatically.
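The API surface this step would expose can be sketched as two calls, compile and link_models (these names are assumptions that mirror goto-cc's behaviour, not its real entry points):

```rust
// Hypothetical compile/link API: compile one source file into an in-memory
// model, then merge several models into one. Both functions are stubs.
#[derive(Debug, Clone)]
struct GotoModel {
    symbols: Vec<String>,
}

fn compile(source: &str) -> Result<GotoModel, String> {
    // Stub: a real implementation would run the front-end and goto_convert.
    Ok(GotoModel { symbols: vec![format!("sym_from_{source}")] })
}

fn link_models(models: Vec<GotoModel>) -> Result<GotoModel, String> {
    // Stub for linkingt / link_goto_model: merge symbol tables, deduplicating.
    let mut symbols = Vec::new();
    for m in models {
        for s in m.symbols {
            if !symbols.contains(&s) {
                symbols.push(s);
            }
        }
    }
    Ok(GotoModel { symbols })
}

fn main() {
    let models: Vec<_> = ["a.c", "b.c"].iter().map(|f| compile(f).unwrap()).collect();
    let linked = link_models(models).unwrap();
    println!("{:?}", linked.symbols);
}
```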

Step 4: API to enable programmatic construction of goto-modelt

  • At this step, we have all we need to handle compilation/linking/transformations of existing goto-modelts, so the next step is to provide an API that allows staged construction of a goto-modelt in a programmatic fashion.
    • The goto-modelt is effectively a symbol table and a CFG (in the form of goto-functionst) - the API would offer the capability to create a new goto-function, add goto-instructions to its body, and update the symbol table with the symbols it contains.
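A staged builder for this could look roughly as follows. All types here (GotoModel, FunctionBuilder) are hypothetical placeholders for the eventual API; the body is modelled as a sequence of instructions:

```rust
// Sketch of programmatic construction of a goto-model: create a function,
// append instructions to its body, and keep the symbol table in sync.
#[derive(Default, Debug)]
struct GotoModel {
    symbol_table: Vec<String>,
    functions: Vec<(String, Vec<String>)>, // (name, instruction sequence)
}

struct FunctionBuilder {
    name: String,
    body: Vec<String>,
}

impl FunctionBuilder {
    fn new(name: &str) -> Self {
        FunctionBuilder { name: name.to_string(), body: Vec::new() }
    }

    fn instruction(mut self, ins: &str) -> Self {
        self.body.push(ins.to_string());
        self
    }

    // Adding the function also registers its symbol, so the symbol table and
    // the goto-functions can never drift apart.
    fn add_to(self, model: &mut GotoModel) {
        model.symbol_table.push(self.name.clone());
        model.functions.push((self.name, self.body));
    }
}

fn main() {
    let mut model = GotoModel::default();
    FunctionBuilder::new("harness")
        .instruction("DECL x")
        .instruction("ASSERT x == x")
        .add_to(&mut model);
    println!("{:?}", model);
}
```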
@NlightNFotis NlightNFotis added the RFC Request for comment label Jul 28, 2022
@NlightNFotis NlightNFotis self-assigned this Jul 28, 2022
@NlightNFotis
Contributor Author

Tagging some people who might be interested in the above RFC:

@danielsn @tautschnig @feliperodri @jimgrundy

@tautschnig tautschnig added the Kani Bugs or features of importance to Kani Rust Verifier label Jul 28, 2022
@giltho

giltho commented Jul 28, 2022

Hi,
Writing this comment as an outsider, but it could be worth writing a C API (i.e just a thin C layer on top of the current code), instead of interfacing C++ with Rust directly.
It would enable this same API to be used with a lot more different languages, and therefore possibly also by other tools?
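For illustration, such a thin C layer reduces everything to opaque handles and free functions. In this self-contained sketch the "C API" is mocked in Rust itself with extern "C" functions; all function names are invented:

```rust
// Mock of what a thin C layer over the C++ code would look like to a
// consumer: an opaque pointer plus plain functions, no classes, no STL.
// A real build would export these symbols from the C++ side instead.
use std::os::raw::c_int;

struct GotoModel {
    n_functions: c_int, // stands in for the real C++ object behind the handle
}

extern "C" fn cprover_load_model(n_functions: c_int) -> *mut GotoModel {
    Box::into_raw(Box::new(GotoModel { n_functions }))
}

extern "C" fn cprover_model_function_count(model: *const GotoModel) -> c_int {
    unsafe { (*model).n_functions }
}

extern "C" fn cprover_free_model(model: *mut GotoModel) {
    unsafe { drop(Box::from_raw(model)) };
}

fn main() {
    // Any language with C FFI could make these same three calls.
    let model = cprover_load_model(3);
    println!("{}", cprover_model_function_count(model));
    cprover_free_model(model);
}
```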

@tautschnig
Collaborator

Hi, Writing this comment as an outsider, but it could be worth writing a C API (i.e just a thin C layer on top of the current code), instead of interfacing C++ with Rust directly. It would enable this same API to be used with a lot more different languages, and therefore possibly also by other tools?

This might require changing various classes to plain structs, and will prohibit exposing anything that uses the STL. Maybe those are blockers for Rust bindings anyhow, in which case we will have to reconsider this.

@martin-cs
Collaborator

martin-cs commented Jul 29, 2022

@NlightNFotis this is a really well articulated plan but I feel like I am missing some context which will make it hard to C your RF. What problem are you solving? Is this performance? Is this having less code to maintain? Is this having more of the tool chain written in Rust? Is this having a more sophisticated workflow?

Some random, out of context, thoughts:

  • Before starting gnat2goto we looked at a lot of options of how to combine a front-end written in !(C++) with the CPROVER code. That's what symtab2gb is. It was our best attempt at doing this. If you like I can try to dig out the notes from those meetings. We did consider the "wrap C++ in a foreign function interface" route but when you get into the details of C++ ABI stability and what that requires on the other language ... the horror was just too much. As @tautschnig wisely notes, things like templates and the STL kinda ruin things. It /seemed/ like the best way of achieving that was, as @giltho suggests, writing a shadow API in C, linking that to C++ and then writing an interface around that. But that requires writing and maintaining two interface layers and we figured it was easier to just implement one which did a very light serialisation and then write a simple de-serialiser.

  • Another consequence of this discussion was identifying goto-modelt as the pivotal interface. Our goal was to create these and hand them over to the C++. If you want to do a sequence of link and instrument and simplify and ... etc. then I would suggest writing that as a C++ tool and connecting at that level. Without knowing your context it is hard to say whether this is a good idea but it would reduce the API to just building goto-modelt's which is a fraction of the code. The goto-modelt is the narrow point in the architecture diagram.

  • ( Although since then goto-checker has been added and is generally a good abstraction of what the back-ends do. )

  • @NlightNFotis said:

Subsequent transformations to the goto-modelt (say, for instance, remove_skips) are going to be maintained as a versioned patch-set over the initial model.

This sounds kinda scary. We have a very poor record of keeping patch-sets outside of develop alive.

This would allow rollbacks of transformations to be both possible and efficient in the future.
This would also support scripting of the transformations and associated error handling without the need to start from scratch when a transformation has corrupted the goto-model.

This might be over-engineering. Transforms shouldn't be corrupting things. If they are you should throw it away and start again. See the massive discussion about "recovering from invariant failure" and the short answer of "don't; just start again.".

However call to goto-cc --function might be able to be skipped if a call to ansi-c-entry-point can be made.
A bit hard, as for now it also depends on configt, but in theory nothing binds it to it - could be refactored?
...
Non-delineated boundaries between these transformations/unknown invariants between these different transformation passes.

This is why I have been banging on about invariants and normal forms for goto-programs for years. It would be good to clarify what does and doesn't need to be done when creating a goto-modelt.

This will allow CPROVER tools, to transition into being a library around goto-modelt on a long term basis.
This would allow users to write their own transformations/tools over goto-modelt, both as parts of the CPROVER library, and as external tools.
The existing tools cbmc, goto-instrument, etc, could then be repurposed to be thin CLI wrappers around the library.
External tools will be able to load goto-modelt in the library, and extract a new one (along with analysis/verification reports, traces, etc.) from the new library.

Yes! Absolutely! This was the original aim of the CPROVER project; to take CBMC and to split it up into a library of components and functions. This is why the box in the middle of the architecture diagram says "goto-programs". This is why the on disk format and the hand-over between tools are goto-function binaries. This is why there are loads of .h files that just expose a one / a few C-style functions. This is why when they are well maintained (I struggle to call goto-instrument this but it is a great example of the pattern), each of the tools is basically just handling command-line options, loading a goto-program and then calling a bunch of API functions. This was a great idea then and remains a great idea now. (Having a REPL or Python interface for calling the APIs black-box style would also be very exciting and possible with this model.)

The goto-modelt is effectively a symbol table and a CFG (in the form of goto-functionst)

Minor point of slight pedantry. It is not a CFG. There is an interface to constructing those. It is a sequence of instructions. This is a different model and has some advantages and disadvantages.

Hope that is a useful C for your RF.

@martin-cs
Collaborator

Another thought. If you want a cross-language API to create goto-modelts, either by serialisation or by wrapping, you need an exprt interface. This is probably a bigger job than wrapping / interfacing goto-modelt as it is very OO and so not in the "basically C" subset of C++ that is easy to interface to other languages. @smowton 's smart idea was to auto-generate it. I think his vision was to eventually auto-generate std_expr.h et al. along with bindings for other languages. There was a PR in CPROVER I think? Circa... 2016? 2017? The code he wrote definitely wound up in gnat2goto and is here:
https://github.com/diffblue/gnat2goto/blob/master/irep_utils/src/irep_specs_to_ada.py
https://github.com/diffblue/gnat2goto/tree/master/gnat2goto/ireps/irep_specs

@martin-cs
Collaborator

@NlightNFotis If one of your goals is to simplify the interaction between Kani and CPROVER, one option would be to write a C++ program that does (Current Integration between Kani and CBMC) steps 2 to 5 inclusive in one program. This should be plugging together various existing APIs and should not be particularly long.

@martin-cs
Collaborator

#6495 may also be relevant

@danielsn
Contributor

danielsn commented Aug 2, 2022

@NlightNFotis Would it be possible to give some (pseudocode) examples of what the initial API would look like, and what calls on it from an API consumer might look like?

@danielsn
Contributor

danielsn commented Aug 2, 2022

There seem to be questions on two time-scales:

  1. What is the right long-term architecture for CProver/CBMC. The re-architecture here seems to make a lot of sense, but also sounds like a lot of potential work. What is the expected time frame for delivering this?
  2. What is the best API to export in the short-term to allow easy scriptability of CBMC. As mentioned above, a concrete set of possible API pseudocode might help with the discussion.

@jimgrundy jimgrundy added aws Bugs or features of importance to AWS CBMC users aws-high labels Aug 10, 2022
@NlightNFotis
Contributor Author

Hi @danielsn, the questions you have asked merit a bit of deliberation from our end. We aim to get back to you some time next week with answers to those.

@NlightNFotis
Contributor Author

Answering several points from @martin-cs comment (thanks for the detailed thoughts, Martin!)

Before starting gnat2goto we looked at a lot of options of how to combine a front-end written in !(C++) with the CPROVER code. That's what symtab2gb is. It was our best attempt at doing this. If you like I can try to dig out the notes from those meetings. We did consider the "wrap C++ in a foreign function interface" route but when you get into the details of C++ ABI stability and what that requires on the other language ... the horror was just too much.

I seem to recall seeing somewhere an Irep bindings generator, but can't remember if that was part of gnat2goto or something else. IIRC, and if my memory doesn't falter, extending something like that to provide the Irep structures and APIs as headers/implementation files for various languages might be the easiest way to ensure that the API for at least the core structures remains in sync between CBMC and other tools utilising it.

If that's not the case, then we probably need to think about a way to do that. I seem to recall some differences between that IRep API and the one within CBMC causing some difficulties, so there will definitely need to be some standardisation around that as a way forward.

This sounds kinda scary. We have a very poor record of keeping patch-sets outside of develop alive.

Sorry, I probably didn't word this as clearly as I could. Allow me to clarify: what I had in mind is a sort of stack structure, which maintains a set of binary differences over the goto-model, in-memory. That would allow rolling back transformations by effectively popping from that stack, and getting back to the goto-program as it existed before a certain set of transformations happened.
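The structure I have in mind could be sketched like this (a deliberately simplified mock that stores full snapshots rather than binary diffs; all names are illustrative):

```rust
// Sketch of the in-memory rollback stack: each applied transformation pushes
// an undo entry; popping restores the model as it was before that pass.
#[derive(Clone, Debug, PartialEq)]
struct GotoModel {
    instructions: Vec<String>,
}

struct TransformStack {
    snapshots: Vec<GotoModel>, // a real version would store compact diffs
}

impl TransformStack {
    fn new() -> Self {
        TransformStack { snapshots: Vec::new() }
    }

    // Record the pre-pass state, then run the pass on the model in place.
    fn apply(&mut self, model: &mut GotoModel, pass: impl Fn(&mut GotoModel)) {
        self.snapshots.push(model.clone());
        pass(model);
    }

    // Undo the most recent pass; returns false if there is nothing to undo.
    fn rollback(&mut self, model: &mut GotoModel) -> bool {
        match self.snapshots.pop() {
            Some(prev) => {
                *model = prev;
                true
            }
            None => false,
        }
    }
}

fn main() {
    let mut model = GotoModel { instructions: vec!["SKIP".to_string(), "GOTO 1".to_string()] };
    let mut stack = TransformStack::new();
    stack.apply(&mut model, |m| m.instructions.retain(|i| i != "SKIP"));
    stack.rollback(&mut model); // the SKIP instruction is back
    println!("{:?}", model.instructions);
}
```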

This might be over-engineering. Transforms shouldn't be corrupting things. If they are you should throw it away and start again. See the massive discussion about "recovering from invariant failure" and the short answer of "don't; just start again.".

I see. Does the clarification in the above paragraph change the calculus here? If rollback of transformations becomes significantly easier (and assuming that transformations are atomic and obey certain invariants), wouldn't it make sense to roll back instead of trying again? Both in terms of performance, and of potential use cases unlocked by the continuous application/rollback of transformation passes.

This is why I have been banging on about invariants and normal forms for goto-programs for years. It would be good to clarify what does and doesn't need to be done when creating a goto-modelt.

I agree wholeheartedly. This is one of the first bits of work we're going to have to do once we get to the part of the plan where we implement the second step of the plan above.

This is why when they are well maintained (I struggle to call goto-instrument this but it is a great example of the pattern), each of the tools is basically just handling command-line options, loading a goto-program and then calling a bunch of API functions.

Yeah, that's also a long-term aim of the plan - to allow certain modules of the whole of CProver to be called in a plug & play fashion from various tools. We're still some way off of that however, and the first order of work that needs to be done before we can even get close to working on this is the outlining/tighter enforcement of invariants between various stages/passes of the various tools.

@NlightNFotis
Contributor Author

Sorry, I kind of missed this comment, and just noticed this:

There was a PR in CPROVER I think? Circa... 2016? 2017? The code he wrote definitely wound up in gnat2goto and is here:
https://github.com/diffblue/gnat2goto/blob/master/irep_utils/src/irep_specs_to_ada.py
https://github.com/diffblue/gnat2goto/tree/master/gnat2goto/ireps/irep_specs

Yeah, these are what I remember. I think they were kind of outdated already in 2018/2019, but maybe some investigation could be done to see if we can bring them up to speed and use them as a base for the CBMC structures.

There's also probably some thought that needs to be put into how that would complicate CBMC's building process, and where they would be located if we were to use them, but it can probably form a good base for us to build on.

@NlightNFotis
Contributor Author

I missed this one, too:

@NlightNFotis If one of your goals is to simplify the interaction between Kani and CPROVER, one option would be to write a C++ program that does (Current Integration between Kani and CBMC) steps 2 to 5 inclusive in one program. This should be plugging together various existing APIs and should not be particularly long.

My personal opinion is that it's not a scalable idea: while it might work for Kani's case, if another tool needs to perform those steps (or a subset of them), we may need to create yet another binary for that.

I think, long term, it might be a better idea to expose those steps as interfaces, and give external tools leeway to mix and match calls to them as suits those tools.

@thomasspriggs
Contributor

Hi, Writing this comment as an outsider, but it could be worth writing a C API (i.e just a thin C layer on top of the current code), instead of interfacing C++ with Rust directly. It would enable this same API to be used with a lot more different languages, and therefore possibly also by other tools?

This might require changing various classes to plain structs, and will prohibit exposing anything that uses the STL. Maybe those are blockers for Rust bindings anyhow, in which case we will have to reconsider this.

The https://cxx.rs/ tool linked by Fotis appears to support a selection of STL types including std::vector, std::unique_ptr and std::shared_ptr. This would indicate that we don't necessarily need to exclude usage of the STL, at least for Rust bindings.

@thomasspriggs
Contributor

One aspect of the interface which hasn't been mentioned yet is error handling. CBMC uses C++ exceptions for user errors, where we want to be able to report the incorrect usage of CBMC. My current understanding is that Rust does not have exceptions. So these would need to be converted to a Result<T, E> on the Rust side of the interface.

CBMC also has invariants which currently end in a call to abort(), which terminates the entire process. This seems like a reasonable solution where CBMC is being used as a CLI tool and it isn't worth attempting to recover internally. However, it seems a little unreasonable to terminate the entire process of an application which is just using CBMC as a library. In this case it may be preferable for the overall process to continue without CBMC, or to show the message/stack trace in a GUI rather than immediately terminating. Therefore it is worth considering how invariants should be handled if CBMC is being used in this context.
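The shape this could take on the Rust side, sketched with invented names (a panic stands in for today's abort(), since a real abort() cannot be caught; user errors become Result):

```rust
// Sketch of library-friendly error handling: user errors surface as Result,
// and invariant violations (mocked here as panics) are contained with
// catch_unwind instead of killing the host process. All names illustrative.
use std::panic;

#[derive(Debug, PartialEq)]
enum CbmcError {
    BadUsage(String),
}

fn run_analysis(unwind: u32) -> Result<&'static str, CbmcError> {
    if unwind == 0 {
        // What a C++ user-error exception would map to on the Rust side.
        return Err(CbmcError::BadUsage("--unwind must be positive".to_string()));
    }
    Ok("VERIFICATION SUCCESSFUL")
}

fn check_invariant(broken: bool) {
    // Stand-in for an INVARIANT that today ends in abort().
    assert!(!broken, "internal invariant violated");
}

fn main() {
    // User error: the caller decides what to do with the Err.
    println!("{:?}", run_analysis(0));
    // Invariant failure: a library embedding could contain it and carry on.
    let contained = panic::catch_unwind(|| check_invariant(true));
    println!("invariant contained: {}", contained.is_err());
}
```

Note that catch_unwind only helps if the library signals invariant failures by unwinding rather than calling abort() directly, which is part of what would need to change.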

@danielsn
Contributor

Kani currently has a Rust model of goto here https://github.com/model-checking/kani/tree/main/cprover_bindings/src/goto_program

@martin-cs
Collaborator

martin-cs commented Aug 11, 2022 via email

@martin-cs
Collaborator

martin-cs commented Aug 11, 2022 via email

@TGWDB
Contributor

TGWDB commented Aug 24, 2022

Pseudocode API for parts of CBMC

Note that this is a sketch of some of the key behaviours of the API planned to allow
CBMC to be used like a library, and in particular by the Kani project. The goal of
this document is to illustrate the main pieces of functionality and envision how they
would be used. It is important to keep in mind that the details here are largely left
out, since focusing on them early (except to ensure information is not lost) would be
premature.

For example, various calls below have some kind of configuration that is required
for the central behaviour to work correctly. This stores the general configuration
information that CBMC (and related tools) use during analysis. Here the exact
content and specification is omitted to focus on the calls themselves, however
this will need to be clarified (including how the configuration can be loaded,
set, checked for validity, etc.) before a proper API can be produced.

API Specification

  • load_model parses a GOTO binary to an in-memory goto_modelt structure from the
    file handle it has been passed. Note that this structure should not be accessed by the calling code directly, only via library calls.
  • Different modes (analysis, instrumentation, etc) have different configurations - this allows us to pass to each tool its own set of customised settings.
    • For the existing CLI tools, these are going to be parsed from the tool's command
      line options. (This is a first step; an in-memory/structured configuration will come in the future.)
  • We intend to consolidate the various tools under a single point of entry for each operation:
    • perform_analysis is the single entry point for the various analyses. The selection
      of the specific analysis to be run, along with any specific configuration (e.g. object_bit_size,
      etc.) is made based on the configuration passed to it.
      • The configuration can be instantiated by the tool that's running (say from the
        CLI arguments to that tool), but nothing stops it being instantiated in a different
        way (say, a tool that has a model of the configuration options and supplies the analysis
        engine with a custom set of options).
    • apply_transformation is the single entry point for various goto_program instrumentation
      passes. It accepts two arguments: the instrumentation to run and the existing model
      and returns a new model with the transformation run (actual call to this will most probably
      be using an out parameter for this instead of returning a goto_modelt for performance reasons).
    • compile is the single point of entry for compilation: you pass it in translation_config
      and a file, and what you get out of it is the goto_modelt (assuming compilation was successful).
    • link_models is the single entry point for linking: it too takes as input a structure that designates
      some translation_config (which contains instructions to the compiler and the linker) and a list of goto_modelts and the end result is a final goto_modelt that is the result of linking all the goto_modelts in the list together.

API Usage

Verification

filename = "analysis/example.c"
model = load_model(from_file(filename))
// The from_cli_options is from the options (e.g. a string or string[])
analysis_config = load_verification_config(from_cli_options(options))

results = perform_analysis(model, analysis_config)
output_results(handle, results, analysis_config)

Everything starts from loading a goto_modelt in memory. (Again note that this model
will only be available as a handle, direct access and manipulation of the model is not
allowed.) load_model is the function that performs this bit of functionality, and
parses a goto_modelt structure from a binary version of a goto-program from a file
handle, the result of the call from_file(), assuming the filename passed to
from_file is valid and no errors occur during the open of the file.

The next bit of work to be done is to initialise the analysis configuration, which
in this case (assuming we're running cbmc as a CLI tool) is being parsed from the
command line arguments passed to the binary. This structure is designed to hold verification
options, (such as single/multi path symex, etc).

Next up is the actual performance of the analysis. The aim here is to provide a single
entry point to verification, which is run on a supplied goto_modelt after it has been tuned by
the configuration provided to it.

results in this case is a results structure that is returned from performing the
analysis. This will hold the analysis results (a specialised data structure
for different analysis performed), properties, traces, errors, etc. With results
being its own structure, the user can then select what happens with the analysis
results (e.g., different models could be run, collecting their results, and then
their results could be diffed programmatically).

The last bit of work here is the actual printing of the results. That is done by the
output_results function, which gets a file handle as the first parameter (it can also output
to handles indicating STDOUT/STDERR, etc), the results structure along with the
configuration options (which allows for example to distinguish between different output
modes the user has requested, etc). Note that result printing is important in the
first phase of the project to allow interaction with legacy usage and for debugging.

Instrumentation

filename = "analysis/example.c"
model = load_model(from_file(filename))
// The from_cli_options is from the options (e.g. a string or string[])
instrumentation_config = load_instrumentation_config(from_cli_options(options))

stack<goto_transformations> transformations

for (transformation in instrumentation_config.transformations) {
    try {
        model = apply_transformation(transformation, model)
        transformations.push(transformation)
    } catch (error) {
        report_error("Transformation couldn't be applied for reason: %s", error.message)
        continue;
    }
}

output_model(instrumentation_config, model)

Instrumentation should work in an orthogonal way to verification - a model gets instantiated
into memory (be it by loading it from a file, or by being passed in by another tool via IPC or
other interop).

Then, assuming an instrumentation configuration (in this particular case we assume it's read
from CLI arguments to a tool like goto-instrument), we iterate through the transformations
requested in the configuration and apply each through apply_transformation, whose API
takes the transformation to be applied and the existing model, returns a modified model,
and pushes the transformation just applied onto a stack of applied transformation passes.

Last but not least, we output the model through a call to output_model, which outputs the
instrumented goto_modelt, depending on the configuration that we have instantiated for the
instrumentation (show goto functions, etc). Note that outputting the model (e.g. to file)
is again important in early phases for legacy interactions and debugging - long term
maintaining everything in memory is the goal.
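The transformation loop above can be sketched in runnable Rust. The names (Transformation, GotoModel, apply_transformation) are illustrative, and returning Result instead of throwing keeps a failed pass local, mirroring the catch-and-continue in the pseudocode:

```rust
// Illustrative sketch of the transformation loop; all names are
// hypothetical stand-ins, not CBMC API.

#[derive(Debug, Clone, PartialEq)]
enum Transformation { AddLibrary, DropUnusedFunctions }

#[derive(Debug, Default)]
struct GotoModel { passes_applied: usize }

// Apply one pass in place, reporting failure as an error message rather
// than an exception, so the driver can skip the pass and continue.
fn apply_transformation(t: &Transformation, model: &mut GotoModel)
    -> Result<(), String>
{
    match t {
        Transformation::AddLibrary | Transformation::DropUnusedFunctions => {
            // A real pass would rewrite the goto program here.
            model.passes_applied += 1;
            Ok(())
        }
    }
}

fn main() {
    let requested = vec![Transformation::AddLibrary,
                         Transformation::DropUnusedFunctions];
    let mut model = GotoModel::default();
    // Stack of passes that actually succeeded, as in the pseudocode.
    let mut applied: Vec<Transformation> = Vec::new();

    for t in &requested {
        match apply_transformation(t, &mut model) {
            Ok(()) => applied.push(t.clone()),
            Err(e) => eprintln!("Transformation couldn't be applied for reason: {e}"),
        }
    }
    assert_eq!(applied, requested);
    assert_eq!(model.passes_applied, 2);
}
```

Keeping the stack of successfully applied passes gives the caller a record of what the model has actually been through, which is useful both for debugging and for deciding whether further passes are safe.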

Translation & Linking

// The from_cli_options is from the options (e.g. a string or string[])
translation_config = load_translation_config(from_cli_options(options))

list<goto_model> translated_files 

// Compile
for (filename in translation_config.files) {
    translated_files.add(compile(translation_config.options, from_file(filename)))
}

// Link into single goto program
goto_model linked = link_models(translation_config.options, translated_files)

// File output for early phases, also for debugging
if (write_to_file) {
    output_model(output_filename, linked)
}

Translation and linking work similarly: there's a single entry point for compilation,
and that's compile, which takes some translation options and a file. After a file
has been compiled into an in-memory goto_modelt, it gets added to a list of translated
files.

After translation has completed successfully for all of the files the translation
engine had to translate, all the goto models in translated_files are linked into
a single goto_modelt. This is done through a call to the single entry point for linking,
link_models, which takes two arguments: the translation configuration (which may
include directives to the compiler and the linker) and the list of goto_modelts to be linked.
The result is a new model produced by linking all the models together.
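The compile-then-link flow can be sketched as follows; compile and link_models are placeholder names mirroring the pseudocode, not real CBMC entry points, and the "compilation" here is stubbed:

```rust
// Hypothetical sketch of translation and linking; names mirror the
// pseudocode above and are not actual CBMC API.

#[derive(Debug)]
struct GotoModel { functions: Vec<String> }

// "Compile" a source file into an in-memory goto model (stubbed: the
// real entry point would parse and translate the file).
fn compile(filename: &str) -> GotoModel {
    GotoModel { functions: vec![format!("{filename}::main")] }
}

// Link several goto models into one by merging their function lists;
// real linking would also resolve symbols across models.
fn link_models(models: Vec<GotoModel>) -> GotoModel {
    let functions = models.into_iter().flat_map(|m| m.functions).collect();
    GotoModel { functions }
}

fn main() {
    let files = ["a.c", "b.c"];
    // Compile each file into an in-memory model, then link them all.
    let translated: Vec<GotoModel> = files.iter().map(|f| compile(f)).collect();
    let linked = link_models(translated);
    assert_eq!(linked.functions.len(), 2);
}
```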

Notes on Phases

There are several implementation choices above that start with doing things in the
"old" way, e.g. loading config from CLI options, to make the initial implementation
faster and easier. However, long term many of these should be done directly in
memory. The goal of this initial sketch here is to identify the approximate API
and a path forward that can be used rapidly to ensure successful integration with
Kani in the early stages of the project.

Incomplete Points

There are some areas that are currently known to be incomplete (and intentionally
left as such here). This is done to converge on this with consultation and
pragmatic approaches to the implementations. These are noted here so that they
are not assumed to be forgotten. Discussion on these points is of course welcome
(this is an RFC!).

  • Error handling: The above API does not detail how error handling should be done.
    A good approach needs to be determined, especially considering that some internal
    CBMC code currently terminates the program as a whole.
  • Session/lifetime: The current assumption is that the library will be set up
    and used for one session. If multiple different analysis runs on different programs
    are desired, then a new instance should be created. (That is, free/release the
    various components and start from scratch.)
  • Isolation: The current plan is to keep all the internals (including goto
    models) isolated and not allow any direct access. All manipulation and interaction
    should be done through library calls.
  • Program editing/manipulation: For the moment, editing of the goto program
    outside of an existing transformation is not available. This may be of
    interest, but for the current API this is assumed to be unavailable. This is
    motivated by the following (among other) factors:
    • Safety: Various passes, functions, and transformations will make assumptions
      about the model. Any editing/modification of the model outside of those known
      about and controlled may break the toolchain.
    • Model complexity/fragility: The model is currently represented by internal
      CBMC structures that would be very complex to expose. Exposing these would
      make the interactions very brittle, and make small changes on either side
      potentially breaking.
  • Exact API Function/Type Specifications: These have not been included here
    (yet) since we wish to converge on a workable API before trying to precisely
    define all the details.

@TGWDB
Copy link
Contributor

TGWDB commented Aug 24, 2022

I'm planning on arranging a time when various stakeholders and contributors who are interested in this project can discuss (perhaps biweekly?). If you'd like to be part of this discussion please message me so I can include you in planning.

@tautschnig tautschnig self-assigned this Sep 2, 2022
@TGWDB
Copy link
Contributor

TGWDB commented Sep 13, 2022

Just a ping to those who are watching - we'd like to progress on the next steps ASAP. If you have comments/examples from Kani's current usage, or expectations/desires, please update here.

@jimgrundy
Copy link
Collaborator

@danielsn what is the status on providing feedback?

@danielsn
Copy link
Contributor

I am reviewing our requirements with the Kani team and will have feedback beginning of next week

@TGWDB
Copy link
Contributor

TGWDB commented Sep 21, 2022

@danielsn We're half way through the week now and I'm not aware of any updates here or on the linked Kani issue. We'd like to move on this work ASAP, should we wait on you or continue anyway?

@danielsn
Copy link
Contributor

danielsn commented Sep 21, 2022

Thanks for the detailed proposal. We should have a larger discussion about next-steps/milestones, but to start this looks like a great first step towards the 1.0 goal of supporting, via the API, the same use-cases we have with the command line.
We have a few questions/concerns:

  1. We would like to have access to the same level and immediacy of information we get by running CBMC at present. In particular, the current API appears to be blocking. CProver steps often take a long time, and it’s really valuable to our users to be able to see up-to-date information. For example, unwinding can sometimes diverge, and it’s useful to be able to see that right away. Perhaps a call-back / polling mechanism?
  2. We would like to have results in a structured format: e.g.
    1. for traces we’d like a vector of trace steps,
    2. for coverage a vector of covered and uncovered locations,
    3. for results a vector of structured properties and results
    4. For errors and warnings a vector of structured results
  3. We would like the ability to do basic queries on the goto-program. E.g.
    1. list properties
    2. list undefined functions
    3. list loops
  4. Memory management is going to be a significant question, and is a “one way door”. We need to make sure we’re very clear on who owns the memory behind each handle, and when they can be freed. I’d suggest putting every handle behind a reference counted smart pointer, which I believe the CXX crate supports.
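A callback-based progress mechanism along the lines of point 1 could look like the sketch below. ProgressEvent and verify_with_progress are hypothetical names, and the event stream is stubbed; the point is only that the caller sees events as they happen rather than after the run completes:

```rust
// Sketch of a non-blocking-friendly progress API; all names are
// hypothetical, not part of any existing CBMC interface.

#[derive(Debug, Clone)]
enum ProgressEvent {
    UnwindingLoop { loop_id: String, iteration: u32 },
    PropertyChecked { name: String, passed: bool },
}

// The caller supplies a callback invoked as events occur, so a driver
// like Kani can surface unwinding progress immediately instead of
// waiting for a whole (possibly diverging) run to finish.
fn verify_with_progress<F: FnMut(ProgressEvent)>(mut on_event: F) -> bool {
    // Stubbed event stream standing in for a real symex run.
    on_event(ProgressEvent::UnwindingLoop {
        loop_id: "main.0".into(),
        iteration: 1,
    });
    on_event(ProgressEvent::PropertyChecked {
        name: "main.assertion.1".into(),
        passed: true,
    });
    true
}

fn main() {
    let mut events = 0;
    let ok = verify_with_progress(|e| {
        events += 1;
        println!("{e:?}");
    });
    assert!(ok);
    assert_eq!(events, 2);
}
```

The same callback shape also composes with the separate-thread approach later suggested by the CBMC side: the callback can simply forward events over a channel to whichever thread owns the UI.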

Otherwise, looks great! Thanks for the proposal.

@tautschnig
Copy link
Collaborator

On the topic of error reporting/error handling: would it perhaps be easier not to expose exceptions at the level of the API proposed in here? Instead, each function should report success/failure? This likely wouldn't require much of a re-architecting, just catching exceptions at the layer that implements the top-level API and turning any exceptions into a "failure" result.

@thomasspriggs
Copy link
Contributor

On the topic of error reporting/error handling: would it perhaps be easier not to expose exceptions at the level of the API proposed in here? Instead, each function should report success/failure? This likely wouldn't require much of a re-architecting, just catching exceptions at the layer that implements the top-level API and turning any exceptions into a "failure" result.

I think I mentioned this previously, but it is preferable not to expose exceptions through the API because the user of the API may be working in a language which doesn't support exceptions at all. My understanding is that cross-language exceptions are generally problematic. That implies that we will need to catch the exceptions on the cbmc side of the API and expose the result status in some other non-exception form.

@jimgrundy
Copy link
Collaborator

I'm in agreement with Michael and Thomas. Best to have an API that returns a boolean success status than to throw an exception. We might want to think about how to return information about the nature of the error - perhaps setting an error code (an enumeration?). File and line information about the location of source code fragments contributing to the error is important to get.

@danielsn
Copy link
Contributor

Sounds like we're all in agreement about this: a failure enum sounds like a good solution here. If we're using Rust, there is an Error type which it could be conveniently translated into.
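A minimal sketch of such a failure enum on the Rust side, under the assumption that exceptions are caught at the library boundary and turned into codes. CProverError and load_model are illustrative names; implementing std::error::Error is what makes the type fit Rust's usual error machinery, as suggested:

```rust
// Sketch of the failure-enum idea: a C-compatible error code that also
// plugs into Rust's error handling. All names are illustrative.

use std::fmt;

#[derive(Debug, Clone, Copy, PartialEq)]
#[repr(C)] // stable layout so the code can cross the FFI boundary
enum CProverError {
    ParseError,
    TypeCheckError,
    InternalError,
}

impl fmt::Display for CProverError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // A fuller design would carry file/line details about the
        // offending source fragment alongside the code.
        write!(f, "{:?}", self)
    }
}

// Implementing std::error::Error lets callers use `?` and Box<dyn Error>.
impl std::error::Error for CProverError {}

// Hypothetical API function: failure is a value, never an exception.
fn load_model(path: &str) -> Result<(), CProverError> {
    if path.is_empty() {
        Err(CProverError::ParseError)
    } else {
        Ok(())
    }
}

fn main() {
    assert_eq!(load_model(""), Err(CProverError::ParseError));
    assert!(load_model("a.goto").is_ok());
}
```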

@tautschnig tautschnig removed their assignment Oct 4, 2022
@NlightNFotis
Copy link
Contributor Author

Hello, following up here on the first steps of the implementation of the Kani API.

We had an internal meeting yesterday where we ratified the following plan:

  1. We will start with the first step of the plan as outlined in the original proposal (original issue text). To reiterate it here, we aim to provide an API for verification given some options and a goto-model.
  2. The implementation layer will (for now) be in C++, but the API we will expose to you will be in Rust. That is, we will own the C++ -> Rust wrapped API, as that gives us more flexibility and control over some of the unknowns in the process, and allows us to tailor a better API experience for you (by simplifying the interface for you, allowing us to test the Rust side of things as well, or to pivot to a different implementation mechanism if need be).
  3. The C++ -> Rust wrapper is going to be built with cxx.rs. We intend to take it as far as we can for now using that as the preferred implementation.

There were some specific concerns that were raised in our last meeting, to which we have come up with some answers:

  1. Memory ownership: Ownership must be on the library side (i.e. CBMC owns whatever is passed to it, and is responsible for deallocation).
  2. Nonblocking API: We can implement callbacks for progress monitoring, but we feel that the nonblocking issue is better addressed by a separate process/thread from the Kani side.
  3. Structured results: We aim to be giving back data structures directly instead of a textual format.
  4. Structured failures: There are some considerations we need to think through before we get back to you on this specific issue.

There is also a question that has been raised on our end that we would like an answer to:

  1. What platforms does Kani run on? From the documentation it looks like it's Unix-based. Is this a safe assumption for us going forward? Tighter integration with other platforms may complicate the situation for us, depending on the tools on the Rust end and their maturity on Windows.

@feliperodri
Copy link
Collaborator

WIP: #7274

@feliperodri
Copy link
Collaborator

@zhassan-aws did you have a chance to test the documentation of libcprover_rust? Should we close this issue?

@zhassan-aws
Copy link
Collaborator

Yes, I reviewed the documentation. It looks good, but there are a few more APIs that are needed before we can fully rely on the crate for the Kani flow. More specifically, Kani makes the following calls to CBMC (note that this has been updated recently, so it is
slightly different from the steps described in the OP):

  1. It calls goto-cc on the goto binary produced by codegen and links the Kani C library, e.g.
goto-cc a.out /home/ubuntu/git/kani/library/kani/kani_lib.c -o b.out

Then, for each harness in the Rust code, it performs the following steps:

  1. "Specializes" the goto binary for a specific harness by calling goto-cc with --function, e.g.
goto-cc b.out --function harness -o b_harness.out
  2. Calls goto-instrument with --add-library:
goto-instrument --add-library b_harness.out b_harness.out
  3. Calls goto-instrument to replace empty functions and drop unused ones:
goto-instrument --generate-function-body-options assert-false-assume-false --generate-function-body ".*" --drop-unused-functions b_harness.out b_harness.out
  4. Calls goto-instrument to transform loops:
goto-instrument --ensure-one-backedge-per-target b_harness.out b_harness.out
  5. Calls cbmc:
cbmc --bounds-check --pointer-check --div-by-zero-check --float-overflow-check --nan-check --undefined-shift-check --unwinding-assertions --object-bits 16 --unwind 5 --slice-formula b_harness.out --json-ui
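For illustration, the per-harness command lines above can be assembled programmatically. The Rust sketch below builds (but does not spawn) two of them with std::process::Command; the helper function names are hypothetical, while the flags are taken verbatim from the commands listed:

```rust
// Build the per-harness CBMC command lines without running them; the
// helper names are ours, the flags come from Kani's current invocations.

use std::process::Command;

// Specialize the linked binary for one harness via goto-cc --function.
fn specialize_for_harness(binary: &str, harness: &str, out: &str) -> Command {
    let mut cmd = Command::new("goto-cc");
    cmd.args([binary, "--function", harness, "-o", out]);
    cmd
}

// The final cbmc invocation with the checks Kani enables.
fn cbmc_invocation(binary: &str) -> Command {
    let mut cmd = Command::new("cbmc");
    cmd.args(["--bounds-check", "--pointer-check", "--div-by-zero-check",
              "--float-overflow-check", "--nan-check",
              "--undefined-shift-check", "--unwinding-assertions",
              "--object-bits", "16", "--unwind", "5", "--slice-formula",
              binary, "--json-ui"]);
    cmd
}

fn main() {
    let cmd = specialize_for_harness("b.out", "harness", "b_harness.out");
    let args: Vec<String> = cmd.get_args()
        .map(|a| a.to_string_lossy().into_owned())
        .collect();
    assert_eq!(args, ["b.out", "--function", "harness", "-o", "b_harness.out"]);
    assert_eq!(cbmc_invocation("b_harness.out").get_program(), "cbmc");
}
```

Whether these remain subprocess invocations or become direct library calls, having each step behind a function like this makes it clear exactly which API the crate needs to replace.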

We can close this issue as long as we have other issues to track adding this functionality to the API.

@zhassan-aws zhassan-aws removed their assignment Apr 7, 2023
@feliperodri
Copy link
Collaborator

Thank you @zhassan-aws! @NlightNFotis, could you create issues to track these requests and add the kani and aws-high labels so we can track them as "maintenance"?

@NlightNFotis
Copy link
Contributor Author

Hi Felipe, I will be adding new tickets based on what @zhassan-aws has outlined above.

In the meantime, there's also a ticket I had raised before, #7500

Are the steps there still valid, or should we close that in favour of the new tickets?

@feliperodri
Copy link
Collaborator

@NlightNFotis let's close this one in favor of the new tickets.
