Skip to content

Memory primitives in assume are unchecked #5955

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
SaswatPadhi opened this issue Mar 18, 2021 · 11 comments
Closed

Memory primitives in assume are unchecked #5955

SaswatPadhi opened this issue Mar 18, 2021 · 11 comments
Assignees
Labels
aws Bugs or features of importance to AWS CBMC users aws-high bug

Comments

@SaswatPadhi
Copy link
Contributor

SaswatPadhi commented Mar 18, 2021

CBMC version: develop

Operating system: Mac OS 10.15.7

I know it's recommended not to use __CPROVER_r_ok within __CPROVER_assume, but it's not enforced -- no error from CBMC.

Some projects (e.g. throughout s2n) use this, and the point of this test case is to demonstrate that CBMC would accept bad programs when __CPROVER_r_ok is used in __CPROVER_assume, and give a false guarantee of correctness even with all of the pointer-checking flags enabled.

Test case:

#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

bool validate(const char *data, bool allocated)
{
    bool check_1 = (data == NULL) ==> (!allocated);
    bool check_2 = allocated ==> __CPROVER_r_ok(data, 128);
    return check_1 && check_2;
}

void main()
{
    char *data;
    bool allocated;

    data = !allocated ? NULL : malloc(1);

    __CPROVER_assume(validate(data, allocated));

    // same as `validate`, but inlined here to see which assertion fails
    assert((data == NULL) ==> (!allocated));
    assert(allocated ==> __CPROVER_r_ok(data, 128));
}

Exact command line resulting in the issue:

cbmc --malloc-may-fail --malloc-fail-null --pointer-primitive-check --bounds-check --pointer-check --pointer-overflow-check test.c

What behaviour did you expect:

The second assert should fail because data is malloced with 1 byte and shouldn't be readable up to 128 bytes.

What happened instead:

CBMC reports that the program is valid -- all implicit and explicit assertions pass. I have used all the CBMC -check flags that I thought could be relevant in this context.

Additional information:

With the --smt2 flag, CBMC does report an assertion failure at the last line of main and also within the __CPROVER_r_ok check, as expected.

This bug is related to, but is different from, #5952. There we actually read the same length as we allocated, so the failure from --smt2 is unexpected. In this report, success from the SAT-based solver is unexpected.

@feliperodri feliperodri added aws Bugs or features of importance to AWS CBMC users aws-high bug labels Mar 18, 2021
@SaswatPadhi
Copy link
Contributor Author

SaswatPadhi commented Mar 19, 2021

A simpler example:

#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

void main()
{
    char *data;

    data = malloc(1);

    __CPROVER_assume(__CPROVER_r_ok(data, 128));
    assert(__CPROVER_r_ok(data, 128));
}

No warnings / errors and verification succeeds with the following cmdline:

cbmc --malloc-may-fail \
     --malloc-fail-null \
     --pointer-primitive-check \
     --bounds-check \
     --pointer-check \
     --pointer-overflow-check \
     test.c
Full output from CBMC
CBMC version 5.28.0 (cbmc-5.28.0) 64-bit x86_64 macos
Parsing /Users/saspadhi/test.c
Converting
Type-checking test
Generating GOTO Program
Adding CPROVER library (x86_64)
Removal of function pointers and virtual functions
Generic Property Instrumentation
Running with 8 object bits, 56 offset bits (default)
Starting Bounded Model Checking
Runtime Symex: 0.00671644s
size of program expression: 99 steps
simple slicing removed 2 assignments
Generated 6 VCC(s), 6 remaining after simplification
Runtime Postprocess Equation: 2.1885e-05s
Passing problem to propositional reduction
converting SSA
Runtime Convert SSA: 0.00103259s
Running propositional reduction
Post-processing
Runtime Post-process: 0.000138232s
Solving with MiniSAT 2.2.1 with simplifier
514 variables, 672 clauses
SAT checker: instance is UNSATISFIABLE
Runtime Solver: 0.00134772s
Runtime decision procedure: 0.00244049s

** Results:
/Users/saspadhi/test.c function main
[main.pointer_primitives.1] line 11 pointer invalid in R_OK((const void *)data, (__CPROVER_size_t)128): SUCCESS
[main.pointer_primitives.2] line 11 deallocated dynamic object in R_OK((const void *)data, (__CPROVER_size_t)128): SUCCESS
[main.pointer_primitives.3] line 11 dead object in R_OK((const void *)data, (__CPROVER_size_t)128): SUCCESS
[main.pointer_primitives.4] line 11 pointer outside dynamic object bounds in R_OK((const void *)data, (__CPROVER_size_t)128): SUCCESS
[main.pointer_primitives.5] line 11 pointer outside object bounds in R_OK((const void *)data, (__CPROVER_size_t)128): SUCCESS
[main.assertion.1] line 12 assertion __CPROVER_r_ok(data, 128): SUCCESS

<builtin-library-malloc> function malloc
[malloc.assertion.1] line 26 max allocation size exceeded: SUCCESS
[malloc.assertion.2] line 31 max allocation may fail: SUCCESS

** 0 of 8 failed (1 iterations)
VERIFICATION SUCCESSFUL

@SaswatPadhi SaswatPadhi changed the title Soundness bug with use of memory primitives in assume Memory primitives in assume are unchecked Mar 22, 2021
@chrisr-diffblue chrisr-diffblue self-assigned this Apr 1, 2021
@TGWDB TGWDB self-assigned this Apr 16, 2021
@TGWDB
Copy link
Contributor

TGWDB commented Apr 20, 2021

I've done some exploration of this issue and there are two different things interacting that cause the unexpected behaviour. One appears to be a misunderstanding about how cbmc checks properties, and the other is (probably) a bug in the --smt2 option.

Part 1 - Misunderstanding CBMC

To explain the misunderstanding let's consider the simpler example:

#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

void main()
{
    char *data;

    data = malloc(1);

    __CPROVER_assume(__CPROVER_r_ok(data, 128));
    assert(__CPROVER_r_ok(data, 128));
}

Now there are several commands we can run to show interesting output (somewhat simplified from the above). Let's use the following to illustrate:

cbmc simple.c

which produces the following output (I've added cat for illustration):

tgw@DB0907:~/github/testing/Issue5955$ cat -n simple.c
     1  #include <assert.h>
     2  #include <stdbool.h>
     3  #include <stdlib.h>
     4
     5  void main()
     6  {
     7      char *data;
     8
     9      data = malloc(1);
    10
    11      __CPROVER_assume(__CPROVER_r_ok(data, 128));
    12      assert(__CPROVER_r_ok(data, 128));
    13  }
tgw@DB0907:~/github/testing/Issue5955$ cbmc simple.c
CBMC version 5.28.0 (cbmc-5.28.0-4-ga8cfa6f55) 64-bit x86_64 linux
Parsing simple.c
Converting
Type-checking simple
Generating GOTO Program
Adding CPROVER library (x86_64)
Removal of function pointers and virtual functions
Generic Property Instrumentation
Running with 8 object bits, 56 offset bits (default)
Starting Bounded Model Checking
Runtime Symex: 0.0099174s
size of program expression: 79 steps
simple slicing removed 0 assignments
Generated 0 VCC(s), 0 remaining after simplification
Runtime Postprocess Equation: 3.23e-05s

** Results:
<builtin-library-malloc> function malloc
[malloc.assertion.1] line 26 max allocation size exceeded: SUCCESS
[malloc.assertion.2] line 31 max allocation may fail: SUCCESS

simple.c function main
[main.assertion.1] line 12 assertion __CPROVER_r_ok(data, 128): SUCCESS

** 0 of 3 failed (1 iterations)
VERIFICATION SUCCESSFUL
tgw@DB0907:~/github/testing/Issue5955$

The key point here is that the assertion on line 12 passes. This appears to be sufficient to illustrate the misunderstanding with cbmc. To clarify, consifer the following that is the same except with an additional assertion that should always fail added:

tgw@DB0907:~/github/testing/Issue5955$ cat -n withfail.c
     1  #include <assert.h>
     2  #include <stdbool.h>
     3  #include <stdlib.h>
     4
     5  void main()
     6  {
     7      char *data;
     8
     9      data = malloc(1);
    10
    11      __CPROVER_assume(__CPROVER_r_ok(data, 128));
    12      assert(__CPROVER_r_ok(data, 128));
    13      assert(0);
    14  }
tgw@DB0907:~/github/testing/Issue5955$ cbmc withfail.c
CBMC version 5.28.0 (cbmc-5.28.0-4-ga8cfa6f55) 64-bit x86_64 linux
Parsing withfail.c
Converting
Type-checking withfail
Generating GOTO Program
Adding CPROVER library (x86_64)
Removal of function pointers and virtual functions
Generic Property Instrumentation
Running with 8 object bits, 56 offset bits (default)
Starting Bounded Model Checking
Runtime Symex: 0.009985s
size of program expression: 79 steps
simple slicing removed 0 assignments
Generated 0 VCC(s), 0 remaining after simplification
Runtime Postprocess Equation: 5.91e-05s

** Results:
<builtin-library-malloc> function malloc
[malloc.assertion.1] line 26 max allocation size exceeded: SUCCESS
[malloc.assertion.2] line 31 max allocation may fail: SUCCESS

withfail.c function main
[main.assertion.1] line 12 assertion __CPROVER_r_ok(data, 128): SUCCESS
[main.assertion.2] line 13 assertion 0: SUCCESS

** 0 of 4 failed (1 iterations)
VERIFICATION SUCCESSFUL
tgw@DB0907:~/github/testing/Issue5955$

Observe that here the assertion on line 13 does not fail. This is because of how cbmc performs checks.

To understand this, let us consider the withfail.c example:

     1  #include <assert.h>
     2  #include <stdbool.h>
     3  #include <stdlib.h>
     4
     5  void main()
     6  {
     7      char *data;
     8
     9      data = malloc(1);
    10
    11      __CPROVER_assume(__CPROVER_r_ok(data, 128));
    12      assert(__CPROVER_r_ok(data, 128));
    13      assert(0);
    14  }

The internals of cbmc operate as follows.

  1. At line 7 we declare a pointer data that has no known value.
  2. On line 9 data obtains the result of malloc(1) which is assumed to succeed (note that some flags can allow failures here, but it does not matter for this part).
  3. On line 11 the assume of the __CPROVER_r_ok(data, 128) attempts to refine the possible values of data. However, since there is no value set for data that is consistent with lines 9 and 11, this path becomes unreachable.
  4. Lines 12 and 13 are considered unreachable, and cbmc assumes that unreachable checks pass (i.e. they succeed here).

Note that line 12 can be commented out and line 13 will still succeed.

To conclude the misunderstanding part, since the path becomes unreachable any checks are considered to succeed by cbmc and this is why line 12 (in both simple.c and withfail.c succeed.

Part 2 - Different smt2 Behaviour

The second aspect of this issue is that there is different behaviour when the --smt2 flag is used. In particular there appears to be a bug in the SMT translation for the __CPROVER_r_ok check and the relation with the data pointer on line 11. This appears to be related to #5952 although it is unclear for the moment if they are exactly the same problem.

Some experiments show that the smt2 outfile for the examples does indded include

(assert (=> (= ((_ extract 63 56) |main::1::data!0@1#2|) (_ bv2 8))(= object_size.0 (_ bv1 64))))

but I have not checked if the details sent to Z3 are different to the outfile.

That said, the path to addressing them has the same question: is this an urgent bug to explore and fix considering the upcoming work on rewriting the SMT backend?

@SaswatPadhi
Copy link
Contributor Author

Thanks for looking into this, @TGWDB.

Regarding part 1, we understand that that incorrect use of __CPROVER_r_ok in assume context would lead to assume(false); any other check after that would vacuously succeed. The fact that these primitives don't work as expected in assume context -- don't really make an allocation have a desired size, is not intuitive though and so in large projects it's easy to end up with vacuous proofs.
The expectation here is to raise a "warning" and infrom users that these primitives could easily lead to vacuous proofs when used in assume context. May be @feliperodri and @mww-aws could add more.

Regarding part 2, no fixing the SMT issue with the old backend is not urgent. It probably is the same issue as #5952, as you mentioned, and that's on my todo list, but it's not urgent.

@tautschnig
Copy link
Collaborator

Selectively quoting:

The fact that these primitives don't work as expected in assume context -- don't really make an allocation have a desired size, is not intuitive though and so in large projects it's easy to end up with vacuous proofs.

I'm not quite following why the example is a case of unintuitive semantics -- it seems __CPROVER_r_ok(data, 128) is rather obviously false? I thought that --pointer-primitive-check was the approach to identify cases where the semantics may be surprising.

Isn't it really coverage checking that you need?

@SaswatPadhi
Copy link
Contributor Author

SaswatPadhi commented Apr 20, 2021

I thought that --pointer-primitive-check was the approach to identify cases where the semantics may be surprising.

But --pointer-primitive-check doesn't raise any warning or error in this case either (see the second comment). So I am not sure in what way is it more helpful than not using it, in this case.

Isn't it really coverage checking that you need?

That's one approach yes ... but we don't yet have an automated coverage check. I also think it might be hard to enforce such an automated check because the acceptable coverage across harnesses varies a lot -- may be @feliperodri can comment.

@feliperodri
Copy link
Collaborator

Isn't it really coverage checking that you need?

@tautschnig not only for coverage checking, but also to check for valid pointers. --pointer-primitive-check helps us to identify when we try to use memory primitives with invalid pointers (i.e., neither NULL or allocated). However, we want to warn users that using memory primitives in assumptions could lead to unreachable traces. In the past, users believed the example above succeeds because there are no failures, when in fact the assertions are unreachable. In this case, there is absolutely nothing in the standard CBMC output that indicates unreachable checks.

@tautschnig
Copy link
Collaborator

In this case, there is absolutely nothing in the standard CBMC output that indicates unreachable checks.

Unless coverage checking is enabled, there never is information about unreachable checks. That's why coverage checking is (at least: currently) necessary. As much as I do agree that the memory primitives have their surprising pitfalls: I'm not sure why they'd deserve preferential treatment here. The example that @SaswatPadhi posted at the beginning of this conversation is a really good case here: one can easily construct arbitrarily complex assumptions, and, yes, some of them will result in vacuous proofs. Some of them might contain memory primitives, but that's not at all a necessary condition.

From my point of view, there are two possible routes forward. Further suggestions would be much appreciated!

  1. Always perform coverage checking. We'd just need to figure out how we can do so efficiently.
  2. Always warn when one of the memory primitives is used in a way other than assert(__CPROVER_{r,w}_ok(...)).

@feliperodri
Copy link
Collaborator

As much as I do agree that the memory primitives have their surprising pitfalls: I'm not sure why they'd deserve preferential treatment here.
Some of them might contain memory primitives, but that's not at all a necessary condition.

They could be very useful when dealing with pointers and we often tried to use them, but fell in their pitfalls. So, targeting them here wouldn't solve the vacuity problem, but it'd at least help users to not apply them in the wrong context unintentionally.

  1. Always perform coverage checking. We'd just need to figure out how we can do so efficiently.

I think this is unnecessary for now, because we already perform coverage checking in proof harnesses.

  1. Always warn when one of the memory primitives is used in a way other than assert(__CPROVER_{r,w}_ok(...)).

I like this idea. This would make clear to proof writers "be aware where you're using these primitives". @SaswatPadhi @mww-aws @markrtuttle any thoughts?

@SaswatPadhi
Copy link
Contributor Author

Always warn when one of the memory primitives is used in a way other than assert(_CPROVER{r,w}_ok(...)).

I like this idea. This would make clear to proof writers "be aware where you're using these primitives". @SaswatPadhi @mww-aws @markrtuttle any thoughts?

+1. I like it too. This is exactly what I was trying to say when I mentioned that we should warn users about using these primitives in assume context.

@TGWDB
Copy link
Contributor

TGWDB commented May 7, 2021

I propose this issue is closed as there does not appear to be a true bug here. Also since the concern of vacuous proofs is now in #6057 and the SMT aspect is in #5952 (or the upcoming SMT backend rework).

@SaswatPadhi
Copy link
Contributor Author

Sure, I am closing this issue. A fix for #6057 would also check for vacuity due to usage of memory primitives in assume context.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
aws Bugs or features of importance to AWS CBMC users aws-high bug
Projects
None yet
Development

No branches or pull requests

5 participants