Memory primitives should handle invalid pointers #6045

Closed
feliperodri opened this issue Apr 20, 2021 · 21 comments
Labels: aws (Bugs or features of importance to AWS CBMC users), aws-medium, bug, More info needed

Comments

@feliperodri
Collaborator

CBMC version: 5.27.0 (cbmc-5.26.0-148-g7ced011bd-dirty)
Operating system: macOS Mojave 10.14.6
Exact command line resulting in the issue: cbmc main.c --malloc-may-fail --malloc-fail-null --trace
What behaviour did you expect: line 8 assertion __CPROVER_r_ok(ptr, 0): FAILURE
What happened instead: VERIFICATION SUCCESSFUL

To explain the behavior, let's consider the following example.

// main.c
#include <stdlib.h>
#include <assert.h>

int main() {
  void *ptr = malloc(0);
  __CPROVER_assume(ptr != NULL);
  assert(__CPROVER_r_ok(ptr, 0));
}

I expect the assertion to fail, because ptr could be an invalid pointer; however, it succeeds.

CBMC version 5.27.0 (cbmc-5.26.0-148-g7ced011bd-dirty) 64-bit x86_64 macos
Parsing main.c
Converting
Type-checking main
Generating GOTO Program
Adding CPROVER library (x86_64)
Removal of function pointers and virtual functions
Generic Property Instrumentation
Running with 8 object bits, 56 offset bits (default)
Starting Bounded Model Checking
Runtime Symex: 0.0034956s
size of program expression: 94 steps
simple slicing removed 5 assignments
Generated 1 VCC(s), 1 remaining after simplification
Runtime Postprocess Equation: 1.1753e-05s
Passing problem to propositional reduction
converting SSA
Runtime Convert SSA: 0.000522958s
Running propositional reduction
Post-processing
Runtime Post-process: 8.1631e-05s
Solving with MiniSAT 2.2.1 with simplifier
437 variables, 313 clauses
SAT checker inconsistent: instance is UNSATISFIABLE
Runtime Solver: 9.8288e-05s
Runtime decision procedure: 0.000650914s

** Results:
<builtin-library-malloc> function malloc
[malloc.assertion.1] line 26 max allocation size exceeded: SUCCESS
[malloc.assertion.2] line 31 max allocation may fail: SUCCESS

main.c function main
[main.assertion.1] line 8 assertion __CPROVER_r_ok(ptr, 0): SUCCESS

** 0 of 3 failed (1 iterations)
VERIFICATION SUCCESSFUL

I expect that the first thing __CPROVER_r_ok would check is whether the pointer is valid or not. If not valid, return false.
I also tried verifying this program with the --pointer-primitive-check flag, but got another unexpected behavior.
Command line: cbmc main.c --malloc-may-fail --malloc-fail-null --pointer-primitive-check --trace

CBMC version 5.27.0 (cbmc-5.26.0-148-g7ced011bd-dirty) 64-bit x86_64 macos
Parsing main.c
Converting
Type-checking main
Generating GOTO Program
Adding CPROVER library (x86_64)
Removal of function pointers and virtual functions
Generic Property Instrumentation
Running with 8 object bits, 56 offset bits (default)
Starting Bounded Model Checking
Runtime Symex: 0.00497527s
size of program expression: 99 steps
simple slicing removed 5 assignments
Generated 6 VCC(s), 6 remaining after simplification
Runtime Postprocess Equation: 1.4961e-05s
Passing problem to propositional reduction
converting SSA
Runtime Convert SSA: 0.000804849s
Running propositional reduction
Post-processing
Runtime Post-process: 0.000148184s
Solving with MiniSAT 2.2.1 with simplifier
508 variables, 679 clauses
SAT checker: instance is SATISFIABLE
Runtime Solver: 0.00109353s
Runtime decision procedure: 0.00196439s
Building error trace
Running propositional reduction
Solving with MiniSAT 2.2.1 with simplifier
508 variables, 372 clauses
SAT checker: instance is UNSATISFIABLE
Runtime Solver: 1.8796e-05s
Runtime decision procedure: 3.933e-05s

** Results:
<builtin-library-malloc> function malloc
[malloc.assertion.1] line 26 max allocation size exceeded: SUCCESS
[malloc.assertion.2] line 31 max allocation may fail: SUCCESS

main.c function main
[main.assertion.1] line 8 assertion __CPROVER_r_ok(ptr, 0): SUCCESS
[main.pointer_primitives.1] line 8 pointer invalid in R_OK(ptr, (__CPROVER_size_t)0): SUCCESS
[main.pointer_primitives.2] line 8 deallocated dynamic object in R_OK(ptr, (__CPROVER_size_t)0): SUCCESS
[main.pointer_primitives.3] line 8 dead object in R_OK(ptr, (__CPROVER_size_t)0): SUCCESS
[main.pointer_primitives.4] line 8 pointer outside dynamic object bounds in R_OK(ptr, (__CPROVER_size_t)0): FAILURE
[main.pointer_primitives.5] line 8 pointer outside object bounds in R_OK(ptr, (__CPROVER_size_t)0): SUCCESS

Trace for main.pointer_primitives.4:

State 26 file main.c function main line 6 thread 0
----------------------------------------------------
  ptr=NULL (00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000)

State 30 file main.c function main line 6 thread 0
----------------------------------------------------
  malloc_size=0ul (00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000)

State 59 file main.c function main line 6 thread 0
----------------------------------------------------
  ptr=(const void *)dynamic_object1 (00000010 00000000 00000000 00000000 00000000 00000000 00000000 00000000)

Assumption:
  file main.c line 7 function main
  ptr != (void *)0

Violated property:
  file main.c function main line 8 thread 0
  pointer outside dynamic object bounds in R_OK(ptr, (__CPROVER_size_t)0)
  POINTER_OBJECT(NULL) == POINTER_OBJECT(ptr) || !IS_DYNAMIC_OBJECT(ptr) || POINTER_OFFSET(ptr) >= 0l && OBJECT_SIZE(ptr) >= (unsigned long int)POINTER_OFFSET(ptr) + 1ul



** 1 of 8 failed (2 iterations)
VERIFICATION FAILED

Again, if cbmc checked for validity first, I think we'd also avoid this problem.

@feliperodri added the bug, aws, and aws-medium labels on Apr 20, 2021
@feliperodri
Collaborator Author

cc. @SaswatPadhi

@feliperodri
Collaborator Author

In fact, why does cbmc return dynamic_object1 for malloc(0)?

@SaswatPadhi
Contributor

SaswatPadhi commented Apr 20, 2021

Quoting from https://stackoverflow.com/a/2022402:

The C standard (C17 7.22.3/1) says:

If the size of the space requested is zero, the behavior is implementation defined: either a null pointer is returned, or the behavior is as if the size were some nonzero value, except that the returned pointer shall not be used to access an object.

malloc(0) could return NULL or a valid memory location which shouldn't be dereferenced.
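A minimal sketch of what the quoted paragraph means for calling code: for a zero-size request, both a null and a non-null result are legitimate, so a null result only signals failure when the requested size is nonzero. The helper name `alloc_bytes` is hypothetical and is not part of CBMC or the C library.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (an assumption for illustration only).
 * Stores the result of malloc(n) in *out and returns true if the
 * request was satisfied. */
bool alloc_bytes(size_t n, void **out)
{
  void *p = malloc(n);
  *out = p;
  /* For n == 0 the implementation may return NULL or a non-null pointer
   * that must not be used to access an object; neither is a failure.
   * For n > 0, NULL means the allocation failed. */
  return p != NULL || n == 0;
}
```

Whatever pointer ends up in `*out` can be passed to `free`, since `free(NULL)` is a no-op.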

So, I think there are two issues raised by this bug report:

  1. What does __CPROVER_r_ok do without --pointer-primitive-check flag?
  2. malloc(0) shouldn't return a new dynamic object, as Felipe already pointed out.

Here is a clear soundness bug, even without using __CPROVER_r_ok:

$ cat test.c
#include <stdlib.h>
#include <assert.h>

int main() {
  char *a = malloc(0);
  char *b = malloc(0);
  
  __CPROVER_assume(a != NULL && b != NULL);
  
  assert(a != b);
}

$ cbmc --malloc-may-fail --malloc-fail-null --pointer-check test.c | tail
** Results:
/Users/saspadhi/test.c function main
[main.assertion.1] line 10 assertion a != b: SUCCESS

<builtin-library-malloc> function malloc
[malloc.assertion.1] line 26 max allocation size exceeded: SUCCESS
[malloc.assertion.2] line 31 max allocation may fail: SUCCESS

** 0 of 3 failed (1 iterations)
VERIFICATION SUCCESSFUL

According to C standard, this should fail because a and b could actually be the same location!

malloc(0) should be treated just like a non-deterministic pointer.

@SaswatPadhi changed the title from "Unexpected behavior with __CPROVER_r_ok" to "Incorrect semantics for malloc(0)" on Apr 20, 2021
@tautschnig
Collaborator

[...]

1. What does `__CPROVER_r_ok` do without `--pointer-primitive-check` flag?

See http://cprover.diffblue.com/memory-primitives.html.

2. `malloc(0)` shouldn't return a new dynamic object, as Felipe already pointed out.

Why not?

Here is a clear soundness bug, even without using __CPROVER_r_ok:

[...]

According to C standard, this should fail because a and b could actually be the same location!

I don't think "According to the C standard, this should fail" is correct. The C standard says that the behaviour is implementation-defined. If, however, we want to simulate all permitted implementations, then, yes, we would also have to consider the case where multiple calls to malloc return the same address. This is something we currently don't do, and it's a limitation not restricted to the case of zero-sized allocations.

malloc(0) should be treated just like a non-deterministic pointer.

Why?

@SaswatPadhi
Contributor

SaswatPadhi commented Apr 21, 2021

Hi Michael,

According to C standard, this should fail because a and b could actually be the same location!

I don't think "According to the C standard, this should fail" is correct. The C standard says that the behaviour is implementation-defined.

You're right, I should have said, "according to the C standard, this could fail." Are we currently verifying all properties assuming a particular compiler implementation?

This is something we currently don't do, and it's a limitation not restricted to the case of zero-sized allocations.

I don't know enough about malloc -- can different calls to malloc for non-zero-sized allocations actually return the same address (within the same process)?

malloc(0) should be treated just like a non-deterministic pointer.

Why?

I suggested this to "simulate all platforms", as you mentioned. The standard says it's implementation-defined, so on an arbitrary implementation it would be a non-deterministic pointer.

@tautschnig
Collaborator

tautschnig commented Apr 21, 2021

[...]

Are we currently verifying all properties assuming a particular compiler implementation?

We do honor the system compiler at times, but here it's not just the compiler: also the C library matters. And, no, we don't consistently consider all possible cases or all the specifics of an existing implementation. We should strive to fix this and make sure all such aspects are configurable.

This is something we currently don't do, and it's a limitation not restricted to the case of zero-sized allocations.

I don't know enough about malloc -- can different calls to malloc for non-zero-sized allocations actually return the same address (within the same process)?

Yes:

$ cat re-malloc.c
#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>

int main()
{
  int *p = malloc(sizeof(int));
  uintptr_t p_int = (uintptr_t)p;
  free(p);
  int *q = malloc(sizeof(int));
  uintptr_t q_int = (uintptr_t)q;
  free(q);

  /* PRIxPTR (from <inttypes.h>) is the matching format for uintptr_t */
  printf("p = %" PRIxPTR "\n", p_int);
  printf("q = %" PRIxPTR "\n", q_int);
}

$ gcc re-malloc.c && ./a.out
p = 556dc9aa3260
q = 556dc9aa3260

malloc(0) should be treated just like a non-deterministic pointer.

Why?

I suggested this to "simulate all platforms", as you mentioned. The standard says it's implementation-defined, so on an arbitrary implementation it would be a non-deterministic pointer.

But a non-deterministic pointer would include pointers to existing objects, resulting in dereferencing suddenly becoming valid for such cases.

So, yes, our modeling of malloc could perhaps be made more elaborate. But let's take a few steps back first: what problem are you seeking to solve?

@SaswatPadhi
Contributor

Thanks a lot for the detailed explanation.

malloc could reuse freed addresses, so I see now how different calls to it could return the same address.

But a non-deterministic pointer would include pointers to existing objects, resulting in dereferencing suddenly becoming valid for such cases.

Ah yes, so perhaps a non-deterministic pointer that's invalid, i.e. either NULL or doesn't map to any existing object.

But let's take a few steps back first: what problem are you seeking to solve?

Maybe @feliperodri can comment on this. He was debugging some s2n issue. (I was just digging deep into the bug report.)

@feliperodri
Collaborator Author

feliperodri commented Apr 21, 2021

@tautschnig The problem is that checking for nullness is not enough to make sure a pointer is valid. Take a look at the following example:
Command line: cbmc main.c --malloc-may-fail --malloc-fail-null --pointer-primitive-check --trace

// main.c
#include <stdlib.h>
#include <assert.h>

int main() {
  size_t len;
  void *ptr = malloc(len);
  __CPROVER_assume(ptr != NULL);
  assert(__CPROVER_r_ok(ptr, len));
}

I'd expect this assertion to fail, because for len == 0 the pointer might be invalid. Instead, I get a "pointer outside dynamic object bounds in R_OK(ptr, len)" violation, which only occurs if I use the --pointer-primitive-check flag.

Also, I still don't think we're modelling malloc correctly for the case malloc(0), because CBMC doesn't fail if we try to dereference a pointer that was allocated with malloc(0). According to MEM04-C, this behavior is implementation-defined:

  1. either a null pointer is returned, or
  2. the behavior is as if the size were some nonzero value, except that the returned pointer shall not be used to access an object.

Therefore, I expect two things:

  1. malloc(0) should return either NULL or an invalid pointer;
  2. __CPROVER_r_ok returns false for invalid pointers even with --pointer-primitive-check (this should be the first thing this primitive checks);

For a more concrete example, take a look at the proof harness for s2n_hash_block_size. If we change the allocation to use a non-deterministic integer, this proof will always fail for the case when malloc_size == 0ul. So, there is an implicit assumption in this proof about the allocated size of that pointer.

@feliperodri
Collaborator Author

This might not be an urgent problem. There are two important remarks here:

  1. We no longer use __CPROVER_*_ok primitives in proofs, in order to make sure they are not used in negative contexts.
  2. CBMC reports an error when we try to dereference a pointer previously allocated with size zero. See the following example:

Command line: cbmc main.c --malloc-may-fail --malloc-fail-null --pointer-primitive-check --pointer-check

     1	// main.c
     2	#include <stdlib.h>
     3	#include <assert.h>
     4
     5	int main() {
     6	  size_t len;
     7	  size_t *ptr = malloc(len);
     8	  __CPROVER_assume(ptr != NULL);
     9	  *ptr = len;
    10	}

CBMC indicates there is a "pointer outside dynamic object bounds" violation in line 9. See the output:

** Results:
<builtin-library-malloc> function malloc
[malloc.assertion.1] line 26 max allocation size exceeded: SUCCESS
[malloc.assertion.2] line 31 max allocation may fail: SUCCESS

main.c function main
[main.pointer_dereference.1] line 9 dereference failure: pointer NULL in *ptr: SUCCESS
[main.pointer_dereference.2] line 9 dereference failure: pointer invalid in *ptr: SUCCESS
[main.pointer_dereference.3] line 9 dereference failure: deallocated dynamic object in *ptr: SUCCESS
[main.pointer_dereference.4] line 9 dereference failure: dead object in *ptr: SUCCESS
[main.pointer_dereference.5] line 9 dereference failure: pointer outside dynamic object bounds in *ptr: FAILURE
[main.pointer_dereference.6] line 9 dereference failure: pointer outside object bounds in *ptr: SUCCESS
[main.pointer_dereference.7] line 9 dereference failure: invalid integer address in *ptr: SUCCESS

** 1 of 9 failed (2 iterations)
VERIFICATION FAILED

Thus, we will get an error either way, but I still think these are wrong:

[main.pointer_dereference.2] line 9 dereference failure: pointer invalid in *ptr: SUCCESS
[main.pointer_dereference.3] line 9 dereference failure: deallocated dynamic object in *ptr: SUCCESS

@tautschnig
Collaborator

@feliperodri Thank you for your further clarification! Perhaps the main point we need to work out is what exactly an "invalid" pointer is? Here are my thoughts:

[...]
Therefore, I expect two things:

  1. malloc(0) should return either NULL or an invalid pointer;

What is your definition of an "invalid" pointer? As the standard prescribes that "the behavior is as if the size were some nonzero value" the pointer returned by a call malloc(0) must either be NULL or a pointer to an object of the requested size. That returned pointer is fine to be used except for trying to access an object (i.e., no read from or write to the dereferenced pointer).

  2. __CPROVER_r_ok returns false for invalid pointers even with --pointer-primitive-check (this should be the first thing this primitive checks);
    [...]

I agree with this statement, except the definition of "invalid pointer" is to be clarified.
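As a sketch of the point that a non-null malloc(0) result may be used for anything except accessing an object, the function below compares the pointer, passes it to `realloc`, and frees it, but never dereferences it before it has been resized. The name `make_buffer` is hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: start from a zero-size allocation and grow it.
 * Returns a zero-filled buffer of n bytes, or NULL if n == 0 or the
 * allocation failed. */
char *make_buffer(size_t n)
{
  char *p = malloc(0); /* NULL or a pointer that must not be dereferenced */

  if (n == 0) {
    free(p); /* valid for both outcomes; free(NULL) does nothing */
    return NULL;
  }

  /* realloc accepts NULL (acts like malloc) as well as a non-null
   * pointer previously returned by malloc(0). */
  char *q = realloc(p, n);
  if (q == NULL) {
    free(p); /* on failure the original allocation is still live */
    return NULL;
  }

  memset(q, 0, n); /* accessing is fine now: q refers to n bytes */
  return q;
}
```

The only operation avoided on `p` is reading or writing through it, which is what the standard forbids.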

@feliperodri
Collaborator Author

Perhaps the main point we need to work out is what exactly an "invalid" pointer is?

@tautschnig Precisely! Take a look at the next example.

Command line: cbmc main.c --malloc-may-fail --malloc-fail-null --trace

// main.c
#include <assert.h>

int main() {
  int *ptr;
  assert(__CPROVER_r_ok(ptr, 0));
}

In this case, CBMC considers that ptr == INVALID-3. Take a look at the complete counterexample.

** Results:
main.c function main
[main.assertion.1] line 6 assertion __CPROVER_r_ok(ptr, 0): FAILURE

Trace for main.assertion.1:

State 24 file main.c function main line 5 thread 0
----------------------------------------------------
  ptr=INVALID-3 (00000011 10000000 00000000 00000000 00000000 00000000 00000000 00000100)

Violated property:
  file main.c function main line 6 thread 0
  assertion __CPROVER_r_ok(ptr, 0)
  !((signed long int)(signed long int)!((FALSE || !(POINTER_OBJECT((const void *)ptr) == POINTER_OBJECT(NULL))) && !IS_INVALID_POINTER((const void *)ptr) && (FALSE || !(POINTER_OBJECT((const void *)ptr) == POINTER_OBJECT(__CPROVER_deallocated))) && (FALSE || !(POINTER_OBJECT((const void *)ptr) == POINTER_OBJECT(__CPROVER_dead_object))) && (FALSE || (IS_DYNAMIC_OBJECT((const void *)ptr) ==> !(POINTER_OFFSET((const void *)ptr) < 0l || (__CPROVER_size_t)POINTER_OFFSET((const void *)ptr) + (__CPROVER_size_t)0 > OBJECT_SIZE((const void *)ptr)))) && (FALSE || (!IS_DYNAMIC_OBJECT((const void *)ptr) ==> !(POINTER_OFFSET((const void *)ptr) < 0l || (__CPROVER_size_t)POINTER_OFFSET((const void *)ptr) + (__CPROVER_size_t)0 > OBJECT_SIZE((const void *)ptr)))) && (POINTER_OBJECT(NULL) == POINTER_OBJECT((const void *)ptr) && NULL != (const void *)ptr ==> FALSE)) != 0l)



** 1 of 1 failed (2 iterations)
VERIFICATION FAILED

What does INVALID-3 mean in this context? Why can't we use it to represent an invalid pointer in the malloc(0) case as well? AFAIK, we can still use ptr later in this program and we would only detect errors during dereferences. This seems like exactly what we want for malloc(0), right?

@tautschnig
Collaborator

CBMC's encoding of pointers, at present, has three groups of object: the NULL object, the INVALID object, and all actually allocated objects. For a reason that isn't known to me, symbolic execution initializes each pointer to be pointing to the INVALID object (note that this is different from a non-deterministic value, and causes some confusion when afterwards trying to use __CPROVER_assume, there are multiple issues about this aspect).

Such a pointer cannot be NULL and cannot point to any existing object. We could make malloc(0) return such a pointer, but the only difference that I can see is that __CPROVER_r_ok(ptr, 0) would then fail --pointer-primitive-check. Given that I'm not sure what the semantics of __CPROVER_r_ok(ptr, 0) (i.e., trying to establish whether reading zero bytes starting from address ptr is valid) is, I'm not sure whether this actually makes any difference.

@feliperodri
Collaborator Author

@tautschnig thank you for the clarification.

CBMC's encoding of pointers, at present, has three groups of object: the NULL object, the INVALID object, and all actually allocated objects. For a reason that isn't known to me, symbolic execution initializes each pointer to be pointing to the INVALID object (note that this is different from a non-deterministic value, and causes some confusion when afterwards trying to use __CPROVER_assume, there are multiple issues about this aspect).

  1. Can we make malloc(0) non-deterministically return NULL or INVALID? I can see the reason why CBMC doesn't consider that a pointer could be non-deterministically assigned to any allocated object, but I don't see a problem with implementing malloc(0) to non-deterministically return either NULL or INVALID. This would match the definition in MEM04-C.

Such a pointer cannot be NULL and cannot point to any existing object. We could make malloc(0) return such a pointer, but the only difference that I can see is that __CPROVER_r_ok(ptr, 0) would then fail --pointer-primitive-check. Given that I'm not sure what the semantics of __CPROVER_r_ok(ptr, 0) (i.e., trying to establish whether reading zero bytes starting from address ptr is valid) is, I'm not sure whether this actually makes any difference.

  2. __CPROVER_r_ok and __CPROVER_w_ok should always return false, if the pointer is INVALID, regardless of --pointer-primitive-check, because invalid pointers are not readable or writable. Thus, if these checks fail because of --pointer-primitive-check, isn't that a bug? Why is the following property even part of the primitive check? I'd expect that all memory primitive functions would handle INVALID pointers, since they are one of the "groups of objects".
[main.pointer_primitives.1] line 6 pointer invalid in R_OK((const void *)ptr, (__CPROVER_size_t)0): FAILURE
  3. The semantics of __CPROVER_r_ok(ptr, 0) should be:
    3.1. Return false if ptr == INVALID || ptr == NULL; otherwise,
    3.2. Return true (this is an improved nullness check).
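The proposed semantics can be sketched with a toy model of the three pointer groups described above (NULL, INVALID, allocated object). Everything here is hypothetical: `ptr_kind`, `model_ptr`, and `model_r_ok` are illustrative names and do not reflect CBMC's internal encoding.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model (an assumption for illustration, not CBMC internals). */
enum ptr_kind { PTR_NULL, PTR_INVALID, PTR_OBJECT };

struct model_ptr {
  enum ptr_kind kind;
  size_t object_size; /* size of the pointed-to object, if any */
  size_t offset;      /* offset of the pointer into that object */
};

/* Proposed reading of r_ok(p, n): NULL and INVALID pointers are never
 * readable, not even for n == 0; otherwise the n bytes starting at the
 * offset must lie within the object. */
bool model_r_ok(struct model_ptr p, size_t n)
{
  if (p.kind != PTR_OBJECT)
    return false;
  if (p.offset > p.object_size)
    return false;
  return n <= p.object_size - p.offset;
}
```

Under this model a pointer to a zero-sized object still satisfies `model_r_ok(p, 0)`, while an INVALID pointer does not, which is the distinction the list above asks for.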

@SaswatPadhi any thoughts?

@SaswatPadhi
Contributor

SaswatPadhi commented Apr 23, 2021

In addition to the MEM04-C page, which Felipe mentioned above, I was also looking at these slides [http://index-of.es/Exploit/HES10-jvanegue_zero-allocations.pdf], and they discuss various verification approaches, but I think the most relevant bits are in slides 24 -- 35.

__CPROVER_r_ok and __CPROVER_w_ok should always return false, if the pointer is INVALID, regardless of --pointer-primitive-check, because invalid pointers are not readable or writable.

This was exactly my confusion as well. Michael pointed me to http://cprover.diffblue.com/memory-primitives.html, which states:

the primitives listed in the Memory Primitives section require a pointer that is either null or valid to have well-defined semantics. CBMC has the option --pointer-primitive-check to detect potential misuses of the memory primitives. It checks that the pointers that appear in these primitives are either null or valid

So basically, unless --pointer-primitive-check is specified, r_ok and w_ok would happily accept invalid pointers.

I am still not clear about the difference between --pointer-check and --pointer-primitive-check here. I get a single gigantic assertion in the --trace with --pointer-check, which looks a lot like what --pointer-primitive-check does as separate assertions.

$ cat test.c
#include <assert.h>
int main() {
  int *ptr;
  assert(__CPROVER_r_ok(ptr, 0));
}

$ cbmc --pointer-check --trace test.c
[...]
Violated property:
  file /Users/saspadhi/test.c function main line 5 thread 0
  assertion __CPROVER_r_ok(ptr, 0)
  !((signed long int)(signed long int)!((FALSE || !(POINTER_OBJECT((const void *)ptr) == POINTER_OBJECT(NULL))) &&
  !IS_INVALID_POINTER((const void *)ptr) &&
  (FALSE || !(POINTER_OBJECT((const void *)ptr) == POINTER_OBJECT(__CPROVER_deallocated))) && (FALSE || !(POINTER_OBJECT((const void *)ptr) == POINTER_OBJECT(__CPROVER_dead_object))) &&
  (FALSE || (IS_DYNAMIC_OBJECT((const void *)ptr) ==> !(POINTER_OFFSET((const void *)ptr) < 0l || (__CPROVER_size_t)POINTER_OFFSET((const void *)ptr) + (__CPROVER_size_t)0 > OBJECT_SIZE((const void *)ptr)))) &&
  (FALSE || (!IS_DYNAMIC_OBJECT((const void *)ptr) ==> !(POINTER_OFFSET((const void *)ptr) < 0l || (__CPROVER_size_t)POINTER_OFFSET((const void *)ptr) + (__CPROVER_size_t)0 > OBJECT_SIZE((const void *)ptr)))) &&
  (POINTER_OBJECT(NULL) == POINTER_OBJECT((const void *)ptr) && NULL != (const void *)ptr ==> FALSE)) != 0l)

** 1 of 1 failed (2 iterations)
VERIFICATION FAILED

@tautschnig
Collaborator

--pointer-primitive-check will only generate checks for pointers used within pointer_object, pointer_offset, object_size, r_ok, w_ok, is_dynamic_object expressions. For such pointers, however, it is perfectly possible that --pointer-check (elsewhere, e.g., within dereference) might generate similar checks.

@jimgrundy
Collaborator

Sorry to come late to this party. I have a different expectation of behavior here:

I expect the example @feliperodri gives back at the start of this discussion to pass. In general, I expect the following to pass:

size_t s;
void *p = malloc(s);
__CPROVER_assume(p != NULL);
assert(__CPROVER_r_ok(p,s));

If this sort of thing doesn't pass, it will add needless complication to the contracts of any function that takes a pointer and a size where the size is allowed to be 0 (and presumably in that case the function does nothing), memcpy-type functions for example.

If the size is 0 and p is not NULL then we have a pointer that is not valid to access. I would expect the predicate __CPROVER_r_ok(p,0) to be vacuously true for such pointers since we are saying we need accessibility only for 0 bytes. If it doesn't work like this I'm going to have to write (size == 0 || __CPROVER_r_ok(p,size)) everywhere, and that just adds clutter. Now, ok, I know that __CPROVER_r_ok doesn't currently work in assumptions, but we want that fixed too.
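A sketch of the clutter described above, for a memcpy-style function whose size may be 0. The macros `READABLE` and `WRITABLE` and the function `copy_bytes` are hypothetical names. The sketch assumes that the `__CPROVER__` macro is defined when CBMC processes the file; outside CBMC the check falls back to a plain null test so the code still compiles with an ordinary compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifdef __CPROVER__
/* Under CBMC (assumption: __CPROVER__ is defined), guard the primitives
 * with an explicit size == 0 case. */
#define READABLE(p, n) ((n) == 0 || __CPROVER_r_ok((p), (n)))
#define WRITABLE(p, n) ((n) == 0 || __CPROVER_w_ok((p), (n)))
#else
/* Outside CBMC only nullness can be checked. */
#define READABLE(p, n) ((n) == 0 || (p) != NULL)
#define WRITABLE(p, n) ((n) == 0 || (p) != NULL)
#endif

/* Copies n bytes from src to dst and returns the number of bytes copied.
 * For n == 0 the pointers are never touched. */
size_t copy_bytes(void *dst, const void *src, size_t n)
{
  assert(READABLE(src, n));
  assert(WRITABLE(dst, n));
  if (n == 0)
    return 0;
  memcpy(dst, src, n);
  return n;
}
```

If `__CPROVER_r_ok(p, 0)` were vacuously true, the `(n) == 0 ||` disjunct in each macro would be unnecessary.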

@jimgrundy
Collaborator

Continuing to party late...

I don't share @SaswatPadhi's expectation that the value returned by malloc(0) could be pretty much anything. The man page for malloc says that if malloc(0) returns a non-null pointer then you may not access that pointer. But... it says explicitly that you may free that pointer. That tells you a lot.

The following code should be a no-op:

void *p = malloc(0);
if (p != NULL) {free(p);}

If malloc(0) could return arbitrary pointers this could result in freeing other chunks of memory.

@SaswatPadhi gives an example above, which passes but which he expects to fail (because "According to C standard, this should fail because a and b could actually be the same location"):

int main() {
  char *a = malloc(0);
  char *b = malloc(0);
  
  __CPROVER_assume(a != NULL && b != NULL);
  
  assert(a != b);
}

I expect this to pass. Note that malloc isn't defined by the C language, but rather by the standard library. Here is what the Linux man page (standard library documentation) says about malloc:

If size is 0, then malloc() returns either NULL, or a unique pointer value that can later be successfully passed to free()

Similarly, the OpenBSD man page says this:

If nmemb or size is equal to 0, a unique pointer to an access protected, zero sized object is returned

Since the non-null returned values are required to be unique this code should pass.
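The uniqueness property quoted from the man pages can be observed at run time with a small check. This is only a sketch: `zero_allocs_distinct` is a hypothetical name, and the result reflects whichever C library the program is linked against rather than all permitted implementations.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical check: two live zero-size allocations are either NULL
 * or distinct from each other. */
bool zero_allocs_distinct(void)
{
  void *a = malloc(0);
  void *b = malloc(0);

  /* If either call returned NULL there is nothing to compare. */
  bool ok = (a == NULL || b == NULL || a != b);

  /* Both results may be passed to free; free(NULL) does nothing. */
  free(a);
  free(b);
  return ok;
}
```

Both pointers are kept live until after the comparison, since an address may be reused once it has been freed.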

@SaswatPadhi
Contributor

SaswatPadhi commented Aug 18, 2021

Actually the ISO C standard does talk about malloc(0). All major versions of the standard 9899:1990 - 9899:2018 say that:

If the size of the space requested is zero, the behavior is implementation-defined: either a null pointer is returned to indicate an error, or the behavior is as if the size were some nonzero value, except that the returned pointer shall not be used to access an object.

(Source: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2310.pdf; section 7.22.3; page 256)

Since the standard says that the behavior is "implementation-defined", Linux and BSD are not wrong in additionally requiring the uniqueness property, but my point was that it's not part of the ISO C standard. In other words, an ISO C compliant compiler on some (not-so-sane) platform could return non-unique pointers on malloc(0), or at least that's what I understood from this paragraph and I might be wrong about that.

@jimgrundy
Collaborator

Hi Saswat. Thanks for the correction. You are quite right that the standard also covers these standard library functions.

But my reading of the standard is that malloc(0) may return NULL or a pointer (implementation choice), but that if it returns a pointer it must be unique (distinct from other currently live malloced pointers). Why do I think it says that?

As the quoted text says, you can either return NULL, or you can act like it was a malloc of some non-zero size (that didn't return null) (except that you can't access through the returned pointer). What does the standard say about non-null values returned by malloc for nonzero sizes? It says "Each such allocation shall yield a pointer to an object disjoint from any other object.". Of course, this is English so it is still open to interpretation. Would we consider two pointers to the same object to be disjoint if the size of the object were 0? Well, remembering that "the behavior is as if the size were some nonzero value" I am going to say no, I don't think we are supposed to consider them disjoint.

When the Linux and BSD man pages say that they will return NULL or a unique pointer, I don't think they are saying anything different from the standard. I read them as saying the same thing differently: a clarification rather than a narrowing. This seems to be the common interpretation. For example, the Solaris malloc man page says:

"If size, nelem, or elsize is 0, the allocation functions return a unique non-null pointer that can be passed to free()"

MacOS malloc says nothing explicit about behavior when size is 0 that I can see, sadly, while the Windows malloc manual says something much more explicit about allocating 0 size things on the heap, which seems to imply that the results will be unique.

Suppose you don't buy my line of reasoning above, try this one instead:

Another way to look at it is this. The C standard says that if malloc(0) returns a non-null value, then that value is like the result of a non-zero-sized malloc (except that you can't access through it). Since not being able to access through the pointer is the only difference it names between a non-null zero-sized malloc and a non-null non-zero-sized malloc, I think that says you can free it. This is something that the various unix malloc man pages say explicitly. I don't think they are narrowing the standard here, just clarifying. So, what does that entail?

To me it means that I expect this to be OK:

void *p = malloc(0);
void *q = malloc(0);
if (p && q) {
  free(p);
  free(q);
}

But, if this is ok it would imply that when p and q are both non-null they are also distinct, otherwise we'd be calling free twice on the same pointer, which the C standard says is undefined:

if the argument does not match a pointer earlier returned by a memory management function, or if the space has been
deallocated by a call to free or realloc, the behavior is undefined.

So, I think that in this way the C standard also implicitly requires that non-null values returned by malloc be unique (regardless of the size malloced).

@SaswatPadhi
Contributor

Thanks a lot for the detailed explanation! Thinking more about this part,

the behavior is as if the size were some nonzero value

your reasoning makes a lot of sense.

@jimgrundy
Collaborator

I think maybe we can close this. Are you ok with that @SaswatPadhi @feliperodri @tautschnig ?
