Fix handling of DEAD instructions and function call inlining #6473

SaswatPadhi · 2021-11-24T03:31:35Z

This PR makes three small changes related to code contracts:

fix an overflow error in the handling of DEAD instructions during write set inclusion checking
refactor and optimize function call inlining during code contracts instrumentation
extend write set inclusion checks to loop head instructions (in addition to the loop body) to check for possible side effects in the guard

Each commit message has a non-empty body, explaining why the change was made.
Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
n/a ~~The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/~~
Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
n/a ~~My commit message includes data points confirming performance improvements (if claimed).~~
My PR is restricted to a single feature or bugfix.
n/a ~~White-space or formatting changes outside the feature-related changed lines are in commits of their own.~~

codecov · 2021-11-24T05:01:40Z

Codecov Report

Merging #6473 (abd36cf) into develop (2a9e3e2) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff            @@
##           develop    #6473   +/-   ##
========================================
  Coverage    76.02%   76.02%           
========================================
  Files         1546     1546           
  Lines       165352   165359    +7     
========================================
+ Hits        125711   125717    +6     
- Misses       39641    39642    +1

Impacted Files	Coverage Δ
src/goto-instrument/contracts/contracts.cpp	`97.10% <100.00%> (+0.35%)`	⬆️
src/goto-instrument/loop_utils.cpp	`82.85% <0.00%> (-8.58%)`	⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e6b3118...abd36cf. Read the comment docs.

When handling DEAD instructions during assigns clause instrumentation for functions and loops, the instruction iterator was not incremented correctly. This led to instrumentation outside the function/loop scope and to spurious write set inclusion violations. Moreover, for loops, the declarations of some temporaries (those involved in the "initialization" of the loop counter, for instance) lie "outside" the loop identified by CBMC, so we no longer raise an exception when a DEAD has no corresponding DECL. These variables are writable because they appear as assigns clause targets, not because they were local DECLs.
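As a rough illustration of the behaviour described above, here is a minimal sketch of a scope walk that visits each instruction exactly once, stops at the end of the scope, and tolerates a DEAD whose DECL lies outside the range. The types `kindt`, `instructiont`, `programt` and the functions `check_scope` and `example` are hypothetical stand-ins invented for this sketch; they are not CBMC's goto-program classes or the code changed in this PR.

```cpp
#include <cassert>
#include <list>
#include <set>
#include <string>
#include <vector>

// Toy stand-ins (assumption): NOT CBMC's goto-program types.
enum class kindt { DECL, DEAD, ASSIGN };

struct instructiont
{
  kindt kind;
  std::string symbol;
};

using programt = std::list<instructiont>;

// Walks the instructions in [begin, end) and returns the symbols that are
// assigned while being neither a live local nor an assigns clause target.
std::vector<std::string> check_scope(
  programt::const_iterator begin,
  programt::const_iterator end,
  const std::set<std::string> &assigns_targets)
{
  std::set<std::string> locals;
  std::vector<std::string> violations;

  // Exactly one increment per instruction, and the walk never passes `end`.
  for(auto it = begin; it != end; ++it)
  {
    switch(it->kind)
    {
    case kindt::DECL:
      locals.insert(it->symbol);
      break;
    case kindt::DEAD:
      // erase() simply removes nothing when the DECL was outside the range,
      // so a DEAD without a matching DECL is tolerated rather than an error.
      locals.erase(it->symbol);
      break;
    case kindt::ASSIGN:
      if(!locals.count(it->symbol) && !assigns_targets.count(it->symbol))
        violations.push_back(it->symbol);
      break;
    }
  }
  return violations;
}

// Usage example on a hand-written instruction sequence.
void example()
{
  const programt loop = {
    {kindt::ASSIGN, "i"},   // assigns clause target: allowed
    {kindt::DECL, "tmp"},
    {kindt::ASSIGN, "tmp"}, // live local: allowed
    {kindt::DEAD, "tmp"},
    {kindt::DEAD, "i"},     // DECL of i is outside the range: tolerated
    {kindt::ASSIGN, "x"}};  // neither local nor target: reported
  const std::set<std::string> targets = {"i"};
  const auto violations = check_scope(loop.begin(), loop.end(), targets);
  assert(violations == std::vector<std::string>{"x"});
}
```

In this toy model, `i` stays writable after its DEAD because it is an assigns target, which mirrors the reasoning in the paragraph above.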

Function call inlining was earlier performed on the same function multiple times if it had multiple loops. We refactor the inlining call out and only inline the function once, even if it has multiple loops.

Previously, only the loop body was instrumented for write set inclusion checks. This could miss writes performed by the loop guard if it has side effects. In this PR, we also instrument the loop guard with inclusion checks.
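To make the guard case concrete, here is a small plain C++ sketch of a loop whose guard writes to memory. The function names `lowerbound`, `upperbound`, and `incr` are borrowed from the loop header in the regression test quoted in the review; their bodies, the global `guard_evaluations`, and the helpers `run_loop` and `example` are assumptions made up for illustration, and the sketch does not use CBMC's contract syntax.

```cpp
#include <cassert>

// Hypothetical global that only the guard writes to.
int guard_evaluations = 0;

// Assumed bodies: the regression test excerpt only shows the loop header.
int lowerbound()
{
  return 0;
}

int upperbound()
{
  ++guard_evaluations; // side effect performed while evaluating the guard
  return 3;
}

void incr(int *i)
{
  ++*i;
}

// Runs the loop and returns how many times the body executed.
int run_loop()
{
  int body_iterations = 0;
  for(int i = lowerbound(); i < upperbound(); incr(&i))
    ++body_iterations;
  return body_iterations;
}

// Usage example: the guard runs once more than the body.
void example()
{
  guard_evaluations = 0;
  assert(run_loop() == 3);
  assert(guard_evaluations == 4);
}
```

The write to `guard_evaluations` never occurs inside the loop body, so a check that only covers the body would not see it.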

remi-delmas-3000

missing some tests with actual side effects in the guard evaluation, and unresolved issue with decreases clause evaluation

remi-delmas-3000 · 2021-11-24T19:15:50Z

src/goto-instrument/contracts/contracts.cpp

  // FIXME: This simple approach wouldn't work when
  // the loop guard in the source file is split across multiple lines.
  const auto head_loc = loop_head->source_location();
-  while(loop_head->source_location() == head_loc)
+  while(loop_head->source_location() == head_loc ||


This seems fragile. An alternative solution could be to have the C front end inject LOCATION markers to precisely delimit loop entry, loop guard evaluation, loop condition test, loop body, loop step instructions, jump back to head instructions.

Yes, @martin-cs suggested something similar (keeping some markers around), in my other PR.

I would suggest not fixing this in the same PR, especially because it would touch the C front end (parsing functions etc.). This PR is tiny: it only fixes the assigns clause issues (with DEAD) and makes them play well with function call inlining (modulo the existing issue with loop body finding).

I think I agree with both. I have already had Opinions at @SaswatPadhi about loop structure detection (@remi-delmas-3000, did I CC you? If not, I can resend); this is fragile and is asking for trouble and inconsistency with the rest of CPROVER. I think this functionality should be factored out and used in all appropriate places.

But I also agree with @SaswatPadhi; that is outside the scope of this PR.

remi-delmas-3000 · 2021-11-24T19:21:01Z

regression/contracts/loop_assigns-05/main.c

+
+void main()
+{
+  for(int i = lowerbound(); i < upperbound(); incr(&i))


could we also add another simple test such as

int main()
{
  unsigned int max;
  assume(max > 0);
  unsigned int i = 0;
  while(i < max++)
    loop_decreases(max - i)
  {
    i++;
  }
}

This loop's decreases clause is incorrect: max and i can both overflow (although the loop still terminates, either when max overflows and becomes smaller than i, or when it stays at max_int and i starts catching up with it).
We should be able to find all these errors.
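One way to see why `max - i` is not a valid measure is to simulate the loop in plain C++. This is only a sketch, not a CBMC harness: `tracet` and `run_loop` are hypothetical names, and it assumes the measure is sampled at the loop head before the guard runs and again after the body.

```cpp
#include <cassert>
#include <climits>

// Hypothetical record of one run of the loop.
struct tracet
{
  unsigned iterations;    // number of times the body ran
  bool measure_decreased; // true only if (max - i) shrank on every iteration
};

// Simulates `while(i < max++) { i++; }` starting from i == 0.
tracet run_loop(unsigned max)
{
  assert(max > 0); // mirrors assume(max > 0)
  unsigned i = 0;
  tracet trace{0, true};

  for(;;)
  {
    // Assumed sampling point: the measure is read before the guard runs.
    const unsigned before = max - i;

    // Guard `i < max++`: compare against the old value, then increment.
    // Unsigned increment wraps to 0 at UINT_MAX, which is well-defined.
    const bool guard = i < max;
    max++;
    if(!guard)
      break;

    i++; // loop body
    ++trace.iterations;

    // A valid measure must be strictly smaller after each iteration.
    if(!(max - i < before))
      trace.measure_decreased = false;
  }
  return trace;
}
```

Starting from `UINT_MAX - 2`, the body runs three times before `max` wraps to 0 and the guard fails, and `max - i` keeps the same value on every iteration because the guard and the body each add one.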

Agreed that you should be able to find these and agreed that this would be worth adding as a test.
Not entirely sure how this connects to the specific issue being fixed though.

I think this is a separate issue from the one addressed in this PR. @remi-delmas-3000 is there an issue where we track this?

remi-delmas-3000 · 2021-11-24T20:12:01Z

src/goto-instrument/contracts/contracts.cpp

+  if(!natural_loops.loop_map.size())
+    return;
+
+  goto_function_inline(
+    goto_functions, function_name, ns, log.get_message_handler());
+


inlining after loop detection will miss loops that are hidden behind function calls that do not have contracts

which is okay? because they don't have contracts anyway?

I don't object but do note that this kind of inlining can get expensive.

remi-delmas-3000 · 2021-11-24T20:14:35Z

src/goto-instrument/contracts/contracts.cpp

+        log.warning() << "Found a `DEAD` variable "
+                      << name2string(symbol.get_identifier())
+                      << " without corresponding `DECL`, at: "
+                      << instruction_it->source_location() << messaget::eom;


Do we really need to warn about constructs the user has no control over? How could we distinguish such expected cases from genuinely incorrect ones?

We could probably use log.debug or log.info, yes.

Yeah... I am not sure this is a "logging at run-time to the user" issue, in part because, as @remi-delmas-3000 says, it is not clear what the user is supposed to do with this information.

This feels like a DATA_INVARIANT of a goto-program: you can only DEAD a variable that you have previously DECL'd. This seems reasonable and feels like the kind of thing that should be in the --validate-goto-model checks.
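The invariant suggested here could be sketched as a simple check over a whole function body. The types `kindt` and `instructiont` and the functions `every_dead_has_decl` and `example` are hypothetical and invented for this sketch; it is not how CBMC's validation passes are written, and it scans the instructions linearly without following control flow.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy stand-ins (assumption): NOT CBMC's goto-program types.
enum class kindt { DECL, DEAD, OTHER };

struct instructiont
{
  kindt kind;
  std::string symbol;
};

// Returns true if every DEAD is preceded by a DECL of the same symbol
// that has not already been killed by an earlier DEAD.
bool every_dead_has_decl(const std::vector<instructiont> &function_body)
{
  std::set<std::string> live;
  for(const auto &instruction : function_body)
  {
    if(instruction.kind == kindt::DECL)
      live.insert(instruction.symbol);
    else if(
      instruction.kind == kindt::DEAD && live.erase(instruction.symbol) == 0)
      return false; // nothing was live under that name
  }
  return true;
}

// Usage example on two hand-written instruction sequences.
void example()
{
  // Well-formed: tmp is declared before it dies.
  const std::vector<instructiont> ok = {
    {kindt::DECL, "tmp"}, {kindt::OTHER, ""}, {kindt::DEAD, "tmp"}};
  assert(every_dead_has_decl(ok));

  // Ill-formed when viewed in isolation: i dies without a DECL.
  const std::vector<instructiont> bad = {{kindt::DEAD, "i"}};
  assert(!every_dead_has_decl(bad));
}
```

A check like this would have to run over the whole function rather than over a single loop, since the PR description notes that some DECLs sit outside the loop that CBMC identifies.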

SaswatPadhi · 2021-11-24T21:07:43Z

Thanks for taking a look, @remi-delmas-3000

> missing some tests with actual side effects in the guard evaluation, and unresolved issue with decreases clause evaluation

Guards with side effects were out of scope until now. This PR only changes assigns clause instrumentation to happen on the entire loop that CBMC identifies, rather than just the body. We haven't tested loops with guards that have side effects though, and there could be other issues with the invariant assumption / havoc statement / base case assertion placement, that we might want to fix before we can get decreases clauses working.

I would suggest keeping this PR only about assigns clause fixes (for both function and loop contracts) and making separate PRs for fixing issues with loops whose guards have side effects, handling do/while loops, etc.

remi-delmas-3000 · 2021-11-24T21:57:59Z

approving to merge but we need to keep in mind the issues mentioned in my review

martin-cs

I am not going to block this over the logging but please could you address it?

( Also thanks for the really well split-out changes and clear commit messages, makes it so much easier to read. )

martin-cs · 2021-11-25T14:31:56Z

regression/contracts/loop_assigns-05/main.c

+
+void main()
+{
+  for(int i = lowerbound(); i < upperbound(); incr(&i))


Agreed that you should be able to find these and agreed that this would be worth adding as a test.
Not entirely sure how this connects to the specific issue being fixed though.

martin-cs · 2021-11-25T14:37:18Z

src/goto-instrument/contracts/contracts.cpp

+        log.warning() << "Found a `DEAD` variable "
+                      << name2string(symbol.get_identifier())
+                      << " without corresponding `DECL`, at: "
+                      << instruction_it->source_location() << messaget::eom;


Yeah... I am not sure this is a "logging at run-time to the user" issue, in part because, as @remi-delmas-3000 says, it is not clear what the user is supposed to do with this information.

This feels like a DATA_INVARIANT of a goto-program: you can only DEAD a variable that you have previously DECL'd. This seems reasonable and feels like the kind of thing that should be in the --validate-goto-model checks.

martin-cs · 2021-11-25T14:40:17Z

src/goto-instrument/contracts/contracts.cpp

  // FIXME: This simple approach wouldn't work when
  // the loop guard in the source file is split across multiple lines.
  const auto head_loc = loop_head->source_location();
-  while(loop_head->source_location() == head_loc)
+  while(loop_head->source_location() == head_loc ||


I think I agree with both. I have already had Opinions at @SaswatPadhi about loop structure detection (@remi-delmas-3000, did I CC you? If not, I can resend); this is fragile and is asking for trouble and inconsistency with the rest of CPROVER. I think this functionality should be factored out and used in all appropriate places.

But I also agree with @SaswatPadhi; that is outside the scope of this PR.

martin-cs · 2021-11-25T14:41:39Z

src/goto-instrument/contracts/contracts.cpp

+  if(!natural_loops.loop_map.size())
+    return;
+
+  goto_function_inline(
+    goto_functions, function_name, ns, log.get_message_handler());
+


I don't object but do note that this kind of inlining can get expensive.

feliperodri · 2021-11-29T18:25:48Z

regression/contracts/loop_assigns-05/main.c

+
+void main()
+{
+  for(int i = lowerbound(); i < upperbound(); incr(&i))


I think this is a separate issue from the one addressed in this PR. @remi-delmas-3000 is there an issue where we track this?

tautschnig

I was too late to the party: I'll create a follow-up PR to fix the issues I called out.

tautschnig · 2021-11-30T14:46:02Z

src/goto-instrument/contracts/contracts.cpp

+        // they must appear as assigns targets anyway,
+        // but their DECL statements are outside of the loop.
+        log.warning() << "Found a `DEAD` variable "
+                      << name2string(symbol.get_identifier())


name2string seems to be rarely used (to the extent that I'm wondering whether we should get rid of it). I'd suggest you use id2string instead, though here you actually don't need either!

tautschnig · 2021-11-30T14:50:15Z

src/goto-instrument/contracts/contracts.cpp

+    return;
+
+  goto_function_inline(
+    goto_functions, function_name, ns, log.get_message_handler());


As discussed offline: calls to goto_function_inline will require goto_functions.update() to be called afterwards.

tautschnig · 2021-11-30T14:50:30Z

src/goto-instrument/contracts/contracts.cpp

-  // Insert aliasing assertions
+  // Inline all function calls.
+  goto_function_inline(
+    goto_functions, function_obj->first, ns, log.get_message_handler());


As above: call goto_functions.update()

SaswatPadhi added the bugfix, aws, aws-high, and Code Contracts labels Nov 24, 2021

SaswatPadhi requested review from remi-delmas-3000 and feliperodri November 24, 2021 03:31

SaswatPadhi self-assigned this Nov 24, 2021

SaswatPadhi requested a review from tautschnig as a code owner November 24, 2021 03:31

SaswatPadhi force-pushed the assigns-scope-fix branch from a4b8ca5 to c4dc070 Compare November 24, 2021 15:59

SaswatPadhi assigned remi-delmas-3000 and feliperodri and unassigned SaswatPadhi Nov 24, 2021

SaswatPadhi added 3 commits November 24, 2021 19:19

refactor function call inling for code contracts

911db55

Function call inlining was earlier performed on the same function multiple times if it had multiple loops. We refactor the inlining call out and only inline the function once, even if it has multiple loops.

instrument loop head for inclusion checks as well

abd36cf

Previously, only loop body was being instrumented for write set inclusion checks. This could miss checking writes performed by the loop guard, if it has side effects. In this PR, we also instrument the loop guard with inclusion checks.

SaswatPadhi force-pushed the assigns-scope-fix branch from c4dc070 to abd36cf Compare November 24, 2021 19:21

remi-delmas-3000 requested changes Nov 24, 2021

remi-delmas-3000 approved these changes Nov 24, 2021

martin-cs approved these changes Nov 25, 2021

feliperodri approved these changes Nov 30, 2021

feliperodri merged commit 1a1245e into diffblue:develop Nov 30, 2021

tautschnig reviewed Nov 30, 2021

tautschnig mentioned this pull request Nov 30, 2021

Invoke goto_functions.update() after inlining #6493

Merged
