Skip to content

Fix handling of DEAD instructions and function call inlining #6473

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Nov 30, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
59 changes: 59 additions & 0 deletions regression/contracts/loop_assigns-05/main.c
Original file line number Diff line number Diff line change
@@ -0,0 +1,59 @@
#include <assert.h>

int j;

int lowerbound()
{
return 0;
}

int upperbound()
{
return 10;
}

void incr(int *i)
{
(*i)++;
}

void body_1(int i)
{
j = i;
}

void body_2(int *i)
{
(*i)++;
(*i)--;
}

int body_3(int *i)
{
(*i)++;
if(*i == 4)
return 1;

(*i)--;
return 0;
}

int main()
{
for(int i = lowerbound(); i < upperbound(); incr(&i))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we also add another simple test such as

int main()
{
   unsigned int max;
   assume(max > 0);
   unsigned int i=0;  
   while( i < max++)
   loop_decreases(max-i)
   {
     i++;
   }
}

This loop's decreases clause is incorrect, max and i can both overflow
(but the loop still terminates when max overflows and becomes smaller than i or stays at max_int and i starts catching up on it)
We should be able to find all these errors.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that you should be able to find these and agreed that this would be worth adding as a test.
Not entirely sure how this connects to the specific issue being fixed though.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a separate issue from the one addressed in this PR. @remi-delmas-3000 is there an issue where we track this?

// clang-format off
__CPROVER_assigns(i, j)
__CPROVER_loop_invariant(0 <= i && i <= 10)
__CPROVER_loop_invariant(i != 0 ==> j + 1 == i)
// clang-format on
{
body_1(i);

if(body_3(&i))
return 1;

body_2(&i);
}

assert(j == 9);
}
19 changes: 19 additions & 0 deletions regression/contracts/loop_assigns-05/test.desc
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
CORE
main.c
--apply-loop-contracts
^EXIT=0$
^SIGNAL=0$
^\[body_1.\d+\] .* Check that j is assignable: SUCCESS$
^\[body_2.\d+\] .* Check that \*i is assignable: SUCCESS$
^\[body_3.\d+\] .* Check that \*i is assignable: SUCCESS$
^\[incr.\d+\] .* Check that \*i is assignable: SUCCESS$
^\[main.\d+\] .* Check that i is assignable: SUCCESS$
^\[main.\d+\] .* Check loop invariant before entry: SUCCESS$
^\[main.\d+\] .* Check that loop invariant is preserved: SUCCESS$
^\[main.assertion.\d+\] .* assertion j == 9: SUCCESS$
^VERIFICATION SUCCESSFUL$
--
--
This test checks write set inclusion checks in presence of function calls,
which are inlined, and also checks that DEAD instructions introduced during
inlining is correctly handled.
65 changes: 44 additions & 21 deletions src/goto-instrument/contracts/contracts.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -189,6 +189,10 @@ void code_contractst::check_apply_loop_contracts(
insert_before_swap_and_advance(
goto_function.body, loop_head, snapshot_instructions);
};

// Perform write set instrumentation on the entire loop.
check_frame_conditions(
function_name, goto_function.body, loop_head, loop_end, loop_assigns);
}

havoc_assigns_targetst havoc_gen(modifies, ns);
Expand Down Expand Up @@ -251,29 +255,28 @@ void code_contractst::check_apply_loop_contracts(

// Assume invariant & decl the variant temporaries (just before loop guard).
// Use insert_before_swap to preserve jumps to loop head.
insert_before_swap_and_advance(goto_function.body, loop_head, generated_code);
insert_before_swap_and_advance(
goto_function.body,
loop_head,
add_pragma_disable_assigns_check(generated_code));

// Forward the loop_head iterator until the start of the body.
// This is necessary because complex C loop_head conditions could be
// converted to multiple GOTO instructions (e.g. temporaries are introduced).
// If the loop_head location shifts to a different function,
// assume that it's an inlined function and keep skipping.
// FIXME: This simple approach wouldn't work when
// the loop guard in the source file is split across multiple lines.
const auto head_loc = loop_head->source_location();
while(loop_head->source_location() == head_loc)
while(loop_head->source_location() == head_loc ||
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems fragile. An alternative solution could be to have the C front end inject LOCATION markers to precisely delimit loop entry, loop guard evaluation, loop condition test, loop body, loop step instructions, jump back to head instructions.

Copy link
Contributor Author

@SaswatPadhi SaswatPadhi Nov 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, @martin-cs suggested something similar (keeping some markers around), in my other PR.

I would suggest not to fix this this in the same PR especially because it would touch the C front end (parsing functions etc.) This PR is tiny and it only fixes the assigns clause issues (with DEAD) and makes it play well with function call inlining (modulo the existing issue with loop body finding).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I agree with both. I have already had Opinions at @SaswatPadhi about loop structure detection (@remi-delmas-3000 did I CC you? If not I can resend. ); this is fragile and is asking for trouble and inconsistency with the rest of CPROVER. I think this functionality should be factored out and used in all appropriate places.

But I also agree with @SaswatPadhi ; that is outside of the scope of this PR.

loop_head->source_location().get_function() != head_loc.get_function())
loop_head++;

// At this point, we are just past the loop head,
// so at the beginning of the loop body.
auto loop_body = loop_head;
loop_head--;

// Perform write set instrumentation before adding anything else to loop body.
if(assigns.is_not_nil())
{
check_frame_conditions(
function_name, goto_function.body, loop_body, loop_end, loop_assigns);
}

// Generate: assignments to store the multidimensional decreases clause's
// value just before the loop body (but just after the loop guard)
if(!decreases_clause.is_nil())
Expand All @@ -287,7 +290,8 @@ void code_contractst::check_apply_loop_contracts(
converter.goto_convert(old_decreases_assignment, generated_code, mode);
}

goto_function.body.destructive_insert(loop_body, generated_code);
goto_function.body.destructive_insert(
loop_body, add_pragma_disable_assigns_check(generated_code));
}

// Generate: assert(invariant) just after the loop exits
Expand Down Expand Up @@ -337,7 +341,10 @@ void code_contractst::check_apply_loop_contracts(
}
}

insert_before_swap_and_advance(goto_function.body, loop_end, generated_code);
insert_before_swap_and_advance(
goto_function.body,
loop_end,
add_pragma_disable_assigns_check(generated_code));

// change the back edge into assume(false) or assume(guard)
loop_end->turn_into_assume();
Expand Down Expand Up @@ -678,6 +685,12 @@ void code_contractst::apply_loop_contract(
local_may_aliast local_may_alias(goto_function);
natural_loops_mutablet natural_loops(goto_function.body);

if(!natural_loops.loop_map.size())
return;

goto_function_inline(
goto_functions, function_name, ns, log.get_message_handler());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed offline: calls to goto_function_inline will require goto_functions.update() to be called afterwards.


Comment on lines +688 to +693
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inlining after loop detection will miss loops that are hidden behind function calls that do not have contracts

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which is okay? because they don't have contracts anyway?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't object but do note that this kind of inlining can get expensive.

// A graph node type that stores information about a loop.
// We create a DAG representing nesting of various loops in goto_function,
// sort them topologically, and instrument them in the top-sorted order.
Expand Down Expand Up @@ -992,7 +1005,11 @@ bool code_contractst::check_frame_conditions_function(const irep_idt &function)
function_obj->second.body, instruction_it, snapshot_instructions);
};

// Insert aliasing assertions
// Inline all function calls.
goto_function_inline(
goto_functions, function_obj->first, ns, log.get_message_handler());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above: call goto_functions.update()


// Insert write set inclusion checks.
check_frame_conditions(
function_obj->first,
function_obj->second.body,
Expand All @@ -1010,8 +1027,6 @@ void code_contractst::check_frame_conditions(
const goto_programt::targett &instruction_end,
assigns_clauset &assigns)
{
goto_function_inline(goto_functions, function, ns, log.get_message_handler());

for(; instruction_it != instruction_end; ++instruction_it)
{
const auto &pragmas = instruction_it->source_location().get_pragmas();
Expand Down Expand Up @@ -1054,20 +1069,28 @@ void code_contractst::check_frame_conditions(
assigns_clauset::conditional_address_ranget{assigns, symbol});
if(symbol_car != assigns.get_write_set().end())
{
instruction_it++;
auto invalidation_assignment = goto_programt::make_assignment(
symbol_car->validity_condition_var,
false_exprt{},
instruction_it->source_location());
// note that instruction_it is not advanced by this call,
// so no need to move it backwards
body.insert_before_swap(instruction_it, invalidation_assignment);
// note that the CAR must be invalidated _after_ the DEAD instruction
body.insert_after(
instruction_it,
add_pragma_disable_assigns_check(invalidation_assignment));
}
else
{
throw incorrect_goto_program_exceptiont(
"Found a `DEAD` variable without corresponding `DECL`!",
instruction_it->source_location());
// For loops, the loop_head appears after the DECL of counters,
// and any other temporaries introduced during "initialization".
// However, they go `DEAD` (possible conditionally) inside the loop,
// in presence of return statements.
// Notice that for them those variables be writable,
// they must appear as assigns targets anyway,
// but their DECL statements are outside of the loop.
log.warning() << "Found a `DEAD` variable "
<< name2string(symbol.get_identifier())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

name2string seems to be rarely used (to the extent that I'm wondering whether we should get rid of it). I'd suggest you use id2string instead, though here you actually don't need either!

<< " without corresponding `DECL`, at: "
<< instruction_it->source_location() << messaget::eom;
Comment on lines +1090 to +1093
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need to warn about constructs the user has no control over ? How could we distinguish such expected cases from really incorrect cases ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could probably use log.debug or log.info, yes.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah... I am not sure it is "logging at run-time to the user" issue, in part because, as @remi-delmas-3000 says, it is not clear what the user is supposed to do with this information.

This feels like a DATA_INVARIANT of a goto-programs; you can only DEAD a variable that you have previously DECLd. This seems reasonable and feels like the kind of thing that should be in the --validate-goto-model checks.

}
}
else if(
Expand Down