-
Notifications
You must be signed in to change notification settings - Fork 273
Improvements to goto-analyzer's constant propagator and simplifier [blocks: #2522] #2132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
94e9e53
to
69ec784
Compare
69ec784
to
197f76d
Compare
Looks sensible, but there ought to be corresponding new tests for the new kinds of propagation we intend to support. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with @smowton this needs some tests - also dunno how much overlap there is with this + variable sensitivity domain enhancements to constant propagation @chrisr-diffblue
src/util/expr_util.h
Outdated
@@ -72,4 +72,7 @@ bool has_subtype(const typet &, const irep_idt &id, const namespacet &); | |||
/// lift up an if_exprt one level | |||
if_exprt lift_if(const exprt &, std::size_t operand_number); | |||
|
|||
/// returns true if the expression can be used as an lvalue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As much as I hate having to add the comment, the coding standards say docs should be on the implementation.
src/util/replace_symbol.cpp
Outdated
@@ -80,7 +81,7 @@ bool replace_symbolt::replace( | |||
{ | |||
const exprt &e=it->second; | |||
|
|||
if(!replace_with_const && e.is_constant()) // Would give non l-value. | |||
if(!replace_with_const && !is_lvalue(e)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't belong in this commit as this is a non-trivial change,
Previously required replace_with_const == false
AND e.id() == ID_constant
. Now requires replace_with_const == false
AND a super set of the original possible e
s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like @thk123 I'm reviewing this patch by patch and yes, there is an issue here. I'm happy to resolve it by seeing where else in the patch set it would fit rather than moving it but still.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a separate PR now: #2200.
src/analyses/constant_propagator.cpp
Outdated
expr.id() == ID_plus || expr.id() == ID_array || expr.id() == ID_struct || | ||
expr.id() == ID_union) | ||
{ | ||
forall_operands(it, expr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought forall_operands
was now deprecated? Could use:
return std::all_of(
expr.operands().begin(),
expr.operands().end(),
[](const exprt &) { return is_constant(expr); });
To make the condition more explicit
src/analyses/constant_propagator.cpp
Outdated
{ | ||
return is_constant(expr.op0()); | ||
} | ||
else if( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ID_minus, ID_multiply etc? Or am I missing something
197f76d
to
1abe91a
Compare
1abe91a
to
626804b
Compare
Tests have now been added - without adding any new tests...! I have gone through various regression tests and moved from |
626804b
to
7ce5312
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few changes but more than that, I'd like to talk about the patch that loops calls to AI.
^Simplified: assert: 1, assume: 0, goto: 3, assigns: 4, function calls: 0$ | ||
^Unmodified: assert: 0, assume: 0, goto: 0, assigns: 11, function calls: 2$ | ||
^Simplified: assert: 1, assume: 0, goto: 1, assigns: 6, function calls: 0$ | ||
^Unmodified: assert: 0, assume: 0, goto: 1, assigns: 9, function calls: 2$ | ||
-- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this change matches the commit message of the commit it is in but I guess it is being updated elsewhere in the patch set, in which case it's a bit unnecessary to change it.
src/util/replace_symbol.cpp
Outdated
@@ -80,7 +81,7 @@ bool replace_symbolt::replace( | |||
{ | |||
const exprt &e=it->second; | |||
|
|||
if(!replace_with_const && e.is_constant()) // Would give non l-value. | |||
if(!replace_with_const && !is_lvalue(e)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like @thk123 I'm reviewing this patch by patch and yes, there is an issue here. I'm happy to resolve it by seeing where else in the patch set it would fit rather than moving it but still.
src/analyses/constant_propagator.cpp
Outdated
to_side_effect_expr(expr).get_statement()==ID_nondet) | ||
return false; | ||
if(expr.is_constant()) | ||
return true; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any way we could share the code between this and symex rather than keeping the two maintained in parallel? expr_util.h?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll give this a try, even though symex has some peculiarities around plus and multiplication just so as to feed more information to calls to malloc
.
@@ -18,7 +18,7 @@ class optionst; | |||
|
|||
bool static_simplifier( | |||
goto_modelt &, | |||
const ai_baset &, | |||
ai_baset &, | |||
const optionst &, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah... I think we need to talk about this one.
src/analyses/constant_propagator.cpp
Outdated
|
||
return b; | ||
while(!values.replace_const.replace(condition)) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a test case for this? As in; do you know where it is needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's needed when local constant propagation yields a subexpression that is not an lvalue, as in the following:
int x = 1;
int *p = &x;
x = *p;
Now the first substitution+simplification will be x = x;
when only the second round will yield x = 1;
.
" \nhavoc[,params:<regex>][,globals:<regex>]\n" \ | ||
" nondet-return, assert-false-assume-false " \ | ||
"and\n" \ | ||
" havoc[,params:<regex>][,globals:<regex>]\n" \ | ||
" (default: nondet-return)" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated, but, yes.
@@ -220,6 +221,10 @@ bool static_simplifier( | |||
<< ", function calls: " << unmodified.function_calls | |||
<< messaget::eom; | |||
|
|||
// restore return types before writing the binary | |||
restore_returns(goto_model); | |||
goto_model.goto_functions.update(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. I know I keep saying this BUT we (in the broadest sense) do need to work out what the valid forms of goto-program are at various points in the tool. This reads to me a little bit like "returns must always be present when on disk".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not as much about returns, but more about the types: I don't think the simplifier should yield a binary where symbols have different types. Or at least that's not what I'd expect from a "simplifier".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair.
src/goto-programs/remove_returns.cpp
Outdated
"to end of function"); | ||
i_it=goto_program.instructions.insert(i_it, *i_it); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again; this looks like another "we need to clarify our invariants" issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, though in this case that was code I had originally corrupted^W^Wwritten with fixed assumptions about its structure. Those assumptions were unnecessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also fair.
I have two things against the looped calls to AI patch (the rest I'm happy to accept (although having the constant detection shared rather than duplicated would be awesome)):
|
( I realise this week I am "block and discuss" on every patch but ... ) |
I do strongly prefer to have good quality code so all those inputs are highly appreciated! So ...
I don't think it's as simple as that; in fact simplify, as far as I can tell, is idempotent. There might be some bugs left, but that's not what this looping is about. The key bit is re-running constant propagation (or indeed whatever domain) after having done simplification. Consider a reduced version of one of the regression tests:
The first iteration of constant propagation will determine the value of
Now the branching is gone, and constant propagation plus simplification can happily sort out the assertion. I am all ears to hear about better solutions, but from my point of view a fixed point around the combination of abstraction interpretation and simplification is required here. |
> 2. Isn't this just papering over the cracks? I think simplify should be idempotent. I think it not being is indicative of a bug / lack of precision. @hannes-steffenhagen-diffblue very kindly did some work on this with the variable sensitivity domain and found a number of issues. So I'm kind of inclined to look at the cases where repeated simplification helps and see if they are actually issues with the domain / simplification code.
I don't think it's as simple as that; in fact simplify, as far as I can tell, _is_ idempotent. There might be some bugs left, but that's not what this looping is about. The key bit is re-running constant propagation (or indeed whatever domain) after having done simplification.
I was including this in the idempotency part!
Consider a reduced version of one of the regression tests:
```
int i = 0;
if(i == 0) ++i;
assert(i == 1);
```
The first iteration of constant propagation will determine the value of `i` for all instructions in line 2, but cannot resolve the assertion, because there still is a join point leaving the value of `i` non-constant at that point.
I see that as a precision issue. I think only one branch of the if
statement should be taken because ```i == 0``` can be resolved by the
domain. Thus one side will be \bot and so the assertion can be reduced
to ```assert(true)```.
Thus the result of the first run is
```
int i = 0;
i = 1;
assert(i == 1);
```
Now the branching is gone, and constant propagation plus simplification can happily sort out the assertion. I am all ears to hear about better solutions, but from my point of view a fixed point around the combination of abstraction interpretation and simplification is required here.
Do you have examples of this causing a problem? My next real
development task is finishing off goto-analyze and the variable
sensitivity domain. It is my belief that all cases of needing multiple
iterations I can fix with the VSD. Let's give it a go and see if we can
solve this by finding real programs that have this issue.
|
That's needed when local constant propagation yields a subexpression that is not an lvalue, as in the following:
```
int x = 1;
int *p = &x;
x = *p;
```
Now the first substitution+simplification will be `x = x;` when only the second round will yield `x = 1;`.
Again, in my eyes, this looks like a precision problem where there is a
lack of handling of pointers (a design problem in the constant
propagator, which is why I think we should eventually deprecate it in
favour of VSD).
|
[...]
Ok, but that's not what the domain is doing right now? So it's a but in the constant-propagation domain?
Yes, I do (as otherwise admittedly wouldn't have touched this code...). My workflow includes using the static unwinder that @danpoe built before applying the simplifier. I'm all ears if there is a better solution. I might actually filter out the undisputed bug fixes from this PR into a new one so that those can go in in the meantime. |
d07507b
to
03dfcb0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚫
This PR failed Diffblue compatibility checks (cbmc commit: d07507b).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/91427232
Status will be re-evaluated on next push.
Please contact @peterschrammel, @thk123, or @allredj for support.
Common spurious failures:
- the cbmc commit has disappeared in the mean time (e.g. in a force-push)
- the author is not in the list of contributors (e.g. first-time contributors).
The incompatibility may have been introduced by an earlier PR. In that case merging this
PR should be avoided unless it fixes the current incompatibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔️
Passed Diffblue compatibility checks (cbmc commit: 03dfcb0).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/91431166
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me -- any thoughts on how this interacts with the improvements in #2522 ?
03dfcb0
to
9b16c48
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔️
Passed Diffblue compatibility checks (cbmc commit: 9b16c48).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/91505755
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes; looks much better.
@@ -1,4 +1,4 @@ | |||
KNOWNBUG | |||
FUTURE | |||
main.c | |||
--variable --pointers --arrays --structs --verify |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should be future, also, the names for options should change.
main.c | ||
--constants --simplify out.gb | ||
--constants --verify |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why has this changed task?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For completeness: I have changed the task here as simplification did always take place, but not enough of it to actually turn the assertion into a firm FALSE
.
main.c | ||
--constants --simplify out.gb | ||
--constants --verify |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, why change task?
@@ -1,9 +1,9 @@ | |||
FUTURE | |||
KNOWNBUG |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change, although not wrong, doesn't seem to match the commit message.
9b16c48
to
731c3bf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔️
Passed Diffblue compatibility checks (cbmc commit: 731c3bf).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/93273597
This option is not currently supported, there is no known bug here.
731c3bf
to
fd6154b
Compare
Several tests already pass without any code modifications.
The "values" parameter shadowed the class member values, which, however, was used by functions called by assign_rec. Thus they possibly operate on two different sets of values.
It has only been disabled on suspicion of being at fault for bugs, but these are quite possibly due to other bugs (cf. soundness fixes later in this series).
The included regression test is an example the requires interleaving of simplification and constant propagation.
1. Properly handle index, member, or dereference expressions on the left-hand side of assignments. 2. If we find something on the left-hand side that we don't support (maybe byte extracts, if-expressions, type casts, or maybe something else), set the entire value set to top to ensure soundness.
We still fall back to setting values to top if dereferencing fails.
fd6154b
to
2359689
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔️
Passed Diffblue compatibility checks (cbmc commit: 2359689).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/95302081
Previously only
exprt::id() == ID_constant
was deemed "constant", but several other expressions (in particular: the address of an object) are also constants that can safely be propagated. The interpretation of what is (or isn't) constant now largely overlaps with what goto-symex considers a constant.Dependencies: