-
Notifications
You must be signed in to change notification settings - Fork 273
Fix bugs with String.startsWith in string refinement #2297
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
romainbrenguier
merged 5 commits into
diffblue:develop
from
romainbrenguier:bugfix/starts-with
Jun 13, 2018
Merged
Changes from all commits
Commits
Show all changes
5 commits
Select commit
Hold shift + click to select a range
71f80e1
Deprecates add_axioms_for_is_suffix
romainbrenguier 6763669
Correct starts_with for offset out of bound case
romainbrenguier d599dfc
Allow duplicate instantiations of the same constant string
romainbrenguier 0c56151
Add test for StartsWith
romainbrenguier 7a6277d
Reformat documentation without latex
romainbrenguier File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
Binary file not shown.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,72 @@ | ||
// Must be compiled with CProverString: | ||
// javac Test.java ../cprover/CProverString.java | ||
public class Test { | ||
|
||
// Reference implementation | ||
public static boolean referenceStartsWith(String s, String prefix, int offset) { | ||
if (offset < 0 || offset > s.length() - prefix.length()) { | ||
return false; | ||
} | ||
|
||
for (int i = 0; i < prefix.length(); i++) { | ||
if (org.cprover.CProverString.charAt(s, offset + i) | ||
!= org.cprover.CProverString.charAt(prefix, i)) { | ||
return false; | ||
} | ||
} | ||
return true; | ||
} | ||
|
||
public static boolean check(String s, String t, int offset) { | ||
// Filter out null strings | ||
if(s == null || t == null) { | ||
return false; | ||
} | ||
|
||
// Act | ||
final boolean result = s.startsWith(t, offset); | ||
|
||
// Assert | ||
final boolean referenceResult = referenceStartsWith(s, t, offset); | ||
assert(result == referenceResult); | ||
|
||
// Check reachability | ||
assert(result == false); | ||
return result; | ||
} | ||
|
||
public static boolean checkDet() { | ||
boolean result = false; | ||
result = "foo".startsWith("foo", 0); | ||
assert(result); | ||
result = "foo".startsWith("f", -1); | ||
assert(!result); | ||
result = "foo".startsWith("oo", 1); | ||
assert(result); | ||
result = "foo".startsWith("f", 1); | ||
assert(!result); | ||
result = "foo".startsWith("bar", 0); | ||
assert(!result); | ||
result = "foo".startsWith("oo", 2); | ||
assert(!result); | ||
assert(false); | ||
return result; | ||
} | ||
|
||
public static boolean checkNonDet(String s) { | ||
// Filter | ||
if (s == null) { | ||
return false; | ||
} | ||
|
||
// Act | ||
final boolean result = s.startsWith(s, 1); | ||
|
||
// Assert | ||
assert(!result); | ||
|
||
// Check reachability | ||
assert(false); | ||
return result; | ||
} | ||
} |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
CORE | ||
Test.class | ||
--refine-strings --string-max-input-length 10 --unwind 10 --function Test.check | ||
^EXIT=10$ | ||
^SIGNAL=0$ | ||
assertion at file Test.java line 31 .*: SUCCESS | ||
assertion at file Test.java line 34 .*: FAILURE | ||
-- |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
CORE | ||
Test.class | ||
--refine-strings --string-max-input-length 100 --unwind 10 --function Test.checkDet | ||
^EXIT=10$ | ||
^SIGNAL=0$ | ||
assertion at file Test.java line 41 .*: SUCCESS | ||
assertion at file Test.java line 43 .*: SUCCESS | ||
assertion at file Test.java line 45 .*: SUCCESS | ||
assertion at file Test.java line 47 .*: SUCCESS | ||
assertion at file Test.java line 49 .*: SUCCESS | ||
assertion at file Test.java line 51 .*: SUCCESS | ||
assertion at file Test.java line 52 .*: FAILURE | ||
-- |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
CORE | ||
Test.class | ||
--refine-strings --string-max-input-length 100 --unwind 10 --function Test.checkNonDet | ||
^EXIT=10$ | ||
^SIGNAL=0$ | ||
assertion at file Test.java line 66 .*: SUCCESS | ||
assertion at file Test.java line 69 .*: FAILURE | ||
-- |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -15,15 +15,19 @@ Author: Romain Brenguier, [email protected] | |
#include <util/deprecate.h> | ||
|
||
/// Add axioms stating that the returned expression is true exactly when the | ||
/// first string is a prefix of the second one, starting at position offset. | ||
/// offset is greater or equal to 0 and the first string is a prefix of the | ||
/// second one, starting at position offset. | ||
/// | ||
/// These axioms are: | ||
/// 1. \f$ {\tt isprefix} \Rightarrow |str| \ge |{\tt prefix}|+offset \f$ | ||
/// 2. \f$ \forall 0 \le qvar<|{\tt prefix}|.\ {\tt isprefix} | ||
/// \Rightarrow s0[witness+{\tt offset}]=s2[witness] \f$ | ||
/// 3. \f$ !{\tt isprefix} \Rightarrow |{\tt str}|<|{\tt prefix}|+{\tt offset} | ||
/// \lor (0 \le witness<|{\tt prefix}| | ||
/// \land {\tt str}[witness+{\tt offset}] \ne {\tt prefix}[witness])\f$ | ||
/// 1. isprefix => offset_within_bounds | ||
/// 2. forall qvar in [0, |prefix|). | ||
/// isprefix => str[qvar + offset] = prefix[qvar] | ||
/// 3. !isprefix => !offset_within_bounds | ||
/// || 0 <= witness < |prefix| | ||
/// && str[witness+offset] != prefix[witness] | ||
/// | ||
/// where offset_within_bounds is: | ||
/// offset >= 0 && offset <= |str| && |str| - offset >= |prefix| | ||
/// | ||
/// \param prefix: an array of characters | ||
/// \param str: an array of characters | ||
|
@@ -34,34 +38,39 @@ exprt string_constraint_generatort::add_axioms_for_is_prefix( | |
const array_string_exprt &str, | ||
const exprt &offset) | ||
{ | ||
symbol_exprt isprefix=fresh_boolean("isprefix"); | ||
const typet &index_type=str.length().type(); | ||
const symbol_exprt isprefix = fresh_boolean("isprefix"); | ||
const typet &index_type = str.length().type(); | ||
const exprt offset_within_bounds = and_exprt( | ||
binary_relation_exprt(offset, ID_ge, from_integer(0, offset.type())), | ||
binary_relation_exprt(offset, ID_le, str.length()), | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
binary_relation_exprt( | ||
minus_exprt(str.length(), offset), ID_ge, prefix.length())); | ||
|
||
// Axiom 1. | ||
lemmas.push_back( | ||
implies_exprt( | ||
isprefix, str.axiom_for_length_ge(plus_exprt(prefix.length(), offset)))); | ||
|
||
symbol_exprt qvar=fresh_univ_index("QA_isprefix", index_type); | ||
string_constraintt a2( | ||
qvar, | ||
prefix.length(), | ||
implies_exprt( | ||
isprefix, equal_exprt(str[plus_exprt(qvar, offset)], prefix[qvar]))); | ||
constraints.push_back(a2); | ||
|
||
symbol_exprt witness=fresh_exist_index("witness_not_isprefix", index_type); | ||
and_exprt witness_diff( | ||
axiom_for_is_positive_index(witness), | ||
and_exprt( | ||
lemmas.push_back(implies_exprt(isprefix, offset_within_bounds)); | ||
|
||
// Axiom 2. | ||
constraints.push_back([&] { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 👍 |
||
const symbol_exprt qvar = fresh_univ_index("QA_isprefix", index_type); | ||
const exprt body = implies_exprt( | ||
isprefix, equal_exprt(str[plus_exprt(qvar, offset)], prefix[qvar])); | ||
return string_constraintt(qvar, prefix.length(), body); | ||
}()); | ||
|
||
// Axiom 3. | ||
lemmas.push_back([&] { | ||
const exprt witness = fresh_exist_index("witness_not_isprefix", index_type); | ||
const exprt strings_differ_at_witness = and_exprt( | ||
axiom_for_is_positive_index(witness), | ||
prefix.axiom_for_length_gt(witness), | ||
notequal_exprt(str[plus_exprt(witness, offset)], prefix[witness]))); | ||
or_exprt s0_notpref_s1( | ||
not_exprt(str.axiom_for_length_ge(plus_exprt(prefix.length(), offset))), | ||
witness_diff); | ||
notequal_exprt(str[plus_exprt(witness, offset)], prefix[witness])); | ||
const exprt s1_does_not_start_with_s0 = or_exprt( | ||
not_exprt(offset_within_bounds), | ||
not_exprt(str.axiom_for_length_ge(plus_exprt(prefix.length(), offset))), | ||
strings_differ_at_witness); | ||
return implies_exprt(not_exprt(isprefix), s1_does_not_start_with_s0); | ||
}()); | ||
|
||
implies_exprt a3(not_exprt(isprefix), s0_notpref_s1); | ||
lemmas.push_back(a3); | ||
return isprefix; | ||
} | ||
|
||
|
@@ -135,6 +144,7 @@ exprt string_constraint_generatort::add_axioms_for_is_empty( | |
/// \param swap_arguments: boolean flag telling whether the suffix is the second | ||
/// argument or the first argument | ||
/// \return Boolean expression `issuffix` | ||
DEPRECATED("should use `strings_startwith(s0, s1, s1.length - s0.length)`") | ||
exprt string_constraint_generatort::add_axioms_for_is_suffix( | ||
const function_application_exprt &f, bool swap_arguments) | ||
{ | ||
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1833,18 +1833,17 @@ static void update_index_set( | |
} | ||
} | ||
|
||
/// Finds an index on `str` used in `expr` that contains `qvar`, for instance | ||
/// with arguments ``(str[k]=='a')``, `str`, and `k`, the function should | ||
/// return `k`. | ||
/// If two different such indexes exist, an invariant will fail. | ||
/// Find indexes of `str` used in `expr` that contains `qvar`, for instance | ||
/// with arguments ``(str[k+1]=='a')``, `str`, and `k`, the function should | ||
/// return `k+1`. | ||
/// \param [in] expr: the expression to search | ||
/// \param [in] str: the string which must be indexed | ||
/// \param [in] qvar: the universal variable that must be in the index | ||
/// \return an index expression in `expr` on `str` containing `qvar`. Returns | ||
/// an empty optional when `expr` does not contain `str`. | ||
static optionalt<exprt> | ||
find_index(const exprt &expr, const exprt &str, const symbol_exprt &qvar) | ||
/// \return index expressions in `expr` on `str` containing `qvar`. | ||
static std::unordered_set<exprt, irep_hash> | ||
find_indexes(const exprt &expr, const exprt &str, const symbol_exprt &qvar) | ||
{ | ||
decltype(find_indexes(expr, str, qvar)) result; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 👍 |
||
auto index_str_containing_qvar = [&](const exprt &e) { | ||
if(auto index_expr = expr_try_dynamic_cast<index_exprt>(e)) | ||
{ | ||
|
@@ -1855,28 +1854,11 @@ find_index(const exprt &expr, const exprt &str, const symbol_exprt &qvar) | |
return false; | ||
}; | ||
|
||
auto it = std::find_if( | ||
expr.depth_begin(), expr.depth_end(), index_str_containing_qvar); | ||
if(it == expr.depth_end()) | ||
return {}; | ||
const exprt &index = to_index_expr(*it).index(); | ||
|
||
// Check that there are no two such indexes | ||
it.next_sibling_or_parent(); | ||
while(it != expr.depth_end()) | ||
{ | ||
if(index_str_containing_qvar(*it)) | ||
{ | ||
INVARIANT( | ||
to_index_expr(*it).index() == index, | ||
"string should always be indexed by same value in a given formula"); | ||
it.next_sibling_or_parent(); | ||
} | ||
else | ||
++it; | ||
} | ||
|
||
return index; | ||
std::for_each(expr.depth_begin(), expr.depth_end(), [&](const exprt &e) { | ||
if(index_str_containing_qvar(e)) | ||
result.insert(to_index_expr(e).index()); | ||
}); | ||
return result; | ||
} | ||
|
||
/// Instantiates a string constraint by substituting the quantifiers. | ||
|
@@ -1897,18 +1879,24 @@ static exprt instantiate( | |
const exprt &str, | ||
const exprt &val) | ||
{ | ||
const optionalt<exprt> idx = find_index(axiom.body, str, axiom.univ_var); | ||
if(!idx.has_value()) | ||
return true_exprt(); | ||
|
||
const exprt r = compute_inverse_function(stream, axiom.univ_var, val, *idx); | ||
implies_exprt instance( | ||
and_exprt( | ||
binary_relation_exprt(axiom.univ_var, ID_ge, axiom.lower_bound), | ||
binary_relation_exprt(axiom.univ_var, ID_lt, axiom.upper_bound)), | ||
axiom.body); | ||
replace_expr(axiom.univ_var, r, instance); | ||
return instance; | ||
const auto indexes = find_indexes(axiom.body, str, axiom.univ_var); | ||
INVARIANT( | ||
str.id() == ID_array || indexes.size() <= 1, | ||
"non constant array should always be accessed at the same index"); | ||
exprt::operandst conjuncts; | ||
for(const auto &index : indexes) | ||
{ | ||
const exprt univ_var_value = | ||
compute_inverse_function(stream, axiom.univ_var, val, index); | ||
implies_exprt instance( | ||
and_exprt( | ||
binary_relation_exprt(axiom.univ_var, ID_ge, axiom.lower_bound), | ||
binary_relation_exprt(axiom.univ_var, ID_lt, axiom.upper_bound)), | ||
axiom.body); | ||
replace_expr(axiom.univ_var, univ_var_value, instance); | ||
conjuncts.push_back(instance); | ||
} | ||
return conjunction(conjuncts); | ||
} | ||
|
||
/// Instantiates a quantified formula representing `not_contains` by | ||
|
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't this just be removed directly instead of deprecating it? Are there actually any users out there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is still used when converting from String.endsWith, so there is a little bit more work to do.
We should have a model of endsWith which just calls startsWith then we can remove the preprocessing for endsWith and is_suffix will become dead code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be a lot of work to fix endsWith right away?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be a couple of PRs on other repositories, so it's not difficult changes but they take some time because of CI and such.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, thanks for clarifying!