Skip to content

Fix bugs with String.startsWith in string refinement #2297

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jun 13, 2018
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Binary file not shown.
72 changes: 72 additions & 0 deletions jbmc/regression/jbmc-strings/StartsWith/Test.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,72 @@
// Must be compiled with CProverString:
// javac Test.java ../cprover/CProverString.java
public class Test {

// Reference implementation
public static boolean referenceStartsWith(String s, String prefix, int offset) {
if (offset < 0 || offset > s.length() - prefix.length()) {
return false;
}

for (int i = 0; i < prefix.length(); i++) {
if (org.cprover.CProverString.charAt(s, offset + i)
!= org.cprover.CProverString.charAt(prefix, i)) {
return false;
}
}
return true;
}

public static boolean check(String s, String t, int offset) {
// Filter out null strings
if(s == null || t == null) {
return false;
}

// Act
final boolean result = s.startsWith(t, offset);

// Assert
final boolean referenceResult = referenceStartsWith(s, t, offset);
assert(result == referenceResult);

// Check reachability
assert(result == false);
return result;
}

public static boolean checkDet() {
boolean result = false;
result = "foo".startsWith("foo", 0);
assert(result);
result = "foo".startsWith("f", -1);
assert(!result);
result = "foo".startsWith("oo", 1);
assert(result);
result = "foo".startsWith("f", 1);
assert(!result);
result = "foo".startsWith("bar", 0);
assert(!result);
result = "foo".startsWith("oo", 2);
assert(!result);
assert(false);
return result;
}

public static boolean checkNonDet(String s) {
// Filter
if (s == null) {
return false;
}

// Act
final boolean result = s.startsWith(s, 1);

// Assert
assert(!result);

// Check reachability
assert(false);
return result;
}
}
8 changes: 8 additions & 0 deletions jbmc/regression/jbmc-strings/StartsWith/test.desc
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
CORE
Test.class
--refine-strings --string-max-input-length 10 --unwind 10 --function Test.check
^EXIT=10$
^SIGNAL=0$
assertion at file Test.java line 31 .*: SUCCESS
assertion at file Test.java line 34 .*: FAILURE
--
13 changes: 13 additions & 0 deletions jbmc/regression/jbmc-strings/StartsWith/test_det.desc
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
CORE
Test.class
--refine-strings --string-max-input-length 100 --unwind 10 --function Test.checkDet
^EXIT=10$
^SIGNAL=0$
assertion at file Test.java line 41 .*: SUCCESS
assertion at file Test.java line 43 .*: SUCCESS
assertion at file Test.java line 45 .*: SUCCESS
assertion at file Test.java line 47 .*: SUCCESS
assertion at file Test.java line 49 .*: SUCCESS
assertion at file Test.java line 51 .*: SUCCESS
assertion at file Test.java line 52 .*: FAILURE
--
8 changes: 8 additions & 0 deletions jbmc/regression/jbmc-strings/StartsWith/test_nondet.desc
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
CORE
Test.class
--refine-strings --string-max-input-length 100 --unwind 10 --function Test.checkNonDet
^EXIT=10$
^SIGNAL=0$
assertion at file Test.java line 66 .*: SUCCESS
assertion at file Test.java line 69 .*: FAILURE
--
5 changes: 2 additions & 3 deletions src/solvers/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -231,9 +231,6 @@ allocates a new string before calling a primitive.
* `cprover_string_is_prefix` :
\copybrief string_constraint_generatort::add_axioms_for_is_prefix
\link string_constraint_generatort::add_axioms_for_is_prefix More... \endlink
* `cprover_string_is_suffix` :
\copybrief string_constraint_generatort::add_axioms_for_is_suffix
\link string_constraint_generatort::add_axioms_for_is_suffix More... \endlink
* `cprover_string_index_of` :
\copybrief string_constraint_generatort::add_axioms_for_index_of(const function_application_exprt &f)
\link string_constraint_generatort::add_axioms_for_index_of(const function_application_exprt &f) More... \endlink
Expand Down Expand Up @@ -319,6 +316,8 @@ allocates a new string before calling a primitive.
* `cprover_string_intern` : Never tested.
* `cprover_string_is_empty` :
Should use `cprover_string_length(s) == 0` instead.
* `cprover_string_is_suffix` : Should use `cprover_string_is_prefix` with an
offset argument.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't this just be removed directly instead of deprecating it? Are there actually any users out there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is still used when converting from String.endsWith, so there is a little bit more work to do.
We should have a model of endsWith which just calls startsWith then we can remove the preprocessing for endsWith and is_suffix will become dead code.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be a lot of work to fix endsWith right away?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be a couple of PRs on other repositories, so it's not difficult changes but they take some time because of CI and such.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, thanks for clarifying!

* `cprover_string_empty_string` : Can use literal of empty string instead.
* `cprover_string_of_long` : Should be the same as `cprover_string_of_int`.
* `cprover_string_delete_char_at` : A call to
Expand Down
72 changes: 41 additions & 31 deletions src/solvers/refinement/string_constraint_generator_testing.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -15,15 +15,19 @@ Author: Romain Brenguier, [email protected]
#include <util/deprecate.h>

/// Add axioms stating that the returned expression is true exactly when the
/// first string is a prefix of the second one, starting at position offset.
/// offset is greater or equal to 0 and the first string is a prefix of the
/// second one, starting at position offset.
///
/// These axioms are:
/// 1. \f$ {\tt isprefix} \Rightarrow |str| \ge |{\tt prefix}|+offset \f$
/// 2. \f$ \forall 0 \le qvar<|{\tt prefix}|.\ {\tt isprefix}
/// \Rightarrow s0[witness+{\tt offset}]=s2[witness] \f$
/// 3. \f$ !{\tt isprefix} \Rightarrow |{\tt str}|<|{\tt prefix}|+{\tt offset}
/// \lor (0 \le witness<|{\tt prefix}|
/// \land {\tt str}[witness+{\tt offset}] \ne {\tt prefix}[witness])\f$
/// 1. isprefix => offset_within_bounds
/// 2. forall qvar in [0, |prefix|).
/// isprefix => str[qvar + offset] = prefix[qvar]
/// 3. !isprefix => !offset_within_bounds
/// || 0 <= witness < |prefix|
/// && str[witness+offset] != prefix[witness]
///
/// where offset_within_bounds is:
/// offset >= 0 && offset <= |str| && |str| - offset >= |prefix|
///
/// \param prefix: an array of characters
/// \param str: an array of characters
Expand All @@ -34,34 +38,39 @@ exprt string_constraint_generatort::add_axioms_for_is_prefix(
const array_string_exprt &str,
const exprt &offset)
{
symbol_exprt isprefix=fresh_boolean("isprefix");
const typet &index_type=str.length().type();
const symbol_exprt isprefix = fresh_boolean("isprefix");
const typet &index_type = str.length().type();
const exprt offset_within_bounds = and_exprt(
binary_relation_exprt(offset, ID_ge, from_integer(0, offset.type())),
binary_relation_exprt(offset, ID_le, str.length()),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ID_le -> ID_lt

binary_relation_exprt(
minus_exprt(str.length(), offset), ID_ge, prefix.length()));

// Axiom 1.
lemmas.push_back(
implies_exprt(
isprefix, str.axiom_for_length_ge(plus_exprt(prefix.length(), offset))));

symbol_exprt qvar=fresh_univ_index("QA_isprefix", index_type);
string_constraintt a2(
qvar,
prefix.length(),
implies_exprt(
isprefix, equal_exprt(str[plus_exprt(qvar, offset)], prefix[qvar])));
constraints.push_back(a2);

symbol_exprt witness=fresh_exist_index("witness_not_isprefix", index_type);
and_exprt witness_diff(
axiom_for_is_positive_index(witness),
and_exprt(
lemmas.push_back(implies_exprt(isprefix, offset_within_bounds));

// Axiom 2.
constraints.push_back([&] {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

const symbol_exprt qvar = fresh_univ_index("QA_isprefix", index_type);
const exprt body = implies_exprt(
isprefix, equal_exprt(str[plus_exprt(qvar, offset)], prefix[qvar]));
return string_constraintt(qvar, prefix.length(), body);
}());

// Axiom 3.
lemmas.push_back([&] {
const exprt witness = fresh_exist_index("witness_not_isprefix", index_type);
const exprt strings_differ_at_witness = and_exprt(
axiom_for_is_positive_index(witness),
prefix.axiom_for_length_gt(witness),
notequal_exprt(str[plus_exprt(witness, offset)], prefix[witness])));
or_exprt s0_notpref_s1(
not_exprt(str.axiom_for_length_ge(plus_exprt(prefix.length(), offset))),
witness_diff);
notequal_exprt(str[plus_exprt(witness, offset)], prefix[witness]));
const exprt s1_does_not_start_with_s0 = or_exprt(
not_exprt(offset_within_bounds),
not_exprt(str.axiom_for_length_ge(plus_exprt(prefix.length(), offset))),
strings_differ_at_witness);
return implies_exprt(not_exprt(isprefix), s1_does_not_start_with_s0);
}());

implies_exprt a3(not_exprt(isprefix), s0_notpref_s1);
lemmas.push_back(a3);
return isprefix;
}

Expand Down Expand Up @@ -135,6 +144,7 @@ exprt string_constraint_generatort::add_axioms_for_is_empty(
/// \param swap_arguments: boolean flag telling whether the suffix is the second
/// argument or the first argument
/// \return Boolean expression `issuffix`
DEPRECATED("should use `strings_startwith(s0, s1, s1.length - s0.length)`")
exprt string_constraint_generatort::add_axioms_for_is_suffix(
const function_application_exprt &f, bool swap_arguments)
{
Expand Down
72 changes: 30 additions & 42 deletions src/solvers/refinement/string_refinement.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1833,18 +1833,17 @@ static void update_index_set(
}
}

/// Finds an index on `str` used in `expr` that contains `qvar`, for instance
/// with arguments ``(str[k]=='a')``, `str`, and `k`, the function should
/// return `k`.
/// If two different such indexes exist, an invariant will fail.
/// Find indexes of `str` used in `expr` that contains `qvar`, for instance
/// with arguments ``(str[k+1]=='a')``, `str`, and `k`, the function should
/// return `k+1`.
/// \param [in] expr: the expression to search
/// \param [in] str: the string which must be indexed
/// \param [in] qvar: the universal variable that must be in the index
/// \return an index expression in `expr` on `str` containing `qvar`. Returns
/// an empty optional when `expr` does not contain `str`.
static optionalt<exprt>
find_index(const exprt &expr, const exprt &str, const symbol_exprt &qvar)
/// \return index expressions in `expr` on `str` containing `qvar`.
static std::unordered_set<exprt, irep_hash>
find_indexes(const exprt &expr, const exprt &str, const symbol_exprt &qvar)
{
decltype(find_indexes(expr, str, qvar)) result;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

auto index_str_containing_qvar = [&](const exprt &e) {
if(auto index_expr = expr_try_dynamic_cast<index_exprt>(e))
{
Expand All @@ -1855,28 +1854,11 @@ find_index(const exprt &expr, const exprt &str, const symbol_exprt &qvar)
return false;
};

auto it = std::find_if(
expr.depth_begin(), expr.depth_end(), index_str_containing_qvar);
if(it == expr.depth_end())
return {};
const exprt &index = to_index_expr(*it).index();

// Check that there are no two such indexes
it.next_sibling_or_parent();
while(it != expr.depth_end())
{
if(index_str_containing_qvar(*it))
{
INVARIANT(
to_index_expr(*it).index() == index,
"string should always be indexed by same value in a given formula");
it.next_sibling_or_parent();
}
else
++it;
}

return index;
std::for_each(expr.depth_begin(), expr.depth_end(), [&](const exprt &e) {
if(index_str_containing_qvar(e))
result.insert(to_index_expr(e).index());
});
return result;
}

/// Instantiates a string constraint by substituting the quantifiers.
Expand All @@ -1897,18 +1879,24 @@ static exprt instantiate(
const exprt &str,
const exprt &val)
{
const optionalt<exprt> idx = find_index(axiom.body, str, axiom.univ_var);
if(!idx.has_value())
return true_exprt();

const exprt r = compute_inverse_function(stream, axiom.univ_var, val, *idx);
implies_exprt instance(
and_exprt(
binary_relation_exprt(axiom.univ_var, ID_ge, axiom.lower_bound),
binary_relation_exprt(axiom.univ_var, ID_lt, axiom.upper_bound)),
axiom.body);
replace_expr(axiom.univ_var, r, instance);
return instance;
const auto indexes = find_indexes(axiom.body, str, axiom.univ_var);
INVARIANT(
str.id() == ID_array || indexes.size() <= 1,
"non constant array should always be accessed at the same index");
exprt::operandst conjuncts;
for(const auto &index : indexes)
{
const exprt univ_var_value =
compute_inverse_function(stream, axiom.univ_var, val, index);
implies_exprt instance(
and_exprt(
binary_relation_exprt(axiom.univ_var, ID_ge, axiom.lower_bound),
binary_relation_exprt(axiom.univ_var, ID_lt, axiom.upper_bound)),
axiom.body);
replace_expr(axiom.univ_var, univ_var_value, instance);
conjuncts.push_back(instance);
}
return conjunction(conjuncts);
}

/// Instantiates a quantified formula representing `not_contains` by
Expand Down