Skip to content

Make clear that cbmc version of charAt and substring do not throw OutOfBound TG-1808 #1665

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

romainbrenguier
Copy link
Contributor

The internal code produced by these two methods differs
from the real Java implementation because it lacks the ability to throw
exceptions.
To make that clear, we deactivate preprocessing of charAt and substring, and instead provide functions CProverString.charAt and CProverString.substring.
If the user wants to model String.charAt and substring it can do so by checking for out of bound indexes before passing the string to CProverString.charAt and CProverString.substring.

We add to the regression directories some CProverString class, which is useful to compile java file using CProverString.charAt and CProverString.substring and which documents what is the behaviour corresponding to our models (i.e. jbmc's CProverString.charAt does the same thing as the Java charAt function provided).

Copy link
Contributor

@smowton smowton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable. Perhaps add comments indicating the rationale behind this change?

@@ -1,13 +1,12 @@
public class SubString01
{
public static void main(String[] args)
{
public static void main(String[] args)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Irregular indentation?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a common problem...

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's something I don't quite understand. Please see the comments.

@@ -1,13 +1,12 @@
public class SubString01
{
public static void main(String[] args)
{
public static void main(String[] args)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a common problem...

public final class CProverString
{
public static char charAt(String s, int i) {
if(0 <= i && i < s.length())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you're creating a new file, you can use the standard java style, e.g. space after if.

else if(i >= s.length())
return "";
else
return s.substring(i);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This I don't understand. Who handles that function if the preprocessing is disabled for String.substring(I)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the preprocessing is enabled for CProverString.substring instead, so this function is replace by the preprocessing

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what's the purpose of this code? It won't be used in cbmc will it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, you have to provide the functions, but why do you distinguish between cases?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is so that the programs that call CProverString.substring can be executed and the result of the execution will be the same as what jbmc expects it to be.
At the same time it documents what the internal substring exactly corresponds to.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I see. It makes sense to allow these tests to still execute. The logic is the other way around compared to the models library, which makes it confusing. But I don't see an easy way around that.

Is there any reason why we return s when i < 0? The original implementation throws an exception, so "" would probably make more sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to say that it's what the substring primitive of the solver does, but actually looking at the code it would generate a string of length length - i where the first characters can be anything...
I don't want to change the constraints in the solver in this PR because this would be out of scope, so what about appending -i times the '\u0000' character at the beginning?
I would fit in the specifications of the internal primitive (but the user should not rely on the first characters being '\u0000').

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the info. Yes you can change it to have null chars, but it's not very important, as this code is only used when running the test files.

We provide functions CProverString.charAt and CProverString.substring.
The reason we deactivate these is that the internal modelling differs
from the real Java implementation because it lacks the ability to throw
exceptions, in particular for the out of bounds case.
This defines useful string functions replacing standard java functions
that cannot be directly supported in cbmc because of exception throwing.
We replace calls to charAt/substring by corresponding CProverString functions.
This defines string function to replace standard Java string functions
that cannot be supported internally by CBMC because of exception throwing.
charAt and substring are replaced by CProver functions
@romainbrenguier romainbrenguier force-pushed the bugfix/string-out-of-bound#TG-1808 branch from 6f25a84 to ac7e649 Compare December 14, 2017 11:41
Copy link
Contributor

@thk123 thk123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not certain I understand this - does the models library implementation of charAt, substring etc call these functions so they get intercepted? So the JBMC tests are updated as they should be using the intercepted call rather than depending on an implementation of String.charAt etc

@romainbrenguier
Copy link
Contributor Author

@thk123

I'm not certain I understand this - does the models library implementation of charAt, substring etc call these functions so they get intercepted?

The models library is going to use the CProverString functions now: https://github.com/diffblue/models-library/pull/301 (these changes needs to be coordinated with this one in CBMC).
So, before String.charAt was intercepted and the model was empty, now String.charAt is not intercepted, we use the model, and the model use CProverString.charAt which is intercepted.

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@thk123 thk123 merged commit d41403f into diffblue:develop Dec 14, 2017
smowton added a commit to smowton/cbmc that referenced this pull request May 9, 2018
67ec6f2 Merge remote-tracking branch 'upstream/develop' into pull-support-20180104
fabc99e Merge pull request diffblue#1563 from NathanJPhillips/feature/lazy-loading
2d67e42 Merge pull request diffblue#1692 from NathanJPhillips/feature/numbering-at
5266ba2 Merge commit 'ac4756fc3bb0e853f04de2b69f300d65cfbfc553' into pull-support-20171212
4f4a9c7 Add at method to template_numberingt
d9cc0c0 Merge pull request diffblue#1685 from peterschrammel/remove-existing-coverage-goals
f13c871 Update a comment in instrument_cover_goals
6c39443 Remove existing coverage goals
a2cf16d Store symbol type instead of followed type when clean casting
f96efb4 Change template parameter name to not clash with existing type
b0cd57b Encapsulate lazy_goto_modelt::goto_functions
ef02f4d Update test to handle changed symbol creation order
441d269 Reset counter used by get_fresh_aux_symbol in unit tests
f759f25 Fix new test to run remove_java_new pass
cb09d8e Fix new tests to use lazy loading
166563f Do lowering of java_new as a function-level pass
3779782 Always convert the main function from bytecode into the symbol_table
7e52e22 Always allow the main function not to have a body
c938b08 Encapsulate maps in language_filest
87c6b28 Don't erase symbol in java_bytecode_convert_methodt::convert
e3e44c8 Do type-checking as a function pass
2ba1364 Update load_all_functions
5f06e36 Recreate initialize in final
aa5440b Moved call to final to lazy_goto_modelt::finalize
a659bc0 Add lazy_goto_functions_mapt
7e1ecc9 Allow post-processing functions to run on a function-by-function basis
0c05be6 Allow convert_lazy_method on functions that don't have a lazy body
05a8da2 Make goto_convert_functions not add directly to function map
c078858 Introduce lazy_goto_modelt
a22dd1c Merge pull request diffblue#1671 from thk123/bugfix/TG-1278/correct-access-level
5b6eb00 Merge pull request diffblue#1668 from romainbrenguier/bugfix/string-index-of#TG-1846
9062853 Tidied up the generic specalised class copying the base class visibility
f934ca3 String classes should be public
7b06a00 Merge pull request diffblue#1498 from smowton/smowton/feature/call_graph_improvements
f3b4c9b Test for verification of the indexOf method
9a7fa2d Correct lower bound in String indexOf
682cd1a Use a symbol generator to avoid name clashes
28a2ada Access in empty array should be unconstrained
d41403f Merge pull request diffblue#1665 from romainbrenguier/bugfix/string-out-of-bound#TG-1808
ac7e649 Use CProverString function in jbmc tests
22dc353 Add CProverString class for jbmc-strings tests
ef610b3 Use CProverString functions in strings-smoke-tests
5c716c1 Add a CProverString class for string-smoke-tests
6b619e0 Deactivate preprocessing of charAt and substring
bcfaaa4 Merge pull request diffblue#1664 from svorenova/refactoring_tg1190
c2a8bae Refactoring in specialization of generic classes
7a98e15 Rename call graph constructors
6228ed3 Slice global inits: use improved call graph
d136bbc Expose limited call graph via goto-instrument
9c29bee Improve call graph unit tests
8ed3ccb Add documentation to call graph
8f6f429 Add call graph helpers
3b06a16 Call graph: add constructors that only include reachable functions
9b65862 grapht: add get_reachable function
aaa8513 Call graph -> grapht transformation
8115248 Call graph: optionally record callsites

git-subtree-dir: cbmc
git-subtree-split: 67ec6f2
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants