Skip to content

Nondet-initialize enums to be equal to a constant of the same type #2701

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Aug 14, 2018

Conversation

antlechner
Copy link
Contributor

@antlechner antlechner commented Aug 9, 2018

Previously we nondet-initialized enums just like any other type: by nondet-initializing each field. In the case of enums, there are two fields: String name and int ordinal. These two fields depend on each other, for example, for an enum
public enum Color { RED, GREEN, BLUE }
the enum Color.BLUE would have name == "BLUE" and ordinal == 2. Java generally only allows an enum object to be either null, or equal to one of the constants defined for its type.

This PR introduces this restriction in java_object_factory. The changed part of the generated GOTO program looks e.g. like this:

        struct Color *c;
        int enum_index_init;
        clinit_wrapper();
        IF !NONDET(_Bool) THEN GOTO 1
        c = null;
        GOTO 2
     1: enum_index_init = NONDET(int);
        ASSUME enum_index_init >= 0
        ASSUME enum_index_init < $VALUES->length
        c = (struct Color *)$VALUES->data[enum_index_init];
     2: INPUT("c", c);

A corresponding change in lazy methods v1 is also necessary as it has to mirror the behaviour of __CPROVER_start.

Copy link
Contributor

@thk123 thk123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additional tests needed for your change to always call static init:

  • static field assigned a value then its value should be set at the start of the function
  • (TG test): static field on a class that implements an interface that gets substituted should be set up.

Emojis explained
🚫 - must be addressed
❓ - must be answered
💡 - Apply the ones you agree with and have time for
⛏️ - Apply if it is easy and you are already making changes. Mostly adding so these things can be avoided in future

Reviewed commit by commit so some relevant comments may be hidden

Oh BTW seems really negative review, but this no the whole looks great and I'm looking forward to this feature!

const java_class_typet &class_type = to_java_class_type(subtype);
const irep_idt &class_name = class_type.get_name();
const irep_idt class_clinit = clinit_wrapper_name(class_name);
gen_method_call(assignments, expr, class_clinit);

This comment was marked as resolved.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes please @smowton ! I raised exactly this question at a few stand-ups, and @peterschrammel said it's probably best to call static initialisers for all argument types, not just enums, as that's closer to what "real Java" does. No one could really think of an example where it would really make a difference for our generated tests, or indeed why the JVM calls static initialisers before constructor calls. @peterschrammel suggested it may have something to do with concurrency, so @cesaro might have an opinion on this too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The JVM should call static initialisers before constructors, because the constructor could reference a statically initialised value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thomasspriggs But then the rule might as well be "the JVM should call static initialisers before referencing statically initialised values"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@antlechner But then the JVM would have to keep track of which values are statically initialised. I suspect that just keeping track of which classes have had their static initialiser run is simpler.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beware this might reduce coverage though if the type's static fields were unused. We may want an option to break the JVM spec in this respect and only static init upon actual static field reference.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point about coverage.
@peterschrammel Should I leave this change as it is for now, and if it reduces coverage then we'll add an option for it? Or should I introduce the option right away, just in case? Or put everything behind a "if enum type"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, if we did have an option, shouldn't it apply to calling clinit_wrapper before regular/deterministic constructor calls too?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want an option to break the JVM spec in this respect and only static init upon actual static field reference.

We should try that to make it more a conscious choice to deviate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@peterschrammel As part of this PR for nondet-initialisation only, or in a future PR for both nondet and det object initialisation?

const exprt &instance_expr,
const irep_idt &method_name)
{
if(const auto func = symbol_table.lookup(method_name))

This comment was marked as resolved.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually it will usually be the case that the method doesn't exist, as most classes don't have both a static initialiser and a cproverNondetInitialize, which are the values for method_name that this is called with. I renamed this function to gen_method_call_if_present to clarify this.

/// \param min_value_expr: Represents the minimum value for the integer.
/// \param max_value_expr: Represents the maximum value for the integer.
/// \return A symbol expression for the resulting integer.
const symbol_exprt java_object_factoryt::gen_nondet_int_init(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 this is similar to generate_nondet_int in nondet.cpp in TG, it might be nice to de-duplicate the calls

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I talked to @thomasspriggs about this and I'm planning to make this change in a follow-up PR, should have mentioned it here.

{
// Allocate a new symbol for the int
const symbolt &int_symbol = get_fresh_aux_symbol(
java_int_type(),

This comment was marked as resolved.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done (using the type of the min_value_expr here, and a precondition above making sure that max_value_expr has the same type).

const auto &int_symbol_expr = int_symbol.symbol_expr();

// Nondet-initialize it
gen_nondet_init(

This comment was marked as resolved.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  assignments.add(
    code_assignt(int_symbol_expr, side_effect_expr_nondett(int_type)));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, not sure about this one. On the one hand, every function call to gen_nondet_init looks horrible because it takes so many parameters. On the other hand, it's nice to just rely on gen_nondet_init to generate anything for you, and I think it clarifies its role in this file when it is used this way.
Also, for primitive types it also has a add_source_location() = loc operation in addition to what you wrote, so that would become more like four lines overall and not be much more readable than the current version.

plus_exprt plus(enum_array_expr, index_expr, enum_array_expr.type());
const dereference_exprt arraycellref(
plus_exprt(enum_array_expr, index_expr, enum_array_expr.type()),
enum_array_expr.type().subtype());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ Don't specify the type, the single parameter constructor will do what you want and is less error prone.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case of a reference typed array, the type may need to be explicitly specified.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? If it does, it suggests a type cast is needed, if the type if the expr passed in as the first parameter is T*, then any well formed dereference_exprt will have type T (i.e. expr.type().subtype())

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thk123 I removed the type. I checked the two constructors and the single parameter version does just call the other one with op.type().subtype().

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah the only way this could be different is if the type() of plus is different from array_enum_expr. The type of plus is determines by the lhs expr so in this case you're definitely fine. I think it would be valid to do index_expr + enum_array_expr and then you would get the wrong answer...

}
else
if(target_class_type.get_base("java::java.lang.Enum"))

This comment was marked as resolved.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in a separate commit

typet(), // override type immaterial
depth+1,
update_in_place);
// obtain a target pointer to initialize; if in MUST_UPDATE_IN_PLACE mode we
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self/reviewers - this code hasn't changed, just GH messing up the move.

@@ -1280,7 +1315,8 @@ void java_object_factoryt::allocate_nondet_length_array(
assignments,
"nondet_array_length",
from_integer(0, java_int_type()),
max_length_expr);
max_length_expr,
true);

This comment was marked as outdated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got rid of the boolean.

@@ -382,6 +382,8 @@ void ci_lazy_methodst::initialize_instantiated_classes(
if(param.type().id()==ID_pointer)
{
const pointer_typet &original_pointer=to_pointer_type(param.type());
add_clinit_call_for_pointer_type(

This comment was marked as resolved.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope - this is fine because add_all_needed_classes calls initialize_instantiated_classes with both the original and any substituted pointer

@antlechner antlechner force-pushed the antonia/enum-constants branch 4 times, most recently from c8fb413 to ba56bdf Compare August 10, 2018 17:44
@antlechner
Copy link
Contributor Author

antlechner commented Aug 10, 2018

@thk123 I think I implemented and/or replied to all of your comments. Except for tests - I don't think the difference between what we used to do and what we do after these changes can be observed in regression tests: as soon as we try to observe a static variable we have to access it and at that point we would have always called clinit_wrapper anyway. I did however add some more tests that check that not only are enums initialised to be some constant of their type, but in fact any of these constants can be chosen. I also rebased on develop, so this is ready for re-review.

@antlechner antlechner force-pushed the antonia/enum-constants branch from ba56bdf to 4f12c10 Compare August 10, 2018 18:19
Copy link
Contributor

@thk123 thk123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks for all the improvements - needs a TG bump if it hasn't got one already

@thk123
Copy link
Contributor

thk123 commented Aug 13, 2018

Has one: diffblue/test-gen#2156 - currently failing CI

Copy link
Member

@peterschrammel peterschrammel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo TG passing

if (c == null)
return;
assert c != null;
boolean isRed = c.name().startsWith("RED") && c.name().length() == 3
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May we have tests with switch already?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have one tests for switch on an enum type. It uses --cover location and pretty much just checks that all possible ordinals (except the last one) can be generated for a switch statement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@peterschrammel Do you think the switch test and the new NondetEnumArg test should be combined?

@antlechner antlechner force-pushed the antonia/enum-constants branch 2 times, most recently from fc8a69e to 2a77873 Compare August 13, 2018 12:18
Add a high-level overview of the recursive algorithm in the file.
This is more consistent with the behaviour of the JDK (the Java language
specification says that static initializers should be run on object
initialization), and a necessary requirement to access enum constants.
Rather than nondet-initializing each field (String name and int ordinal)
of an enum, we assign it to be equal to one of the constant values
stored in the $VALUES array of the corresponding type.
We load static initializers for all types of parameters of the entry
function(s). This is necessary for the new nondet-initialization of
enums, and also consistent with the Java language specification, which
says that static initializers should be run on object initialization.
The tests check that for a nondeterministic non-null enum, any of the
constants of its type can be chosen, and nothing other than these
constants.
These really should be unreachable, but some company code needs to be
changed for this first. A ticket number has been added as a reminder.
Copy link
Contributor

@thk123 thk123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only issue is we are duplicating the logic for when we call clinit. I can't immediately think of a better way (and you have a comment linking the two. You could pull out a weird utility function like does_need_clinit_call so they both use the same condition.

if(!must_be_null)
// case in gen_nondet_struct_init would slow symex down too much, so if we
// decide to do this for all types, we should do it here.
if(can_cast_type<java_class_typet>(subtype))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ mild preference for:

if(const auto class_type = type_try_dynamic_cast<java_class_typet>(subtype)
{
  if(class_type->get_base("java::java.lang.Enum") && !must_be_null)

Since puts in one place the check that is is a java_class_typet

// case in gen_nondet_struct_init would slow symex down too much.
if(!must_be_null)
// case in gen_nondet_struct_init would slow symex down too much, so if we
// decide to do this for all types, we should do it here.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment saying that this is mirrored in ci_lazy_methods

@antlechner
Copy link
Contributor Author

@thk123 Applied your two line-specific comments. The duplication is a general problem with CI lazy methods, which has to mirror everything that happens in __CPROVER_initialize and __CPROVER_start. Symex-driven lazy loading no longer has this problem. I'll check if it's worth adding a shared utility function for the "is this an enum?" logic in a follow-up PR.

Calling clinit_wrapper on all nondet-initialized types has a noticeable
impact on performance, so for now we are only doing this for enums, and
we will re-evaluate the behavior for other types in the future.
@antlechner antlechner force-pushed the antonia/enum-constants branch from 80300d6 to 5ca6cd2 Compare August 14, 2018 12:48
@antlechner antlechner merged commit 0ecd008 into diffblue:develop Aug 14, 2018
@antlechner antlechner deleted the antonia/enum-constants branch August 14, 2018 14:25
NathanJPhillips pushed a commit to NathanJPhillips/cbmc that referenced this pull request Sep 6, 2018
4f5e148 Merge pull request diffblue#2713 from thk123/dump/expr2c-configuration
01b7418 Merge pull request diffblue#2710 from thomasspriggs/tas/struct_component
b763877 Merge pull request diffblue#2737 from diffblue/remove-appveyor
9c3fd45 Add AWS CodeBuild badges
4261fd8 Remove appveyor badge
fe23db7 Remove appveyor.yml
21857a7 Add support for getting components by name to `struct_exprt`
05993f4 Merge pull request diffblue#2727 from tautschnig/fptr-debug
0ecd008 Merge pull request diffblue#2701 from antlechner/antonia/enum-constants
159bf15 Merge pull request diffblue#2205 from diffblue/rename_symbol_type
5ca6cd2 Restrict new clinit_wrapper calls to enum types
6e04213 Reformatting the structs to aid readability
0cfacb8 Add type2c conversion for specifying a type name
c3fef70 Add a clean configuration for expr2c
3a40db1 Make expr2ct configurable in a number of ways
097cf71 Merge pull request diffblue#2731 from diffblue/increase-version
170c1ea Merge pull request diffblue#2730 from diffblue/parent-child-invariant
c9e46ae Temporarily remove new UNREACHABLE statements
03e3877 Add tests for nondet initialization of enums
3e32140 CI lazy methods: load clinit for all param types
5542c54 Move nondet enum initialization to new function
6ecf4f9 Nondet-init enums by assigning them to a constant
6c84caa Refactor logic for generating a nondet int
07d1e44 Always run clinit_wrapper before nondet object init
3be826f Add some documentation to java_object_factory
5e80310 add invariant on parent-child class relationship
2cf1931 Merge pull request diffblue#2661 from jeannielynnmoulton/jeannie/JavaMethodType
c6acc9c increased version number in preparation for release 5.10
0ee4178 Merge pull request diffblue#2729 from tautschnig/add-sub-conflict
f5aff56 Merge pull request diffblue#2705 from danpoe/feature/replace-function-calls
89048e6 Function-pointer remvoval: print human-friendly debug messages
1fc3118 Use pointer difference type when adding to pointer
072b592 Merge pull request diffblue#2726 from tautschnig/java-loc
4dca215 Goto program should not use java_method_typet
273fff4 Add can_cast_type and precondition.
6cb7e5d Reinstate require_code and add require_java_method
78f7cb7 Refactor constructors for java_method_typet
6cefc61 Unit tests conversion from code_typet to java_method_typet
c3b8b5a Update tag in to_java_method_type
972315a Update docs on java_method_typet constructor
211931c Add tag for java_method_type
551df9c Update unit tests to use java_method_typet
da18c9f Change variables named code_type to method_type
f1bd41e Change code_typet to java_method_typet in jbmc
be3c4c6 Merge pull request diffblue#2430 from tautschnig/vs-function-id
03fa885 Replace function calls
c4d79ab Java string preprocessing: use provided source location
fb0c552 Java string preprocessing: use and document parameter function_id
c31edca Merge pull request diffblue#2725 from tautschnig/replace-symbolt-code-type
2c1fc06 Merge pull request diffblue#2721 from tautschnig/replace_symbol-cleanup2
8211a78 Merge pull request diffblue#2722 from tautschnig/cleanup-valuest
30bc071 Add "// empty last line" to options list in goto_instrument_parse_options.h
8c8801c List source files in goto-programs/Makefile in lexicographic order
40d28ae replace_symbolt: report replacements in code_typet::return_type
4b9df3b Revert "Ignore return value"
e39ea2e Merge pull request diffblue#2683 from karkhaz/kk-continue-unsafe
424ab4f --stop-on-fail now stops on failed path
545bff8 Add clear() to path exploration worklist
95d8d0f Generalise option setting from strategy unit tests
e85fb77 Cleanup constant_propagator_domaint::valuest
18d08bf Merge pull request diffblue#2719 from tautschnig/quiet-unit-tests
30d557a Constant propagation: Check type consistency before adding to replacement map
a5ce621 Make unit tests quiet
61b3086 Merge pull request diffblue#2468 from tautschnig/vs-names
7bfd36b Merge pull request diffblue#2714 from diffblue/msvc-asm
3785941 Remove names of unused parameters
af79cb9 add support for Visual Studio style inline assembler
254b4d4 Merge pull request diffblue#2642 from diffblue/remove_asm_fix
26009a3 remove_asm now guarantees that functions called exist
9c10f38 Merge pull request diffblue#2716 from tautschnig/fix-buildspec
c3b2beb Merge pull request diffblue#2635 from qaphla/move_is_lvalue
e247a29 CodeBuild: Remove empty artifact stanza
756018d Merge pull request diffblue#2709 from owen-jones-diffblue/doc/how-to-run-tests
bba5dea Merge pull request diffblue#2699 from diffblue/goto-cc-clang
355fbd2 avoid assert()
6178908 bump goto binary version
f93deec type symbols now use ID_symbol_type
22b755a Merge pull request diffblue#2711 from diffblue/mode-gcc-asm-functions
cf75535 Merge pull request diffblue#2702 from owen-jones-diffblue/doc/minor-fixes-to-cprover-developer-documentation
5c06786 set mode for functions added by remove_asm
ef53b65 Update description of regression test framework
312ca1d Add section to documentation about running tests
d7ddf59 Merge pull request diffblue#2700 from romainbrenguier/clean-up/side-effect-location
119e88b Pass location around for nondet initialization
50660db Specify source location for nondet expressions
d1f2ad9 Replace -> with &rarr;
e448db6 Merge pull request diffblue#2708 from owen-jones-diffblue/coding-standard-class-comments
519370d State that identifiers should be good
30d29b9 Replace unicode arrows → with ascii ones ->
611374f Document classes and member variables unless obvious
adb7ef0 Minor fixes to documentation outline
fd4f563 Add side_effect_exprt constructor with location
98657d8 Merge pull request diffblue#2668 from diffblue/expose_remove_preconditions
6a36fa4 Merge pull request diffblue#2615 from owen-jones-diffblue/doc/cbmc-developer-guide
61a8c30 Merge pull request diffblue#2666 from NlightNFotis/invariant_changes
c0bcce7 use clang as native compiler for goto-clang
1f19e23 goto-cc: use result of our native compiler detection
d9d9e2a Merge pull request diffblue#2692 from diffblue/follow-tags
f94d5e2 follow union, struct and enum tags
99e33bd fix typo in comments for struct_tag_typet
1f53246 expose remove_preconditions
f212505 Avoids using expr.op0 when type is known
7b36ca2 Moves is_lvalue to expr_util.c
4782b48 Fix invariant regression tests
efb1c40 Refactor invariant_failedt definition.
515f050 Pass the condition to the invariant_failedt constructor.
bf6dd9e Added extra use-case hints to the already present invariant definitions.
612b4f8 Address review comments
7233f92 Rearrange everything into separate pages
1cb3cdd Move other tools into a separate file
82eefb7 Fix links between files
d9e690b Move folder walkthrough to a separate file
2939db4 Address review comments
8d5cbcb Create CBMC developer guide documentation

git-subtree-dir: cbmc
git-subtree-split: 4f5e148
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants