Fix random state not being used for sampling configurations #1329
Conversation
This all looks good, but is there a reason you only touched the classification pipeline test and not the regression ones, too?
I labelled it with "PR: In Progress". I will do the regression ones as well, and I also have to take it back out of our randomized test, where we randomly select sample configurations. I'll send you a review request when it's ready.
So I ended up removing
Codecov Report
```diff
@@            Coverage Diff             @@
##           development    #1329      +/-   ##
===============================================
+ Coverage       88.05%    88.46%     +0.40%
===============================================
  Files             140       140
  Lines           11163     11811      +648
===============================================
+ Hits             9830     10449      +619
- Misses           1333      1362       +29
```
Continue to review full report at Codecov.
```python
classifier = SimpleClassificationPipeline(
    random_state=1,
```
This is now inconsistent. The `random_state` is not dropped in a few unit tests above.
That's because this test relies on an accuracy score; it needs to be fixed, as not all configurations will get 96%. The other tests are not specifically about metrics.
* Added random state to classifiers
* Added some doc strings
* Removed random_state again
* flake'd
* Fix some test issues
* Re-added seed to test
* Updated test doc for unknown test
* flake'd
Issue #1310 (`SimpleClassificationPipelineTest.test_configurations_signed_data` gives undeterministic error) highlighted a non-deterministic test, giving us failures with a certain configuration (not solved). This was due to `SimpleClassificationPipeline` not receiving `random_state` in the tests, which in turn means it does not get passed to the `ConfigSpace` it creates. The `ConfigSpace` used by `Automl._create_search_space` was not receiving a `random_state` either; not sure how this affects the starting samples that autosklearn would use.

This PR:

* Leaves `random_state` in tests as `None` so that further bad configurations can still surface, albeit randomly.
* Passes `random_state` to `Automl._create_search_space` so the non-test code is deterministic.
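To illustrate why threading `random_state` through to the configuration space matters, here is a minimal, self-contained sketch. `TinyConfigSpace` is a hypothetical stand-in (not auto-sklearn's or ConfigSpace's actual API): seeding its internal RNG is what makes `sample_configuration()` reproducible, which is the behaviour this PR restores for the real search space.

```python
import random

class TinyConfigSpace:
    """Hypothetical stand-in for a hyperparameter configuration space.

    If ``seed`` is None, each instance samples differently; with a fixed
    seed, sampling is fully deterministic.
    """

    def __init__(self, seed=None):
        # The seed plays the role of the random_state that was previously
        # not being passed down, leaving sampling non-deterministic.
        self._rng = random.Random(seed)

    def sample_configuration(self):
        # Draw one configuration from the (toy) space.
        return {
            "learning_rate": self._rng.uniform(1e-4, 1e-1),
            "max_depth": self._rng.randint(1, 10),
        }

# Two spaces built with the same seed sample identical configurations,
# so test failures become reproducible instead of flaky.
a = TinyConfigSpace(seed=1).sample_configuration()
b = TinyConfigSpace(seed=1).sample_configuration()
assert a == b
```

Conversely, leaving the seed as `None` (as the tests now do) keeps sampling randomized, so previously unseen bad configurations can still surface over repeated CI runs.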