
Emit a warning for tests that use random.seed() without resetting state afterwards #5463


Closed
Zac-HD opened this issue Jun 21, 2019 · 6 comments
Labels
type: enhancement (new feature or API change, should be merged into features branch)
type: proposal (proposal for a new feature, often to gather opinions or design the API around the new feature)

Comments

@Zac-HD
Member

Zac-HD commented Jun 21, 2019

Background

As noted in HypothesisWorks/hypothesis#1919, it's scarily easy to "pollute" the global PRNG by calling random.seed() (or random.setstate()) without cleaning up at the end of the test. Typical use with a static seed - whether 0 or 19680801 - results in strong but generally invisible correlations between the results of different test runs, even when the user thought that the tests would vary between runs.

Some packages and users seem to accept tests that only pass for certain seeds (though IMO this is often a mistake). PRNG pollution is not harmful in that case, but it isn't beneficial either - so emitting a warning rather than raising an error seems appropriate.

How to detect PRNG pollution

  1. Collect a dict of {nodeid: (state_before, state_after)}, using something like an autouse fixture. To account for plugins and fixtures which manipulate random state, this must be collected from the outer-most position for each item. For efficiency, we can discard any item where state_before == state_after - seeding back to the original state is indistinguishable from simply not using randomness at all, but more importantly does not affect any other test.

  2. Persist the dict to disk (if possible), and fetch the old version. Check for cases where a nodeid had a different state_before in each run, but the same state_after. This check is highly reliable, but obviously only works on subsequent runs and only where the data can be persisted.

  3. Check for any pairs of tests from this run with a different state_before and the same state_after. This indicates with very high probability that both ran with the same seed. To reduce false positives, discard any pair where persisted data indicates that either item has a varying state_after between runs. (A rough sketch of checks 1 and 3 follows this list.)

    Alternatively, a false positive is possible: with tests run in the order A C B D, suppose A and B set the same seed and then advanced the PRNG by a and b states respectively, while C and D did not set a seed but advanced the PRNG by k - a and k - b states respectively - then C and D end at the same state without either calling seed(). I've never heard of this actually happening, but the warning text should clearly distinguish between tests that are known to have called seed(), and those where we only detected that seed() had been called (potentially in another test).
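
For illustration, a rough conftest.py sketch of checks 1 and 3 within a single run might look like this (the hook placement, names, and warning mechanism are assumptions on my part rather than a settled design, and persistence for check 2 is omitted):

import random
import warnings

import pytest

# nodeid -> (state_before, state_after), kept only for tests that changed the global state
_prng_states = {}


@pytest.hookimpl(hookwrapper=True, tryfirst=True)
def pytest_runtest_protocol(item, nextitem):
    # Record the global PRNG state from the outermost position around each item (check 1).
    before = random.getstate()
    yield
    after = random.getstate()
    if before != after:
        _prng_states[item.nodeid] = (before, after)


def pytest_sessionfinish(session, exitstatus):
    # Check 3: a different state_before but identical state_after strongly suggests a shared seed.
    by_after = {}
    for nodeid, (before, after) in _prng_states.items():
        by_after.setdefault(after, []).append((nodeid, before))
    for group in by_after.values():
        if len(group) > 1 and len({before for _, before in group}) > 1:
            names = ", ".join(nodeid for nodeid, _ in group)
            warnings.warn("These tests probably seeded the global PRNG without restoring it: " + names)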

In practice

  • This has bitten us twice in Hypothesis, even though we're more careful than most.
  • Make tests using random numbers more reproducible #667 had the goal of "Allowing the test to set a seed without affecting other tests"
  • pytest-random-order seeds the global PRNG in order to shuffle items, without restoring the previous state. pytest-random has the same problem.
  • Numpy and Matplotlib documentation simply recommends calling seed() to avoid flaky tests, which in turn reduces the power of tests which are expected to pass for any seed (because they are likely to be run with fewer PRNG states than expected).
  • For other frameworks, the assumption is that seed() is safe because it will be used before every use of randomness. However, this makes it impossible to reliably use unseeded randomness anywhere else in the test suite. pytest-randomly simply enforces this by calling random.seed() before every test.

Next steps

Someone writes a PR against the features branch to implement the warning. I'd like to do this, but I'm unlikely to get to it before August, so anyone who wants to is welcome 😄

@Zac-HD Zac-HD added the type: enhancement and type: proposal labels on Jun 21, 2019
@asottile
Member

this sounds like a good idea, but maybe not for core (for a plugin perhaps?)

I've seen testsuites which intentionally seed(0) in an autouse fixture to establish some sanity -- this would trip up those testsuites

@Zac-HD
Member Author

Zac-HD commented Jun 23, 2019

I've seen testsuites which intentionally seed(0) in an autouse fixture to establish some sanity -- this would trip up those testsuites

Then the fixture should use random.setstate to restore the previous state afterwards!

I know this would emit new warnings, but that's the whole point: IMO requiring the little bit of cleanup logic below is a good trade to eliminate a whole class of subtle ways to make a test suite less powerful than expected.

import random
import pytest

@pytest.fixture(autouse=True)
def seeded_random():
    state = random.getstate()
    random.seed(0)  # or whichever fixed seed the suite wants
    try:
        yield
    finally:
        random.setstate(state)

Doing this check in a plugin would be better than nothing, but significantly more fragile than in core - the results would depend on the ordering of plugins, and false positives could result. Imagine, for example, an autouse fixture that just rolled back the PRNG state after every test - that would fix the pollution problem (though not the seed-diversity problem, if there is a seed() call at setup time).
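
For concreteness, a minimal sketch of such a rollback-only autouse fixture (the fixture name is just a placeholder):

import random

import pytest


@pytest.fixture(autouse=True)
def _restore_prng_state():
    # Save the global PRNG state before the test and roll it back afterwards, without seeding.
    state = random.getstate()
    yield
    random.setstate(state)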

@asottile
Member

subtle ways to make a test suite less powerful than expected.

I guess it's more that I disagree with this, in my opinion random has no place in tests

@Zac-HD
Member Author

Zac-HD commented Jun 25, 2019

I guess it's more that I disagree with this, in my opinion random has no place in tests

Then the state before will be the same as the state after, and it's a noop 😄

I actually do agree with you here, outside of a few special cases - for example I think randomness is worth the trouble with Hypothesis, because we do a lot of underlying work to make sure that you can randomly find bugs, but not lose them on the next run.

@Zac-HD
Member Author

Zac-HD commented Jun 29, 2019

Closing this as it would conflict with many existing plugins, and apparently would only be of much use for Hypothesis - we'll just write our own tests 😎

@Zac-HD Zac-HD closed this as completed Jun 29, 2019
@RonnyPfannschmidt
Member

We should have a follow-up about having pytest manage random seeds, and about warning on randomness state changes at collection or test execution, as opt-ins.
