
Emit a warning for tests that use random.seed() without resetting state afterwards #5463


Closed
Zac-HD opened this issue Jun 21, 2019 · 6 comments
Labels
type: enhancement (new feature or API change, should be merged into features branch)
type: proposal (proposal for a new feature, often to gather opinions or design the API around the new feature)

Comments

@Zac-HD
Member

Zac-HD commented Jun 21, 2019

Background

As noted in HypothesisWorks/hypothesis#1919, it's scarily easy to "pollute" the global PRNG by calling random.seed() (or random.setstate()) without cleaning up at the end of the test. Typical use with a static seed - whether 0 or 19680801 - results in strong but generally invisible correlations between the results of different test runs, even when the user thought that the tests would vary between runs.

Some packages and users seem to accept tests that only pass for certain seeds (though IMO this is often a mistake). PRNG pollution is not harmful in that case, but it isn't beneficial either - so emitting a warning rather than raising an error seems appropriate.

How to detect PRNG pollution

  1. Collect a dict of {nodeid: (state_before, state_after)}, using something like an autouse fixture. To account for plugins and fixtures which manipulate random state, this must be collected from the outer-most position for each item. For efficiency, we can discard any item where state_before == state_after - seeding back to the original state is indistinguishable from simply not using randomness at all, but more importantly does not affect any other test.

  2. Persist the dict to disk (if possible), and fetch the old version. Check for cases where a nodeid had a different state_before in each run, but the same state_after. This check is highly reliable, but obviously only works on subsequent runs and only where the data can be persisted.

  3. Check for any pairs of tests from this run with a different state_before and the same state_after. This indicates with very high probability that both ran with the same seed. To reduce false positives, discard any pair where persisted data indicates that either item has a varying state_after between runs. (A rough sketch of checks 1 and 3 follows this list.)

    Alternatively, a false positive is possible: with tests run in the order A C B D, suppose A and B set the same seed and then advanced the PRNG by a and b states respectively, while C and D did not set a seed but advanced the PRNG by k - a and k - b states respectively - then C and D end at the same state without either calling seed(). I've never heard of this actually happening, but the warning text should clearly distinguish between tests that are known to have called seed(), and those where we only detected that seed() had been called (potentially in another test).
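
For illustration, a rough conftest.py sketch of checks 1 and 3 within a single run might look like this (the hook placement, names, and warning mechanism are assumptions on my part rather than a settled design, and persistence for check 2 is omitted):

import random
import warnings

import pytest

# nodeid -> (state_before, state_after), kept only for tests that changed the global state
_prng_states = {}


@pytest.hookimpl(hookwrapper=True, tryfirst=True)
def pytest_runtest_protocol(item, nextitem):
    # Record the global PRNG state from the outermost position around each item (check 1).
    before = random.getstate()
    yield
    after = random.getstate()
    if before != after:
        _prng_states[item.nodeid] = (before, after)


def pytest_sessionfinish(session, exitstatus):
    # Check 3: a different state_before but identical state_after strongly suggests a shared seed.
    by_after = {}
    for nodeid, (before, after) in _prng_states.items():
        by_after.setdefault(after, []).append((nodeid, before))
    for group in by_after.values():
        if len(group) > 1 and len({before for _, before in group}) > 1:
            names = ", ".join(nodeid for nodeid, _ in group)
            warnings.warn("These tests probably seeded the global PRNG without restoring it: " + names)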

In practice

  • This has bitten us twice in Hypothesis, even though we're more careful than most.
  • Make tests using random numbers more reproducible #667 had the goal of "Allowing the test to set a seed without affecting other tests"
  • pytest-random-order seeds the global PRNG in order to shuffle items, without restoring the previous state. pytest-random has the same problem.
  • Numpy and Matplotlib documentation simply recommends calling seed() to avoid flaky tests, which in turn reduces the power of tests which are expected to pass for any seed (because they are likely to be run with fewer PRNG states than expected).
  • For other frameworks, the assumption is that seed() is safe because it will be used before every use of randomness. However, this makes it impossible to reliably use unseeded randomness anywhere else in the test suite. pytest-randomly simply enforces this by calling random.seed() before every test.

Next steps

Someone writes a PR against the features branch to implement the warning. I'd like to do this, but I'm unlikely to get to it before August, so anyone who wants to is welcome 😄

@Zac-HD Zac-HD added the type: enhancement and type: proposal labels on Jun 21, 2019
@asottile
Member

this sounds like a good idea, but maybe not for core (for a plugin perhaps?)

I've seen testsuites which intentionally seed(0) in an autouse fixture to establish some sanity -- this would trip up those testsuites

@Zac-HD
Member Author

Zac-HD commented Jun 23, 2019

I've seen testsuites which intentionally seed(0) in an autouse fixture to establish some sanity -- this would trip up those testsuites

Then the fixture should use random.setstate to restore the previous state afterwards!

I know this would emit new warnings, but that's the whole point: IMO requiring the little bit of cleanup logic below is a good trade to eliminate a whole class of subtle ways to make a test suite less powerful than expected.

import random
import pytest

@pytest.fixture(autouse=True)
def seeded_random():
    state = random.getstate()
    random.seed(0)  # or whichever fixed seed the suite wants
    try:
        yield
    finally:
        random.setstate(state)

Doing this check in a plugin would be better than nothing, but significantly more fragile than in core - the results would depend on the ordering of plugins, and false positives could result. Imagine, for example, an autouse fixture that just rolled back the PRNG state after every test - that would fix the pollution problem (though not the seed-diversity problem, if there is a seed() call at setup time).
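
For concreteness, a minimal sketch of such a rollback-only autouse fixture (the fixture name is just a placeholder):

import random

import pytest


@pytest.fixture(autouse=True)
def _restore_prng_state():
    # Save the global PRNG state before the test and roll it back afterwards, without seeding.
    state = random.getstate()
    yield
    random.setstate(state)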

@asottile
Member

subtle ways to make a test suite less powerful than expected.

I guess it's more that I disagree with this, in my opinion random has no place in tests

@Zac-HD
Member Author

Zac-HD commented Jun 25, 2019

I guess it's more that I disagree with this, in my opinion random has no place in tests

Then the state before will be the same as the state after, and it's a noop 😄

I actually do agree with you here, outside of a few special cases - for example I think randomness is worth the trouble with Hypothesis, because we do a lot of underlying work to make sure that you can randomly find bugs, but not lose them on the next run.

@Zac-HD
Member Author

Zac-HD commented Jun 29, 2019

Closing this as it would conflict with many existing plugins, and apparently would only be of much use for Hypothesis - we'll just write our own tests 😎

@Zac-HD Zac-HD closed this as completed Jun 29, 2019
@RonnyPfannschmidt
Member

We should have a follow-up about having pytest manage random seeds, and about warning on randomness state changes at collection or test execution, as opt-ins.
