Skip to content

Conversation

rfourquet
Copy link
Member

@rfourquet rfourquet commented Sep 30, 2023

We maintain a "global seed" for this feature of @testset:

Before the execution of the body of a @testset, there is an implicit call to Random.seed!(seed)
where seed is the current seed of the global RNG. Moreover, after the execution of the body, the state of the global RNG is restored to what it was before the @testset. This is meant to ease reproducibility in case of failure, and to allow seamless re-arrangements of @testsets regardless of their side-effect on the global RNG state.

But since we don't use MersenneTwister as the "global RNG" anymore, we need to maintain a separate "global seed" object. So far we literally used a global object Random.GLOBAL_SEED storing the original seed, but it's not robust when multi-tasking is involved: e.g.

seed!(0)
x = rand()
seed!(0)
@sync begin
    @async @testset "A" begin
        seed!(1) # reset GLOBAL_SEED to V2
        sleep(2)
    end # reset GLOBAL_SEED to its original value V1
    sleep(0.5)
    @async @testset "B" begin
        # here seed!(2) above has already been called
        # so @testset B recorded value V2 as the "original" value of GLOBAL_SEED
        seed!(2)
        sleep(2)
        # here @testset A already finished
    end # reset GLOBAL_SEED to the wrong original value V2
end
@testset "main task" begin
    # async tests didn't mutate this task's global seed
    @test x == rand() # fails!
end

So we store here a "global seed" in task_local_storage(), which is set when seed!() is invoked without an explicit RNG, and defaults to Random.GLOBAL_SEED, which is set only once when Random is loaded. And instead of actually storing a seed, we store a copy of the RNG state.

This is still not ideal, in that at the beginning of @testset "A" or @testset "B", we can't do @test x == rand(), because these are in separate tasks, so the global seed defaults to Random.GLOBAL_SEED, and not to the global seed of the parent's task; there might be a nice way to handle that, but at least different tasks don't corrupt each-other's seeds.

@rfourquet rfourquet added the randomness Random number generation and the Random stdlib label Sep 30, 2023
We maintain a "global seed" for this feature of `@testset`:

> Before the execution of the body of a @testset, there is an implicit call to Random.seed!(seed)
where seed is the current seed of the global RNG. Moreover, after the execution of the body, the
state of the global RNG is restored to what it was before the @testset. This is meant to ease
reproducibility in case of failure, and to allow seamless re-arrangements of @testsets regardless of
their side-effect on the global RNG state.

But since we don't use `MersenneTwister` as the "global RNG" anymore, we need to maintain a separate
"global seed" object. So far we literally used a global object `Random.GLOBAL_SEED` storing the
original seed, but it's not robust when multi-tasking is involved: e.g.
```
seed!(0)
x = rand()
seed!(0)
@sync begin
    @async @testset "A" begin
        seed!(1) # reset GLOBAL_SEED to V2
        sleep(2)
    end # reset GLOBAL_SEED to its original value V1
    sleep(0.5)
    @async @testset "B" begin
        # here seed!(2) above has already been called
        # so @testset B recorded value V2 as the "original" value of GLOBAL_SEED
        seed!(2)
        sleep(2)
        # here @testset A already finished
    end # reset GLOBAL_SEED to the wrong original value V2
end
@testset "main task" begin
    # async tests didn't mutate this task's global seed
    @test x == rand() # fails!
end
```

So we store here a "global seed" in `task_local_storage()`, which is set when `seed!()` is invoked
without an explicit RNG, and defaults to `Random.GLOBAL_SEED`, which is set only once when
`Random` is loaded. And instead of actually storing a seed, we store a copy of the RNG state.

This is still not ideal, in that at the beginning of `@testset "A"` or `@testset "B"`, we can't
do `@test x == rand()`, because these are in separate tasks, so the global seed defaults to
`Random.GLOBAL_SEED`, and not to the global seed of the parent's task;
there might be a nice way to handle that, but at least different tasks don't corrupt
each-other's seeds.
@rfourquet rfourquet force-pushed the rf/yeet-global-seed branch from 46838f8 to d81acd9 Compare October 7, 2023 15:38
@vchuravy
Copy link
Member

vchuravy commented Oct 7, 2023

This is still not ideal...

I was trying to address this in #51012

@rfourquet
Copy link
Member Author

Ah interesting! But I don't see in that PR a fix for the undisciplined mutation of Random.GLOBAL_SEED, did I miss something? But yeah I thought a bit about whether ScopedValue could be used for the "global seed", but it seemed to me that this would require that task creation handles passing the global seed to the child task, which would probably be overkill...

So I will merge this now, as it fixes a somewhat bug, but am curious to see if something better can be done eventually.

@vchuravy
Copy link
Member

vchuravy commented Oct 7, 2023

Yeah this was irrespective of the handling of the global random seed.

@rfourquet rfourquet merged commit 0296599 into master Oct 7, 2023
@rfourquet rfourquet deleted the rf/yeet-global-seed branch October 7, 2023 19:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
randomness Random number generation and the Random stdlib
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants