Closed
Labels
performance (How fast can we go?)
Description
Describe the bug
I am working with randomly generated arrays and need to regenerate the same array multiple times, so I need to reset the seed and redraw. However, CUDA.seed! is several hundred times slower than Random.seed! (about 600x in the benchmark below), which makes it unusable performance-wise.
Is there a reason why it is so slow?
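For context, the reseed-and-redraw pattern described above can be sketched on the CPU as follows. This is only an illustration using the Random standard library; the helper name redraw and the array length 1024 are assumptions made for the example and are not taken from the report.

```julia
using Random

# Hypothetical helper (not from the report): reseed the global CPU RNG,
# then draw n standard-normal Float32 values.
function redraw(seed::Integer, n::Integer)
    Random.seed!(seed)
    return randn(Float32, n)
end

seed = 0xdd61db20186b29af      # seed value quoted in the report
a = redraw(seed, 1024)         # 1024 is an arbitrary length for illustration
b = redraw(seed, 1024)

# Both draws start from the same RNG state, so the arrays can be compared directly.
println(a == b)
```

On the GPU the same pattern would call CUDA.seed! in place of Random.seed!, which is where the cost measured in this report comes in.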
To reproduce
The Minimal Working Example (MWE) for this bug:
julia> using BenchmarkTools, CUDA, Random
julia> @btime CUDA.seed!(seed);
6.351 ms (18 allocations: 51.84 KiB)
julia> @btime Random.seed!(seed);
10.286 μs (2 allocations: 112 bytes)
julia> seed
0xdd61db20186b29af
The same slowdown is still there at the low-level CURAND API as well:
julia> @btime begin
           # e2 is assumed to be a preallocated CuArray{Float32}; its definition is not shown
           CURAND.curandSetPseudoRandomGeneratorSeed(CURAND.default_rng(), seed);
           CURAND.curandSetGeneratorOffset(CURAND.default_rng(), 0);
           CURAND.curandGenerateNormal(CURAND.default_rng(), e2, length(e2), 0.0, 1.0);
       end
6.189 ms (7 allocations: 112 bytes)
Expected behavior
I know random number generators are tricky, so I don't expect seeding on the GPU to be faster than on the CPU, but I would expect at least comparable performance.
Version info
Details on Julia:
julia> versioninfo()
Julia Version 1.5.1
Commit 697e782ab8 (2020-08-25 20:08 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
CPU: Intel(R) Xeon(R) CPU E3-1270 v6 @ 3.80GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-9.0.1 (ORCJIT, skylake)
Environment:
JULIA_BINDIR = /home/mlouboutin3/GATechBundle/Julia/julia-1.5.1/bin
Details on CUDA:
julia> CUDA.versioninfo()
CUDA toolkit 10.1.243, artifact installation
CUDA driver 10.1.0
NVIDIA driver 418.39.0
Libraries:
- CUBLAS: 10.2.1
- CURAND: 10.1.1
- CUFFT: 10.1.1
- CUSOLVER: 10.2.0
- CUSPARSE: 10.3.0
- CUPTI: 12.0.0
- NVML: 10.0.0+418.39
- CUDNN: 8.0.4 (for CUDA 10.1.0)
- CUTENSOR: 1.2.1 (for CUDA 10.1.0)
Toolchain:
- Julia: 1.5.1
- LLVM: 9.0.1
- PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4
- Device support: sm_30, sm_32, sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75