-
-
Notifications
You must be signed in to change notification settings - Fork 5.6k
Description
It appears that (at least) on julia 1.5 and 1.6, ENABLE_TIMINGS over-reports time spent in LLVM_OPT.
Consider the following very simple example:
$ time JULIA_NUM_THREADS=1 ./julia -E '@time begin using InteractiveUtils ; peakflops() ; end ; exit()'
0.847438 seconds (3.13 M allocations: 233.060 MiB, 5.32% gc time, 85.96% compilation time)
ROOT : 11.83 % 396455815
GC : 3.90 % 130840452
LOWERING : 2.05 % 68771174
PARSING : 0.34 % 11302830
INFERENCE : 23.56 % 789704775
CODEGEN : 1.50 % 50364613
METHOD_LOOKUP_SLOW : 0.01 % 284413
METHOD_LOOKUP_FAST : 2.64 % 88575244
LLVM_OPT : 33.27 % 1115154038
LLVM_MODULE_FINISH : 0.47 % 15890112
METHOD_MATCH : 1.25 % 41789788
TYPE_CACHE_LOOKUP : 1.31 % 43747847
TYPE_CACHE_INSERT : 0.02 % 591496
MACRO_INVOCATION : 0.01 % 336092
AST_COMPRESS : 0.85 % 28611834
AST_UNCOMPRESS : 1.71 % 57321144
SYSIMG_LOAD : 13.75 % 460790605
ADD_METHOD : 0.00 % 160604
INIT_MODULE : 1.54 % 51535716
JULIA_NUM_THREADS=1 ./julia -E 2.10s user 0.33s system 204% cpu 1.191 total
In this very small example, LLVM_OPT already reports 1.115 seconds as being 33%, when that's already more than 50% of the 2.10s reported by the shell.
In fact, the sum of the time it reports is 3.35
seconds, which is more than the 2.10s reported by the shell:
julia> sum([396455815
130840452
68771174
11302830
789704775
50364613
284413
88575244
1115154038
15890112
41789788
43747847
591496
336092
28611834
57321144
460790605
160604
51535716
])
3352228592
julia> 3352228592/1e9
3.352228592
Notably, even subtracting out all the time in LLVM_OPT, the total is still larger than what's reported by the shell:
julia> (3352228592-1115154038)/1e9
2.237074554
Is there weird something going on with multithreading here or something like that, where this is reporting CPU-time and we actually used many cores?
I originally came across this because the times reported by ENABLE_TIMINGS
are much larger than the total llvm optimization time reported by @snoopl
, as added in #37136.
Are the LLVM_OPT timings wrong? Or is there something else going on here that I'm not understanding?
Thanks! :)