Skip to content

Add units (ticks) to ENABLE_TIMINGS output #39213

@NHDaly

Description

@NHDaly

It appears that (at least) on julia 1.5 and 1.6, ENABLE_TIMINGS over-reports time spent in LLVM_OPT.

Consider the following very simple example:

$ time JULIA_NUM_THREADS=1 ./julia -E '@time begin using InteractiveUtils ; peakflops() ; end ; exit()'
  0.847438 seconds (3.13 M allocations: 233.060 MiB, 5.32% gc time, 85.96% compilation time)
ROOT                      : 11.83 %   396455815
GC                        :  3.90 %   130840452
LOWERING                  :  2.05 %   68771174
PARSING                   :  0.34 %   11302830
INFERENCE                 : 23.56 %   789704775
CODEGEN                   :  1.50 %   50364613
METHOD_LOOKUP_SLOW        :  0.01 %   284413
METHOD_LOOKUP_FAST        :  2.64 %   88575244
LLVM_OPT                  : 33.27 %   1115154038
LLVM_MODULE_FINISH        :  0.47 %   15890112
METHOD_MATCH              :  1.25 %   41789788
TYPE_CACHE_LOOKUP         :  1.31 %   43747847
TYPE_CACHE_INSERT         :  0.02 %   591496
MACRO_INVOCATION          :  0.01 %   336092
AST_COMPRESS              :  0.85 %   28611834
AST_UNCOMPRESS            :  1.71 %   57321144
SYSIMG_LOAD               : 13.75 %   460790605
ADD_METHOD                :  0.00 %   160604
INIT_MODULE               :  1.54 %   51535716
JULIA_NUM_THREADS=1 ./julia -E   2.10s user 0.33s system 204% cpu 1.191 total

In this very small example, LLVM_OPT already reports 1.115 seconds as being 33%, when that's already more than 50% of the 2.10s reported by the shell.

In fact, the sum of the time it reports is 3.35 seconds, which is more than the 2.10s reported by the shell:

julia> sum([396455815
       130840452
       68771174
       11302830
       789704775
       50364613
       284413
       88575244
       1115154038
       15890112
       41789788
       43747847
       591496
       336092
       28611834
       57321144
       460790605
       160604
       51535716
       ])
3352228592

julia> 3352228592/1e9
3.352228592

Notably, even subtracting out all the time in LLVM_OPT, the total is still larger than what's reported by the shell:

julia> (3352228592-1115154038)/1e9
2.237074554

Is there weird something going on with multithreading here or something like that, where this is reporting CPU-time and we actually used many cores?

I originally came across this because the times reported by ENABLE_TIMINGS are much larger than the total llvm optimization time reported by @snoopl, as added in #37136.


Are the LLVM_OPT timings wrong? Or is there something else going on here that I'm not understanding?

Thanks! :)

Metadata

Metadata

Assignees

No one assigned

    Labels

    docsThis change adds or pertains to documentation

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions