Question and Proposal: Allow single-run execution of precompile file

Hello! This is a combo question and proposal, where I am hoping to reduce the time it takes to build a statically compiled sysimg or executable, by avoiding compiling everything _twice_, as we currently do.

My understanding of the current process of static compilation (as run from, say, `PackageCompiler.create_app()`) includes at least the following steps:
1. Run the provided precompile script(s) with `--trace-compile=tempname()` to record, _as a string representation_, which functions and types to compile when building the sysimg. This produces one or more files containing `precompile(foo, (Bar, Baz))` statements.
2. Start a new julia session with `--output-o`, import the package(s) being compiled, and then load and execute the `precompile` statements, both those generated by the precompile scripts and those provided in any `precompile_statements_file`(s).
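
To make step 1 concrete, here is a minimal sketch of the tracing run on its own. The one-line script and the function `f` are made up for illustration, and this is not PackageCompiler's actual implementation, which passes more flags (project, sysimage, and so on):

```julia
# Step 1 in isolation: run a tiny (hypothetical) script in a child julia
# process with --trace-compile and collect the recorded statements.
tracefile = tempname()
script = "f(x) = x + 1; f(1); f(2.0)"   # stand-in for a precompile_execution_file

run(`$(Base.julia_cmd()) --startup-file=no --trace-compile=$tracefile -e $script`)

# Each recorded line is a textual `precompile(...)` statement.
statements = readlines(tracefile)
foreach(println, statements)
```

Step 2 then has to parse these strings back into types in a fresh process before anything is compiled into the object file.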

My concern is that, in order to produce a statically compiled binary, we have to pay the compilation latency _twice_ for every function compiled from the `precompile_execution_file`.

My question is whether we could (maybe optionally) combine these into a single step, running the `precompile_execution_file` with `--output-o` directly, and then also loading and executing any provided `precompile_statements_file`s in the same session.

Could you help me understand why this is currently performed in two separate steps, and what we would need to do in order to combine them into one?

Some problems that we have from the current setup:
- It's slow: we have to pay for compiling all the code twice. For our software this currently takes >1 hour, even though we aren't snooping as much as we would like. I'd like us to snoop much more, but we're afraid of making our CI build times too long.
- Since we write every method instance to disk _as a string_ and then read it back in a new process, there are currently *bugs that cause us to drop some functions*.
    - For more details, see:
        - https://github.com/JuliaLang/julia/issues/28808#issuecomment-712472418
        - https://github.com/JuliaLang/PackageCompiler.jl/pull/457
    - If we were to avoid this round-trip through a text file, we could get much better recall via static compilation. Currently we have around 3,000 / 15,000 precompile statements not actually working in our build (😢), and I hope avoiding the round trip could help?
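
As a rough illustration of how the text round trip can drop statements, the sketch below replays a list of statement strings and counts the ones that fail to parse, evaluate, or compile. The helper `replay` and the name `Main.not_defined_here` are hypothetical; this is a simplification, not the code PackageCompiler runs:

```julia
# Replay textual precompile statements and report the ones that are dropped.
function replay(statements::Vector{String})
    succeeded = 0
    dropped = String[]
    for s in statements
        ok = try
            # `precompile(Tuple{...})` returns a Bool; evaluating the string
            # throws if a name in it cannot be resolved in this session.
            Core.eval(Main, Meta.parse(s)) === true
        catch
            false
        end
        if ok
            succeeded += 1
        else
            push!(dropped, s)
        end
    end
    return succeeded, dropped
end

statements = [
    "precompile(Tuple{typeof(Base.sum), Vector{Int}})",
    # Refers to a name that does not exist in the replaying session:
    "precompile(Tuple{typeof(Main.not_defined_here), Int})",
]
succeeded, dropped = replay(statements)
println("succeeded: ", succeeded, ", dropped: ", length(dropped))
```

A single-session approach would work from the live method instances instead, so nothing would need to survive printing and re-parsing.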


---------------------------------------

Some reasons that I can imagine that might motivate why we currently do this in two steps include:
A) In order to avoid method invalidations, perhaps we want to ensure that we aren't eval'ing new definitions (by loading new packages for example) during execution of the process running with `--output-o`?
    - My thinking is that if we were to run the main process with `--output-o`, loading a new package halfway through the precompilation script might invalidate some of the functions we mean to statically compile after we've already emitted them, and I don't know how `--output-o` would handle that. Would that cause problems?
B) Perhaps running with `--output-o` makes julia significantly slower, to the point where it might be faster overall to run once without that flag, record the results to disk, and then run again with the flag, only performing the output? But this seems dubious to me when the compilation itself is the bottleneck in a precompilation script (which it hopefully should be, for a well-written precompilation script).
C) A precompilation script may load other packages in order to trigger all the compilations desired to be statically compiled, but we don't necessarily want to precompile the functions from those _other_ packages.
    - To solve this case, I imagine that we could perhaps update julia's `--output-o` flag to take a list of top-level module names from which to emit object code, and it could ignore methods and/or types coming from outside that list? That should replicate the current behavior.
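
To sketch the filter imagined in (C): the helpers below decide whether a function belongs to one of a list of allowed top-level modules. The names `root_name` and `from_allowed_module` are hypothetical, no such option exists on `--output-o` today, and a real implementation would presumably inspect method instances inside the compiler rather than functions:

```julia
# Name of the top-level (root) module a function was defined in,
# walking up through any submodules.
root_name(f) = nameof(Base.moduleroot(parentmodule(f)))

# True if `f` comes from one of the allowed top-level modules.
from_allowed_module(f, allowed) = root_name(f) in allowed

allowed = (:Base,)                                        # modules to emit code for
keep_sum     = from_allowed_module(sum, allowed)
keep_flatten = from_allowed_module(Base.Iterators.flatten, allowed)  # submodule of Base
keep_other   = from_allowed_module(sum, (:MyApp,))        # :MyApp is a made-up name
```

Filtering on the root module rather than the immediate parent means that submodules of an allowed package are kept as well.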


I'm very interested to hear if there are other things I'm missing! :) Sorry if this is rehashing old discussions; I haven't been able to find anything on this when searching.

If this doesn't make sense all the time, perhaps we could support it with a flag, or something?
Anyway, thanks for your time!
Happy 2021!

Question and Proposal: Allow single-run execution of precompile file #486