Skip to content

Conversation

topolarity
Copy link
Member

Code that returns a constant value never ends up compiled by LLVM, but we want to log them with --trace-compile since they are inferred / compiled by our own Compiler.

Resolves #58482

@topolarity topolarity requested a review from vtjnash May 27, 2025 14:15
@topolarity topolarity added needs tests Unit tests are required for this change backport 1.10 Change should be backported to the 1.10 release backport 1.11 Change should be backported to release-1.11 backport 1.12 Change should be backported to release-1.12 and removed needs tests Unit tests are required for this change labels May 27, 2025
Code that returns a constant value never ends up compiled by LLVM, but
we want to log them since they are inferred / compiled by our own
Compiler.

These can sometimes include methods that speculatively end up 'compiled'
with a non-'compileable' signature, since inference eagerly explores
widened signatures that JIT / codegen normally would not. That's not work
that we can log here since `precompile(...)` will reject the widened
signature, but that's probably fine, since the JIT doesn't know to look
for these widened signatures anyway.
@topolarity topolarity force-pushed the ct/trace-compile-const-return branch from 5b907c9 to f130858 Compare May 27, 2025 14:19
@KristofferC KristofferC mentioned this pull request May 28, 2025
58 tasks
// This isa_compileable_sig check is required because `precompile(...)` will reject the signature otherwise,
// assuming it is too wide to compile to a good result. That happens even though we know a posteriori that
// the signature compiles to a function with excellent performance (a const-return function).
record_precompile_statement(mi, 0.0, 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably works but it makes me nervous, since all the other calls to this are from jl_compile_method_internal. Is there any way to arrange for this call to also come from that function?

@NHDaly
Copy link
Member

NHDaly commented Jul 16, 2025

@kpamnany to leave a status update on the experiments we tried w/ this PR.
(TL;DR: it caused crashes when backported to our 1.10 build, so we haven't settled on it yet.)

@topolarity
Copy link
Member Author

it caused crashes when backported to our 1.10 build, so we haven't settled on it yet.

This PR is probably not merge-able as-is, but the crashes that @kpamnany encountered are not due to this fix.

They happen when you include the precompile statement in the sysimage build, not when it is generated. Adding a precompile statement should never crash the compiler, so it's a separate bug

@topolarity topolarity marked this pull request as draft July 16, 2025 14:50
@kpamnany
Copy link
Member

What @topolarity said. This patch does emit precompile statements for const-return methods just fine. But when we try to build a sysimage including these precompile statements, we saw some strange failures.

On our 1.10 build, we used this patch and generated precompile statements for a sysimage build. We then added:

diff --git a/base/compiler/cicache.jl b/base/compiler/cicache.jl
index 8332777e6d..7bec2ae3ad 100644
--- a/base/compiler/cicache.jl
+++ b/base/compiler/cicache.jl
@@ -56,6 +56,8 @@ function get(wvc::WorldView{InternalCodeCache}, mi::MethodInstance, default)
     r = ccall(:jl_rettype_inferred, Any, (Any, UInt, UInt), mi, first(wvc.worlds), last(wvc.worlds))
     if r === nothing
         return default
+    elseif !isa(r, CodeInstance)
+        println("Expected CodeInstance for ", mi)
     end
     return r::CodeInstance
 end

And tried to build a sysimage. The sysimage built successfully a number of times, but then failed with:

2025-06-25 23:34:18 - PackageCompiler: compiling incremental system image
2025-06-25 23:34:52 Expected CodeInstance for (::Type{Base.IteratorEltype})(Type{Base.Iterators.Flatten{Array{Array{Salsa.DependencyKey, 1}, 1}}}) from (::Type{Base.IteratorEltype})(Type{Base.Iterators.Flatten{I}}) where {I}
2025-06-25 23:34:52 Internal error: encountered unexpected error in runtime:
2025-06-25 23:34:52 TypeError(func=:typeassert, context="", expected=Core.CodeInstance, got=<?#0x7fffaa5dd6d0::<?#0x7fffaa5dd730::(nil)>>)
2025-06-25 23:34:52 thread (1) ijl_type_error_rt at /build/julia-kp-backport-trace-const-return-methods/src/rtutils.c:121
2025-06-25 23:34:52 thread (1) ijl_type_error at /build/julia-kp-backport-trace-const-return-methods/src/rtutils.c:129
2025-06-25 23:34:52 thread (1) get at ./compiler/cicache.jl:62 [inlined]
2025-06-25 23:34:52 thread (1) typeinf_edge at ./compiler/typeinfer.jl:872
2025-06-25 23:34:52 thread (1) abstract_call_method at ./compiler/abstractinterpretation.jl:633
2025-06-25 23:34:52 thread (1) abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:95
2025-06-25 23:34:52 thread (1) abstract_call_known at ./compiler/abstractinterpretation.jl:2101

That mi corresponded to the following precompile statement:

#=    0.0 =# precompile(Tuple{Type{Base.IteratorEltype}, Type{Base.Iterators.Flatten{Array{Array{Salsa.DependencyKey, 1}, 1}}}}) #= const-return =#

Unfortunately, this did not reproduce in the next 10 runs. :(

Cody then suggested:

diff --git a/src/gf.c b/src/gf.c
index 2933db4229..19be5ece49 100644
--- a/src/gf.c
+++ b/src/gf.c
@@ -517,6 +517,7 @@ JL_DLLEXPORT void jl_mi_cache_insert(jl_method_instance_t *mi JL_ROOTING_ARGUMEN
                                      jl_code_instance_t *ci JL_ROOTED_ARGUMENT JL_MAYBE_UNROOTED)
 {
     JL_GC_PUSH1(&ci);
+    assert(jl_is_code_instance(ci));
     if (jl_is_method(mi->def.method))
         JL_LOCK(&mi->def.method->writelock);
     jl_code_instance_t *oldci = jl_atomic_load_relaxed(&mi->cache);

After a number of no-error runs, we finally saw:

2025-06-27 15:18:29 - PackageCompiler: compiling incremental system image
2025-06-27 15:19:05 julia: /build/julia-kp-backport-trace-const-return-methods/src/gf.c:526: ijl_mi_cache_insert: Assertion `jl_is_code_instance(ci)' failed.
2025-06-27 15:19:05 
2025-06-27 15:19:05 [1100631] signal (6.-6): Aborted
2025-06-27 15:19:05 in expression starting at none:1
2025-06-27 15:19:05 signal (6) thread (1) __pthread_kill_implementation at /nix/store/cg9s562sa33k78m63njfn1rw47dp9z0i-glibc-2.40-66/lib/libc.so.6 (unknown line)
2025-06-27 15:19:05 signal (6) thread (1) gsignal at /nix/store/cg9s562sa33k78m63njfn1rw47dp9z0i-glibc-2.40-66/lib/libc.so.6 (unknown line)
2025-06-27 15:19:05 signal (6) thread (1) abort at /nix/store/cg9s562sa33k78m63njfn1rw47dp9z0i-glibc-2.40-66/lib/libc.so.6 (unknown line)
2025-06-27 15:19:05 signal (6) thread (1) __assert_fail_base.cold at /nix/store/cg9s562sa33k78m63njfn1rw47dp9z0i-glibc-2.40-66/lib/libc.so.6 (unknown line)
2025-06-27 15:19:05 signal (6) thread (1) __assert_fail at /nix/store/cg9s562sa33k78m63njfn1rw47dp9z0i-glibc-2.40-66/lib/libc.so.6 (unknown line)
2025-06-27 15:19:05 signal (6) thread (1) ijl_mi_cache_insert at /build/julia-kp-backport-trace-const-return-methods/src/gf.c:526
2025-06-27 15:19:05 signal (6) thread (1) setindex! at ./compiler/cicache.jl:13 [inlined]
2025-06-27 15:19:05 signal (6) thread (1) setindex! at ./compiler/cicache.jl:72 [inlined]
2025-06-27 15:19:05 signal (6) thread (1) cache_result! at ./compiler/typeinfer.jl:406
2025-06-27 15:19:05 signal (6) thread (1) _typeinf at ./compiler/typeinfer.jl:283
2025-06-27 15:19:05 signal (6) thread (1) typeinf at ./compiler/typeinfer.jl:216
<snip>

Strangely, the setindex! that calls jl_mi_cache_insert is dispatched on ci being a CodeInstance:

function setindex!(cache::InternalCodeCache, ci::CodeInstance, mi::MethodInstance)
    ccall(:jl_mi_cache_insert, Cvoid, (Any, Any), mi, ci)

And we haven't gotten any further with this investigation.

@topolarity
Copy link
Member Author

It's also worth mentioning that another run at some point showed a GC-based segfault of some kind (iirc), so it's likely that bug is a GC corruption / incorrect-ness of some kind.

@KristofferC KristofferC mentioned this pull request Jul 22, 2025
20 tasks
@KristofferC KristofferC mentioned this pull request Aug 6, 2025
38 tasks
This was referenced Aug 19, 2025
DilumAluthge added a commit that referenced this pull request Sep 5, 2025
Backported PRs:
- [x] #54840 <!-- Add boundscheck in speccache_eq to avoid OOB access
due to data race -->
- [x] #42080 <!-- recommend explicit `using Foo: Foo, ...` in package
code (was: "using considered harmful") -->
- [x] #58127 <!-- [DOC] Update installation docs: /downloads/ =>
/install/ -->
- [x] #58202 <!-- [release-1.11] malloc: use jl_get_current_task to fix
null check -->
- [x] #58584 <!-- Make `Ptr` values static-show w/ type-information -->
- [x] #58637 <!-- Make late gc lower handle insertelement of alloca use.
-->
- [x] #58837 <!-- fix null comparisons for non-standard address spaces
-->
- [x] #57826 <!-- Add a `similar` method for `Type{<:CodeUnits}` -->
- [x] #58293 <!-- fix trailing indices stackoverflow in reinterpreted
array -->
- [x] #58887 <!-- Pkg: Allow configuring can_fancyprint(io::IO) using
IOContext -->
- [x] #58937 <!-- Fix nthreadpools size in JLOptions -->
- [x] #58978 <!-- Fix precompilepkgs warn loaded setting -->
- [x] #58998 <!-- Bugfix: Use Base.aligned_sizeof instead of sizeof in
Mmap.mmap -->
- [x] #59120 <!-- Fix memory order typo in "src/julia_atomics.h" -->
- [x] #59170 <!-- Clarify and enhance confusing precompile test -->

Need manual backport:
- [ ] #56329 <!-- loading: clean up more concurrency issues -->
- [ ] #56956 <!-- Add "mea culpa" to foreign module assignment error.
-->
- [ ] #57035 <!-- linux: workaround to avoid deadlock inside
dl_iterate_phdr in glibc -->
- [ ] #57089 <!-- Block thread from receiving profile signal with
stackwalk lock -->
- [ ] #57249 <!-- restore non-freebsd-unix fix for profiling -->
- [ ] #58011 <!-- Remove try-finally scope from `@time_imports`
`@trace_compile` `@trace_dispatch` -->
- [ ] #58062 <!-- remove unnecessary edge from `exp_impl` to `pow` -->
- [ ] #58157 <!-- add showing a string to REPL precompile workload -->
- [ ] #58209 <!-- Specialize `one` for the `SizedArray` test helper -->
- [ ] #58108 <!-- Base.get_extension & Dates.format made public -->
- [ ] #58356 <!-- codegen: remove readonly from abstract type calling
convention -->
- [ ] #58415 <!-- [REPL] more reliable extension loading -->
- [ ] #58510 <!-- Don't filter `Core` methods from newly-inferred list
-->
- [ ] #58110 <!-- relax dispatch for the `IteratorSize` method for
`Generator` -->
- [ ] #58965 <!-- Fix `hygienic-scope`s in inner macro expansions -->
- [ ] #58971 <!-- Fix alignment of failed precompile jobs on CI -->
- [ ] #59066 <!-- build: Also pass -fno-strict-aliasing for C++ -->

Contains multiple commits, manual intervention needed:
- [ ] #55877 <!-- fix FileWatching designs and add workaround for a stat
bug on Apple -->
- [ ] #56755 <!-- docs: fix scope type of a `struct` to hard -->
- [ ] #57809 <!-- Fix fptrunc Float64 -> Float16 rounding through
Float32 -->
- [ ] #57398 <!-- Make remaining float intrinsics require float
arguments -->
- [ ] #56351 <!-- Fix `--project=@script` when outside script directory
-->
- [ ] #57129 <!-- clarify that time_ns is monotonic -->
- [ ] #58134 <!-- Note annotated string API is experimental in Julia
1.11 in HISTORY.md -->
- [ ] #58401 <!-- check that hashing of types does not foreigncall
(`jl_type_hash` is concrete evaluated) -->
- [ ] #58435 <!-- Fix layout flags for types that have oddly sized
primitive type fields -->
- [ ] #58483 <!-- Fix tbaa usage when storing into heap allocated
immutable structs -->
- [ ] #58512 <!-- Make more types jl_static_show readably -->
- [ ] #58012 <!-- Re-enable tab completion of kwargs for large method
tables -->
- [ ] #58683 <!-- Add 0 predecessor to entry basic block and handle it
in inlining -->
- [ ] #59112 <!-- Add builtin function name to add methods error -->

Non-merged PRs with backport label:
- [ ] #59329 <!-- aotcompile: destroy LLVM context after serializing
combined module -->
- [ ] #58848 <!-- Set array size only when safe to do so -->
- [ ] #58535 <!-- gf.c: include const-return methods in
`--trace-compile` -->
- [ ] #58038 <!-- strings/cstring: `transcode`: prevent Windows sysimage
invalidation -->
- [ ] #57604 <!-- `@nospecialize` for `string_index_err` -->
- [ ] #57366 <!-- Use ptrdiff_t sized offsets for gvars_offsets to allow
large sysimages -->
- [ ] #56890 <!-- Enable getting non-boxed LLVM type from Julia Type -->
- [ ] #56823 <!-- Make version of opaque closure constructor in world
-->
- [ ] #55958 <!-- also redirect JL_STDERR etc. when redirecting to
devnull -->
- [ ] #55956 <!-- Make threadcall gc safe -->
- [ ] #55534 <!-- Set stdlib sources as read-only during installation
-->
- [ ] #55499 <!-- propagate the terminal's `displaysize` to the
`IOContext` used by the REPL -->
- [ ] #55458 <!-- Allow for generically extracting unannotated string
-->
- [ ] #55457 <!-- Make AnnotateChar equality consider annotations -->
- [ ] #55220 <!-- `isfile_casesensitive` fixes on Windows -->
- [ ] #53957 <!-- tweak how filtering is done for what packages should
be precompiled -->
- [ ] #51479 <!-- prevent code loading from lookin in the versioned
environment when building Julia -->
- [ ] #50813 <!-- More doctests for Sockets and capitalization fix -->
- [ ] #50157 <!-- improve docs for `@inbounds` and
`Base.@propagate_inbounds` -->

---------

Co-authored-by: Kiran Pamnany <[email protected]>
Co-authored-by: adienes <[email protected]>
Co-authored-by: Gabriel Baraldi <[email protected]>
Co-authored-by: Keno Fischer <[email protected]>
Co-authored-by: Simeon David Schaub <[email protected]>
Co-authored-by: Jameson Nash <[email protected]>
Co-authored-by: Alex Arslan <[email protected]>
Co-authored-by: Fons van der Plas <[email protected]>
Co-authored-by: Ian Butterworth <[email protected]>
Co-authored-by: JonasIsensee <[email protected]>
Co-authored-by: Curtis Vogt <[email protected]>
Co-authored-by: Dilum Aluthge <[email protected]>
Co-authored-by: DilumAluthgeBot <[email protected]>
Co-authored-by: DilumAluthge <[email protected]>
@vtjnash
Copy link
Member

vtjnash commented Sep 17, 2025

Resolution is just to use trace-dispatch if that is what you mean, instead of trying to make trace-compile do something unexpected here

@vtjnash vtjnash closed this Sep 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 1.10 Change should be backported to the 1.10 release backport 1.11 Change should be backported to release-1.11 backport 1.12 Change should be backported to release-1.12
Projects
None yet
Development

Successfully merging this pull request may close these issues.

--trace-compile ignores inferred const-return Methods
5 participants