gbaraldi
Member

@gbaraldi gbaraldi commented Jun 4, 2025

This was found by @serenity4 in DAECompiler.jl code. He also suggested writing up how one might go about fixing a bug like this, so I'll give a quick writeup (this was a very simple bug, so it might not be too interesting).

The original crash looked something like this:

%19 = alloca [10 x i64], align 8
%155 = insertelement <4 x ptr> poison, ptr %19, i32 0
Unexpected instruction
[898844] signal 6 (-6): Aborted
in expression starting at /home/gbaraldi/DAECompiler.jl/test/reflection.jl:28
pthread_kill at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
gsignal at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
abort at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
RecursivelyVisit<llvm::IntrinsicInst, LateLowerGCFrame::PlaceRootsAndUpdateCalls(llvm::ArrayRef<int>, int, State&, std::map<llvm::Value*, std::pair<int, int> >)::<lambda(llvm::AllocaInst*&)>::<lambda(llvm::Use&)> > at /home/gbaraldi/julia4/src/llvm-late-gc-lowering.cpp:803
operator() at /home/gbaraldi/julia4/src/llvm-late-gc-lowering.cpp:2560 [inlined]
PlaceRootsAndUpdateCalls at /home/gbaraldi/julia4/src/llvm-late-gc-lowering.cpp:2576
runOnFunction at /home/gbaraldi/julia4/src/llvm-late-gc-lowering.cpp:2638
run at /home/gbaraldi/julia4/src/llvm-late-gc-lowering.cpp:2675
run at /home/gbaraldi/julia4/usr/include/llvm/IR/PassManagerInternal.h:91

This means it was crashing inside late-gc-lowering, so the first thing I did was run the same test on a Julia built with `LLVM_ASSERTIONS=1` and `FORCE_ASSERTIONS=1` to see if LLVM complained about a malformed module. It did not complain with either setting. The next step was to get the failing code out for inspection.
The easiest way is to `export JULIA_LLVM_ARGS="--print-before=LateLowerGCFrame --print-module-scope"` and pipe the output to a file.
The file is huge, but since the crash happens inside LLVM, the last module printed is the one we want, and that gave me the IR I needed.
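
Pulling the last dump out of that file can be scripted. This is only a rough sketch: `last_ir_dump` is a made-up helper name, it assumes each dump starts with a banner line containing `*** IR Dump Before`, and the `julia` invocation in the comment is a guess at how the test would be run. If the crash output ended up in the same file, it will trail the IR and needs to be trimmed by hand.

```shell
# Capture step (not run here; the julia command line is an assumption):
#   export JULIA_LLVM_ARGS="--print-before=LateLowerGCFrame --print-module-scope"
#   julia test/reflection.jl > dump.log 2>&1

# Hypothetical helper: print everything after the last "IR Dump Before"
# banner found in the log file given as "$1".
last_ir_dump() {
    awk '
        /\*\*\* IR Dump Before/ { n = 0; next }  # new banner: drop the previous dump
        { buf[n++] = $0 }                        # collect lines of the current dump
        END { for (i = 0; i < n; i++) print buf[i] }
    ' "$1"
}
```

Usage would then be something like `last_ir_dump dump.log > test.ll`.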
To verify that this IR fails on its own, I ran `make -C src install-analysis-deps` to install the LLVM tools (`opt` and friends), which are put in the `tools` directory of a Julia build. Then I checked whether it crashed outside of Julia by running
`./opt -load-pass-plugin=../lib/libjulia-codegen.dylib --passes=LateLowerGCFrame -S test.ll -o tmp3.ll`. This is run from inside the `tools` directory, so your paths might vary (the library is `libjulia-codegen.so` on Linux), and the `-S` makes LLVM emit textual IR instead of bitcode. My code did crash, but it was over 500 lines of IR, which makes it harder to debug and to turn into a test.

The next step is to minimize the crash by running llvm-reduce over it (it's basically creduce, but specialized for LLVM IR), which gave me a 2-line reproducer (in this case, apparently just having the insertelement was enough for the pass to fail). One thing to be wary of is that llvm-reduce usually produces very odd-looking code (unreachable basic blocks and such), so it can be useful to clean the result up slightly.
After the cleanup, fixing the bug here wasn't interesting, but that doesn't apply generally. Also, always turn your reduced IR into a test and put it in `test/llvmpasses`.
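
For the reduction step, llvm-reduce needs an "interestingness" test: a script that exits 0 while the candidate file still shows the bug. A minimal sketch, reusing the `opt` invocation from above and matching on the `Unexpected instruction` message from the crash log. `still_crashes`, `OPT` and `PLUGIN` are names made up for this sketch, and the default paths assume you are in the `tools` directory:

```shell
# Hypothetical helper: succeed (exit 0) when running the pass on "$1" still
# prints the crash message. OPT and PLUGIN default to the paths used in the
# writeup and can be overridden.
still_crashes() {
    "${OPT:-./opt}" -load-pass-plugin="${PLUGIN:-../lib/libjulia-codegen.dylib}" \
        --passes=LateLowerGCFrame -S "$1" -o /dev/null 2>&1 |
        grep -q "Unexpected instruction"
}

# To use it, save the function in an executable script (say interesting.sh)
# whose last line is:   still_crashes "$1"
# and then run (not executed here):
#   llvm-reduce --test=./interesting.sh test.ll -o reduced.ll
```

Matching on the message rather than just on a non-zero exit code keeps llvm-reduce from wandering off to some unrelated failure, such as IR that no longer parses.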

@gbaraldi gbaraldi requested review from vtjnash and Keno June 4, 2025 20:08
@Keno Keno merged commit 906d348 into master Jun 4, 2025
8 checks passed
@Keno Keno deleted the gb/latelowerinsert branch June 4, 2025 23:30
@gbaraldi gbaraldi added backport 1.11 Change should be backported to release-1.11 backport 1.12 Change should be backported to release-1.12 labels Jul 15, 2025
KristofferC pushed a commit that referenced this pull request Jul 22, 2025
(cherry picked from commit 906d348)
@KristofferC KristofferC mentioned this pull request Jul 22, 2025
20 tasks
@KristofferC KristofferC mentioned this pull request Aug 6, 2025
38 tasks
@KristofferC KristofferC removed the backport 1.12 Change should be backported to release-1.12 label Aug 6, 2025
KristofferC pushed a commit that referenced this pull request Aug 19, 2025
(cherry picked from commit 906d348)
KristofferC pushed a commit that referenced this pull request Aug 19, 2025
(cherry picked from commit 906d348)
@KristofferC KristofferC mentioned this pull request Aug 19, 2025
65 tasks
DilumAluthge added a commit that referenced this pull request Aug 31, 2025
DilumAluthge added a commit that referenced this pull request Aug 31, 2025
DilumAluthge added a commit that referenced this pull request Sep 1, 2025
KristofferC pushed a commit that referenced this pull request Sep 5, 2025
(cherry picked from commit 906d348)
DilumAluthge added a commit that referenced this pull request Sep 5, 2025
Backported PRs:
- [x] #54840 <!-- Add boundscheck in speccache_eq to avoid OOB access
due to data race -->
- [x] #42080 <!-- recommend explicit `using Foo: Foo, ...` in package
code (was: "using considered harmful") -->
- [x] #58127 <!-- [DOC] Update installation docs: /downloads/ =>
/install/ -->
- [x] #58202 <!-- [release-1.11] malloc: use jl_get_current_task to fix
null check -->
- [x] #58584 <!-- Make `Ptr` values static-show w/ type-information -->
- [x] #58637 <!-- Make late gc lower handle insertelement of alloca use.
-->
- [x] #58837 <!-- fix null comparisons for non-standard address spaces
-->
- [x] #57826 <!-- Add a `similar` method for `Type{<:CodeUnits}` -->
- [x] #58293 <!-- fix trailing indices stackoverflow in reinterpreted
array -->
- [x] #58887 <!-- Pkg: Allow configuring can_fancyprint(io::IO) using
IOContext -->
- [x] #58937 <!-- Fix nthreadpools size in JLOptions -->
- [x] #58978 <!-- Fix precompilepkgs warn loaded setting -->
- [x] #58998 <!-- Bugfix: Use Base.aligned_sizeof instead of sizeof in
Mmap.mmap -->
- [x] #59120 <!-- Fix memory order typo in "src/julia_atomics.h" -->
- [x] #59170 <!-- Clarify and enhance confusing precompile test -->

Need manual backport:
- [ ] #56329 <!-- loading: clean up more concurrency issues -->
- [ ] #56956 <!-- Add "mea culpa" to foreign module assignment error.
-->
- [ ] #57035 <!-- linux: workaround to avoid deadlock inside
dl_iterate_phdr in glibc -->
- [ ] #57089 <!-- Block thread from receiving profile signal with
stackwalk lock -->
- [ ] #57249 <!-- restore non-freebsd-unix fix for profiling -->
- [ ] #58011 <!-- Remove try-finally scope from `@time_imports`
`@trace_compile` `@trace_dispatch` -->
- [ ] #58062 <!-- remove unnecessary edge from `exp_impl` to `pow` -->
- [ ] #58157 <!-- add showing a string to REPL precompile workload -->
- [ ] #58209 <!-- Specialize `one` for the `SizedArray` test helper -->
- [ ] #58108 <!-- Base.get_extension & Dates.format made public -->
- [ ] #58356 <!-- codegen: remove readonly from abstract type calling
convention -->
- [ ] #58415 <!-- [REPL] more reliable extension loading -->
- [ ] #58510 <!-- Don't filter `Core` methods from newly-inferred list
-->
- [ ] #58110 <!-- relax dispatch for the `IteratorSize` method for
`Generator` -->
- [ ] #58965 <!-- Fix `hygienic-scope`s in inner macro expansions -->
- [ ] #58971 <!-- Fix alignment of failed precompile jobs on CI -->
- [ ] #59066 <!-- build: Also pass -fno-strict-aliasing for C++ -->

Contains multiple commits, manual intervention needed:
- [ ] #55877 <!-- fix FileWatching designs and add workaround for a stat
bug on Apple -->
- [ ] #56755 <!-- docs: fix scope type of a `struct` to hard -->
- [ ] #57809 <!-- Fix fptrunc Float64 -> Float16 rounding through
Float32 -->
- [ ] #57398 <!-- Make remaining float intrinsics require float
arguments -->
- [ ] #56351 <!-- Fix `--project=@script` when outside script directory
-->
- [ ] #57129 <!-- clarify that time_ns is monotonic -->
- [ ] #58134 <!-- Note annotated string API is experimental in Julia
1.11 in HISTORY.md -->
- [ ] #58401 <!-- check that hashing of types does not foreigncall
(`jl_type_hash` is concrete evaluated) -->
- [ ] #58435 <!-- Fix layout flags for types that have oddly sized
primitive type fields -->
- [ ] #58483 <!-- Fix tbaa usage when storing into heap allocated
immutable structs -->
- [ ] #58512 <!-- Make more types jl_static_show readably -->
- [ ] #58012 <!-- Re-enable tab completion of kwargs for large method
tables -->
- [ ] #58683 <!-- Add 0 predecessor to entry basic block and handle it
in inlining -->
- [ ] #59112 <!-- Add builtin function name to add methods error -->

Non-merged PRs with backport label:
- [ ] #59329 <!-- aotcompile: destroy LLVM context after serializing
combined module -->
- [ ] #58848 <!-- Set array size only when safe to do so -->
- [ ] #58535 <!-- gf.c: include const-return methods in
`--trace-compile` -->
- [ ] #58038 <!-- strings/cstring: `transcode`: prevent Windows sysimage
invalidation -->
- [ ] #57604 <!-- `@nospecialize` for `string_index_err` -->
- [ ] #57366 <!-- Use ptrdiff_t sized offsets for gvars_offsets to allow
large sysimages -->
- [ ] #56890 <!-- Enable getting non-boxed LLVM type from Julia Type -->
- [ ] #56823 <!-- Make version of opaque closure constructor in world
-->
- [ ] #55958 <!-- also redirect JL_STDERR etc. when redirecting to
devnull -->
- [ ] #55956 <!-- Make threadcall gc safe -->
- [ ] #55534 <!-- Set stdlib sources as read-only during installation
-->
- [ ] #55499 <!-- propagate the terminal's `displaysize` to the
`IOContext` used by the REPL -->
- [ ] #55458 <!-- Allow for generically extracting unannotated string
-->
- [ ] #55457 <!-- Make AnnotateChar equality consider annotations -->
- [ ] #55220 <!-- `isfile_casesensitive` fixes on Windows -->
- [ ] #53957 <!-- tweak how filtering is done for what packages should
be precompiled -->
- [ ] #51479 <!-- prevent code loading from lookin in the versioned
environment when building Julia -->
- [ ] #50813 <!-- More doctests for Sockets and capitalization fix -->
- [ ] #50157 <!-- improve docs for `@inbounds` and
`Base.@propagate_inbounds` -->

---------

Co-authored-by: Kiran Pamnany <[email protected]>
Co-authored-by: adienes <[email protected]>
Co-authored-by: Gabriel Baraldi <[email protected]>
Co-authored-by: Keno Fischer <[email protected]>
Co-authored-by: Simeon David Schaub <[email protected]>
Co-authored-by: Jameson Nash <[email protected]>
Co-authored-by: Alex Arslan <[email protected]>
Co-authored-by: Fons van der Plas <[email protected]>
Co-authored-by: Ian Butterworth <[email protected]>
Co-authored-by: JonasIsensee <[email protected]>
Co-authored-by: Curtis Vogt <[email protected]>
Co-authored-by: Dilum Aluthge <[email protected]>
Co-authored-by: DilumAluthgeBot <[email protected]>
Co-authored-by: DilumAluthge <[email protected]>
DilumAluthge pushed a commit that referenced this pull request Sep 9, 2025
(cherry picked from commit 906d348)
@DilumAluthge DilumAluthge mentioned this pull request Sep 9, 2025
73 tasks
@DilumAluthge DilumAluthge removed the backport 1.11 Change should be backported to release-1.11 label Sep 9, 2025

6 participants