KristofferC
Member

Backported PRs:

Need manual backport:

Contains multiple commits, manual intervention needed:

Non-merged PRs with backport label:

kpamnany and others added 13 commits August 19, 2025 22:26
…54840)

Like #54671, but for
`speccache_eq`.

Saw another segfault with this in the stack trace, hence this fix. I
also looked for other uses of `jl_smallintset_lookup` and there's one in
`idset.c`. That doesn't appear to be racy but I'm not familiar with the
code, so maybe you can take a look at it in case we need to push a fix
for that one too @gbaraldi or @vtjnash?

(cherry picked from commit dd1ed17)
This was in DAECompiler.jl code found by @serenity4. He also mentioned
that it would be useful to write up how one might go about fixing a bug
like this, so I'll give a quick writeup (this was a very simple bug, so
it might not be too interesting).

The original crash looked something like:
  %19 = alloca [10 x i64], align 8
  %155 = insertelement <4 x ptr> poison, ptr %19, i32 0
Unexpected instruction
[898844] signal 6 (-6): Aborted
in expression starting at
/home/gbaraldi/DAECompiler.jl/test/reflection.jl:28
pthread_kill at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
gsignal at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
abort at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
RecursivelyVisit<llvm::IntrinsicInst,
LateLowerGCFrame::PlaceRootsAndUpdateCalls(llvm::ArrayRef<int>, int,
State&, std::map<llvm::Value*, std::pair<int, int>
>)::<lambda(llvm::AllocaInst*&)>::<lambda(llvm::Use&)> > at
/home/gbaraldi/julia4/src/llvm-late-gc-lowering.cpp:803
operator() at /home/gbaraldi/julia4/src/llvm-late-gc-lowering.cpp:2560
[inlined]
PlaceRootsAndUpdateCalls at
/home/gbaraldi/julia4/src/llvm-late-gc-lowering.cpp:2576
runOnFunction at
/home/gbaraldi/julia4/src/llvm-late-gc-lowering.cpp:2638
run at /home/gbaraldi/julia4/src/llvm-late-gc-lowering.cpp:2675
run at
/home/gbaraldi/julia4/usr/include/llvm/IR/PassManagerInternal.h:91

which means it was crashing inside of late-gc-lowering, so the first
thing I did was run Julia and the same test with LLVM_ASSERTIONS=1 and
FORCE_ASSERTIONS=1 to see if LLVM complained about a malformed module,
and both were fine. The next step was trying to get the failing code out
for inspection.
The easiest way is to do `export
JULIA_LLVM_ARGS="--print-before=LateLowerGCFrame --print-module-scope"`
and pipe the output to a file.
The file is huge, but since it's a crash in LLVM we know that the last
module printed is the one we want, and that gave me the IR I wanted.
To verify that this IR triggers the failure, I did
`make -C src install-analysis-deps` to install the LLVM machinery
(`opt`, etc.). That gets put in the `tools` directory of a Julia build.
Then I checked whether this crashed outside of Julia by doing
`./opt -load-pass-plugin=../lib/libjulia-codegen.dylib
--passes=LateLowerGCFrame -S test.ll -o tmp3.ll`. This is run from
inside the tools dir, so your paths might vary (the `-S` is so LLVM
doesn't generate bitcode). My code did crash; however, it was over 500
lines of IR, which makes it harder to debug and to write a test.
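
As a rough sketch of the extraction step described above: the captured dump is large, and only the last module printed before the crash matters. The helper name `last_ir_dump` is hypothetical, and the assumption that every dump is introduced by a banner line containing `IR Dump Before` is mine rather than something the commit message states, so adjust the pattern to whatever your LLVM version prints.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print everything after the last "IR Dump Before" banner
# in a captured log. The banner text is an assumption about LLVM's output.
last_ir_dump() {
    local file=$1 start
    # Line number of the last banner; empty if the file contains none.
    start=$(grep -n 'IR Dump Before' "$file" | tail -n 1 | cut -d: -f1) || true
    [ -n "$start" ] || return 1
    # Print from the line after the banner to the end of the file.
    tail -n "+$((start + 1))" "$file"
}

# Example usage (file names are placeholders):
#   export JULIA_LLVM_ARGS="--print-before=LateLowerGCFrame --print-module-scope"
#   ./julia test/reflection.jl > dump.txt 2>&1
#   last_ir_dump dump.txt > test.ll
```

Because the crash message and stack trace are written after the IR, the extracted file will probably still need its tail trimmed by hand before `opt` accepts it.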

Next step then is to minimize the crash by running
[`llvm-reduce`](https://llvm.org/docs/CommandGuide/llvm-reduce.html)
over it (it's basically creduce but optimized for LLVM IR), which gave me
a 2 line reproducer (in this case apparently just having the
insertelement was enough for the pass to fail). One thing to be wary of
is that llvm-reduce will usually produce very weird code, so it might be
useful to modify the code slightly so it doesn't look odd (it will have
unreachable basic blocks and such).
After the cleanup, fixing the bug here wasn't interesting, but that
doesn't apply generally. Also, always turn your reduced IR into a test
to put in llvmpasses.
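
`llvm-reduce` needs an "interestingness test": a script that it runs on each candidate file and that exits with status 0 while the candidate still reproduces the problem. A minimal sketch of such a test follows. The function name `still_crashes` and the file name `interesting.sh` are hypothetical, and matching on the `Unexpected instruction` message from the crash log above is just one possible choice of marker.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit 0) only if running the given command
# produces output that contains the marker string.
still_crashes() {
    local marker=$1 output
    shift
    # Capture stdout and stderr; the command is expected to fail, so its
    # exit status is deliberately ignored.
    output=$("$@" 2>&1) || true
    case "$output" in
        *"$marker"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example interesting.sh body (llvm-reduce passes the candidate file as "$1"):
#   still_crashes "Unexpected instruction" \
#       ./opt -load-pass-plugin=../lib/libjulia-codegen.dylib \
#       --passes=LateLowerGCFrame -S "$1" -o /dev/null
#
# Example invocation:
#   llvm-reduce --test=./interesting.sh test.ll
```

Matching on a specific message rather than on "any non-zero exit" keeps the reducer from drifting toward an unrelated failure, such as IR that simply no longer parses.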

(cherry picked from commit 906d348)
When this API was added, this function inlined, which is important,
because the API relies on the allocation of the `Ref` being elided. At
some point (I went back to 1.8) this regressed. For example, it is
currently responsible for substantially all non-Expr allocations in
JuliaParser. Before (parsing all of Base with JuliaParser):
```
│     Memory estimate: 76.93 MiB, allocs estimate: 719922.
```
After:
```
│     Memory estimate: 53.31 MiB, allocs estimate: 156.
```

Also add a test to make sure this doesn't regress again.

(cherry picked from commit d6294ba)
Co-authored-by: Jameson Nash <[email protected]>
(cherry picked from commit 3ed13ea)
Currently, `similar(::CodeUnits)` works as expected by going through the
generic `AbstractArray` method. However, the fallback method hit by
`similar(::Type{<:CodeUnits}, dims)` does not work, as it assumes the
existence of a constructor that accepts an `UndefInitializer`. This can
be made to work by defining a corresponding `similar` method that
returns an `Array`.

One could make a case that this is a bugfix since it was arguably a bug
that this method didn't work given that `CodeUnits` is an
`AbstractArray` subtype and the other `similar` methods work. If anybody
buys that argument, it could be nice to backport this; it came up in
some internal code that uses Arrow.jl and JSON3.jl together.

(cherry picked from commit 8e524c7)
would fix #57170, fix
#54623

@nanosoldier `runbenchmarks("array", vs=":master")`

(cherry picked from commit e853a4f)
@ViralBShah ViralBShah added the release Release management and versioning. label Aug 19, 2025
@DilumAluthge
Member

I have a Distributed.jl bugfix to backport, but I first need to do the backport in the Distributed repo, and then I'll bump Distributed here.

…release-julia-1.11` (but keep the commit the same) (#59390)

This PR will be followed up with a stdlib bump with the Distributed
backports.
…o e9b9023 (#59391)

Stdlib: Distributed
URL: https://github.com/JuliaLang/Distributed.jl
Stdlib branch: release-julia-1.11
Julia branch: backports-release-1.11
Old commit: 6c7cdb5
New commit: e9b9023
Julia version: 1.11.5
Distributed version: 1.11.0 (Does not match)
Bump invoked by: @DilumAluthge
Powered by:
[BumpStdlibs.jl](https://github.com/JuliaLang/BumpStdlibs.jl)

Diff:
JuliaLang/Distributed.jl@6c7cdb5...e9b9023

```
$ git log --oneline 6c7cdb5..e9b9023
e9b9023 Backports for Julia 1.11.x (#141)
```

Co-authored-by: DilumAluthge <[email protected]>
@DilumAluthge
Member

Okay, I've pushed two commits to this branch:

  1. 5225013 ([backports-release-1.11] Change Distributed branch from master to release-julia-1.11 (but keep the commit the same) #59390)
  2. a5f88ec (🤖 [backports-release-1.11] Bump the Distributed stdlib from 6c7cdb5 to e9b9023 #59391)

@KristofferC Should I edit the PR description manually and add these two PRs to the list? Or would that break the backporter script?

@ViralBShah
Member

ViralBShah commented Aug 29, 2025

We should get the openblas patch in: #59346

@DilumAluthge
Member

DilumAluthge commented Aug 30, 2025

CI is currently failing on this PR, because of segfaults in multiple test sets.

Here is a MWE that reproduces the segfault:

test_list = [
    "ambiguous",
]

Base.runtests(test_list)

With that MWE, bisect blames 3b04664 (which backports #58837):

3b04664577713059bee663fb3d1cd37080a7ff8e is the first bad commit
commit 3b04664577713059bee663fb3d1cd37080a7ff8e
Author: Simeon David Schaub <[email protected]>
Date:   Tue Jul 1 16:58:39 2025 +0200

    fix null comparisons for non-standard address spaces (#58837)

    Co-authored-by: Jameson Nash <[email protected]>
    (cherry picked from commit 3ed13ea7ab3d73f408b12a70ad29565c97bf5562)

 src/cgutils.cpp | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)
bisect found first bad commit
Here are the details of the bisect:

I ran the bisect on Linux x86_64.

My starting good commit was: 9615af0

My starting bad commit was: 562699a

Here is my git bisect run script (../bisect_script.bash):

#!/usr/bin/env bash

set -euf -o pipefail

which -a ccache

make cleanall
make distclean

rm -f Make.user
echo "USECCACHE=1" >> Make.user

make -j16

./julia ../test.jl

Here are the contents of the ../test.jl file:

test_list = [
    "ambiguous",
]

Base.runtests(test_list)

The git bisect log is:

git bisect start
# status: waiting for both good and bad commits
# bad: [562699a03d8ddbfd45ce6c3ca6a427a73c572dfe] Merge branch 'release-1.11' into backports-release-1.11
git bisect bad 562699a03d8ddbfd45ce6c3ca6a427a73c572dfe
# good: [9615af0f269df4d371b8010e9507ed5bae86103b] set VERSION to 1.11.6 (#58935)
git bisect good 9615af0f269df4d371b8010e9507ed5bae86103b
# bad: [fc978933e245a75a1985cceeb8c421cb95496cf0] Pkg: Allow configuring can_fancyprint(io::IO) using IOContext (#58887)
git bisect bad fc978933e245a75a1985cceeb8c421cb95496cf0
# good: [47ebf952ea2bc46d9d27a0668edff23b0956a35d] Unicode: Force-inline isgraphemebreak! (#58674)
git bisect good 47ebf952ea2bc46d9d27a0668edff23b0956a35d
# bad: [01de2854a043625358f3aa7fc3d87f867eb30577] Add a `similar` method for `Type{<:CodeUnits}` (#57826)
git bisect bad 01de2854a043625358f3aa7fc3d87f867eb30577
# bad: [3b04664577713059bee663fb3d1cd37080a7ff8e] fix null comparisons for non-standard address spaces (#58837)
git bisect bad 3b04664577713059bee663fb3d1cd37080a7ff8e
# first bad commit: [3b04664577713059bee663fb3d1cd37080a7ff8e] fix null comparisons for non-standard address spaces (#58837)
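
For readers who want to repeat this kind of bisect, a session like the one logged above can be driven by `git bisect run`, which checks out each candidate commit and uses the script's exit status to classify it: 0 means good, 125 means skip, and any other code from 1 to 127 means bad. The wrapper below is a sketch; `bisect_first_bad` is a hypothetical name, and reading the result from `refs/bisect/bad` is one way to get at the answer, not necessarily what was done here.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: bisect between a known-good and a known-bad commit
# using a test command, then print the hash of the first bad commit.
bisect_first_bad() {
    local good=$1 bad=$2 first_bad=""
    shift 2
    git bisect start "$bad" "$good" >/dev/null || return 1
    # Remaining arguments are the test command that git runs at each step.
    if git bisect run "$@" >/dev/null 2>&1; then
        # When the bisect finishes, refs/bisect/bad names the first bad commit.
        first_bad=$(git rev-parse refs/bisect/bad)
    fi
    # Return the working tree to where it was before the bisect started.
    git bisect reset >/dev/null 2>&1
    [ -n "$first_bad" ] && printf '%s\n' "$first_bad"
}

# Example usage with the commits and script from this comment:
#   bisect_first_bad 9615af0 562699a ../bisect_script.bash
```

Note that with a script like the one above, a commit that fails to build is also reported as bad; exiting with 125 on build failure would mark it as skipped instead.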

@DilumAluthge
Member

I've confirmed that reverting 3b04664 locally fixes my MWE.

@DilumAluthge
Member

DilumAluthge commented Sep 1, 2025

Okay, we are very close to green CI on this PR now. The last failure is llvmpasses, which would be fixed by #59447

@DilumAluthge
Member

Alright, I'll wait for CI to finish (just to double-check that it is all green now), and then I'll kick off PkgEval.

@DilumAluthge
Member

Much of CI is green now, including the previously-failing llvmpasses job. I'll go ahead and kick off PkgEval now, while we wait for the rest of CI to finish.

@DilumAluthge
Member

@nanosoldier runtests()

@DilumAluthge
Member

CI is all green now.

@KristofferC
Member Author

Hm, I don't see the nanosoldier notification showing that it is running.

@maleadt
Member

maleadt commented Sep 2, 2025

@nanosoldier runtests()

@nanosoldier
Collaborator

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.

Report summary

❗ Packages that crashed

3 packages crashed only on the current version.

  • The process was aborted: 1 packages
  • GC corruption was detected: 1 packages
  • A segmentation fault happened: 1 packages

19 packages crashed on the previous version too.

✖ Packages that failed

33 packages failed only on the current version.

  • Package has test failures: 3 packages
  • Package tests unexpectedly errored: 3 packages
  • Tests became inactive: 1 packages
  • Test duration exceeded the time limit: 25 packages
  • Test log exceeded the size limit: 1 packages

3109 packages failed on the previous version too.

✔ Packages that passed tests

40 packages passed tests only on the current version.

  • Other: 40 packages

6417 packages passed tests on the previous version too.

➖ Packages that were skipped altogether

1 packages were skipped only on the current version.

  • Package could not be installed: 1 packages

1310 packages were skipped on the previous version too.

@DilumAluthge
Member

@nanosoldier runtests(["FastPower", "JuliaInterpreter", "MatrixBandwidth", "ColPack", "Intervals", "MRICoilSensitivities", "FindMinimaxPolynomial", "Ferrite", "Clarabel", "MathOptChordalDecomposition", "FourierTools", "KSVD", "StateSpaceDynamics", "LaplacianExpectationMaximization", "InferOpt", "NonconvexMultistart", "QuantumSymbolics", "BetaML", "JudiLing", "ONSAS", "ParametrisedConvexApproximators", "GeoStatsValidation", "Knockoffs", "ModelingToolkitTolerances", "SurfaceReactions", "SurfaceCoverage", "AcousticRayTracers", "SmoothPeriodicStatsModels", "CalibrateEmulateSample", "StateSpaceAnalysis", "ControlBarrierFunctions", "FourLeafMLE", "RobustBlindVerification", "LinearSolveAutotune"])

@nanosoldier
Collaborator

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.

Report summary

✖ Packages that failed

2 packages failed only on the current version.

  • Package tests unexpectedly errored: 2 packages

3 packages failed on the previous version too.

✔ Packages that passed tests

2 packages passed tests only on the current version.

  • Other: 2 packages

27 packages passed tests on the previous version too.

@DilumAluthge
Member

DilumAluthge commented Sep 4, 2025

@KristofferC The JuliaInterpreter failure looks real?

Check builtin.jl consistency: Error During Test at /home/pkgeval/.julia/packages/JuliaInterpreter/378J1/test/check_builtins.jl:6
Got exception outside of a @test
LoadError: UndefVarError: `isdefinedglobal` not defined in `Main`
Stacktrace:

@KristofferC
Member Author

Nah, it's just that that test didn't run at all on the baseline due to:

https://github.com/JuliaDebug/JuliaInterpreter.jl/blob/da8883712ddcf1e781894a9cdd96578a9f991113/test/check_builtins.jl#L4

@DilumAluthge
Member

Ah cool. From the log, that looks like the only JuliaInterpreter test that failed, so presumably I can ignore JuliaInterpreter?

@DilumAluthge
Member

DilumAluthge commented Sep 4, 2025

Okay, so if we ignore JuliaInterpreter, then the only remaining PkgEval failure is ONSAS.jl, which failed with one error:

[ Info: Testing ../examples/von_misses_truss/von_misses_truss.jl...
../examples/von_misses_truss/von_misses_truss.jl: Error During Test at /home/pkgeval/.julia/packages/SafeTestsets/raUNr/src/SafeTestsets.jl:30
  Got exception outside of a @test
  LoadError: AssertionError: !(any(isnan, component_data))
  Stacktrace:
    [1] write_cell_data(vtk::ONSAS.VTK.VTKMeshFile{WriteVTK.DatasetFile}, celldata::Vector{Matrix{Float64}}, name::String; component_names::Vector{String}, kwargs::@Kwargs{})
      @ ONSAS.VTK ~/.julia/packages/ONSAS/49upX/src/Interfaces/VTK.jl:137

[...]

   [10] run()
      @ Main.var"##../examples/von_misses_truss/von_misses_truss.jl#252" ~/.julia/packages/ONSAS/49upX/examples/von_misses_truss/von_misses_truss.jl:135
   [11] top-level scope
      @ ~/.julia/packages/ONSAS/49upX/examples/von_misses_truss/von_misses_truss.jl:140

[...]
ERROR: LoadError: Some tests did not pass: 489 passed, 0 failed, 1 errored, 3 broken.
in expression starting at /home/pkgeval/.julia/packages/ONSAS/49upX/test/runtests.jl:50

@DilumAluthge
Member

I'll try it again, just to make sure it still reproduces.

@DilumAluthge
Member

@nanosoldier runtests(["ONSAS"])

@DilumAluthge
Member

DilumAluthge commented Sep 4, 2025

Okay, so locally I can reproduce the ONSAS.jl failure (an !(any(isnan, component_data)) assertion failure in the von_misses_truss example) on both of the following:

  1. The current tip of release-1.11 (9615af0)
  2. The current tip of backports-release-1.11 (70d29e1af52b1d1b6310b66e303495c18074f820)

However, the failure is non-deterministic. When I run my reproducer (see footnote 1) multiple times in a row, sometimes it passes and sometimes it fails.

  1. On release-1.11 (9615af0), I ran my reproducer 15 times in a row. I got 3 failures and 12 passes.
  2. On backports-release-1.11 (70d29e1af52b1d1b6310b66e303495c18074f820), I ran my reproducer 15 times in a row. I got 5 failures and 10 passes.

Therefore, I think that this is a non-deterministic failure that already exists on release-1.11, and thus I don't believe it's a bug introduced by this backports PR.

I've opened mvanzulli/ONSAS.jl#537.

Footnotes

  1. The reproducer is basically just ./julia --project=../path/to/ONSAS.jl ../path/to/ONSAS.jl/examples/von_misses_truss/von_misses_truss.jl.
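
A minimal sketch of how such a repeated-run comparison could be scripted, assuming nothing about how the runs above were actually done. The function name `count_failures` is hypothetical; it runs a command a fixed number of times and tallies non-zero exits.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command N times and report how often it failed.
count_failures() {
    local runs=$1 failures=0 i=0
    shift
    while [ "$i" -lt "$runs" ]; do
        # Any non-zero exit status counts as one failure.
        "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    printf '%s failures, %s passes\n' "$failures" "$((runs - failures))"
}

# Example usage with the reproducer from the footnote (paths are placeholders):
#   count_failures 15 ./julia --project=../path/to/ONSAS.jl \
#       ../path/to/ONSAS.jl/examples/von_misses_truss/von_misses_truss.jl
```

Running the same count on both branches gives a like-for-like comparison, although 15 runs is a small sample for judging whether the failure rates differ.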

@DilumAluthge
Member

To summarize the current status of PkgEval on this PR:

There are currently two packages failing:

  1. JuliaInterpreter
    • Per Kristoffer's explanation above, we can ignore this one.
  2. ONSAS
    • Per what I wrote above, this failure is reproducible (although it is non-deterministic) on the tip of release-1.11. Therefore, I don't think this failure was introduced by a commit in this backports PR.

Therefore, my conclusion is that PkgEval looks good on this PR.

@nanosoldier
Collaborator

The package evaluation job you requested has completed - no new issues were detected.
The full report is available.

Report summary

✔ Packages that passed tests

1 packages passed tests on the previous version too.

@DilumAluthge DilumAluthge merged commit 64d2674 into release-1.11 Sep 5, 2025
7 checks passed
@DilumAluthge DilumAluthge deleted the backports-release-1.11 branch September 5, 2025 13:16
@DilumAluthge
Member

DilumAluthge commented Sep 5, 2025

Ugh, this defaulted to squash-merge, which isn't correct. That's my bad. We want to use a regular merge for backports PRs, so that people can bisect on the release branch. I'll fix this.

@DilumAluthge DilumAluthge restored the backports-release-1.11 branch September 5, 2025 13:19
@DilumAluthge DilumAluthge deleted the backports-release-1.11 branch September 5, 2025 13:25
@DilumAluthge DilumAluthge restored the backports-release-1.11 branch September 5, 2025 13:41
DilumAluthge added a commit that referenced this pull request Sep 5, 2025
Merge branch `backports-release-1.11` into `release-1.11`

#59336
@DilumAluthge
Member

I've fixed this. It's now a regular merge commit (not a squash merge). This will allow people to bisect on the release-1.11 branch.
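
The difference matters for bisecting because a squash merge collapses all backported commits into one commit with a single parent, while a regular merge keeps each commit reachable through the merge commit's second parent. A quick way to tell the two apart is to count a commit's parents. The helper name `parent_count` below is hypothetical, and the `git merge --no-ff` line in the comment is one way to force a merge commit locally, not necessarily how the fix above was made.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print how many parents a commit has.
# An ordinary or squash commit has 1; a merge commit has 2 or more.
parent_count() {
    # rev-list --parents prints the commit hash followed by its parent hashes.
    git rev-list --parents -n 1 "$1" | awk '{ print NF - 1 }'
}

# Example usage (branch names taken from this PR):
#   git checkout release-1.11
#   git merge --no-ff backports-release-1.11
#   parent_count HEAD    # 2 for a regular merge commit
```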

@DilumAluthge DilumAluthge deleted the backports-release-1.11 branch September 5, 2025 13:48