Skip to content

PkgEval detected issues or regressions? #39641

@vtjnash

Description

@vtjnash

Looking through recent PkgEval run for segfaults, these seem worth investigating:

signal (4): Illegal instruction
in expression starting at /home/pkgeval/.julia/packages/RBNF/mMbsg/test/qasm.jl:105
Generator at ./generator.jl:32 [inlined]
Generator at ./generator.jl:32 [inlined]
pprint_for_seq at /home/pkgeval/.julia/packages/PrettyPrint/z2Fty/src/PrettyPrint.jl:60

Fails due to:

julia> Vector{Union{Nothing, T} where T} == Vector{Any}
true
julia> Vector{Union{Nothing, T} where T} === Vector{Any}
false

Which is #35130 (comment)


signal (11): Segmentation fault
in expression starting at /home/pkgeval/.julia/packages/ChainRules/vQPoD/test/rulesets/LinearAlgebra/norm.jl:1
gc_try_setmark at /buildworker/worker/package_linux64/build/src/gc.c:1704 [inlined]
gc_mark_loop at /buildworker/worker/package_linux64/build/src/gc.c:2296
_jl_gc_collect at /buildworker/worker/package_linux64/build/src/gc.c:3034
jl_gc_collect at /buildworker/worker/package_linux64/build/src/gc.c:3241
maybe_collect at /buildworker/worker/package_linux64/build/src/gc.c:880 [inlined]
jl_gc_pool_alloc at /buildworker/worker/package_linux64/build/src/gc.c:1204
jl_gc_alloc_ at /buildworker/worker/package_linux64/build/src/julia_internal.h:285 [inlined]
jl_gc_alloc at /buildworker/worker/package_linux64/build/src/gc.c:3283
_new_array_ at /buildworker/worker/package_linux64/build/src/array.c:122 [inlined]
_new_array at /buildworker/worker/package_linux64/build/src/array.c:188 [inlined]
jl_alloc_array_1d at /buildworker/worker/package_linux64/build/src/array.c:459
Array at ./boot.jl:451 [inlined]
Array at ./boot.jl:460 [inlined]
similar at ./abstractarray.jl:785 [inlined]
similar at ./abstractarray.jl:784 [inlined]
_array_for at ./array.jl:671 [inlined]
compute_live_ins at ./compiler/ssair/slot2ssa.jl:526
construct_ssa! at ./compiler/ssair/slot2ssa.jl:638
slot2reg at ./compiler/ssair/driver.jl:117 [inlined]
run_passes at ./compiler/ssair/driver.jl:124
...

This is similar to ProximalAlgorithms, though in this case, it is a hoisted FCA load of a getfield:

julia> using ChainRulesTestUtils, LinearAlgebra
julia> f = ChainRulesTestUtils.var"#fnew#23"{ChainRulesTestUtils.var"#31#32"{NamedTuple{(), Tuple{}}, typeof(LinearAlgebra.norm)}, 
       Tuple{LinearAlgebra.UpperTriangular{Base.Complex{Float64}, Array{Base.Complex{Float64}, 2}}, Float64}, Tuple{Bool, Bool}}
julia> fx = ccall(:jl_new_struct_uninit, Any, (Any,), f); Core.println(fx)
julia> code_llvm(fx, Tuple{LinearAlgebra.UpperTriangular{Base.Complex{Float64}, Array{Base.Complex{Float64}, 2}}, Vararg{Any}}, raw=true, dump_module=true, optimize=true|false)
  %.fca.0.load = load {}*, {}** %137, align 8                                                                                                                                                           
  %.fca.0.insert = insertvalue [1 x {}*] undef, {}* %.fca.0.load, 0                                                                                                                                     
  %138 = getelementptr inbounds [7 x {}*], [7 x {}*]* %gcframe1182, i64 0, i64 6                                                                                                                        
  store {}* %.fca.0.load, {}** %138, align 16                                                       

[ Info: acquire() failed, error=TypeError(:typeassert, "", NLopt.Callback_Data, Core.Compiler.UseRef(:(Base.arraysize(_4, 2)), 4)), retry(9)

signal (11): Segmentation fault
in expression starting at /home/pkgeval/.julia/packages/EfficientGlobalOptimization/NlQ6o/test/test_ego.jl:1
jl_object_id_ at /buildworker/worker/package_linux64/build/src/builtins.c:385
type_hash at /buildworker/worker/package_linux64/build/src/jltypes.c:1040
typekey_hash at /buildworker/worker/package_linux64/build/src/jltypes.c:1052 [inlined]
lookup_type at /buildworker/worker/package_linux64/build/src/jltypes.c:635
inst_datatype_inner at /buildworker/worker/package_linux64/build/src/jltypes.c:1190
jl_inst_arg_tuple_type at /buildworker/worker/package_linux64/build/src/jltypes.c:1425
arg_type_tuple at /buildworker/worker/package_linux64/build/src/gf.c:1844 [inlined]
jl_lookup_generic_ at /buildworker/worker/package_linux64/build/src/gf.c:2371 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2423
_show_default at ./show.jl:412
show_default at ./show.jl:395 [inlined]
show at ./show.jl:390 [inlined]
print at ./strings/io.jl:35
unknown function (ip: 0x7f5799ec8b45)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2245 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2427
print_to_string at ./strings/io.jl:135
string at ./strings/io.jl:174 [inlined]
macro expansion at ./logging.jl:340 [inlined]
#acquire#13 at /home/pkgeval/.julia/packages/EfficientGlobalOptimization/NlQ6o/src/EGO.jl:55
acquire##kw at /home/pkgeval/.julia/packages/EfficientGlobalOptimization/NlQ6o/src/EGO.jl:51
#acquire#13 at /home/pkgeval/.julia/packages/EfficientGlobalOptimization/NlQ6o/src/EGO.jl:56
acquire##kw at /home/pkgeval/.julia/packages/EfficientGlobalOptimization/NlQ6o/src/EGO.jl:51

This is specifying [compat] for an antiquated version of https://github.com/JuliaOpt/NLopt.jl which was broken.


signal (11): Segmentation fault
in expression starting at none:0
PyObject_Call at /home/pkgeval/.julia/conda/3/lib/libpython3.8.so.1.0 (unknown line)
macro expansion at /home/pkgeval/.julia/packages/PyCall/tqyST/src/exception.jl:95 [inlined]
#109 at /home/pkgeval/.julia/packages/PyCall/tqyST/src/pyfncall.jl:43 [inlined]
disable_sigint at ./c.jl:458 [inlined]
__pycall! at /home/pkgeval/.julia/packages/PyCall/tqyST/src/pyfncall.jl:42 [inlined]
BoundsError(a=Array{Core.Compiler.BasicBlock, (1,)}[Core.Compiler.BasicBlock(stmts=Core.Compiler.StmtRange(start=1, stop=2), preds=Array{Int64, (0,)}[], succs=Array{Int64, (0,)}[])], i=(2,))
jl_bounds_error_ints at /buildworker/worker/package_linux64/build/src/rtutils.c:186
getindex at ./array.jl:802 [inlined]
compute_basic_blocks at ./compiler/ssair/ir.jl:105
...
signal (11): Segmentation fault
in expression starting at /home/pkgeval/.julia/packages/OptionalArgChecks/V6lE7/test/runtests.jl:45
_ZN4llvm23removeUnreachableBlocksERNS_8FunctionEPNS_14DomTreeUpdaterEPNS_16MemorySSAUpdaterE at /opt/julia/bin/../lib/julia/libLLVM-11jl.so (unknown line)
_ZL19simplifyFunctionCFGRN4llvm8FunctionERKNS_19TargetTransformInfoERKNS_18SimplifyCFGOptionsE at /opt/julia/bin/../lib/julia/libLLVM-11jl.so (unknown line)
_ZN4llvm13FPPassManager13runOnFunctionERNS_8FunctionE at /opt/julia/bin/../lib/julia/libLLVM-11jl.so (unknown line)

signal (11): Segmentation fault
in expression starting at /home/pkgeval/.julia/packages/ProximalAlgorithms/IxN0R/test/problems/test_lasso_small.jl:1
_ZN4llvm11PointerType3getEPNS_4TypeEj at /opt/julia/bin/../lib/julia/libLLVM-11jl.so (unknown line)
_ZN4llvm12InstCombiner18visitAddrSpaceCastERNS_17AddrSpaceCastInstE at /opt/julia/bin/../lib/julia/libLLVM-11jl.so (unknown line)
_ZN4llvm12InstCombiner3runEv at /opt/julia/bin/../lib/julia/libLLVM-11jl.so (unknown line)
_ZL31combineInstructionsOverFunctionRN4llvm8FunctionERNS_19InstCombineWorklistEPNS_9AAResultsERNS_15AssumptionCacheERNS_17TargetLibraryInfoERNS_13DominatorTreeERNS_25OptimizationRemarkEmitterEPNS_18BlockFrequencyInfoEPNS_18ProfileSummaryInfoEjPNS_8LoopInfoE at /opt/julia/bin/../lib/julia/libLLVM-11jl.so (unknown line)
_ZN4llvm24InstructionCombiningPass13runOnFunctionERNS_8FunctionE at /opt/julia/bin/../lib/julia/libLLVM-11jl.so (unknown line)
_ZN4llvm13FPPassManager13runOnFunctionERNS_8FunctionE at /opt/julia/bin/../lib/julia/libLLVM-11jl.so (unknown line)

This appears to be a bug in PhiNode handling in llvm-alloc-opt.cpp. In particular, we have this node:

│     %1076 = φ (#394 => %1054, #396 => %1060)::Union{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(+), Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(*), Tuple{Base.RefValue{Nothing}, Vector{Flo
at64}}}, Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(*), Tuple{Float64, Vector{Float64}}}}}, Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(+), Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.Default
ArrayStyle{1}, Nothing, typeof(*), Tuple{Float64, Vector{Float64}}}, Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(*), Tuple{Float64, Vector{Float64}}}}}}                                                                                

which turns into:

   %1896 = bitcast {}* %value_phi70 to { [2 x {}*] }*

and

   %1909 = bitcast {}* %value_phi70 to { { double, {}* } }*

after (splitting on isa). This ends up confusing llvm-alloc-opt.cpp, which tries to fold the PhiNode and then marks that first slot as containing a ref. This violates LLVM's typed-memory assumption that all loads through a particular pointer must share the same type. This pass either needs to further split the users (into ref and value slots) or inhibit the optimization when the types collide like this.

To see this in usage, try:
code_llvm(ProximalAlgorithms.iterate, (ProximalAlgorithms.DRLS_iterable{Float64, Float64, Vector{Float64}, ProximalAlgorithms.ProximalOperators.LeastSquaresDirect{1, Float64, Float64, Matrix{Float64}, Vector{Float64}, Cholesky{Float64, Matrix{Float64}}}, NormL1{Float64}, ProximalAlgorithms.LBFGS{Float64, Float64, Int64, Vector{Float64}, 5}}, ProximalAlgorithms.DRLS_state{Float64, Vector{Float64}, ProximalAlgorithms.LBFGS{Float64, Float64, Int64, Vector{Float64}, 5}}))


  %.sub2957 = phi {} addrspace(10)** [ %.sub2952, %L112.thread ], [ %.sub29, %L112 ]
julia: /buildworker/worker/package_linux64/build/src/llvm-late-gc-lowering.cpp:1478: State LateLowerGCFrame::LocalScan(llvm::Function&): Assertion `false && "Expected alloca"' failed.

signal (6): Aborted
in expression starting at /home/pkgeval/.julia/packages/Matcha/Bx1mi/test/runtests.jl:70
gsignal at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
abort at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x7f3604364728)
__assert_fail at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
LocalScan at /buildworker/worker/package_linux64/build/src/llvm-late-gc-lowering.cpp:1478
runOnFunction at /buildworker/worker/package_linux64/build/src/llvm-late-gc-lowering.cpp:2661
_ZN4llvm13FPPassManager13runOnFunctionERNS_8FunctionE at /opt/julia/bin/../lib/julia/libLLVM-11jl.so (unknown line)

LLVM is normalizing our return_roots alloca from T, n to [T x n], which involves adding a GEP 0,0 before the user. Somehow this GEP gets sunk into a loop, and ends up generating a bunch of phi nodes, which all are the same GEP into the same original alloca.
From: code_llvm(Matcha.inner_matchat, (Vector{Int64}, Int64, Tuple{Greed{typeof(Matcha.alwaysmatch), UnitRange{Int64}}, Greed{Int64, UnitRange{Int64}}}, Matcha.History{Vector{Int64}, SubArray, Int64}))

Metadata

Metadata

Assignees

No one assigned

    Labels

    regressionRegression in behavior compared to a previous version

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions