Closed
Description
We sporadically (<5%) get a test failure (example: https://buildkite.com/julialang/julia-master/builds/10097#eee6dda2-d1bf-4118-aa04-5056c642eb4f) for which the first stacktrace is
Error During Test at /cache/build/default-amdci4-0/julialang/julia-master/julia-0c9c484d19/share/julia/stdlib/v1.9/LinearAlgebra/test/matmul.jl:232
Got exception outside of a @test
ArgumentError: colons must be converted by to_indices(...)
Stacktrace:
[1] to_index(#unused#::Colon)
@ Base ./indices.jl:299
[2] to_index(A::Matrix{ComplexF32}, i::Function)
@ Base ./indices.jl:277
[3] to_indices
@ ./indices.jl:333 [inlined]
[4] to_indices
@ ./indices.jl:324 [inlined]
[5] view(::Matrix{ComplexF32}, ::Function, ::UnitRange{Int64})
@ Base ./subarray.jl:176
[6] macro expansion
@ /cache/build/default-amdci4-0/julialang/julia-master/julia-0c9c484d19/share/julia/stdlib/v1.9/LinearAlgebra/test/matmul.jl:237 [inlined]
[7] macro expansion
@ /cache/build/default-amdci4-0/julialang/julia-master/julia-0c9c484d19/share/julia/stdlib/v1.9/Test/src/Test.jl:1433 [inlined]
[8] macro expansion
@ /cache/build/default-amdci4-0/julialang/julia-master/julia-0c9c484d19/share/julia/stdlib/v1.9/LinearAlgebra/test/matmul.jl:232 [inlined]
[9] top-level scope
@ /cache/build/default-amdci4-0/julialang/julia-master/julia-0c9c484d19/share/julia/stdlib/v1.9/Test/src/Test.jl:1433 [inlined]
[10] top-level scope
@ /cache/build/default-amdci4-0/julialang/julia-master/julia-0c9c484d19/share/julia/stdlib/v1.9/LinearAlgebra/test/matmul.jl:0
[11] include
@ ./Base.jl:429 [inlined]
[12] macro expansion
@ /cache/build/default-amdci4-0/julialang/julia-master/julia-0c9c484d19/share/julia/test/testdefs.jl:24 [inlined]
[13] macro expansion
@ /cache/build/default-amdci4-0/julialang/julia-master/julia-0c9c484d19/share/julia/stdlib/v1.9/Test/src/Test.jl:1357 [inlined]
[14] macro expansion
@ /cache/build/default-amdci4-0/julialang/julia-master/julia-0c9c484d19/share/julia/test/testdefs.jl:23 [inlined]
[15] macro expansion
@ ./timing.jl:440 [inlined]
[16] runtests(name::String, path::String, isolate::Bool; seed::UInt128)
@ Main /cache/build/default-amdci4-0/julialang/julia-master/julia-0c9c484d19/share/julia/test/testdefs.jl:21
[17] invokelatest(::Any, ::Any, ::Vararg{Any}; kwargs::Base.Pairs{Symbol, UInt128, Tuple{Symbol}, NamedTuple{(:seed,), Tuple{UInt128}}})
@ Base ./essentials.jl:731
[18] (::Distributed.var"#106#108"{Distributed.CallMsg{:call_fetch}})()
@ Distributed /cache/build/default-amdci4-0/julialang/julia-master/julia-0c9c484d19/share/julia/stdlib/v1.9/Distributed/src/process_messages.jl:285
[19] run_work_thunk(thunk::Distributed.var"#106#108"{Distributed.CallMsg{:call_fetch}}, print_error::Bool)
@ Distributed /cache/build/default-amdci4-0/julialang/julia-master/julia-0c9c484d19/share/julia/stdlib/v1.9/Distributed/src/process_messages.jl:70
[20] macro expansion
@ /cache/build/default-amdci4-0/julialang/julia-master/julia-0c9c484d19/share/julia/stdlib/v1.9/Distributed/src/process_messages.jl:285 [inlined]
[21] (::Distributed.var"#105#107"{Distributed.CallMsg{:call_fetch}, Distributed.MsgHeader, Sockets.TCPSocket})()
@ Distributed ./task.jl:476
My first guess was that it depends on what other tests might have run on the same node, but in this case it appears to be the first test run on that node. I am therefore at a bit of a loss.
Activity
N5N3 commented on Mar 16, 2022
Another example. https://buildkite.com/julialang/julia-master/builds/9882#ca1d2e5a-d527-4c31-a909-b8aaac1b1c07
Keno commented on Mar 17, 2022
Looks like we have an rr trace there, so should be quite debuggable.
Keno commented on Mar 17, 2022
Just to clarify, what I'm seeing in the backtrace is:
Do we think that the 4->3 call is wrong because it should have dispatched to the one in multidimensional.jl instead?
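For reference, a minimal sketch of the dispatch being asked about, using only documented Base behavior. The array size and index values are made up for illustration and are not taken from the failing test. It shows that a Colon is normally rewritten by to_indices before to_index ever sees it, that calling to_index on a bare Colon gives the error from the stacktrace, and that Colon is a subtype of Function, which would explain why the frames print the index as ::Function.

```julia
# Illustrative array; the real test uses different data.
A = zeros(ComplexF32, 3, 4)

# Colon <: Function, so a frame that specialized on the abstract type
# can show the colon argument as `::Function`.
@assert Colon <: Function

# The usual path: to_indices turns the Colon into a Base.Slice over the axis.
inds = Base.to_indices(A, (Colon(), 1:2))
@assert inds[1] isa Base.Slice
@assert inds[2] == 1:2

# The fallback path: to_index on a bare Colon throws the reported error.
err = try
    Base.to_index(Colon())
    nothing
catch e
    e
end
@assert err isa ArgumentError

# With correct dispatch, the view from the test is constructed without error.
v = view(A, :, 1:2)
@assert size(v) == (3, 2)
```

If the failing run reached to_index(::Colon) through view, the Colon-specific to_indices method was presumably skipped, which is the question above.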
Keno commented on Mar 17, 2022
Just on a regular master build, that's the optimized IR for that call, which all seems correct, so I guess we just need to look at the rr trace and see what's different about it.
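One way to look at the optimized IR for a call like this locally is sketched below. The argument types are an assumption read off frame [5] of the stacktrace (with Colon substituted for ::Function), and the output will differ between Julia versions, so this is not the IR from the comment above.

```julia
# Assumed signature, taken from frame [5] of the stacktrace.
argtypes = Tuple{Matrix{ComplexF32}, Colon, UnitRange{Int}}

# Returns a vector of `CodeInfo => return type` pairs, one per matching method.
results = Base.code_typed(view, argtypes; optimize=true)

for (ci, rettype) in results
    println(ci)                      # the optimized IR
    println("returns: ", rettype)    # the inferred return type
end
```

On a healthy build the inferred return type should be a concrete SubArray, with no call into to_index(::Colon).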
Keno commented on Mar 17, 2022
Hmm, I would have expected the test failure to trigger the breakpoint here: https://github.com/JuliaLang/julia/blame/master/stdlib/Test/src/Test.jl#L655, but I don't see it in the trace.
Keno commented on Mar 17, 2022
Ah, that's because it was inside a @testset, but outside an @test. I'll add a similar call for that case. I'll see if I can find the failing test anyway, but it might be a bit annoying so we may just want to wait for the next failure after I make that change.
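To illustrate the distinction, here is a small sketch of an exception thrown inside a @testset body but outside any @test. The testset name and error message are hypothetical. The Test stdlib records such an exception as an error result ("Got exception outside of a @test") rather than going through the per-@test failure path, and a top-level testset then throws a TestSetException when it finishes.

```julia
using Test

# The testset prints its own error report; we only capture the final exception.
caught = try
    @testset "demo (hypothetical)" begin
        error("thrown outside of any @test")
    end
    nothing
catch e
    e
end

println(typeof(caught))
```

Because this path bypasses the code that handles an individual failing @test, a breakpoint placed only there would not fire for it.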
Keno commented on Mar 17, 2022
#44652
I'm a bit short on time, so unless somebody else wants to go digging, I say let's get that merged and wait for another rr trace that includes that change, so it'll be easier to find.
Also trigger breakpoint on testset error (#44652)
new() to reference the called object instead of re-creating it with apply_type #44664
Keno commented on Mar 24, 2022
We have this captured in https://buildkite.com/julialang/julia-master/builds/10284#c8607f85-b173-49f9-a0fe-c1d1586d6ccf julia-1 at
Keno commented on Mar 24, 2022
So the last generic call before the error compiles something. The arguments are:
I don't really know how args[1] ended up as Any. Seems like that should be Colon.
Keno commented on Mar 24, 2022
Hmm:
Keno commented on Mar 24, 2022
Ah, never mind. That's just a printing bug in jl_:
Add missing gc root in codegen (#44724)