Precompile deadlock #2057

Closed
timholy opened this issue Sep 30, 2020 · 16 comments

@timholy
Member

timholy commented Sep 30, 2020

While it's entirely possible I'm running a broken branch (I have #2049, plus JuliaLang/julia#37754), I'm getting deadlocks presumably due to the wonderful new parallel precompilation:

(@v1.6) pkg> precompile
PrecompilingPrecompilingPrecompilingPrecompilingPrecompiling project...
┌ Warning: Precompilation failed for indirect dependency ExprTools [e2ba6199-217a-4e67-a87a-7c52f15ade04]
└ @ Pkg.API ~/src/julia-master/usr/share/julia/stdlib/v1.6/Pkg/src/API.jl:987
 project...┌ Warning: Precompilation failed for indirect dependency Mocking [78c3b35d-d492-501b-9361-3d52fe80e533]
└ @ Pkg.API ~/src/julia-master/usr/share/julia/stdlib/v1.6/Pkg/src/API.jl:987

┌ Warning: Precompilation failed for indirect dependency ISVD [0de75c66-c0d4-11e8-3de9-d1632eed9267]
└ @ Pkg.API ~/src/julia-master/usr/share/julia/stdlib/v1.6/Pkg/src/API.jl:987
┌ Warning: Precompilation failed for indirect dependency Opus_jll [91d4177d-7536-5919-b921-800302f37372]
└ @ Pkg.API ~/src/julia-master/usr/share/julia/stdlib/v1.6/Pkg/src/API.jl:987
 project...┌ Warning: Precompilation failed for indirect dependency StableRNGs [860ef19b-820b-49d6-a774-d7a799459cd3]
└ @ Pkg.API ~/src/julia-master/usr/share/julia/stdlib/v1.6/Pkg/src/API.jl:987
┌ Warning: Precompilation failed for indirect dependency PCRE_jll [2f80f16e-611a-54ab-bc61-aa92de5b98fc]
└ @ Pkg.API ~/src/julia-master/usr/share/julia/stdlib/v1.6/Pkg/src/API.jl:987

┌ Warning: Precompilation failed for indirect dependency Xorg_libpthread_stubs_jll [14d82f49-176c-5ed1-bb49-ad3f5cbd8c74]
└ @ Pkg.API ~/src/julia-master/usr/share/julia/stdlib/v1.6/Pkg/src/API.jl:987
┌ Warning: Precompilation failed for indirect dependency capnproto_jll [3576fdfd-e245-5854-bcf7-dae6dc3117e0]
└ @ Pkg.API ~/src/julia-master/usr/share/julia/stdlib/v1.6/Pkg/src/API.jl:987
┌ Warning: Precompilation failed for indirect dependency ProgressLogging [33c8b6b6-d38a-422a-b730-caa89a2f386c]
└ @ Pkg.API ~/src/julia-master/usr/share/julia/stdlib/v1.6/Pkg/src/API.jl:987
 project...
 project...
┌ Warning: Precompilation failed for indirect dependency DeepDiffs [ab62b9b5-e342-54a8-a765-a90f495de1a6]
└ @ Pkg.API ~/src/julia-master/usr/share/julia/stdlib/v1.6/Pkg/src/API.jl:987

and then it just hangs. top tells me that julia isn't consuming any CPU, and when I hit Ctrl-C I get

^Cfatal: error thrown and no exception handler available.
InterruptException()
jl_mutex_unlock at /home/tim/src/julia-master/src/locks.h:139 [inlined]
jl_task_get_next at /home/tim/src/julia-master/src/partr.c:476
poptask at ./task.jl:737
wait at ./task.jl:745 [inlined]
task_done_hook at ./task.jl:475
_jl_invoke at /home/tim/src/julia-master/src/gf.c:2192 [inlined]
jl_apply_generic at /home/tim/src/julia-master/src/gf.c:2374
jl_apply at /home/tim/src/julia-master/src/julia.h:1687 [inlined]
jl_finish_task at /home/tim/src/julia-master/src/task.c:210
start_task at /home/tim/src/julia-master/src/task.c:781
unknown function (ip: (nil))

Any tips on how to debug this?

@IanButterworth
Member

IanButterworth commented Sep 30, 2020

Everything is failing to precompile because of a change to the compilecache args. That's fixed on Pkg master by #2048, so perhaps rebase #2049 for testing?

The deadlock is a little concerning, though. I'm not sure why that's happening, even in the case where every precompile task fails.

Also, the multiple "Precompiling project..." lines will be fixed by #2047.

@IanButterworth
Member

I tried adding an error() to master just before the compilecache call and I don't get a deadlock. Every task fails and the whole thing exits gracefully.

I guess one way a deadlock could occur is if the Manifest is broken: a package is listed as a dep of another package but is not itself listed in the manifest, so the other package waits indefinitely for that package to be processed. If this continues to happen, perhaps we should check for that state.
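
The broken-Manifest state hypothesized above can be illustrated with a small check. This is only a sketch: the `manifest` dictionary is a hypothetical, simplified stand-in (package name mapped to dependency names), not Pkg's actual data structure, and `missing_deps` and the package names are made up for illustration.

```julia
# Hypothetical, simplified manifest: package name => names of its direct dependencies.
# "PkgB" is referenced as a dependency but has no entry of its own.
manifest = Dict(
    "PkgA" => ["PkgB", "PkgC"],
    "PkgC" => String[],
)

# Return every `package => dependency` pair whose dependency has no manifest entry.
missing_deps(manifest) =
    [pkg => dep for (pkg, deps) in manifest for dep in deps if !haskey(manifest, dep)]

broken = missing_deps(manifest)
```

A scheduler that waits for every listed dependency to be processed would wait forever on any pair returned here, because nothing ever marks the missing package as done.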

@timholy
Member Author

timholy commented Sep 30, 2020

Works for me now; indeed, I had rebased #2049 from a point prior to #2048 being merged. Thanks!

@timholy timholy closed this as completed Sep 30, 2020
@timholy
Member Author

timholy commented Sep 30, 2020

Though if you want me to reopen for the "failure to exit gracefully" I can do that. I'm not sure what more information I can provide, other than saying that I didn't update my Manifest in between the failure and the success (just Julia itself).

@timholy
Member Author

timholy commented Sep 30, 2020

Oh, interesting...it works most of the way through and then...

[ Info: Precompiling GtkReactive [27996c0f-39cd-5cc1-a27a-05f136f946b6]
[ Info: Precompiling ReferenceTests [324d217c-45ce-50fc-942e-d289b448e8cf]
[ Info: Precompiling Flux [587475ba-b771-5e3f-ad9e-33799f191a9c]
[ Info: Precompiling ImageContrastAdjustment [f332f351-ec65-5f6a-b3d1-319c6670881a]
[ Info: Precompiling Images [916415d5-f1e6-5110-898d-aaa5f9f070e0]
[ Info: Precompiling ProfileView [c46f51b8-102a-5cf2-8d2c-8597cb0e0da7]
[ Info: Precompiling ImageFeatures [92ff4b2b-8094-53d3-b29d-97f740f06cef]
[ Info: Precompiling ImageView [86fae568-95e7-573e-a6b2-d8a6b900c9ef]
[ Info: Precompiling TileTrees [cd6f60fb-236b-5882-ad4f-250d5433a9a7]
Gtk-Message: 08:35:04.269: Failed to load module "canberra-gtk-module"
Gtk-Message: 08:35:04.269: Failed to load module "canberra-gtk-module"
[ Info: Precompiling ImageSegmentation [80713f31-8817-5129-9cf8-209ff8fb23e1]
[ Info: Precompiling DiffEqFlux [aae7a2af-3d4f-5e19-a356-7da93b79d9d0]
[ Info: Precompiling TiledFactorizations [23bcbbb2-c0d4-11e8-38f2-37d4ddb111c4]
Gtk-Message: 08:35:12.428: Failed to load module "canberra-gtk-module"
Gtk-Message: 08:35:12.429: Failed to load module "canberra-gtk-module"
[ Info: Precompiling MergePairwise [2b713a72-c0d4-11e8-0e3e-e7c72182117a]
ERROR: LoadError: AssertionError: precompile(Tuple{typeof(_mappedarray), Function, Base.ReinterpretArray{N0f8, 2, UInt8, Array{UInt8, 2}}})
Stacktrace:
 [1] _precompile_()
   @ ImageView ~/.julia/packages/ImageView/wTvyH/src/precompile.jl:155
 [2] top-level scope
   @ ~/.julia/packages/ImageView/wTvyH/src/ImageView.jl:705
 [3] include
   @ ./Base.jl:389 [inlined]
 [4] include_package_for_output(pkg::Base.PkgId, input::String, depot_path::Vector{String}, dl_load_path::Vector{String}, load_path::Vector{String}, concrete_deps::Vector{Pair{Base.PkgId, UInt64}}, source::Nothing)
   @ Base ./loading.jl:1169
 [5] top-level scope
   @ none:1
 [6] eval
   @ ./boot.jl:360 [inlined]
 [7] eval(x::Expr)
   @ Base.MainInclude ./client.jl:446
 [8] top-level scope
   @ none:1
in expression starting at /home/tim/.julia/packages/ImageView/wTvyH/src/ImageView.jl:1

That's from ImageView v0.10.9. At least once, it just deadlocked then, though I also saw it exit gracefully once. The ReinterpretArray failure must be due to merging JuliaLang/julia#37559.

I doubt it's relevant, but I also managed to trigger

(@v1.6) pkg> free AbstractPlotting GeometryBasics MeshIO
  Resolving package versions...
  Installed AbstractPlotting ─ v0.12.12
Updating ^[[A`~/.julia/environments/v1.6/Project.toml`
  [537997a7] ~ AbstractPlotting v0.12.10 `~/.julia/dev/AbstractPlotting`  v0.12.12
  [5c1252a2] ~ GeometryBasics v0.3.1 `~/.julia/dev/GeometryBasics`  v0.3.1
  [7269a6da] ~ MeshIO v0.4.1 `~/.julia/dev/MeshIO`  v0.4.1
Updating `~/.julia/environments/v1.6/Manifest.toml`
  [537997a7] ~ AbstractPlotting v0.12.10 `~/.julia/dev/AbstractPlotting`  v0.12.12
  [5c1252a2] ~ GeometryBasics v0.3.1 `~/.julia/dev/GeometryBasics`  v0.3.1
  [dbd62bd0] - MakieGallery v0.2.17
  [7269a6da] ~ MeshIO v0.4.1 `~/.julia/dev/MeshIO`  v0.4.1
  [860ef19b] - StableRNGs v0.1.2
  [a4af3ec5] - SyntaxTree v1.0.1
  [0dad84c5] ERROR: MethodError: no method matching isless(::VersionNumber, ::Pkg.Types.VersionSpec)
Closest candidates are:
  isless(::VersionNumber, ::VersionNumber) at version.jl:184
  isless(::Missing, ::Any) at missing.jl:87
  isless(::Any, ::Missing) at missing.jl:88
Stacktrace:
  [1] <(x::VersionNumber, y::Pkg.Types.VersionSpec)
    @ Base ./operators.jl:279
  [2] >(x::Pkg.Types.VersionSpec, y::VersionNumber)
    @ Base ./operators.jl:305
  [3] print_diff(ctx::Pkg.Types.Context, old::Pkg.Types.PackageSpec, new::Pkg.Types.PackageSpec)
    @ Pkg.Operations ~/src/julia-master/usr/share/julia/stdlib/v1.6/Pkg/src/Operations.jl:1673
  [4] print_status(ctx::Pkg.Types.Context, old_ctx::Pkg.Types.Context, header::Symbol, uuids::Vector{Base.UUID}, names::Vector{String}; manifest::Bool, diff::Bool)
    @ Pkg.Operations ~/src/julia-master/usr/share/julia/stdlib/v1.6/Pkg/src/Operations.jl:1744
  [5] status(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}; header::Symbol, mode::Pkg.Types.PackageMode, git_diff::Bool, env_diff::Pkg.Types.EnvCache)
    @ Pkg.Operations ~/src/julia-master/usr/share/julia/stdlib/v1.6/Pkg/src/Operations.jl:1810
  [6] show_update
    @ ~/src/julia-master/usr/share/julia/stdlib/v1.6/Pkg/src/Operations.jl:1774 [inlined]
  [7] free(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec})
    @ Pkg.Operations ~/src/julia-master/usr/share/julia/stdlib/v1.6/Pkg/src/Operations.jl:1337
  [8] free(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}; kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ Pkg.API ~/src/julia-master/usr/share/julia/stdlib/v1.6/Pkg/src/API.jl:309
  [9] free
    @ ~/src/julia-master/usr/share/julia/stdlib/v1.6/Pkg/src/API.jl:289 [inlined]
 [10] #free#54
    @ ~/src/julia-master/usr/share/julia/stdlib/v1.6/Pkg/src/API.jl:68 [inlined]
 [11] free(pkgs::Vector{Pkg.Types.PackageSpec})
    @ Pkg.API ~/src/julia-master/usr/share/julia/stdlib/v1.6/Pkg/src/API.jl:68
 [12] do_cmd!(command::Pkg.REPLMode.Command, repl::REPL.LineEditREPL)
    @ Pkg.REPLMode ~/src/julia-master/usr/share/julia/stdlib/v1.6/Pkg/src/REPLMode/REPLMode.jl:403
 [13] do_cmd(repl::REPL.LineEditREPL, input::String; do_rethrow::Bool)
    @ Pkg.REPLMode ~/src/julia-master/usr/share/julia/stdlib/v1.6/Pkg/src/REPLMode/REPLMode.jl:381
 [14] do_cmd
    @ ~/src/julia-master/usr/share/julia/stdlib/v1.6/Pkg/src/REPLMode/REPLMode.jl:376 [inlined]
 [15] (::Pkg.REPLMode.var"#24#27"{REPL.LineEditREPL, REPL.LineEdit.Prompt})(s::REPL.LineEdit.MIState, buf::IOBuffer, ok::Bool)
    @ Pkg.REPLMode ~/src/julia-master/usr/share/julia/stdlib/v1.6/Pkg/src/REPLMode/REPLMode.jl:545
 [16] #invokelatest#2
    @ ./essentials.jl:709 [inlined]
 [17] invokelatest
    @ ./essentials.jl:708 [inlined]
 [18] run_interface(terminal::REPL.Terminals.TextTerminal, m::REPL.LineEdit.ModalInterface, s::REPL.LineEdit.MIState)
    @ REPL.LineEdit ~/src/julia-master/usr/share/julia/stdlib/v1.6/REPL/src/LineEdit.jl:2435
 [19] run_frontend(repl::REPL.LineEditREPL, backend::REPL.REPLBackendRef)
    @ REPL ~/src/julia-master/usr/share/julia/stdlib/v1.6/REPL/src/REPL.jl:1124
 [20] (::REPL.var"#44#49"{REPL.LineEditREPL, REPL.REPLBackendRef})()
    @ REPL ./task.jl:392

so it seems there's more than one breakage in Pkg-land. Maybe if Pkg has an internal error it can trigger the deadlock?

@timholy timholy reopened this Sep 30, 2020
@KristofferC
Member

> I doubt it's relevant, but I also managed to trigger

That's #2051, I think.

@IanButterworth
Member

Also

ERROR: LoadError: AssertionError: precompile(Tuple{typeof(_mappedarray), Function, Base.ReinterpretArray{N0f8, 2, UInt8, Array{UInt8, 2}}})
Stacktrace:

That's Base.precompile, not Pkg.precompile, so that seems to be an issue with ImageView itself.

Could you try to load ImageView on that setup?

@timholy
Member Author

timholy commented Sep 30, 2020

Yes, it's an issue with ImageView itself. The key point is that the type-definition of ReinterpretArray changed and I put the precompile call in an @assert so I'd know if any signatures go stale. If I comment out those precompile directives then it works fine.
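
The pattern described above, wrapping a precompile directive in `@assert` so that a stale signature fails loudly, can be sketched as follows. The module `StaleDemo` and the function `half` are hypothetical stand-ins, not ImageView code; only `precompile` and `@assert` are real Base functionality.

```julia
module StaleDemo

# Hypothetical function standing in for a method a package wants precompiled.
half(x::Int) = div(x, 2)

function _precompile_()
    # `precompile` returns `true` when it finds and compiles a matching method,
    # so these assertions pass.
    @assert precompile(half, (Int,))
    # The same directive written as a signature type, as in the error message above.
    @assert precompile(Tuple{typeof(half), Int})
    return nothing
end

end # module

StaleDemo._precompile_()

# A signature that no longer matches any method makes `precompile` return `false`.
# Wrapped in `@assert`, that would raise an AssertionError like the one in the log.
stale_ok = precompile(StaleDemo.half, (String,))
```

The trade-off is the one seen in this thread: a signature that goes stale turns into a hard load failure rather than a silently skipped directive.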

@timholy
Member Author

timholy commented Sep 30, 2020

> Could you try to load ImageView on that setup?

Happy to try, but not sure what you mean.

@IanButterworth
Member

@timholy I think your edit clarified it. I thought you were implying that the ImageView error was a Pkg.precompile issue. I was just suggesting running `using ImageView` to check that the issue was unrelated. No need to do that.

I really want to figure out why the deadlock is happening.

Possible reasons:

  • Errors occurring in the try-catch. That doesn't seem to be the case, given:

    > I tried adding an error() to master just before the compilecache call and I don't get a deadlock. Every task fails and the whole thing exits gracefully.

  • The Manifest is broken: a package is listed as a dep of another package but has no entry of its own in the manifest.
    @KristofferC is this possible? Should we check for this?

Any other ideas?

@KristofferC
Member

I don't know but even so, it would be good to not have it deadlock. I think it should maybe be possible to write it in a way where it "deterministically" will always finish all tasks?
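
One way to get the "always finishes all tasks" property can be sketched with one `Base.Event` per package and a `finally` block that signals completion no matter what happens. This is a minimal illustration under stated assumptions, not Pkg's implementation: `depsmap`, `fake_compile`, and `process_all` are hypothetical names.

```julia
# Hypothetical dependency graph: package name => names of its direct dependencies.
depsmap = Dict(
    "A" => String[],
    "B" => ["A"],
    "C" => ["A", "B"],
)

# Hypothetical stand-in for the real compile step; "B" fails on purpose.
fake_compile(pkg) = pkg == "B" ? error("cannot compile $pkg") : nothing

function process_all(depsmap, compile)
    # One event per package, notified once that package has been processed.
    done = Dict(pkg => Base.Event() for pkg in keys(depsmap))
    failed = String[]
    @sync for (pkg, deps) in depsmap
        @async begin
            try
                # A dependency with no entry throws a KeyError here instead of
                # blocking forever, and is handled like any other failure.
                foreach(dep -> wait(done[dep]), deps)
                compile(pkg)
            catch
                push!(failed, pkg)
            finally
                # Always signal completion so that dependents can never hang.
                notify(done[pkg])
            end
        end
    end
    return sort!(failed)
end

failed = process_all(depsmap, fake_compile)
```

Because both the waiting and the compile step sit inside the `try`, every task reaches `notify`, so `@sync` returns even when a package fails or a dependency is missing from the graph. A true dependency cycle would still block in this sketch and needs separate handling.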

@IanButterworth
Member

This should do that: #2058

@IanButterworth
Member

@timholy If your setup still deadlocks, it would be great if you can test #2058

@IanButterworth
Member

We could add this inside the @sync loop to check for manifest issues:

@assert all(map(dep->dep in keys(depsmap), deps)) "$pkg has dependencies that are missing from the manifest of the current environment"

@IanButterworth
Member

IanButterworth commented Sep 30, 2020

That @assert is unnecessary, as any Pkg operation already seems to protect against that problem. Here, the ColorTypes entry was removed from the environment's manifest:

julia> Pkg.precompile()
ERROR: `ColorVectorSpace=c3611d14-8923-5661-9e6a-0046d554d3a4` depends on `ColorTypes`, but no such entry exists in the manifest.
Stacktrace:
  [1] pkgerror(::String, ::Vararg{String, N} where N)
    @ Pkg.Types ~/Documents/GitHub/Pkg.jl/src/Types.jl:52
  [2] normalize_deps(name::String, uuid::Base.UUID, deps::Vector{String}, manifest::Dict{String, Vector{Pkg.Types.Stage1}})
    @ Pkg.Types ~/Documents/GitHub/Pkg.jl/src/manifest.jl:95
  [3] validate_manifest(stage1::Dict{String, Vector{Pkg.Types.Stage1}})
    @ Pkg.Types ~/Documents/GitHub/Pkg.jl/src/manifest.jl:109
  [4] Dict{Base.UUID, Pkg.Types.PackageEntry}(raw::Dict{String, Any})
    @ Pkg.Types ~/Documents/GitHub/Pkg.jl/src/manifest.jl:158
  [5] read_manifest(f_or_io::String)
    @ Pkg.Types ~/Documents/GitHub/Pkg.jl/src/manifest.jl:170
  [6] Pkg.Types.EnvCache(env::Nothing)
    @ Pkg.Types ~/Documents/GitHub/Pkg.jl/src/Types.jl:294
  [7] EnvCache
    @ ~/Documents/GitHub/Pkg.jl/src/Types.jl:274 [inlined]
  [8] Pkg.Types.Context()
    @ Pkg.Types ./util.jl:442
  [9] #precompile#195
    @ ~/Documents/GitHub/Pkg.jl/src/API.jl:913 [inlined]
 [10] precompile
    @ ~/Documents/GitHub/Pkg.jl/src/API.jl:913 [inlined]

> I think it should maybe be possible to write it in a way where it "deterministically" will always finish all tasks?

On reflection, I can't see how this isn't true currently.

Maybe it's a circular dep issue...?

@IanButterworth
Member

Circular deps should be detected and precomp disabled for them now.
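
For illustration, circular dependencies can be detected with a simple graph walk like the one below. This is a sketch only and is not taken from Pkg: `in_cycle` and the example `depsmap` are hypothetical.

```julia
# Return true if `start` can reach itself by following dependency edges.
function in_cycle(depsmap, start)
    stack = copy(get(depsmap, start, String[]))  # copy so the graph is not mutated
    seen = Set{String}()
    while !isempty(stack)
        pkg = pop!(stack)
        pkg == start && return true
        pkg in seen && continue
        push!(seen, pkg)
        append!(stack, get(depsmap, pkg, String[]))
    end
    return false
end

# Hypothetical graph: "A" and "B" depend on each other; "C" merely depends on "A".
depsmap = Dict("A" => ["B"], "B" => ["A"], "C" => ["A"], "D" => String[])

circular = sort!([pkg for pkg in keys(depsmap) if in_cycle(depsmap, pkg)])
```

Packages flagged this way could then be excluded from the wait-on-dependencies scheme, since tasks that wait on each other would otherwise never be released.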

If anyone experiences a deadlock, please save and share your environment's Project and Manifest files for testing.
