-
-
Notifications
You must be signed in to change notification settings - Fork 5.6k
Block thread from receiving profile signal with stackwalk lock #57089
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This is Jameson's proposed amendment to the changes made in 57035 that introduced a deadlock on FreeBSD, amusingly in service of fixing a deadlock on Linux. On Linux and (non-macOS) BSD, instead of acquiring and releasing a lock on the profiler in `jl_with_stackwalk_lock`, we're just blocking the thread from receiving the profiler's `SIGUSR2` signal at all. Should fix 57058. If so, closes 57059. Co-Authored-By: Jameson Nash <[email protected]>
vtjnash
approved these changes
Jan 18, 2025
Success! 🎉 |
Added the backport label, but should only be backported in conjunction with #57035. |
vtjnash
added a commit
that referenced
this pull request
Feb 3, 2025
Restores #57035, undo #57089 for non-FreeBSD. While I suggested doing this change for all platforms, I forgot that means non-FreeBSD platforms become vulnerable again to the very deadlock problems that #57035 was required to prevent. That fix seems to not be viable on FreeBSD due to known libc implementation problems on that platform. However, upon closer inspection of the questionable design implementation decisions they seem to have made here, the platform is likely not currently vulnerable to this libunwind bug in the first place: https://github.com/lattera/freebsd/blob/master/libexec/rtld-elf/rtld_lock.c#L120
vtjnash
added a commit
that referenced
this pull request
Feb 3, 2025
Restores #57035, undo #57089 for non-FreeBSD. While I suggested doing this change for all platforms, I forgot that means non-FreeBSD platforms become vulnerable again to the very deadlock problems that #57035 was required to prevent. That fix seems to not be viable on FreeBSD due to known libc implementation problems on that platform. However, upon closer inspection of the questionable design implementation decisions they seem to have made here, the platform is likely not currently vulnerable to this libunwind bug in the first place: https://github.com/lattera/freebsd/blob/master/libexec/rtld-elf/rtld_lock.c#L120
vtjnash
added a commit
that referenced
this pull request
Feb 4, 2025
Restores #57035, undo #57089 for non-FreeBSD. While I suggested doing this change for all platforms, I forgot that means non-FreeBSD platforms become vulnerable again to the very deadlock problems that #57035 was required to prevent. That fix seems to not be viable on FreeBSD due to known libc implementation problems on that platform. However, upon closer inspection of the questionable design implementation decisions they seem to have made here, the platform is likely not currently vulnerable to this libunwind bug in the first place: https://github.com/lattera/freebsd/blob/master/libexec/rtld-elf/rtld_lock.c#L120
KristofferC
pushed a commit
that referenced
this pull request
Feb 6, 2025
Restores #57035, undo #57089 for non-FreeBSD. While I suggested doing this change for all platforms, I forgot that means non-FreeBSD platforms become vulnerable again to the very deadlock problems that #57035 was required to prevent. That fix seems to not be viable on FreeBSD due to known libc implementation problems on that platform. However, upon closer inspection of the questionable design implementation decisions they seem to have made here, the platform is likely not currently vulnerable to this libunwind bug in the first place: https://github.com/lattera/freebsd/blob/master/libexec/rtld-elf/rtld_lock.c#L120 (cherry picked from commit 2f0a523)
KristofferC
pushed a commit
that referenced
this pull request
Feb 6, 2025
Restores #57035, undo #57089 for non-FreeBSD. While I suggested doing this change for all platforms, I forgot that means non-FreeBSD platforms become vulnerable again to the very deadlock problems that #57035 was required to prevent. That fix seems to not be viable on FreeBSD due to known libc implementation problems on that platform. However, upon closer inspection of the questionable design implementation decisions they seem to have made here, the platform is likely not currently vulnerable to this libunwind bug in the first place: https://github.com/lattera/freebsd/blob/master/libexec/rtld-elf/rtld_lock.c#L120 (cherry picked from commit 2f0a523)
71 tasks
65 tasks
DilumAluthge
added a commit
that referenced
this pull request
Sep 5, 2025
Backported PRs: - [x] #54840 <!-- Add boundscheck in speccache_eq to avoid OOB access due to data race --> - [x] #42080 <!-- recommend explicit `using Foo: Foo, ...` in package code (was: "using considered harmful") --> - [x] #58127 <!-- [DOC] Update installation docs: /downloads/ => /install/ --> - [x] #58202 <!-- [release-1.11] malloc: use jl_get_current_task to fix null check --> - [x] #58584 <!-- Make `Ptr` values static-show w/ type-information --> - [x] #58637 <!-- Make late gc lower handle insertelement of alloca use. --> - [x] #58837 <!-- fix null comparisons for non-standard address spaces --> - [x] #57826 <!-- Add a `similar` method for `Type{<:CodeUnits}` --> - [x] #58293 <!-- fix trailing indices stackoverflow in reinterpreted array --> - [x] #58887 <!-- Pkg: Allow configuring can_fancyprint(io::IO) using IOContext --> - [x] #58937 <!-- Fix nthreadpools size in JLOptions --> - [x] #58978 <!-- Fix precompilepkgs warn loaded setting --> - [x] #58998 <!-- Bugfix: Use Base.aligned_sizeof instead of sizeof in Mmap.mmap --> - [x] #59120 <!-- Fix memory order typo in "src/julia_atomics.h" --> - [x] #59170 <!-- Clarify and enhance confusing precompile test --> Need manual backport: - [ ] #56329 <!-- loading: clean up more concurrency issues --> - [ ] #56956 <!-- Add "mea culpa" to foreign module assignment error. --> - [ ] #57035 <!-- linux: workaround to avoid deadlock inside dl_iterate_phdr in glibc --> - [ ] #57089 <!-- Block thread from receiving profile signal with stackwalk lock --> - [ ] #57249 <!-- restore non-freebsd-unix fix for profiling --> - [ ] #58011 <!-- Remove try-finally scope from `@time_imports` `@trace_compile` `@trace_dispatch` --> - [ ] #58062 <!-- remove unnecessary edge from `exp_impl` to `pow` --> - [ ] #58157 <!-- add showing a string to REPL precompile workload --> - [ ] #58209 <!-- Specialize `one` for the `SizedArray` test helper --> - [ ] #58108 <!-- Base.get_extension & Dates.format made public --> - [ ] #58356 <!-- codegen: remove readonly from abstract type calling convention --> - [ ] #58415 <!-- [REPL] more reliable extension loading --> - [ ] #58510 <!-- Don't filter `Core` methods from newly-inferred list --> - [ ] #58110 <!-- relax dispatch for the `IteratorSize` method for `Generator` --> - [ ] #58965 <!-- Fix `hygienic-scope`s in inner macro expansions --> - [ ] #58971 <!-- Fix alignment of failed precompile jobs on CI --> - [ ] #59066 <!-- build: Also pass -fno-strict-aliasing for C++ --> Contains multiple commits, manual intervention needed: - [ ] #55877 <!-- fix FileWatching designs and add workaround for a stat bug on Apple --> - [ ] #56755 <!-- docs: fix scope type of a `struct` to hard --> - [ ] #57809 <!-- Fix fptrunc Float64 -> Float16 rounding through Float32 --> - [ ] #57398 <!-- Make remaining float intrinsics require float arguments --> - [ ] #56351 <!-- Fix `--project=@script` when outside script directory --> - [ ] #57129 <!-- clarify that time_ns is monotonic --> - [ ] #58134 <!-- Note annotated string API is experimental in Julia 1.11 in HISTORY.md --> - [ ] #58401 <!-- check that hashing of types does not foreigncall (`jl_type_hash` is concrete evaluated) --> - [ ] #58435 <!-- Fix layout flags for types that have oddly sized primitive type fields --> - [ ] #58483 <!-- Fix tbaa usage when storing into heap allocated immutable structs --> - [ ] #58512 <!-- Make more types jl_static_show readably --> - [ ] #58012 <!-- Re-enable tab completion of kwargs for large method tables --> - [ ] #58683 <!-- Add 0 predecessor to entry basic block and handle it in inlining --> - [ ] #59112 <!-- Add builtin function name to add methods error --> Non-merged PRs with backport label: - [ ] #59329 <!-- aotcompile: destroy LLVM context after serializing combined module --> - [ ] #58848 <!-- Set array size only when safe to do so --> - [ ] #58535 <!-- gf.c: include const-return methods in `--trace-compile` --> - [ ] #58038 <!-- strings/cstring: `transcode`: prevent Windows sysimage invalidation --> - [ ] #57604 <!-- `@nospecialize` for `string_index_err` --> - [ ] #57366 <!-- Use ptrdiff_t sized offsets for gvars_offsets to allow large sysimages --> - [ ] #56890 <!-- Enable getting non-boxed LLVM type from Julia Type --> - [ ] #56823 <!-- Make version of opaque closure constructor in world --> - [ ] #55958 <!-- also redirect JL_STDERR etc. when redirecting to devnull --> - [ ] #55956 <!-- Make threadcall gc safe --> - [ ] #55534 <!-- Set stdlib sources as read-only during installation --> - [ ] #55499 <!-- propagate the terminal's `displaysize` to the `IOContext` used by the REPL --> - [ ] #55458 <!-- Allow for generically extracting unannotated string --> - [ ] #55457 <!-- Make AnnotateChar equality consider annotations --> - [ ] #55220 <!-- `isfile_casesensitive` fixes on Windows --> - [ ] #53957 <!-- tweak how filtering is done for what packages should be precompiled --> - [ ] #51479 <!-- prevent code loading from lookin in the versioned environment when building Julia --> - [ ] #50813 <!-- More doctests for Sockets and capitalization fix --> - [ ] #50157 <!-- improve docs for `@inbounds` and `Base.@propagate_inbounds` --> --------- Co-authored-by: Kiran Pamnany <[email protected]> Co-authored-by: adienes <[email protected]> Co-authored-by: Gabriel Baraldi <[email protected]> Co-authored-by: Keno Fischer <[email protected]> Co-authored-by: Simeon David Schaub <[email protected]> Co-authored-by: Jameson Nash <[email protected]> Co-authored-by: Alex Arslan <[email protected]> Co-authored-by: Fons van der Plas <[email protected]> Co-authored-by: Ian Butterworth <[email protected]> Co-authored-by: JonasIsensee <[email protected]> Co-authored-by: Curtis Vogt <[email protected]> Co-authored-by: Dilum Aluthge <[email protected]> Co-authored-by: DilumAluthgeBot <[email protected]> Co-authored-by: DilumAluthge <[email protected]>
65 tasks
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This is Jameson's proposed amendment to the changes made in #57035 that introduced a deadlock on FreeBSD, amusingly in service of fixing a deadlock on Linux.
On Linux and (non-macOS) BSD, instead of acquiring and releasing a lock on the profiler in
jl_with_stackwalk_lock
, we're just blocking the thread from receiving the profiler'sSIGUSR2
signal at all.This should fix #57058; I don't get the deadlock locally on FreeBSD with this change, but it's AArch64 instead of x86-64 like on CI, so let's see if this also makes CI happy. If so, closes #57059.