stackwalk: fix heuristic termination #57801

Merged
merged 1 commit into from
Mar 17, 2025

vtjnash (Member) commented Mar 17, 2025

When collecting stack traces on non-X86 platforms, the first frame may not have been set up yet, which incorrectly triggers the bad-frame detection logic. This should fix async unwinding stopping after only 2 frames when the first frame happens to land in the function header. This is not normally an issue on X86 or outside of signal handlers, but applying the same logic there is not expected to cause problems either.

Fix #52334
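
To illustrate the failure mode described above, here is a minimal toy model in C. It is not the code from this PR or from Julia's runtime. `walk_frames`, the `sps` array, and the `lenient_first` flag are hypothetical names, and the sketch assumes the bad-frame heuristic is a check that the stack pointer moves up on every unwind step. On AArch64 a call does not move the stack pointer by itself, so when a signal lands in a function header the caller's SP can equal the current one, and a strict check would then reject a valid frame. The sketch also assumes the frame that trips the check is still recorded, which makes the strict variant stop at 2 frames as reported.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model, not Julia's runtime code. sps[i] is the stack pointer that a
 * hypothetical unwinder reports for frame i (frame 0 = interrupted code).
 * Returns the number of frames recorded before the walk stops. */
static size_t walk_frames(const uintptr_t *sps, size_t count, int lenient_first)
{
    assert(sps != NULL);
    size_t n = 0;          /* frames recorded so far */
    uintptr_t oldsp = 0;   /* no previous frame yet */
    int have_more = 1;
    while (have_more && n < count) {
        uintptr_t newsp = sps[n];
        /* The stack grows downwards, so each unwind step should move the
         * SP up. In lenient mode an unchanged SP is tolerated for the
         * first frames, since the prologue may not have run yet. */
        int bad = (lenient_first && n < 2) ? (oldsp > newsp)
                                           : (oldsp >= newsp);
        if (bad)
            have_more = 0; /* assumption: stop after recording this frame */
        n++;
        oldsp = newsp;
    }
    return n;
}
```

With the SP sequence `{0x1000, 0x1000, 0x1040, 0x1080, 0x10c0}`, where the first two entries are equal because the interrupted function has not yet adjusted the stack, `walk_frames` returns 2 in strict mode and 5 in lenient mode. A sequence whose SP actually goes backwards later in the walk is cut off at the same point in both modes.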

After (on arm64-apple-darwin24.3.0):

julia> f(1)
Warning: detected a stack overflow; program state may be corrupted, so further execution might be unreliable.
ERROR: StackOverflowError:
Stacktrace:
     [1] f(x::Int64)
       @ Main ./REPL[3]:1
     [2] g(x::Int64)
       @ Main ./REPL[4]:1
--- the above 2 lines are repeated 39990 more times ---
 [79983] f(x::Int64)
       @ Main ./REPL[3]:1

n.b. This does not fix, and is unrelated to, any issue where profiling syscalls on Apple AArch64 yields only a single stack frame. This fix is specific to the bug where unwinding gets exactly 2 frames.

@vtjnash vtjnash added the backport 1.10, backport 1.11, backport 1.12, and error handling labels Mar 17, 2025
@vtjnash vtjnash merged commit f82917a into master Mar 17, 2025
9 of 12 checks passed
@vtjnash vtjnash deleted the jn/52334 branch March 17, 2025 20:53
KristofferC pushed a commit that referenced this pull request Mar 20, 2025
(cherry picked from commit f82917a)
@KristofferC KristofferC removed the backport 1.12 label Mar 24, 2025
KristofferC pushed a commit that referenced this pull request Mar 31, 2025
(cherry picked from commit f82917a)
@KristofferC KristofferC mentioned this pull request Mar 31, 2025
71 tasks
KristofferC pushed a commit that referenced this pull request Mar 31, 2025
(cherry picked from commit f82917a)
KristofferC pushed a commit that referenced this pull request Mar 31, 2025
(cherry picked from commit f82917a)
@KristofferC KristofferC removed the backport 1.11 label Apr 10, 2025
@KristofferC KristofferC mentioned this pull request Jun 4, 2025
75 tasks
KristofferC pushed a commit that referenced this pull request Jun 5, 2025
(cherry picked from commit f82917a)
Labels
backport 1.10 Change should be backported to the 1.10 release error handling Handling of exceptions by Julia or the user
Development

Successfully merging this pull request may close these issues.

Regression in displaying of stacktraces with recursive functions. (on ARM?)
2 participants