feat(native): add stack overflow handling to advanced usage #13548
Merged: supervacuus merged 7 commits into master from feat/native/add_stack_overflow_handling_to_advance_usage on May 7, 2025
Commits:
4cb7b33 feat(native): add stack overflow handling to advanced usage (supervacuus)
3856045 Update index.mdx (supervacuus)
d259689 fix typo (supervacuus)
7cbe2eb clarify first paragraph (supervacuus)
ae2bfe7 colon-ize where its needed (supervacuus)
ca19dcc move below signal handling in sidebar (supervacuus)
2d26d4a clean up mechanism paragraph in signal handler (supervacuus)
130 changes: 130 additions & 0 deletions
docs/platforms/native/advanced-usage/stack-overflow-handling/index.mdx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,130 @@ | ||
--- | ||
title: Handling Stack Overflows | ||
description: "Learn about differences in reporting crashes from stack overflows across platforms, and how Sentry can help." | ||
sidebar_order: 1100 | ||
--- | ||
Application crashes due to stack overflow differ from other crashes from the handler's perspective because the handler | ||
relies on the resource that ran out: stack space. Since the handler typically runs on the thread whose stack overflowed, | ||
it can no longer use stack variables or call functions. As a result, the application crashes without ever sending a | ||
report. | ||
|
||
How to handle this differs from platform to platform, but the options boil down to: | ||
|
||
* allocating a stack that only the crash handler can use (Linux/POSIX and Windows) | ||
* running the handler in a separate thread (or process), which is notified of the crash asynchronously (macOS) | ||
|
||
Regardless of whether an application crashed due to a stack overflow, handlers should use the stack sparingly, | ||
because the amount of stack available to the handler can be limited even without an overflow. This is | ||
especially true for users of the `on_crash` or `before_send` hooks, over which Sentry has no control. | ||
|
||
On Linux (and other `POSIX` systems), users should preallocate everything before their hooks run and only move data into | ||
preallocated storage because heap allocations can also fail inside the signal handler (constructing `sentry_value_t` is | ||
okay because we use a safe allocator inside the signal handler). See also: | ||
[What to consider when writing on_crash hooks](https://docs.sentry.io/platforms/native/advanced-usage/signal-handling/#what-to-consider-when-writing-on_crash-hooks). | ||
|
||
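The preallocation advice above can be sketched as follows. This is a minimal illustration, not code from the Native SDK: `g_crash_note`, `CRASH_NOTE_CAPACITY` and `crash_note_store()` are hypothetical names, and the idea is only that a hook copies data into storage reserved at startup instead of allocating or using large stack buffers.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical storage, reserved statically so that nothing has to be
 * allocated once the process is already crashing. */
#define CRASH_NOTE_CAPACITY 256
static char g_crash_note[CRASH_NOTE_CAPACITY];
static size_t g_crash_note_len = 0;

/* Hypothetical helper intended to be called from a crash hook.
 * It copies at most CRASH_NOTE_CAPACITY - 1 bytes into the static buffer,
 * performs no heap allocation, and keeps only a few words on the stack.
 * Returns the number of bytes stored. */
static size_t crash_note_store(const char *src, size_t src_len)
{
    size_t n = src_len < CRASH_NOTE_CAPACITY - 1
        ? src_len
        : CRASH_NOTE_CAPACITY - 1;
    memcpy(g_crash_note, src, n);
    g_crash_note[n] = '\0';
    g_crash_note_len = n;
    return n;
}
```

A hook would call `crash_note_store()` with data it already holds; input longer than the buffer is truncated rather than triggering an allocation.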
## How do OSes differ, and how can Sentry help? | ||
|
||
### Windows | ||
|
||
The Windows API provides a [thread-stack guarantee interface](https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-setthreadstackguarantee) | ||
where users can give a size in bytes reserved for the handler to run in case of a crash. However, this size is subtracted | ||
from the thread stack reserve as it is a direct continuation inside the thread stack, not a separate allocation or | ||
memory region. | ||
|
||
The developer must therefore weigh the handler's guarantee against the thread stack reserve. The guarantee is taken out | ||
of the stack available during regular operation, so an oversized guarantee can itself cause a stack overflow. | ||
|
||
This should not be the case for most threads on Windows, which have a default stack reserve of 1MiB (whereas the | ||
required handler guarantee will be only tens of KiB). However, some threads created by specific runtimes or the kernel | ||
(for drivers) might have much smaller stack reserves, where a handler guarantee of 32KiB could already be half or all | ||
the stack available to the thread. | ||
|
||
In short, while Windows provides a very high-level request interface ("guarantee me x bytes for my handler"), it is not | ||
flexible regarding the location of the guaranteed handler stack. As such, you must consider the size of the guarantee in | ||
the context of the stack reserve and the actual stack use in a particular thread. The latter is hard to do for threads | ||
you do not control. | ||
|
||
Also, you must request the stack guarantee from within the thread it applies to. You cannot set a | ||
guarantee from the outside, which typically limits you to the threads you own. | ||
|
||
On Windows, the Native SDK automatically sets a stack guarantee of 64KiB for all threads that start after loading it as | ||
a shared library. For static library builds, we only automatically set the stack guarantee for the thread that calls | ||
`sentry_init()`. | ||
|
||
If you need to set stack guarantees manually, you can use the Win32 API directly. Once set, a guarantee cannot be | ||
decreased through the Win32 API; it can only be increased. We also provide `sentry_set_thread_stack_guarantee()` on top | ||
of the Win32 function, which includes helpful logging and prevents overriding a previously set stack guarantee. | ||
|
||
Auto-initialization is also defensive: it queries the stack reserve of each thread it runs on and only attempts | ||
to set a guarantee if the reserve is at least 10 times (by default) larger than the requested guarantee. | ||
|
||
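The defensive check described above amounts to a simple size comparison. The sketch below is an illustration only, not the SDK's implementation: `guarantee_fits_reserve()` is a hypothetical name, and how the reserve is obtained on Windows is deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true if `stack_reserve` is at least
 * `factor` times as large as `guarantee` (all sizes in bytes).
 * Dividing instead of multiplying avoids integer overflow for very
 * large inputs; for integers the two comparisons are equivalent. */
static bool guarantee_fits_reserve(size_t stack_reserve, size_t guarantee,
                                   size_t factor)
{
    if (guarantee == 0 || factor == 0) {
        return false; /* nothing sensible to request */
    }
    return stack_reserve / factor >= guarantee;
}
```

With the defaults mentioned in this section (a 64KiB guarantee and a factor of 10), a thread with a 1MiB reserve passes the check, while a thread with a 256KiB reserve does not.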
You can parameterize this behavior to suit your use case by: | ||
|
||
* changing the default handler stack size using the compile-time parameter `SENTRY_HANDLER_STACK_SIZE` | ||
* disabling auto-initialization altogether using the compile-time option `SENTRY_THREAD_STACK_GUARANTEE_AUTO_INIT` | ||
* tuning the relative allowance between the stack reserve and the handler guarantee using the compile-time parameter `SENTRY_THREAD_STACK_GUARANTEE_FACTOR` | ||
* enabling more detailed logging while tuning these parameters with `SENTRY_THREAD_STACK_GUARANTEE_VERBOSE_LOG` | ||
|
||
These parameters are documented in more detail in the section on compile-time options of | ||
[the Native SDK's README](https://github.com/getsentry/sentry-native?tab=readme-ov-file#compile-time-options). | ||
|
||
### Linux or OSes that primarily use POSIX signal handlers | ||
|
||
When you use POSIX signal handlers, you can specify a `sigaltstack`. This alternate signal stack allows the kernel to | ||
run the handler on a separate stack even if the stack of the crashed and preempted thread is exhausted. | ||
|
||
This relatively low-level interface allows users to specify an arbitrary memory range (on the heap, stack or any memory | ||
mapping a user can access). Letting the user determine the size _and_ location offers more flexibility | ||
than the Windows approach because the handler stack is independent of the stack usage and size of the crashed thread, | ||
and you can add safeguards such as protected regions around the handler stack. | ||
|
||
However, it also adds complexity because a badly placed or incorrectly set up memory region could lead to | ||
hard-to-identify bugs (consider a handler stack inside the heap, where a handler overflow caused by a stack-hungry | ||
`on_crash` implementation could lead to arbitrary heap corruption). | ||
|
||
As on Windows, you can only assign a `sigaltstack` from within the thread, meaning you can only set the handler region | ||
for threads you own. | ||
|
||
On Linux, `crashpad` and `breakpad` provide their own `sigaltstack` initialization, currently not influenced by Sentry: | ||
|
||
* `breakpad` typically allocates 16KiB, or `SIGSTKSZ` if it is bigger. | ||
* `crashpad` allocates `SIGSTKSZ` + your system's page size and aligns it to the page size (which will lead to 16KiB or | ||
32KiB on most systems). | ||
|
||
Both `breakpad` and `crashpad` will only specify a `sigaltstack` if none exists or the one defined is smaller than | ||
the target size. `breakpad` allocates the alternate stack on the heap. `crashpad` creates a separate memory mapping that | ||
includes a guard page. | ||
|
||
The `inproc` backend uses the handler stack size specified in [`SENTRY_HANDLER_STACK_SIZE`](https://github.com/getsentry/sentry-native?tab=readme-ov-file#compile-time-options) | ||
and only sets up a `sigaltstack` if none has been defined. Like `breakpad`, it allocates the handler stack on the heap. | ||
|
||
<Alert> | ||
All backends currently only set up the `sigaltstack` for the thread that initializes the Native SDK. For all other | ||
threads, you must set up a `sigaltstack` yourself, since Linux has no auto-initialization like the one on Windows. | ||
</Alert> | ||
|
||
### Android | ||
|
||
Android automatically configures every thread to use a `sigaltstack` size of 16KiB (on 32-bit systems) or 32KiB (on | ||
64-bit systems). The Android team recommends not overriding these because configuration inconsistencies with the signal | ||
stacks provided by Android can lead to crashes in regular runtime operation. The `inproc` backend of the Native SDK used | ||
in the Android integration will not define a `sigaltstack` on Linux/Android if one is already specified. Thus, only the | ||
default `sigaltstack` will be used on Android, and you can be sure one exists for each thread. | ||
|
||
### macOS, when using Mach exception port listeners | ||
|
||
The Mach exception port listener typically blocks in a separate thread until the kernel delivers a Mach exception. Since | ||
the listener thread is entirely independent of the thread that crashed, an exception caused by a stack overflow will | ||
never affect the available stack for the handler. This is even more true for `crashpad` on macOS, where the handler | ||
doesn't only run in a separate thread but in a separate process. | ||
|
||
<Alert> | ||
This means that when using `breakpad` or `crashpad` on macOS, handling a stack overflow requires no different setup | ||
or care than any other crash. | ||
</Alert> | ||
|
||
<Alert> | ||
Be aware that, in contrast to Mach exception port usage, signal handlers on macOS run on the same thread that caused the | ||
signal and thus also need a `sigaltstack` to handle any crash from a stack overflow. Only the `inproc` backend on macOS | ||
currently relies entirely on signal handlers, and its signal stack is set up equivalently to Linux or other POSIX platforms. | ||
</Alert> |
(nit) make the order consistent with the bullet points below