@tstuefe tstuefe commented Sep 20, 2025

ASAN, when catching an error, will abort the process.

Two things control this:

  1. The compiler option -fsanitize-recover=address (resp. -fno-sanitize-recover=address). This controls whether, once ASAN returns from its error report, the compiler-generated ASAN stubs abort the process. The default is -fno-sanitize-recover=address, so we won't recover.
  2. The runtime option halt_on_error controls whether ASAN itself returns from its error handler or whether it aborts the process. This, by default, is set to 1, so by default ASAN aborts.

We "double abort" in the sense that two options are overlaid and both prevent the process from continuing.

I propose that we set, during build time for ASAN builds, the option -fsanitize-recover=address. Now, we can control whether to abort or not using the runtime setting halt_on_error=0. By default, we still will abort, since halt_on_error=1. So, the default behavior won't change. However, we can now at least decide to do it differently.
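
The interplay of the two settings can be modeled as a small decision function. This is only a sketch of the behavior as described in this PR, not ASAN code; the names Outcome and after_asan_report are hypothetical and exist purely for illustration.

```cpp
// Hypothetical model of the "double abort" described above. It does not
// call into ASAN; it only encodes the two settings as booleans.
enum class Outcome {
  AbortsInRuntime,  // halt_on_error=1: the ASAN runtime does not return from its error handler
  AbortsInStub,     // handler returned, but code built with -fno-sanitize-recover=address aborts
  Continues         // -fsanitize-recover=address and halt_on_error=0: the process keeps running
};

// built_with_recover: true if the code was compiled with -fsanitize-recover=address
// halt_on_error:      the value of the runtime option halt_on_error
constexpr Outcome after_asan_report(bool built_with_recover, bool halt_on_error) {
  return halt_on_error
             ? Outcome::AbortsInRuntime
             : (!built_with_recover ? Outcome::AbortsInStub : Outcome::Continues);
}

// Today's default (no recover, halt_on_error=1) aborts.
static_assert(after_asan_report(false, true) == Outcome::AbortsInRuntime, "default aborts");
// The proposed build flag alone changes nothing, since halt_on_error still defaults to 1.
static_assert(after_asan_report(true, true) == Outcome::AbortsInRuntime, "default unchanged");
// Only the proposed build flag plus halt_on_error=0 lets the process continue.
static_assert(after_asan_report(true, false) == Outcome::Continues, "opt-in continue");
```

In this model, the last case corresponds to running a build that has the proposed flag with ASAN_OPTIONS=halt_on_error=0 in the environment.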

What would that give us?

By aborting right away, ASAN denies the JVM the option to catch the error and write an hs-err file. Of course, not every error that ASAN catches will result in a segfault or an assertion. The JVM could lurch on for a bit before it stumbles. However, the chance that the JVM stops on its own very soon after a memory corruption happens is pretty good. Then we get an hs-err file and a crash dump in close correlation to the error ASAN caught.

And even if there is no close relationship between the original ASAN error and the eventual segfault/assertion (think ASAN sees a double free, the JVM continues, and after a while asserts somewhere else as a remote consequence of the error, so the stacks in the hs-err file won't be related to the original error), the hs-err file is chock-full of helpful information about running threads (see also JDK-8368124), memory mappings, JVM flags, etc. All of that would make it easier to understand the ASAN report.

And even if the JVM survives, one can still attach to the still living process and grab thread dumps, VM.info reports, heap dumps etc.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Integration blocker

 ⚠️ Title mismatch between PR and JBS for issue JDK-8368176

Issue

  • JDK-8368176: ASAN should optionally not stop the JVM (Enhancement - P4) ⚠️ Title mismatch between PR and JBS.

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/27404/head:pull/27404
$ git checkout pull/27404

Update a local copy of the PR:
$ git checkout pull/27404
$ git pull https://git.openjdk.org/jdk.git pull/27404/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 27404

View PR using the GUI difftool:
$ git pr show -t 27404

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/27404.diff


@bridgekeeper

bridgekeeper bot commented Sep 20, 2025

👋 Welcome back stuefe! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Sep 20, 2025

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk openjdk bot changed the title JDK-8368176: ASAN should not inhibit hs-err file generation 8368176: ASAN should not inhibit hs-err file generation Sep 20, 2025
@openjdk

openjdk bot commented Sep 20, 2025

@tstuefe The following label will be automatically applied to this pull request:

  • build

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@tstuefe tstuefe marked this pull request as ready for review September 20, 2025 07:42
@openjdk openjdk bot added the rfr Pull request is ready for review label Sep 20, 2025
@mlbridge

mlbridge bot commented Sep 20, 2025

Webrevs

ASAN_CFLAGS="$ASAN_CFLAGS -fsanitize-recover=address"
elif test "x$TOOLCHAIN_TYPE" = "xmicrosoft"; then
# -Oy- is equivalent to -fno-omit-frame-pointer in GCC/Clang.
ASAN_CFLAGS="-fsanitize=address -Oy- -DADDRESS_SANITIZER"

Is the -fsanitize-recover=address compiler option supported when TOOLCHAIN_TYPE == microsoft?

Looking at Visual Studio documentation and such, it looks like -fsanitize-recover isn't supported (yet).
https://developercommunity.visualstudio.com/t/add-fsanitize-recoveraddress-support-to-asan/1459414

@sendaoYan
Member

Hi, the JDK currently supports the address sanitizer and the leak sanitizer. This PR seems to concern only the address sanitizer; does the leak sanitizer also need this enhancement?

@kimbarrett

Hi, the JDK currently supports the address sanitizer and the leak sanitizer. This PR seems to concern only the address sanitizer; does the leak sanitizer also need this enhancement?

The GCC 15.2 docs list the sanitizers that support recovery, and "leak" is not in
that list. Sanitizers that support it enable recovery by default, except for
"address", which is noted as still being experimental (which is presumably why
it's not enabled by default).
https://gcc.gnu.org/onlinedocs/gcc-15.2.0/gcc/Instrumentation-Options.html#index-fsanitize-recover
Once "address" is no longer experimental we can presumably remove the explicit
recovery enabling.

But maybe we should consider this PR premature, since -fsanitize-recover=address is still experimental.

@kimbarrett

Shouldn't there be a configure check for the availability of the -fsanitize-recover=address option?

@tstuefe tstuefe changed the title 8368176: ASAN should not inhibit hs-err file generation 8368176: ASAN build should optionally not stop the JVM Sep 23, 2025
@afshin-zafari
Contributor

The logic behind ASAN is that once an error is detected, the program is in an unstable state. Letting it continue may produce more errors, most of them likely caused by the first one.
Instead of turning this on or off for the whole build or run of the program, we can exclude the places (i.e., functions) where we don't want ASAN reports (using ATTRIBUTE_NO_ASAN).

@tstuefe
Member Author

tstuefe commented Sep 23, 2025

I found a better and more reliable way to get hs-err files with ASAN: #27446.

So for me, the main motivation for this change is gone, and I wonder whether I should just close this PR.

Only, I think it is still worthwhile to have at least the option to continue running the JVM, mostly because the new alternative proposal, albeit a lot better than this one, relies on the ability to install ASAN callbacks, and not all ASAN versions may allow that.
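
For context, installing an ASAN callback could look roughly like the sketch below. Which hook #27446 actually uses is not stated in this thread; the sketch assumes __asan_set_error_report_callback from sanitizer/asan_interface.h as one example of such a hook. The names SKETCH_BUILT_WITH_ASAN, on_asan_report and install_asan_report_hook are hypothetical.

```cpp
#include <cstdio>

// Detect an ASAN build: GCC defines __SANITIZE_ADDRESS__, Clang exposes
// __has_feature(address_sanitizer). The macro name is hypothetical.
#if defined(__SANITIZE_ADDRESS__)
#  define SKETCH_BUILT_WITH_ASAN 1
#elif defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define SKETCH_BUILT_WITH_ASAN 1
#  endif
#endif
#ifndef SKETCH_BUILT_WITH_ASAN
#  define SKETCH_BUILT_WITH_ASAN 0
#endif

#if SKETCH_BUILT_WITH_ASAN
#include <sanitizer/asan_interface.h>
#endif

static int g_reports_seen = 0;

// Hypothetical handler: ASAN passes the text of its error report.
// A real handler would trigger error reporting in the host program here.
static void on_asan_report(const char* report) {
  ++g_reports_seen;
  std::fprintf(stderr, "ASAN report received (%s)\n",
               report != nullptr ? "with text" : "no text");
}

// Returns true if the hook was installed, false if this is not an ASAN build.
bool install_asan_report_hook() {
#if SKETCH_BUILT_WITH_ASAN
  __asan_set_error_report_callback(on_asan_report);
  return true;
#else
  (void)on_asan_report;  // silence the unused-function warning
  return false;
#endif
}
```

In a non-ASAN build the function compiles to a no-op that returns false, so the same source can be used for both build flavors.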

@afshin-zafari

The logic behind ASAN is that once an error is detected, the program is in an unstable state. Letting it continue may produce more errors, most of them likely caused by the first one.

I am aware of that, and count on that. Please see the motivation I gave in the description above.

Instead of turning this on or off for the whole build or run of the program, we can exclude the places (i.e., functions) where we don't want ASAN reports (using ATTRIBUTE_NO_ASAN).

That would not be very useful though: you would hide the error. I want to see the error. I just want the JVM to continue after that error report has been written to stderr.

@kimbarrett

But maybe we should consider this PR premature, since -fsanitize-recover=address is still experimental.

I am pretty sure it's experimental because there is no safe way in which the program could continue. So the feature itself is stable, but the target program would be unstable. I don't see what the disadvantage would be in allowing that, though. Whoever uses ASAN in a way like this must know what he is doing.

@magicus
Member

magicus commented Sep 23, 2025

I found a better and more reliable way to get hs-err files with ASAN: #27446.

So for me, the main motivation for this change is gone, and I wonder whether I should just close this PR.

Yes, I think you should. As you say, JBS-8368365 is a much better solution. This is just shaky ground; if we had no other options we might have considered it, but now there seems to be no good reason to.

@tstuefe
Member Author

tstuefe commented Sep 23, 2025

I found a better and more reliable way to get hs-err files with ASAN: #27446.
So for me, the main motivation for this change is gone, and I wonder whether I should just close this PR.

Yes, I think you should. As you say, JBS-8368365 is a much better solution. This is just shaky ground; if we had no other options we might have considered it, but now there seems to be no good reason to.

Okay then! I'll close this as won't fix.

@tstuefe tstuefe closed this Sep 23, 2025