Conversation

tstuefe
Member

@tstuefe tstuefe commented Sep 23, 2025

When we run with ASAN enabled and ASAN catches an error, it reports, then stops the JVM. hs-err files and crash dumps at that point would be incredibly useful, though. The ASAN error report itself is seldom enlightening since it only contains native stacks.

After this patch, the JVM will always produce hs-err files when an ASAN report happens. It will only produce core files if ASAN_OPTIONS contains disable_coredump=0 and abort_on_error=1 and the JVM option CreateCoredumpOnCrash has not been disabled (and the limit for core file size is high enough, etc.; all the usual OS-level restrictions still apply).

This means that ASAN builds, by default, will continue to not generate cores, since the ASAN default options inhibit that. See the details in the comments below.
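
One of the OS-level restrictions mentioned above is the core file size limit. A minimal sketch of how a process could check it, assuming a POSIX system; `core_limit_allows_dump` is a hypothetical helper for illustration and not code from this patch:

```cpp
#include <sys/resource.h>

// Hypothetical helper (not part of the patch): returns true if the soft
// RLIMIT_CORE limit of the current process permits writing a core file.
// A soft limit of 0 suppresses core files entirely.
bool core_limit_allows_dump() {
  struct rlimit lim;
  if (getrlimit(RLIMIT_CORE, &lim) != 0) {
    return false;  // limit could not be queried; conservatively assume no core
  }
  return lim.rlim_cur != 0;
}
```

A non-zero limit is necessary but not sufficient: the core can still be truncated or fail to be written for other reasons.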


Tested on Fedora 42 and Debian 12, both manually and by running the new companion jtreg test.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8368365: ASAN errors should produce hs-err files and core dumps (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/27446/head:pull/27446
$ git checkout pull/27446

Update a local copy of the PR:
$ git checkout pull/27446
$ git pull https://git.openjdk.org/jdk.git pull/27446/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 27446

View PR using the GUI difftool:
$ git pr show -t 27446

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/27446.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Sep 23, 2025

👋 Welcome back stuefe! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Sep 23, 2025

@tstuefe This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8368365: ASAN errors should produce hs-err files and core dumps

Reviewed-by: mbaesken, asmehra

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 108 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot changed the title JDK-8368365: Let ASAN errors generate hs-err files and core dumps 8368365: Let ASAN errors generate hs-err files and core dumps Sep 23, 2025
@openjdk

openjdk bot commented Sep 23, 2025

@tstuefe The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@tstuefe tstuefe changed the title 8368365: Let ASAN errors generate hs-err files and core dumps 8368365: ASAN errors should produce hs-err files and core dumps Sep 23, 2025
@tstuefe tstuefe marked this pull request as ready for review September 23, 2025 09:21
@openjdk openjdk bot added the rfr Pull request is ready for review label Sep 23, 2025
@mlbridge

mlbridge bot commented Sep 23, 2025

@MBaesken
Member

MBaesken commented Sep 23, 2025

Is there some XX flag to enable the new behavior?
It would be quite horrible for us to always get a hserr+core when running into an ASAN issue.

@tstuefe
Member Author

tstuefe commented Sep 23, 2025

> Is there some XX flag to enable the new behavior? It would be quite horrible for us to always get a hserr+core when running into an ASAN issue.

Interesting. No, no flag, but I can easily add one.

But why would you not want to have an hs-err file? Core file generation depends, as usual, on CreateCoredumpOnCrash and the ulimit, of course.

@tstuefe
Member Author

tstuefe commented Sep 23, 2025

@MBaesken I am fine with adding some sort of option, but we need to figure out what the default behavior would be. It would be sad to disable this by default. I see developers using the ASAN build regularly, and typically they don't know which switches exist. Or they cannot set switches at all, since the command line cannot be modified.

Therefore, I'd like to understand better what the problem is. hs-err files are quite small, at least in comparison to typical cores. We can disable core files by default, while hs-err files would still be generated. Would that be a compromise?

@MBaesken
Member

> Therefore, I'd like to understand better what the problem is. hs-err files are quite small, at least in comparison to typical cores. We can disable core files by default, while hs-err files would still be generated. Would that be a compromise?

That sounds reasonable; we currently still have lots of ASAN reports/issues, so having (many) cores would be very bad for us.

@dean-long
Member

I haven't tried it yet, but what happens if we cause ASAN to abort by setting the flag in the environment:
export ASAN_OPTIONS=abort_on_error=1
Do we get a useful stack trace in the hs_err file?

Member

@dholmes-ora dholmes-ora left a comment

Seems reasonable to me but I am not an ASAN user so ...

@tstuefe
Member Author

tstuefe commented Sep 24, 2025

Hi all,

After feedback by @MBaesken and the question of @dean-long, I reworked the patch to make the JVM exhibit exactly the same behaviors with regard to core files that a standard ASAN-instrumented binary would show. There is a longer story behind that; see the lengthy comment in address.cpp.

The gist of it: ASAN, by default, inhibits core file generation. To get cores, one needs to set abort_on_error=1 and disable_coredump=0. The first version of my patch was more permissive and allowed core files despite these settings; that was bad (thanks @MBaesken for reminding me) since in mass integration tests done with ASAN, you don't want cores enabled. Note that cores can also get enormous with ASAN.

The new patch exhibits the same behavior as a standard ASAN-instrumented binary: core files are only generated if abort_on_error=1 and disable_coredump=0. By default, this prevents core files. This overrules +CreateCoredumpOnCrash in the JVM (by default enabled).

I think this is a good compromise. Anyone wanting to get cores with ASAN would use the standard ASAN options for this.
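
The decision rule described above can be sketched as follows. This is an illustration only, not the code from address.cpp: `flag_value` and `should_create_core` are hypothetical names, the parser is simplified (it splits on ':' only and accepts just a few boolean spellings), and the defaults (abort_on_error off, disable_coredump on) are the ones described in this comment.

```cpp
#include <string>

// Hypothetical, simplified lookup of a boolean flag in an ASAN_OPTIONS-style
// string such as "abort_on_error=1:disable_coredump=0". Splits on ':' only;
// "1", "true" and "yes" count as true, any other value as false.
// If a flag appears more than once, the last occurrence wins.
bool flag_value(const std::string& options, const std::string& name,
                bool default_value) {
  bool result = default_value;
  std::size_t pos = 0;
  while (pos <= options.size()) {
    std::size_t end = options.find(':', pos);
    if (end == std::string::npos) {
      end = options.size();
    }
    const std::string token = options.substr(pos, end - pos);
    const std::size_t eq = token.find('=');
    if (eq != std::string::npos && token.compare(0, eq, name) == 0) {
      const std::string value = token.substr(eq + 1);
      result = (value == "1" || value == "true" || value == "yes");
    }
    pos = end + 1;
  }
  return result;
}

// Cores are written only if ASAN is told to abort, ASAN does not disable
// core dumps, and the JVM-side CreateCoredumpOnCrash is still enabled.
bool should_create_core(const std::string& asan_options,
                        bool create_coredump_on_crash) {
  const bool abort_on_error = flag_value(asan_options, "abort_on_error", false);
  const bool disable_coredump = flag_value(asan_options, "disable_coredump", true);
  return abort_on_error && !disable_coredump && create_coredump_on_crash;
}
```

With an empty options string the function returns false, which matches the default of no cores in ASAN builds.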

@dean-long

> I haven't tried it yet, but what happens if we cause ASAN to abort by setting the flag in the environment:
> export ASAN_OPTIONS=abort_on_error=1. Do we get a useful stack trace in the hs_err file?

Yes, hs-err files are now generated in all cases. That is independent of core generation.

We now 1) print the ASAN report to stderr, then 2) dump an hs-err file, which also contains the ASAN report, then optionally 3) create a core dump.

@dean-long
Member

Actually, what I was trying to get at with my question was whether we need the callback if abort_on_error=1 is set. If abort_on_error=1 is set, it will call abort(), which should send a SIGABRT and then cause an hs_err file to get generated.

@tstuefe
Member Author

tstuefe commented Sep 25, 2025

> Actually, what I was trying to get at with my question was whether we need the callback if abort_on_error=1 is set. If abort_on_error=1 is set, it will call abort(), which should send a SIGABRT and then cause an hs_err file to get generated.

We don't catch SIGABRT. Starting to do so would be a more invasive change (and add some complexities, since we ourselves use abort(3) to generate cores). It would also interfere with user SIGABRT handlers, possibly requiring them to start using libjsig etc.

@ashu-mehra
Contributor

This looks good to me. I have a couple of questions/points:

  1. Is __asan_set_error_report_callback the documented way for applications to install a callback? I couldn't find information about this. It would be good to add a link to the doc, if there is one, as a comment in the code.
  2. Should there be a test for the abort_on_error=1:disable_coredump=0 case, where the JVM is expected to generate a core file?

@tstuefe
Member Author

tstuefe commented Sep 26, 2025

Thank you, @ashu-mehra .

> This looks good to me. I have a couple of questions/points:
>
>   1. Is __asan_set_error_report_callback the documented way for applications to install callback? I couldn't find information about this. It would be good to add a link to the doc, if there is one, as a comment in the code.

Documentation for ASAN is sparse in general, but the function is documented in the header file. I don't know if it was ever unsupported; I did the dlsym lookup out of an abundance of caution, in case older ASAN versions without that functionality are still around.
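
A minimal sketch of such a guarded lookup, assuming a platform where `dlsym` with `RTLD_DEFAULT` is available (on glibc this needs `_GNU_SOURCE`, which g++ defines by default). The callback signature follows the declaration in `<sanitizer/asan_interface.h>`; `install_asan_report_callback` is a hypothetical name, and this is not the code from the patch:

```cpp
#include <dlfcn.h>

// Callback type taken by __asan_set_error_report_callback: it receives the
// text of the ASAN report.
using asan_report_callback_t = void (*)(const char*);
using asan_set_callback_fn_t = void (*)(asan_report_callback_t);

// Hypothetical helper: installs the callback only if the running binary
// actually exports the ASAN function. Returns false otherwise, e.g. in a
// non-ASAN build or with an ASAN runtime lacking the function.
bool install_asan_report_callback(asan_report_callback_t callback) {
  void* sym = dlsym(RTLD_DEFAULT, "__asan_set_error_report_callback");
  if (sym == nullptr) {
    return false;
  }
  auto set_callback = reinterpret_cast<asan_set_callback_fn_t>(sym);
  set_callback(callback);
  return true;
}
```

Resolving the symbol at runtime avoids a hard link-time dependency on the function, at the cost of a cast from `void*` to a function pointer. On older glibc versions the program may need to be linked with `-ldl`.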

>   2. Should there be a test for abort_on_error=1:disable_coredump=0 case where the JVM is expected to generate a core file.

I thought about that too, but testing for core files is tricky. For one, there are many ways core file dumping could fail. Predicting the core file path is difficult (with systemd, you'd need to interpret the sysctl kernel.core_pattern value; we don't even manage to do that correctly in hotspot when displaying the core file dumping message). And with ASAN, core file size is a bit unpredictable.
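
To illustrate why the path is hard to predict: on Linux, a core pattern that starts with '|' means the kernel pipes the core to a user-space helper (systemd-coredump, for example) instead of writing a file at a path derived from the pattern. A minimal sketch of detecting just that case, with hypothetical helper names and none of the `%` specifier handling a real test would need:

```cpp
#include <fstream>
#include <string>

// True if the kernel hands cores to a user-space helper, in which case the
// final location of the core is decided by that helper, not by the pattern.
bool core_is_piped(const std::string& pattern) {
  return !pattern.empty() && pattern[0] == '|';
}

// Reads the current pattern on Linux. Returns an empty string if the file
// cannot be read (for example on other operating systems).
std::string read_core_pattern() {
  std::ifstream in("/proc/sys/kernel/core_pattern");
  std::string line;
  std::getline(in, line);
  return line;
}
```

Even in the non-piped case, a test would still have to expand the pattern's specifiers and account for the working directory, which is part of what makes such a test fragile.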

@openjdk

openjdk bot commented Sep 29, 2025

@tstuefe build, client, compiler, core-libs, graal, i18n, ide-support, javadoc, jmx, net, nio, security, serviceability, shenandoah have been added to this pull request based on files touched in new commit(s).

@tstuefe tstuefe force-pushed the JDK-8368365-ASAN-errors-should-generate-hs-err-files-and-core-dumps branch from 8412f5b to e250395 Compare September 29, 2025 08:34
@openjdk

openjdk bot commented Sep 29, 2025

@tstuefe Please do not rebase or force-push to an active PR as it invalidates existing review comments. Note for future reference, the bots always squash all changes into a single commit automatically as part of the integration. See OpenJDK Developers’ Guide for more information.

@tstuefe
Member Author

tstuefe commented Sep 29, 2025

/label remove build,client,compiler,core-libs,graal,i18n,ide-support,javadoc,jmx,net,nio,security,serviceability,shenandoah

Sorry for the spam and the force push. Something went wrong with my final merge from master.

@openjdk

openjdk bot commented Sep 29, 2025

@tstuefe
The build label was successfully removed.

The client label was successfully removed.

The compiler label was successfully removed.

The core-libs label was successfully removed.

The graal label was successfully removed.

The i18n label was successfully removed.

The ide-support label was successfully removed.

The javadoc label was successfully removed.

The jmx label was successfully removed.

The net label was successfully removed.

The nio label was successfully removed.

The security label was successfully removed.

The serviceability label was successfully removed.

The shenandoah label was successfully removed.

@tstuefe
Member Author

tstuefe commented Sep 30, 2025

Thanks all. I'll hold off on the push until I am back from vacation.

Labels
hotspot, ready (Pull request is ready to be integrated), rfr (Pull request is ready for review)