rohitarulraj
Contributor

@rohitarulraj rohitarulraj commented Mar 14, 2025

In JDK-8309130, array sorting was optimized using AVX512 SIMD instructions on x86_64. That optimization is currently disabled on AMD Zen 4 [JDK-8317763] because of the poor performance of compressstoreu on that microarchitecture.
Ref: https://www.reddit.com/r/java/comments/171t5sj/heads_up_openjdk_implementation_of_avx512_based/.
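
For readers unfamiliar with the operation named above: compressstoreu refers to the AVX-512 masked compress-store intrinsics, which write only the lanes selected by a mask to memory, packed contiguously. The scalar model below is a sketch of those semantics for 16 lanes of 32-bit integers. The function names are hypothetical and exist only for illustration; nothing here is taken from HotSpot or the sort library, and the model says nothing about instruction timing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar model of a masked compress-store over 16 32-bit lanes:
// lanes whose mask bit is set are written contiguously to dst, in lane order.
// Returns the number of elements written.
// Hypothetical helper for illustration only.
inline std::size_t compress_store_model(std::int32_t* dst,
                                        std::uint16_t mask,
                                        const std::int32_t (&lanes)[16]) {
  assert(dst != nullptr);
  std::size_t written = 0;
  for (int lane = 0; lane < 16; ++lane) {
    if (mask & (1u << lane)) {
      dst[written++] = lanes[lane];
    }
  }
  return written;
}

// Builds the kind of mask a quicksort partition step might use:
// bit i is set when lanes[i] < pivot.
inline std::uint16_t less_than_mask(const std::int32_t (&lanes)[16],
                                    std::int32_t pivot) {
  std::uint16_t mask = 0;
  for (int lane = 0; lane < 16; ++lane) {
    if (lanes[lane] < pivot) {
      mask = static_cast<std::uint16_t>(mask | (1u << lane));
    }
  }
  return mask;
}
```

Calling less_than_mask and then compress_store_model on the same lanes packs every element smaller than the pivot to the front of dst, which is the step a vectorized partition performs once per vector.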

With this patch, Zen 4 picks the optimized AVX2 version of SIMD sort and Zen 5 picks the AVX512 version.

JTREG Tests: Completed Tier1 & Tier2 tests on Zen4 & Zen5 - No Regressions.

Attaching ArraySort performance data for Zen4 & Zen5.
Zen4-ArraySort-Data.txt
Zen5-ArraySort-Data.txt


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed (2 reviews required, with at least 2 Reviewers)

Issue

  • JDK-8317976: Optimize SIMD sort for AMD Zen 4 (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/24053/head:pull/24053
$ git checkout pull/24053

Update a local copy of the PR:
$ git checkout pull/24053
$ git pull https://git.openjdk.org/jdk.git pull/24053/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 24053

View PR using the GUI difftool:
$ git pr show -t 24053

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/24053.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Mar 14, 2025

👋 Welcome back rraj! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Mar 14, 2025

@rohitarulraj This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8317976: Optimize SIMD sort for AMD Zen 4

Reviewed-by: psandoz, vlivanov

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 318 new commits pushed to the master branch:

  • 3d2c3cd: 8352970: Remove unnecessary Windows version check in Win32ShellFolderManager2
  • c70ad6a: 8352906: stdout/err.encoding on Windows set by incorrect Win32 call
  • da3bb06: 8352685: Opensource JInternalFrame tests - series2
  • d809033: 8341775: Duplicate manifest files are removed by jarsigner after signing
  • a269bef: 8350459: MontgomeryIntegerPolynomialP256 multiply intrinsic with AVX2 on x86_64
  • c029220: 8352896: LambdaExpr02.java runs wrong test class
  • c0b61d3: 8352680: Opensource few misc swing tests
  • 3e9a7a4: 8353063: make/ide/vscode: Invalid Configuration Values
  • 8ef7832: 8350471: Unhandled compilation bailout in GraphKit::builtin_throw
  • ddf326b: 8346888: [ubsan] block.cpp:1617:30: runtime error: 9.97582e+36 is outside the range of representable values of type 'int'
  • ... and 308 more: https://git.openjdk.org/jdk/compare/b1a21b563e3ae13fa5c409a4f0c04686c3f5b34a...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@PaulSandoz, @iwanowww) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the rfr (Pull request is ready for review) label Mar 14, 2025
@openjdk

openjdk bot commented Mar 14, 2025

@rohitarulraj The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@rohitarulraj rohitarulraj changed the title from "JDK-8317976: Enable optimized SIMD sort for AMD Zen 4 & Zen 5" to "JDK-8317976: Optimize SIMD sort for AMD Zen 4" Mar 14, 2025
@openjdk openjdk bot changed the title from "JDK-8317976: Optimize SIMD sort for AMD Zen 4" to "8317976: Optimize SIMD sort for AMD Zen 4" Mar 14, 2025
@mlbridge

mlbridge bot commented Mar 14, 2025

Webrevs

Comment on lines 4329 to 4331
snprintf(ebuf_, sizeof(ebuf_),
((VM_Version::is_intel() || (VM_Version::is_amd() && (VM_Version::cpu_family() > 0x19)))
&& VM_Version::supports_avx512dq()) ? "avx512_sort" : "avx2_sort");
Member

Perhaps factor the expression to a separate method rather than repeat it three times?

Can we add some constant with a descriptive name for the CPU family rather than directly using 0x19?
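
A minimal sketch of what these two suggestions could look like together, outside of HotSpot. CPU_FAMILY_AMD_19H is the constant name that appears later in this review; the CpuInfo struct and both helper functions are hypothetical stand-ins for the VM_Version queries, not the code that was merged.

```cpp
#include <cassert>
#include <cstdint>

// AMD family 19h, which includes Zen 4. Zen 5 reports family 1Ah.
constexpr std::uint32_t CPU_FAMILY_AMD_19H = 0x19;

// Hypothetical stand-in for the VM_Version queries used in the expression.
struct CpuInfo {
  bool is_intel;
  bool is_amd;
  std::uint32_t family;
  bool avx512dq;
};

// The expression from the diff, written once instead of at each call site.
inline bool use_avx512_sort(const CpuInfo& cpu) {
  return (cpu.is_intel || (cpu.is_amd && cpu.family > CPU_FAMILY_AMD_19H))
         && cpu.avx512dq;
}

// Each call site then reduces to a single call.
inline const char* sort_variant(const CpuInfo& cpu) {
  return use_avx512_sort(cpu) ? "avx512_sort" : "avx2_sort";
}
```

With a helper like this, a later change to the vendor or family rule touches one function rather than three expressions.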

@PaulSandoz
Member

/reviewers 2 reviewer

@openjdk

openjdk bot commented Mar 17, 2025

@PaulSandoz
The total number of required reviews for this PR (including the jcheck configuration and the last /reviewers command) is now set to 2 (with at least 2 Reviewers).

Member

@PaulSandoz PaulSandoz left a comment

Looks good, thank you for updating. I am not a proper HotSpot reviewer, so I bumped up the number of required reviewers; a HotSpot developer needs to quickly review it.

@@ -771,6 +773,10 @@ class VM_Version : public Abstract_VM_Version {
//
static bool cpu_supports_evex() { return (_cpu_features & CPU_AVX512F) != 0; }

static bool supports_avx512_simd_sort() {
// Disable AVX512 version of SIMD Sort on AMD Zen4 Processors
Contributor

Looks like you forgot to remove the comment: // Disable AVX512 version of SIMD Sort on AMD Zen4 Processors

Contributor Author

For Zen4, we are disabling the AVX512 version of SIMD Sort and using the AVX2 version instead, so the comment is still valid.

Contributor

Got it!

@@ -4315,7 +4315,7 @@ void StubGenerator::generate_compiler_stubs() {

// Load x86_64_sort library on supported hardware to enable SIMD sort and partition intrinsics

- if (VM_Version::is_intel() && (VM_Version::supports_avx512dq() || VM_Version::supports_avx2())) {
+ if (VM_Version::supports_avx512dq() || VM_Version::supports_avx2()) {
Contributor

Shouldn't you check for VM_Version::supports_avx512_simd_sort() here as well?

Contributor Author

The above condition holds for all AMD processors that support AVX2 or AVX512. Only on Zen4, even though AVX512 is supported, do we want to pick the AVX2 version of SIMD sort (because of the regression), and that is handled by the code below:

snprintf(ebuf_, sizeof(ebuf_), VM_Version::supports_avx512_simd_sort() ? "avx512_sort" : "avx2_sort");
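
The two-stage decision described in this reply can be sketched as a standalone function. This is an illustration under assumptions, not the HotSpot code: CpuFeatures and select_sort_name are hypothetical names, and the three flags stand in for VM_Version::supports_avx2(), VM_Version::supports_avx512dq() and VM_Version::supports_avx512_simd_sort().

```cpp
#include <cassert>
#include <string>

// Hypothetical snapshot of the three VM_Version queries involved.
struct CpuFeatures {
  bool avx2;
  bool avx512dq;
  bool avx512_simd_sort;  // false on Zen 4 even though avx512dq is true
};

// Stage 1: decide whether the sort library is loaded at all.
// Stage 2: decide which name is formatted into the lookup buffer.
// Returns an empty string when the library would not be loaded.
inline std::string select_sort_name(const CpuFeatures& f) {
  if (!(f.avx512dq || f.avx2)) {
    return "";  // neither SIMD level is available, so no intrinsic
  }
  return f.avx512_simd_sort ? "avx512_sort" : "avx2_sort";
}
```

Keeping the load condition vendor-neutral and placing the Zen 4 exception only in the second stage is what lets Zen 4 still benefit from the AVX2 implementation.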

Contributor

Thanks for the clarification!

Contributor

Also, please update the PR description to summarize the main high-level changes in this PR. That will make it easier for others to follow.

Contributor Author

Thanks Vamsi, updated the PR description accordingly.

@rohitarulraj
Contributor Author

@PaulSandoz @vamsi-parasa : Can I integrate this patch?

@PaulSandoz
Member

No, before you can do that you need another review from a HotSpot reviewer.

@rohitarulraj
Contributor Author

@vamsi-parasa : Could you please review or provide feedback on this patch?

@PaulSandoz
Member

> @vamsi-parasa : Could you please review or provide feedback on this patch?

Srinivas does not currently have openjdk reviewer status. @iwanowww if you have a few moments can you help review?

Contributor

@iwanowww iwanowww left a comment

Overall, looks good.

@@ -771,6 +773,10 @@ class VM_Version : public Abstract_VM_Version {
//
static bool cpu_supports_evex() { return (_cpu_features & CPU_AVX512F) != 0; }

static bool supports_avx512_simd_sort() {
// Disable AVX512 version of SIMD Sort on AMD Zen4 Processors
return ((is_intel() || (is_amd() && (cpu_family() > CPU_FAMILY_AMD_19H))) && supports_avx512dq()); }
Contributor

It's quite hard to parse. The following looks clearer to me:

if (supports_avx512dq()) {
  // Disable AVX512 version of SIMD Sort on AMD Zen4 Processors.
  if (is_amd() && cpu_family() == CPU_FAMILY_AMD_19H) {
    return false;
  } 
  return true;
}
return false;

Contributor

I second the suggested refactoring. Need to make sure the original is_intel() check is also included appropriately in the logic :)
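
To make the concern concrete, here is a small sketch that puts the one-line expression and the suggested refactoring side by side as plain functions. The Vendor enum and Cpu struct are hypothetical stand-ins for the VM_Version queries, and the sketch makes no claim about which form was finally merged. The two agree for Intel, Zen 4 and Zen 5; they differ only for a CPU that is neither Intel nor AMD, or for an AMD family below 19h, when such a CPU reports AVX512DQ.

```cpp
#include <cassert>

enum class Vendor { Intel, Amd, Other };

constexpr unsigned CPU_FAMILY_AMD_19H = 0x19;

// Hypothetical stand-in for the VM_Version state.
struct Cpu {
  Vendor vendor;
  unsigned family;
  bool avx512dq;
};

// The one-line form from the diff: an allow-list of Intel and newer AMD.
inline bool original_predicate(const Cpu& c) {
  return (c.vendor == Vendor::Intel ||
          (c.vendor == Vendor::Amd && c.family > CPU_FAMILY_AMD_19H))
         && c.avx512dq;
}

// The suggested form: a deny-list that excludes only AMD family 19h.
inline bool suggested_predicate(const Cpu& c) {
  if (c.avx512dq) {
    if (c.vendor == Vendor::Amd && c.family == CPU_FAMILY_AMD_19H) {
      return false;
    }
    return true;
  }
  return false;
}
```

Whether the wider deny-list behavior is acceptable depends on whether the AVX512 sort should be enabled on vendors other than Intel and AMD, which is the question the is_intel() remark raises.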

Contributor Author

> It's quite hard to parse. The following looks clearer to me:
>
> if (supports_avx512dq()) {
>   // Disable AVX512 version of SIMD Sort on AMD Zen4 Processors.
>   if (is_amd() && cpu_family() == CPU_FAMILY_AMD_19H) {
>     return false;
>   }
>   return true;
> }
> return false;

Done.

Contributor

@iwanowww iwanowww left a comment

Looks good.

@openjdk openjdk bot added the ready (Pull request is ready to be integrated) label Mar 28, 2025
@rohitarulraj
Contributor Author

/integrate

@openjdk openjdk bot added the sponsor (Pull request is ready to be sponsored) label Mar 29, 2025
@openjdk

openjdk bot commented Mar 29, 2025

@rohitarulraj
Your change (at version b369de6) is now ready to be sponsored by a Committer.

@sendaoYan
Member

/sponsor

@openjdk

openjdk bot commented Mar 30, 2025

Going to push as commit 8cbadf7.
Since your change was applied there have been 320 commits pushed to the master branch:

  • b9d7a75: 8352879: TestPeriod.java and TestGetContentType.java run wrong test class
  • 895aabc: 8351233: [ASAN] avx2-emu-funcs.hpp:151:20: error: ‘D.82188’ is used uninitialized
  • 3d2c3cd: 8352970: Remove unnecessary Windows version check in Win32ShellFolderManager2
  • c70ad6a: 8352906: stdout/err.encoding on Windows set by incorrect Win32 call
  • da3bb06: 8352685: Opensource JInternalFrame tests - series2
  • d809033: 8341775: Duplicate manifest files are removed by jarsigner after signing
  • a269bef: 8350459: MontgomeryIntegerPolynomialP256 multiply intrinsic with AVX2 on x86_64
  • c029220: 8352896: LambdaExpr02.java runs wrong test class
  • c0b61d3: 8352680: Opensource few misc swing tests
  • 3e9a7a4: 8353063: make/ide/vscode: Invalid Configuration Values
  • ... and 310 more: https://git.openjdk.org/jdk/compare/b1a21b563e3ae13fa5c409a4f0c04686c3f5b34a...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated (Pull request has been integrated) label Mar 30, 2025
@openjdk openjdk bot closed this Mar 30, 2025
@openjdk openjdk bot removed the ready (Pull request is ready to be integrated), rfr (Pull request is ready for review) and sponsor (Pull request is ready to be sponsored) labels Mar 30, 2025
@openjdk

openjdk bot commented Mar 30, 2025

@sendaoYan @rohitarulraj Pushed as commit 8cbadf7.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.
