Skip to content

Conversation

saucecontrol
Copy link
Member

@saucecontrol saucecontrol commented Mar 25, 2025

AVX-512 embedded masking relies on use of ConditionalSelect, which makes this pattern common:

static Vector512<int> SubtractSaturateUnsigned(Vector512<int> a, Vector512<int> b) =>
    Vector512.ConditionalSelect(Vector512.GreaterThan(a, b), a - b, Vector512<int>.Zero);
       vpcmpgtd k1, zmm0, zmm1
       vpsubd   zmm0 {k1}{z}, zmm0, zmm1

This change improves codegen when the same pattern is used with a compare that does not produce a kmask result:

static Vector256<int> SubtractSaturateUnsigned(Vector256<int> a, Vector256<int> b) =>
    Vector256.ConditionalSelect(Vector256.GreaterThan(a, b), a - b, Vector256<int>.Zero);
        vpcmpgtd ymm2, ymm0, ymm1
        vpsubd   ymm0, ymm0, ymm1
-       vxorps   ymm1, ymm1, ymm1
-       vpblendvb ymm0, ymm1, ymm0, ymm2
+       vpand    ymm0, ymm0, ymm2

And more when pblendvb is not available:

static Vector128<int> SubtractSaturateUnsigned(Vector128<int> a, Vector128<int> b) =>
    Vector128.ConditionalSelect(Vector128.GreaterThan(a, b), a - b, Vector128<int>.Zero);
        movaps   xmm2, xmm0
        pcmpgtd  xmm2, xmm1
        psubd    xmm0, xmm1
        pand     xmm0, xmm2
-       xorps    xmm1, xmm1
-       pandn    xmm2, xmm1
-       por      xmm0, xmm2

Diffs show a few interesting variations in tests.

@ghost ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Mar 25, 2025
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Mar 25, 2025
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@saucecontrol saucecontrol marked this pull request as ready for review March 25, 2025 02:47
@saucecontrol
Copy link
Member Author

cc @tannergooding

Copy link
Member

@tannergooding tannergooding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CC. @dotnet/jit-contrib, @EgorBo for secondary review

@EgorBo
Copy link
Member

EgorBo commented Mar 27, 2025

/azp run Fuzzlyn

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@saucecontrol
Copy link
Member Author

saucecontrol commented Mar 27, 2025

The x86 Fuzzlyn failure looks like another one from #112728. I'll put up a separate fix for that.

Arm failures are #113939

@saucecontrol
Copy link
Member Author

ping @tannergooding @EgorBo for merge

// If either of the value operands is const zero, we can optimize down to AND or AND_NOT.
GenTree* binOp = nullptr;

if (op3->IsVectorZero())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we have nodes of type ConvertMaskToVector(Vector.Zero) for op2 or op3. I just opened #114272 to fix a case where I was not checking that for arm64.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a way we'd create that currently on xarch, but I plan on reviving #110342 and will keep that possibility in mind since we'll more eagerly choose intrinsics that produce a mask with that.

Copy link
Contributor

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kunalspathak kunalspathak merged commit 2fe8937 into dotnet:main Apr 4, 2025
109 of 111 checks passed
@saucecontrol saucecontrol deleted the csel branch April 4, 2025 20:39
@github-actions github-actions bot locked and limited conversation to collaborators May 5, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants