Skip to content

[X86][AVX512] Convert _mm_reduce_* intrinsics to emit llvm.reduction intrinsics in IR #46850

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
RKSimon opened this issue Sep 12, 2020 · 7 comments
Labels
bugzilla Issues migrated from bugzilla clang:headers Headers provided by Clang, e.g. for intrinsics

Comments

@RKSimon
Copy link
Collaborator

RKSimon commented Sep 12, 2020

Bugzilla Link 47506
Resolution FIXED
Resolved on Feb 16, 2021 08:16
Version unspecified
OS Windows NT
Depends On #46849
CC @topperc,@phoebewang,@zygoloid
Fixed by commit(s) 6c23cbc,4855a1004d4d,61da20575d6c

Extended Description

The mm_reduce/mm_mask_reduce style AVX512 intrinsics currently expand to a reduction code sequence in the header.

Now that we are close to handling llvm.reduction intrinsics in IR, we should be able to emit these IR intrinsics directly instead of relying on a later pass to recognise the pattern.

@RKSimon
Copy link
Collaborator Author

RKSimon commented Sep 14, 2020

integer reductions - https://reviews.llvm.org/D87604

@RKSimon
Copy link
Collaborator Author

RKSimon commented Oct 13, 2020

integer reductions - https://reviews.llvm.org/D87604

rG6c23cbc5603c

@RKSimon
Copy link
Collaborator Author

RKSimon commented Oct 14, 2020

I'm not sure what to about the fadd/fmul intrinsics.

According to the intel intrinsics guide they expand as a sequence:

https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=reduce&expand=4562

float _mm512_reduce_add_ps (__m512 a)

dst[31:0] := 0.0
FOR j := 0 to 15
i := j*32
dst[31:0] := dst[31:0] + a[i+31:i]
ENDFOR

But clang/gcc/icc all expand this as a standard subvector split reduction, with or without fast-math:

https://godbolt.org/z/W9r8oW

If we are happy to just match clang's current behaviour then I think we can just repeat what I did for integer in D87604.

@phoebewang
Copy link
Contributor

I think we can change intrinsics guide to match the actual behavior.

@RKSimon
Copy link
Collaborator Author

RKSimon commented Nov 29, 2020

Great - I'll create a patch for the float reductions to match our current behaviour.

@RKSimon
Copy link
Collaborator Author

RKSimon commented Dec 13, 2020

fadd/fmul reductions - https://reviews.llvm.org/D92940

rG4855a1004d4d

@RKSimon
Copy link
Collaborator Author

RKSimon commented Feb 16, 2021

fmin/fmax reductions - https://reviews.llvm.org/D93179

rG61da20575d6c

Thanks to @​pengfei for completing the work on this (and @​spatel for some of the FMF reduction subtleties).

This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla clang:headers Headers provided by Clang, e.g. for intrinsics
Projects
None yet
Development

No branches or pull requests

2 participants