Skip to content

AMDGPU: Stop handling legacy amdgpu-unsafe-fp-atomics attribute #101699

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

arsenm
Copy link
Contributor

@arsenm arsenm commented Aug 2, 2024

This is now autoupgraded to annotate atomicrmw instructions in
old bitcode.

Copy link
Contributor Author

arsenm commented Aug 2, 2024

This stack of pull requests is managed by Graphite. Learn more about stacking.

Join @arsenm and the rest of your teammates on Graphite Graphite

@arsenm arsenm added clang Clang issues not falling into any other category backend:AMDGPU clang:codegen IR generation bugs: mangling, exceptions, etc. llvm:globalisel llvm:ir llvm:transforms labels Aug 2, 2024 — with Graphite App
@arsenm arsenm marked this pull request as ready for review August 2, 2024 15:55
@llvmbot
Copy link
Member

llvmbot commented Aug 2, 2024

@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-llvm-globalisel
@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-llvm-ir
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-backend-amdgpu

Author: Matt Arsenault (arsenm)

Changes

This is now autoupgraded to annotate atomicrmw instructions in
old bitcode.


Patch is 45.73 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/101699.diff

4 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/SIISelLowering.cpp (+1-8)
  • (modified) llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-mmra.ll (+3-4)
  • (modified) llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-rmw-fadd.ll (+84-85)
  • (removed) llvm/test/Transforms/Inline/AMDGPU/inline-amdgpu-unsafe-fp-atomics.ll (-99)
diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index 0c36cab7044f3..cbe7a38f25eb4 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -16123,14 +16123,7 @@ static bool globalMemoryFPAtomicIsLegal(const GCNSubtarget &Subtarget,
   } else if (Subtarget.supportsAgentScopeFineGrainedRemoteMemoryAtomics())
     return true;
 
-  if (RMW->hasMetadata("amdgpu.no.fine.grained.memory"))
-    return true;
-
-  // TODO: Auto-upgrade this attribute to the metadata in function body and stop
-  // checking it.
-  return RMW->getFunction()
-      ->getFnAttribute("amdgpu-unsafe-fp-atomics")
-      .getValueAsBool();
+  return RMW->hasMetadata("amdgpu.no.fine.grained.memory");
 }
 
 TargetLowering::AtomicExpansionKind
diff --git a/llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-mmra.ll b/llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-mmra.ll
index d51e9291a6119..23a56c285c97d 100644
--- a/llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-mmra.ll
+++ b/llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-mmra.ll
@@ -124,7 +124,7 @@ define i16 @test_cmpxchg_i16_global_agent_align4(ptr addrspace(1) %out, i16 %in,
   ret i16 %extract
 }
 
-define void @syncscope_workgroup_nortn(ptr %addr, float %val) #0 {
+define void @syncscope_workgroup_nortn(ptr %addr, float %val) {
 ; GFX90A-LABEL: define void @syncscope_workgroup_nortn(
 ; GFX90A-SAME: ptr [[ADDR:%.*]], float [[VAL:%.*]]) #[[ATTR1:[0-9]+]] {
 ; GFX90A-NEXT:    br label [[ATOMICRMW_CHECK_SHARED:%.*]]
@@ -159,7 +159,7 @@ define void @syncscope_workgroup_nortn(ptr %addr, float %val) #0 {
 ; GFX1100-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr [[ADDR]], float [[VAL]] syncscope("workgroup") seq_cst, align 4, !mmra [[META0]]
 ; GFX1100-NEXT:    ret void
 ;
-  %res = atomicrmw fadd ptr %addr, float %val syncscope("workgroup") seq_cst, !mmra !2
+  %res = atomicrmw fadd ptr %addr, float %val syncscope("workgroup") seq_cst, !mmra !2, !amdgpu.no.fine.grained.memory !3, !amdgpu.ignore.denormal.mode !3
   ret void
 }
 
@@ -188,11 +188,10 @@ define i32 @atomic_load_global_align1(ptr addrspace(1) %ptr) {
   ret i32 %val
 }
 
-attributes #0 = { "amdgpu-unsafe-fp-atomics"="true" }
-
 !0 = !{!"foo", !"bar"}
 !1 = !{!"bux", !"baz"}
 !2 = !{!0, !1}
+!3 = !{}
 ;.
 ; GFX90A: [[META0]] = !{[[META1:![0-9]+]], [[META2:![0-9]+]]}
 ; GFX90A: [[META1]] = !{!"foo", !"bar"}
diff --git a/llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-rmw-fadd.ll b/llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-rmw-fadd.ll
index def9522077004..473b948f46888 100644
--- a/llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-rmw-fadd.ll
+++ b/llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-rmw-fadd.ll
@@ -273,7 +273,7 @@ define void @test_atomicrmw_fadd_f32_as999_no_use_unsafe(ptr addrspace(999) %ptr
   ret void
 }
 
-define float @test_atomicrmw_fadd_f32_global_unsafe(ptr addrspace(1) %ptr, float %value) #0 {
+define float @test_atomicrmw_fadd_f32_global_unsafe(ptr addrspace(1) %ptr, float %value) #3 {
 ; CI-LABEL: @test_atomicrmw_fadd_f32_global_unsafe(
 ; CI-NEXT:    [[TMP1:%.*]] = load float, ptr addrspace(1) [[PTR:%.*]], align 4
 ; CI-NEXT:    br label [[ATOMICRMW_START:%.*]]
@@ -323,22 +323,22 @@ define float @test_atomicrmw_fadd_f32_global_unsafe(ptr addrspace(1) %ptr, float
 ; GFX908-NEXT:    ret float [[TMP5]]
 ;
 ; GFX90A-LABEL: @test_atomicrmw_fadd_f32_global_unsafe(
-; GFX90A-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr addrspace(1) [[PTR:%.*]], float [[VALUE:%.*]] syncscope("wavefront") monotonic, align 4
+; GFX90A-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr addrspace(1) [[PTR:%.*]], float [[VALUE:%.*]] syncscope("wavefront") monotonic, align 4, !amdgpu.no.fine.grained.memory [[META0:![0-9]+]]
 ; GFX90A-NEXT:    ret float [[RES]]
 ;
 ; GFX940-LABEL: @test_atomicrmw_fadd_f32_global_unsafe(
-; GFX940-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr addrspace(1) [[PTR:%.*]], float [[VALUE:%.*]] syncscope("wavefront") monotonic, align 4
+; GFX940-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr addrspace(1) [[PTR:%.*]], float [[VALUE:%.*]] syncscope("wavefront") monotonic, align 4, !amdgpu.no.fine.grained.memory [[META0:![0-9]+]]
 ; GFX940-NEXT:    ret float [[RES]]
 ;
 ; GFX11-LABEL: @test_atomicrmw_fadd_f32_global_unsafe(
-; GFX11-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr addrspace(1) [[PTR:%.*]], float [[VALUE:%.*]] syncscope("wavefront") monotonic, align 4
+; GFX11-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr addrspace(1) [[PTR:%.*]], float [[VALUE:%.*]] syncscope("wavefront") monotonic, align 4, !amdgpu.no.fine.grained.memory [[META0:![0-9]+]]
 ; GFX11-NEXT:    ret float [[RES]]
 ;
-  %res = atomicrmw fadd ptr addrspace(1) %ptr, float %value syncscope("wavefront") monotonic
+  %res = atomicrmw fadd ptr addrspace(1) %ptr, float %value syncscope("wavefront") monotonic, !amdgpu.no.fine.grained.memory !0
   ret float %res
 }
 
-define float @test_atomicrmw_fadd_f32_buffer_fat_ptr_unsafe(ptr addrspace(7) %ptr, float %value) #0 {
+define float @test_atomicrmw_fadd_f32_buffer_fat_ptr_unsafe(ptr addrspace(7) %ptr, float %value) #3 {
 ; CI-LABEL: @test_atomicrmw_fadd_f32_buffer_fat_ptr_unsafe(
 ; CI-NEXT:    [[TMP1:%.*]] = load float, ptr addrspace(7) [[PTR:%.*]], align 4
 ; CI-NEXT:    br label [[ATOMICRMW_START:%.*]]
@@ -388,22 +388,22 @@ define float @test_atomicrmw_fadd_f32_buffer_fat_ptr_unsafe(ptr addrspace(7) %pt
 ; GFX908-NEXT:    ret float [[TMP5]]
 ;
 ; GFX90A-LABEL: @test_atomicrmw_fadd_f32_buffer_fat_ptr_unsafe(
-; GFX90A-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr addrspace(7) [[PTR:%.*]], float [[VALUE:%.*]] syncscope("wavefront") monotonic, align 4
+; GFX90A-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr addrspace(7) [[PTR:%.*]], float [[VALUE:%.*]] syncscope("wavefront") monotonic, align 4, !amdgpu.no.fine.grained.memory [[META0]]
 ; GFX90A-NEXT:    ret float [[RES]]
 ;
 ; GFX940-LABEL: @test_atomicrmw_fadd_f32_buffer_fat_ptr_unsafe(
-; GFX940-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr addrspace(7) [[PTR:%.*]], float [[VALUE:%.*]] syncscope("wavefront") monotonic, align 4
+; GFX940-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr addrspace(7) [[PTR:%.*]], float [[VALUE:%.*]] syncscope("wavefront") monotonic, align 4, !amdgpu.no.fine.grained.memory [[META0]]
 ; GFX940-NEXT:    ret float [[RES]]
 ;
 ; GFX11-LABEL: @test_atomicrmw_fadd_f32_buffer_fat_ptr_unsafe(
-; GFX11-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr addrspace(7) [[PTR:%.*]], float [[VALUE:%.*]] syncscope("wavefront") monotonic, align 4
+; GFX11-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr addrspace(7) [[PTR:%.*]], float [[VALUE:%.*]] syncscope("wavefront") monotonic, align 4, !amdgpu.no.fine.grained.memory [[META0]]
 ; GFX11-NEXT:    ret float [[RES]]
 ;
-  %res = atomicrmw fadd ptr addrspace(7) %ptr, float %value syncscope("wavefront") monotonic
+  %res = atomicrmw fadd ptr addrspace(7) %ptr, float %value syncscope("wavefront") monotonic, !amdgpu.no.fine.grained.memory !0
   ret float %res
 }
 
-define float @test_atomicrmw_fadd_f32_as999_unsafe(ptr addrspace(999) %ptr, float %value) #0 {
+define float @test_atomicrmw_fadd_f32_as999_unsafe(ptr addrspace(999) %ptr, float %value) #3 {
 ; CI-LABEL: @test_atomicrmw_fadd_f32_as999_unsafe(
 ; CI-NEXT:    [[TMP1:%.*]] = load float, ptr addrspace(999) [[PTR:%.*]], align 4
 ; CI-NEXT:    br label [[ATOMICRMW_START:%.*]]
@@ -453,22 +453,22 @@ define float @test_atomicrmw_fadd_f32_as999_unsafe(ptr addrspace(999) %ptr, floa
 ; GFX908-NEXT:    ret float [[TMP5]]
 ;
 ; GFX90A-LABEL: @test_atomicrmw_fadd_f32_as999_unsafe(
-; GFX90A-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr addrspace(999) [[PTR:%.*]], float [[VALUE:%.*]] syncscope("wavefront") monotonic, align 4
+; GFX90A-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr addrspace(999) [[PTR:%.*]], float [[VALUE:%.*]] syncscope("wavefront") monotonic, align 4, !amdgpu.no.fine.grained.memory [[META0]]
 ; GFX90A-NEXT:    ret float [[RES]]
 ;
 ; GFX940-LABEL: @test_atomicrmw_fadd_f32_as999_unsafe(
-; GFX940-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr addrspace(999) [[PTR:%.*]], float [[VALUE:%.*]] syncscope("wavefront") monotonic, align 4
+; GFX940-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr addrspace(999) [[PTR:%.*]], float [[VALUE:%.*]] syncscope("wavefront") monotonic, align 4, !amdgpu.no.fine.grained.memory [[META0]]
 ; GFX940-NEXT:    ret float [[RES]]
 ;
 ; GFX11-LABEL: @test_atomicrmw_fadd_f32_as999_unsafe(
-; GFX11-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr addrspace(999) [[PTR:%.*]], float [[VALUE:%.*]] syncscope("wavefront") monotonic, align 4
+; GFX11-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr addrspace(999) [[PTR:%.*]], float [[VALUE:%.*]] syncscope("wavefront") monotonic, align 4, !amdgpu.no.fine.grained.memory [[META0]]
 ; GFX11-NEXT:    ret float [[RES]]
 ;
-  %res = atomicrmw fadd ptr addrspace(999) %ptr, float %value syncscope("wavefront") monotonic
+  %res = atomicrmw fadd ptr addrspace(999) %ptr, float %value syncscope("wavefront") monotonic, !amdgpu.no.fine.grained.memory !0
   ret float %res
 }
 
-define double @test_atomicrmw_fadd_f64_global_unsafe(ptr addrspace(1) %ptr, double %value) #0 {
+define double @test_atomicrmw_fadd_f64_global_unsafe(ptr addrspace(1) %ptr, double %value) #3 {
 ; CI-LABEL: @test_atomicrmw_fadd_f64_global_unsafe(
 ; CI-NEXT:    [[TMP1:%.*]] = load double, ptr addrspace(1) [[PTR:%.*]], align 8
 ; CI-NEXT:    br label [[ATOMICRMW_START:%.*]]
@@ -518,11 +518,11 @@ define double @test_atomicrmw_fadd_f64_global_unsafe(ptr addrspace(1) %ptr, doub
 ; GFX908-NEXT:    ret double [[TMP5]]
 ;
 ; GFX90A-LABEL: @test_atomicrmw_fadd_f64_global_unsafe(
-; GFX90A-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr addrspace(1) [[PTR:%.*]], double [[VALUE:%.*]] syncscope("wavefront") monotonic, align 8
+; GFX90A-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr addrspace(1) [[PTR:%.*]], double [[VALUE:%.*]] syncscope("wavefront") monotonic, align 8, !amdgpu.no.fine.grained.memory [[META0]]
 ; GFX90A-NEXT:    ret double [[RES]]
 ;
 ; GFX940-LABEL: @test_atomicrmw_fadd_f64_global_unsafe(
-; GFX940-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr addrspace(1) [[PTR:%.*]], double [[VALUE:%.*]] syncscope("wavefront") monotonic, align 8
+; GFX940-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr addrspace(1) [[PTR:%.*]], double [[VALUE:%.*]] syncscope("wavefront") monotonic, align 8, !amdgpu.no.fine.grained.memory [[META0]]
 ; GFX940-NEXT:    ret double [[RES]]
 ;
 ; GFX11-LABEL: @test_atomicrmw_fadd_f64_global_unsafe(
@@ -541,11 +541,11 @@ define double @test_atomicrmw_fadd_f64_global_unsafe(ptr addrspace(1) %ptr, doub
 ; GFX11:       atomicrmw.end:
 ; GFX11-NEXT:    ret double [[TMP5]]
 ;
-  %res = atomicrmw fadd ptr addrspace(1) %ptr, double %value syncscope("wavefront") monotonic
+  %res = atomicrmw fadd ptr addrspace(1) %ptr, double %value syncscope("wavefront") monotonic, !amdgpu.no.fine.grained.memory !0
   ret double %res
 }
 
-define float @test_atomicrmw_fadd_f32_flat_unsafe(ptr %ptr, float %value) #0 {
+define float @test_atomicrmw_fadd_f32_flat_unsafe(ptr %ptr, float %value) #3 {
 ; CI-LABEL: @test_atomicrmw_fadd_f32_flat_unsafe(
 ; CI-NEXT:    [[TMP1:%.*]] = load float, ptr [[PTR:%.*]], align 4
 ; CI-NEXT:    br label [[ATOMICRMW_START:%.*]]
@@ -601,7 +601,7 @@ define float @test_atomicrmw_fadd_f32_flat_unsafe(ptr %ptr, float %value) #0 {
 ; GFX90A-NEXT:    br i1 [[IS_SHARED]], label [[ATOMICRMW_SHARED:%.*]], label [[ATOMICRMW_CHECK_PRIVATE:%.*]]
 ; GFX90A:       atomicrmw.shared:
 ; GFX90A-NEXT:    [[TMP1:%.*]] = addrspacecast ptr [[PTR]] to ptr addrspace(3)
-; GFX90A-NEXT:    [[TMP2:%.*]] = atomicrmw fadd ptr addrspace(3) [[TMP1]], float [[VALUE:%.*]] syncscope("wavefront") monotonic, align 4
+; GFX90A-NEXT:    [[TMP2:%.*]] = atomicrmw fadd ptr addrspace(3) [[TMP1]], float [[VALUE:%.*]] syncscope("wavefront") monotonic, align 4, !amdgpu.no.fine.grained.memory [[META0]]
 ; GFX90A-NEXT:    br label [[ATOMICRMW_PHI:%.*]]
 ; GFX90A:       atomicrmw.check.private:
 ; GFX90A-NEXT:    [[IS_PRIVATE:%.*]] = call i1 @llvm.amdgcn.is.private(ptr [[PTR]])
@@ -614,7 +614,7 @@ define float @test_atomicrmw_fadd_f32_flat_unsafe(ptr %ptr, float %value) #0 {
 ; GFX90A-NEXT:    br label [[ATOMICRMW_PHI]]
 ; GFX90A:       atomicrmw.global:
 ; GFX90A-NEXT:    [[TMP4:%.*]] = addrspacecast ptr [[PTR]] to ptr addrspace(1)
-; GFX90A-NEXT:    [[TMP5:%.*]] = atomicrmw fadd ptr addrspace(1) [[TMP4]], float [[VALUE]] syncscope("wavefront") monotonic, align 4
+; GFX90A-NEXT:    [[TMP5:%.*]] = atomicrmw fadd ptr addrspace(1) [[TMP4]], float [[VALUE]] syncscope("wavefront") monotonic, align 4, !amdgpu.no.fine.grained.memory [[META0]]
 ; GFX90A-NEXT:    br label [[ATOMICRMW_PHI]]
 ; GFX90A:       atomicrmw.phi:
 ; GFX90A-NEXT:    [[LOADED_PHI:%.*]] = phi float [ [[TMP2]], [[ATOMICRMW_SHARED]] ], [ [[LOADED_PRIVATE]], [[ATOMICRMW_PRIVATE]] ], [ [[TMP5]], [[ATOMICRMW_GLOBAL]] ]
@@ -623,18 +623,18 @@ define float @test_atomicrmw_fadd_f32_flat_unsafe(ptr %ptr, float %value) #0 {
 ; GFX90A-NEXT:    ret float [[LOADED_PHI]]
 ;
 ; GFX940-LABEL: @test_atomicrmw_fadd_f32_flat_unsafe(
-; GFX940-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr [[PTR:%.*]], float [[VALUE:%.*]] syncscope("wavefront") monotonic, align 4
+; GFX940-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr [[PTR:%.*]], float [[VALUE:%.*]] syncscope("wavefront") monotonic, align 4, !amdgpu.no.fine.grained.memory [[META0]]
 ; GFX940-NEXT:    ret float [[RES]]
 ;
 ; GFX11-LABEL: @test_atomicrmw_fadd_f32_flat_unsafe(
-; GFX11-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr [[PTR:%.*]], float [[VALUE:%.*]] syncscope("wavefront") monotonic, align 4
+; GFX11-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr [[PTR:%.*]], float [[VALUE:%.*]] syncscope("wavefront") monotonic, align 4, !amdgpu.no.fine.grained.memory [[META0]]
 ; GFX11-NEXT:    ret float [[RES]]
 ;
-  %res = atomicrmw fadd ptr %ptr, float %value syncscope("wavefront") monotonic
+  %res = atomicrmw fadd ptr %ptr, float %value syncscope("wavefront") monotonic, !amdgpu.no.fine.grained.memory !0
   ret float %res
 }
 
-define double @test_atomicrmw_fadd_f64_flat_unsafe(ptr %ptr, double %value) #0 {
+define double @test_atomicrmw_fadd_f64_flat_unsafe(ptr %ptr, double %value) #3 {
 ; CI-LABEL: @test_atomicrmw_fadd_f64_flat_unsafe(
 ; CI-NEXT:    [[TMP1:%.*]] = load double, ptr [[PTR:%.*]], align 8
 ; CI-NEXT:    br label [[ATOMICRMW_START:%.*]]
@@ -684,11 +684,11 @@ define double @test_atomicrmw_fadd_f64_flat_unsafe(ptr %ptr, double %value) #0 {
 ; GFX908-NEXT:    ret double [[TMP5]]
 ;
 ; GFX90A-LABEL: @test_atomicrmw_fadd_f64_flat_unsafe(
-; GFX90A-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr [[PTR:%.*]], double [[VALUE:%.*]] syncscope("wavefront") monotonic, align 8
+; GFX90A-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr [[PTR:%.*]], double [[VALUE:%.*]] syncscope("wavefront") monotonic, align 8, !amdgpu.no.fine.grained.memory [[META0]]
 ; GFX90A-NEXT:    ret double [[RES]]
 ;
 ; GFX940-LABEL: @test_atomicrmw_fadd_f64_flat_unsafe(
-; GFX940-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr [[PTR:%.*]], double [[VALUE:%.*]] syncscope("wavefront") monotonic, align 8
+; GFX940-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr [[PTR:%.*]], double [[VALUE:%.*]] syncscope("wavefront") monotonic, align 8, !amdgpu.no.fine.grained.memory [[META0]]
 ; GFX940-NEXT:    ret double [[RES]]
 ;
 ; GFX11-LABEL: @test_atomicrmw_fadd_f64_flat_unsafe(
@@ -707,7 +707,7 @@ define double @test_atomicrmw_fadd_f64_flat_unsafe(ptr %ptr, double %value) #0 {
 ; GFX11:       atomicrmw.end:
 ; GFX11-NEXT:    ret double [[TMP5]]
 ;
-  %res = atomicrmw fadd ptr %ptr, double %value syncscope("wavefront") monotonic
+  %res = atomicrmw fadd ptr %ptr, double %value syncscope("wavefront") monotonic, !amdgpu.no.fine.grained.memory !0
   ret double %res
 }
 
@@ -816,7 +816,7 @@ define void @test_atomicrmw_fadd_f32_global_no_use_ieee(ptr addrspace(1) %ptr, f
   ret void
 }
 
-define void @test_atomicrmw_fadd_f32_global_no_use_denorm_flush(ptr addrspace(1) %ptr, float %value) #0 {
+define void @test_atomicrmw_fadd_f32_global_no_use_denorm_flush(ptr addrspace(1) %ptr, float %value) #4 {
 ; CI-LABEL: @test_atomicrmw_fadd_f32_global_no_use_denorm_flush(
 ; CI-NEXT:    [[TMP1:%.*]] = load float, ptr addrspace(1) [[PTR:%.*]], align 4
 ; CI-NEXT:    br label [[ATOMICRMW_START:%.*]]
@@ -858,14 +858,14 @@ define void @test_atomicrmw_fadd_f32_global_no_use_denorm_flush(ptr addrspace(1)
 ; GFX90A-NEXT:    ret void
 ;
 ; GFX940-LABEL: @test_atomicrmw_fadd_f32_global_no_use_denorm_flush(
-; GFX940-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr addrspace(1) [[PTR:%.*]], float [[VALUE:%.*]] seq_cst, align 4
+; GFX940-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr addrspace(1) [[PTR:%.*]], float [[VALUE:%.*]] seq_cst, align 4, !amdgpu.no.fine.grained.memory [[META0]]
 ; GFX940-NEXT:    ret void
 ;
 ; GFX11-LABEL: @test_atomicrmw_fadd_f32_global_no_use_denorm_flush(
-; GFX11-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr addrspace(1) [[PTR:%.*]], float [[VALUE:%.*]] seq_cst, align 4
+; GFX11-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr addrspace(1) [[PTR:%.*]], float [[VALUE:%.*]] seq_cst, align 4, !amdgpu.no.fine.grained.memory [[META0]]
 ; GFX11-NEXT:    ret void
 ;
-  %res = atomicrmw fadd ptr addrspace(1) %ptr, float %value seq_cst
+  %res = atomicrmw fadd ptr addrspace(1) %ptr, float %value seq_cst, !amdgpu.no.fine.grained.memory !0
   ret void
 }
 
@@ -1272,13 +1272,13 @@ define float @test_atomicrmw_fadd_f32_global_one_as(ptr addrspace(1) %ptr, float
   ret float %res
 }
 
-define void @test_atomicrmw_fadd_f32_global_no_use_unsafe_structfp(ptr addrspace(1) %ptr, float %value) #1 {
-; CI-LABEL: @test_atomicrmw_fadd_f32_global_no_use_unsafe_structfp(
+define void @test_atomicrmw_fadd_f32_global_no_use_unsafe_strictfp(ptr addrspace(1) %ptr, float %value) #1 {
+; CI-LABEL: @test_atomicrmw_fadd_f32_global_no_use_unsafe_strictfp(
 ; CI-NEXT:    [[TMP1:%.*]] = load float, ptr addrspace(1) [[PTR:%.*]], align 4
 ; CI-NEXT:    br label [[ATOMICRMW_START:%.*]]
 ; CI:       atomicrmw.start:
 ; CI-NEXT:    [[LOADED:%.*]] = phi float [ [[TMP1]], [[TMP0:%.*]] ], [ [[TMP5:%.*]], [[ATOMICRMW_START]] ]
-; CI-NEXT:    [[NEW:%.*]] = call float @llvm.experimental.constrained.fadd.f32(float [[LOADED]], float [[VALUE:%.*]], metadata !"round.dynamic", metadata !"fpexcept.strict") #[[ATTR10:[0-9]+]]
+; CI-NEXT:    [[NEW:%.*]] = call float @llvm.experimental.constrained.fadd.f32(float [[LOADED]], float [[VALUE:%.*]], metadata !"round.dynamic", metadata !"fpexcept.strict") #[[ATTR8:[0-9]+]]
 ; CI-NEXT:    [[TMP2:%.*]] = bitcast float [[NEW]] to i32
 ; CI-NEXT:    [[TMP3:%.*]] = bitcast float [[LOADED]] to i32
 ; CI-NEXT:    [[TMP4:%.*]] = cmpxchg ptr addrspace(1) [[PTR]], i32 [[TMP3]], i32 [[TMP2]] syncscope("wavefront") monotonic monotonic, align 4
@@ -1289,12 +1289,12 @@ define void @test_atomicrmw_fadd_f32_global_no_use_unsafe_structfp(ptr addrspace
 ; CI:       atomicrmw.end:
 ; CI-NEXT:    ret void
 ;
-; GFX9-LABEL: @test_atomicrmw_fadd_f32_global_no_use_unsafe_structfp(
+; GFX9-LABEL: @test_atomicrmw_fadd_f32_global_no_use_unsafe_strictfp(
 ; GFX9-NEXT:    [[TMP1:%.*]] = load float, ptr addrspace(1) [[PTR:%.*]], align 4
 ; GFX9-NEXT:    br label [[ATOMICRMW_START:%.*]]
 ; GFX9:       atomicrmw.start:
 ; GFX9-NEXT:    [[LOADED:%.*]] = phi float [ [[TMP1]], [[TMP0:%.*]] ], [ [[TMP5:%.*]], [[ATOMICRMW_START]] ]
-; GFX9-NEXT:    [[NEW:%.*]] = call float @llvm.experimental.constrained.fadd.f32(float [[LOADED]], float [[VALUE:%.*]], metadata !"round.dynamic", metadata !"fpexcept.strict") #[[ATTR10:[0-9]+]]
+; GFX9-NEXT:    [[NEW:%.*]] = call float @llvm.experimental.constrained.fadd.f32(float [[LOADED]], float [[VALUE:%.*]], metadata !"round.dynamic", metadata !"fpexcept.strict") #[[ATTR8:[0-9]+]]
 ; GFX9-NEXT:    [[TMP2:%.*]] = bitcast float [[NEW]] to i32
 ; GFX9-NEXT:    [[TMP3:%.*]] = bitcast float [[LOADED]] to i32
 ; GFX9-NEXT:    [[TMP4:%.*]] = cmpxchg ptr addrspace(1) [[PTR]], i32 [[TMP3]], i32 [[TMP2]] syncscope("wavefront") monotonic monotonic, align 4
@@ -1305,23 +1305,23 @@ define void @test_atomicrmw_fadd_f32_global_no_use_unsafe_structfp(ptr addrspace
 ; GFX9:       atomicrmw.end:
 ; GFX9-NEXT:    ret void
 ;
-; GFX908-LABEL: @test_atomicrmw_fadd_f32_global_no_use_unsafe_structfp(
-; GFX908-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr addrspace(1) [[PTR:%.*]], float [[VALUE:%.*]] syncscope("wavefront") monotonic, align 4
+; GFX908-LABEL: @test_atomicrmw_fadd_f32_global_no_use_unsafe_strictfp(
+; GFX908-NEXT:    [[RES:%.*]] = atomicrmw fadd ptr addrspace(1) [[PTR:%.*]], float [[VALUE:%.*]] syn...
[truncated]

@arsenm arsenm force-pushed the users/arsenm/amdgpu-autoupgrade-unsafe-fp-atomics-attribute branch from f19c1c2 to 5ad5d43 Compare August 12, 2024 09:02
@arsenm arsenm force-pushed the users/arsenm/amdgpu-stop-handling-amdgpu-unsafe-fp-atomics-attribute branch from 1f62a17 to 66a2fe2 Compare August 12, 2024 09:02
Base automatically changed from users/arsenm/amdgpu-autoupgrade-unsafe-fp-atomics-attribute to main August 12, 2024 10:56
@arsenm arsenm force-pushed the users/arsenm/amdgpu-stop-handling-amdgpu-unsafe-fp-atomics-attribute branch from 66a2fe2 to 5721daa Compare August 12, 2024 10:58
Copy link
Contributor Author

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping

This is now autoupgraded to annotate atomicrmw instructions in
old bitcode.
@arsenm arsenm force-pushed the users/arsenm/amdgpu-stop-handling-amdgpu-unsafe-fp-atomics-attribute branch from 5721daa to d4af49f Compare August 12, 2024 13:59
@arsenm arsenm merged commit edded8d into main Aug 13, 2024
8 checks passed
@arsenm arsenm deleted the users/arsenm/amdgpu-stop-handling-amdgpu-unsafe-fp-atomics-attribute branch August 13, 2024 18:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AMDGPU clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category llvm:globalisel llvm:ir llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants