Skip to content

[AMDGPU] Cleanup hasUnwantedEffectsWhenEXECEmpty function #70206

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

cdevadas
Copy link
Collaborator

The readlane & writelane instructions don't really depend on the the EXEC mask and they should return false from here.

The readlane & writelane instructions don't really depend on the
the EXEC mask and they should return false from here.
@llvmbot
Copy link
Member

llvmbot commented Oct 25, 2023

@llvm/pr-subscribers-backend-amdgpu

Author: Christudasan Devadasan (cdevadas)

Changes

The readlane & writelane instructions don't really depend on the the EXEC mask and they should return false from here.


Full diff: https://github.com/llvm/llvm-project/pull/70206.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.cpp (+1-2)
  • (modified) llvm/test/CodeGen/AMDGPU/remove-short-exec-branches-special-instructions.mir (+6-8)
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
index 4ff7b462f0f3295..827c2c156638468 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -3906,8 +3906,7 @@ bool SIInstrInfo::hasUnwantedEffectsWhenEXECEmpty(const MachineInstr &MI) const
   //
   // However, executing them with EXEC = 0 causes them to operate on undefined
   // data, which we avoid by returning true here.
-  if (Opcode == AMDGPU::V_READFIRSTLANE_B32 ||
-      Opcode == AMDGPU::V_READLANE_B32 || Opcode == AMDGPU::V_WRITELANE_B32)
+  if (Opcode == AMDGPU::V_READFIRSTLANE_B32)
     return true;
 
   return false;
diff --git a/llvm/test/CodeGen/AMDGPU/remove-short-exec-branches-special-instructions.mir b/llvm/test/CodeGen/AMDGPU/remove-short-exec-branches-special-instructions.mir
index fe4aa6a9aea68cb..8f3f70ea6287ee9 100644
--- a/llvm/test/CodeGen/AMDGPU/remove-short-exec-branches-special-instructions.mir
+++ b/llvm/test/CodeGen/AMDGPU/remove-short-exec-branches-special-instructions.mir
@@ -130,13 +130,12 @@ body: |
 
 ---
 
-name: need_skip_writelane_b32
+name: dont_skip_writelane_b32
 body: |
-  ; CHECK-LABEL: name: need_skip_writelane_b32
+  ; CHECK-LABEL: name: dont_skip_writelane_b32
   ; CHECK: bb.0:
-  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000)
   ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT:   S_CBRANCH_EXECZ %bb.2, implicit $exec
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT: bb.1:
   ; CHECK-NEXT:   successors: %bb.2(0x80000000)
@@ -160,13 +159,12 @@ body: |
 ...
 
 ---
-name: need_skip_readlane_b32
+name: dont_skip_readlane_b32
 body: |
-  ; CHECK-LABEL: name: need_skip_readlane_b32
+  ; CHECK-LABEL: name: dont_skip_readlane_b32
   ; CHECK: bb.0:
-  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000)
   ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT:   S_CBRANCH_EXECZ %bb.2, implicit $exec
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT: bb.1:
   ; CHECK-NEXT:   successors: %bb.2(0x80000000)

Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm assuming this doesn't break anything

@cdevadas cdevadas merged commit 7ce613f into llvm:main Oct 25, 2023
@ruiling
Copy link
Contributor

ruiling commented Oct 26, 2023

I don't think the change is ok. The function hasUnwantedEffectsWhenEXECEmpty() is used to query whether we can optimize away a s_cbranch_exez, that is whether a block can be always executed even the EXEC is zero, and the block execution with zero EXEC should not cause any observable side-effect to the program. In theory, I think any instruction write to SGPR could possibly cause sideffect to the program. Although in practice, we might not hit this, I am not sure about this. But I think we should revert this, and I think it is even better to say scalar instructions also have unwanted-side-effect-when-exec-zero. @nhaehnle @perlfu

@perlfu
Copy link
Contributor

perlfu commented Oct 26, 2023

Out of caution, I think this should be reverted.
V_READLANE and V_WRITELANE always execute (as if they are scalar instructions) regardless of EXEC mask.

The lack of test coverage in the code base is probably part of the problem why this looks like an OK change.
I will see if I can find some test cases where this matters and add them -- if there are none then maybe this is fine.

The pre-existing comment is misleading, and that might have not helped the confusion.

// These are like SALU instructions in terms of effects, so it's questionable
// whether we should return true for those.
//
// However, executing them with EXEC = 0 causes them to operate on undefined
// data, which we avoid by returning true here.

This suggests maybe we can remove them, but then says they are undefined.
Undefined behaviour at the instruction level only exists for V_READFIRSTLANE with EXEC=0.
However you can imagine a sequence like this:

v_cmp_eq_u32 s0, v0, 0.0
v_writelane_b32 v1, s0, 0

Where the result of v_cmp is effected by EXEC=0, but v_writelane still uses the value.
As @ruiling stated, any write to an SGPR could be an issue.

Taking a step back, the problem exists because we peephole remove s_cbranch_execz irrespective of whether it is part of uniform or non-uniform control flow.
We would only expect it to occur as part of non-uniform control flow, and executing scalar operations as part of non-uniform control flow should be fine, because it is just the same as executing with any number of lanes active.
However, we do not know for sure the purpose or origin of the s_cbranch_execz instructions which is why we must be cautious.

Copy link

@jonesk7734 jonesk7734 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

cdevadas added a commit that referenced this pull request Oct 26, 2023
@jayfoad
Copy link
Contributor

jayfoad commented Oct 26, 2023

Ok

@jonesk7734 are you a bot?

@cdevadas
Copy link
Collaborator Author

Patch reverted with 16fbc45.

@jonesk7734
Copy link

jonesk7734 commented Oct 26, 2023 via email

zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Oct 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants