[RISCV][VLOPT] Fix assertion failure across blocks #124734

lukel97 · 2025-01-28T11:41:47Z

Whilst adding a cross-block test, I encountered an assertion failure in the second pass where we check the instruction popped off the worklist is a candidate.

The leaf instruction %c in this case will be added to the worklist when its VL is VLMAX, but during the first pass it will have its VL reduced to 1.

Then in the second pass when its processed via the worklist, isCandidate will no longer be true due to its VL == 1.

This fixes it by moving the VL == 1 check to tryReduceVL, keeping it alongside the other VL check for bailing out early as an optimisation.

llvmbot · 2025-01-28T11:42:21Z

@llvm/pr-subscribers-backend-risc-v

Author: Luke Lau (lukel97)

Changes

Whilst adding a cross-block test, I encountered an assertion failure in the second pass where we check the instruction popped off the worklist is a candidate.

The leaf instruction %c in this case will be added to the worklist when its VL is VLMAX, but during the first pass it will have its VL reduced to 1.

Then in the second pass when its processed via the worklist, isCandidate will no longer be true due to its VL == 1.

I think the easiest fix for this is to remove the VL > 1 check in isCandidate, so now it just checks static properties about the MCInstrDesc.

Full diff: https://github.com/llvm/llvm-project/pull/124734.diff

2 Files Affected:

(modified) llvm/lib/Target/RISCV/RISCVVLOptimizer.cpp (-10)
(modified) llvm/test/CodeGen/RISCV/rvv/vl-opt.mir (+46)

diff --git a/llvm/lib/Target/RISCV/RISCVVLOptimizer.cpp b/llvm/lib/Target/RISCV/RISCVVLOptimizer.cpp
index e8c01e57038bfb..9bd3895489da10 100644
--- a/llvm/lib/Target/RISCV/RISCVVLOptimizer.cpp
+++ b/llvm/lib/Target/RISCV/RISCVVLOptimizer.cpp
@@ -1143,16 +1143,6 @@ bool RISCVVLOptimizer::isCandidate(const MachineInstr &MI) const {
   if (MI.getNumDefs() != 1)
     return false;
 
-  unsigned VLOpNum = RISCVII::getVLOpNum(Desc);
-  const MachineOperand &VLOp = MI.getOperand(VLOpNum);
-
-  // If the VL is 1, then there is no need to reduce it. This is an
-  // optimization, not needed to preserve correctness.
-  if (VLOp.isImm() && VLOp.getImm() == 1) {
-    LLVM_DEBUG(dbgs() << "  Not a candidate because VL is already 1\n");
-    return false;
-  }
-
   if (MI.mayRaiseFPException()) {
     LLVM_DEBUG(dbgs() << "Not a candidate because may raise FP exception\n");
     return false;
diff --git a/llvm/test/CodeGen/RISCV/rvv/vl-opt.mir b/llvm/test/CodeGen/RISCV/rvv/vl-opt.mir
index 027eb8ca3c17f0..7a28eaaaa5d8be 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vl-opt.mir
+++ b/llvm/test/CodeGen/RISCV/rvv/vl-opt.mir
@@ -149,3 +149,49 @@ body: |
     ; CHECK-NEXT: early-clobber %y:vrm2 = PseudoVWADD_WV_M1_TIED $noreg, %x, 1, 3 /* e8 */, 0 /* tu, mu */
     %x:vr = PseudoVADD_VV_M1 $noreg, $noreg, $noreg, -1, 3 /* e8 */, 0 /* tu, mu */
     %y:vrm2 = PseudoVWADD_WV_M1_TIED $noreg, %x, 1, 3 /* e8 */, 0 /* tu, mu */
+...
+---
+name: crossbb
+body: |
+  ; CHECK-LABEL: name: crossbb
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.3(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   PseudoBR %bb.3
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   %a1:vr = PseudoVADD_VV_M1 $noreg, %c, $noreg, 1, 3 /* e8 */, 0 /* tu, mu */
+  ; CHECK-NEXT:   %a2:vr = PseudoVADD_VV_M1 $noreg, %a1, $noreg, 1, 3 /* e8 */, 0 /* tu, mu */
+  ; CHECK-NEXT:   $v8 = COPY %a2
+  ; CHECK-NEXT:   PseudoRET
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   %b1:vr = PseudoVADD_VV_M1 $noreg, %c, $noreg, 1, 3 /* e8 */, 0 /* tu, mu */
+  ; CHECK-NEXT:   %b2:vr = PseudoVADD_VV_M1 $noreg, %b1, $noreg, 1, 3 /* e8 */, 0 /* tu, mu */
+  ; CHECK-NEXT:   $v8 = COPY %b2
+  ; CHECK-NEXT:   PseudoRET
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.3:
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT:   liveins: $x1
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   %c:vr = PseudoVADD_VV_M1 $noreg, $noreg, $noreg, 1, 3 /* e8 */, 0 /* tu, mu */
+  ; CHECK-NEXT:   BEQ $x1, $x0, %bb.1
+  ; CHECK-NEXT:   PseudoBR %bb.2
+  bb.0:
+    PseudoBR %bb.3
+  bb.1:
+    %a1:vr = PseudoVADD_VV_M1 $noreg, %c, $noreg, -1, 3 /* e8 */, 0 /* tu, mu */
+    %a2:vr = PseudoVADD_VV_M1 $noreg, %a1, $noreg, 1, 3 /* e8 */, 0 /* tu, mu */
+    $v8 = COPY %a2
+    PseudoRET
+  bb.2:
+    %b1:vr = PseudoVADD_VV_M1 $noreg, %c, $noreg, -1, 3 /* e8 */, 0 /* tu, mu */
+    %b2:vr = PseudoVADD_VV_M1 $noreg, %b1, $noreg, 1, 3 /* e8 */, 0 /* tu, mu */
+    $v8 = COPY %b2
+    PseudoRET
+  bb.3:
+    liveins: $x1
+    %c:vr = PseudoVADD_VV_M1 $noreg, $noreg, $noreg, -1, 3 /* e8 */, 0 /* tu, mu */
+    BEQ $x1, $x0, %bb.1
+    PseudoBR %bb.2

Whilst adding a cross-block test, I encountered an assertion failure in the second pass where we check the instruction popped off the worklist is a candidate. The leaf instruction %c in this case will be added to the worklist when its VL is VLMAX, but during the first pass it will have its VL reduced to 1. Then in the second pass when its processed via the worklist, isCandidate will no longer be true due to its VL == 1. I think the easiest fix for this is to remove the VL > 1 check in isCandidate, so now it just checks static properties about the MCInstrDesc.

michaelmaitland · 2025-01-28T15:04:02Z

I'm not sure I agree with this approach. I think there is a deeper problem that is going on: something is on the worklist that is not a candidate, and this is an invariant we are currently not permitting.

I think we should handle it in one of two ways:

Allow things on the worklist that are not isCandidate. When when we pop it off, we will check whether it isCandidate and decide whether to optimize or not.
Do a better job of not adding something that !isCandidate to the worklist.

I think (1) is probably easier, but I don't have a strong preference.

lukel97 · 2025-01-28T15:50:00Z

Do a better job of not adding something that !isCandidate to the worklist.

The issue lies in that a candidate can become a non-candidate whilst on the worklist, because after an instruction gets reduced it's no longer a candidate, which can happen before it's popped off the worklist.

Allow things on the worklist that are not isCandidate. When when we pop it off, we will check whether it isCandidate and decide whether to optimize or not.

I think we can do this approach.

It's probably worth noting that there's two types of check going on in isCandidate at the moment, those to ensure correctness (isSupportedInstr, mayRaiseFPException), and those to see if it's worthwhile (VL > 1). We could potentially move the latter into tryReduceVL to keep it alongside the other VL checks.

preames · 2025-01-28T16:19:55Z

It's probably worth noting that there's two types of check going on in isCandidate at the moment, those to ensure correctness (isSupportedInstr, mayRaiseFPException), and those to see if it's worthwhile (VL > 1). We could potentially move the latter into tryReduceVL to keep it alongside the other VL checks.

This seems like the right approach to me.

preames

LGTM

llvm-ci · 2025-01-29T03:10:14Z

LLVM Buildbot has detected a new failure on builder openmp-offload-amdgpu-runtime running on omp-vega20-0 while building llvm at step 7 "Add check check-offload".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/30/builds/14814

Here is the relevant piece of the build log for the reference

Step 7 (Add check check-offload) failure: test (failure)
******************** TEST 'libomptarget :: amdgcn-amd-amdhsa :: api/omp_host_call.c' FAILED ********************
Exit Code: 2

Command Output (stdout):
--
# RUN: at line 1
/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/clang -fopenmp    -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src  -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib  -fopenmp-targets=amdgcn-amd-amdhsa /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/api/omp_host_call.c -o /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/api/Output/omp_host_call.c.tmp /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a && /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/api/Output/omp_host_call.c.tmp | /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/api/omp_host_call.c
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/clang -fopenmp -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -fopenmp-targets=amdgcn-amd-amdhsa /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/api/omp_host_call.c -o /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/api/Output/omp_host_call.c.tmp /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a
# note: command had no output on stdout or stderr
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/api/Output/omp_host_call.c.tmp
# note: command had no output on stdout or stderr
# error: command failed with exit status: -11
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/api/omp_host_call.c
# .---command stderr------------
# | FileCheck error: '<stdin>' is empty.
# | FileCheck command line:  /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/api/omp_host_call.c
# `-----------------------------
# error: command failed with exit status: 2

--

********************

lukel97 requested review from michaelmaitland, preames and topperc January 28, 2025 11:41

llvmbot added the backend:RISC-V label Jan 28, 2025

lukel97 mentioned this pull request Jan 28, 2025

[RISCV][VLOPT] Compute demanded VLs up front #124530

Merged

Keep VL = 1 check, check isCandidate after popping instead

4ca7c87

lukel97 force-pushed the vloptimizer/crossbb branch from cabc640 to 4ca7c87 Compare January 28, 2025 15:52

Move VL=1 check to tryReduceVL

e0781aa

preames approved these changes Jan 28, 2025

View reviewed changes

lukel97 merged commit ff271d0 into llvm:main Jan 29, 2025
8 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[RISCV][VLOPT] Fix assertion failure across blocks #124734

[RISCV][VLOPT] Fix assertion failure across blocks #124734

Uh oh!

lukel97 commented Jan 28, 2025 •

edited

Loading

Uh oh!

llvmbot commented Jan 28, 2025

Uh oh!

michaelmaitland commented Jan 28, 2025

Uh oh!

lukel97 commented Jan 28, 2025

Uh oh!

preames commented Jan 28, 2025

Uh oh!

preames left a comment

Uh oh!

Uh oh!

llvm-ci commented Jan 29, 2025

Uh oh!

Uh oh!

[RISCV][VLOPT] Fix assertion failure across blocks #124734

[RISCV][VLOPT] Fix assertion failure across blocks #124734

Uh oh!

Conversation

lukel97 commented Jan 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Jan 28, 2025

Uh oh!

michaelmaitland commented Jan 28, 2025

Uh oh!

lukel97 commented Jan 28, 2025

Uh oh!

preames commented Jan 28, 2025

Uh oh!

preames left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

llvm-ci commented Jan 29, 2025

Uh oh!

Uh oh!

lukel97 commented Jan 28, 2025 •

edited

Loading