[IA][RISCV] Recognizing gap masks assembled from bitwise AND #153324
Conversation
@llvm/pr-subscribers-backend-risc-v

Author: Min-Yih Hsu (mshockwave)

Changes

For a deinterleaved masked.load / vp.load, if its mask %c is synthesized by AND-ing an interleaved per-segment mask %s with a constant gap mask %g, then %g is the gap mask and %s is the mask for each field / component. This patch teaches the InterleavedAccess pass to recognize such a pattern.

Split out from #151612

Full diff: https://github.com/llvm/llvm-project/pull/153324.diff

2 Files Affected:
diff --git a/llvm/lib/CodeGen/InterleavedAccessPass.cpp b/llvm/lib/CodeGen/InterleavedAccessPass.cpp
index a41a44df3f847..8bda10e1bef49 100644
--- a/llvm/lib/CodeGen/InterleavedAccessPass.cpp
+++ b/llvm/lib/CodeGen/InterleavedAccessPass.cpp
@@ -592,6 +592,7 @@ static void getGapMask(const Constant &MaskConst, unsigned Factor,
static std::pair<Value *, APInt> getMask(Value *WideMask, unsigned Factor,
ElementCount LeafValueEC) {
+ using namespace PatternMatch;
auto GapMask = APInt::getAllOnes(Factor);
if (auto *IMI = dyn_cast<IntrinsicInst>(WideMask)) {
@@ -601,6 +602,18 @@ static std::pair<Value *, APInt> getMask(Value *WideMask, unsigned Factor,
}
}
+ // Try to match `and <interleaved mask>, <gap mask>`. The WideMask here is
+ // expected to be a fixed vector and gap mask should be a constant mask.
+ Value *AndMaskLHS;
+ Constant *AndMaskRHS;
+ if (LeafValueEC.isFixed() &&
+ match(WideMask, m_c_And(m_Value(AndMaskLHS), m_Constant(AndMaskRHS)))) {
+ assert(!isa<Constant>(AndMaskLHS) &&
+ "expect constants to be folded already");
+ getGapMask(*AndMaskRHS, Factor, LeafValueEC.getFixedValue(), GapMask);
+ return {getMask(AndMaskLHS, Factor, LeafValueEC).first, GapMask};
+ }
+
if (auto *ConstMask = dyn_cast<Constant>(WideMask)) {
if (auto *Splat = ConstMask->getSplatValue())
// All-ones or all-zeros mask.
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-interleaved-access.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-interleaved-access.ll
index 7d7ef3e4e2a4b..2c738e5aeb55b 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-interleaved-access.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-interleaved-access.ll
@@ -367,6 +367,24 @@ define {<4 x i32>, <4 x i32>} @vpload_factor3_mask_skip_fields(ptr %ptr) {
ret {<4 x i32>, <4 x i32>} %res1
}
+define {<4 x i32>, <4 x i32>} @vpload_factor3_combined_mask_skip_field(ptr %ptr, <4 x i1> %mask) {
+; CHECK-LABEL: vpload_factor3_combined_mask_skip_field:
+; CHECK: # %bb.0:
+; CHECK-NEXT: li a1, 12
+; CHECK-NEXT: vsetivli zero, 6, e32, m1, ta, ma
+; CHECK-NEXT: vlsseg2e32.v v8, (a0), a1, v0.t
+; CHECK-NEXT: ret
+ %interleaved.mask = shufflevector <4 x i1> %mask, <4 x i1> poison, <12 x i32> <i32 0, i32 0, i32 0, i32 1, i32 1, i32 1, i32 2, i32 2, i32 2, i32 3, i32 3, i32 3>
+ %combined = and <12 x i1> %interleaved.mask, <i1 true, i1 true, i1 false, i1 true, i1 true, i1 false, i1 true, i1 true, i1 false, i1 true, i1 true, i1 false>
+ %interleaved.vec = tail call <12 x i32> @llvm.vp.load.v12i32.p0(ptr %ptr, <12 x i1> %combined, i32 12)
+ ; mask = %mask, skip the last field
+ %v0 = shufflevector <12 x i32> %interleaved.vec, <12 x i32> poison, <4 x i32> <i32 0, i32 3, i32 6, i32 9>
+ %v1 = shufflevector <12 x i32> %interleaved.vec, <12 x i32> poison, <4 x i32> <i32 1, i32 4, i32 7, i32 10>
+ %res0 = insertvalue {<4 x i32>, <4 x i32>} undef, <4 x i32> %v0, 0
+ %res1 = insertvalue {<4 x i32>, <4 x i32>} %res0, <4 x i32> %v1, 1
+ ret {<4 x i32>, <4 x i32>} %res1
+}
+
define {<4 x i32>, <4 x i32>, <4 x i32>, <4 x i32>} @vpload_factor4(ptr %ptr) {
; CHECK-LABEL: vpload_factor4:
; CHECK: # %bb.0:
@@ -514,8 +532,8 @@ define {<8 x i64>, <8 x i64>, <8 x i64>, <8 x i64>, <8 x i64>, <8 x i64>} @load_
; RV32-NEXT: li a2, 32
; RV32-NEXT: lui a3, 12
; RV32-NEXT: lui a6, 12291
-; RV32-NEXT: lui a7, %hi(.LCPI25_0)
-; RV32-NEXT: addi a7, a7, %lo(.LCPI25_0)
+; RV32-NEXT: lui a7, %hi(.LCPI26_0)
+; RV32-NEXT: addi a7, a7, %lo(.LCPI26_0)
; RV32-NEXT: vsetvli zero, a2, e32, m8, ta, ma
; RV32-NEXT: vle32.v v24, (a5)
; RV32-NEXT: vmv.s.x v0, a3
@@ -600,12 +618,12 @@ define {<8 x i64>, <8 x i64>, <8 x i64>, <8 x i64>, <8 x i64>, <8 x i64>} @load_
; RV32-NEXT: addi a1, a1, 16
; RV32-NEXT: vs4r.v v8, (a1) # vscale x 32-byte Folded Spill
; RV32-NEXT: lui a7, 49164
-; RV32-NEXT: lui a1, %hi(.LCPI25_1)
-; RV32-NEXT: addi a1, a1, %lo(.LCPI25_1)
+; RV32-NEXT: lui a1, %hi(.LCPI26_1)
+; RV32-NEXT: addi a1, a1, %lo(.LCPI26_1)
; RV32-NEXT: lui t2, 3
; RV32-NEXT: lui t1, 196656
-; RV32-NEXT: lui a4, %hi(.LCPI25_3)
-; RV32-NEXT: addi a4, a4, %lo(.LCPI25_3)
+; RV32-NEXT: lui a4, %hi(.LCPI26_3)
+; RV32-NEXT: addi a4, a4, %lo(.LCPI26_3)
; RV32-NEXT: lui t0, 786624
; RV32-NEXT: li a5, 48
; RV32-NEXT: lui a6, 768
@@ -784,8 +802,8 @@ define {<8 x i64>, <8 x i64>, <8 x i64>, <8 x i64>, <8 x i64>, <8 x i64>} @load_
; RV32-NEXT: vl8r.v v8, (a1) # vscale x 64-byte Folded Reload
; RV32-NEXT: vsetvli zero, zero, e64, m8, ta, ma
; RV32-NEXT: vrgatherei16.vv v24, v8, v2
-; RV32-NEXT: lui a1, %hi(.LCPI25_2)
-; RV32-NEXT: addi a1, a1, %lo(.LCPI25_2)
+; RV32-NEXT: lui a1, %hi(.LCPI26_2)
+; RV32-NEXT: addi a1, a1, %lo(.LCPI26_2)
; RV32-NEXT: lui a3, 3073
; RV32-NEXT: addi a3, a3, -1024
; RV32-NEXT: vmv.s.x v0, a3
@@ -849,16 +867,16 @@ define {<8 x i64>, <8 x i64>, <8 x i64>, <8 x i64>, <8 x i64>, <8 x i64>} @load_
; RV32-NEXT: vrgatherei16.vv v28, v8, v3
; RV32-NEXT: vsetivli zero, 10, e32, m4, tu, ma
; RV32-NEXT: vmv.v.v v28, v24
-; RV32-NEXT: lui a1, %hi(.LCPI25_4)
-; RV32-NEXT: addi a1, a1, %lo(.LCPI25_4)
-; RV32-NEXT: lui a2, %hi(.LCPI25_5)
-; RV32-NEXT: addi a2, a2, %lo(.LCPI25_5)
+; RV32-NEXT: lui a1, %hi(.LCPI26_4)
+; RV32-NEXT: addi a1, a1, %lo(.LCPI26_4)
+; RV32-NEXT: lui a2, %hi(.LCPI26_5)
+; RV32-NEXT: addi a2, a2, %lo(.LCPI26_5)
; RV32-NEXT: vsetivli zero, 16, e16, m2, ta, ma
; RV32-NEXT: vle16.v v24, (a2)
; RV32-NEXT: vsetivli zero, 8, e16, m1, ta, ma
; RV32-NEXT: vle16.v v8, (a1)
-; RV32-NEXT: lui a1, %hi(.LCPI25_7)
-; RV32-NEXT: addi a1, a1, %lo(.LCPI25_7)
+; RV32-NEXT: lui a1, %hi(.LCPI26_7)
+; RV32-NEXT: addi a1, a1, %lo(.LCPI26_7)
; RV32-NEXT: vsetivli zero, 16, e64, m8, ta, ma
; RV32-NEXT: vle16.v v10, (a1)
; RV32-NEXT: csrr a1, vlenb
@@ -886,14 +904,14 @@ define {<8 x i64>, <8 x i64>, <8 x i64>, <8 x i64>, <8 x i64>, <8 x i64>} @load_
; RV32-NEXT: vl8r.v v0, (a1) # vscale x 64-byte Folded Reload
; RV32-NEXT: vsetivli zero, 16, e64, m8, ta, ma
; RV32-NEXT: vrgatherei16.vv v16, v0, v10
-; RV32-NEXT: lui a1, %hi(.LCPI25_6)
-; RV32-NEXT: addi a1, a1, %lo(.LCPI25_6)
-; RV32-NEXT: lui a2, %hi(.LCPI25_8)
-; RV32-NEXT: addi a2, a2, %lo(.LCPI25_8)
+; RV32-NEXT: lui a1, %hi(.LCPI26_6)
+; RV32-NEXT: addi a1, a1, %lo(.LCPI26_6)
+; RV32-NEXT: lui a2, %hi(.LCPI26_8)
+; RV32-NEXT: addi a2, a2, %lo(.LCPI26_8)
; RV32-NEXT: vsetivli zero, 8, e16, m1, ta, ma
; RV32-NEXT: vle16.v v4, (a1)
-; RV32-NEXT: lui a1, %hi(.LCPI25_9)
-; RV32-NEXT: addi a1, a1, %lo(.LCPI25_9)
+; RV32-NEXT: lui a1, %hi(.LCPI26_9)
+; RV32-NEXT: addi a1, a1, %lo(.LCPI26_9)
; RV32-NEXT: vsetivli zero, 16, e16, m2, ta, ma
; RV32-NEXT: vle16.v v6, (a1)
; RV32-NEXT: vsetivli zero, 8, e64, m4, ta, ma
@@ -980,8 +998,8 @@ define {<8 x i64>, <8 x i64>, <8 x i64>, <8 x i64>, <8 x i64>, <8 x i64>} @load_
; RV64-NEXT: li a4, 128
; RV64-NEXT: lui a1, 1
; RV64-NEXT: vle64.v v8, (a3)
-; RV64-NEXT: lui a3, %hi(.LCPI25_0)
-; RV64-NEXT: addi a3, a3, %lo(.LCPI25_0)
+; RV64-NEXT: lui a3, %hi(.LCPI26_0)
+; RV64-NEXT: addi a3, a3, %lo(.LCPI26_0)
; RV64-NEXT: vmv.s.x v0, a4
; RV64-NEXT: csrr a4, vlenb
; RV64-NEXT: li a5, 61
@@ -1169,8 +1187,8 @@ define {<8 x i64>, <8 x i64>, <8 x i64>, <8 x i64>, <8 x i64>, <8 x i64>} @load_
; RV64-NEXT: vl8r.v v16, (a2) # vscale x 64-byte Folded Reload
; RV64-NEXT: vsetivli zero, 8, e64, m4, ta, mu
; RV64-NEXT: vslideup.vi v12, v16, 1, v0.t
-; RV64-NEXT: lui a2, %hi(.LCPI25_1)
-; RV64-NEXT: addi a2, a2, %lo(.LCPI25_1)
+; RV64-NEXT: lui a2, %hi(.LCPI26_1)
+; RV64-NEXT: addi a2, a2, %lo(.LCPI26_1)
; RV64-NEXT: li a3, 192
; RV64-NEXT: vsetivli zero, 16, e16, m2, ta, ma
; RV64-NEXT: vle16.v v6, (a2)
@@ -1204,8 +1222,8 @@ define {<8 x i64>, <8 x i64>, <8 x i64>, <8 x i64>, <8 x i64>, <8 x i64>} @load_
; RV64-NEXT: vrgatherei16.vv v24, v16, v6
; RV64-NEXT: addi a2, sp, 16
; RV64-NEXT: vs8r.v v24, (a2) # vscale x 64-byte Folded Spill
-; RV64-NEXT: lui a2, %hi(.LCPI25_2)
-; RV64-NEXT: addi a2, a2, %lo(.LCPI25_2)
+; RV64-NEXT: lui a2, %hi(.LCPI26_2)
+; RV64-NEXT: addi a2, a2, %lo(.LCPI26_2)
; RV64-NEXT: li a3, 1040
; RV64-NEXT: vmv.s.x v0, a3
; RV64-NEXT: addi a1, a1, -2016
@@ -1289,12 +1307,12 @@ define {<8 x i64>, <8 x i64>, <8 x i64>, <8 x i64>, <8 x i64>, <8 x i64>} @load_
; RV64-NEXT: add a1, sp, a1
; RV64-NEXT: addi a1, a1, 16
; RV64-NEXT: vs4r.v v8, (a1) # vscale x 32-byte Folded Spill
-; RV64-NEXT: lui a1, %hi(.LCPI25_3)
-; RV64-NEXT: addi a1, a1, %lo(.LCPI25_3)
+; RV64-NEXT: lui a1, %hi(.LCPI26_3)
+; RV64-NEXT: addi a1, a1, %lo(.LCPI26_3)
; RV64-NEXT: vsetivli zero, 16, e16, m2, ta, ma
; RV64-NEXT: vle16.v v20, (a1)
-; RV64-NEXT: lui a1, %hi(.LCPI25_4)
-; RV64-NEXT: addi a1, a1, %lo(.LCPI25_4)
+; RV64-NEXT: lui a1, %hi(.LCPI26_4)
+; RV64-NEXT: addi a1, a1, %lo(.LCPI26_4)
; RV64-NEXT: vle16.v v8, (a1)
; RV64-NEXT: csrr a1, vlenb
; RV64-NEXT: li a2, 77
@@ -1345,8 +1363,8 @@ define {<8 x i64>, <8 x i64>, <8 x i64>, <8 x i64>, <8 x i64>, <8 x i64>} @load_
; RV64-NEXT: vl2r.v v8, (a1) # vscale x 16-byte Folded Reload
; RV64-NEXT: vsetivli zero, 16, e64, m8, ta, ma
; RV64-NEXT: vrgatherei16.vv v0, v16, v8
-; RV64-NEXT: lui a1, %hi(.LCPI25_5)
-; RV64-NEXT: addi a1, a1, %lo(.LCPI25_5)
+; RV64-NEXT: lui a1, %hi(.LCPI26_5)
+; RV64-NEXT: addi a1, a1, %lo(.LCPI26_5)
; RV64-NEXT: vle16.v v20, (a1)
; RV64-NEXT: csrr a1, vlenb
; RV64-NEXT: li a2, 61
@@ -1963,8 +1981,8 @@ define {<4 x i32>, <4 x i32>, <4 x i32>} @invalid_vp_mask(ptr %ptr) {
; RV32-NEXT: vle32.v v12, (a0), v0.t
; RV32-NEXT: li a0, 36
; RV32-NEXT: vmv.s.x v20, a1
-; RV32-NEXT: lui a1, %hi(.LCPI61_0)
-; RV32-NEXT: addi a1, a1, %lo(.LCPI61_0)
+; RV32-NEXT: lui a1, %hi(.LCPI62_0)
+; RV32-NEXT: addi a1, a1, %lo(.LCPI62_0)
; RV32-NEXT: vsetivli zero, 8, e32, m2, ta, ma
; RV32-NEXT: vle16.v v21, (a1)
; RV32-NEXT: vcompress.vm v8, v12, v11
@@ -2039,8 +2057,8 @@ define {<4 x i32>, <4 x i32>, <4 x i32>} @invalid_vp_evl(ptr %ptr) {
; RV32-NEXT: vmv.s.x v10, a0
; RV32-NEXT: li a0, 146
; RV32-NEXT: vmv.s.x v11, a0
-; RV32-NEXT: lui a0, %hi(.LCPI62_0)
-; RV32-NEXT: addi a0, a0, %lo(.LCPI62_0)
+; RV32-NEXT: lui a0, %hi(.LCPI63_0)
+; RV32-NEXT: addi a0, a0, %lo(.LCPI63_0)
; RV32-NEXT: vsetivli zero, 8, e32, m2, ta, ma
; RV32-NEXT: vle16.v v20, (a0)
; RV32-NEXT: li a0, 36
@@ -2181,6 +2199,24 @@ define {<4 x i32>, <4 x i32>} @maskedload_factor3_mask_skip_field(ptr %ptr) {
ret {<4 x i32>, <4 x i32>} %res1
}
+define {<4 x i32>, <4 x i32>} @maskedload_factor3_combined_mask_skip_field(ptr %ptr, <4 x i1> %mask) {
+; CHECK-LABEL: maskedload_factor3_combined_mask_skip_field:
+; CHECK: # %bb.0:
+; CHECK-NEXT: li a1, 12
+; CHECK-NEXT: vsetivli zero, 4, e32, m1, ta, ma
+; CHECK-NEXT: vlsseg2e32.v v8, (a0), a1, v0.t
+; CHECK-NEXT: ret
+ %interleaved.mask = shufflevector <4 x i1> %mask, <4 x i1> poison, <12 x i32> <i32 0, i32 0, i32 0, i32 1, i32 1, i32 1, i32 2, i32 2, i32 2, i32 3, i32 3, i32 3>
+ %combined = and <12 x i1> %interleaved.mask, <i1 true, i1 true, i1 false, i1 true, i1 true, i1 false, i1 true, i1 true, i1 false, i1 true, i1 true, i1 false>
+ %interleaved.vec = tail call <12 x i32> @llvm.masked.load.v12i32.p0(ptr %ptr, i32 4, <12 x i1> %combined, <12 x i32> poison)
+ ; mask = %mask, skip the last field
+ %v0 = shufflevector <12 x i32> %interleaved.vec, <12 x i32> poison, <4 x i32> <i32 0, i32 3, i32 6, i32 9>
+ %v1 = shufflevector <12 x i32> %interleaved.vec, <12 x i32> poison, <4 x i32> <i32 1, i32 4, i32 7, i32 10>
+ %res0 = insertvalue {<4 x i32>, <4 x i32>} undef, <4 x i32> %v0, 0
+ %res1 = insertvalue {<4 x i32>, <4 x i32>} %res0, <4 x i32> %v1, 1
+ ret {<4 x i32>, <4 x i32>} %res1
+}
+
; We can only skip the last field for now.
define {<4 x i32>, <4 x i32>, <4 x i32>} @maskedload_factor3_invalid_skip_field(ptr %ptr) {
; RV32-LABEL: maskedload_factor3_invalid_skip_field:
@@ -2198,8 +2234,8 @@ define {<4 x i32>, <4 x i32>, <4 x i32>} @maskedload_factor3_invalid_skip_field(
; RV32-NEXT: vle32.v v12, (a0), v0.t
; RV32-NEXT: li a0, 36
; RV32-NEXT: vmv.s.x v20, a1
-; RV32-NEXT: lui a1, %hi(.LCPI68_0)
-; RV32-NEXT: addi a1, a1, %lo(.LCPI68_0)
+; RV32-NEXT: lui a1, %hi(.LCPI70_0)
+; RV32-NEXT: addi a1, a1, %lo(.LCPI70_0)
; RV32-NEXT: vsetivli zero, 8, e32, m2, ta, ma
; RV32-NEXT: vle16.v v21, (a1)
; RV32-NEXT: vcompress.vm v8, v12, v11
You can test this locally with the following command:

git diff -U0 --pickaxe-regex -S '([^a-zA-Z0-9#_-]undef[^a-zA-Z0-9_-]|UndefValue::get)' 'HEAD~1' HEAD llvm/lib/CodeGen/InterleavedAccessPass.cpp llvm/test/CodeGen/RISCV/rvv/fixed-vectors-interleaved-access.ll

The following files introduce new uses of undef:

- llvm/test/CodeGen/RISCV/rvv/fixed-vectors-interleaved-access.ll

Undef is now deprecated and should only be used in the rare cases where no replacement is possible. For example, a load of uninitialized memory yields undef. You should use poison values for placeholders instead.

In tests, avoid using undef and having tests that trigger undefined behavior. If you need an operand with some unimportant value, you can add a new argument to the function and use that instead.

For example, this is considered a bad practice:

define void @fn() {
  ...
  br i1 undef, ...
}

Please use the following instead:

define void @fn(i1 %cond) {
  ...
  br i1 %cond, ...
}

Please refer to the Undefined Behavior Manual for more information.
assert(!isa<Constant>(AndMaskLHS) &&
       "expect constants to be folded already");
getGapMask(*AndMaskRHS, Factor, LeafValueEC.getFixedValue(), GapMask);
return {getMask(AndMaskLHS, Factor, LeafValueEC).first, GapMask}; |
I don't think this works as written. Consider the case where MaskLHS has segment 3 inactive, and the and mask has segment 2 inactive. Reporting only segment 2 would be wrong, wouldn't it? I think you need to merge the gap masks.
Edit: This isn't a correctness issue, it's a missed optimization.
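A hypothetical IR instance of the scenario described above (factor 4, <8 x i1> wide mask, so element i belongs to field i % 4; the value names are illustrative, not taken from the patch):

  %lhs = and <8 x i1> %interleaved.mask, <i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false> ; field 3 inactive
  %c = and <8 x i1> %lhs, <i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true>                ; field 2 inactive
  ; Taking the gap mask only from the outermost constant would report field 2
  ; as the sole gap, while %c actually has both field 2 and field 3 inactive.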
> I think you need to merge the gap masks.
There are two cases where the gap mask from the LHS would not be all-ones: (1) the LHS is a constant mask, or (2) the LHS is also a mask assembled by AND.
For case (1), I think constant folding should already handle that (hence the assertion in the line above); for case (2), namely "multi-layer" bitwise AND, I think constant folding should also take care of it and turn it into a single layer AND with LHS being non-constant and RHS being a constant.
Constant folded by InstCombine or something in this pass?
> Constant folded by InstCombine or something in this pass?
InstCombine
> InstCombine
I don't think it is a good idea to make assumptions about what earlier passes will do. We don't have to produce optimal code if both operands are constants or the constant is on the left hand side instead of the right. But we cannot write an assert that says it will not happen.
> But we cannot write an assert that says it will not happen.
I think that's a fair point. I've updated the patch to make the logic more general: now it will try to merge both the deinterleaved masks and the gap masks from LHS & RHS.
LGTM
For a deinterleaved masked.load / vp.load, if its mask, %c, is synthesized by the following snippet:
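A sketch of that snippet, mirroring the factor-3 tests added in this patch (it assumes %mask is the original <4 x i1> per-segment mask; the constant operand of the and plays the role of the gap mask %g):

  %s = shufflevector <4 x i1> %mask, <4 x i1> poison, <12 x i32> <i32 0, i32 0, i32 0, i32 1, i32 1, i32 1, i32 2, i32 2, i32 2, i32 3, i32 3, i32 3>
  %c = and <12 x i1> %s, <i1 true, i1 true, i1 false, i1 true, i1 true, i1 false, i1 true, i1 true, i1 false, i1 true, i1 true, i1 false> ; the constant here is %g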
Then we know that %g is the gap mask and %s is the mask for each field / component. This patch teaches the InterleavedAccess pass to recognize such a pattern.

Split out from #151612