[InstCombine] Split GEPs with multiple variable indices #137297
Conversation
(#137323) Support the ptrtoint(gep null, x) -> x and ptrtoint(gep inttoptr(x), y) -> x+y folds for the case where there is a chain of GEPs that ends in null or inttoptr. This avoids some regressions from #137297. While here, also be a bit more careful about edge cases like pointer-to-vector splats and mismatched pointer and index sizes.
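A minimal IR sketch of the null-base fold, adapted from the ptrtoint-nullgep.ll tests further down (assuming a 64-bit pointer and index size):

```llvm
; Before: a GEP based on null, followed by ptrtoint.
define i64 @nullgep(i64 %x) {
  %p = getelementptr i16, ptr addrspace(1) null, i64 %x
  %i = ptrtoint ptr addrspace(1) %p to i64
  ret i64 %i
}

; After: the address is just the GEP's byte offset, here %x * 2.
define i64 @nullgep.folded(i64 %x) {
  %i = shl i64 %x, 1
  ret i64 %i
}
```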
(#142958) Currently OptimizePointerDifference() only handles single GEPs with a common base, not GEP chains. This patch generalizes the support to nested GEPs with a common base. Finding the common base is a bit annoying because we want to stop as soon as possible and not recurse into common GEP prefixes. This helps avoid regressions from #137297.
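A hypothetical sketch of the pattern that generalization targets (the function and value names are illustrative, not taken from the patch): a pointer difference where one side reaches the common base only through a chain of GEPs.

```llvm
define i64 @ptrdiff_chain(ptr %base, i64 %x, i64 %y) {
  %g1 = getelementptr i8, ptr %base, i64 %x
  %g2 = getelementptr i8, ptr %g1, i64 %y
  %a = ptrtoint ptr %g2 to i64
  %b = ptrtoint ptr %base to i64
  ; With GEP-chain support, this difference can fold to add i64 %x, %y.
  %d = sub i64 %a, %b
  ret i64 %d
}
```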
(#143714) This depth limits a linear search (rather than the usual potentially exponential one) and is not particularly important for compile-time in practice. The change in #137297 is going to increase the length of GEP chains, so I'd like to increase this limit a bit to reduce the chance of regressions (dtcxzyw/llvm-opt-benchmark#2419 showed a 13% increase in SearchLimitReached). There is no particular significance to the new value of 10. Compile-time is neutral.
Fold icmp between a chain of GEPs and its base pointer. Previously only a single GEP was supported. This will be extended to handle the case of two GEP chains with a common base in a follow-up. This helps avoid regressions after #137297.
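A hypothetical sketch of that icmp fold (illustrative names): an equality comparison against the base reduces to a check on the accumulated byte offset.

```llvm
define i1 @cmp_chain_base(ptr %p, i64 %x, i64 %y) {
  %g1 = getelementptr i8, ptr %p, i64 %x
  %g2 = getelementptr i8, ptr %g1, i64 %y
  ; %g2 == %p exactly when the total offset is zero, so this can fold to:
  ;   %sum = add i64 %x, %y
  ;   %c = icmp eq i64 %sum, 0
  %c = icmp eq ptr %g2, %p
  ret i1 %c
}
```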
Split GEPs that have more than one variable index into two. This is in preparation for the ptradd migration, which will not support multi-index GEPs.
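For illustration, the shape of the split, taken in spirit from the updated gep-vector.ll tests below: the first variable index goes into a front GEP, and the remaining indices are re-rooted on it with a leading zero index.

```llvm
; Before: one GEP with two variable indices.
%gep = getelementptr [7 x i32], ptr %x, i64 %y, i64 %z

; After: a front GEP with the first variable index, then the rest.
%gep.split = getelementptr [7 x i32], ptr %x, i64 %y
%gep = getelementptr [7 x i32], ptr %gep.split, i64 0, i64 %z
```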
@llvm/pr-subscribers-llvm-transforms

Author: Nikita Popov (nikic)

Changes

Split GEPs that have more than one variable index into two. This is in preparation for the ptradd migration, which will not support multi-index GEPs.

Patch is 23.37 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/137297.diff

9 Files Affected:
diff --git a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
index e2a9255ca9c6e..873b104bc0944 100644
--- a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
@@ -2777,6 +2777,12 @@ Instruction *InstCombinerImpl::visitGEPOfGEP(GetElementPtrInst &GEP,
Indices.append(GEP.idx_begin()+1, GEP.idx_end());
}
+ // Don't create GEPs with more than one variable index.
+ unsigned NumVarIndices =
+ count_if(Indices, [](Value *Idx) { return !isa<Constant>(Idx); });
+ if (NumVarIndices > 1)
+ return nullptr;
+
if (!Indices.empty())
return replaceInstUsesWith(
GEP, Builder.CreateGEP(
@@ -3225,6 +3231,30 @@ Instruction *InstCombinerImpl::visitGetElementPtrInst(GetElementPtrInst &GEP) {
return replaceInstUsesWith(GEP, Res);
}
+ bool SeenVarIndex = false;
+ for (auto [IdxNum, Idx] : enumerate(Indices)) {
+ if (isa<Constant>(Idx))
+ continue;
+
+ if (!SeenVarIndex) {
+ SeenVarIndex = true;
+ continue;
+ }
+
+ // GEP has multiple variable indices: Split it.
+ ArrayRef<Value *> FrontIndices = ArrayRef(Indices).take_front(IdxNum);
+ Value *FrontGEP =
+ Builder.CreateGEP(GEPEltType, PtrOp, FrontIndices,
+ GEP.getName() + ".split", GEP.getNoWrapFlags());
+
+ SmallVector<Value *> BackIndices;
+ BackIndices.push_back(Constant::getNullValue(NewScalarIndexTy));
+ append_range(BackIndices, drop_begin(Indices, IdxNum));
+ return GetElementPtrInst::Create(
+ GetElementPtrInst::getIndexedType(GEPEltType, FrontIndices), FrontGEP,
+ BackIndices, GEP.getNoWrapFlags());
+ }
+
// Check to see if the inputs to the PHI node are getelementptr instructions.
if (auto *PN = dyn_cast<PHINode>(PtrOp)) {
if (Value *NewPtrOp = foldGEPOfPhi(GEP, PN, Builder))
diff --git a/llvm/test/Transforms/InstCombine/canonicalize-gep-constglob.ll b/llvm/test/Transforms/InstCombine/canonicalize-gep-constglob.ll
index 1520d6ce59548..07c8a8c6b90e1 100644
--- a/llvm/test/Transforms/InstCombine/canonicalize-gep-constglob.ll
+++ b/llvm/test/Transforms/InstCombine/canonicalize-gep-constglob.ll
@@ -35,7 +35,9 @@ define ptr @xzy(i64 %x, i64 %y, i64 %z) {
; CHECK-LABEL: define ptr @xzy(
; CHECK-SAME: i64 [[X:%.*]], i64 [[Y:%.*]], i64 [[Z:%.*]]) {
; CHECK-NEXT: entry:
-; CHECK-NEXT: [[GEP:%.*]] = getelementptr inbounds [10 x [10 x [10 x i32]]], ptr getelementptr inbounds nuw (i8, ptr @glob, i64 40), i64 0, i64 [[X]], i64 [[Z]], i64 [[Y]]
+; CHECK-NEXT: [[GEP_SPLIT:%.*]] = getelementptr inbounds [10 x [10 x [10 x i32]]], ptr getelementptr inbounds nuw (i8, ptr @glob, i64 40), i64 0, i64 [[X]]
+; CHECK-NEXT: [[GEP_SPLIT1:%.*]] = getelementptr inbounds [10 x [10 x i32]], ptr [[GEP_SPLIT]], i64 0, i64 [[Z]]
+; CHECK-NEXT: [[GEP:%.*]] = getelementptr inbounds [10 x i32], ptr [[GEP_SPLIT1]], i64 0, i64 [[Y]]
; CHECK-NEXT: ret ptr [[GEP]]
;
entry:
diff --git a/llvm/test/Transforms/InstCombine/gep-vector.ll b/llvm/test/Transforms/InstCombine/gep-vector.ll
index 5546cb36d2f55..313c7ef32fe11 100644
--- a/llvm/test/Transforms/InstCombine/gep-vector.ll
+++ b/llvm/test/Transforms/InstCombine/gep-vector.ll
@@ -31,7 +31,8 @@ define <2 x ptr> @vectorindex3() {
define ptr @bitcast_vec_to_array_gep(ptr %x, i64 %y, i64 %z) {
; CHECK-LABEL: @bitcast_vec_to_array_gep(
-; CHECK-NEXT: [[GEP:%.*]] = getelementptr [7 x i32], ptr [[X:%.*]], i64 [[Y:%.*]], i64 [[Z:%.*]]
+; CHECK-NEXT: [[GEP_SPLIT:%.*]] = getelementptr [7 x i32], ptr [[X:%.*]], i64 [[Y:%.*]]
+; CHECK-NEXT: [[GEP:%.*]] = getelementptr [7 x i32], ptr [[GEP_SPLIT]], i64 0, i64 [[Z:%.*]]
; CHECK-NEXT: ret ptr [[GEP]]
;
%gep = getelementptr [7 x i32], ptr %x, i64 %y, i64 %z
@@ -42,7 +43,8 @@ define ptr @bitcast_vec_to_array_gep(ptr %x, i64 %y, i64 %z) {
define ptr @bitcast_array_to_vec_gep(ptr %x, i64 %y, i64 %z) {
; CHECK-LABEL: @bitcast_array_to_vec_gep(
-; CHECK-NEXT: [[GEP:%.*]] = getelementptr inbounds <3 x i32>, ptr [[X:%.*]], i64 [[Y:%.*]], i64 [[Z:%.*]]
+; CHECK-NEXT: [[GEP_SPLIT:%.*]] = getelementptr inbounds <3 x i32>, ptr [[X:%.*]], i64 [[Y:%.*]]
+; CHECK-NEXT: [[GEP:%.*]] = getelementptr inbounds <3 x i32>, ptr [[GEP_SPLIT]], i64 0, i64 [[Z:%.*]]
; CHECK-NEXT: ret ptr [[GEP]]
;
%gep = getelementptr inbounds <3 x i32>, ptr %x, i64 %y, i64 %z
@@ -53,7 +55,8 @@ define ptr @bitcast_array_to_vec_gep(ptr %x, i64 %y, i64 %z) {
define ptr @bitcast_vec_to_array_gep_matching_alloc_size(ptr %x, i64 %y, i64 %z) {
; CHECK-LABEL: @bitcast_vec_to_array_gep_matching_alloc_size(
-; CHECK-NEXT: [[GEP:%.*]] = getelementptr [4 x i32], ptr [[X:%.*]], i64 [[Y:%.*]], i64 [[Z:%.*]]
+; CHECK-NEXT: [[GEP_SPLIT:%.*]] = getelementptr [4 x i32], ptr [[X:%.*]], i64 [[Y:%.*]]
+; CHECK-NEXT: [[GEP:%.*]] = getelementptr [4 x i32], ptr [[GEP_SPLIT]], i64 0, i64 [[Z:%.*]]
; CHECK-NEXT: ret ptr [[GEP]]
;
%gep = getelementptr [4 x i32], ptr %x, i64 %y, i64 %z
@@ -64,7 +67,8 @@ define ptr @bitcast_vec_to_array_gep_matching_alloc_size(ptr %x, i64 %y, i64 %z)
define ptr @bitcast_array_to_vec_gep_matching_alloc_size(ptr %x, i64 %y, i64 %z) {
; CHECK-LABEL: @bitcast_array_to_vec_gep_matching_alloc_size(
-; CHECK-NEXT: [[GEP:%.*]] = getelementptr inbounds <4 x i32>, ptr [[X:%.*]], i64 [[Y:%.*]], i64 [[Z:%.*]]
+; CHECK-NEXT: [[GEP_SPLIT:%.*]] = getelementptr inbounds <4 x i32>, ptr [[X:%.*]], i64 [[Y:%.*]]
+; CHECK-NEXT: [[GEP:%.*]] = getelementptr inbounds <4 x i32>, ptr [[GEP_SPLIT]], i64 0, i64 [[Z:%.*]]
; CHECK-NEXT: ret ptr [[GEP]]
;
%gep = getelementptr inbounds <4 x i32>, ptr %x, i64 %y, i64 %z
@@ -76,7 +80,8 @@ define ptr @bitcast_array_to_vec_gep_matching_alloc_size(ptr %x, i64 %y, i64 %z)
define ptr addrspace(3) @bitcast_vec_to_array_addrspace(ptr %x, i64 %y, i64 %z) {
; CHECK-LABEL: @bitcast_vec_to_array_addrspace(
; CHECK-NEXT: [[ASC:%.*]] = addrspacecast ptr [[X:%.*]] to ptr addrspace(3)
-; CHECK-NEXT: [[GEP:%.*]] = getelementptr [7 x i32], ptr addrspace(3) [[ASC]], i64 [[Y:%.*]], i64 [[Z:%.*]]
+; CHECK-NEXT: [[GEP_SPLIT:%.*]] = getelementptr [7 x i32], ptr addrspace(3) [[ASC]], i64 [[Y:%.*]]
+; CHECK-NEXT: [[GEP:%.*]] = getelementptr [7 x i32], ptr addrspace(3) [[GEP_SPLIT]], i64 0, i64 [[Z:%.*]]
; CHECK-NEXT: ret ptr addrspace(3) [[GEP]]
;
%asc = addrspacecast ptr %x to ptr addrspace(3)
@@ -89,7 +94,8 @@ define ptr addrspace(3) @bitcast_vec_to_array_addrspace(ptr %x, i64 %y, i64 %z)
define ptr addrspace(3) @inbounds_bitcast_vec_to_array_addrspace(ptr %x, i64 %y, i64 %z) {
; CHECK-LABEL: @inbounds_bitcast_vec_to_array_addrspace(
; CHECK-NEXT: [[ASC:%.*]] = addrspacecast ptr [[X:%.*]] to ptr addrspace(3)
-; CHECK-NEXT: [[GEP:%.*]] = getelementptr inbounds [7 x i32], ptr addrspace(3) [[ASC]], i64 [[Y:%.*]], i64 [[Z:%.*]]
+; CHECK-NEXT: [[GEP_SPLIT:%.*]] = getelementptr inbounds [7 x i32], ptr addrspace(3) [[ASC]], i64 [[Y:%.*]]
+; CHECK-NEXT: [[GEP:%.*]] = getelementptr inbounds [7 x i32], ptr addrspace(3) [[GEP_SPLIT]], i64 0, i64 [[Z:%.*]]
; CHECK-NEXT: ret ptr addrspace(3) [[GEP]]
;
%asc = addrspacecast ptr %x to ptr addrspace(3)
@@ -102,7 +108,8 @@ define ptr addrspace(3) @inbounds_bitcast_vec_to_array_addrspace(ptr %x, i64 %y,
define ptr addrspace(3) @bitcast_vec_to_array_addrspace_matching_alloc_size(ptr %x, i64 %y, i64 %z) {
; CHECK-LABEL: @bitcast_vec_to_array_addrspace_matching_alloc_size(
; CHECK-NEXT: [[ASC:%.*]] = addrspacecast ptr [[X:%.*]] to ptr addrspace(3)
-; CHECK-NEXT: [[GEP:%.*]] = getelementptr [4 x i32], ptr addrspace(3) [[ASC]], i64 [[Y:%.*]], i64 [[Z:%.*]]
+; CHECK-NEXT: [[GEP_SPLIT:%.*]] = getelementptr [4 x i32], ptr addrspace(3) [[ASC]], i64 [[Y:%.*]]
+; CHECK-NEXT: [[GEP:%.*]] = getelementptr [4 x i32], ptr addrspace(3) [[GEP_SPLIT]], i64 0, i64 [[Z:%.*]]
; CHECK-NEXT: ret ptr addrspace(3) [[GEP]]
;
%asc = addrspacecast ptr %x to ptr addrspace(3)
@@ -115,7 +122,8 @@ define ptr addrspace(3) @bitcast_vec_to_array_addrspace_matching_alloc_size(ptr
define ptr addrspace(3) @inbounds_bitcast_vec_to_array_addrspace_matching_alloc_size(ptr %x, i64 %y, i64 %z) {
; CHECK-LABEL: @inbounds_bitcast_vec_to_array_addrspace_matching_alloc_size(
; CHECK-NEXT: [[ASC:%.*]] = addrspacecast ptr [[X:%.*]] to ptr addrspace(3)
-; CHECK-NEXT: [[GEP:%.*]] = getelementptr inbounds [4 x i32], ptr addrspace(3) [[ASC]], i64 [[Y:%.*]], i64 [[Z:%.*]]
+; CHECK-NEXT: [[GEP_SPLIT:%.*]] = getelementptr inbounds [4 x i32], ptr addrspace(3) [[ASC]], i64 [[Y:%.*]]
+; CHECK-NEXT: [[GEP:%.*]] = getelementptr inbounds [4 x i32], ptr addrspace(3) [[GEP_SPLIT]], i64 0, i64 [[Z:%.*]]
; CHECK-NEXT: ret ptr addrspace(3) [[GEP]]
;
%asc = addrspacecast ptr %x to ptr addrspace(3)
diff --git a/llvm/test/Transforms/InstCombine/getelementptr.ll b/llvm/test/Transforms/InstCombine/getelementptr.ll
index bb0a94cb01494..7c1342a7db84b 100644
--- a/llvm/test/Transforms/InstCombine/getelementptr.ll
+++ b/llvm/test/Transforms/InstCombine/getelementptr.ll
@@ -1380,7 +1380,7 @@ define ptr @gep_of_gep_multiuse_var_and_var(ptr %p, i64 %idx, i64 %idx2) {
; CHECK-LABEL: @gep_of_gep_multiuse_var_and_var(
; CHECK-NEXT: [[GEP1:%.*]] = getelementptr [4 x i32], ptr [[P:%.*]], i64 [[IDX:%.*]]
; CHECK-NEXT: call void @use(ptr [[GEP1]])
-; CHECK-NEXT: [[GEP2:%.*]] = getelementptr [4 x i32], ptr [[P]], i64 [[IDX]], i64 [[IDX2:%.*]]
+; CHECK-NEXT: [[GEP2:%.*]] = getelementptr [4 x i32], ptr [[GEP1]], i64 0, i64 [[IDX2:%.*]]
; CHECK-NEXT: ret ptr [[GEP2]]
;
%gep1 = getelementptr [4 x i32], ptr %p, i64 %idx
@@ -1922,8 +1922,9 @@ define ptr @gep_merge_nusw(ptr %p, i64 %x, i64 %y) {
define ptr @gep_merge_nuw_add_zero(ptr %p, i64 %idx, i64 %idx2) {
; CHECK-LABEL: @gep_merge_nuw_add_zero(
-; CHECK-NEXT: [[GEP:%.*]] = getelementptr nuw [2 x i32], ptr [[P:%.*]], i64 [[IDX:%.*]], i64 [[IDX2:%.*]]
-; CHECK-NEXT: ret ptr [[GEP]]
+; CHECK-NEXT: [[GEP:%.*]] = getelementptr nuw [2 x i32], ptr [[GEP_SPLIT:%.*]], i64 [[IDX2:%.*]]
+; CHECK-NEXT: [[GEP1:%.*]] = getelementptr nuw [2 x i32], ptr [[GEP]], i64 0, i64 [[IDX3:%.*]]
+; CHECK-NEXT: ret ptr [[GEP1]]
;
%gep1 = getelementptr nuw [2 x i32], ptr %p, i64 %idx
%gep = getelementptr nuw [2 x i32], ptr %gep1, i64 0, i64 %idx2
@@ -1935,7 +1936,8 @@ define ptr @gep_merge_nuw_add_zero(ptr %p, i64 %idx, i64 %idx2) {
; after the merge.
define ptr @gep_merge_nusw_add_zero(ptr %p, i64 %idx, i64 %idx2) {
; CHECK-LABEL: @gep_merge_nusw_add_zero(
-; CHECK-NEXT: [[GEP:%.*]] = getelementptr [2 x i32], ptr [[P:%.*]], i64 [[IDX:%.*]], i64 [[IDX2:%.*]]
+; CHECK-NEXT: [[GEP1:%.*]] = getelementptr nusw [2 x i32], ptr [[P:%.*]], i64 [[IDX:%.*]]
+; CHECK-NEXT: [[GEP:%.*]] = getelementptr nusw [2 x i32], ptr [[GEP1]], i64 0, i64 [[IDX2:%.*]]
; CHECK-NEXT: ret ptr [[GEP]]
;
%gep1 = getelementptr nusw [2 x i32], ptr %p, i64 %idx
diff --git a/llvm/test/Transforms/InstCombine/icmp-gep.ll b/llvm/test/Transforms/InstCombine/icmp-gep.ll
index aede844e04ac7..5044850ec918f 100644
--- a/llvm/test/Transforms/InstCombine/icmp-gep.ll
+++ b/llvm/test/Transforms/InstCombine/icmp-gep.ll
@@ -686,13 +686,13 @@ define i1 @test_scalable_ij(ptr %foo, i64 %i, i64 %j) {
define i1 @gep_nuw(ptr %p, i64 %a, i64 %b, i64 %c, i64 %d) {
; CHECK-LABEL: @gep_nuw(
-; CHECK-NEXT: [[GEP1_IDX:%.*]] = shl nuw i64 [[A:%.*]], 2
-; CHECK-NEXT: [[GEP1_IDX1:%.*]] = shl nuw i64 [[B:%.*]], 1
-; CHECK-NEXT: [[GEP1_OFFS:%.*]] = add nuw i64 [[GEP1_IDX]], [[GEP1_IDX1]]
-; CHECK-NEXT: [[GEP2_IDX:%.*]] = shl nuw i64 [[C:%.*]], 3
-; CHECK-NEXT: [[GEP2_IDX2:%.*]] = shl nuw i64 [[D:%.*]], 2
-; CHECK-NEXT: [[GEP2_OFFS:%.*]] = add nuw i64 [[GEP2_IDX]], [[GEP2_IDX2]]
-; CHECK-NEXT: [[CMP:%.*]] = icmp eq i64 [[GEP1_OFFS]], [[GEP2_OFFS]]
+; CHECK-NEXT: [[GEP1_SPLIT_IDX:%.*]] = shl nuw i64 [[A:%.*]], 2
+; CHECK-NEXT: [[GEP1_IDX:%.*]] = shl nuw i64 [[B:%.*]], 1
+; CHECK-NEXT: [[TMP1:%.*]] = add nuw i64 [[GEP1_SPLIT_IDX]], [[GEP1_IDX]]
+; CHECK-NEXT: [[GEP2_SPLIT_IDX:%.*]] = shl nuw i64 [[C:%.*]], 3
+; CHECK-NEXT: [[GEP2_IDX:%.*]] = shl nuw i64 [[D:%.*]], 2
+; CHECK-NEXT: [[TMP2:%.*]] = add nuw i64 [[GEP2_SPLIT_IDX]], [[GEP2_IDX]]
+; CHECK-NEXT: [[CMP:%.*]] = icmp eq i64 [[TMP1]], [[TMP2]]
; CHECK-NEXT: ret i1 [[CMP]]
;
%gep1 = getelementptr nuw [2 x i16], ptr %p, i64 %a, i64 %b
@@ -703,13 +703,13 @@ define i1 @gep_nuw(ptr %p, i64 %a, i64 %b, i64 %c, i64 %d) {
define i1 @gep_nusw(ptr %p, i64 %a, i64 %b, i64 %c, i64 %d) {
; CHECK-LABEL: @gep_nusw(
-; CHECK-NEXT: [[GEP1_IDX:%.*]] = shl nsw i64 [[A:%.*]], 2
-; CHECK-NEXT: [[GEP1_IDX1:%.*]] = shl nsw i64 [[B:%.*]], 1
-; CHECK-NEXT: [[GEP1_OFFS:%.*]] = add nsw i64 [[GEP1_IDX]], [[GEP1_IDX1]]
-; CHECK-NEXT: [[GEP2_IDX:%.*]] = shl nsw i64 [[C:%.*]], 3
-; CHECK-NEXT: [[GEP2_IDX2:%.*]] = shl nsw i64 [[D:%.*]], 2
-; CHECK-NEXT: [[GEP2_OFFS:%.*]] = add nsw i64 [[GEP2_IDX]], [[GEP2_IDX2]]
-; CHECK-NEXT: [[CMP:%.*]] = icmp eq i64 [[GEP1_OFFS]], [[GEP2_OFFS]]
+; CHECK-NEXT: [[GEP1_SPLIT_IDX:%.*]] = shl nsw i64 [[A:%.*]], 2
+; CHECK-NEXT: [[GEP1_IDX:%.*]] = shl nsw i64 [[B:%.*]], 1
+; CHECK-NEXT: [[TMP1:%.*]] = add i64 [[GEP1_SPLIT_IDX]], [[GEP1_IDX]]
+; CHECK-NEXT: [[GEP2_SPLIT_IDX:%.*]] = shl nsw i64 [[C:%.*]], 3
+; CHECK-NEXT: [[GEP2_IDX:%.*]] = shl nsw i64 [[D:%.*]], 2
+; CHECK-NEXT: [[TMP2:%.*]] = add i64 [[GEP2_SPLIT_IDX]], [[GEP2_IDX]]
+; CHECK-NEXT: [[CMP:%.*]] = icmp eq i64 [[TMP1]], [[TMP2]]
; CHECK-NEXT: ret i1 [[CMP]]
;
%gep1 = getelementptr nusw [2 x i16], ptr %p, i64 %a, i64 %b
diff --git a/llvm/test/Transforms/InstCombine/loadstore-alignment.ll b/llvm/test/Transforms/InstCombine/loadstore-alignment.ll
index 098f2eee52df0..2cdc73e4cff18 100644
--- a/llvm/test/Transforms/InstCombine/loadstore-alignment.ll
+++ b/llvm/test/Transforms/InstCombine/loadstore-alignment.ll
@@ -33,7 +33,8 @@ define <2 x i64> @hem_2d(i32 %i, i32 %j) {
; CHECK-LABEL: @hem_2d(
; CHECK-NEXT: [[TMP1:%.*]] = sext i32 [[I:%.*]] to i64
; CHECK-NEXT: [[TMP2:%.*]] = sext i32 [[J:%.*]] to i64
-; CHECK-NEXT: [[T:%.*]] = getelementptr [13 x <2 x i64>], ptr @xx, i64 [[TMP1]], i64 [[TMP2]]
+; CHECK-NEXT: [[T_SPLIT:%.*]] = getelementptr [13 x <2 x i64>], ptr @xx, i64 [[TMP1]]
+; CHECK-NEXT: [[T:%.*]] = getelementptr [13 x <2 x i64>], ptr [[T_SPLIT]], i64 0, i64 [[TMP2]]
; CHECK-NEXT: [[L:%.*]] = load <2 x i64>, ptr [[T]], align 1
; CHECK-NEXT: ret <2 x i64> [[L]]
;
@@ -90,7 +91,8 @@ define void @hem_2d_store(i32 %i, i32 %j, <2 x i64> %y) {
; CHECK-LABEL: @hem_2d_store(
; CHECK-NEXT: [[TMP1:%.*]] = sext i32 [[I:%.*]] to i64
; CHECK-NEXT: [[TMP2:%.*]] = sext i32 [[J:%.*]] to i64
-; CHECK-NEXT: [[T:%.*]] = getelementptr [13 x <2 x i64>], ptr @xx, i64 [[TMP1]], i64 [[TMP2]]
+; CHECK-NEXT: [[T_SPLIT:%.*]] = getelementptr [13 x <2 x i64>], ptr @xx, i64 [[TMP1]]
+; CHECK-NEXT: [[T:%.*]] = getelementptr [13 x <2 x i64>], ptr [[T_SPLIT]], i64 0, i64 [[TMP2]]
; CHECK-NEXT: store <2 x i64> [[Y:%.*]], ptr [[T]], align 1
; CHECK-NEXT: ret void
;
diff --git a/llvm/test/Transforms/InstCombine/pr58901.ll b/llvm/test/Transforms/InstCombine/pr58901.ll
index 1eea4b5305545..3d836aae77001 100644
--- a/llvm/test/Transforms/InstCombine/pr58901.ll
+++ b/llvm/test/Transforms/InstCombine/pr58901.ll
@@ -15,7 +15,8 @@ define ptr @f1(ptr %arg, i64 %arg1) {
define ptr @f2(ptr %arg, i64 %arg1) {
; CHECK-LABEL: @f2(
; CHECK-NEXT: [[TMP1:%.*]] = getelementptr i8, ptr [[ARG:%.*]], i64 72
-; CHECK-NEXT: [[TMP2:%.*]] = getelementptr [6 x i32], ptr [[TMP1]], i64 [[ARG1:%.*]], i64 [[ARG1]]
+; CHECK-NEXT: [[DOTSPLIT:%.*]] = getelementptr [6 x i32], ptr [[TMP1]], i64 [[ARG1:%.*]]
+; CHECK-NEXT: [[TMP2:%.*]] = getelementptr [6 x i32], ptr [[DOTSPLIT]], i64 0, i64 [[ARG1]]
; CHECK-NEXT: ret ptr [[TMP2]]
;
%1 = getelementptr [6 x i32], ptr %arg, i64 3
diff --git a/llvm/test/Transforms/InstCombine/ptrtoint-nullgep.ll b/llvm/test/Transforms/InstCombine/ptrtoint-nullgep.ll
index c5282ce40acb1..17a9d540983ca 100644
--- a/llvm/test/Transforms/InstCombine/ptrtoint-nullgep.ll
+++ b/llvm/test/Transforms/InstCombine/ptrtoint-nullgep.ll
@@ -548,10 +548,10 @@ define i64 @fold_ptrtoint_nested_array_two_vars(i64 %x, i64 %y) {
;
; INSTCOMBINE-LABEL: define {{[^@]+}}@fold_ptrtoint_nested_array_two_vars
; INSTCOMBINE-SAME: (i64 [[X:%.*]], i64 [[Y:%.*]]) {
-; INSTCOMBINE-NEXT: [[PTR_IDX:%.*]] = shl i64 [[X]], 2
-; INSTCOMBINE-NEXT: [[PTR_IDX1:%.*]] = shl i64 [[Y]], 1
-; INSTCOMBINE-NEXT: [[PTR_OFFS:%.*]] = add i64 [[PTR_IDX]], [[PTR_IDX1]]
-; INSTCOMBINE-NEXT: ret i64 [[PTR_OFFS]]
+; INSTCOMBINE-NEXT: [[PTR_SPLIT_IDX:%.*]] = shl i64 [[X]], 2
+; INSTCOMBINE-NEXT: [[PTR_IDX:%.*]] = shl i64 [[Y]], 1
+; INSTCOMBINE-NEXT: [[RET:%.*]] = add i64 [[PTR_SPLIT_IDX]], [[PTR_IDX]]
+; INSTCOMBINE-NEXT: ret i64 [[RET]]
;
%ptr = getelementptr [2 x i16], ptr addrspace(1) null, i64 %x, i64 %y
@@ -574,10 +574,10 @@ define i64 @fold_ptrtoint_nested_array_two_vars_plus_zero(i64 %x, i64 %y) {
;
; INSTCOMBINE-LABEL: define {{[^@]+}}@fold_ptrtoint_nested_array_two_vars_plus_zero
; INSTCOMBINE-SAME: (i64 [[X:%.*]], i64 [[Y:%.*]]) {
-; INSTCOMBINE-NEXT: [[PTR_IDX:%.*]] = shl i64 [[X]], 3
-; INSTCOMBINE-NEXT: [[PTR_IDX1:%.*]] = shl i64 [[Y]], 2
-; INSTCOMBINE-NEXT: [[PTR_OFFS:%.*]] = add i64 [[PTR_IDX]], [[PTR_IDX1]]
-; INSTCOMBINE-NEXT: ret i64 [[PTR_OFFS]]
+; INSTCOMBINE-NEXT: [[PTR_SPLIT_IDX:%.*]] = shl i64 [[X]], 3
+; INSTCOMBINE-NEXT: [[PTR_IDX:%.*]] = shl i64 [[Y]], 2
+; INSTCOMBINE-NEXT: [[RET:%.*]] = add i64 [[PTR_SPLIT_IDX]], [[PTR_IDX]]
+; INSTCOMBINE-NEXT: ret i64 [[RET]]
;
%ptr = getelementptr [2 x [2 x i16]], ptr addrspace(1) null, i64 %x, i64 %y, i64 0
%ret = ptrtoint ptr addrspace(1) %ptr to i64
@@ -599,11 +599,11 @@ define i64 @fold_ptrtoint_nested_array_two_vars_plus_const(i64 %x, i64 %y) {
;
; INSTCOMBINE-LABEL: define {{[^@]+}}@fold_ptrtoint_nested_array_two_vars_plus_const
; INSTCOMBINE-SAME: (i64 [[X:%.*]], i64 [[Y:%.*]]) {
-; INSTCOMBINE-NEXT: [[PTR_IDX:%.*]] = shl i64 [[X]], 3
-; INSTCOMBINE-NEXT: [[PTR_IDX1:%.*]] = shl i64 [[Y]], 2
-; INSTCOMBINE-NEXT: [[PTR_OFFS:%.*]] = add i64 [[PTR_IDX]], [[PTR_IDX1]]
-; INSTCOMBINE-NEXT: [[PTR_OFFS2:%.*]] = or disjoint i64 [[PTR_OFFS]], 2
-; INSTCOMBINE-NEXT: ret i64 [[PTR_OFFS2]]
+; INSTCOMBINE-NEXT: [[PTR_SPLIT_IDX:%.*]] = shl i64 [[X]], 3
+; INSTCOMBINE-NEXT: [[PTR_IDX:%.*]] = shl i64 [[Y]], 2
+; INSTCOMBINE-NEXT: [[PTR_OFFS:%.*]] = or disjoint i64 [[PTR_IDX]], 2
+; INSTCOMBINE-NEXT: [[RET:%.*]] = add i64 [[PTR_SPLIT_IDX]], [[PTR_OFFS]]
+; INSTCOMBINE-NEXT: ret i64 [[RET]]
;
%ptr = getelementptr [2 x [2 x i16]], ptr addrspace(1) null, i64 %x, i64 %y, i64 1
%ret = ptrtoint ptr addrspace(1) %ptr to i64
@@ -612,12 +612,27 @@ define i64 @fold_ptrtoint_nested_array_two_vars_plus_const(i64 %x, i64 %y) {
; Negative test -- should not be folded since there are multiple GEP uses
define i64 @fold_ptrtoint_nested_nullgep_array_variable_multiple_uses(i64 %x, i64 %y) {
-; ALL-LABEL: define {{[^@]+}}@fold_ptrtoint_nested_nullgep_array_variable_multiple_uses
-; ALL-SAME: (i64 [[X:%.*]], i64 [[Y:%.*]]) {
-; ALL-NEXT: [[PTR:%.*]] = getelementptr [2 x i16], ptr addrspace(1) null, i64 [[X]], i64 [[Y]]
-; ALL-NEXT: call void @use_ptr(ptr addrspace(1) [[PTR]])
-; ALL-NEXT: [[RET:%.*]] = ptrtoint ptr addrspace(1) [[PTR]] to i64
-; ALL-NEXT: ret i64 [[RET]]
+; LLPARSER-LABEL: define {{[^@]+}}@fold_ptrtoint_nested_nullgep_array_variable_multiple_uses
+; LLPARSER-SAME: (i64 [[X:%.*]], i64 [[Y:%.*]]) {
+; LLPARSER-NEXT: [[PTR:%.*]] = getelementptr [2 x i16], ptr addrspace(1) null, i64 [[X]], i64 [[Y]]
+; LLPARSER-NEXT: call void @use_ptr(ptr addrspace(1) [[PTR]])
+; LLPARSER-NEXT: [[RET:%.*]] = ptrtoint ptr addrspace(1) [[PTR]] to i64
+; LLPARSER-NEXT: ret i64 [[RET]]
+;
+; INSTSIMPLIFY-LABEL: define {{[^@]+}}@fold_ptrtoint_nested_nullgep_array_variable_multiple_uses
+; INSTSIMPLIFY-SAME: (i64 [[X:%.*]], i64 [[Y:%.*]]) {
+; INSTSIMPLIFY-NEXT: [[PTR:%.*]] = getelementptr [2 x i16], ptr addrspace(1) null, i64 [[X]], i64 [[Y]]
+; INSTSIMPLIFY-NEXT: call void @use_ptr(ptr addrspace(1) [[PTR]])
+; INSTSIMPLIFY-NEXT: [[RET:%.*]] = ptrtoint ptr addrspace(1) [[PTR]] to i64
+; INSTSIMPLIFY-NEXT: ret i64 [[RET]...
[truncated]
LGTM.
// Don't create GEPs with more than one variable index.
unsigned NumVarIndices =
    count_if(Indices, [](Value *Idx) { return !isa<Constant>(Idx); });
if (NumVarIndices > 1)
Is there any reason for not putting the logic into shouldMergeGEPs? Do you mean the call to simplifyAddInst above may produce a constant?
> Do you mean the call to simplifyAddInst above may produce a constant?
Yes, this is the motivation for the separate check.
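A hypothetical illustration of that motivation (names are made up): when visitGEPOfGEP merges two GEPs, the adjacent indices are combined via simplifyAddInst, and the sum may fold to a constant even though both inputs are variable, so counting variable indices before simplification would be too conservative.

```llvm
define ptr @merged_index_folds_to_constant(ptr %p, i64 %x) {
  %neg = sub i64 0, %x
  %g1 = getelementptr i32, ptr %p, i64 %x
  ; When merging, simplifyAddInst(%x, %neg) folds to 0, so the merged
  ; GEP ends up with no variable index despite two variable inputs.
  %g2 = getelementptr i32, ptr %g1, i64 %neg
  ret ptr %g2
}
```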
FYI, we see a significant regression (50-70%) of the https://github.com/llvm/llvm-test-suite/blob/main/SingleSource/Benchmarks/Stanford/FloatMM.c benchmark after this commit. It's visible on both aarch64 and x86-64. This benchmark is not representative of any real workloads we're interested in, AFAIU, so I'm not particularly concerned, but maybe you want to double-check that there are no hidden problems here.
@nikic fyi ^^
I can't reproduce this. In which configuration? Looking at the assembly diff, the only relevant difference I see is a loop getting unrolled, but that doesn't seem to have any effect on performance.
We're compiling the benchmarks with -O3 (no LTO or FDO involved) and observing the regressions on two flavors of aarch64 and three flavors of x86-64 (Haswell, Skylake and AMD Rome).
Hm, that's weird. I also tested a plain …
Split GEPs that have more than one variable index into two. This is in preparation for the ptradd migration, which will not support multi-index GEPs.
This also enables the split-off part to be CSEd and LICMed.
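A hypothetical sketch of that benefit (the loop and names are illustrative): after the split, the front GEP depends only on loop-invariant operands, so LICM can hoist it, and identical front GEPs across users can be CSEd.

```llvm
define void @fill_row(ptr %p, i64 %i, i64 %n) {
entry:
  br label %body
body:
  %j = phi i64 [ 0, %entry ], [ %j.next, %body ]
  ; Before the split this was one GEP with two variable indices:
  ;   getelementptr [4 x i32], ptr %p, i64 %i, i64 %j
  ; After the split, %row is loop-invariant and can be hoisted.
  %row = getelementptr [4 x i32], ptr %p, i64 %i
  %addr = getelementptr [4 x i32], ptr %row, i64 0, i64 %j
  store i32 0, ptr %addr, align 4
  %j.next = add i64 %j, 1
  %done = icmp eq i64 %j.next, %n
  br i1 %done, label %exit, label %body
exit:
  ret void
}
```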