[InstCombine] Split GEPs with multiple variable indices #137297
Conversation
(#137323) Support the ptrtoint(gep null, x) -> x and ptrtoint(gep inttoptr(x), y) -> x+y folds for the case where there is a chain of GEPs that ends in null or inttoptr. This avoids some regressions from #137297. While here, also be a bit more careful about edge cases like pointer-to-vector splats and mismatched pointer and index sizes.
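A minimal IR sketch of the null-base fold, adapted from the ptrtoint-nullgep.ll tests further down (assuming a 64-bit pointer and index size):

```llvm
; Before: a GEP based on null, followed by ptrtoint.
define i64 @nullgep(i64 %x) {
  %p = getelementptr i16, ptr addrspace(1) null, i64 %x
  %i = ptrtoint ptr addrspace(1) %p to i64
  ret i64 %i
}

; After: the address is just the GEP's byte offset, here %x * 2.
define i64 @nullgep.folded(i64 %x) {
  %i = shl i64 %x, 1
  ret i64 %i
}
```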
(#142958) Currently OptimizePointerDifference() only handles single GEPs with a common base, not GEP chains. This patch generalizes the support to nested GEPs with a common base. Finding the common base is a bit annoying because we want to stop as soon as possible and not recurse into common GEP prefixes. This helps avoid regressions from #137297.
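A hypothetical sketch of the pattern that generalization targets (the function and value names are illustrative, not taken from the patch): a pointer difference where one side reaches the common base only through a chain of GEPs.

```llvm
define i64 @ptrdiff_chain(ptr %base, i64 %x, i64 %y) {
  %g1 = getelementptr i8, ptr %base, i64 %x
  %g2 = getelementptr i8, ptr %g1, i64 %y
  %a = ptrtoint ptr %g2 to i64
  %b = ptrtoint ptr %base to i64
  ; With GEP-chain support, this difference can fold to add i64 %x, %y.
  %d = sub i64 %a, %b
  ret i64 %d
}
```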
(#143714) This depth limits a linear search (rather than the usual potentially exponential one) and is not particularly important for compile-time in practice. The change in #137297 is going to increase the length of GEP chains, so I'd like to increase this limit a bit to reduce the chance of regressions (dtcxzyw/llvm-opt-benchmark#2419 showed a 13% increase in SearchLimitReached). There is no particular significance to the new value of 10. Compile-time is neutral.
Fold icmp between a chain of GEPs and its base pointer. Previously only a single GEP was supported. This will be extended to handle the case of two GEP chains with a common base in a follow-up. This helps avoid regressions after #137297.
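A hypothetical sketch of that icmp fold (illustrative names): an equality comparison against the base reduces to a check on the accumulated byte offset.

```llvm
define i1 @cmp_chain_base(ptr %p, i64 %x, i64 %y) {
  %g1 = getelementptr i8, ptr %p, i64 %x
  %g2 = getelementptr i8, ptr %g1, i64 %y
  ; %g2 == %p exactly when the total offset is zero, so this can fold to:
  ;   %sum = add i64 %x, %y
  ;   %c = icmp eq i64 %sum, 0
  %c = icmp eq ptr %g2, %p
  ret i1 %c
}
```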
Split GEPs that have more than one variable index into two. This is in preparation for the ptradd migration, which will not support multi-index GEPs.
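For illustration, the shape of the split, taken in spirit from the updated gep-vector.ll tests below: the first variable index goes into a front GEP, and the remaining indices are re-rooted on it with a leading zero index.

```llvm
; Before: one GEP with two variable indices.
%gep = getelementptr [7 x i32], ptr %x, i64 %y, i64 %z

; After: a front GEP with the first variable index, then the rest.
%gep.split = getelementptr [7 x i32], ptr %x, i64 %y
%gep = getelementptr [7 x i32], ptr %gep.split, i64 0, i64 %z
```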
@llvm/pr-subscribers-llvm-transforms

Author: Nikita Popov (nikic)

Changes

Split GEPs that have more than one variable index into two. This is in preparation for the ptradd migration, which will not support multi-index GEPs.

Patch is 23.37 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/137297.diff

9 Files Affected:
diff --git a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
index e2a9255ca9c6e..873b104bc0944 100644
--- a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
@@ -2777,6 +2777,12 @@ Instruction *InstCombinerImpl::visitGEPOfGEP(GetElementPtrInst &GEP,
Indices.append(GEP.idx_begin()+1, GEP.idx_end());
}
+ // Don't create GEPs with more than one variable index.
+ unsigned NumVarIndices =
+ count_if(Indices, [](Value *Idx) { return !isa<Constant>(Idx); });
+ if (NumVarIndices > 1)
+ return nullptr;
+
if (!Indices.empty())
return replaceInstUsesWith(
GEP, Builder.CreateGEP(
@@ -3225,6 +3231,30 @@ Instruction *InstCombinerImpl::visitGetElementPtrInst(GetElementPtrInst &GEP) {
return replaceInstUsesWith(GEP, Res);
}
+ bool SeenVarIndex = false;
+ for (auto [IdxNum, Idx] : enumerate(Indices)) {
+ if (isa<Constant>(Idx))
+ continue;
+
+ if (!SeenVarIndex) {
+ SeenVarIndex = true;
+ continue;
+ }
+
+ // GEP has multiple variable indices: Split it.
+ ArrayRef<Value *> FrontIndices = ArrayRef(Indices).take_front(IdxNum);
+ Value *FrontGEP =
+ Builder.CreateGEP(GEPEltType, PtrOp, FrontIndices,
+ GEP.getName() + ".split", GEP.getNoWrapFlags());
+
+ SmallVector<Value *> BackIndices;
+ BackIndices.push_back(Constant::getNullValue(NewScalarIndexTy));
+ append_range(BackIndices, drop_begin(Indices, IdxNum));
+ return GetElementPtrInst::Create(
+ GetElementPtrInst::getIndexedType(GEPEltType, FrontIndices), FrontGEP,
+ BackIndices, GEP.getNoWrapFlags());
+ }
+
// Check to see if the inputs to the PHI node are getelementptr instructions.
if (auto *PN = dyn_cast<PHINode>(PtrOp)) {
if (Value *NewPtrOp = foldGEPOfPhi(GEP, PN, Builder))
diff --git a/llvm/test/Transforms/InstCombine/canonicalize-gep-constglob.ll b/llvm/test/Transforms/InstCombine/canonicalize-gep-constglob.ll
index 1520d6ce59548..07c8a8c6b90e1 100644
--- a/llvm/test/Transforms/InstCombine/canonicalize-gep-constglob.ll
+++ b/llvm/test/Transforms/InstCombine/canonicalize-gep-constglob.ll
@@ -35,7 +35,9 @@ define ptr @xzy(i64 %x, i64 %y, i64 %z) {
; CHECK-LABEL: define ptr @xzy(
; CHECK-SAME: i64 [[X:%.*]], i64 [[Y:%.*]], i64 [[Z:%.*]]) {
; CHECK-NEXT: entry:
-; CHECK-NEXT: [[GEP:%.*]] = getelementptr inbounds [10 x [10 x [10 x i32]]], ptr getelementptr inbounds nuw (i8, ptr @glob, i64 40), i64 0, i64 [[X]], i64 [[Z]], i64 [[Y]]
+; CHECK-NEXT: [[GEP_SPLIT:%.*]] = getelementptr inbounds [10 x [10 x [10 x i32]]], ptr getelementptr inbounds nuw (i8, ptr @glob, i64 40), i64 0, i64 [[X]]
+; CHECK-NEXT: [[GEP_SPLIT1:%.*]] = getelementptr inbounds [10 x [10 x i32]], ptr [[GEP_SPLIT]], i64 0, i64 [[Z]]
+; CHECK-NEXT: [[GEP:%.*]] = getelementptr inbounds [10 x i32], ptr [[GEP_SPLIT1]], i64 0, i64 [[Y]]
; CHECK-NEXT: ret ptr [[GEP]]
;
entry:
diff --git a/llvm/test/Transforms/InstCombine/gep-vector.ll b/llvm/test/Transforms/InstCombine/gep-vector.ll
index 5546cb36d2f55..313c7ef32fe11 100644
--- a/llvm/test/Transforms/InstCombine/gep-vector.ll
+++ b/llvm/test/Transforms/InstCombine/gep-vector.ll
@@ -31,7 +31,8 @@ define <2 x ptr> @vectorindex3() {
define ptr @bitcast_vec_to_array_gep(ptr %x, i64 %y, i64 %z) {
; CHECK-LABEL: @bitcast_vec_to_array_gep(
-; CHECK-NEXT: [[GEP:%.*]] = getelementptr [7 x i32], ptr [[X:%.*]], i64 [[Y:%.*]], i64 [[Z:%.*]]
+; CHECK-NEXT: [[GEP_SPLIT:%.*]] = getelementptr [7 x i32], ptr [[X:%.*]], i64 [[Y:%.*]]
+; CHECK-NEXT: [[GEP:%.*]] = getelementptr [7 x i32], ptr [[GEP_SPLIT]], i64 0, i64 [[Z:%.*]]
; CHECK-NEXT: ret ptr [[GEP]]
;
%gep = getelementptr [7 x i32], ptr %x, i64 %y, i64 %z
@@ -42,7 +43,8 @@ define ptr @bitcast_vec_to_array_gep(ptr %x, i64 %y, i64 %z) {
define ptr @bitcast_array_to_vec_gep(ptr %x, i64 %y, i64 %z) {
; CHECK-LABEL: @bitcast_array_to_vec_gep(
-; CHECK-NEXT: [[GEP:%.*]] = getelementptr inbounds <3 x i32>, ptr [[X:%.*]], i64 [[Y:%.*]], i64 [[Z:%.*]]
+; CHECK-NEXT: [[GEP_SPLIT:%.*]] = getelementptr inbounds <3 x i32>, ptr [[X:%.*]], i64 [[Y:%.*]]
+; CHECK-NEXT: [[GEP:%.*]] = getelementptr inbounds <3 x i32>, ptr [[GEP_SPLIT]], i64 0, i64 [[Z:%.*]]
; CHECK-NEXT: ret ptr [[GEP]]
;
%gep = getelementptr inbounds <3 x i32>, ptr %x, i64 %y, i64 %z
@@ -53,7 +55,8 @@ define ptr @bitcast_array_to_vec_gep(ptr %x, i64 %y, i64 %z) {
define ptr @bitcast_vec_to_array_gep_matching_alloc_size(ptr %x, i64 %y, i64 %z) {
; CHECK-LABEL: @bitcast_vec_to_array_gep_matching_alloc_size(
-; CHECK-NEXT: [[GEP:%.*]] = getelementptr [4 x i32], ptr [[X:%.*]], i64 [[Y:%.*]], i64 [[Z:%.*]]
+; CHECK-NEXT: [[GEP_SPLIT:%.*]] = getelementptr [4 x i32], ptr [[X:%.*]], i64 [[Y:%.*]]
+; CHECK-NEXT: [[GEP:%.*]] = getelementptr [4 x i32], ptr [[GEP_SPLIT]], i64 0, i64 [[Z:%.*]]
; CHECK-NEXT: ret ptr [[GEP]]
;
%gep = getelementptr [4 x i32], ptr %x, i64 %y, i64 %z
@@ -64,7 +67,8 @@ define ptr @bitcast_vec_to_array_gep_matching_alloc_size(ptr %x, i64 %y, i64 %z)
define ptr @bitcast_array_to_vec_gep_matching_alloc_size(ptr %x, i64 %y, i64 %z) {
; CHECK-LABEL: @bitcast_array_to_vec_gep_matching_alloc_size(
-; CHECK-NEXT: [[GEP:%.*]] = getelementptr inbounds <4 x i32>, ptr [[X:%.*]], i64 [[Y:%.*]], i64 [[Z:%.*]]
+; CHECK-NEXT: [[GEP_SPLIT:%.*]] = getelementptr inbounds <4 x i32>, ptr [[X:%.*]], i64 [[Y:%.*]]
+; CHECK-NEXT: [[GEP:%.*]] = getelementptr inbounds <4 x i32>, ptr [[GEP_SPLIT]], i64 0, i64 [[Z:%.*]]
; CHECK-NEXT: ret ptr [[GEP]]
;
%gep = getelementptr inbounds <4 x i32>, ptr %x, i64 %y, i64 %z
@@ -76,7 +80,8 @@ define ptr @bitcast_array_to_vec_gep_matching_alloc_size(ptr %x, i64 %y, i64 %z)
define ptr addrspace(3) @bitcast_vec_to_array_addrspace(ptr %x, i64 %y, i64 %z) {
; CHECK-LABEL: @bitcast_vec_to_array_addrspace(
; CHECK-NEXT: [[ASC:%.*]] = addrspacecast ptr [[X:%.*]] to ptr addrspace(3)
-; CHECK-NEXT: [[GEP:%.*]] = getelementptr [7 x i32], ptr addrspace(3) [[ASC]], i64 [[Y:%.*]], i64 [[Z:%.*]]
+; CHECK-NEXT: [[GEP_SPLIT:%.*]] = getelementptr [7 x i32], ptr addrspace(3) [[ASC]], i64 [[Y:%.*]]
+; CHECK-NEXT: [[GEP:%.*]] = getelementptr [7 x i32], ptr addrspace(3) [[GEP_SPLIT]], i64 0, i64 [[Z:%.*]]
; CHECK-NEXT: ret ptr addrspace(3) [[GEP]]
;
%asc = addrspacecast ptr %x to ptr addrspace(3)
@@ -89,7 +94,8 @@ define ptr addrspace(3) @bitcast_vec_to_array_addrspace(ptr %x, i64 %y, i64 %z)
define ptr addrspace(3) @inbounds_bitcast_vec_to_array_addrspace(ptr %x, i64 %y, i64 %z) {
; CHECK-LABEL: @inbounds_bitcast_vec_to_array_addrspace(
; CHECK-NEXT: [[ASC:%.*]] = addrspacecast ptr [[X:%.*]] to ptr addrspace(3)
-; CHECK-NEXT: [[GEP:%.*]] = getelementptr inbounds [7 x i32], ptr addrspace(3) [[ASC]], i64 [[Y:%.*]], i64 [[Z:%.*]]
+; CHECK-NEXT: [[GEP_SPLIT:%.*]] = getelementptr inbounds [7 x i32], ptr addrspace(3) [[ASC]], i64 [[Y:%.*]]
+; CHECK-NEXT: [[GEP:%.*]] = getelementptr inbounds [7 x i32], ptr addrspace(3) [[GEP_SPLIT]], i64 0, i64 [[Z:%.*]]
; CHECK-NEXT: ret ptr addrspace(3) [[GEP]]
;
%asc = addrspacecast ptr %x to ptr addrspace(3)
@@ -102,7 +108,8 @@ define ptr addrspace(3) @inbounds_bitcast_vec_to_array_addrspace(ptr %x, i64 %y,
define ptr addrspace(3) @bitcast_vec_to_array_addrspace_matching_alloc_size(ptr %x, i64 %y, i64 %z) {
; CHECK-LABEL: @bitcast_vec_to_array_addrspace_matching_alloc_size(
; CHECK-NEXT: [[ASC:%.*]] = addrspacecast ptr [[X:%.*]] to ptr addrspace(3)
-; CHECK-NEXT: [[GEP:%.*]] = getelementptr [4 x i32], ptr addrspace(3) [[ASC]], i64 [[Y:%.*]], i64 [[Z:%.*]]
+; CHECK-NEXT: [[GEP_SPLIT:%.*]] = getelementptr [4 x i32], ptr addrspace(3) [[ASC]], i64 [[Y:%.*]]
+; CHECK-NEXT: [[GEP:%.*]] = getelementptr [4 x i32], ptr addrspace(3) [[GEP_SPLIT]], i64 0, i64 [[Z:%.*]]
; CHECK-NEXT: ret ptr addrspace(3) [[GEP]]
;
%asc = addrspacecast ptr %x to ptr addrspace(3)
@@ -115,7 +122,8 @@ define ptr addrspace(3) @bitcast_vec_to_array_addrspace_matching_alloc_size(ptr
define ptr addrspace(3) @inbounds_bitcast_vec_to_array_addrspace_matching_alloc_size(ptr %x, i64 %y, i64 %z) {
; CHECK-LABEL: @inbounds_bitcast_vec_to_array_addrspace_matching_alloc_size(
; CHECK-NEXT: [[ASC:%.*]] = addrspacecast ptr [[X:%.*]] to ptr addrspace(3)
-; CHECK-NEXT: [[GEP:%.*]] = getelementptr inbounds [4 x i32], ptr addrspace(3) [[ASC]], i64 [[Y:%.*]], i64 [[Z:%.*]]
+; CHECK-NEXT: [[GEP_SPLIT:%.*]] = getelementptr inbounds [4 x i32], ptr addrspace(3) [[ASC]], i64 [[Y:%.*]]
+; CHECK-NEXT: [[GEP:%.*]] = getelementptr inbounds [4 x i32], ptr addrspace(3) [[GEP_SPLIT]], i64 0, i64 [[Z:%.*]]
; CHECK-NEXT: ret ptr addrspace(3) [[GEP]]
;
%asc = addrspacecast ptr %x to ptr addrspace(3)
diff --git a/llvm/test/Transforms/InstCombine/getelementptr.ll b/llvm/test/Transforms/InstCombine/getelementptr.ll
index bb0a94cb01494..7c1342a7db84b 100644
--- a/llvm/test/Transforms/InstCombine/getelementptr.ll
+++ b/llvm/test/Transforms/InstCombine/getelementptr.ll
@@ -1380,7 +1380,7 @@ define ptr @gep_of_gep_multiuse_var_and_var(ptr %p, i64 %idx, i64 %idx2) {
; CHECK-LABEL: @gep_of_gep_multiuse_var_and_var(
; CHECK-NEXT: [[GEP1:%.*]] = getelementptr [4 x i32], ptr [[P:%.*]], i64 [[IDX:%.*]]
; CHECK-NEXT: call void @use(ptr [[GEP1]])
-; CHECK-NEXT: [[GEP2:%.*]] = getelementptr [4 x i32], ptr [[P]], i64 [[IDX]], i64 [[IDX2:%.*]]
+; CHECK-NEXT: [[GEP2:%.*]] = getelementptr [4 x i32], ptr [[GEP1]], i64 0, i64 [[IDX2:%.*]]
; CHECK-NEXT: ret ptr [[GEP2]]
;
%gep1 = getelementptr [4 x i32], ptr %p, i64 %idx
@@ -1922,8 +1922,9 @@ define ptr @gep_merge_nusw(ptr %p, i64 %x, i64 %y) {
define ptr @gep_merge_nuw_add_zero(ptr %p, i64 %idx, i64 %idx2) {
; CHECK-LABEL: @gep_merge_nuw_add_zero(
-; CHECK-NEXT: [[GEP:%.*]] = getelementptr nuw [2 x i32], ptr [[P:%.*]], i64 [[IDX:%.*]], i64 [[IDX2:%.*]]
-; CHECK-NEXT: ret ptr [[GEP]]
+; CHECK-NEXT: [[GEP:%.*]] = getelementptr nuw [2 x i32], ptr [[GEP_SPLIT:%.*]], i64 [[IDX2:%.*]]
+; CHECK-NEXT: [[GEP1:%.*]] = getelementptr nuw [2 x i32], ptr [[GEP]], i64 0, i64 [[IDX3:%.*]]
+; CHECK-NEXT: ret ptr [[GEP1]]
;
%gep1 = getelementptr nuw [2 x i32], ptr %p, i64 %idx
%gep = getelementptr nuw [2 x i32], ptr %gep1, i64 0, i64 %idx2
@@ -1935,7 +1936,8 @@ define ptr @gep_merge_nuw_add_zero(ptr %p, i64 %idx, i64 %idx2) {
; after the merge.
define ptr @gep_merge_nusw_add_zero(ptr %p, i64 %idx, i64 %idx2) {
; CHECK-LABEL: @gep_merge_nusw_add_zero(
-; CHECK-NEXT: [[GEP:%.*]] = getelementptr [2 x i32], ptr [[P:%.*]], i64 [[IDX:%.*]], i64 [[IDX2:%.*]]
+; CHECK-NEXT: [[GEP1:%.*]] = getelementptr nusw [2 x i32], ptr [[P:%.*]], i64 [[IDX:%.*]]
+; CHECK-NEXT: [[GEP:%.*]] = getelementptr nusw [2 x i32], ptr [[GEP1]], i64 0, i64 [[IDX2:%.*]]
; CHECK-NEXT: ret ptr [[GEP]]
;
%gep1 = getelementptr nusw [2 x i32], ptr %p, i64 %idx
diff --git a/llvm/test/Transforms/InstCombine/icmp-gep.ll b/llvm/test/Transforms/InstCombine/icmp-gep.ll
index aede844e04ac7..5044850ec918f 100644
--- a/llvm/test/Transforms/InstCombine/icmp-gep.ll
+++ b/llvm/test/Transforms/InstCombine/icmp-gep.ll
@@ -686,13 +686,13 @@ define i1 @test_scalable_ij(ptr %foo, i64 %i, i64 %j) {
define i1 @gep_nuw(ptr %p, i64 %a, i64 %b, i64 %c, i64 %d) {
; CHECK-LABEL: @gep_nuw(
-; CHECK-NEXT: [[GEP1_IDX:%.*]] = shl nuw i64 [[A:%.*]], 2
-; CHECK-NEXT: [[GEP1_IDX1:%.*]] = shl nuw i64 [[B:%.*]], 1
-; CHECK-NEXT: [[GEP1_OFFS:%.*]] = add nuw i64 [[GEP1_IDX]], [[GEP1_IDX1]]
-; CHECK-NEXT: [[GEP2_IDX:%.*]] = shl nuw i64 [[C:%.*]], 3
-; CHECK-NEXT: [[GEP2_IDX2:%.*]] = shl nuw i64 [[D:%.*]], 2
-; CHECK-NEXT: [[GEP2_OFFS:%.*]] = add nuw i64 [[GEP2_IDX]], [[GEP2_IDX2]]
-; CHECK-NEXT: [[CMP:%.*]] = icmp eq i64 [[GEP1_OFFS]], [[GEP2_OFFS]]
+; CHECK-NEXT: [[GEP1_SPLIT_IDX:%.*]] = shl nuw i64 [[A:%.*]], 2
+; CHECK-NEXT: [[GEP1_IDX:%.*]] = shl nuw i64 [[B:%.*]], 1
+; CHECK-NEXT: [[TMP1:%.*]] = add nuw i64 [[GEP1_SPLIT_IDX]], [[GEP1_IDX]]
+; CHECK-NEXT: [[GEP2_SPLIT_IDX:%.*]] = shl nuw i64 [[C:%.*]], 3
+; CHECK-NEXT: [[GEP2_IDX:%.*]] = shl nuw i64 [[D:%.*]], 2
+; CHECK-NEXT: [[TMP2:%.*]] = add nuw i64 [[GEP2_SPLIT_IDX]], [[GEP2_IDX]]
+; CHECK-NEXT: [[CMP:%.*]] = icmp eq i64 [[TMP1]], [[TMP2]]
; CHECK-NEXT: ret i1 [[CMP]]
;
%gep1 = getelementptr nuw [2 x i16], ptr %p, i64 %a, i64 %b
@@ -703,13 +703,13 @@ define i1 @gep_nuw(ptr %p, i64 %a, i64 %b, i64 %c, i64 %d) {
define i1 @gep_nusw(ptr %p, i64 %a, i64 %b, i64 %c, i64 %d) {
; CHECK-LABEL: @gep_nusw(
-; CHECK-NEXT: [[GEP1_IDX:%.*]] = shl nsw i64 [[A:%.*]], 2
-; CHECK-NEXT: [[GEP1_IDX1:%.*]] = shl nsw i64 [[B:%.*]], 1
-; CHECK-NEXT: [[GEP1_OFFS:%.*]] = add nsw i64 [[GEP1_IDX]], [[GEP1_IDX1]]
-; CHECK-NEXT: [[GEP2_IDX:%.*]] = shl nsw i64 [[C:%.*]], 3
-; CHECK-NEXT: [[GEP2_IDX2:%.*]] = shl nsw i64 [[D:%.*]], 2
-; CHECK-NEXT: [[GEP2_OFFS:%.*]] = add nsw i64 [[GEP2_IDX]], [[GEP2_IDX2]]
-; CHECK-NEXT: [[CMP:%.*]] = icmp eq i64 [[GEP1_OFFS]], [[GEP2_OFFS]]
+; CHECK-NEXT: [[GEP1_SPLIT_IDX:%.*]] = shl nsw i64 [[A:%.*]], 2
+; CHECK-NEXT: [[GEP1_IDX:%.*]] = shl nsw i64 [[B:%.*]], 1
+; CHECK-NEXT: [[TMP1:%.*]] = add i64 [[GEP1_SPLIT_IDX]], [[GEP1_IDX]]
+; CHECK-NEXT: [[GEP2_SPLIT_IDX:%.*]] = shl nsw i64 [[C:%.*]], 3
+; CHECK-NEXT: [[GEP2_IDX:%.*]] = shl nsw i64 [[D:%.*]], 2
+; CHECK-NEXT: [[TMP2:%.*]] = add i64 [[GEP2_SPLIT_IDX]], [[GEP2_IDX]]
+; CHECK-NEXT: [[CMP:%.*]] = icmp eq i64 [[TMP1]], [[TMP2]]
; CHECK-NEXT: ret i1 [[CMP]]
;
%gep1 = getelementptr nusw [2 x i16], ptr %p, i64 %a, i64 %b
diff --git a/llvm/test/Transforms/InstCombine/loadstore-alignment.ll b/llvm/test/Transforms/InstCombine/loadstore-alignment.ll
index 098f2eee52df0..2cdc73e4cff18 100644
--- a/llvm/test/Transforms/InstCombine/loadstore-alignment.ll
+++ b/llvm/test/Transforms/InstCombine/loadstore-alignment.ll
@@ -33,7 +33,8 @@ define <2 x i64> @hem_2d(i32 %i, i32 %j) {
; CHECK-LABEL: @hem_2d(
; CHECK-NEXT: [[TMP1:%.*]] = sext i32 [[I:%.*]] to i64
; CHECK-NEXT: [[TMP2:%.*]] = sext i32 [[J:%.*]] to i64
-; CHECK-NEXT: [[T:%.*]] = getelementptr [13 x <2 x i64>], ptr @xx, i64 [[TMP1]], i64 [[TMP2]]
+; CHECK-NEXT: [[T_SPLIT:%.*]] = getelementptr [13 x <2 x i64>], ptr @xx, i64 [[TMP1]]
+; CHECK-NEXT: [[T:%.*]] = getelementptr [13 x <2 x i64>], ptr [[T_SPLIT]], i64 0, i64 [[TMP2]]
; CHECK-NEXT: [[L:%.*]] = load <2 x i64>, ptr [[T]], align 1
; CHECK-NEXT: ret <2 x i64> [[L]]
;
@@ -90,7 +91,8 @@ define void @hem_2d_store(i32 %i, i32 %j, <2 x i64> %y) {
; CHECK-LABEL: @hem_2d_store(
; CHECK-NEXT: [[TMP1:%.*]] = sext i32 [[I:%.*]] to i64
; CHECK-NEXT: [[TMP2:%.*]] = sext i32 [[J:%.*]] to i64
-; CHECK-NEXT: [[T:%.*]] = getelementptr [13 x <2 x i64>], ptr @xx, i64 [[TMP1]], i64 [[TMP2]]
+; CHECK-NEXT: [[T_SPLIT:%.*]] = getelementptr [13 x <2 x i64>], ptr @xx, i64 [[TMP1]]
+; CHECK-NEXT: [[T:%.*]] = getelementptr [13 x <2 x i64>], ptr [[T_SPLIT]], i64 0, i64 [[TMP2]]
; CHECK-NEXT: store <2 x i64> [[Y:%.*]], ptr [[T]], align 1
; CHECK-NEXT: ret void
;
diff --git a/llvm/test/Transforms/InstCombine/pr58901.ll b/llvm/test/Transforms/InstCombine/pr58901.ll
index 1eea4b5305545..3d836aae77001 100644
--- a/llvm/test/Transforms/InstCombine/pr58901.ll
+++ b/llvm/test/Transforms/InstCombine/pr58901.ll
@@ -15,7 +15,8 @@ define ptr @f1(ptr %arg, i64 %arg1) {
define ptr @f2(ptr %arg, i64 %arg1) {
; CHECK-LABEL: @f2(
; CHECK-NEXT: [[TMP1:%.*]] = getelementptr i8, ptr [[ARG:%.*]], i64 72
-; CHECK-NEXT: [[TMP2:%.*]] = getelementptr [6 x i32], ptr [[TMP1]], i64 [[ARG1:%.*]], i64 [[ARG1]]
+; CHECK-NEXT: [[DOTSPLIT:%.*]] = getelementptr [6 x i32], ptr [[TMP1]], i64 [[ARG1:%.*]]
+; CHECK-NEXT: [[TMP2:%.*]] = getelementptr [6 x i32], ptr [[DOTSPLIT]], i64 0, i64 [[ARG1]]
; CHECK-NEXT: ret ptr [[TMP2]]
;
%1 = getelementptr [6 x i32], ptr %arg, i64 3
diff --git a/llvm/test/Transforms/InstCombine/ptrtoint-nullgep.ll b/llvm/test/Transforms/InstCombine/ptrtoint-nullgep.ll
index c5282ce40acb1..17a9d540983ca 100644
--- a/llvm/test/Transforms/InstCombine/ptrtoint-nullgep.ll
+++ b/llvm/test/Transforms/InstCombine/ptrtoint-nullgep.ll
@@ -548,10 +548,10 @@ define i64 @fold_ptrtoint_nested_array_two_vars(i64 %x, i64 %y) {
;
; INSTCOMBINE-LABEL: define {{[^@]+}}@fold_ptrtoint_nested_array_two_vars
; INSTCOMBINE-SAME: (i64 [[X:%.*]], i64 [[Y:%.*]]) {
-; INSTCOMBINE-NEXT: [[PTR_IDX:%.*]] = shl i64 [[X]], 2
-; INSTCOMBINE-NEXT: [[PTR_IDX1:%.*]] = shl i64 [[Y]], 1
-; INSTCOMBINE-NEXT: [[PTR_OFFS:%.*]] = add i64 [[PTR_IDX]], [[PTR_IDX1]]
-; INSTCOMBINE-NEXT: ret i64 [[PTR_OFFS]]
+; INSTCOMBINE-NEXT: [[PTR_SPLIT_IDX:%.*]] = shl i64 [[X]], 2
+; INSTCOMBINE-NEXT: [[PTR_IDX:%.*]] = shl i64 [[Y]], 1
+; INSTCOMBINE-NEXT: [[RET:%.*]] = add i64 [[PTR_SPLIT_IDX]], [[PTR_IDX]]
+; INSTCOMBINE-NEXT: ret i64 [[RET]]
;
%ptr = getelementptr [2 x i16], ptr addrspace(1) null, i64 %x, i64 %y
@@ -574,10 +574,10 @@ define i64 @fold_ptrtoint_nested_array_two_vars_plus_zero(i64 %x, i64 %y) {
;
; INSTCOMBINE-LABEL: define {{[^@]+}}@fold_ptrtoint_nested_array_two_vars_plus_zero
; INSTCOMBINE-SAME: (i64 [[X:%.*]], i64 [[Y:%.*]]) {
-; INSTCOMBINE-NEXT: [[PTR_IDX:%.*]] = shl i64 [[X]], 3
-; INSTCOMBINE-NEXT: [[PTR_IDX1:%.*]] = shl i64 [[Y]], 2
-; INSTCOMBINE-NEXT: [[PTR_OFFS:%.*]] = add i64 [[PTR_IDX]], [[PTR_IDX1]]
-; INSTCOMBINE-NEXT: ret i64 [[PTR_OFFS]]
+; INSTCOMBINE-NEXT: [[PTR_SPLIT_IDX:%.*]] = shl i64 [[X]], 3
+; INSTCOMBINE-NEXT: [[PTR_IDX:%.*]] = shl i64 [[Y]], 2
+; INSTCOMBINE-NEXT: [[RET:%.*]] = add i64 [[PTR_SPLIT_IDX]], [[PTR_IDX]]
+; INSTCOMBINE-NEXT: ret i64 [[RET]]
;
%ptr = getelementptr [2 x [2 x i16]], ptr addrspace(1) null, i64 %x, i64 %y, i64 0
%ret = ptrtoint ptr addrspace(1) %ptr to i64
@@ -599,11 +599,11 @@ define i64 @fold_ptrtoint_nested_array_two_vars_plus_const(i64 %x, i64 %y) {
;
; INSTCOMBINE-LABEL: define {{[^@]+}}@fold_ptrtoint_nested_array_two_vars_plus_const
; INSTCOMBINE-SAME: (i64 [[X:%.*]], i64 [[Y:%.*]]) {
-; INSTCOMBINE-NEXT: [[PTR_IDX:%.*]] = shl i64 [[X]], 3
-; INSTCOMBINE-NEXT: [[PTR_IDX1:%.*]] = shl i64 [[Y]], 2
-; INSTCOMBINE-NEXT: [[PTR_OFFS:%.*]] = add i64 [[PTR_IDX]], [[PTR_IDX1]]
-; INSTCOMBINE-NEXT: [[PTR_OFFS2:%.*]] = or disjoint i64 [[PTR_OFFS]], 2
-; INSTCOMBINE-NEXT: ret i64 [[PTR_OFFS2]]
+; INSTCOMBINE-NEXT: [[PTR_SPLIT_IDX:%.*]] = shl i64 [[X]], 3
+; INSTCOMBINE-NEXT: [[PTR_IDX:%.*]] = shl i64 [[Y]], 2
+; INSTCOMBINE-NEXT: [[PTR_OFFS:%.*]] = or disjoint i64 [[PTR_IDX]], 2
+; INSTCOMBINE-NEXT: [[RET:%.*]] = add i64 [[PTR_SPLIT_IDX]], [[PTR_OFFS]]
+; INSTCOMBINE-NEXT: ret i64 [[RET]]
;
%ptr = getelementptr [2 x [2 x i16]], ptr addrspace(1) null, i64 %x, i64 %y, i64 1
%ret = ptrtoint ptr addrspace(1) %ptr to i64
@@ -612,12 +612,27 @@ define i64 @fold_ptrtoint_nested_array_two_vars_plus_const(i64 %x, i64 %y) {
; Negative test -- should not be folded since there are multiple GEP uses
define i64 @fold_ptrtoint_nested_nullgep_array_variable_multiple_uses(i64 %x, i64 %y) {
-; ALL-LABEL: define {{[^@]+}}@fold_ptrtoint_nested_nullgep_array_variable_multiple_uses
-; ALL-SAME: (i64 [[X:%.*]], i64 [[Y:%.*]]) {
-; ALL-NEXT: [[PTR:%.*]] = getelementptr [2 x i16], ptr addrspace(1) null, i64 [[X]], i64 [[Y]]
-; ALL-NEXT: call void @use_ptr(ptr addrspace(1) [[PTR]])
-; ALL-NEXT: [[RET:%.*]] = ptrtoint ptr addrspace(1) [[PTR]] to i64
-; ALL-NEXT: ret i64 [[RET]]
+; LLPARSER-LABEL: define {{[^@]+}}@fold_ptrtoint_nested_nullgep_array_variable_multiple_uses
+; LLPARSER-SAME: (i64 [[X:%.*]], i64 [[Y:%.*]]) {
+; LLPARSER-NEXT: [[PTR:%.*]] = getelementptr [2 x i16], ptr addrspace(1) null, i64 [[X]], i64 [[Y]]
+; LLPARSER-NEXT: call void @use_ptr(ptr addrspace(1) [[PTR]])
+; LLPARSER-NEXT: [[RET:%.*]] = ptrtoint ptr addrspace(1) [[PTR]] to i64
+; LLPARSER-NEXT: ret i64 [[RET]]
+;
+; INSTSIMPLIFY-LABEL: define {{[^@]+}}@fold_ptrtoint_nested_nullgep_array_variable_multiple_uses
+; INSTSIMPLIFY-SAME: (i64 [[X:%.*]], i64 [[Y:%.*]]) {
+; INSTSIMPLIFY-NEXT: [[PTR:%.*]] = getelementptr [2 x i16], ptr addrspace(1) null, i64 [[X]], i64 [[Y]]
+; INSTSIMPLIFY-NEXT: call void @use_ptr(ptr addrspace(1) [[PTR]])
+; INSTSIMPLIFY-NEXT: [[RET:%.*]] = ptrtoint ptr addrspace(1) [[PTR]] to i64
+; INSTSIMPLIFY-NEXT: ret i64 [[RET]...
[truncated]
LGTM.
// Don't create GEPs with more than one variable index.
unsigned NumVarIndices =
    count_if(Indices, [](Value *Idx) { return !isa<Constant>(Idx); });
if (NumVarIndices > 1)
Is there any reason for not putting the logic into shouldMergeGEPs? Do you mean the call to simplifyAddInst above may produce a constant?
> Do you mean the call to simplifyAddInst above may produce a constant?
Yes, this is the motivation for the separate check.
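A hypothetical illustration of that motivation (names are made up): when visitGEPOfGEP merges two GEPs, the adjacent indices are combined via simplifyAddInst, and the sum may fold to a constant even though both inputs are variable, so counting variable indices before simplification would be too conservative.

```llvm
define ptr @merged_index_folds_to_constant(ptr %p, i64 %x) {
  %neg = sub i64 0, %x
  %g1 = getelementptr i32, ptr %p, i64 %x
  ; When merging, simplifyAddInst(%x, %neg) folds to 0, so the merged
  ; GEP ends up with no variable index despite two variable inputs.
  %g2 = getelementptr i32, ptr %g1, i64 %neg
  ret ptr %g2
}
```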
FYI, we see a significant regression (50-70%) of the https://github.com/llvm/llvm-test-suite/blob/main/SingleSource/Benchmarks/Stanford/FloatMM.c benchmark after this commit. It's visible on both aarch64 and x86-64. This benchmark is not representative of any real workloads we're interested in, AFAIU, so I'm not particularly concerned, but maybe you want to double-check that there are no hidden problems here.
@nikic fyi ^^
I can't reproduce this. In which configuration? Looking at the assembly diff, the only relevant difference I see is a loop getting unrolled, but that doesn't seem to have any effect on performance.
We're compiling the benchmarks with -O3 (no LTO or FDO involved) and observing the regressions on two flavors of aarch64 and three flavors of x86-64 (Haswell, Skylake and AMD Rome).
Hm, that's weird. I also tested a plain …
Split GEPs that have more than one variable index into two. This is in preparation for the ptradd migration, which will not support multi-index GEPs.
This also enables the split-off part to be CSEd and LICMed.
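A hypothetical sketch of that benefit (the loop and names are illustrative): after the split, the front GEP depends only on loop-invariant operands, so LICM can hoist it, and identical front GEPs across users can be CSEd.

```llvm
define void @fill_row(ptr %p, i64 %i, i64 %n) {
entry:
  br label %body
body:
  %j = phi i64 [ 0, %entry ], [ %j.next, %body ]
  ; Before the split this was one GEP with two variable indices:
  ;   getelementptr [4 x i32], ptr %p, i64 %i, i64 %j
  ; After the split, %row is loop-invariant and can be hoisted.
  %row = getelementptr [4 x i32], ptr %p, i64 %i
  %addr = getelementptr [4 x i32], ptr %row, i64 0, i64 %j
  store i32 0, ptr %addr, align 4
  %j.next = add i64 %j, 1
  %done = icmp eq i64 %j.next, %n
  br i1 %done, label %exit, label %body
exit:
  ret void
}
```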