[InstCombine] Fold select Cond, not X, X into Cond ^ X #93591

Merged: 5 commits into llvm:main on Jun 4, 2024

Conversation

dtcxzyw
Member

@dtcxzyw dtcxzyw commented May 28, 2024

See the following example:

define i1 @src(i64 %x, i1 %y) {
  %1526 = icmp ne i64 %x, 0
  %1527 = icmp eq i64 %x, 0
  %sel = select i1 %y, i1 %1526, i1 %1527
  ret i1 %sel
}

define i1 @tgt(i64 %x, i1 %y) {
  %1527 = icmp eq i64 %x, 0
  %sel = xor i1 %y, %1527
  ret i1 %sel
}

I find that this pattern is common in C/C++/Rust code bases.
This patch folds select Cond, Y, X into Cond ^ X iff:

  1. X has the same type as Cond
  2. X is poison -> Y is poison
  3. X == !Y

Alive2: https://alive2.llvm.org/ce/z/hSmkHS

It is an alternative to #90089.
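The boolean core of the fold is easy to check exhaustively. A minimal sketch (plain Python booleans standing in for i1 values; `select` is a hypothetical helper modeling LLVM's select, not LLVM code):

```python
from itertools import product

def select(cond, t, f):
    """Model of LLVM's `select cond, t, f` on i1 values."""
    return t if cond else f

# Exhaustively check the fold over all i1 inputs:
# select Cond, Y, X == Cond ^ X, given X == !Y (condition 3 above).
for cond, x in product([False, True], repeat=2):
    y = not x
    assert select(cond, y, x) == (cond ^ x)
```

When Cond is false the select and the xor both yield X; when Cond is true both yield !X, which is why condition 3 is required.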

@llvmbot
Member

llvmbot commented May 28, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Yingwei Zheng (dtcxzyw)


Full diff: https://github.com/llvm/llvm-project/pull/93591.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp (+34)
  • (modified) llvm/test/Transforms/InstCombine/select-cmp.ll (+86)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp b/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
index a3ddb402bf662..011b1c2e09c03 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
@@ -3502,6 +3502,34 @@ static bool matchFMulByZeroIfResultEqZero(InstCombinerImpl &IC, Value *Cmp0,
   return false;
 }
 
+/// Return true iff:
+/// 1. X is poison implies Y is poison.
+/// 2. X is true implies Y is false.
+/// 3. X is false implies Y is true.
+/// Otherwise, return false.
+static bool isKnownInversion(Value *X, Value *Y) {
+  // Handle X = icmp pred V, C1, Y = icmp pred V, C2.
+  Value *V;
+  Constant *C1, *C2;
+  ICmpInst::Predicate Pred1, Pred2;
+  if (!match(X, m_ICmp(Pred1, m_Value(V), m_Constant(C1))) ||
+      !match(Y, m_ICmp(Pred2, m_Specific(V), m_Constant(C2))))
+    return false;
+
+  if (C1 == C2)
+    return Pred1 == ICmpInst::getInversePredicate(Pred2);
+
+  // Try to infer the relationship from constant ranges.
+  const APInt *RHSC1, *RHSC2;
+  if (!match(C1, m_APInt(RHSC1)) || !match(C2, m_APInt(RHSC2)))
+    return false;
+
+  const auto CR1 = ConstantRange::makeExactICmpRegion(Pred1, *RHSC1);
+  const auto CR2 = ConstantRange::makeExactICmpRegion(Pred2, *RHSC2);
+
+  return CR1.inverse() == CR2;
+}
+
 Instruction *InstCombinerImpl::visitSelectInst(SelectInst &SI) {
   Value *CondVal = SI.getCondition();
   Value *TrueVal = SI.getTrueValue();
@@ -3996,5 +4024,11 @@ Instruction *InstCombinerImpl::visitSelectInst(SelectInst &SI) {
     }
   }
 
+  // select Cond, !X, X -> xor Cond, X
+  // Note: We don't fold select Cond, Y, X -> X (iff X->Y & !X->!Y) here as
+  // it indicates that these two patterns should be canonicalized.
+  if (CondVal->getType() == SI.getType() && isKnownInversion(FalseVal, TrueVal))
+    return BinaryOperator::CreateXor(CondVal, FalseVal);
+
   return nullptr;
 }
diff --git a/llvm/test/Transforms/InstCombine/select-cmp.ll b/llvm/test/Transforms/InstCombine/select-cmp.ll
index 711fac542179f..8609bc5041ad8 100644
--- a/llvm/test/Transforms/InstCombine/select-cmp.ll
+++ b/llvm/test/Transforms/InstCombine/select-cmp.ll
@@ -345,4 +345,90 @@ define i1 @icmp_no_common(i1 %c, i8 %x, i8 %y, i8 %z) {
   ret i1 %r
 }
 
+define i1 @test_select_inverse_eq(i64 %x, i1 %y) {
+; CHECK-LABEL: @test_select_inverse_eq(
+; CHECK-NEXT:    [[CMP2:%.*]] = icmp eq i64 [[X:%.*]], 0
+; CHECK-NEXT:    [[SEL:%.*]] = xor i1 [[CMP2]], [[Y:%.*]]
+; CHECK-NEXT:    ret i1 [[SEL]]
+;
+  %cmp1 = icmp ne i64 %x, 0
+  %cmp2 = icmp eq i64 %x, 0
+  %sel = select i1 %y, i1 %cmp1, i1 %cmp2
+  ret i1 %sel
+}
+
+define i1 @test_select_inverse_signed(i64 %x, i1 %y) {
+; CHECK-LABEL: @test_select_inverse_signed(
+; CHECK-NEXT:    [[CMP2:%.*]] = icmp slt i64 [[X:%.*]], 0
+; CHECK-NEXT:    [[SEL:%.*]] = xor i1 [[CMP2]], [[Y:%.*]]
+; CHECK-NEXT:    ret i1 [[SEL]]
+;
+  %cmp1 = icmp sgt i64 %x, -1
+  %cmp2 = icmp slt i64 %x, 0
+  %sel = select i1 %y, i1 %cmp1, i1 %cmp2
+  ret i1 %sel
+}
+
+define i1 @test_select_inverse_unsigned(i64 %x, i1 %y) {
+; CHECK-LABEL: @test_select_inverse_unsigned(
+; CHECK-NEXT:    [[CMP2:%.*]] = icmp ugt i64 [[X:%.*]], 10
+; CHECK-NEXT:    [[SEL:%.*]] = xor i1 [[CMP2]], [[Y:%.*]]
+; CHECK-NEXT:    ret i1 [[SEL]]
+;
+  %cmp1 = icmp ult i64 %x, 11
+  %cmp2 = icmp ugt i64 %x, 10
+  %sel = select i1 %y, i1 %cmp1, i1 %cmp2
+  ret i1 %sel
+}
+
+define i1 @test_select_inverse_eq_ptr(ptr %x, i1 %y) {
+; CHECK-LABEL: @test_select_inverse_eq_ptr(
+; CHECK-NEXT:    [[CMP2:%.*]] = icmp ne ptr [[X:%.*]], null
+; CHECK-NEXT:    [[SEL:%.*]] = xor i1 [[CMP2]], [[Y:%.*]]
+; CHECK-NEXT:    ret i1 [[SEL]]
+;
+  %cmp1 = icmp eq ptr %x, null
+  %cmp2 = icmp ne ptr %x, null
+  %sel = select i1 %y, i1 %cmp1, i1 %cmp2
+  ret i1 %sel
+}
+
+define i1 @test_select_inverse_fail(i64 %x, i1 %y) {
+; CHECK-LABEL: @test_select_inverse_fail(
+; CHECK-NEXT:    [[CMP1:%.*]] = icmp sgt i64 [[X:%.*]], 0
+; CHECK-NEXT:    [[CMP2:%.*]] = icmp slt i64 [[X]], 0
+; CHECK-NEXT:    [[SEL:%.*]] = select i1 [[Y:%.*]], i1 [[CMP1]], i1 [[CMP2]]
+; CHECK-NEXT:    ret i1 [[SEL]]
+;
+  %cmp1 = icmp sgt i64 %x, 0
+  %cmp2 = icmp slt i64 %x, 0
+  %sel = select i1 %y, i1 %cmp1, i1 %cmp2
+  ret i1 %sel
+}
+
+define <2 x i1> @test_select_inverse_vec(<2 x i64> %x, <2 x i1> %y) {
+; CHECK-LABEL: @test_select_inverse_vec(
+; CHECK-NEXT:    [[CMP2:%.*]] = icmp eq <2 x i64> [[X:%.*]], zeroinitializer
+; CHECK-NEXT:    [[SEL:%.*]] = xor <2 x i1> [[CMP2]], [[Y:%.*]]
+; CHECK-NEXT:    ret <2 x i1> [[SEL]]
+;
+  %cmp1 = icmp ne <2 x i64> %x, zeroinitializer
+  %cmp2 = icmp eq <2 x i64> %x, zeroinitializer
+  %sel = select <2 x i1> %y, <2 x i1> %cmp1, <2 x i1> %cmp2
+  ret <2 x i1> %sel
+}
+
+define <2 x i1> @test_select_inverse_vec_fail(<2 x i64> %x, i1 %y) {
+; CHECK-LABEL: @test_select_inverse_vec_fail(
+; CHECK-NEXT:    [[CMP1:%.*]] = icmp ne <2 x i64> [[X:%.*]], zeroinitializer
+; CHECK-NEXT:    [[CMP2:%.*]] = icmp eq <2 x i64> [[X]], zeroinitializer
+; CHECK-NEXT:    [[SEL:%.*]] = select i1 [[Y:%.*]], <2 x i1> [[CMP1]], <2 x i1> [[CMP2]]
+; CHECK-NEXT:    ret <2 x i1> [[SEL]]
+;
+  %cmp1 = icmp ne <2 x i64> %x, zeroinitializer
+  %cmp2 = icmp eq <2 x i64> %x, zeroinitializer
+  %sel = select i1 %y, <2 x i1> %cmp1, <2 x i1> %cmp2
+  ret <2 x i1> %sel
+}
+
 declare void @use(i1)
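The range-based check in the patch treats two compares as inverses when their exact icmp regions are complements (`CR1.inverse() == CR2`). A toy model of that idea, shrunk from i64 to an 8-bit unsigned domain for illustration (not LLVM's ConstantRange):

```python
# Two predicates are known inverses iff their accepted value sets
# partition the whole domain (CR1.inverse() == CR2 in ConstantRange terms).
def icmp_ult(v, c):  # unsigned < on the 0..255 domain
    return v < c

def icmp_ugt(v, c):  # unsigned >
    return v > c

# `icmp ult %x, 11` and `icmp ugt %x, 10`, as in
# @test_select_inverse_unsigned above: exactly one of the two holds for
# every value, so they are known inverses even though C1 != C2.
assert all(icmp_ult(v, 11) != icmp_ugt(v, 10) for v in range(256))
```

This is why the `C1 == C2` fast path alone is not enough: ult 11 and ugt 10 use different constants yet still split the domain exactly.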

dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request May 28, 2024
@goldsteinn
Contributor

I'm somewhat partial to #90089. In general I would prefer to use existing analysis logic (isImpliedCondition...) rather than this bespoke function.
Is there anything this function can do that the other can't?

@dtcxzyw
Member Author

dtcxzyw commented May 28, 2024

I'm somewhat partial to #90089. In general I would prefer to use existing analysis logic (isImpliedCondition...) rather than this bespoke function. Is there anything this function can do that the other can't?

No. I just use this simplified helper function to reduce compile-time overhead.
See #90089 (comment)

@goldsteinn
Contributor


Okay, the compile-time impact seems like a reasonable motivation.

@dtcxzyw
Member Author

dtcxzyw commented May 28, 2024

Missed optimization from cvc5: https://alive2.llvm.org/ce/z/s3a6rd

define i1 @src (i1 %cmp23, i1 %pol) {
   %lnot = xor i1 %pol, true
   call void @use(i1 %lnot)
   %lnot.pol = select i1 %cmp23, i1 %lnot, i1 %pol
   ret i1 %lnot.pol
}

define i1 @tgt(i1 %cmp23, i1 %pol) {
   %lnot = xor i1 %pol, true
   call void @use(i1 %lnot)
   %lnot.pol = xor i1 %cmp23, %pol
   ret i1 %lnot.pol
}

declare void @use(i1)

@dtcxzyw
Member Author

dtcxzyw commented May 29, 2024


Without multi-use on xor: https://godbolt.org/z/TYxz1njab

INSTCOMBINE ITERATION #1 on src
ADD:   ret i1 %lnot.pol
ADD:   %lnot.pol = select i1 %cmp23, i1 %lnot, i1 %pol
ADD:   %lnot = xor i1 %pol, true
IC: Visiting:   %lnot = xor i1 %pol, true
IC: Visiting:   %lnot.pol = select i1 %cmp23, i1 %lnot, i1 %pol
ADD DEFERRED:   %1 = select i1 %cmp23, i1 true, i1 false
IC: Old =   %lnot.pol = select i1 %cmp23, i1 %1, i1 %pol
    New =   <badref> = xor i1 %pol, %lnot
ADD:   %lnot.pol = xor i1 %pol, %lnot
IC: ERASE   %2 = select i1 %cmp23, i1 %1, i1 %pol
ADD DEFERRED:   %1 = xor i1 %pol, true
IC: ERASE   %1 = xor i1 %pol, true
ADD:   %lnot = select i1 %cmp23, i1 true, i1 false
IC: Visiting:   %lnot = select i1 %cmp23, i1 true, i1 false
IC: Replacing   %lnot = select i1 %cmp23, i1 true, i1 false
    with i1 %cmp23
IC: Mod =   %lnot = select i1 %cmp23, i1 true, i1 false
    New =   %lnot = select i1 %cmp23, i1 true, i1 false
IC: ERASE   %lnot = select i1 %cmp23, i1 true, i1 false
IC: Visiting:   %lnot.pol = xor i1 %pol, %cmp23
IC: Visiting:   ret i1 %lnot.pol

As there is no way to remove the one-use constraint in InstCombinerImpl::foldSelectIntoOp, should we handle this pattern in this PR?

@dtcxzyw
Member Author

dtcxzyw commented May 29, 2024

@goldsteinn
Contributor

As there is no way to remove one-use constraint in InstCombinerImpl::foldSelectIntoOp, should we handle this pattern in this PR?

#84696 adds the ability to do that.

@dtcxzyw
Member Author

dtcxzyw commented Jun 3, 2024

Ping.

Contributor

@nikic nikic left a comment


LGTM

@goldsteinn
Contributor

LGTM

@dtcxzyw dtcxzyw merged commit 0a39c88 into llvm:main Jun 4, 2024
5 of 7 checks passed
@dtcxzyw dtcxzyw deleted the perf/select-inverse-v2 branch June 4, 2024 15:50
dtcxzyw added a commit that referenced this pull request Oct 16, 2024
In #93591 we introduced `isKnownInversion`, which assumes that `X` is poison implies `Y` is poison because they share common operands. But after introducing `samesign`, this assumption no longer holds if `X` is an icmp with the `samesign` flag.

Alive2 link: https://alive2.llvm.org/ce/z/rj3EwQ (Please run it locally
with this patch and AliveToolkit/alive2#1098).

This approach is the most conservative fix I can think of for this
problem: if `X` has the `samesign` flag, check that `Y` also has the
flag and that the constant RHS operands have the same sign.

Fixes #112350.
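The hazard can be illustrated with a toy poison model (hypothetical helper names; `None` stands in for poison; this is not LLVM's semantics machinery). An icmp carrying `samesign` yields poison when its operands have different signs:

```python
def icmp_slt_samesign(a, b):
    """`icmp samesign slt a, b`: poison (None) if signs differ."""
    if (a < 0) != (b < 0):
        return None  # samesign violated -> poison
    return a < b

def icmp_sge(a, b):
    """Plain `icmp sge a, b`, no samesign flag: never poison here."""
    return a >= b

# X = icmp samesign slt %a, %b is poison for (a, b) = (-1, 1), but its
# syntactic inverse Y = icmp sge %a, %b is well-defined (False), so
# "X is poison implies Y is poison" no longer holds and the xor fold
# would be unsound without the extra samesign check.
x, y = icmp_slt_samesign(-1, 1), icmp_sge(-1, 1)
assert x is None and y is False
```

When both compares carry `samesign` (or neither does), the poison conditions line up again, which is what the conservative fix enforces.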