-
Notifications
You must be signed in to change notification settings - Fork 14k
[FlattenCFG] Fixup Phi nodes during CFG flattening #143766
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-backend-amdgpu @llvm/pr-subscribers-llvm-transforms Author: None (HighW4y2H3ll) ChangesSimplifyCFG currently only merges basic block into its predecessor with Full diff: https://github.com/llvm/llvm-project/pull/143766.diff 5 Files Affected:
diff --git a/llvm/lib/Transforms/Scalar/SimplifyCFGPass.cpp b/llvm/lib/Transforms/Scalar/SimplifyCFGPass.cpp
index 4e437e9abeb43..864a3c35ed79e 100644
--- a/llvm/lib/Transforms/Scalar/SimplifyCFGPass.cpp
+++ b/llvm/lib/Transforms/Scalar/SimplifyCFGPass.cpp
@@ -264,6 +264,8 @@ static bool iterativelySimplifyCFG(Function &F, const TargetTransformInfo &TTI,
if (simplifyCFG(&BB, TTI, DTU, Options, LoopHeaders)) {
LocalChange = true;
++NumSimpl;
+ // Function iterator might not be valid anymore
+ break;
}
}
Changed |= LocalChange;
diff --git a/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp b/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
index 6608515e1cbbc..f9fda9625ea36 100644
--- a/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
+++ b/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
@@ -207,6 +207,10 @@ bool llvm::MergeBlockIntoPredecessor(BasicBlock *BB, DomTreeUpdater *DTU,
if (PredecessorWithTwoSuccessors) {
if (!(PredBB_BI = dyn_cast<BranchInst>(PTI)))
return false;
+ // Can't merge if any of the instructions has side effects
+ for (Instruction &candid : *BB)
+ if (candid.mayHaveSideEffects())
+ return false;
BranchInst *BB_JmpI = dyn_cast<BranchInst>(BB->getTerminator());
if (!BB_JmpI || !BB_JmpI->isUnconditional())
return false;
@@ -283,6 +287,35 @@ bool llvm::MergeBlockIntoPredecessor(BasicBlock *BB, DomTreeUpdater *DTU,
if (MSSAU)
MSSAU->moveAllAfterMergeBlocks(BB, PredBB, Start);
+ // PHI nodes that references the Pred shall be replaced with a SelectInst
+ for (BasicBlock *Succ : successors(PTI)) {
+ for (PHINode &PhiSucc : Succ->phis()) {
+ if (llvm::is_contained(PhiSucc.blocks(), PredBB) && llvm::is_contained(PhiSucc.blocks(), BB)) {
+ if (PredBB_BI->isConditional()) {
+ Value *Cond = PredBB_BI->getCondition();
+ Value *OrigPhi0 = PhiSucc.removeIncomingValue(PredBB, false);
+ Value *OrigPhi1 = PhiSucc.removeIncomingValue(BB, false);
+
+ SelectInst *newPhi = nullptr;
+ // Swap branch given the conditions
+ if (PredBB_BI->getSuccessor(0) == Succ) {
+ newPhi = SelectInst::Create(Cond, OrigPhi0, OrigPhi1, "", PTI->getIterator());
+ } else {
+ newPhi = SelectInst::Create(Cond, OrigPhi1, OrigPhi0, "", PTI->getIterator());
+ }
+
+ // Erase PhiNode if its empty
+ if (PhiSucc.getNumOperands() == 0) {
+ PhiSucc.replaceAllUsesWith(newPhi);
+ PhiSucc.eraseFromParent();
+ break;
+ }
+ PhiSucc.addIncoming(newPhi, PredBB);
+ }
+ }
+ }
+ }
+
// Make all PHI nodes that referred to BB now refer to Pred as their
// source...
BB->replaceAllUsesWith(PredBB);
diff --git a/llvm/lib/Transforms/Utils/Local.cpp b/llvm/lib/Transforms/Utils/Local.cpp
index be71cb69ad8cc..c382c1e385134 100644
--- a/llvm/lib/Transforms/Utils/Local.cpp
+++ b/llvm/lib/Transforms/Utils/Local.cpp
@@ -153,7 +153,9 @@ bool llvm::ConstantFoldTerminator(BasicBlock *BB, bool DeleteDeadConditions,
// Let the basic block know that we are letting go of one copy of it.
assert(BI->getParent() && "Terminator not inserted in block!");
- Dest1->removePredecessor(BI->getParent());
+ // Keep the predecesor if Dest has multiple incoming edges
+ if (Dest1->getUniquePredecessor())
+ Dest1->removePredecessor(BI->getParent());
// Replace the conditional branch with an unconditional one.
BranchInst *NewBI = Builder.CreateBr(Dest1);
diff --git a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
index e221022bb8361..27b0692ca7a4f 100644
--- a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
@@ -8337,7 +8337,7 @@ bool SimplifyCFGOpt::simplifyOnce(BasicBlock *BB) {
// Merge basic blocks into their predecessor if there is only one distinct
// pred, and if there is only one distinct successor of the predecessor, and
// if there are no PHI nodes.
- if (MergeBlockIntoPredecessor(BB, DTU))
+ if (MergeBlockIntoPredecessor(BB, DTU, nullptr, nullptr, nullptr, true))
return true;
if (SinkCommon && Options.SinkCommonInsts)
diff --git a/llvm/test/Transforms/SimplifyCFG/two-succ-merge-block.ll b/llvm/test/Transforms/SimplifyCFG/two-succ-merge-block.ll
new file mode 100644
index 0000000000000..eb8ac774b6e5b
--- /dev/null
+++ b/llvm/test/Transforms/SimplifyCFG/two-succ-merge-block.ll
@@ -0,0 +1,58 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --tool ./opt --version 5
+; RUN: opt < %s -passes=simplifycfg -S | FileCheck %s
+
+define i1 @_Z7compareRK1SS1_(ptr %a, ptr %b) {
+; CHECK-LABEL: @_Z7compareRK1SS1_(
+; CHECK-SAME: ptr [[A:%.*]], ptr [[B:%.*]]) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: %0 = load i32, ptr %a, align 4, !tbaa !3
+; CHECK-NEXT: %1 = load i32, ptr %b, align 4, !tbaa !3
+; CHECK-NEXT: %cmp.i = icmp slt i32 %0, %1
+; CHECK-NEXT: %cmp.i19 = icmp eq i32 %0, %1
+; CHECK-NEXT: %y = getelementptr inbounds nuw i8, ptr %a, i64 4
+; CHECK-NEXT: %2 = load i32, ptr %y, align 4, !tbaa !8
+; CHECK-NEXT: %y14 = getelementptr inbounds nuw i8, ptr %b, i64 4
+; CHECK-NEXT: %3 = load i32, ptr %y14, align 4, !tbaa !8
+; CHECK-NEXT: %cmp = icmp slt i32 %2, %3
+; CHECK-NEXT: %4 = select i1 %cmp.i19, i1 %cmp, i1 false
+; CHECK-NEXT: %5 = select i1 %cmp.i, i1 true, i1 %4
+; CHECK-NEXT: br label %lor.end
+; CHECK-EMPTY:
+; CHECK-NEXT: lor.end: ; preds = %entry
+; CHECK-NEXT: ret i1 %5
+entry:
+ %0 = load i32, ptr %a, align 4, !tbaa !3
+ %1 = load i32, ptr %b, align 4, !tbaa !3
+ %cmp.i = icmp slt i32 %0, %1
+ br i1 %cmp.i, label %lor.end, label %lor.rhs
+
+lor.rhs: ; preds = %entry
+ %cmp.i19 = icmp eq i32 %0, %1
+ br i1 %cmp.i19, label %land.rhs, label %lor.end
+
+land.rhs: ; preds = %lor.rhs
+ %y = getelementptr inbounds nuw i8, ptr %a, i64 4
+ %2 = load i32, ptr %y, align 4, !tbaa !8
+ %y14 = getelementptr inbounds nuw i8, ptr %b, i64 4
+ %3 = load i32, ptr %y14, align 4, !tbaa !8
+ %cmp = icmp slt i32 %2, %3
+ br label %lor.end
+
+lor.end: ; preds = %lor.rhs, %land.rhs, %entry
+ %4 = phi i1 [ true, %entry ], [ false, %lor.rhs ], [ %cmp, %land.rhs ]
+ ret i1 %4
+}
+
+!llvm.module.flags = !{!0, !1}
+!llvm.ident = !{!2}
+
+!0 = !{i32 1, !"wchar_size", i32 4}
+!1 = !{i32 7, !"uwtable", i32 2}
+!2 = !{!"clang"}
+!3 = !{!4, !5, i64 0}
+!4 = !{!"_ZTS1S", !5, i64 0, !5, i64 4}
+!5 = !{!"int", !6, i64 0}
+!6 = !{!"omnipotent char", !7, i64 0}
+!7 = !{!"Simple C++ TBAA"}
+!8 = !{!4, !5, i64 4}
+!9 = !{!5, !5, i64 0}
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
Still have a lot of unit testing failures that needs to be double checked... but would like to hear some feedbacks on this if possible... thx :) |
e315f89
to
d912e14
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like you modified just MergeBlockIntoPredecessor without actually using the PredecessorWithTwoSuccessors mode anywhere in SimplifyCFG, so I don't see how your test could possibly work.
If I understand what you're trying to do, this is effectively speculating blocks into predecessors. Checking for side effects is incorrect there, you need to check speculability. For example, in your test case speculation is not legal because the loads are non-speculatable. On top of that, speculating instructions requires cost modelling, you can't just dump a block of arbitrary size into the predecessor, even though it may not be executed at all.
SimplifyCFG already has various transforms that do this correctly and with cost modelling.
Thanks for checking!! Sorry that I missed a commit earlier which actually uses the PredecessorWithTwoSuccessors. The |
3432037
to
7194cc0
Compare
@nikic I fixed up the FlattenCFG pass which shall be able to handle this CFG now! Could you take another look when its convenient? Thank you! :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like your transform lost the contents of land.rhs entirely?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ohh not at all, I just skipped the checking of the land.rhs
since its entirely unchanged. let me add it there if that works better.
@nikic Could you take a look at this PR again? I've fixed all the unittests and it shall be ready for review! :) |
Also cc @arsenm since it's touching quite a bit of AMDGPU unit tests... thx! :) |
FlattenCFG Pass is skipped when Basic Block contains a Phi node. Fix up the Phi node with a SelectInst so that FlattenCFG can be ran again.