Skip to content

[VPlan] Introduce VPScalarPHIRecipe, use for can & EVL IV codegen (NFC). #114305

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Dec 3, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -7722,6 +7722,7 @@ DenseMap<const SCEV *, Value *> LoopVectorizationPlanner::executePlan(
BestVPlan.prepareToExecute(ILV.getTripCount(),
ILV.getOrCreateVectorTripCount(nullptr),
CanonicalIVStartValue, State);
VPlanTransforms::prepareToExecute(BestVPlan);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better rename this VPlan2VPlan conversion pass, say, convertToConcreteRecipes(), and place it above, following VPlanTransforms::optimizeForVFAndUF(), rather than here, in the midst of State-dependent code generation? (BestVPlan.prepareToExecute() actually starts to generate code.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adjusted in afef545, thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restored the original position for now, will need to check where there are remaining users of getCanonicalIV, and how to best update them


BestVPlan.execute(&State);

Expand Down
7 changes: 3 additions & 4 deletions llvm/lib/Transforms/Vectorize/VPlan.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1070,10 +1070,9 @@ void VPlan::execute(VPTransformState *State) {
}

auto *PhiR = cast<VPHeaderPHIRecipe>(&R);
bool NeedsScalar =
isa<VPCanonicalIVPHIRecipe, VPEVLBasedIVPHIRecipe>(PhiR) ||
(isa<VPReductionPHIRecipe>(PhiR) &&
cast<VPReductionPHIRecipe>(PhiR)->isInLoop());
bool NeedsScalar = isa<VPScalarPHIRecipe>(PhiR) ||
(isa<VPReductionPHIRecipe>(PhiR) &&
cast<VPReductionPHIRecipe>(PhiR)->isInLoop());
Comment on lines +1074 to +1075
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In-loop reductions should also be converted to VPScalarHeaderPHIRecipe?

Value *Phi = State->get(PhiR, NeedsScalar);
Value *Val = State->get(PhiR->getBackedgeValue(), NeedsScalar);
cast<PHINode>(Phi)->addIncoming(Val, VectorLatchBB);
Expand Down
52 changes: 47 additions & 5 deletions llvm/lib/Transforms/Vectorize/VPlan.h
Original file line number Diff line number Diff line change
Expand Up @@ -2239,6 +2239,45 @@ class VPWidenPointerInductionRecipe : public VPHeaderPHIRecipe,
#endif
};

/// Recipe to generate a scalar PHI. Used to generate code for recipes that
/// produce scalar header phis, including VPCanonicalIVPHIRecipe and
/// VPEVLBasedIVPHIRecipe.
class VPScalarPHIRecipe : public VPHeaderPHIRecipe {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VPScalarHeaderPHIRecipe?

std::string Name;

public:
VPScalarPHIRecipe(VPValue *Start, VPValue *BackedgeValue, DebugLoc DL,
StringRef Name)
: VPHeaderPHIRecipe(VPDef::VPScalarPHISC, nullptr, Start, DL),
Name(Name.str()) {
addOperand(BackedgeValue);
}

~VPScalarPHIRecipe() override = default;

VPScalarPHIRecipe *clone() override {
llvm_unreachable("cloning not implemented yet");
}

VP_CLASSOF_IMPL(VPDef::VPScalarPHISC)

/// Generate the phi/select nodes.
void execute(VPTransformState &State) override;

/// Returns true if the recipe only uses the first lane of operand \p Op.
bool onlyFirstLaneUsed(const VPValue *Op) const override {
assert(is_contained(operands(), Op) &&
"Op must be an operand of the recipe");
return true;
}

#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
/// Print the recipe.
void print(raw_ostream &O, const Twine &Indent,
VPSlotTracker &SlotTracker) const override;
#endif
};

/// A recipe for handling phis that are widened in the vector loop.
/// In the VPlan native path, all incoming VPValues & VPBasicBlock pairs are
/// managed in the recipe directly.
Expand Down Expand Up @@ -3134,8 +3173,10 @@ class VPCanonicalIVPHIRecipe : public VPHeaderPHIRecipe {
return D->getVPDefID() == VPDef::VPCanonicalIVPHISC;
}

/// Generate the canonical scalar induction phi of the vector loop.
void execute(VPTransformState &State) override;
void execute(VPTransformState &State) override {
llvm_unreachable(
"cannot execute this recipe, should be replaced by VPScalarPHIRecipe");
}

#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
/// Print the recipe.
Expand Down Expand Up @@ -3231,9 +3272,10 @@ class VPEVLBasedIVPHIRecipe : public VPHeaderPHIRecipe {
return D->getVPDefID() == VPDef::VPEVLBasedIVPHISC;
}

/// Generate phi for handling IV based on EVL over iterations correctly.
/// TODO: investigate if it can share the code with VPCanonicalIVPHIRecipe.
void execute(VPTransformState &State) override;
void execute(VPTransformState &State) override {
llvm_unreachable(
"cannot execute this recipe, should be replaced by VPScalarPHIRecipe");
}

/// Return the cost of this VPEVLBasedIVPHIRecipe.
InstructionCost computeCost(ElementCount VF,
Expand Down
16 changes: 8 additions & 8 deletions llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -213,14 +213,14 @@ Type *VPTypeAnalysis::inferScalarType(const VPValue *V) {
TypeSwitch<const VPRecipeBase *, Type *>(V->getDefiningRecipe())
.Case<VPActiveLaneMaskPHIRecipe, VPCanonicalIVPHIRecipe,
VPFirstOrderRecurrencePHIRecipe, VPReductionPHIRecipe,
VPWidenPointerInductionRecipe, VPEVLBasedIVPHIRecipe>(
[this](const auto *R) {
// Handle header phi recipes, except VPWidenIntOrFpInduction
// which needs special handling due it being possibly truncated.
// TODO: consider inferring/caching type of siblings, e.g.,
// backedge value, here and in cases below.
return inferScalarType(R->getStartValue());
})
VPWidenPointerInductionRecipe, VPEVLBasedIVPHIRecipe,
VPScalarPHIRecipe>([this](const auto *R) {
// Handle header phi recipes, except VPWidenIntOrFpInduction
// which needs special handling due it being possibly truncated.
// TODO: consider inferring/caching type of siblings, e.g.,
// backedge value, here and in cases below.
return inferScalarType(R->getStartValue());
})
.Case<VPWidenIntOrFpInductionRecipe, VPDerivedIVRecipe>(
[](const auto *R) { return R->getScalarType(); })
.Case<VPReductionRecipe, VPPredInstPHIRecipe, VPWidenPHIRecipe,
Expand Down
41 changes: 22 additions & 19 deletions llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3102,17 +3102,6 @@ InstructionCost VPInterleaveRecipe::computeCost(ElementCount VF,
VectorTy, std::nullopt, CostKind, 0);
}

void VPCanonicalIVPHIRecipe::execute(VPTransformState &State) {
Value *Start = getStartValue()->getLiveInIRValue();
PHINode *Phi = PHINode::Create(Start->getType(), 2, "index");
Phi->insertBefore(State.CFG.PrevBB->getFirstInsertionPt());

BasicBlock *VectorPH = State.CFG.getPreheaderBBFor(this);
Phi->addIncoming(Start, VectorPH);
Phi->setDebugLoc(getDebugLoc());
State.set(this, Phi, /*IsScalar*/ true);
}

#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
void VPCanonicalIVPHIRecipe::print(raw_ostream &O, const Twine &Indent,
VPSlotTracker &SlotTracker) const {
Expand Down Expand Up @@ -3154,8 +3143,6 @@ void VPWidenPointerInductionRecipe::execute(VPTransformState &State) {
assert(!onlyScalarsGenerated(State.VF.isScalable()) &&
"Recipe should have been replaced");

auto *IVR = getParent()->getPlan()->getCanonicalIV();
PHINode *CanonicalIV = cast<PHINode>(State.get(IVR, /*IsScalar*/ true));
unsigned CurrentPart = getUnrollPart(*this);

// Build a pointer phi
Expand All @@ -3165,6 +3152,12 @@ void VPWidenPointerInductionRecipe::execute(VPTransformState &State) {
BasicBlock *VectorPH = State.CFG.getPreheaderBBFor(this);
PHINode *NewPointerPhi = nullptr;
if (CurrentPart == 0) {
auto *IVR = cast<VPHeaderPHIRecipe>(&getParent()
->getPlan()
->getVectorLoopRegion()
->getEntryBasicBlock()
->front());
Comment on lines +3155 to +3159
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This deserves a method, being getCanonicalIV() w/o casting its result to VPCanonicalIVPHIRecipe?

PHINode *CanonicalIV = cast<PHINode>(State.get(IVR, /*IsScalar*/ true));
NewPointerPhi = PHINode::Create(ScStValueType, 2, "pointer.phi",
CanonicalIV->getIterator());
NewPointerPhi->addIncoming(ScalarStartValue, VectorPH);
Expand Down Expand Up @@ -3478,20 +3471,30 @@ void VPActiveLaneMaskPHIRecipe::print(raw_ostream &O, const Twine &Indent,
}
#endif

void VPEVLBasedIVPHIRecipe::execute(VPTransformState &State) {
#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
void VPEVLBasedIVPHIRecipe::print(raw_ostream &O, const Twine &Indent,
VPSlotTracker &SlotTracker) const {
O << Indent << "EXPLICIT-VECTOR-LENGTH-BASED-IV-PHI ";

printAsOperand(O, SlotTracker);
O << " = phi ";
printOperands(O, SlotTracker);
}
#endif

void VPScalarPHIRecipe::execute(VPTransformState &State) {
BasicBlock *VectorPH = State.CFG.getPreheaderBBFor(this);
Value *Start = State.get(getOperand(0), VPLane(0));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better use getStartValue() than getOperand(0).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in afef545

PHINode *Phi = State.Builder.CreatePHI(Start->getType(), 2, "evl.based.iv");
PHINode *Phi = State.Builder.CreatePHI(Start->getType(), 2, Name);
Phi->addIncoming(Start, VectorPH);
Phi->setDebugLoc(getDebugLoc());
State.set(this, Phi, /*IsScalar=*/true);
}

#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
void VPEVLBasedIVPHIRecipe::print(raw_ostream &O, const Twine &Indent,
VPSlotTracker &SlotTracker) const {
O << Indent << "EXPLICIT-VECTOR-LENGTH-BASED-IV-PHI ";

void VPScalarPHIRecipe::print(raw_ostream &O, const Twine &Indent,
VPSlotTracker &SlotTracker) const {
O << Indent << "SCALAR-PHI";
printAsOperand(O, SlotTracker);
O << " = phi ";
printOperands(O, SlotTracker);
Expand Down
21 changes: 21 additions & 0 deletions llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1821,3 +1821,24 @@ void VPlanTransforms::createInterleaveGroups(
}
}
}

void VPlanTransforms::prepareToExecute(VPlan &Plan) {
ReversePostOrderTraversal<VPBlockDeepTraversalWrapper<VPBlockBase *>> RPOT(
Plan.getVectorLoopRegion());
Comment on lines +1826 to +1827
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RPOT not needed nor used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed in afef545

for (VPBasicBlock *VPBB : VPBlockUtils::blocksOnly<VPBasicBlock>(
vp_depth_first_deep(Plan.getEntry()))) {
for (VPRecipeBase &R : make_early_inc_range(VPBB->phis())) {
if (!isa<VPCanonicalIVPHIRecipe, VPEVLBasedIVPHIRecipe>(&R))
continue;
auto *PhiR = cast<VPHeaderPHIRecipe>(&R);
StringRef Name =
isa<VPCanonicalIVPHIRecipe>(PhiR) ? "index" : "evl.based.iv";
auto *ScalarR =
new VPScalarPHIRecipe(PhiR->getStartValue(), PhiR->getBackedgeValue(),
PhiR->getDebugLoc(), Name);
Comment on lines +1837 to +1838
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could provide constructors for VPScalarHeaderPHIRecipe from canonical and EVL header phi recipes.

ScalarR->insertBefore(PhiR);
PhiR->replaceAllUsesWith(ScalarR);
PhiR->eraseFromParent();
}
}
}
3 changes: 3 additions & 0 deletions llvm/lib/Transforms/Vectorize/VPlanTransforms.h
Original file line number Diff line number Diff line change
Expand Up @@ -123,6 +123,9 @@ struct VPlanTransforms {

/// Remove dead recipes from \p Plan.
static void removeDeadRecipes(VPlan &Plan);

/// Lower abstract recipes to concrete ones, that can be codegen'd.
static void prepareToExecute(VPlan &Plan);
};

} // namespace llvm
Expand Down
1 change: 1 addition & 0 deletions llvm/lib/Transforms/Vectorize/VPlanValue.h
Original file line number Diff line number Diff line change
Expand Up @@ -359,6 +359,7 @@ class VPDef {
VPFirstOrderRecurrencePHISC,
VPWidenIntOrFpInductionSC,
VPWidenPointerInductionSC,
VPScalarPHISC,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to keep (all header phi recipes in) some order...

VPReductionPHISC,
// END: SubclassID for recipes that inherit VPHeaderPHIRecipe
// END: Phi-like recipes
Expand Down
Loading