[VPlan] Impl VPlan-based pattern match for ExtendedRed and MulAccRed #113903

Open
wants to merge 56 commits into
base: main
33b1f60
[VPlan] Impl VPlan-based pattern match for ExtendedRed and MulAccRed.…
ElvisWang123 Oct 28, 2024
68fbd70
Partially support Extended-reduction.
ElvisWang123 Nov 4, 2024
c8c9d56
Support MulAccRecipe
ElvisWang123 Nov 5, 2024
d29a118
Fix several errors and update tests.
ElvisWang123 Nov 6, 2024
e5b50f7
Refactors
ElvisWang123 Nov 6, 2024
cc004ff
Fix typos and update printing test
ElvisWang123 Nov 7, 2024
b5445ca
Fold reduce.add(zext(mul(sext(A), sext(B)))) into MulAccRecipe when A…
ElvisWang123 Nov 11, 2024
1df91d4
Refactor! Reuse functions from VPReductionRecipe.
ElvisWang123 Nov 11, 2024
a0b2f30
Refactor! Add comments and refine new recipes.
ElvisWang123 Nov 12, 2024
46928bd
Remove underlying instruction dependency.
ElvisWang123 Nov 14, 2024
35abf19
Revert "Remove underlying instruction dependency."
ElvisWang123 Nov 14, 2024
453997e
Remove extended instruction after mul in MulAccRecipe.
ElvisWang123 Nov 15, 2024
fa4f476
Refactor.
ElvisWang123 Nov 15, 2024
86ad2d8
Clamp the range when the ExtendedReduction or MulAcc cost is invalid.
ElvisWang123 Nov 15, 2024
594f9e4
Try to not depend on underlying ext/mul instructions and preserve fla…
ElvisWang123 Nov 18, 2024
52369d0
Update testcase and fix reduction cost.
ElvisWang123 Nov 25, 2024
abc08f3
!fixup. Rebase to upstream `prepareToExecute()` implementation.
ElvisWang123 Dec 5, 2024
729a70e
Make VPReductionRecipe inherit from VPRecipeWithIRFlags.
ElvisWang123 Dec 11, 2024
ea58282
Only create VPMulAcc/VPExtendedReduction recipe when beneficial. NFC
ElvisWang123 Dec 11, 2024
1c22ce2
Merge branch 'main' into vp-arm-mve-transform
ElvisWang123 Dec 11, 2024
a987456
!fixup use `auto`
ElvisWang123 Dec 12, 2024
6c434c7
!fixup VPReductionRecipe unit tests.
ElvisWang123 Dec 12, 2024
f4b1b78
!fixup migrate tryTo* to VPlanTransforms
ElvisWang123 Dec 23, 2024
bffcac5
Implement clone() and add some docs.
ElvisWang123 Dec 23, 2024
da705f1
Update comments.
ElvisWang123 Dec 23, 2024
1dc279e
fix-ReductionEVLRecipe query underlyingInstr().
ElvisWang123 Dec 23, 2024
20ea82e
Merge branch 'main' into vp-arm-mve-transform
ElvisWang123 Dec 23, 2024
90f9ffa
Update after merge.
ElvisWang123 Dec 23, 2024
99512fe
Address comments and split off abstract recipes creation from adjustR…
ElvisWang123 Dec 26, 2024
2e4014a
!fixup using foldTailWithEVL.
ElvisWang123 Dec 27, 2024
38dd924
!fixup, remove extra debugLoc and move check of EVL out of transforms.
ElvisWang123 Jan 21, 2025
602a5e4
Merge branch 'main' into vp-arm-mve-transform
ElvisWang123 Jan 21, 2025
1939d44
Update after merge main.
ElvisWang123 Jan 22, 2025
2ee6e76
Merge branch 'main' into vp-arm-mve-transform
ElvisWang123 Feb 17, 2025
d584fc1
Update after merge. Using runPass::().
ElvisWang123 Feb 18, 2025
21b33e6
!fixup, Remove unused check and functions.
ElvisWang123 Feb 26, 2025
ae371e5
Merge branch 'main' into vp-arm-mve-transform
ElvisWang123 Mar 4, 2025
0d7b7f3
!fixup; Address comments.
ElvisWang123 Mar 4, 2025
e12bd04
!fixup, Add Mul cost to prevent FMuladd Reduction cost misaligned.
ElvisWang123 Mar 7, 2025
4906637
!Fixup, typo.
ElvisWang123 Mar 10, 2025
ca5db10
Merge branch 'main' into vp-arm-mve-transform
ElvisWang123 Mar 18, 2025
2fbdc7c
!fixup, Address comments and fix VPReductionRecipe::computeCost
ElvisWang123 Mar 19, 2025
38d83bf
Merge branch 'main' into vp-arm-mve-transform
ElvisWang123 Mar 19, 2025
3e2acad
Merge branch 'main' into vp-arm-mve-transform
ElvisWang123 Mar 19, 2025
d2a5a43
!fixup, Update after merge, using std::array.
ElvisWang123 Mar 19, 2025
484f9cc
fixup, formatting.
ElvisWang123 Mar 19, 2025
cd86af4
!fixup, address comments.
ElvisWang123 Mar 20, 2025
84f8a46
Merge branch 'main' into vp-arm-mve-transform
ElvisWang123 Mar 25, 2025
36e1032
!fixup, formatting and address comments.
ElvisWang123 Mar 25, 2025
2483a29
!fixup, Update inferScalarType and not clear the VF of plan.
ElvisWang123 Apr 7, 2025
56dcd90
!fixup, address comments.
ElvisWang123 Apr 10, 2025
26d938a
Merge branch 'main' into vp-arm-mve-transform
ElvisWang123 Apr 18, 2025
b32538f
!fixup, address comments.
ElvisWang123 Apr 18, 2025
fd539f8
!fixup, address comments and using `transferFlags()` to copy nneg.
ElvisWang123 Apr 23, 2025
71c7401
!fixup, address comments.
ElvisWang123 Apr 23, 2025
7da7983
!fixup, Add new recipes to mayReadWriteMemory.
ElvisWang123 Apr 24, 2025
75 changes: 15 additions & 60 deletions llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -7559,62 +7559,6 @@ LoopVectorizationPlanner::precomputeCosts(VPlan &Plan, ElementCount VF,
}
}

// The legacy cost model has special logic to compute the cost of in-loop
// reductions, which may be smaller than the sum of all instructions involved
// in the reduction.
// TODO: Switch to costing based on VPlan once the logic has been ported.
for (const auto &[RedPhi, RdxDesc] : Legal->getReductionVars()) {
if (ForceTargetInstructionCost.getNumOccurrences())
continue;

if (!CM.isInLoopReduction(RedPhi))
continue;

const auto &ChainOps = RdxDesc.getReductionOpChain(RedPhi, OrigLoop);
SetVector<Instruction *> ChainOpsAndOperands(llvm::from_range, ChainOps);
auto IsZExtOrSExt = [](const unsigned Opcode) -> bool {
return Opcode == Instruction::ZExt || Opcode == Instruction::SExt;
};
// Also include the operands of instructions in the chain, as the cost-model
// may mark extends as free.
//
// For ARM, some of the instructions can be folded into the reduction
// instruction, so we need to mark all folded instructions as free.
// For example, we can fold reduce(mul(ext(A), ext(B))) into one
// instruction.
for (auto *ChainOp : ChainOps) {
for (Value *Op : ChainOp->operands()) {
if (auto *I = dyn_cast<Instruction>(Op)) {
ChainOpsAndOperands.insert(I);
if (I->getOpcode() == Instruction::Mul) {
auto *Ext0 = dyn_cast<Instruction>(I->getOperand(0));
auto *Ext1 = dyn_cast<Instruction>(I->getOperand(1));
if (Ext0 && IsZExtOrSExt(Ext0->getOpcode()) && Ext1 &&
Ext0->getOpcode() == Ext1->getOpcode()) {
ChainOpsAndOperands.insert(Ext0);
ChainOpsAndOperands.insert(Ext1);
}
}
}
}
}

// Pre-compute the cost for I, if it has a reduction pattern cost.
for (Instruction *I : ChainOpsAndOperands) {
auto ReductionCost =
CM.getReductionPatternCost(I, VF, toVectorTy(I->getType(), VF));
if (!ReductionCost)
continue;

assert(!CostCtx.SkipCostComputation.contains(I) &&
"reduction op visited multiple times");
CostCtx.SkipCostComputation.insert(I);
LLVM_DEBUG(dbgs() << "Cost of " << ReductionCost << " for VF " << VF
<< ":\n in-loop reduction " << *I << "\n");
Cost += *ReductionCost;
}
}

// Pre-compute the costs for branches except for the backedge, as the number
// of replicate regions in a VPlan may not directly match the number of
// branches, which would lead to different decisions.
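
The legacy logic removed in the hunk above treats `mul(ext(A), ext(B))` as foldable into the reduction only when both operands are extends of the same kind. A minimal standalone sketch of that check follows. It runs on a toy expression node rather than on LLVM's `Instruction` or any VPlan recipe; `Node`, `Op`, and `isMulOfMatchingExtends` are hypothetical names introduced here for illustration, not part of the PR.

```cpp
#include <cassert>
#include <vector>

// Toy expression node for illustration only; this is NOT llvm::Instruction
// and NOT a VPlan recipe class.
enum class Op { Leaf, ZExt, SExt, Mul, Add };

struct Node {
  Op Opcode;
  std::vector<const Node *> Operands;
};

static bool isZExtOrSExt(Op O) { return O == Op::ZExt || O == Op::SExt; }

// Returns true when N has the shape mul(ext(A), ext(B)) and both extends use
// the same opcode (both zext or both sext). This mirrors the condition in the
// removed legacy code; mixed extends are rejected.
static bool isMulOfMatchingExtends(const Node &N) {
  if (N.Opcode != Op::Mul || N.Operands.size() != 2)
    return false;
  const Node *Ext0 = N.Operands[0];
  const Node *Ext1 = N.Operands[1];
  return Ext0 && Ext1 && isZExtOrSExt(Ext0->Opcode) &&
         Ext0->Opcode == Ext1->Opcode;
}
```

With this sketch, `mul(sext(A), sext(B))` matches, while `mul(sext(A), zext(B))` and a plain `mul(A, B)` do not.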
@@ -9757,10 +9701,6 @@ LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(VFRange &Range) {
"entry block must be set to a VPRegionBlock having a non-empty entry "
"VPBasicBlock");

for (ElementCount VF : Range)
Plan->addVF(VF);
Plan->setName("Initial VPlan");

// Update wide induction increments to use the same step as the corresponding
// wide induction. This enables detecting induction increments directly in
// VPlan and removes redundant splats.
@@ -9796,6 +9736,21 @@ LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(VFRange &Range) {
// Adjust the recipes for any inloop reductions.
adjustRecipesForReductions(Plan, RecipeBuilder, Range.Start);

// Transform recipes to abstract recipes if it is legal and beneficial and
// clamp the range for better cost estimation.
// TODO: Enable following transform when the EVL-version of extended-reduction
// and mulacc-reduction are implemented.
if (!CM.foldTailWithEVL()) {
Contributor

Why do we need to special case this for EVL? Shouldn't the cost-model tell us that the combined reductions aren't profitable?

Contributor Author

This is not a cost-model issue; the EVL recipe generation is simply not implemented yet.
Abstract recipes are transformed to concrete recipes after the EVL transforms, so we would need to generate a VPReductionEVLRecipe when converting to the concrete recipes.

VPCostContext CostCtx(CM.TTI, *CM.TLI, Legal->getWidestInductionType(), CM,
CM.CostKind);
VPlanTransforms::runPass(VPlanTransforms::convertToAbstractRecipes, *Plan,
CostCtx, Range);
}

for (ElementCount VF : Range)
Plan->addVF(VF);
Plan->setName("Initial VPlan");

// Interleave memory: for each Interleave Group we marked earlier as relevant
// for this VPlan, replace the Recipes widening its memory instructions with a
// single VPInterleaveRecipe at its insertion point.
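
The comment in the last hunk says the transform will "clamp the range for better cost estimation". One way to read this: a decision is evaluated at the first VF, and the range is cut at the first VF where the answer changes, so a single VPlan only covers VFs that agree. The sketch below illustrates that idea on a toy range. It is loosely modeled on LLVM's `getDecisionAndClampRange` helper; whether the PR's transform calls that helper in exactly this way is an assumption. `ToyVFRange` and `decideAndClampRange` are hypothetical names, not LLVM APIs.

```cpp
#include <cassert>
#include <functional>

// Toy half-open range [Start, End) of power-of-two vectorization factors.
// A stand-in for illustration, not LLVM's VFRange.
struct ToyVFRange {
  unsigned Start;
  unsigned End;
};

// Evaluates Predicate at Range.Start, then shrinks Range.End to the first VF
// whose answer differs, so every VF left in the range shares one decision.
static bool decideAndClampRange(const std::function<bool(unsigned)> &Predicate,
                                ToyVFRange &Range) {
  assert(Range.Start < Range.End && "empty VF range");
  const bool DecisionAtStart = Predicate(Range.Start);
  for (unsigned VF = Range.Start * 2; VF < Range.End; VF *= 2) {
    if (Predicate(VF) != DecisionAtStart) {
      Range.End = VF; // Later VFs are left for a separate plan.
      break;
    }
  }
  return DecisionAtStart;
}
```

For example, if a combined reduction is only considered beneficial for VF <= 4, a starting range of [2, 32) is clamped to [2, 8), and the call returns the decision made at VF = 2.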