Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
[LoopVectorizer] Bundle partial reductions inside VPMulAccumulateReductionRecipe #136173
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: users/SamTebbs33/elvis-vp-arm-mve-transform
Are you sure you want to change the base?
[LoopVectorizer] Bundle partial reductions inside VPMulAccumulateReductionRecipe #136173
Changes from all commits
a8edb86
595d8ff
44ccc68
07501bd
5f6428a
81b34c1
6d1d4ab
f809c68
f1b15d3
3f02b10
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried downloading this patch and applying to the HEAD of LLVM and
patch
said this diff had already been applied. Does the PR need rebasing?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah perhaps this is my mistake. You did say it depends upon #113903. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's the case :). Let me know if you have any issues applying it after applying 113903 too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to make the change of
VPPartialReductionRecipe : public VPSingleDefRecipe
->VPPartialReductionRecipe : public VPReductionRecipe
as an NFC change? (For cases around VPMulAccumulateReductionRecipes you can initially add some asserts that the recipe isn't a partial reduction, because that won't be supported until this PR lands)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I could make it an NFC change, since to conform to
VPReductionRecipe
, the accumulator and binop have to be swapped around.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(We discussed this offline) swapping the operands in the debug-print function of the recipe is not something that really concerns me, and I think there's still value making this functionally (from end-user perspective) NFC change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've pre-committed the NFC but rebasing Elvis's changes on top of that has been pretty challenging considering the number of commits on that branch. So I will cherry-pick the NFC on to this branch and it'll just go away once Elvis's PR lands and I rebase this PR on top of main.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the
classof
forVPReductionRecipe
now includeVPPartialReductionRecipe
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be given the
Range &
and for that range to be clamped if it doesn't match or if the cost is higher than the individual operations (similar to what happens intryToCreateAbstractReductionRecipe
) ?(note that the cost part is still missing)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point we've already created the partial reduction and clamped the range so I don't think we need to do any costing (like
tryToMatchAndCreateMulAccumulateReduction
does withgetMulAccReductionCost
) since we already know it's worthwhile (seegetScaledReductions
in LoopVectorize.cpp). This part of the code just puts the partial reduction inside the abstract recipe, which shouldn't need to consider any costing.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way I read the code is that at the point of getting to this point in the code, it has recognised a reduction so there is a
VP[Partial]ReductionRecipe
. It then tries to analyse whether that recipe can be transformed into aVPMulAccumulateReductionRecipe
. ForVPReductionRecipe
it will clamp the range to all the VFs that can be turned into aVPMulAccumulateReductionRecipe
, but forVPPartialReductionRecipe
it doesn't do that. I don't see why for partial reductions we'd do something different.In fact, why wouldn't the
tryToMatchAndCreateMulAccumulateReduction
code be sufficient here? Now that you've madeVPPartialReductionRecipe
a subclass ofVPReductionRecipe
, I'd expect that code to function roughly the same.