-
Notifications
You must be signed in to change notification settings - Fork 13.4k
[RISCV] Fold addi
into load / store even if they are in different BBs.
#67024
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-backend-risc-v ChangesCurrently, since ISel only looks at one basic block at a time we miss some opportunities to combine load / store with In this PR we combine Full diff: https://github.com/llvm/llvm-project/pull/67024.diff 2 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index d176fcaf54c2db0..492714ae2a78f8e 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -49,6 +49,7 @@ using namespace llvm;
#define DEBUG_TYPE "riscv-lower"
STATISTIC(NumTailCalls, "Number of tail calls");
+STATISTIC(NumADDIsMerged, "Number of ADDIs merged.");
static cl::opt<unsigned> ExtensionMaxWebSize(
DEBUG_TYPE "-ext-max-web-size", cl::Hidden,
@@ -2278,6 +2279,84 @@ bool RISCVTargetLowering::isLegalElementTypeForRVV(EVT ScalarTy) const {
}
}
+static bool tryToFoldInstIntoUse(MachineInstr &UseMI, MachineInstr &MI) {
+
+ if (MI.getOpcode() != RISCV::ADDI)
+ return false;
+ if (!(MI.getOperand(0).isReg() && MI.getOperand(1).isReg()))
+ return false;
+
+ switch (UseMI.getOpcode()) {
+ default:
+ return false;
+ case RISCV::LB:
+ case RISCV::LH:
+ case RISCV::LW:
+ case RISCV::LD:
+ case RISCV::LBU:
+ case RISCV::LHU:
+ case RISCV::SB:
+ case RISCV::SH:
+ case RISCV::SW:
+ case RISCV::SD:
+ break;
+ }
+ MachineOperand &OriginalBaseMO = UseMI.getOperand(1);
+ if (!OriginalBaseMO.isReg())
+ return false;
+ if (OriginalBaseMO.getReg() != MI.getOperand(0).getReg())
+ return false;
+
+ MachineOperand &OriginalOffsetMO = UseMI.getOperand(2);
+ MachineOperand &ADDIOffsetMO = MI.getOperand(2);
+ if (!(OriginalOffsetMO.isImm() && ADDIOffsetMO.isImm()))
+ return false;
+
+ int64_t OriginalOffset = OriginalOffsetMO.getImm();
+ int64_t ADDIOffset = ADDIOffsetMO.getImm();
+ int64_t TotalOffset = OriginalOffset + ADDIOffset;
+ if (!isInt<12>(TotalOffset))
+ return false;
+
+ OriginalOffsetMO.setImm(TotalOffset);
+ OriginalBaseMO.setReg(MI.getOperand(1).getReg());
+ NumADDIsMerged++;
+ return true;
+}
+
+void RISCVTargetLowering::finalizeLowering(MachineFunction &MF) const {
+ TargetLoweringBase::finalizeLowering(MF);
+ MachineRegisterInfo &MRI = MF.getRegInfo();
+
+ SmallVector<MachineInstr *, 8> ToErase;
+ for (MachineFunction::iterator I = MF.begin(), E = MF.end(); I != E; ++I) {
+ MachineBasicBlock *MBB = &*I;
+ for (MachineBasicBlock::iterator MBBI = MBB->begin(), MBBE = MBB->end();
+ MBBI != MBBE;) {
+ MachineInstr &MI = *MBBI++;
+ if (MI.getOpcode() != RISCV::ADDI)
+ continue;
+ if (!MI.getOperand(0).isReg())
+ continue;
+ SmallVector<MachineInstr *, 4> Users;
+ for (MachineInstr &UseMI :
+ MRI.use_instructions(MI.getOperand(0).getReg())) {
+ Users.push_back(&UseMI);
+ }
+ bool AllUsesWereFolded = true;
+ for (MachineInstr *UseMI : Users) {
+ AllUsesWereFolded &= tryToFoldInstIntoUse(*UseMI, MI);
+ }
+ if (AllUsesWereFolded)
+ ToErase.push_back(&MI);
+ }
+ }
+ for (MachineInstr *MI : ToErase) {
+ MI->eraseFromParent();
+ }
+
+ return;
+}
unsigned RISCVTargetLowering::combineRepeatedFPDivisors() const {
return NumRepeatedDivisors;
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.h b/llvm/lib/Target/RISCV/RISCVISelLowering.h
index 815b9be47f56026..1f0416231a2f4cb 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.h
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.h
@@ -951,6 +951,8 @@ class RISCVTargetLowering : public TargetLowering {
return false;
};
+ void finalizeLowering(MachineFunction &MF) const override;
+
/// For available scheduling models FDIV + two independent FMULs are much
/// faster than two FDIVs.
unsigned combineRepeatedFPDivisors() const override;
|
Currently, since ISel only looks at one basic block at a time we miss some opportunities to combine load / store with `addi`. Such opportunities may occur when GEP and the use of GEP are in different basic blocks. In this PR we combine `addi` with memory access in `RISCVISelLowering:finalizeLowering`.
80e35d2
to
daa4a90
Compare
Tests? Isn't Does this increase register pressure? |
Why can't this be done in |
|
||
MachineOperand &OriginalOffsetMO = UseMI.getOperand(2); | ||
MachineOperand &ADDIOffsetMO = MI.getOperand(2); | ||
if (!(OriginalOffsetMO.isImm() && ADDIOffsetMO.isImm())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be an assert since we know this instruction is an ADDI?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could it be a symbol? Like something that only gets resolved by linker?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it can be a symbol. We do need to check isImm().
for (MachineBasicBlock::iterator MBBI = MBB->begin(), MBBE = MBB->end(); | ||
MBBI != MBBE;) { | ||
MachineInstr &MI = *MBBI++; | ||
if (MI.getOpcode() != RISCV::ADDI) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you already check Opcode == ADDI
and Operand0.isReg()
in tryToFoldInstIntoUse
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, thanks!
This way the diff for this PR can show what's changed (https://llvm.org/docs/TestingGuide.html#precommit-workflow-for-tests)
I hope not. Without this transformation, the |
case RISCV::SB: | ||
case RISCV::SH: | ||
case RISCV::SW: | ||
case RISCV::SD: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realized I should also include FP loads and stores.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're also missing LWU
I haven't seen this pass when I wrote this. I agree, good idea. I can't understand why is I see two questions here: (1) Should we combine this with Update: I see that there is |
We're still in SSA form where MergeBaseOffset is now. So I don't understand this statement. |
bool AllUsesWereFolded = true; | ||
for (MachineInstr *UseMI : Users) | ||
AllUsesWereFolded &= tryToFoldInstIntoUse(*UseMI, MI); | ||
if (AllUsesWereFolded) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If all uses aren't folded, then this would increase register pressure wouldn't it?
Could we use https://reviews.llvm.org/D152828 for this? |
Good idea, I'll take a look but not this week. |
Currently, since ISel only looks at one basic block at a time we miss some opportunities to combine load / store with
addi
. Such opportunities may occur when GEP and the use of GEP are in different basic blocks.In this PR we combine
addi
with memory access inRISCVISelLowering:finalizeLowering
.According to my measurements, this patch is about 1% in dynamic instruciton count on mcf.