[MCP] Optimize copies when src is used during backward propagation #111130
Conversation
@llvm/pr-subscribers-backend-aarch64 @llvm/pr-subscribers-backend-arm

Author: Vladimir Radosavljevic (vladimirradosavljevic)

Changes

Before this patch, the redundant COPY could not be removed in the following case:

```
$R0 = OP ...
...                  // Read of $R0
$R1 = COPY killed $R0
```

This patch adds support for tracking the users of the source register during backward propagation, so that the redundant COPY in the case above can be removed and the code optimized to:

```
$R1 = OP ...
...                  // Replace all uses of $R0 with $R1
```
Patch is 1.37 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/111130.diff 65 Files Affected:
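The core idea — while walking the block backward, collect the readers of the COPY's source and rewrite them together with the defining instruction — can be sketched outside of LLVM as a small standalone model. This is not the LLVM API: `Inst` and `backwardPropagate` are made-up names, and all of the legality checks the real pass performs (renamability, tied and implicit operands, register-class compatibility) are omitted.

```cpp
#include <string>
#include <vector>

struct Inst {
  std::string Op;                 // e.g. "OP", "USE", "COPY"
  std::vector<std::string> Defs;  // registers this instruction defines
  std::vector<std::string> Uses;  // registers this instruction reads
};

// Walk the block backward. For `Dst = COPY Src`, collect every instruction
// between the COPY and the definition of Src that reads Src (the "source
// users"). Once the defining instruction is found, rename Src to Dst in the
// definition and in all collected users, then delete the now-redundant COPY.
std::vector<Inst> backwardPropagate(std::vector<Inst> Block) {
  for (int I = static_cast<int>(Block.size()) - 1; I >= 0; --I) {
    if (Block[I].Op != "COPY")
      continue;
    const std::string Dst = Block[I].Defs[0];
    const std::string Src = Block[I].Uses[0];

    std::vector<int> SrcUsers; // indices of intervening readers of Src
    int DefIdx = -1;
    for (int J = I - 1; J >= 0; --J) {
      bool DefinesSrc = false;
      for (const std::string &D : Block[J].Defs)
        DefinesSrc |= (D == Src);
      if (DefinesSrc) {
        DefIdx = J;
        break;
      }
      for (const std::string &U : Block[J].Uses)
        if (U == Src) {
          SrcUsers.push_back(J);
          break;
        }
    }
    if (DefIdx < 0)
      continue; // Src is not defined in this block; leave the COPY alone.

    // Rewrite the definition and all collected users, then drop the COPY.
    for (std::string &D : Block[DefIdx].Defs)
      if (D == Src)
        D = Dst;
    for (int U : SrcUsers)
      for (std::string &R : Block[U].Uses)
        if (R == Src)
          R = Dst;
    Block.erase(Block.begin() + I);
  }
  return Block;
}
```

On the motivating example (`R0 = OP; USE R0; R1 = COPY killed R0`), the model deletes the COPY and renames `R0` to `R1` in both the definition and the intervening use — exactly the intervening-read case that previously forced the pass to keep the COPY.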
diff --git a/llvm/lib/CodeGen/MachineCopyPropagation.cpp b/llvm/lib/CodeGen/MachineCopyPropagation.cpp
index 8bcc437cbfb865..8293aba823ed79 100644
--- a/llvm/lib/CodeGen/MachineCopyPropagation.cpp
+++ b/llvm/lib/CodeGen/MachineCopyPropagation.cpp
@@ -110,6 +110,7 @@ class CopyTracker {
struct CopyInfo {
MachineInstr *MI = nullptr;
MachineInstr *LastSeenUseInCopy = nullptr;
+ SmallPtrSet<MachineInstr *, 4> SrcUsers;
SmallVector<MCRegister, 4> DefRegs;
bool Avail = false;
};
@@ -224,6 +225,43 @@ class CopyTracker {
}
}
+  /// Track a copy's source users, and return false if that can't be done.
+  /// We can only track users if we have a COPY instruction whose source is
+  /// the same as \p Reg.
+ bool trackSrcUsers(MCRegister Reg, MachineInstr &MI,
+ const TargetRegisterInfo &TRI, const TargetInstrInfo &TII,
+ bool UseCopyInstr) {
+ MCRegUnit RU = *TRI.regunits(Reg).begin();
+ MachineInstr *AvailCopy = findCopyDefViaUnit(RU, TRI);
+ if (!AvailCopy)
+ return false;
+
+ std::optional<DestSourcePair> CopyOperands =
+ isCopyInstr(*AvailCopy, TII, UseCopyInstr);
+ Register Src = CopyOperands->Source->getReg();
+
+ // Bail out, if the source of the copy is not the same as the Reg.
+ if (Src != Reg)
+ return false;
+
+ auto I = Copies.find(RU);
+ if (I == Copies.end())
+ return false;
+
+ I->second.SrcUsers.insert(&MI);
+ return true;
+ }
+
+ /// Return the users for a given register.
+ SmallPtrSet<MachineInstr *, 4> getSrcUsers(MCRegister Reg,
+ const TargetRegisterInfo &TRI) {
+ MCRegUnit RU = *TRI.regunits(Reg).begin();
+ auto I = Copies.find(RU);
+ if (I == Copies.end())
+ return {};
+ return I->second.SrcUsers;
+ }
+
/// Add this copy's registers into the tracker's copy maps.
void trackCopy(MachineInstr *MI, const TargetRegisterInfo &TRI,
const TargetInstrInfo &TII, bool UseCopyInstr) {
@@ -236,7 +274,7 @@ class CopyTracker {
// Remember Def is defined by the copy.
for (MCRegUnit Unit : TRI.regunits(Def))
- Copies[Unit] = {MI, nullptr, {}, true};
+ Copies[Unit] = {MI, nullptr, {}, {}, true};
// Remember source that's copied to Def. Once it's clobbered, then
// it's no longer available for copy propagation.
@@ -427,6 +465,7 @@ class MachineCopyPropagation : public MachineFunctionPass {
bool hasImplicitOverlap(const MachineInstr &MI, const MachineOperand &Use);
bool hasOverlappingMultipleDef(const MachineInstr &MI,
const MachineOperand &MODef, Register Def);
+ bool canUpdateSrcUsers(const MachineInstr &Copy, const MachineOperand &MODef);
/// Candidates for deletion.
SmallSetVector<MachineInstr *, 8> MaybeDeadCopies;
@@ -667,6 +706,26 @@ bool MachineCopyPropagation::hasOverlappingMultipleDef(
return false;
}
+/// Return true if it is safe to update the users of the source register of the
+/// copy.
+bool MachineCopyPropagation::canUpdateSrcUsers(const MachineInstr &Copy,
+ const MachineOperand &CopySrc) {
+ for (auto *SrcUser : Tracker.getSrcUsers(CopySrc.getReg(), *TRI)) {
+ if (hasImplicitOverlap(*SrcUser, CopySrc))
+ return false;
+
+ for (MachineOperand &MO : SrcUser->uses()) {
+ if (!MO.isReg() || MO.getReg() != CopySrc.getReg())
+ continue;
+ if (MO.isTied() || !MO.isRenamable() ||
+ !isBackwardPropagatableRegClassCopy(Copy, *SrcUser,
+ MO.getOperandNo()))
+ return false;
+ }
+ }
+ return true;
+}
+
/// Look for available copies whose destination register is used by \p MI and
/// replace the use in \p MI with the copy's source register.
void MachineCopyPropagation::forwardUses(MachineInstr &MI) {
@@ -1030,6 +1089,9 @@ void MachineCopyPropagation::propagateDefs(MachineInstr &MI) {
if (hasOverlappingMultipleDef(MI, MODef, Def))
continue;
+ if (!canUpdateSrcUsers(*Copy, *CopyOperands->Source))
+ continue;
+
LLVM_DEBUG(dbgs() << "MCP: Replacing " << printReg(MODef.getReg(), TRI)
<< "\n with " << printReg(Def, TRI) << "\n in "
<< MI << " from " << *Copy);
@@ -1037,6 +1099,15 @@ void MachineCopyPropagation::propagateDefs(MachineInstr &MI) {
MODef.setReg(Def);
MODef.setIsRenamable(CopyOperands->Destination->isRenamable());
+ for (auto *SrcUser : Tracker.getSrcUsers(Src, *TRI)) {
+ for (MachineOperand &MO : SrcUser->operands()) {
+ if (!MO.isReg() || !MO.isUse() || MO.getReg() != Src)
+ continue;
+ MO.setReg(Def);
+ MO.setIsRenamable(CopyOperands->Destination->isRenamable());
+ }
+ }
+
LLVM_DEBUG(dbgs() << "MCP: After replacement: " << MI << "\n");
MaybeDeadCopies.insert(Copy);
Changed = true;
@@ -1102,7 +1173,9 @@ void MachineCopyPropagation::BackwardCopyPropagateBlock(
CopyDbgUsers[Copy].insert(&MI);
}
}
- } else {
+ } else if (!Tracker.trackSrcUsers(MO.getReg().asMCReg(), MI, *TRI, *TII,
+ UseCopyInstr)) {
+ // If we can't track the source users, invalidate the register.
Tracker.invalidateRegister(MO.getReg().asMCReg(), *TRI, *TII,
UseCopyInstr);
}
diff --git a/llvm/test/CodeGen/ARM/umulo-128-legalisation-lowering.ll b/llvm/test/CodeGen/ARM/umulo-128-legalisation-lowering.ll
index afd75940b45932..464808ec8861b3 100644
--- a/llvm/test/CodeGen/ARM/umulo-128-legalisation-lowering.ll
+++ b/llvm/test/CodeGen/ARM/umulo-128-legalisation-lowering.ll
@@ -7,12 +7,11 @@ define { i128, i8 } @muloti_test(i128 %l, i128 %r) unnamed_addr #0 {
; ARMV6: @ %bb.0: @ %start
; ARMV6-NEXT: push {r4, r5, r6, r7, r8, r9, r10, r11, lr}
; ARMV6-NEXT: sub sp, sp, #28
-; ARMV6-NEXT: ldr r7, [sp, #72]
+; ARMV6-NEXT: ldr lr, [sp, #72]
; ARMV6-NEXT: mov r6, r0
; ARMV6-NEXT: str r0, [sp, #8] @ 4-byte Spill
; ARMV6-NEXT: ldr r4, [sp, #84]
-; ARMV6-NEXT: umull r1, r0, r2, r7
-; ARMV6-NEXT: mov lr, r7
+; ARMV6-NEXT: umull r1, r0, r2, lr
; ARMV6-NEXT: umull r5, r10, r4, r2
; ARMV6-NEXT: str r1, [r6]
; ARMV6-NEXT: ldr r6, [sp, #80]
diff --git a/llvm/test/CodeGen/Mips/llvm-ir/sdiv.ll b/llvm/test/CodeGen/Mips/llvm-ir/sdiv.ll
index 8d548861f43936..72cead18f89fab 100644
--- a/llvm/test/CodeGen/Mips/llvm-ir/sdiv.ll
+++ b/llvm/test/CodeGen/Mips/llvm-ir/sdiv.ll
@@ -388,9 +388,8 @@ define signext i64 @sdiv_i64(i64 signext %a, i64 signext %b) {
; MMR3-NEXT: .cfi_def_cfa_offset 24
; MMR3-NEXT: sw $ra, 20($sp) # 4-byte Folded Spill
; MMR3-NEXT: .cfi_offset 31, -4
-; MMR3-NEXT: addu $2, $2, $25
-; MMR3-NEXT: lw $25, %call16(__divdi3)($2)
-; MMR3-NEXT: move $gp, $2
+; MMR3-NEXT: addu $gp, $2, $25
+; MMR3-NEXT: lw $25, %call16(__divdi3)($gp)
; MMR3-NEXT: jalr $25
; MMR3-NEXT: nop
; MMR3-NEXT: lw $ra, 20($sp) # 4-byte Folded Reload
@@ -405,9 +404,8 @@ define signext i64 @sdiv_i64(i64 signext %a, i64 signext %b) {
; MMR6-NEXT: .cfi_def_cfa_offset 24
; MMR6-NEXT: sw $ra, 20($sp) # 4-byte Folded Spill
; MMR6-NEXT: .cfi_offset 31, -4
-; MMR6-NEXT: addu $2, $2, $25
-; MMR6-NEXT: lw $25, %call16(__divdi3)($2)
-; MMR6-NEXT: move $gp, $2
+; MMR6-NEXT: addu $gp, $2, $25
+; MMR6-NEXT: lw $25, %call16(__divdi3)($gp)
; MMR6-NEXT: jalr $25
; MMR6-NEXT: lw $ra, 20($sp) # 4-byte Folded Reload
; MMR6-NEXT: addiu $sp, $sp, 24
@@ -549,65 +547,59 @@ define signext i128 @sdiv_i128(i128 signext %a, i128 signext %b) {
; MMR3: # %bb.0: # %entry
; MMR3-NEXT: lui $2, %hi(_gp_disp)
; MMR3-NEXT: addiu $2, $2, %lo(_gp_disp)
-; MMR3-NEXT: addiusp -48
-; MMR3-NEXT: .cfi_def_cfa_offset 48
-; MMR3-NEXT: sw $ra, 44($sp) # 4-byte Folded Spill
-; MMR3-NEXT: swp $16, 36($sp)
+; MMR3-NEXT: addiusp -40
+; MMR3-NEXT: .cfi_def_cfa_offset 40
+; MMR3-NEXT: sw $ra, 36($sp) # 4-byte Folded Spill
+; MMR3-NEXT: sw $17, 32($sp) # 4-byte Folded Spill
; MMR3-NEXT: .cfi_offset 31, -4
; MMR3-NEXT: .cfi_offset 17, -8
-; MMR3-NEXT: .cfi_offset 16, -12
-; MMR3-NEXT: addu $16, $2, $25
+; MMR3-NEXT: addu $gp, $2, $25
; MMR3-NEXT: move $1, $7
-; MMR3-NEXT: lw $7, 68($sp)
-; MMR3-NEXT: lw $17, 72($sp)
-; MMR3-NEXT: lw $3, 76($sp)
+; MMR3-NEXT: lw $7, 60($sp)
+; MMR3-NEXT: lw $17, 64($sp)
+; MMR3-NEXT: lw $3, 68($sp)
; MMR3-NEXT: move $2, $sp
; MMR3-NEXT: sw16 $3, 28($2)
; MMR3-NEXT: sw16 $17, 24($2)
; MMR3-NEXT: sw16 $7, 20($2)
-; MMR3-NEXT: lw $3, 64($sp)
+; MMR3-NEXT: lw $3, 56($sp)
; MMR3-NEXT: sw16 $3, 16($2)
-; MMR3-NEXT: lw $25, %call16(__divti3)($16)
+; MMR3-NEXT: lw $25, %call16(__divti3)($gp)
; MMR3-NEXT: move $7, $1
-; MMR3-NEXT: move $gp, $16
; MMR3-NEXT: jalr $25
; MMR3-NEXT: nop
-; MMR3-NEXT: lwp $16, 36($sp)
-; MMR3-NEXT: lw $ra, 44($sp) # 4-byte Folded Reload
-; MMR3-NEXT: addiusp 48
+; MMR3-NEXT: lw $17, 32($sp) # 4-byte Folded Reload
+; MMR3-NEXT: lw $ra, 36($sp) # 4-byte Folded Reload
+; MMR3-NEXT: addiusp 40
; MMR3-NEXT: jrc $ra
;
; MMR6-LABEL: sdiv_i128:
; MMR6: # %bb.0: # %entry
; MMR6-NEXT: lui $2, %hi(_gp_disp)
; MMR6-NEXT: addiu $2, $2, %lo(_gp_disp)
-; MMR6-NEXT: addiu $sp, $sp, -48
-; MMR6-NEXT: .cfi_def_cfa_offset 48
-; MMR6-NEXT: sw $ra, 44($sp) # 4-byte Folded Spill
-; MMR6-NEXT: sw $17, 40($sp) # 4-byte Folded Spill
-; MMR6-NEXT: sw $16, 36($sp) # 4-byte Folded Spill
+; MMR6-NEXT: addiu $sp, $sp, -40
+; MMR6-NEXT: .cfi_def_cfa_offset 40
+; MMR6-NEXT: sw $ra, 36($sp) # 4-byte Folded Spill
+; MMR6-NEXT: sw $17, 32($sp) # 4-byte Folded Spill
; MMR6-NEXT: .cfi_offset 31, -4
; MMR6-NEXT: .cfi_offset 17, -8
-; MMR6-NEXT: .cfi_offset 16, -12
-; MMR6-NEXT: addu $16, $2, $25
+; MMR6-NEXT: addu $gp, $2, $25
; MMR6-NEXT: move $1, $7
-; MMR6-NEXT: lw $7, 68($sp)
-; MMR6-NEXT: lw $17, 72($sp)
-; MMR6-NEXT: lw $3, 76($sp)
+; MMR6-NEXT: lw $7, 60($sp)
+; MMR6-NEXT: lw $17, 64($sp)
+; MMR6-NEXT: lw $3, 68($sp)
; MMR6-NEXT: move $2, $sp
; MMR6-NEXT: sw16 $3, 28($2)
; MMR6-NEXT: sw16 $17, 24($2)
; MMR6-NEXT: sw16 $7, 20($2)
-; MMR6-NEXT: lw $3, 64($sp)
+; MMR6-NEXT: lw $3, 56($sp)
; MMR6-NEXT: sw16 $3, 16($2)
-; MMR6-NEXT: lw $25, %call16(__divti3)($16)
+; MMR6-NEXT: lw $25, %call16(__divti3)($gp)
; MMR6-NEXT: move $7, $1
-; MMR6-NEXT: move $gp, $16
; MMR6-NEXT: jalr $25
-; MMR6-NEXT: lw $16, 36($sp) # 4-byte Folded Reload
-; MMR6-NEXT: lw $17, 40($sp) # 4-byte Folded Reload
-; MMR6-NEXT: lw $ra, 44($sp) # 4-byte Folded Reload
-; MMR6-NEXT: addiu $sp, $sp, 48
+; MMR6-NEXT: lw $17, 32($sp) # 4-byte Folded Reload
+; MMR6-NEXT: lw $ra, 36($sp) # 4-byte Folded Reload
+; MMR6-NEXT: addiu $sp, $sp, 40
; MMR6-NEXT: jrc $ra
entry:
%r = sdiv i128 %a, %b
diff --git a/llvm/test/CodeGen/Mips/llvm-ir/srem.ll b/llvm/test/CodeGen/Mips/llvm-ir/srem.ll
index 29cb34b8d970f1..72496fcc53a5ac 100644
--- a/llvm/test/CodeGen/Mips/llvm-ir/srem.ll
+++ b/llvm/test/CodeGen/Mips/llvm-ir/srem.ll
@@ -336,9 +336,8 @@ define signext i64 @srem_i64(i64 signext %a, i64 signext %b) {
; MMR3-NEXT: .cfi_def_cfa_offset 24
; MMR3-NEXT: sw $ra, 20($sp) # 4-byte Folded Spill
; MMR3-NEXT: .cfi_offset 31, -4
-; MMR3-NEXT: addu $2, $2, $25
-; MMR3-NEXT: lw $25, %call16(__moddi3)($2)
-; MMR3-NEXT: move $gp, $2
+; MMR3-NEXT: addu $gp, $2, $25
+; MMR3-NEXT: lw $25, %call16(__moddi3)($gp)
; MMR3-NEXT: jalr $25
; MMR3-NEXT: nop
; MMR3-NEXT: lw $ra, 20($sp) # 4-byte Folded Reload
@@ -353,9 +352,8 @@ define signext i64 @srem_i64(i64 signext %a, i64 signext %b) {
; MMR6-NEXT: .cfi_def_cfa_offset 24
; MMR6-NEXT: sw $ra, 20($sp) # 4-byte Folded Spill
; MMR6-NEXT: .cfi_offset 31, -4
-; MMR6-NEXT: addu $2, $2, $25
-; MMR6-NEXT: lw $25, %call16(__moddi3)($2)
-; MMR6-NEXT: move $gp, $2
+; MMR6-NEXT: addu $gp, $2, $25
+; MMR6-NEXT: lw $25, %call16(__moddi3)($gp)
; MMR6-NEXT: jalr $25
; MMR6-NEXT: lw $ra, 20($sp) # 4-byte Folded Reload
; MMR6-NEXT: addiu $sp, $sp, 24
@@ -497,65 +495,59 @@ define signext i128 @srem_i128(i128 signext %a, i128 signext %b) {
; MMR3: # %bb.0: # %entry
; MMR3-NEXT: lui $2, %hi(_gp_disp)
; MMR3-NEXT: addiu $2, $2, %lo(_gp_disp)
-; MMR3-NEXT: addiusp -48
-; MMR3-NEXT: .cfi_def_cfa_offset 48
-; MMR3-NEXT: sw $ra, 44($sp) # 4-byte Folded Spill
-; MMR3-NEXT: swp $16, 36($sp)
+; MMR3-NEXT: addiusp -40
+; MMR3-NEXT: .cfi_def_cfa_offset 40
+; MMR3-NEXT: sw $ra, 36($sp) # 4-byte Folded Spill
+; MMR3-NEXT: sw $17, 32($sp) # 4-byte Folded Spill
; MMR3-NEXT: .cfi_offset 31, -4
; MMR3-NEXT: .cfi_offset 17, -8
-; MMR3-NEXT: .cfi_offset 16, -12
-; MMR3-NEXT: addu $16, $2, $25
+; MMR3-NEXT: addu $gp, $2, $25
; MMR3-NEXT: move $1, $7
-; MMR3-NEXT: lw $7, 68($sp)
-; MMR3-NEXT: lw $17, 72($sp)
-; MMR3-NEXT: lw $3, 76($sp)
+; MMR3-NEXT: lw $7, 60($sp)
+; MMR3-NEXT: lw $17, 64($sp)
+; MMR3-NEXT: lw $3, 68($sp)
; MMR3-NEXT: move $2, $sp
; MMR3-NEXT: sw16 $3, 28($2)
; MMR3-NEXT: sw16 $17, 24($2)
; MMR3-NEXT: sw16 $7, 20($2)
-; MMR3-NEXT: lw $3, 64($sp)
+; MMR3-NEXT: lw $3, 56($sp)
; MMR3-NEXT: sw16 $3, 16($2)
-; MMR3-NEXT: lw $25, %call16(__modti3)($16)
+; MMR3-NEXT: lw $25, %call16(__modti3)($gp)
; MMR3-NEXT: move $7, $1
-; MMR3-NEXT: move $gp, $16
; MMR3-NEXT: jalr $25
; MMR3-NEXT: nop
-; MMR3-NEXT: lwp $16, 36($sp)
-; MMR3-NEXT: lw $ra, 44($sp) # 4-byte Folded Reload
-; MMR3-NEXT: addiusp 48
+; MMR3-NEXT: lw $17, 32($sp) # 4-byte Folded Reload
+; MMR3-NEXT: lw $ra, 36($sp) # 4-byte Folded Reload
+; MMR3-NEXT: addiusp 40
; MMR3-NEXT: jrc $ra
;
; MMR6-LABEL: srem_i128:
; MMR6: # %bb.0: # %entry
; MMR6-NEXT: lui $2, %hi(_gp_disp)
; MMR6-NEXT: addiu $2, $2, %lo(_gp_disp)
-; MMR6-NEXT: addiu $sp, $sp, -48
-; MMR6-NEXT: .cfi_def_cfa_offset 48
-; MMR6-NEXT: sw $ra, 44($sp) # 4-byte Folded Spill
-; MMR6-NEXT: sw $17, 40($sp) # 4-byte Folded Spill
-; MMR6-NEXT: sw $16, 36($sp) # 4-byte Folded Spill
+; MMR6-NEXT: addiu $sp, $sp, -40
+; MMR6-NEXT: .cfi_def_cfa_offset 40
+; MMR6-NEXT: sw $ra, 36($sp) # 4-byte Folded Spill
+; MMR6-NEXT: sw $17, 32($sp) # 4-byte Folded Spill
; MMR6-NEXT: .cfi_offset 31, -4
; MMR6-NEXT: .cfi_offset 17, -8
-; MMR6-NEXT: .cfi_offset 16, -12
-; MMR6-NEXT: addu $16, $2, $25
+; MMR6-NEXT: addu $gp, $2, $25
; MMR6-NEXT: move $1, $7
-; MMR6-NEXT: lw $7, 68($sp)
-; MMR6-NEXT: lw $17, 72($sp)
-; MMR6-NEXT: lw $3, 76($sp)
+; MMR6-NEXT: lw $7, 60($sp)
+; MMR6-NEXT: lw $17, 64($sp)
+; MMR6-NEXT: lw $3, 68($sp)
; MMR6-NEXT: move $2, $sp
; MMR6-NEXT: sw16 $3, 28($2)
; MMR6-NEXT: sw16 $17, 24($2)
; MMR6-NEXT: sw16 $7, 20($2)
-; MMR6-NEXT: lw $3, 64($sp)
+; MMR6-NEXT: lw $3, 56($sp)
; MMR6-NEXT: sw16 $3, 16($2)
-; MMR6-NEXT: lw $25, %call16(__modti3)($16)
+; MMR6-NEXT: lw $25, %call16(__modti3)($gp)
; MMR6-NEXT: move $7, $1
-; MMR6-NEXT: move $gp, $16
; MMR6-NEXT: jalr $25
-; MMR6-NEXT: lw $16, 36($sp) # 4-byte Folded Reload
-; MMR6-NEXT: lw $17, 40($sp) # 4-byte Folded Reload
-; MMR6-NEXT: lw $ra, 44($sp) # 4-byte Folded Reload
-; MMR6-NEXT: addiu $sp, $sp, 48
+; MMR6-NEXT: lw $17, 32($sp) # 4-byte Folded Reload
+; MMR6-NEXT: lw $ra, 36($sp) # 4-byte Folded Reload
+; MMR6-NEXT: addiu $sp, $sp, 40
; MMR6-NEXT: jrc $ra
entry:
%r = srem i128 %a, %b
diff --git a/llvm/test/CodeGen/Mips/llvm-ir/udiv.ll b/llvm/test/CodeGen/Mips/llvm-ir/udiv.ll
index cc2c6614e69c8f..9451f1e9be0967 100644
--- a/llvm/test/CodeGen/Mips/llvm-ir/udiv.ll
+++ b/llvm/test/CodeGen/Mips/llvm-ir/udiv.ll
@@ -336,9 +336,8 @@ define signext i64 @udiv_i64(i64 signext %a, i64 signext %b) {
; MMR3-NEXT: .cfi_def_cfa_offset 24
; MMR3-NEXT: sw $ra, 20($sp) # 4-byte Folded Spill
; MMR3-NEXT: .cfi_offset 31, -4
-; MMR3-NEXT: addu $2, $2, $25
-; MMR3-NEXT: lw $25, %call16(__udivdi3)($2)
-; MMR3-NEXT: move $gp, $2
+; MMR3-NEXT: addu $gp, $2, $25
+; MMR3-NEXT: lw $25, %call16(__udivdi3)($gp)
; MMR3-NEXT: jalr $25
; MMR3-NEXT: nop
; MMR3-NEXT: lw $ra, 20($sp) # 4-byte Folded Reload
@@ -353,9 +352,8 @@ define signext i64 @udiv_i64(i64 signext %a, i64 signext %b) {
; MMR6-NEXT: .cfi_def_cfa_offset 24
; MMR6-NEXT: sw $ra, 20($sp) # 4-byte Folded Spill
; MMR6-NEXT: .cfi_offset 31, -4
-; MMR6-NEXT: addu $2, $2, $25
-; MMR6-NEXT: lw $25, %call16(__udivdi3)($2)
-; MMR6-NEXT: move $gp, $2
+; MMR6-NEXT: addu $gp, $2, $25
+; MMR6-NEXT: lw $25, %call16(__udivdi3)($gp)
; MMR6-NEXT: jalr $25
; MMR6-NEXT: lw $ra, 20($sp) # 4-byte Folded Reload
; MMR6-NEXT: addiu $sp, $sp, 24
@@ -497,65 +495,59 @@ define signext i128 @udiv_i128(i128 signext %a, i128 signext %b) {
; MMR3: # %bb.0: # %entry
; MMR3-NEXT: lui $2, %hi(_gp_disp)
; MMR3-NEXT: addiu $2, $2, %lo(_gp_disp)
-; MMR3-NEXT: addiusp -48
-; MMR3-NEXT: .cfi_def_cfa_offset 48
-; MMR3-NEXT: sw $ra, 44($sp) # 4-byte Folded Spill
-; MMR3-NEXT: swp $16, 36($sp)
+; MMR3-NEXT: addiusp -40
+; MMR3-NEXT: .cfi_def_cfa_offset 40
+; MMR3-NEXT: sw $ra, 36($sp) # 4-byte Folded Spill
+; MMR3-NEXT: sw $17, 32($sp) # 4-byte Folded Spill
; MMR3-NEXT: .cfi_offset 31, -4
; MMR3-NEXT: .cfi_offset 17, -8
-; MMR3-NEXT: .cfi_offset 16, -12
-; MMR3-NEXT: addu $16, $2, $25
+; MMR3-NEXT: addu $gp, $2, $25
; MMR3-NEXT: move $1, $7
-; MMR3-NEXT: lw $7, 68($sp)
-; MMR3-NEXT: lw $17, 72($sp)
-; MMR3-NEXT: lw $3, 76($sp)
+; MMR3-NEXT: lw $7, 60($sp)
+; MMR3-NEXT: lw $17, 64($sp)
+; MMR3-NEXT: lw $3, 68($sp)
; MMR3-NEXT: move $2, $sp
; MMR3-NEXT: sw16 $3, 28($2)
; MMR3-NEXT: sw16 $17, 24($2)
; MMR3-NEXT: sw16 $7, 20($2)
-; MMR3-NEXT: lw $3, 64($sp)
+; MMR3-NEXT: lw $3, 56($sp)
; MMR3-NEXT: sw16 $3, 16($2)
-; MMR3-NEXT: lw $25, %call16(__udivti3)($16)
+; MMR3-NEXT: lw $25, %call16(__udivti3)($gp)
; MMR3-NEXT: move $7, $1
-; MMR3-NEXT: move $gp, $16
; MMR3-NEXT: jalr $25
; MMR3-NEXT: nop
-; MMR3-NEXT: lwp $16, 36($sp)
-; MMR3-NEXT: lw $ra, 44($sp) # 4-byte Folded Reload
-; MMR3-NEXT: addiusp 48
+; MMR3-NEXT: lw $17, 32($sp) # 4-byte Folded Reload
+; MMR3-NEXT: lw $ra, 36($sp) # 4-byte Folded Reload
+; MMR3-NEXT: addiusp 40
; MMR3-NEXT: jrc $ra
;
; MMR6-LABEL: udiv_i128:
; MMR6: # %bb.0: # %entry
; MMR6-NEXT: lui $2, %hi(_gp_disp)
; MMR6-NEXT: addiu $2, $2, %lo(_gp_disp)
-; MMR6-NEXT: addiu $sp, $sp, -48
-; MMR6-NEXT: .cfi_def_cfa_offset 48
-; MMR6-NEXT: sw $ra, 44($sp) # 4-byte Folded Spill
-; MMR6-NEXT: sw $17, 40($sp) # 4-byte Folded Spill
-; MMR6-NEXT: sw $16, 36($sp) # 4-byte Folded Spill
+; MMR6-NEXT: addiu $sp, $sp, -40
+; MMR6-NEXT: .cfi_def_cfa_offset 40
+; MMR6-NEXT: sw $ra, 36($sp) # 4-byte Folded Spill
+; MMR6-NEXT: sw $17, 32($sp) # 4-byte Folded Spill
; MMR6-NEXT: .cfi_offset 31, -4
; MMR6-NEXT: .cfi_offset 17, -8
-; MMR6-NEXT: .cfi_offset 16, -12
-; MMR6-NEXT: addu $16, $2, $25
+; MMR6-NEXT: addu $gp, $2, $25
; MMR6-NEXT: move $1, $7
-; MMR6-NEXT: lw $7, 68($sp)
-; MMR6-NEXT: lw $17, 72($sp)
-; MMR6-NEXT: lw $3, 76($sp)
+; MMR6-NEXT: lw $7, 60($sp)
+; MMR6-NEXT: lw $17, 64($sp)
+; MMR6-NEXT: lw $3, 68($sp)
; MMR6-NEXT: move $2, $sp
; MMR6-NEXT: sw16 $3, 28($2)
; MMR6-NEXT: sw16 $17, 24($2)
; MMR6-NEXT: sw16 $7, 20($2)
-; MMR6-NEXT: lw $3, 64($sp)
+; MMR6-NEXT: lw $3, 56($sp)
; MMR6-NEXT: sw16 $3, 16($2)
-; MMR6-NEXT: lw $25, %call16(__udivti3)($16)
+; MMR6-NEXT: lw $25, %call16(__udivti3)($gp)
; MMR6-NEXT: move $7, $1
-; MMR6-NEXT: move $gp, $16
; MMR6-NEXT: jalr $25
-; MMR6-NEXT: lw $16, 36($sp) # 4-byte Folded Reload
-; MMR6-NEXT: lw $17, 40($sp) # 4-byte Folded Reload
-; MMR6-NEXT: lw $ra, 44($sp) # 4-byte Folded Reload
-; MMR6-NEXT: addiu $sp, $sp, 48
+; MMR6-NEXT: lw $17, 32($sp) # 4-byte Folded Reload
+; MMR6-NEXT: lw $ra, 36($sp) # 4-byte Folded Relo...
[truncated]
@qcolombet @arsenm @topperc PTAL.
Before this patch, redundant COPY couldn't be removed for the following case:

```
$R0 = OP ...
...                  // Read of $R0
$R1 = COPY killed $R0
```

This patch adds support for tracking the users of the source register during backward propagation, so that we can remove the redundant COPY in the above case and optimize it to:

```
$R1 = OP ...
...                  // Replace all uses of $R0 with $R1
```

Upstream PR: llvm/llvm-project#111130

Signed-off-by: Vladimir Radosavljevic <[email protected]>
Is one of these test changes a dedicated MIR test for this situation?
Having had a quick glance at the patch, I don't think it is. Could you add a few tests that check specifically for the new pattern?
and also
and so on. Essentially, add explicit test coverage with focused MIR tests.
Note: I think you support all these cases, but I'd like to see them written down.
@arsenm @qcolombet I added a dedicated MIR test to cover 3 cases:
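For readers unfamiliar with such tests, a focused MIR test for the simplest of these patterns would look roughly like the sketch below. This is illustrative only — the triple, opcodes, and register names are placeholders, not the actual test added in this PR.

```
# RUN: llc -mtriple=aarch64 -run-pass=machine-cp -verify-machineinstrs %s -o - | FileCheck %s
---
name: copy_with_intervening_src_use
body: |
  bb.0:
    ; The read of $x0 between its definition and the COPY previously blocked
    ; the optimization; now $x0 is renamed to $x1 and the COPY is deleted.
    ; CHECK-LABEL: name: copy_with_intervening_src_use
    ; CHECK:      renamable $x1 = ADDXri renamable $x2, 1, 0
    ; CHECK-NEXT: STRXui renamable $x1, $sp, 0
    ; CHECK-NOT:  COPY
    renamable $x0 = ADDXri renamable $x2, 1, 0
    STRXui renamable $x0, $sp, 0
    renamable $x1 = COPY killed renamable $x0
...
```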
Ping.
LGTM
Rebased and updated checks in
…lvm#111130)

Before this patch, redundant COPY couldn't be removed for the following case:

```
$R0 = OP ...
...                  // Read of $R0
$R1 = COPY killed $R0
```

This patch adds support for tracking the users of the source register during backward propagation, so that we can remove the redundant COPY in the above case and optimize it to:

```
$R1 = OP ...
...                  // Replace all uses of $R0 with $R1
```