[CodeGen] TwoAddressInstructionPass: Update default option #100046
@llvm/pr-subscribers-backend-aarch64 @llvm/pr-subscribers-backend-x86
Author: AtariDreams (AtariDreams)
Changes
Pulled out of a comment made on #80627, to simplify further investigation into visit limits.
Since 10 was the limit over a decade ago, I have decided to increase it 10-fold, because that is around the point where the benefit starts to wear off relative to the compile-time cost for the tests that changed codegen.
Patch is 2.59 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/100046.diff
35 Files Affected:
diff --git a/llvm/lib/CodeGen/TwoAddressInstructionPass.cpp b/llvm/lib/CodeGen/TwoAddressInstructionPass.cpp
index 665d57841a97b..f8b6b34e92b9f 100644
--- a/llvm/lib/CodeGen/TwoAddressInstructionPass.cpp
+++ b/llvm/lib/CodeGen/TwoAddressInstructionPass.cpp
@@ -80,6 +80,13 @@ EnableRescheduling("twoaddr-reschedule",
cl::desc("Coalesce copies by rescheduling (default=true)"),
cl::init(true), cl::Hidden);
+// Limit the number of rescheduling visits to dependent instructions.
+// FIXME: Arbitrary limit to reduce compile time cost.
+static cl::opt<unsigned> MaxVisits(
+ "twoaddr-visit-limit", cl::Hidden, cl::init(100),
+ cl::desc(
+ "Maximum number of rescheduling visits to dependent instructions (0 = no limit)"));
+
// Limit the number of dataflow edges to traverse when evaluating the benefit
// of commuting operands.
static cl::opt<unsigned> MaxDataFlowEdge(
@@ -994,7 +1001,7 @@ bool TwoAddressInstructionImpl::rescheduleMIBelowKill(
// Debug or pseudo instructions cannot be counted against the limit.
if (OtherMI.isDebugOrPseudoInstr())
continue;
- if (NumVisited > 10) // FIXME: Arbitrary limit to reduce compile time cost.
+ if (MaxVisits && NumVisited > MaxVisits)
return false;
++NumVisited;
if (OtherMI.hasUnmodeledSideEffects() || OtherMI.isCall() ||
@@ -1160,14 +1167,14 @@ bool TwoAddressInstructionImpl::rescheduleKillAboveMI(
}
}
- // Check if the reschedule will not break depedencies.
+ // Check if the reschedule will not break dependencies.
unsigned NumVisited = 0;
for (MachineInstr &OtherMI :
make_range(mi, MachineBasicBlock::iterator(KillMI))) {
// Debug or pseudo instructions cannot be counted against the limit.
if (OtherMI.isDebugOrPseudoInstr())
continue;
- if (NumVisited > 10) // FIXME: Arbitrary limit to reduce compile time cost.
+ if (MaxVisits && NumVisited > MaxVisits)
return false;
++NumVisited;
if (OtherMI.hasUnmodeledSideEffects() || OtherMI.isCall() ||
diff --git a/llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-extends.ll b/llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-extends.ll
index 25a6ea490c163..c32757f123aa8 100644
--- a/llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-extends.ll
+++ b/llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-extends.ll
@@ -1148,63 +1148,57 @@ define void @sext_v32i8_v32i64(ptr %in, ptr %out) {
; CHECK: // %bb.0:
; CHECK-NEXT: ldp q1, q0, [x0]
; CHECK-NEXT: add z0.b, z0.b, z0.b
-; CHECK-NEXT: add z1.b, z1.b, z1.b
-; CHECK-NEXT: mov z2.d, z0.d
-; CHECK-NEXT: sunpklo z0.h, z0.b
-; CHECK-NEXT: mov z3.d, z1.d
-; CHECK-NEXT: sunpklo z1.h, z1.b
+; CHECK-NEXT: add z2.b, z1.b, z1.b
+; CHECK-NEXT: sunpklo z3.h, z0.b
+; CHECK-NEXT: ext z0.b, z0.b, z0.b, #8
+; CHECK-NEXT: sunpklo z1.h, z2.b
; CHECK-NEXT: ext z2.b, z2.b, z2.b, #8
+; CHECK-NEXT: sunpklo z0.h, z0.b
+; CHECK-NEXT: sunpklo z4.s, z3.h
; CHECK-NEXT: ext z3.b, z3.b, z3.b, #8
-; CHECK-NEXT: sunpklo z4.s, z0.h
-; CHECK-NEXT: ext z0.b, z0.b, z0.b, #8
; CHECK-NEXT: sunpklo z5.s, z1.h
-; CHECK-NEXT: ext z1.b, z1.b, z1.b, #8
; CHECK-NEXT: sunpklo z2.h, z2.b
-; CHECK-NEXT: sunpklo z3.h, z3.b
-; CHECK-NEXT: sunpklo z0.s, z0.h
-; CHECK-NEXT: sunpklo z16.d, z4.s
+; CHECK-NEXT: ext z1.b, z1.b, z1.b, #8
+; CHECK-NEXT: sunpklo z6.s, z0.h
+; CHECK-NEXT: sunpklo z3.s, z3.h
+; CHECK-NEXT: ext z0.b, z0.b, z0.b, #8
+; CHECK-NEXT: sunpklo z7.d, z4.s
; CHECK-NEXT: ext z4.b, z4.b, z4.b, #8
-; CHECK-NEXT: sunpklo z1.s, z1.h
-; CHECK-NEXT: sunpklo z17.d, z5.s
+; CHECK-NEXT: sunpklo z16.d, z5.s
; CHECK-NEXT: ext z5.b, z5.b, z5.b, #8
-; CHECK-NEXT: sunpklo z6.s, z2.h
-; CHECK-NEXT: sunpklo z7.s, z3.h
+; CHECK-NEXT: sunpklo z17.s, z2.h
; CHECK-NEXT: ext z2.b, z2.b, z2.b, #8
-; CHECK-NEXT: sunpklo z4.d, z4.s
+; CHECK-NEXT: sunpklo z1.s, z1.h
+; CHECK-NEXT: sunpklo z0.s, z0.h
+; CHECK-NEXT: sunpklo z18.d, z6.s
+; CHECK-NEXT: ext z6.b, z6.b, z6.b, #8
+; CHECK-NEXT: sunpklo z19.d, z3.s
; CHECK-NEXT: ext z3.b, z3.b, z3.b, #8
-; CHECK-NEXT: sunpklo z19.d, z0.s
+; CHECK-NEXT: sunpklo z4.d, z4.s
; CHECK-NEXT: sunpklo z5.d, z5.s
-; CHECK-NEXT: ext z0.b, z0.b, z0.b, #8
; CHECK-NEXT: sunpklo z2.s, z2.h
-; CHECK-NEXT: sunpklo z18.d, z6.s
-; CHECK-NEXT: ext z6.b, z6.b, z6.b, #8
-; CHECK-NEXT: sunpklo z3.s, z3.h
-; CHECK-NEXT: stp q16, q4, [x1, #128]
-; CHECK-NEXT: mov z16.d, z7.d
-; CHECK-NEXT: sunpklo z0.d, z0.s
-; CHECK-NEXT: stp q17, q5, [x1]
-; CHECK-NEXT: sunpklo z5.d, z7.s
-; CHECK-NEXT: sunpklo z4.d, z6.s
-; CHECK-NEXT: mov z6.d, z1.d
-; CHECK-NEXT: ext z16.b, z16.b, z7.b, #8
-; CHECK-NEXT: mov z7.d, z2.d
-; CHECK-NEXT: stp q19, q0, [x1, #160]
-; CHECK-NEXT: sunpklo z0.d, z2.s
-; CHECK-NEXT: ext z6.b, z6.b, z1.b, #8
-; CHECK-NEXT: sunpklo z1.d, z1.s
-; CHECK-NEXT: stp q18, q4, [x1, #192]
-; CHECK-NEXT: mov z4.d, z3.d
-; CHECK-NEXT: ext z7.b, z7.b, z2.b, #8
-; CHECK-NEXT: sunpklo z16.d, z16.s
; CHECK-NEXT: sunpklo z6.d, z6.s
-; CHECK-NEXT: ext z4.b, z4.b, z3.b, #8
-; CHECK-NEXT: sunpklo z2.d, z7.s
; CHECK-NEXT: sunpklo z3.d, z3.s
-; CHECK-NEXT: stp q5, q16, [x1, #64]
-; CHECK-NEXT: stp q1, q6, [x1, #32]
-; CHECK-NEXT: sunpklo z1.d, z4.s
-; CHECK-NEXT: stp q0, q2, [x1, #224]
-; CHECK-NEXT: stp q3, q1, [x1, #96]
+; CHECK-NEXT: stp q16, q5, [x1]
+; CHECK-NEXT: sunpklo z5.d, z1.s
+; CHECK-NEXT: ext z1.b, z1.b, z1.b, #8
+; CHECK-NEXT: stp q7, q4, [x1, #128]
+; CHECK-NEXT: sunpklo z4.d, z17.s
+; CHECK-NEXT: ext z17.b, z17.b, z17.b, #8
+; CHECK-NEXT: stp q18, q6, [x1, #192]
+; CHECK-NEXT: sunpklo z6.d, z0.s
+; CHECK-NEXT: ext z0.b, z0.b, z0.b, #8
+; CHECK-NEXT: stp q19, q3, [x1, #160]
+; CHECK-NEXT: sunpklo z3.d, z2.s
+; CHECK-NEXT: ext z2.b, z2.b, z2.b, #8
+; CHECK-NEXT: sunpklo z7.d, z17.s
+; CHECK-NEXT: sunpklo z1.d, z1.s
+; CHECK-NEXT: sunpklo z0.d, z0.s
+; CHECK-NEXT: sunpklo z2.d, z2.s
+; CHECK-NEXT: stp q5, q1, [x1, #32]
+; CHECK-NEXT: stp q4, q7, [x1, #64]
+; CHECK-NEXT: stp q3, q2, [x1, #96]
+; CHECK-NEXT: stp q6, q0, [x1, #224]
; CHECK-NEXT: ret
;
; NONEON-NOSVE-LABEL: sext_v32i8_v32i64:
@@ -3133,63 +3127,57 @@ define void @zext_v32i8_v32i64(ptr %in, ptr %out) {
; CHECK: // %bb.0:
; CHECK-NEXT: ldp q1, q0, [x0]
; CHECK-NEXT: add z0.b, z0.b, z0.b
-; CHECK-NEXT: add z1.b, z1.b, z1.b
-; CHECK-NEXT: mov z2.d, z0.d
-; CHECK-NEXT: uunpklo z0.h, z0.b
-; CHECK-NEXT: mov z3.d, z1.d
-; CHECK-NEXT: uunpklo z1.h, z1.b
+; CHECK-NEXT: add z2.b, z1.b, z1.b
+; CHECK-NEXT: uunpklo z3.h, z0.b
+; CHECK-NEXT: ext z0.b, z0.b, z0.b, #8
+; CHECK-NEXT: uunpklo z1.h, z2.b
; CHECK-NEXT: ext z2.b, z2.b, z2.b, #8
+; CHECK-NEXT: uunpklo z0.h, z0.b
+; CHECK-NEXT: uunpklo z4.s, z3.h
; CHECK-NEXT: ext z3.b, z3.b, z3.b, #8
-; CHECK-NEXT: uunpklo z4.s, z0.h
-; CHECK-NEXT: ext z0.b, z0.b, z0.b, #8
; CHECK-NEXT: uunpklo z5.s, z1.h
-; CHECK-NEXT: ext z1.b, z1.b, z1.b, #8
; CHECK-NEXT: uunpklo z2.h, z2.b
-; CHECK-NEXT: uunpklo z3.h, z3.b
-; CHECK-NEXT: uunpklo z0.s, z0.h
-; CHECK-NEXT: uunpklo z16.d, z4.s
+; CHECK-NEXT: ext z1.b, z1.b, z1.b, #8
+; CHECK-NEXT: uunpklo z6.s, z0.h
+; CHECK-NEXT: uunpklo z3.s, z3.h
+; CHECK-NEXT: ext z0.b, z0.b, z0.b, #8
+; CHECK-NEXT: uunpklo z7.d, z4.s
; CHECK-NEXT: ext z4.b, z4.b, z4.b, #8
-; CHECK-NEXT: uunpklo z1.s, z1.h
-; CHECK-NEXT: uunpklo z17.d, z5.s
+; CHECK-NEXT: uunpklo z16.d, z5.s
; CHECK-NEXT: ext z5.b, z5.b, z5.b, #8
-; CHECK-NEXT: uunpklo z6.s, z2.h
-; CHECK-NEXT: uunpklo z7.s, z3.h
+; CHECK-NEXT: uunpklo z17.s, z2.h
; CHECK-NEXT: ext z2.b, z2.b, z2.b, #8
-; CHECK-NEXT: uunpklo z4.d, z4.s
+; CHECK-NEXT: uunpklo z1.s, z1.h
+; CHECK-NEXT: uunpklo z0.s, z0.h
+; CHECK-NEXT: uunpklo z18.d, z6.s
+; CHECK-NEXT: ext z6.b, z6.b, z6.b, #8
+; CHECK-NEXT: uunpklo z19.d, z3.s
; CHECK-NEXT: ext z3.b, z3.b, z3.b, #8
-; CHECK-NEXT: uunpklo z19.d, z0.s
+; CHECK-NEXT: uunpklo z4.d, z4.s
; CHECK-NEXT: uunpklo z5.d, z5.s
-; CHECK-NEXT: ext z0.b, z0.b, z0.b, #8
; CHECK-NEXT: uunpklo z2.s, z2.h
-; CHECK-NEXT: uunpklo z18.d, z6.s
-; CHECK-NEXT: ext z6.b, z6.b, z6.b, #8
-; CHECK-NEXT: uunpklo z3.s, z3.h
-; CHECK-NEXT: stp q16, q4, [x1, #128]
-; CHECK-NEXT: mov z16.d, z7.d
-; CHECK-NEXT: uunpklo z0.d, z0.s
-; CHECK-NEXT: stp q17, q5, [x1]
-; CHECK-NEXT: uunpklo z5.d, z7.s
-; CHECK-NEXT: uunpklo z4.d, z6.s
-; CHECK-NEXT: mov z6.d, z1.d
-; CHECK-NEXT: ext z16.b, z16.b, z7.b, #8
-; CHECK-NEXT: mov z7.d, z2.d
-; CHECK-NEXT: stp q19, q0, [x1, #160]
-; CHECK-NEXT: uunpklo z0.d, z2.s
-; CHECK-NEXT: ext z6.b, z6.b, z1.b, #8
-; CHECK-NEXT: uunpklo z1.d, z1.s
-; CHECK-NEXT: stp q18, q4, [x1, #192]
-; CHECK-NEXT: mov z4.d, z3.d
-; CHECK-NEXT: ext z7.b, z7.b, z2.b, #8
-; CHECK-NEXT: uunpklo z16.d, z16.s
; CHECK-NEXT: uunpklo z6.d, z6.s
-; CHECK-NEXT: ext z4.b, z4.b, z3.b, #8
-; CHECK-NEXT: uunpklo z2.d, z7.s
; CHECK-NEXT: uunpklo z3.d, z3.s
-; CHECK-NEXT: stp q5, q16, [x1, #64]
-; CHECK-NEXT: stp q1, q6, [x1, #32]
-; CHECK-NEXT: uunpklo z1.d, z4.s
-; CHECK-NEXT: stp q0, q2, [x1, #224]
-; CHECK-NEXT: stp q3, q1, [x1, #96]
+; CHECK-NEXT: stp q16, q5, [x1]
+; CHECK-NEXT: uunpklo z5.d, z1.s
+; CHECK-NEXT: ext z1.b, z1.b, z1.b, #8
+; CHECK-NEXT: stp q7, q4, [x1, #128]
+; CHECK-NEXT: uunpklo z4.d, z17.s
+; CHECK-NEXT: ext z17.b, z17.b, z17.b, #8
+; CHECK-NEXT: stp q18, q6, [x1, #192]
+; CHECK-NEXT: uunpklo z6.d, z0.s
+; CHECK-NEXT: ext z0.b, z0.b, z0.b, #8
+; CHECK-NEXT: stp q19, q3, [x1, #160]
+; CHECK-NEXT: uunpklo z3.d, z2.s
+; CHECK-NEXT: ext z2.b, z2.b, z2.b, #8
+; CHECK-NEXT: uunpklo z7.d, z17.s
+; CHECK-NEXT: uunpklo z1.d, z1.s
+; CHECK-NEXT: uunpklo z0.d, z0.s
+; CHECK-NEXT: uunpklo z2.d, z2.s
+; CHECK-NEXT: stp q5, q1, [x1, #32]
+; CHECK-NEXT: stp q4, q7, [x1, #64]
+; CHECK-NEXT: stp q3, q2, [x1, #96]
+; CHECK-NEXT: stp q6, q0, [x1, #224]
; CHECK-NEXT: ret
;
; NONEON-NOSVE-LABEL: zext_v32i8_v32i64:
diff --git a/llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-to-fp.ll b/llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-to-fp.ll
index afd3bb7161c15..e34af3fe4db95 100644
--- a/llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-to-fp.ll
+++ b/llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-to-fp.ll
@@ -567,42 +567,37 @@ define void @ucvtf_v16i16_v16f64(ptr %a, ptr %b) {
; CHECK: // %bb.0:
; CHECK-NEXT: ldp q1, q0, [x0]
; CHECK-NEXT: ptrue p0.d, vl2
-; CHECK-NEXT: mov z2.d, z0.d
+; CHECK-NEXT: uunpklo z2.s, z0.h
+; CHECK-NEXT: ext z0.b, z0.b, z0.b, #8
; CHECK-NEXT: uunpklo z3.s, z1.h
; CHECK-NEXT: ext z1.b, z1.b, z1.b, #8
; CHECK-NEXT: uunpklo z0.s, z0.h
+; CHECK-NEXT: uunpklo z4.d, z2.s
; CHECK-NEXT: ext z2.b, z2.b, z2.b, #8
; CHECK-NEXT: uunpklo z1.s, z1.h
-; CHECK-NEXT: mov z5.d, z3.d
-; CHECK-NEXT: uunpklo z4.d, z0.s
+; CHECK-NEXT: uunpklo z5.d, z3.s
+; CHECK-NEXT: ext z3.b, z3.b, z3.b, #8
+; CHECK-NEXT: uunpklo z2.d, z2.s
+; CHECK-NEXT: uunpklo z6.d, z0.s
; CHECK-NEXT: ext z0.b, z0.b, z0.b, #8
-; CHECK-NEXT: uunpklo z2.s, z2.h
-; CHECK-NEXT: ext z5.b, z5.b, z3.b, #8
-; CHECK-NEXT: mov z7.d, z1.d
+; CHECK-NEXT: uunpklo z7.d, z1.s
+; CHECK-NEXT: ext z1.b, z1.b, z1.b, #8
; CHECK-NEXT: uunpklo z3.d, z3.s
-; CHECK-NEXT: uunpklo z0.d, z0.s
; CHECK-NEXT: ucvtf z4.d, p0/m, z4.d
-; CHECK-NEXT: mov z6.d, z2.d
-; CHECK-NEXT: uunpklo z5.d, z5.s
-; CHECK-NEXT: ext z7.b, z7.b, z1.b, #8
+; CHECK-NEXT: ucvtf z5.d, p0/m, z5.d
+; CHECK-NEXT: uunpklo z0.d, z0.s
+; CHECK-NEXT: ucvtf z2.d, p0/m, z2.d
; CHECK-NEXT: uunpklo z1.d, z1.s
+; CHECK-NEXT: ucvtf z6.d, p0/m, z6.d
; CHECK-NEXT: ucvtf z3.d, p0/m, z3.d
; CHECK-NEXT: ucvtf z0.d, p0/m, z0.d
-; CHECK-NEXT: ext z6.b, z6.b, z2.b, #8
-; CHECK-NEXT: uunpklo z2.d, z2.s
-; CHECK-NEXT: uunpklo z7.d, z7.s
-; CHECK-NEXT: ucvtf z5.d, p0/m, z5.d
+; CHECK-NEXT: stp q4, q2, [x1, #64]
+; CHECK-NEXT: movprfx z2, z7
+; CHECK-NEXT: ucvtf z2.d, p0/m, z7.d
; CHECK-NEXT: ucvtf z1.d, p0/m, z1.d
-; CHECK-NEXT: uunpklo z6.d, z6.s
-; CHECK-NEXT: stp q4, q0, [x1, #64]
-; CHECK-NEXT: ucvtf z2.d, p0/m, z2.d
-; CHECK-NEXT: stp q3, q5, [x1]
-; CHECK-NEXT: movprfx z3, z7
-; CHECK-NEXT: ucvtf z3.d, p0/m, z7.d
-; CHECK-NEXT: movprfx z0, z6
-; CHECK-NEXT: ucvtf z0.d, p0/m, z6.d
-; CHECK-NEXT: stp q1, q3, [x1, #32]
-; CHECK-NEXT: stp q2, q0, [x1, #96]
+; CHECK-NEXT: stp q5, q3, [x1]
+; CHECK-NEXT: stp q6, q0, [x1, #96]
+; CHECK-NEXT: stp q2, q1, [x1, #32]
; CHECK-NEXT: ret
;
; NONEON-NOSVE-LABEL: ucvtf_v16i16_v16f64:
@@ -2024,42 +2019,37 @@ define void @scvtf_v16i16_v16f64(ptr %a, ptr %b) {
; CHECK: // %bb.0:
; CHECK-NEXT: ldp q1, q0, [x0]
; CHECK-NEXT: ptrue p0.d, vl2
-; CHECK-NEXT: mov z2.d, z0.d
+; CHECK-NEXT: sunpklo z2.s, z0.h
+; CHECK-NEXT: ext z0.b, z0.b, z0.b, #8
; CHECK-NEXT: sunpklo z3.s, z1.h
; CHECK-NEXT: ext z1.b, z1.b, z1.b, #8
; CHECK-NEXT: sunpklo z0.s, z0.h
+; CHECK-NEXT: sunpklo z4.d, z2.s
; CHECK-NEXT: ext z2.b, z2.b, z2.b, #8
; CHECK-NEXT: sunpklo z1.s, z1.h
-; CHECK-NEXT: mov z5.d, z3.d
-; CHECK-NEXT: sunpklo z4.d, z0.s
+; CHECK-NEXT: sunpklo z5.d, z3.s
+; CHECK-NEXT: ext z3.b, z3.b, z3.b, #8
+; CHECK-NEXT: sunpklo z2.d, z2.s
+; CHECK-NEXT: sunpklo z6.d, z0.s
; CHECK-NEXT: ext z0.b, z0.b, z0.b, #8
-; CHECK-NEXT: sunpklo z2.s, z2.h
-; CHECK-NEXT: ext z5.b, z5.b, z3.b, #8
-; CHECK-NEXT: mov z7.d, z1.d
+; CHECK-NEXT: sunpklo z7.d, z1.s
+; CHECK-NEXT: ext z1.b, z1.b, z1.b, #8
; CHECK-NEXT: sunpklo z3.d, z3.s
-; CHECK-NEXT: sunpklo z0.d, z0.s
; CHECK-NEXT: scvtf z4.d, p0/m, z4.d
-; CHECK-NEXT: mov z6.d, z2.d
-; CHECK-NEXT: sunpklo z5.d, z5.s
-; CHECK-NEXT: ext z7.b, z7.b, z1.b, #8
+; CHECK-NEXT: scvtf z5.d, p0/m, z5.d
+; CHECK-NEXT: sunpklo z0.d, z0.s
+; CHECK-NEXT: scvtf z2.d, p0/m, z2.d
; CHECK-NEXT: sunpklo z1.d, z1.s
+; CHECK-NEXT: scvtf z6.d, p0/m, z6.d
; CHECK-NEXT: scvtf z3.d, p0/m, z3.d
; CHECK-NEXT: scvtf z0.d, p0/m, z0.d
-; CHECK-NEXT: ext z6.b, z6.b, z2.b, #8
-; CHECK-NEXT: sunpklo z2.d, z2.s
-; CHECK-NEXT: sunpklo z7.d, z7.s
-; CHECK-NEXT: scvtf z5.d, p0/m, z5.d
+; CHECK-NEXT: stp q4, q2, [x1, #64]
+; CHECK-NEXT: movprfx z2, z7
+; CHECK-NEXT: scvtf z2.d, p0/m, z7.d
; CHECK-NEXT: scvtf z1.d, p0/m, z1.d
-; CHECK-NEXT: sunpklo z6.d, z6.s
-; CHECK-NEXT: stp q4, q0, [x1, #64]
-; CHECK-NEXT: scvtf z2.d, p0/m, z2.d
-; CHECK-NEXT: stp q3, q5, [x1]
-; CHECK-NEXT: movprfx z3, z7
-; CHECK-NEXT: scvtf z3.d, p0/m, z7.d
-; CHECK-NEXT: movprfx z0, z6
-; CHECK-NEXT: scvtf z0.d, p0/m, z6.d
-; CHECK-NEXT: stp q1, q3, [x1, #32]
-; CHECK-NEXT: stp q2, q0, [x1, #96]
+; CHECK-NEXT: stp q5, q3, [x1]
+; CHECK-NEXT: stp q6, q0, [x1, #96]
+; CHECK-NEXT: stp q2, q1, [x1, #32]
; CHECK-NEXT: ret
;
; NONEON-NOSVE-LABEL: scvtf_v16i16_v16f64:
@@ -2507,37 +2497,33 @@ define void @scvtf_v16i32_v16f64(ptr %a, ptr %b) {
; CHECK-NEXT: ldp q1, q0, [x0, #32]
; CHECK-NEXT: ptrue p0.d, vl2
; CHECK-NEXT: ldp q5, q4, [x0]
-; CHECK-NEXT: mov z2.d, z0.d
-; CHECK-NEXT: mov z3.d, z1.d
-; CHECK-NEXT: mov z6.d, z4.d
-; CHECK-NEXT: mov z7.d, z5.d
-; CHECK-NEXT: ext z2.b, z2.b, z0.b, #8
-; CHECK-NEXT: ext z3.b, z3.b, z1.b, #8
+; CHECK-NEXT: sunpklo z2.d, z0.s
+; CHECK-NEXT: ext z0.b, z0.b, z0.b, #8
+; CHECK-NEXT: sunpklo z3.d, z1.s
+; CHECK-NEXT: ext z1.b, z1.b, z1.b, #8
+; CHECK-NEXT: sunpklo z6.d, z4.s
+; CHECK-NEXT: ext z4.b, z4.b, z4.b, #8
+; CHECK-NEXT: sunpklo z7.d, z5.s
+; CHECK-NEXT: ext z5.b, z5.b, z5.b, #8
; CHECK-NEXT: sunpklo z0.d, z0.s
; CHECK-NEXT: sunpklo z1.d, z1.s
-; CHECK-NEXT: ext z6.b, z6.b, z4.b, #8
-; CHECK-NEXT: ext z7.b, z7.b, z5.b, #8
+; CHECK-NEXT: scvtf z2.d, p0/m, z2.d
; CHECK-NEXT: sunpklo z4.d, z4.s
+; CHECK-NEXT: scvtf z3.d, p0/m, z3.d
; CHECK-NEXT: sunpklo z5.d, z5.s
-; CHECK-NEXT: sunpklo z2.d, z2.s
-; CHECK-NEXT: sunpklo z3.d, z3.s
+; CHECK-NEXT: scvtf z6.d, p0/m, z6.d
; CHECK-NEXT: scvtf z0.d, p0/m, z0.d
-; CHECK-NEXT: sunpklo z6.d, z6.s
-; CHECK-NEXT: sunpklo z7.d, z7.s
; CHECK-NEXT: scvtf z1.d, p0/m, z1.d
-; CHECK-NEXT: scvtf z4.d, p0/m, z4.d
-; CHECK-NEXT: scvtf z2.d, p0/m, z2.d
-; CHECK-NEXT: scvtf z3.d, p0/m, z3.d
-; CHECK-NEXT: stp q1, q3, [x1, #64]
-; CHECK-NEXT: movprfx z1, z7
-; CHECK-NEXT: scvtf z1.d, p0/m, z7.d
-; CHECK-NEXT: stp q0, q2, [x1, #96]
-; CHECK-NEXT: movprfx z0, z6
-; CHECK-NEXT: scvtf z0.d, p0/m, z6.d
-; CHECK-NEXT: movprfx z2, z5
-; CHECK-NEXT: scvtf z2.d, p0/m, z5.d
-; CHECK-NEXT: stp q2, q1, [x1]
-; CHECK-NEXT: stp q4, q0, [x1, #32]
+; CHECK-NEXT: stp q2, q0, [x1, #96]
+; CHECK-NEXT: movprfx z2, z4
+; CHECK-NEXT: scvtf z2.d, p0/m, z4.d
+; CHECK-NEXT: movprfx z0, z7
+; CHECK-NEXT: scvtf z0.d, p0/m, z7.d
+; CHECK-NEXT: stp q3, q1, [x1, #64]
+; CHECK-NEXT: movprfx z3, z5
+; CHECK-NEXT: scvtf z3.d, p0/m, z5.d
+; CHECK-NEXT: stp q6, q2, [x1, #32]
+; CHECK-NEXT: stp q0, q3, [x1]
; CHECK-NEXT: ret
;
; NONEON-NOSVE-LABEL: scvtf_v16i32_v16f64:
diff --git a/llvm/test/CodeGen/ARM/copy-by-struct-i32.ll b/llvm/test/CodeGen/ARM/copy-by-struct-i32.ll
index 34aab4c04b109..8f134e0ac7f18 100644
--- a/llvm/test/CodeGen/ARM/copy-by-struct-i32.ll
+++ b/llvm/test/CodeGen/ARM/copy-by-struct-i32.ll
@@ -22,23 +22,23 @@ define arm_aapcscc void @s(ptr %q, ptr %p) {
; ASSEMBLY-NEXT: ldr r2, [r1, #8]
; ASSEMBLY-NEXT: ldr r3, [r1, #12]
; ASSEMBLY-NEXT: strd r4, r5, [sp, #128]
-; ASSEMBLY-NEXT: add r5, r1, #16
-; ASSEMBLY-NEXT: mov r4, sp
-; ASSEMBLY-NEXT: vld1.32 {d16}, [r5]!
-; ASSEMBLY-NEXT: vst1.32 {d16}, [r4]!
-; ASSEMBLY-NEXT: vld1.32 {d16}, [r5]!
-; ASSEMBLY-NEXT: vst1.32 {d16}, [r4]!
-; ASSEMBLY-NEXT: vld1.32 {d16}, [r5]!
-; ASSEMBLY-NEXT: vst1.32 {d16}, [r4]!
-; ASSEMBLY-NEXT: vld1.32 {d16}, [r5]!
-; ASSEMBLY-NEXT: vst1.32 {d16}, [r4]!
-; ASSEMBLY-NEXT: vld1.32 {d16}, [r5]!
-; ASSEMBLY-NEXT: vst1.32 {d16}, [r4]!
-; ASSEMBLY-NEXT: vld1.32 {d16}, [r5]!
-; ASSEMBLY-NEXT: vst1.32 {d16}, [r4]!
-; ASSEMBLY-NEXT: vld1.32 {d16}, [r5]!
-; ASSEMBLY-NEXT: vst1.32 {d16}, [r4]!
+; ASSEMBLY-NEXT: add r4, r1, #16
+; ASSEMBLY-NEXT: mov r5, sp
+; ASSEMBLY-NEXT: vld1.32 {d16}, [r4]!
+; ASSEMBLY-NEXT: vst1.32 {d16}, [r5]!
+; ASSEMBLY-NEXT: vld1.32 {d16}, [r4]!
+; ASSEMBLY-NEXT: vst1.32 {d16}, [r5]!
+; ASSEMBLY-NEXT: vld1.32 {d16}, [r4]!
+; ASSEMBLY-NEXT: vst1.32 {d16}, [r5]!
+; ASSEMBLY-NEXT: vld1.32 {d16}, [r4]!
+; ASSEMBLY-NEXT: vst1.32 {d16}, [r5]!
+; ASSEMBLY-NEXT: vld1.32 {d16}, [r4]!
+; ASSEMBLY-NEXT: vst1.32 {d16}, [r5]!
+; ASSEMBLY-NEXT: vld1.32 {d16}, [r4]!
+; ASSEMBLY-NEXT: vst1.32 {d16}, [r5]!
+; ASSEMBLY-NEXT: vld1.32 {d16}, [r4]!
; ASSEMBLY-NEXT: movw r4, #72
+; ASSEMBLY-NEXT: vst1.32 {d16}, [r5]!
; ASSEMBLY-NEXT: .LBB0_1: @ %entry
; ASSEMBLY-NEXT: @ =>This Inner Loop Header: Depth=1
; ASSEMBLY-NEXT: vld1.32 {d16}, [r1]!
@@ -58,3 +58,5 @@ entry:
}
declare arm_aapcscc void @r(...)
+;; NOTE: These prefixes are unused and the list is autogenerated. Do not add tests below this line:
+; BEFORE-EXPAND: {{.*}}
diff --git a/llvm/test/CodeGen/ARM/vselect_imax.ll b/llvm/test/CodeGen/ARM/vselect_imax.ll
index 37f511fcc68cc..89072683fb01a 100644
--- a/llvm/test/CodeGen/ARM/vselect_imax.ll
+++ b/llvm/test/CodeGen/ARM/vselect_imax.ll
@@ -242,198 +242,195 @@ define void @func_blend20(ptr %loadaddr, ptr %loadaddr2,
ptr %blend, ptr %storeaddr) {
; CHECK-LABEL: func_blend20:
; CHECK: @ %bb.0:
-; CHECK-NEXT: .save {r4, r5, r6, r7, r8, r9, r10, lr}
-; CHECK-NEXT: push {r4, r5, r6, r7, r8, r9, r10, lr}
+; CHECK-NEXT: .save {r4, r5, r6, r7, r8, r9, r11, lr}
+; CHECK-NEXT: push {r4, r5, r6, r7, r8, r9, r11, lr}
; CHECK-NEXT...
[truncated]
@RKSimon It is now ready
…mand line option Pulled out of comment made on llvm#80627 - to simplify further investigation into visit limits. Since 10 was the limit over a decade ago, I have decided to increase it by 10-fold because that is around the number where compile time vs. benefit starts to wear off for the tests that changed codegen.
@topperc Thoughts on this?
// Limit the number of rescheduling visits to dependent instructions.
// FIXME: Arbitrary limit to reduce compile time cost.
static cl::opt<unsigned>
    MaxVisits("twoaddr-visit-limit", cl::Hidden, cl::init(100),
For this PR, it would be better to keep the original value as default (10) - but add test coverage for setting twoaddr-visit-limit to higher (and lower?) values. We can then do another PR that alters the default value in the future.
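To make the "higher (and lower?) values" cases concrete, here is a minimal standalone sketch of how the new check behaves. It is not code from the patch: `FakeInstr`, `scanWithinLimit`, and `makeRange` are hypothetical names, and the real loops also bail out on calls, side effects, and register conflicts, which are left out here. Only the condition `MaxVisits && NumVisited > MaxVisits` and the debug/pseudo skip are modelled.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a MachineInstr; only the property that the
// limit check looks at is modelled.
struct FakeInstr {
  bool IsDebugOrPseudo = false;
};

// Mirrors the shape of the check in the patch: debug/pseudo instructions are
// skipped without being counted, a limit of 0 disables the cap, and the scan
// is abandoned once NumVisited exceeds MaxVisits.
// Returns true if the whole range was scanned, false if the limit was hit.
bool scanWithinLimit(const std::vector<FakeInstr> &Range, unsigned MaxVisits) {
  unsigned NumVisited = 0;
  for (const FakeInstr &MI : Range) {
    if (MI.IsDebugOrPseudo)
      continue;
    if (MaxVisits && NumVisited > MaxVisits)
      return false;
    ++NumVisited;
  }
  return true;
}

// Hypothetical helper: a range of N ordinary (countable) instructions.
std::vector<FakeInstr> makeRange(std::size_t N) {
  return std::vector<FakeInstr>(N);
}
```

Because the comparison is `>` and the counter is incremented after the check, a limit of N lets N + 1 counted instructions through before the scan gives up: `scanWithinLimit(makeRange(11), 10)` succeeds while `scanWithinLimit(makeRange(12), 10)` does not, and a limit of 0 never bails out. Tests for the option would probably want to sit on either side of that boundary. Note that `cl::Hidden` only hides the option from `-help`; it can still be set on the command line.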