
Commit f696aa1

[AArch64][SME] Implement the SME ABI (ZA state management) in Machine IR
<h1>Short Summary</h1>

This patch adds a new pass, `aarch64-machine-sme-abi`, to handle the ABI for ZA state (e.g., lazy saves and agnostic ZA functions). It is not yet enabled by default, but the aim is to enable it by LLVM 22. The goal of the new pass is to place ZA saves/restores more optimally and to work with exception handling.

<h1>Long Description</h1>

This patch reimplements management of ZA state for functions with private and shared ZA state. Agnostic ZA functions will be handled in a later patch. For now, this is gated behind the flag `-aarch64-new-sme-abi`; however, we intend for it to replace the current SelectionDAG implementation once complete.

The approach taken here is to mark instructions as needing ZA to be in a specific state ("ACTIVE" or "LOCAL_SAVED"). Machine instructions that implicitly define or use ZA registers (such as $zt0 or $zab0) require the "ACTIVE" state. Function calls may need the "LOCAL_SAVED" or "ACTIVE" state, depending on whether the callee has shared or private ZA. We already add ZA register uses/definitions to machine instructions, so no extra work is needed to mark these. Calls are marked by gluing AArch64ISD::INOUT_ZA_USE or AArch64ISD::REQUIRES_ZA_SAVE to the CALLSEQ_START.

These markers are then used by the MachineSMEABIPass to find instructions where there is a transition between required ZA states. These are the points where we need to insert code to set up or restore a ZA save (or initialize ZA).

To handle control flow between blocks (which may have different ZA state requirements), we bundle the incoming and outgoing edges of blocks. Bundles are formed by assigning each block an incoming and an outgoing bundle (initially, every block has its own two bundles). Bundles are then combined by joining the outgoing bundle of a block with the incoming bundle of each of its successors. Each bundle is then assigned a ZA state based on the blocks that participate in it.
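The transition-finding step described above can be modeled as a linear scan over each block's instructions, inserting a state change wherever the required state differs from the current one. This is a standalone Python sketch, not the LLVM pass; the state names and return shape are illustrative:

```python
def place_za_transitions(required_states, initial_state="ACTIVE"):
    """required_states: per-instruction required ZA state, or None if the
    instruction does not constrain ZA. Returns a list of
    (index, from_state, to_state) transitions to insert before the
    instruction at that index."""
    transitions = []
    current = initial_state
    for i, needed in enumerate(required_states):
        if needed is not None and needed != current:
            # A state change is required here, e.g. setting up a lazy save
            # ("ACTIVE" -> "LOCAL_SAVED") or restoring from one
            # ("LOCAL_SAVED" -> "ACTIVE").
            transitions.append((i, current, needed))
            current = needed
    return transitions
```

Instructions with no ZA requirement are simply skipped, so a run of calls that all tolerate "LOCAL_SAVED" costs only one save and one restore.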
Blocks whose incoming edges are in a bundle "vote" for the ZA state required at the first instruction in the block, and likewise, blocks whose outgoing edges are in a bundle vote for the ZA state required at the last instruction in the block. The ZA state with the most votes wins, which aims to minimize the number of state transitions.

Change-Id: Iced4a3f329deab3ff8f3fd449a2337f7bbfa71ec
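The edge-bundling and voting scheme described above can be sketched with a small union-find over bundle ids. This is a standalone Python model of the algorithm, not the LLVM implementation; the function name and return shape are illustrative:

```python
from collections import Counter

def assign_bundle_states(num_blocks, successors, entry_state, exit_state):
    """successors[b]: list of successor block ids of block b.
    entry_state[b] / exit_state[b]: ZA state required at the first / last
    instruction of block b (e.g. "ACTIVE" or "LOCAL_SAVED").
    Returns, per block, the (incoming, outgoing) bundle states."""
    # Union-find over bundle ids: block b's incoming bundle is 2*b,
    # its outgoing bundle is 2*b + 1 (each block starts with its own two).
    parent = list(range(2 * num_blocks))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    # Join the outgoing bundle of each block with the incoming bundle of
    # all of its successors.
    for b in range(num_blocks):
        for s in successors[b]:
            union(2 * b + 1, 2 * s)

    # Each block votes for the state required at its first instruction
    # (incoming bundle) and at its last instruction (outgoing bundle).
    votes = {}
    for b in range(num_blocks):
        votes.setdefault(find(2 * b), Counter())[entry_state[b]] += 1
        votes.setdefault(find(2 * b + 1), Counter())[exit_state[b]] += 1

    # The state with the most votes wins, minimizing edge transitions.
    winner = {bundle: c.most_common(1)[0][0] for bundle, c in votes.items()}
    return [(winner[find(2 * b)], winner[find(2 * b + 1)])
            for b in range(num_blocks)]
```

On a diamond CFG (0 → {1, 2} → 3), the bundle joining out(0) with in(1) and in(2) takes a majority vote across those three edges, so a single dissenting block does not force a transition on every edge.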
1 parent 3bdfca5 commit f696aa1

23 files changed: +3,884 −503 lines

llvm/lib/Target/AArch64/AArch64.h

Lines changed: 2 additions & 0 deletions
Lines changed: 2 additions & 0 deletions

@@ -60,6 +60,7 @@ FunctionPass *createAArch64CleanupLocalDynamicTLSPass();
 FunctionPass *createAArch64CollectLOHPass();
 FunctionPass *createSMEABIPass();
 FunctionPass *createSMEPeepholeOptPass();
+FunctionPass *createMachineSMEABIPass();
 ModulePass *createSVEIntrinsicOptsPass();
 InstructionSelector *
 createAArch64InstructionSelector(const AArch64TargetMachine &,
@@ -111,6 +112,7 @@ void initializeFalkorMarkStridedAccessesLegacyPass(PassRegistry&);
 void initializeLDTLSCleanupPass(PassRegistry&);
 void initializeSMEABIPass(PassRegistry &);
 void initializeSMEPeepholeOptPass(PassRegistry &);
+void initializeMachineSMEABIPass(PassRegistry &);
 void initializeSVEIntrinsicOptsPass(PassRegistry &);
 void initializeAArch64Arm64ECCallLoweringPass(PassRegistry &);
 } // end namespace llvm

llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp

Lines changed: 26 additions & 13 deletions
@@ -92,8 +92,8 @@ class AArch64ExpandPseudo : public MachineFunctionPass {
   bool expandCALL_BTI(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI);
   bool expandStoreSwiftAsyncContext(MachineBasicBlock &MBB,
                                     MachineBasicBlock::iterator MBBI);
-  MachineBasicBlock *expandRestoreZA(MachineBasicBlock &MBB,
-                                     MachineBasicBlock::iterator MBBI);
+  MachineBasicBlock *expandCommitOrRestoreZA(MachineBasicBlock &MBB,
+                                             MachineBasicBlock::iterator MBBI);
   MachineBasicBlock *expandCondSMToggle(MachineBasicBlock &MBB,
                                         MachineBasicBlock::iterator MBBI);
 };
@@ -980,40 +980,50 @@ bool AArch64ExpandPseudo::expandStoreSwiftAsyncContext(
 }
 
 MachineBasicBlock *
-AArch64ExpandPseudo::expandRestoreZA(MachineBasicBlock &MBB,
-                                     MachineBasicBlock::iterator MBBI) {
+AArch64ExpandPseudo::expandCommitOrRestoreZA(MachineBasicBlock &MBB,
+                                             MachineBasicBlock::iterator MBBI) {
   MachineInstr &MI = *MBBI;
+  bool IsRestoreZA = MI.getOpcode() == AArch64::RestoreZAPseudo;
+  assert((MI.getOpcode() == AArch64::RestoreZAPseudo ||
+          MI.getOpcode() == AArch64::CommitZAPseudo) &&
+         "Expected ZA commit or restore");
   assert((std::next(MBBI) != MBB.end() ||
           MI.getParent()->successors().begin() !=
              MI.getParent()->successors().end()) &&
         "Unexpected unreachable in block that restores ZA");
 
   // Compare TPIDR2_EL0 value against 0.
   DebugLoc DL = MI.getDebugLoc();
-  MachineInstrBuilder Cbz = BuildMI(MBB, MBBI, DL, TII->get(AArch64::CBZX))
-                                .add(MI.getOperand(0));
+  MachineInstrBuilder Branch =
+      BuildMI(MBB, MBBI, DL,
+              TII->get(IsRestoreZA ? AArch64::CBZX : AArch64::CBNZX))
+          .add(MI.getOperand(0));
 
   // Split MBB and create two new blocks:
   //  - MBB now contains all instructions before RestoreZAPseudo.
-  //  - SMBB contains the RestoreZAPseudo instruction only.
-  //  - EndBB contains all instructions after RestoreZAPseudo.
+  //  - SMBB contains the [Commit|RestoreZA]Pseudo instruction only.
+  //  - EndBB contains all instructions after [Commit|RestoreZA]Pseudo.
   MachineInstr &PrevMI = *std::prev(MBBI);
   MachineBasicBlock *SMBB = MBB.splitAt(PrevMI, /*UpdateLiveIns*/ true);
   MachineBasicBlock *EndBB = std::next(MI.getIterator()) == SMBB->end()
                                  ? *SMBB->successors().begin()
                                  : SMBB->splitAt(MI, /*UpdateLiveIns*/ true);
 
-  // Add the SMBB label to the TB[N]Z instruction & create a branch to EndBB.
-  Cbz.addMBB(SMBB);
+  // Add the SMBB label to the CB[N]Z instruction & create a branch to EndBB.
+  Branch.addMBB(SMBB);
   BuildMI(&MBB, DL, TII->get(AArch64::B))
       .addMBB(EndBB);
   MBB.addSuccessor(EndBB);
 
   // Replace the pseudo with a call (BL).
   MachineInstrBuilder MIB =
       BuildMI(*SMBB, SMBB->end(), DL, TII->get(AArch64::BL));
-  MIB.addReg(MI.getOperand(1).getReg(), RegState::Implicit);
-  for (unsigned I = 2; I < MI.getNumOperands(); ++I)
+  unsigned FirstBLOperand = 1;
+  if (IsRestoreZA) {
+    MIB.addReg(MI.getOperand(1).getReg(), RegState::Implicit);
+    FirstBLOperand = 2;
+  }
+  for (unsigned I = FirstBLOperand; I < MI.getNumOperands(); ++I)
     MIB.add(MI.getOperand(I));
   BuildMI(SMBB, DL, TII->get(AArch64::B)).addMBB(EndBB);
 
@@ -1629,8 +1639,9 @@ bool AArch64ExpandPseudo::expandMI(MachineBasicBlock &MBB,
     return expandCALL_BTI(MBB, MBBI);
   case AArch64::StoreSwiftAsyncContext:
     return expandStoreSwiftAsyncContext(MBB, MBBI);
+  case AArch64::CommitZAPseudo:
   case AArch64::RestoreZAPseudo: {
-    auto *NewMBB = expandRestoreZA(MBB, MBBI);
+    auto *NewMBB = expandCommitOrRestoreZA(MBB, MBBI);
     if (NewMBB != &MBB)
       NextMBBI = MBB.end(); // The NextMBBI iterator is invalidated.
     return true;
@@ -1641,6 +1652,8 @@ bool AArch64ExpandPseudo::expandMI(MachineBasicBlock &MBB,
     NextMBBI = MBB.end(); // The NextMBBI iterator is invalidated.
     return true;
   }
+  case AArch64::InOutZAUsePseudo:
+  case AArch64::RequiresZASavePseudo:
   case AArch64::COALESCER_BARRIER_FPR16:
   case AArch64::COALESCER_BARRIER_FPR32:
  case AArch64::COALESCER_BARRIER_FPR64:
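The branch polarity in this expansion follows the SME lazy-save protocol: after a call, TPIDR2_EL0 == 0 means the callee committed the caller's lazy save, so ZA must be restored; at a commit point, a nonzero TPIDR2_EL0 means a lazy save is still pending and must be committed. A rough Python model of the expanded control flow, with the runtime routine passed in as a callback (names are illustrative, not the SME runtime API):

```python
def run_commit_or_restore_za(tpidr2_el0, is_restore_za, call_routine):
    # Mirrors the CBZX / CBNZX choice in expandCommitOrRestoreZA:
    # RestoreZAPseudo guards its call with CBZX (taken when TPIDR2_EL0 is
    # zero), CommitZAPseudo with CBNZX (taken when TPIDR2_EL0 is nonzero).
    # SMBB holds only the BL; EndBB is the fall-through join point.
    should_call = (tpidr2_el0 == 0) if is_restore_za else (tpidr2_el0 != 0)
    if should_call:
        call_routine()  # the BL emitted into SMBB
    return should_call
```

The split into MBB / SMBB / EndBB exists only so the call can be skipped: both the taken and not-taken paths rejoin at EndBB.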

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

Lines changed: 88 additions & 59 deletions
@@ -8244,53 +8244,54 @@ SDValue AArch64TargetLowering::LowerFormalArguments(
   if (Subtarget->hasCustomCallingConv())
     Subtarget->getRegisterInfo()->UpdateCustomCalleeSavedRegs(MF);
 
-  // Create a 16 Byte TPIDR2 object. The dynamic buffer
-  // will be expanded and stored in the static object later using a pseudonode.
-  if (Attrs.hasZAState()) {
-    TPIDR2Object &TPIDR2 = FuncInfo->getTPIDR2Obj();
-    TPIDR2.FrameIndex = MFI.CreateStackObject(16, Align(16), false);
-    SDValue SVL = DAG.getNode(AArch64ISD::RDSVL, DL, MVT::i64,
-                              DAG.getConstant(1, DL, MVT::i32));
-
-    SDValue Buffer;
-    if (!Subtarget->isTargetWindows() && !hasInlineStackProbe(MF)) {
-      Buffer = DAG.getNode(AArch64ISD::ALLOCATE_ZA_BUFFER, DL,
-                           DAG.getVTList(MVT::i64, MVT::Other), {Chain, SVL});
-    } else {
-      SDValue Size = DAG.getNode(ISD::MUL, DL, MVT::i64, SVL, SVL);
-      Buffer = DAG.getNode(ISD::DYNAMIC_STACKALLOC, DL,
-                           DAG.getVTList(MVT::i64, MVT::Other),
-                           {Chain, Size, DAG.getConstant(1, DL, MVT::i64)});
-      MFI.CreateVariableSizedObject(Align(16), nullptr);
-    }
-    Chain = DAG.getNode(
-        AArch64ISD::INIT_TPIDR2OBJ, DL, DAG.getVTList(MVT::Other),
-        {/*Chain*/ Buffer.getValue(1), /*Buffer ptr*/ Buffer.getValue(0)});
-  } else if (Attrs.hasAgnosticZAInterface()) {
-    // Call __arm_sme_state_size().
-    SDValue BufferSize =
-        DAG.getNode(AArch64ISD::GET_SME_SAVE_SIZE, DL,
-                    DAG.getVTList(MVT::i64, MVT::Other), Chain);
-    Chain = BufferSize.getValue(1);
-
-    SDValue Buffer;
-    if (!Subtarget->isTargetWindows() && !hasInlineStackProbe(MF)) {
-      Buffer =
-          DAG.getNode(AArch64ISD::ALLOC_SME_SAVE_BUFFER, DL,
-                      DAG.getVTList(MVT::i64, MVT::Other), {Chain, BufferSize});
-    } else {
-      // Allocate space dynamically.
-      Buffer = DAG.getNode(
-          ISD::DYNAMIC_STACKALLOC, DL, DAG.getVTList(MVT::i64, MVT::Other),
-          {Chain, BufferSize, DAG.getConstant(1, DL, MVT::i64)});
-      MFI.CreateVariableSizedObject(Align(16), nullptr);
+  if (!Subtarget->useNewSMEABILowering() || Attrs.hasAgnosticZAInterface()) {
+    // Old SME ABI lowering (deprecated):
+    // Create a 16 Byte TPIDR2 object. The dynamic buffer
+    // will be expanded and stored in the static object later using a
+    // pseudonode.
+    if (Attrs.hasZAState()) {
+      TPIDR2Object &TPIDR2 = FuncInfo->getTPIDR2Obj();
+      TPIDR2.FrameIndex = MFI.CreateStackObject(16, Align(16), false);
+      SDValue SVL = DAG.getNode(AArch64ISD::RDSVL, DL, MVT::i64,
+                                DAG.getConstant(1, DL, MVT::i32));
+      SDValue Buffer;
+      if (!Subtarget->isTargetWindows() && !hasInlineStackProbe(MF)) {
+        Buffer = DAG.getNode(AArch64ISD::ALLOCATE_ZA_BUFFER, DL,
+                             DAG.getVTList(MVT::i64, MVT::Other), {Chain, SVL});
+      } else {
+        SDValue Size = DAG.getNode(ISD::MUL, DL, MVT::i64, SVL, SVL);
+        Buffer = DAG.getNode(ISD::DYNAMIC_STACKALLOC, DL,
+                             DAG.getVTList(MVT::i64, MVT::Other),
+                             {Chain, Size, DAG.getConstant(1, DL, MVT::i64)});
+        MFI.CreateVariableSizedObject(Align(16), nullptr);
+      }
+      Chain = DAG.getNode(
+          AArch64ISD::INIT_TPIDR2OBJ, DL, DAG.getVTList(MVT::Other),
+          {/*Chain*/ Buffer.getValue(1), /*Buffer ptr*/ Buffer.getValue(0)});
+    } else if (Attrs.hasAgnosticZAInterface()) {
+      // Call __arm_sme_state_size().
+      SDValue BufferSize =
+          DAG.getNode(AArch64ISD::GET_SME_SAVE_SIZE, DL,
+                      DAG.getVTList(MVT::i64, MVT::Other), Chain);
+      Chain = BufferSize.getValue(1);
+      SDValue Buffer;
+      if (!Subtarget->isTargetWindows() && !hasInlineStackProbe(MF)) {
+        Buffer = DAG.getNode(AArch64ISD::ALLOC_SME_SAVE_BUFFER, DL,
+                             DAG.getVTList(MVT::i64, MVT::Other),
+                             {Chain, BufferSize});
+      } else {
+        // Allocate space dynamically.
+        Buffer = DAG.getNode(
+            ISD::DYNAMIC_STACKALLOC, DL, DAG.getVTList(MVT::i64, MVT::Other),
+            {Chain, BufferSize, DAG.getConstant(1, DL, MVT::i64)});
+        MFI.CreateVariableSizedObject(Align(16), nullptr);
+      }
+      // Copy the value to a virtual register, and save that in FuncInfo.
+      Register BufferPtr =
+          MF.getRegInfo().createVirtualRegister(&AArch64::GPR64RegClass);
+      FuncInfo->setSMESaveBufferAddr(BufferPtr);
+      Chain = DAG.getCopyToReg(Chain, DL, BufferPtr, Buffer);
     }
-
-    // Copy the value to a virtual register, and save that in FuncInfo.
-    Register BufferPtr =
-        MF.getRegInfo().createVirtualRegister(&AArch64::GPR64RegClass);
-    FuncInfo->setSMESaveBufferAddr(BufferPtr);
-    Chain = DAG.getCopyToReg(Chain, DL, BufferPtr, Buffer);
   }
 
   if (CallConv == CallingConv::PreserveNone) {
@@ -8307,6 +8308,15 @@ SDValue AArch64TargetLowering::LowerFormalArguments(
     }
   }
 
+  if (Subtarget->useNewSMEABILowering()) {
+    // Clear new ZT0 state. TODO: Move this to the SME ABI pass.
+    if (Attrs.isNewZT0())
+      Chain = DAG.getNode(
+          ISD::INTRINSIC_VOID, DL, MVT::Other, Chain,
+          DAG.getConstant(Intrinsic::aarch64_sme_zero_zt, DL, MVT::i32),
+          DAG.getTargetConstant(0, DL, MVT::i32));
+  }
+
   return Chain;
 }
 
@@ -8871,14 +8881,12 @@ static SDValue emitSMEStateSaveRestore(const AArch64TargetLowering &TLI,
   MachineFunction &MF = DAG.getMachineFunction();
   AArch64FunctionInfo *FuncInfo = MF.getInfo<AArch64FunctionInfo>();
   FuncInfo->setSMESaveBufferUsed();
-
   TargetLowering::ArgListTy Args;
   TargetLowering::ArgListEntry Entry;
   Entry.Ty = PointerType::getUnqual(*DAG.getContext());
   Entry.Node =
       DAG.getCopyFromReg(Chain, DL, Info->getSMESaveBufferAddr(), MVT::i64);
   Args.push_back(Entry);
-
   SDValue Callee =
       DAG.getExternalSymbol(IsSave ? "__arm_sme_save" : "__arm_sme_restore",
                             TLI.getPointerTy(DAG.getDataLayout()));
@@ -9001,6 +9009,9 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
   if (MF.getTarget().Options.EmitCallGraphSection && CB && CB->isIndirectCall())
     CSInfo = MachineFunction::CallSiteInfo(*CB);
 
+  // Determine whether we need any streaming mode changes.
+  SMECallAttrs CallAttrs = getSMECallAttrs(MF.getFunction(), CLI);
+
   // Check callee args/returns for SVE registers and set calling convention
   // accordingly.
   if (CallConv == CallingConv::C || CallConv == CallingConv::Fast) {
@@ -9014,14 +9025,26 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
       CallConv = CallingConv::AArch64_SVE_VectorCall;
   }
 
+  bool UseNewSMEABILowering = Subtarget->useNewSMEABILowering();
+  bool IsAgnosticZAFunction = CallAttrs.caller().hasAgnosticZAInterface();
+  auto ZAMarkerNode = [&]() -> std::optional<unsigned> {
+    // TODO: Handle agnostic ZA functions.
+    if (!UseNewSMEABILowering || IsAgnosticZAFunction)
+      return std::nullopt;
+    if (!CallAttrs.caller().hasZAState() && !CallAttrs.caller().hasZT0State())
+      return std::nullopt;
+    return CallAttrs.requiresLazySave() ? AArch64ISD::REQUIRES_ZA_SAVE
+                                        : AArch64ISD::INOUT_ZA_USE;
+  }();
+
   if (IsTailCall) {
     // Check if it's really possible to do a tail call.
     IsTailCall = isEligibleForTailCallOptimization(CLI);
 
     // A sibling call is one where we're under the usual C ABI and not planning
     // to change that but can still do a tail call:
-    if (!TailCallOpt && IsTailCall && CallConv != CallingConv::Tail &&
-        CallConv != CallingConv::SwiftTail)
+    if (!ZAMarkerNode.has_value() && !TailCallOpt && IsTailCall &&
+        CallConv != CallingConv::Tail && CallConv != CallingConv::SwiftTail)
       IsSibCall = true;
 
     if (IsTailCall)
@@ -9073,9 +9096,6 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
     assert(FPDiff % 16 == 0 && "unaligned stack on tail call");
   }
 
-  // Determine whether we need any streaming mode changes.
-  SMECallAttrs CallAttrs = getSMECallAttrs(MF.getFunction(), CLI);
-
   auto DescribeCallsite =
       [&](OptimizationRemarkAnalysis &R) -> OptimizationRemarkAnalysis & {
     R << "call from '" << ore::NV("Caller", MF.getName()) << "' to '";
@@ -9089,7 +9109,7 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
     return R;
   };
 
-  bool RequiresLazySave = CallAttrs.requiresLazySave();
+  bool RequiresLazySave = !UseNewSMEABILowering && CallAttrs.requiresLazySave();
   bool RequiresSaveAllZA = CallAttrs.requiresPreservingAllZAState();
   if (RequiresLazySave) {
     const TPIDR2Object &TPIDR2 = FuncInfo->getTPIDR2Obj();
@@ -9171,10 +9191,21 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
         AArch64ISD::SMSTOP, DL, DAG.getVTList(MVT::Other, MVT::Glue), Chain,
         DAG.getTargetConstant((int32_t)(AArch64SVCR::SVCRZA), DL, MVT::i32));
 
-  // Adjust the stack pointer for the new arguments...
+  // Adjust the stack pointer for the new arguments... and mark ZA uses.
   // These operations are automatically eliminated by the prolog/epilog pass
-  if (!IsSibCall)
+  assert((!IsSibCall || !ZAMarkerNode.has_value()) &&
+         "ZA markers require CALLSEQ_START");
+  if (!IsSibCall) {
     Chain = DAG.getCALLSEQ_START(Chain, IsTailCall ? 0 : NumBytes, 0, DL);
+    if (ZAMarkerNode) {
+      // Note: We need the CALLSEQ_START to glue the ZAMarkerNode to, simply
+      // using a chain can result in incorrect scheduling. The markers refer
+      // to the position just before the CALLSEQ_START (though occur after, as
+      // CALLSEQ_START lacks in-glue).
+      Chain = DAG.getNode(*ZAMarkerNode, DL, DAG.getVTList(MVT::Other),
+                          {Chain, Chain.getValue(1)});
+    }
+  }
 
   SDValue StackPtr = DAG.getCopyFromReg(Chain, DL, AArch64::SP,
                                         getPointerTy(DAG.getDataLayout()));
@@ -9646,7 +9677,7 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
     }
   }
 
-  if (CallAttrs.requiresEnablingZAAfterCall())
+  if (RequiresLazySave || CallAttrs.requiresEnablingZAAfterCall())
     // Unconditionally resume ZA.
     Result = DAG.getNode(
         AArch64ISD::SMSTART, DL, DAG.getVTList(MVT::Other, MVT::Glue), Result,
@@ -9667,7 +9698,6 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
     SDValue TPIDR2_EL0 = DAG.getNode(
         ISD::INTRINSIC_W_CHAIN, DL, MVT::i64, Result,
         DAG.getConstant(Intrinsic::aarch64_sme_get_tpidr2, DL, MVT::i32));
-
     // Copy the address of the TPIDR2 block into X0 before 'calling' the
     // RESTORE_ZA pseudo.
     SDValue Glue;
@@ -9679,7 +9709,6 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
         DAG.getNode(AArch64ISD::RESTORE_ZA, DL, MVT::Other,
                     {Result, TPIDR2_EL0, DAG.getRegister(AArch64::X0, MVT::i64),
                      RestoreRoutine, RegMask, Result.getValue(1)});
-
     // Finally reset the TPIDR2_EL0 register to 0.
     Result = DAG.getNode(
         ISD::INTRINSIC_VOID, DL, MVT::Other, Result,
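The marker selection in the `ZAMarkerNode` lambda above reduces to a small decision table. A standalone Python sketch of that logic (a model, not the SelectionDAG code; the function name and string results are illustrative):

```python
def za_marker_for_call(use_new_sme_abi, caller_is_agnostic_za,
                       caller_has_za_or_zt0, requires_lazy_save):
    # Under the new lowering, only callers that themselves have ZA or ZT0
    # state get a marker glued to the CALLSEQ_START; agnostic-ZA callers
    # are not handled yet (a TODO in the patch). A call that requires a
    # lazy save gets REQUIRES_ZA_SAVE; any other marked call just records
    # an in/out use of ZA.
    if not use_new_sme_abi or caller_is_agnostic_za:
        return None
    if not caller_has_za_or_zt0:
        return None
    return "REQUIRES_ZA_SAVE" if requires_lazy_save else "INOUT_ZA_USE"
```

Returning `None` here corresponds to emitting no marker at all, which is why the sibling-call check above only bails out when a marker is present: a marker needs a CALLSEQ_START to attach to.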

llvm/lib/Target/AArch64/AArch64ISelLowering.h

Lines changed: 4 additions & 0 deletions
@@ -173,6 +173,10 @@ class AArch64TargetLowering : public TargetLowering {
   MachineBasicBlock *EmitZTInstr(MachineInstr &MI, MachineBasicBlock *BB,
                                  unsigned Opcode, bool Op0IsDef) const;
   MachineBasicBlock *EmitZero(MachineInstr &MI, MachineBasicBlock *BB) const;
+
+  // Note: The following group of functions are only used as part of the old SME
+  // ABI lowering. They will be removed once -aarch64-new-sme-abi=true is the
+  // default.
   MachineBasicBlock *EmitInitTPIDR2Object(MachineInstr &MI,
                                           MachineBasicBlock *BB) const;
   MachineBasicBlock *EmitAllocateZABuffer(MachineInstr &MI,
