-
Notifications
You must be signed in to change notification settings - Fork 13.4k
[IR] Add llvm.sincos
intrinsic
#109825
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[IR] Add llvm.sincos
intrinsic
#109825
Conversation
@llvm/pr-subscribers-llvm-support @llvm/pr-subscribers-backend-arm Author: Benjamin Maxwell (MacDue) ChangesThis adds the The
The lowering is built on top of the existing FSINCOS ISD node, with additional type legalization to allow for f16, f128, and vector values. Patch is 57.57 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/109825.diff 18 Files Affected:
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 91c3e60bb0acb1..22fe98beef6a09 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -15361,6 +15361,8 @@ Semantics:
This function returns the first value raised to the second power with an
unspecified sequence of rounding operations.
+.. _t_llvm_sin:
+
'``llvm.sin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -15398,6 +15400,8 @@ trapping or setting ``errno``.
When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.
+.. _t_llvm_cos:
+
'``llvm.cos.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -15694,6 +15698,47 @@ trapping or setting ``errno``.
When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.
+
+'``llvm.sincos.*``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+This is an overloaded intrinsic. You can use ``llvm.sincos`` on any
+floating-point or vector of floating-point type. Not all targets support
+all types however.
+
+::
+
+ declare { float, float } @llvm.sincos.f32(float %Val)
+ declare { double, double } @llvm.sincos.f64(double %Val)
+ declare { x86_fp80, x86_fp80 } @llvm.sincos.f80(x86_fp80 %Val)
+ declare { fp128, fp128 } @llvm.sincos.f128(fp128 %Val)
+ declare { ppc_fp128, ppc_fp128 } @llvm.sincos.ppcf128(ppc_fp128 %Val)
+ declare { <4 x float>, <4 x float> } @llvm.sincos.v4f32(<4 x float> %Val)
+
+Overview:
+"""""""""
+
+The '``llvm.sincos.*``' intrinsics returns the sine and cosine of the operand.
+
+Arguments:
+""""""""""
+
+The argument is a :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
+of floating-point values. Returns two values matching the argument type in a
+struct.
+
+Semantics:
+""""""""""
+
+This intrinsic is equivalent to a calling both :ref:`llvm.sin <t_llvm_sin>`
+and :ref:`llvm.cos <t_llvm_cos>` on the argument.
+
+The first result is the sine of the argument and the second result is the cosine
+of the argument.
+
'``llvm.pow.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^
diff --git a/llvm/include/llvm/CodeGen/BasicTTIImpl.h b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
index 7198e134a2d262..1f98d3632954e2 100644
--- a/llvm/include/llvm/CodeGen/BasicTTIImpl.h
+++ b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
@@ -1978,6 +1978,9 @@ class BasicTTIImplBase : public TargetTransformInfoImplCRTPBase<T> {
case Intrinsic::cos:
ISD = ISD::FCOS;
break;
+ case Intrinsic::sincos:
+ ISD = ISD::FSINCOS;
+ break;
case Intrinsic::tan:
ISD = ISD::FTAN;
break;
diff --git a/llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h b/llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h
index 9b993482c8cc07..ab3025e4923cd0 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h
@@ -2009,6 +2009,13 @@ class MachineIRBuilder {
return buildInstr(TargetOpcode::G_FFREXP, {Fract, Exp}, {Src}, Flags);
}
+ /// Build and insert \p Sin, \p Cos = G_FSINCOS \p Src
+ MachineInstrBuilder
+ buildFSincos(const DstOp &Sin, const DstOp &Cos, const SrcOp &Src,
+ std::optional<unsigned> Flags = std::nullopt) {
+ return buildInstr(TargetOpcode::G_FSINCOS, {Sin, Cos}, {Src}, Flags);
+ }
+
/// Build and insert \p Res = G_FCOPYSIGN \p Op0, \p Op1
MachineInstrBuilder buildFCopysign(const DstOp &Dst, const SrcOp &Src0,
const SrcOp &Src1) {
diff --git a/llvm/include/llvm/IR/Intrinsics.td b/llvm/include/llvm/IR/Intrinsics.td
index 0a74a217a5f010..ce72ba69af9aff 100644
--- a/llvm/include/llvm/IR/Intrinsics.td
+++ b/llvm/include/llvm/IR/Intrinsics.td
@@ -1040,6 +1040,8 @@ let IntrProperties = [IntrNoMem, IntrSpeculatable, IntrWillReturn] in {
def int_nearbyint : DefaultAttrsIntrinsic<[llvm_anyfloat_ty], [LLVMMatchType<0>]>;
def int_round : DefaultAttrsIntrinsic<[llvm_anyfloat_ty], [LLVMMatchType<0>]>;
def int_roundeven : DefaultAttrsIntrinsic<[llvm_anyfloat_ty], [LLVMMatchType<0>]>;
+ def int_sincos : DefaultAttrsIntrinsic<[LLVMMatchType<0>, LLVMMatchType<0>],
+ [llvm_anyfloat_ty]>;
// Truncate a floating point number with a specific rounding mode
def int_fptrunc_round : DefaultAttrsIntrinsic<[ llvm_anyfloat_ty ],
diff --git a/llvm/include/llvm/Support/TargetOpcodes.def b/llvm/include/llvm/Support/TargetOpcodes.def
index 9e70eb8d8fdd35..42d07338910f8e 100644
--- a/llvm/include/llvm/Support/TargetOpcodes.def
+++ b/llvm/include/llvm/Support/TargetOpcodes.def
@@ -809,6 +809,9 @@ HANDLE_TARGET_OPCODE(G_FCOS)
/// Floating point sine.
HANDLE_TARGET_OPCODE(G_FSIN)
+/// Floating point combined sine and cosine.
+HANDLE_TARGET_OPCODE(G_FSINCOS)
+
/// Floating point tangent.
HANDLE_TARGET_OPCODE(G_FTAN)
diff --git a/llvm/include/llvm/Target/GenericOpcodes.td b/llvm/include/llvm/Target/GenericOpcodes.td
index f5e62dda6fd043..fcbf748b3db8de 100644
--- a/llvm/include/llvm/Target/GenericOpcodes.td
+++ b/llvm/include/llvm/Target/GenericOpcodes.td
@@ -1020,6 +1020,13 @@ def G_FSIN : GenericInstruction {
let hasSideEffects = false;
}
+// Floating point combined sine and cosine.
+def G_FSINCOS : GenericInstruction {
+ let OutOperandList = (outs type0:$dst1, type0:$dst2);
+ let InOperandList = (ins type0:$src1);
+ let hasSideEffects = false;
+}
+
// Floating point tangent of a value.
def G_FTAN : GenericInstruction {
let OutOperandList = (outs type0:$dst);
diff --git a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
index 8e860a1f740295..bc840fccc4d053 100644
--- a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
@@ -2340,6 +2340,13 @@ bool IRTranslator::translateKnownIntrinsic(const CallInst &CI, Intrinsic::ID ID,
MachineInstr::copyFlagsFromInstruction(CI));
return true;
}
+ case Intrinsic::sincos: {
+ ArrayRef<Register> VRegs = getOrCreateVRegs(CI);
+ MIRBuilder.buildFSincos(VRegs[0], VRegs[1],
+ getOrCreateVReg(*CI.getArgOperand(0)),
+ MachineInstr::copyFlagsFromInstruction(CI));
+ return true;
+ }
case Intrinsic::fptosi_sat:
MIRBuilder.buildFPTOSI_SAT(getOrCreateVReg(CI),
getOrCreateVReg(*CI.getArgOperand(0)));
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
index 3c087727a80126..cf1291c81b6845 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
@@ -5577,6 +5577,16 @@ void SelectionDAGLegalize::PromoteNode(SDNode *Node) {
Results.push_back(Tmp2.getValue(1));
break;
}
+ case ISD::FSINCOS: {
+ Tmp1 = DAG.getNode(ISD::FP_EXTEND, dl, NVT, Node->getOperand(0));
+ Tmp2 = DAG.getNode(ISD::FSINCOS, dl, {NVT, NVT}, Tmp1);
+
+ for (unsigned ResNum = 0; ResNum < Node->getNumValues(); ResNum++)
+ Results.push_back(
+ DAG.getNode(ISD::FP_ROUND, dl, OVT, Tmp2.getValue(ResNum),
+ DAG.getIntPtrConstant(0, dl, /*isTarget=*/true)));
+ break;
+ }
case ISD::FFLOOR:
case ISD::FCEIL:
case ISD::FRINT:
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
index 2c81c829e75cbb..fb492a86992c5f 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
@@ -127,6 +127,7 @@ void DAGTypeLegalizer::SoftenFloatResult(SDNode *N, unsigned ResNo) {
case ISD::FLDEXP:
case ISD::STRICT_FLDEXP: R = SoftenFloatRes_ExpOp(N); break;
case ISD::FFREXP: R = SoftenFloatRes_FFREXP(N); break;
+ case ISD::FSINCOS: R = SoftenFloatRes_FSINCOS(N); break;
case ISD::STRICT_FREM:
case ISD::FREM: R = SoftenFloatRes_FREM(N); break;
case ISD::STRICT_FRINT:
@@ -765,6 +766,45 @@ SDValue DAGTypeLegalizer::SoftenFloatRes_FFREXP(SDNode *N) {
return ReturnVal;
}
+SDValue DAGTypeLegalizer::SoftenFloatRes_FSINCOS(SDNode *N) {
+ assert(!N->isStrictFPOpcode() && "strictfp not implemented for fsincos");
+ EVT VT = N->getValueType(0);
+ RTLIB::Libcall LC = RTLIB::getFSINCOS(VT);
+
+ if (!TLI.getLibcallName(LC))
+ return SDValue();
+
+ EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), VT);
+ SDValue StackSlotSin = DAG.CreateStackTemporary(NVT);
+ SDValue StackSlotCos = DAG.CreateStackTemporary(NVT);
+
+ SDLoc DL(N);
+
+ TargetLowering::MakeLibCallOptions CallOptions;
+ std::array Ops{GetSoftenedFloat(N->getOperand(0)), StackSlotSin,
+ StackSlotCos};
+ std::array OpsVT{VT, StackSlotSin.getValueType(),
+ StackSlotCos.getValueType()};
+
+ // TODO: setTypeListBeforeSoften can't properly express multiple return types,
+ // but since both returns have the same type for sincos it should be okay.
+ CallOptions.setTypeListBeforeSoften({OpsVT}, VT, true);
+
+ auto [ReturnVal, Chain] = TLI.makeLibCall(DAG, LC, NVT, Ops, CallOptions, DL,
+ /*Chain=*/SDValue());
+ unsigned ResNo = 0;
+ for (SDValue OutPtr : {StackSlotSin, StackSlotCos}) {
+ int FrameIdx = cast<FrameIndexSDNode>(OutPtr)->getIndex();
+ auto PtrInfo =
+ MachinePointerInfo::getFixedStack(DAG.getMachineFunction(), FrameIdx);
+
+ SDValue LoadExp = DAG.getLoad(NVT, DL, Chain, OutPtr, PtrInfo);
+ SetSoftenedFloat(SDValue(N, ResNo++), LoadExp);
+ }
+
+ return SDValue();
+}
+
SDValue DAGTypeLegalizer::SoftenFloatRes_FREM(SDNode *N) {
return SoftenFloatRes_Binary(N, GetFPLibCall(N->getValueType(0),
RTLIB::REM_F32,
@@ -2683,6 +2723,10 @@ void DAGTypeLegalizer::PromoteFloatResult(SDNode *N, unsigned ResNo) {
case ISD::FLDEXP: R = PromoteFloatRes_ExpOp(N); break;
case ISD::FFREXP: R = PromoteFloatRes_FFREXP(N); break;
+ case ISD::FSINCOS:
+ R = PromoteFloatRes_FSINCOS(N);
+ break;
+
case ISD::FP_ROUND: R = PromoteFloatRes_FP_ROUND(N); break;
case ISD::STRICT_FP_ROUND:
R = PromoteFloatRes_STRICT_FP_ROUND(N);
@@ -2878,6 +2922,18 @@ SDValue DAGTypeLegalizer::PromoteFloatRes_FFREXP(SDNode *N) {
return Res;
}
+SDValue DAGTypeLegalizer::PromoteFloatRes_FSINCOS(SDNode *N) {
+ EVT VT = N->getValueType(0);
+ EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), VT);
+ SDValue Op = GetPromotedFloat(N->getOperand(0));
+ SDValue Res = DAG.getNode(N->getOpcode(), SDLoc(N), {NVT, NVT}, Op);
+
+ for (unsigned ResNum = 0; ResNum < N->getNumValues(); ResNum++)
+ SetPromotedFloat(SDValue(N, ResNum), Res.getValue(ResNum));
+
+ return SDValue();
+}
+
// Explicit operation to reduce precision. Reduce the value to half precision
// and promote it back to the legal type.
SDValue DAGTypeLegalizer::PromoteFloatRes_FP_ROUND(SDNode *N) {
@@ -3126,6 +3182,10 @@ void DAGTypeLegalizer::SoftPromoteHalfResult(SDNode *N, unsigned ResNo) {
case ISD::FFREXP: R = SoftPromoteHalfRes_FFREXP(N); break;
+ case ISD::FSINCOS:
+ R = SoftPromoteHalfRes_FSINCOS(N);
+ break;
+
case ISD::LOAD: R = SoftPromoteHalfRes_LOAD(N); break;
case ISD::ATOMIC_LOAD:
R = SoftPromoteHalfRes_ATOMIC_LOAD(N);
@@ -3282,6 +3342,26 @@ SDValue DAGTypeLegalizer::SoftPromoteHalfRes_FFREXP(SDNode *N) {
return DAG.getNode(GetPromotionOpcode(NVT, OVT), dl, MVT::i16, Res);
}
+SDValue DAGTypeLegalizer::SoftPromoteHalfRes_FSINCOS(SDNode *N) {
+ EVT OVT = N->getValueType(0);
+ EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), OVT);
+ SDValue Op = GetSoftPromotedHalf(N->getOperand(0));
+ SDLoc dl(N);
+
+ // Promote to the larger FP type.
+ Op = DAG.getNode(GetPromotionOpcode(OVT, NVT), dl, NVT, Op);
+ SDValue Res = DAG.getNode(N->getOpcode(), dl, DAG.getVTList(NVT, NVT), Op);
+
+ // Convert back to FP16 as an integer.
+ ISD::NodeType Truncate = GetPromotionOpcode(NVT, OVT);
+ for (unsigned ResNum = 0; ResNum < N->getNumValues(); ResNum++) {
+ SDValue Trunc = DAG.getNode(Truncate, dl, MVT::i16, Res.getValue(ResNum));
+ SetSoftPromotedHalf(SDValue(N, ResNum), Trunc);
+ }
+
+ return SDValue();
+}
+
SDValue DAGTypeLegalizer::SoftPromoteHalfRes_FP_ROUND(SDNode *N) {
EVT RVT = N->getValueType(0);
bool IsStrict = N->isStrictFPOpcode();
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h b/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h
index d14516ef3e2fbb..7375ab37f37495 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h
@@ -596,6 +596,7 @@ class LLVM_LIBRARY_VISIBILITY DAGTypeLegalizer {
SDValue SoftenFloatRes_FPOW(SDNode *N);
SDValue SoftenFloatRes_ExpOp(SDNode *N);
SDValue SoftenFloatRes_FFREXP(SDNode *N);
+ SDValue SoftenFloatRes_FSINCOS(SDNode *N);
SDValue SoftenFloatRes_FREEZE(SDNode *N);
SDValue SoftenFloatRes_FREM(SDNode *N);
SDValue SoftenFloatRes_FRINT(SDNode *N);
@@ -742,6 +743,7 @@ class LLVM_LIBRARY_VISIBILITY DAGTypeLegalizer {
SDValue PromoteFloatRes_FMAD(SDNode *N);
SDValue PromoteFloatRes_ExpOp(SDNode *N);
SDValue PromoteFloatRes_FFREXP(SDNode *N);
+ SDValue PromoteFloatRes_FSINCOS(SDNode *N);
SDValue PromoteFloatRes_FP_ROUND(SDNode *N);
SDValue PromoteFloatRes_STRICT_FP_ROUND(SDNode *N);
SDValue PromoteFloatRes_LOAD(SDNode *N);
@@ -790,6 +792,7 @@ class LLVM_LIBRARY_VISIBILITY DAGTypeLegalizer {
SDValue SoftPromoteHalfRes_FMAD(SDNode *N);
SDValue SoftPromoteHalfRes_ExpOp(SDNode *N);
SDValue SoftPromoteHalfRes_FFREXP(SDNode *N);
+ SDValue SoftPromoteHalfRes_FSINCOS(SDNode *N);
SDValue SoftPromoteHalfRes_FP_ROUND(SDNode *N);
SDValue SoftPromoteHalfRes_LOAD(SDNode *N);
SDValue SoftPromoteHalfRes_ATOMIC_LOAD(SDNode *N);
@@ -861,7 +864,7 @@ class LLVM_LIBRARY_VISIBILITY DAGTypeLegalizer {
SDValue ScalarizeVecRes_IS_FPCLASS(SDNode *N);
SDValue ScalarizeVecRes_FIX(SDNode *N);
- SDValue ScalarizeVecRes_FFREXP(SDNode *N, unsigned ResNo);
+ SDValue ScalarizeVecRes_UnaryOpWithTwoResults(SDNode *N, unsigned ResNo);
// Vector Operand Scalarization: <1 x ty> -> ty.
bool ScalarizeVectorOperand(SDNode *N, unsigned OpNo);
@@ -915,7 +918,8 @@ class LLVM_LIBRARY_VISIBILITY DAGTypeLegalizer {
void SplitVecRes_CMP(SDNode *N, SDValue &Lo, SDValue &Hi);
void SplitVecRes_UnaryOp(SDNode *N, SDValue &Lo, SDValue &Hi);
void SplitVecRes_ADDRSPACECAST(SDNode *N, SDValue &Lo, SDValue &Hi);
- void SplitVecRes_FFREXP(SDNode *N, unsigned ResNo, SDValue &Lo, SDValue &Hi);
+ void SplitVecRes_UnaryOpWithTwoResults(SDNode *N, unsigned ResNo, SDValue &Lo,
+ SDValue &Hi);
void SplitVecRes_ExtendOp(SDNode *N, SDValue &Lo, SDValue &Hi);
void SplitVecRes_InregOp(SDNode *N, SDValue &Lo, SDValue &Hi);
void SplitVecRes_ExtVecInRegOp(SDNode *N, SDValue &Lo, SDValue &Hi);
@@ -1066,6 +1070,7 @@ class LLVM_LIBRARY_VISIBILITY DAGTypeLegalizer {
SDValue WidenVecRes_ExpOp(SDNode *N);
SDValue WidenVecRes_Unary(SDNode *N);
SDValue WidenVecRes_InregOp(SDNode *N);
+ SDValue WidenVecRes_FSINCOS(SDNode *N);
// Widen Vector Operand.
bool WidenVectorOperand(SDNode *N, unsigned OpNo);
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp
index 5d433204d5da08..6b4d2e02d62d29 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp
@@ -451,6 +451,7 @@ SDValue VectorLegalizer::LegalizeOp(SDValue Op) {
case ISD::UMULO:
case ISD::FCANONICALIZE:
case ISD::FFREXP:
+ case ISD::FSINCOS:
case ISD::SADDSAT:
case ISD::UADDSAT:
case ISD::SSUBSAT:
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
index 1c466ed0b77997..6479fccb855f9c 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
@@ -130,7 +130,8 @@ void DAGTypeLegalizer::ScalarizeVectorResult(SDNode *N, unsigned ResNo) {
R = ScalarizeVecRes_ADDRSPACECAST(N);
break;
case ISD::FFREXP:
- R = ScalarizeVecRes_FFREXP(N, ResNo);
+ case ISD::FSINCOS:
+ R = ScalarizeVecRes_UnaryOpWithTwoResults(N, ResNo);
break;
case ISD::ADD:
case ISD::AND:
@@ -275,7 +276,9 @@ SDValue DAGTypeLegalizer::ScalarizeVecRes_FIX(SDNode *N) {
Op2, N->getFlags());
}
-SDValue DAGTypeLegalizer::ScalarizeVecRes_FFREXP(SDNode *N, unsigned ResNo) {
+SDValue
+DAGTypeLegalizer::ScalarizeVecRes_UnaryOpWithTwoResults(SDNode *N,
+ unsigned ResNo) {
assert(N->getValueType(0).getVectorNumElements() == 1 &&
"Unexpected vector type!");
SDValue Elt = GetScalarizedVector(N->getOperand(0));
@@ -1252,7 +1255,8 @@ void DAGTypeLegalizer::SplitVectorResult(SDNode *N, unsigned ResNo) {
SplitVecRes_ADDRSPACECAST(N, Lo, Hi);
break;
case ISD::FFREXP:
- SplitVecRes_FFREXP(N, ResNo, Lo, Hi);
+ case ISD::FSINCOS:
+ SplitVecRes_UnaryOpWithTwoResults(N, ResNo, Lo, Hi);
break;
case ISD::ANY_EXTEND:
@@ -2611,8 +2615,10 @@ void DAGTypeLegalizer::SplitVecRes_ADDRSPACECAST(SDNode *N, SDValue &Lo,
Hi = DAG.getAddrSpaceCast(dl, HiVT, Hi, SrcAS, DestAS);
}
-void DAGTypeLegalizer::SplitVecRes_FFREXP(SDNode *N, unsigned ResNo,
- SDValue &Lo, SDValue &Hi) {
+void DAGTypeLegalizer::SplitVecRes_UnaryOpWithTwoResults(SDNode *N,
+ unsigned ResNo,
+ SDValue &Lo,
+ SDValue &Hi) {
SDLoc dl(N);
auto [LoVT, HiVT] = DAG.GetSplitDestVTs(N->getValueType(0));
auto [LoVT1, HiVT1] = DAG.GetSplitDestVTs(N->getValueType(1));
@@ -4746,6 +4752,14 @@ void DAGTypeLegalizer::WidenVectorResult(SDNode *N, unsigned ResNo) {
case ISD::VP_FSHR:
Res = WidenVecRes_Ternary(N);
break;
+ case ISD::FSINCOS: {
+ if (!unrollExpandedOp())
+ Res = WidenVecRes_FSINCOS(N);
+ for (unsigned ResNum = 0; ResNum < N->getNumValues(); ResNum++)
+ SetWidenedVector(SDValue(N, ResNum), Res.getValue(ResNum));
+ Res = SDValue();
+ break;
+ }
}
// If Res is null, the sub-method took care of registering the result.
@@ -5494,6 +5508,12 @@ SDValue DAGTypeLegalizer::WidenVecRes_InregOp(SDNode *N) {
WidenVT, WidenLHS, DAG.getValueType(ExtVT));
}
+SDValue DAGTypeLegalizer::WidenVecRes_FSINCOS(SDNode *N) {
+ EVT WidenVT = TLI.getTypeToTransformTo(*DAG.getContext(), N->getValueType(0));
+ SDValue InOp = GetWidenedVector(N->getOperand(0));
+ return DAG.getNode(N->getOpcode(), SDLoc(N), {WidenVT, WidenVT}, InOp);
+}
+
SDValue DAGTypeLegalizer::WidenVecRes_MERGE_VALUES(SDNode *N, unsigned ResNo) {
SDValue WidenVec = DisintegrateMERGE_VALUES(N, ResNo);
return GetWidenedVector(WidenVec);
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index 25213f587116d5..14bfb9126cb230 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -6918,12 +6918,23 @@ void SelectionDAGBuilder::visitIntrinsicCall(const CallInst &I,
getValue(I.getArgOperand(0)),
getValue(I.getArgOperand(1)), Flags));
return;
+ case Intrinsic::sincos:
case Intrinsic::frexp: {
+ unsigned Opcode;
+ switch (Intrinsic) {
+ default:
+ llvm_unreachable("unexpected intrinsic");
+ case Intrinsic::sincos:
+ Opcode = ISD::FSINCOS;
+ break;
+ case Intrinsic::frexp:
+ Opcode = ISD::FFREXP;
+ break;
+ }
SmallVector<EVT, 2> ValueVTs;
ComputeValueVTs(TLI, DAG.getDataLayout(), I.getType(), ValueVTs);
SDVTList VTs = DAG.getVTList(ValueVTs);
- setValue(&I,
- DAG.getNode(ISD::FFREXP, sdl, VTs, getValue(I.getArgOperand(0))));
+ setValue(&I, DAG.getNode(Opcode, sdl, VTs, getValue(I.getArgOperand(0))));
return;
}
case Intrinsic::arithmetic_fence: {
diff --git a/ll...
[truncated]
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
5cf8c1e
to
03e622e
Compare
Not sure why, but my clang format ( |
Please add G_SINCOS to https://llvm.org/docs/GlobalISel/GenericOpcode.html. |
We've never made any progress vectorizing intrinsics with aggregate return types like this (e.g. overflow intrinsics) @fhahn @alexey-bataev Do you think we can easily support this or should we consider alternatives? We've also had problems with handling arguments that take pointers (sincos / modf etc.) so I guess its just an annoying problem in general....... |
I have a patch WIP/experimental patch to support this, I could share as a draft now. |
Can this deal with sin/cos pairs, like div/rem are already handled? I assume the vectorizer can deal with making sure they're likely to be used in codegen |
Here's my (experimental/WIP) implementation for vectorizing (homogeneous) structs: #109833. It works for my current set of test cases (based on functions like sincos). |
We commonly add a test when importing a new intrinsics: #88240. |
6afd2b5
to
9fdf9fb
Compare
Kind ping 🙂 |
Do you want to get this in before #109833 ? Whichever way you do it, I'd like to see sincos vectorisation test coverage. |
I've split them so they can go in any order. Just #109833 alone won't vectorize I don't mind adding negative tests before that though if it'd help 🙂 |
9fdf9fb
to
96ce74d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This intrinsic and the reason for having it makes sense to me. The code looks fine too, only left a few minor comments.
309302e
to
012ac8d
Compare
This adds the `llvm.sincos` intrinsic, legalization, and lowering. The `llvm.sincos` intrinsic takes a floating-point value and returns both the sine and cosine (as a struct). ``` declare { float, float } @llvm.sincos.f32(float %Val) declare { double, double } @llvm.sincos.f64(double %Val) declare { x86_fp80, x86_fp80 } @llvm.sincos.f80(x86_fp80 %Val) declare { fp128, fp128 } @llvm.sincos.f128(fp128 %Val) declare { ppc_fp128, ppc_fp128 } @llvm.sincos.ppcf128(ppc_fp128 %Val) declare { <4 x float>, <4 x float> } @llvm.sincos.v4f32(<4 x float> %Val) ``` The lowering is built on top of the existing FSINCOS ISD node, with additional type legalization to allow for f16, f128, and vector values.
012ac8d
to
d8ca293
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few minor comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - cheers
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/123/builds/8296 Here is the relevant piece of the build log for the reference
|
This adds the `llvm.sincos` intrinsic, legalization, and lowering. The `llvm.sincos` intrinsic takes a floating-point value and returns both the sine and cosine (as a struct). ``` declare { float, float } @llvm.sincos.f32(float %Val) declare { double, double } @llvm.sincos.f64(double %Val) declare { x86_fp80, x86_fp80 } @llvm.sincos.f80(x86_fp80 %Val) declare { fp128, fp128 } @llvm.sincos.f128(fp128 %Val) declare { ppc_fp128, ppc_fp128 } @llvm.sincos.ppcf128(ppc_fp128 %Val) declare { <4 x float>, <4 x float> } @llvm.sincos.v4f32(<4 x float> %Val) ``` The lowering is built on top of the existing FSINCOS ISD node, with additional type legalization to allow for f16, f128, and vector values.
This adds the
llvm.sincos
intrinsic, legalization, and lowering.The
llvm.sincos
intrinsic takes a floating-point value and returns both the sine and cosine (as a struct).The lowering is built on top of the existing FSINCOS ISD node, with additional type legalization to allow for f16, f128, and vector values.