-
Notifications
You must be signed in to change notification settings - Fork 13.5k
Fix mechanism propagating mangled names for TLI function mappings #66656
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix mechanism propagating mangled names for TLI function mappings #66656
Conversation
@llvm/pr-subscribers-llvm-analysis @llvm/pr-subscribers-llvm-transforms ChangesCurrently the mappings from TLI are used to generate the list of available "scalar to vector" mappings attached to scalar calls as "vector-function-abi-variant" LLVM IR attribute. Function names from TLI are wrapped in mangled name following the pattern: This patch is only a refactoring stage where there is no change to the compiler's behaviour. Patch is 117.37 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/66656.diff 9 Files Affected:
diff --git a/llvm/include/llvm/Analysis/TargetLibraryInfo.h b/llvm/include/llvm/Analysis/TargetLibraryInfo.h
index 5d62e837c1f3d5f..b15f20f43ab29a2 100644
--- a/llvm/include/llvm/Analysis/TargetLibraryInfo.h
+++ b/llvm/include/llvm/Analysis/TargetLibraryInfo.h
@@ -27,11 +27,25 @@ class Triple;
/// Describes a possible vectorization of a function.
/// Function 'VectorFnName' is equivalent to 'ScalarFnName' vectorized
/// by a factor 'VectorizationFactor'.
+/// The MangledName string holds scalar-to-vector mapping:
+/// _ZGV<isa><mask><vlen><vparams>_<scalarname>(<vectorname>)
+///
+/// where:
+///
+/// <isa> = "_LLVM_"
+/// <mask> = "M" if masked, "N" if no mask.
+/// <vlen> = Number of concurrent lanes, stored in the `VectorizationFactor`
+/// field of the `VecDesc` struct. If the number of lanes is scalable
+/// then 'x' is printed instead.
+/// <vparams> = "v", as many as are the numArgs.
+/// <scalarname> = the name of the scalar function.
+/// <vectorname> = the name of the vector function.
struct VecDesc {
StringRef ScalarFnName;
StringRef VectorFnName;
ElementCount VectorizationFactor;
bool Masked;
+ StringRef MangledName;
};
enum LibFunc : unsigned {
@@ -163,18 +177,18 @@ class TargetLibraryInfoImpl {
/// Return true if the function F has a vector equivalent with vectorization
/// factor VF.
bool isFunctionVectorizable(StringRef F, const ElementCount &VF) const {
- return !(getVectorizedFunction(F, VF, false).empty() &&
- getVectorizedFunction(F, VF, true).empty());
+ return !(getVectorizedFunction(F, VF, false).first.empty() &&
+ getVectorizedFunction(F, VF, true).first.empty());
}
/// Return true if the function F has a vector equivalent with any
/// vectorization factor.
bool isFunctionVectorizable(StringRef F) const;
- /// Return the name of the equivalent of F, vectorized with factor VF. If no
- /// such mapping exists, return the empty string.
- StringRef getVectorizedFunction(StringRef F, const ElementCount &VF,
- bool Masked) const;
+ /// Return the name of the equivalent of F, vectorized with factor VF and it's
+ /// mangled name. If no such mapping exists, return empty strings.
+ std::pair<StringRef, StringRef>
+ getVectorizedFunction(StringRef F, const ElementCount &VF, bool Masked) const;
/// Set to true iff i32 parameters to library functions should have signext
/// or zeroext attributes if they correspond to C-level int or unsigned int,
@@ -350,8 +364,9 @@ class TargetLibraryInfo {
bool isFunctionVectorizable(StringRef F) const {
return Impl->isFunctionVectorizable(F);
}
- StringRef getVectorizedFunction(StringRef F, const ElementCount &VF,
- bool Masked = false) const {
+ std::pair<StringRef, StringRef>
+ getVectorizedFunction(StringRef F, const ElementCount &VF,
+ bool Masked = false) const {
return Impl->getVectorizedFunction(F, VF, Masked);
}
diff --git a/llvm/include/llvm/Analysis/VecFuncs.def b/llvm/include/llvm/Analysis/VecFuncs.def
index 98bcfe3843669f7..77cc458823ebc8b 100644
--- a/llvm/include/llvm/Analysis/VecFuncs.def
+++ b/llvm/include/llvm/Analysis/VecFuncs.def
@@ -14,7 +14,7 @@
#if defined(TLI_DEFINE_MASSV_VECFUNCS_NAMES)
#define TLI_DEFINE_MASSV_VECFUNCS
-#define TLI_DEFINE_VECFUNC(SCAL, VEC, VF) VEC,
+#define TLI_DEFINE_VECFUNC(SCAL, VEC, VF, MANGLN) VEC,
#endif
#define FIXED(NL) ElementCount::getFixed(NL)
@@ -23,860 +23,860 @@
#define MASKED true
#if !(defined(TLI_DEFINE_VECFUNC))
-#define TLI_DEFINE_VECFUNC(SCAL, VEC, VF) {SCAL, VEC, VF, NOMASK},
+#define TLI_DEFINE_VECFUNC(SCAL, VEC, VF, MANGLN) {SCAL, VEC, VF, NOMASK, MANGLN},
#endif
#if defined(TLI_DEFINE_ACCELERATE_VECFUNCS)
// Accelerate framework's Vector Functions
// Floating-Point Arithmetic and Auxiliary Functions
-TLI_DEFINE_VECFUNC("ceilf", "vceilf", FIXED(4))
-TLI_DEFINE_VECFUNC("fabsf", "vfabsf", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.fabs.f32", "vfabsf", FIXED(4))
-TLI_DEFINE_VECFUNC("floorf", "vfloorf", FIXED(4))
-TLI_DEFINE_VECFUNC("sqrtf", "vsqrtf", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.sqrt.f32", "vsqrtf", FIXED(4))
+TLI_DEFINE_VECFUNC("ceilf", "vceilf", FIXED(4), "_ZGV_LLVM_N4v_ceilf(vceilf)")
+TLI_DEFINE_VECFUNC("fabsf", "vfabsf", FIXED(4), "_ZGV_LLVM_N4v_fabsf(vfabsf)")
+TLI_DEFINE_VECFUNC("llvm.fabs.f32", "vfabsf", FIXED(4), "_ZGV_LLVM_N4v_llvm.fabs.f32(vfabsf)")
+TLI_DEFINE_VECFUNC("floorf", "vfloorf", FIXED(4), "_ZGV_LLVM_N4v_floorf(vfloorf)")
+TLI_DEFINE_VECFUNC("sqrtf", "vsqrtf", FIXED(4), "_ZGV_LLVM_N4v_sqrtf(vsqrtf)")
+TLI_DEFINE_VECFUNC("llvm.sqrt.f32", "vsqrtf", FIXED(4), "_ZGV_LLVM_N4v_llvm.sqrt.f32(vsqrtf)")
// Exponential and Logarithmic Functions
-TLI_DEFINE_VECFUNC("expf", "vexpf", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.exp.f32", "vexpf", FIXED(4))
-TLI_DEFINE_VECFUNC("expm1f", "vexpm1f", FIXED(4))
-TLI_DEFINE_VECFUNC("logf", "vlogf", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.log.f32", "vlogf", FIXED(4))
-TLI_DEFINE_VECFUNC("log1pf", "vlog1pf", FIXED(4))
-TLI_DEFINE_VECFUNC("log10f", "vlog10f", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.log10.f32", "vlog10f", FIXED(4))
-TLI_DEFINE_VECFUNC("logbf", "vlogbf", FIXED(4))
+TLI_DEFINE_VECFUNC("expf", "vexpf", FIXED(4), "_ZGV_LLVM_N4v_expf(vexpf)")
+TLI_DEFINE_VECFUNC("llvm.exp.f32", "vexpf", FIXED(4), "_ZGV_LLVM_N4v_llvm.exp.f32(vexpf)")
+TLI_DEFINE_VECFUNC("expm1f", "vexpm1f", FIXED(4), "_ZGV_LLVM_N4v_expm1f(vexpm1f)")
+TLI_DEFINE_VECFUNC("logf", "vlogf", FIXED(4), "_ZGV_LLVM_N4v_logf(vlogf)")
+TLI_DEFINE_VECFUNC("llvm.log.f32", "vlogf", FIXED(4), "_ZGV_LLVM_N4v_llvm.log.f32(vlogf)")
+TLI_DEFINE_VECFUNC("log1pf", "vlog1pf", FIXED(4), "_ZGV_LLVM_N4v_log1pf(vlog1pf)")
+TLI_DEFINE_VECFUNC("log10f", "vlog10f", FIXED(4), "_ZGV_LLVM_N4v_log10f(vlog10f)")
+TLI_DEFINE_VECFUNC("llvm.log10.f32", "vlog10f", FIXED(4), "_ZGV_LLVM_N4v_llvm.log10.f32(vlog10f)")
+TLI_DEFINE_VECFUNC("logbf", "vlogbf", FIXED(4), "_ZGV_LLVM_N4v_logbf(vlogbf)")
// Trigonometric Functions
-TLI_DEFINE_VECFUNC("sinf", "vsinf", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.sin.f32", "vsinf", FIXED(4))
-TLI_DEFINE_VECFUNC("cosf", "vcosf", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.cos.f32", "vcosf", FIXED(4))
-TLI_DEFINE_VECFUNC("tanf", "vtanf", FIXED(4))
-TLI_DEFINE_VECFUNC("asinf", "vasinf", FIXED(4))
-TLI_DEFINE_VECFUNC("acosf", "vacosf", FIXED(4))
-TLI_DEFINE_VECFUNC("atanf", "vatanf", FIXED(4))
+TLI_DEFINE_VECFUNC("sinf", "vsinf", FIXED(4), "_ZGV_LLVM_N4v_sinf(vsinf)")
+TLI_DEFINE_VECFUNC("llvm.sin.f32", "vsinf", FIXED(4), "_ZGV_LLVM_N4v_llvm.sin.f32(vsinf)")
+TLI_DEFINE_VECFUNC("cosf", "vcosf", FIXED(4), "_ZGV_LLVM_N4v_cosf(vcosf)")
+TLI_DEFINE_VECFUNC("llvm.cos.f32", "vcosf", FIXED(4), "_ZGV_LLVM_N4v_llvm.cos.f32(vcosf)")
+TLI_DEFINE_VECFUNC("tanf", "vtanf", FIXED(4), "_ZGV_LLVM_N4v_tanf(vtanf)")
+TLI_DEFINE_VECFUNC("asinf", "vasinf", FIXED(4), "_ZGV_LLVM_N4v_asinf(vasinf)")
+TLI_DEFINE_VECFUNC("acosf", "vacosf", FIXED(4), "_ZGV_LLVM_N4v_acosf(vacosf)")
+TLI_DEFINE_VECFUNC("atanf", "vatanf", FIXED(4), "_ZGV_LLVM_N4v_atanf(vatanf)")
// Hyperbolic Functions
-TLI_DEFINE_VECFUNC("sinhf", "vsinhf", FIXED(4))
-TLI_DEFINE_VECFUNC("coshf", "vcoshf", FIXED(4))
-TLI_DEFINE_VECFUNC("tanhf", "vtanhf", FIXED(4))
-TLI_DEFINE_VECFUNC("asinhf", "vasinhf", FIXED(4))
-TLI_DEFINE_VECFUNC("acoshf", "vacoshf", FIXED(4))
-TLI_DEFINE_VECFUNC("atanhf", "vatanhf", FIXED(4))
+TLI_DEFINE_VECFUNC("sinhf", "vsinhf", FIXED(4), "_ZGV_LLVM_N4v_sinhf(vsinhf)")
+TLI_DEFINE_VECFUNC("coshf", "vcoshf", FIXED(4), "_ZGV_LLVM_N4v_coshf(vcoshf)")
+TLI_DEFINE_VECFUNC("tanhf", "vtanhf", FIXED(4), "_ZGV_LLVM_N4v_tanhf(vtanhf)")
+TLI_DEFINE_VECFUNC("asinhf", "vasinhf", FIXED(4), "_ZGV_LLVM_N4v_asinhf(vasinhf)")
+TLI_DEFINE_VECFUNC("acoshf", "vacoshf", FIXED(4), "_ZGV_LLVM_N4v_acoshf(vacoshf)")
+TLI_DEFINE_VECFUNC("atanhf", "vatanhf", FIXED(4), "_ZGV_LLVM_N4v_atanhf(vatanhf)")
#elif defined(TLI_DEFINE_DARWIN_LIBSYSTEM_M_VECFUNCS)
// Darwin libsystem_m vector functions.
// Exponential and Logarithmic Functions
-TLI_DEFINE_VECFUNC("exp", "_simd_exp_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("llvm.exp.f64", "_simd_exp_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("expf", "_simd_exp_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.exp.f32", "_simd_exp_f4", FIXED(4))
+TLI_DEFINE_VECFUNC("exp", "_simd_exp_d2", FIXED(2), "_ZGV_LLVM_N2v_exp(_simd_exp_d2)")
+TLI_DEFINE_VECFUNC("llvm.exp.f64", "_simd_exp_d2", FIXED(2), "_ZGV_LLVM_N2v_llvm.exp.f64(_simd_exp_d2)")
+TLI_DEFINE_VECFUNC("expf", "_simd_exp_f4", FIXED(4), "_ZGV_LLVM_N4v_expf(_simd_exp_f4)")
+TLI_DEFINE_VECFUNC("llvm.exp.f32", "_simd_exp_f4", FIXED(4), "_ZGV_LLVM_N4v_llvm.exp.f32(_simd_exp_f4)")
// Trigonometric Functions
-TLI_DEFINE_VECFUNC("acos", "_simd_acos_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("acosf", "_simd_acos_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("asin", "_simd_asin_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("asinf", "_simd_asin_f4", FIXED(4))
-
-TLI_DEFINE_VECFUNC("atan", "_simd_atan_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("atanf", "_simd_atan_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("atan2", "_simd_atan2_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("atan2f", "_simd_atan2_f4", FIXED(4))
-
-TLI_DEFINE_VECFUNC("cos", "_simd_cos_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("llvm.cos.f64", "_simd_cos_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("cosf", "_simd_cos_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.cos.f32", "_simd_cos_f4", FIXED(4))
-
-TLI_DEFINE_VECFUNC("sin", "_simd_sin_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("llvm.sin.f64", "_simd_sin_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("sinf", "_simd_sin_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.sin.f32", "_simd_sin_f4", FIXED(4))
+TLI_DEFINE_VECFUNC("acos", "_simd_acos_d2", FIXED(2), "_ZGV_LLVM_N2v_acos(_simd_acos_d2)")
+TLI_DEFINE_VECFUNC("acosf", "_simd_acos_f4", FIXED(4), "_ZGV_LLVM_N4v_acosf(_simd_acos_f4)")
+TLI_DEFINE_VECFUNC("asin", "_simd_asin_d2", FIXED(2), "_ZGV_LLVM_N2v_asin(_simd_asin_d2)")
+TLI_DEFINE_VECFUNC("asinf", "_simd_asin_f4", FIXED(4), "_ZGV_LLVM_N4v_asinf(_simd_asin_f4)")
+
+TLI_DEFINE_VECFUNC("atan", "_simd_atan_d2", FIXED(2), "_ZGV_LLVM_N2v_atan(_simd_atan_d2)")
+TLI_DEFINE_VECFUNC("atanf", "_simd_atan_f4", FIXED(4), "_ZGV_LLVM_N4v_atanf(_simd_atan_f4)")
+TLI_DEFINE_VECFUNC("atan2", "_simd_atan2_d2", FIXED(2), "_ZGV_LLVM_N2v_atan2(_simd_atan2_d2)")
+TLI_DEFINE_VECFUNC("atan2f", "_simd_atan2_f4", FIXED(4), "_ZGV_LLVM_N4v_atan2f(_simd_atan2_f4)")
+
+TLI_DEFINE_VECFUNC("cos", "_simd_cos_d2", FIXED(2), "_ZGV_LLVM_N2v_cos(_simd_cos_d2)")
+TLI_DEFINE_VECFUNC("llvm.cos.f64", "_simd_cos_d2", FIXED(2), "_ZGV_LLVM_N2v_llvm.cos.f64(_simd_cos_d2)")
+TLI_DEFINE_VECFUNC("cosf", "_simd_cos_f4", FIXED(4), "_ZGV_LLVM_N4v_cosf(_simd_cos_f4)")
+TLI_DEFINE_VECFUNC("llvm.cos.f32", "_simd_cos_f4", FIXED(4), "_ZGV_LLVM_N4v_llvm.cos.f32(_simd_cos_f4)")
+
+TLI_DEFINE_VECFUNC("sin", "_simd_sin_d2", FIXED(2), "_ZGV_LLVM_N2v_sin(_simd_sin_d2)")
+TLI_DEFINE_VECFUNC("llvm.sin.f64", "_simd_sin_d2", FIXED(2), "_ZGV_LLVM_N2v_llvm.sin.f64(_simd_sin_d2)")
+TLI_DEFINE_VECFUNC("sinf", "_simd_sin_f4", FIXED(4), "_ZGV_LLVM_N4v_sinf(_simd_sin_f4)")
+TLI_DEFINE_VECFUNC("llvm.sin.f32", "_simd_sin_f4", FIXED(4), "_ZGV_LLVM_N4v_llvm.sin.f32(_simd_sin_f4)")
// Floating-Point Arithmetic and Auxiliary Functions
-TLI_DEFINE_VECFUNC("cbrt", "_simd_cbrt_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("cbrtf", "_simd_cbrt_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("erf", "_simd_erf_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("erff", "_simd_erf_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("pow", "_simd_pow_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("llvm.pow.f64", "_simd_pow_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("powf", "_simd_pow_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.pow.f32", "_simd_pow_f4", FIXED(4))
+TLI_DEFINE_VECFUNC("cbrt", "_simd_cbrt_d2", FIXED(2), "_ZGV_LLVM_N2v_cbrt(_simd_cbrt_d2)")
+TLI_DEFINE_VECFUNC("cbrtf", "_simd_cbrt_f4", FIXED(4), "_ZGV_LLVM_N4v_cbrtf(_simd_cbrt_f4)")
+TLI_DEFINE_VECFUNC("erf", "_simd_erf_d2", FIXED(2), "_ZGV_LLVM_N2v_erf(_simd_erf_d2)")
+TLI_DEFINE_VECFUNC("erff", "_simd_erf_f4", FIXED(4), "_ZGV_LLVM_N4v_erff(_simd_erf_f4)")
+TLI_DEFINE_VECFUNC("pow", "_simd_pow_d2", FIXED(2), "_ZGV_LLVM_N2v_pow(_simd_pow_d2)")
+TLI_DEFINE_VECFUNC("llvm.pow.f64", "_simd_pow_d2", FIXED(2), "_ZGV_LLVM_N2v_llvm.pow.f64(_simd_pow_d2)")
+TLI_DEFINE_VECFUNC("powf", "_simd_pow_f4", FIXED(4), "_ZGV_LLVM_N4v_powf(_simd_pow_f4)")
+TLI_DEFINE_VECFUNC("llvm.pow.f32", "_simd_pow_f4", FIXED(4), "_ZGV_LLVM_N4v_llvm.pow.f32(_simd_pow_f4)")
// Hyperbolic Functions
-TLI_DEFINE_VECFUNC("sinh", "_simd_sinh_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("sinhf", "_simd_sinh_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("cosh", "_simd_cosh_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("coshf", "_simd_cosh_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("tanh", "_simd_tanh_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("tanhf", "_simd_tanh_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("asinh", "_simd_asinh_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("asinhf", "_simd_asinh_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("acosh", "_simd_acosh_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("acoshf", "_simd_acosh_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("atanh", "_simd_atanh_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("atanhf", "_simd_atanh_f4", FIXED(4))
+TLI_DEFINE_VECFUNC("sinh", "_simd_sinh_d2", FIXED(2), "_ZGV_LLVM_N2v_sinh(_simd_sinh_d2)")
+TLI_DEFINE_VECFUNC("sinhf", "_simd_sinh_f4", FIXED(4), "_ZGV_LLVM_N4v_sinhf(_simd_sinh_f4)")
+TLI_DEFINE_VECFUNC("cosh", "_simd_cosh_d2", FIXED(2), "_ZGV_LLVM_N2v_cosh(_simd_cosh_d2)")
+TLI_DEFINE_VECFUNC("coshf", "_simd_cosh_f4", FIXED(4), "_ZGV_LLVM_N4v_coshf(_simd_cosh_f4)")
+TLI_DEFINE_VECFUNC("tanh", "_simd_tanh_d2", FIXED(2), "_ZGV_LLVM_N2v_tanh(_simd_tanh_d2)")
+TLI_DEFINE_VECFUNC("tanhf", "_simd_tanh_f4", FIXED(4), "_ZGV_LLVM_N4v_tanhf(_simd_tanh_f4)")
+TLI_DEFINE_VECFUNC("asinh", "_simd_asinh_d2", FIXED(2), "_ZGV_LLVM_N2v_asinh(_simd_asinh_d2)")
+TLI_DEFINE_VECFUNC("asinhf", "_simd_asinh_f4", FIXED(4), "_ZGV_LLVM_N4v_asinhf(_simd_asinh_f4)")
+TLI_DEFINE_VECFUNC("acosh", "_simd_acosh_d2", FIXED(2), "_ZGV_LLVM_N2v_acosh(_simd_acosh_d2)")
+TLI_DEFINE_VECFUNC("acoshf", "_simd_acosh_f4", FIXED(4), "_ZGV_LLVM_N4v_acoshf(_simd_acosh_f4)")
+TLI_DEFINE_VECFUNC("atanh", "_simd_atanh_d2", FIXED(2), "_ZGV_LLVM_N2v_atanh(_simd_atanh_d2)")
+TLI_DEFINE_VECFUNC("atanhf", "_simd_atanh_f4", FIXED(4), "_ZGV_LLVM_N4v_atanhf(_simd_atanh_f4)")
#elif defined(TLI_DEFINE_LIBMVEC_X86_VECFUNCS)
// GLIBC Vector math Functions
-TLI_DEFINE_VECFUNC("sin", "_ZGVbN2v_sin", FIXED(2))
-TLI_DEFINE_VECFUNC("sin", "_ZGVdN4v_sin", FIXED(4))
+TLI_DEFINE_VECFUNC("sin", "_ZGVbN2v_sin", FIXED(2), "_ZGV_LLVM_N2v_sin(_ZGVbN2v_sin)")
+TLI_DEFINE_VECFUNC("sin", "_ZGVdN4v_sin", FIXED(4), "_ZGV_LLVM_N4v_sin(_ZGVdN4v_sin)")
-TLI_DEFINE_VECFUNC("sinf", "_ZGVbN4v_sinf", FIXED(4))
-TLI_DEFINE_VECFUNC("sinf", "_ZGVdN8v_sinf", FIXED(8))
+TLI_DEFINE_VECFUNC("sinf", "_ZGVbN4v_sinf", FIXED(4), "_ZGV_LLVM_N4v_sinf(_ZGVbN4v_sinf)")
+TLI_DEFINE_VECFUNC("sinf", "_ZGVdN8v_sinf", FIXED(8), "_ZGV_LLVM_N8v_sinf(_ZGVdN8v_sinf)")
-TLI_DEFINE_VECFUNC("llvm.sin.f64", "_ZGVbN2v_sin", FIXED(2))
-TLI_DEFINE_VECFUNC("llvm.sin.f64", "_ZGVdN4v_sin", FIXED(4))
+TLI_DEFINE_VECFUNC("llvm.sin.f64", "_ZGVbN2v_sin", FIXED(2), "_ZGV_LLVM_N2v_llvm.sin.f64(_ZGVbN2v_sin)")
+TLI_DEFINE_VECFUNC("llvm.sin.f64", "_ZGVdN4v_sin", FIXED(4), "_ZGV_LLVM_N4v_llvm.sin.f64(_ZGVdN4v_sin)")
-TLI_DEFINE_VECFUNC("llvm.sin.f32", "_ZGVbN4v_sinf", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.sin.f32", "_ZGVdN8v_sinf", FIXED(8))
+TLI_DEFINE_VECFUNC("llvm.sin.f32", "_ZGVbN4v_sinf", FIXED(4), "_ZGV_LLVM_N4v_llvm.sin.f32(_ZGVbN4v_sinf)")
+TLI_DEFINE_VECFUNC("llvm.sin.f32", "_ZGVdN8v_sinf", FIXED(8), "_ZGV_LLVM_N8v_llvm.sin.f32(_ZGVdN8v_sinf)")
-TLI_DEFINE_VECFUNC("cos", "_ZGVbN2v_cos", FIXED(2))
-TLI_DEFINE_VECFUNC("cos", "_ZGVdN4v_cos", FIXED(4))
+TLI_DEFINE_VECFUNC("cos", "_ZGVbN2v_cos", FIXED(2), "_ZGV_LLVM_N2v_cos(_ZGVbN2v_cos)")
+TLI_DEFINE_VECFUNC("cos", "_ZGVdN4v_cos", FIXED(4), "_ZGV_LLVM_N4v_cos(_ZGVdN4v_cos)")
-TLI_DEFINE_VECFUNC("cosf", "_ZGVbN4v_cosf", FIXED(4))
-TLI_DEFINE_VECFUNC("cosf", "_ZGVdN8v_cosf", FIXED(8))
+TLI_DEFINE_VECFUNC("cosf", "_ZGVbN4v_cosf", FIXED(4), "_ZGV_LLVM_N4v_cosf(_ZGVbN4v_cosf)")
+TLI_DEFINE_VECFUNC("cosf", "_ZGVdN8v_cosf", FIXED(8), "_ZGV_LLVM_N8v_cosf(_ZGVdN8v_cosf)")
-TLI_DEFINE_VECFUNC("llvm.cos.f64", "_ZGVbN2v_cos", FIXED(2))
-TLI_DEFINE_VECFUNC("llvm.cos.f64", "_ZGVdN4v_cos", FIXED(4))
+TLI_DEFINE_VECFUNC("llvm.cos.f64", "_ZGVbN2v_cos", FIXED(2), "_ZGV_LLVM_N2v_llvm.cos.f64(_ZGVbN2v_cos)")
+TLI_DEFINE_VECFUNC("llvm.cos.f64", "_ZGVdN4v_cos", FIXED(4), "_ZGV_LLVM_N4v_llvm.cos.f64(_ZGVdN4v_cos)")
-TLI_DEFINE_VECFUNC("llvm.cos.f32", "_ZGVbN4v_cosf", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.cos.f32", "_ZGVdN8v_cosf", FIXED(8))
+TLI_DEFINE_VECFUNC("llvm.cos.f32", "_ZGVbN4v_cosf", FIXED(4), "_ZGV_LLVM_N4v_llvm.cos.f32(_ZGVbN4v_cosf)")
+TLI_DEFINE_VECFUNC("llvm.cos.f32", "_ZGVdN8v_cosf", FIXED(8), "_ZGV_LLVM_N8v_llvm.cos.f32(_ZGVdN8v_cosf)")
-TLI_DEFINE_VECFUNC("pow", "_ZGVbN2vv_pow", FIXED(2))
-TLI_DEFINE_VECFUNC("pow", "_ZGVdN4vv_pow", FIXED(4))
+TLI_DEFINE_VECFUNC("pow", "_ZGVbN2vv_pow", FIXED(2), "_ZGV_LLVM_N2vv_pow(_ZGVbN2vv_pow)")
+TLI_DEFINE_VECFUNC("pow", "_ZGVdN4vv_pow", FIXED(4), "_ZGV_LLVM_N4vv_pow(_ZGVdN4vv_pow)")
-TLI_DEFINE_VECFUNC("powf", "_ZGVbN4vv_powf", FIXED(4))
-TLI_DEFINE_VECFUNC("powf", "_ZGVdN8vv_powf", FIXED(8))
+TLI_DEFINE_VECFUNC("powf", "_ZGVbN4vv_powf", FIXED(4), "_ZGV_LLVM_N4vv_powf(_ZGVbN4vv_powf)")
+TLI_DEFINE_VECFUNC("powf", "_ZGVdN8vv_powf", FIXED(8), "_ZGV_LLVM_N8vv_powf(_ZGVdN8vv_powf)")
-TLI_DEFINE_VECFUNC("__pow_finite", "_ZGVbN2vv___pow_finite", FIXED(2))
-TLI_DEFINE_VECFUNC("__pow_finite", "_ZGVdN4vv___pow_finite", FIXED(4))
+TLI_DEFINE_VECFUNC("__pow_finite", "_ZGVbN2vv___pow_finite", FIXED(2), "_ZGV_LLVM_N2vv___pow_finite(_ZGVbN2vv___pow_finite)")
+TLI_DEFINE_VECFUNC("__pow_finite", "_ZGVdN4vv___pow_finite", FIXED(4), "_ZGV_LLVM_N4vv___pow_finite(_ZGVdN4vv___pow_finite)")
-TLI_DEFINE_VECFUNC("__powf_finite", "_ZGVbN4vv___powf_finite", FIXED(4))
-TLI_DEFINE_VECFUNC("__powf_finite", "_ZGVdN8vv___powf_finite", FIXED(8))
+TLI_DEFINE_VECFUNC("__powf_finite", "_ZGVbN4vv___powf_finite", FIXED(4), "_ZGV_LLVM_N4vv___powf_finite(_ZGVbN4vv___powf_finite)")
+TLI_DEFINE_VECFUNC("__powf_finite", "_ZGVdN8vv___powf_finite", FIXED(8), "_ZGV_LLVM_N8vv___powf_finite(_ZGVdN8vv___powf_finite)")
-TLI_DEFINE_VECFUNC("llvm.pow.f64", "_ZGVbN2vv_pow", FIXED(2))
-TLI_DEFINE_VECFUNC("llvm.pow.f64", "_ZGVdN4vv_pow", FIXED(4))
+TLI_DEFINE_VECFUNC("llvm.pow.f64", "_ZGVbN2vv_pow", FIXED(2), "_ZGV_LLVM_N2vv_llvm.pow.f64(_ZGVbN2vv_pow)")
+TLI_DEFINE_VECFUNC("llvm.pow.f64", "_ZGVdN4vv_pow", FIXED(4), "_ZGV_LLVM_N4vv_llvm.pow.f64(_ZGVdN4vv_pow)")
-TLI_DEFINE_VECFUNC("llvm.pow.f32", "_ZGVbN4vv_powf", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.pow.f32", "_ZGVdN8vv_powf", FIXED(8))
+TLI_DEFINE_VECFUNC("llvm.pow.f32", "_ZGVbN4vv_powf", FIXED(4), "_ZGV_LLVM_N4vv_llvm.pow.f32(_ZGVbN4vv_powf)")
+TLI_DEFINE_VECFUNC("llvm.pow.f32", "_ZGVdN8vv_powf", FIXED(8), "_ZGV_LLVM_N8vv_llvm.pow.f32(_ZGVdN8vv_powf)")
-TLI_DEFINE_VECFUNC("exp", "_ZGVbN2v_exp", FIXED(2))
-TLI_DEFINE_VECFUNC("exp", "_ZGVdN4v_exp", FIXED(4))
+TLI_DEFINE_VECFUNC("exp", "_ZGVbN2v_exp", FIXED(2), "_ZGV_LLVM_N2v_exp(_ZGVbN2v_exp)")
+TLI_DEFINE_VECFUNC("exp", "_ZGVdN4v_exp", FIXED(4), "_ZGV_LLVM_N2v_exp(_ZGVdN4v_exp)")
-TLI_DEFINE_VECFUNC("expf", "_ZGVbN4v_expf", FIXED(4))
-TLI_DEFINE_VECFUNC("expf", "_ZGVdN8v_expf", FIXED(8))
+TLI_DEFINE_VECFUNC("expf", "_ZGVbN4v_expf", FIXED(4), "_ZGV_LLVM_N4v_expf(_ZGVbN4v_expf)")
+TLI_DEFINE_VECFUNC("expf", "_ZGVdN8v_expf", FIXED(8), "_ZGV_LLVM_N8v_expf(_ZGVdN8v_expf)")
-TLI_DEFINE_VECFUNC("__exp_finite", "_ZGVbN2v___exp_finite", FIXED(2))
-TLI_DEFINE_VECFUNC("__exp_finite", "_ZGVdN4v___exp_finite", FIXED(4))
+TLI_DEFINE_VECFUNC("__exp_finite", "_ZGVbN2v___exp_finite", FIXED(2), "_ZGV_LLVM_N2v___exp_finite(_ZGVbN2v___exp_finite)")
+TLI_DEFINE_VECFUNC("__exp_finite", "_ZGVdN4v___exp_finite", FIXED(4), "_ZGV_LLVM_N4v___exp_finite(_ZGVdN4v___exp_finite)")
-TLI_DEFINE_VECFUNC("__expf_finite", "_ZGVbN4v___expf_finite", FIXED(4))
-TLI_DEFINE_VECFUNC("__expf_finite", "_ZGVdN8v___expf_finite", FIXED(8))
+TLI_DEFINE_VECFUNC("__expf_finite", "_ZGVbN4v___expf_finite", FIXED(4), "_ZGV_LLVM_N4v___expf_finite(_ZGVbN4v___expf_finite)")
+TLI_DEFINE_VECFUNC("__expf_finite", "_ZGVdN8v___expf_finite", FIXED(8), "_ZGV_LLVM_N8v___expf_finite(_ZGVdN8v___expf_finite)")
-TLI_DEFINE_VECFUNC("llvm.exp.f64", "_ZGVbN2v_exp", FIXED(2))
-TLI_DEFINE_VECFUNC("llvm.exp.f64", "_ZGVdN4v_...
[truncated]
|
TLI_DEFINE_VECFUNC("floorf", "vfloorf", FIXED(4)) | ||
TLI_DEFINE_VECFUNC("sqrtf", "vsqrtf", FIXED(4)) | ||
TLI_DEFINE_VECFUNC("llvm.sqrt.f32", "vsqrtf", FIXED(4)) | ||
TLI_DEFINE_VECFUNC("ceilf", "vceilf", FIXED(4), "_ZGV_LLVM_N4v_ceilf(vceilf)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MANGLN shouldn't need to be a full vector-function-abi-variant
string because that string has a specific use and is built up based on the fields within this table. We should try to keep things minimal and that means certainly dropping the (vector-name)
part of the string.
I cannot quite decide whether the scalar name is useful to include or not but that likely comes down to the interface used to extract the data. You're current returning the mangling and the vector name so that's likely where the need comes from. However if you instead have queries that return a reference to the whole record then you might be able to restrict the new field to just mangling prefix because the scalar name will be accessible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid I cannot shorten the mangled name removing the optional part.
If I do I'll crash in VFABI::setVectorVariantNames() with
assert(VI && "Cannot add an invalid VFABI name.").
The culprit is the code below. VectorName will only be changed if there
is an optional part. For LLVM isa this causes a std::nullopt return
because the VectorName is unchanged and equals OriginalName.
std::optional VFABI::tryDemangleForVFABI(StringRef MangledName,
const Module &M) {
const StringRef OriginalName = MangledName;
StringRef VectorName = MangledName;
...
// Reduce MangledName to [()].
MangledName = MangledName.ltrim(ScalarName);
// Find the optional custom name redirection.
if (MangledName.consume_front("(")) {
if (!MangledName.consume_back(")"))
return std::nullopt;
// Update the vector variant with the one specified by the user.
VectorName = MangledName;
// If the vector name is missing, bail out.
if (VectorName.empty())
return std::nullopt;
}
// LLVM internal mapping via the TargetLibraryInfo (TLI) must be
// redirected to an existing name.
if (ISA == VFISAKind::LLVM && VectorName == OriginalName)
return std::nullopt;
Sorry for not formatting the above as code but this removes all line breaks for me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think perhaps you've misunderstood the intent of this work and thus my request. I'm not expecting there to be any changes to the value of the vector-function-abi-variant
attribute. I'm only expecting the way the text is generated to be changed so that for follow on patches we can stop tryDemangleForVFABI
from incorrectly looking through the module in order to figure out the element count for scalable vector functions.
If you look at how mangleTLIVectorName
is used you'll see that it's given data that has been extracted from TargetLibraryInfo (via its database of VecDesc objects). The only missing piece of information is the Vector ABI mangling prefix that it instead computes. The request is for this prefix to be store within TargetLibraryInfo as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hopefully fixed now.
std::pair<StringRef, StringRef> | ||
getVectorizedFunction(StringRef F, const ElementCount &VF, bool Masked) const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels clunky and I think the original interface is just wrong because it's not going to scale well with different variants for the same scalar functions. I think it's best to return the complete record (essentially turn VecDesc into a class) so that all the related information is kept together.
NOTE: I'm not saying you need to convert all the code paths, feel free to sit the old interface on top of new one and then just use the new one where you need it (i.e. when you need access to the mangle string).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
TLI_DEFINE_VECFUNC("erf", "_simd_erf_d2", FIXED(2), "_ZGV_LLVM_N2v_erf(_simd_erf_d2)") | ||
TLI_DEFINE_VECFUNC("erff", "_simd_erf_f4", FIXED(4), "_ZGV_LLVM_N4v_erff(_simd_erf_f4)") | ||
TLI_DEFINE_VECFUNC("pow", "_simd_pow_d2", FIXED(2), "_ZGV_LLVM_N2v_pow(_simd_pow_d2)") | ||
TLI_DEFINE_VECFUNC("llvm.pow.f64", "_simd_pow_d2", FIXED(2), "_ZGV_LLVM_N2v_llvm.pow.f64(_simd_pow_d2)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how many of these are wrong but pow
takes two parameters and yet the vector ABI string only shows one v
(i.e. one operand). I had thought there would be existing tests to ensure this couldn't go wrong. I'm guessing now there are no such existing tests?
It might be worth using a macro a bit like we have for the ElementCount
fields to cover the most common usages (e.g. VABI_MANGLE_OneOp(#NoElts), VABI_MANGLE_TwoOp(#NoElts)).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, pow usually takes two arguments. This one seems to be an exception. From the veclib-calls-libsystem-darwin.ll
test file:
declare double @pow(double) ... %call = tail call double @pow(double %lv)
All darwin tests are passing but I cannot guarantee the tests cover all mappings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a bug with the original implementation. @fhahn does the use of pow/powf (i.e. only having a single operand) within veclib-calls-libsystem-darwin.ll look correct to you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we're moving towards only the prefix needing to be store. If true then that reenforces the idea of using a macros of some form to help reduce the chances of typos.
/// Return the name of the equivalent of F, vectorized with factor VF. If no | ||
/// such mapping exists, return the empty string. | ||
/// Return the name of the equivalent of F, vectorized with factor VF. | ||
/// If no such mapping exists, return empty strings. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Presumably this change can now be reverted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
/// Return a pointer to a VecDesc object holding all info for scalar to vector | ||
/// mappings in TLI for the equivalent of F, vectorized with factor VF. | ||
/// If no such mapping exists, return nullpointer. | ||
const VecDesc *getMangledTLIVectorName(StringRef F, const ElementCount &VF, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like getVectorMappingInfo
seems more representative of what the functions does?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
@@ -354,6 +389,10 @@ class TargetLibraryInfo { | |||
bool Masked = false) const { | |||
return Impl->getVectorizedFunction(F, VF, Masked); | |||
} | |||
const VecDesc *getMangledTLIVectorName(StringRef F, const ElementCount &VF, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above regarding the function name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
TLI_DEFINE_VECFUNC("floorf", "vfloorf", FIXED(4)) | ||
TLI_DEFINE_VECFUNC("sqrtf", "vsqrtf", FIXED(4)) | ||
TLI_DEFINE_VECFUNC("llvm.sqrt.f32", "vsqrtf", FIXED(4)) | ||
TLI_DEFINE_VECFUNC("ceilf", "vceilf", FIXED(4), "_ZGV_LLVM_N4v_ceilf(vceilf)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think perhaps you've misunderstood the intent of this work and thus my request. I'm not expecting there to be any changes to the value of the vector-function-abi-variant
attribute. I'm only expecting the way the text is generated to be changed so that for follow on patches we can stop tryDemangleForVFABI
from incorrectly looking through the module in order to figure out the element count for scalable vector functions.
If you look at how mangleTLIVectorName
is used you'll see that it's given data that has been extracted from TargetLibraryInfo (via its database of VecDesc objects). The only missing piece of information is the Vector ABI mangling prefix that it instead computes. The request is for this prefix to be store within TargetLibraryInfo as well.
StringRef getScalarFnName() const { return ScalarFnName; } | ||
StringRef getVectorFnName() const { return VectorFnName; } | ||
ElementCount getVectorizationFactor() const { return VectorizationFactor; } | ||
bool getMasked() const { return Masked; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be isMasked
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
StringRef getVectorFnName() const { return VectorFnName; } | ||
ElementCount getVectorizationFactor() const { return VectorizationFactor; } | ||
bool getMasked() const { return Masked; } | ||
StringRef getMangledName() const { return MangledName; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about adding a function like getVectorFunctionABIVariantString
that uses all the fields to construct _ZGV<isa><mask><vlen><vparams>_<scalarname>(<vectorname>)
?
Once this exists then perhaps getMangledName()
becomes redundant and can be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. getMangledName()
is removed.
/// <scalarname> = the name of the scalar function. | ||
/// <vectorname> = the name of the vector function. | ||
class VecDesc { | ||
private: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The private:
can be removed because by default class members are private.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
TLI_DEFINE_VECFUNC("erf", "_simd_erf_d2", FIXED(2), "_ZGV_LLVM_N2v_erf(_simd_erf_d2)") | ||
TLI_DEFINE_VECFUNC("erff", "_simd_erf_f4", FIXED(4), "_ZGV_LLVM_N4v_erff(_simd_erf_f4)") | ||
TLI_DEFINE_VECFUNC("pow", "_simd_pow_d2", FIXED(2), "_ZGV_LLVM_N2v_pow(_simd_pow_d2)") | ||
TLI_DEFINE_VECFUNC("llvm.pow.f64", "_simd_pow_d2", FIXED(2), "_ZGV_LLVM_N2v_llvm.pow.f64(_simd_pow_d2)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we're moving towards only the prefix needing to be store. If true then that reenforces the idea of using a macros of some form to help reduce the chances of typos.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: I'll take a proper look at VecFuncs.def
once I'm happy with the rest of the patch.
; SLEEFGNUABI-SAME: [4 x ptr] [ | ||
; SLEEFGNUABI-SAME: ptr @_ZGVnN2v_sin, | ||
; SLEEFGNUABI-SAME: ptr @_ZGVsMxv_sin, | ||
; SLEEFGNUABI_SAME; ptr @_ZGVnN4v_log10f, | ||
; SLEEFGNUABI-SAME: ptr @_ZGVsMxv_log10f | ||
; ARMPL-SAME: [4 x ptr] [ | ||
; ARMPL-SAME: ptr @armpl_vsinq_f64, | ||
; ARMPL-SAME: ptr @armpl_svsin_f64_x, | ||
; ARMPL-SAME: ptr @armpl_vlog10q_f32, | ||
; ARMPL-SAME: ptr @armpl_svlog10_f32_x |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please can you rebase the pull request to remove the non-relevant changes. I know that might mess up review comment linking but that doesn't matter at this stage.
llvm/lib/Analysis/VectorUtils.cpp
Outdated
std::string | ||
VFABI::getVectorFunctionABIVariantString(const StringRef MangledNamePrefix, | ||
const StringRef ScalarFnName, | ||
const StringRef VectorFnName) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this function would be better as a VecDesc
member function.
@@ -14,7 +14,7 @@ | |||
|
|||
#if defined(TLI_DEFINE_MASSV_VECFUNCS_NAMES) | |||
#define TLI_DEFINE_MASSV_VECFUNCS | |||
#define TLI_DEFINE_VECFUNC(SCAL, VEC, VF) VEC, | |||
#define TLI_DEFINE_VECFUNC(SCAL, VEC, VF, MANGLN) VEC, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that we've settled on just storing the prefix I suggest VABI_PREFIX
as a more accurate name.
e466c0e
to
477a0b3
Compare
Currently the mappings from TLI are used to generate the list of available "scalar to vector" mappings attached to scalar calls as "vector-function-abi-variant" LLVM IR attribute. Function names from TLI are wrapped in mangled name following the pattern: _ZGV<isa><mask><vlen><parameters>_<scalar_name>[(<vector_redirection>)] The problem is the mangled name uses _LLVM_ as the ISA name which prevents the compiler to compute vectorization factor for scalable vectors as it cannot make any decision based on the _LLVM_ ISA. If we use "s" as the ISA name, the compiler can make decisions based on VFABI specification where SVE spacific rules are described. This patch is only a refactoring stage where there is no change to the compiler's behaviour.
the functionality.
and adjusted the implementation.
renamed a macro field and rebased on top of the main branch.
477a0b3
to
1ec6a90
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know we've not heard back about the pow
issue but I think it's just a bug that we can fix in this patch.
TLI_DEFINE_VECFUNC("exp", "_ZGVbN2v_exp", FIXED(2)) | ||
TLI_DEFINE_VECFUNC("exp", "_ZGVdN4v_exp", FIXED(4)) | ||
TLI_DEFINE_VECFUNC("exp", "_ZGVbN2v_exp", FIXED(2), "_ZGV_LLVM_N2v") | ||
TLI_DEFINE_VECFUNC("exp", "_ZGVdN4v_exp", FIXED(4), "_ZGV_LLVM_N2v") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be _ZGV_LLVM_N4v
?
TLI_DEFINE_VECFUNC("pow", "_simd_pow_d2", FIXED(2), "_ZGV_LLVM_N2v") | ||
TLI_DEFINE_VECFUNC("llvm.pow.f64", "_simd_pow_d2", FIXED(2), "_ZGV_LLVM_N2v") | ||
TLI_DEFINE_VECFUNC("powf", "_simd_pow_f4", FIXED(4), "_ZGV_LLVM_N4v") | ||
TLI_DEFINE_VECFUNC("llvm.pow.f32", "_simd_pow_f4", FIXED(4), "_ZGV_LLVM_N4v") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These take two parameters and so show end vv
.
TLI_DEFINE_VECFUNC("atan2", "__atan2d2", FIXED(2), "_ZGV_LLVM_N2v") | ||
TLI_DEFINE_VECFUNC("atan2f", "__atan2f4", FIXED(4), "_ZGV_LLVM_N4v") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These take two parameters and this should end vv
.
TLI_DEFINE_VECFUNC("atan2", "_simd_atan2_d2", FIXED(2), "_ZGV_LLVM_N2v") | ||
TLI_DEFINE_VECFUNC("atan2f", "_simd_atan2_f4", FIXED(4), "_ZGV_LLVM_N4v") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These take two parameters and so show end vv.
Hi @JolantaJensen and others, I'm seeing problems with this patch for my out-of-tree target. For this targert we have a bunch of target intrinsics that we say are vectorizable, but we don't provide a name of the vector version of the intrinsic. This means that before this patch, addMappingsFromTLI wouldn't do much since TLI.getVectorizedFunction(ScalarName, VF, Predicate) returned "" and AddVariantDecl wouldn't add anything. Vectorization then worked as we wanted anyway. With this patch, the vectorFn name guard in AddVariantDecl has apparently been removed and instead Is this change on purpose? We had some similar problems a couple of years ago with EDIT: The commit message says |
Hi @mikaelholmen , |
A guard for empty vector function name was removed in llvm#66656. This patch adds the guard back.
A guard for empty vector function name was removed in #66656. This patch adds the guard back.
…ed names and the call's argument types to construct the VFInfo. Currently the tryDemangleForVFABI is incorrectly making assumptions that when the mangled name has __LLVM__ set as ISA and _vlen_ is set to scalable/VLA the module can be queried to identify the correct VF. This works currently only by chance rather than by design and limits the ability to maximise the use for the Vector ABI. This behaviour is changed by llvm#66656 and llvm#67308 and means demangling can be implemented as expected based solely on the rules as documented within the VFABI.
Currently the mappings from TLI are used to generate the list of available "scalar to vector" mappings attached to scalar calls as "vector-function-abi-variant" LLVM IR attribute. Function names from TLI are wrapped in mangled name following the pattern:
ZGV<scalar_name>[(<vector_redirection>)] The problem is the mangled name uses LLVM as the ISA name which prevents the compiler to compute vectorization factor for scalable vectors as it cannot make any decision based on the LLVM ISA. If we use "s" as the ISA name, the compiler can make decisions based on VFABI specification where SVE spacific rules are described.
This patch is only a refactoring stage where there is no change to the compiler's behaviour.