Skip to content

[clang] Constexpr for __builtin_shufflevector and __builtin_convertvector #76615

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 29, 2024

Conversation

Destroyerrrocket
Copy link
Contributor

Summary:

This patch adds constexpr support for __builtin_shufflevector and __builtin_convertvector.

A small oddity encountered was that the arg to the intrinsics may be an lvalue without any sort of implicit cast of any kind. I solved this through the EvaluateVectorOrLValue function, which treats the lvalue as if it was in an rvalue cast, which gets me the desired vector.

Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be
notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write
permissions for the repository. In which case you can instead tag reviewers by
name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review
by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate
is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Dec 30, 2023
@llvmbot
Copy link
Member

llvmbot commented Dec 30, 2023

@llvm/pr-subscribers-clang

Author: Pol M (Destroyerrrocket)

Changes

Summary:

This patch adds constexpr support for __builtin_shufflevector and __builtin_convertvector.

A small oddity encountered was that the arg to the intrinsics may be an lvalue without any sort of implicit cast of any kind. I solved this through the EvaluateVectorOrLValue function, which treats the lvalue as if it was in an rvalue cast, which gets me the desired vector.


Full diff: https://github.com/llvm/llvm-project/pull/76615.diff

4 Files Affected:

  • (modified) clang/docs/LanguageExtensions.rst (+3-2)
  • (modified) clang/docs/ReleaseNotes.rst (+2)
  • (modified) clang/lib/AST/ExprConstant.cpp (+137-1)
  • (modified) clang/test/Sema/constant-builtins-2.c (+61)
diff --git a/clang/docs/LanguageExtensions.rst b/clang/docs/LanguageExtensions.rst
index 23a7f4f5d5b926..5f06c3d4b86f94 100644
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -2853,7 +2853,7 @@ Query for this feature with ``__has_builtin(__builtin_dump_struct)``
 ``__builtin_shufflevector`` is used to express generic vector
 permutation/shuffle/swizzle operations.  This builtin is also very important
 for the implementation of various target-specific header files like
-``<xmmintrin.h>``.
+``<xmmintrin.h>``. This builtin can be used within constant expressions.
 
 **Syntax**:
 
@@ -2907,7 +2907,8 @@ Query for this feature with ``__has_builtin(__builtin_shufflevector)``.
 
 ``__builtin_convertvector`` is used to express generic vector
 type-conversion operations. The input vector and the output vector
-type must have the same number of elements.
+type must have the same number of elements. This builtin can be used within
+constant expressions.
 
 **Syntax**:
 
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 0c8fec691bf3c9..b6f1407436ed4f 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -237,6 +237,8 @@ Non-comprehensive list of changes in this release
 * Since MSVC 19.33 added undocumented attribute ``[[msvc::constexpr]]``, this release adds the attribute as well.
 
 * Added ``#pragma clang fp reciprocal``.
+* Builtins ``__builtin_shufflevector()`` and ``__builtin_convertvector()`` may now be used within constant
+ expressions.
 
 New Compiler Flags
 ------------------
diff --git a/clang/lib/AST/ExprConstant.cpp b/clang/lib/AST/ExprConstant.cpp
index f6aeee1a4e935d..e8afa10fe7aaac 100644
--- a/clang/lib/AST/ExprConstant.cpp
+++ b/clang/lib/AST/ExprConstant.cpp
@@ -2702,7 +2702,8 @@ static bool checkFloatingPointResult(EvalInfo &Info, const Expr *E,
 static bool HandleFloatToFloatCast(EvalInfo &Info, const Expr *E,
                                    QualType SrcType, QualType DestType,
                                    APFloat &Result) {
-  assert(isa<CastExpr>(E) || isa<CompoundAssignOperator>(E));
+  assert(isa<CastExpr>(E) || isa<CompoundAssignOperator>(E) ||
+         isa<ConvertVectorExpr>(E));
   llvm::RoundingMode RM = getActiveRoundingMode(Info, E);
   APFloat::opStatus St;
   APFloat Value = Result;
@@ -10643,6 +10644,9 @@ namespace {
     bool VisitUnaryImag(const UnaryOperator *E);
     bool VisitBinaryOperator(const BinaryOperator *E);
     bool VisitUnaryOperator(const UnaryOperator *E);
+    bool VisitConvertVectorExpr(const ConvertVectorExpr *E);
+    bool VisitShuffleVectorExpr(const ShuffleVectorExpr *E);
+
     // FIXME: Missing: conditional operator (for GNU
     //                 conditional select), shufflevector, ExtVectorElementExpr
   };
@@ -10895,6 +10899,138 @@ bool VectorExprEvaluator::VisitUnaryOperator(const UnaryOperator *E) {
   return Success(APValue(ResultElements.data(), ResultElements.size()), E);
 }
 
+static bool EvaluateVectorOrLValue(APValue &Result, EvalInfo &Info,
+                                   const Expr *E, const QualType &Type) {
+  if (!Evaluate(Result, Info, E))
+    return false;
+
+  if (Result.isLValue()) {
+    // Source of the data is an lvalue; Manually handle the lvalue as if
+    // it was an rvalue to get the current APValue.
+    LValue LValueFound;
+    LValueFound.setFrom(Info.Ctx, Result);
+    if (!handleLValueToRValueConversion(Info, E, Type, LValueFound, Result)) {
+      return false;
+    }
+  }
+
+  if (!Result.isVector()) {
+    return false;
+  }
+  return true;
+}
+
+static bool handleVectorConversion(EvalInfo &Info, const FPOptions FPO,
+                                   const Expr *E, QualType SourceTy,
+                                   QualType DestTy, APValue const &Original,
+                                   APValue &Result) {
+  if (SourceTy->isIntegerType()) {
+    if (DestTy->isRealFloatingType()) {
+      Result = APValue(APFloat(0.0));
+      return HandleIntToFloatCast(Info, E, FPO, SourceTy, Original.getInt(),
+                                  DestTy, Result.getFloat());
+    }
+    if (DestTy->isIntegerType()) {
+      Result = APValue(
+          HandleIntToIntCast(Info, E, DestTy, SourceTy, Original.getInt()));
+      return true;
+    }
+  } else if (SourceTy->isRealFloatingType()) {
+    if (DestTy->isRealFloatingType()) {
+      Result = Original;
+      return HandleFloatToFloatCast(Info, E, SourceTy, DestTy,
+                                    Result.getFloat());
+    }
+    if (DestTy->isIntegerType()) {
+      Result = APValue(APSInt());
+      return HandleFloatToIntCast(Info, E, SourceTy, Original.getFloat(),
+                                  DestTy, Result.getInt());
+    }
+  }
+  return false;
+}
+
+bool VectorExprEvaluator::VisitConvertVectorExpr(const ConvertVectorExpr *E) {
+  APValue Source;
+  QualType SourceVecType = E->getSrcExpr()->getType();
+  if (!EvaluateVectorOrLValue(Source, Info, E->getSrcExpr(), SourceVecType))
+    return false;
+
+  QualType DestTy = E->getType()->castAs<VectorType>()->getElementType();
+  QualType SourceTy = SourceVecType->castAs<VectorType>()->getElementType();
+
+  const FPOptions FPO = E->getFPFeaturesInEffect(Info.Ctx.getLangOpts());
+
+  SmallVector<APValue, 4> ResultElements;
+  ResultElements.reserve(Source.getVectorLength());
+  for (unsigned EltNum = 0; EltNum < Source.getVectorLength(); ++EltNum) {
+    APValue Elt;
+    if (!handleVectorConversion(Info, FPO, E, SourceTy, DestTy,
+                                Source.getVectorElt(EltNum), Elt))
+      return false;
+    ResultElements.push_back(std::move(Elt));
+  }
+
+  return Success(APValue(ResultElements.data(), ResultElements.size()), E);
+}
+
+static bool handleVectorShuffle(EvalInfo &Info, const ShuffleVectorExpr *E,
+                                QualType ElemType, APValue const &VecVal1,
+                                APValue const &VecVal2, unsigned EltNum,
+                                APValue &Result) {
+  unsigned const TotalElementsInAVector = VecVal1.getVectorLength();
+
+  Expr const *IndexExpr = E->getExpr(2 + EltNum);
+  APSInt IndexVal;
+  if (!EvaluateInteger(IndexExpr, IndexVal, Info)) {
+    return false;
+  }
+
+  uint32_t index = IndexVal.getZExtValue();
+  // The spec says that -1 should be treated as undef for optimizations,
+  // but in constexpr we need to choose a value. We'll choose 0.
+  if (index >= TotalElementsInAVector * 2) {
+    index = 0;
+  }
+
+  if (index >= TotalElementsInAVector) {
+    Result = VecVal2.getVectorElt(index - TotalElementsInAVector);
+  } else {
+    Result = VecVal1.getVectorElt(index);
+  }
+  return true;
+}
+
+bool VectorExprEvaluator::VisitShuffleVectorExpr(const ShuffleVectorExpr *E) {
+  APValue VecVal1;
+  const Expr *Vec1 = E->getExpr(0);
+  if (!EvaluateVectorOrLValue(VecVal1, Info, Vec1, Vec1->getType()))
+    return false;
+  APValue VecVal2;
+  const Expr *Vec2 = E->getExpr(1);
+  if (!EvaluateVectorOrLValue(VecVal2, Info, Vec2, Vec2->getType()))
+    return false;
+
+  VectorType const *DestVecTy = E->getType()->castAs<VectorType>();
+  if (!DestVecTy) {
+    return false;
+  }
+  QualType DestElTy = DestVecTy->getElementType();
+
+  auto TotalElementsInOutputVector = DestVecTy->getNumElements();
+
+  SmallVector<APValue, 4> ResultElements;
+  ResultElements.reserve(TotalElementsInOutputVector);
+  for (unsigned EltNum = 0; EltNum < TotalElementsInOutputVector; ++EltNum) {
+    APValue Elt;
+    if (!handleVectorShuffle(Info, E, DestElTy, VecVal1, VecVal2, EltNum, Elt))
+      return false;
+    ResultElements.push_back(std::move(Elt));
+  }
+
+  return Success(APValue(ResultElements.data(), ResultElements.size()), E);
+}
+
 //===----------------------------------------------------------------------===//
 // Array Evaluation
 //===----------------------------------------------------------------------===//
diff --git a/clang/test/Sema/constant-builtins-2.c b/clang/test/Sema/constant-builtins-2.c
index 2bdd7b06daabfe..c6bb602149e193 100644
--- a/clang/test/Sema/constant-builtins-2.c
+++ b/clang/test/Sema/constant-builtins-2.c
@@ -302,3 +302,64 @@ extern __typeof__(__builtin_expect(0, 0)) bi0;
 // Strings
 int array1[__builtin_strlen("ab\0cd")];
 int array2[(sizeof(array1)/sizeof(int)) == 2? 1 : -1];
+
+typedef double vector4double __attribute__((__vector_size__(32)));
+typedef float vector4float __attribute__((__vector_size__(16)));
+typedef long long vector4long __attribute__((__vector_size__(32)));
+typedef int vector4int __attribute__((__vector_size__(16)));
+typedef short vector4short __attribute__((__vector_size__(8)));
+typedef char vector4char __attribute__((__vector_size__(4)));
+typedef double vector8double __attribute__((__vector_size__(64)));
+typedef float vector8float __attribute__((__vector_size__(32)));
+typedef long long vector8long __attribute__((__vector_size__(64)));
+typedef int vector8int __attribute__((__vector_size__(32)));
+typedef short vector8short __attribute__((__vector_size__(16)));
+typedef char vector8char __attribute__((__vector_size__(8)));
+
+// Convert vector
+#define CHECK_NUM(__size, __typeFrom, __typeTo, ...)                            \
+  vector##__size##__typeTo                                                      \
+      from_##vector##__size##__typeFrom##_to_##vector##__size##__typeTo##_var = \
+          __builtin_convertvector((vector##__size##__typeFrom){__VA_ARGS__},    \
+                                  vector##__size##__typeTo);
+#define CHECK_TO_ALL_TYPES(__size, __typeFrom, ...)                            \
+  CHECK_NUM(__size, __typeFrom, double, __VA_ARGS__)                           \
+  CHECK_NUM(__size, __typeFrom, float, __VA_ARGS__)                            \
+  CHECK_NUM(__size, __typeFrom, long, __VA_ARGS__)                             \
+  CHECK_NUM(__size, __typeFrom, int, __VA_ARGS__)                              \
+  CHECK_NUM(__size, __typeFrom, short, __VA_ARGS__)                            \
+  CHECK_NUM(__size, __typeFrom, char, __VA_ARGS__)
+
+#define CHECK_ALL_COMBINATIONS(__size, ...)                                    \
+  CHECK_TO_ALL_TYPES(__size, double, __VA_ARGS__)                              \
+  CHECK_TO_ALL_TYPES(__size, float, __VA_ARGS__)                               \
+  CHECK_TO_ALL_TYPES(__size, long, __VA_ARGS__)                                \
+  CHECK_TO_ALL_TYPES(__size, int, __VA_ARGS__)                                 \
+  CHECK_TO_ALL_TYPES(__size, short, __VA_ARGS__)                               \
+  CHECK_TO_ALL_TYPES(__size, char, __VA_ARGS__)
+
+CHECK_ALL_COMBINATIONS(4, 0, 1, 2, 3);
+CHECK_ALL_COMBINATIONS(8, 0, 1, 2, 3, 4, 5, 6, 7);
+#undef CHECK_ALL_COMBINATIONS
+#undef CHECK_TO_ALL_TYPES
+#undef CHECK_NUM
+
+// Shuffle vector
+vector4int const vector4intConst1 = {0, 1, 2, 3};
+vector4int const vector4intConst2 = {4, 5, 6, 7};
+vector8int const vector8intConst = {};
+
+vector4int vectorShuffle1 =
+    __builtin_shufflevector(vector4intConst1, vector4intConst2, 0, 1, 2, 3);
+vector4int vectorShuffle2 =
+    __builtin_shufflevector(vector4intConst1, vector4intConst2, 4, 5, 6, 7);
+vector4int vectorShuffle3 =
+    __builtin_shufflevector(vector4intConst1, vector4intConst2, -1, -1, -1, -1);
+vector4int vectorShuffle4 =
+    __builtin_shufflevector(vector4intConst1, vector4intConst2, 0, 2, 4, 6);
+vector8int vectorShuffle5 = __builtin_shufflevector(
+    vector8intConst, vector8intConst, 0, 2, 4, 6, 8, 10, 12, 14);
+vector4int vectorShuffle6 = __builtin_shufflevector(
+    vector8intConst, vector8intConst, 0, 2, 4, 6);
+vector8int vectorShuffle7 =
+    __builtin_shufflevector(vector4intConst1, vector4intConst2, 0, 2, 4, 6, 1, 3, 5, 7);

Copy link
Collaborator

@RKSimon RKSimon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minors but a frontend specialist really needs to review this

if (!Result.isVector()) {
return false;
}
return true;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just use return Result.isVector() ?

// but in constexpr we need to choose a value. We'll choose 0.
if (index >= TotalElementsInAVector * 2) {
index = 0;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any chance we can retain the undef? Could we insert an undef element?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is, I unfortunately don't know how...

APSInt IndexVal;
if (!EvaluateInteger(IndexExpr, IndexVal, Info)) {
return false;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(style) unnecessary braces

VectorType const *DestVecTy = E->getType()->castAs<VectorType>();
if (!DestVecTy) {
return false;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(style) unnecessary braces


SmallVector<APValue, 4> ResultElements;
ResultElements.reserve(Source.getVectorLength());
for (unsigned EltNum = 0; EltNum < Source.getVectorLength(); ++EltNum) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull out repeated Source.getVectorLength() calls?

vector4int vectorShuffle6 = __builtin_shufflevector(
vector8intConst, vector8intConst, 0, 2, 4, 6);
vector8int vectorShuffle7 =
__builtin_shufflevector(vector4intConst1, vector4intConst2, 0, 2, 4, 6, 1, 3, 5, 7);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What could we do to check the result values of the constant expression? Could we bitcast a char4 to a uint value and static assert the result for instance?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right! we can indeed make something like:
constexpr vector4char v = {1,2,3,4};
static_assert(std::bit_cast(v) == 0x4030201, "equal");
I had initially dismissed this as the operator[] is not implemented for vectors. This is endian dependent, is there a particular way I should handle this?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use the same 'LITTLE_END' define approach used in clang\test\SemaCXX\constexpr-builtin-bit-cast.cpp?

How much work will it to be to eventually implement the vector operator[] as a constexpr as well do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I'll copy that over.

Well, in that case I would have to care about lvalues (in my code I just needed the values to return a new copy, so I could ignore all that machinery for now). I'd have to learn how these work more in depth. I don't think they'd be necessarily super complicated, and that would make for an ideal follow-up for me! (looks like this would be part of: #41806, and also this also looks interesting: #30794)

I think this is a pretty interesting set of tasks to work on!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But just for this validation, I'd definitiely take the bitcast approach

@RKSimon RKSimon requested a review from erichkeane January 1, 2024 15:31
@RKSimon
Copy link
Collaborator

RKSimon commented Jan 4, 2024

Please can you rename constat_builtins_vector.cpp -> constant_builtins_vector.cpp

@Destroyerrrocket
Copy link
Contributor Author

(sorry about the typo!)

Copy link
Collaborator

@erichkeane erichkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully someone with more knowledge of the const evaluator (@shafik and @tbaederr have experience lately!) can poke their head in, but a few things from my look over.

// RUN: %clang_cc1 -verify -std=c++2a -fsyntax-only %s
// expected-no-diagnostics

#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we clean up this test? All of the macros doing the 'check' generation make this pretty unreadable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just really didn't want to type out all the combinations. I can expand the macros out, np!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There might be a "best of both worlds" situation here to check in the generated result. Maybe something like an #ifdef GEN_TESTS wrapping the macros, and a second RUN pipeline wrapping clang -DGEN_TESTS -E and FileCheck to validate that the output and contents of this file are in sync?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry @sethp, I don't follow your explanation, I'm not familiar enough with the testing infrastructure in Clang. If you think that will make the code look cleaner while still not having to spell out all convinations, I'm all for it!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you could provide a toy example, I can do that for this test

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suggestion was not so much a way to avoid spelling them all out as to trick the preprocessor into doing (a lot of) the work. I'll try to put together a toy example in the next couple of days, but if you run clang -E on the existing file and commit the resulting spellings you'll get much of the way to what I was thinking.

Would that address your concern, @erichkeane?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I got a quick demo of what I was thinking put together. For your consideration: 17c78a2

It did & didn't work out quite the way I'd hoped: I'd forgotten just how, uh, peculiar the preprocessor is, especially if you're trying to get it to generate human-readable output. There might be a way to do it nicely (adding clang-format into the pipeline?), or it might not be an idea worth pursuing at all. I'll leave the final determination up to you & @erichkeane .

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, i don't think that is worth doing. I'm still disturbed by how difficult it is to figure out exactly what/how we are testing things, but perhaps just some detailed comments on the macros are sufficient.

@@ -10895,6 +10899,132 @@ bool VectorExprEvaluator::VisitUnaryOperator(const UnaryOperator *E) {
return Success(APValue(ResultElements.data(), ResultElements.size()), E);
}

static bool EvaluateVectorOrLValue(APValue &Result, EvalInfo &Info,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This interface to this function seems really odd for the name? Is the name inaccurate, or is this using 'false' to mean 'failure' AND 'not a vector'?

It seems to me that this needs to be something like "EvaluateLValueVector" or something?

Copy link
Contributor Author

@Destroyerrrocket Destroyerrrocket Jan 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the name inaccurate, or is this using 'false' to mean 'failure' AND 'not a vector'?

Indeed, false is being treated as a failure and not a vector.

It seems to me that this needs to be something like "EvaluateLValueVector" or something?

I'm not entirely sure if the way I implemented the handling of an lvalue as a parameter is even the correct way to go with, so it might very well be that the reason it sticks out is because I'm doing something wrong. I guess I'll have to wait for someone who has dealt with lvalues in constexpr to tell me where I went wrong

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've dealt with lvalues in the interpreter, but maybe only a little more than you have. I've found it more than a little confusing, especially because interpreter seemingly uses "RValue" as a synonym for "actual value" (as you're using it here), and "LValue" as "opaque blob of bytes" (with metadata, and accessors, but enough wrinkles that it's just easier to convert to an RValue to work with the actual data).

That said, I think there's two ways to go here to meet the need instead of this function:

  1. Teach the parser that these builtins ought to take their arguments as rvalues, so that it will insert a LValueToRValue cast in the expression tree wrapping the src value, and so you can use EvaluateVector (probably loosening its assertion to allow xvalues too). This probably would have downstream consequences, though, and I lack the expertise to know whether they'd be good or bad. But it might be interesting!
  2. Use, as much of the interpreter seems to, EvaluateAsRValue as shorthand for "do an in-place 'dereference' of the lvalue": the main difference I see from EvaluateVectorOrLValue here is that the latter checks for an impossible condition (SemaChecking.cpp already prevents construction of any ConvertVectorExpr or ShuffleVectorExpr that doesn't have vector-typed arguments in the appropriate slots).

For the latter path, it might be wise to add a test case to exercise the checker, too; perhaps something like this for __builtin_convertvector?

static_assert([]{
    __builtin_convertvector(0, vector8char); // expected-error {{must be a vector}}
    __builtin_convertvector((vector8char){}, int); // expected-error {{must be of vector type}}
    return 1;
}());

(and similar for __builtin_shufflevector)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that these errors should be covered by convertvector.c, I'll add one of the missing diagnostics you pointed out. One already exists

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Went with the EvaluateAsRValue

@RKSimon RKSimon requested review from shafik and tbaederr January 9, 2024 11:20
@@ -2702,7 +2702,8 @@ static bool checkFloatingPointResult(EvalInfo &Info, const Expr *E,
static bool HandleFloatToFloatCast(EvalInfo &Info, const Expr *E,
QualType SrcType, QualType DestType,
APFloat &Result) {
assert(isa<CastExpr>(E) || isa<CompoundAssignOperator>(E));
assert(isa<CastExpr>(E) || isa<CompoundAssignOperator>(E) ||
isa<ConvertVectorExpr>(E));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(style) assertion message

Copy link
Contributor

@sethp sethp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @Destroyerrrocket, thanks for your work getting vectors into constexpr-land! I'm also curious about that, so I did my best to give you a helpful review.

The only part I'd be uncomfortable about merging is the handling of -1 sentinels / indeterminate values: everything else I expect you to take as no more than a gentle suggestion.

Comment on lines 10650 to 10651
// FIXME: Missing: conditional operator (for GNU
// conditional select), shufflevector, ExtVectorElementExpr
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// FIXME: Missing: conditional operator (for GNU
// conditional select), shufflevector, ExtVectorElementExpr
// FIXME: Missing: conditional operator (for GNU
// conditional select), ExtVectorElementExpr

@@ -10895,6 +10899,132 @@ bool VectorExprEvaluator::VisitUnaryOperator(const UnaryOperator *E) {
return Success(APValue(ResultElements.data(), ResultElements.size()), E);
}

static bool EvaluateVectorOrLValue(APValue &Result, EvalInfo &Info,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've dealt with lvalues in the interpreter, but maybe only a little more than you have. I've found it more than a little confusing, especially because interpreter seemingly uses "RValue" as a synonym for "actual value" (as you're using it here), and "LValue" as "opaque blob of bytes" (with metadata, and accessors, but enough wrinkles that it's just easier to convert to an RValue to work with the actual data).

That said, I think there's two ways to go here to meet the need instead of this function:

  1. Teach the parser that these builtins ought to take their arguments as rvalues, so that it will insert a LValueToRValue cast in the expression tree wrapping the src value, and so you can use EvaluateVector (probably loosening its assertion to allow xvalues too). This probably would have downstream consequences, though, and I lack the expertise to know whether they'd be good or bad. But it might be interesting!
  2. Use, as much of the interpreter seems to, EvaluateAsRValue as shorthand for "do an in-place 'dereference' of the lvalue": the main difference I see from EvaluateVectorOrLValue here is that the latter checks for an impossible condition (SemaChecking.cpp already prevents construction of any ConvertVectorExpr or ShuffleVectorExpr that doesn't have vector-typed arguments in the appropriate slots).

For the latter path, it might be wise to add a test case to exercise the checker, too; perhaps something like this for __builtin_convertvector?

static_assert([]{
    __builtin_convertvector(0, vector8char); // expected-error {{must be a vector}}
    __builtin_convertvector((vector8char){}, int); // expected-error {{must be of vector type}}
    return 1;
}());

(and similar for __builtin_shufflevector)

DestTy, Result.getInt());
}
}
return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be nice to include a diagnostic here: maybe something to the effect of constexpr vector conversion involving type %s not yet supported?

That said, I'm not sure what types exist that would be valid vector elements but fall through to this case (fixed point? complex?).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I certainly don't know about any such case...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems like the only one missing is _BitInt(8) or bigger. I'll just add support for that

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the bit-precise support "just worked"? At least, I see a test for it, but not any changes in this function. That's neat!

I'm still leaning towards adding a diagnostic here, though: I'm imagining a future standard or compiler version adding a new vector-capable type (_Complex, maybe? One of the fixed-point types?), and callers of __builtin_convertvector(...) with an that type will just see some flavor of "must be initialized by a constant expression," despite the docs' promise that "This builtin can be used within constant expressions."

I'm not sure how to test it—maybe using an incomplete type?—but I'm also pretty comfortable with an un-exercised Info.FFDiag(...) call here too. That said, this certainly does work with everything I can think to throw at it today, though, so I'm open to calling it a future follow-up idea too. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that adding a diagnostic here, even if unreachable, is more than just desired, I'm well convinced on the future-proofing argument. I will not add a test because I simply don't have a way to reach this, I believe.

QualType ElemType, APValue const &VecVal1,
APValue const &VecVal2, unsigned EltNum,
APValue &Result) {
unsigned const TotalElementsInAVector = VecVal1.getVectorLength();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit of a nit, but this seems odd to me; should it be called TotalElementsInFirstArg?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't both vectors the same size? This code assumes that based on the spec. If They are not the same size, I need to revisit this, but otherwise I think the name is just fine. Maybe I'll rename it to TotalElementsInInputVector so it is a bit more descriptive

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't think the vectors had to be the same length: both clang and gcc document the extension to require that the vectors have "a compatible element type" (gcc) or "are vectors that have the same element type" (clang).

However, while gcc does accept a shuffle of e.g. an 8-element and 4-element vector (for a 12-element "input"), clang seems to reject it: https://godbolt.org/z/3ooszMKbc

So maybe the thing to do here is add an assert(VecVal1.getVectorLength() == VecVal2.getVectorLength())? And/or, change the documentation to make it clear it's "same element type and length"?

An alternative to consider would be to plumb both the constexpr and codegen pieces to accept a wider range of vectors for arguments 1 and 2, but I'm assuming the current lowering is just using LLVM's shufflevector, which itself only supports homogeneous arguments. To me, that smells like something best addressed as its own unit of work and far outside the scope of this PR, but I wanted to include it for completeness.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see! It seems like Gcc has extra characteristics then. given that at some point it could happen that such a thing needs to be supported, let's actually rewrite the code to support both sizes. Beware, I won't be able to test it, as Sema will prevent it from reaching constexpr.

// RUN: %clang_cc1 -verify -std=c++2a -fsyntax-only %s
// expected-no-diagnostics

#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There might be a "best of both worlds" situation here to check in the generated result. Maybe something like an #ifdef GEN_TESTS wrapping the macros, and a second RUN pipeline wrapping clang -DGEN_TESTS -E and FileCheck to validate that the output and contents of this file are in sync?

Comment on lines 10999 to 11069
APValue VecVal1;
const Expr *Vec1 = E->getExpr(0);
if (!EvaluateVectorOrLValue(VecVal1, Info, Vec1, Vec1->getType()))
return false;
APValue VecVal2;
const Expr *Vec2 = E->getExpr(1);
if (!EvaluateVectorOrLValue(VecVal2, Info, Vec2, Vec2->getType()))
return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite sure how to do it, it seems like it'd be handy if we could finesse the evaluator here into producing a single APValue that's the concatenation of the vectors produced by the first two sub-expressions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I 100% Agree! I'll look if concatenating the vectors can be done in an efficient manner!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could not find a way that would not require copying (the move operator is just a copy in this case) about 600 bytes of data. I think I'll just leave it as is, as just making a new APValue to insert values to does not seem to solve much

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, too bad: thanks for looking into it and reporting back!

Comment on lines 10985 to 10989
uint32_t index = IndexVal.getZExtValue();
// The spec says that -1 should be treated as undef for optimizations,
// but in constexpr we need to choose a value. We'll choose 0.
if (index >= TotalElementsInAVector * 2)
index = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this quite implements __builtin_shufflevector; consider:

typedef char vector4char __attribute__((__vector_size__(4)));
typedef char vector8char __attribute__((__vector_size__(8)));

__builtin_shufflevector((vector4char){}, (vector8char){}, 10);

That should produce a single-element char vector whose element has value 0: https://godbolt.org/z/9655x1oc6 (though it looks like there's another issue elsewhere in Sema that's not currently accepting that).

Also, the >= suggests to me that it'll also produce values for out-of-bounds indexes. Happily, this already aborts (with a useful diagnostic):

__builtin_shufflevector((vector8char){}, (vector8char){}, 20);

So, for that part it's probably enough here to clarify with an assert that index is less the sum of the vectors' lengths (or the -1 sentinel).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I think just picking a value here isn't quite right. Consider:

constexpr auto f() {
    return __builtin_shufflevector((vector8char){7}, (vector8char){}, -1);
}

This implementation will always produce a value of 7 for f()[0] when called in a constant context, right? But it's some flavor of U{nspecified,ndefined}B, so the value produced at runtime is unknown. As I understand it the evaluator is required to bail out on UB, to prevent exactly that kind of mismatch: it's preferable for the interpreter to say "I can't do anything with this" than to do something that would only match some of the time.

So, as I understand it, there's two options here:

  1. return false with some diagnostic saying "no -1 indexes in a constant context", or
  2. use APValue::Indeterminate (or APValue::None?) for this slot in the vector (oh, I guess that's why there's an APValue per element).

The latter would allow an expression like:

static_assert(__builtin_shufflevector((vector8char){7}, (vector8char){}, 0, -1)[0] == 7);

but not

// should fail to produce a constant value
static_assert(__builtin_shufflevector((vector8char){7}, (vector8char){}, 0, -1)[1]); 

(interestingly, neither gcc nor clang works completely as I'd expect here; clang doesn't yet support constant-time [..] vector element accesses, and gcc actually allows the second construction to produce 0, which it probably shouldn't).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:6:12: error: first two arguments to '__builtin_shufflevector' must have the same type 6 | return __builtin_shufflevector((vector4char){}, (vector8char){}, 10)[0]; | ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1 error generated.

Latest trunk seems to also think that both input vectors should match in size. I'll add an llvm_unreachable for the case where it is out of bounds

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

APValue::Indeterminate seems to be exactly what I was looking for! I'll read up on this, thank you. I assumed that such a thing did not exist, but I'm glad to be proven wrong

Copy link
Contributor Author

@Destroyerrrocket Destroyerrrocket Mar 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Operator[] needs lvalue support for vectors, I think I commented with @RKSimon that it would be a really useful thing for validations. For now, I'm bit casting to get around this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bad news... seems like APValue::Indeterminate and None are both intentionally disallowed. I'll go with the second option then, make the user choose what element they want to use as a source in constexpr. They can always if consteval

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re Operator[] there's already a PR for doing that in constant contexts, I think: #72607 . Definitely agree it'd be helpful for validations!

re: APValue::Indeterminate, I think you're talking about how top-level indeterminate values get rejected by clang as not being a constant expression, right? Dang, that both makes sense and feels unfortunate: thanks for giving it a try, at least!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, let's hope operator[] makes it.

The error comes from somewhere in the depths of ExprConst, it intentionally tells you that undefined values are indeed illegal in constexpr code. I agree with you, a shame it makes sense!

@Destroyerrrocket
Copy link
Contributor Author

@sethp This is an amazing review! I won't be able to work on this for 2 weeks, but thank you a lot for your work!

@RKSimon
Copy link
Collaborator

RKSimon commented Mar 21, 2024

@Destroyerrrocket reverse-ping - are you still working on this?

@Destroyerrrocket
Copy link
Contributor Author

@RKSimon Hi, Sorry I have not reached out! Yes, I plan to do some work on this in the next two weeks; on a normal work week I end up way too tired to work more in the weekend, if you need this and want to take over just let me know, I don't want to stall anyone!

@Destroyerrrocket
Copy link
Contributor Author

Whenever you have a change, I think this is ready for a second pass. Sorry for the delay! ^-^ If I missed something, do let me know

@Destroyerrrocket Destroyerrrocket force-pushed the users/destroyerrocket/46593 branch from 9ff1269 to 25f3a1a Compare March 31, 2024 16:33
Copy link
Collaborator

@erichkeane erichkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would love it if @sethp can do another review, but only 1 nit from me.

@@ -10318,6 +10318,8 @@ def err_shufflevector_nonconstant_argument : Error<
def err_shufflevector_argument_too_large : Error<
"index for __builtin_shufflevector must be less than the total number "
"of vector elements">;
def err_shufflevector_minus_one_is_undefined_behavior_constexpr : Error<
"index for __builtin_shufflevector must be within the bounds of the input vectors in a constexpr context. An index of -1 at position %0 was found. -1 is only allowed at runtime.">;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"index for __builtin_shufflevector must be within the bounds of the input vectors in a constexpr context. An index of -1 at position %0 was found. -1 is only allowed at runtime.">;
"index for __builtin_shufflevector not within the bounds of the input vectors; index of -1 found at position %0 not permitted in a constexpr context">;

Leave the 'only allowed at runtime' for the documentation (and document it?).

Copy link
Contributor

@sethp sethp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The __builtin_shufflevector impl LGTM (one tiny note below), though I still owe @Destroyerrrocket a worked example on the test side.

I have one suggestion for __builtin_convertvector, but overall that LGTM too: I think we're really close to merging this!

Comment on lines 11034 to 11035
// The spec says that -1 should be treated as undef for optimizations,
// but in constexpr we need to choose a value. We'll choose 0.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a tad out of date now. Maybe:

Suggested change
// The spec says that -1 should be treated as undef for optimizations,
// but in constexpr we need to choose a value. We'll choose 0.
// The spec says that -1 should be treated as undef for optimizations,
// but in constexpr we'd have to produce an APValue::Indeterminate,
// which is prohibited from being a top-level constant value. Emit a
// diagnostic instead.

?

DestTy, Result.getInt());
}
}
return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the bit-precise support "just worked"? At least, I see a test for it, but not any changes in this function. That's neat!

I'm still leaning towards adding a diagnostic here, though: I'm imagining a future standard or compiler version adding a new vector-capable type (_Complex, maybe? One of the fixed-point types?), and callers of __builtin_convertvector(...) with an that type will just see some flavor of "must be initialized by a constant expression," despite the docs' promise that "This builtin can be used within constant expressions."

I'm not sure how to test it—maybe using an incomplete type?—but I'm also pretty comfortable with an un-exercised Info.FFDiag(...) call here too. That said, this certainly does work with everything I can think to throw at it today, though, so I'm open to calling it a future follow-up idea too. What do you think?

Comment on lines 11063 to 11072
VectorType const *DestVecTy = E->getType()->castAs<VectorType>();
if (!DestVecTy)
return false;

QualType DestElTy = DestVecTy->getElementType();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I missed this before: castAs isn't ever supposed to produce a null pointer; I think it's roughly the llvm not-RTTI spelling of static_cast (if I'm using any of those words right), whereas dyn_cast (which Type wraps as getAs) is more akin to a fail-able dynamic_cast.

So this ought to be equivalent to:

Suggested change
VectorType const *DestVecTy = E->getType()->castAs<VectorType>();
if (!DestVecTy)
return false;
QualType DestElTy = DestVecTy->getElementType();
VectorType const *DestVecTy = E->getType()->castAs<VectorType>();
QualType DestElTy = DestVecTy->getElementType();

But I'll leave it up to you whether that's a change worth making here. Mostly this is me trying to flex my "do I remember what the difference between Type::getAs and Type::castAs is" muscle, and it's offered to you in case you're interested in the same exercise 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's good to know indeed :)

@RKSimon
Copy link
Collaborator

RKSimon commented Apr 15, 2024

Other than fixing the ReleaseNotes.rst conflict is there anything outstanding on this now?

@Destroyerrrocket
Copy link
Contributor Author

Destroyerrrocket commented Apr 15, 2024 via email

@sethp
Copy link
Contributor

sethp commented Apr 15, 2024

That sounds right to me; the tasks I see are:

  • Expanding the test cases for __builtin_convertvector
  • Rewording the diagnostic for -1 in __builtin_shufflevector
  • Rewording the comment for -1 in __builtin_shufflevector
  • A stylistic change to __builtin_convertvector
  • Considering adding a diagnostic for __builtin_convertvector in case we "fall through" to the bottom (which we think is currently impossible)
  • ReleaseNotes.rst

Of those, I think the only one that isn't pretty much purely mechanical is whether or not we add the extra diagnostic.

Let me know if I missed anything, or if there's anything I can do to help get this one over the finish line: thanks so much @Destroyerrrocket for all your work already!

@Destroyerrrocket Destroyerrrocket force-pushed the users/destroyerrocket/46593 branch from d0220be to 8d41d93 Compare April 20, 2024 10:27
@Destroyerrrocket
Copy link
Contributor Author

Destroyerrrocket commented Apr 20, 2024

Addressed the comments; I don't really like the expanded macro code, it generates 600 lines of really repetitive code, so I left the macros as a source of truth that I'd use to generate the cases in the future.

If there is anything else that needs changing, do let me know!

Copy link
Contributor

@sethp sethp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Barring any objections, I'll plan to hit the squash & merge button on this one come Monday (29 April) morning.

Thank you @Destroyerrrocket for your thoroughness and patience! I enjoyed learning my way through the details with you 😄

@Destroyerrrocket
Copy link
Contributor Author

Destroyerrrocket commented Apr 24, 2024 via email

@philnik777
Copy link
Contributor

I just wanted to say thanks for working on this. I'd love to enable the vectorization branches in libc++ during constant evaluation. It's always a great way to have confidence in the correctness of the code, and this brings us a step closer to having fewer __libcpp_is_constant_evaluated() in the code base.

@sethp
Copy link
Contributor

sethp commented Apr 29, 2024

Well, good news and bad news: I succeeded in trying to save one more cycle by asking you to fix the conflict in ReleaseNotes.rst, and these changes are in the llvm-project's main branch: 2903df0 (sorry for mucking up the title, that's one for me to grow on).

The bad news is that github did not consider this an "indirect merge" as it would have if I'd authored a merge commit resolving the conflict. So we have a couple options that I see:

  • Mark the PR as "closed" instead of "merged" and celebrate anyway
  • @Destroyerrrocket you could try force-pushing your branch to that commit (git push --force-with-lease https://github.com/Destroyerrrocket/llvm-project.git 2903df02fb3c057849aaa796a91289b01950a5f0:refs/heads/Destroyerrrocket-users/destroyerrocket/46593 ought to do it), which github will hopefully recognize as associating the two properly
  • or, some further git shenanigans like
    git reset --hard 2903df02fb3c057849aaa796a91289b01950a5f0
    git commit --allow-empty -m 'Mark PR#76615 as integrated'
    git push --force-with-lease
    
    and then we hit the "squash and merge" button on that.

What do y'all think?

@Destroyerrrocket Destroyerrrocket force-pushed the users/destroyerrocket/46593 branch from 8d41d93 to 04e8a17 Compare April 29, 2024 17:01
@Destroyerrrocket
Copy link
Contributor Author

Done!

@sethp sethp merged commit b83e65d into llvm:main Apr 29, 2024
4 checks passed
kamaub referenced this pull request Apr 29, 2024
commit 8d41d93
Author: Pol Marcet Sardà <[email protected]>
Date:   Sat Apr 20 12:19:49 2024 +0200

    Address some misc comments; added a diagnostic and expanded macros in
    testing.

commit 9493c0f
Author: Pol Marcet Sardà <[email protected]>
Date:   Sun Mar 31 18:18:45 2024 +0200

    Following the review of sethp, I have made the following changes:

    -- Added diagnostic for the undefined shuffle of -1
    -- Validated support for _BitInt
    -- A bunch of other minnor tweaks here and there

commit 8273abc
Author: Pol Marcet Sardà <[email protected]>
Date:   Thu Jan 4 12:31:08 2024 +0100

    Fix typo in file name

commit ff68f23
Author: Pol Marcet Sardà <[email protected]>
Date:   Thu Jan 4 11:26:08 2024 +0100

    Address suggestions from RKSimon

commit c14783d
Author: Pol Marcet Sardà <[email protected]>
Date:   Sat Dec 30 13:59:00 2023 +0100

    [clang] Constexpr for __builtin_shufflevector and __builtin_convertvector

    Summary:

    This patch adds constexpr support for __builtin_shufflevector
    and __builtin_convertvector.

    A small oddity encountered was that the arg to the intrinsics may be an
    lvalue without any sort of implicit cast of any kind. I solved this
    through the EvaluateVectorOrLValue function, which treats the lvalue as
    if it was in an rvalue cast, which gets me the desired vector.

Co-Authored-By: Seth Pellegrino <[email protected]>
@kamaub
Copy link
Contributor

kamaub commented Apr 29, 2024

This changeset causes failures on the clang-ppc64be-linux-multistage bot with build 19869 and clang-ppc64be-linux-test-suite bot with build 22932. It noticeably passes clang-ppc64le-linux-multistage on build 41191, they are big and little endian versions of each other.
Please investigate and supply a fix or revert your patch as soon as possible, thank you.

Please let me know if I can assist with this. This appears to be the correct PR for 2903df0 and so I have posted there as well.

@Destroyerrrocket
Copy link
Contributor Author

(I think) I know what the issue is; The test has a macro to generate the little/big endian constants. But during review we expanded them because that helped readability. The generating macros are still there, I understand that this is kinda urgent so if someone has 10 minutes to remove the expanded macro code and uncomment the macro usages, the tests should pass again in big endian mode.
Beware that at the bottom there are a few extra manual tests. If no one can right now, I think it's easier if we revert and merge again at our own pace :)

@sethp
Copy link
Contributor

sethp commented Apr 29, 2024

Ah, that sounds right: I'll tweak the tests and make sure the test case runs on both big & little-endian systems.

@Destroyerrrocket
Copy link
Contributor Author

Thank you @sethp ! I've marked the lines that need changing in the commit

@sethp
Copy link
Contributor

sethp commented Apr 29, 2024

Oh, I did it slightly differently: I just backed out the "last step" of the macro expansion by hand (1 ? turns out not to be a pattern that shows up too much on purpose).

That's in 347a02b ; thanks @kamaub !

@Destroyerrrocket
Copy link
Contributor Author

Great, works for me!

nhasabni pushed a commit to nhasabni/llvm-project that referenced this pull request May 9, 2024
…ctor (llvm#76615)

This patch adds constexpr support for __builtin_shufflevector and
__builtin_convertvector.

NB: the changes went in under 2903df0 , this commit is just github PR bookkeepping.
nhasabni pushed a commit to nhasabni/llvm-project that referenced this pull request May 9, 2024
…ctor (llvm#76615)

This patch adds constexpr support for __builtin_shufflevector and
__builtin_convertvector.

NB: the changes went in under 2903df0 , this commit is just github PR bookkeepping.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants