-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[HLSL] Apply NoRecurse attrib to all HLSL functions #105907
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Previously, functions named "main" got the NoRecurse attribute consistent with the behavior of C++, which HLSL largely follows. However, standard recursion is not allowed in HLSL, so all functions should really have this attribute. This doesn't prevent recursion, but rather signals that these functions aren't expected to recurse. Practically, this was done so that entry point functions named "main" would have all have the same attributes as otherwise identical entry points with other names. This required small changes to the this assignment tests because they no longer generate so many attribute sets since more of them match. related to llvm#105244 but done to simplify testing for llvm#89806
@llvm/pr-subscribers-clang-codegen @llvm/pr-subscribers-hlsl Author: Greg Roth (pow2clk) ChangesPreviously, functions named "main" got the NoRecurse attribute consistent with the behavior of C++, which HLSL largely follows. However, standard recursion is not allowed in HLSL, so all functions should really have this attribute. This doesn't prevent recursion, but rather signals that these functions aren't expected to recurse. Practically, this was done so that entry point functions named "main" would have all have the same attributes as otherwise identical entry points with other names. This required small changes to the this assignment tests because they no longer generate so many attribute sets since more of them match. related to #105244 Full diff: https://github.com/llvm/llvm-project/pull/105907.diff 4 Files Affected:
diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp
index c89eaa0f4e3bfc..a5747283e98058 100644
--- a/clang/lib/CodeGen/CodeGenFunction.cpp
+++ b/clang/lib/CodeGen/CodeGenFunction.cpp
@@ -1064,13 +1064,17 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy,
// OpenCL C 2.0 v2.2-11 s6.9.i:
// Recursion is not supported.
//
+ // HLSL
+ // Recursion is not supported.
+ //
// SYCL v1.2.1 s3.10:
// kernels cannot include RTTI information, exception classes,
// recursive code, virtual functions or make use of C++ libraries that
// are not compiled for the device.
- if (FD && ((getLangOpts().CPlusPlus && FD->isMain()) ||
- getLangOpts().OpenCL || getLangOpts().SYCLIsDevice ||
- (getLangOpts().CUDA && FD->hasAttr<CUDAGlobalAttr>())))
+ if (FD &&
+ ((getLangOpts().CPlusPlus && FD->isMain()) || getLangOpts().OpenCL ||
+ getLangOpts().HLSL || getLangOpts().SYCLIsDevice ||
+ (getLangOpts().CUDA && FD->hasAttr<CUDAGlobalAttr>())))
Fn->addFnAttr(llvm::Attribute::NoRecurse);
llvm::RoundingMode RM = getLangOpts().getDefaultRoundingMode();
diff --git a/clang/test/CodeGenHLSL/implicit-norecurse-attrib.hlsl b/clang/test/CodeGenHLSL/implicit-norecurse-attrib.hlsl
new file mode 100644
index 00000000000000..ae3a3b5f90199f
--- /dev/null
+++ b/clang/test/CodeGenHLSL/implicit-norecurse-attrib.hlsl
@@ -0,0 +1,93 @@
+// RUN: %clang_cc1 -x hlsl -triple dxil-pc-shadermodel6.3-library -finclude-default-header %s -emit-llvm -disable-llvm-passes -o - | FileCheck %s
+// RUN: %clang_cc1 -x hlsl -triple dxil-pc-shadermodel6.0-compute -finclude-default-header %s -emit-llvm -disable-llvm-passes -o - | FileCheck %s
+
+// Verify that a few different function types all get the NoRecurse attribute
+
+#define MAX 100
+
+struct Node {
+ uint value;
+ uint key;
+ uint left, right;
+};
+
+// CHECK: Function Attrs:{{.*}}norecurse
+// CHECK: define noundef i32 @"?Find@@YAIY0GE@UNode@@I@Z"(ptr noundef byval([100 x %struct.Node]) align 4 %SortedTree, i32 noundef %key) [[IntAttr:\#[0-9]+]]
+// CHECK: ret i32
+// Find and return value corresponding to key in the SortedTree
+uint Find(Node SortedTree[MAX], uint key) {
+ uint nix = 0; // head
+ while(true) {
+ if (nix < 0)
+ return 0.0; // Not found
+ Node n = SortedTree[nix];
+ if (n.key == key)
+ return n.value;
+ if (key < n.key)
+ nix = n.left;
+ else
+ nix = n.right;
+ }
+}
+
+// CHECK: Function Attrs:{{.*}}norecurse
+// CHECK: define noundef i1 @"?InitTree@@YA_NY0GE@UNode@@V?$RWBuffer@T?$__vector@I$03@__clang@@@hlsl@@I@Z"(ptr noundef byval([100 x %struct.Node]) align 4 %tree, ptr noundef byval(%"class.hlsl::RWBuffer") align 4 %encodedTree, i32 noundef %maxDepth) [[ExtAttr:\#[0-9]+]]
+// CHECK: ret i1
+// Initialize tree with given buffer
+// Imagine the inout works
+export
+bool InitTree(/*inout*/ Node tree[MAX], RWBuffer<uint4> encodedTree, uint maxDepth) {
+ uint size = pow(2.f, maxDepth) - 1;
+ if (size > MAX) return false;
+ for (uint i = 1; i < size; i++) {
+ tree[i].value = encodedTree[i].x;
+ tree[i].key = encodedTree[i].y;
+ tree[i].left = encodedTree[i].z;
+ tree[i].right = encodedTree[i].w;
+ }
+ return true;
+}
+
+RWBuffer<uint4> gTree;
+
+// Mangled entry points are internal
+// CHECK: Function Attrs:{{.*}}norecurse
+// CHECK: define internal void @"?main@@YAXI@Z"(i32 noundef %GI) [[IntAttr]]
+// CHECK: ret void
+
+// Canonical entry points are external and shader attributed
+// CHECK: Function Attrs:{{.*}}norecurse
+// CHECK: define void @main() [[EntryAttr:\#[0-9]+]]
+// CHECK: ret void
+
+[numthreads(1,1,1)]
+[shader("compute")]
+void main(uint GI : SV_GroupIndex) {
+ Node haystack[MAX];
+ uint needle = 0;
+ if (InitTree(haystack, gTree, GI))
+ needle = Find(haystack, needle);
+}
+
+// Mangled entry points are internal
+// CHECK: Function Attrs:{{.*}}norecurse
+// CHECK: define internal void @"?defaultMain@@YAXXZ"() [[IntAttr]]
+// CHECK: ret void
+
+// Canonical entry points are external and shader attributed
+// CHECK: Function Attrs:{{.*}}norecurse
+// CHECK: define void @defaultMain() [[EntryAttr]]
+// CHECK: ret void
+
+[numthreads(1,1,1)]
+[shader("compute")]
+void defaultMain() {
+ Node haystack[MAX];
+ uint needle = 0;
+ if (InitTree(haystack, gTree, 4))
+ needle = Find(haystack, needle);
+}
+
+// CHECK: attributes [[IntAttr]] = {{.*}} norecurse
+// CHECK: attributes [[ExtAttr]] = {{.*}} norecurse
+// CHECK: attributes [[EntryAttr]] = {{.*}} norecurse
diff --git a/clang/test/CodeGenHLSL/this-assignment-overload.hlsl b/clang/test/CodeGenHLSL/this-assignment-overload.hlsl
index 0c4905e0f45980..f0affcb69a3fcd 100644
--- a/clang/test/CodeGenHLSL/this-assignment-overload.hlsl
+++ b/clang/test/CodeGenHLSL/this-assignment-overload.hlsl
@@ -25,7 +25,7 @@ void main() {
}
// This test makes a probably safe assumption that HLSL 202x includes operator overloading for assignment operators.
-// CHECK: define linkonce_odr noundef i32 @"?getFirst@Pair@@QAAHXZ"(ptr noundef nonnull align 4 dereferenceable(8) %this) #2 align 2 {
+// CHECK: define linkonce_odr noundef i32 @"?getFirst@Pair@@QAAHXZ"(ptr noundef nonnull align 4 dereferenceable(8) %this) #0 align 2 {
// CHECK-NEXT:entry:
// CHECK-NEXT:%this.addr = alloca ptr, align 4
// CHECK-NEXT:%Another = alloca %struct.Pair, align 4
@@ -42,7 +42,7 @@ void main() {
// CHECK-NEXT:%0 = load i32, ptr %First2, align 4
// CHECK-NEXT:ret i32 %0
-// CHECK: define linkonce_odr noundef i32 @"?getSecond@Pair@@QAAHXZ"(ptr noundef nonnull align 4 dereferenceable(8) %this) #2 align 2 {
+// CHECK: define linkonce_odr noundef i32 @"?getSecond@Pair@@QAAHXZ"(ptr noundef nonnull align 4 dereferenceable(8) %this) #0 align 2 {
// CHECK-NEXT:entry:
// CHECK-NEXT:%this.addr = alloca ptr, align 4
// CHECK-NEXT:%agg.tmp = alloca %struct.Pair, align 4
diff --git a/clang/test/CodeGenHLSL/this-assignment.hlsl b/clang/test/CodeGenHLSL/this-assignment.hlsl
index 6916afcde40546..5c8de0a18ef7ca 100644
--- a/clang/test/CodeGenHLSL/this-assignment.hlsl
+++ b/clang/test/CodeGenHLSL/this-assignment.hlsl
@@ -24,7 +24,7 @@ void main() {
}
// This tests reference like implicit this in HLSL
-// CHECK: define linkonce_odr noundef i32 @"?getFirst@Pair@@QAAHXZ"(ptr noundef nonnull align 4 dereferenceable(8) %this) #3 align 2 {
+// CHECK: define linkonce_odr noundef i32 @"?getFirst@Pair@@QAAHXZ"(ptr noundef nonnull align 4 dereferenceable(8) %this) #0 align 2 {
// CHECK-NEXT:entry:
// CHECK-NEXT:%this.addr = alloca ptr, align 4
// CHECK-NEXT:%Another = alloca %struct.Pair, align 4
@@ -34,7 +34,7 @@ void main() {
// CHECK-NEXT:call void @llvm.memcpy.p0.p0.i32(ptr align 4 %this1, ptr align 4 %Another, i32 8, i1 false)
// CHECK-NEXT:%First = getelementptr inbounds nuw %struct.Pair, ptr %this1, i32 0, i32 0
-// CHECK: define linkonce_odr noundef i32 @"?getSecond@Pair@@QAAHXZ"(ptr noundef nonnull align 4 dereferenceable(8) %this) #3 align 2 {
+// CHECK: define linkonce_odr noundef i32 @"?getSecond@Pair@@QAAHXZ"(ptr noundef nonnull align 4 dereferenceable(8) %this) #0 align 2 {
// CHECK-NEXT:entry:
// CHECK-NEXT:%this.addr = alloca ptr, align 4
// CHECK-NEXT:%ref.tmp = alloca %struct.Pair, align 4
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, LGTM.
// CHECK: define noundef i32 @"?Find@@YAIY0GE@UNode@@I@Z"(ptr noundef byval([100 x %struct.Node]) align 4 %SortedTree, i32 noundef %key) [[IntAttr:\#[0-9]+]] | ||
// CHECK: ret i32 | ||
// Find and return value corresponding to key in the SortedTree | ||
uint Find(Node SortedTree[MAX], uint key) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need such complicated functions for this test? Does the body matter here or is it really just the internal/external properties that matter. In my opinion, the function body is distracting from the essence of the test here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably not. I admit I got carried away, but in the process I discovered the constraints of the current implementation and potentially introduced some incidentals that might catch future issues by creating a more representative shader. I realize that philosophies might differ here, but I'm reluctant to change it.
(getLangOpts().CUDA && FD->hasAttr<CUDAGlobalAttr>()))) | ||
if (FD && | ||
((getLangOpts().CPlusPlus && FD->isMain()) || getLangOpts().OpenCL || | ||
getLangOpts().HLSL || getLangOpts().SYCLIsDevice || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the norecurse
attribute apply to direct recursion only or indirectly as well (e.g. A->B->A
). We do have bounded recursion in raytracing in that TraceRay
can invoke a ClosetHit
shader which can then again call TraceRay
.
I don't think that changes anything about this PR, but just throwing it out there in case it matters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This attribute is descriptive and used in a few places to determine if recursion is expected or not. There is not actually any place that checks for recursion and produces an error for HLSL nor for OpenCL. There is a check for it when inlining takes place, but that's too late for diagnostics. #105244 is meant to address this problem.
Previously, functions named "main" got the NoRecurse attribute consistent with the behavior of C++, which HLSL largely follows. However, standard recursion is not allowed in HLSL, so all functions should really have this attribute. This doesn't prevent recursion, but rather signals that these functions aren't expected to recurse.
Practically, this was done so that entry point functions named "main" would have all have the same attributes as otherwise identical entry points with other names.
This required small changes to the this assignment tests because they no longer generate so many attribute sets since more of them match.
related to #105244
but done to simplify testing for #89806