-
Notifications
You must be signed in to change notification settings - Fork 13.4k
[clang][AMDGPU][CUDA] Handle __builtin_printf for device printf #68515
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-clang-codegen @llvm/pr-subscribers-clang ChangesPreviously Ref: #68478 Full diff: https://github.com/llvm/llvm-project/pull/68515.diff 4 Files Affected:
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index bf984861bccb5cc..c16c005787ca778 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -5464,6 +5464,7 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
Value *HalfVal = Builder.CreateLoad(Address);
return RValue::get(Builder.CreateFPExt(HalfVal, Builder.getFloatTy()));
}
+ case Builtin::BI__builtin_printf:
case Builtin::BIprintf:
if (getTarget().getTriple().isNVPTX() ||
getTarget().getTriple().isAMDGCN()) {
diff --git a/clang/lib/CodeGen/CGGPUBuiltin.cpp b/clang/lib/CodeGen/CGGPUBuiltin.cpp
index 75fb06de938425d..794be0520163157 100644
--- a/clang/lib/CodeGen/CGGPUBuiltin.cpp
+++ b/clang/lib/CodeGen/CGGPUBuiltin.cpp
@@ -135,7 +135,8 @@ RValue EmitDevicePrintfCallExpr(const CallExpr *E, CodeGenFunction *CGF,
llvm::Function *Decl, bool WithSizeArg) {
CodeGenModule &CGM = CGF->CGM;
CGBuilderTy &Builder = CGF->Builder;
- assert(E->getBuiltinCallee() == Builtin::BIprintf);
+ assert(E->getBuiltinCallee() == Builtin::BIprintf ||
+ E->getBuiltinCallee() == Builtin::BI__builtin_printf);
assert(E->getNumArgs() >= 1); // printf always has at least one arg.
// Uses the same format as nvptx for the argument packing, but also passes
diff --git a/clang/test/CodeGenCUDA/printf-builtin.cu b/clang/test/CodeGenCUDA/printf-builtin.cu
new file mode 100644
index 000000000000000..586d00a878ddf89
--- /dev/null
+++ b/clang/test/CodeGenCUDA/printf-builtin.cu
@@ -0,0 +1,20 @@
+// REQUIRES: x86-registered-target
+// REQUIRES: nvptx-registered-target
+// RUN: %clang_cc1 -triple nvptx64-nvidia-cuda -emit-llvm -disable-llvm-optzns -fno-builtin-printf -fcuda-is-device \
+// RUN: -o - %s | FileCheck %s
+
+#define __device__ __attribute__((device))
+
+extern "C" __device__ int printf(const char *format, ...);
+
+// CHECK-LABEL: @_Z4foo1v()
+__device__ int foo1() {
+ // CHECK-NOT: call i32 (ptr, ...) @printf
+ return __builtin_printf("Hello World\n");
+}
+
+// CHECK-LABEL: @_Z4foo2v()
+__device__ int foo2() {
+ // CHECK: call i32 (ptr, ...) @printf
+ return printf("Hello World\n");
+}
diff --git a/clang/test/CodeGenHIP/printf-builtin.hip b/clang/test/CodeGenHIP/printf-builtin.hip
new file mode 100644
index 000000000000000..76c7d41376c972d
--- /dev/null
+++ b/clang/test/CodeGenHIP/printf-builtin.hip
@@ -0,0 +1,21 @@
+// REQUIRES: amdgpu-registered-target
+// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -emit-llvm -disable-llvm-optzns -mprintf-kind=hostcall -fno-builtin-printf -fcuda-is-device \
+// RUN: -o - %s | FileCheck %s
+// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -emit-llvm -disable-llvm-optzns -mprintf-kind=buffered -fno-builtin-printf -fcuda-is-device \
+// RUN: -o - %s | FileCheck %s
+
+#define __device__ __attribute__((device))
+
+extern "C" __device__ int printf(const char *format, ...);
+
+// CHECK-LABEL: @_Z4foo1v()
+__device__ int foo1() {
+ // CHECK-NOT: call i32 (ptr, ...) @printf
+ return __builtin_printf("Hello World\n");
+}
+
+// CHECK-LABEL: @_Z4foo2v()
+__device__ int foo2() {
+ // CHECK: call i32 (ptr, ...) @printf
+ return printf("Hello World\n");
+}
|
It looks reasonable to me, although I'm not really an AMDGPU person. /me summons @arsenm ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks
AMDGPU backend relies on LLVM passes to translate printf at IR level. __builtin_printf and printf should be equivalent other than the name. |
For the OpenCL case only, not for HIP/OpenMP |
Previously __builtin_printf would result to emitting call to printf, even though directly calling printf was translated. Ref: llvm#68478
193c652
to
b7a87d0
Compare
I don't have commit access, @arsenm please merge. |
…#68515) Previously `__builtin_printf` would result to emitting call to `printf`, even though directly calling `printf` was translated. Ref: llvm#68478
* llvm/main: (328 commits) [Flang][OpenMP] Attempt to make map-types-and-sizes.f90 test more agnostic to other architectures [Transforms] Add more cos combinations to SimplifyLibCalls and InstCombine (llvm#79699) [workflows] Close issues used for backports once the PR has been created (llvm#80394) [RISCV] Add support for RISC-V Pointer Masking (llvm#79929) [lldb] Cleanup regex in libcxx formatters (NFC) (llvm#80618) [lldb] Remove unused private TypeCategoryMap methods (NFC) (llvm#80602) [mlir][sparse] refine sparse assembler strategy (llvm#80521) [NFC] Fix typo (llvm#80703) Fix broken ARM processor features test (llvm#80717) [ValueTracking][NFC] Pass `SimplifyQuery` to `computeKnownFPClass` family (llvm#80657) [x86_64][windows][swift] do not use Swift async extended frame for wi… (llvm#80468) [X86] addConstantComments - add FP16 MOVSH asm comments support [X86] Regenerate some vector constant comments missed in recent patches to improve mask predicate handling in addConstantComments [clang][AMDGPU][CUDA] Handle __builtin_printf for device printf (llvm#68515) Add some clarification to email check message [GitHub][Workflows] Prevent multiple private email comments (temporarily) (llvm#80648) [workflows] Use /mnt as the build directory on Linux (llvm#80583) [Flang][OpenMP] Initial mapping of Fortran pointers and allocatables for target devices (llvm#71766) [AMDGPU] GlobalISel for f8 conversions (llvm#80503) [AMDGPU] Fixed byte_sel of v_cvt_f32_bf8/v_cvt_f32_fp8 (llvm#80502) ...
Previously
__builtin_printf
would result to emitting call toprintf
, even though directly callingprintf
was translated.Ref: #68478