Skip to content

Conversation

Maetveis
Copy link
Contributor

@Maetveis Maetveis commented Oct 8, 2023

Previously __builtin_printf would result to emitting call to printf, even though directly calling printf was translated.

Ref: #68478

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen IR generation bugs: mangling, exceptions, etc. labels Oct 8, 2023
@llvmbot
Copy link
Member

llvmbot commented Oct 8, 2023

@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-clang

Changes

Previously __builtin_printf would result to emitting call to printf, even though directly calling printf was translated.

Ref: #68478


Full diff: https://github.com/llvm/llvm-project/pull/68515.diff

4 Files Affected:

  • (modified) clang/lib/CodeGen/CGBuiltin.cpp (+1)
  • (modified) clang/lib/CodeGen/CGGPUBuiltin.cpp (+2-1)
  • (added) clang/test/CodeGenCUDA/printf-builtin.cu (+20)
  • (added) clang/test/CodeGenHIP/printf-builtin.hip (+21)
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index bf984861bccb5cc..c16c005787ca778 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -5464,6 +5464,7 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
     Value *HalfVal = Builder.CreateLoad(Address);
     return RValue::get(Builder.CreateFPExt(HalfVal, Builder.getFloatTy()));
   }
+  case Builtin::BI__builtin_printf:
   case Builtin::BIprintf:
     if (getTarget().getTriple().isNVPTX() ||
         getTarget().getTriple().isAMDGCN()) {
diff --git a/clang/lib/CodeGen/CGGPUBuiltin.cpp b/clang/lib/CodeGen/CGGPUBuiltin.cpp
index 75fb06de938425d..794be0520163157 100644
--- a/clang/lib/CodeGen/CGGPUBuiltin.cpp
+++ b/clang/lib/CodeGen/CGGPUBuiltin.cpp
@@ -135,7 +135,8 @@ RValue EmitDevicePrintfCallExpr(const CallExpr *E, CodeGenFunction *CGF,
                                 llvm::Function *Decl, bool WithSizeArg) {
   CodeGenModule &CGM = CGF->CGM;
   CGBuilderTy &Builder = CGF->Builder;
-  assert(E->getBuiltinCallee() == Builtin::BIprintf);
+  assert(E->getBuiltinCallee() == Builtin::BIprintf ||
+         E->getBuiltinCallee() == Builtin::BI__builtin_printf);
   assert(E->getNumArgs() >= 1); // printf always has at least one arg.
 
   // Uses the same format as nvptx for the argument packing, but also passes
diff --git a/clang/test/CodeGenCUDA/printf-builtin.cu b/clang/test/CodeGenCUDA/printf-builtin.cu
new file mode 100644
index 000000000000000..586d00a878ddf89
--- /dev/null
+++ b/clang/test/CodeGenCUDA/printf-builtin.cu
@@ -0,0 +1,20 @@
+// REQUIRES: x86-registered-target
+// REQUIRES: nvptx-registered-target
+// RUN: %clang_cc1 -triple nvptx64-nvidia-cuda -emit-llvm -disable-llvm-optzns -fno-builtin-printf -fcuda-is-device \
+// RUN:   -o - %s | FileCheck  %s
+
+#define __device__ __attribute__((device))
+
+extern "C" __device__ int printf(const char *format, ...);
+
+// CHECK-LABEL: @_Z4foo1v()
+__device__ int foo1() {
+  // CHECK-NOT: call i32 (ptr, ...) @printf
+  return __builtin_printf("Hello World\n");
+}
+
+// CHECK-LABEL: @_Z4foo2v()
+__device__ int foo2() {
+  // CHECK: call i32 (ptr, ...) @printf
+  return printf("Hello World\n");
+}
diff --git a/clang/test/CodeGenHIP/printf-builtin.hip b/clang/test/CodeGenHIP/printf-builtin.hip
new file mode 100644
index 000000000000000..76c7d41376c972d
--- /dev/null
+++ b/clang/test/CodeGenHIP/printf-builtin.hip
@@ -0,0 +1,21 @@
+// REQUIRES: amdgpu-registered-target
+// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -emit-llvm -disable-llvm-optzns -mprintf-kind=hostcall -fno-builtin-printf -fcuda-is-device \
+// RUN:   -o - %s | FileCheck %s
+// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -emit-llvm -disable-llvm-optzns -mprintf-kind=buffered -fno-builtin-printf -fcuda-is-device \
+// RUN:   -o - %s | FileCheck %s
+
+#define __device__ __attribute__((device))
+
+extern "C" __device__ int printf(const char *format, ...);
+
+// CHECK-LABEL: @_Z4foo1v()
+__device__ int foo1() {
+  // CHECK-NOT: call i32 (ptr, ...) @printf
+  return __builtin_printf("Hello World\n");
+}
+
+// CHECK-LABEL: @_Z4foo2v()
+__device__ int foo2() {
+  // CHECK: call i32 (ptr, ...) @printf
+  return printf("Hello World\n");
+}

@Maetveis
Copy link
Contributor Author

Maetveis commented Feb 3, 2024

@jlebar @yxsamliu I don't know who could review this, but it is sitting here for quite some time unnoticed now, while being a simple enough change IMO.

@jlebar
Copy link
Member

jlebar commented Feb 4, 2024

It looks reasonable to me, although I'm not really an AMDGPU person. /me summons @arsenm ?

Copy link
Collaborator

@yxsamliu yxsamliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks

@yxsamliu
Copy link
Collaborator

yxsamliu commented Feb 4, 2024

It looks reasonable to me, although I'm not really an AMDGPU person. /me summons @arsenm ?

AMDGPU backend relies on LLVM passes to translate printf at IR level. __builtin_printf and printf should be equivalent other than the name.

@arsenm
Copy link
Contributor

arsenm commented Feb 5, 2024

It looks reasonable to me, although I'm not really an AMDGPU person. /me summons @arsenm ?

AMDGPU backend relies on LLVM passes to translate printf at IR level.

For the OpenCL case only, not for HIP/OpenMP

Previously __builtin_printf would result to emitting call to printf,
even though directly calling printf was translated.

Ref: llvm#68478
@Maetveis Maetveis force-pushed the gpu_handle_builtin_printf branch from 193c652 to b7a87d0 Compare February 5, 2024 14:58
@Maetveis Maetveis requested a review from arsenm February 5, 2024 14:58
@Maetveis
Copy link
Contributor Author

Maetveis commented Feb 5, 2024

I don't have commit access, @arsenm please merge.

@arsenm arsenm merged commit 5942868 into llvm:main Feb 5, 2024
@Maetveis Maetveis deleted the gpu_handle_builtin_printf branch February 5, 2024 20:24
agozillon pushed a commit to agozillon/llvm-project that referenced this pull request Feb 5, 2024
…#68515)

Previously `__builtin_printf` would result to emitting call to `printf`,
even though directly calling `printf` was translated.

Ref: llvm#68478
ichaer added a commit to ichaer/llvm-project-onesided_lower_bound that referenced this pull request Feb 12, 2024
* llvm/main: (328 commits)
  [Flang][OpenMP] Attempt to make map-types-and-sizes.f90 test more agnostic to other architectures
  [Transforms] Add more cos combinations to SimplifyLibCalls and InstCombine (llvm#79699)
  [workflows] Close issues used for backports once the PR has been created (llvm#80394)
  [RISCV] Add support for RISC-V Pointer Masking (llvm#79929)
  [lldb] Cleanup regex in libcxx formatters (NFC) (llvm#80618)
  [lldb] Remove unused private TypeCategoryMap methods (NFC) (llvm#80602)
  [mlir][sparse] refine sparse assembler strategy (llvm#80521)
  [NFC] Fix typo (llvm#80703)
  Fix broken ARM processor features test (llvm#80717)
  [ValueTracking][NFC] Pass `SimplifyQuery` to `computeKnownFPClass` family (llvm#80657)
  [x86_64][windows][swift] do not use Swift async extended frame for wi… (llvm#80468)
  [X86] addConstantComments - add FP16 MOVSH asm comments support
  [X86] Regenerate some vector constant comments missed in recent patches to improve mask predicate handling in addConstantComments
  [clang][AMDGPU][CUDA] Handle __builtin_printf for device printf (llvm#68515)
  Add some clarification to email check message
  [GitHub][Workflows] Prevent multiple private email comments (temporarily) (llvm#80648)
  [workflows] Use /mnt as the build directory on Linux (llvm#80583)
  [Flang][OpenMP] Initial mapping of Fortran pointers and allocatables for target devices (llvm#71766)
  [AMDGPU] GlobalISel for f8 conversions (llvm#80503)
  [AMDGPU] Fixed byte_sel of v_cvt_f32_bf8/v_cvt_f32_fp8 (llvm#80502)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants