
Move the CUDA implementation of trunc to ATen. #25423


Closed · wants to merge 14 commits

Conversation

@xuhdev (Collaborator) commented Aug 29, 2019

Stack from ghstack:

Fix #24650

Differential Revision: D17397489

void trunc_kernel_cuda(TensorIterator& iter) {
  AT_DISPATCH_FLOATING_TYPES_AND_HALF(iter.dtype(), "trunc_cuda", [&]() {
    gpu_kernel(iter, []GPU_LAMBDA(scalar_t a) -> scalar_t {
      return std::trunc(a);
    });
  });
}
Contributor


Are you sure there is no need to overload this to call truncf?

@xuhdev (Collaborator, author) commented Aug 29, 2019

Yeah, it's overloaded: https://en.cppreference.com/w/cpp/numeric/math/trunc (also search for VSTD::trunc in crt/math_functions.h in the CUDA include dir).
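
One way to spot-check the overload set at compile time (a quick host-side sketch, not part of this PR, assuming a C++11 compiler; the device-side behaviour is what the PTX dump below demonstrates):

#include <cmath>
#include <type_traits>

// Since C++11, <cmath> provides a float overload of std::trunc, so a float
// argument resolves to it instead of being promoted to double.
static_assert(std::is_same<decltype(std::trunc(1.0f)), float>::value,
              "std::trunc(float) resolves to the float overload");

int main() { return 0; }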

@xuhdev (Collaborator, author)

If you are worried, here's some evidence. Compile the following program with:

nvcc -ptx -src-in-ptx -arch=sm_60 test.cu

#include <cuda_runtime.h>

__global__ void test_trunc_f(float x,  float& x2) {
  x2 = std::trunc(x);
}

__global__ void test_trunc_d(double x,  double& x2) {
  x2 = std::trunc(x);
}

__global__ void test_trunc_f_d(float x,  float& x2) {
  x2 = truncf(x);
}

The output is shown below (note that the first and third functions compile to the same PTX):

//
// Generated by NVIDIA NVVM Compiler
//
// Compiler Build ID: CL-26907403
// Cuda compilation tools, release 10.1, V10.1.243
// Based on LLVM 3.4svn
//

.version 6.4
.target sm_60
.address_size 64

	// .globl	_Z12test_trunc_ffRf

.visible .entry _Z12test_trunc_ffRf(
	.param .f32 _Z12test_trunc_ffRf_param_0,
	.param .u64 _Z12test_trunc_ffRf_param_1
)
{
	.reg .f32 	%f<3>;
	.reg .b64 	%rd<3>;


	ld.param.f32 	%f1, [_Z12test_trunc_ffRf_param_0];
	ld.param.u64 	%rd1, [_Z12test_trunc_ffRf_param_1];
	cvta.to.global.u64 	%rd2, %rd1;
	cvt.rzi.f32.f32	%f2, %f1;
	st.global.f32 	[%rd2], %f2;
	ret;
}

	// .globl	_Z12test_trunc_ddRd
.visible .entry _Z12test_trunc_ddRd(
	.param .f64 _Z12test_trunc_ddRd_param_0,
	.param .u64 _Z12test_trunc_ddRd_param_1
)
{
	.reg .f64 	%fd<3>;
	.reg .b64 	%rd<3>;


	ld.param.f64 	%fd1, [_Z12test_trunc_ddRd_param_0];
	ld.param.u64 	%rd1, [_Z12test_trunc_ddRd_param_1];
	cvta.to.global.u64 	%rd2, %rd1;
	cvt.rzi.f64.f64	%fd2, %fd1;
	st.global.f64 	[%rd2], %fd2;
	ret;
}

	// .globl	_Z14test_trunc_f_dfRf
.visible .entry _Z14test_trunc_f_dfRf(
	.param .f32 _Z14test_trunc_f_dfRf_param_0,
	.param .u64 _Z14test_trunc_f_dfRf_param_1
)
{
	.reg .f32 	%f<3>;
	.reg .b64 	%rd<3>;


	ld.param.f32 	%f1, [_Z14test_trunc_f_dfRf_param_0];
	ld.param.u64 	%rd1, [_Z14test_trunc_f_dfRf_param_1];
	cvta.to.global.u64 	%rd2, %rd1;
	cvt.rzi.f32.f32	%f2, %f1;
	st.global.f32 	[%rd2], %f2;
	ret;
}
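
Stepping back from the overload question: for context, here is a minimal sketch of how a kernel like the one in this diff is typically wired into ATen dispatch. The stub name, include paths, and file location follow the usual DispatchStub convention and are assumptions, not copied from this PR.

// ATen/native/cuda/UnaryOpsKernel.cu (illustrative sketch)
#include <ATen/Dispatch.h>
#include <ATen/native/TensorIterator.h>
#include <ATen/native/UnaryOps.h>
#include <ATen/native/cuda/Loops.cuh>

namespace at { namespace native {

void trunc_kernel_cuda(TensorIterator& iter) {
  AT_DISPATCH_FLOATING_TYPES_AND_HALF(iter.dtype(), "trunc_cuda", [&]() {
    gpu_kernel(iter, []GPU_LAMBDA(scalar_t a) -> scalar_t {
      // std::trunc resolves per scalar_t: float, double, or at::Half via its float conversion.
      return std::trunc(a);
    });
  });
}

// Register against the device-generic trunc_stub (declared in ATen/native/UnaryOps.h)
// so that at::trunc on CUDA tensors dispatches to this kernel.
REGISTER_DISPATCH(trunc_stub, &trunc_kernel_cuda);

}} // namespace at::native

The CPU backend registers its own kernel against the same stub, which is presumably why this PR also touches UnaryOps.cpp, the file with the rebase contention mentioned later in the thread.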

xuhdev added a commit that referenced this pull request Sep 3, 2019
Fix #24650

ghstack-source-id: c8226da
Pull Request resolved: #25423
@VitalyFedyunin previously approved these changes Sep 16, 2019
@VitalyFedyunin dismissed their stale review September 16, 2019 15:36

rocm failure looks reasonable

@VitalyFedyunin (Contributor)

Please fix pr/py2-clang7-rocmdeb-ubuntu16.04

@xuhdev (Collaborator, author) commented Sep 16, 2019

@VitalyFedyunin Updated. It should be fixed now.

@VitalyFedyunin (Contributor)

"All checks have passed" super suspicious ;)

@xuhdev (Collaborator, author) commented Sep 21, 2019

@VitalyFedyunin A friendly reminder that the previous two merged commits in this stack appear to have the wrong authorship: f55a9da

@VitalyFedyunin (Contributor)

Please rebase (contention on UnaryOps.cpp is way too high).

@xuhdev (Collaborator, author) commented Sep 23, 2019

Done!

zdevito pushed a commit to zdevito/ATen that referenced this pull request Sep 24, 2019
Summary:
Pull Request resolved: pytorch/pytorch#25423

Fix #24650

Test Plan: Imported from OSS

Differential Revision: D17397489

Pulled By: VitalyFedyunin

fbshipit-source-id: 933f915a44ff9b7803ddb2708bf0e723433ee0b6
@facebook-github-bot (Contributor)

@VitalyFedyunin merged this pull request in 7bdc0c1.

@xuhdev (Collaborator, author) commented Sep 24, 2019

Thanks!

@xuhdev deleted the gh/xuhdev/33/head branch September 24, 2019 17:46
Labels
Merged · module: cuda · module: internals · open source