clang++ cuda: Lambda capture fails to initialize memory when variable only used in #ifdef __CUDA_ARCH__
#193
Comments
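On the host side, the lambda does not capture `a`, because `a` is only referenced inside `#ifdef __CUDA_ARCH__`. The closure type is effectively

struct Lambda {
#ifdef __CUDA_ARCH__
int a;
#endif
};

and that type is shared between the host and device compilations. Workaround: either don't use …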
Thanks for the help. Didn't really think this through.
@llvm/issue-subscribers-c-11 Author: Ryan Greenblatt (rgreenblatt)
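```cuda
#include <cstdio>

template <typename F> __global__ void call(F f) { f(); }

int main(int argc, char *argv[]) {
  int a = 0; // captured implicitly by [=]; expected to print as 0
  call<<<1, 1>>>([=] __device__() {
#ifdef __CUDA_ARCH__
    printf("%d\n", a); // the only use of `a`, seen by the device pass alone
#endif
  });
  cudaDeviceSynchronize();
  return 0;
}
```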
Build with
clang++ -std=c++17 --cuda-gpu-arch=sm_75 -L/usr/local/cuda/lib64 -lcudart test.cu
changing options as needed. The issue also occurs with C++11/14. The value printed is garbage instead of 0. Using the variable outside the `#ifdef` or explicitly capturing it makes the issue go away. The issue occurs regardless of optimization settings, as far as I can tell. My testing is on trunk, but I think this still occurs with clang 9/10.
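For reference, here is a minimal sketch of the explicit-capture workaround mentioned above. It assumes the same setup as the report (a single `int` named `a`, a `call` kernel wrapper, and the same build command); the only change is `[a]` instead of `[=]`, so that `a` is a closure member in both the host and device passes.

```cuda
#include <cstdio>

template <typename F> __global__ void call(F f) { f(); }

int main() {
  int a = 0;
  // Explicit capture: `a` becomes a member of the closure type whether or
  // not the lambda body uses it, so the host and device passes should agree
  // on the closure layout.
  call<<<1, 1>>>([a] __device__() {
#ifdef __CUDA_ARCH__
    printf("%d\n", a);
#endif
  });
  cudaDeviceSynchronize();
  return 0;
}
```

On the host pass the body never touches `a`, so clang may warn about an unused lambda capture. The other workaround from the report, referencing `a` somewhere outside the `#ifdef`, avoids that by making the implicit `[=]` capture happen on both sides.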