Building PyTorch 0.4.1 from source on an Nvidia Drive PX2 (Driveworks 0.6) currently does not work, due to an odd Nvidia print statement which breaks the CUDA architecture detection: running any CUDA process emits the following line to stdout: nvrm_gpu: Bug 200215060 workaround enabled.
Unfortunately there's nothing I can do (or at least nothing I've found) to work around or suppress this - it's part of the CUDA 9.0 install which ships with the Driveworks SDK for these PX2s.
This breaks the CUDA architecture detection in <pytorch_root>/cmake/Modules_CUDA_fix/upstream/FindCUDA/select_compute_arch.cmake, in the function CUDA_DETECT_INSTALLED_GPUS:
This function writes and compiles a short cpp program which prints the CUDA device architectures, and caches the program output in CMakeCache.
Instead of printing 6.1 6.2 to stdout as expected on this device, this additional print statement results in CUDA_GPU_DETECT_OUTPUT being set to: nvrm_gpu: Bug 200215060 workaround enabled.\n6.1 6.2
This newline breaks CMakeCache, which doesn't handle newlines in cached variables.
In addition, the output triggers a number of message(SEND_ERROR <>) calls, one per parsed token (stating that 'nvrm_gpu:', 'Bug', '200215060', ... aren't valid architectures, understandably).
This can be fixed with a one-line addition at line 90 of this .cmake file, filtering the program output so that the compute_capabilities variable holds only sensible architecture versions (e.g. 6.1, 6.2): string(REGEX MATCHALL "[0-9]+\\.[0-9]+" compute_capabilities "${compute_capabilities}")
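As a rough illustration of what that one-liner does (using Python's re module as a stand-in for CMake's REGEX MATCHALL; the sample string below is the contaminated output described above):

```python
import re

# Contaminated detection output: the nvrm_gpu workaround notice
# precedes the real compute capabilities on this device.
raw_output = "nvrm_gpu: Bug 200215060 workaround enabled.\n6.1 6.2"

# Keep only tokens that look like architecture versions (digits.digits),
# mirroring: string(REGEX MATCHALL "[0-9]+\\.[0-9]+" ...)
capabilities = re.findall(r"[0-9]+\.[0-9]+", raw_output)

print(" ".join(capabilities))  # -> 6.1 6.2
```

Note that the pattern requires digits on both sides of the dot, so neither the bug number (200215060, no dot) nor the trailing period of "enabled." survives the filter.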
With this patch, pytorch 0.4.1 builds happily.
If you have any other suggestions for a fix or workaround, I'd be happy to try them.
Code example
Reproducible on multiple PX2s with this version of Driveworks, Python 3.5, a fresh checkout of 0.4.1, and the following install command:
MAX_JOBS=1 python3 setup.py install --user
System Info
CUDA used to build PyTorch: 9.0.225
OS: Ubuntu 16.04.5 LTS
GCC version: (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609
CMake version: version 3.5.1
Python version: 3.5
CUDA runtime version: 9.0.225
cuDNN version: Probably one of the following:
/usr/lib/aarch64-linux-gnu/libcudnn.so.7.0.4
/usr/lib/aarch64-linux-gnu/libcudnn_static_v7.a
Thanks @soumith - that works perfectly as well!
If possible, it'd be great to add this to the CMake module itself, so that the default behaviour handles this. I've attached a quick diff of a patch which resolves this: px2_build_fix.diff.zip