docs/site_specific_config/gpu.md
38 additions & 25 deletions
@@ -29,27 +29,38 @@ For CUDA-enabled software to run, it needs to be able to find the **NVIDIA GPU d
29
29
The challenge here is that the NVIDIA GPU drivers are not _always_ in a standard system location, and that we
30
30
cannot install the GPU drivers in EESSI (since they are too closely tied to the client OS and GPU hardware).
31
31
32
-
### Compiling CUDA software {: #cuda_sdk }
32
+
### Compiling software on top of CUDA, cuDNN and other SDKs provided by NVIDIA {: #cuda_sdk }
33
33
34
-
An additional requirement is necessary if you want to be able to compile CUDA-enabled software using a CUDA installation included in EESSI. This requires a *full* CUDA SDK, but the [CUDA SDK End User License Agreement (EULA)](https://docs.nvidia.com/cuda/eula/index.html) does not allow for full redistribution. In EESSI, we are (currently) only allowed to redistribute the files needed to *run* CUDA software.
34
+
An additional requirement is necessary if you want to be able to compile software
35
+
that makes use of a CUDA installation or cu\* SDKs (e.g., cuDNN) included in
36
+
EESSI. This requires a *full* installation of the CUDA SDK, cuDNN, etc. However,
37
+
the [CUDA SDK End User License Agreement (EULA)](https://docs.nvidia.com/cuda/eula/index.html)
38
+
and the [Software License Agreement (SLA) for NVIDIA cuDNN](https://docs.nvidia.com/deeplearning/cudnn/latest/reference/eula.html)
39
+
do not allow for full redistribution. In EESSI, we are (currently) only allowed to
40
+
redistribute the files needed to *run* CUDA and cuDNN software.
35
41
36
-
!!! note "Full CUDA SDK only needed to *compile* CUDA software"
37
-
Without a full CUDA SDK on the host system, you will still be able to *run* CUDA-enabled software from the EESSI stack,
38
-
you just won't be able to *compile* additional CUDA software.
42
+
!!! note "A full CUDA SDK or cuDNN SDK is only needed to *compile* CUDA or cuDNN software"
43
+
Without a full CUDA SDK or cuDNN SDK on the host system, you will still
44
+
be able to *run* CUDA-enabled or cuDNN-enabled software from the EESSI stack,
45
+
you just won't be able to *compile* additional CUDA or cuDNN software.
39
46
40
-
Below, we describe how to make sure that the EESSI software stack can find your NVIDIA GPU drivers and (optionally) full installations of the CUDA SDK.
47
+
Below, we describe how to make sure that the EESSI software stack can find your
48
+
NVIDIA GPU drivers and (optionally) full installations of the CUDA SDK and the
49
+
cuDNN SDK.
41
50
42
51
### Configuring CUDA driver location {: #driver_location }
43
52
44
53
All CUDA-enabled software in EESSI expects the CUDA drivers to be available in a specific subdirectory of this `host_injections` directory.
45
-
In addition, installations of the CUDA SDK included EESSI are stripped down to the files that we are allowed to redistribute;
54
+
In addition, installations of the CUDA SDK and cuDNN SDK included in EESSI are stripped down to the files that we are allowed to redistribute;
46
55
all other files are replaced by symbolic links that point to another specific subdirectory of `host_injections`. For example:
47
56
```
48
57
$ ls -l /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen3/software/CUDA/12.1.1/bin/nvcc
If the corresponding full installation of the CUDA SDK is available there, the CUDA installation included in EESSI can be used to build CUDA software.
61
+
If the corresponding full installation of the CUDA SDK is available there, the
62
+
CUDA installation included in EESSI can be used to build CUDA software. The same
63
+
applies to the cuDNN SDK.
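
Whether a stripped-down file is usable comes down to whether its symbolic link resolves to a real file under `host_injections`. The following is a minimal sketch of such a check; the helper name `check_link` is an assumption made for illustration and is not part of EESSI.

```shell
# Hypothetical helper (not shipped with EESSI): classify a path as
# "ok" (symlink with an existing target), "dangling" (symlink whose
# target is missing), or "not-a-symlink".
check_link() {
    if [ ! -L "$1" ]; then
        echo "not-a-symlink"
    elif [ -e "$1" ]; then
        # -e follows the link, so this is true only if the target exists.
        echo "ok"
    else
        echo "dangling"
    fi
}
```

For instance, `check_link /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen3/software/CUDA/12.1.1/bin/nvcc` should print `dangling` as long as no full CUDA SDK has been installed under `host_injections`, assuming `nvcc` is one of the files replaced by a symlink as described above.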
53
64
54
65
55
66
### Using NVIDIA GPUs via a native EESSI installation {: #nvidia_eessi_native }
@@ -74,37 +85,39 @@ This script uses `ldconfig` on your host system to locate your GPU drivers, and
74
85
75
86
Note that it is safe to re-run the script even if no driver updates were done: the script should detect that the current version of the drivers was already symlinked.
76
87
77
-
#### Installing full CUDA SDK (optional)
88
+
#### Installing full CUDA SDK and cuDNN SDK (optional)
78
89
79
-
To install a full CUDA SDK under `host_injections`, use the `install_cuda_host_injections.sh` script that is included in EESSI:
90
+
To install a full CUDA SDK and cuDNN SDK under `host_injections`, use the `install_cuda_and_libraries.sh` script that is included in EESSI:
By default, the install script processes all files matching `eessi-*CUDA*.yml` in
107
+
the above `/cvmfs/software.eessi.io/versions/${EESSI_VERSION}/scripts/gpu_support/nvidia/easystacks` directory.
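
To preview which easystack files the default pattern would pick up, the glob can be expanded by hand. This is only a sketch of the matching rule stated above, not the install script's actual logic; the function name `list_cuda_easystacks` is an assumption.

```shell
# Hypothetical helper: print every file in directory $1 whose name
# matches the default pattern eessi-*CUDA*.yml, one per line.
list_cuda_easystacks() {
    for f in "$1"/eessi-*CUDA*.yml; do
        # An unmatched glob stays literal, so skip anything that does not exist.
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

Calling `list_cuda_easystacks "/cvmfs/software.eessi.io/versions/${EESSI_VERSION}/scripts/gpu_support/nvidia/easystacks"` lists the candidate files; empty output means nothing matches the pattern.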
93
108
94
-
You can run `/cvmfs/software.eessi.io/versions/${EESSI_VERSION}/scripts/gpu_support/nvidia/install_cuda_host_injections.sh --help` to check all of the options.
109
+
You can run `/cvmfs/software.eessi.io/versions/${EESSI_VERSION}/scripts/gpu_support/nvidia/install_cuda_and_libraries.sh --help` to check all of the options.
95
110
96
111
!!! tip
97
112
98
-
This script uses EasyBuild to install the CUDA SDK. For this to work, two requirements need to be satisfied:
99
-
100
-
* `module load EasyBuild` should work (or the `eb` command is already available in the environment);
101
-
* The version of EasyBuild being used should provide the requested version of the CUDA easyconfig file
102
-
(in the example case above, that's `CUDA-12.1.1.eb`).
113
+
This script uses EasyBuild to install the CUDA SDK and the cuDNN SDK. For this to work, two requirements need to be satisfied:
103
114
104
-
You can rely on the EasyBuild installation that is included in EESSI for this.
115
+
* `module load EasyBuild/${EB_VERSION}` must work (EB_VERSION is extracted
116
+
from the name of the easystack file (e.g., from `eb-4.9.4` EB_VERSION is
117
+
derived as 4.9.4));
118
+
* `module load EESSI-extend/${EESSI_VERSION}-easybuild` must work.
105
119
106
-
Alternatively, you may load an EasyBuild module manually _before_ running the `install_cuda_host_injections.sh`
107
-
script to make an `eb` command available.
120
+
Both modules are included in EESSI.
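
The derivation of `EB_VERSION` from an easystack file name can be sketched as follows. The file name used here is hypothetical (only the `eb-4.9.4` fragment comes from the text above), and the `sed` expression is one possible way to do the extraction, not necessarily what the install script does.

```shell
# Hypothetical easystack file name; only the "eb-<version>" part is
# taken from the documentation above.
easystack="eessi-2023.06-eb-4.9.4-2023a-CUDA.yml"

# Print the run of digits and dots that follows "eb-" in a file name.
# Prints nothing if the name contains no such fragment.
eb_version_from_name() {
    printf '%s\n' "$1" | sed -n 's/.*eb-\([0-9][0-9.]*[0-9]\).*/\1/p'
}

EB_VERSION=$(eb_version_from_name "$easystack")
echo "EB_VERSION=${EB_VERSION}"
```

With the file name above this prints `EB_VERSION=4.9.4`, which is the version that `module load EasyBuild/${EB_VERSION}` would then need to find.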
108
121
109
122
110
123
### Using NVIDIA GPUs via EESSI in a container {: #nvidia_eessi_container }