Commit c75a8ad

Merge pull request #328 from trz42/use_new_cuda_n_libraries_script
update GPU docs about cuDNN and new script to install full SDKs
2 parents 60770c9 + 971f484 commit c75a8ad

1 file changed: +38 -25 lines changed
docs/site_specific_config/gpu.md

Lines changed: 38 additions & 25 deletions
@@ -29,27 +29,38 @@ For CUDA-enabled software to run, it needs to be able to find the **NVIDIA GPU d
2929
The challenge here is that the NVIDIA GPU drivers are not _always_ in a standard system location, and that we
3030
cannot install the GPU drivers in EESSI (since they are too closely tied to the client OS and GPU hardware).
3131

32-
### Compiling CUDA software {: #cuda_sdk }
32+
### Compiling software on top of CUDA, cuDNN and other SDKs provided by NVIDIA {: #cuda_sdk }
3333

34-
An additional requirement is necessary if you want to be able to compile CUDA-enabled software using a CUDA installation included in EESSI. This requires a *full* CUDA SDK, but the [CUDA SDK End User License Agreement (EULA)](https://docs.nvidia.com/cuda/eula/index.html) does not allow for full redistribution. In EESSI, we are (currently) only allowed to redistribute the files needed to *run* CUDA software.
34+
An additional requirement is necessary if you want to be able to compile software
35+
that makes use of a CUDA installation or cu\* SDKs (e.g., cuDNN) included in
36+
EESSI. This requires a *full* installation of the CUDA SDK, cuDNN, etc. However,
37+
the [CUDA SDK End User License Agreement (EULA)](https://docs.nvidia.com/cuda/eula/index.html)
38+
and the [Software License Agreement (SLA) for NVIDIA cuDNN](https://docs.nvidia.com/deeplearning/cudnn/latest/reference/eula.html)
39+
do not allow for full redistribution. In EESSI, we are (currently) only allowed to
40+
redistribute the files needed to *run* CUDA and cuDNN software.
3541

36-
!!! note "Full CUDA SDK only needed to *compile* CUDA software"
37-
Without a full CUDA SDK on the host system, you will still be able to *run* CUDA-enabled software from the EESSI stack,
38-
you just won't be able to *compile* additional CUDA software.
42+
!!! note "A full CUDA SDK or cuDNN SDK is only needed to *compile* CUDA or cuDNN software"
43+
Without a full CUDA SDK or cuDNN SDK on the host system, you will still
44+
be able to *run* CUDA-enabled or cuDNN-enabled software from the EESSI stack,
45+
you just won't be able to *compile* additional CUDA or cuDNN software.
3946

40-
Below, we describe how to make sure that the EESSI software stack can find your NVIDIA GPU drivers and (optionally) full installations of the CUDA SDK.
47+
Below, we describe how to make sure that the EESSI software stack can find your
48+
NVIDIA GPU drivers and (optionally) full installations of the CUDA SDK and the
49+
cuDNN SDK.
4150

4251
### Configuring CUDA driver location {: #driver_location }
4352

4453
All CUDA-enabled software in EESSI expects the CUDA drivers to be available in a specific subdirectory of this `host_injections` directory.
45-
In addition, installations of the CUDA SDK included EESSI are stripped down to the files that we are allowed to redistribute;
54+
In addition, installations of the CUDA SDK and cuDNN SDK included in EESSI are stripped down to the files that we are allowed to redistribute;
4655
all other files are replaced by symbolic links that point to another specific subdirectory of `host_injections`. For example:
4756
```
4857
$ ls -l /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen3/software/CUDA/12.1.1/bin/nvcc
4958
lrwxrwxrwx 1 cvmfs cvmfs 109 Dec 21 14:49 /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen3/software/CUDA/12.1.1/bin/nvcc -> /cvmfs/software.eessi.io/host_injections/2023.06/software/linux/x86_64/amd/zen3/software/CUDA/12.1.1/bin/nvcc
5059
```
5160

52-
If the corresponding full installation of the CUDA SDK is available there, the CUDA installation included in EESSI can be used to build CUDA software.
61+
If the corresponding full installation of the CUDA SDK is available there, the
62+
CUDA installation included in EESSI can be used to build CUDA software. The same
63+
applies to the cuDNN SDK.
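
This sketch is not part of this change. It is a quick sanity check of the symlink mechanism described above: it reports whether a file in the EESSI CUDA installation resolves to a real file under `host_injections`. The helper name `sdk_file_status` is hypothetical and is not shipped with EESSI.

```bash
# Hypothetical helper (not provided by EESSI): classify a path as
# "available", "dangling" (symlink whose target does not exist), or "missing".
sdk_file_status() {
    local path="$1"
    if [ -L "$path" ] && [ ! -e "$path" ]; then
        # The symlink exists, but its target under host_injections does not
        echo "dangling"
    elif [ -e "$path" ]; then
        # Either a regular file or a symlink that resolves
        echo "available"
    else
        echo "missing"
    fi
}
```

For example, `sdk_file_status /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen3/software/CUDA/12.1.1/bin/nvcc` would be expected to print `dangling` when no full CUDA SDK has been installed under `host_injections`.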
5364

5465

5566
### Using NVIDIA GPUs via a native EESSI installation {: #nvidia_eessi_native }
@@ -74,37 +85,39 @@ This script uses `ldconfig` on your host system to locate your GPU drivers, and
7485

7586
Note that it is safe to re-run the script even if no driver updates were done: the script should detect that the current version of the drivers was already symlinked.
7687

77-
#### Installing full CUDA SDK (optional)
88+
#### Installing full CUDA SDK and cuDNN SDK (optional)
7889

79-
To install a full CUDA SDK under `host_injections`, use the `install_cuda_host_injections.sh` script that is included in EESSI:
90+
To install a full CUDA SDK and cuDNN SDK under `host_injections`, use the `install_cuda_and_libraries.sh` script that is included in EESSI:
8091

8192
```{ .bash .copy }
82-
/cvmfs/software.eessi.io/versions/${EESSI_VERSION}/scripts/gpu_support/nvidia/install_cuda_host_injections.sh
93+
/cvmfs/software.eessi.io/versions/${EESSI_VERSION}/scripts/gpu_support/nvidia/install_cuda_and_libraries.sh
8394
```
8495

85-
For example, to install CUDA 12.1.1 in the directory that the [`host_injections` variant symlink](host_injections.md) points to,
96+
For example, to install CUDA 12.1.1 and cuDNN 8.9.2.26 in the directory that the [`host_injections` variant symlink](host_injections.md) points to,
8697
using `/tmp/$USER/EESSI` as directory to store temporary files:
8798
```
88-
/cvmfs/software.eessi.io/versions/${EESSI_VERSION}/scripts/gpu_support/nvidia/install_cuda_host_injections.sh --cuda-version 12.1.1 --temp-dir /tmp/$USER/EESSI --accept-cuda-eula
99+
/cvmfs/software.eessi.io/versions/${EESSI_VERSION}/scripts/gpu_support/nvidia/install_cuda_and_libraries.sh --temp-dir /tmp/$USER/EESSI --accept-cuda-eula --accept-cudnn-eula
89100
```
90-
You should choose the CUDA version you wish to install according to what CUDA versions are included in EESSI;
91-
see the output of `module avail CUDA/` after [setting up your environment for using
92-
EESSI](../using_eessi/setting_up_environment.md).
101+
The versions 12.1.1 for CUDA and 8.9.2.26 for cuDNN are defined in an easystack
102+
file that is also included in EESSI:
103+
```
104+
/cvmfs/software.eessi.io/versions/${EESSI_VERSION}/scripts/gpu_support/nvidia/easystacks/eessi-2023.06-eb-4.9.4-2023a-CUDA-host-injections.yml
105+
```
106+
By default, the install script processes all files matching `eessi-*CUDA*.yml` in
107+
the above `/cvmfs/software.eessi.io/versions/${EESSI_VERSION}/scripts/gpu_support/nvidia/easystacks` directory.
93108

94-
You can run `/cvmfs/software.eessi.io/versions/${EESSI_VERSION}/scripts/gpu_support/nvidia/install_cuda_host_injections.sh --help` to check all of the options.
109+
You can run `/cvmfs/software.eessi.io/versions/${EESSI_VERSION}/scripts/gpu_support/nvidia/install_cuda_and_libraries.sh --help` to check all of the options.
95110

96111
!!! tip
97112

98-
This script uses EasyBuild to install the CUDA SDK. For this to work, two requirements need to be satisfied:
99-
100-
* `module load EasyBuild` should work (or the `eb` command is already available in the environment);
101-
* The version of EasyBuild being used should provide the requested version of the CUDA easyconfig file
102-
(in the example case above, that's `CUDA-12.1.1.eb`).
113+
This script uses EasyBuild to install the CUDA SDK and the cuDNN SDK. For this to work, two requirements need to be satisfied:
103114

104-
You can rely on the EasyBuild installation that is included in EESSI for this.
115+
* `module load EasyBuild/${EB_VERSION}` must work (`EB_VERSION` is extracted
116+
from the name of the easystack file, e.g., from `eb-4.9.4` `EB_VERSION` is
117+
derived as 4.9.4);
118+
* `module load EESSI-extend/${EESSI_VERSION}-easybuild` must work.
105119

106-
Alternatively, you may load an EasyBuild module manually _before_ running the `install_cuda_host_injections.sh`
107-
script to make an `eb` command available.
120+
Both modules are included in EESSI.
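
This sketch is not part of this change. It illustrates how an EasyBuild version could be derived from an easystack file name as described above. The helper `eb_version_from_easystack` is hypothetical, and the install script may implement this differently.

```bash
# Hypothetical helper: print the EasyBuild version embedded in an easystack
# file name (the part between "-eb-" and the next "-"), or nothing if absent.
eb_version_from_easystack() {
    local name
    name=$(basename "$1")
    # sed -n ... p prints only when the pattern matched
    echo "$name" | sed -n 's/.*-eb-\([0-9][0-9.]*\)-.*/\1/p'
}
```

With the easystack file named in this change, `eb_version_from_easystack eessi-2023.06-eb-4.9.4-2023a-CUDA-host-injections.yml` yields the version that `module load EasyBuild/${EB_VERSION}` would then need to find.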
108121

109122

110123
### Using NVIDIA GPUs via EESSI in a container {: #nvidia_eessi_container }
