name/gres-generator

GRES Generator

Generates SLURM GRES configurations for GPU clusters. Automatically maps GPUs to CPU cores.

What it does

Detects the GPUs and CPU cores on a node, then writes the gres.conf entries SLURM needs to schedule GPU jobs with the right CPU affinity.

Install

curl -O https://raw.githubusercontent.com/name/gres-generator/main/gres-gen
chmod +x gres-gen

Examples

Basic usage:

./gres-gen
Name=gpu        File=/dev/nvidia0 CPUs=0-15
Name=gpu        File=/dev/nvidia1 CPUs=16-31
Name=gpu        File=/dev/nvidia2 CPUs=32-47
Name=gpu        File=/dev/nvidia3 CPUs=48-63

Custom GPU name:

./gres-gen --name rtx4090
Name=rtx4090      File=/dev/nvidia0 CPUs=0-15
Name=rtx4090      File=/dev/nvidia1 CPUs=16-31

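Install the output as gres.conf (run as root), then restart the SLURM daemons:

./gres-gen > /etc/slurm/gres.conf
systemctl restart slurmd      # on the GPU node, which reads gres.conf
systemctl restart slurmctld   # on the controller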
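
Before overwriting a live gres.conf, it can help to sanity-check the generated text. The following is a minimal sketch, not part of gres-gen: `check_gres` and `cpu_set` are hypothetical helper names, and the checks assume the line format shown in the examples above (`Name=`, `File=`, `CPUs=`). Overlapping CPU ranges are not invalid in SLURM in general; they are flagged here only because this tool's example output uses disjoint ranges.

```python
def cpu_set(spec):
    """Expand a CPU spec such as "0-3,8" into a set of core indices."""
    cores = set()
    for part in spec.split(","):
        lo, _, hi = part.partition("-")
        cores.update(range(int(lo), int(hi or lo) + 1))
    return cores


def check_gres(text):
    """Return a list of human-readable problems found in gres.conf text."""
    problems = []
    seen_files = set()
    used_cores = set()
    for lineno, raw in enumerate(text.splitlines(), 1):
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = dict(tok.split("=", 1) for tok in line.split() if "=" in tok)
        if "Name" not in fields:
            continue  # e.g. an AutoDetect line; nothing to check here
        if "File" not in fields or "CPUs" not in fields:
            problems.append(f"line {lineno}: missing File= or CPUs=")
            continue
        if fields["File"] in seen_files:
            problems.append(f"line {lineno}: duplicate device {fields['File']}")
        seen_files.add(fields["File"])
        cores = cpu_set(fields["CPUs"])
        if cores & used_cores:
            problems.append(f"line {lineno}: CPUs overlap an earlier line")
        used_cores |= cores
    return problems
```

One way to use it is to write the generator output to a temporary file, pass the file's contents to `check_gres`, and only copy it into /etc/slurm when the returned list is empty.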

Test GPU scheduling:

srun --gres=gpu:1 nvidia-smi
srun --gres=gpu:2 python train_model.py
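
For a check that does not depend on a training script, a job can simply report which GPUs it was given. This sketch assumes SLURM exports `CUDA_VISIBLE_DEVICES` to jobs that request GPU GRES, which is its usual behavior for NVIDIA devices; the file name `check_gpus.py` and the function `visible_gpus` are illustrative, not part of this repository.

```python
import os


def visible_gpus(env=None):
    """Return the GPU identifiers listed in CUDA_VISIBLE_DEVICES."""
    env = os.environ if env is None else env
    value = env.get("CUDA_VISIBLE_DEVICES", "")
    # An unset or empty variable means no GPUs were exposed to the job.
    return [tok for tok in value.split(",") if tok]


if __name__ == "__main__":
    gpus = visible_gpus()
    print(f"{len(gpus)} GPU(s) visible: {gpus}")
```

Saved as `check_gpus.py`, it could be run with `srun --gres=gpu:2 python check_gpus.py`; if scheduling works as intended, the reported count should match the number requested.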

Options

./gres-gen --help
  • --name DEVICE - Set the GRES name written on each line (default: gpu)
  • --header FILE - Include the contents of FILE as a header
  • --autodetect OPT - Add an AutoDetect=OPT line (for example, nvml)

Requirements

  • Linux with NVIDIA GPUs
  • nvidia-smi command available
  • SLURM installed

That's it

The tool distributes CPU cores automatically. Each GPU gets a roughly equal share of the node's cores.
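
The even split can be pictured with a short sketch. This is an illustration under stated assumptions, not the tool's actual code: `split_cores` and `gres_lines` are hypothetical names, cores are assumed to be numbered contiguously from 0, and any remainder is given to the first GPUs. It ignores NUMA topology, which a real mapping may need to take into account.

```python
def split_cores(total_cores, gpu_count):
    """Split cores 0..total_cores-1 into gpu_count contiguous (first, last) ranges."""
    if gpu_count <= 0 or total_cores < gpu_count:
        raise ValueError("need at least one core per GPU")
    base, extra = divmod(total_cores, gpu_count)
    ranges = []
    start = 0
    for i in range(gpu_count):
        size = base + (1 if i < extra else 0)  # first `extra` GPUs get one more core
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def gres_lines(total_cores, gpu_count, name="gpu"):
    """Format one gres.conf-style line per GPU, as in the examples above."""
    lines = []
    for idx, (first, last) in enumerate(split_cores(total_cores, gpu_count)):
        cpus = f"{first}-{last}" if last > first else str(first)
        lines.append(f"Name={name} File=/dev/nvidia{idx} CPUs={cpus}")
    return lines
```

With 64 cores and 4 GPUs this reproduces the ranges in the basic usage example (0-15, 16-31, 32-47, 48-63). With 10 cores and 3 GPUs the split is 4, 3, and 3 cores, which is the "roughly equal" case.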

Problems? Check that nvidia-smi -L shows your GPUs.
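
The same check can be scripted. The sketch below parses the usual `nvidia-smi -L` output, one line per device in the form `GPU <index>: <model> (UUID: GPU-...)`; that format is an assumption about the installed driver, and `parse_gpu_list` and `list_gpus` are hypothetical helper names rather than anything gres-gen provides.

```python
import re
import subprocess

# One line per physical GPU; MIG sub-device lines do not match and are skipped.
GPU_LINE = re.compile(r"^GPU (\d+): (.+?) \(UUID: (GPU-[0-9a-fA-F-]+)\)$")


def parse_gpu_list(text):
    """Return (index, model, uuid) tuples parsed from `nvidia-smi -L` output."""
    gpus = []
    for line in text.splitlines():
        match = GPU_LINE.match(line.strip())
        if match:
            gpus.append((int(match.group(1)), match.group(2), match.group(3)))
    return gpus


def list_gpus():
    """Run `nvidia-smi -L` and parse its output (requires the NVIDIA driver)."""
    result = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True, check=True
    )
    return parse_gpu_list(result.stdout)
```

If `list_gpus()` returns an empty list or raises an error, the driver or `nvidia-smi` itself is the likely place to look before suspecting the generated config.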
