Scheduling by slot #13292

Closed
learner321 opened this issue Jun 5, 2025 · 3 comments

@learner321

Reference: https://docs.open-mpi.org/en/main/launching-apps/scheduling.html#scheduling-overview

MPI VERSION: mpirun (Open MPI) 5.0.3

hostfile:

node0 slots=2 max_slots=20
node1 slots=2 max_slots=20

test.sh:

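#!/bin/bash
# Print where Open MPI placed this process, using the environment
# variables that mpirun exports to each launched process.

RANK=$OMPI_COMM_WORLD_RANK
LOCAL_RANK=$OMPI_COMM_WORLD_LOCAL_RANK
WORLD_SIZE=$OMPI_COMM_WORLD_SIZE

echo "$(hostname) rank=${RANK} local_rank=${LOCAL_RANK} world_size=${WORLD_SIZE}"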

cmd1: mpirun --hostfile hostfile -n 8 --map-by slot ./test.sh | sort
log1:

There are not enough slots available in the system to satisfy the 8
slots that were requested by the application:

  ./test.sh

Either request fewer procs for your application, or make more slots
available for use.

A "slot" is the PRRTE term for an allocatable unit where we can
launch a process.  The number of slots available are defined by the
environment in which PRRTE processes are run:

  1. Hostfile, via "slots=N" clauses (N defaults to number of
     processor cores if not provided)
  2. The --host command line parameter, via a ":N" suffix on the
     hostname (N defaults to 1 if not provided)
  3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
  4. If none of a hostfile, the --host command line parameter, or an
     RM is present, PRRTE defaults to the number of processor cores

In all the above cases, if you want PRRTE to default to the number
of hardware threads instead of the number of processor cores, use the
--use-hwthread-cpus option.

Alternatively, you can use the --map-by :OVERSUBSCRIBE option to ignore the
number of available slots when deciding the number of processes to
launch.

cmd2: mpirun --hostfile hostfile -n 8 --map-by slot:oversubscribe ./test.sh | sort
log2:

node0 rank=0 local_rank=0 world_size=8
node0 rank=1 local_rank=1 world_size=8
node0 rank=2 local_rank=2 world_size=8
node0 rank=3 local_rank=3 world_size=8
node1 rank=4 local_rank=0 world_size=8
node1 rank=5 local_rank=1 world_size=8
node1 rank=6 local_rank=2 world_size=8
node1 rank=7 local_rank=3 world_size=8

I would like to obtain the following placement instead. Thanks :)

node0 rank=0 local_rank=0 world_size=8
node0 rank=1 local_rank=1 world_size=8
node0 rank=4 local_rank=2 world_size=8
node0 rank=5 local_rank=3 world_size=8
node1 rank=2 local_rank=0 world_size=8
node1 rank=3 local_rank=1 world_size=8
node1 rank=6 local_rank=2 world_size=8
node1 rank=7 local_rank=3 world_size=8
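
The requested layout fills each node's 2 slots in rank order, then wraps back to the first node for the next block of ranks. A minimal sketch of that arithmetic follows, assuming every node has the same slot count. The names wrap_by_slot and nth_arg are hypothetical helpers, not Open MPI commands. The script only computes the table. It does not call mpirun, and this thread does not establish which mapping option, if any, produces this placement.

```shell
#!/bin/bash
# Hypothetical helpers (not part of Open MPI): compute the block-wise,
# wrap-around placement requested above. Assumes every node has the
# same number of slots.

# nth_arg K ARG... : print the K-th argument after K (1-based).
nth_arg() { shift "$1"; echo "$1"; }

# wrap_by_slot NP SLOTS_PER_NODE NODE...
wrap_by_slot() {
    local np slots n rank node_idx local_rank node
    np=$1
    slots=$2
    shift 2
    n=$#
    # Need at least one node and one slot per node.
    [ "$n" -gt 0 ] && [ "$slots" -gt 0 ] || return 1
    rank=0
    while [ "$rank" -lt "$np" ]; do
        # Block number (rank / slots) cycles over the nodes.
        node_idx=$(( (rank / slots) % n + 1 ))
        # Each completed pass over all nodes adds `slots` local ranks.
        local_rank=$(( (rank / (slots * n)) * slots + rank % slots ))
        node=$(nth_arg "$node_idx" "$@")
        echo "$node rank=$rank local_rank=$local_rank world_size=$np"
        rank=$((rank + 1))
    done
}
```

Running `wrap_by_slot 8 2 node0 node1 | sort` lists ranks 0, 1, 4, 5 on node0 and ranks 2, 3, 6, 7 on node1, matching the table above.
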
@rhc54
Contributor

rhc54 commented Jun 5, 2025

Your hostfile only stipulates 4 slots (2 on each of the 2 nodes), so asking for 8 procs oversubscribes what you stated. The "max_slots" entry specifies the absolute maximum number of procs allowed on the node, even when oversubscribe is specified.
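
The slot arithmetic in this reply can be checked with a small script. The sketch below sums the slots=N clauses in a hostfile and compares the total with a requested process count. The names count_slots and check_request are hypothetical helpers, not Open MPI commands. Entries without a slots= clause are ignored here, whereas Open MPI would default them to the core count, and max_slots is not considered.

```shell
#!/bin/bash
# Hypothetical helpers (not part of Open MPI): compare a requested
# process count with the slots declared in a hostfile.

# count_slots HOSTFILE : print the sum of all slots=N values.
count_slots() {
    awk '
        /^[ \t]*(#|$)/ { next }   # skip comments and blank lines
        {
            # Field 1 is the host name; the rest are key=value clauses.
            for (i = 2; i <= NF; i++) {
                split($i, kv, "=")
                if (kv[1] == "slots") total += kv[2]
            }
        }
        END { print total + 0 }
    ' "$1"
}

# check_request HOSTFILE NP : say whether NP procs fit in the declared slots.
check_request() {
    local declared
    declared=$(count_slots "$1")
    if [ "$2" -le "$declared" ]; then
        echo "fits: $2 procs <= $declared slots"
    else
        echo "oversubscribed: $2 procs > $declared slots"
    fi
}
```

With the hostfile from this issue, `check_request hostfile 8` prints `oversubscribed: 8 procs > 4 slots`, which is why cmd1 fails without the oversubscribe qualifier.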

@rhc54
Contributor

rhc54 commented Jun 5, 2025

Afraid I can't read that language 🤷‍♂

@learner321
Author

This has been solved. Closing.
