nvcastet commented Aug 8, 2024

`scontrol show config` issues RPC calls to the master node.
When Slurm is configured with per-user RPC rate limiting (`rl_enable`), the command can be throttled. Because it is called from a hook that runs once per rank, the throttling causes a large variance in rank start times.
This PR uses environment variables to get the information the PMI hook needs instead of calling `scontrol show config`.
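
For illustration only, here is a minimal Python sketch of the general idea: read a setting from the process environment first, so that no controller RPC is issued on the per-rank path. It is not the code in this PR. The names `lookup_setting`, `parse_scontrol_config`, and `query_controller`, and the variable `EXAMPLE_SPOOL_DIR`, are hypothetical, and falling back to `scontrol show config` when the variable is unset is an assumption of the sketch rather than something the PR states.

```python
import os
import subprocess


def parse_scontrol_config(text):
    """Parse 'Key = Value' lines into a dict (hypothetical helper)."""
    config = {}
    for line in text.splitlines():
        # Split on the first '=' only, so values containing '=' stay intact.
        key, sep, value = line.partition("=")
        if sep:
            config[key.strip()] = value.strip()
    return config


def query_controller():
    """Run `scontrol show config`; this is the call that triggers RPCs."""
    result = subprocess.run(
        ["scontrol", "show", "config"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_scontrol_config(result.stdout)


def lookup_setting(env_name, config_key, env=None, fallback=query_controller):
    """Prefer the environment; only query the controller if it is unset.

    env_name is a hypothetical variable name chosen by the caller.
    """
    env = os.environ if env is None else env
    value = env.get(env_name)
    if value:
        # Fast path: no subprocess and no RPC to the controller.
        return value
    # Assumed slow path: falls back to the rate-limited command.
    return fallback().get(config_key)
```

With `EXAMPLE_SPOOL_DIR` set in the job environment, `lookup_setting("EXAMPLE_SPOOL_DIR", "SlurmdSpoolDir")` returns without starting `scontrol`, so the per-rank cost no longer depends on the controller's rate limiter.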

CC @flx42 @3XX0

3XX0 (Member) commented Oct 24, 2024

I would have to look at it again, but there were good reasons not to do this.
