Can't request multiple GPUs when deploying on Runpod #3146

Closed
alxiang opened this issue Feb 11, 2024 · 1 comment · Fixed by #3291
Assignees: concretevitamin
Labels: bug (Something isn't working), clouds (Cloud support and cloud-specific features)


alxiang commented Feb 11, 2024

In sky/provision/runpod/utils.py, gpu_quantity is only used to compute min_vcpu_count and min_memory_in_gb; it is never passed on as the number of GPUs to request. So when I specify resources like:

resources:
  cloud: runpod
  accelerators: A100-80GB-SXM:2

I get only 1 A100 on Runpod instead of 2. Based on my understanding of Runpod's Python SDK, the proper way to specify the number of GPUs is the gpu_count parameter (see https://github.com/runpod/runpod-python/blob/9e11c994fcceba0d6c3cd35be87084a28d9426d3/runpod/api/ctl_commands.py#L119C12-L119C21).

I'm on runpod==1.6.0. It seems SkyPilot should be updated to pass gpu_count so that it is compatible with this SDK version?
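A minimal sketch of the kind of change this suggests, not the actual SkyPilot code: the helper `build_pod_kwargs`, its per-GPU defaults, and the pod name are hypothetical, and the GPU type string is just the accelerator name from the YAML above used as a placeholder. The only point is that `gpu_quantity` is forwarded as `gpu_count` in addition to scaling the vCPU and memory minimums.

```python
from typing import Any, Dict


def build_pod_kwargs(name: str,
                     gpu_type_id: str,
                     gpu_quantity: int,
                     vcpus_per_gpu: int = 4,
                     memory_gb_per_gpu: int = 80) -> Dict[str, Any]:
    """Hypothetical helper: assemble GPU-related arguments for pod creation.

    The per-GPU defaults are illustrative assumptions, not values taken
    from SkyPilot or Runpod.
    """
    if gpu_quantity < 1:
        raise ValueError("gpu_quantity must be at least 1")
    return {
        "name": name,
        "gpu_type_id": gpu_type_id,
        # The missing piece: actually request this many GPUs.
        "gpu_count": gpu_quantity,
        # These two were already scaled by gpu_quantity per the report.
        "min_vcpu_count": vcpus_per_gpu * gpu_quantity,
        "min_memory_in_gb": memory_gb_per_gpu * gpu_quantity,
    }


# Placeholder values; a real call would also need an image name and so on.
kwargs = build_pod_kwargs("example-pod", "A100-80GB-SXM", gpu_quantity=2)
print(kwargs["gpu_count"], kwargs["min_vcpu_count"], kwargs["min_memory_in_gb"])
```

Under these assumptions, the resulting dict would be merged with the remaining arguments and passed to the SDK's pod-creation call, which is where `gpu_count` takes effect.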

Michaelvll (Collaborator) commented

That is a great catch @alxiang! Thank you for reporting this! Would you like to submit a PR to fix this?

Michaelvll added the clouds and bug labels on Feb 11, 2024
concretevitamin self-assigned this on Mar 9, 2024