Closed
Labels: installation (Installation problems)
Description
Your current environment
Describe the bug
When running vLLM with an NVIDIA RTX 5090 GPU, I encountered the following error:
RuntimeError: CUDA error: no kernel image is available for execution on the device
From the logs, it appears that this PyTorch build does not support the RTX 5090's compute capability (sm_120).
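A quick way to verify the mismatch is to ask PyTorch which CUDA architectures its kernels were compiled for (both calls are standard torch APIs):

```bash
# Print the GPU's compute capability and the CUDA architectures this
# PyTorch wheel was compiled for. On the RTX 5090 the capability is
# (12, 0); if 'sm_120' is absent from the arch list, the
# "no kernel image" error above is expected.
python -c "import torch; print(torch.cuda.get_device_capability(0)); print(torch.cuda.get_arch_list())"
```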
To Reproduce
- Use RTX 5090 GPU
- Install vLLM with Docker or system Python environment
- Launch the vLLM OpenAI API server (a sketch of the command follows this list)
- The engine fails to start due to the CUDA kernel compatibility issue
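For illustration, the launch was along these lines; the model name is only a placeholder, not the exact model I used:

```bash
# Hypothetical launch command (placeholder model). Startup crashes with
# the CUDA "no kernel image" error before the server accepts requests.
python -m vllm.entrypoints.openai.api_server \
    --model meta-llama/Llama-3.1-8B-Instruct \
    --port 8000
```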
Environment
- GPU: NVIDIA GeForce RTX 5090
- CUDA Driver Version: 12.8
- CUDA Toolkit: 12.8.93
- NVIDIA Driver: 570.124.06
- PyTorch Version: 2.x (installed via pip)
- vLLM Version: Latest (from PyPI)
- Python Version: 3.10
- OS: Ubuntu 22.04
Additional Context
It seems that the RTX 5090 uses a new compute capability (sm_120), which is currently not supported in the stable PyTorch build I'm using.
Is there a recommended way to run vLLM with this GPU? Should I:
- Switch to a nightly PyTorch build that supports sm_120 (sketched after this list)?
- Build PyTorch from source with `TORCH_CUDA_ARCH_LIST="12.0"`?
- Wait for official support from PyTorch?
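For the first option, I assume the install would look roughly like this, using PyTorch's nightly wheel index for CUDA 12.8 (assuming those wheels ship sm_120 kernels):

```bash
# Sketch of the nightly-build option: install a PyTorch nightly built
# against CUDA 12.8, which should include sm_120 kernels.
pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu128
```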
Any guidance or workaround would be greatly appreciated. Thanks!
How you are installing vllm
```bash
pip install -vvv vllm
```
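In case a source build of vLLM is needed as well, I'd expect it to look roughly like the sketch below; this assumes the active PyTorch already supports sm_120 and that the build honors `TORCH_CUDA_ARCH_LIST`:

```bash
# Sketch of a from-source vLLM install, restricting compiled kernels to
# the RTX 5090's architecture (assumption: TORCH_CUDA_ARCH_LIST is
# honored by the build, and the installed torch supports sm_120).
git clone https://github.com/vllm-project/vllm.git
cd vllm
TORCH_CUDA_ARCH_LIST="12.0" pip install -e .
```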
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.