
Commit c3a672d

mgoin authored and minpeter committed
[Doc] Update quickstart and install for cu128 using --torch-backend=auto (vllm-project#18505)
Signed-off-by: mgoin <[email protected]>
Signed-off-by: minpeter <[email protected]>
1 parent 9bd8a0b commit c3a672d

3 files changed: 33 additions & 35 deletions

docs/source/getting_started/installation/gpu/cuda.inc.md

Lines changed: 24 additions & 16 deletions
````diff
@@ -1,6 +1,6 @@
 # Installation
 
-vLLM contains pre-compiled C++ and CUDA (12.6) binaries.
+vLLM contains pre-compiled C++ and CUDA (12.8) binaries.
 
 ## Requirements
 
@@ -23,18 +23,26 @@ Therefore, it is recommended to install vLLM with a **fresh new** environment. I
 You can install vLLM using either `pip` or `uv pip`:
 
 ```console
-# Install vLLM with CUDA 12.6.
-pip install vllm # If you are using pip.
-uv pip install vllm # If you are using uv.
+# Install vLLM with CUDA 12.8.
+# If you are using pip.
+pip install vllm --extra-index-url https://download.pytorch.org/whl/cu128
+# If you are using uv.
+uv pip install vllm --torch-backend=auto
 ```
 
-As of now, vLLM's binaries are compiled with CUDA 12.6 and public PyTorch release versions by default. We also provide vLLM binaries compiled with CUDA 12.8, 11.8, and public PyTorch release versions:
+We recommend leveraging `uv` to [automatically select the appropriate PyTorch index at runtime](https://docs.astral.sh/uv/guides/integration/pytorch/#automatic-backend-selection) by inspecting the installed CUDA driver version via `--torch-backend=auto` (or `UV_TORCH_BACKEND=auto`). To select a specific backend (e.g., `cu126`), set `--torch-backend=cu126` (or `UV_TORCH_BACKEND=cu126`). If this doesn't work, try running `uv self update` to update `uv` first.
+
+:::{note}
+NVIDIA Blackwell GPUs (B200, GB200) require a minimum of CUDA 12.8, so make sure you are installing PyTorch wheels with at least that version. PyTorch itself offers a [dedicated interface](https://pytorch.org/get-started/locally/) to determine the appropriate pip command to run for a given target configuration.
+:::
+
+As of now, vLLM's binaries are compiled with CUDA 12.8 and public PyTorch release versions by default. We also provide vLLM binaries compiled with CUDA 12.6, 11.8, and public PyTorch release versions:
 
 ```console
 # Install vLLM with CUDA 11.8.
 export VLLM_VERSION=0.6.1.post1
-export PYTHON_VERSION=310
-pip install https://github.com/vllm-project/vllm/releases/download/v${VLLM_VERSION}/vllm-${VLLM_VERSION}+cu118-cp${PYTHON_VERSION}-cp${PYTHON_VERSION}-manylinux1_x86_64.whl --extra-index-url https://download.pytorch.org/whl/cu118
+export PYTHON_VERSION=312
+uv pip install https://github.com/vllm-project/vllm/releases/download/v${VLLM_VERSION}/vllm-${VLLM_VERSION}+cu118-cp${PYTHON_VERSION}-cp${PYTHON_VERSION}-manylinux1_x86_64.whl --extra-index-url https://download.pytorch.org/whl/cu118
 ```
 
 (install-the-latest-code)=
@@ -51,30 +59,30 @@ pip install -U vllm --pre --extra-index-url https://wheels.vllm.ai/nightly
 
 `--pre` is required for `pip` to consider pre-released versions.
 
-If you want to access the wheels for previous commits (e.g. to bisect the behavior change, performance regression), due to the limitation of `pip`, you have to specify the full URL of the wheel file by embedding the commit hash in the URL:
+Another way to install the latest code is to use `uv`:
 
 ```console
-export VLLM_COMMIT=33f460b17a54acb3b6cc0b03f4a17876cff5eafd # use full commit hash from the main branch
-pip install https://wheels.vllm.ai/${VLLM_COMMIT}/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl
+uv pip install -U vllm --torch-backend=auto --extra-index-url https://wheels.vllm.ai/nightly
 ```
 
-Note that the wheels are built with Python 3.8 ABI (see [PEP 425](https://peps.python.org/pep-0425/) for more details about ABI), so **they are compatible with Python 3.8 and later**. The version string in the wheel file name (`1.0.0.dev`) is just a placeholder to have a unified URL for the wheels, the actual versions of wheels are contained in the wheel metadata (the wheels listed in the extra index url have correct versions). Although we don't support Python 3.8 any more (because PyTorch 2.5 dropped support for Python 3.8), the wheels are still built with Python 3.8 ABI to keep the same wheel name as before.
-
-##### Install the latest code using `uv`
+##### Install specific revisions using `pip`
 
-Another way to install the latest code is to use `uv`:
+If you want to access the wheels for previous commits (e.g. to bisect the behavior change, performance regression), due to the limitation of `pip`, you have to specify the full URL of the wheel file by embedding the commit hash in the URL:
 
 ```console
-uv pip install -U vllm --extra-index-url https://wheels.vllm.ai/nightly
+export VLLM_COMMIT=33f460b17a54acb3b6cc0b03f4a17876cff5eafd # use full commit hash from the main branch
+pip install https://wheels.vllm.ai/${VLLM_COMMIT}/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl
 ```
 
+Note that the wheels are built with Python 3.8 ABI (see [PEP 425](https://peps.python.org/pep-0425/) for more details about ABI), so **they are compatible with Python 3.8 and later**. The version string in the wheel file name (`1.0.0.dev`) is just a placeholder to have a unified URL for the wheels, the actual versions of wheels are contained in the wheel metadata (the wheels listed in the extra index url have correct versions). Although we don't support Python 3.8 any more (because PyTorch 2.5 dropped support for Python 3.8), the wheels are still built with Python 3.8 ABI to keep the same wheel name as before.
+
 ##### Install specific revisions using `uv`
 
 If you want to access the wheels for previous commits (e.g. to bisect the behavior change, performance regression), you can specify the commit hash in the URL:
 
 ```console
 export VLLM_COMMIT=72d9c316d3f6ede485146fe5aabd4e61dbc59069 # use full commit hash from the main branch
-uv pip install vllm --extra-index-url https://wheels.vllm.ai/${VLLM_COMMIT}
+uv pip install vllm --torch-backend=auto --extra-index-url https://wheels.vllm.ai/${VLLM_COMMIT}
 ```
 
 The `uv` approach works for vLLM `v0.6.6` and later and offers an easy-to-remember command. A unique feature of `uv` is that packages in `--extra-index-url` have [higher priority than the default index](https://docs.astral.sh/uv/pip/compatibility/#packages-that-exist-on-multiple-indexes). If the latest public release is `v0.6.6.post1`, `uv`'s behavior allows installing a commit before `v0.6.6.post1` by specifying the `--extra-index-url`. In contrast, `pip` combines packages from `--extra-index-url` and the default index, choosing only the latest version, which makes it difficult to install a development version prior to the released version.
````
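For reference, the two install paths this file now documents can be sketched together as follows (a minimal sketch built only from the commands in the diff above; `--torch-backend` requires a recent `uv` release, so run `uv self update` first if the flag is not recognized):

```console
# pip path: pull CUDA 12.8 wheels via the explicit PyTorch index.
pip install vllm --extra-index-url https://download.pytorch.org/whl/cu128

# uv path: let uv inspect the installed CUDA driver and pick the index...
uv pip install vllm --torch-backend=auto
# ...or pin a specific backend explicitly, e.g. CUDA 12.6.
uv pip install vllm --torch-backend=cu126
# The same selection is available through an environment variable.
UV_TORCH_BACKEND=auto uv pip install vllm
```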
Lines changed: 1 addition & 14 deletions
````diff
@@ -1,19 +1,6 @@
-You can create a new Python environment using [conda](https://docs.conda.io/projects/conda/en/stable/user-guide/getting-started.html):
+It's recommended to use [uv](https://docs.astral.sh/uv/), a very fast Python environment manager, to create and manage Python environments. Please follow the [documentation](https://docs.astral.sh/uv/#getting-started) to install `uv`. After installing `uv`, you can create a new Python environment and install vLLM using the following commands:
 
 ```console
-# (Recommended) Create a new conda environment.
-conda create -n vllm python=3.12 -y
-conda activate vllm
-```
-
-:::{note}
-[PyTorch has deprecated the conda release channel](https://github.com/pytorch/pytorch/issues/138506). If you use `conda`, please only use it to create Python environment rather than installing packages.
-:::
-
-Or you can create a new Python environment using [uv](https://docs.astral.sh/uv/), a very fast Python environment manager. Please follow the [documentation](https://docs.astral.sh/uv/#getting-started) to install `uv`. After installing `uv`, you can create a new Python environment using the following command:
-
-```console
-# (Recommended) Create a new uv environment. Use `--seed` to install `pip` and `setuptools` in the environment.
 uv venv --python 3.12 --seed
 source .venv/bin/activate
 ```
````
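Taken together, the replacement text reduces environment setup to three commands (a sketch assuming `uv` is already installed; `--seed` preinstalls `pip` and `setuptools` into the new environment, as the removed comment noted):

```console
uv venv --python 3.12 --seed               # create the environment
source .venv/bin/activate                  # activate it
uv pip install vllm --torch-backend=auto   # install vLLM against the detected CUDA driver
```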

docs/source/getting_started/quickstart.md

Lines changed: 8 additions & 5 deletions
````diff
@@ -21,25 +21,28 @@ It's recommended to use [uv](https://docs.astral.sh/uv/), a very fast Python env
 ```console
 uv venv --python 3.12 --seed
 source .venv/bin/activate
-uv pip install vllm
+uv pip install vllm --torch-backend=auto
 ```
 
-Another delightful way is to use `uv run` with `--with [dependency]` option, which allows you to run commands such as `vllm serve` without creating an environment:
+`uv` can [automatically select the appropriate PyTorch index at runtime](https://docs.astral.sh/uv/guides/integration/pytorch/#automatic-backend-selection) by inspecting the installed CUDA driver version via `--torch-backend=auto` (or `UV_TORCH_BACKEND=auto`). To select a specific backend (e.g., `cu126`), set `--torch-backend=cu126` (or `UV_TORCH_BACKEND=cu126`).
+
+Another delightful way is to use `uv run` with `--with [dependency]` option, which allows you to run commands such as `vllm serve` without creating any permanent environment:
 
 ```console
 uv run --with vllm vllm --help
 ```
 
-You can also use [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/getting-started.html) to create and manage Python environments.
+You can also use [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/getting-started.html) to create and manage Python environments. You can install `uv` to the conda environment through `pip` if you want to manage it within the environment.
 
 ```console
 conda create -n myenv python=3.12 -y
 conda activate myenv
-pip install vllm
+pip install --upgrade uv
+uv pip install vllm --torch-backend=auto
 ```
 
 :::{note}
-For non-CUDA platforms, please refer [here](#installation-index) for specific instructions on how to install vLLM.
+For more detail and non-CUDA platforms, please refer [here](#installation-index) for specific instructions on how to install vLLM.
 :::
 
 (quickstart-offline)=
````
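As a usage sketch of the ephemeral `uv run` flow the quickstart now highlights (the `serve` invocation and model name below are illustrative, not part of this diff):

```console
# Run vLLM commands without creating a persistent environment.
uv run --with vllm vllm --help
# Hypothetical: serve a model the same way (substitute any model ID).
uv run --with vllm vllm serve Qwen/Qwen2.5-1.5B-Instruct
```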
