Update release pipeline post PyTorch 2.8.0 update #23960
Conversation
Signed-off-by: Huy Do <[email protected]>
Code Review
This pull request correctly updates the release pipeline to replace the deprecated CUDA 11.8 build with a CUDA 12.9 build, following the PyTorch 2.8.0 update. It also addresses an arm64 build failure by adding libnuma-dev. However, a critical dependency is missing from the final runtime image stage in the Dockerfile, which will likely lead to runtime errors. The libnuma-dev package needs to be added to the vllm-base stage as well.
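As a sketch, the fix the review is asking for might look like the following Dockerfile fragment. The stage name vllm-base comes from the review above; the base image and surrounding context are assumptions, not the actual vLLM Dockerfile:

```dockerfile
# Hypothetical final runtime stage. libnuma-dev must be installed here too,
# not only in the builder stage, or the runtime image will be missing libnuma.
FROM nvidia/cuda:12.9.1-base-ubuntu22.04 AS vllm-base

RUN apt-get update -y \
    && apt-get install -y --no-install-recommends libnuma-dev \
    && rm -rf /var/lib/apt/lists/*
```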
Does this PR mean we can expect vLLM nightlies built against 2.8.0 very soon? :) It might also be good to have a v0.10.1.2 release built against 2.8.0, since v0.10.1.1 was released when 2.8.0 was already out, and PyTorch 2.8.0 has an important fix (#18851 (comment)). So it would be very beneficial to have a proper vLLM release built against 2.8.0.
Yup, once this lands, the vLLM nightly (and the next vLLM release) wheels will be built on PyTorch 2.8.0.
Nit: if you are OK with x86_64, the "nightlies" are already there: https://gallery.ecr.aws/q9t5s3a7/vllm-release-repo. Update: a direct wheel can be obtained from a URL like the one in #20358 (comment).
Are there any plans for a service release built against 2.8.0, e.g. v0.10.1.2 or v0.10.2? That would be exactly the v0.10.1.1 code, but built against 2.8.0.
It seems there are Docker images, right? I'm looking for S3/HTTP-published .whl files.
Yes, I can see vllm-0.10.1rc2.dev371+g67c14906a-cp38-abi3-manylinux1_x86_64.whl from a build job that uploaded it to S3. Give it a try? Update: #20358 (comment)
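For readers hunting for the right wheel: the filename above encodes compatibility per PEP 427. A small stdlib-only sketch (the parsing helper is mine, not part of vLLM; the filename is taken verbatim from the comment above) shows how to pull the tags apart:

```python
def parse_wheel_filename(name: str) -> dict:
    """Split a wheel filename into its PEP 427 components.

    Layout: {dist}-{version}(-{build})?-{python}-{abi}-{platform}.whl
    """
    stem = name[: -len(".whl")]
    parts = stem.split("-")
    # The last three dash-separated fields are always python/abi/platform tags.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "dist": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }

info = parse_wheel_filename(
    "vllm-0.10.1rc2.dev371+g67c14906a-cp38-abi3-manylinux1_x86_64.whl"
)
# cp38-abi3 means the wheel works on any CPython >= 3.8 via the stable ABI,
# but the platform tag pins it to x86_64 Linux -- it will not install on aarch64.
```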
There should probably be some fresh builds against 2.8.0 from this PR. I still propose having a service release, as it could also provide feedback to PyTorch on any perf regressions.
LGTM
Signed-off-by: Huy Do <[email protected]>
There is an issue on the PyTorch side to align the CUDA+PyTorch build matrix across x86 and aarch64.
I noticed that we have a discrepancy in the aarch64 Docker image build:
x86 is on CUDA 12.8.1
aarch64 would be on 12.9.1 with this PR
cc @nvpohanh
@simon-mo Should we align the CUDA version used between the x86 images and the aarch64 images?
I realized that for the PyTorch project, the aarch64 binary wheel for the v2.8.0 release was only available for CUDA 12.9. That is probably why we had to use cu129 for the ARM container.
Yes, so my question is: should we also upgrade the CUDA version in the x86 Docker images so that the x86 and aarch64 images have the same CUDA version? Otherwise it is kind of odd that the same vLLM release ships images with different CUDA versions on different archs.
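The mismatch under discussion can be stated concretely. A minimal sketch (the mapping and helper names are mine; the version numbers are taken from the thread above):

```python
# CUDA versions per architecture for the release images being discussed:
# x86_64 stays on 12.8.1, while this PR moves aarch64 to 12.9.1.
IMAGE_CUDA = {
    "x86_64": "12.8.1",
    "aarch64": "12.9.1",
}

def distinct_cuda_versions(image_cuda: dict) -> set:
    """Return the set of distinct CUDA versions across architectures.

    A single release is "aligned" only when every arch image is built
    against the same CUDA version, i.e. the set has exactly one element.
    """
    return set(image_cuda.values())

versions = distinct_cuda_versions(IMAGE_CUDA)
aligned = len(versions) == 1  # False here: the archs diverge
```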
This one looks better than #24020, as it also takes care of the wheel-uploading part.
Please resolve some comments I left. Thanks!
Seeing the b72ebd5 aarch64 image from this PR as well (on https://gallery.ecr.aws/q9t5s3a7/vllm-release-repo). Nice!
Purpose
This is the second part after #20358. This PR does 3 things:
- Adds libnuma-dev to fix the arm64 build https://buildkite.com/vllm/release/builds/7768#0198f57a-b3ef-4861-8528-97ce129f5c03/114-5868

Test Plan
CI https://buildkite.com/vllm/release/builds/7784
cc @simon-mo @khluu @seemethere