-
-
Notifications
You must be signed in to change notification settings - Fork 10.5k
[V0 deprecation] Remove _set_default_args_v0
function
#25409
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Isotr0py <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request removes the _set_default_args_v0
function and related logic as part of the V0 engine deprecation. The changes are straightforward, renaming _set_default_args_v1
to _set_default_args
and simplifying the call site. However, the removal of V0 components appears to be incomplete in other parts of the codebase. I've pointed out a critical issue where setting VLLM_USE_V1=0
could lead to a runtime crash due to lingering references to V0 components in platform-specific files. This should be addressed to ensure the V0 removal is clean and does not introduce new bugs.
# Set default arguments for V1 Engine. | ||
self._set_default_args(usage_context, model_config) | ||
# Disable chunked prefill for POWER (ppc64le)/ARM/s390x CPUs in V1 | ||
if current_platform.is_cpu() and current_platform.get_cpu_architecture( | ||
) in (CpuArchEnum.POWERPC, CpuArchEnum.S390X, CpuArchEnum.ARM): | ||
logger.info("Chunked prefill is not supported for ARM and POWER " | ||
"and S390X CPUs; " | ||
"disabling it for V1 backend.") | ||
self.enable_chunked_prefill = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While this change correctly removes the V0-specific default arguments, the removal of V0 components seems incomplete across the codebase. Several platform-specific files still contain logic that depends on VLLM_USE_V1
and might attempt to use V0 components, which are presumably removed. This could lead to runtime errors if a user sets VLLM_USE_V1=0
.
For example, in vllm/platforms/cuda.py
and vllm/platforms/rocm.py
, the code still selects vllm.worker.worker.Worker
if VLLM_USE_V1
is false. If this V0 worker has been removed, this will cause a crash.
It is recommended to update these files to either always use the V1 worker or raise an error if VLLM_USE_V1=0
, similar to the approach in cpu.py
, xpu.py
, and tpu.py
which explicitly disallow VLLM_USE_V1=0
.
…#25409) Signed-off-by: Isotr0py <[email protected]>
…#25409) Signed-off-by: Isotr0py <[email protected]>
…#25409) Signed-off-by: Isotr0py <[email protected]> Signed-off-by: charlifu <[email protected]>
Signed-off-by: Isotr0py <[email protected]> Signed-off-by: yewentao256 <[email protected]>
…#25409) Signed-off-by: Isotr0py <[email protected]> Signed-off-by: gaojc <[email protected]>
…#25409) Signed-off-by: Isotr0py <[email protected]> Signed-off-by: xuebwang-amd <[email protected]>
…#25409) Signed-off-by: Isotr0py <[email protected]>
Purpose
_set_default_args_v0
since v0 engine has been removedTest Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.md
andexamples
for a new model.