[Bugfix] Fix packed_factor missing attribute error #23902
Conversation
Code Review

This pull request correctly addresses an AttributeError by renaming pack_factor to packed_factor in several locations. The changes are accurate and align with the recent refactoring.

However, the fix appears to be incomplete. I've identified another instance of the old attribute name pack_factor that was missed in vllm/model_executor/layers/linear.py.

Specifically, in the QKVParallelLinear.weight_loader method, lines 1110-1111 still use param.pack_factor. This will likely cause the same AttributeError this PR aims to solve.

To fully resolve the issue, please update these lines as well:

    # In vllm/model_executor/layers/linear.py, starting at line 1109
    if packed_dim == output_dim:
        shard_size = shard_size // param.packed_factor
        shard_offset = shard_offset // param.packed_factor

With this addition, the bugfix will be complete.
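To see why the division by the pack factor matters, here is a minimal, self-contained sketch of the shard-size adjustment logic. The names (PackedParam, adjust_shard) are hypothetical stand-ins for illustration, not vLLM's actual classes:

```python
# Hypothetical simplified illustration; vLLM's real PackedvLLMParameter differs.
class PackedParam:
    def __init__(self, packed_factor: int, output_dim: int, packed_dim: int):
        self.packed_factor = packed_factor  # e.g. 8 int4 values per int32 word
        self.output_dim = output_dim
        self.packed_dim = packed_dim


def adjust_shard(param: PackedParam, shard_size: int, shard_offset: int):
    # When the shard dimension is the packed dimension, sizes and offsets
    # are counted in packed storage units, so divide by packed_factor.
    if param.packed_dim == param.output_dim:
        shard_size //= param.packed_factor
        shard_offset //= param.packed_factor
    return shard_size, shard_offset


# 4096 logical rows packed 8-to-1 occupy 512 storage rows.
print(adjust_shard(PackedParam(8, 0, 0), 4096, 8192))  # → (512, 1024)
```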
@yaochengji, can you look into this PR?
@dsikka gentle ping for review.
Signed-off-by: Kyuyeun Kim <[email protected]>
Addressed this as well.
LGTM, thanks for fixing this!
Signed-off-by: 子悬 <[email protected]>
Purpose

With the introduction of PackedvLLMParameter, pack_factor was renamed to packed_factor in #5874. However, some parts of the code still try to access pack_factor and trigger an AttributeError. This PR fixes the error.

Test Plan

Test Result
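The failure mode the PR fixes can be reproduced with a minimal stand-in class (a hypothetical simplification, not vLLM's actual PackedvLLMParameter): after the rename, any code still reading the old attribute name raises AttributeError.

```python
# Hypothetical simplified stand-in; vLLM's real PackedvLLMParameter differs.
class PackedvLLMParameter:
    def __init__(self, packed_factor: int):
        # The attribute was renamed from `pack_factor` to `packed_factor`.
        self.packed_factor = packed_factor


param = PackedvLLMParameter(8)
print(param.packed_factor)             # new name works: 8
print(hasattr(param, "pack_factor"))   # old name is gone: False
```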
Essential Elements of an Effective PR Description Checklist

supported_models.md and examples for a new model.