
Conversation

@kyuyeunk (Contributor) commented Aug 29, 2025

Purpose

With the introduction of PackedvLLMParameter, pack_factor was renamed to packed_factor in #5874. However, some parts of the code still try to access pack_factor, which triggers an AttributeError. This PR fixes that error.
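
For context, a minimal sketch of the failure mode (hypothetical class, not the actual PackedvLLMParameter implementation): once the parameter only exposes packed_factor, any caller still reading pack_factor raises an AttributeError.

# Hypothetical sketch of the rename-induced failure; the real parameter
# class lives in vLLM's model_executor code and is not reproduced here.
class PackedParam:
    def __init__(self, packed_factor: int):
        self.packed_factor = packed_factor  # new attribute name after the rename

param = PackedParam(packed_factor=8)
print(param.packed_factor)  # 8 -> works with the new name
try:
    param.pack_factor  # old name no longer exists
except AttributeError as err:
    print(f"AttributeError: {err}")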

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@gemini-code-assist bot left a comment

Code Review

This pull request correctly addresses an AttributeError by renaming pack_factor to packed_factor in several locations. The changes are accurate and align with the recent refactoring.

However, the fix appears to be incomplete. I've identified another instance of the old attribute name pack_factor that was missed in vllm/model_executor/layers/linear.py.

Specifically, in the QKVParallelLinear.weight_loader method, lines 1110-1111 still use param.pack_factor. This will likely cause the same AttributeError this PR aims to solve.

To fully resolve the issue, please update these lines as well:

# In vllm/model_executor/layers/linear.py, starting at line 1109
if packed_dim == output_dim:
    shard_size = shard_size // param.packed_factor
    shard_offset = shard_offset // param.packed_factor

With this addition, the bugfix will be complete.
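
For illustration, a minimal sketch (hypothetical numbers and helper, not taken from this PR or the vLLM source) of why the packed dimension is divided by packed_factor: each stored element holds packed_factor logical weights, so shard sizes and offsets along that dimension shrink by the same factor.

# Hypothetical sketch mirroring the pattern in the snippet above;
# not the vLLM implementation.
def packed_shard(shard_size: int, shard_offset: int, packed_factor: int):
    # Each stored element packs `packed_factor` logical weights, so shard
    # bookkeeping in the packed dimension shrinks by the same factor.
    return shard_size // packed_factor, shard_offset // packed_factor

# e.g. 4-bit weights packed into 32-bit storage -> packed_factor == 8
size, offset = packed_shard(shard_size=4096, shard_offset=8192, packed_factor=8)
print(size, offset)  # 512 1024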

@kyuyeunk force-pushed the fix_packed_factor branch 3 times, most recently from 55c0125 to 6678302 on August 29, 2025 at 21:33
@kyuyeunk (Contributor, Author) commented

@yaochengji, can you look into this PR?

@yaochengji added the ready label (ONLY add when PR is ready to merge / full CI is needed) on Aug 29, 2025
@yaochengji (Collaborator) commented

@dsikka could you take a look at this PR? The issue originates from #5874.

@kyuyeunk force-pushed the fix_packed_factor branch 4 times, most recently from 79503ff to 1b3a5df on September 2, 2025 at 06:35
@kyuyeunk (Contributor, Author) commented Sep 2, 2025

@dsikka gentle ping for review.

@kyuyeunk (Contributor, Author) commented Sep 2, 2025

Regarding the review note above about the remaining param.pack_factor usages in QKVParallelLinear.weight_loader (lines 1110-1111):

Addressed this as well.

@yaochengji (Collaborator) left a comment

LGTM, thanks for fixing this!

@yaochengji merged commit 9480ae2 into vllm-project:main on Sep 2, 2025
39 checks passed
845473182 pushed a commit to 845473182/vllm that referenced this pull request Sep 3, 2025
akaihaoshuai pushed a commit to akaihaoshuai/vllm that referenced this pull request Sep 3, 2025
@kyuyeunk deleted the fix_packed_factor branch on September 4, 2025 at 03:41
eicherseiji pushed a commit to eicherseiji/vllm that referenced this pull request Sep 9, 2025
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025
