[Quant] SupportsQuant handles ignored_modules #14635

kylesayrs · 2025-03-11T19:31:57Z

Purpose

Properly map ignored_modules for quantization configs that belong to nested models or to models which require a hf_to_vllm_mapper.
Standardize on the QuantizationConfig.ignored_modules attribute.
The logic for remapping according to hf_to_vllm_mapper is very similar to the logic for remapping according to packed_modules_mapping, and both are specifically related to supporting quantization.
All model quantization-related logic should therefore be handled by the SupportsQuant mixin.
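
To illustrate why the packed-module case resembles the prefix-remapping case, here is a minimal sketch of resolving an ignore list against fused modules. It is not code from this PR: the helper name is_fused_module_ignored and the example module names are hypothetical, and the mapping only mirrors the general shape of packed_modules_mapping (fused name to shard names).

```python
from typing import Dict, Iterable, List

# Illustrative mapping in the general shape of packed_modules_mapping:
# fused module name -> checkpoint shard names it is built from.
PACKED_MODULES_MAPPING: Dict[str, List[str]] = {
    "qkv_proj": ["q_proj", "k_proj", "v_proj"],
    "gate_up_proj": ["gate_proj", "up_proj"],
}


def is_fused_module_ignored(
    fused_name: str,
    ignored: Iterable[str],
    packed_mapping: Dict[str, List[str]],
) -> bool:
    """Hypothetical helper: decide whether a (possibly fused) module is ignored.

    A fused module counts as ignored only if every shard it packs is ignored.
    A partially ignored fused module is treated as an error in this sketch.
    """
    ignored_set = set(ignored)
    # Split "a.b.qkv_proj" into parent "a.b" and leaf "qkv_proj".
    parent, _, leaf = fused_name.rpartition(".")
    shards = packed_mapping.get(leaf)
    if shards is None:
        # Not a fused module: a plain membership check is enough.
        return fused_name in ignored_set
    shard_names = [f"{parent}.{s}" if parent else s for s in shards]
    flags = [name in ignored_set for name in shard_names]
    if any(flags) and not all(flags):
        raise ValueError(f"{fused_name} is only partially ignored: {shard_names}")
    return all(flags)
```

In both cases the ignore list is written in checkpoint names while the model is built with different names, so the list has to be translated before it is consulted.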

Changes

SupportsQuant now modifies the quant_config.ignored_modules attribute to account for the relevant hf_to_vllm_mapper
Standardize bitsandbytes to use the ignored_modules attribute instead of llm_int8_skip_modules
Standardize awq to use the ignored_modules attribute instead of modules_to_not_convert
Add SupportsQuant to qwen2_5_vl
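
As a rough sketch of the first change, the remapping can be thought of as a prefix rewrite applied to each entry of the ignore list. This is not the implementation in this PR: remap_ignored_modules is a hypothetical helper, the mapper is modeled as a plain dict of prefixes rather than the real hf_to_vllm_mapper object, and the prefixes in the usage note are made up for illustration.

```python
from typing import Dict, List


def remap_ignored_modules(ignored: List[str], prefix_map: Dict[str, str]) -> List[str]:
    """Hypothetical helper: rewrite checkpoint-style module names to model-style names.

    prefix_map maps an old name prefix to its replacement. When several
    prefixes match, the longest one wins. Names with no matching prefix are
    returned unchanged.
    """
    # Try longer prefixes first so "model.vision." beats "model.".
    ordered_prefixes = sorted(prefix_map, key=len, reverse=True)
    remapped: List[str] = []
    for name in ignored:
        for old_prefix in ordered_prefixes:
            if name.startswith(old_prefix):
                name = prefix_map[old_prefix] + name[len(old_prefix):]
                break
        remapped.append(name)
    return remapped
```

For example, with the assumed mapping {"model.": "language_model.model."}, the entry "model.layers.0.mlp.down_proj" becomes "language_model.model.layers.0.mlp.down_proj", while an entry such as "visual.merger.mlp.0" is left as-is because no prefix matches.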

Testing

Before this change, the following script failed to account for hf_to_vllm_mapper when determining ignored modules:

import vllm
llm = vllm.LLM(
    "unsloth/Qwen2.5-VL-72B-Instruct-unsloth-bnb-4bit",
    max_model_len=3200,
    quantization="bitsandbytes",
    load_format="bitsandbytes",
    trust_remote_code=True,
)

Signed-off-by: Kyle Sayers <[email protected]>

github-actions · 2025-03-11T19:32:08Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀


mergify · 2025-06-11T05:49:41Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @kylesayrs.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

kylesayrs · 2025-06-24T22:34:22Z

Closing in favor of #20046.

The solution in #20046 is a little better: it can be implemented more incrementally, and it can update more than just the ignored list.

WIP

dba7650

Signed-off-by: Kyle Sayers <[email protected]>

kylesayrs mentioned this pull request Mar 11, 2025

[Bugfix] Fix quantization skip modules logic #13562

Closed

2 tasks

kylesayrs added 3 commits March 11, 2025 15:52

WIP: circular imports, types

e90431c

Signed-off-by: Kyle Sayers <[email protected]>

Merge remote-tracking branch 'upstream/main' into kylesayrs/supportsQ…

d5797b3

…uant-ignored-modules

add bits and bytes and qwen_2_5_vl

76f1ae2

Signed-off-by: Kyle Sayers <[email protected]>

kylesayrs marked this pull request as ready for review March 26, 2025 20:22

kylesayrs requested review from mgoin, robertgshaw2-redhat and tlrmchlsmth as code owners March 26, 2025 20:22

kylesayrs changed the title ~~[WIP] [Quant] SupportsQuant handles ignored_modules~~ [Quant] SupportsQuant handles ignored_modules Mar 26, 2025

This was referenced Mar 26, 2025

[SupportsQuant] Bert, Blip, Blip2, Bloom #15573

Merged

[SupportsQuant] Chameleon, Chatglm, Commandr #15952

Merged

mgoin self-assigned this Apr 8, 2025

mergify bot added the needs-rebase label Jun 11, 2025

mergify bot added the qwen Related to Qwen models label Jun 19, 2025

kylesayrs closed this Jun 24, 2025
