[FEATURE] Add `dynamic` support for AutoRound quantization #329

@wenhuach21 GPTQModel has merged `dynamic` per-layer/module control of quantization, but I don't think auto-round currently supports such per-layer/module control during quantization. I know this is something AutoRound also wants. Is there any way we can work together to standardize the data interface used to transfer the `dynamic` info to auto-round? Since this feature is new, I am open to changing the protocol within GPTQModel itself if AutoRound has better suggestions. Thanks.

https://github.com/ModelCloud/GPTQModel/blob/main/tests/test_dynamic.py

Ref: `dynamic` inference port to vLLM (will be ported to SGLang after the vLLM merge): https://github.com/vllm-project/vllm/pull/7086

Both the quantizers (GPTQModel and AutoRound) and the inference libraries (vLLM, SGLang) need to receive the per-layer/module `dynamic` overrides. It would be nice if everyone could agree on something close or similar to avoid compatibility issues.
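
To make the discussion concrete, here is a rough sketch of what a shared per-module override format could look like and how a quantizer or inference library might resolve it. This is an illustration only, not the actual GPTQModel or AutoRound API: the function name `resolve_module_config`, the regex-keyed dict layout, the "first matching pattern wins" rule, and the module names are all assumptions made for this example.

```python
import re
from typing import Any, Dict

# Hypothetical helper (not part of GPTQModel or AutoRound): merge the base
# quantization settings with the first `dynamic` entry whose regex matches
# the module name.
def resolve_module_config(
    module_name: str,
    base: Dict[str, Any],
    dynamic: Dict[str, Dict[str, Any]],
) -> Dict[str, Any]:
    resolved = dict(base)  # copy so the base config is never mutated
    for pattern, overrides in dynamic.items():
        # re.match anchors at the start of the module name
        if re.match(pattern, module_name):
            resolved.update(overrides)
            break  # assumption: first matching pattern wins
    return resolved


# Assumed layout: regex on the module name -> settings to override.
base = {"bits": 4, "group_size": 128}
dynamic = {
    r".*\.18\..*gate.*": {"bits": 8, "group_size": 64},
    r".*\.19\..*": {"bits": 8},
}

cfg_gate = resolve_module_config("model.layers.18.mlp.gate_proj", base, dynamic)
cfg_attn = resolve_module_config("model.layers.19.self_attn.q_proj", base, dynamic)
cfg_other = resolve_module_config("model.layers.3.mlp.up_proj", base, dynamic)
```

With a plain dict like this, the same structure could be serialized into the saved quantization config, so the quantizer that writes it and the inference library that reads it would resolve overrides identically.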
