Add NVFP4 QAT #2666

andrewor14 · 2025-08-01T21:26:26Z

Stack from ghstack (oldest at bottom):

Summary: This commit adds a QAT flow for NVFP4, following the
numerics in NVFP4Tensor closely but without the dtype casting,
swizzling, and packing/unpacking. Users can call this flow as follows:

from torchao.quantization import quantize_
from torchao.quantization.qat import NVFP4FakeQuantizeConfig, QATConfig

qat_config = QATConfig(
    activation_config=NVFP4FakeQuantizeConfig(),
    weight_config=NVFP4FakeQuantizeConfig(),
    step="prepare",
)
quantize_(model, qat_config)
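
For readers unfamiliar with what "following the numerics without the casting and packing" means, here is a minimal pure-Python sketch of fake quantization to an FP4 (E2M1) grid with one scale per block of 16 values. It is an illustration only, not the code in this PR: the function names are hypothetical, the block scale is kept as a plain float instead of being rounded to FP8 E4M3, no per-tensor scale is applied, and ties are resolved toward the smaller magnitude rather than by round-to-nearest-even.

```python
# Illustrative sketch only; NOT torchao's implementation.
# Magnitudes representable in FP4 E2M1.
FP4_E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
FP4_MAX = 6.0
BLOCK_SIZE = 16  # NVFP4 shares one scale across each block of 16 values


def fake_quantize_block(block):
    """Quantize and immediately dequantize one block, staying in float."""
    amax = max(abs(v) for v in block)
    if amax == 0.0:
        return [0.0 for _ in block]
    # Simplification: the scale stays a Python float here
    # (no FP8 E4M3 rounding, no per-tensor scale).
    scale = amax / FP4_MAX
    out = []
    for v in block:
        mag = min(abs(v) / scale, FP4_MAX)
        # Snap to the nearest grid point; ties go to the smaller magnitude.
        nearest = min(FP4_E2M1_GRID, key=lambda g: abs(g - mag))
        out.append((nearest if v >= 0 else -nearest) * scale)
    return out


def fake_quantize(values, block_size=BLOCK_SIZE):
    """Apply block-wise fake quantization to a flat list of floats."""
    out = []
    for start in range(0, len(values), block_size):
        out.extend(fake_quantize_block(values[start:start + block_size]))
    return out
```

The output has the same length and float type as the input, which is the point of fake quantization during training: values are rounded as they would be at inference time, but nothing is cast to a low-precision dtype, swizzled, or packed.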

Test Plan:

python test/quantization/test_qat.py -k test_qat_nvfp4

pytorch-bot · 2025-08-01T21:26:30Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2666

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 87175e9 with merge base 97b090d:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

ghstack-source-id: fe592ca
Pull Request resolved: #2666

This was referenced Aug 1, 2025

New multi-step QAT API #2629

Open

Deprecate old QAT APIs #2641

Open

meta-cla bot added the CLA Signed label Aug 1, 2025

andrewor14 requested review from drisspg and jerryzh168 August 1, 2025 21:27

andrewor14 added the topic: new feature label Aug 1, 2025
