[PyTorch] fix input_quantizer usage for save_original_input; fix blockwise FP8 convert_and_update_tensor #1978

hxbai · 2025-07-22T02:16:04Z

Description

#1963 added an unnecessary rowwise quantization for blockwise FP8 and MXFP8 in the save_original_input case. This PR removes that extra quantization.

#1952 did not correctly handle the blockwise FP8 tensor update (convert_and_update_tensor). This PR fixes that as well.
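
To illustrate the first fix, here is a minimal sketch, not the actual Transformer Engine code. It assumes that when the original input is saved and re-quantized in the backward pass, only the columnwise layout is needed, so the rowwise usage flag can be switched off before quantizing. `ToyQuantizer` and `backward_requantize` are hypothetical stand-ins invented for this example; only the idea of rowwise/columnwise usage flags is taken from the PR description.

```python
from dataclasses import dataclass, field


@dataclass
class ToyQuantizer:
    """Hypothetical stand-in for a quantizer with rowwise/columnwise usage flags."""

    rowwise: bool = True
    columnwise: bool = True
    calls: list = field(default_factory=list)

    def set_usage(self, *, rowwise=None, columnwise=None):
        # Only update the flags that were explicitly passed.
        if rowwise is not None:
            self.rowwise = rowwise
        if columnwise is not None:
            self.columnwise = columnwise

    def quantize(self, x):
        # Record which layouts this call produced instead of doing real FP8 math.
        produced = tuple(
            name
            for name, enabled in (("rowwise", self.rowwise), ("columnwise", self.columnwise))
            if enabled
        )
        self.calls.append(produced)
        return {"data": x, "layouts": produced}


def backward_requantize(saved_input, quantizer):
    """Re-quantize the saved original input, skipping the rowwise layout.

    Assumption: the backward pass consumes only the columnwise layout here,
    so producing rowwise data as well would be wasted work.
    """
    quantizer.set_usage(rowwise=False, columnwise=True)
    return quantizer.quantize(saved_input)


q = ToyQuantizer()
out = backward_requantize([1.0, 2.0], q)
print(out["layouts"])  # ('columnwise',)
```

With both flags left at their defaults, the same call would produce both layouts, which is the kind of extra work the description says was removed.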

Fixes # (issue)

Type of change

Documentation change (change only to the documentation, either a fix or a new content)
Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Infra/Build change
Code refactoring

Changes

Please list the changes introduced in this PR:

Change A
Change B

Checklist:

I have read and followed the contributing guidelines
The functionality is complete
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes

timmoon10 · 2025-07-22T18:58:06Z

/te-ci pytorch L1

timmoon10 · 2025-08-01T19:06:11Z

/te-ci pytorch L1

timmoon10

LGTM, pending CI

ksivaman

LGTM

ksivaman · 2025-08-04T18:21:27Z

/te-ci pytorch L0 L1

…kwise FP8 convert_and_update_tensor (NVIDIA#1978)

* fix input_quantizer in save_original_input bwd
  Signed-off-by: Hongxiao Bai <[email protected]>
* fix get shape of blockwise tensor with only compact colwise data
  Signed-off-by: Hongxiao Bai <[email protected]>
* fix blockwise FP8 convert_and_update_tensor
  Signed-off-by: Hongxiao Bai <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci

---------

Signed-off-by: Hongxiao Bai <[email protected]>
Co-authored-by: Tim Moon <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Kirthi Shankar Sivamani <[email protected]>
Signed-off-by: Anton Korzh <[email protected]>

fix input_quantizer in save_original_input bwd

36854f6

Signed-off-by: Hongxiao Bai <[email protected]>

hxbai mentioned this pull request Jul 22, 2025

[PyTorch] Debug linear layer when saving original input and using debug quantizer #1963

Merged

13 tasks

hxbai marked this pull request as draft July 22, 2025 15:41

Merge branch 'main' into save_original_fix

5607756

fix get shape of blockwise tensor with only compact colwise data

1d17d72

Signed-off-by: Hongxiao Bai <[email protected]>

hxbai marked this pull request as ready for review July 23, 2025 07:57

Merge remote-tracking branch 'origin/main' into save_original_fix

368117b

Signed-off-by: Hongxiao Bai <[email protected]>

hxbai force-pushed the save_original_fix branch from da084bb to 368117b July 31, 2025 06:35

hxbai added 2 commits August 1, 2025 13:20

Merge branch 'main' into save_original_fix

39e9052

fix blockwise FP8 convert_and_update_tensor

6ce6a4e

Signed-off-by: Hongxiao Bai <[email protected]>

hxbai changed the title ~~[PyTorch] fix input_quantizer usage in Linear backward for save_original_input~~ [PyTorch] fix input_quantizer usage for save_original_input; fix blockwise FP8 convert_and_update_tensor Aug 1, 2025

pre-commit-ci bot and others added 2 commits August 1, 2025 10:01

[pre-commit.ci] auto fixes from pre-commit.com hooks

2c43099

for more information, see https://pre-commit.ci

Merge branch 'main' into save_original_fix

bc58f10

timmoon10 approved these changes Aug 1, 2025

ksivaman approved these changes Aug 4, 2025

Merge branch 'main' into save_original_fix

403d9e5

timmoon10 merged commit de69ca0 into NVIDIA:main Aug 6, 2025
12 of 13 checks passed
