modify cast from hp to mx to help inductor fuse #1786


Merged · 1 commit · Feb 26, 2025

Conversation

vkuzo
Contributor

@vkuzo vkuzo commented Feb 26, 2025

Summary:

Thanks to investigation from @eellison, moving the reshape to the end of the cast helps inductor fuse the cast into a single kernel. This doesn't yet work with fp4, but let's unblock fp8 and deal with fp4 later.

Fixes #1769

Note: in the repro with swizzling from #1773, we go from 3 to 2 kernels. Further investigation is needed into whether the swizzling can also be fused.

Test Plan:

```
pytest test/prototype/mx_formats/test_mx_tensor.py -x -s -k test_to_mx_inductor_single_kernel
```

Reviewers:

Subscribers:

Tasks:

Tags:


pytorch-bot bot commented Feb 26, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1786

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 3 Pending

As of commit 584efe0 with merge base d00ee41:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Feb 26, 2025
@vkuzo vkuzo requested a review from eellison February 26, 2025 21:12
@vkuzo vkuzo added the topic: performance Use this tag if this PR improves the performance of a feature label Feb 26, 2025
@vkuzo vkuzo force-pushed the 20250226_mx_single_cast_kernel branch from d233496 to 584efe0 Compare February 26, 2025 21:13
Contributor

@eellison eellison left a comment


Looks good as a workaround! I'm still planning on making the code fuse as it did previously, but this is worth landing in the interim.

@vkuzo vkuzo merged commit 8d110bf into main Feb 26, 2025
17 checks passed
Labels: CLA Signed, topic: performance
Development

Successfully merging this pull request may close these issues.

torch.compile cast to mxfp8 should only require one kernel
3 participants