Make quantize_ respect the usages of the quantizer #1836
Closed
28 commits
1d7776f  Beginning of work to properly reuse the output given to quantize (ptrendx)
d207cea  Add current scaling (ptrendx)
fa28e49  Beginning of the other recipes (ptrendx)
49ad122  Added MXFP8 and cleanup (ptrendx)
17678d9  Fix (ptrendx)
b488101  Actually reuse tensors and get rid of the hack for MXFP8 (ptrendx)
209cb9f  Small cleaning (ptrendx)
c61b14b  Make sure dgrad is not needed in the test during eval phase (ptrendx)
7561fb4  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
41b8fb4  Fixes (ptrendx)
20363a4  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
2acb07a  Fixes (ptrendx)
9550644  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
2f9f9ce  Merge branch 'main' into pr_quantize_output_respect_usages (ptrendx)
f0f96b9  Fix for integer overflow (ptrendx)
eb49987  Try copying the quantizer (ptrendx)
6dcd480  Fix (ptrendx)
b6f1aeb  Fix CUDA graphs test (ptrendx)
b92f3f5  Merge branch 'main' into pr_quantize_output_respect_usages (ptrendx)
1f4f894  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
53554f2  Fix (ptrendx)
343d43d  Fix the float8blockwise tests and MXFP8 cuda graphs tests (ptrendx)
817d8ce  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
b6b4af3  Merge branch 'main' into pr_quantize_output_respect_usages (ptrendx)
207e4b7  Fix issue from merge (ptrendx)
715cc53  Always use tex.quantize when updating cache to use proper quantizer (ptrendx)
d682178  Debug (ptrendx)
e6f38d1  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
Somewhat orthogonal, but since we're touching `Quantizer::create_tensor`, we should consider removing the `rowwise_data` arg. It was a UB-specific option that doesn't really make sense anymore. I believe all usages have been refactored away.
ok, cool. I will do that - it will make the code nicer.
Actually can't do that just yet. Attention also uses this unfortunately.