@ahmtox ahmtox commented Jun 9, 2025

Improve the CPU implementation of op_quantize to support half (fp16) input dtype, and add additional testing.

Differential Revision: [D76053764](https://our.internmc.facebook.com/intern/diff/D76053764/)

[ghstack-poisoned]

pytorch-bot bot commented Jun 9, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/11479

Note: Links to docs will display an error until the docs builds have been completed.

❌ 4 New Failures, 1 Cancelled Job

As of commit 316d2c0 with merge base 8cfa858:

NEW FAILURES - The following jobs have failed:

CANCELLED JOB - The following job was cancelled. Please retry:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D76053764

@manuelcandales manuelcandales left a comment

LGTM

@ahmtox ahmtox added the `release notes: vulkan` label (Changes to the Vulkan backend delegate) Jun 11, 2025
@facebook-github-bot facebook-github-bot merged commit 1f024fc into gh/ahmtox/12/base Jun 14, 2025
92 of 98 checks passed
@facebook-github-bot facebook-github-bot deleted the gh/ahmtox/12/head branch June 14, 2025 03:45
ahmtox added a commit that referenced this pull request Jun 16, 2025
Summary:
# Context

These changes were reverted over the weekend and need to be relanded. The original stack of commits could not be merged into main because an existing lintrunner issue blocked the merge. All of the changes have already been [reviewed](#11479) and approved.

Differential Revision: D76737404
SS-JIA added a commit that referenced this pull request Aug 6, 2025
Pull Request resolved: #11479

Currently, the CPU implementation of the quantization operators (`quantize_per_token`, `quantize_per_tensor`, and `quantize_per_channel`) does not support half (fp16) input scalar types. To align with the PyTorch implementation, which accepts fp16 and bfloat16 inputs, this diff enables half input dtype support for the quantization operators. We will be comparing this implementation against the Vulkan operators.
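
For reference, a minimal Python sketch of what per-tensor quantization of fp16 input computes. It assumes the usual affine formula (scale by `1 / scale`, round to nearest with ties to even, add the zero point, clamp to the quantized range). The helper names `to_half` and `quantize_per_tensor` are illustrative only and are not the ExecuTorch kernel API; fp16 inputs are simulated with the standard library's binary16 `struct` format.

```python
import struct


def to_half(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 binary16 (fp16) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def quantize_per_tensor(values, scale, zero_point, quant_min, quant_max):
    """Affine quantization with a single scale and zero point (illustrative)."""
    inv_scale = 1.0 / scale
    out = []
    for v in values:
        # Python's round() rounds ties to even, like nearbyint in the
        # default floating-point rounding mode.
        q = zero_point + round(v * inv_scale)
        # Clamp into the representable range of the output dtype.
        out.append(max(quant_min, min(quant_max, q)))
    return out


# Simulated fp16 input tensor; 0.1 is not exactly representable in fp16.
half_input = [to_half(x) for x in (0.1, -1.5, 3.0, 100.0)]
quantized = quantize_per_tensor(
    half_input, scale=0.5, zero_point=10, quant_min=-128, quant_max=127
)
print(quantized)  # [10, 7, 16, 127]
```

The last element shows the clamp: `10 + 200` exceeds `quant_max`, so it saturates at 127.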

ExecuTorch's [scalar_type_util.h](https://github.com/pytorch/executorch/blob/053686242c1687f0d51b3bb8befd14b047d7b025/runtime/core/exec_aten/util/scalar_type_util.h#L190) defines the `ET_FORALL_FLOATH_TYPES` preprocessor macro, so support can be enabled simply by switching the macro that is invoked to `ET_FORALL_FLOATH_TYPES`. This enables support for Half (fp16), Float (fp32), and Double (fp64).

I have also added more comprehensive tests of the input dtypes, including Double, which was not tested before. In addition to confirming that all output dtypes are supported, the tests now also check that all input dtypes are supported.
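
The shape of that coverage (every input dtype crossed with every output dtype) can be sketched in Python as follows. This is an illustration under stated assumptions, not the actual test code in this diff: the dtype tables, `cast_input`, `quantize`, and `run_matrix` are hypothetical names, input dtypes are simulated with `struct` formats, and only three output ranges are listed.

```python
import struct

# Hypothetical tables: input dtypes as struct formats, output dtypes as ranges.
INPUT_FORMATS = {"Half": "<e", "Float": "<f", "Double": "<d"}
OUTPUT_RANGES = {"Byte": (0, 255), "Char": (-128, 127), "Short": (-32768, 32767)}


def cast_input(x, fmt):
    """Round-trip x through the given floating-point format."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


def quantize(v, scale, zero_point, qmin, qmax):
    """Affine quantization of one value, clamped to [qmin, qmax]."""
    q = zero_point + round(v / scale)
    return max(qmin, min(qmax, q))


def run_matrix(values, scale, zero_point):
    """Quantize the same values for every (input dtype, output dtype) pair."""
    results = {}
    for in_name, fmt in INPUT_FORMATS.items():
        for out_name, (qmin, qmax) in OUTPUT_RANGES.items():
            results[(in_name, out_name)] = [
                quantize(cast_input(v, fmt), scale, zero_point, qmin, qmax)
                for v in values
            ]
    return results


# These values are exactly representable in fp16, so every input dtype
# should produce the same quantized output for a given output dtype.
results = run_matrix([-2.0, 0.25, 1.5, 4.0], scale=0.25, zero_point=3)
for out_name in OUTPUT_RANGES:
    per_input = [results[(in_name, out_name)] for in_name in INPUT_FORMATS]
    assert all(r == per_input[0] for r in per_input)
```

Choosing inputs that are exact in fp16 keeps the expected outputs identical across Half, Float, and Double, so a failure points at dtype handling rather than rounding differences.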
ghstack-source-id: 290376481
@exported-using-ghexport

Differential Revision: [D76053764](https://our.internmc.facebook.com/intern/diff/D76053764/)

Labels

CLA Signed (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
fb-exported
release notes: vulkan (Changes to the Vulkan backend delegate)
