
[quant] Add default symmetric qconfig for qnnpack #74396

Closed
wants to merge 1 commit

Conversation

digantdesai
Contributor

Summary:
# New qconfig `default_symmetric_qnnpack_qconfig`

Returns a qconfig with signed activations and symmetric weights with range restrictions. Also adds a per_channel variant of the same.

## Restrictions on weights

Restrictions on the weights include:
1. the weight zero point is forced to zero, and
2. the 8-bit signed quantized weight values are limited to [-127, +127], i.e., the value -128 is excluded.

This is driven, in part, by the desire to achieve better performance from XNNPACK ops.
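As a minimal sketch, the weight restrictions above could be expressed through a per-tensor symmetric qint8 observer with a narrowed quantization range (illustrative only, assuming the observer accepts `quant_min`/`quant_max`/`eps` overrides; not necessarily the exact observer arguments this PR lands on):

```python
import torch
from torch.ao.quantization.observer import MinMaxObserver

# Illustrative weight observer: per-tensor symmetric qint8, with quant_min
# raised to -127 so the value -128 is never produced and zero_point stays 0.
# The eps override reflects the new minimum scale derived later in this summary.
symmetric_weight_observer = MinMaxObserver.with_args(
    dtype=torch.qint8,
    qscheme=torch.per_tensor_symmetric,
    quant_min=-127,
    quant_max=127,
    eps=2 ** -12,
)
```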

## qengine/backend = `qnnpack` and XNNPACK ops

The qconfig returned by this function allows us to use the faster XNNPACK quantized ops on CPU, subject to the restrictions above. Although we are using XNNPACK ops, the qengine is still `qnnpack`, and there are no plans to introduce a new qengine for XNNPACK ops. Support for using XNNPACK ops with the asymmetric qconfig (returned by `get_default_qconfig()`) is a work in progress.
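For context, a hedged usage sketch under the `qnnpack` engine (eager-mode flow, with QuantStub/DeQuantStub omitted for brevity; assumes `default_symmetric_qnnpack_qconfig` is exposed as a ready-made QConfig in `torch.ao.quantization.qconfig`):

```python
import torch
import torch.nn as nn
from torch.ao.quantization import prepare, convert
from torch.ao.quantization.qconfig import default_symmetric_qnnpack_qconfig

torch.backends.quantized.engine = "qnnpack"  # the qengine is still qnnpack

model = nn.Sequential(nn.Linear(16, 8), nn.ReLU()).eval()
model.qconfig = default_symmetric_qnnpack_qconfig

prepared = prepare(model)        # insert observers
prepared(torch.randn(4, 16))     # calibrate on sample data
quantized = convert(prepared)    # swap in quantized modules
```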

## Updated eps value
* From PyTorch:

eps:
```
>>> import torch
>>> torch.finfo(torch.float32).eps
1.1920928955078125e-07
>>> torch.finfo(torch.float32).eps.hex()
'0x1.0000000000000p-23'
```
All scale values are float32, and the scale is clamped as `scale = max(scale, eps)`.

* Requirement from XNNPACK

For both the fp32 and rndnu requantization schemes, `0x1p-32 <= requantization_scale < 256.0`,
where `requantization_scale = (input_scale * kernel_scale) / output_scale`.

* New minimum allowed scale value

With the current float32 eps (`0x1p-23`) as the minimum, the XNNPACK lower bound is the problem. We have not observed upper-bound issues so far under the assumption of a maximum scale value of 256. So, focusing on the lower bound: to conservatively cover all possible requantization values, the minimum possible requantization scale must sit at the XNNPACK lower threshold, i.e.,

```
minimum_requantization_value = xnnpack_lower_threshold
input_scale * kernel_scale / output_scale = 0x1p-32
min_scale_value * min_scale_value / max_scale_value = 0x1p-32   # worst case: both input and kernel scales at the minimum
min_scale_value * new_eps / 256 = 0x1p-32                       # new_eps == min_scale_value; max_scale_value == 256
min_scale_value**2 = 0x1p-24
min_scale_value = 0x1p-12
```

With `scale_value >= 0x1p-12`, we should be able to stay above the lower threshold that the XNNPACK kernels place on the requantization scale.
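A quick numeric sanity check of this bound in plain Python, re-stating the worst case above:

```python
# Worst case: input and kernel scales both at the new eps floor,
# output scale at the assumed maximum of 256.
eps = 2 ** -12
requantization_scale = (eps * eps) / 256.0
assert requantization_scale == 2 ** -32          # lands exactly on the XNNPACK lower threshold
assert 2 ** -32 <= requantization_scale < 256.0  # within the allowed range
```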

Obviously, this worst case is very unlikely to occur in practice, so we could probably get away with a much smaller value than `0x1p-12` as the eps; however, it is not easy to choose a smaller value empirically.

* The impact on accuracy is unclear as of this writing.

Reviewed By: kimishpatel

Differential Revision: D34625300

fbshipit-source-id: f8ddea2ec3c2d31aae03096d8851e9893344a0fc
@pytorch-bot

pytorch-bot bot commented Mar 17, 2022

CI Flow Status

⚛️ CI Flow

Ruleset - Version: v1
Ruleset - File: https://github.com/digantdesai/pytorch/blob/e1b003e5aa3d8a11f3381fd29ee1b8b3e750abfb/.github/generated-ciflow-ruleset.json
PR ciflow labels: ciflow/default
Add ciflow labels to this PR to trigger more builds:

Workflows Labels (bold enabled) Status
Triggered Workflows
deploy-linux-xenial-cuda11.3-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
linux-binary-libtorch-cxx11-abi ciflow/all, ciflow/binaries, ciflow/binaries_libtorch, ciflow/default, ciflow/trunk ✅ triggered
linux-binary-libtorch-pre-cxx11 ciflow/all, ciflow/binaries, ciflow/binaries_libtorch, ciflow/default, ciflow/trunk ✅ triggered
linux-binary-manywheel ciflow/all, ciflow/binaries, ciflow/binaries_wheel, ciflow/default, ciflow/trunk ✅ triggered
linux-bionic-py3.7-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/noarch, ciflow/trunk ✅ triggered
linux-bionic-rocm4.5-py3.7 ciflow/all, ciflow/default, ciflow/linux, ciflow/rocm, ciflow/trunk ✅ triggered
linux-docs ciflow/all, ciflow/cpu, ciflow/default, ciflow/docs, ciflow/linux, ciflow/trunk ✅ triggered
linux-vulkan-bionic-py3.7-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk, ciflow/vulkan ✅ triggered
linux-xenial-cuda11.3-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-cuda11.3-py3.7-gcc7-bazel-test ciflow/all, ciflow/bazel, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3-clang5-mobile-build ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3-clang5-mobile-custom-build-static ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3.7-clang7-asan ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/sanitizers, ciflow/trunk ✅ triggered
linux-xenial-py3.7-clang7-onnx ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/onnx, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc5.4-mobile-lightweight-dispatch-build ciflow/all, ciflow/cpu, ciflow/default, ciflow/libtorch, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc7 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc7-no-ops ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
macos-arm64-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
macos-arm64-binary-wheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
macos-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
macos-binary-libtorch-cxx11-abi ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
macos-binary-libtorch-pre-cxx11 ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
macos-binary-wheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
win-vs2019-cpu-py3 ciflow/all, ciflow/cpu, ciflow/default, ciflow/trunk, ciflow/win ✅ triggered
win-vs2019-cuda11.3-py3 ciflow/all, ciflow/cuda, ciflow/default, ciflow/trunk, ciflow/win ✅ triggered
windows-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
windows-binary-libtorch-debug ciflow/all, ciflow/binaries, ciflow/binaries_libtorch, ciflow/default, ciflow/trunk ✅ triggered
windows-binary-libtorch-release ciflow/all, ciflow/binaries, ciflow/binaries_libtorch, ciflow/default, ciflow/trunk ✅ triggered
windows-binary-wheel ciflow/all, ciflow/binaries, ciflow/binaries_wheel, ciflow/default, ciflow/trunk ✅ triggered
Skipped Workflows
caffe2-linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
docker-builds ciflow/all, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64 ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-arm64-coreml ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-arm64-custom-ops ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-arm64-metal ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-x86-64 ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-x86-64-coreml ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
libtorch-linux-xenial-cuda10.2-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk 🚫 skipped
libtorch-linux-xenial-cuda11.3-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk 🚫 skipped
linux-bionic-cuda10.2-py3.9-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow, ciflow/trunk 🚫 skipped
linux-bionic-rocm4.5-py3.7-distributed ciflow/all, ciflow/linux, ciflow/rocm, ciflow/trunk 🚫 skipped
linux-docs-push ciflow/all, ciflow/cpu, ciflow/linux, ciflow/scheduled 🚫 skipped
linux-xenial-cuda11.3-py3.7-gcc7-no-ops ciflow/all, ciflow/cuda, ciflow/linux, ciflow/trunk 🚫 skipped
macos-10-15-py3-arm64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
macos-10-15-py3-lite-interpreter-x86-64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
macos-11-py3-x86-64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
parallelnative-linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
periodic-libtorch-linux-bionic-cuda11.5-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-bionic-cuda11.5-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-xenial-cuda10.2-py3-gcc7-slow-gradcheck ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled, ciflow/slow, ciflow/slow-gradcheck 🚫 skipped
periodic-linux-xenial-cuda11.3-py3.7-gcc7-debug ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-win-vs2019-cuda11.5-py3 ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win 🚫 skipped
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-build ciflow/all, ciflow/android, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
pytorch-xla-linux-bionic-py3.7-clang8 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk, ciflow/xla 🚫 skipped

@facebook-github-bot
Contributor

facebook-github-bot commented Mar 17, 2022


💊 CI failures summary and remediations

As of commit e1b003e (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI.

Please report bugs/suggestions to the (internal) Dr. CI Users group.


@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D34625300

facebook-github-bot pushed a commit that referenced this pull request Mar 18, 2022
Summary:
Pull Request resolved: #74396

fbshipit-source-id: 005e6757ed1185b3940b58ac55246cba8b267828
@github-actions
Contributor

Hey @digantdesai.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

shahofblah pushed a commit that referenced this pull request Mar 25, 2022
Summary:
Pull Request resolved: #74396

fbshipit-source-id: 005e6757ed1185b3940b58ac55246cba8b267828
(cherry picked from commit 61ed1a2)
andrewor14 added a commit that referenced this pull request Sep 28, 2022
**Summary:** This commit enforces the following constraints on the
QNNPACK BackendConfig:

- `quant_min_lower_bound` = -127 for weight
- `quant_max_upper_bound` = 127 for weight
- `scale_min_lower_bound` = 2 ** -12 for both activations and weight

These are consistent with the existing settings in
`default_symmetric_qnnpack_qconfig` and its per_channel and QAT
equivalents, which were added in #74396
and #74507 to enable users
to use this backend with faster XNNPACK quantized ops.
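For illustration, a minimal sketch of how such bounds can be expressed with the BackendConfig dtype-constraint API (assuming the `DTypeWithConstraints` field names shown; illustrative, not the exact BackendConfig entry this commit adds):

```python
import torch
from torch.ao.quantization.backend_config import DTypeWithConstraints

# Sketch of the qint8 weight constraints described above.
weight_constraints = DTypeWithConstraints(
    dtype=torch.qint8,
    quant_min_lower_bound=-127,
    quant_max_upper_bound=127,
    scale_min_lower_bound=2 ** -12,
)
```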

**BC-breaking notes:**

The QConfigs returned by `get_default_qconfig("qnnpack")` and
`get_default_qat_qconfig("qnnpack")` are changed to reflect the
new constraints imposed by the backend. These default QConfigs
are still compatible with the BackendConfig returned by
`get_qnnpack_backend_config()`.

However, existing non-default QConfigs that did not impose the
constraints described above will no longer work with the QNNPACK
BackendConfig. The resulting behavior in this case is that the
corresponding patterns will no longer be quantized, and a warning
explaining what the missing constraints are will be logged.

**Test Plan:**

python test/test_quantization.py TestQuantizeFx.test_qnnpack_backend_config

**Reviewers:** jerryzh168, vkuzo

**Subscribers:** jerryzh168, vkuzo
pytorchmergebot pushed a commit that referenced this pull request Sep 30, 2022
**Summary:** This commit enforces the following constraints on the
QNNPACK BackendConfig:

- `quant_min_lower_bound` = -127 for qint8 weight
- `quant_max_upper_bound` = 127 for qint8 weight
- `scale_min_lower_bound` = 2 ** -12 for qint8 activations and weight

These constraints will enable users to use this BackendConfig with
faster XNNPACK quantized ops. They are also consistent with the
existing settings in `default_symmetric_qnnpack_qconfig` and its
per_channel and QAT variants. For more detail on why these exact
values were chosen, please see the description of
#74396.

Note that there are currently no restrictions on the qscheme in
DTypeConfig. This should be added in the future to further enforce
the restriction that the weights must be quantized with either
per_tensor_symmetric or per_channel_symmetric.

Existing default QConfigs such as `get_default_qconfig("qnnpack")`
and `get_default_qat_qconfig("qnnpack")` will continue to be
supported, but only for the existing dtypes, e.g. quint8 activations
for weighted ops like linear and conv. In the future, we should
revisit whether to enable XNNPACK ops using these QConfigs as well.

**Test Plan:**

python test/test_quantization.py TestQuantizeFx.test_qnnpack_backend_config

**Reviewers:** jerryzh168, vkuzo

**Subscribers:** jerryzh168, vkuzo
Pull Request resolved: #85863
Approved by: https://github.com/jerryzh168