Unify Quantization APIs for add, pool and relu #26335
Conversation
Summary: Use the backend engine flag to call QNNPACK for quantized ops.
Test Plan: python test/test_quantized.py TestQNNPACKOps
"qnnpack_maxpool(): Expected padding to be 2-dimensional: got ", | ||
padding.size()); | ||
|
||
Tensor input_contig = input.contiguous(); |
Do you need a check for the input layout here? How do you know that the input is NHWC contiguous?
I'll make the change similar to #26242 once that lands for all qnnpack ops.
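For context, the distinction the review is raising can be sketched in Python (current API names; note that a plain .contiguous() call only guarantees the default NCHW layout):

import torch

x = torch.randn(1, 3, 8, 8)  # a default (NCHW-contiguous) tensor

# .contiguous() with no arguments returns the default NCHW layout,
# so it does not by itself establish the NHWC ordering QNNPACK expects.
nchw = x.contiguous()
print(nchw.is_contiguous(memory_format=torch.channels_last))  # False

# An explicit layout check/conversion would look like this:
if not x.is_contiguous(memory_format=torch.channels_last):
    x = x.contiguous(memory_format=torch.channels_last)
print(x.is_contiguous(memory_format=torch.channels_last))  # True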
It might well be me, but on raspbian (32bit armv7 hf) and
I do get some torch.uint8 tensor when I do the same with a vanilla build on x86 (i.e. with FBGEMM).
Did you try setting torch.backends.quantized.engine = torch.qnnpack? You can set the same on server side (x86) as well to confirm qnnpack works.
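A minimal sketch of that suggestion (in current releases the engine is spelled as a string, and QNNPACK availability depends on the build):

import torch

# Guard on availability: QNNPACK is only compiled into some builds.
if 'qnnpack' in torch.backends.quantized.supported_engines:
    torch.backends.quantized.engine = 'qnnpack'

x = torch.randn(1, 3, 4, 4)
qx = torch.quantize_per_tensor(x, scale=0.1, zero_point=0, dtype=torch.quint8)

# Quantized ops now dispatch to the selected engine.
qy = torch.relu(qx)
print(qy.dtype)  # torch.quint8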
Indeed, this fixes the example case and seems to return something in the quantized model. Thank you! At some point,
int64_t inW = input.size(3);
// TODO: change it to contiguous(MemoryFormat::ChannelsLast) once a perf
// regression of it is fixed.
Tensor input_contig = input.permute({0, 2, 3, 1}).contiguous();
@dzhulgakov please review this. I changed it to permute the input tensor to NHWC, similar to your changes for qconv.
Yep, it should be fine
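In Python terms, the permute-plus-contiguous in this diff and the channels-last alternative mentioned in the TODO produce the same physical element order (a sketch for illustration, not the PR's code):

import torch

x = torch.randn(2, 3, 8, 8)  # logical NCHW

# What the diff does: materialize an explicit NHWC copy.
nhwc = x.permute(0, 2, 3, 1).contiguous()  # shape (N, H, W, C)

# The TODO's alternative: keep the NCHW shape, store data channels-last.
cl = x.contiguous(memory_format=torch.channels_last)  # shape (N, C, H, W)

# Reading cl's storage in physical order matches the NHWC copy.
physical = torch.as_strided(cl, (cl.numel(),), (1,))
assert torch.equal(nhwc.flatten(), physical)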
Looks good, I'll let Jerry review the details.
Would be really nice to unify the tests too and extend the benchmarks in benchmarks/op_benchmarks to go through both engines.
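Something like the following would exercise both engines (a hypothetical micro-benchmark for illustration, not the op_benchmarks harness):

import time
import torch

x = torch.randn(16, 64, 56, 56)
qx = torch.quantize_per_tensor(x, scale=0.1, zero_point=0, dtype=torch.quint8)

for engine in torch.backends.quantized.supported_engines:
    if engine == 'none':
        continue  # 'none' is a fallback entry, not a real backend
    torch.backends.quantized.engine = engine
    start = time.perf_counter()
    for _ in range(100):
        torch.nn.quantized.functional.max_pool2d(qx, kernel_size=2)
    print(f'{engine}: {time.perf_counter() - start:.3f}s')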
LGTM
Summary: Pull Request resolved: pytorch/pytorch#26335
Use the backend engine flag to call QNNPACK for quantized ops.
Test Plan: python test/test_quantized.py TestQNNPACKOps
Imported from OSS
Differential Revision: D17504331
fbshipit-source-id: 35cb2189067ac5cc6a7307179ef0335d1cec7b8f
This pull request has been merged in f337459.
Stack from ghstack:
Summary:
Use the backend engine flag to call QNNPACK for quantized ops.
Test Plan:
python test/test_quantized.py TestQNNPACKOps
Differential Revision: D17504331