
Unify Quantization APIs for add, pool and relu #26335


Closed
wants to merge 19 commits

Conversation

@supriyar (Contributor) commented Sep 17, 2019

Stack from ghstack:

Summary:
Use the backend engine flag to call QNNPACK for quantized ops.

Test Plan:
python test/test_quantized.py TestQNNPACKOps


Differential Revision: [D17504331](https://our.internmc.facebook.com/intern/diff/D17504331)
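Selecting the backend amounts to one assignment before any quantized op runs; a minimal sketch, using the engine spelling quoted later in this thread:

import torch

# Route quantized operators to QNNPACK; FBGEMM is the default on x86 builds.
torch.backends.quantized.engine = torch.qnnpack

With the flag set, the TestQNNPACKOps run above dispatches the unified add, pool and relu ops to the QNNPACK kernels.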

supriyar added a commit that referenced this pull request Sep 17, 2019
ghstack-source-id: 3feae47
Pull Request resolved: #26335
supriyar added a commit that referenced this pull request Sep 17, 2019
ghstack-source-id: 95d1f48
Pull Request resolved: #26335
"qnnpack_maxpool(): Expected padding to be 2-dimensional: got ",
padding.size());

Tensor input_contig = input.contiguous();
Review comment (Contributor):

Do you need a check for input layout here? How do you know that the input is NHWC contiguous?

Reply from @supriyar (Author):
I'll make the change similar to #26242 once that lands for all qnnpack ops.
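To illustrate the concern in Python (a minimal sketch with made-up shapes, not code from this PR): a bare .contiguous() only guarantees contiguity in the tensor's logical NCHW order, so it cannot by itself tell the kernel that the data is physically NHWC.

import torch

x = torch.randn(1, 3, 4, 4)      # logical NCHW
x_c = x.contiguous()             # contiguous in NCHW order, still not NHWC
# Handing QNNPACK NHWC data requires permuting the dims first:
x_nhwc = x.permute(0, 2, 3, 1).contiguous()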

supriyar added a commit that referenced this pull request Sep 19, 2019
ghstack-source-id: 755b9df
Pull Request resolved: #26335
supriyar added a commit that referenced this pull request Sep 19, 2019
ghstack-source-id: bdc5883
Pull Request resolved: #26335
@t-vi (Collaborator) commented Sep 19, 2019

It might well be me, but on Raspbian (32-bit armv7hf) with USE_PYTORCH_QNNPACK=1, I get None from conv_prepack called like this:

r = torch.ones([2,2,2,2], dtype=torch.float)
scale, zero_point = 1, 2
q = torch.quantize_linear(r, scale, zero_point, torch.qint8)
torch.ops.quantized.conv_prepack(q, None, [1,1], [0,0], [1,1], 1)

I do get a torch.uint8 tensor when I do the same with a vanilla build on x86 (i.e. with FBGEMM).

@supriyar (Contributor, Author) replied:
Did you try setting torch.backends.quantized.engine = torch.qnnpack? You can set the same on server side (x86) as well to confirm qnnpack works.

@t-vi (Collaborator) replied Sep 19, 2019

Indeed, this fixes the example case and seems to return something in the quantized model. Thank you!

At some point, conv_prepack might raise a more descriptive exception instead of just returning None... 🙂
Should the engine default to qnnpack when it is available but fbgemm isn't?
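Putting the two comments together, a version of the repro that returns a packed weight instead of None (all calls exactly as quoted above):

import torch

torch.backends.quantized.engine = torch.qnnpack  # select QNNPACK explicitly

r = torch.ones([2, 2, 2, 2], dtype=torch.float)
scale, zero_point = 1, 2
q = torch.quantize_linear(r, scale, zero_point, torch.qint8)
# With the engine set, conv_prepack returns packed weights instead of None.
packed = torch.ops.quantized.conv_prepack(q, None, [1, 1], [0, 0], [1, 1], 1)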

supriyar added a commit that referenced this pull request Sep 19, 2019
ghstack-source-id: 3a2e1c8
Pull Request resolved: #26335
int64_t inW = input.size(3);
// TODO: change it to contiguous(MemoryFormat::ChannelsLast) once a perf
// regression of it is fixed.
Tensor input_contig = input.permute({0, 2, 3, 1}).contiguous();
Comment from @supriyar (Author):
@dzhulgakov please review this. I changed it to permute the input tensor to NHWC, similar to your changes for qconv.

Reply from @dzhulgakov (Collaborator):
Yep, it should be fine
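For context on the TODO in the excerpt, a rough Python sketch of the two approaches, assuming the torch.channels_last memory-format binding is available in the build:

import torch

x = torch.randn(1, 3, 4, 4)                  # logical NCHW input
# Current approach in this PR: reorder the logical dims, then compact.
a = x.permute(0, 2, 3, 1).contiguous()
# What the TODO targets: keep the NCHW logical shape but use
# channels-last (NHWC) strides for the underlying storage.
b = x.contiguous(memory_format=torch.channels_last)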

supriyar added a commit that referenced this pull request Sep 20, 2019
ghstack-source-id: aed4f05
Pull Request resolved: #26335
@dzhulgakov (Collaborator) left a review comment:

Looks good, I'll let Jerry review the details.

It would be really nice to unify the tests too, and to extend the benchmarks in benchmarks/op_benchmarks to go through both engines.
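A sketch of what going through both engines might look like; torch.fbgemm is assumed here to be exposed the same way as torch.qnnpack, so treat this as hypothetical rather than code from this PR:

import torch

for engine in (torch.fbgemm, torch.qnnpack):  # torch.fbgemm is an assumption
    torch.backends.quantized.engine = engine
    r = torch.ones([2, 2, 2, 2], dtype=torch.float)
    q = torch.quantize_linear(r, 1.0, 2, torch.qint8)
    # ... time the quantized op under test for this engine ...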


supriyar added a commit that referenced this pull request Sep 20, 2019
ghstack-source-id: c506a0c
Pull Request resolved: #26335
@jerryzh168 (Contributor) left a review comment:

LGTM

supriyar added a commit that referenced this pull request Sep 20, 2019
ghstack-source-id: 0de2379
Pull Request resolved: #26335
zdevito pushed a commit to zdevito/ATen that referenced this pull request Sep 20, 2019
Pull Request resolved: pytorch/pytorch#26335
Imported from OSS
Differential Revision: D17504331
fbshipit-source-id: 35cb2189067ac5cc6a7307179ef0335d1cec7b8f
@facebook-github-bot (Contributor) commented:

This pull request has been merged in f337459.

mingbowan pushed a commit to mingbowan/pytorch that referenced this pull request Sep 23, 2019
Pull Request resolved: pytorch#26335
Imported from OSS
Differential Revision: D17504331
fbshipit-source-id: 35cb2189067ac5cc6a7307179ef0335d1cec7b8f
@facebook-github-bot deleted the gh/supriyar/18/head branch October 28, 2019 22:20
Labels: Merged, oncall: quantization (Quantization support in PyTorch)
Projects: none
8 participants