[New Operator] FusedRowwiseQuantizedSparseLengthsWeightedSumNode #2368
Conversation
Force-pushed from cbec569 to aa8414d.
Note that Caffe2 does not have an explicit "fused Int8" fill/type, so when the Caffe2Importer loads the Constant […]. If there are other users of the same Constant […]
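Conceptually, "fusing" means each row of the quantized payload carries its own scale and offset in-line after the data bytes. Below is a minimal, hypothetical packing sketch; the function name is mine, and the assumption that both scale and offset are stored as 4-byte floats follows the Caffe2 fused-8-bit-rowwise convention rather than the importer's actual code:

```cpp
// Hypothetical packing sketch (not the Caffe2Importer implementation).
// Assumes each fused row is laid out as [int8 data..., float scale, float offset],
// i.e. 8 bytes wider than the real data.
#include <cstdint>
#include <cstring>
#include <vector>

std::vector<int8_t> packFusedRowwise(const std::vector<int8_t> &data,
                                     const std::vector<float> &scales,
                                     const std::vector<float> &offsets,
                                     size_t rows, size_t cols) {
  std::vector<int8_t> fused(rows * (cols + 8));
  for (size_t r = 0; r < rows; ++r) {
    int8_t *row = fused.data() + r * (cols + 8);
    std::memcpy(row, data.data() + r * cols, cols);           // quantized data bytes
    std::memcpy(row + cols, &scales[r], sizeof(float));       // fused per-row scale
    std::memcpy(row + cols + 4, &offsets[r], sizeof(float));  // fused per-row offset
  }
  return fused;
}
```

What's the status of this one? Is it ready for review?
@yinghai Yep!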
LGTM. But I'll let Glow experts accept. :)
A few questions.
Force-pushed from aa8414d to f5c7863.
Force-pushed from f5c7863 to 365bc47.
This is great work!
Force-pushed from 365bc47 to 2557e47.
Force-pushed from 2557e47 to 3f0740f.
Int32QTy, // 32-bit quantized type (int32_t)
Int32ITy, // 32-bit index type (int32_t)
Int64ITy, // 64-bit index type (int64_t)
Int8FusedQTy, // 8-bit quantized type with fused scale/offset (int8_t)
Why do we have such a type instead of plain `Int8ITy`?
The biggest reason I wanted to explicitly differentiate between normal `Int8QTy` and `Int8FusedQTy` is that the shape for `Int8FusedQTy` is very strange -- it appears to be 8 columns wider than the actual data. This prevents accidentally interpreting the fused scales/offsets as data, or viewing the input data as an otherwise normal tensor. And I didn't think there was a large downside to adding the new type.
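As a rough illustration (a standalone sketch, not Glow's `Tensor` API), reading an element out of an `Int8FusedQTy`-style row means skipping past the real columns to find that row's scale and offset. It assumes the Caffe2 fused-8-bit-rowwise convention, where the real value is recovered as `scale * q + offset` and both are stored as 4-byte floats:

```cpp
// Standalone sketch: dequantize one element from a fused row of width realCols + 8.
// Assumes the last 8 bytes of the row are a float scale followed by a float offset.
#include <cstdint>
#include <cstring>

float dequantizeFusedElement(const int8_t *row, size_t realCols, size_t col) {
  float scale, offset;
  std::memcpy(&scale, row + realCols, sizeof(float));      // bytes [realCols, realCols+4)
  std::memcpy(&offset, row + realCols + 4, sizeof(float)); // bytes [realCols+4, realCols+8)
  return scale * static_cast<float>(row[col]) + offset;
}
```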
[SparseLengthsWeightedSum](https://caffe2.ai/docs/operators-catalogue.html#sparselengthsweightedsumfused8bitrowwise). Glow
supports such fused Nodes/Instructions, for example
`FusedRowwiseQuantizedSparseLengthsWeightedSum`. The `ElemKind` of fused tensors
is `Int8FusedQTy`. Tensors with `Int8FusedQTy` are 2-dimensional, and have an […]
It should be uint8?
We subtract `OFFSETSHIFT` to convert to `int8_t` in our importer:
glow/lib/Importer/Caffe2ModelLoader.cpp, lines 1118 to 1127 in a64d64f
// Although in Caffe2 quantized model, the weights is int8 quantized,
// the weights is stored in uint8_t format due to that Caffe2 requires
// the type of input and weights must be the same. Therefore, we need to
// convert it to int8 by subtracting 128.
T->reset(ElemKind::Int8QTy, dim, scale, offset - OFFSETSHIFT);
auto TH = T->getHandle<int8_t>();
std::string str = dict["values"]->s();
for (; i < str.size(); i++) {
  TH.raw(i) = ((uint8_t)(str.c_str()[i]) - OFFSETSHIFT);
}
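A small self-contained example (hypothetical values, with `OFFSETSHIFT` assumed to be 128) of why subtracting the shift from both the stored value and the quantization offset leaves the dequantized real value unchanged:

```cpp
// Sketch with made-up values: shifting both the stored uint8 value and the
// quantization offset by 128 gives the same real value under
// real = scale * (q - offset).
#include <cassert>
#include <cstdint>

int main() {
  const int32_t OFFSETSHIFT = 128; // assumed value of the importer's constant
  const float scale = 0.5f;
  const int32_t offsetU8 = 130;    // zero point in the uint8 representation
  const uint8_t qU8 = 200;         // a stored uint8 value

  // Original uint8 interpretation.
  float realU8 = scale * (static_cast<int32_t>(qU8) - offsetU8);

  // int8 interpretation after subtracting OFFSETSHIFT from value and offset.
  int8_t qI8 = static_cast<int8_t>(qU8 - OFFSETSHIFT);
  int32_t offsetI8 = offsetU8 - OFFSETSHIFT;
  float realI8 = scale * (static_cast<int32_t>(qI8) - offsetI8);

  assert(realU8 == realI8); // both interpretations yield the same real value
  return 0;
}
```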
Description: As noted in #2292, we decided to implement both fused and unfused versions of rowwise-quantized SLWS.
Testing: Added OperatorTests and Caffe2ImporterTests.
Documentation: Added.
Closes #1698
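For reference, here is a minimal sketch of what the fused rowwise-quantized SLWS computation does. This is illustrative only, not Glow's kernel; it assumes fused rows laid out as int8 data followed by a float scale and float offset, and dequantization as `scale * q + offset`:

```cpp
// Reference sketch of fused rowwise-quantized SparseLengthsWeightedSum semantics.
// For each output segment defined by lengths, gather rows by indices, dequantize
// each element with the row's fused scale/offset, weight it, and accumulate.
#include <algorithm>
#include <cstdint>
#include <cstring>

void fusedRowwiseQuantizedSLWS(
    const int8_t *data, size_t fusedWidth,  // fused rows; width = realCols + 8
    const float *weights, const int64_t *indices,
    const int32_t *lengths, size_t numSegments,
    float *out) {                           // out is numSegments x realCols
  size_t realCols = fusedWidth - 8;
  size_t cur = 0; // running position into indices/weights
  for (size_t seg = 0; seg < numSegments; ++seg) {
    float *outRow = out + seg * realCols;
    std::fill(outRow, outRow + realCols, 0.0f);
    for (int32_t j = 0; j < lengths[seg]; ++j, ++cur) {
      const int8_t *row = data + indices[cur] * fusedWidth;
      float scale, offset;
      std::memcpy(&scale, row + realCols, sizeof(float));
      std::memcpy(&offset, row + realCols + 4, sizeof(float));
      for (size_t c = 0; c < realCols; ++c) {
        outRow[c] += weights[cur] * (scale * row[c] + offset);
      }
    }
  }
}
```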