
[New Operator] FusedRowwiseQuantizedSparseLengthsWeightedSumNode #2368


Merged: 8 commits from fused_rwq_slws into pytorch:master, Feb 13, 2019

Conversation

@jfix71 (Contributor) commented on Feb 9, 2019

Description: As noted in #2292, we decided to implement both fused and unfused versions of rowwise-quantized SLWS.

Testing: Added OperatorTests and Caffe2ImporterTests.

Documentation: Added.

Closes #1698

@jfix71 (Author) commented on Feb 9, 2019

Note that Caffe2 does not have an explicit "fused Int8" fill/type, so when the Caffe2Importer loads the Constant data input, it loads it as normal Int8QTy. When later loading a FusedRowwiseQuantizedSparseLengthsWeightedSumNode, the importer assumes that its data input was previously loaded as Int8QTy, and changes the type to Int8FusedQTy instead.

If any other user of the same Constant data expects it to be Int8QTy, it will now have an incorrect type. However, I don't expect this to happen: it doesn't make sense for another user of the data to be unaware that it is fused.
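To make the flow concrete, here is a minimal, self-contained sketch of the load-then-retype approach described above. All names here are hypothetical illustrations, not the actual Glow importer API:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

enum class ElemKind { Int8QTy, Int8FusedQTy };

struct ConstantData {
  ElemKind kind;
  std::vector<int8_t> payload; // raw bytes; same buffer for both kinds
  size_t rows, cols; // for Int8FusedQTy, cols includes the 8 fused bytes
};

// Step 1: the generic importer path materializes the data as plain Int8QTy,
// since Caffe2 has no dedicated "fused int8" fill to signal otherwise.
ConstantData loadAsInt8Q(std::vector<int8_t> bytes, size_t rows, size_t cols) {
  return {ElemKind::Int8QTy, std::move(bytes), rows, cols};
}

// Step 2: while loading the fused SLWS node, the importer retypes the same
// constant in place -- the bytes do not move, only their interpretation changes.
void retypeToFused(ConstantData &data) {
  assert(data.kind == ElemKind::Int8QTy && "expected a plain Int8QTy input");
  assert(data.cols > 8 && "each row needs 8 trailing bytes for scale/offset");
  data.kind = ElemKind::Int8FusedQTy;
}
```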

@yinghai (Contributor) commented on Feb 10, 2019

What's the status of this one? Is it ready for review?

@jfix71 (Author) commented on Feb 10, 2019

@yinghai Yep!

@yinghai (Contributor) left a review:

LGTM. But I'll let Glow experts accept. :)

@rdzhabarov (Contributor) left a review:

Few questions.

@rdzhabarov (Contributor) left another review:

This is great work!

@jfix71 merged commit b975212 into pytorch:master on Feb 13, 2019
@jfix71 deleted the fused_rwq_slws branch on Feb 13, 2019 at 03:57
Review thread on the new `ElemKind` entry:

```cpp
  Int32QTy,     // 32-bit quantized type (int32_t)
  Int32ITy,     // 32-bit index type (int32_t)
  Int64ITy,     // 64-bit index type (int64_t)
  Int8FusedQTy, // 8-bit quantized type with fused scale/offset (int8_t)
```
A reviewer commented:

Why do we have such a type instead of plain Int8QTy?

@jfix71 (Author) replied:

The biggest reason I wanted to explicitly differentiate between normal Int8QTy and Int8FusedQTy is that the shape of an Int8FusedQTy tensor is very strange -- it appears to be 8 columns wider than the actual data. A distinct type prevents accidentally interpreting the fused scales/offsets as data, or viewing the input as an otherwise-normal tensor. And I didn't see a large downside to adding the new type.
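To illustrate the layout concretely, here is a standalone sketch (not Glow's implementation) of packing an N x K int8 matrix into the N x (K + 8) fused form described above. The exact split of the 8 trailing bytes into a 4-byte float scale followed by a 4-byte int32 offset is an assumption for illustration:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Pack an N x K int8 matrix into the N x (K + 8) fused layout: each row
// carries its quantized values followed by 8 bytes of per-row quantization
// parameters (assumed here: a 4-byte float scale, then a 4-byte int32 offset).
std::vector<uint8_t> fuseRowwise(const std::vector<int8_t> &data, size_t rows,
                                 size_t cols, const std::vector<float> &scales,
                                 const std::vector<int32_t> &offsets) {
  const size_t fusedCols = cols + 8;
  std::vector<uint8_t> fused(rows * fusedCols);
  for (size_t r = 0; r < rows; ++r) {
    uint8_t *row = fused.data() + r * fusedCols;
    std::memcpy(row, data.data() + r * cols, cols);            // row data
    std::memcpy(row + cols, &scales[r], sizeof(float));        // fused scale
    std::memcpy(row + cols + 4, &offsets[r], sizeof(int32_t)); // fused offset
  }
  return fused;
}
```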

Review thread on the documentation (excerpt):

> ... [SparseLengthsWeightedSum](https://caffe2.ai/docs/operators-catalogue.html#sparselengthsweightedsumfused8bitrowwise). Glow supports such fused Nodes/Instructions, for example `FusedRowwiseQuantizedSparseLengthsWeightedSum`. The `ElemKind` of fused tensors is `Int8FusedQTy`. Tensors with `Int8FusedQTy` are 2-dimensional, and have an extra 8 columns in each row holding that row's fused scale and offset.
A reviewer commented:

It should be uint8?

@jfix71 (Author) replied:

We subtract OFFSETSHIFT to convert to int8_t in our importer:

```cpp
// Although the weights are int8-quantized in the Caffe2 model, they are
// stored in uint8_t format because Caffe2 requires the input and the
// weights to have the same type. Therefore, we need to convert them to
// int8 by subtracting 128.
T->reset(ElemKind::Int8QTy, dim, scale, offset - OFFSETSHIFT);
auto TH = T->getHandle<int8_t>();
std::string str = dict["values"]->s();
for (size_t i = 0; i < str.size(); i++) {
  TH.raw(i) = ((uint8_t)(str.c_str()[i]) - OFFSETSHIFT);
}
```
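Note that the shift is value-preserving: the dequantized result is scale * (q - offset), and since both the stored byte and the offset are reduced by OFFSETSHIFT (128), scale * ((q - 128) - (offset - 128)) equals scale * (q - offset).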
