[New Operator] FusedRowwiseQuantizedSparseLengthsWeightedSumNode #2368
Conversation
Force-pushed from cbec569 to aa8414d.
Note that Caffe2 does not have an explicit "fused Int8" fill/type, so when the Caffe2Importer loads the Constant […]. If there are other users of the same Constant […]
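Conceptually, "fusing" means each row of the quantized payload carries its own scale and offset in-line after the data bytes. Below is a minimal, hypothetical packing sketch; the function name is mine, and the assumption that both scale and offset are stored as 4-byte floats follows the Caffe2 fused-8-bit-rowwise convention rather than the importer's actual code:

```cpp
// Hypothetical packing sketch (not the Caffe2Importer implementation).
// Assumes each fused row is laid out as [int8 data..., float scale, float offset],
// i.e. 8 bytes wider than the real data.
#include <cstdint>
#include <cstring>
#include <vector>

std::vector<int8_t> packFusedRowwise(const std::vector<int8_t> &data,
                                     const std::vector<float> &scales,
                                     const std::vector<float> &offsets,
                                     size_t rows, size_t cols) {
  std::vector<int8_t> fused(rows * (cols + 8));
  for (size_t r = 0; r < rows; ++r) {
    int8_t *row = fused.data() + r * (cols + 8);
    std::memcpy(row, data.data() + r * cols, cols);           // quantized data bytes
    std::memcpy(row + cols, &scales[r], sizeof(float));       // fused per-row scale
    std::memcpy(row + cols + 4, &offsets[r], sizeof(float));  // fused per-row offset
  }
  return fused;
}
```

What's the status of this one? Is it ready for review?
@yinghai Yep!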
LGTM. But I'll let Glow experts accept. :)
A few questions.
Force-pushed from aa8414d to f5c7863.
Force-pushed from f5c7863 to 365bc47.
This is great work!
Force-pushed from 365bc47 to 2557e47.
Force-pushed from 2557e47 to 3f0740f.
Int32QTy, // 32-bit quantized type (int32_t)
Int32ITy, // 32-bit index type (int32_t)
Int64ITy, // 64-bit index type (int64_t)
Int8FusedQTy, // 8-bit quantized type with fused scale/offset (int8_t)
Why do we have such a type instead of plain `Int8ITy`?
The biggest reason I wanted to explicitly differentiate between normal `Int8QTy` and `Int8FusedQTy` is that the shape for `Int8FusedQTy` is very strange -- it appears to be 8 columns wider than the actual data. This prevents accidentally interpreting the fused scales/offsets as data, or viewing the input data as an otherwise normal tensor. And I didn't think there was a large downside to adding the new type.
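As a rough illustration (a standalone sketch, not Glow's `Tensor` API), reading an element out of an `Int8FusedQTy`-style row means skipping past the real columns to find that row's scale and offset. It assumes the Caffe2 fused-8-bit-rowwise convention, where the real value is recovered as `scale * q + offset` and both are stored as 4-byte floats:

```cpp
// Standalone sketch: dequantize one element from a fused row of width realCols + 8.
// Assumes the last 8 bytes of the row are a float scale followed by a float offset.
#include <cstdint>
#include <cstring>

float dequantizeFusedElement(const int8_t *row, size_t realCols, size_t col) {
  float scale, offset;
  std::memcpy(&scale, row + realCols, sizeof(float));      // bytes [realCols, realCols+4)
  std::memcpy(&offset, row + realCols + 4, sizeof(float)); // bytes [realCols+4, realCols+8)
  return scale * static_cast<float>(row[col]) + offset;
}
```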
[SparseLengthsWeightedSum](https://caffe2.ai/docs/operators-catalogue.html#sparselengthsweightedsumfused8bitrowwise). Glow
supports such fused Nodes/Instructions, for example
`FusedRowwiseQuantizedSparseLengthsWeightedSum`. The `ElemKind` of fused tensors
is `Int8FusedQTy`. Tensors with `Int8FusedQTy` are 2-dimensional, and have an […]
It should be uint8?
We subtract `OFFSETSHIFT` to convert to `int8_t` in our importer:
glow/lib/Importer/Caffe2ModelLoader.cpp, lines 1118 to 1127 in a64d64f
// Although in Caffe2 quantized model, the weights is int8 quantized,
// the weights is stored in uint8_t format due to that Caffe2 requires
// the type of input and weights must be the same. Therefore, we need to
// convert it to int8 by subtracting 128.
T->reset(ElemKind::Int8QTy, dim, scale, offset - OFFSETSHIFT);
auto TH = T->getHandle<int8_t>();
std::string str = dict["values"]->s();
for (; i < str.size(); i++) {
  TH.raw(i) = ((uint8_t)(str.c_str()[i]) - OFFSETSHIFT);
}
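A small self-contained example (hypothetical values, with `OFFSETSHIFT` assumed to be 128) of why subtracting the shift from both the stored value and the quantization offset leaves the dequantized real value unchanged:

```cpp
// Sketch with made-up values: shifting both the stored uint8 value and the
// quantization offset by 128 gives the same real value under
// real = scale * (q - offset).
#include <cassert>
#include <cstdint>

int main() {
  const int32_t OFFSETSHIFT = 128; // assumed value of the importer's constant
  const float scale = 0.5f;
  const int32_t offsetU8 = 130;    // zero point in the uint8 representation
  const uint8_t qU8 = 200;         // a stored uint8 value

  // Original uint8 interpretation.
  float realU8 = scale * (static_cast<int32_t>(qU8) - offsetU8);

  // int8 interpretation after subtracting OFFSETSHIFT from value and offset.
  int8_t qI8 = static_cast<int8_t>(qU8 - OFFSETSHIFT);
  int32_t offsetI8 = offsetU8 - OFFSETSHIFT;
  float realI8 = scale * (static_cast<int32_t>(qI8) - offsetI8);

  assert(realU8 == realI8); // both interpretations yield the same real value
  return 0;
}
```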
Description: As noted in #2292, we decided to implement both fused and unfused versions of rowwise-quantized SLWS.
Testing: Added OperatorTests and Caffe2ImporterTests.
Documentation: Added.
Closes #1698
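For reference, here is a minimal sketch of what the fused rowwise-quantized SLWS computation does. This is illustrative only, not Glow's kernel; it assumes fused rows laid out as int8 data followed by a float scale and float offset, and dequantization as `scale * q + offset`:

```cpp
// Reference sketch of fused rowwise-quantized SparseLengthsWeightedSum semantics.
// For each output segment defined by lengths, gather rows by indices, dequantize
// each element with the row's fused scale/offset, weight it, and accumulate.
#include <algorithm>
#include <cstdint>
#include <cstring>

void fusedRowwiseQuantizedSLWS(
    const int8_t *data, size_t fusedWidth,  // fused rows; width = realCols + 8
    const float *weights, const int64_t *indices,
    const int32_t *lengths, size_t numSegments,
    float *out) {                           // out is numSegments x realCols
  size_t realCols = fusedWidth - 8;
  size_t cur = 0; // running position into indices/weights
  for (size_t seg = 0; seg < numSegments; ++seg) {
    float *outRow = out + seg * realCols;
    std::fill(outRow, outRow + realCols, 0.0f);
    for (int32_t j = 0; j < lengths[seg]; ++j, ++cur) {
      const int8_t *row = data + indices[cur] * fusedWidth;
      float scale, offset;
      std::memcpy(&scale, row + realCols, sizeof(float));
      std::memcpy(&offset, row + realCols + 4, sizeof(float));
      for (size_t c = 0; c < realCols; ++c) {
        outRow[c] += weights[cur] * (scale * row[c] + offset);
      }
    }
  }
}
```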