
Commit 356e087

[docs] Add blurb to docs on RWQ SLWS
Also reformat a little and remove trailing whitespace.
1 parent 61fb93d

1 file changed: +33 additions, -26 deletions

docs/Quantization.md

@@ -57,16 +57,16 @@ inference. Then, we recompile the network using this profile information to
convert the network into a quantized form, allowing for static optimization of
the quantized graph. We convert portions of the network into islands of integer
computation and aim to generate outputs in the range that the original
floating-point network produces. During the conversion, for the following types
of quantized nodes, we ignore the output's quantization params (if they are
provided) and force the output to have the same quantization params as the
input for performance purposes:
```
LocalResponseNormalizationNode
SliceNode
ReshapeNode
TopKNode
GatherNode
MaxPoolNode
```
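A rough numpy sketch of why this is safe (helper names are ours, not Glow's): nodes like `SliceNode`, `ReshapeNode`, `TopKNode`, `GatherNode` and `MaxPoolNode` only select or rearrange existing input values, so the input's (scale, offset) still describes every output element exactly; for `LocalResponseNormalizationNode` the values do change, and reusing the params there is purely the performance trade-off mentioned above.
```
import numpy as np

scale, offset = 0.1, -3   # quantization params of the input tensor

def dequantize(q):
    # Affine dequantization: f = scale * (q - offset).
    return scale * (q.astype(np.float32) - offset)

q_in = np.array([[12, -7], [40, 5]], dtype=np.int8)
q_out = q_in.reshape(4)   # what a ReshapeNode does to the quantized payload

# Every output value already exists in the input, so reusing the input's
# (scale, offset) is exact and no requantization step is needed.
assert np.allclose(dequantize(q_in).ravel(), dequantize(q_out))
```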

@@ -131,9 +131,9 @@ By default, target quantization precision is int8. However, precision can be
controlled via the command line parameter `quantization-precision`. There are
two supported values: `Int8` and `Int16`.
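For a back-of-the-envelope feel for the trade-off, here is a self-contained numpy sketch (ours, not Glow code): the worst-case round-trip error of an affine quantizer over a fixed range shrinks by a factor of roughly 256 going from int8 to int16.
```
import numpy as np

def max_quant_error(values, lo, hi, levels):
    # Affine quantization of `values` from [lo, hi] onto `levels` integer
    # steps, then dequantization; returns the worst-case round-trip error.
    scale = (hi - lo) / (levels - 1)
    q = np.round((values - lo) / scale)
    return np.abs((q * scale + lo) - values).max()

x = np.random.uniform(-1.0, 1.0, size=10000)
print("Int8  max error:", max_quant_error(x, -1.0, 1.0, 2**8))   # ~ 1/255
print("Int16 max error:", max_quant_error(x, -1.0, 1.0, 2**16))  # ~ 1/65535
```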

## Caffe2 Quantized Model Support

Glow is able to support the Caffe2 Resnet50 quantized model:
https://github.com/caffe2/models/tree/master/resnet50_quantized

To support Caffe2 quantized models, Glow has:
@@ -152,16 +152,16 @@ Int8GivenTensorFill
```
- Supported int32 quantized bias.

In most cases, bias is quantized to int32 to improve precision (the partial
sum of the matrix-matrix multiplication is accumulated into int32, so the
int32 bias can be added to the int32 partial sum for better accuracy). Glow
now supports int32 quantized bias in ```Convolution```, ```FullyConnected```
and ```RowwiseQuantizedFullyConnected``` nodes.
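A numpy sketch of that arithmetic (ours, not Glow's kernels; offsets are taken as zero for brevity, and the bias scale follows the common input_scale * weight_scale convention so the bias adds directly into the accumulator):
```
import numpy as np

x_scale, w_scale = 0.02, 0.05           # activation and weight scales
b_scale = x_scale * w_scale             # bias scale chosen so it adds directly

x_q = np.array([[10, -4, 7]], dtype=np.int8)      # quantized activations
w_q = np.array([[3], [25], [-8]], dtype=np.int8)  # quantized weights
b_q = np.array([1200], dtype=np.int32)            # int32 quantized bias

# Accumulate int8*int8 products in int32, then add the int32 bias.
acc = x_q.astype(np.int32) @ w_q.astype(np.int32) + b_q

# Dequantizing the accumulator matches the float reference exactly.
y = b_scale * acc.astype(np.float32)
ref = (x_scale * x_q.astype(np.float32)) @ (w_scale * w_q.astype(np.float32)) \
      + b_scale * b_q.astype(np.float32)
assert np.allclose(y, ref)
```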

- Supported the conversion from uint8 quantized activations to int8 quantized activations.

For the quantized Caffe2 ops, the activations are quantized to uint8. In Glow,
the activations are quantized to int8. Therefore, for the offset read from the
quantized Caffe2 model, we need to subtract 128 (i.e., add INT8_MIN) to make
the activations int8.
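A small numpy sketch of that shift (ours, not Glow's code): subtracting 128 from both the stored values and the offset leaves the dequantized values untouched.
```
import numpy as np

scale = 0.02
u8_offset = 131                                    # offset from the Caffe2 model
u8_vals = np.array([0, 131, 255], dtype=np.uint8)  # uint8 activations

# Shift both values and offset by 128 (i.e., add INT8_MIN = -128).
i8_offset = u8_offset - 128
i8_vals = (u8_vals.astype(np.int16) - 128).astype(np.int8)

# The affine mapping is unchanged: scale * (q - offset) is identical.
assert np.allclose(scale * (u8_vals.astype(np.float32) - u8_offset),
                   scale * (i8_vals.astype(np.float32) - i8_offset))
```
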
## Compiler Optimizations
@@ -191,17 +191,24 @@ For more specific graph optimizations check [here](Optimizations.md#quantization

## Row-wise Quantization

Row-wise (or channel-wise) quantization is an important way to minimize the
accuracy drop. Glow supports the row-wise quantized FullyConnected node
```RowwiseQuantizedFullyConnected```, which is enabled by the
image-classifier/loader option "-enable-rowwise".

For the regular quantized FC, we quantize the whole weights tensor with the
same scale and offset, which are computed based on the max and min of the
entire tensor. But for row-wise, after getting ```min_i``` and ```max_i``` for
each row ```i```, we compute the pair ```(scale_i, offset_i)``` to quantize
each element in row ```i```. The figure below shows the quantized FC node and
the RowwiseQuantizedFullyConnected node. Instead of using only one tensor to
represent the quantized weights, we need two extra vectors ```Scales``` and
```Offsets``` to store the ```(scale, offset)``` for each row.

![](rowwise_quantized_fc.png)
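
To make ```(scale_i, offset_i)``` concrete, here is a small numpy sketch (our illustration, not Glow's implementation) using the affine scheme f = scale * (q - offset) with int8 targets:
```
import numpy as np

def rowwise_quantize(w):
    # One (scale, offset) pair per row, from that row's min and max.
    # (Rows with max == min would need special-casing; skipped here.)
    mins = w.min(axis=1, keepdims=True)
    maxs = w.max(axis=1, keepdims=True)
    scales = (maxs - mins) / 255.0            # int8 spans 255 steps
    offsets = np.round(-128.0 - mins / scales)
    q = np.clip(np.round(w / scales) + offsets, -128, 127).astype(np.int8)
    return q, scales.ravel(), offsets.ravel().astype(np.int32)

w = np.array([[0.0, 0.5, 1.0],
              [-4.0, 0.0, 4.0]])              # rows with very different ranges
q, scales, offsets = rowwise_quantize(w)

# Each row is recovered with error bounded by its own scale/2, instead of
# by a single scale derived from the whole tensor.
w_hat = scales[:, None] * (q.astype(np.float32) - offsets[:, None])
assert np.abs(w_hat - w).max() <= 0.5 * scales.max() + 1e-6
```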

Row-wise quantized SparseLengthsWeightedSum is also supported. Similar to the
above, we compute scales and offsets per row, to be used with the `Data` input
for the `RowwiseQuantizedSparseLengthsSumNode`. Scales and Offsets are inputs
to the node. The output of this node is float, matching the Caffe2
implementation.
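
A compact numpy sketch of the node's semantics (our illustration, not Glow's kernel; names mirror the inputs described above): each looked-up row of `Data` is dequantized with its own scale and offset, scaled by its per-lookup weight, and accumulated per segment, yielding a float result as in Caffe2.
```
import numpy as np

def rwq_sparse_lengths_weighted_sum(data_q, scales, offsets, weights,
                                    indices, lengths):
    # Dequantize each gathered row of Data with its own (scale, offset).
    rows = scales[indices, None] * (data_q[indices].astype(np.float32)
                                    - offsets[indices, None])
    rows *= weights[:, None]                  # apply per-lookup weights
    out = np.zeros((len(lengths), data_q.shape[1]), dtype=np.float32)
    start = 0
    for seg, n in enumerate(lengths):         # sum each segment's rows
        out[seg] = rows[start:start + n].sum(axis=0)
        start += n
    return out                                # float output, as in Caffe2

data_q = np.array([[10, -3], [7, 7], [-128, 127]], dtype=np.int8)
scales = np.array([0.1, 0.02, 0.5], dtype=np.float32)
offsets = np.array([0, -1, 2], dtype=np.float32)
weights = np.array([1.0, 0.5, 2.0], dtype=np.float32)
indices = np.array([2, 0, 1])                 # rows to gather from Data
lengths = np.array([2, 1])                    # first segment has 2 lookups
print(rwq_sparse_lengths_weighted_sum(data_q, scales, offsets,
                                      weights, indices, lengths))
```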
