
Commit 817c459

mcremon-meta authored and facebook-github-bot committed
Remove layer norm from the default quantizer, add one that has it
Summary: Layer norm does not perform well in quantized mode, and it currently uses a split scheme (weights are quantized, activations are not). In most cases it is actually much faster to keep it in fp32, so this diff removes it from the default quantizer. We add a CadenceWithLayerNormQuantizer for easy access to the old behavior, which can still be useful in some cases (mostly when quantizing layer norm helps extend the quantized liveness). A usage sketch follows the diff below.

Differential Revision: D72941790
1 parent 82ff404 commit 817c459

File tree

1 file changed (+13 -2 lines)


backends/cadence/aot/quantizer/quantizer.py

Lines changed: 13 additions & 2 deletions
@@ -193,7 +193,6 @@ def get_cadence_default_quantizers() -> List[Quantizer]:
         CadenceAtenQuantizer(BmmPattern(), qconfig_A8W8),
         CadenceAtenQuantizer(Conv1dPattern(), qconfig_A8W8sym),
         CadenceAtenQuantizer(Conv2dPattern(), qconfig_A8W8sym),
-        CadenceAtenQuantizer(LayerNormPattern(), qconfig_A8W8),
         CadenceAtenQuantizer(LinearPattern(), qconfig_A8W8),
         CadenceAtenQuantizer(MatmulPattern(), qconfig_A8W8),
         CadenceAtenQuantizer(ReluPattern0(), qconfig_A8W8),
@@ -236,9 +235,21 @@ def __init__(
         super().__init__([])
 
 
+class CadenceWithLayerNormQuantizer(CadenceQuantizer):
+    """
+    Quantizer including layer norm
+    """
+
+    def __init__(self, quantizers: Optional[list[Quantizer]] = None) -> None:
+        if quantizers is None:
+            quantizers = get_cadence_default_quantizers()
+        quantizers.append(CadenceAtenQuantizer(LayerNormPattern(), qconfig_A8W8))
+        super().__init__(quantizers)
+
+
 class CadenceWakeWordQuantizer(CadenceQuantizer):
     """
-    Quantizer for WakeWord, including add
+    Quantizer for WakeWord, including add and cat
     """
 
     def __init__(self, quantizers: Optional[list[Quantizer]] = None) -> None:
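
For reference, a minimal usage sketch of the two entry points after this change. This is a sketch, not part of the commit: the import path is assumed from the file this diff touches, and the surrounding export/calibration flow is omitted.

# Sketch only: module path inferred from backends/cadence/aot/quantizer/quantizer.py;
# the commit itself does not show imports.
from executorch.backends.cadence.aot.quantizer.quantizer import (
    CadenceWithLayerNormQuantizer,
    get_cadence_default_quantizers,
)

# After this diff, the default list no longer includes LayerNormPattern,
# so layer norm stays in fp32 under the default quantizer.
default_quantizers = get_cadence_default_quantizers()

# Opt back into the old behavior: the new subclass appends an A8W8
# LayerNormPattern quantizer to the defaults before calling the base __init__.
quantizer = CadenceWithLayerNormQuantizer()

The new subclass follows the same pattern as CadenceWakeWordQuantizer: an optional quantizers argument defaults to the standard list, and the extra pattern is appended on top, so callers can still pass a custom list.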
