Remove layer norm from the default quantizer, add one that has it
Summary:
Layer norm does not perform well in quantized mode and currently uses a split scheme (weights are quantized, activations are not). In most cases it is actually much faster to keep it in fp32, so this diff removes it from the default quantizer.
We add a CadenceWithLayerNormQuantizer for easy access to the previous behavior, which can still be useful in some cases (mostly when quantizing layer norm helps extend the quantized liveness), as sketched below.
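A minimal sketch of how a caller might opt back in to quantized layer norm by swapping quantizers in the PT2E flow. The import path, the default quantizer class name, and the export/prepare steps are assumptions for illustration; only CadenceWithLayerNormQuantizer is named by this diff.

```python
# Sketch only: import paths and the default quantizer name are assumed, not confirmed here.
import torch
from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_pt2e

# Hypothetical import location for the Cadence quantizers.
from executorch.backends.cadence.aot.quantizer.quantizer import (
    CadenceDefaultQuantizer,        # assumed name; layer norm now stays in fp32
    CadenceWithLayerNormQuantizer,  # opt back in to quantized layer norm
)


class Model(torch.nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.linear = torch.nn.Linear(16, 16)
        self.norm = torch.nn.LayerNorm(16)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.norm(self.linear(x))


model = Model().eval()
example_inputs = (torch.randn(1, 16),)

# Default behavior after this diff: layer norm is left in fp32.
quantizer = CadenceDefaultQuantizer()
# To quantize layer norm as well (e.g. to keep a longer quantized region):
# quantizer = CadenceWithLayerNormQuantizer()

# Export and annotate/convert; the exact export API may vary by PyTorch version.
exported = torch.export.export(model, example_inputs).module()
prepared = prepare_pt2e(exported, quantizer)
prepared(*example_inputs)           # calibration pass
converted = convert_pt2e(prepared)  # quantized graph
```

The design point here is that the trade-off is made explicit at quantizer-selection time rather than baked into the default: most users get the faster fp32 layer norm, while models that benefit from a longer quantized region can opt in.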
Differential Revision: D72941790