Commit 9d12efd

cccclai authored and facebook-github-bot committed
set per channel quant for weight (#3709)
Summary: Pull Request resolved: #3709

As title; verified with the stories model, and the accuracy is better.

Reviewed By: kirklandsign
Differential Revision: D57655227
fbshipit-source-id: 6257aaafb26f1a91c749c4fc1e2efca609e07935
1 parent f42942a commit 9d12efd

File tree

1 file changed: 2 additions, 0 deletions


examples/models/llama2/lib/quant_lib.py

Lines changed: 2 additions & 0 deletions
@@ -158,6 +158,8 @@ def get_qnn_quantizer(args):
         backend == "qnn"
     ), f"The quantization config is for backend {backend} instead of qnn."
     qnn_quantizer = QnnQuantizer()
+    qnn_quantizer.set_per_channel_conv_quant(enable=True)
+    qnn_quantizer.set_per_channel_linear_quant(enable=True)
     # more custom quantization are supported including 16a4w etc. default to 8bit quantized
     custom_annotations = ()
     if quant_config == "8a8w":
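As a side note on why this two-line change can improve accuracy: per-tensor quantization shares one scale across all output channels of a weight, so channels with small-magnitude weights lose most of their precision, while per-channel quantization picks a scale per output channel. The sketch below illustrates that effect with a hypothetical `fake_quant` helper; it is not the actual QnnQuantizer implementation, just a minimal demonstration of the idea.

```python
def fake_quant(row, scale):
    """Symmetric int8 quantize/dequantize of one weight row with a given scale."""
    return [max(-128, min(127, round(w / scale))) * scale for w in row]

# Two output channels with very different magnitudes.
weights = [
    [0.01, -0.008, 0.005, -0.002],  # small-magnitude channel
    [1.0, -0.75, 0.5, -0.25],       # large-magnitude channel
]

# Per-tensor: one scale derived from the global max.
global_scale = max(abs(w) for row in weights for w in row) / 127
err_tensor = sum(
    abs(w - q)
    for row in weights
    for w, q in zip(row, fake_quant(row, global_scale))
)

# Per-channel: one scale per row (output channel).
err_channel = sum(
    abs(w - q)
    for row in weights
    for w, q in zip(row, fake_quant(row, max(abs(w) for w in row) / 127))
)

# The shared scale crushes the small channel, so per-channel error is lower.
assert err_channel < err_tensor
```

The large-magnitude channel quantizes identically either way; the entire error difference comes from the small channel, which is essentially rounded to zero under the shared per-tensor scale.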
