Commit 76ca12d

Change swap threshold
Signed-off-by: Barry Kang <[email protected]>
1 parent 46811d3 commit 76ca12d

1 file changed: +1 -1 lines changed

tensorrt_llm/_torch/modules/linear.py

Lines changed: 1 addition & 1 deletion
@@ -574,7 +574,7 @@ def apply(self, module: Linear, input: torch.Tensor,
 
         if get_sm_version() == 100:
             import deep_gemm
-            if input.shape[0] < 128:
+            if input.shape[0] < 32:
                 # Swap AB
                 a, a_sf = fp8_utils.per_token_quant_and_transform(input,
                                                                   swap_ab=True)
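
To make the effect of the threshold change concrete, the following is a minimal sketch in plain Python rather than code from the repository. It only mirrors the comparison `input.shape[0] < threshold` shown in the diff; `uses_swap_ab` and `affected_row_counts` are hypothetical helpers introduced for illustration, and nothing here calls `deep_gemm` or `fp8_utils`.

```python
OLD_SWAP_AB_THRESHOLD = 128  # value before this commit (from the diff)
NEW_SWAP_AB_THRESHOLD = 32   # value after this commit (from the diff)


def uses_swap_ab(num_rows: int, threshold: int) -> bool:
    """Hypothetical helper: mirrors `input.shape[0] < threshold` from the diff."""
    return num_rows < threshold


def affected_row_counts(old: int = OLD_SWAP_AB_THRESHOLD,
                        new: int = NEW_SWAP_AB_THRESHOLD) -> range:
    """Row counts whose branch differs between the old and new thresholds."""
    lo, hi = sorted((old, new))
    return range(lo, hi)


if __name__ == "__main__":
    affected = affected_row_counts()
    # Inputs with this many rows took the "Swap AB" branch before the commit
    # and no longer take it afterwards.
    print(f"rows affected: {affected.start}..{affected.stop - 1}")
```

Under this reading, only inputs whose first dimension lies in the half-open interval [32, 128) change behavior: they previously took the Swap AB branch and now fall through to whatever follows that `if` in `apply`, which the diff does not show.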
