
Commit b36d574

Merge pull request vllm-project#1 from Bellk17/main: Triton compilation fix

2 parents: c2b4a1b + fae4f82

File tree

1 file changed (+5 −1 lines)


vllm/attention/ops/triton_flash_attention.py

Lines changed: 5 additions & 1 deletion
```diff
@@ -415,7 +415,11 @@ def attn_fwd(
         return

     is_mqa = hq != hk
-    off_h_k = off_h_q % hk if is_mqa else off_h_q
+    if is_mqa:  # noqa: SIM108
+        off_h_k = off_h_q % hk
+    else:
+        off_h_k = off_h_q
     n_extra_tokens = 0
     if seqlen_k < BLOCK_N:
         n_extra_tokens = BLOCK_N - seqlen_k
```
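For context, the patched lines compute which key/value head a given query head should read when the kernel runs multi-query attention (fewer KV heads than query heads). The sketch below is a plain-Python illustration of that mapping, not the Triton kernel itself; the function name `kv_head_index` is hypothetical, while `off_h_q`, `hq`, and `hk` follow the diff. The commit title suggests the ternary form tripped up Triton's kernel compilation, so the patch spells it out as an explicit `if`/`else`; that motivation is inferred from the title, not stated in the diff.

```python
# Hypothetical helper (illustration only): the KV head-index mapping
# performed by the patched lines in attn_fwd. With hq query heads
# sharing hk key/value heads, a query head index is folded onto a
# KV head index with a modulo, exactly as in the diff.
def kv_head_index(off_h_q: int, hq: int, hk: int) -> int:
    is_mqa = hq != hk  # multi-query: fewer KV heads than query heads
    if is_mqa:
        off_h_k = off_h_q % hk
    else:
        off_h_k = off_h_q
    return off_h_k

# e.g. 8 query heads sharing 2 KV heads: query head 5 maps to KV head 1
print(kv_head_index(5, 8, 2))  # -> 1
```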

0 commit comments
