Commit 5b877be

Fix an issue when piping attn_logits_soft_cap through in vllm. (#8600)
1 parent fbbdfca commit 5b877be

File tree

1 file changed: 2 additions & 1 deletion

torch_xla/experimental/custom_kernel.py

Lines changed: 2 additions & 1 deletion
@@ -1080,7 +1080,8 @@ def flash_attention_non_xla(q: torch.Tensor,
 
 
 XLA_LIB.define(
-    "paged_attention(Tensor q, Tensor k_pages, Tensor v_pages, Tensor lengths, Tensor page_indices, int pages_per_compute_block, str megacore_mode=None, float attn_logits_soft_cap=None) -> Tensor",
+    "paged_attention(Tensor q, Tensor k_pages, Tensor v_pages, Tensor lengths, Tensor page_indices,"
+    " int pages_per_compute_block, str megacore_mode=None, float? attn_logits_soft_cap=None) -> Tensor",
 )
 
 
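The schema change swaps float for float?, which declares attn_logits_soft_cap as an optional float in the torch op schema so a Python None can be piped through, as vLLM does when no soft cap is set. Below is a minimal, hypothetical sketch of the same pattern using a stand-alone torch.library.Library; the "demo" namespace, the soft_capped_op name, and the tanh soft-capping body are illustrative assumptions, not the torch_xla implementation.

# Minimal sketch (not the torch_xla source): shows why the schema needs
# "float?" instead of "float" when Python callers may pass None.
# The "demo" namespace and op name are hypothetical.
import torch
from torch.library import Library

DEMO_LIB = Library("demo", "DEF")

# "float?" declares the argument as an optional float, so None is accepted;
# a plain "float" makes the dispatcher reject a None argument at call time.
DEMO_LIB.define(
    "soft_capped_op(Tensor x, float? attn_logits_soft_cap=None) -> Tensor")


def soft_capped_op(x, attn_logits_soft_cap=None):
    # Apply tanh soft-capping only when a cap value was piped through.
    if attn_logits_soft_cap is not None:
        return attn_logits_soft_cap * torch.tanh(x / attn_logits_soft_cap)
    return x


DEMO_LIB.impl("soft_capped_op", soft_capped_op, "CompositeExplicitAutograd")

# Both calls satisfy the schema because the float argument is optional.
x = torch.randn(4)
torch.ops.demo.soft_capped_op(x)                             # cap omitted (None)
torch.ops.demo.soft_capped_op(x, attn_logits_soft_cap=30.0)  # explicit cap

With a plain float in the schema, the first call would be rejected because None is not a valid float, which is the kind of failure this commit addresses for paged_attention.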
