Fix an issue when piping attn_logits_soft_cap through in vllm. #8600

fenghuizhang · 2025-01-22T15:57:35Z

PR on vllm here: vllm-project/vllm#12294.

Found an issue when piping the soft cap through in vllm: https://buildkite.com/vllm/fastcheck/builds/12156#01948bc4-58cd-4863-9eca-e2ea098879f9

It looks like torch.compile could not trace the kernel because the argument was declared as a float, so None could not be passed into the function.
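To illustrate the fix, here is a minimal pure-Python sketch of an optional soft-cap argument being piped through a wrapper. It is not the kernel code from this PR: the function names `apply_soft_cap` and `attention_scores` are hypothetical, and the `cap * tanh(x / cap)` form is the commonly used soft-capping formula, assumed here rather than taken from the PR. Only the parameter name `attn_logits_soft_cap` and the idea of marking it optional come from the description above.

```python
import math
from typing import List, Optional


def apply_soft_cap(logits: List[float],
                   attn_logits_soft_cap: Optional[float] = None) -> List[float]:
    """Soft-cap attention logits, or return them unchanged when no cap is set.

    Assumption: the cap uses the common cap * tanh(x / cap) form.
    """
    if attn_logits_soft_cap is None:
        # Declaring the parameter Optional lets callers pass None to mean
        # "no capping", which a plain float parameter cannot express.
        return list(logits)
    cap = attn_logits_soft_cap
    return [cap * math.tanh(x / cap) for x in logits]


def attention_scores(logits: List[float],
                     attn_logits_soft_cap: Optional[float] = None) -> List[float]:
    # Hypothetical wrapper: it forwards the optional argument untouched, so
    # every layer in the call chain needs the same Optional[float] signature.
    return apply_soft_cap(logits, attn_logits_soft_cap)
```

Callers that do not use soft capping can omit the argument or pass `None`, while callers that do use it pass a float such as `30.0`.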


fenghuizhang and others added 14 commits January 15, 2025 18:43

Pipes attn_logits_soft_cap through multi_queries_paged_attention

2b98060

Implements attn_logits_soft_cap and pass it through multi_queries_paged_attention

8106ad2

Implements attn_logits_soft_cap and pass it through multi_queries_paged_attention

8802322

Implements attn_logits_soft_cap and pass it through multi_queries_paged_attention

9e57ad4

Implements attn_logits_soft_cap and pass it through multi_queries_paged_attention

1835876

Implements attn_logits_soft_cap and pass it through multi_queries_paged_attention

68cd431

Implements attn_logits_soft_cap and pass it through multi_queries_paged_attention

491dbdb

Implements attn_logits_soft_cap and pass it through multi_queries_paged_attention

b8660fe

Implements attn_logits_soft_cap and pass it through multi_queries_paged_attention

351de89

Implements attn_logits_soft_cap and pass it through multi_queries_paged_attention

2ce9e2f

Implements attn_logits_soft_cap and pass it through multi_queries_paged_attention

19cf3a0

Implements attn_logits_soft_cap and pass it through multi_queries_paged_attention

172f9cd

Fix the signature of paged_attention by marking attn_logits_soft_cap optional

633792c

Merge branch 'pytorch:master' into master

28df218

fenghuizhang marked this pull request as ready for review January 22, 2025 15:58

fenghuizhang mentioned this pull request Jan 22, 2025

[Kernel] Pipe attn_logits_soft_cap through paged attention TPU kernels vllm-project/vllm#12294

Closed

lsy323 approved these changes Jan 22, 2025


lsy323 merged commit 5b877be into pytorch:master Jan 22, 2025
12 checks passed
