@LucasWilkinson (Collaborator) commented on Jun 9, 2025

NOTE: cmake changes lifted from vllm-project/vllm#18155

Helps with wheel size. I am not seeing any performance degradation on Blackwell once the cache is warmed up.

See vllm-project/vllm#19336 for more details
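
For readers unfamiliar with the "8.0 PTX" shorthand in the title, here is a minimal sketch of what it usually means in nvcc terms. It is not the cmake change from this PR. It assumes the common `8.0+PTX` arch-list convention, where `+PTX` embeds PTX for that compute capability alongside the native binary so the driver can JIT-compile it on newer GPUs. The helper name `gencode_flags` is hypothetical.

```python
def gencode_flags(arch_list: str) -> list[str]:
    """Translate an arch list such as "8.0+PTX" into nvcc -gencode flags.

    Hypothetical helper for illustration only; the PR's actual cmake
    logic is not shown in this conversation.
    """
    flags = []
    # Accept both space- and semicolon-separated entries.
    for entry in arch_list.replace(";", " ").split():
        emit_ptx = entry.endswith("+PTX")
        num = entry.removesuffix("+PTX").replace(".", "")
        # Native machine code (SASS) for this architecture.
        flags.append(f"-gencode=arch=compute_{num},code=sm_{num}")
        if emit_ptx:
            # Also embed PTX, which the driver can JIT-compile on
            # architectures that have no matching SASS in the binary.
            flags.append(f"-gencode=arch=compute_{num},code=compute_{num}")
    return flags


if __name__ == "__main__":
    for flag in gencode_flags("8.0+PTX"):
        print(flag)
```

Shipping one SASS target plus PTX instead of SASS for every architecture is one way a wheel can shrink. The trade-off is a one-time JIT compile on newer GPUs, which would be consistent with the "once the cache is warmed up" remark above.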

Signed-off-by: Lucas Wilkinson <[email protected]>
@LucasWilkinson changed the title from "[WIP] FA2 8.0 PTX" to "FA2 8.0 PTX" on Jun 16, 2025
@LucasWilkinson marked this pull request as ready for review on June 16, 2025, 16:09
@LucasWilkinson merged commit 763ad15 into main on Jun 16, 2025
3 of 4 checks passed
LucasWilkinson added a commit that referenced this pull request Jun 16, 2025
Signed-off-by: Lucas Wilkinson <[email protected]>
LucasWilkinson added a commit that referenced this pull request Jun 16, 2025
* varlen combine scheduler

Signed-off-by: Lucas Wilkinson <[email protected]>

* cleanup

Signed-off-by: Lucas Wilkinson <[email protected]>

* move check

Signed-off-by: Lucas Wilkinson <[email protected]>

* standard scheduling algo

Signed-off-by: Lucas Wilkinson <[email protected]>

* better heuristic

Signed-off-by: Lucas Wilkinson <[email protected]>

* better comments

Signed-off-by: Lucas Wilkinson <[email protected]>

* cleanup

Signed-off-by: Lucas Wilkinson <[email protected]>

* cleanup

Signed-off-by: Lucas Wilkinson <[email protected]>

* put in a more readable heuristic

Signed-off-by: Lucas Wilkinson <[email protected]>

* Apply suggestions from code review

Co-authored-by: Tyler Michael Smith <[email protected]>
Signed-off-by: Lucas Wilkinson <[email protected]>

* FA2 8.0 PTX (#69)

Signed-off-by: Lucas Wilkinson <[email protected]>

---------

Signed-off-by: Lucas Wilkinson <[email protected]>
Co-authored-by: Tyler Michael Smith <[email protected]>
zyongye pushed a commit to zyongye/flash-attention that referenced this pull request Aug 7, 2025
Signed-off-by: Lucas Wilkinson <[email protected]>
zyongye pushed a commit to zyongye/flash-attention that referenced this pull request Aug 7, 2025
LucasWilkinson added a commit that referenced this pull request Aug 7, 2025
Signed-off-by: Lucas Wilkinson <[email protected]>
LucasWilkinson added a commit that referenced this pull request Aug 7, 2025
jayhshah pushed a commit that referenced this pull request Aug 8, 2025
Signed-off-by: Lucas Wilkinson <[email protected]>
Signed-off-by: Jay Shah <[email protected]>
jayhshah pushed a commit that referenced this pull request Aug 8, 2025
Signed-off-by: Jay Shah <[email protected]>