
Conversation

@leejet (Owner) commented Sep 7, 2025

fix #802

@Green-Sky (Contributor) commented Sep 7, 2025

Will test in a sec.

update1: works with CUDA again (6 images in 523.34s, similar to before).
update2: Vulkan still works (6 images in 205.99s; slower, but I also started it heat-soaked).
update3: reran Vulkan on a somewhat cooled-down system (6 images in 183.43s).
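For a rough throughput comparison, here are the per-image averages implied by the totals above (a quick arithmetic sketch; the run labels are mine, not from the thread):

```python
# Per-image averages derived from the timings reported in the comments above.
runs = {
    "CUDA (after fix)": (6, 523.34),
    "Vulkan (heat-soaked)": (6, 205.99),
    "Vulkan (cooled down)": (6, 183.43),
}

for label, (images, total_seconds) in runs.items():
    per_image = total_seconds / images
    print(f"{label}: {per_image:.2f}s per image")
```

This works out to roughly 87s per image on CUDA and 31-34s per image on Vulkan for this particular setup; the Vulkan runs also show how much thermal state affects the numbers.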

@Green-Sky (Contributor) commented:
I increasingly wish the whole architecture were more llama.cpp-like.

@leejet leejet merged commit f8fe4e7 into master Sep 7, 2025
9 checks passed
@leejet leejet deleted the fix-flash-attn branch September 16, 2025 15:34


Development

Successfully merging this pull request may close these issues:

CUDA: flashattn head dim issues (#802)

2 participants