Open
Labels: bug (Something isn't working), unstale (Received activity after being labelled stale)
Description
Your current environment
N/A
🐛 Describe the bug
Hi,
I am looking for clarification on the warning message that appears when loading a Molmo vision model in vLLM. The warning seems to have been first introduced in this PR and carried along through many refactors, but I couldn't find any discussion of what the actual bug is, nor any issue tracking it in the vllm-flash-attn repo. The vllm-flash-attn repo also appears to be (mostly) up to date with the latest flash-attn, so I'm confused why this bug would still exist in the fork but not in the upstream repo.
"Current `vllm-flash-attn` has a bug inside vision module, so we use xformers backend instead. You can run `pip install flash-attn` to use flash-attention backend."
Could we open an issue tracking this and reference it in the warning message, or remove the check entirely if it is no longer relevant?
cc @ywang96 @mrsalehi @DarkLight1337 @mgoin
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.