Open
Description
Your current environment
None
How would you like to use vllm
I have read #12071 and it is wonderful work.
I wonder whether this torchrun-compatible executor supports expert parallelism (EP). The comments in #12071 point out that the input should be the same across all ranks, though perhaps that was written with tensor parallelism (TP) in mind.
In an EP scenario, the ranks in the same EP group should receive different inputs in order to take advantage of EP for MoE. In addition, if DeepEP is enabled, prefill and decode are dispatched to the normal kernels and the low-latency (ll) kernels respectively. This requires the schedulers across ranks in the same EP group to schedule the same action (prefill or decode) at each step, but with different inputs. Is this behavior currently guaranteed, or is it not necessary in the current design?
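To make the question concrete, the invariant described above can be sketched in plain Python. Everything here (`Phase`, `ScheduledStep`, `check_ep_group_step`) is hypothetical and is not part of vLLM or DeepEP; it only illustrates the condition "same phase on every rank, different requests per rank". In a real deployment the ranks would have to agree through some collective communication, which this sketch does not model.

```python
from dataclasses import dataclass
from enum import Enum


class Phase(Enum):
    PREFILL = "prefill"
    DECODE = "decode"


@dataclass(frozen=True)
class ScheduledStep:
    """Hypothetical per-rank scheduler output; not a vLLM type."""
    rank: int
    phase: Phase
    request_ids: tuple[str, ...]


def check_ep_group_step(steps: list[ScheduledStep]) -> Phase:
    """Return the phase shared by all ranks.

    Raises ValueError if the ranks disagree or if the list is empty.
    """
    phases = {s.phase for s in steps}
    if len(phases) != 1:
        detail = {s.rank: s.phase.value for s in steps}
        raise ValueError(f"EP ranks did not schedule one common phase: {detail}")
    return phases.pop()


# Different inputs per rank, same phase: acceptable for EP.
ok = [
    ScheduledStep(0, Phase.DECODE, ("req-a",)),
    ScheduledStep(1, Phase.DECODE, ("req-b", "req-c")),
]
agreed = check_ep_group_step(ok)

# Rank 1 prefills while rank 0 decodes: the two ranks would pick
# different kernel families, which is the case the question is about.
mixed = [
    ScheduledStep(0, Phase.DECODE, ("req-a",)),
    ScheduledStep(1, Phase.PREFILL, ("req-d",)),
]
```

Calling `check_ep_group_step(mixed)` raises `ValueError`, which is the situation I am asking whether the current design rules out.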
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.