Support expert parallel in Transformers backend #26162

hmellor · 2025-10-03T10:21:40Z

This PR solves two problems in order to enable expert parallel in the Transformers backend:

- Ensures that the dtypes of topk_ids (torch.int32) and topk_weights (torch.float32) match what FusedMoE.select_experts expects
- Gathers the topk_ids, which are passed directly from Transformers, in the same way that FusedMoE already gathers hidden_states and topk_weights automatically
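The two fixes described above can be illustrated with a minimal single-process sketch. It is not the code from this PR: `normalize_routing_dtypes` and `simulated_all_gather` are hypothetical helpers, the two "ranks" are plain Python lists, and `torch.cat` stands in for a real collective. It assumes the gather runs along the token dimension; only the target dtypes (torch.int32 for topk_ids, torch.float32 for topk_weights) come from the description.

```python
import torch


def normalize_routing_dtypes(topk_ids, topk_weights):
    """Cast routing tensors to the dtypes named in the PR description.

    Hypothetical helper: int32 expert ids, float32 routing weights.
    """
    return topk_ids.to(torch.int32), topk_weights.to(torch.float32)


def simulated_all_gather(per_rank_tensors):
    """Single-process stand-in for an all-gather along the token dimension.

    Assumption: a real deployment would use a distributed collective here.
    """
    return torch.cat(per_rank_tensors, dim=0)


# Two hypothetical ranks, each holding 2 tokens routed to their top-2 experts.
# torch.tensor on Python ints defaults to int64, which is the kind of mismatch
# the first fix guards against.
rank_ids = [
    torch.tensor([[0, 3], [1, 2]]),
    torch.tensor([[2, 3], [0, 1]]),
]
rank_weights = [torch.full((2, 2), 0.5, dtype=torch.bfloat16) for _ in range(2)]

# Fix 1: normalize dtypes on every rank.
normalized = [
    normalize_routing_dtypes(ids, weights)
    for ids, weights in zip(rank_ids, rank_weights)
]

# Fix 2: gather topk_ids so they line up with the gathered topk_weights.
gathered_ids = simulated_all_gather([ids for ids, _ in normalized])
gathered_weights = simulated_all_gather([weights for _, weights in normalized])
```

After the gather, `gathered_ids` and `gathered_weights` both cover all four tokens, so each row of expert ids lines up with the matching row of routing weights.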

Signed-off-by: Harry Mellor <[email protected]>

Signed-off-by: Harry Mellor <[email protected]> Co-authored-by: Isotr0py <[email protected]> Signed-off-by: Tomer Asida <[email protected]>

Signed-off-by: Harry Mellor <[email protected]> Co-authored-by: Isotr0py <[email protected]> Signed-off-by: Karan Goel <[email protected]>

Signed-off-by: Harry Mellor <[email protected]> Co-authored-by: Isotr0py <[email protected]>

Signed-off-by: Harry Mellor <[email protected]> Co-authored-by: Isotr0py <[email protected]> Signed-off-by: xuebwang-amd <[email protected]>

hmellor added 3 commits October 3, 2025 10:15

Remove error saying EP unsuported

8c2021e

Signed-off-by: Harry Mellor <[email protected]>

Rename args and ensure correct dtype

f57e2b5

Signed-off-by: Harry Mellor <[email protected]>

All gather topk_ids

09a2861

Signed-off-by: Harry Mellor <[email protected]>

hmellor added this to Transformers backend Oct 3, 2025

hmellor moved this to In Progress in Transformers backend Oct 3, 2025

Update doc

0061256

Signed-off-by: Harry Mellor <[email protected]>

hmellor marked this pull request as ready for review October 3, 2025 11:22

mergify bot added the documentation (Improvements or additions to documentation) label Oct 3, 2025

hmellor requested a review from Isotr0py October 3, 2025 13:37

Isotr0py approved these changes Oct 4, 2025

Merge branch 'main' into transformers-backend-ep

fa5acaf

Isotr0py enabled auto-merge (squash) October 4, 2025 02:47

github-actions bot added the ready (ONLY add when PR is ready to merge/full CI is needed) label Oct 4, 2025

Isotr0py merged commit d3d649e into vllm-project:main Oct 4, 2025
54 checks passed

github-project-automation bot moved this from In Progress to Done in Transformers backend Oct 4, 2025

hmellor deleted the transformers-backend-ep branch October 4, 2025 07:37

southfreebird pushed a commit to southfreebird/vllm that referenced this pull request Oct 7, 2025

Support expert parallel in Transformers backend (vllm-project#26162)

6320814

Signed-off-by: Harry Mellor <[email protected]> Co-authored-by: Isotr0py <[email protected]>
