[Model] Refactor Phi-4-multimodal to use merged processor and support V1 #15477
Signed-off-by: Isotr0py <[email protected]>
This pull request has merge conflicts that must be resolved before it can be |
Hi @Isotr0py, do you have an estimate for when this will be merged? I'm planning to test it.
I'm close to figuring out the cause of the regression, so this PR will be good to go once the regression is fixed.
Setting |
The problem seems unrelated to this PR; rather, it is about how the model runner handles max_num_batched_tokens. So I think we can merge this PR first.
Can you fix the CI failures? Also for now it would be best to add |
OK, let me also update the multimodal processor to catch up with #16416.
Interesting, I'm testing with Engine V0 (v0.8.3) and I got a few nonsensical responses. I will try setting max batched tokens.
… V1 (vllm-project#15477) Signed-off-by: Isotr0py <[email protected]> Signed-off-by: DarkLight1337 <[email protected]> Co-authored-by: DarkLight1337 <[email protected]> Signed-off-by: Yang Wang <[email protected]>
… V1 (vllm-project#15477) Signed-off-by: Isotr0py <[email protected]> Signed-off-by: DarkLight1337 <[email protected]> Co-authored-by: DarkLight1337 <[email protected]>
… V1 (vllm-project#15477) Signed-off-by: Isotr0py <[email protected]> Signed-off-by: DarkLight1337 <[email protected]> Co-authored-by: DarkLight1337 <[email protected]> Signed-off-by: Agata Dobrzyniewicz <[email protected]>
@Isotr0py Do you know why loading the model is taking so long? I'm trying to load 4 instances of the model (one on each GPU) and it takes 20+ minutes with CPU usage over 70%.
… V1 (vllm-project#15477) Signed-off-by: Isotr0py <[email protected]> Signed-off-by: DarkLight1337 <[email protected]> Co-authored-by: DarkLight1337 <[email protected]> Signed-off-by: Mu Huai <[email protected]>