[Feature][Frontend]: Deprecate --enable-reasoning #17452
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. 🚀
/cc @aarnphm, you may be interested in this.
Can you also run a manual test without the reasoning parser?
Overall LGTM, but let me finish the other PR regarding deprecated tags in args first.
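For reference, a common pattern for deprecating a boolean flag like --enable-reasoning is to keep accepting it while emitting a warning. Below is a minimal plain-argparse sketch of that pattern; it is illustrative only, not the actual implementation in this PR:

```python
import argparse
import warnings


class DeprecateEnableReasoning(argparse.Action):
    """Accept --enable-reasoning, but warn that it is deprecated."""

    def __call__(self, parser, namespace, values, option_string=None):
        warnings.warn(
            f"{option_string} is deprecated; passing --reasoning-parser "
            "is now sufficient to enable reasoning output.",
            DeprecationWarning,
            stacklevel=2,
        )
        setattr(namespace, self.dest, True)


parser = argparse.ArgumentParser(prog="vllm serve")
parser.add_argument("--enable-reasoning", action=DeprecateEnableReasoning,
                    nargs=0, default=False,
                    help="[DEPRECATED] use --reasoning-parser instead")
parser.add_argument("--reasoning-parser", type=str, default=None)

# The old invocation still works, but now emits a DeprecationWarning.
args = parser.parse_args(["--enable-reasoning", "--reasoning-parser", "qwen3"])
```

The same effect can be had with post-parse validation instead of a custom Action; either way the flag keeps working for a deprecation cycle before removal.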
Test without the reasoning parser: vllm serve Qwen/Qwen3-8B
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"messages": [{"role": "user", "content": "Hello, vLLM!"}],
"max_tokens": 240
}' | jq
{
"id": "chatcmpl-24eee75f346e44c19342da94081275f8",
"object": "chat.completion",
"created": 1746025506,
"model": "Qwen/Qwen3-8B",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"reasoning_content": null,
"content": "<think>\nOkay, the user said \"Hello, vLLM!\" and I need to respond appropriately. First, I should acknowledge their greeting. Since vLLM is a large language model, I should mention that I'm a Qwen model. Maybe they confused the name, so I should clarify that. I should keep the response friendly and open-ended, inviting them to ask questions. Let me check if there's anything else I need to consider. No, just a simple greeting and clarification should do. Alright, time to put it all together.\n</think>\n\nHello! I'm Qwen, a large language model developed by Alibaba Cloud. If you have any questions or need assistance, feel free to ask me! 😊",
"tool_calls": []
},
"logprobs": null,
"finish_reason": "stop",
"stop_reason": null
}
],
"usage": {
"prompt_tokens": 14,
"total_tokens": 161,
"completion_tokens": 147,
"prompt_tokens_details": null
},
"prompt_logprobs": null
}

Test with the reasoning parser: vllm serve Qwen/Qwen3-8B --reasoning-parser qwen3

curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"messages": [{"role": "user", "content": "Hello, vLLM!"}],
"max_tokens": 240
}' | jq
{
"id": "chatcmpl-ab9fdf511b3f4791ac4dd52a6b6d42b2",
"object": "chat.completion",
"created": 1746025382,
"model": "Qwen/Qwen3-8B",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"reasoning_content": "\nOkay, the user said \"Hello, vLLM!\" So first, I need to figure out what they're asking. They might be greeting me, but I should check if they're referring to the vLLM framework or maybe another model. Wait, I'm Qwen, not vLLM. Maybe they confused me with vLLM. I should clarify that. Let me make sure I respond correctly. I should greet them back and explain that I'm Qwen, not vLLM. Then offer assistance with their needs. Keep it friendly and helpful.\n",
"content": "\n\nHello! I'm Qwen, a large language model developed by Alibaba Cloud. I'm not vLLM, but I'm here to help you with any questions or tasks you might have. How can I assist you today? 😊",
"tool_calls": []
},
"logprobs": null,
"finish_reason": "stop",
"stop_reason": null
}
],
"usage": {
"prompt_tokens": 14,
"total_tokens": 182,
"completion_tokens": 168,
"prompt_tokens_details": null
},
"prompt_logprobs": null
}
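The two outputs above show the behavior under test: without --reasoning-parser, reasoning_content is null and the <think>...</think> block stays inline in content; with --reasoning-parser qwen3, the server splits the reasoning into reasoning_content. A minimal client-side sketch of reading that field with the official openai Python package, assuming the locally running server from the test above; note that reasoning_content is a vLLM extension, not part of the standard OpenAI schema, so it is read defensively:

```python
from openai import OpenAI

# Points at the locally running `vllm serve` instance from the test above.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="Qwen/Qwen3-8B",
    messages=[{"role": "user", "content": "Hello, vLLM!"}],
    max_tokens=240,
)

msg = resp.choices[0].message
# With --reasoning-parser qwen3 the chain of thought arrives in a separate
# reasoning_content field; without it, the field is absent/None and the
# <think> block is left inline in content.
print("reasoning:", getattr(msg, "reasoning_content", None))
print("content:", msg.content)
```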
The failing V1 test looks unrelated and has been fixed in #17500. Let's wait and see on the entrypoints test.
The test failure is persistent, PTAL.
Signed-off-by: chaunceyjiang <[email protected]>
The test case
Signed-off-by: chaunceyjiang <[email protected]> Signed-off-by: Mu Huai <[email protected]>
Signed-off-by: chaunceyjiang <[email protected]> Signed-off-by: Yuqi Zhang <[email protected]>
Signed-off-by: chaunceyjiang <[email protected]> Signed-off-by: minpeter <[email protected]>
Fixes #14088
warning test:
help test:
test:
client:
output:
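The warning and help tests listed above presumably verify that the deprecated flag still parses but warns, and that the help text marks it as deprecated. A hedged pytest sketch of what such checks could look like; parse_cli here is a hypothetical stand-in, not vLLM's actual parser or test code:

```python
import argparse
import warnings

import pytest


def parse_cli(argv):
    """Hypothetical stand-in for vLLM's CLI parsing of the deprecated flag."""
    parser = argparse.ArgumentParser(prog="vllm serve")
    parser.add_argument("--enable-reasoning", action="store_true",
                        help="[DEPRECATED] use --reasoning-parser instead")
    parser.add_argument("--reasoning-parser", type=str, default=None)
    args = parser.parse_args(argv)
    if args.enable_reasoning:
        warnings.warn("--enable-reasoning is deprecated; setting "
                      "--reasoning-parser is enough.", DeprecationWarning)
    return args, parser


def test_deprecated_flag_still_parses_but_warns():
    with pytest.warns(DeprecationWarning, match="deprecated"):
        args, _ = parse_cli(["--enable-reasoning",
                             "--reasoning-parser", "qwen3"])
    assert args.reasoning_parser == "qwen3"


def test_help_text_mentions_deprecation():
    _, parser = parse_cli([])
    assert "DEPRECATED" in parser.format_help()
```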