Open
Labels
feature request (New feature or request), unstale (Received activity after being labelled stale)
Description
🚀 The feature, motivation and pitch
Currently, --enable-auto-tool-choice and --enable-reasoning cannot be enabled together; starting the server with both flags fails with the following error:
# vllm serve /Qwen/QwQ-32B/ --served-model-name QwQ-32B --gpu-memory-utilization 0.97 --tensor-parallel-size 8 --max-model-len 32768 --enable-reasoning --reasoning-parser deepseek_r1 --enable-auto-tool-choice --tool-call-parser hermes
INFO 03-07 18:14:44 [__init__.py:207] Automatically detected platform cuda.
Traceback (most recent call last):
  File "/usr/local/bin/vllm", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/local/lib/python3.12/site-packages/vllm/entrypoints/cli/main.py", line 70, in main
    cmds[args.subparser].validate(args)
  File "/usr/local/lib/python3.12/site-packages/vllm/entrypoints/cli/serve.py", line 36, in validate
    validate_parsed_serve_args(args)
  File "/usr/local/lib/python3.12/site-packages/vllm/entrypoints/openai/cli_args.py", line 285, in validate_parsed_serve_args
    raise TypeError(
TypeError: Error: --enable-auto-tool-choice and --enable-reasoning cannot be enabled at the same time
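Judging from the traceback, the rejection comes from a mutual-exclusivity guard in `validate_parsed_serve_args`. The following is a minimal self-contained sketch of what such a guard looks like, not vLLM's actual implementation; only the two flag names are taken from the error message, everything else is assumed for illustration:

```python
import argparse


def validate_parsed_serve_args(args: argparse.Namespace) -> None:
    # Hypothetical sketch of the guard implied by the traceback:
    # the two features are treated as mutually exclusive, so passing
    # both flags is rejected before the server starts.
    if args.enable_auto_tool_choice and args.enable_reasoning:
        raise TypeError(
            "Error: --enable-auto-tool-choice and --enable-reasoning "
            "cannot be enabled at the same time")


parser = argparse.ArgumentParser()
parser.add_argument("--enable-auto-tool-choice", action="store_true")
parser.add_argument("--enable-reasoning", action="store_true")

# Supplying both flags trips the validator.
try:
    validate_parsed_serve_args(
        parser.parse_args(["--enable-auto-tool-choice", "--enable-reasoning"]))
except TypeError as exc:
    print(exc)
```

Supporting both features together would mean relaxing this guard and making the reasoning parser and the tool-call parser cooperate on the same output stream.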
Alternatives
No response
Additional context
No response
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.