Description
Basic Information - Models Used
MiniMax-M1
Basic Information - Scenario Description
Local deployment on 8x H100, following the documentation exactly.
Is this badcase known and solvable?
- I have followed the GitHub README of the model and found no duplicates in existing issues.
- I have checked Minimax documentation and found no solution.
Information about environment
- Ubuntu 22.04
- vLLM (latest)
- Python 3.12
Call & Execution Information
Deployment:
SAFETENSORS_FAST_GPU=1 VLLM_USE_V1=0 python3 -m vllm.entrypoints.openai.api_server --model MiniMax-M1-80k --tensor-parallel-size 8 --trust-remote-code --quantization experts_int8 --dtype bfloat16 --gpu-memory-utilization 0.95 --host 0.0.0.0 --port 8000 --max_model_len 32768 --served-model-name minimax-m1-80k
Input:
curl http://localhost:8000/v1/chat/completions -H "Content-Type: application/json" -d '{"model": "minimax-m1-80k", "messages": [{"role": "system", "content": [{"type": "text", "text": "You are a helpful assistant."}]}, {"role": "user", "content": [{"type": "text", "text": "您好, 请详细的介绍一下你自己"}]}]}'
Output:
{"id":"chatcmpl-597ea3a94e0c4ee2a25ab8b07e8df141","object":"chat.completion","created":1754825999,"model":"minimax-m1-80k","choices":[{"index":0,"message":{"role":"assistant","content":"<think>\n嗯,用户让我详细介绍一下自己。首先,我需要确定用户的具体需求是什么。他们可能想了解我的功能、用途,或者背后的技术原理。作为一个AI助手,xxxx\n</think>\n\n您好!我是由人工智能技术驱动的智能助手,xxxx","refusal":null,"annotations":null,"audio":null,"function_call":null,"tool_calls":[],"reasoning_content":null},"logprobs":null,"finish_reason":"stop","stop_reason":null}],"service_tier":null,"system_fingerprint":null,"usage":{"prompt_tokens":35,"total_tokens":918,"completion_tokens":883,"prompt_tokens_details":null},"prompt_logprobs":null,"kv_transfer_params":null}
Adding the deployment parameter:
--reasoning-parser deepseek_r1
produces the following error:
{"error":{"message":"DeepSeek R1 reasoning parser could not locate think start/end tokens in the tokenizer!","type":"BadRequestError","param":null,"code":400}}
Description
The `<think>` block in the output is not automatically separated from the final answer (it appears inline in the `content` field instead of in `reasoning_content`).
Looking through your team's Hugging Face and vLLM-specific configuration, I did not find anything related to a reasoning parser.
I would like to confirm whether your team has any plans to support this kind of chain-of-thought separation. Thank you for your work.
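In the meantime, since the model emits its reasoning inline as a literal `<think>…</think>` block (as in the response above), the separation can be done on the client side with plain string processing. This is only a workaround sketch, not an official parser; the helper name `split_reasoning` is my own, and it assumes the model always uses the exact `<think>`/`</think>` tag pair shown in the output:

```python
import re

# Matches a leading <think>...</think> block; DOTALL lets "." span newlines.
THINK_RE = re.compile(r"<think>\s*(.*?)\s*</think>\s*", re.DOTALL)

def split_reasoning(content: str):
    """Split raw model output into (reasoning, answer).

    reasoning is None when no <think> block is present.
    Hypothetical client-side helper, not part of vLLM.
    """
    m = THINK_RE.search(content)
    if m is None:
        return None, content
    # Remove the matched <think> block and return the remainder as the answer.
    answer = (content[:m.start()] + content[m.end():]).strip()
    return m.group(1), answer

# Example with the same shape as the API response above:
raw = "<think>\nsome chain-of-thought\n</think>\n\nHello! I am an AI assistant."
reasoning, answer = split_reasoning(raw)
```

Applied to the `choices[0].message.content` field of the JSON response, this yields separate reasoning and answer strings until an official `--reasoning-parser` integration is available.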