[Usage]: When I use the Qwen3-32B with tool_choice='required' parameter, the tool calling gets stuck in a loop

### Your current environment

```text
vLLM API server version 0.9.2

vllm serve Qwen/Qwen3-32B --tensor-parallel-size 4 --host 0.0.0.0 --enable-auto-tool-choice --tool-call-parser hermes

```


### How would you like to use vllm

I know this might be a bug, a usage issue, or a model problem. Currently, I'm using LangChain version 0.3.25. When I use bind_tools(tools=[xxx], tool_choice="required"), after calling the tool and returning the result to the LLM, it continues to choose to call the same tool again, resulting in an infinite loop.

    model_params = {
        "temperature": 0.1,
        "seed": 42,
        "timeout": 3600,
    }



### Before submitting a new issue...

- [x] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the [documentation page](https://docs.vllm.ai/en/latest/), which can answer lots of frequently asked questions.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

[Usage]: When I use the Qwen3-32B with tool_choice='required' parameter, the tool calling gets stuck in a loop #21026

Your current environment

How would you like to use vllm

Before submitting a new issue...

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

[Usage]: When I use the Qwen3-32B with tool_choice='required' parameter, the tool calling gets stuck in a loop #21026

Description

Your current environment

How would you like to use vllm

Before submitting a new issue...

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions