Conversation

ahartel
Contributor

@ahartel ahartel commented Sep 18, 2025

Purpose

Fix: #19056

Test Plan

Added some tests in the PR.


👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request addresses a streaming output error in the Hermes tool parser, particularly for the Qwen3 model, by correcting the logic for handling partially streamed JSON arguments. The addition of a comprehensive test suite for the Hermes parser, including specific cases for Qwen tokenization, streaming, and non-streaming scenarios, is a great improvement and significantly enhances the robustness of the parser. While the main fix is sound, I've identified a pre-existing critical bug in the type checking logic that should be addressed.
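The property that streaming and non-streaming tests are meant to pin down can be sketched with a toy parser. This is not vLLM's Hermes parser: `extract_tool_calls`, `stream_tool_calls`, and `chunk` are hypothetical helpers written for illustration, and the only thing taken from the Hermes format is the `<tool_call>...</tool_call>` wrapper around a JSON object. The idea is that feeding the same model output in arbitrary chunk sizes should yield the same tool calls as parsing it in one piece.

```python
import json
import re

# Hermes-style tool calls wrap a JSON object in <tool_call> tags.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_tool_calls(text: str) -> list[dict]:
    """Non-streaming path: parse every complete tool call in the text."""
    return [json.loads(body) for body in TOOL_CALL_RE.findall(text)]


def chunk(text: str, size: int) -> list[str]:
    """Split text into fixed-size pieces to imitate streamed deltas."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def stream_tool_calls(chunks: list[str]) -> list[dict]:
    """Toy streaming path: buffer deltas, emit each call once it is complete."""
    buffer = ""
    emitted: list[dict] = []
    for delta in chunks:
        buffer += delta
        calls = extract_tool_calls(buffer)
        # Only append calls that were not emitted on an earlier delta.
        emitted.extend(calls[len(emitted):])
    return emitted


output = (
    'Sure.<tool_call>\n'
    '{"name": "get_weather", "arguments": {"city": "Dallas"}}\n'
    '</tool_call>'
)
expected = extract_tool_calls(output)
for size in range(1, len(output) + 1):
    assert stream_tool_calls(chunk(output, size)) == expected
```

A real streaming parser has to emit partial argument deltas before the closing tag arrives, which is where the logic discussed in this PR gets subtle. The toy version sidesteps that by waiting for complete calls.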

@ahartel ahartel force-pushed the fix-hermes-parser branch 2 times, most recently from d8ed627 to e7e9587 on September 19, 2025 06:02
@ahartel
Contributor Author

ahartel commented Sep 19, 2025

@cedonley I saw that you changed lines 369-373 of hermes_tool_parser.py in #10979. The actual fix I am proposing here touches those lines as well by removing 2 string slice operations. Can you remember why you introduced them? Or do you have test cases at hand that I could add to the code base to make sure that they still run?

@ahartel
Contributor Author

ahartel commented Sep 19, 2025

Just found this question as well on this topic. The question and your answer seem to suggest that my fix might be applicable.

@tugot17

tugot17 commented Sep 19, 2025

The tests seem to pass and all looks fine;

Do you have any idea why it used to be

if (delta_text not in cur_arguments_json[:-2]):

before?
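For readers following along, here is a minimal sketch of what the `[:-2]` slice does mechanically. It does not reproduce the parser's surrounding logic and does not explain why the slice was introduced. `delta_visible` is a hypothetical helper, and the assumption is that `cur_arguments_json` is the `json.dumps` serialization of the arguments parsed so far.

```python
import json


def delta_visible(delta_text: str, cur_arguments: dict, trim: int = 2) -> bool:
    """Check whether delta_text occurs in the serialized arguments
    after dropping the last `trim` characters (trim=0 keeps everything)."""
    cur_arguments_json = json.dumps(cur_arguments, ensure_ascii=False)
    visible = cur_arguments_json[:-trim] if trim else cur_arguments_json
    return delta_text in visible


# '{"city": "Dallas"}'[:-2] is '{"city": "Dallas', so the closing '"}' is hidden.
print(delta_visible("las", {"city": "Dallas"}))          # True
print(delta_visible('"}', {"city": "Dallas"}))           # False
print(delta_visible('"}', {"city": "Dallas"}, trim=0))   # True

# '{"n": 5}'[:-2] is '{"n": ', so a delta carrying the final value is hidden too.
print(delta_visible("5", {"n": 5}))                      # False
print(delta_visible("5", {"n": 5}, trim=0))              # True
```

With the slice in place, any delta that only appears within the last two serialized characters fails the membership check, which is the case the old condition `delta_text not in cur_arguments_json[:-2]` would treat as "not found".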

@ahartel
Contributor Author

ahartel commented Sep 19, 2025

> The tests seem to pass and all looks fine;
>
> Do you have any idea why it used to be
>
> if (delta_text not in cur_arguments_json[:-2]):
>
> before?

No, unfortunately not. Let's see if @cedonley has any insights. See also the links to a previous discussion in my previous comment.

@chaunceyjiang
Collaborator

Hi @ahartel Could you retest based on the main branch to see if the issue still exists?

@ahartel
Contributor Author

ahartel commented Sep 22, 2025

> Hi @ahartel Could you retest based on the main branch to see if the issue still exists?

My newly added tests do indeed pass on main and used to fail previously. The issue seems to have been fixed. Thanks for pointing that out.

@chaunceyjiang Would you mind merging the changes to the file tests/entrypoints/openai/tool_parsers/test_hermes_tool_parser.py? This would add some more test coverage and maybe also document the behavior of the Hermes tool parser.

@chaunceyjiang
Collaborator

> @chaunceyjiang Would you mind merging the changes to the file tests/entrypoints/openai/tool_parsers/test_hermes_tool_parser.py? This would add some more test coverage and maybe also document the behavior of the Hermes tool parser.

Of course, this is a new test case. We can move forward quickly.

@chaunceyjiang chaunceyjiang self-assigned this Sep 22, 2025
@ahartel
Contributor Author

ahartel commented Sep 22, 2025

Thanks. I updated my PR to contain only my test additions (plus some very minor reformatting).

@chaunceyjiang chaunceyjiang changed the title from "[fix]: Hermes tool parser stream output error in Qwen3 case (#19056)" to "[Test]: Hermes tool parser stream output error in Qwen3 case" on Sep 22, 2025
@chaunceyjiang chaunceyjiang added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) on Sep 22, 2025
@chaunceyjiang chaunceyjiang enabled auto-merge (squash) September 23, 2025 09:54
Collaborator

@chaunceyjiang chaunceyjiang left a comment

Thanks~

@chaunceyjiang chaunceyjiang merged commit 4322c55 into vllm-project:main Sep 23, 2025
30 checks passed
@ahartel
Contributor Author

ahartel commented Sep 23, 2025

Thank you for your support @chaunceyjiang

FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025
charlifu pushed a commit to ROCm/vllm that referenced this pull request Sep 25, 2025
yewentao256 pushed a commit that referenced this pull request Oct 3, 2025
gjc0824 pushed a commit to gjc0824/vllm that referenced this pull request Oct 10, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025
choprahetarth pushed a commit to Tandemn-Labs/vllm that referenced this pull request Oct 11, 2025

Labels

frontend
qwen (Related to Qwen models)
ready (ONLY add when PR is ready to merge/full CI is needed)
tool-calling

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

[Bug]: Hermes tool parser stream output error in Qwen3 case

3 participants