[Feature] Support user-specified "trigger" token before starting structured decoding #12995
This PR allows the user to specify a "trigger token" that needs to be produced before xgrammar is applied to structured decoding. For example, when generating with r1-like models, the end-of-thought token `</think>` can be the trigger token, as shown in the example in the added unit test.

Additional work might be required to allow a string (e.g. `JSON Output:` or `\boxed` in math prompts) as the trigger for structured decoding.

FIX #12619
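To make the intended behavior concrete, here is a minimal, illustrative sketch of the idea rather than the actual code in this PR. The class name, its constructor arguments, and the `(generated_token_ids, logits)` call signature are assumptions chosen to mirror a typical logits-processor interface; the real implementation lives in `xgrammar_decoding`.

```python
import torch


class TriggeredGrammarLogitsProcessor:
    """Sketch: apply a grammar-constrained logits processor (e.g. xgrammar)
    only after a trigger token such as `</think>` has been generated.
    Names and signature are illustrative, not vLLM's actual API."""

    def __init__(self, grammar_processor, trigger_token_id: int):
        self.grammar_processor = grammar_processor
        self.trigger_token_id = trigger_token_id
        self.triggered = False

    def __call__(self, generated_token_ids: list[int],
                 logits: torch.Tensor) -> torch.Tensor:
        # Before the trigger token appears, leave the logits untouched so the
        # model can emit free-form text (e.g. its chain of thought).
        if not self.triggered:
            if generated_token_ids and generated_token_ids[-1] == self.trigger_token_id:
                self.triggered = True
            return logits
        # After the trigger, delegate to the grammar processor, which masks
        # the logits so that only grammar-valid tokens can be sampled.
        return self.grammar_processor(generated_token_ids, logits)
```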
I was not aware of #12955 from Saturday morning before I started working on this PR on Sunday; I apologize to @gaocegege if this PR partially overlaps with their contribution. From what I understand, the main difference between the two PRs is the handling of `batch_size` in `xgrammar_decoding`, in case more than one stream of generations is sent through this logits processor at a time, though it is unclear whether that would ever happen in the current setup.
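If the processor were ever shared across a batch, the trigger state would have to be tracked per sequence rather than globally. A rough sketch of what that bookkeeping could look like (all names hypothetical):

```python
from collections import defaultdict


class BatchedTriggerState:
    """Sketch: track, per request id, whether the trigger token has been
    seen, so one processor instance can serve several concurrent streams."""

    def __init__(self, trigger_token_id: int):
        self.trigger_token_id = trigger_token_id
        self.triggered = defaultdict(bool)  # request_id -> trigger seen?

    def update(self, request_id: str, last_token_id: int) -> bool:
        # Record the trigger for this stream and report whether grammar
        # enforcement should now be active for it.
        if last_token_id == self.trigger_token_id:
            self.triggered[request_id] = True
        return self.triggered[request_id]
```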