
Conversation

@alecsolder (Contributor) commented Sep 11, 2025

Purpose

This PR adds the ability to use MCP tools to enable built-in tools for gpt-oss: either the OpenAI tool type (e.g. browser), or an MCP tool with server_label set to the name of the OpenAI tool type.

It also enables header passthrough from the request to the MCP server, in addition to the session_id.

Test Plan

Tested against the Responses API server with curl. Added one unit test for the code_interpreter tool functionality, which passed locally but can't run in CI.

curl -X POST "http://localhost:8000/v1/responses" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer dummy-api-key" \
  -d '{
        "model": "gpt-oss-20b",
        "input": "Multiply 64548*15151 using builtin python.",
        "tools": [
          {
            "type": "mcp",
            "server_label": "container", 
            "headers": {"test": "test"},
            "server_url": "IGNORED"
          }
        ]
      }'
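The same request body can be built programmatically; a minimal sketch that only constructs and serializes the payload from the curl command above (no server call is made here):

```python
import json

# Build the same request body as the curl command above.
# The "mcp" tool with server_label "container" maps onto the
# corresponding built-in tool; server_url is ignored for built-ins.
payload = {
    "model": "gpt-oss-20b",
    "input": "Multiply 64548*15151 using builtin python.",
    "tools": [
        {
            "type": "mcp",
            "server_label": "container",
            "headers": {"test": "test"},
            "server_url": "IGNORED",
        }
    ],
}

body = json.dumps(payload)
print(body)
```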

Test Result

Interacting with the server worked locally, and the local unit test passed as well.

@heheda12345 (Collaborator) left a comment

I like this direction. I think it is possible to set up this test in CI by using the Python MCP server and PYTHON_EXECUTION_BACKEND=dangerously_use_uv. You can give it a try if you don't want this feature to be broken by future PRs.

async def test_mcp_tool(client: OpenAI, model_name: str):
    response = await client.responses.create(
        model=model_name,
        input="Please multiply 123 and 456 using the available python tool.",
    )
Collaborator

Can you use a more complex test like the one in #24315:

input=("What's the first 4 digits after the decimal point of "
       "cube root of `19910212 * 20250910`? "
       "Show only the digits."),

The model can be very clever and skip using the python tool.
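For reference, the answer to that prompt can be computed exactly with integer arithmetic (a standalone sketch, not part of the PR or its tests; `icbrt` is a hypothetical helper, not a vLLM function):

```python
def icbrt(n: int) -> int:
    """Integer cube root: largest x with x**3 <= n."""
    x = int(round(n ** (1.0 / 3.0)))  # float seed, then exact refinement
    while x ** 3 > n:
        x -= 1
    while (x + 1) ** 3 <= n:
        x += 1
    return x

n = 19910212 * 20250910
# First 4 digits after the decimal point of n ** (1/3):
# floor(cbrt(n) * 10**4) == icbrt(n * 10**12), then take mod 10**4.
digits = icbrt(n * 10 ** 12) % 10 ** 4
print(f"{digits:04d}")
```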

Contributor Author (@alecsolder)
I gave this a try and it did work. However, because we don't respect the output-token limit when tools are enabled, and max_tool_calls is not implemented yet, we can't stop generation after just one tool call.

Because the dangerously_use_uv implementation is not Jupyter, the model gets confused and keeps trying to make tool calls until it sees some output (it does eventually start printing), which causes the tests to time out in CI. Otherwise I think it would work just fine.

Collaborator

@alecsolder let's open another issue to improve the tests, and possibly bring swe-rex / a Docker client into OSS as we get time for this.

@lacora (Contributor) commented Sep 17, 2025

cc @yeqcharlotte @houseroad

Comment on lines 735 to 741
# If there is an MCP tool with a matching server_label, then
# the tool is enabled. Native types like code_interpreter work
# as well.
tool_types = [
    tool.server_label if tool.type == "mcp" else tool.type
    for tool in request.tools
]
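The effect of that mapping can be illustrated with stand-in objects (a sketch; `SimpleNamespace` objects are hypothetical placeholders for the real Responses API tool entries):

```python
from types import SimpleNamespace

# Stand-ins for request.tools entries (hypothetical; the real
# objects come from the Responses API request model).
tools = [
    SimpleNamespace(type="mcp", server_label="container", server_url="IGNORED"),
    SimpleNamespace(type="code_interpreter", server_label=None),
]

# Same expression as the snippet under review: an MCP tool is
# identified by its server_label, native tools by their type.
tool_types = [
    tool.server_label if tool.type == "mcp" else tool.type
    for tool in tools
]
print(tool_types)  # → ['container', 'code_interpreter']
```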
@yeqcharlotte (Collaborator) commented Sep 18, 2025

Using MCP as a built-in tool is a bit weird for most developers. Do you think we should gate that behind an env var, similar to VLLM_GPT_OSS_USE_CONTAINER_TOOL? Possibly we can adjust the naming and use one env var to control it.

@pytest.mark.asyncio
@pytest.mark.parametrize("model_name", [MODEL_NAME])
@pytest.mark.skip(reason="Code interpreter tool is not available in CI yet.")
async def test_mcp_tool(client: OpenAI, model_name: str):
Collaborator

Let's also add some unit test coverage for the tool server init logic for the MCP tool.

@yeqcharlotte (Collaborator) left a comment

Thanks for the change! Let's also remember to update the recipe and make this easy to follow. My main concern is making this opt-in, with more coverage in unit tests.

@github-project-automation github-project-automation bot moved this from To Triage to In progress in gpt-oss Issues & Enhancements Sep 18, 2025
@yeqcharlotte (Collaborator)
Mind also taking vllm-project/recipes#43 and updating it? Feel free to open a new one explaining how to use it.

@yeqcharlotte (Collaborator)
Thanks for pushing this. Allowing MCP tools to override built-ins should be generic enough to enable more dev and research usage.

for val in values:
    if not case_sensitive:
        check_value = val.lower()
        check_choices = [choice.lower() for choice in actual_choices]
Collaborator

This list doesn't have to be recreated every time.
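A sketch of the hoisting the reviewer suggests, with assumed example data for `values` and `actual_choices` (names taken from the snippet above; this is an illustration, not the PR's code):

```python
values = ["A", "b"]
actual_choices = ["a", "b", "c"]
case_sensitive = False

# Hoist the lowered-choices list out of the loop so it is built once,
# instead of once per value being validated.
check_choices = (
    actual_choices
    if case_sensitive
    else [choice.lower() for choice in actual_choices]
)
for val in values:
    check_value = val if case_sensitive else val.lower()
    if check_value not in check_choices:
        raise ValueError(f"invalid value {val!r}; must be one of {actual_choices}")
```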

Contributor Author (@alecsolder)

This is implemented lazily in order to support the valid values being a callable, which has to run after other Python import logic is done. That means there isn't a great way to cache these values without writing a bigger class for the purpose, which I think is out of scope for this PR.
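A sketch of the lazy pattern being described (hypothetical signature and names; the real env_list_with_choices lives in vLLM's envs module and may differ):

```python
import os
from typing import Callable, Union

def env_list_with_choices(
    env_name: str,
    default: list,
    choices: Union[list, Callable[[], list]],
    case_sensitive: bool = True,
) -> Callable[[], list]:
    """Return a callable that lazily reads a comma-separated list
    from an environment variable and validates it against choices.

    choices may itself be a callable so the valid values can be
    computed after import-time logic has run."""

    def _get_validated_env() -> list:
        raw = os.environ.get(env_name)
        if raw is None:
            return default
        values = [v.strip() for v in raw.split(",") if v.strip()]
        # Resolve callable choices lazily, at read time.
        actual = choices() if callable(choices) else choices
        allowed = actual if case_sensitive else [c.lower() for c in actual]
        for val in values:
            check_value = val if case_sensitive else val.lower()
            if check_value not in allowed:
                raise ValueError(
                    f"{env_name}: invalid value {val!r}, must be one of {actual}"
                )
        return values

    return _get_validated_env

os.environ["DEMO_TOOLS"] = "Browser,python"
getter = env_list_with_choices(
    "DEMO_TOOLS", [], lambda: ["browser", "python"], case_sensitive=False
)
print(getter())  # → ['Browser', 'python']
```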

return _get_validated_env


def env_list_with_choices(
Collaborator

Can we add some unit tests here? Also @youkaichao, do you have concerns about initializing a list from environment variables?

Contributor Author (@alecsolder)

I did something similar to how it is done for plugins right now:

lambda: None if "VLLM_PLUGINS" not in os.environ else os.environ[


mergify bot commented Sep 22, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @alecsolder.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Sep 22, 2025
@mergify mergify bot removed the needs-rebase label Sep 22, 2025
@yeqcharlotte (Collaborator) left a comment

LGTM! Thanks for addressing all the feedback. cc: @heheda12345

@github-project-automation github-project-automation bot moved this from In progress to Ready in gpt-oss Issues & Enhancements Sep 22, 2025
@yeqcharlotte yeqcharlotte enabled auto-merge (squash) September 22, 2025 21:53
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Sep 22, 2025
@yeqcharlotte yeqcharlotte merged commit 45d7d85 into vllm-project:main Sep 22, 2025
51 of 53 checks passed
@alecsolder alecsolder deleted the builtin_mcp_tools_with_headers branch September 23, 2025 15:45
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025
…ough headers (vllm-project#24628)

Signed-off-by: Alec Solder <[email protected]>
Signed-off-by: Alec S <[email protected]>
Co-authored-by: Alec Solder <[email protected]>
Co-authored-by: Ye (Charlotte) Qi <[email protected]>
charlifu pushed a commit to ROCm/vllm that referenced this pull request Sep 25, 2025
yewentao256 pushed a commit that referenced this pull request Oct 3, 2025
gjc0824 pushed a commit to gjc0824/vllm that referenced this pull request Oct 10, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025
choprahetarth pushed a commit to Tandemn-Labs/vllm that referenced this pull request Oct 11, 2025

Labels

frontend gpt-oss Related to GPT-OSS models ready ONLY add when PR is ready to merge/full CI is needed

Projects

Status: Done

4 participants