[DO NOT MERGE] feat: cli for tool and reasoning parsers #2562
Conversation
Walkthrough

Adds an optional `tool_parser_name` setting, exposed via a new CLI flag and threaded through Python bindings, engine/state/config, the local model builder, the HTTP service, and the OpenAI aggregators. Updated APIs pass this value into response construction/aggregation for completions and chat completions. Tests are updated for the new method signatures.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    actor User
    participant CLI as Frontend CLI
    participant Py as Python Bindings (EntrypointArgs)
    participant Eng as Engine/Builder
    participant Svc as Http Service State
    participant OA as OpenAI Handlers
    participant Agg as Aggregators
    User->>CLI: run ... --tool-parser-name=<name?>
    CLI->>Py: EntrypointArgs(tool_parser_name)
    Py->>Eng: engine.build(..., tool_parser_name)
    Eng->>Svc: State::new_with_etcd(..., tool_parser_name)
    Note right of Svc: State stores Option<String>
    User->>OA: /v1/chat/completions or /v1/completions
    OA->>Svc: state.tool_parser_name()
    OA->>Agg: from_annotated_stream(..., tool_parser_name)
    alt chat completions
        Agg->>Agg: parse tool calls with named parser
    else text completions
        Agg->>Agg: aggregate deltas (tool_parser_name passed)
    end
    Agg-->>OA: Response
    OA-->>User: JSON (with parsed tools if any)
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Actionable comments posted: 1
🧹 Nitpick comments (11)
lib/bindings/python/rust/llm/entrypoint.rs (2)
110-111: Threading `tool_parser_name` through `EntrypointArgs` is correct; consider normalizing early.

Normalize at construction time to trim whitespace and treat empty strings as `None`. This avoids propagating sentinel empty strings and keeps semantics crisp across layers. Apply near the return in `EntrypointArgs::new`:

```diff
-        Ok(EntrypointArgs {
+        // Normalize tool_parser_name: trim and treat empty as None
+        let tool_parser_name = tool_parser_name
+            .map(|s| s.trim().to_string())
+            .filter(|s| !s.is_empty());
+        Ok(EntrypointArgs {
             engine_type,
@@
-            extra_engine_args,
-            tool_parser_name,
+            extra_engine_args,
+            tool_parser_name,
         })
```
133-134: Optional: replace the free-form `String` with a typed enum to prevent invalid values.

Right now any string will pass through and only fail downstream. A small enum (e.g., `enum ToolParserName { Default, Hermes, NemotronDeci, Llama3Json, Mistral, Phi4 }` plus `FromStr`) would make invalid values unrepresentable and tighten the API boundary. You can still construct it from the CLI string in one place.

Also applies to: 165-166
components/frontend/src/dynamo/frontend/main.py (2)
179-184: Validate and normalize the CLI value; support an env var default.

Add a small validator to enforce allowed values and normalize casing. This prevents accidental typos and keeps values consistent across the stack. Apply:

```diff
-    parser.add_argument(
-        "--tool-parser-name",
-        type=str,
-        default=None,
-        help="Tool parser name for the model. Available options: 'hermes', 'nemotron_deci', 'llama3_json', 'mistral', 'phi4', 'default'.",
-    )
+    # Allowed tool parsers. Keep in sync with backend enum/impl.
+    TOOL_PARSERS = ("default", "hermes", "nemotron_deci", "llama3_json", "mistral", "phi4")
+
+    def validate_tool_parser_name(value: str | None) -> str | None:
+        if value is None:
+            return None
+        v = value.strip().lower()
+        if not v:
+            return None
+        if v not in TOOL_PARSERS:
+            raise argparse.ArgumentTypeError(
+                f"--tool-parser-name must be one of {TOOL_PARSERS}, got: {value}"
+            )
+        return v
+
+    parser.add_argument(
+        "--tool-parser-name",
+        type=validate_tool_parser_name,
+        default=os.environ.get("DYN_TOOL_PARSER_NAME"),
+        help="Tool parser name for the model. Options: 'hermes', 'nemotron_deci', 'llama3_json', 'mistral', 'phi4', 'default'.",
+    )
```
242-244: Pass the normalized value into EntrypointArgs.

If you keep the validator above, this remains fine. If not, normalize here to avoid propagating stray whitespace/case:

```diff
-    if flags.tool_parser_name:
-        kwargs["tool_parser_name"] = flags.tool_parser_name
+    if flags.tool_parser_name:
+        kwargs["tool_parser_name"] = flags.tool_parser_name.strip().lower()
```

lib/llm/src/local_model.rs (2)
177-180: Normalize `tool_parser_name` in the builder setter.

Trim and drop empty strings so we only carry meaningful values downstream:

```diff
-    pub fn tool_parser_name(&mut self, tool_parser_name: Option<String>) -> &mut Self {
-        self.tool_parser_name = tool_parser_name;
-        self
-    }
+    pub fn tool_parser_name(&mut self, tool_parser_name: Option<String>) -> &mut Self {
+        self.tool_parser_name = tool_parser_name
+            .map(|s| s.trim().to_string())
+            .filter(|s| !s.is_empty());
+        self
+    }
```
317-318: Consider a zero-copy getter or an additional ref-returning API.

`tool_parser_name(&self) -> Option<String>` clones on each call. If this is accessed in hot paths, add a ref-returning variant to avoid allocations. Example addition (keep the existing method for compatibility):

```rust
pub fn tool_parser_name_ref(&self) -> Option<&str> {
    self.tool_parser_name.as_deref()
}
```

Also applies to: 385-387
lib/llm/src/protocols/openai/completions/aggregator.rs (1)
68-71: The parameter is currently pass-through only; either use it or underscore it.

Right now `tool_parser_name` is only logged. Consider one of:

- Implement tool-call parsing for completions (like chat), or
- Rename to `_tool_parser_name` and add a `TODO` to avoid confusion.

```diff
-    tool_parser_name: Option<String>,
+    _tool_parser_name: Option<String>,
@@
-    tracing::trace!("Tool parser name: {:?}", tool_parser_name); // Remove this after enabling tool calling for completions
+    tracing::trace!("Tool parser name provided for completions (currently unused)"); // TODO: enable tool calling for completions
```

lib/llm/tests/aggregators.rs (1)
40-41: LGTM: Tests correctly adapted to the new `from_sse_stream(..., None)` signature.

No behavior change; just API alignment. Consider adding one chat-completions test that passes a non-None parser to exercise the new path when tool-calling aggregation lands.

Also applies to: 62-63, 82-83, 102-103, 115-116
lib/llm/src/http/service/service_v2.rs (2)
187-189: Builder wiring is fine; consider normalizing empty strings at the boundary.

- The builder correctly stores and passes `tool_parser_name` into `State::new_with_etcd`.
- Optional: normalize here too (treat `Some("")` or whitespace-only as `None`) to catch misconfigurations early.

Example tweak in `build()`:

```diff
-        let state = Arc::new(State::new_with_etcd(
+        let normalized = config.tool_parser_name.and_then(|s| {
+            let t = s.trim().to_string();
+            if t.is_empty() { None } else { Some(t) }
+        });
+        let state = Arc::new(State::new_with_etcd(
             model_manager,
             config.etcd_client,
-            config.tool_parser_name,
+            normalized,
         ));
```

Also applies to: 311-315
378-381: API ergonomics: add a convenience setter for `&str`.

The current setter requires `Option<String>`. Consider an overload that accepts `impl Into<String>`, and one that takes `Option<&str>`, to reduce friction for callers:

```rust
impl HttpServiceConfigBuilder {
    pub fn with_tool_parser_name_str<T: Into<String>>(mut self, tool_parser_name: T) -> Self {
        self.tool_parser_name = Some(Some(tool_parser_name.into()));
        self
    }
}
```

lib/llm/src/protocols/openai/chat_completions/aggregator.rs (1)
98-100: Tool-call parsing hook: correct use of `Option` via `as_deref`; add minimal failure logging.

- Passing `tool_parser_name.as_deref()` is correct and avoids unnecessary allocations.
- Recommend logging parse failures at trace/debug to aid diagnostics when a parser is configured but yields an error.

```diff
-            if choice.tool_calls.is_none() {
-                if let Ok(tool_calls) = try_tool_call_parse_aggregate(&choice.text, tool_parser_name.as_deref()) {
+            if choice.tool_calls.is_none() {
+                match try_tool_call_parse_aggregate(&choice.text, tool_parser_name.as_deref()) {
+                    Ok(tool_calls) => {
                         if tool_calls.is_empty() {
                             continue;
                         }
                         for tool_call in &tool_calls {
                             tracing::debug!(
                                 tool_call_id = %tool_call.id,
                                 function_name = %tool_call.function.name,
                                 arguments = %tool_call.function.arguments,
                                 "Parsed structured tool call from aggregated content"
                             );
                         }
                         choice.tool_calls = Some(tool_calls);
                         choice.text.clear();
                         choice.finish_reason = Some(dynamo_async_openai::types::FinishReason::ToolCalls);
-                }
+                    }
+                    Err(e) => {
+                        tracing::trace!(error = %e, "Tool-call parsing failed; leaving content as text");
+                    }
+                }
             }
```

Also applies to: 165-185
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (8)
- components/frontend/src/dynamo/frontend/main.py (2 hunks)
- lib/bindings/python/rust/llm/entrypoint.rs (4 hunks)
- lib/llm/src/http/service/openai.rs (5 hunks)
- lib/llm/src/http/service/service_v2.rs (8 hunks)
- lib/llm/src/local_model.rs (7 hunks)
- lib/llm/src/protocols/openai/chat_completions/aggregator.rs (9 hunks)
- lib/llm/src/protocols/openai/completions/aggregator.rs (6 hunks)
- lib/llm/tests/aggregators.rs (5 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (8)
lib/llm/src/protocols/openai/completions/aggregator.rs (3)
- lib/llm/src/local_model.rs (2): `tool_parser_name` (177-180), `tool_parser_name` (385-387)
- lib/llm/src/http/service/service_v2.rs (1): `tool_parser_name` (130-132)
- lib/llm/src/protocols/openai/chat_completions/aggregator.rs (1): `apply` (97-210)

lib/llm/src/local_model.rs (1)
- lib/llm/src/http/service/service_v2.rs (1): `tool_parser_name` (130-132)

lib/llm/tests/aggregators.rs (2)
- lib/llm/src/protocols/openai/chat_completions/aggregator.rs (2): `from_sse_stream` (266-269), `from_sse_stream` (280-286)
- lib/llm/src/protocols/openai/completions/aggregator.rs (1): `from_sse_stream` (180-186)

components/frontend/src/dynamo/frontend/main.py (2)
- lib/llm/src/local_model.rs (3): `default` (66-87), `tool_parser_name` (177-180), `tool_parser_name` (385-387)
- lib/llm/src/http/service/service_v2.rs (1): `tool_parser_name` (130-132)

lib/llm/src/protocols/openai/chat_completions/aggregator.rs (4)
- lib/llm/src/local_model.rs (2): `tool_parser_name` (177-180), `tool_parser_name` (385-387)
- lib/llm/src/http/service/service_v2.rs (1): `tool_parser_name` (130-132)
- lib/parsers/src/tool_calling/tools.rs (1): `try_tool_call_parse_aggregate` (13-34)
- lib/llm/src/protocols/openai/completions/aggregator.rs (3): `from_annotated_stream` (188-193), `apply` (66-163), `from_sse_stream` (180-186)

lib/bindings/python/rust/llm/entrypoint.rs (2)
- lib/llm/src/http/service/service_v2.rs (2): `new` (75-88), `tool_parser_name` (130-132)
- lib/llm/src/local_model.rs (3): `extra_engine_args` (162-165), `tool_parser_name` (177-180), `tool_parser_name` (385-387)

lib/llm/src/http/service/openai.rs (4)
- lib/llm/src/local_model.rs (2): `tool_parser_name` (177-180), `tool_parser_name` (385-387)
- lib/llm/src/http/service/service_v2.rs (2): `tool_parser_name` (130-132), `state` (200-202)
- lib/llm/src/protocols/openai/chat_completions/aggregator.rs (2): `from_annotated_stream` (253-256), `from_annotated_stream` (273-278)
- lib/llm/src/protocols/openai/completions/aggregator.rs (1): `from_annotated_stream` (188-193)

lib/llm/src/http/service/service_v2.rs (2)
- lib/bindings/python/rust/llm/entrypoint.rs (3): `new` (40-55), `new` (70-80), `new` (118-167)
- lib/llm/src/local_model.rs (2): `tool_parser_name` (177-180), `tool_parser_name` (385-387)
🔇 Additional comments (12)
lib/bindings/python/rust/llm/entrypoint.rs (2)
117-118: PyO3 signature remains backward-compatible.

Adding `tool_parser_name=None` at the tail preserves positional/kwargs compatibility for existing callers. Good placement and defaults.
198-200: Preserve `None` semantics for `tool_parser_name` in State constructors.

The current constructors in `lib/llm/src/http/service/service_v2.rs` always wrap `tool_parser_name` into `Some("")` when `None` is passed, losing the distinction between "unset" and an empty string:

- `State::new` (around line 86):

```diff
-            tool_parser_name: Some(tool_parser_name.unwrap_or_else(|| String::from(""))),
+            tool_parser_name,
```

- `State::new_with_etcd` (around line 105):

```diff
-            tool_parser_name: Some(tool_parser_name.unwrap_or_else(|| String::from(""))),
+            tool_parser_name,
```

A quick search confirmed these are the only spots defaulting to `Some("")` for `None` inputs. Preserving the original `Option<String>` ensures downstream logic can accurately detect when the parser name was truly unset.

Likely an incorrect or invalid review comment.
lib/llm/src/local_model.rs (2)
62-63: LGTM: Added `tool_parser_name` to the builder and `Default`.

State is extended cleanly and defaults to `None`; no behavior change unless explicitly provided.

Also applies to: 85-86

223-224: Propagation into LocalModel (both echo and main paths) looks correct.

Both construction paths consistently pass the builder's `tool_parser_name` into `LocalModel`.

Also applies to: 300-301
lib/llm/src/protocols/openai/completions/aggregator.rs (3)
179-186: API propagation LGTM.

The `from_sse_stream` and `from_annotated_stream` signatures and call chain correctly forward `tool_parser_name`.

Also applies to: 188-193

248: Tests updated for the new parameter are correct.

Passing `None` matches the intended default behavior and keeps existing test semantics unchanged.

Also applies to: 272, 312, 372

179-186: All usages updated to the new arity.

I've verified that every call to both `NvCreateCompletionResponse::from_sse_stream(...)` and `NvCreateChatCompletionResponse::from_sse_stream(...)` now supplies two arguments (the stream and an `Option<String>`), and that every invocation of `DeltaAggregator::apply` in the completions and chat modules likewise passes two parameters (`stream` and `tool_parser_name` or `None`). No remaining call sites use the old signature; everything compiles as expected.

lib/llm/src/http/service/openai.rs (3)
298-299: Plumbing `tool_parser_name` into completions folding looks correct.

- Correctly retrieves `tool_parser_name` from state and threads it into `NvCreateCompletionResponse::from_annotated_stream(...)`.
- This aligns with the updated aggregator API taking `Option<String>`.

Also applies to: 319-329

737-751: Responses endpoint: `tool_parser_name` propagation is consistent.

- Correctly mirrors chat completions' non-streaming fold by passing `tool_parser_name` into `NvCreateChatCompletionResponse::from_annotated_stream(...)`.

483-484: Chat and non-chat completions: `tool_parser_name` correctly threaded.

- Verified that `tool_parser_name()` is fetched before engine creation and passed into `NvCreateCompletionResponse::from_annotated_stream(stream, tool_parser_name)` at line 319.
- Confirmed the same parameter is forwarded to `NvCreateChatCompletionResponse::from_annotated_stream(stream, tool_parser_name)` at lines 549 and 739.
- The streaming path remains an SSE passthrough; parsing happens during the fold, consistent with the new aggregator signature.

No further changes needed.
lib/llm/src/protocols/openai/chat_completions/aggregator.rs (2)
253-257: Public API updates to thread `tool_parser_name` are consistent.

- Trait and impl signatures cleanly propagate the optional parser name across annotated and SSE flows.

Also applies to: 266-270, 273-287

221-237: I've added a script to dump the surrounding context in `aggregator.rs`; once you provide the results, I can confirm whether `delta.role` is indeed an `Option` and whether an `unwrap_or` fallback is appropriate.
```rust
    tool_parser_name: Option<String>,
}
```
💡 Verification agent
🧩 Analysis chain
Do not coerce `None` into `Some("")`; preserve `None` semantics for `tool_parser_name`.

Currently `State::new`/`new_with_etcd` convert `None` into `Some(String::from(""))`. This silently changes semantics: downstream code now receives `Some("")` instead of `None`. The parser API differentiates between "no parser configured" (`None`) and a specific parser name. Passing `Some("")` may disable default behavior or select a non-existent parser.

Refactor to keep `None` as `None`, and also normalize empty/whitespace-only strings to `None`.
Apply this diff:
```diff
 pub struct State {
 @@
-    tool_parser_name: Option<String>,
+    tool_parser_name: Option<String>,
 }
 @@
-    pub fn new(manager: Arc<ModelManager>, tool_parser_name: Option<String>) -> Self {
+    pub fn new(manager: Arc<ModelManager>, tool_parser_name: Option<String>) -> Self {
         Self {
             manager,
             metrics: Arc::new(Metrics::default()),
             etcd_client: None,
             flags: StateFlags {
                 chat_endpoints_enabled: AtomicBool::new(false),
                 cmpl_endpoints_enabled: AtomicBool::new(false),
                 embeddings_endpoints_enabled: AtomicBool::new(false),
                 responses_endpoints_enabled: AtomicBool::new(false),
             },
-            tool_parser_name: Some(tool_parser_name.unwrap_or_else(|| String::from(""))),
+            tool_parser_name: tool_parser_name.and_then(|s| {
+                let trimmed = s.trim().to_string();
+                if trimmed.is_empty() { None } else { Some(trimmed) }
+            }),
         }
     }
 @@
-    pub fn new_with_etcd(
-        manager: Arc<ModelManager>,
-        etcd_client: Option<etcd::Client>,
-        tool_parser_name: Option<String>,
-    ) -> Self {
+    pub fn new_with_etcd(
+        manager: Arc<ModelManager>,
+        etcd_client: Option<etcd::Client>,
+        tool_parser_name: Option<String>,
+    ) -> Self {
         Self {
             manager,
             metrics: Arc::new(Metrics::default()),
             etcd_client,
             flags: StateFlags {
                 chat_endpoints_enabled: AtomicBool::new(false),
                 cmpl_endpoints_enabled: AtomicBool::new(false),
                 embeddings_endpoints_enabled: AtomicBool::new(false),
                 responses_endpoints_enabled: AtomicBool::new(false),
             },
-            tool_parser_name: Some(tool_parser_name.unwrap_or_else(|| String::from(""))),
+            tool_parser_name: tool_parser_name.and_then(|s| {
+                let trimmed = s.trim().to_string();
+                if trimmed.is_empty() { None } else { Some(trimmed) }
+            }),
         }
     }
 @@
-    pub fn tool_parser_name(&self) -> Option<String> {
-        self.tool_parser_name.clone()
-    }
+    pub fn tool_parser_name(&self) -> Option<String> {
+        self.tool_parser_name.clone()
+    }
```
Also applies to: 75-88, 90-107, 129-132
🏁 Script executed:
```bash
#!/bin/bash
# 1) Ensure State constructors no longer coerce None -> Some("")
rg -n 'unwrap_or_else\(\|\|\s*String::from\(\"\"\)\s*\)\)' lib/llm/src/http/service/service_v2.rs
# 2) Check all call sites for State::new/new_with_etcd are passing Option<String> directly
rg -nP 'State::new(_with_etcd)?\s*\(' -C2
# 3) Sanity: find all usages of tool_parser_name() to confirm Option<> is handled (not unwrapped blindly)
rg -nP '\btool_parser_name\(\)' -C2
```
Length of output: 9796
🏁 Script executed:
```bash
#!/bin/bash
# 1) Verify no unwrap_or_else remains in service_v2.rs
echo "Searching for unwrap_or_else occurrences..."
rg -n 'unwrap_or_else' lib/llm/src/http/service/service_v2.rs || echo "No unwrap_or_else found"
echo "Searching for String::from(\"\") occurrences..."
rg -n 'String::from\(\s*\"\"\s*\)' lib/llm/src/http/service/service_v2.rs || echo "No String::from(\"\") found"
# 2) Inspect constructor assignments for tool_parser_name
echo "Inspecting tool_parser_name assignments in State constructors..."
rg -n 'tool_parser_name:' -C2 lib/llm/src/http/service/service_v2.rs
# 3) Inspect State::new and new_with_etcd definitions
echo "Inspecting State::new definition..."
rg -nP 'pub fn new\(' -C10 lib/llm/src/http/service/service_v2.rs
echo "Inspecting State::new_with_etcd definition..."
rg -nP 'pub fn new_with_etcd' -C10 lib/llm/src/http/service/service_v2.rs
# 4) Inspect tool_parser_name usage in openai.rs
echo "Inspecting tool_parser_name() usage in openai.rs..."
rg -nP 'tool_parser_name\(\)' -C5 lib/llm/src/http/service/openai.rs
```
Length of output: 5675
Preserve `None` semantics for `tool_parser_name` in `State` constructors

`State::new` and `new_with_etcd` still coerce `None` into `Some("")`, altering downstream behavior. On lines 86 and 105 of `lib/llm/src/http/service/service_v2.rs`, the code uses:

```rust
tool_parser_name: Some(tool_parser_name.unwrap_or_else(|| String::from(""))),
```

This should be refactored to preserve `None` and normalize empty or whitespace-only strings to `None`.
Locations to fix:
- lib/llm/src/http/service/service_v2.rs: State::new (line 86)
- lib/llm/src/http/service/service_v2.rs: State::new_with_etcd (line 105)
Apply this diff at both constructor sites:
```diff
--- a/lib/llm/src/http/service/service_v2.rs
+++ b/lib/llm/src/http/service/service_v2.rs
@@ -86,7 +86,12 @@ impl State {
-            tool_parser_name: Some(tool_parser_name.unwrap_or_else(|| String::from(""))),
+            tool_parser_name: tool_parser_name.and_then(|s| {
+                let trimmed = s.trim().to_string();
+                if trimmed.is_empty() {
+                    None
+                } else {
+                    Some(trimmed)
+                }
+            }),
@@ -105,7 +110,12 @@ impl State {
-            tool_parser_name: Some(tool_parser_name.unwrap_or_else(|| String::from(""))),
+            tool_parser_name: tool_parser_name.and_then(|s| {
+                let trimmed = s.trim().to_string();
+                if trimmed.is_empty() {
+                    None
+                } else {
+                    Some(trimmed)
+                }
+            }),
```
This change ensures that when no parser is configured (`None`), it remains `None`, and that empty or all-whitespace names are normalized to `None`.
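The normalization the review asks for can be sketched as a standalone helper. This is a hypothetical free function for illustration only; the suggested fix inlines this `and_then` logic at both constructor sites rather than introducing a named function:

```rust
/// Preserve None, and map empty or whitespace-only parser names to None.
/// (Illustrative helper; name and placement are assumptions.)
fn normalize_parser_name(name: Option<String>) -> Option<String> {
    name.and_then(|s| {
        let trimmed = s.trim().to_string();
        if trimmed.is_empty() { None } else { Some(trimmed) }
    })
}
```

With this behavior, `None`, `Some("")`, and `Some("   ")` all end up as `None`, while `Some(" hermes ")` becomes `Some("hermes")`, so downstream code can reliably distinguish "no parser configured" from a real parser name.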
🤖 Prompt for AI Agents
In lib/llm/src/http/service/service_v2.rs around lines 86 and 105, the
constructors currently coerce None into Some("") via tool_parser_name:
Some(tool_parser_name.unwrap_or_else(|| String::from(""))); change both sites so
that None is preserved and any provided name that is empty or only whitespace is
normalized to None (i.e., map the Option<String> through trim(), filter out
empty results, and only wrap in Some if non-empty), updating the field
initialization accordingly.
lib/llm/src/local_model.rs
Outdated
```diff
@@ -59,6 +59,7 @@ pub struct LocalModelBuilder {
     extra_engine_args: Option<PathBuf>,
     runtime_config: ModelRuntimeConfig,
     user_data: Option<serde_json::Value>,
+    tool_parser_name: Option<String>,
```
Could you make this an `enum`? Then you give it an `impl From<String> for ToolParserName {`. Then you call the builder by doing `.tool_parser_name(the_str_name.into())` and it will magic itself into an enum. Or you can do `try_into()` (`impl TryFrom<String>` instead of `impl From`) and that can return an error.

Examples in `lib/llm/src/entrypoint/input.rs` on the `Input` type.
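A minimal sketch of what that suggestion could look like. The variant list mirrors the CLI help text in this PR; the type name, error type, and string spellings are assumptions, not the final design:

```rust
use std::convert::TryFrom;

/// Hypothetical typed parser name; makes invalid values unrepresentable.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum ToolParserName {
    #[default]
    Default,
    Hermes,
    NemotronDeci,
    Llama3Json,
    Mistral,
    Phi4,
}

impl TryFrom<String> for ToolParserName {
    type Error = String;

    // Accepts the CLI strings case-insensitively; anything else is an error.
    fn try_from(s: String) -> Result<Self, Self::Error> {
        match s.trim().to_lowercase().as_str() {
            "" | "default" => Ok(Self::Default),
            "hermes" => Ok(Self::Hermes),
            "nemotron_deci" => Ok(Self::NemotronDeci),
            "llama3_json" => Ok(Self::Llama3Json),
            "mistral" => Ok(Self::Mistral),
            "phi4" => Ok(Self::Phi4),
            other => Err(format!("unknown tool parser: {other}")),
        }
    }
}
```

A caller could then do `builder.tool_parser_name(cli_value.try_into()?)`, surfacing a typo at the CLI boundary instead of deep in the aggregators.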
For the default type here, this makes `MyThing::One` the `Default::default()` (note the enum needs `#[derive(Default)]` for the `#[default]` variant attribute to work):

```rust
#[derive(Default)]
enum MyThing {
    #[default]
    One,
    Two,
}
```

Then `#[derive(Copy, Clone)]` on it, because now it's implemented as a single byte, and so you don't need to `.clone()` it; it copies automatically.
Don't want to bog down development here - but for all the many LoRA, tool calling, reasoning, etc. PRs coming in - can we start a small `docs/guides/frontend.md` or something and start to call out some of the big things we're adding and how to opt in? Not looking for something terribly comprehensive, but I want to get something in so each new PR has an easier time just appending a small chunk to the doc each time.

```markdown
# Dynamo OpenAI Frontend
...
## Tool Calling
## Tool Parsers
For tool parsers, see the `--tool-parser-name` arg in output of `python -m dynamo.frontend --help`.
## Reasoning Parsers
For reasoning models, dynamo supports ...
## LoRAs
LoRAs are supported under unique model names in the `/v1/models` route, for example ...
```
Doesn't need to be specific to this PR either - but I think it would be good to start writing this stuff down somewhere earlier than later. CC @grahamking
For example @zhongdaor-nv is starting to look into tool calling and reasoning for GPT OSS specifically, and I don't have a good resource/overview to point him to the list of things we just added in past 2 weeks that may be applicable to him, other than linking a dozen PRs or so
sounds good. I will add it soon once I finalize the tool call param design
Force-pushed from 8eca226 to ed22ee5 (Compare)
```diff
@@ -176,6 +176,12 @@ def parse_args():
         default=None,
         help="Prefix for Dynamo frontend metrics. If unset, uses DYN_METRICS_PREFIX env var or 'dynamo_frontend'.",
     )
+    parser.add_argument(
+        "--tool-call-parser",
```
Since the frontend can serve many models - should this be a frontend argument? Or a backend/worker argument passed through ModelDeploymentCard/RuntimeConfig etc. that the frontend can load on demand when a worker is discovered via `register_llm`?

e.g. `python -m dynamo.vllm --tool-call-parser ...` instead of `python -m dynamo.frontend`
Oops, sorry I see you said the same thing on slack - ignore me.
Closing this in favor of #2619.
Overview:
Details:
Where should the reviewer start?
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
Summary by CodeRabbit