
Conversation

michaelfeil
Contributor

@michaelfeil michaelfeil commented Aug 20, 2025

Overview:

Details:

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

Summary by CodeRabbit

  • Chores
    • Enforced a 45 MB maximum request body size for the Completions, Chat Completions, and Embeddings endpoints.
    • Requests exceeding this limit will be rejected, ensuring more reliable performance and protection against oversized payloads.
    • Typical usage is unaffected; only unusually large inputs are impacted.
    • No changes to public APIs or request/response formats.

@michaelfeil michaelfeil requested a review from a team as a code owner August 20, 2025 23:01

copy-pr-bot bot commented Aug 20, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.


👋 Hi michaelfeil! Thank you for contributing to ai-dynamo/dynamo.

Just a reminder: The NVIDIA Test Github Validation CI runs an essential subset of the testing framework to quickly catch errors. Your PR reviewers may elect to test the changes comprehensively before approving your changes.

🚀

@github-actions github-actions bot added the external-contribution Pull request is from an external contributor label Aug 20, 2025
@michaelfeil michaelfeil changed the title "Support for HTTP Body limit in axum server" → "fix: limit Support for HTTP Body limit in axum server" Aug 20, 2025
@github-actions github-actions bot added the fix label Aug 20, 2025
Contributor

coderabbitai bot commented Aug 20, 2025

Walkthrough

A private 45 MB BODY_LIMIT was added and applied via DefaultBodyLimit to three OpenAI HTTP routers: completions, chat_completions, and embeddings. This enforces maximum request body sizes internally without changing public APIs or signatures.

Changes

Cohort / File(s): OpenAI HTTP service body limit enforcement — lib/llm/src/http/service/openai.rs
Summary: Added a private const BODY_LIMIT of 45 MB and applied DefaultBodyLimit::max(BODY_LIMIT) to completions_router, chat_completions_router, and embeddings_router; no public API changes.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant C as Client
  participant S as OpenAI Service
  participant BL as DefaultBodyLimit (45 MB)
  participant H as Endpoint Handler

  C->>S: HTTP request (completion/chat/embedding)
  S->>BL: Route request
  alt Body <= 45 MB
    BL->>H: Forward request
    H-->>S: Response
    S-->>C: HTTP 200/4xx/5xx
  else Body > 45 MB
    BL-->>S: Reject (Payload Too Large)
    S-->>C: HTTP 413
  end

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

I nibbled bytes and set a gate,
Forty-five meg—no more on the plate.
Completions, chats, embeddings too,
Queue up small, we’ll speed you through.
Thump-thump logs, the burrow’s tidy—
Requests behave, no payload mighty.


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (4)
lib/llm/src/http/service/openai.rs (4)

51-52: Make the limit name self-descriptive and consider configurability.

The constant works, but a more descriptive name and docs help future maintainers. Consider also making it configurable (env/config) so ops can tune without a rebuild.

-const BODY_LIMIT: usize = 45 * 1024 * 1024;
+const MB: usize = 1024 * 1024;
+/// Maximum OpenAI request body size in bytes (default: 45 MiB).
+const OPENAI_BODY_LIMIT_BYTES: usize = 45 * MB;

And update the usages accordingly:

- .layer(axum::extract::DefaultBodyLimit::max(BODY_LIMIT))
+ .layer(axum::extract::DefaultBodyLimit::max(OPENAI_BODY_LIMIT_BYTES))

If you’d like, I can wire this to a config/env flag (e.g., DYNAMO_OPENAI_MAX_BODY_BYTES) in a follow-up.


1023-1024: Apply the same body limit to the /v1/responses router for consistency.

Responses accepts Json and should likely be capped the same as the other OpenAI endpoints to avoid surprising differences.

 pub fn responses_router(
     state: Arc<service_v2::State>,
     template: Option<RequestTemplate>,
     path: Option<String>,
 ) -> (Vec<RouteDoc>, Router) {
     let path = path.unwrap_or("/v1/responses".to_string());
     let doc = RouteDoc::new(axum::http::Method::POST, &path);
     let router = Router::new()
         .route(&path, post(handler_responses))
+        .layer(axum::extract::DefaultBodyLimit::max(OPENAI_BODY_LIMIT_BYTES))
         .with_state((state, template));
     (vec![doc], router)
 }

If the omission is intentional (e.g., different limits per endpoint), consider documenting the rationale or making the per-endpoint limit explicit at the call site.


1038-1039: DRY: factor the body-limit layer into a tiny helper to avoid duplication.

This is minor, but it cuts repetition across routers and keeps future changes a one-liner.

// Near this module:
fn apply_openai_body_limit<S>(router: Router<S>) -> Router<S>
where
    S: Clone + Send + Sync + 'static, // `Router::layer` requires these bounds
{
    router.layer(axum::extract::DefaultBodyLimit::max(OPENAI_BODY_LIMIT_BYTES))
}

// Usage:
let router = apply_openai_body_limit(
    Router::new().route(&path, post(embeddings))
).with_state(state);

1007-1008: Map oversized JSON-body rejections to your ErrorMessage shape.

Axum’s default body limit is 2 MB; DefaultBodyLimit::max replaces that cap. When the Json extractor exceeds it, extraction fails with a JsonRejection::BytesRejection whose .status() is 413 Payload Too Large and whose .body_text() yields a plain-text error (docs.rs). To keep your API’s ErrorMessage { error: String } format consistent, catch this rejection and return JSON:

 // In your handler signature:
- request: Json<NvCreateCompletionRequest>,
+ request: Result<Json<NvCreateCompletionRequest>, JsonRejection>,

 async fn handler_completions(
     State(state): State<Arc<service_v2::State>>,
     headers: HeaderMap,
-    request: Json<NvCreateCompletionRequest>,
+    request: Result<Json<NvCreateCompletionRequest>, JsonRejection>,
 ) -> Result<Response, ErrorResponse> {
     let Json(request) = request.map_err(|rej| {
-        // Default produces a 413 with plain text
-        rej.into_response()
+        (
+            StatusCode::PAYLOAD_TOO_LARGE,
+            Json(ErrorMessage {
+                error: rej.body_text().into(),
+            }),
+        )
     })?;
     // ...
 }

Alternatively, apply a global mapping middleware:

use axum::{error_handling::HandleErrorLayer, extract::rejection::JsonRejection};
use tower::{BoxError, ServiceBuilder};

let app = Router::new()
    // ...
    .layer(
        ServiceBuilder::new()
            // HandleErrorLayer must be outermost so it sees errors from inner layers,
            // and its handler must return a single response type.
            .layer(HandleErrorLayer::new(|err: BoxError| async move {
                if let Some(JsonRejection::BytesRejection(_)) =
                    err.downcast_ref::<JsonRejection>()
                {
                    (
                        StatusCode::PAYLOAD_TOO_LARGE,
                        Json(ErrorMessage {
                            error: "Request body too large".into(),
                        }),
                    )
                } else {
                    (
                        StatusCode::INTERNAL_SERVER_ERROR,
                        Json(ErrorMessage {
                            error: err.to_string(),
                        }),
                    )
                }
            }))
            .layer(DefaultBodyLimit::max(BODY_LIMIT)),
    )
    .with_state(state);
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between bc290e7 and d0fa8e6.

📒 Files selected for processing (1)
  • lib/llm/src/http/service/openai.rs (4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Build and Test - dynamo
  • GitHub Check: pre-merge-rust (lib/bindings/python)
  • GitHub Check: pre-merge-rust (.)
  • GitHub Check: pre-merge-rust (lib/runtime/examples)

Contributor

@rmccorm4 rmccorm4 left a comment


LGTM. We probably want an env var to control this, but we can follow up on it. CC @grahamking @GuanLuo @kthui

Added #2584 to follow up on making this configurable

@rmccorm4
Contributor

Likely relates to #2580

michaelfeil and others added 2 commits August 20, 2025 19:30
Contributor

@grahamking grahamking left a comment


Very nice! Completely missed the 2MB default.

@grahamking grahamking merged commit 41a617f into ai-dynamo:main Aug 21, 2025
11 checks passed
hhzhang16 pushed a commit that referenced this pull request Aug 27, 2025
Signed-off-by: Michael Feil <[email protected]>
Co-authored-by: Ryan McCormick <[email protected]>
Signed-off-by: Hannah Zhang <[email protected]>
nv-anants pushed a commit that referenced this pull request Aug 28, 2025
KrishnanPrash pushed a commit that referenced this pull request Sep 2, 2025
Signed-off-by: Michael Feil <[email protected]>
Co-authored-by: Ryan McCormick <[email protected]>
Signed-off-by: Krishnan Prashanth <[email protected]>