fix: limit Support for HTTP Body limit in axum server #2581
Conversation
👋 Hi michaelfeil! Thank you for contributing to ai-dynamo/dynamo.
Walkthrough
A private 45 MB BODY_LIMIT was added and applied via DefaultBodyLimit to three OpenAI HTTP routers: completions, chat_completions, and embeddings. This enforces maximum request body sizes internally without changing public APIs or signatures.
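For readers skimming the PR, here is a rough, self-contained sketch of the pattern being applied; the route path, handler, and router function are placeholders, not the actual code in openai.rs:

```rust
use axum::{extract::DefaultBodyLimit, routing::post, Router};

// Mirrors the PR's constant: a 45 MB cap for OpenAI-style request bodies.
const BODY_LIMIT: usize = 45 * 1024 * 1024;

// Placeholder handler standing in for the real chat_completions handler.
async fn chat_completions() -> &'static str {
    "ok"
}

fn chat_completions_router() -> Router {
    Router::new()
        .route("/v1/chat/completions", post(chat_completions))
        // Raises axum's 2 MB default body cap to 45 MB for this router.
        .layer(DefaultBodyLimit::max(BODY_LIMIT))
}
```

Per the walkthrough, the same layer is attached to the completions and embeddings routers as well.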
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant C as Client
    participant S as OpenAI Service
    participant BL as DefaultBodyLimit (45 MB)
    participant H as Endpoint Handler
    C->>S: HTTP request (completion/chat/embedding)
    S->>BL: Route request
    alt Body <= 45 MB
        BL->>H: Forward request
        H-->>S: Response
        S-->>C: HTTP 200/4xx/5xx
    else Body > 45 MB
        BL-->>S: Reject (Payload Too Large)
        S-->>C: HTTP 413
    end
```
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Actionable comments posted: 0
🧹 Nitpick comments (4)
lib/llm/src/http/service/openai.rs (4)
51-52: Make the limit name self-descriptive and consider configurability.
The constant works, but a more descriptive name and docs help future maintainers. Consider also making it configurable (env/config) so ops can tune without a rebuild.
```diff
-const BODY_LIMIT: usize = 45 * 1024 * 1024;
+/// Maximum OpenAI request body size in bytes (default: 45 MiB).
+const MB: usize = 1024 * 1024;
+const OPENAI_BODY_LIMIT_BYTES: usize = 45 * MB;
```

And update the usages accordingly:
```diff
-    .layer(axum::extract::DefaultBodyLimit::max(BODY_LIMIT))
+    .layer(axum::extract::DefaultBodyLimit::max(OPENAI_BODY_LIMIT_BYTES))
```

If you'd like, I can wire this to a config/env flag (e.g., DYNAMO_OPENAI_MAX_BODY_BYTES) in a follow-up.
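A minimal sketch of what that follow-up could look like, assuming the env var name suggested above and falling back to this PR's 45 MB default (none of this exists in the codebase yet):

```rust
use std::sync::OnceLock;

// Hypothetical: read DYNAMO_OPENAI_MAX_BODY_BYTES once, default to 45 MB.
fn openai_body_limit_bytes() -> usize {
    static LIMIT: OnceLock<usize> = OnceLock::new();
    *LIMIT.get_or_init(|| {
        std::env::var("DYNAMO_OPENAI_MAX_BODY_BYTES")
            .ok()
            .and_then(|v| v.parse().ok())
            .unwrap_or(45 * 1024 * 1024)
    })
}
```

The routers would then call DefaultBodyLimit::max(openai_body_limit_bytes()) instead of referencing the constant directly.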
1023-1024: Apply the same body limit to the /v1/responses router for consistency.
Responses accepts Json and should likely be capped the same as the other OpenAI endpoints to avoid surprising differences.
```diff
 pub fn responses_router(
     state: Arc<service_v2::State>,
     template: Option<RequestTemplate>,
     path: Option<String>,
 ) -> (Vec<RouteDoc>, Router) {
     let path = path.unwrap_or("/v1/responses".to_string());
     let doc = RouteDoc::new(axum::http::Method::POST, &path);
     let router = Router::new()
         .route(&path, post(handler_responses))
+        .layer(axum::extract::DefaultBodyLimit::max(OPENAI_BODY_LIMIT_BYTES))
         .with_state((state, template));
     (vec![doc], router)
 }
```
If the omission is intentional (e.g., different limits per endpoint), consider documenting the rationale or making the per-endpoint limit explicit at the call site.
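If explicit per-endpoint limits are the chosen direction, one possible shape is sketched below; the handler names and the second constant are illustrative only, since the PR currently uses a single shared limit:

```rust
use axum::{extract::DefaultBodyLimit, routing::post, Router};

const COMPLETIONS_BODY_LIMIT: usize = 45 * 1024 * 1024;
const RESPONSES_BODY_LIMIT: usize = 45 * 1024 * 1024; // could diverge later if needed

async fn completions_handler() -> &'static str { "ok" }
async fn responses_handler() -> &'static str { "ok" }

fn build_routers() -> Router {
    // Each router names its own limit, so changing one endpoint's cap
    // cannot silently change the others.
    let completions = Router::new()
        .route("/v1/completions", post(completions_handler))
        .layer(DefaultBodyLimit::max(COMPLETIONS_BODY_LIMIT));
    let responses = Router::new()
        .route("/v1/responses", post(responses_handler))
        .layer(DefaultBodyLimit::max(RESPONSES_BODY_LIMIT));
    completions.merge(responses)
}
```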
1038-1039: DRY: factor the body-limit layer into a tiny helper to avoid duplication.
This is minor, but it cuts repetition across routers and keeps future changes to a one-liner.
```rust
// Near this module:
fn apply_openai_body_limit<S: Clone + Send + Sync + 'static>(router: Router<S>) -> Router<S> {
    router.layer(axum::extract::DefaultBodyLimit::max(OPENAI_BODY_LIMIT_BYTES))
}

// Usage:
let router = apply_openai_body_limit(Router::new().route(&path, post(embeddings)))
    .with_state(state);
```
1007-1008: Map oversized JSON-body rejections to your ErrorMessage shape.
Axum caps request bodies at 2 MB by default; DefaultBodyLimit::max raises that cap, and when the limit is exceeded the Json extractor produces a JsonRejection::BytesRejection whose .status() is 413 Payload Too Large and whose .body_text() yields a plain-text error (docs.rs). To keep your API's ErrorMessage { error: String } format consistent, catch this rejection and return JSON:

```diff
 // In your handler signature:
-    request: Json<NvCreateCompletionRequest>,
+    request: Result<Json<NvCreateCompletionRequest>, JsonRejection>,

 async fn handler_completions(
     State(state): State<Arc<service_v2::State>>,
     headers: HeaderMap,
-    request: Json<NvCreateCompletionRequest>,
+    request: Result<Json<NvCreateCompletionRequest>, JsonRejection>,
 ) -> Result<Response, ErrorResponse> {
     let Json(request) = request.map_err(|rej| {
-        // Default produces a 413 with plain text
-        rej.into_response()
+        (
+            StatusCode::PAYLOAD_TOO_LARGE,
+            Json(ErrorMessage {
+                error: rej.body_text().into(),
+            }),
+        )
     })?;
     // ...
 }
```

Alternatively, apply a global mapping middleware:
```rust
use axum::error_handling::HandleErrorLayer;
use tower::{BoxError, ServiceBuilder};

let app = Router::new()
    // ...
    .layer(
        ServiceBuilder::new()
            .layer(DefaultBodyLimit::max(BODY_LIMIT))
            .layer(HandleErrorLayer::new(|err: BoxError| async move {
                if let Some(JsonRejection::BytesRejection(_)) =
                    err.downcast_ref::<JsonRejection>()
                {
                    (
                        StatusCode::PAYLOAD_TOO_LARGE,
                        Json(ErrorMessage {
                            error: "Request body too large".into(),
                        }),
                    )
                        .into_response()
                } else {
                    // Anything else becomes a generic 500 so the closure has one return type.
                    (StatusCode::INTERNAL_SERVER_ERROR, "internal error").into_response()
                }
            })),
    )
    .with_state(state);
```
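Whichever mapping is used, the cap itself is easy to verify with an integration-style test. A rough sketch follows, assuming tokio and tower are available as dev-dependencies; the route, handler, and test name are placeholders, not code from this PR:

```rust
#[cfg(test)]
mod body_limit_tests {
    use axum::{
        body::Body,
        extract::DefaultBodyLimit,
        http::{Request, StatusCode},
        routing::post,
        Router,
    };
    use tower::ServiceExt; // for `oneshot`

    // Placeholder handler that just echoes the body back.
    async fn echo(body: String) -> String {
        body
    }

    #[tokio::test]
    async fn oversized_body_is_rejected_with_413() {
        let app = Router::new()
            .route("/v1/completions", post(echo))
            .layer(DefaultBodyLimit::max(45 * 1024 * 1024));

        // One byte over the 45 MB cap.
        let oversized = vec![b'a'; 45 * 1024 * 1024 + 1];
        let response = app
            .oneshot(
                Request::post("/v1/completions")
                    .body(Body::from(oversized))
                    .unwrap(),
            )
            .await
            .unwrap();

        assert_eq!(response.status(), StatusCode::PAYLOAD_TOO_LARGE);
    }
}
```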
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
- lib/llm/src/http/service/openai.rs (4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Build and Test - dynamo
- GitHub Check: pre-merge-rust (lib/bindings/python)
- GitHub Check: pre-merge-rust (.)
- GitHub Check: pre-merge-rust (lib/runtime/examples)
LGTM. We probably want an env var to control this, but we can follow up on it. CC @grahamking @GuanLuo @kthui
Added #2584 as a follow-up for making this configurable.
Likely relates to #2580
Co-authored-by: Ryan McCormick <[email protected]> Signed-off-by: Michael Feil <[email protected]>
Very nice! Completely missed the 2MB default.
Signed-off-by: Michael Feil <[email protected]> Co-authored-by: Ryan McCormick <[email protected]> Signed-off-by: Hannah Zhang <[email protected]>
Signed-off-by: Michael Feil <[email protected]> Co-authored-by: Ryan McCormick <[email protected]>
Signed-off-by: Michael Feil <[email protected]> Co-authored-by: Ryan McCormick <[email protected]> Signed-off-by: Krishnan Prashanth <[email protected]>
Overview: Adds a 45 MB request body limit (via axum's DefaultBodyLimit) to the OpenAI completions, chat completions, and embeddings HTTP routers, replacing axum's 2 MB default.
Details:
Where should the reviewer start? lib/llm/src/http/service/openai.rs
Related Issues: Relates to #2580
Summary by CodeRabbit
- Enforced a 45 MB maximum request body size on the OpenAI completions, chat completions, and embeddings endpoints; oversized requests are rejected with HTTP 413.