Releases: BerriAI/litellm
v1.68.0-nightly
What's Changed
- [Contributor PR] Support Llama-api as an LLM provider (#10451) by @ishaan-jaff in #10538
- UI - fix(model_management_endpoints.py): allow team admin to update model info + fix request logs - handle expanding other rows when existing row selected + fix(organization_endpoints.py): enable proxy admin with 'all-proxy-model' access to create new org with specific models by @krrishdholakia in #10539
- [Bug Fix] UnicodeDecodeError: 'charmap' on Windows during litellm import by @ishaan-jaff in #10542
- fix(converse_transformation.py): handle meta llama tool call response by @krrishdholakia in #10541
Full Changelog: v1.67.6.dev1...v1.68.0-nightly
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.68.0-nightly
```
Don't want to maintain your internal proxy? Get in touch:
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
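Once the container is up, you can sanity-check the proxy with a request to its OpenAI-compatible endpoint. A minimal sketch, assuming a model named `gpt-4o` is configured on the proxy and `sk-1234` is a valid virtual key (both are placeholders):
```shell
# Send a test chat completion through the proxy (model name and key are placeholders)
curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```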
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 180.0 | 210.99923315604772 | 6.1894793990457675 | 0.0 | 1852 | 0 | 166.69672900002297 | 3755.0343799999837 |
Aggregated | Passed ✅ | 180.0 | 210.99923315604772 | 6.1894793990457675 | 0.0 | 1852 | 0 | 166.69672900002297 | 3755.0343799999837 |
v1.68.0-stable
What's Changed
- Handle more gemini tool calling edge cases + support bedrock 'stable-image-core' by @krrishdholakia in #10351
- [Feat] Add logging callback support for /moderations API by @ishaan-jaff in #10390
- [Reliability fix] Redis transaction buffer - ensure all redis queues are periodically flushed by @ishaan-jaff in #10393
- [Bug Fix] Responses API - fix for handling multiturn responses API sessions by @ishaan-jaff in #10415
- build(deps): bump axios, @docusaurus/core, @docusaurus/plugin-google-gtag, @docusaurus/plugin-ideal-image and @docusaurus/preset-classic in /docs/my-website by @dependabot in #10419
- docs: Fix link formatting in GitHub PR template by @user202729 in #10417
- docs: Improve documentation of phoenix logging by @user202729 in #10416
- [Feat Security] - Allow blocking web crawlers by @ishaan-jaff in #10420
- [Feat] Add support for using Bedrock Knowledge Bases with LiteLLM /chat/completions requests by @ishaan-jaff in #10413
- Revert "build(deps): bump axios, @docusaurus/core, @docusaurus/plugin-google-gtag, @docusaurus/plugin-ideal-image and @docusaurus/preset-classic in /docs/my-website" by @ishaan-jaff in #10421
- fix google studio url by @nonZero in #10095
- [New model] Add openai/computer-use-preview cost tracking / pricing by @ishaan-jaff in #10422
- fix(langsmith.py): respect langsmith batch size param by @krrishdholakia in #10411
- Support `x-litellm-api-key` header param + allow key at max budget to call non-llm api endpoints by @krrishdholakia in #10392 (see the header example after this list)
- Update fireworks ai pricing by @krrishdholakia in #10425
- Schedule budget resets at expectable times (#10331) by @krrishdholakia in #10333
- Embedding caching fixes - handle str -> list cache, set usage tokens for cache hits, combine usage tokens on partial cache hits by @krrishdholakia in #10424
- Contributor PR - Support OPENAI_BASE_URL in addition to OPENAI_API_BASE (#9995) by @ishaan-jaff in #10423
- New feature: Add Python client library for LiteLLM Proxy by @msabramo in #10445
- Add key-level multi-instance tpm/rpm/max parallel request limiting by @krrishdholakia in #10458
- [UI] Allow adding triton models on LiteLLM UI by @ishaan-jaff in #10456
- [Feat] Vector Stores/KnowledgeBases - Allow defining Vector Store Configs by @ishaan-jaff in #10448
- Add low-level interface to client library for doing HTTP requests by @msabramo in #10452
- Correctly re-raise 504 errors and add `gpt-4o-mini-tts` support by @krrishdholakia in #10462
- UI - Fix filtering on key alias + support global sorting on keys by @krrishdholakia in #10455
- [Bug Fix] Ensure Non-Admin virtual keys can access /mcp routes by @ishaan-jaff in #10473
- [Fixes] Azure OpenAI OIDC - allow using litellm defined params for OIDC Auth by @ishaan-jaff in #10394
- Add supports_pdf_input: true to Claude 3.7 bedrock models by @RupertoM in #9917
- Add `llamafile` as a provider (#10203) by @ishaan-jaff in #10482
- Fix mcp.md in documentation by @1995parham in #10493
- docs(realtime): yaml config example for realtime model by @kmontocam in #10489
- Fix return finish_reason = "tool_calls" for gemini tool calling by @krrishdholakia in #10485
- Add user + team based multi-instance rate limiting by @krrishdholakia in #10497
- mypy tweaks by @msabramo in #10490
- Add vertex ai meta llama 4 support + handle tool call result in content for vertex ai by @krrishdholakia in #10492
- Fix and rewrite of token_counter by @happyherp in #10409
- [Fix + Refactor] Trigger Soft Budget Webhooks When Key Crosses Threshold by @ishaan-jaff in #10491
- [Bug Fix] Ensure Web Search / File Search cost are only added when the response includes the tool call by @ishaan-jaff in #10476
- Fixes for `test_team_budget_metrics` and `test_generate_and_update_key` by @S1LV3RJ1NX in #10500
- [Feat] KnowledgeBase/Vector Store - Log `StandardLoggingVectorStoreRequest` for requests made when a vector store is used by @ishaan-jaff in #10509
- Don't depend on uvloop on windows (#10060) by @ishaan-jaff in #10483
- fix: PydanticDeprecatedSince20: Support for class-based `config` is deprecated, use ConfigDict instead. Deprecated in Pydantic V2.0 to be removed in V3.0. by @Elijas in #9372
- [Feat] Show Vector Store / KB Request on LiteLLM Logs Page by @ishaan-jaff in #10514
- Fix pytest event loop warning (#9641) by @msabramo in #10512
- UI - fix adding vertex models with reusable credentials + fix pagination on keys table + fix showing org budgets on table by @krrishdholakia in #10528
- Playwright test for team admin (#10366) by @krrishdholakia in #10470
- [QA] Bedrock Vector Stores Integration - Allow using with registry + in OpenAI API spec with tools by @ishaan-jaff in #10516
- UI - allow reassigning team to other org by @krrishdholakia in #10527
- [Models/ LLM Credentials] Fix edit credentials modal by @NANDINI-star in #10519
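The `x-litellm-api-key` change (#10392, referenced above) lets callers pass their virtual key in a dedicated header instead of the standard `Authorization` header. A minimal sketch of what that might look like against a local proxy; the key and model name are placeholders:
```shell
# Authenticate with the x-litellm-api-key header instead of Authorization
# (key "sk-1234" and model "gpt-4o" are placeholders for your own values)
curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-litellm-api-key: sk-1234" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```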
New Contributors
- @user202729 made their first contribution in #10417
- @nonZero made their first contribution in #10095
- @RupertoM made their first contribution in #9917
- @1995parham made their first contribution in #10493
- @kmontocam made their first contribution in #10489
- @happyherp made their first contribution in #10409
- @Elijas made their first contribution in #9372
Full Changelog: v1.67.4-stable...v1.68.0-stable
v1.67.6.dev1
What's Changed
- [Docs] Using LiteLLM with vector stores / knowledge bases by @ishaan-jaff in #10534
- [Docs] Document StandardLoggingVectorStoreRequest by @ishaan-jaff in #10535
- Litellm stable release notes 05 03 2025 by @krrishdholakia in #10536
- [New model pricing] Add perplexity/sonar-deep-research by @ishaan-jaff in #10537
Full Changelog: v1.68.0-stable...v1.67.6.dev1
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.67.6.dev1
```
Don't want to maintain your internal proxy? Get in touch:
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 240.0 | 260.6852919703334 | 6.202735781000604 | 0.0 | 1854 | 0 | 210.055011999998 | 2657.431592000023 |
Aggregated | Passed ✅ | 240.0 | 260.6852919703334 | 6.202735781000604 | 0.0 | 1854 | 0 | 210.055011999998 | 2657.431592000023 |
v1.67.6-nightly
What's Changed
- Update fireworks ai pricing by @krrishdholakia in #10425
- Schedule budget resets at expectable times (#10331) by @krrishdholakia in #10333
- Embedding caching fixes - handle str -> list cache, set usage tokens for cache hits, combine usage tokens on partial cache hits by @krrishdholakia in #10424
- Contributor PR - Support OPENAI_BASE_URL in addition to OPENAI_API_BASE (#9995) by @ishaan-jaff in #10423
- New feature: Add Python client library for LiteLLM Proxy by @msabramo in #10445
- Add key-level multi-instance tpm/rpm/max parallel request limiting by @krrishdholakia in #10458
- [UI] Allow adding triton models on LiteLLM UI by @ishaan-jaff in #10456
- [Feat] Vector Stores/KnowledgeBases - Allow defining Vector Store Configs by @ishaan-jaff in #10448
- Add low-level interface to client library for doing HTTP requests by @msabramo in #10452
- Correctly re-raise 504 errors and add `gpt-4o-mini-tts` support by @krrishdholakia in #10462
- UI - Fix filtering on key alias + support global sorting on keys by @krrishdholakia in #10455
- [Bug Fix] Ensure Non-Admin virtual keys can access /mcp routes by @ishaan-jaff in #10473
- [Fixes] Azure OpenAI OIDC - allow using litellm defined params for OIDC Auth by @ishaan-jaff in #10394
- Add supports_pdf_input: true to Claude 3.7 bedrock models by @RupertoM in #9917
- Add `llamafile` as a provider (#10203) by @ishaan-jaff in #10482
- Fix mcp.md in documentation by @1995parham in #10493
- docs(realtime): yaml config example for realtime model by @kmontocam in #10489
- Fix return finish_reason = "tool_calls" for gemini tool calling by @krrishdholakia in #10485
- Add user + team based multi-instance rate limiting by @krrishdholakia in #10497
- mypy tweaks by @msabramo in #10490
- Add vertex ai meta llama 4 support + handle tool call result in content for vertex ai by @krrishdholakia in #10492
- Fix and rewrite of token_counter by @happyherp in #10409
- [Fix + Refactor] Trigger Soft Budget Webhooks When Key Crosses Threshold by @ishaan-jaff in #10491
- [Bug Fix] Ensure Web Search / File Search cost are only added when the response includes the tool call by @ishaan-jaff in #10476
- Fixes for `test_team_budget_metrics` and `test_generate_and_update_key` by @S1LV3RJ1NX in #10500
New Contributors
- @RupertoM made their first contribution in #9917
- @1995parham made their first contribution in #10493
- @kmontocam made their first contribution in #10489
- @happyherp made their first contribution in #10409
Full Changelog: v1.67.5-nightly...v1.67.6-nightly
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.67.6-nightly
```
Don't want to maintain your internal proxy? Get in touch:
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 180.0 | 222.14133768625408 | 6.199997054289087 | 0.0 | 1855 | 0 | 165.41539600001443 | 4686.558129000048 |
Aggregated | Passed ✅ | 180.0 | 222.14133768625408 | 6.199997054289087 | 0.0 | 1855 | 0 | 165.41539600001443 | 4686.558129000048 |
v1.67.5-nightly
What's Changed
- [Docs] v1.67.4-stable by @ishaan-jaff in #10338
- Prisma Migrate - support setting custom migration dir by @krrishdholakia in #10336
- Fix: Prevent cache token overwrite by last chunk in streaming usage by @mdonaj in #10284
- [UI] Fixes for sessions on UI - ensure errors have a session and use 1 session for test key by @ishaan-jaff in #10342
- [UI QA Bug Fix] - Fix SSO Sign in flow by @ishaan-jaff in #10344
- [UI] Fix infinite Scroll on `Models` on Test Key Page by @ishaan-jaff in #10343
- [UI QA Fix] Fix width of the model_id on Models Page by @ishaan-jaff in #10345
- Fix - support azure dall e custom pricing by @krrishdholakia in #10339
- [Bug Fix] UI QA - Fix wildcard model test connection not working by @ishaan-jaff in #10347
- Litellm UI improvements 04 26 2025 p1 by @krrishdholakia in #10346
- [QA] Allow managing sessions with `litellm_session_id` by @ishaan-jaff in #10348
- Handle more gemini tool calling edge cases + support bedrock 'stable-image-core' by @krrishdholakia in #10351
- [Feat] Add logging callback support for /moderations API by @ishaan-jaff in #10390
- [Reliability fix] Redis transaction buffer - ensure all redis queues are periodically flushed by @ishaan-jaff in #10393
- [Bug Fix] Responses API - fix for handling multiturn responses API sessions by @ishaan-jaff in #10415
- build(deps): bump axios, @docusaurus/core, @docusaurus/plugin-google-gtag, @docusaurus/plugin-ideal-image and @docusaurus/preset-classic in /docs/my-website by @dependabot in #10419
- docs: Fix link formatting in GitHub PR template by @user202729 in #10417
- docs: Improve documentation of phoenix logging by @user202729 in #10416
- [Feat Security] - Allow blocking web crawlers by @ishaan-jaff in #10420
- [Feat] Add support for using Bedrock Knowledge Bases with LiteLLM /chat/completions requests by @ishaan-jaff in #10413
- Revert "build(deps): bump axios, @docusaurus/core, @docusaurus/plugin-google-gtag, @docusaurus/plugin-ideal-image and @docusaurus/preset-classic in /docs/my-website" by @ishaan-jaff in #10421
- fix google studio url by @nonZero in #10095
- [New model] Add openai/computer-use-preview cost tracking / pricing by @ishaan-jaff in #10422
- fix(langsmith.py): respect langsmith batch size param by @krrishdholakia in #10411
- Support `x-litellm-api-key` header param + allow key at max budget to call non-llm api endpoints by @krrishdholakia in #10392
New Contributors
- @mdonaj made their first contribution in #10284
- @user202729 made their first contribution in #10417
- @nonZero made their first contribution in #10095
Full Changelog: v1.67.4-nightly...v1.67.5-nightly
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.67.5-nightly
```
Don't want to maintain your internal proxy? Get in touch:
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 270.0 | 290.6566157251101 | 6.175800923917475 | 0.0 | 1848 | 0 | 232.84122400002616 | 2432.3238870000523 |
Aggregated | Passed ✅ | 270.0 | 290.6566157251101 | 6.175800923917475 | 0.0 | 1848 | 0 | 232.84122400002616 | 2432.3238870000523 |
v1.67.4-stable
What's Changed
- [Feat] Expose Responses API on LiteLLM UI Test Key Page by @ishaan-jaff in #10166
- [Bug Fix] Spend Tracking Bug Fix, don't modify in memory default litellm params by @ishaan-jaff in #10167
- Bug Fix - Responses API, Loosen restrictions on allowed environments for computer use tool by @ishaan-jaff in #10168
- [UI] Bug Fix, team model selector by @ishaan-jaff in #10171
- [Bug Fix] Auth Check, Fix typing to ensure case where model is None is handled by @ishaan-jaff in #10170
- [Docs] Responses API by @ishaan-jaff in #10172
- Litellm release notes 04 19 2025 by @krrishdholakia in #10169
- fix(transformation.py): pass back in gemini thinking content to api by @krrishdholakia in #10173
- Litellm docs SCIM by @ishaan-jaff in #10174
- fix(common_daily_activity.py): support empty entity id field by @krrishdholakia in #10175
- fix(proxy_server.py): pass llm router to get complete model list by @krrishdholakia in #10176
- Model pricing updates for Azure & VertexAI by @marty-sullivan in #10178
- fix(bedrock): wrong system prompt transformation by @hewliyang in #10120
- Fix: Potential SQLi in spend_management_endpoints.py by @n1lanjan in #9878
- Handle edge case where user sets model_group inside model_info + Return hashed_token in `token` field on `/key/generate` by @krrishdholakia in #10191
- Remove user_id from url by @krrishdholakia in #10192
- [Feat] Pass through endpoints - ensure `PassthroughStandardLoggingPayload` is logged and contains method, url, request/response body by @ishaan-jaff in #10194
- [Feat] Add Responses API - Routing Affinity logic for sessions by @ishaan-jaff in #10193
- [Feat] Add infinity embedding support (contributor pr) by @ishaan-jaff in #10196
- [Bug Fix] caching does not account for thinking or reasoning_effort config by @ishaan-jaff in #10140
- Gemini-2.5-flash improvements by @krrishdholakia in #10198
- Add AgentOps Integration to LiteLLM by @Dwij1704 in #9685
- Add global filtering to Users tab by @krrishdholakia in #10195
- [Feat] Add Support for DELETE /v1/responses/{response_id} on OpenAI, Azure OpenAI by @ishaan-jaff in #10205
- Bug Fix - Address deprecation of open_text by @ishaan-jaff in #10208
- UI - Users page - Enable global sorting (allows finding users with highest spend) by @krrishdholakia in #10211
- feat: Added Missing Attributes For Arize & Phoenix Integration (#10043) by @ishaan-jaff in #10215
- Users page - new user info pane by @krrishdholakia in #10213
- Fix datadog llm observability logging + (Responses API) Ensures handling for undocumented event types by @krrishdholakia in #10206
- Discard duplicate sentence by @DimitriPapadopoulos in #10231
- Require auth for all dashboard pages by @crisshaker in #10229
- [Feat] Add gpt-image-1 cost tracking by @ishaan-jaff in #10241
- [Bug Fix] Add Cost Tracking for gpt-image-1 when quality is unspecified by @ishaan-jaff in #10247
- [Feat] Add support for GET Responses Endpoint - OpenAI, Azure OpenAI by @ishaan-jaff in #10235
- fix(user_dashboard.tsx): add token expiry logic to user dashboard by @krrishdholakia in #10250
- [Helm] fix for serviceAccountName on migration job by @ishaan-jaff in #10258
- Fix typos by @DimitriPapadopoulos in #10232
- Reset key alias value when resetting filters by @crisshaker in #10099
- Support all compatible bedrock params when model="arn:.." by @krrishdholakia in #10256
- UI - fix edit azure public model name + support changing model names post create by @krrishdholakia in #10249
- Litellm fix UI login by @krrishdholakia in #10260
- Multi-admin + Users page fixes: show all models, show user personal models, allow editing user role, available models by @krrishdholakia in #10259
- Fix UI Flicker in Dashboard by @crisshaker in #10261
- Keys and tools pages: Use proper terminology for loading and no data cases by @msabramo in #10253
- adding support for cohere command-a-03-2025 by @ryanchase-cohere in #10295
- [Feat] Add GET, DELETE Responses endpoints on LiteLLM Proxy by @ishaan-jaff in #10297
- [Bug Fix] Timestamp Granularities are not properly passed to whisper in Azure by @ishaan-jaff in #10299
- Contributor PR - Support max_completion_tokens on Sagemaker (#10243) by @ishaan-jaff in #10300
- feat(grafana_dashboard): enable datasource selection via templating by @minatoaquaMK2 in #10257
- Update image_generation.md parameters by @daureg in #10312
- Update deprecation dates and prices by @o-khytrov in #10308
- Fix SSO user login - invalid token error by @krrishdholakia in #10298
- UI - Add team based filtering to models page by @krrishdholakia in #10325
- UI (Teams Page) - Support filtering by team id + team name by @krrishdholakia in #10324
- Move UI to encrypted token usage by @krrishdholakia in #10302
- add azure/gpt-image-1 pricing by @marty-sullivan in #10327
- fix(ui_sso.py): support experimental jwt keys for UI auth w/ SSO by @krrishdholakia in #10326
- UI (Keys Page) - Support cross filtering, filter by user id, filter by key hash by @krrishdholakia in #10322
- [Feat] Responses API - Add session management support for non-openai models by @ishaan-jaff in #10321
- Fix the table render on key creation. by @NANDINI-star in #10224
- Internal Users: Refresh user list on create by @crisshaker in #10296
- [Docs] UI Session Logs by @ishaan-jaff in #10334
New Contributors
- @Dwij1704 made their first contribution in #9685
- @DimitriPapadopoulos made their first contribution in #10231
- @ryanchase-cohere made their first contribution in #10295
- @minatoaquaMK2 made their first contribution in #10257
- @daureg made their first contribution in #10312
- @o-khytrov made their first contribution in #10308
Full Changelog: v1.67.0-stable...v1.67.4-stable
v1.67.4-nightly
What's Changed
- Fix UI Flicker in Dashboard by @crisshaker in #10261
- Keys and tools pages: Use proper terminology for loading and no data cases by @msabramo in #10253
- adding support for cohere command-a-03-2025 by @ryanchase-cohere in #10295
- [Feat] Add GET, DELETE Responses endpoints on LiteLLM Proxy by @ishaan-jaff in #10297
- [Bug Fix] Timestamp Granularities are not properly passed to whisper in Azure by @ishaan-jaff in #10299
- Contributor PR - Support max_completion_tokens on Sagemaker (#10243) by @ishaan-jaff in #10300
- feat(grafana_dashboard): enable datasource selection via templating by @minatoaquaMK2 in #10257
- Update image_generation.md parameters by @daureg in #10312
- Update deprecation dates and prices by @o-khytrov in #10308
- Fix SSO user login - invalid token error by @krrishdholakia in #10298
- UI - Add team based filtering to models page by @krrishdholakia in #10325
- UI (Teams Page) - Support filtering by team id + team name by @krrishdholakia in #10324
- Move UI to encrypted token usage by @krrishdholakia in #10302
- add azure/gpt-image-1 pricing by @marty-sullivan in #10327
- fix(ui_sso.py): support experimental jwt keys for UI auth w/ SSO by @krrishdholakia in #10326
- UI (Keys Page) - Support cross filtering, filter by user id, filter by key hash by @krrishdholakia in #10322
- [Feat] Responses API - Add session management support for non-openai models by @ishaan-jaff in #10321
- Fix the table render on key creation. by @NANDINI-star in #10224
- Internal Users: Refresh user list on create by @crisshaker in #10296
- [Docs] UI Session Logs by @ishaan-jaff in #10334
New Contributors
- @ryanchase-cohere made their first contribution in #10295
- @minatoaquaMK2 made their first contribution in #10257
- @daureg made their first contribution in #10312
- @o-khytrov made their first contribution in #10308
Full Changelog: v1.67.3.dev1...v1.67.4-nightly
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.67.4-nightly
```
Don't want to maintain your internal proxy? Get in touch:
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 220.0 | 247.38248619018765 | 6.061326672784343 | 0.0 | 1814 | 0 | 197.54123199999185 | 2435.6727050000018 |
Aggregated | Passed ✅ | 220.0 | 247.38248619018765 | 6.061326672784343 | 0.0 | 1814 | 0 | 197.54123199999185 | 2435.6727050000018 |
v1.67.3.dev6
Full Changelog: v1.67.3.dev4...v1.67.3.dev6
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.67.3.dev6
```
Don't want to maintain your internal proxy? Get in touch:
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 240.0 | 263.90902888665886 | 6.165203372220349 | 0.0 | 1844 | 0 | 210.97686299992802 | 2930.0805719999516 |
Aggregated | Passed ✅ | 240.0 | 263.90902888665886 | 6.165203372220349 | 0.0 | 1844 | 0 | 210.97686299992802 | 2930.0805719999516 |
v1.67.3.dev4
What's Changed
- Fix UI Flicker in Dashboard by @crisshaker in #10261
- Keys and tools pages: Use proper terminology for loading and no data cases by @msabramo in #10253
- adding support for cohere command-a-03-2025 by @ryanchase-cohere in #10295
- [Feat] Add GET, DELETE Responses endpoints on LiteLLM Proxy by @ishaan-jaff in #10297
- [Bug Fix] Timestamp Granularities are not properly passed to whisper in Azure by @ishaan-jaff in #10299
- Contributor PR - Support max_completion_tokens on Sagemaker (#10243) by @ishaan-jaff in #10300
- feat(grafana_dashboard): enable datasource selection via templating by @minatoaquaMK2 in #10257
- Update image_generation.md parameters by @daureg in #10312
- Update deprecation dates and prices by @o-khytrov in #10308
- Fix SSO user login - invalid token error by @krrishdholakia in #10298
- UI - Add team based filtering to models page by @krrishdholakia in #10325
- UI (Teams Page) - Support filtering by team id + team name by @krrishdholakia in #10324
- Move UI to encrypted token usage by @krrishdholakia in #10302
- add azure/gpt-image-1 pricing by @marty-sullivan in #10327
- fix(ui_sso.py): support experimental jwt keys for UI auth w/ SSO by @krrishdholakia in #10326
- UI (Keys Page) - Support cross filtering, filter by user id, filter by key hash by @krrishdholakia in #10322
- [Feat] Responses API - Add session management support for non-openai models by @ishaan-jaff in #10321
- Fix the table render on key creation. by @NANDINI-star in #10224
- Internal Users: Refresh user list on create by @crisshaker in #10296
- [Docs] UI Session Logs by @ishaan-jaff in #10334
- [Docs] v1.67.4-stable by @ishaan-jaff in #10338
- Prisma Migrate - support setting custom migration dir by @krrishdholakia in #10336
- Fix: Prevent cache token overwrite by last chunk in streaming usage by @mdonaj in #10284
New Contributors
- @ryanchase-cohere made their first contribution in #10295
- @minatoaquaMK2 made their first contribution in #10257
- @daureg made their first contribution in #10312
- @o-khytrov made their first contribution in #10308
- @mdonaj made their first contribution in #10284
Full Changelog: v1.67.3.dev1...v1.67.3.dev4
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.67.3.dev4
```
Don't want to maintain your internal proxy? Get in touch:
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 190.0 | 216.71376621493337 | 6.269037380681852 | 0.0 | 1875 | 0 | 164.64077799997767 | 4562.471842000036 |
Aggregated | Passed ✅ | 190.0 | 216.71376621493337 | 6.269037380681852 | 0.0 | 1875 | 0 | 164.64077799997767 | 4562.471842000036 |
v1.67.3.dev1
What's Changed
- [Feat] Add gpt-image-1 cost tracking by @ishaan-jaff in #10241
- [Bug Fix] Add Cost Tracking for gpt-image-1 when quality is unspecified by @ishaan-jaff in #10247
- [Feat] Add support for GET Responses Endpoint - OpenAI, Azure OpenAI by @ishaan-jaff in #10235
- fix(user_dashboard.tsx): add token expiry logic to user dashboard by @krrishdholakia in #10250
- [Helm] fix for serviceAccountName on migration job by @ishaan-jaff in #10258
- Fix typos by @DimitriPapadopoulos in #10232
- Reset key alias value when resetting filters by @crisshaker in #10099
- Support all compatible bedrock params when model="arn:.." by @krrishdholakia in #10256
- UI - fix edit azure public model name + support changing model names post create by @krrishdholakia in #10249
- Litellm fix UI login by @krrishdholakia in #10260
- Multi-admin + Users page fixes: show all models, show user personal models, allow editing user role, available models by @krrishdholakia in #10259
Full Changelog: v1.67.2-nightly...v1.67.3.dev1
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.67.3.dev1
```
Don't want to maintain your internal proxy? Get in touch:
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 210.0 | 235.18092614107948 | 6.181088327781123 | 0.0 | 1850 | 0 | 192.45027600004505 | 4892.269687999942 |
Aggregated | Passed ✅ | 210.0 | 235.18092614107948 | 6.181088327781123 | 0.0 | 1850 | 0 | 192.45027600004505 | 4892.269687999942 |