Releases: BerriAI/litellm

v1.68.0-nightly

04 May 06:30

What's Changed

  • [Contributor PR] Support Llama-api as an LLM provider (#10451) by @ishaan-jaff in #10538
  • UI - fix(model_management_endpoints.py): allow team admin to update model info + fix request logs - handle expanding other rows when existing row selected + fix(organization_endpoints.py): enable proxy admin with 'all-proxy-model' access to create new org with specific models by @krrishdholakia in #10539
  • [Bug Fix] UnicodeDecodeError: 'charmap' on Windows during litellm import by @ishaan-jaff in #10542
  • fix(converse_transformation.py): handle meta llama tool call response by @krrishdholakia in #10541

Full Changelog: v1.67.6.dev1...v1.68.0-nightly

Docker Run LiteLLM Proxy

docker run \
    -e STORE_MODEL_IN_DB=True \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:main-v1.68.0-nightly

Don't want to maintain your internal proxy? Get in touch πŸŽ‰

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
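
Once the container is up, the proxy exposes an OpenAI-compatible API on port 4000. As a minimal sketch, you can call it with the official openai Python client; the model name "gpt-4o" and the virtual key "sk-1234" below are placeholder assumptions, not values shipped with this release:

from openai import OpenAI

# Point the OpenAI SDK at the LiteLLM proxy started above.
client = OpenAI(
    base_url="http://localhost:4000",  # LiteLLM proxy endpoint
    api_key="sk-1234",                 # placeholder LiteLLM virtual key
)

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder: any model configured on the proxy
    messages=[{"role": "user", "content": "Hello from the proxy!"}],
)
print(response.choices[0].message.content)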

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed βœ… | 180.0 | 211.0 | 6.19 | 0.0 | 1852 | 0 | 166.7 | 3755.03 |
| Aggregated | Passed βœ… | 180.0 | 211.0 | 6.19 | 0.0 | 1852 | 0 | 166.7 | 3755.03 |

v1.68.0-stable

03 May 16:01
Pre-release

What's Changed

  • Handle more gemini tool calling edge cases + support bedrock 'stable-image-core' by @krrishdholakia in #10351
  • [Feat] Add logging callback support for /moderations API by @ishaan-jaff in #10390
  • [Reliability fix] Redis transaction buffer - ensure all redis queues are periodically flushed by @ishaan-jaff in #10393
  • [Bug Fix] Responses API - fix for handling multiturn responses API sessions by @ishaan-jaff in #10415
  • build(deps): bump axios, @docusaurus/core, @docusaurus/plugin-google-gtag, @docusaurus/plugin-ideal-image and @docusaurus/preset-classic in /docs/my-website by @dependabot in #10419
  • docs: Fix link formatting in GitHub PR template by @user202729 in #10417
  • docs: Improve documentation of phoenix logging by @user202729 in #10416
  • [Feat Security] - Allow blocking web crawlers by @ishaan-jaff in #10420
  • [Feat] Add support for using Bedrock Knowledge Bases with LiteLLM /chat/completions requests by @ishaan-jaff in #10413
  • Revert "build(deps): bump axios, @docusaurus/core, @docusaurus/plugin-google-gtag, @docusaurus/plugin-ideal-image and @docusaurus/preset-classic in /docs/my-website" by @ishaan-jaff in #10421
  • fix google studio url by @nonZero in #10095
  • [New model] Add openai/computer-use-preview cost tracking / pricing by @ishaan-jaff in #10422
  • fix(langsmith.py): respect langsmith batch size param by @krrishdholakia in #10411
  • Support x-litellm-api-key header param + allow key at max budget to call non-llm api endpoints by @krrishdholakia in #10392 (see the header sketch after this list)
  • Update fireworks ai pricing by @krrishdholakia in #10425
  • Schedule budget resets at expectable times (#10331) by @krrishdholakia in #10333
  • Embedding caching fixes - handle str -> list cache, set usage tokens for cache hits, combine usage tokens on partial cache hits by @krrishdholakia in #10424
  • Contributor PR - Support OPENAI_BASE_URL in addition to OPENAI_API_BASE (#9995) by @ishaan-jaff in #10423 (see the env-var sketch after this list)
  • New feature: Add Python client library for LiteLLM Proxy by @msabramo in #10445
  • Add key-level multi-instance tpm/rpm/max parallel request limiting by @krrishdholakia in #10458
  • [UI] Allow adding triton models on LiteLLM UI by @ishaan-jaff in #10456
  • [Feat] Vector Stores/KnowledgeBases - Allow defining Vector Store Configs by @ishaan-jaff in #10448
  • Add low-level interface to client library for doing HTTP requests by @msabramo in #10452
  • Correctly re-raise 504 errors and Add gpt-4o-mini-tts support by @krrishdholakia in #10462
  • UI - Fix filtering on key alias + support global sorting on keys by @krrishdholakia in #10455
  • [Bug Fix] Ensure Non-Admin virtual keys can access /mcp routes by @ishaan-jaff in #10473
  • [Fixes] Azure OpenAI OIDC - allow using litellm defined params for OIDC Auth by @ishaan-jaff in #10394
  • Add supports_pdf_input: true to Claude 3.7 bedrock models by @RupertoM in #9917
  • Add llamafile as a provider (#10203) by @ishaan-jaff in #10482 (see the llamafile sketch after this list)
  • Fix mcp.md in documentation by @1995parham in #10493
  • docs(realtime): yaml config example for realtime model by @kmontocam in #10489
  • Fix return finish_reason = "tool_calls" for gemini tool calling by @krrishdholakia in #10485
  • Add user + team based multi-instance rate limiting by @krrishdholakia in #10497
  • mypy tweaks by @msabramo in #10490
  • Add vertex ai meta llama 4 support + handle tool call result in content for vertex ai by @krrishdholakia in #10492
  • Fix and rewrite of token_counter by @happyherp in #10409
  • [Fix + Refactor] Trigger Soft Budget Webhooks When Key Crosses Threshold by @ishaan-jaff in #10491
  • [Bug Fix] Ensure Web Search / File Search costs are only added when the response includes the tool call by @ishaan-jaff in #10476
  • Fixes for test_team_budget_metrics and test_generate_and_update_key by @S1LV3RJ1NX in #10500
  • [Feat] KnowledgeBase/Vector Store - Log StandardLoggingVectorStoreRequest for requests made when a vector store is used by @ishaan-jaff in #10509
  • Don't depend on uvloop on windows (#10060) by @ishaan-jaff in #10483
  • fix: PydanticDeprecatedSince20: Support for class-based config is deprecated, use ConfigDict instead. Deprecated in Pydantic V2.0 to be removed in V3.0. by @Elijas in #9372
  • [Feat] Show Vector Store / KB Request on LiteLLM Logs Page by @ishaan-jaff in #10514
  • Fix pytest event loop warning (#9641) by @msabramo in #10512
  • UI - fix adding vertex models with reusable credentials + fix pagination on keys table + fix showing org budgets on table by @krrishdholakia in #10528
  • Playwright test for team admin (#10366) by @krrishdholakia in #10470
  • [QA] Bedrock Vector Stores Integration - Allow using with registry + in OpenAI API spec with tools by @ishaan-jaff in #10516
  • UI - allow reassigning team to other org by @krrishdholakia in #10527
  • [Models/ LLM Credentials] Fix edit credentials modal by @NANDINI-star in #10519
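
The x-litellm-api-key change above (#10392) means a request can carry the virtual key in a dedicated header instead of the usual Authorization: Bearer header. A minimal sketch with the requests library; the URL, key, and model are placeholder assumptions:

import requests

# Sketch: authenticate to the LiteLLM proxy via the x-litellm-api-key header
# (#10392) rather than "Authorization: Bearer ...". All values are placeholders.
resp = requests.post(
    "http://localhost:4000/chat/completions",
    headers={"x-litellm-api-key": "sk-1234"},
    json={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "ping"}],
    },
)
print(resp.json()["choices"][0]["message"]["content"])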
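
The OPENAI_BASE_URL change (#10423) lets litellm pick up the standard OpenAI environment variable where previously only OPENAI_API_BASE was read. A minimal sketch; the endpoint URL and key are placeholders:

import os
import litellm

# Sketch: litellm now also honors the standard OPENAI_BASE_URL env var
# (#10423); previously only OPENAI_API_BASE was read. Placeholder values.
os.environ["OPENAI_BASE_URL"] = "https://my-openai-compatible-host/v1"
os.environ["OPENAI_API_KEY"] = "sk-placeholder"

response = litellm.completion(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)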
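
And for the new llamafile provider (#10482): llamafile serves an OpenAI-compatible API, by default on port 8080, so routing through litellm should only need the provider prefix plus an api_base. The model name and port below are assumptions for illustration, not values confirmed by this release:

import litellm

# Sketch: call a local llamafile server through the new "llamafile/" provider
# (#10482). Model name and endpoint are illustrative assumptions.
response = litellm.completion(
    model="llamafile/mistral-7b-instruct",
    api_base="http://localhost:8080/v1",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)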

Full Changelog: v1.67.4-stable...v1.67.7-stable

v1.67.6.dev1

03 May 23:15
Commit: fb9e5db

Full Changelog: v1.68.0-stable...v1.67.6.dev1

Docker Run LiteLLM Proxy

docker run \
    -e STORE_MODEL_IN_DB=True \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:main-v1.67.6.dev1

Don't want to maintain your internal proxy? Get in touch πŸŽ‰

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed βœ… | 240.0 | 260.69 | 6.20 | 0.0 | 1854 | 0 | 210.06 | 2657.43 |
| Aggregated | Passed βœ… | 240.0 | 260.69 | 6.20 | 0.0 | 1854 | 0 | 210.06 | 2657.43 |

v1.67.6-nightly

02 May 20:15

Full Changelog: v1.67.5-nightly...v1.67.6-nightly

Docker Run LiteLLM Proxy

docker run \
    -e STORE_MODEL_IN_DB=True \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:main-v1.67.6-nightly

Don't want to maintain your internal proxy? Get in touch πŸŽ‰

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed βœ… | 180.0 | 222.14 | 6.20 | 0.0 | 1855 | 0 | 165.42 | 4686.56 |
| Aggregated | Passed βœ… | 180.0 | 222.14 | 6.20 | 0.0 | 1855 | 0 | 165.42 | 4686.56 |

v1.67.5-nightly

30 Apr 05:16
Commit: 839878f

What's Changed

  • [Docs] v1.67.4-stable by @ishaan-jaff in #10338
  • Prisma Migrate - support setting custom migration dir by @krrishdholakia in #10336
  • Fix: Prevent cache token overwrite by last chunk in streaming usage by @mdonaj in #10284
  • [UI] Fixes for sessions on UI - ensure errors have a session and use 1 session for test key by @ishaan-jaff in #10342
  • [UI QA Bug Fix] - Fix SSO Sign in flow by @ishaan-jaff in #10344
  • [UI] Fix infinite Scroll on Models on Test Key Page by @ishaan-jaff in #10343
  • [UI QA Fix] Fix width of the model_id on Models Page by @ishaan-jaff in #10345
  • Fix - support azure dall e custom pricing by @krrishdholakia in #10339
  • [Bug Fix] UI QA - Fix wildcard model test connection not working by @ishaan-jaff in #10347
  • Litellm UI improvements 04 26 2025 p1 by @krrishdholakia in #10346
  • [QA] Allow managing sessions with litellm_session_id by @ishaan-jaff in #10348
  • Handle more gemini tool calling edge cases + support bedrock 'stable-image-core' by @krrishdholakia in #10351
  • [Feat] Add logging callback support for /moderations API by @ishaan-jaff in #10390
  • [Reliability fix] Redis transaction buffer - ensure all redis queues are periodically flushed by @ishaan-jaff in #10393
  • [Bug Fix] Responses API - fix for handling multiturn responses API sessions by @ishaan-jaff in #10415
  • build(deps): bump axios, @docusaurus/core, @docusaurus/plugin-google-gtag, @docusaurus/plugin-ideal-image and @docusaurus/preset-classic in /docs/my-website by @dependabot in #10419
  • docs: Fix link formatting in GitHub PR template by @user202729 in #10417
  • docs: Improve documentation of phoenix logging by @user202729 in #10416
  • [Feat Security] - Allow blocking web crawlers by @ishaan-jaff in #10420
  • [Feat] Add support for using Bedrock Knowledge Bases with LiteLLM /chat/completions requests by @ishaan-jaff in #10413
  • Revert "build(deps): bump axios, @docusaurus/core, @docusaurus/plugin-google-gtag, @docusaurus/plugin-ideal-image and @docusaurus/preset-classic in /docs/my-website" by @ishaan-jaff in #10421
  • fix google studio url by @nonZero in #10095
  • [New model] Add openai/computer-use-preview cost tracking / pricing by @ishaan-jaff in #10422
  • fix(langsmith.py): respect langsmith batch size param by @krrishdholakia in #10411
  • Support x-litellm-api-key header param + allow key at max budget to call non-llm api endpoints by @krrishdholakia in #10392

Full Changelog: v1.67.4-nightly...v1.67.5-nightly

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.67.5-nightly

Don't want to maintain your internal proxy? get in touch πŸŽ‰

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed βœ… | 270.0 | 290.66 | 6.18 | 0.0 | 1848 | 0 | 232.84 | 2432.32 |
| Aggregated | Passed βœ… | 270.0 | 290.66 | 6.18 | 0.0 | 1848 | 0 | 232.84 | 2432.32 |

v1.67.4-stable

27 Apr 14:26

Full Changelog: v1.67.0-stable...v1.67.4-stable

v1.67.4-nightly

27 Apr 02:03

Full Changelog: v1.67.3.dev1...v1.67.4-nightly

Docker Run LiteLLM Proxy

docker run \
    -e STORE_MODEL_IN_DB=True \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:main-v1.67.4-nightly

Don't want to maintain your internal proxy? Get in touch πŸŽ‰

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed βœ… | 220.0 | 247.38 | 6.06 | 0.0 | 1814 | 0 | 197.54 | 2435.67 |
| Aggregated | Passed βœ… | 220.0 | 247.38 | 6.06 | 0.0 | 1814 | 0 | 197.54 | 2435.67 |

v1.67.3.dev6

26 Apr 23:33

Full Changelog: v1.67.3.dev4...v1.67.3.dev6

Docker Run LiteLLM Proxy

docker run \
    -e STORE_MODEL_IN_DB=True \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:main-v1.67.3.dev6

Don't want to maintain your internal proxy? Get in touch πŸŽ‰

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed βœ… | 240.0 | 263.91 | 6.17 | 0.0 | 1844 | 0 | 210.98 | 2930.08 |
| Aggregated | Passed βœ… | 240.0 | 263.91 | 6.17 | 0.0 | 1844 | 0 | 210.98 | 2930.08 |

v1.67.3.dev4

26 Apr 22:30

Full Changelog: v1.67.3.dev1...v1.67.3.dev4

Docker Run LiteLLM Proxy

docker run \
    -e STORE_MODEL_IN_DB=True \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:main-v1.67.3.dev4

Don't want to maintain your internal proxy? Get in touch πŸŽ‰

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed βœ… | 190.0 | 216.71 | 6.27 | 0.0 | 1875 | 0 | 164.64 | 4562.47 |
| Aggregated | Passed βœ… | 190.0 | 216.71 | 6.27 | 0.0 | 1875 | 0 | 164.64 | 4562.47 |

v1.67.3.dev1

24 Apr 06:01

Full Changelog: v1.67.2-nightly...v1.67.3.dev1

Docker Run LiteLLM Proxy

docker run \
    -e STORE_MODEL_IN_DB=True \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:main-v1.67.3.dev1

Don't want to maintain your internal proxy? Get in touch πŸŽ‰

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed βœ… | 210.0 | 235.18 | 6.18 | 0.0 | 1850 | 0 | 192.45 | 4892.27 |
| Aggregated | Passed βœ… | 210.0 | 235.18 | 6.18 | 0.0 | 1850 | 0 | 192.45 | 4892.27 |