feat: add batches API with OpenAI compatibility #3088
Conversation
Add complete batches API implementation with protocol, providers, and tests.

Core Infrastructure:

- Add batches API protocol using OpenAI Batch types directly
- Add Api.batches enum value and protocol mapping in resolver
- Add OpenAI "batch" file purpose support
- Include proper error handling (ConflictError, ResourceNotFoundError)

Reference Provider:

- Add ReferenceBatchesImpl with full CRUD operations (create, retrieve, cancel, list)
- Implement background batch processing with configurable concurrency
- Add SQLite KVStore backend for persistence
- Support the /v1/chat/completions endpoint with request validation

Comprehensive Test Suite:

- Add unit tests for the provider implementation with validation
- Add integration tests for end-to-end batch processing workflows
- Add error handling tests for validation, malformed inputs, and edge cases

Configuration:

- Add max_concurrent_batches and max_concurrent_requests_per_batch options
- Add provider documentation with sample configurations

Test with:

```
$ uv run llama stack build --image-type venv --providers inference=YOU_PICK,files=inline::localfs,batches=inline::reference --run &
$ LLAMA_STACK_CONFIG=http://localhost:8321 uv run pytest tests/unit/providers/batches tests/integration/batches --text-model YOU_PICK
```

Addresses #3066.
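For illustration, here is a minimal sketch of exercising the new OpenAI-compatible surface from a client once the stack is running. The base URL, API key, and model name are assumptions, not values taken from this PR:

```python
# Sketch only: assumes a stack running at localhost:8321 that ignores the API
# key, and a placeholder model name (match whatever you passed as --text-model).
import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1", api_key="none")

# Build a JSONL input file in the standard OpenAI batch format, one request per line.
requests = [
    {
        "custom_id": "req-1",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "YOUR_MODEL",  # placeholder
            "messages": [{"role": "user", "content": "Say hello."}],
        },
    }
]
with open("batch_input.jsonl", "w") as f:
    for req in requests:
        f.write(json.dumps(req) + "\n")

# Upload with the newly supported "batch" file purpose, then create the batch.
input_file = client.files.create(file=open("batch_input.jsonl", "rb"), purpose="batch")
batch = client.batches.create(
    input_file_id=input_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)
print(client.batches.retrieve(batch.id).status)
```

Because the provider accepts the OpenAI "batch" file purpose and /v1/chat/completions as the batch endpoint, the stock openai client should work here unmodified.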
lgtm. I am not a fan of the "batches" name for an API, but hey, this time I can just ignore that piece and point to someone else :)
Should we note in the documentation somewhere that this API is under development?
this also captures other notes from the agents, eval, and inference APIs
I updated provider_codegen.py to include the API protocol's docstring in the provider docs and added a note about the development status. FYI, this also picked up additional docs from the agents, inference & eval APIs.
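For context, a hypothetical sketch of what such a codegen hook can look like; `api_description` and the note text are illustrative, not the actual provider_codegen.py change:

```python
# Hypothetical helper: pull an API protocol's docstring into the generated
# provider docs, prefixed with a development-status note.
import inspect

def api_description(protocol_cls: type) -> str:
    doc = inspect.getdoc(protocol_cls) or ""
    note = "> **Note:** this API is under active development and may change."
    return f"{note}\n\n{doc}"
```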
@ashwinb thoughts on the replay tests failing?
batch_id = f"batch_{uuid.uuid4().hex[:16]}" | ||
current_time = int(time.time()) | ||
|
||
batch = BatchObject( |
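For readers skimming the excerpt, a sketch of how the truncated constructor might continue, assuming BatchObject mirrors the required fields of the OpenAI Batch model; the field values are assumptions, not copied from the PR:

```python
# Sketch only: "validating" is the standard initial status in the OpenAI
# Batch schema; input_file_id is assumed to come from the create request.
batch = BatchObject(
    id=batch_id,
    object="batch",
    endpoint="/v1/chat/completions",
    input_file_id=input_file_id,
    completion_window="24h",
    status="validating",
    created_at=current_time,
)
```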
It could be useful to generate an idempotency key for retries and to avoid processing the same file more than once.
❤️ idempotency is hugely valuable for apis like this.
since files are immutable, the params can serve as that id.
let me add an issue for that feature.
> since files are immutable, the params can serve as that id.
💯
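To make the idea concrete, a minimal sketch of deriving such a key from the create-batch params; the helper name and field set are assumptions, not code from this PR:

```python
# Sketch: since input files are immutable, identical params imply an identical
# batch, so a hash of the params can deduplicate retries.
import hashlib
import json

def idempotency_key(input_file_id: str, endpoint: str, completion_window: str,
                    metadata: dict | None = None) -> str:
    payload = json.dumps(
        {
            "input_file_id": input_file_id,
            "endpoint": endpoint,
            "completion_window": completion_window,
            "metadata": metadata or {},
        },
        sort_keys=True,  # stable ordering so equal params hash equally
    )
    return "batch_" + hashlib.sha256(payload.encode()).hexdigest()[:16]
```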
🚀
Reverts #3088. The PR broke integration tests.