A unified Elixir client for interfacing with multiple Large Language Model (LLM) providers.
ExLLM provides a single, consistent API to interact with a growing list of LLM providers. It abstracts away the complexities of provider-specific request formats, authentication, and error handling, allowing you to focus on building features.
Release Candidate: This library is approaching its 1.0.0 stable release. The API is stable and ready for production use.
- Unified API: Use a single `ExLLM.chat/2` interface for all supported providers, dramatically reducing boilerplate code
- Broad Provider Support: Seamlessly switch between models from 14+ major providers
- Streaming Support: Handle real-time responses for chat completions using Elixir's native streaming
- Standardized Error Handling: Get predictable `{:error, reason}` tuples for common failure modes (see the sketch after this list)
- Session Management: Built-in conversation state tracking and persistence
- Function Calling: Unified tool use interface across providers that support it
- Multimodal Support: Vision, audio, and document processing capabilities where available
- Minimal Overhead: Designed as a thin, efficient client layer with focus on performance
- Extensible Architecture: Adding new providers is straightforward through clean delegation patterns
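As a minimal sketch of working with the standardized result tuples, you can pattern match on the `{:ok, _}` / `{:error, _}` shapes shown throughout this README (the specific error reasons below are illustrative; actual reason terms vary by provider and failure mode):

```elixir
# Minimal sketch: handling the standardized {:ok, response} / {:error, reason} tuples.
case ExLLM.chat(:anthropic, [%{role: "user", content: "Hello!"}]) do
  {:ok, response} ->
    IO.puts(response.content)

  {:error, reason} ->
    # reason is provider-dependent; inspect it or match on the cases you care about
    IO.puts("Chat failed: #{inspect(reason)}")
end
```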
Production Ready: Core chat, streaming, sessions, providers, function calling, cost tracking

Under Development: Context management, model capabilities API, configuration validation
See FEATURE_STATUS.md for detailed testing results and API status.
ExLLM supports 14 providers with access to hundreds of models:
- Anthropic Claude - Claude 4, 3.7, 3.5, and 3 series models
- OpenAI - GPT-4.1, o1 reasoning models, GPT-4o, and GPT-3.5 series
- AWS Bedrock - Multi-provider access (Anthropic, Amazon Nova, Meta Llama, etc.)
- Google Gemini - Gemini 2.5, 2.0, and 1.5 series with multimodal support
- OpenRouter - Access to hundreds of models from multiple providers
- Groq - Ultra-fast inference with Llama 4, DeepSeek R1, and more
- X.AI - Grok models with web search and reasoning capabilities
- Mistral AI - Mistral Large, Pixtral, and specialized code models
- Perplexity - Search-enhanced language models
- Ollama - Local model runner (any model in your installation)
- LM Studio - Local model server with OpenAI-compatible API
- Bumblebee - Local model inference with Elixir/Nx (optional dependency)
- Mock Adapter - For testing and development
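Because every provider is reached through the same `ExLLM.chat/2` call, switching providers is a one-line change. A small sketch, assuming the relevant API keys or local runtimes are already configured:

```elixir
messages = [%{role: "user", content: "Summarize the plot of Hamlet in one sentence."}]

# The call shape stays the same; only the provider atom changes.
{:ok, claude} = ExLLM.chat(:anthropic, messages)
{:ok, fast}   = ExLLM.chat(:groq, messages)
{:ok, local}  = ExLLM.chat(:ollama, messages)
```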
Add `ex_llm` to your list of dependencies in `mix.exs`:
```elixir
def deps do
  [
    {:ex_llm, "~> 1.0.0-rc1"},

    # Optional: For local model inference via Bumblebee
    {:bumblebee, "~> 0.6.2", optional: true},
    {:nx, "~> 0.7", optional: true},

    # Optional hardware acceleration backends (choose one):
    {:exla, "~> 0.7", optional: true},
    # Optional: For Apple Silicon Metal acceleration
    {:emlx, github: "elixir-nx/emlx", branch: "main", optional: true}
  ]
end
```
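Then fetch the dependencies with the standard Mix workflow:

```bash
mix deps.get
```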
Set your API keys as environment variables:
```bash
export ANTHROPIC_API_KEY="your-anthropic-key"
export OPENAI_API_KEY="your-openai-key"
export GROQ_API_KEY="your-groq-key"
# ... other provider keys as needed
```
```elixir
# Single completion
{:ok, response} = ExLLM.chat(:anthropic, [
  %{role: "user", content: "Explain quantum computing in simple terms"}
])

IO.puts(response.content)
# Cost automatically tracked: response.cost

# Streaming response
ExLLM.chat_stream(:openai, [
  %{role: "user", content: "Write a short story"}
], fn chunk ->
  IO.write(chunk.delta)
end)

# With session management
{:ok, session} = ExLLM.Session.new(:groq)
{:ok, session, response} = ExLLM.Session.chat(session, "Hello!")
{:ok, session, response} = ExLLM.Session.chat(session, "How are you?")

# Multimodal with vision
{:ok, response} = ExLLM.chat(:gemini, [
  %{role: "user", content: [
    %{type: "text", text: "What's in this image?"},
    %{type: "image", image: %{data: base64_image, media_type: "image/jpeg"}}
  ]}
])
```
You can configure providers in your `config/config.exs`:
```elixir
import Config

config :ex_llm,
  default_provider: :openai,
  providers: [
    openai: [api_key: System.get_env("OPENAI_API_KEY")],
    anthropic: [api_key: System.get_env("ANTHROPIC_API_KEY")],
    gemini: [api_key: System.get_env("GEMINI_API_KEY")]
  ]
```
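If you prefer to resolve API keys when the release boots rather than at compile time, the same settings can live in `config/runtime.exs`. This is standard Elixir runtime configuration, not an ExLLM-specific mechanism; the sketch below assumes ExLLM reads the same `:ex_llm` application environment shown above:

```elixir
# config/runtime.exs -- keys are read from the environment at boot time.
import Config

config :ex_llm,
  providers: [
    openai: [api_key: System.fetch_env!("OPENAI_API_KEY")]
  ]
```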
The test suite includes both unit tests and integration tests. Integration tests that make live API calls are tagged and excluded by default.
To run unit tests only:

```bash
mix test
```

To run integration tests (requires API keys):

```bash
mix test --include integration
```

To run tests with intelligent caching for faster development:

```bash
mix test.live  # Runs with test response caching enabled
```
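The integration/unit split relies on standard ExUnit tags. A minimal sketch of how a live-API test is typically tagged (the module and test names here are illustrative, not taken from the ExLLM test suite):

```elixir
defmodule MyApp.AnthropicIntegrationTest do
  use ExUnit.Case, async: false

  # Excluded by a plain `mix test`; included with `mix test --include integration`.
  @tag :integration
  test "performs a live chat completion" do
    assert {:ok, response} =
             ExLLM.chat(:anthropic, [%{role: "user", content: "ping"}])

    assert is_binary(response.content)
  end
end
```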
ExLLM uses a clean, modular architecture that separates concerns while maintaining a unified API:
- `ExLLM` - Main entry point with unified API
- `ExLLM.API.Delegator` - Central delegation engine for provider routing
- `ExLLM.API.Capabilities` - Provider capability registry
- `ExLLM.Pipeline` - Phoenix-style pipeline for request processing
- `ExLLM.Embeddings` - Vector operations and similarity calculations
- `ExLLM.Assistants` - OpenAI Assistants API for stateful agents
- `ExLLM.KnowledgeBase` - Document management and semantic search
- `ExLLM.Builder` - Fluent interface for chat construction
- `ExLLM.Session` - Conversation state management
- Clean Separation: Each module has a single, focused responsibility
- Easy Extension: Adding providers requires changes in just 1-2 files
- Performance: Delegation adds minimal overhead
- Maintainability: Clear boundaries between components
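To illustrate why adding a provider only touches a couple of files, here is a conceptual sketch of delegation-based routing. This is not ExLLM's actual implementation; the module and function names are invented purely for illustration:

```elixir
# Conceptual sketch only -- not ExLLM's real code. A unified call is routed to a
# provider-specific module looked up in a single registry, so supporting a new
# provider means adding one registry entry and one provider module.
defmodule MyDelegator do
  @providers %{
    openai: MyProviders.OpenAI,
    anthropic: MyProviders.Anthropic
  }

  def chat(provider, messages, opts \\ []) do
    case Map.fetch(@providers, provider) do
      {:ok, module} -> module.chat(messages, opts)
      :error -> {:error, {:unsupported_provider, provider}}
    end
  end
end
```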
- Quick Start Guide - Get up and running in 5 minutes
- User Guide - Comprehensive documentation of all features
- Architecture Guide - Clean layered architecture and namespace organization
- Pipeline Architecture - Phoenix-style plug system and extensibility
- Logger Guide - Debug logging and troubleshooting
- Provider Capabilities - Feature comparison across providers
- Testing Guide - Comprehensive testing system with semantic tagging and caching
- Configuration: Environment variables, config files, and provider setup
- Chat Completions: Messages, parameters, and response handling
- Streaming: Real-time responses with error recovery and stream coordination
- Session Management: Conversation state and persistence
- Function Calling: Tool use and structured interactions across providers
- Vision & Multimodal: Image, audio, and document processing
- Cost Tracking: Automatic cost calculation and token estimation
- Error Handling: Retry logic and error recovery strategies
- Test Caching: Intelligent response caching for faster development
- Model Discovery: Query available models and capabilities
- OAuth2 Integration: Complete OAuth2 flow for Gemini and other providers
- Unified API Guide - Complete unified API documentation
- Migration Guide - Upgrading to v1.0.0
- Release Checklist - Automated release process
- API Reference - Detailed API documentation on HexDocs
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
- Documentation: User Guide
- Issues: GitHub Issues
- Discussions: GitHub Discussions