The Cheshire Cat is a framework to build custom AI agents:
- 🤖 Build your own AI agent in minutes, not months
- 🧠 Make it smart with Retrieval Augmented Generation (RAG)
- 🏆 Multi-modality, to build the RAG with any kind of document
- 💬 Multi-tenancy, to manage multiple chatbots at the same time, each with its own settings, plugins, LLMs, etc.
- ⚡️ API first, to easily add a conversational layer to your app
- ☁️ Cloud Ready, working even with horizontal autoscaling
- 🔐 Secure by design, with API Key and granular permissions
- 🏗 Production ready, cloud native and scalable
- 🐋 100% dockerized, to run anywhere
- 🛠 Easily extendable with plugins
- 🧩 Built-in plugins
- 🪛 Extend core components (file managers, LLMs, vector databases)
- ✂️ Customizable chunking and embedding
- 🛠 Custom tools, forms, endpoints, MCP clients
- 🪛 LLM callbacks
- 🌐 Customizable integrations with external frameworks, such as LangSmith or LlamaIndex
- 🏛 Easy to use Admin Panel (available with the repository matteocacciola/cheshirecat-admin)
- 🦄 Easy to understand docs
- 🌍 Supports any language model via LangChain
We are committed to openness, privacy, and creativity, and we want to bring AI to the long tail. If you want to know more about our vision and values, read the Code of Ethics.
The current version is a multi-tenant fork of the original Cheshire Cat. The main differences are reported in the CHANGELOG.
To make Cheshire Cat run on your machine, you just need docker installed:
```shell
docker run --rm -it -p 1865:80 ghcr.io/matteocacciola/cheshirecat-core:latest
```

- Chat with the Cheshire Cat by downloading the Admin Panel or by using the widget.
- Try out the REST API on localhost:1865/docs.
This fork is intended as a microservice.
First, set the Embedder for the Cheshire Cat. A preferred LLM must be set for each chatbot; each chatbot can have its own language model, with custom settings. Everything can be done via the Admin Panel or via the REST API endpoints.
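As a sketch of what such a REST call might look like, using only the standard library and without actually sending the request. The endpoint path, payload shape, and auth header below are assumptions; check the live API reference at localhost:1865/docs for the real contract:

```python
# Sketch: configuring the Embedder through the REST API without an SDK.
# The endpoint path, payload shape, and auth header are hypothetical —
# consult localhost:1865/docs for the actual contract.
import json
import urllib.request

def build_embedder_request(base_url: str, api_key: str, embedder_name: str, settings: dict):
    """Build (but do not send) the HTTP request that configures the Embedder."""
    payload = json.dumps(settings).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/embedder/settings/{embedder_name}",  # hypothetical path
        data=payload,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # or an API key header
        },
    )

req = build_embedder_request(
    "http://localhost:1865", "my-api-key",
    "EmbedderOpenAIConfig", {"openai_api_key": "sk-..."},
)
# urllib.request.urlopen(req)  # uncomment to actually send it
```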
Important
The following core plugins are enabled by default:
- Conversation History: to store and retrieve the conversation history;
- Factories: extending objects like LLMs, Embedders, File Managers, Chunkers;
- Interactions: add the interaction handler to the language model;
- March Hare: handling events via RabbitMQ;
- Memory: interacting with Working Memory and adding a handler to trace the activities of the Embedder;
- Multimodality: a plugin that adds multimodal capabilities to the Cheshire Cat framework, enabling the processing of images;
- White Rabbit: cron and scheduled tasks;
- Why: add the context and the reasoning behind the answers of the LLM.
You can disable one or more (e.g., March Hare if you don't need to autoscale over cloud PODs) by using the Admin Toggle endpoint.
Enjoy the Cheshire Cat! Follow the instructions on how to run it with Docker Compose and volumes.
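A minimal `docker-compose.yml` consistent with the `docker run` command above might look like this; the volume paths are assumptions, so follow the repository's instructions for the exact layout:

```yaml
services:
  cheshirecat:
    image: ghcr.io/matteocacciola/cheshirecat-core:latest
    ports:
      - "1865:80"
    volumes:
      # hypothetical host paths — adapt to your project layout
      - ./plugins:/app/cat/plugins
      - ./data:/app/cat/data
```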
You can install an admin panel by using the cheshirecat-admin repository.
The admin panel is a separate project that allows you to manage the Cheshire Cat and its settings, plugins, and chatbots.
It is built with Streamlit and is designed to be easy to use and customizable.
Moreover, a widget suitable for this fork is available on my GitHub account to chat with the Cheshire Cat.
- Use a WebSocket connection at `/ws`, `/ws/{agent_id}`, or `/ws/{agent_id}/{chat_id}`; add the token or the API key as a querystring parameter with the syntax `?token=...`
- Receive tokens in real time as they are generated: message type `chat_token` for individual tokens; message type `chat` for complete responses
- Use HTTP POST to `/message` to receive the complete response in a single API call
- Better for integrations, batch processing, or simple request/response patterns
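As a small illustration, the WebSocket URL variants above can be assembled like this (host and port reflect the default `docker run` mapping; adapt them to your deployment):

```python
# Sketch: building the WebSocket URL variants described above.
# The /ws path segments and the ?token= querystring come from the docs;
# host and port match the default docker run mapping.
from urllib.parse import quote

def ws_url(token: str, agent_id: str = None, chat_id: str = None, host: str = "localhost:1865") -> str:
    path = "/ws"
    if agent_id:
        path += f"/{quote(agent_id)}"
        if chat_id:
            path += f"/{quote(chat_id)}"
    return f"ws://{host}{path}?token={quote(token)}"

print(ws_url("my-token"))                        # ws://localhost:1865/ws?token=my-token
print(ws_url("my-token", "agent1", "chat42"))
```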
This new version is no longer fully compatible with the original version, since the architecture has changed. Please refer to COMPATIBILITY.md for more information.
When implementing custom endpoints, you can use the `@endpoint` decorator to create a new endpoint. Please refer to the documentation for more information.
Important
Each endpoint implemented for chatbots must use the `check_permissions` method to authenticate. See this example.
Each endpoint implemented at a system level must use the `check_admin_permissions` method to authenticate. See this example.
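The pattern can be sketched as follows. Note that the real `endpoint` decorator and `check_permissions` helper come from the Cheshire Cat packages; the stand-ins below only mimic their shape so the sketch is self-contained, and the route is hypothetical:

```python
# Self-contained sketch of the custom-endpoint pattern. In a real plugin you
# would import `endpoint` and `check_permissions` from the Cheshire Cat
# packages; the stand-ins below only mimic their shape for illustration.

def check_permissions(resource: str, permission: str):
    """Stand-in: in the Cat, this authenticates the caller and resolves the user."""
    def dependency():
        return {"resource": resource, "permission": permission}
    return dependency

class endpoint:  # stand-in for the Cat's @endpoint decorator namespace
    registry: dict = {}

    @classmethod
    def get(cls, path: str):
        def wrapper(fn):
            cls.registry[("GET", path)] = fn
            return fn
        return wrapper

@endpoint.get(path="/socks/catalog")  # hypothetical route
def socks_catalog(user=check_permissions("CONVERSATION", "READ")):
    # `user` would be the authenticated user resolved by check_permissions
    return {"socks": ["black", "white", "pink"]}
```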
Hooks (events)
```python
from cat.mad_hatter.decorators import hook

# hooks are an event system to get fine-grained control over your assistant
@hook
def agent_prompt_prefix(prefix, cat):
    prefix = """You are Marvin the socks seller, a poetic vendor of socks.
You are an expert in socks, and you reply with exactly one rhyme.
"""
    return prefix
```

Tools
```python
from cat.mad_hatter.decorators import tool

# langchain inspired tools (function calling)
@tool(return_direct=True)
def socks_prices(color, cat):
    """How much do socks cost? Input is the sock color."""
    prices = {
        "black": 5,
        "white": 10,
        "pink": 50,
    }
    price = prices.get(color, 0)
    return f"{price} bucks, meeeow!"
```

Conversational Forms
```python
from enum import Enum
from pydantic import BaseModel, Field
from cat.mad_hatter.decorators import CatForm, form

class PizzaBorderEnum(Enum):
    HIGH = "high"
    LOW = "low"

# simple pydantic model
class PizzaOrder(BaseModel):
    pizza_type: str
    pizza_border: PizzaBorderEnum
    phone: str = Field(max_length=10)

@form
class PizzaForm(CatForm):
    name = "pizza_order"
    description = "Pizza Order"
    model_class = PizzaOrder
    examples = ["order a pizza", "I want pizza"]
    stop_examples = [
        "stop pizza order",
        "I do not want a pizza anymore",
    ]
    ask_confirm: bool = True

    def submit(self, form_data) -> str:
        return f"Form submitted: {form_data}"
```

MCP Clients
```python
# my_mcp_plugin.py
from cat.mad_hatter.decorators import hook, plugin
from cat.log import log
from cat.mad_hatter.decorators.experimental.mcp_client.mcp_client_decorator import mcp_client
from cat.mad_hatter.decorators.experimental.mcp_client.cat_mcp_client import CatMcpClient

# 1. Define your MCP client
@mcp_client
class WeatherMcpClient(CatMcpClient):
    """MCP client for weather information"""

    @property
    def init_args(self):
        # Return the connection parameters for your MCP server
        return {
            "server_url": "http://localhost:3000",
            # or for stdio: ["python", "path/to/mcp_server.py"]
        }

# 2. Hook to intercept elicitation requests and ask the user
@hook
def agent_fast_reply(fast_reply, cat):
    """
    Intercept tool responses that require elicitation.
    This hook runs after tools are executed but before the agent generates a response.
    """
    # Check if a tool returned an elicitation request
    if isinstance(fast_reply, dict) and fast_reply.get("status") == "elicitation_required":
        # Extract information about what we need
        field_description = fast_reply["message"]
        log.info(f"Elicitation required: {field_description}")
        # Return a message asking the user for the information
        # This will be sent to the user instead of going through the LLM
        return field_description
    # If not an elicitation, continue normally
    return fast_reply

# 3. Hook to capture user responses and store them
@hook
def before_agent_starts(agent_input, cat):
    """
    Handle user responses to pending elicitations.
    This hook runs at the start of each conversation turn.
    """
    # Check if there's a pending elicitation in working memory
    pending_elicitation = cat.working_memory.get("pending_mcp_elicitation")
    if pending_elicitation:
        log.info("Processing user response to pending elicitation")
        # Extract elicitation details
        mcp_client_name = pending_elicitation["mcp_client_name"]
        elicitation_id = pending_elicitation["elicitation_id"]
        missing_fields = pending_elicitation["missing_fields"]
        # Find the MCP client instance from plugin procedures
        client = None
        for procedure in cat.plugin_manager.procedures:
            if procedure.name == mcp_client_name:
                client = procedure
                break
        if client and missing_fields:
            # Get the first missing field (we handle one field per turn)
            first_field = missing_fields[0]
            field_name = first_field.get("name")
            # The user's input is their response to our question
            user_response = agent_input.input
            # Store the response
            client.store_elicitation_response(
                elicitation_id=elicitation_id,
                field_name=field_name,
                value=user_response,
                stray=cat,
            )
            # Clear the pending elicitation
            del cat.working_memory["pending_mcp_elicitation"]
            log.info(f"Stored response for field '{field_name}': {user_response}")
            # Modify the input to tell the agent to retry the original tool
            # This ensures the agent knows to call the tool again
            original_tool_call = pending_elicitation.get("original_tool_call", "the original action")
            agent_input.input = (
                f"I've just provided the required information: '{user_response}'. "
                f"Please **retry the original tool call**: {original_tool_call}"
            )
    return agent_input

# 4. (Optional) Hook to track which tool was being called
@hook
def before_cat_sends_message(message, cat):
    """
    You can use this to clean up or track state.
    """
    # Clean up any stale elicitation data if needed
    # (Usually not necessary as it's handled in before_agent_starts)
    return message

# 5. (Optional) Settings for your plugin
@plugin
def settings_schema():
    return {
        "mcp_server_url": {
            "title": "MCP Server URL",
            "type": "string",
            "default": "http://localhost:3000",
        }
    }
```

For your PHP based projects, I developed a PHP SDK that allows you to easily interact with the Cat. Please refer to the SDK documentation for more information.
For your Node.js / React.js / Vue.js based projects, I developed a TypeScript library that allows you to easily interact with the Cheshire Cat. Please refer to the library documentation for more information.
List of resources:
- Official Documentation of the current fork
- PHP SDK
- Typescript SDK
- Python SDK
- Tutorial - Write your first plugin
All contributions are welcome! Fork the project, create a branch, and make your changes. Then, follow the contribution guidelines to submit your pull request.
If you like this fork, give it a star ⭐! It is very important to have your support. Thanks again! 🙏
Code is licensed under GPL3.
The Cheshire Cat AI logo and name are property of Piero Savastano (founder and maintainer). The current fork is created,
refactored and maintained by Matteo Cacciola.