AIWatch - AI Model Management and Observability powered by Docker Model Runner

A modern, full-stack chat application demonstrating how to integrate a React frontend with a Go backend and run local Large Language Models (LLMs) using Docker's Model Runner. This project features a comprehensive Redis-powered observability stack with real-time monitoring, analytics, and distributed tracing.

Overview

This project showcases a complete Generative AI interface with enterprise-grade observability that includes:

  • React/TypeScript frontend with a responsive chat UI
  • Go backend server for API handling
  • Integration with Docker's Model Runner to run Llama 3.2 locally
  • Redis Stack with TimeSeries for data persistence and analytics
  • Comprehensive observability with metrics, logging, and tracing
  • NEW: Redis-powered analytics with real-time performance monitoring
  • Enhanced Docker Compose setup with full observability stack

🔧 Features

  • 💬 Interactive chat interface with message history
  • 🔄 Real-time streaming responses (tokens appear as they're generated)
  • 🌓 Light/dark mode support based on user preference
  • 🐳 Dockerized deployment for easy setup and portability
  • 🏠 Run AI models locally without cloud API dependencies
  • 🔒 Cross-origin resource sharing (CORS) enabled
  • 🧪 Integration testing using Testcontainers
  • 📊 Redis-powered metrics and performance monitoring
  • 📝 Structured logging with zerolog
  • 🔍 Distributed tracing with OpenTelemetry & Jaeger
  • 📈 Grafana dashboards for visualization
  • 🚀 Advanced llama.cpp performance metrics
  • 🆕 Redis Stack with TimeSeries, Search, and JSON support
  • 🆕 Redis Exporter for Prometheus metrics integration
  • 🆕 Token Analytics Service for usage tracking
  • 🆕 Production-ready health checks and service dependencies
  • 🆕 Auto-configured Grafana with Prometheus and Redis datasources

πŸ—οΈ Enhanced Architecture

The application now consists of a comprehensive observability stack:

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Frontend      │ >>> │   Backend       │ >>> │  Model Runner   │
│  (React/TS)     │     │    (Go)         │     │ (Llama 3.2)     │
└─────────────────┘     └─────────────────┘     └─────────────────┘
      :3000                   :8080                   :12434
                              │  │
┌─────────────────┐     ┌──────────────┐     ┌─────────────────┐
│    Grafana      │ <<< │  Prometheus  │     │     Jaeger      │
│  Dashboards     │     │   Metrics    │     │    Tracing      │
└─────────────────┘     └──────────────┘     └─────────────────┘
      :3001                  :9091                 :16686

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Redis Stack   │     │ Redis Exporter  │     │ Token Analytics │
│ DB + Insight    │     │ (Prometheus)    │     │    Service      │
└─────────────────┘     └─────────────────┘     └─────────────────┘
   :6379, :8001               :9121                   :8082

┌─────────────────┐
│ Redis TimeSeries│
│    Service      │
└─────────────────┘
       :8085

🚀 Quick Start

Prerequisites

  • Docker and Docker Compose
  • Git
  • Go 1.19 or higher (for local development)
  • Node.js and npm (for frontend development)

Before starting, pull the required model:

docker model pull ai/llama3.2:1B-Q8_0

🎯 One-Command Deployment

Start the complete AIWatch observability stack:

# Clone the repository
git clone https://github.com/collabnix/aiwatch.git
cd aiwatch

# Start the complete stack (builds and runs all services)
docker-compose up -d --build

🌐 Access Points

After deployment, access these services:

| Service          | URL                    | Credentials | Purpose                  |
|------------------|------------------------|-------------|--------------------------|
| AIWatch Frontend | http://localhost:3000  | -           | Main chat interface      |
| Grafana          | http://localhost:3001  | admin/admin | Monitoring dashboards    |
| Redis Insight    | http://localhost:8001  | -           | Redis database GUI       |
| Prometheus       | http://localhost:9091  | -           | Metrics collection       |
| Jaeger           | http://localhost:16686 | -           | Distributed tracing      |
| Token Analytics  | http://localhost:8082  | -           | Usage analytics API      |
| TimeSeries API   | http://localhost:8085  | -           | Redis TimeSeries service |

✅ Verification Steps

After deployment, verify the observability stack is working:

  1. Check Grafana Connection:

    • Visit http://localhost:3001
    • Login with admin/admin
    • Go to Configuration > Data Sources
    • Verify Prometheus datasource shows "✅ Data source is working"
    • Verify Redis datasource is configured
  2. Check Prometheus Targets:

    • Visit http://localhost:9091/targets
    • All targets should show State: UP (this check can also be scripted; see the sketch after this list):
      • prometheus:9090 (Prometheus itself)
      • redis-exporter:9121 (Redis metrics)
      • backend:9090 (Backend metrics)
      • token-analytics:8082 (Analytics metrics)
  3. View Pre-built Dashboard:

    • In Grafana, go to Dashboards
    • Open "AIWatch Redis Monitoring"
    • You should see Redis metrics: Memory Usage, Connected Clients, Commands/sec
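
To automate step 2, a small Go program can query Prometheus's standard HTTP API (/api/v1/targets). This is a minimal sketch assuming the stack's default host port mapping (9091):

package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// targetsResponse models just the fields we need from Prometheus's
// /api/v1/targets response.
type targetsResponse struct {
	Data struct {
		ActiveTargets []struct {
			Labels map[string]string `json:"labels"`
			Health string            `json:"health"`
		} `json:"activeTargets"`
	} `json:"data"`
}

func main() {
	resp, err := http.Get("http://localhost:9091/api/v1/targets")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var tr targetsResponse
	if err := json.NewDecoder(resp.Body).Decode(&tr); err != nil {
		panic(err)
	}
	for _, t := range tr.Data.ActiveTargets {
		fmt.Printf("%-20s %s\n", t.Labels["job"], t.Health) // every job should print "up"
	}
}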

📊 Redis Observability Features

Redis Stack Components

  1. Redis Database (Port 6379)

    • Primary data store for chat history and session management
    • Redis TimeSeries for metrics storage
    • Redis JSON for complex data structures
    • Redis Search for full-text capabilities
  2. Redis Insight (Port 8001)

    • Web-based Redis GUI for database inspection
    • Real-time monitoring of Redis performance
    • Key-value browser and query interface
  3. Redis Exporter (Port 9121)

    • Exports Redis metrics to Prometheus
    • Monitors memory usage, command statistics, connection counts
    • Integration with alerting systems
  4. Token Analytics Service (Port 8082)

    • Tracks token usage patterns and costs
    • API endpoint for analytics queries
    • Integration with frontend metrics display
  5. Redis TimeSeries Service (Port 8085)

    • Dedicated API for time-series data operations
    • Historical performance data storage
    • Real-time metrics aggregation
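
As a concrete illustration of the TimeSeries pieces above, here is a minimal Go sketch that appends and reads back a sample using go-redis's generic Do, so no module-specific client is needed. The key name aiwatch:tokens:generated is illustrative, not the project's actual schema:

package main

import (
	"context"
	"fmt"

	"github.com/redis/go-redis/v9"
)

func main() {
	ctx := context.Background()
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})

	// TS.ADD key * value appends a sample at the server's current time;
	// the series is created on first write when the TimeSeries module is loaded.
	ts, err := rdb.Do(ctx, "TS.ADD", "aiwatch:tokens:generated", "*", 128).Result()
	if err != nil {
		panic(err)
	}
	fmt.Println("stored sample at timestamp:", ts)

	// TS.RANGE key - + reads every stored sample back.
	samples, err := rdb.Do(ctx, "TS.RANGE", "aiwatch:tokens:generated", "-", "+").Result()
	if err != nil {
		panic(err)
	}
	fmt.Println(samples)
}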

Monitoring & Analytics

  • Real-time Redis Metrics: Memory usage, commands/sec, connections
  • Token Usage Analytics: Input/output tokens, cost tracking, usage patterns
  • Performance Monitoring: Response times, throughput, error rates
  • Historical Data: Time-series storage of all metrics for trend analysis
  • Grafana Integration: Pre-configured dashboards for Redis monitoring
  • Auto-configured Datasources: Prometheus and Redis datasources automatically set up

🛠️ Development Setup

Frontend

The frontend is built with React, TypeScript, and Vite:

cd frontend
npm install
npm run dev

This will start the development server at http://localhost:3000.

Backend

The Go backend can be run directly:

go mod download
go run main.go

Make sure to set the required environment variables from backend.env:

  • BASE_URL: URL for the model runner
  • MODEL: Model identifier to use
  • API_KEY: API key for authentication (defaults to "ollama")
  • REDIS_ADDR: Redis connection address (redis:6379)
  • LOG_LEVEL: Logging level (debug, info, warn, error)
  • LOG_PRETTY: Whether to output pretty-printed logs
  • TRACING_ENABLED: Enable OpenTelemetry tracing
  • OTLP_ENDPOINT: OpenTelemetry collector endpoint
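
As a sketch of how a Go backend typically loads these variables (the getenv helper is illustrative, and the fallback values other than the documented "ollama" default are assumptions; real values belong in backend.env):

package main

import (
	"fmt"
	"os"
)

// getenv returns an environment variable's value, or fallback when unset.
func getenv(key, fallback string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return fallback
}

func main() {
	baseURL := getenv("BASE_URL", "http://localhost:12434") // Model Runner port from the diagram above
	model := getenv("MODEL", "ai/llama3.2:1B-Q8_0")         // the model pulled in Quick Start
	apiKey := getenv("API_KEY", "ollama")                   // documented default
	redisAddr := getenv("REDIS_ADDR", "redis:6379")         // service name inside the Docker network

	fmt.Println(baseURL, model, apiKey, redisAddr)
}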

🔄 How It Works

  1. The frontend sends chat messages to the backend API
  2. The backend formats the messages and sends them to the Model Runner
  3. Chat history and session data are stored in Redis
  4. The LLM processes the input and generates a response
  5. The backend streams the tokens back to the frontend as they're generated (see the sketch after this list)
  6. Token analytics are collected and stored in Redis TimeSeries
  7. Redis metrics are exported to Prometheus for monitoring
  8. Observability components collect metrics, logs, and traces throughout the process
  9. Grafana dashboards provide real-time visualization of system performance
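
A minimal sketch of the streaming in step 5, assuming a tokens channel that stands in for the Model Runner's streamed response (the real backend proxies that stream rather than reading a local channel):

package main

import (
	"fmt"
	"net/http"
)

// streamHandler writes each token to the client as soon as it arrives,
// flushing after every write so the frontend can render incrementally.
func streamHandler(tokens <-chan string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		flusher, ok := w.(http.Flusher)
		if !ok {
			http.Error(w, "streaming unsupported", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "text/plain; charset=utf-8")
		for tok := range tokens {
			fmt.Fprint(w, tok) // emit the token...
			flusher.Flush()    // ...and push it to the client immediately
		}
	}
}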

πŸ“ Project Structure

├── compose.yaml                 # Complete observability stack deployment
├── backend.env                  # Backend environment variables
├── main.go                      # Go backend server
├── frontend/                    # React frontend application
│   ├── src/                     # Source code
│   │   ├── components/          # React components
│   │   ├── App.tsx              # Main application component
│   │   └── ...
├── pkg/                         # Go packages
│   ├── logger/                  # Structured logging
│   ├── metrics/                 # Prometheus metrics
│   ├── middleware/              # HTTP middleware
│   ├── tracing/                 # OpenTelemetry tracing
│   └── health/                  # Health check endpoints
├── prometheus/                  # Prometheus configuration
│   └── prometheus.yml           # Scraping configuration
├── grafana/                     # Grafana configuration
│   ├── provisioning/            # Auto-configuration
│   │   ├── datasources/         # Prometheus & Redis datasources
│   │   └── dashboards/          # Dashboard provisioning
│   └── dashboards/              # Pre-built dashboard JSON files
├── redis/                       # Redis configuration
│   └── redis.conf               # Redis server configuration
├── observability/               # Observability documentation
└── ...

📈 llama.cpp Metrics Features

The application includes detailed llama.cpp metrics displayed directly in the UI:

  • Tokens per Second: Real-time generation speed
  • Context Window Size: Maximum tokens the model can process
  • Prompt Evaluation Time: Time spent processing the input prompt
  • Memory per Token: Memory usage efficiency
  • Thread Utilization: Number of threads used for inference
  • Batch Size: Inference batch size

These metrics help in understanding the performance characteristics of llama.cpp models and can be used to optimize configurations.
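
A sketch of how these values could travel from backend to UI as one JSON payload (the struct and field names are illustrative, not the project's actual wire format):

package main

// LlamaCppMetrics mirrors the per-response statistics listed above.
type LlamaCppMetrics struct {
	TokensPerSecond  float64 `json:"tokens_per_second"`   // real-time generation speed
	ContextSize      int     `json:"context_size"`        // max tokens the model can process
	PromptEvalMs     float64 `json:"prompt_eval_ms"`      // time spent on the input prompt
	MemoryPerTokenKB float64 `json:"memory_per_token_kb"` // memory usage efficiency
	Threads          int     `json:"threads"`             // threads used for inference
	BatchSize        int     `json:"batch_size"`          // inference batch size
}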

🔍 Observability Features

The project includes comprehensive observability features:

Metrics

  • Model performance (latency, time to first token)
  • Token usage (input and output counts)
  • Request rates and error rates
  • Active request monitoring
  • Redis performance metrics (memory, commands, connections)
  • Token analytics with cost tracking
  • llama.cpp specific performance metrics
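
A sketch of how such metrics are registered with the Prometheus Go client (the metric names here are illustrative; the backend's actual names may differ):

package metrics

import (
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

var (
	// Latency to first token, bucketed for percentile queries in Grafana.
	FirstTokenSeconds = promauto.NewHistogram(prometheus.HistogramOpts{
		Name:    "aiwatch_first_token_seconds",
		Help:    "Time from request start to first generated token.",
		Buckets: prometheus.DefBuckets,
	})
	// Token throughput, split by direction ("input" or "output").
	TokensTotal = promauto.NewCounterVec(prometheus.CounterOpts{
		Name: "aiwatch_tokens_total",
		Help: "Tokens processed, labelled by direction.",
	}, []string{"direction"})
	// In-flight requests for active request monitoring.
	ActiveRequests = promauto.NewGauge(prometheus.GaugeOpts{
		Name: "aiwatch_active_requests",
		Help: "Number of chat requests currently being served.",
	})
)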

Logging

  • Structured JSON logs with zerolog
  • Log levels (debug, info, warn, error, fatal)
  • Request logging middleware
  • Error tracking
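
A minimal zerolog setup matching the features above, wiring in the LOG_LEVEL and LOG_PRETTY variables from backend.env (a sketch; the project's actual pkg/logger may differ):

package main

import (
	"os"

	"github.com/rs/zerolog"
	"github.com/rs/zerolog/log"
)

func main() {
	// Structured JSON by default; human-friendly console output when LOG_PRETTY=true.
	if os.Getenv("LOG_PRETTY") == "true" {
		log.Logger = log.Output(zerolog.ConsoleWriter{Out: os.Stderr})
	}
	// Honour LOG_LEVEL (debug, info, warn, error); ignore unknown values.
	if level, err := zerolog.ParseLevel(os.Getenv("LOG_LEVEL")); err == nil && level != zerolog.NoLevel {
		zerolog.SetGlobalLevel(level)
	}

	log.Info().Str("component", "backend").Int("port", 8080).Msg("server starting")
}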

Tracing

  • Request flow tracing with OpenTelemetry
  • Integration with Jaeger for visualization
  • Span context propagation

For more information, see Observability Documentation.

🎛️ Configuration Options

Redis Configuration

The Redis setup includes:

  • Persistence: RDB and AOF enabled for data durability
  • Memory Optimization: Configured for optimal performance
  • Security: Protected mode disabled for development (configure for production)
  • TimeSeries: Enabled for metrics storage
  • Networking: Bridge network for service communication

Service Dependencies

All services include:

  • Health Checks: Automated service health monitoring
  • Restart Policies: Automatic restart on failure
  • Resource Limits: Memory and CPU constraints
  • Logging: Centralized log collection

Auto-Configuration

  • Grafana Datasources: Automatically configured Prometheus and Redis connections
  • Dashboard Provisioning: Pre-built Redis monitoring dashboard
  • Prometheus Targets: All services automatically discovered and monitored

⚙️ Customization

You can customize the application by:

  1. Changing the model in backend.env to use a different LLM
  2. Modifying the frontend components for a different UI experience
  3. Extending the backend API with additional functionality
  4. Customizing the Grafana dashboards for different metrics
  5. Adjusting llama.cpp parameters for performance optimization
  6. Configuring Redis for different persistence and performance requirements
  7. Adding custom analytics using the Token Analytics Service API
  8. Creating custom dashboards in Grafana for specific monitoring needs
  9. Adding new datasources in grafana/provisioning/datasources/

🧪 Testing

The project includes integration tests using Testcontainers:

cd tests
go test -v
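
A minimal sketch of the Testcontainers pattern, starting a throwaway Redis container for a test (the image tag and wait strategy are illustrative):

package tests

import (
	"context"
	"testing"

	"github.com/testcontainers/testcontainers-go"
	"github.com/testcontainers/testcontainers-go/wait"
)

func TestWithRedis(t *testing.T) {
	ctx := context.Background()
	redisC, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
		ContainerRequest: testcontainers.ContainerRequest{
			Image:        "redis/redis-stack:latest",
			ExposedPorts: []string{"6379/tcp"},
			WaitingFor:   wait.ForListeningPort("6379/tcp"),
		},
		Started: true,
	})
	if err != nil {
		t.Fatal(err)
	}
	defer func() { _ = redisC.Terminate(ctx) }()

	// Hand this endpoint to the code under test in place of REDIS_ADDR.
	endpoint, err := redisC.PortEndpoint(ctx, "6379/tcp", "")
	if err != nil {
		t.Fatal(err)
	}
	t.Log("redis available at", endpoint)
}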

🚨 Troubleshooting

Common Issues

  • Model not loading: Ensure you've pulled the model with docker model pull
  • Connection errors: Verify Docker network settings and that Model Runner is running
  • Streaming issues: Check CORS settings in the backend code
  • Metrics not showing: Verify that Prometheus can reach the backend metrics endpoint
  • Redis connection failed: Check Redis container status and network connectivity
  • llama.cpp metrics missing: Confirm that your model is indeed a llama.cpp model
  • Grafana dashboards empty: Ensure Prometheus is collecting metrics and data source is configured correctly

Redis-Specific Troubleshooting

  • Redis Insight not accessible: Check if port 8001 is available and Redis container is running
  • Token analytics not working: Verify Redis TimeSeries module is loaded and service dependencies are met
  • Performance degradation: Monitor Redis memory usage and consider adjusting configuration
  • Data not persisting: Check Redis volume mounts and persistence configuration

Grafana Connectivity Issues

If Grafana shows "No data" in its dashboards:

  1. Check Datasource Configuration:

    # Verify Prometheus is accessible from Grafana container
    docker exec aiwatch-grafana wget -qO- http://prometheus:9090/api/v1/query?query=up
  2. Check Prometheus Targets:

    # View Prometheus targets status
    curl http://localhost:9091/api/v1/targets
  3. Restart Stack (if needed):

    docker-compose down
    docker-compose up -d --build

A common root cause is Docker networking: services inside the Docker network must reach each other by service name (for example prometheus:9090), not localhost:9090. The stack addresses this by:

  • ✅ Mounting the prometheus.yml configuration file properly
  • ✅ Using correct service names in Prometheus targets
  • ✅ Auto-configuring Grafana datasources with proper internal URLs
  • ✅ Adding a pre-built Redis monitoring dashboard

Health Checks

Monitor service health using:

# Check all container status
docker-compose ps

# View specific service logs
docker-compose logs redis
docker-compose logs grafana
docker-compose logs prometheus
docker-compose logs token-analytics

📊 Performance Optimization

Redis Optimization

  • Memory Management: Configure maxmemory and eviction policies
  • Persistence: Balance between RDB and AOF based on use case
  • Networking: Use Redis clustering for high availability
  • Monitoring: Set up alerts for memory usage and connection limits

Model Performance

  • Thread Configuration: Optimize thread count based on CPU cores
  • Memory Settings: Configure context window based on available RAM
  • Batch Processing: Adjust batch size for optimal throughput

🔄 Migration from Basic Setup

If upgrading from a previous version:

  1. Backup existing data (if any)
  2. Stop current services: docker-compose down
  3. Use new compose file: docker-compose up -d --build
  4. Verify all services: Check health endpoints and Grafana dashboards
  5. Import existing data into Redis if needed

📜 License

MIT

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

🙏 Acknowledgments

  • Docker Model Runner team for local LLM capabilities
  • Redis Stack for comprehensive data management
  • Grafana and Prometheus communities for observability tools
  • OpenTelemetry project for distributed tracing standards
