allozaur
Collaborator

@allozaur allozaur commented Jul 23, 2025

Overview

This PR introduces a complete rewrite of the llama.cpp web interface, migrating from a React-based implementation to a modern SvelteKit architecture. The new implementation provides significant improvements in user experience, developer tooling, and feature capabilities while maintaining full compatibility with the llama.cpp server API.

🚀 Tech Stack Upgrade

Why Svelte 5 + SvelteKit Over React

  • No Virtual DOM: Svelte compiles components into targeted DOM updates, avoiding the virtual DOM diffing that React performs on each state change
  • Compile-Time Optimizations: Svelte compiles components to plain JavaScript with minimal runtime overhead
  • Minimal Boilerplate: Significantly less code required for the same functionality
  • Smaller Bundle Size: Dramatically reduced bundle size compared to React (~1MB less for gzipped static build with a lot more functionality included)
  • Built-in Reactivity: No need for external state management libraries
  • File-Based Routing: Automatic route generation with type safety
  • Static Site Generation: Optimized build output for deployment

Framework Changes

  • Framework: React 18.3.1 → SvelteKit + Svelte 5.0.0
  • Build Tool: Vite 6.0.5 → Vite 7.0.4
  • UI Library: DaisyUI 5.0.12 → ShadCN + bits-ui 2.8.11

Development Tooling

  • Testing Framework: Playwright (E2E/Regression Tests) + Vitest (Unit + Browser Tests) + Storybook (UI Interactions & Visual Tests)
  • Component Development: Storybook with UI tests & documentation for crucial components and interactions
  • Code Quality: ESLint 9.18.0 + typescript-eslint 8.20.0 + Prettier 3.4.2
  • Build Optimization: Static site generation with inline bundling strategy

🏗️ Architecture Modernization

Component Architecture

Components are designed with single responsibilities and clear boundaries, making them highly testable and reusable. The application components handle feature-specific logic while UI components provide consistent design patterns across the entire interface.

  • Application Components (src/lib/components/app/): specialized components focused on specific features
  • UI Component System (src/lib/components/ui/): ShadCN components providing design primitives like buttons, dialogs, dropdowns, and form elements
See example

Before (React - Monolithic Component):

// Single 554-line SettingDialog.tsx component
function SettingDialog({ show, onClose }) {
  const { config, setConfig } = useAppContext();
  const [localConfig, setLocalConfig] = useState(config);
  
  const handleSave = () => {
    setConfig(localConfig);
    localStorage.setItem('config', JSON.stringify(localConfig));
    onClose();
  };
  
  return (
    <div className="modal modal-open">
      {/* 500+ lines of inline form elements */}
    </div>
  );
}

After (Svelte - Modular Components):

<!-- Separate, focused components -->
<ChatSettingsDialog bind:open={showSettings}>
  <ChatSettingsGeneral />
  <ChatSettingsAdvanced />
  <ChatSettingsTheme />
</ChatSettingsDialog>

State Management

Stores act as the application's orchestration layer, coordinating between services and managing reactive state with Svelte 5 runes. They handle complex business logic like conversation branching, streaming responses, and persistent data synchronization.

  • chat.svelte.ts - Manages conversation state, message threading, AI interaction orchestration, and conversation branching logic
  • settings.svelte.ts - Handles user preferences, theme management, and configuration persistence with localStorage sync
  • server.svelte.ts - Tracks server properties, health status, capability detection, and real-time monitoring
  • database.ts - Provides IndexedDB abstraction with conversation and message CRUD operations
See example

Before (React Context with boilerplate):

const [isLoading, setIsLoading] = useState(false);
const [conversations, setConversations] = useState([]);
const [activeConversation, setActiveConversation] = useState(null);

useEffect(() => {
  const loadConversations = async () => {
    setIsLoading(true);
    try {
      const convs = await DatabaseStore.getAllConversations();
      setConversations(convs);
    } catch (error) {
      console.error('Failed to load conversations:', error);
    } finally {
      setIsLoading(false);
    }
  };
  loadConversations();
}, []);

const updateConfig = (key, value) => {
  setConfig(prev => ({ ...prev, [key]: value }));
  localStorage.setItem('config', JSON.stringify({ ...config, [key]: value }));
};

After (Svelte Runes - minimal boilerplate):

class ChatStore {
  conversations = $state<DatabaseConversation[]>([]);
  isLoading = $state(false);
  activeConversation = $state<DatabaseConversation | null>(null);
  
  async loadConversations() {
    this.conversations = await DatabaseStore.getAllConversations();
  }
}

class SettingsStore {
  config = $state<SettingsConfigType>({ ...SETTING_CONFIG_DEFAULT });
  
  updateConfig<K extends keyof SettingsConfigType>(key: K, value: SettingsConfigType[K]) {
    this.config[key] = value;
    this.saveConfig(); // Automatic persistence
  }
}

Service Layer

Services provide stateless functions for external communication and complex operations, completely separated from UI concerns. They handle HTTP requests, real-time monitoring, and data transformations while remaining easily testable.

  • chat.ts - HTTP communication with llama.cpp server, message formatting, and streaming response parsing
  • slots.ts - Real-time server resource monitoring and capacity tracking during generation
  • context.ts - Context window calculations, token limit handling, and error detection
  • index.ts - Service exports and dependency injection management
See example

Before (React - Mixed concerns in components):

// API logic mixed with component logic
const ChatScreen = () => {
  const sendMessage = async (content) => {
    setIsLoading(true);
    try {
      const response = await fetch('/completion', {
        method: 'POST',
        body: JSON.stringify({ messages: [...messages, { role: 'user', content }] })
      });
      // Streaming logic mixed with UI updates
      const reader = response.body.getReader();
      // ... complex streaming logic in component
    } catch (error) {
      // Error handling in component
    }
  };
};

After (Svelte - Clean service separation):

// Pure service layer
export class ChatService {
  async sendMessage(messages: ApiChatMessageData[], options: SettingsChatServiceOptions) {
    const response = await fetch('/completion', {
      method: 'POST',
      body: JSON.stringify({ messages, ...options })
    });
    
    if (options.stream) {
      return this.handleStreamingResponse(response, options);
    }
    return this.handleNonStreamingResponse(response);
  }
}

// Store orchestrates service calls
class ChatStore {
  async sendMessage(content: string) {
    await chatService.sendMessage(allMessages, {
      onChunk: (chunk) => this.currentResponse += chunk,
      onComplete: (content) => this.saveMessage(content)
    });
  }
}

Utilities & Hooks

Utilities contain framework-agnostic helper functions while hooks encapsulate reusable stateful logic patterns. Both are designed to be pure and testable in isolation from the rest of the application.

  • Utils (src/lib/utils/): fileProcessing.ts, pdfProcessing.ts, thinking.ts, branching.ts - 13 modules for file handling, content processing, and data manipulation
  • Hooks (src/lib/hooks/): useFileUpload.svelte.ts, useProcessingState.svelte.ts - Composables for file upload state and processing status management
See example

Before (React - Inline file processing):

const useChatExtraContext = () => {
  const processFiles = async (files) => {
    const processedFiles = [];
    for (const file of files) {
      if (file.type.startsWith('image/')) {
        const base64 = await convertToBase64(file);
        processedFiles.push({ type: 'image', data: base64 });
      } else if (file.type === 'application/pdf') {
        // Inline PDF processing logic...
      }
      // More inline processing...
    }
    return processedFiles;
  };
};

After (Svelte - Modular utilities and hooks):

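// Pure utility functions
import { getDocument } from 'pdfjs-dist';

export async function convertPDFToText(file: File): Promise<string> {
  // Read the file into memory before handing it to pdf.js
  const arrayBuffer = await file.arrayBuffer();
  const pdf = await getDocument(arrayBuffer).promise;
  // PDF processing logic
}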

export async function convertImageToBase64(file: File): Promise<string> {
  // Image processing logic
}

// Reusable hook
export function useFileUpload() {
  let files = $state<File[]>([]);
  let isProcessing = $state(false);
  
  const processFiles = async (newFiles: File[]) => {
    isProcessing = true;
    const processed = await Promise.all(
      newFiles.map(file => processFileByType(file))
    );
    files = processed;
    isProcessing = false;
  };
  
  return { files: () => files, isProcessing: () => isProcessing, processFiles };
}

Type System

The type system defines clear contracts between all layers and prevents runtime errors through compile-time checking. It enables confident refactoring and provides excellent developer experience with IntelliSense support.

  • database.d.ts - Conversation and message interfaces with branching support and attachment types
  • api.d.ts - Request/response interfaces for llama.cpp server communication and streaming
  • files.d.ts - File attachment type definitions for images, PDFs, text, and audio files
  • settings.d.ts - Configuration and theme type definitions with validation schemas
See example

Before (React - Loose typing):

// Minimal type definitions
interface Message {
  id: string;
  role: 'user' | 'assistant';
  content: string;
  timestamp: number;
}

const updateConfig = (key: string, value: any) => {
  // No type safety for config updates
};

After (Svelte - Comprehensive type system):

// Detailed type definitions with branching support
interface DatabaseMessage {
  id: string;
  convId: string;
  type: 'root' | 'text';
  timestamp: number;
  role: 'user' | 'assistant';
  content: string;
  thinking?: string;
  parent: string | null;
  children: string[];
  extra?: DatabaseMessageExtra[];
}

// Type-safe configuration updates
updateConfig<K extends keyof SettingsConfigType>(
  key: K, 
  value: SettingsConfigType[K]
) {
  this.config[key] = value; // Fully type-safe
}

Dependency Flow

Components → Stores → Services → API: The unidirectional flow ensures predictable data flow and easier debugging. Stores orchestrate business logic while keeping components focused purely on presentation, and services handle all external communication.
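A minimal, framework-agnostic sketch of this flow might look like the following. The names `ApiClient`, `MessageService`, `MessageStore`, and the `/echo` path are hypothetical and not taken from the PR; the fake API only stands in for the llama.cpp server so the layering is visible.

```typescript
// Hypothetical names throughout; this only illustrates the direction of dependencies.

// API layer: the only place that talks to the outside world.
interface ApiClient {
  post(path: string, body: { content: string }): Promise<string>;
}

// Service layer: stateless, depends only on the API client.
class MessageService {
  private readonly api: ApiClient;

  constructor(api: ApiClient) {
    this.api = api;
  }

  send(content: string): Promise<string> {
    return this.api.post('/echo', { content });
  }
}

// Store layer: owns state and orchestrates service calls.
class MessageStore {
  messages: string[] = [];
  isLoading = false;
  private readonly service: MessageService;

  constructor(service: MessageService) {
    this.service = service;
  }

  async sendMessage(content: string): Promise<void> {
    this.isLoading = true;
    try {
      this.messages.push(content);
      this.messages.push(await this.service.send(content));
    } finally {
      this.isLoading = false;
    }
  }
}

// A fake API stands in for the server so the sketch runs without a network.
const fakeApi: ApiClient = {
  post: (_path, body) => Promise.resolve(`echo: ${body.content}`)
};

// A component would only read store state and call store methods.
const store = new MessageStore(new MessageService(fakeApi));
```

Because each layer only knows about the one directly below it, the store can be tested with a fake service or API, as done here, without rendering any component.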

✨ Feature Enhancements

File Handling

  • Dropdown Upload Menu: Type-specific file selection (Images/Text/PDFs)
  • Universal Preview System: Full-featured preview dialogs for all supported file types
  • PDF Dual View: Text extraction + page-by-page image rendering
  • Enhanced Support: SVG/WEBP→PNG conversion, binary detection, syntax highlighting
  • Vision Model Awareness: Smart UI adaptation based on model capabilities
  • Graceful Failure: Proper error handling and user feedback for unsupported file types
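As an illustration of the binary detection mentioned above, one common heuristic is to sample the first bytes of a file and look for NUL bytes or a high share of control characters. This is a sketch under assumptions: the function name `looksBinary`, the 512-byte sample, and the 30% threshold are not taken from the PR.

```typescript
// Heuristic sketch: treat data as binary if the sample contains a NUL byte or
// too many control characters. Sample size and ratio are assumed values.
function looksBinary(
  bytes: Uint8Array,
  sampleSize: number = 512,
  maxSuspiciousRatio: number = 0.3
): boolean {
  const sample = bytes.subarray(0, sampleSize);
  if (sample.length === 0) return false; // an empty file is not treated as binary

  let suspicious = 0;
  for (let i = 0; i < sample.length; i++) {
    const byte = sample[i];
    if (byte === 0) return true; // NUL bytes are a strong sign of binary data
    const isAllowedControl = byte === 9 || byte === 10 || byte === 13; // tab, LF, CR
    if (byte < 32 && !isAllowedControl) suspicious++;
  }
  return suspicious / sample.length > maxSuspiciousRatio;
}

// Example inputs: a short source snippet and the start of a PNG file.
const textBytes = new TextEncoder().encode('const answer = 42;\n');
const pngBytes = new Uint8Array([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a, 0x00, 0x00]);
```

A check like this could run before a file is read as text, so that unsupported binary uploads are rejected with a clear message instead of being sent to the model as garbage.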

Advanced Chat Features

  • Reasoning Content: Dedicated thinking blocks with streaming support
  • Conversation Branching: Full tree structure with parent-child relationships
  • Message Actions: Edit, regenerate, delete with intelligent branch management
  • Keyboard Shortcuts:
    • Ctrl+Shift+N: Start new conversation
    • Ctrl+Shift+D: Delete current conversation
    • Ctrl+K: Focus search conversations
    • Ctrl+V: Paste files and content to conversation
    • Ctrl+B: Toggle sidebar
    • Enter: Send message
    • Shift+Enter: New line in message
  • Smart Paste: Auto-conversion of long text to files with customizable threshold (default 2000 characters)
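The Smart Paste rule can be sketched as a small decision function. The 2000-character default comes from the description above; the `PasteResult` type, the function name, and the `pasted-text.txt` file name are assumptions made for illustration only.

```typescript
// Hypothetical shapes; the PR description does not show the actual implementation.
type PasteResult =
  | { kind: 'inline'; text: string }
  | { kind: 'file'; name: string; content: string };

const DEFAULT_PASTE_THRESHOLD = 2000; // default stated in the PR description

function handlePastedText(
  text: string,
  threshold: number = DEFAULT_PASTE_THRESHOLD
): PasteResult {
  // Pastes up to the threshold stay in the message box unchanged.
  if (text.length <= threshold) {
    return { kind: 'inline', text };
  }
  // Longer pastes become a text attachment instead (file name is an assumption).
  return { kind: 'file', name: 'pasted-text.txt', content: text };
}
```

Keeping the decision in a plain function like this makes the threshold easy to expose as a setting and easy to unit test without a clipboard event.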

Server Integration

  • Slots Monitoring: Real-time server resource tracking during generation
  • Context Management: Advanced context error handling and recovery
  • Server Status: Comprehensive server state monitoring
  • API Integration: Full reasoning_content and slots endpoint support

🎨 User Experience Improvements

Interface Design

  • Modern UI Components: Consistent design system with ShadCN components
  • Responsive Layout: Adaptive sidebar and mobile-friendly design
  • Theme System: Seamless auto/light/dark mode switching
  • Visual Hierarchy: Clear information architecture and content organization

Interaction Patterns

  • Keyboard Navigation: Complete keyboard accessibility with shortcuts
  • Drag & Drop: Intuitive file upload with visual feedback
  • Smart Defaults: Context-aware UI behavior and intelligent defaults (sidebar auto-management, conversation naming)
  • Progressive Disclosure: Advanced features available without cluttering basic interface
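For the keyboard shortcuts, a matcher that accepts either Ctrl or Cmd as the modifier could look roughly like this. It is a sketch, not the PR's code: `KeyEventLike` mirrors only the fields of a DOM `KeyboardEvent` that are needed, and the action names are made up. The two bindings shown (Ctrl+K and Ctrl+B) are the ones listed in the PR description.

```typescript
// Minimal structural type so the sketch does not depend on DOM typings.
interface KeyEventLike {
  key: string;
  ctrlKey: boolean;
  metaKey: boolean;
  shiftKey: boolean;
}

interface Shortcut {
  key: string; // lowercase key name
  shift: boolean; // whether Shift must be held
  action: string; // hypothetical action identifier
}

const SHORTCUTS: Shortcut[] = [
  { key: 'k', shift: false, action: 'focus-search' },
  { key: 'b', shift: false, action: 'toggle-sidebar' }
];

function matchShortcut(
  event: KeyEventLike,
  shortcuts: Shortcut[] = SHORTCUTS
): string | null {
  // Accept Ctrl (Windows/Linux) or Cmd (macOS) as the modifier.
  if (!event.ctrlKey && !event.metaKey) return null;

  const pressed = event.key.toLowerCase();
  for (const shortcut of shortcuts) {
    if (shortcut.key === pressed && shortcut.shift === event.shiftKey) {
      return shortcut.action;
    }
  }
  return null;
}
```

In a browser this would typically be called from a `keydown` listener, with `event.preventDefault()` invoked on a match so the browser's own binding for the same keys does not also fire.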

Feedback & Communication

  • Loading States: Clear progress indicators during operations
  • Error Handling: User-friendly error messages with recovery suggestions
  • Status Indicators: Real-time server status and resource monitoring
  • Confirmation Dialogs: Prevent accidental data loss with confirmation prompts

🛠️ Developer Experience Improvements

Code Organization

  • Modular Architecture: Clear separation of concerns with dedicated folders
  • Component Composition: Reusable components with single responsibilities
  • Type Safety: Comprehensive TypeScript coverage with strict checking
  • Service Layer: Clean API abstraction with dependency injection

Development Workflow

  • Hot Reload: Instant feedback during development
  • Component Development: Isolated component development with Storybook
  • Concurrent Development: Simultaneous dev server and component library
  • Build Optimization: Fast builds with optimized output

Testing Infrastructure

  • Multi-Layer Testing: Unit, integration, and E2E testing coverage
  • Component Testing: Isolated component testing with Storybook
  • Browser Testing: Real browser environment testing with Vitest
  • Automated Testing: CI/CD integration with automated test runs

Code Quality

  • Linting & Formatting: Automated code quality enforcement
  • Type Checking: Compile-time error detection with TypeScript
  • Documentation: Self-documenting code with component stories
  • Error Handling: Comprehensive error boundaries and logging

Debugging & Monitoring

  • Development Tools: Enhanced debugging with Svelte DevTools
  • Error Messages: Clear, actionable error messages
  • Performance Monitoring: Built-in performance tracking
  • State Inspection: Easy state debugging with reactive stores

📊 Database Schema Updates

Modern Conversation Structure

// New branching-capable schema
interface DatabaseConversation {
  id: string;
  lastModified: number;
  currNode: string;        // Current conversation path
  name: string;
}

interface DatabaseMessage {
  id: string;
  convId: string;
  type: 'root' | 'text';
  timestamp: number;
  role: 'user' | 'assistant';
  content: string;
  thinking?: string;       // Reasoning content
  parent: string | null;   // Tree structure
  children: string[];      // Branching support
  extra?: DatabaseMessageExtra[];
}

Enhanced Features

  • Conversation Branching: Navigate between different response paths
  • Reasoning Support: Dedicated storage for model thinking content
  • Rich Attachments: Comprehensive file attachment system
  • Tree Navigation: Smart current node tracking and path management
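To make the tree navigation concrete, the sketch below derives the currently displayed conversation from `currNode` by walking `parent` links up to the root. `TreeMessage` is a trimmed-down stand-in for `DatabaseMessage`, and `getActivePath` is a hypothetical helper, not a function from the PR.

```typescript
// Only the fields needed for path resolution; a subset of the schema above.
interface TreeMessage {
  id: string;
  parent: string | null;
  children: string[];
  content: string;
}

// Walk from the current node up to the root, then reverse to get root-first order.
function getActivePath(messages: TreeMessage[], currNode: string): TreeMessage[] {
  const byId = new Map<string, TreeMessage>();
  for (const message of messages) {
    byId.set(message.id, message);
  }

  const path: TreeMessage[] = [];
  const seen = new Set<string>(); // guards against cycles in corrupted data
  let cursor: string | null = currNode;
  while (cursor !== null && !seen.has(cursor)) {
    const message: TreeMessage | undefined = byId.get(cursor);
    if (!message) break; // unknown id: stop with whatever was collected
    seen.add(cursor);
    path.push(message);
    cursor = message.parent;
  }
  return path.reverse();
}

// One user message with two alternative assistant replies (two branches).
const sampleMessages: TreeMessage[] = [
  { id: 'root', parent: null, children: ['u1'], content: '' },
  { id: 'u1', parent: 'root', children: ['a1', 'a2'], content: 'Hello' },
  { id: 'a1', parent: 'u1', children: [], content: 'First answer' },
  { id: 'a2', parent: 'u1', children: [], content: 'Regenerated answer' }
];
```

Switching branches then amounts to pointing `currNode` at a different leaf: with `currNode` set to `a1` the path ends in the first answer, and with `a2` it ends in the regenerated one.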

🧪 Testing Infrastructure

Comprehensive Testing Strategy

  • UI Testing: Storybook setup with tests for crucial components and interactions
  • Unit Testing: Vitest with browser support for client-side testing
  • E2E Testing: Playwright for full user journey validation
  • Automated CI/CD: Integrated testing pipeline

Development Experience

  • Component Isolation: Individual component development and testing
  • Hot Reload: Concurrent Storybook + dev server
  • Type Safety: Full TypeScript coverage with strict checking
  • Code Quality: Automated linting, formatting, and validation

📈 Performance & Build Improvements

Build Optimization

  • Static Site Generation: Pre-built static files for optimal performance
  • Inline Bundling: Single-file deployment strategy
  • Asset Optimization: Automatic asset processing and compression
  • Post-Build Processing: Custom build pipeline with optimization scripts

Runtime Performance

  • Reactive Updates: Svelte's compile-time optimizations
  • Smaller Bundle: More efficient runtime compared to React
  • Memory Efficiency: Better garbage collection and memory usage
  • Fast Hydration: Optimized client-side initialization

🔄 Migration Benefits

Maintainability

  • Modular Components: Easy to extend and modify individual features
  • Type Safety: Reduced runtime errors with comprehensive TypeScript
  • Testing Coverage: Automated testing for reliability and regression prevention
  • Documentation: Storybook documentation for components

Performance

  • Faster Interface: Improved performance and responsiveness
  • Better Resource Usage: More efficient memory and CPU utilization
  • Optimized Loading: Faster initial load and navigation
  • Reduced Bundle Size: Smaller JavaScript payload

🚦 Breaking Changes

  • New Database Schema: Automatic migration from linear to tree structure
  • API Compatibility: Maintains full compatibility with llama.cpp server API
  • Configuration: Settings migration with backward compatibility
  • File Structure: Complete reorganization with improved organization

📝 Migration Notes

The new implementation maintains full API compatibility while providing significant enhancements. The database schema automatically migrates existing conversations to the new branching structure. All existing functionality is preserved while adding extensive new capabilities.
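One way such a linear-to-tree migration could work is to chain each old message under the previous one, below a synthetic root node. This is purely illustrative: the `LegacyMessage` shape, the root id format, the root's placeholder values, and the function name are all assumptions, and the PR's actual migration may differ.

```typescript
// Assumed shape of a message in the old, linear format.
interface LegacyMessage {
  id: string;
  role: 'user' | 'assistant';
  content: string;
  timestamp: number;
}

// Subset of the new schema that matters for the tree structure.
interface TreeNode {
  id: string;
  convId: string;
  type: 'root' | 'text';
  timestamp: number;
  role: 'user' | 'assistant';
  content: string;
  parent: string | null;
  children: string[];
}

function migrateLinearConversation(
  convId: string,
  legacy: LegacyMessage[]
): { messages: TreeNode[]; currNode: string } {
  // Synthetic root; its id format, role, and timestamp are placeholders.
  const root: TreeNode = {
    id: `${convId}-root`,
    convId,
    type: 'root',
    timestamp: 0,
    role: 'user',
    content: '',
    parent: null,
    children: []
  };

  const messages: TreeNode[] = [root];
  let previous = root;
  for (const old of legacy) {
    const node: TreeNode = {
      id: old.id,
      convId,
      type: 'text',
      timestamp: old.timestamp,
      role: old.role,
      content: old.content,
      parent: previous.id,
      children: []
    };
    previous.children.push(node.id); // a linear history has exactly one child per node
    messages.push(node);
    previous = node;
  }

  // The last message in the linear history becomes the current node.
  return { messages, currNode: previous.id };
}
```

A migrated conversation is a tree with a single branch, so it renders exactly as before until the user edits or regenerates a message and creates a second child somewhere along the path.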

@bayorm

bayorm commented Jul 26, 2025

What is the reason for changing from React to Svelte? Would it be better to just improve the UI with React?

allozaur added 25 commits August 1, 2025 14:25

Introduces the ability to send and display attachments within the chat interface.

This includes:
- A new `ChatAttachmentsList` component to display attachments in both the `ChatForm` (pending uploads) and `ChatMessage` (stored attachments).
- Updates to `ChatForm` to handle file uploads and display previews.
- Updates to `ChatMessage` to display stored attachments.
- Logic to convert uploaded files to a format suitable for storage and transmission.

Improves code organization and reusability by introducing separate components for displaying image and file attachments in the chat interface.

This change simplifies the ChatAttachmentsList component and enhances the maintainability of the codebase.

Adds support for pasting files and converting long text to files.

Enhances file upload capabilities to handle pasted files directly from the clipboard. Implements a feature that converts lengthy pasted text into a text file if it exceeds a configurable length.

Also enhances file processing to read text file content and prevent binary files from being processed.

Implements file attachment previews in the chat interface, including support for text file previews with truncated display and remove functionality.

Improves the display of image attachments.
Text file detection is enhanced using filename extensions.

Refactors the chat form to handle file uploads more robustly.
- Changes `onSend` to be an async function that returns a boolean to
  indicate the success of sending the message. This allows for
  validation before clearing the message and uploaded files.
- Adds client-side validation for WebP images to avoid server-side errors.
- Moves text file utility functions and constants to dedicated files and utils.
- Fixes a bug where chat messages were not being cleared after sending.

Ensures that SVG images are converted to PNG format before being stored in the database and displayed in the chat interface.

This improves compatibility with systems that may not fully support SVG rendering and addresses potential security concerns related to SVG files.

Handles cases where SVG to PNG conversion fails gracefully, preventing application errors.

Enables displaying PDF files in the chat interface.

This commit introduces the `pdfjs-dist` library to process PDFs, extracting text content for display. It also includes functionality to potentially convert PDF pages into images, allowing preview of PDF documents.

The changes enhance the chat functionality by allowing users to add PDF attachments to the chat. Currently, only converting PDFs to text is supported.

Adds a dialog for previewing attachments, including images, text, PDFs, and audio files.

This change enhances the user experience by allowing users to view attachments in a larger, more detailed format before downloading or interacting with them.
It introduces a new component, `ChatAttachmentPreviewDialog.svelte`, and updates the `ChatAttachmentsList.svelte` and `ChatAttachmentImagePreview.svelte` components to trigger the dialog.

Enhances the chat interface with several UI improvements:

- Adds backdrop blur to the chat form for better visual separation.
- Makes the chat header background transparent on larger screens and introduces a blur effect.
- Adjusts the maximum width of filenames in attachment previews for better responsiveness.
- Fixes sidebar interaction on mobile, ensuring it closes after item selection.
- Adjusts input field styling.

Sets the focus to the textarea element when the component
mounts and after the loading state changes from true to false.

This improves user experience by automatically placing the cursor
in the input field, allowing users to start typing immediately.

Updates the timestamp format in chat messages to display hours and minutes in a more user-friendly manner.

This improves the readability and clarity of the message timestamps,
making it easier for users to quickly understand when a message was sent.

Refactors the directory structure by moving chat-related components to a dedicated "app" directory.

This change improves organization and maintainability by grouping components based on their function within the application.

Refactors server properties to align with the new API structure.
Changes how context length is accessed.
Adds conditional rendering based on model availability.

Implements a check to ensure the message content and history do not exceed the model's context window limits.

This prevents errors and improves the user experience by displaying an informative alert dialog when the context length would be exceeded, suggesting ways to shorten the message or start a new conversation.

Also adds a method to clear the context error.
@allozaur
Collaborator Author

Will pyodide return to the new web interface?

Hey! Yes, we will add it in near future 😄

@ggerganov
Member

I think there is some rendering issue when the number of chat sessions is too large?

image

The reason I think it is related to the number of chats is because in a new incognito window without any previous sessions it works ok.

@allozaur
Collaborator Author

I think there is some rendering issue when the number of chat sessions is too large?
image

The reason I think it is related to the number of chats is because in a new incognito window without any previous sessions it works ok.

Hmm, I've never had this error with multiple conversations... how many of them do you have? I will take a look at that

@ggerganov
Member

I think close to 100. I tried to delete a few of them, but the issue persists.

Btw, is there a way to access the data for the conversations using the browser developer console? I looked in localStorage but they are not saved there.

@allozaur
Collaborator Author

I think close to 100. I tried to delete a few of them, but the issue persists.

Btw, is there a way to access the data for the conversations using the browser developer console? I looked in localStorage but they are not saved there.

Okay, I will check that. And you can access data from IndexedDB

@easyfab

easyfab commented Sep 17, 2025

Very nice, thank you.

How do I manage PDFs/images/audio once uploaded?

image

The edit button only accesses the text.
How do I delete/unload this PDF to load another one, for example?
Is that possible, or is a new chat needed?

@allozaur
Collaborator Author

How to delete/unload this pdf to load another one for example.
Is it possible or is a new chat needed?

@easyfab this is not possible (yet), but this should be a very straightforward change, I will create a PR with this enhancement and let you know here as well ;)

@O-J1

O-J1 commented Sep 18, 2025

@allozaur Forwarding from someone else

Regarding the keyboard shortcuts, some are already used by browsers:

  • ctrl-shift-n opens a private window in chromium browsers
  • ctrl-shift-d adds a bookmark in firefox browsers and prompts to add all tabs to bookmarks in chrome browsers.
  • ctrl-k focuses the search engine bar in firefox browsers
  • ctrl-b toggles the bookmark sidebar in firefox browsers

https://support.mozilla.org/en-US/kb/keyboard-shortcuts-perform-firefox-tasks-quickly
https://support.google.com/chrome/answer/157179

@allozaur
Collaborator Author

@allozaur Forwarding from someone else

Regarding the keyboard shortcuts, some are already used by browsers:

* ctrl-shift-n opens a private window in chromium browsers

* ctrl-shift-d adds a bookmark in firefox browsers and prompts to add all tabs to bookmarks in chrome browsers.

* ctrl-k focuses the search engine bar in firefox browsers

* ctrl-b toggles the bookmark sidebar in firefox browsers

https://support.mozilla.org/en-US/kb/keyboard-shortcuts-perform-firefox-tasks-quickly https://support.google.com/chrome/answer/157179

Hey @O-J1, thanks for your feedback. Generally, the thing with keyboard shortcuts for web apps is that only a few intuitive shortcuts are not already taken by some browser feature... As for the ones that you listed above:

  • ctrl-shift-n opens a private window in chromium browsers

We are not using this one. For new chat we are using Cmd/Ctrl+Shift+O

  • ctrl-shift-d adds a bookmark in firefox browsers and prompts to add all tabs to bookmarks in chrome browsers.

For Firefox it's actually Cmd/Ctrl+D to add a bookmark; if you add the Shift key, it lets you create a bookmark folder by default.

I think that these are relatively rare use cases for Chromium/Gecko browsers and it's okay to override the default behaviour for this shortcut, as deleting conversations can come in handy when working with multiple convos. Otherwise, if you have suggestions for a better alternative, I am all ears (and eyes) 😉

  • ctrl-k focuses the search engine bar in firefox browsers

Cmd/Ctrl + K is one of the most popular shortcuts to open search in multiple web apps (GitHub included), so I really wouldn't consider changing this one.

  • ctrl-b toggles the bookmark sidebar in firefox browsers

This shortcut is used by default for ShadCN sidebar component, I am not sure if it is that crucial to disable it to allow opening/closing bookmarks. If you think there are solid arguments behind it, of course let's then consider removing it.

Generally, I think it would be best to continue this discussion in a separate GH issue, so feel free to open one and let's discuss the shortcuts in more detail over there 😄

@allozaur
Collaborator Author

How to delete/unload this pdf to load another one for example.
Is it possible or is a new chat needed?

@easyfab this is not possible (yet), but this should be a very straightforward change, I will create a PR with this enhancement and let you know here as well ;)

@easyfab let's continue this thread in #16085 🙂

@allozaur
Collaborator Author

I think close to 100. I tried to delete a few of them, but the issue persists.
Btw, is there a way to access the data for the conversations using the browser developer console? I looked in localStorage but they are not saved there.

Okay, I will check that. And you can access data from IndexedDB

@ggerganov I wasn't able to reproduce this issue so far... Can you please create a separate issue and ideally attach a video recording and/or screenshots from the browser console and webui?

@vbooka1

vbooka1 commented Sep 21, 2025

Hello, the thought process is not displayed in the new Web UI. Neither enabling nor disabling "Show thought in progress" works, the chat screen just shows "Processing..." and ticking content tokens until the model starts outputting the result. How do I expand the model thinking window?

@allozaur
Collaborator Author

allozaur commented Sep 21, 2025

Hello, the thought process is not displayed in the new Web UI. Neither enabling nor disabling "Show thought in progress" works, the chat screen just shows "Processing..." and ticking content tokens until the model starts outputting the result. How do I expand the model thinking window?

Hi, are you using the --jinja flag? Also, which model are you using?

@DocMAX

DocMAX commented Sep 22, 2025

this broke llama-swap upstream chat

@allozaur
Collaborator Author

allozaur commented Sep 22, 2025

@BradHutchings

This broke being able to serve the chat ui on a non-root path, e.g:

http://[server]/[path-to-chat]/

I use this in Mmojo Server (fork of llama.cpp) to serve a completion UI as default and the chat UI as an option.

https://github.com/BradHutchings/Mmojo-Server

I've tried patching this new UI up, but it's beyond my expertise. It seems like you'd have to design an alternate path in somehow.

@d-a-v

d-a-v commented Sep 29, 2025

Same issue as @BradHutchings.
The reverse proxy no longer works with Apache.
Maybe the beginning of an answer is here: https://svelte.dev/docs/kit/hooks#externalfetch

The apache2 config I was using and which is now broken is:

<location "/llm/">
    ProxyPreserveHost On
    ProxyPass http://server-hosting-llama-server:1234/
    ProxyPassReverse http://server-hosting-llama-server:1234/
    RequestHeader set X-Forwarded-Proto http
    RequestHeader set X-Forwarded-Prefix /llm/
</location>
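To illustrate the kind of path handling involved when the UI is served under a prefix such as `/llm/`, the sketch below resolves an endpoint relative to the current page URL instead of the server root. It is only an illustration of the problem and not necessarily how the issue was addressed in the web UI; `resolveApiUrl` is a hypothetical helper, while `slots` is the endpoint name mentioned in the PR description.

```typescript
// Resolve an endpoint relative to the directory of the current page, so that a
// proxy prefix such as /llm/ is preserved. The function name is hypothetical.
function resolveApiUrl(pageUrl: string, endpoint: string): string {
  // Strip leading slashes so the endpoint is treated as relative rather than
  // root-absolute; a root-absolute path would discard the prefix.
  const relative = endpoint.replace(/^\/+/, '');
  return new URL(relative, pageUrl).toString();
}
```

In a browser the first argument would typically be `window.location.href`. With the page at `http://example.com/llm/`, a request for `/slots` then goes to `http://example.com/llm/slots`, which the proxy can forward, rather than to `http://example.com/slots`.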

@BradHutchings

@d-a-v If you haven't figured it out, you can roll back the webui as described here:

#16261

This is what I'm doing now until the issue is fixed. But I bet you have that figured out 🤣.

-Brad


@allozaur
Collaborator Author

allozaur commented Sep 29, 2025

@BradHutchings @d-a-v

Can you please create an issue with reproduction steps and screenshots/code snippets?

I will be more than happy to address this, but I need more context and some examples to start working on a fix.

Edit: Never mind, just saw the issue. Will let you know about updates on this.

@BradHutchings

@allozaur I really appreciate you taking a look at this. Thank you.

Labels
devops improvements to build systems and github actions examples python python script changes script Script related server/webui server