Simplify AI Chat Response Streaming #1167

Merged: 6 commits from simplify-chat-response-streaming into master on Apr 21, 2025

Conversation

@debanjum (Member) commented on Apr 21, 2025

Reason

  • Simplify the code and logic for streaming chat responses by relying solely on the asyncio event loop.
  • Reduce thread-management overhead to improve efficiency and throughput (where possible).

Details

  • Use async/await with no threading when generating chat responses via the OpenAI, Gemini, and Anthropic model APIs (see the first sketch below).
  • Keep threading for the offline chat model, since llama-cpp doesn't support async streaming yet (see the second sketch below).
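
A minimal sketch of the fully async path, assuming the official openai Python SDK (v1+); the model name and messages are illustrative, not taken from the PR:

```python
import asyncio
from typing import AsyncIterator

from openai import AsyncOpenAI


async def stream_chat_response(messages: list[dict]) -> AsyncIterator[str]:
    # The async client awaits on network I/O, so the event loop stays
    # free to serve other requests while tokens trickle in.
    client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment
    stream = await client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice, not from the PR
        messages=messages,
        stream=True,
    )
    async for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            yield chunk.choices[0].delta.content


async def main() -> None:
    async for token in stream_chat_response([{"role": "user", "content": "Hello!"}]):
        print(token, end="", flush=True)


if __name__ == "__main__":
    asyncio.run(main())
```

The same pattern applies to the Gemini and Anthropic clients, each of which ships an async streaming interface of its own.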
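And a minimal sketch of the threaded offline path, assuming llama-cpp-python; the queue-bridging helper is a hypothetical illustration of the pattern, not the PR's actual code:

```python
import asyncio
from typing import AsyncIterator

from llama_cpp import Llama

_SENTINEL = object()


async def stream_offline_response(llm: Llama, messages: list[dict]) -> AsyncIterator[str]:
    # llama-cpp's stream is a blocking, synchronous generator, so run it
    # in a worker thread and hand chunks back to the event loop via a queue.
    loop = asyncio.get_running_loop()
    queue: asyncio.Queue = asyncio.Queue()

    def produce() -> None:
        try:
            for chunk in llm.create_chat_completion(messages=messages, stream=True):
                text = chunk["choices"][0]["delta"].get("content")
                if text:
                    # put_nowait is not thread-safe on its own; schedule it
                    # on the event loop from this worker thread instead.
                    loop.call_soon_threadsafe(queue.put_nowait, text)
        finally:
            loop.call_soon_threadsafe(queue.put_nowait, _SENTINEL)

    producer = asyncio.create_task(asyncio.to_thread(produce))
    while (item := await queue.get()) is not _SENTINEL:
        yield item
    await producer


async def main() -> None:
    llm = Llama(model_path="models/chat-model.gguf")  # hypothetical model path
    async for token in stream_offline_response(llm, [{"role": "user", "content": "Hi!"}]):
        print(token, end="", flush=True)


if __name__ == "__main__":
    asyncio.run(main())
```

Confining the thread to this one adapter keeps the rest of the streaming pipeline on a single async code path, which is the simplification the PR is after.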

@debanjum added the upgrade (New feature or request) label on Apr 21, 2025
@debanjum merged commit f929ff8 into master on Apr 21, 2025 (10 checks passed)
@debanjum deleted the simplify-chat-response-streaming branch on April 21, 2025 at 08:58