feat: limit message context sent to llm #66

Merged: gregnr merged 2 commits into main on Aug 14, 2024

Conversation

@gregnr (Collaborator) commented on Aug 14, 2024

Problem

In an ideal world, we send as much context to the LLM as possible so that it has the most information available to respond to your questions/tasks. Unfortunately, cumulative token costs grow quadratically with the length of the chat, since every request must re-send all previous messages.
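
To make the growth concrete (the numbers here are illustrative, not measured from this app): if each message averages roughly `t` tokens, the request for message `n` re-sends all `n - 1` prior messages, so a chat of `n` messages costs about `t * n * (n + 1) / 2` prompt tokens in total. At ~100 tokens per message, a 100-message chat has already consumed roughly 505,000 prompt tokens.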

Solution

To prevent token costs from blowing out of proportion, we can limit the maximum number of previous messages sent to the LLM as context on each request.

Important: this does not mean the actual message history is limited on the frontend. It refers strictly to the sliding window of the past X messages sent to the LLM as context each time.

This PR sets this message context limit to 30 messages. The consequence of this change is that the model will have no memory of messages older than 30 messages, so it will be unable to answer questions about, or refer to information from, more than 30 messages back.
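
A minimal sketch of the trimming step, assuming the chat history is a plain array of messages (the `Message` type, `MESSAGE_CONTEXT_LIMIT` constant, and `limitMessageContext` function below are illustrative names, not necessarily the ones used in this PR):

```ts
type Message = { role: 'system' | 'user' | 'assistant'; content: string };

// Cap on how many previous messages are sent to the LLM per request.
const MESSAGE_CONTEXT_LIMIT = 30;

// Keep only the most recent messages as LLM context. The full history
// is untouched on the frontend; only the request payload is trimmed.
function limitMessageContext(messages: Message[]): Message[] {
  return messages.slice(-MESSAGE_CONTEXT_LIMIT);
}
```

For example, a chat with 75 messages would send only messages 46–75 with the next request; anything earlier stays visible in the UI but is invisible to the model.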

30 messages was chosen as a reasonable amount of context to help the user accomplish whatever task they are working on in the moment, without being so large that irrelevant or stale messages are re-sent on every request, adding cost with little value.

Future

In the future, we can consider summarizing old messages before trimming the context so that the model at least has some history to refer to. Rewriting message history can get tricky though, so this will need more thought.
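
A hypothetical shape for that summarize-then-trim approach (not implemented in this PR; `summarizeMessages` is an assumed helper, e.g. a separate LLM call that condenses old messages into a short string):

```ts
type Message = { role: 'system' | 'user' | 'assistant'; content: string };

const MESSAGE_CONTEXT_LIMIT = 30;

// Assumed helper, not part of this PR: condenses old messages into a
// short summary string (e.g. via a separate LLM call).
declare function summarizeMessages(messages: Message[]): Promise<string>;

// Replace everything older than the sliding window with a single
// system message carrying the summary, then keep the recent window.
async function buildContext(messages: Message[]): Promise<Message[]> {
  if (messages.length <= MESSAGE_CONTEXT_LIMIT) {
    return messages;
  }
  const older = messages.slice(0, -MESSAGE_CONTEXT_LIMIT);
  const recent = messages.slice(-MESSAGE_CONTEXT_LIMIT);
  const summary = await summarizeMessages(older);
  return [{ role: 'system', content: summary }, ...recent];
}
```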

@gregnr merged commit 2480217 into main on Aug 14, 2024
1 check passed