feat: add pagination support for search tools and chat response chunking #239
base: main
Conversation
- Add cursor-based pagination to company_search and people_profile_search tools
- Implement automatic response chunking for chat tool to handle large responses
- Add ChatResponseBuffer class for intelligent text splitting at natural boundaries
- Update all tool schemas to use proper Glean API pagination (cursor, not pageToken)
- Add comprehensive pagination tests covering all scenarios
- Update documentation with pagination examples and best practices
- Fix search formatter test to match new response format

This addresses the issue of large responses consuming excessive context window and enables efficient handling of large result sets from Glean APIs.
- Reduce MAX_TOKENS from 20000 to 15000 for more buffer
- Change CHARS_PER_TOKEN from 4 to 3 for more conservative estimation

Prevents token limit errors on very large chat responses.
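Under the new constants, the character budget before chunking works out to 45,000 characters. A minimal sketch of the estimation (constant and function names assumed, not the actual implementation):

```typescript
// Assumed names, mirroring the constants described in this commit.
const MAX_TOKENS = 15000;   // was 20000
const CHARS_PER_TOKEN = 3;  // was 4: a more conservative chars-per-token ratio

// Rough token estimate for a response string.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Chunking is needed once the estimate exceeds the token budget
// (i.e. beyond MAX_TOKENS * CHARS_PER_TOKEN = 45,000 characters).
function needsChunking(text: string): boolean {
  return estimateTokens(text) > MAX_TOKENS;
}
```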
- Add comprehensive .env.example with all configuration options
- Update README to reference the example file
- Includes both new GLEAN_SERVER_INSTANCE and legacy options
Thanks for working on this! I'm still digesting the overall set of changes, but I've left some initial thoughts/comments inline.
```ts
/**
 * Manages chunking of large chat responses to avoid token limit errors.
 */
export class ChatResponseBuffer {
```
3/5 (strong opinion: non-blocking)
Could you create some dedicated semi-unit tests for this class? Having a basic setup to iterate forward some of the logic WRT how the chat results are chunked would make it easier to fix bugs in this area in the future.
```ts
 * @param response The response object with potential chunk metadata
 * @returns Formatted response with chunk information if applicable
 */
export function formatChunkedResponse(response: any): string {
```
2/5 (minor preference, non-blocking)
Can we use the actual type for `response` here (instead of `any`)?
.env.example (outdated)
```shell
# Your Glean server instance URL (copy from your Glean admin panel)
GLEAN_SERVER_INSTANCE=https://your-company-be.glean.com/
```
3/5 (strong opinion: non-blocking)
Let's call this `GLEAN_SERVER_URL` instead of `GLEAN_SERVER_INSTANCE`. The main reason here is that the underlying API clients already allow passing a `server_url` param to the `new Glean(...)` constructor, and using the same verbiage throughout the stack seems better.
Thanks for the detailed feedback, @rwjblue-glean! I appreciate you taking the time to review this. I'll address each of your points:
I'll push these updates shortly and let you know when they're ready for another look.
Hey @rwjblue-glean, thanks for the thorough review! 🙏 Aaron here - working with Claude Code to address your feedback. Here's what we've implemented:

1. Dedicated Unit Tests for ChatResponseBuffer

Created comprehensive test coverage in a dedicated test file.
Example test showing our chunking logic validation:

```ts
it('should prefer splitting at paragraph boundaries', async () => {
  const paragraph = 'This is a test paragraph with some content.\n\n';
  const largeText = paragraph.repeat(2000); // Creates text large enough to chunk

  const result = await buffer.processResponse(largeText);

  expect(result.metadata).toBeDefined();
  expect(result.content.length).toBeGreaterThan(0);
  expect(result.content.length).toBeLessThan(largeText.length);
});
```

2. Replaced `any` types with proper TypeScript types from @gleanwork/api-client.
- Add comprehensive unit tests for ChatResponseBuffer class with 15 test cases
- Replace all 'any' types with proper TypeScript types from @gleanwork/api-client
- Rename GLEAN_SERVER_INSTANCE to GLEAN_SERVER_URL for API consistency
- Update all documentation and examples to use new env var name
- Add type guards for safe type discrimination in formatChunkedResponse
- Fix test files to use Author/MessageType enums instead of string literals

Co-Authored-By: Claude <[email protected]>
The test that creates 100k characters with no natural boundaries was timing out in CI (5 second default). Increased to 10 seconds to handle the heavy processing load.

Co-Authored-By: Claude <[email protected]>
Reduced test from 100k to 50k characters to test force split logic without causing timeouts in slower CI environments. The test still validates the force split behavior with 2 chunks.

Co-Authored-By: Claude <[email protected]>
All checks should be passing! Quick note on the test fix: The force-split test was creating 100k characters, which was causing timeouts on GitHub's CI runners (which are, let's be honest, running on potato computers 🥔). I dialed it back to 50k characters - still enough to properly test the force split logic (since it exceeds the 45k chunk boundary) but much more reasonable for the limited compute resources in CI. The test went from timing out at 10+ seconds to completing in under a second. Sometimes you gotta optimize for the potatoes! 😄
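For context on the 45k figure: with the constants from the earlier commit in this PR (15,000 tokens at 3 chars/token), the chunk boundary works out to 45,000 characters, so a 50k-character input still forces a split into two chunks. A quick check of that arithmetic (constant names assumed from the earlier commit):

```typescript
// Constants assumed from the "reduce MAX_TOKENS" commit in this PR.
const MAX_TOKENS = 15000;
const CHARS_PER_TOKEN = 3;
const chunkBoundaryChars = MAX_TOKENS * CHARS_PER_TOKEN; // 45,000 characters

// A 50k-character input with no natural boundaries still exceeds the
// boundary, so the force-split path is exercised, producing two chunks.
const testInputChars = 50_000;
const expectedChunks = Math.ceil(testInputChars / chunkBoundaryChars);
```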
Summary
This PR implements pagination support for the Glean MCP server to address issues with large responses consuming excessive context window tokens and to enable efficient handling of large result sets.
Changes Made
Phase 1: Search Pagination

- Added pagination to the `company_search` and `people_profile_search` tools
- New parameters: `pageSize` (1-100, default 10) and `cursor` (for fetching the next page)
- Responses include `hasMoreResults` and the next `cursor`
Phase 2: Chat Response Chunking

- Large chat responses are chunked automatically to stay within token limits
- Added a `continueFrom` parameter with `responseId` and `chunkIndex` for fetching subsequent chunks
Environment Variable Improvements

- Added `GLEAN_SERVER_INSTANCE` environment variable for clearer configuration (e.g. `https://company-be.glean.com/`)
- Environment variables are loaded via `import 'dotenv/config'`
- Legacy `GLEAN_INSTANCE` and `GLEAN_BASE_URL` are still supported
- Added `.env.example` with comprehensive configuration options

Testing & Documentation
Breaking Changes
None. All changes are backward compatible.
Usage Examples
Search with pagination:
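For instance, a first call and a follow-up using the returned cursor might look like this (argument shapes inferred from the parameter list above; the cursor value is a placeholder, not real output):

```typescript
// Hypothetical arguments for the company_search tool (shape inferred from the PR).
const firstPage = {
  query: 'engineering',
  pageSize: 20, // 1-100, default 10
};

// The response reports hasMoreResults and a cursor; pass the cursor back
// verbatim to fetch the next page.
const nextPage = {
  query: 'engineering',
  pageSize: 20,
  cursor: '<cursor-from-previous-response>',
};
```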
Chat continuation:
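A continuation request, using the `continueFrom` parameter described above (values are placeholders; the exact shape is an assumption based on the PR description):

```typescript
// Hypothetical chat continuation arguments. responseId comes from the
// chunk metadata returned with the first (truncated) reply.
const continuation = {
  continueFrom: {
    responseId: '<response-id-from-chunk-metadata>',
    chunkIndex: 1, // index of the next chunk to fetch (numbering assumed)
  },
};
```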
Environment configuration:
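A sketch of the relevant variables, using only the names mentioned in this PR (values illustrative; note that a later commit in this PR renames `GLEAN_SERVER_INSTANCE` to `GLEAN_SERVER_URL`):

```shell
# New-style configuration
GLEAN_SERVER_INSTANCE=https://your-company-be.glean.com/

# Legacy options, still supported for backward compatibility
# GLEAN_INSTANCE=your-company
# GLEAN_BASE_URL=https://your-company-be.glean.com/
```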
Testing