feat: adds an export tool and exported-data resource MCP-16 #424
Conversation
Force-pushed 61e7064 to e359184
Pull Request Overview
This PR introduces comprehensive JSON export functionality for MongoDB collections, allowing users to export data in both relaxed and canonical JSON formats, with automatic cleanup and resource access via MCP resource templates.
Key changes:
- Added a new `export` tool that supports filtering, sorting, limiting, and projecting MongoDB collection data
- Implemented an `exported-data://` resource template for stable access to exported files
- Created automatic cleanup mechanisms with configurable timeouts and intervals
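As a rough illustration, this is how an MCP client might invoke the new tool over stdio. This is a sketch only: `filter`, `sort`, `limit`, and `projection` come from the description above, but the other argument names (`database`, `collection`), the sample values, and the `npx` invocation are assumptions, not the tool's actual schema.

```ts
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Connect to the MCP server over stdio (launch command is an assumption).
const transport = new StdioClientTransport({
  command: "npx",
  args: ["-y", "mongodb-mcp-server"],
});
const client = new Client({ name: "export-demo", version: "0.0.1" });
await client.connect(transport);

// Request a JSON export of a filtered, sorted, limited, projected query.
const result = await client.callTool({
  name: "export",
  arguments: {
    database: "sample_mflix",
    collection: "movies",
    filter: { year: { $gte: 2000 } },
    sort: { year: -1 },
    limit: 100,
    projection: { title: 1, year: 1 },
  },
});

// The result is expected to include a resource_link pointing at exported-data://...
console.log(result.content);
```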
Reviewed Changes
Copilot reviewed 17 out of 18 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/common/sessionExportsManager.ts | Core export management with file operations, cleanup, and EJSON formatting |
| src/tools/mongodb/read/export.ts | Export tool implementation with MongoDB query support |
| src/resources/common/exportedData.ts | Resource template for accessing exported data with autocomplete |
| tests/unit/common/sessionExportsManager.test.ts | Comprehensive unit tests for export manager functionality |
| tests/integration/tools/mongodb/read/export.test.ts | Integration tests for export tool with various query scenarios |
| tests/integration/resources/exportedData.test.ts | Resource template integration tests |
Comments suppressed due to low confidence (1)

src/common/sessionExportsManager.ts:131

The closing bracket is written directly to the output stream here, but the Transform stream's `final` callback on line 184 also pushes a closing bracket. This will create invalid JSON with duplicate closing brackets.

```ts
outputStream.write("]\n");
```
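For context, a minimal sketch of the single-bracket framing this points at: the closing `]` is emitted exactly once, from the Transform's `flush` callback (the author later notes the write was moved there). The function name and the relaxed/canonical option handling are illustrative, not the PR's actual code.

```ts
import { Transform } from "node:stream";
import { EJSON, type Document } from "bson";

// Streams BSON documents as one JSON array: "[" before the first document,
// "," between documents, and "]" exactly once, pushed from flush().
function jsonArrayTransform(relaxed: boolean): Transform {
  let first = true;
  return new Transform({
    objectMode: true,
    transform(doc: Document, _encoding, callback) {
      const prefix = first ? "[\n" : ",\n";
      first = false;
      callback(null, prefix + EJSON.stringify(doc, { relaxed }));
    },
    flush(callback) {
      // Only place the closing bracket is written; the downstream writable
      // must not append another one, or the output is invalid JSON.
      callback(null, first ? "[\n]\n" : "\n]\n");
    },
  });
}
```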
📊 Accuracy Test Results: 📈 Summary, 📊 Baseline Comparison, 📎 Download Full HTML Report (generated on 8/6/2025, 11:28:49 AM)
Force-pushed 6109cea to e89cf23
Pull Request Test Coverage Report for Build 16876636630

Warning: This coverage report may be inaccurate. This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

💛 - Coveralls
Haven't reviewed it thoroughly, but leaving some comments for things that should be reworked. Happy to grab some time later today or tomorrow to go through the high-level architecture and offer some suggestions there.
Force-pushed e89cf23 to e8845bc
```ts
export class ExportedData {
    private readonly name = "exported-data";
    private readonly description = "Data files exported in the current session.";
    private readonly uri = "exported-data://{exportName}";
```
Maybe `mongodb-exported-data`? The namespace is too generic otherwise (a similar argument can be made about our tool names in general).
Yeah, this makes sense. I would personally like us to do it at least for all the resources, if not for everything. I'll ask whether anyone objects before making this a blanket change.
My understanding is that the resource URIs are already scoped to the MCP server, so it's not like they could conflict between different servers.
The scoping, I think, is controlled by clients, and each might decide to implement it differently. It should be fine if we make our resources/tools more specific.
It is controlled by clients, but they need to be able to associate resources with servers - e.g. even if there's a conflict between schemes, there's no way that we could serve `exported-data://name` if it comes from a different MCP server.
Force-pushed b76b8cb to a3f9e8e
More comments from me - haven't gone through the entire thing yet, but made it through the exports manager and will continue after a few meetings.
Force-pushed 5e5698b to dc457ae
Force-pushed 781ee47 to 6752099
Force-pushed 6752099 to d3d81d6
1. `outputStream.write` moved to within the `Transform.flush`
2. Won't send a resource-updated notification on the export-expired event, or it might trigger the client to fetch expired exports.
3. Added an ObjectId to the file names to make them unique.
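For point 3, a minimal sketch of ObjectId-based unique naming; the exact naming scheme shown here is an assumption, not the PR's actual format.

```ts
import { ObjectId } from "bson";

// Hypothetical naming scheme: append a freshly generated ObjectId so two exports
// of the same namespace never collide and the name is not guessable.
function exportFileName(database: string, collection: string): string {
  return `${database}.${collection}.${new ObjectId().toHexString()}.json`;
}

// e.g. "sample_mflix.movies.<24-hex-char-objectid>.json"
```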
Some more comments from me - it's almost ready
src/common/exportsManager.ts
```ts
static init(sessionId: string, config: ExportsManagerConfig, logger: LoggerBase): ExportsManager {
    const exportsDirectoryPath = path.join(config.exportsPath, sessionId);
    const exportsManager = new ExportsManager(exportsDirectoryPath, config, logger);
    exportsManager.init();
    return exportsManager;
}
```
Don't have super strong feelings here, but I feel this pattern is not super idiomatic. I guess you went that route to avoid starting the cleanup in the ctor, but an alternative approach would be to move setting up the interval to the first time an export is requested. That would give us two benefits - first, we'd be able to use the ctor like normal and second, we wouldn't be running the cleanup logic if the user never requested an export. So it would look something like:

```ts
constructor(...) {
    // set the class-level variables as usual
}

createJSONExport(...) {
    if (!this.exportsCleanupInterval) {
        this.exportsCleanupInterval = setInterval(...);
    }
}
```

We can also get rid of the `wasInitialized` field since it's pretty much just returning whether `exportsCleanupInterval` is undefined or not.
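A slightly fuller rendering of the lazy-initialization pattern being sketched above; the class and member names are illustrative, not the actual ExportsManager API.

```ts
// Illustrative only: interval-based cleanup that starts on first use.
class LazyCleanupExports {
  private cleanupInterval?: NodeJS.Timeout;

  constructor(
    private readonly cleanupIntervalMs: number,
    private readonly cleanupExpiredExports: () => Promise<void>
  ) {}

  async createJSONExport(run: () => Promise<void>): Promise<void> {
    // First export requested: only now do we start paying for the cleanup loop.
    if (!this.cleanupInterval) {
      this.cleanupInterval = setInterval(() => {
        void this.cleanupExpiredExports();
      }, this.cleanupIntervalMs);
      this.cleanupInterval.unref(); // don't keep the process alive just for cleanup
    }
    await run();
  }

  close(): void {
    if (this.cleanupInterval) {
      clearInterval(this.cleanupInterval);
      this.cleanupInterval = undefined;
    }
  }
}
```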
While the suggestion looks really good, the only reason I wouldn't want to do that is because we'd be triggering an unrelated side-effect in `createJSONExport`.

As per my understanding, offloading complex initializations to dedicated methods is pretty common. And I offloaded it to a static method only because otherwise it would have been a pain to do this in every test:

```ts
const exportsManager = new ExportsManager(...)
exportsManager.init();
```
I wouldn't say the side-effect is unrelated - `createJSONExport` is the entrypoint for creating files on disk. So, initializing the cleanup loop only once we have files to clean up does make sense to me.
Even then, I don't think `createJSONExport` should concern itself with starting the cleanup loop as well; that's not what it's meant for.
As a counterargument, what's the point of having a cleanup loop if you know there's nothing to clean up? Not a hill I'm willing to die on, so happy to keep it as-is, but an alternative design would be to use `setTimeout` instead of a cleanup loop - that is, every time we create a JSON export, we set up a cleanup callback to delete it after x minutes. In that case, it would very much be `createJSONExport` that sets that up, right? While the cleanup mechanism is different, I would say the notion of triggering/scheduling/setting up a garbage collection procedure as a side effect of creating "garbage" would not be surprising to me.
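A minimal sketch of that setTimeout-based alternative (per-export expiry instead of a polling loop); the function name and the idea of returning the timer for cancellation are assumptions for illustration.

```ts
import { rm } from "node:fs/promises";

// Schedule deletion of a single export file after the configured timeout.
// Returns the timer so callers could cancel it (e.g. on session close).
function scheduleExportCleanup(filePath: string, exportTimeoutMs: number): NodeJS.Timeout {
  const timer = setTimeout(() => {
    // force: true makes this a no-op if the file was already removed.
    void rm(filePath, { force: true });
  }, exportTimeoutMs);
  timer.unref(); // don't keep the process alive just to delete a temp file
  return timer;
}
```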
```ts
{
    type: "resource_link",
    name: exportName,
    uri: exportURI,
```
I'm not sure how this behaves, but as things stand, we'll be providing a `uri` that's unavailable in the response. Does this confuse models at all, or are they handling it correctly?
In my tests, I did not see any "confusing" responses from the models. They correctly interpreted the response and the "in-progress" part of it. Yes, the response_uri is available, which means users (through the client) can decide to access it, but even that should be fine because we always respond with a "still in progress" notification.
Perhaps I'm misunderstanding the logic here, but looking at `ExportsManager.availableExports`, it seems like we're filtering out the in-progress exports, which means that the `exportURI` we supply here would not resolve to a valid resource until the export is completed.
The filtering is only for the autocompletion of the export names. The autocomplete is triggered when someone chooses the templated URI and then starts typing the name of the export. But if someone requests the resource directly by just providing the exportName / exportURI, then the `readResourceCallback` will be triggered, and that will return "Resource is still being generated" error content.
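A rough sketch of that read behaviour; the status tracking, names, and return shape here are assumptions for illustration, not the actual `readResourceCallback` implementation.

```ts
import { readFile } from "node:fs/promises";

// Hypothetical bookkeeping: exports the session knows about, keyed by name.
type ExportEntry = { status: "in-progress" | "ready"; filePath: string };
const sessionExports = new Map<string, ExportEntry>();

// Resolve an exported-data://{exportName} request.
async function readExportedData(exportName: string): Promise<{ uri: string; text: string }> {
  const uri = `exported-data://${exportName}`;
  const entry = sessionExports.get(exportName);
  if (!entry) {
    return { uri, text: `Export "${exportName}" does not exist or has expired.` };
  }
  if (entry.status === "in-progress") {
    // Reading an export that is still being generated is not a hard failure;
    // the caller is simply told to retry later.
    return { uri, text: `Export "${exportName}" is still being generated.` };
  }
  return { uri, text: await readFile(entry.filePath, "utf8") };
}
```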
I think for the most part, but we may need to rework the last commit slightly to either replace the dependency or reimplement it in another way.
Force-pushed 802bd8e to 78a9044
RWLocks are not necessary here because sessions are single user and we don't want the agent to wait until a resource is available, as it can take forever depending on the data set.
Force-pushed 78a9044 to 1039f4e
Proposed changes

This PR introduces:
- an `export` tool to export data from MongoDB collections as JSON
- an `exported-data://` resource to access the data exports created by the export tool

The JSON export tool provides two formats for data export (illustrated in the sketch at the end of this section):
- `relaxed`: exports plain JSON, at the possible loss of some BSON types. Useful when the data is expected to be used in tools other than MongoDB, such as BI tools, or when simply having a conversation with LLMs.
- `canonical`: exports JSON with BSON types and is expected to be used when the data might get imported back into MongoDB.

The JSON export tool allows exporting:
- data matching a `filter` (possibly with `limit`, `sort`, `projection`)

The export tool result contains:
- for the `stdio` transport, information that makes it easier for editor clients to access the exported file

Exported data temporarily resides on the host machine running the mongodb-mcp-server at the configured path (configuration exposed by the flag `--exportPath` or the environment variable `MDB_MCP_EXPORT_PATH`), until it gets cleaned up automatically, either by:
- the periodic cleanup (configured by the flag `--exportCleanupIntervalMs` or the environment variable `MDB_MCP_EXPORT_CLEANUP_INTERVAL_MS`)

The expiry of exported data is controlled by the flag `--exportTimeoutMs` or the environment variable `MDB_MCP_EXPORT_TIMEOUT_MS`.

The exported data can be accessed via the resource URI template `exported-data://{exportName}`. Because the exported data are written to non-guessable filenames, the autocomplete API is provided to easily autocomplete the matching file names.
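As a quick illustration of the difference between the two export formats described above, here is a sketch using the `bson` package's EJSON helpers; the sample document is invented.

```ts
import { EJSON, ObjectId, Long } from "bson";

const doc = { _id: new ObjectId(), views: Long.fromNumber(42), title: "Example" };

// relaxed: plain-looking JSON; some BSON type fidelity is lost (views becomes a plain 42)
console.log(EJSON.stringify(doc, { relaxed: true }));
// e.g. {"_id":{"$oid":"..."},"views":42,"title":"Example"}

// canonical: keeps BSON type information so the data can round-trip back into MongoDB
console.log(EJSON.stringify(doc, { relaxed: false }));
// e.g. {"_id":{"$oid":"..."},"views":{"$numberLong":"42"},"title":"Example"}
```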