-
Notifications
You must be signed in to change notification settings - Fork 461
Fix: handle subprocess failing after startup #134
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- Fix resource cleanup in error cases - Improve error handling in waitUntilReadyOrExit - Add proper pipe cleanup on command start failure
WalkthroughThe changes introduce enhanced error handling for subprocess exits in the Stdio transport layer, including tracking exit errors and refining readiness checks. Tests are updated to simulate and verify immediate subprocess exit scenarios. The mock server is modified to support environment-driven early termination for more robust testing of startup failure conditions. Changes
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (3)
client/transport/stdio.go (3)
18-21
: Timeout constants are baked‑in – consider configurability
readyTimeout
(5 s) andreadyCheckTimeout
(1 s) are reasonable defaults but
may be too aggressive/lenient for slower environments (CI, ARM boards, etc.).
Exposing them as optional parameters ofNewStdio
(or via functional options)
would let callers tune startup behaviour without touching the code.
41-41
: Channel name could better convey intentThe field
processExitErr
is a channel, not the error itself. Renaming it to
processExitCh
(or similar) clarifies its signalling nature and avoids mental
overhead when readingprocessExitErr <- cmd.Wait()
.Also applies to: 58-61
98-101
: Close the sender goroutine on context cancellationIf the parent
ctx
passed toStart()
is cancelled,cmd.Wait()
will return
but the goroutine remains blocked sending onprocessExitErr
if the buffer is
full. Adding aselect
with a parent‑context<-ctx.Done()
avoids a
potential goroutine leak:-go func() { - c.processExitErr <- cmd.Wait() -}() +go func() { + err := cmd.Wait() + select { + case c.processExitErr <- err: + case <-ctx.Done(): + } +}()
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
client/transport/stdio.go
(5 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
client/transport/stdio.go (2)
client/transport/interface.go (1)
JSONRPCResponse
(36-45)testdata/mockstdio_server.go (1)
JSONRPCResponse
(18-26)
Hi @ezynda3, could you please help me review the PR? I look forward to your reply. Thanks. |
Looks like there are some merge conflicts. Can you have a look? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
client/transport/stdio_test.go (1)
428-428
: Uncomment or remove the commented defer statement.The defer statement for cleaning up the temporary file is commented out, which could leave temporary files behind after test execution.
-//defer os.Remove(mockServerPath) +defer os.Remove(mockServerPath)client/transport/stdio.go (1)
178-180
: Misleading comment about multiple notification handlers.The comment suggests multiple handlers can be registered, but the implementation in
SetNotificationHandler
only supports a single handler (it overwrites any previous handler).-// OnNotification registers a handler function to be called when notifications are received. -// Multiple handlers can be registered and will be called in the order they were added. +// SetNotificationHandler registers a handler function to be called when notifications are received. +// Only one handler can be active at a time; setting a new handler replaces any previous handler.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
client/stdio_test.go
(1 hunks)client/transport/sse_test.go
(0 hunks)client/transport/stdio.go
(4 hunks)client/transport/stdio_test.go
(4 hunks)server/sse.go
(0 hunks)testdata/mockstdio_server.go
(1 hunks)
💤 Files with no reviewable changes (2)
- client/transport/sse_test.go
- server/sse.go
✅ Files skipped from review due to trivial changes (1)
- client/stdio_test.go
🧰 Additional context used
🧬 Code Graph Analysis (1)
client/transport/stdio_test.go (1)
client/transport/stdio.go (1)
NewStdio
(63-80)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: test
🔇 Additional comments (9)
testdata/mockstdio_server.go (1)
29-32
: Good implementation of the immediate failure simulation.This addition enables testing scenarios where a child process starts but immediately fails, which is exactly what the PR aims to fix. The error message is clear and the exit code (1) appropriately signals a failure.
client/transport/stdio_test.go (1)
410-439
: Effective test for immediate subprocess failure.This test case directly addresses the PR objective by testing what happens when a subprocess starts successfully but exits immediately before normal operation begins. The test properly sets up the test environment with the
MOCK_FAIL_IMMEDIATELY=1
flag and correctly verifies that an error is returned fromStart()
.client/transport/stdio.go (7)
19-22
: Good extraction of timeout constants.Extracting these timeout values as constants improves readability and makes them easier to modify if needed.
42-43
: Effective solution for tracking subprocess exit.Adding these fields is a good approach to track subprocess exit and propagate errors without risk of deadlocks. The
processExited
channel signals subprocess termination, while the atomicexitErr
safely stores the error value.
74-77
: Proper initialization of the new channel.Initializing
processExited
in the constructor ensures it's always available for signaling process termination.
124-130
: Well-implemented subprocess exit monitoring.This goroutine effectively addresses the PR objective by monitoring the subprocess for early termination. It safely stores any error in the atomic value and signals termination through channel closure, allowing other parts of the code to react to subprocess failure.
132-141
: Improved readiness signaling with error detection.This change correctly separates process startup from readiness signaling, allowing for proper error detection when a process exits early. The approach ensures that early subprocess failures are properly detected and reported.
145-159
: Good implementation of readiness and early-exit detection.This function effectively handles three scenarios:
- Process exits before signaling readiness
- Process signals readiness but exits immediately after
- Process fails to signal readiness within timeout
The short 1-second check after readiness signal is particularly important for detecting failures that happen right after startup.
171-174
: Fixed potential deadlock in Close method.This change resolves the deadlock issue mentioned in the previous review by waiting on channel closure rather than directly calling
cmd.Wait()
. It also properly propagates any exit error that was stored in the atomic value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
client/transport/stdio.go
(4 hunks)
🔇 Additional comments (9)
client/transport/stdio.go (9)
7-8
: Good additions to the import list.Adding the necessary imports for error handling, atomic operations, and time manipulation is essential for the new functionality. These imports properly support the changes to handle subprocess failures.
Also applies to: 13-14
19-22
: Good use of constants for timeout values.Defining timeout constants at package level improves readability and maintainability. These constants support the new readiness and exit detection mechanism.
42-43
: Good addition of atomic field for error storage.The
exitErr
field using atomic.Value is a thread-safe way to store and retrieve subprocess exit errors across goroutines.
85-91
: Good implementation of the readiness check.The updated Start method properly separates concerns by using a goroutine to read responses and a dedicated function to wait for readiness or detect early process exit.
130-136
: Great improvement to handle subprocess failures.This goroutine properly waits for the subprocess to exit, stores any error using atomic operations, and signals completion by closing the done channel. This directly addresses the PR's goal of handling subprocess failures after startup.
140-147
: Good implementation of safe channel closure.The
tryCloseDone
function prevents panics by safely closing the channel only if it hasn't been closed yet. This is important when multiple goroutines might attempt to close the same channel.
148-162
: Good implementation of timeout handling.The
waitUntilReadyOrExit
function effectively handles multiple scenarios:
- Process exits before signaling readiness
- Process signals readiness but exits immediately after
- Process signals readiness and continues running
- Process doesn't signal readiness within the timeout period
This comprehensive approach ensures proper error handling in all cases.
167-169
: Good update to Close method.Using
tryCloseDone
in the Close method ensures proper cleanup even if the channel was already closed by the exit goroutine.
176-178
: Great improvement to error handling.Checking the stored exit error using atomic operations is thread-safe and properly propagates subprocess errors to the caller. This fixes the issue where subprocess failures weren't being detected.
Hi @ezynda3 I have resolved the conflict and added a test case. |
this fixes an issue where the child process is created successfully, but immediately crashes.The client did not detect the failure.
Summary by CodeRabbit