Description
When the Stdin, Stdout, or Stderr fields of os/exec.Cmd are set to anything other than nil or a *os.File (a common case is *bytes.Buffer), we call os.Pipe to get a pipe and create goroutines to copy data in or out. The (*Cmd).Wait method first waits for the subprocess to exit, then waits for those goroutines to finish copying data.
If the subprocess C1 itself starts a subsubprocess C2, and if C1 passes any of its stdin/stdout/stderr descriptors to C2, and if C1 exits without waiting for C2 to exit, then C2 will hold an open end of the pipes created by the os/exec package. The (*Cmd).Wait method will wait until the goroutines complete, which means waiting until those pipes are closed, which in practice means waiting until C2 exits. This is confusing, because the user sees that C1 is done, and doesn't understand why their program is still waiting for it.
This confusion has been filed as an issue multiple times, at least #7378, #18874, #20730, #21922, #22485.
It doesn't have to work this way. Although the current goroutines call io.Copy, we could change them to use a loop. After every Read, the loop could check whether the process has been waited for. Then Wait could wait for the child, tell the goroutines to stop, give them a chance for a final write, and then return. The stdout/stderr goroutines would close their end of the pipe. There wouldn't be any race and there wouldn't be any unexpected waits. But in cases where there is a C2 process, not all of the standard output and standard error output that we currently collect would be available.
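A rough sketch of what such a loop might look like. This is not the os/exec implementation; copyUntilDone and demo are hypothetical names, the done channel stands in for "the process has been waited for", and a read deadline on the pipe is just one assumed way to wake a blocked Read (it relies on deadlines working for os.Pipe files, which holds on Unix-like systems):

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
	"time"
)

// copyUntilDone copies src to dst like io.Copy, but after every Read it
// checks whether done has been closed and stops if so.
func copyUntilDone(dst io.Writer, src io.Reader, done <-chan struct{}) error {
	buf := make([]byte, 32*1024)
	for {
		n, rerr := src.Read(buf)
		if n > 0 {
			if _, werr := dst.Write(buf[:n]); werr != nil {
				return werr
			}
		}
		select {
		case <-done:
			return nil // asked to stop; a read error here is expected
		default:
		}
		if rerr == io.EOF {
			return nil
		}
		if rerr != nil {
			return rerr
		}
	}
}

// demo simulates the C2 case without starting any process: the write end
// of the pipe stays open for the whole copy, as if C2 still held it.
func demo() (string, error) {
	r, w, err := os.Pipe()
	if err != nil {
		return "", err
	}
	defer r.Close()
	defer w.Close()

	var out bytes.Buffer
	done := make(chan struct{})
	copied := make(chan error, 1)
	go func() { copied <- copyUntilDone(&out, r, done) }()

	if _, err := w.Write([]byte("from C1\n")); err != nil {
		return "", err
	}
	close(done) // "C1 has been waited for"
	// Give the goroutine a short grace period for a final read, then wake it.
	if err := r.SetReadDeadline(time.Now().Add(200 * time.Millisecond)); err != nil {
		return "", err
	}
	err = <-copied
	return out.String(), err
}

func main() {
	out, err := demo()
	fmt.Printf("output=%q err=%v\n", out, err)
}
```

The copy finishes even though the write end is never closed. Anything written to the pipe after the grace period would be lost, which is the trade-off described above.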
To be clear, programmers can already handle these cases however they like by calling os.Pipe themselves and using the pipe in the Cmd struct. That is true today and it would be true if we change how it works.
The questions I want to raise are: would making this change be less confusing for people? Can we make this change without breaking the Go 1 guarantee?
CC @bradfitz