os/exec: resource leak on exec failure #69284

Closed
@rustyx

Go version

go version go1.23.0 linux/amd64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='amd64'
GOBIN=''
GOCACHE='/home/user/.cache/go-build'
GOENV='/home/user/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/user/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/user/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/local/go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/local/go/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.23.0'
GODEBUG=''
GOTELEMETRY='local'
GOTELEMETRYDIR='/home/user/.config/go/telemetry'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/dev/null'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build2691713470=/tmp/go-build -gno-record-gcc-switches'

What did you do?

It seems that when exec.Cmd.Start fails (process.Start in the repro below), some file descriptors are left open, which after a while leads to a "too many open files" error.

Here's a minimal repro:

package main

import (
	"context"
	"errors"
	"os/exec"
	"syscall"
	"testing"
)

func TestExecResources(t *testing.T) {
	ctx := context.Background()
	var oldRLimit syscall.Rlimit
	err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &oldRLimit)
	if err != nil {
		t.Fatal(err)
	}
	newRLimit := oldRLimit
	newRLimit.Cur = 20
	err = syscall.Setrlimit(syscall.RLIMIT_NOFILE, &newRLimit)
	if err != nil {
		t.Fatal(err)
	}
	defer func() {
		syscall.Setrlimit(syscall.RLIMIT_NOFILE, &oldRLimit)
	}()

	for i := 0; i < 22; i++ {
		ctx, cancel := context.WithCancel(ctx)
		process := exec.CommandContext(ctx, "/bin/nonexistent")
		err = process.Start()
		cancel()
		process.Wait() // not really needed, just to demonstrate that even calling Wait doesn't help
		var se syscall.Errno
		if !errors.As(err, &se) || se != syscall.ENOENT {
			t.Fatal(err)
		}
	}
}

What did you see happen?

The test fails with:

--- FAIL: TestExecResources (0.00s)
    process_unix_test.go:35: fork/exec /bin/nonexistent: too many open files

What did you expect to see?

I expected the test to pass.

Note that the issue also occurs on other exec failures, such as permission denied, invalid ELF format, etc.

Activity

rogpeppe (Contributor) commented on Sep 5, 2024

This is a regression in Go 1.23.0. I've bisected it to commit 2f64268, introduced by this CL https://go-review.googlesource.com/c/go/+/570036/

rogpeppe (Contributor) commented on Sep 5, 2024

Looks like a bug in syscall.StartProcess to me. The docs for PidFD say:

	// PidFD, if not nil, is used to store the pidfd of a child, if the
	// functionality is supported by the kernel, or -1. Note *PidFD is
	// changed only if the process starts successfully.
	PidFD *int

but this program prints "PidFD set on failure", which seems wrong:

package main

import (
	"fmt"
	"syscall"
)

func main() {
	var pidfd int = -1
	_, _, err := syscall.StartProcess("nonexistent", []string{"nonexistent"}, &syscall.ProcAttr{
		Sys: &syscall.SysProcAttr{
			PidFD: &pidfd,
		},
	})
	if err == nil {
		panic("unexpected StartProcess success")
	}
	if pidfd != -1 {
		fmt.Printf("PidFD set on failure to %v\n", pidfd)
	}
}
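
For code that calls syscall.StartProcess directly with PidFD set (this does not help os/exec users, who never see the descriptor), a caller-side guard could look roughly like the sketch below. The helper name startWithPidFD is hypothetical, the code is Linux-only, and it assumes that an unaffected toolchain leaves *PidFD at -1 on failure, as the doc comment quoted above describes, so the extra Close is skipped there.

```go
package main

import (
	"fmt"
	"syscall"
)

// startWithPidFD is a hypothetical helper: it starts path through
// syscall.StartProcess, asks for a pidfd, and makes sure that no
// descriptor stays open when the start fails.
func startWithPidFD(path string, argv []string) (pid, pidfd int, err error) {
	pidfd = -1
	pid, _, err = syscall.StartProcess(path, argv, &syscall.ProcAttr{
		Sys: &syscall.SysProcAttr{
			PidFD: &pidfd,
		},
	})
	if err != nil {
		if pidfd != -1 {
			// An affected toolchain can hand back a descriptor even
			// though the exec failed; close it so it does not leak.
			syscall.Close(pidfd)
		}
		return 0, -1, err
	}
	return pid, pidfd, nil
}

func main() {
	_, pidfd, err := startWithPidFD("nonexistent", []string{"nonexistent"})
	fmt.Printf("err=%v pidfd=%d\n", err, pidfd)
}
```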

dmitshur (Member) added the NeedsInvestigation label and added this to the Backlog milestone on Sep 5, 2024

ianlancetaylor (Contributor) commented on Sep 5, 2024

@rogpeppe I don't think your program is buggy. The program does get a valid pidfd that refers to the child. It's true that the child exits immediately because the exec failed. That just means that the pidfd refers to the zombie process.

gopherbot (Contributor) commented on Sep 5, 2024

Change https://go.dev/cl/611217 mentions this issue: os: release pidfd if StartProcess fails

rogpeppe (Contributor) commented on Sep 6, 2024

> I don't think your program is buggy.

I'd say it's buggy because the behaviour does not conform to the documentation. The docs say "PidFD is changed only if the process starts successfully". The process did not start successfully, but PidFD was nonetheless changed.

ianlancetaylor (Contributor) commented on Sep 6, 2024

@rogpeppe Thanks, I see what you mean. I wasn't quite grasping that syscall.StartProcess was returning failure and in that case it has already waited for the zombie process.

gopherbot (Contributor) commented on Sep 6, 2024

Change https://go.dev/cl/611495 mentions this issue: syscall: on exec failure, close pidfd

On Sep 10, 2024: the NeedsFix and FixPending labels were added and NeedsInvestigation was removed; the milestone was changed from Backlog to Go1.24; commit 8926ca9, which references this issue, was added.

mvdan (Member) commented on Sep 11, 2024

Is the plan to backport this to Go 1.23.x?

ianlancetaylor (Contributor) commented on Sep 11, 2024

@gopherbot Please open a backport issue to 1.23.

This causes os/exec to leak file descriptors when used to run a non-existent file on Linux. There is no simple workaround.
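
In the narrow case of a missing or non-executable file, the failing fork/exec can often be avoided by checking the path first. The sketch below is a partial mitigation only, not a general workaround: startChecked is a hypothetical wrapper, it does not cover failures such as an invalid binary format, and the file can still change between the check and the exec.

```go
package main

import (
	"fmt"
	"os/exec"
)

// startChecked is a hypothetical wrapper that resolves the executable with
// exec.LookPath before calling Start, so that a missing or non-executable
// file is reported without forking a child process.
func startChecked(name string, args ...string) (*exec.Cmd, error) {
	path, err := exec.LookPath(name)
	if err != nil {
		return nil, err
	}
	cmd := exec.Command(path, args...)
	if err := cmd.Start(); err != nil {
		// Other exec failures can still end up here.
		return nil, err
	}
	return cmd, nil
}

func main() {
	if _, err := startChecked("/bin/nonexistent"); err != nil {
		fmt.Println("not started:", err)
	}
}
```

exec.Command only performs this lookup itself when the name contains no path separators, which is why a path like /bin/nonexistent otherwise reaches fork/exec.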

gopherbot (Contributor) commented on Sep 11, 2024

Backport issue(s) opened: #69402 (for 1.23).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://go.dev/wiki/MinorReleases.

gopherbot (Contributor) commented on Sep 16, 2024

Change https://go.dev/cl/613616 mentions this issue: [release-branch.go1.23] syscall: on exec failure, close pidfd
