Description
Twice this week I've helped people debug panics in Go programs due to "runtime/cgo: pthread_create failed: Resource temporarily unavailable". Both times, it has turned out that they were calling syscall.Exec and it happened to execute concurrently with a pthread_create for a background goroutine, causing the pthread_create to fail with EAGAIN.
There have been a couple of previous threads on go-nuts involving similar issues.
We could consider modifying the runtime to stop the world and/or shut down threads during calls to syscall.Exec, but that seems like a fair amount of work and the syscall package is frozen / deprecated anyway.
As a simpler step, I think we should have vet warn about any calls to syscall.Exec from outside the standard library.
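To make the proposal concrete, here is a minimal sketch of what a purely syntactic check for syscall.Exec calls could look like. It is not an existing vet analyzer: the function name findSyscallExec and the sample source are hypothetical, and the check assumes the package is imported under its default name "syscall" and does no type checking.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// findSyscallExec (hypothetical helper) parses src and returns the positions
// of calls written as syscall.Exec(...). It is purely syntactic: it assumes
// the package is imported under its default name and does no type checking.
func findSyscallExec(filename, src string) ([]token.Position, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, filename, src, 0)
	if err != nil {
		return nil, err
	}
	var found []token.Position
	ast.Inspect(file, func(n ast.Node) bool {
		call, ok := n.(*ast.CallExpr)
		if !ok {
			return true
		}
		sel, ok := call.Fun.(*ast.SelectorExpr)
		if !ok {
			return true
		}
		// Match the selector expression syscall.Exec by name only.
		if pkg, ok := sel.X.(*ast.Ident); ok && pkg.Name == "syscall" && sel.Sel.Name == "Exec" {
			found = append(found, fset.Position(call.Pos()))
		}
		return true
	})
	return found, nil
}

// sample is an illustrative input, not code from the issue.
const sample = `package p

import "syscall"

func run() error {
	return syscall.Exec("/bin/true", []string{"true"}, nil)
}
`

func main() {
	positions, err := findSyscallExec("sample.go", sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, p := range positions {
		fmt.Printf("%s: call to syscall.Exec\n", p)
	}
}
```

A real check would need type information to handle renamed imports and to exempt the standard library, which this sketch does not attempt.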
Activity
ianlancetaylor commented on Dec 1, 2016
If you can write a test case for the problem, we will simply change runtime/cgo to retry the pthread_create if it fails with EAGAIN. The only reason I didn't do that years ago is that I wasn't able to write a test case.
bcmills commented on Dec 2, 2016
I'll try to write one tomorrow based on the failures I've been seeing.
I think the key elements are:
- [...] pthread_create)
- Exec very early (so that we haven't reached GOMAXPROCS yet when it executes)
bcmills commented on Dec 2, 2016
The following test gives me a flake rate of about 0.1% of attempts:
Title changed from 'cmd/vet: warn about calls to syscall.Exec' to 'runtime/cgo: "pthread_create failed" during syscall.Exec'
ianlancetaylor commented on Dec 2, 2016
@bcmills thanks for figuring out the test. Sent https://golang.org/cl/33894.
gopherbot commented on Dec 2, 2016
CL https://golang.org/cl/33894 mentions this issue.
runtime/cgo: retry pthread_create on EAGAIN
Title changed from 'runtime/cgo: "pthread_create failed" during syscall.Exec' to 'runtime/cgo: "pthread_create failed" during syscall.Exec on Darwin & OpenBSD'
ianlancetaylor commented on Dec 5, 2016
I believe this is now fixed on systems other than OpenBSD (requires changes to runtime/cgo/gcc_libinit_openbsd.c) and Darwin (test fails for unknown reasons).
Title changed from 'runtime/cgo: "pthread_create failed" during syscall.Exec on Darwin & OpenBSD' to 'runtime/cgo: "pthread_create failed" during syscall.Exec on Darwin/OpenBSD/DragonFly'
gopherbot commented on Dec 5, 2016
CL https://golang.org/cl/33905 mentions this issue.
gopherbot commented on Dec 5, 2016
CL https://golang.org/cl/33906 mentions this issue.
misc/cgo/test: skip Test18146 on DragonFly
gopherbot commented on Dec 5, 2016
CL https://golang.org/cl/33907 mentions this issue.
misc/cgo/test: ignore "too many open files" in issue 18146 test
mdempsky commented on Mar 23, 2017
FYI, just saw a flake on linux-amd64: https://storage.googleapis.com/go-build-log/28b72716/linux-amd64_50de8e07.log
syscall: fix Exec on solaris
gopherbot commented on Jul 27, 2017
Change https://golang.org/cl/47032 mentions this issue:
syscall: fix Exec on solaris