log/syslog, net: goroutine permanently stuck in waitWrite() to /dev/log using a unix datagram socket #23604
This issue is closely related to, if not the same as, #5932 (there are multiple trains of thought in that issue).
Sorry to hear you're having problems. Sadly, Go 1.7 is long out of support. Can you upgrade to 1.9.x or 1.10rc1 and verify whether the problem still occurs?
I was able to reproduce just now using Go 1.10rc1. Here's the full goroutine stack trace: https://pastebin.com/xwnYJKi7
I modified the runtime to print certain key pollDesc events: close, evict, pollUnblock, etc. I noticed that when I restart syslog-ng and see two log lines, one close/evict/unblock group for each syslog connection, the process ends up not hanging. When I only see one group of those calls, the other connection hangs. My current theory is that something is not calling FD.Close(), which calls fd.pd.evict(), which calls runtime_pollUnblock. Still digging...
After much more digging I'm pretty certain that we should be seeing an EPOLLHUP or EPOLLERR condition in netpoll() when the /dev/log read side closes unexpectedly, but we're not. Since we don't wake up the waitWrite method and the fd is half shut down, the syslog write stays stuck.
It's worth mentioning that I've reproduced this on Linux kernel versions 3.13.0 and 4.4.0.
I have to revise my analysis. EPOLLHUP and EPOLLERR are probably not expected on a datagram socket, since it's connectionless, and I think a hang-up only makes sense on a connection-oriented socket. The real event that should be observed when the reading end closes is EPOLLOUT, because it means our next write is non-blocking and will fail immediately with ECONNREFUSED. Now the question is: are we missing the delivery of an EPOLLOUT event to the pollDesc in the Go runtime, or does the EPOLLOUT never come from epoll/Linux?

I tried reproducing the environment in C, using a client and server: https://pastebin.com/59AT01AT . I was not able to reproduce the issue with 8 concurrent clients and a single server. Using strace and some old logging code that's no longer present, I was able to confirm that the clients are indeed bouncing between EAGAIN and epoll_wait (as expected) and that when the server dies, they immediately transition to ECONNREFUSED until the server comes back, at which point everything is OK again.

To reiterate: in the Go reproducer, when the server dies, the clients sometimes get stuck in code that is waiting for an EPOLLOUT, and it seemingly never comes. The negative result with C suggests, but does not prove, that this could be an issue with the way the Go runtime is using epoll and pollDescs, as opposed to a bug in epoll itself. But a lot of digging around the internet suggests that epoll is far from perfect and could certainly be the root cause, especially when there are multiple clients connected to the same server socket.

I'd appreciate any feedback on the analysis so far and would love some suggestions on where to look next in the pollDesc / netpoll code for possible bugs.
/cc @ianlancetaylor |
has this been fixed? |
This bug is still open. |
Ran the example from above and it exited on my machine (Ubuntu 18.04, kernel 4.15.0).
I think I may have been able to recreate the problem on tip with this program.

package main
import (
"fmt"
"io/ioutil"
"net"
"os"
"runtime"
"sync"
"time"
)
func main() {
temp, err := ioutil.TempFile("", "test")
if err != nil {
die(err)
}
addr := temp.Name()
temp.Close()
os.Remove(addr)
go server(addr)
var wg sync.WaitGroup
for i := 0; i < 20; i++ {
wg.Add(1)
go func() {
defer wg.Done()
client(addr)
}()
}
wg.Wait()
}
func server(addr string) {
for {
l, err := net.ListenPacket("unixgram", addr)
if err != nil {
die(err)
}
for i := 0; i < 20; i++ {
var buf [4096]byte
if _, _, err := l.ReadFrom(buf[:]); err != nil {
die(err)
}
}
if err := l.Close(); err != nil {
die(err)
}
os.Remove(addr)
time.Sleep(time.Millisecond)
}
}
func client(addr string) {
var c net.Conn
for i := 0; i < 500000; i++ {
for c == nil {
var err error
c, err = net.Dial("unixgram", addr)
if err != nil {
time.Sleep(time.Microsecond)
}
}
if _, err := c.Write([]byte("test line")); err != nil {
c.Close()
c = nil
}
if i%10000 == 0 {
time.Sleep(time.Millisecond)
}
}
}
func die(err error) {
_, file, line, _ := runtime.Caller(1)
fmt.Fprintf(os.Stderr, "%s:%d: %v\n", file, line, err)
os.Exit(1)
}
I captured an strace of the program in the previous comment. Some things I see:
- Descriptor 19 is added to the epoll set with epoll_ctl.
- Descriptor 19 is returned by epoll_wait a number of times.
- After the last time descriptor 19 is returned by epoll_wait, writes to it continue, but we never see descriptor 19 returned by epoll_wait again.
The upshot of the last comment is: as far as I can tell, the kernel is failing to return EPOLLOUT for the descriptor. Searching the web, I see a similar question at https://stackoverflow.com/questions/38441059/edge-triggered-epoll-for-unix-domain-socket . This does make me suspect that this is a kernel bug. I'm running 4.9.0-6-amd64.
I am on a 4.15 kernel and I see similar traces. |
I suspect that this C program shows the same problem. It's hard to be really sure. I would appreciate it if others could look over this code and see if they can spot any bugs. Thanks.

#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <sys/un.h>
static char temp[11];
static void die(const char* msg, int err) {
fprintf(stderr, "%s: %s\n", msg, strerror(err));
exit(EXIT_FAILURE);
}
struct condlock {
pthread_mutex_t mutex;
pthread_cond_t cond;
int ready;
};
void condlock_init(struct condlock* cl) {
int err;
err = pthread_mutex_init(&cl->mutex, NULL);
if (err != 0) {
die("pthread_mutex_init", err);
}
err = pthread_cond_init(&cl->cond, NULL);
if (err != 0) {
die("pthread_cond_init", err);
}
cl->ready = 0;
}
void condlock_signal(struct condlock* cl) {
int err;
err = pthread_mutex_lock(&cl->mutex);
if (err != 0) {
die("pthread_mutex_lock", err);
}
cl->ready = 1;
err = pthread_cond_signal(&cl->cond);
if (err != 0) {
die("pthread_cond_signal", err);
}
err = pthread_mutex_unlock(&cl->mutex);
if (err != 0) {
die("pthread_mutex_unlock", err);
}
}
void condlock_wait(struct condlock* cl) {
int err;
struct timespec ts;
/* If everything worked as expected, the poller would always report
that a descriptor is ready and call condlock_signal. We give it
20 seconds to do so. If it fails, we are seeing a bug. */
memset(&ts, 0, sizeof ts);
err = pthread_mutex_lock(&cl->mutex);
if (err != 0) {
die("pthread_mutex_lock", err);
}
if (!cl->ready) {
ts.tv_sec = time(NULL) + 20;
err = pthread_cond_timedwait(&cl->cond, &cl->mutex, &ts);
if (err != 0) {
if (err == ETIMEDOUT) {
fprintf(stderr, "waited more than 20 seconds for descriptor to be ready\n");
exit(EXIT_FAILURE);
} else {
die("pthread_cond_timedwait", err);
}
}
}
cl->ready = 0;
err = pthread_mutex_unlock(&cl->mutex);
if (err != 0) {
die("pthread_mutex_unlock", err);
}
}
static int pollfd;
static void poll_init() {
int d;
d = epoll_create(20);
if (d < 0) {
die("epoll_create", errno);
}
pollfd = d;
}
static void poll_add(int d, struct condlock *cond) {
struct epoll_event ev;
memset(&ev, 0, sizeof ev);
ev.events = EPOLLIN | EPOLLOUT | EPOLLRDHUP | EPOLLET;
ev.data.u64 = (uint64_t)(uintptr_t)(void*)(cond);
if (epoll_ctl(pollfd, EPOLL_CTL_ADD, d, &ev) < 0) {
die("epoll_ctl add", errno);
}
}
static void poll_del(int d) {
if (epoll_ctl(pollfd, EPOLL_CTL_DEL, d, NULL) < 0) {
die("epoll_ctl del", errno);
}
}
static void *poller(void *arg __attribute__((unused))) {
int c;
struct epoll_event events[128];
int i;
struct condlock *cl;
while (1) {
c = epoll_wait(pollfd, events, 128, -1);
for (i = 0; i < c; i++) {
cl = (struct condlock*)(void*)(uintptr_t)(events[i].data.u64);
condlock_signal(cl);
}
}
abort(); /* unreachable */
}
static void *server(void *arg __attribute__((unused))) {
struct sockaddr_un sun;
struct condlock cl;
int s;
int i;
char buf[4096];
memset(&sun, 0, sizeof sun);
sun.sun_family = AF_UNIX;
memcpy(sun.sun_path, temp, sizeof temp);
condlock_init(&cl);
while (1) {
s = socket(AF_UNIX, SOCK_DGRAM | SOCK_NONBLOCK, 0);
if (s < 0) {
die("socket", errno);
}
if (bind(s, (struct sockaddr*)(&sun), sizeof sun) < 0) {
die("bind", errno);
}
poll_add(s, &cl);
for (i = 0; i < 20; i++) {
if (recv(s, buf, sizeof buf, 0) < 0) {
if (errno != EAGAIN && errno != EWOULDBLOCK) {
die("recv", errno);
}
condlock_wait(&cl);
}
}
poll_del(s);
if (close(s) < 0) {
die("close", errno);
}
if (remove(temp) < 0) {
die("remove", errno);
}
usleep(1000); /* one millisecond */
}
abort(); /* unreachable */
}
static void *client(void *arg __attribute__((unused))) {
const char * const msg = "test line";
struct sockaddr_un sun;
struct condlock cl;
int s;
int i;
int flags;
memset(&sun, 0, sizeof sun);
sun.sun_family = AF_UNIX;
memcpy(sun.sun_path, temp, sizeof temp);
condlock_init(&cl);
s = -1;
for (i = 0; i < 500000; i++) {
while (s < 0) {
s = socket(AF_UNIX, SOCK_DGRAM, 0);
if (s < 0) {
die("socket", errno);
}
if (connect(s, (struct sockaddr *)(&sun), sizeof sun) < 0) {
if (errno != ECONNREFUSED && errno != ENOENT) {
die("connect", errno);
}
if (close(s) < 0) {
die("close", errno);
}
s = -1;
usleep(1); /* one microsecond */
} else {
flags = fcntl(s, F_GETFL, 0);
if (flags < 0) {
die("fcntl", errno);
}
if (fcntl(s, F_SETFL, flags | O_NONBLOCK) < 0) {
die("fcntl", errno);
}
poll_add(s, &cl);
}
}
if (write(s, msg, strlen(msg)) < 0) {
if (errno == EAGAIN || errno == EWOULDBLOCK) {
condlock_wait(&cl);
} else if (errno == ECONNREFUSED) {
poll_del(s);
if (close(s) < 0) {
die("close", errno);
}
s = -1;
} else {
die("write", errno);
}
}
if (i % 10000 == 0) {
usleep(1000); /* one millisecond */
}
}
if (s > 0) {
poll_del(s);
if (close(s) < 0) {
die("close", errno);
}
}
return NULL;
}
#define CLIENT_COUNT (20)
int main() {
int d;
pthread_attr_t attr;
int err;
pthread_t tid;
int i;
pthread_t clients[CLIENT_COUNT];
memcpy(temp, "testXXXXXX", sizeof temp);
d = mkstemp(temp);
if (d < 0) {
die("mkstemp", errno);
}
close(d);
remove(temp);
err = pthread_attr_init(&attr);
if (err != 0) {
die("pthread_attr_init", err);
}
err = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
if (err != 0) {
die("pthread_attr_setdetachstate", err);
}
poll_init();
err = pthread_create(&tid, &attr, poller, NULL);
if (err != 0) {
die("pthread_create", err);
}
err = pthread_create(&tid, &attr, server, NULL);
if (err != 0) {
die("pthread_create", err);
}
for (i = 0; i < CLIENT_COUNT; i++) {
err = pthread_create(&clients[i], NULL, client, NULL);
if (err != 0) {
die("pthread_create", err);
}
}
for (i = 0; i < CLIENT_COUNT; i++) {
err = pthread_join(clients[i], NULL);
if (err != 0) {
die("pthread_join", err);
}
}
return 0;
}
The above looks correct and does fail the same way when it fails. I did happen to have the C version succeed once, but only once. The one thing that bothers me is that my reading of all the various issues people have had with epoll always comes down to how file descriptors are managed and how you can get into trouble. The client never really closes its connection unless there is some error; whether that is contributing to the issue, I really don't know. I have been playing with the code, but nothing seems to fix the issue or increase the percentage of success.
Sent bug report to netdev mailing list: https://www.spinics.net/lists/netdev/msg509646.html . |
Reply on netdev mailing list, with kernel patch: https://www.spinics.net/lists/netdev/msg512686.html
Closing this issue since it doesn't seem to be a Go bug.
Writes to syslog using a unix datagram socket (the default behavior on Ubuntu 14.04, probably more) may become stuck indefinitely if the syslog daemon closes /dev/log.
The following reproduces the issue. This code is derived from a reproducer found in the comments section of #5932 - https://play.golang.org/p/vp_e6n8VJuX
To reproduce, running

sudo service syslog-ng restart

suffices.
We first observed this bad behavior in a production environment last week during a DNS outage. The outage prevented our syslog-ng daemon from reloading configuration properly, since it exits uncleanly on configuration reload when DNS is unavailable. We then observed a large portion of production boxes hung completely, with one goroutine waiting on syslog and the rest waiting to acquire the log package mutex. I believe the syslog-ng daemon restart caused the bad behavior to be exposed in the syslog and/or net packages. Our syslog write volume is fairly high.
After that, I reproduced the issue locally using the above reproducer. I think the badness has to do with missing a socket close event, or some other invalid state transition in the net code.
Goroutine stack after killing the stuck process with -SIGQUIT: https://pastebin.com/GuV5JZDS
Environment:
$ go version
go version go1.7 linux/amd64
$ go env
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/jesmet/.gvm/pkgsets/go1.7/global"
GORACE=""
GOROOT="/home/jesmet/.gvm/gos/go1.7"
GOTOOLDIR="/home/jesmet/.gvm/gos/go1.7/pkg/tool/linux_amd64"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build126114309=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"