
log/syslog, net: goroutine permanently stuck in waitWrite() to /dev/log using a unix datagram socket #23604

Closed
@esmet


Writes to syslog over a unix datagram socket (the default behavior on Ubuntu 14.04, and probably on other systems) may become stuck indefinitely if the syslog daemon closes /dev/log.

The following reproduces the issue. This code is derived from a reproducer found in the comments section of #5932 - https://play.golang.org/p/vp_e6n8VJuX

To reproduce:

  • Execute the above reproducer binary several times in the background (3-6 times is usually good).
  • Restart your syslog daemon. I'm using syslog-ng, so sudo service syslog-ng restart suffices.
  • Repeat this exact experiment a few times until you observe that some of the runs never exit. Kill them with -SIGQUIT and you should see the same stack as the one linked below. It sometimes helps to stagger the background runs: restart the daemon, wait a few seconds, start some more runs in the background, then restart the daemon again. Get aggressive. Anger the system.


We first observed this bad behavior in a production environment last week during a DNS outage. The outage prevented our syslog-ng daemon from reloading configuration properly, since it exits uncleanly on configuration reload when DNS is unavailable. We then observed a large portion of production boxes hung completely, with one goroutine waiting on syslog and the rest waiting to acquire the log package mutex. I believe the syslog-ng daemon restart caused the bad behavior to be exposed in the syslog and/or net packages. Our syslog write volume is fairly high.

After that, I reproduced the issue locally using the above reproducer. I think the badness has to do with missing a socket close event, or some other invalid state transition in the net code.

Goroutine stack after killing the stuck process with -SIGQUIT: https://pastebin.com/GuV5JZDS

Environment:
$ go version
go version go1.7 linux/amd64
$ go env
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/jesmet/.gvm/pkgsets/go1.7/global"
GORACE=""
GOROOT="/home/jesmet/.gvm/gos/go1.7"
GOTOOLDIR="/home/jesmet/.gvm/gos/go1.7/pkg/tool/linux_amd64"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build126114309=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"

Labels

FrozenDueToAge, NeedsInvestigation (Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.)