time: Sleep requires ~7 syscalls #25471

Open
@bwesterb

Description

What version of Go are you using (go version)?

go version go1.10.1 linux/amd64

Does this issue reproduce with the latest release?

Yes (1.10.2).

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOCACHE="/home/bas/.cache/go-build"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/bas/go"
GORACE=""
GOROOT="/usr/lib/go-1.10"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go-1.10/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build723289083=/tmp/go-build -gno-record-gcc-switches"

What did you do?

The following Go program calls time.Sleep the number of times given as a command-line argument.

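package main

import (
	"os"
	"strconv"
	"time"
)

// max is the number of time.Sleep calls to make, taken from the first argument.
var max int

func main() {
	// The conversion error is ignored; an invalid count leaves max at 0,
	// so the loop below still sleeps once before returning.
	max, _ = strconv.Atoi(os.Args[1])
	n := 0
	for {
		// Sleep for 10ms per iteration.
		time.Sleep(time.Second / 100)
		n++
		if n >= max {
			return
		}
	}
}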

If we track the number of syscalls using strace -c -f, we find:

bas@fourier2:~/gosleeptest$ strace -c -f ./gosleeptest 1
strace: Process 3115 attached
strace: Process 3114 attached
strace: Process 3116 attached
strace: Process 3117 attached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  0.00    0.000000           0         8           mmap
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0       114           rt_sigaction
  0.00    0.000000           0        14           rt_sigprocmask
  0.00    0.000000           0         4           clone
  0.00    0.000000           0         1           execve
  0.00    0.000000           0        10           sigaltstack
  0.00    0.000000           0         5           arch_prctl
  0.00    0.000000           0         9           gettid
  0.00    0.000000           0         8         1 futex
  0.00    0.000000           0         1           sched_getaffinity
  0.00    0.000000           0         1           readlinkat
  0.00    0.000000           0        22           pselect6
------ ----------- ----------- --------- --------- ----------------
100.00    0.000000                   198         1 total
bas@fourier2:~/gosleeptest$ strace -c -f ./gosleeptest 10
strace: Process 3919 attached
strace: Process 3918 attached
strace: Process 3917 attached
strace: Process 3927 attached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  0.00    0.000000           0         8           mmap
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0       114           rt_sigaction
  0.00    0.000000           0        14           rt_sigprocmask
  0.00    0.000000           0         2           sched_yield
  0.00    0.000000           0         4           clone
  0.00    0.000000           0         1           execve
  0.00    0.000000           0        10           sigaltstack
  0.00    0.000000           0         5           arch_prctl
  0.00    0.000000           0         9           gettid
  0.00    0.000000           0        74        12 futex
  0.00    0.000000           0         1           sched_getaffinity
  0.00    0.000000           0         1           readlinkat
  0.00    0.000000           0        69           pselect6
------ ----------- ----------- --------- --------- ----------------
100.00    0.000000                   313        12 total
bas@fourier2:~/gosleeptest$ strace -c -f ./gosleeptest 100
strace: Process 4491 attached
strace: Process 4490 attached
strace: Process 4489 attached
strace: Process 4532 attached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 89.01    0.043330          82       530       104 futex
  9.76    0.004751          21       228           pselect6
  0.27    0.000131           1       114           rt_sigaction
  0.23    0.000114          23         5           arch_prctl
  0.19    0.000091           9        10           sigaltstack
  0.18    0.000086          10         9           gettid
  0.13    0.000061           4        14           rt_sigprocmask
  0.07    0.000035           4         8           mmap
  0.07    0.000033          33         1           readlinkat
  0.03    0.000017           4         4           clone
  0.03    0.000017          17         1           execve
  0.02    0.000009           9         1           munmap
  0.01    0.000003           3         1           sched_getaffinity
------ ----------- ----------- --------- --------- ----------------
100.00    0.048678                   926       104 total

What did you expect to see?

A single time.Sleep should use approximately one syscall. (Python's time.sleep uses only one syscall, for instance.)

What did you see instead?

Approximately seven syscalls per time.Sleep. As a consequence, the Go process also uses quite a bit of CPU time per time.Sleep: 500us (compared to 13us for Python's time.sleep).

Notes

I encountered this issue while debugging unexpectedly high idle CPU usage by wireguard-go.

Activity

robpike

robpike commented on May 21, 2018

@robpike
Contributor

I doubt it will ever go as low as one in general. In a concurrent language like Go, unlike Python, putting a thread to sleep requires locking data structures in the concurrent runtime. Seven may be more than are truly needed, I'm not sure, but time.Sleep cannot be done in only one call to the kernel without blocking the rest of the program's running goroutines.

changed the title [-]Single `time.Sleep` requires ~7 syscalls[/-] [+]time: Sleep requires ~7 syscalls[/+] on May 21, 2018
added the NeedsInvestigation label (Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.) on May 21, 2018
added this to the Go1.11 milestone on May 21, 2018
agnivade

agnivade commented on May 21, 2018

@agnivade
Contributor

What Rob said. Not sure if there's anything to be done here. /cc @rsc

bwesterb

bwesterb commented on May 21, 2018

@bwesterb
Author

I'm a bit confused about your hunch that this is due to locking. Futexes use a CMPXCHG in userspace to acquire a mutex. Only when the lock is contended will they call the futex() syscall with FUTEX_WAIT on the mutex to wait for it to unlock, right? Why is there contention in this example?
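The fast-path/slow-path split described here can be illustrated with a toy lock. This is only a sketch: casLock is a hypothetical type, and its slow path yields with runtime.Gosched where a real futex-based mutex would call futex() with FUTEX_WAIT. It shows that an uncontended lock and unlock never leave the fast path.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// casLock is a toy lock: state 0 means unlocked, 1 means locked.
type casLock struct {
	state    int32
	slowPath int32 // number of Lock calls that failed their first attempt
}

// Lock acquires the lock, counting how often the fast path was not enough.
func (l *casLock) Lock() {
	// Fast path: a single compare-and-swap, entirely in user space.
	if atomic.CompareAndSwapInt32(&l.state, 0, 1) {
		return
	}
	// Slow path: a real futex-based mutex would block in the kernel here.
	// This sketch only yields the processor and retries.
	atomic.AddInt32(&l.slowPath, 1)
	for !atomic.CompareAndSwapInt32(&l.state, 0, 1) {
		runtime.Gosched()
	}
}

// Unlock releases the lock.
func (l *casLock) Unlock() {
	atomic.StoreInt32(&l.state, 0)
}

// SlowPathCount reports how many Lock calls had to take the slow path.
func (l *casLock) SlowPathCount() int32 {
	return atomic.LoadInt32(&l.slowPath)
}

func main() {
	var l casLock
	for i := 0; i < 1000; i++ {
		l.Lock()
		l.Unlock()
	}
	// With a single goroutine there is no contention, so this prints 0.
	fmt.Println("slow-path entries without contention:", l.SlowPathCount())
}
```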

bcmills

bcmills commented on May 21, 2018

@bcmills
Contributor

> I encountered this issue while debugging unexpectedly high idle CPU usage by wireguard-go.

Does the number of syscalls account for the observed CPU usage in wireguard-go, or is this a peripheral issue? We try to focus on concrete problems (specific programs with poor behavior) rather than abstract ones (i.e., microbenchmarks).

bwesterb

bwesterb commented on May 21, 2018

@bwesterb
Author

Yes, it did. As a workaround, the author now disables the 1Hz ticker if there is nothing to do because of the overhead. (Although that's a good idea anyway to prevent unnecessary wake-ups.)
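The general shape of such a workaround might look like the sketch below. It is not wireguard-go's code: runDemo and countTicks are hypothetical names, and a 5ms period stands in for the 1Hz ticker so the demonstration finishes quickly. The idea is simply to stop the ticker while there is nothing to do and re-arm it when work arrives.

```go
package main

import (
	"fmt"
	"time"
)

// countTicks counts the ticks that arrive on c within window.
func countTicks(c <-chan time.Time, window time.Duration) int {
	deadline := time.After(window)
	n := 0
	for {
		select {
		case <-c:
			n++
		case <-deadline:
			return n
		}
	}
}

// runDemo returns how many ticks arrive while the ticker runs and while it is stopped.
func runDemo() (active, idle int) {
	const period = 5 * time.Millisecond // stands in for the 1Hz ticker
	ticker := time.NewTicker(period)
	defer ticker.Stop()

	active = countTicks(ticker.C, 10*period)

	// Nothing to do: stop the ticker so it causes no further wake-ups.
	ticker.Stop()
	select {
	case <-ticker.C: // discard a tick that was already queued
	default:
	}
	idle = countTicks(ticker.C, 10*period)

	// Work arrives again: re-arm the same ticker (Reset needs Go 1.15 or later).
	ticker.Reset(period)
	return active, idle
}

func main() {
	active, idle := runDemo()
	fmt.Printf("ticks while active: %d, ticks while idle: %d\n", active, idle)
}
```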

(On my test machine, the overhead for a 1Hz ticker is a constant 0.05% of CPU, or about 43 seconds per day. That isn't huge, certainly not for servers.)

odeke-em

odeke-em commented on May 21, 2018

@odeke-em
Member
aclements

aclements commented on May 21, 2018

@aclements
Member

The high number of futex calls is surprising, for the reasons you pointed out. Most likely what's happening is not lock contention, but rather that the runtime is waking a sleeping thread, though I haven't confirmed that.

ianlancetaylor

ianlancetaylor commented on May 25, 2018

@ianlancetaylor
Contributor

The Go runtime works with a set of timer goroutines sharded by P. The time.Sleep call is implemented by having the calling goroutine tell its local timer goroutine to wake it up at a certain time. This means adding a timer to a heap, and then waking up the timer goroutine: a futex operation.

After that, the goroutine that called time.Sleep parks itself. The M that was running that goroutine then looks for another goroutine to run, doesn't find any, and parks itself. Parking a goroutine just means marking the G as asleep, but parking an M means sleeping until told to wake up: a futex operation.

Meanwhile, the timer goroutine looks at the new timer, decides how long it has to sleep, and goes to sleep for that long: a futex operation.

Eventually the timer goroutine wakes up. It unparks the sleeping goroutine, adding it to the queue of goroutines ready to run. There aren't any M's running, so it has to wake one up: a futex operation.

The timer goroutine then sees that there are no timers, and goes back to sleep waiting for something to happen: a futex operation.

The M that was woken up starts to execute the goroutine, which immediately calls time.Sleep again, and the cycle starts over.

Meanwhile the system monitor thread runs, poking along doing its own thing while all this is happening. On tip this mostly shows up as calls to nanosleep; in the traces above it shows up as pselect6.

So I think that gives me 6 system calls per time.Sleep call. Above it says there are 7, so I guess I missed one somewhere. Maybe the system monitor makes another system call.

Some important points here are that the system monitor thread is part of the background of a Go process. It doesn't have anything to do with time.Sleep, it just shows up here because nothing else is happening. The M parking and unparking itself is only happening because there are no other goroutines; if there were, the M would be picking those up and running them, steps that in themselves do not require any system calls.

So I think this microbenchmark is not necessarily indicative of the cost of calling time.Sleep. The costs that are closely related to time.Sleep are the futex call to wake up the timer goroutine and the futex call by the timer goroutine itself to go back to sleep. Those seem difficult to avoid, assuming that we don't want a sleeping goroutine to tie up a thread, which of course we don't. Even those system calls wouldn't necessarily happen if there were other timers, as we only need to notify the timer goroutine of a new timer if it fires earlier than any existing timer.
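The last point, that the timer goroutine only needs a wake-up when the new timer becomes the earliest one, can be sketched with a min-heap. This is an illustration built on container/heap, not the runtime's timer code; timerHeap and addTimer are hypothetical names and the wake-up times are arbitrary integers.

```go
package main

import (
	"container/heap"
	"fmt"
)

// timerHeap is a min-heap of wake-up times in arbitrary units.
type timerHeap []int64

func (h timerHeap) Len() int            { return len(h) }
func (h timerHeap) Less(i, j int) bool  { return h[i] < h[j] }
func (h timerHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *timerHeap) Push(x interface{}) { *h = append(*h, x.(int64)) }
func (h *timerHeap) Pop() interface{} {
	old := *h
	n := len(old)
	x := old[n-1]
	*h = old[:n-1]
	return x
}

// addTimer inserts a timer and reports whether the timer goroutine would have
// to be woken: only when the heap was empty or the new timer fires earlier
// than the current earliest one.
func addTimer(h *timerHeap, when int64) bool {
	needWake := h.Len() == 0 || when < (*h)[0]
	heap.Push(h, when)
	return needWake
}

func main() {
	var h timerHeap
	// Prints: true false true
	fmt.Println(addTimer(&h, 100), addTimer(&h, 200), addTimer(&h, 50))
}
```

In the single-timer program from this report the heap is empty on every call, so every time.Sleep takes the wake-up path.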

All in all it's not obvious to me that there is anything to do here. We really do need three threads: the main thread, the timer goroutine, the system monitor. The main thread and the timer goroutine really do need to communicate, and they really do need to go to sleep during the time.Sleep call. I don't see how to avoid that even hypothetically. For this example we could save a system call if the timer goroutine spun for a while when there was nothing to do, hoping that something would come in, but that hardly seems like a good idea in general.

These system calls are all fast and should not put a significant load on the system.

As mentioned above the comparison to Python is unhelpful, since Python doesn't have to support multiplexing multiple sleeping goroutines onto a small set of threads.

bwesterb

bwesterb commented on May 25, 2018

@bwesterb
Author

My comparison with Python indeed wasn't fair.

Thanks for the explanation. If I understand correctly the current implementation forces the timer goroutine to be executed on a different machine thread than the goroutine calling it. Couldn't the machine thread of the sleeper be reused when there is no timerproc running yet? (Implementation complexity is of course a valid objection, if applicable.)

ianlancetaylor

ianlancetaylor commented on May 30, 2018

@ianlancetaylor
Contributor

Given the existence of timers, there has to be a thread that sleeps until the next timer is ready. In the current implementation that thread is the one running the timer goroutine. So in the ordinary steady state there's no way for the goroutine that calls time.Sleep to run on the same thread that runs the timer goroutine, because the latter thread is already sleeping until the next timer fires.

I think it's true that in your example program, in which there is only ever one timer, when the call to time.Sleep happens the timer goroutine is not sleeping but is instead simply parked. So it would be possible for this example to have the thread running the goroutine calling time.Sleep to pick up the timer goroutine directly. But in a real program it doesn't seem like a particularly likely case, so it doesn't seem worth implementing that optimization.

bcmills

bcmills commented on May 30, 2018

@bcmills
Contributor

> Given the existence of timers, there has to be a thread that sleeps until the next timer is ready.

Is that actually true? I thought that, at least on some platforms, there is a system call you can use to deliver a signal to the process at the next timer instead: if that's true, we only need a sleeping thread if there isn't some other running thread in the process that can receive the signal.
