Description
What version of Go are you using (go version
)?
go1.21 version go version go1.21.1 darwin/arm64
Does this issue reproduce with the latest release?
Yes, Go 1.21 is the latest release series.
What operating system and processor architecture are you using (go env
)?
go env
Output
$ go1.21 env GO111MODULE='' GOARCH='arm64' GOBIN='' GOCACHE='/Users/rhys/Library/Caches/go-build' GOENV='/Users/rhys/Library/Application Support/go/env' GOEXE='' GOEXPERIMENT='' GOFLAGS='' GOHOSTARCH='arm64' GOHOSTOS='darwin' GOINSECURE='' GOMODCACHE='/Users/rhys/go/pkg/mod' GONOPROXY='*' GONOSUMDB='*' GOOS='darwin' GOPATH='/Users/rhys/go' GOPRIVATE='*' GOPROXY='direct' GOROOT='/Users/rhys/go/version/go1.21' GOSUMDB='sum.golang.org' GOTMPDIR='' GOTOOLCHAIN='local' GOTOOLDIR='/Users/rhys/go/version/go1.21/pkg/tool/darwin_arm64' GOVCS='' GOVERSION='go1.21.1' GCCGO='gccgo' AR='ar' CC='clang' CXX='clang++' CGO_ENABLED='1' GOMOD='/dev/null' GOWORK='' CGO_CFLAGS='-O2 -g' CGO_CPPFLAGS='' CGO_CXXFLAGS='-O2 -g' CGO_FFLAGS='-O2 -g' CGO_LDFLAGS='-O2 -g' PKG_CONFIG='pkg-config' GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/lf/n4ry0qv5639_3z0nhyhy3z7c0000gs/T/go-build1797745903=/tmp/go-build -gno-record-gcc-switches -fno-common'
What did you do?
$ cat slowprof.go
package main
import (
"flag"
"io"
"log"
"runtime"
"runtime/pprof"
)
func main() {
cpu := flag.Bool("cpu", false, "collect a CPU profile")
mutex := flag.Int("mutex", -1, "collect a mutex profile with specified debug level")
block := flag.Int("block", -1, "collect a block profile with specified debug level")
flag.Parse()
if *cpu {
err := pprof.StartCPUProfile(io.Discard)
if err != nil {
log.Fatalf("pprof.StartCPUProfile: %v", err)
}
pprof.StopCPUProfile()
}
if *mutex >= 0 {
runtime.SetMutexProfileFraction(2)
err := pprof.Lookup("mutex").WriteTo(io.Discard, *mutex)
if err != nil {
log.Fatalf(`pprof.Lookup("mutex").WriteTo: %v`, err)
}
}
if *block >= 0 {
runtime.SetBlockProfileRate(2)
err := pprof.Lookup("block").WriteTo(io.Discard, *block)
if err != nil {
log.Fatalf(`pprof.Lookup("block").WriteTo: %v`, err)
}
}
}
$ go1.21 build -o /tmp/slowprof ./slowprof.go
$ for i in {1..10}; do time /tmp/slowprof; done
/tmp/slowprof 0.00s user 0.01s system 58% cpu 0.022 total
/tmp/slowprof 0.00s user 0.00s system 36% cpu 0.009 total
/tmp/slowprof 0.00s user 0.00s system 55% cpu 0.007 total
/tmp/slowprof 0.00s user 0.00s system 41% cpu 0.010 total
/tmp/slowprof 0.00s user 0.00s system 39% cpu 0.009 total
/tmp/slowprof 0.00s user 0.00s system 36% cpu 0.006 total
/tmp/slowprof 0.00s user 0.00s system 37% cpu 0.005 total
/tmp/slowprof 0.00s user 0.00s system 33% cpu 0.008 total
/tmp/slowprof 0.00s user 0.00s system 29% cpu 0.007 total
/tmp/slowprof 0.00s user 0.00s system 29% cpu 0.007 total
$ for i in {1..10}; do time /tmp/slowprof -cpu; done
/tmp/slowprof -cpu 0.00s user 0.00s system 2% cpu 0.214 total
/tmp/slowprof -cpu 0.00s user 0.00s system 2% cpu 0.213 total
/tmp/slowprof -cpu 0.00s user 0.01s system 4% cpu 0.216 total
/tmp/slowprof -cpu 0.00s user 0.00s system 3% cpu 0.215 total
/tmp/slowprof -cpu 0.00s user 0.00s system 3% cpu 0.216 total
/tmp/slowprof -cpu 0.00s user 0.00s system 3% cpu 0.215 total
/tmp/slowprof -cpu 0.00s user 0.00s system 3% cpu 0.215 total
/tmp/slowprof -cpu 0.00s user 0.00s system 3% cpu 0.216 total
/tmp/slowprof -cpu 0.00s user 0.01s system 3% cpu 0.216 total
/tmp/slowprof -cpu 0.00s user 0.00s system 2% cpu 0.213 total
What did you expect to see?
I hoped to see very little wall-time overhead as a result of enabling the CPU profile.
What did you see instead?
Starting and stopping the CPU profile adds around 200ms of wall-clock time. That's a significant slowdown when the process under test is something like "a benchmark" or "a command, such as the compiler". (It's not as noticeable when the process under test is a long-running server program.)
Furthermore, the DurationNanos field in the resulting protobuf-encoded profile includes that idle time. As a result, go tool pprof
reporting that the process was less busy than it really was during the time the profile was active: the CPU profile samples represent the work the app does from the start of the call to pprof.StopCPUProfile
, but the duration covers until the end of that call.
I think this relates to the 100ms sleep in runtime/pprof.profileWriter
's loop at https://github.com/golang/go/blob/go1.21.1/src/runtime/pprof/pprof.go#L809
Metadata
Metadata
Assignees
Labels
Type
Projects
Status
Activity
mknyszek commentedon Sep 20, 2023
This is probably at least partially a consequence of a bug fix on Darwin where accessing the profbuf is no longer blocking, and there's an explicit 100ms sleep in the profile reader loop.
Michael Pratt points out that the 100ms sleep could probably be replaced with a select on a couple channels, wherein if the profile ends sooner it wakes up, flushes whatever it can, then exits promptly.
prattmic commentedon Sep 20, 2023
https://go.dev/cl/228886 tried to fix this, but it was reverted.