Closed
Description
What version of Go are you using (go version
)?
go version go1.10beta2 darwin/amd64
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env
)?
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/jlindamo/Library/Caches/go-build"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/jlindamo/go"
GORACE=""
GOROOT="/Users/jlindamo/sdk/go1.10beta2"
GOTMPDIR=""
GOTOOLDIR="/Users/jlindamo/sdk/go1.10beta2/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/hz/fdkmg98n3wx1xth0ypjq1s4c3xsp_j/T/go-build325901435=/tmp/go-build -gno-record-gcc-switches -fno-common"
What did you do?
Ran my benchmarks on go 1.9 and go 1.10
What did you expect to see?
Little or no change in timings
What did you see instead?
Differences by up to 600%
To reproduce:
go get github.com/cep21/circuit
cd $GOPATH/src/github.com/cep21/circuit
make bench
I ran benchmarks for go 1.9 and go 1.10. There is the result of benchcmp
> benchcmp /tmp/go1.9.2 /tmp/go1.10.b2
benchmark old ns/op new ns/op delta
BenchmarkCiruits/cep21-circuit/Hystrix/passing/1-8 767 1698 +121.38%
BenchmarkCiruits/cep21-circuit/Hystrix/passing/75-8 436 666 +52.75%
BenchmarkCiruits/cep21-circuit/Hystrix/failing/1-8 103 333 +223.30%
BenchmarkCiruits/cep21-circuit/Hystrix/failing/75-8 79.1 159 +101.01%
BenchmarkCiruits/cep21-circuit/Minimal/passing/1-8 159 800 +403.14%
BenchmarkCiruits/cep21-circuit/Minimal/passing/75-8 84.4 496 +487.68%
BenchmarkCiruits/cep21-circuit/Minimal/failing/1-8 58.4 282 +382.88%
BenchmarkCiruits/cep21-circuit/Minimal/failing/75-8 20.3 159 +683.25%
BenchmarkCiruits/cep21-circuit/UseGo/passing/1-8 1361 1932 +41.95%
BenchmarkCiruits/cep21-circuit/UseGo/passing/75-8 363 622 +71.35%
BenchmarkCiruits/cep21-circuit/UseGo/failing/1-8 1454 1994 +37.14%
BenchmarkCiruits/cep21-circuit/UseGo/failing/75-8 385 674 +75.06%
BenchmarkCiruits/GoHystrix/DefaultConfig/passing/1-8 10198 10182 -0.16%
BenchmarkCiruits/GoHystrix/DefaultConfig/passing/75-8 2897 3091 +6.70%
BenchmarkCiruits/GoHystrix/DefaultConfig/failing/1-8 6376 7215 +13.16%
BenchmarkCiruits/GoHystrix/DefaultConfig/failing/75-8 1589 1912 +20.33%
BenchmarkCiruits/rubyist/Threshold-10/passing/1-8 1671 2021 +20.95%
BenchmarkCiruits/rubyist/Threshold-10/passing/75-8 878 1662 +89.29%
BenchmarkCiruits/rubyist/Threshold-10/failing/1-8 132 325 +146.21%
BenchmarkCiruits/rubyist/Threshold-10/failing/75-8 1450 2750 +89.66%
BenchmarkCiruits/gobreaker/Default/passing/1-8 207 588 +184.06%
BenchmarkCiruits/gobreaker/Default/passing/75-8 757 734 -3.04%
BenchmarkCiruits/gobreaker/Default/failing/1-8 94.5 282 +198.41%
BenchmarkCiruits/gobreaker/Default/failing/75-8 376 358 -4.79%
BenchmarkCiruits/handy/Default/passing/1-8 1105 1377 +24.62%
BenchmarkCiruits/handy/Default/passing/75-8 1377 1860 +35.08%
BenchmarkCiruits/handy/Default/failing/1-8 1383 1293 -6.51%
BenchmarkCiruits/handy/Default/failing/75-8 1699 1701 +0.12%
BenchmarkCiruits/iand_circuit/Default/passing/1-8 122 113 -7.38%
BenchmarkCiruits/iand_circuit/Default/passing/75-8 312 299 -4.17%
BenchmarkCiruits/iand_circuit/Default/failing/1-8 21.4 23.2 +8.41%
BenchmarkCiruits/iand_circuit/Default/failing/75-8 5.71 8.51 +49.04%
BenchmarkRollingCounter/super-small-buckets/1-8 98.6 90.9 -7.81%
BenchmarkRollingCounter/super-small-buckets/50-8 191 185 -3.14%
BenchmarkRollingCounter/normal-rate/1-8 63.7 61.1 -4.08%
BenchmarkRollingCounter/normal-rate/50-8 84.1 85.5 +1.66%
BenchmarkRollingCounter/default/1-8 60.6 57.2 -5.61%
BenchmarkRollingCounter/default/50-8 71.2 84.1 +18.12%
benchmark old allocs new allocs delta
BenchmarkCiruits/cep21-circuit/Hystrix/passing/1-8 4 4 +0.00%
BenchmarkCiruits/cep21-circuit/Hystrix/passing/75-8 4 4 +0.00%
BenchmarkCiruits/cep21-circuit/Hystrix/failing/1-8 0 0 +0.00%
BenchmarkCiruits/cep21-circuit/Hystrix/failing/75-8 0 0 +0.00%
BenchmarkCiruits/cep21-circuit/Minimal/passing/1-8 0 0 +0.00%
BenchmarkCiruits/cep21-circuit/Minimal/passing/75-8 0 0 +0.00%
BenchmarkCiruits/cep21-circuit/Minimal/failing/1-8 0 0 +0.00%
BenchmarkCiruits/cep21-circuit/Minimal/failing/75-8 0 0 +0.00%
BenchmarkCiruits/cep21-circuit/UseGo/passing/1-8 5 5 +0.00%
BenchmarkCiruits/cep21-circuit/UseGo/passing/75-8 5 4 -20.00%
BenchmarkCiruits/cep21-circuit/UseGo/failing/1-8 5 5 +0.00%
BenchmarkCiruits/cep21-circuit/UseGo/failing/75-8 5 4 -20.00%
BenchmarkCiruits/GoHystrix/DefaultConfig/passing/1-8 18 18 +0.00%
BenchmarkCiruits/GoHystrix/DefaultConfig/passing/75-8 20 20 +0.00%
BenchmarkCiruits/GoHystrix/DefaultConfig/failing/1-8 19 19 +0.00%
BenchmarkCiruits/GoHystrix/DefaultConfig/failing/75-8 20 20 +0.00%
BenchmarkCiruits/rubyist/Threshold-10/passing/1-8 5 5 +0.00%
BenchmarkCiruits/rubyist/Threshold-10/passing/75-8 4 4 +0.00%
BenchmarkCiruits/rubyist/Threshold-10/failing/1-8 0 0 +0.00%
BenchmarkCiruits/rubyist/Threshold-10/failing/75-8 5 5 +0.00%
BenchmarkCiruits/gobreaker/Default/passing/1-8 0 0 +0.00%
BenchmarkCiruits/gobreaker/Default/passing/75-8 0 0 +0.00%
BenchmarkCiruits/gobreaker/Default/failing/1-8 0 0 +0.00%
BenchmarkCiruits/gobreaker/Default/failing/75-8 0 0 +0.00%
BenchmarkCiruits/handy/Default/passing/1-8 0 0 +0.00%
BenchmarkCiruits/handy/Default/passing/75-8 0 0 +0.00%
BenchmarkCiruits/handy/Default/failing/1-8 0 0 +0.00%
BenchmarkCiruits/handy/Default/failing/75-8 0 0 +0.00%
BenchmarkCiruits/iand_circuit/Default/passing/1-8 0 0 +0.00%
BenchmarkCiruits/iand_circuit/Default/passing/75-8 0 0 +0.00%
BenchmarkCiruits/iand_circuit/Default/failing/1-8 0 0 +0.00%
BenchmarkCiruits/iand_circuit/Default/failing/75-8 0 0 +0.00%
BenchmarkRollingCounter/super-small-buckets/1-8 0 0 +0.00%
BenchmarkRollingCounter/super-small-buckets/50-8 0 0 +0.00%
BenchmarkRollingCounter/normal-rate/1-8 0 0 +0.00%
BenchmarkRollingCounter/normal-rate/50-8 0 0 +0.00%
BenchmarkRollingCounter/default/1-8 0 0 +0.00%
BenchmarkRollingCounter/default/50-8 0 0 +0.00%
benchmark old bytes new bytes delta
BenchmarkCiruits/cep21-circuit/Hystrix/passing/1-8 192 208 +8.33%
BenchmarkCiruits/cep21-circuit/Hystrix/passing/75-8 192 208 +8.33%
BenchmarkCiruits/cep21-circuit/Hystrix/failing/1-8 0 0 +0.00%
BenchmarkCiruits/cep21-circuit/Hystrix/failing/75-8 0 0 +0.00%
BenchmarkCiruits/cep21-circuit/Minimal/passing/1-8 0 0 +0.00%
BenchmarkCiruits/cep21-circuit/Minimal/passing/75-8 0 0 +0.00%
BenchmarkCiruits/cep21-circuit/Minimal/failing/1-8 0 0 +0.00%
BenchmarkCiruits/cep21-circuit/Minimal/failing/75-8 0 0 +0.00%
BenchmarkCiruits/cep21-circuit/UseGo/passing/1-8 256 256 +0.00%
BenchmarkCiruits/cep21-circuit/UseGo/passing/75-8 256 256 +0.00%
BenchmarkCiruits/cep21-circuit/UseGo/failing/1-8 256 256 +0.00%
BenchmarkCiruits/cep21-circuit/UseGo/failing/75-8 256 255 -0.39%
BenchmarkCiruits/GoHystrix/DefaultConfig/passing/1-8 1005 1015 +1.00%
BenchmarkCiruits/GoHystrix/DefaultConfig/passing/75-8 1000 1022 +2.20%
BenchmarkCiruits/GoHystrix/DefaultConfig/failing/1-8 1013 1037 +2.37%
BenchmarkCiruits/GoHystrix/DefaultConfig/failing/75-8 1005 1013 +0.80%
BenchmarkCiruits/rubyist/Threshold-10/passing/1-8 332 348 +4.82%
BenchmarkCiruits/rubyist/Threshold-10/passing/75-8 309 324 +4.85%
BenchmarkCiruits/rubyist/Threshold-10/failing/1-8 0 0 +0.00%
BenchmarkCiruits/rubyist/Threshold-10/failing/75-8 304 320 +5.26%
BenchmarkCiruits/gobreaker/Default/passing/1-8 0 0 +0.00%
BenchmarkCiruits/gobreaker/Default/passing/75-8 0 0 +0.00%
BenchmarkCiruits/gobreaker/Default/failing/1-8 0 0 +0.00%
BenchmarkCiruits/gobreaker/Default/failing/75-8 0 0 +0.00%
BenchmarkCiruits/handy/Default/passing/1-8 0 0 +0.00%
BenchmarkCiruits/handy/Default/passing/75-8 0 0 +0.00%
BenchmarkCiruits/handy/Default/failing/1-8 0 0 +0.00%
BenchmarkCiruits/handy/Default/failing/75-8 0 0 +0.00%
BenchmarkCiruits/iand_circuit/Default/passing/1-8 0 0 +0.00%
BenchmarkCiruits/iand_circuit/Default/passing/75-8 0 0 +0.00%
BenchmarkCiruits/iand_circuit/Default/failing/1-8 0 0 +0.00%
BenchmarkCiruits/iand_circuit/Default/failing/75-8 0 0 +0.00%
BenchmarkRollingCounter/super-small-buckets/1-8 0 0 +0.00%
BenchmarkRollingCounter/super-small-buckets/50-8 0 0 +0.00%
BenchmarkRollingCounter/normal-rate/1-8 0 0 +0.00%
BenchmarkRollingCounter/normal-rate/50-8 0 0 +0.00%
BenchmarkRollingCounter/default/1-8 0 0 +0.00%
BenchmarkRollingCounter/default/50-8 0 0 +0.00%
Metadata
Metadata
Assignees
Type
Projects
Relationships
Development
No branches or pull requests
Activity
cep21 commentedon Jan 11, 2018
Is this an appropriate use of github issues?
Not at all a minimal reproducible test case, sorry :( But is very reproducible. My own investigation saw a lot of pprof time inside the time package.
bradfitz commentedon Jan 11, 2018
Yes.
bradfitz commentedon Jan 11, 2018
We should at least understand the difference before shipping Go 1.10, but no promises on fixing it this late in the cycle.
Have you run the CPU profiler on the worst cases there (like
BenchmarkCiruits/cep21-circuit/Minimal/failing/75-8
) to see what the CPU was doing before vs after?[-]go 1.10beta2 Benchmarks slower on Mac (possibly related to time)[/-][+]runtime: go 1.10beta2 Benchmarks slower on Mac (possibly related to time)[/+]ianlancetaylor commentedon Jan 11, 2018
We did change the Darwin time code in 1.10, for High Sierra (#22037). That may have made it slower, though I would expect that it would only matter for benchmarks that are dominated by timing calls. Please do attach some profiling data. Thanks.
CC @aclements
cep21 commentedon Jan 11, 2018
How do I do that? I don't see a way to attach files to github issues.
Here is some top30 output
And with go1.10
bradfitz commentedon Jan 11, 2018
Sure looks like time, but that's pretty extreme. time.Now went from 13.31% cumulative to 99.75%.
The new code that you're in (specifically the assembly for time.now lowercase) is:
https://go-review.googlesource.com/c/go/+/67332/5/src/runtime/sys_darwin_amd64.s
I wonder if it can get stuck for too many iterations in this part:
@rsc, do you remember if that CL was that much slower for you?
ianlancetaylor commentedon Jan 11, 2018
@cep21 Which version of Darwin are you using? Are you running it on a physical machine or some sort of VM?
cep21 commentedon Jan 11, 2018
Running on my physical mac (MacBook Pro)
rasky commentedon Jan 11, 2018
@cep21 I'm trying to reproduce it. The instructions seem incomplete because I also need to download test dependencies (
go get -t ./...
) but besides that, I'm getting pages of this error:If I just ignore it, I get results which are much better on tip:
Go 1.9.2:
Go tip c13e0e8
Full benchcmp:
This is on a MacBook Pro 2016 (Core i7).
cep21 commentedon Jan 11, 2018
The package
go-hystrix
is written in a way that it logs to stdout when it is overused :/ I manually comment out that log statement in my local repository. I'll push a branch that does that for you.cep21 commentedon Jan 11, 2018
Ok new reproduce steps
3 remaining items
rasky commentedon Jan 12, 2018
So, results seem to make sense.
I run High Sierra. High Sierra broke Go's
time.Now()
fast implementation so all versions of Go (up to and including 1.9.2) fallback to a slower codepath that is syscall-based. This might have an effect on benchmarks wheretime.Now()
dominates. We fixed this during the Go 1.10 cycle, sotime.Now()
on 1.10 is fast again like previous Go versions on previous macOS versions.You're running Sierra (I assume so given your Darwin version), so Go 1.9.2 is fast for you just like Go 1.10 is fast on my High Sierra. The question is: why Go 1.10 is slow for you... it shouldn't be.
I don't have Sierra handy right now, I'll need some help. Let's see if the version detection works: can you please try compiling Go from source, after modifying this line to an absolute jump (
JMP
instead ofJB
)? This should force the 1.9 codepath for all calls totime.Now()
. Let's see what happens.cep21 commentedon Jan 12, 2018
Magic! Working much better. Changing to a JMP makes the benchmarks between tip and go 1.9 comparable, with tip slightly better.
rasky commentedon Jan 12, 2018
Nice. What is the output of this program on your Mac?
rasky commentedon Jan 12, 2018
It'll be
13
. It looks like I made an oversight when preparing d8ff3d5: Apple did not increment the commpage version in High Sierra. I had assumed so because they added more fields and changed the way the real time is computed, but that's probably a backward compatible change for whatever API/ABI is enforced for commpage. I should have known better and double-checked with a git blame over XNU sources.I'll prepare a fix tomorrow.
bradfitz commentedon Jan 12, 2018
@rasky, thanks!
gopherbot commentedon Jan 13, 2018
Change https://golang.org/cl/87655 mentions this issue:
runtime: fix time.Now on Sierra and older
rasky commentedon Jan 13, 2018
@cep21 it would be great if you could test this patch, as I don't have Sierra available.
cep21 commentedon Jan 13, 2018
@rasky I applied the patch. Benchmark numbers are back down.