Description
What version of Go are you using (go version
)?
$ go version go version go1.15.2 linux/amd64
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env
)?
go env
Output
$ go env GO111MODULE="" GOARCH="amd64" GOBIN="" GOCACHE="/home/user/.cache/go-build" GOENV="/home/user/.config/go/env" GOEXE="" GOFLAGS="" GOHOSTARCH="amd64" GOHOSTOS="linux" GOINSECURE="" GOMODCACHE="/home/user/.go_workspace/pkg/mod" GONOPROXY="" GONOSUMDB="" GOOS="linux" GOPATH="/home/user/.go_workspace:/usr/local/go_workspace" GOPRIVATE="" GOPROXY="https://proxy.golang.org,direct" GOROOT="/usr/local/go" GOSUMDB="sum.golang.org" GOTMPDIR="" GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64" GCCGO="gccgo" AR="ar" CC="gcc" CXX="g++" CGO_ENABLED="1" GOMOD="/home/user/project/go.mod" CGO_CFLAGS="-g -O2" CGO_CPPFLAGS="" CGO_CXXFLAGS="-g -O2" CGO_FFLAGS="-g -O2" CGO_LDFLAGS="-g -O2" PKG_CONFIG="pkg-config" GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build013192427=/tmp/go-build -gno-record-gcc-switches"
What did you do?
We have a large(ish) cache built up from CI builds (approx 2GB in size). We've seen tests slow down fairly dramatically, and it seems that the underlying cause is (possibly partial) cache misses resulting in at least some of the test process to be run every single time (in particular I see the linker being invoked).
$ time GODEBUG=gocachetest=1 go test -race ./service/
testcache: github.com/foo/bar/service: test ID 5277766d4236333530556841686b797248445250 => 32e67a3c459a396178c3acafc87928bc9fc94b28d20fa9763dfc1fac9925e9a1
testcache: github.com/foo/bar/service: input list not found: cache entry not found: open /home/user/.cache/go-build/32/32e67a3c459a396178c3acafc87928bc9fc94b28d20fa9763dfc1fac9925e9a1-a: no such file or directory
testcache: github.com/foo/bar/service: test ID 484f364f314d6d6c647371774138555430696467 => ce7e1c46d67997e3d0b493d985ea92e199bad7b2fc4a97e7b405c87a92ac5d18
testcache: github.com/foo/bar/service: test ID ce7e1c46d67997e3d0b493d985ea92e199bad7b2fc4a97e7b405c87a92ac5d18 => input ID de8952dc6e704af61a533169fd1be1877a6c2778382c4f9a0828a177db9401ac => 74e40de3953e4347471748ef1e1ff4aacfa7380968c99f0baa14b0b7b1d24dae
ok github.com/foo/bar/service (cached)
real 0m2.185s
user 0m2.687s
sys 0m0.387s
$ time GODEBUG=gocachetest=1 go test -race ./service/
testcache: github.com/foo/bar/service: test ID 5277766d4236333530556841686b797248445250 => 32e67a3c459a396178c3acafc87928bc9fc94b28d20fa9763dfc1fac9925e9a1
testcache: github.com/foo/bar/service: input list not found: cache entry not found: open /home/user/.cache/go-build/32/32e67a3c459a396178c3acafc87928bc9fc94b28d20fa9763dfc1fac9925e9a1-a: no such file or directory
testcache: github.com/foo/bar/service: test ID 484f364f314d6d6c647371774138555430696467 => ce7e1c46d67997e3d0b493d985ea92e199bad7b2fc4a97e7b405c87a92ac5d18
testcache: github.com/foo/bar/service: test ID ce7e1c46d67997e3d0b493d985ea92e199bad7b2fc4a97e7b405c87a92ac5d18 => input ID de8952dc6e704af61a533169fd1be1877a6c2778382c4f9a0828a177db9401ac => 74e40de3953e4347471748ef1e1ff4aacfa7380968c99f0baa14b0b7b1d24dae
ok github.com/foo/bar/service (cached)
real 0m2.207s
user 0m2.786s
sys 0m0.397s
I'm not familiar with what it is caching, or indeed how the caching works, but it seems removing the cache entry that is being hit then resolves this issue:
$ rm ~/.cache/go-build/74/74e40de3953e4347471748ef1e1ff4aacfa7380968c99f0baa14b0b7b1d24dae-a
$ time GODEBUG=gocachetest=1 go test -race ./service/
testcache: github.com/foo/bar/service: test ID 5277766d4236333530556841686b797248445250 => 32e67a3c459a396178c3acafc87928bc9fc94b28d20fa9763dfc1fac9925e9a1
testcache: github.com/foo/bar/service: input list not found: cache entry not found: open /home/user/.cache/go-build/32/32e67a3c459a396178c3acafc87928bc9fc94b28d20fa9763dfc1fac9925e9a1-a: no such file or directory
testcache: github.com/foo/bar/service: test ID 484f364f314d6d6c647371774138555430696467 => ce7e1c46d67997e3d0b493d985ea92e199bad7b2fc4a97e7b405c87a92ac5d18
testcache: github.com/foo/bar/service: test ID ce7e1c46d67997e3d0b493d985ea92e199bad7b2fc4a97e7b405c87a92ac5d18 => input ID de8952dc6e704af61a533169fd1be1877a6c2778382c4f9a0828a177db9401ac => 74e40de3953e4347471748ef1e1ff4aacfa7380968c99f0baa14b0b7b1d24dae
testcache: github.com/foo/bar/service: test output not found: cache entry not found: open /home/user/.cache/go-build/74/74e40de3953e4347471748ef1e1ff4aacfa7380968c99f0baa14b0b7b1d24dae-a: no such file or directory
testcache: github.com/foo/bar/service: save test ID 32e67a3c459a396178c3acafc87928bc9fc94b28d20fa9763dfc1fac9925e9a1 => input ID de8952dc6e704af61a533169fd1be1877a6c2778382c4f9a0828a177db9401ac => c5d26a24287e143e5891845de941e3d6e14ccca91d2e881d31430334a09001a8
testcache: github.com/foo/bar/service: save test ID ce7e1c46d67997e3d0b493d985ea92e199bad7b2fc4a97e7b405c87a92ac5d18 => input ID de8952dc6e704af61a533169fd1be1877a6c2778382c4f9a0828a177db9401ac => 74e40de3953e4347471748ef1e1ff4aacfa7380968c99f0baa14b0b7b1d24dae
ok github.com/foo/bar/service 0.052s
real 0m2.236s
user 0m2.863s
sys 0m0.370s
$ time GODEBUG=gocachetest=1 go test -race ./service/
testcache: github.com/foo/bar/service: test ID 5277766d4236333530556841686b797248445250 => 32e67a3c459a396178c3acafc87928bc9fc94b28d20fa9763dfc1fac9925e9a1
testcache: github.com/foo/bar/service: test ID 32e67a3c459a396178c3acafc87928bc9fc94b28d20fa9763dfc1fac9925e9a1 => input ID de8952dc6e704af61a533169fd1be1877a6c2778382c4f9a0828a177db9401ac => c5d26a24287e143e5891845de941e3d6e14ccca91d2e881d31430334a09001a8
ok github.com/foo/bar/service (cached)
real 0m0.728s
user 0m0.974s
sys 0m0.276s
$ time GODEBUG=gocachetest=1 go test -race ./service/
testcache: github.com/foo/bar/service: test ID 5277766d4236333530556841686b797248445250 => 32e67a3c459a396178c3acafc87928bc9fc94b28d20fa9763dfc1fac9925e9a1
testcache: github.com/foo/bar/service: test ID 32e67a3c459a396178c3acafc87928bc9fc94b28d20fa9763dfc1fac9925e9a1 => input ID de8952dc6e704af61a533169fd1be1877a6c2778382c4f9a0828a177db9401ac => c5d26a24287e143e5891845de941e3d6e14ccca91d2e881d31430334a09001a8
ok github.com/foo/bar/service (cached)
real 0m0.725s
user 0m1.034s
sys 0m0.234s
Note the save test ID
lines and the the speed up.
We can of course clear out the cache to help resolve this, but it feels like there is perhaps an issue here: there is a cache miss yet in these circumstances no new cache entry is created, at least without intervention. I'm open to the possibility that our cache has gotten into a bad state, though I'd have a lot of difficulty trying to understand why.
Activity
[-]go test perpetual cache misses[/-][+]cmd/go: test reported as cached, but not actually cached[/+]bcmills commentedon Sep 23, 2020
The
ok github.com/foo/bar/service (cached)
output in that first trace seems to indicate that the test result should have been loaded from the cache, but the timing information is not consistent with that output.To me, that suggests that we are missing some error-handling somewhere when the cache gets into an inconsistent state.
CC @jayconrod @matloob
nick-jones commentedon Sep 24, 2020
Just to add, this issue seems to be quite prevalent across our entire test suite (I only pasted an isolated example above).
I've not included the full output as it is quite large, but with runs 1, 2, 4 & 5 the output only contained lines with
no test files
orcached
. So after clearing the cache out, the time to run the tests where all tests are apparently cached drops from ~6 minutes to ~1 minutenick-jones commentedon Nov 10, 2020
@bcmills - having taken another look at this, I'm wondering if we're hitting this condition here:
go/src/cmd/go/internal/test/test.go
Lines 1143 to 1154 in 1c7650a
tryCache
is invoked twice, with the first attempt getting a cache missNote that the comment suggests
saveOutput
will be invoked in this particular case, however from what I see it will only be called if this also results in a cache miss. Which is fine, however it means a relatively expensive link step is going to to run every single time?Edit: I note that calling
saveOutput
in the case where the second cache lookup succeeds won't be an answer to this, assaveOutput
relies on the tests actually being run (readstestlog.txt
).nick-jones commentedon Nov 10, 2020
With the previous comment in mind, steps to force this undesirable behaviour are:
We can confirm that using the linker flag is causing the first cache fetch to miss:
The time loss seems minor, but with a large test suite with complex and large scale dependencies this very quickly adds up (as demonstrated in #41593 (comment)). Note that we aren't using any bogus
-X
flags AFAIK, I'm wondering if the cache misses start to occur after upgrading go versions.bcmills commentedon Nov 27, 2023
This may be another symptom of #49267.