Description
What version of Go are you using (`go version`)?
```
$ go version
go version go1.12.13 darwin/amd64
go version go1.13.4 darwin/amd64
```
Does this issue reproduce with the latest release?
Yes. Tested on `go version go1.4-bootstrap-20170531 darwin/amd64`.
What operating system and processor architecture are you using (`go env`)?
`go env` Output:
```
$ go env
GO111MODULE="on"
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/qwe/Library/Caches/go-build"
GOENV="/Users/qwe/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GONOPROXY="none"
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/qwe/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/Cellar/go/1.13.4/libexec"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/Cellar/go/1.13.4/libexec/pkg/tool/darwin_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/qv/37vc1dv526n0drc9jljj4v240000gp/T/go-build366955304=/tmp/go-build -gno-record-gcc-switches -fno-common"
```
What did you do?
The original issue description is posted at the gonum library repo: gonum/gonum#1174.
In a few words: we have a huge graph in memory and we need to update it periodically.
After a new graph is initialized, we switch the pointer to the new graph and wait for the GC to free the unused memory.
But the runtime never reuses all of the HeapIdle memory and instead asks the OS for new pages, growing HeapSys indefinitely.
Base self-contained demo: https://github.com/savalin/example/blob/master/main.go.
The code is simplified, but even this version shows the memory-management strangeness.
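For reference, here is a minimal, self-contained sketch of the pattern described above. It is not the linked demo itself; `buildGraph`, the sizes, and the loop count are illustrative assumptions only:

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// buildGraph is a stand-in for the expensive graph construction in the real
// application; it just allocates a large slice-backed structure.
func buildGraph(n int) [][]float64 {
	g := make([][]float64, n)
	for i := range g {
		g[i] = make([]float64, n)
	}
	return g
}

func main() {
	var graph [][]float64
	for i := 0; i < 10; i++ {
		// Build a replacement graph, then swap the pointer so the old one
		// becomes garbage and should be reclaimed by the GC.
		graph = buildGraph(3000)
		_ = graph

		var m runtime.MemStats
		runtime.ReadMemStats(&m)
		fmt.Printf("cycle %d: HeapSys=%d MiB HeapIdle=%d MiB HeapReleased=%d MiB\n",
			i, m.HeapSys>>20, m.HeapIdle>>20, m.HeapReleased>>20)
		time.Sleep(time.Second)
	}
}
```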
What did you expect to see?
Reuse of HeapIdle memory instead of asking the OS for new memory.
What did you see instead?
HeapIdle/HeapSys grow indefinitely (until the OOM killer wakes up and kills the app).
For now we have downgraded Go to v1.11.13 as a workaround, but we hope to be able to use newer Go releases.
Activity
dmitshur commented on Nov 26, 2019
/cc @aclements @mknyszek
(Please let me know if there are other people who would be better to ping about an issue like this.)
mknyszek commented on Nov 26, 2019
@savalin FWIW, I can't reproduce this at tip; `HeapSys` stays nice and low. This leads me to an interesting conclusion. Go 1.12 introduced the concept of treating memory separately depending on whether or not it has been returned to the OS. This policy was carried through into Go 1.13, but in Go 1.14 we no longer do this because it would be expensive and cumbersome to do so. The reason this decision was made in the first place was to accommodate the more aggressive mechanism for returning memory to the OS in Go 1.13 (#30333). The decision was deemed a good idea because jemalloc and other allocators follow a similar rule for returning memory to the OS. It seems like, for whatever reason, this decision is causing fragmentation which really hurts this application.
Anyway, it shouldn't matter for future releases. This bug is effectively fixed for the Go 1.14 release. Please give the tip of the master branch a try and see if you get the same results.
Finally, this made me realize that `debug.FreeOSMemory` no longer works as it's supposed to in Go 1.14, so I'll open up an issue and fix that.

CC @aclements, since this is the first consequence we're seeing of that original decision.
Please continue reading if you'd like to see some additional notes I wrote while sussing this out.
- `HeapSys` is expected to never decrease, since we never unmap memory in the runtime. Instead we call madvise, which increases `HeapReleased`. `HeapSys - HeapReleased` is closer to your actual working set, not accounting for certain amounts of fragmentation.
- You mention trying `GODEBUG=madvdontneed=1`, but here you specify `darwin/amd64`; note that that option only has any effect on Linux, where it'll properly drop RSS counters at the cost of additional page faults (but the OS lets you overcommit, it'll just kick out `MADV_FREE`'d pages when it needs them). We don't do what drops RSS counters on `darwin/amd64` because it's non-trivial to do, though perhaps we could now that we rely on libSystem.
- You should see `HeapIdle == HeapReleased` after you call `debug.FreeOSMemory`, which is exactly what that does (a sketch of checking this is below). I'm not sure why your OOM killer is triggering, because `HeapSys` is just virtual memory. Do you have `ulimit -v` set? Perhaps this is a Darwin overcommit thing we need to work around.
- In Go 1.12, `HeapSys` could keep getting larger (runtime: treap implementation of find() doesn't return the best-fit span #31616) because of a bug. We also noticed that there was a problem in Go 1.12 if we decided to prefer unscavenged memory over scavenged memory in the allocator, so we decided to take the best between the two instead, which resolved the issue.
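To illustrate the `debug.FreeOSMemory` point above, here is a minimal sketch (the allocation size is an arbitrary assumption, not taken from the reported application) that drops a large allocation, forces the memory to be returned to the OS, and then compares the relevant MemStats fields:

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

func main() {
	// Allocate and then drop a large block so there is plenty of idle heap.
	big := make([]byte, 1<<30)
	for i := range big {
		big[i] = 1
	}
	big = nil

	// FreeOSMemory forces a GC and then returns as much memory to the OS as
	// possible, so HeapIdle and HeapReleased should end up (nearly) equal.
	debug.FreeOSMemory()

	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	fmt.Printf("HeapSys=%d MiB HeapIdle=%d MiB HeapReleased=%d MiB\n",
		m.HeapSys>>20, m.HeapIdle>>20, m.HeapReleased>>20)
}
```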
The issue title was changed from "HeapIdle is increasing infinitely on big data volume allocations" to "runtime: being unable to allocate across scavenged and unscavenged memory leads to fragmentation blow-up in some cases".

mknyszek commented on Nov 26, 2019
@savalin I will close this once you're able to verify that this is fixed for you. If you're still experiencing problems, we'll move on from there.
gopherbot commented on Nov 26, 2019
mknyszek commented on Dec 2, 2019
@savalin My mistake. This doesn't reproduce at tip as long as you put a `runtime.GC()` call at the very top of the for loop (when all the heap memory is actually dead).

I can get it to increase the same as in #35890. What @randall77 mentioned in that one is pretty much what I'm seeing here: it caps out at around 20 GiB.
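For clarity, the adjustment being described looks roughly like this; it reuses the hypothetical `buildGraph` stand-in from the earlier sketch rather than the actual demo code:

```go
package main

import "runtime"

// buildGraph is the same hypothetical stand-in used in the earlier sketch.
func buildGraph(n int) [][]float64 {
	g := make([][]float64, n)
	for i := range g {
		g[i] = make([]float64, n)
	}
	return g
}

func main() {
	var graph [][]float64
	for i := 0; i < 10; i++ {
		// At this point the graph from the previous iteration is already
		// dead, so an explicit collection lets the runtime reuse that
		// memory before the next large allocation.
		runtime.GC()

		graph = buildGraph(3000)
		_ = graph
	}
}
```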