Closed
Description
Please answer these questions before submitting your issue. Thanks!
- What version of Go are you using (go version)?
1.6, 1.6.1, 1.5.2
- What operating system and processor architecture are you using (go env)?
set GOARCH=amd64
set GOBIN=
set GOEXE=.exe
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOOS=windows
set GOPATH=C:\Development\Projects\go
set GORACE=
set GOROOT=C:\Go
set GOTOOLDIR=C:\Go\pkg\tool\windows_amd64
set GO15VENDOREXPERIMENT=
set CC=gcc
set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0
set CXX=g++
set CGO_ENABLED=1
- What did you do?
If possible, provide a recipe for reproducing the error.
A complete runnable program is good.
A link on play.golang.org is best.
Ran the following program on Windows using Go 1.6.1 and found goroutines using ~11 kB of memory each. Ran the test again using Go 1.5.2 to see if there was a difference in the amount of memory used per goroutine.
https://play.golang.org/p/bgP8fs5O7q
- What did you expect to see?
I had expected goroutines to use approximately the same amount of memory in different versions of Go.
- What did you see instead?
Goroutines were using ~11 kB of memory in 1.6.1 and ~9.5 kB in 1.5.2.
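The linked playground program is not reproduced in this thread. As a rough sketch of the kind of measurement described, assuming the test parks many goroutines on a channel receive and divides the growth in memory obtained from the OS by the goroutine count, something like the following could be used. The function name sysBytesPerGoroutine and the goroutine count are illustrative assumptions, not taken from the original program.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// sysBytesPerGoroutine is a hypothetical helper (not from the original
// program). It starts n goroutines that block on a channel receive and
// returns the growth in runtime.MemStats.Sys divided by n.
func sysBytesPerGoroutine(n int) uint64 {
	if n <= 0 {
		return 0
	}
	var before, after runtime.MemStats
	runtime.GC()
	runtime.ReadMemStats(&before)

	ch := make(chan struct{})
	var started sync.WaitGroup
	started.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			started.Done()
			<-ch // block until the channel is closed
		}()
	}
	started.Wait() // every goroutine has at least begun running

	runtime.GC()
	runtime.ReadMemStats(&after)
	close(ch) // let the goroutines exit

	if after.Sys <= before.Sys {
		return 0
	}
	return (after.Sys - before.Sys) / uint64(n)
}

func main() {
	// Assumption: the count is kept modest here; the discussion later in
	// the thread suggests the original test used about 1e5 goroutines.
	const n = 10000
	fmt.Printf("%d goroutines: ~%d bytes each (growth in Sys)\n", n, sysBytesPerGoroutine(n))
}
```

The result depends on the Go version, OS, and how much memory the runtime had already reserved, so it is only comparable between runs of the same setup.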
Activity
bradfitz commented on Apr 15, 2016
I think you mean per goroutine. Your program has only one channel total.
Title changed from "Memory usage of channels increased on Windows by 20% from 1.5 to 1.6" to "runtime: memory usage of goroutines on Windows increased by 20% from 1.5 to 1.6"
bradfitz commented on Apr 15, 2016
/cc @alexbrainman @aclements
aclements commented on Apr 15, 2016
The channel may actually be important here. On Linux, it went from 2,590 bytes in 1.5.2 to 4,709 bytes in 1.6. However, if you replace <-ch with select {}, it only goes up to 2,600 bytes in 1.6. This suggests that, at least on Linux, the channel operation used to fit in the initial 2K stack allocation and now doesn't. The explanation may be different on Windows, however, since there the initial stack is 8K, so it would grow to 16K if it did grow, which is more than the observed 11K.
aclements commented on Apr 15, 2016
Interesting. 1.6 is not growing the stack, so that's not where the extra memory is coming from.
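One way to approximate the <-ch versus select {} comparison, and to check whether stacks are growing, is to watch runtime.MemStats.StackInuse while parking goroutines in each way. This is only a sketch under assumptions: the helper name stackBytesPerGoroutine is hypothetical, and the short sleep is a crude way to let goroutines reach their blocking point before measuring.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"
)

// stackBytesPerGoroutine is a hypothetical helper. It starts n goroutines
// that each call block, and returns the growth in MemStats.StackInuse
// divided by n. If block never returns, the goroutines stay parked until
// the program exits.
func stackBytesPerGoroutine(n int, block func()) uint64 {
	if n <= 0 {
		return 0
	}
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)

	var started sync.WaitGroup
	started.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			started.Done()
			block()
		}()
	}
	started.Wait()
	// Assumption: a brief pause is enough for the goroutines to park.
	time.Sleep(100 * time.Millisecond)

	runtime.ReadMemStats(&after)
	if after.StackInuse <= before.StackInuse {
		return 0
	}
	return (after.StackInuse - before.StackInuse) / uint64(n)
}

func main() {
	const n = 10000
	ch := make(chan struct{}) // never sent on or closed: receivers stay blocked

	recv := stackBytesPerGoroutine(n, func() { <-ch })
	sel := stackBytesPerGoroutine(n, func() { select {} })

	fmt.Printf("blocked on <-ch:      ~%d stack bytes per goroutine\n", recv)
	fmt.Printf("blocked on select {}: ~%d stack bytes per goroutine\n", sel)
}
```

If the two figures match the platform's initial stack size, the stacks have not grown, which is consistent with the observation above for 1.6.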
aclements commented on Apr 15, 2016
The extra memory is all in mstats.GCSys, which went from 33,596,416 in 1.5.2 to 212,801,536 in 1.6. The majority of that is almost certainly in workbufs. I'm not surprised there are a fair number of workbufs, since it's going to pick up all of the sudogs created by the blocked goroutines during stack scanning. However, I don't know why it would have increased so much since 1.5.2.
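The per-category figures mentioned here are exposed through runtime.MemStats. A minimal sketch of how one might see which category grows is below; the sysSnapshot type and snapshot function are illustrative names, and the goroutine count is an assumption.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// sysSnapshot holds a few of the per-category "Sys" counters from
// runtime.MemStats. The type and its name are illustrative only.
type sysSnapshot struct {
	HeapSys, StackSys, GCSys, Sys uint64
}

// snapshot forces a collection and then reads the counters.
func snapshot() sysSnapshot {
	var m runtime.MemStats
	runtime.GC()
	runtime.ReadMemStats(&m)
	return sysSnapshot{HeapSys: m.HeapSys, StackSys: m.StackSys, GCSys: m.GCSys, Sys: m.Sys}
}

// delta returns after-before as a signed value for printing.
func delta(after, before uint64) int64 {
	return int64(after) - int64(before)
}

func main() {
	const n = 10000 // assumption: smaller than the ~1e5 discussed in the thread

	before := snapshot()

	ch := make(chan struct{})
	var started sync.WaitGroup
	started.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			started.Done()
			<-ch // park on a channel receive
		}()
	}
	started.Wait()

	after := snapshot()
	close(ch)

	fmt.Printf("HeapSys  %+d bytes\n", delta(after.HeapSys, before.HeapSys))
	fmt.Printf("StackSys %+d bytes\n", delta(after.StackSys, before.StackSys))
	fmt.Printf("GCSys    %+d bytes\n", delta(after.GCSys, before.GCSys))
	fmt.Printf("Sys      %+d bytes\n", delta(after.Sys, before.Sys))
}
```

Running the same program under two Go releases and comparing the GCSys line is one way to reproduce the kind of difference reported above; the absolute numbers will vary by release and platform.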
/cc @RLH
Ariemeth commented on Apr 15, 2016
You are right. I had channels on the brain. I meant goroutine.
Ariemeth commented on Apr 15, 2016
Thank you for fixing that @bradfitz
valyala commented on Apr 16, 2016
This sounds pretty bad :(
Is there a justified reason for such a large stack size for channel operations? This effectively prohibits using channels in memory-efficient, highly concurrent code running millions of goroutines.
aclements commented on Apr 16, 2016
@valyala, I confirmed (in my later comments) that it's not in fact stack growth causing this. The goroutines are still running on their initial stack allocation. What's causing the increased memory usage is that GC is allocating more internal memory (most likely work buffers), though I haven't tracked down why yet.
Title changed from "runtime: memory usage of goroutines on Windows increased by 20% from 1.5 to 1.6" to "runtime: wbuf allocation increased significantly from 1.5 to 1.6"
aclements commented on May 24, 2016
Using
benchmany run -n 1 -order metric -metric gc-bytes -buildcmd 'go build' go1.5..go1.6
to bisect on memstats.GCSys between 1.5 and 1.6, there are two clear change points: commit 1870572 made it go from 33.6 MB to 417 MB and commit b6c0934 made it go down to 213 MB.
This makes sense to some extent: 1870572 increased the size of the workbuf by 16x, but that was supposed to mean we had ~16x fewer of them. Commit b6c0934 then halved the workbuf size (since it started caching two of them). Instead, in this benchmark, we have almost the same number of workbufs. The next step is to figure out why they aren't being reused like they're supposed to be.
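The "almost the same number of workbufs" observation can be checked with back-of-the-envelope arithmetic. The sketch below assumes, as a simplification, that all of GCSys is workbufs. The 2048-byte size is the figure quoted elsewhere in this thread; the 4096-byte and 256-byte sizes are inferred from the "halved" and "16x" ratios in the comment rather than stated directly. The exact byte counts for 1.5.2 and 1.6 are the mstats.GCSys values reported earlier, and 417 MB is approximate.

```go
package main

import "fmt"

// workbufCount estimates how many fixed-size work buffers account for
// gcSysBytes, assuming (as a simplification) that all of it is workbufs.
func workbufCount(gcSysBytes, workbufSize uint64) uint64 {
	return gcSysBytes / workbufSize
}

func main() {
	const (
		sizeFinal  = 2048          // bytes per workbuf, as quoted in the thread
		sizePeak   = sizeFinal * 2 // inferred: size before it was halved
		sizeBefore = sizePeak / 16 // inferred: size before the 16x increase
	)

	points := []struct {
		label string
		gcSys uint64
		size  uint64
	}{
		{"1.5.2", 33596416, sizeBefore},
		{"after 16x increase", 417000000, sizePeak}, // "417 MB", approximate
		{"1.6", 212801536, sizeFinal},
	}

	for _, p := range points {
		fmt.Printf("%-20s %9d bytes / %4d bytes = %d workbufs\n",
			p.label, p.gcSys, p.size, workbufCount(p.gcSys, p.size))
	}
}
```

Under these assumptions each point comes out on the order of 1e5 buffers, which matches the statement that the buffer count stayed roughly constant while the buffer size changed.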
aclements commented on May 24, 2016
This is happening because of the dispose in scanstack. Because of the runtime.GC calls, all stacks are being scanned during mark termination, which causes every scanstack to dispose its buffer. Even though there are only a few pointers in the buffer when it's disposed here, it goes to the "full" queue. Since all of the stack scans happen before we start draining mark work during mark termination, the number of work buffers is proportional to the number of stacks, rather than the number of pointers. In fact, the math works out almost exactly: 213 MB / 2048 bytes/workbuf ≈ 1.04e5 workbufs ≈ 1e5 goroutines.
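The effect described above can be illustrated with a toy model. This is not the runtime's code: the function names, the buffer capacity of 250 pointers, and the 3 pointers per stack are all hypothetical. It only contrasts handing off one partially filled buffer per stack scan with letting consecutive scans keep filling the same buffer.

```go
package main

import "fmt"

// buffersWithPerScanDispose models each stack scan starting with its own
// buffer and handing it to the "full" queue when the scan ends, even if
// the buffer holds only a few pointers. It returns the buffers consumed.
func buffersWithPerScanDispose(stacks, ptrsPerStack, bufCap int) int {
	buffers := 0
	for s := 0; s < stacks; s++ {
		used := 0
		for p := 0; p < ptrsPerStack; p++ {
			if used == 0 {
				buffers++ // take a fresh buffer
			}
			used++
			if used == bufCap {
				used = 0 // buffer is genuinely full, hand it off
			}
		}
		// End of scan: any partially filled buffer is handed off too,
		// so the next stack starts with a fresh one.
	}
	return buffers
}

// buffersWithSharedWork models scans that keep appending to the same
// buffer across stacks, so a buffer is handed off only when it fills up.
func buffersWithSharedWork(stacks, ptrsPerStack, bufCap int) int {
	buffers, used := 0, 0
	for s := 0; s < stacks; s++ {
		for p := 0; p < ptrsPerStack; p++ {
			if used == 0 {
				buffers++
			}
			used++
			if used == bufCap {
				used = 0
			}
		}
	}
	return buffers
}

func main() {
	// All three values are assumptions chosen for illustration.
	const stacks, ptrsPerStack, bufCap = 100000, 3, 250

	fmt.Println("per-scan dispose:", buffersWithPerScanDispose(stacks, ptrsPerStack, bufCap))
	fmt.Println("shared buffer:   ", buffersWithSharedWork(stacks, ptrsPerStack, bufCap))
}
```

In the first variant the buffer count tracks the number of stacks; in the second it tracks the number of pointers divided by the buffer capacity. The model says nothing about how the runtime actually manages its queues.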
aclements commented on May 24, 2016
I have a fairly simple fix that reduces this test down to 10 MB of workbufs. I'll test and benchmark it more thoroughly tomorrow and send a CL.
gopherbot commented on May 24, 2016
CL https://golang.org/cl/23391 mentions this issue.
runtime: pass gcWork to scanstack