sync: Pool example suggests incorrect usage #23199

Open

Description

@dsnet

The operation of sync.Pool assumes that the memory cost of each element is approximately the same in order to be efficient. This property can be seen by the fact that Pool.Get returns you a random element, and not the one that has "the greatest capacity" or what not. In other words, from the perspective of the Pool, all elements are more or less the same.

However, the Pool example stores bytes.Buffer objects, which have an underlying []byte of varying capacity depending on how much of the buffer is actually used.

Dynamically growing unbounded buffers can cause a large amount of memory to be pinned and never freed, in a live-lock situation. Consider the following:

package main

import (
	"bytes"
	"fmt"
	"runtime"
	"sync"
	"time"
)

func main() {
	pool := sync.Pool{New: func() interface{} { return new(bytes.Buffer) }}

	processRequest := func(size int) {
		b := pool.Get().(*bytes.Buffer)
		time.Sleep(500 * time.Millisecond) // Simulate processing time.
		b.Grow(size)
		pool.Put(b)
		time.Sleep(1 * time.Millisecond) // Simulate idle time.
	}

	// Simulate a set of initial large writes.
	for i := 0; i < 10; i++ {
		go func() {
			processRequest(1 << 28) // 256MiB
		}()
	}

	time.Sleep(time.Second) // Let the initial set finish.

	// Simulate an unending series of small writes.
	for i := 0; i < 10; i++ {
		go func() {
			for {
				processRequest(1 << 10) // 1KiB
			}
		}()
	}

	// Continually run a GC and track the allocated bytes.
	var stats runtime.MemStats
	for i := 0; ; i++ {
		runtime.ReadMemStats(&stats)
		fmt.Printf("Cycle %d: %dB\n", i, stats.Alloc)
		time.Sleep(time.Second)
		runtime.GC()
	}
}

Depending on timing, the above snippet takes around 35 GC cycles for the initial set of large requests (2.5GiB in total) to finally be freed, even though each of the subsequent writes uses only around 1KiB. The same thing can happen in a real server handling many small requests, where large buffers allocated by some earlier request end up pinned for a long time because they never sit in the Pool long enough to be collected.

The example claims to be based on fmt usage, but I'm not convinced that fmt's usage is correct. It is susceptible to the live-lock problem described above. I suspect this hasn't been an issue in most real programs since fmt.PrintX is typically not used to write very large strings. However, other applications of sync.Pool may certainly have this issue.

I suggest we fix the example to store elements of fixed size and document this.

\cc @kevinburke @LMMilewski @bcmills

Activity

dsnet (Member, Author) commented on Dec 20, 2017

I should also note that if #22950 is done, then usages like this will cause large buffers to be pinned forever since this example has a steady state of Pool usage, so the GC would never clear the pool.

dsnet (Member, Author) commented on Dec 20, 2017

Here's an even worse situation than earlier (suggested by @bcmills):

package main

import (
	"bytes"
	"fmt"
	"runtime"
	"sync"
	"time"
)

func main() {
	pool := sync.Pool{New: func() interface{} { return new(bytes.Buffer) }}

	processRequest := func(size int) {
		b := pool.Get().(*bytes.Buffer)
		time.Sleep(500 * time.Millisecond) // Simulate processing time.
		b.Grow(size)
		pool.Put(b)
		time.Sleep(1 * time.Millisecond) // Simulate idle time.
	}

	// Simulate a steady stream of infrequent large requests.
	go func() {
		for {
			processRequest(1 << 28) // 256MiB
		}
	}()

	// Simulate a storm of small requests.
	for i := 0; i < 1000; i++ {
		go func() {
			for {
				processRequest(1 << 10) // 1KiB
			}
		}()
	}

	// Continually run a GC and track the allocated bytes.
	var stats runtime.MemStats
	for i := 0; ; i++ {
		runtime.ReadMemStats(&stats)
		fmt.Printf("Cycle %d: %dB\n", i, stats.Alloc)
		time.Sleep(time.Second)
		runtime.GC()
	}
}

Rather than a one-off set of large requests, there is now a steady stream of occasional large requests intermixed with a large number of small ones. As this snippet runs, the heap keeps growing over time: the large requests "poison" the pool, so that most of the small requests eventually pin a large-capacity buffer under the hood.

kevinburke (Contributor) commented on Dec 20, 2017

Yikes. My goal in adding the example was to try to show the easiest-to-understand use case for a Pool. fmt was the best one I could find in the standard library.

ulikunitz (Contributor) commented on Dec 21, 2017

The solution is of course to put only buffers with small byte slices back into the pool:

if b.Cap() <= 1<<12 {
	pool.Put(b)
}
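
Applied to the snippets above, that check would replace the unconditional Put in processRequest. A minimal sketch, assuming the same pool as before (the 1<<12 threshold is arbitrary and would need tuning for a real workload):

processRequest := func(size int) {
	b := pool.Get().(*bytes.Buffer)
	time.Sleep(500 * time.Millisecond) // Simulate processing time.
	b.Grow(size)
	// Only return small buffers to the pool; buffers that grew
	// beyond the threshold are left for the GC to reclaim.
	if b.Cap() <= 1<<12 {
		b.Reset()
		pool.Put(b)
	}
	time.Sleep(1 * time.Millisecond) // Simulate idle time.
}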

dsnet (Member, Author) commented on Dec 21, 2017

Alternatively, you could use an array of sync.Pools to bucketize the items by size: https://github.com/golang/go/blob/7e394a2/src/net/http/h2_bundle.go#L998-L1043
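
In sketch form, bucketizing might look something like this (hypothetical size classes and helper names, not the actual h2_bundle.go code):

package main

import (
	"bytes"
	"sync"
)

// bufPools keeps one sync.Pool per capacity class, so that all
// elements within any single pool have roughly the same weight.
var bufPools = [...]struct {
	size int
	pool sync.Pool
}{
	{size: 1 << 10}, // 1KiB
	{size: 1 << 14}, // 16KiB
	{size: 1 << 18}, // 256KiB
}

// getBuf returns a buffer for a request of the given size,
// taken from the smallest class that fits it.
func getBuf(size int) *bytes.Buffer {
	for i := range bufPools {
		if size <= bufPools[i].size {
			if b, _ := bufPools[i].pool.Get().(*bytes.Buffer); b != nil {
				return b
			}
			break
		}
	}
	return new(bytes.Buffer)
}

// putBuf files the buffer under the class matching its capacity,
// dropping buffers too large for any class.
func putBuf(b *bytes.Buffer) {
	b.Reset()
	for i := range bufPools {
		if b.Cap() <= bufPools[i].size {
			bufPools[i].pool.Put(b)
			return
		}
	}
}

func main() {
	b := getBuf(1 << 12) // served from the 16KiB class
	b.WriteString("hello")
	putBuf(b)
}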

bcmills (Contributor) commented on Dec 21, 2017

The solution is of course …

There are many possible solutions: the important thing is to apply one of them.

A related problem can arise with goroutine stacks in conjunction with “worker pools”, depending on when and how often the runtime reclaims large stacks. (IIRC that has changed several times over the lifetime of the Go runtime, so I'm not sure what the current behavior is.) If you have a pool of worker goroutines executing callbacks that can vary significantly in stack usage, you can end up with all of the workers consuming very large stacks even if the overall fraction of large tasks remains very low.
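
A contrived way to watch for this, in the same spirit as the snippets above (a sketch with made-up worker counts and recursion depths; whether the stack numbers stay high depends on how aggressively a given runtime version shrinks stacks during GC):

package main

import (
	"fmt"
	"runtime"
	"time"
)

// deep recurses n frames, temporarily growing the worker's stack.
func deep(n int) int {
	if n == 0 {
		return 0
	}
	return deep(n-1) + 1
}

func main() {
	tasks := make(chan int)

	// A fixed pool of worker goroutines, reused across many tasks.
	for i := 0; i < 100; i++ {
		go func() {
			for n := range tasks {
				deep(n)
			}
		}()
	}

	// Mostly shallow tasks, with a rare deep one mixed in.
	go func() {
		for i := 0; ; i++ {
			if i%1000 == 0 {
				tasks <- 100000 // rare deep task
			} else {
				tasks <- 1 // common shallow task
			}
		}
	}()

	// Track total stack memory as every worker eventually
	// draws a deep task.
	var stats runtime.MemStats
	for i := 0; i < 30; i++ {
		runtime.ReadMemStats(&stats)
		fmt.Printf("Cycle %d: %dB of stack\n", i, stats.StackInuse)
		time.Sleep(time.Second)
		runtime.GC()
	}
}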

kevinburke (Contributor) commented on Dec 21, 2017

Do you have any suggestions for better use cases we could include in the example, that are reasonably compact?

Maybe the solution is not to recommend a sync.Pool at all anymore? This is my understanding from a comment I read about how the GC makes it more or less useless.

jzelinskie (Contributor) commented on Dec 22, 2017

Would changing the example to use an array (fixed size) rather than a slice solve this problem?
In Chihaya, this is how we've used sync.Pool and our implementation before it was in the standard library.
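
Something along these lines, perhaps (a hypothetical sketch with one 4KiB class, not Chihaya's actual code):

package main

import (
	"fmt"
	"sync"
)

// Every element is a *[4096]byte, so all entries in the pool
// have exactly the same memory cost.
var pool = sync.Pool{
	New: func() interface{} { return new([4096]byte) },
}

func main() {
	buf := pool.Get().(*[4096]byte)
	n := copy(buf[:], "some request data")
	fmt.Printf("used %d of %d bytes\n", n, len(buf))
	pool.Put(buf) // always returns an identically sized element
}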

Maybe the solution is not to recommend a sync.Pool at all anymore?

I legitimately don't think there ever was a time to generally recommend sync.Pool. I find it a pretty contentious addition to the standard library because of how careful and knowledgeable of the runtime you need to be in order to use it effectively. If you need optimization at this level, you probably know how to implement this best for your own use case.

Sorry to interject randomly, but I saw this thread on Twitter and have strong opinions on this feature.

aclements (Member) commented on Dec 22, 2017

Maybe the solution is not to recommend a sync.Pool at all anymore? This is my understanding from a comment I read about how the GC makes it more or less useless.

We would certainly like to get to this point, and the GC has improved a lot, but for high-churn allocations with obvious lifetimes and no need for zeroing, sync.Pool can still be a significant optimization. As @RLH likes to say, every use of sync.Pool is a bug report on the GC. But we're still taking those bug reports. :)

I should also note that if #22950 is done, then usages like this will cause large buffers to be pinned forever since this example has a steady state of Pool usage, so the GC would never clear the pool.

That's clearly true, but even right now it's partly by chance that these examples are eventually dropping the large buffers. And in the more realistic stochastic mix example, it's not clear to me that #22950 would make it any better or worse.

I agree with @dsnet's original point that we should document that sync.Pool treats all objects interchangeably, so they should all have roughly the same "weight". And it would be good to provide some suggestions for what to do in situations where this isn't the case, and perhaps some concrete examples of poor sync.Pool usage.

84 remaining items

Metadata

Labels: Documentation (Issues describing a change to documentation), NeedsFix (The path to resolution is known, but the work has not been done), compiler/runtime (Issues related to the Go compiler and/or runtime), help wanted

Participants: @FMNSSun, @peterbourgon, @andig, @kevinburke, @jzelinskie
