Description
After #35112 landed we saw some notable improvements in allocator scalability, but performance still plateaus at around 20-24 cores. The cause is `mcentral`, which has become the new scalability bottleneck. It is a bottleneck because each per-size-class `mcentral` is a pair of linked lists protected by a lock. The lock covers all iteration and operations on the `mcentral`. While a single addition to or removal from the linked list isn't a source of great contention, caching a span from an `mcentral` involves iteration, which is a source of significant contention on this lock.
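To make the contention pattern concrete, here is a minimal sketch of a lock-protected pair of lists. It is not the runtime's code: `span`, `lockedCentral`, `push`, and `cacheSpan` are simplified, hypothetical stand-ins, and the "skip exhausted spans" rule is an assumption chosen only to show a walk that happens entirely under one lock.

```go
package main

import "sync"

// span is a hypothetical stand-in for the runtime's mspan.
type span struct {
	next     *span
	freeObjs int // free object slots left in this span
}

// lockedCentral is a simplified, hypothetical model of a per-size-class
// mcentral: two singly linked lists guarded by a single lock.
type lockedCentral struct {
	mu       sync.Mutex
	nonempty *span // spans that may still have free objects
	empty    *span // spans known to have no free objects
}

// push adds s to the front of the nonempty list. This is O(1) under the
// lock, so on its own it causes little contention.
func (c *lockedCentral) push(s *span) {
	c.mu.Lock()
	defer c.mu.Unlock()
	s.next = c.nonempty
	c.nonempty = s
}

// cacheSpan walks the nonempty list until it finds a span with free
// objects, moving exhausted spans to the empty list along the way.
// The whole walk holds the lock, so every other caller waits for it.
func (c *lockedCentral) cacheSpan() *span {
	c.mu.Lock()
	defer c.mu.Unlock()
	for c.nonempty != nil {
		s := c.nonempty
		c.nonempty = s.next
		s.next = nil
		if s.freeObjs > 0 {
			return s // caller now holds this span
		}
		// Exhausted span: park it on the empty list and keep looking.
		s.next = c.empty
		c.empty = s
	}
	return nil
}
```

With many cores calling `cacheSpan` at once, the time spent inside the loop is serialized, which is the kind of plateau described above.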
Furthermore, the code around `mcentral` is fairly confusing. The main source of this confusion is span ownership: when should a span be in an `mcentral`? Currently a span may be owned simultaneously by:
- An `mcentral`.
- A concurrent sweeper.
- `mheap_.sweepSpans`.
- An `mcache`.
This makes reasoning about the span lifecycle and ownership tricky. With some refactoring, I think we can achieve scalability and also change the span ownership model to limit the number of simultaneous owners. @aclements suggested before that we could repurpose the `gcSweepBuf`, a data structure built for fast concurrent access, to replace the linked lists in the `mcentral`. We can take this idea further and also use these data structures for sweeping, instead of having a separate `mheap_.sweepSpans`. This means that `markrootSpans` will need a different mechanism for finding spans with specials, but we can use a bitmap for that, similar to the page reclaimer.
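As an illustration of the bitmap idea, the sketch below keeps one bit per span index and lets a scanner visit only the spans whose bit is set. Everything here is hypothetical (`specialsBitmap`, `set`, `clear`, `forEach`, and the one-bit-per-span-index layout); the issue does not specify how the bitmap is laid out or indexed.

```go
package main

import (
	"math/bits"
	"sync/atomic"
)

// specialsBitmap is a hypothetical bitmap with one bit per span index.
// A set bit means "this span has specials and must be visited".
type specialsBitmap struct {
	words []uint64
}

func newSpecialsBitmap(nspans int) *specialsBitmap {
	return &specialsBitmap{words: make([]uint64, (nspans+63)/64)}
}

// set atomically marks span i as having specials.
func (b *specialsBitmap) set(i int) {
	w, mask := &b.words[i/64], uint64(1)<<(uint(i)%64)
	for {
		old := atomic.LoadUint64(w)
		if old&mask != 0 || atomic.CompareAndSwapUint64(w, old, old|mask) {
			return
		}
	}
}

// clear atomically unmarks span i.
func (b *specialsBitmap) clear(i int) {
	w, mask := &b.words[i/64], uint64(1)<<(uint(i)%64)
	for {
		old := atomic.LoadUint64(w)
		if old&mask == 0 || atomic.CompareAndSwapUint64(w, old, old&^mask) {
			return
		}
	}
}

// forEach calls fn for every set bit, in ascending index order.
// Words that are entirely zero are skipped with a single load.
func (b *specialsBitmap) forEach(fn func(i int)) {
	for wi := range b.words {
		w := atomic.LoadUint64(&b.words[wi])
		for w != 0 {
			fn(wi*64 + bits.TrailingZeros64(w))
			w &= w - 1 // drop the lowest set bit
		}
	}
}
```

The appeal of this shape is that a root-marking pass no longer needs a list of spans at all: it can scan words and only touch spans that are flagged.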
By unifying the sweep queue with `mcentral` we can also make it so that concurrent sweepers take complete ownership of the span, which makes reasoning about `(*mspan).sweep` much easier as well. Finally, since we don't need to acquire a lock, there's nothing wrong with an `mcache` taking complete ownership of a span.
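The ownership transfer can be sketched with a small lock-free buffer: whoever pops a span is its only holder until it is pushed back. This is a deliberately reduced model, not the `gcSweepBuf` or any runtime structure. `spanBuf`, its fixed capacity, and the restriction that it tolerates concurrent pushes or concurrent pops but not a mix of both at the same time are all assumptions made to keep the example short.

```go
package main

import "sync/atomic"

// span is a hypothetical stand-in for the runtime's mspan.
type span struct {
	id int
}

// spanBuf is a hypothetical fixed-capacity set of spans.
// It is safe for concurrent push calls, or for concurrent pop calls,
// but this sketch does NOT support pushes racing with pops.
type spanBuf struct {
	index uint32 // number of occupied slots
	slots []*span
}

func newSpanBuf(capacity int) *spanBuf {
	return &spanBuf{slots: make([]*span, capacity)}
}

// push reserves a unique slot with a CAS and stores s there.
// It reports false if the buffer is full.
func (b *spanBuf) push(s *span) bool {
	for {
		i := atomic.LoadUint32(&b.index)
		if int(i) >= len(b.slots) {
			return false
		}
		if atomic.CompareAndSwapUint32(&b.index, i, i+1) {
			b.slots[i] = s
			return true
		}
	}
}

// pop claims the top slot with a CAS. Exactly one caller wins each
// slot, so the returned span is exclusively owned by that caller.
func (b *spanBuf) pop() *span {
	for {
		i := atomic.LoadUint32(&b.index)
		if i == 0 {
			return nil
		}
		if atomic.CompareAndSwapUint32(&b.index, i, i-1) {
			s := b.slots[i-1]
			b.slots[i-1] = nil
			return s
		}
	}
}
```

Because a span leaves the set when it is popped, a sweeper or an `mcache` in this model never shares the span with the set it came from.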
There remains one place where multiple span ownership would still exist and that's with the page reclaimer, which will probably never be able to take ownership of a span. But that's OK, since it only ever sweeps spans which will be freed to the heap, so all the other mechanisms can just ignore spans which are picked up by the page reclaimer (identified by their `sweepgen` value).
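One way to picture "identified by their `sweepgen` value" is a generation counter claimed with a compare-and-swap. The sketch below assumes a convention in which the heap's generation advances by 2 each cycle, a span at `heapGen-2` needs sweeping, `heapGen-1` means someone is sweeping it, and `heapGen` means it is swept. The issue itself does not spell this out, and `tryAcquireForSweep` and `finishSweep` are hypothetical helper names.

```go
package main

import "sync/atomic"

// span is a hypothetical stand-in for the runtime's mspan.
type span struct {
	sweepgen uint32
}

// tryAcquireForSweep attempts to claim s for sweeping in the cycle
// identified by heapGen. Assumed convention (not stated in the issue):
//   sweepgen == heapGen-2: needs sweeping
//   sweepgen == heapGen-1: currently being swept by some owner
//   sweepgen == heapGen:   swept
// Only one caller can win the CAS; everyone else sees false and can
// simply ignore the span.
func tryAcquireForSweep(s *span, heapGen uint32) bool {
	return atomic.CompareAndSwapUint32(&s.sweepgen, heapGen-2, heapGen-1)
}

// finishSweep publishes that s has been swept for this cycle.
func finishSweep(s *span, heapGen uint32) {
	atomic.StoreUint32(&s.sweepgen, heapGen)
}
```

Under this assumption, a span already claimed by the page reclaimer fails the CAS for every other mechanism, so they skip it without needing a lock or a shared list.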
Activity
gopherbot commented on Feb 26, 2020
Change https://golang.org/cl/221179 mentions this issue: `runtime: add spanSet data structure`
gopherbot commented on Feb 26, 2020
Change https://golang.org/cl/221178 mentions this issue: `runtime: add bitmap-based markrootSpans implementation`
gopherbot commented on Feb 26, 2020
Change https://golang.org/cl/221182 mentions this issue: `runtime: add new mcentral implementation`
gopherbot commented on Feb 26, 2020
Change https://golang.org/cl/221181 mentions this issue: `runtime: implement the spanSet data structure`
gopherbot commented on Feb 26, 2020
Change https://golang.org/cl/221180 mentions this issue: `runtime: manage a pool of spanSetBlocks and free them eagerly`
gopherbot commented on Feb 26, 2020
Change https://golang.org/cl/221183 mentions this issue: `runtime: clean up old markrootSpans`
gopherbot commented on Feb 26, 2020
Change https://golang.org/cl/221184 mentions this issue: `runtime: clean up old mcentral code`