Description
Preface
Many projects written in Go have modules that use either sync.Map or a plain map protected with a mutex. Those that use sync.Map do not necessarily fit into the use cases suggested by the docs:
The Map type is optimized for two common use cases: (1) when the entry for a given key is only ever written once but read many times, as in caches that only grow, or (2) when multiple goroutines read, write, and overwrite entries for disjoint sets of keys. In these two cases, use of a Map may significantly reduce lock contention compared to a Go map paired with a separate Mutex or RWMutex.
Such projects, and even those that do conform to the above use cases, would benefit from better performance of sync.Map. This issue aims to describe a way to improve the existing sync.Map or to add a new concurrent map implementation.
Proposal
Proposed data structure may be found here: https://github.com/puzpuzpuz/xsync#map
This map uses a modified version of the Cache-Line Hash Table (CLHT) data structure. CLHT is built around the idea of organizing the hash table into cache-line-sized buckets, so that on all modern CPUs update operations complete with at most one cache-line transfer. Also, Get operations involve no writes to memory and no mutexes or any other kind of lock. xsync.Map makes some modifications to the CLHT algorithm.
I won't go into all details of the original algorithm, but if you're familiar with Java's ConcurrentHashMap, it's somewhat similar, yet organizes the table in a CPU-friendly way. On 64-bit builds the bucket layout looks like the following:
| bucket mutex | keys array            | values array          | pointer to next bucket |
| 8 bytes      | 24 bytes (3 pointers) | 24 bytes (3 pointers) | 8 bytes                 |
|<---------------------------- one cache line (64 bytes) ------------------------------>|
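For illustration only, a bucket with that layout could be declared roughly as follows in Go; the field names and types here are assumptions made for the sketch, not the actual xsync definitions:

```go
package main

import (
	"fmt"
	"sync"
	"unsafe"
)

// bucket sketches the cache-line-sized layout shown above for 64-bit builds.
type bucket struct {
	mu     sync.Mutex        // 8 bytes: per-bucket lock taken by writers only
	keys   [3]unsafe.Pointer // 24 bytes: up to 3 key pointers
	values [3]unsafe.Pointer // 24 bytes: up to 3 value pointers
	next   unsafe.Pointer    // 8 bytes: link to an overflow bucket
}

func main() {
	// Prints 64 on 64-bit builds: the whole bucket fits in one cache line.
	fmt.Println(unsafe.Sizeof(bucket{}))
}
```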
More details and considerations may be found here.
In general, I'm under the impression that the algorithm fits nicely into Go thanks to the flexible control over the memory layout of structs, as well as the presence of garbage collection.
Compatibility aspects
The most significant behavioral difference from sync.Map is that nil values are not supported. That restriction could be kept, or interface{} values pointing to a special "nil" struct could be used internally in the Map to mark nil values. Alternatively, the restriction could be put on user code, which would mean a breaking change if it's done in sync.Map.
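A minimal sketch of that sentinel approach, using sync.Map purely as a stand-in for the underlying table; the nilMarker type and the wrapper methods are hypothetical, not part of any existing API:

```go
package main

import (
	"fmt"
	"sync"
)

// nilMarker is the special "nil" struct mentioned above: it is stored in
// place of nil so that user-visible nil values survive a round trip.
type nilMarker struct{}

var nilValue interface{} = &nilMarker{}

// Map wraps an underlying table (sync.Map here, only as a stand-in for the
// proposed implementation) and translates nil values to/from the sentinel.
type Map struct {
	inner sync.Map
}

func (m *Map) Store(key, value interface{}) {
	if value == nil {
		value = nilValue
	}
	m.inner.Store(key, value)
}

func (m *Map) Load(key interface{}) (interface{}, bool) {
	v, ok := m.inner.Load(key)
	if v == nilValue {
		return nil, ok
	}
	return v, ok
}

func main() {
	var m Map
	m.Store("foo", nil)
	v, ok := m.Load("foo")
	fmt.Println(v, ok) // <nil> true
}
```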
Resize is also different since xsync.Map is grow-only, but that could be changed if necessary.
Benchmarks
The following image shows results of a benchmark run on a cloud VM (GCP, e2-highcpu-32, Intel Xeon, Haswell gen, Go 1.16.5, Ubuntu 20.04 64-bit). The scenario assumes 1M pre-warmed entries, 99% Gets, 0.5% Stores, and 0.5% Deletes, and can be considered read-heavy, which is beneficial for sync.Map.
It may be seen that sync.Map (BenchmarkMapStandard_WarmUp) doesn't scale in this scenario, while xsync.Map (BenchmarkMap_WarmUp) does. More measurements can be found here.
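For reference, a mixed-workload benchmark along those lines could be sketched as follows; the operation mix and names below are assumptions based on the scenario described above, not the cited benchmark code:

```go
package bench

import (
	"math/rand"
	"strconv"
	"sync"
	"testing"
)

const entries = 1_000_000

// BenchmarkSyncMapReadHeavy sketches the described workload against the
// standard sync.Map: 1M pre-warmed entries, then ~99% Loads, ~0.5% Stores,
// and ~0.5% Deletes issued from all goroutines in parallel.
func BenchmarkSyncMapReadHeavy(b *testing.B) {
	var m sync.Map
	for i := 0; i < entries; i++ {
		m.Store(strconv.Itoa(i), i)
	}
	b.ResetTimer()
	b.RunParallel(func(pb *testing.PB) {
		r := rand.New(rand.NewSource(rand.Int63()))
		for pb.Next() {
			key := strconv.Itoa(r.Intn(entries))
			switch op := r.Intn(1000); {
			case op < 990: // 99% reads
				m.Load(key)
			case op < 995: // 0.5% writes
				m.Store(key, op)
			default: // 0.5% deletes
				m.Delete(key)
			}
		}
	})
}
```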
If you have an idea for a better benchmark, I'm happy to run it and share the measurements. So far, I haven't found a scenario where xsync.Map would lead to a regression compared with sync.Map.
Summary
It's not necessary that xsync.Map or its algorithm be used for the next version of sync.Map; any other viable option may be considered. But having an efficient concurrent map as part of the standard Go library would certainly be beneficial for all users.
Activity
earthboundkid commented on Aug 11, 2021
If another sync.Map is added, I expect users will demand a generic version. Can this be made to work with the generics proposal? If so, maybe it could go into the proposed generics maps package as maps.Sync or some other name.
bcmills commented on Aug 11, 2021
I would prefer to improve (replace the implementation of) the existing sync.Map and/or add a generic Map type than to add a second (non-generic) sync.Map type parallel to the first.

If the proposed implementation satisfies all of the existing (documented) invariants of the sync.Map API, then I think it would be fine to swap out the implementation. I do think it's important to preserve the existing behavior for the nil key, though.

We also need to be careful that Store operations concurrent with Range calls continue to work; see #46399.

I assume that this would also address #21035 and #21032?
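(For illustration, the Store-during-Range invariant referenced above concerns code like the following, which must keep working, without deadlocks, if the implementation is swapped out:)

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	var m sync.Map
	m.Store("a", 1)
	m.Store("b", 2)

	// A Store issued from inside a Range callback must continue to work
	// with any replacement implementation.
	m.Range(func(key, value interface{}) bool {
		m.Store(key, "updated")
		return true
	})

	m.Range(func(key, value interface{}) bool {
		fmt.Println(key, value)
		return true
	})
}
```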
ianlancetaylor commented on Aug 11, 2021
Can you clarify what it means to say that nil values are not supported? I can think of a couple of possibilities. For example, perhaps you could show a small piece of code that works with the current sync.Map but would stop working under this proposal. Thanks.

puzpuzpuz commented on Aug 11, 2021
Trying to answer all questions raised so far. Please let me know if something is still unclear. Say, I could go into more details on how atomic snapshots in readers work.
I believe it's possible to swap the current sync.Map implementation with the new one without breaking the API. I left some ideas on how to approach this in the "Compatibility aspects" section above. Moreover, it would be possible to relax some things described as hints to users, like the list of use cases or the O(N) comment for the Range method.

So, yes, a replacement for sync.Map seems like the best option to me.

As for generics, I'm not familiar with the design, so I can't tell whether xsync.Map is a good match for them. But even if it's not, it doesn't mean that another algorithm (say, one similar to CHM in Java) wouldn't be a good fit.

Yes, both nested and concurrent Stores in Range should work fine.
It would certainly address #21035, since writers won't block any reader (at least in the general sense of not holding a lock/mutex; the atomic snapshot code in readers may need to spin until it succeeds), nor any writer that wants to modify another hash table bucket.

As for #21032, I'd say it should also be resolved, since readers don't need to acquire any mutex (and, in general, don't do any writes to shared memory). Because of this, readers should scale linearly with the number of cores.
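For illustration, the atomic snapshot taken by readers can be thought of roughly as follows; this is a simplified sketch of the idea, not the actual xsync code:

```go
package main

import (
	"fmt"
	"sync/atomic"
	"unsafe"
)

// snapshot reads the value pointer, checks the key, then re-reads the value
// pointer; if it changed in between, a writer raced with the reader and the
// reader spins and retries. No mutex is taken and no writes are issued.
func snapshot(keySlot, valueSlot *unsafe.Pointer, wantKey string) (interface{}, bool) {
	for {
		vp := atomic.LoadPointer(valueSlot)
		if vp == nil {
			return nil, false // empty slot
		}
		kp := atomic.LoadPointer(keySlot)
		if kp == nil || *(*string)(kp) != wantKey {
			return nil, false // a different key occupies this slot
		}
		if atomic.LoadPointer(valueSlot) == vp {
			// The value did not change while the key was checked, so the
			// observed (key, value) pair is consistent.
			return *(*interface{})(vp), true
		}
		// A concurrent writer replaced the value in between: retry.
	}
}

func main() {
	key, value := "foo", interface{}(42)
	keySlot := unsafe.Pointer(&key)
	valueSlot := unsafe.Pointer(&value)
	fmt.Println(snapshot(&keySlot, &valueSlot, "foo")) // 42 true
	fmt.Println(snapshot(&keySlot, &valueSlot, "bar")) // <nil> false
}
```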
I just mean that in my library I raise a panic when a nil value is provided. So m.Store("foo", nil) would panic if you use xsync.Map, which is not the case with sync.Map.
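For example (assuming the xsync.NewMap constructor as shown in the library's README):

```go
package main

import (
	"sync"

	"github.com/puzpuzpuz/xsync"
)

func main() {
	var std sync.Map
	std.Store("foo", nil) // works: sync.Map accepts nil values

	m := xsync.NewMap()
	m.Store("foo", nil) // panics: xsync.Map rejects nil values
}
```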
But as I mentioned above, it's possible to work around that restriction and support nil values. This would mean an allocation of an intermediate interface{} struct, but I don't think that can be considered as something impacting users. Other than that, the number of allocations in xsync.Map's main operations is the same as in sync.Map, but with a lower allocation rate.

puzpuzpuz commented on Aug 11, 2021
One more thing to mention. I haven't verified it yet, but there is a way to make scans more efficient than they are now. At the moment they do an equality check for each key in scanned buckets, but that could be improved. By storing the MSBs of the key's hash code in tagged pointers to the key/value, it would be possible to skip entries known to have a different hash code. See puzpuzpuz/xsync#2
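A rough sketch of that hash-tagging idea; the bit layout and helper names below are assumptions for illustration, not the change proposed in puzpuzpuz/xsync#2:

```go
package main

import "fmt"

// The most significant bits of the key's hash are packed into otherwise
// unused top bits of the entry pointer (an assumption that holds for current
// amd64 virtual address layouts), so a scan can reject most non-matching
// entries with a cheap integer comparison instead of a full key comparison.

const tagBits = 16

func tag(hash uint64) uint64 { return hash >> (64 - tagBits) }

// taggedPtr packs the hash tag into the top bits of the pointer value.
func taggedPtr(ptr, hash uint64) uint64 {
	return ptr | (tag(hash) << (64 - tagBits))
}

// matches reports whether a stored tagged pointer could belong to a key with
// the given hash; a false result means the keys definitely differ.
func matches(tagged, hash uint64) bool {
	return tagged>>(64-tagBits) == tag(hash)
}

func main() {
	ptr := uint64(0x0000_7f3a_1234_5678) // pretend entry address
	h := uint64(0xdead_beef_cafe_f00d)   // pretend key hash

	tp := taggedPtr(ptr, h)
	fmt.Println(matches(tp, h))                     // true: fall through to full key comparison
	fmt.Println(matches(tp, 0x0123_4567_89ab_cdef)) // false: entry can be skipped
}
```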
puzpuzpuz commented on Aug 12, 2021
If there is a consensus to update sync.Map internals with the proposed implementation, I'm happy to open a PR and start working on an integration with the runtime (the first step would be hash code and equality calculation for interface{} keys). I'll have some questions along the way, so it would be nice to have someone who could answer them.

In the meantime, I'm going to resolve both the nil values and grow-only resize limitations in my library.
puzpuzpuz commented on Aug 14, 2021
Update. Both limitations were resolved. See puzpuzpuz/xsync#6 and puzpuzpuz/xsync#11