-
-
Notifications
You must be signed in to change notification settings - Fork 5.6k
Use PersistentDict for efficient implementation of ScopedValue's #51066
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
787dda5
to
879f20a
Compare
@Tokazama could you review this? I took parts #46453 to make sure that I did not provide a divergent API. Also happy to split the persitent dict out into a separate PR, but I do need it for ScopedValues :) |
Is this thread safe? |
I agree that the order of the arguments is a bit odd but I'm not sure if the parallel to setindex! makes changing that a no go |
The underlying HAMT is not. The would require a CTrie (ConcurrentTrie), the PersistentDict by how it is implemented is thread safe.
Yeah the problem is that the API exists for Names tuple, but I think we could probably add |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implementation looks good and love the additional stuff this is building up to. I think there's a need for more general API docs on some of this but I don't know if it's necessarily important to bog down this PR since some of the details will depend on what happens moving forward with other stuff in #46453.
base/abstractdict.jl
Outdated
end | ||
|
||
""" | ||
setindex(collection, val, key) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the original docs it had setindex(collection, value, key...)
. Should this be setindex(collection, value, inds...)
for application to other collection types (tuples, arrays, etc.)? Also, is this public API now and belongs in the docs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this makes it public API IMO (I still vote for insert
instead).
If we don't want to make Base.setindex
API just jet I could also do the same as ImmutableDict and just use the constructor.
Yeah I kinda would want #46453 to happen first so that I don't have to make the API calls here xD. |
The problem with that PR is that it implements a ton of stuff that isn't currently used in Base, so there's little incentive to merge it and take on the maintenance burden. On the other hand, there are places that it could/should be used in base but isn't because the methods don't yet exist. I think doing some of it here without committing to a public API yet may be a good step forward (shows necessity of methods without prematurely committing to a public API). |
Yeah I agree with that reasoning and that's why I think going with |
IMO nothing this PR adds should be part of the public API. It's good to add the feature and use it internally in #50958 to work out the quirks before committing to anything. By all means, pay careful attention to the internal API of functions that are defined here/used by #50958, but I wouldn't put any of it in the manual except explicitly marked as internal. And, because we don't have #50105, it might even be good to note that methods and functions this adds are internal in their docstrings. |
Note for triage: Two decisions need to be made.
|
Triage: |
IIRC triage was against adding any new public API with this PR. (Because this PR already has a lot going on, not because we were opposed to this functionality eventually finding its way into the API) Edit to add public |
That's not what I got. I understood it to mean we do our best here with an internal and well documented API that might change and be finalized into a public API at some unknown date. |
I was imprecise in my previous comment. I agree with @Tokazama's recounting and believe we are in agreement. I have edited my previous comment to try to clarify. |
a0db65c
to
64d75e6
Compare
64d75e6
to
08d9224
Compare
In #50958 I used an inlined immutable dictonary to implement
ScopedValue
.That implementation performs very well for a small number of scoped values,
or a shallow nesting of dynamical scopes.
Crucially the lookup time for a scoped value grows
O(n)
with the numberdynamic scopes. Here I propose the usage of a PersistentDict based on
a hash array mapped trie (HAMT). HAMT has lookup
O(log(32, n))
.A few numbers for the old implementation gather from ScopedValues.jl
["DEPTH", "depth=1"]
["DEPTH", "depth=2"]
["DEPTH", "depth=4"]
["DEPTH", "depth=8"]
["DEPTH", "depth=16"]
["DEPTH", "depth=32"]
["DEPTH", "depth=64"]
["DEPTH", "depth=256"]
["DEPTH", "depth=512"]
["DEPTH", "depth=1024"]
["DEPTH", "depth=2048"]
["DEPTH", "depth=4096"]
With a PersistentDict:
["DEPTH", "depth=1"]
["DEPTH", "depth=2"]
["DEPTH", "depth=4"]
["DEPTH", "depth=8"]
["DEPTH", "depth=16"]
["DEPTH", "depth=32"]
["DEPTH", "depth=64"]
["DEPTH", "depth=128"]
["DEPTH", "depth=256"]
["DEPTH", "depth=512"]
["DEPTH", "depth=1024"]
["DEPTH", "depth=2048"]
["DEPTH", "depth=4096"]
As we can see access time is now predictable (similar in cost to task local storage),
and no longer running the risk that abundant usage of scoped values in libraries,
having negative effects through spooky action at a distance.
Now the tradeoff is that dynamic scope entry becomes more expensive,
from ~80ns to ~192ns and memory usage going from 112 bytes to 368 bytes.
So why the complication of using a HAMT and not just copying a dictonary at entry.
HAMT have the benefit of being able to structurally share memory.
Using the persitent operation
insert
in https://github.com/vchuravy/HashArrayMappedTries.jl we can seethe overhead cost.
["Base.Dict", "insert, size=0"]
["Base.Dict", "insert, size=1"]
["Base.Dict", "insert, size=2"]
["Base.Dict", "insert, size=4"]
["Base.Dict", "insert, size=8"]
["Base.Dict", "insert, size=16"]
["Base.Dict", "insert, size=32"]
["Base.Dict", "insert, size=64"]
["Base.Dict", "insert, size=128"]
["Base.Dict", "insert, size=256"]
["Base.Dict", "insert, size=512"]
["Base.Dict", "insert, size=1024"]
["Base.Dict", "insert, size=2048"]
["Base.Dict", "insert, size=4096"]
["Base.Dict", "insert, size=8192"]
["Base.Dict", "insert, size=16384"]
["HAMT", "insert, size=0"]
["HAMT", "insert, size=1"]
["HAMT", "insert, size=2"]
["HAMT", "insert, size=4"]
["HAMT", "insert, size=8"]
["HAMT", "insert, size=16"]
["HAMT", "insert, size=32"]
["HAMT", "insert, size=64"]
["HAMT", "insert, size=128"]
["HAMT", "insert, size=256"]
["HAMT", "insert, size=512"]
["HAMT", "insert, size=1024"]
["HAMT", "insert, size=2048"]
["HAMT", "insert, size=4096"]
["HAMT", "insert, size=8192"]
["HAMT", "insert, size=16384"]
For
Dict
we mustcopy
the dictionary to have the persitency requirement we need for scoped values.This would also be an in-tree user of #46453.
Kudos to @gbaraldi for bouncing a lot of ideas over the last two weeks.