Skip to content

Move ValueSortedDict into SortedContainers (was: Support for SortedBidict) #57

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jab opened this issue Dec 7, 2016 · 8 comments
Closed

Comments

@jab
Copy link

jab commented Dec 7, 2016

Hi Grant,

I'm the author of bidict. I just came across SortedContainers and it looks excellent. Kudos on the great work! I'm delighted to have found another high-quality Python collections library, and look forward to studying it further.

In the meantime, thanks to finding this project, I've already made a tiny change to the development version of bidict that I'm excited about: now users can make a SortedBidict type (backed by your SortedDict) in just a few lines of code:

>>> import bidict, sortedcontainers
>>>
>>> class sortedbidict(bidict.bidict):
...     _dict_class = sortedcontainers.SortedDict
>>>
>>> b = sortedbidict({'Tokyo': 'Japan', 'Cairo': 'Egypt', 'Lima': 'Peru'})
>>>
>>> list(b.items())  # b stays sorted by its keys
[('Cairo', 'Egypt'), ('Lima', 'Peru'), ('Tokyo', 'Japan')]
>>>
>>> list(b.inv.items())  # b.inv stays sorted by *its* keys (b's values!)
[('Egypt', 'Cairo'), ('Japan', 'Tokyo'), ('Peru', 'Lima')]
>>>
>>> b.index('Lima')  # passes through to SortedDict's implementation
1
>>> b.iloc[-1]
'Tokyo'
>>> b.peekitem(index=-1)
('Tokyo', 'Japan')
>>> b.inv.peekitem(index=-1)
('Peru', 'Lima')
>>> b.popitem(last=False)
('Cairo', 'Egypt')
>>> list(b.items())
[('Lima', 'Peru'), ('Tokyo', 'Japan')]
>>> list(b.inv.items())
[('Japan', 'Tokyo'), ('Peru', 'Lima')]

Documented more here.

Always interested to hear more ideas for fruitful interop and in collaboration in general. In case of interest!

Thanks for reading and best wishes.

Josh

P.S. Hope you don't mind my using your tracker for this. Please of course feel free to close / migrate to anywhere else you prefer. bidict has a (mostly inactive) channel at https://gitter.im/jab/bidict in case you'd ever like to chat there.

@grantjenks
Copy link
Owner

Hi jab! Issue tracker is fine. Interesting use case. Thanks for implementing support and writing up the recipe. I'm impressed the extra methods pass through to SortedDict e.g. index and iloc. Does it work for the KeysView too?

@grantjenks grantjenks changed the title hi from the author of bidict Support for SortedBidict Dec 7, 2016
@jab
Copy link
Author

jab commented Dec 7, 2016

Thanks for implementing support and writing up the recipe.

My pleasure, thanks for your interest and the quick reply!

I'm impressed the extra methods pass through to SortedDict e.g. index and iloc.

Oh, that's because I actually added fallback attribute proxying specially with this change (see the __getattribute__ method override here). Thought that might be a nice complement to the main _dict_class change, curious what you think.

Does it work for the KeysView too?

You can call bidict.keys() (bidict.viewkeys() on Python 2.7) and get back a collections.abc.KeysView, but that's been the case since before this change, and is not because of the new __getattribute__ override, but because bidict.BidirectionalMapping inherits from collections.abc.Mapping, and so bidict.keys resolves to Mapping's keys() implementation before it has a chance to fall back to SortedDict's in the new __getattribute__ logic.

So here's what it looks like:

>>> sortedbidict().keys()
KeysView(sortedbidict())
>>> sortedbidict().keys().__class__
collections.abc.KeysView
>>> frozenbidict().keys()
KeysView(frozenbidict())
>>> orderedbidict().keys()
KeysView(orderedbidict())
>>> # etc.

My biggest concern with this _dict_class idea is that it might make users take for granted that they should be able to extend bidict with extra functionality with such simple compositions, but the reality is that things aren't always this easy. For example, what if a user wants a SortedBidict whose inverse's items have the same order as it?

>>> elementByAtomicNumber = SortedBidict({1: 'hydrogen', 2: 'helium'})
>>> list(elementByAtomicNumber.items())
[(1, 'hydrogen'), (2, 'helium')]
>>> list(elementByAtomicNumber.inv.items())  # should remain sorted by atomic number, but isn't
[('helium', 2), ('hydrogen', 1)]

If SortedDict supported sorting by value instead of by key, or even if it allowed specifying a key function that gets passed both the key and value of each item, then maybe the minimal _dict_class change could still enable this kind of extension simply (though this would mean SortedDict's key function argument would no longer work like the one that Python's built-in sorted function takes). I'm not sure how much sense it would make for SortedDict to add that change, but even if it did, I'm still concerned about over-promising extensibility with a change that has not been designed for any actual users' real use cases. Another data point is that bidict.orderedbidict originally made use of a similar mechanism to enable maintaining insertion order (i.e. it just overrode the internal dict class to be collections.OrderedDict instead of dict, and so was a super simple change), but this resulted in insertion order not being preserved when overwriting items in some cases. And because OrderedDict doesn't expose its backing linked list, to fix this I had to essentially reimplement OrderedDict by hand. So, another case where this simple _dict_class change fell short of enabling insanely easy extensibility.

Would love to know what you think about all this, and whether anything else occurs to you about how disparate collection implementations like ours could interoperate better and be designed to compose more effectively.

@grantjenks
Copy link
Owner

Take a look at Sorted Collections: http://www.grantjenks.com/docs/sortedcollections/ that's kind of a child project of Sorted Containers. I like to collect recipes from Issues there. The ValueSortedDict and ItemSortedDict recipes are in that repo too.

@jab
Copy link
Author

jab commented Dec 8, 2016

Thanks, that's super helpful! After replacing _dict_class with two separate _fwd_class and _inv_class attributes, look what I can do with _inv_class = ValueSortedDict (the sortedbidict2 example):

>>> import bidict, sortedcontainers
>>>
>>> # a sorted bidict whose forward items stay sorted by their keys,
>>> # and whose inverse items stay sorted by *their* keys (i.e. it and
>>> # its inverse iterate over their items in different orders):
>>>
>>> class sortedbidict1(bidict.OrderedMapping, bidict.bidict):
...     _fwd_class = sortedcontainers.SortedDict
...     _inv_class = sortedcontainers.SortedDict
>>>
>>> b = sortedbidict1({'Tokyo': 'Japan', 'Cairo': 'Egypt', 'Lima': 'Peru'})
>>>
>>> list(b.items())  # b stays sorted by its keys
[('Cairo', 'Egypt'), ('Lima', 'Peru'), ('Tokyo', 'Japan')]
>>>
>>> list(b.inv.items())  # b.inv stays sorted by *its* keys (b's values!)
[('Egypt', 'Cairo'), ('Japan', 'Tokyo'), ('Peru', 'Lima')]
>>>
>>> # a sorted bidict whose forward items stay sorted by their keys,
>>> # and whose inverse items stay sorted by their values (i.e. it and
>>> # its inverse iterate over their items in the same order):
>>>
>>> import sortedcollections
>>>
>>> class sortedbidict2(bidict.OrderedMapping, bidict.bidict):
...     _fwd_class = sortedcontainers.SortedDict
...     _inv_class = sortedcollections.ValueSortedDict
>>>
>>> elemByAtomicNum = sortedbidict2({1: 'hydrogen', 2: 'helium', 3: 'lithium'})
>>>
>>> # forward items sorted by key:
>>> list(elemByAtomicNum.items())
[(1, 'hydrogen'), (2, 'helium'), (3, 'lithium')]
>>>
>>> # inverse items sorted by value:
>>> list(elemByAtomicNum.inv.items())
[('hydrogen', 1), ('helium', 2), ('lithium', 3)]
>>>
>>> elemByAtomicNum.update({4: 'beryllium'})
>>> list(elemByAtomicNum.inv.items())
[('hydrogen', 1), ('helium', 2), ('lithium', 3), ('beryllium', 4)]
>>>
>>> # order is preserved correctly when passing .inv back into constructor:
>>> atomicNumByElem = sortedbidict2(elemByAtomicNum.inv)
>>> list(atomicNumByElem.items()) == list(elemByAtomicNum.inv.items())
True
>>> # copies of .inv preserve order correctly too:
>>> list(elemByAtomicNum.inv.copy().items()) == list(atomicNumByElem.items())
True
>>>
>>> # attrs not defined by bidict are passed through to the _fwd SortedDict:
>>> elemByAtomicNum.peekitem(index=0)
(1, 'hydrogen')
>>> elemByAtomicNum.popitem(last=False)
(1, 'hydrogen')
>>> elemByAtomicNum.inv.popitem(last=True)
('beryllium', 4)
>>>
>>> # bidict provides an OrderedMapping ABC for interoperability.
>>> # Its __eq__ method is order-sensitive when comparing against another
>>> # OrderedMapping. If we want our sortedbidicts to have order-sensitive
>>> # comparison, we can inherit from OrderedMapping before inheriting
>>> # from bidict, so OrderedMapping's __eq__ comes first in the MRO.
>>> # Since this is what we did above, the following works:
>>> elemByAtomicNum == bidict.orderedbidict([(2, 'helium'), (3, 'lithium')])
True
>>> elemByAtomicNum != bidict.orderedbidict([(3, 'lithium'), (2, 'helium')])
True
>>> # comparing against an unordered Mapping is order-insensitive:
>>> elemByAtomicNum == {3: 'lithium', 2: 'helium'}
True
>>> # Since bidict.OrderedMapping implements __subclasshook__, classes like
>>> # SortedDict and OrderedDict are detected as subclasses automatically,
>>> # and so the following works:
>>> issubclass(sortedcontainers.SortedDict, bidict.OrderedMapping)
True
>>> import collections
>>> issubclass(collections.OrderedDict, bidict.OrderedMapping)
True
>>> elemByAtomicNum == collections.OrderedDict([(2, 'helium'), (3, 'lithium')])
True
>>> elemByAtomicNum != collections.OrderedDict([(3, 'lithium'), (2, 'helium')])
True
>>> # If we wanted our sortedbidicts to be automatically picked up as
>>> # subclasses of Python 3.6's collections.abc.Reversible, we could
>>> # add a __reversed__ class attribute that just proxies the method
>>> # call to the _fwd SortedDict:
>>> class sortedbidict3(bidict.OrderedMapping, bidict.bidict):
...     _fwd_class = _inv_class = sortedcontainers.SortedDict
...     __reversed__ = lambda self: reversed(self._fwd)
>>>
>>> # Python 3.6:
>>> issubclass(sortedbidict3, collections.Reversible)  
True
>>> # bidict also provides an OrderedBidirectionalMapping ABC with a
>>> # __subclasshook__, so with the __reversed__ method implying ordering,
>>> # the following works automatically too:
>>> issubclass(sortedbidict3, bidict.OrderedBidirectionalMapping)
True

(Implemented in jab/bidict@2ce0153, and now documented at https://bidict.readthedocs.io/en/dev/other-bidict-types.html#extending-bidict as well.)

It was a little tricky to get all the corner cases right, but I think it's shaping up. Would love a sanity check though. What do you think?

@grantjenks
Copy link
Owner

Looks good to me. You might want to mention sortedcollections alongside sortedcontainers in the docs.

This is also motivating to consider including ValueSortedDict within SortedContainers, possibly with a faster implementation.

@jab
Copy link
Author

jab commented Dec 9, 2016

Thanks for taking a look! I added a mention of sortedcollections alongside sortedcontainers, as you suggested. Now live at https://bidict.readthedocs.io/en/dev/other-bidict-types.html#extending-bidict. I also made some more changes to the examples, and updated the comment above with the latest. jab/bidict@2ce0153 is the current latest revision.

Including ValueSortedDict right in SortedContainers would be nice. It'd make this recipe that much shorter to boot :)

@grantjenks grantjenks changed the title Support for SortedBidict Move ValueSortedDict into SortedContainers (was: Support for SortedBidict) Feb 3, 2017
@laserson
Copy link

laserson commented Sep 3, 2017

+1 on adding ValueSortedDict to the "core" library

@grantjenks
Copy link
Owner

I think the decision has been made (perhaps through neglect) to leave ValueSortedDict in sortedcollections.

There is a separate issue to mention that library in the docs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants