Skip to content
This repository was archived by the owner on Oct 4, 2020. It is now read-only.

kill maps dependency #46

Open
matthewleon opened this issue Jan 16, 2018 · 2 comments
Open

kill maps dependency #46

matthewleon opened this issue Jan 16, 2018 · 2 comments

Comments

@matthewleon
Copy link

matthewleon commented Jan 16, 2018

This is a proposal to eliminate the dependency on purescript-maps by making the 2-3 tree internal.

First, the disadvantages : 1. Some code duplication between sets and maps. 2. Any general optimizations to the 2-3 tree implementation might need to be reimplemented here.

The advantages, which I think make this worthwhile:

  1. One fewer dependency (okay, this isn't really such a significant advantage)

  2. Increased efficiency of operations across the board: Since a Set a is currently a Map a Unit, there is a fair bit of redundant setting of values to unit. Furthermore, some basic operations have unnecessarily inefficient behavior. insert, for example, is written to replace values when they already exist in the Map. When this code is reused for Set, it causes a redundant kick-up phase when an element that already exists in the Set is inserted again.

  3. Significantly for some of the functions I'm working on now, this allows us to use internal functions on 2-3 trees that increase efficiency of some operations. Take, for example, powerSet as written by Edward Kmett for Haskell's containers: https://github.com/haskell/containers/blob/f37a6991ae095dabb4eb4bd3fbdc14eb609b0f41/Data/Set/Internal.hs#L1687. It relies on an internal function, glue: https://github.com/haskell/containers/blob/f37a6991ae095dabb4eb4bd3fbdc14eb609b0f41/Data/Set/Internal.hs#L1456. As things stand currently, to write something like this for purescript-sets, it would need to be exposed by purescript-maps.

@matthewleon
Copy link
Author

As it turns out, the "one fewer dependency" argument actually does have something going for it, as it would make it easier to do this: purescript/purescript-arrays#91

@hdgarrood
Copy link
Contributor

This sounds sensible, but imo it's not obviously the right thing to do, and I'm not really willing to take on the extra maintenance burden of doing this right now. I haven't benchmarked but my gut feeling is that the performance hit from having a unit together with each key is not that serious. Of course the other optimisations you mention would be nice.

I'd be willing to revisit this after 0.12, and once it seems clear that maps has gotten to a stage where there are not really any more general optimisations to the 2-3 tree which both libraries would benefit from. I'd also like to investigate whether it would make sense to have both maps and sets depend on a library providing a basic 2-3 tree type.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants