Skip to content
This repository was archived by the owner on Nov 27, 2023. It is now read-only.

don't prefer Object over Map #33

Closed
mfulton26 opened this issue Dec 22, 2021 · 21 comments
Closed

don't prefer Object over Map #33

mfulton26 opened this issue Dec 22, 2021 · 21 comments

Comments

@mfulton26
Copy link

mfulton26 commented Dec 22, 2021

I suggest renaming groupBy and groupByToMap to groupByToObject and groupBy respectively or dropping support for an object altogether.

  1. there are various reasons to prefer Map over Object

    reasons from MDN
    Map Object
    Accidental Keys A Map does not contain any keys by default. It only contains what is explicitly put into it.

    An Object has a prototype, so it contains default keys that could collide with your own keys if you're not careful.

    Note: As of ES5, this can be bypassed by using {{jsxref("Object.create", "Object.create(null)")}}, but this is seldom done.

    Key Types A Map's keys can be any value (including functions, objects, or any primitive). The keys of an Object must be either a {{jsxref("String")}} or a {{jsxref("Symbol")}}.
    Key Order

    The keys in Map are ordered in a simple, straightforward way: A Map object iterates entries, keys, and values in the order of entry insertion.

    Although the keys of an ordinary Object are ordered now, this was not always the case, and the order is complex. As a result, it's best not to rely on property order.

    The order was first defined for own properties only in ECMAScript 2015; ECMAScript 2020 defines order for inherited properties as well. See the OrdinaryOwnPropertyKeys and EnumerateObjectProperties abstract specification operations. But note that no single mechanism iterates all of an object's properties; the various mechanisms each include different subsets of properties. ({{jsxref("Statements/for...in", "for-in")}} includes only enumerable string-keyed properties; {{jsxref("Object.keys")}} includes only own, enumerable, string-keyed properties; {{jsxref("Object.getOwnPropertyNames")}} includes own, string-keyed properties even if non-enumerable; {{jsxref("Object.getOwnPropertySymbols")}} does the same for just Symbol-keyed properties, etc.)

    Size

    The number of items in a Map is easily retrieved from its {{jsxref("Map.prototype.size", "size")}} property. The number of items in an Object must be determined manually.
    Iteration A Map is an iterable, so it can be directly iterated.

    Object does not implement an iteration protocol, and so objects are not directly iterable using the JavaScript for...of statement (by default).

    Note:

    • An object can implement the iteration protocol, or you can get an iterable for an object using Object.keys or Object.entries.
    • The for...in statement allows you to iterate over the enumerable properties of an object.
    Performance

    Performs better in scenarios involving frequent additions and removals of key-value pairs.

    Not optimized for frequent additions and removals of key-value pairs.

  2. destructuring and transformation for JSON stringification can be accomplished from a Map easily enough using Object.fromEntries:

    const { even, odd } = Object.fromEntries(
      array.groupBy((num, index, array) => {
        return num % 2 === 0 ? "even" : "odd";
      })
    );
  3. groupBy (to a null-prototype object) may set a precedence for future features to avoid defaulting to more proper keyed collections like Map and Set and use objects and arrays instead; these have been used historically and continue to be used prevalently (from what I can tell) even when there are now these more optimized/appropriate collections available

in short, making it easy for developers to use a Map seems like a win to me and paves the way for more rich data structures and utilities using them and for those use cases where an Object is wanted (destructuring and JSON stringification) then there is a built-in way to quickly, efficiently, and (IMO) ergonomically transform the returned Map to an Object. If further convenience/ergonomics is desired then a groupByToObject method could still be provided while the shorter go-to method groupBy would return a Map as the "preferred" type to work with

@ljharb
Copy link
Member

ljharb commented Dec 22, 2021

I can’t speak for anyone else ofc, but imo destructuring will be wanted there vast majority of the time, and i do not prefer working with a Map unless i have object keys, which for me is a rarity - objects continue to be the “proper” keyed collections for most of my use cases.

@bakkot
Copy link
Contributor

bakkot commented Dec 22, 2021

See earlier discussion in #3.

Note that most of the reasons listed in that MDN article to prefer Maps over objects aren't relevant here: we're producing a null-prototype object, so there's no accidental keys; performance is very likely to be better for objects unless you're mutating the result a bunch, which is a weird thing to do; etc.

And conversely, there are strong reasons to prefer objects (when they are suitable, i.e. you need only string and symbol keys), which include a.) they're more convenient, since you can access properties directly, b.) the rest of the language works with them, as in JSON.stringify and destructuring, and c.) this matches the ecosystem precedent.

I think there is very good reason for both to exist. And I think it makes sense for the shorter groupBy one to be the one which is more convenient to work with, which also happens to match every other common implementation of this method in libraries like Lodash. If you need the more powerful but less convenient-to-use version, you can reach for the slightly longer name.

@mfulton26
Copy link
Author

Wow. Okay. I see that a lot of this has already been discussed.

I believe Lodash and the like introduced JavaScript developers to a groupBy before Map was even a thing (or destructuring for that matter).

I do think that a Map is a better tool for most jobs (https://stackoverflow.com/a/49164774/3255152 lists more reasons why). As such, I still feel like a single groupBy returning a Map would be more than sufficient as Object.fromEntries, new WeakMap, and other constructors/factories that take an iterable could easily transform the resulting map as desired.

I am not aware of data that says that the vast majority of use cases for groupBy will be followed by destructuring or followed by direct access to known keys. There are those use cases but there are also many other use cases such as transforming the groups, reducing them to other values, etc.

e.g.

// using a Map
Array.from(array.groupBy(), ([k, v]) => );

// using an Object
Array.from(Object.entries(array.groupBy()), ([k, v]) => ));

I still think using a Map is more general purpose and should be preferred but that is my 2 cents among many others'.

Thank you.

@bakkot
Copy link
Contributor

bakkot commented Dec 22, 2021

A Map is more general purpose, but it's also less convenient to use in common cases.

And I think it makes sense to have the more convenient thing be made available under the more convenient name, and to make the more powerful but less convenient thing available under the slightly less convenient name.

(Note that the stackoverflow question you linked is about why one might prefer a Map over an object as a dictionary. But if you know the group names up front - as is often, but not always, the case with groupBy - then you're not using the result as a dictionary. And it's pretty much the same list as MDN except with the addition of "supports millions of objects", which is not really relevant here.)

@mfulton26
Copy link
Author

I think my remaining concern is this: what data is there that says that an object is a more common use case?

due to the fact that Lodash, a very widely adopted library, uses object doesn't mean if it were to use Map instead that it wouldn't still be widely adopted

I personally find myself using groupBy's result with unknown key sets that I need to iterate through than I do with known key sets

from what I can tell, by making groupBy return an object all other use cases are inconvenienced while if it were to return a Map then only destructuring and JSON stringification are slightly inconvenienced (wrapping the groupBy call in Object.fromEntries is still very ergonomically IMO and isn't much different from need an array from Map or Set keys/values instead of an iterator)

again, my 2 cents here among many others; I appreciate the (re)consideration but I suppose if the decision has already been made and there is no data we can look at to say that one way is conclusively "better" than the other than we're left to move forward with what the authors/champions think is best; either way I'm excited to see groupBy* added natively to JavaScript whether using a Map requires using a longer name or not; thanks!

@bakkot
Copy link
Contributor

bakkot commented Dec 23, 2021

I didn't claim use cases where an object suffices would be more common, just that they would be common. Which I think is plainly true and does not require data.

@mfulton26
Copy link
Author

I may have misunderstood this then:

A Map is more general purpose, but it's also less convenient to use in common cases.

I think you make a good point that it will be common to use groupBy and then destructure, etc. My concern is that it will also be common to use groupBy and iterate over the groups where the keys are not yet all known and so a Map is more suitable.

As such, both use cases may be common and I don't think we have any data that says one will be more common.

So in my mind it seems that as a Map is more general purpose and can easily be used in the common use cases identified then it might be prudent to actually make groupBy return a Map and there might not even be a need another built-in method but such could be added as a groupByToObject if passing the returned Map from groupBy to Object.fromEntries isn't sufficiently ergonomically.

i.e. I propose that given the lack of data for which use cases will be most common that whichever is most versatile be made the go-to (shorter name) and if that is the criteria then I believe groupBy returning a Map is the winner as it is an iterable and an iterable can be passed directly into Object.fromEntries, Array.from, new WeakMap, etc., etc. while Object is less versatile itself as it first needs to be passed into Object.entries/Object.keys/Object.values before it can be passed into other functions that receive an iterable.

@mfulton26
Copy link
Author

mfulton26 commented Dec 23, 2021

As an aside: I really appreciate all the conversation here.

I do not think it is an easy call as to whether to have one or two groupBy methods and which to name which, etc.

I appreciate the willingness to discuss, share, and discover (perhaps even rediscover in some cases) the advantages and disadvantages.

Regardless of one or two methods and their names I am excited to see this come about as I think it will be very useful!

@bakkot
Copy link
Contributor

bakkot commented Dec 23, 2021

My point is that working with a Map is usually less convenient than working with an object: you have to call .has and .get rather than using in and x[field], you can't destructure or pass it to methods like JSON.stringify, etc. (Iterating the entries is the sole exception to this general rule.) Therefore there is almost always going to be a certain amount of awkwardness anyway when you need to work with a Map. On the other hand, when an object is sufficient, there is not necessarily any inherent awkwardness.

We have to make one of the two cases slightly more awkward to use, by giving it the non-default name. My preference is to make the Map case, which is already inherently somewhat awkward, be slightly more so: this is a small cost, because the relative increase in awkwardness is low. By contrast, if we make the object case, which is not already inherently awkward, be slightly more awkward, that is a large cost, because the relative increase in awkwardness is larger (even though the total increase is the same).

This is what I meant above by "I think it makes sense to have the more convenient thing be made available under the more convenient name, and to make the more powerful but less convenient thing available under the slightly less convenient name."

To put it another way: there are never going to be very short idioms possible in the Map case because of the inherent inconvenience of using Maps, so making these use cases require a wordier method name doesn't cost us much. On the other hand, there are short idioms possible in the object case, as in let { even, odd } = arr.groupBy(x => x % 2 === 0 ? 'even' : 'odd'), so making these use cases require a wordier method name does cost us: it makes these idioms more annoying.

@mfulton26
Copy link
Author

mfulton26 commented Dec 24, 2021

My point is that the decision on which use cases to make more convenient should be based on empirical data.

To my knowledge we don't collectively have empirical data that says that destructuring and/or JSON stringification is a more common use case than iterating over the groups.

I often find myself doing both, it simply depends on the use case and I often want to destructure and I often want to iterate on or transform somehow the group keys and/or values before further use.

As such, what I am proposing is that the more versatile / general purpose / adaptable return type be given the shorter name if one is to have a shorter name.

On that note I suppose a different idea is to not prefer one over the other at all but to name the methods groupByToObject and groupByToMap.

I think there should be a good reason to give one the shorter name over the other and from what I can tell we can't know which will be used more often and as such I am suggesting that we shouldn't make one more preferred than the other unless there is empirical data to say that it should be so.

@mfulton26 mfulton26 changed the title prefer Map over Object don't prefer Object over Map Dec 24, 2021
@bathos
Copy link

bathos commented Dec 24, 2021

should be based on empirical data

Perhaps some data could be found by analyzing lodash groupBy usage? Even though it's always objects there AFAIK, comparing the frequency of use with fixed set of keys with the frequency of use with arbitrary/calculated keys seems like a passable proxy for whether object (in the first case) or map (in the second) is more likely to be the natural choice.

Unfortunately, determining what's fixed vs arbitary doesn't seem friendly to automation, so it may only be realistic with a sample small enough for by-hand classification.

@ljharb
Copy link
Member

ljharb commented Dec 24, 2021

If you can figure out how many times new Map( appears in JS on github, I would be surprised if that compared to even the number of Object.keys( usages, let alone all the usages of objects combined together.

Github Search is quite inadequate for this task, so if someone wanted to present such data, that'd be interesting to look at.

@bakkot
Copy link
Contributor

bakkot commented Dec 24, 2021

Empirical data can't tell us about all possible future usage; it's really not all that helpful. For example, most kitchen-sink libraries which include groupBy also include partition, which we are choosing not to do because it's easy to do the equivalent operation with groupBy, and that way readers don't have to remember whether partition returns true or false first - but this means that current usage will miss a lot of what groupBy can be used for. (Though I note some people are already using it as a replacement for partition.)

Without knowing the future, we have to rely on our judgement and on other reasoning. And in my judgement it makes sense for the key-value collection which has explicit support in the syntax of the language to be the preferred return type here. I think this is a very good reason to prefer one over the other.

@bathos
Copy link

bathos commented Dec 24, 2021

Github Search is quite inadequate for this task

Yeah, I didn't know whether there was a realistic way to collect that data and unfortunately there may not be. It was a "here's a thing one could theoretically measure" suggestion for @mfulton26 because "empirical data" about things like "frequency of serializing result to JSON" seemingly couldn't say anything: it doesn't compare to anything else. To have any chance of being useful input, data would need to groupBy(whetherObjectOrMapIsLikeyBetterSuited)).

@bakkot I think that if someone is able to collect data about something like this, that's sort of interesting in its own right and potentially useful whether or not it turns out to have a bearing on this specific decision, so would not want to discourage it. FWIW, I agree with your conclusions ("this is almost always for destructuring" matches my experience).

@bakkot
Copy link
Contributor

bakkot commented Dec 24, 2021

Yeah, I don't mean to discourage doing this kind of research, just relying on it.

Re: GitHub Search - SourceGraph works ok for this sort of query, in my experience.

@mfulton26
Copy link
Author

I doubt much research would really help as neat as it might be to do. Without Lodash having unbiasly named groupBy functions I think the results works overwhelmingly be in favor of object and not Map.

@mfulton26
Copy link
Author

mfulton26 commented Dec 25, 2021

I appreciate the discussion here and it seems that my concerns and opinions are not shared and that is okay. Is there anyone else's comments we should wait for before closing this out? @jridgewell perhaps?

@jridgewell
Copy link
Member

I strongly prefer the object version, and imagine objects (and destructuring) will be the expected output for regular users. As cool as Maps are, the JS ecosystem prefers objects for just about everything.

@mfulton26
Copy link
Author

mfulton26 commented Dec 26, 2021

Thank you everyone. One last thought that I didn't think of before (and perhaps it has already been discussed somewhere too) but instead of a null-prototype would it be worth exploring adding a prototype that doesn't extend Object's prototype and has function named asMap/toMap? Maybe there are other issues with that that aren't currently coming to mind for me:

array.groupBy().asMap();

this wouldn't disrupt destructing or JSON serialization and I think it more ergonomic than a separate method

thoughts?

@ljharb
Copy link
Member

ljharb commented Dec 26, 2021

If it was an object with toObject and toMap, you’ve made the convenient case less so. If it’s an object with toMap on it, you’ve broken anyone who wants a key of “toMap”.

@bathos
Copy link

bathos commented Dec 27, 2021

In addition to the issue ljharb described, that approach would mean Map keys need to be cast to property keys as well. That’s a problem not just because of “extra work” but because ToPropertyKey(x) would throw for some map keys (including, ironically, objects returned by this method).

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants