-
Notifications
You must be signed in to change notification settings - Fork 72
Remove operations of Iterator
#7
Conversation
I am skeptical about this one. I think if the contract of Iterator is understood, it's quite useful. Besides, this probably falls under the category of "too much change". How would you migrate a program that uses full collection ops on iterators? |
I believe an interesting possibility would be to add an immutable Iterator. Instead of |
Another thought: How about we extend this notation of linearity to mutable collections? @lihaoyi has once remarked that he finds most operations on mutable collections useless whereas the variants that are useful are often missing, If I remember correctly, he mentioned operations like map and filter. I think the default should still be that these operations return new collections, because that's the cleanest semantics and for small collections performance does not matter. Remember, a good guideline is that most collections are small. But what about adding a subclass of linear collections that do have mutation behavior?. There should be an operator - call it for now
and the result would be the collection We could probably achieve that by adding the following to
and defining the necessary operations in trait WDYT? |
My main complaint with mutable collections was that the immutable operations are generally not useful. This is an empirical rather than a theoretical argument, but if I have a The "aggregation" operations like Apart from that @odersky is right that I believe commonly needed operations are not provided. For example, Having a set of in-place operations on the mutable collections that mirror the copying-transformation operations on the immutable collections would be something I could get behind. For example, Java 8 now has operations on such as replaceAll(UnaryOperator<E> operator)
Replaces each element of this list with the result of applying the operator to that element.
removeIf(Predicate<? super E> filter)
Removes all of the elements of this collection that satisfy the given predicate. on If we do this, I believe that the symmetry between e.g. |
My main problem with this approach is that it further increases the already large number of different names for collection operations. That's why I think in the end
is easier to remember than
or something like that. (I now think that
Yes,
where |
This also means that linear collections would have two
The other is the usual |
I’m not in favor of removing operations from mutable collections. I just wanted to make I agree that operations on mutable collections should modify the collections in place. I think that this should even be the default behavior, but I understand that for backward compatibility we will have to put these operations under a different namespace (I like |
Also, I would be happy to deprecate the operations that return a new collection on mutable collections. I think the ones that mutate the collection in place should be preferred. By the way, note that in the current design, mutable collections do not have the |
The problem with that is that there is a lot of code out there that uses things like:
I know of two Scala books and intro courses that promote this style. And why not? As long as the arrays are not humongous this is perfectly reasonable. |
Since we have seen several discussions with interesting/difficult tradeoffs, I have tried to summarize my philosophy that guided the original Scala collections design. Here are some rules, which I think are still valid:
The secondary goal should be followed only as long as it does not conflict with the primary goals. Also a goal now (somewhat orthogonal to the others) is to follow the principle of least power. Pick the simplest interface that can be made to fit the requirements. It's here that I believe we can improve on the current collections design. |
Regarding the changes in this PR: I'm in favor of keeping the I agree with the general goals above. When it comes to in-place semantics, goal 2 pretty much mandates that shared operations are immutable and not in-place. And in my opinion that makes perfect sense. I'll have to disagree with @lihaoyi here. In my experience, in many cases you only "consume" a collection that you get from somewhere and you do not care how it was built (i.e. if it is really mutable or immutable). If you request a generic collection type (not mutable or immutable) there is an unwritten contract that the collection will not change behind the scenes while you use it (along with a guarantee that you only use it for an "obvious" duration, usually until the method returns). Any method that you call on such a generic collection should not break this contract in surprising ways. This requires that you treat all collections as de-facto immutable for these operations. So where do we put the mutating methods? I think that If you want to move the mutating methods into a separate linear view trait it must not inherit from any collection type. I think I would prefer methods with different names on the mutable collection traits though ( |
I think this cuts at the heart of the problem: how much an API is used in practice is a key factor when considering whether it should be kept around. I can accept that I do not have the answer to this question, but I think there's a path to one if we can invest the manpower: instrument the community build and dump out compile-time and run-time call-site statistics of all these collection methods, to see which ones people call in practice, and how often. This wouldn't be anywhere near a complete snapshot of the Scala world, but it would give us a good sense of the rough ratios between the importance of different methods. I expect (???) it would show most immutable operations on mutable collections, save for a select few like I don't have the bandwidth the perform this research, but it seems to me that if we're going through the trouble of overhauling the collections and their internals, it would be worth someone first putting in the work to figure out how they're used right now 😛. I think this point also supports my proposal:
I think a method that's been defined for years but gets hardly any usage "in the wild" is almost the definition of nonsensical: even if the author of the method thinks it's awesome and elegant, clearly none of the users think it makes sense to use. Some empirical analysis of current usage patterns would tell us in-concrete-terms which of the methods users think are nonsensical, which I think is a better measure than what our internal-intuition tells us.
I agree with this, but I don't think it's a big deal. We could easily (???) set it up such that mutable collections have immutable views, that are accessible through Just as an Again, I believe |
I guess I should probably do this research, @lihaoyi, given that I already did it when we were modifying the collections in 2.10/2.11. I'll dust off my bytecode analyzer sometime soon, but it's going to take a couple of weeks, probably, to actually have an answer. |
Yes, this.type & InPlace would require serious compiler magic to make it work. I now think it should just be
I still do think we should use the same names. If not what naming scheme would we use for mutating methods? |
A pattern that I occasionally see is to use e.g. an
|
No idea if .linear makes a copy of the immutable collection, but in my head
.readonly would return a non-copied view of a mutable collection.
With that, building up a Seq using an ArrayBuffer is just as possible as it
is now; you just need to call .readonly to make your intention explicit
…On Thu, 5 Jan 2017 at 6:58 PM, Julien Richard-Foy ***@***.***> wrote:
I can accept that Array is special and often treated like an immutable
collection. But how often do you see people calling mutable.Buffer#map? Or
mutable.Set#map? Or mutable.Queue#map? I've been writing Scala for a few
years and never bumped into any of these.
A pattern that I occasionally see is to use e.g. an ArrayBuffer to build
a collection and then directly pass it to some method that expects a Seq.
This method typically uses map, filter, etc. but the underlying (runtime)
type of the collection is ArrayBuffer.
Just as an immutable collection would have only immutable operations by
default and a .linear view to do mutable stuff
.linear would first do a *copy* of the immutable collection, right?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#7 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AA5A_Ltq-mj4_aQkcrEsFBtdHi1o--Jqks5rPMzrgaJpZM4LVsGn>
.
|
Actually for most of the collections (e.g. I like the idea of using A good benefit of that would be that we would not anymore have a reason to share the same hierarchy between mutable and immutable collections. Having distinct hierarchies would make the code a lot cleaner (e.g. some |
Complete yak-shave, but you might be interested to follow the JDK's vernacular by saying "unmodifiable view" rather than "immutable view". |
Yes, that's an interesting alternative. I expect it will require extensive rewriting of existing code though. It's basically a removal of all generic collection types.
I don't have a preference for using different names but for putting the methods on the base traits together with the immutable versions, in which case they would have to use different names. This point is moot if we remove all shared immutable methods, of course. Let's assume we keep the generic collections: The mutable collections have lots of mutating methods like
Agreed. The symmetry of From the discussion so far the major choice we have to make boils down to:
|
I like this design, though I would not include |
Where would |
The main collection type. |
There is an inherent tension here that hasn't been called out between correctness and readability. It's really important to know whether your operations are mutating in place or not, if it's up to you to manage mutability. (A Rust-like borrow checker mostly removes this requirement.) And that suggests that code like the following: val ys = xs.inPlace
val zs = ys.map(_+ 1).filter(_ > 15) while delightfully readable and highly familiar, is potentially dangerously deceptive. This suggests to me that we not split out methods into an Thus, I suggest instead that we adopt a consistent naming scheme per method, so it's obvious every time that you use a method that it's the mutable version. val xs = new ArrayBuffer[Int]
(xs += 7).mapMe(_ + 8).flatMapMe(1 to _).filterMe(_ % 2 == 0)
println(xs) If Operators can be suffixed with In any case, I think practically, without additional features like compiler-supported aliased views that are truly zero-cost (not "maybe zero cost after the JVM gets done with them" like value classes are), the direct methods should at least be provided as options even if you do have an |
@Ichoran I'd be remiss if I didn't share this Shakespearean syllabic syllogism from which I'd propose Also, +1 to I want to know if |
I don't think it's that deceptive; the types are different, trying to pass a In a way, it's no different from overloading
My vote would be to remove the common abstraction in favor of read-only views 😛 I think that
Is one of the primary failings of Scala collections, worse than Things like While people complain about Scala's immutable collections, IMO this flood of nonsensical methods greatly reduces the quality of Scala's mutable collections, far below that of most other languages' such as Java's or Python's. At least for those, the operations on a mutable collection generally tell me how I do the things I want. In Scala, trying to scroll through the autocomplete or the scaladoc for an unfamiliar collection such as mutable.Queue or mutable.Set is an exercise in frustration. As an exercise, try digging through the scaladoc and seeing if an in-place "XOR"/"difference" method exists on But as I'm not the one doing the bulk of the work (...implementation, and the migration adding |
I vote against using the same names for in-place operations, I agree with Rex's comment above. Having same-named methods doing different things (and having different signatures) can confuse learners. Some operations would be changed to in-place ( |
As stated above, I’m in favor in removing the common abstraction. Also, I’m convinced by arguments of @lrytz, @Ichoran and @szeiger: I finally see more advantages in putting all the mutating methods in the collection type (rather than going through an |
I could live with either |
As far as common superclasses of mutable/immutable go, I continue to lean strongly towards keeping them, for the following reasons:
I agree that noise in ScalaDoc is a problem. Maybe this can be alleviated by massaging the doc-comments. I.e. clearly state that |
After thinking some more about it, I believe we could experiment with a scheme like this one:
With our decorator-based design, this is easy to achieve, I believe. We could then experiment and see how much of the community build breaks. Based on the results, we can either tweak the scheme, or go back to the current scheme where all collections support all transforms. WDYT? |
Even if we have immutable arrays instead? |
Also, about backward source compatibility: what do you think of introducing a deprecated implicit conversion from |
@julienrf Yes, I think even if we have immutable arrays, we still need functional ops on normal arrays. There are simply too many books written that use this idiom. |
I think deprecation could work well here. Let's try it, and see how much breaks. One other question: Do we want to keep mutable views? That's something I would love to get rid of, and I don't think they were used a lot (this needs to be verified). If we don't have mutable views, then we can just stick to |
While I feel like immutable views have a lot of uses, I don't see that for mutable ones, on the opposite: I think they are a possible source of surprising bugs, due to laziness and caching on views. I also agree that we should remove operations from iterator, as this has caused multiple bugs and misconceptions, both for newcomers and even for experienced developers(happened twice in the dotty codebase) I would also prefer to keep operations on mutable collections: people coming from Java find it very useful that arrays are a collection that works uniformly with other collections. |
I agree on removing mutable views and only having views as reified operations on Iterators. Would it help the situation with Iterator operations if we removed TraversableOnce (or at least its methods)? Does the current confusion arise from collections and Iterators sharing these operations through a common supertype? I think that transformations on Iterators are simply too useful to remove them. Adding an extra |
I’m currently running the community build with a scala-library were mutable views are removed (you can see the patch here, please tell me if I missed something!). So far it runs fine but please tell me if I forgot something in the patch! |
Mutable views were a minefield of surprising behavior, bugs, and lack of functionality, so they weren't used much. (Last time I scanned the bytecode of a bunch of projects, I'm not sure they came up at all.) However, that doesn't mean mutable collections shouldn't have views; it just means that if they do have views they need their own style of views with operations that are appropriate. Even if immutable and mutable collections share a common parent, their views need not necessarily. |
Which operations do you have in mind that would be appropriate for mutable collections only but would not constitute a minefield of surprising behavior? |
Things like update and mapInPlace. Anything that alters the contents but not the number of elements in the collection. Also, mutable views do not sensibly support filter operations in the usual sense. (A reindexing view that was built from filter results is possible, though.) |
I don't think so. TraversableOnce was introduced after the other collections library (I think it was in 2.9) to solve a real pain point: Every operation taking an argument to be iterated over had to be duplicated, once for an Iterator argument, the other for an Iterable argument. Sometimes someone forgot, and had just one kind of argument, and then users got burned. So TraversableOnce (or IterableOnce which is what we have now) is really important. |
I've created a separate ticket for discussing views to avoid derailing the discussion here. |
I’m closing this PR, let’s continue the discussion about views in the other issue. |
Iterator
. The rationale is that most of these operations were leaving the underlyingIterator
in a stale state.Iterating
, which provides the exact same operations asIterator
used to provide, but takes care of instantiating a freshIterator
each time, so that there is no risk of manipulating a staleIterator
.IterableOnce
: this type was confusing since anIterableOnce
was iterable more than once as long as we use a freshIterator
each time.