Skip to content

takeRight and dropRight of views eagerly traverse the underlying collection elements #11275

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
julienrf opened this issue Nov 26, 2018 · 6 comments

Comments

@julienrf
Copy link

julienrf commented Nov 26, 2018

I’m not sure we can do better, but I’d like to draw attention to the current behaviour of takeRight and dropRight applied to views, at least to agree on what we should expect.

This REPL session shows the current behaviour:

Welcome to Scala 2.13.0-20181120-204541-343c2d4 (OpenJDK 64-Bit Server VM, Java 1.8.0_181).
Type in expressions for evaluation. Or try :help.

scala> List.from(1 to 10)
res0: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

scala> res0.view
res1: scala.collection.SeqView[Int] = View(?)

scala> res1.map { x => println(x); x }
res2: scala.collection.SeqView[Int] = View(?)

scala> res2.take(3) // With `take`, no traversal happens
res3: scala.collection.SeqView[Int] = View(?)

scala> res2.takeRight(3)
1
2
3
4
1
5
2
6
3
7
4
8
5
9
6
10
7
8
9
10
res4: scala.collection.View[Int] = View(?)

takeRight returns a View whose elements are eagerly forced, and if we look at the implementation, we see that the elements are collected into an ArrayBuffer, and then a view of that ArrayBuffer is returned. I see two problems with that: (1) the fact that the returned view elements are eagerly evaluated breaks the property of transformation operations applied to views being lazy, and (2) an intermediate ArrayBuffer is used, although the purpose of using views is to not create intermediate data structures.

That being said, unlike take, we can not implement takeRight on views in a fully lazy way because we have to reach the end of the underlying collection to know that we can start emitting elements. So, the best we could do would be to delay the time we traverse the underlying collection to the time when at least one element of the view is effectively accessed. Doing that would fix the two mentioned issues.

However, that’s not the end of the story because this solution would break another (maybe informal?) property of views: accessing view elements should have a predictable complexity, and more specifically, it should have the same complexity of accessing elements of the underlying collection. Said otherwise, head should be O(1) on views, but that wouldn’t be the case here because it would require a complete traversal of the underlying collection to get the first element resulting from the takeRight operation.

So, it seems that the two properties (transformation operations of views should be lazy, and accessing view elements should have the same complexity as accessing the underlying collection elements) can not be satisfied at the same time. Which one should we pick?

My suggestion would be to keep the second property and give up the first one, because we already have some other operations that are not lazy (e.g. groupBy, permutations, …). These operations are not really transformation operations because they don’t return the same collection type but a collection of sub-collections (ie groupBy returns a Map[K, View[A]]). Another similar case is sorted, which eagerly evaluates its elements as well (and uses an intermediate ArrayBuffer).

If we agree on that, we can still at least improve the current implementation by not creating an intermediate ArrayBuffer, like LazyList does here: #11083 (comment)

@SethTisue SethTisue added this to the 2.13.0-RC1 milestone Nov 26, 2018
@som-snytt
Copy link

Just to point out leading-dot in REPL, in case the feature is not well-known:

scala> (1 to 10).toList
res0: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

scala> .view
res1: scala.collection.SeqView[Int] = View(?)

scala> .map { x => println(x) ; x }
res2: scala.collection.SeqView[Int] = View(?)

@szeiger
Copy link

szeiger commented Dec 5, 2018

My suggestion would be to keep the second property and give up the first one, because we already have some other operations that are not lazy (e.g. groupBy, permutations, …). These operations are not really transformation operations because they don’t return the same collection type but a collection of sub-collections

But that's an important difference. These methods cannot be lazy because they return strict collections. There is no indication in the types that takeRight or dropRight shouldn't be lazy. They are symmetrical to take and drop, they have the same type and they should be equally lazy.

accessing view elements should have a predictable complexity, and more specifically, it should have the same complexity of accessing elements of the underlying collection. Said otherwise, head should be O(1) on views

I don't think that's true. A View represents a reified operation on an Iterator (or on indexed access in case of an IndexedView). Traversing is done with an Iterator, values are computed based on an Iterator, so the complexity is that of accessing the unerlying collection through an Iterator. head won't be O(1) if the base collection doesn't have O(1) head, it won't be O(1) if you put it after filter(_ => false), etc.

Any kind of caching in Views is a problem because it goes against the simple model of Views as reified Iterator operations. I don't we should do it unless it's clear from the types (like computing a strict collection or building a View from an Iterator). What we need for takeRight and dropRight are View-based implementations that already exist for take and drop. StrictOptimizedIterableOps can override them with more efficient strict implementations.

@julienrf
Copy link
Author

julienrf commented Dec 5, 2018

head won't be O(1) if the base collection doesn't have O(1) head, it won't be O(1) if you put it after filter(_ => false), etc.

Yeah, good point.

What we need for takeRight and dropRight are View-based implementations that already exist for take and drop. StrictOptimizedIterableOps can override them with more efficient strict implementations.

I agree.

@szeiger
Copy link

szeiger commented Dec 13, 2018

Small catch: Iterator doesn't even have takeRight and dropRight and these are non-trivial to implement efficiently. I'll give it a try.

@NthPortal
Copy link

Possibly not the best option, but you could convert it to a LazyList first, which has decent takeRight and dropRight implementations

szeiger added a commit to szeiger/scala that referenced this issue Dec 13, 2018
- Add these methods to IterableOnceOps
- Implement them in Iterator (with the necessary amount of caching but
  no more than that)
- Move the existing strict implementations from IterableOps to
  StrictOptimizedIterableOps
- Add View.(TakeRight|DropRight) based on Iterator
- Move IndexedSeqView implementations of TakeRight, Drop and DropRight
  up to SeqView
- Add new overrides in IndexedSeqView

Fixes scala/bug#11275
@szeiger szeiger added the has PR label Dec 13, 2018
szeiger added a commit to szeiger/scala that referenced this issue Dec 14, 2018
- Add these methods to IterableOnceOps
- Implement them in Iterator (with the necessary amount of caching but
  no more than that)
- Move the existing strict implementations from IterableOps to
  StrictOptimizedIterableOps
- Add View.(TakeRight|DropRight) based on Iterator
- Move IndexedSeqView implementations of TakeRight, Drop and DropRight
  up to SeqView
- Add new overrides in IndexedSeqView

Fixes scala/bug#11275
@szeiger
Copy link

szeiger commented Dec 14, 2018

I took the direct route with dedicated Iterator implementations with array-based ring buffers. This should be much faster than going through LazyList.

szeiger added a commit to szeiger/scala that referenced this issue Jan 25, 2019
- Move the existing strict implementations from IterableOps to
  StrictOptimizedIterableOps
- Add View.(TakeRight|DropRight) with private Iterator-based
  implementations that cache as little and as late as possible
- Move IndexedSeqView implementations of TakeRight, Drop and DropRight
  up to SeqView
- Add new overrides in IndexedSeqView

Fixes scala/bug#11275
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants