Skip to content

Very lazy LazyList creation #11105

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
NthPortal opened this issue Aug 23, 2018 · 7 comments
Closed

Very lazy LazyList creation #11105

NthPortal opened this issue Aug 23, 2018 · 7 comments
Assignees
Milestone

Comments

@NthPortal
Copy link

NthPortal commented Aug 23, 2018

Currently, LazyList has two forms of laziness:

  1. laziness in determining whether or not it is empty (and by extension, how many elements it has)
  2. independent laziness of elements - that is, evaluating a subsequent head or tail of the LazyList does not cause the evaluation of previous heads.

Of LazyList's factory methods, none support both kinds of laziness. unfold and from (when called with an Iterator or View) support the first kind, but have to evaluate previous elements in order to know if the collection contains subsequent elements. tabulate, fill and continually support the second kind of laziness, but their sizes are known at construction (infinity in the case of continually). The lazy #:: method can only be used to create non-empty LazyLists as well.

In theory, #::: can be used to combine LazyLists of 0 or 1 elements which are lazy in the first way, thus making the elements independently lazy; however, this seems to have poor usability both from API and performance perspectives (also it may not be stack-safe).

What factory method (or other strategy) can we use to create a LazyList with both kinds of laziness? Is this even a useful feature? I do not personally have a motivating example, but that doesn't mean that none exists.

@NthPortal NthPortal added this to the 2.13.0-RC1 milestone Aug 23, 2018
@javax-swing
Copy link

javax-swing commented Aug 24, 2018

@NthPortal just some stuff off the top of my head. I can delete it if it is utter nonsense 😁

Reasoning for a Lazy unfold. Perhaps you have an application where a user chooses a date range of date-indexed data they wish to look at. This then presents them the option of choosing a date to click on.

Fetching this data is very expensive,and so you want to cache the results of the data in the users session. The api from the clients perspective is to ask for a given index within the date range supplied.

For convenience you may have some kind of Session that looks like the following.

class SomeSession(db : DataStore[SomeData], from : Date, to : Date){
   private val results : LazyList[SomeData] = LazyList.unfoldLazy(from) { date =>
    if (date after to) None
    else {
        val nextData = () => db.fetch(date)
        Some(nextData, date + 1 day)
    }
    }

    def fetch(index: Int) : Option[SomeData] = results.get(index)
}

Implementing the same thing with a mutable data structure

class SomeSession(db : DataStore[SomeData], from : Date, to : Date){
   private val cache : mutable.Map[Int, SomeData] = Map()

    def fetch(index: Int) : Option[SomeData] = {
        if (cache.contains(index)) {
            cache.get(index)
        } else {
            val date = from + index days
            if (index < 0 || !(date after to)) {
                None
            } else {
                cache += (index -> db.fetch(date))
                 cache.get(index)
            }
        }
    }
}

}

Reasoning not to have it: you can always use the current unfold and map to achieve the same result

val lazyLazyList : LazyList[A] = LazyList.unfold(1) {
  s => Some(() => head, s + 1)
}.map(_.apply())

Not sure if this is too contrived though 👌

Sorry for the dodgy pseudo Scala and formatting, I did it on my phone.

@javax-swing
Copy link

@NthPortal
Copy link
Author

@javax-swing if you have a fixed date range, you can use tabulate

@NthPortal
Copy link
Author

NthPortal commented Aug 25, 2018

I think actually a very compelling usecase is testing the laziness of LazyList. Currently (not pushed), the implementation for testing both laziness of state and of elements is quite complex and unfriendly:

    private[this] val heads  = new Array[Boolean](count)
    private[this] val states = new Array[Boolean](count)

    private[this] def gen(index: Int): LazyList[Int] = {
      def f(): Int = {
        heads(index) = true
        index
      }

      LazyList.unfold(0) { _ => { states(index) = true; None } } #::: f() #:: LazyList.empty[Int]
    }

    private[this] def genList(): LazyList[Int] = {
      def doGen(n: Int): LazyList[Int] =
        if (n < count) gen(n) #::: doGen(n + 1)
        else LazyList.unfold(0)(_ => None)

      doGen(0)
    }

If we think about adding a method to simplify the implementation, we might try something like this:

def unfoldLazy[A, S](init: S)(f: S => Option[(() => A, S)]): LazyList[A] = ???

This allows us to rewrite the testing implementation as follows:

    private[this] val heads  = new Array[Boolean](count)
    private[this] val states = new Array[Boolean](count)

    private[this] def genList(): LazyList[Int] = LazyList.unfoldLazy(0) { index =>
      if (index >= count) None
      else {
        states(index) = true
        Some(() => { heads(index) = true; index }, index + 1)
      }
    }

In fact, at this point, the definition of genList() can just be inlined into the lazyList variable (not shown)

Edit: If the state part of unfoldLazy isn't going to be lazy, I'm not sure if there is any value in the method above doing what @javax-swing suggested with .map(_.apply())

@javax-swing
Copy link

@NthPortal Even if the state part is lazy, you could still replicate both with unfold

https://scastie.scala-lang.org/S44MPq47R0eTDGxvpo4Q5A

If people were so inclined to delay the computation of S

@NthPortal NthPortal self-assigned this Sep 2, 2018
@NthPortal
Copy link
Author

NthPortal commented Sep 2, 2018

I'm wondering if there is value in having

def unfoldLazy[A, S](init: S)(f: S => Option[(() => A, S)]): LazyList[A] = ???

in order to reduce allocation over @javax-swing's suggestion. Internally, head is wrapped inside of a Function0 (i.e. () => A); thus, calling LazyList.unfold[() => A, S] wraps it twice as () => () => A. If we added unfoldLazy as shown above, we could remove the need to wrap head twice, thus improving both performance and reducing GC. Additionally, it would remove the need to allocate a whole other LazyList when calling map.


For comparison:

// taken from @javax-swing's scastie
val lazyList1: LazyList[A] = LazyList.unfold(1) {
  s => Some(() => s * s, s + 1)
}.map(_.apply())

val lazyList2: LazyList[A] = LazyList.unfoldLazy(1) {
  s => Some(() => s * s, s + 1)
}

@NthPortal
Copy link
Author

Rejected by #11307

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants