Skip to content

List strategies (except evalBuffer/parBuffer) on infinite lists results in non-termination #73

@robstewart57

Description

@robstewart57

Whilst the documentation for parBuffer suggests it is more suitable for lazy lists, the documentation for parList or other list strategies does not explicitly state that evaluating infinite lists directly with e.g. parList may change the termination property of a program.

Calling print f terminates for:

f :: Int
f = sum (takeWhile (< 100) (concat infinite_list_chunked))
  where
    infinite_list = [n + n | n <- [0, 1 ..]]
    infinite_list_chunked = chunks 20 infinite_list

Because sum consumes 50 values from takeWhile, which consumes 50 values from the infinite list produced by concat.

However if:

concat infinite_list_chunked

is replaced with:

concat (infinite_list_chunked `using` parList rseq)

then the function does not terminate. I.e. print g does not terminate:

g :: Int
g = sum (takeWhile (< 100) (concat (infinite_list_chunked `using` parList rseq)))
  where
    infinite_list = [n + n | n <- [0, 1 ..]]
    infinite_list_chunked = chunks 20 infinite_list

It might be surprising to a user that introducing a parList strategy to their code changes the termination property of their program. Presumably parList triggers an attempted full evaluation of the spine of the infinite list to preemptively create all sparks, which is where non-termination occurs because the spine is infinite. But introducing parList is breaking the compositionality of composing lazy functions over infinite lists.

For completeness, introducing parBuffer rather than parList does preserve the termination of the original sequential version. I.e.g print k terminates because a buffer is size 10 is repeatedly populated with list elements for consumption from takeWhile until n+n is >=100:

k :: Int
k = sum (takeWhile (< 100) (concat (infinite_list_chunked `using` parBuffer 10 rseq)))
  where
    infinite_list = [n + n | n <- [0, 1 ..]]
    infinite_list_chunked = chunks 20 infinite_list

@simonmar 's book says:

In contrast to parList which eagerly creates a spark for every list element, parBuffer N creates sparks for only the first N elements of the list, and then creates more sparks as the result list is consumed.

However a user may not realise that "eagerly creates a spark for every list element" means that control flow will not be relinquished from parList until it has created all sparks (or not, for infinite lists).

If the non-terminating example above is changed so that the parList strategy is applied to the finite list produced by takeWhile rather than the infinite list consumed by concat, then it does terminate as the sequential one does. print m terminates:

m :: Int
m = sum (takeWhile (< 100) (concat infinite_list_chunked) `using` parList rseq)
  where
    infinite_list = [n + n | n <- [0, 1 ..]]
    infinite_list_chunked = chunks 20 infinite_list

Two points about the documentation for the list strategies documented here:

https://hackage.haskell.org/package/parallel-3.2.2.0/docs/Control-Parallel-Strategies.html#g:7

  1. Should the documentation explicitly state that the list strategies (except evalBuffer/parBuffer) should not be used to directly evaluate infinite lists?

  2. Should the documentation explicitly state that if they are used with infinite lists, a function may not terminate even if the infinite list is demanded to just WHNF?

The chunks function:

chunks :: Int -> [a] -> [[a]]
chunks _ [] = []
chunks n xs =
  let (ys, zs) = splitAt n xs
   in ys : chunks n zs

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions