Julep: solving tricky iteration problems

# The problem (a brief summary)

In a [PR for a blog post](https://github.com/JuliaLang/julialang.github.com/pull/366), I described a number of challenges for several potential new `AbstractArray` types.  Very briefly, the types and the challenges they exhibit are:
- `ReshapedArrays`: the only efficient way of indexing some types is by indexing the parent array directly (i.e., with an iterator optimized for the pre-reshaping array).  This means that indexing a 2D array as `B[i,j]`, for integer `i` and `j`, may not always be efficient.
- `TransposedMatrix` (for JuliaLang/LinearAlgebra.jl#42): iteration is most efficient when it produces a cache-friendly sequence of array accesses. Unfortunately, algorithms that mix column-major and row-major arrays imply that we can't naively allow `eachindex(A)` to always return the most efficient iterator.
- `OffsetArrays` (sometimes called "Fortran arrays"): these violate the assumptions that array indexes run from `1:size(A,d)`, and more generally question how one should match elements of two different
  arrays.
# Possible APIs

In the blog post, only two concrete algorithms were considered: matrix-vector multiplication and `copy!`.  At the risk of being insufficiently general, let's explore possible APIs in terms of these two functions.

**EDIT: note that the thoughts on the API are evolving as a consequence of this discussion.** To jump ahead to an updated proposal, see https://github.com/JuliaLang/julia/issues/15648#issuecomment-205306513. There may be further updates even later. The material below is left for reference, but some of it is outdated.
## The "raw" API

For the moment, let's worry about correctness and efficiency more than elegance.  Correctness trumps all other considerations.  Efficiency means that we have to be able to iterate over the matrix in
cache-friendly order.  That means that the array `B` has to be "in charge" of the sequence of values and pick its most efficient iterator.

I should emphasize that I haven't taken even the first step towards attempting to implement this API.  So this is crystal-ball development, trying to imagine what might work while hoping to shake
out likely problems in advance via community feedback.  That caveat understood, here's a possible implementation of matrix-vector multiplication:

``` jl
RB = eachindex(B)
Rv = eachindex(RB[*,:], v)
Rdest = eachindex(RB[:,*], dest)
for (idest, iB, iv) in zip(Rdest, RB, Rv)
    dest[idest] += B[iB] * v[iv]
end
```

`eachindex(iter, obj)` should perform bounds-checking, to ensure that the indexes of `iter` align with `obj` (even handling things like `OffsetArrays`).  The notation `RB[*,:]` means "corresponding to the
second dimension of `RB`"; interestingly, this expression parses without error, attempting to dispatch to `getindex` with argument types `Tuple{typeof(RB), Function, Colon}`. Hopefully there isn't another pending meaning for indexing with a `Function`.

Because either `idest` or `iv` is constant over the "inner" loop (depending on storage order of `B`), ideally one would like to hoist the corresponding access `dest[idest]` or `v[iv]`.  I'd argue that's a
job for the compiler, and not something we should have to bake into the API.

Because `zip(X, Y)` does not allow `Y` to see the state of `X`, it seems quite possible that this could not be made efficient in general (consider the case described [for `ReshapedArray`s](https://github.com/JuliaLang/julialang.github.com/blob/teh/arrays/blog/_posts/2016-03-27-arrays-iteration.md#reshapedarrays-15449), where the
efficiently-parametrized iterator for the first dimension depends on the state of the index for the second dimension).  If this proves to be the case, we have to abandon `zip` and write this more like

``` jl
RB = eachindex(B)
for (iB, idest, iv) in couple(RB, (RB[:,*], dest), (RB[*,:], v))
    dest[idest] += B[iB] * v[iv]
end
```

where `couple` is a new exported function.

One interesting thing about this API is that sparse multiplication might be implemented efficiently just by replacing the first line with

``` jl
RB = eachstored(B)
```

so that only "stored" elements of `B` are visited.  (@tkelman may have been the first to propose an `eachstored` iterator.)  It should be pointed out that several recent discussions have addressed some of the complications that arise from `Inf` and `NaN`, and it remains to be determined whether a version that [doesn't ignore these complications](https://github.com/JuliaLang/julia/pull/15579#issuecomment-200604174) can be written with this (or similar) API.

Likewise, `copy!` might be implemented

``` jl
Rdest = eachindex(dest)
for (idest, isrc) in couple(Rdest, (Rdest, src))
    dest[idest] = src[isrc]
end
```

At least for the array types described here, this API seems plausibly capable of generating both correct and (fairly) efficient code.  One might ideally want to write generic "tiled" implementations, but that seems beyond the scope of this initial effort.
## Making it prettier with "sufficiently smart iteration"

This API is still more complicated than naive iteration.  One might possibly be able to hide a lot of this with a magical `@iterate` macro,

``` jl
@iterate B dest[i] += B[i,j]*v[j]
```

for matrix-vector multiplication and

``` jl
@iterate dest dest[i] = src[i]
```

for `copy!`. The idea is that the `@iterate` macro would expand these expressions into their longer forms above. Note that the first argument to `@iterate` specifies which object is "in charge" of the
sequence of operations.  `@iterate` might be too limited for many things, but may be worth exploring. The [KernelTools](https://github.com/timholy/KernelTools.jl) repository may have some useful tidbits.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Julep: solving tricky iteration problems #15648

The problem (a brief summary)

Possible APIs

The "raw" API

Making it prettier with "sufficiently smart iteration"

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

Julep: solving tricky iteration problems #15648

Description

The problem (a brief summary)

Possible APIs

The "raw" API

Making it prettier with "sufficiently smart iteration"

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions