Julep: Generalize indexing with and by `Associative`s #24019

I've always loved how in Julia (and MATLAB) one can create a new array from an old one, using what are now the APL indexing rules. Basically, if you index a collection of values with a collection of indices, you get a new collection of the indexed values. Beautiful, simple. Indexing has also been extended to arrays that don't use 1-based indexing, e.g. by the *OffsetArrays.jl* package.

I'm not sure if this issue exists elsewhere as its own entity (cleaning up distinctions between arrays and associatives was surely mentioned in #20402 and this Julep seems to be a logical extension of #22907), but here I propose specifically that we extend indexing of and by `Associative` and make related changes so that the semantics are consistent across these two types of container. I prototyped ideas at https://github.com/andyferris/AssociativeArray.jl and basically came up with the ability to (with simple code):

 * Index an `Associative{K,V}` with an `Associative{I,K}` to get an `Associative{I,V}`. E.g. `Dict(:a=>1, :b=>2, :c=>3)[Dict("a"=>:a, "c"=>:c)] == Dict("a"=>1, "c"=>3)`.
 * Index an `Associative{K,V}` with an `AbstractArray{K,N}` to get an `AbstractArray{V,N}`. E.g. `Dict(:a=>1, :b=>2, :c=>3)[[:c, :a]] == [3,1]`.
 * Index an `AbstractArray{T,N}` with an `Associative{K,I}` to get an `Associative{K,T}` (where `I` might be `Int` for linear indexing, or a `CartesianIndex{N}` for Cartesian indexing). E.g. `[11,12,13][Dict(:a=>1, :c=>3)] == Dict(:a=>11, :c=>13)`.
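
As a rough sketch of the three rules above, the snippet below uses a hypothetical helper `index_by(a, b)` standing in for the proposed `a[b]`, so that nothing in `Base` is overloaded. It is not the code from the prototype package, and it assumes a Julia 1.x session, where `Associative` is spelled `AbstractDict`.

```julia
# Hypothetical helper (not part of Base or AssociativeArray.jl):
# `index_by(a, b)` plays the role of the proposed `a[b]`.

# Dictionary indexed by dictionary: keep the keys of `b`,
# and look each value of `b` up in `a`.
index_by(a::AbstractDict, b::AbstractDict) = Dict(i => a[k] for (i, k) in b)

# Dictionary indexed by array: the result has the shape of `b`.
index_by(a::AbstractDict, b::AbstractArray) = map(k -> a[k], b)

# Array indexed by dictionary: the values of `b` are linear or
# Cartesian indices into `a`; the result keeps the keys of `b`.
index_by(a::AbstractArray, b::AbstractDict) = Dict(k => a[i] for (k, i) in b)

d = Dict(:a => 1, :b => 2, :c => 3)

r1 = index_by(d, Dict("a" => :a, "c" => :c))          # dictionary by dictionary
r2 = index_by(d, [:c, :a])                            # dictionary by array
r3 = index_by([11, 12, 13], Dict(:a => 1, :c => 3))   # array by dictionary
r4 = index_by([1 2; 3 4], Dict(:x => CartesianIndex(2, 1)))  # Cartesian variant
```

Here `r1`, `r2` and `r3` reproduce the three examples in the list, and `r4` shows the `CartesianIndex` case mentioned in the last bullet.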

The semantics are consistent across arrays and dictionaries, and provide that for `out = a[b]`:

 * The output container `out` shares the indices of `b` (note: these are `CartesianRange` for arrays)
 * The values `out[i]` correspond to `a[b[i]]`.

This is fully consistent with both the `Base` arrays and the *OffsetArrays.jl* package. (We can do something similar for `setindex!`.)
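
The two rules for `out = a[b]` can be written down as a small check. The function name `satisfies_indexing_rule` is made up for this illustration, and the snippet assumes Julia 1.x, where `keys` works on both arrays and dictionaries. The array case uses ordinary `Base` indexing; the dictionary case builds `out` by hand, since indexing an array by a dictionary is only proposed here.

```julia
# Hypothetical check (illustration only): does `out` behave like `a[b]`?
#  1. `out` shares the indices of `b`
#  2. `out[i] == a[b[i]]` for every index `i` of `b`
satisfies_indexing_rule(a, b, out) =
    Set(keys(out)) == Set(keys(b)) && all(out[i] == a[b[i]] for i in keys(b))

# Existing Base behaviour: a vector indexed by a matrix of indices.
a = [10, 20, 30, 40]
b = [4 1; 2 3]
out = a[b]                      # a 2x2 matrix with the indices of `b`
array_ok = satisfies_indexing_rule(a, b, out)

# Proposed behaviour: an array indexed by a dictionary of indices.
arr = [11, 12, 13]
db = Dict(:a => 1, :c => 3)
dout = Dict(k => arr[i] for (k, i) in db)   # what `arr[db]` would return
dict_ok = satisfies_indexing_rule(arr, db, dout)
```

Both checks go through the same function, which is the point of the proposal: one rule describes the array and the dictionary case.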

To make everything consistent, it helps to make the following associated changes:

 * Make `Associative`s be containers of values, not of `index=>value` pairs, so that arrays and dictionaries are consistent on this fundamental point. Use the existing `pairs` function when necessary (and ideally make it preserve indexability).
 * Make `similar` always return a container with the same indices, even for dictionaries. Ideally, unify `similar` across `Associative`s and `Array`s (for example a dictionary which is `similar` to a distributed array might also be distributed) via use of the indices.
 * Have a new `empty` function that makes empty `Dict`s and `Vector`s to which elements should be added. (Done, #24390).
 * Consider whether we want to have the collection of things you call `getindex` and `setindex!` with be called `indices`, rather than `keys` (and rename the current `indices(::AbstractArray)` to something else)
 * Have `view` work for the various combinations where `getindex` works.
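
To make the first bullet concrete, here is a short illustration of the inconsistency it refers to, as it can be observed on Julia 1.x (where `Associative` is `AbstractDict`). It only shows behaviour of `Base` as it stands, not the proposed change; the variable names are arbitrary.

```julia
d = Dict(:a => 1, :b => 2)
v = [1, 2]

# Arrays iterate their values, dictionaries iterate key=>value pairs.
array_iterates_values = first(v) isa Int
dict_iterates_pairs = first(d) isa Pair

# `values` and `pairs` make the choice explicit for both container types.
same_sum = sum(values(d)) == sum(v)
vector_pairs = collect(pairs(v))

# `empty` gives an empty container of the same kind, ready to be filled.
ed = empty(d)
ev = empty(v)
```

Under the proposal, iterating `d` directly would yield `1` and `2`, matching the array, and `pairs(d)` would be the explicit way to ask for `key=>value` pairs.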

The demonstration package also prototypes making `AbstractArray{T, N} <: Associative{CartesianIndex{N}, T}` - I don't think this is strictly necessary but it helped (me) to highlight which parts of the existing interface were inconsistent. The package *does* demonstrate that we can put something simple together without excessive amounts of code (some performance tuning is surely required).

Finally, a word on what motivates this: lately I've been playing with what fundamental data operations (such as mapping, grouping, joining or filtering) would be useful for both generic data structures and tables/dataframes (that iterate rows), and I found that whenever I created, say, a grouping (using a dictionary of groups), I immediately felt the loss of the ability to do complex indexing and other operations with the result (as well as having to worry about whether the output iterates values or key-value pairs, etc).
