Adoption of the Apache Arrow memory alignment and padding?

Hi,

I'm just trying to get a sense of the level of interest from the ndarray developers regarding adopting the [Apache Arrow memory layout and padding](https://arrow.apache.org/docs/format/Columnar.html#buffer-alignment-and-padding).

I have been wanting to build integrations between Arrow and ndarray for some time.  Today it should be easy enough to build a zero-copy converter to ndarray types.  Arrow has a tensor type and this could be converted (with the optional names for dimensions in Arrow dropped).

However, without guarantees about memory alignment and padding, you could not go back from ndarray to Arrow with zero-copy.  The easiest way to get those guarantees would be for ndarray to allocate through the [Arrow functions that allocate memory](https://github.com/apache/arrow/blob/master/rust/arrow/src/memory.rs), via the Arrow [Buffer](https://github.com/apache/arrow/blob/master/rust/arrow/src/buffer.rs) type.
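To make the requirement concrete, here is a rough sketch of what an aligned and padded allocation looks like using only the Rust standard library. It is not ndarray's or Arrow's actual code: `AlignedBuffer`, `padded_len` and `is_arrow_compatible` are hypothetical names, and the 64-byte value is an assumption (the Arrow format document recommends aligning and padding to a multiple of 8 or 64 bytes).

```rust
use std::alloc::{alloc_zeroed, dealloc, Layout};
use std::ptr::NonNull;

/// Alignment and padding unit in bytes. Assumed to be 64 here; the Arrow
/// format document recommends a multiple of 8 or 64 bytes.
const ALIGNMENT: usize = 64;

/// Round `len` up to the next multiple of `ALIGNMENT`.
fn padded_len(len: usize) -> usize {
    (len + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT
}

/// Hypothetical owned byte buffer, for illustration only.
struct AlignedBuffer {
    ptr: NonNull<u8>,
    len: usize,      // bytes the caller asked for
    capacity: usize, // bytes actually allocated, including padding
}

impl AlignedBuffer {
    fn new(len: usize) -> Self {
        // Allocate at least one alignment unit so the layout size is non-zero.
        let capacity = padded_len(len.max(1));
        let layout = Layout::from_size_align(capacity, ALIGNMENT).expect("valid layout");
        // SAFETY: `layout` has a non-zero size.
        let raw = unsafe { alloc_zeroed(layout) };
        let ptr = NonNull::new(raw).expect("allocation failed");
        AlignedBuffer { ptr, len, capacity }
    }

    /// Borrow the requested bytes (padding excluded) without copying.
    fn as_slice(&self) -> &[u8] {
        // SAFETY: `ptr` points to `capacity >= len` zero-initialised bytes.
        unsafe { std::slice::from_raw_parts(self.ptr.as_ptr(), self.len) }
    }

    /// Check the two properties a zero-copy hand-off would rely on:
    /// an aligned start address and a padded allocation size.
    fn is_arrow_compatible(&self) -> bool {
        self.ptr.as_ptr() as usize % ALIGNMENT == 0 && self.capacity % ALIGNMENT == 0
    }
}

impl Drop for AlignedBuffer {
    fn drop(&mut self) {
        let layout = Layout::from_size_align(self.capacity, ALIGNMENT).expect("valid layout");
        // SAFETY: `ptr` was allocated in `new` with this exact layout.
        unsafe { dealloc(self.ptr.as_ptr(), layout) };
    }
}
```

For example, `AlignedBuffer::new(800)` (enough for 100 `f64` values) would allocate 832 bytes starting at a 64-byte-aligned address. A buffer from a plain `Vec<f64>` carries neither guarantee, which is why the round trip back to Arrow would currently need a copy.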

Arrow is attempting to make integrations between crates easier. I noticed [this](https://github.com/LukeMathWalker/linfa/issues/15) issue today, and it is the kind of issue we could avoid.

In general, I think that Arrow and ndarray fit together quite nicely: Arrow could provide a lot of help processing data, and ndarray provides all the algorithms once the data is cleaned and in memory.

I'm not very familiar with the ndarray codebase. If this sounds like a good idea, could you point me to where you allocate memory, and to any other information that might help?
