-
Notifications
You must be signed in to change notification settings - Fork 1k
Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Avro is a widely used binary, row-oriented data encoding. It is very similar to protobuf, and has seen very wide adoption in the data ecosystem, especially for streaming workloads.
Describe the solution you'd like
A new arrow_avro crate will provide vectorised support for reading and writing avro data. The APIs should be designed in such a way as to work for the various different container formats for avro encoded data, including single object encoding, object container files and message even if first-class support is not provided for all these framing mechanisms.
Describe alternatives you've considered
Additional context
DataFusion has some avro support, however, it is based on the row-based apache_avro crate and is therefore likely extremely sub-optimal.
FYI @Samrose-Ahmed @sarutak @devinjdangelo I intend to work on this, but any help with reviews / testing would be most welcome
Subtasks