[JS] Improve JS documentation on how to read/deserialize arrow data #78

@bluehat974

Describe the enhancement requested

cc @domoritz

The current JS documentation does not clearly explain how to read and manipulate data with Apache Arrow JS.

The JS version of Apache Arrow is used across JS environments (DuckDB Wasm, ObservableHQ, Arquero),
and people keep asking how to read the data properly, but there is no clear answer:
duckdb/duckdb-wasm#1418

Some documentation on reading Arrow data or deserializing it to JSON exists elsewhere:
https://duckdb.org/docs/api/wasm/query.html#arrow-table-to-json
https://observablehq.com/@theneuralbit/using-apache-arrow-js-with-large-datasets

but these examples should be consolidated into the official Apache Arrow JS documentation:
https://github.com/apache/arrow/blob/main/js/README.md

Some ideas for code examples to add to the documentation:

  • The best way to read data without deserializing it to JSON
  • How to take advantage of JS Proxy to read data faster than deserializing to JSON
  • If serialization is required, how to do it properly
  • How to convert columns to rows
  • How to read nested types (STRUCT, MAP, DICTIONARY...)
  • How to cast Arrow types (e.g. from DECIMAL to DOUBLE)
  • How to cast Arrow types (LONG, DOUBLE, DECIMAL) to the desired JS type (bigint, number, string...)
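
To make a few of these ideas concrete, here is a rough, dependency-free sketch of the kind of example the docs could include. It deliberately does not use the Apache Arrow JS API: plain typed arrays stand in for Arrow columns, and every name (`columns`, `rowView`, `sumColumn`, `toPlainRow`, `decimalToNumber`) is hypothetical. It only illustrates the general techniques: a lazy Proxy row view over columnar data, a column scan that never builds row objects, and explicit casting of bigint and decimal values before JSON serialization.

```javascript
// Hypothetical column store: one array per column, as in a columnar format.
// BigInt64Array stands in for a 64-bit integer (LONG) column.
const columns = {
  id: BigInt64Array.from([1n, 2n, 3n]),
  price: Float64Array.from([9.5, 12.25, 3.0]),
  name: ["apple", "pear", "plum"],
};

// Lazy row view: a Proxy that reads a value from its column only when a
// field is accessed, so no per-row object is copied out of the columns.
function rowView(cols, index) {
  return new Proxy({}, {
    get: (_target, field) =>
      Object.hasOwn(cols, field) ? cols[field][index] : undefined,
    has: (_target, field) => Object.hasOwn(cols, field),
    ownKeys: () => Object.keys(cols),
    getOwnPropertyDescriptor: (_target, field) =>
      Object.hasOwn(cols, field)
        ? { enumerable: true, configurable: true, value: cols[field][index] }
        : undefined,
  });
}

// Column scan: aggregating a numeric column does not need rows at all.
function sumColumn(cols, field) {
  let total = 0;
  for (const value of cols[field]) total += value;
  return total;
}

// Materialize a plain object only when serialization is really needed.
// JSON.stringify throws on bigint, so 64-bit values are cast explicitly:
// to number when they fit safely, otherwise to string.
function toPlainRow(cols, index) {
  const out = {};
  for (const [field, col] of Object.entries(cols)) {
    const value = col[index];
    if (typeof value === "bigint") {
      const safe =
        value >= BigInt(Number.MIN_SAFE_INTEGER) &&
        value <= BigInt(Number.MAX_SAFE_INTEGER);
      out[field] = safe ? Number(value) : value.toString();
    } else {
      out[field] = value;
    }
  }
  return out;
}

// Assumed decimal representation: an unscaled bigint plus a scale.
// The cast to a double is lossy for values beyond 2^53.
function decimalToNumber(unscaled, scale) {
  return Number(unscaled) / 10 ** scale;
}
```

For instance, `rowView(columns, 1).name` reads `"pear"` straight from the `name` column, while `JSON.stringify(toPlainRow(columns, 0))` produces a string only for the rows that actually need to leave the columnar representation. Equivalent examples written against the real Arrow JS types would be the actual goal of this issue.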

Component(s)

Documentation, JavaScript
