Iceberg-rust Write support
I've noticed a lot of interest in write support in Iceberg-rust. This issue aims to break this down into smaller pieces so they can be picked up in parallel.
Appetizers
If you're not super familiar with the codebase, feel free to pick up one of the appetizers below. They may or may not be related to the write path, but they are good things to get in, and a good way to get to know the codebase:
- Able to parse name-mapping into a recursive structure. #723
- Extend the e2e tests with append a partitioned file #720
- Extend the `DataFileWriterBuilder` tests #726
- Sort Order Replacement API #734
Commit path
The commit path entails writing a new metadata JSON.
- Applying updates to the metadata Updating the metadata is important both for writing a new version of the JSON in the case of a non-REST catalog, and for keeping an up-to-date version in memory. It is recommended to re-use the updates/requirements objects provided by the REST catalog protocol; PyIceberg uses a similar approach.
- REST catalog Serializes the updates and requirements into JSON, which is dispatched to the REST catalog. Done in feat: Implement create table and update table api for rest catalog. #97.
- Other catalogs For the other catalogs, instead of dispatching the updates/requirements to the catalog, there are additional steps:
- Logic to validate the requirements against the metadata, to detect commit conflicts. A lot of this logic is already being implemented by TableMetadataBuilder #587.
- Writing a new version of the metadata.json to the object store, taking into account the naming mentioned in the spec.
- Provide locking mechanisms within the commit (Glue, Hive, SQL, ..) so the atomic swap happens safely.
- SQL Conflict detection looks to be missing. I was expecting logic there that checks whether any rows were affected (if not, another process has altered the table).
- Glue Not yet implemented.
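The conflict-detection and file-naming steps above can be sketched as follows. Note that `TableRequirement`, `TableMetadata`, and the helper names here are simplified, illustrative stand-ins rather than the real iceberg-rust API, and the `<version>-<uuid>.metadata.json` naming is the common metastore-catalog convention:

```rust
// Simplified, illustrative types: the real crate models these differently.
enum TableRequirement {
    // The table's UUID must not have changed.
    AssertTableUuid { uuid: String },
    // The snapshot a named ref points to must match (None = ref must not exist).
    AssertRefSnapshotId { reference: String, snapshot_id: Option<i64> },
}

struct TableMetadata {
    table_uuid: String,
    main_snapshot_id: Option<i64>,
}

// Check every requirement against freshly loaded metadata; any mismatch
// means another writer committed in between, i.e. a commit conflict.
fn check_requirements(meta: &TableMetadata, reqs: &[TableRequirement]) -> Result<(), String> {
    for req in reqs {
        match req {
            TableRequirement::AssertTableUuid { uuid } => {
                if meta.table_uuid != *uuid {
                    return Err(format!("table UUID changed to {}", meta.table_uuid));
                }
            }
            TableRequirement::AssertRefSnapshotId { reference, snapshot_id } => {
                if reference.as_str() == "main" && meta.main_snapshot_id != *snapshot_id {
                    return Err("concurrent commit detected on ref 'main'".to_string());
                }
            }
        }
    }
    Ok(())
}

// New metadata files are commonly named `<version>-<uuid>.metadata.json`
// (zero-padded version), written to the table's metadata directory.
fn new_metadata_path(metadata_dir: &str, version: u64, uuid: &str) -> String {
    format!("{metadata_dir}/{version:05}-{uuid}.metadata.json")
}

fn main() {
    let meta = TableMetadata { table_uuid: "9c12d441".into(), main_snapshot_id: Some(1) };
    let reqs = [TableRequirement::AssertTableUuid { uuid: "9c12d441".into() }];
    assert!(check_requirements(&meta, &reqs).is_ok());
    println!("{}", new_metadata_path("s3://bucket/tbl/metadata", 2, "1a2b3c"));
}
```

If every requirement passes, the catalog still needs the atomic swap (lock or compare-and-swap) mentioned above, since another writer may commit between the check and the write.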
- Commit semantics
- MergeAppend appends new manifest list entries to existing manifest files. Reduces the amount of metadata produced, but takes some more time to commit since existing metadata has to be rewritten, and retries are also more costly: Support for MergeAppend #736
- Initial defaults This makes it easier when merge-appending V1 metadata: Implement `initial-default` #737
- FastAppend Generates a new manifest per commit, which allows fast commits, but generates more metadata in the long run. PR by @ZENOTME in feat: support append data file and add e2e test #349
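The trade-off between the two strategies can be illustrated with a toy model (this is not the real writer API): every commit writes one new manifest, and merge-append additionally compacts once more than a threshold of manifests accumulate:

```rust
// Toy model, not the real writer API: track the number of manifest files
// a table accumulates under each strategy.

// Fast append: always write one new manifest, never touch existing ones.
fn fast_append(manifests: &mut Vec<usize>, new_entries: usize) {
    manifests.push(new_entries);
}

// Merge append: write the new manifest, then compact all small manifests
// into one once more than `merge_min_count` exist. The compaction rewrites
// existing metadata (slower commits, costlier retries), but keeps the
// total amount of metadata bounded.
fn merge_append(manifests: &mut Vec<usize>, new_entries: usize, merge_min_count: usize) {
    manifests.push(new_entries);
    if manifests.len() > merge_min_count {
        let total: usize = manifests.drain(..).sum();
        manifests.push(total);
    }
}

fn main() {
    let (mut fast, mut merged) = (Vec::new(), Vec::new());
    for _ in 0..10 {
        fast_append(&mut fast, 1);
        merge_append(&mut merged, 1, 3);
    }
    // Ten commits: fast append leaves 10 manifests, merge append far fewer.
    println!("fast: {} manifests, merge: {} manifests", fast.len(), merged.len());
}
```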
- Snapshot generation Manipulation of data within a table is done by appending snapshots to the metadata JSON.
- APPEND Only data files were added and no files were removed. Similar to `add_files`.
- REPLACE Data and delete files were added and removed without changing table data; i.e., compaction, changing the data file format, or relocating data files.
- OVERWRITE Data and delete files were added and removed in a logical overwrite operation.
- DELETE Data files were removed and their contents were logically deleted and/or delete files were added to delete rows.
- Add files to add existing Parquet files to a table. Issue in Add files to add existing Parquet files to a table #932
- Name mapping in case the files don't have field-IDs set.
- Summary generations Part of the snapshot that indicates what's in the snapshot: Generation of Snapshot Summaries #724
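A minimal sketch of summary generation for an append snapshot is shown below, using summary property keys from the Iceberg spec ("operation", "added-data-files", and so on); the `DataFile` struct and the function name are illustrative, not the crate's API:

```rust
use std::collections::HashMap;

// Illustrative subset of a data file's stats; names are not the crate's API.
struct DataFile {
    record_count: u64,
    file_size_in_bytes: u64,
}

// Build the snapshot summary for an append, using property keys from the
// Iceberg spec ("operation", "added-data-files", "added-records", ...).
fn append_summary(
    added: &[DataFile],
    prev_total_records: u64,
    prev_total_files: u64,
) -> HashMap<String, String> {
    let added_records: u64 = added.iter().map(|f| f.record_count).sum();
    let added_size: u64 = added.iter().map(|f| f.file_size_in_bytes).sum();
    let mut summary = HashMap::new();
    summary.insert("operation".to_string(), "append".to_string());
    summary.insert("added-data-files".to_string(), added.len().to_string());
    summary.insert("added-records".to_string(), added_records.to_string());
    summary.insert("added-files-size".to_string(), added_size.to_string());
    summary.insert(
        "total-data-files".to_string(),
        (prev_total_files + added.len() as u64).to_string(),
    );
    summary.insert(
        "total-records".to_string(),
        (prev_total_records + added_records).to_string(),
    );
    summary
}

fn main() {
    let files = [
        DataFile { record_count: 100, file_size_in_bytes: 4096 },
        DataFile { record_count: 50, file_size_in_bytes: 2048 },
    ];
    let summary = append_summary(&files, 1000, 10);
    println!("{}", summary["added-records"]); // "150"
}
```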
- Metrics collection There are two situations:
- Collect metrics when writing This is done in the Java API, where during writing the upper and lower bounds are tracked and the numbers of null and NaN records are counted. Most of this is in, except the NaN counts: Implement nan_value_counts && distinct_counts metrics in parquet writer #417
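The per-column bookkeeping for these metrics can be sketched like this; `ColumnMetrics` is a hypothetical type, and a real writer keeps one such accumulator per Parquet column and serializes the bounds as bytes:

```rust
// Hypothetical per-column accumulator; a real writer keeps one of these per
// Parquet column and serializes the bounds as byte arrays.
#[derive(Default)]
struct ColumnMetrics {
    value_count: u64,
    null_count: u64,
    nan_count: u64,
    lower_bound: Option<f64>,
    upper_bound: Option<f64>,
}

impl ColumnMetrics {
    fn update(&mut self, value: Option<f64>) {
        self.value_count += 1;
        match value {
            None => self.null_count += 1,
            // NaN is counted separately and must never become a bound.
            Some(v) if v.is_nan() => self.nan_count += 1,
            Some(v) => {
                self.lower_bound = Some(self.lower_bound.map_or(v, |l| l.min(v)));
                self.upper_bound = Some(self.upper_bound.map_or(v, |u| u.max(v)));
            }
        }
    }
}

fn main() {
    let mut metrics = ColumnMetrics::default();
    for v in [Some(1.5), None, Some(f64::NAN), Some(-3.0)] {
        metrics.update(v);
    }
    println!(
        "nulls={} nans={} bounds={:?}..{:?}",
        metrics.null_count, metrics.nan_count, metrics.lower_bound, metrics.upper_bound
    );
}
```

Keeping NaN out of the bounds matters because query engines use the lower/upper bounds for file pruning, and a NaN bound would make comparisons meaningless.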
Related operations
These are not on the critical path to enable writes, but are related to it:
- Update table properties Sets properties on the table. Probably the best one to start with, since it doesn't require a complicated API.
- Schema evolution API to update the schema, and produce new metadata.
- Having the SchemaUpdate API to evolve the schema without the user having to worry about field-IDs: Add `SchemaUpdate` logic to Iceberg-Rust #697
- Add `unionByName` to easily union two schemas, to provide easy schema evolution: Update a TableSchema from a Schema #698
- Partition spec evolution API to update the partition spec, and produce new metadata: Partition Spec Evolution API #732.
- Sort order evolution API to update the sort order, and produce new metadata: Sort Order Replacement API #734.
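The core of a union-by-name operation can be sketched on a flat schema: fields already present keep their IDs, and fields only in the incoming schema get freshly assigned IDs, so the caller never hands out field-IDs directly. `Field` here is a toy type; real Iceberg schemas are nested:

```rust
// Toy flat schema; real Iceberg schemas are nested structs.
#[derive(Clone)]
struct Field {
    id: i32,
    name: String,
    type_name: String,
}

// Union `other` into `base` by name: fields already in `base` keep their
// IDs, fields only in `other` get freshly assigned IDs. The caller never
// assigns field IDs directly.
fn union_by_name(base: &[Field], other: &[Field]) -> Vec<Field> {
    let mut result = base.to_vec();
    let mut next_id = base.iter().map(|f| f.id).max().unwrap_or(0) + 1;
    for field in other {
        if !result.iter().any(|f| f.name == field.name) {
            result.push(Field { id: next_id, ..field.clone() });
            next_id += 1;
        }
    }
    result
}

fn main() {
    let base = vec![Field { id: 1, name: "id".into(), type_name: "long".into() }];
    let other = vec![
        Field { id: 7, name: "id".into(), type_name: "long".into() },
        Field { id: 8, name: "ts".into(), type_name: "timestamptz".into() },
    ];
    let merged = union_by_name(&base, &other);
    // "ts" is new and gets id 2; "id" keeps id 1.
    println!("{}", merged.len()); // 2
}
```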
Metadata tables
Metadata tables are used to inspect the table. Having these tables also allows easy implementation of the maintenance procedures since you can easily list all the snapshots, and expire the ones that are older than a certain threshold.
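On top of such a snapshot listing, expiration could look like the sketch below; `Snapshot` and `expire_snapshots` are illustrative names, and the sketch assumes the common rule that the current (latest) snapshot is always retained:

```rust
// Illustrative types; assumes the common expire-snapshots rule that the
// current (latest) snapshot is always retained.
struct Snapshot {
    snapshot_id: i64,
    timestamp_ms: i64,
}

// Return the IDs of snapshots older than `older_than_ms`, never including
// the most recent snapshot so the table stays readable.
fn expire_snapshots(snapshots: &[Snapshot], older_than_ms: i64) -> Vec<i64> {
    let latest = snapshots.iter().map(|s| s.timestamp_ms).max();
    snapshots
        .iter()
        .filter(|s| s.timestamp_ms < older_than_ms && Some(s.timestamp_ms) != latest)
        .map(|s| s.snapshot_id)
        .collect()
}

fn main() {
    let snapshots = [
        Snapshot { snapshot_id: 1, timestamp_ms: 100 },
        Snapshot { snapshot_id: 2, timestamp_ms: 200 },
        Snapshot { snapshot_id: 3, timestamp_ms: 300 },
    ];
    println!("{:?}", expire_snapshots(&snapshots, 250)); // [1, 2]
}
```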
Integration Tests
Integration tests with other engines, such as Spark.
Contribute
If you want to contribute to the upcoming milestone, feel free to comment on this issue. If there is anything unclear or missing, feel free to reach out here as well 👍