While thinking about #1971 and #1972, I realised that V3 introduces new fields to Snapshot - one required for V3 and the other not.
As it stands, it feels inelegant to add the V3 required field as an optional field on the Snapshot class and e.g. check within TableMetadata construction that it's present if the table is V3 (or just not do this at all). I think it might be nicer to encode that information within the typing (model), similar to the TableMetadataV3 excerpt below.
"""A long higher than all assigned row IDs; the next snapshot's `first-row-id`."""
I'm therefore wondering about "versioning" Snapshot similar to TableMetadata, so that V3 TableMetadata would contain a list of V3 Snapshots. Then, if V3 snapshot fields are present in V2 metadata, we'd get the benefit of an error being thrown, which is what I think is nice about PyIceberg's TableMetadata Union setup compared to other implementations.
(I've not fleshed out the details here so not certain this is feasible but dropping an issue for now. Perhaps this has already been discussed / thought about 😄)
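To make the idea concrete, here is a minimal sketch of what versioned snapshots could look like. It uses plain dataclasses rather than PyIceberg's actual pydantic models, and the names `SnapshotV2`, `SnapshotV3`, and `parse_snapshot` are hypothetical. It also assumes that `first-row-id` is the field that V3 requires.

```python
from dataclasses import dataclass
from typing import Any, Dict, Union


@dataclass(frozen=True)
class SnapshotV2:
    """Hypothetical pre-V3 snapshot: no row-lineage fields at all."""
    snapshot_id: int


@dataclass(frozen=True)
class SnapshotV3:
    """Hypothetical V3 snapshot: first_row_id is required, not Optional."""
    snapshot_id: int
    first_row_id: int


def parse_snapshot(format_version: int, raw: Dict[str, Any]) -> Union[SnapshotV2, SnapshotV3]:
    """Pick the snapshot class from the table's format version (illustrative only)."""
    if format_version >= 3:
        if raw.get("first-row-id") is None:
            raise ValueError("V3 snapshot is missing first-row-id")
        return SnapshotV3(raw["snapshot-id"], raw["first-row-id"])
    if "first-row-id" in raw:
        # A V3-only field in pre-V3 metadata is rejected instead of silently accepted.
        raise ValueError("first-row-id is not allowed before V3")
    return SnapshotV2(raw["snapshot-id"])
```

With this shape, the type itself says whether the field can be absent, and V3 fields that show up in V2 metadata fail at parse time.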
I'm therefore wondering about "versioning" Snapshot similar to TableMetadata, so that V3 TableMetadata would contain a list of V3 Snapshots.
The problem is that from the moment we upgrade a table from {V1,V2} to V3, the field is not there on the existing snapshots, so we would still run into deserialization issues. For simplicity, I'm leaning towards not versioning, because we would still need to check whether the fields are non-null, since they stay null after bumping the version to V3: https://iceberg.apache.org/spec/#row-lineage-for-upgraded-tables
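A rough sketch of the non-versioned alternative, again with plain dataclasses and hypothetical names (`Snapshot`, `snapshots_without_row_lineage`), not PyIceberg's real classes. It assumes that snapshots written before the upgrade simply carry no `first-row-id`, so the field has to stay optional and callers check it where it matters.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical single snapshot model shared by all format versions."""
    snapshot_id: int
    # Stays None for snapshots that were written before the table was upgraded to V3.
    first_row_id: Optional[int] = None


def snapshots_without_row_lineage(snapshots: List[Snapshot]) -> List[int]:
    """Return the IDs of snapshots that have no first_row_id assigned."""
    return [s.snapshot_id for s in snapshots if s.first_row_id is None]


# An upgraded table: snapshot 1 predates the upgrade, snapshot 2 was written after it.
upgraded = [Snapshot(1), Snapshot(2, first_row_id=0)]
missing = snapshots_without_row_lineage(upgraded)
```

Here a required `first_row_id` on a V3-only class would reject snapshot 1, even though the table metadata is valid V3 after the upgrade.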
Feature Request / Improvement
Excerpt source: iceberg-python/pyiceberg/table/metadata.py, lines 552 to 560 in 201057e