Change avro schema names for ManifestList and Manifest to the official ones #179

Closed
@zeodtr

Currently, the Avro schema (record) names in the source code are as follows:

  • ManifestList: "manifest_list"
  • Manifest: "manifest"

(Search for calls to the schema_to_avro_schema function to see them.)
However, the official names in the spec (https://iceberg.apache.org/spec/) are as follows:

  • ManifestList: "manifest_file"
  • Manifest: "manifest_entry"

Because of this difference, Iceberg Spark SQL (using spark-3.2.2-bin-hadoop3.2.tgz and iceberg-spark-runtime-3.2_2.12-1.4.3.jar) cannot read the manifest list files and manifest files created by iceberg-rust.
For example, the error message from Iceberg Spark SQL for ManifestFile is as follows:
org.apache.iceberg.shaded.org.apache.avro.generic.GenericData$Record cannot be cast to org.apache.iceberg.ManifestFile

After I changed the Avro schema names to the official ones, those errors went away.
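
To check which record name a given manifest list or manifest file actually carries, the Avro container header can be read directly. Below is a minimal sketch using only the Python standard library. It assumes the file is a regular Avro object container file (magic bytes, then a metadata map that holds the writer schema under the `avro.schema` key). The function names are illustrative and are not part of iceberg-rust or Iceberg.

```python
import io
import json

MAGIC = b"Obj\x01"  # first four bytes of every Avro object container file


def read_long(stream):
    """Decode one zigzag-encoded, variable-length Avro long."""
    shift = 0
    accum = 0
    while True:
        raw = stream.read(1)
        if not raw:
            raise EOFError("unexpected end of Avro header")
        byte = raw[0]
        accum |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear means this was the last byte
            break
        shift += 7
    return (accum >> 1) ^ -(accum & 1)  # undo zigzag encoding


def read_bytes(stream):
    """Read a length-prefixed byte string (Avro 'bytes' or 'string')."""
    length = read_long(stream)
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("unexpected end of Avro header")
    return data


def read_header_metadata(stream):
    """Return the file-level metadata map of an Avro container file."""
    if stream.read(4) != MAGIC:
        raise ValueError("not an Avro object container file")
    meta = {}
    while True:
        count = read_long(stream)
        if count == 0:  # a zero-count block terminates the map
            break
        if count < 0:  # negative count: a block byte size follows
            count = -count
            read_long(stream)  # byte size is not needed here
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            meta[key] = read_bytes(stream)
    return meta


def top_level_record_name(stream):
    """Return the 'name' of the writer schema, or None if it has none."""
    schema = json.loads(read_header_metadata(stream)["avro.schema"])
    return schema.get("name") if isinstance(schema, dict) else None
```

Opening a manifest list in binary mode and passing the file object to `top_level_record_name` should, per the names listed above, return "manifest_list" for a file written with the current names and "manifest_file" once the official name is used.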

Labels: bug
