Closed
Description
Currently, the Avro schema names in the source code are as follows:
- ManifestList: "manifest_list"
- Manifest: "manifest"
(Search for calls to the schema_to_avro_schema function to see them.)
But the official names in the spec (https://iceberg.apache.org/spec/) are as follows:
- ManifestList: "manifest_file"
- Manifest: "manifest_entry"
Because of this difference, Iceberg Spark SQL (using spark-3.2.2-bin-hadoop3.2.tgz and iceberg-spark-runtime-3.2_2.12-1.4.3.jar) cannot read the manifest list files and manifest files created by iceberg-rust.
For example, Iceberg Spark SQL's error message for ManifestFile is as follows:
org.apache.iceberg.shaded.org.apache.avro.generic.GenericData$Record cannot be cast to org.apache.iceberg.ManifestFile
When I changed the Avro schema names to the official ones, these errors went away.
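To make the intended mapping concrete, here is a minimal sketch of the naming described above. The enum and function names are hypothetical and are not iceberg-rust's actual API; the only values taken from this report are the two record names, "manifest_file" and "manifest_entry". The schema skeleton leaves the field list empty for brevity.

```rust
/// The two Avro metadata files discussed in this issue.
/// (Hypothetical type, used only for illustration.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MetadataFileKind {
    ManifestList,
    Manifest,
}

/// Returns the top-level Avro record name that the spec uses for each file kind,
/// as listed in this issue.
pub fn spec_record_name(kind: MetadataFileKind) -> &'static str {
    match kind {
        // Each record in a manifest list describes one manifest file.
        MetadataFileKind::ManifestList => "manifest_file",
        // Each record in a manifest describes one manifest entry.
        MetadataFileKind::Manifest => "manifest_entry",
    }
}

/// Builds the outer shell of an Avro record schema as JSON.
/// The real schemas carry many fields; they are omitted here.
pub fn record_schema_skeleton(kind: MetadataFileKind) -> String {
    format!(
        r#"{{"type":"record","name":"{}","fields":[]}}"#,
        spec_record_name(kind)
    )
}
```

Calling `record_schema_skeleton(MetadataFileKind::ManifestList)` yields a record named "manifest_file" rather than "manifest_list". In my testing, it was this record name that determined whether Spark could read the files.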