Partition subfolder is not showing #1404

Tanvo149 · 2025-06-01T21:36:22Z

I've been learning Rust and Iceberg recently and ran into an issue while writing Parquet files: the writer doesn't seem to create a partition subdirectory as expected. I'm using the REST Catalog locally as my Iceberg catalog.

It writes to "/tmp/iceberg_warehouse255034623058200441/iceberg_data/cat/t1/data" instead of "/tmp/iceberg_warehouse255034623058200441/iceberg_data/cat/t1/data/[partition]...".
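To make the expected layout concrete, here is a minimal standalone sketch of how a Hive-style partition directory could be composed under the table's data folder. The helper `partitioned_data_dir`, the `/tmp/wh/cat/t1` location, and the `id=1` partition in the usage note are hypothetical illustrations; none of this is part of the iceberg-rust API, and it does not claim to be how the library builds locations.

```rust
/// Hypothetical helper (not an iceberg-rust function): builds
/// `<table_location>/data/<field>=<value>/...` from partition pairs.
fn partitioned_data_dir(table_location: &str, partition: &[(&str, &str)]) -> String {
    // Start from the unpartitioned data directory the post reports seeing.
    let mut dir = format!("{}/data", table_location.trim_end_matches('/'));
    // Append one `field=value` segment per partition field, in order.
    for (field, value) in partition {
        dir.push('/');
        dir.push_str(field);
        dir.push('=');
        dir.push_str(value);
    }
    dir
}
```

With an assumed table location of `/tmp/wh/cat/t1` and an assumed partition field `id` with value `1`, this yields `/tmp/wh/cat/t1/data/id=1`; with no partition fields it falls back to the plain `data` directory described above.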

Am I doing something wrong? Also, how can I write the Parquet files partitioned by the column values instead of passing a partition_value?

I looked through the GitHub repository for an example and tried to mirror it.

Thank you for taking the time to read this.

    // Parquet writer settings: Snappy compression and a custom created_by tag.
    let props = WriterProperties::builder()
        .set_compression(parquet::basic::Compression::SNAPPY)
        .set_created_by("Data".to_string())
        .build();

    let parquet_writer_builder = ParquetWriterBuilder::new(
        props,
        _created_table.metadata().current_schema().clone(),
        _created_table.file_io().clone(),
        location_generator.clone(),
        file_name_generator.clone(),
    );

    // Partition value handed to the data file writer.
    let partition_value = Struct::from_iter([Some(Literal::int(1))]);

    let mut data_file_writer =
        DataFileWriterBuilder::new(parquet_writer_builder, Some(partition_value.clone()))
            .build()
            .await?;

    // Arrow schema; each field carries its Parquet field ID as metadata.
    let arrow_schema = arrow_schema::Schema::new(vec![
        Field::new("name", DataType::Utf8, false).with_metadata(HashMap::from([(
            PARQUET_FIELD_ID_META_KEY.to_string(),
            1.to_string(),
        )])),
        Field::new("id", DataType::Int32, false).with_metadata(HashMap::from([(
            PARQUET_FIELD_ID_META_KEY.to_string(),
            2.to_string(),
        )])),
        Field::new("partition_date", DataType::Date32, false).with_metadata(HashMap::from([(
            PARQUET_FIELD_ID_META_KEY.to_string(),
            3.to_string(),
        )])),
    ]);

    let date_strs = vec!["2024-01-01", "2024-01-01", "2024-01-01"]; // Added a duplicate for testing grouping
    let date32_values: Vec<i32> = date_strs.iter().map(|s| to_date32(s)).collect();

    // Build a three-row batch and write it through the data file writer.
    let batch = RecordBatch::try_new(Arc::new(arrow_schema.clone()), vec![
        Arc::new(StringArray::from(vec!["Alice", "Bob", "Charlie"])),
        Arc::new(Int32Array::from(vec![1, 1, 1])),
        Arc::new(Date32Array::from(date32_values.clone())),
    ])?;
    data_file_writer.write(batch).await?;
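
The snippet calls a `to_date32` helper that the post does not show. As a rough sketch of what such a helper might look like, assuming it turns a `YYYY-MM-DD` string into days since 1970-01-01 (the value Arrow's `Date32` stores), the version below uses only the standard library. The body, the internal `days_from_civil` function, and the panic-on-bad-input behaviour are assumptions, not the original author's code.

```rust
/// Days from 1970-01-01 to the given proleptic Gregorian date
/// (standard "days from civil" arithmetic; March is treated as month 0).
fn days_from_civil(year: i64, month: i64, day: i64) -> i64 {
    // Shift January and February to the end of the previous year.
    let y = if month <= 2 { year - 1 } else { year };
    let era = y.div_euclid(400);
    let year_of_era = y.rem_euclid(400);
    let shifted_month = (month + 9) % 12;
    let day_of_year = (153 * shifted_month + 2) / 5 + day - 1;
    let day_of_era = year_of_era * 365 + year_of_era / 4 - year_of_era / 100 + day_of_year;
    era * 146_097 + day_of_era - 719_468
}

/// Assumed shape of the post's `to_date32`: parses "YYYY-MM-DD" and returns
/// days since the Unix epoch. Panics on malformed input; it does not
/// validate that the day exists in the given month.
fn to_date32(s: &str) -> i32 {
    let mut parts = s.split('-').map(|p| p.parse::<i64>().expect("numeric date part"));
    let year = parts.next().expect("missing year");
    let month = parts.next().expect("missing month");
    let day = parts.next().expect("missing day");
    days_from_civil(year, month, day) as i32
}
```

In real code a date library such as chrono would likely be a safer choice, since it also rejects impossible dates.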

Replies: 0 comments

Select a reply

Uh oh!

Partition subfolder is not showing #1404

Uh oh!

Tanvo149 Jun 1, 2025

Replies: 0 comments

Tanvo149
Jun 1, 2025