Avro nullable field decode failure leads to panic upon decoder flush #8212

@yongkyunlee

Describe the bug

When decoding Avro data with nullable fields, the decoder panics in flush() if any earlier decode() call failed on a malformed record. The panic occurs even though decode() correctly returned an error for that record and the subsequent valid records decoded successfully.

To Reproduce

use arrow_avro::reader::*;

// Create a nullable Int32 decoder with NullSecond ordering
let avro_type = AvroDataType::new(
    Codec::Int32,
    Default::default(),
    Some(Nullability::NullSecond),
);
let mut decoder = Decoder::try_new(&avro_type).unwrap();

// Row 1: Valid null value (branch = 1 for NullSecond)
let row1 = vec![0x02]; // zigzag varint encoding of 1

// Row 2: Invalid non-null - branch indicates non-null but the int32 payload is missing
let row2_malformed = vec![0x00]; // zigzag varint encoding of 0, but no following int32

// Row 3: Valid non-null value
let row3 = vec![0x00, 0x54]; // branch 0 + zigzag varint encoding of 42

// Process rows
decoder.decode(&mut AvroCursor::new(&row1)).unwrap(); // Success: null
assert!(decoder.decode(&mut AvroCursor::new(&row2_malformed)).is_err()); // Error: incomplete
decoder.decode(&mut AvroCursor::new(&row3)).unwrap(); // Success: 42

// This panics with the buggy code due to bitmap/values mismatch
let array = decoder.flush(None).unwrap(); // PANIC!
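
For readers checking the byte values used in the rows above: Avro writes int and long values (including union branch indices) as zigzag-encoded varints. The following is a minimal standalone sketch of that encoding; `encode_zigzag_varint` is a hypothetical helper written for illustration and is not part of arrow_avro.

```rust
/// Hypothetical helper (not an arrow_avro API): zigzag-encode a signed value,
/// then write it as a base-128 varint, least significant group first.
fn encode_zigzag_varint(value: i64) -> Vec<u8> {
    // Zigzag maps 0, -1, 1, -2, 2, ... to 0, 1, 2, 3, 4, ...
    let mut z = ((value << 1) ^ (value >> 63)) as u64;
    let mut out = Vec::new();
    loop {
        let byte = (z & 0x7f) as u8;
        z >>= 7;
        if z == 0 {
            // Last group: continuation bit stays clear.
            out.push(byte);
            return out;
        }
        // More groups follow: set the continuation bit.
        out.push(byte | 0x80);
    }
}
```

With this helper, 1 encodes to 0x02, 0 to 0x00, and 42 to 0x54, which matches the bytes in row1, row2_malformed, and row3.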

Expected behavior

The decoder should maintain internal consistency even when individual record decodes fail. After processing the three rows above:

  • flush() should succeed and return an array with 2 elements: [null, 42]
  • Row2's decode error should not corrupt the decoder's state
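
One way to get this behavior is to read the whole payload before touching any per-row state, so a failed decode leaves the validity and value buffers at the same length. The sketch below illustrates that idea only. It assumes the mismatch arises because the validity bit is recorded before the payload is read, which this report does not confirm, and `NullableInt32Decoder` and `read_zigzag` are hypothetical stand-ins rather than arrow_avro types.

```rust
/// Hypothetical stand-in for a nullable Int32 decoder; not the arrow_avro implementation.
#[derive(Default)]
struct NullableInt32Decoder {
    validity: Vec<bool>, // one entry per decoded row; true = non-null
    values: Vec<i32>,    // one slot per decoded row (placeholder 0 for nulls)
}

impl NullableInt32Decoder {
    /// Decode one row with NullSecond ordering: branch 0 = value, branch 1 = null.
    fn decode(&mut self, buf: &[u8]) -> Result<(), String> {
        let mut pos = 0;
        match read_zigzag(buf, &mut pos)? {
            0 => {
                // Read and validate the payload first; on error nothing has been pushed.
                let raw = read_zigzag(buf, &mut pos)?;
                let v = i32::try_from(raw).map_err(|_| "int out of range".to_string())?;
                self.validity.push(true);
                self.values.push(v);
            }
            1 => {
                self.validity.push(false);
                self.values.push(0);
            }
            other => return Err(format!("invalid union branch {other}")),
        }
        Ok(())
    }

    /// Drain the buffered rows; the two buffers must have equal length here.
    fn flush(&mut self) -> Vec<Option<i32>> {
        assert_eq!(self.validity.len(), self.values.len());
        let validity = std::mem::take(&mut self.validity);
        let values = std::mem::take(&mut self.values);
        validity.into_iter().zip(values).map(|(ok, v)| ok.then_some(v)).collect()
    }
}

/// Read one zigzag varint from `buf` starting at `*pos`, advancing `*pos`.
fn read_zigzag(buf: &[u8], pos: &mut usize) -> Result<i64, String> {
    let mut raw: u64 = 0;
    let mut shift = 0u32;
    loop {
        let byte = *buf.get(*pos).ok_or_else(|| "unexpected end of input".to_string())?;
        *pos += 1;
        if shift >= 64 {
            return Err("varint too long".to_string());
        }
        raw |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            break;
        }
        shift += 7;
    }
    // Undo the zigzag mapping.
    Ok(((raw >> 1) as i64) ^ -((raw & 1) as i64))
}
```

Feeding this sketch the three rows from the reproduction gives an error for row 2 and a flush result of [None, Some(42)]. An alternative with the same effect is to record the buffer lengths before decoding and truncate back to them on error.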

Additional context

This bug affects production systems processing Avro data with nullable fields. When malformed or truncated Avro data is encountered, instead of gracefully handling the error and continuing with valid records, the decoder panics during flush due to internal state corruption.
