Closed
Description
Describe the bug
When decoding Avro data with nullable fields, the decoder panics during flush() if any intermediate records fail to decode. This occurs even though the decoder correctly returns an error for the malformed record and successfully processes subsequent valid records.
To Reproduce
use arrow_avro::reader::*;
// Create a nullable Int32 decoder with NullSecond ordering
let avro_type = AvroDataType::new(
    Codec::Int32,
    Default::default(),
    Some(Nullability::NullSecond),
);
let mut decoder = Decoder::try_new(&avro_type).unwrap();
// Row 1: Valid null value (branch = 1 for NullSecond)
let row1 = vec![0x02]; // zigzag varint encoding of 1
// Row 2: Invalid non-null - branch indicates non-null but missing int32 payload
let row2_malformed = vec![0x00]; // zigzag varint encoding of 0, but no following int32
// Row 3: Valid non-null value
let row3 = vec![0x00, 0x54]; // branch 0 + zigzag varint encoding of 42
// Process rows
decoder.decode(&mut AvroCursor::new(&row1)).unwrap(); // Success: null
assert!(decoder.decode(&mut AvroCursor::new(&row2_malformed)).is_err()); // Error: incomplete
decoder.decode(&mut AvroCursor::new(&row3)).unwrap(); // Success: 42
// This panics with the buggy code due to bitmap/values mismatch
let array = decoder.flush(None).unwrap(); // PANIC!
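The byte values used in the rows above follow Avro's int/long encoding, which is zigzag followed by a base-128 varint. The following is a small standalone sketch for checking those bytes by hand; `encode_avro_long` is a hypothetical helper written for this illustration and is not part of `arrow_avro`.

```rust
/// Encode an i64 the way Avro encodes int/long values:
/// zigzag first, then a little-endian base-128 varint.
/// Hypothetical helper for illustration only.
fn encode_avro_long(value: i64) -> Vec<u8> {
    // Zigzag maps small magnitudes (positive or negative) to small unsigned numbers.
    let mut n = ((value << 1) ^ (value >> 63)) as u64;
    let mut out = Vec::new();
    loop {
        let byte = (n & 0x7F) as u8;
        n >>= 7;
        if n == 0 {
            // Last byte: continuation bit stays clear.
            out.push(byte);
            return out;
        }
        // More bytes follow: set the continuation bit.
        out.push(byte | 0x80);
    }
}
```

With this helper, branch index 1 encodes to `0x02`, branch index 0 to `0x00`, and the value 42 to `0x54`, matching the bytes in the reproduction.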
Expected behavior
The decoder should maintain internal consistency even when individual record decodes fail. After processing the three rows above:
- flush() should succeed and return an array with 2 elements: [null, 42]
- Row2's decode error should not corrupt the decoder's state
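One way to get the expected behavior is to parse a row completely before touching any builder state, so a failed decode leaves the validity bitmap and the values buffer at the same length. The sketch below illustrates that idea with a toy decoder; it is not the `arrow_avro` implementation, and `NullableInt32Decoder`, `DecodeError`, `read_avro_long`, and `replay_issue_rows` are all hypothetical names introduced for this example.

```rust
#[derive(Debug, PartialEq)]
enum DecodeError {
    UnexpectedEof,
    VarintTooLong,
    BadBranch(i64),
}

/// Read one zigzag varint (Avro int/long) starting at `pos`.
fn read_avro_long(buf: &[u8], pos: &mut usize) -> Result<i64, DecodeError> {
    let mut n: u64 = 0;
    let mut shift = 0u32;
    loop {
        let byte = *buf.get(*pos).ok_or(DecodeError::UnexpectedEof)?;
        *pos += 1;
        n |= u64::from(byte & 0x7F) << shift;
        if byte & 0x80 == 0 {
            break;
        }
        shift += 7;
        if shift >= 64 {
            return Err(DecodeError::VarintTooLong);
        }
    }
    // Undo zigzag.
    Ok(((n >> 1) as i64) ^ -((n & 1) as i64))
}

/// Toy nullable Int32 decoder with NullSecond ordering
/// (branch 0 = value, branch 1 = null).
#[derive(Default)]
struct NullableInt32Decoder {
    validity: Vec<bool>,
    values: Vec<i32>,
}

impl NullableInt32Decoder {
    fn decode(&mut self, row: &[u8]) -> Result<(), DecodeError> {
        let mut pos = 0;
        // Parse the whole row first; `?` returns before any state is mutated.
        let value = match read_avro_long(row, &mut pos)? {
            0 => Some(read_avro_long(row, &mut pos)? as i32),
            1 => None,
            other => return Err(DecodeError::BadBranch(other)),
        };
        // Commit both buffers together so their lengths never diverge.
        self.validity.push(value.is_some());
        self.values.push(value.unwrap_or_default());
        Ok(())
    }

    fn flush(&mut self) -> Vec<Option<i32>> {
        assert_eq!(self.validity.len(), self.values.len());
        let validity = std::mem::take(&mut self.validity);
        let values = std::mem::take(&mut self.values);
        validity
            .into_iter()
            .zip(values)
            .map(|(is_valid, v)| is_valid.then_some(v))
            .collect()
    }
}

/// Replays the three rows from the reproduction above.
fn replay_issue_rows() -> Vec<Option<i32>> {
    let mut decoder = NullableInt32Decoder::default();
    decoder.decode(&[0x02]).unwrap(); // null
    assert!(decoder.decode(&[0x00]).is_err()); // truncated payload
    decoder.decode(&[0x00, 0x54]).unwrap(); // 42
    decoder.flush()
}
```

An alternative with the same effect would be to record the buffer lengths before decoding and truncate back to them on error; the parse-then-commit form is used here only because it is shorter.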
Additional context
This bug affects production systems processing Avro data with nullable fields. When malformed or truncated Avro data is encountered, instead of gracefully handling the error and continuing with valid records, the decoder panics during flush due to internal state corruption.