Skip to content

Reduce copying in CoalesceBatchesExec for StringViews #11628

Open
@alamb

Description

@alamb

Is your feature request related to a problem or challenge?

In pictures, what #11587 does is like
this (to ensure lots of unreachable "garbage" does not accumulate in the output batch)

┌────────────────────┐
│    RecordBatch     │                ┌────────────────────┐
│   num_rows = 23    │                │    RecordBatch     │
└────────────────────┘                │   num_rows = 23    │              ┌────────────────────┐
                                      └────────────────────┘              │                    │
┌────────────────────┐                                        Coalesce    │                    │
│                    │ StringView::gc ┌────────────────────┐   Batches    │                    │
│    RecordBatch     │                │    RecordBatch     │              │                    │
│   num_rows = 50    │                │   num_rows = 50    │  ─ ─ ─ ─ ─▶  │                    │
│                    │    ─ ─ ─ ─ ─▶  │                    │              │    RecordBatch     │
│                    │                └────────────────────┘              │   num_rows = 106   │
└────────────────────┘                                                    │                    │
                                                                          │                    │
┌────────────────────┐                ┌────────────────────┐              │                    │
│                    │                │    RecordBatch     │              │                    │
│    RecordBatch     │                │   num_rows = 33    │              │                    │
│   num_rows = 33    │                │                    │              └────────────────────┘
│                    │                └────────────────────┘
└────────────────────┘

However, as @2010YOUY01 pointed out in https://github.com/apache/datafusion/pull/11587/files#r1686678665

So here inside gc string buffer will be copied once, (below) in
concat_batches() string buffer will be copied again, it seems possible to copy
only once by changing the internal implementation of concat_batches()

This implementation will effectively copy the data twice -- once for the call to
gc and once for the call coalsece batches.

Due to the nature of StringView the actual strings vaules are only copied once, but the u128 view value will be copied twice

Describe the solution you'd like

Somehow structure the code to avoid copying the views again. Like this

┌────────────────────┐                                       
│    RecordBatch     │                                       
│   num_rows = 23    │                 ┌────────────────────┐
└────────────────────┘                 │                    │
                       StringView::gc  │                    │
┌────────────────────┐  and Coalesce   │                    │
│                    │ Batches in same │                    │
│    RecordBatch     │    operation    │                    │
│   num_rows = 50    │                 │    RecordBatch     │
│                    │    ─ ─ ─ ─ ─▶   │   num_rows = 106   │
│                    │                 │                    │
└────────────────────┘                 │                    │
                                       │                    │
┌────────────────────┐                 │                    │
│                    │                 │                    │
│    RecordBatch     │                 └────────────────────┘
│   num_rows = 33    │                                       
│                    │                                       
└────────────────────┘                                       

Describe alternatives you've considered

https://github.com/apache/datafusion/pull/11587/files#r1687099239

I think given how concat is implemented for StringView it will only copy the fixed parts (not the actual string data)

Perhaps what we could do is implement a wrapper around arrow::concat_batches that has the datafusion specific GC trigger for sparse arrays, and falls back to concat for other types: https://docs.rs/arrow-select/52.1.0/src/arrow_select/concat.rs.html#150

/// wrapper around [`arrow::compute::concat`] that 
pub fn concat(arrays: &[&dyn Array]) -> Result<ArrayRef, ArrowError> {
 // loop over columns here and handle StringView specially, 
 // or fallback to concat
 }

Additional context

#7957 is another related idea for avoding copies

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions