`RowSelection::and_then` is slow #7458
Labels: enhancement (any new improvement worthy of an entry in the changelog), parquet (changes to the parquet crate)
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
This ticket records the symptoms reported by @mbutrovich on Discord, where they see inconsistent performance. The root cause appears to be allocations made while computing the `RowSelection` needed to evaluate multiple predicates.
Background:
`RowSelection::and_then` is used to combine the results of multiple `ArrowPredicate`s in a `RowFilter` -- see the source of `RowSelection::and_then`.
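
To make the semantics concrete, here is a minimal sketch of the kind of combination `and_then` performs. It is a simplified model and not the parquet crate's code: `Run`, `push_run`, and the free function `and_then` are hypothetical stand-ins, and treating rows left uncovered by the second selection as skipped is an assumption of this example. The first selection covers every row; the second covers only the rows the first selected.

```rust
/// Hypothetical stand-in for a row selector: a run of consecutive rows
/// that is either skipped or selected.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Run {
    row_count: usize,
    skip: bool,
}

impl Run {
    fn select(row_count: usize) -> Self {
        Run { row_count, skip: false }
    }

    fn skip(row_count: usize) -> Self {
        Run { row_count, skip: true }
    }
}

/// Append a run, merging it into the previous run when both have the same
/// kind. Zero-length runs are dropped.
fn push_run(out: &mut Vec<Run>, run: Run) {
    if run.row_count == 0 {
        return;
    }
    match out.last_mut() {
        Some(last) if last.skip == run.skip => last.row_count += run.row_count,
        _ => out.push(run),
    }
}

/// Combine `first` (over all rows) with `second` (over the rows that
/// `first` selected) into a single selection over all rows.
fn and_then(first: &[Run], second: &[Run]) -> Vec<Run> {
    // A fresh Vec is allocated on every call in this model.
    let mut out = Vec::new();
    let mut rest = second.iter().copied();
    let mut current: Option<Run> = rest.next();

    for run in first {
        if run.skip {
            // Rows skipped by `first` stay skipped and consume nothing
            // from `second`.
            push_run(&mut out, *run);
            continue;
        }
        // Rows selected by `first` are split according to `second`.
        let mut remaining = run.row_count;
        while remaining > 0 {
            match current {
                Some(mut cur) => {
                    let n = remaining.min(cur.row_count);
                    push_run(&mut out, Run { row_count: n, skip: cur.skip });
                    remaining -= n;
                    cur.row_count -= n;
                    current = if cur.row_count == 0 { rest.next() } else { Some(cur) };
                }
                None => {
                    // Assumption: rows not covered by `second` are skipped.
                    push_run(&mut out, Run::skip(remaining));
                    remaining = 0;
                }
            }
        }
    }
    out
}
```

For example, with `N` for a skipped row and `Y` for a selected row, combining `NNYYYYNYYY` with a second selection `YNNYYYN` over the seven selected rows gives `NNYNNYNYYN`. In this model every call builds a new `Vec`, so evaluating several predicates in sequence creates one intermediate selection per predicate.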
Describe the solution you'd like
I would like the combination of multiple `RowSelection`s to go faster.
Describe alternatives you've considered
Some suggestions from @Dandandan in discord:
Here is one idea for better representing `RowSelection` instead of `Vec<RowSelector>`
`skip` from `RowSelector` #7450
Additional context