
HashAggregate performance improvements #418

@ravlio


Hello! I'm just learning Rust, so I may be wrong; please correct me if so. I noticed that the key-creation code at https://github.com/apache/arrow-datafusion/blob/174226c086a4838eab2a238853b4871c295c0189/datafusion/src/physical_plan/hash_aggregate.rs#L514 is called for every row, and that it does a match and a downcast for every grouping key on every row. That should add a lot of extra CPU instructions (I haven't measured it, this is just my impression). Isn't this approach suboptimal? For instance, we could use generics or a macro here. Thanks.
(couldn't set "question" label)
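
To make the suggestion concrete, here is a minimal sketch of the two shapes being compared. It is not the DataFusion code: `Column`, `DataType`, `KeyBytes`, `keys_per_row`, and `keys_hoisted` are hypothetical names, and the columns are plain `Vec`s behind `std::any::Any` rather than Arrow arrays. It also assumes every column holds exactly `rows` values. The first function matches and downcasts inside the row loop; the second matches and downcasts once per column and then runs a generic, monomorphized loop.

```rust
use std::any::Any;

// Hypothetical stand-ins for a type tag and a dynamically typed column.
#[derive(Clone, Copy)]
enum DataType {
    Int32,
    Int64,
}

struct Column {
    data_type: DataType,
    values: Box<dyn Any>, // holds a Vec<i32> or a Vec<i64>
}

// Shape described in the issue: match + downcast for every key on every row.
fn keys_per_row(columns: &[Column], rows: usize) -> Vec<Vec<u8>> {
    let mut keys = Vec::with_capacity(rows);
    for row in 0..rows {
        let mut key = Vec::new();
        for col in columns {
            match col.data_type {
                DataType::Int32 => {
                    let v = col.values.downcast_ref::<Vec<i32>>().expect("Int32 column");
                    key.extend_from_slice(&v[row].to_le_bytes());
                }
                DataType::Int64 => {
                    let v = col.values.downcast_ref::<Vec<i64>>().expect("Int64 column");
                    key.extend_from_slice(&v[row].to_le_bytes());
                }
            }
        }
        keys.push(key);
    }
    keys
}

// Per-type serialization, so the inner loop can be generic.
trait KeyBytes: Copy {
    fn append_to(self, key: &mut Vec<u8>);
}

impl KeyBytes for i32 {
    fn append_to(self, key: &mut Vec<u8>) {
        key.extend_from_slice(&self.to_le_bytes());
    }
}

impl KeyBytes for i64 {
    fn append_to(self, key: &mut Vec<u8>) {
        key.extend_from_slice(&self.to_le_bytes());
    }
}

// Appends one column's values to the keys; no type dispatch inside the loop.
fn append_column<T: KeyBytes>(values: &[T], keys: &mut [Vec<u8>]) {
    for (key, &value) in keys.iter_mut().zip(values) {
        value.append_to(key);
    }
}

// Suggested shape: match + downcast once per column, outside the row loop.
fn keys_hoisted(columns: &[Column], rows: usize) -> Vec<Vec<u8>> {
    let mut keys: Vec<Vec<u8>> = vec![Vec::new(); rows];
    for col in columns {
        match col.data_type {
            DataType::Int32 => {
                let v = col.values.downcast_ref::<Vec<i32>>().expect("Int32 column");
                append_column(v.as_slice(), &mut keys);
            }
            DataType::Int64 => {
                let v = col.values.downcast_ref::<Vec<i64>>().expect("Int64 column");
                append_column(v.as_slice(), &mut keys);
            }
        }
    }
    keys
}
```

Both functions build the same keys; only the number of dispatches differs (rows × columns versus columns). Whether that difference matters in practice would need a benchmark, since the branch in the per-row version is highly predictable.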


Labels: bug (Something isn't working)
