Description
Is your feature request related to a problem or challenge?
- Part of [EPIC] Extract remaining physical optimizer out of core #11502
- Related to Building project takes a *long* time (esp compilation time for `datafusion` core crate) #13814
Historically DataFusion was one (very) large crate, `datafusion`, and as it grew bigger we extracted various functionality into separate crates. This leads to both faster compile times (as the crates can be compiled in parallel) and code that is easier to navigate (as the crates force a cleaner dependency separation).
One project we have not yet completed is extracting the physical optimizer passes out of the core crate (#11502):
- The original physical optimizers are here https://github.com/apache/datafusion/tree/main/datafusion/core/src/physical_optimizer
- The new crate is here https://github.com/apache/datafusion/tree/main/datafusion/physical-optimizer
Describe the solution you'd like
Extract `ProjectionPushdown` from the `datafusion` core crate to the `datafusion-physical-optimizer` crate.
Describe alternatives you've considered
Move the code
- From https://github.com/apache/datafusion/blob/main/datafusion/core/src/physical_optimizer/projection_pushdown.rs
- To https://github.com/apache/datafusion/tree/main/datafusion/physical-optimizer/src/projection_pushdown.rs
Note that due to dependencies (e.g. on `SessionContext` or functions), you may have to move the tests into the `core_integration` tests here:
Additional context
Here are some example PRs:
- chore: move `SanityChecker` into `physical-optimizer` crate #14083
- Move JoinSelection into datafusion-physical-optimizer crate (#14073) #14085
- Move `Pruning` into `physical-optimizer` crate #13485
I think this is a good first issue as it is mostly mechanical, has several good examples to follow, and does not require deep knowledge of the internals.