SanityChecker rejects certain valid UNION plans  #12446

@alamb

Describe the bug

A regression affects one very specific circumstance: UNION queries over sorted data with constant predicates. Such a query now fails with a SanityCheckPlan error when it should complete.

To Reproduce

@wiedld found a reproducer as part of #12414

c2e652e

# Test: inputs into union with different orderings
query TT
explain select * from (select b, c, a, NULL::int as a0 from ordered_table order by a, c) t1
union all
select * from (select b, c, NULL::int as a, a0 from ordered_table order by a0, c) t2
order by d, c, a, a0, b
limit 2;
----
logical_plan
01)Projection: t1.b, t1.c, t1.a, t1.a0
02)--Sort: t1.d ASC NULLS LAST, t1.c ASC NULLS LAST, t1.a ASC NULLS LAST, t1.a0 ASC NULLS LAST, t1.b ASC NULLS LAST, fetch=2
03)----Union
04)------SubqueryAlias: t1
05)--------Projection: ordered_table.b, ordered_table.c, ordered_table.a, Int32(NULL) AS a0, ordered_table.d
06)----------TableScan: ordered_table projection=[a, b, c, d]
07)------SubqueryAlias: t2
08)--------Projection: ordered_table.b, ordered_table.c, Int32(NULL) AS a, ordered_table.a0, ordered_table.d
09)----------TableScan: ordered_table projection=[a0, b, c, d]

# Test: run the query from above
# TODO: query fails since the constant columns t1.a0 and t2.a are not in the ORDER BY subquery,
# and SanityCheckPlan does not allow this.
statement error DataFusion error: SanityCheckPlan
select * from (select b, c, a, NULL::int as a0 from ordered_table order by a, c) t1
union all
select * from (select b, c, NULL::int as a, a0 from ordered_table order by a0, c) t2
order by d, c, a, a0, b
limit 2;

statement ok
drop table ordered_table;

Expected behavior

The query should run to completion instead of failing with a SanityCheckPlan error.
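
The TODO comment in the reproducer attributes the failure to constant columns (t1.a0 and t2.a) that are missing from the subquery ORDER BY. As a toy illustration of why such a plan can still be valid, the sketch below checks an ordering requirement while skipping columns known to be constant. This is not DataFusion's implementation: the function name `satisfies_ordering` and its parameters are hypothetical, and it models orderings as plain lists of column names, ignoring sort direction and null ordering.

```python
def satisfies_ordering(provided, required, constants=frozenset()):
    """Return True if rows sorted by `provided` are also sorted by `required`.

    Hypothetical helper for illustration only. Columns in `constants` hold a
    single value in every row, so they cannot affect the row order and are
    dropped from both lists before comparing.
    """
    provided = [col for col in provided if col not in constants]
    required = [col for col in required if col not in constants]
    # The requirement holds when it is a prefix of the provided ordering.
    return provided[:len(required)] == required


# t1 is sorted by (a, c); a0 is NULL in every row of t1.
print(satisfies_ordering(["a", "c"], ["a", "a0", "c"]))          # constants unknown
print(satisfies_ordering(["a", "c"], ["a", "a0", "c"], {"a0"}))  # a0 known constant
```

Under these assumptions, the first call rejects the requirement and the second accepts it, which mirrors the difference between a check that ignores constant columns and one that does not.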

Additional context

We believe this was uncovered by #11196. The error in the sort order calculation has existed for a while, but #11196 exposed it.

This was released in 40.0.0: https://github.com/apache/datafusion/blob/main/dev/changelog/40.0.0.md

Labels

bug (Something isn't working), regression (Something that used to work no longer does)
