-
Notifications
You must be signed in to change notification settings - Fork 6.2k
planner: missing outer join elimination for row_number top1 derived table #67517
Description
Bug Report
1. Minimal reproduce
use test;
drop table if exists t1, t2;
create table t1 (a int, b int, c int);
create table t2 (a int, b int, c int, key(a));
explain format = 'brief'
select t1.a
from t1
left join (
select
a,
row_number() over(partition by a order by c desc) as rn
from t2
) t2w
on t1.a = t2w.a and t2w.rn = 1
where t1.a = 1;2. Current behavior
TiDB keeps the LeftOuterJoin and still plans the right side window subtree, even though:
- the query output does not use any column from the right side
row_number() over(partition by a ...)withrn = 1guarantees at most one row pera- therefore the
LEFT JOINis semantically redundant
3. Expected behavior
The planner should eliminate the redundant LEFT JOIN and keep only the left-side scan/filter.
4. Why this matters
For this query pattern, failing to eliminate the join can leave an unnecessary Window + Sort + Scan subtree in the plan and significantly increase planning/execution cost.
5. Root cause
outer_join_eliminate currently recognizes uniqueness from schema key information, but it does not recognize the derived-table pattern:
row_number() over(partition by join_key order by ...)- filtered by
rn = 1orrn <= 1
That pattern also implies uniqueness on the partition key for join elimination purposes.
6. Suggested fix
Teach outer join elimination to treat the following right-side shape as unique on the partition key:
Selection- over
LogicalWindow - with a single
row_number()window function - partitioned by the join key
- constrained by
rn = 1orrn <= 1
7. Validation
Add a planner regression test that checks the join is eliminated for the anonymized query shape above.