Extend dynamic filter pushdown to Left and LeftSemi hash joins #20447
helgikrs wants to merge 1 commit into apache:main
  fn allow_join_dynamic_filter_pushdown(&self, config: &ConfigOptions) -> bool {
-     if self.join_type != JoinType::Inner
+     if !matches!(self.join_type, JoinType::Inner | JoinType::Left | JoinType::LeftSemi)
Can JoinType::Right and JoinType::RightSemi also be applied based on the same rationale?
I don't think so. The build side is always the left side and the probe side is always the right side, so the filters are always pushed to the right side. A Right join must retain all rows from the right side, so there's nothing we can push down.
A Right join query can still take advantage of this if the optimizer decides to swap the sides so that the original right input becomes the build side; in that case it would become a Left join here.
Actually I think it might be safe to add RightSemi and LeftAnti to this list.
The dynamic filter only removes probe rows that can't match any build row, and neither RightSemi (which outputs only matched probe rows) nor LeftAnti (which outputs only unmatched build rows) includes unmatched probe rows in its output. So I think it would be safe to add those here as well. Right and RightAnti would not be.
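The rule in this comment can be sketched as a small predicate. This is only an illustration of the reasoning, not the code in the PR: `JoinKind`, `emits_unmatched_probe_rows`, and `probe_filter_is_safe` are hypothetical names, and the enum is a local stand-in that covers only the join types named in this thread, not DataFusion's actual `JoinType`.

```rust
/// Local stand-in for the join types discussed in this thread
/// (hypothetical; not DataFusion's `JoinType` definition).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum JoinKind {
    Inner,
    Left,
    Right,
    LeftSemi,
    RightSemi,
    LeftAnti,
    RightAnti,
}

/// True when the join output can contain probe-side (right) rows
/// that matched no build-side (left) row.
fn emits_unmatched_probe_rows(kind: JoinKind) -> bool {
    matches!(kind, JoinKind::Right | JoinKind::RightAnti)
}

/// A filter that only drops probe rows with no possible build-side match
/// is safe exactly when such rows never reach the output.
fn probe_filter_is_safe(kind: JoinKind) -> bool {
    !emits_unmatched_probe_rows(kind)
}
```

Under these assumptions, `probe_filter_is_safe` returns `true` for Inner, Left, LeftSemi, RightSemi, and LeftAnti, and `false` for Right and RightAnti, which matches the classification argued above.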
The dynamic filter from HashJoinExec was previously gated to Inner joins only. Left and LeftSemi joins have the same probe-side filtering semantics.
PR #20192 refactored the join filter pushdown infrastructure, which as a side effect makes extending self-generated filters to Left/LeftSemi join types trivial.
Which issue does this PR close?
This PR makes progress on #16973
Rationale for this change
The self-generated dynamic filter in HashJoinExec filters the probe side using build-side values. For Left and LeftSemi joins, the right-hand probe side has the same filtering semantics as for Inner joins: probe rows that match no build row never appear in the output. This PR relaxes the gate so that Left and LeftSemi joins also benefit from this optimization.
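A toy model may make the Left join case concrete. The sketch below is not DataFusion code: `left_join` and `prefilter_probe` are hypothetical helpers over plain integer keys, and the exact key-set filter is a simplified stand-in for whatever the real dynamic filter computes from the build side. It shows that dropping probe rows whose key is absent from the build side leaves the Left join output unchanged.

```rust
use std::collections::HashSet;

/// Naive Left join on integer keys. The build side is the left input and the
/// probe side is the right input; unmatched build rows are padded with `None`.
fn left_join(build: &[i32], probe: &[i32]) -> Vec<(i32, Option<i32>)> {
    let mut out = Vec::new();
    for &b in build {
        let mut matched = false;
        for &p in probe {
            if p == b {
                out.push((b, Some(p)));
                matched = true;
            }
        }
        if !matched {
            // Left join keeps every build row, matched or not.
            out.push((b, None));
        }
    }
    out
}

/// Stand-in for the dynamic filter: keep only probe rows whose key
/// appears somewhere on the build side.
fn prefilter_probe(build: &[i32], probe: &[i32]) -> Vec<i32> {
    let keys: HashSet<i32> = build.iter().copied().collect();
    probe.iter().copied().filter(|p| keys.contains(p)).collect()
}
```

For example, with `build = [1, 2, 3]` and `probe = [2, 2, 4, 5]`, the filter drops probe keys 4 and 5, and the join over the filtered probe side produces the same rows as the join over the original one.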
Are these changes tested?
Yes.
Are there any user-facing changes?
No.