feat: support dynamic filter pushdown through SortMergeJoinExec #20455

Open
mdashti wants to merge 4 commits into apache:main from paradedb:moe/smj-dynamic-filter-pushdown

mdashti commented Feb 20, 2026

Which issue does this PR close?

Rationale for this change

SortMergeJoinExec uses the default gather_filters_for_pushdown implementation, which marks all parent filters as unsupported. This means dynamic filters from TopK (SortExec with fetch) cannot pass through sort-merge joins to reach scan nodes — even though the filter routing logic is straightforward for Inner joins. HashJoinExec already supports this.

What changes are included in this PR?

Implements gather_filters_for_pushdown and handle_child_pushdown_result on SortMergeJoinExec for Inner joins only. For Inner joins the output schema is [left_cols..., right_cols...], so each parent filter is routed to the correct child based on its column references using ChildFilterDescription::from_child_with_allowed_indices (same approach as HashJoinExec). All non-Inner join types conservatively return all_unsupported.
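The routing rule described above can be sketched in isolation. The following is a simplified, self-contained model and not the DataFusion API: `JoinKind`, `Route`, `route_filter`, and `child_index` are hypothetical names introduced for illustration, and a filter is reduced to the list of join-output column indices it references. It assumes the Inner join output layout `[left_cols..., right_cols...]` stated in the description.

```rust
/// Hypothetical stand-in for the join type; only Inner allows routing in this sketch.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum JoinKind {
    Inner,
    Left,
}

/// Where a parent filter may be placed relative to the join.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Route {
    /// Every referenced column comes from the left input.
    Left,
    /// Every referenced column comes from the right input.
    Right,
    /// The filter is not pushed down and stays above the join.
    Unsupported,
}

/// Classify a filter by the join-output column indices it references.
/// `left_len` is the number of columns in the left child's schema.
fn route_filter(kind: JoinKind, left_len: usize, referenced: &[usize]) -> Route {
    // Non-Inner joins conservatively reject every parent filter.
    if kind != JoinKind::Inner {
        return Route::Unsupported;
    }
    let all_left = referenced.iter().all(|&i| i < left_len);
    let all_right = referenced.iter().all(|&i| i >= left_len);
    match (all_left, all_right) {
        (true, false) => Route::Left,
        (false, true) => Route::Right,
        // Cross-side filters land here. A filter with no column references
        // also lands here; keeping it above the join is a choice of this sketch.
        _ => Route::Unsupported,
    }
}

/// Translate a join-output column index into an index in the chosen child's schema.
fn child_index(left_len: usize, output_index: usize) -> usize {
    if output_index < left_len {
        output_index
    } else {
        output_index - left_len
    }
}
```

With a two-column left input, a filter on output column 3 would be routed to the right child and rewritten to reference that child's column 1, while a filter touching columns 1 and 2 spans both sides and stays above the join.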

This is a minimal, non-intrusive patch: it only passes filters received from parent operators through the join. The join does not create new dynamic filters from its own join keys (that's a separate, larger feature).

Are these changes tested?

Yes, at three levels:

  1. Optimizer unit tests (filter_pushdown.rs) — verify plan structure: left/right filters route to correct children, cross-side filters stay, non-Inner joins reject pushdown, and TopK dynamic filters propagate through SMJ to scan nodes.
  2. SQL logic tests (dynamic_filter_pushdown_config.slt) — end-to-end EXPLAIN verification that DynamicFilter appears on the correct DataSourceExec for Inner joins (on both join-key and non-key columns), and is absent for Left joins. Includes correctness checks.
  3. Integration tests (smj_filter_pushdown.rs): 11 tests using in-memory Parquet data that run each query with and without dynamic filter pushdown and assert identical results. Covers TopK on left/right/join-key columns, DESC order, multi-column sorts, WHERE clauses, LIMIT edge cases, LEFT JOIN correctness, and nested joins.
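The "with and without pushdown, identical results" check in the integration tests can be illustrated with a toy model. This sketch does not use DataFusion or the PR's test code: `top_k_baseline` and `top_k_with_threshold` are hypothetical functions over a plain slice of integers, assuming an ascending `ORDER BY ... LIMIT k`. The second function mimics a TopK dynamic filter by skipping any row that cannot beat the current k-th best value.

```rust
use std::collections::BinaryHeap;

/// Baseline: sort everything, then keep the first `k` values (ascending order).
fn top_k_baseline(rows: &[i64], k: usize) -> Vec<i64> {
    let mut sorted = rows.to_vec();
    sorted.sort();
    sorted.truncate(k);
    sorted
}

/// TopK with a dynamic threshold: once `k` values are held, any row that is
/// not smaller than the current k-th best is skipped, as a pushed-down filter
/// would do at the scan. Returns the result and the number of skipped rows.
fn top_k_with_threshold(rows: &[i64], k: usize) -> (Vec<i64>, usize) {
    if k == 0 {
        return (Vec::new(), 0);
    }
    // Max-heap: the largest retained value sits on top and acts as the threshold.
    let mut heap: BinaryHeap<i64> = BinaryHeap::new();
    let mut pruned = 0;
    for &value in rows {
        if heap.len() == k {
            let threshold = *heap.peek().expect("heap holds k > 0 values");
            if value >= threshold {
                pruned += 1; // this row cannot change the result
                continue;
            }
            heap.pop(); // evict the current worst retained value
        }
        heap.push(value);
    }
    (heap.into_sorted_vec(), pruned)
}
```

Comparing `top_k_with_threshold(&rows, k).0` against `top_k_baseline(&rows, k)` for the same input is the same differential idea: the filter may reduce the work done, but it must never change the answer.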

Are there any user-facing changes?

No API changes. Queries using SortMergeJoinExec with Inner joins may now benefit from dynamic filter pushdown (e.g. TopK pruning), improving performance for ORDER BY ... LIMIT queries over sort-merge joins.

github-actions bot added the core (Core DataFusion crate), sqllogictest (SQL Logic Tests (.slt)), and physical-plan (Changes to the physical-plan crate) labels Feb 20, 2026
mdashti changed the title from "Moe/smj dynamic filter pushdown" to "feat: support dynamic filter pushdown through SortMergeJoinExec" Feb 20, 2026

Development

Successfully merging this pull request may close these issues.

Support filter pushdown through SortMergeJoinExec