[branch-52] Bump DataFusion rev to f62e79d (skip ensure_distribution rebuild)#47

Merged: zhuqi-lucas merged 1 commit into branch-52 from qizhu/bump-df-rev-ensure-distribution on May 29, 2026

@zhuqi-lucas
Collaborator

Summary

Bump DataFusion rev 05a6c45 → f62e79d to pick up massive-com/arrow-datafusion#57, the cherry-pick of upstream apache/datafusion#22521.

The upstream change makes ensure_distribution route through with_new_children_if_necessary, which short-circuits via Arc::ptr_eq when no child plan was actually replaced. This saves the (often expensive) with_new_children rebuild on point-query plan shapes where no redistribution is needed.
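To make the short-circuit concrete, here is a minimal sketch of the pointer-equality check on a toy plan tree. `PlanNode` and its methods are hypothetical stand-ins written for illustration; this is not DataFusion's `ExecutionPlan` trait or the upstream implementation, only the general shape of the idea.

```rust
use std::sync::Arc;

// Hypothetical, simplified plan node (an assumption for illustration,
// not DataFusion's ExecutionPlan).
#[derive(Debug)]
pub struct PlanNode {
    pub name: String,
    pub children: Vec<Arc<PlanNode>>,
}

impl PlanNode {
    pub fn leaf(name: &str) -> Arc<PlanNode> {
        Arc::new(PlanNode { name: name.to_string(), children: vec![] })
    }

    // Stand-in for the expensive rebuild: always allocates a new node.
    pub fn with_new_children(&self, children: Vec<Arc<PlanNode>>) -> Arc<PlanNode> {
        Arc::new(PlanNode { name: self.name.clone(), children })
    }
}

// Returns the original Arc untouched when every proposed child is
// pointer-identical to the existing child; rebuilds only otherwise.
pub fn with_new_children_if_necessary(
    plan: Arc<PlanNode>,
    children: Vec<Arc<PlanNode>>,
) -> Arc<PlanNode> {
    let unchanged = plan.children.len() == children.len()
        && plan
            .children
            .iter()
            .zip(children.iter())
            .all(|(old, new)| Arc::ptr_eq(old, new));
    if unchanged {
        plan
    } else {
        plan.with_new_children(children)
    }
}
```

In this sketch the comparison costs one pointer check per child, so a pass that leaves the children alone gets its input plan back without allocating a new node.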

Why MV needs to move in lockstep

Atlas pins both the MV crate AND DF separately:

  • datafusion-materialized-views = { rev = "906d9cf" } ← this PR will produce a new rev
  • datafusion = { rev = "..." } ← atlas will bump to f62e79d once this lands

If the two crates pin different DF revs, cargo resolves them as two separate DF copies in the workspace, which breaks atlas's MV/DF boundary (types don't unify). So MV has to land first.
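The "types don't unify" failure can be imitated in a single file. In the sketch below, two modules stand in for the same crate built at two different revs; the module names, `Plan`, and `mv_entry_point` are all hypothetical and exist only to show that identically shaped types from separate copies are distinct to the compiler.

```rust
use std::any::TypeId;

// Stand-in for the DF copy the MV crate would be built against (hypothetical).
pub mod df_rev_05a6c45 {
    #[derive(Debug, Default)]
    pub struct Plan {
        pub partitions: usize,
    }
}

// Stand-in for the DF copy atlas would be built against (hypothetical).
pub mod df_rev_f62e79d {
    #[derive(Debug, Default)]
    pub struct Plan {
        pub partitions: usize,
    }
}

// An API compiled against the older copy accepts only that copy's type.
pub fn mv_entry_point(plan: &df_rev_05a6c45::Plan) -> usize {
    plan.partitions
}

// The two `Plan` types have the same name and layout but different identities.
pub fn same_type() -> bool {
    TypeId::of::<df_rev_05a6c45::Plan>() == TypeId::of::<df_rev_f62e79d::Plan>()
}

// Passing the other copy's type is rejected at compile time:
// mv_entry_point(&df_rev_f62e79d::Plan::default()); // error: mismatched types
```

Pinning both crates to the same rev corresponds to having only one of these modules, so every caller shares a single `Plan` type.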

Tests

cargo test — 25/25 unit + 1/1 integration + 4 ignored doctests pass against the new rev.

…rebuild)

Picks up massive-com/arrow-datafusion#57 (cherry-pick of upstream
apache/datafusion#22521): ensure_distribution now routes through
with_new_children_if_necessary, skipping the expensive
plan.with_new_children() rebuild when children are pointer-identical.

Atlas's reference query servers need this DF rev to consume the
optimization end-to-end. The MV crate has to bump in lockstep; otherwise
atlas ends up with two different DF copies in the workspace
(MV depending on 05a6c45, atlas depending on f62e79d), causing type
mismatches across the MV/atlas boundary.

All 25 + 1 tests pass against the new rev.
Copilot AI review requested due to automatic review settings May 29, 2026 03:03

Copilot AI left a comment

Pull request overview

Bumps the pinned git revision of the DataFusion dependency set used by datafusion-materialized-views to pick up an upstream optimization that avoids unnecessary plan rebuilds in ensure_distribution, helping reduce overhead for plan shapes where no redistribution is needed.

Changes:

  • Update DataFusion git rev from 05a6c45 to f62e79d across all directly pinned datafusion* crates.

@zhuqi-lucas zhuqi-lucas merged commit 17da908 into branch-52 May 29, 2026
9 checks passed
@zhuqi-lucas zhuqi-lucas deleted the qizhu/bump-df-rev-ensure-distribution branch May 29, 2026 03:10