Skip to content

[MAJOR] Equivalence System Overhaul #16217

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 191 commits into
base: main
Choose a base branch
from

Conversation

ozankabak
Copy link
Contributor

@ozankabak ozankabak commented May 30, 2025

Which issue does this PR close?

Rationale for this change

DF's theoretical approach to representing orderings, equivalences and constants is sound, but it suffers from many implementation problems that lead to various issues. Some of these issues include:

  • High-complexity planning times for queries involving UNION and ORDER BY clauses (see the referred issue).
  • A continuous stream of subtle bugs that arise from the fact that our objects allow multiple representations of the same underlying state (e.g. a non-existing ordering is sometimes represented by a None value for a type Option<LexOrdering>, and sometimes with a Some value with the payload being "empty").
  • The inability to relay ordering preference information to optimizer rules such as EnforceSorting, which results in missed optimization opportunities. For example, a windowing operator prefers its input to be sorted w.r.t. both PARTITION BY and ORDER BY keys, but if the former is not available, it is still preferable to have just the latter instead of a fully unsorted input.

What changes are included in this PR?

  • The equivalence system is rewritten to solve the computational complexity problem. The new system avoids handling constants separately from other equivalences and treats them on an equal footing within the same mechanism, avoiding unnecessary work during ordering_satisfy checks. Also, the new system caches normalization results for equivalent orderings, which was one of the major sources of excessive computational complexity during sort enforcement.
  • The LexOrdering and LexRequirement objects do not allow degenerate orderings anymore, whenever you see this object, you can be sure there will be some ordering. Possible absence of an ordering will henceforth be properly represented by an Option<LexOrdering>.
  • The OrderingRequirements object is introduced to capture ordering preferences of operators.

Are these changes tested?

Yes.

Are there any user-facing changes?

Yes -- this is a major API change and will break old code. We spent months looking for ways to (1) break this PR down into smaller ones, (2) yet lump all API changes in one PR -- but couldn't find an effective way to do this.

Therefore, I invite all contributors who are also downstream stakeholders to help by participating in a "pre-scream" test. Please test and report how this PR breaks your code, so we can collectively prepare an upgrade guide for the community at large.

Tips and Tricks for Reviewing

The "meat" of the code is in the following files:

  • datafusion/physical-expr-common/src/sort_expr.rs: Definition of OrderingRequirements, changes to LexOrdering and friends.
  • datafusion/physical-plan/src/execution_plan.rs: API change to required_input_ordering.
  • datafusion/physical-expr/src/equivalence: Re-implementation of the equivalence system.
  • datafusion/functions-aggregate/src: Effects of the above on aggregate function implementations.

The rest of the changes are "mandatory" propagative effects of fixing a fundamental low-level mechanism in a large codebase, and mostly trivial.

@alamb, @andygrove, @berkaysynnada -- feel free to tag others.

mertak-synnada added 30 commits February 12, 2025 15:22
remove usage of prefer_existing_sort
as default set requirements Hard
set soft on AggregateExec and BoundedWindowExec since they have InputOrderMode functionalities
remove prefer_existing_sort based test cases
# Conflicts:
#	datafusion/core/src/datasource/listing/table.rs
#	datafusion/physical-optimizer/src/enforce_sorting/mod.rs
# Conflicts:
#	datafusion-examples/examples/custom_file_format.rs
#	datafusion/core/src/datasource/file_format/csv.rs
#	datafusion/core/src/datasource/file_format/json.rs
#	datafusion/core/src/datasource/file_format/mod.rs
#	datafusion/core/src/datasource/file_format/parquet.rs
#	datafusion/core/tests/physical_optimizer/enforce_distribution.rs
#	datafusion/core/tests/physical_optimizer/test_utils.rs
#	datafusion/physical-optimizer/src/enforce_sorting/sort_pushdown.rs
return alternative on BoundedWindowAggExec
# Conflicts:
#	datafusion/catalog/src/stream.rs
#	datafusion/core/tests/physical_optimizer/enforce_distribution.rs
#	datafusion/physical-plan/src/joins/sort_merge_join.rs
#	datafusion/physical-plan/src/joins/symmetric_hash_join.rs
add requirements compatible test cases
@ozankabak ozankabak requested review from alamb and berkaysynnada May 30, 2025 13:17
@ozankabak ozankabak added the api change Changes the API exposed to users of the crate label May 30, 2025
@ozankabak ozankabak changed the title [MAJOREquivalence System Overhaul [MAJOR] Equivalence System Overhaul May 30, 2025
@github-actions github-actions bot added sql SQL Planner logical-expr Logical plan and expressions physical-expr Changes to the physical-expr crates optimizer Optimizer rules core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) catalog Related to the catalog crate common Related to common crate proto Related to proto crate functions Changes to functions implementation datasource Changes to the datasource crate ffi Changes to the ffi crate physical-plan Changes to the physical-plan crate labels May 30, 2025
@alamb
Copy link
Contributor

alamb commented May 30, 2025

😮

I will put this on my list to review asap

@alamb
Copy link
Contributor

alamb commented May 30, 2025

@wiedld @xudong963 and @suremarc perhaps you are also interested in this PR given your past interest in the equivalence code. Also cc @chenkovsky

@ozankabak
Copy link
Contributor Author

Thanks for the quick response @alamb -- I added some reviewing tips and tricks to the PR body to help you guys out.

@xudong963
Copy link
Member

One thing on my mind is that maybe it's better to hold up the PR until DF48 release, then we can have enough time to check and verify the PR after it merges.

@ozankabak
Copy link
Contributor Author

ozankabak commented May 30, 2025

Let's definitely time the merge and DF48 release in a cooperative way. We have two competing realities at play: (1) The PR attracts a lot of conflicts and it is not easy to resolve them without accidents and/or introducing possible regressions, (2) We don't want to dump a large API change on to DF48 without giving prep time to people.

Checking out #15771, it seems we can either:

  1. Move fast with the release while this gets reviews, which has the advantage of making DF48 a "checkpoint" before this PR. The downside is that we will need to do extensive testing both for the release and then one for the PR.
  2. Move slower with the release (maybe release it a week later than usual) and merge this relatively quickly. The advantage of this choice would be performing extensive testing only once.

I am fine with both approaches as long as we arrive at a merge point quickly, before or after the release.

@alamb
Copy link
Contributor

alamb commented May 30, 2025

Move fast with the release while this gets reviews, which has the advantage of making DF48 a "checkpoint" before this PR. The downside is that we will need to do extensive testing both for the release and then one for the PR.

I think this is my preferred choice -- let's get DF 48 out and then give this one the maximum bake time. I think we can be careful of changes we merge to try and minimize the conflicts that this will collect

@ozankabak
Copy link
Contributor Author

Sounds good -- let's go with the first choice (moving fast with the release and merging this right after) unless someone else brings a good argument to the contrary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api change Changes the API exposed to users of the crate catalog Related to the catalog crate common Related to common crate core Core DataFusion crate datasource Changes to the datasource crate ffi Changes to the ffi crate functions Changes to functions implementation logical-expr Logical plan and expressions optimizer Optimizer rules physical-expr Changes to the physical-expr crates physical-plan Changes to the physical-plan crate proto Related to proto crate sql SQL Planner sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Exponential planning time (100s of seconds) with UNION and ORDER BY queries
3 participants