Fix incorrect Not predicate evaluation in filtering#78
Merged
WenyXu merged 2 commits intodatafusion-contrib:mainfrom Feb 27, 2026
Merged
Conversation
5a5a69b to
e8f97b9
Compare
ac97685 to
fd9f67a
Compare
Contributor
There was a problem hiding this comment.
Pull request overview
This PR fixes a bug in the evaluation of Not predicates against row group statistics in ORC file reading. The old implementation incorrectly used simple boolean negation, which failed for mixed row groups (containing both nulls and values). The new implementation correctly pushes negation down to leaf predicates using De Morgan's laws.
Changes:
- Fixed
Notpredicate evaluation to use predicate pushdown instead of boolean negation - Added
ComparisonOp::negate()method to support comparison operator negation - Added comprehensive tests for all Not predicate transformation cases
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/row_group_filter.rs | Reimplemented Not predicate evaluation to recursively push negation down using De Morgan's laws, comparison operator negation, and IsNull/IsNotNull swapping. Added six new comprehensive tests. |
| src/predicate.rs | Added ComparisonOp::negate() method to return the negated comparison operator for all six comparison types. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
WenyXu
approved these changes
Feb 27, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR fixes a bug in how
Notpredicates are evaluated against row group statistics.The Problem:
Previously,
Predicate::Notwas handled by simply negating the boolean result of the inner predicate's evaluation (!evaluate(inner)). However, since row group statistics checks are existential (returningtrueif any row in the group matches), simple negation is incorrect for mixed row groups.For example:
IsNullreturnstrueif a row group has any nulls.Not(IsNull)would returnfalse(skip) for a mixed row group containing both nulls and values.Not(IsNull)is equivalent toIsNotNull, which should returntrue(keep) if the group has any non-null values.The Fix:
I updated the
Notevaluation logic to recursively push the negation down to the leaf predicates:And/Or(e.g.,Not(A And B)->Not(A) Or Not(B)).Not(Eq)->NotEq,Not(Lt)->Gte).IsNull<->IsNotNull.Tests:
test_evaluate_predicate_not_is_not_nullandtest_evaluate_predicate_not_orto verify that mixed row groups are correctly preserved.