-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Short circuit complex case evaluation modes as soon as possible #17898
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
bb647a5
to
3b75814
Compare
@alamb could you trigger a benchmark run on this PR (once CI gives the green light)? |
🤖 |
🤖: Benchmark completed Details
|
Small improvement (which is what I had expected). WDYT, worth integrating? Edit: I had a closer look at the The example I'm looking at locally is a projection where a classification category is being associated with each row using a large case expression. Something like |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like it is worth pursing to me -- thanks @pepijnve
I wonder if we can factor out the pattern of "don't call evaluate_selection if the selection is still the entire remainder (mostly so we can document the behavior / make sure this it doesn't get lost in some future refactoring)
Maybe something we could wrap the count in some structure or a function
🤔
let mut remainder = Remainder::new(batch.num_rows());
remainder.update(&self.when_then_expr[i], &batch);
let mut current_value = new_null_array(&return_type, batch.num_rows()); | ||
// We only consider non-null values while comparing with whens | ||
let mut remainder = not(&base_nulls)?; | ||
let mut remainder_count = remainder.true_count(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have found in past evaluations, that the code generated for true_count
is astonishingly fast (it uses some special hardware instruction) so I am not surprised this works well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to know that that’s fairly cheap.
What I’m experimenting with is retaining the filtered record batch from one loop iteration to the next so that the amount of data to be churned through each iteration shrinks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to leave the more complex optimisation work for another PR. The short circuit logic is already useful on it's own.
🤖 |
🤖: Benchmark completed Details
|
3b75814
to
a1add0a
Compare
a1add0a
to
563dbf0
Compare
The |
🤖 |
🤖: Benchmark completed Details
|
I’ll have a look at the existing micro benchmarks tomorrow. Not sure if there’s anything in there already with sufficient branches that you would notice the impact. |
I had already done the work of adding extra benchmarks in another branch. I've moved that over to #18097 which is now just an extension of the micro benchmarks. That'll make it easier to compare results. |
🤖 |
🤖: Benchmark completed Details
|
Going to take a closer look at this. In one of the benches I would expect a more significant difference. |
🤦♂️ I botched the benchmark. |
🤖 |
🤖: Benchmark completed Details
|
Those are the gains I was expecting to see! 🚀 Just to keep our feet solidly on the ground, these are somewhat contrived and extreme examples. But I'll take it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the diligence @pepijnve
// start with nulls as default output | ||
let mut current_value = new_null_array(&return_type, batch.num_rows()); | ||
let mut remainder = BooleanArray::from(vec![true; batch.num_rows()]); | ||
let mut remainder_count = batch.num_rows(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
given the similarity of this code and the one above, I wonder if there is some way to avoid the duplication (as part of a follow on PR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In #18152 the code is changing a bit further. If the approach there pans out I want to try to do the same for case_when_with_expr
. It'll be easier to see if there's an extractable pattern once that work settles down, so if it's ok with you I would like to postpone your suggestion for a little bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, of course -- we can always make the code better as follow on PRs
Which issue does this PR close?
Improvement in the context of #18075
Rationale for this change
Speculative performance improvements for case evaluation
What changes are included in this PR?
Short circuit case evaluation loop when as soon as a value has been calculated for each input rows
Are these changes tested?
(Hopefully) covered by SQL logic tests
Are there any user-facing changes?
No