by default enable skip agg #12558

binmahone · 2025-04-21T01:32:11Z

This PR closes #12557; please see the description there.

Signed-off-by: Hongbin Ma (Mahone) <[email protected]>

binmahone · 2025-04-21T01:32:20Z

build

Signed-off-by: Hongbin Ma (Mahone) <[email protected]>

binmahone · 2025-04-21T06:16:32Z

build

abellina · 2025-04-22T14:58:51Z

I believe we should mark the config internal as it doesn't seem like it is a user setting. You'd have to know that the partial aggregate has two stages, and that the first stage does some aggregation and that there are more stages that further combine these partial aggregates into bigger partial aggregates.

Side comment, this log at INFO level seems a little odd. Shouldn't this be at DEBUG level? https://github.com/NVIDIA/spark-rapids/blob/branch-25.06/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuAggregateExec.scala#L995 and the INFO level one should be the case where we do skip? https://github.com/NVIDIA/spark-rapids/blob/branch-25.06/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuAggregateExec.scala#L986

docs/additional-functionality/advanced_configs.md

binmahone · 2025-04-27T01:33:41Z

> I believe we should mark the config internal as it doesn't seem like it is a user setting. You'd have to know that the partial aggregate has two stages, and that the first stage does some aggregation and that there are more stages that further combine these partial aggregates into bigger partial aggregates.

will do

> Side comment, this log at INFO level seems a little odd. Shouldn't this be at DEBUG level? https://github.com/NVIDIA/spark-rapids/blob/branch-25.06/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuAggregateExec.scala#L995 and the INFO level one should be the case where we do skip? https://github.com/NVIDIA/spark-rapids/blob/branch-25.06/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuAggregateExec.scala#L986

These two logs are both helpful when troubleshooting the skip agg case. How about I change both to INFO level? I don't know what role DEBUG level logging is supposed to play for us. To me, it would be nice if a log from a user's production run were informative enough for most troubleshooting. Asking users to run the job again with DEBUG logging settings is sometimes impractical because it might be very costly.

binmahone · 2025-04-27T01:34:23Z

> I believe we should mark the config internal as it doesn't seem like it is a user setting. You'd have to know that the partial aggregate has two stages, and that the first stage does some aggregation and that there are more stages that further combine these partial aggregates into bigger partial aggregates.

will do

> Side comment, this log at INFO level seems a little odd. Shouldn't this be at DEBUG level? https://github.com/NVIDIA/spark-rapids/blob/branch-25.06/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuAggregateExec.scala#L995 and the INFO level one should be the case where we do skip? https://github.com/NVIDIA/spark-rapids/blob/branch-25.06/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuAggregateExec.scala#L986

These two logs are both helpful when troubleshooting the skip agg case. How about I change both to INFO level? I don't know what role DEBUG level logging is supposed to play for us. To me, it would be nice if a log from a user's production run were informative enough for most troubleshooting (unless it's very verbose). Asking users to run the job again with DEBUG logging settings is sometimes impractical because it might be very costly.

abellina · 2025-04-27T22:06:51Z

> > I believe we should mark the config internal as it doesn't seem like it is a user setting. You'd have to know that the partial aggregate has two stages, and that the first stage does some aggregation and that there are more stages that further combine these partial aggregates into bigger partial aggregates.
>
> will do
>
> > Side comment, this log at INFO level seems a little odd. Shouldn't this be at DEBUG level? https://github.com/NVIDIA/spark-rapids/blob/branch-25.06/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuAggregateExec.scala#L995 and the INFO level one should be the case where we do skip? https://github.com/NVIDIA/spark-rapids/blob/branch-25.06/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuAggregateExec.scala#L986
>
> These two logs are both helpful when troubleshooting the skip agg case. How about I change both to INFO level? I don't know what role DEBUG level logging is supposed to play for us. To me, it would be nice if a log from a user's production run were informative enough for most troubleshooting (unless it's very verbose). Asking users to run the job again with DEBUG logging settings is sometimes impractical because it might be very costly.

That's fine with me. The message is logged once per task, for the first batch only, so it should be fine for now to make them both INFO.
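For readers outside the thread, here is a rough sketch of the kind of decision being discussed. It is not the code in GpuAggregateExec.scala: the function name, class name, logger name, threshold parameter, and log wording are all hypothetical, and the comparison rule (skip the later merge passes when the first pass leaves more than a given fraction of the input rows) is an assumption made for illustration. It only shows why logging both outcomes at INFO stays cheap when the decision is made once per task on the first batch.

```python
import logging

logger = logging.getLogger("skip_agg_sketch")  # hypothetical logger name


def should_skip_merge_passes(rows_before, rows_after, ratio_threshold):
    """Return True when the first aggregation pass barely shrank the batch.

    Assumed rule, for illustration only: skip when the fraction of rows
    surviving the first pass is greater than ratio_threshold.
    """
    if rows_before == 0:
        return False  # nothing to measure, keep the normal path
    return rows_after / rows_before > ratio_threshold


class FirstBatchDecision:
    """Decides once per task, from the first batch only, and logs once."""

    def __init__(self, ratio_threshold):
        self.ratio_threshold = ratio_threshold
        self.skip = None  # None until the first batch has been observed

    def observe(self, rows_before, rows_after):
        # Only the first call decides and logs; later calls reuse the result,
        # so there is at most one INFO line per task.
        if self.skip is None:
            self.skip = should_skip_merge_passes(
                rows_before, rows_after, self.ratio_threshold)
            if self.skip:
                logger.info("first pass kept %d of %d rows, skipping merge passes",
                            rows_after, rows_before)
            else:
                logger.info("first pass kept %d of %d rows, continuing merge passes",
                            rows_after, rows_before)
        return self.skip
```

With a threshold of 0.9, a first batch that goes from 100 rows to 95 would choose to skip, and every later batch in the same task would reuse that choice without logging again.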

Signed-off-by: Hongbin Ma (Mahone) <[email protected]>

binmahone · 2025-04-28T03:41:05Z

build

Signed-off-by: Hongbin Ma (Mahone) <[email protected]>

binmahone · 2025-04-28T03:45:15Z

build

binmahone · 2025-05-06T01:10:11Z

build

by default enable skip agg

f5fec91

Signed-off-by: Hongbin Ma (Mahone) <[email protected]>

fix config md

a6a3cc0

Signed-off-by: Hongbin Ma (Mahone) <[email protected]>

winningsix previously approved these changes Apr 22, 2025

weiatwork reviewed Apr 22, 2025

docs/additional-functionality/advanced_configs.md (outdated, resolved)

sameerz added the performance label (A performance related task/issue) Apr 22, 2025

address comments

cd43dcc

Signed-off-by: Hongbin Ma (Mahone) <[email protected]>

binmahone dismissed winningsix’s stale review via cd43dcc April 28, 2025 03:40

address comments

a57da3b

Signed-off-by: Hongbin Ma (Mahone) <[email protected]>

abellina approved these changes Apr 28, 2025

binmahone merged commit 6ce344b into NVIDIA:branch-25.06 May 6, 2025
54 checks passed
