Add support for Acceldata ODP-3.5.1 #12621
base: branch-25.08
Conversation
@shubhluck @kuhushukla can you please help review this PR?
Force-pushed from 60c10f2 to fbf3bb0
@@ -381,7 +381,7 @@ object HybridExecutionUtils extends PredicateHelper {
    }
  }

-  /**
+  /**
Nit: remove whitespace change
@@ -33,7 +33,7 @@ import org.apache.spark.sql.rapids.execution.TrampolineUtil
import org.apache.spark.sql.types._

object HybridExecutionUtils extends PredicateHelper {
Remove whitespace change
Actually, this PR removes it — but I’ll add it back to reduce the diff size.
}
Nit: Remove unrelated change
@@ -57,7 +58,7 @@ class RapidsCsvScanMeta(
    parent: Option[RapidsMeta[_, _, _]],
    rule: DataFromReplacementRule)
  extends ScanMeta[CSVScan](cScan, conf, parent, rule) {
Nit: Remove whitespace change
}
Nit: Remove unrelated change
@@ -65,7 +66,7 @@ object SparkShimImpl extends Spark340PlusNonDBShims {
        GpuToPrettyString(child)
      }
    }
-  }),
+  }),
Nit: Remove unrelated change
@@ -116,13 +117,13 @@ class RegularExpressionSuite extends SparkQueryCompareTestSuite {
  }

  testSparkResultsAreEqual("String regexp_extract regex 1", extractStrings, conf = conf) {
-    frame =>
+    frame =>
Nit: Remove unrelated change
      assume(isUnicodeEnabled())
      frame.selectExpr("regexp_extract(strings, '^([a-z]*)([0-9]*)([a-z]*)$', 1)")
  }

  testSparkResultsAreEqual("String regexp_extract regex 2", extractStrings, conf = conf) {
-    frame =>
+    frame =>
Nit: Remove unrelated change
@@ -44,7 +45,7 @@ class ToPrettyStringSuite extends GpuUnitTests {
  val numRows = 100
  val inputRows = GpuBatchUtilsSuite.createRows(schema, numRows)
  val cpuOutput: Array[String] = inputRows.map {
    input =>
Nit: Remove unrelated change
@@ -39,6 +39,7 @@
 {"spark": "350"}
 {"spark": "350db143"}
 {"spark": "351"}
+{"spark": "351odp"}
I wonder if there is a better way to add the ODP version tags that repeat across many of the files in this PR. I will defer that to other reviewers, but it would certainly be a nice thing to have.
Yes, it was painful for me too.
Would love to know if there's a better way to handle this — or even better, if someone already has a script for it. Happy to learn!
Typically when we add a new shim, we know which existing shim it will mostly be based on; see https://github.com/NVIDIA/spark-rapids/blob/branch-25.06/docs/dev/shimplify.md#adding-a-new-shim. So most of this tagging is not done by hand.
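For reference, the {"spark": ...} tags shown in the diff above live inside a marker comment near the top of each shared source file and are generally maintained by the shimplify tooling rather than by hand. The block below is an illustrative sketch of that convention, not a verbatim excerpt from this PR:

/*** spark-rapids-shim-json-lines
{"spark": "350"}
{"spark": "351"}
{"spark": "351odp"}
spark-rapids-shim-json-lines ***/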
Thank you @gerashegalov, I will definitely use it next time.
@@ -27,6 +27,11 @@ case class ClouderaShimVersion(major: Int, minor: Int, patch: Int, clouderaVersion: String)
  override def toString(): String = s"$major.$minor.$patch-cloudera-$clouderaVersion"
}

+case class AcceldataShimVersion(major: Int, minor: Int, patch: Int, acceldataVersion: String)
You may want to decompose acceldataVersion into int dimensions, just like the Spark version fields, so that cmpSparkVersion compares correctly even between different instances of AcceldataShimVersion.
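A minimal sketch of the kind of decomposition being suggested; the extra ODP fields, the toString format, and the companion ordering here are illustrative assumptions, not the PR's actual code:

// Illustrative only: splitting the vendor version into numeric parts makes
// comparisons between AcceldataShimVersion instances numeric rather than
// string-based, mirroring how the Spark major/minor/patch fields compare.
case class AcceldataShimVersion(
    major: Int, minor: Int, patch: Int,
    odpMajor: Int, odpMinor: Int, odpPatch: Int) {
  override def toString(): String =
    s"$major.$minor.$patch-odp-$odpMajor.$odpMinor.$odpPatch"
}

object AcceldataShimVersion {
  // Order by the Spark version first, then by the ODP components.
  implicit val ordering: Ordering[AcceldataShimVersion] =
    Ordering.by((v: AcceldataShimVersion) =>
      (v.major, v.minor, v.patch, v.odpMajor, v.odpMinor, v.odpPatch))
}

With the vendor components stored as Ints, two ODP shim versions sort and compare the same way the plain Spark versions do, instead of relying on string comparison of something like "3.5.1-odp".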
Sure let me try and fix it.
@gerashegalov Have made the required changes. Does this look better?
Force-pushed from 498204e to f21c60f
@gerashegalov @kuhushukla could you please help with a code review for this PR? Also, let me know if there's anything else needed from my side to move this forward.
@gerashegalov @kuhushukla could you help me with what else is required here?
Sorry for the delay, there is a discussion about accepting additional Spark vendor shims to reduce instances like this.
This will need to be rebased to branch-25.08, as branch-25.06 is in burndown.
Signed-off-by: Prabhjyot Singh <[email protected]> (cherry picked from commit 1a9c55a)
Signed-off-by: Prabhjyot Singh <[email protected]> (cherry picked from commit 5c335c4)
Signed-off-by: Prabhjyot Singh <[email protected]> (cherry picked from commit 70cb32a)
Signed-off-by: Prabhjyot Singh <[email protected]> (cherry picked from commit f21c60f)
Signed-off-by: Prabhjyot Singh <[email protected]> (cherry picked from commit 14798b1)
Force-pushed from 14798b1 to e552bc7
@sameerz sure, I've rebased this onto branch-25.08.
Add shim layer for Acceldata ODP 3.5.1