Add support for Acceldata ODP-3.5.1 #12621


Open · wants to merge 5 commits into branch-25.08

Conversation

prabhjyotsingh

Add shim layer for Acceldata ODP 3.5.1

@prabhjyotsingh prabhjyotsingh requested a review from a team as a code owner May 1, 2025 18:54
@prabhjyotsingh (Author)

@shubhluck @kuhushukla can you please help review this PR?

@prabhjyotsingh prabhjyotsingh force-pushed the branch-25.06-351odp branch 2 times, most recently from 60c10f2 to fbf3bb0 Compare May 1, 2025 19:02
@@ -381,7 +381,7 @@ object HybridExecutionUtils extends PredicateHelper {
}
}

/**
/**
Collaborator:

Nit: remove the whitespace change.

@@ -33,7 +33,7 @@ import org.apache.spark.sql.rapids.execution.TrampolineUtil
import org.apache.spark.sql.types._

object HybridExecutionUtils extends PredicateHelper {

Collaborator:

Remove the whitespace change.

Author:

Actually, this PR removes it, but I'll add it back to reduce the diff size.

}
Collaborator:

Nit: remove the unrelated change.

@@ -57,7 +58,7 @@ class RapidsCsvScanMeta(
parent: Option[RapidsMeta[_, _, _]],
rule: DataFromReplacementRule)
extends ScanMeta[CSVScan](cScan, conf, parent, rule) {

Collaborator:

Nit: remove the whitespace change.

}
Collaborator:

Nit: remove the unrelated change.

@@ -65,7 +66,7 @@ object SparkShimImpl extends Spark340PlusNonDBShims {
GpuToPrettyString(child)
}
}
}),
}),
Collaborator:

Nit: remove the unrelated change.

@@ -116,13 +117,13 @@ class RegularExpressionSuite extends SparkQueryCompareTestSuite {
}

testSparkResultsAreEqual("String regexp_extract regex 1", extractStrings, conf = conf) {
frame =>
frame =>
Collaborator:

Nit: remove the unrelated change.

assume(isUnicodeEnabled())
frame.selectExpr("regexp_extract(strings, '^([a-z]*)([0-9]*)([a-z]*)$', 1)")
}

testSparkResultsAreEqual("String regexp_extract regex 2", extractStrings, conf = conf) {
frame =>
frame =>
Collaborator:

Nit: remove the unrelated change.

@@ -44,7 +45,7 @@ class ToPrettyStringSuite extends GpuUnitTests {
val numRows = 100
val inputRows = GpuBatchUtilsSuite.createRows(schema, numRows)
val cpuOutput: Array[String] = inputRows.map {
input =>
Collaborator:

Nit: remove the unrelated change.

@@ -39,6 +39,7 @@
{"spark": "350"}
{"spark": "350db143"}
{"spark": "351"}
{"spark": "351odp"}
@kuhushukla (Collaborator), May 1, 2025:

I wonder if there is a better way to add ODP versions that repeat across many of the files in this PR. I will defer that to the other reviewers, but it would certainly be a nice thing to have.

Author:

Yes, it was painful for me too.
Would love to know if there's a better way to handle this, or even better, if someone already has a script for it. Happy to learn!

Collaborator:

Typically when we add a new shim, we know which existing shim it will mostly be based on: https://github.com/NVIDIA/spark-rapids/blob/branch-25.06/docs/dev/shimplify.md#adding-a-new-shim. So most of this tagging is not done by hand.

@prabhjyotsingh (Author), May 2, 2025:

Thank you @gerashegalov, I will definitely use it next time.
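For context, the shimplify workflow referenced above generates a new shim's per-file version tags from the closest existing base shim instead of editing them by hand. A rough sketch of that invocation follows; the exact property names should be verified against the linked shimplify.md, and using `351` as the base for `351odp` is an assumption based on this PR:

```shell
# Hypothetical sketch per docs/dev/shimplify.md: ask the build to add
# the {"spark": "351odp"} tag everywhere the existing 351 shim applies,
# rewriting the per-file shim comments automatically.
mvn generate-sources \
  -Dshimplify=true \
  -Dshimplify.move=true \
  -Dshimplify.add.shim=351odp \
  -Dshimplify.add.base=351
```

This avoids the manual, repetitive tagging across many files that the reviewer is asking about.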

@@ -27,6 +27,11 @@ case class ClouderaShimVersion(major: Int, minor: Int, patch: Int, clouderaVersi
override def toString(): String = s"$major.$minor.$patch-cloudera-$clouderaVersion"
}

case class AcceldataShimVersion(major: Int, minor: Int, patch: Int, acceldataVersion: String)
Collaborator:

You may want to decompose acceldataVersion into int dimensions, just like the Spark version, so that cmpSparkVersion compares properly even between different instances of AcceldataShimVersion.

Author:

Sure, let me try to fix it.

Author:

@gerashegalov I have made the required changes. Does this look better?
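The decomposition the reviewer suggested could look roughly like the sketch below. This is illustrative only, not the actual patch; the `odpMajor`/`odpMinor`/`odpPatch` field names and the sample version numbers are assumptions:

```scala
// Sketch: splitting the vendor suffix into integer dimensions so that
// comparisons between two AcceldataShimVersion instances are numeric
// rather than string-based.
case class AcceldataShimVersion(
    major: Int, minor: Int, patch: Int,
    odpMajor: Int, odpMinor: Int, odpPatch: Int) {
  override def toString(): String =
    s"$major.$minor.$patch-$odpMajor.$odpMinor.$odpPatch"
  // A tuple of all dimensions gives most-significant-first ordering.
  def asTuple: (Int, Int, Int, Int, Int, Int) =
    (major, minor, patch, odpMajor, odpMinor, odpPatch)
}

object VersionDemo extends App {
  import scala.math.Ordering.Implicits._
  val older = AcceldataShimVersion(3, 5, 1, 3, 2, 9)
  val newer = AcceldataShimVersion(3, 5, 1, 3, 10, 0)
  // Numerically 3.2.9 precedes 3.10.0, though "3.2.9" > "3.10.0"
  // as strings; integer dimensions avoid that trap.
  assert(older.asTuple < newer.asTuple)
  println(older)
}
```

Keeping the vendor part as a single string would make version comparison lexicographic, which breaks as soon as any component reaches two digits.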

@prabhjyotsingh prabhjyotsingh force-pushed the branch-25.06-351odp branch from 498204e to f21c60f Compare May 5, 2025 19:00
@prabhjyotsingh (Author)

@gerashegalov @kuhushukla could you please help with a code review for this PR? Also, let me know if there's anything else needed from my side to move this forward.
Thanks!

@prabhjyotsingh (Author)

@gerashegalov @kuhushukla could you help me with what else is required here?

@gerashegalov (Collaborator)

Sorry for the delay; there is a discussion about accepting additional Spark vendor shims, in order to reduce instances like checkNotRunningCDHorDatabricks.

@sameerz sameerz added the feature request New feature or request label Jun 2, 2025
@sameerz (Collaborator) commented Jun 2, 2025

Will need to be rebased to branch-25.08 as branch-25.06 is in burndown.

Signed-off-by: Prabhjyot Singh <[email protected]>

(cherry picked from commit 1a9c55a)
Signed-off-by: Prabhjyot Singh <[email protected]>

(cherry picked from commit 5c335c4)
Signed-off-by: Prabhjyot Singh <[email protected]>

(cherry picked from commit 70cb32a)
Signed-off-by: Prabhjyot Singh <[email protected]>

(cherry picked from commit f21c60f)
Signed-off-by: Prabhjyot Singh <[email protected]>

(cherry picked from commit 14798b1)
@prabhjyotsingh prabhjyotsingh changed the base branch from branch-25.06 to branch-25.08 June 3, 2025 16:12
@prabhjyotsingh (Author)

@sameerz Sure, I've rebased this onto branch-25.08.

Labels
feature request New feature or request
4 participants