
[Bug] The inclusion of high-version Spark classes in paimon-spark-common may lead to certain exceptions. #5037

thomasg19930417 opened this issue Feb 8, 2025 · 4 comments
Labels
bug Something isn't working


Search before asking

  • I searched in the issues and found nothing similar.

Paimon version

1.0.0

Compute Engine

Spark 3.3

Minimal reproduce step

Execute a desc tableName or show create table statement.

What doesn't meet your expectations?

scala> spark.sql("show create table tableName").show(false)
java.lang.NoClassDefFoundError: [Lorg/apache/spark/sql/connector/catalog/Column;
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
at java.lang.Class.getMethod0(Class.java:3018)
at java.lang.Class.getMethod(Class.java:1784)
at org.apache.kyuubi.plugin.spark.authz.util.AuthZUtils$.invoke(AuthZUtils.scala:63)
at org.apache.kyuubi.plugin.spark.authz.util.AuthZUtils$.invokeAs(AuthZUtils.scala:77)
at org.apache.kyuubi.plugin.spark.authz.serde.TableExtractor$.getOwner(tableExtractors.scala:50)
at org.apache.kyuubi.plugin.spark.authz.serde.ResolvedTableTableExtractor.apply(tableExtractors.scala:103)
at org.apache.kyuubi.plugin.spark.authz.serde.ResolvedTableTableExtractor.apply(tableExtractors.scala:97)
at org.apache.kyuubi.plugin.spark.authz.serde.TableDesc.extract(Descriptor.scala:244)
at org.apache.kyuubi.plugin.spark.authz.PrivilegesBuilder$.getTablePriv$1(PrivilegesBuilder.scala:128)
at org.apache.kyuubi.plugin.spark.authz.PrivilegesBuilder$.$anonfun$buildCommand$7(PrivilegesBuilder.scala:174)
at scala.collection.immutable.List.foreach(List.scala:431)
at org.apache.kyuubi.plugin.spark.authz.PrivilegesBuilder$.buildCommand(PrivilegesBuilder.scala:172)
at org.apache.kyuubi.plugin.spark.authz.PrivilegesBuilder$.build(PrivilegesBuilder.scala:224)
at org.apache.kyuubi.plugin.spark.authz.ranger.RuleAuthorization$.checkPrivileges(RuleAuthorization.scala:50)
at org.apache.kyuubi.plugin.spark.authz.ranger.RuleAuthorization.apply(RuleAuthorization.scala:36)
at org.apache.kyuubi.plugin.spark.authz.ranger.RuleAuthorization.apply(RuleAuthorization.scala:33)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:211)
at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
at scala.collection.immutable.List.foldLeft(List.scala:91)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:208)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1$adapted(RuleExecutor.scala:200)
at scala.collection.immutable.List.foreach(List.scala:431)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:200)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:179)
at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:88)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeAndTrack(RuleExecutor.scala:179)
at org.apache.spark.sql.execution.QueryExecution.$anonfun$optimizedPlan$1(QueryExecution.scala:126)
at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$2(QueryExecution.scala:185)
at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:510)
at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:185)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:184)
at org.apache.spark.sql.execution.QueryExecution.optimizedPlan$lzycompute(QueryExecution.scala:122)
at org.apache.spark.sql.execution.QueryExecution.optimizedPlan(QueryExecution.scala:118)
at org.apache.spark.sql.execution.QueryExecution.assertOptimized(QueryExecution.scala:136)
at org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:154)
at org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:151)
at org.apache.spark.sql.execution.QueryExecution.simpleString(QueryExecution.scala:204)
at org.apache.spark.sql.execution.QueryExecution.org$apache$spark$sql$execution$QueryExecution$$explainString(QueryExecution.scala:249)
at org.apache.spark.sql.execution.QueryExecution.explainString(QueryExecution.scala:218)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:103)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:169)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:95)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:98)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:94)
at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:584)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:176)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:584)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:560)
at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:94)
at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:81)
at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:79)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:220)
at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:100)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:97)
at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:622)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:617)
... 47 elided
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.connector.catalog.Column
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 118 more

Anything else?

I'm using Kyuubi's Spark authorization plugin. I found that the exception occurs during permission checks because Paimon's SparkTable class carries references to classes that only exist in newer Spark versions. Comparing the implementations across versions: in Paimon 0.8 and earlier, SparkTable was a Java class; since 0.9 it has been rewritten in Scala, and after compilation it imports classes (here org.apache.spark.sql.connector.catalog.Column, which Spark 3.3 does not ship) that only exist in newer Spark versions.
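
To illustrate the mechanism, here is a minimal sketch with hypothetical classes (Missing, Table, and Repro are illustrations, not Paimon or Kyuubi code). Class.getMethod resolves the signatures of all declared methods of the class, so a single method whose return type is absent from the runtime classpath fails the whole lookup, even though the method actually being invoked never touches that type:

```java
// Compile all three classes together, then delete Missing.class and run Repro:
// the lookup of owner() fails with a NoClassDefFoundError for the Missing
// array type, because getDeclaredMethods0 must resolve every method signature
// of Table before getMethod can return.
class Missing {}                                 // plays the role of Spark 3.4's Column

class Table {
    public String owner() { return "alice"; }    // the method the caller actually wants
    public Missing[] columns() { return null; }  // signature referencing the absent class
}

public class Repro {
    public static void main(String[] args) throws Exception {
        // Throws NoClassDefFoundError before owner() can be invoked reflectively.
        Object owner = Table.class.getMethod("owner").invoke(new Table());
        System.out.println(owner);
    }
}
```

This matches the stack trace above: the error is raised inside Class.getDeclaredMethods0 while Kyuubi's AuthZUtils is only trying to reflectively read the table owner.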

Are you willing to submit a PR?

  • I'm willing to submit a PR!
@thomasg19930417 added the bug label Feb 8, 2025

@thomasg19930417 (Author) commented Feb 10, 2025

We should keep using Java for the implementation here; that should avoid the import issues introduced by Scala compilation against a newer Spark. Alternatively, would it be possible to build against specific Spark versions and produce a different common JAR per version?
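
As a side note, a different mitigation pattern (a sketch of a general defensive technique, not Kyuubi's or Paimon's actual code) is to probe whether the newer class can be loaded before reflecting over classes whose method signatures mention it:

```java
// Hypothetical helper, for illustration only. Class.forName with
// initialize=false merely checks that the class can be located, so on a
// Spark 3.3 classpath this returns false instead of letting a later
// reflective call fail with NoClassDefFoundError.
public final class SparkVersionProbe {
    static boolean hasCatalogColumn(ClassLoader cl) {
        try {
            Class.forName("org.apache.spark.sql.connector.catalog.Column", false, cl);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasCatalogColumn(SparkVersionProbe.class.getClassLoader()));
    }
}
```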

@thomasg19930417 (Author) commented

Attempts to change the POM to the corresponding Spark version resulted in compilation failures, because some features from newer Spark versions are used directly in the code.

@thomasg19930417 (Author) commented

@JingsongLi Could you spare some time to take a look at this issue? It is currently blocking our upgrade to Paimon 1.0.

@thomasg19930417 (Author) commented

This problem was solved by adding an empty implementation of Column in paimon-spark3.3.
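
For reference, a sketch of what such a placeholder might look like (hypothetical; the actual patch may differ): an empty type with the fully qualified name of the Spark 3.4+ class, compiled into the paimon-spark3.3 module purely so the JVM can resolve SparkTable's method signatures. Nothing on Spark 3.3 ever uses it:

```java
// Placeholder sketch, not necessarily the committed fix. Its only job is to
// be loadable, so that reflective method resolution over Paimon's SparkTable
// succeeds on a Spark 3.3 classpath; no Spark 3.3 code path ever calls it.
package org.apache.spark.sql.connector.catalog;

public interface Column {
}
```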
