-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix NOTICE and LICENSE in the spark-runtime jar #12160
Conversation
e366cdf
to
8efaf1c
Compare
spark/v3.5/spark-runtime/NOTICE
Outdated
@@ -500,9 +363,52 @@ file: | |||
This binary artifact includes Project Nessie with the following in its NOTICE | |||
file: | |||
|
|||
| NOTICE applicable to Nessie source |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update Nessie NOTICE
from the latest version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As in #12145, I think we need Nessie to update license documentation before we can conclude that this is accurate. The current NOTICE for binary artifacts (that we use) still includes a line about GPL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated Nessie NOTICE
corresponding to 0.102.5 release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it mostly looks great, just one comment on keeping the mention of Presto in the LICENSE. Thank you @jbonofre !
The description for this issue has:
I don't see the change anymore, but in case I'm missing it: this was included because source code is partially based on a Delta class, so to avoid any doubt we wanted to include it:
|
3d834a0
to
163f5c2
Compare
Yes, it was my bad. I re-add delta mention when I saw the code in the root |
163f5c2
to
497f6b8
Compare
spark/v3.5/spark-runtime/NOTICE
Outdated
| Copyright 2015-2025 Dremio Corporation | ||
| | ||
| --------------------------------------- | ||
| This project includes code from Apache Iceberg, with the following in its NOTICE file: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be trimmed because it is from our own NOTICE. We are responsible for all the content about what Iceberg ships, so there is no need to duplicate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I trimmed the Nessie NOTICE
by removing Iceberg and Netty (as it's already mentioned in the Iceberg Spark runtime NOTICE
). I keep Polaris mention as it's not duplicate (ok, it's about build/gradle resources in Polaris, but for consistency, better to keep).
spark/v3.5/spark-runtime/NOTICE
Outdated
| | to the ASF by Snowflake Inc. (https://www.snowflake.com/) copyright 2024. | ||
| | ||
| --------------------------------------- | ||
| This project includes code from Netty, with the following in its NOTICE file: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's too bad that this was based on a source inclusion in Nessie. Nessie could probably check that none of this applies to ResolveConf that was copied to avoid more work for downstream bundlers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Nessie is using a similar approach as in Iceberg: as discussed, Iceberg is using LICENSE
and NOTICE
from source in jar files by default (only bundle and runtime jars have specific LICENSE
and NOTICE
.
For instance, Iceberg mentions ScriptRunner
which is used for iceberg-hive-metastore test, but mention in all Iceberg jars. Or AssignmentAlignmentSupport
is used only in iceberg-spark-extensions but mentioned in all iceberg jars.
So, I'm not very confortable to ask Nessie to change that (even if it's what I asked but the Nessie team says they will plan this as it's more work) when Iceberg is doing the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The difference is that we're more selective with NOTICE content when code is copied. That's because we can more easily verify that sections of the original project's NOTICE do or do not apply to the specific section that was copied or used.
497f6b8
to
f574393
Compare
f574393
to
126bf34
Compare
I'm seeing
In the license, but I'm not able to find this in the dependency tree |
The Parquet License still references JavaFastPFOR https://github.com/apache/parquet-java/blob/parquet-1.15.x/LICENSE#L189 and I still see some references to the source in Parquet https://github.com/apache/parquet-java/blob/fb6f0be0323f5f52715b54b8c6602763d8d0128d/parquet-generator/src/main/java/org/apache/parquet/encoding/bitpacking/IntBasedBitPackingGenerator.java#L29 . I think we'd need to keep this, until we update Parquet (based on the code comment it looks like that code path I referenced is no longer really used upstream)? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the clarification @amogh-jahagirdar, I need to work on my grep
skills :D
This PR does:
Avro
copyrightParquet
copyrightThrift
copyrightORC
copyrightHive
copyrightNOTICE