Add HDFS StorageBackend implementation #583
tigrulya-exe wants to merge 4 commits into Aiven-Open:main from
Thanks @tigrulya-exe! This looks like a great addition with quite complete coverage of the storage back-end. However, I'm hesitant to move forward with the review, as I lack the HDFS experience to be useful on anything apart from the API usage. There is also still some work on the project that we would like to prioritize before onboarding a new back-end: preparing for Tiered Storage becoming production-ready in 3.9 or later, adding docs and a release process, etc. One alternative while this is open for discussion is to point to your fork (or a separate repo with just HDFS) from our README to let users know there's an HDFS implementation. Let me know wdyt, and thanks again for your contribution!
@jeqo Hi! Thank you for the feedback! I think it's a nice idea to point to our fork with the HDFS storage implementation in your README while this PR is open for discussion :) I don't think we need to create a separate repository just for HDFS, as it could complicate porting future features from the main repository.
@jeqo Hi! Eventually, we decided to move our implementation of the storage backend to a separate repository. However, we discovered that there are no publicly available Maven repositories containing your JARs. Could you please publish them to a public Maven repository so that we and other potential developers of custom storage backends can use them without having to build the core project locally? We would also be grateful if you publish the
@tigrulya-exe thanks for the update! Yes, I'm working on this. Could you check whether the snapshot artifacts are available for you? For example, this is the JAR for test-fixtures: https://oss.sonatype.org/service/local/repositories/snapshots/content/io/aiven/tiered-storage-for-apache-kafka-storage-core/0.0.1-SNAPSHOT/tiered-storage-for-apache-kafka-storage-core-0.0.1-20250306.172800-8-test-fixtures.jar
@jeqo Hi! Thanks for the quick reply! I deleted the locally built core project artifacts from the local Gradle cache and added
This PR adds support for HDFS as a StorageBackend implementation. It also provides Kerberos authentication through the use of a provided keytab and supports asynchronous metric collection based on HDFS client file system statistics.
Users can provide HDFS client configuration in two ways: either by using traditional XML files, specifying their location in the `hdfs.core-site.path` and `hdfs.hdfs-site.path` options, or by passing the configuration options as regular Kafka options with the `hdfs.conf.` prefix.
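To illustrate the second option, here is a minimal sketch of how prefixed Kafka options could be turned into plain Hadoop client options. It is not the code from this PR: the class name `HdfsConfPrefixSketch` and the method `extractHadoopOptions` are hypothetical, and the sketch assumes that the prefix is simply stripped from each key. `fs.defaultFS` and `dfs.replication` are standard Hadoop properties; the host name and file path are placeholders.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, for illustration only; not part of this PR.
final class HdfsConfPrefixSketch {
    // Prefix named in the PR description.
    static final String PREFIX = "hdfs.conf.";

    // Returns the entries whose key starts with PREFIX, with the prefix stripped
    // (assumed behaviour). Other keys, e.g. the XML path options, are ignored.
    static Map<String, String> extractHadoopOptions(final Map<String, ?> kafkaOptions) {
        final Map<String, String> hadoopOptions = new TreeMap<>();
        for (final Map.Entry<String, ?> entry : kafkaOptions.entrySet()) {
            final String key = entry.getKey();
            final boolean hasSuffix = key.startsWith(PREFIX) && key.length() > PREFIX.length();
            if (hasSuffix && entry.getValue() != null) {
                hadoopOptions.put(key.substring(PREFIX.length()), String.valueOf(entry.getValue()));
            }
        }
        return hadoopOptions;
    }

    public static void main(final String[] args) {
        final Map<String, Object> kafkaOptions = new TreeMap<>();
        // Option 2: Hadoop options passed inline with the hdfs.conf. prefix.
        kafkaOptions.put("hdfs.conf.fs.defaultFS", "hdfs://namenode.example:8020");
        kafkaOptions.put("hdfs.conf.dfs.replication", 2);
        // Option 1: path to an XML file; left untouched by this helper.
        kafkaOptions.put("hdfs.core-site.path", "/etc/hadoop/conf/core-site.xml");
        System.out.println(extractHadoopOptions(kafkaOptions));
    }
}
```

In a real backend the resulting map would presumably be applied to a Hadoop `Configuration` object; that step is left out here so the sketch runs without Hadoop on the classpath.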