feat(puffin): Add reader and writer #714

Draft. Wants to merge 2 commits into base: main.

@fqaiser94 (Contributor) commented Nov 24, 2024

Fixes: #744

Do not review

I am currently breaking this up into multiple PRs to make it easier to review (you can follow progress in the ticket).
Otherwise, this PR should be functionally complete and at feature parity with the Java implementation.

Summary

Adds Puffin file format reader and writer implementations.

Testing

Added unit tests.

In particular, I have tests that check that the Rust-generated Puffin files are bitwise identical to their Java-generated counterparts.
You can perform the same check manually using:

cd crates/puffin/testdata/v1
diff java-generated/empty-puffin-uncompressed.bin rust-generated/empty-puffin-uncompressed.bin
diff java-generated/sample-metric-data-uncompressed.bin rust-generated/sample-metric-data-uncompressed.bin
diff java-generated/sample-metric-data-compressed-zstd.bin rust-generated/sample-metric-data-compressed-zstd.bin
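
For anyone who prefers to run the same comparison from Rust rather than with diff, here is a minimal sketch of a byte-for-byte check using only the standard library. The helper names `first_difference` and `files_first_difference` are hypothetical and are not taken from this PR; the actual tests may be structured differently.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Returns the offset of the first differing byte, or `None` if the two
/// slices are identical. If one slice is a strict prefix of the other,
/// the reported offset is the length of the shorter slice.
fn first_difference(a: &[u8], b: &[u8]) -> Option<usize> {
    a.iter()
        .zip(b.iter())
        .position(|(x, y)| x != y)
        .or_else(|| (a.len() != b.len()).then(|| a.len().min(b.len())))
}

/// Reads both files fully into memory and compares them byte by byte.
/// Hypothetical helper: fine for small test fixtures, not for large files.
fn files_first_difference(a: &Path, b: &Path) -> io::Result<Option<usize>> {
    let left = fs::read(a)?;
    let right = fs::read(b)?;
    Ok(first_difference(&left, &right))
}
```

Calling `files_first_difference` on a java-generated file and its rust-generated counterpart should return `Ok(None)` when the two are identical. Unlike diff on binary files, a mismatch also reports the offset where the files first diverge.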

Out of Scope

  • Support for LZ4 compression/decompression
    • The Java library does not currently support this either. Because the Java implementations are the reference implementations in the Iceberg ecosystem, I would like to implement LZ4 support on the Java side first and then on the Rust side. At that point, we can also easily check that the Java- and Rust-generated outputs are bitwise identical.

@fqaiser94 fqaiser94 force-pushed the puffin branch 9 times, most recently from a617f4e to 61ac9bf Compare November 30, 2024 20:01

Successfully merging this pull request may close these issues.

Support Puffin file format