Delta Tables and Parquet Files #2882
hpatner
started this conversation in
Integrations
Replies: 1 comment
👋 @hpatner I think it should now be possible to work with Delta Lake tables using ClickHouse's Delta Lake table function (https://clickhouse.com/docs/en/sql-reference/table-functions/deltalake) together with ClickHouse cells (livebook-dev/kino_db#83). Alternatively, DuckDB with its Delta extension (https://duckdb.org/2024/06/10/delta.html) might also work. Or a wrapper around https://github.com/delta-io/delta-kernel-rs could be used to get the data directly into Explorer quickly.
It would be cool if Livebook had the ability to work with Parquet files through the Delta table format. It takes CSV files, converts them to Parquet files, and adds a JSON log. The JSON log records every operation, which allows versioning and rollbacks (time travel). Together, you get ACID transactions, metadata handling, and a great base for working with large datasets. Databricks and Microsoft Fabric have both implemented this, but the format is open source and seems like it would be a nice fit for Livebook, since storage could be local or on an S3 instance without the need for a database engine for compute.
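To make the "JSON log gives you versioning" idea concrete, here is a rough sketch in plain Python (standard library only). It is a toy, not the real Delta Lake implementation: it borrows the `_delta_log` directory name, the zero-padded commit file names, and the `add`/`remove` action shapes from the Delta format, but skips everything else (protocol and metadata actions, checkpoints, concurrency control) and never writes actual Parquet data. The helper names `commit`, `live_files`, and `demo` are made up for illustration.

```python
from __future__ import annotations

import json
import tempfile
from pathlib import Path

LOG_DIR = "_delta_log"  # directory name Delta Lake uses for its commit log


def commit(table: Path, actions: list[dict]) -> int:
    """Write one commit file containing the given actions; return its version."""
    log = table / LOG_DIR
    log.mkdir(parents=True, exist_ok=True)
    version = len(list(log.glob("*.json")))  # next version = number of commits so far
    # One JSON action per line, stored in a zero-padded, sequentially numbered file.
    (log / f"{version:020d}.json").write_text(
        "\n".join(json.dumps(a) for a in actions) + "\n"
    )
    return version


def live_files(table: Path, version: int | None = None) -> set[str]:
    """Replay the log up to `version` (inclusive) and return the live data files."""
    files: set[str] = set()
    # Zero-padded names sort lexicographically in commit order.
    for entry in sorted((table / LOG_DIR).glob("*.json")):
        if version is not None and int(entry.stem) > version:
            break
        for line in entry.read_text().splitlines():
            action = json.loads(line)
            if "add" in action:
                files.add(action["add"]["path"])
            elif "remove" in action:
                files.discard(action["remove"]["path"])
    return files


def demo() -> tuple[set[str], set[str], set[str]]:
    """Make three commits, then read the table at v0, v1, and the latest version."""
    with tempfile.TemporaryDirectory() as tmp:
        table = Path(tmp) / "events"
        commit(table, [{"add": {"path": "part-0.parquet"}}])  # v0
        commit(table, [{"add": {"path": "part-1.parquet"}}])  # v1
        commit(table, [  # v2: replace part-0 with a rewritten file
            {"remove": {"path": "part-0.parquet"}},
            {"add": {"path": "part-2.parquet"}},
        ])
        return live_files(table, 0), live_files(table, 1), live_files(table)
```

Calling `demo()` returns the set of live files at version 0, version 1, and the latest version. Because old commits are never edited, reading an earlier version ("time travel") is just a matter of stopping the replay early, and a rollback amounts to a new commit that restores an earlier file set.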