Description
Feature description
I would like to add support for Iceberg catalogs other than SQLite to the OSS version of DLT. I believe this is not a complicated addition, since we can leverage existing pyiceberg functionality: catalog configs can be read from .pyiceberg.yaml and the catalog obtained via load_catalog. The rest of the implementation can then continue as normal.
Are you a dlt user?
Yes, I run dlt in production.
Use case
Yes! So far DLT is awesome, but the integration with Iceberg catalogs is incomplete (understandably, as this is offered in DLT+). While I understand that DLT+ may offer much better features on top of this, I think the option to connect to more catalogs in the OSS version is key to making DLT a top-tier ingestion framework (which it already is, but this would be like top tier ++++++).
Proposed solution
I am preparing a PR that leverages the load_catalog function in pyiceberg. Everything else largely remains the same: we use load_catalog to find the catalog and return it to the pipeline as normal. I added a few other elements, such as constructing the catalog config when required, but in essence this is the focus.
Right now I am using this approach by monkey-patching DLT before running the pipeline. I am currently running tests and making some corrections to the PR.
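To illustrate the idea, here is a minimal sketch of what the catalog lookup could look like. It is not the code from the PR: the helper names `build_catalog_properties` and `get_catalog`, the catalog name, and the URI and warehouse values are all hypothetical. The only real API it relies on is `pyiceberg.catalog.load_catalog(name, **properties)`, which, when called with just a name, resolves the catalog from .pyiceberg.yaml or `PYICEBERG_CATALOG__*` environment variables.

```python
from typing import Any, Dict, Optional


def build_catalog_properties(
    catalog_type: str,
    uri: str,
    warehouse: Optional[str] = None,
    extra: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Assemble flat string properties to pass to pyiceberg's load_catalog.

    Hypothetical helper: stands in for "constructing the catalog config
    when required".
    """
    props: Dict[str, str] = {"type": catalog_type, "uri": uri}
    if warehouse is not None:
        props["warehouse"] = warehouse
    if extra:
        # Catalog-specific settings (credentials, tokens, ...) go in as-is.
        props.update(extra)
    return props


def get_catalog(name: str, properties: Optional[Dict[str, str]] = None) -> Any:
    """Return a pyiceberg catalog by name (hypothetical wrapper).

    With no explicit properties, pyiceberg looks the name up in
    .pyiceberg.yaml or in PYICEBERG_CATALOG__* environment variables.
    """
    # Imported lazily so the config helper above works without pyiceberg.
    from pyiceberg.catalog import load_catalog

    return load_catalog(name, **(properties or {}))


# Example values only (assumptions): a REST catalog on localhost.
rest_props = build_catalog_properties(
    "rest",
    "http://localhost:8181",
    warehouse="s3://my-bucket/warehouse",
)
# catalog = get_catalog("my_catalog", rest_props)  # needs a reachable catalog
```

The pipeline would receive the returned catalog object exactly as it does today for the SQLite case, so only the lookup step changes.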
Related issues
No response