Catalog fails to load table using the table's identifier #123
Comments
Other than the obvious edge case: loading a table from a catalog that supports 4-part namespaces and happens to have a namespace that matches the first part. I would propose that we just update Thoughts @Fokko?
Thanks for raising this, @pdames. In hindsight, I think adding the catalog name to the identifier was a bad choice. We tried to mimic the behavior of Java, but I don't see any advantages of having this (since we have a reference to the catalog anyway). If you want to refresh a table, you could also run:

    table = catalog.load_table(("test_namespace", "test_table"))  # tuple identifier also works...
    table.refresh()

However, I still think that your example should work as well. Removing it will break existing behavior, so I think @danielcweeks's suggestion is best. We probably also want to fix this for the renaming/delete/etc scenario:

    catalog.rename_table(table.identifier, 'database.new_table_name')
    catalog.drop_table(table.identifier)
    catalog.purge_table(table.identifier)
Thanks for the input @danielcweeks and @Fokko. I'll raise a PR to apply the fix recommended by @danielcweeks for review.
Resolved by #150
Apache Iceberg version
main (development)
Please describe the bug 🐞
Reproduction Steps
1. Create a catalog named `test_catalog`.
2. Create a `test_namespace.test_table` table inside of it.
3. Load the table using `catalog.load_table`, and ensure that the 3rd variation using `table.identifier` fails due to inclusion of the catalog's name (`test_catalog`) @ `table.identifier[0]`.

A more interesting question that I'd like to open for discussion as part of this issue is the best way to fix it. For example, we could argue that the bug lies with `catalog.load_table()` for not handling the catalog name as part of the identifier. Alternatively, we could argue that `table.identifier` shouldn't include the catalog name in the first place.

As a workaround in my code for now, I've created a wrapper method for `catalog.load_table(identifier)` that scrubs the current catalog's name from the given identifier before forwarding it to `catalog.load_table(identifier)`.

Environment Info
Iceberg REST Catalog running in the `tabulario/iceberg-rest` Docker container w/ a mock S3 Filesystem.
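
For readers who want to try the workaround described above, here is a minimal sketch of a wrapper that scrubs the catalog's name from an identifier before forwarding it to `catalog.load_table`. It is not the code from this issue or from the eventual fix. The helper names `to_tuple`, `scrub_catalog_name`, and `load_table_scrubbed` are hypothetical, and the sketch assumes only that the catalog object exposes a `name` attribute and a `load_table(identifier)` method. It also assumes that an identifier needs at least a namespace and a table name, so it strips the first part only when three or more parts are present.

```python
from typing import Tuple, Union

Identifier = Tuple[str, ...]


def to_tuple(identifier: Union[str, Identifier]) -> Identifier:
    """Normalize 'a.b.c' or ('a', 'b', 'c') into a tuple of parts."""
    if isinstance(identifier, str):
        return tuple(identifier.split("."))
    return tuple(identifier)


def scrub_catalog_name(catalog_name: str, identifier: Union[str, Identifier]) -> Identifier:
    """Drop a leading catalog name from an identifier, if one is present.

    Assumption: the first part is removed only when at least a namespace and
    a table name remain, so a two-part 'namespace.table' is never shortened.
    """
    parts = to_tuple(identifier)
    if len(parts) >= 3 and parts[0] == catalog_name:
        return parts[1:]
    return parts


def load_table_scrubbed(catalog, identifier: Union[str, Identifier]):
    """Hypothetical wrapper: forward the scrubbed identifier to catalog.load_table."""
    return catalog.load_table(scrub_catalog_name(catalog.name, identifier))
```

With this in place, `load_table_scrubbed(catalog, table.identifier)` would forward `("test_namespace", "test_table")` rather than the three-part identifier. The edge case raised in the comments still applies: if a catalog supports multi-part namespaces and has a top-level namespace with the same name as the catalog, this sketch cannot tell the two apart and will strip the first part.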