Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support catalog name in table identifier during load, rename, drop, and purge #150

Merged
merged 1 commit into from
Nov 25, 2023

Conversation

pdames
Copy link
Contributor

@pdames pdames commented Nov 15, 2023

Related Issue: #123

Validations Run:
make test
make test-integration
make lint

Description:
This PR fixes a bug where Catalog APIs returning a table (e.g. catalog.load_table(), catalog.create_table(), etc.) return a Table whose identifier has the catalog name prepended, but providing this identifier to catalog APIs to locate this table raises a NoSuchTable error due to the prepended catalog name.

This implementation simply removes the prepended catalog name from the identifier whenever it is composed of at least 3 elements and the catalog name matches the current catalog implementation in use.

It currently only removes the catalog name in cases where a table identifier refers to the location of an existing table, such as in catalog.load_table(), the from_identifier of catalog.rename_table(), catalog.drop_table(), and catalog.purge_table().

One side-effect of this implementation decision is illustrated in the following code, which fails to load a table with a 2-part namespace whose first element is identical to the catalog name:

catalog = RestCatalog("rest", uri=TEST_URI, token=TEST_TOKEN)
table_identifier = ("rest", "pdames", "table")
table = catalog.create_table(identifier=table_identifier, ...)  # table.identifier string is "rest.rest.pdames.table"
loaded_table = catalog.load_table(table.identifier)  # loads the created table successfully
catalog.load_table(table_identifier)  # NoSuchTable error due to removing the namespace that matches the catalog name 

In this case, we could fallback to again trying to load the table with the catalog name included after catching the NoSuchTable error, but this may result in us loading a different table than intended since we don't know if the first identifier element is meant to refer to a namespace or a catalog name.

I'm curious to hear reviewers opinions about how concerning these types of side-effects are, potential fixes, and if this calls for further refactoring to add the missing context in the form of identifier metadata, type-differentiated identifier elements, etc.

@pdames
Copy link
Contributor Author

pdames commented Nov 15, 2023

FYI @Fokko and @danielcweeks

Copy link
Contributor

@Fokko Fokko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good @pdames Thanks for fixing this, and also refactoring the tests. Appreciate it 👍

@Fokko Fokko merged commit a57cb07 into apache:main Nov 25, 2023
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants