-
Notifications
You must be signed in to change notification settings - Fork 209
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
table_exists unit/integration test for NoSuchTableError #678
Conversation
MehulBatra
commented
Apr 30, 2024
•
edited
Loading
edited
- Implementation of table_exists method at Initialization level already exists bypassing it.
- Unit Test for glue, sqlcatalog and dynamodb.
- Integration test for glue and dynamodb.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm! thanks for adding tests for different catalogs. I left a few minor comments
pyiceberg/catalog/__init__.py
Outdated
@@ -394,6 +394,11 @@ def table_exists(self, identifier: Union[str, Identifier]) -> bool: | |||
Returns: | |||
bool: True if the table exists, False otherwise. | |||
""" | |||
try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we get rid of the other table_exists
function in the same file?
iceberg-python/pyiceberg/catalog/__init__.py
Lines 668 to 673 in 87c13b5
def table_exists(self, identifier: Union[str, Identifier]) -> bool: | |
try: | |
self.load_table(identifier) | |
return True | |
except NoSuchTableError: | |
return False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do you think about changing the REST catalog's table_exists
implementation to be similar to this?
https://github.com/apache/iceberg-python/pull/512/files#diff-3bda7391ebd8aa3dcfd6703d8d2764830b9d9c35fa854188a37d69611274bd3dR722-R727
maybe calls super().table_exists()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is better for RestCatalog to maintain a separate implementation to make a head request to /v1/{prefix}/namespaces/{namespace}/tables/{table}, ref: #512 (comment), #507 (comment)
The try-catch implementation is for other non-rest catalogs and thus it is put in the MetastoreCatalog
instead of the Catalog
interface.
A little bit more context regarding the MetastoreCatalog
: while working on #498, we found that many helper functions as well as some implementations are for non-rest catalogs only. So we decide to move those to another layer, MetastoreCatalog
to make the inheritance look better: #498 (comment).
Do these sound reasonable to you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense, thanks for the detailed explanation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pushed the changes, as per the discussion, thank you, guys, for the feedback!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this! I think we've had the implementation of table_exists
.
For non-rest catalog it is
iceberg-python/pyiceberg/catalog/__init__.py
Lines 668 to 673 in 87c13b5
def table_exists(self, identifier: Union[str, Identifier]) -> bool: | |
try: | |
self.load_table(identifier) | |
return True | |
except NoSuchTableError: | |
return False |
as pointed out by @kevinjqliu
For rest catalog it is
iceberg-python/pyiceberg/catalog/rest.py
Lines 778 to 791 in a892309
def table_exists(self, identifier: Union[str, Identifier]) -> bool: | |
"""Check if a table exists. | |
Args: | |
identifier (str | Identifier): Table identifier. | |
Returns: | |
bool: True if the table exists, False otherwise. | |
""" | |
identifier_tuple = self.identifier_to_tuple_without_catalog(identifier) | |
response = self._session.head( | |
self.url(Endpoints.load_table, prefixed=True, **self._split_identifier_for_path(identifier_tuple)) | |
) | |
return response.status_code == 200 |
But we do miss relevant tests for glue, sql, dynamodb, ...
Co-authored-by: Honah J. <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM! The lint issue can be resolved by executing make lint
Thank you @HonahX and @kevinjqliu for all the feedback! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the work! This will be good to go when the CI pass.
Thank you for the PR! Do you mind changing the PR title and description to match? |
Thanks everyone! Merged! |