Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add view_exists method to REST Catalog #1242

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

shiv-io
Copy link

@shiv-io shiv-io commented Oct 20, 2024

Part of the adding view support to the REST catalog: #818

Todo:

  • Add tests
  • Add docs

Please let me know what the appropriate place to add docs would be

@shiv-io shiv-io marked this pull request as ready for review October 20, 2024 20:26
@shiv-io
Copy link
Author

shiv-io commented Oct 20, 2024

When I tested catalog.view_exists('default.bar') with a local REST catalog, I got the following exception. This also occurs with the existing catalog.table_exists() method. Is this expected?

from pyiceberg.catalog.rest import RestCatalog

catalog = RestCatalog(
    name='local',
    **{
        'uri': 'http://0.0.0.0:8181'
    }
)

catalog.view_exists('default.bar')

# Traceback (most recent call last):
#   File "/Users/shivgupta/projects/iceberg-python/pyiceberg/catalog/rest.py", line 916, in view_exists
#     response.raise_for_status()
#   File "/Users/shivgupta/projects/iceberg-python/venv/lib/python3.9/site-packages/requests/models.py", line 1024, in raise_for_status
#     raise HTTPError(http_error_msg, response=self)
# requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: http://0.0.0.0:8181/v1/namespaces/default/views/bar

# The above exception was the direct cause of the following exception:

# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
#   File "/Users/shivgupta/projects/iceberg-python/venv/lib/python3.9/site-packages/tenacity/__init__.py", line 336, in wrapped_f
#     return copy(f, *args, **kw)
#   File "/Users/shivgupta/projects/iceberg-python/venv/lib/python3.9/site-packages/tenacity/__init__.py", line 475, in __call__
# >>> catalog.view_exists('default.bar')
# Traceback (most recent call last):
#   File "/Users/shivgupta/projects/iceberg-python/pyiceberg/catalog/rest.py", line 916, in view_exists
#     response.raise_for_status()
#   File "/Users/shivgupta/projects/iceberg-python/venv/lib/python3.9/site-packages/requests/models.py", line 1024, in raise_for_status
#     raise HTTPError(http_error_msg, response=self)
# requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: http://0.0.0.0:8181/v1/namespaces/default/views/bar

# The above exception was the direct cause of the following exception:

# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
#   File "/Users/shivgupta/projects/iceberg-python/venv/lib/python3.9/site-packages/tenacity/__init__.py", line 336, in wrapped_f
#     return copy(f, *args, **kw)
#   File "/Users/shivgupta/projects/iceberg-python/venv/lib/python3.9/site-packages/tenacity/__init__.py", line 475, in __call__
#     do = self.iter(retry_state=retry_state)
#   File "/Users/shivgupta/projects/iceberg-python/venv/lib/python3.9/site-packages/tenacity/__init__.py", line 376, in iter
#     result = action(retry_state)
#   File "/Users/shivgupta/projects/iceberg-python/venv/lib/python3.9/site-packages/tenacity/__init__.py", line 398, in <lambda>
#     self._add_action_func(lambda rs: rs.outcome.result())
#   File "/Users/shivgupta/.pyenv/versions/3.9.16/lib/python3.9/concurrent/futures/_base.py", line 439, in result
#     return self.__get_result()
#   File "/Users/shivgupta/.pyenv/versions/3.9.16/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
#     raise self._exception
#   File "/Users/shivgupta/projects/iceberg-python/venv/lib/python3.9/site-packages/tenacity/__init__.py", line 478, in __call__
#     result = fn(*args, **kwargs)
#   File "/Users/shivgupta/projects/iceberg-python/pyiceberg/catalog/rest.py", line 918, in view_exists
#     self._handle_non_200_response(exc, {})
#   File "/Users/shivgupta/projects/iceberg-python/pyiceberg/catalog/rest.py", line 472, in _handle_non_200_response
#     raise exception(response) from exc
# pyiceberg.exceptions.BadRequestError: RESTError 400: Could not decode json payload: 

Note, I used the tabulario/iceberg-rest image to spin up a REST catalog server locally

@sungwy
Copy link
Collaborator

sungwy commented Oct 26, 2024

Hi @shiv-io - thank you for putting together this PR!

When I tested catalog.view_exists('default.bar') with a local REST catalog, I got the following exception. This also occurs with the existing catalog.table_exists() method. Is this expected?

Yes, we have seen issues with specific implementations of the REST Catalog exhibiting issues with certain endpoints. Which REST Catalog implementation/image are you using to run your local test?

@shiv-io
Copy link
Author

shiv-io commented Oct 26, 2024

@sungwy I used tabulario/iceberg-rest image to spin up a REST catalog server locally to test with

Copy link
Collaborator

@sungwy sungwy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it @shiv-io . Unfortunately the tabular REST catalog image doesn't expose the View endpoints. I have WIP PR to address this issue in our test suite, and update our integration test suite to use a REST Catalog image that has newer endpoints implemented.

However, I think the view_exists endpoint is simple enough to test through mocked unit tests and merge. I think your implementation looks good - could we lint and include docs to the API docs?

namespace = "examples"
view = "some_view"
rest_mock.head(
f"{TEST_URI}v1/namespaces/{namespace}/views/{view}",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add a test case here for a multi-level namespace as well?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sungwy thanks! Is that as simple as creating a test case with:

multilevel_namespace = "core.examples.some_namespace"
view = "some_view"
rest_mock.head(
    f"{TEST_URI}v1/namespaces/{multilevel_namespace}/views/{view}",
    status_code=404,
    request_headers=TEST_HEADERS,
)

@shiv-io
Copy link
Author

shiv-io commented Nov 8, 2024

@sungwy thanks for reviewing! Let me know if there's anything else. I noticed we don't yet have docs for list_views (#817) and drop_views (#820), btw.

@shiv-io
Copy link
Author

shiv-io commented Nov 19, 2024

@sungwy just bumping this in case this fell off your radar!

@sungwy
Copy link
Collaborator

sungwy commented Dec 7, 2024

Hi @shiv-io !

Sorry this fell off my radar. Could we actually add an integration test here as well, now that we have a new REST Catalog image we are testing against that supports the view endpoints? :)

#1389

@Fokko
Copy link
Contributor

Fokko commented Jan 15, 2025

@shiv-io Gentle ping :)

@shiv-io
Copy link
Author

shiv-io commented Jan 18, 2025

@sungwy I'm seeing that the /v1/namespaces/default/views endpoint doesn't return any views. How would I go about adding an integration test for view_exists given pyiceberg currently doesn't support create view? Appreciate any direction here in case I'm missing something

@sungwy
Copy link
Collaborator

sungwy commented Jan 18, 2025

Hi @shiv-io you make a great point!

I think we could still create an integration test by leveraging the fact that Iceberg is a language agnostic spec. We have a spark session fixture specified in our integration test suite that could be used to create a view, that could in turn be queries by PyIceberg.

Invoking an action using PyIceberg and then verifying the behavior using Spark is a common pattern in our integration test suite, and I think we could do the reverse to create the view using Spark and then verify its existence using PyIceberg function instead in this case.

@shiv-io
Copy link
Author

shiv-io commented Jan 19, 2025

Makes sense, @sungwy -- thanks! Added the integration test and it passed. Let me know if that looks good!

@sungwy
Copy link
Collaborator

sungwy commented Jan 22, 2025

This looks good to me @shiv-io ! Thank you very much for including the integration test to test the API :)

Could we run make lint to make the CI pass? Other than that, this looks good to be merged!

@shiv-io
Copy link
Author

shiv-io commented Jan 22, 2025

@sungwy thanks for the review :)

Copy link
Contributor

@kevinjqliu kevinjqliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for the contribution :)

Copy link
Collaborator

@sungwy sungwy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm as well. Thank you for putting together this PR @shiv-io 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants