-
Notifications
You must be signed in to change notification settings - Fork 207
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
1191:Added docstrings to the pyiceberg/table/inspect.py file #1533
base: main
Are you sure you want to change the base?
1191:Added docstrings to the pyiceberg/table/inspect.py file #1533
Conversation
pyiceberg/table/inspect.py
Outdated
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's avoid changing the LICENSE, the linebreak is now also a bit awkward
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added a few comments here, thanks for the PR
pyiceberg/table/inspect.py
Outdated
"""shot ID, | ||
and optional configuration parameters. | ||
Retrieve references from the Iceberg table metadata as a PyArrow Table. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is cut off
pyiceberg/table/inspect.py
Outdated
Retrieve partition information from the Iceberg table as a PyArrow Table. | ||
|
||
Args: | ||
snapshot_id (Optional[int]): The snapshot ID to filter partitions. If not provided, all partitions are included. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If not provided, all partitions are included.
I dont think this is right, the snapshot_id
for all of these functions are used to time travel to a specific snapshot. otherwise the current snapshot will be used
pyiceberg/table/inspect.py
Outdated
import pyarrow as pa | ||
from typing import Optional, List, Dict, Any, Set | ||
from datetime import datetime, timezone | ||
from pyiceberg.table.snapshots import MetadataLogEntry | ||
from pyiceberg.io.pyarrow import schema_to_pyarrow | ||
from pyiceberg.table import Snapshot, ManifestContent, DataFileContent, PartitionSpec | ||
from pyiceberg.utils import from_bytes | ||
from executor_factory import ExecutorFactory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
imports should be on top of the file
pyiceberg/table/inspect.py
Outdated
from pyiceberg.utils import from_bytes | ||
from executor_factory import ExecutorFactory | ||
|
||
class IcebergTableUtils: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
class IcebergTableUtils: |
i dont think we need a new class here
"Thank you for the detailed feedback! I'll make the following updates: |
I have pushed the latest changes, including refinements to the docstrings and updates to related files. Please review the updates. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the contribution! I've added some comments, i think theres a linter error, could you run make lint
locally?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is part of pyenv's local config, should not be checked into the repo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it! I will remove .python-version from the PR.
@@ -30,7 +30,8 @@ | |||
- [Verify a release](verify-release.md) | |||
- [How to release](how-to-release.md) | |||
- [Release Notes](https://github.com/apache/iceberg-python/releases) | |||
- [Code Reference](reference/) | |||
- [Code Reference](reference/pyiceberg/index.md) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why was this changed? i dont think this is necessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will remove the change from the SUMMARY.md file if it's not necessary. Thanks for pointing it out!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you rebase this PR against main to get the latest change from #1538?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the suggestion! I’m not very familiar with rebasing yet, but I’m eager to learn. Could you guide me on how to properly rebase this PR against the main branch, or would merging the latest changes from #1538 be a better approach in this case? I want to ensure I’m following the best practices.
paths: | ||
- pyiceberg |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same for this, what is this change for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for pointing this out! I made this change while addressing errors in inspect.py. However, I’ll review it again to ensure it’s necessary and aligned with the overall structure. Please let me know if you have additional context or suggestions.
@@ -28,12 +29,24 @@ | |||
from pyiceberg.utils.singleton import _convert_to_hashable_type | |||
|
|||
if TYPE_CHECKING: | |||
import pyarrow as pa |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we still need this here
@@ -57,7 +87,21 @@ def _get_snapshot(self, snapshot_id: Optional[int] = None) -> Snapshot: | |||
raise ValueError("Cannot get a snapshot as the table does not have any.") | |||
|
|||
def snapshots(self) -> "pa.Table": | |||
import pyarrow as pa |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we actually need this here, in case someone imports this function directly
from pyiceberg.table.inspect import snapshots
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand. I will keep the import for pyarrow as suggested, in case the function is used directly.
Args: | ||
snapshot_id (Optional[int]): The ID of the snapshot to retrieve entries for. | ||
If None, entries for the current snapshot are returned. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i like this description of how snapshot_id
is used, can we apply this for all similar functions
perhaps something more generic
Args: | |
snapshot_id (Optional[int]): The ID of the snapshot to retrieve entries for. | |
If None, entries for the current snapshot are returned. | |
Args: | |
snapshot_id (Optional[int]): The ID of the snapshot to retrieve. If None, the current snapshot is used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the feedback! I’m glad you liked the description. I’ll review other similar functions and update their docstrings to use a more generic and consistent phrasing as suggested. Let me know if there are any additional improvements you'd like to see
Closes #1191
Added detailed docstrings to the pyiceberg/table/inspect.py file to improve documentation and code clarity. The updates enhance readability and help developers understand the functionality of the InspectTable class and its methods.