Python: add support to optimize, analyze tables and expire snapshots, remove orphan files #8183
Comments
Thanks @alaturqua for raising this. It would be cool to add this. Can you elaborate on what you mean by analyze tables?
I mean letting it gather statistics. I don't know exactly whether the metadata files hold statistics. Trino, for example, has a function for this. Additionally, letting it roll back snapshots or register Iceberg tables would be nice as well.
@alaturqua That's an interesting thought. I think that should be quite straightforward. We could collect all the column metrics, and combine them to show
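To make the idea of combining column metrics concrete, here is a minimal sketch of how per-file metrics could be folded into table-level statistics. `FileMetrics`, `ColumnSummary`, and `summarize` are hypothetical names for illustration, not PyIceberg APIs, and the sketch assumes the bounds have already been decoded into comparable Python values keyed by column name (Iceberg itself stores them as serialized bytes keyed by field ID).

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional, Tuple


@dataclass
class FileMetrics:
    """Hypothetical, already-decoded metrics for one data file."""
    record_count: int
    null_counts: Dict[str, int]
    lower_bounds: Dict[str, Any]
    upper_bounds: Dict[str, Any]


@dataclass
class ColumnSummary:
    """Table-level statistics for one column."""
    null_count: int = 0
    lower: Optional[Any] = None
    upper: Optional[Any] = None


def summarize(files: Iterable[FileMetrics]) -> Tuple[int, Dict[str, ColumnSummary]]:
    """Combine per-file metrics into a row count and per-column summaries."""
    total_rows = 0
    columns: Dict[str, ColumnSummary] = {}
    for f in files:
        total_rows += f.record_count
        # Null counts are additive across files.
        for name, nulls in f.null_counts.items():
            columns.setdefault(name, ColumnSummary()).null_count += nulls
        # The table-level lower bound is the smallest per-file lower bound.
        for name, low in f.lower_bounds.items():
            s = columns.setdefault(name, ColumnSummary())
            s.lower = low if s.lower is None else min(s.lower, low)
        # The table-level upper bound is the largest per-file upper bound.
        for name, high in f.upper_bounds.items():
            s = columns.setdefault(name, ColumnSummary())
            s.upper = high if s.upper is None else max(s.upper, high)
    return total_rows, columns
```

Columns that have no bounds in any file keep `None` for `lower` and `upper`, so a caller can tell "no statistics" apart from an actual value.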
@alaturqua Which catalog are you using? Rollback of snapshots is also not that much work.
We are using the Hive Metastore, but plan to switch to the REST catalog if it supports views in the future.
I have a use case where we copy Iceberg table folders from one blob storage location to another, and updating the metadata and file locations in the metadata files is really cumbersome. Being able to update the metadata location and the file locations, and then register the table via PyIceberg, would be a great improvement as well. It could be used for migrations between vendors, storage systems, etc.
This would be quite straightforward. You would need to implement the If you're interested in contributing, this might be an interesting PR; otherwise I can try to squeeze it in at some point.
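As a rough illustration of the relocation use case described above, the sketch below rewrites a storage prefix in a parsed table metadata JSON document. `relocate_metadata` is a hypothetical helper, not part of PyIceberg. It assumes the metadata has been loaded into a dict and only touches the `location`, `snapshots[].manifest-list`, and `metadata-log[].metadata-file` fields from the Iceberg table spec. Manifest lists and manifests are Avro files that also contain absolute paths, and they are not handled here.

```python
import copy
from typing import Any, Dict


def relocate_metadata(
    metadata: Dict[str, Any], old_prefix: str, new_prefix: str
) -> Dict[str, Any]:
    """Return a copy of the metadata dict with old_prefix replaced by new_prefix."""

    def swap(path: str) -> str:
        # Only rewrite paths that actually start with the old prefix.
        if path.startswith(old_prefix):
            return new_prefix + path[len(old_prefix):]
        return path

    updated = copy.deepcopy(metadata)  # leave the caller's dict untouched
    if "location" in updated:
        updated["location"] = swap(updated["location"])
    for snapshot in updated.get("snapshots", []):
        if "manifest-list" in snapshot:
            snapshot["manifest-list"] = swap(snapshot["manifest-list"])
    for entry in updated.get("metadata-log", []):
        if "metadata-file" in entry:
            entry["metadata-file"] = swap(entry["metadata-file"])
    return updated
```

The relocated metadata would still need to be written back to storage and registered with a catalog, which this sketch leaves out.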
Hey @Fokko, happy to start looking into some features here one by one. It would be great if each task were its own issue with a brief description of how to tackle it. 😃
@alaturqua Hi, can you please also add the rewriteManifests operation?
I created the feature request, but unfortunately I do not have time to work on these in the near future. I was hoping @Fokko would take these into his backlog.
Thanks everyone for jumping in here.
Yes, I'm happy to, but we first need to get write support in :)
I've migrated this issue to the new repository: apache/iceberg-python#31 We definitely don't want to lose track of this! 👍
Feature Request / Improvement
As a PyIceberg user, I would like to be able to do the following with PyIceberg:
Regards.
Query engine
Other