Implement update for remove-snapshots action #1561
Thanks! This was one of the missing update functions mentioned in #952
Do you mind also including some tests? Similar to
iceberg-python/tests/table/test_init.py
Lines 663 to 682 in 2cd4e78
def test_apply_remove_properties_update(table_v2: Table) -> None:
    base_metadata = update_table_metadata(
        table_v2.metadata,
        (SetPropertiesUpdate(updates={"test_a": "test_a", "test_b": "test_b", "test_c": "test_c", "test_d": "test_d"}),),
    )
    new_metadata_no_removal = update_table_metadata(base_metadata, (RemovePropertiesUpdate(removals=[]),))
    assert base_metadata == new_metadata_no_removal
    new_metadata = update_table_metadata(base_metadata, (RemovePropertiesUpdate(removals=["test_a", "test_c"]),))
    assert base_metadata.properties == {
        "read.split.target.size": "134217728",
        "test_a": "test_a",
        "test_b": "test_b",
        "test_c": "test_c",
        "test_d": "test_d",
    }
    assert new_metadata.properties == {"read.split.target.size": "134217728", "test_b": "test_b", "test_d": "test_d"}
Sure! Thanks for the fast review
Left some comments. It looks like there's also a linter error; do you mind running make lint locally?
        snapshot_ids=[3051729675574597004],
    )
    new_metadata = update_table_metadata(table_v2.metadata, (update,))
    assert len(new_metadata.snapshots) == 1
If there's 1 snapshot, isn't that the current snapshot? If so, wouldn't it error?
This is what's left after the removal; the fixture originally has 2 snapshots.
@@ -455,6 +455,19 @@ def _(update: SetSnapshotRefUpdate, base_metadata: TableMetadata, context: _Tabl
    return base_metadata.model_copy(update=metadata_updates)


@_apply_table_update.register(RemoveSnapshotsUpdate)
def _(update: RemoveSnapshotsUpdate, base_metadata: TableMetadata, context: _TableMetadataUpdateContext) -> TableMetadata:
def _(update: RemoveSnapshotsUpdate, base_metadata: TableMetadata, context: _TableMetadataUpdateContext) -> TableMetadata:
    for remove_snapshot_id in update.snapshot_ids:
        if remove_snapshot_id == base_metadata.current_snapshot_id:
            raise ValueError(f"Can't remove current snapshot id {remove_snapshot_id}")
Should we block removing the current snapshot?
I'm not an expert in the Iceberg spec, but it's not clear what should happen if you try to remove the current snapshot.
I'm also not sure whether I should update parent_snapshot_id in every snapshot that referenced a removed snapshot.
Decided to set parent_snapshot_id to None if the parent is gone
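For anyone following the thread, here is a rough sketch of the behaviour discussed above: refuse to remove the current snapshot, drop the requested snapshots, and set parent_snapshot_id to None on any survivor whose parent was removed. It is not the code in this PR. `Snapshot`, `Metadata`, and `remove_snapshots` are simplified, hypothetical stand-ins for the PyIceberg models, and ref cleanup is left out entirely.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Snapshot:
    # Hypothetical stand-in, not PyIceberg's Snapshot model.
    snapshot_id: int
    parent_snapshot_id: Optional[int] = None


@dataclass(frozen=True)
class Metadata:
    # Hypothetical stand-in for TableMetadata; only the fields needed here.
    snapshots: List[Snapshot]
    current_snapshot_id: Optional[int] = None


def remove_snapshots(base: Metadata, snapshot_ids: List[int]) -> Metadata:
    """Return a copy of `base` without the given snapshots."""
    for snapshot_id in snapshot_ids:
        # Same guard as in the diff: the current snapshot cannot be removed.
        if snapshot_id == base.current_snapshot_id:
            raise ValueError(f"Can't remove current snapshot id {snapshot_id}")

    to_remove = set(snapshot_ids)
    kept = []
    for snapshot in base.snapshots:
        if snapshot.snapshot_id in to_remove:
            continue
        # A surviving snapshot whose parent is gone loses its parent pointer.
        if snapshot.parent_snapshot_id in to_remove:
            snapshot = replace(snapshot, parent_snapshot_id=None)
        kept.append(snapshot)

    # The input metadata is left untouched; a new object is returned.
    return replace(base, snapshots=kept)


# Two snapshots, the second is current and points at the first as its parent.
base = Metadata(
    snapshots=[Snapshot(1), Snapshot(2, parent_snapshot_id=1)],
    current_snapshot_id=2,
)
new = remove_snapshots(base, [1])
```

With this toy input, one snapshot is left after the removal and its parent pointer is cleared, which mirrors the `len(new_metadata.snapshots) == 1` assertion questioned earlier.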
Hey @kevinjqliu, ready for another review round. I had to cherry-pick the changes from #822 to reuse the code that removes refs