Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow setting non-string typed values in set_properties #504

Merged
merged 7 commits into from
Mar 9, 2024

Conversation

kevinjqliu
Copy link
Contributor

@kevinjqliu kevinjqliu commented Mar 7, 2024

Requires #503 to be merged first, extension of #502.

This PR modifies Transaction's set_properties API to accept table properties as type Dict[str, Any], either as dictionary or kwargs.

For example,

with table.transaction() as transaction:
    transaction.set_properties(abc=123)
    transaction.set_properties({"abc": 123})

This is done by reusing the field validator defined in #469. Setting None property value is disallowed since the transformation will change None to str "None" which is unintuitive.

Note

The Iceberg spec stores table properties as strings. That is why when setting a property value as int, reading it back produces a string.

table = table.transaction().set_properties({"abc": 123}).commit_transaction()
assert table.properties.get("abc") == "123"

Use PropertyUtil.property_as_int to return property value as int

table = table.transaction().set_properties({"abc": 123}).commit_transaction()
assert PropertyUtil.property_as_int("abc") == 123

@kevinjqliu kevinjqliu changed the title Kevinjqliu/set non string properties Allow setting non-string typed values in set_properties Mar 7, 2024
Copy link
Contributor

@HonahX HonahX left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! @kevinjqliu . It adds lots of convenience when setting properties through transaction.

updates: Dict[str, str]
updates: Properties
# validators
transform_properties_dict_value_to_str = field_validator('updates', mode='before')(transform_dict_value_to_str)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
transform_properties_dict_value_to_str = field_validator('updates', mode='before')(transform_dict_value_to_str)
@field_validator('updates', mode='before')
def transform_properties_dict_value_to_str(cls, properties: Properties) -> Dict[str, str]:
return transform_dict_value_to_str(properties)

Shall we reformat the validator to use decorator? This aligns with the way we define model validators in metadata.py and also the example in Pydantic doc. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks much cleaner, thank you!

pyiceberg/table/__init__.py Show resolved Hide resolved
@kevinjqliu kevinjqliu force-pushed the kevinjqliu/set-non-string-properties branch from bda7fdb to 79a5954 Compare March 8, 2024 15:15
@kevinjqliu kevinjqliu force-pushed the kevinjqliu/set-non-string-properties branch from 79a5954 to 9fe9cff Compare March 8, 2024 15:17
@kevinjqliu kevinjqliu requested a review from HonahX March 8, 2024 16:23
@kevinjqliu
Copy link
Contributor Author

rebased after #503

Copy link
Contributor

@HonahX HonahX left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for being late. Overall LGTM. Just have one small comment regarding the type.

BTW, could you please update the same field validator in metadata to use the decorator @field_validator?
https://github.com/apache/iceberg-python/blob/main/pyiceberg/table/metadata.py#L224 Thanks!

@@ -472,7 +473,11 @@ class SetLocationUpdate(TableUpdate):

class SetPropertiesUpdate(TableUpdate):
action: TableUpdateAction = TableUpdateAction.set_properties
updates: Dict[str, str]
updates: Properties
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this one should be Dict[str, str] because at the time when the model is constructed, the before validator has converted all values to str

Copy link
Contributor

@Fokko Fokko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kevinjqliu for working on this, and @HonahX for the review!

@Fokko Fokko merged commit 70342ac into apache:main Mar 9, 2024
7 checks passed
@kevinjqliu kevinjqliu deleted the kevinjqliu/set-non-string-properties branch March 9, 2024 16:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants