Change dot notation in add column documentation to tuple #1433

Merged
merged 8 commits into from
Jan 9, 2025
17 changes: 11 additions & 6 deletions mkdocs/docs/api.md
@@ -951,8 +951,10 @@ Using `add_column` you can add a column, without having to worry about the field
```python
with table.update_schema() as update:
    update.add_column("retries", IntegerType(), "Number of retries to place the bid")
    # In a struct
    update.add_column("details.confirmed_by", StringType(), "Name of the exchange")
    update.add_column("details", StructType())
    update.add_column(("details", "confirmed_by"), StringType(), "Name of the exchange")
```
A complex type must exist before columns can be added to it. Fields in complex types are added using a tuple.
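
The rule above can be illustrated with a toy model. This is a minimal sketch, not PyIceberg's implementation: the schema is assumed to be a plain nested `dict`, and `add_column` here is a hypothetical stand-in that only mimics the "parent must exist" check and the tuple-path addressing.

```python
from typing import Any, Dict, Tuple, Union

# Toy schema (an assumption for illustration): a field name maps to a type
# name, or to a nested dict when the field is a struct.
Schema = Dict[str, Any]
Path = Union[str, Tuple[str, ...]]


def add_column(schema: Schema, path: Path, field_type: Any) -> None:
    """Add a field at `path`; every parent in the path must already be a struct."""
    parts = (path,) if isinstance(path, str) else tuple(path)
    *parents, name = parts
    node = schema
    for parent in parents:
        child = node.get(parent)
        if not isinstance(child, dict):
            raise ValueError(f"Parent struct {parent!r} must exist before adding {name!r}")
        node = child
    if name in node:
        raise ValueError(f"Field {name!r} already exists")
    node[name] = field_type


schema: Schema = {"retries": "int"}
add_column(schema, "details", {})                          # create the struct first
add_column(schema, ("details", "confirmed_by"), "string")  # then add a field inside it
```

In this sketch, calling `add_column(schema, ("missing", "field"), "string")` raises `ValueError` because no `missing` struct exists yet.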

### Rename column

@@ -961,20 +963,21 @@ Renaming a field in an Iceberg table is simple:
```python
with table.update_schema() as update:
    update.rename_column("retries", "num_retries")
    # This will rename `confirmed_by` to `exchange`
    update.rename_column("properties.confirmed_by", "exchange")
    # This will rename `confirmed_by` to `processed_by` in the `details` struct
    update.rename_column(("details", "confirmed_by"), ("detail", "processed_by"))
Contributor

This example doesn't work:

>>> with table.update_schema() as update:
...     update.rename_column(("details", "confirmed_by"), ("detail", "processed_by"))
... 
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/Users/kevinliu/repos/iceberg-python/pyiceberg/table/update/schema.py", line 278, in rename_column
    self._updates[field_from.field_id] = NestedField(
                                         ^^^^^^^^^^^^
  File "/Users/kevinliu/repos/iceberg-python/pyiceberg/types.py", line 333, in __init__
    super().__init__(**data)
  File "/Users/kevinliu/Library/Caches/pypoetry/virtualenvs/pyiceberg-Is5Rt7Ah-py3.12/lib/python3.12/site-packages/pydantic/main.py", line 214, in __init__
    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for NestedField
name
  Input should be a valid string [type=string_type, input_value=('detail', 'processed_by'), input_type=tuple]
    For further information visit https://errors.pydantic.dev/2.10/v/string_type

even though the table has `details.confirmed_by`:

>>> print(table.schema())
table {
  1: city: optional string
  2: lat: optional double
  3: long: optional double
  4: details: optional struct<5: confirmed_by: optional string (Name of the exchange)>
}

Contributor Author

Of course. The renamed field shouldn't be in a tuple. I have fixed it now. Each part now individually works, except for add_column, as discussed.

```
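
Going by the traceback and the reply in the thread above, the corrected call presumably keeps the tuple for the existing field's path and passes the new name as a plain string. A minimal sketch under that assumption; `rename_nested_field` is a hypothetical wrapper, and `table` is assumed to be a loaded table whose schema has a `details` struct containing `confirmed_by`:

```python
def rename_nested_field(table) -> None:
    """Rename `confirmed_by` to `processed_by` inside the `details` struct.

    `table` is assumed to expose `update_schema()` as a context manager,
    as in the examples in this diff.
    """
    with table.update_schema() as update:
        # The path to the existing field is a tuple; the new name is a plain
        # string, since only the leaf name changes and the field stays in
        # the same struct.
        update.rename_column(("details", "confirmed_by"), "processed_by")
```

Passing a tuple as the second argument is what triggered the `ValidationError` quoted above, since the field name is validated as a string.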

### Move column

Move a field inside of struct:
Move order of fields:

```python
with table.update_schema() as update:
    update.move_first("symbol")
    # This will move `bid` after `ask`
    update.move_after("bid", "ask")
    # This will move `confirmed_by` before `exchange`
    update.move_before("details.created_by", "details.exchange")
    # This will move `confirmed_by` before `exchange` in the `details` struct
    update.move_before(("details", "confirmed_by"), ("details", "exchange"))
```

### Update column
@@ -1006,6 +1009,8 @@ Delete a field, careful this is a incompatible change (readers/writers might exp
```python
with table.update_schema(allow_incompatible_changes=True) as update:
    update.delete_column("some_field")
    # In a struct
    update.delete_column(("details", "confirmed_by"))
```
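
For existing code that still uses the dotted form this diff replaces in the docs, a small helper can turn dotted strings into tuple paths. This is a sketch only: `to_path` is a hypothetical name, not part of PyIceberg, and it assumes no field name contains a literal dot (such names would be split incorrectly).

```python
from typing import Tuple, Union


def to_path(name: Union[str, Tuple[str, ...]]) -> Tuple[str, ...]:
    """Convert 'details.confirmed_by' to ('details', 'confirmed_by').

    Tuples are returned unchanged, so the helper is safe to apply twice.
    """
    if isinstance(name, tuple):
        return name
    return tuple(name.split("."))


print(to_path("details.confirmed_by"))  # ('details', 'confirmed_by')
print(to_path("some_field"))            # ('some_field',)
```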

## Partition evolution