@@ -280,7 +280,7 @@ tbl.overwrite(df)

The data is written to the table, and when the table is read using `tbl.scan().to_arrow()`:

-```
+```python
pyarrow.Table
city: string
lat: double
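
For readers following along outside the diff, a minimal sketch of the round trip this hunk documents; the catalog and table names are illustrative assumptions, not part of the change:

```python
import pyarrow as pa
from pyiceberg.catalog import load_catalog

# Illustrative names: assumes a configured catalog with a matching table.
catalog = load_catalog("default")
tbl = catalog.load_table("default.cities")

df = pa.Table.from_pylist(
    [{"city": "Amsterdam", "lat": 52.371807, "long": 4.896029}]
)
tbl.overwrite(df)             # replaces the table contents with df
print(tbl.scan().to_arrow())  # reads everything back as a pyarrow.Table
```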
@@ -303,7 +303,7 @@ tbl.append(df)

When reading the table with `tbl.scan().to_arrow()` you can see that `Groningen` is now also part of the table:

-```
+```python
pyarrow.Table
city: string
lat: double
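
A hedged sketch of the append that produces the `Groningen` row above, reusing the `tbl` handle from the earlier example; the coordinates are illustrative:

```python
import pyarrow as pa

# Append adds new data files; existing files are left untouched.
new_rows = pa.Table.from_pylist(
    [{"city": "Groningen", "lat": 53.21917, "long": 6.56667}]
)
tbl.append(new_rows)
```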
@@ -342,7 +342,7 @@ tbl.delete(delete_filter="city == 'Paris'")
In the above example, any records where the city field equals `Paris` will be deleted.
Running `tbl.scan().to_arrow()` will now yield:

-```
+```python
pyarrow.Table
city: string
lat: double
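
The string predicate in the hunk header can also be written as an expression object; a small sketch, assuming the same `tbl` handle:

```python
from pyiceberg.expressions import EqualTo

# Equivalent to delete_filter="city == 'Paris'"
tbl.delete(delete_filter=EqualTo("city", "Paris"))
```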
@@ -362,7 +362,6 @@ To explore the table metadata, tables can be inspected.
!!! tip "Time Travel"
    To inspect a table's metadata with the time travel feature, call the inspect table method with the `snapshot_id` argument.
    Time travel is supported on all metadata tables except `snapshots` and `refs`.
-
```python
table.inspect.entries(snapshot_id=805611270568163028)
```
@@ -377,7 +376,7 @@ Inspect the snapshots of the table:
table.inspect.snapshots()
```

-```
+```python
pyarrow.Table
committed_at: timestamp[ms] not null
snapshot_id: int64 not null
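
Since the inspect methods return `pyarrow.Table` objects, they can be converted for easier reading; a sketch, assuming pandas is installed:

```python
# Convert the snapshots metadata table to pandas for a tabular view.
snapshots = table.inspect.snapshots().to_pandas()
print(snapshots[["committed_at", "snapshot_id", "operation"]])
```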
@@ -405,7 +404,7 @@ Inspect the partitions of the table:
table.inspect.partitions()
```

-```
+```python
pyarrow.Table
partition: struct<dt_month: int32, dt_day: date32[day]> not null
  child 0, dt_month: int32
@@ -446,7 +445,7 @@ To show all the table's current manifest entries for both data and delete files.
table.inspect.entries()
```

-```
+```python
pyarrow.Table
status: int8 not null
snapshot_id: int64 not null
@@ -604,7 +603,7 @@ To show a table's known snapshot references:
table.inspect.refs()
```

-```
+```python
pyarrow.Table
name: string not null
type: string not null
@@ -629,7 +628,7 @@ To show a table's current file manifests:
table.inspect.manifests()
```

-```
+```python
pyarrow.Table
content: int8 not null
path: string not null
@@ -679,7 +678,7 @@ To show table metadata log entries:
table.inspect.metadata_log_entries()
```

-```
+```python
pyarrow.Table
timestamp: timestamp[ms] not null
file: string not null
@@ -702,7 +701,7 @@ To show a table's history:
table.inspect.history()
```

-```
+```python
pyarrow.Table
made_current_at: timestamp[ms] not null
snapshot_id: int64 not null
@@ -723,7 +722,7 @@ Inspect the data files in the current snapshot of the table:
table.inspect.files()
```

-```
+```python
pyarrow.Table
content: int8 not null
file_path: string not null
@@ -850,7 +849,7 @@ readable_metrics: [

Expert Iceberg users may choose to commit existing parquet files to the Iceberg table as data files, without rewriting them.

-```
+```python
# Given that these parquet files have schema consistent with the Iceberg table

file_paths = [
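
A hedged sketch of the full `add_files` call the truncated block above leads into; the paths are placeholders, and the files must already match the table's schema:

```python
# Placeholder paths; the files are registered as-is, without being rewritten.
file_paths = [
    "s3://warehouse/default/existing-1.parquet",
    "s3://warehouse/default/existing-2.parquet",
]
tbl.add_files(file_paths=file_paths)
```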
@@ -930,7 +929,7 @@ with table.update_schema() as update:

Now the table has the union of the two schemas, as `print(table.schema())` shows:

-```
+```python
table {
  1: city: optional string
  2: lat: optional double
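
The `with table.update_schema() as update:` context in the hunk header pairs with `union_by_name`; a sketch of the evolution that yields the union above, with field IDs and types assumed from the printed schema:

```python
from pyiceberg.schema import Schema
from pyiceberg.types import DoubleType, NestedField, StringType

# Assumed shape of the incoming schema; merged into the table by field name.
new_schema = Schema(
    NestedField(1, "city", StringType(), required=False),
    NestedField(2, "lat", DoubleType(), required=False),
)
with table.update_schema() as update:
    update.union_by_name(new_schema)
```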
@@ -1180,7 +1179,7 @@ table.scan(

This will return a PyArrow table:

-```
+```python
pyarrow.Table
VendorID: int64
tpep_pickup_datetime: timestamp[us, tz=+00:00]
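
A sketch of the kind of scan that produces the PyArrow table above; the filter and projection are illustrative:

```python
from pyiceberg.expressions import GreaterThanOrEqual

# Project three columns and push a row filter down to the scan.
result = table.scan(
    row_filter=GreaterThanOrEqual("trip_distance", 10.0),
    selected_fields=("VendorID", "tpep_pickup_datetime", "tpep_dropoff_datetime"),
).to_arrow()
```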
@@ -1222,7 +1221,7 @@ table.scan(

This will return a Pandas dataframe:

-```
+```python
   VendorID      tpep_pickup_datetime     tpep_dropoff_datetime
0         2 2021-04-01 00:28:05+00:00 2021-04-01 00:47:59+00:00
1         1 2021-04-01 00:39:01+00:00 2021-04-01 00:57:39+00:00
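
The same scan materialized as pandas instead of pyarrow; a sketch under the same illustrative filter and projection:

```python
from pyiceberg.expressions import GreaterThanOrEqual

# Identical scan; only the terminal call changes.
df = table.scan(
    row_filter=GreaterThanOrEqual("trip_distance", 10.0),
    selected_fields=("VendorID", "tpep_pickup_datetime", "tpep_dropoff_datetime"),
).to_pandas()
```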
@@ -1295,7 +1294,7 @@ ray_dataset = table.scan(

This will return a Ray dataset:

-```
+```python
Dataset(
    num_blocks=1,
    num_rows=1168798,
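
And the same scan handed to Ray; a sketch assuming ray is installed and the same illustrative predicate:

```python
from pyiceberg.expressions import GreaterThanOrEqual

# Materializes the scan as a Ray dataset for distributed processing.
ray_dataset = table.scan(
    row_filter=GreaterThanOrEqual("trip_distance", 10.0),
    selected_fields=("VendorID", "tpep_pickup_datetime", "tpep_dropoff_datetime"),
).to_ray_dataset()
```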
@@ -1346,7 +1345,7 @@ df = df.select("VendorID", "tpep_pickup_datetime", "tpep_dropoff_datetime")

This returns a Daft DataFrame which is lazily materialized. Printing `df` will display the schema:

-```
+```python
╭──────────┬───────────────────────────────┬───────────────────────────────╮
│ VendorID ┆ tpep_pickup_datetime          ┆ tpep_dropoff_datetime         │
│ ---      ┆ ---                           ┆ ---                           │
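
For context, a sketch of the Daft flow that the hunk header's `df.select(...)` belongs to, assuming daft is installed:

```python
# to_daft() returns a lazy Daft DataFrame; nothing is read until show/collect.
df = table.to_daft()
df = df.select("VendorID", "tpep_pickup_datetime", "tpep_dropoff_datetime")
df.show(2)
```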
@@ -1364,7 +1363,7 @@ This is correctly optimized to take advantage of Iceberg features such as hidden
df.show(2)
```

-```
+```python
╭──────────┬───────────────────────────────┬───────────────────────────────╮
│ VendorID ┆ tpep_pickup_datetime          ┆ tpep_dropoff_datetime         │
│ ---      ┆ ---                           ┆ ---                           │