apache · kevinjqliu · Sep 1, 2024 · Aug 31, 2024 · Aug 31, 2024 · Aug 31, 2024
diff --git a/.markdownlint.yaml b/.markdownlint.yaml
@@ -0,0 +1,26 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Default state for all rules
+default: true
+
+# MD013/line-length - Line length
+MD013: false
+
+# MD007/ul-indent - Unordered list indentation
+MD007:
+  indent: 4
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -46,17 +46,11 @@ repos:
     hooks:
       - id: pycln
         args: [--config=pyproject.toml]
-  - repo: https://github.com/executablebooks/mdformat
-    rev: 0.7.17
+  - repo: https://github.com/igorshubovych/markdownlint-cli
+    rev: v0.41.0
     hooks:
-      - id: mdformat
-        additional_dependencies:
-          - mdformat-black==0.1.1
-          - mdformat-config==0.1.3
-          - mdformat-beautysh==0.1.1
-          - mdformat-admon==1.0.1
-          - mdformat-mkdocs==1.0.1
-          - mdformat-frontmatter==2.0.1
+      - id: markdownlint
+        args: ["--fix"]
   - repo: https://github.com/pycqa/pydocstyle
     rev: 6.3.0
     hooks:

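Note on the hunks that follow: with `default: true` in `.markdownlint.yaml`, every markdownlint rule is active unless overridden, which includes MD040 (fenced code blocks should have a language specified). That rule appears to be what the `api.md` changes below address, and it is presumably not one that `--fix` rewrites automatically, so the tags were added by hand. The sketch below is a simplified approximation of that check, not markdownlint's own implementation. The function name `fences_without_language` is hypothetical, and only backtick fences are handled.

```python
import re

# Optional indent, a run of 3+ backticks, then an optional info string (the language tag).
FENCE = re.compile(r"^\s*(`{3,})\s*(\S*)")


def fences_without_language(markdown: str) -> list[int]:
    """Return 1-based line numbers of opening fences that carry no language tag."""
    missing = []
    open_fence = None  # backtick run of the fence currently open, if any
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        match = FENCE.match(line)
        if not match:
            continue
        ticks, info = match.groups()
        if open_fence is None:
            # This line opens a block; record it if the info string is empty.
            open_fence = ticks
            if not info:
                missing.append(lineno)
        elif len(ticks) >= len(open_fence) and not info:
            # A bare fence at least as long as the opener closes the block.
            open_fence = None
    return missing


if __name__ == "__main__":
    fence = "`" * 3  # built at runtime so this example nests cleanly inside a fence
    sample = "\n".join(["Output:", fence, "pyarrow.Table", fence])
    print(fences_without_language(sample))
```

Running a check like this over `mkdocs/docs/api.md` before the change would be expected to flag each of the bare fences touched in the hunks below; the authoritative check remains the `markdownlint` pre-commit hook itself.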
diff --git a/mkdocs/docs/SUMMARY.md b/mkdocs/docs/SUMMARY.md
@@ -18,6 +18,7 @@
 <!-- prettier-ignore-start -->
 
 <!-- markdown-link-check-disable -->
+# Summary
 
 - [Getting started](index.md)
 - [Configuration](configuration.md)

diff --git a/mkdocs/docs/api.md b/mkdocs/docs/api.md
@@ -280,7 +280,7 @@ tbl.overwrite(df)
 
 The data is written to the table, and when the table is read using `tbl.scan().to_arrow()`:
 
-```
+```python
 pyarrow.Table
 city: string
 lat: double
@@ -303,7 +303,7 @@ tbl.append(df)
 
 When reading the table `tbl.scan().to_arrow()` you can see that `Groningen` is now also part of the table:
 
-```
+```python
 pyarrow.Table
 city: string
 lat: double
@@ -342,7 +342,7 @@ tbl.delete(delete_filter="city == 'Paris'")
 In the above example, any records where the city field value equals to `Paris` will be deleted.
 Running `tbl.scan().to_arrow()` will now yield:
 
-```
+```python
 pyarrow.Table
 city: string
 lat: double
@@ -362,7 +362,6 @@ To explore the table metadata, tables can be inspected.
 !!! tip "Time Travel"
     To inspect a tables's metadata with the time travel feature, call the inspect table method with the `snapshot_id` argument.
     Time travel is supported on all metadata tables except `snapshots` and `refs`.
-
     ```python
     table.inspect.entries(snapshot_id=805611270568163028)
     ```
@@ -377,7 +376,7 @@ Inspect the snapshots of the table:
 table.inspect.snapshots()
 ```
 
-```
+```python
 pyarrow.Table
 committed_at: timestamp[ms] not null
 snapshot_id: int64 not null
@@ -405,7 +404,7 @@ Inspect the partitions of the table:
 table.inspect.partitions()
 ```
 
-```
+```python
 pyarrow.Table
 partition: struct<dt_month: int32, dt_day: date32[day]> not null
   child 0, dt_month: int32
@@ -446,7 +445,7 @@ To show all the table's current manifest entries for both data and delete files.
 table.inspect.entries()
 ```
 
-```
+```python
 pyarrow.Table
 status: int8 not null
 snapshot_id: int64 not null
@@ -604,7 +603,7 @@ To show a table's known snapshot references:
 table.inspect.refs()
 ```
 
-```
+```python
 pyarrow.Table
 name: string not null
 type: string not null
@@ -629,7 +628,7 @@ To show a table's current file manifests:
 table.inspect.manifests()
 ```
 
-```
+```python
 pyarrow.Table
 content: int8 not null
 path: string not null
@@ -679,7 +678,7 @@ To show table metadata log entries:
 table.inspect.metadata_log_entries()
 ```
 
-```
+```python
 pyarrow.Table
 timestamp: timestamp[ms] not null
 file: string not null
@@ -702,7 +701,7 @@ To show a table's history:
 table.inspect.history()
 ```
 
-```
+```python
 pyarrow.Table
 made_current_at: timestamp[ms] not null
 snapshot_id: int64 not null
@@ -723,7 +722,7 @@ Inspect the data files in the current snapshot of the table:
 table.inspect.files()
 ```
 
-```
+```python
 pyarrow.Table
 content: int8 not null
 file_path: string not null
@@ -850,7 +849,7 @@ readable_metrics: [
 
 Expert Iceberg users may choose to commit existing parquet files to the Iceberg table as data files, without rewriting them.
 
-```
+```python
 # Given that these parquet files have schema consistent with the Iceberg table
 
 file_paths = [
@@ -930,7 +929,7 @@ with table.update_schema() as update:
 
 Now the table has the union of the two schemas `print(table.schema())`:
 
-```
+```python
 table {
   1: city: optional string
   2: lat: optional double
@@ -1180,7 +1179,7 @@ table.scan(
 
 This will return a PyArrow table:
 
-```
+```python
 pyarrow.Table
 VendorID: int64
 tpep_pickup_datetime: timestamp[us, tz=+00:00]
@@ -1222,7 +1221,7 @@ table.scan(
 
 This will return a Pandas dataframe:
 
-```
+```python
         VendorID      tpep_pickup_datetime     tpep_dropoff_datetime
 0              2 2021-04-01 00:28:05+00:00 2021-04-01 00:47:59+00:00
 1              1 2021-04-01 00:39:01+00:00 2021-04-01 00:57:39+00:00
@@ -1295,7 +1294,7 @@ ray_dataset = table.scan(
 
 This will return a Ray dataset:
 
-```
+```python
 Dataset(
     num_blocks=1,
     num_rows=1168798,
@@ -1346,7 +1345,7 @@ df = df.select("VendorID", "tpep_pickup_datetime", "tpep_dropoff_datetime")
 
 This returns a Daft Dataframe which is lazily materialized. Printing `df` will display the schema:
 
-```
+```python
 ╭──────────┬───────────────────────────────┬───────────────────────────────╮
 │ VendorID ┆ tpep_pickup_datetime          ┆ tpep_dropoff_datetime         │
 │ ---      ┆ ---                           ┆ ---                           │
@@ -1364,7 +1363,7 @@ This is correctly optimized to take advantage of Iceberg features such as hidden
 df.show(2)
 ```
 
-```
+```python
 ╭──────────┬───────────────────────────────┬───────────────────────────────╮
 │ VendorID ┆ tpep_pickup_datetime          ┆ tpep_dropoff_datetime         │
 │ ---      ┆ ---                           ┆ ---                           │