Fix dead links in docs (#493)
kevinjqliu authored Mar 3, 2024
1 parent 13bd343 commit 3c225a7
Showing 4 changed files with 28 additions and 1 deletion.
3 changes: 3 additions & 0 deletions .github/workflows/check-md-link.yml
@@ -4,6 +4,9 @@ on:
push:
paths:
- mkdocs/**
branches:
- 'main'
pull_request:

jobs:
markdown-link-check:
4 changes: 4 additions & 0 deletions mkdocs/docs/SUMMARY.md
@@ -17,6 +17,8 @@

<!-- prettier-ignore-start -->

<!-- markdown-link-check-disable -->

- [Getting started](index.md)
- [Configuration](configuration.md)
- [CLI](cli.md)
@@ -28,4 +30,6 @@
- [How to release](how-to-release.md)
- [Code Reference](reference/)

<!-- markdown-link-check-enable-->

<!-- prettier-ignore-end -->
20 changes: 20 additions & 0 deletions mkdocs/docs/configuration.md
@@ -81,6 +81,8 @@ For the FileIO there are several configuration options available:

### S3

<!-- markdown-link-check-disable -->

| Key | Example | Description |
| -------------------- | ------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| s3.endpoint | https://10.0.19.25/ | Configure an alternative endpoint of the S3 service for the FileIO to access. This could be used to use S3FileIO with any s3-compatible object storage service that has a different endpoint, or access a private S3 endpoint in a virtual private cloud. |
@@ -91,17 +93,25 @@ For the FileIO there are several configuration options available:
| s3.proxy-uri | http://my.proxy.com:8080 | Configure the proxy server to be used by the FileIO. |
| s3.connect-timeout | 60.0 | Configure socket connection timeout, in seconds. |

<!-- markdown-link-check-enable-->

### HDFS

<!-- markdown-link-check-disable -->

| Key | Example | Description |
| -------------------- | ------------------- | ------------------------------------------------ |
| hdfs.host | https://10.0.19.25/ | Configure the HDFS host to connect to |
| hdfs.port | 9000 | Configure the HDFS port to connect to. |
| hdfs.user | user | Configure the HDFS username used for connection. |
| hdfs.kerberos_ticket | kerberos_ticket | Configure the path to the Kerberos ticket cache. |

<!-- markdown-link-check-enable-->

### Azure Data lake

<!-- markdown-link-check-disable -->

| Key | Example | Description |
| ----------------------- | ----------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| adlfs.connection-string | AccountName=devstoreaccount1;AccountKey=Eby8vdM02xNOcqF...;BlobEndpoint=http://localhost/ | A [connection string](https://learn.microsoft.com/en-us/azure/storage/common/storage-configure-connection-string). This could be used to use FileIO with any adlfs-compatible object storage service that has a different endpoint (like [azurite](https://github.com/azure/azurite)). |
@@ -112,8 +122,12 @@ For the FileIO there are several configuration options available:
| adlfs.client-id | ad667be4-b811-11ed-afa1-0242ac120002 | The client-id |
| adlfs.client-secret | oCA3R6P\*ka#oa1Sms2J74z... | The client-secret |

<!-- markdown-link-check-enable-->

### Google Cloud Storage

<!-- markdown-link-check-disable -->

| Key | Example | Description |
| -------------------------- | ------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| gcs.project-id | my-gcp-project | Configure Google Cloud Project for GCS FileIO. |
@@ -128,6 +142,8 @@ For the FileIO there are several configuration options available:
| gcs.default-location | US | Configure the default location where buckets are created, like 'US' or 'EUROPE-WEST3'. |
| gcs.version-aware | False | Configure whether to support object versioning on the GCS bucket. |

<!-- markdown-link-check-enable-->

## REST Catalog

```yaml
@@ -145,6 +161,8 @@ catalog:
cabundle: /absolute/path/to/cabundle.pem
```
<!-- markdown-link-check-disable -->
| Key | Example | Description |
| ---------------------- | ----------------------- | -------------------------------------------------------------------------------------------------- |
| uri | https://rest-catalog/ws | URI identifying the REST Server |
@@ -156,6 +174,8 @@ catalog:
| rest.signing-name | execute-api | The service signing name to use when SigV4 signing a request |
| rest.authorization-url | https://auth-service/cc | Authentication URL to use for client credentials authentication (default: uri + 'v1/oauth/tokens') |

<!-- markdown-link-check-enable-->

### Headers in RESTCatalog

To configure custom headers in RESTCatalog, include them in the catalog properties with the prefix `header.`. This
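The `header.` prefix convention mentioned in the configuration.md context above can be illustrated with a short sketch. It is not part of this commit: the helper name `with_custom_headers` and the sample header value are hypothetical, and only the `header.` prefix itself comes from the documentation text.

```python
def with_custom_headers(properties, headers):
    """Return a copy of `properties` with each header stored under a `header.`-prefixed key.

    Hypothetical helper for illustration; it only builds a plain dict of catalog properties.
    """
    merged = dict(properties)
    for name, value in headers.items():
        # The docs say custom headers go into the catalog properties with the `header.` prefix.
        merged[f"header.{name}"] = value
    return merged


# Sample values are assumptions chosen for the example.
props = with_custom_headers(
    {"uri": "https://rest-catalog/ws"},
    {"content-type": "application/vnd.api+json"},
)
```

The original dictionary is left unmodified, so the same base properties can be reused with different header sets.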
2 changes: 1 addition & 1 deletion mkdocs/docs/index.md
@@ -61,7 +61,7 @@ You either need to install `s3fs`, `adlfs`, `gcsfs`, or `pyarrow` to be able to

## Connecting to a catalog

- Iceberg leverages the [catalog to have one centralized place to organize the tables](https://iceberg.apache.org/catalog/). This can be a traditional Hive catalog to store your Iceberg tables next to the rest, a vendor solution like the AWS Glue catalog, or an implementation of Icebergs' own [REST protocol](https://github.com/apache/iceberg/tree/main/open-api). Checkout the [configuration](configuration.md) page to find all the configuration details.
+ Iceberg leverages the [catalog to have one centralized place to organize the tables](https://iceberg.apache.org/concepts/catalog/). This can be a traditional Hive catalog to store your Iceberg tables next to the rest, a vendor solution like the AWS Glue catalog, or an implementation of Icebergs' own [REST protocol](https://github.com/apache/iceberg/tree/main/open-api). Checkout the [configuration](configuration.md) page to find all the configuration details.

For the sake of demonstration, we'll configure the catalog to use the `SqlCatalog` implementation, which will store information in a local `sqlite` database. We'll also configure the catalog to store data files in the local filesystem instead of an object store. This should not be used in production due to the limited scalability.
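As an illustration of the local setup described in the context paragraph above (not part of this commit), the following sketch builds the two properties such a catalog would need: a `sqlite` URI for the catalog database and a local `file://` warehouse location. The helper name `local_sql_catalog_properties`, the `/tmp/warehouse` path, and the database file name are assumptions made for the example.

```python
def local_sql_catalog_properties(warehouse_path):
    """Build catalog properties for a sqlite-backed catalog with a local warehouse.

    Hypothetical helper; `warehouse_path` is assumed to be an absolute POSIX path.
    """
    return {
        # Catalog metadata goes into a local sqlite database file (file name is an assumption).
        "uri": f"sqlite:///{warehouse_path}/pyiceberg_catalog.db",
        # Data files are written to the local filesystem instead of an object store.
        "warehouse": f"file://{warehouse_path}",
    }


props = local_sql_catalog_properties("/tmp/warehouse")
```

With PyIceberg installed, properties like these would typically be passed as keyword arguments, for example `SqlCatalog("default", **props)` from `pyiceberg.catalog.sql`; that call is not exercised here.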
