Commit 7025018

Merge branch 'apache:main' into python-3.12
2 parents 926a8d2 + 088ee40 commit 7025018

21 files changed: +2250 -827 lines changed

.github/workflows/python-release.yml

Lines changed: 1 addition & 1 deletion
@@ -80,7 +80,7 @@ jobs:
         if: startsWith(matrix.os, 'ubuntu')
         run: ls -lah dist/* && cp dist/* wheelhouse/

-      - uses: actions/upload-artifact@v3
+      - uses: actions/upload-artifact@v4
         with:
           name: "release-${{ github.event.inputs.version }}"
           path: ./wheelhouse/*

.github/workflows/stale.yml

Lines changed: 1 addition & 1 deletion
@@ -31,7 +31,7 @@ jobs:
     if: github.repository_owner == 'apache'
     runs-on: ubuntu-22.04
     steps:
-      - uses: actions/stale@v8.0.0
+      - uses: actions/stale@v9.0.0
         with:
           stale-issue-label: 'stale'
           exempt-issue-labels: 'not-stale'

mkdocs/docs/api.md

Lines changed: 16 additions & 0 deletions
@@ -33,6 +33,22 @@ catalog:
     credential: t-1234:secret
 ```

+Note that multiple catalogs can be defined in the same `.pyiceberg.yaml`:
+
+```yaml
+catalog:
+  hive:
+    uri: thrift://127.0.0.1:9083
+    s3.endpoint: http://127.0.0.1:9000
+    s3.access-key-id: admin
+    s3.secret-access-key: password
+  rest:
+    uri: https://rest-server:8181/
+    warehouse: my-warehouse
+```
+
+and loaded in python by calling `load_catalog(name="hive")` and `load_catalog(name="rest")`.
+
 This information must be placed inside a file called `.pyiceberg.yaml` located either in the `$HOME` or `%USERPROFILE%` directory (depending on whether the operating system is Unix-based or Windows-based, respectively) or in the `$PYICEBERG_HOME` directory (if the corresponding environment variable is set).

 For more details on possible configurations refer to the [specific page](https://py.iceberg.apache.org/configuration/).
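
Reviewer note: the added documentation says each top-level key under `catalog:` names one catalog, selected at load time via `load_catalog(name=...)`. The sketch below only illustrates that name-to-properties lookup with a plain dict; `CONFIG` and `properties_for` are hypothetical names introduced here and are not part of PyIceberg, and the real loader may merge other sources (such as environment variables) that this sketch ignores.

```python
from typing import Any, Dict

# The YAML from the added docs, written as a Python dict (secrets omitted).
CONFIG: Dict[str, Any] = {
    "catalog": {
        "hive": {
            "uri": "thrift://127.0.0.1:9083",
            "s3.endpoint": "http://127.0.0.1:9000",
        },
        "rest": {
            "uri": "https://rest-server:8181/",
            "warehouse": "my-warehouse",
        },
    }
}


def properties_for(name: str, config: Dict[str, Any] = CONFIG) -> Dict[str, Any]:
    """Hypothetical helper: return a copy of the property block for one catalog."""
    catalogs = config.get("catalog", {})
    if name not in catalogs:
        raise KeyError(f"No catalog named {name!r} in configuration")
    return dict(catalogs[name])


hive_props = properties_for("hive")
rest_props = properties_for("rest")
```

Each name resolves to its own independent property set, which is why the two catalogs in the example can point at different backends.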

mkdocs/docs/configuration.md

Lines changed: 20 additions & 2 deletions
@@ -74,6 +74,7 @@ For the FileIO there are several configuration options available:
 | s3.signer | bearer | Configure the signature version of the FileIO. |
 | s3.region | us-west-2 | Sets the region of the bucket |
 | s3.proxy-uri | http://my.proxy.com:8080 | Configure the proxy server to be used by the FileIO. |
+| s3.connect-timeout | 60.0 | Configure socket connection timeout, in seconds. |

 ### HDFS

@@ -140,8 +141,9 @@ catalog:

 ## SQL Catalog

-The SQL catalog requires a database for its backend. As of now, pyiceberg only supports PostgreSQL through psycopg2.
-The database connection has to be configured using the `uri` property (see SQLAlchemy's [documentation for URL format](https://docs.sqlalchemy.org/en/20/core/engines.html#backend-specific-urls)):
+The SQL catalog requires a database for its backend. PyIceberg supports PostgreSQL and SQLite through psycopg2. The database connection has to be configured using the `uri` property. See SQLAlchemy's [documentation for URL format](https://docs.sqlalchemy.org/en/20/core/engines.html#backend-specific-urls):
+
+For PostgreSQL:

 ```yaml
 catalog:
@@ -150,6 +152,22 @@ catalog:
     uri: postgresql+psycopg2://username:password@localhost/mydatabase
 ```

+In the case of SQLite:
+
+<!-- prettier-ignore-start -->
+
+!!! warning inline end "Development only"
+    SQLite is not built for concurrency, you should use this catalog for exploratory or development purposes.
+
+<!-- prettier-ignore-end -->
+
+```yaml
+catalog:
+  default:
+    type: sql
+    uri: sqlite:////tmp/pyiceberg.db
+```
+
 ## Hive Catalog

 ```yaml

mkdocs/requirements.txt

Lines changed: 1 addition & 1 deletion
@@ -23,6 +23,6 @@ mkdocstrings-python==1.7.5
 mkdocs-literate-nav==0.6.1
 mkdocs-autorefs==0.5.0
 mkdocs-gen-files==0.5.0
-mkdocs-material==9.4.14
+mkdocs-material==9.5.3
 mkdocs-material-extensions==1.3.1
 mkdocs-section-index==0.3.8

poetry.lock

Lines changed: 1022 additions & 415 deletions
Generated file; diff not rendered.

pyiceberg/catalog/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -153,7 +153,7 @@ def infer_catalog_type(name: str, catalog_properties: RecursiveDict) -> Optional
                 return CatalogType.REST
             elif uri.startswith("thrift"):
                 return CatalogType.HIVE
-            elif uri.startswith("postgresql"):
+            elif uri.startswith(("sqlite", "postgresql")):
                 return CatalogType.SQL
             else:
                 raise ValueError(f"Could not infer the catalog type from the uri: {uri}")
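
Reviewer note: the change relies on `str.startswith` accepting a tuple of prefixes, so one branch now covers both SQL backends. Below is a minimal standalone sketch of that prefix dispatch. `CatalogType` here is a stand-in enum, `infer_type_from_uri` is a hypothetical name, and the `"http"` check for the REST branch is an assumption, since the hunk does not show that condition.

```python
from enum import Enum


class CatalogType(Enum):
    # Stand-in for the project's CatalogType; only members seen in the hunk.
    REST = "rest"
    HIVE = "hive"
    SQL = "sql"


def infer_type_from_uri(uri: str) -> CatalogType:
    """Hypothetical helper mirroring the prefix checks in the hunk above."""
    if uri.startswith("http"):  # assumption: this condition is not in the hunk
        return CatalogType.REST
    elif uri.startswith("thrift"):
        return CatalogType.HIVE
    # A tuple argument matches if the string starts with any listed prefix.
    elif uri.startswith(("sqlite", "postgresql")):
        return CatalogType.SQL
    else:
        raise ValueError(f"Could not infer the catalog type from the uri: {uri}")
```

With this shape, both `sqlite:////tmp/pyiceberg.db` and `postgresql+psycopg2://...` resolve to the SQL catalog, while an unlisted scheme still raises `ValueError`.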

pyiceberg/exceptions.py

Lines changed: 1 addition & 1 deletion
@@ -104,7 +104,7 @@ class GenericDynamoDbError(DynamoDbError):
     pass


-class CommitFailedException(RESTError):
+class CommitFailedException(Exception):
     """Commit failed, refresh and try again."""

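
Reviewer note: rebasing `CommitFailedException` from `RESTError` onto `Exception` changes which `except` clauses match it. The sketch below uses stand-in classes (not imports from the project) to show the consequence: a handler for `RESTError` no longer catches a commit failure, so callers that relied on that need an explicit clause. `classify` is a hypothetical helper for illustration only.

```python
class RESTError(Exception):
    """Stand-in for the project's RESTError."""


class CommitFailedException(Exception):
    """Commit failed, refresh and try again."""


def classify(exc: Exception) -> str:
    """Hypothetical helper: report which handler catches the given exception."""
    try:
        raise exc
    except RESTError:
        return "rest"
    except CommitFailedException:
        # Reached only because the class no longer derives from RESTError.
        return "commit-failed"
```

Before the change, the first handler would have matched a commit failure; after it, the second one does.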

pyiceberg/io/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -51,6 +51,7 @@
 S3_SESSION_TOKEN = "s3.session-token"
 S3_REGION = "s3.region"
 S3_PROXY_URI = "s3.proxy-uri"
+S3_CONNECT_TIMEOUT = "s3.connect-timeout"
 HDFS_HOST = "hdfs.host"
 HDFS_PORT = "hdfs.port"
 HDFS_USER = "hdfs.user"

pyiceberg/io/fsspec.py

Lines changed: 4 additions & 0 deletions
@@ -49,6 +49,7 @@
     GCS_TOKEN,
     GCS_VERSION_AWARE,
     S3_ACCESS_KEY_ID,
+    S3_CONNECT_TIMEOUT,
     S3_ENDPOINT,
     S3_PROXY_URI,
     S3_REGION,
@@ -127,6 +128,9 @@ def _s3(properties: Properties) -> AbstractFileSystem:
     if proxy_uri := properties.get(S3_PROXY_URI):
         config_kwargs["proxies"] = {"http": proxy_uri, "https": proxy_uri}

+    if connect_timeout := properties.get(S3_CONNECT_TIMEOUT):
+        config_kwargs["connect_timeout"] = connect_timeout
+
     fs = S3FileSystem(client_kwargs=client_kwargs, config_kwargs=config_kwargs)

     for event_name, event_function in register_events.items():
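
Reviewer note: the new branch follows the same optional-property pattern as the proxy setting, copying the value into `config_kwargs` only when it is set. The sketch below isolates that pattern without importing `s3fs`. `build_config_kwargs` is a hypothetical helper, and the `float()` coercion is an assumption added here (the commit passes the value through unchanged) to show one way to handle a timeout that arrives as a string.

```python
from typing import Any, Dict

# Property keys as defined in the diff above.
S3_PROXY_URI = "s3.proxy-uri"
S3_CONNECT_TIMEOUT = "s3.connect-timeout"


def build_config_kwargs(properties: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper mirroring the optional-property handling in _s3."""
    config_kwargs: Dict[str, Any] = {}

    if proxy_uri := properties.get(S3_PROXY_URI):
        config_kwargs["proxies"] = {"http": proxy_uri, "https": proxy_uri}

    if connect_timeout := properties.get(S3_CONNECT_TIMEOUT):
        # Assumption (not in the commit): coerce to float so "60.0" also works.
        config_kwargs["connect_timeout"] = float(connect_timeout)

    return config_kwargs
```

Because the walrus check is truthiness-based, a missing, empty, or zero timeout leaves `connect_timeout` unset, so the client's own default would apply.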
