Commit d24e0c9

kevinjqliu, sumanth-manchala, gitzwz, Wenzhuo Zhao, vincenzon authored
[release] Pyiceberg 0.8.1 (#1369)
* use the non-deprecated func (#1326)
* 0.8.0 post release steps (#1334)
* add
* fix mkdoc
* Drop upper bounds for fsspec and it's implementations (#1341)
* Drop upper bounds for fsspec and it's implementations
* Run poetry lock
* Ignore tables without `table_type` from Glue and Hive
* Ignore tables without table_type parameters while loading all iceberg table from Glue and Hive catalog (#1331)
* Use TABLE_TYPE
---------
Co-authored-by: Wenzhuo Zhao <[email protected]>
* Replace reference of `Table.identifier` with `Table.name` (#1346)
* fix Table.name
* replace Table.identifier with Table.name
* add warning filter
* Allow leading underscore in column name used in row filter (#1358)
* Update parser.py Allow leading underscore in column name used in row filter.
* Update test_parser.py
* Update test_parser.py
* Update test_parser.py
* Remove Python 3.13 upper bound restriction (#1355)
* Remove Python 3.13 upper bound restriction
* Fix missing poetry.lock file
* Upgrading numpy on the poetry.lock file from v1.26.0 to v1.26.4
* Improve documentation for "how to release" (#1359)
* initial update
* edits
* add gpg instructions
* verify artifacts
* add twine not
* grammar
* edits
* remove old artifacts
* update doc workflow action
* and name
* add docs on patch vs major/minor release
* fix `KeyError` raised by `add_files` when parquet file doe not have column stats (#1354)
* fix KeyError, by switching del to pop
* added unit test
* update test
* fix python 3.9 compatibility, and refactor test
* update test
* bump to 0.8.1
* Add instruction for patch release (#1373)
* add instruction for patch release
* create branch from tag
* Write `null` when there is no parent-snapshot-id (#1383)
---------
Co-authored-by: Sumanth <[email protected]>
Co-authored-by: gitzwz <[email protected]>
Co-authored-by: Wenzhuo Zhao <[email protected]>
Co-authored-by: vincenzon <[email protected]>
Co-authored-by: Luca Bigon <[email protected]>
Co-authored-by: Binayak Dasgupta <[email protected]>
Co-authored-by: Fokko Driesprong <[email protected]>
1 parent 3ccdc44 commit d24e0c9

27 files changed (+431 -231 lines changed)
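
The `add_files` fix listed in the commit message ("fix KeyError, by switching del to pop") relies on a general Python behaviour: `del mapping[key]` raises `KeyError` when the key is absent, while `mapping.pop(key, None)` does not. The sketch below illustrates only that difference. It is not the PyIceberg code; `column_stats`, `drop_with_del`, and `drop_with_pop` are hypothetical names chosen for illustration.

```python
# Hypothetical stand-in for per-column statistics read from a Parquet file;
# the names and values are illustrative, not PyIceberg's.
column_stats = {1: {"min": 0, "max": 9}}  # field id 2 has no statistics


def drop_with_del(stats: dict, field_id: int) -> None:
    """Remove an entry with `del`; raises KeyError if the key is absent."""
    del stats[field_id]


def drop_with_pop(stats: dict, field_id: int) -> None:
    """Remove an entry with `pop` and a default; a missing key is ignored."""
    stats.pop(field_id, None)


try:
    drop_with_del(dict(column_stats), 2)
    del_raised = False
except KeyError:
    del_raised = True

remaining = dict(column_stats)
drop_with_pop(remaining, 2)  # no exception even though key 2 is missing

print(del_raised)
print(remaining)
```

Using `pop` with a default keeps the removal idempotent, which suits cases where an entry may legitimately be missing.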

.github/ISSUE_TEMPLATE/iceberg_bug_report.yml

Lines changed: 2 additions & 1 deletion
@@ -9,7 +9,8 @@ body:
       description: What Apache Iceberg version are you using?
       multiple: false
       options:
-        - "0.7.1 (latest release)"
+        - "0.8.0 (latest release)"
+        - "0.7.1"
         - "0.7.0"
         - "0.6.1"
         - "0.6.0"

dev/Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -39,7 +39,7 @@ WORKDIR ${SPARK_HOME}
 ENV SPARK_VERSION=3.5.0
 ENV ICEBERG_SPARK_RUNTIME_VERSION=3.5_2.12
 ENV ICEBERG_VERSION=1.6.0
-ENV PYICEBERG_VERSION=0.7.1
+ENV PYICEBERG_VERSION=0.8.0

 RUN curl --retry 3 -s -C - https://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop3.tgz -o spark-${SPARK_VERSION}-bin-hadoop3.tgz \
  && tar xzf spark-${SPARK_VERSION}-bin-hadoop3.tgz --directory /opt/spark --strip-components 1 \

mkdocs/docs/how-to-release.md

Lines changed: 183 additions & 58 deletions
@@ -17,15 +17,31 @@
 - under the License.
 -->

-# How to release
+# How to Release

-The guide to release PyIceberg.
+This guide outlines the process for releasing PyIceberg in accordance with the [Apache Release Process](https://infra.apache.org/release-publishing.html). The steps include:

-The first step is to publish a release candidate (RC) and publish it to the public for testing and validation. Once the vote has passed on the RC, the RC turns into the new release.
+1. Preparing for a release
+2. Publishing a Release Candidate (RC)
+3. Community Voting and Validation
+4. Publishing the Final Release (if the vote passes)
+5. Post-Release Step

-## Preparing for a release
+## Requirements

-Before running the release candidate, we want to remove any APIs that were marked for removal under the @deprecated tag for this release.
+* A GPG key must be registered and published in the [Apache Iceberg KEYS file](https://downloads.apache.org/iceberg/KEYS). Follow [the instructions for setting up a GPG key and uploading it to the KEYS file](#set-up-gpg-key-and-upload-to-apache-iceberg-keys-file).
+* SVN Access
+    * Permission to upload artifacts to the [Apache development distribution](https://dist.apache.org/repos/dist/dev/iceberg/) (requires Apache Committer access).
+    * Permission to upload artifacts to the [Apache release distribution](https://dist.apache.org/repos/dist/release/iceberg/) (requires Apache PMC access).
+* PyPI Access
+    * The `twine` package must be installed for uploading releases to PyPi.
+    * A PyPI account with publishing permissions for the [pyiceberg project](https://pypi.org/project/pyiceberg/).
+
+## Preparing for a Release
+
+### Remove Deprecated APIs
+
+Before running the release candidate, we want to remove any APIs that were marked for removal under the `@deprecated` tag for this release. See [#1269](https://github.com/apache/iceberg-python/pull/1269).

 For example, the API with the following deprecation tag should be removed when preparing for the 0.2.0 release.

@@ -48,23 +64,49 @@ deprecation_message(
 )
 ```

-## Running a release candidate
+### Update Library Version
+
+Update the version in `pyproject.toml` and `pyiceberg/__init__.py` to match the release version. See [#1276](https://github.com/apache/iceberg-python/pull/1276).
+
+## Publishing a Release Candidate (RC)
+
+### Release Types
+
+#### Major/Minor Release

-Make sure that the version is correct in `pyproject.toml` and `pyiceberg/__init__.py`. Correct means that it reflects the version that you want to release.
+* Use the `main` branch for the release.
+* Includes new features, enhancements, and any necessary backward-compatible changes.
+* Examples: `0.8.0`, `0.9.0`, `1.0.0`.

-### Setting the tag
+#### Patch Release

-Make sure that you're on the right branch, and the latest branch:
+* Use the branch corresponding to the patch version, such as `pyiceberg-0.8.x`.
+* Focuses on critical bug fixes or security patches that maintain backward compatibility.
+* Examples: `0.8.1`, `0.8.2`.

-For a Major/Minor release, make sure that you're on `main`, for patch versions the branch corresponding to the version that you want to patch, i.e. `pyiceberg-0.6.x`.
+To create a patch branch from the latest release tag:

 ```bash
-git checkout <branch>
-git fetch --all
-git reset --hard apache/<branch>
+# Fetch all tags
+git fetch --tags
+
+# Assuming 0.8.0 is the latest release tag
+git checkout -b pyiceberg-0.8.x pyiceberg-0.8.0
+
+# Cherry-pick commits for the upcoming patch release
+git cherry-pick <commit>
 ```

-Set the tag on the last commit:
+### Create Tag
+
+Ensure you are on the correct branch:
+
+* For a major/minor release, use the `main` branch
+* For a patch release, use the branch corresponding to the patch version, e.g. `pyiceberg-0.6.x`.
+
+Create a signed tag:
+
+Replace `VERSION` and `RC` with the appropriate values for the release.

 ```bash
 export RC=rc1
@@ -74,48 +116,49 @@ export VERSION_BRANCH=${VERSION_WITHOUT_RC//./-}
 export GIT_TAG=pyiceberg-${VERSION}

 git tag -s ${GIT_TAG} -m "PyIceberg ${VERSION}"
-git push apache ${GIT_TAG}
-
-export GIT_TAG_REF=$(git show-ref ${GIT_TAG})
-export GIT_TAG_HASH=${GIT_TAG_REF:0:40}
-export LAST_COMMIT_ID=$(git rev-list ${GIT_TAG} 2> /dev/null | head -n 1)
+git push [email protected]:apache/iceberg-python.git ${GIT_TAG}
 ```

-The `-s` option will sign the commit. If you don't have a key yet, you can find the instructions [here](http://www.apache.org/dev/openpgp.html#key-gen-generate-key). To install gpg on a M1 based Mac, a couple of additional steps are required: <https://gist.github.com/phortuin/cf24b1cca3258720c71ad42977e1ba57>.
-If you have not published your GPG key in [KEYS](https://downloads.apache.org/iceberg/KEYS) yet, you must publish it before sending the vote email by doing:
-
-```bash
-svn co https://dist.apache.org/repos/dist/release/iceberg icebergsvn
-cd icebergsvn
-echo "" >> KEYS # append a newline
-gpg --list-sigs <YOUR KEY ID HERE> >> KEYS # append signatures
-gpg --armor --export <YOUR KEY ID HERE> >> KEYS # append public key block
-svn commit -m "add key for <YOUR NAME HERE>"
-```
+### Publish Release Candidate (RC)

-### Upload to Apache SVN
+#### Upload to Apache Dev SVN

-Both the source distribution (`sdist`) and the binary distributions (`wheels`) need to be published for the RC. The wheels are convenient to avoid having people to install compilers locally. The downside is that each architecture requires its own wheel. [use `cibuildwheel`](https://github.com/pypa/cibuildwheel) runs in Github actions to create a wheel for each of the architectures.
+##### Create Artifacts for SVN

-Before committing the files to the Apache SVN artifact distribution SVN hashes need to be generated, and those need to be signed with gpg to make sure that they are authentic.
+Run the [`Python release` Github Action](https://github.com/apache/iceberg-python/actions/workflows/python-release.yml).

-Go to [Github Actions and run the `Python release` action](https://github.com/apache/iceberg-python/actions/workflows/python-release.yml). **Set the version to main, since we cannot modify the source**.
+* Tag: Use the newly created tag.
+* Version: Set the `version` to `main`, as the source cannot be modified.

 ![Github Actions Run Workflow for SVN Upload](assets/images/ghactions-run-workflow-svn-upload.png)

-Download the zip, and sign the files:
+This action will generate:
+
+* Source distribution (`sdist`)
+* Binary distributions (`wheels`) for each architecture. These are created using [`cibuildwheel`](https://github.com/pypa/cibuildwheel)
+
+##### Download Artifacts, Sign, and Generate Checksums
+
+Download the ZIP file containing the artifacts from the GitHub Actions run and unzip it.
+
+Navigate to the release directory. Sign the files and generate checksums:
+
+* `.asc` files: GPG-signed versions of each artifact to ensure authenticity.
+* `.sha512` files: SHA-512 checksums for verifying file integrity.

 ```bash
 cd release-main/

 for name in $(ls pyiceberg-*.whl pyiceberg-*.tar.gz)
 do
-    gpg --yes --armor --local-user [email protected] --output "${name}.asc" --detach-sig "${name}"
+    gpg --yes --armor --output "${name}.asc" --detach-sig "${name}"
     shasum -a 512 "${name}" > "${name}.sha512"
 done
 ```

-Now we can upload the files from the same directory:
+##### Upload Artifacts to Apache Dev SVN
+
+Now, upload the files from the same directory:

 ```bash
 export SVN_TMP_DIR=/tmp/iceberg-${VERSION_BRANCH}/
@@ -128,21 +171,59 @@ svn add $SVN_TMP_DIR_VERSIONED
 svn ci -m "PyIceberg ${VERSION}" ${SVN_TMP_DIR_VERSIONED}
 ```

-### Upload to PyPi
+Verify the artifact is uploaded to [https://dist.apache.org/repos/dist/dev/iceberg](https://dist.apache.org/repos/dist/dev/iceberg/).
+
+##### Remove Old Artifacts From Apache Dev SVN
+
+Clean up old RC artifacts:
+
+```bash
+svn delete https://dist.apache.org/repos/dist/dev/iceberg/pyiceberg-<OLD_RC_VERSION> -m "Remove old RC artifacts"
+```
+
+#### Upload to PyPi

-Go to Github Actions and run the `Python release` action again. This time, set the **version** of the release candidate as the input: e.g. `0.7.0rc1`. Download the zip and unzip it locally.
+##### Create Artifacts for PyPi
+
+Run the [`Python release` Github Action](https://github.com/apache/iceberg-python/actions/workflows/python-release.yml).
+
+* Tag: Use the newly created tag.
+* Version: Set the `version` to release candidate, e.g. `0.7.0rc1`.

 ![Github Actions Run Workflow for PyPi Upload](assets/images/ghactions-run-workflow-pypi-upload.png)

-Next step is to upload them to pypi. Please keep in mind that this **won't** bump the version for everyone that hasn't pinned their version, since it is set to an RC [pre-release and those are ignored](https://packaging.python.org/en/latest/guides/distributing-packages-using-setuptools/#pre-release-versioning).
+##### Download Artifacts
+
+Download the zip file from the Github Action run and unzip locally.
+
+##### Upload Artifacts to PyPi
+
+Upload release candidate to PyPi. This **won't** bump the version for everyone that hasn't pinned their version, since it is set to an RC [pre-release and those are ignored](https://packaging.python.org/en/latest/guides/distributing-packages-using-setuptools/#pre-release-versioning).
+
+<!-- prettier-ignore-start -->
+
+!!! note
+    `twine` might require a PyPi API token.
+
+<!-- prettier-ignore-end -->

 ```bash
-twine upload release-0.7.0rc1/*
+twine upload release-${VERSION}/*
 ```

+Verify the artifact is uploaded to [PyPi](https://pypi.org/project/pyiceberg/#history).
+
+## Vote
+
+### Generate Vote Email
+
 Final step is to generate the email to the dev mail list:

 ```bash
+export GIT_TAG_REF=$(git show-ref ${GIT_TAG})
+export GIT_TAG_HASH=${GIT_TAG_REF:0:40}
+export LAST_COMMIT_ID=$(git rev-list ${GIT_TAG} 2> /dev/null | head -n 1)
+
 cat << EOF > release-announcement-email.txt

 Subject: [VOTE] Release Apache PyIceberg $VERSION
@@ -185,12 +266,19 @@ Please vote in the next 72 hours.
 [ ] +0
 [ ] -1 Do not release this because...
 EOF
-
-cat release-announcement-email.txt
 ```

-## Vote has passed
+### Send Vote Email

+Verify the content of `release-announcement-email.txt` and send it to `[email protected]` with the corresponding subject line.
+
+## Vote has failed
+
+If there are concerns with the RC, address the issues and generate another RC.
+
+## Publish the Final Release (Vote has passed)
+
+A minimum of 3 binding +1 votes is required to pass an RC.
 Once the vote has been passed, you can close the vote thread by concluding it:

 ```text
@@ -205,36 +293,54 @@ The release candidate has been accepted as PyIceberg <VERSION>. Thanks everyone,
 Kind regards,
 ```

-### Copy the artifacts to the release dist
+### Upload the accepted RC to Apache Release SVN
+<!-- prettier-ignore-start -->

-```bash
-export RC=rc2
-export VERSION=0.7.0${RC}
-export VERSION_WITHOUT_RC=${VERSION/rc?/}
+!!! note
+    Only a PMC member has the permission to upload an artifact to the SVN release dist.
+
+<!-- prettier-ignore-end -->

+```bash
 export SVN_DEV_DIR_VERSIONED="https://dist.apache.org/repos/dist/dev/iceberg/pyiceberg-${VERSION}"
 export SVN_RELEASE_DIR_VERSIONED="https://dist.apache.org/repos/dist/release/iceberg/pyiceberg-${VERSION_WITHOUT_RC}"

 svn mv ${SVN_DEV_DIR_VERSIONED} ${SVN_RELEASE_DIR_VERSIONED} -m "PyIceberg: Add release ${VERSION_WITHOUT_RC}"
 ```

-<!-- prettier-ignore-start -->
+Verify the artifact is uploaded to [https://dist.apache.org/repos/dist/release/iceberg](https://dist.apache.org/repos/dist/release/iceberg/).

-!!! note
-    Only a PMC member has the permission to upload an artifact to the SVN release dist.
+### Remove Old Artifacts From Apache Release SVN

-<!-- prettier-ignore-end -->
+We only want to host the latest release. Clean up old release artifacts:
+
+```bash
+svn delete https://dist.apache.org/repos/dist/release/iceberg/pyiceberg-<OLD_RELEASE_VERSION> -m "Remove old release artifacts"
+```

 ### Upload the accepted release to PyPi

 The latest version can be pushed to PyPi. Check out the Apache SVN and make sure to publish the right version with `twine`:

+<!-- prettier-ignore-start -->
+
+!!! note
+    `twine` might require a PyPi API token.
+
+<!-- prettier-ignore-end -->
+
 ```bash
 svn checkout https://dist.apache.org/repos/dist/release/iceberg /tmp/iceberg-dist-release/
 cd /tmp/iceberg-dist-release/pyiceberg-${VERSION_WITHOUT_RC}
 twine upload pyiceberg-*.whl pyiceberg-*.tar.gz
 ```

+Verify the artifact is uploaded to [PyPi](https://pypi.org/project/pyiceberg/#history).
+
+## Post Release
+
+### Send out Release Announcement Email
+
 Send out an announcement on the dev mail list:

 ```text
@@ -253,19 +359,19 @@ This Python release can be downloaded from: https://pypi.org/project/pyiceberg/<
 Thanks to everyone for contributing!
 ```

-## Release the docs
+### Release the docs

-A committer triggers the [`Python Docs` Github Actions](https://github.com/apache/iceberg-python/actions/workflows/python-ci-docs.yml) through the UI by selecting the branch that just has been released. This will publish the new docs.
+Run the [`Release Docs` Github Action](https://github.com/apache/iceberg-python/actions/workflows/python-release-docs.yml).

-## Update the Github template
+### Update the Github template

 Make sure to create a PR to update the [GitHub issues template](https://github.com/apache/iceberg-python/blob/main/.github/ISSUE_TEMPLATE/iceberg_bug_report.yml) with the latest version.

-## Update the integration tests
+### Update the integration tests

 Ensure to update the `PYICEBERG_VERSION` in the [Dockerfile](https://github.com/apache/iceberg-python/blob/main/dev/Dockerfile).

-## Create a Github Release Note
+### Create a Github Release Note

 Create a [new Release Note](https://github.com/apache/iceberg-python/releases/new) on the iceberg-python Github repository.

@@ -278,3 +384,22 @@ Then, select the previous release version as the **Previous tag** to use the dif
 **Generate release notes**.

 **Set as the latest release** and **Publish**.
+
+## Misc
+
+### Set up GPG key and Upload to Apache Iceberg KEYS file
+
+To set up a GPG key locally, see the instructions [here](http://www.apache.org/dev/openpgp.html#key-gen-generate-key).
+
+To install gpg on an M1-based Mac, a couple of additional steps are required: <https://gist.github.com/phortuin/cf24b1cca3258720c71ad42977e1ba57>.
+
+Then, publish the GPG key to the [Apache Iceberg KEYS file](https://downloads.apache.org/iceberg/KEYS):
+
+```bash
+svn co https://dist.apache.org/repos/dist/release/iceberg icebergsvn
+cd icebergsvn
+echo "" >> KEYS # append a newline
+gpg --list-sigs <YOUR KEY ID HERE> >> KEYS # append signatures
+gpg --armor --export <YOUR KEY ID HERE> >> KEYS # append public key block
+svn commit -m "add key for <YOUR NAME HERE>"
+```
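
The release guide changed in this commit has the release manager write a `.sha512` file next to each artifact with `shasum -a 512`. Someone validating an RC might check those files roughly as follows. This is a minimal sketch using only the Python standard library, not part of the commit: `sha512_hex` and `verify_sha512` are hypothetical helpers, the artifact name is made up, and the code assumes the checksum file stores the hex digest as its first whitespace-separated token, which is the layout `shasum` writes.

```python
import hashlib
import tempfile
from pathlib import Path


def sha512_hex(path: Path) -> str:
    """Return the SHA-512 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sha512(artifact: Path) -> bool:
    """Compare an artifact against its sibling `<name>.sha512` file."""
    checksum_file = artifact.with_name(artifact.name + ".sha512")
    # Assumption: the first token in the file is the expected hex digest.
    expected = checksum_file.read_text().split()[0]
    return sha512_hex(artifact) == expected.lower()


# Self-contained demonstration on a throwaway file (not a real artifact).
with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "pyiceberg-0.0.0.tar.gz"
    artifact.write_bytes(b"example payload")
    # Mimic the "<digest>  <filename>" line that `shasum -a 512` emits.
    Path(str(artifact) + ".sha512").write_text(
        f"{sha512_hex(artifact)}  {artifact.name}\n"
    )
    ok_before = verify_sha512(artifact)   # file matches its checksum
    artifact.write_bytes(b"tampered payload")
    ok_after = verify_sha512(artifact)    # file no longer matches

print(ok_before, ok_after)
```

The checksum only shows that the file was not altered or corrupted after the digest was written. Authenticity is covered separately by the `.asc` signatures, which are typically checked with `gpg --verify` against the published KEYS file.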
