GH-33241: [Archery] Replace github3 with pygithub#48886
Conversation
|
|
|
|
raulcd
left a comment
There was a problem hiding this comment.
Thanks for the PR, awesome work. The changes look great in general. I'll have to take some time to do some validation as crossbow isn't extensively tested at the moment. Can you remove this one while I find some time to test it?
arrow/ci/conda_env_archery.txt
Line 22 in 894d6a3
|
Failures are unrelated, will be fixed once the following issue closed. It has a PR ready to be merged: |
There was a problem hiding this comment.
Changes look good to me. Thanks @fangchenli
@kou I plan to merge this and if we find any small issue not covered by tests during crossbow runs we can fix them.
I haven't validated manually the different cases but have reviewed the changes.
|
|
kou
left a comment
There was a problem hiding this comment.
+1
It makes sense.
@fangchenli BTW, I'm not sure why you worked on this but are you interested in GitHub API related issue? We want to complete #47340 but we need to update our GitHub API usage too. (Changing Type: XXX issue label to XXX issue type in our development scripts.) Do you want to work on it?
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR migrates Crossbow’s GitHub integration from github3.py to PyGithub, updating dependency declarations and adapting the core/CLI codepaths plus tests around releases, PRs, and asset uploads.
Changes:
- Replace
github3.pyusage withPyGithubin Crossbow core logic and CLI flows. - Update packaging/conda environment dependencies and remove the
crossbow-uploadextra. - Add/expand unit tests covering the new PyGithub-based behaviors (releases, PRs, assets, statuses).
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| dev/archery/setup.py | Switch Crossbow extra dependency from github3.py to pygithub and remove crossbow-upload extra. |
| dev/archery/archery/crossbow/core.py | Port GitHub API interactions (login, releases, PRs, assets, statuses) from github3.py to PyGithub. |
| dev/archery/archery/crossbow/cli.py | Update asset download and upload CLI commands to use PyGithub APIs; remove upload method option. |
| dev/archery/archery/crossbow/tests/test_core.py | Add tests for new PyGithub-based Repo/TaskStatus/TaskAssets behaviors. |
| dev/archery/README.md | Remove docs for the now-removed crossbow-upload extra. |
| ci/conda_env_crossbow.txt | Replace github3.py with pygithub in Crossbow conda env. |
| ci/conda_env_archery.txt | Remove github3.py from Archery conda env. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| release = repo.get_release(tag_name) | ||
| release.delete_release() | ||
| except GithubException as e: | ||
| if e.status != 404: | ||
| raise | ||
|
|
||
| release = repo.create_git_release(tag_name, tag_name, "", | ||
| target_commitish=target_commitish) |
| except IOError as e: | ||
| # Catch network and file I/O errors (includes requests exceptions) | ||
| logger.error(f"Attempt {i + 1} has failed with message: {e}.") |
| with mock.patch.object(Repo, 'as_github_repo') as mock_repo: | ||
| with mock.patch.object(Repo, 'default_branch_name', 'main'): | ||
| mock_pr = mock.MagicMock() |
| 'crossbow': ['pygithub', jinja_req, 'pygit2>=1.14.0', 'requests', | ||
| 'ruamel.yaml', 'setuptools_scm>=8.0.0'], |
| # remove the whole release if it already exists | ||
| try: | ||
| release = repo.release_from_tag(tag_name) | ||
| except github3.exceptions.NotFoundError: | ||
| pass | ||
| else: | ||
| release.delete() | ||
|
|
||
| release = repo.create_release(tag_name, target_commitish) | ||
| release = repo.get_release(tag_name) | ||
| release.delete_release() | ||
| except GithubException as e: | ||
| if e.status != 404: | ||
| raise |
| except IOError as e: | ||
| # Catch network and file I/O errors (includes requests exceptions) | ||
| logger.error(f"Attempt {i + 1} has failed with message: {e}.") |
|
@kszucs Would you like to review this? |
|
Back then github3.py had wider feature support and better maintenance though some features were missing hence the pygithub dependency as well (at least if I recall correctly, it was a while ago / there could have been missing packages as well). Apparently pygithub is more actively maintained nowadays so it is indeed better to keep only pygithub. |
|
Overall it looks good to me. |
| def github_release(self, tag): | ||
| repo = self.as_github_repo() | ||
| try: | ||
| return repo.release_from_tag(tag) | ||
| except github3.exceptions.NotFoundError: | ||
| return None | ||
|
|
||
| def github_upload_asset_requests(self, release, path, name, mime, | ||
| max_retries=None, retry_backoff=None): | ||
| return repo.get_release(tag) | ||
| except GithubException as e: | ||
| if e.status == 404: | ||
| return None | ||
| raise |
| @@ -560,13 +534,14 @@ def github_overwrite_release_assets(self, tag_name, target_commitish, | |||
|
|
|||
| # remove the whole release if it already exists | |||
| try: | |||
| release = repo.release_from_tag(tag_name) | |||
| except github3.exceptions.NotFoundError: | |||
| pass | |||
| else: | |||
| release.delete() | |||
|
|
|||
| release = repo.create_release(tag_name, target_commitish) | |||
| release = repo.get_release(tag_name) | |||
| release.delete_release() | |||
| except GithubException as e: | |||
| if e.status != 404: | |||
| raise | |||
| except IOError as e: | ||
| # Catch network and file I/O errors (includes requests exceptions) | ||
| logger.error(f"Attempt {i + 1} has failed with message: {e}.") |
|
@github-actions crossbow submit wheelcp313* |
|
Revision: b7ab6e7 Submitted crossbow builds: ursacomputing/crossbow @ actions-9dd6cd6454 |
|
|
||
| click | ||
| github3.py | ||
| pygithub |
There was a problem hiding this comment.
Can we keep this list alphabetically-ordered?
| extras = { | ||
| 'benchmark': ['pandas'], | ||
| 'crossbow': ['github3.py', jinja_req, 'pygit2>=1.14.0', 'requests', | ||
| 'crossbow': ['pygithub>=2.5.0', jinja_req, 'pygit2>=1.14.0', 'requests', |
raulcd
left a comment
There was a problem hiding this comment.
Thanks @fangchenli for the work here and the effort of keeping it alive. I am going to merge this. It will be tested by some CI jobs and thoroughly tested during the release (download and upload artifacts steps)
|
After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 98ee71e. There were 6 benchmark results indicating a performance regression:
The full Conbench report has more details. It also includes information about 556 possible false positives for unstable benchmarks that are known to sometimes produce them. |
Rationale for this change
Archery currently uses both pygithub and github3. It's unnecessary and increases maintenance burden.
What changes are included in this PR?
Replace all github3 usage with pygithub.
Are these changes tested?
Yes, unittests added.
Are there any user-facing changes?
No.