MAINT: improving build/test time #820

Merged

bashirpartovi
Contributor

@bashirpartovi bashirpartovi commented Mar 22, 2025

Description

  • Moved pre-commit to Ubuntu-only runs
  • Added caching for Python dependencies

Check out the workflow runs to see the details of the caching.

@romanlutz
Contributor

I'm curious: How does this cache help? We only pull the packages once, don't we?

@bashirpartovi
Contributor Author

bashirpartovi commented Mar 23, 2025

@romanlutz

I'm curious: How does this cache help? We only pull the packages once, don't we?

Great question. GitHub Actions spins up a brand new VM for every single workflow run, so without caching, even a tiny change in the same PR re-downloads and reinstalls all our Python packages from scratch. That can get pretty time-consuming, especially if we have tons of dependencies or multiple matrix jobs.

If we use a cache keyed to the OS, Python version, and the hash of our pyproject.toml, we can store those packages on GitHub’s cache service. Then, when dependencies haven’t changed, the workflow just restores everything from that cache instead of starting from zero.
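To illustrate why such a key only changes when it should, here is a minimal Python sketch of the idea. It is not GitHub's implementation or the workflow from this PR; the function name `cache_key` and the key layout are hypothetical, and SHA-256 is assumed as the file hash.

```python
import hashlib


def cache_key(os_name: str, python_version: str, pyproject_bytes: bytes) -> str:
    """Build a cache key from the OS, the Python version, and a hash of pyproject.toml.

    Hypothetical layout for illustration only: "<os>-py<version>-<sha256>".
    """
    digest = hashlib.sha256(pyproject_bytes).hexdigest()
    return f"{os_name}-py{python_version}-{digest}"


if __name__ == "__main__":
    # Two made-up versions of pyproject.toml that differ in one dependency pin.
    before = b'[project]\ndependencies = ["requests>=2.31"]\n'
    after = b'[project]\ndependencies = ["requests>=2.32"]\n'

    # Same OS, Python version, and file contents: same key, so the cache is reused.
    print(cache_key("Linux", "3.11", before) == cache_key("Linux", "3.11", before))
    # Any dependency change yields a new key, so the old cache is simply not matched.
    print(cache_key("Linux", "3.11", before) == cache_key("Linux", "3.11", after))
```

Because the hash is part of the key, nobody has to invalidate anything by hand when dependencies change; the old entry just stops being looked up.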

This really helps with active PRs where we push updates frequently. Each subsequent commit or push can skip the full reinstall, running much faster and improving our feedback loop. It might not seem huge at first, but when it’s repeated across multiple PRs and pushes, it really adds up.

It also benefits new PRs that share the same dependencies, so they start out hitting the cache as well. We had builds that took 21 minutes 👀 and nobody has time for that. With caching, we’re already down to 7 minutes.

@bashirpartovi bashirpartovi changed the title [DRAFT] MAINT: improving build/test time MAINT: improving build/test time Mar 23, 2025
@bashirpartovi bashirpartovi marked this pull request as ready for review March 23, 2025 16:33
@romanlutz
Contributor

Do we ever need to clear the cache?

@bashirpartovi
Contributor Author

Do we ever need to clear the cache?

You generally won’t need to clear the cache manually. GitHub automatically evicts cache entries that have not been accessed in over 7 days, and once a repository’s caches exceed 10 GB in total it deletes the oldest ones, so old or unused caches get cleaned up on their own. Also, whenever we change our dependencies in pyproject.toml, the cache key changes automatically, so old caches are no longer used.

The only time you might need to clear the cache manually is if something gets corrupted or if you want a fresh start for some reason (very rare). In that case, you could update the cache key (so it doesn’t match the old one) or delete the existing caches from the Caches page under the repository’s Actions tab. Under normal circumstances, you don’t have to clear the cache; it just works on its own.

Here is the relevant passage from GitHub's docs:
GitHub will remove any cache entries that have not been accessed in over 7 days. There is no limit on the number of caches you can store, but the total size of all caches in a repository is limited to 10 GB. Once a repository has reached its maximum cache storage, the cache eviction policy will create space by deleting the oldest caches in the repository.

If you exceed the limit, GitHub will save the new cache but will begin evicting caches until the total size is less than the repository limit.
https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/caching-dependencies-to-speed-up-workflows
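
The two eviction rules quoted above can be sketched as a small Python simulation. This is only a model of the documented behavior, not GitHub's code: the names `CacheEntry` and `evict` are hypothetical, "oldest" is assumed to mean least recently accessed, and 10 GB is treated as 10 GiB.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

MAX_IDLE = timedelta(days=7)
REPO_LIMIT_BYTES = 10 * 1024**3  # 10 GB, treated here as 10 GiB (assumption)


@dataclass
class CacheEntry:
    key: str
    size_bytes: int
    last_accessed: datetime


def evict(entries: list[CacheEntry], now: datetime) -> list[CacheEntry]:
    """Return the entries that survive both eviction rules."""
    # Rule 1: drop entries that have not been accessed in over 7 days.
    kept = [e for e in entries if now - e.last_accessed <= MAX_IDLE]
    # Rule 2: while over the size limit, drop the least recently accessed entry.
    kept.sort(key=lambda e: e.last_accessed)
    total = sum(e.size_bytes for e in kept)
    while kept and total > REPO_LIMIT_BYTES:
        total -= kept.pop(0).size_bytes
    return kept


if __name__ == "__main__":
    now = datetime(2025, 3, 24)
    gib = 1024**3
    caches = [
        CacheEntry("stale", 1 * gib, now - timedelta(days=8)),
        CacheEntry("old", 6 * gib, now - timedelta(days=3)),
        CacheEntry("mid", 3 * gib, now - timedelta(days=2)),
        CacheEntry("new", 4 * gib, now - timedelta(days=1)),
    ]
    print([e.key for e in evict(caches, now)])
```

In this made-up example, "stale" is removed by the 7-day rule, and "old" is removed because the remaining 13 GiB is over the limit, leaving "mid" and "new".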

@romanlutz
Contributor

Can we add a comment regarding that and maybe a link to the GH cache docs into the workflow definition?

@bashirpartovi
Contributor Author

@romanlutz done

Contributor

@rlundeen2 rlundeen2 left a comment

Love it, great work!

@bashirpartovi bashirpartovi merged commit b944c34 into Azure:main Mar 24, 2025
19 checks passed
Contributor

@romanlutz romanlutz left a comment

Nice!

3 participants