Use selfhosted S3 build cache #724
Open
chippmann wants to merge 68 commits into master from feature/s3-build-cache
Conversation
This reverts commit bf8fedd.
CedNaru approved these changes on Oct 12, 2024
TL;DR
Overview
We regularly run into cache misses in our CI/CD pipeline and thus unnecessarily recompile lots of code, because our available cache size is "only" 10 GB.
Into that we have to fit all of the intermediate files needed for all compilations for all targets and subtargets, as well as Gradle and its dependencies. As a result, we regularly reach an over-allocation of around 60 GB before GitHub deletes some of our cache again.
To gain more control and flexibility regarding caching (especially with the goal of supporting even more targets and platforms in the future, like Linux arm64), I opted to implement a self-hosted S3 cache using MinIO, with a fallback to regular GitHub caching for forks which do not have access to the needed credentials in our runner secrets.
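The fallback decision described above boils down to: use the self-hosted S3 cache only when the runner secrets are available, otherwise use GitHub's built-in cache. A minimal sketch of that logic (the environment variable names here are illustrative, not the actual ones used in the workflow):

```python
def choose_cache_backend(env: dict) -> str:
    """Pick the cache backend for a CI job.

    Forks do not receive the repository's runner secrets, so the
    self-hosted S3 (MinIO) cache is only usable when the access
    credentials are present; otherwise fall back to GitHub's cache.
    """
    # Hypothetical secret names, for illustration only.
    if env.get("S3_CACHE_ACCESS_KEY") and env.get("S3_CACHE_SECRET_KEY"):
        return "s3"      # self-hosted MinIO instance
    return "github"      # built-in GitHub Actions cache

# A fork without secrets silently degrades to the GitHub cache:
print(choose_cache_backend({}))  # github
```

This keeps PRs from forks fully functional without exposing any credentials to them.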
With this we can cache at a much finer granularity, as I assigned a cap of 1 TB for now (not that we would need that much anyway).
Currently we already sit at around 10 GB for all caches of only one branch.
As we now have way more storage at our disposal, I opted for a cache per Git ref (branch, tag); if none is present, the cache from master is copied. This means we can now cache on a per-branch basis, which should further improve our pipeline and give us more control if we need it.
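The per-ref lookup with a master fallback can be sketched as follows (key names and cache layout are assumptions for illustration, not the actual scheme used here):

```python
def resolve_cache_key(ref: str, existing: set, prefix: str = "cache") -> tuple:
    """Resolve which cache a job should restore for a given Git ref.

    Returns (key_to_use, seed_from_master): when the ref has no cache
    of its own yet, seed_from_master signals that master's cache should
    be copied to the new per-ref key before the build starts.
    """
    key = f"{prefix}/{ref}"
    if key in existing:
        return key, False            # ref already has its own cache
    # New branch/tag: start from master's cache if one exists.
    return key, f"{prefix}/master" in existing

# A fresh feature branch seeds from master:
print(resolve_cache_key("feature/x", {"cache/master"}))
```

Seeding from master means a new branch's first build still gets warm caches instead of starting cold.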
This all comes with one downside though: as the cache is located at my home and I do not have a business internet contract, my upload is capped at 80-100 Mbit/s while the download is capped at 1 Gbit/s. From the view of the GitHub runner, the performance is therefore inverted: the runner's cache download is quite a bit slower than from GitHub (which is around 120-150 Mbit/s, so we are roughly 20-70 Mbit/s slower). The upload performance is not affected though.
Overall, IMO this is negligible. In practice it only noticeably affects the dev and debug builds of the editors, by around a minute each, as their caches are around 1 GB each.
I expect the overall performance to still be better, as we should no longer have any cache misses.
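The "around a minute" estimate above checks out with simple bandwidth arithmetic:

```python
def transfer_seconds(size_gb: float, mbit_per_s: float) -> float:
    """Seconds to transfer size_gb gigabytes at the given megabits/second."""
    return size_gb * 8000 / mbit_per_s  # 1 GB = 8000 Mbit (decimal units)

# Downloading a ~1 GB editor cache, capped by the home uplink vs GitHub:
home = transfer_seconds(1, 80)      # worst-case home uplink: 100 s
github = transfer_seconds(1, 150)   # typical GitHub cache: ~53 s
print(round(home - github))         # gap of roughly 45-50 s
```

So for the ~1 GB editor caches, the self-hosted path costs on the order of a minute extra, which is what the observed numbers show.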
I also took the opportunity to rework some other parts of our CI/CD pipeline.
Finally, it's worth noting that the overall speed of the pipeline depends on a lot of factors, so this PR is not a silver bullet for our pipeline! It should, however, make it a lot more stable in regards to caching.
Other research work
I actually started this work by using self-hosted runners instead of GitHub-hosted ones. In short: while using much more powerful hardware than we get for free, we have so many jobs in parallel that a few fast self-hosted runners can never compete with the vast number of slower runners we get from GitHub. Even with 3 Linux, 3 Windows and 1 macOS runner self-hosted, we were a lot slower than the GitHub-hosted runners.
And managing the cache and the dependencies on these was not worth it.
Thus I switched to caring more about cache misses rather than pure build performance.
But that work is not in vain. Once we revive the work done in #640, we can use them again for benchmarking, as there we need controlled consistency. So my setup at home is still ready for self-hosted runners once needed.
Other S3 service providers
I also looked at some readily available S3 providers and services like Infomaniak or Amazon AWS. But they are either limited in access restrictions or just way too expensive for our use case.