-
Notifications
You must be signed in to change notification settings - Fork 183
fix: prevent cache collisions when URLs return duplicate ETags #1949
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Fixes chainguard-dev#1944 where multi-arch builds failed because Alpine Linux returns the same ETag for different signing keys. The cache used only ETags as keys, causing different files to collide. Changes: - Add URL hash to cache key: etag + sha256(url)[:4] - Each unique URL gets unique cache entry regardless of ETag - Add integration test simulating Alpine Linux duplicate ETags - Add concurrency test for multi-arch build scenario - Add backward compatibility test documenting cache format change - Simplify and remove low-value unit tests Test results: - All existing tests pass - Integration test verifies end-to-end cache behavior - Concurrency test validates multi-arch scenario with goroutines
- Replace if-else chain with switch statement (gocritic) - Use fmt.Fprintf instead of []byte(fmt.Sprintf()) (modernize) - Use range over int for nested loops (modernize)
Add a new workflow job and example config that specifically tests the scenario from chainguard-dev#1944 where Alpine Linux returns the same ETag for different architecture-specific signing keys. The new test: - Builds a multi-arch image (x86_64, aarch64) with Alpine repos - Includes alpine-keys package which triggers the ETag collision - Would have failed before the cache collision fix - Prevents future regressions of this issue Files added: - examples/alpine-multiarch-keys.yaml - Minimal config to trigger bug - .github/workflows/build-samples.yml - New job to test multi-arch build
|
|
||
| func cacheFileFromEtag(cacheFile, etag string) (string, error) { | ||
| cacheDir := filepath.Dir(cacheFile) | ||
| ext := ".etag" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fact the the filename is no longer just an etag makes this feel a bit odd - might need a tweak
|
we write out keys inside the built image; thus we might as well store the keys by filename, as we have an existing bug that we cannot use two different keys with the same filename but different full-url / prefix. Unless we want to parse the key and store it by like it's RSA public key properties (or full sha256?) and solve the bug of distinct keys under the same name. |
The previous fix added URL hashes to cache keys to prevent ETag collisions, but broke offline mode. fetchOffline() was picking any file from the cache directory instead of filtering by URL hash. Fix: Filter cache files by expected URL hash suffix to ensure the correct cached file is returned for the specific URL.
Extract URL hash computation into cacheFileHash() to avoid duplication between cacheFileFromEtag() and fetchOffline(). Returns string for direct use in cache filenames.
The short hash is calculated using the full URL so I think this should avoid same filename collisions. |
Fixes #1944 where multi-arch builds failed because Alpine Linux
returns the same ETag for different signing keys. The cache used
only ETags as keys, causing different files to collide.
Changes:
Test results: