
Resolver cache race condition causes duplicate upstream pulls #9364

@twoGiants


Expected Behavior

With N resolver replicas, resolving the same digest-pinned bundle for multiple concurrent TaskRuns should produce at most N upstream registry pulls (one per replica to warm its cache).

Actual Behavior

Concurrent resolution requests within the same replica all miss the cache independently and each call the upstream source, causing a cache stampede. With 4 replicas and 20 TaskRuns, 13 registry pulls were observed instead of the expected maximum of 4.

The race: multiple goroutines check the cache before the first resolution completes, all see a miss, and all resolve independently. The cache only prevents duplicates after the first store, not during in-flight resolution.
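The race described above can be reproduced outside Tekton with a few lines of Go. This is a minimal sketch under stated assumptions: `naiveCache`, `stampedePulls`, and the placeholder digest key are hypothetical names for illustration, not the resolver's actual code, and the "registry pull" is simulated by blocking on a channel.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// naiveCache is a hypothetical check-then-resolve cache (not Tekton's code).
type naiveCache struct {
	mu    sync.Mutex
	items map[string]string
}

func (c *naiveCache) get(key string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	v, ok := c.items[key]
	return v, ok
}

func (c *naiveCache) put(key, val string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.items[key] = val
}

// stampedePulls starts n concurrent resolutions of the same key and
// returns how many of them reached the simulated upstream.
func stampedePulls(n int) int64 {
	cache := &naiveCache{items: map[string]string{}}
	var pulls atomic.Int64
	var inFetch, done sync.WaitGroup
	release := make(chan struct{})
	inFetch.Add(n)
	done.Add(n)

	for i := 0; i < n; i++ {
		go func() {
			defer done.Done()
			const key = "sha256:abc" // placeholder digest
			if _, ok := cache.get(key); ok {
				return // cache hit; not reached here because nothing is stored yet
			}
			pulls.Add(1)   // cache miss: this goroutine pulls on its own
			inFetch.Done() // signal that the pull is in flight
			<-release      // simulated slow registry pull
			cache.put(key, "bundle")
		}()
	}

	inFetch.Wait() // every goroutine has missed the cache and started a pull
	close(release) // only now does the first store happen
	done.Wait()
	return pulls.Load()
}

func main() {
	fmt.Println("upstream pulls:", stampedePulls(5)) // prints "upstream pulls: 5"
}
```

Because no goroutine stores a result until all of them have already checked the cache, every request ends up pulling: one pull per request instead of one per key.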

Steps to Reproduce the Problem

  1. Deploy Tekton Pipelines with 4 resolver replicas and default cache settings
  2. Create 20 TaskRuns simultaneously referencing the same bundle by digest
  3. Check registry logs/metrics for manifest GET request count

Additional Info

  • Kubernetes version:

    Output of kubectl version:

    Client Version: v1.34.2
    Kustomize Version: v5.7.1
    Server Version: v1.34.0

  • Tekton Pipeline version:

    Output of tkn version or kubectl get pods -n tekton-pipelines -l app=tekton-pipelines-controller -o=jsonpath='{.items[0].metadata.labels.version}'

    Client version: 0.43.0
    Pipeline version: devel
    Triggers version: v0.34.0
    Dashboard version: v0.65.0

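Proposed fix: add a singleflight.Group (from golang.org/x/sync/singleflight) so that concurrent in-flight resolutions for the same cache key are deduplicated: the first caller resolves, and the others wait for and share its result. Affects all resolvers (bundle, git, cluster).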
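To illustrate the deduplication idea, here is a rough, stdlib-only sketch. It is a stand-in for the behavior that `singleflight.Group` provides through its `Do` method, not the actual patch: `dedupResolver`, `call`, `Resolve`, and `dedupPulls` are hypothetical names, and the upstream fetch is simulated with a short sleep.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// call tracks one in-flight resolution that later callers can wait on.
type call struct {
	done chan struct{}
	val  string
	err  error
}

// dedupResolver is a hypothetical cache with in-flight deduplication.
type dedupResolver struct {
	mu       sync.Mutex
	cache    map[string]string
	inflight map[string]*call
	fetch    func(key string) (string, error)
}

func newDedupResolver(fetch func(string) (string, error)) *dedupResolver {
	return &dedupResolver{
		cache:    map[string]string{},
		inflight: map[string]*call{},
		fetch:    fetch,
	}
}

// Resolve returns the cached value, joins an in-flight fetch for the same
// key, or starts a new fetch if neither exists.
func (r *dedupResolver) Resolve(key string) (string, error) {
	r.mu.Lock()
	if v, ok := r.cache[key]; ok {
		r.mu.Unlock()
		return v, nil
	}
	if c, ok := r.inflight[key]; ok {
		r.mu.Unlock()
		<-c.done // wait for the caller that is already fetching
		return c.val, c.err
	}
	c := &call{done: make(chan struct{})}
	r.inflight[key] = c
	r.mu.Unlock()

	c.val, c.err = r.fetch(key) // only this caller reaches upstream

	r.mu.Lock()
	if c.err == nil {
		r.cache[key] = c.val // errors are not cached, so later calls retry
	}
	delete(r.inflight, key)
	r.mu.Unlock()
	close(c.done) // wake the waiters; val and err are set before this
	return c.val, c.err
}

// dedupPulls starts n concurrent resolutions of one key and returns the
// number of simulated upstream fetches.
func dedupPulls(n int) int64 {
	var pulls atomic.Int64
	r := newDedupResolver(func(key string) (string, error) {
		pulls.Add(1)
		time.Sleep(10 * time.Millisecond) // simulated registry latency
		return "bundle@" + key, nil
	})
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_, _ = r.Resolve("sha256:abc") // placeholder digest
		}()
	}
	wg.Wait()
	return pulls.Load()
}

func main() {
	fmt.Println("upstream pulls:", dedupPulls(20)) // prints "upstream pulls: 1"
}
```

The cache lookup and the in-flight lookup happen under the same lock, and the cache store and in-flight removal also happen together, so a late caller either joins the running fetch or hits the cache. With this shape each replica would make one upstream pull per key, which matches the expected behavior of at most N pulls for N replicas.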


Labels

kind/bug: Categorizes issue or PR as related to a bug.
