Orca - ORigin CAche #124
> …breaker** (§10.2). Sustained `ErrAuth` (default 3 consecutive) flips
> `/readyz` to NotReady so load balancers drain the replica.
>
> ### 8.7 Metadata-layer singleflight
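(Illustrative only: a minimal sketch of the quoted breaker behavior, assuming a simple consecutive-error counter behind the `/readyz` handler. `ErrAuth`, the field names, and the wiring are stand-ins, not the actual internal/orca code.)

```go
package orca

import (
	"errors"
	"sync"
	"sync/atomic"
)

// ErrAuth stands in for the repo's origin-auth error; illustrative only.
var ErrAuth = errors.New("origin auth failure")

// authBreaker trips readiness after threshold consecutive ErrAuth results
// (the quoted default is 3) and resets on any non-auth result.
type authBreaker struct {
	mu          sync.Mutex
	consecutive int
	threshold   int
	ready       atomic.Bool
}

// Observe is called with the outcome of each origin request.
func (b *authBreaker) Observe(err error) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if errors.Is(err, ErrAuth) {
		b.consecutive++
		if b.consecutive >= b.threshold {
			// /readyz now reports NotReady, so the load balancer drains this replica.
			b.ready.Store(false)
		}
		return
	}
	b.consecutive = 0
	b.ready.Store(true)
}

// Ready is what a /readyz handler would consult.
func (b *authBreaker) Ready() bool { return b.ready.Load() }
```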
So if I'm reading this right, the doc proposes consistent hashing of the file/blob keyspace across Orca replicas, plus in-memory pipelining to batch concurrent requests for the same blocks.
Any concerns about hotspots if we're only running a few replicas? Since we have shared storage under the cache, it's feasible to avoid partitioning (at the cost of another network round trip for locking).
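For the "batch concurrent requests for the same blocks" part, a minimal per-replica sketch using golang.org/x/sync/singleflight; the chunk key format and the `fetch` callback are hypothetical stand-ins for the real fill path:

```go
package orca

import (
	"context"
	"fmt"

	"golang.org/x/sync/singleflight"
)

// chunkFetcher collapses concurrent fetches of the same chunk on one replica:
// the first caller performs the fetch, later callers block and share the result.
type chunkFetcher struct {
	group singleflight.Group
	// fetch is a stand-in for the real chunk fill path (origin GetRange or
	// the /internal/fill RPC to the coordinator replica).
	fetch func(ctx context.Context, bucket, key string, chunk int64) ([]byte, error)
}

func (f *chunkFetcher) Get(ctx context.Context, bucket, key string, chunk int64) ([]byte, error) {
	sfKey := fmt.Sprintf("%s/%s#%d", bucket, key, chunk)
	v, err, _ := f.group.Do(sfKey, func() (any, error) {
		return f.fetch(ctx, bucket, key, chunk)
	})
	if err != nil {
		return nil, err
	}
	return v.([]byte), nil
}
```

Per the PR description, the cluster-wide half additionally routes a miss to the rendezvous-selected coordinator replica over the /internal/fill RPC; the sketch above only shows the in-process collapse.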
> …access-frequency-driven eviction loop. This is the recommended
> posixfs path when external sweep tooling is impractical.
>
> ### 13.2 Active eviction (opt-in, access-frequency)
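(A rough sketch of what an access-frequency eviction loop along these lines could look like; the `cacheStore` interface, counter map, high-water mark, and batch size are hypothetical, not the proposed implementation.)

```go
package orca

import (
	"context"
	"sort"
	"sync"
)

// cacheStore is a hypothetical, minimal view of the cache store used here.
type cacheStore interface {
	Usage(ctx context.Context) (used, capacity int64, err error)
	Delete(ctx context.Context, chunkPath string) error
}

// evictor keeps per-chunk hit counters in memory and, once the store passes a
// high-water mark, deletes the least-frequently-accessed chunks first.
type evictor struct {
	mu        sync.Mutex
	hits      map[string]uint64 // chunk path -> access count
	store     cacheStore
	highWater float64 // e.g. 0.9: start evicting at 90% full
	batch     int     // max deletions per sweep
}

// Record is called on every chunk hit on the read path.
func (e *evictor) Record(chunkPath string) {
	e.mu.Lock()
	e.hits[chunkPath]++
	e.mu.Unlock()
}

// Sweep runs periodically (e.g. from a ticker) and evicts the coldest chunks
// only while the store is above the high-water mark.
func (e *evictor) Sweep(ctx context.Context) error {
	used, capacity, err := e.store.Usage(ctx)
	if err != nil || float64(used) < e.highWater*float64(capacity) {
		return err
	}
	e.mu.Lock()
	paths := make([]string, 0, len(e.hits))
	for p := range e.hits {
		paths = append(paths, p)
	}
	// Least-accessed chunks sort first and are evicted first.
	sort.Slice(paths, func(i, j int) bool { return e.hits[paths[i]] < e.hits[paths[j]] })
	e.mu.Unlock()

	for i, p := range paths {
		if i >= e.batch {
			break
		}
		if err := e.store.Delete(ctx, p); err == nil {
			e.mu.Lock()
			delete(e.hits, p)
			e.mu.Unlock()
		}
	}
	return nil
}
```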
Do we definitely need eviction? I assume the remote dataset might be larger than in-region storage, but worth confirming.
I think @jwilder has mentioned that the origin data set is going to be bigger than what the origin cache data store can hold, at least some of the time.
It feels like we need some kind of eviction, but maybe we need more clarity on the requirement.
Cool, this approach checks out then. Being able to keep the counters in memory might justify sticking with consistent hashing even if it comes at the cost of some hotspotting across replicas. So I guess that mostly addresses both of my remaining comments. 🙃
As long as we run fewer, "larger" replicas of this service, I think we're good to go 👍
…rred optimization
Add the orca origin cache binary (cmd/orca, internal/orca) that fronts S3 / Azure Blob origins with a chunked, rendezvous-hashed, in-DC cache backed by an S3-compatible store (LocalStack in dev, VAST or similar in production). Includes:

- Per-replica fetch coordinator with cluster-wide singleflight collapse via rendezvous-hash coordinator selection, plus an /internal/fill RPC for cross-replica fan-in.
- cluster.PeerSource abstraction (DNS-backed in production, mutable StaticPeerSource for tests) with Peer.Port to support multiple replicas sharing an IP under test.
- internal/orca/app factory exposing Start/Shutdown plus options for injecting alternate origin / cachestore / peer-source / internal handler wrap (used by tests).
- Integration suite (internal/orca/inttest, build tag integrationtest) driven by testcontainers-go: 7 scenarios against real LocalStack + Azurite covering cold/warm GET, ranged GET with chunk-boundary edge cases, 64-chunk multi-chunk GET, rendezvous routing, singleflight collapse with a 64 <= origin GetRanges <= 76 bound, and a real membership-disagreement fallback test.
- Unit tests covering driver branches (cachestore versioning gate, azureblob blob-type gate, server error mapping + handler XML + range parser + path split + headers, chunk arithmetic + path determinism, config env-var fallback, manifest YAML validity).
- Deploy manifest templates (deploy/orca/) defaulting to the unbounded-kube namespace, and an extracted reusable hack/cmd/render-manifests/render package consumed by both the CLI and the manifest validity test.

Adds a make orca-inttest target and a parallel CI job. Docs for the dev harness and integration suite are intentionally excluded from this commit.
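As a rough illustration of the rendezvous-hash coordinator selection mentioned above (the `Peer` shape and the FNV hash are assumptions, not the actual cluster package):

```go
package orca

import (
	"fmt"
	"hash/fnv"
)

// Peer mirrors the idea of a cluster.PeerSource entry; fields are illustrative.
type Peer struct {
	Host string
	Port int
}

// coordinatorFor picks the peer with the highest rendezvous (highest-random-
// weight) score for a chunk key. Every replica computes the same answer from
// the same peer list, so misses for a given chunk converge on one coordinator
// without any extra coordination traffic.
func coordinatorFor(chunkKey string, peers []Peer) Peer {
	var best Peer
	var bestScore uint64
	for i, p := range peers {
		h := fnv.New64a()
		fmt.Fprintf(h, "%s|%s:%d", chunkKey, p.Host, p.Port)
		if s := h.Sum64(); i == 0 || s > bestScore {
			best, bestScore = p, s
		}
	}
	return best
}
```

With only a few replicas, this is also where the hotspotting trade-off from the review thread shows up: every miss for a hot chunk lands on the same coordinator replica.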
```go
in := &s3.ListObjectsV2Input{
	Bucket:  aws.String(b),
	Prefix:  aws.String(prefix),
	MaxKeys: aws.Int32(int32(maxResults)),
}
```
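For context, a hedged sketch of how an input like this is typically consumed with the aws-sdk-go-v2 client; the function name and surrounding wiring are illustrative, and pagination beyond the first page is omitted:

```go
package orca

import (
	"context"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

// listPrefix lists up to maxResults keys under prefix in bucket b.
func listPrefix(ctx context.Context, client *s3.Client, b, prefix string, maxResults int) ([]string, error) {
	in := &s3.ListObjectsV2Input{
		Bucket:  aws.String(b),
		Prefix:  aws.String(prefix),
		MaxKeys: aws.Int32(int32(maxResults)),
	}
	out, err := client.ListObjectsV2(ctx, in)
	if err != nil {
		return nil, err
	}
	keys := make([]string, 0, len(out.Contents))
	for _, obj := range out.Contents {
		keys = append(keys, aws.ToString(obj.Key))
	}
	return keys, nil
}
```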
```go
cc := a.client.ServiceClient().NewContainerClient(cName)
max := int32(maxResults)
```
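Presumably `cc` and `max` feed a flat blob-listing pager; a hedged sketch with the azblob container client (function shape and error handling are illustrative, and only the first page is read):

```go
package orca

import (
	"context"

	"github.com/Azure/azure-sdk-for-go/sdk/storage/azblob/container"
)

// listBlobs lists up to max blob names under prefix in the container cc points at.
func listBlobs(ctx context.Context, cc *container.Client, prefix string, max int32) ([]string, error) {
	pager := cc.NewListBlobsFlatPager(&container.ListBlobsFlatOptions{
		Prefix:     &prefix,
		MaxResults: &max,
	})
	page, err := pager.NextPage(ctx)
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(page.Segment.BlobItems))
	for _, item := range page.Segment.BlobItems {
		if item.Name != nil {
			names = append(names, *item.Name)
		}
	}
	return names, nil
}
```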