Distributed cache and S3 proxy experiments.
graph TD
START["GET /bucket/data.parquet<br/>Range: bytes=0-15383"] --> COORD[Coordinator]
COORD --> CALC["Calculate Pages<br/>Pages 0-3, page_size = 4 KB"]
CALC --> BCAST[Broadcast CacheQuery]
BCAST --> N1[Node 1 Check]
BCAST --> N2[Node 2 Check]
BCAST --> N3[Node 3 Check]
N1 --> R1[Response: Pages 0,1<br/>Coverage: 50%]
N2 --> R2[Response: Page 2<br/>Coverage: 25%]
N3 --> R3[Response: None<br/>Coverage: 0%]
R1 & R2 & R3 --> BUILD[Build Coverage Map]
BUILD --> MAP{Coverage Map}
MAP --> SEG1[Segment 1: Pages 0-1<br/>Source: Node 1 Local]
MAP --> SEG2[Segment 2: Page 2<br/>Source: Node 2 RPC]
MAP --> SEG3[Segment 3: Page 3<br/>Source: S3 Origin]
SEG1 --> FETCH1[Fetch from<br/>Local Cache]
SEG2 --> FETCH2[Fetch from<br/>Node 2 via gRPC]
SEG3 --> FETCH3[Fetch from S3<br/>and cache locally]
FETCH1 --> ASSEMBLE[Assemble Data<br/>in Order]
FETCH2 --> ASSEMBLE
FETCH3 --> ASSEMBLE
ASSEMBLE --> STREAM[Stream to Client<br/>HTTP 206]
STREAM --> END[Client receives<br/>complete data]
style COORD fill:#1565C0,stroke:#0D47A1,color:#fff
style BUILD fill:#43A047,stroke:#1B5E20,color:#fff
style MAP fill:#F9A825,stroke:#F57F17,color:#fff
style ASSEMBLE fill:#8E24AA,stroke:#6A1B9A,color:#fff
style STREAM fill:#D81B60,stroke:#880E4F,color:#fff
style START fill:#546E7A,stroke:#37474F,color:#fff
style END fill:#00897B,stroke:#004D40,color:#fff
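
The planning steps in the diagram (calculate pages, collect node responses, build the coverage map, split it into segments) can be sketched in a few lines of Python. This is only an illustrative sketch, not the experiment's actual code: the names `pages_for_range`, `build_coverage_map`, `plan_segments`, and `Segment` are hypothetical, the node responses are hard-coded to match the diagram, and the tie-break rule (the first node that reports a page serves it) is an assumption.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # 4 KB pages, as in the diagram


def pages_for_range(start: int, end: int, page_size: int = PAGE_SIZE) -> range:
    """Return the page indices touched by the inclusive byte range [start, end]."""
    if start < 0 or end < start:
        raise ValueError("invalid byte range")
    return range(start // page_size, end // page_size + 1)


@dataclass(frozen=True)
class Segment:
    first_page: int
    last_page: int  # inclusive
    source: str     # a node name, or "s3" for the origin


def build_coverage_map(pages, responses):
    """Map each requested page to a source.

    Assumption: the first node (in response order) that reports a page
    serves it; pages no node reports fall back to the S3 origin.
    """
    owner = {}
    for node, cached_pages in responses.items():
        for page in cached_pages:
            owner.setdefault(page, node)
    return {page: owner.get(page, "s3") for page in pages}


def plan_segments(coverage):
    """Merge runs of adjacent pages that share a source into segments."""
    segments = []
    for page in sorted(coverage):
        source = coverage[page]
        last = segments[-1] if segments else None
        if last and last.source == source and last.last_page == page - 1:
            segments[-1] = Segment(last.first_page, page, source)
        else:
            segments.append(Segment(page, page, source))
    return segments


# Hard-coded CacheQuery responses mirroring the diagram.
pages = pages_for_range(0, 15383)
responses = {"node1": {0, 1}, "node2": {2}, "node3": set()}
segments = plan_segments(build_coverage_map(pages, responses))
for seg in segments:
    print(seg)
```

With the diagram's inputs this yields three segments: pages 0-1 from node 1, page 2 from node 2, and page 3 from S3. Note that `bytes=0-15383` ends partway through page 3 (which starts at byte 12288), so the assembly step would need to trim that last page to 3096 bytes before streaming the 206 response.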