Data lake implementation integrated with AWS S3
-
Async-Download chunks from AWS S3
-
Persist on-disk in a lock-less manner
-
List all persisted chunks by ID from a cache
-
Find and lock a chunk - Once locked, chunk cannot be deleted until all DataChunkRef are dropped.
-
Scheduled deletion - Scheduled for deletion, a chunk will be removed once it is no longer in use.
-
Maximum allocated on-disk storage limit
-
Backend-agnostic datamanager. The RocksDB backend can be substituted with any in-process NoSQL or SQL storage engine.g
-
Simple Prompt UI
| Chunk_ID | Semaphore |
|---|---|
| 0x0A0B | Instance |
| 0x0A0C | Instance |
| 0x0A0C | Instance |
| Chunk_ID | Encoded Chunk Data |
|---|---|
| 0x0A0B | 0x.. |
| 0x0A0C | 0x... |
| 0x0A0C | 0x |
| DatasetID_BlockRange | Chunk_ID |
|---|---|
| 100_0_100 | 0x0A0B |
| 100_101_120 | 0x0A0C |
| 100_121_1000 | 0x0C0A |
