This document covers planned enhancements to ThemisDB's time series storage subsystem, which provides append-optimised storage via tsstore.h/tsstore.cpp, Gorilla delta-of-delta compression (gorilla.cpp), continuous aggregation (continuous_agg.cpp), configurable retention management (retention.cpp), hypertable partitioning (hypertable.cpp), and the TSAutoBuffer (ts_auto_buffer.cpp) for auto-batching high-frequency single-point inserts. The module is in Beta state and requires improved query performance, tighter integration with the downsampling pipeline, and hardened compression paths before GA.
- The `tsstore` write path must sustain ≥500k data points per second per node on commodity NVMe hardware without exceeding 10% CPU overhead.
- Gorilla compression must be transparent to callers; `tsstore.h` consumers must not need to decompress chunks manually.
- Retention policies executed by `retention.cpp` must be atomic at the chunk boundary — partial chunk deletion is not permitted.
- `TSAutoBuffer` must not buffer data for longer than its configured flush interval, even under backpressure from the storage layer; overdue flushes must emit a metrics alert via `timeseries_metrics.cpp`.
| Interface | Consumer | Notes |
|---|---|---|
| `TSStore::insert_batch(points)` | `ts_auto_buffer.cpp`, ingestion module | Atomic batch; returns sequence number |
| `TSStore::scan(series_id, start, end)` | `query_optimizer.cpp`, analytics module | Returns compressed chunks; caller decodes |
| `ContinuousAgg::refresh(agg_id)` | `aggregate_scheduler.cpp`, `aggregate_scheduler_helper.cpp` | Incremental refresh from watermark |
| `RetentionManager::apply_policies()` | `retention.cpp`, scheduler module | Chunk-granular; must be idempotent |
| `Hypertable::partition(time_column)` | `hypertable.cpp` | Configures time-dimension chunk interval |
| `GorillaCoder::encode()` / `decode()` | `gorilla.cpp`, `tsstore.cpp` | In-place chunk compression/decompression |
Priority: High
Target Version: v1.8.0
Status: Implemented (PR: copilot/tsstore-single-point-insert-buffering)
tsstore.cpp line 213 (resolved TODO): `TSStore::putDataPoint()` now routes single-point inserts through `TSAutoBuffer::push()` when Gorilla compression is enabled and an auto-buffer is attached, enabling Gorilla batch-encoding for IoT / streaming workloads.
Implementation Notes:
- [x] The `TSAutoBuffer` (ts_auto_buffer.cpp) already exists as the adaptive flush layer; wire `TSStore::insert(single_point)` to route through `TSAutoBuffer` rather than writing directly to RocksDB when batch size = 1.
- [x] `TSAutoBuffer` should accumulate up to `config_.gorilla_batch_size` (default 128) points before encoding with Gorilla and writing as a single chunk.
- [x] Add backpressure signal to `TSAutoBuffer::push()`: return `BUFFER_FULL` when the in-memory buffer exceeds `config_.max_buffer_bytes`. (`INVALID_INPUT` added to distinguish permanent validation errors from transient backpressure.)
- [x] Add unit test: 1000 single-point inserts; verify compressed on-disk size is ≤ 15% of raw (Gorilla target) and p99 insert latency ≤ 50 µs. (8 focused tests in tests/test_tsstore_gorilla_buffer.cpp; `GorillaSmallerThanRaw` verifies compression, `ThousandPointsP99Latency` measures latency.)
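The push/flush contract above can be sketched in a few lines. This is an illustrative stand-in, not the real tsstore API: the `AutoBufferSketch` class, `DataPoint` layout, and the negative-timestamp validation rule are assumptions chosen only to show how `BUFFER_FULL` (transient, retry) is kept distinct from `INVALID_INPUT` (permanent, caller bug) and how a full batch triggers a Gorilla-style flush:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative sketch only; enum values and field names are assumptions.
enum class PushResult { OK, FLUSHED, BUFFER_FULL, INVALID_INPUT };

struct DataPoint { int64_t ts; double value; };

class AutoBufferSketch {
public:
    AutoBufferSketch(size_t batch_size, size_t max_buffer_bytes)
        : batch_size_(batch_size), max_buffer_bytes_(max_buffer_bytes) {}

    PushResult push(const DataPoint& p) {
        if (p.ts < 0)
            return PushResult::INVALID_INPUT;       // permanent: validation error
        if (pending_.size() * sizeof(DataPoint) >= max_buffer_bytes_)
            return PushResult::BUFFER_FULL;         // transient: back-pressure
        pending_.push_back(p);
        if (pending_.size() >= batch_size_) {
            flush();                                // Gorilla-encode as one chunk
            return PushResult::FLUSHED;
        }
        return PushResult::OK;
    }

    size_t flush_count() const { return flushes_; }

private:
    void flush() { pending_.clear(); ++flushes_; }  // stands in for encode + write
    std::vector<DataPoint> pending_;
    size_t batch_size_, max_buffer_bytes_;
    size_t flushes_ = 0;
};
```

Callers can then retry on `BUFFER_FULL` but must surface `INVALID_INPUT` immediately, since retrying a malformed point can never succeed.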
Priority: High
Target Version: v0.9.0 (delivered v1.6.0)
Rewrite the gorilla.cpp decode path to use SIMD intrinsics (AVX2 on x86-64, NEON on ARM) for delta-of-delta reconstruction, dramatically increasing scan throughput for range queries over long time windows.
Implementation Notes:
- Added `gorilla_simd.cpp` and `include/timeseries/gorilla_simd.h` alongside `gorilla.cpp` with AVX2 and NEON implementations selected via runtime CPUID check (`gorilla_simd_has_avx2()` / `gorilla_simd_has_neon()`).
- Two-phase decode: Phase 1 (scalar) parses the bit-stream into flat `dods[]` / `xorvals[]` staging arrays; Phase 2 (SIMD) applies two in-place prefix-sum passes (dod → Δt → ts) and one prefix-XOR pass (vbits reconstruction).
- AVX2 in-register Kogge-Stone prefix scan processes 4 × int64_t per iteration via `_mm256_permute4x64_epi64` + `_mm256_blend_epi32` + `_mm256_permute2x128_si256`.
- NEON path processes 2 × int64_t (or uint64_t) per iteration using `vextq_s64` / `vextq_u64`.
- Scalar fallback delegates to `GorillaDecoder` unchanged.
- 29 focused tests in tests/test_gorilla_simd.cpp (`GorillaSIMDTest` suite) cover correctness, edge cases, NaN/inf, SIMD tail handling, and runtime dispatch.
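The Phase 2 recurrences are easiest to see in scalar form. The sketch below is a hedged, portable rendering of the two prefix-sum passes (dod → Δt → ts) and the prefix-XOR pass; function names and the staging-array layout are assumptions, not the `gorilla_simd.cpp` API. The SIMD paths apply exactly these recurrences 4 (AVX2) or 2 (NEON) lanes at a time:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Prefix sums turn delta-of-delta values back into timestamps:
// first pass accumulates dod -> delta, second accumulates delta -> ts.
std::vector<int64_t> reconstruct_timestamps(int64_t first_ts, int64_t first_delta,
                                            const std::vector<int64_t>& dods) {
    std::vector<int64_t> ts;
    ts.reserve(dods.size() + 2);
    ts.push_back(first_ts);
    ts.push_back(first_ts + first_delta);
    int64_t delta = first_delta;
    for (int64_t dod : dods) {
        delta += dod;                     // pass 1: dod -> delta
        ts.push_back(ts.back() + delta);  // pass 2: delta -> ts
    }
    return ts;
}

// Prefix XOR turns Gorilla XOR residues back into raw float64 bit patterns.
std::vector<uint64_t> reconstruct_values(uint64_t first_bits,
                                         const std::vector<uint64_t>& xorvals) {
    std::vector<uint64_t> out;
    out.reserve(xorvals.size() + 1);
    out.push_back(first_bits);
    for (uint64_t x : xorvals)
        out.push_back(out.back() ^ x);    // residue -> value bits
    return out;
}
```

Because both passes are associative scans, they vectorise with a Kogge-Stone in-register schedule, which is what makes the two-phase split worthwhile.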
Performance Targets:
- Gorilla decode throughput: >2 GB/s of decoded data per core (up from ~400 MB/s scalar).
- Range scan over 1M points (float64): <50 ms P99 including chunk fetch from tsstore.cpp.
Priority: High
Target Version: v0.9.0
Extend continuous_agg.cpp to support watermark-based incremental refresh so that only newly ingested data since the last refresh is re-aggregated. The watermark is tracked per aggregate in the metadata layer and pushed down to tsstore.cpp scan predicates to skip already-processed chunks.
Implementation Notes:
- Add a `ContinuousAggWatermark` table to the metadata store; `continuous_agg.cpp::refresh()` reads the watermark, scans only `[watermark, now)` in tsstore.cpp, and advances the watermark atomically after a successful aggregate write.
- aggregate_scheduler.cpp must persist per-aggregate state, including the watermark, to survive node restarts; use the WAL path from tsstore.cpp for durability.
- aggregate_scheduler_helper.cpp should expose a `backfill_range(agg_id, start, end)` method for manual recovery from gaps in watermark history.
- Emit aggregate refresh latency and lag metrics from timeseries_metrics.cpp, tagged with `agg_id`.
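The refresh loop above reduces to a small invariant: everything before the watermark is already aggregated, so each refresh only scans `[watermark, now)` and then advances the watermark. The sketch below illustrates that invariant with stand-in types (`MockStore`, `IncrementalAgg` are hypothetical, not the tsstore or continuous_agg API, and the real watermark advance is an atomic metadata write):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Point { int64_t ts; double value; };

struct MockStore {
    std::vector<Point> points;  // sorted by ts
    // Return only points in [start, end) — models the scan-predicate
    // push-down that lets already-processed chunks be skipped entirely.
    std::vector<Point> scan(int64_t start, int64_t end) const {
        std::vector<Point> out;
        for (const auto& p : points)
            if (p.ts >= start && p.ts < end) out.push_back(p);
        return out;
    }
};

struct IncrementalAgg {
    int64_t watermark = 0;  // everything before this is already aggregated
    double sum = 0;
    uint64_t count = 0;

    // Re-aggregate only data ingested since the last refresh; advance the
    // watermark only after the aggregate write has succeeded.
    void refresh(const MockStore& store, int64_t now) {
        for (const auto& p : store.scan(watermark, now)) {
            sum += p.value;
            ++count;
        }
        watermark = now;
    }
};
```

Repeated refreshes never re-read old chunks, which is where the sub-500 ms incremental target comes from: cost scales with new data, not total series length.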
Performance Targets:
- Incremental refresh overhead: <500 ms per aggregate per 1-minute interval under 100k inserts/s ingest rate.
- Watermark write amplification: <1.5× (aggregate write bytes / raw data bytes processed).
Priority: Medium
Target Version: v0.10.0
Implement a configurable multi-tier downsampling pipeline (raw → 1 min → 1 hour → 1 day) integrated with continuous_agg.cpp and governed by retention.cpp policies. Each tier is stored in its own hypertable.cpp partition with tier-specific Gorilla compression settings.
Implementation Notes:
- Add `DownsamplingPolicy` configuration to retention.cpp that declares tier resolutions and retention durations; hypertable.cpp auto-provisions per-tier tables at policy creation time.
- continuous_agg.cpp executes downsampling as a watermark-driven aggregate (see the incremental refresh feature above), computing min/max/avg/sum/count per downsampling window.
- Reads from query_optimizer.cpp must be routed to the coarsest tier that satisfies the query's time granularity; add a `TierSelector` in query_optimizer.cpp that compares requested resolution against available tiers.
- Retention expiry of raw data must not leave gaps in coarser tiers; retention.cpp must enforce that the target tier is fully populated before deleting raw chunks.
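The tier-routing rule can be stated precisely: a tier satisfies a query when its resolution is no finer than needed *and* divides the requested resolution evenly, so aggregate windows align. The following is a hedged sketch of that rule only — the function name, the seconds-based units, and the divisibility criterion are assumptions about how a `TierSelector` might work, not the query_optimizer.cpp implementation:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>
#include <vector>

// Pick the coarsest available tier (resolutions in seconds) that still
// satisfies the requested query resolution; nullopt means "fall back to raw".
std::optional<int64_t> select_tier(std::vector<int64_t> tiers,
                                   int64_t requested_resolution_s) {
    std::sort(tiers.begin(), tiers.end(), std::greater<int64_t>());  // coarsest first
    for (int64_t tier : tiers)
        if (tier <= requested_resolution_s && requested_resolution_s % tier == 0)
            return tier;  // windows align exactly, no partial-bucket error
    return std::nullopt;
}
```

For example, a 5-minute query over raw/1 min/1 hour/1 day tiers should read the 1-minute tier, while a 90-second query aligns with no tier and must fall back to raw data.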
Performance Targets:
- Downsampling throughput: >10M raw points/s reduced to 1-min aggregates on a single node.
- Storage reduction from raw to 1-day tier: >50× for typical sensor/metric workloads.
Priority: High
Target Version: v0.9.0
Enhance ts_auto_buffer.cpp to dynamically adjust the flush batch size based on downstream tsstore.cpp write latency feedback, implementing a feedback-control loop that prevents buffer overruns without requiring manual tuning of the flush interval.
Implementation Notes:
- Add a `FlushController` class to ts_auto_buffer.cpp that maintains an EWMA of recent tsstore.cpp write latencies and scales the target batch size inversely with latency.
- If tsstore.cpp write latency exceeds a configurable SLO threshold (default 50 ms), `TSAutoBuffer` must emit a `ts_autobuffer_backpressure` counter via timeseries_metrics.cpp and block producers until the queue drains below the low-water mark.
- Ensure the timer-based flush still fires at the configured maximum interval even when adaptive sizing is active, satisfying the constraint that data must not be held longer than the flush interval.
- `FlushController` state (EWMA, current batch size) must be exposed as runtime metrics for observability.
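A minimal form of that feedback loop is sketched below. The class name matches the note above, but the constructor signature, the `batch ∝ slo / ewma` scaling law, and the clamping constants are all assumptions for illustration, not the planned ts_auto_buffer.cpp interface:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Sketch of an EWMA-driven batch-size controller (C++17 for std::clamp).
class FlushController {
public:
    FlushController(double alpha, double slo_ms,
                    size_t min_batch, size_t max_batch)
        : alpha_(alpha), slo_ms_(slo_ms),
          min_batch_(min_batch), max_batch_(max_batch) {}

    // Feed one observed tsstore write latency; returns the new target batch.
    size_t observe(double latency_ms) {
        ewma_ms_ = alpha_ * latency_ms + (1.0 - alpha_) * ewma_ms_;
        // Batch size scales inversely with smoothed latency: large batches
        // when writes are fast, shrinking toward min_batch_ near the SLO.
        double scale = slo_ms_ / std::max(ewma_ms_, 1e-3);
        auto target = static_cast<size_t>(static_cast<double>(min_batch_) * scale);
        batch_ = std::clamp(target, min_batch_, max_batch_);
        return batch_;
    }

    double ewma_ms() const { return ewma_ms_; }   // exported as a runtime metric
    size_t batch_size() const { return batch_; }  // exported as a runtime metric
    bool over_slo() const { return ewma_ms_ > slo_ms_; }  // backpressure trigger

private:
    double alpha_, slo_ms_;
    double ewma_ms_ = 0.0;
    size_t min_batch_, max_batch_, batch_ = 0;
};
```

The `over_slo()` predicate is where the `ts_autobuffer_backpressure` counter and producer blocking would hook in; the timer-based maximum-interval flush runs independently of this controller.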
Performance Targets:
- Sustained single-point ingest throughput through `TSAutoBuffer`: >500k points/s per node.
- Buffer-to-storage flush latency P99: <10 ms under normal load.
- Backpressure event rate during sustained overload: <1 event/s (adaptive batching absorbs bursts).
Priority: Medium
Target Version: v1.7.0
Status: Implemented (PR: copilot/add-chunk-level-encryption)
Add AES-256-GCM encryption to individual time series chunks in tsstore.cpp using data encryption keys derived by utils/hkdf_helper.cpp and managed by utils/lek_manager.cpp. Encryption must be transparent to the query path; chunks are decrypted on-demand during scan.
Implementation Notes:
- `EncryptedChunkStore` wrapper in include/timeseries/encrypted_chunk_store.h / src/timeseries/encrypted_chunk_store.cpp intercepts chunk write/read operations and applies AES-256-GCM using HKDF-derived per-series DEKs. `TSStore::setEncryptedChunkStore()` attaches the wrapper; `TSStore::getEncryptedChunkStore()` retrieves it.
- Key rotation implemented in include/timeseries/ts_encrypted_key_rotation.h / src/timeseries/ts_encrypted_key_rotation.cpp — background job re-encrypts stale chunks without blocking reads.
- Gorilla-compressed data is encrypted after compression (compress-then-encrypt).
- Every key access is audited via utils/audit_logger.cpp with series ID, chunk range, and accessor identity.
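The compress-then-encrypt ordering is the load-bearing design choice here, and it can be shown without real cryptography. In this sketch the XOR "cipher" and both helper functions are deliberate placeholders so the example stays self-contained — the actual path uses AES-256-GCM with HKDF-derived per-series DEKs, and none of these names are the encrypted_chunk_store.cpp API:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Bytes = std::vector<uint8_t>;

Bytes toy_compress(const Bytes& raw) {        // stands in for Gorilla encode
    Bytes out{static_cast<uint8_t>('C')};
    out.insert(out.end(), raw.begin(), raw.end());
    return out;
}

Bytes toy_cipher(const Bytes& in, uint8_t key) {  // placeholder, NOT real crypto
    Bytes out(in);
    for (auto& b : out) b ^= key;
    return out;
}

struct EncryptedChunkStoreSketch {
    uint8_t dek;  // per-series data encryption key (HKDF-derived in reality)

    // Write path: compress first, then encrypt the compressed bytes.
    // Encrypting first would destroy the redundancy Gorilla exploits,
    // since ciphertext is incompressible.
    Bytes write_chunk(const Bytes& raw) const {
        return toy_cipher(toy_compress(raw), dek);
    }
    // Read path: decrypt, then hand the still-compressed chunk to the decoder.
    Bytes read_chunk(const Bytes& stored) const {
        return toy_cipher(stored, dek);  // XOR is its own inverse
    }
};
```

Because the wrapper sits between `TSStore` and the chunk I/O layer, neither the Gorilla coder nor the query path needs to know whether encryption is enabled.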
Performance Targets:
- Encryption overhead on write path: <5% throughput reduction vs. unencrypted baseline.
- AES-256-GCM throughput per core: >1 GB/s (AES-NI assisted via OpenSSL EVP).
| Test Type | Coverage Target | Notes |
|---|---|---|
| Unit | >85% new code | Cover GorillaSIMD decode, ContinuousAggWatermark, FlushController, TierSelector |
| Integration | Full write → aggregate → query → retention cycle | Use realistic 1-hour dataset with 100k series |
| Performance | P99 < budgets above | Gorilla SIMD decode bench, TSAutoBuffer throughput under backpressure |
| Correctness | Gorilla encode/decode round-trip | Fuzz gorilla.cpp with property-based tests; verify lossless for float64 |
| Metric | Current | Target | Method |
|---|---|---|---|
| Write throughput per node | ~200k pts/s | >500k pts/s | TSAutoBuffer adaptive flush benchmark |
| Gorilla decode throughput | ~400 MB/s | >2 GB/s | SIMD decoder microbenchmark |
| Range scan (1M pts, float64) | ~300 ms | <50 ms | SIMD decode + query_optimizer.cpp tier selection |
| Continuous agg refresh latency | ~5 s | <500 ms | Incremental watermark refresh benchmark |
| Storage compression ratio (Gorilla) | ~4× | >6× (with multi-tier downsampling) | Dataset comparison on real sensor traces |
| Chunk encryption overhead | N/A | <5% write throughput | AES-NI benchmark vs. plaintext baseline |
- Chunk-level AES-256-GCM encryption keys must be managed exclusively through utils/lek_manager.cpp; hard-coded or environment-variable keys are prohibited.
- retention.cpp chunk deletion must be atomic at the chunk boundary and logged to utils/audit_logger.cpp; partially deleted chunks must be detected and repaired on startup.
- The Gorilla SIMD decoder must validate chunk magic bytes and version headers before decoding to prevent corrupt chunk data from causing undefined behaviour in the SIMD path.
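The header-validation requirement amounts to a cheap guard in front of the SIMD path. The sketch below illustrates the shape of such a check; the magic value, header layout, and version limit are invented for the example and are not ThemisDB's on-disk chunk format:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr uint32_t kChunkMagic = 0x54534B31;  // assumed magic, not the real one
constexpr uint16_t kMaxVersion = 2;

struct ChunkHeader { uint32_t magic; uint16_t version; uint16_t flags; };

// Reject truncated, foreign, or future-versioned chunks before the SIMD
// decoder ever touches the payload, so corrupt input cannot drive the
// staging-array indices out of bounds.
bool validate_chunk_header(const uint8_t* data, size_t len) {
    if (len < sizeof(ChunkHeader)) return false;  // truncated chunk
    ChunkHeader h;
    std::memcpy(&h, data, sizeof(h));             // avoids unaligned-read UB
    return h.magic == kChunkMagic && h.version <= kMaxVersion;
}
```

Doing the check on the scalar side keeps the SIMD kernels free of branches and lets corruption be reported as a normal error rather than undefined behaviour.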
- [?] Determine whether time series data containing legal event timestamps must be retained for a minimum period regardless of configured retention policy (regulatory constraint).
- `TSAutoBuffer` must not silently drop data under extreme backpressure: producers block on `backpressure_cv_` and receive `ERR_API_RESOURCE_EXHAUSTED` when the buffer is stopped during the wait. Non-adaptive mode still accepts data up to `max_memory_bytes`, then forces a flush.