📝 Note on Documentation Language
This is a translation of the ThemisDB documentation. The authoritative and most up-to-date documentation is maintained in German (docs/de/).
Translations may lag behind the German version. For the latest information, please refer to the German documentation.
Last Updated: January 5, 2026
Version: 1.4.0-alpha (Advanced LLM Features)
Type: Documentation Index
Language: English (Translation)
AI directly in the database with advanced capabilities - no external API costs!
- 📝 Grammar-Constrained Generation - EBNF/GBNF support for guaranteed valid outputs (95-99% reliability vs 60-70%)
  - Built-in grammars: JSON, XML, CSV, ReAct Agent
  - Thread-safe grammar cache with LRU eviction
  - Zero post-processing required
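The thread-safe grammar cache with LRU eviction can be sketched in a few lines. This is an illustrative Python model only (ThemisDB's actual cache is C++; class and method names here are hypothetical), showing the intended behavior: compile once, reuse under a lock, evict the least recently used grammar when capacity is exceeded.

```python
from collections import OrderedDict
from threading import Lock

class GrammarCache:
    """Thread-safe LRU cache for compiled grammars (illustrative sketch)."""

    def __init__(self, capacity: int = 32):
        self._capacity = capacity
        self._cache: OrderedDict[str, str] = OrderedDict()
        self._lock = Lock()

    def get_or_compile(self, name: str, source: str) -> str:
        with self._lock:
            if name in self._cache:
                self._cache.move_to_end(name)    # mark as most recently used
                return self._cache[name]
            compiled = f"compiled({source})"     # stand-in for real GBNF compilation
            self._cache[name] = compiled
            if len(self._cache) > self._capacity:
                self._cache.popitem(last=False)  # evict least recently used
            return compiled

# Usage: with capacity 2, touching "json" keeps it alive; "csv" is evicted.
cache = GrammarCache(capacity=2)
cache.get_or_compile("json", "root ::= object")
cache.get_or_compile("csv", "root ::= row+")
cache.get_or_compile("json", "root ::= object")  # refreshes "json"
cache.get_or_compile("xml", "root ::= element")  # evicts "csv"
```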
- 🔭 RoPE Scaling - Extended context window from 4K → 32K tokens (8x increase)
  - Linear, NTK-aware, YaRN scaling methods
  - Process entire research papers and codebases
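The two simplest scaling methods differ in what they rescale. A hedged sketch (standard textbook formulas, not ThemisDB's code): linear scaling divides positions by the scale factor so an 8x longer context maps back into the trained range, while NTK-aware scaling stretches the frequency base instead, preserving high-frequency detail.

```python
def rope_angle(pos: int, i: int, dim: int, base: float = 10000.0,
               scale: float = 8.0, method: str = "none") -> float:
    """Rotation angle for position `pos`, dimension pair `i` (illustrative)."""
    if method == "linear":
        pos = pos / scale                         # compress positions into trained range
    elif method == "ntk":
        base = base * scale ** (dim / (dim - 2))  # stretch the frequency base instead
    return pos * base ** (-2 * i / dim)
```

With scale 8 (4K → 32K), linear scaling makes position 32000 rotate exactly like position 4000 did during training; NTK-aware scaling leaves low dimension pairs nearly untouched and slows only the low-frequency ones.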
- 🖼️ Vision Support - Multi-modal LLMs with CLIP-based image encoding
  - LLaVA integration for image analysis
  - Single and multiple image support
- ⚡ Flash Attention - CUDA kernels for 15-25% speedup, 30% memory reduction
  - Optimized attention mechanism
  - Backward pass for training support
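A little arithmetic shows why a fused attention kernel matters at long contexts (sequence length and head count below are chosen for illustration): materializing the full attention score matrix is O(N²) per head, while a tiled kernel keeps only O(N) running statistics.

```python
# Memory for the naive attention score matrix at 32K context (illustrative numbers).
seq_len = 32_768      # extended context window
n_heads = 32          # hypothetical head count
bytes_fp16 = 2        # fp16 element size

naive_bytes = n_heads * seq_len * seq_len * bytes_fp16  # full N x N score matrix
print(naive_bytes / 2**30, "GiB for the score matrix alone")  # 64 GiB
```

A fused kernel never stores that matrix, which is where the quoted memory reduction comes from.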
- 🎯 Speculative Decoding - 2-3x faster inference with draft+target models
- 🔄 Continuous Batching - 2x+ throughput with dynamic request batching
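The speculative-decoding loop behind the 2-3x figure can be sketched abstractly. This is a simplified greedy variant under stated assumptions (real systems compare probability distributions and accept probabilistically); `draft_next`/`target_next` are hypothetical stand-ins for the two models. The draft model proposes k tokens cheaply; the target model verifies them in one pass and keeps the longest agreeing prefix plus its own correction.

```python
def speculative_step(draft_next, target_next, prefix, k=4):
    """One greedy speculative-decoding step (simplified sketch).

    draft_next / target_next: callables mapping a token sequence to the
    next token. Greedy agreement is the simplest correct special case.
    """
    proposal = list(prefix)
    for _ in range(k):
        proposal.append(draft_next(proposal))   # cheap draft tokens

    accepted = list(prefix)
    for tok in proposal[len(prefix):]:
        expected = target_next(accepted)
        accepted.append(expected)               # target's token is always kept
        if expected != tok:                     # first disagreement ends the step
            break
    return accepted
```

When draft and target agree, all k proposed tokens are accepted in a single target pass, which is the source of the speedup.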
- Grammar-Constrained Generation ⭐ v1.4.0-alpha
  - EBNF/GBNF grammar support
  - Built-in and custom grammars
  - Usage examples and best practices
- RoPE Scaling Implementation ⭐ v1.4.0-alpha
  - Extended context windows (4K→32K)
  - Scaling methods comparison
  - Configuration guide
- Vision Support Quick Start ⭐ v1.4.0-alpha
  - Multi-modal LLM setup
  - CLIP model integration
  - Image analysis examples
- Flash Attention Implementation ⭐ v1.4.0-alpha
  - CUDA kernel optimization
  - Performance benchmarks
  - Configuration guide
- Speculative Decoding ⭐ v1.4.0-alpha
  - Draft+target model pairing
  - 2-3x speedup guide
  - Model recommendations
- Continuous Batching ⭐ v1.4.0-alpha
  - Dynamic batching configuration
  - Throughput optimization
  - Token budget management
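Token budget management in a continuous-batching scheduler reduces to a simple admission rule. A hedged sketch (function and field names are hypothetical, not ThemisDB's API): each step, waiting requests are admitted in order until the combined token count would exceed the per-step budget.

```python
def admit_requests(waiting, running_tokens, token_budget):
    """Admit waiting requests while the per-step token budget allows.

    waiting: list of (request_id, prompt_tokens) pairs; illustrative only.
    """
    admitted, used = [], running_tokens
    for req_id, n_tokens in waiting:
        if used + n_tokens > token_budget:
            break               # FCFS: stop at the first request that doesn't fit
        admitted.append(req_id)
        used += n_tokens
    return admitted, used

# With 600 tokens already running and a budget of 1000, "c" must wait a step.
batch, used = admit_requests([("a", 100), ("b", 300), ("c", 50)], 600, 1000)
```

Because finished sequences free their tokens immediately, new requests join mid-flight instead of waiting for a whole batch to drain, which is where the 2x+ throughput comes from.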
Important: LLM Integration is an optional feature in v1.3.0+:
- Requires build flag: `-DTHEMIS_ENABLE_LLM=ON`
- Requires external dependency: llama.cpp (clone separately)
- See the Build Guide for setup instructions
With the optional LLM build enabled, ThemisDB becomes the first multi-model database with an embedded LLM engine:
- 🧠 Embedded llama.cpp - SLMs/LLMs (1B-70B parameters) directly on GPU ✅
- ⚡ GPU Acceleration - Significant speedup with NVIDIA CUDA support ✅
- 💾 PagedAttention - Optimized memory management ✅
- 🎯 Continuous Batching - Multiple concurrent requests ✅
- 🔧 Kernel Fusion - CUDA kernels for additional speedup ✅
- 📊 Production Monitoring - Grafana/Prometheus integration ✅
- 🔌 Plugin Architecture - Extensible LLM backend system ✅
- 🌐 RPC Framework - Inter-shard communication for distributed LLM ops ✅
- 🖼️ Image Analysis Plugins - Multi-backend AI (llama.cpp Vision, ONNX CLIP, OpenCV DNN) ✅
- 🌐 HTTP/2 with Server Push - CDC/Changefeed with proactive event delivery (~0ms latency) ✅
- 🔌 WebSocket Support - CDC streaming with bidirectional real-time communication ✅
- 📡 MQTT Broker - WebSocket transport, rate limiting, monitoring metrics ✅
- 🚀 HTTP/3 Base - QUIC-based implementation (ngtcp2 + nghttp3) 🚧
- 🐘 PostgreSQL Wire Protocol - SQL-to-Cypher translation for BI tool compatibility ✅
- 🤖 MCP Server - Model Context Protocol with cross-platform support ✅
- Significant speedup with GPU vs CPU-only
- Memory savings with PagedAttention
- Additional optimization with kernel fusion
- Comprehensive test coverage with unit tests
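PagedAttention's memory savings come from paging the KV cache: instead of one contiguous allocation per sequence, logical token positions map through a block table to fixed-size physical blocks allocated on demand. A minimal Python sketch of that bookkeeping (block size and class names hypothetical; the real implementation lives in CUDA/C++):

```python
from itertools import count

class BlockTable:
    """Maps a sequence's logical KV-cache positions to physical blocks
    (simplified PagedAttention bookkeeping)."""

    def __init__(self, block_size: int = 16):
        self.block_size = block_size
        self.blocks: list[int] = []   # physical block ids, allocated on demand
        self.length = 0

    def append_token(self, allocator) -> None:
        if self.length % self.block_size == 0:    # current block full (or none yet)
            self.blocks.append(allocator())
        self.length += 1

    def physical_slot(self, pos: int) -> tuple[int, int]:
        """Return (physical block id, offset within block) for a logical position."""
        return self.blocks[pos // self.block_size], pos % self.block_size

# Usage: 9 tokens with block size 4 need exactly 3 blocks, not a worst-case
# contiguous reservation for the maximum context length.
alloc = count().__next__
table = BlockTable(block_size=4)
for _ in range(9):
    table.append_token(alloc)
```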
| GPU Tier | Hardware | Model | Use Case | Cost/1M Tokens | vs. GPT-4 |
|---|---|---|---|---|---|
| Entry | RTX 4060 Ti (16GB) | Phi-3-Mini (3.8B) | FAQ, simple RAG | €0.02 | 1500x cheaper |
| Mid-Range | RTX 4090 (24GB) | Mistral-7B | Production RAG | €0.05 | 600x cheaper |
| High-End | A100 (80GB) | Llama-3-70B | Enterprise Scale | €0.15 | 200x cheaper |
Break-Even vs. Hyperscaler: 2-7 months depending on hardware tier
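The break-even claim can be reproduced with back-of-envelope arithmetic. The €30/1M hosted figure follows from the table above (€0.05 × 600); the hardware cost and monthly workload below are hypothetical assumptions, not measured values.

```python
# Back-of-envelope break-even estimate (hardware cost and workload assumed).
hardware_cost_eur = 2000.0   # assumed mid-range GPU workstation
self_hosted_per_1m = 0.05    # € per 1M tokens, mid-range row of the table
hosted_per_1m = 30.0         # € per 1M tokens hosted (0.05 x 600 from the table)
tokens_per_month_1m = 20     # assumed workload: 20M tokens/month

monthly_savings_eur = tokens_per_month_1m * (hosted_per_1m - self_hosted_per_1m)
break_even_months = hardware_cost_eur / monthly_savings_eur  # ~3.3 months
```

Heavier workloads or cheaper hardware shorten the break-even; light workloads on high-end hardware push it toward the upper end of the stated 2-7 month range.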
- GPU Inference Guide ⭐ v1.3.0
  - CUDA setup and configuration
  - Performance tuning
  - Troubleshooting
- Quantization Guide ⭐ v1.3.0
  - Q4_K_M, Q5_K_M, Q8_0 formats
  - Memory vs. quality trade-offs
  - Best practices
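The memory side of that trade-off is easy to estimate. A sketch with ballpark bits-per-weight figures for common llama.cpp quantization formats (approximate values, not exact; actual file sizes vary by architecture):

```python
def model_memory_gb(params_b: float, bits_per_weight: float) -> float:
    """Rough VRAM needed for the weights alone (excludes KV cache, activations)."""
    return params_b * 1e9 * bits_per_weight / 8 / 1e9

# Approximate bits-per-weight per format (ballpark figures).
bpw = {"F16": 16.0, "Q8_0": 8.5, "Q5_K_M": 5.7, "Q4_K_M": 4.8}

for fmt, bits in bpw.items():
    print(f"7B model @ {fmt}: {model_memory_gb(7, bits):.1f} GB")
```

This is why a 7B model at Q4_K_M fits comfortably on a 16 GB card while the same model at F16 does not.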
- Performance Benchmarks ⭐ v1.3.0
  - GPU vs. CPU comparisons
  - Throughput measurements
  - Latency analysis
- Deployment Guide ⭐ v1.3.0
  - Docker with GPU support
  - Kubernetes deployment
  - Production best practices
- RPC Framework ⭐ v1.3.0
  - Inter-shard communication
  - TLS/mTLS security
  - Snapshot/blob transfer
- GPU Tier Analysis & Hyperscaler Comparison
  - SLM/LLM performance on entry/mid/high-end GPUs
  - TCO analysis over 3 years
  - ROI calculation vs. AWS/Azure/GCP
- All LLM Documentation - Complete index (31 guides)
The documentation has been restructured for better clarity:
Root Documents (essentials only):
- README.md - Main documentation
- index.md - Documentation index
- glossary.md - Terminology
Organized Folders:
- aql/ - AQL Grammar (EBNF) ⭐ v1.3.0
- build/ - Build system documentation (BUILD-SYSTEM.md, BUILDGUIDE.md, etc.)
- development/ - Development documentation (IMPLEMENTATION-*.md, CODE_REVIEW-*.md)
- guides/ - User and developer guides (RAILWAY_COMPLETE_GUIDE.md, etc.)
- architecture/ - Architecture documentation (ARCHITECTURE_OVERVIEW.md, etc.)
- stakeholder/ - Stakeholder documentation
- releases/ - Release notes (v1.3.0.md, v1.2.0.md, v1.1.0.md, etc.)
- llm/ - LLM & AI Integration ⭐ v1.3.0 RELEASED
- plugins/ - RPC Framework ⭐ v1.3.0
- archive/ - Old/historical documentation
🔮 COMING SOON - v1.1.0 Optimization Release (Q1 2026):
Focus: Better utilize existing libraries + vLLM co-location
Highlights:
- ✅ RocksDB TTL, Incremental Backup, Stats (no new lib!)
- ✅ TBB Parallel Sort, Concurrent Containers (no new lib!)
- ✅ Arrow Parquet Export (no new lib!)
- ✅ CUDA as core (when GPU available, NOT Enterprise!)
- ✅ 🆕 ThemisDB + vLLM Synergy (optimized CPU/GPU/RAM coordination)
- ✅ mimalloc (only new dependency, 20-40% memory boost)
Engineering: 9-11 weeks | Impact: 3-10x performance
Details: v1.1.0 Variant Strategy
🚀 PLANNED - v1.2.0 Enterprise Features (Q2 2026):
Focus: vLLM AI Support (LoRA), Geo-Spatial (PostGIS), IoT/Timescale
Highlights:
- ✅ LoRA Manager - Multi-tenant LoRA serving (HuggingFace PEFT)
- ✅ FAISS Advanced - IVF+PQ vector search (already integrated, expand!)
- ✅ GEOS + PROJ - PostGIS compatibility (topology + geography)
- ✅ Hypertables - TimescaleDB-compatible via RocksDB CF (code only!)
- ✅ cuSpatial - GPU geo ops (optional, uses Arrow + CUDA)
Engineering: 12-16 weeks | Impact: PostGIS + LoRA + TimescaleDB compatibility
Details: Enterprise Features Strategy
- Changelog - Complete version history (v1.2.0, v1.1.0, v1.0.2, v1.0.1, v1.0.0)
- 🆕 Roadmap v1.1.0 - UPDATED: Q1 2026 Optimization Release
- Architecture Overview - Complete system architecture with diagrams
- Source Code Changes v1.0 - Detailed source code documentation (191 files, 26 modules)
- Features List - Complete feature overview with status
- Themis Status Report 2025 - Executive summary, status v1.0.1
- 🆕 v1.1.0 Variant Strategy - Q1 2026: Optimization strategy with vLLM co-location (9-11 weeks, 1 new lib)
- 🆕 v1.2.0 Enterprise Features - Q2 2026: vLLM AI (LoRA), geo-spatial (PostGIS), IoT/Timescale (12-16 weeks, 3 new libs)
- Project Cost Estimation & Total Value - 🔒 Confidential (available to licensed customers only)
- Release Strategy Audit - SLSA compliance, SBOM (8.5/10 rating)
- Release & Benchmarking Summary - v1.0.1 session report
- Development Summary - Development status v1.0.1
- 🆕 External Libraries Analysis - NEW: Feature gap analysis (RocksDB, TBB, CUDA, Arrow)
- 🆕 Library Interactions - NEW: Interactions & additional libraries
- Source Code Audit - Code analysis (132 headers, 124 sources, 90,829 LOC)
- Documentation Index - Complete documentation index with module mapping
- Documentation Verification - Verification documentation ↔ code
- Operations Runbook - Daily operations
- Deployment Guide - Deployment strategies
- Build Strategy - Build toolchain
- Docker Guide - Container deployment
- Compliance Dashboard - Overview of all compliance activities
- Security Audit Report - Completed security audit
- Compliance Full Checklist - BSI C5, ISO 27001, GDPR, eIDAS, SOC 2
- Security Policy - Vulnerability disclosure
- Incident Response Plan - Emergency plan (BSI IT-Grundschutz & NIST CSF)
- SBOM Documentation - Software Bill of Materials (CycloneDX 1.4)
- DPIA - Data protection impact assessment (GDPR Art. 35)
- BCP/DRP - Business continuity (ISO 22301 & NIS2)
- AQL Documentation - Advanced Query Language (parser, optimizer, 240K LOC)
- Query Module - Query engine, execution
- Analytics Module - OLAP engine (CUBE, ROLLUP), CEP, process mining (57K LOC)
- Search Documentation - Fulltext (BM25), vector, hybrid search
- Storage Module - RocksDB wrapper, LSM-tree, MVCC (76K LOC)
- Index Module - Vector HNSW, graph, secondary, spatial (400K LOC)
- Cache Module - Semantic cache, result cache
- Timeseries Module - Gorilla compression, aggregates (39K LOC)
- Sharding Module - VCC-URN sharding, auto-rebalancing, gossip (300K LOC)
- Replication Module - Leader-follower, multi-master CRDTs (12K LOC)
- Transaction Module - MVCC, SAGA patterns (42K LOC)
- GPU Acceleration Plan - 10 GPU backends (173K LOC)
- CUDA, Vulkan, FAISS, DirectX, HIP, OpenCL, OneAPI, ZLUDA
- Content Module - 15 file format processors (256K LOC)
- CDC Module - Change data capture, changefeed
- Geo Module - Spatial operations, plugin system
- Server Module - HTTP server, 21 API handlers (164K LOC)
- HTTP API Reference - Complete HTTP endpoint documentation ⭐
- API Documentation - REST API overview
- LLM Module - LLM interaction store, prompt manager
- Security Module - Field encryption, HSM/PKI, RBAC, Ranger (187K LOC)
- Governance Module - Policy engine, data classification
- Auth Module - JWT validation, multi-tenancy
- Main README - Project overview and quick start
- Deployment Guide - Deployment options
- Docker Guide - Container deployment
- QNAP Quickstart - ARM deployment
- Architecture Overview - Understanding system architecture
- Features Overview - Available features
- AQL Tutorial - Learning the query language
- SDK Audit - Overview of all 7 SDKs
- Python SDK - Python client
- JavaScript SDK - Node.js/browser client
- Rust SDK - Rust client
- Go SDK - Go client
- Java SDK - Java client
- C# SDK - .NET client
- Swift SDK - iOS/macOS client
- Exporters - Data export
- JSONL LLM Exporter - LLM training data export
- Importers - Data import
- PostgreSQL Importer - PostgreSQL migration
- Plugins - Plugin system
- Plugin Security - Security & sandboxing
- Plugin Migration - Migration guide
- Admin Tools - 7 WPF administration tools
- User Guide - User manual
- Admin Guide - Administrator manual
- Feature Matrix - Tool overview
- Operations Runbook - Daily operations
- TLS Setup - TLS/mTLS configuration
- Vault Integration - HashiCorp Vault setup
- RBAC Setup - Access control configuration
- Code Quality - Code quality tools
- Performance Tuning - Performance optimization
- Benchmarks - Performance benchmarks
- Memory Tuning - Memory optimization
- Observability - Monitoring & metrics
- Development Summary - Current development status v1.0.1
- Audit Log - Development audit log
- Implementation Status - Implementation status
- Priorities - Development priorities
- Themis Status Report - Main status report v1.5
- Documentation Summary - Documentation overview
- Benchmark Audit - Test & benchmark status
- Security Audit - Security audit results
- Roadmap Overview - Development roadmap (complete 2026!)
- Features Priorities - Q1 2026 priorities
- Database Capabilities - Capabilities roadmap
- Ingestion - Data ingestion patterns
- VCC CLARA - CLARA adapter
- VCC VERITAS - VERITAS adapter
- VCC Base - Base adapter framework
- Enterprise Features - Rate limiting, load shedding
- Integration Analysis - Legacy code integration
All 26 modules with detailed documentation in src/:
- Acceleration - GPU/CPU backends (173K LOC)
- Analytics - OLAP, CEP (57K LOC)
- API - GraphQL, geo hooks
- Auth - JWT validation
- Cache - Semantic cache
- CDC - Change data capture
- Content - 15 file processors (256K LOC)
- Exporters - Data export
- Geo - Spatial operations
- Governance - Policy engine
- Importers - Data import
- Index - Vector, graph, secondary (400K LOC)
- LLM - LLM integration
- Network - Wire protocol
- Observability - Metrics, tracing
- Plugins - Plugin system
- Query - AQL engine (240K LOC)
- Replication - Leader-follower, multi-master (12K LOC)
- Security - Encryption, RBAC (187K LOC)
- Server - HTTP, API handlers (164K LOC)
- Sharding - VCC-URN, gossip (300K LOC)
- Storage - RocksDB, MVCC (76K LOC)
- Timeseries - Gorilla compression (39K LOC)
- Transaction - MVCC, SAGA (42K LOC)
- Updates - Schema migration
- Utils - Utilities (120K LOC)
- GitHub Wiki - Community wiki
- GitHub Pages - Online documentation
- PDF Documentation - Complete documentation as PDF
- Benchmarks Suite - Benchmark framework
- Docker Benchmarks - Competitive benchmarks
- Hardware Constraints - Resource constraints testing
- v1.0.1 Release Notes - Latest release
- v1.0.0 Release Notes - Production release
- Release Package Structure - Package organization
- Format: Markdown (.md)
- Encoding: UTF-8
- Line Endings: LF (Unix-style)
- Code Blocks: Always specify language
- Links: Use relative paths
- Follow structure - Place docs in appropriate subdirectory
- Link properly - Use relative links to other documents
- Update README - Update relevant README.md files
- Markdown style - Follow Style Guide
- Keep current - Update docs when features change
```powershell
# Install dependencies
pip install -r requirements-docs.txt

# Build documentation
.\build-docs.ps1

# Test locally
mkdocs serve
```

Documentation is automatically deployed to GitHub Pages on merge to main.
- Issues: GitHub Issues
- Wiki: GitHub Wiki
- Security: Security Policy
| Metric | Value |
|---|---|
| Documentation Files | 456+ |
| Documentation Folders | 71 |
| Source Code LOC | 90,829 |
| Source Files | 191 (.cpp) |
| Header Files | 132 (.h) |
| Modules | 26 directories |
| Logical Components | 16 |
Version: 1.3.0
Last Updated: December 20, 2025
License: See LICENSE