The Config module provides backward-compatible configuration path resolution and JSON/YAML schema validation for ThemisDB. It maps legacy flat-file config paths to their new hierarchical directory structure, enabling a seamless migration window where both old and new paths are supported simultaneously. It includes LRU caching for resolved paths, structured deprecation metadata, a typed exception hierarchy for config-related errors, and a ConfigSchemaValidator that validates YAML/JSON configuration files against JSON Schema (Draft 7 subset) definitions.
| Interface / File | Role |
|---|---|
| `config_path_resolver.h` / `config_path_resolver.cpp` | Legacy-to-new config path mapping with filesystem fallback |
| `config_schema_validator.h` / `config_schema_validator.cpp` | JSON Schema (Draft 7 subset) validation of YAML/JSON config files |
| `config_audit_log.h` / `config_audit_log.cpp` | Bounded in-memory audit trail for config path accesses |
| `config_metrics_exporter.h` / `config_metrics_exporter.cpp` | Prometheus text-format metrics exporter for the `/metrics` endpoint |
| `config_encrypted_store.h` / `config_encrypted_store.cpp` | AES-256-GCM encrypted key-value store for sensitive config values with key rotation |
| `lru_cache.h` | LRU cache with TTL for resolved path results |
| `path_mapping_metadata.h` | Deprecation and removal-date metadata per mapped path |
| `config_errors.h` | Typed exception hierarchy for config-related errors |
| `config_migration_scanner_impl.h` | Testable inline implementation for the `config_migration_scanner` CLI tool |
In Scope:
- Legacy-to-new config path mapping with filesystem fallback
- LRU cache with TTL for resolved path results
- Path validation (path-traversal prevention, normalization)
- Deprecation/removal-date metadata per mapped path
- Thread-safe metrics tracking (hits, misses, cache hits, legacy fallbacks)
- Prometheus metrics export via `ConfigMetricsExporter::collect()` (served on `/metrics`)
- Typed exception hierarchy for config errors
- JSON Schema (Draft 7 subset) validation of YAML and JSON config files
- Config path access audit trail (bounded in-memory log with timestamps)
- Encrypted config storage (`ConfigEncryptedStore`): AES-256-GCM encryption, per-value random IV, authentication tag verification, zero-downtime key rotation
Out of Scope:
- Parsing or loading config file contents (YAML/JSON) beyond what is needed for schema validation
- Runtime configuration hot-reload
- Master-key envelope protection for serialised `ConfigEncryptedStore` snapshots (caller responsibility)
Location: config_path_resolver.h, config_path_resolver.cpp
Static utility that resolves legacy config paths to their new hierarchical locations. Checks the new path first, then falls back to the legacy path with a deprecation warning.
Features:
- Path Mapping Table: 60+ mappings covering AI/ML, security, compliance, performance, platform, networking, and monitoring categories
- Filesystem Fallback: Tries the new path first; if absent, uses the legacy path and emits a `spdlog` warning
- Optional API: `tryResolve()` returns `std::nullopt` instead of throwing on failure
- Metadata Lookup: `getMetadata()` returns deprecation date, removal date, and migration guide link per path
- Thread-Safe Metrics: All counters use `std::atomic`, so concurrent reads need no locking
- LRU Cache: Resolved paths are cached to avoid repeated filesystem `exists()` calls. Capacity and TTL are configurable via environment variables (see Environment Variables below).
- Symlink Hardening: `validatePath()` rejects symlinks that resolve outside the config root
- Deprecation Aggregation: `deprecationReport()` returns a usage-sorted snapshot of all legacy paths accessed since startup
- Multi-Environment Overlay: Dev/staging path sets allow environment-specific config overrides without touching production files (see Multi-Environment Config Overlay below).
Environment Variables:
| Variable | Default | Valid Values / Range | Description |
|---|---|---|---|
| `THEMIS_CONFIG_CACHE_SIZE` | `1000` | `[10, 100000]` | Maximum number of entries in the path-resolution LRU cache |
| `THEMIS_CONFIG_CACHE_TTL` | `300` | `[1, 86400]` | Entry TTL in seconds; expired entries are evicted on next access |
| `THEMIS_CONFIG_ENV` | `prod` | `dev`, `staging`, `prod` (case-insensitive) | Active deployment environment for config overlay resolution |
Read the active runtime values via ConfigPathResolver::currentCacheConfig() and ConfigPathResolver::getEnvironment().
When a variable is absent, empty, or invalid, a warning is written to stderr and the default value is used.
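The fallback rule for the integer variables can be sketched as a small parsing helper. The function names `parseBoundedEnv` and `cacheSizeFromEnv` are hypothetical, and the warning text is illustrative; only the variable name, default, and range come from the table above.

```cpp
#include <cerrno>
#include <cstdlib>
#include <iostream>

// Hypothetical helper: parses `raw` as a base-10 integer in [lo, hi].
// Returns `fallback` and warns on stderr when the value is absent, empty,
// malformed, or out of range.
inline long parseBoundedEnv(const char* name, const char* raw,
                            long fallback, long lo, long hi) {
    if (raw == nullptr || *raw == '\0') {
        std::cerr << "warning: " << name << " is not set; using " << fallback << '\n';
        return fallback;
    }
    errno = 0;
    char* end = nullptr;
    const long value = std::strtol(raw, &end, 10);
    const bool malformed = (end == raw) || (*end != '\0') || (errno == ERANGE);
    if (malformed || value < lo || value > hi) {
        std::cerr << "warning: " << name << "=" << raw
                  << " is invalid; using " << fallback << '\n';
        return fallback;
    }
    return value;
}

// Example wiring for the cache capacity (default 1000, range [10, 100000]).
inline long cacheSizeFromEnv() {
    return parseBoundedEnv("THEMIS_CONFIG_CACHE_SIZE",
                           std::getenv("THEMIS_CONFIG_CACHE_SIZE"),
                           1000, 10, 100000);
}
```

Passing the raw string in as a parameter keeps the parsing logic testable without modifying the process environment.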
Thread Safety:
- All public methods are safe for concurrent read access
- The `PATH_MAPPING` table is `const` and initialized at compile time
- Metrics use `std::atomic<uint64_t>`; no locks needed for reads
- `ConfigAuditLog` uses an internal `std::mutex`; audit recording is a separate lock acquisition from path resolution
Location: config_encrypted_store.h, config_encrypted_store.cpp
Thread-safe, AES-256-GCM encrypted key-value store for sensitive configuration values (passwords, API tokens, connection strings). Each call to set() generates a fresh random 96-bit IV, ensuring that two encryptions of the same plaintext produce distinct ciphertexts. Authentication tags (128 bits) are verified on every get() call, so tampered data is detected before it is returned.
Encryption scheme:
| Property | Value |
|---|---|
| Algorithm | AES-256-GCM (NIST SP 800-38D) |
| Key size | 256 bits (32 bytes) |
| IV size | 96 bits (12 bytes), randomly generated per encryption |
| Tag size | 128 bits (16 bytes), verified on every decryption |
| IV source | RAND_bytes (OpenSSL CSPRNG) |
Key rotation:
rotateKey() atomically re-encrypts every stored value under a new randomly-generated 256-bit key. The operation is serialised under the internal mutex so no concurrent read can observe a partially-rotated store. The old key bytes are zero-filled before the std::vector holding them is dropped.
Persistence:
serialize() returns a JSON string containing the current key material and all encrypted blobs. deserialize() restores a store from such a snapshot. The serialised form contains the AES key in plaintext; callers must wrap the snapshot in a master-key envelope before writing it to persistent storage.
Thread Safety: All public methods acquire the internal std::mutex; safe for concurrent use from multiple threads.
Location: config_audit_log.h, config_audit_log.cpp
Bounded, thread-safe in-memory audit trail for config path accesses. Disabled by default; enabled via ConfigPathResolver::setAuditLogEnabled(true). Each successful resolution appends an AuditEntry containing:
| Field | Type | Description |
|---|---|---|
| `requested_path` | `std::string` | The path as originally passed by the caller |
| `resolved_path` | `std::string` | The final filesystem path returned |
| `timestamp` | `std::chrono::system_clock::time_point` | UTC time of the access |
| `is_legacy` | `bool` | `true` if the legacy fallback path was used |
| `is_cache_hit` | `bool` | `true` if the result was served from the LRU cache |
The log is bounded (default 10,000 entries); oldest entries are evicted when the limit is reached. Failed resolutions are never recorded.
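The bounded, oldest-first eviction behaviour can be sketched with a mutex-guarded `std::deque`. This is an illustrative sketch rather than the real `ConfigAuditLog`: the class name `BoundedAuditLog` and its `record()`/`snapshot()` methods are hypothetical, while the `AuditEntry` fields mirror the table above.

```cpp
#include <chrono>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

struct AuditEntry {
    std::string requested_path;
    std::string resolved_path;
    std::chrono::system_clock::time_point timestamp;
    bool is_legacy = false;
    bool is_cache_hit = false;
};

// Hypothetical sketch of a bounded, thread-safe audit trail.
class BoundedAuditLog {
public:
    explicit BoundedAuditLog(std::size_t max_entries = 10000)
        : max_entries_(max_entries) {}

    // Appends an entry, dropping the oldest one once the bound is reached.
    void record(AuditEntry entry) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (max_entries_ == 0) return;
        if (entries_.size() >= max_entries_) entries_.pop_front();
        entries_.push_back(std::move(entry));
    }

    // Returns a copy of all entries, oldest first.
    std::vector<AuditEntry> snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return std::vector<AuditEntry>(entries_.begin(), entries_.end());
    }

private:
    mutable std::mutex mutex_;
    std::deque<AuditEntry> entries_;
    std::size_t max_entries_;
};
```

Returning a copy from `snapshot()` lets callers iterate without holding the lock, at the cost of one allocation per query.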
Location: lru_cache.h
Generic LRU cache with per-entry TTL eviction. Used internally by ConfigPathResolver to cache resolved paths.
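To make the combination of LRU ordering and per-entry TTL concrete, here is a minimal string-to-string sketch. It is not the contents of `lru_cache.h`: the class name `TtlLruCache` is hypothetical, the sketch is not thread-safe, and the `now` parameter exists only so that expiry can be exercised without sleeping.

```cpp
#include <chrono>
#include <cstddef>
#include <list>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

class TtlLruCache {
public:
    using Clock = std::chrono::steady_clock;

    TtlLruCache(std::size_t capacity, std::chrono::seconds ttl)
        : capacity_(capacity), ttl_(ttl) {}

    std::optional<std::string> get(const std::string& key,
                                   Clock::time_point now = Clock::now()) {
        auto it = index_.find(key);
        if (it == index_.end()) return std::nullopt;
        if (now - it->second->stored_at >= ttl_) {  // expired: evict on access
            order_.erase(it->second);
            index_.erase(it);
            return std::nullopt;
        }
        order_.splice(order_.begin(), order_, it->second);  // mark most recently used
        return it->second->value;
    }

    void put(const std::string& key, std::string value,
             Clock::time_point now = Clock::now()) {
        if (capacity_ == 0) return;
        auto it = index_.find(key);
        if (it != index_.end()) {  // replace an existing entry
            order_.erase(it->second);
            index_.erase(it);
        }
        if (order_.size() >= capacity_) {  // evict the least recently used entry
            index_.erase(order_.back().key);
            order_.pop_back();
        }
        order_.push_front(Entry{key, std::move(value), now});
        index_[key] = order_.begin();
    }

    std::size_t size() const { return order_.size(); }

private:
    struct Entry {
        std::string key;
        std::string value;
        Clock::time_point stored_at;
    };
    std::list<Entry> order_;  // front = most recently used
    std::unordered_map<std::string, std::list<Entry>::iterator> index_;
    std::size_t capacity_;
    std::chrono::seconds ttl_;
};
```

The list plus hash-map layout gives constant-time lookup, promotion, and eviction; a production version would additionally need locking for concurrent callers.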
Location: config_metrics_exporter.h, config_metrics_exporter.cpp
Static utility that formats ConfigPathResolver metrics in Prometheus text-exposition format and exposes them on the server-wide /metrics scrape endpoint.
Exported metrics:
| Metric Name | Type | Description |
|---|---|---|
| `themis_config_resolution_hits_total` | counter | Successful path resolutions |
| `themis_config_resolution_misses_total` | counter | Failed resolutions (path not found) |
| `themis_config_legacy_fallbacks_total` | counter | Times legacy path was used as fallback |
| `themis_config_new_path_hits_total` | counter | Times new (canonical) path was resolved |
| `themis_config_unmapped_requests_total` | counter | Requests for paths with no mapping |
| `themis_config_cache_hits_total` | counter | LRU cache hits |
| `themis_config_cache_misses_total` | counter | LRU cache misses |
| `themis_config_cache_hit_ratio` | gauge | Cache hit / (hit + miss), 0.0–1.0 |
| `themis_config_cache_size` | gauge | Current number of entries in cache |
| `themis_config_cache_capacity` | gauge | Maximum cache capacity (info) |
| `themis_config_cache_ttl_seconds` | gauge | Cache entry TTL in seconds (info) |
| `themis_config_legacy_fallbacks_by_category_total{category}` | counter | Legacy fallbacks broken down by config category |
| `themis_config_legacy_fallbacks_all_total` | counter | Aggregate legacy fallbacks across all categories |
collect() performs a pure read unless a Prometheus registry is registered via registerWithRegistry(), in which case it also updates the registered counters using deltas before returning serialized text. updateMetricsCollector() pushes the same values into the central MetricsCollector singleton as _current gauges for Grafana dashboard integration.
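For readers unfamiliar with the Prometheus text-exposition format, the sketch below renders two of the cache metrics with their `HELP`/`TYPE` annotations. It is illustrative only: `CacheCounters`, `hitRatio`, and `renderCacheMetrics` are hypothetical names, the help strings are placeholders, and treating an empty cache history as a ratio of 0.0 is an assumption.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

struct CacheCounters {
    std::uint64_t hits = 0;
    std::uint64_t misses = 0;
};

// hit / (hit + miss); defined as 0.0 when nothing has been looked up yet.
inline double hitRatio(const CacheCounters& c) {
    const std::uint64_t total = c.hits + c.misses;
    return total == 0 ? 0.0
                      : static_cast<double>(c.hits) / static_cast<double>(total);
}

// Renders a counter and a gauge in Prometheus text-exposition format.
inline std::string renderCacheMetrics(const CacheCounters& c) {
    std::ostringstream out;
    out << "# HELP themis_config_cache_hits_total LRU cache hits\n"
        << "# TYPE themis_config_cache_hits_total counter\n"
        << "themis_config_cache_hits_total " << c.hits << '\n'
        << "# HELP themis_config_cache_hit_ratio Cache hit ratio\n"
        << "# TYPE themis_config_cache_hit_ratio gauge\n"
        << "themis_config_cache_hit_ratio " << hitRatio(c) << '\n';
    return out.str();
}
```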
Location: path_mapping_metadata.h
Holds deprecation and removal timestamps, category, and a link to the migration guide for each mapped legacy path. Used to emit structured warnings when legacy paths are accessed.
Location: config_schema_validator.h, config_schema_validator.cpp
Static utility that validates YAML and JSON config files — or in-memory strings — against JSON Schema (Draft 7 subset) definitions. YAML content is parsed via yaml-cpp and converted to an internal JSON representation before validation. JSON content is parsed directly with nlohmann::json. Schema file lookups use ConfigPathResolver::tryResolve() so that legacy-to-new path mapping applies automatically.
Public API summary:
| Method | Input | Description |
|---|---|---|
| `validate(path, schema)` | file path + schema object | Validate a YAML/JSON file against an inline schema |
| `validateWithSchemaFile(config_path, schema_path)` | two file paths | Validate a YAML/JSON file against a schema file |
| `validateFromString(content, is_yaml, schema)` | in-memory string + schema object | Validate a YAML or JSON string without touching the filesystem |
| `loadAsJson(file_path)` | file path | Parse a YAML/JSON file to `nlohmann::json` |
| `loadAsJson(content, is_yaml)` | in-memory string | Parse a YAML or JSON string to `nlohmann::json` |
Supported JSON Schema keywords:
- `type`, `properties`, `required`, `additionalProperties`
- `minLength`, `maxLength`, `pattern`, `format` (string; formats: `date`, `date-time`, `email`, `uri`, `ipv4`, `ipv6`)
- `minimum`, `maximum`, `exclusiveMinimum`, `exclusiveMaximum` (number/integer)
- `minItems`, `maxItems`, `items`, `uniqueItems` (array)
- `enum`, `const`
- `allOf`, `anyOf`, `oneOf` (schema composition)
- `$ref` with local `$defs`/`definitions` lookup (JSON Pointer RFC 6901 subset)
$ref and $defs support:
ConfigSchemaValidator resolves document-internal $ref values using a JSON Pointer walk (RFC 6901). Only refs beginning with # (document-local pointers) are supported. Both the Draft 2019-09 $defs keyword and the older Draft 4/6/7 definitions keyword are accepted as lookup targets. External URI references (e.g., https://example.com/schema.json) are rejected with a validation error to prevent SSRF.
- Nested references (a `$defs` entry that itself uses `$ref`) are fully resolved.
- Cyclic `$ref` chains are detected and reported as a validation error rather than causing infinite recursion.
- An unresolvable `$ref` is reported as an error in `ValidationResult`.
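The first step of a JSON Pointer walk is splitting the `$ref` string into reference tokens. The sketch below shows that step under stated assumptions: `parseLocalRef` is a hypothetical helper, it handles the RFC 6901 escapes `~0` and `~1` but not URI percent-decoding, and it signals "not a document-local pointer" (which includes external URIs) by returning `std::nullopt`.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Splits a document-local "$ref" such as "#/$defs/Port" into unescaped
// reference tokens ("~1" becomes "/", "~0" becomes "~").
inline std::optional<std::vector<std::string>> parseLocalRef(const std::string& ref) {
    if (ref.empty() || ref[0] != '#') return std::nullopt;  // external or relative URI
    std::vector<std::string> tokens;
    if (ref.size() == 1) return tokens;                     // "#" is the whole document
    if (ref[1] != '/') return std::nullopt;                 // "#anchor" is not a pointer

    std::string current;
    for (std::size_t i = 2; i < ref.size(); ++i) {
        const char ch = ref[i];
        if (ch == '/') {
            tokens.push_back(current);
            current.clear();
        } else if (ch == '~') {
            if (i + 1 >= ref.size()) return std::nullopt;   // dangling escape
            const char next = ref[++i];
            if (next == '0') current += '~';
            else if (next == '1') current += '/';
            else return std::nullopt;                       // invalid escape
        } else {
            current += ch;
        }
    }
    tokens.push_back(current);
    return tokens;
}
```

A resolver would then descend through the schema document one token at a time, reporting an unresolvable reference if any token is missing and tracking visited pointers to detect cycles.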
Schema Composition keywords:
| Keyword | Semantics |
|---|---|
| `allOf` | Value must satisfy all sub-schemas. Errors from every failing sub-schema are collected and reported. |
| `anyOf` | Value must satisfy at least one sub-schema. Passes silently on the first match. |
| `oneOf` | Value must satisfy exactly one sub-schema. Fails if zero or more than one sub-schema matches. |
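The three composition rules differ only in how many sub-schemas must match. The sketch below models each sub-schema as a plain predicate over an `int`, which is a deliberate simplification: the `Check` alias and the free functions are hypothetical and do not reflect how `ConfigSchemaValidator` represents schemas or collects errors.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Stand-in for "does the value validate against this sub-schema?".
using Check = std::function<bool(int)>;

inline std::size_t countMatches(const std::vector<Check>& subs, int value) {
    return static_cast<std::size_t>(
        std::count_if(subs.begin(), subs.end(),
                      [value](const Check& check) { return check(value); }));
}

// allOf: every sub-schema must match.
inline bool allOf(const std::vector<Check>& subs, int value) {
    return countMatches(subs, value) == subs.size();
}

// anyOf: at least one sub-schema must match; stops at the first match.
inline bool anyOf(const std::vector<Check>& subs, int value) {
    return std::any_of(subs.begin(), subs.end(),
                       [value](const Check& check) { return check(value); });
}

// oneOf: exactly one sub-schema must match, so all of them are evaluated.
inline bool oneOf(const std::vector<Check>& subs, int value) {
    return countMatches(subs, value) == 1;
}
```

Note that `oneOf` cannot short-circuit on the first match, because a second match turns success into failure.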
Thread Safety: All public methods are stateless static functions; safe for concurrent use.
Location: config_errors.h
Typed exceptions for config-related failures:
| Exception | Thrown When |
|---|---|
| `ConfigNotFoundException` | Neither new nor legacy path exists on disk |
| `MappingNotFoundException` | No mapping found for a legacy path |
| `InvalidPathException` | Path contains `..` (traversal attempt) or is otherwise invalid |
| `ConfigPermissionException` | Filesystem permission denied |
| `SchemaValidationException` | A config or schema file cannot be read or parsed by `ConfigSchemaValidator` |
```
Caller
  │
  └─► ConfigPathResolver::resolve(legacy_path)
        │
        ├─ normalizePath()          ← strip "./" and backslashes
        ├─ validatePath()           ← reject ".." traversal
        ├─ LRUCacheWithTTL::get()   ← return if cached (+ audit entry if enabled)
        │
        ├─ mapLegacyToNew()         ← look up PATH_MAPPING table
        │
        ├─ filesystem::exists(new_path)?  → return new path (+ audit entry if enabled)
        │
        └─ filesystem::exists(legacy_path)?
              ├─ yes → log deprecation warning, return legacy path (+ audit entry if enabled)
              └─ no  → throw ConfigNotFoundException (no audit entry)
```
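The resolution flow above can be condensed into a small sketch that omits the cache and audit steps. Everything here is illustrative: `normalizePath`, `hasTraversal`, and `resolveSketch` are hypothetical stand-ins, the existence check is injected as a callback so the logic can run without touching the filesystem, standard exceptions replace the module's typed ones, and treating backslashes as separators and rejecting only whole `..` segments are assumptions.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

// Converts backslashes to forward slashes and strips leading "./" prefixes.
inline std::string normalizePath(std::string path) {
    for (char& ch : path) {
        if (ch == '\\') ch = '/';
    }
    while (path.rfind("./", 0) == 0) path.erase(0, 2);
    return path;
}

// True when any '/'-separated segment is exactly "..".
inline bool hasTraversal(const std::string& path) {
    std::size_t start = 0;
    while (true) {
        const std::size_t end = path.find('/', start);
        const std::string segment =
            path.substr(start, end == std::string::npos ? std::string::npos : end - start);
        if (segment == "..") return true;
        if (end == std::string::npos) return false;
        start = end + 1;
    }
}

using ExistsFn = std::function<bool(const std::string&)>;

// New path first, then legacy fallback, otherwise an error.
inline std::string resolveSketch(const std::string& requested,
                                 const std::map<std::string, std::string>& mapping,
                                 const ExistsFn& exists) {
    const std::string legacy = normalizePath(requested);
    if (hasTraversal(legacy)) {
        throw std::invalid_argument("path traversal rejected: " + requested);
    }
    const auto it = mapping.find(legacy);
    if (it != mapping.end() && exists(it->second)) return it->second;
    if (exists(legacy)) return legacy;  // the real resolver logs a deprecation warning here
    throw std::runtime_error("config not found: " + requested);
}
```

Injecting `exists` also makes the ordering easy to unit-test: a callback that accepts only the new path, only the legacy path, or neither exercises each branch of the diagram.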
- `config/lru_cache.h`: LRU cache with TTL
- `config/path_mapping_metadata.h`: deprecation metadata struct
- `config/config_errors.h`: typed exception hierarchy
- `config/config_audit_log.h`: bounded in-memory audit trail
- `spdlog`: structured logging for deprecation warnings and debug traces
- `<filesystem>` (C++17): file existence checks and path manipulation
- `yaml-cpp`: YAML file parsing used by `ConfigSchemaValidator`
- `nlohmann/json`: JSON file parsing and schema representation used by `ConfigSchemaValidator` and `ConfigEncryptedStore`
- `OpenSSL` (`libcrypto`): AES-256-GCM encryption used by `ConfigEncryptedStore`
```cpp
#include <cassert>
#include <cstdint>
#include <string>

#include "config/config_encrypted_store.h"
using namespace themis::config;

// --- Basic usage ---
ConfigEncryptedStore store;
store.set("db_password", "hunter2");
store.set("api_token", "tok_abc123");

// Retrieve (decrypts and verifies authentication tag on every call)
std::string pw = store.get("db_password");  // "hunter2"

// Non-throwing variant (returns std::nullopt if key absent)
auto token = store.tryGet("api_token");
if (token) { /* use *token */ }

// --- Key rotation (zero-downtime, atomic) ---
uint32_t new_version = store.rotateKey();
// All stored values are now re-encrypted under a fresh AES-256 key.
// Values remain accessible without any change to callers.
assert(store.get("db_password") == "hunter2");
assert(store.currentKeyVersion() == new_version);

// --- Persistence ---
// Serialise to JSON (contains AES key in plaintext; protect before persisting)
std::string snapshot = store.serialize();

// Restore from snapshot
ConfigEncryptedStore restored;
restored.deserialize(snapshot);
assert(restored.get("db_password") == "hunter2");
assert(restored.currentKeyVersion() == store.currentKeyVersion());
```

```cpp
#include <string>

#include "config/config_metrics_exporter.h"
#include "config/config_path_resolver.h"
using namespace themis::config;

// Resolve a legacy path (throws ConfigNotFoundException if not found)
std::string path = ConfigPathResolver::resolve("config/lora_training_config.yaml");
// Returns "config/ai_ml/lora_training_config.yaml" if new path exists,
// or "config/lora_training_config.yaml" with a deprecation warning if only legacy exists.

// Non-throwing variant
auto opt = ConfigPathResolver::tryResolve("config/pii_patterns.yaml");
if (opt) {
    // use *opt
}

// Check if a path is a known legacy path
if (ConfigPathResolver::isLegacyPath("config/rbac_roles.json")) {
    // suggest migration
}

// Retrieve deprecation metadata
auto meta = ConfigPathResolver::getMetadata("config/lora_training_config.yaml");
if (meta && meta->isDeprecated()) {
    // meta->getDeprecationMessage() returns a human-readable message
}

// Inspect resolution metrics
const auto& m = ConfigPathResolver::metrics();
// m.new_path_hits, m.legacy_fallbacks, m.cache_hits, etc.

// Prometheus metrics export (used by MonitoringApiHandler at /metrics scrape)
std::string prom_text = ConfigMetricsExporter::collect();
// Returns Prometheus text-exposition format string with HELP/TYPE annotations.

// Sync into MetricsCollector for Grafana dashboard gauges
ConfigMetricsExporter::updateMetricsCollector();

// Query the active cache configuration (may differ from defaults if env vars are set)
auto cfg = ConfigPathResolver::currentCacheConfig();
// cfg.capacity, cfg.ttl_seconds

// Enumerate all known legacy paths (e.g. for tooling)
for (const auto& [legacy, new_path] : ConfigPathResolver::legacyPathMappings()) {
    // ...
}

// Disable caching (e.g., in tests)
ConfigPathResolver::setCachingEnabled(false);
ConfigPathResolver::resetMetrics();

// Enable config path audit trail
ConfigPathResolver::setAuditLogEnabled(true);
std::string path2 = ConfigPathResolver::resolve("config/pii_patterns.yaml");

// Query all recorded audit entries (oldest first)
for (const auto& entry : ConfigPathResolver::auditLog()) {
    // entry.requested_path: original caller path
    // entry.resolved_path:  path that was returned
    // entry.timestamp:      std::chrono::system_clock::time_point
    // entry.is_legacy:      true if legacy fallback was used
    // entry.is_cache_hit:   true if served from LRU cache
}

// Clear audit entries and disable logging
ConfigPathResolver::clearAuditLog();
ConfigPathResolver::setAuditLogEnabled(false);

// Limit audit log to 500 entries (oldest are evicted when limit is reached)
ConfigPathResolver::setAuditLogMaxEntries(500);
```

ConfigPathResolver supports dev, staging, and prod path sets via an overlay
directory mechanism. When the active environment is DEV or STAGING, the
resolver probes an environment-specific overlay directory before the
standard config root:
| Environment | Overlay root | Activated by |
|---|---|---|
| `DEV` | `config/dev/` | `THEMIS_CONFIG_ENV=dev` or `setEnvironment(ConfigEnvironment::DEV)` |
| `STAGING` | `config/staging/` | `THEMIS_CONFIG_ENV=staging` or `setEnvironment(ConfigEnvironment::STAGING)` |
| `PROD` | (no overlay) | default; `THEMIS_CONFIG_ENV=prod` or `setEnvironment(ConfigEnvironment::PROD)` |
Resolution order (example: config/lora_training_config.yaml in DEV):
1. `config/dev/ai_ml/lora_training_config.yaml` ← overlay (checked first)
2. `config/ai_ml/lora_training_config.yaml` ← canonical new path
3. `config/lora_training_config.yaml` ← legacy fallback (with deprecation warning)
If the overlay file is absent the resolver falls through to the next path
without error. Overlay directories are located under the repository root at
config/dev/ and config/staging/; each contains a README.md with usage
guidelines.
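The probe order can be expressed as a function that builds the ordered candidate list for a given environment. This is a sketch under assumptions: the `Env` enum stands in for `ConfigEnvironment`, `candidatePaths` is a hypothetical helper, and the overlay path is assumed to be formed by inserting `dev/` or `staging/` directly after the `config/` root, as the example above suggests.

```cpp
#include <string>
#include <vector>

enum class Env { Dev, Staging, Prod };  // stand-in for ConfigEnvironment

// Returns the paths to probe, in order: overlay (if any), new path, legacy path.
inline std::vector<std::string> candidatePaths(Env env,
                                               const std::string& new_path,
                                               const std::string& legacy_path) {
    std::vector<std::string> candidates;
    const std::string root = "config/";
    const bool under_root = new_path.compare(0, root.size(), root) == 0;
    if (env != Env::Prod && under_root) {
        const std::string overlay = (env == Env::Dev) ? "dev/" : "staging/";
        candidates.push_back(root + overlay + new_path.substr(root.size()));
    }
    candidates.push_back(new_path);
    candidates.push_back(legacy_path);
    return candidates;
}
```

The resolver would return the first candidate that exists on disk, which is why a missing overlay file falls through silently.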
Programmatic API:
```cpp
#include "config/config_path_resolver.h"
using namespace themis::config;

// Set active environment (also clears the LRU cache)
ConfigPathResolver::setEnvironment(ConfigEnvironment::DEV);

// Query active environment
ConfigEnvironment env = ConfigPathResolver::getEnvironment();
// env == ConfigEnvironment::DEV
```

Cache isolation: Cache keys include the active environment name
("dev:config/lora_training_config.yaml") to prevent cross-environment cache
poisoning. setEnvironment() clears the cache atomically.
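A minimal sketch of environment-prefixed cache keys follows. The helper names are hypothetical, and falling back to `prod` for unrecognised environment strings is an assumption based on the documented default.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

inline std::string toLower(std::string text) {
    std::transform(text.begin(), text.end(), text.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return text;
}

// Maps a THEMIS_CONFIG_ENV-style value to "dev", "staging", or "prod"
// (case-insensitive; anything else is treated as "prod").
inline std::string canonicalEnv(const std::string& raw) {
    const std::string lower = toLower(raw);
    return (lower == "dev" || lower == "staging") ? lower : "prod";
}

// Builds a cache key of the form "<env>:<path>".
inline std::string cacheKey(const std::string& env, const std::string& path) {
    return canonicalEnv(env) + ":" + path;
}
```

Because the environment is part of the key, an entry cached while running as `dev` can never satisfy a lookup made as `prod`, even if the cache were not cleared on an environment switch.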
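The following environment variables are read once at process startup (during static initialization); changing them afterwards has no effect on a running process. The active environment can still be switched programmatically via setEnvironment().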
| Variable | Default | Valid Values / Range | Description |
|---|---|---|---|
| `THEMIS_CONFIG_CACHE_SIZE` | `1000` | `[10, 100000]` | LRU cache capacity (max number of cached path resolutions) |
| `THEMIS_CONFIG_CACHE_TTL` | `300` | `[1, 86400]` | LRU cache TTL in seconds (300 = 5 minutes) |
| `THEMIS_CONFIG_ENV` | `prod` | `dev`, `staging`, `prod` (case-insensitive) | Active deployment environment for config overlay resolution |
When a variable is absent, empty, not a valid integer, or outside its valid range, a warning is written to stderr and the default value is used. Values outside the valid range are rejected to prevent pathological configurations (e.g., a zero-capacity cache or a TTL longer than one day).
Example:
```bash
# Large deployment with many config paths, running in dev overlay mode
THEMIS_CONFIG_CACHE_SIZE=5000 THEMIS_CONFIG_CACHE_TTL=60 THEMIS_CONFIG_ENV=dev ./themisdb
```

```cpp
#include <string>

#include <nlohmann/json.hpp>
#include <spdlog/spdlog.h>

#include "config/config_schema_validator.h"
using namespace themis::config;

// Validate a YAML config file against an inline JSON Schema
nlohmann::json schema = R"({
    "type": "object",
    "required": ["host", "port"],
    "properties": {
        "host": { "type": "string" },
        "port": { "type": "integer", "minimum": 1, "maximum": 65535 },
        "worker_threads": { "type": "integer", "minimum": 1 }
    }
})"_json;

auto result = ConfigSchemaValidator::validate("config/server.yaml", schema);
if (!result.valid) {
    spdlog::error("Config validation failed:\n{}", result.formatErrors());
}

// Validate against a JSON Schema file on disk
// (schema_path is resolved via ConfigPathResolver for legacy/new path mapping)
auto result2 = ConfigSchemaValidator::validateWithSchemaFile(
    "config/server.yaml",
    "config/schema/server.schema.json");

// Load any YAML or JSON file as nlohmann::json (e.g., for custom processing)
nlohmann::json data = ConfigSchemaValidator::loadAsJson("config/server.yaml");

// --- In-memory YAML/JSON validation (no file required) ---
// Parse and validate a YAML or JSON string without writing it to disk.
// Useful for dynamic config editing, unit tests, and server-side hot-checks.

// Parse an in-memory YAML string to nlohmann::json
const std::string yaml_content = "port: 8080\nhost: localhost\n";
nlohmann::json parsed = ConfigSchemaValidator::loadAsJson(yaml_content, true /*is_yaml*/);

// Parse an in-memory JSON string to nlohmann::json
const std::string json_content = R"({"port": 8080, "host": "localhost"})";
nlohmann::json parsed_json = ConfigSchemaValidator::loadAsJson(json_content, false /*is_yaml*/);

// Validate an in-memory YAML string directly against a schema
nlohmann::json inmem_schema = R"({
    "type": "object",
    "required": ["host", "port"],
    "properties": {
        "host": { "type": "string" },
        "port": { "type": "integer", "minimum": 1, "maximum": 65535 }
    }
})"_json;

auto inmem_result = ConfigSchemaValidator::validateFromString(yaml_content, true, inmem_schema);
if (!inmem_result.valid) {
    spdlog::error("In-memory config validation failed:\n{}", inmem_result.formatErrors());
}

// Same API works for JSON strings: set is_yaml=false
auto inmem_json_result = ConfigSchemaValidator::validateFromString(json_content, false, inmem_schema);

// Edge cases for in-memory validation:
// - Invalid YAML (e.g. tab-indented block) is reported as a validation error,
//   not thrown, so callers can inspect result.errors without try/catch.
// - Invalid JSON string is similarly reported as a validation error.
// - result.config_path is set to "<string>" for in-memory calls (no file path).
// - $ref / $defs, allOf / anyOf / oneOf, format and all other schema keywords
//   work identically in validateFromString and validate.

// --- $ref and $defs: reusable schema fragments ---
// Define shared type definitions in "$defs" and reference them via "$ref".
// Both "$defs" (Draft 2019-09) and "definitions" (Draft 4/6/7) are supported.
nlohmann::json schema_with_defs = R"({
    "$defs": {
        "Port": { "type": "integer", "minimum": 1, "maximum": 65535 },
        "NonEmptyString": { "type": "string", "minLength": 1 },
        "ServerConfig": {
            "type": "object",
            "required": ["host", "port"],
            "properties": {
                "host": { "$ref": "#/$defs/NonEmptyString" },
                "port": { "$ref": "#/$defs/Port" }
            }
        }
    },
    "$ref": "#/$defs/ServerConfig"
})"_json;

auto result3 = ConfigSchemaValidator::validate("config/server.yaml", schema_with_defs);
if (!result3.valid) {
    spdlog::error("Config validation failed:\n{}", result3.formatErrors());
}
// Notes:
// - External URI refs (e.g. "https://...") are rejected to prevent SSRF.
// - Cyclic $ref chains are detected and reported as a validation error.
// - Nested $ref resolution (a $defs entry referencing another $defs entry) is supported.

// Schema composition: allOf, anyOf, oneOf

// allOf: value must satisfy ALL sub-schemas (errors from all failing branches are reported)
nlohmann::json allof_schema = R"({
    "allOf": [
        { "type": "object" },
        { "required": ["host", "port"] },
        { "properties": { "port": { "minimum": 1, "maximum": 65535 } } }
    ]
})"_json;

// anyOf: value must satisfy AT LEAST ONE sub-schema
nlohmann::json anyof_schema = R"({
    "properties": {
        "log_level": {
            "anyOf": [
                { "type": "string", "enum": ["debug", "info", "warn", "error"] },
                { "type": "integer", "minimum": 0, "maximum": 5 }
            ]
        }
    }
})"_json;

// oneOf: value must satisfy EXACTLY ONE sub-schema (discriminated union)
nlohmann::json oneof_schema = R"({
    "oneOf": [
        {
            "type": "object",
            "required": ["type", "port"],
            "properties": {
                "type": { "const": "tcp" },
                "port": { "type": "integer" }
            },
            "additionalProperties": false
        },
        {
            "type": "object",
            "required": ["type", "path"],
            "properties": {
                "type": { "const": "unix" },
                "path": { "type": "string" }
            },
            "additionalProperties": false
        }
    ]
})"_json;

// $ref with $defs: reusable schema fragments (local references only)
nlohmann::json ref_schema = R"({
    "$defs": {
        "Port": { "type": "integer", "minimum": 1, "maximum": 65535 }
    },
    "type": "object",
    "properties": {
        "port": { "$ref": "#/$defs/Port" },
        "admin_port": { "$ref": "#/$defs/Port" }
    }
})"_json;

// format: enforce well-known string formats (date, date-time, email, uri, ipv4, ipv6)
// Unknown format identifiers are silently accepted (informational keyword per JSON Schema spec).
nlohmann::json format_schema = R"({
    "type": "object",
    "properties": {
        "created_at": { "type": "string", "format": "date" },
        "updated_at": { "type": "string", "format": "date-time" },
        "contact": { "type": "string", "format": "email" },
        "endpoint": { "type": "string", "format": "uri" },
        "server_ip": { "type": "string", "format": "ipv4" },
        "ipv6_addr": { "type": "string", "format": "ipv6" }
    }
})"_json;
// Valid values: "2026-03-11", "2026-03-11T09:30:00Z", "user@example.com",
// "https://api.example.com/v1", "192.168.1.1", "2001:db8::1"

// uniqueItems: require all array elements to be distinct
nlohmann::json unique_schema = R"({
    "type": "object",
    "properties": {
        "tags": { "type": "array", "uniqueItems": true },
        "modules": { "type": "array", "items": { "type": "string" }, "uniqueItems": true }
    }
})"_json;
// A config file containing {"tags": ["a", "b", "a"]} will fail validation
// because index 0 and index 2 are equal. {"tags": ["a", "b", "c"]} passes.
```

Location: tools/config_migration_scanner.cpp
A standalone CLI tool that scans a deployment directory tree for files referencing legacy config paths and outputs a migration report.
```bash
# Text report (default)
config_migration_scanner --root /srv/themis

# JSON report
config_migration_scanner --root /srv/themis --output json

# CSV report
config_migration_scanner --root /srv/themis --output csv

# Dry-run: show what --fix would change
config_migration_scanner --root /srv/themis --dry-run --fix

# Rewrite files in-place (creates .bak backups)
config_migration_scanner --root /srv/themis --fix
```

Exit codes:
- `0`: No overdue (past `removal_date`) legacy paths found
- `1`: At least one path past its `removal_date` was found (usable as a CI gate)
- `2`: Argument / usage error
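The CI-gate behaviour of exit codes `0` and `1` can be sketched as follows. This is not the scanner's code: the `Finding` struct and `scannerExitCode` function are hypothetical, dates are assumed to be ISO 8601 `YYYY-MM-DD` strings (which compare correctly as plain strings), and argument errors (exit code `2`) are out of scope for the sketch.

```cpp
#include <string>
#include <vector>

// Hypothetical record for one legacy-path reference found by a scan.
struct Finding {
    std::string legacy_path;
    std::string removal_date;  // assumed ISO 8601, e.g. "2026-03-11"
};

// Returns 1 when at least one finding is past its removal date, else 0.
inline int scannerExitCode(const std::vector<Finding>& findings,
                           const std::string& today) {
    for (const Finding& finding : findings) {
        if (!finding.removal_date.empty() && finding.removal_date < today) {
            return 1;
        }
    }
    return 0;
}
```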
Current Status: Production Ready
- All public methods are thread-safe for concurrent read access
- Path-traversal prevention and symlink escape hardening are enforced via `validatePath()`
- LRU cache avoids repeated filesystem calls under load; capacity and TTL are configurable via environment variables that are read once at process startup
- Complete deprecation metadata for all 60+ mapped paths in `METADATA_TABLE`
- Known limitations:
  - HTTP/network config paths are not validated for reachability; only filesystem presence is checked
  - Migration tooling (`config_migration_scanner`) scans for path references but does not handle binary files