The Maintenance module provides a centralized orchestration layer for all database maintenance operations. It allows operators to define named maintenance schedules with cron-based execution, maintenance window enforcement, task sequencing with halt-on-failure semantics, and aggregated per-module health reporting.
| Interface / File | Role |
|---|---|
| `include/maintenance/database_maintenance_orchestrator.h` | Primary public API |
| `include/maintenance/maintenance_task.h` | Task types, job struct, job state enum |
| `include/maintenance/maintenance_schedule.h` | Schedule entry with JSON serialization |
| `include/maintenance/maintenance_health_report.h` | Health report aggregation |
| `src/maintenance/database_maintenance_orchestrator.cpp` | Implementation |
| `src/maintenance/maintenance_registry.cpp` | Default schedule bundles |
`DatabaseMaintenanceOrchestrator` is the central coordinator for all maintenance scheduling and execution.
```cpp
#include "maintenance/database_maintenance_orchestrator.h"

// Construction (via dependency injection)
auto orchestrator = DatabaseMaintenanceOrchestrator(
    scheduler,          // TaskScheduler*
    index_maintenance,  // std::shared_ptr<IndexMaintenanceManager>
    audit_logger        // std::shared_ptr<utils::AuditLogger>
);
orchestrator.start();

// Create a schedule
MaintenanceScheduleEntry schedule;
schedule.id = "nightly-index-rebuild";
schedule.name = "Nightly Index Rebuild";
schedule.cron_expression = "0 2 * * *";  // 2:00 AM daily
schedule.window_start_hour = 1;
schedule.window_end_hour = 5;
schedule.tasks = { MaintenanceTaskType::INDEX_REBUILD, MaintenanceTaskType::STATISTICS_UPDATE };
schedule.halt_on_task_failure = true;
schedule.enabled = true;
auto result = orchestrator.createSchedule(schedule);

// List recent jobs
auto jobs = orchestrator.listJobs(50);

// Get aggregated health report
MaintenanceHealthReport health = orchestrator.getHealthReport();
```

`MaintenanceRegistry` provides pre-built schedule bundles for common maintenance patterns:
```cpp
#include "maintenance/maintenance_registry.h"

// Get default schedule bundles
auto daily_schedules = MaintenanceRegistry::getDailySchedules();
auto weekly_schedules = MaintenanceRegistry::getWeeklySchedules();
auto monthly_schedules = MaintenanceRegistry::getMonthlySchedules();
```

In Scope:
- Schedule CRUD (create, read, update, patch, delete, enable, disable)
- Cron-based execution via `TaskScheduler`
- Maintenance window enforcement (UTC hour range)
- Sequential task execution with halt-on-failure
- Per-module health probe registry and aggregation
- Job lifecycle management (PENDING → RUNNING → SUCCEEDED/FAILED/CANCELLED/SKIPPED)
- 24-hour job retention with automatic pruning
- Audit logging and Prometheus-compatible metrics
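
The maintenance window item above can be illustrated with a small check on the UTC hour. This is a minimal sketch, not the orchestrator's actual logic: `isWithinWindow` and `currentUtcHour` are hypothetical helpers, and the exclusive end hour, the wrap past midnight, and the treatment of equal bounds are all assumptions.

```cpp
#include <cassert>
#include <ctime>

// Hypothetical helper, not part of the module's documented API.
// Returns true when utc_hour lies in [start_hour, end_hour).
// Assumptions: the end hour is exclusive, a window whose start is greater
// than its end wraps past midnight (e.g. 22 -> 4), and equal bounds mean
// "no restriction".
constexpr bool isWithinWindow(int utc_hour, int start_hour, int end_hour) {
    if (start_hour == end_hour) return true;
    if (start_hour < end_hour) return utc_hour >= start_hour && utc_hour < end_hour;
    return utc_hour >= start_hour || utc_hour < end_hour;
}

// Current hour of day in UTC (0-23). std::gmtime returns a pointer to shared
// static storage, so this helper is not thread-safe as written.
inline int currentUtcHour() {
    const std::time_t now = std::time(nullptr);
    const std::tm* utc = std::gmtime(&now);
    assert(utc != nullptr);
    return utc->tm_hour;
}

// Compile-time checks against the 1-5 window used in the schedule example.
static_assert(isWithinWindow(2, 1, 5), "02:00 is inside a 1-5 window");
static_assert(!isWithinWindow(5, 1, 5), "the end hour is treated as exclusive");
```

Under these assumptions, a job triggered at an hour for which `isWithinWindow(currentUtcHour(), 1, 5)` is false would be the kind of run recorded as SKIPPED.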
Out of Scope:
- Schedule persistence (planned v1.1.0 — currently in-memory only)
- Explicit DAG task dependencies (planned v1.2.0 — currently total order)
- Distributed maintenance coordination (planned v2.0.0)
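
Because tasks currently run in a total order, the halt-on-failure behaviour can be pictured as a simple loop. The sketch below is illustrative only: `Step`, `StepState`, and `runSequence` are hypothetical names, and marking the remaining steps `NOT_RUN` after a failure is an assumption about how untouched tasks are reported.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-ins; the real task and job types are declared in
// maintenance_task.h and may look different.
enum class StepState { SUCCEEDED, FAILED, NOT_RUN };

struct Step {
    std::string name;
    std::function<bool()> run;  // returns true on success
};

// Runs the steps in the order given. When halt_on_failure is set, no step
// after the first failure is executed; those steps are reported as NOT_RUN.
std::vector<StepState> runSequence(const std::vector<Step>& steps, bool halt_on_failure) {
    std::vector<StepState> states;
    states.reserve(steps.size());
    bool halted = false;
    for (const auto& step : steps) {
        if (halted) {
            states.push_back(StepState::NOT_RUN);
            continue;
        }
        assert(step.run && "every step needs a callable");
        const bool ok = step.run();
        states.push_back(ok ? StepState::SUCCEEDED : StepState::FAILED);
        if (!ok && halt_on_failure) halted = true;
    }
    return states;
}
```

With `halt_on_failure` set to `false`, the same loop runs every step and simply records each outcome.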
`MaintenanceTaskType` values:

```text
INDEX_REBUILD        INDEX_OPTIMIZE         INDEX_CONSISTENCY_CHECK
STORAGE_COMPACTION   WAL_ARCHIVING          BACKUP_VERIFICATION
METRICS_COLLECTION   LOG_ROTATION           CACHE_WARM
DEAD_LETTER_DRAIN    REPLICA_VALIDATION     MVCC_CLEANUP
SCHEMA_VALIDATION    RETENTION_ENFORCEMENT  STATISTICS_UPDATE
SECURITY_SCAN        AUDIT_LOG_FLUSH        BLOOM_FILTER_REBUILD
CUSTOM
```
11 endpoints under `/api/v1/maintenance/`:
- `POST /schedules`: create schedule
- `GET /schedules`: list all
- `GET /schedules/{id}`: get by ID
- `PUT /schedules/{id}`: replace
- `PATCH /schedules/{id}`: partial update
- `DELETE /schedules/{id}`: delete
- `POST /schedules/{id}/enable`: enable
- `POST /schedules/{id}/disable`: disable
- `GET /jobs`: list recent jobs (last 24 hours)
- `GET /jobs/{id}`: get job details
- `GET /health`: aggregated health report
RBAC: maintenance:read · maintenance:write · maintenance:admin
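
The 24-hour horizon behind `GET /jobs` implies that older job records are pruned. One way this could work is sketched below; `JobRecord` and `pruneJobs` are hypothetical names, and keying retention on a finish timestamp is an assumption rather than the module's documented behaviour.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

using Clock = std::chrono::system_clock;

// Hypothetical job record; the real job struct is declared in maintenance_task.h.
struct JobRecord {
    std::string id;
    Clock::time_point finished_at;
};

// Drops every job that finished more than `retention` before `now` and
// returns the number of records removed. Surviving jobs keep their order.
std::size_t pruneJobs(std::vector<JobRecord>& jobs, Clock::time_point now,
                      std::chrono::hours retention = std::chrono::hours(24)) {
    assert(retention.count() >= 0);
    const auto cutoff = now - retention;
    const auto first_expired = std::remove_if(
        jobs.begin(), jobs.end(),
        [cutoff](const JobRecord& job) { return job.finished_at < cutoff; });
    const auto removed =
        static_cast<std::size_t>(std::distance(first_expired, jobs.end()));
    jobs.erase(first_expired, jobs.end());
    return removed;
}
```

Passing `now` in explicitly keeps the function deterministic, which makes the retention rule easy to unit test.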
Modules can register health probes to contribute to the aggregated health report:
```cpp
orchestrator.registerHealthProbe("my_module", []() -> ModuleHealthSignal {
    ModuleHealthSignal signal;
    signal.module_name = "my_module";
    signal.status = ModuleHealthStatus::HEALTHY;
    signal.message = "All systems nominal";
    return signal;
});
```

40+ unit tests in `tests/test_maintenance_orchestrator.cpp` covering:
- Schedule CRUD and validation
- JSON round-trips (`toJson()`/`fromJson()`/`applyPatch()`)
- Maintenance window enforcement and SKIPPED state
- Job lifecycle (SUCCEEDED, FAILED, CANCELLED)
- `halt_on_task_failure` cascading behaviour
- Health probe registration and aggregation
- Metrics collection
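
To make the health probe registration and aggregation behaviour concrete, here is a self-contained sketch of a probe registry that reports the worst status seen. Everything in the `sketch` namespace is hypothetical: only `HEALTHY` appears in this document, so the `DEGRADED` and `UNHEALTHY` values and the worst-status-wins rule are assumptions, not the module's confirmed aggregation policy.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

// Ordered from best to worst so that the worst status compares greatest.
// DEGRADED and UNHEALTHY are assumed values for illustration.
enum class Status { HEALTHY = 0, DEGRADED = 1, UNHEALTHY = 2 };

struct Signal {
    std::string module_name;
    Status status = Status::HEALTHY;
    std::string message;
};

struct Report {
    Status overall = Status::HEALTHY;
    std::vector<Signal> signals;
};

class ProbeRegistry {
public:
    using Probe = std::function<Signal()>;

    // Registering the same module name twice replaces the earlier probe.
    void registerProbe(const std::string& module, Probe probe) {
        assert(probe && "probe must be callable");
        probes_[module] = std::move(probe);
    }

    // Invokes every probe (in module-name order) and keeps the worst status.
    Report aggregate() const {
        Report report;
        for (const auto& entry : probes_) {
            Signal signal = entry.second();
            report.overall = std::max(report.overall, signal.status);
            report.signals.push_back(std::move(signal));
        }
        return report;
    }

private:
    std::map<std::string, Probe> probes_;
};

}  // namespace sketch
```

An empty registry reports `HEALTHY` in this sketch; a real implementation might prefer a distinct "unknown" state when no probes are registered.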