XLens Changelog

All notable changes to XLens will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

[Unreleased]

Added

AutoExec functionality for problematic-translogs: Automated replica reset operations
- New --autoexec flag: Automatically executes replica reset operations without manual intervention
- New --dry-run flag: Safe simulation mode that shows what would be executed without making changes
- New --percentage parameter: Filter tables by threshold percentage (default: 200%)
- New --max-wait parameter: Configure timeout for retention lease monitoring (default: 720s)
- New --log-format parameter: JSON logging support for container/Kubernetes environments
- Robust state machine implementation: Handles set replicas → monitor leases → restore replicas workflow
- Intelligent retry logic: Exponential backoff with configurable timeouts
- Comprehensive error handling: Automatic rollback on failures with manual intervention guidance
- Production-ready: Designed for Kubernetes CronJob automation
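
The replica reset workflow listed above (set replicas → monitor leases → restore replicas, with rollback on failure) can be pictured as a small state machine. The sketch below is illustrative only: the `State` names, `run_reset`, and the three callables are hypothetical and are not taken from the XLens source.

```python
from enum import Enum, auto


class State(Enum):
    SET_REPLICAS = auto()
    MONITOR_LEASES = auto()
    RESTORE_REPLICAS = auto()
    ROLLBACK = auto()
    DONE = auto()
    FAILED = auto()


def run_reset(set_replicas, leases_cleared, restore_replicas):
    """Drive the workflow and return the list of visited states.

    Each argument is a hypothetical zero-argument callable that
    returns True on success and False on failure or timeout.
    """
    state = State.SET_REPLICAS
    visited = []
    while True:
        visited.append(state)
        if state is State.SET_REPLICAS:
            state = State.MONITOR_LEASES if set_replicas() else State.FAILED
        elif state is State.MONITOR_LEASES:
            # A timeout while waiting for leases triggers a rollback.
            state = State.RESTORE_REPLICAS if leases_cleared() else State.ROLLBACK
        elif state is State.RESTORE_REPLICAS:
            state = State.DONE if restore_replicas() else State.FAILED
        elif state is State.ROLLBACK:
            restore_replicas()  # best effort: put the original replica count back
            state = State.FAILED
        else:
            # DONE and FAILED are terminal states.
            return visited
```

Passing plain callables keeps the sketch testable without a cluster; a real implementation would issue SQL in each step.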
New read-check command: Professional cluster data readability monitor
- Continuously monitors the 5 largest tables/partitions using max(_seq_no) to detect data changes
- Health status indicators: 🟢 Active, 🟡 Slow, 🔴 Stale tables
- Query performance tracking with ⚡ alerts for >1000ms queries
- Fresh connections with exponential backoff retry logic
- Enhanced statistics with per-table metrics on exit (CTRL+C)
- Automatic table discovery every 10 minutes with partition support
- Professional logging format: timestamp: schema.table // _seq_no ±X // total_docs ±Y // XXXms
- Anomaly detection with ⚠️ indicators for unusual activity patterns
- Optimized max(_seq_no) queries instead of expensive ORDER BY + LIMIT operations
- Removed obsolete --limit parameter (no longer needed with max() aggregation)
- Uses CrateDB's actual query execution time instead of network RTT for performance metrics
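
As a rough illustration of the exponential backoff mentioned above, the generator below yields growing sleep intervals until a total wait budget is used up. The function name, parameter names, and default values are assumptions made for this sketch; they do not describe the actual read-check settings.

```python
def backoff_delays(base=1.0, factor=2.0, cap=60.0, max_wait=720.0):
    """Yield sleep intervals that grow by `factor`, never exceed `cap`,
    and stop once their sum reaches `max_wait` seconds."""
    elapsed = 0.0
    delay = base
    while elapsed < max_wait:
        # Shrink the final step so the total never overshoots the budget.
        step = min(delay, cap, max_wait - elapsed)
        yield step
        elapsed += step
        delay *= factor


# Example: a short budget of 20 seconds with an 8-second cap.
delays = list(backoff_delays(base=1, factor=2, cap=8, max_wait=20))
```

A caller would sleep for each yielded interval between connection attempts and give up when the generator is exhausted.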

Fixed

Critical CrateDB response handling bug: Fixed AutoExec operations incorrectly reported as failures
- Issue: Successful ALTER TABLE operations were marked as "Failed: Unknown error"
- Root cause: Code checked for 'success' field that CrateDB doesn't return in HTTP responses
- Fix: Changed to check for absence of 'error' field instead of presence of 'success' field
- Impact: AutoExec operations now correctly report success/failure status
- Evidence: Operations were actually succeeding (retention leases changed) but reported as failed
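
The fix described above amounts to treating a response as successful when it carries no `error` field. A minimal sketch, assuming the response has already been parsed into a dict (the helper name `is_success` and the sample payloads are illustrative, not copied from the XLens code):

```python
def is_success(response: dict) -> bool:
    """Treat a parsed CrateDB HTTP response as successful unless it has an 'error' key."""
    return "error" not in response


# Illustrative payloads: a successful statement has no 'success' field at all.
ok_response = {"cols": [], "rows": [], "rowcount": 1, "duration": 12.3}
failed_response = {"error": {"message": "example failure", "code": 4000}}
```

Checking `response.get("success")` on `ok_response` would return `None`, which is how a successful operation could end up reported as a failure.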
Hardcoded threshold bug in AutoExec filtering: Fixed percentage calculations using wrong thresholds
- Issue: AutoExec percentage filtering used hardcoded 563MB for all tables instead of adaptive thresholds
- Fix: Modified _filter_tables_by_percentage() to use actual table-specific adaptive thresholds
- Impact: AutoExec now respects individual table configurations instead of using one-size-fits-all approach
Adaptive threshold override bug: Fixed --sizeMB parameter being ignored in favor of adaptive thresholds
- Issue: User-specified --sizeMB 50 was ignored, shards filtered using adaptive threshold (563.2MB) instead
- Root cause: _apply_adaptive_thresholds() compared translog sizes against adaptive threshold, not user's --sizeMB
- Fix: Changed filtering to always respect --sizeMB as minimum threshold; adaptive thresholds shown for reference only
- Impact: --sizeMB parameter now works as expected - shows all shards exceeding the specified size
- Example: With --sizeMB 50, shards at 542MB, 360MB, 301MB now correctly appear (previously: 0 shards shown)
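
The corrected filtering can be sketched as follows, assuming each shard is represented as a dict with a `translog_mb` key (a hypothetical shape chosen for this example). The adaptive threshold plays no part in the filter.

```python
def filter_shards(shards, size_mb):
    """Keep shards whose translog exceeds the user-specified --sizeMB value."""
    return [shard for shard in shards if shard["translog_mb"] > size_mb]


# Sizes from the example above, plus one small shard that should be dropped.
shards = [
    {"id": 0, "translog_mb": 542},
    {"id": 1, "translog_mb": 360},
    {"id": 2, "translog_mb": 301},
    {"id": 3, "translog_mb": 40},
]
selected = filter_shards(shards, size_mb=50)
```

Filtering the same shards against the adaptive threshold of 563.2MB would return an empty list, which matches the "0 shards shown" behavior before the fix.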

Changed

Enhanced problematic-translogs command: Adaptive threshold detection based on table settings
- Default --sizeMB changed from 300MB to 512MB (CrateDB default flush threshold)
- Adaptive thresholds: Uses table-specific flush_threshold_size * 1.1 for informational display (does NOT override --sizeMB)
- --sizeMB parameter: Always respected as minimum threshold regardless of adaptive thresholds
- Performance optimized: Only queries table settings for tables with initially problematic shards
- Enhanced display: Shows both configured value and calculated threshold (e.g., "2048MB/2253MB config/threshold")
- Partition support: Handles partition-specific flush_threshold_size settings
- Clean CLI: Simplified help text for better usability
- Fixed REROUTE CANCEL commands to include partition information for partitioned tables
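
The adaptive threshold arithmetic above is the configured `flush_threshold_size` multiplied by 1.1. The sketch below reproduces the numbers quoted in this changelog; the function names are hypothetical and rounding to whole megabytes for display is an assumption.

```python
def adaptive_threshold_mb(flush_threshold_mb):
    """Return the informational threshold: configured flush size * 1.1."""
    return flush_threshold_mb * 1.1


def format_threshold(flush_threshold_mb):
    """Render the 'config/threshold' display string, rounded to whole MB."""
    threshold = round(adaptive_threshold_mb(flush_threshold_mb))
    return f"{flush_threshold_mb}MB/{threshold}MB config/threshold"


default_label = format_threshold(512)   # CrateDB default flush threshold
custom_label = format_threshold(2048)   # table with a larger configured value
```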
Enhanced SQL logging: Complete transparency for AutoExec operations
- Dry-run mode: Shows "DRY RUN: Would execute: SQL" for all operations
- Regular mode: Shows "Executing: SQL" before actual database execution
- JSON logging: Uses loguru with structured data (consistent with read-check command)
- Rollback operations: Clear logging for failure recovery attempts
- Benefit: Full audit trail and debugging visibility for all database operations
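
The two logging modes above can be sketched with a small wrapper. This is not the XLens implementation: `execute_sql`, the `run` callable, and the `log` parameter are hypothetical, and `print` stands in for loguru so the example has no third-party dependency.

```python
def execute_sql(sql, run, dry_run=False, log=print):
    """Log the statement, then execute it unless dry_run is set.

    `run` is a hypothetical callable that sends SQL to the database.
    """
    if dry_run:
        log(f"DRY RUN: Would execute: {sql}")
        return None  # nothing is sent to the database
    log(f"Executing: {sql}")
    return run(sql)


# Example with in-memory stand-ins for the logger and the database call.
messages = []
executed = []
statement = 'ALTER TABLE "doc"."t" SET (number_of_replicas = 0)'
execute_sql(statement, run=executed.append, dry_run=True, log=messages.append)
execute_sql(statement, run=executed.append, dry_run=False, log=messages.append)
```

Logging before execution means the statement is recorded even if the database call itself raises.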
Consistent loguru usage: Both read-check and problematic-translogs --autoexec use loguru for structured logging
Enhanced per-table statistics: Shows document change tracking and performance metrics
- Document changes: Total change with min/avg/max deltas
- Performance: Query response time min/avg/max
- Anomaly counter per table
Query optimization: read-check uses efficient max(_seq_no) aggregation instead of sorting
Parameter cleanup: Removed obsolete --limit parameter from read-check command
Performance measurement: read-check now uses database execution time from CrateDB response instead of measuring network round-trip time

Dependencies

Enhanced loguru>=0.7.0 usage: AutoExec now uses loguru for JSON logging (consistent with read-check command)

Testing

Comprehensive AutoExec test coverage: 16 focused tests covering essential business scenarios
- test_autoexec.py: Consolidated test suite covering replica reset workflows, dry-run safety, adaptive thresholds, CLI validation, and error handling
- Test philosophy: Focus on business scenarios rather than implementation details
- Coverage includes: Regular and partitioned table resets, timeout handling, dry-run simulation, percentage-based filtering, partial/complete failure scenarios, CLI flag validation
- Reduced test suite size by 77.7% (from 1,998 to 445 lines) while maintaining comprehensive coverage

Documentation

Updated main README.md with read-check command reference and AutoExec functionality
Added comprehensive docs/read-check.md with usage guide and examples
Updated docs/README.md with data readability monitoring section
Added SQL_LOGGING_AND_BUGFIX_SUMMARY.md: Technical documentation of bug fixes and enhancements
Created verification scripts: verify_dry_run_safety.py and verify_adaptive_thresholds.py

Format Notes

Version Format

[Unreleased]: Features ready but not yet in a tagged release
[X.Y.Z]: Released versions with date

Change Categories

Added: New features
Changed: Changes to existing functionality
Deprecated: Soon-to-be removed features
Removed: Removed features
Fixed: Bug fixes
Security: Vulnerability fixes

Commit Convention

This project will follow the Conventional Commits specification for automatic changelog generation:

feat: for new features (minor version bump)
fix: for bug fixes (patch version bump)
docs: for documentation changes
refactor: for code refactoring
test: for adding tests
chore: for maintenance tasks
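
To illustrate how the prefixes above could drive version bumps, here is a minimal sketch. It assumes `X.Y.Z` version strings and handles only the two bumping types listed; the function names are hypothetical and this is not the project's release tooling.

```python
BUMP_BY_TYPE = {"feat": "minor", "fix": "patch"}


def bump_for(commit_message):
    """Return 'minor', 'patch', or None for a commit subject line."""
    # Take the text before the first colon and drop an optional "(scope)".
    commit_type = commit_message.split(":", 1)[0].split("(", 1)[0].strip()
    return BUMP_BY_TYPE.get(commit_type)


def next_version(version, bump):
    """Apply a bump to an X.Y.Z version string."""
    major, minor, patch = (int(part) for part in version.split("."))
    if bump == "minor":
        return f"{major}.{minor + 1}.0"
    if bump == "patch":
        return f"{major}.{minor}.{patch + 1}"
    return version  # docs, refactor, test, chore: no bump
```

For example, `next_version("1.1.3", bump_for("feat: add read-check"))` yields `"1.2.0"`.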

Example Future Entry

## [1.2.0] - 2024-02-15

### Added
- New feature X with capability Y

### Fixed
- Bug where Z caused unexpected behavior

### Changed
- Improved performance of command A
