
Conversation

@Enjection (Collaborator) commented Jul 23, 2025

Summary by CodeRabbit

  • New Features

    • Introduced automated daily backup and advanced recovery system for the SQLite database, including multi-strategy recovery and corruption archiving.
    • Added backup management tools with manual backup creation, status reporting, and restoration capabilities.
    • Enhanced database initialization with improved concurrency, reliability, and connection pooling settings.
    • Custom SQLite driver now enforces safety features such as foreign key constraints, WAL mode, and busy timeout.
    • Added detailed logging support during database store creation and operations.
  • Bug Fixes

    • Improved database resilience against corruption and ensured reliable startup through enhanced integrity checks and recovery flows.
  • Tests

    • Added comprehensive integration and unit tests for backup scheduling, recovery strategies, safety features, and concurrent access scenarios.

mentatbot bot commented Jul 23, 2025

If you would like me to review this PR, either tag me in a comment or check this box:

  • Review PR

You can disable automatic comments on my settings page

coderabbitai bot commented Jul 23, 2025

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting reviews.review_status to false in the CodeRabbit configuration file.

"""

Walkthrough

This set of changes introduces a robust backup and recovery system for the SQLite database used in the Go backend. Key enhancements include the implementation of a DailyBackupManager for automated and manual backups, a StartupRecoveryManager with multiple recovery strategies for handling database corruption, and integration of these systems into the database Store. The SQLite driver is customized to enforce safety features such as WAL mode and foreign key constraints. Extensive tests are added to verify backup scheduling, concurrent access, recovery strategies, and overall system integration. The public API of the Store is extended to expose backup and recovery operations, and logger injection is supported for improved diagnostics.
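
As a rough usage sketch of the public API described above (Go; the import path, file locations, and log calls are illustrative assumptions, not taken from the PR):

    package main

    import (
        "context"
        "os"

        "github.com/charmbracelet/log"

        "example.com/backend/golang/pkg/db" // placeholder module path
    )

    func main() {
        ctx := context.Background()
        logger := log.New(os.Stdout)

        // Opens the database (running startup recovery if needed) and starts
        // the daily backup routine.
        store, err := db.NewStoreWithLogger(ctx, "data/app.db", logger)
        if err != nil {
            logger.Fatal("failed to open store", "error", err)
        }
        defer store.Close() // stops the backup routine and closes the database

        // Trigger an out-of-band backup, e.g. before a risky migration.
        if err := store.CreateManualBackup(ctx); err != nil {
            logger.Warn("manual backup failed", "error", err)
        }
    }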

Sequence Diagram(s)

sequenceDiagram
    participant App as Application
    participant Store as Store (DB Layer)
    participant BackupMgr as DailyBackupManager
    participant RecoveryMgr as StartupRecoveryManager
    participant Logger as Logger

    App->>Store: NewStoreWithLogger(ctx, dbPath, logger)
    Store->>Logger: Log initialization
    Store->>BackupMgr: Initialize with dbPath, logger
    Store->>RecoveryMgr: Initialize with dbPath, backupMgr, logger
    Store->>Store: Open database (with recovery)
    alt Open fails
        Store->>RecoveryMgr: Attempt recovery strategies
        RecoveryMgr->>BackupMgr: Restore from backup (if needed)
        RecoveryMgr->>Store: Retry open database
    end
    Store->>BackupMgr: Start daily backup routine
    App->>Store: Use database (CRUD operations)
    App->>Store: CreateManualBackup(ctx)
    Store->>BackupMgr: Create backup
    App->>Store: RestoreFromBackup(ctx)
    Store->>BackupMgr: Restore backup
    App->>Store: Close()
    Store->>BackupMgr: Stop backup routine
    Store->>Store: Close database

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

In the warren where data grows deep,
A rabbit devised a backup to keep.
With WAL and checks, and a logger so bright,
Each morning at two, backups take flight.
Should corruption appear, don’t worry or stew—
Recovery’s layered, with strategies anew!
Now the database hops, safe through the night.

(_/)
( •_•)
/ >🍃
"""


🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve to resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@Enjection (Collaborator, Author) commented:

@coderabbitai review

coderabbitai bot commented Jul 23, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai bot left a comment

Actionable comments posted: 15

🔭 Outside diff range comments (4)
backend/golang/pkg/dataprocessing/telegram/telegram_test.go (4)

297-303: Fix inconsistent store constructor usage.

This test function still uses the old db.NewStore constructor while other test functions in the same file have been updated to use db.NewStoreWithLogger. This creates inconsistency within the file.

Apply this diff to maintain consistency:

	ctx := context.Background()
-	store, err := db.NewStore(ctx, dbFile.Name())
+	logger := log.New(os.Stdout)
+	store, err := db.NewStoreWithLogger(ctx, dbFile.Name(), logger)
	if err != nil {
		t.Fatalf("Failed to create store: %v", err)
	}
	defer store.Close() //nolint:errcheck

-	logger := log.New(os.Stdout)
	processor, err := NewTelegramProcessor(store, logger)

353-359: Fix inconsistent store constructor usage.

Same inconsistency issue as the previous function - should use db.NewStoreWithLogger for consistency.

Apply this diff:

	ctx := context.Background()
-	store, err := db.NewStore(ctx, dbFile.Name())
+	logger := log.New(os.Stdout)
+	store, err := db.NewStoreWithLogger(ctx, dbFile.Name(), logger)
	if err != nil {
		t.Fatalf("Failed to create store: %v", err)
	}
	defer store.Close() //nolint:errcheck

-	logger := log.New(os.Stdout)
	processor, err := NewTelegramProcessor(store, logger)

391-397: Fix inconsistent store constructor usage.

Another instance of the same inconsistency issue.

Apply this diff:

	ctx := context.Background()
-	store, err := db.NewStore(ctx, dbFile.Name())
+	logger := log.New(os.Stdout)
+	store, err := db.NewStoreWithLogger(ctx, dbFile.Name(), logger)
	if err != nil {
		t.Fatalf("Failed to create store: %v", err)
	}
	defer store.Close() //nolint:errcheck

-	logger := log.New(os.Stdout)
	processor, err := NewTelegramProcessor(store, logger)

467-471: Fix inconsistent store constructor usage.

Another instance of inconsistent store constructor usage.

Apply this diff:

	ctx := context.Background()
-	store, err := db.NewStore(ctx, dbFile.Name())
+	logger := log.New(os.Stdout)
+	store, err := db.NewStoreWithLogger(ctx, dbFile.Name(), logger)
	if err != nil {
		t.Fatalf("Failed to create store: %v", err)
	}
	defer store.Close() //nolint:errcheck

And move the logger creation from line 543 to after the store creation:

-	logger := log.New(os.Stdout)
	processor, err := NewTelegramProcessor(store, logger)
🧹 Nitpick comments (4)
backend/golang/pkg/db/integration_test.go (2)

29-30: Remove redundant cleanup call

The cleanup is called both immediately and in defer, which is redundant.

-	cleanupArchives()
	defer cleanupArchives()

118-120: Remove unused variable or add test coverage

The dbsqlc variable is created but never used in the test.

Either remove it or add test coverage:

-	dbsqlc, err := New(store.DB().DB, logger)
-	require.NoError(t, err)
-	require.NotNil(t, dbsqlc)
+	// Remove if not needed, or add actual usage test
backend/golang/pkg/db/recovery_enhanced.go (1)

111-123: Cache SQLite version to avoid repeated connections

Opening a new database connection each time to get the SQLite version is inefficient.

Consider caching the version:

var (
    sqliteVersion     string
    sqliteVersionOnce sync.Once
)

func getSQLiteVersion() string {
    sqliteVersionOnce.Do(func() {
        db, err := sql.Open("sqlite3", ":memory:")
        if err != nil {
            sqliteVersion = "unknown"
            return
        }
        defer db.Close()
        
        if err := db.QueryRow("SELECT sqlite_version()").Scan(&sqliteVersion); err != nil {
            sqliteVersion = "unknown"
        }
    })
    return sqliteVersion
}
backend/golang/pkg/db/backup.go (1)

372-372: Consider making timeout configurable

The busy timeout is hardcoded to 30000ms in multiple places. Consider making this configurable.

Add a constant or configuration option:

const defaultBusyTimeout = 30000 // milliseconds

// Or as part of the manager struct
type StartupRecoveryManager struct {
    dbPath        string
    backupManager *DailyBackupManager
    logger        Logger
    busyTimeout   int // milliseconds
}

Also applies to: 387-387, 405-405

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Lite
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between a27c716 and d0b8ea3.

📒 Files selected for processing (16)
  • backend/golang/cmd/memory-processor-test/main.go (2 hunks)
  • backend/golang/cmd/server/main.go (1 hunks)
  • backend/golang/pkg/bootstrap/temporal.go (3 hunks)
  • backend/golang/pkg/dataprocessing/chatgpt/chatgpt_test.go (1 hunks)
  • backend/golang/pkg/dataprocessing/gmail/gmail_test.go (1 hunks)
  • backend/golang/pkg/dataprocessing/slack/slack_test.go (1 hunks)
  • backend/golang/pkg/dataprocessing/telegram/telegram_test.go (2 hunks)
  • backend/golang/pkg/db/backup.go (1 hunks)
  • backend/golang/pkg/db/backup_test.go (1 hunks)
  • backend/golang/pkg/db/integration_test.go (1 hunks)
  • backend/golang/pkg/db/recovery_enhanced.go (1 hunks)
  • backend/golang/pkg/db/recovery_test.go (1 hunks)
  • backend/golang/pkg/db/safety_test.go (1 hunks)
  • backend/golang/pkg/db/sqlite.go (3 hunks)
  • backend/golang/pkg/db/temporal_safety_fixed_test.go (1 hunks)
  • backend/golang/pkg/db/temporal_safety_test.go (1 hunks)
🔇 Additional comments (20)
backend/golang/pkg/dataprocessing/gmail/gmail_test.go (1)

19-19: LGTM: Store initialization updated to include logger support.

The change correctly updates the store creation to use NewStoreWithLogger, passing the previously created logger instance. This aligns with the PR's objective to enhance SQLite database management with logging capabilities.

backend/golang/pkg/dataprocessing/slack/slack_test.go (1)

19-19: LGTM: Consistent store initialization with logger support.

The update to NewStoreWithLogger is implemented correctly, maintaining the existing test functionality while adding logging capabilities to the database store.

backend/golang/pkg/dataprocessing/chatgpt/chatgpt_test.go (1)

71-71: LGTM: Store initialization enhanced with logger integration.

The change properly updates the database store creation to use the new logger-enabled constructor, maintaining all existing test functionality while supporting the enhanced database management features.

backend/golang/cmd/server/main.go (1)

81-81: LGTM: Production store initialization updated with logging support.

This critical update enables the main server to use the enhanced database store with logging capabilities, which is essential for the backup and recovery functionality introduced in this PR. The change maintains existing error handling and cleanup patterns.

backend/golang/cmd/memory-processor-test/main.go (2)

223-224: LGTM: Test utility updated with logger-enabled store creation.

The runDataProcessor function correctly updates to use NewStoreWithLogger, creating a test logger and passing it to the store constructor. This maintains existing test functionality while enabling enhanced database features.


406-407: LGTM: Gmail senders test utility updated consistently.

The runGmailSenders function follows the same pattern as other test utilities, properly creating a test logger and updating to use the new store constructor. The change maintains existing functionality while adding logging support.

backend/golang/pkg/dataprocessing/telegram/telegram_test.go (2)

92-93: LGTM! Correct adoption of new logger-enabled store constructor.

The transition from db.NewStore to db.NewStoreWithLogger is implemented correctly with proper logger initialization and error handling.


189-190: LGTM! Consistent implementation of logger-enabled store constructor.

The change follows the same correct pattern as the other test functions.

backend/golang/pkg/bootstrap/temporal.go (2)

103-106: LGTM! Good isolation of Temporal database.

Creating a separate database file for Temporal (.temporal suffix) prevents conflicts with the main application database and allows for independent configuration.


139-151: LGTM! Excellent SQLite safety configuration.

The connection attributes implement important safety and performance improvements:

  • WAL mode enables better concurrency
  • Foreign key enforcement ensures data integrity
  • 20-second busy timeout prevents immediate failures under contention
  • NORMAL synchronous mode balances safety and performance
  • 64MB cache improves performance
  • Connection pool settings are appropriate for the workload

These configurations align with the safety tests being added in other files.
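
For reference, settings of this shape can be expressed as a mattn/go-sqlite3 DSN plus standard database/sql pool tuning. This is a sketch with illustrative values and a hypothetical helper name, not the attributes used in temporal.go:

    package bootstrap

    import (
        "database/sql"
        "time"

        _ "github.com/mattn/go-sqlite3"
    )

    // openTemporalDB sketches the configuration style described above.
    func openTemporalDB(path string) (*sql.DB, error) {
        dsn := "file:" + path +
            "?_journal_mode=WAL" + // WAL mode for better concurrency
            "&_foreign_keys=on" + // enforce foreign key constraints
            "&_busy_timeout=20000" + // wait up to 20s under contention
            "&_synchronous=NORMAL" // balance durability and speed in WAL mode

        db, err := sql.Open("sqlite3", dsn)
        if err != nil {
            return nil, err
        }

        // Cache size is a per-connection pragma; applying it via connection
        // attributes or a connect hook is the more robust option.
        if _, err := db.Exec("PRAGMA cache_size = -65536"); err != nil { // ~64MB
            db.Close()
            return nil, err
        }

        // Illustrative pool settings for a low-contention workload.
        db.SetMaxOpenConns(4)
        db.SetMaxIdleConns(2)
        db.SetConnMaxLifetime(time.Hour)
        return db, nil
    }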

backend/golang/pkg/db/backup_test.go (2)

13-74: LGTM! Well-designed backup integration test.

The test properly:

  • Uses the new logger-enabled store constructor
  • Tests backup status and manual backup functionality
  • Handles race conditions gracefully (lines 39-41)
  • Includes proper resource cleanup
  • Validates backup file existence and non-zero size

The approach of accepting backup failures due to race conditions while still testing the functionality is pragmatic.


76-110: LGTM! Good backup scheduling test.

The test appropriately:

  • Tests the backup manager lifecycle (start/stop)
  • Uses proper timeout context to prevent hanging
  • Creates necessary resources and cleans them up
  • Acknowledges the limitation of not being able to test actual scheduling timing
backend/golang/pkg/db/safety_test.go (2)

16-63: LGTM! Comprehensive safety features test.

The test thoroughly validates the key SQLite safety features:

  • Foreign key enforcement (enabled = 1)
  • WAL journal mode for better concurrency
  • Appropriate busy timeout (≥5000ms)

The test structure with subtests is clean and the assertions are appropriate.


65-141: LGTM! Excellent concurrent access test.

This test effectively validates concurrent write capability:

  • Appropriate concurrency level (10 workers × 100 inserts)
  • Proper synchronization with WaitGroup and error channels
  • Deadlock detection via 30-second timeout
  • Final count validation ensures no lost writes
  • Good error reporting with worker and insert identifiers

This test provides confidence that the WAL mode and busy timeout configuration properly handles concurrent access.

backend/golang/pkg/db/backup.go (6)

40-72: LGTM! Well-structured backup manager initialization

The backup scheduling logic is well-implemented with proper context handling and initial backup creation.


98-132: Excellent backup creation implementation

The backup creation follows best practices with integrity checks, atomic operations, verification, and proper error handling.


155-174: Well-implemented backup verification

Good use of read-only mode and PRAGMA quick_check for efficient backup validation.


237-272: Robust backup restoration with proper validation

The restoration process correctly archives the corrupted database and validates the restored data.


296-367: Excellent multi-strategy recovery implementation

The recovery strategies are well-ordered from least to most destructive, with proper verification after each attempt. This provides maximum chance of data recovery.
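
As a rough sketch of that cascade shape (the strategy struct and function names are illustrative, not the PR's; Logger is the package's Printf-style interface, and the snippet assumes imports of context and errors):

    // runRecovery tries strategies in order, from least to most destructive,
    // stopping at the first one whose result passes verification.
    type recoveryStrategy struct {
        name string
        run  func(ctx context.Context) error
    }

    func runRecovery(ctx context.Context, logger Logger, strategies []recoveryStrategy, verify func(context.Context) error) error {
        for _, s := range strategies {
            logger.Printf("attempting recovery strategy: %s", s.name)
            if err := s.run(ctx); err != nil {
                logger.Printf("strategy %q failed: %v", s.name, err)
                continue
            }
            if err := verify(ctx); err != nil {
                logger.Printf("strategy %q produced an unusable database: %v", s.name, err)
                continue
            }
            logger.Printf("recovered using strategy: %s", s.name)
            return nil
        }
        return errors.New("all recovery strategies exhausted")
    }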


436-442: Incorrect panic recovery pattern

The recover() function returns the panic value, not an error. The current implementation might not properly preserve panic information.

 defer func() {
 	_ = os.Remove(dumpPath)
-	if err := recover(); err != nil {
+	if r := recover(); r != nil {
 		_ = os.Remove(recoveredPath)
-		panic(err)
+		panic(r)
 	}
 }()

Likely an incorrect or invalid review comment.

@Enjection (Collaborator, Author) commented:

@coderabbitai review

coderabbitai bot commented Jul 24, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (3)
backend/golang/pkg/db/sqlite.go (1)

100-108: Consider initializing backup manager with database to avoid partial initialization

While the backup manager's Start method does check for nil database, having a partially initialized backup manager in the Store struct could lead to issues if other methods are called before SetDatabase.

Consider passing the database to the backup manager constructor or creating a factory method:

-	store := &Store{
-		db:          db,
-		backupMgr:   backupMgr,
-		recoveryMgr: recoveryMgr,
-		logger:      logger,
-	}
-
-	backupMgr.SetDatabase(db.DB)
-	backupMgr.Start(ctx)
+	backupMgr.SetDatabase(db.DB)
+	
+	store := &Store{
+		db:          db,
+		backupMgr:   backupMgr,
+		recoveryMgr: recoveryMgr,
+		logger:      logger,
+	}
+	
+	backupMgr.Start(ctx)
backend/golang/pkg/db/backup_timing_test.go (1)

83-85: Simplify redundant assertion logic

The assertion on line 84 contains redundant logic. The shouldCatchUp variable already includes the check for !backupMgr.todaysBackupExists().

-			assert.Equal(t, scenario.shouldDoCatchUpBackup && !backupMgr.todaysBackupExists(), shouldCatchUp,
+			assert.Equal(t, scenario.shouldDoCatchUpBackup, shouldCatchUp,
				"Catch-up backup decision incorrect for %s", scenario.description)
backend/golang/pkg/db/backup.go (1)

239-264: Consider refining the backup existence logic.

The todaysBackupExists method uses a 24-hour sliding window rather than checking for backups created "today" in calendar terms. This might lead to unexpected behavior where a backup created yesterday evening is considered "today's backup" the next morning.

Consider implementing calendar-day-based logic:

func (dbm *DailyBackupManager) todaysBackupExists() bool {
	pattern := dbm.dbPath + ".backup.*"
	matches, err := filepath.Glob(pattern)
	if err != nil || len(matches) == 0 {
		return false
	}

	now := time.Now()
-	twentyFourHoursAgo := now.Add(-24 * time.Hour)
+	todayStart := time.Date(now.Year(), now.Month(), now.Day(), 0, 0, 0, 0, now.Location())

	for _, backupPath := range matches {
		stat, err := os.Stat(backupPath)
		if err != nil {
			continue
		}

-		if stat.ModTime().After(twentyFourHoursAgo) {
+		if stat.ModTime().After(todayStart) {
			return true
		}
	}

	return false
}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Lite
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between d0b8ea3 and 96ab863.

📒 Files selected for processing (6)
  • backend/golang/pkg/db/backup.go (1 hunks)
  • backend/golang/pkg/db/backup_timing_test.go (1 hunks)
  • backend/golang/pkg/db/integration_test.go (1 hunks)
  • backend/golang/pkg/db/recovery_enhanced.go (1 hunks)
  • backend/golang/pkg/db/recovery_test.go (1 hunks)
  • backend/golang/pkg/db/sqlite.go (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • backend/golang/pkg/db/integration_test.go
🧰 Additional context used
🧬 Code Graph Analysis (3)
backend/golang/pkg/db/recovery_test.go (3)
backend/golang/pkg/db/recovery_enhanced.go (2)
  • NewLoggerAdapter (22-24)
  • NewDatabaseManagerWithStartupRecovery (172-177)
backend/golang/pkg/db/backup.go (2)
  • NewDailyBackupManager (53-59)
  • NewStartupRecoveryManager (299-305)
backend/golang/pkg/db/sqlite.go (1)
  • NewStoreWithLogger (62-116)
backend/golang/pkg/db/backup.go (1)
backend/golang/pkg/db/database.go (1)
  • DB (17-23)
backend/golang/pkg/db/recovery_enhanced.go (2)
backend/golang/pkg/db/backup.go (4)
  • Logger (17-20)
  • StartupRecoveryManager (293-297)
  • DailyBackupManager (45-51)
  • NewStartupRecoveryManager (299-305)
backend/golang/pkg/db/database.go (1)
  • DB (17-23)
🔇 Additional comments (14)
backend/golang/pkg/db/recovery_test.go (2)

18-22: Logger consistency improvement implemented correctly.

The test now uses a single logger instance throughout, which addresses the previous concern about inconsistent logger usage.


115-133: Excellent improvement to recovery strategy testing.

The test now properly creates a corrupted database and validates recovery behavior, which addresses the previous concern about testing recovery on healthy databases.

backend/golang/pkg/db/backup.go (12)

24-31: LGTM! SQL injection vulnerability and double newline issues resolved.

The implementation now correctly handles both the double newline issue (by checking if format already ends with \n) and the SQL injection vulnerabilities have been addressed elsewhere with the escapeSQLitePath function.


38-43: Well-designed backup metadata structure.

The BackupInfo struct provides comprehensive metadata for backup files including path, timestamp, size, and validity status, which is essential for backup management operations.


65-95: Robust backup scheduling logic with catch-up mechanism.

The startup logic correctly handles different time scenarios and includes a catch-up backup for missed daily backups. The scheduling calculation properly handles both before and after 2 AM scenarios.
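
A minimal sketch of such a next-2-AM calculation (illustrative; the PR's CalculateNextBackup may differ in detail):

    import "time"

    // nextTwoAM returns the next 2 AM after now: today if we are still before
    // 2 AM, otherwise tomorrow. time.Date normalizes Day()+1 across month ends.
    func nextTwoAM(now time.Time) time.Time {
        next := time.Date(now.Year(), now.Month(), now.Day(), 2, 0, 0, 0, now.Location())
        if !now.Before(next) {
            next = time.Date(now.Year(), now.Month(), now.Day()+1, 2, 0, 0, 0, now.Location())
        }
        return next
    }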


102-122: Comprehensive status reporting for backup operations.

The status method provides detailed information about backup system state, including initialization status, today's backup completion, and next scheduled time. This will be helpful for monitoring and debugging.


144-182: Well-structured backup creation with proper verification.

The backup process follows a solid pattern: integrity check → create backup → verify backup → rename to final location. The temporary file approach with cleanup on failure is a good practice for atomic operations.
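
Roughly, that pattern looks like the following (a sketch that assumes VACUUM INTO is used for the copy; the function name and details are illustrative, not the PR's code, and the snippet assumes imports of context, database/sql, fmt, os, strings, time plus a registered sqlite3 driver):

    // createBackupSketch: integrity check -> copy to temp file -> verify the
    // copy read-only -> rename to the final backup name.
    func createBackupSketch(ctx context.Context, db *sql.DB, dbPath string) error {
        // 1. Integrity check on the live database.
        var result string
        if err := db.QueryRowContext(ctx, "PRAGMA quick_check").Scan(&result); err != nil || result != "ok" {
            return fmt.Errorf("integrity check failed: %v (%s)", err, result)
        }

        // 2. Copy into a temporary file so the final name only ever holds a
        //    complete backup.
        final := dbPath + ".backup." + time.Now().Format("20060102")
        tmp := final + ".tmp"
        escaped := strings.ReplaceAll(tmp, "'", "''") // escape for the SQL literal
        if _, err := db.ExecContext(ctx, "VACUUM INTO '"+escaped+"'"); err != nil {
            return fmt.Errorf("vacuum into failed: %w", err)
        }

        // 3. Verify the copy read-only before promoting it.
        check, err := sql.Open("sqlite3", "file:"+tmp+"?mode=ro")
        if err == nil {
            err = check.QueryRowContext(ctx, "PRAGMA quick_check").Scan(&result)
            _ = check.Close()
        }
        if err != nil || result != "ok" {
            _ = os.Remove(tmp)
            return fmt.Errorf("backup verification failed: %v", err)
        }

        // 4. Promote the verified copy.
        return os.Rename(tmp, final)
    }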


199-210: Excellent fix for SQL injection vulnerability.

The escapeSQLitePath function properly addresses the SQL injection concern by escaping single quotes according to SQLite standards. This is correctly used in both createBackupFile and the recovery methods.


307-378: Excellent comprehensive recovery strategy implementation.

The recovery manager implements a well-thought-out cascade of recovery strategies from least to most destructive. The strategy pattern with descriptive logging makes the recovery process transparent and debuggable.


411-440: SQL injection vulnerability properly addressed.

The attemptVacuumIntoRecovery method now correctly uses escapeSQLitePath to prevent SQL injection, addressing the previous security concern.


582-607: Excellent improvement over previous unsafe implementation.

The createDatabaseFromDump method now uses the sqlite3 command-line tool with direct file piping instead of loading the entire dump into memory and performing unsafe string replacements. This addresses both the memory consumption and data corruption concerns from previous reviews.


442-471: Verify error handling in defer functions.

The defer function in attemptDumpRecovery includes panic recovery, but calling panic(err) re-raises the panic. Consider whether this is the intended behavior or if the panic should be logged and handled differently.

The current implementation re-raises panics after cleanup:

defer func() {
    _ = os.Remove(dumpPath)
    if err := recover(); err != nil {
        _ = os.Remove(recoveredPath)
        panic(err) // Re-raises the panic
    }
}()

Is this the intended behavior? If panics should be handled gracefully, consider logging the error instead of re-panicking.


643-670: Comprehensive recovery verification with basic functionality testing.

The verification process goes beyond simple integrity checks to test actual database operations, which provides better confidence in recovery success. The schema access test is a good way to verify the database structure is intact.


672-693: Robust file copying with proper cleanup and sync.

The safeCopy method implements proper error handling with cleanup of partially written files and ensures data is flushed to disk with Sync(). This is essential for reliable backup restoration.
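
For illustration, a copy helper with those properties might look like this (sketch; not the PR's safeCopy, and assuming imports of io and os):

    // copyFileDurably copies src to dst, fsyncs the destination, and removes
    // any partially written file on failure.
    func copyFileDurably(src, dst string) (err error) {
        in, err := os.Open(src)
        if err != nil {
            return err
        }
        defer in.Close()

        out, err := os.Create(dst)
        if err != nil {
            return err
        }
        defer func() {
            if cerr := out.Close(); err == nil {
                err = cerr
            }
            if err != nil {
                _ = os.Remove(dst) // don't leave a partial copy behind
            }
        }()

        if _, err = io.Copy(out, in); err != nil {
            return err
        }
        return out.Sync() // flush to disk before the copy is trusted
    }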

@Enjection Enjection marked this pull request as ready for review July 25, 2025 00:35
coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (9)
backend/golang/pkg/db/integration_test.go (1)

57-101: Recovery test properly simulates corruption scenario.

The test now correctly simulates database corruption by overwriting the database file with invalid content, addressing the previous review feedback. The recovery verification is comprehensive.

backend/golang/pkg/db/sqlite.go (3)

19-50: Custom SQLite driver properly handles PRAGMA errors.

The implementation correctly collects all PRAGMA errors and returns a combined error message, addressing the previous review feedback about error handling.


61-119: Store initialization properly handles backup manager setup.

The implementation correctly sets the database on the backup manager before store creation and starts it after initialization, addressing the previous race condition concern.


137-158: Backup methods properly implement dynamic status reporting.

The GetBackupStatus method now correctly delegates to the backup manager for dynamic status, addressing the previous review feedback about hardcoded messages.

backend/golang/pkg/db/backup_timing_test.go (1)

138-185: Edge case tests properly verify scheduling logic.

The tests now correctly use the backup manager's CalculateNextBackup method to test actual scheduling behavior, addressing the previous review feedback.

backend/golang/pkg/db/backup.go (4)

24-31: Printf implementation correctly handles newlines.

The method now checks for existing newlines before appending, addressing the previous review feedback about potential double newlines.


201-212: SQL injection vulnerability properly mitigated.

The implementation now correctly escapes file paths using the escapeSQLitePath function before using them in VACUUM INTO queries, addressing the previous security concern.


424-426: VACUUM INTO recovery properly escapes paths.

The implementation correctly uses escapeSQLitePath to prevent SQL injection in the recovery process, addressing the previous security concern.


584-609: Database dump restoration properly uses sqlite3 CLI.

The implementation now correctly pipes the dump file directly to sqlite3's stdin, addressing previous concerns about memory usage and unsafe string replacements.

🧹 Nitpick comments (3)
backend/golang/pkg/db/recovery_enhanced.go (2)

64-78: Consider logging auxiliary file copy failures.

While the method correctly handles optional WAL/SHM files, silently ignoring copy errors on lines 69 and 76 could make debugging difficult. Consider logging these failures as warnings.

-		_ = srm.safeCopy(walPath, walArchivePath)
+		if err := srm.safeCopy(walPath, walArchivePath); err != nil {
+			srm.logger.Printf("Warning: Failed to archive WAL file: %v", err)
+		}
-		_ = srm.safeCopy(shmPath, shmArchivePath)
+		if err := srm.safeCopy(shmPath, shmArchivePath); err != nil {
+			srm.logger.Printf("Warning: Failed to archive SHM file: %v", err)
+		}

130-153: Consider improving sort error handling.

The sorting logic works well, but when os.Stat fails on lines 146-148, returning false may not provide consistent sorting. Consider logging these errors or using filename-based sorting as a fallback.

		if errI != nil || errJ != nil {
+			// Fallback to filename comparison if stat fails
+			return matches[i] > matches[j]
-			return false
		}
backend/golang/pkg/db/integration_test.go (1)

167-178: Consider making error handling test more portable.

The test uses /proc/version which is Linux-specific. This test will fail on Windows and macOS systems where /proc doesn't exist.

Consider using a more portable approach:

-	// Use a more reliable approach - try to create a database in a definitely invalid location
-	// that will fail consistently across different environments
-	invalidPath := filepath.Join("/proc/version", "test.db") // /proc/version is a file, not a directory
+	// Create a temporary file and try to use it as a directory
+	tmpFile, err := os.CreateTemp("", "test_file")
+	require.NoError(t, err)
+	defer os.Remove(tmpFile.Name())
+	tmpFile.Close()
+	
+	// Try to create database inside a file (not a directory)
+	invalidPath := filepath.Join(tmpFile.Name(), "test.db")
 	_, err := NewStoreWithLogger(context.Background(), invalidPath, logger)
 	assert.Error(t, err, "Should fail when trying to create database in invalid location")
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Lite
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 96ab863 and 0c50d94.

📒 Files selected for processing (15)
  • backend/golang/cmd/memory-processor-test/main.go (2 hunks)
  • backend/golang/cmd/server/main.go (1 hunks)
  • backend/golang/pkg/bootstrap/temporal.go (3 hunks)
  • backend/golang/pkg/dataprocessing/chatgpt/chatgpt_test.go (1 hunks)
  • backend/golang/pkg/dataprocessing/gmail/gmail_test.go (1 hunks)
  • backend/golang/pkg/dataprocessing/slack/slack_test.go (1 hunks)
  • backend/golang/pkg/dataprocessing/telegram/telegram_test.go (2 hunks)
  • backend/golang/pkg/db/backup.go (1 hunks)
  • backend/golang/pkg/db/backup_test.go (1 hunks)
  • backend/golang/pkg/db/backup_timing_test.go (1 hunks)
  • backend/golang/pkg/db/integration_test.go (1 hunks)
  • backend/golang/pkg/db/recovery_enhanced.go (1 hunks)
  • backend/golang/pkg/db/recovery_test.go (1 hunks)
  • backend/golang/pkg/db/safety_test.go (1 hunks)
  • backend/golang/pkg/db/sqlite.go (3 hunks)
✅ Files skipped from review due to trivial changes (2)
  • backend/golang/cmd/memory-processor-test/main.go
  • backend/golang/pkg/bootstrap/temporal.go
🚧 Files skipped from review as they are similar to previous changes (7)
  • backend/golang/pkg/dataprocessing/slack/slack_test.go
  • backend/golang/pkg/dataprocessing/gmail/gmail_test.go
  • backend/golang/pkg/dataprocessing/telegram/telegram_test.go
  • backend/golang/pkg/dataprocessing/chatgpt/chatgpt_test.go
  • backend/golang/pkg/db/backup_test.go
  • backend/golang/cmd/server/main.go
  • backend/golang/pkg/db/safety_test.go
🔇 Additional comments (16)
backend/golang/pkg/db/recovery_test.go (2)

14-82: Well-structured test implementation with proper logger usage!

The test correctly uses a single logger instance with the adapter pattern, has comprehensive error handling, and tests the recovery manager's core functionality including archiving and database opening with recovery.


115-133: Excellent implementation of corrupted database testing!

The test now properly creates a corrupted database by writing invalid SQLite data, which effectively validates the recovery logic against actual corruption scenarios. This addresses the previous concern about testing recovery on healthy databases.

backend/golang/pkg/db/recovery_enhanced.go (6)

1-15: Package declaration and imports look good.

All imports are properly utilized and the package structure is appropriate for the database recovery functionality.


17-33: LoggerAdapter implementation is well-designed.

The adapter pattern correctly abstracts the charmbracelet/log.Logger, enabling dependency injection and testing flexibility. The method implementations properly delegate to the underlying logger.


34-62: Robust database archiving implementation.

The method provides comprehensive archiving with conflict-free naming, proper directory creation, and good error handling. The delegation to separate methods for auxiliary files and metadata follows good separation of concerns.


80-128: Comprehensive metadata and utility functions.

The metadata creation provides valuable debugging information, and utility functions handle errors gracefully. The Go version detection has been properly implemented with runtime.Version() as addressed from previous feedback.


155-184: Excellent implementation of direct backup restoration.

The method has been properly improved based on previous feedback to perform direct backup restoration. The implementation correctly archives the corrupted database, performs direct restoration, and verifies the result with comprehensive logging.


186-275: Well-designed database manager with comprehensive recovery.

The implementation properly consolidates recovery logic in the attemptRecoveryAndReopen helper method (addressing previous feedback), includes integrity verification, and provides informative error messages with actionable recovery options. The separation of concerns is excellent.

backend/golang/pkg/db/integration_test.go (4)

14-34: Good test setup with proper cleanup handling.

The test properly initializes a temporary directory and implements thorough cleanup of corrupted archives both at the start and end of the test.


35-55: Well-structured store creation test.

The test properly verifies store initialization with all components and validates basic operations.


103-125: Database manager test validates recovery integration.

The test properly exercises the database manager's recovery capabilities and verifies SQL operations work correctly after initialization.


138-165: Comprehensive application usage test.

The test effectively simulates typical application usage patterns including store creation, database operations, and backup functionality.

backend/golang/pkg/db/backup_timing_test.go (2)

15-96: Comprehensive backup scheduling test with proper logic verification.

The test effectively validates backup scheduling logic using the actual backup manager's CalculateNextBackup method with various time scenarios.


187-286: Robust date-based backup detection tests.

The tests comprehensively verify backup detection logic with proper handling of environment issues and thorough timestamp-based scenarios.

backend/golang/pkg/db/backup.go (2)

361-380: Robust recovery strategy implementation.

The layered recovery approach with verification after each strategy provides excellent resilience against various corruption scenarios.


45-184: Well-architected daily backup system.

The DailyBackupManager implementation includes proper state management, dynamic status reporting, integrity verification, and atomic backup operations with temporary files.

Comment on lines +19 to +50
// Register a custom SQLite driver with connection hooks for safety.
func init() {
    sql.Register("sqlite3_safe", &sqlite3.SQLiteDriver{
        ConnectHook: func(conn *sqlite3.SQLiteConn) error {
            var errs []error

            // Execute all PRAGMA commands, collecting any errors
            if _, err := conn.Exec("PRAGMA foreign_keys = ON", nil); err != nil {
                errs = append(errs, fmt.Errorf("failed to enable foreign keys: %w", err))
            }

            if _, err := conn.Exec("PRAGMA busy_timeout = 5000", nil); err != nil {
                errs = append(errs, fmt.Errorf("failed to set busy timeout: %w", err))
            }

            if _, err := conn.Exec("PRAGMA journal_mode = WAL", nil); err != nil {
                errs = append(errs, fmt.Errorf("failed to set WAL mode: %w", err))
            }

            // Return combined error if any occurred
            if len(errs) > 0 {
                var errStrings []string
                for _, err := range errs {
                    errStrings = append(errStrings, err.Error())
                }
                return fmt.Errorf("PRAGMA errors: %s", strings.Join(errStrings, "; "))
            }

            return nil
        },
    })
}

💡 Verification agent

🧩 Analysis chain

Ensure consistent use of sqlite3_safe driver.

The custom sqlite3_safe driver with safety pragmas is registered but not used consistently throughout the codebase.

Let me verify the driver usage across the codebase:


🏁 Script executed:

#!/bin/bash
# Check for sqlite3 driver usage patterns
echo "=== Checking sqlite3 driver usage ==="
rg -n 'sql\.Open\("sqlite3' --type go

echo -e "\n=== Checking sqlite3_safe driver usage ==="
rg -n 'sql\.Open\("sqlite3_safe' --type go

echo -e "\n=== Files that might need updating ==="
rg -l 'sql\.Open\("sqlite3"' --type go | grep -v '_test\.go'

Length of output: 2638


Ensure consistent use of the sqlite3_safe driver

The custom sqlite3_safe driver—registered with safety PRAGMAs—must replace all calls to the plain "sqlite3" driver. Please update the following locations:

  • backend/golang/pkg/db/backup.go
    Replace sql.Open("sqlite3", …) at lines 385, 400, 418, 613, and 652
  • backend/golang/pkg/db/recovery_enhanced.go
    Replace sql.Open("sqlite3", ":memory:") at line 113
  • backend/golang/pkg/dataprocessing/whatsapp/whatsapp.go
    Replace sql.Open("sqlite3", dbPath) at line 417
  • backend/golang/pkg/ai/test_utils.go
    Replace sql.Open("sqlite3", ":memory:") at line 78
🤖 Prompt for AI Agents
In backend/golang/pkg/db/backup.go at lines 385, 400, 418, 613, and 652,
backend/golang/pkg/db/recovery_enhanced.go at line 113,
backend/golang/pkg/dataprocessing/whatsapp/whatsapp.go at line 417, and
backend/golang/pkg/ai/test_utils.go at line 78, replace all instances of
sql.Open("sqlite3", ...) with sql.Open("sqlite3_safe", ...) to ensure the custom
sqlite3_safe driver with safety PRAGMAs is consistently used throughout the
codebase.

coderabbitai bot left a comment

Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Lite
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 0c50d94 and 0ff9de3.

📒 Files selected for processing (4)
  • backend/golang/pkg/ai/test_utils.go (1 hunks)
  • backend/golang/pkg/dataprocessing/whatsapp/whatsapp.go (1 hunks)
  • backend/golang/pkg/db/backup.go (1 hunks)
  • backend/golang/pkg/db/recovery_enhanced.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • backend/golang/pkg/db/recovery_enhanced.go
🔇 Additional comments (7)
backend/golang/pkg/db/backup.go (5)

16-36: Logger implementation looks good.

The logger interface and default implementation properly handle newline formatting, addressing the previous concern about double newlines. The suffix checking logic ensures consistent output formatting.
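
That suffix check amounts to something like the following (sketch; the helper name is hypothetical, and the snippet assumes imports of fmt, io, strings):

    // printfLine appends a newline only when the format string lacks one, so
    // log lines are never followed by a stray blank line.
    func printfLine(w io.Writer, format string, args ...interface{}) {
        if !strings.HasSuffix(format, "\n") {
            format += "\n"
        }
        fmt.Fprintf(w, format, args...)
    }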


65-74: Backup scheduling logic is correct.

The method properly calculates the next 2 AM backup time, handling both same-day and next-day scenarios appropriately.


201-205: SQL path escaping function is correctly implemented.

The function properly escapes single quotes by doubling them, which is the correct SQLite standard approach. This addresses the SQL injection vulnerability flagged in previous reviews.


295-380: Excellent recovery strategy implementation.

The recovery manager provides a comprehensive, layered approach to database recovery with well-defined strategies and proper error handling. The strategy pattern implementation is clean, maintainable, and follows good software engineering practices.


584-609: Improved dump processing implementation.

The method now correctly uses the sqlite3 command-line tool approach, which addresses the previous concerns about unsafe string replacements and memory usage with large dumps. The direct piping approach is much safer and more efficient.
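
Piping a dump into the sqlite3 CLI typically looks like this (sketch; the function name is illustrative, the sqlite3 binary is assumed to be on PATH, and the snippet assumes imports of context, fmt, os, os/exec):

    // restoreFromDump rebuilds dbPath by streaming dumpPath into the sqlite3
    // CLI, avoiding loading the whole dump into memory.
    func restoreFromDump(ctx context.Context, dumpPath, dbPath string) error {
        dump, err := os.Open(dumpPath)
        if err != nil {
            return err
        }
        defer dump.Close()

        cmd := exec.CommandContext(ctx, "sqlite3", dbPath)
        cmd.Stdin = dump // stream the SQL text straight to the CLI

        if out, err := cmd.CombinedOutput(); err != nil {
            return fmt.Errorf("sqlite3 restore failed: %v: %s", err, out)
        }
        return nil
    }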

backend/golang/pkg/ai/test_utils.go (1)

78-78: LGTM - Driver change aligns with SQLite safety improvements.

The change from "sqlite3" to "sqlite3_safe" is consistent with the broader codebase update to use the enhanced SQLite driver that enforces safety features like foreign key constraints, WAL mode, and busy timeouts.

backend/golang/pkg/dataprocessing/whatsapp/whatsapp.go (1)

417-417: LGTM - Consistent SQLite driver update.

The change to "sqlite3_safe" maintains consistency with the codebase-wide initiative to use the enhanced SQLite driver with safety features. This ensures the WhatsApp processor benefits from the same foreign key enforcement, WAL mode, and connection safety as other components.

Comment on lines +45 to +63
type DailyBackupManager struct {
    dbPath   string
    db       *sql.DB
    logger   Logger
    stopChan chan struct{}
    started  bool
}

func NewDailyBackupManager(dbPath string, logger Logger) *DailyBackupManager {
    return &DailyBackupManager{
        dbPath:   dbPath,
        logger:   logger,
        stopChan: make(chan struct{}),
    }
}

func (dbm *DailyBackupManager) SetDatabase(db *sql.DB) {
    dbm.db = db
}

⚠️ Potential issue

Potential race condition with the started field.

The started field is accessed without synchronization in multiple methods (Start, Stop, GetStatus). This could lead to race conditions when the backup manager is accessed from multiple goroutines.

Consider using atomic operations or a mutex to protect the started field:

+import "sync/atomic"

 type DailyBackupManager struct {
 	dbPath   string
 	db       *sql.DB
 	logger   Logger
 	stopChan chan struct{}
-	started  bool
+	started  int32 // Use atomic int32 instead of bool
 }

Then use atomic.LoadInt32(&dbm.started) == 1 to check and atomic.StoreInt32(&dbm.started, 1) to set.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
Original:

    type DailyBackupManager struct {
        dbPath   string
        db       *sql.DB
        logger   Logger
        stopChan chan struct{}
        started  bool
    }

    func NewDailyBackupManager(dbPath string, logger Logger) *DailyBackupManager {
        return &DailyBackupManager{
            dbPath:   dbPath,
            logger:   logger,
            stopChan: make(chan struct{}),
        }
    }

    func (dbm *DailyBackupManager) SetDatabase(db *sql.DB) {
        dbm.db = db
    }

Suggested:

    // Add the atomic package
    import (
        "database/sql"
        "sync/atomic"
    )

    type DailyBackupManager struct {
        dbPath   string
        db       *sql.DB
        logger   Logger
        stopChan chan struct{}
        started  int32 // was bool; use atomic int32
    }

    func NewDailyBackupManager(dbPath string, logger Logger) *DailyBackupManager {
        return &DailyBackupManager{
            dbPath:   dbPath,
            logger:   logger,
            stopChan: make(chan struct{}),
        }
    }

    func (dbm *DailyBackupManager) SetDatabase(db *sql.DB) {
        dbm.db = db
    }
🤖 Prompt for AI Agents
In backend/golang/pkg/db/backup.go around lines 45 to 63, the `started` boolean
field in DailyBackupManager is accessed concurrently without synchronization,
risking race conditions. Replace the `started` bool with an int32 field and use
atomic operations from the sync/atomic package to safely read and write its
value. Update all accesses to `started` to use atomic.LoadInt32 and
atomic.StoreInt32 accordingly to ensure thread-safe checks and updates.

Comment on lines +76 to +97
func (dbm *DailyBackupManager) Start(ctx context.Context) {
    // Ensure database is set before starting
    if dbm.db == nil {
        dbm.logger.Printf("Cannot start backup manager: database not set")
        return
    }

    now := time.Now()
    nextBackup := dbm.CalculateNextBackup(now)

    // Do catch-up backup if we missed today's backup
    if now.Hour() >= 2 && !dbm.todaysBackupExists() {
        go func() {
            if err := dbm.CreateBackup(ctx); err != nil {
                dbm.logger.Printf("Initial backup failed: %v", err)
            }
        }()
    }

    dbm.started = true
    go dbm.runDailyBackups(ctx, nextBackup)
}

⚠️ Potential issue

Missing synchronization in Start method.

The method doesn't prevent multiple concurrent calls to Start(), which could result in multiple backup goroutines running simultaneously.

Add synchronization to prevent duplicate starts:

+import "sync"

+type DailyBackupManager struct {
+	// ... existing fields ...
+	mu sync.Mutex
+}

 func (dbm *DailyBackupManager) Start(ctx context.Context) {
+	dbm.mu.Lock()
+	defer dbm.mu.Unlock()
+	
+	if dbm.started {
+		dbm.logger.Printf("Backup manager already started")
+		return
+	}
+	
 	// Ensure database is set before starting
 	if dbm.db == nil {
 		dbm.logger.Printf("Cannot start backup manager: database not set")
 		return
 	}

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In backend/golang/pkg/db/backup.go lines 76 to 97, the Start method lacks
synchronization to prevent multiple concurrent calls, which can cause multiple
backup goroutines to run simultaneously. Add a mutex or use an atomic flag to
check and set the started state atomically at the beginning of the method,
returning early if already started. This ensures only one backup goroutine is
launched and prevents duplicate starts.

Comment on lines +99 to +102
func (dbm *DailyBackupManager) Stop() {
    dbm.started = false
    close(dbm.stopChan)
}

⚠️ Potential issue

Stop method lacks proper synchronization and channel safety.

The method has two issues:

  1. Race condition with the started field
  2. Potential panic if Stop() is called multiple times (channel already closed)
 func (dbm *DailyBackupManager) Stop() {
+	dbm.mu.Lock()
+	defer dbm.mu.Unlock()
+	
+	if !dbm.started {
+		return
+	}
+	
 	dbm.started = false
-	close(dbm.stopChan)
+	select {
+	case <-dbm.stopChan:
+		// Already closed
+	default:
+		close(dbm.stopChan)
+	}
 }

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In backend/golang/pkg/db/backup.go around lines 99 to 102, the Stop method has a
race condition on the started field and can panic if called multiple times due
to closing an already closed channel. Fix this by adding synchronization, such
as a mutex, to safely check and update the started field and ensure the stopChan
channel is closed only once, preventing multiple close calls and race
conditions.

@Enjection Enjection changed the title [Draft] Add auto backup/recovery for sqlite and tune settings Add auto backup/recovery for sqlite and tune settings Jul 25, 2025
@Enjection Enjection marked this pull request as draft July 25, 2025 16:12
@Enjection Enjection marked this pull request as ready for review July 25, 2025 16:12
@Enjection Enjection marked this pull request as draft July 25, 2025 16:22
@Enjection Enjection marked this pull request as ready for review July 25, 2025 16:22
@Enjection Enjection marked this pull request as draft July 25, 2025 16:29