Fix: recovery amplification loop and provider echo race (INV-SAFETY-02) #47

Open
Droyder7 wants to merge 1 commit into kavinsood:main from Droyder7:fix/inv-safety-02

Droyder7 commented May 19, 2026

Fixes #46

This PR resolves the unbounded file growth and data corruption loops observed during editor-bound
recoveries. The fix addresses a multi-layer root cause where the system misclassified its own
repairs as remote edits and inappropriately applied stale editor state back to the CRDT during
active recovery cycles.

Technical Details

The fix takes a defense-in-depth approach across three subsystems, plus a cleanup of origin
constants, to enforce the INV-SAFETY-02 and INV-EDIT-02 invariants:

  1. Provider Echo Suppression (DiskMirror):

    • Implemented a TTL-based hash tracking mechanism (recentRepairEchoes) to identify local
      repair writes.
    • Refactored the afterTransaction observer to be asynchronous. It now computes the hash of
      incoming "remote" updates and suppresses disk writes if they are confirmed to be echoes of
      a recent local repair. This closes the race window where a stale echo could overwrite a
      concurrent external disk edit.
  2. Recovery Guard (EditorBindingManager):

    • Added a strict runtime guard to the heal() method. It now accepts an
      isDiskAuthorityRecoveryActive predicate.
    • If a recovery is active for a path, heal() (writing editor → CRDT) is skipped in favor of
      repair() (applying CRDT → editor). This prevents "Frankenstein" patches built from stale
      editor state from undoing a successful disk-authority recovery.
  3. State Exposure (ReconciliationController):

    • Exposed isDiskAuthorityRecoveryActive() to allow other subsystems (like the editor binding)
      to respect active recovery locks.
  4. Origin Cleanup:

    • Centralized all repair-related origin constants in LOCAL_STRING_ORIGIN_SET to prevent
      future classification failures.
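
For reviewers who want the control flow of the recovery guard (item 2) at a glance, here is a minimal sketch. Only `heal()`, `repair()`, and `isDiskAuthorityRecoveryActive` are names from this PR. `BindingIO`, `HealOutcome`, `EditorBindingSketch`, and the in-sync short-circuit are hypothetical stand-ins and should not be read as the actual `EditorBindingManager` code.

```typescript
// Hypothetical I/O surface; the real binding talks to an editor view and a Yjs text type.
interface BindingIO {
  readEditor(path: string): string;
  readCrdt(path: string): string;
  writeEditor(path: string, content: string): void;
  writeCrdt(path: string, content: string): void;
}

// Hypothetical result type, used here only to make the chosen branch observable.
type HealOutcome = "in-sync" | "healed" | "repaired";

class EditorBindingSketch {
  constructor(private readonly io: BindingIO) {}

  /** CRDT -> editor: overwrite the editor with the CRDT content. */
  repair(path: string): void {
    this.io.writeEditor(path, this.io.readCrdt(path));
  }

  /**
   * editor -> CRDT, unless a disk-authority recovery is active for `path`.
   * In that case the editor content is treated as stale and repair() runs instead.
   */
  heal(
    path: string,
    isDiskAuthorityRecoveryActive: (path: string) => boolean,
  ): HealOutcome {
    const editorText = this.io.readEditor(path);
    if (editorText === this.io.readCrdt(path)) {
      return "in-sync"; // assumption: nothing to do when both sides agree
    }
    if (isDiskAuthorityRecoveryActive(path)) {
      this.repair(path);
      return "repaired";
    }
    this.io.writeCrdt(path, editorText);
    return "healed";
  }
}
```

Passing the predicate into `heal()` keeps the binding unaware of how recovery state is stored; in this sketch the caller would supply something equivalent to the controller's `isDiskAuthorityRecoveryActive()`.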

Motivation
Previously, a local repair would trigger a Yjs provider echo that looked like a remote change.
This caused DiskMirror to schedule a disk write, which triggered a vault event, re-starting the
reconciliation and creating an infinite loop. Additionally, concurrent health checks in the editor
would often "heal" the CRDT back to a stale state before the editor had settled on the new
recovered content.
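
The misclassification described above comes down to how a transaction's origin is interpreted. The sketch below shows one way a centralized origin set could be consulted. `LOCAL_STRING_ORIGIN_SET` is the name used in this PR, but the origin strings and both helper functions are invented placeholders for illustration.

```typescript
// Placeholder origin strings; the real constants live in the codebase and may differ.
const LOCAL_STRING_ORIGIN_SET: ReadonlySet<string> = new Set<string>([
  "local-edit",
  "disk-repair",
  "editor-repair",
]);

/** True when the origin is a string registered as locally produced. */
function isLocalOrigin(origin: unknown): boolean {
  return typeof origin === "string" && LOCAL_STRING_ORIGIN_SET.has(origin);
}

/**
 * Classify a transaction origin. Anything not registered as local is treated
 * as remote, so a repair origin missing from the set is handled like a remote edit.
 */
function classifyOrigin(origin: unknown): "local" | "remote" {
  return isLocalOrigin(origin) ? "local" : "remote";
}
```

Keeping every repair-related origin in one set means a newly added repair path only has to be registered once. Origin checks alone would not catch an echo that returns through the provider with the provider as its origin, which is the case the content-hash check is meant to cover.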

Verification

Alternative Approaches Considered
We considered purely timing-based suppression, but content-hash verification in the
echo-suppressor was chosen for higher reliability, as it explicitly confirms the content matches
the repair before skipping the write.
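
To make the difference from timing-only suppression concrete, here is a minimal sketch of a TTL-bounded, content-hash echo check. Only the `recentRepairEchoes` name comes from this PR. The class, its methods, the default TTL, and the injected clock are assumptions, and the FNV-1a hash is a dependency-free stand-in for whatever digest the real code uses. The sketch is also synchronous, whereas the PR describes an asynchronous observer.

```typescript
/** FNV-1a 32-bit hash; a stand-in digest chosen only to keep the sketch self-contained. */
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

class RepairEchoTracker {
  // path -> (content hash -> expiry timestamp in ms)
  private readonly recentRepairEchoes = new Map<string, Map<string, number>>();

  constructor(
    private readonly ttlMs: number = 5000, // assumed default, not taken from the PR
    private readonly now: () => number = () => Date.now(),
  ) {}

  /** Record that a local repair just produced `content` for `path`. */
  recordRepair(path: string, content: string): void {
    let byHash = this.recentRepairEchoes.get(path);
    if (!byHash) {
      byHash = new Map<string, number>();
      this.recentRepairEchoes.set(path, byHash);
    }
    byHash.set(fnv1a(content), this.now() + this.ttlMs);
  }

  /** True only if `content` hashes to an unexpired repair recorded for `path`. */
  isRepairEcho(path: string, content: string): boolean {
    const byHash = this.recentRepairEchoes.get(path);
    if (!byHash) return false;
    const current = this.now();
    // Prune expired entries so the map cannot grow without bound.
    byHash.forEach((expiresAt, hash) => {
      if (expiresAt <= current) byHash.delete(hash);
    });
    if (byHash.size === 0) this.recentRepairEchoes.delete(path);
    return byHash.has(fnv1a(content));
  }
}

/** A "remote"-looking update is mirrored to disk unless it is a confirmed repair echo. */
function shouldWriteToDisk(
  tracker: RepairEchoTracker,
  path: string,
  content: string,
): boolean {
  return !tracker.isRepairEcho(path, content);
}
```

A purely timing-based suppressor would skip every write for the path inside the window. Here a different payload arriving within the TTL still reaches disk, because its hash does not match the recorded repair.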

Droyder7 changed the title from "feat(sync): implement disk authority recovery checks and enhance heal…" to "Fix: recovery amplification loop and provider echo race (INV-SAFETY-02)" May 19, 2026
…ing logic; add integration tests for recovery scenarios
Droyder7 force-pushed the fix/inv-safety-02 branch from 5342730 to 66cd864 May 19, 2026 09:03
Droyder7 marked this pull request as ready for review May 19, 2026 19:26

Development

Successfully merging this pull request may close these issues.

[Bug] Local repairs round-tripping as remote writebacks (INV-SAFETY-02 violation) Sync loop glitching making edits impossible
