fix(vault): sync writes to memory_tree, not legacy UnifiedMemory (#2705) by justinhsu1477 · Pull Request #2720 · tinyhumansai/openhuman

justinhsu1477 · 2026-05-27T00:57:42Z

Closes #2705.

Root cause

vault::sync::sync_vault walked the directory, called doc_ingest → `MemoryClient::ingest_doc` → `UnifiedMemory::ingest_document` → `memory_docs` table. The UI's "synced" message was technically correct (UnifiedMemory accepted the writes), but every modern retrieval surface — `memory.search`, `tree.read_chunk`, `tree.browse`, the agent's recall path, the summary-tree builder — reads from the memory-tree backend (`mem_tree_chunks` + `mem_tree_ingested_sources`).

Vault data was invisible to all of them.

That explains the entire bug report:

| Symptom | Explanation |
| --- | --- |
| `mem_tree_chunks` = 0 | vault sync never wrote there |
| `mem_tree_ingested_sources` = 0 | same |
| "Memory wiped: 0 rows removed" | wipe clears memory_tree, which was already empty |
| "Build summary trees" produces no jobs | memory_tree has no chunks to summarise |
| UI "N file(s) · synced" | vault state correctly reported the UnifiedMemory ingest succeeded — to the wrong table |
| Agent can't recall vault content | agent retrieval reads memory_tree |

Recent PRs (#2585, #2556, #2574) migrated RAG primitives to memory_tree as the canonical layer; the vault sync path was not migrated alongside.

Fix

`process_file` now calls `memory::ingest_pipeline::ingest_document` directly with a stable `source_id = vault:{vault_id}:{rel_path}`. The pipeline writes to `mem_tree_chunks` + `mem_tree_ingested_sources` — the tables the modern retrieval stack reads from.
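The stable id format can be illustrated with a minimal sketch. The function names and signatures below are assumptions for illustration only; the PR names a `vault_source_id(vault_id, rel_path)` helper, but its exact signature is not shown here.

```rust
/// Hypothetical stand-in for the PR's `vault_source_id` helper.
/// Assumes both arguments are plain strings and that the id is a
/// straightforward `vault:{vault_id}:{rel_path}` concatenation.
pub fn vault_source_id(vault_id: &str, rel_path: &str) -> String {
    format!("vault:{vault_id}:{rel_path}")
}

/// True when a stored ledger id already uses the memory-tree format.
/// Mirrors the `vault:` prefix check described for the deletion path.
pub fn is_memory_tree_id(document_id: &str) -> bool {
    document_id.starts_with("vault:")
}
```

Because the id depends only on the vault id and the relative path, the same file maps to the same source on every sync, and files with the same relative path in different vaults do not collide.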

Three design choices worth calling out:

- Content-update path — the pipeline's `already_ingested` gate is content-blind and the source_id is stable per file path. For real content updates the vault layer drops prior chunks via `memory_store::chunks::store::delete_chunks_by_source` before the re-ingest; otherwise the new content gets short-circuited. The vault ledger's `content_hash` check still gates whether we run delete+reingest at all, so untouched files cost zero pipeline work.
- Deletion path (Phase 4) — `by_path` entries the walk didn't see now call `delete_chunks_by_source` instead of `doc_delete`. Migration-safe: ledger entries whose stored `document_id` doesn't start with `vault:` (rows persisted before this fix) fall back to a recomputed source_id.
- Vault ledger semantics — the `VaultFile.document_id` field now holds the memory-tree source_id. Schema column name unchanged for backward compatibility with persisted rows; only the semantics of what we store changed. Deletion uses the prefix check above to handle the migration window cleanly.
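Why the delete has to come before the re-ingest can be shown with a small in-memory model. This is a sketch under stated assumptions: `TreeStore`, `IngestOutcome`, and `sync_file` are hypothetical stand-ins, chunking is reduced to splitting on blank lines, and hashes are passed in as opaque strings rather than computed.

```rust
use std::collections::{HashMap, HashSet};

/// In-memory stand-in for the memory-tree tables (hypothetical, not the real store).
#[derive(Default)]
pub struct TreeStore {
    ingested: HashSet<String>,            // plays the role of mem_tree_ingested_sources
    chunks: HashMap<String, Vec<String>>, // plays the role of mem_tree_chunks
}

pub struct IngestOutcome {
    pub already_ingested: bool,
    pub chunks_written: usize,
}

impl TreeStore {
    /// Content-blind gate: a known source_id short-circuits regardless of body.
    pub fn ingest(&mut self, source_id: &str, body: &str) -> IngestOutcome {
        if self.ingested.contains(source_id) {
            return IngestOutcome { already_ingested: true, chunks_written: 0 };
        }
        let chunks: Vec<String> = body
            .split("\n\n")
            .filter(|p| !p.trim().is_empty())
            .map(str::to_owned)
            .collect();
        let chunks_written = chunks.len();
        self.chunks.insert(source_id.to_owned(), chunks);
        self.ingested.insert(source_id.to_owned());
        IngestOutcome { already_ingested: false, chunks_written }
    }

    /// Clears both the chunks and the ingested-source marker.
    pub fn delete_by_source(&mut self, source_id: &str) -> usize {
        self.ingested.remove(source_id);
        self.chunks.remove(source_id).map_or(0, |c| c.len())
    }

    pub fn chunks_for(&self, source_id: &str) -> Vec<String> {
        self.chunks.get(source_id).cloned().unwrap_or_default()
    }
}

/// Vault-layer flow: skip unchanged files, delete-then-reingest changed ones.
/// Returns `None` when the ledger hash matches and no pipeline work is done.
pub fn sync_file(
    store: &mut TreeStore,
    source_id: &str,
    prev_hash: Option<&str>,
    new_hash: &str,
    body: &str,
) -> Option<IngestOutcome> {
    if prev_hash == Some(new_hash) {
        return None;
    }
    if prev_hash.is_some() {
        store.delete_by_source(source_id);
    }
    Some(store.ingest(source_id, body))
}
```

In this model, calling `ingest` directly with new content for a known source writes nothing, while going through `sync_file` with a changed hash replaces the old chunks.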

Tests

Three new regression tests in `vault::sync::sync_tests`:

| Test | What it pins |
| --- | --- |
| `sync_writes_to_memory_tree` | The #2705 regression. Creates a vault with two .md files, runs `sync_vault`, asserts `count_chunks` goes up + both source_ids appear in `mem_tree_ingested_sources` + ledger `document_id` starts with `vault:` |
| `second_sync_with_no_changes_is_idempotent` | Re-sync with unchanged content does not duplicate chunks (vault-layer hash dedup guards the pipeline) |
| `vault_source_id_is_stable_and_namespaced` | Unit test on the id format — defends against an accidental rename breaking cross-file isolation |

`cargo test --lib vault` — 28/28 pass.
`cargo check --lib` — clean.
`cargo fmt --check` — clean.
`cargo test --tests --no-run` — clean (lesson from feat(mcp-registry): InstalledServer HTTP-remote transport #2603 — integration-test target must compile).

Out of scope (separate audits / PRs)

Composio providers and `agent_experience` still call `doc_ingest` → UnifiedMemory. If they have the same gap, that's a separate audit; this PR is scoped to the vault path the user reported.
Removing `UnifiedMemory` entirely is the larger follow-up senamakel listed on refactor(memory): separate tree policy from generic engine + E2E tests #2585; out of scope here.

Refs #2705.

Summary by CodeRabbit

Improvements
- Vault file synchronization now efficiently detects and skips redundant processing of unchanged content.
- Improved cleanup of deleted vault files from the system.
Tests
- Added regression tests for vault sync reliability and consistency.
- Enhanced end-to-end test coverage for sync update, deletion, and addition workflows.

coderabbitai · 2026-05-27T00:57:57Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

📥 Commits

Reviewing files that changed from the base of the PR and between f8b5054 and e7c61b1.

📒 Files selected for processing (2)

src/openhuman/vault/sync.rs
tests/vault_sync_e2e.rs

🚧 Files skipped from review as they are similar to previous changes (2)

tests/vault_sync_e2e.rs
src/openhuman/vault/sync.rs

Walkthrough

Vault sync has been rerouted to ingest per-file content through the memory-tree pipeline via stable per-file vault:{vault_id}:{rel_path} source IDs, with SHA-256-based change detection, pre-deletion of stale chunks, and migration support for legacy ledger entries. Regression tests and e2e assertions now validate memory-tree table population and idempotency.

Changes

Memory-tree migration and regression fix

| Layer / File(s) | Summary |
| --- | --- |
| Module setup and stable source ID helper `src/openhuman/vault/sync.rs` (1–59) | Module documentation describes the memory-tree ingestion path and stable `vault:{vault_id}:{rel_path}` source-id semantics; `vault_source_id(vault_id, rel_path)` helper introduced. |
| Per-file data structures and process_file signature `src/openhuman/vault/sync.rs` (132–139, 160–165) | FileToProcess struct reshaped to retain only memory-tree-specific fields (`vault_id`, `prev_hash`); `process_file` signature updated to accept `Arc<Config>` for routing through the memory-tree pipeline. |
| Per-file ingestion and change detection `src/openhuman/vault/sync.rs` (179–296) | Compute SHA-256 content hash and return `Unchanged` when content matches prior hash; derive stable `source_id` and preemptively delete prior memory-tree chunks when content differs; build `DocumentInput` with vault/ext tags and call `ingest_pipeline::ingest_document`; map results to `IngestFileResult` with `document_id` set to the stable source_id. |
| Discovery wiring and concurrent ingestion `src/openhuman/vault/sync.rs` (468–473, 491–498) | Discovery emits FileToProcess records with `prev_hash` and `vault_id`; concurrent ingestion shares a cloned `Config` via `Arc` so each worker calls `process_file(Arc::clone(&config), file)` for each candidate. |
| Memory-tree file deletion with legacy fallback `src/openhuman/vault/sync.rs` (550–637) | Deletion driven by memory-tree `delete_chunks_by_source`; for pre-#2705 ledger rows with non-`vault:` prefixed `document_id`, recomputes stable `vault_source_id` for chunk deletion and performs best-effort `doc_delete` for UnifiedMemory migration cleanup. |
| Regression and idempotency tests `src/openhuman/vault/sync.rs` (694–894) | New `sync_tests` module verifies mem_tree_chunks and mem_tree_ingested_sources population, validates ledger `document_id` encodes stable memory-tree source_ids, confirms unchanged re-sync is idempotent, and includes a unit test for `vault_source_id` determinism and namespacing. |
| E2E test alignment with memory-tree behavior `tests/vault_sync_e2e.rs` (5–12, 22–23, 56, 117–146, 185–207) | Updated `vault_sync_roundtrip_updates_memory_and_ledger` e2e test to assert memory-tree ingestion: post-first-sync validates chunk population and `mem_tree_ingested_sources` registration for `vault:<vault_id>:<path>` source IDs; post-second-sync validates update/delete/add lifecycle under memory-tree semantics. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

tinyhumansai/openhuman#1994: Both PRs modify the per-file ingestion and deduplication logic in vault sync, including use of stable per-file vault-scoped identifiers for tracking content and avoiding duplicates.

Suggested labels

rust-core

Suggested reviewers

graycyrus
M3gA-Mind

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly summarizes the main change: routing vault sync from UnifiedMemory (legacy) to memory_tree (modern retrieval backend), directly addressing issue `#2705`. |
| Linked Issues check | ✅ Passed | The PR comprehensively addresses `#2705`: vault sync now writes to memory_tree tables (mem_tree_chunks, mem_tree_ingested_sources) using stable source_ids, making synced content visible to retrieval, tree browsing, summaries, and agent recall. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to vault sync routing and memory-tree integration; deletion cleanup, content-hash skipping, and the regression test suite are all directly tied to issue `#2705` objectives. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

coderabbitai

🧹 Nitpick comments (1)

src/openhuman/vault/sync.rs (1)

136-136: 💤 Low value

Unused field existing_doc_id after migration.

The existing_doc_id field is populated at line 452 but never read in the new process_file implementation. This appears to be a remnant from the legacy doc_ingest path.

🧹 Remove unused field

 struct FileToProcess {
     rel_path: String,
     title: String,
     path: PathBuf,
     mtime_ms: i64,
     bytes: u64,
     ext: String,
     /// Content hash from the previous successful sync, for secondary dedup.
     prev_hash: Option<String>,
-    /// Document ID to update on re-ingest (keeps embedding lineage stable).
-    existing_doc_id: Option<String>,
     /// Memory namespace (`vault:<id>`).
     namespace: String,
     /// Vault id for tags and state updates.
     vault_id: String,
 }

And at line 452:

-            existing_doc_id: prev.map(|p| p.document_id.clone()),

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/vault/sync.rs` at line 136, The field existing_doc_id in the
struct (declared as existing_doc_id: Option<String>) is no longer used by the
new process_file implementation; remove the unused field and any assignments to
it (the population at the site where existing_doc_id is set around the former
doc_ingest path, e.g., the code at the location that sets existing_doc_id near
line 452) and clean up related variables/usages so the struct and process_file
logic only include necessary fields; ensure compilation by removing or
refactoring any code that referenced existing_doc_id (search for
existing_doc_id, its assignment, and any related comments) and run cargo
build/tests to confirm no remaining references.

ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between 0fddf11 and 0181448.

📒 Files selected for processing (1)

src/openhuman/vault/sync.rs

…tion CodeRabbit nitpick on tinyhumansai#2720: `existing_doc_id` was populated from the prior-sync ledger row for the legacy `doc_ingest` path (used as the update key for `memory_docs`), but the memory_tree pipeline keys on `source_id` directly so the field has no reader after the migration. While here, dropping `namespace` for the same reason — the legacy `IngestDocParams.namespace = "vault:<id>"` no longer flows anywhere because the new path encodes the same scope into the stable `source_id = vault:{vault_id}:{rel_path}` and tags chunks with `vault:{vault_id}` directly. No behavior change — purely struct-side cleanup. 3/3 vault::sync regression tests still pass.

…ansai#2720)

Pre-existing integration test `vault_sync_roundtrip_updates_memory_and_ledger` asserted the old UnifiedMemory behavior via `list_documents(namespace)` — which was the silent-failure surface this PR set out to fix. After the memory_tree migration in tinyhumansai#2720 (commit 0181448), vault sync no longer writes to `memory_docs`, so the legacy assertion was guaranteed to fail.

Replaced the document-list assertions with direct memory_tree probes:

* First sync — assert `count_chunks(&config) > 0` and `is_source_ingested(config, SourceKind::Document, "vault:{id}:{path}")` for both files. Also pin the ledger contract: `document_id` now encodes the memory-tree source_id (prefix `vault:`).
* Second sync — pin the full lifecycle:
  - `notes/one.md` content-updated → still registered (delete + re-ingest via `delete_chunks_by_source` then `ingest_pipeline::ingest_document`).
  - `docs/two.json` file removed → no longer registered (Phase 4 must delete from `mem_tree_ingested_sources`).
  - `docs/three.toml` new file → freshly registered.

Removed the now-dead `documents_from_payload` helper since no caller remains. `memory_global::init` is still called in setup (initialises the in-process memory client the way a real launch would).

Verification: `cargo test --test vault_sync_e2e` passes; full lib suite `cargo test --lib vault` 28/28 pass.

YellowSnnowmann · 2026-05-27T08:36:07Z

justinhsu1477 could you please fix the tests and then push?

…l_sync pollution

`memory_ingestion_status_reflects_initialized_client_snapshot` was asserting `status.queue_depth == 1` against the process-global `IngestionState`, on the assumption that the lock acquired at the top of the test guarantees a clean baseline.

The two sibling tests (`memory_sync_channel_publishes_targeted_event` and `memory_sync_all_publishes_broadcast_event`) call `memory_sync_channel` / `memory_sync_all` → `spawn_manual_sync`, which detaches a `tokio::spawn(...)` background task that runs `composio::run_connection_sync`. That detached task can enqueue an ingestion job via `MemoryClient::store_skill_sync` → `IngestionQueue::submit` → `state.enqueue()`. By the time this test acquires `GLOBAL_MEMORY_TEST_LOCK`, that background work may already have bumped the global counter — so the equality assertion fires with `left: 2, right: 1`.

CI hit this on tinyhumansai#2720 ("Rust Core Tests + Quality") despite the lock because the lock only serialises against tests that *also* acquire it — not the detached worker that was spawned earlier and is still draining. The same flake is reachable on `upstream/main` in CI with a parallel-test ordering that puts the snapshot test directly after one of the sync-channel tests; my PR's added vault sync tests didn't cause it, but they shifted the ordering enough to expose it.

The principled fix is to snapshot the baseline `queue_depth` at the start of the test and assert the delta. That keeps the test's contract intact (an `enqueue + mark_running` round-trip must surface in the snapshot) without depending on the rest of the suite leaving the global state pristine.

Targeted run on the failure window passes:

cargo test --lib -- memory::ops::sync vault::sync → 9/9 pass

justinhsu1477 · 2026-05-27T08:48:24Z

@YellowSnnowmann thanks — pushed `f8b50543` about 5 min before your ping.

The failing test (`memory_ingestion_status_reflects_initialized_client_snapshot`) was asserting `queue_depth == 1` against the process-global `IngestionState`. Sibling tests in the same module call `spawn_manual_sync`, which detaches a `tokio::spawn(...)` that runs `composio::run_connection_sync` → `MemoryClient::store_skill_sync` → `IngestionQueue::submit` → `state.enqueue()`. That background task can still be draining when the snapshot test acquires `GLOBAL_MEMORY_TEST_LOCK`, so the global counter isn't 0.

Fix is to snapshot the baseline and assert the delta — pre-existing flake on `upstream/main`, the added vault sync tests just shifted the parallel-test ordering enough to expose it. Full reasoning in the commit body.
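The baseline-and-delta pattern can be sketched against a process-global counter. Everything here is a simplified assumption: `QUEUE_DEPTH`, `enqueue`, `snapshot_depth`, and `depth_delta_after` are hypothetical names, and a bare atomic stands in for the real `IngestionState`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical process-global counter standing in for the shared queue depth.
pub static QUEUE_DEPTH: AtomicUsize = AtomicUsize::new(0);

/// Simulates a job being enqueued, by the test or by unrelated background work.
pub fn enqueue() {
    QUEUE_DEPTH.fetch_add(1, Ordering::SeqCst);
}

pub fn snapshot_depth() -> usize {
    QUEUE_DEPTH.load(Ordering::SeqCst)
}

/// Runs `action` and reports how much the depth grew while it ran.
/// The result does not depend on what the counter held beforehand.
/// `saturating_sub` guards against underflow if other work drains the
/// queue in between; work that enqueues concurrently during `action`
/// would still be counted, so this only removes the clean-start assumption.
pub fn depth_delta_after<F: FnOnce()>(action: F) -> usize {
    let baseline = snapshot_depth();
    action();
    snapshot_depth().saturating_sub(baseline)
}
```

An assertion on `depth_delta_after(enqueue) == 1` holds whether the counter started at 0 or at some leftover value, which is the property the absolute `queue_depth == 1` check lacked.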

`cargo test --lib -- memory::ops::sync vault::sync` passes 9/9 locally; CI re-running now.

sanil-23 · 2026-05-27T10:39:03Z

+        // singleton therefore can't be assumed to start at queue_depth=0.
+        // Capture the baseline and assert the delta instead.
+        let baseline_depth = state.snapshot().queue_depth;
+


🛑 Blocker — unresolved merge conflict against main. Merging this branch with current main leaves git conflict markers around this line, so the merged tree won't compile. Both sides add the same let baseline_depth = state.snapshot().queue_depth; de-flake (main landed an equivalent fix independently) — just with different comments. Resolve by keeping one baseline_depth capture and dropping the duplicate + markers before this can merge.

sanil-23 · 2026-05-27T10:39:29Z

+                chunks_written,
+                already_ingested,
+            );
+            IngestFileResult::Ingested {


Nitpick — false-success edge case. An ingest_document returning Ok(IngestResult { already_ingested: true, chunks_written: 0, .. }) still lands in this Ingested arm and writes a "success" ledger row. The delete-first guard above prevents this on the normal update path, but on a ledger↔memory-tree desync (ledger wiped while the mem_tree_ingested_sources row survives) it would report success while nothing reaches retrieval — the exact false-success mode this PR exists to kill. Consider a log::warn! (or a distinct result) when chunks_written == 0 && already_ingested.
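One way to make that edge case visible is to classify the pipeline result before writing the ledger row. This is a hedged sketch: `FileOutcome` and `classify` are hypothetical names rather than the PR's `IngestFileResult`, and `eprintln!` stands in for `log::warn!` to keep the example dependency-free.

```rust
/// Hypothetical outcome type for illustration; not the PR's `IngestFileResult`.
#[derive(Debug, PartialEq, Eq)]
pub enum FileOutcome {
    /// The pipeline wrote (or legitimately rewrote) chunks for this source.
    Ingested { chunks_written: usize },
    /// The gate short-circuited and nothing was written: ledger and
    /// memory-tree state probably disagree about this source.
    SuspectedDesync,
}

/// Maps the two pipeline result fields onto an outcome, warning on the
/// combination the review comment singles out.
pub fn classify(source_id: &str, already_ingested: bool, chunks_written: usize) -> FileOutcome {
    if already_ingested && chunks_written == 0 {
        // Stand-in for `log::warn!` in real code.
        eprintln!("warn: {source_id} reported already_ingested with 0 chunks written");
        FileOutcome::SuspectedDesync
    } else {
        FileOutcome::Ingested { chunks_written }
    }
}
```

Returning a distinct variant lets the caller decide whether to skip the ledger write or only log, which are the two options the comment leaves open.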

sanil-23 · 2026-05-27T10:39:37Z

+    // pre-ingest cleanup on content updates. Cloning per-task keeps the
+    // borrow life-cycle simple inside `buffer_unordered` (no `&'a Config`
+    // bouncing through closures).
+    let config_for_workers = config.clone();


Nitpick — per-file Config clone. config.clone() runs once per candidate file as the stream is polled. buffer_unordered keeps only ~4 alive at a time, so memory is bounded, but a large vault still performs N full Config clones. Wrapping in Arc<Config> and cloning the Arc would make this O(1)-per-file. Minor.
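The suggested change can be sketched as follows. The `Config` fields and both functions are made up for illustration, and the loop is synchronous rather than a `buffer_unordered` stream, so this only shows the cloning cost, not the concurrency.

```rust
use std::sync::Arc;

/// Hypothetical config with heap-allocated fields, so a deep clone is not free.
#[derive(Clone, Debug)]
pub struct Config {
    pub workspace_dir: String,
    pub tags: Vec<String>,
}

/// Per-file worker taking shared ownership of the config.
/// The body is a placeholder that only reads from the config.
pub fn process_file(config: Arc<Config>, rel_path: &str) -> String {
    format!("{}/{}", config.workspace_dir, rel_path)
}

/// One deep clone up front, then a reference-count bump per file.
pub fn process_all(config: &Config, files: &[&str]) -> Vec<String> {
    let shared = Arc::new(config.clone());
    files
        .iter()
        .map(|f| process_file(Arc::clone(&shared), f))
        .collect()
}
```

`Arc::clone` copies a pointer and increments an atomic counter, so the per-file cost no longer scales with the size of `Config`.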

sanil-23 · 2026-05-27T10:39:40Z

+        // ledger rows that pre-date the migration and still carry a legacy
+        // UnifiedMemory document_id.
+        let stored_id = prev.document_id.clone();
+        let source_id = if stored_id.starts_with("vault:") {


Question — legacy doc_id fallback is a no-op for genuinely-legacy rows. For ledger rows that pre-date #2705, the stored document_id is a UnifiedMemory id whose data lived in memory_docs, never in memory_tree. Recomputing a vault:{id}:{path} source_id and calling delete_chunks_by_source therefore deletes nothing, and the old UnifiedMemory doc is never cleaned up when such a file is removed. Intentional (deferred to the UnifiedMemory-removal follow-up), or should the fallback branch also doc_delete(prev.document_id) during the migration window so legacy rows don't leak?
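If the fallback were to cover both backends, the decision could look roughly like this. It is a sketch of the option raised in the question, not a description of the code under review; `Cleanup` and `plan_deletion` are hypothetical names.

```rust
/// Hypothetical description of the cleanup calls a vanished file needs.
#[derive(Debug, PartialEq, Eq)]
pub enum Cleanup {
    /// Would map to `delete_chunks_by_source(source_id)`.
    TreeChunks(String),
    /// Would map to a best-effort legacy `doc_delete(document_id)`.
    LegacyDoc(String),
}

/// Decides which backends to clean based on the id stored in the ledger.
pub fn plan_deletion(vault_id: &str, rel_path: &str, stored_id: &str) -> Vec<Cleanup> {
    if stored_id.starts_with("vault:") {
        // Post-migration row: the stored id is already the memory-tree source_id.
        vec![Cleanup::TreeChunks(stored_id.to_owned())]
    } else {
        // Pre-migration row: the stored id points at legacy data, so clean
        // that up and also clear the recomputed memory-tree source_id.
        vec![
            Cleanup::LegacyDoc(stored_id.to_owned()),
            Cleanup::TreeChunks(format!("vault:{vault_id}:{rel_path}")),
        ]
    }
}
```

Treating the legacy step as best-effort would keep a failure there from blocking the memory-tree cleanup.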

sanil-23

PR #2720 — fix(vault): sync writes to memory_tree, not legacy UnifiedMemory (#2705)

Walkthrough

This PR migrates the vault sync path off the legacy UnifiedMemory / memory_docs backend onto the memory-tree pipeline (mem_tree_chunks + mem_tree_ingested_sources), which is what every modern retrieval surface actually reads from. process_file now ingests via ingest_pipeline::ingest_document against a stable source_id = vault:{vault_id}:{rel_path}, deletes prior chunks before re-ingesting on content updates (to defeat the content-blind already_ingested gate), and routes vanished-file deletions through delete_chunks_by_source. The diagnosis and the core sync fix are sound and well-tested.

However, this PR cannot merge as-is. Two issues block it: an unresolved merge conflict against current main, and a stale legacy-backend call in the vault-removal/purge path that this migration leaves orphaning data — the same silent-failure class the PR set out to fix.

Changes

| File | Summary |
| --- | --- |
| `src/openhuman/vault/sync.rs` | Core fix: ingest via memory-tree pipeline with stable `vault:{id}:{rel}` source ids; delete-then-reingest on content change; Phase-4 deletions via `delete_chunks_by_source`; migration-safe legacy doc_id fallback; 3 new unit tests. |
| `tests/vault_sync_e2e.rs` | Updated e2e to assert against `mem_tree_chunks` / `mem_tree_ingested_sources` instead of `list_documents`; drops the unused `documents_from_payload` helper. |
| `src/openhuman/memory/ops/sync.rs` | Test-only: capture `queue_depth` baseline and assert delta (de-flake). Currently carries unresolved conflict markers vs `main`. |

Actionable comments (2)

🛑 Blockers

1. `src/openhuman/memory/ops/sync.rs:370-387` — Unresolved merge conflict against `main`

Merging this branch with the current main leaves git conflict markers in this file, so the merged tree does not compile. Both sides introduce the same fix — let baseline_depth = state.snapshot().queue_depth; — with different explanatory comments; main independently landed an equivalent de-flake. Trivial to resolve (keep one capture, drop the duplicate and the markers), but it must be resolved before this PR can merge. See the inline comment for the suggested resolution.

⚠️ Major

2. `src/openhuman/vault/ops.rs:116-136` — `vault_remove(purge_memory=true)` still purges the legacy backend, orphaning all memory-tree data

Not in the diff, but a direct downstream consequence of this migration. Vault content now lives in mem_tree_chunks / mem_tree_ingested_sources (keyed vault:{id}:{rel_path}), yet the purge path still calls clear_namespace(v.namespace) → MemoryClient::clear_namespace, which only touches the legacy vectors / memory_docs tables (memory_store/vectors/store.rs:386) — tables this PR no longer writes to.

Result: removing a vault with purge now deletes nothing meaningful and orphans every mem_tree_chunks row, so retrieval keeps surfacing content from a deleted vault. That is the exact silent-failure class this PR fixes, re-appearing on the removal side. The in-sync Phase-4 deletion only covers per-file vanishing, not whole-vault removal.

The codebase already has the helper and an established pattern for this (composio/ops.rs:560, channels/controllers/ops.rs:565):

use crate::openhuman::memory_store::chunks::store::delete_chunks_by_source_prefix;
use crate::openhuman::memory_store::chunks::types::SourceKind;

let cfg = config.clone();
let prefix = format!("vault:{id}:");
let purged = tokio::task::spawn_blocking(move || {
    delete_chunks_by_source_prefix(&cfg, SourceKind::Document, &prefix)
})
.await;
// (optionally also keep clear_namespace for migration-window legacy rows)

At minimum, acknowledge this in the PR body's "Out of scope" section if you intend to defer it.
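The trailing colon in the `vault:{id}:` prefix matters for whole-vault purges, which a small in-memory model can show. This is a sketch only: `vault_prefix` and `delete_by_prefix` are hypothetical helpers operating on a plain list of source ids, not the real `delete_chunks_by_source_prefix`.

```rust
/// Builds the purge prefix for one vault, including the trailing colon.
pub fn vault_prefix(vault_id: &str) -> String {
    format!("vault:{vault_id}:")
}

/// Removes every source id starting with `prefix` and returns how many went.
/// A plain Vec stands in for the chunk tables.
pub fn delete_by_prefix(source_ids: &mut Vec<String>, prefix: &str) -> usize {
    let before = source_ids.len();
    source_ids.retain(|id| !id.starts_with(prefix));
    before - source_ids.len()
}
```

With ids such as `vault:1:a.md` and `vault:10:a.md` in the same store, the prefix `vault:1:` removes only vault 1's rows, whereas `vault:1` without the colon would also match vault 10.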

Nitpicks (3)

src/openhuman/vault/sync.rs:250-270 — A successful ingest_document returning already_ingested: true with chunks_written == 0 is still counted as Ingested and writes a "success" ledger row. The delete-first guard prevents this on the normal path, but a ledger↔memory-tree desync would re-introduce false-success reporting. Consider a log::warn! when chunks_written == 0 && already_ingested. (inline)
src/openhuman/vault/sync.rs:474-476 — Config is cloned once per candidate file; Arc<Config> would avoid N copies on large vaults. (inline)
tests/vault_sync_e2e.rs:117-145 vs src/openhuman/vault/sync.rs:748-766 — inconsistent spawn_blocking convention between the e2e and unit tests for the same is_source_ingested / count_chunks calls; pick one. sync_writes_to_memory_tree could also tighten chunks_after > chunks_before to chunks_before == 0 (fresh tempdir).

Questions for the author (2)

src/openhuman/vault/sync.rs:543-552 — for genuinely pre-#2705 ledger rows the recomputed memory-tree source_id has no chunks (data was in memory_docs), so delete_chunks_by_source is a no-op and the legacy doc leaks on deletion. Deferred, or should the fallback also doc_delete the legacy id? (inline)
src/openhuman/vault/types.rs:56 — is Vault.namespace still meaningful post-migration? Its only remaining consumer is the (now-misdirected) purge call; fixing #2 by prefix-delete may make it vestigial.

Verified / looks good

delete_chunks_by_source clears both mem_tree_chunks and mem_tree_ingested_sources (memory_store/chunks/store.rs:809-832), so the delete-then-reingest content-update path genuinely resets the already_ingested gate. The central design assumption holds.
Error handling is fail-closed: a failed pre-reingest delete returns Failed rather than re-ingesting alongside stale chunks; the next sync self-heals.
delete_chunks_by_source and the Phase-4 deletion correctly run under spawn_blocking (SQLite is sync I/O).
sync_writes_to_memory_tree precisely pins the #2705 regression (chunk count up + both source ids registered + ledger stores the vault: source id); the e2e exercises the full update / delete / add lifecycle.

…yhumansai#2705)

Before this fix, `vault::sync::sync_vault` walked the vault directory, read each file, and called `doc_ingest` → `MemoryClient::ingest_doc` → `UnifiedMemory::ingest_document`. UnifiedMemory wrote to the legacy `memory_docs` table. The user's "synced" message was therefore technically correct (the legacy backend accepted the writes), but every modern retrieval surface — `memory.search`, `tree.read_chunk`, `tree.browse`, the agent's recall path, the summary-tree builder — reads from the memory-tree backend (`mem_tree_chunks` + `mem_tree_ingested_sources`). Vault data was invisible to all of them.

That explains the entire bug report (tinyhumansai#2705):

* `SELECT COUNT(*) FROM mem_tree_chunks` returns 0 (vault sync never wrote there)
* `SELECT COUNT(*) FROM mem_tree_ingested_sources` returns 0 (same)
* "Memory wiped: 0 rows removed" (wipe clears memory_tree, which was already empty)
* "Build summary trees" produces no jobs (memory_tree has no chunks)
* Agent can't recall vault content (agent retrieval reads memory_tree)

Root cause: recent PRs (tinyhumansai#2585, tinyhumansai#2556, tinyhumansai#2574) migrated RAG primitives to memory_tree as the canonical layer, but the vault sync path was not migrated alongside it.

## Fix

`process_file` now calls `memory::ingest_pipeline::ingest_document` directly with a stable `source_id` of the form `vault:{vault_id}:{rel_path}`. The pipeline writes to `mem_tree_chunks` + `mem_tree_ingested_sources` — exactly the tables the modern retrieval stack reads from.

Three additional design choices:

1. **Content-update path** — the pipeline's `already_ingested(SourceKind::Document, source_id)` gate is content-blind and the source_id is stable per file path. For real content updates the vault layer drops prior chunks via `memory_store::chunks::store::delete_chunks_by_source` *before* the re-ingest, otherwise the new content gets short-circuited. The vault ledger's `content_hash` check still gates whether we run the delete+reingest at all, so untouched files cost zero pipeline work.
2. **Deletion path** — Phase 4 (`by_path` entries the walk didn't see) now calls `delete_chunks_by_source` instead of the legacy `doc_delete`. Handles migration too: ledger entries whose stored `document_id` doesn't start with `vault:` (rows persisted before this fix) fall back to a recomputed source_id rather than failing.
3. **Vault ledger** — the `VaultFile.document_id` field is now the memory-tree source_id. Schema column name unchanged for backward compat with persisted rows; only the semantic of what we store changed. Deletion uses the prefix-check above to handle the migration window.

## Tests

Three new regression tests in `vault::sync::sync_tests` (28/28 vault suite passes):

* `sync_writes_to_memory_tree` — the tinyhumansai#2705 regression. Creates a vault with two .md files, runs `sync_vault`, asserts `count_chunks` goes up and both source_ids appear in `mem_tree_ingested_sources`. Also pins the ledger contract: `document_id` must start with `vault:` so the deletion-path prefix check stays correct.
* `second_sync_with_no_changes_is_idempotent` — pins that re-sync with unchanged content does not duplicate chunks (the vault-layer hash dedup guards the pipeline).
* `vault_source_id_is_stable_and_namespaced` — unit test on the id format itself; defends against an accidental rename that would break the `already_ingested` gate's cross-file isolation.

`cargo test --lib vault` 28/28 pass. `cargo check --lib` + `cargo fmt --check` + `cargo test --tests --no-run` all clean (lesson from tinyhumansai#2603 — must compile the integration-test target before push).

## Out of scope (separate audits / PRs)

* Composio providers and agent_experience still call `doc_ingest` → UnifiedMemory. If they have a similar gap, that's a separate audit; this PR is scoped to the vault path the user reported.
* Removing `UnifiedMemory` entirely is the larger follow-up senamakel listed on tinyhumansai#2585; out of scope here.
* In-flight tracing during phase 2 / 3 is left at the existing `log::debug!` level — the new ingestion log lines along with the pre-existing entry/exit logs cover the silent-failure surface the issue's reporter would have used to triage.

Closes tinyhumansai#2705.

graycyrus

[APPROVED]

This is a solid fix for #2705. Vault sync now correctly routes through the memory-tree pipeline (mem_tree_chunks + mem_tree_ingested_sources) instead of the legacy UnifiedMemory backend. The migration is clean: stable per-file source_ids ensure idempotency, content updates properly delete stale chunks before re-ingesting, and pre-fix ledger rows are handled gracefully.

Tests are comprehensive — three unit tests pin the regression and the invariants, plus e2e validation of the full lifecycle. Code patterns are solid (async/blocking correctly separated, error context preserved, logging appropriate).

No issues. Ship it.

@sanil-23

…i#2720 Three improvements from @sanil-23's review pass:

1. **Surface ledger↔memory_tree desync as a `warn!` log instead of a silent success** (L263). When `ingest_document` returns `Ok { already_ingested: true, chunks_written: 0 }` we still land in the `Ingested` arm and write a fresh ledger row, but no chunks actually reach retrieval. The delete-first guard above prevents this on the normal update path, so seeing it means ledger and memory_tree are out of sync. That's the exact silent-failure mode this PR set out to kill, so it now logs at `warn!` with the suggestion for a manual `delete_chunks_by_source` resync.
2. **Share a single `Config` allocation across the buffer_unordered workers via `Arc<Config>`** (L474). The previous loop did one `config.clone()` per candidate file. With `Arc` a 5k-file vault pays one clone + N atomic ref-count bumps instead of N full `Config` deep-clones, which is measurably cheaper on cold backfills. Signature change: `process_file(config: Arc<Config>, ...)`.
3. **Run a parallel UnifiedMemory `doc_delete` on the legacy-ledger fallback path** (L544). Pre-tinyhumansai#2705 ledger rows store a UnifiedMemory `{ts}_{hex}` id whose data lives in `memory_docs`, not `mem_tree_*`. Recomputing the memory_tree source_id and running `delete_chunks_by_source` deletes nothing on those rows, so without a parallel `doc_delete` the legacy data leaked until UnifiedMemory removal lands (tinyhumansai#2585 follow-up). The deletion path now does both during the migration window so vanished files actually go away. `doc_delete` failures on the legacy path are best-effort: a 404 / already-absent there shouldn't block the canonical `delete_chunks_by_source` cleanup below.

Tests: 9/9 in `memory::ops::sync` + `vault::sync` pass. `cargo check --lib`, `cargo fmt --check`, and `cargo test --tests --no-run` all clean.

justinhsu1477 · 2026-05-27T11:10:57Z

@sanil-23 thanks for the thorough pass — all four addressed in `e7c61b14` (rebased on `upstream/main` so the #2737 conflict is gone).

🛑 L378 conflict — Resolved by rebase: dropped my standalone `baseline_depth` commit since #2737 already landed the same de-flake with a cleaner `state.reset_for_test()` helper. Kept that, dropped mine.

L263 false-success — Added a `log::warn!` for the `already_ingested == true && chunks_written == 0` branch. The delete-first guard prevents this on the normal update path, so seeing it means ledger ↔ memory_tree desync — exactly the silent mode this PR exists to kill. Log line names the source_id + suggests a manual `delete_chunks_by_source` resync.
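The check behind that warning might look roughly like the following. It is only a sketch: `IngestOutcome` is a hypothetical mirror of the two fields named above, `desync_warning` is an invented helper, and the message wording is illustrative rather than the PR's actual log line. The real code emits via `log::warn!`; here the message is returned so the sketch needs no logging crate.

```rust
/// Hypothetical mirror of the two result fields named in the discussion.
pub struct IngestOutcome {
    pub already_ingested: bool,
    pub chunks_written: usize,
}

/// Returns a warning message when the pipeline reported success but nothing
/// reached the chunk table; returns `None` for a normal ingest.
pub fn desync_warning(source_id: &str, outcome: &IngestOutcome) -> Option<String> {
    if outcome.already_ingested && outcome.chunks_written == 0 {
        Some(format!(
            "vault sync: source {source_id} reported already_ingested with 0 chunks written; \
             ledger and memory_tree look out of sync, consider a manual \
             delete_chunks_by_source resync"
        ))
    } else {
        None
    }
}
```

A caller would pass any `Some(msg)` to `log::warn!("{msg}")` before writing the ledger row, so the desync is visible without changing the sync result.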

L474 per-file Config clone — Switched to `Arc<Config>`. `process_file` now takes `Arc<Config>`, the call site does one outer `Arc::new(config.clone())` then `Arc::clone(&...)` per candidate. A 5k-file backfill pays one deep clone + N ref-count bumps instead of N full clones.
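The sharing pattern can be sketched with the standard library alone. This illustration swaps the PR's `buffer_unordered` async workers for plain `std::thread` workers so it stays dependency-free; the `Config` fields, the `process_all` function, and the body of `process_file` are placeholders, and only the `Arc<Config>` parameter shape comes from the discussion.

```rust
use std::sync::Arc;
use std::thread;

/// Placeholder config; the real `Config` is much larger, which is why
/// cloning it once per file is worth avoiding.
#[derive(Clone)]
pub struct Config {
    pub workspace: String,
}

/// Takes shared ownership of the config instead of a private deep copy.
fn process_file(config: Arc<Config>, rel_path: String) -> String {
    format!("{}/{}", config.workspace, rel_path)
}

/// One deep clone up front, then one cheap ref-count bump per file.
pub fn process_all(config: &Config, files: Vec<String>) -> Vec<String> {
    let shared = Arc::new(config.clone());
    let handles: Vec<_> = files
        .into_iter()
        .map(|rel_path| {
            let cfg = Arc::clone(&shared); // atomic increment, no deep copy
            thread::spawn(move || process_file(cfg, rel_path))
        })
        .collect();
    // Joining in spawn order keeps results aligned with the input order.
    handles
        .into_iter()
        .map(|h| h.join().expect("worker panicked"))
        .collect()
}
```

The same ownership shape carries over to async workers: each task moves its own `Arc` clone, and the underlying `Config` is dropped once the last worker finishes.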

L544 legacy fallback no-op — You're right, it was silently leaving legacy `memory_docs` rows behind. Migration-window fix: when `document_id` doesn't start with `vault:` (= pre-#2705 ledger row), the deletion path now runs `doc_delete(prev.document_id)` against UnifiedMemory in parallel with `delete_chunks_by_source` against the recomputed memory_tree source_id. `doc_delete` failures are best-effort (a 404 on the legacy path shouldn't block the canonical memory_tree cleanup).
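The branching described here can be expressed as a small planning function. This is a sketch under assumptions: `DeletionPlan` and `plan_deletion` are hypothetical names, and the real code performs the deletions rather than returning a plan. Only the `vault:` prefix test and the two deletion targets come from the comment above.

```rust
/// Hypothetical description of what the deletion path should touch.
#[derive(Debug, PartialEq)]
pub struct DeletionPlan {
    /// Always recomputed; target of `delete_chunks_by_source`.
    pub tree_source_id: String,
    /// Set only for pre-migration ledger rows; target of the legacy `doc_delete`.
    pub legacy_doc_id: Option<String>,
}

/// Decides which backends a vanished file must be removed from.
pub fn plan_deletion(vault_id: &str, rel_path: &str, ledger_document_id: &str) -> DeletionPlan {
    let tree_source_id = format!("vault:{vault_id}:{rel_path}");
    // A ledger id without the `vault:` prefix is a legacy UnifiedMemory id.
    let legacy_doc_id = if ledger_document_id.starts_with("vault:") {
        None
    } else {
        Some(ledger_document_id.to_owned())
    };
    DeletionPlan { tree_source_id, legacy_doc_id }
}
```

Treating the legacy delete as optional matches the best-effort stance above: a failure on `legacy_doc_id` can be logged and ignored, while the `tree_source_id` cleanup always runs.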

`cargo test --lib -- memory::ops::sync vault::sync` → 9/9 pass.
`cargo check --lib` + `cargo fmt --check` + `cargo test --tests --no-run` all clean.

Pushed as force-with-lease since the rebase was unavoidable.

justinhsu1477 requested a review from a team May 27, 2026 00:57

coderabbitai Bot added the memory (Memory store, memory tree, recall, summarization, and embeddings in src/openhuman/memory/) and bug labels May 27, 2026

coderabbitai Bot reviewed May 27, 2026

coderabbitai Bot previously approved these changes May 27, 2026

justinhsu1477 dismissed coderabbitai[bot]’s stale review via 1e5002a May 27, 2026 01:09

coderabbitai Bot added the working (A PR that is being worked on by the team) label May 27, 2026

coderabbitai Bot previously approved these changes May 27, 2026

justinhsu1477 dismissed coderabbitai[bot]’s stale review via b8bdb09 May 27, 2026 05:56

coderabbitai Bot previously approved these changes May 27, 2026

justinhsu1477 dismissed coderabbitai[bot]’s stale review via f8b5054 May 27, 2026 08:46

coderabbitai Bot previously approved these changes May 27, 2026

sanil-23 reviewed May 27, 2026

sanil-23 requested changes May 27, 2026

justinhsu1477 added 3 commits May 27, 2026 19:02

graycyrus previously approved these changes May 27, 2026

justinhsu1477 dismissed stale reviews from graycyrus and coderabbitai[bot] via e7c61b1 May 27, 2026 11:10

justinhsu1477 force-pushed the fix/vault-sync-silent-failure branch from f8b5054 to e7c61b1 May 27, 2026 11:10

coderabbitai Bot added the rust-core (Core Rust runtime in src/: CLI, core_server, shared infrastructure) label May 27, 2026

coderabbitai Bot approved these changes May 27, 2026


sanil-23 left a comment


PR #2720 — fix(vault): sync writes to memory_tree, not legacy UnifiedMemory (#2705)

Walkthrough

Changes

Actionable comments (2)

🛑 Blockers

1. `src/openhuman/memory/ops/sync.rs:370-387`: Unresolved merge conflict against `main`

⚠️ Major

2. `src/openhuman/vault/ops.rs:116-136`: `vault_remove(purge_memory=true)` still purges the legacy backend, orphaning all memory-tree data

Nitpicks (3)

Questions for the author (2)

Verified / looks good


4 participants
