memory::ops tests flake under cargo-llvm-cov from shared global client (queue residue + SQLite write contention)

## Summary

`memory::ops` unit tests intermittently fail under **Rust Core Coverage (cargo-llvm-cov)** with two signatures — an inflated `IngestionState.queue_depth`, and a `tool_rule_put` DB write erroring mid-test. Both stem from `memory::ops` tests sharing one process-global memory client whose async lifecycle isn't fully gated by the test lock.

## Problem

**Observed (CI, PR #2717 / #2721 runs):**
- `memory::ops::sync::tests::memory_ingestion_status_reflects_initialized_client_snapshot` — `assert_eq!(queue_depth, 1)` got `2`. *(addressed by #2721)*
- `memory::ops::tool_memory::tests::tool_rules_for_prompt_sorts_by_priority_and_tool_name` — second `tool_rule_put(...).expect("put normal")` returned `Err` (the first `put` succeeded). *(still open — this issue)*

Both tests **pass deterministically in isolation**; they only fail under `cargo-llvm-cov`'s slower parallel timing, and intermittently (~1/9500).

**Root cause:** All `memory::ops` tests share one process-global memory client — one SQLite DB and one `IngestionState` — pinned by `ensure_shared_memory_client` (`memory/ops/test_support.rs`). `GLOBAL_MEMORY_TEST_LOCK` (`memory/ops/mod.rs`) serializes the test **bodies**, but it does **not** cover:
1. **`IngestionState.queue_depth`** (an `AtomicUsize`) — never reset between tests, so residue accumulates. *(mitigated for the sync test in #2721 via a delta assertion; the underlying residue remains.)*
2. **The shared SQLite connection / client's async lifecycle / background ingestion** — a write can hit transient contention (`SQLITE_BUSY`/`IOERR`) when work from a prior test outlives the body-lock. This is the `tool_memory` `.expect("put normal")` failure.

**Steps to reproduce:** non-deterministic; runs `cargo llvm-cov` (full core test suite, parallel) repeatedly. Platform: CI Linux runner (`cargo-llvm-cov`), toolchain 1.93.0. Not seen on the plain `cargo test` runner.

## Solution (optional)

Candidate fixes (each has trade-offs — pick per maintainer judgement):
- **Reset the shared `IngestionState` / drain in-flight ingestion at the start of each locked test**, so each test begins from a clean global.
- **Quiesce/await the shared client's background ingestion under `GLOBAL_MEMORY_TEST_LOCK`** so async work cannot bleed past the body-lock.
- **Per-test isolated client** (fights the deliberate shared-client design in `ensure_shared_memory_client`).
- **SQLite busy-timeout/retry** on memory-store writes (a production change; there is prior art in `whatsapp_data/sqlite_retry.rs`).

Note: #2649 added `GLOBAL_MEMORY_TEST_LOCK` for `memory::global`, and #2712 pinned `OPENHUMAN_WORKSPACE` under `TEST_ENV_LOCK` for documents tests — this is a remaining gap on the `memory::ops` client *lifecycle/queue* axis.

## Acceptance criteria

- [ ] **Repro gone** — `memory::ops` tests no longer flake under `cargo-llvm-cov` (validate by repeated CI runs / a stress loop).
- [ ] **Regression safety** — the chosen isolation/reset mechanism is covered so the bleed can't return.
- [ ] **Diff coverage ≥ 80%** — fix PR meets the changed-lines gate.
- [ ] The `tool_memory` `.expect("put normal")` contention failure specifically no longer occurs.

## Related

- #2721 — hardens the `sync` ingestion-status assertion (delta vs absolute); partial mitigation.
- #2717 — where these flakes were observed (unrelated chat fix; the coverage job flaked 3× on `memory::ops`).
- #2649, #2712 — prior `memory` test-isolation work this builds on.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

memory::ops tests flake under cargo-llvm-cov from shared global client (queue residue + SQLite write contention) #2722

Summary

Problem

Solution (optional)

Acceptance criteria

Related

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

memory::ops tests flake under cargo-llvm-cov from shared global client (queue residue + SQLite write contention) #2722

Description

Summary

Problem

Solution (optional)

Acceptance criteria

Related

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions