Skip to content

fix(mcp): instrument recall_query tool handler for telemetry (v0.3.3)#39

Merged
tznthou merged 2 commits into
mainfrom
fix/mcp-recall-query-telemetry
May 22, 2026
Merged

fix(mcp): instrument recall_query tool handler for telemetry (v0.3.3)#39
tznthou merged 2 commits into
mainfrom
fix/mcp-recall-query-telemetry

Conversation

@tznthou
Copy link
Copy Markdown
Owner

@tznthou tznthou commented May 22, 2026

Summary

  • Fix v0.3.2 telemetry coverage gap: MCP recall_query handler was missing the appendRecallTelemetry call (HTTP /memory/query had it).
  • Add 2 regression tests in tests/mcp.test.ts (hit + miss cases).
  • Bump 0.3.2 → 0.3.3 + CHANGELOG entries (EN/ZH).

Why this is a bug

Most clients reach the daemon via MCP, not direct HTTP. The v0.3.2 ship instrumented only the HTTP route, so the telemetry log (~/.ccrecall/recall-query.log.jsonl) captured only ship smoke-test traffic. Post-ship 22h accumulated exactly 1 row (the smoke-test row); all real recall_query MCP calls were silently dropped.

Test plan

  • pnpm vitest run tests/mcp.test.ts — 25 pass (2 new regression)
  • pnpm vitest run — 598 pass (39 files)
  • pnpm eslint src/mcp/tools.ts tests/mcp.test.ts --fix — clean
  • Post-merge smoke: daemon upgrade → MCP query → verify log row appended

Observation period reset

Any consumer of the v0.3.2 7d hit-rate window should restart from v0.3.3 ship date. Original 5/28 deadline becomes 6/04. Cold-rate estimates from the bug window are not reliable evidence.

Follow-up

The root cause (per-handler telemetry call is fragile) will be tracked in a separate issue covering broader telemetry/observability layer design gaps for v0.4.0+ planning.

🤖 Generated with Claude Code

tznthou and others added 2 commits May 22, 2026 23:12
v0.3.2 added appendRecallTelemetry to GET /memory/query but not to the
MCP recall_query handler in src/mcp/tools.ts. Most clients reach the
daemon via MCP rather than direct HTTP, so the telemetry log
(~/.ccrecall/recall-query.log.jsonl) only captured ship smoke-test
traffic — actual user calls were silently dropped (1 row in 22h
post-ship, the smoke-test row itself).

Fix: recallQueryHandler now calls appendRecallTelemetry with
hitCount=emittedIds.length (post-budget, matches HTTP semantics) and
projectId=null (MCP schema does not carry a project parameter).

Regression: 2 tests in tests/mcp.test.ts using
CCRECALL_RECALL_TELEMETRY_PATH to isolate test log.

Observation period reset: v0.3.2 7d hit-rate window must restart from
0.3.3 ship date; samples gathered during the bug window only reflect
HTTP-direct traffic.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex adversarial review on PR #39 caught a guideline violation: the
v0.3.3 fix in this branch added an unconditional appendRecallTelemetry
call to recallQueryHandler, but existing tests in tests/mcp.test.ts
(query='nonexistent'/'Apache'/'apache'/'keyword'/'searchable') and
tests/touch-integration.test.ts (query='alpha'/'beta'/'repeated'/
'nothing') do not override CCRECALL_RECALL_TELEMETRY_PATH. Every
`pnpm vitest run` therefore appended fixture queries to the operator's
real ~/.ccrecall/recall-query.log.jsonl — violating CLAUDE.md "tests
must not touch production data: use mkdtemp isolation" and contaminating
the very log this hotfix is making trustworthy.

Fix: describe-level beforeEach in tests/mcp.test.ts (MCP recall_query /
recall_save blocks) and file-level beforeEach in tests/touch-integration
.test.ts set CCRECALL_RECALL_TELEMETRY_PATH to a tmpDir path, restore
original in afterEach. Verified: log file row count unchanged before
and after a full `pnpm vitest run` (598 pass).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant