feat(wake-slo): recipient-side responder for WAKE_SLO_PROBE_* topics #144

Open
askhatsoltanov1984-lang wants to merge 3 commits into PleasePrompto:main from askhatsoltanov1984-lang:feat/wake-slo-recipient-responder

@askhatsoltanov1984-lang

Summary

Short-circuit AgentComm wake-push payloads matching the Wake SLO probe protocol before the heavy Claude session spawns. The responder acks + replies via the local Qoopia MCP endpoint, so the wake SLO metric measures wake-push transport + recipient-side dispatch rather than LLM session liveness.

Why recipient-side (not sender-side)

A sender-side responder synthesises the pong locally and fakes L2/L3/L4 proof — the sender writes "the recipient acked" without the recipient process ever running. Leo R2 (2026-05-24) BLOCKed that design. This PR implements the responder where it has to live: in the recipient agent's wake handler, so the ack + reply genuinely originate from the recipient.

Mechanism

  • ductor_bot/wake_slo_responder.py: new module. Probe detection is a strict topic + body regex pair (^WAKE_SLO_PROBE_(C2L|L2C)_\d+$ and ^WAKE_SLO_PING <iso-ts>$). On match, the module POSTs two JSON-RPC calls to $QOOPIA_PUBLIC_URL/mcp (stateless mode): agent_ack(message_id, status='auto_pong'), then agent_reply(session_id, body='WAKE_SLO_PONG <echoed-ts>', close=true). Stdlib urllib.request only; no new dependencies.
  • ductor_bot/webhook/observer.py: in _dispatch, immediately after the hook is resolved, check hook.mode == 'wake' and is_probe_payload(payload). If both hold, call respond(). On status: ok, return a WebhookResult with status='success:wake_slo_probe' and skip the Claude dispatch. On any error, the dispatch falls through to the normal wake path, so a broken responder never silently drops real payloads.
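The detection and fall-through behaviour described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the module's actual code: the payload keys (`topic`, `body`), the helper names `probe_timestamp`, `tool_call` and `dispatch`, the use of the MCP `tools/call` envelope, and the plain-string return value (standing in for `WebhookResult`) are all hypothetical. Only the two regexes, the tool names, and the status string come from the description.

```python
from __future__ import annotations

import json
import re
from datetime import datetime
from typing import Callable

# Regex pair quoted in the PR description; <iso-ts> is captured as one token.
TOPIC_RE = re.compile(r"^WAKE_SLO_PROBE_(C2L|L2C)_\d+$")
BODY_RE = re.compile(r"^WAKE_SLO_PING (\S+)$")


def probe_timestamp(payload: dict) -> str | None:
    """Return the echoed timestamp if the payload is a probe, else None."""
    topic, body = payload.get("topic"), payload.get("body")  # assumed keys
    if not isinstance(topic, str) or not isinstance(body, str):
        return None
    # fullmatch also rejects a trailing newline, which a bare `$` would allow.
    if TOPIC_RE.fullmatch(topic) is None:
        return None
    match = BODY_RE.fullmatch(body)
    if match is None:
        return None
    ts = match.group(1)
    normalised = ts[:-1] + "+00:00" if ts.endswith("Z") else ts
    try:
        datetime.fromisoformat(normalised)
    except ValueError:
        return None
    return ts


def is_probe_payload(payload: dict) -> bool:
    return probe_timestamp(payload) is not None


def tool_call(request_id: int, name: str, arguments: dict) -> bytes:
    """Encode one JSON-RPC request body (assumes the MCP tools/call shape)."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }).encode("utf-8")


def dispatch(
    hook_mode: str,
    payload: dict,
    respond: Callable[[dict], dict],
    run_claude: Callable[[dict], str],
) -> str:
    """Short-circuit probes; any responder failure falls through to Claude."""
    if hook_mode == "wake" and is_probe_payload(payload):
        try:
            result = respond(payload)
        except Exception:  # a broken responder must never drop a payload
            result = {"status": "error"}
        if result.get("status") == "ok":
            return "success:wake_slo_probe"
    return run_claude(payload)
```

Keeping `respond` and `run_claude` as injected callables makes the fall-through path easy to exercise without a live MCP endpoint.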

Auth (deployment note)

agent_ack is gated on auth.agent_id == message.recipient_agent_id. The responder must therefore authenticate as the local recipient agent. Sourcing precedence (highest first):

  1. WAKE_SLO_RESPONDER_API_KEY (raw value)
  2. WAKE_SLO_RESPONDER_API_KEY_FILE (path)
  3. QOOPIA_API_KEY_FILE (path; back-compat)
  4. Default ~/.ductor-corsairmain/.secrets/qoopia_api_key

On hosts where the default file holds a steward key, set one of the WAKE_SLO_RESPONDER_* overrides at deploy time so that it resolves to the recipient agent's own key. Otherwise the responder logs a warning and falls through to the Claude path.
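A minimal sketch of the sourcing precedence listed above. The function names `resolve_api_key` and `_read_key_file` are hypothetical, and the behaviour when a configured file is missing or empty (here: continue to the next source) is an assumption; the description only fixes the order of the four sources.

```python
from __future__ import annotations

import os
from pathlib import Path
from typing import Mapping

# Default location from the PR description.
DEFAULT_KEY_FILE = "~/.ductor-corsairmain/.secrets/qoopia_api_key"


def _read_key_file(path: str) -> str | None:
    """Return the stripped file contents, or None if unreadable or empty."""
    try:
        text = Path(path).expanduser().read_text(encoding="utf-8").strip()
    except OSError:
        return None
    return text or None


def resolve_api_key(env: Mapping[str, str] | None = None) -> str | None:
    """Resolve the responder key, highest-precedence source first."""
    env = os.environ if env is None else env

    # 1. Raw value in the environment.
    raw = env.get("WAKE_SLO_RESPONDER_API_KEY", "").strip()
    if raw:
        return raw

    # 2. and 3. Path-valued variables, responder-specific one first.
    for var in ("WAKE_SLO_RESPONDER_API_KEY_FILE", "QOOPIA_API_KEY_FILE"):
        path = env.get(var)
        if path:
            key = _read_key_file(path)
            if key:  # assumption: an unreadable file falls to the next source
                return key

    # 4. Default secrets file (may belong to a steward on some hosts).
    return _read_key_file(DEFAULT_KEY_FILE)
```

Passing `env` explicitly keeps the precedence testable without mutating the process environment.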

Tests

tests/test_wake_slo_responder.py — 9 tests covering: probe-payload regex (positive + 3 negative cases), api-key missing path, full ack+reply round-trip, ack-failure short-circuit, network-error fall-through, SSE-framed MCP response parsing.

9 passed in 0.09s
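For context on the SSE-framed case covered by the tests, a response parser might look roughly like the sketch below. The function name `parse_mcp_response` is hypothetical, and it assumes the endpoint replies either with a plain JSON body or with a single server-sent event whose `data:` field carries the JSON-RPC response; the real module may handle more cases.

```python
import json


def parse_mcp_response(raw: str) -> dict:
    """Decode a JSON-RPC response that is either plain JSON or SSE-framed."""
    text = raw.strip()
    if text.startswith("{"):
        return json.loads(text)

    data_lines = []
    for line in text.splitlines():
        if line.startswith("data:"):
            value = line[len("data:"):]
            # Per the SSE format, one leading space after the colon is dropped.
            if value.startswith(" "):
                value = value[1:]
            data_lines.append(value)
        elif not line.strip() and data_lines:
            break  # a blank line ends the first event

    if not data_lines:
        raise ValueError("no JSON body or SSE data field found")
    # Multiple data lines in one event are joined with newlines.
    return json.loads("\n".join(data_lines))
```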

Integration test (corsair-main on Corsair, 2026-05-24): probe sent from Leo → corsair-main; fresh python process loading the new module with corsair-main's key wrote real ack + reply rows in <100ms; session closed via close=true. Real-DB evidence captured at /srv/qoopia/test-evidence/phase1-item-1-r3-2026-05-24T005244Z.log (Corsair-local).

Compatibility

  • Builds on fix/claude-full-model-provider (PR #143, "Fix provider_for for full Claude model IDs"; its ancestor commit is included).
  • Stateless MCP confirmed via Qoopia code review (sessionIdGenerator: undefined).
  • No new dependencies. No schema migration.
  • Falls through to existing wake-handler on any responder error.

Test plan

  • Unit tests pass (uv run pytest tests/test_wake_slo_responder.py).
  • Integration test on Corsair (real Qoopia, recipient-side ack + reply written, session closed).
  • Cross-host probe round-trip after Leo-side responder ships (out of scope for this PR).

askhatsoltanov1984-lang and others added 3 commits May 23, 2026 18:53
ModelRegistry.provider_for previously matched only the short Claude
aliases (opus / sonnet / haiku). Clients that pass canonical Claude
model IDs such as claude-opus-4-7, claude-sonnet-4-6 or
claude-haiku-4-5-20251001 fell through to the codex branch, breaking
Claude routing.

Match the claude- prefix in addition to the short aliases, and add a
unit test covering both old aliases and the new full IDs.

Intercept Qoopia agentcomm_wake payloads matching the Wake SLO probe
protocol before the heavy Claude session spawns. The responder acks +
replies via the local Qoopia MCP endpoint (stateless JSON-RPC over HTTP,
stdlib urllib.request only — no extra deps).

This is the recipient-side counterpart to
/srv/qoopia/code/scripts/wake_slo_probe.ts. A sender-side responder would
synthesize the pong locally and fake L2/L3/L4 proof; the Leo R2 BLOCK
(2026-05-24) rejected that design. The dispatch intercept fires only on
hook.mode == 'wake' AND a
strict topic+body regex match. Any other traffic falls through to the
normal Claude dispatch. Any responder error also falls through, so a
broken MCP endpoint never silently drops real wake-push payloads.

The responder must act as the local recipient agent (e.g., corsair-main)
when calling agent_ack, which is gated on auth.agent_id ==
message.recipient_agent_id. A steward api_key (Qoo) can authenticate
to MCP but cannot ack messages addressed to another agent, so the
recipient agent's own key must be supplied explicitly. Add an env
variable override so deployments can wire the right key without
relying on the default secrets file (which may belong to a steward).