You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Fixes#35. contact_supervisor could fail to reach a still-connected supervisor when the supervisor was targeted by a display name/fallback alias that more than one session shared. The child was told to use contact_supervisor, but target resolution threw Multiple sessions named "..." and the ask was never sent.
This change lets contact_supervisor route by a stable broker session ID first, falling back to the existing name-based target for backwards compatibility.
Root cause
readChildOrchestratorMetadata() only captured PI_SUBAGENT_ORCHESTRATOR_TARGET (a display name or ID), and contact_supervisor resolved it via resolveSessionTarget(...). When two connected sessions shared that name, resolution threw before the message was sent, and the child had no ID to retry with.
What changed
Stable-ID routing: contact_supervisor now prefers a stable broker session ID when one is available, then falls back to the name target.
New optional PI_SUBAGENT_ORCHESTRATOR_SESSION_ID for the supervisor's broker session ID.
If absent, an inherited PI_INTERCOM_SESSION_ID from the parent process environment is used as the supervisor ID.
PI_SUBAGENT_ORCHESTRATOR_TARGET remains for display and name-based routing, so existing behavior is unchanged when no ID is provided.
ID publication: each connected session now publishes its own broker session ID as PI_INTERCOM_SESSION_ID in process.env, so a pi-subagents bridge that inherits the parent environment can route children back to the exact parent session. The previous value is saved and restored on disconnect/shutdown to avoid leaking state across sessions.
Graceful fallback: resolveSupervisorTarget(...) tries the session ID, then the name; if a stale ID can no longer be resolved, it falls back to the name target instead of failing hard. When resolution still fails and no stable ID was supplied, the error now hints which env vars enable stable routing.
Docs: README "When the Tool Appears" documents the new env contract and routing precedence.
Backwards compatibility
No pi-subagents change is required for this PR to be safe to merge: when the new env vars are absent, routing is identical to today.
To fully resolve the duplicate-name failure end to end, pi-subagents should populate PI_SUBAGENT_ORCHESTRATOR_SESSION_ID (or rely on inherited PI_INTERCOM_SESSION_ID). This PR makes pi-intercom ready for that without breaking the current name-based path.
Tests
Added focused integration tests in intercom.integration.test.ts:
sessions publish their broker id for subagent inheritance (and restore the prior value on shutdown)
contact_supervisor routes by stable broker id when two supervisors share a name
contact_supervisor uses an inherited parent broker id when no explicit id is set
contact_supervisor falls back to the name target when the broker id is stale
Refactored duplicated client setup into a connectTestClient(name) helper.
Reviewed this — the diagnosis is right: resolveSessionTarget resolves the display name/fallback alias via findSessions with no uniqueness, so a send / contact_supervisor to a shared name either errors with "Multiple sessions named …" or lands in an unrelated session. Routing by a stable broker session ID first (name as display/fallback only) is the right call, and keeping today's behavior when the env vars are absent is a good backward-compatible landing.
While reviewing I dug into the surrounding code and found the related issues that share this root cause (name-based addressing with no identity binding):
Before I'd want this merged, I'm going to validate the fix locally first:
Run the unit/integration suite on this branch (npm test).
Real-world repro in a tmux session with two pi agents — ideally stripped down to just pi-intercom (no full pi-subagents stack) to isolate the intercom layer — loading the local extension build via pi -e. Reproduce the duplicate-name case (two sessions sharing a name), confirm the misdelivery/ambiguity on main, then confirm correct routing on this branch.
Heads up: as the PR notes, fully closing this end-to-end needs the pi-subagents bridge to set PI_SUBAGENT_ORCHESTRATOR_SESSION_ID (or inherit PI_INTERCOM_SESSION_ID). Happy to take that side on a pi-subagents branch once this lands.
Validation results (local)
Ran the plan above. Setup: cloned nicobailon/pi-intercom (bare) with worktrees at origin/main (5caa4aa, v0.6.0) and this PR's head fix/stable-supervisor-session-id (b4db34e); npm install in both. Confirmed childOrchestratorMetadata = readChildOrchestratorMetadata() runs at extension-load (index.ts:1082), i.e. before the child connects and publishIntercomSessionId overwrites PI_INTERCOM_SESSION_ID — so the inherited parent id is captured correctly.
1. Unit/integration suite (npm test)
main (5caa4aa): 36/36 pass (pre-existing tests; the routing tests don't exist here).
✔ sessions publish their broker id for subagent inheritance
✔ child supervisor tool routes by stable broker id when supervisor names collide ← the core fix
✔ child supervisor tool uses inherited parent broker id when explicit id is absent ← inherited-id path
✔ child supervisor tool falls back to target when broker id is stale ← stale-id fallback
These drive the realindex.ts extension + a real spawned broker + real IntercomClient (via a harness, no LLM), exercising resolveSupervisorTarget/resolveSessionTarget against a live broker with two colliding session names. The fix is confirmed at the routing layer.
One caveat: in a first full-suite run, child supervisor tool preserves delivery failure reasons showed as ✖, but it passes in isolation and in a clean full re-run (40/40) — a test-harness socket/ordering flake (the new tests share a per-process broker socket and depend on cleanup ordering), not a logic regression. Worth tightening for determinism before merge.
2. Real-world repro (tmux, two real pi agents, just intercom via pi -e)
Loaded each branch's own extension with pi --no-extensions -e <worktree>/index.ts (so the installed 0.6.0 didn't interfere), on the real per-user broker. Started two real pi sessions both named repro-collide (collision), then a one-shot intercom send to:"repro-collide":
main:RESULT: Multiple sessions named repro-collide
Identical ambiguity on both — i.e. this PR does not change the general send/ask-by-name path (only contact_supervisor). That matches the design, but it means the broad misdelivery vector tracked in #16 (any send to a non-unique name) is still open live and should be the next layer (broker-side: reject duplicate name on register + verify peer UID), independent of this PR.
3. Why I stopped short of a real-pi contact_supervisor repro
The inherited-id path needs the child to inherit PI_INTERCOM_SESSION_ID from the parent process env — i.e. the pi-subagents parent→child spawn flow, which is exactly the pi-subagents-side change noted as still needed. With "just intercom" there's no clean way to obtain the supervisor's full broker UUID: intercom list only renders short id prefixes, not the full UUID that PI_INTERCOM_SESSION_ID requires. So the authoritative validation of the fix's routing is the integration tests in (1); a full end-to-end real-pi demonstration will need the pi-subagents bridge to set PI_SUBAGENT_ORCHESTRATOR_SESSION_ID (or rely on inherited PI_INTERCOM_SESSION_ID).
Verdict
The fix is correct and validated at the routing layer (40/40, including colliding-name + inherited-id + stale-fallback against a real broker). End-to-end closure still needs the pi-subagents side. Separately, the general send-by-name misdelivery (#16) is reproduced live and remains unfixed by this PR — recommend tracking that as follow-up broker-side hardening (duplicate-name rejection + peer UID verification on register).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes #35.
contact_supervisorcould fail to reach a still-connected supervisor when the supervisor was targeted by a display name/fallback alias that more than one session shared. The child was told to usecontact_supervisor, but target resolution threwMultiple sessions named "..."and the ask was never sent.This change lets
contact_supervisorroute by a stable broker session ID first, falling back to the existing name-based target for backwards compatibility.Root cause
readChildOrchestratorMetadata()only capturedPI_SUBAGENT_ORCHESTRATOR_TARGET(a display name or ID), andcontact_supervisorresolved it viaresolveSessionTarget(...). When two connected sessions shared that name, resolution threw before the message was sent, and the child had no ID to retry with.What changed
contact_supervisornow prefers a stable broker session ID when one is available, then falls back to the name target.PI_SUBAGENT_ORCHESTRATOR_SESSION_IDfor the supervisor's broker session ID.PI_INTERCOM_SESSION_IDfrom the parent process environment is used as the supervisor ID.PI_SUBAGENT_ORCHESTRATOR_TARGETremains for display and name-based routing, so existing behavior is unchanged when no ID is provided.PI_INTERCOM_SESSION_IDinprocess.env, so api-subagentsbridge that inherits the parent environment can route children back to the exact parent session. The previous value is saved and restored on disconnect/shutdown to avoid leaking state across sessions.resolveSupervisorTarget(...)tries the session ID, then the name; if a stale ID can no longer be resolved, it falls back to the name target instead of failing hard. When resolution still fails and no stable ID was supplied, the error now hints which env vars enable stable routing.Backwards compatibility
pi-subagentschange is required for this PR to be safe to merge: when the new env vars are absent, routing is identical to today.pi-subagentsshould populatePI_SUBAGENT_ORCHESTRATOR_SESSION_ID(or rely on inheritedPI_INTERCOM_SESSION_ID). This PR makespi-intercomready for that without breaking the current name-based path.Tests
Added focused integration tests in
intercom.integration.test.ts:contact_supervisorroutes by stable broker id when two supervisors share a namecontact_supervisoruses an inherited parent broker id when no explicit id is setcontact_supervisorfalls back to the name target when the broker id is staleRefactored duplicated client setup into a
connectTestClient(name)helper.