
Prevent mutual-ask deadlock; make ask timeout configurable #42

Open
iRonin wants to merge 1 commit into nicobailon:main from iRonin:fix/mutual-ask-deadlock-guard


@iRonin iRonin commented Jun 13, 2026


Fixes #41. Relates to #14 (configurable timeout) and #29 (the single-session concurrent-ask crash, a different failure mode).

Problem

Two sessions that ask each other at the same time deadlock: each ask blocks its caller's turn, so neither is idle to answer the other, and both only unblock at the 10-minute reply timeout. See #41 for the full reproduction.

Fix

Broker-side cycle guard (the broker is the only component with a global view of all sessions):

  • Track each session's single outstanding ask edge (asker -> recipient).
  • When an ask (expectsReply) arrives whose recipient is already awaiting a reply from the sender, refuse it immediately with a delivery_failed carrying the message "Mutual ask refused". The caller's existing !delivered path turns this into an immediate tool error, so the ask fails fast and the caller can fall back to send/reply.
  • FIFO: whichever ask the broker dequeues first installs the edge and wins; the reverse one is refused. A plain send (no reply expected) is untouched.
  • Edges clear on reply (replyTo match), on disconnect, and via a TTL.
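
For readers who want a concrete picture of the guard described in the bullets above, here is a minimal sketch. It is not the code in this PR: the class name `AskEdgeGuard`, its method names, and the `messageId` field are hypothetical, chosen only to illustrate one edge per asker, refusal of the reverse ask, and clearing on reply, disconnect, and TTL expiry.

```typescript
// Hypothetical sketch; names and shapes are illustrative, not the PR's code.
type SessionId = string;

interface AskEdge {
  recipient: SessionId;
  messageId: string; // a later reply's replyTo must match this id
  expiresAt: number; // epoch ms; the TTL mirrors the reply timeout
}

class AskEdgeGuard {
  // At most one outstanding ask edge per asker (asker -> recipient).
  private edges = new Map<SessionId, AskEdge>();
  private ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Returns false when the recipient is already awaiting a reply from the
  // asker. The broker would then answer with delivery_failed.
  tryAsk(
    asker: SessionId,
    recipient: SessionId,
    messageId: string,
    now: number = Date.now(),
  ): boolean {
    const reverse = this.edges.get(recipient);
    if (reverse !== undefined && reverse.expiresAt <= now) {
      this.edges.delete(recipient); // the TTL has passed: drop the stale edge
    } else if (reverse !== undefined && reverse.recipient === asker) {
      return false; // mutual ask: the first edge installed wins
    }
    this.edges.set(asker, { recipient, messageId, expiresAt: now + this.ttlMs });
    return true;
  }

  // A reply clears the asker's edge only if it comes from the expected
  // recipient and its replyTo matches the original ask.
  onReply(replier: SessionId, asker: SessionId, replyTo: string): void {
    const edge = this.edges.get(asker);
    if (edge !== undefined && edge.recipient === replier && edge.messageId === replyTo) {
      this.edges.delete(asker);
    }
  }

  // A disconnect clears edges that start or end at the session.
  onDisconnect(session: SessionId): void {
    this.edges.delete(session);
    this.edges.forEach((edge, asker) => {
      if (edge.recipient === session) this.edges.delete(asker);
    });
  }
}
```

In this sketch a plain send never calls `tryAsk`, which is how it stays unaffected by the guard.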

Configurable reply timeout (relates to #14): the previously hardcoded 10-minute timeout (in both waitForReply and ReplyTracker) is now PI_INTERCOM_ASK_TIMEOUT_MS, read by both the extension and the broker so the edge TTL stays aligned with the caller-side timeout (a timed-out ask leaves no phantom edge). Values below 1s are ignored.
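
A minimal sketch of how the timeout could be resolved, for illustration only. The function name `resolveAskTimeoutMs` comes from the tests below, but its signature here is an assumption, as are the constant names and the choice to treat "ignored" as "fall back to the 10-minute default".

```typescript
// Hypothetical sketch; the real signature in this PR may differ.
const DEFAULT_ASK_TIMEOUT_MS = 10 * 60 * 1000; // the former hardcoded 10 minutes
const MIN_ASK_TIMEOUT_MS = 1000; // values below 1s are ignored

function resolveAskTimeoutMs(raw: string | undefined): number {
  if (raw === undefined) return DEFAULT_ASK_TIMEOUT_MS;
  const parsed = Number(raw.trim());
  // Junk (NaN, Infinity) and values below 1s fall back to the default.
  if (!Number.isFinite(parsed) || parsed < MIN_ASK_TIMEOUT_MS) {
    return DEFAULT_ASK_TIMEOUT_MS;
  }
  return Math.floor(parsed);
}
```

Under these assumptions, both the extension and the broker would call `resolveAskTimeoutMs(process.env.PI_INTERCOM_ASK_TIMEOUT_MS)`, so the edge TTL and the caller-side timeout resolve to the same value.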

Tests

  • mutual ask is refused immediately so two sessions cannot deadlock: A asks B; the reverse B→A ask fails fast with Mutual ask refused; a plain send and a reply still work; after the reply clears the edge, B can ask A. (Asserts the reverse ask fails — the test fails if the guard is removed or is over-broad.)
  • resolveAskTimeoutMs honours PI_INTERCOM_ASK_TIMEOUT_MS and ignores junk.

npm test → 38/38 passing.

Notes

No protocol/message-type changes — reuses the existing delivery_failed path and the expectsReply / replyTo fields the broker already receives. No new dependencies.

Two sessions that issue a reply-waiting `ask` to each other at the same
time deadlock: each blocks its turn waiting for a reply the other cannot
send while it is itself blocked. Both only unblock at the 10-minute reply
timeout.

The broker now tracks each session's single outstanding ask edge
(asker -> recipient). When an `ask` (expectsReply) arrives whose recipient
is already awaiting a reply from the sender, the broker refuses it
immediately with a "Mutual ask refused" delivery_failed instead of
delivering, so the caller fails fast and can fall back to send/reply.
Whichever ask the broker dequeues first wins (FIFO); a plain send is never
affected. Edges clear on reply, on disconnect, and via a TTL equal to the
reply timeout (so a timed-out ask leaves no phantom edge).

The reply timeout (previously a hardcoded 10 minutes in both waitForReply
and ReplyTracker) is now configurable via PI_INTERCOM_ASK_TIMEOUT_MS,
read by both the extension and the broker so their values stay aligned.

Tests: mutual-ask refusal round-trip (reverse ask fails fast; plain send
and reply still work; edge clears so a later ask succeeds) and
resolveAskTimeoutMs env parsing.


Development

Successfully merging this pull request may close these issues.

Two sessions that ask each other deadlock until the reply timeout
