From 9b0078b7beb1c23d271a3f4b14b232a8e8dbe028 Mon Sep 17 00:00:00 2001 From: Gupta-ujjwal14 Date: Sun, 10 May 2026 18:18:23 +0530 Subject: [PATCH 1/8] docs: propose meta-agent planning-mode dispatcher MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Field-notes proposal for a conversational layer over kolu that summarises what each active terminal is asking and dispatches user replies into the right session — a planning-mode-only orchestration surface that quiets itself the moment a terminal is focused. Co-Authored-By: Claude Opus 4.7 (1M context) --- .../0002-meta-agent-planning-dispatcher.md | 81 +++++++++++++++++++ 1 file changed, 81 insertions(+) create mode 100644 docs/proposals/0002-meta-agent-planning-dispatcher.md diff --git a/docs/proposals/0002-meta-agent-planning-dispatcher.md b/docs/proposals/0002-meta-agent-planning-dispatcher.md new file mode 100644 index 000000000..7adb4d049 --- /dev/null +++ b/docs/proposals/0002-meta-agent-planning-dispatcher.md @@ -0,0 +1,81 @@ +--- +title: Meta-Agent Planning-Mode Dispatcher +number: 0002 +status: draft +author: Gupta-ujjwal14 +created: 2026-05-10 +--- + +# 0002 — Meta-Agent Planning-Mode Dispatcher + +## Summary + +A conversational layer over the kolu UI — voice or text — that surfaces what every active terminal is currently asking, lets the user reply into specific terminals through that single layer, and quiets itself the moment a terminal is focused. Active in planning and brainstorming only; the full terminal remains the right surface for implementation and review. + +## Motivation + +Treat this as field notes from one user juggling three projects in parallel. + +The cost is not any single read or reply — it is the constant context reload. Look at terminal A, page back to remember what was happening, reply, switch to B, reload its state, reply, switch to C. Each switch evicts the higher-level thread the user was holding ("how do these three projects relate? 
am I making consistent calls across them?") and forces a rebuild from the terminal contents. With one project this overhead is invisible. With three, it dominates. + +kolu already solves the hardest version of this problem: nudges and actionable-terminal indicators tell the user **which** terminal needs attention. What remains is the trip itself — entering the session to learn **what** it needs. That summarise step happens in the user's head, repeatedly. The proposal lifts it into a layer so the high-level thread can stay loaded. + +The friction is most visible in *parallel brainstorming*: three Claude / opencode sessions exploring related design questions, each periodically asking a yes/no or for a quick steer. Today the only way to keep the conversation moving is to focus each terminal in turn, and the per-trip context-reload tax is the cost of admission. Outside that mode — single-project work, deep implementation, code review — the friction disappears, which is exactly why the scoping below is load-bearing. + +## User-facing behavior + +Two surfaces over a single conversational primitive: + +**Read side.** On demand, the meta-agent surfaces a unified picture of what every active terminal is currently asking. Example phrasing the user might hear or read back: + +> Project A is waiting on a yes/no about the migration shape. Project B finished and wants you to verify the diff. Project C is mid-build with no input expected. + +This does not replace the terminal indicators — it complements them. The indicator says *which*; the meta-agent answers *what*, without the user having to enter the session. + +**Write side.** The user replies through the same conversational layer, and the meta-agent dispatches each reply into the right terminal: + +> Tell A yes, codemod approach. Tell B I'll review in ten. Skip C. + +The terminal still owns the conversation; the meta-agent is a router, not a parallel agent. 
+ +**Voice and text as transports.** Brainstorming flows faster spoken than typed, and a verbal interface meets the user where they already are when thinking hard — pacing, looking away from the screen. Text remains available for situations where voice is inconvenient (open offices, noisy environments). Both feed the same underlying primitive. + +**Auto-quiet on terminal focus.** When the user focuses a specific terminal, the meta-agent stops surfacing summaries and stops accepting dispatches until refocused. Outside planning mode the full terminal context is the right surface; the meta-agent should disappear rather than compete for attention. + +## Prototype + +Not yet attached. CONTRIBUTING notes that proposal+prototype is the strongest form, and a screen recording of the workflow this would replace — the user pacing while orchestrating three projects through a single interface — would communicate the value better than prose. Happy to add one if maintainers want it before accepting; flagging the gap explicitly rather than treating the proposal as complete. + +## Implementation notes + +*Intentionally empty.* The user has no opinion on the *how* and per CONTRIBUTING that is the implementer's job. The Open questions below capture the parts that genuinely need design work. + +## Alternatives considered + +**Status quo: rely on existing nudges.** The actionable-terminal indicators already point the user at the right session. This is sufficient when N=1 or the sessions are unrelated — the trip into a single terminal is cheap. With multiple parallel brainstorms, knowing *which* still requires *entering* the terminal to learn *what*, and the in-head summarise step dominates. + +**Read-only meta layer (no write side).** A summary-on-demand surface without dispatch is cheaper to build and avoids the arbitration questions below. It also captures most of the perceived value. 
Rejected as the primary shape because the orchestration win — staying in the high-level thread instead of dropping into terminals to type replies — is what makes the feature pay rent. A read-only version is a reasonable phase-one if the write side proves contentious. + +**Build it outside kolu as a separate desktop tool that reads kolu's WebSocket.** Possible, but duplicates UI state, loses access to kolu's focus model (so the auto-quiet behavior becomes guesswork), and forces the user to install and maintain a second tool to use one feature. + +**General orchestration / "Claude over Claude".** Out — see Out of scope. + +## Open questions + +- **UI shape vs. kolu's existing layout.** Where does the meta-agent live? A panel inside the existing window, an overlay over the terminal grid / canvas, or a dedicated window that can sit on a second display while the user paces? Each has trade-offs against kolu's current per-folder and per-workspace model that maintainers are better placed to judge. +- **Write-side arbitration when a session is mid-tool-call.** Dispatching a natural-language instruction into a Claude / opencode session that is currently waiting for the user is straightforward. Dispatching while the agent is mid-tool-call is not. Does the dispatcher queue, refuse, or interrupt? Is queueing safe across all supported agent CLIs? +- **Voice: primitive or transport?** The wishlist framed voice as central. On reflection it might be one transport over a more general primitive — *one input that knows which session to route to* — and voice and text are equally valid surfaces over that primitive. Worth deciding early; it changes how the feature is scoped and named. +- **Streaming surface.** Does the read side need a new streaming procedure, or can it be derived from the existing terminal subscriptions plus a server-side classifier of "what is this session asking"? 
+- **Authoring surface for the meta-agent itself.** Is this an LLM call wrapping kolu's existing terminal state, a deterministic surface that templates from session metadata, or a hybrid (deterministic for the read-side ledger, LLM for the dispatcher's NL parsing)? +- **Agent-CLI fragmentation.** Different agent CLIs (Claude Code, opencode, Codex, anyagent) have different prompts, different ways of indicating "waiting on user", and different control surfaces. Does the dispatcher need a per-integration adapter, or can the existing `AgentProvider` abstraction carry it? + +## Out of scope + +This proposal deliberately does **not** address: + +- Implementation- or review-mode summarisation. In those modes a summary is strictly worse than the full terminal — the details that would be lost are exactly the ones the user needs. +- Replacing direct terminal access. The meta-agent is additive; the terminal remains the canonical surface for any work past planning. +- Cross-machine federation. Single-host kolu only. +- Multi-user shared planning sessions. Single-user only. +- Persistence of planning conversations across kolu restarts. Could be a follow-up proposal once the basic shape lands; not required for the initial behavior. +- "Claude over Claude" / general agent orchestration. The planning-mode scoping is load-bearing; this is not a step toward a meta-agent that survives outside brainstorming. From 97458c5885cfe04d91c04c866942bc361549cc62 Mon Sep 17 00:00:00 2001 From: Gupta-ujjwal14 Date: Sun, 10 May 2026 19:31:58 +0530 Subject: [PATCH 2/8] =?UTF-8?q?refactor(lowy):=20close=20OQ4=20=E2=80=94?= =?UTF-8?q?=20read=20side=20derives=20from=20existing=20subscription?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Per-terminal metadata subscription already publishes the "what is this session asking" data; unified ledger is a presentation surface, not a new streaming procedure. 
--- docs/proposals/0002-meta-agent-planning-dispatcher.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/docs/proposals/0002-meta-agent-planning-dispatcher.md b/docs/proposals/0002-meta-agent-planning-dispatcher.md index 7adb4d049..16463e996 100644 --- a/docs/proposals/0002-meta-agent-planning-dispatcher.md +++ b/docs/proposals/0002-meta-agent-planning-dispatcher.md @@ -48,7 +48,9 @@ Not yet attached. CONTRIBUTING notes that proposal+prototype is the strongest fo ## Implementation notes -*Intentionally empty.* The user has no opinion on the *how* and per CONTRIBUTING that is the implementer's job. The Open questions below capture the parts that genuinely need design work. +The user has no opinion on the *how* in general, but the structural review surfaced a few directions worth recording so they don't get re-litigated: + +- **Read side is a presentation surface, not a new layer.** It derives from the existing per-terminal metadata subscription that already publishes "what is this session asking" data (agent state, summary). No new streaming procedure or server-side classifier is needed; the unified ledger is a client-side render over data the subscription already delivers. ## Alternatives considered @@ -65,7 +67,6 @@ Not yet attached. CONTRIBUTING notes that proposal+prototype is the strongest fo - **UI shape vs. kolu's existing layout.** Where does the meta-agent live? A panel inside the existing window, an overlay over the terminal grid / canvas, or a dedicated window that can sit on a second display while the user paces? Each has trade-offs against kolu's current per-folder and per-workspace model that maintainers are better placed to judge. - **Write-side arbitration when a session is mid-tool-call.** Dispatching a natural-language instruction into a Claude / opencode session that is currently waiting for the user is straightforward. Dispatching while the agent is mid-tool-call is not. Does the dispatcher queue, refuse, or interrupt? 
Is queueing safe across all supported agent CLIs? - **Voice: primitive or transport?** The wishlist framed voice as central. On reflection it might be one transport over a more general primitive — *one input that knows which session to route to* — and voice and text are equally valid surfaces over that primitive. Worth deciding early; it changes how the feature is scoped and named. -- **Streaming surface.** Does the read side need a new streaming procedure, or can it be derived from the existing terminal subscriptions plus a server-side classifier of "what is this session asking"? - **Authoring surface for the meta-agent itself.** Is this an LLM call wrapping kolu's existing terminal state, a deterministic surface that templates from session metadata, or a hybrid (deterministic for the read-side ledger, LLM for the dispatcher's NL parsing)? - **Agent-CLI fragmentation.** Different agent CLIs (Claude Code, opencode, Codex, anyagent) have different prompts, different ways of indicating "waiting on user", and different control surfaces. Does the dispatcher need a per-integration adapter, or can the existing `AgentProvider` abstraction carry it? From 9e726f374f5fcd410d752621bfa1f0261a101fd5 Mon Sep 17 00:00:00 2001 From: Gupta-ujjwal14 Date: Sun, 10 May 2026 19:32:47 +0530 Subject: [PATCH 3/8] refactor(lowy): split write side into NL parser + per-CLI injection seams NL intent parsing and per-CLI safe injection are independent volatility axes. Naming them as separate seams prevents a single dispatcher module from coupling algorithm-strategy changes to per-CLI protocol changes. 
--- docs/proposals/0002-meta-agent-planning-dispatcher.md | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/docs/proposals/0002-meta-agent-planning-dispatcher.md b/docs/proposals/0002-meta-agent-planning-dispatcher.md index 16463e996..62acab933 100644 --- a/docs/proposals/0002-meta-agent-planning-dispatcher.md +++ b/docs/proposals/0002-meta-agent-planning-dispatcher.md @@ -36,7 +36,7 @@ This does not replace the terminal indicators — it complements them. The indic > Tell A yes, codemod approach. Tell B I'll review in ten. Skip C. -The terminal still owns the conversation; the meta-agent is a router, not a parallel agent. +The terminal still owns the conversation; the meta-agent is a router, not a parallel agent. Internally the write side decomposes into two seams along independent volatility axes — see Implementation notes. **Voice and text as transports.** Brainstorming flows faster spoken than typed, and a verbal interface meets the user where they already are when thinking hard — pacing, looking away from the screen. Text remains available for situations where voice is inconvenient (open offices, noisy environments). Both feed the same underlying primitive. @@ -51,6 +51,11 @@ Not yet attached. CONTRIBUTING notes that proposal+prototype is the strongest fo The user has no opinion on the *how* in general, but the structural review surfaced a few directions worth recording so they don't get re-litigated: - **Read side is a presentation surface, not a new layer.** It derives from the existing per-terminal metadata subscription that already publishes "what is this session asking" data (agent state, summary). No new streaming procedure or server-side classifier is needed; the unified ledger is a client-side render over data the subscription already delivers. 
+- **Write side is two independent seams, not one dispatcher.** The volatility axes are different and should be encapsulated separately: + - *NL intent parser.* Takes one instruction string plus the list of currently-active terminals, returns structured `(terminalId, message)` pairs. Volatile along the algorithm axis (rule-based vs. LLM-backed vs. hybrid; provider choice; confidence handling). + - *Per-CLI safe injection.* Given a `(terminalId, message)` pair, decides when and how to write into that terminal's PTY — including mid-tool-call arbitration. Volatile along the per-agent-CLI axis: each CLI has its own input protocol and its own definition of "safe to deliver right now". + + Treating these as one dispatcher couples the two axes: changing the NL parser would force a touch on the injection logic, and vice versa. The proposal names them as separate seams so an implementer doesn't collapse them out of convenience. ## Alternatives considered @@ -65,9 +70,9 @@ The user has no opinion on the *how* in general, but the structural review surfa ## Open questions - **UI shape vs. kolu's existing layout.** Where does the meta-agent live? A panel inside the existing window, an overlay over the terminal grid / canvas, or a dedicated window that can sit on a second display while the user paces? Each has trade-offs against kolu's current per-folder and per-workspace model that maintainers are better placed to judge. -- **Write-side arbitration when a session is mid-tool-call.** Dispatching a natural-language instruction into a Claude / opencode session that is currently waiting for the user is straightforward. Dispatching while the agent is mid-tool-call is not. Does the dispatcher queue, refuse, or interrupt? Is queueing safe across all supported agent CLIs? +- **Mid-tool-call arbitration on the per-CLI injection seam.** Dispatching a natural-language instruction into a Claude / opencode session that is currently waiting for the user is straightforward. 
Dispatching while the agent is mid-tool-call is not. Does the injection seam queue, refuse, or interrupt? The right answer is per-CLI and lives inside the injection seam (see Implementation notes), not in the NL parser. - **Voice: primitive or transport?** The wishlist framed voice as central. On reflection it might be one transport over a more general primitive — *one input that knows which session to route to* — and voice and text are equally valid surfaces over that primitive. Worth deciding early; it changes how the feature is scoped and named. -- **Authoring surface for the meta-agent itself.** Is this an LLM call wrapping kolu's existing terminal state, a deterministic surface that templates from session metadata, or a hybrid (deterministic for the read-side ledger, LLM for the dispatcher's NL parsing)? +- **NL parser authoring strategy.** Inside the NL intent parser seam: deterministic templating, an LLM call, or hybrid? This is the parser's internal volatility — confined behind the seam, but the choice still has UX implications (latency, failure modes, confidence handling) worth deciding before scoping. - **Agent-CLI fragmentation.** Different agent CLIs (Claude Code, opencode, Codex, anyagent) have different prompts, different ways of indicating "waiting on user", and different control surfaces. Does the dispatcher need a per-integration adapter, or can the existing `AgentProvider` abstraction carry it? ## Out of scope From d94984635ec38deee53b7dc8bf6c510330699157 Mon Sep 17 00:00:00 2001 From: Gupta-ujjwal14 Date: Sun, 10 May 2026 19:33:09 +0530 Subject: [PATCH 4/8] =?UTF-8?q?refactor(lowy):=20close=20OQ6=20=E2=80=94?= =?UTF-8?q?=20extend=20AgentProvider,=20not=20a=20parallel=20adapter?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Per-CLI injection volatility lives on the same axis as detection and state-watching. 
AgentProvider already encapsulates that axis; adding a separate DispatchProvider family would double the blast radius of adding a new CLI without naming a new volatility axis. --- docs/proposals/0002-meta-agent-planning-dispatcher.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/proposals/0002-meta-agent-planning-dispatcher.md b/docs/proposals/0002-meta-agent-planning-dispatcher.md index 62acab933..1c810dc77 100644 --- a/docs/proposals/0002-meta-agent-planning-dispatcher.md +++ b/docs/proposals/0002-meta-agent-planning-dispatcher.md @@ -56,6 +56,7 @@ The user has no opinion on the *how* in general, but the structural review surfa - *Per-CLI safe injection.* Given a `(terminalId, message)` pair, decides when and how to write into that terminal's PTY — including mid-tool-call arbitration. Volatile along the per-agent-CLI axis: each CLI has its own input protocol and its own definition of "safe to deliver right now". Treating these as one dispatcher couples the two axes: changing the NL parser would force a touch on the injection logic, and vice versa. The proposal names them as separate seams so an implementer doesn't collapse them out of convenience. +- **The per-CLI injection seam belongs in `AgentProvider`, not a parallel adapter family.** `AgentProvider` already encapsulates per-CLI volatility (detection, state-watching) and is the seam every new agent CLI already has to implement. Adding a second `DispatchProvider`/`InjectionAdapter` family alongside it would double the blast radius of adding a new CLI without naming a different volatility axis. Extend `AgentProvider` with an optional injection capability (method signature is an implementation decision). 
## Alternatives considered @@ -73,7 +74,6 @@ The user has no opinion on the *how* in general, but the structural review surfa - **Mid-tool-call arbitration on the per-CLI injection seam.** Dispatching a natural-language instruction into a Claude / opencode session that is currently waiting for the user is straightforward. Dispatching while the agent is mid-tool-call is not. Does the injection seam queue, refuse, or interrupt? The right answer is per-CLI and lives inside the injection seam (see Implementation notes), not in the NL parser. - **Voice: primitive or transport?** The wishlist framed voice as central. On reflection it might be one transport over a more general primitive — *one input that knows which session to route to* — and voice and text are equally valid surfaces over that primitive. Worth deciding early; it changes how the feature is scoped and named. - **NL parser authoring strategy.** Inside the NL intent parser seam: deterministic templating, an LLM call, or hybrid? This is the parser's internal volatility — confined behind the seam, but the choice still has UX implications (latency, failure modes, confidence handling) worth deciding before scoping. -- **Agent-CLI fragmentation.** Different agent CLIs (Claude Code, opencode, Codex, anyagent) have different prompts, different ways of indicating "waiting on user", and different control surfaces. Does the dispatcher need a per-integration adapter, or can the existing `AgentProvider` abstraction carry it? ## Out of scope From d6cdf5e77d8e147e1fec384f051971f88446b519 Mon Sep 17 00:00:00 2001 From: Gupta-ujjwal14 Date: Sun, 10 May 2026 19:33:32 +0530 Subject: [PATCH 5/8] =?UTF-8?q?refactor(lowy):=20close=20OQ3=20=E2=80=94?= =?UTF-8?q?=20voice/text=20are=20client-side=20transports?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Server contract takes a plain instruction string; transport mode is client-only, mirroring sendInput. 
STT and input-widget concerns stay on the client and aren't part of this proposal's scope. --- docs/proposals/0002-meta-agent-planning-dispatcher.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/docs/proposals/0002-meta-agent-planning-dispatcher.md b/docs/proposals/0002-meta-agent-planning-dispatcher.md index 1c810dc77..9406e0a8a 100644 --- a/docs/proposals/0002-meta-agent-planning-dispatcher.md +++ b/docs/proposals/0002-meta-agent-planning-dispatcher.md @@ -38,7 +38,7 @@ This does not replace the terminal indicators — it complements them. The indic The terminal still owns the conversation; the meta-agent is a router, not a parallel agent. Internally the write side decomposes into two seams along independent volatility axes — see Implementation notes. -**Voice and text as transports.** Brainstorming flows faster spoken than typed, and a verbal interface meets the user where they already are when thinking hard — pacing, looking away from the screen. Text remains available for situations where voice is inconvenient (open offices, noisy environments). Both feed the same underlying primitive. +**Voice and text as transports.** Brainstorming flows faster spoken than typed, and a verbal interface meets the user where they already are when thinking hard — pacing, looking away from the screen. Text remains available for situations where voice is inconvenient (open offices, noisy environments). Both are *client-side* transports: the server contract receives a plain instruction string and never knows whether it was typed or spoken (mirroring how `sendInput` for terminals works today). Speech-to-text, push-to-talk UX, and text-input widgets are client-only concerns and not part of this proposal's scope. **Auto-quiet on terminal focus.** When the user focuses a specific terminal, the meta-agent stops surfacing summaries and stops accepting dispatches until refocused. 
Outside planning mode the full terminal context is the right surface; the meta-agent should disappear rather than compete for attention. @@ -72,7 +72,6 @@ The user has no opinion on the *how* in general, but the structural review surfa - **UI shape vs. kolu's existing layout.** Where does the meta-agent live? A panel inside the existing window, an overlay over the terminal grid / canvas, or a dedicated window that can sit on a second display while the user paces? Each has trade-offs against kolu's current per-folder and per-workspace model that maintainers are better placed to judge. - **Mid-tool-call arbitration on the per-CLI injection seam.** Dispatching a natural-language instruction into a Claude / opencode session that is currently waiting for the user is straightforward. Dispatching while the agent is mid-tool-call is not. Does the injection seam queue, refuse, or interrupt? The right answer is per-CLI and lives inside the injection seam (see Implementation notes), not in the NL parser. -- **Voice: primitive or transport?** The wishlist framed voice as central. On reflection it might be one transport over a more general primitive — *one input that knows which session to route to* — and voice and text are equally valid surfaces over that primitive. Worth deciding early; it changes how the feature is scoped and named. - **NL parser authoring strategy.** Inside the NL intent parser seam: deterministic templating, an LLM call, or hybrid? This is the parser's internal volatility — confined behind the seam, but the choice still has UX implications (latency, failure modes, confidence handling) worth deciding before scoping. 
## Out of scope From 2081f38eb823399d8d7dd5dd793df798285a5f53 Mon Sep 17 00:00:00 2001 From: Gupta-ujjwal14 Date: Sun, 10 May 2026 19:33:48 +0530 Subject: [PATCH 6/8] refactor(hickey): clarify shared UI chrome vs distinct mechanisms MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The original "two surfaces over a single conversational primitive" framing reads as if read and write share the same authoring mechanism. After the lowy split they don't — they share UI chrome over different seams. Replace the framing so an implementer doesn't infer a unified LLM context from the prose. --- docs/proposals/0002-meta-agent-planning-dispatcher.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/proposals/0002-meta-agent-planning-dispatcher.md b/docs/proposals/0002-meta-agent-planning-dispatcher.md index 9406e0a8a..1a77b13a1 100644 --- a/docs/proposals/0002-meta-agent-planning-dispatcher.md +++ b/docs/proposals/0002-meta-agent-planning-dispatcher.md @@ -24,7 +24,7 @@ The friction is most visible in *parallel brainstorming*: three Claude / opencod ## User-facing behavior -Two surfaces over a single conversational primitive: +Two surfaces — read and write — sharing a single UI chrome (the conversational layer the user interacts with) but with distinct underlying mechanisms (see Implementation notes). The user experiences one place to glance and one place to dictate; the implementation behind that experience is two seams. **Read side.** On demand, the meta-agent surfaces a unified picture of what every active terminal is currently asking. Example phrasing the user might hear or read back: From 1a3f9e1b37cb6dbf0e5b75844e7ea4a128bf237f Mon Sep 17 00:00:00 2001 From: Gupta-ujjwal14 Date: Sun, 10 May 2026 19:34:04 +0530 Subject: [PATCH 7/8] refactor(hickey): scope auto-quiet rule to the single-window assumption "Terminal focus" and "user attention" are synonymous on one monitor and diverge on two. 
Note the assumption explicitly so the second-display layout variant doesn't silently inherit a wrong trigger. --- docs/proposals/0002-meta-agent-planning-dispatcher.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/proposals/0002-meta-agent-planning-dispatcher.md b/docs/proposals/0002-meta-agent-planning-dispatcher.md index 1a77b13a1..f14c3e6b6 100644 --- a/docs/proposals/0002-meta-agent-planning-dispatcher.md +++ b/docs/proposals/0002-meta-agent-planning-dispatcher.md @@ -42,6 +42,8 @@ The terminal still owns the conversation; the meta-agent is a router, not a para **Auto-quiet on terminal focus.** When the user focuses a specific terminal, the meta-agent stops surfacing summaries and stops accepting dispatches until refocused. Outside planning mode the full terminal context is the right surface; the meta-agent should disappear rather than compete for attention. +This rule assumes the meta-agent and the terminals share one window. The second-display layout variant (see Open questions) breaks the assumption: with the meta-agent on display A and a focused terminal on display B, "terminal focus" no longer signals "user has shifted attention away from the meta-agent." Whichever layout maintainers pick will need to re-derive the auto-quiet trigger; the rule as stated above is the single-window default. + ## Prototype Not yet attached. CONTRIBUTING notes that proposal+prototype is the strongest form, and a screen recording of the workflow this would replace — the user pacing while orchestrating three projects through a single interface — would communicate the value better than prose. Happy to add one if maintainers want it before accepting; flagging the gap explicitly rather than treating the proposal as complete. 
From 93d9de052dd5e111b29d9c6ba4a22fecc267c3c0 Mon Sep 17 00:00:00 2001 From: Gupta-ujjwal14 Date: Sun, 10 May 2026 19:34:30 +0530 Subject: [PATCH 8/8] refactor(hickey): note that mid-tool-call semantics are themselves per-CLI The arbitration policy can't be answered once and applied uniformly because each agent CLI surfaces "I'm busy" differently. Surface the fragmentation inside the injection-seam description so it can't be treated as a single global decision. --- docs/proposals/0002-meta-agent-planning-dispatcher.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/proposals/0002-meta-agent-planning-dispatcher.md b/docs/proposals/0002-meta-agent-planning-dispatcher.md index f14c3e6b6..64c552816 100644 --- a/docs/proposals/0002-meta-agent-planning-dispatcher.md +++ b/docs/proposals/0002-meta-agent-planning-dispatcher.md @@ -55,7 +55,7 @@ The user has no opinion on the *how* in general, but the structural review surfa - **Read side is a presentation surface, not a new layer.** It derives from the existing per-terminal metadata subscription that already publishes "what is this session asking" data (agent state, summary). No new streaming procedure or server-side classifier is needed; the unified ledger is a client-side render over data the subscription already delivers. - **Write side is two independent seams, not one dispatcher.** The volatility axes are different and should be encapsulated separately: - *NL intent parser.* Takes one instruction string plus the list of currently-active terminals, returns structured `(terminalId, message)` pairs. Volatile along the algorithm axis (rule-based vs. LLM-backed vs. hybrid; provider choice; confidence handling). - - *Per-CLI safe injection.* Given a `(terminalId, message)` pair, decides when and how to write into that terminal's PTY — including mid-tool-call arbitration. Volatile along the per-agent-CLI axis: each CLI has its own input protocol and its own definition of "safe to deliver right now". 
+ - *Per-CLI safe injection.* Given a `(terminalId, message)` pair, decides when and how to write into that terminal's PTY — including mid-tool-call arbitration. Volatile along the per-agent-CLI axis: each CLI has its own input protocol and its own definition of "safe to deliver right now". Note that *what counts as mid-tool-call* is itself per-CLI — Claude Code, opencode, Codex, and anyagent each surface "I'm busy" in different ways, so the arbitration policy can't be answered once and applied uniformly. Treating these as one dispatcher couples the two axes: changing the NL parser would force a touch on the injection logic, and vice versa. The proposal names them as separate seams so an implementer doesn't collapse them out of convenience. - **The per-CLI injection seam belongs in `AgentProvider`, not a parallel adapter family.** `AgentProvider` already encapsulates per-CLI volatility (detection, state-watching) and is the seam every new agent CLI already has to implement. Adding a second `DispatchProvider`/`InjectionAdapter` family alongside it would double the blast radius of adding a new CLI without naming a different volatility axis. Extend `AgentProvider` with an optional injection capability (method signature is an implementation decision).
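
The two write-side seams described in the Implementation notes can be sketched as follows. This is a minimal illustration under stated assumptions, not kolu's actual code: every type and function name here (`ActiveTerminal`, `Dispatch`, `parseInstruction`, `inject`, `deliver`, the `AgentState` values) is hypothetical, the `AgentProvider` shown is a stand-in with only the fields the sketch needs, and the rule-based parser is just one of the strategies the proposal leaves open.

```typescript
// All names below are hypothetical; none are taken from kolu's codebase.
type TerminalId = string;

interface ActiveTerminal {
  id: TerminalId;
  label: string; // the name the user says or types, e.g. "A"
}

interface Dispatch {
  terminalId: TerminalId;
  message: string;
}

// Seam 1: NL intent parser. A rule-based stand-in; an LLM-backed parser
// could replace the body without touching the injection seam below.
function parseInstruction(instruction: string, terminals: ActiveTerminal[]): Dispatch[] {
  const byLabel = new Map<string, TerminalId>();
  for (const t of terminals) byLabel.set(t.label.toLowerCase(), t.id);

  const dispatches: Dispatch[] = [];
  // Split on sentence-ending periods; commas inside a sentence are kept.
  for (const sentence of instruction.split(/\.\s+|\.$/)) {
    const match = /^\s*tell\s+(\S+)\s+(.+)$/i.exec(sentence);
    if (!match) continue; // "Skip C" and unrecognised text produce no dispatch
    const label = (match[1] ?? "").toLowerCase();
    const message = (match[2] ?? "").trim();
    const terminalId = byLabel.get(label);
    if (terminalId !== undefined) dispatches.push({ terminalId, message });
  }
  return dispatches;
}

// Seam 2: per-CLI safe injection, modelled as an optional capability on a
// stand-in AgentProvider. The state values are an assumption for this sketch.
type AgentState = "waiting" | "busy";
type InjectionResult = "delivered" | "queued" | "refused";
type PtyWrite = (data: string) => void;

interface AgentProvider {
  name: string;
  inject?(dispatch: Dispatch, state: AgentState, write: PtyWrite): InjectionResult;
}

// One possible per-CLI policy: write only when the agent is waiting.
// It reports "queued" when busy but does not store the message; a real
// provider would have to hold it and retry.
const exampleProvider: AgentProvider = {
  name: "example-cli",
  inject(dispatch, state, write) {
    if (state === "busy") return "queued";
    write(dispatch.message + "\n");
    return "delivered";
  },
};

// Providers without the optional capability refuse rather than guess.
function deliver(
  provider: AgentProvider,
  dispatch: Dispatch,
  state: AgentState,
  write: PtyWrite,
): InjectionResult {
  return provider.inject ? provider.inject(dispatch, state, write) : "refused";
}
```

Because the parser only returns `(terminalId, message)` pairs and the provider only decides when and how to write, either side can change without touching the other, which is the decoupling the proposal argues for.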