Epic 6.3–6.10: Self-Organizing Agent Orchestration #129

## Summary

Evolve Loomkin's agent orchestration from reactive event-driven coordination to **self-organizing swarms** that surpass the traditional sequential-phase model used by tools like Claude Code teams.

### Foundation from PRs #141 and #144

Recent work laid critical infrastructure that several epics build on:

- **Bootstrap agents** (Concierge + Orienter) — sessions now spawn two always-on agents. Concierge orchestrates, Orienter scans project context silently
- **Concierge routing** — user messages route through Concierge first (`maybe_route_to_concierge/2`), making it the natural cross-team dispatch point for 6.7
- **Deferred bootstrap** — agents spawn on first `send_message`, not session creation, aligning with 6.10's context-aware spawning philosophy
- **Agent attribution** — `from:` field on broadcasts provides identity for cross-team messages
- **`team_id` on sessions** — persistence link between session and team hierarchy
- **Context keeper rehydration** — `ContextKeeper.rehydrate_from_db/1` restores shared knowledge across restarts

### The Problem with Traditional AI Agent Teams

Traditional AI coding-agent teams (Claude Code, Cursor, etc.) use rigid sequential phases:

```
Phase 1: Design agent (solo)         → wait for completion
Phase 2: Component agents (parallel) → wait for ALL to complete
Phase 3: Integration agent (solo)    → done
```

All orchestration intelligence lives in the leader's upfront plan. Agents can't adapt, can't communicate laterally, and can't start downstream work until an entire phase completes.

### What Loomkin Already Does Better

| Capability | Traditional Teams | Loomkin Today |
|---|---|---|
| Agent spawning | Pre-planned by leader | Dynamic via TeamSpawn — LLM decides |
| Session bootstrap | Cold start, no context | Concierge + Orienter warm start with project scan |
| Task assignment | Leader assigns explicitly | `smart_assign` by capability + load |
| Conflict handling | Avoided by design (non-overlapping files) | Real-time detection (file, approach, decision) |
| Message priority | None (sequential) | 4-tier routing with urgent interrupts |
| Stuck recovery | None (wait forever) | Rebalancer detects, nudges, escalates |
| Knowledge sharing | None between agents | RelevanceScorer + cross-team propagation |
| Consensus | Leader decides | Weighted voting (expertise × capability × confidence) |
| User routing | Single agent | Concierge dispatches to specialists |
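
To make the consensus row concrete, here is a minimal sketch of weighted voting in which each vote's weight is the product expertise × capability × confidence. It is written in Python purely for illustration and is not Loomkin's implementation in `collective_decision.ex`; the `Vote` class, the 0.0 to 1.0 score range, and the alphabetical tie-break are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    """One agent's vote. All three scores are assumed to lie in [0.0, 1.0]."""
    agent: str
    choice: str
    expertise: float
    capability: float
    confidence: float

    @property
    def weight(self) -> float:
        # Weight is the product named in the table above.
        return self.expertise * self.capability * self.confidence


def tally(votes):
    """Sum vote weights per choice."""
    totals = {}
    for vote in votes:
        totals[vote.choice] = totals.get(vote.choice, 0.0) + vote.weight
    return totals


def decide(votes):
    """Return the choice with the highest total weight, or None without votes.

    Ties go to the alphabetically first choice (a hypothetical rule).
    """
    totals = tally(votes)
    if not totals:
        return None
    return max(sorted(totals), key=totals.get)
```

With this scheme a single confident expert can outweigh several low-confidence votes, which is the behavioural difference from "leader decides".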

### What's Missing

Despite the above, **sub-teams are completely isolated from each other**. Communication flows upward only: insights and blockers propagate to the parent via `comms.ex`. Sibling sub-teams cannot:
- Send direct messages to each other
- Ask questions across team boundaries
- Share tasks or coordinate work laterally
- Even discover each other's existence

Additionally, tasks run to completion with no mid-execution steering, no partial results, and no speculative parallelism.

## Sub-tasks

### 6.3: Interruptible Checkpoints (CRITICAL)
Insert yield points in `AgentLoop` after LLM response and after each tool execution. Allow the user (or system) to pause, inspect, redirect, or cancel an agent mid-loop rather than waiting for full task completion.

- Add checkpoint callbacks in `agent_loop.ex` after the LLM call in `do_loop` and after `execute_single_tool`
- New agent status `:paused` with resume capability
- User-facing "pause/steer" control in activity feed
- Must handle auto-initiated loops (Orienter's `handle_continue(:auto_orient)`), not just user-triggered ones
- Foundation for all other epics — without this, agents are black boxes during execution

**Existing foundation:** The permission system (`check_permission` callback, `:waiting_permission` status, `PermissionComponent` modal) provides the primitive for single-tool blocking. Extend it to a full checkpoint protocol.
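
As a rough illustration of the yield-point idea, the sketch below runs a list of steps and consults a checkpoint callback after each one, so a caller can pause, resume, or cancel between steps. It is a language-neutral Python sketch, not the Elixir `AgentLoop`; `Signal`, `LoopState`, and `run_loop` are hypothetical names, and each step stands in for either an LLM response or a single tool execution.

```python
import enum
from dataclasses import dataclass, field
from typing import Callable, List, Optional


class Signal(enum.Enum):
    CONTINUE = "continue"
    PAUSE = "pause"
    CANCEL = "cancel"


@dataclass
class LoopState:
    status: str = "working"
    steps_done: List[str] = field(default_factory=list)
    next_index: int = 0  # where a resumed loop picks up again


def run_loop(
    steps: List[Callable[[], str]],
    checkpoint: Callable[[LoopState], Signal],
    state: Optional[LoopState] = None,
) -> LoopState:
    """Run steps in order, consulting `checkpoint` after every step.

    Passing a previously paused state resumes from where it stopped.
    """
    state = state or LoopState()
    state.status = "working"
    while state.next_index < len(steps):
        result = steps[state.next_index]()
        state.steps_done.append(result)
        state.next_index += 1
        signal = checkpoint(state)  # the yield point
        if signal is Signal.PAUSE:
            state.status = "paused"
            return state
        if signal is Signal.CANCEL:
            state.status = "cancelled"
            return state
    state.status = "completed"
    return state
```

Because the state carries `next_index`, a paused loop can be inspected and then handed back to `run_loop` to continue; a redirect would replace the remaining steps before resuming.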

### 6.4: Dynamic Task Dependencies (HIGH)
Extend `TeamTaskDep` beyond simple `:blocks` to support content-aware coupling. Tasks should depend on **milestones** ("schema_ready", "API defined"), not just full task completion.

- New dependency type `:requires_output` — inject predecessor's output into dependent task context
- Milestone signaling: agents emit named checkpoints that unblock specific dependents
- Dynamic dependency creation: agents discover mid-work that a new dependency should exist
- Priority inheritance: urgent downstream tasks make their blockers urgent
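
The bullets above can be sketched as a small in-memory graph: tasks wait on named milestones rather than on whole tasks, dependencies can be added at any time, and a dependent's priority is pushed onto its blockers. This is an illustrative Python sketch under assumed names (`Task`, `TaskGraph`, `emit`, `runnable`), not the `TeamTaskDep` schema, and it leaves out `:requires_output` context injection.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    priority: int = 0  # higher means more urgent (assumed convention)
    waits_on: set = field(default_factory=set)  # (task_name, milestone) pairs


class TaskGraph:
    def __init__(self):
        self.tasks = {}
        self.reached = set()  # milestones emitted so far

    def add(self, task):
        self.tasks[task.name] = task
        self._inherit(task)

    def add_dependency(self, dependent, blocker, milestone):
        """Dependencies may be discovered mid-work, not only at planning time."""
        self.tasks[dependent].waits_on.add((blocker, milestone))
        self._inherit(self.tasks[dependent])

    def _inherit(self, task):
        # Priority inheritance: raise each blocker to at least the
        # dependent's priority, following the chain transitively.
        for blocker_name, _milestone in task.waits_on:
            blocker = self.tasks.get(blocker_name)
            if blocker is not None and blocker.priority < task.priority:
                blocker.priority = task.priority
                self._inherit(blocker)

    def emit(self, task_name, milestone):
        """An agent signals that a named milestone has been reached."""
        self.reached.add((task_name, milestone))

    def runnable(self):
        """Names of tasks whose milestone dependencies are all satisfied."""
        return sorted(
            name for name, task in self.tasks.items()
            if task.waits_on <= self.reached
        )
```

In this sketch an `api` task that waits on `("schema", "schema_ready")` becomes runnable as soon as the schema agent emits that milestone, even if the `schema` task itself is still in progress.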

### 6.5: Speculative Execution (HIGH)
Allow agents to begin work optimistically before all dependencies resolve. Mark outputs as "tentative" and merge or discard when assumptions are confirmed.

- Tentative task state with assumption tracking
- Merge protocol when speculative work validates against actual results
- Discard + replay when assumptions are wrong
- Extend `ConflictDetector` to distinguish intentional overlap (speculative work) from accidental overlap
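
One way to picture assumption tracking: speculative output records what it assumed about unresolved dependencies, and is confirmed or discarded once the real values arrive. The Python sketch below is illustrative only; `SpeculativeResult`, `resolve`, and the state names are hypothetical, and a real merge protocol would do more than flip a state.

```python
from dataclasses import dataclass


@dataclass
class SpeculativeResult:
    output: str
    assumptions: dict  # what the agent assumed about unresolved dependencies
    state: str = "tentative"


def resolve(result, actual):
    """Confirm or discard tentative work once the real values are known.

    Returns the sorted list of assumption keys that turned out wrong,
    which is what a replay would need to revisit.
    """
    wrong = {
        key: value
        for key, value in result.assumptions.items()
        if actual.get(key) != value
    }
    result.state = "discarded" if wrong else "confirmed"
    return sorted(wrong)
```

For example, client code written on the assumption `{"id_type": "uuid"}` is confirmed if the schema task ends up using UUIDs and is discarded, with `id_type` flagged for replay, if it does not.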

### 6.6: Readiness Signaling (MEDIUM)
Add fine-grained task states beyond `pending → in_progress → completed`. Agents can signal readiness for integration, request review, or indicate partial availability.

- New states: `:ready_for_review`, `:paused`, `:blocked`, `:partially_complete`
- Agents emit `:agent_ready` events that other agents can subscribe to
- Rendezvous points: named synchronization barriers where multiple agents signal readiness and a coordinator action triggers when all arrive

**Existing prototype:** Orienter→Concierge handshake via `peer_message` after `handle_continue(:auto_orient)` is a primitive readiness signal. Generalize this pattern.
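
A rendezvous point can be sketched as a named barrier that fires a coordinator callback exactly once, when every expected agent has signalled readiness. The Python below is a single-process illustration with hypothetical names (`Rendezvous`, `signal_ready`); in Loomkin the signals would presumably travel as events rather than method calls.

```python
class Rendezvous:
    """Named synchronization barrier for a fixed set of agents."""

    def __init__(self, name, expected, on_all_arrived):
        self.name = name
        self.expected = set(expected)
        self.arrived = set()
        self.on_all_arrived = on_all_arrived  # coordinator action
        self.fired = False

    def signal_ready(self, agent):
        """Record that `agent` is ready; return True once the barrier has fired."""
        if agent not in self.expected:
            raise ValueError(f"{agent} is not part of rendezvous {self.name}")
        self.arrived.add(agent)
        if not self.fired and self.arrived == self.expected:
            self.fired = True
            self.on_all_arrived(self.name)
        return self.fired
```

Repeated signals from the same agent are harmless here: the coordinator action runs only on the first moment that all expected agents have arrived.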

### 6.7: Cross-Team Communication (HIGH PRIORITY)
**This is a critical gap.** Currently sub-teams have completely isolated PubSub topic spaces. All cross-team knowledge flows one-way upward (only `:insight` and `:blocker` types via `maybe_propagate_to_parent/2`).

What needs to change:
- **Sub-team ↔ sub-team peer messaging** — sibling teams coordinate without routing through the lead
- **Parent ↔ sub-team bidirectional queries** — parent can ask sub-team agents questions, not just receive insights
- **Cross-team task visibility** — agents can see and reference tasks across team boundaries
- **Team discovery** — agents can discover sibling teams and their members (scope: sibling-to-sibling discovery; Concierge already knows its spawned agents)
- PubSub bridge or shared topic layer for cross-team routing
- ~~Fix `parent_team_id` bug in `agent_loop.ex:309`~~ **FIXED** — now resolves actual parent via `Manager.get_parent_team/1`

**Existing foundation:** Concierge routing provides user→agent dispatch. The gap is specifically lateral (sibling↔sibling) and downward (parent→child queries).
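
The routing rules this section asks for (sibling discovery, sibling-to-sibling messaging, and parent-child messaging in both directions) can be sketched with a tiny in-memory bus. This is an assumption-laden Python illustration, not a design for the PubSub bridge: `TeamBus` and its methods are hypothetical, and the choice to reject messages between unrelated teams is one possible policy.

```python
from collections import defaultdict


class TeamBus:
    def __init__(self):
        self.parent_of = {}             # team_id -> parent team_id (or None)
        self.inbox = defaultdict(list)  # team_id -> delivered messages

    def register(self, team_id, parent_id=None):
        self.parent_of[team_id] = parent_id

    def siblings(self, team_id):
        """Team discovery: other teams that share this team's parent."""
        parent = self.parent_of.get(team_id)
        if parent is None:
            return []
        return sorted(
            team for team, team_parent in self.parent_of.items()
            if team_parent == parent and team != team_id
        )

    def send(self, from_team, to_team, body):
        """Deliver sibling-to-sibling, child-to-parent, or parent-to-child."""
        allowed = (
            to_team in self.siblings(from_team)
            or self.parent_of.get(from_team) == to_team
            or self.parent_of.get(to_team) == from_team
        )
        if not allowed:
            raise PermissionError(f"{from_team} cannot message {to_team}")
        # The "from" key mirrors the attribution field on broadcasts.
        self.inbox[to_team].append({"from": from_team, "body": body})
```

Under these rules two sub-teams of the same parent can coordinate directly, while a grandchild team still has to go through its own parent to reach an uncle team.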

### 6.8: Agent Negotiation (MEDIUM)
Allow agents to counter-propose task assignments rather than silently accepting or being force-assigned.

- Agent can respond to `:task_assigned` with a counter-proposal (suggest better-suited agent, request clarification, flag conflicts)
- Negotiation protocol with timeout fallback to current behavior
- Reduces wasted cycles from misassigned tasks
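
A minimal sketch of the timeout fallback: the assigner waits briefly for a reply to `:task_assigned`, honours a counter-proposal if one arrives, and otherwise keeps the original assignee, which matches today's behaviour. The Python below uses a standard-library queue as a stand-in for the agent's mailbox; the reply shapes (`accept`, `counter`, `suggested_agent`) are hypothetical.

```python
import queue


def await_response(replies, assignee, timeout_s):
    """Resolve an assignment from the agent's reply, or fall back on timeout.

    `replies` is a queue.Queue that the assigned agent may put one reply on.
    """
    try:
        reply = replies.get(timeout=timeout_s)
    except queue.Empty:
        # No answer in time: keep the original assignment.
        return {"assignee": assignee, "via": "timeout_fallback"}
    if reply["type"] == "accept":
        return {"assignee": assignee, "via": "accepted"}
    if reply["type"] == "counter":
        return {"assignee": reply["suggested_agent"], "via": "counter_proposal"}
    raise ValueError(f"unknown reply type: {reply['type']}")
```

Keeping the fallback identical to the current behaviour means negotiation can be rolled out without changing outcomes for agents that never respond.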

### 6.9: Partial Task Results (MEDIUM)
Tasks currently succeed or fail atomically. Enable intermediate outputs that downstream dependents can consume immediately.

- Streaming task results: agents emit partial outputs as they work
- Dependent tasks can start with partial predecessor data
- Pipelined workflows: task A produces data → task B consumes incrementally → task C integrates
- Complements 6.4 (milestone dependencies) with actual data flow
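
The pipelined A → B → C flow can be illustrated with Python generators, where each partial output from task A is consumed by task B before A produces the next one. This is only an analogy for streaming task results; the task functions and the example endpoints are hypothetical, and a shared log is used to make the interleaving visible.

```python
def task_a(log):
    """Producer: emits partial outputs one at a time."""
    for endpoint in ("GET /users", "POST /users"):
        log.append(f"A emits {endpoint}")
        yield endpoint


def task_b(partials, log):
    """Consumer: starts work on each partial as soon as it is available."""
    for endpoint in partials:
        log.append(f"B consumes {endpoint}")
        yield f"client stub for {endpoint}"


def task_c(stubs):
    """Integrator: collects everything B produced."""
    return list(stubs)


log = []
integrated = task_c(task_b(task_a(log), log))
# log now alternates: A emits, B consumes, A emits, B consumes
```

The point of the interleaved log is that B never waits for A to finish, which is the difference between partial results and the current atomic success-or-failure model.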

### 6.10: Adaptive Team Spawning (LOW)
Automatically scale team composition based on workload rather than requiring explicit `team_spawn` calls.

- Auto-scaler monitors ratio of unblocked tasks to idle agents
- Spawns specialist agents when backlog exceeds threshold
- Work stealing: idle agents claim tasks from overloaded agents' queues
- Extends Rebalancer (currently only detects stuck agents) to handle capacity

**Note:** Concierge already does manual adaptive spawning via `team_spawn` tool with LLM-driven decisions. This epic adds automatic scaling on top.
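
The scaling rule and work stealing described above could look roughly like the following Python sketch. The threshold of two unblocked tasks per idle agent, the spawn cap, and the rule that a queue needs at least two tasks before it can be stolen from are all assumptions chosen for illustration, as are the function names.

```python
import math


def agents_to_spawn(unblocked_tasks, idle_agents, threshold=2.0, max_spawn=3):
    """How many agents to add so the task-to-idle-agent ratio meets the threshold."""
    if unblocked_tasks == 0:
        return 0
    if idle_agents > 0 and unblocked_tasks / idle_agents <= threshold:
        return 0  # backlog is within what idle agents can absorb
    wanted = math.ceil(unblocked_tasks / threshold) - idle_agents
    return max(0, min(max_spawn, wanted))


def steal(queues, idle_agent):
    """Move one task from the longest queue to `idle_agent`.

    Returns the stolen task, or None if nothing is worth stealing.
    """
    if not queues:
        return None
    victim = max(sorted(queues), key=lambda agent: len(queues[agent]))
    if victim == idle_agent or len(queues[victim]) < 2:
        return None
    task = queues[victim].pop()  # take from the tail, leave the victim's next task
    queues.setdefault(idle_agent, []).append(task)
    return task
```

With these defaults, ten unblocked tasks and one idle agent yield a request for three new agents (the cap), while four tasks and two idle agents yield none.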

## The Vision

**From:** Leader plans everything → agents execute in waves → leader integrates

**To:** Leader seeds goals → agents discover work → system scales → lateral coordination → emergent completion

## Recommended Priority Order

1. **6.3 Checkpoints** — foundation for user steering (everything else builds on this)
2. **6.7 Cross-Team Comms** — removes the biggest structural bottleneck
3. **6.4 Dynamic Dependencies** — enables content-aware task coupling
4. **6.5 Speculative Execution** — unlocks true parallelism
5. **6.6 Readiness Signaling** — fine-grained observation + control (Orienter pattern as prototype)
6. **6.9 Partial Results** — pipelined workflows
7. **6.8 Agent Negotiation** — agent autonomy
8. **6.10 Adaptive Spawning** — dynamic team composition

## Key Files

- `lib/loomkin/agent_loop.ex` — LLM loop, tool execution, checkpoints target
- `lib/loomkin/teams/agent.ex` — GenServer, async execution, priority dispatch
- `lib/loomkin/teams/manager.ex` — team/agent lifecycle, sub-team creation
- `lib/loomkin/teams/comms.ex` — PubSub broadcasting, cross-team propagation
- `lib/loomkin/teams/priority_router.ex` — message classification
- `lib/loomkin/teams/rebalancer.ex` — stuck agent detection
- `lib/loomkin/teams/conflict_detector.ex` — conflict detection
- `lib/loomkin/teams/capabilities.ex` — capability tracking, smart_assign
- `lib/loomkin/teams/collective_decision.ex` — weighted voting
- `lib/loomkin/teams/query_router.ex` — question routing (same-team only today)
- `lib/loomkin/schemas/team_task_dep.ex` — task dependency schema
- `lib/loomkin/teams/role.ex` — Concierge/Orienter/specialist role definitions
- `lib/loomkin/session/session.ex` — Concierge routing, session lifecycle

## References

- Research: `/memory/orchestration-synthesis.md`
- Epic 6.1 (Async Loop): Done — `Task.Supervisor.async_nolink` pattern
- Epic 6.2 (Priority Router): Done — 4-tier classification
- Issue #75 (Project Switching): Done — dynamic path resolution via ETS
- PR #141 (Bootstrap Agents): Done — Concierge + Orienter warm start
- PR #144 (Bootstrap Fixes): Done — deferred spawning, agent attribution, stale model fallback

