diff --git a/README.md b/README.md
index ff8cc86..76c884f 100644
--- a/README.md
+++ b/README.md
@@ -10,7 +10,7 @@
 
 ### OpenArmature is a workflow framework for LLM pipelines and tool-calling agents.
 
-Typed state, compile-time topology checks, and observability and crash-safe checkpoints are baked into the engine. The graph layer itself has no concept of LLMs or tools, so the same primitives drive deterministic ETL pipelines and tool-calling agents alike.
+Typed state, compile-time topology checks, observability, and crash-safe checkpoints are baked into the engine. The graph layer itself has no concept of LLMs or tools, so the same primitives drive deterministic ETL pipelines and tool-calling agents alike.
 
 This Python package is the reference implementation. The behavioral contract is specified in [openarmature-spec](https://github.com/LunarCommand/openarmature-spec) and verified by conformance fixtures.
 
@@ -187,8 +187,8 @@ A few things to notice:
 ## Next steps
 
 - **Quickstart**: build your first graph end-to-end. [openarmature.ai/getting-started](https://openarmature.ai/getting-started/)
-- **Concepts**: typed state, reducers, composition, fan-out, checkpointing, observability. [openarmature.ai/concepts](https://openarmature.ai/concepts/)
+- **Concepts**: typed state, reducers, graphs, composition, fan-out, parallel branches, LLMs, prompts, observability, checkpointing. [openarmature.ai/concepts](https://openarmature.ai/concepts/)
 - **Model Providers**: implement the Provider Protocol for a custom LLM backend. [openarmature.ai/model-providers/authoring](https://openarmature.ai/model-providers/authoring/)
 - **API reference**: auto-generated from docstrings. [openarmature.ai/reference](https://openarmature.ai/reference/)
-- **Examples**: runnable demos. [openarmature-python/examples/](https://github.com/LunarCommand/openarmature-python/tree/main/examples)
+- **Examples**: ten runnable demos with walk-throughs. [openarmature.ai/examples](https://openarmature.ai/examples/) (source at [./examples/](./examples/))
 - **Spec**: behavioral contract this implementation conforms to. [LunarCommand/openarmature-spec](https://github.com/LunarCommand/openarmature-spec)
diff --git a/docs/concepts/composition.md b/docs/concepts/composition.md
index b6ea35c..c34ce72 100644
--- a/docs/concepts/composition.md
+++ b/docs/concepts/composition.md
@@ -277,3 +277,17 @@ the shape or it doesn't; the type checker verifies at use sites. If
 you have Java instincts ("where's the `implements` keyword?"), reach
 for TypeScript or Go interface instincts instead; that's the same
 family.
+
+## Related composition primitives
+
+A subgraph node runs its subgraph once each time the outer graph
+enters it. Two related primitives run subgraphs multiple times or in
+parallel; both use the same projection machinery at their boundaries.
+
+- [Fan-out](fan-out.md): dispatch N copies of *one* compiled subgraph
+  against an input collection. Use when you have a list of similar
+  items to process independently.
+- [Parallel branches](parallel-branches.md): dispatch M *heterogeneous*
+  subgraphs concurrently against the same parent state, each with its
+  own state schema and (optional) middleware. Use when several
+  independent analyses share a single input.
diff --git a/docs/concepts/fan-out.md b/docs/concepts/fan-out.md
index 4d69526..40d9060 100644
--- a/docs/concepts/fan-out.md
+++ b/docs/concepts/fan-out.md
@@ -18,36 +18,29 @@ A fan-out can dispatch instances driven by a list in state
 **`items_field` mode**: one instance per item in a parent list field:
 
 ```python
-from openarmature.graph import FanOutConfig, FanOutNode
-
-scrape_all = FanOutNode(
-    name="scrape_all",
-    config=FanOutConfig(
-        subgraph=scrape_subgraph,        # CompiledGraph[ScrapeState]
-        items_field="urls",              # parent list field, one instance per item
-        item_field="url",                # subgraph field that receives each item
-        collect_field="content",         # subgraph field whose value is collected
-        target_field="contents",         # parent list field that receives the collection
-        concurrency=4,
-        error_policy="fail_fast",        # or "collect"
-        on_empty="raise",                # or "noop"
-    ),
+builder.add_fan_out_node(
+    "scrape_all",
+    subgraph=scrape_subgraph,        # CompiledGraph[ScrapeState]
+    items_field="urls",              # parent list field, one instance per item
+    item_field="url",                # subgraph field that receives each item
+    collect_field="content",         # subgraph field whose value is collected
+    target_field="contents",         # parent list field that receives the collection
+    concurrency=4,
+    error_policy="fail_fast",        # or "collect"
+    on_empty="raise",                # or "noop"
 )
-builder.add_node("scrape_all", scrape_all)
 ```
 
 **`count` mode**: fixed-or-dynamic instance count, no list field:
 
 ```python
-fan_out = FanOutNode(
-    name="sample",
-    config=FanOutConfig(
-        subgraph=sample_subgraph,
-        count=8,                          # int or callable: state -> int
-        collect_field="reading",
-        target_field="readings",
-        concurrency=4,
-    ),
+builder.add_fan_out_node(
+    "sample",
+    subgraph=sample_subgraph,
+    count=8,                          # int or callable: state -> int
+    collect_field="reading",
+    target_field="readings",
+    concurrency=4,
 )
 ```
 
diff --git a/docs/concepts/graphs.md b/docs/concepts/graphs.md
index 20813d0..2552e29 100644
--- a/docs/concepts/graphs.md
+++ b/docs/concepts/graphs.md
@@ -117,6 +117,17 @@ The methods you'll use:
 - **`.add_subgraph_node(name, compiled, projection=None)`**: register
   a compiled graph as a node inside this graph (see
   [Composition](composition.md)).
+- **`.add_fan_out_node(name, subgraph=..., ...)`**: dispatch N copies
+  of one subgraph in parallel (see [Fan-out](fan-out.md)).
+- **`.add_parallel_branches_node(name, branches=...)`**: dispatch M
+  heterogeneous subgraphs concurrently (see
+  [Parallel branches](parallel-branches.md)).
+- **`.with_checkpointer(checkpointer)`**: wire a `Checkpointer`; the
+  engine saves a record after every `completed` event (see
+  [Checkpointing](checkpointing.md)).
+- **`.with_state_migration(from_version, to_version, migrate)`**:
+  register one edge of the state-migration chain used when resuming
+  an older saved invocation (see [Checkpointing](checkpointing.md)).
 - **`.set_entry(name)`**: declare where execution begins.
 - **`.compile()`**: validate and return `CompiledGraph`.
 
diff --git a/docs/concepts/index.md b/docs/concepts/index.md
index a55a722..611b6ff 100644
--- a/docs/concepts/index.md
+++ b/docs/concepts/index.md
@@ -16,11 +16,14 @@ the framework, or jump to whichever concept you need.
   heterogeneous subgraphs concurrently with per-branch state schemas
   and middleware.
 - [LLMs](llms.md): how LLM calls fit into nodes, structured output,
-  routing on parsed fields, errors at the LLM boundary.
+  multimodal content blocks, tool definitions, routing on parsed
+  fields, errors at the LLM boundary.
+- [Prompts](prompts.md): versioned templates, composite backends,
+  prompt-group observability propagation.
 - [Observability](observability.md): node-boundary hooks, OTel mapping,
   log correlation.
 - [Checkpointing](checkpointing.md): save state at each node boundary,
-  resume from a prior point.
+  resume from a prior point, schema migration across versions.
 
 If you're brand-new, [Quickstart](../getting-started/index.md) is the
 faster entry; under a minute to a running graph. Come back here when
diff --git a/docs/concepts/llms.md b/docs/concepts/llms.md
index 1d8f407..ae7c5fb 100644
--- a/docs/concepts/llms.md
+++ b/docs/concepts/llms.md
@@ -221,6 +221,58 @@ on every object. Pydantic-derived schemas may need `model_config =
 ConfigDict(extra="forbid")` on the class to get the
 `additionalProperties: false` in the generated JSON Schema.
 
+## Tool calling
+
+Beyond producing typed text, an LLM call can request work from local
+Python functions and resume with their results. The wire shape is a
+turn-based loop driven entirely from the same `complete()` call: the
+model emits `tool_calls`, the caller dispatches them to local
+functions, appends `ToolMessage` responses, and re-calls. The graph
+engine has no special concept of tools; the loop fits as a
+conditional-edge cycle.
+
+```python
+from openarmature.llm import Tool
+
+lookup_mission = Tool(
+    name="lookup_mission",
+    description="Look up factual records for a named lunar mission.",
+    parameters={
+        "type": "object",
+        "properties": {
+            "name": {"type": "string"},
+        },
+        "required": ["name"],
+        "additionalProperties": False,
+    },
+)
+
+response = await provider.complete(messages, tools=[lookup_mission, ...])
+```
+
+When the model decides to use one or more tools, the response carries
+`finish_reason="tool_calls"` and `response.message.tool_calls` is a
+list of `ToolCall(id, name, arguments)` records. `arguments` is a
+parsed dict whose shape matches the corresponding tool's `parameters`
+schema. The one exception: `arguments` is `None` when the response
+carries `finish_reason="error"` for unparseable model output.
+
+The caller dispatches each call to its local function, appends one
+`ToolMessage(content=..., tool_call_id=...)` per call to the message
+list, and re-calls. The `tool_call_id` field MUST match the
+`ToolCall.id` the model emitted so the model can pair its requests
+with the responses. The next turn either emits more `tool_calls` or
+returns a normal assistant content message signaling completion.
+
+The loop wires up as a graph cycle: a `call_llm` node, a
+`dispatch_tools` node that resolves calls, appends `ToolMessage`s,
+and returns to `call_llm`, and a conditional edge from `call_llm`
+that routes to `dispatch_tools` when `tool_calls` are present and to
+a termination node when they aren't. A turn cap in the routing
+function stops runaway loops if the model never stops requesting
+tools. See [`09 - Tool use`](../examples/09-tool-use.md) for the
+runnable shape.
+
 ## Content blocks (multimodal user messages)
 
 User messages carry content in one of two shapes: a plain text string,
@@ -434,6 +486,10 @@ classifier won't do this for them.
 - [API reference: `openarmature.llm`](../reference/llm.md) for the
   full surface: message types, `Response`, `RuntimeConfig`, every
   error class, validation helpers.
-- [Examples: `00-hello-world`](https://github.com/LunarCommand/openarmature-python/tree/main/examples/00-hello-world)
-  for a runnable graph exercising both `response_schema` forms in one
+- [Examples: 00 - Hello, world](../examples/00-hello-world.md) for a
+  runnable graph exercising both `response_schema` forms in one
   pipeline.
+- [Examples: 09 - Tool use](../examples/09-tool-use.md) for the
+  agent-loop pattern with two local tools.
+- [Examples: 07 - Multimodal prompt](../examples/07-multimodal-prompt.md)
+  for content blocks alongside versioned prompts.
diff --git a/docs/concepts/observability.md b/docs/concepts/observability.md
index f6aa23e..12027bd 100644
--- a/docs/concepts/observability.md
+++ b/docs/concepts/observability.md
@@ -132,7 +132,7 @@ A walk-through:
 
 - **`attempt_index`**: 0-based retry attempt counter. `0` for nodes
   not wrapped by retry middleware; `1+` for retries. Retry middleware
-  may wrap transitively — a retry on a [parallel-branches
+  may wrap transitively. A retry on a [parallel-branches
   branch](parallel-branches.md) or fan-out `instance_middleware`
   re-runs the whole subgraph; events from inner nodes carry the
   wrapping retry's attempt counter.
@@ -148,7 +148,7 @@ A walk-through:
 - **`branch_name`**: populated on events from nodes inside a
   [parallel-branches branch](parallel-branches.md), carrying the
   branch's name as declared on the dispatcher. `None` outside.
-  Independent of `fan_out_index` — both may be present simultaneously
+  Independent of `fan_out_index`; both may be present simultaneously
   when a parallel-branches branch contains a fan-out (or a fan-out
   instance contains a parallel-branches node). The combination
   `(namespace, branch_name, fan_out_index, attempt_index, phase)`
diff --git a/docs/concepts/parallel-branches.md b/docs/concepts/parallel-branches.md
index b6e94f6..6255e21 100644
--- a/docs/concepts/parallel-branches.md
+++ b/docs/concepts/parallel-branches.md
@@ -6,8 +6,8 @@ insertion order.
 
 Sibling to [fan-out](fan-out.md) (same `for each thing, do work in
 parallel` shape), but the *thing* is different per branch: a research
-subgraph, a categorize subgraph, a sentiment subgraph — each with its
-own state schema, its own middleware, its own observer events —
+subgraph, a categorize subgraph, a sentiment subgraph (each with its
+own state schema, its own middleware, its own observer events),
 running in parallel and joining their results into one parent state.
 
 ## When to reach for parallel branches
@@ -56,14 +56,14 @@ builder.add_parallel_branches_node(
 Each branch's `subgraph` is a compiled graph; `inputs` and `outputs`
 mirror the explicit projection shape from
 [composition](composition.md#explicitmapping-declarative). The
-branches dict's key is the branch name — used as the branch identity
+branches dict's key is the branch name, used as the branch identity
 on observer events (see [observability](observability.md)) and in
 the per-branch error records that `error_policy: "collect"`
 produces.
 
 ## Per-branch state, inputs and outputs
 
-Each branch runs its own subgraph against its own state — heterogeneous
+Each branch runs its own subgraph against its own state; heterogeneous
 schemas are explicit. Subgraph fields named in `inputs` are seeded
 from the parent's corresponding field at branch entry; other subgraph
 fields take their schema defaults. At branch exit, only the parent
@@ -72,7 +72,7 @@ branch's final state is discarded.
 
 When two branches contribute to the same parent field, the parent's
 reducer for that field applies both values in **branch insertion
-order** — first the branch declared first in the `branches` dict,
+order**: first the branch declared first in the `branches` dict,
 then the next, and so on. This is deterministic regardless of which
 branch's inner work finishes first.
 
@@ -83,7 +83,7 @@ branch's inner work finishes first.
   `ParallelBranchesBranchFailed` (a `NodeException` subtype) carrying
   the failing `branch_name` and the original cause as `__cause__`.
   `recoverable_state` is the parent's snapshot at the moment the
-  dispatcher entered — **no buffered branch contributions are
+  dispatcher entered. **No buffered branch contributions are
   applied**, including those of branches that successfully completed
   before the failure. Buffer-and-apply semantics: contributions are
   held until every branch finishes, then either all apply (success)
@@ -100,7 +100,7 @@ branch's inner work finishes first.
 
 ## Branch middleware
 
-Each `BranchSpec` accepts a `middleware` tuple — middlewares that
+Each `BranchSpec` accepts a `middleware` tuple of middlewares that
 wrap that branch's whole subgraph invocation as a unit. Retry
 middleware on a branch retries the **whole branch**: a fresh
 subgraph invocation each time, fresh inner-node execution. The
@@ -109,7 +109,7 @@ inner nodes (per graph-engine §6 v0.16.1), so observer events
 inside the branch correctly show `attempt_index` ticking across
 retries.
 
-Branch middleware is independent across branches — branch A may
+Branch middleware is independent across branches: branch A may
 have `[retry, timing]`; branch B may have `[]`; branch C may have
 some custom breaker. Each branch's chain composes in isolation.
 
@@ -118,15 +118,15 @@ some custom breaker. Each branch's chain composes in isolation.
 Parallel branches compose with the rest of the engine the way
 subgraphs and fan-outs do:
 
-- A branch's subgraph can itself contain a fan-out node — inner-node
+- A branch's subgraph can itself contain a fan-out node; inner-node
   events inside that fan-out carry **both** `branch_name` (this
   branch) and `fan_out_index` (the instance within this branch).
   The two fields are independent.
 - The parallel-branches node itself can be invoked from inside a
-  fan-out instance — inner events then carry the outer fan-out's
+  fan-out instance, and inner events then carry the outer fan-out's
   `fan_out_index` and the inner branch's `branch_name`.
 - Per-graph and per-node middleware on the parallel-branches node
-  wrap the dispatcher as a single unit — one `started` event before
+  wrap the dispatcher as a single unit: one `started` event before
   dispatch begins, one `completed` event after all branches finish
   and fan-in lands. The parent's retry middleware retries the **whole
   parallel-branches node**, not individual branches.
@@ -143,9 +143,9 @@ Per-branch progress is not individually persisted in v1.
 - **Not the same as N copies of one subgraph.** If you want "run
   this subgraph for each item in a list," reach for
   [fan-out](fan-out.md).
-- **Not a router.** A router is a conditional-edge pattern — pick
-  one branch based on state. Parallel branches runs *all* branches
-  concurrently.
+- **Not a router.** A router is a conditional-edge pattern that
+  picks one branch based on state. Parallel branches runs *all*
+  branches concurrently.
 - **Not a coordinator.** Branches don't communicate with each other
   during execution; if branch B's work depends on branch A's
   output, you want a linear pipeline (A → B), not parallel branches.
diff --git a/docs/getting-started/index.md b/docs/getting-started/index.md
index 8675dd1..5b1a821 100644
--- a/docs/getting-started/index.md
+++ b/docs/getting-started/index.md
@@ -69,9 +69,10 @@ assert final.log == ["hello", "world"]
 
 ## Next
 
-- [Concepts](../concepts/index.md): deeper on state, reducers,
-  projections, fan-out, subgraphs, observability.
-- [Examples](https://github.com/LunarCommand/openarmature-python/tree/main/examples):
-  five runnable demos, each driving a local OpenAI-compatible LLM
-  endpoint to do real work.
+- [Concepts](../concepts/index.md): deeper on state, reducers, graphs,
+  composition, fan-out, parallel branches, LLMs, prompts,
+  observability, checkpointing.
+- [Examples](../examples/index.md): ten runnable demos with
+  walk-throughs, each driving an OpenAI-compatible LLM endpoint to
+  do real work.
 - [API reference](../reference/index.md): auto-generated from docstrings.