diff --git a/docs/examples/00-hello-world.md b/docs/examples/00-hello-world.md new file mode 100644 index 0000000..246bf34 --- /dev/null +++ b/docs/examples/00-hello-world.md @@ -0,0 +1,104 @@ +# 00 - Hello, world + +The smallest possible LLM-routed pipeline: classify a query, then +either plan research on it or summarize it in one sentence. Sets the +shape every other example builds on. + +## Overview + +You ask a question. A classifier LLM decides whether the question +wants new information or a summary of known material. Depending on +the answer, the run either calls a research-planner node (returns +topics to investigate plus follow-up questions) or a summarizer node +(returns one sentence plus a confidence score). + +The demo query is *"why did Apollo 13 abort its lunar landing?"*, +which the model usually routes to `summarize` because the facts are +well-established. + +## What it teaches + +- A typed [`State`](../concepts/state-and-reducers.md) holding query + plus per-node artifacts, with three reducer policies in one model + (`last_write_wins`, `append`, `merge`). +- The [`OpenAIProvider`](../concepts/llms.md) talking to any + OpenAI-compatible endpoint. +- Both forms of [structured output](../concepts/llms.md): pass a + Pydantic class as `response_schema` (`Classification`, `Summary`) + and get an instance back on `Response.parsed`; pass a JSON Schema + dict (`research`) and get a raw dict. +- `RuntimeConfig` for per-call sampling knobs. Every `complete()` + passes `RuntimeConfig(temperature=0.0)` so the run is as + reproducible as the API allows. +- A [conditional edge](../concepts/graphs.md) reading a parsed field + off state (`route` returns `state.classification.intent`). +- A function-shaped [observer](../concepts/observability.md) attached + after compile. + +## How to run + +```bash +uv sync --group examples +LLM_API_KEY=sk-... uv run python examples/00-hello-world/main.py +``` + +To point at a local OpenAI-compatible server, override `LLM_BASE_URL` +and (often) `LLM_MODEL`: + +```bash +LLM_BASE_URL=http://localhost:8000 LLM_MODEL=Qwen2.5-7B-Instruct \ + LLM_API_KEY= \ + uv run python examples/00-hello-world/main.py +``` + +## The graph + +```mermaid +flowchart TD + start([start]) + classify[classify] + research[research] + summarize[summarize] + stop([end]) + + start --> classify + classify -->|"intent == 'research'"| research + classify -->|"intent == 'summarize'"| summarize + research --> stop + summarize --> stop +``` + +Three nodes, one conditional edge. `classify` is the entry; `route` +inspects `state.classification.intent` and returns the name of the +next node. + +## Reading the output + +A clean run prints two lines from the observer and then the final +state: + +``` +classify: sources=[] +summarize: sources=['cache'] + +classification: intent='summarize' rationale='...' +summary: one_liner='...' confidence=0.92 +sources: ['cache'] +metadata: {'classified_by': 'llm', 'tool': 'summarize'} +``` + +- `classify: sources=[]` - the classifier ran, no sources have been + appended yet because only the chosen follow-up node adds them. +- `summarize: sources=['cache']` - the second node ran (since the + classifier picked `summarize`). The `append` reducer on the + `sources` field merged the new entry into the existing list. +- `classification` and `summary` are the parsed Pydantic instances, + not raw model output. Compare with `research_plan`, which would + show as a plain dict if the classifier had picked `research`. +- `metadata: {...}` shows the `merge` reducer in action. Each node + contributed one key (`classified_by`, `tool`); the final map has + both. + +If the classifier picks `research` instead, you'll see `research` +in the second observer line and a `research_plan` dict (with +`topics` and `follow_up_questions`) in the final printout. diff --git a/docs/examples/01-routing-and-subgraphs.md b/docs/examples/01-routing-and-subgraphs.md new file mode 100644 index 0000000..b3064bd --- /dev/null +++ b/docs/examples/01-routing-and-subgraphs.md @@ -0,0 +1,109 @@ +# 01 - Routing and subgraphs + +A question-answering assistant. Classify the question, then either +give a one-shot quick answer or run a multi-step research +sub-pipeline, then lightly copy-edit the result. + +## Overview + +You ask a question. A classifier LLM decides whether it can be +answered in one or two sentences ("quick") or whether it benefits +from considering multiple angles ("research"). Quick questions go +through a single `quick_answer` node. Research questions descend +into a subgraph that plans three angles, gathers a short note for +each, and synthesizes them into a paragraph. Either way, a final +`format_final` node copy-edits the answer. + +Demo questions: *"what year did the moon landing happen"* +(usually routes to quick) and *"why is the lunar south pole +strategically important?"* (usually routes to research). + +## What it teaches + +- [Conditional edges](../concepts/graphs.md) routing on a state + field. `classify` writes `state.route`; the conditional edge + function reads it and returns the next node's name. +- [Subgraph composition](../concepts/composition.md). The research + pipeline is itself a compiled graph, wrapped as a single node in + the outer graph via `add_subgraph_node`. +- A custom + [`ProjectionStrategy`](../concepts/composition.md). The default + `FieldNameMatching` only carries fields back *out* of a subgraph; + carrying the parent's question *in* requires writing a small + `ProjectionStrategy` class. The `QuestionProjection` here is the + canonical pattern for non-trivial subgraph boundaries. +- The [`merge` reducer](../concepts/state-and-reducers.md) for dict + accumulation. Every node returns a small `tallies` fragment; the + reducer accumulates them into one dict on the final state. + +## How to run + +```bash +uv sync --group examples +LLM_API_KEY=sk-... uv run python examples/01-routing-and-subgraphs/main.py \ + "why is the lunar south pole strategically important?" +``` + +The first positional arg becomes the question. With no arg, it falls +back to the lunar-south-pole question above. + +## The graph + +```mermaid +flowchart TD + start([start]) + classify[classify] + quick_answer[quick_answer] + format_final[format_final] + stop([end]) + + subgraph research [research subgraph] + direction TB + plan_research[plan_research] + gather[gather] + synthesize[synthesize] + plan_research --> gather --> synthesize + end + + start --> classify + classify -->|"route == 'quick'"| quick_answer + classify -->|"route == 'research'"| research + quick_answer --> format_final + research --> format_final + format_final --> stop +``` + +The research box is a separate compiled graph with its own state +schema (`ResearchState`). The `QuestionProjection` carries +`parent.question` in as `subgraph.question`, and brings +`subgraph.answer` plus `subgraph.trace` back out. + +## Reading the output + +For a research-route run, expect: + +``` +question: why is the lunar south pole strategically important? +route: research + +answer: + + +trace: ['classify', 'plan_research', 'gather', 'synthesize', 'format_final'] +tallies: {'classify_calls': 1, 'research_runs': 1, 'formatted': 1} +``` + +- `route` is the field `classify` wrote that the conditional edge + read. +- `trace` lists nodes in invocation order. Subgraph nodes appear + inline; that's the projection's `trace` field flowing back out + through the parent's `append` reducer. +- `tallies` has one entry per node that contributed: `classify` set + `classify_calls`, the subgraph projection's `project_out` set + `research_runs`, `format_final` set `formatted`. `quick_answer` + would have contributed `quick_answers: 1` if the run had gone the + other way. + +For a quick-route run, `trace` drops to `['classify', +'quick_answer', 'format_final']` and `tallies` has +`quick_answers: 1` in place of `research_runs: 1`. diff --git a/docs/examples/02-explicit-subgraph-mapping.md b/docs/examples/02-explicit-subgraph-mapping.md new file mode 100644 index 0000000..d3addf0 --- /dev/null +++ b/docs/examples/02-explicit-subgraph-mapping.md @@ -0,0 +1,103 @@ +# 02 - Explicit subgraph mapping + +Compare two topics by running the *same* compiled analysis subgraph +on each, with each call site writing into disjoint parent fields. +This is the canonical use of `ExplicitMapping`. + +## Overview + +You give the pipeline two topics ("Apollo 11" vs "Apollo 17", or +"Apollo program vs Artemis program"). One compiled subgraph +(`summarize → score`) is registered twice in the outer graph. The +first registration analyzes topic A and writes its results into +`a_summary` / `a_score`; the second analyzes topic B and writes +into `b_summary` / `b_score`. A final `synthesize` node reads both +sides and renders a verdict. + +Without explicit mapping the two sites would both write to a single +`parent.summary` field under default name matching, and the second +call would clobber the first. + +## What it teaches + +- [`ExplicitMapping`](../concepts/composition.md) for reusing one + compiled subgraph at multiple parent sites with disjoint parent + fields. Each site declares its own `inputs` and `outputs` dicts; + the same compiled subgraph value is registered twice. +- The encapsulation property that makes this work: the subgraph + speaks in neutral field names (`topic`, `summary`, `score`) and + has no idea which side of the comparison it's running for. The + mapping at each call site is what wires the subgraph's neutral + names to the parent's per-side fields. +- The contrast with example 01: there a custom + `ProjectionStrategy` carried one field in. Here the two sites + need to be similar-but-different, and `ExplicitMapping` is the + zero-boilerplate way to express that. + +## How to run + +```bash +uv sync --group examples +LLM_API_KEY=sk-... uv run python examples/02-explicit-subgraph-mapping/main.py "Apollo 11" "Apollo 17" +``` + +Or pass a single `"X vs Y"` arg, or no args (defaults to +`"Apollo 11"` vs `"Apollo 17"`). + +## The graph + +```mermaid +flowchart TD + start([start]) + synthesize[synthesize] + stop([end]) + + subgraph analysis_a [analyze_a: analysis subgraph] + direction TB + sa[summarize] + ra[score] + sa --> ra + end + + subgraph analysis_b [analyze_b: analysis subgraph] + direction TB + sb[summarize] + rb[score] + sb --> rb + end + + start --> analysis_a --> analysis_b --> synthesize --> stop +``` + +Both subgraph boxes are the *same* compiled value, registered twice +under different names with different mappings. The `analyze_a` site +maps `parent.topic_a → subgraph.topic` and back out as `summary → +a_summary`, `score → a_score`. The `analyze_b` site does the same +thing on the B-side parent fields. + +## Reading the output + +``` +topic A: Apollo 11 + summary: + score: 8/10 + +topic B: Apollo 17 + summary: + score: 7/10 + +verdict: + + +trace: ['summarize', 'score', 'summarize', 'score', 'synthesize'] +``` + +- The per-side summary and score fields are populated by separate + invocations of the same subgraph, routed by the mappings. +- `trace` shows the subgraph's nodes running **twice**, interleaved + with the outer `synthesize`. Both invocations contribute to the + same parent `trace` list because each `outputs` mapping includes + `"trace": "trace"`. +- `verdict` is whatever `synthesize` produced from reading both + sides. The outer node knows nothing about which side ran first; + it just reads four parent fields. diff --git a/docs/examples/03-observer-hooks.md b/docs/examples/03-observer-hooks.md new file mode 100644 index 0000000..ca74567 --- /dev/null +++ b/docs/examples/03-observer-hooks.md @@ -0,0 +1,124 @@ +# 03 - Observer hooks + +Add observability to a `draft → review → finalize` pipeline without +touching any node code. Three observer flavors run side-by-side: a +console tracer, a per-invocation metrics collector, and the +OpenTelemetry observer wired to a console span exporter. + +## Overview + +You ask a question. The outer graph drafts an answer, then descends +into a `review` subgraph that critiques the draft and produces a +revision, then runs a `finalize` node that marks the run done. + +Three observers watch every node boundary: + +1. A **graph-attached console tracer** prints one structured line + per node boundary to stderr. +2. An **invocation-scoped metrics collector** counts events, + errors, and unique namespaces seen on *this* call only. +3. The **`OTelObserver`** opens and closes spans on a private + `TracerProvider`, and a console span exporter prints the JSON + span at close. + +The observers share one `Observer` Protocol; nothing in the node +bodies knows or cares that they're attached. + +## What it teaches + +- [`attach_observer`](../concepts/observability.md) for + graph-attached observers (fire on every invocation until removed). +- The `observers=[...]` kwarg on `invoke()` for invocation-scoped + observers (fire only for that call). +- The [`NodeEvent`](../concepts/observability.md) shape: `phase`, + `step`, `namespace`, `pre_state`, `post_state`, `error`. +- Namespace chaining across subgraph boundaries. The subgraph's + events arrive with their parent node name prepended to the + namespace tuple. +- Function-shaped versus class-shaped observers (both satisfy the + Protocol structurally). +- The [`OTelObserver`](../concepts/observability.md) from the + `[otel]` extra, registered like any other observer. Same hook, + spans instead of prints. +- The `await graph.drain()` requirement for short-lived processes: + events go through a background queue, and `invoke()` returns when + the graph hits `END`, not when the queue empties. + +## How to run + +```bash +uv sync --group examples --all-extras +LLM_API_KEY=sk-... uv run python examples/03-observer-hooks/main.py \ + "what year did the moon landing happen" +``` + +`--all-extras` pulls in `opentelemetry-sdk` for the OTel observer. +The first positional arg becomes the question. + +## The graph + +```mermaid +flowchart TD + start([start]) + draft[draft] + finalize[finalize] + stop([end]) + + subgraph review [review subgraph] + direction TB + critique[critique] + revise[revise] + critique --> revise + end + + start --> draft --> review --> finalize --> stop +``` + +The `review` subgraph is wired with an `ExplicitMapping` that +carries `draft` IN; the default field-name matching brings +`revised` and `trace` back OUT. + +## Reading the output + +Both observers subscribe to `started` and `completed` events by +default, so for each node the engine fires two events sharing the +same `step`. The console tracer prints one line per event to +**stderr** in `[step=N] namespace → fields_changed` form; `started` +events print as `→ {}` since `post_state` isn't populated yet. The +trimmed sample below shows only the completed lines for +readability, interleaved with OTel JSON spans on **stdout**: + +``` +[step=1] draft → {'draft': '...', 'trace': ['draft']} +{"name": "draft", "context": {...}, "kind": "SpanKind.INTERNAL", ...} +[step=2] review.critique → {'critique': '...', 'trace': ['critique']} +[step=3] review.revise → {'revised': '...', 'trace': ['critique', 'revise']} +[step=4] finalize → {'trace': ['draft', 'critique', 'revise', 'finalize']} + +question: what year did the moon landing happen +draft: +revised: + +per-invocation metrics: + events seen: 8 + errors observed: 0 + unique namespaces: 4 + trace order: ['draft', 'critique', 'revise', 'finalize'] +``` + +- **`namespace` chains across subgraphs.** `draft` and `finalize` + fire at the top level (single-element namespace). `critique` and + `revise` fire inside the subgraph and arrive with namespace + `('review', 'critique')` / `('review', 'revise')`, which the + tracer joins with `.`. +- **`fields_changed` is the diff** between `pre_state` and + `post_state`. Each node's "what did it do?" is visible without + the node logging anything itself. +- **The metrics observer is per-invocation.** It counts only events + from this single `invoke()` call. The tracer and OTel observer + would persist across further `invoke()` calls on the same + compiled graph. +- **OTel spans appear as JSON on stdout** because we wired a + `ConsoleSpanExporter`. The `OTelObserver` uses a private + `TracerProvider`, so it does not pollute any global OTel setup + the surrounding application might have. diff --git a/docs/examples/04-nested-subgraphs.md b/docs/examples/04-nested-subgraphs.md new file mode 100644 index 0000000..c839ab7 --- /dev/null +++ b/docs/examples/04-nested-subgraphs.md @@ -0,0 +1,135 @@ +# 04 - Nested subgraphs + +Question answering against a tiny baked-in document corpus, with +two levels of subgraph nesting: outer coordinator, middle doc-QA, +inner section-extract. + +## Overview + +You ask a question. A small in-memory corpus of three documents +(Apollo 11, Apollo 13, Artemis II) sits at module level. The +pipeline runs in three layers: + +1. **Outer (coordinator).** Receives the question, delegates to the + doc-QA subgraph, polishes the final answer. +2. **Middle (doc-QA).** Picks the single most relevant document + from the corpus, hands it to the section-extract subgraph, then + synthesizes a clean answer from what came back. +3. **Inner (section-extract).** Given one document and the + question, finds the relevant paragraph and pulls out the answer + text. + +Each layer has its own state schema scoped to its job: the outer +cares about a question and a final answer, the middle picks one +document and synthesizes, the inner narrows to a paragraph and +extracts a span. + +## What it teaches + +- [Nested subgraphs](../concepts/composition.md): a compiled + subgraph is just a value, so it can be embedded inside another + compiled subgraph, recursively. +- Layer-scoped state schemas. Each compiled subgraph has its own + `State` subclass. Boundaries between layers are explicit + projections, not aliased namespaces. +- [`ExplicitMapping`](../concepts/composition.md) at every + parent ↔ child boundary, plumbing the question down through three + layers and the answer back up. +- A depth-aware [observer](../concepts/observability.md). The + observer prints `namespace` as a `>`-joined breadcrumb and indents + by depth. Useful for seeing the descent into nested subgraphs and + the return. + +## How to run + +```bash +uv sync --group examples +LLM_API_KEY=sk-... uv run python examples/04-nested-subgraphs/main.py \ + "what happened on Apollo 13?" +``` + +Other demo questions: *"what year did humans first land on the +moon?"* (routes to Apollo 11) and *"who was on the Artemis II +crew?"* (routes to Artemis II). With no arg the default is the +moon-landing year question. + +## The graph + +```mermaid +flowchart TD + start([start]) + receive[receive] + format_final[format_final] + stop([end]) + + subgraph doc_qa [doc_qa subgraph] + direction TB + pick_doc[pick_doc] + synthesize[synthesize] + + subgraph section_extract [section_extract subgraph] + direction TB + find_section[find_section] + extract_answer[extract_answer] + find_section --> extract_answer + end + + pick_doc --> section_extract --> synthesize + end + + start --> receive --> doc_qa --> format_final --> stop +``` + +The doc-QA box wraps the section-extract box. The outer's +`ExplicitMapping` carries `question` down into `doc_qa` and brings +`answer` plus `trace` back. Inside `doc_qa`, a second +`ExplicitMapping` carries `question` plus `selected_body` into the +section-extract subgraph (as its `question` and `doc_body` fields) +and brings the extracted text back as `raw_answer`. + +## Reading the output + +The depth observer indents each event by its namespace depth, so +the descent and return are visually obvious. A trimmed run: + +``` +[step 1] depth=1 receive + started question='what happened on Apollo 13?' answer='' + completed question='what happened on Apollo 13?' answer='' + + [step 2] depth=2 doc_qa > pick_doc + started question='...' selected_title='' selected_body='' + completed question='...' selected_title='Apollo 13' selected_body='...' + + [step 3] depth=3 doc_qa > section_extract > find_section + started question='...' doc_body='...' relevant_section='' + completed question='...' doc_body='...' relevant_section='...' + + [step 4] depth=3 doc_qa > section_extract > extract_answer + started relevant_section='...' extracted='' + completed relevant_section='...' extracted='aborted after an oxygen tank ruptured' + + [step 5] depth=2 doc_qa > synthesize + started raw_answer='aborted after...' answer='' + completed raw_answer='aborted after...' answer='Apollo 13's lunar landing was aborted after...' + +[step 6] depth=1 format_final + started answer='Apollo 13's lunar landing was aborted...' + completed answer='Apollo 13's lunar landing was aborted after an oxygen tank ruptured.' + +Answer: Apollo 13's lunar landing was aborted after an oxygen tank ruptured. + +Trace: ['receive', 'pick_doc', 'find_section', 'extract_answer', 'synthesize', 'format_final'] +``` + +- **`depth`** counts the namespace tuple. Top-level nodes are + depth=1; doc-QA subgraph nodes are depth=2; section-extract nodes + are depth=3. The indent makes the levels obvious at a glance. +- **`namespace`** chains: `doc_qa > section_extract > find_section` + means "find_section running inside section_extract running inside + doc_qa." Observability backends can use this same chain for + trace correlation. +- **`trace`** in final state shows the order of node completions, + flattened across all layers. Each subgraph's projection + contributed its trace back to the parent through the parent's + `append` reducer; the outer's `trace` ends up as the concatenation. diff --git a/docs/examples/05-fan-out-with-retry.md b/docs/examples/05-fan-out-with-retry.md new file mode 100644 index 0000000..35911a0 --- /dev/null +++ b/docs/examples/05-fan-out-with-retry.md @@ -0,0 +1,137 @@ +# 05 - Fan-out with retry + +Summarize a batch of lunar-mission headlines in parallel, with +per-headline retry and timing middleware wrapping each instance's +subgraph run. + +## Overview + +You have a list of news headlines. Each one needs a one-sentence +summary plus a topic tag. The headlines are independent, so the +work parallelizes naturally: dispatch one per-headline subgraph +run per headline, bounded concurrency, retry transient LLM failures +on a per-instance basis. + +The per-instance subgraph is small (`summarize → classify`) and +would also run standalone against a single headline. Fan-out +multiplies it out across the batch. + +A second mode, controlled by the `COLLECT_MODE` env var, exercises +the failure path. With `COLLECT_MODE=1` the demo prepends a +sentinel headline that always raises `ProviderUnavailable`; under +`error_policy="collect"` the failure lands in +`state.instance_errors` and the rest of the batch completes. + +## What it teaches + +- [`add_fan_out_node`](../concepts/fan-out.md) in `items_field` + mode: one subgraph invocation per element of `state.headlines`. + `item_field` names the per-instance input field on the subgraph's + state. +- `collect_field` and `extra_outputs` for harvesting per-instance + results into parent lists. The two lists (`summaries`, `topics`) + end up index-aligned. +- `instance_middleware`: middleware wrapped around each instance's + subgraph run. `RetryMiddleware` (3 attempts, deterministic + backoff) plus `TimingMiddleware` (captures duration per + instance). Retries are per-instance: a transient failure on + headline 3 doesn't restart 0-2. +- `concurrency=3` capping how many instances run in flight at once. +- `error_policy="fail_fast"` (default, first exhausted-retry + failure aborts the batch) vs `"collect"` (failures land in + `errors_field` and the batch produces partial results). +- A `fan_out_config_observer` reads + `NodeEvent.fan_out_config` on the fan-out node's dispatch event, + recording the resolved `item_count` / `concurrency` / + `error_policy` at runtime. Inner-instance events carry + `fan_out_index` but not the config. + +## How to run + +```bash +uv sync --group examples +LLM_API_KEY=sk-... uv run python examples/05-fan-out-with-retry/main.py +``` + +To exercise the collect path with a synthetic failure: + +```bash +COLLECT_MODE=1 LLM_API_KEY=sk-... \ + uv run python examples/05-fan-out-with-retry/main.py +``` + +## The graph + +```mermaid +flowchart TD + start([start]) + announce[announce] + present[present] + stop([end]) + + subgraph headline_runs [headline_runs: fan-out, concurrency=3] + direction TB + note["N instances of:
summarize -> classify
(retry + timing middleware)"] + end + + start --> announce --> headline_runs --> present --> stop +``` + +`headline_runs` is the fan-out node. At dispatch time it expands +into N copies of the per-instance subgraph, one per headline. +`RetryMiddleware` and `TimingMiddleware` wrap each instance. + +## Reading the output + +A clean default-mode run (`fail_fast`, all instances succeed): + +``` +======================================================================== +Summarizing 5 headlines in parallel (concurrency=3) +error_policy='fail_fast' +======================================================================== + + [observer] fan-out node 'headline_runs' dispatching: item_count=5 concurrency=3 error_policy='fail_fast' + +Results (in input order): + + [0] Artemis II splashes down in Pacific after ten-day lunar flyby + summary: + topic: crew + + [1] NASA pauses Lunar Gateway program in favor of crewed surface base + summary: + topic: policy + + ... + +Per-instance timings (in completion order): + #0 812.3 ms outcome=success + #1 941.7 ms outcome=success + #2 876.2 ms outcome=success + #3 903.4 ms outcome=success + #4 1012.8 ms outcome=success + + wall-clock total: 2089.3 ms + sum of per-instance: 4546.4 ms + → concurrency speedup: 2.18x +``` + +- **The observer line** is the `fan_out_config_observer` printing + the dispatch-time config. Useful when `count` or `concurrency` + are callable resolvers whose runtime value isn't visible in code. +- **Per-input order vs completion order.** The result loop walks + `final.headlines` in input order; `final.summaries` and + `final.topics` are index-aligned with it. The timings list is in + completion order, not input order (instance 2 may finish before + instance 1 under concurrency). +- **Concurrency speedup.** `sum of per-instance / wall-clock`. A + speedup near `concurrency` indicates the work parallelized well; + a value near 1.0 indicates concurrency didn't help (the upstream + serialized you, or instances themselves are short). + +With `COLLECT_MODE=1`, the output includes the sentinel headline +at index 0 with a `(failed after retries; ...)` marker, plus a +`Captured 1 per-instance error(s):` block listing the failed +`fan_out_index` and error category. The other instances complete +as usual. diff --git a/docs/examples/06-parallel-branches.md b/docs/examples/06-parallel-branches.md new file mode 100644 index 0000000..4150012 --- /dev/null +++ b/docs/examples/06-parallel-branches.md @@ -0,0 +1,134 @@ +# 06 - Parallel branches + +Enrich a lunar-mission news article with three independent +analyses (one-sentence summary, sentiment label, topic tags) +running concurrently as separate subgraphs. + +## Overview + +Where fan-out (example 05) runs N copies of *one* subgraph against +different inputs, parallel-branches runs M *heterogeneous* +subgraphs against the same input. Different state schemas, +different middleware, different topologies per branch, one +dispatch. + +The article goes into three branches in parallel: + +- **summary**: bare subgraph, one node, writes `summary` back. +- **sentiment**: subgraph wrapped in `RetryMiddleware` (the + classification call is short and cheap to retry), writes a + `label` back into the parent's `sentiment` field. +- **topics**: bare subgraph, writes a `tags` list back into the + parent's `topics` field. + +The branches don't depend on each other, so they fire concurrently +and the parent fans in once all three complete. + +## What it teaches + +- [`add_parallel_branches_node`](../concepts/parallel-branches.md): + M named `BranchSpec`s under one node. Each spec carries its own + compiled subgraph plus per-branch input/output projection plus + optional per-branch middleware. +- Branches with *different* state schemas. The summary subgraph's + state has a `summary` field; the sentiment subgraph's has + `label`; the topics subgraph's has a `tags` list. The projection + mappings translate between the branch's vocabulary and the + parent's. +- Heterogeneous per-branch middleware. The sentiment branch wraps + its subgraph in retry; the other two run bare. A production + pipeline often wants different retry policies, timing windows, or + custom middleware per branch. +- Branch insertion order = fan-in order. When two branches write to + the same parent field, the parent's reducer applies them in the + order they were declared in the `branches` mapping (not in + completion order). The three branches here write disjoint parent + fields, so the order doesn't affect the result, but the property + holds. +- A `branch_attribution_observer` reads + `NodeEvent.branch_name` on inner-node events. `branch_name` is + populated only for events *inside* a branch's subgraph; + outer-graph nodes carry `branch_name=None`. This is the + per-event attribution that lets observability backends route + metrics and spans by branch. + +## How to run + +```bash +uv sync --group examples +LLM_API_KEY=sk-... uv run python examples/06-parallel-branches/main.py +``` + +The article is baked into the example. + +## The graph + +```mermaid +flowchart TD + start([start]) + receive[receive] + present[present] + stop([end]) + + subgraph enrich [enrich: parallel-branches] + direction TB + summary[summary branch] + sentiment[sentiment branch
retry middleware] + topics[topics branch] + end + + start --> receive --> enrich --> present --> stop +``` + +`enrich` is the parallel-branches node; the three branches inside +the box dispatch concurrently against the same `article` field on +parent state. The sentiment branch is the only one with middleware +attached. + +## Reading the output + +``` +======================================================================== +Lunar-mission article enrichment - three independent analyses in parallel +======================================================================== + +Article (642 chars): + +NASA's Artemis II crew capsule Integrity splashed down in the Pacific +Ocean this evening, ending a ten-day flight that carried four astronauts +on a free-return trajectory around the Moon and back... + + [observer] (branch=summary) inner node 'write_summary' started + [observer] (branch=sentiment) inner node 'classify_sentiment' started + [observer] (branch=topics) inner node 'extract_topics' started + +======================================================================== +Enrichment results +======================================================================== + + summary: + sentiment: positive + topics: ['Artemis II', 'splashdown', 'lunar program'] + + wall-clock: 1142.6 ms + +The three branches ran in parallel - wall-clock is closer to the +slowest single branch than to the sum of all three. +``` + +- **The three observer lines** fire close together (often within a + few ms of each other), confirming the branches dispatched in + parallel rather than serially. +- **`branch_name` attribution** is what makes per-branch + observability tractable. `write_summary` knows nothing about + `branch_name`; it's the engine that tags the event for the + observer. +- **Wall-clock under 1500 ms** for three sequential LLM calls is + the clearest indicator of parallelism. Three serial calls at + roughly 1s each would land near 3 seconds; under parallel + dispatch the wall-clock approaches the slowest branch's duration. +- **Disjoint output fields** mean the reducer order at fan-in + doesn't matter here. If two branches both wrote to `summary`, the + declared branch order (`summary` before `sentiment` before + `topics`) would determine which value won under the default + `last_write_wins` reducer. diff --git a/docs/examples/07-multimodal-prompt.md b/docs/examples/07-multimodal-prompt.md new file mode 100644 index 0000000..4838f5a --- /dev/null +++ b/docs/examples/07-multimodal-prompt.md @@ -0,0 +1,126 @@ +# 07 - Multimodal prompt + +Two independent analyses of a lunar-mission photograph using +versioned prompt templates, a fallback prompt backend, and a +multimodal user message that carries both text and image. + +## Overview + +Given a photograph of a lunar mission (default: Buzz Aldrin on the +lunar surface during Apollo 11), run two independent analyses +against the same image: + +- **describe-surface**: describe what's visible of the lunar + surface. +- **describe-equipment**: identify the equipment in the frame. + +Both prompts take the mission name as their only variable; neither +depends on the other's output. Both rendered prompts are grouped +under one observability `PromptGroup` so a trace UI can show the +analyses as one logical unit. + +The image source can be a URL (default) or a local file. Setting +`IMAGE_PATH` switches the demo to inline base64 transport. + +## What it teaches + +- [`PromptManager`](../concepts/prompts.md) configured with two + `FilesystemPromptBackend`s: a primary `prompts/` directory and a + fallback `prompts_fallback/` directory. The manager tries them in + order on every `get`; the fallback fires only when the primary + raises `PromptStoreUnavailable`. Typical production shape is + "remote primary + filesystem fallback". +- [`FilesystemPromptBackend`](../concepts/prompts.md) with the + `/