LunarCommand · chris-colinsky · May 27, 2026 · May 27, 2026 · May 27, 2026
diff --git a/docs/concepts/observability.md b/docs/concepts/observability.md
@@ -586,3 +586,100 @@ appear dropped. Two workarounds:
 - Use `SimpleSpanProcessor` instead of `BatchSpanProcessor` in
   tests; it exports synchronously and is unaffected by teardown
   timing.
+
+## Langfuse mapping (opt-in)
+
+A second sibling observer maps the same `NodeEvent` stream onto
+Langfuse's native Trace + Observation data model — Traces at the
+top, Span observations for graph nodes, Generation observations for
+LLM calls. Use it instead of (or alongside) the OTel observer when
+your trace UI is Langfuse and you want first-class Generation
+rendering without going through Langfuse's OTLP ingest.
+
+```python
+from openarmature.observability.langfuse import (
+    InMemoryLangfuseClient,
+    LangfuseObserver,
+)
+
+client = InMemoryLangfuseClient()  # or langfuse.Langfuse(...) in prod
+observer = LangfuseObserver(client=client)
+graph.attach_observer(observer)
+```
+
+The `client` is anything matching the `LangfuseClient` Protocol —
+the bundled `InMemoryLangfuseClient` (used by the conformance
+harness, useful for unit tests), or a real `langfuse.Langfuse()`
+instance from the [Langfuse Python SDK](https://github.com/langfuse/langfuse-python).
+The Protocol declares only the methods the observer calls, so SDK
+versions whose shape matches drop in directly. SDK versions whose
+shape diverges (renamed kwargs, return-type quirks) plug in via a
+small adapter; see
+[`examples/10-langfuse-observability`](../examples/10-langfuse-observability.md)
+for the runnable demo plus the adapter shape.
+
+### What Langfuse sees
+
+- **Trace ID = invocation ID.** The Trace's `id` is the OA
+  `invocation_id` verbatim, so cross-system lookup by invocation_id
+  finds the Langfuse Trace directly (spec §8.4.1).
+- **Trace name.** Defaults to the entry-node name (spec §8.6
+  fallback). Caller-supplied invocation labels land in PR 4
+  (proposal 0034).
+- **Per-observation metadata.** Each Span / Generation carries
+  `namespace`, `step`, `attempt_index`, optional `fan_out_index` /
+  `branch_name`, and the `correlation_id` cross-cutting join key
+  (spec §8.5).
+- **Generation fields.** LLM calls become Generation observations
+  with `model`, `model_parameters` (the `gen_ai.request.*` request
+  parameters lifted by inclusion per §8.4.3), `usage` (input /
+  output / total tokens), and `metadata.finish_reason` /
+  `system` / `response_model` / `response_id`.
+
+### Payload + truncation
+
+`disable_llm_payload` mirrors the OTel observer's flag — defaults
+to `True` for the same privacy reason. Flip to `False` to populate
+`generation.input` / `output` / `metadata.request_extras` from the
+LLM event payload.
+
+```python
+observer = LangfuseObserver(
+    client=client,
+    disable_llm_payload=False,
+    payload_byte_cap=65536,
+)
+```
+
+When a payload exceeds `payload_byte_cap`, the observer emits the
+serialized form with the §5.5.5 truncation marker
+(`…[truncated, M bytes total]`) verbatim as a raw string instead of
+parsing back to native shape. The unparseable JSON IS the
+truncation signal in the Langfuse UI.
+
+### Prompt linkage
+
+When a Prompt's source backend exposes a Langfuse Prompt entity
+reference under `Prompt.observability_entities['langfuse_prompt']`,
+the Generation observation links to that entity natively (spec
+§8.4.4 case 1). Backends that don't surface a Langfuse reference
+(filesystem, in-memory, etc.) leave the Generation with
+`metadata.prompt` populated but no entity link (case 2).
+
+### Composition with OTel
+
+The two observers are independent §6 event consumers and can be
+attached together. They share the `correlation_id` as the
+cross-backend join key — find a slow Generation in Langfuse, search
+for its `correlation_id` in OTel logs, see the surrounding
+infrastructure activity.
+
+```python
+otel_observer = OTelObserver(span_processor=...)
+langfuse_observer = LangfuseObserver(client=langfuse_client)
+graph.attach_observer(otel_observer)
+graph.attach_observer(langfuse_observer)
+```
+
+Each observer's `disable_llm_spans` / `disable_llm_payload` flag is
+independent; one MAY emit while the other suppresses.
diff --git a/docs/examples/10-langfuse-observability.md b/docs/examples/10-langfuse-observability.md
@@ -0,0 +1,159 @@
+# 10 - Langfuse observability
+
+Send LLM call observability to Langfuse natively — Trace at the top,
+Span observations for graph nodes, Generation observations with input,
+output, token usage, model parameters, and a native link back to the
+prompt entity the call rendered from.
+
+## Overview
+
+A mission-briefing assistant answers questions about Apollo and Artemis
+missions. The pipeline fetches a versioned prompt template, renders it
+with the user's question, sends the rendered messages to the model,
+and stores the response. The Langfuse observer captures the full call
+shape as the graph runs.
+
+The demo's prompt backend stubs a Langfuse-source by attaching a
+sentinel `langfuse_prompt` reference to the rendered prompt. The
+Generation observation reads that reference and links back to the
+prompt entity — exactly what you'd see in a production Langfuse
+dashboard threading "this generation came from prompt v7" without any
+manual wiring at the call site.
+
+## What it teaches
+
+- [`LangfuseObserver`](../concepts/observability.md#langfuse-mapping-opt-in)
+  attaches like any other observer; nothing in the node code knows or
+  cares about which backend is recording.
+- The `LangfuseClient` Protocol decouples the observer from the SDK.
+  The bundled `InMemoryLangfuseClient` recorder is the test/demo
+  shape; production passes a real `langfuse.Langfuse()` instance (or
+  a thin adapter — see [Reading the output](#reading-the-output)
+  below).
+- Prompt linkage through
+  [`Prompt.observability_entities`](../concepts/prompts.md#backend-keyed-observability-entity-references):
+  a prompt backend that exposes a Langfuse Prompt entity reference
+  surfaces it on every Generation that renders from that prompt.
+  Filesystem / in-memory backends without that reference work too,
+  they just produce metadata-only linkage.
+- `disable_llm_payload=False` opt-in for capturing input messages +
+  output content on Generation observations. Default-off is the
+  privacy posture; the demo deliberately flips it.
+- `correlation_id` cross-cutting metadata on the Trace and every
+  Observation — the join key if you're also running an OTel observer
+  alongside.
+
+## How to run
+
+```bash
+uv sync --group examples
+LLM_API_KEY=sk-... uv run python examples/10-langfuse-observability/main.py \
+  "what year did Apollo 11 land"
+```
+
+The first positional arg becomes the question. The demo uses an
+in-memory recorder so no Langfuse account is needed.
+
+## The graph
+
+```mermaid
+flowchart TD
+  start([start])
+  answer[answer_briefing]
+  stop([end])
+
+  start --> answer --> stop
+```
+
+A single-node graph: fetch the prompt, render with the question, call
+the LLM under `with_active_prompt(...)`, store the response. The
+single node is deliberate — the value is in the captured Trace shape,
+not the graph topology.
+
+## Reading the output
+
+After the answer prints, the script renders the captured Langfuse
+Trace + Observation tree:
+
+```
+question: what year did Apollo 11 land
+answer:   Apollo 11 landed on the Moon on July 20, 1969 ...
+prompt:   mission-briefing v7
+
+─── captured Langfuse trace ─────────────────────────────────
+Trace id=01234567-89ab-...
+      name='answer_briefing'
+      metadata={correlation_id='...', entry_node='answer_briefing', spec_version='0.26.0'}
+  [span] 'answer_briefing' level=DEFAULT
+    metadata={attempt_index=0, correlation_id='...', namespace=['answer_briefing'], step=0}
+    [generation] 'openarmature.llm.complete' level=DEFAULT
+      metadata={correlation_id='...', finish_reason='stop', prompt={...},
+                response_id='...', response_model='gpt-4o-mini-2024-...',
+                system='openai'}
+      model='gpt-4o-mini'
+      usage=input:48 output:32 total:80
+      prompt_entity_link='lf-prompt-mission-briefing-v7'
+      output='Apollo 11 landed on the Moon on July 20, 1969 ...'
+```
+
+- **Trace name = entry node name** by default. The caller-supplied
+  invocation-label path (a per-`invoke()` argument that overrides the
+  default) ships with proposal 0034's caller-metadata work.
+- **Span observation per node.** `answer_briefing` is the only node
+  here; a multi-node graph would produce a tree of nested Span
+  observations under the Trace.
+- **Generation observation per LLM call.** Carries `model`, `usage`,
+  `output`, and the prompt-identity metadata. In a production Langfuse
+  dashboard this is what the "Generation" detail view renders.
+- **`prompt_entity_link`** is the value `Prompt.observability_entities['langfuse_prompt']`
+  carried — a sentinel string in this demo, a real Langfuse SDK Prompt
+  object in production. When the backend doesn't surface the reference
+  (e.g., a filesystem backend), the link is absent but the
+  `metadata.prompt` map (name, version, label, hashes) still appears
+  for traceability.
+
+## Swapping to a real Langfuse SDK
+
+The observer's `client` parameter is `LangfuseClient`-Protocol-typed,
+so any structurally-compatible value works:
+
+```python
+from langfuse import Langfuse
+
+client = Langfuse(
+    public_key="pk-lf-...",
+    secret_key="sk-lf-...",
+    host="https://cloud.langfuse.com",
+)
+observer = LangfuseObserver(client=client, disable_llm_payload=False)
+```
+
+If the installed SDK version's `trace` / `span` / `generation` method
+signatures match the Protocol exactly, this is the whole change. If
+they diverge (renamed kwargs, return-type quirks), wrap the SDK in a
+small adapter class that implements `LangfuseClient` and delegates to
+the SDK call-by-call. The Protocol surface is narrow — four methods —
+so the adapter is on the order of 40 lines.
+
+For prompt linkage: in production, the
+`Prompt.observability_entities['langfuse_prompt']` value is the SDK's
+own Prompt-entity object (returned by `langfuse_client.get_prompt(...)`)
+rather than the sentinel string this demo uses. The observer passes
+that value straight through to the SDK's `generation(..., prompt=...)`
+argument, which is what the SDK uses to establish the native link.
+
+## Composition with OTel
+
+Both observers consume the same `NodeEvent` stream and can be attached
+together:
+
+```python
+graph.attach_observer(OTelObserver(span_processor=batch))
+graph.attach_observer(LangfuseObserver(client=langfuse_client))
+```
+
+Their `disable_llm_spans` / `disable_llm_payload` flags are
+independent. The `correlation_id` cross-cutting attribute is the join
+key — find a slow Generation in Langfuse, search for the
+`correlation_id` in OTel logs to see the surrounding infrastructure
+activity.