6 changes: 5 additions & 1 deletion CHANGELOG.md
@@ -8,6 +8,9 @@ The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). The

### Added

- **`tool_choice` parameter on `Provider.complete()`** (proposal 0025, accepted in spec v0.20.0). Optional discriminated-union value constraining the model's tool-calling behavior: one of `"auto"`, `"required"`, `"none"`, or a `ForceTool(name=...)` record. Validation runs pre-send: `"required"` and `ForceTool` both demand non-empty `tools`, and `ForceTool.name` must appear in the supplied list; violations raise `ProviderInvalidRequest` (§7's existing category; no new error category is introduced). When `tool_choice` is `None` (the default) the wire field is omitted and the provider's own default applies, preserving pre-0025 behavior exactly. The `OpenAIProvider` maps the spec shape onto OpenAI's wire shape per §8.1.1 (the spec-level `ForceTool` discriminator `type="tool"` is renamed to `type="function"` on the wire).
- **`ForceTool` and `ToolChoice` public types** at `openarmature.llm.ForceTool` / `openarmature.llm.ToolChoice`. `ForceTool` is a frozen Pydantic model with `type: Literal["tool"] = "tool"` and `name: str`; `ToolChoice = Literal["auto", "required", "none"] | ForceTool` is the type alias used in `Provider.complete()`'s signature.
- **`validate_tool_choice` public validator** at `openarmature.llm.validate_tool_choice`. Standalone validator covering the three §5 pre-send rules; useful for third-party `Provider` implementations that want to reuse the canonical validation logic.
- **Bounded drain timeout on `CompiledGraph.drain()`** (proposal 0010, accepted in spec v0.19.0). `drain()` accepts an optional `timeout: float | None = None` parameter (non-negative seconds). When supplied, drain returns no later than the deadline; any observer events still queued or in-flight are reported as undelivered. Workers are cancelled cleanly so the compiled graph remains usable for subsequent invocations — partial delivery state from one drain does NOT leak into the next. Solves the "slow / hung / misbehaving observer blocks process exit" footgun for short-lived processes (CLIs, scripts, serverless functions). Observers SHOULD be cancellation-safe (idempotent writes, `try/finally` cleanup); the spec doesn't mandate it but the docs recommend it.
- **`DrainSummary` frozen dataclass** at `openarmature.graph.DrainSummary`. Returned from every `drain()` call (with or without `timeout`). Fields: `undelivered_count: int`, `timeout_reached: bool`. The shape is consistent across timed and untimed drains — callers receive the same dataclass whether the timeout was supplied or not. Per the v0.19.0 contract the two declared fields are the spec-mandated minimum; richer diagnostic detail (per-observer counts, sampled event metadata) is reserved for follow-on PRs.
- **Per-instance fan-out resume contract** (proposal 0009, accepted in spec v0.18.0). The engine now writes a checkpoint record at every `completed` event inside a fan-out instance (in addition to the existing outermost-graph + subgraph-internal + fan-out node completion saves). On resume the engine consults the saved record's `fan_out_progress` field and treats each instance as `completed` (skip, contribution rolls forward), `in_flight` (re-run from subgraph entry), or `not_started` (dispatch normally). The `append` reducer's no-double-merge guarantee holds across resume because `completed` is a one-shot accumulator state.
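
The bounded-drain behavior described above can be illustrated with plain `asyncio`. This is a minimal sketch, not the library's implementation: the `drain` function below is a hypothetical stand-in that takes a list of delivery tasks, and the `DrainSummary` dataclass only mirrors the two field names given in the changelog. Argument validation (for example rejecting a negative `timeout`) is left out.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class DrainSummary:
    """Stand-in carrying the two fields the changelog names."""

    undelivered_count: int
    timeout_reached: bool


async def drain(deliveries: list[asyncio.Task], timeout: float | None = None) -> DrainSummary:
    """Wait for pending deliveries, giving up at the deadline (hypothetical helper)."""
    if not deliveries:
        # asyncio.wait() rejects an empty collection, so short-circuit.
        return DrainSummary(undelivered_count=0, timeout_reached=False)
    # timeout=None waits for every task; otherwise wait() returns at the deadline.
    done, pending = await asyncio.wait(deliveries, timeout=timeout)
    for task in pending:
        task.cancel()
    # Let the cancellations unwind so no worker outlives this drain call.
    await asyncio.gather(*pending, return_exceptions=True)
    return DrainSummary(undelivered_count=len(pending), timeout_reached=bool(pending))


async def main() -> DrainSummary:
    async def deliver(delay: float) -> None:
        await asyncio.sleep(delay)

    fast = asyncio.create_task(deliver(0.01))
    hung = asyncio.create_task(deliver(60))  # simulates a hung observer
    return await drain([fast, hung], timeout=0.2)


summary = asyncio.run(main())
print(summary)
```

The hung delivery is cancelled and counted as undelivered, while the summary has the same shape it would have for an untimed drain.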
@@ -17,13 +20,14 @@ The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). The

### Changed

- **`Provider.complete()` signature** extended with an optional `tool_choice: ToolChoice | None = None` parameter (per proposal 0025 v0.20.0). Backward-compatible: callers that omit the new argument see no wire-shape change. Third-party `Provider` implementations MUST add the parameter to remain Protocol-conformant under strict type checking (and to accept calls that pass `tool_choice` without raising `TypeError`); they MAY ignore it in their wire-body emission, which is what a provider that does not honor `tool_choice` looks like at the implementation level. The `OpenAIProvider` wire mapping is implemented per §8.1.1.
- **`CompiledGraph.drain()` return type** changed from `None` to `DrainSummary` (pre-1.0; per proposal 0010 v0.19.0 contract). Callers that ignored the return are unaffected — `await graph.drain()` discards the returned dataclass exactly as before. Callers that explicitly typed the return as `None` will need to update their annotation.
- **Fan-out resume behavior** flipped from atomic restart (0008's v1 contract) to per-instance resume. A crash mid-fan-out used to re-run the entire fan-out on resume; now only the instances that did not complete-and-record their contribution re-run. The economics matter for large fan-outs of expensive work (LLM calls, long extractions): a fan-out that crashes when 80% of its instances have completed now restores those results on resume rather than discarding them.
- **`SQLiteCheckpointer` schema** picks up a new `fan_out_progress_blob` column (added via `ALTER TABLE` for backward compatibility with pre-0009 databases). Pre-0009 rows back-fill as NULL on load and round-trip as the empty-tuple default. Both `pickle` and `json` serialization modes round-trip the new field.
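
The schema change above follows a common SQLite migration pattern, sketched here with the standard library only. This is an illustration under assumptions, not the `SQLiteCheckpointer` code: the `checkpoints` table name, the `state_blob` column, and both helper functions are hypothetical, and only the `json` serialization mode is shown.

```python
from __future__ import annotations

import json
import sqlite3

TABLE = "checkpoints"  # hypothetical table name
COLUMN = "fan_out_progress_blob"  # column name taken from the changelog


def ensure_fan_out_column(conn: sqlite3.Connection) -> None:
    """Add the column when the database predates it; otherwise do nothing."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({TABLE})")}
    if COLUMN not in existing:
        conn.execute(f"ALTER TABLE {TABLE} ADD COLUMN {COLUMN} BLOB")


def load_fan_out_progress(blob: str | bytes | None) -> tuple:
    """Decode the column, mapping NULL (pre-migration rows) to the empty tuple."""
    return () if blob is None else tuple(json.loads(blob))


conn = sqlite3.connect(":memory:")
# Simulate an old database that lacks the new column.
conn.execute(f"CREATE TABLE {TABLE} (id INTEGER PRIMARY KEY, state_blob BLOB)")
conn.execute(f"INSERT INTO {TABLE} (state_blob) VALUES (?)", (json.dumps({"x": 1}),))

ensure_fan_out_column(conn)
ensure_fan_out_column(conn)  # safe to call again: the column already exists

(old_blob,) = conn.execute(f"SELECT {COLUMN} FROM {TABLE} WHERE id = 1").fetchone()
old_progress = load_fan_out_progress(old_blob)

conn.execute(
    f"INSERT INTO {TABLE} (state_blob, {COLUMN}) VALUES (?, ?)",
    (json.dumps({"x": 2}), json.dumps(["completed", "in_flight"])),
)
(new_blob,) = conn.execute(f"SELECT {COLUMN} FROM {TABLE} WHERE id = 2").fetchone()
new_progress = load_fan_out_progress(new_blob)
```

Rows written before the migration read back as the empty tuple, and rows written after it round-trip their recorded progress.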

### Notes

- **Pinned spec version bumped from v0.17.0 to v0.19.0 over this Unreleased cycle.** Four spec versions absorbed: v0.17.1 (proposal 0019, multi-provider wire-format extension; purely textual reframe of llm-provider §8 as a catalog of wire-format mappings, OpenAI-compatible body nested under §8.1, code references updated to §8.1 / §8.1.1 / §8.1.2 / §8.1.3 / §8.1.5.1 / §8.1.1.1), v0.18.0 (proposal 0009, per-instance fan-out resume; pipeline-utilities §10.3 / §10.7 revised, §10.11 added with per-instance state machine plus composition rules plus configurable batching; the `append` reducer no-double-merge invariant from §10.11.1 is the load-bearing correctness story; see Added / Changed above), v0.18.1 (fixture-only patch on `release/v0.18.1` correcting an off-by-one literal in fixture 052's expected `results`), and v0.19.0 (proposal 0010, bounded drain timeout; graph-engine §6 amended with the `timeout` parameter and `DrainSummary` return contract; see Added / Changed above). All existing conformance fixtures continue to pass.
- **Pinned spec version bumped from v0.17.0 to v0.20.1 over this Unreleased cycle.** Six spec versions absorbed: v0.17.1 (proposal 0019, multi-provider wire-format extension — purely textual reframe of llm-provider §8 as a catalog of wire-format mappings; OpenAI-compatible body nested under §8.1), v0.18.0 (proposal 0009, per-instance fan-out resume — pipeline-utilities §10.3 / §10.7 revised, §10.11 added; the `append` reducer no-double-merge invariant is the load-bearing correctness story), v0.18.1 (fixture-only patch correcting an off-by-one literal in fixture 052's expected `results`), v0.19.0 (proposal 0010, bounded drain timeout — graph-engine §6 amended with the `timeout` parameter and `DrainSummary` return contract), v0.20.0 (proposal 0025, llm-provider `tool_choice` — §5 / §7 / §8.1.1 amended; see Added / Changed above), and v0.20.1 (proposal 0026, llm-provider §8.X wire-format mapping subsection template — purely textual §8 framing paragraph; the existing OpenAI §8.1 mapping is the template's reference shape so no python module-level work was needed). All existing conformance fixtures continue to pass.

## [0.8.0] — 2026-05-23

41 changes: 41 additions & 0 deletions docs/concepts/llms.md
@@ -273,6 +273,47 @@ prevents runaway loops on a model that stays in tool-calling forever.
See [`09 - Tool use`](../examples/09-tool-use.md) for the runnable
shape.

### Controlling tool-call behavior with `tool_choice`

By default the model decides whether and which tools to call.
`tool_choice` constrains that decision per call. Four modes:

- `"auto"` — the model decides. Equivalent to omitting the parameter
when `tools` is non-empty.
- `"required"` — the model MUST call at least one tool. Useful for
routing nodes that branch on tool selection.
- `"none"` — the model MUST NOT call tools, even if `tools` is
supplied. Useful for guarded LLM calls or for explicitly disabling
tool-calling without rebuilding a tools-less request.
- `ForceTool(name=...)` — the model MUST call exactly the named tool.

Pre-send validation catches the three failure modes (`required` with
empty tools, `ForceTool` with empty tools, `ForceTool.name` not in
the supplied list) and raises `ProviderInvalidRequest` before the
HTTP request is sent.

```python
from openarmature.llm import ForceTool

# Routing node: model MUST pick one of the supplied tools.
response = await provider.complete(
    messages, tools=[search, summarize], tool_choice="required"
)

# Forced specific tool: useful when the pipeline knows which tool
# the model should call next (e.g., a `dispatch_search` node).
response = await provider.complete(
    messages, tools=[search, summarize], tool_choice=ForceTool(name="search")
)
```
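
To make the three pre-send failure modes concrete, here is a self-contained sketch that runs without the library installed. `ProviderInvalidRequest` and `ForceTool` below are local stand-ins that only borrow the real names, and `check_tool_choice` is a hypothetical helper that works on plain tool names rather than `Tool` objects; it illustrates the rules and is not the library's validator.

```python
from __future__ import annotations

from dataclasses import dataclass


class ProviderInvalidRequest(Exception):
    """Local stand-in for the library's error type."""


@dataclass(frozen=True)
class ForceTool:
    """Local stand-in for the record form of ``tool_choice``."""

    name: str


def check_tool_choice(tool_choice: str | ForceTool | None, tool_names: list[str]) -> None:
    """Apply the three pre-send rules to a list of tool names (hypothetical helper)."""
    if tool_choice == "required" and not tool_names:
        raise ProviderInvalidRequest('"required" needs a non-empty tools list')
    if isinstance(tool_choice, ForceTool):
        if not tool_names:
            raise ProviderInvalidRequest("ForceTool needs a non-empty tools list")
        if tool_choice.name not in tool_names:
            raise ProviderInvalidRequest(f"{tool_choice.name!r} is not a supplied tool")


def rejected(tool_choice: str | ForceTool | None, tool_names: list[str]) -> bool:
    """Report whether the request would be refused before anything is sent."""
    try:
        check_tool_choice(tool_choice, tool_names)
    except ProviderInvalidRequest:
        return True
    return False


tools = ["search", "summarize"]
results = {
    "required, no tools": rejected("required", []),
    "ForceTool, no tools": rejected(ForceTool("search"), []),
    "ForceTool, unknown name": rejected(ForceTool("translate"), tools),
    "ForceTool, known name": rejected(ForceTool("search"), tools),
    "none, no tools": rejected("none", []),
    "omitted": rejected(None, []),
}
```

Only the first three cases are refused; `"none"`, `"auto"`, and an omitted `tool_choice` carry no precondition on `tools`.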

Not all providers honor `tool_choice` — confirm with your provider's
documentation. The `OpenAIProvider` maps the spec shape onto OpenAI's
wire shape per the §8.1.1 mapping table. Whether the model actually
honored the constraint is observable from the returned
`finish_reason` and `tool_calls` fields; the framework does NOT
re-validate the response against the constraint.
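
As a rough illustration of the wire mapping mentioned above, the sketch below translates a `tool_choice` value into a request-body fragment. It rests on assumptions: the nested `{"type": "function", "function": {"name": ...}}` object follows OpenAI's public Chat Completions reference and is presumed to match the §8.1.1 table, `to_openai_tool_choice` is a hypothetical helper, and `ForceTool` is a local stand-in for the library type.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ForceTool:
    """Local stand-in; ``type`` is the spec-level discriminator."""

    name: str
    type: str = "tool"


def to_openai_tool_choice(tool_choice: str | ForceTool | None) -> dict[str, Any]:
    """Return the request-body fragment for ``tool_choice`` (hypothetical helper)."""
    if tool_choice is None:
        # Field omitted entirely, so the provider's own default applies.
        return {}
    if isinstance(tool_choice, ForceTool):
        # Spec-level type="tool" becomes type="function" on the wire.
        return {"tool_choice": {"type": "function", "function": {"name": tool_choice.name}}}
    # "auto", "required", and "none" are passed through as plain strings.
    return {"tool_choice": tool_choice}


body: dict[str, Any] = {"model": "example-model", "messages": []}
body.update(to_openai_tool_choice(ForceTool(name="search")))
```

Returning a fragment rather than a bare value keeps the "omit the field" case explicit: merging an empty dict leaves the request body untouched.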

## Content blocks (multimodal user messages)

User messages carry content in one of two shapes: a plain text string,
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -48,7 +48,7 @@ Repository = "https://github.com/LunarCommand/openarmature-python"
Specification = "https://github.com/LunarCommand/openarmature-spec"

[tool.openarmature]
spec_version = "0.19.0"
spec_version = "0.20.1"

[dependency-groups]
dev = [
2 changes: 1 addition & 1 deletion src/openarmature/__init__.py
@@ -1,4 +1,4 @@
"""OpenArmature: workflow framework for LLM pipelines and tool-calling agents."""

__version__ = "0.8.0"
__spec_version__ = "0.19.0"
__spec_version__ = "0.20.1"
6 changes: 6 additions & 0 deletions src/openarmature/llm/__init__.py
@@ -47,6 +47,7 @@
from .messages import (
AssistantMessage,
ContentBlock,
ForceTool,
ImageBlock,
ImageSource,
ImageSourceInline,
@@ -56,6 +57,7 @@
TextBlock,
Tool,
ToolCall,
ToolChoice,
ToolMessage,
UserMessage,
)
@@ -64,6 +66,7 @@
strict_mode_supported,
validate_message_list,
validate_response_schema,
validate_tool_choice,
validate_tools,
)
from .providers import OpenAIProvider, classify_http_error, parse_retry_after
@@ -83,6 +86,7 @@
"AssistantMessage",
"ContentBlock",
"FinishReason",
"ForceTool",
"ImageBlock",
"ImageSource",
"ImageSourceInline",
@@ -107,6 +111,7 @@
"TextBlock",
"Tool",
"ToolCall",
"ToolChoice",
"ToolMessage",
"Usage",
"UserMessage",
@@ -115,5 +120,6 @@
"strict_mode_supported",
"validate_message_list",
"validate_response_schema",
"validate_tool_choice",
"validate_tools",
]
42 changes: 42 additions & 0 deletions src/openarmature/llm/messages.py
@@ -68,6 +68,46 @@ class Tool(BaseModel):
parameters: dict[str, Any]


# Spec: realizes llm-provider §5 `tool_choice` discriminated-union
# (proposal 0025). The string-literal modes (`"auto"`, `"required"`,
# `"none"`) and the `ForceTool` record share the `ToolChoice` alias.
# Implementations validate `tool_choice` against `tools` before send
# (see ``validate_tool_choice`` in :mod:`provider`); violations raise
# ``ProviderInvalidRequest`` per §7.
class ForceTool(BaseModel):
    """Force the model to call exactly the named tool.

    Use the record form of the §5 `tool_choice` discriminated union
    when you need the model to call a specific tool by name. ``type``
    is the spec-level discriminator (``"tool"``); the wire mapping
    (§8.1.1) renames it to ``"function"`` for the OpenAI body. The
    ``name`` MUST match a ``Tool.name`` in the supplied ``tools``
    list; ``validate_tool_choice`` enforces this at pre-send time and
    raises ``ProviderInvalidRequest`` on violation.
    """

    model_config = ConfigDict(extra="forbid", frozen=True)

    # Frozen + extras-forbidden so a ``ForceTool`` instance is safely
    # hashable and structurally pinned. The ``Literal["tool"]`` default
    # makes ``ForceTool(name="search")`` ergonomic at the call site
    # while preserving the spec-level discriminator on the type.
    type: Literal["tool"] = "tool"
    name: str


# Per spec §5: `tool_choice` is one of:
# - ``"auto"`` — the model decides.
# - ``"required"`` — the model MUST call at least one tool.
# - ``"none"`` — the model MUST NOT call tools.
# - ``ForceTool(name=X)`` — the model MUST call the named tool.
# A union of the three string literals plus the record form.
# Callers pass ``tool_choice=None`` (the default) to omit the field
# from the wire — the provider's own default applies, preserving
# pre-0025 behavior.
ToolChoice = Literal["auto", "required", "none"] | ForceTool


# ---------------------------------------------------------------------------
# Per-role message classes
# ---------------------------------------------------------------------------
@@ -274,6 +314,7 @@ class ToolMessage(_MessageBase):
__all__ = [
"AssistantMessage",
"ContentBlock",
"ForceTool",
"ImageBlock",
"ImageSource",
"ImageSourceInline",
@@ -283,6 +324,7 @@ class ToolMessage(_MessageBase):
"TextBlock",
"Tool",
"ToolCall",
"ToolChoice",
"ToolMessage",
"UserMessage",
]
74 changes: 74 additions & 0 deletions src/openarmature/llm/provider.py
@@ -48,9 +48,11 @@
from .errors import ProviderInvalidRequest
from .messages import (
AssistantMessage,
ForceTool,
Message,
SystemMessage,
Tool,
ToolChoice,
ToolMessage,
UserMessage,
)
@@ -75,6 +77,7 @@ async def complete(
tools: Sequence[Tool] | None = None,
config: RuntimeConfig | None = None,
response_schema: dict[str, Any] | type[BaseModel] | None = None,
tool_choice: ToolChoice | None = None,
) -> Response:
"""Perform a single completion call.

@@ -93,6 +96,12 @@ async def complete(
supplied, the implementation constrains the model's
output to the schema and populates ``Response.parsed``
with the validated value.
tool_choice: Optional tool-choice constraint (spec §5). One
of ``"auto"``, ``"required"``, ``"none"``, or a
:class:`ForceTool` record. When ``None`` (the default)
the wire ``tool_choice`` field is omitted and the
provider's own default applies. Pre-send validation failures
raise ``ProviderInvalidRequest`` (the §7
``provider_invalid_request`` category).
"""
...

@@ -174,6 +183,70 @@ def validate_tools(tools: Sequence[Tool] | None) -> None:
seen.add(t.name)


# The string literals allowed under the §5 `tool_choice` shape.
# Pyright catches non-literal strings at type-check time via the
# ``ToolChoice = Literal[...] | ForceTool`` alias, but Python does
# not enforce Literal at runtime — untyped callers (tests, dynamic
# harnesses, ad-hoc scripts) can pass an arbitrary string. The
# runtime check below is the API-boundary defense against that.
_ALLOWED_TOOL_CHOICE_MODES: frozenset[str] = frozenset({"auto", "required", "none"})


# Spec: realizes llm-provider §5 `tool_choice` pre-send validation
# rules (proposal 0025). The three failure modes route through the
# existing §7 ``provider_invalid_request`` category; no new error
# categories per the spec's "no new category" framing. Validation
# fires BEFORE any HTTP request is sent (fixture 031's mock_provider
# returns an empty response list on these cases to fail the test
# if a request escapes the validation gate).
def validate_tool_choice(
    tool_choice: ToolChoice | None,
    tools: Sequence[Tool] | None,
) -> None:
    """Validate ``tool_choice`` against ``tools`` per spec §5.

    Raises :class:`ProviderInvalidRequest` (the §7
    ``provider_invalid_request`` category) on:

    - ``tool_choice`` supplied as a string that is not one of
      ``"auto"`` / ``"required"`` / ``"none"`` (runtime defense
      against untyped callers; the Literal alias catches well-typed
      ones at type-check time).
    - ``tool_choice="required"`` supplied with empty / absent
      ``tools``.
    - ``tool_choice=ForceTool(name=X)`` supplied with empty / absent
      ``tools``.
    - ``tool_choice=ForceTool(name=X)`` supplied with ``X`` not in the
      supplied tools list.

    No-op when ``tool_choice`` is ``None`` (the default; this preserves
    pre-0025 behavior: the wire field is omitted and the provider's
    own default applies). ``tool_choice="auto"`` and
    ``tool_choice="none"`` have no ``tools``-related preconditions.
    """
    if tool_choice is None:
        return
    if isinstance(tool_choice, str) and tool_choice not in _ALLOWED_TOOL_CHOICE_MODES:
        raise ProviderInvalidRequest(
            f'tool_choice {tool_choice!r} is not one of "auto" / "required" / "none"'
        )
    has_tools = bool(tools)
    if tool_choice == "required" and not has_tools:
        raise ProviderInvalidRequest('tool_choice="required" requires non-empty tools')
    if isinstance(tool_choice, ForceTool):
        if not has_tools:
            raise ProviderInvalidRequest(
                f"tool_choice ForceTool(name={tool_choice.name!r}) requires non-empty tools"
            )
        # ``tools`` is non-empty (and therefore non-None) here per the
        # preceding guard; ``or ()`` only satisfies the type checker.
        names = {t.name for t in tools or ()}
        if tool_choice.name not in names:
            raise ProviderInvalidRequest(
                f"tool_choice name {tool_choice.name!r} not in tools (declared: {sorted(names)})"
            )


# ---------------------------------------------------------------------------
# Schema helpers — used by structured-output Provider implementations
# ---------------------------------------------------------------------------
@@ -485,5 +558,6 @@ def _resolve_ref(ref: str, root: dict[str, Any]) -> Any:
"strict_mode_supported",
"validate_message_list",
"validate_response_schema",
"validate_tool_choice",
"validate_tools",
]