Static contracts for agent-authored Python.
clauz3 experiments with a different permission surface for agents: the
agent writes a small Python program, attaches a contract describing its trusted
side effects, and a static prover checks the contract before anything runs.
The contract is the thing the user should be asked to accept. The program is still available for inspection, but a user should not have to read every branch to answer questions like:
- Will this email anyone other than Bob?
- Will this email the same person twice?
- Will this withdraw more than $5 total?
- Will this write outside a sandbox?
A run, end to end. The user asks for a bill to be paid (capped); the agent
writes a small program that looks up the outstanding balance and pays
min(balance, $500); ClauZ3 proves the spend stays under the cap on every
branch and that only the named account is touched; the user approves the
guarantees, not the code; only then does the runtime execute it.
Editable vector source: docs/assets/figure-sequence.svg.
Projects are split into three layers:
- trusted roots such as
tools/email/trusted/: small audited modules containing both side-effecting functions and reusable domain contracts. These are the environment contract: the prover trusts their signatures,@deal.has(...)markers,@deal.pre(...)preconditions, and@contracthelper definitions. - agent-authored code: ordinary Python that imports trusted functions and adds
@clauz3.guarantee(...)decorators to the function being proved.
Trusted calls bottom out into symbolic effect facts. For example:
@deal.pre(lambda addr, msg: "@" in addr)
@deal.has("trusted")
def send_email(addr: str, msg: str) -> None:
...A call to send_email("bob@example.com", "hi") creates a fact whose relation
name is send_email, whose fields are addr and msg, and whose condition is
the branch condition under which the call is reachable.
Domain logic can query those generated facts:
from clauz3.spec import ContractSpec, contract, effect
from clauz3.spec import no_guarantees as core_no_guarantees
Email = effect("send_email")
@contract
def only(addresses: list[str]) -> ContractSpec:
return Email.all(lambda e: e.addr in addresses)
@contract
def no_guarantees() -> ContractSpec:
return core_no_guarantees()
@contract
def unique_recipients() -> ContractSpec:
return Email.distinct(lambda e: e.addr)Agent code then states guarantees in terms of that vocabulary:
import clauz3
from tools.email.trusted import contracts as emails
from tools.email.trusted.effects import send_email
@clauz3.guarantee(emails.only(["bob@example.com"]))
@clauz3.guarantee(emails.unique_recipients())
def main() -> None:
send_email("bob@example.com", "hi")Run the prover:
cd examples/email
uv run clauz3 prove \
--trusted-root tools/email/trusted \
cases/only_bob_pass.pyFor programs that combine domains, either repeat --trusted-root or pass
several roots with --trusted-roots:
uv run clauz3 prove plan.py \
--trusted-roots tools/email/trusted tools/db/trustedTo start a new project from an existing one, copy its trusted tools/ layer
with clauz3 install. It accepts a local path, a gh:org/repo shorthand, or
a full git URL, and copies each tools/<domain>/ into the current directory:
# Local path
clauz3 install /path/to/repo/examples/email
# GitHub shorthand (clones into ~/.cache/clauz3/sources/<sha>/)
clauz3 install gh:normalform-ai/clauz3-tools-autolabs
# Pin to a tag, branch, or sha for reproducibility
clauz3 install gh:normalform-ai/clauz3-tools-autolabs@v0.3.1
# Full URL (HTTPS or SSH)
clauz3 install git@github.com:normalform-ai/clauz3-tools-assistant.gitPass --into <dir> to target a different destination, --force to overwrite an
existing layer, and --skills to also generate
agents/skills/<domain>/SKILL.md describing the installed effects and
contracts:
clauz3 install gh:normalform-ai/clauz3-tools-autolabs --into my-project --skillsAuthentication for git remotes uses your existing git configuration (SSH key
or credential helper); clauz3 does not manage credentials itself. The remote
cache is keyed by the resolved commit sha; pinning to @<sha> hits a stable
entry forever, while ref-pinned or HEAD installs re-resolve on each call and
get a fresh cache entry whenever upstream moves. Override the cache location
with CLAUZ3_CACHE.
The install itself is a filesystem copy. Signing — so a user can verify the installed layer matches what was attested upstream — is tracked in issue #47.
clauz3 ships a small set of real (non-mock) tools under
src/clauz3/stdlib. After pip install clauz3, install
one into the current repo with the stdlib: scheme:
clauz3 install stdlib:filesystem
clauz3 install stdlib:grepThe bundled tools are located with importlib.resources, so this works the
same from a source checkout or an installed wheel. (A bare clauz3 install filesystem also works as a shorthand.)
| Tool | Effects | Headline contracts |
|---|---|---|
filesystem |
read_file, write_file |
read_only, only_write_under, never_read_under, writes_at_most |
grep |
grep (imports filesystem) |
only_read_under, never_read_under, searches_at_most, only_pattern |
These differ from the examples/ layers in two ways: their effect bodies do
real work (so the prover skips them but clauz3 run executes them), and their
proof cases live under tests/cases/ with a tests/Justfile rather than at the
top level. just stdlib proves them all, or run one library's suite directly
with clauz3 test stdlib:filesystem (it invokes the library's tests/Justfile
with just).
The repo currently has five worked examples:
| Example | Contracts | What it demonstrates |
|---|---|---|
examples/email |
emails.only, emails.none, emails.no_guarantees, emails.unique_recipients, emails.content_length_at_most |
allowlists, explicit absence of guarantees, absence of effects, pairwise uniqueness, string length bounds |
examples/bank |
bank.max_spend, bank.only_account |
numeric aggregation, field equality, and fixed-bound loop unrolling over trusted calls |
examples/email-from-db |
emails.addresses_from, db.only_table, db.only_where |
for-loops over trusted query returns, column-binding constraints |
examples/text |
text.length_between, text.must_not_contain, text.no_regex_metacharacters, text.only_edit_under, text.edit_length_at_most |
string-length bounds, required/banned substrings, regex-metacharacter safety before sending, and file-edit path/size policies |
examples/http |
http.host_only, http.no_posts |
a shared @deal.has marker spanning two trusted calls, url-prefix matching, and method-specific absence of effects |
All examples use generic effect relations inferred from trusted function signatures. There is no email-specific, bank-specific, or database-specific logic in the core.
This is experimental. The current prover vendors and extends deal-solver.
Implemented pieces include:
clauz3.guarantee(...), a no-op runtime decorator consumed by the prover.- trusted-call fact recording at
@deal.has(...)boundaries. clauz3.spec.effect(...)for relation-style queries over trusted facts.- relation primitives:
no_guarantees,all,empty,where,count,distinct, numericsum, andshares_value. - stateful fluents (
clauz3.fluent): trusted functions declare successor-state axioms with@effect(...), and contracts query the final valuation withFluent.final.all(...)/Fluent.final[key] == value. This expresses order- and post-state-sensitive policies such as "every door is locked at the end" that the multiset relation language cannot. See docs/reference/fluents.md. clauz3 provefor proving examples from the CLI.clauz3 installfor copying a trustedtools/layer from a local path or a bundled stdlib tool (stdlib:filesystem,stdlib:grep), optionally generatingagents/skills/<domain>/SKILL.mdstubs.clauz3 testfor running a library's bundledtests/Justfilewithjust, resolving the source the same way asinstall.clauz3 configfor writing this repo's default Claude Code permissions (read-only tools plus theclauz3CLI) to.claude/settings.json. Idempotent and the configuration counterpart toinstall.clauz3 runfor proving a complete inline program, submitting an approval request to an externally configured approval service, and executingmainonly after an approval receipt is returned.clauz3 approval-servicefor starting a simple localhost FastAPI approval service with REST endpoints and a browser UI for user decisions. With--policy, a policy admin's rules can auto-approve or auto-reject a request by asking the prover whether the program entails the rule's contracts, falling back to a human otherwise.clauz3 policy-checkdry-runs a policy against a program. See docs/todos/approval-policies.md.clauz3 mock-approval-servicefor config-driven tests and local demos.- For-loops over
list[Row]-returning trusted calls, with column-binding contracts viaUserRow.emailmarkers. See docs/explanation/symbolic-iteration.md. - Domain coverage policies declared in a trusted-root
policy.py: contracts a domainrecommendeds are flagged in the approval UI when a used domain is under-constrained, and contracts it marksrequiredare conjoined into the proof and rejected if unmet. See docs/reference/coverage-policies.md.
Important limitations:
- Contract helper functions are still executed as builder functions. Their lambda bodies are parsed as a small expression subset, but the helper body itself is not yet fully AST-validated.
- Runtime receipt enforcement in trusted functions is not implemented yet; the
first
runslice requires a receipt before execution and exposes it to the process asCLAUZ3_APPROVAL_RECEIPT. clauz3 runincludes source checks and reduced builtins, but it is not a hardened Python or OS sandbox. The agent harness must still restrict execution to theclauz3command path.- The relation language is intentionally small. Richer string predicates, richer aggregates, Datalog-style rules, and deterministic natural language summaries are future work.
More detail:
- FAQ and terminology
- Concepts
- Background and related work
- Effect specs
- Coverage policies
- Symbolic iteration
- ClauZ3 Python subset
- Approval service
- User approval dialog design
- Guardians synergies
- Ideas
uv sync --dev
just testClauZ3 layers on two upstream projects and would not exist without them:
- deal — the Python runtime contract
engine whose
@deal.pre,@deal.post, and@deal.hasdecorators define the trusted-layer vocabulary. ClauZ3 reads the same decorators and discharges them statically; in runtime-only mode, deal enforces them at execution. See Static proof vs runtime for how the two layers relate. - Z3 — Microsoft Research's SMT solver,
the back-end the prover compiles every guarantee into. The vendored
deal-solvermachinery is what bridges Python AST and Z3 constraints.
