diff --git a/SHARED-LANGUAGE.md b/SHARED-LANGUAGE.md new file mode 100644 index 000000000..bf137789f --- /dev/null +++ b/SHARED-LANGUAGE.md @@ -0,0 +1,345 @@ +# ProvekIt — Shared Language + +> The dictionary. T defines each term, plainly, once. Claude records exactly what +> was said — no reverse-engineered models, no additions, no running ahead. + +## Lift + +Lift has **two parts**: +1. It lifts **contracts** into **ProofIR**. +2. It lifts **sugar** into **body_text**. + +## Lower — the mirror of Lift + +Lower is the inverse of lift, and it has the **same two parts**: +1. It lowers a **contract** into a native artifact — a test, a gate, an annotation. + (This is what we used to call **emit**.) +2. It lowers **sugar** into native source at a boundary — a library call. + (This is what we used to call **materialize**.) + +So the translation surface is a **2×2, not three verbs**: + +| | contract | sugar | +|-----------------------|----------|-------| +| **in** (native → IR) | lift | lift | +| **out** (IR → native) | lower | lower | + +`emit` and `materialize` were never two concepts — they are the **contract-facet** and the +**sugar-facet** of `lower`, exactly as contract-lifting and sugar-lifting are the two facets +of `lift`. `migrate`/`transport` were a *third* egress name for the same act ("migration is +materialize wearing a lab coat") — `lift` then `lower` into another library. The verb is +`lower`; the rest was sprawl. + +Naming: **lift ⇄ lower** is the canonical compiler dual (raise to IR / descend to target), +a symmetric pair in both grammar and meaning — unlike "realize," which names an outcome, not +a direction. **"Realize" is reserved for the kit operation that *performs* a lower over RPC:** +lower is what you ask for; realize is the kit doing it for one language. + +## Boundary + +A boundary is a **realization site of a sugar**. It's a **client of a library**. +- The **library author supplies the sugar**. +- The **user of the library has boundaries** at which the sugar is materialized. + +## Concept + +A concept is a **name for a shared tag amongst sugars**. +Example: `json_encode`. Or `json_decode`. + +## Contract + +A contract is a **ProofIR first-order logic**. It's **tied to a specific sugar**. +- When you materialize the sugar, the **contract propagates with the sugar** to the + materialize site **at the boundary**. +- So: **library authors declare "sugar x gets contract y."** +- Users get **red squigglies when they violate a contract at compile time.** + +**Why it's tied to the sugar (the contractual reading).** A contract is *contractual*: +a **binding, dischargeable obligation**. `verify` **discharges** it (satisfies the +obligation); the solver is the discharge engine. Sugar is *that which is under contract* — +the subject the obligation is about. So sugar and contract are not independent payloads; +they are a **bound pair**: sugar is the subject, the contract is the encumbrance it carries. +That is **why** they co-travel — `lift` produces both, `lower` carries both, the contract +propagates with the sugar to the boundary. It isn't a mechanism choice: **sugar is never +naked.** A sugar without its contract is just a call; the contract is the lien that makes it +accountable, and you can't take the subject free of the obligation. A **boundary is where a +contracted sugar comes due** — which is exactly where the squigglies fire. + +This also sharpens [concept equivalence](#concept-equivalence--vendors-call): a concept +clusters sugars; the federation question is just *"are these sugars under the **same** +contract?"* — bearing the same obligation. The vendor rules whether two encumbrances are +one encumbrance. + +## Sugar + +Sugar is **that which is under contract** — the arbitrary subject an obligation rides on. +The asymmetry with the contract is the load-bearing fact of the whole substrate: + +- **ProofIR (the contract) is uniform.** *Always* first-order logic. One form in every + domain — finite, dischargeable, **federatable across domains**, composing in the same + solver whether it governs crypto or a tax rule. The logic never varies. +- **Sugar is unconstrained.** It is *literally anything*: a function, a poem, a Wikipedia + article, an entire Rust codebase. Content-addressed but **uninterpreted** — the carrier + has no required shape because the world has no required shape. + +What the separation buys: **you can place a dischargeable obligation on something you can +never formalize.** ProvekIt models *nothing* about the subject — the sugar stays itself, +opaque — and the only formal object is the obligation on it. **Law over subject.** Federation +across domains then falls out for free: because every contract is FOL, the sugar's wildness +never touches the logic; the domain lives entirely in the sugar and the sugar never enters +the solver. The logic layer is **domain-blind by construction.** + +This is the deepest reason there is **no bespoke contract language** (see +[`project_provekit_no_bespoke_contract_language`]): you were never meant to formalize the +subject. `.invariant` tried to author the sugar into a formal intermediate. The truth is the +inverse — leave the subject arbitrary, lift only the obligation, and the one formal +vocabulary (first-order logic) is **universal**: the same for the poem and the kernel. No new +language is possible or needed. **The substrate formalizes nothing about the world and can +still prove things about all of it.** + +## Implication — the composition operator of the contract algebra + +A contract is a pre/post over a sugar. Composing two operations — `A(B())`, B produces, A +consumes — is licensed by **exactly one** proof obligation: + +``` +post(B) → pre(A) +``` + +The producer's postcondition implies the consumer's precondition. **That arrow is an +implication** — the trinity's third element, the thing `cmd_implicate` mints. (`a | b` and +`A(b())` collapse to the same invariant: *post of whoever runs first → pre of whoever runs +next*; direction follows the data, not the parentheses.) + +This is **Hoare's rule of composition**, content-addressed: + +``` +{P} B {Q} {Q} A {R} +─────────────────────── + {P} B;A {R} +``` + +The meeting condition `Q` — where B's post must discharge A's pre — *is* the implication. +Sequential composition is sound iff that edge holds. + +So the **trinity {terms, contracts, implications} is a graph:** +- **terms** = the operations — the **nodes**. +- **contracts** = pre/post obligations on each — the **node labels**. +- **implications** = `post → pre` — the **edges** that compose them. + +Implications are the **durable** layer: a proven `P → Q` is a reusable edge — a lemma. Terms +and contracts are local; the edges between them are the composable, federatable, +content-addressed proof. Mint an implication once, compose with it forever. + +## Composition & contradiction — why the AST-composition machinery is the spine + +The three facts are one coin: +- **Composition over AST trees** = laying the `post → pre` edges along the call structure. + You compose over the tree because the call graph *is* the composition order — it tells you + whose post meets whose pre. (`libprovekit::compose` + the AST composition in + `core/source_transform.rs` / `core/bind.rs`.) +- **Contradiction = a failed edge.** `post(B) ∧ ¬pre(A)` is SAT ⟹ the composition is unsound + ⟹ **refuse**. Detecting a contradiction *is* finding a composition implication that does + not hold. No composition → no conjunction for the solver → no contradiction → the **refuse** + arm of the trichotomy has nothing to fire on. **Composition is upstream of *supra omnia, + rectum*** — cut it and you remove the substrate's ability to say no. +- **A whole program verifies by discharging every edge,** grounded in `true`: the base + precondition is ⊤ (the empty conjunction, `EMPTY_SET_CID`); every call adds one `post → pre` + edge up the graph to the top-level postcondition. The proofchain `k(I)=t` is a DAG of + implications rooted at ⊤. + +Plainly: **a bug is a missing edge; a contradiction is a present-but-false edge; correct +software is a graph of contracts whose every composition edge discharges, rooted in `true`.** +This is why the promotion/consensus apparatus is severable cruft but the composition machinery +is not: composition is how contracts compose, which is how contradiction is solved. + +## Where composition lives — per-language extraction, language-agnostic discharge + +The rust CLI must stay **language-agnostic** yet compose **every** language's call graph. The +resolution splits call-edge *extraction* (per-lifter, language-specific) from call-edge +*composition + discharge* (rust CLI, language-blind). + +**Every lifter must express, in universal ProofIR — and all 10 kits do today** (audited +2026-05-27: java, rust, python, go, cpp, csharp, ruby, swift, zig, c each emit both the +pre/post interface and call-edges; none is a bare-proposition emitter)**:** +1. The contract as **pre/post over the operation's symbol** — the typed interface, so the + substrate can align one op's output to the next op's input. *Java today:* + `ContractDecl { symbol, preconditions, postconditions, invariants }` → + `{"precondition": …, "postcondition": …}` (`provekit-lift-java-core/ContractDecl.java`). +2. The **call-edges of its own language's call graph** — because only the java lifter can parse + java, only the python lifter python. *Java today:* `ProductionWalk` finds each callee's + callsites in each caller and, per hit, `substituteVar(callee.precondition, + callee.formals[i], actualArg[i])` — i.e. `post → pre` **with variables aligned to the actual + call** — emitting the implication (`provekit-lift-java-core/ProductionWalk.java`, + `provekit-lift-java-junit`). + +Call-edge **extraction is necessarily per-lifter** (it requires parsing the source language's +AST) and **all lifters must do it.** A lifter that emits a bare proposition with no interface +and no edges has produced a contract that cannot compose — composability is the whole game. +(The 2026-05-27 audit is a **presence** check — the interface + call-edge machinery exists in +all 10 kits. Byte-identical cross-language composition round-trip is a separate conformance +run, not asserted here.) + +**The rust CLI stays language-agnostic** because all of the above crosses the RPC line as +**uniform ProofIR** — contracts + `post → pre` implications. The CLI composes and discharges +the graph (`provekit-linker`: `bindings = f(contracts ∪ call-edges)`; `libprovekit::compose`; +the solver) without ever knowing the source language. It "deals with all languages' call +edges" precisely because, by the time they reach it, they are no longer java/python/rust ASTs +but the one content-addressed implication form. **Language-specific parsing stops at the +lifter; everything above the RPC line is the universal graph of edges.** This is the founding +rule cashed out: *computation over data belongs in rust, post-RPC* — the lifter extracts +(language-bound), the CLI computes (language-blind). + +## Kit + +A kit is a **language-specific implementation of these ideas**. The **Java kit**: +- lifts **sugar bodies** +- lifts **contracts from JUnit tests** +- emits **witnesses as JUnit tests from contracts** +- writes **sugars into boundary bodies** +- handles **side-effect propagation** like import statements + +## Concept equivalence — vendor's call + +Whether several sugars sharing a concept mean the *same* FOL is **entirely the vendor's +decision. ProvekIt sets nothing.** (Mechanism: three-axis pinning — below.) + +## Emitter — the contract-facet of lower + +Turns ProofIR (a contract) into a concrete native artifact — a JUnit test, a Spring +annotation, a gate that throws, etc. A kit hosts **many** emitters; **which to emit is a +kit-time decision and you can invoke many** (one contract → stub + test + annotation + +gate, all at once). Inverse of the lift-from-native-test path: the emitter writes the +contract OUT as a test/annotation; lift reads it back IN. +(In the 2×2 above this is **lower(contract)**; **lower(sugar)** is the materializer.) + +## The lift/lower asymmetry (load-bearing) + +- **Lower is plural** (a relation): one contract/sugar → N faithful native forms, + simultaneously. Safe, because every form is a *projection* of the one truth; none can + contradict it. +- **Lift is singular** (a function): one surface → exactly one contract. Forced by + content-addressing — if a surface could lift two ways, the ingested truth is ambiguous, + the CID is unstable, pinning is meaningless, federation collapses. Two parties lifting + the same surface MUST get the same contract. +- Plainly: **truth has one source but many expressions.** Lifting establishes truth (must + be a function); lowering expresses truth (may be a relation). +- Refinement: a kit may have *many lifters* (one per distinct surface — tests, proptest, + bean-validation, JML, …); the rule is per-surface determinism (one surface → one + contract). Multiple surfaces on the same code each lift once and **compose/conjoin**. + Forbidden thing = one surface → two contracts. (This is why `.invariant` died: a second + lifting path competing with the canonical one.) + +## Solvers — DERIVED FROM T ("verifies those contracts against solvers"), confirm + +The discharge engine for contracts. A contract is ProofIR FOL; the solver composes, +refutes, or accepts it. The clean discharge receipt is what the **witness axis** pins. +(Z3/CVC5/Vampire/CeTA/Lean/Maude/Coq — `provekit-verifier/src/solvers/`.) + +## LSP — DERIVED FROM T ("red squigglies when they violate a contract at compile time"), confirm + +The per-language editor face that DELIVERS the contract to the user. It sees the boundaries +(sugar usages) in the user's code, knows the contract each sugar carries, and surfaces +violations live as diagnostics. (Per-kit `provekit-lsp-`.) +- OPEN: does the LSP drive the solver live at the boundary, or is the squiggle a cheaper + structural check with full solver discharge at verify/CI time? (T to settle.) + +## Three axes of pinning (paper 03 §8; docs/security/what-binaryCid-does-not-catch.md) + +A proof bundle binds **three independent CIDs**, each pinnable (frozen) or floatable +(track latest acceptable at verify time) → 2³ = **8 trust postures**: +1. **Contract** — what it conforms to. (Pinned: contract identity stable across re-attestation.) +2. **Witness** — the chain that endorses it. (Pinned: the contract is what it claims to be.) +3. **Binary** — what it asserts about. (Pinned: the binary you run is the one the witness was minted against.) + +The substrate **picks none of it** — the consumer/vendor decides per axis (security team: +tight witness; dev team: tight binary; compliance: all three). This is *why* concept +equivalence / trust is the vendor's call. Together the three axes close the supply-chain +attack class (authenticated betrayal): a correctly-signed package still can't swap +behavior, forge endorsement, or swap bytes without breaking a pin. + +## Loss record — IN SCOPE (100%, T) + +Produced **per emitter / per materializer / per solver**. The honest accounting of +lossiness: when emitting a contract, materializing a sugar, or discharging against a +solver cannot be **exact**, the loss is **recorded** — content-addressed and named, not +silent. This is what makes the trichotomy honest — *exact / loudly-bounded-lossy / +refuse* — because "loudly-bounded-lossy" is only honest if the bound is written down. +Silent loss would be a lie; the loss record is the substrate keeping correctness-above-all. +(Lift is not in this list — lift is the singular truth-ingestion; it refuses or +sugar-carries rather than loses.) + +## Catalog (concept/realization) — FICTION + +There is no central concept/realization catalog. A concept is a shared tag nobody owns; +everything a catalog would hold already lives, content-addressed + signed, in vendor +`.proof`s (paper 23). "The catalog" is at most the **ephemeral union of the vendor .proofs +a consumer has resolved** (`union(resolve(dependencies).proof)`) — a query/view at resolve +time, per-consumer, never a stored global object. A central catalog is the registry +anti-pattern this content-addressed, vendor-pinned substrate exists to abolish. +- INFECTION (delete): the central `menagerie/concept-shapes/catalog/realizations/` store + the realizer walks up to find. Replace with: resolve the dependency .proofs, union them. + +## Protocol record (currently MISNAMED "protocol catalog") — REAL + +A living, signed, content-addressed artifact: the record of the **protocol's own +evolution** (lift-plugin protocol, PEP transitions, IR version, extension surface). NOT a +registry of vendor content — the substrate's own ledger of how its rules changed. +"Catalog" mis-frames a living *record* as a static *index*. **Rename: protocol record.** +- Re-signing it on a protocol change (e.g. tonight's `agent-plugin-protocol` de-list) is + legitimate record-keeping, NOT feeding the fiction. Keep the artifact; fix the name. + +## Everything is config — lift/lower dispatch to a plugin roster + +The substrate ships **two translation verbs** and **no hardcoded translators**. Which +lifters and lowerers run is declared in `config.toml` (today: `.provekit/lift/*/manifest.toml`, +each a `command` that spawns a kit over RPC). +- Want to lift contracts from a new idiom? Add a **contract-lifter** to config. +- Want to lift sugar? Add a **sugar-lifter**. Same on lower. +`contract` vs `sugar` is a **plugin role declared in config**, never a verb. The substrate +(rust, post-RPC) is language-blind: it runs `lift`/`lower` and routes to whatever the roster +declares. **The kit roster *is* config.toml.** + +Lift was already config-driven (the lift manifests). The disease was that **lower never got +the same treatment** — so it grew hardcoded verbs (`emit`, `materialize`, `transport`, +`migrate`) instead of a config roster. Every egress verb cut in the 2026-05 cleanup was a +lowerer that should have been a config line. + +## Composition — the Unix nature + +The verbs are **single-purpose and compositional**, Unix-style, and the **pipe is the +content-addressed `.proof`** (and the IR/contract stream under it): +- `lift` writes IR; `mint` **tastes that IR** and writes a signed `.proof`; `verify` + **discharges the contracts the lifter loaded** and writes verdicts + a witness; `lower` + reads the same IR and writes native. +- No verb reaches into another — they **meet at the artifact**. Content-addressing is what + makes the pipe trustworthy: the IR `mint` tasted is provably the IR `lift` produced, + because the CID says so. **Unix pipes with integrity welded in.** + +This is why most "new verbs" are a mistake — **a composition frozen into a primitive:** +- `migrate` = `lift | lower`-to-another-library. +- `transport` = `migrate` with paperwork = still `lift | lower`. +- `exam` = `verify`'s output, re-read. +- `catalog` = `union(.proof)` — a reduction, not a command. + +Ship orthogonal primitives; let the `.proof` be the pipe; anything that looks like a new +verb is almost always two old verbs and a config line. (`compose` survives the cull +precisely because it is **not** a composition — it's a genuine `libprovekit` primitive the +pipe exposes.) + +## CLI + +A **small set of composable verbs over content-addressed artifacts**, Unix-natured: +- **Translation:** `lift`, `lower` — two verbs, dispatching to a config-declared plugin + roster over RPC. The substrate is language-blind; **all kits speak one RPC language**; the + kits resolve their own `.proof` (jar / pip / cargo / classloader) and feed the rust CLI. +- **Substrate algebra (the trinity):** `mint`, `verify`, `witness`, `implicate` over + {terms, contracts, implications} — not native↔native translation; the substrate's own + operations, composing through the same `.proof` currency. +- The CLI handles **all computation over the data**; kits do language-specific work behind + the RPC line. + +Anything beyond these that looks like a verb is a frozen composition (above) and belongs as +a pipeline + config, not a command. + +_(awaiting next term)_ diff --git a/examples/rust-missing-edge/.gitignore b/examples/rust-missing-edge/.gitignore new file mode 100644 index 000000000..9b3786ccb --- /dev/null +++ b/examples/rust-missing-edge/.gitignore @@ -0,0 +1,4 @@ +/.provekit/proofs/ +/.provekit/runs/ +/target/ +/Cargo.lock diff --git a/examples/rust-missing-edge/.provekit/config.toml b/examples/rust-missing-edge/.provekit/config.toml new file mode 100644 index 000000000..aea8c3830 --- /dev/null +++ b/examples/rust-missing-edge/.provekit/config.toml @@ -0,0 +1,26 @@ +# ProvekIt project config for the Rust end-to-end production bridge example. +# +# `rust-walk-contracts` emits body-derived function-contracts from real Rust +# source via provekit-walk-rpc. `rust-tests` emits harvested #[test] assertions +# via provekit-lift's Rust test adapters. Mint merges both normalized +# ir-document responses and auto-writes the bridge. +[[plugins]] +name = "rust-walk-contracts" +surface = "rust-walk-contracts" +emit = "ir-document" + +[[plugins]] +name = "rust-tests" +surface = "rust-tests" +emit = "ir-document" + +[solvers] +default = "z3" + +[solvers.dispatch] +linear_arithmetic = "z3" +default = "z3" + +[solvers.z3] +binary = "z3" +flags = ["-smt2", "-in"] diff --git a/examples/rust-missing-edge/.provekit/lift/rust-tests/manifest.toml b/examples/rust-missing-edge/.provekit/lift/rust-tests/manifest.toml new file mode 100644 index 000000000..485a81c67 --- /dev/null +++ b/examples/rust-missing-edge/.provekit/lift/rust-tests/manifest.toml @@ -0,0 +1,3 @@ +name = "rust-tests" +command = ["../../implementations/rust/target/debug/provekit-lift", "--rpc"] +working_dir = "." diff --git a/examples/rust-missing-edge/.provekit/lift/rust-walk-contracts/manifest.toml b/examples/rust-missing-edge/.provekit/lift/rust-walk-contracts/manifest.toml new file mode 100644 index 000000000..3d398479c --- /dev/null +++ b/examples/rust-missing-edge/.provekit/lift/rust-walk-contracts/manifest.toml @@ -0,0 +1,3 @@ +name = "rust-walk-contracts" +command = ["../../implementations/rust/target/debug/provekit-walk-rpc", "--rpc"] +working_dir = "." diff --git a/examples/rust-missing-edge/Cargo.toml b/examples/rust-missing-edge/Cargo.toml new file mode 100644 index 000000000..0edc4786a --- /dev/null +++ b/examples/rust-missing-edge/Cargo.toml @@ -0,0 +1,8 @@ +[package] +name = "provekit-rust-missing-edge-example" +version = "0.1.0" +edition = "2021" +publish = false + +[lib] +path = "src/lib.rs" diff --git a/examples/rust-missing-edge/RESULT.md b/examples/rust-missing-edge/RESULT.md new file mode 100644 index 000000000..aa6346b33 --- /dev/null +++ b/examples/rust-missing-edge/RESULT.md @@ -0,0 +1,111 @@ +# rust-missing-edge: the missing edge, CAUGHT through the real verbs + +This crate demonstrates the founding claim end-to-end through the real CLI: +**a bug is a missing edge** -- a callsite whose precondition no producer's +postcondition establishes, caught by lift + compose + discharge, *not* by a +unit test. + +The bug: + +- `serialize(value) -> i64` returns `value`. The lifter derives the post + `result == value`. It says **nothing** about the sign of the result. +- `content_address(encoding) -> i64` has a leading guard + `if encoding < 0 { panic!(...) }`, which lifts to the precondition + `pre = (encoding >= 0)`. +- `address_of(value) = content_address(serialize(value))` is the seam. + `serialize`'s post (`result == value`) does **not** establish + `content_address`'s pre (`encoding >= 0`): for `value < 0` the precondition + is violated and the program panics. +- The unit test `address_of_nonneg_round_trips` calls `address_of(7)`, which + satisfies the (undischarged) precondition by luck. **The test passes. The bug + hides from the test.** + +## What the real verbs do (run 2026-05-27, release build) + +``` +$ cargo test # address_of_nonneg_round_trips ... ok (bug hides) + +$ provekit mint --project examples/rust-missing-edge --out .../proofs --no-attest +mint ok # the rust lifter emits serialize/content_address/address_of contracts + +$ provekit prove examples/rust-missing-edge +ProvekIt verifier report + total callsites : 3 + discharged : 2 + violations : 1 + + [unsatisfied] content_address (source -> kit) + reason: solver 'z3' returned sat (counterexample found) + [discharged] serialize (source -> kit) unsat (obligation holds) + [discharged] address_of (source -> kit) unsat (obligation holds) +``` + +**The missing edge is CAUGHT.** `prove` composes `address_of`'s body, finds the +`content_address(serialize(value))` callsite, and discharges the obligation +`serialize.post -> content_address.pre`. z3 returns **sat**: there is a +`value < 0` (the region the source panics on, and the test never exercises) for +which `serialize`'s postcondition fails to establish `content_address`'s +precondition. `serialize` and `address_of` discharge (unsat: they hold). The +bug the unit test misses, refuted by the solver. + +## How it was caught: three stacked holes (all below the language line) + +Before this work, `prove` reported the buggy crate as `discharged: 1, +violations: 0` -- a **false green**. Three holes had to be closed, each exposed +by the previous fix: + +1. **Pool category error (verifier, `MementoPool::insert`).** A contract's + `preHash` was indexed into `formula_to_memento`, the map Tier 0 (`verify`) + trusts as "this formula is proven true". A precondition is an *obligation*, + not a fact: a callsite's consumer-pre self-discharged merely because the + callee *declared* it. Fix: index only `postHash`/`invHash`/`consequentHash` + (established facts); `preHash`/`antecedentHash` are obligations. + +2. **Vacuous lift (lifter, `lift_expr_to_term_inner`).** A plain function call + `f(args)` had no lift arm (only method calls did), so `address_of`'s body + lifted to `None` and its postcondition collapsed to a vacuous `true`. With + no `content_address(...)` ctor in any formula, the seam was invisible to + `enumerate_callsites`. Fix: lift `Expr::Call` to `Ctor{name, args}`, so the + call tree survives into the contract. (Callsites: 1 -> 3.) + +3. **Discharge expected a quantifier the lifter never emits (verifier).** Both + solver paths (`instantiate`, `build_implication_obligation`) require the + pre/post quantified as `forall`, but body-derived contracts emit *bare* + predicates over named formals. Also `locate_producer_post` read the v1.1 + `evidence.body` shape while mint emits v1.2 `header`, so the producer post + never resolved. Fixes: synthesize `forall (formal). pre` and + `forall result. post` at the discharge boundary from the contract's own + `formals`; make `locate_producer_post` shape-agnostic; normalize + integer-width binder sorts (`I64`...) to the SMT `Int` the verifier already + reasons in. + +**What this exposed about the substrate's history:** holes (1) and (3) together +mean the *solver* discharge path had effectively never run on a real +body-derived lift before this session -- every real callsite discharged at +Tier 0 (hash lookup) or vacuously. Fixing the Tier 0 category error routed real +obligations into the solver for the first time, which is how holes (3a)/(3b) +surfaced at all. + +**Why it generalizes:** holes (1) and (3) are in the verifier, operating on the +protocol's formulas and hashes -- not on any source language. Hole (2)'s +*principle* (a lifter emits the body call-tree as ctors referencing the +callee's bridge symbol) is universal; the Rust arm is local, the catch is not. +A missing edge in Go, Java, or Python reduces to the identical IR shape -- a +callsite whose consumer-pre no producer-post establishes. One verifier refutes +them all. The bug the unit tests miss is caught at the substrate, below the +language. + +## Known follow-ups (not this change; tracked separately) + +- **Multi-formal callsites.** The callsite model tracks a single arg term, so + the forall-wrap binds the first formal only. Multi-arg discharge is a + pre-existing limitation. +- **`emit_sort` integer widths.** The generated SMT emitter passes primitive + sort names through verbatim; `I64` (and other widths) reach z3 as unknown + sorts. The binder normalization here keeps obligations in LIA; the deeper fix + is to teach `emit_sort` the integer-width families. +- **`targetProofCid` back-compat.** The auto-minted bridges carry no + `targetProofCid`, so `ConsequentBundlePinned` is not enforced (a soft warning + on every callsite). + +Re-run the three verbs above to reproduce the catch. diff --git a/examples/rust-missing-edge/src/lib.rs b/examples/rust-missing-edge/src/lib.rs new file mode 100644 index 000000000..39529db95 --- /dev/null +++ b/examples/rust-missing-edge/src/lib.rs @@ -0,0 +1,44 @@ +// A MISSING-EDGE bug, demonstrated through the real ProvekIt verbs +// (`provekit mint` -> the rust lifter; `provekit verify` -> compose + discharge). +// +// A bug is a missing edge: a call site whose precondition no producer's +// postcondition establishes, so the composition `post(producer) -> pre(consumer)` +// does not discharge. The unit test passes (it only exercises an input that +// happens to satisfy the precondition), so the bug hides from the test. Only +// lifting the contracts and discharging the seam reveals it. + +/// Producer. The lifter derives the postcondition from the body: `result == value`. +/// It says NOTHING about the sign of the result. +pub fn serialize(value: i64) -> i64 { + value +} + +/// Consumer. The leading guard lifts to a precondition: a negative `encoding` +/// panics, so the contract requires `pre = NOT(encoding < 0)` -- the encoding +/// must be non-negative ("canonical"). +pub fn content_address(encoding: i64) -> i64 { + if encoding < 0 { + panic!("content address requires a non-negative (canonical) encoding"); + } + encoding +} + +/// The seam. `serialize`'s post (`result == value`) does not establish +/// `content_address`'s pre (`encoding >= 0`): for `value < 0` the precondition +/// is violated. The `post -> pre` edge is MISSING; `verify` must refuse it. +pub fn address_of(value: i64) -> i64 { + content_address(serialize(value)) +} + +#[cfg(test)] +mod tests { + use super::address_of; + + /// Passes: this input satisfies the (undischarged) precondition by luck. + /// The bug -- `address_of(-1)` panics -- is exactly the missing edge that + /// the test does NOT catch but lift + compose + discharge does. + #[test] + fn address_of_nonneg_round_trips() { + assert_eq!(address_of(7), 7); + } +} diff --git a/implementations/rust/libprovekit/src/core/bind.rs b/implementations/rust/libprovekit/src/core/bind.rs index 1c7db14f6..50acf2206 100644 --- a/implementations/rust/libprovekit/src/core/bind.rs +++ b/implementations/rust/libprovekit/src/core/bind.rs @@ -1,19 +1,16 @@ // SPDX-License-Identifier: Apache-2.0 use std::collections::{BTreeMap, BTreeSet}; -use std::path::{Path, PathBuf}; use provekit_canonicalizer::blake3_512_of; use provekit_ir_types::{ - ExamManifestMemento, GapKind, IrFormula, IrTerm, OptionStatus, PromotionDecisionEnvelope, - PromotionDecisionHeader, PromotionDecisionMemento, PromotionDecisionMetadata, PromotionGate, - PromotionResult, ResolutionOption, ResolutionOptionKind, Sort, TransportGapMemento, + ExamManifestMemento, GapKind, IrFormula, IrTerm, OptionStatus, ResolutionOption, + ResolutionOptionKind, Sort, TransportGapMemento, }; use serde::{Deserialize, Serialize}; use serde_json::{json, Value as Json}; use thiserror::Error; -use crate::proofir_bridge::CatalogIndex; use super::primitives::address; use super::traits::{Kit, KitError}; @@ -22,7 +19,6 @@ use super::types::{ }; const CONCEPT_BIND_RESULT: &str = "concept:bind-result"; -const CONCEPT_BIND_RESULT_CID: &str = "blake3-512:22dcd7895fd7abee9d9f34893b5ab9513b4801c0244a64e7a8c5180bba313f3b116d045b0aa3377f39bd892e020a1bd99d4bc60547b11fd7131fbe2f7e33dd75"; const CONCEPT_OP_APPLICATION: &str = "concept:op-application"; const CONCEPT_SEQ: &str = "concept:seq"; @@ -239,8 +235,6 @@ pub struct NamedTermDocument { #[serde(default, rename = "gapRecords", skip_serializing_if = "Vec::is_empty")] pub gap_records: Vec, pub kind: String, - #[serde(rename = "promotionDecisionMementos")] - pub promotion_decision_mementos: Vec, #[serde(rename = "schemaVersion")] pub schema_version: String, #[serde(rename = "sourceLanguage")] @@ -420,7 +414,6 @@ pub fn bind_term_document( let catalog = seed_catalog(); let mut seen_names: BTreeSet = BTreeSet::new(); let mut terms = Vec::with_capacity(entries.len()); - let mut decisions = Vec::new(); let mut gap_records = Vec::new(); let mut operation_namer = UnnamedConceptNamer::default(); for (idx, entry) in entries.into_iter().enumerate() { @@ -434,15 +427,8 @@ pub fn bind_term_document( }; let site_memento_cid = site_cid(&entry, &name, &term_shape_cid)?; let witnesses = named_witnesses(&entry); - let promoted_cid = blake3_512_of(format!("provekit-bind/promoted/{name}").as_bytes()); let named_term_tree = named_operation_tree(&entry.term_shape, &catalog, &mut operation_namer)?; - decisions.extend(promotion_decisions( - &term_shape_cid, - &promoted_cid, - &site_memento_cid, - &witnesses, - )?); if witnesses.is_empty() { gap_records.push(wp_rule_synthesis_gap_record( &source_language, @@ -497,7 +483,6 @@ pub fn bind_term_document( candidate_cluster_manifest, gap_records, kind: "named-term-document".to_string(), - promotion_decision_mementos: decisions, schema_version: "1".to_string(), source_language, terms, @@ -562,7 +547,20 @@ fn bind_payload_wire_named_term_document(named: &NamedTermDocument) -> NamedTerm } pub fn concept_bind_result_cid() -> Cid { - Cid::try_from(CONCEPT_BIND_RESULT_CID).expect("concept:bind-result CID is pinned") + // Computed from the pinned SHAPE, never a pinned hash. The address is + // whatever json_cid(grammar_op_shape) produces, by construction -- there is + // no magic-number literal to drift from its preimage. + ConceptOpCatalog + .cid(CONCEPT_BIND_RESULT) + .expect("concept:bind-result is a language primitive") +} + +/// Resolve a grammar primitive's address from the code shape-authority. `Some` +/// iff `name` is a language primitive (op-application / seq / ite / bind-result); +/// the address is `json_cid` of its pinned shape, computed, never frozen. +/// Consumers derive handles from this; they never store a copy. +pub fn grammar_op_cid(name: &str) -> Option { + ConceptOpCatalog.cid(name) } pub fn bind_result_payload( @@ -1235,44 +1233,39 @@ fn named_witness(role: &str, predicate_text: &str, source_kind: &str) -> NamedWi } } -struct ConceptOpCatalog { - index: CatalogIndex, -} +// Grammar lives in CODE, pinned by SHAPE, never on disk. An IR grammar +// primitive (op-application / seq / ite / bind-result) is the LANGUAGE, not a +// promoted concept -- and a thing is one or the other, never both. So its +// identity is the `json_cid` of its pure, name-free, positional structural +// shape, computed here from a shape compiled into the binary. There is no disk +// load, no catalog, and no `required_cid` that can fail on an empty catalog: +// the language is always present because it is the code. +struct ConceptOpCatalog; impl ConceptOpCatalog { fn load() -> Result { - let root = find_concept_shapes_root().ok_or_else(|| { - BindError::Failed("concept-shapes catalog root not found".to_string()) - })?; - let index = CatalogIndex::from_catalog_root(root.join("catalog")) - .map_err(|error| BindError::Failed(format!("load concept-shapes catalog: {error}")))?; - Ok(Self { index }) + Ok(Self) } fn required_cid(&self, name: &str) -> Result { - self.cid(name).ok_or_else(|| { - BindError::Failed(format!( - "concept op `{name}` missing from concept-shapes catalog" - )) - }) + self.cid(name) + .ok_or_else(|| BindError::Failed(format!("`{name}` is not a language primitive"))) } fn cid(&self, name: &str) -> Option { - self.index - .op_definition_cid(name) - .and_then(|cid| Cid::try_from(cid).ok()) + let shape = grammar_op_shape(name)?; + let cid = crate::canonical::json_cid(&shape).ok()?; + Cid::try_from(cid.as_str()).ok() } fn resolved_name_and_cid(&self, name: &str) -> Result<(String, Cid), BindError> { if let Some(cid) = self.cid(name) { return Ok((name.to_string(), cid)); } - if !name.starts_with("concept:") { - let concept_name = format!("concept:{name}"); - if let Some(cid) = self.cid(&concept_name) { - return Ok((concept_name, cid)); - } - } + // Not a language primitive: an unwitnessed concept resolves to the + // grammar floor -- the generic op-application -- rather than fabricating + // a name. (Once promotion-by-witnessing lands, a witnessed concept + // carries its own address here.) Ok(( CONCEPT_OP_APPLICATION.to_string(), self.required_cid(CONCEPT_OP_APPLICATION)?, @@ -1280,6 +1273,77 @@ impl ConceptOpCatalog { } } +/// The pure, name-free, positional structural shape of an IR grammar primitive. +/// Grammar is the language: pinned by shape in code, addressed by `json_cid` of +/// this shape. `fn_name`, `operator`, formal parameter names, `wp_note` prose, +/// and the memento envelope are all sugar and are ABSENT FROM THE PREIMAGE BY +/// CONSTRUCTION -- born pure, never stripped. Slot references are positional. +fn grammar_op_shape(name: &str) -> Option { + use serde_json::json; + let sort = |n: &str| json!({ "kind": "ctor", "name": n, "args": [] }); + let slot = |i: usize| json!({ "kind": "slot", "index": i }); + let pre = json!({ "kind": "atomic", "name": "true", "args": [] }); + let shape = match name { + CONCEPT_OP_APPLICATION => json!({ + "kind": "grammar-op", + "formalSorts": [sort("Cid"), sort("List")], + "returnSort": sort("Term"), + "pre": pre, + "post": { + "arity": ["Cid", "List"], + "result": "Term", + "slotTerms": [slot(0), slot(1)] + }, + "effects": [] + }), + CONCEPT_SEQ => json!({ + "kind": "grammar-op", + "formalSorts": [sort("Stmt"), sort("Stmt")], + "returnSort": sort("Stmt"), + "pre": pre, + "post": { + "arity": ["Stmt", "Stmt"], + "result": "Stmt", + "wpRule": { + "kind": "apply", + "fn": "wp_slot_0", + "args": [{ + "kind": "apply", + "fn": "wp_slot_1", + "args": [{ "kind": "var", "name": "Q" }] + }] + } + }, + "effects": [{ "kind": "effect-polymorphic", "rule": "union(slot_0.effects, slot_1.effects)" }] + }), + "concept:ite" => json!({ + "kind": "grammar-op", + "formalSorts": [sort("Bool"), sort("Expr"), sort("Expr")], + "returnSort": sort("Expr"), + "pre": pre, + "post": { + "arity": ["Bool", "Expr", "Expr"], + "result": "Expr" + }, + "effects": [{ "kind": "effect-polymorphic", "rule": "union branch value effects" }] + }), + CONCEPT_BIND_RESULT => json!({ + "kind": "grammar-op", + "formalSorts": [sort("Term"), sort("Term")], + "returnSort": sort("BoundTerm"), + "pre": pre, + "post": { + "arity": ["Term", "Term"], + "result": "BoundTerm", + "slotTerms": [slot(0), slot(1)] + }, + "effects": [] + }), + _ => return None, + }; + Some(shape) +} + fn named_term_document_op_tree( named: &NamedTermDocument, catalog: &ConceptOpCatalog, @@ -1495,8 +1559,6 @@ fn document_metadata_value(named: &NamedTermDocument) -> Result "candidateClusterManifest": serde_json::to_value(&named.candidate_cluster_manifest) .map_err(|error| BindError::Failed(format!("serialize candidate cluster manifest: {error}")))?, "kind": named.kind.clone(), - "promotionDecisionMementos": serde_json::to_value(&named.promotion_decision_mementos) - .map_err(|error| BindError::Failed(format!("serialize promotion decisions: {error}")))?, "schemaVersion": named.schema_version.clone(), "sourceLanguage": named.source_language.clone(), "workspaceRoot": named.workspace_root.clone(), @@ -1545,13 +1607,6 @@ fn named_term_document_from_op_tree(term: &Term) -> Result Result Result< crate::canonical::json_cid(&value).map_err(|e| BindError::Failed(e.to_string())) } -fn promotion_decisions( - candidate_cid: &str, - promoted_cid: &str, - site_memento_cid: &str, - witnesses: &[NamedWitness], -) -> Result, BindError> { - witnesses - .iter() - .enumerate() - .map(|(idx, witness)| { - let evidence_cid = crate::canonical::json_cid(&json!({ - "predicate": witness.predicate, - "predicateText": witness.predicate_text, - "role": witness.role, - "siteMementoCid": site_memento_cid, - "sourceKind": witness.source_kind, - })) - .map_err(|e| BindError::Failed(e.to_string()))?; - let mut decision = PromotionDecisionMemento { - envelope: PromotionDecisionEnvelope { - declared_at: "2026-05-15T00:00:00.000Z".to_string(), - signature: String::new(), - signer: "builtin:provekit-bind".to_string(), - }, - header: PromotionDecisionHeader { - candidate_cid: candidate_cid.to_string(), - cid: String::new(), - decider_cid: "builtin:provekit-bind".to_string(), - decision_payload: json!({ - "evidence_count": 1, - "ordinal": idx, - "result": "admitted" - }), - evidence_cids: vec![evidence_cid], - gate: PromotionGate::Proof, - kind: "promotion-decision".to_string(), - policy_cid: "builtin:provekit-bind/default-policy".to_string(), - promoted_cid: promoted_cid.to_string(), - result: PromotionResult::Admitted, - schema_version: "1".to_string(), - }, - metadata: PromotionDecisionMetadata { - counterexample_cids: None, - note: Some("bind admitted lifted evidence into named term".to_string()), - source_url: None, - }, - }; - decision.header.cid = decision - .recompute_header_cid() - .map_err(|err| BindError::Failed(err.to_string()))?; - decision - .validate() - .map_err(|err| BindError::Failed(err.to_string()))?; - Ok(decision) - }) - .collect() -} - fn wp_rule_synthesis_gap_record( source_lang: &str, source_op_cid: &str, @@ -1860,165 +1856,19 @@ impl Catalog { } fn seed_catalog() -> Catalog { - let mut entries = Vec::new(); - if let Some(root) = find_concept_shapes_root() { - entries.extend(load_catalog_abstractions(&root)); - entries.extend(load_catalog_specs(&root)); - } - entries.extend(legacy_classification_entries()); - Catalog { entries } -} - -fn legacy_classification_entries() -> Vec { - vec![ - CatalogEntry { - name: "concept:retry-with-bounded-attempts".to_string(), - shape_cid: String::new(), - classification: "retry-loop", - }, - CatalogEntry { - name: "concept:guard-then-commit".to_string(), - shape_cid: String::new(), - classification: "guard-then-commit", - }, - ] -} - -fn find_concept_shapes_root() -> Option { - let mut starts = Vec::new(); - if let Ok(cwd) = std::env::current_dir() { - starts.push(cwd); - } - if let Some(manifest_dir) = option_env!("CARGO_MANIFEST_DIR") { - starts.push(PathBuf::from(manifest_dir)); - } - for start in starts { - for ancestor in start.ancestors() { - let candidate = ancestor.join("menagerie").join("concept-shapes"); - if candidate.is_dir() { - return Some(candidate); - } - } - } - None -} - -fn load_catalog_abstractions(concept_shapes_root: &Path) -> Vec { - let dir = concept_shapes_root.join("catalog").join("abstractions"); - catalog_json_files(&dir, ".json") - .into_iter() - .filter_map(|path| load_catalog_abstraction(&path)) - .collect() -} - -fn load_catalog_abstraction(path: &Path) -> Option { - let doc = read_json_file(path)?; - let name = doc - .get("memento") - .and_then(|memento| memento.get("operator")) - .and_then(Json::as_str)? - .to_string(); - let shape_cid = doc - .get("cid") - .and_then(Json::as_str) - .map(str::to_string) - .or_else(|| shape_cid_from_abstraction_filename(path))?; - Some(CatalogEntry { - name, - shape_cid, - classification: "catalog-shape", - }) -} - -fn load_catalog_specs(concept_shapes_root: &Path) -> Vec { - let dir = concept_shapes_root.join("specs"); - catalog_json_files(&dir, ".spec.json") - .into_iter() - .flat_map(|path| load_catalog_spec(&path)) - .collect() -} - -fn load_catalog_spec(path: &Path) -> Vec { - let Some(doc) = read_json_file(path) else { - return Vec::new(); - }; - let Some(name) = doc - .get("fn_name") - .and_then(Json::as_str) - .map(str::to_string) - else { - return Vec::new(); - }; - let mut entries = Vec::new(); - if let Ok(shape_cid) = crate::canonical::json_cid(&doc) { - entries.push(CatalogEntry { - name: name.clone(), - shape_cid, - classification: "catalog-shape", - }); - } - if name.starts_with("concept:") { - if let Some(operator) = doc - .get("post") - .and_then(|post| post.get("operator")) - .and_then(Json::as_str) - .map(|operator| operator.replace('_', "-")) - { - if let Ok(shape_cid) = crate::canonical::json_cid(&operation_lookup_shape(&operator)) { - entries.push(CatalogEntry { - name, - shape_cid, - classification: "catalog-shape", - }); - } - } - } - entries -} - -fn catalog_json_files(dir: &Path, suffix: &str) -> Vec { - if !dir.is_dir() { - return Vec::new(); - } - let mut paths = Vec::new(); - collect_catalog_json_files(dir, suffix, &mut paths); - paths.sort(); - paths -} - -fn collect_catalog_json_files(dir: &Path, suffix: &str, out: &mut Vec) { - let Ok(entries) = std::fs::read_dir(dir) else { - return; - }; - for entry in entries.filter_map(Result::ok) { - let path = entry.path(); - let Ok(file_type) = entry.file_type() else { - continue; - }; - if file_type.is_dir() { - collect_catalog_json_files(&path, suffix, out); - } else if file_type.is_file() - && path - .file_name() - .and_then(|name| name.to_str()) - .is_some_and(|name| name.ends_with(suffix)) - { - out.push(path); - } + // The catalog is the signature of witnesses having witnessed. At t=0, + // before any witness has witnessed, it is EMPTY. There is no seed: a + // hand-assembled initial population is the lie -- concepts named/classified + // by fiat instead of accreted from witnessing. Concept naming falls back to + // UNNAMED until a real witness promotes a concept in. (The former + // `legacy_classification_entries` carried empty `shape_cid`s -- not even + // content-addressed, pure fiction -- and the on-disk concept-shapes load + // was hand-authored, UNSIGNED_DEV_ONLY mementos. Both gone.) + Catalog { + entries: Vec::new(), } } -fn read_json_file(path: &Path) -> Option { - let bytes = std::fs::read(path).ok()?; - serde_json::from_slice(&bytes).ok() -} - -fn shape_cid_from_abstraction_filename(path: &Path) -> Option { - let file_name = path.file_name()?.to_str()?; - let (_, cid_hex_with_suffix) = file_name.split_once(".blake3-512:")?; - let cid_hex = cid_hex_with_suffix.strip_suffix(".json")?; - Some(format!("blake3-512:{cid_hex}")) -} fn primitive_sort(name: &str) -> Sort { Sort::Primitive { @@ -2146,7 +1996,6 @@ mod tests { assert_eq!(named.kind, "named-term-document"); assert_eq!(named.terms[0].concept_name, "concept:demo"); assert_eq!(named.terms[0].function, "f"); - assert_eq!(named.promotion_decision_mementos.len(), 1); } #[test] @@ -2306,7 +2155,6 @@ mod tests { candidate_cluster_manifest: CandidateClusterManifest::default(), gap_records: vec![], kind: "named-term-document".to_string(), - promotion_decision_mementos: vec![], schema_version: "1".to_string(), source_language: "rust".to_string(), terms: vec![NamedTerm { @@ -2350,7 +2198,6 @@ mod tests { candidate_cluster_manifest: CandidateClusterManifest::default(), gap_records: vec![], kind: "named-term-document".to_string(), - promotion_decision_mementos: vec![], schema_version: "1".to_string(), source_language: "rust".to_string(), terms: vec![NamedTerm { @@ -2406,122 +2253,5 @@ mod tests { ); } - #[test] - fn seed_catalog_loads_real_concept_shape_catalog() { - let catalog = seed_catalog(); - assert!( - catalog.entries.len() > 10, - "catalog should load real concept-shape entries, got {}", - catalog.entries.len() - ); - assert!( - catalog - .entries - .iter() - .any(|entry| entry.name == "concept:identity"), - "catalog should include concept:identity" - ); - assert!( - catalog - .entries - .iter() - .any(|entry| entry.name == "concept:new"), - "catalog should include algorithm-tier concept:new" - ); - } - #[test] - fn catalog_matches_loaded_shape_cid_before_legacy_classification() { - let catalog = seed_catalog(); - let identity_shape_cid = "blake3-512:6920f6e26184ca316f3dce6c02690b515c11b3d96d3b476bb5abe67cb55e1885031484c3add8a5f26b630e305ad3fe41eed10acca2e141898f9d6629c278867f"; - let unknown_shape = TermShape::from_kit( - json!({ - "kind": "body", - "stmts": [] - }), - identity_shape_cid.to_string(), - ); - let matched = catalog - .match_shape(identity_shape_cid, &unknown_shape) - .expect("identity CID should match before classify fallback"); - assert_eq!(matched.name, "concept:identity"); - } - - #[test] - fn bind_names_blake3_512_of_operations_from_catalog() { - let term = json!({ - "kind": "ir-document", - "sourceLanguage": "rust", - "workspaceRoot": "/tmp/provekit-bind-test", - "ir": [{ - "kind": "bind-lift-entry", - "file": "implementations/rust/provekit-canonicalizer/src/hash.rs", - "fn_name": "blake3_512_of", - "param_names": ["bytes"], - "param_types": ["& [u8]"], - "return_type": "String", - "term_shape": { - "kind": "body", - "stmts": [ - {"kind": "let"}, - {"kind": "call"}, - {"kind": "let"}, - {"kind": "call"}, - {"kind": "let"}, - {"kind": "let"}, - {"kind": "call"}, - {"kind": "call"}, - {"kind": "opaque"} - ] - }, - "witnesses": [] - }] - }); - - let named = bind_term_document( - &term, - &BindOptions { - lang: "rust".to_string(), - exam_manifest: None, - }, - ) - .expect("bind succeeds"); - let named_json = serde_json::to_value(&named).expect("named term serializes"); - let tree = named_json["terms"][0] - .get("namedTermTree") - .expect("operation-level named term tree is emitted"); - let nested_names = serde_json::to_string(tree).expect("tree stringifies"); - let mut operation_concepts = Vec::new(); - collect_tree_concept_names(tree, &mut operation_concepts); - operation_concepts.sort(); - operation_concepts.dedup(); - eprintln!( - "operation-level matches for blake3_512_of: {}", - operation_concepts.join(", ") - ); - - assert!( - nested_names.contains("\"conceptName\":\"concept:call\""), - "blake3_512_of call operations should match catalog concept:call; tree={nested_names}" - ); - assert!( - tree.get("args") - .and_then(Json::as_array) - .is_some_and(|args| !args.is_empty()), - "operation tree should retain recursive children; tree={tree}" - ); - } - - fn collect_tree_concept_names(tree: &Json, out: &mut Vec) { - if let Some(name) = tree.get("conceptName").and_then(Json::as_str) { - if name.starts_with("concept:") { - out.push(name.to_string()); - } - } - if let Some(args) = tree.get("args").and_then(Json::as_array) { - for arg in args { - collect_tree_concept_names(arg, out); - } - } - } } diff --git a/implementations/rust/libprovekit/src/core/mod.rs b/implementations/rust/libprovekit/src/core/mod.rs index 9217c8b36..e2a8f18ac 100644 --- a/implementations/rust/libprovekit/src/core/mod.rs +++ b/implementations/rust/libprovekit/src/core/mod.rs @@ -36,8 +36,9 @@ pub mod walks; pub use crate::exam_manifest::ExamManifestKit; pub use bind::{ - bind_result_payload, bind_term_document, concept_bind_result_cid, named_term_document_cid, - named_term_document_from_bind_payload, strip_realize_sidecar_from_lift_term, + bind_result_payload, bind_term_document, concept_bind_result_cid, grammar_op_cid, + named_term_document_cid, named_term_document_from_bind_payload, + strip_realize_sidecar_from_lift_term, BindContractWitness, BindError, BindKit, BindLiftEntry, BindOptions, CandidateCluster, CandidateClusterManifest, NamedTerm, NamedTermDocument, NamedTermTree, NamedWitness, }; diff --git a/implementations/rust/libprovekit/src/lib.rs b/implementations/rust/libprovekit/src/lib.rs index e1f64f43d..7b6f47f5d 100644 --- a/implementations/rust/libprovekit/src/lib.rs +++ b/implementations/rust/libprovekit/src/lib.rs @@ -8,7 +8,6 @@ pub mod effect_propagation; pub mod exam_manifest; pub mod ffi; pub mod policy_profile_registry; -pub mod promotion_decision_registry; pub mod proofir_bridge; pub mod substrate_default_cids; pub mod transport; diff --git a/implementations/rust/libprovekit/src/promotion_decision_registry.rs b/implementations/rust/libprovekit/src/promotion_decision_registry.rs deleted file mode 100644 index 5ad56621c..000000000 --- a/implementations/rust/libprovekit/src/promotion_decision_registry.rs +++ /dev/null @@ -1,378 +0,0 @@ -// SPDX-License-Identifier: Apache-2.0 - -use std::collections::{BTreeMap, BTreeSet}; - -use provekit_ir_types::{MigrateReceiptEnvelope, PromotionDecisionMemento, PromotionResult}; -use serde::Deserialize; -use serde_json::{json, Value}; - -#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)] -pub struct PromotionDecisionKey { - pub promoted_op: String, - pub fixture_cid: String, -} - -impl PromotionDecisionKey { - pub fn new(promoted_op: impl Into, fixture_cid: impl Into) -> Self { - Self { - promoted_op: promoted_op.into(), - fixture_cid: fixture_cid.into(), - } - } -} - -#[derive(Debug, Clone, PartialEq)] -pub struct PromotionStatus { - pub key: PromotionDecisionKey, - pub decision_cids: Vec, - pub decision_policy_cids: Vec, - pub consensus_vector: Value, - pub witnesses_consulted: u64, -} - -#[derive(Debug, Clone, Default)] -pub struct PromotionDecisionRegistry { - decisions: BTreeMap, - by_key: BTreeMap>, - summaries: BTreeMap, -} - -#[derive(Debug, Clone)] -struct PromotionSummary { - decision_policy_cid: String, - consensus_vector: Value, - witnesses_consulted: u64, -} - -#[derive(Debug, Clone, Deserialize)] -pub struct ConsensusPolicy { - #[serde(default)] - pub cid: Option, - #[serde(default)] - pub thresholds: Vec, - #[serde(default)] - pub allow_failures: bool, - #[serde(default)] - pub require_loss_dim_coverage: Option, -} - -#[derive(Debug, Clone, Deserialize)] -pub struct ConsensusThreshold { - pub axis: String, - pub predicate: String, -} - -#[derive(Debug, thiserror::Error)] -pub enum PromotionDecisionRegistryError { - #[error("promotion-decision-registry: parse failed as PromotionDecisionMemento ({decision}) and MigrateReceiptEnvelope ({receipt})")] - Parse { decision: String, receipt: String }, - #[error("promotion-decision-registry: invalid decision: {0}")] - Invalid(String), - #[error( - "promotion-decision-registry: header.cid mismatch: claimed {claimed}, recomputed {actual}" - )] - HeaderCidMismatch { claimed: String, actual: String }, - #[error("promotion-decision-registry: parse consensus policy: {0}")] - PolicyParse(String), - #[error("promotion-decision-registry: invalid consensus policy: {0}")] - InvalidPolicy(String), -} - -impl ConsensusPolicy { - pub fn from_json_str(text: &str) -> Result { - let policy: Self = serde_json::from_str(text) - .map_err(|err| PromotionDecisionRegistryError::PolicyParse(err.to_string()))?; - if policy.thresholds.is_empty() { - return Err(PromotionDecisionRegistryError::InvalidPolicy( - "thresholds must be non-empty".to_string(), - )); - } - for threshold in &policy.thresholds { - parse_predicate(&threshold.predicate).map_err(|err| { - PromotionDecisionRegistryError::InvalidPolicy(format!( - "threshold axis `{}` predicate `{}`: {err}", - threshold.axis, threshold.predicate - )) - })?; - } - if let Some(mode) = policy.require_loss_dim_coverage.as_deref() { - if mode != "all-named" { - return Err(PromotionDecisionRegistryError::InvalidPolicy(format!( - "unsupported require_loss_dim_coverage `{mode}`" - ))); - } - } - Ok(policy) - } - - pub fn admits(&self, status: &PromotionStatus) -> Result<(), String> { - if !self.allow_failures && outcome_count(&status.consensus_vector, "fail") > 0 { - return Err("policy rejected: failure_mode_distribution contains fail".to_string()); - } - if self.require_loss_dim_coverage.as_deref() == Some("all-named") { - let unwitnessed = status - .consensus_vector - .pointer("/loss_dim_coverage/unwitnessed") - .and_then(Value::as_array) - .map(Vec::len) - .unwrap_or(0); - if unwitnessed > 0 { - return Err("policy rejected: loss_dim_coverage is not all-named".to_string()); - } - } - for threshold in &self.thresholds { - eval_predicate(&threshold.predicate, status).map_err(|err| { - format!( - "policy rejected on axis `{}` predicate `{}`: {err}", - threshold.axis, threshold.predicate - ) - })?; - } - Ok(()) - } -} - -impl PromotionDecisionRegistry { - pub fn new() -> Self { - Self::default() - } - - pub fn admit_json_str(&mut self, text: &str) -> Result { - match serde_json::from_str::(text) { - Ok(decision) => self.admit(decision), - Err(decision_err) => match MigrateReceiptEnvelope::parse_json_str(text) { - Ok(receipt) => self.admit_many(receipt.promotion_decisions), - Err(receipt_err) => Err(PromotionDecisionRegistryError::Parse { - decision: decision_err.to_string(), - receipt: receipt_err.to_string(), - }), - }, - } - } - - pub fn admit_many(&mut self, decisions: I) -> Result - where - I: IntoIterator, - { - let mut indexed = 0usize; - for decision in decisions { - indexed += self.admit(decision)?; - } - Ok(indexed) - } - - pub fn admit( - &mut self, - decision: PromotionDecisionMemento, - ) -> Result { - decision - .validate() - .map_err(|err| PromotionDecisionRegistryError::Invalid(err.to_string()))?; - let actual = decision - .recompute_header_cid() - .map_err(|err| PromotionDecisionRegistryError::Invalid(err.to_string()))?; - if actual != decision.header.cid { - return Err(PromotionDecisionRegistryError::HeaderCidMismatch { - claimed: decision.header.cid.clone(), - actual, - }); - } - - let decision_cid = decision.header.cid.clone(); - if self.decisions.contains_key(&decision_cid) { - return Ok(0); - } - - let entries = index_entries(&decision); - self.decisions.insert(decision_cid.clone(), decision); - for (key, summary) in entries { - self.by_key - .entry(key) - .or_default() - .insert(decision_cid.clone()); - self.summaries.insert(decision_cid.clone(), summary); - } - Ok(self - .by_key - .values() - .filter(|cids| cids.contains(&decision_cid)) - .count()) - } - - pub fn get(&self, key: &PromotionDecisionKey) -> Option { - self.statuses(key).into_iter().max_by(|left, right| { - left.witnesses_consulted - .cmp(&right.witnesses_consulted) - .then(left.decision_cids.cmp(&right.decision_cids)) - }) - } - - pub fn statuses(&self, key: &PromotionDecisionKey) -> Vec { - let Some(cids) = self.by_key.get(key) else { - return Vec::new(); - }; - cids.iter() - .filter_map(|cid| { - let summary = self.summaries.get(cid)?; - Some(PromotionStatus { - key: key.clone(), - decision_cids: vec![cid.clone()], - decision_policy_cids: vec![summary.decision_policy_cid.clone()], - consensus_vector: summary.consensus_vector.clone(), - witnesses_consulted: summary.witnesses_consulted, - }) - }) - .collect() - } -} - -fn index_entries( - decision: &PromotionDecisionMemento, -) -> Vec<(PromotionDecisionKey, PromotionSummary)> { - if decision.header.result != PromotionResult::Admitted { - return Vec::new(); - } - let Some(promoted_op) = payload_string(&decision.header.decision_payload, "promoted_op") else { - return Vec::new(); - }; - let fixtures = payload_string_array(&decision.header.decision_payload, "fixtures_consulted"); - if fixtures.is_empty() { - return Vec::new(); - } - let summary = PromotionSummary { - decision_policy_cid: decision.header.policy_cid.clone(), - consensus_vector: consensus_vector(&decision.header.decision_payload, decision), - witnesses_consulted: witnesses_consulted(decision), - }; - fixtures - .into_iter() - .map(|fixture| { - ( - PromotionDecisionKey::new(promoted_op.clone(), fixture), - summary.clone(), - ) - }) - .collect() -} - -fn consensus_vector(payload: &Value, decision: &PromotionDecisionMemento) -> Value { - payload - .get("consensus_vector") - .cloned() - .unwrap_or_else(|| { - json!({ - "failure_mode_distribution": [ - {"outcome": "pass", "count": witnesses_consulted(decision)}, - {"outcome": "fail", "count": 0}, - {"outcome": "inconclusive", "count": 0} - ], - "input_distribution_summary": {"shape": "unspanned"}, - "loss_dim_coverage": { - "named_in_concept_spec": [], - "unwitnessed": [], - "witnessed": [] - }, - "temporal_spread": { - "first_observed_at": null, - "last_observed_at": null, - "span_seconds": 0 - }, - "total_sample_count": payload.get("total_observations").and_then(Value::as_u64).unwrap_or(0), - "unique_fixtures": payload.get("fixtures_consulted").and_then(Value::as_array).map(Vec::len).unwrap_or(0), - "unique_signer_keys": ["unsigned"], - "unique_signers": 1 - }) - }) -} - -fn eval_predicate(predicate: &str, status: &PromotionStatus) -> Result<(), String> { - let (left, op, right) = parse_predicate(predicate)?; - let actual = metric_value(left, status).ok_or_else(|| format!("unknown metric `{left}`"))?; - let ok = match op { - ">=" => actual >= right, - "<=" => actual <= right, - "==" => actual == right, - ">" => actual > right, - "<" => actual < right, - _ => return Err(format!("unsupported operator `{op}`")), - }; - if ok { - Ok(()) - } else { - Err(format!("{left} was {actual}, required {op}{right}")) - } -} - -fn parse_predicate(predicate: &str) -> Result<(&str, &str, u64), String> { - for op in [">=", "<=", "==", ">", "<"] { - if let Some((left, right)) = predicate.split_once(op) { - let right = right - .trim() - .parse::() - .map_err(|err| format!("threshold is not an integer: {err}"))?; - return Ok((left.trim(), op, right)); - } - } - Err("unsupported predicate grammar".to_string()) -} - -fn metric_value(metric: &str, status: &PromotionStatus) -> Option { - match metric { - "n" | "witnesses_consulted" => Some(status.witnesses_consulted), - other => status.consensus_vector.get(other).and_then(Value::as_u64), - } -} - -fn outcome_count(vector: &Value, outcome: &str) -> u64 { - vector - .get("failure_mode_distribution") - .and_then(Value::as_array) - .and_then(|items| { - items.iter().find_map(|item| { - (item.get("outcome").and_then(Value::as_str) == Some(outcome)) - .then(|| item.get("count").and_then(Value::as_u64).unwrap_or(0)) - }) - }) - .unwrap_or(0) -} - -fn payload_string(payload: &Value, field: &str) -> Option { - payload - .get(field) - .and_then(Value::as_str) - .map(str::to_string) -} - -fn payload_string_array(payload: &Value, field: &str) -> Vec { - payload - .get(field) - .and_then(Value::as_array) - .map(|items| { - items - .iter() - .filter_map(Value::as_str) - .map(str::to_string) - .collect() - }) - .unwrap_or_default() -} - -fn witnesses_consulted(decision: &PromotionDecisionMemento) -> u64 { - if let Some(items) = decision - .header - .decision_payload - .get("witnesses_consulted") - .and_then(Value::as_array) - { - return items.len() as u64; - } - if let Some(count) = decision - .header - .decision_payload - .get("witnesses_consulted") - .and_then(Value::as_u64) - { - return count; - } - decision.header.evidence_cids.len() as u64 -} diff --git a/implementations/rust/libprovekit/tests/bind_kit.rs b/implementations/rust/libprovekit/tests/bind_kit.rs index da2660b27..d76ac0131 100644 --- a/implementations/rust/libprovekit/tests/bind_kit.rs +++ b/implementations/rust/libprovekit/tests/bind_kit.rs @@ -3,10 +3,10 @@ use std::path::PathBuf; use libprovekit::core::{ - address, bind_term_document, concept_bind_result_cid, named_term_document_from_bind_payload, - strip_realize_sidecar_from_lift_term, BindKit, BindOptions, Input, Kit, Term, + address, bind_term_document, concept_bind_result_cid, grammar_op_cid, + named_term_document_from_bind_payload, strip_realize_sidecar_from_lift_term, BindKit, + BindOptions, Input, Kit, Term, }; -use libprovekit::proofir_bridge::CatalogIndex; use provekit_ir_types::{ExamManifestMemento, Sort}; use serde_json::{json, Value}; @@ -27,10 +27,6 @@ fn repo_root() -> PathBuf { .to_path_buf() } -fn concept_catalog() -> CatalogIndex { - CatalogIndex::from_catalog_root(repo_root().join("menagerie/concept-shapes/catalog")) - .expect("concept-shapes catalog loads") -} fn v1_1_exam_manifest() -> ExamManifestMemento { libprovekit::exam_manifest::load_default_exam_manifest().expect("v1.1 exam manifest loads") @@ -194,15 +190,19 @@ fn bind_kit_transform_emits_bind_result_op_tree() { payload.walk().count() >= 2, "bind output should expose operation nodes to Term::walk" ); - let catalog = concept_catalog(); + // Every op CID in the bind payload must be the CODE shape-authority's + // computed address for its op name: no dangling/fabricated CIDs, and no + // dependency on the (deleted) on-disk concept catalog. The vector is the + // incident of the shape; we check it derives from the authority, not that + // it equals a frozen number. let unresolved = payload .walk() - .filter(|node| catalog.get(node.op_cid.as_str()).is_none()) + .filter(|node| grammar_op_cid(node.op_name).as_ref() != Some(node.op_cid)) .map(|node| (node.op_name.to_string(), node.op_cid.to_string())) .collect::>(); assert!( unresolved.is_empty(), - "every bind payload op CID should resolve in the concept catalog: {unresolved:?}" + "every bind payload op CID must be the shape-authority's address for its op name: {unresolved:?}" ); let recovered = named_term_document_from_bind_payload(payload).expect("bind payload recovers named terms"); diff --git a/implementations/rust/libprovekit/tests/promotion_decision_registry.rs b/implementations/rust/libprovekit/tests/promotion_decision_registry.rs deleted file mode 100644 index f9e8f97f9..000000000 --- a/implementations/rust/libprovekit/tests/promotion_decision_registry.rs +++ /dev/null @@ -1,321 +0,0 @@ -// SPDX-License-Identifier: Apache-2.0 - -use libprovekit::promotion_decision_registry::{ - ConsensusPolicy, PromotionDecisionKey, PromotionDecisionRegistry, -}; -use provekit_ir_types::{ - PromotionDecisionEnvelope, PromotionDecisionHeader, PromotionDecisionMemento, - PromotionDecisionMetadata, PromotionGate, PromotionResult, -}; -use serde_json::json; - -const FIXTURE_STATE_CID: &str = "blake3-512:295e0fd280088fc1e5e00d7bade11a2bf850c932180622e28f2fc92e64f97cd5bd757a73acf07f888b7c523e8efb65d8f0d01d50bc02740e5d771e750485d8f4"; - -#[test] -fn registry_indexes_admitted_consensus_vector_by_typed_key() { - let decision = consensus_decision( - "concept:sql-query", - FIXTURE_STATE_CID, - "documentary -> empirically-witnessed", - 8, - PromotionResult::Admitted, - ); - - let mut registry = PromotionDecisionRegistry::new(); - registry.admit(decision.clone()).expect("decision admitted"); - - let key = PromotionDecisionKey::new("concept:sql-query", FIXTURE_STATE_CID); - let status = registry - .get(&key) - .expect("admitted consensus vector should be indexed"); - - assert_eq!(status.key, key); - assert_eq!(status.decision_cids, vec![decision.header.cid.clone()]); - assert_eq!( - status.decision_policy_cids, - vec![decision.header.policy_cid.clone()] - ); - assert_eq!(status.witnesses_consulted, 8); - assert_eq!(status.consensus_vector["unique_signers"], 1); - assert_eq!(status.consensus_vector["unique_fixtures"], 1); - assert_eq!(status.consensus_vector["total_sample_count"], 8); -} - -#[test] -fn consensus_policy_decides_whether_a_vector_is_sufficient() { - let decision = consensus_decision( - "concept:sql-query", - FIXTURE_STATE_CID, - "documentary -> empirically-witnessed", - 8, - PromotionResult::Admitted, - ); - let mut registry = PromotionDecisionRegistry::new(); - registry.admit(decision).expect("decision admitted"); - let status = registry - .get(&PromotionDecisionKey::new( - "concept:sql-query", - FIXTURE_STATE_CID, - )) - .expect("status indexed"); - - let permissive = ConsensusPolicy::from_json_str( - r#"{ - "kind": "consensus-policy", - "schemaVersion": "1", - "thresholds": [ - {"axis": "min-witnesses-floor", "predicate": "n>=8"}, - {"axis": "environment-diversity", "predicate": "unique_fixtures>=1"}, - {"axis": "sample-depth", "predicate": "total_sample_count>=8"} - ], - "allow_failures": false - }"#, - ) - .expect("policy parses"); - assert!(permissive.admits(&status).is_ok()); - - let stricter = ConsensusPolicy::from_json_str( - r#"{ - "kind": "consensus-policy", - "schemaVersion": "1", - "thresholds": [ - {"axis": "observer-diversity", "predicate": "unique_signers>=2"} - ], - "allow_failures": false - }"#, - ) - .expect("policy parses"); - let err = stricter - .admits(&status) - .expect_err("one unsigned signer must not satisfy signer-diversity policy"); - assert!( - err.contains("observer-diversity") && err.contains("unique_signers was 1"), - "unexpected policy rejection: {err}" - ); -} - -#[test] -fn consensus_policy_does_not_synthesize_strength_across_decisions() { - let enough_witnesses_low_diversity = consensus_decision_with_vector( - "concept:sql-query", - FIXTURE_STATE_CID, - "documentary -> empirically-witnessed", - 8, - PromotionResult::Admitted, - json!({ - "failure_mode_distribution": [ - {"outcome": "pass", "count": 8}, - {"outcome": "fail", "count": 0}, - {"outcome": "inconclusive", "count": 0} - ], - "input_distribution_summary": {"shape": "unspanned"}, - "loss_dim_coverage": { - "named_in_concept_spec": [], - "unwitnessed": [], - "witnessed": [] - }, - "temporal_spread": { - "first_observed_at": "2026-05-14T00:00:00.000Z", - "last_observed_at": "2026-05-14T00:00:00.000Z", - "span_seconds": 0 - }, - "total_sample_count": 8, - "unique_fixtures": 1, - "unique_signer_keys": ["unsigned"], - "unique_signers": 1 - }), - ); - let high_diversity_too_few_witnesses = consensus_decision_with_vector( - "concept:sql-query", - FIXTURE_STATE_CID, - "documentary -> empirically-witnessed", - 1, - PromotionResult::Admitted, - json!({ - "failure_mode_distribution": [ - {"outcome": "pass", "count": 1}, - {"outcome": "fail", "count": 0}, - {"outcome": "inconclusive", "count": 0} - ], - "input_distribution_summary": {"shape": "unspanned"}, - "loss_dim_coverage": { - "named_in_concept_spec": [], - "unwitnessed": [], - "witnessed": [] - }, - "temporal_spread": { - "first_observed_at": "2026-05-14T00:00:00.000Z", - "last_observed_at": "2026-05-14T00:00:00.000Z", - "span_seconds": 0 - }, - "total_sample_count": 1, - "unique_fixtures": 3, - "unique_signer_keys": ["a", "b", "c"], - "unique_signers": 3 - }), - ); - let mut registry = PromotionDecisionRegistry::new(); - registry - .admit(enough_witnesses_low_diversity) - .expect("first decision admitted"); - registry - .admit(high_diversity_too_few_witnesses) - .expect("second decision admitted"); - - let key = PromotionDecisionKey::new("concept:sql-query", FIXTURE_STATE_CID); - let policy = ConsensusPolicy::from_json_str( - r#"{ - "kind": "consensus-policy", - "schemaVersion": "1", - "thresholds": [ - {"axis": "min-witnesses-floor", "predicate": "n>=8"}, - {"axis": "environment-diversity", "predicate": "unique_fixtures>=3"} - ], - "allow_failures": false - }"#, - ) - .expect("policy parses"); - - assert!( - registry - .statuses(&key) - .iter() - .all(|status| policy.admits(status).is_err()), - "policy must evaluate each observed vector independently" - ); -} - -#[test] -fn registry_does_not_index_rejected_decisions() { - let decision = consensus_decision( - "concept:sql-query", - FIXTURE_STATE_CID, - "documentary -> empirically-witnessed", - 8, - PromotionResult::Rejected, - ); - - let mut registry = PromotionDecisionRegistry::new(); - registry - .admit(decision) - .expect("valid rejected decision admitted"); - - assert!(registry - .get(&PromotionDecisionKey::new( - "concept:sql-query", - FIXTURE_STATE_CID - )) - .is_none()); -} - -#[test] -fn registry_rejects_header_cid_mismatch() { - let mut decision = consensus_decision( - "concept:sql-query", - FIXTURE_STATE_CID, - "documentary -> empirically-witnessed", - 8, - PromotionResult::Admitted, - ); - decision.header.cid = cid('a'); - - let mut registry = PromotionDecisionRegistry::new(); - let err = registry - .admit(decision) - .expect_err("mismatched header cid must reject"); - - assert!( - err.to_string().contains("header.cid mismatch"), - "unexpected error: {err}" - ); -} - -fn consensus_decision( - promoted_op: &str, - fixture: &str, - promotion: &str, - min_witnesses: u64, - result: PromotionResult, -) -> PromotionDecisionMemento { - consensus_decision_with_vector( - promoted_op, - fixture, - promotion, - min_witnesses, - result, - json!({ - "failure_mode_distribution": [ - {"outcome": "pass", "count": min_witnesses}, - {"outcome": "fail", "count": 0}, - {"outcome": "inconclusive", "count": 0} - ], - "input_distribution_summary": {"shape": "unspanned"}, - "loss_dim_coverage": { - "named_in_concept_spec": ["row-order"], - "unwitnessed": ["row-order"], - "witnessed": [] - }, - "temporal_spread": { - "first_observed_at": "2026-05-14T00:00:00.000Z", - "last_observed_at": "2026-05-14T00:00:00.000Z", - "span_seconds": 0 - }, - "total_sample_count": min_witnesses, - "unique_fixtures": 1, - "unique_signer_keys": ["unsigned"], - "unique_signers": 1 - }), - ) -} - -fn consensus_decision_with_vector( - promoted_op: &str, - fixture: &str, - promotion: &str, - min_witnesses: u64, - result: PromotionResult, - consensus_vector: serde_json::Value, -) -> PromotionDecisionMemento { - let evidence_cids = (0..min_witnesses) - .map(|idx| cid(char::from(b'1' + (idx as u8 % 8)))) - .collect::>(); - let mut decision = PromotionDecisionMemento { - envelope: PromotionDecisionEnvelope { - declared_at: "2026-05-14T00:00:00.000Z".to_string(), - signature: String::new(), - signer: cid('d'), - }, - header: PromotionDecisionHeader { - candidate_cid: cid('c'), - cid: String::new(), - decider_cid: cid('d'), - decision_payload: json!({ - "agreement": "byte-equal", - "fixtures_consulted": [fixture], - "min_witnesses": min_witnesses, - "promotion": promotion, - "promoted_op": promoted_op, - "reason": "test consensus", - "total_observations": min_witnesses, - "witnesses_consulted": evidence_cids, - "consensus_vector": consensus_vector, - }), - evidence_cids, - gate: PromotionGate::Threshold, - kind: "promotion-decision".to_string(), - policy_cid: cid('e'), - promoted_cid: cid('f'), - result, - schema_version: "1".to_string(), - }, - metadata: PromotionDecisionMetadata::default(), - }; - decision.header.cid = decision - .recompute_header_cid() - .expect("fixture decision cid recomputes"); - decision -} - -fn cid(ch: char) -> String { - format!("blake3-512:{}", ch.to_string().repeat(128)) -} diff --git a/implementations/rust/provekit-cli/src/cmd_prove.rs b/implementations/rust/provekit-cli/src/cmd_prove.rs index 32a656d03..f7e2f73d7 100644 --- a/implementations/rust/provekit-cli/src/cmd_prove.rs +++ b/implementations/rust/provekit-cli/src/cmd_prove.rs @@ -487,9 +487,6 @@ pub fn run(args: ProveArgs) -> u8 { return run_target(&args.project, &args.output, target); } - if args.require_empirically_witnessed.is_some() { - return run_empirically_witnessed_gate(&args); - } if args.artifact.is_some() || args.proof.is_some() || args.policy.is_some() { return run_admission_gate(&args); @@ -695,99 +692,6 @@ fn solver_text(formula: &Value, target: ProveTarget) -> Result { } } -fn run_empirically_witnessed_gate(args: &ProveArgs) -> u8 { - let concept = match args.require_empirically_witnessed.as_deref() { - Some(concept) if !concept.trim().is_empty() => concept, - _ => { - eprintln!( - "{}: --require-empirically-witnessed requires a concept", - "error".red().bold() - ); - return crate::EXIT_USER_ERROR; - } - }; - let fixture = match args.require_fixture.as_deref() { - Some(fixture) if !fixture.trim().is_empty() => fixture, - _ => { - eprintln!( - "{}: --require-fixture is required with --require-empirically-witnessed", - "error".red().bold() - ); - return crate::EXIT_USER_ERROR; - } - }; - let project_root: PathBuf = args.project.clone().unwrap_or_else(|| PathBuf::from(".")); - if !project_root.exists() { - eprintln!( - "{}: project root does not exist: {}", - "error".red().bold(), - project_root.display() - ); - return crate::EXIT_USER_ERROR; - } - let policy_path = match args.consensus_policy.as_ref() { - Some(path) => path, - None => { - eprintln!( - "{}: --consensus-policy is required with --require-empirically-witnessed", - "error".red().bold() - ); - return crate::EXIT_USER_ERROR; - } - }; - let policy = match crate::promotion_query::load_consensus_policy(policy_path) { - Ok(policy) => policy, - Err(err) => { - eprintln!("{}: {err}", "error".red().bold()); - return crate::EXIT_USER_ERROR; - } - }; - - match crate::promotion_query::query_consensus_vector_for_policy( - &project_root, - &args.with.iter().map(PathBuf::from).collect::>(), - concept, - fixture, - &policy, - ) { - Ok(Some(hit)) => { - let report = crate::promotion_query::status_json(concept, fixture, &hit, Some(&policy)); - let ok = report["ok"].as_bool().unwrap_or(false); - if args.out.json { - println!("{}", serde_json::to_string_pretty(&report).unwrap()); - } else if !args.out.quiet { - println!( - "verify empirically-witnessed: {}", - if ok { "accepted" } else { "rejected" } - ); - println!(" concept: {concept}"); - println!(" fixture: {fixture}"); - println!(" promoted_op: {}", hit.status.key.promoted_op); - println!(" decisions: {}", hit.status.decision_cids.len()); - } - if ok { - crate::EXIT_OK - } else { - crate::EXIT_VERIFY_FAIL - } - } - Ok(None) => { - let report = crate::promotion_query::missing_json(concept, fixture); - if args.out.json { - println!("{}", serde_json::to_string_pretty(&report).unwrap()); - } else if !args.out.quiet { - println!("verify empirically-witnessed: rejected"); - println!(" reason: required promotion was not found"); - } - crate::EXIT_VERIFY_FAIL - } - Err(err) => { - eprintln!("{}: {err}", "error".red().bold()); - crate::EXIT_USER_ERROR - } - } -} - fn run_admission_gate(args: &ProveArgs) -> u8 { run_admission_gate_with( &args.artifact, diff --git a/implementations/rust/provekit-cli/src/cmd_verify.rs b/implementations/rust/provekit-cli/src/cmd_verify.rs index 87ae6dc7d..c63261950 100644 --- a/implementations/rust/provekit-cli/src/cmd_verify.rs +++ b/implementations/rust/provekit-cli/src/cmd_verify.rs @@ -244,11 +244,6 @@ struct ClaimResult { } pub fn run(args: VerifyArgs) -> u8 { - // Early dispatch: --require-empirically-witnessed bypasses the - // standard solver-dispatch flow and queries the promotion catalog. - if args.require_empirically_witnessed.is_some() { - return run_empirically_witnessed_gate(&args); - } // Early dispatch: the supply-chain admission gate. When any of // --artifact / --proof / --policy is given, verify a package release @@ -702,25 +697,13 @@ fn mint_verification_witness( // Build the content over which the CID + signature are computed. // (Mirrors the WitnessMemento envelope/header convention: the CID is - // blake3-512 of the JCS of the signable content; the signature is - // over the same bytes.) - let signable = json!({ - "kind": "witness", - "schemaVersion": "1", - "witness_for": cs.property_cid, - "subject": cs.bridge_target_cid, - "fixture_state_cid": obligation_cid, - "observed_at": observed_at, - "sample_count": 1, - "measurements": measurements, - "outcome": "pass", - "signed_by": pubkey, - }); - let signable_jcs = jcs_of_json(&signable)?; - let cid = blake3_512_of(signable_jcs.as_bytes()); - let signature = ed25519_sign_string(signer_seed, signable_jcs.as_bytes()); - - let witness = provekit_ir_types::WitnessMemento { + // signed_by/signature are the HOW (attestation), EXCLUDED from the CID + // preimage per WitnessMemento::recompute_cid. Mirror witness_ingest: build + // the memento with signer set + empty signature, derive the CID from the + // canonical observation bytes, then sign the CID (the address of the WHAT). + // Both witness-mint paths now share one scheme: same observation => same + // CID regardless of who attests, and the signature attests that CID. + let mut witness = provekit_ir_types::WitnessMemento { kind: "witness".to_string(), schema_version: "1".to_string(), witness_for: cs.property_cid.clone(), @@ -731,9 +714,12 @@ fn mint_verification_witness( measurements, outcome: "pass".to_string(), signed_by: Some(pubkey), - signature: Some(signature), - cid: cid.clone(), + signature: None, + cid: String::new(), }; + let cid = witness.recompute_cid().map_err(|e| e.to_string())?; + witness.signature = Some(ed25519_sign_string(signer_seed, cid.as_bytes())); + witness.cid = cid.clone(); let bytes = serde_json::to_string_pretty(&witness).map_err(|e| format!("serialize witness: {e}"))?; @@ -745,7 +731,7 @@ fn mint_verification_witness( Ok(cid) } -fn jcs_of_json(v: &Json) -> Result { +pub(crate) fn jcs_of_json(v: &Json) -> Result { let canonical = json_to_canonical(v)?; Ok(encode_jcs(&canonical)) } @@ -964,121 +950,3 @@ mod tests { } } -/// Gate: query the promotion catalog to confirm a concept has reached -/// the empirically-witnessed promotion tier. -/// -/// This is the `verify`-side counterpart of the same gate that lives on -/// `prove --require-empirically-witnessed`. The gate performs no -/// solver dispatch and mints no witnesses; it is a pure query against -/// the project's `.provekit/promotions` catalog. -fn run_empirically_witnessed_gate(args: &VerifyArgs) -> u8 { - let concept = match args.require_empirically_witnessed.as_deref() { - Some(concept) if !concept.trim().is_empty() => concept, - _ => { - eprintln!( - "{}: --require-empirically-witnessed requires a concept", - "error".red().bold() - ); - return EXIT_USER_ERROR; - } - }; - let fixture = match args.require_fixture.as_deref() { - Some(fixture) if !fixture.trim().is_empty() => fixture, - _ => { - eprintln!( - "{}: --require-fixture is required with --require-empirically-witnessed", - "error".red().bold() - ); - return EXIT_USER_ERROR; - } - }; - - // Resolve the project root from --project, a path-like positional, - // or the current directory. - let project_root: PathBuf = if let Some(p) = &args.project { - p.clone() - } else if let Some(kit) = &args.kit { - // Accept an explicit path passed as the positional argument. - let p = PathBuf::from(kit); - if p.is_absolute() || kit.contains(std::path::MAIN_SEPARATOR) || kit.starts_with('.') { - p - } else { - // Treat a bare kit name as a project path of last resort; - // the caller is responsible for passing a valid directory. - PathBuf::from(kit) - } - } else { - PathBuf::from(".") - }; - - if !project_root.exists() { - eprintln!( - "{}: project root does not exist: {}", - "error".red().bold(), - project_root.display() - ); - return EXIT_USER_ERROR; - } - - let policy_path = match args.consensus_policy.as_ref() { - Some(path) => path, - None => { - eprintln!( - "{}: --consensus-policy is required with --require-empirically-witnessed", - "error".red().bold() - ); - return EXIT_USER_ERROR; - } - }; - let policy = match crate::promotion_query::load_consensus_policy(policy_path) { - Ok(policy) => policy, - Err(err) => { - eprintln!("{}: {err}", "error".red().bold()); - return EXIT_USER_ERROR; - } - }; - - match crate::promotion_query::query_consensus_vector_for_policy( - &project_root, - &[], - concept, - fixture, - &policy, - ) { - Ok(Some(hit)) => { - let report = crate::promotion_query::status_json(concept, fixture, &hit, Some(&policy)); - let ok = report["ok"].as_bool().unwrap_or(false); - if args.out.json { - println!("{}", serde_json::to_string_pretty(&report).unwrap()); - } else if !args.out.quiet { - println!( - "verify empirically-witnessed: {}", - if ok { "accepted" } else { "rejected" } - ); - println!(" concept: {concept}"); - println!(" fixture: {fixture}"); - println!(" promoted_op: {}", hit.status.key.promoted_op); - println!(" decisions: {}", hit.status.decision_cids.len()); - } - if ok { - EXIT_OK - } else { - EXIT_VERIFY_FAIL - } - } - Ok(None) => { - let report = crate::promotion_query::missing_json(concept, fixture); - if args.out.json { - println!("{}", serde_json::to_string_pretty(&report).unwrap()); - } else if !args.out.quiet { - println!("verify empirically-witnessed: rejected"); - println!(" reason: required promotion was not found"); - } - EXIT_VERIFY_FAIL - } - Err(err) => { - eprintln!("{}: {err}", "error".red().bold()); - EXIT_USER_ERROR - } - } -} diff --git a/implementations/rust/provekit-cli/src/cmd_witness.rs b/implementations/rust/provekit-cli/src/cmd_witness.rs index 61173813b..d2dde64d2 100644 --- a/implementations/rust/provekit-cli/src/cmd_witness.rs +++ b/implementations/rust/provekit-cli/src/cmd_witness.rs @@ -16,28 +16,13 @@ // 5. If unsat: mint witness memento, save to .provekit/witnesses/ // 6. The witness can be shipped with the package -use std::path::{Path, PathBuf}; +use std::path::PathBuf; -use chrono::{SecondsFormat, Utc}; -use libprovekit::promotion_decision_registry::{PromotionDecisionKey, PromotionStatus}; -use provekit_ir_types::{ - CrossLanguageWitnessPair, MigrateReceiptEnvelope, PromotionDecisionEnvelope, - PromotionDecisionHeader, PromotionDecisionMemento, PromotionDecisionMetadata, PromotionGate, - PromotionResult, WitnessMemento, -}; -use serde_json::json; use serde_json::Value as Json; -use walkdir::WalkDir; use crate::{EXIT_OK, EXIT_SOLVER_FAIL, EXIT_USER_ERROR, EXIT_VERIFY_FAIL}; pub fn run(args: crate::WitnessArgs) -> u8 { - if args.command_or_contract == "consensus" { - return run_consensus(args); - } - if args.command_or_contract == "status" { - return run_status(args); - } let project_root = args .project @@ -134,219 +119,6 @@ pub fn run(args: crate::WitnessArgs) -> u8 { } } -fn run_consensus(args: crate::WitnessArgs) -> u8 { - let project_root = args - .project - .unwrap_or_else(|| std::env::current_dir().unwrap()); - let concept = match args.concept.as_deref() { - Some(concept) if !concept.trim().is_empty() => concept.to_string(), - _ => { - eprintln!("error: witness consensus requires --concept"); - return EXIT_USER_ERROR; - } - }; - let fixture = match args.require_fixture.as_deref() { - Some(fixture) if !fixture.trim().is_empty() => fixture.to_string(), - _ => { - eprintln!("error: witness consensus requires --require-fixture"); - return EXIT_USER_ERROR; - } - }; - let emit = match args.emit.as_ref() { - Some(path) => path.clone(), - None => { - eprintln!("error: witness consensus requires --emit "); - return EXIT_USER_ERROR; - } - }; - let catalog_roots = crate::promotion_query::catalog_roots(&project_root, &args.catalogs); - let concept_terms = crate::promotion_query::concept_query_terms(&project_root, &concept); - - let catalog = match collect_witness_catalog(&catalog_roots) { - Ok(catalog) => catalog, - Err(err) => { - eprintln!("error: witness consensus catalog scan failed: {err}"); - return EXIT_USER_ERROR; - } - }; - let selected: Vec = catalog - .witnesses - .into_iter() - .filter(|witness| { - concept_terms - .iter() - .any(|candidate| candidate == &witness.witness_for) - && witness.fixture_state_cid == fixture - && witness.outcome == "pass" - }) - .collect(); - - let consensus = match consensus_evidence(&selected, &catalog.cross_language_pairs) { - Ok(consensus) => consensus, - Err(err) => { - eprintln!("witness consensus: rejected: {err}"); - return EXIT_VERIFY_FAIL; - } - }; - - if consensus.witnesses.len() < args.min_witnesses { - eprintln!( - "witness consensus: rejected: {} matching passing witnesses, require {}", - consensus.witnesses.len(), - args.min_witnesses - ); - return EXIT_VERIFY_FAIL; - } - - let policy_path = match args.consensus_policy.as_ref() { - Some(path) => path, - None => { - eprintln!("error: witness consensus requires --consensus-policy"); - return EXIT_USER_ERROR; - } - }; - let policy = match crate::promotion_query::load_consensus_policy(policy_path) { - Ok(policy) => policy, - Err(err) => { - eprintln!("error: {err}"); - return EXIT_USER_ERROR; - } - }; - let policy_cid = policy - .cid - .clone() - .expect("load_consensus_policy fills missing policy cid"); - let loss_dimensions = crate::promotion_query::concept_loss_dimensions(&project_root, &concept); - let consensus_vector = consensus_vector(&consensus.witnesses, &loss_dimensions); - let status_for_policy = PromotionStatus { - key: PromotionDecisionKey::new(concept.clone(), fixture.clone()), - decision_cids: Vec::new(), - decision_policy_cids: vec![policy_cid.clone()], - consensus_vector: consensus_vector.clone(), - witnesses_consulted: consensus.witnesses.len() as u64, - }; - if let Err(err) = policy.admits(&status_for_policy) { - eprintln!("witness consensus: rejected by policy: {err}"); - return EXIT_VERIFY_FAIL; - } - - let decision = match promotion_decision_for_consensus( - &concept, - &fixture, - args.min_witnesses, - &consensus.witnesses, - consensus.row_schema, - consensus_vector, - policy_cid, - ) { - Ok(decision) => decision, - Err(err) => { - eprintln!("error: mint promotion decision: {err}"); - return EXIT_USER_ERROR; - } - }; - if let Some(parent) = emit.parent() { - if let Err(err) = std::fs::create_dir_all(parent) { - eprintln!("error: create {}: {err}", parent.display()); - return EXIT_USER_ERROR; - } - } - let bytes = match serde_json::to_string_pretty(&decision) { - Ok(bytes) => bytes, - Err(err) => { - eprintln!("error: serialize promotion decision: {err}"); - return EXIT_USER_ERROR; - } - }; - if let Err(err) = std::fs::write(&emit, format!("{bytes}\n")) { - eprintln!("error: write {}: {err}", emit.display()); - return EXIT_USER_ERROR; - } - - if args.out.json { - let summary = json!({ - "agreement": "byte-equal", - "promotion_cid": decision.header.cid, - "promotion_receipt": emit.display().to_string(), - "result": "admitted", - "witnesses_consulted": consensus.witnesses.len() - }); - println!( - "{}", - serde_json::to_string_pretty(&summary).unwrap_or_else(|_| summary.to_string()) - ); - } else if !args.out.quiet { - println!("witness consensus: admitted"); - println!(" concept: {concept}"); - println!(" fixture: {fixture}"); - println!(" witnesses: {}", consensus.witnesses.len()); - println!(" agreement: byte-equal"); - println!(" receipt: {}", emit.display()); - } - EXIT_OK -} - -fn run_status(args: crate::WitnessArgs) -> u8 { - let project_root = args - .project - .unwrap_or_else(|| std::env::current_dir().unwrap()); - let concept = match args.concept.as_deref() { - Some(concept) if !concept.trim().is_empty() => concept.to_string(), - _ => { - eprintln!("error: witness status requires --concept"); - return EXIT_USER_ERROR; - } - }; - let fixture = match args.require_fixture.as_deref() { - Some(fixture) if !fixture.trim().is_empty() => fixture.to_string(), - _ => { - eprintln!("error: witness status requires --require-fixture"); - return EXIT_USER_ERROR; - } - }; - - match crate::promotion_query::query_consensus_vector( - &project_root, - &args.catalogs, - &concept, - &fixture, - ) { - Ok(Some(hit)) => { - if args.out.json { - let report = crate::promotion_query::status_json(&concept, &fixture, &hit, None); - println!( - "{}", - serde_json::to_string_pretty(&report).unwrap_or_else(|_| report.to_string()) - ); - } else if !args.out.quiet { - println!("witness status: consensus-vector present"); - println!(" concept: {concept}"); - println!(" fixture: {fixture}"); - println!(" promoted_op: {}", hit.status.key.promoted_op); - println!(" decisions: {}", hit.status.decision_cids.len()); - println!(" witnesses_consulted: {}", hit.status.witnesses_consulted); - } - EXIT_OK - } - Ok(None) => { - if args.out.json { - let report = crate::promotion_query::missing_json(&concept, &fixture); - println!( - "{}", - serde_json::to_string_pretty(&report).unwrap_or_else(|_| report.to_string()) - ); - } else { - eprintln!("witness status: missing consensus vector"); - } - EXIT_VERIFY_FAIL - } - Err(err) => { - eprintln!("error: witness status catalog scan failed: {err}"); - EXIT_USER_ERROR - } - } -} - fn extract_post(contract: &Json) -> Option { contract.get("evidence")?.get("body")?.get("post").cloned() } @@ -375,348 +147,6 @@ fn build_witness_obligation(post: &Json, property: &Json) -> Result, - cross_language_pairs: Vec, -} - -#[derive(Debug)] -struct ConsensusEvidence { - witnesses: Vec, - row_schema: Json, -} - -fn collect_witness_catalog(catalog_roots: &[PathBuf]) -> Result { - let mut catalog = WitnessCatalog::default(); - for root in catalog_roots { - if root.is_file() { - collect_witnesses_from_file(root, &mut catalog)?; - continue; - } - if !root.exists() { - continue; - } - for entry in WalkDir::new(root) - .into_iter() - .filter_entry(|entry| should_descend(entry.path())) - { - let entry = entry.map_err(|e| e.to_string())?; - if entry.file_type().is_file() { - collect_witnesses_from_file(entry.path(), &mut catalog)?; - } - } - } - catalog - .witnesses - .sort_by(|left, right| left.cid.cmp(&right.cid)); - catalog - .witnesses - .dedup_by(|left, right| left.cid == right.cid); - catalog.cross_language_pairs.sort_by(|left, right| { - left.concept_site_cid - .cmp(&right.concept_site_cid) - .then(left.source_witness_cid.cmp(&right.source_witness_cid)) - .then(left.target_witness_cid.cmp(&right.target_witness_cid)) - }); - catalog.cross_language_pairs.dedup(); - Ok(catalog) -} - -fn should_descend(path: &Path) -> bool { - let Some(name) = path.file_name().and_then(|name| name.to_str()) else { - return true; - }; - !matches!( - name, - ".git" | ".worktrees" | "target" | "node_modules" | "vendor" | ".venv" - ) -} - -fn collect_witnesses_from_file(path: &Path, catalog: &mut WitnessCatalog) -> Result<(), String> { - let Some(ext) = path.extension().and_then(|ext| ext.to_str()) else { - return Ok(()); - }; - if !matches!(ext, "json" | "proof") { - return Ok(()); - } - let text = match std::fs::read_to_string(path) { - Ok(text) => text, - Err(_) => return Ok(()), - }; - if let Ok(witness) = serde_json::from_str::(&text) { - if witness.validate().is_ok() { - catalog.witnesses.push(witness); - } - return Ok(()); - } - if let Ok(receipt) = MigrateReceiptEnvelope::parse_json_str(&text) { - catalog.witnesses.extend(receipt.witnesses); - catalog - .cross_language_pairs - .extend(receipt.cross_language_witness_pairs); - } - Ok(()) -} - -fn consensus_evidence( - witnesses: &[WitnessMemento], - pairs: &[CrossLanguageWitnessPair], -) -> Result { - match consensus_row_schema(witnesses) { - Ok(row_schema) => { - return Ok(ConsensusEvidence { - witnesses: witnesses.to_vec(), - row_schema, - }); - } - Err(err) if pairs.is_empty() => return Err(err), - Err(_) => {} - } - consensus_pairwise_row_schema(witnesses, pairs) -} - -fn consensus_pairwise_row_schema( - witnesses: &[WitnessMemento], - pairs: &[CrossLanguageWitnessPair], -) -> Result { - let by_cid = witnesses - .iter() - .map(|witness| (witness.cid.as_str(), witness)) - .collect::>(); - let mut evidence_cids = std::collections::BTreeSet::new(); - let mut pair_count = 0usize; - for pair in pairs { - if pair.equivalence_outcome != "pass" { - continue; - } - let Some(source) = by_cid.get(pair.source_witness_cid.as_str()) else { - continue; - }; - let Some(target) = by_cid.get(pair.target_witness_cid.as_str()) else { - continue; - }; - let source_schema = source - .measurements - .get("row_schema") - .ok_or_else(|| format!("witness {} missing measurements.row_schema", source.cid))?; - let target_schema = target - .measurements - .get("row_schema") - .ok_or_else(|| format!("witness {} missing measurements.row_schema", target.cid))?; - if canonical_json_bytes(source_schema) != canonical_json_bytes(target_schema) { - return Err("cross_language_witness_pairs row_schema disagreement".to_string()); - } - evidence_cids.insert(source.cid.clone()); - evidence_cids.insert(target.cid.clone()); - pair_count += 1; - } - if evidence_cids.is_empty() { - return Err("measurements.row_schema disagreement".to_string()); - } - let consensus_witnesses = witnesses - .iter() - .filter(|witness| evidence_cids.contains(&witness.cid)) - .cloned() - .collect::>(); - Ok(ConsensusEvidence { - witnesses: consensus_witnesses, - row_schema: json!({ - "agreement_axis": "cross_language_witness_pairs.measurements.row_schema", - "pair_count": pair_count - }), - }) -} - -fn consensus_row_schema(witnesses: &[WitnessMemento]) -> Result { - let mut schemas = witnesses.iter().map(|witness| { - witness - .measurements - .get("row_schema") - .cloned() - .ok_or_else(|| format!("witness {} missing measurements.row_schema", witness.cid)) - }); - let first = schemas - .next() - .ok_or_else(|| "no witnesses selected".to_string())??; - let first_bytes = canonical_json_bytes(&first); - for schema in schemas { - let schema = schema?; - let bytes = canonical_json_bytes(&schema); - if bytes != first_bytes { - return Err("measurements.row_schema disagreement".to_string()); - } - } - Ok(first) -} - -fn promotion_decision_for_consensus( - concept: &str, - fixture: &str, - min_witnesses: usize, - witnesses: &[WitnessMemento], - row_schema: Json, - consensus_vector: Json, - policy_cid: String, -) -> Result { - let mut evidence_cids: Vec = witnesses.iter().map(|w| w.cid.clone()).collect(); - evidence_cids.sort(); - let subjects: Vec = witnesses.iter().map(|w| w.subject.clone()).collect(); - let total_observations: u64 = witnesses.iter().map(|w| w.sample_count).sum(); - let decider_cid = crate::promotion_query::content_cid_for_json(&json!({ - "kind": "decider", - "name": "provekit-witness-consensus" - })); - let candidate_cid = crate::promotion_query::content_cid_for_json(&json!({ - "concept": concept, - "kind": "concept-candidate" - })); - let promoted_cid = crate::promotion_query::content_cid_for_json(&json!({ - "concept": concept, - "fixture_state_cid": fixture, - "kind": "promoted-concept-tier", - "tier": "empirically-witnessed" - })); - let reason = format!( - "{} witnesses selected; consensus vector records the observed axes", - witnesses.len() - ); - let mut decision = PromotionDecisionMemento { - envelope: PromotionDecisionEnvelope { - declared_at: Utc::now().to_rfc3339_opts(SecondsFormat::Millis, true), - signature: String::new(), - signer: decider_cid.clone(), - }, - header: PromotionDecisionHeader { - candidate_cid, - cid: String::new(), - decider_cid, - decision_payload: json!({ - "agreement": "byte-equal", - "fixtures_consulted": [fixture], - "min_witnesses": min_witnesses, - "promotion": "documentary -> empirically-witnessed", - "promoted_op": concept, - "reason": reason, - "row_schema": row_schema, - "subjects_consulted": subjects, - "total_observations": total_observations, - "witnesses_consulted": evidence_cids, - "consensus_vector": consensus_vector - }), - evidence_cids, - gate: PromotionGate::Threshold, - kind: "promotion-decision".to_string(), - policy_cid, - promoted_cid, - result: PromotionResult::Admitted, - schema_version: "1".to_string(), - }, - metadata: PromotionDecisionMetadata { - counterexample_cids: None, - note: Some("witness consensus promoted empirical contract tier".to_string()), - source_url: None, - }, - }; - decision.header.cid = decision.recompute_header_cid().map_err(|e| e.to_string())?; - decision.validate().map_err(|e| e.to_string())?; - Ok(decision) -} - -fn consensus_vector(witnesses: &[WitnessMemento], loss_dimensions: &[String]) -> Json { - let mut signer_keys = witnesses - .iter() - .map(|witness| { - witness - .signed_by - .clone() - .unwrap_or_else(|| "unsigned".to_string()) - }) - .collect::>(); - if signer_keys.is_empty() { - signer_keys.insert("unsigned".to_string()); - } - let fixtures = witnesses - .iter() - .map(|witness| witness.fixture_state_cid.clone()) - .collect::>(); - let total_sample_count: u64 = witnesses.iter().map(|witness| witness.sample_count).sum(); - let mut observed_times = witnesses - .iter() - .filter_map(|witness| { - chrono::DateTime::parse_from_rfc3339(&witness.observed_at) - .ok() - .map(|dt| dt.with_timezone(&Utc)) - }) - .collect::>(); - observed_times.sort(); - let first_observed_at = observed_times - .first() - .map(|dt| dt.to_rfc3339_opts(SecondsFormat::Millis, true)); - let last_observed_at = observed_times - .last() - .map(|dt| dt.to_rfc3339_opts(SecondsFormat::Millis, true)); - let span_seconds = match (observed_times.first(), observed_times.last()) { - (Some(first), Some(last)) => (*last - *first).num_seconds().max(0), - _ => 0, - }; - let mut outcomes = std::collections::BTreeMap::::new(); - for witness in witnesses { - *outcomes.entry(witness.outcome.clone()).or_default() += 1; - } - let witnessed_loss_dims = witnesses - .iter() - .flat_map(|witness| { - witness - .measurements - .pointer("/observer/loss_dims_exercised") - .and_then(Json::as_array) - .into_iter() - .flatten() - .filter_map(Json::as_str) - .map(str::to_string) - }) - .filter(|dim| loss_dimensions.iter().any(|known| known == dim)) - .collect::>(); - let named_loss_dims = loss_dimensions - .iter() - .cloned() - .collect::>(); - let unwitnessed_loss_dims = named_loss_dims - .difference(&witnessed_loss_dims) - .cloned() - .collect::>(); - - json!({ - "unique_signers": signer_keys.len(), - "unique_signer_keys": signer_keys.into_iter().collect::>(), - "unique_fixtures": fixtures.len(), - "total_sample_count": total_sample_count, - "loss_dim_coverage": { - "named_in_concept_spec": named_loss_dims.into_iter().collect::>(), - "witnessed": witnessed_loss_dims.into_iter().collect::>(), - "unwitnessed": unwitnessed_loss_dims - }, - "input_distribution_summary": { - "shape": "unspanned" - }, - "temporal_spread": { - "first_observed_at": first_observed_at, - "last_observed_at": last_observed_at, - "span_seconds": span_seconds - }, - "failure_mode_distribution": [ - {"outcome": "pass", "count": outcomes.get("pass").copied().unwrap_or(0)}, - {"outcome": "fail", "count": outcomes.get("fail").copied().unwrap_or(0)}, - {"outcome": "inconclusive", "count": outcomes.get("inconclusive").copied().unwrap_or(0)} - ] - }) -} - -fn canonical_json_bytes(value: &Json) -> Vec { - crate::promotion_query::canonical_json_bytes(value) -} - enum SolverOutput { Unsat, Sat, diff --git a/implementations/rust/provekit-cli/src/main.rs b/implementations/rust/provekit-cli/src/main.rs index a9d6276f4..15a71489c 100644 --- a/implementations/rust/provekit-cli/src/main.rs +++ b/implementations/rust/provekit-cli/src/main.rs @@ -38,10 +38,10 @@ mod cmd_verify; mod cmd_verify_protocol; mod cmd_version; mod cmd_witness; +mod witness_ingest; mod kit_dispatch; mod lift_plugin; mod project_config; -mod promotion_query; mod protocol; mod report_fmt; mod sort_morphism_catalog; diff --git a/implementations/rust/provekit-cli/src/promotion_query.rs b/implementations/rust/provekit-cli/src/promotion_query.rs deleted file mode 100644 index a3b7d927a..000000000 --- a/implementations/rust/provekit-cli/src/promotion_query.rs +++ /dev/null @@ -1,324 +0,0 @@ -// SPDX-License-Identifier: Apache-2.0 - -use std::collections::BTreeSet; -use std::path::{Path, PathBuf}; -use std::sync::Arc; - -use libprovekit::promotion_decision_registry::{ - ConsensusPolicy, PromotionDecisionKey, PromotionDecisionRegistry, PromotionStatus, -}; -use provekit_canonicalizer::{blake3_512_of, encode_jcs, Value as CValue}; -use serde_json::{json, Value}; -use walkdir::WalkDir; - -#[derive(Debug, Clone)] -pub(crate) struct ConsensusQueryHit { - pub status: PromotionStatus, -} - -pub(crate) fn query_consensus_vector( - project_root: &Path, - catalogs: &[PathBuf], - concept: &str, - fixture: &str, -) -> Result, String> { - let roots = catalog_roots(project_root, catalogs); - let registry = load_promotion_registry(&roots)?; - let terms = concept_query_terms(project_root, concept); - for term in terms { - let key = PromotionDecisionKey::new(term, fixture); - if let Some(status) = registry.get(&key) { - return Ok(Some(ConsensusQueryHit { status })); - } - } - Ok(None) -} - -pub(crate) fn query_consensus_vector_for_policy( - project_root: &Path, - catalogs: &[PathBuf], - concept: &str, - fixture: &str, - policy: &ConsensusPolicy, -) -> Result, String> { - let roots = catalog_roots(project_root, catalogs); - let registry = load_promotion_registry(&roots)?; - let terms = concept_query_terms(project_root, concept); - let mut rejected = None; - for term in terms { - let key = PromotionDecisionKey::new(term, fixture); - for status in registry.statuses(&key) { - if policy.admits(&status).is_ok() { - return Ok(Some(ConsensusQueryHit { status })); - } - rejected.get_or_insert(ConsensusQueryHit { status }); - } - } - Ok(rejected) -} - -pub(crate) fn concept_query_terms(project_root: &Path, concept: &str) -> Vec { - let mut terms = BTreeSet::new(); - terms.insert(concept.to_string()); - if let Some(resolved) = resolve_concept_counterpart(project_root, concept) { - terms.insert(resolved); - } - // Family expansion: a grouping concept (e.g. concept:family:sql-query, which - // rolls up the cardinality concepts sql-query-row/-all/-iterate after the - // #1469 split) declares a `members` array in its catalog index entry. When - // present, the query terms include each member name AND its CID, so witness - // consensus / verify aggregate witnesses across the whole family rather than - // a single concept. Non-family concepts have no `members` and are unaffected. - if let Some((_, entry)) = resolve_concept_index_entry(project_root, concept) { - if let Some(members) = entry.get("members").and_then(Value::as_array) { - for member in members { - if let Some(name) = member.as_str() { - terms.insert(name.to_string()); - if let Some(cid) = resolve_concept_counterpart(project_root, name) { - terms.insert(cid); - } - } - } - } - } - terms.into_iter().collect() -} - -pub(crate) fn concept_loss_dimensions(project_root: &Path, concept: &str) -> Vec { - let Some((repo_root, entry)) = resolve_concept_index_entry(project_root, concept) else { - return Vec::new(); - }; - let Some(path) = entry.get("path").and_then(Value::as_str) else { - return Vec::new(); - }; - let spec_path = repo_root - .join("menagerie") - .join("concept-shapes") - .join("catalog") - .join(path); - let Ok(raw) = std::fs::read_to_string(spec_path) else { - return Vec::new(); - }; - let Ok(spec) = serde_json::from_str::(&raw) else { - return Vec::new(); - }; - spec.get("loss_dimensions") - .and_then(Value::as_array) - .map(|items| { - items - .iter() - .filter_map(Value::as_str) - .map(str::to_string) - .collect() - }) - .unwrap_or_default() -} - -pub(crate) fn catalog_roots(project_root: &Path, catalogs: &[PathBuf]) -> Vec { - if !catalogs.is_empty() { - return catalogs.to_vec(); - } - let project_catalog = project_root.join(".provekit"); - if project_catalog.exists() { - vec![project_catalog] - } else { - vec![project_root.to_path_buf()] - } -} - -pub(crate) fn load_promotion_registry( - catalog_roots: &[PathBuf], -) -> Result { - let mut registry = PromotionDecisionRegistry::new(); - for root in catalog_roots { - if root.is_file() { - admit_promotion_file(root, &mut registry)?; - continue; - } - if !root.exists() { - continue; - } - for entry in WalkDir::new(root) - .into_iter() - .filter_entry(|entry| should_descend(entry.path())) - { - let entry = entry.map_err(|err| err.to_string())?; - if entry.file_type().is_file() { - admit_promotion_file(entry.path(), &mut registry)?; - } - } - } - Ok(registry) -} - -pub(crate) fn load_consensus_policy(path: &Path) -> Result { - let raw = - std::fs::read_to_string(path).map_err(|err| format!("read {}: {err}", path.display()))?; - let value: Value = serde_json::from_str(&raw) - .map_err(|err| format!("parse consensus policy {}: {err}", path.display()))?; - let mut policy = ConsensusPolicy::from_json_str(&raw).map_err(|err| format!("{err}"))?; - if policy.cid.as_deref().map(str::is_empty).unwrap_or(true) { - policy.cid = Some(content_cid_for_json(&value)); - } - Ok(policy) -} - -pub(crate) fn content_cid_for_json(value: &Value) -> String { - blake3_512_of(&canonical_json_bytes(value)) -} - -pub(crate) fn canonical_json_bytes(value: &Value) -> Vec { - encode_jcs(&json_to_value(value)).into_bytes() -} - -pub(crate) fn status_json( - requested_concept: &str, - fixture: &str, - hit: &ConsensusQueryHit, - policy: Option<&ConsensusPolicy>, -) -> Value { - let policy_verdict = policy.map(|policy| policy.admits(&hit.status)); - let ok = policy_verdict.as_ref().map(|r| r.is_ok()).unwrap_or(true); - let reason = policy_verdict - .as_ref() - .and_then(|result| result.as_ref().err()) - .cloned(); - json!({ - "ok": ok, - "verdict": if ok { "accepted" } else { "rejected" }, - "reason": reason, - "requirement": { - "concept": requested_concept, - "fixture_state_cid": fixture, - "policy_cid": policy.and_then(|policy| policy.cid.clone()) - }, - "promotion": { - "consensus_vector": hit.status.consensus_vector.clone(), - "decision_cids": hit.status.decision_cids.clone(), - "decision_policy_cids": hit.status.decision_policy_cids.clone(), - "promoted_op": hit.status.key.promoted_op.clone(), - "witnesses_consulted": hit.status.witnesses_consulted - } - }) -} - -pub(crate) fn missing_json(requested_concept: &str, fixture: &str) -> Value { - json!({ - "ok": false, - "verdict": "rejected", - "reason": "required empirically-witnessed promotion was not found", - "requirement": { - "concept": requested_concept, - "fixture_state_cid": fixture, - "policy_cid": null - } - }) -} - -fn admit_promotion_file( - path: &Path, - registry: &mut PromotionDecisionRegistry, -) -> Result<(), String> { - let Some(ext) = path.extension().and_then(|ext| ext.to_str()) else { - return Ok(()); - }; - if !matches!(ext, "json" | "proof") { - return Ok(()); - } - let text = match std::fs::read_to_string(path) { - Ok(text) => text, - Err(_) => return Ok(()), - }; - let _ = registry.admit_json_str(&text); - Ok(()) -} - -fn should_descend(path: &Path) -> bool { - let Some(name) = path.file_name().and_then(|name| name.to_str()) else { - return true; - }; - !matches!( - name, - ".git" | ".worktrees" | "target" | "node_modules" | "vendor" | ".venv" - ) -} - -fn resolve_concept_counterpart(project_root: &Path, concept: &str) -> Option { - let (_, entry) = resolve_concept_index_entry(project_root, concept)?; - let entry_cid = entry.get("cid").and_then(Value::as_str)?; - let name = entry.get("name").and_then(Value::as_str)?; - if concept == name { - Some(entry_cid.to_string()) - } else if concept == entry_cid { - Some(name.to_string()) - } else { - None - } -} - -fn resolve_concept_index_entry(project_root: &Path, concept: &str) -> Option<(PathBuf, Value)> { - let repo_root = repo_root_for_concepts(project_root)?; - let index_path = repo_root - .join("menagerie") - .join("concept-shapes") - .join("catalog") - .join("index.json"); - let raw = std::fs::read_to_string(index_path).ok()?; - let index: Value = serde_json::from_str(&raw).ok()?; - let entries = index - .get("entries") - .and_then(Value::as_object) - .or_else(|| index.as_object())?; - for (cid, entry) in entries { - let entry_cid = entry.get("cid").and_then(Value::as_str).unwrap_or(cid); - let name = entry.get("name").and_then(Value::as_str); - if name == Some(concept) || concept == entry_cid { - return Some((repo_root, entry.clone())); - } - } - None -} - -fn repo_root_for_concepts(project_root: &Path) -> Option { - for base in [project_root.to_path_buf(), std::env::current_dir().ok()?] { - for candidate in base.ancestors() { - if candidate - .join("menagerie") - .join("concept-shapes") - .join("catalog") - .join("index.json") - .exists() - { - return Some(candidate.to_path_buf()); - } - } - } - None -} - -fn json_to_value(j: &Value) -> Arc { - match j { - Value::String(s) => CValue::string(s.clone()), - Value::Number(n) => { - if let Some(i) = n.as_i64() { - CValue::integer(i) - } else if let Some(u) = n.as_u64() { - if let Ok(i) = i64::try_from(u) { - CValue::integer(i) - } else { - CValue::string(n.to_string()) - } - } else { - CValue::string(n.to_string()) - } - } - Value::Bool(b) => CValue::boolean(*b), - Value::Null => CValue::null(), - Value::Array(items) => CValue::array(items.iter().map(json_to_value).collect()), - Value::Object(map) => CValue::object( - map.iter() - .map(|(key, value)| (key.as_str(), json_to_value(value))) - .collect::>(), - ), - } -} diff --git a/implementations/rust/provekit-cli/src/witness_ingest.rs b/implementations/rust/provekit-cli/src/witness_ingest.rs new file mode 100644 index 000000000..8431b3af7 --- /dev/null +++ b/implementations/rust/provekit-cli/src/witness_ingest.rs @@ -0,0 +1,133 @@ +// SPDX-License-Identifier: Apache-2.0 +// +// Witness ingestion: convert the raw contents that FORM a witness into a +// canonical WitnessMemento. +// +// A witness is raw external attestation -- the human-readable dump of a JUnit +// suite run, the JSON a vitest runner emits, a logger that printed `2+2=4`, a +// signature on a page. Heterogeneous evidence that something was observed to +// hold. This module is the single ingestion converter: contents-that-form-a- +// witness in, a canonical WitnessMemento out. +// +// The CID addresses WHAT is attested -- the observation (subject, fixture, +// measurements, outcome). The substrate does not dictate HOW it is attested: +// `signed_by` + `signature` are the attestation layer, excluded from the CID +// preimage. The CID is blake3-512 of the JCS of the observation; the signature +// (ed25519 here, but the mechanism is the attestor's, not the substrate's) is +// over the same bytes. `observed_at` is stamped at ingestion. + +use provekit_canonicalizer::blake3_512_of; +use provekit_ir_types::WitnessMemento; +use provekit_proof_envelope::{ed25519_pubkey_string, ed25519_sign_string}; +use serde_json::{json, Value as Json}; + +/// The raw facts that constitute a witness: WHAT was observed, with the +/// evidence verbatim. `measurements` carries the raw artifact unchanged (a +/// JUnit run's text, a vitest JSON object, a log line, a signature) -- no schema +/// is imposed on the observation. +pub struct WitnessContents { + /// CID of the contract/property this witness attests to. + pub witness_for: String, + /// The concept / op / subject observed. + pub subject: String, + /// CID of the fixture / input state the observation was made under. + pub fixture_state_cid: String, + /// Observations folded into this witness (1 for a single run). + pub sample_count: u64, + /// The raw witness artifact, verbatim (any shape). + pub measurements: Json, + /// When the observation was made (RFC3339), as reported by the observer/ + /// runner. Part of the observation, not fabricated by the converter: the + /// same observation observed at the same time is the same witness. + pub observed_at: String, + /// Observed result: "pass" | "fail" | "inconclusive". + pub outcome: String, +} + +/// Convert raw witness contents into a canonical, content-addressed, signed +/// WitnessMemento. This is THE witness converter -- every producer (a solver +/// discharge, an emitted-test run) reduces its observation to `WitnessContents` +/// and calls this. The minted CID is verified against `recompute_cid` so a +/// witness can never carry a CID divorced from its own content. +pub fn ingest_witness( + contents: &WitnessContents, + signer_seed: &[u8; 32], +) -> Result { + let signed_by = ed25519_pubkey_string(signer_seed); + + // Build the witness, then compute its CID exactly as validate() will: via + // recompute_cid, which addresses the observation and excludes the + // attestation layer (cid / signed_by / signature). Deriving the CID through + // the same function that checks it makes cid == recompute_cid true by + // construction -- no hand-rolled preimage that can drift from the canonical + // serialization. + let mut witness = WitnessMemento { + kind: "witness".to_string(), + schema_version: "1".to_string(), + witness_for: contents.witness_for.clone(), + subject: contents.subject.clone(), + fixture_state_cid: contents.fixture_state_cid.clone(), + observed_at: contents.observed_at.clone(), + sample_count: contents.sample_count, + measurements: contents.measurements.clone(), + outcome: contents.outcome.clone(), + signed_by: Some(signed_by), + signature: None, + cid: String::new(), + }; + let cid = witness.recompute_cid().map_err(|e| e.to_string())?; + + // HOW it is attested: ed25519 over the address (the WHAT). The mechanism is + // the attestor's, layered over the CID, not baked into it. + let signature = ed25519_sign_string(signer_seed, cid.as_bytes()); + witness.cid = cid; + witness.signature = Some(signature); + + // cid == recompute_cid by construction; validate() also checks field shape. + witness.validate().map_err(|e| e.to_string())?; + Ok(witness) +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn ingests_a_raw_test_output_into_a_valid_witness() { + // A vitest-runner-style raw artifact: heterogeneous JSON, verbatim. + let contents = WitnessContents { + witness_for: "blake3-512:contractcid".to_string(), + subject: "concept:adds".to_string(), + fixture_state_cid: "blake3-512:fixturecid".to_string(), + sample_count: 1, + measurements: json!({ + "runner": "vitest", + "stdout": "2+2=4", + "assertions": [{"name": "adds", "passed": true}] + }), + observed_at: "2026-05-27T00:00:00.000Z".to_string(), + outcome: "pass".to_string(), + }; + let witness = ingest_witness(&contents, &[7u8; 32]).expect("ingest"); + // The minted CID addresses the WHAT and matches recompute (which + // excludes the attestation layer). + assert_eq!(witness.cid, witness.recompute_cid().expect("recompute")); + assert!(witness.signed_by.is_some()); + assert!(witness.signature.is_some()); + assert_eq!(witness.outcome, "pass"); + // The raw artifact is carried verbatim. + assert_eq!(witness.measurements["stdout"], json!("2+2=4")); + + // Same observation, different signer key -> SAME CID (same WHAT), + // different attestation. The substrate addresses what, not how. + let other = ingest_witness(&contents, &[9u8; 32]).expect("ingest 2"); + assert_eq!( + witness.cid, other.cid, + "CID is over the observation; the signer is not part of it" + ); + assert_ne!( + witness.signature, other.signature, + "different key -> different attestation" + ); + } +} diff --git a/implementations/rust/provekit-cli/tests/cmd_bind_integration.rs b/implementations/rust/provekit-cli/tests/cmd_bind_integration.rs index 77273c8cf..67233fed7 100644 --- a/implementations/rust/provekit-cli/tests/cmd_bind_integration.rs +++ b/implementations/rust/provekit-cli/tests/cmd_bind_integration.rs @@ -125,7 +125,7 @@ fn assert_success(label: &str, output: &std::process::Output) { } #[test] -fn bind_from_stdin_emits_named_term_document_with_promotion_decisions() { +fn bind_from_stdin_emits_named_term_document_without_promotion() { let mut child = Command::new(provekit_bin()) .arg("bind") .stdin(Stdio::piped()) @@ -162,9 +162,12 @@ fn bind_from_stdin_emits_named_term_document_with_promotion_decisions() { named["terms"][0]["witnesses"][0]["predicateText"], "out == balance + amount" ); - assert_eq!( - named["promotionDecisionMementos"][0]["header"]["kind"], - "promotion-decision" + // Promotion was torn out of the bind/verify pipeline; bind must NOT emit + // promotion-decision mementos anymore. Guard the teardown. + assert!( + named["promotionDecisionMementos"].is_null(), + "promotion teardown: bind must not emit promotion-decision mementos, got {:?}", + named["promotionDecisionMementos"] ); } diff --git a/implementations/rust/provekit-cli/tests/witness_consensus_test.rs b/implementations/rust/provekit-cli/tests/witness_consensus_test.rs deleted file mode 100644 index a1e67bc40..000000000 --- a/implementations/rust/provekit-cli/tests/witness_consensus_test.rs +++ /dev/null @@ -1,245 +0,0 @@ -// SPDX-License-Identifier: Apache-2.0 - -use std::fs; -use std::path::{Path, PathBuf}; -use std::process::Command; - -use provekit_ir_types::{PromotionDecisionMemento, PromotionGate, PromotionResult, WitnessMemento}; -use serde_json::{json, Value}; - -fn provekit_bin() -> PathBuf { - PathBuf::from(env!("CARGO_BIN_EXE_provekit")) -} - -#[test] -fn witness_consensus_emits_promotion_decision_for_byte_equal_row_shape() { - let tmp = tempfile::tempdir().expect("tempdir"); - let catalog = tmp.path().join("catalog"); - fs::create_dir_all(&catalog).expect("create catalog dir"); - let out = tmp.path().join("promotion-receipt.proof"); - let policy = tmp.path().join("consensus-policy.json"); - write_consensus_policy(&policy, 4); - let concept = "concept:sql-query"; - let fixture = format!("blake3-512:{}", "a".repeat(128)); - let row_schema = json!({ - "columns": [ - {"name": "id", "declared_type": "INTEGER", "observed_typeof": "integer"}, - {"name": "name", "declared_type": "TEXT", "observed_typeof": "text"} - ] - }); - - for i in 0..4 { - let witness = witness( - concept, - &format!("ts:getUserById:{i}"), - &fixture, - &row_schema, - i, - ); - fs::write( - catalog.join(format!("witness-{i}.json")), - serde_json::to_string_pretty(&witness).expect("serialize witness"), - ) - .expect("write witness"); - } - - let output = Command::new(provekit_bin()) - .arg("witness") - .arg("consensus") - .arg("--concept") - .arg(concept) - .arg("--require-fixture") - .arg(&fixture) - .arg("--min-witnesses") - .arg("4") - .arg("--consensus-policy") - .arg(&policy) - .arg("--catalog") - .arg(&catalog) - .arg("--emit") - .arg(&out) - .arg("--json") - .output() - .expect("spawn provekit witness consensus"); - let stdout = String::from_utf8_lossy(&output.stdout); - let stderr = String::from_utf8_lossy(&output.stderr); - assert!( - output.status.success(), - "witness consensus failed\nstdout:\n{stdout}\nstderr:\n{stderr}" - ); - - let summary: Value = serde_json::from_slice(&output.stdout).expect("parse summary json"); - assert_eq!(summary["result"], "admitted"); - assert_eq!(summary["witnesses_consulted"], 4); - assert_eq!(summary["agreement"], "byte-equal"); - assert_eq!(summary["promotion_receipt"], out.display().to_string()); - - let raw = fs::read_to_string(&out).expect("promotion receipt written"); - let decision: PromotionDecisionMemento = - serde_json::from_str(&raw).expect("parse promotion decision"); - decision.validate().expect("promotion decision validates"); - assert_eq!( - decision.header.cid, - decision - .recompute_header_cid() - .expect("recompute promotion decision cid") - ); - assert_eq!(decision.header.kind, "promotion-decision"); - assert_eq!(decision.header.gate, PromotionGate::Threshold); - assert_eq!(decision.header.result, PromotionResult::Admitted); - assert_eq!(decision.header.evidence_cids.len(), 4); - assert!(decision.header.policy_cid.starts_with("blake3-512:")); - assert_eq!(decision.header.decision_payload["promoted_op"], concept); - assert_eq!( - decision.header.decision_payload["promotion"], - "documentary -> empirically-witnessed" - ); - assert_eq!(decision.header.decision_payload["agreement"], "byte-equal"); - assert_eq!(decision.header.decision_payload["total_observations"], 4); - assert_eq!( - decision.header.decision_payload["fixtures_consulted"], - json!([fixture]) - ); - assert_eq!(decision.header.decision_payload["row_schema"], row_schema); - assert_eq!( - decision.header.decision_payload["consensus_vector"]["unique_fixtures"], - 1 - ); - assert_eq!( - decision.header.decision_payload["consensus_vector"]["total_sample_count"], - 4 - ); -} - -#[test] -fn witness_consensus_rejects_row_shape_disagreement() { - let tmp = tempfile::tempdir().expect("tempdir"); - let catalog = tmp.path().join("catalog"); - fs::create_dir_all(&catalog).expect("create catalog dir"); - let out = tmp.path().join("promotion-receipt.proof"); - let concept = "concept:sql-query"; - let fixture = format!("blake3-512:{}", "b".repeat(128)); - let row_schema = json!({ - "columns": [ - {"name": "id", "declared_type": "INTEGER", "observed_typeof": "integer"} - ] - }); - let divergent_row_schema = json!({ - "columns": [ - {"name": "id", "declared_type": "INTEGER", "observed_typeof": "integer"}, - {"name": "email", "declared_type": "TEXT", "observed_typeof": "text"} - ] - }); - - for i in 0..3 { - let witness = witness( - concept, - &format!("ts:getUserById:{i}"), - &fixture, - &row_schema, - i, - ); - fs::write( - catalog.join(format!("witness-{i}.json")), - serde_json::to_string_pretty(&witness).expect("serialize witness"), - ) - .expect("write witness"); - } - let divergent = witness( - concept, - "python:get_user_by_id:0", - &fixture, - &divergent_row_schema, - 99, - ); - fs::write( - catalog.join("witness-divergent.json"), - serde_json::to_string_pretty(&divergent).expect("serialize divergent witness"), - ) - .expect("write divergent witness"); - - let output = Command::new(provekit_bin()) - .arg("witness") - .arg("consensus") - .arg("--concept") - .arg(concept) - .arg("--require-fixture") - .arg(&fixture) - .arg("--min-witnesses") - .arg("4") - .arg("--catalog") - .arg(&catalog) - .arg("--emit") - .arg(&out) - .output() - .expect("spawn provekit witness consensus"); - let stderr = String::from_utf8_lossy(&output.stderr); - assert!( - !output.status.success(), - "row-shape disagreement must reject promotion" - ); - assert!( - stderr.contains("measurements.row_schema disagreement"), - "stderr should identify the discovered loss axis:\n{stderr}" - ); - assert!( - !out.exists(), - "rejected consensus must not emit a promotion decision" - ); -} - -fn witness( - concept: &str, - subject: &str, - fixture: &str, - row_schema: &Value, - idx: usize, -) -> WitnessMemento { - let mut witness = WitnessMemento { - cid: String::new(), - fixture_state_cid: fixture.to_string(), - kind: "witness".to_string(), - measurements: json!({ - "query": { - "sql": format!("SELECT id, name FROM users WHERE id = {idx}"), - "sample_args": [idx] - }, - "row_schema": row_schema, - "sample_row": { - "id": idx, - "name": format!("user-{idx}") - } - }), - observed_at: "2026-05-14T00:00:00.000Z".to_string(), - outcome: "pass".to_string(), - sample_count: 1, - schema_version: "1".to_string(), - signature: None, - signed_by: None, - subject: subject.to_string(), - witness_for: concept.to_string(), - }; - witness.cid = witness.recompute_cid().expect("witness cid"); - witness -} - -fn write_consensus_policy(path: &Path, min_witnesses: usize) { - fs::write( - path, - format!( - r#"{{ - "kind": "consensus-policy", - "schemaVersion": "1", - "name": "test-row-schema-consensus", - "thresholds": [ - {{"axis": "min-witnesses-floor", "predicate": "n>={min_witnesses}"}}, - {{"axis": "environment-diversity", "predicate": "unique_fixtures>=1"}}, - {{"axis": "sample-depth", "predicate": "total_sample_count>={min_witnesses}"}} - ], - "allow_failures": false -}} -"# - ), - ) - .expect("write consensus policy"); -} diff --git a/implementations/rust/provekit-ir-types/src/lib.rs b/implementations/rust/provekit-ir-types/src/lib.rs index 11810e014..4993f2a41 100644 --- a/implementations/rust/provekit-ir-types/src/lib.rs +++ b/implementations/rust/provekit-ir-types/src/lib.rs @@ -6340,7 +6340,14 @@ impl LanguageTransitionMemento { impl WitnessMemento { pub fn recompute_cid(&self) -> Result { - migration_cid_without_keys(self, &["cid"]) + // The CID addresses WHAT is attested -- the observation (subject, + // fixture, measurements, outcome). It is not the substrate's job to say + // HOW it is attested: the signer identity and the signature bytes are + // the attestation layer, excluded from the preimage. The same + // observation attested by a different key is the SAME CID (same WHAT), + // a different attestation. (Excluding only "cid" was a bug: it baked + // the HOW into the WHAT and made a self-consistent CID impossible.) + migration_cid_without_keys(self, &["cid", "signed_by", "signature"]) } pub fn validate(&self) -> Result<(), MigrationReceiptError> { diff --git a/implementations/rust/provekit-ir-types/tests/promotion_decision_serde.rs b/implementations/rust/provekit-ir-types/tests/promotion_decision_serde.rs deleted file mode 100644 index 8f175e46e..000000000 --- a/implementations/rust/provekit-ir-types/tests/promotion_decision_serde.rs +++ /dev/null @@ -1,50 +0,0 @@ -// SPDX-License-Identifier: Apache-2.0 -// -// Round-trip and CID-pin tests for PromotionDecisionMemento. -// -// Source of truth: -// protocol/specs/2026-05-13-promotion-decision-memento.md §1 and §4 - -use provekit_ir_types::PromotionDecisionMemento; - -const PROMOTION_DECISION_FIXTURE: &str = r#"{"envelope":{"declaredAt":"2026-05-13T12:00:00Z","signature":"ed25519:fixture-signature","signer":"ed25519:fixture-signer"},"header":{"candidate_cid":"blake3-512:111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111","cid":"blake3-512:4c6902a6c15b4a539f956e89d43fbf1fb0a7b243993e9b6c9ce0dd96646a8d956c9981d7313086a82bf04f897767666f320002f58560616c21c50b0bee9ca9ba","decider_cid":"blake3-512:222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222","decision_payload":{"distinctness_axis":"language","evaluated_score":17,"required_score":12,"result":"admitted"},"evidence_cids":["blake3-512:333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333","blake3-512:444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444"],"gate":"threshold","kind":"promotion-decision","policy_cid":"blake3-512:555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555","promoted_cid":"blake3-512:666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666","result":"admitted","schemaVersion":"1"},"metadata":{"counterexample_cids":[],"note":"Promotion accepted by threshold policy.","source_url":"https://example.test/provekit/promotion/791"}}"#; - -#[test] -fn promotion_decision_round_trips_as_jcs_bytes() { - let memento: PromotionDecisionMemento = - serde_json::from_str(PROMOTION_DECISION_FIXTURE).expect("parse fixture"); - - let serialized = memento.to_jcs_string().expect("canonicalize"); - - assert_eq!(serialized, PROMOTION_DECISION_FIXTURE); -} - -#[test] -fn promotion_decision_recomputes_pinned_header_cid() { - let memento: PromotionDecisionMemento = - serde_json::from_str(PROMOTION_DECISION_FIXTURE).expect("parse fixture"); - - assert_eq!( - memento.recompute_header_cid().expect("recompute cid"), - memento.header.cid - ); -} - -#[test] -fn promotion_decision_rejects_empty_evidence_cids() { - use provekit_ir_types::PromotionDecisionInvariantError; - let mut m: PromotionDecisionMemento = - serde_json::from_str(PROMOTION_DECISION_FIXTURE).expect("parse fixture"); - m.header.evidence_cids.clear(); - assert_eq!( - m.validate(), - Err(PromotionDecisionInvariantError::EmptyEvidenceCids) - ); -} - -#[test] -fn promotion_decision_validate_accepts_fixture() { - let m: PromotionDecisionMemento = - serde_json::from_str(PROMOTION_DECISION_FIXTURE).expect("parse fixture"); - m.validate().expect("fixture has non-empty evidence_cids"); -} diff --git a/implementations/rust/provekit-verifier/src/handshake.rs b/implementations/rust/provekit-verifier/src/handshake.rs index 21e7726e3..b399429c4 100644 --- a/implementations/rust/provekit-verifier/src/handshake.rs +++ b/implementations/rust/provekit-verifier/src/handshake.rs @@ -128,18 +128,48 @@ pub fn locate_producer_post( } let inner_name = arg.get("name").and_then(|v| v.as_str())?; let producer_bridge = bridges_by_symbol.get(inner_name)?; - let target_cid = producer_bridge - .pointer("/evidence/body/targetContractCid") - .and_then(|v| v.as_str())?; + // Shape-agnostic: production mint emits v1.2-layered mementos (fields on + // `header`); only v1.1-flat carries them on `evidence.body`. Reading the + // flat path alone meant the producer post never resolved for harvested + // calls, so the callsite fell through to the bare `instantiate` form + // instead of the real `producer_post -> consumer_pre` implication. + let bridge_body = crate::types::memento_body(producer_bridge)?; + let target_cid = bridge_body.get("targetContractCid").and_then(|v| v.as_str())?; let producer_contract = pool_mementos.get(target_cid)?; - let post = producer_contract - .pointer("/evidence/body/post") + let producer_body = crate::types::memento_body(producer_contract)?; + let post = producer_body + .get("post") .filter(|v| v.is_object()) .cloned()?; + // The post relates the producer's output to its inputs via the carrier + // variable `result` (e.g. `result == value`). Quantify over that carrier + // so `build_implication_obligation` can unify it with the consumer's + // formal: `forall _h0. producer_post[result:=_h0] -> consumer_pre[formal:=_h0]`. + // Already-quantified posts pass through untouched. + let post = wrap_post_forall(post, producer_body); let post_hash = formula_hash(&post); Some((post, post_hash)) } +/// Wrap a bare producer post in `forall result. post`, binding the output +/// carrier variable `result`. Sort is taken from the producer's first formal +/// sort (its return width in the single-formal model) or `Int`. An +/// already-quantified post passes through untouched. +fn wrap_post_forall(post: Json, producer_body: &Json) -> Json { + if post.get("kind").and_then(|v| v.as_str()) == Some("forall") { + return post; + } + // The carrier `result` is the producer's RETURN value, not a parameter, so + // its sort is NOT `formalSorts[i]` (those model parameter sorts, paired + // with `formals`). The verifier reasons in LIA; bind the carrier as the + // canonical `Int`, matching `build_implication_obligation`'s default. A + // non-Int return would need the contract's return sort, which the memento + // does not expose separately today. + let _ = producer_body; + let sort = serde_json::json!({"kind": "primitive", "name": "Int"}); + serde_json::json!({"kind": "forall", "name": "result", "sort": sort, "body": post}) +} + /// Tier 1: literal equality of canonical hashes. pub fn try_tier1(producer_post_hash: &str, consumer_pre_hash: &str) -> bool { producer_post_hash == consumer_pre_hash diff --git a/implementations/rust/provekit-verifier/src/resolve_target.rs b/implementations/rust/provekit-verifier/src/resolve_target.rs index b959573c7..3daf1221c 100644 --- a/implementations/rust/provekit-verifier/src/resolve_target.rs +++ b/implementations/rust/provekit-verifier/src/resolve_target.rs @@ -10,6 +10,8 @@ // `targetProofCid`. See protocol/specs/2026-04-30-ir-formal-grammar.md // § "Bridge target pinning: the shim-poisoning vector". +use serde_json::{json, Value as Json}; + use crate::types::{memento_body, memento_kind, CallSite, MementoPool, ResolvedProperty}; pub fn run(cs: &CallSite, pool: &MementoPool) -> Result { @@ -64,7 +66,15 @@ pub fn run(cs: &CallSite, pool: &MementoPool) -> Result= 0`). The discharge stages + // (`instantiate`, `build_implication_obligation`) require it quantified, so + // the actual passed at the callsite can be substituted for the formal. + // Synthesize `forall (formal). pre` here from the contract's own + // `formals`/`formalSorts`. Single-formal: the callsite model tracks a + // single arg term, so we bind the first formal. An already-quantified pre + // (hand-built bundles) passes through untouched. + let ir_formula = body.get("pre").cloned().map(|pre| wrap_pre_forall(pre, body)); // A target carrying a `formals` array is a body-derived op-contract // (body-bearing). The caller must NOT vacuous-pass such a target if its // obligation was not reduced + discharged; it must refuse. Surface the @@ -84,3 +94,31 @@ pub fn run(cs: &CallSite, pool: &MementoPool) -> Result Json { + if pre.get("kind").and_then(|v| v.as_str()) == Some("forall") { + return pre; + } + let Some(name) = body + .get("formals") + .and_then(|v| v.as_array()) + .and_then(|a| a.first()) + .and_then(|v| v.as_str()) + else { + return pre; + }; + // The formal's sort is already canonical (the lifter emits the canonical + // primitive set; integers are `Int`). Use it directly; default to `Int` + // when a contract omits formalSorts. + let sort = body + .get("formalSorts") + .and_then(|v| v.as_array()) + .and_then(|a| a.first()) + .cloned() + .unwrap_or_else(|| json!({"kind": "primitive", "name": "Int"})); + json!({"kind": "forall", "name": name, "sort": sort, "body": pre}) +} diff --git a/implementations/rust/provekit-verifier/src/types.rs b/implementations/rust/provekit-verifier/src/types.rs index a3e3c4f03..b5447415e 100644 --- a/implementations/rust/provekit-verifier/src/types.rs +++ b/implementations/rust/provekit-verifier/src/types.rs @@ -184,12 +184,10 @@ impl MementoPool { /// This is the core of bridge enforcement: "does the publisher's /// post imply the consumer's pre?" pub fn verify_implication(&self, antecedent_cid: &str, consequent_cid: &str) -> Option<&Json> { - // Direct lookup: find a memento that indexes both hashes - // and is an implication evidence kind. - let _ant_memento = self.verify_by_hash(antecedent_cid)?; - let _con_memento = self.verify_by_hash(consequent_cid)?; - - // Scan for implication mementos that link these two. + // The proof of `P → Q` is the implication memento that links them, not + // the presence of P or Q in `formula_to_memento`. (P, the antecedent, + // is no longer indexed there at all -- it is an assumption, not a + // fact.) Scan for an implication memento with these exact endpoints. // Shape-agnostic: under v1.2 these references live in the // metadata; under v1.1 they live in evidence.body. for envelope in self.mementos.values() { @@ -294,19 +292,21 @@ impl MementoPool { /// The .proof protocol IS the cache: storing a memento IS caching /// the verification result. pub fn insert(&mut self, memento_cid: String, envelope: Json) { - // Index by the formula hashes referenced in the body. Layout - // depends on shape; `memento_body_field` papers over the cut. - // Contract: preHash/postHash/invHash (in metadata under v1.2, - // in evidence.body under v1.1). - // Implication: antecedentHash/consequentHash (in header under - // v1.2, in evidence.body under v1.1). - for field in &[ - "preHash", - "postHash", - "invHash", - "antecedentHash", - "consequentHash", - ] { + // Index ONLY the formula hashes that name an ESTABLISHED FACT into + // `formula_to_memento` -- the map Tier 0 (`verify`) trusts as "this + // formula is proven true". A precondition (`preHash`) and an + // implication antecedent (`antecedentHash`) are OBLIGATIONS / + // ASSUMPTIONS, not facts: indexing them let a callsite's consumer + // precondition self-discharge merely because the callee DECLARES it + // (the missing-edge hole; see `precondition_is_obligation_not_verified_fact`). + // Established facts: a function's `postHash`/`invHash` (guarantees on + // return / always). NOT `consequentHash`: an implication's consequent + // `Q` holds only GIVEN its antecedent `P`, so indexing it would make + // Tier 0 `verify(Q)` treat a conditional as unconditional -- the same + // category error as `preHash`/`antecedentHash`. Proven implications + // discharge via `verify_implication`/`can_implies`, which scan + // implication mementos directly and don't need the consequent here. + for field in &["postHash", "invHash"] { if let Some(hash) = memento_body_field(&envelope, field).and_then(|v| v.as_str()) { self.formula_to_memento .insert(hash.to_string(), memento_cid.clone()); @@ -941,4 +941,63 @@ mod tests { assert_eq!(v.invariant, "state >= 0"); assert_eq!(v.function_cid, fc); } + + // ---- A precondition is an obligation, not a verified fact ---- + + fn make_contract_memento(cid: &str, name: &str, pre: &Json, post: &Json) -> Json { + // v1.1 flat shape: evidence.kind="contract", derived hashes in body. + json!({ + "cid": cid, + "evidence": { + "kind": "contract", + "body": { + "contractName": name, + "pre": pre, + "post": post, + "preHash": compute_formula_cid(pre), + "postHash": compute_formula_cid(post), + } + } + }) + } + + #[test] + fn precondition_is_obligation_not_verified_fact() { + // The missing-edge hole: indexing a contract's preHash into the + // "verified formulas" map (formula_to_memento) makes Tier 0 + // `pool.verify(consumer_pre)` self-discharge — a callsite's consumer + // precondition is satisfied merely by the callee DECLARING it. A + // precondition is an obligation to discharge, never an established + // fact. Only the post (and inv) are guarantees. + let mut pool = MementoPool::default(); + let pre = json!({ + "kind": "atomic", "pred": "ge", + "args": [{"kind":"var","name":"encoding"}, {"kind":"const","value":0}] + }); + let post = json!({ + "kind": "atomic", "pred": "eq", + "args": [{"kind":"var","name":"result"}, {"kind":"var","name":"value"}] + }); + let m_cid = + "blake3-512:aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" + .to_string(); + pool.insert( + m_cid.clone(), + make_contract_memento(&m_cid, "content_address", &pre, &post), + ); + + // A postcondition IS an established fact (the function guarantees it). + assert!( + pool.verify(&post).is_some(), + "a contract's post must remain a verified fact" + ); + + // A bare precondition must NOT count as verified — else any callsite's + // consumer pre self-discharges at Tier 0 and the missing edge hides. + assert!( + pool.verify(&pre).is_none(), + "a contract's bare pre must NOT be treated as a verified fact" + ); + } + } diff --git a/implementations/rust/provekit-walk/src/lift.rs b/implementations/rust/provekit-walk/src/lift.rs index 546968640..b86debcb9 100644 --- a/implementations/rust/provekit-walk/src/lift.rs +++ b/implementations/rust/provekit-walk/src/lift.rs @@ -651,6 +651,30 @@ fn lift_expr_to_term_inner(expr: &Expr, ctx: &mut LiftCtx) -> Option { None } } + Expr::Call(call) => { + // Plain function call `f(args)`. Lift to a ctor named by the + // callee's bare symbol (the last path segment) so the call tree + // SURVIVES into the contract formula. This is the keystone: the + // ctor name matches the callee's auto-minted bridge `sourceSymbol`, + // so `enumerate_callsites` finds the seam, `resolve_target` pulls + // the callee's precondition, and the runner discharges + // `producer_post -> callee_pre`. Without this arm the call tree + // vanished and the postcondition collapsed to a vacuous `true` -- + // the missing edge was invisible to the solver. Mirrors the + // `Expr::MethodCall` arm. Language-blind once emitted: the catch + // lives in the verifier, below the source language. + let Expr::Path(syn::ExprPath { path, .. }) = &*call.func else { + // Calls through a non-path callee (closure value, fn pointer + // in a local, etc.) have no stable bridge symbol to name. + return None; + }; + let callee = path.segments.last()?.ident.to_string(); + let mut args = Vec::with_capacity(call.args.len()); + for a in &call.args { + args.push(lift_expr_to_term_inner(a, ctx)?); + } + Some(IrTerm::Ctor { name: callee, args }) + } Expr::If(_) | Expr::Match(_) | Expr::Block(_) => { // Conditional / match / block expressions don't lift to a // single canonical IR term in the MVP — they would need a @@ -1192,6 +1216,36 @@ mod tests { // ---- bug-fix regression tests ---- + #[test] + fn call_expr_body_lifts_call_tree_into_postcondition() { + // A function whose body is a nested call must derive a post that + // CONTAINS the call tree as ctor terms. Otherwise the callees are + // invisible to `enumerate_callsites` and the missing-edge seam can + // never be discharged (it was the false-green hole: this collapsed to + // a vacuous `true` because `Expr::Call` had no lift arm). + let item_fn = parse_fn( + r#" + fn address_of(value: i64) -> i64 { + content_address(serialize(value)) + } + "#, + ); + let post = lift_function_postcondition(&item_fn); + let json = serde_json::to_string(post.as_formula()).unwrap(); + assert!( + json.contains("\"result\""), + "post must derive result = : {json}" + ); + assert!( + json.contains("content_address"), + "post must contain the outer call ctor `content_address`: {json}" + ); + assert!( + json.contains("serialize"), + "post must contain the nested call ctor `serialize`: {json}" + ); + } + #[test] fn debug_assert_not_lifted_to_postcondition() { // Bug #5: `debug_assert!` is compiled out in release builds. diff --git a/implementations/rust/provekit-walk/src/marriage.rs b/implementations/rust/provekit-walk/src/marriage.rs index ef426867f..5b33bba6d 100644 --- a/implementations/rust/provekit-walk/src/marriage.rs +++ b/implementations/rust/provekit-walk/src/marriage.rs @@ -959,22 +959,24 @@ mod tests { assert_eq!( ast.formal_sorts, vec![ + // Integers canonicalize to `Int` (width is a refinement, not a + // sort); `&[u32]` -> `Ref>`, `usize` -> `Int`. Sort::Primitive { - name: "Ref>".to_string() + name: "Ref>".to_string() }, Sort::Primitive { - name: "Usize".to_string() + name: "Int".to_string() }, ], - "formal_sorts must be [Ref>, Usize]: {:?}", + "formal_sorts must be [Ref>, Int]: {:?}", ast.formal_sorts ); assert_eq!( ast.return_sort, Sort::Primitive { - name: "U32".to_string() + name: "Int".to_string() }, - "return_sort must be U32: {:?}", + "return_sort must be Int (u32 canonicalizes to Int): {:?}", ast.return_sort ); } diff --git a/implementations/rust/provekit-walk/src/sort_translate.rs b/implementations/rust/provekit-walk/src/sort_translate.rs index 857d9c8cd..cf3f2bf7d 100644 --- a/implementations/rust/provekit-walk/src/sort_translate.rs +++ b/implementations/rust/provekit-walk/src/sort_translate.rs @@ -210,18 +210,17 @@ fn path_sort_name(p: &syn::TypePath) -> String { /// to using the ident as a user-defined type sort name. fn primitive_sort_name(ident: &str) -> Option<&'static str> { Some(match ident { - "u8" => "U8", - "u16" => "U16", - "u32" => "U32", - "u64" => "U64", - "u128" => "U128", - "usize" => "Usize", - "i8" => "I8", - "i16" => "I16", - "i32" => "I32", - "i64" => "I64", - "i128" => "I128", - "isize" => "Isize", + // Integer types canonicalize to the spec's `Int`. Per + // canonicalization-grammar.md §5 the canonical primitive set is a + // fixed {Bool, Int, Real, String, Ref, ...}; width (`u32` vs `u64`) is + // NOT a sort distinction. Width is a range refinement -- sidecar to the + // contract -- and the SOURCE type system already owns width and + // narrowing (a `u64 -> u32` truncation is a compile error, not our + // job). Collapsing to `Int` is what lets a Rust `i64` contract + // federate (share CIDs) with a Java `long` contract over the same + // canonical `Int`, and keeps the solver in LIA. + "u8" | "u16" | "u32" | "u64" | "u128" | "usize" | "i8" | "i16" | "i32" | "i64" + | "i128" | "isize" => "Int", "f32" => "F32", "f64" => "F64", "bool" => "Bool", @@ -474,29 +473,14 @@ fn charon_literal_sort_name(lit: &serde_json::Value) -> String { } // Object: {"Literal": {"UInt": "U32"}} or {"Literal": {"SInt": "I8"}} - // or {"Literal": {"Float": "F32"}} + // or {"Literal": {"Float": "F32"}}. Integer widths canonicalize to `Int` + // (see primitive_sort_name): width is a refinement, not a sort. Keeps the + // syn and Charon paths in agreement on canonical `Int`. if let Some(obj) = lit.as_object() { - if let Some(uint) = obj.get("UInt").and_then(|v| v.as_str()) { - return match uint { - "U8" => "U8".to_string(), - "U16" => "U16".to_string(), - "U32" => "U32".to_string(), - "U64" => "U64".to_string(), - "U128" => "U128".to_string(), - "Usize" => "Usize".to_string(), - _ => uint.to_string(), - }; - } - if let Some(sint) = obj.get("SInt").and_then(|v| v.as_str()) { - return match sint { - "I8" => "I8".to_string(), - "I16" => "I16".to_string(), - "I32" => "I32".to_string(), - "I64" => "I64".to_string(), - "I128" => "I128".to_string(), - "Isize" => "Isize".to_string(), - _ => sint.to_string(), - }; + if obj.get("UInt").and_then(|v| v.as_str()).is_some() + || obj.get("SInt").and_then(|v| v.as_str()).is_some() + { + return "Int".to_string(); } if let Some(float) = obj.get("Float").and_then(|v| v.as_str()) { return match float { @@ -646,16 +630,23 @@ mod tests { } #[test] - fn u32_and_u64_are_distinct() { - let a = syn_type_to_sort(&parse_ty("u32")); - let b = syn_type_to_sort(&parse_ty("u64")); - assert_ne!(a, b); + fn integer_widths_canonicalize_to_int() { + // Width is a refinement, not a sort: every integer width collapses to + // the canonical `Int` (canonicalization-grammar §5). `u32`/`u64`/`i8` + // are the SAME sort; their distinction lives in the contract's range + // refinement, not the sort name. This is the deliberate reversal of + // the old width-distinct "Option A". + let int = Sort::Primitive { name: "Int".to_string() }; + for ty in ["u8", "u16", "u32", "u64", "u128", "usize", "i8", "i16", "i32", "i64", "i128", "isize"] { + assert_eq!(syn_type_to_sort(&parse_ty(ty)), int, "`{ty}` must canonicalize to Int"); + } } #[test] - fn i8_and_u8_are_distinct() { + fn int_and_bool_are_distinct() { + // Integers collapse to Int, but Int and Bool stay distinct sorts. let a = syn_type_to_sort(&parse_ty("i8")); - let b = syn_type_to_sort(&parse_ty("u8")); + let b = syn_type_to_sort(&parse_ty("bool")); assert_ne!(a, b); } @@ -676,7 +667,7 @@ mod tests { assert_eq!( slice_sort, Sort::Primitive { - name: "Slice".to_string() + name: "Slice".to_string() } ); } @@ -687,7 +678,7 @@ mod tests { assert_eq!( ty, Sort::Primitive { - name: "Ref>".to_string() + name: "Ref>".to_string() } ); }