# Verification process

Creating the first pipeline: Rust → hax → Lean → specification → proof.

## Choosing the right candidate for the first verification

https://github.com/risc0/risc0/blob/2e73cb82cadfbad9190b2b34f124481c9b57d371/risc0/zkvm/src/receipt/merkle.rs

Best starting candidate: Merkle proof verification / Merkle root recomputation.

```rust
impl MerkleProof {
    /// Verify the Merkle inclusion proof against the given leaf and root.
    pub fn verify(
        &self,
        leaf: &Digest,
        root: &Digest,
        hashfn: &dyn HashFn<BabyBear>,
    ) -> Result<()> {
        ensure!(
            self.root(leaf, hashfn) == *root,
            "merkle proof verify failed"
        );
        Ok(())
    }

    /// Calculate the root of this branch by iteratively hashing, starting from the leaf.
    pub fn root(&self, leaf: &Digest, hashfn: &dyn HashFn<BabyBear>) -> Digest {
        let mut cur = *leaf;
        let mut cur_index = self.index;
        for sibling in &self.digests {
            cur = if cur_index & 1 == 0 {
                *hashfn.hash_pair(&cur, sibling)
            } else {
                *hashfn.hash_pair(sibling, &cur)
            };
            cur_index >>= 1;
        }
        cur
    }
}
```

Why it is the best first piece:

- It is actually used in both RISC0 and Jolt (commitments to memory/trace pages).
- The specification is elementary: "root = fold over path".
- The logic's correctness can be proved without cryptographic assumptions about the hash.
- It links naturally with the "Oracle" concept from ArkLib: hash compression can be modeled as an oracle.

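The "root = fold over path" specification can be written down directly. A minimal Lean sketch (standalone and illustrative: `Digest` and `hash` are abstract placeholders, not the extracted types):

```lean
-- Spec sketch: the Merkle root is a fold over the sibling path, where the
-- parity of the current index decides the argument order of the hash.
-- No cryptographic assumptions about `hash` are needed for this definition.
def merkleRootSpec {Digest : Type}
    (hash : Digest → Digest → Digest)
    (leaf : Digest) (index : Nat) (path : List Digest) : Digest :=
  match path with
  | [] => leaf
  | sib :: rest =>
      let cur := if index % 2 = 0 then hash leaf sib else hash sib leaf
      merkleRootSpec hash cur (index / 2) rest
```

The recursive form (rather than `List.foldl` over a pair) tends to make induction over the path straightforward in proofs.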
## Adapt the Rust code for extraction

The simplification is acceptable and well-justified for formal verification purposes. What we did:

- Removed trait dependencies
- Eliminated the `Result` type and error handling
- Flattened the struct into function parameters
- Removed generic type parameters
- Added a toy hash for testing

How it can be integrated with ArkLib:

In ArkLib, `hash_pair` can be modeled as a cryptographic oracle. This turns the Merkle root computation into an oracle computation that makes exactly `len(digests)` queries to the hash oracle, fitting naturally into ArkLib's verification framework, where algorithms interact with oracle abstractions.

```rust
// This is "representative zkVM code": the algorithm is identical,
// the interface has been adapted for verification/extraction.

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct Digest(pub [u32; 8]);

pub fn merkle_root_from_path(
    leaf: Digest,
    index: u32,
    digests: &[Digest],
    hash_pair: fn(&Digest, &Digest) -> Digest,
) -> Digest {
    let mut cur = leaf;
    let mut cur_index = index;
    for sibling in digests {
        cur = if cur_index & 1 == 0 {
            hash_pair(&cur, sibling)
        } else {
            hash_pair(sibling, &cur)
        };
        cur_index >>= 1;
    }
    cur
}

pub fn merkle_verify_from_path(
    leaf: Digest,
    index: u32,
    digests: &[Digest],
    expected_root: Digest,
    hash_pair: fn(&Digest, &Digest) -> Digest,
) -> bool {
    merkle_root_from_path(leaf, index, digests, hash_pair) == expected_root
}

// For `cargo test` to pass and provide a basic sanity check, we implement a toy hash function:

#[cfg(test)]
mod tests {
    use super::*;

    fn toy_hash_pair(a: &Digest, b: &Digest) -> Digest {
        let mut out = [0u32; 8];
        for i in 0..8 {
            out[i] = a.0[i].wrapping_add(b.0[i]) ^ 0x9e3779b9;
        }
        Digest(out)
    }

    #[test]
    fn root_and_verify_agree() {
        let leaf = Digest([1, 2, 3, 4, 5, 6, 7, 8]);
        let sib1 = Digest([9, 10, 11, 12, 13, 14, 15, 16]);
        let sib2 = Digest([17, 18, 19, 20, 21, 22, 23, 24]);
        let path = vec![sib1, sib2];

        let root = merkle_root_from_path(leaf, 3, &path, toy_hash_pair);
        assert!(merkle_verify_from_path(leaf, 3, &path, root, toy_hash_pair));
    }
}
```

We extracted the essence of the Merkle proof algorithm while removing RISC0-specific implementation details.

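The query-count claim from the ArkLib note above can be made concrete. A hedged sketch (not ArkLib code; the only interface change is that the oracle is an `FnMut` closure rather than a `fn` pointer, so each query can be observed):

```rust
// Sketch: model the pair-compression hash as an observable oracle.
// The fold is the same as in merkle_root_from_path; only the oracle type
// differs (FnMut instead of fn) so that queries can be counted.
type Digest = [u32; 8];

fn toy_hash_pair(a: &Digest, b: &Digest) -> Digest {
    let mut out = [0u32; 8];
    for i in 0..8 {
        out[i] = a[i].wrapping_add(b[i]) ^ 0x9e3779b9;
    }
    out
}

fn root_with_oracle(
    leaf: Digest,
    index: u32,
    digests: &[Digest],
    oracle: &mut impl FnMut(&Digest, &Digest) -> Digest,
) -> Digest {
    let mut cur = leaf;
    let mut cur_index = index;
    for sibling in digests {
        cur = if cur_index & 1 == 0 {
            oracle(&cur, sibling)
        } else {
            oracle(sibling, &cur)
        };
        cur_index >>= 1;
    }
    cur
}

fn main() {
    let leaf: Digest = [1u32; 8];
    let path = [[2u32; 8], [3u32; 8], [4u32; 8]];
    let mut queries = 0usize;
    let _root = root_with_oracle(leaf, 5, &path, &mut |a, b| {
        queries += 1;
        toy_hash_pair(a, b)
    });
    // Exactly one oracle query per path element.
    assert_eq!(queries, path.len());
    println!("queries = {}", queries);
}
```

Under this view, `merkle_root_from_path` is an oracle computation with query complexity exactly `digests.len()`, which is the shape ArkLib-style frameworks reason about.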
## Extract and write first theorems

- Theorem 1: `merkle_verify_is_pure_eq`
  This theorem proves that the extracted `merkle_verify_from_path` is a purely syntactic unfolding: compute the root, then compare it to the expected value. The property is trivial but important: it shows that the verification code does nothing extra.

- Theorem 2: `merkle_verify_of_computed_root_is_true`
  This theorem proves basic soundness of the algorithm:
  - if we compute a root `r` via `merkle_root_from_path`,
  - and then verify the same root `r` via `merkle_verify_from_path`,
  - the result is always `true`.

```lean
import MerkleRootLean.Extracted.Merkle_root_rs

-- The two theorems below are not about cryptographic security, but they are
-- already machine-checked for the Rust-extracted code.

-- merkle_verify_is_pure_eq: the extracted verify is the same as
-- "compute the root and compare it to the expected one".
-- "verify = pure (root == expected_root)" (almost rfl/simp)
theorem merkle_verify_is_pure_eq
  (leaf : Merkle_root_rs.Digest)
  (index : u32)
  (digests : RustSlice Merkle_root_rs.Digest)
  (expected_root : Merkle_root_rs.Digest)
  (hash_pair :
    Merkle_root_rs.Digest → Merkle_root_rs.Digest → RustM Merkle_root_rs.Digest) :
  Merkle_root_rs.merkle_verify_from_path leaf index digests expected_root hash_pair
  =
  (do
    let r ← Merkle_root_rs.merkle_root_from_path leaf index digests hash_pair
    pure (Core.Cmp.PartialEq.eq Merkle_root_rs.Digest Merkle_root_rs.Digest r expected_root)) := by
  -- should work by unfolding the definition:
  simp [Merkle_root_rs.merkle_verify_from_path]

-- merkle_verify_of_computed_root_is_true: the basic soundness property of verify.
-- "if expected_root = compute_root(data), then verify(data, expected_root) = true"
theorem merkle_verify_of_computed_root_is_true
  (leaf : Merkle_root_rs.Digest)
  (index : u32)
  (digests : RustSlice Merkle_root_rs.Digest)
  (hash_pair :
    Merkle_root_rs.Digest → Merkle_root_rs.Digest → RustM Merkle_root_rs.Digest) :
  (do
    let r ← Merkle_root_rs.merkle_root_from_path leaf index digests hash_pair
    Merkle_root_rs.merkle_verify_from_path leaf index digests r hash_pair)
  =
  pure true := by
  -- Unfold verify; only `eq r r` remains.
  simp [Merkle_root_rs.merkle_verify_from_path]
```

## Current work

Stage 1 / Recon:

We set up a PoC end-to-end pipeline (Rust → `cargo hax into lean` → Lean project/CI) on a simplified zkVM-representative component: Merkle-path root recomputation (adapted from RISC0, with an extraction-friendly interface). CI is green, with the hax/Lean steps currently marked non-blocking.

Current blocker: the Lean backend extraction succeeds, but the generated Lean module does not typecheck against the current Lean prelude, because it references missing/incomplete core models (Core.Cmp, Core.Iter, generated AssociatedTypes for traits like PartialEq/Eq/Debug, plus iterator/fold APIs).

We asked the hax maintainers on [Zulip](https://hacspec.zulipchat.com/#narrow/channel/269544-general/topic/hax.20.2B.20lean.20example/with/561950534).

They confirmed this is expected right now. The preferred workaround is to
(a) define the missing `Core.*` models locally, or
(b) patch the extracted Lean.
They also noted that they are switching methodology: core library models will be written in Rust and then extracted to Lean with hax, so hand-written Lean core models will soon be replaced.

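For workaround (a), a local stub can mirror the names the generated module references. A hypothetical sketch (the actual generated signatures may differ; the shape of `Core.Cmp.PartialEq.eq` below only follows the call pattern seen in the extracted theorems):

```lean
-- Hypothetical local model for the missing Core.Cmp namespace.
-- The extracted code calls `Core.Cmp.PartialEq.eq Self Rhs x y`, so we model
-- `eq` as an explicitly type-indexed boolean comparison over types with a
-- BEq instance. Real hax output may use classes/AssociatedTypes instead.
namespace Core.Cmp

def PartialEq.eq (Self Rhs : Type) [BEq Self] (x y : Self) : Bool :=
  x == y

end Core.Cmp
```

Such stubs are throwaway by design: once core models are extracted from Rust by hax itself, they can be deleted.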
Next steps:

- Build a minimal Rust "core-model" crate for the missing traits/APIs (PartialEq/Eq/Iter) and extract it to Lean (instead of maintaining hand-written Lean stubs).
- Document the encountered gaps/heuristics and raise issues/PRs where appropriate.
- Once the extracted Merkle module typechecks, finish at least the basic proofs (e.g., verify is "compute root then compare", plus a trivial acceptance lemma).
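A possible starting shape for that core-model crate. Everything here is an illustrative assumption (names, trait surface, and the fold helper are ours, not the actual hax core model):

```rust
// Hypothetical sketch of a minimal core-model crate: re-implement only the
// trait surface the extracted Merkle module needs, in plain Rust, so that hax
// can extract these models to Lean instead of hand-written Lean stubs.

/// Illustrative stand-in for core::cmp::PartialEq.
pub trait PartialEqModel<Rhs = Self> {
    fn eq(&self, other: &Rhs) -> bool;
    fn ne(&self, other: &Rhs) -> bool {
        !self.eq(other)
    }
}

impl PartialEqModel for u32 {
    fn eq(&self, other: &u32) -> bool {
        *self == *other
    }
}

/// Illustrative stand-in for the iterator/fold API that the extracted
/// Merkle loop lowers to.
pub fn fold_model<T: Copy, A>(items: &[T], init: A, mut step: impl FnMut(A, T) -> A) -> A {
    let mut acc = init;
    for &x in items {
        acc = step(acc, x);
    }
    acc
}

fn main() {
    // Sanity checks that the models behave like the core APIs they mirror.
    assert!(PartialEqModel::eq(&7u32, &7u32));
    assert!(PartialEqModel::ne(&7u32, &8u32));
    let sum = fold_model(&[1u32, 2, 3], 0u32, |a, x| a + x);
    assert_eq!(sum, 6);
    println!("core-model sanity checks passed");
}
```

Keeping the crate this small should make the extracted Lean easy to audit against the Lean prelude it replaces.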