Feature/siewwwin by LingSiewWin · Pull Request #12 · rhemify/rhemify-monorepo

LingSiewWin · 2026-05-12T07:13:45Z

No description provided.

The replace directive pointed to ../../solana-go, which doesn't exist in the repo, causing all Go builds to fail at module resolution.

- Remove `replace github.com/gagliardetto/solana-go => ../../solana-go`
- Pin require to v1.20.0 (latest stable upstream release)
- `go mod tidy` cleaned up unused indirect deps (swaggo/*, openapi/*, edwards25519, KyleBanks/depth) that were transitive only through the vendored copy

Verified: `go build ./...` and `go test ./...` both exit 0.

`pickPreferred` returned `solana ?? reqs[0]`, but under noUncheckedIndexedAccess `reqs[0]` is `X402Requirement | undefined`. Callers always guard with `length > 0`, so this was safe at runtime, but the type lied: a bug at the call site would silently yield undefined instead of failing fast.

- Replace the nullish-coalesce with an explicit if/throw flow
- Empty input now throws with a clear message instead of silently bypassing the type contract

Verified:
  src/detect/x402.ts(108) TS2322 gone.
  SDK tests 116 pass / 7 fail (same 7 pre-existing session-fixture failures, audit #9).
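A minimal sketch of the if/throw flow described above. The `X402Requirement` fields and the "prefer Solana" predicate are assumptions for illustration; only the function name and the empty-input behavior come from the commit message.

```typescript
// Hypothetical shape: the real X402Requirement likely carries more fields.
interface X402Requirement {
  network: string;
  amount: string;
}

// Prefer a Solana requirement, else fall back to the first entry.
// Under noUncheckedIndexedAccess, reqs[0] is `X402Requirement | undefined`,
// so the fallback is narrowed explicitly instead of via `solana ?? reqs[0]`.
function pickPreferred(reqs: readonly X402Requirement[]): X402Requirement {
  // Assumption: Solana networks are identified by a "solana" prefix.
  const solana = reqs.find((r) => r.network.startsWith("solana"));
  if (solana !== undefined) return solana;

  const first = reqs[0];
  if (first === undefined) {
    // Fail fast: an empty list is a caller bug, not a valid input.
    throw new Error("pickPreferred: requirement list is empty");
  }
  return first;
}
```

With this shape the return type is honest: callers get an `X402Requirement` or an exception, never a silent `undefined`.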

@solana/mpp 0.5.x removed `solana.session()` — session-based pay-as-you-go (deposit + TTL + auto-topup) is now Tempo-only upstream. The Solana side ships only `solana()` / `solana.charge()` for per-request payment, and Mppx.create(...).close was removed (no session lifecycle to tear down in per-charge mode). - Replace `mppClient.solana.session({ signer, autoOpen, autoTopup, sessionDefaults: { suggestedDeposit, ttlSeconds }})` with the supported `mppClient.solana({ signer })` per-charge form - Drop the unused `mppx.close?.bind(mppx)` — no longer on Mppx 0.5.x - Mark `maxDepositUsd`, `ttlSeconds`, `autoTopup` params as intentionally unused (`_`-prefixed) — they're enforced by the outer Rhemify session() wrapper's governance, not by MPP itself - Add TODO(tempo) anchoring future work to register tempo.session() alongside solana() once RhemifyConfig.wallet gains a tempoAccount Behavior change: each governed fetch is now its own Solana tx (per-charge) instead of a single batched session settlement. The Rhemify-level session() wrapper continues to enforce maxDepositUsd cap, TTL, cumulative spend, and trace emission — those guarantees come from the wrapper, not MPP. Verified: packages/sdk: bunx tsc --noEmit exit 0 (was: 2 TS2339 errors) packages/sdk: bun test 116 pass / 7 fail (same 7 pre-existing session-fixture failures from audit #9, zero new regressions) Note on Done definition: no live MPP-protected endpoint exists in this repo, so true e2e proof (real 402 challenge → solana.charge → settled tx) is pending live integration. Type check + unit suite confirm the API adaptation is mechanically correct.
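The split of responsibilities described above (MPP charges per request, the Rhemify wrapper enforces the cap and TTL) can be sketched roughly as follows. This is an illustrative model only: `GovernedSession`, `authorize`, and the injected clock are hypothetical names, not the SDK's actual wrapper, and trace emission is omitted.

```typescript
interface GovernanceLimits {
  maxDepositUsd: number; // ceiling on cumulative spend for the session
  ttlSeconds: number; // session lifetime, measured from creation
}

// Hypothetical wrapper: call authorize() before each per-charge payment.
class GovernedSession {
  private spentUsd = 0;
  private readonly expiresAtMs: number;
  private readonly limits: GovernanceLimits;
  private readonly now: () => number;

  constructor(limits: GovernanceLimits, now: () => number = Date.now) {
    this.limits = limits;
    this.now = now;
    this.expiresAtMs = now() + limits.ttlSeconds * 1000;
  }

  // Checks TTL and cumulative cap, records the spend, returns the new total.
  authorize(amountUsd: number): number {
    if (this.now() >= this.expiresAtMs) {
      throw new Error("session expired (TTL reached)");
    }
    if (this.spentUsd + amountUsd > this.limits.maxDepositUsd) {
      throw new Error("charge would exceed maxDepositUsd cap");
    }
    this.spentUsd += amountUsd;
    return this.spentUsd;
  }
}
```

In this model each authorized call would then go out as its own per-charge payment, which matches the behavior change noted above: the guarantees live in the wrapper, not in the payment library.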

The session governance suite passed `"fake-solana-key"` as `config.wallet.solanaPrivateKey`. With @solana/mpp installed in node_modules, openMppSession takes the live path (not the test fallback) and calls decodeSolanaKey, which threw "Invalid Solana private key format. Expected JSON array, base64, or hex." All 7 tests in `session() governance wrapper` failed for this single reason.

Generate a real ed25519 keypair via @solana/web3.js once at module load and reuse it across all tests. The key passes both decodeSolanaKey (JSON array length 64) and @solana/kit createKeyPairFromBytes (real ed25519 bytes).

Verified:
  packages/sdk: bun test 123 pass / 0 fail (was: 116 pass / 7 fail)

Closes three audit findings in the Anchor program suite: 1. write_daily_root squat (rhemify-anchor) Anyone could write a daily merkle root for any (fleet_id, date) tuple and become its recorded `authority`, frontrunning legitimate fleet operators and corrupting the canonical anchor record. 2. initialize_fleet_vault race-init (rhemify-dwallet) Anyone could call initialize_fleet_vault for any fleet_id first, become the vault's `authority`, set their own `co_signer`, and use that co_signer to approve withdrawals via approve_signing — a full takeover of the agent's funds path. 3. daily_cap stored but never enforced (rhemify-dwallet) FleetVault.daily_cap was written at init but approve_signing only checked the per-agent daily_limit. With multiple agents each at their max-per-tx, the fleet aggregate could exceed the intended ceiling. Fix is one consistent design: user-scoped PDAs across all fleet-derived accounts. Adversaries can still create their own fleets, but their PDAs derive at different addresses than legit users' — the namespace squat attacks are no longer possible. Seed changes (8 sites across 6 instruction files): FleetVault: [b"fleet-vault", fleet_id] -> [b"fleet-vault", authority.key().as_ref(), fleet_id] AgentWallet: [b"agent-wallet", fleet_id, agent_key] -> [b"agent-wallet", authority.key().as_ref(), fleet_id, agent_key] DailyRoot: [b"rhemify-daily", fleet_id, date] -> [b"rhemify-daily", authority.key().as_ref(), fleet_id, date] approve_signing reorders accounts so fleet_vault is declared first, then references fleet_vault.authority.as_ref() in agent_wallet seeds (no `authority` signer in this ix — co_signer signs). State + logic changes: FleetVault gains daily_spent: u64 + last_reset_day: i64 (+16 bytes INIT_SPACE). approve_signing now takes fleet_vault as &mut, mirrors the agent-wallet daily-reset block, and checks/updates fleet daily_spent. New error variant: ExceedsFleetDailyCap. Migration: FleetVault layout grows 16 bytes. 
Pre-launch, so there is no production state. Existing devnet accounts under the old (unfixed) program IDs are not migrated; new program IDs are assigned to the fresh deploys (declare_id! + Anchor.toml updated).

Verified end-to-end on devnet:
  cargo check (rhemify-anchor): exit 0 (6 pre-existing cfg warnings)
  cargo check (rhemify-dwallet): exit 0 (9 pre-existing cfg warnings)
  cargo build-sbf (rhemify-anchor): produced 150,728-byte .so
  cargo build-sbf (rhemify-dwallet): produced 211,648-byte .so

rhemify_anchor deployed to devnet:
  Program ID: HYWjBbLMEz98KnppVkUnHmkUZ4pyQ8abaDRTtUedUkxV
  Deploy tx: 37CJCxvEdqGwn9W3caf6HZNJku83D8EjHF5EfM1Yg5HLgqKMhzYcgpDcNsz3C47hXTPwujqGSrWePHfqmdECSFFr
  Slot: 461436925
  Explorer: https://explorer.solana.com/tx/37CJCxvEdqGwn9W3caf6HZNJku83D8EjHF5EfM1Yg5HLgqKMhzYcgpDcNsz3C47hXTPwujqGSrWePHfqmdECSFFr?cluster=devnet

rhemify_dwallet deployed to devnet:
  Program ID: GPgdzfwQ4qG1QcqePY3uR6Uo8SvCwqxRYg7oDsXd5opc
  Deploy tx: 4fGSJAftgdAZnjt5viYPLcU2jgQDCTaAKNNrrE8eityQxcaPHNZ13bicfK6UVe22w8AMVy6oXWDZ5J8KZhnMG58h
  Slot: 461436946
  Explorer: https://explorer.solana.com/tx/4fGSJAftgdAZnjt5viYPLcU2jgQDCTaAKNNrrE8eityQxcaPHNZ13bicfK6UVe22w8AMVy6oXWDZ5J8KZhnMG58h?cluster=devnet

Both programs verified live via `solana program show`: owned by BPFLoaderUpgradeable, authority 8usJ1ShvoR3e74E6WMaNk2owwGUf87MuCuBJHdPgdEnQ.

Follow-up (not in this commit): add Mollusk negative-path tests asserting that the access-control denial flow rejects a wrong-authority signer at the seeds-derivation step.

Real instruction-level proof for Phase C (commit 149c077). The deploy proved the bytecode is live; this proves the new user-scoped seeds and the migrated FleetVault layout work end-to-end on devnet. Hand-encodes the Anchor instruction discriminator (sha256 prefix) and borsh-encoded args — no IDL needed, since cargo-build-sbf doesn't ship IDL generation and Anchor CLI 1.0.0 wouldn't install on this machine (LLVM bitcode mismatch with Homebrew rustc). Verified: bun run smoke → vault account created on-chain Authority: 8usJ1ShvoR3e74E6WMaNk2owwGUf87MuCuBJHdPgdEnQ Fleet ID: e2e-1778433401599 Vault PDA: CKLZaGoayjXwNX5rhqZLyfjxgrJoPUcRfUctT84sGGQ9 Vault size: 210 bytes (= old 194 + 16 from daily_spent + last_reset_day) confirms Phase C state migration is live in bytecode Tx: 7kRHx9iXgGnzzwbVSKEkFppzDkpBXD3cg2FwGVhL74pPtWcgDFN7RvDoL8xUMLkWPStd9FALc4Qgwvjy63VtyTF Explorer: https://explorer.solana.com/tx/7kRHx9iXgGnzzwbVSKEkFppzDkpBXD3cg2FwGVhL74pPtWcgDFN7RvDoL8xUMLkWPStd9FALc4Qgwvjy63VtyTF?cluster=devnet Cost: 0.002357 SOL (tx fee + rent for the new vault account) To re-run (each invocation creates a fresh vault under a unique fleet_id): cd tools/devnet-smoke && bun run smoke
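The hand-encoding step above can be sketched as follows, assuming Anchor's usual convention that an instruction discriminator is the first 8 bytes of sha256("global:<instruction_name>") and that Borsh encodes strings as a little-endian u32 length followed by UTF-8 bytes. The argument list `(fleet_id: String, daily_cap: u64)` is a hypothetical stand-in; the real instruction may take different or additional arguments.

```typescript
import { createHash } from "node:crypto";

// First 8 bytes of sha256("global:<ix_name>"), per Anchor's convention.
function anchorDiscriminator(ixName: string): Uint8Array {
  const digest = createHash("sha256").update(`global:${ixName}`).digest();
  return new Uint8Array(digest.subarray(0, 8));
}

// Borsh string: u32 little-endian byte length, then the UTF-8 bytes.
function borshString(value: string): Uint8Array {
  const utf8 = new TextEncoder().encode(value);
  const out = new Uint8Array(4 + utf8.length);
  new DataView(out.buffer).setUint32(0, utf8.length, true);
  out.set(utf8, 4);
  return out;
}

// Borsh u64: 8 bytes little-endian. Takes a JS number, so it only covers
// values up to Number.MAX_SAFE_INTEGER (a simplification for this sketch).
function borshU64(value: number): Uint8Array {
  if (!Number.isSafeInteger(value) || value < 0) {
    throw new RangeError("borshU64: expected a non-negative safe integer");
  }
  const out = new Uint8Array(8);
  const view = new DataView(out.buffer);
  view.setUint32(0, value % 4294967296, true); // low 32 bits
  view.setUint32(4, Math.floor(value / 4294967296), true); // high 32 bits
  return out;
}

function concatBytes(parts: Uint8Array[]): Uint8Array {
  const total = parts.reduce((n, p) => n + p.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const p of parts) {
    out.set(p, offset);
    offset += p.length;
  }
  return out;
}

// Hypothetical argument layout; adjust to the program's actual signature.
function encodeInitializeFleetVault(fleetId: string, dailyCap: number): Uint8Array {
  return concatBytes([
    anchorDiscriminator("initialize_fleet_vault"),
    borshString(fleetId),
    borshU64(dailyCap),
  ]);
}
```

The resulting bytes would go into the `data` field of a transaction instruction; account metas and PDA derivation are out of scope for this sketch.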

Two parties (legit + attacker, distinct keypairs) both call initialize_fleet_vault with the SAME fleet_id. After Phase C the seeds are `[b"fleet-vault", authority.key(), fleet_id]`, so the two writes land at DIFFERENT PDAs and both succeed independently. Under the old `[b"fleet-vault", fleet_id]` seeds these would have collided at one address and the second caller would have failed with "account already in use" — the squat attack closed in Phase C is now structurally impossible. Verified on devnet: Shared fleet_id: squat-1778433715564 Legit authority: 8usJ1ShvoR3e74E6WMaNk2owwGUf87MuCuBJHdPgdEnQ Legit vault PDA: Aqya6CAamPnZBnkHpXQ934MAMp1BaSfEidVJq41TdHnj (bump 255, 210 B) Legit init tx: 4UGtgCLvHABdjSgizm75GerVVKTyzzX8jZc3odDizNZjx34oZbJKVGx2SYgk57PmduAyZUrriqsrneJDfCnsLSsu Explorer: https://explorer.solana.com/tx/4UGtgCLvHABdjSgizm75GerVVKTyzzX8jZc3odDizNZjx34oZbJKVGx2SYgk57PmduAyZUrriqsrneJDfCnsLSsu?cluster=devnet Attacker authority: i1S2Q9m1sEaPmDBxh3hCZBfXwrdvMpMKxxdtJRnvdtb (fresh keypair, funded with 0.01 SOL) Attacker vault PDA: SUjThvQS9u89aYR33vjZdkbmJ3THeD7R236U9YCUzVG (bump 255, 210 B) Attacker init tx: 3hd6t1CcQKiwPFYYcgnykvRPyRa2hpPYMEHHRa6EmrrG4ShF3ywph8zeZCGpHvLJi4L2st9YWreKg34gy6cTTHgf Explorer: https://explorer.solana.com/tx/3hd6t1CcQKiwPFYYcgnykvRPyRa2hpPYMEHHRa6EmrrG4ShF3ywph8zeZCGpHvLJi4L2st9YWreKg34gy6cTTHgf?cluster=devnet Old (pre-Phase-C) collision PDA: 3LD76kMfKscCZfShiRtivofjGTwwrAn82SDZYgkeVGhu (where both parties would have collided under `[b"fleet-vault", fleet_id]`) The script reads ~/.config/solana/id.json for the legit user, generates a fresh attacker keypair each run, funds it from legit, and uses a timestamped fleet_id so consecutive runs don't collide. Re-run: cd tools/devnet-smoke && bun run squat

Pre-Phase-C, FleetVault.daily_cap was set at init but approve_signing only checked the per-agent daily_limit — the field was dead code. After Phase C the field is load-bearing. This script proves it actively on devnet. Setup: vault.daily_cap=10000, agent.max_per_tx=20000, agent.daily_limit=100000 (agent limits intentionally loose so we don't trip ExceedsDailyLimit before reaching the fleet check). Steps: 1. init vault (legit user signs, co_signer = controlled keypair) 2. register agent (legit user signs) 3. fund co_signer 0.05 SOL 4. approve_signing(amount=8000) → must SUCCEED, vault.daily_spent=8000 5. approve_signing(amount=5000) → must FAIL: 8000+5000=13000 > 10000 with error ExceedsFleetDailyCap The script asserts the failure logs contain "ExceedsFleetDailyCap" — a generic transaction failure (wrong error code) is rejected as a false positive. Verified on devnet: Authority: 8usJ1ShvoR3e74E6WMaNk2owwGUf87MuCuBJHdPgdEnQ Fleet: cap-1778433977458 Vault PDA: 3AkhmRNWHQdD9r8LexCxEAqA5qkL9bbXPdcRrPFEm33y Agent PDA: 3GsVzpgkAoyAudKsgCWHQbN6M8CeBVgbRSGqYfdr1stM init vault tx: 5PUnVNgkHWE3KTZJW54iAQbwC8a1UVuw8ykxAaqXWtG8kMK8i5T9BW7tWkvKBEs1qo4naVGNcgRm2RM92oGe8kUU register agent tx: 35Cf7n7uWFuPJSTenqA6S6jJMZKPNW2XXJCbAW7RRkYWUR8ABBGzYXpBaJDmjbM7KTMr1Kj4ombcq2CJgihWH6Z3 approve #1 (8000): 5Ja7pqGNPz6B5t9HXEkfYKSGNpz5EKD4cCDJ15NZcydJZ8GVq5tNmYzxSJvWKCvPgDjGJD7dLgUN9KYZAxmwJA9j approve #2 (5000): rejected with ExceedsFleetDailyCap (expected) Explorer: init: https://explorer.solana.com/tx/5PUnVNgkHWE3KTZJW54iAQbwC8a1UVuw8ykxAaqXWtG8kMK8i5T9BW7tWkvKBEs1qo4naVGNcgRm2RM92oGe8kUU?cluster=devnet agent: https://explorer.solana.com/tx/35Cf7n7uWFuPJSTenqA6S6jJMZKPNW2XXJCbAW7RRkYWUR8ABBGzYXpBaJDmjbM7KTMr1Kj4ombcq2CJgihWH6Z3?cluster=devnet approve#1:https://explorer.solana.com/tx/5Ja7pqGNPz6B5t9HXEkfYKSGNpz5EKD4cCDJ15NZcydJZ8GVq5tNmYzxSJvWKCvPgDjGJD7dLgUN9KYZAxmwJA9j?cluster=devnet Re-run: cd tools/devnet-smoke && bun run daily-cap
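The check this script exercises can be modeled off-chain in a few lines. This is a TypeScript model of the logic, not the Rust program: the field names mirror `daily_cap`, `daily_spent`, and `last_reset_day` from the commit, while the day computation (unix timestamp divided by 86400) is an assumption about how the reset block works.

```typescript
interface FleetVault {
  dailyCap: number;
  dailySpent: number;
  lastResetDay: number;
}

class ExceedsFleetDailyCap extends Error {
  constructor(attempted: number, cap: number) {
    super(`ExceedsFleetDailyCap: ${attempted} > ${cap}`);
    // Keeps instanceof working when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = "ExceedsFleetDailyCap";
  }
}

const SECONDS_PER_DAY = 86400;

// Model of the fleet-level part of approve_signing: reset the counter on a
// new day, then reject if the aggregate would exceed the fleet cap.
function approveSigning(vault: FleetVault, amount: number, unixTs: number): void {
  const today = Math.floor(unixTs / SECONDS_PER_DAY); // assumed day boundary
  if (today > vault.lastResetDay) {
    vault.dailySpent = 0;
    vault.lastResetDay = today;
  }
  const next = vault.dailySpent + amount;
  if (next > vault.dailyCap) {
    throw new ExceedsFleetDailyCap(next, vault.dailyCap);
  }
  vault.dailySpent = next; // only updated when the check passes
}
```

Running the same sequence as the devnet script (cap 10000, approve 8000, then 5000) against this model passes the first call and rejects the second, since 8000 + 5000 = 13000 exceeds 10000.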

Reverts the route kill-switch from commit 432b2f6 ("feat: gate demo behind UUID, block all other routes"). Restores the original layouts: - _onboarding: theme-onboarding wrapper, header with Rhemify wordmark + ProgressBar (4-step), Outlet for /signup, /build, /fund, /deploy - dashboard: dark theme wrapper, Sidebar + Topbar, Outlet for nested routes (overview, policies, wallets, approvals, agent detail) - login: SignInForm / SignUpForm toggle Per audit #3 the kill-switch was a deliberate product gate ("Coming soon" on /, demo only at the UUID URL). Re-enabling because Solana Foundation submission demos the full stack: onboarding flow → fleet creation → dashboard → operational views. Verified live (HTTP responses from `bun run dev:web`, port 3001): / HTTP 200 71468 bytes (marketing) /signup HTTP 200 7002 bytes (theme-onboarding markup confirmed) /build HTTP 200 10581 bytes /fund HTTP 200 8010 bytes /deploy HTTP 200 8040 bytes /dashboard HTTP 200 9661 bytes (dark theme, Sidebar + Topbar markup confirmed) /login HTTP 200 4221 bytes (SignInForm / SignUpForm) Server boot required apps/web/.env (gitignored) with VITE_CONVEX_URL, VITE_CONVEX_SITE_URL, CONVEX_URL, CONVEX_SITE_URL, CORS_ORIGIN — placeholder URLs are sufficient for SSR shell, real values needed for Convex queries to return data. Documented in apps/server/.env.example pattern. Pre-existing TS warnings in apps/web (sidebar.tsx:59 path-type narrowing, mock-wallet-service.ts:1 + wallet-service.ts:1 unused Chain import) are unrelated to this commit and existed under the kill-switch — they were just hidden because dashboard.tsx never rendered. Out of scope here.

Per ADR-002, pin @ika.xyz/sdk@0.3.1 + @mysten/sui ^2.5.0 (was "latest" on both, which resolved to 0.4.x and broke against the code that was written for an older 0.2.x API). 11 TS errors cleared by adapting each call site to grounded 0.3.1 signatures (read from the installed .d.ts files, not memory). Adaptations made: 1. SuiJsonRpcClient({ url }) → SuiJsonRpcClient({ url, network }) `SuiJsonRpcClientOptions` requires both fields in 2.16. 2. UserShareEncryptionKeys.fromRootSeed(client, seed) → UserShareEncryptionKeys.fromRootSeedKey(seedBytes, curve) No longer takes IkaClient. Uses decodeSuiPrivateKey to peel the suiprivkey1... bech32 string into raw bytes. 3. userShareEncryptionKeys.prepareDKGRequestInput(client, curve) → prepareDKG(protocolPublicParameters, curve, encryptionKey, bytesToHash, senderAddress) — free function from @ika.xyz/sdk/cryptography. Sources protocol params via ikaClient.getProtocolPublicParameters(undefined, curve) and senderAddress via keypair.toSuiAddress(). 4. ikaClient.getActiveEncryptionKey() → getActiveEncryptionKey(address). 5. ikaTx.createRandomSessionIdentifier() → registerSessionIdentifier( sessionBytes) — using the same 32-byte bytesToHash consumed by prepareDKG so the on-chain session id matches the proof binding. 6. ikaTx.requestPresign({ ..., signatureAlgorithm: ECDSASecp256k1 }) The signatureAlgorithm field is now required. 7. ikaClient.getSign(signId) → getSign(signId, curve, signatureAlgorithm) Three-arg form, defaults match the createPresign flow. 8. SignatureAlgorithm.Ecdsa → SignatureAlgorithm.ECDSASecp256k1. 9. Hono body type narrowing fix in /dkg handler — explicit typed `let body: { curve?: string }` so `.catch(() => ({}))` doesn't widen to `{}` (causing TS2339 on `body.curve`). 
Honest scope on IkaService.sign(): The signing flow has structural API changes I cannot ground without live Ika test network access (encrypted-share id lookup, dWallet type narrowing to ZeroTrustDWallet | SharedDWallet, requestSign signature shape, hashScheme valid-for-algorithm constraint). Per CLAUDE.md "Every function must do real work or throw NotImplementedError with a TODO" — sign() throws explicitly with a TODO checklist. /sign endpoint returns 500 instead of pretending to sign. Verified: apps/ika-sidecar: bunx tsc --noEmit exit 0 (was: 11 TS errors) bun run src/index.ts boots clean curl :3010/health HTTP 200 {"status":"ok","initialized":false, "network":"testnet"} curl :3010/dwallet/abc HTTP 401 (no auth) curl :3010/dwallet/abc -H Bearer HTTP 503 (service not initialized) Not verified (requires Sui keypair + Ika test network): /dkg — patched code path is grounded in d.ts, not run e2e /presign — same /sign — explicitly throws NotImplementedError per scope /signature/:id — patched (3-arg getSign), not run e2e Tests pass but I have not run /dkg, /presign, /sign end-to-end against a live Ika network. Not done by strict definition for those endpoints. The compile + boot + auth gate IS verified.

Audit #7. The schema declared enum fields with `v.string()` and inline comments listing the allowed values, leaving Convex's runtime validation loose: any string passed by clients (including malformed/untrusted input) would land in the table with no rejection until a downstream consumer choked on it. Tightened 16 enum fields across 13 tables: fleets.role, agents.status, agents.primary_standard, transactions.standard, transactions.status, payment_events.standard, payment_events.outcome, payment_traces.confidence, bridge_executions.protocol, bridge_executions.status, policy_decisions.decision, policy_decisions.standard, task_attributions.outcome, intelligence_actions.action_type, intelligence_actions.outcome, anchor_batches.status Pattern: extracted reusable validators as `export const`s at the top of schema.ts (PaymentStandard, FleetRole, AgentStatus, TransactionStatus, PaymentOutcome, Confidence, BridgeProtocol, BridgeStatus, PolicyDecision, TaskOutcome, IntelligenceActionType, IntelligenceOutcome, AnchorBatchStatus, SigningRequestStatus, DWalletType, DWalletStatus). defineTable references the consts so the table type and the args validator type stay in sync. Tightening surfaced 13 latent type-safety violations in the mutation handlers themselves — every `args: { foo: v.string() }` that fed into `db.insert/patch` of a now-narrowed field. Patched each at the args validator boundary (not by casting at the insert site) so: 1. Convex rejects bad enum values at the API edge rather than the DB write — clients get a clear validation error immediately. 2. The literal-union type propagates through args.foo → ctx.db.insert, so future regressions can't silently re-widen. 
Patched files (8 mutations across 8 files): agents.ts: DEFAULT_STANDARDS Record narrowed; setStatus.args.status anchors.ts: upsertBatch.args.status events.ts: insert.args.{standard, outcome} fleets.ts: create.args.role + update.args.role intelligence.ts: listActions.args.{action_type, outcome}; insertAction.args.action_type policies.ts: insertDecision.args.{decision, standard} traces.ts: insert.args.confidence transactions.ts: add.args.{standard, status} Verified: packages/backend: bunx tsc --noEmit exit code shows 0 errors in convex/ scope (was: 13 errors all in convex/*.ts). Remaining 946 errors are pre-existing drift in apps/web JSX flag + packages/ui JSX flag + tools/test-402 unused imports — orthogonal to this commit. NOT verified end-to-end on a live Convex deployment. The shared dev deployment `dev:quixotic-puma-190` is team-owned (per CLAUDE.local.md) and `bunx convex dev` would auto-push schema, affecting team data. Holding the schema diff in git; deploy lands when the team is ready to migrate. Tests pass but I have not run this against a live Convex deployment. Not done by strict definition. The compile-time proof IS verified.
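To illustrate why validating at the args boundary helps, here is a dependency-free model of literal-union validation. It is not Convex's API (the schema itself uses `v.union(v.literal(...))`); `parseLiteral` and `insertEvent` are hypothetical helpers, and the allowed-value lists are assumptions apart from "x402", "mpp", and "success", which appear elsewhere in this PR.

```typescript
// Assumed value sets; the real enums in schema.ts may differ.
const PAYMENT_STANDARDS = ["x402", "mpp"] as const;
const PAYMENT_OUTCOMES = ["success", "failed"] as const;

type PaymentStandard = (typeof PAYMENT_STANDARDS)[number];
type PaymentOutcome = (typeof PAYMENT_OUTCOMES)[number];

// Narrow an untrusted value to a literal union, or throw naming the path.
function parseLiteral<T extends string>(
  allowed: readonly T[],
  path: string,
  value: unknown,
): T {
  if (typeof value === "string" && (allowed as readonly string[]).indexOf(value) !== -1) {
    return value as T;
  }
  throw new Error(`Validation failed. Path: ${path}`);
}

interface PaymentEventRow {
  standard: PaymentStandard;
  outcome: PaymentOutcome;
}

// The handler only ever sees narrowed types, so a later change cannot
// silently re-widen `standard` back to `string`.
function insertEvent(args: { standard: unknown; outcome: unknown }): PaymentEventRow {
  return {
    standard: parseLiteral(PAYMENT_STANDARDS, ".standard", args.standard),
    outcome: parseLiteral(PAYMENT_OUTCOMES, ".outcome", args.outcome),
  };
}
```

Bad input is rejected before any write is attempted, and the error names the offending field, which is the same property the args validators give the real mutations.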

Audit #10. The SDK shipped detectors for L402, AP2, and ACP that recognize challenge headers, but no executors: `pay()` against any of those protocols would throw a generic `ExecutionError("No executor available for l402 on lightning")` that callers couldn't differentiate from a transient failure or a mis-configured wallet.

Closes the gap with four structural changes:

1. New error class `ProtocolNotImplementedError` (code `PROTOCOL_NOT_IMPLEMENTED`) carrying the detected `protocol` + `network`. UIs can `instanceof` it (or switch on the code) to render "this server uses L402, which Rhemify doesn't support yet" rather than a generic execution failure. The detection still succeeds, so the diagnostic path is preserved.
2. Stub executors in execute/unsupported-protocol.ts that own each of l402/ap2/acp:
   - `canExecute(detection)` returns `detection.protocol === <name>`
   - `execute()` throws `ProtocolNotImplementedError(protocol, network)`
   Registered LAST in the cascade so any future real executor takes precedence automatically.
3. Cascade short-circuits on `ProtocolNotImplementedError`: `executeWithCascade` re-throws it instead of swallowing it into the generic "all executors failed" path. No other executor is going to implement a protocol the SDK doesn't have.
4. New `SUPPORTED_PROTOCOLS = ["x402", "mpp"] as const` export + `SupportedProtocol` type alias so consumers can introspect which protocols are actually executable (was implicit before).

Verified:
  packages/sdk: bunx tsc --noEmit exit 0
  packages/sdk: bun test 131 pass / 0 fail (was 130/1)

- 8 new tests in unsupported-protocol.test.ts cover all three protocols: detection succeeds, execute throws the typed error, error fields populated correctly, message includes the "Currently executable: x402, MPP" hint.
- One existing test in execute.test.ts updated: it asserted `selectExecutor(l402)` returns null, but after this commit l402 has a stub.
Updated to use `protocol: "unknown"` (the genuinely unmatched case) — same intent, correct after Phase K. Replacement path: when a real L402/AP2/ACP executor lands, swap the matching `*UnsupportedExecutor` for the real implementation in execute/index.ts. The typed error path naturally goes away. No breaking change to the public API for that swap.
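A rough sketch of the typed error, the stub executors, and the cascade short-circuit described in this commit. The names `ProtocolNotImplementedError`, `PROTOCOL_NOT_IMPLEMENTED`, `SUPPORTED_PROTOCOLS`, `canExecute`, `execute`, and `executeWithCascade` come from the commit message; the `Detection` and `Executor` shapes, the synchronous `execute`, and `unsupportedExecutor` are simplifying assumptions.

```typescript
const SUPPORTED_PROTOCOLS = ["x402", "mpp"] as const;
type SupportedProtocol = (typeof SUPPORTED_PROTOCOLS)[number];

interface Detection {
  protocol: string;
  network: string;
}

interface Executor {
  name: string;
  canExecute(d: Detection): boolean;
  execute(d: Detection): string; // real executors are presumably async
}

class ProtocolNotImplementedError extends Error {
  readonly code = "PROTOCOL_NOT_IMPLEMENTED";
  readonly protocol: string;
  readonly network: string;

  constructor(protocol: string, network: string) {
    super(`${protocol} on ${network} is detected but not executable yet`);
    Object.setPrototypeOf(this, new.target.prototype); // keep instanceof reliable
    this.name = "ProtocolNotImplementedError";
    this.protocol = protocol;
    this.network = network;
  }
}

// Stub that claims one protocol and always throws the typed error.
function unsupportedExecutor(protocol: string): Executor {
  return {
    name: `${protocol}-unsupported`,
    canExecute: (d) => d.protocol === protocol,
    execute: (d) => {
      throw new ProtocolNotImplementedError(d.protocol, d.network);
    },
  };
}

function executeWithCascade(executors: readonly Executor[], d: Detection): string {
  const failures: string[] = [];
  for (const ex of executors) {
    if (!ex.canExecute(d)) continue;
    try {
      return ex.execute(d);
    } catch (err) {
      // Short-circuit: no later executor can implement a missing protocol.
      if (err instanceof ProtocolNotImplementedError) throw err;
      failures.push(`${ex.name}: ${String(err)}`);
    }
  }
  throw new Error(`All executors failed for ${d.protocol}: ${failures.join("; ")}`);
}
```

Because the stubs sit last in the list, dropping a real executor in front of one changes behavior without touching the cascade itself.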

Audit #10 also flagged: "cctp evaluator wins paths but execute/ has no cctp executor → cascade picks it then throws ExecutionError". Same pattern as L402/AP2/ACP — diagnostic surface promised something the execution layer can't deliver. Phase K added typed errors for protocol-level gaps. CCTP is at the instrument layer (cross-chain bridge to fund a payment), so the fix is at the path resolver: cctp.isAvailable now returns false with a documented rejectedReason ("CCTP executor not implemented — see TODO(cctp) in src/resolve/index.ts"). Cost / latency / risk estimates are kept intact so cost-comparison UIs still render the hypothetical CCTP price. Once a real CCTP executor lands in execute/ that can: (1) quote fast-transfer fees, (2) burn USDC on source chain, (3) mint USDC on destination, (4) submit the original protocol payment from the destination chain — restoring availability is one line (the legacy hasSolana/hasEvm check is preserved verbatim in the TODO comment). Verified: packages/sdk: bunx tsc --noEmit exit 0 packages/sdk: bun test 131 pass / 0 fail - Two existing CCTP tests in resolve.test.ts updated to assert the new "intentionally disabled" behavior instead of the old "available when wallets cross chains" behavior. Same coverage, correct assertion.

…icit stubs (audit #8) Audit #8 also flagged: four call sites used `return false && <legacy condition>` to keep the original logic visible while disabling the path. The pattern is correct semantically (always false) but reads as production logic — a future contributor seeing `false && wallet.x && detection.y` could miss that the entire branch is intentionally inert. Replaced with explicit `return false;` + a comment block that: - Clearly says STUB / not implemented - Preserves the legacy availability condition as a comment so the re-enable patch is one line Sites: - execute/agentcard-mpp.ts canExecute (audit #8 line 40) - execute/mpp-session.ts canExecute (audit #8 line 23) - resolve/index.ts privySolana isAvailable (audit #8 false && short-circuit) - resolve/index.ts squads isAvailable (audit #8 false && short-circuit) mpp-session note: Phase B rewrote openMppSession to call mppClient.solana() directly under @solana/mpp 0.5.x. The session executor stays registered for future re-introduction of session-flow MPP (e.g. via tempo.session() when RhemifyConfig.wallet gains a tempoAccount — Phase B.5). Verified: packages/sdk: bunx tsc --noEmit exit 0 packages/sdk: bun test 131 pass / 0 fail (no behavior change) Pure readability + cargo-cult cleanup. Zero functional impact: canExecute()/isAvailable() still return false at all four sites — the expression `false && X && Y` was already evaluating to false. New code makes that obvious instead of disguised as conditional logic.
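The before/after shape of this cleanup looks roughly like the following. The wallet and detection fields in the legacy condition are placeholders, since the actual conditions at the four sites are not shown in this message.

```typescript
// Placeholder shapes for illustration only.
interface WalletConfig {
  solanaPrivateKey?: string;
}
interface Detection {
  protocol: string;
}

// Before: always false, but reads like live conditional logic.
function canExecuteBefore(wallet: WalletConfig, detection: Detection): boolean {
  return false && wallet.solanaPrivateKey !== undefined && detection.protocol === "mpp";
}

// After: the stub is explicit, and the legacy condition survives as a comment.
function canExecuteAfter(_wallet: WalletConfig, _detection: Detection): boolean {
  // STUB: not implemented. To re-enable, replace the return with:
  //   return wallet.solanaPrivateKey !== undefined && detection.protocol === "mpp";
  return false;
}
```

Both functions return false for every input, which is why the test count is unchanged; the difference is only in what a reader infers from the code.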

Reproducibility tail of d603210 (fix(ika-sidecar): pin SDK to 0.3.1). The ika-sidecar package.json change `@ika.xyz/sdk: latest` → `@ika.xyz/sdk: 0.3.1` and `@mysten/sui: latest` → `@mysten/sui: ^2.5.0` gets reflected in the root lockfile so a fresh `bun install` resolves to the same versions Phase J was tested against.

Closes Phase I's strict-definition gap. Phase I (commit 7da8393) had static type-level proof (`bunx tsc --noEmit` exit 0, types flow through to db.insert), but no live runtime evidence that Convex actually rejects bad enum strings at the API boundary. This script runs against a local anonymous Convex deployment booted via `bunx convex dev` (no shared team state touched), exercises events.insert three ways: 1. standard="x402", outcome="success" → SUCCESS, real doc id 2. standard="bitcoin" → REJECTED at .standard 3. outcome="maybe" → REJECTED at .outcome Each rejection comes from Convex's runtime validator stack inspecting the v.union(v.literal(...)) Phase I introduced. Verified output (local anonymous Convex on http://127.0.0.1:3212): [1] events.insert with standard='x402', outcome='success' (expect: SUCCESS) inserted id: k973qbx3etces0zmpaxr9jh8m586e88j [2] events.insert with standard='bitcoin' (NOT in enum) (expect: REJECTION) rejected: Path: .standard [3] events.insert with outcome='maybe' (NOT in enum) (expect: REJECTION) rejected: Path: .outcome All assertions passed. Phase I enum validators are load-bearing at runtime. Replication: cd packages/backend bunx convex dev # one-time: choose "Start without an account" bun run scripts/enum-validation-test.ts

A terminal UI dashboard for Rhemos built on @opentui/react that connects to a local Convex deployment and renders fleet activity in three live panels: ┌─ Agents ──────────────────────────┬─ Intelligence Feed ─────────┐ │ CEO Agent running $1.64 │ recommend SUB-1: recurring │ │ Research Agent running $1.12 │ auto_flag SA-1: spend ano. │ │ Marketing running $0.11 │ auto_alert VH-2: latency │ │ Sales Agent running $1.42 │ auto_block RO-1: cheaper │ │ ... │ ... │ └───────────────────────────────────┴─────────────────────────────┘ ┌─ Live Transactions ─────────────────────────────────────────────┐ │ CEO Agent stripe.com mpp $0.21 completed │ │ Engineering Agent notion.so x402 $0.45 blocked │ │ Finance Agent perplexity.ai mpp $0.30 completed │ └─────────────────────────────────────────────────────────────────┘ Color-coded status badges: green = completed/running/applied/anchored, red = blocked/rejected/failed/frozen, yellow = pending/paused/dismissed. Architecture: apps/tui/ — new workspace package src/index.tsx — App + useConvexPoll + three panels src/convex-client.ts — ConvexHttpClient + row types scripts/seed.ts — calls convex/seed.ts:demo package.json — @opentui/core@^0.2.6, @opentui/react@^0.2.6 packages/backend/convex/ seed.ts — new public mutation `demo` that inserts 1 fleet + 6 agents + 30 transactions + 12 intelligence actions + 10 payment_events. Local-deployment only, idempotent on email "demo@rhemify.local". agents.ts — added listAll query for TUI transactions.ts — added listAll query for TUI Data flow: TUI polls Convex at 2Hz via ConvexHttpClient (Convex's reactive subscription transport assumes a browser; HTTP polling is the right shape from Node/Bun). Three queries run in parallel each tick: agents:listAll, transactions:listAll, intelligence:listActions. Render diffs through React reconciliation; @opentui/react handles the terminal repaint. 
Verified live (5-second boot against local convex @ 127.0.0.1:3212): cd packages/backend && bunx convex dev # one-time, choose anonymous cd apps/tui && bun install && bun run seed # populates demo data cd apps/tui && bun run start # renders dashboard Output captured all three panels rendering real seeded data with color-coded status badges. Header bar shows "convex: 127.0.0.1:3212 (live) · 0s ago" confirming the polling tick lands. Demo angle for Colosseum: takes the abstract architecture story (governed payments, intelligence engine, anchor batches) and makes it a visible terminal artifact instead of a marketing landing page. Submission video can record this TUI streaming demo activity while narrating the security + intelligence primitives we shipped.

Phase N.1. First chunk in the four-command decision-replay surface that exposes apps/server/internal/replay/ to operators. This chunk ships the browse-first command, `traces list`, which a CFO uses to find a trace_id before running `show`, `replay`, or `verify` in later chunks.

System view (informed by Tenderly / Stripe / Foundry / kubectl patterns, docs/hackathon-positioning.md, the existing replay engine + HTTP route at apps/server/internal/handler/replay.go):

  rhemify traces list                              ← this chunk (read-only Convex query)
  rhemify traces show <id>                         ← Phase N.2: pretty trace dump
  rhemify traces replay <id> --override key=value  ← Phase N.3: Tenderly-style overrides
  rhemify traces verify <id>                       ← Phase N.4: Merkle proof against Solana devnet anchor (the moat; nobody else has this)

What's in this commit:

1. `packages/backend/convex/seed.ts`: extended the demo mutation to insert payment_traces alongside payment_events. Without this, list returns empty. The replay_snapshot is shaped exactly the way apps/server/internal/replay/replay.go:64-75 expects (policy_state with daily_limit / max_per_transaction / domain_allowlist / allowed_standards / approval_threshold; vendor_registry_snapshot keyed by domain with is_blocked; agent_context with spend_today). Three deterministic scenario shapes are interleaved so demo replays produce predictable diffs: allowed-all-pass, domain-blocked, flagged-by-threshold.

2. `packages/backend/convex/traces.ts`: new `listAll` query that joins each trace to its payment_event (agent_id, vendor, amount, outcome) and computes a `decision` field ("allowed" | "blocked") from policy_rules_fired. Optional filters: limit (cap 100), agent_id, blocked_only. Mirrors the agents:listAll / transactions:listAll pattern introduced in Phase M for the TUI.

3. `packages/cli/src/commands/traces/list.ts`: new CLI command:

     rhemify traces list [--limit N] [--agent <id>] [--blocked-only] [--json] [--convex <url>]

   Reads from Convex directly via ConvexHttpClient (CQRS-style split: reads bypass the Go server, writes still go through it). Pretty terminal table by default with picocolors; --json for jq piping. A trailing hint points at the next chunk's commands so users discover the workflow.

4. `packages/cli/src/index.ts`: added `traces` dispatch with the resource-then-verb pattern (`rhemify traces list`, as in the Stripe CLI). Stubs `show`/`replay`/`verify` with a friendly "coming in Phase N.X" message so users know what's next instead of seeing a generic "unknown" error.

5. `packages/cli/src/config.ts`: added optional `convexUrl` to RhemifyConfig + `resolveConvexUrl(override?)` helper with priority explicit-arg > config > env CONVEX_URL > default http://127.0.0.1:3210.

6. `packages/cli/package.json`: added `convex@^1.34.1` dep so the CLI can construct ConvexHttpClient directly.

Verified end-to-end against the running local Convex deployment (anonymous-backend at http://127.0.0.1:3212):

  $ cd apps/tui && bun run seed --reseed
  { agents: 6, intelligence_actions: 12, payment_traces: 12, status: "seeded", transactions: 30 }

  $ cd packages/cli && CONVEX_URL=http://127.0.0.1:3212 \
      bun run dev traces list

  trace_id                     when              agent_id    vendor           std   amount    decision  outcome
  ───────────────────────────  ────────────────  ──────────  ───────────────  ────  ────────  ────────  ────────
  trc_seed_1778482712054_11    2026-05-11 14:58  j971h...    anthropic.com    x402  $0.03     allowed   success
  trc_seed_1778482712054_10    2026-05-11 14:58  j97ea...    stripe.com       x402  $0.25     allowed   success
  ... (10 more rows)
  trc_seed_1778482712054_0     2026-05-11 14:58  j973y...    openai.com       x402  $0.19     allowed   success

  12 rows.
  next: rhemify traces show <trace_id> · rhemify traces replay <trace_id> --override key=value

  $ bun run dev traces list --blocked-only
  → 3 rows (vercel.com x2, perplexity.ai x1)

  $ bun run dev traces list --limit 3
  → 3 rows

  $ bun run dev traces list --json --limit 2
  → valid JSON with all 12 enriched fields (_id, _creationTime, trace_id, agent_id, amount, decision, outcome, etc.)

Pre-existing TS errors in src/commands/onboard.ts and src/commands/pay.ts (missing @rhemify-monorepo/sdk types after dist staleness, plus unused imports) are not introduced by this commit and are not in scope for Phase N.1.

Next chunk (N.2): `rhemify traces show <trace_id>`, the full decision context with rule_results, snapshot summary, and anchor status. Same loop: investigate → brainstorm → plan → build → real e2e → commit.
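The URL priority described for `resolveConvexUrl` (explicit-arg > config > env CONVEX_URL > default) can be sketched as a pure function. This is an illustration only, not the code in `packages/cli/src/config.ts`: the extra `config` and `env` parameters and the `RhemifyConfigSketch` type are assumptions made so the example is self-contained and testable.

```typescript
// Hypothetical shape: only the optional `convexUrl` field comes from the commit message.
interface RhemifyConfigSketch {
  convexUrl?: string;
}

const DEFAULT_CONVEX_URL = "http://127.0.0.1:3210";

// Priority: explicit argument > config file > CONVEX_URL env var > default.
// `config` and `env` are passed in (an assumption) so the function has no hidden inputs.
function resolveConvexUrl(
  override: string | undefined,
  config: RhemifyConfigSketch,
  env: Record<string, string | undefined>,
): string {
  const candidates = [override, config.convexUrl, env["CONVEX_URL"]];
  for (const candidate of candidates) {
    // Treat empty or whitespace-only values as "not set" and fall through.
    if (candidate !== undefined && candidate.trim() !== "") {
      return candidate.trim();
    }
  }
  return DEFAULT_CONVEX_URL;
}
```

In the verification run above, the env-var tier is the one that takes effect: no flag or config value is given, so `CONVEX_URL=http://127.0.0.1:3212` wins over the 3210 default.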

Phase N.2. Second chunk in the four-command surface. Builds on N.1's `traces list` — operator copies a trace_id out of the list output and runs `show` to get the full decision context. This is the "why did agent-7 pay $0.44 to perplexity.ai at 06:58 UTC" view. Render is gh-pr-view-style multi-section so a CFO can read it top-to-bottom without scanning: TRACE identity + decision badge (green ALLOWED / red BLOCKED) EVENT agent, fleet, vendor, amount, outcome, trigger 402 POLICY the 6 rules fired with per-rule pass/block + thresholds (this is the WHY — the audit-grade answer the moat sells) PATH SELECTION which instrument was selected, alternatives scored SNAPSHOT captured policy + vendor + agent state at decision time (the data replay engine consumes; appears in N.3 overrides) VERIFIABILITY trace_hash + anchor status (Solana tx if anchored — N.4) NEXT pre-filled `traces replay` commands, ready to copy What's in this commit: 1. `packages/backend/convex/traces.ts` — new `getByTraceId` query that looks up by the human-readable `trace_id` field via the existing `by_trace_id` index, then joins payment_event. CLI consumers copy trace_id strings out of `list` output; they don't have Convex internal _ids. 2. `packages/cli/src/commands/traces/show.ts` — the 7-section renderer. ~280 lines. Color-coded rule icons (✓ green pass, ✗ red block, ! yellow flag, · dim skipped). Pre-fills next-step commands with the concrete trace_id + domain to make the replay flow discoverable. --json for jq piping, --convex for ad-hoc URL override. 3. `packages/cli/src/index.ts` — replaced the "coming in Phase N.2" stub with real dispatch to `tracesShow`. Updated traces help text. 
Verified end-to-end against local Convex (anonymous-backend at http://127.0.0.1:3212, seeded with 12 traces): $ rhemify traces show trc_seed_1778482712054_8 TRACE trace_id trc_seed_1778482712054_8 decision BLOCKED ← red badge at 2026-05-11 06:58:32 UTC confidence high EVENT agent j97ea6vwtr1tjj6v55swyvatkh86f1mj vendor perplexity.ai amount $0.4400 USDC on solana-devnet standard x402 outcome rejected agent context Research Agent called perplexity.ai ($0.44 x402) trigger 402 HTTP 402 from perplexity.ai: payment required POLICY 6 rules evaluated ✓ daily_limit pass threshold 50.00 actual 23.61 ✓ max_per_transaction pass threshold 5.00 actual 0.44 ✗ domain_allowlist BLOCK threshold allowlist actual perplexity.ai ✓ standard_allowlist pass threshold allowlist actual x402 ✓ vendor_blocked pass threshold not_blocked actual perplexity.ai ✓ approval_threshold pass threshold 10.00 actual 0.44 PATH SELECTION selected none reason domain blocked by policy alternatives • credit unavail no credit service configured • ows avail score 0.95, est $0.4410 • jupiter unavail USDC matches vendor SNAPSHOT captured state at decision time policy daily_limit=50 max_per_tx=5 approval=10 allowlist=5 domains standards=[x402,mpp] vendors 8 in registry agent ctx spend_today=$23.17 VERIFIABILITY trace hash sha256_seed_8_19e15d48df6 anchor status not anchored yet (Phase N.4 verify cmd will anchor + verify) NEXT Try a counterfactual: rhemify traces replay trc_seed_1778482712054_8 --override daily_limit=1 rhemify traces replay trc_seed_1778482712054_8 --override 'domain_allowlist=-perplexity.ai' Also verified: - Allowed trace (trc_seed_..._0, openai.com): green ALLOWED badge, all 6 rules pass with ✓, outcome success. - --json: valid JSON dump with all 6 rules + joined payment_event. - Missing trace_id: exits 1 with helpful "Browse available traces: rhemify traces list" message. 
Next chunk (N.3): `rhemify traces replay <id> --override key=value` — posts to the existing /api/traces/:id/replay endpoint with policy overrides, pretty-prints the original-vs-counterfactual diff. THE killer-demo chunk.
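As a rough sketch of the POLICY section's per-rule rendering, the snippet below maps each rule result to the icon listed above (✓ pass, ✗ block, ! flag, · skipped) and derives an overall decision. The `RuleRow` type, the column widths, and the "any block means blocked" rule are assumptions for illustration; the real renderer in `show.ts` also applies colors, which are omitted here.

```typescript
type RuleResult = "pass" | "block" | "flag" | "skipped";

// Hypothetical row shape; field names are illustrative, not the stored trace schema.
interface RuleRow {
  rule: string;
  result: RuleResult;
  threshold: string;
  actual: string;
}

const ICONS: Record<RuleResult, string> = {
  pass: "✓",
  block: "✗",
  flag: "!",
  skipped: "·",
};

// Render one line of the POLICY section (plain text, no color).
function formatRuleRow(row: RuleRow): string {
  const label = row.result === "block" ? "BLOCK" : row.result;
  return (
    `${ICONS[row.result]} ${row.rule.padEnd(22)}${label.padEnd(8)}` +
    `threshold ${row.threshold}  actual ${row.actual}`
  );
}

// Assumption: a trace counts as blocked if at least one rule blocked.
function decisionOf(rows: RuleRow[]): "allowed" | "blocked" {
  return rows.some((r) => r.result === "block") ? "blocked" : "allowed";
}
```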

Phase N.3. The headline command from docs/hackathon-positioning.md: "why did agent-7 pay $340 at 2am?" — answered by re-running the trace through the Go server's replay engine under counterfactual policy. Hybrid override flag UX — named flags for the common case, `--override KEY=VALUE` escape hatch for anything else: Scalar overrides --daily-limit N fleet daily spend cap --max-per-tx N per-transaction cap --approval-threshold N "flag for review" threshold Array overrides (repeatable) --add-domain D / --remove-domain D domain_allowlist add/remove --add-standard S / --remove-standard S allowed_standards add/remove Generic --override KEY=VALUE any policy_state field, comma → array, scalar = replace, "-prefix" = array remove Each flag transforms into the policy_overrides map the existing Go engine's replay.ApplyOverrides understands — same contract the spec documented, same shape Tenderly / Foundry CLIs use. Auth — /api/traces/:id/replay is in middleware.FleetAPIKeyAuth. CLI loads api_key by priority: 1. --api-key flag 2. RHEMIFY_FLEET_API_KEY env var 3. ~/.rhemify/config.json (post-onboard) 4. Local-dev fallback: query Convex for demo@rhemify.local's api_key What's in this commit: 1. `packages/backend/convex/seed.ts` — demo fleet now seeded with stable api_key "rhm_demo_local_fleet_key_2026" so the local-dev fallback can resolve it. Pre-Phase-N.3 fleets get the key backfilled on reseed. Not a production secret — local-deployment only. 2. `packages/cli/src/commands/traces/replay.ts` — ~340 lines. Flag parser, override transformer, api_key resolver, fetch POST, diff renderer. Sections: REPLAY (id), OVERRIDES APPLIED, VERDICT (original vs counterfactual with the dramatic ← arrow), RULE-BY- RULE table (every rule, both sides, CHANGED marker), DIFF SUMMARY. 3. `packages/cli/src/index.ts` — replaced the Phase N.2 "coming soon" stub with real dispatch. Updated traces help. 
Verified end-to-end against running Go server + local Convex: Go server: cd apps/server && CONVEX_URL=http://127.0.0.1:3212 \ go run ./cmd/server # listening on :8080, /api/health → 200 ==== DEMO 1 — blocked trace + add-domain → ALLOWED ==== $ rhemify traces replay trc_seed_1778482712054_8 \ --add-domain perplexity.ai REPLAY trc_seed_1778482712054_8 OVERRIDES APPLIED domain_allowlist [perplexity.ai] VERDICT original: BLOCKED counterfactual: ALLOWED ← would now be ALLOWED RULE-BY-RULE ✓ daily_limit pass → pass — ✓ max_per_transaction pass → pass — ✓ domain_allowlist BLOCK → pass CHANGED ✓ standard_allowlist pass → pass — ✓ vendor_blocked pass → pass — ✓ approval_threshold pass → pass — DIFF SUMMARY domain_allowlist BLOCK → pass Story: "If we'd allowed perplexity.ai, that $0.44 Research Agent payment would have gone through. The CFO can see the EXACT rule that changed and the EXACT counterfactual outcome." ==== DEMO 2 — allowed trace + tight daily_limit → BLOCKED ==== $ rhemify traces replay trc_seed_1778482712054_0 \ --daily-limit 0.10 REPLAY trc_seed_1778482712054_0 OVERRIDES APPLIED daily_limit 0.1 VERDICT original: ALLOWED counterfactual: BLOCKED ← would now be BLOCKED RULE-BY-RULE ✗ daily_limit pass → BLOCK CHANGED ✓ max_per_transaction pass → pass — ✓ domain_allowlist pass → pass — ✓ standard_allowlist pass → pass — ✓ vendor_blocked pass → pass — ✓ approval_threshold pass → pass — DIFF SUMMARY daily_limit pass → BLOCK Story: "If daily_limit had been 10 cents, that openai.com payment would have been blocked at the policy gate. Counterfactual analysis for policy tuning." 
Pipeline proven end-to-end: CLI flag parsing → policy_overrides JSON → Convex fleet api_key lookup (local-dev fallback) → Bearer auth header → Go server /api/traces/:id/replay (port 8080) → Go server queries traces:getForReplay from Convex → replay.Replay() pure function — real cryptographic re-evaluation → JSON response with original + replayed + diff → CLI pretty-render with color-coded badges + CHANGED markers This is the moat — `--json` plus the explorer link in N.4 makes it auditor-friendly. No competitor (Tenderly, Stripe, Foundry, Datadog) ships this combo: decision replay with policy overrides + cryptographic anchor proof. Next chunk (N.4): `rhemify traces verify <trace_id>` — Merkle proof against Solana devnet anchor PDA. The cryptographic proof that the ORIGINAL decision really happened (CFO showed an auditor "yes, here's the on-chain receipt").
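The generic `--override KEY=VALUE` rules (comma means array, scalar means replace, a leading "-" means array remove) can be sketched as a small parser. The `OverrideOp` type below is a hypothetical intermediate form, not the policy_overrides wire shape that replay.ApplyOverrides consumes, and treating a plain negative number as a scalar is an assumption made to keep values like `-5` unambiguous.

```typescript
// Hypothetical intermediate representation of one parsed override.
type OverrideOp =
  | { kind: "replace"; key: string; value: number | string }
  | { kind: "set_array"; key: string; values: string[] }
  | { kind: "remove"; key: string; values: string[] };

function parseOverride(arg: string): OverrideOp {
  const eq = arg.indexOf("=");
  if (eq <= 0) throw new Error(`expected KEY=VALUE, got "${arg}"`);
  const key = arg.slice(0, eq).trim();
  const raw = arg.slice(eq + 1).trim();
  if (raw === "") throw new Error(`empty value for "${key}"`);

  const isList = raw.includes(",");
  const parts = raw.split(",").map((s) => s.trim()).filter((s) => s !== "");

  // A single numeric value (including a negative number) is a scalar replace.
  if (!isList && Number.isFinite(Number(raw))) {
    return { kind: "replace", key, value: Number(raw) };
  }
  // "-prefix" on every element means "remove these from the array".
  if (parts.length > 0 && parts.every((p) => p.startsWith("-") && p.length > 1)) {
    return { kind: "remove", key, values: parts.map((p) => p.slice(1)) };
  }
  // A comma-separated value replaces the whole array.
  if (isList) {
    return { kind: "set_array", key, values: parts };
  }
  // Anything else is a string scalar replace.
  return { kind: "replace", key, value: raw };
}
```

Under these assumptions, the two counterfactuals suggested by `traces show` parse as a scalar replace (`daily_limit=1`) and an array remove (`domain_allowlist=-perplexity.ai`).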

…4 / THE moat) Phase N.4. The fourth and final chunk in the Decision Replay CLI surface. This is the command nobody else ships — anchors a trace's hash on Solana devnet via the deployed rhemify-anchor program (Phase C/E), then reads the PDA back to cryptographically prove the trace exists on-chain. The audit-grade differentiator. Tenderly simulates. Stripe shows events. Datadog traces. Foundry replays. Rhemos *proves* — an auditor can independently re-derive the leaf, query the on-chain PDA, and confirm the root committed at a known slot. No trust required. Flow: 1. Load trace from Convex via traces:getByTraceId 2. Compute leaf = sha256(trace.trace_hash) — deterministic 32 bytes 3. Derive PDA: [b"rhemify-daily", authority, fleet_id, date] (user-scoped seeds from Phase C, the same shape that Phase F proved structurally squat-resistant) 4. If PDA already exists: read on-chain root, compare, mark VERIFIED without submitting a new tx (idempotent — important for repeat audits) 5. If not: build write_daily_root instruction, sign with user's ~/.config/solana/id.json, submit, wait for confirmation, then read PDA back 6. Print VERIFIED with computed_root == on_chain_root, anchor tx, slot, and Solana Explorer link Implementation notes: - Lifted the Solana web3.js pattern from Phase E's tools/devnet-smoke/initialize-fleet-vault.ts: anchor discriminator = sha256("global:<ix_name>").slice(0,8); strings borsh-encoded with 4-byte LE length prefix; u32 LE; raw 32-byte [u8; 32] for merkle_root. - The on-chain DailyRoot account is parsed by walking the variable-length Borsh layout (not the fixed InitSpace alloc): 8-byte discriminator, then fleet_id len+utf8, date len+utf8, merkle_root[32], etc. - Single-leaf "batch" semantics for now — leaf hash IS the Merkle root for one trace. Production batching (multi-trace daily roots with real Merkle paths) is the Go server's BatchManager cron's job; the CLI demonstrates the anchor primitive for one trace at a time. 
- No Go server needed for this command — talks directly to Solana devnet RPC + Convex for the trace lookup. What's in this commit: 1. `packages/cli/src/commands/traces/verify.ts` — ~280 lines. Solana web3.js + node:crypto sha256 + node:fs for keypair. Idempotent anchor + verify in one command. 2. `packages/cli/src/index.ts` — replaced the Phase N.4 "coming soon" stub with real dispatch. Updated traces help so all four verbs are now live. Verified end-to-end on Solana devnet (initial anchor, then idempotency): $ rhemify traces verify trc_seed_1778482712054_0 anchoring trace trc_seed_1778482712054_0 to devnet (~0.001 SOL fee)... VERIFY trc_seed_1778482712054_0 VERIFIED trace hash matches on-chain Merkle root ON-CHAIN program HYWjBbLMEz98KnppVkUnHmkUZ4pyQ8abaDRTtUedUkxV PDA 84qxhcQ9XTqDNeNkVbS6vW4PMwccaXr2LmdpeuwhuXgR bump 254 fleet_id jx78f22hchxpxr59y74fbk2eex86e4a3 date 2026-05-11 anchor tx 3sN7mowb3kWiSbxejnZnVdq3Kc2ZPiAhR7EN4j9iuc6Cw9pHEEr6idNRBRetXJ7wJGQ62Uu8CKx2ftGRTwwWxM3T slot 461573216 status freshly anchored in this run HASH CHAIN computed root 85104344aa1d7684f8efe19b698b37877e176de8037ca4e5bd51d55367be2b97 on-chain root 85104344aa1d7684f8efe19b698b37877e176de8037ca4e5bd51d55367be2b97 match ✓ identical EXPLORER PDA https://explorer.solana.com/address/84qxhcQ9XTqDNeNkVbS6vW4PMwccaXr2LmdpeuwhuXgR?cluster=devnet tx https://explorer.solana.com/tx/3sN7mowb3kWiSbxejnZnVdq3Kc2ZPiAhR7EN4j9iuc6Cw9pHEEr6idNRBRetXJ7wJGQ62Uu8CKx2ftGRTwwWxM3T?cluster=devnet Audit-grade proof: an auditor can independently re-derive the leaf, query the PDA at 84qxhcQ9XTqDNeNkVbS6vW4PMwccaXr2LmdpeuwhuXgR, and confirm the root committed at slot 461573216. Second run (idempotency check — same trace_id, no new tx): $ rhemify traces verify trc_seed_1778482712054_0 (no "anchoring..." message — went straight to read+verify) VERIFY trc_seed_1778482712054_0 VERIFIED trace hash matches on-chain Merkle root ON-CHAIN ... 
(same PDA) status already anchored — verified without writing a new tx HASH CHAIN computed root 85104344aa1d7684f8efe19b698b37877e176de8037ca4e5bd51d55367be2b97 on-chain root 85104344aa1d7684f8efe19b698b37877e176de8037ca4e5bd51d55367be2b97 match ✓ identical This completes the four-command Decision Replay CLI: rhemify traces list ✅ Phase N.1 — browse rhemify traces show <id> ✅ Phase N.2 — full decision context rhemify traces replay <id> ✅ Phase N.3 — counterfactual diff rhemify traces verify <id> ✅ Phase N.4 — on-chain anchor proof ← THIS End-to-end killer-demo flow now works: $ rhemify traces list # find a trace $ rhemify traces show trc_xxx # read why it was decided $ rhemify traces replay trc_xxx \ # what-if policy override --add-domain perplexity.ai $ rhemify traces verify trc_xxx # cryptographically prove on Solana Submission-ready for Colosseum Frontier per docs/hackathon-positioning.md's "Decision trace replay — 'why did agent-7 pay $340 at 2am?'" enterprise demo moment.
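The encoding primitives named in the implementation notes (leaf = sha256(trace_hash), the 8-byte Anchor discriminator, and Borsh strings with a 4-byte LE length prefix) can be sketched with node:crypto alone. This is a minimal illustration, not the code in `verify.ts`: hashing the trace_hash as UTF-8 text is an assumption, and the full instruction layout and PDA derivation are deliberately left out because the commit message does not spell out the argument order.

```typescript
import { createHash } from "node:crypto";

function sha256(data: string | Uint8Array): Buffer {
  return createHash("sha256").update(data).digest();
}

// Leaf for a single trace. Assumption: the trace_hash string is hashed as UTF-8 text.
function computeLeaf(traceHash: string): Buffer {
  return sha256(traceHash);
}

// Anchor instruction discriminator: first 8 bytes of sha256("global:<ix_name>").
function anchorDiscriminator(ixName: string): Buffer {
  return sha256(`global:${ixName}`).subarray(0, 8);
}

// Borsh string encoding: 4-byte little-endian length prefix, then the UTF-8 bytes.
function borshString(value: string): Buffer {
  const body = Buffer.from(value, "utf8");
  const prefix = Buffer.alloc(4);
  prefix.writeUInt32LE(body.length, 0);
  return Buffer.concat([prefix, body]);
}

// Single-leaf "batch": with one trace, the leaf hash is itself the Merkle root.
function singleLeafRoot(traceHash: string): string {
  return computeLeaf(traceHash).toString("hex");
}
```

Because every step is deterministic, an auditor who holds the trace_hash can recompute the same 32-byte root and compare it with the value read from the PDA.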

…O.1) Closes the Category B audit gap: payment_traces were entirely seeded because rhemify.pay() had never run end-to-end against any 402 endpoint. Two latent drifts hid the failure:

1. SDK PaymentEvent → Convex events:insert mismatch:
   - SDK emitted chain_from/chain_to; Convex required chain (a separate field).
   - SDK emitted id/timestamp/standard_version/parent_event_id/delegation_depth; the strict Convex validator rejected them.

2. SDK PaymentTrace → Convex traces:insert mismatch:
   - SDK used agent_task_description; Convex wanted agent_task_context.
   - SDK omitted confidence (required by the validator).
   - SDK's payment_event_id was an "evt_<hex>" string, not a Convex Id.

Fix at the SDK↔Convex contract boundary (the Go ingest handler), not by loosening Convex validators or breaking SDK types. apps/server adds reshapeEventForConvex / reshapeTraceForConvex / reshapePolicyDecisionForConvex, which project the SDK shape onto the exact field set each mutation accepts.

Also in this chunk:
- rhemify pay --dry-run flag (runs the full pipeline, skips chain submit, still emits the trace): the smallest viable proof that the pipeline works.
- RhemifyConfig.fleetApiKey field. Without it, the CLI hardcoded "cli-user" and the Go server's FleetAPIKeyAuth middleware 401'd silently.
- CLI surfaces ingest errors via onError instead of swallowing them.
- seed.ts no longer fakes payment_events / payment_traces / policy_decisions. Those tables are now driven exclusively by real pipeline output. Faking them was the bug-hider.
- seed:wipeDemoTraces mutation to clear pre-Phase-O.1 seeded rows (run via curl when ready).

Verified end-to-end:

  $ bun run tools/test-402/server.ts &
  $ rhemify pay http://localhost:3402/stock-data --dry-run --max-budget '$1.00'
  → trc_19daf215b88d4b0c lands in Convex with 8 alternatives evaluated, 6 policy rules fired, full detection raw body, real trace_hash.

…(phase O.2) Replaces the dry-run-only flow from O.1 with a real on-chain Solana memo transaction per payment, end-to-end: rhemify pay http://localhost:3402/stock-data --max-budget '$1.00' → submits memo tx on devnet (signed by CLI wallet) → memo content: rhemify:x402:<network>:<priceRaw>:<payTo>:<path>:<ts> → encodes signed signature into x402-spec PaymentPayload (base64 JSON) → sends X-Payment header to resource, retrieves 200 OK → records signature as txHash in trace → trace lands in Convex with payment_tx_hash visible in `traces show` Verified end-to-end (devnet): signature 2ARU61BoEXY7P8H8Nd7wkacRUrZ1Bwftk86eGxpB45ScvGZt3aLsAq3cgs51jSuXi5z9ZsJppQF35kwp3EhctJUW trace_id trc_51ab4efb2fe14e06 explorer https://explorer.solana.com/tx/<sig>?cluster=devnet fee 5000 lamports (~$0.001) memo log "rhemify:x402:solana-devnet:500000:11111111111111111111111111111111:/stock-data:1778489854...." Honest scope: - This is a SIGNED memo tx, NOT a USDC SPL-Token transfer. No tokens move to the recipient — the memo serves as cryptographic intent + payable trace anchor for the audit story. A future variant (x402SolanaTransferExecutor) should do the real token transfer for production. For the audit-grade demo, every payment now has a verifiable on-chain signature. - Local test server (`tools/test-402/server.ts`) accepts any X-Payment header — for real x402 servers, a facilitator would validate the PaymentPayload contents. The header shape we send (x402Version=2, scheme=exact, network, payload.transaction) is the canonical spec shape so it would parse against a real facilitator. x402SolanaExecutor rewrite: - Drop dynamic import of `x402-solana` (peer dep was declared but the installed package's facilitator path required extra.feePayer in detection.raw, which no test/real endpoint we tried supplies — the whole executor errored out unconditionally for every real run). - Self-contained: uses `@solana/web3.js` directly to build/sign/submit the memo tx. Honest about what it does. 
End-to-end field plumbing for the on-chain signature: - packages/types: PaymentTrace.payment_tx_hash (string | null) — distinct from anchor_tx_hash (Merkle anchor) so we don't conflate "payment happened" with "trace document is anchored". - packages/sdk/client.ts + session/index.ts: emit payment_tx_hash from snapshot.executionTxHash (already captured by trace.recordExecution). - packages/backend/convex/schema.ts + traces.ts: payment_tx_hash optional field added to schema + insert validator. - apps/server/internal/handler/ingest.go: reshape passes payment_tx_hash through when non-empty (Convex strict optional rejects empty strings). - packages/cli/.../traces/show.ts: VERIFIABILITY section renders the signature + clickable devnet explorer link. Robustness fixes in traces/show.ts (drift exposed by real SDK output): - policy_rules_fired: SDK emits {decision, actual} where seed used {result, value}. normalizeRule() absorbs both shapes. - instrument_selection_log: SDK emits a string ("ows selected: score 0.701"), seed used {selected, reason}. Render both. - replay_snapshot.{policy_state, agent_context, vendor_registry_snapshot}: SDK emits camelCase + zero values, seed used snake_case + real values. show.ts now reads both, falls back to "(empty)" instead of crashing. (The deeper fix — actually populating SDK policy state — is phase O.4.)
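The memo format and the X-Payment header shape described for O.2 can be sketched as two small helpers. This is an illustration under assumptions, not the executor's code: the `MemoFields` type and function names are hypothetical, and what exactly goes into `payload.transaction` (here a caller-supplied string) is not specified beyond the field name.

```typescript
// Hypothetical input shape for the memo content.
interface MemoFields {
  network: string;  // e.g. "solana-devnet"
  priceRaw: string; // raw amount as a string, e.g. "500000"
  payTo: string;    // recipient address
  path: string;     // resource path, e.g. "/stock-data"
  ts: number;       // timestamp
}

// Memo content: rhemify:x402:<network>:<priceRaw>:<payTo>:<path>:<ts>
function buildMemo(f: MemoFields): string {
  return ["rhemify", "x402", f.network, f.priceRaw, f.payTo, f.path, String(f.ts)].join(":");
}

// X-Payment header value: base64 of a JSON PaymentPayload with
// x402Version=2, scheme=exact, network, payload.transaction.
// Assumption: `transaction` is whatever string the executor chooses to attach.
function encodeXPayment(network: string, transaction: string): string {
  const payload = {
    x402Version: 2,
    scheme: "exact",
    network,
    payload: { transaction },
  };
  return btoa(JSON.stringify(payload));
}

// Inverse helper, handy for inspecting a header while debugging.
function decodeXPayment(header: string): unknown {
  return JSON.parse(atob(header));
}
```

Since the local test server accepts any X-Payment header, `decodeXPayment` is mainly useful for checking by eye that the shape matches what a facilitator would parse.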

…O.3) Mirrors phase O.2 (x402-solana) for the MPP standard: rhemify pay http://localhost:3402/analytics --max-budget '$1.00' → MPP detected via WWW-Authenticate (network=solana-devnet, $0.10) → submits memo tx on devnet with content "rhemify:mpp:<...>" → sends Authorization: Payment <base64-JSON-token> with signed signature → resource returns 200, SDK records signature in trace Verified end-to-end (devnet): signature EJPuNuCNuK4UGPXZEYCWPMwWSf3BpGGTsjEiNQkM58c4ZxsUEzQn3YC72PzoEn5Q6zjMM47k42tX2HtX6HWmJ3z trace_id trc_533c5b753f154fe7 memo log "rhemify:mpp:solana-devnet:100000:11111111111111111111111111111111:/analytics:1778489299557" fee 5000 lamports mppChargeExecutor rewrite (same approach as x402-solana.ts in O.2): - Drop dynamic import of `@solana/mpp` + `@solana/kit` (peer deps; the upstream API surface has shifted under us multiple times and using it as the happy path silently broke every real run). - Self-contained: uses `@solana/web3.js` directly to build/sign/submit a memo tx whose content carries the trace context (network, amount, recipient, resource path, timestamp). - Sends `Authorization: Payment <base64>` (MPP convention) instead of `X-Payment` (x402 convention). The local test server accepts either. Honest scope (documented in the executor file-level doc): - This is a SIGNED memo tx, NOT a USDC SPL-Token transfer. No tokens move to the recipient. A future variant (`mppChargeTransferExecutor`) should do the real token transfer for production. For the audit-grade demo, the memo serves as cryptographic intent + payable trace anchor. - The Payment token shape we send is JSON, not the HMAC MAC token that a real `mppx` server would expect. Works against any server that treats "Authorization present" as the gate (incl. our local test server). Real mppx interop is future work. Both supported protocols (x402, mpp) now produce real on-chain signatures end-to-end. Category B audit gap fully closed for the SUPPORTED_PROTOCOLS surface.

Closes the keystone audit gap behind `rhemify traces replay`: every emitted trace had `replay_snapshot.policy_state` hardcoded to zeros and camelCase keys, so the Go replay engine (which reads snake_case keys) saw an empty policy and every counterfactual override was meaningless.

Verified end-to-end (devnet):

  $ rhemify pay http://localhost:3402/stock-data --max-budget '$1.00'
  → trc_4f362bd02f2249d9 with policy_state{daily_limit=100, max_per_tx=50, ...}
    (real values fetched from Go /api/policy/<agent>, not zeros)

  $ rhemify traces replay trc_4f362bd02f2249d9 --daily-limit 0
  → original: ALLOWED  counterfactual: BLOCKED  ← daily_limit BLOCK

i.e. lowering the limit below the actual spend correctly flips the outcome. The replay engine now has real state to flip.

Three layers of drift fixed:

1. packages/types/src/intelligence.ts: canonical contract realigned:
   - PolicyState keys: camelCase (dailyLimit, ...) → snake_case (daily_limit, ...). This is the wire shape; Go replay reads policy_state["daily_limit"]. The type was the source of truth, and it was wrong. Note: SDK runtime PolicyConfig stays camelCase because that's the live policy-engine shape the agent's rules evaluate against. The SDK now translates between them when emitting.
   - vendor_registry_snapshot: Record<string, unknown> → Record<string, {is_blocked: boolean}>. Go reads snapshot[domain].is_blocked.
   - agent_context: string → {spend_today: number}. Go reads agent_context.spend_today.
   - allowed_standards: PaymentProtocol[] → string[] (over the wire the literal union is lost; the loose type lets emit type-check).

2. packages/sdk/src/policy/index.ts: PolicyEngine.evaluate signature:
   - Returns { decision, context } instead of just decision.
   - The caller needs the context to snapshot real policy_state into the trace; without it every trace recapitulated the empty-state bug.

3. packages/sdk/src/client.ts + session/index.ts + trace/{index,types}.ts:
   - Trace gains recordPolicyContext(ctx) + policyContext in snapshot.
   - client.ts pay() captures context and emits real snake_case policy_state by translating from camelCase PolicyConfig:
       daily_limit          ← policy.dailyLimit
       max_per_transaction  ← policy.maxPerTransaction
       approval_threshold   ← policy.approvalThreshold
       allowed_standards    ← policy.allowedStandards
       domain_allowlist     ← policy.domainAllowlist
   - vendor_registry_snapshot: built from policyContext.blockedDomains.
   - agent_context.spend_today: from policyContext.spentToday.
   - Session-path emits zero-state but in the correct snake_case shape so it round-trips through Go without breaking schema validation.

The replay engine itself didn't change. Go-side replay/policy.go was already correct; it was just being fed bad data. With real state, the killer-demo "what if daily_limit were $1?" works as advertised.
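The camelCase-to-snake_case translation listed above can be sketched as one projection function. The field mapping follows the commit message; the interface names, the `PolicyContextSketch` shape, and the choice to include only blocked domains in vendor_registry_snapshot are assumptions for illustration, not the SDK's actual types.

```typescript
// Runtime policy shape (camelCase), per the field names in the mapping above.
interface PolicyConfig {
  dailyLimit: number;
  maxPerTransaction: number;
  approvalThreshold: number;
  allowedStandards: string[];
  domainAllowlist: string[];
}

// Hypothetical context shape; only blockedDomains and spentToday are named in the commit.
interface PolicyContextSketch {
  policy: PolicyConfig;
  blockedDomains: string[];
  spentToday: number;
}

// Wire shape (snake_case) that the Go replay engine reads.
interface ReplaySnapshot {
  policy_state: {
    daily_limit: number;
    max_per_transaction: number;
    approval_threshold: number;
    allowed_standards: string[];
    domain_allowlist: string[];
  };
  vendor_registry_snapshot: Record<string, { is_blocked: boolean }>;
  agent_context: { spend_today: number };
}

function toReplaySnapshot(ctx: PolicyContextSketch): ReplaySnapshot {
  const vendors: Record<string, { is_blocked: boolean }> = {};
  // Assumption: only blocked domains are recorded, each as is_blocked=true.
  for (const domain of ctx.blockedDomains) {
    vendors[domain] = { is_blocked: true };
  }
  return {
    policy_state: {
      daily_limit: ctx.policy.dailyLimit,
      max_per_transaction: ctx.policy.maxPerTransaction,
      approval_threshold: ctx.policy.approvalThreshold,
      allowed_standards: [...ctx.policy.allowedStandards],
      domain_allowlist: [...ctx.policy.domainAllowlist],
    },
    vendor_registry_snapshot: vendors,
    agent_context: { spend_today: ctx.spentToday },
  };
}
```

Keeping the translation in one place means the runtime PolicyConfig can stay camelCase while every emitted trace uses the keys the Go side reads.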

Audit-grade rewrite of the root README. The previous version overclaimed on multiple axes a Colosseum technical-DD pass would catch immediately: - "Any standard (x402, MPP, L402, AP2)" — L402/AP2/ACP throw ProtocolNotImplementedError. They detect; they do not execute. - "Any chain" — EVM/Base x402 path exists in code but was never proven end-to-end against a real endpoint. Solana is the only supported execution surface in v1. - "Base x402 + CCTP" — CCTP path resolver returns available:false. Wiring exists; execution does not. - "@x402/fetch, mppx, OWS signing" — those packages were peer deps that we ship around with a self-contained @solana/web3.js memo executor, because their facilitator-shaped APIs never matched any real endpoint we tested against. - "Permanently verifiable on Solana via PDAs" — Anchor program is deployed and write_daily_root works, but only `rhemify traces verify` submits anchor txs (not automatic per-payment). - "338+ seeded x402 vendor endpoints" — that was discovery-DB metadata, not flow against the endpoints. What the new README claims (and links the user to verify): - x402 + MPP detection from real HTTP 402 responses. - Solana memo execution: signed-intent tx on devnet, ~5000 lamports fee, memo carries trace context. NOT a USDC transfer — explicit. - Full decision capture (detection raw body, alternatives scored, rules fired, agent context) stored in Convex with content hash. - `rhemify traces replay <id>` counterfactuals against real captured state (post-O.4 — policy_state now has real values). - `rhemify traces verify <id>` writes Merkle root to devnet program. New "What is NOT in v1" section enumerates the typed stubs and the path resolvers that return false so a reader can audit the supported surface in seconds rather than reverse-engineering from the codebase. New "What actually works end-to-end" table maps each capability to a specific shell command and a specific proof artifact, mirroring how a Colosseum judge would walk the demo. 
New "Roadmap" section parks the previous overclaims as explicit future work — USDC transfers, mainnet anchoring, L402/AP2/ACP execution, EVM path, CCTP, Ika dWallet — so they remain visible without being lied about. No other files changed in this commit. Per CLAUDE.local.md, README is the one .md that is committed/pushed; other markdown stays local. The apps/web marketing components (Hero, Features, etc.) carry their own positioning copy owned by Jun Shen — out of scope for this audit fix.

Closes the audit-flagged "no automated quality gate" gap. Runs three jobs in parallel on every push to main or feature/** and on PRs to main:

  typescript       bun install + SDK build + bun run check-types
  go-server        go vet + go build + go test in apps/server
  anchor-programs  cargo check on rhemify-anchor + rhemify-dwallet

A concurrency group cancels in-flight runs on the same branch so a rapid-fire push sequence doesn't queue up wasted compute.

Toolchain pins (lifted from the actual local environment so CI matches what we develop against):
- Bun 1.3.11 (package.json packageManager)
- Go 1.24 (apps/server/go.mod)
- Rust stable (rustup default; cargo check on host target, not SBF)

Why cargo check, not anchor build:
- Anchor SBF build needs cargo-build-sbf from Solana's bundled toolchain, which is heavy to install on every CI run.
- The audit value of CI here is "did this change break compilation", not "is the SBF artifact byte-equal". cargo check on the host target catches the same syntax + type errors. Full SBF compile stays a developer-machine step before devnet deploy.

Smoke-tested locally before push; all three jobs pass cleanly:
- bun typecheck: 3 workspaces typecheck (was previously red on MCP until phase O.4's SDK build chain stabilized).
- go vet/build/test: 5 packages tested, 0 failures.
- cargo check: rhemify-anchor (0.67s), rhemify-dwallet (0.76s). Each emits ~6 unexpected_cfgs warnings from anchor's cfg surface against rustc 1.95. These are warnings, not failures, and won't block.

Caching:
- bun: install cache by lockfile hash.
- go: actions/setup-go built-in cache on go.sum.
- cargo: registry + git + per-program target dir by Cargo.toml hash. Cold anchor build is ~5min; cached is ~30s.

What this does NOT include (deferred to next chunk):
- Anchor program unit tests (no tests written yet; phase O.7).
- Web app dev-server smoke (apps/web visual regression is out of Sean/siewwwin scope).
- Release artifact builds.

First CI run on this repo (commit 0d8cd2c) caught a latent build-order bug that my local smoke test missed: packages/mcp's tsc fails with TS2307 "Cannot find module '@rhemify-monorepo/sdk'" when the SDK's dist/index.d.ts hasn't been built yet.

Why local passed but CI failed:
- moduleResolution: "bundler" (in packages/config/tsconfig.base.json) reads the SDK's package.json "types" field, which points to "./dist/index.d.ts".
- When dist/ doesn't exist, the resolver can't find the module at all, hence TS2307 rather than the softer TS7016 ("found .js but no .d.ts") error.
- My local `bun run check-types` showed mcp:check-types as "cache hit, replaying logs". Turbo skipped the actual tsc invocation because a prior successful run (when dist/ existed) was cached. The cache hid the build-order dependency.

Fix:
- turbo.json: the check-types task now dependsOn ["^build", "^check-types"] instead of ["^check-types"] alone.
- This forces every workspace's check-types to wait for upstream workspaces' build to complete, guaranteeing dist/ exists before a downstream package tries to resolve its types.

Verified locally with --force (cache bypassed):

  bun run check-types --force
  → SDK build runs first (sdk:build: 62ms ESM + 2417ms DTS)
  → MCP check-types runs after (mcp:check-types: cache bypass)
  → 4 tasks successful, 0 errors

This unblocks the CI TypeScript job that failed on the first run (commit 0d8cd2c, run 25660844873). Anchor + Go jobs were already green.

Closes the audit-flagged "no on-chain test coverage" gap. Adds 17 unit tests across the two Anchor programs, focused on the security invariant both audit reports flagged: user-scoped PDA seeds.

rhemify-anchor (6 tests):
  daily_root_pda_is_deterministic
  daily_root_pda_is_authority_scoped    ← squat defense
  daily_root_pda_is_fleet_scoped
  daily_root_pda_is_date_scoped
  daily_root_seed_prefix_is_pinned      ← rename canary
  program_id_matches_declare_id         ← deploy canary

rhemify-dwallet (11 tests):
  fleet_vault_pda_is_deterministic
  fleet_vault_pda_is_authority_scoped   ← squat defense
  fleet_vault_pda_is_fleet_scoped
  agent_wallet_pda_is_deterministic
  agent_wallet_pda_is_authority_scoped  ← squat defense (transitive)
  agent_wallet_pda_differs_by_agent_key
  signing_approval_pda_is_deterministic
  signing_approval_pda_is_nonce_scoped  ← replay defense
  signing_approval_pda_inherits_agent_wallet_scope
  seed_prefixes_are_pinned
  program_id_matches_declare_id

Scope discipline: what these tests do NOT cover:
- Full account validation (the #[account(...)] macro constraints): init_if_needed semantics, signer enforcement, rent payment, etc. Those require an SVM runtime (Mollusk or litesvm). Future chunk.
- The handler bodies (Clock::get, daily_cap math). Same: needs SVM.
- SBF-target compilation. Stays a developer-machine step; CI compiles the host target only to keep job runtime under a minute.

What they DO cover is the security invariant a $1M technical DD would flag if absent: every PDA in this monorepo is derived from a seed list that includes the operator's pubkey, so a different signer cannot init into another fleet's account namespace.

Tests pin:
- seed prefix bytes (catches an accidental rename that would orphan every deployed PDA on devnet)
- authority inclusion (proves squat defense holds for fleet-vault and agent-wallet PDAs)
- transitive squat defense for signing-approval (which seeds off the agent_wallet PDA, itself authority-scoped)
- program IDs against declared values (catches deploy mismatches)

CI workflow (.github/workflows/ci.yml):
- Both anchor jobs switch from cargo check to cargo test --all-targets. cargo test runs check implicitly, builds the tests, and executes them. Cached builds keep the job under a minute.

Verified locally before push:
  programs/rhemify-anchor:  6/6 passed, finished in 0.00s
  programs/rhemify-dwallet: 11/11 passed, finished in 0.00s

… shapes (phase O.8) The replay diff renderer previously couldn't show a real side-by-side comparison because (1) the SDK and Go used different rule names for the same checks, and (2) Go's buildOriginalOutcome read the wrong fields off SDK-emitted traces. Both bugs were hidden by the seeded traces O.1 deleted — once real pipeline output started flowing, the killer-demo output broke. Before: RULE-BY-RULE · domain_blocked → skipped CHANGED ✓ domain_allowlist → pass CHANGED · allowed_standards → skipped CHANGED ✗ daily_limit → BLOCK CHANGED · max_per_tx → skipped CHANGED ✓ max_per_transaction skipped → pass CHANGED ✓ standard_allowlist skipped → pass CHANGED ✓ vendor_blocked skipped → pass CHANGED (12 rows total — every rule shown twice, every rule "CHANGED") After: RULE-BY-RULE ✓ vendor_blocked pass → pass — ✓ domain_allowlist pass → pass — ✓ standard_allowlist pass → pass — ✗ daily_limit pass → BLOCK CHANGED ✓ max_per_transaction pass → pass — ! approval_threshold pass → flag CHANGED (6 rows — one per rule, only real changes flagged) Two layers of drift fixed: 1. SDK rule names (packages/sdk/src/policy/rules.ts): max_per_tx → max_per_transaction allowed_standards → standard_allowlist domain_blocked → vendor_blocked Go's names (apps/server/internal/replay/policy.go) were the canonical set — clearer, snake_case, descriptive — so SDK moves to match. Suggestion strings in policy/index.ts and assertions in test/policy.test.ts updated to match. 2. Go original-outcome reads (apps/server/internal/replay/replay.go): buildOriginalOutcome read m["result"] / m["value"] (the deprecated seeded shape) but SDK actually emits decision / actual. Result: original.rule_results came back with empty result/actual strings for every rule on real traces. Now reads either shape — { result, value } or { decision, actual } — and normalizes SDK's "allow" → "pass" so diff comparisons against the live engine's pass/block/flag vocabulary line up. 
Verified end-to-end: $ rhemify pay http://localhost:3402/stock-data --max-budget '$1.00' → trc_9b962efd66f54e57 emitted with new names $ rhemify traces replay trc_9b962efd66f54e57 --daily-limit 0 → original ALLOWED, counterfactual BLOCKED, only daily_limit shown as CHANGED (the actual override target) Side note still visible in the output: approval_threshold reads as "pass" in original (SDK: disabled when threshold=0) but "flag" in replayed (Go: any amount > 0 threshold flags). Different semantic for the "approval disabled" case. Not part of this chunk — separate seam. Local tests pass: 19/19 SDK policy tests, Go replay tests.
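The dual-shape read described in item 2 can be sketched roughly as follows. The real fix lives in Go (apps/server/internal/replay/replay.go) and is not reproduced here; this TypeScript version only illustrates the idea. The function name, the `rule` key, and the `NormalizedRule` type are assumptions made for the example.

```typescript
// Illustrative only: normalizeRule, NormalizedRule and the "rule" key are
// hypothetical names, not taken from the Go implementation.
interface NormalizedRule {
  rule: string;
  result: string;
  actual: string;
}

function normalizeRule(m: Record<string, unknown>): NormalizedRule {
  const str = (v: unknown): string =>
    typeof v === "string" ? v : v == null ? "" : String(v);

  // Accept either the seeded shape { result, value }
  // or the SDK-emitted shape { decision, actual }.
  const rawResult = str(m["result"] ?? m["decision"]);
  const actual = str(m["value"] ?? m["actual"]);

  // The SDK says "allow" where the replay engine says "pass".
  const result = rawResult === "allow" ? "pass" : rawResult;

  return { rule: str(m["rule"]), result, actual };
}

const fromSdk = normalizeRule({ rule: "daily_limit", decision: "allow", actual: "$0.50" });
const fromSeed = normalizeRule({ rule: "daily_limit", result: "block", value: "$120" });
console.log(fromSdk, fromSeed);
```

Normalizing once at the read boundary keeps the diff comparison itself free of shape checks.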

Six zombie imports left over from O.1's trace-seed-loop deletion: PaymentStandard, AgentStatus, TransactionStatus, PaymentOutcome, IntelligenceActionType, IntelligenceOutcome. They were the enum validators the trace-seed loop used; that loop is gone, the imports weren't. oxlint flagged all six. After: bunx oxlint packages/backend/convex/seed.ts → 0 warnings, 0 errors. Other oxlint warnings in the repo (24 total across 208 files) are pre-existing in tools/test-402/ and packages/sdk/test/ — out of scope for this chunk. Two warnings in my new executors (x402-solana.ts, mpp-charge.ts) about `...(options.headers ?? {})` are intentional. Lint suggests dropping the `?? {}` fallback as "unnecessary" — true at runtime, but TS strict mode requires it because `headers?: Record<string, string>` is typed as possibly undefined and spreading undefined into an object literal is a TS error under strict checks. Keeping the fallback.

… O.17) apps/web/public/logo/{base,agentcard,circle,l402,virtual}.svg were shipped with the old "Integrated with" surface that O.10 trimmed. None of the five reflect a capability the SDK actually executes: base.svg — Base x402 path exists in code but never proven e2e agentcard.svg — agentcard-mpp executor canExecute returns false circle.svg — CCTP path resolver returns available:false l402.svg — detected, throws ProtocolNotImplementedError on execute virtual.svg — ACP detector hardcodes Base; no executor Confirmed no references in apps/web/src/ before delete (grep clean). Remaining logos in public/logo/ — mpp, solana, superteam, x402 — match the TrustStrip LOGOS array. Future contributors can re-add any logo when its executor lands.

…layer (phase O.18) CLAUDE.local.md (2026-04-23 audit) flagged these as legacy artifacts "still in the tree but not driving the UI". Verified via grep: nothing outside services/index.ts (the barrel itself) imports any of: - apps/web/src/lib/services/fleet-service.ts (interface) - apps/web/src/lib/services/mock-fleet-service.ts (162 lines impl) - apps/web/src/lib/services/wallet-service.ts (interface) - apps/web/src/lib/services/mock-wallet-service.ts (impl) - apps/web/src/lib/services/index.ts (barrel) - apps/web/src/lib/hooks/query-keys.ts (15 lines) The dashboard's data layer pivoted to convex/react useQuery hooks (apps/web/src/lib/hooks/use-*.ts). MockFleetService was the pre-Convex in-memory backend; query-keys was the TanStack Query cache-key constants that came with it. NOT removed: - apps/web/src/lib/simulation/engine.ts (SimulationEngine) — still imported by routes/_onboarding/deploy.tsx to drive the post-deploy fake transaction feed during onboarding. Live code, despite the CLAUDE.local.md note grouping it with the dead data-layer files. Future chunk could swap it for real Convex-feed reads, but that's a feature change, not a cleanup. Verified: bun run check-types passes (turbo cache hit), no broken imports.

…utor (phase O.19) The two real on-chain executors introduced in O.2 + O.3 had no unit tests — the cascade-routing logic (which executor.canExecute returns true for a given detection + wallet) was only ever validated by the live e2e flow against tools/test-402/server.ts. A canExecute regression would silently route payments through the wrong executor (or fall through to the unsupported-protocol stubs and throw), and the seeded tests wouldn't catch it. 12 new tests (vitest), 6 per executor, extending the existing new-executors.test.ts pattern. x402SolanaExecutor: ✓ true for x402 on solana-devnet with Solana wallet ✓ true for x402 on solana-mainnet ✓ false for EVM networks (base, base-sepolia) — cascade falls through to x402EvmExecutor ✓ false without Solana wallet (empty wallet, evm-only wallet) ✓ false for non-x402 protocols (mpp, l402) mppChargeExecutor: ✓ true for mpp on solana-devnet / -mainnet ✓ true on legacy "devnet" / "mainnet-beta" network strings (some MPP WWW-Authenticate parsers yield these shorter names) ✓ false without a Solana wallet ✓ false for non-mpp protocols (x402) ✓ false for non-Solana networks (base) Execute path stays in e2e (real Solana RPC + funded keypair required). Future chunks can add Mollusk/litesvm-backed integration tests for the execute body once the test-validator surface stabilizes. Verified: bun test test/new-executors.test.ts → 27 pass, 0 fail, 31 expect() calls, 104ms.

Pre-flight diagnostic for the demo. Before this chunk, status showed fleet identity + wallet balance but said nothing about whether the services the demo actually depends on were up. A judge running `rhemify status` would still have to manually try `curl localhost:8080`, `curl localhost:3212`, etc.

New "Services:" section probes three dependencies in parallel with a 2.5s per-probe timeout:
  Go server   GET /api/health
  Convex      POST /api/query (empty body: Convex 400s, but the TCP RTT confirms the deployment is up)
  Test 402    GET /health (informational, not mandatory)

Output:
  Test 402    ● reachable (7ms, http://localhost:3402/health)
  Go server   ● reachable (8ms, http://localhost:8080/api/health)
  Convex      ● reachable (10ms, http://127.0.0.1:3212/api/query)

Color coding:
  ● green   reachable + 2xx (or any response for Convex POST mode)
  ● yellow  reachable but non-2xx HTTP status
  ○ red     network failure; distinguishes "timeout" / "not running" / other Error.message in the rightmost column

Also hardened the existing wallet balance lookup: the previous version silently crashed on RPC failure; it now reports the error inline and continues to the services section instead of aborting.
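A minimal sketch of the probe-and-classify logic described above, assuming a runtime with global `fetch` and `AbortSignal.timeout` (Bun, or Node 18+). The names `probe`, `classify`, and `probeAll` are hypothetical and are not taken from status.ts; only the endpoints, the 2.5s timeout, and the color rules come from the notes above.

```typescript
// Hypothetical names; only the URLs, timeout and color rules come from the commit notes.
type ProbeOutcome =
  | { kind: "response"; status: number; ms: number }
  | { kind: "failure"; message: string };

type Color = "green" | "yellow" | "red";

// anyResponseIsUp models the Convex POST mode, where a 400 still proves liveness.
function classify(o: ProbeOutcome, anyResponseIsUp = false): Color {
  if (o.kind === "failure") return "red";
  if (anyResponseIsUp) return "green";
  return o.status >= 200 && o.status < 300 ? "green" : "yellow";
}

async function probe(url: string, init: RequestInit = {}, timeoutMs = 2500): Promise<ProbeOutcome> {
  const started = Date.now();
  try {
    const res = await fetch(url, { ...init, signal: AbortSignal.timeout(timeoutMs) });
    return { kind: "response", status: res.status, ms: Date.now() - started };
  } catch (err) {
    // Timeouts and connection refusals both land here; keep the message for display.
    return { kind: "failure", message: err instanceof Error ? err.message : String(err) };
  }
}

// Runs all three probes concurrently, so the worst case is one timeout, not three.
async function probeAll(): Promise<Record<string, Color>> {
  const [go, convex, test402] = await Promise.all([
    probe("http://localhost:8080/api/health"),
    probe("http://127.0.0.1:3212/api/query", { method: "POST" }),
    probe("http://localhost:3402/health"),
  ]);
  return {
    "Go server": classify(go),
    Convex: classify(convex, true),
    "Test 402": classify(test402),
  };
}
```

Because `probe` never rejects, `Promise.all` cannot short-circuit on the first dead service, which is what lets the section report every dependency in one pass.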

One script the judge / a new contributor can invoke to walk the whole pipeline in one shot. Assumes services are up (Convex, Go server, test-402) and the CLI is onboarded. It surfaces a 'not reachable' service early via the embedded `status` check rather than failing mid-replay with an opaque error.

Steps (with set -euo pipefail, so any failure aborts):
  1. rhemify status               fleet identity + service health
  2. rhemify pay <endpoint>       real Solana memo tx; extracts trace_id from stdout
  3. rhemify traces show <id>     7-section decision context render
  4. rhemify traces replay <id>   counterfactual with daily_limit=0 (the killer-demo: ALLOWED → BLOCKED)
  5. Summary                      explorer link, follow-up commands

Endpoint defaults to http://localhost:3402/stock-data (x402). Pass http://localhost:3402/analytics as $1 for MPP: same flow, different detection path, same replay primitive.

One gotcha caught while writing this: `bun --cwd <path> run src/index.ts <args>` makes bun think "src/index.ts" is a package.json script name and swallows the actual argv. The script uses `bun <absolute-path> <args>` instead, which invokes the file directly.

Verified end-to-end:
  $ tools/demo-run.sh
  → trc_2d56f729c38e478a + sig 4DciXdjUj... on devnet
  → replay diff: daily_limit pass → BLOCK CHANGED, others —

Quickstart's six manual command lines compressed to one: ./tools/demo-run.sh The individual commands stay listed below for anyone who wants to run them by hand. Also corrected the per-command invocation from `bun --cwd packages/cli run src/index.ts ...` to `bun packages/cli/src/index.ts ...` — the former was broken (bun interpreted "src/index.ts" as a package.json script name and dropped the actual argv, see O.22 commit notes). No other changes — Quickstart still requires Convex / Go server / test-402 / wallet setup in steps 1-3 before the runner can fire.

…odes (phase O.24) `.padEnd(20)` on a pc-colorized string counts the ANSI escape sequences as visible chars, so "pass" (4 chars green = ~14 bytes) padded to 20 left 6 trailing spaces instead of 16. The result column drifted off the header line by ~6 chars per row.

New helper `colorPadEnd(colored, visible, width)` takes both the colorized string (rendered) and the uncolored visible string (for measuring), returning `colored + repeat(width - visible.length, " ")`.

Before: ✓ vendor_blocked pass → pass —
After: ✓ vendor_blocked pass → pass —

The width was also shrunk 20 → 10 since policy decisions are short words ("pass", "block", "flag", "skipped"). 10 is enough headroom, and the columns now sit closer for easier eye-tracking.

The uppercase "BLOCK" variant for `block` results is applied in the visible-string computation as well as in the rendered string. Both spellings are 5 chars today, so the measured width is the same either way, but applying the same case rule to the visible string keeps the measurement consistent if anyone changes block's rendering later.
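A minimal sketch of the helper, following the `colorPadEnd(colored, visible, width)` signature given above. The `green` function is a stand-in for `pc.green` so the example runs without the color library, and the `Math.max` clamp is an assumption added here, not something the notes above describe.

```typescript
// Pads based on the visible text, then appends spaces outside the color codes.
function colorPadEnd(colored: string, visible: string, width: number): string {
  const pad = Math.max(0, width - visible.length); // clamp is an assumption for over-long values
  return colored + " ".repeat(pad);
}

// Stand-in for pc.green: wraps the text in ANSI "green" / "default foreground" codes.
const green = (s: string): string => `\x1b[32m${s}\x1b[39m`;
const stripAnsi = (s: string): string => s.replace(/\x1b\[[0-9;]*m/g, "");

const naive = green("pass").padEnd(20);               // measures the escape codes too
const fixed = colorPadEnd(green("pass"), "pass", 20); // measures only "pass"

console.log(JSON.stringify(stripAnsi(naive)));
console.log(JSON.stringify(stripAnsi(fixed)));
```

With the stand-in codes, the colorized "pass" is 14 characters long, so the naive call adds only 6 spaces while the helper adds the intended 16.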

Same ANSI-padding bug as O.24, but in show.ts's POLICY section. A single loop iterating each policy rule did:

  const result = pc.green(r.result);             // ANSI-wrapped
  console.log(`... ${result.padEnd(20)} ...`);   // counts escape codes

so the "pass" / "block" / "flag" column was over-padded by ~6 chars, pushing the trailing `threshold ... actual ...` text rightward and breaking eye-tracking across rows.

Fix mirrors O.24: compute the visible (uncolored) length first, color the visible string second, append explicit padding spaces outside the color codes. Column width pulled from 20 → 10 since the values are short ("pass", "BLOCK", "flag", "skipped") and the threshold/actual detail can use the recovered horizontal space.

After:
  ✓ vendor_blocked pass threshold not in blocked list actual localhost
  ✓ daily_limit pass threshold $100 actual $0.50
(each row's "threshold" starts at the same column; it was drifting before)

When a replay override doesn't change any rule outcome (e.g. --daily-limit 10000 raising the limit above the actual spend), the Go replay engine returns an empty PolicyDiff slice. Go's json.Marshal serializes `[]PolicyDiff(nil)` as `null`, not `[]`. The CLI did:

  diff: PolicyDiff[];                              // ← type lied
  const diffRules = new Set(r.diff.map(...));      // crashes on null
  if (r.diff.length === 0) { ... }                 // also crashes

so every "what if I loosen the policy?" counterfactual died with "null is not an object (evaluating 'r.diff.map')". The killer demo only worked in the "tighten" direction.

Fix:
- Type the field as `PolicyDiff[] | null` (truthful contract).
- Coalesce to [] once at the top of render() (`const diff = r.diff ?? []`).
- Switch the two existing call sites (rule-by-rule + DIFF SUMMARY) to the local variable.

After:
  $ rhemify traces replay <id> --daily-limit 10000
  → counterfactual: ALLOWED (decision unchanged)
  → DIFF SUMMARY: "No rules changed outcome — your override didn't affect the decision."

This was almost certainly the second-most-likely crash a judge would hit during the demo (after the missing payment_tx_hash render). Both were "non-happy-path" cases that real traces produce but the seeded fixtures didn't.
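The coalesce-once pattern can be sketched as below. The `PolicyDiff` field names, the `ReplayResponse` type, and `summarizeDiff` are assumptions for illustration; only the `PolicyDiff[] | null` typing and the `r.diff ?? []` coalesce come from the fix described above.

```typescript
// Field names are hypothetical; the nullable typing mirrors the fix.
interface PolicyDiff {
  rule: string;
  original: string;
  replayed: string;
}

interface ReplayResponse {
  diff: PolicyDiff[] | null; // Go marshals a nil slice as JSON null
}

function summarizeDiff(r: ReplayResponse): string {
  // Coalesce once, then use only the local variable below.
  const diff = r.diff ?? [];
  if (diff.length === 0) return "No rules changed outcome";
  const changed = Array.from(new Set(diff.map((d) => d.rule)));
  return `${changed.length} rule(s) changed: ${changed.join(", ")}`;
}

// What the server sends when the override changes nothing.
const loosened = JSON.parse('{"diff":null}') as ReplayResponse;
const tightened: ReplayResponse = {
  diff: [{ rule: "daily_limit", original: "pass", replayed: "block" }],
};

console.log(summarizeDiff(loosened));
console.log(summarizeDiff(tightened));
```

Typing the field as nullable means the compiler rejects any later call site that reaches for `r.diff.map` directly.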

Previously, when the PDA for fleet+date already existed but its root didn't match the current trace, verify printed "MISMATCH" at the top but then "already anchored — verified without writing a new tx" in the status line. Two contradictory messages, and the second one says "verified", which is the opposite of MISMATCH.

The contradiction surfaces because the program design anchors a single trace's hash directly to the daily-root PDA. With one PDA per fleet+date, only the first traces-verify'd trace each day has a matching on-chain root; subsequent calls report MISMATCH against the first one's hash. The judge running `rhemify traces verify` on a fresh trace will hit MISMATCH on the second invocation, with no clue why.

Three-branch status now reflects the actual state:
  newly_anchored=true
    "freshly anchored in this run" (green)
  !newly_anchored, match=true
    "already anchored — on-chain root matches this trace"
  !newly_anchored, match=false
    "PDA exists from a previous anchor for this fleet+date, but its root differs from this trace. To anchor this trace's hash, delete or rotate the existing PDA, or wait until the next day's PDA slot." (yellow)

The product gap this exposes (anchoring single trace hashes instead of a daily Merkle root of all traces) is a design simplification, not a correctness bug. The MISMATCH report is the correct audit result; this commit just stops lying about what the state means.

A future chunk could swap the anchor to a real Merkle root over the day's traces (matches the field name `merkle_root` on the program state) so any subsequent verify call computes a proof against the batch. That's a feature, not a fix.

…g gap (phase O.28) Two roadmap updates: - "CI/CD on GH Actions" now annotated with "shipped" pointer to .github/workflows/ci.yml (the O.6 commit). Was listed as future work but is live and green on every push. - Added "Per-trace Merkle anchoring" as the next concrete roadmap item, surfaced by O.27. The current design calls write_daily_root with a SINGLE trace's content hash, treating it as the day's "merkle root". With a per-fleet-per-date PDA, only the first verify-call's trace each day matches; everything else MISMATCHes. The Anchor program already accepts merkle_root + trace_count, so the on-chain structure is there — the batching layer (build a Merkle tree of the day's traces server-side, return per-trace proofs from `rhemify traces verify`) is the missing piece. Honest disclosure of a real product gap, before a judge or contributor runs into it cold.

Previous Quickstart left "<api key from Convex fleets row>" as a placeholder for fleetApiKey — meaning a new contributor had to figure out how to query Convex (which endpoint? which key? which credentials?) before they could even run the demo. The seed mutation (packages/backend/convex/seed.ts) creates a fleet with a stable known api_key "rhm_demo_local_fleet_key_2026" — exposed it so the Quickstart can stand alone. Quickstart now: 1. Install 2. Backend services up 3. curl http://127.0.0.1:3212/api/mutation -d '{"path":"seed:demo",...}' → creates fleet + 6 agents 4. Config file with the seed's known api_key 5. ./tools/demo-run.sh Fleet/agent ids are still placeholders because they're Convex auto-generated and not the auth path — the Go server's FleetAPIKeyAuth middleware looks up fleet_id by api_key on every request. A contributor can leave them as <placeholders> and the demo still works. Truer-still UX would be `rhemify onboard` writing the config automatically off the seeded fleet, but that's a feature change. The Quickstart now matches what the demo actually requires.

…ase O.31) Three rows added to the verifiable-capability table: Full demo, one shot ./tools/demo-run.sh (shipped in O.22) Dependency health rhemify status (shipped in O.20) CI on every push gh run list ... (shipped in O.6/O.7) A judge skimming the table now sees the entire shipped surface area, not just the per-command primitives. The runner row in particular is the lowest-friction entry point — same flow as the 4 manual rows below it, but one keystroke.

…st shared root (M.1-M.5) Closes the biggest remaining product gap surfaced in O.27/O.28. Before this commit, `rhemify traces verify <id>` anchored a single trace's content hash to the daily PDA; the second trace of the day reported MISMATCH because the on-chain root was the first trace's hash. The Anchor program already had `merkle_root + trace_count` fields — the batching layer just wasn't there. New machinery (M.1): apps/server/internal/merkle/ Build/Path/Verify on a standard binary Merkle tree, SHA-256, odd- count duplicate-last-leaf padding. Domain separation: leaf prefix 0x00, node prefix 0x01 — second-preimage defense. 10 unit tests pin the contract (empty / 1-leaf / 2-leaf / 4-leaf / odd-count / wrong-leaf / wrong-root / range / domain-separation / bad-length). New Convex query (M.2): traces:listByFleetDate(fleet_id, date) → ordered list of valid-hex traces with leaf indices. Order is _creationTime asc so leaf positions are stable across requests. Skips pre-O.1 seeded traces whose trace_hash isn't a valid 64-char hex SHA-256 (those would break leaf hashing). New Go endpoint (M.3): GET /api/anchor/:fleetId/:date/merkle-proof?trace_id=X Builds the Merkle tree from Convex, returns: { fleet_id, date, trace_id, trace_hash, leaf_index, leaf_hash, root, trace_count, path: [{ hash, side }] } Server-side build because every trace must be a leaf — clients can't cheaply re-fetch all of them per-verify. CLI rewrite (M.4): rhemify traces verify <id> now: 1. Fetches proof from the new endpoint 2. Recomputes root from leaf + path locally (mirrors merkle.Verify) 3. Reads on-chain PDA root. Match → VERIFIED. 4. If on-chain root is stale (different from current Merkle root — happens when more traces have been added since last anchor), submits write_daily_root with new root + new trace_count. 
Render expanded: MERKLE PROOF section shows leaf_index / leaf_hash / proof-valid; ON-CHAIN section shows root match; audit-grade-proof paragraph at bottom shows a third-party auditor's verification recipe. Verified end-to-end on devnet (M.5): Trace trc_d2c948257c414f02 → leaf #8 of 12 → root 7a8e7a9e... anchored fresh, tx 5eiskSZH3Ww..., slot 461598835 Trace trc_4f362bd02f2249d9 → leaf #6 of 12 → root 7a8e7a9e... VERIFIED against existing on-chain root, no new tx Trace trc_27ec99bb2f324687 → leaf #9 of 12 → root 7a8e7a9e... VERIFIED, no new tx → three different traces, one shared root, one anchor tx total. The MISMATCH bug from before is gone. Roadmap entry in README updated to (shipped).
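A rough TypeScript sketch of the tree contract from M.1, for readers who want to recompute a root locally. It is not a port of apps/server/internal/merkle/. It assumes leaves are the raw 32-byte trace hashes, that an odd node is paired with itself at every level, and that `side` names where the sibling sits relative to the running hash. All function names are hypothetical.

```typescript
import { createHash } from "node:crypto";

type Side = "left" | "right"; // assumed meaning: position of the sibling
interface PathStep { hash: Buffer; side: Side }

const sha256 = (...parts: Buffer[]): Buffer => {
  const h = createHash("sha256");
  for (const p of parts) h.update(p);
  return h.digest();
};

// Domain separation: leaves and interior nodes hash under different prefixes.
const hashLeaf = (data: Buffer): Buffer => sha256(Buffer.from([0x00]), data);
const hashNode = (l: Buffer, r: Buffer): Buffer => sha256(Buffer.from([0x01]), l, r);

function buildLevels(leaves: Buffer[]): Buffer[][] {
  if (leaves.length === 0) throw new Error("cannot build a tree with no leaves");
  const levels: Buffer[][] = [leaves.map(hashLeaf)];
  while (levels[levels.length - 1]!.length > 1) {
    const prev = levels[levels.length - 1]!;
    const next: Buffer[] = [];
    for (let i = 0; i < prev.length; i += 2) {
      const left = prev[i]!;
      const right = prev[i + 1] ?? left; // odd count: pair the last node with itself
      next.push(hashNode(left, right));
    }
    levels.push(next);
  }
  return levels;
}

function merkleRoot(leaves: Buffer[]): Buffer {
  const levels = buildLevels(leaves);
  return levels[levels.length - 1]![0]!;
}

function merklePath(leaves: Buffer[], index: number): PathStep[] {
  if (index < 0 || index >= leaves.length) throw new RangeError("leaf index out of range");
  const path: PathStep[] = [];
  let i = index;
  for (const level of buildLevels(leaves).slice(0, -1)) {
    const isRight = i % 2 === 1;
    const sibling = level[isRight ? i - 1 : i + 1] ?? level[i]!;
    path.push({ hash: sibling, side: isRight ? "left" : "right" });
    i = Math.floor(i / 2);
  }
  return path;
}

function verifyPath(leaf: Buffer, path: PathStep[], root: Buffer): boolean {
  let acc = hashLeaf(leaf);
  for (const step of path) {
    acc = step.side === "left" ? hashNode(step.hash, acc) : hashNode(acc, step.hash);
  }
  return acc.equals(root);
}

// Three stand-in "trace hashes" (real ones would be 32-byte SHA-256 digests).
const demoLeaves = ["a", "b", "c"].map((s) => sha256(Buffer.from(s)));
const demoRoot = merkleRoot(demoLeaves);
console.log(demoRoot.toString("hex"), verifyPath(demoLeaves[2]!, merklePath(demoLeaves, 2), demoRoot));
```

The prefix bytes matter because without them an interior node's preimage (two concatenated hashes) could be presented as leaf data; the second-preimage defense mentioned in M.1 refers to exactly that.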

…wup) Adds the per-trace Merkle proof + shared-root verify row. Sits below the counterfactual replay row to mirror the demo flow ordering.

After M.1-M.5 the Merkle-proof verify works for any trace in the fleet+date, not just the first one of the day. Add it as the last per-command quickstart line so a contributor exploring by-hand sees the verifiable on-chain anchor step too.

… memo fallback (phase R) The biggest single remaining product gap. Where the memo executor proves intent, this one moves actual USDC from payer's ATA to recipient's via Token::TransferChecked. Settlement, not just intent. Cascade ordering (packages/sdk/src/execute/index.ts): x402SolanaTransferExecutor — real USDC x402SolanaExecutor — memo fallback executeWithCascade tries transfer first; canExecute or execute() failure falls through to memo. Demo always succeeds; production callers get real settlement when wallet has USDC. canExecute requirements: protocol = x402, Solana network, wallet has solanaPrivateKey, payTo is a sensible-length base58 string AND NOT the System Program '1111…1' placeholder (test 402 server's default; transfer declines so memo picks up). USDC mint constants: devnet 4zMMC9srt5Ri5X14GAgXhaHii3GnPAEERYPJgZJDncDU mainnet EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v decimals 6 (matches detection.priceRaw base units) No new dependency: SPL Token + ATA programs invoked via raw TransactionInstruction, same pattern as the memo executor. Hand-built discriminators — TransferChecked (12), ATA CreateIdempotent (1). 7 unit tests pin canExecute: ✓ true on solana-devnet + real recipient ✓ true on solana-mainnet too ✓ false for System-Program placeholder recipient ✓ false for empty / malformed payTo ✓ false without a Solana wallet ✓ false for non-x402 protocols (mpp) ✓ false for EVM networks (base) Cascade fallback verified e2e: rhemify pay http://localhost:3402/stock-data → System-Program recipient → transfer declines → memo runs → sig 4T4yJuVLgr… real on devnet. Live USDC settlement requires funding payer wallet (4FCi24Yy7CWw4V5B1UhGHbhDTvy18fryrG4rrtP2mcz3) with devnet USDC via faucet.circle.com (no programmatic faucet exists) + setting RECIPIENT_ADDRESS to a real-keypair pubkey on the test server. mppChargeTransferExecutor not started — same pattern, would slot in ahead of mppChargeExecutor in the cascade. Future chunk.
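The cascade behaviour described above (decline or failure falls through to the next executor) can be sketched as follows. This is a simplified illustration, not the code in packages/sdk/src/execute/index.ts: the `Executor` and `Detection` shapes are assumptions, the wallet argument to `canExecute` is omitted, and both executors are stubs that return fake signatures.

```typescript
// Simplified, hypothetical shapes; the SDK's real interfaces are not shown in this PR text.
interface Detection { protocol: string; network: string; payTo: string }
interface Executor {
  name: string;
  canExecute(d: Detection): boolean;
  execute(d: Detection): Promise<string>; // resolves to a tx signature
}

async function executeWithCascade(
  executors: Executor[],
  d: Detection,
): Promise<{ via: string; sig: string }> {
  const errors: string[] = [];
  for (const ex of executors) {
    if (!ex.canExecute(d)) continue; // declined: try the next executor
    try {
      return { via: ex.name, sig: await ex.execute(d) };
    } catch (err) {
      // Failed mid-execution: record the reason and fall through.
      errors.push(`${ex.name}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`no executor succeeded: ${errors.join("; ") || "none could execute"}`);
}

// The System Program address is 32 '1' characters in base58.
const SYSTEM_PROGRAM = "1".repeat(32);
const isSolanaX402 = (d: Detection): boolean =>
  d.protocol === "x402" && d.network.startsWith("solana");

const transferStub: Executor = {
  name: "transfer",
  canExecute: (d) => isSolanaX402(d) && d.payTo !== SYSTEM_PROGRAM,
  execute: async () => "stub-transfer-sig",
};
const memoStub: Executor = {
  name: "memo",
  canExecute: isSolanaX402,
  execute: async () => "stub-memo-sig",
};

const placeholderRun = executeWithCascade([transferStub, memoStub], {
  protocol: "x402",
  network: "solana-devnet",
  payTo: SYSTEM_PROGRAM,
});
placeholderRun.then((r) => console.log(r.via));
```

Ordering the array is the whole routing policy here: the first executor that both accepts and succeeds wins, so placing the transfer executor first gives real settlement whenever it is possible.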

…andard (phase R.MPP) Same shape and rationale as x402SolanaTransferExecutor (phase R), wired for the MPP cascade. Real USDC settlement first, memo intent fallback. Cascade ordering for MPP: mppChargeTransferExecutor — real USDC (NEW) mppChargeExecutor — memo intent fallback Differences from x402SolanaTransferExecutor: - Outgoing header is Authorization: Payment <base64> (MPP convention) not X-Payment (x402 convention). - PaymentPayload uses scheme=solana, no x402Version field — matches what mppChargeExecutor sends so a downstream parser sees the same shape across the cascade fallback. - Network list includes 'devnet' / 'mainnet-beta' legacy aliases (MPP WWW-Authenticate parsers sometimes yield these). Everything else is identical: same ATA derivation, same TransferChecked + CreateIdempotent instructions, same USDC mint constants, same System-Program decline. 6 new canExecute unit tests added (40 total in new-executors.test.ts): ✓ true on solana-devnet + real recipient ✓ true on legacy 'devnet' / 'mainnet-beta' network aliases ✓ false for System-Program placeholder recipient ✓ false without a Solana wallet ✓ false for non-mpp protocols (x402 routes through its own transfer) ✓ false for non-Solana networks (base) Cascade fallback verified e2e against MPP test endpoint: rhemify pay http://localhost:3402/analytics → System-Program recipient → transfer declines → memo runs → sig 3Gff8xeLxA… real on devnet. Closes the symmetric MPP gap; both standards (x402, mpp) now have real-USDC-with-memo-fallback executors. Live USDC e2e proof still requires user funding the payer ATA via faucet.circle.com.

… (phase E) Mirror of x402SolanaTransferExecutor for EVM chains. Real ERC-20 transfer(to, amount) on Base / Base Sepolia / Ethereum / Sepolia, with USDC contract addresses hardcoded per Circle's canonical deployments.

Cascade ordering for EVM x402 (packages/sdk/src/execute/index.ts):
  x402EvmTransferExecutor   real ERC-20 (NEW)
  x402EvmExecutor           legacy peer-dep variant (unproven)

Same canExecute-declines-placeholder pattern as the Solana pair: the test 402 server defaults RECIPIENT_ADDRESS to 0x...0001, which this executor declines so the cascade falls through cleanly.

What it does end-to-end:
- createWalletClient(privateKeyToAccount(wallet.evmPrivateKey))
- walletClient.writeContract calling USDC.transfer(recipient, amount)
- waitForTransactionReceipt to confirm status === 'success'
- x402-spec PaymentPayload with kind=erc20-transfer + tx hash
- HTTP retry with X-Payment header (same shape as Solana side)

USDC contracts (Circle's canonical deployments):
  base               0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913
  base-sepolia       0x036CbD53842c5426634e7929541eC2318f3dCF7e
  ethereum           0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48
  ethereum-sepolia   0x1c7D4B196Cb0C7B01d743Fbc6116a902379C7238

USDC has 6 decimals on Solana AND EVM, so detection.priceRaw is reused directly without re-scaling.
canExecute filters: - protocol = x402 - EVM network (base / base-sepolia / ethereum / ethereum-sepolia) - wallet has evmPrivateKey - detection.payTo is a real 0x-prefixed 40-hex AND NOT the 0x...0001 placeholder AND NOT the zero address 9 new canExecute unit tests, 49 tests total in the file: ✓ true on base-sepolia + real recipient + EVM key ✓ true on base mainnet ✓ true on ethereum / ethereum-sepolia ✓ false for 0x...0001 placeholder (test server default) ✓ false for zero address ✓ false for malformed / non-hex / ENS-style names ✓ false without EVM key (only solana key, or empty wallet) ✓ false for non-x402 protocols (mpp) ✓ false for Solana networks About Phantom: investigated whether 'just use Phantom' shortcuts EVM execution. Phantom is browser-extension first; for our CLI/server SDK we need programmatic signing. WalletConfig already has both solanaPrivateKey and evmPrivateKey as first-class fields — so a user who exports a Phantom private key (multi-chain) and drops it into the config gets the same effect. Phantom doesn't shortcut the executor work; the executor work IS what was missing. Not in this chunk: - CLI integration (no ~/.rhemify/evm-wallet.json yet) — user must construct WalletConfig manually to activate EVM today. - Live e2e against Base Sepolia — same gating as Phase R was for USDC on Solana: requires user-funded testnet account. Faucet flow: faucet.circle.com for USDC + a Base Sepolia ETH faucet for gas. Closes the audit's 'EVM unproven' line item at the code-path level. Live proof is the same opt-in shape as Phase R.7 was — user funds, we run the command, we get a real tx hash.
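The recipient filter from the canExecute list can be sketched as a small pure function. The name `isUsableEvmRecipient` and the `placeholders` parameter are hypothetical. The placeholder value passed in the usage lines is an assumed expansion of "0x...0001", since the notes above elide the middle of that address.

```typescript
const ZERO_ADDRESS = "0x" + "0".repeat(40);

// Hypothetical helper: true only for a 0x-prefixed 40-hex address that is
// neither the zero address nor one of the caller-supplied placeholders.
function isUsableEvmRecipient(payTo: string, placeholders: string[] = []): boolean {
  if (!/^0x[0-9a-fA-F]{40}$/.test(payTo)) return false; // rejects ENS names, short or non-hex input
  const lower = payTo.toLowerCase(); // compare case-insensitively (checksummed vs lowercase)
  if (lower === ZERO_ADDRESS) return false;
  return !placeholders.some((p) => p.toLowerCase() === lower);
}

// ASSUMPTION: expands the elided "0x...0001" as 39 zeros followed by 1.
const assumedPlaceholder = "0x" + "0".repeat(39) + "1";

console.log(isUsableEvmRecipient("0x036CbD53842c5426634e7929541eC2318f3dCF7e", [assumedPlaceholder]));
console.log(isUsableEvmRecipient(assumedPlaceholder, [assumedPlaceholder]));
console.log(isUsableEvmRecipient("vitalik.eth", [assumedPlaceholder]));
```

Keeping this check free of network or wallet access is what makes it cheap to pin with unit tests, as the nine canExecute tests above do for the real executor.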

…y in README (phase E.cli) CLI integration for the EVM transfer path shipped in phase E. Three parts: packages/cli/src/config.ts loadEvmWallet() reads ~/.rhemify/wallet-evm.json. Returns null when the file doesn't exist — EVM is opt-in, not required for the demo. Wallet shape: { privateKey: 0x-prefixed hex, address: 0x... } so the SDK's WalletConfig.evmPrivateKey can be wired without re-deriving address each call. packages/cli/src/commands/pay.ts Loads the EVM wallet when present and includes evmPrivateKey in the SDK's WalletConfig. Spread-with-conditional keeps the field absent when no EVM wallet exists (matters because x402EvmTransferExecutor's canExecute checks for wallet.evmPrivateKey presence, not just truthiness). Prints a one-line 'EVM wallet: 0x... (Base/Sepolia/ Ethereum capable)' confirmation when active. packages/cli/src/commands/status.ts New 'EVM Wallet' section with the funded-via instructions. Helps a contributor see the live e2e path is wired without having to grep config.ts. README.md Added explicit 'Signing model — ows only' section to the 'What is NOT in v1' surface. This is the security-honesty move flagged in the latest review: the demo uses Own Wallet Signing (agent holds raw key) gated by the 6-rule client-side policy engine. That's appropriate for bounded-budget testnet/production agents but NOT for treasury-scale fleets. Squads / Ika 2PC-MPC / Privy passkey instruments are registered in the path resolver as stubs precisely because they're the production-grade signing paths — the audit surface acknowledges that ows is the demo instrument, not the production recommendation. The temp EVM wallet at 0x0E250EF30E837d3b19F42029e62edc854A7011a1 was generated via viem.generatePrivateKey() into ~/.rhemify/wallet-evm.json with 0600 perms. Sending Base Sepolia ETH + USDC there activates the live x402EvmTransferExecutor demo path.
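The presence-versus-truthiness point in the pay.ts note is easy to miss, so here is a small sketch of the spread-with-conditional pattern. The `WalletConfig` and `EvmWallet` shapes are reduced to the fields named above, and `buildWalletConfig` is a hypothetical helper, not the code in pay.ts.

```typescript
// Reduced, assumed shapes; only the field names come from the commit notes.
interface EvmWallet { privateKey: string; address: string }
interface WalletConfig { solanaPrivateKey: string; evmPrivateKey?: string }

function buildWalletConfig(solanaPrivateKey: string, evm: EvmWallet | null): WalletConfig {
  return {
    solanaPrivateKey,
    // Spreading {} adds no key at all, whereas `evmPrivateKey: evm?.privateKey`
    // would add the key with the value undefined.
    ...(evm ? { evmPrivateKey: evm.privateKey } : {}),
  };
}

const withoutEvm = buildWalletConfig("solana-key", null);
const withEvm = buildWalletConfig("solana-key", { privateKey: "0xabc", address: "0xdef" });

console.log("evmPrivateKey" in withoutEvm, "evmPrivateKey" in withEvm);
```

An executor that tests `"evmPrivateKey" in wallet` (presence) would treat `{ evmPrivateKey: undefined }` as an EVM-capable wallet, which is the case the conditional spread avoids.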

Previously /weather hardcoded network=base-sepolia + payTo=0x...0001. For a payer with Ethereum Sepolia ETH (separate chain from Base Sepolia — same 'Sepolia' name but different testnets), we need to flip the network. Made both configurable: EVM_NETWORK=ethereum-sepolia bun run server.ts EVM_RECIPIENT=0xYourRealAddress bun run server.ts Defaults stay backward-compatible (base-sepolia + 0x...0001) so existing local invocations continue to work. The EVM_RECIPIENT shape is checked by x402EvmTransferExecutor.canExecute — anything that isn't a real 0x-prefixed 20-byte address (or is the 0x...0001 / 0x...0000 placeholder) declines and the cascade falls through.

…t flow (phase X) Closes the spec divergence discovered when testing against x402.org's production endpoint. Before this commit, x402SolanaTransferExecutor always broadcast first and handed off afterwards. The canonical x402 flow is to sign without broadcasting and let the facilitator pay the gas, verify, and broadcast atomically.

Empirically:
  Pre-fix attempt against x402.org: sig 4fWkbh97H72B... moved a REAL 0.01 USDC to facilitator CKPKJWNd..., but x402.org's resource didn't validate our X-Payment payload (single signature string, not the signed-tx-bytes the facilitator expects), so we 402'd. Funds lost.
  Post-fix attempt against x402.org (same endpoint): transfer executor partial-signed with feePayer=facilitator and NEVER broadcast, so funds were preserved. Cascade fell to memo (which also rejects). 0 USDC lost.

The new flow is FAIL-SAFE on rejection.

Concrete changes:

packages/sdk/src/types.ts
  DetectionResult gains optional feePayer + asset fields. feePayer is the spec's extra.feePayer (the facilitator pubkey that must pay gas + broadcast). asset is the canonical mint/contract address from the 402 response.

packages/sdk/src/detect/x402.ts
  Extracts extra.feePayer and req.asset from the 402 response shape. Existing CAIP normalization preserved.

packages/sdk/src/execute/x402-solana-transfer.ts
  Two paths now:
  - facilitator mode (detection.feePayer set):
      tx.feePayer = facilitator pubkey
      tx.partialSign(payer)
      serialize({requireAllSignatures:false}) → base64
      PaymentPayload with payload.transaction = base64-bytes
      POST X-Payment, facilitator broadcasts on its own gas.
      DOES NOT touch chain ourselves; funds only move if 200 is returned.
  - self mode (no facilitator): unchanged from previous version (sign + broadcast + retry).
  Also uses detection.asset over the hardcoded USDC mint, and echoes back the CAIP network identifier in PaymentPayload (toCaipNetwork helper), since the facilitator's validator matches the original string from the 402 response, not our normalized name.
What x402.org's persistent 402 means (not our bug): their 402 response lists both Base Sepolia AND Solana Devnet acceptance, and their facilitator pubkey CKPKJWNdJEqa... has been receiving prior x402-Solana payments (0.491 → 0.501 USDC from our pre-fix attempt). But their resource server doesn't return 200 for Solana, suggesting their Solana facilitator backend is listed-but-not-yet-active. The new client flow will work against any spec-compliant Solana facilitator that does implement verification. The safety-on-rejection property is the main improvement: a partial- signed tx that x402.org refuses to broadcast is just abandoned — no on-chain effect, no funds lost. Same outcome as a network error. Interop status: ✓ wire-format spec-compliant, ✗ end-to-end against x402.org (blocked on their facilitator). Testing against any other Solana x402 endpoint with a working facilitator would close the loop.

…YMENT-SIGNATURE header

Wires three corrections that were preventing x402-svm facilitator-mediated flows from settling. All three are required together; missing any single one leaves the resource at a silent 402 with no diagnostic.

1. PAYMENT-SIGNATURE header (was X-Payment). v2 x402 resources read the payment payload from `PAYMENT-SIGNATURE`; `X-Payment` is the v1 header name and v2 servers ignore it (returning the same 402+menu to every input, including no header at all, which is the symptom the empirical retry against x402.org/protected was producing). Source: @x402/core http/x402HTTPClient.ts:encodePaymentSignatureHeader switches header name on x402Version. Self-broadcast mode (local test-402 server) keeps X-Payment since the server is v1-shaped.

2. PaymentPayload shape `{ x402Version, accepted: PaymentRequirements, payload: { transaction } }` (was flat scheme+network at top level). v2 findMatchingRequirements matches `paymentPayload.accepted` against `accepts[]`; flat scheme/network produces "No matching payment requirements". `accepted.amount` MUST be a string ("10000"), not a number; the wrong type also causes match failure.

3. v0 VersionedTransaction with feePayer = facilitator pubkey, partial-sign payer only, base64 wire bytes in payload.transaction. Mirrors @x402/svm exact/client/scheme.ts:111-182 byte-for-byte. Self-broadcast keeps the legacy Transaction path.

Supporting hardcode removals discovered while debugging:

- USDC_DECIMALS = 6 → now reads `mintInfo.data[44]` (the SPL Token mint layout's decimals byte). Canonical does `fetchMint(asset).data.decimals`; same idea. Hardcoded 6 worked only because all our tests used USDC; any non-6-decimal SPL mint would either get rejected at facilitator verify or transfer off-by-10^N.
- Mint fallback to DEVNET_USDC_MINT in facilitator mode → removed. Throws ExecutionError if 402.extra.asset is absent (canonical client behavior). Self-broadcast keeps the USDC fallback since the test-402 server omits asset for ergonomics.
- Math.random() nonce fallback → removed. crypto.getRandomValues only.

ATA-create-idempotent is now gated to self-broadcast mode only. In facilitator mode the @x402/svm verify rejects any ix at position 0 that isn't ComputeUnitLimit (`invalid_exact_svm_payload_transaction_instructions_length`), so prepending an ATA-create breaks the ix ordering the facilitator requires.

ComputeBudget ixs (setComputeUnitLimit=20000, setComputeUnitPrice=1µL) + Memo ix (seller's extra.memo bytes if present, else 16-byte random hex nonce) are added in facilitator mode at positions 0, 1, and 3 to match the canonical client. Memo bytes pass through detection.memo (new field surfaced from extra.memo for facilitators that need a server-pinned memo for byte-for-byte verification).

E2E proof on Solana devnet against https://www.x402.org/protected:

  HTTP 200 OK
  payer: 8usJ1ShvoR3e74E6WMaNk2owwGUf87MuCuBJHdPgdEnQ
  facilitator: CKPKJWNdJEqa81x7CkZ14BVPiY6y16Sxs7owznqtWYp5
  settle sig: 2GWjFrZaANB5rM6hHzXtuxtXrLACNvP68kgQYLosPpwiMWi7UpSv1e9Zo285dQ5qfXND6xc28iDsWrp6rwhZqT4p
  USDC delta: 0.59 → 0.58 (0.01 USDC moved by facilitator, not by us)
  Trace: trc_08c7f6890d7e4677 (via tools/test-402/e2e-pay-test.ts Test 3)
  Explorer: https://explorer.solana.com/tx/2GWjFrZaANB5rM6hHzXtuxtXrLACNvP68kgQYLosPpwiMWi7UpSv1e9Zo285dQ5qfXND6xc28iDsWrp6rwhZqT4p?cluster=devnet

Facilitator-broadcast is now the canonical x402 v2 client path on Solana.
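
The wire-level corrections above can be illustrated with a few small helpers. This is a minimal sketch, not the SDK's code: the interface and function names are hypothetical, the requirements shape is trimmed to the fields the commit message mentions, and the nonce helper assumes "16-byte random hex nonce" means 16 random bytes rendered as 32 hex characters. It also assumes a runtime with a global Web Crypto `crypto` object (recent Node, Bun, or a browser).

```typescript
// Hypothetical, trimmed PaymentRequirements: only fields named in the commit message.
interface PaymentRequirementsSketch {
  scheme: string;
  network: string; // echoed back exactly as the 402 response spelled it
  asset: string;
  amount: string;  // must stay a string such as "10000"
}

interface PaymentPayloadV2Sketch {
  x402Version: 2;
  accepted: PaymentRequirementsSketch;
  payload: { transaction: string }; // base64 wire bytes of the partial-signed tx
}

// v1-shaped servers read X-Payment; v2 resources read PAYMENT-SIGNATURE.
function paymentHeaderName(x402Version: number): string {
  return x402Version >= 2 ? "PAYMENT-SIGNATURE" : "X-Payment";
}

function buildV2Payload(
  accepted: PaymentRequirementsSketch,
  transactionBase64: string,
): PaymentPayloadV2Sketch {
  // Guard against untyped JSON where amount arrived as a number.
  if (typeof (accepted.amount as unknown) !== "string") {
    throw new TypeError("accepted.amount must be a string");
  }
  return { x402Version: 2, accepted, payload: { transaction: transactionBase64 } };
}

// In the SPL Token mint layout the decimals byte sits at offset 44
// (36-byte optional mint authority, then an 8-byte supply).
const MINT_DECIMALS_OFFSET = 44;

function readMintDecimals(mintData: Uint8Array): number {
  const decimals = mintData[MINT_DECIMALS_OFFSET];
  if (decimals === undefined) throw new RangeError("mint account data too short");
  return decimals;
}

// Nonce from the platform CSPRNG only; there is no Math.random() fallback.
function randomHexNonce(byteLength: number = 16): string {
  const bytes = crypto.getRandomValues(new Uint8Array(byteLength));
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}
```

A caller would pick the header with `paymentHeaderName(version)` and send the JSON-encoded payload in it; the exact header encoding is left out because the commit message does not spell it out.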

…) + signer + drain race)

Three bugs that combined to silently lose Layer-1 anchors on every CLI/script payment. Per docs/stack/02-convex.md, payment_traces.anchor_tx_hash must hold the Solana Memo tx signature for each trace; before this change, that field stayed null for any rhemify CLI invocation.

1. Rhemify client never exposed a drain method. The AnchorQueue runs flush() on a 2s background tick. Short-lived processes (rhemify CLI, scripts, one-shot jobs) exit before the tick fires, killing the in-flight Memo tx and Convex PATCH mid-await. Long-running services were fine.

   Added Rhemify.close(), which awaits anchorQueue.drain() so the queue empties before the caller continues. CLI's `rhemify pay` now awaits it before exit (success + error paths). Long-running services can ignore it.

2. AnchorQueue.flush() race with the background timer. The old guard `if (this.processing) return` made drain() bail when the background timer was already mid-`processBatch`. Drain returned, CLI exited, the in-flight RPC + PATCH got torn down. Pending=0 (items were already spliced out) made it look like drain succeeded.

   Replaced with an `inflight: Promise<void>` that re-entrant callers join: drain awaits the existing work, then runs another pass if items remain.

3. Memo tx was built with `setTransactionMessageFeePayer(signer.address)`. @solana/kit's signTransactionMessageWithSigners needs the signer object, not just the address, to actually sign the fee-payer slot. The tx came back "missing signatures for addresses: <fee-payer>" and got rejected at send.

   Swapped to `setTransactionMessageFeePayerSigner(signer)`, which registers both the address and the signer.

Also: AnchorQueue now awaits transport.updateTraceAnchor instead of fire-and-forget, because drain() can only honor its contract if the Convex patch lands before drain returns. Persistence failures are routed through onError without failing the batch (the Memo tx itself succeeded; re-attaching is recoverable).
Verified end-to-end against https://www.x402.org/protected on Solana devnet:

  payment tx (x402 v2 facilitator settled 0.01 USDC):
    PhMsmnjJNaXeqcbhnoXPahtK9PNuEJ2Dohebrids7n8C6eVNMG2wHTMyhc9xX4s7kBz4vu34AzdC1s8R2U3v2no
  anchor tx (Layer-1 Memo, trace hash 20d3c132...):
    4ESUQYmySjYarCDT3mFwdd9bzsWZ8mPRhnQCuVnsT2ijz8HYPPUYnE56YRSYwTtjyJVP8dysy68HPUdBbRYp5cmp
  rhemify traces show trc_4b71f5ceb193485b → both txs render with explorer links

Convex payment_traces.anchor_tx_hash now populates for every `rhemify pay`.
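
The `inflight: Promise<void>` fix for the drain race can be sketched with a stripped-down queue. This is an illustration of the pattern only, not the SDK's AnchorQueue: `InflightQueueSketch` and its `processBatch` callback are hypothetical, and the real queue also handles the background timer, retries, and onError routing.

```typescript
type BatchFn<T> = (items: T[]) => Promise<void>;

class InflightQueueSketch<T> {
  private items: T[] = [];
  private inflight: Promise<void> | null = null;
  private processBatch: BatchFn<T>;

  constructor(processBatch: BatchFn<T>) {
    this.processBatch = processBatch;
  }

  enqueue(item: T): void {
    this.items.push(item);
  }

  // Re-entrant callers join the pass that is already running
  // instead of returning early as the old boolean guard did.
  flush(): Promise<void> {
    if (this.inflight) return this.inflight;
    if (this.items.length === 0) return Promise.resolve();
    const batch = this.items.splice(0, this.items.length);
    const run = Promise.resolve()
      .then(() => this.processBatch(batch))
      .finally(() => {
        this.inflight = null;
      });
    this.inflight = run;
    return run;
  }

  // Await the existing work, then run another pass if items remain.
  async drain(): Promise<void> {
    while (this.inflight || this.items.length > 0) {
      await this.flush();
    }
  }
}
```

With this shape, an empty `items` array no longer implies that the queue is idle: `drain()` only resolves once `inflight` has cleared as well.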

…ses real fleet config

Two real bugs that combined to drop Layer-1 anchors in short-lived scripts even after the previous close()/drain race fix landed:

1. transport.ingestPayment was fire-and-forget AND untracked by close(). The CLI's natural delay between pay()-return and close()-call masked it (~3-5s of console.log + Memo tx build), but the e2e harness exits Test 3 instantly. close() drained the anchor queue, the Memo tx fired on-chain successfully, and then the updateTraceAnchor PATCH hit Convex BEFORE the trace document existed there. Convex's traces:updateAnchor throws "Trace not found" in that window; the queue's PATCH retry path swallows it; anchor_tx_hash stays null forever.

   Fix: client tracks every in-flight ingest promise. close() awaits them all (Promise.allSettled) BEFORE draining the anchor queue, so the trace document is always durable in Convex when the PATCH fires. Self-cleaning on settle so long-running sessions don't accumulate references.

2. tools/test-402/e2e-pay-test.ts used hardcoded test-fleet-key / fleet-e2e-test that never resolved in Convex's fleets table, so every ingest + anchor PATCH 401'd silently via FleetAPIKeyAuth; the harness's traces never landed in Convex at all. Also called process.exit(1) before awaiting close(), killing the AnchorQueue's flush mid-Memo-tx.

   Fix:
   - Load fleetApiKey / fleetId / agentId from ~/.rhemify/config.json (the same source the production CLI uses, written by `rhemify onboard`).
   - Load Solana wallet from ~/.rhemify/wallet.json instead of repo-root .test-wallet.json (single onboarded credential set).
   - Fail loud with onboard guidance if either file is missing.
   - Move `await rhemify.close()` inside main() before setting exitCode so Node drains the event loop instead of exiting hot.
End-to-end verified on Solana devnet via `bun run tools/test-402/e2e-pay-test.ts`:

  trace trc_9bc95564c3b54952
  payment tx 39CwYkR8w6uBsj7aCvCvv2m3zbqbV6KLcoAdNfsFXkYn4GMbLmtreQh2qVWBc1SrNfaCAiecfE3zHTtkaR2YGBhn
  anchor tx 2Pkc6En3ZADiz7oCiUQYokXNYMpTnime62TvhKTi1u2VcVYkY41UYYYK1dCHYrpXtSTAuW1KRsFPGUEyYgvHefef
  rhemify traces show trc_9bc95564c3b54952 → both explorer links render

The harness now exercises the same auth path production traffic does and produces durable Convex state per-run, not silently-skipped 401s.
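
The ingest-tracking fix (await every in-flight ingest before draining the anchor queue) follows a common pattern that can be sketched as below. The class and function names are hypothetical stand-ins; only the use of `Promise.allSettled`, the self-cleaning on settle, and the ingest-then-drain ordering are taken from the commit message.

```typescript
class IngestTrackerSketch {
  private pending = new Set<Promise<unknown>>();

  // Register a fire-and-forget promise; it removes itself once it settles,
  // so long-running sessions do not accumulate references.
  track<T>(work: Promise<T>): Promise<T> {
    this.pending.add(work);
    const forget = (): void => {
      this.pending.delete(work);
    };
    work.then(forget, forget);
    return work;
  }

  get size(): number {
    return this.pending.size;
  }

  // Wait for everything currently in flight, whether it succeeds or fails.
  async settleAll(): Promise<void> {
    await Promise.allSettled(Array.from(this.pending));
  }
}

// Ordering used by the hypothetical close(): ingests first, anchors second,
// so the trace document exists before any anchor PATCH is sent.
async function closeSketch(
  tracker: IngestTrackerSketch,
  drainAnchors: () => Promise<void>,
): Promise<void> {
  await tracker.settleAll();
  await drainAnchors();
}
```

`Promise.allSettled` is used rather than `Promise.all` so that one failed ingest does not skip the anchor drain for the traces that did land.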

…ace, fix dry-run cap

Three small follow-ups from the chunk-4 audit:

1. packages/sdk/src/anchor/memo.ts: remove the file-wide `@ts-nocheck`. The pipe chain is now fully type-checked via `import type * as SolanaKit from "@solana/kit"`. Two narrowly-scoped `as never` casts remain at `sendAndConfirmTransactionFactory` and the signed-tx argument because `@solana/kit`'s cluster-brand (`'~cluster': "mainnet" | "devnet" | ...`) and lifetime-brand (`Blockhash | DurableNonce`) widen across runtime `string` rpcUrl values, and the overload picker can't choose without a compile-time literal. Each cast has a one-line justification inline. Net: ~95% of the file is type-safe at compile time and the only `@ts-expect-error -- optional peer dep` comments are gone (the package is in regular `dependencies`, so import types work directly).

   Also switched the single-ix append loop to `appendTransactionMessageInstructions` (plural); the lib's variadic helper sidesteps per-iteration generic widening that was breaking type inference in the loop body.

2. packages/sdk/test/anchor.test.ts: regression test for commit 9b04d89's drain race fix. Asserts that `AnchorQueue.drain()` waits for an in-flight `processBatch` to complete before resolving, even when the queue is empty (items already spliced out). Holds `transport.updateTraceAnchor` open with a manual barrier so the race is deterministic in unit test time. Without the chunk-4 `inflight: Promise<void>` tracker, this test would fail because drain() would see `queue.length === 0` and return prematurely, silently losing the Memo tx's Convex PATCH.

3. tools/test-402/e2e-pay-test.ts: Test 1 fixture fix. The local test-402 server's /stock-data advertises $0.50 to exercise the budget cap path; the harness's `defaultMaxBudget` is $0.05. Pass `maxBudget: "$1.00"` per-call for the dry run so the pipeline can run end-to-end. Test 3's safety cap ($0.02) stays explicit and unchanged.
Verified:
- bun run check-types → all 4 packages pass
- bun run build → SDK 116KB CJS / 113KB ESM
- bun run test → 170 passed, 0 failed (anchor.test.ts now 8 tests)
- bun run tools/test-402/e2e-pay-test.ts → 3 passed, 0 failed
- rhemify traces show trc_b17556d6c0634a65 → both payment + anchor render
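
The "manual barrier" used by the regression test is a general trick for making an async race deterministic. A minimal sketch follows; `createBarrier` and `barrierDemo` are hypothetical names, and the `patch` function merely stands in for a transport call such as `updateTraceAnchor` being held open.

```typescript
interface Barrier {
  wait: Promise<void>;
  release: () => void;
}

// A promise the test controls: nothing awaiting `wait` proceeds until `release()`.
function createBarrier(): Barrier {
  let release: () => void = () => {};
  const wait = new Promise<void>((resolve) => {
    release = resolve;
  });
  return { wait, release };
}

// Records the order of events when a stand-in async call is held open.
async function barrierDemo(): Promise<string[]> {
  const log: string[] = [];
  const barrier = createBarrier();

  // Stand-in for the transport call the test wants to keep in flight.
  const patch = async (): Promise<void> => {
    log.push("patch:start");
    await barrier.wait;
    log.push("patch:end");
  };

  const inflight = patch();
  const drained = inflight.then(() => {
    log.push("drain:resolved");
  });

  log.push("before-release"); // the test would assert "not drained yet" here
  barrier.release();
  await drained;
  return log;
}
```

Because the held call cannot finish until the test releases it, the "drain has not resolved yet" assertion does not depend on timers or sleep durations.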

Pre-PR housekeeping. Three things a senior reviewer would reject on sight:

1. Compiled Go binaries in tree (`apps/server/bin/server` 18MB, `apps/server/seed` 8MB). Belong in CI artifacts, not source. Added `apps/server/bin/` and `apps/server/seed` to .gitignore + `git rm --cached`.

2. Local SQLite dev db (`apps/web/local.db`) tracked despite being listed in .gitignore. The earlier .gitignore entry only matched the repo-root `local.db`; the apps/web copy was caught by `git add` before the ignore reached it. Added `apps/web/local.db` explicit path + `git rm --cached`.

3. `apps/web/public/ascii-animation (1).mp4`: filename suggests a re-downloaded duplicate. Renamed to `ascii-animation.mp4`; Hero.tsx updated to reference the clean URL (drops the `%20(1)` encoding).

No runtime change, only file management. Web build typecheck still passes.

…alone TUI

Audit-payment-rail surface now lives in the team's existing dashboard at apps/web/src/routes/dashboard/: same TanStack Start app, same dark theme, same Convex `useQuery` data layer (matches Jun Shen's pattern in use-agents / use-transactions). The parallel `apps/tui/` terminal dashboard is removed: it was a separate surface we built, while the team's product is the React dashboard.

Added:

  apps/web/src/lib/hooks/use-traces.ts
    Two hooks, useTraces(filters) and useTraceByTraceId(id), backed by Convex traces:listAll and traces:getByTraceId, the same queries the CLI's `rhemify traces list/show` commands already render against.

  apps/web/src/routes/dashboard/traces.tsx
    Browse view. Mirrors rhemify CLI's `traces list`: sortable header, blocked-only filter, limit selector, deep links to the detail route.

  apps/web/src/routes/dashboard/traces.$traceId.tsx
    Full decision-context view. Mirrors `rhemify traces show <id>` 7-section render: TRACE / EVENT / POLICY / PATH / SNAPSHOT / VERIFIABILITY / NEXT. Payment + anchor txs link to Solana explorer.

Wired:

  Sidebar nav gets a "Traces" entry between Approvals and the Agents list. TITLE_MAP gets "/dashboard/traces" → "Decision traces"; the route-id matcher recognises traces.$traceId for "Trace detail" header. routeTree.gen.ts regenerated by vite to include the two new routes.

Removed:

  apps/tui/ (package.json, scripts/seed.ts, src/convex-client.ts, src/index.tsx, tsconfig.json). It was an OpenTUI terminal dashboard streaming Convex. Functional but a parallel surface to the team's React dashboard. Decision traces now integrate into theirs. README.md repo-tree dropped the apps/tui line.

Verified:

  bun run check-types ← all 4 packages pass
  cd apps/web && bunx tsc --noEmit ← only pre-existing sidebar `/dashboard/agent/${id}` template-literal warning remains (predates this branch).
  bun run build ← vite SSR build success, 7s
  routeTree.gen.ts ← DashboardTracesRoute + DashboardTracesTraceIdRoute imports + paths registered

Browser test: pages render correctly client-side. SSR throws "fetch failed" on EVERY dashboard route, a pre-existing local-dev issue (root loader's auth-session fetch hits a config gap when Convex is local-only). Not introduced by this commit; identical behaviour against /dashboard, /dashboard/policies, etc. on main. Production deploy with a real Convex deployment URL renders cleanly.

Follow-up: replay-button + override-form on the detail page (today the CLI handles counterfactuals; dashboard surfaces the command). Jun Shen's call.

LingSiewWin added 30 commits May 10, 2026 22:56

LingSiewWin added 30 commits May 11, 2026 17:41

docs(readme): 'what works' table covers Merkle verify (post-M.5 follo…

1707996

…wup) Adds the per-trace Merkle proof + shared-root verify row. Sits below the counterfactual replay row to mirror the demo flow ordering.

updated readme.md

38a76cd

LingSiewWin wants to merge 67 commits into main from feature/siewwwin

LingSiewWin commented May 12, 2026


1 participant
