1 change: 1 addition & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

8 changes: 5 additions & 3 deletions agent/flow-trace/00_INDEX.md
@@ -176,8 +176,8 @@ _Found during source-code cross-referencing of these trace documents._
| 4 | `activate()` calls `register()` → `registerOperator()` which has `require(!registered, AlreadyRegistered())`. So activate **reverts** for already-registered operators. It only works for re-registration after deregistration. | BondingRegistry.sol:308 | 01_REGISTRATION |
| 5 | `E3Requested` event is `(uint256 e3Id, E3 e3, IE3Program indexed e3Program)` — seed and params are inside the E3 struct, not separate parameters. | IEnclave.sol:82 | 03_E3_REQUEST |
| 6 | `finalizeCommittee()` checks `>=` deadline, not `>`. | CiphernodeRegistryOwnable.sol | 03_E3_REQUEST |
| 7 | `publishCommittee()` is now permissionless. The effective access control is the C5 proof verification plus the single-publish guard `publicKeyHashes[e3Id] == 0`; the old `onlyOwner` note is obsolete. | CiphernodeRegistryOwnable.sol | 04_DKG |
| 8 | `CommitteePublished` event emits `(e3Id, nodes, publicKey, proof)` — full PK bytes and C5 proof, not just pkHash. | CiphernodeRegistryOwnable.sol | 04_DKG |
| 7 | `publishCommittee()` is now permissionless. The effective access control is DKG proof verification plus the single-publish guard `publicKeyHashes[e3Id] == 0`; the old `onlyOwner` note is obsolete. | CiphernodeRegistryOwnable.sol | 04_DKG |
| 8 | `CommitteePublished` event emits `(e3Id, nodes, publicKey, pkCommitment, proof)` — full PK bytes, pkCommitment, and proof bytes (DkgAggregator when proof aggregation is enabled), not just pkHash. | CiphernodeRegistryOwnable.sol | 04_DKG |
| 9 | `_validateNodeEligibility` calls `bondingRegistry.getTicketBalanceAtBlock()` (not `ticketToken.getPastVotes()` directly). | CiphernodeRegistryOwnable.sol:668 | 03_E3_REQUEST |
| 10 | Lane A slashing uses **attestation-based** verification (committee quorum votes), not direct ZK proof re-verification on-chain. `proposeSlash()` decodes voter addresses, agrees, data hashes, and ECDSA signatures — not ZK proofs. | SlashingManager.sol | 05_FAILURE |

@@ -186,8 +186,10 @@ _Found during source-code cross-referencing of these trace documents._
| # | Concern | Severity | Detail |
| --- | ---------------------------------------- | -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| 1 | **Deregister-before-slash race** | Accepted | SlashingManager Lane B (evidence+appeal) has a window during which the operator can deregister and claim their exit. If they do, the slash executes against 0 funds. The contract comments acknowledge this as an accepted tradeoff for the appeal window design. |
| 2 | **Committee publication decentralized** | Resolved | `publishCommittee()` is permissionless. Off-chain role selection chooses the active aggregator, while on-chain C5 proof verification and the single-publish guard prevent invalid or duplicate committee publication. |
| 2 | **Committee publication decentralized** | Resolved | `publishCommittee()` is permissionless. Off-chain role selection chooses the active aggregator, while on-chain DKG proof verification and the single-publish guard prevent invalid or duplicate committee publication. |
| 3 | **`gracePeriod` is dead code** | Medium | `gracePeriod` is stored and validated during config updates but never actually used in any timeout check. Either the deadlines already bake in sufficient buffer, or this is a missing feature. |
| 4 | **`activate` CLI command is misleading** | Low | Named "activate" but actually calls "register" — will fail for already-registered operators. There's no standalone way to trigger re-evaluation of active status; instead, `_updateOperatorStatus()` runs automatically inside `addTicketBalance()`, `bondLicense()`, etc. |
| 5 | **Active-job load balancing bug fixed** | Info | The Rust `NodeStateStore.available_tickets()` subtracts `active_jobs` from total tickets, reducing the chance of busy nodes being selected for new E3s. Previously, the `Sortition` actor's `Handler<EnclaveEvent>` was missing match arms for `E3Failed` and `E3StageChanged`, causing these events to fall to the default `_ => ()` — the typed handlers for decrementing jobs were dead code. This has been fixed: E3Failed and E3StageChanged are now routed to their handlers, and `finalized_committees` is cleaned up in `decrement_jobs_for_e3` to prevent unbounded memory growth. |
| 6 | **Committee member expulsion** | Info | `SlashingManager` can call `expelCommitteeMember()` mid-DKG. The `Sortition` actor enriches the raw `CommitteeMemberExpelled` event with the expelled member's `party_id` (resolved from its stored `Committee` list) and re-publishes it. `ThresholdKeyshare` then uses the enriched `party_id` to update its collectors, potentially completing DKG with fewer parties. `ThresholdKeyshare` itself does not hold committee state. |
| 7 | **PublicKeyAggregator failure bridge fixed** | Info | `PublicKeyAggregator` now routes `ComputeRequestError` for `ZkRequest::DkgAggregation` and converts mixed Some/None honest NodeFold proof sets into `E3Failed { failed_at_stage: CommitteeFinalized, reason: DKGInvalidShares }`, preventing PK aggregation stalls that previously surfaced only as `EnclaveError`. |
| 8 | **ThresholdPlaintextAggregator failure bridge fixed** | Info | `ThresholdPlaintextAggregator` now routes `ComputeRequestError` for `CalculateThresholdDecryption` and `DecryptionAggregation`, and fatal C6/C7/decryption-aggregation checks now emit `E3Failed { failed_at_stage: CiphertextReady, reason: DecryptionInvalidShares }` instead of halting locally under `trap()`. |
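The load-balancing fix in concern 5 can be illustrated with a minimal Rust sketch. It assumes a simplified node record and store: `NodeState`, `Store`, the `String` node keys, and the use of `saturating_sub` are all hypothetical stand-ins, not the actual `NodeStateStore` or `Sortition` types. Only the names `available_tickets`, `active_jobs`, `finalized_committees`, and `decrement_jobs_for_e3` come from the table above.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified per-node record (not the real NodeStateStore entry).
struct NodeState {
    tickets: u64,
    active_jobs: u64,
}

/// Busy nodes offer fewer tickets to sortition. Saturating subtraction is an
/// assumption made here so the value never underflows.
fn available_tickets(node: &NodeState) -> u64 {
    node.tickets.saturating_sub(node.active_jobs)
}

/// Hypothetical store holding node state plus the committee chosen per E3.
struct Store {
    nodes: HashMap<String, NodeState>,
    finalized_committees: HashMap<u64, Vec<String>>,
}

impl Store {
    /// Called when an E3 fails or completes: release one job per committee
    /// member and drop the committee entry so the map cannot grow unboundedly.
    fn decrement_jobs_for_e3(&mut self, e3_id: u64) {
        if let Some(members) = self.finalized_committees.remove(&e3_id) {
            for member in members {
                if let Some(node) = self.nodes.get_mut(&member) {
                    node.active_jobs = node.active_jobs.saturating_sub(1);
                }
            }
        }
    }
}
```

If `E3Failed` and `E3StageChanged` never reach the code path that calls `decrement_jobs_for_e3`, `active_jobs` only ever grows, which is the dead-handler symptom the row describes.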
93 changes: 83 additions & 10 deletions agent/flow-trace/04_DKG_AND_COMPUTATION.md
@@ -44,7 +44,12 @@ CiphernodeSelected event arrives at ThresholdKeyshare
│ │ → These collectors start immediately so early peer keys/shares can
│ │ be buffered while this node is still finishing earlier DKG phases
│ │
│ └─ Each collector has a timeout (60s for keys, 120s for shares)
│ └─ Collector timeouts are derived from the DKG stage budget:
│ ├─ shared base window from `E3_DKG_WINDOW_SECS` (default 7200s,
│ │ matching current production `Enclave` deployment config)
│ ├─ EncryptionKeyCollector cutoff at 10% of the DKG window
│ ├─ ThresholdShareCollector cutoff at 60% of the DKG window
│ └─ per-collector env vars still override these derived defaults
Comment on lines +47 to +52

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Confirm ThresholdKeyshare actually emits E3Failed for the stages described in 04_DKG_AND_COMPUTATION.md.
fd -t f 'threshold_keyshare.rs|threshold_keyshare/' -X rg -nP -C3 'E3Failed|EncryptionKeyCollectionFailed|ThresholdShareCollectionFailed|CalculateDecryptionShare|CalculateDecryptionKey|GenEsiSss|GenPkShareAndSkSss' {} \;
rg -nP --type=rust -C2 'E3Failed\s*\{' -g '!crates/aggregator/**'

Repository: gnosisguild/enclave

Length of output: 19851


🏁 Script executed:

# Look at the full implementation of computation response handlers in threshold_keyshare.rs
fd -type f -name 'threshold_keyshare.rs' -x sh -c '
  echo "=== handle_gen_pk_share_and_sk_sss_response ==="
  sed -n "914,960p" "$1"
  echo ""
  echo "=== handle_gen_esi_sss_response ==="
  sed -n "997,1010p" "$1"
  echo ""
  echo "=== handle_calculate_decryption_key_response ==="
  sed -n "1837,1900p" "$1"
  echo ""
  echo "=== handle_calculate_decryption_share_response ==="
  sed -n "2136,2180p" "$1"
' sh {} \;

Repository: gnosisguild/enclave

Length of output: 234


🏁 Script executed:

# Find and read the threshold_keyshare.rs file
fd 'threshold_keyshare.rs' | head -1 | xargs -I {} sh -c '
  echo "=== handle_gen_pk_share_and_sk_sss_response ==="
  sed -n "914,960p" "$1"
  echo ""
  echo "=== handle_gen_esi_sss_response ==="
  sed -n "997,1010p" "$1"
  echo ""
  echo "=== handle_calculate_decryption_key_response ==="
  sed -n "1837,1900p" "$1"
  echo ""
  echo "=== handle_calculate_decryption_share_response ==="
  sed -n "2136,2180p" "$1"
' sh {}

Repository: gnosisguild/enclave

Length of output: 7158


🏁 Script executed:

# Check where handle_compute_response is called and how errors are handled
rg -n 'handle_compute_response|pub fn handle_compute_response' crates/keyshare/src/threshold_keyshare.rs -B 2 -A 10 | head -100

Repository: gnosisguild/enclave

Length of output: 1032


🏁 Script executed:

# Find the trap function definition
rg -n 'fn trap|pub fn trap' -A 15 --type rust | head -60

Repository: gnosisguild/enclave

Length of output: 2231


🏁 Script executed:

# Find the ErrorDispatcher trait and its err method implementation
rg -n 'trait ErrorDispatcher|fn err|pub fn err' -A 10 crates/events/src/enclave_event/enclave_error.rs | head -80

Repository: gnosisguild/enclave

Length of output: 904


🏁 Script executed:

# Find BusHandle and its err method
rg -n 'impl.*ErrorDispatcher.*BusHandle|fn err' crates/events/src/ -A 10 | grep -A 15 'BusHandle'

Repository: gnosisguild/enclave

Length of output: 3737


🏁 Script executed:

# Find the event_from_error method
rg -n 'fn event_from_error|event_from_error' crates/events/src/bus_handle.rs -A 20 | head -80

Repository: gnosisguild/enclave

Length of output: 1934


🏁 Script executed:

# Find the EnclaveEvent::from_error implementation
rg -n 'fn from_error' crates/events/src/ -A 20 | head -100

Repository: gnosisguild/enclave

Length of output: 6451


🏁 Script executed:

# Check if there are any error handlers that convert EnclaveError to E3Failed
rg -n 'EnclaveError|EnclaveEventData::EnclaveError' crates/keyshare/src/threshold_keyshare.rs -B 3 -A 5

Repository: gnosisguild/enclave

Length of output: 45


🏁 Script executed:

# Check how EnclaveError events are handled in threshold_keyshare.rs
rg -n 'EnclaveEventData::' crates/keyshare/src/threshold_keyshare.rs | head -20

Repository: gnosisguild/enclave

Length of output: 1082


🏁 Script executed:

# Check if EnclaveError events are handled anywhere that might convert them to E3Failed
rg -n 'EnclaveError|Handler.*EnclaveError' crates/keyshare/src/ -B 2 -A 10

Repository: gnosisguild/enclave

Length of output: 45


🏁 Script executed:

# Search for any handlers or converters that might turn EnclaveError into E3Failed
rg -n 'EnclaveEventData::EnclaveError' crates/ -B 3 -A 5

Repository: gnosisguild/enclave

Length of output: 1568


🏁 Script executed:

# Read the specific lines mentioned in the review comment
sed -n '47,52p;114,121p;174,182p;188,194p;323,330p;688,694p' agent/flow-trace/04_DKG_AND_COMPUTATION.md

Repository: gnosisguild/enclave

Length of output: 2318


🏁 Script executed:

# Search for ComputeRequestError handling
rg -n 'ComputeRequestError' crates/keyshare/src/threshold_keyshare.rs -B 3 -A 5

Repository: gnosisguild/enclave

Length of output: 45


🏁 Script executed:

# Search for ComputeRequestError across the entire codebase
rg -n 'ComputeRequestError' crates/ --type rust | head -20

Repository: gnosisguild/enclave

Length of output: 2620


🏁 Script executed:

# Check if ThresholdKeyshare subscribes to ComputeRequestError
rg -n 'subscribe.*ComputeRequestError|ComputeRequestError' crates/keyshare/src/threshold_keyshare.rs

Repository: gnosisguild/enclave

Length of output: 45


Verify the ThresholdKeyshare-side E3Failed routing for computation stage failures.

The documentation claims E3Failed emission from ThresholdKeyshare for GenPkShareAndSkSss, GenEsiSss, CalculateDecryptionKey, and CalculateDecryptionShare errors, but these handlers are not wired to emit E3Failed in the code.

Verification confirms that:

  • EncryptionKeyCollectionFailed and ThresholdShareCollectionFailed handlers exist (lines 2471, 2499) and do emit E3Failed (line 2562) ✓
  • The four computation response handlers exist, but they use the `?` operator to propagate errors into `trap()`, which converts them to EnclaveError events instead of E3Failed
  • ThresholdKeyshare does not subscribe to ComputeRequestError events, so computation failures don't trigger the documented E3Failed path

The doc is ahead of the code for computation stage failures. Either implement the E3Failed handlers for these stages or update the documentation to reflect that computation errors become EnclaveError events, not E3Failed.

Also applies to: 114-121, 174-182, 188-194, 323-330, 688-694
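The difference between the two failure paths can be sketched in Rust. This is a hedged illustration only: `Event`, `compute`, `trap`, and both handlers are simplified, hypothetical stand-ins and do not reproduce the real `ThresholdKeyshare`, `trap()`, or event types. The stage and reason strings mirror the values quoted in the trace document.

```rust
/// Hypothetical, simplified event type (the real enum carries much more data).
#[derive(Debug, PartialEq)]
enum Event {
    EnclaveError(String),
    E3Failed {
        failed_at_stage: &'static str,
        reason: &'static str,
    },
}

/// Stand-in for a computation response that may carry an error.
fn compute(ok: bool) -> Result<u32, String> {
    if ok {
        Ok(42)
    } else {
        Err("worker failed".to_string())
    }
}

/// Stand-in for trap(): runs the closure and turns any error into a generic
/// EnclaveError event.
fn trap<F: FnOnce() -> Result<(), String>>(f: F) -> Option<Event> {
    f().err().map(Event::EnclaveError)
}

/// Behaviour described by the review: `?` propagates the error into trap(),
/// so the failure surfaces as EnclaveError and the E3 is never marked failed.
fn handle_with_trap(ok: bool) -> Option<Event> {
    trap(|| {
        let _share = compute(ok)?;
        Ok(())
    })
}

/// Behaviour the documentation describes: the error is mapped explicitly to a
/// terminal E3Failed event.
fn handle_with_e3_failed(ok: bool) -> Option<Event> {
    match compute(ok) {
        Ok(_share) => None,
        Err(_) => Some(Event::E3Failed {
            failed_at_stage: "CommitteeFinalized",
            reason: "DKGInvalidShares",
        }),
    }
}
```

Both handlers stay silent on success; they differ only in which event a failed computation produces.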

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@agent/flow-trace/04_DKG_AND_COMPUTATION.md` around lines 47 - 52, The doc
claims computation-stage failures from ThresholdKeyshare (GenPkShareAndSkSss,
GenEsiSss, CalculateDecryptionKey, CalculateDecryptionShare) emit E3Failed, but
the code currently propagates errors via the ? operator into trap() producing
EnclaveError and ThresholdKeyshare also doesn't subscribe to
ComputeRequestError; fix by wiring these computation-response handlers to
explicitly emit E3Failed events instead of using ? into trap() and ensure
ThresholdKeyshare subscribes to ComputeRequestError; update the four handler
implementations (functions handling GenPkShareAndSkSss, GenEsiSss,
CalculateDecryptionKey, CalculateDecryptionShare) to convert errors into the
E3Failed emission path, and add subscription for ComputeRequestError in the
ThresholdKeyshare setup so computation failures trigger the E3Failed path rather
than EnclaveError.

```
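The collector timeout derivation described in Step 1 can be sketched as follows. This is a minimal sketch under stated assumptions: the helper names (`cutoff`, `resolve`, `collector_timeouts`) are hypothetical, and the per-collector override variables are passed in as plain optional strings because their names are not given in the trace. Only `E3_DKG_WINDOW_SECS`, the 7200s default, and the 10% / 60% fractions come from the document.

```rust
use std::time::Duration;

/// Default DKG window in seconds, as quoted in the trace (E3_DKG_WINDOW_SECS).
const DEFAULT_DKG_WINDOW_SECS: u64 = 7200;

/// Hypothetical helper: a percentage of the DKG window as a Duration.
fn cutoff(window_secs: u64, percent: u64) -> Duration {
    Duration::from_secs(window_secs * percent / 100)
}

/// Hypothetical helper: an explicit per-collector override (seconds, as read
/// from an env var) wins; otherwise the derived default is used.
fn resolve(override_secs: Option<&str>, derived: Duration) -> Duration {
    override_secs
        .and_then(|raw| raw.trim().parse::<u64>().ok())
        .map(Duration::from_secs)
        .unwrap_or(derived)
}

/// Returns (encryption-key cutoff, threshold-share cutoff).
fn collector_timeouts(
    window_secs: Option<&str>,
    key_override: Option<&str>,
    share_override: Option<&str>,
) -> (Duration, Duration) {
    let window = window_secs
        .and_then(|raw| raw.trim().parse::<u64>().ok())
        .unwrap_or(DEFAULT_DKG_WINDOW_SECS);
    (
        resolve(key_override, cutoff(window, 10)),
        resolve(share_override, cutoff(window, 60)),
    )
}
```

With the default window this gives 720s for the EncryptionKeyCollector and 4320s for the ThresholdShareCollector. A caller would typically pass `std::env::var("E3_DKG_WINDOW_SECS").ok().as_deref()` as the first argument.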

### Step 2: C0 Proof Generation → EncryptionKeyCreated
@@ -106,9 +111,14 @@ EncryptionKeyCollector waits for EncryptionKeyCreated from ALL N parties
├─ On each arrival: store (party_id → bfv_public_key)
├─ On TIMEOUT (60s):
│ └─ Publish EncryptionKeyCollectionFailed
│ └─ ThresholdKeyshare actor stops
├─ On TIMEOUT (derived DKG-phase cutoff):
│ └─ Send EncryptionKeyCollectionFailed to parent ThresholdKeyshare
│ ├─ ThresholdKeyshare republishes EncryptionKeyCollectionFailed for telemetry
│ ├─ ThresholdKeyshare emits E3Failed {
│ │ failed_at_stage: CommitteeFinalized,
│ │ reason: InsufficientCommitteeMembers
│ │ }
│ └─ ThresholdKeyshare actor stops
└─ When ALL N collected:
└─ Send AllEncryptionKeysCollected to parent ThresholdKeyshare
@@ -161,11 +171,27 @@ ThresholdKeyshare receives AllEncryptionKeysCollected
│ │ Output: esi_sss[num_ciphertexts][N] │
│ └─────────────────────────────────────────────────────────┘
```
├─ ThresholdKeyshare tracks the correlation id for both TrBFV requests:
│ ├─ `GenPkShareAndSkSss`
│ └─ `GenEsiSss`
│ → If the worker returns `ComputeRequestError` for either request,
│ `ThresholdKeyshare` now emits `E3Failed {
│ failed_at_stage: CommitteeFinalized,
│ reason: DKGInvalidShares
│ }` and stops instead of remaining stuck in `GeneratingThresholdShare`

### Step 5: Encrypt & Broadcast Shares (with C1, C2, C3 Proofs)

```
Both GenPkShareAndSkSss and GenEsiSss complete
├─ `ThresholdKeyshare` tracks the `CalculateDecryptionKey` correlation id:
│ → `ComputeRequestError` for this request now emits
│ `E3Failed {
│ failed_at_stage: CommitteeFinalized,
│ reason: DKGInvalidShares
│ }` and stops before C4 proof dispatch
├─ handle_shares_generated():
│ │
@@ -294,9 +320,14 @@ ThresholdShareCollector waits for ThresholdShareCreated from ALL N parties
│ │ → This node only extracts what's encrypted for it
│ └─ Forwards filtered share to ThresholdShareCollector
├─ On TIMEOUT (120s):
│ └─ Publish ThresholdShareCollectionFailed
│ └─ ThresholdKeyshare actor stops
├─ On TIMEOUT (derived DKG-phase cutoff):
│ └─ Send ThresholdShareCollectionFailed to parent ThresholdKeyshare
│ ├─ ThresholdKeyshare republishes ThresholdShareCollectionFailed for telemetry
│ ├─ ThresholdKeyshare emits E3Failed {
│ │ failed_at_stage: CommitteeFinalized,
│ │ reason: InsufficientCommitteeMembers
│ │ }
│ └─ ThresholdKeyshare actor stops
└─ When ALL N shares collected:
├─ Send AllThresholdSharesCollected to ThresholdKeyshare
@@ -533,8 +564,21 @@ ThresholdKeyshare receives AllThresholdSharesCollected
│ │ e3_id, party_id, signed_proof(C5)
│ │ }
│ │
│ └─ 5. Publish PublicKeyAggregated {
│ e3_id, aggregate_pk, pk_hash, node_list
│ ├─ 5. DKG AGGREGATION REQUEST (when proof aggregation is enabled):
│ │ ├─ PublicKeyAggregator buffers one optional NodeFold proof per honest party from
│ │ │ DKGRecursiveAggregationComplete
│ │ ├─ Dispatches ComputeRequest::zk(ZkRequest::DkgAggregation {
│ │ │ node_fold_proofs, c5_proof, party_ids, params_preset
│ │ │ })
│ │ ├─ Tracks the in-flight correlation id
│ │ ├─ ComputeRequestError now emits
│ │ │ E3Failed { failed_at_stage: CommitteeFinalized, reason: DKGInvalidShares }
│ │ └─ A mixed Some/None honest NodeFold-proof set is treated as the same terminal DKG
│ │ failure instead of only surfacing as EnclaveError telemetry
│ │
│ └─ 6. Publish PublicKeyAggregated {
│ e3_id, pubkey: aggregate_pk, pk_commitment, nodes,
│ dkg_aggregator_proof
│ }
└─ CiphernodeRegistrySolWriter receives PublicKeyAggregated:
@@ -641,6 +685,13 @@ EnclaveSolReader decodes CiphertextOutputPublished event
│ │ │ │
│ │ │ Output: Vec<decryption_share_polynomial> │
│ │ └─────────────────────────────────────────────────────┘
├─ `ThresholdKeyshare` tracks the `CalculateDecryptionShare` correlation id:
│ → `ComputeRequestError` for this request now emits
│ `E3Failed {
│ failed_at_stage: CiphertextReady,
│ reason: DecryptionInvalidShares
│ }` and stops before C6 proof generation
├─ REQUEST C6 PROOF:
│ Publish ShareDecryptionProofPending {
@@ -714,6 +765,12 @@ EnclaveSolReader decodes CiphertextOutputPublished event
│ │ │ │ Output: plaintext_bytes │
│ │ │ └─────────────────────────────────────────────────────┘
│ │
│ ├─ ThresholdPlaintextAggregator tracks the `CalculateThresholdDecryption` correlation id:
│ │ ├─ `ComputeRequestError` now emits
│ │ │ `E3Failed { failed_at_stage: CiphertextReady, reason: DecryptionInvalidShares }`
│ │ └─ Fatal C6 filtering failures (too few honest shares or post-proof commitment
│ │ mismatches) emit the same terminal failure instead of only trapping locally
│ │
│ ├─ REQUEST C7 PROOF:
│ │ Publish AggregationProofPending {
│ │ proof_request: DecryptedSharesAggregationProofRequest,
@@ -733,7 +790,23 @@ EnclaveSolReader decodes CiphertextOutputPublished event
│ │ e3_id, party_id, signed_proof(C7)
│ │ }
│ │
│ └─ Publish PlaintextAggregated { e3_id, decrypted_output }
│ ├─ DECRYPTION AGGREGATION REQUEST:
│ │ ├─ ThresholdPlaintextAggregator stores the signed C7 proofs plus the honest C6 inner
│ │ │ proofs for the first `T + 1` parties after sorting by `party_id`
│ │ ├─ Dispatches ComputeRequest::zk(ZkRequest::DecryptionAggregation {
│ │ │ c6_total_slots, jobs, params_preset
│ │ │ })
│ │ ├─ Each job folds the selected C6 proofs for one ciphertext index and checks them
│ │ │ against the matching C7 proof inside `DecryptionAggregator`
│ │ ├─ Tracks the in-flight correlation id
│ │ ├─ ComputeRequestError, missing C6 inner proofs, or C7/decryption-aggregator proof-count
│ │ │ mismatches now emit
│ │ │ `E3Failed { failed_at_stage: CiphertextReady, reason: DecryptionInvalidShares }`
│ │ └─ On success, stores `decryption_aggregator_proofs`
│ │
│ └─ Publish PlaintextAggregated {
│ e3_id, decrypted_output, decryption_aggregator_proofs
│ }
└─ EnclaveSolWriter receives PlaintextAggregated:
├─ Requires EffectsEnabled
Expand Down
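The share-selection rule in the decryption aggregation request above (first `T + 1` honest parties after sorting by `party_id`, with missing C6 inner proofs treated as fatal) can be sketched in Rust. The `HonestShare` struct, the `select_for_aggregation` function, and the error strings are hypothetical. In the flow described, either error would be surfaced as `E3Failed { failed_at_stage: CiphertextReady, reason: DecryptionInvalidShares }`.

```rust
/// Hypothetical, simplified view of one honest party's decryption share.
#[derive(Debug, Clone, PartialEq)]
struct HonestShare {
    party_id: u64,
    /// C6 inner proof bytes; None models a missing proof.
    c6_inner_proof: Option<Vec<u8>>,
}

/// Sort by party_id, keep the first T + 1 shares, and reject the set if there
/// are too few shares or any selected share lacks its C6 inner proof.
fn select_for_aggregation(
    mut shares: Vec<HonestShare>,
    threshold: usize,
) -> Result<Vec<HonestShare>, &'static str> {
    shares.sort_by_key(|share| share.party_id);
    let needed = threshold + 1;
    if shares.len() < needed {
        return Err("too few honest shares");
    }
    shares.truncate(needed);
    if shares.iter().any(|share| share.c6_inner_proof.is_none()) {
        return Err("missing C6 inner proof");
    }
    Ok(shares)
}
```

Sorting before truncating makes the selected subset deterministic, assuming every aggregator sees the same honest set.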
3 changes: 3 additions & 0 deletions crates/aggregator/Cargo.toml
@@ -29,3 +29,6 @@ e3-zk-helpers = { workspace = true }
e3-utils = { workspace = true }
serde = { workspace = true }
tracing = { workspace = true }

[dev-dependencies]
e3-test-helpers = { workspace = true }