
Conversation

ljedrz (Collaborator) commented Oct 23, 2025

This is a proposal tackling #2954. It introduces a background thread responsible for the sequential processing of storage-related operations that cannot happen concurrently. This allows us to remove some of the related locks, to introduce more performance optimizations related to block syncing, and, in general, to reason about the storage with more confidence.

During prototyping, the following open questions were resolved:

> Are there any more [ledger operations with potential writes]?

I have found only one more operation not listed in the issue; it is only triggered when a dev-mode node creates a genesis block. That being said, it followed one of the call paths detected previously, so it required no special treatment.

> At which level should these operations be introduced? try_advance_to_next_block is too broad, but atomic_speculate might be too granular (it might be practical to know whether it's the block-checking or the quorum block preparation stage)

In the end, I decided to isolate two SequentialOperations:

  • AddNextBlock
  • AtomicSpeculate

There is also atomic_finalize, which may not be triggered concurrently, but it is only called as part of AddNextBlock, so I decided to just introduce a new safeguard (ensure_sequential_processing) in it.

> How would these operations be represented? A dedicated beefy enum that would be able to hold all the items applicable to the operations of interest, and fed to a channel where we have a shallow clone of the Ledger

It's a new enum (plus some helper objects), and in the end it doesn't seem too bad size-wise; as for the channel, it actually contains a shallow clone of the VM instead, as that's the actual entry point into the storage.
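For illustration only, a minimal sketch of what the enum and its channel payloads might look like; the variant payloads, the stand-in types, and the use of a oneshot response channel are assumptions (the latter suggested by the blocking_recv call discussed further down), not the PR's actual definitions:

```rust
use tokio::sync::oneshot;

// Hypothetical stand-ins for snarkVM's actual types.
struct Block;
struct Transactions;
struct SpeculateOutcome;
type Result<T> = std::result::Result<T, String>;

/// The storage operations that must not be executed concurrently. Each variant
/// carries its inputs plus a one-shot sender for returning the result to the caller.
enum SequentialOperation {
    AddNextBlock(Block, oneshot::Sender<Result<()>>),
    AtomicSpeculate(Transactions, oneshot::Sender<Result<SpeculateOutcome>>),
}
```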

> Where would the channel be spawned? Would it be controlled by snarkOS or snarkVM?

I decided to spawn the dedicated thread when creating the VM; it is owned by it, together with the sender to the channel used for transferring the operations.
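Building on the sketch above, and mirroring the diff excerpt further down (a shallow clone of the VM moved into a dedicated thread), the spawn might look roughly like this; `Vm` and its two methods are placeholders, and the loop body is an assumption:

```rust
use std::{sync::mpsc, thread};

#[derive(Clone)]
struct Vm; // Stand-in; the real VM is cheaply (shallowly) cloneable.

impl Vm {
    fn add_next_block(&self, _block: &Block) -> Result<()> { Ok(()) }
    fn atomic_speculate(&self, _txs: Transactions) -> Result<SpeculateOutcome> { Ok(SpeculateOutcome) }

    /// Spawns the dedicated thread that drains the channel, processing one
    /// operation at a time; the VM keeps the matching `mpsc::Sender`.
    fn spawn_sequential_ops_thread(&self, rx: mpsc::Receiver<SequentialOperation>) -> thread::JoinHandle<()> {
        // The thread owns a shallow clone of the VM.
        let vm = self.clone();
        thread::spawn(move || {
            // Process queued operations strictly one after another.
            while let Ok(op) = rx.recv() {
                match op {
                    SequentialOperation::AddNextBlock(block, response_tx) => {
                        let _ = response_tx.send(vm.add_next_block(&block));
                    }
                    SequentialOperation::AtomicSpeculate(txs, response_tx) => {
                        let _ = response_tx.send(vm.atomic_speculate(txs));
                    }
                }
            }
        })
    }
}
```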

Future/follow-up considerations:

  • while this proposal solves the general issue for the set of operations we currently use, with a small diff and little impact on the existing APIs, it is not foolproof: we would have to manually ensure that any potential future operations (however unlikely) that may not be executed concurrently use this new setup; a better API might be possible, but I haven't come up with one yet
  • the persistent storage could inherit the thread id from the VM, and it could be the one to perform the ensure_sequential_processing check, probably in start_atomic; this would future-proof the setup, as we would immediately know if any new write batch was introduced but not processed sequentially (see the sketch after this list)
  • this new setup could technically allow us to remove multiple locks from the storage-related objects, which would be a performance improvement and a simplification (as fewer checks may be needed now)
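As a rough, purely hypothetical illustration of the second bullet: the storage could record the sequential-ops thread's id at construction and assert it whenever a write batch starts:

```rust
use std::thread::{self, ThreadId};

struct PersistentStorage {
    // Inherited from the VM when the storage is constructed.
    sequential_ops_thread_id: ThreadId,
}

impl PersistentStorage {
    fn start_atomic(&self) {
        // Any future write batch that is not routed through the dedicated
        // sequential-processing thread would immediately fail here.
        assert_eq!(thread::current().id(), self.sequential_ops_thread_id);
        // ... begin the atomic write batch ...
    }
}
```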

Filing this as a draft, as I haven't run all the tests locally yet, and it would require a stress-test run on snarkOS side.

@ljedrz force-pushed the feat/replace_vm_locks_with_channel branch from a3fedfb to 06cc701 on October 23, 2025 at 11:59

```diff
 // Run each test and compare it against its corresponding expectation.
-tests.par_iter().for_each(|test| {
+tests.iter().for_each(|test| {
```
ljedrz (Author):
Note: this doesn't seem to slow down the related CI job at all.

```rust
) -> thread::JoinHandle<()> {
    // Spawn a dedicated thread.
    let vm = self.clone();
    thread::spawn(move || {
```
Collaborator:

Do you know if this code is run in WASM? I am unsure if you can spawn threads there.

ljedrz (Author):

I know of a wasm_thread crate, though I haven't used it yet; an alternative would be to use a blocking task, but then the plumbing would need to be moved to snarkOS. Good point, I'll think about it.

Another commenter:

Thanks for thinking of us, Kai! We don't use the VM object in wasm :). Most wasm usage is strictly for execution or for using individual pieces of tooling in snarkVM, so I wouldn't worry about the wasm target at all if you're dealing with changes that ONLY affect the VM object.

ljedrz (Author):

Indeed, this is only related to the VM methods.

```rust
/// A safeguard used to ensure that the given operation is processed in the thread
/// enforcing sequential processing of operations.
pub fn ensure_sequential_processing(&self) {
    assert_eq!(thread::current().id(), self.sequential_ops_thread.lock().as_ref().unwrap().thread().id());
}
```
Collaborator:

My understanding was that one of the goals of this PR is to avoid assertions like this.
Is there no straightforward way to ensure certain functions are only invoked from within the worker thread by leveraging the type system?

ljedrz (Author):

Indeed, such safeguards are suboptimal, even if more foolproof than before; as I noted in the PR description, I see no simple way of introducing type safety here right now, and a proper solution would most likely be a heavy lift.

```rust
let _ = tx.send(request);

// Wait for the result of the queued operation.
let Ok(response) = response_rx.blocking_recv() else {
```
Collaborator:

This panics when called from an async context. We need to document that somewhere.

ljedrz (Author) commented Oct 31, 2025:

Yes, this is expected; in production conditions we always run these operations within the context of blocking tasks. I'm not sure where to document it, though, other than perhaps in the general storage documentation. That being said, I made a note of this at the callsite for now.
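For illustration, the calling convention described here might look like the following on the async side, reusing the stand-in types from the sketches above; `add_next_block_async` and its error handling are assumptions, not code from this PR:

```rust
use tokio::task;

// Moves the blocking call onto tokio's blocking thread pool; calling an
// operation that uses `blocking_recv` directly on an async worker thread
// would panic.
async fn add_next_block_async(vm: Vm, block: Block) -> Result<()> {
    task::spawn_blocking(move || vm.add_next_block(&block))
        .await
        .map_err(|e| e.to_string())?
}
```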

Collaborator:

I think we would need to document the operations that use it. For example, I had check_next_block panic; that call should have been in a blocking task anyway, but it's still a fairly easy mistake to make.
