@JulianGCalderon JulianGCalderon commented Oct 7, 2025

Add Cairo Runner Builder

The CairoRunner has two main responsibilities:

  • Setting the initial state for the VM execution
  • Executing the VM

This PR introduces a new structure CairoRunnerBuilder that handles only the initialization of the Cairo VM. Doing so not only improves the API, but also allows for some optimizations that were not possible before.

Optimizations

Each time we call the same contract, we do the same initialization logic:

  • Hint compilation. As long as we have the same hints, and the same hint processor implementation, the compiled hints don't change.
  • Program loading (segment 0). As long as the program is the same, segment 0 doesn't change.
  • Instruction decoding. As long as the program is the same, the instruction decoding doesn't change.

All of this can be cached. The new builder implements Clone, so we can perform the initialization logic only once, and reuse it for all the calls to that contract.
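
A minimal sketch of the "initialize once, clone per call" idea. The types here (`RunnerBuilder`, `Runner`) are hypothetical stand-ins, not the actual structs from this PR; they only illustrate how a cloneable builder lets the expensive setup run a single time.

```rust
/// Hypothetical stand-in for the builder: holds only initialization state.
#[derive(Clone, Debug, PartialEq)]
struct RunnerBuilder {
    program_segment: Vec<u64>,
    compiled_hints: Vec<String>,
}

/// Hypothetical stand-in for a runner that one execution uses and discards.
struct Runner {
    memory: Vec<u64>,
    hints: Vec<String>,
}

impl RunnerBuilder {
    /// The expensive part: meant to run once per contract.
    fn initialize(program: &[u64], hints: &[&str]) -> Self {
        RunnerBuilder {
            program_segment: program.to_vec(),
            compiled_hints: hints.iter().map(|h| format!("compiled({h})")).collect(),
        }
    }

    /// Consumes the builder, moving its fields into the runner.
    fn build(self) -> Runner {
        Runner { memory: self.program_segment, hints: self.compiled_hints }
    }
}

impl Runner {
    /// Pretend execution: mutates the runner and reports (cells, hints).
    fn run(&mut self) -> (usize, usize) {
        self.memory.push(0); // execution changes the runner, not the template
        (self.memory.len(), self.hints.len())
    }
}

fn main() {
    // Initialization happens once...
    let template = RunnerBuilder::initialize(&[1, 2, 3], &["hint_a"]);
    // ...and every call starts from a fresh clone of that state.
    for _ in 0..3 {
        let mut runner = template.clone().build();
        assert_eq!(runner.run(), (4, 1));
    }
    // The cached template is untouched by the executions.
    assert_eq!(template.program_segment, vec![1, 2, 3]);
}
```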

Moreover, when compiling hints, we were cloning the program's hints in order to put them into an Rc, which is required by the compile_hint function. To fix this, we now take the program's hints (mem::take) instead of cloning them. This is safe as long as the program's hints are not accessed after compiling them (which they currently are not). Note that this optimization was possible even without the new builder.
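
The difference between cloning and taking can be sketched as follows. `Program` and `compile_hints` are simplified, hypothetical stand-ins (the real hint types are more involved); the only point is that `std::mem::take` moves the vector out and leaves an empty one behind, so no copy is made.

```rust
use std::mem;
use std::rc::Rc;

/// Hypothetical stand-in for a program that owns its hints.
struct Program {
    hints: Vec<String>,
}

/// Stand-in for a compile step that requires the hints behind an `Rc`.
fn compile_hints(hints: Rc<Vec<String>>) -> Vec<String> {
    hints.iter().map(|h| h.to_uppercase()).collect()
}

/// Previous approach: copies every hint just to build the `Rc`.
fn compile_by_cloning(program: &Program) -> Vec<String> {
    compile_hints(Rc::new(program.hints.clone()))
}

/// New approach: moves the hints out, leaving an empty `Vec` in the program.
fn compile_by_taking(program: &mut Program) -> Vec<String> {
    let hints = mem::take(&mut program.hints);
    compile_hints(Rc::new(hints))
}

fn main() {
    let mut program = Program { hints: vec!["a".to_string(), "b".to_string()] };

    let cloned = compile_by_cloning(&program);
    assert_eq!(program.hints.len(), 2); // still there after cloning

    let taken = compile_by_taking(&mut program);
    assert_eq!(cloned, taken);
    assert!(program.hints.is_empty()); // gone after taking
}
```

The last assertion is the caveat mentioned above: after taking, any later read of the program's hints would see an empty list.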

Alternatives

The whole idea of this new abstraction is to provide a way of "cloning" the state of a runner prior to execution. I considered other approaches before settling on this one:

  • The current builder has a method fn build(self) that consumes itself and builds a CairoRunner. We could instead have something closer to a "factory", with a method fn build(&self). We then wouldn't need to clone anything, just call build multiple times. The downside is that a user who doesn't want to cache the builder, and just wants to build the runner once, would be forced to clone the internal fields. The flow would look like this:

    1. Create a CairoRunnerFactory
    2. Initialize it
    3. Build many CairoRunner instances
  • Provide a way to clone the CairoRunner directly. We cannot derive Clone for the CairoRunner, so we would need a manual implementation that skips some of the fields. This would work, but it implies cluttering the CairoRunner with even more initialization logic. Having a dedicated structure for this is a step in the right direction for improving the API. The flow would look like this:

    1. Create a CairoRunner
    2. Initialize it
    3. Clone it multiple times, to create many CairoRunner instances.
  • Similar to the previous alternative, we could do the whole initialization in the CairoRunner, and provide a way to build a separate struct, CairoRunnerSnapshot. This struct should be cloneable, and it should be possible to build a CairoRunner from it again. The flow would look like this:

    1. Create a CairoRunner
    2. Initialize it
    3. Create a CairoRunnerSnapshot
    4. Build many CairoRunner instances from the snapshot.
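
The trade-off between the consuming builder and the "factory" alternative can be sketched like this. `Builder`, `Factory`, and `Runner` are hypothetical stand-ins; the pointer comparisons are only there to show when the underlying buffer is moved and when it is copied.

```rust
/// Hypothetical runner that owns its memory.
struct Runner {
    memory: Vec<u64>,
}

/// Consuming builder: `build(self)` moves the fields into the runner.
#[derive(Clone)]
struct Builder {
    segment: Vec<u64>,
}

impl Builder {
    fn build(self) -> Runner {
        Runner { memory: self.segment } // move, no copy
    }
}

/// Factory alternative: `build(&self)` has to copy on every call.
struct Factory {
    segment: Vec<u64>,
}

impl Factory {
    fn build(&self) -> Runner {
        Runner { memory: self.segment.clone() } // always a copy
    }
}

fn main() {
    // One-shot use of the builder: the heap buffer is reused as-is.
    let builder = Builder { segment: vec![1, 2, 3] };
    let original = builder.segment.as_ptr();
    let runner = builder.build();
    assert_eq!(runner.memory.as_ptr(), original);

    // Cached use of the builder: the caller opts into the copy explicitly.
    let cached = Builder { segment: vec![1, 2, 3] };
    let runner = cached.clone().build();
    assert_ne!(runner.memory.as_ptr(), cached.segment.as_ptr());

    // The factory copies even when only a single runner is ever built.
    let factory = Factory { segment: vec![1, 2, 3] };
    let runner = factory.build();
    assert_ne!(runner.memory.as_ptr(), factory.segment.as_ptr());
}
```

With `build(self)`, the copy is paid only by callers who choose to keep the builder around; with `build(&self)`, every caller pays it.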

Benchmarks

I did some benchmarks over block ranges:

  • 20000-20010, found a 40% performance increase.
  • 124000-124010, found a 37% performance increase.
  • 124900-124910, found a 32% performance increase.
  • 2000000-2000010, found a 30% performance increase.
  • 2001000-2001010, found a 28% performance increase.

Some caveats:

  • The benchmark assumes that everything is cached.
  • The cache used is really simple: a static thread local.
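
A self-contained sketch of what such a thread-local cache looks like, using simplified stand-in types (`ClassHash` as a plain `u64` and a toy `Builder`, both hypothetical). It also shows a consequence of the thread-local choice: each thread starts with an empty cache.

```rust
use std::cell::RefCell;
use std::collections::{hash_map::Entry, HashMap};
use std::thread;

/// Hypothetical stand-in for the real class hash type.
type ClassHash = u64;

/// Hypothetical stand-in for the cached, cloneable builder.
#[derive(Clone)]
struct Builder {
    program: Vec<u64>,
}

thread_local! {
    static CACHE: RefCell<HashMap<ClassHash, Builder>> = RefCell::new(HashMap::new());
}

/// Returns a clone of the cached builder and whether it was a cache hit.
fn cached_builder(class_hash: ClassHash, program: &[u64]) -> (Builder, bool) {
    CACHE.with_borrow_mut(|cache| match cache.entry(class_hash) {
        Entry::Occupied(entry) => (entry.get().clone(), true),
        Entry::Vacant(entry) => {
            // The expensive initialization would happen here, once per class.
            let builder = Builder { program: program.to_vec() };
            (entry.insert(builder).clone(), false)
        }
    })
}

fn main() {
    let (_, hit) = cached_builder(7, &[1, 2]);
    assert!(!hit); // first call initializes

    let (builder, hit) = cached_builder(7, &[9, 9]);
    assert!(hit); // second call reuses the cached state
    assert_eq!(builder.program, vec![1, 2]);

    // A different thread has its own, initially empty, cache.
    let hit_elsewhere = thread::spawn(|| cached_builder(7, &[1, 2]).1)
        .join()
        .unwrap();
    assert!(!hit_elsewhere);
}
```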

Sequencer Code

The required code for the benchmark can be found here: https://github.com/lambdaclass/sequencer/compare/main-v0.14.0..main-v0.14.0-builder

Cairo 0 code

In the sequencer, I added the following code:

thread_local! {
    pub static CACHE: RefCell<HashMap<ClassHash, CairoRunnerBuilder>> = RefCell::new(HashMap::default());
}
let mut cairo_runner_builder = CACHE
    .with_borrow_mut(|cache| -> Result<CairoRunnerBuilder, VirtualMachineError> {
        match cache.entry(call.class_hash) {
            hash_map::Entry::Occupied(occupied_entry) => Ok(occupied_entry.get().clone()),
            hash_map::Entry::Vacant(vacant_entry) => {
                let mut cairo_runner_builder = CairoRunnerBuilder::new(
                    &compiled_class.program,
                    LayoutName::starknet,
                    None,
                    RunnerMode::ExecutionMode,
                )?;
                cairo_runner_builder.enable_trace(false);
                cairo_runner_builder.disable_trace_padding(false);
                cairo_runner_builder.allow_missing_builtins(false);
                cairo_runner_builder.initialize_base_segments();
                cairo_runner_builder.load_program()?;
                cairo_runner_builder.compile_hints(&mut syscall_handler).unwrap();
                cairo_runner_builder.initialize_builtin_runners_for_layout()?;
                cairo_runner_builder.initialize_builtin_segments();
                Ok(vacant_entry.insert(cairo_runner_builder).clone())
            }
        }
    })
    .unwrap();

After retrieving the builder, we call:

let mut runner = cairo_runner_builder.build()?;

After execution, we set the instruction cache for the following executions:

CACHE.with_borrow_mut(|cache| -> Result<(), VirtualMachineError> {
        match cache.get_mut(&call.class_hash) {
            Some(builder) => {
                builder.load_cached_instructions(runner.vm.take_instruction_cache())?
            }
            None => (),
        }
        Ok(())
    })?;

Cairo 1 Code

For Cairo 1, I did something similar. In the sequencer, I added the following code:

thread_local! {
    pub static CACHE: RefCell<HashMap<ClassHash, CairoRunnerBuilder>> = RefCell::new(HashMap::default());
}
let mut cairo_runner_builder: CairoRunnerBuilder = CACHE
    .with_borrow_mut(|cache| -> Result<_, VirtualMachineError> {
        match cache.entry(class_hash) {
            hash_map::Entry::Occupied(occupied_entry) => Ok(occupied_entry.get().clone()),
            hash_map::Entry::Vacant(vacant_entry) => {
                let mut cairo_runner_builder = CairoRunnerBuilder::new(
                    &compiled_class.program,
                    LayoutName::starknet,
                    None,
                    RunnerMode::ExecutionMode,
                )?;
                cairo_runner_builder.enable_trace(execution_runner_mode.trace_enabled());
                cairo_runner_builder.disable_trace_padding(false);
                cairo_runner_builder.allow_missing_builtins(false);
                cairo_runner_builder.initialize_base_segments();
                cairo_runner_builder.load_program()?;
                cairo_runner_builder.compile_hints(&mut syscall_handler).unwrap();
                cairo_runner_builder
                    .preallocate_segment(cairo_runner_builder.get_program_base().unwrap(), 2)?;
                Ok(vacant_entry.insert(cairo_runner_builder).clone())
            }
        }
    })
    .unwrap();

After retrieving the builder, we call:

cairo_runner_builder.initialize_builtin_runners(&entry_point.builtins)?;
cairo_runner_builder.initialize_builtin_segments();
let mut runner = cairo_runner_builder.build()?;

After execution, we set the instruction cache for the following executions:

CACHE.with_borrow_mut(|cache| -> Result<(), VirtualMachineError> {
        match cache.get_mut(&syscall_handler.base.call.class_hash) {
            Some(builder) => {
                builder.load_cached_instructions(runner.vm.take_instruction_cache())?
            }
            None => (),
        }
        Ok(())
    })?;

Checklist

  • Linked to Github Issue
  • Unit tests added
  • Integration tests added.
  • This change requires new documentation.
    • Documentation has been added/updated.
    • CHANGELOG has been updated.

github-actions bot commented Oct 7, 2025

Benchmark Results for unmodified programs 🚀

| Command | Unit | Mean | Min | Max | Relative |
|---|---|---|---|---|---|
| base big_factorial | s | 2.116 ± 0.016 | 2.092 | 2.135 | 1.00 ± 0.01 |
| head big_factorial | s | 2.112 ± 0.024 | 2.086 | 2.159 | 1.00 |
| base big_fibonacci | s | 2.081 ± 0.050 | 2.048 | 2.206 | 1.01 ± 0.02 |
| head big_fibonacci | s | 2.062 ± 0.009 | 2.052 | 2.075 | 1.00 |
| base blake2s_integration_benchmark | s | 7.741 ± 0.172 | 7.644 | 8.224 | 1.00 |
| head blake2s_integration_benchmark | s | 7.844 ± 0.064 | 7.745 | 7.987 | 1.01 ± 0.02 |
| base compare_arrays_200000 | s | 2.161 ± 0.013 | 2.146 | 2.179 | 1.00 |
| head compare_arrays_200000 | s | 2.170 ± 0.013 | 2.150 | 2.189 | 1.00 ± 0.01 |
| base dict_integration_benchmark | s | 1.439 ± 0.012 | 1.423 | 1.459 | 1.01 ± 0.01 |
| head dict_integration_benchmark | s | 1.428 ± 0.005 | 1.422 | 1.438 | 1.00 |
| base field_arithmetic_get_square_benchmark | s | 1.220 ± 0.008 | 1.211 | 1.236 | 1.00 |
| head field_arithmetic_get_square_benchmark | s | 1.240 ± 0.006 | 1.234 | 1.256 | 1.02 ± 0.01 |
| base integration_builtins | s | 7.708 ± 0.037 | 7.645 | 7.763 | 1.00 |
| head integration_builtins | s | 7.941 ± 0.030 | 7.887 | 7.992 | 1.03 ± 0.01 |
| base keccak_integration_benchmark | s | 8.012 ± 0.102 | 7.930 | 8.285 | 1.00 |
| head keccak_integration_benchmark | s | 8.191 ± 0.127 | 8.032 | 8.342 | 1.02 ± 0.02 |
| base linear_search | s | 2.169 ± 0.053 | 2.132 | 2.314 | 1.01 ± 0.03 |
| head linear_search | s | 2.156 ± 0.032 | 2.124 | 2.225 | 1.00 |
| base math_cmp_and_pow_integration_benchmark | s | 1.494 ± 0.006 | 1.485 | 1.500 | 1.00 ± 0.00 |
| head math_cmp_and_pow_integration_benchmark | s | 1.487 ± 0.004 | 1.481 | 1.494 | 1.00 |
| base math_integration_benchmark | s | 1.453 ± 0.005 | 1.444 | 1.459 | 1.00 |
| head math_integration_benchmark | s | 1.464 ± 0.034 | 1.443 | 1.557 | 1.01 ± 0.02 |
| base memory_integration_benchmark | s | 1.201 ± 0.003 | 1.197 | 1.205 | 1.00 |
| head memory_integration_benchmark | s | 1.204 ± 0.019 | 1.191 | 1.256 | 1.00 ± 0.02 |
| base operations_with_data_structures_benchmarks | s | 1.565 ± 0.015 | 1.555 | 1.604 | 1.00 |
| head operations_with_data_structures_benchmarks | s | 1.565 ± 0.009 | 1.552 | 1.585 | 1.00 ± 0.01 |
| base pedersen | ms | 527.7 ± 1.8 | 525.2 | 531.5 | 1.00 ± 0.00 |
| head pedersen | ms | 526.7 ± 1.0 | 525.0 | 528.2 | 1.00 |
| base poseidon_integration_benchmark | ms | 618.9 ± 3.1 | 613.6 | 623.2 | 1.01 ± 0.01 |
| head poseidon_integration_benchmark | ms | 611.5 ± 4.9 | 607.9 | 622.0 | 1.00 |
| base secp_integration_benchmark | s | 1.835 ± 0.014 | 1.824 | 1.872 | 1.00 |
| head secp_integration_benchmark | s | 1.852 ± 0.008 | 1.840 | 1.865 | 1.01 ± 0.01 |
| base set_integration_benchmark | ms | 624.6 ± 1.5 | 622.3 | 626.4 | 1.04 ± 0.00 |
| head set_integration_benchmark | ms | 600.4 ± 2.0 | 598.0 | 604.2 | 1.00 |
| base uint256_integration_benchmark | s | 4.222 ± 0.053 | 4.181 | 4.367 | 1.00 |
| head uint256_integration_benchmark | s | 4.299 ± 0.008 | 4.283 | 4.313 | 1.02 ± 0.01 |

codecov bot commented Oct 8, 2025

Codecov Report

❌ Patch coverage is 3.75783% with 461 lines in your changes missing coverage. Please review.
✅ Project coverage is 95.65%. Comparing base (065c8f4) to head (77801e5).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| vm/src/vm/runners/cairo_runner.rs | 2.03% | 385 Missing ⚠️ |
| vm/src/vm/vm_core.rs | 10.66% | 67 Missing ⚠️ |
| vm/src/vm/vm_memory/memory.rs | 18.18% | 9 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##            2.x.y    #2223      +/-   ##
==========================================
- Coverage   96.66%   95.65%   -1.02%     
==========================================
  Files         103      103              
  Lines       43646    44115     +469     
==========================================
+ Hits        42191    42198       +7     
- Misses       1455     1917     +462     

@JulianGCalderon JulianGCalderon marked this pull request as ready for review October 13, 2025 18:30
ModBuiltinRunner::new_mul_mod(&ModInstanceDef::new(Some(1), 1, 96), true).into()
}
};
self.builtin_runners.push(builtin_runner);
Contributor

Could this lead to builtin_runners containing the same builtin runner multiple times? If so, isn't that a problem? Also, this lets the user add runners that are not part of the layout. Is that a problem?

Contributor Author

Could this lead to a builtin_runners with the same builtin runner multiple times?

Yes, if the user specifies the same builtin multiple times. Receiving the builtin runners is required, as the user may want to initialize only the builtins required by the entrypoint that is going to be executed. This is also a problem in the current Cairo Runner API.

If so, isn't that a problem?

I am not sure if it's a problem, but it has never come up.

Also, this lets the user add runners that are not part of the layout. Is that a problem?

The current behavior ignores the layout, so I don't think it's a problem, but I am not really sure.
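
For reference, if duplicates did turn out to be a problem, a guard could look roughly like the sketch below. This is not something this PR implements; `Builtin`, `Builder`, and the `Result<(), String>` signature are hypothetical stand-ins chosen only to illustrate rejecting a repeated builtin without leaving the builder half-updated.

```rust
/// Hypothetical stand-in for a builtin identifier.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Builtin {
    Pedersen,
    RangeCheck,
    MulMod,
}

/// Hypothetical stand-in for the builder's builtin state.
#[derive(Default)]
struct Builder {
    builtin_runners: Vec<Builtin>,
}

impl Builder {
    /// Adds the requested builtins, refusing any that is already present.
    /// On error, the builder is left exactly as it was.
    fn initialize_builtin_runners(&mut self, builtins: &[Builtin]) -> Result<(), String> {
        let mut updated = self.builtin_runners.clone();
        for &builtin in builtins {
            if updated.contains(&builtin) {
                return Err(format!("builtin {builtin:?} was already initialized"));
            }
            updated.push(builtin);
        }
        self.builtin_runners = updated;
        Ok(())
    }
}

fn main() {
    let mut builder = Builder::default();
    builder
        .initialize_builtin_runners(&[Builtin::Pedersen, Builtin::RangeCheck])
        .unwrap();

    // Pedersen is repeated, so the whole call is rejected.
    let result = builder.initialize_builtin_runners(&[Builtin::MulMod, Builtin::Pedersen]);
    assert!(result.is_err());
    assert_eq!(
        builder.builtin_runners,
        vec![Builtin::Pedersen, Builtin::RangeCheck]
    );
}
```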

Comment on lines 427 to 429
/// Depends on:
/// - [initialize_base_segments](Self::initialize_base_segments)
/// - [initialize_builtin_runners_for_layout](Self::initialize_builtin_runners_for_layout)
Contributor

Does this mean that both of those methods have to be called before this one? If so, why not unify them into a single method?

Contributor Author

There are at least 2 ways of initializing the builtin runners (there may be more):

  • Initializing the entrypoint builtin runners (Cairo 1 - sequencer).
  • Initializing the layout builtin runners, according to the program builtins (Cairo 0 - sequencer).

The flow, at least for the use cases that I researched, seems to be like this:

  • initialize_base_segments.
  • Either initialize_builtin_runners_for_layout or initialize_builtin_runners. Now that I think of it, the code comment is wrong.
  • initialize_builtin_segments.
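
The ordering above can be sketched with a toy builder that checks its dependencies at runtime. Everything here is a hypothetical stand-in: the method names mirror the ones in this PR, but the fields, the string-based builtins, and the `Result` return type are assumptions made only to illustrate the flow.

```rust
/// Hypothetical stand-in that records which initialization steps ran.
#[derive(Default)]
struct Builder {
    base_segments_ready: bool,
    builtin_runners: Option<Vec<&'static str>>,
}

impl Builder {
    fn initialize_base_segments(&mut self) {
        self.base_segments_ready = true;
    }

    /// Cairo 0 style: builtins come from the layout (fixed list as a placeholder).
    fn initialize_builtin_runners_for_layout(&mut self) {
        self.builtin_runners = Some(vec!["pedersen", "range_check"]);
    }

    /// Cairo 1 style: builtins come from the entrypoint being executed.
    fn initialize_builtin_runners(&mut self, builtins: &[&'static str]) {
        self.builtin_runners = Some(builtins.to_vec());
    }

    /// Needs the base segments and one of the two builtin runner steps.
    /// Returns the number of builtin segments it would allocate.
    fn initialize_builtin_segments(&mut self) -> Result<usize, &'static str> {
        if !self.base_segments_ready {
            return Err("initialize_base_segments must run first");
        }
        let runners = self
            .builtin_runners
            .as_ref()
            .ok_or("builtin runners must be initialized first")?;
        Ok(runners.len())
    }
}

fn main() {
    // Calling steps out of order is reported instead of silently accepted.
    let mut builder = Builder::default();
    assert!(builder.initialize_builtin_segments().is_err());
    builder.initialize_base_segments();
    assert!(builder.initialize_builtin_segments().is_err());

    // Cairo 1 style flow.
    builder.initialize_builtin_runners(&["range_check"]);
    assert_eq!(builder.initialize_builtin_segments(), Ok(1));

    // Cairo 0 style flow.
    let mut builder = Builder::default();
    builder.initialize_base_segments();
    builder.initialize_builtin_runners_for_layout();
    assert_eq!(builder.initialize_builtin_segments(), Ok(2));
}
```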

There are two reasons why I didn't abstract it into higher-level functions yet:

  • I didn't want to create abstractions prematurely, as there are some use cases that I am not considering yet.
  • I didn't want to depart much from the current Cairo Runner building API.

I totally agree that we need to find a better/safer API, but I am not sure yet what the best way to do it is.

cc: @FrancoGiachetta @gabrielbosio

Contributor

@FrancoGiachetta FrancoGiachetta Oct 15, 2025

I think it's fine to let the user decide how to build the runner, at least with the same flexibility as the previous one. It is true that it may be quite unsafe, as calling functions in a different order might result in an inconsistent state. However, we've seen some use cases where the runner needs to be built in exotic ways. So I think that, as we work toward a safer API, we should take these use cases into account, as they shouldn't change.

Co-authored-by: DiegoC <[email protected]>