Upgrade Miden VM to 0.14#1353

Merged
bobbinth merged 17 commits into next from
pgackst-upgrade-vm
May 19, 2025

@PhilippGackstatter PhilippGackstatter commented May 12, 2025

Upgrade Miden VM to 0.14.

Main Changes

  • Refactor MASM error approach in build script. Since asserts now take an error string directly rather than a code, the previous categorization of errors by range is no longer possible, so it was removed. The errors in MASM were rewritten and the build script now extracts the constants with their strings instead. The auto-generated error files use a thin MasmError wrapper around the extracted string, which can compute the error code on demand (for use in tests). The TransactionExecutor no longer holds a map of errors internally, because Host::on_assert_failed can no longer return an ExecutionError. Those errors are now handled in the VM.
  • Passing the SourceManager to vm_processor::execute in TransactionExecutor. I thought of two options here:
    • Set the source manager on the TransactionExecutor using a builder-style approach, e.g. TransactionExecutor::new(...).with_source_manager(source_manager).
    • Make the source manager a parameter to TransactionExecutor::execute_transaction.

With the first option, users would most likely not set the source manager at all, because the API does not force them to. This is a lesson from #1294.
The latter option forces users to at least think about what source manager to pass. In a setting where no debug information is desired, an empty one can be passed. The docs of that function were updated to mention that.
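To make the trade-off concrete, here is a minimal sketch of the two API shapes. It uses stand-in types only: the SourceManager trait, EmptySourceManager, BuilderExecutor and ParamExecutor below are hypothetical and are not the miden-base types.

```rust
use std::sync::Arc;

// Hypothetical stand-in for the real source manager trait.
pub trait SourceManager {
    fn file_count(&self) -> usize;
}

// Stand-in for an "empty" source manager, passed when no debug info is wanted.
pub struct EmptySourceManager;

impl SourceManager for EmptySourceManager {
    fn file_count(&self) -> usize {
        0
    }
}

// Option 1: builder style. Forgetting `with_source_manager` still compiles.
pub struct BuilderExecutor {
    source_manager: Option<Arc<dyn SourceManager>>,
}

impl BuilderExecutor {
    pub fn new() -> Self {
        Self { source_manager: None }
    }

    pub fn with_source_manager(mut self, source_manager: Arc<dyn SourceManager>) -> Self {
        self.source_manager = Some(source_manager);
        self
    }

    pub fn has_source_manager(&self) -> bool {
        self.source_manager.is_some()
    }
}

// Option 2: explicit parameter. Every call site has to pick a source manager.
pub struct ParamExecutor;

impl ParamExecutor {
    // Returns the number of files only so the sketch has something observable.
    pub fn execute(&self, source_manager: Arc<dyn SourceManager>) -> usize {
        source_manager.file_count()
    }
}
```

With the builder, `BuilderExecutor::new()` alone is a valid executor without any source manager, which is the failure mode described above. With the parameter, the caller cannot get past the type checker without choosing one, even if it is the empty one.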

Something similar applies to TransactionProver, although I think we could also not expose the source manager here. In my mind, during proving we assume that the transaction witness comes from an already executed transaction, so I wonder if the error reporting is necessary here. For now though, I've added the source manager, but it's easy to remove.

Edit: I tried making the Arc<dyn SourceManager> a field in LocalTransactionProver, but that makes it no longer Send + Sync. That causes problems in the proving service. The same would be true for TransactionExecutor if we went with the builder-style approach. Given that we want to move in the direction of making it Send + Sync (#1350), that would be a problem. Even the approach I've taken now might complicate that, though.
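For reference, the Send + Sync behaviour follows from how auto traits work on trait objects, and can be reproduced with stand-in types. In this sketch, SourceManager, NoSources and Prover are hypothetical; whether the real trait object can carry the extra bounds is a separate question that this does not answer.

```rust
use std::sync::Arc;

// Hypothetical stand-in trait with no Send/Sync supertraits.
pub trait SourceManager {}

pub struct NoSources;

impl SourceManager for NoSources {}

// `Arc<dyn SourceManager>` on its own is neither Send nor Sync, because the
// trait object could hide a type that is not thread-safe. Adding the bounds
// to the trait object makes the containing struct Send + Sync again.
pub struct Prover {
    pub source_manager: Arc<dyn SourceManager + Send + Sync>,
}

// Compiles only if `T` is Send + Sync.
pub fn assert_send_sync<T: Send + Sync>() {}

pub fn check() -> bool {
    assert_send_sync::<Prover>();
    // The next line would fail to compile if uncommented:
    // assert_send_sync::<Arc<dyn SourceManager>>();
    true
}
```

A struct with a plain `Arc<dyn SourceManager>` field fails the same check, which matches what was observed with LocalTransactionProver.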

Optimal Source Manager Setup

  • Whenever possible in tests, I tried passing a useful source manager to TransactionContext (which eventually usually calls TransactionExecutor::execute_transaction). To that end, I made Arc<dyn SourceManager> a field in the context. (That made it non-Sync, but that's not really a problem, just required some allows of clippy lints).
  • Currently our build.rs script already assembles the kernel at build time and so the source files are not part of TransactionKernel::assembler. But I think we'd want that for debug purposes, so that if there's an error in the tx kernel, we'd get better errors. So the question is, can we somehow add the source files when we instantiate the assembler? The source files are available in concat!(env!("OUT_DIR"), "/asm/kernels/transaction"), so my question is primarily about the best way to do this using the Assembler or SourceManager API.
    • In debug mode, I'd basically add all standard MASM code to the assembler so it's available, i.e. account components, kernels, miden lib and note scripts. Is that the right approach here @plafer?

Error Diagnostics

One caveat is that in order to get these error messages that reference the source file, you have to either:

So to illustrate, in the TransactionExecutor, we're basically doing this standard wrapping of ExecutionError:

// Execute the transaction kernel
let result = vm_processor::execute(
    &TransactionKernel::main(),
    stack_inputs,
    &mut host,
    self.exec_options,
    source_manager,
)
.map_err(TransactionExecutorError::TransactionProgramExecutionFailed)?;

Even with debug mode enabled in the assembler, debug mode passed to the processor, and the source manager containing the source code of the erroring MASM, we get the following, depending on the error library and the error handling:

Anyhow

let executed_transaction = executor
  .execute_transaction(account_id, block_ref, notes, tx_args, Arc::clone(&source_manager))
  .context("failed to execute transaction")?;

prints

Error: failed to execute transaction

Caused by:
    0: failed to execute transaction kernel program
    1: word memory access at address 25 in context 5228 is unaligned at clock cycle 6286

Miden Miette

let executed_transaction = executor
  .execute_transaction(account_id, block_ref, notes, tx_args, Arc::clone(&source_manager))
  .into_diagnostic()?;

prints

Error:   x failed to execute transaction kernel program
  `-> word memory access at address 25 in context 5228 is unaligned at clock
      cycle 6286

Note that in both cases of anyhow and miette, no source code is printed. For anyhow, it's because it simply doesn't do anything with Diagnostic, so that's expected. For miette, because our wrapping TransactionExecutorError does not implement Diagnostic, the into_diagnostic call presumably cannot access the source ExecutionError but only core::error::Error, so it cannot print the diagnostics either (this is me speculating).
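The speculation above can be illustrated with the standard library alone. In this sketch, InnerError and OuterError are hypothetical stand-ins for ExecutionError and TransactionExecutorError, and render_chain imitates what a generic reporter can do when it only sees the source chain.

```rust
use std::error::Error;
use std::fmt;

// Stand-in for an error that carries extra diagnostic data (e.g. a label for a span).
#[derive(Debug)]
pub struct InnerError {
    pub message: &'static str,
    pub label: &'static str, // extra data that is not part of `Display`
}

impl fmt::Display for InnerError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.message)
    }
}

impl Error for InnerError {}

// Stand-in for the wrapping error, which exposes the inner error only via `source()`.
#[derive(Debug)]
pub struct OuterError(pub InnerError);

impl fmt::Display for OuterError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("failed to execute transaction kernel program")
    }
}

impl Error for OuterError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.0)
    }
}

// Walks the chain the way a generic reporter would. Each cause is a
// `&dyn Error`, so only its `Display` output is reachable without downcasting.
pub fn render_chain(err: &dyn Error) -> Vec<String> {
    let mut lines = vec![err.to_string()];
    let mut current = err.source();
    while let Some(cause) = current {
        lines.push(cause.to_string());
        current = cause.source();
    }
    lines
}
```

Rendering an OuterError this way yields the two message lines, but the label never appears, which is consistent with the output shown for both anyhow and miette.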

If we unwrap the source error in a map_err call:

let executed_transaction = executor
  .execute_transaction(account_id, block_ref, notes, tx_args, Arc::clone(&source_manager))
  .map_err(|err| {
    let TransactionExecutorError::TransactionProgramExecutionFailed(source) = err else {
      todo!()
    };
    source
  })?;

then we get:

Error:   x word memory access at address 25 in context 5228 is unaligned at clock
  | cycle 6286
   ,-[#exec:1:1]
 1 | 
   : ^
   : `-- tried to access memory address 25
 2 |             use.miden::contracts::wallets::basic->wallet
   `----
  help: ensure that the memory address accessed is aligned to a word
        boundary (it is a multiple of 4)

which is an actual diagnostic. But that is not a solution, of course. I just did this as a sanity check.

The only real solutions might be either to derive Diagnostic for TransactionExecutorError, i.e.:

#[derive(Debug, Error, miette::Diagnostic)]
pub enum TransactionExecutorError {
  // ...
  #[error("failed to execute transaction kernel program")]
  #[diagnostic(transparent)]
  TransactionProgramExecutionFailed(#[source] ExecutionError),
}

It only seems to work when using transparent, but I'm not really familiar with miette.

Or to print the diagnostic instead of returning the source error, i.e.:

#[error("failed to execute transaction kernel program:\n{}", PrintDiagnostic::new(.0))]
TransactionProgramExecutionFailed(ExecutionError),

prints

Error:   x failed to execute transaction kernel program:
  |   × word memory access at address 25 in context 5228 is unaligned at clock
  | cycle 6286
  |    ╭─[#exec:1:1]
  |  1 │
  |    · ▲
  |    · ╰── tried to access memory address 25
  |  2 │             use.miden::contracts::wallets::basic->wallet
  |    ╰────
  |   help: ensure that the memory address accessed is aligned to a word
  | boundary (it is a multiple of 4)
  | 

Note that both of these diagnostics point to the wrong location in the source code. The offending line is push.4.3.2.1.25 mem_storew. The second line of the code is indeed use.miden::contracts::wallets::basic->wallet, so the diagnostic references the correct source, but points to the wrong location within it.
I pushed an example to the pgackst-vm-diagnostic branch. Run RUST_BACKTRACE=1 cargo nextest run --cargo-profile test-dev --features concurrent,testing prove_witness_and_verify to reproduce.

Conclusion

In my opinion, the PrintDiagnostic approach is by far the easiest and the most likely to get through to the user, in particular because it does not require end users to use miden-miette for error reporting. If an end user really wants to use a custom "diagnostics printer", we could still include the ExecutionError in the error so that it remains accessible. So for now, I went with the PrintDiagnostic approach.
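As a rough sketch of why this approach reaches users regardless of their error library: the rendered report becomes part of the outer error's Display output, so anything that prints the error prints the report. The types and the render_report method below are hypothetical stand-ins (render_report plays the role of PrintDiagnostic), not miette or miden-base code.

```rust
use std::error::Error;
use std::fmt;

// Stand-in for an error that knows how to render a rich report of itself.
#[derive(Debug)]
pub struct InnerError {
    pub message: &'static str,
    pub help: &'static str,
}

impl InnerError {
    // Hypothetical stand-in for a diagnostic renderer such as PrintDiagnostic.
    pub fn render_report(&self) -> String {
        format!("  x {}\n  help: {}", self.message, self.help)
    }
}

impl fmt::Display for InnerError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.message)
    }
}

impl Error for InnerError {}

// The wrapper embeds the rendered report directly in its own message.
#[derive(Debug)]
pub struct OuterError(pub InnerError);

impl fmt::Display for OuterError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "failed to execute transaction kernel program:\n{}",
            self.0.render_report()
        )
    }
}

// No `source()` here: the inner error is already part of the message, so
// exposing it again would make chain-printing reporters repeat it.
impl Error for OuterError {}
```

The inner error stays accessible through the public tuple field, which corresponds to keeping the ExecutionError in the variant for users who want their own printer.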

I'd be happy to get more insights on this or if I overcomplicated things, cc @plafer.

Follow-Up

  • Add standard component sources to source managers of TransactionKernel::assembler and friends if a debug flag is enabled, if that results in better errors.
  • We should decide what to do with APIs that return ExecutionError, e.g. TransactionContext::{execute_code, execute_code_with_assembler} or CodeExecutor::{run, execute_program}. In our tests, we can just use miden-miette, but if miden-base users also use these, they should be made aware that they need to do the same in order to get the better errors.

/// Provided kernel procedure offset is out of bounds
pub const ERR_KERNEL_PROCEDURE_OFFSET_OUT_OF_BOUNDS: u32 = 0x20000;
/// Error Message: "anchor block commitment must not be empty"
pub const ERR_ACCOUNT_ANCHOR_BLOCK_COMMITMENT_MUST_NOT_BE_EMPTY: MasmError = MasmError::from_static_str("anchor block commitment must not be empty");
Collaborator

I think the goal with 0.14 was that we wouldn't need to duplicate errors defined in masm and rust (and not have to do any build.rs logic). Although I didn't check everywhere, it seems like a bunch of these are only used in tests. Do we really need MasmError, or could those strings be moved directly in tests?

Collaborator

For example, in the VM, we hardcode the expected strings in the tests

Contributor Author

Thanks for raising this. We don't need the errors extracted from Masm, but the main reason I wanted to keep it is because it makes testing a lot more convenient. I personally would like to avoid maintaining expected error strings and since we already have the build.rs setup, it seems fine to me to keep it for that purpose. I don't think it has any disadvantages other than needing an occasional maintenance update. What I haven't done yet, but should do is only expose these under testing. Previously users could have wanted access to them to map the codes back to the strings, but that should no longer be necessary, so we don't need to make it part of the regular API.
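For readers unfamiliar with the wrapper, a minimal sketch of the idea follows. It is not the generated code: the struct layout and the code method are assumptions, and FNV-1a is used purely as a placeholder hash, since the hash the VM actually derives codes with is not shown in this thread.

```rust
// Hypothetical mirror of the thin wrapper described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MasmError {
    message: &'static str,
}

impl MasmError {
    // `const fn` so that it can be used to define constants.
    pub const fn from_static_str(message: &'static str) -> Self {
        Self { message }
    }

    pub const fn message(&self) -> &'static str {
        self.message
    }

    // Derives a code from the message on demand instead of storing it.
    // FNV-1a (64-bit) is a placeholder, not the VM's real derivation.
    pub fn code(&self) -> u64 {
        let mut hash: u64 = 0xcbf29ce484222325;
        for byte in self.message.bytes() {
            hash ^= byte as u64;
            hash = hash.wrapping_mul(0x100000001b3);
        }
        hash
    }
}

// A constant in the same shape as the generated one quoted above.
pub const ERR_ACCOUNT_ANCHOR_BLOCK_COMMITMENT_MUST_NOT_BE_EMPTY: MasmError =
    MasmError::from_static_str("anchor block commitment must not be empty");
```

A test can then compare against the constant (its message or its derived code) rather than against a string literal copied from the MASM source, which is the convenience argued for above.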

plafer commented May 12, 2025

In debug mode, I'd basically add all standard MASM code to the assembler so it's available, i.e. account components, kernels, miden lib and note scripts. Is that the right approach here @plafer?

I think that the best way is to use SourceManagerExt::load_file(), a convenient way to load a file into a source manager by path.

The reason I'm not 100% sure is that the kernel is weird in that all files get built into a single module, and so I'm not sure if that affects source management in weird ways. So I'd give load_file() a try to see if it works as expected. cc @bitwalker

As for the TransactionExecutor & PrintDiagnostic issue, I tried the #[diagnostic(transparent)] approach, and it gave me the same issue as you're getting. This (probably) suggests that the source spans are screwed up for kernel code? This requires a more thorough investigation, and/or input from @bitwalker.

EDIT: I will look into the kernel issue to see if I can reproduce in a simpler environment and report back

@bitwalker
Collaborator

So the question is, can we somehow add the source files when we instantiate the assembler? The source files are available in concat!(env!("OUT_DIR"), "/asm/kernels/transaction"), so my question is primarily about the best way to do this using the Assembler or SourceManager API.

In debug mode, I'd basically add all standard MASM code to the assembler so it's available, i.e. account components, kernels, miden lib and note scripts. Is that the right approach

I would have your build.rs always assemble with debug mode enabled (and thus it will include debug information, including the source files). With debug mode disabled, no source locations will be stored in the resulting library, so there won't be any way to add in source files after the fact. It's all or nothing. At this point in time, there is essentially no benefit to not enabling debug mode IMO.

As for the TransactionExecutor & PrintDiagnostic issue, I tried the #[diagnostic(transparent)] approach, and it gave me the same issue as you're getting. This (probably) suggests that the source spans are screwed up for kernel code? This requires a more thorough investigation, and/or input from @bitwalker.

The issue here, AFAICT, is that the miette report handler is being bypassed. You must do one of the following:

  • unwrap with the panic hook for diagnostic reporting installed (if it isn't installed, it will print without any of the fancy rendering, including source spans)
  • Use PrintDiagnostic explicitly with a report, e.g. panic!("{}", PrintDiagnostic::new(report))

Note that you must convert error types implementing Diagnostic to a Report in order to get the pretty rendering with source spans. You can't just panic/print a Diagnostic and get the nice output, because a Diagnostic impl doesn't handle the display plumbing itself (that typically happens by deriving thiserror::Error, or implementing Display by hand). The Report type exists to handle all of those details, by wrapping a Diagnostic impl and providing that plumbing for you.

plafer commented May 12, 2025

Note that you must convert error types implementing Diagnostic to a Report in order to get the pretty rendering with source spans.

The problem though is that even when we do get the nice rendering, miette is not printing out the right part of the MASM code (hence my suspicion that the SourceSpan start/end values are incorrect) - or did I misunderstand what you said here?

@bitwalker
Collaborator

Note that you must convert error types implementing Diagnostic to a Report in order to get the pretty rendering with source spans.

The problem though is that even when we do get the nice rendering, miette is not printing out the right part of the MASM code (hence my suspicion that the SourceSpan start/end values are incorrect) - or did I misunderstand what you said here?

When you say it isn't printing out the right part, do you mean that it is printing completely unrelated code; the span is offset weirdly (e.g. it is off by one line); or the span isn't what you expected (e.g. the span you have is for a related bit of code, but not useful)?

If it is the first of those, the problem might arise if you use a SourceSpan that was allocated in one SourceManager instance with a different instance, and the underlying SourceIndex exists in both. So long as the byte indices are in range, the span will behave as if it is valid, but you'll get the wrong source code when rendering the span. The only other way to get completely incorrect spans, is if you are constructing them manually, i.e. either by looking up the SourceIndex associated with a given name in a SourceManager, then converting a line/col pair to a byte range; or, by asking the SourceManager to convert a Location (i.e. file name and line/col pair) to a SourceSpan. In both cases, it is up to the caller to ensure that the source code stored in the SourceManager under the given file name, is actually the expected content. That's generally the case when using actual file system paths, but the SourceManager lets you store file content "virtually" as well, i.e. you provide the content directly, rather than have the SourceManager read the content from disk. In that case, particularly if re-using a SourceManager, you might resolve locations to SourceSpans which aren't what you expect them to be.

It really boils down to how you are using the SourceManager. Different usage patterns call for different approaches to mitigating the possibility of overlapping source files. In the typical case, you don't really have to do any mitigation; but in others, particularly involving virtual files, you might need to ensure that you instantiate a new SourceManager in each unique "session" where files are known to be unique.

I'm not sure how much of the above applies to the issues you're seeing without knowing more specifics, but hopefully that provides some intuition for what might be going on. I'm happy to pair up and take a look with you guys too.

@bobbinth bobbinth left a comment

Thank you! Looks good! Not a full review, but I left some comments inline and also some comments below.

I would have your build.rs always assemble with debug mode enabled (and thus it will include debug information, including the source files). With debug mode disabled, no source locations will be stored in the resulting library, so there won't be any way to add in source files after the fact. It's all or nothing. At this point in time, there is essentially no benefit to not enabling debug mode IMO.

I think we should just remove the option to assemble w/o debug info from Miden assembler. Basically, all programs would be assembled with debug info in them, and if needed, we can strip them out later (e.g., when loading a program into the processor). @plafer - do we already have an issue for this in miden-vm? I would probably prioritize this for v0.15.0 release.

Something similar applies to TransactionProver, although I think we could also not expose the source manager here. In my mind, during proving we assume that the transaction witness comes from an already executed transaction, so I wonder if the error reporting is necessary here. For now though, I've added the source manager, but it's easy to remove.

I would remove source manager from the TransactionProver. As you described above, we basically always first execute a program using TransactionExecutor and then prove using TransactionProver - so, if there were any errors, they should be caught by the executor.

@PhilippGackstatter
Contributor Author

Basically, all programs would be assembled with debug info in them, and if needed, we can strip them out later (e.g., when loading a program into the processor).

That would be great for the UX as well. That is, you no longer have to make sure that the assembler and the processor are in debug mode in order to get better errors or debug information out of the processor. Instead, it is always present (unless explicitly stripped, like when stored on-chain) and if ExecutionOptions::enable_debugging is set, then it'll use it, otherwise it can strip it or ignore it. Something like that would be great from the user POV.

@PhilippGackstatter
Contributor Author

When you say it isn't printing out the right part, do you mean that it is printing completely unrelated code; the span is offset weirdly (e.g. it is off by one line); or the span isn't what you expected (e.g. the span you have is for a related bit of code, but not useful)?

Yes, it prints this:

Error:   x failed to execute transaction kernel program:
  |   × word memory access at address 25 in context 5228 is unaligned at clock
  | cycle 6286
  |    ╭─[#exec:1:1]
  |  1 │
  |    · ▲
  |    · ╰── tried to access memory address 25
  |  2 │             use.miden::contracts::wallets::basic->wallet
  |    ╰────
  |   help: ensure that the memory address accessed is aligned to a word
  | boundary (it is a multiple of 4)
  | 

where the source code is this one:


So the second line is correct, i.e. it references the right source "file", but it should point to the mem_storew line.

particularly involving virtual files, you might need to ensure that you instantiate a new SourceManager in each unique "session" where files are known to be unique.

Yeah, I think there are at least three virtual files in this source manager (in the test prove_witness_and_verify): the "mock account" added by TransactionContextBuilder::with_standard_account(ONE) and two files added by TransactionContextBuilder::with_mock_notes_preserved, afaict. Not sure if that's the root cause though.

I think that the best way is to use SourceManagerExt::load_file() as a convenience to loading a file in a source manager by path.

I gave this a try by introducing an unaligned memory error in api.masm.

I added the following snippets to TransactionKernel::testing_assembler, which is the assembler instantiated in our tests, so its source manager should propagate through to the execution. First I tried using just load_file:

let source_manager = Arc::new(DefaultSourceManager::default()) as Arc<dyn SourceManager>;
let kernel_path = Path::new(concat!(env!("OUT_DIR"), "/asm/kernels/transaction"));
source_manager.load_file(&kernel_path.join("api.masm")).unwrap();

Unfortunately that didn't work. The error did not contain a source snippet:

Caused by:
  failed to execute transaction kernel program:
    x word memory access at address 25 in context 0 is unaligned at clock cycle 4865
    help: ensure that the memory address accessed is aligned to a word boundary (it is a multiple of 4)

I also tried the same approach as in build.rs, i.e. invoking the same APIs and letting the assembler fill the source manager however it does that internally.

let shared_path = concat!(env!("OUT_DIR"), "/asm/shared");
let mut temp_assembler = Assembler::new(Arc::clone(&source_manager))
    .with_library(StdLibrary::default())
    .unwrap();
let kernel_namespace = miden_objects::assembly::LibraryNamespace::new("kernel")
    .expect("namespace should be valid");
temp_assembler
    .add_modules_from_dir(kernel_namespace.clone(), &Path::new(shared_path))
    .unwrap();

let kernel_path = Path::new(concat!(env!("OUT_DIR"), "/asm/kernels/transaction"));
let _kernel_lib = KernelLibrary::from_dir(
    kernel_path.join("api.masm"),
    Some(kernel_path.join("lib")),
    temp_assembler,
)
.unwrap();

The error was the same though.

I would somehow like to confirm whether the source manager that ends up being passed to execute does indeed contain these, but I can't find a "list all contents" API.

bitwalker commented May 13, 2025

Yes, it prints this:

Error:   x failed to execute transaction kernel program:
  |   × word memory access at address 25 in context 5228 is unaligned at clock
  | cycle 6286
  |    ╭─[#exec:1:1]
  |  1 │
  |    · ▲
  |    · ╰── tried to access memory address 25
  |  2 │             use.miden::contracts::wallets::basic->wallet
  |    ╰────
  |   help: ensure that the memory address accessed is aligned to a word
  | boundary (it is a multiple of 4)
  | 

where the source code is this one:

So the second line is correct, i.e. it references the right source "file", but it should point to the mem_storew line.

I think I see what is causing the issue. Here is where the code is being assembled. There are two things to note about this:

  1. The source code will be added to the SourceManager of the assembler using the default virtual file name for the #exec module, which, as you'd expect, is just #exec. The key thing here is that every time assemble_program is called on the assembler in this fashion (i.e. passing a string as the code for the #exec module), it will attempt to store that source code in the underlying SourceManager using the same name. That leads me to my next point.
  2. If the assembler is being cloned and used multiple times to assemble_program some code, using a string as the source code, then all of the resulting programs, except the first one assembled, are going to have invalid SourceSpan values, as the underlying SourceIndex will refer to the source code of the first string of code that was stored in the SourceManager under the #exec virtual file name. This is almost certainly why the diagnostic is being rendered incorrectly.
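The two points above can be reproduced with a toy model. ToySourceManager, ToySpan and span_of below are hypothetical and deliberately much simpler than the real SourceManager; the only behaviour carried over from the description is that loading a second string under an existing name yields the index of the first one.

```rust
use std::collections::HashMap;

// Toy append-only source manager keyed by (virtual) file name.
#[derive(Default)]
pub struct ToySourceManager {
    files: Vec<String>,
    by_name: HashMap<String, usize>,
}

// A span is a file index plus a byte range into that file.
#[derive(Debug, Clone, Copy)]
pub struct ToySpan {
    pub file: usize,
    pub start: usize,
    pub end: usize,
}

impl ToySourceManager {
    // If `name` is already present, the existing index is returned and the
    // new content is ignored.
    pub fn load(&mut self, name: &str, content: &str) -> usize {
        if let Some(&index) = self.by_name.get(name) {
            return index;
        }
        let index = self.files.len();
        self.files.push(content.to_string());
        self.by_name.insert(name.to_string(), index);
        index
    }

    // Resolves a span to text; `None` if the file or byte range does not exist.
    pub fn resolve(&self, span: ToySpan) -> Option<&str> {
        self.files.get(span.file)?.get(span.start..span.end)
    }
}

// "Assembles" `code` under `name` and returns a span for the first
// occurrence of `needle`, computed against the code that was passed in.
pub fn span_of(
    manager: &mut ToySourceManager,
    name: &str,
    code: &str,
    needle: &str,
) -> Option<ToySpan> {
    let file = manager.load(name, code);
    let start = code.find(needle)?;
    Some(ToySpan { file, start, end: start + needle.len() })
}
```

Loading "begin push.1 end" and then "begin mem_storew end" under the same name #exec gives both spans the same file index. The span for mem_storew (bytes 6..16) is still in range, so it resolves without error, but to "push.1 end" from the first string, which is the kind of plausible-looking but wrong snippet seen in the diagnostic.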

There are a couple things we could do here:

  1. Modify SourceManager to allow overwriting the content of a source file, when the path is virtual. We don't actually meaningfully distinguish between virtual/real file paths in the SourceManager currently, and we can only meaningfully do so when the std feature of miden-core is enabled, or we'd need to refactor the API a bit to take a type which encodes this distinction (we actually have a richer representation of input files in the compiler, which we use for a similar purpose). I'm less keen on these options, since it complicates the SourceManager implementation, but they are viable options.
  2. Pass an Arc<SourceFile> to assemble_program rather than a &str/&String, since you can then specify a unique virtual file name for the source code, thereby side-stepping the issue entirely. I'd implement the conversion in the code method of NoteBuilder, and either derive a name somehow, or require the caller to specify one in addition to the source code itself. This is the simplest, and most idiomatic, fix for this type of usage pattern.

EDIT: I just realized that Option 1 above is a non-starter for a different reason - it would invalidate the source spans of the previously assembled program. The SourceManager is designed to be append-only.
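A toy sketch of the unique-name idea behind Option 2, under the same caveats as any stand-in: ToySourceManager and load_unique are hypothetical, and the real fix would construct an Arc<SourceFile> with a caller-chosen or derived name rather than use this counter scheme.

```rust
// Toy append-only store of (name, content) pairs.
#[derive(Default)]
pub struct ToySourceManager {
    files: Vec<(String, String)>,
}

impl ToySourceManager {
    // Stores `content` under a fresh virtual name derived from `prefix` and
    // the current number of files, so an existing entry is never reused.
    pub fn load_unique(&mut self, prefix: &str, content: &str) -> usize {
        let index = self.files.len();
        let name = format!("{}#{}", prefix, index);
        self.files.push((name, content.to_string()));
        index
    }

    pub fn name(&self, index: usize) -> Option<&str> {
        self.files.get(index).map(|(name, _)| name.as_str())
    }

    pub fn content(&self, index: usize) -> Option<&str> {
        self.files.get(index).map(|(_, content)| content.as_str())
    }
}
```

Because nothing is overwritten, spans handed out for earlier programs stay valid, which keeps the append-only property mentioned in the EDIT intact.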

@bobbinth bobbinth left a comment

Looks good! Thank you! I'm going to merge as is and we can address some of the outstanding things in follow-up PRs. These include:

  • Potentially moving source_manager to be a field in TransactionExecutor.
  • Figuring out how to fix the incorrect source location reference (if it is still an issue).

block_ref: BlockNumber,
notes: InputNotes<InputNote>,
tx_args: TransactionArgs,
source_manager: Arc<dyn SourceManager>,
Contributor

I think this is fine for now, but I would probably change this in the future to make source_manager a field in the TransactionExecutor. The main reasoning here is that source manager is probably set at the time we create a TransactionExecutor and then we should be able to re-use it for different transactions. If, however, we do need to pass a new source manager into this function from time to time, then keeping it as a function argument is fine.

tx_script: TransactionScript,
advice_inputs: AdviceInputs,
foreign_account_inputs: Vec<AccountInputs>,
source_manager: Arc<dyn SourceManager>,
Contributor

Same comment as above.

block_ref: BlockNumber,
notes: InputNotes<InputNote>,
tx_args: TransactionArgs,
source_manager: Arc<dyn SourceManager>,
Contributor

Same comment as above.

@bobbinth bobbinth merged commit fb9ad56 into next May 19, 2025
16 checks passed
@bobbinth bobbinth deleted the pgackst-upgrade-vm branch May 19, 2025 05:50
@bobbinth
Contributor

@igamigo - we probably will need to make corresponding updates in miden-client and possibly in miden-node as well.
