Skip to content

Further optimisation of valid jump destination computation #602

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
PetarMax opened this issue Jun 5, 2024 · 3 comments
Open

Further optimisation of valid jump destination computation #602

PetarMax opened this issue Jun 5, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@PetarMax
Copy link
Contributor

PetarMax commented Jun 5, 2024

Currently, we are computing jump destinations AOT, with the help of pyk. This allows us to handle programs with symbolic parts (such as immutable variables), but is slightly slower than the original approach that was fully integrated with the semantics.

We also have a PR open in KEVM that enables JIT computation, and which is much faster, but incomplete in that it does not account for cases when a jump is made into PUSH data, which should result in a EVMC_BAD_JUMP_DESTINATION error. Extending that PR to account for these cases as well is likely to end up being slower, since what we would need to do is, given a check for a jump destination I:

  • take the 32 bytes of the program preceding I;
  • check, for each of these bytes, from left to right, if the corresponding opcode is a PUSH-N, such that I is included in the N following bytes;
  • make sure to skip all push data found in the process that does not include I.

In general, the assumption that no JUMP/JUMPI lands in PUSH data does not hold for arbitrary EVM code. However, it could be safe in the case of Kontrol, since we are working exclusively with EVM code coming from the compilation of Solidity files. Since the EVM execution never continues if such a jump occurs, I can't think of a use case for the compiler do be doing this deliberately. Perhaps we could ask the developers for their opinion.

If we think that this assumption will be safe, we could implement a layer in Kontrol only that always sets the jump destinations to .Set and only does a basic JIT check. This should give us a solid speed-up.

Alternatively, we could think of automatically generating K rules that capture valid jump destinations for each contract. These rules should be handled very efficiently by the LLVM backend.

@PetarMax PetarMax added the enhancement New feature or request label Jun 5, 2024
@ehildenb
Copy link
Member

ehildenb commented Jun 5, 2024

Actually, checking the preceding 32 bytes is not enough. You have to also check, if you do find a PUSH in that preceding 32 bytes, is that PUSH actually just part of the push data of a prior push? So there isn't a way to know except scanning from the start of the bytecode.

@PetarMax
Copy link
Contributor Author

PetarMax commented Jun 5, 2024

Yes, that is correct.

@palinatolmach
Copy link
Collaborator

palinatolmach commented Jun 7, 2024

Other considerations from Slack:

  • @anvacaru: If set reasoning is the main bottleneck perhaps we could refactor the way we store the valid jumpdests. Instead of having the valid jumpdests as Set elements, we could store them in some kind of frequency array, a boolean array the same length as the bytecode. We should still be able to handle symbolic immutables and everything.
    It seems like this is the approach used by the rust evm implementation (link)
  • @ehildenb: We can also explore other mechanisms (like creating a function-table) for computing valid jump dests. That involves code-gen, which maybe we want to avoid, but at least could be correct for any EVM execution, not just Solidity ones,
  • Or, we could have a tool which scans the bytecode for JUMP and JUMPI instructions, and checks that they are all preceded with a PUSH(...) to a concrete value, then checks that all of those destinations are valid jumpdests (by scanning ahead of time in python). If we confirm that for every bytecode that we load, we can just turn off jump dest validity checks altogether. That would probably also work entirely at the bytecode level. While this approach seems easier and would require straightforward static analysis, it's complicated by the fact that JUMPs are not always preceded by a concrete PUSH, and the destination might be dynamic, e.g. https://ethereum.stackexchange.com/questions/70121/evm-bytecode-debugging-jumps.

In addition, there's a potential follow-up to a change in valid JUMPDESTs calculation implemented in #596. As @anvacaru noted, we can make the 1loadProgram` cut rule optional iff we're executing a constructor with symbolic arguments. We can probably add or pass an option enabling 'EVM.program.load' as a cut-point rule optionally based on the test signature and it being a constructor here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants