Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When doing equivalence checking, CALL to unknown code can be approximated in a different way #669

Open
msooseth opened this issue Feb 18, 2025 · 2 comments

Comments

@msooseth
Copy link
Collaborator

As @charles-cooper mentioned on the chat, for equivalence checking -- and only for equivalence checking -- one can do a different kind of overapproximation for call to unknown code via CALL. If codeA and codeB both call contract C with the same prestate and same calldata then and only then, the post-state after the call can be assumed to be symbolic, but equivalent. Which obviously does not work for regular symbolic execution, which e.g. the #658 does allow us to perform. However, this kind of CALL does not necessitate --promise-no-reent, i.e. the system may actually be non-reentrant.

NOTE: It may be important to think though things related to this. When code-A has a function "mess_me_up()" and code-B does not have that function, then even though the two codes have the same prestate and the same calldata, the end state may be different -- in one case, a malicious contract could have called mess_me_up(), while in the other, it could not. However, the mess_me_up() public will be detected as a difference anyway...

@charles-cooper
Copy link

NOTE: It may be important to think though things related to this. When code-A has a function "mess_me_up()" and code-B does not have that function, then even though the two codes have the same prestate and the same calldata, the end state may be different -- in one case, a malicious contract could have called mess_me_up(), while in the other, it could not. However, the mess_me_up() public will be detected as a difference anyway...

i don't think this is a huge deal. if the code up to the CALL is equivalent, and the code after the CALL is equivalent, the contracts are equivalent. (the simplest case here is if the code after the CALL in both contracts is byte-for-byte equal).

@msooseth
Copy link
Collaborator Author

While thinking about this yesterday, I realized that we have a bug in our overapproximation when it comes to equivalence checking -- but not when it comes to regular symbolic execution or forge-base symbolic execution. We call the fresh variables that we assign to the return value and return data the same name for both contract A and contract B -- and so during equivalence checking, they are assumed to be the same. Which is not correct. They can be different.

I will add a fix to this tomorrow. It will help me think about how to do this pre-state+calldata implies post-state thing at the same time. I'll have to collect the prestate(s)+calldata(s). Then we'll have an array of pairs [(prestates-codeA+calldata-codeA, return-buffer-codeA),...] and [(prestates-codeB+calldata-codeB,return-buffer-codeB),...]. Then we can do this:

IF prestates-codeA+calldata-codeA == prestates-codeB+calldata-codeB THEN return-buffer-codeA == return-buffer-codeB

This will be needed to be able to match staticcall returndata in a way that's actually correct (as per above, it's wrongly assumed to be the same). Then, expanding this to also allow for reentry is a lot easier.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants