When doing equivalence checking, CALL to unknown code can be approximated in a different way #669

msooseth · 2025-02-18T19:45:28Z

As @charles-cooper mentioned on the chat, for equivalence checking -- and only for equivalence checking -- one can do a different kind of overapproximation for call to unknown code via CALL. If codeA and codeB both call contract C with the same prestate and same calldata then and only then, the post-state after the call can be assumed to be symbolic, but equivalent. Which obviously does not work for regular symbolic execution, which e.g. the #658 does allow us to perform. However, this kind of CALL does not necessitate --promise-no-reent, i.e. the system may actually be non-reentrant.

NOTE: It may be important to think though things related to this. When code-A has a function "mess_me_up()" and code-B does not have that function, then even though the two codes have the same prestate and the same calldata, the end state may be different -- in one case, a malicious contract could have called mess_me_up(), while in the other, it could not. However, the mess_me_up() public will be detected as a difference anyway...

The text was updated successfully, but these errors were encountered:

charles-cooper · 2025-02-19T16:58:31Z

NOTE: It may be important to think though things related to this. When code-A has a function "mess_me_up()" and code-B does not have that function, then even though the two codes have the same prestate and the same calldata, the end state may be different -- in one case, a malicious contract could have called mess_me_up(), while in the other, it could not. However, the mess_me_up() public will be detected as a difference anyway...

i don't think this is a huge deal. if the code up to the CALL is equivalent, and the code after the CALL is equivalent, the contracts are equivalent. (the simplest case here is if the code after the CALL in both contracts is byte-for-byte equal).

msooseth · 2025-02-19T17:10:35Z

While thinking about this yesterday, I realized that we have a bug in our overapproximation when it comes to equivalence checking -- but not when it comes to regular symbolic execution or forge-base symbolic execution. We call the fresh variables that we assign to the return value and return data the same name for both contract A and contract B -- and so during equivalence checking, they are assumed to be the same. Which is not correct. They can be different.

I will add a fix to this tomorrow. It will help me think about how to do this pre-state+calldata implies post-state thing at the same time. I'll have to collect the prestate(s)+calldata(s). Then we'll have an array of pairs [(prestates-codeA+calldata-codeA, return-buffer-codeA),...] and [(prestates-codeB+calldata-codeB,return-buffer-codeB),...]. Then we can do this:

IF prestates-codeA+calldata-codeA == prestates-codeB+calldata-codeB THEN return-buffer-codeA == return-buffer-codeB

This will be needed to be able to match staticcall returndata in a way that's actually correct (as per above, it's wrongly assumed to be the same). Then, expanding this to also allow for reentry is a lot easier.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

When doing equivalence checking, CALL to unknown code can be approximated in a different way #669

When doing equivalence checking, CALL to unknown code can be approximated in a different way #669

msooseth commented Feb 18, 2025

charles-cooper commented Feb 19, 2025

msooseth commented Feb 19, 2025

When doing equivalence checking, CALL to unknown code can be approximated in a different way #669

When doing equivalence checking, CALL to unknown code can be approximated in a different way #669

Comments

msooseth commented Feb 18, 2025

charles-cooper commented Feb 19, 2025

msooseth commented Feb 19, 2025