-
Notifications
You must be signed in to change notification settings - Fork 19
Revisit Opcode type
#1642
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Revisit Opcode type
#1642
Conversation
* Add build for evm-tracing at e2e code job * Skip bash tests for evm-tracing test suite * Add template for evm-tracing test suite
Runtime level implementation
Client-side implementation
Implement RPC side
6b58725 to
e47ffbe
Compare
|
I think the original idea of the code structure was that the client doesn't know specifics of the opcodes - but the runtime does; in the same sense - client doesn't have the dependency on an |
In this case we can just use |
|
That's ideal too if we always need a single byte; that said - we might need more; the old way of doing things communicated opcodes via, essentially, strings - right? Maybe we want to do the same - but with a more narrow memory footprint |
|
There are no more than 256 different opcodes. The number can be changed based on future possible more opcodes in future EVM versions. So, the question is should we take into account more than the current number ? Or, is it enough to consider u8 for current EVM version opcodes number ? I don't like their current representation as they map opcode to string representation and then convert to vector of u8. |
quasiyoke
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the original idea of the code structure was that the client doesn't know specifics of the opcodes
It’s hard to say that the client previously didn't know what opcodes were, and treated them as black boxes. For example, various kinds of CREATE were listed, as well as different CALLs.
maybe the
evm::Opcodetype supports opaque conversion and representing the "other" variant.
Yes, they support it in a sense, since opcodes can include "invalid" bytes. They didn't make Opcode an enum.
We might need more [than a single byte].
Currently, they haven't used about 30% of the available opcode space. I think the solution proposed here is a good balance between complexity, future-proofing, and performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the best way is to have a new type that would pretty much be this:
enum MarshalledOpcode {
Code(evm::Opcode),
Unknown(SmallVec<8>), // 8 cause the longest is 13 and the epmeric estimate of max length for the most popular ones in 7-ish
}It is unlikely that the previously used and deprecated opcodes names will ever be reused - so it quite a safe bet. But then again - maybe just do the SmallVec<8> and be done with this?
We don't want to have constraints from the client side here - we want runtime to fully manage the tracing side. Note that the client side is made such that it never needs to actually deal with the opcode itself - only call context.
392b9dd to
6afbfa0
Compare
611dea3 to
c238ca6
Compare
|
Revisited by introducing |
18bc2b5 to
04fc893
Compare
04fc893 to
357c9d9
Compare
| impl Encode for MarshalledOpcode { | ||
| fn encode(&self) -> Vec<u8> { | ||
| self.0.clone().to_vec().encode() | ||
| } | ||
| } | ||
|
|
||
| impl Decode for MarshalledOpcode { | ||
| fn decode<I: codec::Input>(input: &mut I) -> Result<Self, codec::Error> { | ||
| let bytes = Vec::decode(input)?; | ||
| Ok(MarshalledOpcode(SmallVec::from_vec(bytes))) | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs more optimized encode/decode
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you suggest something as I spent a lot of time trying different ways and met some issues with the encoding and decoding result data.
Then I decided to walk through codec implementation and this approach comes based on implementation for String.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is to avoid going through the Vec and an associated allocation; also through the String due to the underlying Vec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about using Cow instead of Vec - eb7fec8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What generics params for cow do you have in mind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have a malloc somewhere it (obviously) defeats the purpose; can we avoid a malloc at all there? Given that they go through the encode/decode traits? I think there should be a way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed type - slice [u8]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check whether Vec<u8> is used under the hood anywhere there - afaik borrow for [u8] is Vec<u8>; then it won't work
e520c25 to
ce000a6
Compare
eb7fec8 to
9659a59
Compare
| let d = std::str::from_utf8(opcode) | ||
| .map_err(|_| S::Error::custom("Opcode serialize error."))? | ||
| .to_uppercase(); | ||
|
|
||
| serializer.serialize_str(&d) | ||
| serializer.serialize_str(&opcode.to_string()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code before didn't malloc (used str); now it does (goes through String); two questions here @dmitrylavrenov.
- Do you understand the difference?
- Why change it this way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it comes from Display implementation that uses malloc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, should be improved
| write!( | ||
| f, | ||
| "{}", | ||
| sp_core::sp_std::str::from_utf8(self.0.as_slice()).map_err(|_| core::fmt::Error)? | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a very convoluted way to write this
| write!( | |
| f, | |
| "{}", | |
| sp_core::sp_std::str::from_utf8(self.0.as_slice()).map_err(|_| core::fmt::Error)? | |
| ) | |
| let s = sp_core::sp_std::str::from_utf8(self.0.as_slice()).map_err(|_| core::fmt::Error)?; | |
| write!( | |
| f, | |
| "{s}", | |
| ) |
Is much better; and then there are more ways to improve - like .write_str...
a08006e to
0225446
Compare
Closes #1600