UKM integration design #116

240 changes: 240 additions & 0 deletions ukm-integration.md
UKM Integration Design
======================

The main restriction for the Rust semantics is that we can't declare structs
and we can't implement traits. While we could add those, it is non-trivial,
and we will try to avoid this. In general, we will try to avoid adding
non-trivial features to the Rust-lite implementation.

However, note that we can "declare" structs in K, and they will work properly in
the current Rust semantics. Also, for Mx we are loading traits as contracts
(in the actual Mx world, macros create the actual contract from the
trait, but we are skipping that).

As with the Mx semantics, we are going to keep the Rust semantics as a pure Rust
semantics, and we are going to have a second semantics that will add the
blockchain features.

Structs
-------

Here is what is likely to be needed if we want to implement the curly-brace
variant of structs as a Rust construct and not as a K construct as we plan
right now (we would be ignoring tuple structs):

1. We would need to decide exactly what functionality we need.
2. For declaring structs (this part is not that hard to do, but still requires
   some work) we would need to:
   * Parse struct types from the input, and add them to the configuration
   * Implement struct literals (both parsing and evaluation)
3. For functionality based on structs, we would need to do some (all?) of the following:
   * Preprocess `impl StructName { ... }` (add it to the configuration, index its
     methods) - this should not be hard
   * Implement trait implementations, e.g., the following should work
     ```rs
     pub trait From<T>: Sized {
         fn from(value: T) -> Self;
     }

     impl From<i64> for StructName {
         fn from(v: i64) -> Self { ... }
     }
     ```
   * Many important traits use generics, and, if we want to be able to implement
     them, we would also need to be able to handle generics. I'm not sure yet
     what that would mean in practice. Besides that, we would need to figure out
     exactly which method to call given several traits (or trait hierarchies if
     we want to also handle that) and several implementations for a given struct
     (or set of structs, since, e.g., the right `From` implementation depends on
     two structs).

     In general, this seems non-trivial, e.g., when we call `stuff.into()`, which
     implementation we need to pick also depends on the type of the result we
     expect from `into()`. Currently, we can't detect the expected type of an
     expression, and this would require preprocessing each function's code.

     After deciding what functionality we need, we would need to go through the
     reference and figure out how hard it would be to implement.

   * If we want to implement things like
     `#[derive(Clone, Default, Debug, PartialEq, Eq)]`, which seem popular in
     the Rust world, we would first need to decide what we want to do about them.
     I think we can expand them to actual implementations if we run the input
     source through the "expand macros" compiler phase, but that's available
     only in nightly Rust builds, so we should decide if we want to depend on
     them. Also, it's important to know that we can't control the generated code,
     so we should check that it does not contain features that would be hard to
     implement.
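
To make the `into()` problem mentioned above concrete, here is a small sketch in
plain Rust (not Rust-lite). The types `Meters` and `Label` are hypothetical and
exist only for illustration. The two calls to `raw.into()` are textually
identical, and only the expected type on the left-hand side selects the
implementation:

```rust
// Hypothetical types, used only to illustrate implementation selection.
struct Meters(i64);
struct Label(String);

impl From<i64> for Meters {
    fn from(v: i64) -> Self {
        Meters(v)
    }
}

impl From<i64> for Label {
    fn from(v: i64) -> Self {
        Label(format!("#{}", v))
    }
}

fn convert() -> (i64, String) {
    let raw: i64 = 7;
    // Same expression both times; the type annotation decides which
    // `From<i64>` implementation backs the call to `into()`.
    let m: Meters = raw.into();
    let l: Label = raw.into();
    (m.0, l.0)
}
```

Resolving these calls in the semantics would require knowing the expected type
of each expression, which is the preprocessing step described above.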

Contracts
---------

As for Mx, contracts will be traits, identified by an attribute
(`#[ethereum::contract]`). Endpoints will be non-static functions of this trait,
identified by something like `#[endpoint(endpointName)]`. Storage will be
defined as non-static unimplemented functions identified by
`#[storage("storage_name")]`, e.g.:

```rs
#[storage("total_supply")]
fn s_total_supply(&self, key: &u64) -> Storage<u64>;
```

This representation will allow us to kind of reuse some Mx code.

Storage
-------

A storage key is the hash of the storage's name concatenated with the arguments
passed to the storage function (except self). To avoid ambiguities
(e.g. a storage called "a" with an argument "bc" should be different from
a storage called "ab" with "c" as argument), we will encode these using (say) RLP.

This representation will allow us to kind of reuse some Mx code.
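
The ambiguity argument can be checked with a minimal RLP encoder. This is only a
sketch: the function names (`rlp_bytes`, `rlp_list`, `storage_key_preimage`) are
hypothetical, arguments are assumed to be already serialized to bytes, and the
final hashing step is left out because the hash function is not fixed here.

```rust
/// RLP length prefix. `offset` is 0x80 for byte strings and 0xc0 for lists.
fn encode_length(len: usize, offset: u8) -> Vec<u8> {
    if len <= 55 {
        vec![offset + len as u8]
    } else {
        // Big-endian length with leading zero bytes stripped.
        let be = (len as u64).to_be_bytes();
        let first = be.iter().position(|b| *b != 0).unwrap_or(be.len() - 1);
        let len_bytes = &be[first..];
        let mut out = vec![offset + 55 + len_bytes.len() as u8];
        out.extend_from_slice(len_bytes);
        out
    }
}

/// RLP encoding of a byte string.
fn rlp_bytes(data: &[u8]) -> Vec<u8> {
    // A single byte below 0x80 is its own encoding.
    if data.len() == 1 && data[0] < 0x80 {
        return vec![data[0]];
    }
    let mut out = encode_length(data.len(), 0x80);
    out.extend_from_slice(data);
    out
}

/// RLP encoding of a list of byte strings.
fn rlp_list(items: &[&[u8]]) -> Vec<u8> {
    let payload: Vec<u8> = items.iter().flat_map(|i| rlp_bytes(i)).collect();
    let mut out = encode_length(payload.len(), 0xc0);
    out.extend_from_slice(&payload);
    out
}

/// Hypothetical helper: the bytes that would be hashed to get the storage key.
fn storage_key_preimage(name: &str, args: &[&[u8]]) -> Vec<u8> {
    let mut items: Vec<&[u8]> = vec![name.as_bytes()];
    items.extend_from_slice(args);
    rlp_list(&items)
}
```

With this encoding, storage "a" with argument "bc" and storage "ab" with
argument "c" produce different byte sequences, whereas plain concatenation would
give "abc" in both cases.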

Contract calls
--------------

For each contract we will define some helper functions implementing call
functionality. Initially these will be written in K,
but at some point we should use attributes like `#[ethereum::contract]` to
generate actual Rust code.

We will have a function called `Contract#call` that decodes the function's
hash from the request, matches it against the hashes of the endpoints, then
calls the endpoint's wrapper.

For each endpoint, we will have a function called `Contract#endpoint#<endpointName>`.
This will decode the endpoint's arguments and pass them to the endpoint. It will
also take the endpoint's return value, encode it, and store it
in the configuration (there will be an internal hook for that).
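
A rough Rust sketch of this dispatch, under several assumptions: the selector is
the first 4 bytes of the request, arguments and results are 32-byte big-endian
words, the selector constants are placeholders rather than computed hashes, and
the `double` endpoint is invented for illustration. The wrappers return the
encoded result instead of storing it through an internal hook.

```rust
// Placeholder selectors; a real implementation would derive them from the
// endpoint signatures instead of hard-coding them.
const SELECTOR_TOTAL_SUPPLY: [u8; 4] = [0x18, 0x16, 0x0d, 0xdd];
const SELECTOR_DOUBLE: [u8; 4] = [0xde, 0xad, 0xbe, 0xef];

/// Encode a u64 as one 32-byte big-endian word.
fn encode_u64(v: u64) -> Vec<u8> {
    let mut word = vec![0u8; 24];
    word.extend_from_slice(&v.to_be_bytes());
    word
}

/// Decode one 32-byte big-endian word that must fit in a u64.
fn decode_u64(args: &[u8]) -> Result<u64, String> {
    if args.len() != 32 {
        return Err("expected exactly one 32-byte argument".to_string());
    }
    if args[..24].iter().any(|b| *b != 0) {
        return Err("argument does not fit in u64".to_string());
    }
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&args[24..]);
    Ok(u64::from_be_bytes(buf))
}

struct Contract {
    total_supply: u64,
}

impl Contract {
    // Endpoints.
    fn total_supply(&self) -> u64 {
        self.total_supply
    }
    fn double(&self, x: u64) -> u64 {
        x.wrapping_mul(2)
    }

    // Wrappers, playing the role of `Contract#endpoint#<endpointName>`.
    fn endpoint_total_supply(&self, _args: &[u8]) -> Result<Vec<u8>, String> {
        Ok(encode_u64(self.total_supply()))
    }
    fn endpoint_double(&self, args: &[u8]) -> Result<Vec<u8>, String> {
        let x = decode_u64(args)?;
        Ok(encode_u64(self.double(x)))
    }

    // Dispatcher, playing the role of `Contract#call`.
    fn call(&self, request: &[u8]) -> Result<Vec<u8>, String> {
        if request.len() < 4 {
            return Err("request too short".to_string());
        }
        let (selector, args) = request.split_at(4);
        if selector == SELECTOR_TOTAL_SUPPLY.as_slice() {
            self.endpoint_total_supply(args)
        } else if selector == SELECTOR_DOUBLE.as_slice() {
            self.endpoint_double(args)
        } else {
            Err("unknown endpoint".to_string())
        }
    }
}
```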

UKM Hooks
---------

A contract trait will have a special function
`fn blockchain_hooks(self) -> BlockchainHooks;`.
This function is defined automatically by the semantics. The `BlockchainHooks`
trait, also defined automatically by the semantics, will provide functions
which call the hooks directly, e.g. it will have a function
```rs
fn GetAccountBalance(&self, acct: &Int160) -> Int256;
```
that translates directly to a call to
```k
syntax Int ::= GetAccountBalance(acct: Int) [function, hook(UKM.getBalance), total]
```

Internal Hooks
--------------

A contract trait will have a special function
`fn internal_hooks(self) -> InternalHooks;` which will return an object with
which to access the internal configuration, similar to `BlockchainHooks`.

Int Types
---------

Rust has native int types up to 128 bits. However, we will need int values
with 160 and 256 bits. We don't want to implement structs. We could implement
them as tuples, but that's confusing. Also, we don't want to implement traits.

One option would be to hold the actual int256 values in the configuration, in
a `Map` from `Int` to `MInt{256}` or something similar. Their Rust
representation would be a K-defined struct which just holds the int ID.
Instead of implementing traits for operators (e.g. `std::ops::Add`), we would
handle the operators manually, translating them to internal hooks.

Another option would be to use K-defined structs for Int256, the struct holding
4 int64 values. We would still need to implement the operators in K, but
we could translate them to calls to Rust functions (also see the
[Helper functions](#helper-functions) section).

I prefer the second option, which seems cleaner, but may require somewhat
more work.

All functions in `BlockchainHooks` that take Ints as arguments will take whatever
representation we choose from the above.
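
As an illustration of the second option, here is a sketch of addition on a
four-limb value, written in plain Rust. The field name `limbs`, the use of `u64`
rather than `i64`, the least-significant-first limb order, and the wrapping
behavior on overflow are all assumptions made for this example. In Rust-lite the
struct itself would be declared in K.

```rust
/// Hypothetical layout: four 64-bit limbs, least-significant limb first.
struct Int256 {
    limbs: [u64; 4],
}

/// Addition modulo 2^256, propagating the carry from limb to limb.
fn int256_add(a: &Int256, b: &Int256) -> Int256 {
    let mut limbs = [0u64; 4];
    let mut carry = false;
    for i in 0..4 {
        let (partial, c1) = a.limbs[i].overflowing_add(b.limbs[i]);
        let (sum, c2) = partial.overflowing_add(carry as u64);
        limbs[i] = sum;
        // At most one of the two additions can overflow.
        carry = c1 || c2;
    }
    // A carry out of the last limb is dropped (wrapping semantics).
    Int256 { limbs }
}
```

An operator such as `a + b` on these values would then be translated by the
semantics into a call to a function like `int256_add`.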

Bytes encoding/decoding for values
----------------------------------

At least two places require encoding values as bytes: storage access and contract
calls. As above, we have two options: implement bytes operations in Rust, or
keep bytes in a Map from Id:Int to Bytes in the configuration and provide
access through hooks.

This time, the easiest solution seems to be the configuration Map, with
a struct holding the ID in Rust. We would provide hooks for encoding and
decoding values, bytes substrings and concatenation.

All functions in `BlockchainHooks` that take bytes as arguments will take the
struct representation.
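
The handle-based design can be modeled in plain Rust as follows. This is a
sketch only: `BytesStore` stands in for the configuration Map, `BytesHandle` for
the struct holding the ID, and the method names stand in for hooks whose real
names are not decided here.

```rust
use std::collections::HashMap;

/// What contract code would hold: an opaque ID, not the bytes themselves.
struct BytesHandle {
    id: u64,
}

/// Stand-in for the configuration Map from Id:Int to Bytes.
struct BytesStore {
    next_id: u64,
    values: HashMap<u64, Vec<u8>>,
}

impl BytesStore {
    fn new() -> Self {
        BytesStore { next_id: 0, values: HashMap::new() }
    }

    /// Store a new bytes value and return a fresh handle to it.
    fn alloc(&mut self, data: Vec<u8>) -> BytesHandle {
        let id = self.next_id;
        self.next_id += 1;
        self.values.insert(id, data);
        BytesHandle { id }
    }

    /// Read the bytes behind a handle, if the ID is known.
    fn get(&self, h: &BytesHandle) -> Option<&[u8]> {
        self.values.get(&h.id).map(|v| v.as_slice())
    }

    /// Concatenation hook: the result is a new entry with a new ID.
    fn concat(&mut self, a: &BytesHandle, b: &BytesHandle) -> Option<BytesHandle> {
        let mut joined = self.values.get(&a.id)?.clone();
        joined.extend_from_slice(self.values.get(&b.id)?);
        Some(self.alloc(joined))
    }

    /// Substring hook over the half-open range `start..end`.
    fn substr(&mut self, h: &BytesHandle, start: usize, end: usize) -> Option<BytesHandle> {
        let part = self.values.get(&h.id)?.get(start..end)?.to_vec();
        Some(self.alloc(part))
    }
}
```

In this model every operation allocates a new entry and never mutates an
existing one, so a handle always refers to the same bytes.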

For contract call data we are currently assuming Ethereum's
[encoding](https://ethereum.org/en/developers/docs/transactions/#the-data-field),
but it would be preferable to have explicit confirmation.

Contract encoding and decoding
------------------------------

One option would be to encode the term as JSON, in a similar way to how the
various K tools do this. This would require us to implement some sort of
reflection for the sorts that are interesting for the contract encoding,
but that will happen for all possible encodings anyway.

In case we are interested in a more efficient representation, I will describe
a possible binary encoding below:

First, below I am mentioning IDs that are assigned manually. While we could
probably assign them automatically, if we are interested in extending the
semantics without breaking the existing contracts, assigning them manually
is probably safer.

We will define a set of sorts that are interesting to us for encoding/decoding
and we will manually assign int identifiers for each of them.

Any value of a given sort starts with the encoding of the sort's int identifier
as a 4-byte int, followed by the value's encoding.

If this is a builtin sort (Int, Map, String) we will define some sane encoding
(e.g. ints are encoded as the int's length in bytes represented on 4 bytes
(say, big-endian) followed by the int's bytes).

Otherwise, we will (manually) assign an int ID for each of the sort's
constructors, and one ID for injections.

If the current value is injected, we will write the injection ID followed by
the injected sort ID and the injected value representation.

If the value is created with a constructor, we write the constructor's ID,
followed by the representation of the arguments.
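
The scheme above could look roughly like this in Rust. Everything concrete here
is an assumption: the ID values, the `Value` type, the 4-byte big-endian width
for constructor and injection IDs, and the choice to let the recursive call
write the injected sort's ID (since every value starts with its sort ID).

```rust
// Hypothetical, manually assigned IDs.
const SORT_INT: u32 = 1;
const SORT_EXP: u32 = 100; // an example user-defined sort
const CTOR_INJECTION: u32 = 0;
const CTOR_EXP_ADD: u32 = 1; // an example constructor of SORT_EXP

/// Simplified model of a term, enough to show the three encoding cases.
enum Value {
    /// Builtin Int, held as big-endian bytes.
    Int(Vec<u8>),
    /// A value of `sort` obtained by injecting a value of another sort.
    Injection { sort: u32, inner: Box<Value> },
    /// A value of `sort` built with constructor `ctor`.
    Constructor { sort: u32, ctor: u32, args: Vec<Value> },
}

fn encode(v: &Value, out: &mut Vec<u8>) {
    match v {
        Value::Int(bytes) => {
            // Sort ID, then the length on 4 bytes, then the int's bytes.
            out.extend_from_slice(&SORT_INT.to_be_bytes());
            out.extend_from_slice(&(bytes.len() as u32).to_be_bytes());
            out.extend_from_slice(bytes);
        }
        Value::Injection { sort, inner } => {
            out.extend_from_slice(&sort.to_be_bytes());
            out.extend_from_slice(&CTOR_INJECTION.to_be_bytes());
            // The recursive call writes the injected sort's ID first,
            // followed by the injected value's representation.
            encode(inner, out);
        }
        Value::Constructor { sort, ctor, args } => {
            out.extend_from_slice(&sort.to_be_bytes());
            out.extend_from_slice(&ctor.to_be_bytes());
            for arg in args {
                encode(arg, out);
            }
        }
    }
}
```

Decoding would mirror this: read the sort ID, then either the builtin payload or
a constructor ID that determines how many arguments follow.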

Helper functions
----------------

We may have to implement some helper functions for various reasons, e.g.
for adding ints. If the effort is not very large, it is probably safer to
implement them in Rust. To do that, when encoding the contract, we will allow
loading a separate trait, from a separate file, containing builtins. This trait
must be called `Builtins`.

This trait will be available everywhere, so everyone will be able to call
things like `Builtins::helperFunction(...)`.

Testing
-------

For a while, we will not have access to the actual hooks, so we would need to
mock them (which may be a good idea even if we have access to the hooks).

One option for that would be to use a cell called `<mocks>` which contains
a `Map` from `KItem` to `K` (wrapped in some constructor), and then we could
have this rule:

```k
rule
    <k> A:KItem => B ... </k>
    <mocks> A |-> wrapped(B:K) ... </mocks>
  [priority(10)]
```