
smirk - merkle tree crate #173


Draft · wants to merge 15 commits into main

Conversation

cameron1024 (Contributor)

This PR adds the smirk crate ("stable merk")

  • MerkleTree type for the tree itself
  • Storage trait and RocksdbStorage struct
  • Hashable trait (and maybe a derive macro in the future) for "things which can be RPO hashed" - if there's an existing abstraction with better ecosystem support, that would be ideal, but a quick google didn't reveal anything obvious
  • smirk! macro for easily creating trees (e.g. smirk!{ 1 => "hello", 2 => "world" })
  • some iterators
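For a rough picture of how these pieces fit together, here's a minimal sketch; `MerkleTree::new`, `insert`, and `root_hash` are assumed names, not necessarily the crate's actual API:

use smirk::{smirk, MerkleTree};

fn example() {
    // Literal construction via the macro (example from the list above):
    let a = smirk! { 1 => "hello", 2 => "world" };

    // Assumed equivalent incremental construction:
    let mut b = MerkleTree::new();
    b.insert(1, "hello");
    b.insert(2, "world");

    // Two trees with the same entries should agree on the root hash
    // (`root_hash` is an assumed accessor).
    assert_eq!(a.root_hash(), b.root_hash());
}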

This is a WIP, but some "known issues" that I don't have good answers to are:

  • should we even try to have a generic storage backend at all? For it to be useful, we'd really need the ability to have transactions that span both the indexer, and smirk itself, which would probably require exposing some rocksdb types in the API, or we'd have to come up with our own abstraction. The latter probably isn't worth it until we start building serious support for a second backend presumably? Happy to be wrong on this though
  • similarly, do we want to be generic over the hash function? RPO seems to be pretty baked-into miden, so I assumed we wouldn't really want to use anything else 🤷
  • if we're reusing rocksdb instances from the indexer, would we need namespacing? Googling "rocksdb namespace" didn't show anything obvious - is it as simple as prefixing every key with b"smirk" and calling it a day? (see the sketch after this list)
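A minimal sketch of the key-prefixing idea, assuming we'd wrap a shared rocksdb::DB handle (the helper names here are hypothetical):

use rocksdb::DB;

const PREFIX: &[u8] = b"smirk/";

// Hypothetical helper: namespace smirk's keys inside a shared DB by
// prefixing, so smirk keys can't collide with anyone else's.
fn namespaced(key: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(PREFIX.len() + key.len());
    out.extend_from_slice(PREFIX);
    out.extend_from_slice(key);
    out
}

fn put(db: &DB, key: &[u8], value: &[u8]) -> Result<(), rocksdb::Error> {
    db.put(namespaced(key), value)
}

That said, RocksDB's column families are the more idiomatic answer to namespacing (the rust crate exposes them via `cf_handle` and the `*_cf` methods), and they avoid paying for the prefix bytes on every key.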

Known issues that I do have good answers for (listed here so they don't block review 😅):

  • the verify function is broken; the fix is a little tricky
  • the rocksdb instance won't close when dropped, which causes tests to hang indefinitely
  • a few other bits and pieces

calummoore (Contributor)

should we even try to have a generic storage backend at all? For it to be useful, we'd really need the ability to have transactions that span both the indexer, and smirk itself, which would probably require exposing some rocksdb types in the API, or we'd have to come up with our own abstraction. The latter probably isn't worth it until we start building serious support for a second backend presumably? Happy to be wrong on this though

Agreed; I don't think we need to support a generic backend, since we have no active plans to expand beyond rocksdb for now.

similarly, do we want to be generic over the hash function? RPO seems to be pretty baked-into miden, so I assumed we wouldn't really want to use anything else

That's fine, as we only need RPO for now. We can always add another later; there are other priorities 👍

if we're reusing rocksdb instances from the indexer, would we need namespacing? Googling "rocksdb namespace" didn't show anything obvious - is it as simple as prefixing every key with b"smirk" and calling it a day?

Rollup and indexer will now be separate (they may not even be on the same node), so there is no need to consider the indexer at all. We might want to store some data related to the current committed proposal, but to be honest, given that we can now very quickly identify which proposal is committed (from the root hash), we can probably just store that separately.

the rocksdb instance won't close when dropped, which causes tests to hang indefinitely

@mateuszmlc did you see this when implementing the indexer?

mateuszmlc (Collaborator)

did you see this when implementing the indexer?

No, I haven't experienced this.

calummoore (Contributor) left a comment

Would be good to do some basic perf benchmarks on this to see what level of throughput we can expect from the current implementation, which would guide us on whether we need to make some perf optimisations now or can wait until later.
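For instance, a minimal criterion sketch of the kind of benchmark meant here (insertion throughput; `MerkleTree::new`/`insert` are assumptions about the crate's API):

use criterion::{criterion_group, criterion_main, Criterion};
use smirk::MerkleTree;

// How fast can we build a tree of 10k entries from scratch?
fn bench_insert(c: &mut Criterion) {
    c.bench_function("insert 10k", |b| {
        b.iter(|| {
            let mut tree = MerkleTree::new();
            for i in 0..10_000u64 {
                tree.insert(i, i.to_string());
            }
            tree
        })
    });
}

criterion_group!(benches, bench_insert);
criterion_main!(benches);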

    K: Ord + 'static,
    V: Hashable + 'static,
{
    fn store_tree(&self, tree: &MerkleTree<K, V>) -> Result<(), Error>
Contributor

Do we only have the option to store the entire tree every time we want to save it? That seems like quite a performance hit. I was expecting that we'd be able to write just the changes to the tree (I think that's how merk does it).

cameron1024 (Author)

I'm going to add an incremental API that allows you to specify updates and then apply them to the in-memory tree or the storage.
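Something in this direction, perhaps; all names below are illustrative only, not the actual API:

use smirk::{Batch, MerkleTree, Storage};

// Hypothetical incremental API: collect changes into a batch, then
// apply it to the in-memory tree and persist only the touched nodes,
// instead of rewriting the whole tree on every save.
fn update(tree: &mut MerkleTree<u64, String>, storage: &impl Storage) -> Result<(), smirk::Error> {
    let mut batch = Batch::new();
    batch.insert(1, "hello".to_string());
    batch.remove(&2);

    tree.apply(&batch);      // in-memory update
    storage.apply(&batch)?;  // write only the changed nodes
    Ok(())
}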

let hash = node.value().hash();
let bytes = codec::encode(&(node.key(), node.value())).map_err(err)?;

tx.put(&hash.to_bytes(), &bytes).map_err(err)?;
Contributor

If we're using the hash as a key, I guess we're not storing the records in the same order as the merkle tree? It might be useful to store using the key, as that means record access will likely be closer together (i.e. when fetching stuff for updates), and we can easily split a rollup in half (when sharding)

cameron1024 (Author)

Yep, I misread the merk docs - updating to be more consistent (and indexed by key, so retrieving subtrees is fast)
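i.e., roughly this change relative to the snippet above (encoding the key via `codec` is an assumption):

// Index nodes by (encoded) key instead of by hash, so records that are
// adjacent in the tree are adjacent on disk and subtree range scans
// stay cheap.
let key_bytes = codec::encode(node.key()).map_err(err)?;
let bytes = codec::encode(&(node.key(), node.value())).map_err(err)?;
tx.put(&key_bytes, &bytes).map_err(err)?;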

Contributor

Would be good to add some tests to ensure the tree remains balanced.
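For example, something along these lines; `depth` is an assumed accessor, and the exact bound depends on which balance invariant the tree maintains:

#[test]
fn stays_balanced_under_sequential_inserts() {
    // Sequential keys are the worst case for an unbalanced BST:
    // without rebalancing, depth would grow linearly with n.
    let mut tree = MerkleTree::new();
    for i in 0..1024u64 {
        tree.insert(i, i.to_string());
    }
    // A balanced tree of 1024 entries should be ~10 levels deep;
    // 20 is a loose bound assuming AVL-style balancing
    // (height <= ~1.44 * log2(n)).
    assert!(tree.depth() <= 20);
}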
