docs: Data model for append-only Merkle trees
pav-kv authored and AlCutter committed Sep 23, 2022
1 parent 4b56b44 commit 8e42628
Showing 2 changed files with 942 additions and 8 deletions.
51 changes: 43 additions & 8 deletions docs/data_model.md
@@ -1,13 +1,48 @@
Data Model
==========

This document establishes terminology shared throughout this repository around the
structure and components of the Merkle tree.

TODO. Things to cover:
- Node addressing. Each node is a (level, index) pair.
- Leaves, nodes, roots.
- Perfect and ephemeral nodes.
- Append-only, perfect nodes are immutable.
This document establishes terminology shared throughout this repository around
the structure and components of the Merkle tree.

### Merkle tree

In this repository, a Merkle tree means a slightly modified version of the
"history tree" introduced by Crosby and Wallach in the paper
[*Efficient Data Structures for Tamper-Evident Logging*](https://static.usenix.org/event/sec09/tech/full_papers/crosby.pdf).

![data_model](images/data_model.svg)

The basis of the data structure is an append-only **log** consisting of log
**entries**, numbered with consecutive integers starting from 0.

Built on top of the log entries is a highly regular structure of nodes, shaped
like a perfect binary tree. Nodes are addressed as `(level, index)` pairs.
Nodes at level 0 are called **leaf nodes** and correspond directly to the log
entries. Nodes at higher levels are defined recursively in terms of nodes at
lower levels. Specifically, node `(level, index)` depends directly on nodes
`(level-1, index*2)` and `(level-1, index*2 + 1)` and, through them, on every
node in its subtree.
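
The addressing rule above can be sketched in a few lines of Python. This is an
illustration only: `NodeID`, `children`, `parent` and `coverage` are
hypothetical names chosen for this example, not identifiers from the
repository. The `coverage` helper follows from applying the child rule
repeatedly down to level 0.

```python
from typing import NamedTuple, Tuple


class NodeID(NamedTuple):
    """Hypothetical node address; level 0 holds the leaf nodes."""
    level: int
    index: int


def children(node: NodeID) -> Tuple[NodeID, NodeID]:
    """Return the two nodes that `node` depends on directly."""
    if node.level == 0:
        raise ValueError("leaf nodes have no children")
    return (
        NodeID(node.level - 1, node.index * 2),
        NodeID(node.level - 1, node.index * 2 + 1),
    )


def parent(node: NodeID) -> NodeID:
    """Return the node one level up that depends directly on `node`."""
    return NodeID(node.level + 1, node.index // 2)


def coverage(node: NodeID) -> Tuple[int, int]:
    """Half-open range [begin, end) of log entries in the node's subtree."""
    width = 1 << node.level  # a node at `level` spans 2**level entries
    return node.index * width, (node.index + 1) * width
```

For instance, `children(NodeID(2, 1))` yields nodes `(1, 2)` and `(1, 3)`, and
`coverage(NodeID(2, 1))` is the entry range `[4, 8)`.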

The data structure evolves dynamically. Initially, the log is empty, and no
node of the tree has any data. When new entries are appended to the log, the
nodes that recursively depend on these entries are updated. While a node can
still be updated, it is in the **ephemeral** state. Eventually, once the log
has grown to include every entry in the node's subtree, the node becomes
**perfect** and is never modified again.

Effectively, perfect nodes are immutable (write-once) registers, and ephemeral
nodes are mutable.
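
As a rough sketch of this distinction, the check below assumes that a node
becomes perfect exactly when the log contains all `2**level` entries of its
subtree. The function name `is_perfect` and its parameters are hypothetical,
chosen for this example rather than taken from the repository.

```python
def is_perfect(level: int, index: int, size: int) -> bool:
    """Report whether node (level, index) is perfect in a log of `size` entries.

    Assumption: the node covers entries [index * 2**level, (index+1) * 2**level)
    and is perfect once the log has reached the end of that range.
    """
    end = (index + 1) << level  # one past the last entry under the node
    return end <= size


# A leaf is perfect as soon as its entry has been appended.
assert not is_perfect(0, 20, 20)
assert is_perfect(0, 20, 21)

# With 21 entries, node (4, 0) covers entries [0, 16) and is perfect, while
# node (4, 1) covers [16, 32) and can still change as entries 21..31 arrive.
assert is_perfect(4, 0, 21)
assert not is_perfect(4, 1, 21)
```

Because the log is append-only, `size` never decreases, so a node that passes
this check once keeps passing it.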

### Tree state



The state of the entire log is often represented by a single ephemeral node
that covers all of the log's leaves. For example, in a tree with 21 leaves, as
in the picture above, this is the ephemeral node 5.0, that is, the node at
level 5 and index 0.
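
One way to locate that node, sketched under the assumption that it is the
lowest node at index 0 whose subtree spans the whole log (`root_node` is a
hypothetical name used only for this example):

```python
from typing import Tuple


def root_node(size: int) -> Tuple[int, int]:
    """Return (level, index) of the lowest index-0 node covering `size` leaves."""
    if size <= 0:
        raise ValueError("an empty log has no covering node")
    # Smallest level with 2**level >= size.
    level = (size - 1).bit_length()
    return level, 0


assert root_node(21) == (5, 0)  # matches the 21-leaf example in the text
```

When `size` is an exact power of two, the node returned here already covers a
complete subtree and is therefore perfect rather than ephemeral.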

### TODO: Things to cover

- Append-only.
- Merkle tree hasher.
- Structured objects, like proofs and compact ranges.