optimize historical range #3658

rrazvan1 · 2025-01-17T12:02:28Z

Why this should be merged

Direct optimizations:

Computing the combined changes between 2 roots
Getting the changes needed to reach a specific root
Creating a changes iterator with a startKey and/or prefix

Indirect optimizations:

Change proofs
Range proofs
View changes iterator

Fixes:

getChangesToGetToRoot(..): no-ops are now removed from the output
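
To illustrate what a no-op means here, a minimal sketch follows. It is not the code in this PR: valueChange, isNoOp and dropNoOps are hypothetical names, and a nil slice is assumed to mean "key absent". A change counts as a no-op when the value before and after is identical.

```go
package main

import "bytes"

// valueChange is a hypothetical stand-in for a recorded before/after pair.
// Assumption of this sketch: a nil slice means the key does not exist.
type valueChange struct {
	before, after []byte
}

// isNoOp reports whether the change leaves the key exactly as it was.
func isNoOp(c valueChange) bool {
	if (c.before == nil) != (c.after == nil) {
		return false // the key was created or deleted
	}
	return bytes.Equal(c.before, c.after)
}

// dropNoOps returns, in the given order, the keys whose change is not a no-op.
func dropNoOps(sortedKeys []string, changes map[string]valueChange) []string {
	kept := make([]string, 0, len(sortedKeys))
	for _, k := range sortedKeys {
		if !isNoOp(changes[k]) {
			kept = append(kept, k)
		}
	}
	return kept
}
```

Note that an empty value and an absent key are treated as different states, so a change from one to the other is kept.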

How this works

The changeSummary struct has a new field that holds the changed keys in a sorted slice.
getChangesToGetToRoot(..): with sortedKeys available, we can binary search for startKey and stop iterating as soon as we pass endKey.
getValueChanges(..): we can collect the value changes between startRoot and endRoot for keys within [startKey, endKey] as follows:
1. Initialize a min-heap that stores each root's traversal state: changes, insertNumber and index. The heap is ordered by the root's current key, then by insertNumber, so popping from it visits all keys in ascending [key, insertNumber] order.
2. For each root's changes, binary search for the index of the first key within [startKey, endKey] and push that initial state onto the heap. Roots with no keys inside the interval are skipped.
3. Pop elements from the min-heap and, while the key stays the same, merge the changes and store the final combined change.
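
Both getChangesToGetToRoot(..) and step 2 rely on the same lookup over a sorted key slice. A minimal sketch of that lookup, assuming string keys and treating an empty bound as "unbounded" (keysInRange is a hypothetical helper, not a function from this PR):

```go
package main

import "sort"

// keysInRange returns the sub-slice of sortedKeys that lies within [start, end].
// Assumption of this sketch: an empty start or end leaves that side unbounded.
func keysInRange(sortedKeys []string, start, end string) []string {
	lo := 0
	if start != "" {
		// Index of the first key >= start.
		lo = sort.SearchStrings(sortedKeys, start)
	}
	hi := len(sortedKeys)
	if end != "" {
		// Index of the first key strictly greater than end.
		hi = sort.Search(len(sortedKeys), func(i int) bool {
			return sortedKeys[i] > end
		})
	}
	if lo > hi {
		return nil // start is after end
	}
	return sortedKeys[lo:hi]
}
```

Both bounds are found in O(log n), so no key outside the interval is ever visited.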

IMPORTANT improvement for getValueChanges(..): we can stop as soon as maxLength key changes have been found.
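
The three steps and the maxLength cutoff can be sketched roughly as below. This is an illustration under simplifying assumptions, not the code in this PR: keys and values are strings, both bounds are required and inclusive, the combined change keeps the oldest "before" and the newest "after", and rootChanges, cursor, keyChange and mergeChanges are hypothetical names. No-op filtering is left out.

```go
package main

import (
	"container/heap"
	"sort"
)

type change struct{ before, after string }

// rootChanges is a hypothetical per-root record: keys sorted ascending plus their changes.
type rootChanges struct {
	insertNumber int
	sortedKeys   []string
	values       map[string]change
}

// cursor tracks how far one root has been traversed.
type cursor struct {
	root *rootChanges
	idx  int
}

func (c cursor) key() string { return c.root.sortedKeys[c.idx] }

// cursorHeap orders cursors by current key, then by insertNumber.
type cursorHeap []cursor

func (h cursorHeap) Len() int { return len(h) }
func (h cursorHeap) Less(i, j int) bool {
	if ki, kj := h[i].key(), h[j].key(); ki != kj {
		return ki < kj
	}
	return h[i].root.insertNumber < h[j].root.insertNumber
}
func (h cursorHeap) Swap(i, j int) { h[i], h[j] = h[j], h[i] }
func (h *cursorHeap) Push(x any)   { *h = append(*h, x.(cursor)) }
func (h *cursorHeap) Pop() any {
	old := *h
	last := old[len(old)-1]
	*h = old[:len(old)-1]
	return last
}

type keyChange struct {
	key string
	change
}

// mergeChanges combines the changes of all roots for keys in [start, end],
// returning at most maxLength keys in ascending order.
func mergeChanges(roots []*rootChanges, start, end string, maxLength int) []keyChange {
	h := &cursorHeap{}
	for _, r := range roots {
		// Step 2: seek each root to its first key inside the interval.
		i := sort.SearchStrings(r.sortedKeys, start)
		if i < len(r.sortedKeys) && r.sortedKeys[i] <= end {
			*h = append(*h, cursor{r, i})
		}
	}
	heap.Init(h)

	var out []keyChange
	for h.Len() > 0 {
		// Step 3: pop in [key, insertNumber] order and merge equal keys.
		cur := heap.Pop(h).(cursor)
		k := cur.key()
		c := cur.root.values[k]
		if n := len(out); n > 0 && out[n-1].key == k {
			out[n-1].after = c.after // a newer root overrides the final value
		} else {
			if len(out) == maxLength {
				break // early stop: enough distinct keys collected
			}
			out = append(out, keyChange{k, c})
		}
		if next := cur.idx + 1; next < len(cur.root.sortedKeys) && cur.root.sortedKeys[next] <= end {
			heap.Push(h, cursor{cur.root, next})
		}
	}
	return out
}
```

The cutoff is checked only when a new key is about to be appended, so the last returned key has already been merged across every root that touched it.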

How this was tested

Using the existing unit tests.
Adding new unit tests and modifying existing ones to properly cover the new code.

joshua-kim

We should have some benchmark results (either manual or preferably through a benchmark test) as part of this PR to verify the results

rrazvan1 · 2025-02-06T12:20:09Z

> We should have some benchmark results (either manual or preferably through a benchmark test) as part of this PR to verify the results

Range proofs benchmarking:

The improvement is most visible when maxLength is small relative to the total number of keys.

I attached the result of a benchmark with the following input:

maximum key length => 20
history changes => 100
changes per history => 20000
maximum maxLength provided to getRangeProof(..) => 20% of the total keys inserted/updated

Each benchmark was run using the same seed and generated a rangeProof over a randomly chosen interval [start, end], between 2 different random merkle roots from the history, with a random maxLength in [0, 0.2*totalKeys].

Results with 2 different seeds:

Benchmark_ChangeProofs-12             10    466853996 ns/op
Benchmark_ChangeProofsOptimized-12    10    317976392 ns/op

Benchmark_ChangeProofs-12             10    751904392 ns/op
Benchmark_ChangeProofsOptimized-12    10    527304283 ns/op

Iterator benchmarking:

I attached the result of a benchmark with the following input:

maximum key length => 20
keys => 1,000,000 (1M)

Each benchmark was run using the same seed and randomly generated a start and a prefix, then created an iterator over the changes filtered by them.

BenchmarkView_NewIteratorWithStartAndPrefix-12             100    39738423 ns/op
BenchmarkView_NewIteratorWithStartAndPrefixOptimized-12    100    21231005 ns/op

BenchmarkView_NewIteratorWithStartAndPrefix-12             100    37678990 ns/op
BenchmarkView_NewIteratorWithStartAndPrefixOptimized-12    100    21599712 ns/op
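
For context on the iterator case, here is a rough sketch of how a sorted key slice lets an iterator with a start and a prefix skip straight to its first candidate. It assumes string keys, and keysWithStartAndPrefix is a hypothetical helper rather than the view iterator from this PR.

```go
package main

import (
	"sort"
	"strings"
)

// keysWithStartAndPrefix returns the keys that are >= start and begin with prefix.
// sortedKeys must be sorted ascending.
func keysWithStartAndPrefix(sortedKeys []string, start, prefix string) []string {
	// Every matching key is >= start and >= prefix, so seek to the larger of the two.
	from := start
	if prefix > from {
		from = prefix
	}
	lo := sort.SearchStrings(sortedKeys, from)

	// Keys sharing a prefix are contiguous in sorted order, so the first
	// key without the prefix ends the scan.
	hi := lo
	for hi < len(sortedKeys) && strings.HasPrefix(sortedKeys[hi], prefix) {
		hi++
	}
	return sortedKeys[lo:hi]
}
```

The seek costs O(log n) and the scan touches only matching keys plus one, instead of filtering every changed key.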

rrazvan1 requested a review from StephenButtolph as a code owner January 17, 2025 12:02

rrazvan1 force-pushed the optimize-historical-range branch from d3bc7f3 to 3b35c08 Compare January 17, 2025 12:03

rrazvan1 marked this pull request as draft January 17, 2025 12:32

rrazvan1 added 3 commits January 20, 2025 15:06

optimize historical range

eb6b4ab

update history test for a better coverage

f7dd992

handle getChangesToGetToRoot no-ops

c8050d7

rrazvan1 force-pushed the optimize-historical-range branch from 39d1344 to c8050d7 Compare January 20, 2025 13:06

fix linter complains

92a07e6

rrazvan1 marked this pull request as ready for review January 20, 2025 14:47

rrazvan1 added the merkledb label Feb 3, 2025

joshua-kim assigned joshua-kim and rrazvan1 and unassigned joshua-kim Feb 4, 2025

joshua-kim self-requested a review February 4, 2025 21:24

joshua-kim reviewed Feb 4, 2025

rrazvan1 force-pushed the optimize-historical-range branch from 362d0a8 to 9a19d2e Compare February 6, 2025 13:16

added benchmarks

89cac9a

rrazvan1 force-pushed the optimize-historical-range branch from 9a19d2e to 89cac9a Compare February 6, 2025 13:16
