optimize historical range #3658

Open: wants to merge 5 commits into base: master


@rrazvan1 rrazvan1 commented Jan 17, 2025

Why this should be merged

Direct optimizations:

  • how combined changes are computed between 2 roots
  • getting changes to a specific root
  • changes iterator using startKey and/or prefix

Indirect optimizations:

  • change proofs
  • range proofs
  • view changes iterator

Fixes:

  • getChangesToGetToRoot(..) no-ops being removed from output

How this works

  • The changeSummary struct has a new field that holds the changed keys in a sorted slice.
  • getChangesToGetToRoot(..): with sortedKeys, we can binary search for startKey and stop iterating as soon as we pass endKey.
  • getValueChanges(..): we can collect the value changes between startRoot and endRoot, for keys within [startKey, endKey], as follows:
    1. Initialize a min-heap that stores the traversal state of each root: its changes, its insertNumber, and an index. The minimum is the root with the smallest current key, with ties broken by the smallest insertNumber, so popping from the heap visits all keys in ascending order of [key, insertNumber].
    2. For each root's changes, binary search for the index of the first key within [startKey, endKey] and push that initial state onto the heap (roots with no keys inside the interval are skipped).
    3. Pop elements from the heap; while the key stays the same, merge the changes, then store the final combined change.
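The sortedKeys lookup used by getChangesToGetToRoot(..) can be sketched roughly as below. This is a minimal illustration, not the code in this PR: `keysInRange` is a hypothetical helper, keys are plain strings rather than merkledb keys, and treating an empty `end` as "no upper bound" is an assumption of the sketch.

```go
package main

import (
	"fmt"
	"sort"
)

// keysInRange returns the keys of sortedKeys that fall within [start, end].
// Hypothetical helper: an empty end is treated as "no upper bound".
func keysInRange(sortedKeys []string, start, end string) []string {
	// Binary search for the index of the first key >= start.
	i := sort.SearchStrings(sortedKeys, start)

	var out []string
	for ; i < len(sortedKeys); i++ {
		if end != "" && sortedKeys[i] > end {
			// The slice is sorted, so no later key can be in range.
			break
		}
		out = append(out, sortedKeys[i])
	}
	return out
}

func main() {
	keys := []string{"a", "c", "e", "g"}
	fmt.Println(keysInRange(keys, "b", "e")) // prints [c e]
}
```

The point of the sorted slice is that the scan touches only the keys inside the interval, instead of filtering every changed key.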

IMPORTANT improvement for getValueChanges(..): we can stop as soon as maxLength key changes have been found.
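The heap-based merge from the three steps above, together with the maxLength early stop, might look roughly like the following. Everything here is a simplified assumption for illustration: `change`, `rootChanges`, `cursor`, and `combineChanges` are hypothetical names, keys and values are strings, an empty `end` means "no upper bound", and merging keeps the oldest `before` and the newest `after` without any special handling of no-ops.

```go
package main

import (
	"container/heap"
	"fmt"
	"sort"
)

// change is a hypothetical before/after pair for one key.
type change struct{ before, after string }

// rootChanges is a hypothetical stand-in for one root's change summary.
type rootChanges struct {
	sortedKeys []string          // changed keys in ascending order
	values     map[string]change // change recorded for each key
}

// cursor tracks how far one root's sortedKeys have been consumed.
type cursor struct {
	rc           *rootChanges
	insertNumber int // position of the root in the history (older = smaller)
	index        int // current position in rc.sortedKeys
}

func (c *cursor) key() string { return c.rc.sortedKeys[c.index] }

// cursorHeap is a min-heap ordered by (key, insertNumber).
type cursorHeap []*cursor

func (h cursorHeap) Len() int      { return len(h) }
func (h cursorHeap) Swap(i, j int) { h[i], h[j] = h[j], h[i] }
func (h cursorHeap) Less(i, j int) bool {
	if h[i].key() != h[j].key() {
		return h[i].key() < h[j].key()
	}
	return h[i].insertNumber < h[j].insertNumber
}
func (h *cursorHeap) Push(x any) { *h = append(*h, x.(*cursor)) }
func (h *cursorHeap) Pop() any {
	old := *h
	last := old[len(old)-1]
	*h = old[:len(old)-1]
	return last
}

// combineChanges merges the per-root changes for keys in [start, end] and
// returns at most maxLength keys (ascending) with their combined change.
func combineChanges(history []*rootChanges, start, end string, maxLength int) ([]string, map[string]change) {
	inRange := func(k string) bool { return end == "" || k <= end }

	// Step 2: seed the heap with each root's first key inside the interval.
	h := &cursorHeap{}
	for n, rc := range history {
		i := sort.SearchStrings(rc.sortedKeys, start)
		if i < len(rc.sortedKeys) && inRange(rc.sortedKeys[i]) {
			heap.Push(h, &cursor{rc: rc, insertNumber: n, index: i})
		}
	}

	// Step 3: pop in (key, insertNumber) order and merge equal keys.
	var keys []string
	combined := map[string]change{}
	for h.Len() > 0 {
		c := heap.Pop(h).(*cursor)
		k := c.key()
		if prev, seen := combined[k]; seen {
			prev.after = c.rc.values[k].after // a newer root overrides "after"
			combined[k] = prev
		} else {
			if len(keys) == maxLength {
				break // early stop: a new key appeared and the quota is full
			}
			keys = append(keys, k)
			combined[k] = c.rc.values[k]
		}
		// Advance this root's cursor and re-insert it if still in range.
		c.index++
		if c.index < len(c.rc.sortedKeys) && inRange(c.key()) {
			heap.Push(h, c)
		}
	}
	return keys, combined
}

func main() {
	history := []*rootChanges{
		{sortedKeys: []string{"x", "y"}, values: map[string]change{"x": {"", "1"}, "y": {"", "2"}}},
		{sortedKeys: []string{"x", "z"}, values: map[string]change{"x": {"1", "3"}, "z": {"", "9"}}},
	}
	keys, combined := combineChanges(history, "", "", 2)
	for _, k := range keys {
		fmt.Printf("%s: %q -> %q\n", k, combined[k].before, combined[k].after)
	}
}
```

Because keys leave the heap in ascending order, every entry for a given key has been merged by the time a larger key is popped, which is what makes it safe to break on the first new key once maxLength keys are collected.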

How this was tested

Using the existing unit tests.
Adding new unit tests, or modifying existing ones to properly cover the new code.

@rrazvan1 rrazvan1 force-pushed the optimize-historical-range branch from d3bc7f3 to 3b35c08 Compare January 17, 2025 12:03
@rrazvan1 rrazvan1 marked this pull request as draft January 17, 2025 12:32
@rrazvan1 rrazvan1 force-pushed the optimize-historical-range branch from 39d1344 to c8050d7 Compare January 20, 2025 13:06
@rrazvan1 rrazvan1 marked this pull request as ready for review January 20, 2025 14:47
@joshua-kim joshua-kim assigned joshua-kim and rrazvan1 and unassigned joshua-kim Feb 4, 2025
@joshua-kim joshua-kim self-requested a review February 4, 2025 21:24

@joshua-kim joshua-kim left a comment

We should have some benchmark results (either manual or preferably through a benchmark test) as part of this PR to verify the results

rrazvan1 commented Feb 6, 2025

> We should have some benchmark results (either manual or preferably through a benchmark test) as part of this PR to verify the results

Range proofs benchmarking:

The improvements are most visible when maxLength is small compared to the total number of keys.

I attached the result of a benchmark with the following input:

  • maximum key length => 20
  • history changes => 100
  • changes per history => 20000
  • maximum maxLength provided to getRangeProof(..) => 20% of the total keys inserted/updated

The benchmark was run using the same seed. It generates a rangeProof over a randomly chosen interval [start, end], between 2 different random merkleRoots from the history, with a random maxLength in [0, 0.2*totalKeys].

Results with 2 different seeds:

Benchmark_ChangeProofs-12             10    466853996 ns/op
Benchmark_ChangeProofsOptimized-12    10    317976392 ns/op
Benchmark_ChangeProofs-12             10    751904392 ns/op
Benchmark_ChangeProofsOptimized-12    10    527304283 ns/op

Iterator benchmarking:

I attached the result of a benchmark with the following input:

  • maximum key length => 20
  • keys => 1,000,000 (1M)

The benchmark was run using the same seed. It randomly generates a start and a prefix, then creates an iterator over the changes filtered by them.

BenchmarkView_NewIteratorWithStartAndPrefix-12             100    39738423 ns/op
BenchmarkView_NewIteratorWithStartAndPrefixOptimized-12    100    21231005 ns/op
BenchmarkView_NewIteratorWithStartAndPrefix-12             100    37678990 ns/op
BenchmarkView_NewIteratorWithStartAndPrefixOptimized-12    100    21599712 ns/op
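To put the ns/op figures above in relative terms, the arithmetic can be sketched as follows. The numbers are copied from the two result listings in this comment; `reduction` is a hypothetical helper, and the run labels are mine. By this calculation the change proof runs take roughly 30 to 32% less time per op and the iterator runs roughly 43 to 47% less.

```go
package main

import "fmt"

// reduction returns the percentage drop in ns/op from baseline to optimized.
func reduction(baseline, optimized float64) float64 {
	return (baseline - optimized) / baseline * 100
}

func main() {
	// ns/op pairs copied from the benchmark output above.
	runs := []struct {
		name                string
		baseline, optimized float64
	}{
		{"ChangeProofs (seed 1)", 466853996, 317976392},
		{"ChangeProofs (seed 2)", 751904392, 527304283},
		{"NewIteratorWithStartAndPrefix (run 1)", 39738423, 21231005},
		{"NewIteratorWithStartAndPrefix (run 2)", 37678990, 21599712},
	}
	for _, r := range runs {
		fmt.Printf("%-40s %.1f%% less time per op\n", r.name, reduction(r.baseline, r.optimized))
	}
}
```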

@rrazvan1 rrazvan1 force-pushed the optimize-historical-range branch from 362d0a8 to 9a19d2e Compare February 6, 2025 13:16
@rrazvan1 rrazvan1 force-pushed the optimize-historical-range branch from 9a19d2e to 89cac9a Compare February 6, 2025 13:16