
ledger: Build the grouped slot leaders manually #8451

Merged: vadorovsky merged 4 commits into anza-xyz:master from vadorovsky:compute-epoch-schedule-group-map, Mar 25, 2026
Conversation

@vadorovsky (Member) commented Oct 13, 2025

Problem

Building the grouped slot leaders with `itertools::into_group_map` takes 20.9ms.

(profile screenshot: before)

Summary of Changes

Building them manually with a pre-allocated hash map, using the pubkey hasher, takes 5.2ms.

(profile screenshot: after)

Ref: #8280

@vadorovsky vadorovsky force-pushed the compute-epoch-schedule-group-map branch 2 times, most recently from e42e215 to cb8fedb Compare October 13, 2025 15:14
@codecov-commenter commented Oct 13, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.1%. Comparing base (da1dcd7) to head (34d5d1a).
⚠️ Report is 114 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #8451   +/-   ##
=======================================
  Coverage    83.1%    83.1%           
=======================================
  Files         849      849           
  Lines      321241   321256   +15     
=======================================
+ Hits       267026   267069   +43     
+ Misses      54215    54187   -28     

@vadorovsky vadorovsky force-pushed the compute-epoch-schedule-group-map branch 2 times, most recently from 37b32a9 to b758fe5 Compare October 14, 2025 07:07
@vadorovsky vadorovsky marked this pull request as ready for review October 14, 2025 07:24
@HaoranYi

Excellent work! I have just one comment!

HaoranYi previously approved these changes Oct 15, 2025

@HaoranYi HaoranYi left a comment

LGTM. Please wait for the other reviewers' approval before merging.

@brooksprumo brooksprumo removed their request for review October 15, 2025 16:14
@brooksprumo

I'm going to bow out and defer to the other reviewers since I'm OOO.

jstarry previously approved these changes Oct 16, 2025

@jstarry jstarry commented Oct 16, 2025

We should probably review our uses of itertools across the codebase because many of the utility methods create hashsets and hashmaps which are not pre-allocated.

@vadorovsky (Member Author)

> We should probably review our uses of itertools across the codebase because many of the utility methods create hashsets and hashmaps which are not pre-allocated.

I have yet to profile all uses of `into_group_map()` in ledger/runtime/accounts-db, but my gut feeling for now is that we should move away from it entirely, especially given that the manual alternative I'm replacing it with is basically this few-liner:

let mut grouped = HashMap::with_capacity_and_hasher(cap, hasher);
for (key, value) in input {
    grouped
        .entry(value)
        .and_modify(|keys| keys.push(key))
        .or_insert(vec![key]);
}

Perhaps I should already move it to a common function in solana-perf.

Given the lack of flexibility in itertools (no way of providing capacities and hashers), I think its usage is a footgun. I tried to make it accept custom hashers, but the PR got closed: rust-itertools/itertools#1057.
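As a rough sketch, a shared helper along the lines discussed could look like this (the name `group_by_value`, its signature, and its placement are assumptions for illustration, not an existing solana-perf API):

```rust
use std::{
    collections::HashMap,
    hash::{BuildHasher, Hash},
};

// Hypothetical helper: group (key, value) pairs by value into a map that is
// pre-allocated for `cap` entries and uses a caller-provided hasher.
fn group_by_value<K, V, S>(
    input: impl IntoIterator<Item = (K, V)>,
    cap: usize,
    hasher: S,
) -> HashMap<V, Vec<K>, S>
where
    V: Eq + Hash,
    S: BuildHasher,
{
    let mut grouped = HashMap::with_capacity_and_hasher(cap, hasher);
    for (key, value) in input {
        // `or_insert_with` avoids allocating a Vec when the entry already exists.
        grouped.entry(value).or_insert_with(Vec::new).push(key);
    }
    grouped
}

fn main() {
    // (slot, leader) pairs grouped by leader.
    let pairs = vec![(0u64, "a"), (1, "b"), (2, "a"), (3, "a")];
    let hasher = std::collections::hash_map::RandomState::new();
    let grouped = group_by_value(pairs, 2, hasher);
    assert_eq!(grouped["a"], vec![0, 2, 3]);
    assert_eq!(grouped["b"], vec![1]);
}
```

A pubkey-aware hasher could then be passed in at each call site, matching the pre-allocated pattern benchmarked in this PR.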

.and_modify(|slots| {
    let slots = Arc::get_mut(slots).expect("should be the only reference");
    slots.push(slot)
})
.or_insert(Arc::new(vec![slot]));
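For context on the snippet under review: `Arc::get_mut` only hands out a mutable reference while the `Arc` has exactly one strong reference, which is what the `expect("should be the only reference")` relies on. A minimal standalone illustration:

```rust
use std::sync::Arc;

fn main() {
    // With a unique Arc, get_mut returns Some and we can mutate in place.
    let mut slots = Arc::new(vec![1u64, 2]);
    Arc::get_mut(&mut slots)
        .expect("should be the only reference")
        .push(3);
    assert_eq!(*slots, vec![1, 2, 3]);

    // With a second strong reference alive, get_mut returns None instead.
    let clone = Arc::clone(&slots);
    assert!(Arc::get_mut(&mut slots).is_none());
    drop(clone);
}
```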

Could also experiment with smallvec; with just a slot value you could keep like 2-3 elements inline without allocating.

Member Author

Good idea. Or maybe even arrayvec.


In the profile, I think you see `grow` taking a lot of time, so if anything you should allocate with a higher capacity?


Oh, it's `Vec<usize>`; yeah, use arrayvec

Member Author

Actually, we can't use arrayvec. The number of non-repeating slots we store for each leader looks like this:

curl https://api.mainnet-beta.solana.com -X POST -H "Content-Type: application/json" -d '{"jsonrpc":"2.0", "id":1, "method":"getLeaderSchedule", "params":[null, {"commitment":"finalized"}]}' | jq -r '.result | to_entries | map({identity: .key, slots: ((.value | length)/4)}) | sort_by(.slots) | reverse[] | "\(.slots)\t\(.identity)"' | save -r leader_slots.txt

leader_slots.txt

I'm dividing by 4 because of the recently merged #9126.

The top 17 validators have more than 1k non-repeating slots:

3630	HEL1USMZKAL2odpNBj2oCjffnFGaYwmbGmyewGv1e2TU
3438	Fd7btgySsrjuo25CJCj7oE7VPMyezDhnx7pZkj2v69Nk
3412	DRpbCBMxVnDK7maPM5tGv6MvB3v1sRMC86PZ8okm21hy
3142	JupmVLmA8RoyTUbTMMuTtoPWHEiNQobxgTeGTrPNkzT
2139	q9XWcZ7T1wP4bW9SB4XgNNwjnFEJ982nE8aVbbNuwot
1906	EvnRmnMrd69kFdbLMxWkTn1icZ7DCceRhvmb2SJXqDo4
1801	DtdSSG8ZJRZVv5Jx7K1MeWp7Zxcu19GD5wQRGRpQ9uMF
1793	E1r4Psq84tHfQ6aPTvvDka4U3u8zPVD7gEUrH25RdxHL
1764	JD549HsbJHeEKKUrKgg4Fj2iyv2RGjsV7NTZjZUrHybB
1708	Awes4Tr6TX8JDzEhCZY2QVNimT6iD1zWHzf1vNyGvpLM
1626	5pPRHniefFjkiaArbGX3Y8NUysJmQ9tMZg3FrFGwHzSm
1594	CAo1dCGYrB6NhHh5xb1cGjUiu86iyCfMTENxgHumSve4
1289	9jxgosAfHgHzwnxsHw4RAZYaLVokMbnYtmiZBreynGFP
1269	5Cchr1XGEg7dbBXByV5NY2ad8jfxAM7HA3x8D56rq9Ux
1173	9rkJMARqK6VBkcxGfKBAwnA44gPAfGxPbPsfsggFNDSQ
1032	FBKFWadXZJahGtFitAsBvbqh5968gLY7dMBBJUoUjeNi
1015	BkoS26vBuaXnSowACdChi4WKid8UwmuPNhEJWa8KsLHd

That's way too much for arrayvec; we would blow up the stack, since we would need to go with something like ArrayVec<Slot, 4096> and have ~1k of them. And there is always the risk of having to increase it if the dominance of the top validators grows.

On the other hand, the last ~500 validators have fewer than 64 non-repeating slots, the last ~300 fewer than 32, and ~70 fewer than 10. I'm not even sure we gain anything from smallvec in that case.

Anyway, can we think about smallvec separately, outside of this PR? First, the open question is how many arrays, and of what size, we can keep on the stack. The choice of smallvec size (`SmallVec<[Slot; 10]>`, `SmallVec<[Slot; 32]>`, `SmallVec<[Slot; 64]>`, etc.) depends on that. The other question is whether it brings any visible perf improvement.
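The back-of-envelope arithmetic behind the stack concern can be checked directly, assuming 8-byte `Slot`s, the hypothetical inline capacity of 4096, ~1000 leaders, and a typical 8 MiB main-thread stack:

```rust
fn main() {
    type Slot = u64;
    const CAP: usize = 4096;

    // One inline array of 4096 8-byte slots is 32 KiB.
    let per_leader = std::mem::size_of::<[Slot; CAP]>();
    assert_eq!(per_leader, 32 * 1024);

    // ~1000 leaders would need roughly 31 MiB inline, far beyond a
    // typical 8 MiB default stack.
    let leaders = 1000;
    let total = per_leader * leaders;
    assert!(total > 8 * 1024 * 1024);
    println!("{} bytes (~{} MiB)", total, total / (1024 * 1024));
}
```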

@HaoranYi

> We should probably review our uses of itertools across the codebase because many of the utility methods create hashsets and hashmaps which are not pre-allocated.

Good point — applying the same pattern to staked_nodes in stake_cache gave us a 3–4× speed-up as well: #8516

@vadorovsky vadorovsky dismissed stale reviews from jstarry and HaoranYi via 80b2e04 October 21, 2025 08:49
@vadorovsky vadorovsky force-pushed the compute-epoch-schedule-group-map branch from cd7b5b6 to 80b2e04 Compare October 21, 2025 08:49
@vadorovsky vadorovsky force-pushed the compute-epoch-schedule-group-map branch from 80b2e04 to 3c4c8de Compare February 9, 2026 14:26

@brooksprumo brooksprumo left a comment


Worth reviving this one?

@mergify

mergify bot commented Mar 17, 2026

If this PR represents a change to the public RPC API:

  1. Make sure it includes a complementary update to rpc-client/ (example)
  2. Open a follow-up PR to update the JavaScript client @solana/kit (example)

Thank you for keeping the RPC clients in sync with the server API @vadorovsky.

@vadorovsky vadorovsky left a comment (Member Author)

Yes, sorry for letting it go stale!

@vadorovsky vadorovsky force-pushed the compute-epoch-schedule-group-map branch 2 times, most recently from 798b7de to d6416d8 Compare March 25, 2026 11:32
Building them with `itertools::into_group_map` takes 20.9ms.

Building them manually with a pre-allocated hash map, using pubkey
hasher, takes 5.2ms.
@vadorovsky vadorovsky force-pushed the compute-epoch-schedule-group-map branch from d6416d8 to bf965a7 Compare March 25, 2026 11:33
@kskalski

LGTM

It works even with a custom hasher.
In the context of enumeration in leader schedule, `index` is a more
appropriate name.
@vadorovsky vadorovsky requested a review from brooksprumo March 25, 2026 14:01

@brooksprumo brooksprumo left a comment


:shipit:

@vadorovsky vadorovsky added this pull request to the merge queue Mar 25, 2026
Merged via the queue into anza-xyz:master with commit 5bc7ba4 Mar 25, 2026
62 checks passed
@vadorovsky vadorovsky deleted the compute-epoch-schedule-group-map branch March 25, 2026 14:39