Reusable module cache #4621
IIUC this still needs to be rebased on top of the p23 module and have the host change merged, so I haven't looked too closely at the Rust parts (IIUC we currently just don't pass the module cache to the host at all).
bool mLoadedAll{false};
size_t mBytesLoaded{0};
size_t mBytesCompiled{0};
std::vector<uint32_t> mLedgerVersions;
I wonder if there is any significance in this being a list vs a (hardcoded) range from SOROBAN_PROTOCOL_VERSION to Config::CURRENT_PROTOCOL_VERSION (both are constants). It seems a bit suspicious that we might be omitting some protocol versions.
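A minimal sketch of the "hardcoded range" alternative, assuming the two constants are plain integers. The names `kFirstSorobanProtocol`, `kCurrentProtocol` and `allCachedProtocolVersions` are hypothetical stand-ins with placeholder values, not identifiers from stellar-core:

```cpp
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical stand-ins for SOROBAN_PROTOCOL_VERSION and
// Config::CURRENT_PROTOCOL_VERSION; the values are placeholders.
constexpr uint32_t kFirstSorobanProtocol = 20;
constexpr uint32_t kCurrentProtocol = 23;

// Build the inclusive range [first, last] so that no protocol version in
// between can be left out by accident.
std::vector<uint32_t>
allCachedProtocolVersions(uint32_t first, uint32_t last)
{
    if (last < first)
    {
        return {};
    }
    std::vector<uint32_t> versions(last - first + 1);
    std::iota(versions.begin(), versions.end(), first);
    return versions;
}
```

Deriving the list from the two bounds would make an omitted version impossible by construction, at the cost of not being able to skip a version deliberately.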
stellar::SearchableSnapshotConstPtr mSnap;
std::deque<xdr::xvector<uint8_t>> mWasms;

const size_t mNumThreads;
nit: size_t const (style consistency)
std::deque<xdr::xvector<uint8_t>> mWasms;

const size_t mNumThreads;
const size_t MAX_MEM_BYTES = 10 * 1024 * 1024;
nit: Could this go to the cpp file as a static constant?
// This class encapsulates a multithreaded strategy for loading contracts
// out of the database (on one thread) and compiling them (on N-1 others).
class SharedModuleCacheCompiler
nit: Why does this not belong to the stellar namespace? I don't think we typically define types/functions outside of it.
for (auto thread = 0; thread < self->mNumThreads; ++thread)
{
mApp.postOnBackgroundThread(
I'm not sure if we should be using the background threads for this. IIUC the contract compilation is a one-off CPU-heavy task, so it would make sense to use a number of threads that's close to the number of physical cores (I'm not sure that's the requirement for WORKER_THREADS). On the other hand, this competes for background threads with whatever other background tasks we're running. So I wonder if spawning a number of new, high-priority threads regulated by a separate config flag would make more sense for this.
I don't know if thread contention is necessarily a large concern here, but it might be. This is only running on startup, so the only other tasks are background bucket merges, which are also low priority and IO bound anyway.
Aren't we using the same low priority threads here, by using postOnBackgroundThread? That's my concern. But my main concern is the flag semantics - since background threads are indeed meant to be IO bound, the value of the setting doesn't correlate with what we actually want here (roughly matching the physical cores available).
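To make the "separate config flag" idea concrete, here is a small sketch of how a dedicated compile-thread count could be derived. The `configured` parameter stands for a hypothetical new setting (with 0 meaning "auto"); no such flag exists in the source, and `std::thread::hardware_concurrency()` reports logical rather than physical cores, so this only approximates the goal described above:

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>

// Pick a thread count for CPU-bound compilation. `configured` is a
// hypothetical dedicated config value (0 means "choose automatically");
// it is an assumption for illustration, not an existing setting.
std::size_t
compileThreadCount(std::size_t configured)
{
    if (configured != 0)
    {
        return configured;
    }
    // hardware_concurrency() is only a hint and may return 0 when the
    // value cannot be determined, so fall back to a single thread.
    unsigned int hw = std::thread::hardware_concurrency();
    return std::max<std::size_t>(1, hw);
}
```

The point of keeping this separate from the IO-oriented background pool is that the two values answer different questions: one bounds concurrent IO-bound jobs, the other tracks available CPU.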
std::deque<xdr::xvector<uint8_t>> mWasms;

const size_t mNumThreads;
const size_t MAX_MEM_BYTES = 10 * 1024 * 1024;
Is 10 MB too conservative, or do we expect a significant multiplier for this? IIUC this is just for buffering the contracts, so I'm not sure why it couldn't be at least 10x larger.
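For context on what such a cap controls, the following is a simplified sketch of a byte-bounded buffer between one loading thread and several compiling threads. `BoundedWasmQueue` and its members are illustrative names, and `std::vector<uint8_t>` stands in for the XDR byte vector; this is not the PR's actual implementation:

```cpp
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <optional>
#include <utility>
#include <vector>

// Illustrative byte-bounded queue: the loader blocks once the buffered
// Wasm exceeds the budget, and compilers drain it.
class BoundedWasmQueue
{
    std::mutex mMutex;
    std::condition_variable mHaveSpace;
    std::condition_variable mHaveWork;
    std::deque<std::vector<uint8_t>> mWasms;
    std::size_t mBytesBuffered{0};
    bool mLoadedAll{false};
    std::size_t const mMaxBytes;

  public:
    explicit BoundedWasmQueue(std::size_t maxBytes) : mMaxBytes(maxBytes)
    {
    }

    // Loader side: wait while adding `wasm` would exceed the budget. A blob
    // larger than the budget is still admitted when the queue is empty, so
    // a single oversized contract cannot deadlock the loader.
    void
    push(std::vector<uint8_t> wasm)
    {
        std::unique_lock<std::mutex> lock(mMutex);
        mHaveSpace.wait(lock, [&] {
            return mWasms.empty() ||
                   mBytesBuffered + wasm.size() <= mMaxBytes;
        });
        mBytesBuffered += wasm.size();
        mWasms.emplace_back(std::move(wasm));
        mHaveWork.notify_one();
    }

    // Loader side: signal that no more contracts will arrive.
    void
    finish()
    {
        std::lock_guard<std::mutex> lock(mMutex);
        mLoadedAll = true;
        mHaveWork.notify_all();
    }

    // Compiler side: returns std::nullopt once the loader has finished and
    // the queue is drained.
    std::optional<std::vector<uint8_t>>
    pop()
    {
        std::unique_lock<std::mutex> lock(mMutex);
        mHaveWork.wait(lock, [&] { return !mWasms.empty() || mLoadedAll; });
        if (mWasms.empty())
        {
            return std::nullopt;
        }
        auto wasm = std::move(mWasms.front());
        mWasms.pop_front();
        mBytesBuffered -= wasm.size();
        mHaveSpace.notify_one();
        return wasm;
    }
};
```

In a design like this the cap only bounds raw Wasm waiting to be compiled, not the compiled modules, which is why a larger value would mostly trade memory for fewer loader stalls.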
@@ -1945,6 +1964,11 @@ handleCommandLine(int argc, char* const* argv)
{"test", "execute test suite", runTest},
{"apply-load", "run apply time load test", runApplyLoad},
#endif
{"list-contracts",
Please also update commands.md
let unlimited_budget = Budget::try_from_configs(
u64::MAX,
u64::MAX,
ContractCostParams(vec![].try_into().unwrap()),
Is this actually going to work? Don't we expect these to be of a certain exact size?
I think this generally looks pretty good, the producer/compiler thread divide is a good idea! I think the producer thread needs some changes though.
To maintain the cache, I think we need to
- Compile all live contract modules on startup.
- Compile new contract modules during upload/restore.
- Evict entries from the cache when the underlying WASM is evicted via State Archival eviction scans.
I think decoupling cache gc from eviction events is going to be expensive. If you have some background task that checks whether a given module is live, it will have to read from many levels of the BL to determine whether the BucketEntry it's looking at is the most up to date. Evicting from the cache when we do state archival evictions removes this additional read amplification (since the archival scan has to do this multi-level search already) and makes cache validity simpler to maintain.
The drawback is that initial cache generation is a little more expensive, as we're limited to a single producer thread that has to iterate through the BL in order and keep track of seen keys. If we don't have a proper gc, we can't add any modules that have already been evicted, since they would cause a memory leak.
Looking at startup as a whole, we have a bunch of tasks that are dominated by BL disk reads, namely Bucket Apply, BucketIndex, and p23's upcoming Soroban state cache. Bucket Index can process all Buckets in parallel, but Bucket Apply, Soroban state cache, and Module Cache all require a single thread iterating the BL in order due to the outdated keys issue (in the future we could do this in parallel, where each level marks its "last seen key" and lower levels can't make progress beyond all their parents' last seen keys, but that's too involved for v1).
Given that we're adding a bunch of work on the startup path and Horizon/RPC has indicated a need for faster startup times in the past, I think it makes sense to condense the Bucket Apply, Soroban state cache population, and Module cache producer thread into a single Work that makes a one shot pass on the BucketList. Especially in a captive-core instance, which we still run in our infra on EBS last I checked, I assume we're going to be disk bound even with the compilation step, so if we do compilation in the same pass as BucketApply we might just get it for free.
I don't think this needs to be in 23.0 (other than the memory leak issue), but if we have this in mind and make the initial version a little more friendly with the other Work tasks that happen on startup, it'll be easier to optimize this later.
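The three maintenance events listed above can be pictured as one small interface. This is a sketch under stated assumptions: `ModuleCacheIndex`, `WasmHash` and `CompiledModule` are hypothetical names, and the "compiled module" is a placeholder struct rather than anything from soroban:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

using WasmHash = std::array<uint8_t, 32>;

// Placeholder for a compiled module; a real cache would hold the compiled
// artifact, not just the size of the Wasm it came from.
struct CompiledModule
{
    std::size_t wasmSize;
};

class ModuleCacheIndex
{
    std::map<WasmHash, CompiledModule> mModules;

  public:
    // Used both by the startup scan and by upload/restore. Inserting a hash
    // that is already present is a no-op, so duplicates are harmless.
    bool
    addModule(WasmHash const& hash, std::vector<uint8_t> const& wasm)
    {
        return mModules.emplace(hash, CompiledModule{wasm.size()}).second;
    }

    // Called from the state archival eviction path, so no separate
    // liveness check against the BucketList is needed.
    bool
    evictModule(WasmHash const& hash)
    {
        return mModules.erase(hash) == 1;
    }

    bool
    contains(WasmHash const& hash) const
    {
        return mModules.count(hash) != 0;
    }

    std::size_t
    size() const
    {
        return mModules.size();
    }
};
```

Tying `evictModule` to the eviction scan is what keeps the cache valid without a second pass over the BucketList; it also means the startup scan must never add a module whose entry has already been evicted.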
@@ -0,0 +1,59 @@
#pragma once
// Copyright 2024 Stellar Development Foundation and contributors. Licensed
Nit: 2025, happy new year!
@@ -0,0 +1,161 @@
#include "ledger/SharedModuleCacheCompiler.h"
Nit: copyright
// This class encapsulates a multithreaded strategy for loading contracts
// out of the database (on one thread) and compiling them (on N-1 others).
class SharedModuleCacheCompiler
: public std::enable_shared_from_this<SharedModuleCacheCompiler>
Should this be a Work task? We don't need to rate limit or anything like that since this is a "halt until finished" sort of task, but it would still help with graceful shutdown. I also plan on doing some startup optimization work in the near-ish future (given that we're doing a lot more tasks on startup in p23) and this might be easier to work with later on if we use the same Work interface that we use in other startup tasks.
// Scans contract code entries in the bucket.
Loop
scanForContractCode(std::function<Loop(LedgerEntry const&)> callback) const;
Why does the lambda return a Loop status when it should always be Loop::INCOMPLETE anyway?
LOG_INFO(DEFAULT_LOG,
"Launching 1 loading and {} compiling background threads",
mNumThreads);
mApp.postOnBackgroundThread(
There's a memory leak here caused by the BucketList iteration strategy. Currently, we compile each CONTRACT_CODE entry on a per-bucket basis, regardless of whether there is a newer version of that key higher up in the BucketList. This is a problem, as eviction deletes state from the live BucketList by writing a DEADENTRY just like any other BucketList deletion. The loop will ignore the newer DEADENTRY and compile the outdated LIVEENTRY in a lower Bucket. We won't be able to garbage collect the module either: the entry has already been evicted, so there won't be an eviction event to get rid of the module. Note that in p23, it's also possible for several copies of the same WASM INITENTRY to exist on different levels of the BucketList if you restore an entry after it's been evicted, but before the DEADENTRY can annihilate the INITENTRY. In this case the same WASM would currently be compiled twice, which might break an assumption in the cache.
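A simplified sketch of the newest-first scan that avoids both problems (compiling shadowed entries and compiling the same key twice). The types here are illustrative: `BucketEntrySketch` uses a string key in place of a real LedgerKey, and `levels` is assumed to be a flat list of buckets ordered newest first with each key appearing at most once per bucket, which glosses over the real BucketList layout:

```cpp
#include <set>
#include <string>
#include <vector>

enum class EntryType
{
    LIVEENTRY,
    INITENTRY,
    DEADENTRY
};

// Illustrative stand-in for a bucket entry; the string key is an
// assumption made to keep the example self-contained.
struct BucketEntrySketch
{
    EntryType type;
    std::string key;
};

// `levels` must be ordered newest first. Returns the keys whose newest
// entry is live, each exactly once.
std::vector<std::string>
liveContractCodeKeys(
    std::vector<std::vector<BucketEntrySketch>> const& levels)
{
    std::set<std::string> seen;
    std::vector<std::string> live;
    for (auto const& bucket : levels)
    {
        for (auto const& entry : bucket)
        {
            // Only the first (newest) occurrence of a key counts; older
            // copies lower in the list are shadowed and skipped.
            if (!seen.insert(entry.key).second)
            {
                continue;
            }
            if (entry.type != EntryType::DEADENTRY)
            {
                live.push_back(entry.key);
            }
        }
    }
    return live;
}
```

Recording DEADENTRY keys in the seen set is the important part: it is what stops an outdated LIVEENTRY further down from being compiled after its key was evicted.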
c1ffdf7
to
89d85c2
Compare
This is the stellar-core side of a soroban change to surface the module cache for reuse.
On the core side we:
- Add a SorobanModuleCache to the core-side Rust code, which holds a soroban_env_host::ModuleCache for each host protocol version we support caching modules for (there is currently only one but there will be more in the future).
- Add a CoreCompilationContext type to contract.rs which carries a Budget and logs errors to the core console logging system. This is sufficient to allow operating the soroban_env_host::ModuleCache from outside the Host.
- Pass the SorobanModuleCache into the host function invocation path that core calls during transactions.
- Keep a SorobanModuleCache in the LedgerManagerImpl that is long-lived and spans ledgers.
- Add a SharedModuleCacheCompiler that does a multithreaded load-all-contracts / populate-the-module-cache, and call this on startup when the LedgerManagerImpl restores its LCL.
The main things left to do here are:
- p23 soroban submodule
I think that's .. kinda it? The reusable module cache is just "not passed in" on p22 and "passed in" on p23, so it should just start working at the p23 boundary.