Make futures send #560
Conversation
Although somewhat convenient to have, coupling the Wallet with the blockchain trait causes development friction and complexity. What if the wallet is sometimes "offline" (no access to the blockchain) but sometimes it's online? The only thing the Wallet needs the blockchain for is to sync, but not all applications will even use the sync method, and the sync method doesn't require the full blockchain functionality. So we instead pass the blockchain in when we want to sync.

- To further reduce the coupling with blockchain, I removed the `get_height` call from `new` and just use the height of the last sync in the database.
- I split the blockchain trait up into subtraits.
The current options are awkward and it would be good if we could introduce more in the future without breaking changes.
to make supporting verify_tx easier
For some reason, while doing the "remove blockchain from wallet" PR, I changed the tests around in what I thought was a benign way. But it meant (apparently) that both parties in "test_sync_double_receive" were using the same "blockchain". This meant that when the blockchain was RPC, they both imported their addresses to it and got each other's results when syncing. This broke the sync, and this commit fixes that.
Co-authored-by: Steve Myers <[email protected]>
You can do this with ensure_addresses_cached if you really want to.
Probably not directly relevant for this PR, but @notmandatory thought it might be helpful to share some recent thoughts I had about syncing a wallet when working with LDK and BDK. I guess one way to look at it would be to implement BlockSource for a Blockchain and then use the Blockchain to keep channel state in sync. The other way was to implement LDK's chain::Listen interface for either bdk's Wallet or Database objects, so they could be fed blocks directly using the lightning-block-sync utility and then use those blocks to update the database. I guess you could actually do both of these ideas. The latter is an alternative "sync" method where the user is in charge of pushing blocks to the wallet instead of just calling "sync". I'm not sure if these ideas are relevant to this work, but thought I'd share just in case.
They are somewhat relevant, thanks for sharing! Our sync interface will center around scripts. You should be able to track a lightning channel by taking the multi-sig or punish transaction script and "syncing" that :)
I don't think it makes sense to feed blocks to the database directly. What if the block you are feeding is not the child of the last block you fed? I think what you are after is being able to sync quickly every time there's a new block. Indeed, our work here should make this much easier. So the API we are aiming for with esplora or electrum etc. would be:

```rust
let sync_session = wallet.start_sync(options)?;
let update = esplora.sync(&sync_session)?;
wallet.apply_update(update)?;
```

With a more real-time block approach, we could make this something like:

```rust
loop {
    let update = block_source.look_for_updates(addresses).await;
    wallet.apply_update(update);
}
```

The tricky thing will be updating
Yeah, that API would work for an esplora/electrum style implementation. As for making sure you feed blocks in order, all you need to do is track the last block synced. LDK's "lightning-block-sync" module takes an object that implements Listen and its last known good BlockHash, and can bring it to tip and keep it there. So you just need to implement logic for updating the wallet database for the block-connected and block-disconnected events. I ended up going this route for now, but it's a bit of a hack given the limitations of the current bdk API. Will watch this PR; hopefully, once finalized, it will allow for a cleaner integration.
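As a rough illustration of the "track the last block synced" idea described above, here is a hedged sketch using stand-in types (`BlockHash` as a plain string, `WalletDb` as a hypothetical in-memory store) rather than LDK's actual `chain::Listen` trait or bdk's real database types:

```rust
// Hypothetical in-memory stand-in for a wallet database; the real LDK
// `chain::Listen` interface and bdk types are not used here.
type BlockHash = String;

struct WalletDb {
    // Blocks we've connected, in order of ascending height.
    chain: Vec<(u32, BlockHash)>,
}

impl WalletDb {
    fn new() -> Self {
        WalletDb { chain: Vec::new() }
    }

    // Accept a block only if it extends our last known tip, which is how
    // feeding blocks in order can be enforced.
    fn block_connected(&mut self, height: u32, hash: BlockHash) -> Result<(), String> {
        if let Some((tip_height, _)) = self.chain.last() {
            if height != tip_height + 1 {
                return Err(format!("expected height {}, got {}", tip_height + 1, height));
            }
        }
        self.chain.push((height, hash));
        Ok(())
    }

    // On a reorg, roll back every block at or above the disconnected height.
    fn block_disconnected(&mut self, height: u32) {
        self.chain.retain(|(h, _)| *h < height);
    }
}
```

The out-of-order check is what the Listen-style approach gives you for free compared with blindly writing blocks into the database.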
We've sketched some further changes to the syncing on the ureq. Here's an interesting question that came up:
Point 4 above doesn't really work, since we need to know additional information like the block height/header the information is valid for. We can't return any results until we know this was consistent from start to finish. I think most of the work done so far here should be replaced with an incremental update approach, and the idea of a full sync is just when you use a stop gap to find updates. Here's some more detail:

```rust
// Generated by the descriptor watcher
struct SyncSession<I> {
    checkpoints: Vec<(u32, BlockHash)>,
    scripts: I, // an iterator of scripts
}

// Generated by the blockchain
struct SyncUpdate {
    transactions: Vec<Transaction>,
    checkpoint_invalidate: Option<(u32, BlockHash)>,
    new_checkpoint: (u32, BlockHash),
}
```

This is missing some detail that's in #165, but the point is that when you start a sync you give the blockchain a list of checkpoints (height and header) that you know about. The blockchain will then try to invalidate one or more of the checkpoints by finding the oldest checkpoint that is no longer valid (the block doesn't exist). It also returns a new checkpoint. The descriptor watcher then deletes transactions if a checkpoint was invalidated, and adds the new checkpoint to its list of checkpoints. Thoughts?
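To illustrate how the watcher side might consume such an update, here is a hedged sketch with stand-in types (`BlockHash` as a string, and the transaction list omitted for brevity); `DescriptorWatcher` and `apply_update` are hypothetical names, not bdk API:

```rust
type BlockHash = String;

// Simplified version of the SyncUpdate above; transactions are omitted.
struct SyncUpdate {
    // Oldest checkpoint found to no longer be valid, if any.
    checkpoint_invalidate: Option<(u32, BlockHash)>,
    new_checkpoint: (u32, BlockHash),
}

struct DescriptorWatcher {
    // Checkpoints sorted by ascending height.
    checkpoints: Vec<(u32, BlockHash)>,
}

impl DescriptorWatcher {
    fn apply_update(&mut self, update: SyncUpdate) {
        if let Some((height, _)) = update.checkpoint_invalidate {
            // Drop the invalidated checkpoint and everything after it;
            // transactions confirmed at those heights would be deleted too.
            self.checkpoints.retain(|(h, _)| *h < height);
        }
        self.checkpoints.push(update.new_checkpoint);
    }
}
```

The key point is that invalidating one checkpoint implicitly rolls back everything built on top of it, which is what makes per-checkpoint changes easy to undo.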
This incremental scheme reminds me a bit of how CBF blockchain sync works (#81), comparing block height/hash checkpoints with peers before downloading filters. Is something like this what you have in mind?

```rust
let sync_session = my_database.sync_session(&my_wallet).await?; // get checkpoints from db, scripts from wallet
let sync_update = my_blockchain.sync(sync_session).await?; // plus some sort of progress callback
my_database.update(sync_update).await?; // if database updates are large, may need a progress callback here too
my_database.list_transactions().await?; // a bit awkward but I'm not sure how else to do this async
```
Good. I hope this makes it easier.
Yeah. I dunno how we're going to integrate async databases into this at the moment. What @thomaseizinger has pointed out to me is that you can pretty much have all txs and indexes associated with your wallet in memory at the same time without it costing much. Sure, an exchange with hundreds of thousands of txs will need a gigabyte of memory, but that's fine. If we want async storage, then what we might want to do is provide a way of syncing the in-memory database back (and maybe forth?) with a more permanent database via an event-sourcing kind of approach.
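A rough sketch of the event-sourcing idea mentioned above, with hypothetical `DbEvent` variants and a `MemoryDb` type that are not part of bdk: the in-memory database records each change as an event, and a persistent store can later drain and replay those events.

```rust
// Hypothetical change events recorded by an in-memory wallet database.
#[derive(Debug, Clone, PartialEq)]
enum DbEvent {
    TxAdded(String),   // txid
    TxRemoved(String), // txid, e.g. after a reorg
}

#[derive(Default)]
struct MemoryDb {
    txs: Vec<String>,
    // Events not yet applied to the persistent database.
    pending_events: Vec<DbEvent>,
}

impl MemoryDb {
    fn add_tx(&mut self, txid: &str) {
        self.txs.push(txid.to_string());
        self.pending_events.push(DbEvent::TxAdded(txid.to_string()));
    }

    // Drain events so a persistent store can apply them in order.
    fn take_events(&mut self) -> Vec<DbEvent> {
        std::mem::take(&mut self.pending_events)
    }
}
```

Draining rather than copying the event log keeps the "sync back" step idempotent: once taken, events won't be applied to the permanent database twice.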
I think this makes sense. What I'd like to mention is that from a dependency PoV, having a (shared) struct is maybe not a good idea. If you want to move each blockchain backend into its own crate, they would have to depend on some shared crate, outside of bdk, that contains the struct, so that the bdk facade crate can then bundle all backends together and re-export them. But maybe we need such a shared "core" crate anyway, so those structs could go in there.
One more question: Why would we need to pass in multiple checkpoints per script iterator? Wouldn't one be enough, assuming that script iterator == descriptor and thus we call this backend API multiple times?
Just a minor note: the syncing can be achieved in various ways, not necessarily just with event sourcing, although it might lend itself naturally to it: record all changes to the memory database as events and, upon sync, apply them to the persistent one (but not store the events there).
I think that this can be modelled by the sync checkpoints themselves. Each sync checkpoint represents a change in the database, and if we can make it easy to find all changes associated with each checkpoint, I think we've won (that also makes it easy to roll back a checkpoint if it becomes invalid).

I don't understand what you meant in the second sentence. The reason why you pass in multiple checkpoints is so that we know which ones are invalid and which ones are still valid, so we can apply the change even if one of them was invalid.
I don't understand that. The checkpoint is the current height as known to the wallet/database, right? Why is it not good enough for the sync backend to be told the latest height we have in the database? The sync backend function (the one we prototyped for esplora) should IMO outright fail if the hash at the given height is no longer the same. As a result, the overall sync function has to grab an earlier checkpoint from the DB and try syncing again. You wouldn't want to reimplement this logic for every backend, so I think it shouldn't live in there but in an overall sync function that looks like this:

```rust
mod bdk {
    // Sub-module for blocking API, modeled after `reqwest::blocking`
    mod blocking {
        pub fn sync(backend: &impl FetchRelatedTransactionsBlocking, wallet: &Wallet) {
            // for each descriptor in wallet:
            //   fetch related transactions
            //   apply result to wallet
            // if any error during syncing, apply fallback logic;
            // for example, if the sync backend fails because the hash at a height
            // differs, restart this loop and grab an earlier checkpoint from the wallet
        }
    }

    pub async fn sync(backend: &impl FetchRelatedTransactions, wallet: &Wallet) {
        // same as above but async version of `FetchRelatedTransactions`
    }
}
```

Edit: IMO we should strive for the logic to be implemented by every backend to be as minimal as possible, and try to reuse as much as we can across all of them.
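The fallback step in that pseudocode could look something like the following sketch, where `fetch` is a hypothetical stand-in for the per-backend sync call, and `SyncError::StaleCheckpoint` models the backend failing because the hash at the given height no longer matches:

```rust
#[derive(Debug, PartialEq)]
enum SyncError {
    StaleCheckpoint,  // hash at the checkpoint's height differs (reorg)
    Backend(String),  // any other backend failure
}

// Try checkpoints from newest to oldest until the backend accepts one.
// `fetch` is a hypothetical stand-in for the minimal per-backend call.
fn sync_with_fallback<F>(
    checkpoints: &[(u32, String)], // (height, block hash), newest first
    fetch: F,
) -> Result<Vec<String>, SyncError>
where
    F: Fn(&(u32, String)) -> Result<Vec<String>, SyncError>,
{
    for cp in checkpoints {
        match fetch(cp) {
            Ok(txs) => return Ok(txs),
            // Reorged out: fall back to an earlier checkpoint and retry.
            Err(SyncError::StaleCheckpoint) => continue,
            // Anything else is a hard failure.
            Err(e) => return Err(e),
        }
    }
    Err(SyncError::Backend("no valid checkpoint".into()))
}
```

Keeping the retry loop in one shared function like this is what lets each backend stay minimal: a backend only has to report "this checkpoint is stale", not decide what to do about it.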
@thomaseizinger Ok, I think I can get on board with that approach. I don't mind leaving it up to another layer of logic to collect all the invalid checkpoints!
What I'd like to point out is that such a design will push us towards
adef166 Create vector of thread handles to spawn threads (nickfarrow)

Pull request description:

### Description

Speeds up esplora ureq syncing. Taken from #560. The current sync just creates a map of scripts to join handles, which doesn't yet spawn the sync threads due to lazy evaluation. In the following `handles.map()`, the thread handles are *sequentially* evaluated and joined. With the fix, the handles are collected so that the threads spawn in parallel, and then joined.

### Notes to the reviewers

I had to add a `#[allow(clippy::needless_collect)]` so that it wouldn't complain about collecting and then iterating. (Perhaps clippy is partially responsible for this issue!) Tested sync performance by doing a fresh sync on an existing [gun](https://gun.fun) wallet.

```
---- Before fix ---
real    0m13.121s
real    0m13.367s
real    0m13.211s
---- After fix ----
real    0m5.516s
real    0m5.251s
real    0m5.594s
```

### Checklists

#### All Submissions:

* [x] I've signed all my commits
* [x] I followed the [contribution guidelines](https://github.com/bitcoindevkit/bdk/blob/master/CONTRIBUTING.md)
* [x] I ran `cargo fmt` and `cargo clippy` before committing

#### Bugfixes:

* [ ] This pull request breaks the existing API
* [ ] I've added tests to reproduce the issue which are now passing
* [ ] I'm linking the issue being fixed by this PR

ACKs for top commit:
notmandatory: ACK adef166

Tree-SHA512: 47c617117afde9b4706bfa63759bf06d1ec60ff95d8a80931e5b7e40e3293c855d2f7dac0c681173d43aecf77201a842e739b82291da09ac81909cf526a51c8d
- Removed `Blockchain` from Wallet.
- Removed `Wallet::broadcast` (just use `Blockchain::broadcast`)
- Depreciated `Wallet::new_offline` (all wallets are offline now)
*Deprecated
```diff
 ) -> Result<(), Error> {
     use crate::blockchain::script_sync::Request;
-    let mut request = script_sync::start(database, self.stop_gap)?;
+    let mut request = script_sync::start(database, 20)?;
```
I'd suggest bringing this magic number out into a constant
This is a clever approach. I just thought of another case where I could really make use of a unified interface between wasm and desktop builds; I'm looking to make a benchmark without it needing to be browser-side, so I took the time to look things over. You have my concept ACK.
hey @cryptoquick, this idea ballooned into https://bitcoindevkit.org/blog/bdk-core-pt1/ — I'll close this.
Draft PR that will eventually fix #559 #165 #521.
Don't review yet; it's just here so co-authors @thomaseizinger @nickfarrow @notmandatory can keep track.