
Only generate a post-close lock ChannelMonitorUpdate if we need one #3619

Conversation

TheBlueMatt
Collaborator

If a channel is closed on startup, but we find that the
`ChannelMonitor` isn't aware of this, we generate a
`ChannelMonitorUpdate` containing a
`ChannelMonitorUpdateStep::ChannelForceClosed`. This ensures that
the `ChannelMonitor` will not accept any future updates in case we
somehow load up a previous `ChannelManager` (though that really
shouldn't happen).

Previously, we'd apply this update whenever we detected that the
`ChannelManager` had not yet informed the `ChannelMonitor` about
the channel's closure, even if the `ChannelMonitor` would already
refuse any other updates because it had detected the channel's
closure on chain.

Generating the update in that case accomplishes nothing but an
extra I/O write, so we skip it here.
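
As a minimal sketch of the new behavior (hypothetical types and field names, not the actual LDK structs): the lockdown update is only queued if the `ChannelMonitor` neither got a `ChannelForceClosed` from the manager nor already refuses updates because it saw the closure confirm on chain.

```rust
// Hypothetical simplification of the startup check, for illustration only.
enum UpdateStep {
    ChannelForceClosed { should_broadcast: bool },
}

struct MonitorCloseState {
    manager_reported_closed: bool,
    saw_closure_on_chain: bool,
}

fn maybe_queue_close_update(state: &MonitorCloseState, pending: &mut Vec<UpdateStep>) {
    // Previously only `manager_reported_closed` was checked, so we'd do an
    // extra I/O write even when the monitor was already locked down by an
    // on-chain closure.
    if !state.manager_reported_closed && !state.saw_closure_on_chain {
        pending.push(UpdateStep::ChannelForceClosed { should_broadcast: false });
    }
}
```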

Further, a user reported that, in regtest, they could:
 (a) coop close a channel (not generating a `ChannelMonitorUpdate`),
 (b) wait just under 4032 blocks (which, on regtest, takes only a day),
 (c) restart the `ChannelManager`, generating the above update,
 (d) connect a block or two (during the startup sequence), making
     the `ChannelMonitor` eligible for archival, and
 (e) restart the `ChannelManager` again (without applying the
     update from (c), but after having archived the
     `ChannelMonitor`), leading to a failure to deserialize as we
     have a pending `ChannelMonitorUpdate` for a `ChannelMonitor`
     that has been archived.

Though it seems very unlikely this would happen on mainnet, it is
theoretically possible.
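
A hedged sketch of the failure mode in step (e), using hypothetical names and types rather than the real deserialization code: a pending update that references a monitor we can no longer load is treated as an error on startup.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical simplification: channel ids are u64s, updates are opaque blobs.
fn check_pending_updates(
    pending_updates: &HashMap<u64, Vec<Vec<u8>>>, // channel id -> queued updates
    loadable_monitors: &HashSet<u64>,             // channel ids with a readable monitor
) -> Result<(), String> {
    for chan_id in pending_updates.keys() {
        if !loadable_monitors.contains(chan_id) {
            // The monitor was archived in step (d), but the update queued in
            // step (c) was never applied, so startup fails here.
            return Err(format!("pending update for archived monitor {chan_id}"));
        }
    }
    Ok(())
}
```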

@TheBlueMatt
Collaborator Author

I didn't bother writing a test here because (a) we have decent coverage of monitors on shutdown already, (b) this is kinda more of an I/O optimization than a bugfix and it's quite straightforward, and (c) it would break when we fix #2238. I could write a temporary test knowing that we're gonna remove it (hopefully) soon, if folks prefer, though.

@wpaulino
Contributor

LGTM. The title of the first commit makes it sound like monitors were blocked from signing when cancelling claims, but claims were only being cancelled based on whether signing already happened or not. I'm fine with not needing test coverage here though.

In `ChannelMonitorImpl::cancel_prev_commitment_claims` we need to
cancel any claims against a removed commitment transaction. We
were checking whether `holder_tx_signed` was set before checking
if either the current or previous holder commitment transaction
had pending claims against it, but (a) there's no need to do this,
as there's no meaningful performance cost to simply always trying
to remove claims, and (b) we can't actually rely on
`holder_tx_signed`.

`holder_tx_signed` being set doesn't necessarily imply that the
`ChannelMonitor` was persisted (i.e. the flag may simply be lost
in a poorly-timed restart), and we also (somewhat theoretically)
allow multiple copies of a `ChannelMonitor` to exist, in which
case a different copy could have signed the commitment transaction
which was confirmed (and then unconfirmed).

Thus, we simply remove the additional check here.
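
A hedged sketch of the resulting shape of the cancellation, with simplified, hypothetical types standing in for the real `ChannelMonitorImpl` state: claims against both holder commitment txids are simply dropped unconditionally.

```rust
use std::collections::HashMap;

// Hypothetical simplification: txids as Strings, claims as opaque blobs.
struct ClaimState {
    claims_by_commitment_txid: HashMap<String, Vec<Vec<u8>>>,
}

impl ClaimState {
    fn cancel_prev_commitment_claims(
        &mut self,
        holder_txid: &str,
        prev_holder_txid: Option<&str>,
    ) {
        // No `holder_tx_signed` gate: removing claims for a txid that has none
        // is cheap, and the flag may be stale (lost in a restart, or another
        // copy of the monitor may have signed and broadcast the transaction).
        self.claims_by_commitment_txid.remove(holder_txid);
        if let Some(txid) = prev_holder_txid {
            self.claims_by_commitment_txid.remove(txid);
        }
    }
}
```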
`provide_latest_holder_commitment_tx` is used to handle
`ChannelMonitorUpdateStep::LatestHolderCommitmentTXInfo` updates
and returns an `Err` if we've set `holder_tx_signed`.

However, later in `ChannelMonitorImpl::update_monitor` (the only
non-test place that `provide_latest_holder_commitment_tx` is
called), we will fail the entire update if `holder_tx_signed` (or
one of a few other flags) is set and the update contains a
`LatestHolderCommitmentTXInfo` (or one of a few other update types).

Thus, the check in `provide_latest_holder_commitment_tx` is
entirely redundant and can be removed.
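
A hedged sketch of why the inner check is unreachable, again with hypothetical, simplified types rather than the real `ChannelMonitorImpl`: the outer `update_monitor`-style loop rejects the whole update before the helper ever runs.

```rust
struct MonitorSketch {
    holder_tx_signed: bool,
}

enum Step {
    LatestHolderCommitmentTXInfo, // fields elided
}

impl MonitorSketch {
    fn update_monitor(&mut self, steps: &[Step]) -> Result<(), ()> {
        for step in steps {
            match step {
                Step::LatestHolderCommitmentTXInfo if self.holder_tx_signed => {
                    // The whole update fails here, before the helper below is
                    // reached, which is why the helper's own check was redundant.
                    return Err(());
                }
                Step::LatestHolderCommitmentTXInfo => {
                    self.provide_latest_holder_commitment_tx()
                }
            }
        }
        Ok(())
    }

    fn provide_latest_holder_commitment_tx(&mut self) {
        // Store the new holder commitment info (elided); no longer returns an
        // `Err` based on `holder_tx_signed`.
    }
}
```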
@TheBlueMatt force-pushed the 2025-02-merge-chanmon-lockdown branch from cc543b5 to cd2e169 on February 26, 2025
@TheBlueMatt
Collaborator Author

Fixed the commit message

@TheBlueMatt added the "weekly goal" label (Someone wants to land this this week) on Feb 27, 2025
@valentinewallace
Contributor

Landing since CI failures look unrelated

@valentinewallace merged commit 5786674 into lightningdevkit:main on Mar 3, 2025
22 of 26 checks passed
@TheBlueMatt
Collaborator Author

Backported in #3613.
