-
Notifications
You must be signed in to change notification settings - Fork 385
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix router-backtrack cases in last-hop hints #3586
Fix router-backtrack cases in last-hop hints #3586
Conversation
If we have a first-hop channel from a first-hop hint, we'll ignore the fees on it as we won't charge ourselves fees. However, if we have a first-hop channel from the network graph, we should do the same. We do so here, also teeing up a coming commit which will remove much of the custom codepath for first-hop hints and start using this common codepath as well.
These tests are a bit annoying to deal with and ultimately work on almost the same graph subset, so it makes sense to combine their graph layout logic and then call it twice. We do that here, combining them and also cleaning up the possible paths as there actually are paths that the router could select which don't meet the tests requirements.
In a coming commit we'll start calling `add_entries_to_cheapest_to_target_node` without always having a public-graph node entry in order to process last- and first-hops via a common codepath. In order to do so, we always need the `node_counter` for the node, however, and thus we track them in `RouteGraphNode` and pass them through to `add_entries_to_cheapest_to_target_node` here. We also take this opportunity to swap the node preference logic to look at the counters, which is slightly less computational work, though it does require some unrelated test changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, thanks!
Did a first pass.
eb393ea
to
c419314
Compare
Addressed the comments and also pushed a bonus commit to clean up the router to return a sane error type instead of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
@@ -2895,165 +2969,6 @@ where L::Target: Logger { | |||
} | |||
} | |||
} | |||
for route in payment_params.payee.unblinded_route_hints().iter() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great to drop all this code 💯
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, I think. Feel free to squash!
Confirmed the patch fixes the particular fuzz failure, and also ran the router target a bit.
c419314
to
0b34257
Compare
Pushed comment fixes for @valentinewallace will squash once she's happy. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay, on a flight right now. Feel free to squash these nits in!
{ | ||
old_entry.value_contribution_msat = value_contribution_msat; | ||
} | ||
old_entry.value_contribution_msat = value_contribution_msat; | ||
hop_contribution_amt_msat = Some(value_contribution_msat); | ||
} else if old_entry.was_processed && new_cost < old_cost { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also use should_replace
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we stick with new_cost < old_cost
as should_replace
is broader (incorporating equal-cost-but-higher-value replacement that we'd have to adapt the below debug assertions to exempt).
This likely only impacts very rare edge cases, but if we have two equal-cost paths, we should likely prefer ones which contribute more value (avoiding cases where we use paths which are amount-limited but equal fee to higher-amount paths) and then paths with fewer hops (which may complete faster). It does make test behavior more robust against router changes, which comes in handy over the coming commits.
When we handle the unblinded last-hop route hints from an invoice, we had a good bit of code dedicated to handling fee propagation through the (potentially) multiple last-hops and connecting them to potentially directly-connected first-hops. This was a good bit of code that was almost never used, and it turns out was also buggy - we could process a route hint with multiple hops, committing to one path through nodes A, B, to C, then process another route hint (or public channel) which changes our best path from B to C, making the A entry invalid. Here we remove the whole maze, utilizing the normal hop-processing logic in `add_entries_to_cheapest_to_target_node` for last-hops as well. It requires tracking which nodes connect to last-hop hints similar to the way we do with `is_first_hop_target` in `PathBuildingHop`, storing the `CandidateRouteHop`s in a new map, and always calling `add_entries_to_cheapest_to_target_node` on the payee node, whether its public or not.
When we do pathfinding with blinded paths, we start each pathfinding iteration by inserting all the blinded paths into our nodes map as last-hops to the destination. As we do that, we check if any of the introduction points happen to be nodes we have direct chanels with, as we want to use the local info for such channels and support finding a path even if that channel is not publicly announced. However, as we iterate the blinded paths, we may find a second blinded path from the same introduction point which we prefer over the first. If this happens, we would already have added info from us over the local channel to that intro point and end up with calculations for the first hop to a blinded path that we no longer prefer. This is ultimately fixed here in two ways: (a) we process the first-hop channels to blinded path introduction points in a separate loop after we've processed all blinded paths, ensuring we only ever consider a channel to the blinded path we will ultimately prefer. (b) In the next commit, we add we add a new tracking bool in `PathBuildingHop` called `best_path_from_hop_selected` which we set when we process a channel backwards from a node, indicating that we've committed to the best path to the node and check when we add a new path to a node. This would have resulted in a much earlier debug-assertion in fuzzing or several tests.
When we process a path backwards from a node during pathfinding, we implicitly commit to the path up to that node. Any changes to the preferred path up to that node will make the newly processed path's state invalid. In the previous few commits we fixed cases for this in last-hop paths (both blinded and unblinded). Here we add assertions to enforce this, tracked in a new bool in `PathBuildingHop`.
The router is a somewhat complicated beast, and though the last few commits removed some code from it, a complicated beast it remains. Thus, having `expect`s in it is somewhat risky, so we take this opportunity to replace some of them with `debug_assert!(false)`s and an `Err`-return.
When we see a channel come into the router as a route-hint, but its for a direct channel of ours, we'd like to ignore the route-hint as we have more information in the first-hop channel info. We do this by matching SCIDs, but only considered outbound SCID aliases. Here we change to consider both outbound SCID aliases and the full channel SCID, which some nodes may use in their invoices.
`LightningError` is an error type for returning errors back to the `PeerHandler` when handling P2P messages. However, it used to be more broadly used, in a way that never made any sense. Here we remove on vestige of this, using a `&'static str` for router errors rather than `LightningError` with a constant `action`.
0b34257
to
0f3c4d2
Compare
Squashed with a few more comment tweaks: $ git diff-tree -U1 0b3425701 0f3c4d26d
diff --git a/lightning/src/routing/router.rs b/lightning/src/routing/router.rs
index 9e59613ec..8fa9968bc 100644
--- a/lightning/src/routing/router.rs
+++ b/lightning/src/routing/router.rs
@@ -2880,9 +2880,9 @@ where L::Target: Logger {
// Step (2).
- // Add entries for first-hop and last-hop channel hints to `dist` and add the target node
- // as the best entry via `add_node`.
+ // Add entries for first-hop and last-hop channel hints to `dist` and add the payee node as
+ // the best entry via `add_entry`.
// For first- and last-hop hints we need only add dummy entries in `dist` with the relevant
// flags set. As we walk the graph in `add_entries_to_cheapest_to_target_node` we'll check
- // those flags and use the hints.
- // We then either add the target using `add_entries_to_cheapest_to_target_node` or add the
- // blinded paths to the target using `add_entry`, filling `targets` and setting us up for
+ // those flags and add the channels described by the hints.
+ // We then either add the payee using `add_entries_to_cheapest_to_target_node` or add the
+ // blinded paths to the payee using `add_entry`, filling `targets` and setting us up for
// our graph walk. |
11d12d1
into
lightningdevkit:main
Backported (all but the last commit) in #3613 |
This fixes the router fuzzer-identified issue in #3553 which ultimately were because the router was backtracking and overwriting the preferred path to a node after we'd already processed hops away from that node. We fix it both for blinded and non-blinded paths, though in two very different ways (removing a lot of code for the non-blinded case!).