Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix router-backtrack cases in last-hop hints #3586

Merged

Conversation

TheBlueMatt
Copy link
Collaborator

This fixes the router fuzzer-identified issue in #3553 which ultimately were because the router was backtracking and overwriting the preferred path to a node after we'd already processed hops away from that node. We fix it both for blinded and non-blinded paths, though in two very different ways (removing a lot of code for the non-blinded case!).

If we have a first-hop channel from a first-hop hint, we'll ignore
the fees on it as we won't charge ourselves fees. However, if we
have a first-hop channel from the network graph, we should do the
same.

We do so here, also teeing up a coming commit which will remove
much of the custom codepath for first-hop hints and start using
this common codepath as well.
These tests are a bit annoying to deal with and ultimately work on
almost the same graph subset, so it makes sense to combine their
graph layout logic and then call it twice.

We do that here, combining them and also cleaning up the possible
paths as there actually are paths that the router could select
which don't meet the tests requirements.
In a coming commit we'll start calling
`add_entries_to_cheapest_to_target_node` without always having a
public-graph node entry in order to process last- and first-hops
via a common codepath. In order to do so, we always need the
`node_counter` for the node, however, and thus we track them in
`RouteGraphNode` and pass them through to
`add_entries_to_cheapest_to_target_node` here.

We also take this opportunity to swap the node preference logic to
look at the counters, which is slightly less computational work,
though it does require some unrelated test changes.
@tnull tnull self-requested a review February 4, 2025 11:28
@TheBlueMatt TheBlueMatt added the weekly goal Someone wants to land this this week label Feb 6, 2025
Copy link
Contributor

@tnull tnull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, thanks!

Did a first pass.

@TheBlueMatt
Copy link
Collaborator Author

Addressed the comments and also pushed a bonus commit to clean up the router to return a sane error type instead of LightningError (which should also keep code from blowing up in a few places when we rustfmt soon).

@valentinewallace valentinewallace self-requested a review February 11, 2025 17:14
Copy link
Contributor

@valentinewallace valentinewallace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@@ -2895,165 +2969,6 @@ where L::Target: Logger {
}
}
}
for route in payment_params.payee.unblinded_route_hints().iter()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great to drop all this code 💯

Copy link
Contributor

@tnull tnull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I think. Feel free to squash!

Confirmed the patch fixes the particular fuzz failure, and also ran the router target a bit.

@TheBlueMatt
Copy link
Collaborator Author

Pushed comment fixes for @valentinewallace will squash once she's happy.

Copy link
Contributor

@valentinewallace valentinewallace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay, on a flight right now. Feel free to squash these nits in!

{
old_entry.value_contribution_msat = value_contribution_msat;
}
old_entry.value_contribution_msat = value_contribution_msat;
hop_contribution_amt_msat = Some(value_contribution_msat);
} else if old_entry.was_processed && new_cost < old_cost {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also use should_replace here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we stick with new_cost < old_cost as should_replace is broader (incorporating equal-cost-but-higher-value replacement that we'd have to adapt the below debug assertions to exempt).

This likely only impacts very rare edge cases, but if we have two
equal-cost paths, we should likely prefer ones which contribute
more value (avoiding cases where we use paths which are
amount-limited but equal fee to higher-amount paths) and then paths
with fewer hops (which may complete faster).

It does make test behavior more robust against router changes,
which comes in handy over the coming commits.
When we handle the unblinded last-hop route hints from an invoice,
we had a good bit of code dedicated to handling fee propagation
through the (potentially) multiple last-hops and connecting them to
potentially directly-connected first-hops.

This was a good bit of code that was almost never used, and it
turns out was also buggy - we could process a route hint with
multiple hops, committing to one path through nodes A, B, to C,
then process another route hint (or public channel) which changes
our best path from B to C, making the A entry invalid.

Here we remove the whole maze, utilizing the normal hop-processing
logic in `add_entries_to_cheapest_to_target_node` for last-hops as
well. It requires tracking which nodes connect to last-hop hints
similar to the way we do with `is_first_hop_target` in
`PathBuildingHop`, storing the `CandidateRouteHop`s in a new map,
and always calling `add_entries_to_cheapest_to_target_node` on the
payee node, whether its public or not.
When we do pathfinding with blinded paths, we start each
pathfinding iteration by inserting all the blinded paths into our
nodes map as last-hops to the destination. As we do that, we check
if any of the introduction points happen to be nodes we have direct
chanels with, as we want to use the local info for such channels
and support finding a path even if that channel is not publicly
announced.

However, as we iterate the blinded paths, we may find a second
blinded path from the same introduction point which we prefer over
the first. If this happens, we would already have added info from
us over the local channel to that intro point and end up with
calculations for the first hop to a blinded path that we no longer
prefer.

This is ultimately fixed here in two ways:
(a) we process the first-hop channels to blinded path introduction
    points in a separate loop after we've processed all blinded
    paths, ensuring we only ever consider a channel to the blinded
    path we will ultimately prefer.
(b) In the next commit, we add we add a new tracking bool in
    `PathBuildingHop` called `best_path_from_hop_selected` which we
    set when we process a channel backwards from a node, indicating
    that we've committed to the best path to the node and check when
    we add a new path to a node. This would have resulted in a much
    earlier debug-assertion in fuzzing or several tests.
When we process a path backwards from a node during pathfinding, we
implicitly commit to the path up to that node. Any changes to the
preferred path up to that node will make the newly processed path's
state invalid.

In the previous few commits we fixed cases for this in last-hop
paths (both blinded and unblinded).

Here we add assertions to enforce this, tracked in a new bool in
`PathBuildingHop`.
The router is a somewhat complicated beast, and though the last few
commits removed some code from it, a complicated beast it remains.
Thus, having `expect`s in it is somewhat risky, so we take this
opportunity to replace some of them with `debug_assert!(false)`s
and an `Err`-return.
When we see a channel come into the router as a route-hint, but its
for a direct channel of ours, we'd like to ignore the route-hint as
we have more information in the first-hop channel info. We do this
by matching SCIDs, but only considered outbound SCID aliases.

Here we change to consider both outbound SCID aliases and the full
channel SCID, which some nodes may use in their invoices.
`LightningError` is an error type for returning errors back to the
`PeerHandler` when handling P2P messages. However, it used to be
more broadly used, in a way that never made any sense.

Here we remove on vestige of this, using a `&'static str` for
router errors rather than `LightningError` with a constant
`action`.
@TheBlueMatt
Copy link
Collaborator Author

Squashed with a few more comment tweaks:

$ git diff-tree -U1 0b3425701 0f3c4d26d
diff --git a/lightning/src/routing/router.rs b/lightning/src/routing/router.rs
index 9e59613ec..8fa9968bc 100644
--- a/lightning/src/routing/router.rs
+++ b/lightning/src/routing/router.rs
@@ -2880,9 +2880,9 @@ where L::Target: Logger {
 		// Step (2).
-		// Add entries for first-hop and last-hop channel hints to `dist` and add the target node
-		// as the best entry via `add_node`.
+		// Add entries for first-hop and last-hop channel hints to `dist` and add the payee node as
+		// the best entry via `add_entry`.
 		// For first- and last-hop hints we need only add dummy entries in `dist` with the relevant
 		// flags set. As we walk the graph in `add_entries_to_cheapest_to_target_node` we'll check
-		// those flags and use the hints.
-		// We then either add the target using `add_entries_to_cheapest_to_target_node` or add the
-		// blinded paths to the target using `add_entry`, filling `targets` and setting us up for
+		// those flags and add the channels described by the hints.
+		// We then either add the payee using `add_entries_to_cheapest_to_target_node` or add the
+		// blinded paths to the payee using `add_entry`, filling `targets` and setting us up for
 		// our graph walk.

@valentinewallace valentinewallace merged commit 11d12d1 into lightningdevkit:main Feb 13, 2025
24 of 26 checks passed
@TheBlueMatt
Copy link
Collaborator Author

Backported (all but the last commit) in #3613

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
weekly goal Someone wants to land this this week
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants