Commit 7c5ce8c

fix(discovery): prevent premature node eviction from routing table

The replaceNode function removes a node if either the replacement cache has entries or the node's reliability is below the chosen threshold. This means that nodes with perfect or good reliability (seen >= 0.5) are evicted just because replacement candidates exist, completely ignoring the reliability threshold.

Evidence from production logs shows nodes with excellent reliability being removed:

    DBG - Node added to routing table topics="discv5 routingtable" tid=1 n=1ff*7a561e:10.244.0.208:6890
    DBG - bucket topics="discv5" tid=1 depth=0 len=2 standby=0
    DBG - node topics="discv5" tid=1 n=130*db8a1b:10.244.2.207:6890 rttMin=1 rttAvg=2 reliability=1.0
    DBG - node topics="discv5" tid=1 n=1ff*7a561e:10.244.0.208:6890 rttMin=1 rttAvg=14 reliability=1.0
    DBG - Node removed from routing table topics="discv5 routingtable" tid=1 n=1ff*7a561e:10.244.0.208:6890
    DBG - Total nodes in discv5 routing table topics="discv5" tid=1 total=1
    DBG - bucket topics="discv5" tid=1 depth=0 len=1 standby=0
    DBG - node topics="discv5" tid=1 n=130*db8a1b:10.244.2.207:6890 rttMin=1 rttAvg=165 reliability=0.957
    DBG - Node removed from routing table topics="discv5 routingtable" tid=1 n=130*db8a1b:10.244.2.207:6890
    DBG - Total nodes in discv5 routing table topics="discv5" tid=1 total=0

The first removal is of a node with perfect reliability (1.0) and a 14 ms average RTT. The second is of a node with 95.7% reliability and a minimal RTT of 1 ms. Both far exceed the 0.5 threshold set by NoreplyRemoveThreshold.

Nodes are now removed only if their reliability actually falls below the specified threshold, regardless of replacement cache status. This allows the reliability tracking to properly tolerate transient failures. The replacement cache logic inside the conditional remains unchanged: if a replacement exists, it is used; otherwise the node is just removed.

Signed-off-by: Chrysostomos Nanakos <[email protected]>

1 parent f6eef1a commit 7c5ce8c

1 file changed: +1 -1 lines changed


codexdht/private/eth/p2p/discoveryv5/routing_table.nim

Lines changed: 1 addition & 1 deletion
@@ -445,7 +445,7 @@ proc replaceNode*(r: var RoutingTable, n: Node, forceRemoveBelow = 1.0) =
   ## also avoids being too agressive because UDP losses or temporary network
   ## failures.
   let b = r.bucketForNode(n.id)
-  if (b.replacementCache.len > 0 or n.seen <= forceRemoveBelow):
+  if n.seen <= forceRemoveBelow:
     if b.remove(n):
       debug "Node removed from routing table", n
       ipLimitDec(r, b, n)
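
The effect of the one-line change can be illustrated in isolation. The following is a minimal sketch, not code from routing_table.nim: `shouldEvictOld` and `shouldEvictNew` are hypothetical helper names that mirror only the condition before and after this commit, and the 0.5 threshold is taken from the commit message rather than from the source file.

```nim
# Hypothetical stand-ins for the eviction condition in replaceNode.
# These procs are illustrative only and do not exist in the repository.

proc shouldEvictOld(replacementCacheLen: int, seen, forceRemoveBelow: float): bool =
  ## Condition before this commit: a non-empty replacement cache alone
  ## is enough to evict, whatever the node's reliability is.
  replacementCacheLen > 0 or seen <= forceRemoveBelow

proc shouldEvictNew(seen, forceRemoveBelow: float): bool =
  ## Condition after this commit: only the reliability threshold decides.
  seen <= forceRemoveBelow

when isMainModule:
  const threshold = 0.5  # assumed value, per the commit message

  # A node with reliability 0.957 and one replacement candidate waiting:
  doAssert shouldEvictOld(1, 0.957, threshold)      # evicted before the fix
  doAssert not shouldEvictNew(0.957, threshold)     # kept after the fix

  # A node whose reliability has dropped to 0.3 is evicted either way:
  doAssert shouldEvictOld(0, 0.3, threshold)
  doAssert shouldEvictNew(0.3, threshold)

  # The comparison is `<=`, so a node exactly at the threshold is still evicted:
  doAssert shouldEvictNew(0.5, threshold)
```

Because the signature in the diff defaults `forceRemoveBelow` to 1.0, a caller that wants the tolerance described in the commit message presumably has to pass a lower threshold explicitly.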
