ping: Conform to the spec & exclude from connection keep-alive#416

Merged
dmitry-markin merged 30 commits into paritytech:master from dharjeezy:dami/repeat-request-periodically
Feb 27, 2026

@dharjeezy
Contributor

@dharjeezy dharjeezy commented Jul 19, 2025

This PR makes the ping protocol conform to the spec, specifically:

  1. Repeat ping requests instead of sending them only on connect.
  2. Reuse ping substreams (both inbound and outbound) for subsequent requests, as required by the spec. This is needed for smoldot compatibility.
  3. Exclude ping substreams from the connection keep-alive mechanism, allowing it to close the connection on timeout if only ping (and identify) substreams are open.

Tested litep2p-litep2p, litep2p-(pre-PR)litep2p, litep2p-smoldot.

Closes #415
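
For readers unfamiliar with the spec, here is a minimal std-only sketch of points 1 and 2: the dialer writes a 32-byte payload, waits for the identical bytes to come back, and repeats on the same stream rather than opening a new one. The names `ping_once`, `ping_repeatedly`, and `EchoPeer` are hypothetical and are not litep2p APIs; the in-memory `EchoPeer` stands in for a real substream, and the payload is a fixed pattern where the spec asks for random bytes.

```rust
use std::collections::VecDeque;
use std::io::{self, Read, Write};
use std::time::{Duration, Instant};

/// Payload size fixed by the libp2p ping spec.
const PING_SIZE: usize = 32;

/// Send one ping on an already-open stream and wait for the echo.
/// Returns the measured round-trip time.
fn ping_once<S: Read + Write>(stream: &mut S, payload: &[u8; PING_SIZE]) -> io::Result<Duration> {
    let started = Instant::now();
    stream.write_all(payload)?;
    stream.flush()?;
    let mut echo = [0u8; PING_SIZE];
    stream.read_exact(&mut echo)?;
    if &echo != payload {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "ping payload mismatch"));
    }
    Ok(started.elapsed())
}

/// Repeat pings on the same stream instead of opening a new one each time.
/// A real implementation would wait for a configured interval between rounds.
fn ping_repeatedly<S: Read + Write>(stream: &mut S, rounds: u8) -> io::Result<Vec<Duration>> {
    let mut rtts = Vec::new();
    for round in 0..rounds {
        // Assumption: a counter-derived pattern replaces random bytes to stay std-only.
        let payload = [round; PING_SIZE];
        rtts.push(ping_once(stream, &payload)?);
    }
    Ok(rtts)
}

/// Hypothetical in-memory peer that echoes back every byte written to it.
#[derive(Default)]
struct EchoPeer {
    buffered: VecDeque<u8>,
}

impl Write for EchoPeer {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.buffered.extend(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

impl Read for EchoPeer {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // VecDeque<u8> implements Read by draining from the front.
        self.buffered.read(buf)
    }
}
```

Calling `ping_repeatedly(&mut EchoPeer::default(), 3)` yields three round-trip times measured over one and the same stream.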

Collaborator

@dmitry-markin dmitry-markin left a comment

Hey, this is going in the right direction, but there should be a way to simplify the implementation by not introducing separate async tasks for inbound/outbound substreams and not adding command channels. It should be possible to implement polling of inbound substreams by putting them into FuturesUnordered or, maybe, tokio_stream::StreamMap. The latter should be quite easy, as at most one inbound substream is allowed from any remote peer as per the ping spec: https://github.com/libp2p/specs/blob/master/ping/ping.md

@dmitry-markin
Collaborator

tokio_stream::StreamMap might not work because we also need to write to the substream, not only poll it, but there should still be a way to simplify the code.

@dharjeezy
Contributor Author

@dmitry-markin can you check my current implementation?

Collaborator

@dmitry-markin dmitry-markin left a comment

Hey hey, nice use of SplitStream/SplitSink! I think what is missing is a cleaner separation between inbound and outbound substreams. On inbound substreams we should only respond to incoming pings, and on outbound substreams we should send pings ourselves. This would be better than relying on the presence of the time in the map, which can get screwed up when we both send pings to a peer and receive pings from it.
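
One way to picture the separation suggested here is a per-substream state tagged by direction, so that the decision to echo or to measure a round trip never depends on whether a timestamp happens to be stored. This is a hedged sketch with hypothetical types (`PingSubstream`, `Action`), not the code that was merged.

```rust
use std::time::{Duration, Instant};

const PING_SIZE: usize = 32;

/// What the handler should do after reading 32 bytes from a substream.
#[derive(Debug, PartialEq)]
enum Action {
    /// Inbound substream: write the same bytes back.
    Echo([u8; PING_SIZE]),
    /// Outbound substream: our ping was answered after this long.
    ReportRtt(Duration),
    /// Outbound substream: unexpected or mismatched data.
    Fail,
}

/// Per-substream state, tagged by direction (hypothetical type).
enum PingSubstream {
    /// The remote opened it: we only respond.
    Inbound,
    /// We opened it: we only send pings and await the echo.
    Outbound { pending: Option<([u8; PING_SIZE], Instant)> },
}

impl PingSubstream {
    /// Record a ping we just sent. Returns false on inbound substreams,
    /// where we never initiate pings.
    fn on_ping_sent(&mut self, payload: [u8; PING_SIZE], now: Instant) -> bool {
        match self {
            PingSubstream::Outbound { pending } => {
                *pending = Some((payload, now));
                true
            }
            PingSubstream::Inbound => false,
        }
    }

    /// Decide what to do with 32 bytes read from this substream.
    fn on_bytes(&mut self, bytes: [u8; PING_SIZE], now: Instant) -> Action {
        match self {
            PingSubstream::Inbound => Action::Echo(bytes),
            PingSubstream::Outbound { pending } => match pending.take() {
                Some((sent, at)) if sent == bytes => Action::ReportRtt(now.duration_since(at)),
                _ => Action::Fail,
            },
        }
    }
}
```

With this shape, a peer that pings us while we are pinging it touches only the `Inbound` value, so the outbound timestamp cannot be confused with an incoming request.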

@dharjeezy
Contributor Author

I have done this now, @dmitry-markin.

@haikoschol
Contributor

Out of scope for this PR but I noticed that ping::config::MAX_FAILURES is not used anywhere. Should we create an issue for that?

Contributor

@haikoschol haikoschol left a comment

FWIW, I ran the following tests, using a litep2p listener and webrtc transport:

| Dialer | Listener -> Dialer Pings | Dialer -> Listener Pings |
| --- | --- | --- |
| libp2p | | |
| "minimal smoldot env" | | |

The failing outbound pings with libp2p are an unrelated issue (#494).

@dharjeezy dharjeezy requested a review from haikoschol December 24, 2025 05:21
@dharjeezy dharjeezy requested a review from haikoschol January 18, 2026 17:49
@dharjeezy
Contributor Author

@lexnv @haikoschol @dmitry-markin please help review

@haikoschol
Contributor

LGTM, but my approval doesn't count towards mergeability.

@dharjeezy dharjeezy requested a review from haikoschol February 21, 2026 12:54
@dmitry-markin
Collaborator

@lexnv @haikoschol Can you review the PR one more time, please? I have pushed some fixes and it should be ready for merging now.

One thing that worries me a bit is that connection keep-alive is finally working and closing connections, and only request-response protocols keep a connection open, because no other protocols open new substreams and kick KeepAliveTracker. Things seem to work fine in zombienet with the default polkadot keep-alive timeout of 10 seconds (i.e., > 6s block time), but I'd rather do more tests where the sync protocol does not download blocks from every connected peer every 6 seconds. Ideally, we should make the Notifications protocol also kick KeepAliveTracker on every notification sent/received.
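
To make the concern concrete, here is a rough sketch of a keep-alive tracker that ignores ping and identify substreams and, as the suggested follow-up, also treats notifications as activity. The `KeepAlive` type and its methods are hypothetical stand-ins, not litep2p's `KeepAliveTracker`; only the two protocol IDs are the standard libp2p ones. Timestamps are passed in explicitly so the behaviour is easy to check.

```rust
use std::time::{Duration, Instant};

/// Protocols whose substreams should not keep a connection alive.
const EXCLUDED: [&str; 2] = ["/ipfs/ping/1.0.0", "/ipfs/id/1.0.0"];

/// Hypothetical stand-in for a connection keep-alive tracker.
struct KeepAlive {
    timeout: Duration,
    last_activity: Instant,
}

impl KeepAlive {
    fn new(timeout: Duration, now: Instant) -> Self {
        Self { timeout, last_activity: now }
    }

    /// Called when a substream is opened; ping and identify are ignored.
    fn on_substream_opened(&mut self, protocol: &str, now: Instant) {
        if !EXCLUDED.iter().any(|excluded| *excluded == protocol) {
            self.last_activity = now;
        }
    }

    /// Suggested extension: a sent or received notification also counts as activity.
    fn on_notification(&mut self, now: Instant) {
        self.last_activity = now;
    }

    /// True once no counted activity has been seen for a full timeout.
    fn should_close(&self, now: Instant) -> bool {
        now.duration_since(self.last_activity) >= self.timeout
    }
}
```

With a 10-second timeout, a connection that only sees ping traffic is closed after 10 seconds, while a request every 6 seconds keeps it open. Without something like `on_notification`, a peer that only exchanges notifications over long-lived substreams would be dropped.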

@@ -104,13 +117,19 @@ impl ConfigBuilder {
self
}

Collaborator

Nit: Is the cargo doc check ignoring missing docs on public methods? We should also document this and state the defaults (same for the public interfaces above).

Collaborator

litep2p doesn't have #![warn(missing_docs)] set. But this is for another PR :)
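
For illustration, this is roughly what the lint and the requested documentation could look like. The sketch uses the outer-attribute form `#[warn(missing_docs)]` on a module so that it stays self-contained; a crate would normally put `#![warn(missing_docs)]` at the top of `lib.rs`. The module, the builder methods, and the 15-second default are all hypothetical and are not taken from litep2p.

```rust
/// Sketch of a documented ping configuration (hypothetical names and defaults).
#[warn(missing_docs)]
pub mod ping_config {
    use std::time::Duration;

    /// Default interval between outbound pings (assumed value for this sketch).
    pub const DEFAULT_PING_INTERVAL: Duration = Duration::from_secs(15);

    /// Builder for a hypothetical ping configuration.
    pub struct ConfigBuilder {
        ping_interval: Duration,
    }

    impl ConfigBuilder {
        /// Create a builder that uses [`DEFAULT_PING_INTERVAL`].
        pub fn new() -> Self {
            Self { ping_interval: DEFAULT_PING_INTERVAL }
        }

        /// Set the interval between outbound pings.
        ///
        /// Defaults to [`DEFAULT_PING_INTERVAL`] when not called.
        pub fn with_ping_interval(mut self, interval: Duration) -> Self {
            self.ping_interval = interval;
            self
        }

        /// Return the currently configured ping interval.
        pub fn ping_interval(&self) -> Duration {
            self.ping_interval
        }
    }
}
```

Removing any of the `///` comments on a public item inside the module would make the compiler emit a `missing_docs` warning for it.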

Contributor

@haikoschol haikoschol left a comment

LGTM, although I feel like I can't really assess the impact of the changes on the keepalive mechanism.

What I can say is that these changes do not seem to cause any issues when connecting from smoldot using webrtc. I've merged them into the branch I've been using for testing and it worked like a charm.

Collaborator

@lexnv lexnv left a comment

Nice job here 🙏

@dmitry-markin dmitry-markin changed the title from "Repeat ping requests & fix connection keep-alive" to "ping: Conform to the spec & exclude from connection keep-alive" Feb 27, 2026
@dmitry-markin dmitry-markin merged commit a7be0c2 into paritytech:master Feb 27, 2026
8 checks passed
dmitry-markin added a commit that referenced this pull request Feb 27, 2026
## [0.13.1] - 2026-02-27

This release includes multiple fixes of transports and protocols, fixing
connection stability issues with other libraries (specifically,
[smoldot](https://github.com/smol-dot/smoldot/)) and increasing success
rates of dialing & opening substreams, especially in extreme cases when
remote nodes have a lot of private addresses published to the DHT.

### Fixed

- ping: Conform to the spec & exclude from connection keep-alive
([#416](#416))
- transport: Make accept async to close the gap on service races
([#525](#525))
- transport: Limit dial concurrency and bound total dialing time
([#538](#538))
- webrtc: Support `FIN`/`FIN_ACK` handshake for substream shutdown
([#513](#513))
- transport: Expose failed addresses to the transport manager
([#529](#529))

### Changed

- manager: Prioritize public addresses for dialing
([#530](#530))

Development

Successfully merging this pull request may close these issues.

Ping: repeat requests periodically

4 participants