Skip to content

Conversation

@lidel
Copy link
Member

@lidel lidel commented Aug 16, 2025

This PR makes NAT port mappings self-healing after router restarts.
It also adds bunch of tests and eliminates duplicate mapping requests that occur on node start.

Changes

1. Automatic NAT recovery after router restart

Router restarts can cause UPnP (SOAP) services to change their listening ports, leading to "connection refused" errors that permanently break port mappings.
This is the problem described in #3224 (comment) (cc @sukunrt @MarcoPolo)

We had Kubo users impacted this since forever, impacting their ability to provide data:

This PR implements automatic NAT rediscovery that detects consecutive connection failures, triggers rediscovery after 3 failures, and restores all existing port mappings on the new NAT instance.

FWIW I've tested it extensively in my own LAN – see "Demo" at the end.

IIUC this

2. Port mapping deduplication

This is something I've discovered while testing (1), so also fixed it in this PR.

Multiple libp2p transports sharing the same port (TCP, QUIC, WebTransport, WebRTC-direct) were causing duplicate NAT mapping requests. The fix adds deduplication in NAT.AddMapping() that checks if mapping already exists before making NAT API calls.

In case of Kubo IPFS node:

  • Before: 10 (5 TCP + 5 UDP) mapping attempts for the same port
  • After: 2 (1 TCP + 1 UDP) mapping attempt (visible at 04:14:06 in the demo)

Demo

The following log shows automatic NAT recovery in action. The router's UPnP service (miniupnpd) restarts and changes its SOAP/HTTP endpoint from port 45251 to a different port, causing "connection refused" errors. After 3 consecutive failures, go-libp2p automatically rediscovers the NAT and restores all mappings:

$ export GOLOG_LOG_LEVEL="error,nat=debug"
$ ipfs daemon
Initializing daemon...
Kubo version: 0.37.0-dev-710953446-dirty
Repo version: 16
System version: amd64/linux
Golang version: go1.25.0
PeerID: 12D3KooWAFYEQLmxbY5xowmqMAFADRbwRDQiSM1cfbmphdHKbhex
Swarm listening on 127.0.0.1:4001 (TCP+UDP)
Swarm listening on 172.17.0.1:4001 (TCP+UDP)
Swarm listening on 172.18.0.1:4001 (TCP+UDP)
Swarm listening on 192.168.1.102:4001 (TCP+UDP)
Swarm listening on [::1]:4001 (TCP+UDP)
Run 'ipfs id' to inspect announced and discovered multiaddrs of this node.
RPC API server listening on /ip4/127.0.0.1/tcp/5001
WebUI: http://127.0.0.1:5001/webui
Gateway server listening on /ip4/127.0.0.1/tcp/8080
Daemon is ready
2025-08-16T04:14:06.558+0200	INFO	nat	nat/nat.go:67	DiscoverGateway address:192.168.1.1
2025-08-16T04:14:06.559+0200	INFO	nat	nat/nat.go:172	Starting maintenance of port mapping: tcp/4001
2025-08-16T04:14:06.559+0200	DEBUG	nat	nat/nat.go:281	Attempting port map: tcp/4001
2025-08-16T04:14:06.668+0200	DEBUG	nat	nat/nat.go:298	NAT port mapping established: tcp 4001 -> 16174
2025-08-16T04:14:06.668+0200	INFO	nat	nat/nat.go:172	Starting maintenance of port mapping: udp/4001
2025-08-16T04:14:06.668+0200	DEBUG	nat	nat/nat.go:281	Attempting port map: udp/4001
2025-08-16T04:14:06.954+0200	DEBUG	nat	nat/nat.go:298	NAT port mapping established: udp 4001 -> 16174
2025-08-16T04:14:26.559+0200	DEBUG	nat	nat/nat.go:281	Attempting port map: tcp/4001
2025-08-16T04:14:26.567+0200	DEBUG	nat	nat/nat.go:298	NAT port mapping established: tcp 4001 -> 16174
2025-08-16T04:14:26.568+0200	DEBUG	nat	nat/nat.go:281	Attempting port map: udp/4001
2025-08-16T04:14:26.573+0200	DEBUG	nat	nat/nat.go:298	NAT port mapping established: udp 4001 -> 16174
2025-08-16T04:14:46.574+0200	DEBUG	nat	nat/nat.go:281	Attempting port map: tcp/4001
2025-08-16T04:14:46.580+0200	DEBUG	nat	nat/nat.go:298	NAT port mapping established: tcp 4001 -> 16174
2025-08-16T04:14:46.580+0200	DEBUG	nat	nat/nat.go:281	Attempting port map: udp/4001
2025-08-16T04:14:46.585+0200	DEBUG	nat	nat/nat.go:298	NAT port mapping established: udp 4001 -> 16174
2025-08-16T04:15:06.586+0200	DEBUG	nat	nat/nat.go:281	Attempting port map: tcp/4001
2025-08-16T04:15:06.596+0200	DEBUG	nat	nat/nat.go:298	NAT port mapping established: tcp 4001 -> 16174
2025-08-16T04:15:06.596+0200	DEBUG	nat	nat/nat.go:281	Attempting port map: udp/4001
2025-08-16T04:15:06.601+0200	DEBUG	nat	nat/nat.go:298	NAT port mapping established: udp 4001 -> 16174
2025-08-16T04:15:26.601+0200	DEBUG	nat	nat/nat.go:281	Attempting port map: udp/4001
2025-08-16T04:15:26.607+0200	DEBUG	nat	nat/nat.go:298	NAT port mapping established: udp 4001 -> 16174
2025-08-16T04:15:26.607+0200	DEBUG	nat	nat/nat.go:281	Attempting port map: tcp/4001
2025-08-16T04:15:26.612+0200	DEBUG	nat	nat/nat.go:298	NAT port mapping established: tcp 4001 -> 16174
2025-08-16T04:15:46.613+0200	DEBUG	nat	nat/nat.go:281	Attempting port map: tcp/4001
2025-08-16T04:16:16.616+0200	WARN	nat	nat/nat.go:326	NAT port mapping failed: protocol=tcp internal_port=4001 external_port=0
2025-08-16T04:16:16.616+0200	DEBUG	nat	nat/nat.go:281	Attempting port map: udp/4001
2025-08-16T04:16:16.616+0200	WARN	nat	nat/nat.go:326	NAT port mapping failed: protocol=udp internal_port=4001 external_port=0
2025-08-16T04:16:36.617+0200	DEBUG	nat	nat/nat.go:281	Attempting port map: tcp/4001
2025-08-16T04:16:36.617+0200	WARN	nat	nat/nat.go:326	NAT port mapping failed: protocol=tcp internal_port=4001 external_port=0
2025-08-16T04:16:36.617+0200	DEBUG	nat	nat/nat.go:281	Attempting port map: udp/4001
2025-08-16T04:16:36.618+0200	WARN	nat	nat/nat.go:326	NAT port mapping failed: protocol=udp internal_port=4001 external_port=0
2025-08-16T04:17:16.619+0200	DEBUG	nat	nat/nat.go:281	Attempting port map: tcp/4001
2025-08-16T04:17:16.619+0200	WARN	nat	nat/nat.go:326	NAT port mapping failed: protocol=tcp internal_port=4001 external_port=0
2025-08-16T04:17:16.619+0200	DEBUG	nat	nat/nat.go:281	Attempting port map: udp/4001
2025-08-16T04:17:16.620+0200	WARN	nat	nat/nat.go:326	NAT port mapping failed: protocol=udp internal_port=4001 external_port=0
2025-08-16T04:17:36.621+0200	DEBUG	nat	nat/nat.go:281	Attempting port map: tcp/4001
2025-08-16T04:17:36.654+0200	WARN	nat	nat/nat.go:304	NAT port mapping failed: protocol=tcp internal_port=4001 error="goupnp: error performing SOAP HTTP request: Post \"http://192.168.1.1:45251/ctl/IPConn\": dial tcp 192.168.1.1:45251: connect: connection refused"
2025-08-16T04:17:36.654+0200	DEBUG	nat	nat/nat.go:281	Attempting port map: udp/4001
2025-08-16T04:17:36.678+0200	WARN	nat	nat/nat.go:304	NAT port mapping failed: protocol=udp internal_port=4001 error="goupnp: error performing SOAP HTTP request: Post \"http://192.168.1.1:45251/ctl/IPConn\": dial tcp 192.168.1.1:45251: connect: connection refused"
2025-08-16T04:17:56.678+0200	DEBUG	nat	nat/nat.go:281	Attempting port map: tcp/4001
2025-08-16T04:17:56.694+0200	WARN	nat	nat/nat.go:304	NAT port mapping failed: protocol=tcp internal_port=4001 error="goupnp: error performing SOAP HTTP request: Post \"http://192.168.1.1:45251/ctl/IPConn\": dial tcp 192.168.1.1:45251: connect: connection refused"
2025-08-16T04:17:56.694+0200	DEBUG	nat	nat/nat.go:281	Attempting port map: udp/4001
2025-08-16T04:17:56.694+0200	INFO	nat	nat/nat.go:332	NAT rediscovery triggered due to repeated connection failures
2025-08-16T04:17:56.715+0200	WARN	nat	nat/nat.go:304	NAT port mapping failed: protocol=udp internal_port=4001 error="goupnp: error performing SOAP HTTP request: Post \"http://192.168.1.1:45251/ctl/IPConn\": dial tcp 192.168.1.1:45251: connect: connection refused"
2025-08-16T04:18:00.824+0200	DEBUG	nat	nat/nat.go:281	Attempting port map: tcp/4001
2025-08-16T04:18:00.930+0200	DEBUG	nat	nat/nat.go:298	NAT port mapping established: tcp 4001 -> 65008
2025-08-16T04:18:00.930+0200	INFO	nat	nat/nat.go:367	NAT mapping restored after rediscovery: tcp 4001 -> 65008
2025-08-16T04:18:00.930+0200	DEBUG	nat	nat/nat.go:281	Attempting port map: udp/4001
2025-08-16T04:18:01.223+0200	DEBUG	nat	nat/nat.go:298	NAT port mapping established: udp 4001 -> 65008
2025-08-16T04:18:01.223+0200	INFO	nat	nat/nat.go:367	NAT mapping restored after rediscovery: udp 4001 -> 65008
2025-08-16T04:18:01.223+0200	INFO	nat	nat/nat.go:372	NAT rediscovery successful
2025-08-16T04:18:16.716+0200	DEBUG	nat	nat/nat.go:281	Attempting port map: tcp/4001
2025-08-16T04:18:16.722+0200	DEBUG	nat	nat/nat.go:298	NAT port mapping established: tcp 4001 -> 65008
2025-08-16T04:18:16.722+0200	DEBUG	nat	nat/nat.go:281	Attempting port map: udp/4001
2025-08-16T04:18:16.726+0200	DEBUG	nat	nat/nat.go:298	NAT port mapping established: udp 4001 -> 65008
2025-08-16T04:18:36.728+0200	DEBUG	nat	nat/nat.go:281	Attempting port map: udp/4001
2025-08-16T04:18:36.734+0200	DEBUG	nat	nat/nat.go:298	NAT port mapping established: udp 4001 -> 65008
2025-08-16T04:18:36.734+0200	DEBUG	nat	nat/nat.go:281	Attempting port map: tcp/4001
2025-08-16T04:18:36.742+0200	DEBUG	nat	nat/nat.go:298	NAT port mapping established: tcp 4001 -> 65008
^C
Received interrupt signal, shutting down...
(Hit ctrl-c again to force-shutdown the daemon.)
2025-08-16T04:18:45.085+0200	INFO	nat	nat/nat.go:196	Stopping maintenance of port mapping: tcp/4001
2025-08-16T04:18:45.097+0200	INFO	nat	nat/nat.go:196	Stopping maintenance of port mapping: udp/4001
ipfs daemon  14.25s user 3.83s system 6% cpu 4:43.00 total

Key moments in the log:

  • 04:14:06: Initial NAT discovery succeeds, only 2 mapping attempts (deduplication working)
  • 04:15:46: Router restarts, UPnP service begins failing
  • 04:17:36: Connection refused errors indicate UPnP port changed from :45251
  • 04:17:56: After 3rd failure, automatic rediscovery triggered
  • 04:18:00: NAT rediscovered, mappings restored with new external port 65008
  • 04:18:45: Clean shutdown removes all mappings

Without this fix, the connection refused errors would continue indefinitely, requiring manual daemon restart.

@lidel
Copy link
Member Author

lidel commented Sep 2, 2025

@MarcoPolo @sukunrt I've resolved the conflicts (d4ac0de) caused by the grand log refactor that landed in master.

Would it be possible to review this bugfix and ship in the next go-libp2p? 🙏

@MarcoPolo
Copy link
Collaborator

MarcoPolo commented Sep 2, 2025 via email

Copy link
Collaborator

@MarcoPolo MarcoPolo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor nits. Looks good. Thanks for this and for testing it!

@MarcoPolo MarcoPolo force-pushed the fix-heal-nat-mappings branch 2 times, most recently from 5f61fa8 to d59d597 Compare September 3, 2025 21:50
@MarcoPolo
Copy link
Collaborator

I’ll deal with the conflicts and merge tomorrow

Router restarts can cause UPnP/NAT-PMP services to change their
listening ports, leading to connection refused errors. This fix
implements automatic NAT rediscovery when consecutive connection
failures are detected, restoring all existing port mappings on
the new NAT instance.

See #3224 (comment)
for details on the router behavior that motivated this fix.
@MarcoPolo MarcoPolo force-pushed the fix-heal-nat-mappings branch from d59d597 to c916a49 Compare September 4, 2025 17:29
lidel and others added 2 commits September 4, 2025 10:31
Multiple libp2p transports can share the same port (TCP, QUIC,
WebTransport, WebRTC-direct), causing duplicate AddMapping calls
for the same protocol/port combination. This fix adds deduplication
in NAT.AddMapping to prevent redundant NAT device operations and
reduce log spam.
@MarcoPolo MarcoPolo force-pushed the fix-heal-nat-mappings branch from c916a49 to ff8b509 Compare September 4, 2025 17:31
@MarcoPolo MarcoPolo merged commit 3ecae66 into master Sep 4, 2025
11 checks passed
@lidel lidel mentioned this pull request Oct 1, 2025
65 tasks
@lthibault
Copy link
Contributor

lthibault commented Oct 21, 2025

Is self-healing NAT a planned feature for Rust?

Happy to begin working on a port if there's interest.

@mishmosh
Copy link

@jxs might be able to comment on how self-healing NAT fits into rust-libp2p plans.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

nat: stuck in error loop after router restart Upnp port is not re-announced

5 participants