Skip to content

Fix nRF52 AsyncUDP multicast TX/RX race on W5100S#9765

Open
PhilipLykov wants to merge 3 commits intomeshtastic:developfrom
PhilipLykov:fix/nrf52-multicast-tx-rx-race
Open

Fix nRF52 AsyncUDP multicast TX/RX race on W5100S#9765
PhilipLykov wants to merge 3 commits intomeshtastic:developfrom
PhilipLykov:fix/nrf52-multicast-tx-rx-race

Conversation

@PhilipLykov
Copy link

@PhilipLykov PhilipLykov commented Feb 26, 2026

Summary

Fix a TX/RX race condition in the nRF52 AsyncUDP implementation that causes garbled multicast packets on RAK4631 Ethernet gateway boards (W5100S).

Root cause: The W5100S socketSendUDP() function calls yield() while polling for SEND_OK. On nRF52 (cooperative multitasking), this allows AsyncUDP::runOnce() to execute and call parsePacket() on the same UDP socket while a send is still in progress. When the multicast loopback packet arrives from the network switch during this window, parsePacket() reads a garbled 8-byte RX header from the W5100S, producing an incorrect source IP (e.g. 3.192.168.30 instead of the real sender) and wrong payload length. This corrupted packet then fails protobuf decoding in UdpMulticastHandler.

Fix: Add a volatile bool isSending flag to AsyncUDP. writeTo() sets this flag around the entire send operation (beginPacket / write / endPacket). runOnce() checks !isSending before calling parsePacket(), skipping the poll when a send is active. Loopback packets remain safely buffered in the W5100S hardware RX buffer and are read cleanly on the next 5ms poll cycle.

Changes

  • src/platform/nrf52/AsyncUDP.h — add volatile bool isSending member
  • src/platform/nrf52/AsyncUDP.cpp — guard writeTo() with the flag; skip parsePacket() in runOnce() when sending

Test plan

  • Build rak4631_eth_gw target — SUCCESS
  • Flash on RAK4631 + RAK13800 (W5100S) Ethernet gateway with PoE
  • Verify multicast mesh traffic has no garbled source IPs in logs
  • Verify no protobuf decode errors for incoming multicast packets
  • Confirm API connectivity and radio operation remain stable

@github-actions github-actions bot added the bugfix Pull request that fixes bugs label Feb 26, 2026
@PhilipLykov
Copy link
Author

That's how it looks in the logs:
DEBUG | 22:09:17 10 [AsyncUDP] UDP broadcast from: 0.0.0.0, len=512
DEBUG | 22:09:17 10 [AsyncUDP] Decoding MeshPacket from UDP len=512
ERROR | 22:09:17 10 [AsyncUDP] Can't decode protobuf reason='zero tag', pb_msgdesc 0xaa9bc
DEBUG | 22:09:17 11 [AsyncUDP] UDP broadcast from: 3.192.168.30, len=512
DEBUG | 22:09:17 11 [AsyncUDP] Decoding MeshPacket from UDP len=512
ERROR | 22:09:17 11 [AsyncUDP] Can't decode protobuf reason='wrong wire type', pb_msgdesc 0xaa9bc

return udp.endPacket();
bool ok = udp.endPacket();
isSending = false;
return ok;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UdpMulticastHandler::onSend() ignores return value of udp.writeTo() and always return true. the return value is not documented there, should it care?

int32_t AsyncUDP::runOnce()
{
if (_onPacket && udp.parsePacket() > 0) {
if (_onPacket && !isSending && udp.parsePacket() > 0) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new guard skips parsePacket() for the entire writeTo() window. If endPacket() blocks/yields for longer intervals (as described in the PR rationale), inbound packets can accumulate and be dropped by the W5100S RX buffer during bursts. This fixes corruption but may trade it for packet loss under load.

…kets

socketSendUDP() calls yield() while waiting for SEND_OK, allowing the
cooperative scheduler to run AsyncUDP::runOnce(). When runOnce() calls
parsePacket() on the same socket during an active send, the multicast
loopback packet arriving from the switch produces a garbled 8-byte RX
header (wrong source IP, wrong payload length), leading to protobuf
decode errors in UdpMulticastHandler.

Add a volatile isSending flag that writeTo() sets around endPacket()
(the only yield point). runOnce() checks this flag and skips
parsePacket() while a send is in progress. Loopback packets stay
buffered in the W5100S RX buffer and are read cleanly on the next
poll cycle.

Also propagate writeTo() return value in UdpMulticastHandler::onSend()
instead of unconditionally returning true.

Made-with: Cursor
@PhilipLykov PhilipLykov force-pushed the fix/nrf52-multicast-tx-rx-race branch from 27b27af to 522118c Compare February 27, 2026 07:39
@PhilipLykov
Copy link
Author

Hi @robekl, thank you for the review! Both points are valid and I've pushed an updated commit to address them:

1. onSend() return value (your first comment)

You're right — onSend() was silently discarding the writeTo() result. Fixed: it now returns the result of udp.writeTo() directly. The caller in Router.cpp currently discards it, but the API is now honest for any future consumer.

2. Narrowing the isSending window (your second comment)

Good catch on the blocked duration. I've narrowed the isSending flag to only wrap endPacket(), which is the sole yield point (socketSendUDP()yield()). beginPacket() just writes destination registers and write() copies into the TX buffer — both are fast non-yielding SPI operations. So now runOnce() only skips parsePacket() for the minimal window where the race can actually occur, rather than for the entire writeTo() sequence.

The W5100S has a 2KB RX buffer per socket, which holds ~8 typical Meshtastic packets, so even during the brief endPacket() window there's no realistic risk of overflow. And packets arriving during that window were already being corrupted before this fix — so the net effect is strictly positive.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bugfix Pull request that fixes bugs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants