Description
Is your feature request related to a problem? Please describe.
Currently, integrating resource-constrained microcontrollers with ros2_control relies heavily on micro-ROS and an XRCE-DDS agent running on the host. While effective, this architecture introduces several friction points for high-frequency (500 Hz and above) control loops:
- Agent Bottleneck: The XRCE-DDS agent acts as a single point of failure and a scaling bottleneck when managing multiple hardware nodes.
- Serialization Overhead: CDR serialization on the MCU adds non-trivial latency per message.
- Zephyr Support: While micro-ROS has strong FreeRTOS support, its native integration with Zephyr RTOS and its built-in networking stack is less mature.
Describe the solution you'd like
I am proposing an agent-less, data-centric hardware interface built on Zenoh (zenoh-pico on the MCU and zenoh-cpp on the ROS 2 host). This removes the need for an intermediate agent and allows direct P2P communication between the embedded hardware and the ros2_control framework.
To ensure strict real-time compliance and maximize network throughput, the proposed architecture incorporates two critical design choices:
- Batched Key-Value Mapping: Instead of publishing individual keys per joint (which introduces massive network header overhead), the MCU will pack a C-struct of all joint states into a single Zenoh key (e.g., robot/<id>/state). This ensures data coherency per control cycle and minimizes the wire payload to raw IEEE 754 bytes + a single Zenoh header.
- Real-Time Safe Execution: The SystemInterface plugin's read() and write() methods cannot block or allocate memory. The plugin will instantiate a background thread to handle Zenoh network I/O. This thread will push/pull data from a lock-free, atomic double-buffer, allowing the ControllerManager's RT loop to grab the latest state in $O(1)$ time without waiting on sockets.
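The lock-free handoff between the background thread and the RT loop could be sketched roughly as below. This is only an illustration under my own assumptions, not code from ros2_control or zenoh-cpp: the class name LatestValueBuffer and its methods publish() and try_read() are hypothetical. It also uses three slots instead of two (a triple buffer), because with only two slots a writer that publishes twice during one read can end up overwriting the slot the reader is copying from. It assumes exactly one writer thread and one reader thread.

```cpp
#include <array>
#include <atomic>
#include <cstdint>
#include <type_traits>

// Hypothetical single-writer / single-reader "latest value" exchange.
// Neither side blocks or allocates; each call does one copy of T and
// at most one atomic exchange, independent of how often the other side runs.
template <typename T>
class LatestValueBuffer {
  static_assert(std::is_trivially_copyable<T>::value,
                "T should be a plain struct that can be copied bytewise");

 public:
  // Writer side (background Zenoh I/O thread).
  void publish(const T& value) {
    slots_[write_idx_] = value;
    // Hand the filled slot over and take back whatever was in the middle.
    const std::uint8_t prev = middle_.exchange(
        static_cast<std::uint8_t>(write_idx_ | kFreshBit),
        std::memory_order_acq_rel);
    write_idx_ = static_cast<std::uint8_t>(prev & kIndexMask);
  }

  // Reader side (RT thread). Copies the most recent value into `out`.
  // Returns true if the value is newer than the one seen by the last call.
  bool try_read(T& out) {
    const bool fresh =
        (middle_.load(std::memory_order_acquire) & kFreshBit) != 0;
    if (fresh) {
      // Swap our old slot for the fresh one; the fresh flag is cleared.
      const std::uint8_t prev =
          middle_.exchange(read_idx_, std::memory_order_acq_rel);
      read_idx_ = static_cast<std::uint8_t>(prev & kIndexMask);
    }
    out = slots_[read_idx_];  // this slot is owned by the reader only
    return fresh;
  }

 private:
  static constexpr std::uint8_t kFreshBit = 0x4;
  static constexpr std::uint8_t kIndexMask = 0x3;

  std::array<T, 3> slots_{};
  // Index of the slot currently "in transit", plus the fresh flag.
  // Assumed to be lock-free on the target platform.
  std::atomic<std::uint8_t> middle_{1};
  std::uint8_t write_idx_ = 0;  // touched by the writer thread only
  std::uint8_t read_idx_ = 2;   // touched by the reader thread only
};
```

In the plugin, the Zenoh subscriber callback would call publish() with the decoded state, and read() would call try_read() once per cycle; a second instance in the opposite direction would carry commands from write() to the publisher thread.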
Describe alternatives you've considered
- micro-ROS (XRCE-DDS): The current standard. I have used this extensively (e.g., ESP32 bridged to an STM32), but the agent overhead and memory footprint make it suboptimal for ultra-low latency requirements compared to Zenoh's ~5-byte wire overhead.
- Raw UDP/TCP Sockets: Completely bypasses middleware. While fast, it loses the scalability, dynamic discovery, and pub/sub routing flexibility that the Zenoh/ROS ecosystem provides.
Additional context
This proposal is aligned with the OSRF GSoC project: https://github.com/osrf/osrf_wiki/wiki/GSoC-2026#zephyr-zenoh-integration-for-ros2_control
Proposed Architecture Flow
┌──────────────────────────────────────────────────────────────────────────┐
│ ROS 2 HOST (Linux) │
│ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ ros2_control Controller Manager │ │
│ │ │ │
│ │ ┌───────────────────────────────────────────────┐ │ │
│ │ │ ZenohHardwareInterface (SystemInterface) │ │ │
│ │ │ │ │ │
│ │ │ [RT Thread] [Background Thread] │ │ │
│ │ │ read(): atomic load <─> Zenoh subscriber │ │ │
│ │ │ write(): atomic push<─> Zenoh publisher │ │ │
│ │ └────────────────────┬──────────────────────────┘ │ │
│ └───────────────────────┼─────────────────────────────┘ │
│ Zenoh Session (zenoh-cpp) │
└──────────────────────────┼───────────────────────────────────────────────┘
│ TCP/UDP (P2P or via Router)
│ Key: robot/<id>/state (Batched struct)
┌──────────────────────────┼───────────────────────────────────────────────┐
│ MICROCONTROLLER (Zephyr RTOS) │
│ │ │
│ Zenoh Session (zenoh-pico) │
│ │ │
│ ┌───────────────────────┼───────────────────────────┐ │
│ │ pub: packed struct of all encoder states │ │
│ │ sub: packed struct of all motor commands │ │
│ └──────┬────────────────────────────────┬─────────────┘ │
│ ┌──────▼──────────┐ ┌────────▼─────────┐ │
│ │ Sensor HAL │ │ Actuator HAL │ │
│ └─────────────────┘ └──────────────────┘ │
└──────────────────────────────────────────────────────────────────────────┘
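To make the "packed struct" in the diagram concrete, here is a minimal sketch of what the state payload might look like. Everything in it is an assumption for discussion: the joint count (6), the float32 position/velocity fields, the sequence counter, and the names JointStateFrame, encode() and decode() are all hypothetical. It also assumes both ends are little-endian, so the bytes can be copied without conversion.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical joint count; in practice this would come from the URDF.
constexpr std::size_t kNumJoints = 6;

// One control cycle's worth of state, sent as a single Zenoh payload.
struct JointStateFrame {
  std::uint32_t sequence;      // incremented by the MCU every cycle
  float position[kNumJoints];  // IEEE 754 binary32
  float velocity[kNumJoints];  // IEEE 754 binary32
};

// Guard the wire layout: bytewise copyable and no compiler padding.
static_assert(std::is_trivially_copyable<JointStateFrame>::value,
              "frame must be copyable with memcpy");
static_assert(sizeof(JointStateFrame) == 4 + 2 * 4 * kNumJoints,
              "unexpected padding in JointStateFrame");

using FrameBytes = std::array<std::uint8_t, sizeof(JointStateFrame)>;

// Serialize by raw copy (assumes sender and receiver share endianness).
inline FrameBytes encode(const JointStateFrame& frame) {
  FrameBytes bytes{};
  std::memcpy(bytes.data(), &frame, sizeof(frame));
  return bytes;
}

// Deserialize a received payload; rejects payloads of the wrong size.
inline bool decode(const std::uint8_t* data, std::size_t len,
                   JointStateFrame& out) {
  if (len != sizeof(JointStateFrame)) {
    return false;
  }
  std::memcpy(&out, data, len);
  return true;
}
```

With these assumptions one frame is 52 bytes, and the length check in decode() gives the host a cheap way to detect a joint-count mismatch between the firmware and the URDF.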
Next Steps: Proof of Concept Prototype
To validate this approach before the GSoC coding period, I plan to build a proof-of-concept prototype with the following milestones:
- Local Simulation: A Linux C++ application simulating the ros2_control RT loop, communicating via zenoh-cpp to a Zephyr instance running in QEMU with zenoh-pico.
- Hardware Benchmark: Deploying the Zephyr node to physical hardware (ESP32/STM32) to benchmark end-to-end latency, jitter, and memory footprint compared to a baseline micro-ROS setup.
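For the benchmark, the latency samples could be reduced to a few summary numbers along these lines. This is a sketch only: LatencyStats and summarize() are hypothetical names, samples are assumed to be per-message latencies in microseconds, and "jitter" is taken here to mean the population standard deviation of those samples, which is one of several possible definitions.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical summary of one benchmark run.
struct LatencyStats {
  double mean_us;    // average latency
  double jitter_us;  // population standard deviation of the latency
  double max_us;     // worst case observed
};

// Reduce raw latency samples (microseconds) to summary statistics.
// Returns all zeros for an empty input.
inline LatencyStats summarize(const std::vector<double>& samples_us) {
  LatencyStats stats{0.0, 0.0, 0.0};
  if (samples_us.empty()) {
    return stats;
  }
  const double n = static_cast<double>(samples_us.size());

  double sum = 0.0;
  for (double v : samples_us) {
    sum += v;
  }
  stats.mean_us = sum / n;

  double squared_dev = 0.0;
  for (double v : samples_us) {
    squared_dev += (v - stats.mean_us) * (v - stats.mean_us);
  }
  stats.jitter_us = std::sqrt(squared_dev / n);

  stats.max_us = *std::max_element(samples_us.begin(), samples_us.end());
  return stats;
}
```

Running the same reduction over both the Zenoh and the micro-ROS baseline samples would keep the comparison like-for-like.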
I would appreciate any feedback from the maintainers on this architectural direction, specifically on how much of the Zenoh key layout should be generated dynamically from the URDF. Happy to iterate on this!