-
Notifications
You must be signed in to change notification settings - Fork 20
Control Plane Refactor and Pipeline Mode Integration #265
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Reviewer's GuideThis PR refactors the control plane to run in DPDK’s pipeline model by splitting base graphs into dedicated dataplane and control-plane graphs, converting all callback-based exception logic into explicit graph nodes with GR_NODE_FLAG_CONTROL_PLANE, and updating worker/graph/statistics infrastructure to support separate control threads and metrics. Sequence diagram for exception handling: dataplane to control-plane nodesequenceDiagram
participant Dataplane as Dataplane Graph
participant ControlPlane as Control Plane Graph (Node)
participant Worker as Worker Thread
Dataplane->>Worker: Exception occurs (e.g., unreachable, ARP/NDP/ICMP)
Worker->>ControlPlane: Enqueue packet to control-plane node (e.g., nh4_unreachable, arp_probe)
ControlPlane->>ControlPlane: Process exception logic in dedicated node
ControlPlane-->>Worker: (Optional) Return result or update state
Note over ControlPlane: Control-plane node runs on dedicated control thread
ER diagram for mbuf private data extensions for ICMP/ICMP6erDiagram
ip_local_mbuf_data ||--o{ icmp_mbuf_data : extends
ip6_local_mbuf_data ||--o{ icmp6_mbuf_data : extends
icmp_mbuf_data {
clock_t timestamp
}
icmp6_mbuf_data {
clock_t timestamp
}
Class diagram for new and refactored control-plane graph nodesclassDiagram
class rte_node_register {
+name: string
+process: function
+flags: int
+nb_edges: int
+next_nodes: string[]
}
class gr_node_info {
+node: rte_node_register*
+register_callback: function
+trace_format: function
}
class nh4_unreachable_node {
+process: function
+flags = GR_NODE_FLAG_CONTROL_PLANE
+name = "nh4_unreachable"
}
class nh6_unreachable_node {
+process: function
+flags = GR_NODE_FLAG_CONTROL_PLANE
+name = "nh6_unreachable"
}
class arp_probe_node {
+process: function
+flags = GR_NODE_FLAG_CONTROL_PLANE
+name = "arp_probe"
}
class ndp_probe_node {
+process: function
+flags = GR_NODE_FLAG_CONTROL_PLANE
+name = "ndp_probe"
}
class icmp_input_ctl_node {
+process: function
+flags = GR_NODE_FLAG_CONTROL_PLANE
+name = "icmp_input_ctl"
}
class icmp6_input_ctl_node {
+process: function
+flags = GR_NODE_FLAG_CONTROL_PLANE
+name = "icmp6_input_ctl"
}
class ndp_router_sollicit_node {
+process: function
+flags = GR_NODE_FLAG_CONTROL_PLANE
+name = "ndp_router_sollicit"
}
rte_node_register <|-- nh4_unreachable_node
rte_node_register <|-- nh6_unreachable_node
rte_node_register <|-- arp_probe_node
rte_node_register <|-- ndp_probe_node
rte_node_register <|-- icmp_input_ctl_node
rte_node_register <|-- icmp6_input_ctl_node
rte_node_register <|-- ndp_router_sollicit_node
gr_node_info o-- rte_node_register
Class diagram for worker and graph management infrastructureclassDiagram
class worker {
+thread: pthread_t
+rxqs: queue_map*
+txqs: queue_map*
+base[2]: rte_graph*
+graph[2]: rte_graph*
+ctl_graph[2]: rte_graph*
+stats_ctl: atomic worker_stats*
}
class rte_graph {
+name: string
+id: int
}
worker o-- rte_graph : base/graph/ctl_graph
worker o-- worker_stats : stats_ctl
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
WalkthroughThis change set removes the legacy callback-based control output infrastructure and refactors the control plane to use explicit graph node processing for all packet types. It introduces new control and loopback output nodes, updates graph management to support separate control and dataplane graphs per worker, and modifies packet handling, statistics, and interface logic to operate within this new graph-based framework. Numerous obsolete files and callback registration mechanisms are deleted, and all related build and header files are updated accordingly. Changes
Sequence Diagram(s)sequenceDiagram
participant Worker
participant ControlLoop
participant ControlGraph
participant DataplaneGraph
participant LoopbackOutput
participant Stats
ControlLoop->>Worker: For each worker, get current control graph
alt Worker not shutting down and graph valid
ControlLoop->>ControlGraph: Walk control graph
ControlGraph->>LoopbackOutput: Process packets (if loopback)
LoopbackOutput->>Interface: Write packet to TUN device
end
Worker->>DataplaneGraph: Walk dataplane graph (separately)
DataplaneGraph->>ControlGraph: Forward control packets via edges
ControlGraph->>Stats: Aggregate stats for control plane nodes
DataplaneGraph->>Stats: Aggregate stats for dataplane nodes
sequenceDiagram
participant ICMPInput
participant ICMPInputCtl
participant NextNode
ICMPInput->>ICMPInputCtl: For non-echo ICMP, set timestamp, forward via edge
ICMPInputCtl->>NextNode: Enqueue packet for further processing
Estimated code review effort5 (~180 minutes) Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (46)
🧬 Code Graph Analysis (2)modules/infra/control/loop_output.c (3)
modules/ip/control/nexthop.c (7)
💤 Files with no reviewable changes (12)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (26)
🧰 Additional context used🧬 Code Graph Analysis (2)modules/infra/control/loop_output.c (3)
modules/ip/control/nexthop.c (7)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
🔇 Additional comments (20)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
static void control_loop_init(struct event_base *ev_base) { | ||
struct timeval time; | ||
time.tv_sec = 0; | ||
time.tv_usec = 10000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we don't have any notification from the graph framework when a packet is sent from 1 graph to another, poll this other graph every ~10000us.
Is it (not) enough ?
7fe5ef7
to
4a5c0d6
Compare
630cd22
to
2a42049
Compare
continue; | ||
for (unsigned i = 0; i < w_stats->n_stats; i++) { | ||
const struct node_stats *n = &w_stats->stats[i]; | ||
const char *name = rte_node_id_to_name(n->node_id); | ||
const struct rte_node_register *nr = gr_node_info_get(n->node_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Return value of gr_node_info_get is not checked for NULL before dereferencing ->node
; this may crash if lookup fails.
continue; | ||
for (unsigned i = 0; i < w_stats->n_stats; i++) { | ||
const struct node_stats *n = &w_stats->stats[i]; | ||
const char *name = rte_node_id_to_name(n->node_id); | ||
const struct rte_node_register *nr = gr_node_info_get(n->node_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Return value of gr_node_info_get is not checked before dereferencing ->node; this may lead to a NULL dereference if lookup fails.
const struct rte_node_register *nr = gr_node_info_get(n->node_id) | ||
->node; | ||
char name[RTE_NODE_NAMESIZE]; | ||
snprintf( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
snprintf return value is not checked; truncated node names could be silently dropped—consider handling truncation.
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @christophefontaine - I've reviewed your changes - here's some feedback:
- Ensure the error path in worker_graph_new cleans up all partially created graphs (base, dataplane, control) and associated node data to avoid leaks or stale state.
- Replace the brittle string matching (e.g. strstr(graph->name, "-dplane")) in rx/tx init with explicit graph metadata or flags so dataplane vs control‐plane graphs are distinguished reliably.
- Consolidate the stats aggregation for dataplane and control‐plane graphs into a common helper to eliminate duplication and simplify future changes.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Ensure the error path in worker_graph_new cleans up all partially created graphs (base, dataplane, control) and associated node data to avoid leaks or stale state.
- Replace the brittle string matching (e.g. strstr(graph->name, "-dplane")) in rx/tx init with explicit graph metadata or flags so dataplane vs control‐plane graphs are distinguished reliably.
- Consolidate the stats aggregation for dataplane and control‐plane graphs into a common helper to eliminate duplication and simplify future changes.
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 12
♻️ Duplicate comments (3)
modules/infra/api/stats.c (2)
75-75
: Replace memccpy with more standard string copy functions.Using
memccpy
with '\0' as the stop character is non-portable and non-intuitive for string copying.Replace all occurrences with
strlcpy
or explicitstrncpy
with null termination:- memccpy(stat.name, name, 0, sizeof(stat.name)); + strlcpy(stat.name, name, sizeof(stat.name));Also applies to: 107-107, 122-122, 216-216, 354-354, 369-369
53-54
: Add NULL check for gr_node_info_get.The return value of
gr_node_info_get
should be checked before dereferencing to avoid potential crashes.Apply this pattern to both occurrences:
- const struct rte_node_register *nr = gr_node_info_get(n->node_id) - ->node; + const struct gr_node_info *info = gr_node_info_get(n->node_id); + if (info == NULL) + continue; + const struct rte_node_register *nr = info->node;Also applies to: 83-86
modules/infra/control/main_loop.c (1)
27-34
: 10ms interval is reasonable for control plane processing.The 10ms timer interval provides a good balance between control plane responsiveness and CPU overhead. This should be sufficient for typical control plane protocols like ARP, NDP, and ICMP which don't require sub-millisecond latency.
🧹 Nitpick comments (3)
modules/infra/api/stats.c (1)
48-110
: Consider refactoring duplicated stats aggregation logic.The code for aggregating stats from both
worker->stats
andworker->stats_ctl
is nearly identical. This duplication could be eliminated.Consider extracting the stats aggregation into a helper function:
static void aggregate_worker_stats( struct stat **stats, const struct worker_stats *w_stats, bool is_control_plane ) { if (w_stats == NULL) return; for (unsigned i = 0; i < w_stats->n_stats; i++) { const struct node_stats *n = &w_stats->stats[i]; const struct gr_node_info *info = gr_node_info_get(n->node_id); if (info == NULL) continue; const struct rte_node_register *nr = info->node; char name[RTE_NODE_NAMESIZE]; snprintf( name, RTE_NODE_NAMESIZE, "%s%c", rte_node_id_to_name(n->node_id), nr->flags & GR_NODE_FLAG_CONTROL_PLANE ? '*' : '\0' ); struct stat *s = find_stat(*stats, name); if (s != NULL) { s->objs += n->objs; s->calls += n->calls; s->cycles += n->cycles; } else { struct stat stat = { .objs = n->objs, .calls = n->calls, .cycles = n->cycles, }; strlcpy(stat.name, name, sizeof(stat.name)); gr_vec_add(*stats, stat); } } }Then use it as:
aggregate_worker_stats(&stats, w_stats, false); aggregate_worker_stats(&stats, worker->stats_ctl, true);modules/ip6/datapath/icmp6_input.c (1)
102-103
: Make the duplicate check more explicit.For consistency with the IPv4 implementation and better clarity, consider using an explicit comparison.
- if (edges[icmp6_type]) + if (edges[icmp6_type] != UNSUPPORTED)modules/infra/control/loop_output.c (1)
54-54
: Fix grammatical error in comment-// Do not retry even in case of if EAGAIN || EWOULDBLOCK +// Do not retry even in case of EAGAIN || EWOULDBLOCK
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
docs/graph.svg
is excluded by!**/*.svg
📒 Files selected for processing (46)
modules/infra/api/affinity.c
(0 hunks)modules/infra/api/stats.c
(2 hunks)modules/infra/control/control_output.c
(0 hunks)modules/infra/control/gr_graph.h
(2 hunks)modules/infra/control/gr_loopback.h
(0 hunks)modules/infra/control/gr_worker.h
(1 hunks)modules/infra/control/graph.c
(7 hunks)modules/infra/control/loop_output.c
(1 hunks)modules/infra/control/loopback.c
(1 hunks)modules/infra/control/loopback.h
(1 hunks)modules/infra/control/main_loop.c
(1 hunks)modules/infra/control/meson.build
(1 hunks)modules/infra/control/worker.c
(1 hunks)modules/infra/datapath/control_input.c
(1 hunks)modules/infra/datapath/control_output.c
(0 hunks)modules/infra/datapath/drop.c
(1 hunks)modules/infra/datapath/gr_control_output.h
(0 hunks)modules/infra/datapath/gr_mbuf.h
(1 hunks)modules/infra/datapath/loop_output.c
(0 hunks)modules/infra/datapath/main_loop.c
(5 hunks)modules/infra/datapath/meson.build
(0 hunks)modules/infra/datapath/rx.c
(1 hunks)modules/infra/datapath/tx.c
(1 hunks)modules/ip/control/gr_ip4_control.h
(0 hunks)modules/ip/control/icmp.c
(5 hunks)modules/ip/control/nexthop.c
(1 hunks)modules/ip/datapath/arp_input_reply.c
(3 hunks)modules/ip/datapath/arp_input_request.c
(3 hunks)modules/ip/datapath/gr_ip4_datapath.h
(2 hunks)modules/ip/datapath/icmp_input.c
(3 hunks)modules/ip/datapath/icmp_local_send.c
(0 hunks)modules/ip/datapath/ip_hold.c
(2 hunks)modules/ip/datapath/ip_output.c
(0 hunks)modules/ip6/control/gr_ip6_control.h
(0 hunks)modules/ip6/control/icmp6.c
(4 hunks)modules/ip6/control/nexthop.c
(1 hunks)modules/ip6/control/router_advert.c
(1 hunks)modules/ip6/datapath/gr_ip6_datapath.h
(2 hunks)modules/ip6/datapath/icmp6_input.c
(3 hunks)modules/ip6/datapath/icmp6_local_send.c
(0 hunks)modules/ip6/datapath/ip6_hold.c
(2 hunks)modules/ip6/datapath/ndp_na_input.c
(3 hunks)modules/ip6/datapath/ndp_ns_input.c
(3 hunks)modules/ip6/datapath/ndp_rs_input.c
(3 hunks)subprojects/dpdk.wrap
(1 hunks)subprojects/packagefiles/graph-allow-non-EAL-Thread-for-pipeline-mode.patch
(1 hunks)
🧬 Code Graph Analysis (5)
modules/ip6/datapath/gr_ip6_datapath.h (1)
modules/ip6/datapath/icmp6_input.c (1)
icmp6_input_register_type
(98-106)
modules/ip/datapath/gr_ip4_datapath.h (1)
modules/ip/datapath/icmp_input.c (1)
icmp_input_register_type
(76-84)
modules/ip6/control/icmp6.c (2)
modules/ip6/datapath/icmp6_input.c (2)
icmp6_input_register
(108-110)icmp6_input_register_type
(98-106)modules/infra/datapath/trace.c (1)
trace_icmp6_format
(197-307)
modules/infra/control/loop_output.c (3)
modules/infra/control/iface.c (1)
iface_destroy
(246-277)modules/ip/datapath/ip_output.c (1)
ip_output_register_interface_type
(33-40)modules/ip6/datapath/ip6_output.c (1)
ip6_output_register_interface_type
(30-37)
modules/ip/control/nexthop.c (8)
modules/infra/datapath/gr_mbuf.h (1)
gr_mbuf_is_traced
(67-69)modules/infra/datapath/trace.c (1)
gr_mbuf_trace_add
(440-467)modules/ip/control/route.c (2)
rib4_lookup
(93-107)rib4_insert
(125-174)modules/ip/control/gr_ip4_control.h (1)
nh4_lookup
(18-21)modules/infra/control/nexthop.c (1)
nexthop_new
(236-276)modules/ip/datapath/arp_output_request.c (1)
arp_output_request_solicit
(31-44)modules/infra/control/gr_graph.h (1)
rte_node_enqueue_x1
(24-27)api/gr_clock.h (1)
gr_clock_us
(17-20)
💤 Files with no reviewable changes (12)
- modules/ip6/datapath/icmp6_local_send.c
- modules/ip/datapath/icmp_local_send.c
- modules/infra/datapath/meson.build
- modules/ip/datapath/ip_output.c
- modules/infra/control/gr_loopback.h
- modules/ip/control/gr_ip4_control.h
- modules/infra/api/affinity.c
- modules/ip6/control/gr_ip6_control.h
- modules/infra/control/control_output.c
- modules/infra/datapath/gr_control_output.h
- modules/infra/datapath/loop_output.c
- modules/infra/datapath/control_output.c
🧰 Additional context used
🧬 Code Graph Analysis (5)
modules/ip6/datapath/gr_ip6_datapath.h (1)
modules/ip6/datapath/icmp6_input.c (1)
icmp6_input_register_type
(98-106)
modules/ip/datapath/gr_ip4_datapath.h (1)
modules/ip/datapath/icmp_input.c (1)
icmp_input_register_type
(76-84)
modules/ip6/control/icmp6.c (2)
modules/ip6/datapath/icmp6_input.c (2)
icmp6_input_register
(108-110)icmp6_input_register_type
(98-106)modules/infra/datapath/trace.c (1)
trace_icmp6_format
(197-307)
modules/infra/control/loop_output.c (3)
modules/infra/control/iface.c (1)
iface_destroy
(246-277)modules/ip/datapath/ip_output.c (1)
ip_output_register_interface_type
(33-40)modules/ip6/datapath/ip6_output.c (1)
ip6_output_register_interface_type
(30-37)
modules/ip/control/nexthop.c (8)
modules/infra/datapath/gr_mbuf.h (1)
gr_mbuf_is_traced
(67-69)modules/infra/datapath/trace.c (1)
gr_mbuf_trace_add
(440-467)modules/ip/control/route.c (2)
rib4_lookup
(93-107)rib4_insert
(125-174)modules/ip/control/gr_ip4_control.h (1)
nh4_lookup
(18-21)modules/infra/control/nexthop.c (1)
nexthop_new
(236-276)modules/ip/datapath/arp_output_request.c (1)
arp_output_request_solicit
(31-44)modules/infra/control/gr_graph.h (1)
rte_node_enqueue_x1
(24-27)api/gr_clock.h (1)
gr_clock_us
(17-20)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: build-and-tests (gcc-14, address, debug, ubuntu-24.04)
- GitHub Check: build-and-tests (gcc-13, none, debugoptimized, ubuntu-24.04, -Dfrr=enabled)
- GitHub Check: build-and-tests (clang-18, none, debugoptimized, ubuntu-24.04)
- GitHub Check: build-and-tests (clang-16, none, debugoptimized, ubuntu-24.04)
- GitHub Check: build-and-tests (clang-15, none, debugoptimized, ubuntu-22.04)
🔇 Additional comments (42)
modules/infra/datapath/drop.c (1)
34-34
: LGTM: Control plane drop counter addition.The new
ctlplane_sink
drop counter follows the established pattern and correctly supports the new control plane graph nodes introduced in this refactor.subprojects/dpdk.wrap (1)
4-4
: Patch file verified and looks correctThe patch
subprojects/packagefiles/graph-allow-non-EAL-Thread-for-pipeline-mode.patch
exists and adds the expected checks (replacingrte_lcore_is_enabled
with aROLE_OFF
check), properly settingrte_errno
on failure. No further action is needed here.modules/infra/datapath/control_input.c (1)
130-130
: LGTM: Control plane flag addition is correct.The addition of
GR_NODE_FLAG_CONTROL_PLANE
properly identifies this node as part of the control plane, enabling correct core affinity assignment and statistics reporting in the new architecture.modules/infra/control/meson.build (1)
6-8
: New control files verified and implement expected control-plane logic
- modules/infra/control/loop_output.c exists and defines
loopback_output_process
, pulling in the appropriate graph, loopback, and tracing headers.- modules/infra/control/main_loop.c exists, registers
control_loop
to drive the control graph per worker via an event loop.Both files correctly replace the old callback-based output and integrate the periodic graph execution for the control plane. Changes look good—approving.
modules/infra/control/worker.c (1)
424-424
: Dependency change verified and approved
- modules/infra/control/graph.c registers the
graph_module
control_output.c
and all references tocontrol_output
have been removedNo further action required.
modules/infra/control/loopback.h (1)
6-9
: LGTM! Clean structure definition for loopback interface.The
iface_info_loopback
structure is well-defined and supports the transition to event-driven control plane processing. The separation of interface definition from implementation improves modularity.modules/infra/datapath/tx.c (1)
96-98
: Approve dataplane init guard – naming verifiedConfirmed that the “-dplane” suffix is used consistently for dataplane graph identification:
- modules/infra/control/graph.c: snprintf(..., “gr-%01d-%04x-dplane”, …)
- modules/infra/datapath/tx.c and rx.c:
if (strstr(graph->name, "-dplane") == NULL) return 0;
No further action required.
modules/infra/datapath/rx.c (1)
100-102
: LGTM! Consistent dataplane-specific initialization.The conditional check mirrors the approach in
tx.c
and correctly restricts RX context initialization to dataplane graphs only. This maintains consistency in the dataplane/control plane separation.modules/infra/control/gr_graph.h (2)
13-13
: Approve multi-core dispatch model selection.The
RTE_GRAPH_MODEL_MCORE_DISPATCH
selection is appropriate for the control/dataplane separation, enabling different cores to handle different graph types.
45-45
: Approve control-plane flag bit
DPDK’srte_node_register.flags
do not document any use of bit 31 (0x80000000), so defining#define GR_NODE_FLAG_CONTROL_PLANE (1 << 31)is safe in current releases. Please continue to monitor upstream DPDK for any future assignments to this bit.
modules/infra/control/gr_worker.h (1)
66-68
: Initialization and Cleanup of Multi-Graph Members VerifiedThe new
base[2]
andctl_graph[2]
pointers are correctly handled:
- In modules/infra/control/graph.c within worker_graph_new(), both arrays are set via rte_graph_lookup().
- In the same file’s cleanup loops, each non-NULL entry in
ctl_graph
andbase
is destroyed with rte_graph_destroy() and reset to NULL.No further changes are needed for graph initialization or teardown.
modules/ip6/datapath/gr_ip6_datapath.h (2)
34-34
: LGTM: Well-structured mbuf data extension.The new
icmp6_mbuf_data
type correctly extendsip6_local_mbuf_data
with a timestamp field, which aligns with the control plane refactor's need for timing information in ICMPv6 packet processing.
75-75
: LGTM: Function signature correctly updated for graph-based processing.The change from callback-based registration to next-node string registration is consistent with the overall refactor from callback-based control plane to graph node-based processing.
modules/ip/datapath/ip_hold.c (1)
11-34
: LGTM: Clean refactor from callback-based to direct graph processing.The changes correctly remove the control output dependency by:
- Renaming the enum to the more descriptive
NH4_UNREACH
- Direct enqueueing to the
"nh4_unreachable"
node instead of using callback mechanisms- Simplifying the packet processing flow
This aligns well with the overall architectural refactor to graph-based control plane processing.
modules/ip/datapath/gr_ip4_datapath.h (2)
29-31
: LGTM: Well-documented mbuf data extension.The new
icmp_mbuf_data
type correctly extendsip_local_mbuf_data
with a timestamp field. The inline comment explaining the timestamp's purpose is helpful for maintainability.
74-74
: LGTM: Function signature correctly updated for graph-based processing.The change from
icmp_input_register_callback
toicmp_input_register_type
aligns with the broader refactor from callback-based to graph node-based control plane processing, consistent with the IPv6 equivalent.modules/ip6/datapath/ip6_hold.c (1)
11-34
: LGTM: Consistent IPv6 refactor matching IPv4 implementation.The changes correctly mirror the IPv4
ip_hold.c
refactor by:
- Using the descriptive
NH6_UNREACH
enum name- Direct enqueueing to
"nh6_unreachable"
instead of callback-based control output- Maintaining consistency with the IPv4 implementation pattern
This provides a clean, uniform approach across both IP versions for the graph-based control plane processing.
modules/infra/datapath/gr_mbuf.h (1)
43-56
: LGTM: Well-designed macro for mbuf private data inheritance.The
GR_MBUF_PRIV_DATA_EXTENDS
macro is well-implemented with:
- Proper struct composition embedding the base structure
- Consistent pattern matching
GR_MBUF_PRIV_DATA_TYPE
- Same safety checks including size assertion
- Clean inline accessor function
This macro effectively enables inheritance-like behavior for mbuf private data structures, supporting the control plane refactor's need for extended packet metadata like the timestamp fields in ICMP processing.
modules/ip/datapath/arp_input_reply.c (1)
17-17
: Clean refactor to graph-based processing.The renaming from
CONTROL
toARP_PROBE
and direct enqueueing improves code clarity by making the destination node's purpose explicit. This aligns well with the PR's goal of replacing callback-based control output with dedicated graph nodes.Also applies to: 44-44, 60-60
modules/ip6/datapath/ndp_ns_input.c (1)
17-17
: Consistent refactoring pattern applied.The transformation from
CONTROL
toNDP_PROBE
follows the same clean pattern as other datapath modules, improving code clarity and maintainability.Also applies to: 88-88, 108-108
modules/infra/control/loopback.c (1)
4-5
: Appropriate header reorganization.The header changes correctly reflect the architectural refactor, with the
iface_info_loopback
struct now properly encapsulated in its own header file.modules/infra/datapath/main_loop.c (1)
64-84
: LGTM! Statistics collection properly extended for dual-graph architecture.The changes correctly extend the statistics infrastructure to collect metrics from both dataplane and control plane graphs. The implementation properly handles the graph names array and passes both graphs to the statistics subsystem.
Also applies to: 125-126, 170-170, 180-180
subprojects/packagefiles/graph-allow-non-EAL-Thread-for-pipeline-mode.patch (1)
28-46
: Appropriate change to support non-EAL control plane threads.The patch correctly modifies the thread validation logic to use role-based checking instead of EAL-only validation. This enables the control plane threads (which may not be EAL threads) to bind to graphs in pipeline mode. The addition of explicit
rte_errno
values improves error reporting.modules/infra/control/main_loop.c (1)
15-25
: Control loop implementation looks good.The function properly iterates over workers and walks their control graphs with appropriate safety checks for shutdown state and null graphs. The use of atomic operations ensures thread-safe access to worker state.
modules/ip6/datapath/ndp_na_input.c (1)
17-17
: Clean refactor to graph-based control plane processing.The removal of control output callbacks and direct use of the
ndp_probe
graph node simplifies the code and aligns with the new graph-based control plane architecture. The more descriptive edge nameNDP_PROBE
improves code readability.Also applies to: 95-95, 112-112
modules/ip6/control/router_advert.c (1)
123-162
: Well-implemented transformation to graph node processing.The conversion from callback-based to graph node batch processing is properly implemented:
- Correct iteration over packet batches
- Proper tracing support maintained
- Event activation preserved for router advertisements
- Control plane flag ensures execution on dedicated control threads
- Appropriate use of
ctlplane_sink
for packet terminationmodules/ip/datapath/icmp_input.c (4)
18-23
: Good enum reordering for safer defaults.Moving
UNSUPPORTED
to index 0 ensures that uninitialized elements in the edges array will default to a safe value.
27-27
: Proper edges array initialization.The array correctly handles all possible ICMP type values with safe default initialization.
76-84
: Well-designed registration function with proper validation.The function correctly prevents invalid registrations and integrates cleanly with the graph framework.
60-64
: Memory allocation foricmp_mbuf_data
verifiedI’ve confirmed that
icmp_mbuf_data
is defined via theGR_MBUF_PRIV_DATA_EXTENDS
macro inmodules/ip/datapath/gr_ip4_datapath.h
, which allocates the private‐data region inside every mbuf. As a result, callingicmp_mbuf_data(mbuf)
always returns a valid pointer, and writing toicmp_data->timestamp
is safe.No further action needed here.
modules/ip/datapath/arp_input_request.c (2)
41-44
: Cleaner edge assignment logic.The simplified control flow improves readability and performance by removing unnecessary intermediate data structures.
60-64
: Confirmed arp_probe Node Registration
- Found
.name = "arp_probe"
in modules/ip/control/nexthop.c
The node configuration in modules/ip/datapath/arp_input_request.c is correct and consistent. No further changes needed.modules/ip6/datapath/icmp6_input.c (1)
87-90
: Consistent implementation with IPv4 ICMP handling.The timestamp tracking and edge dispatch pattern matches the IPv4 implementation, maintaining consistency across protocols.
modules/ip/control/icmp.c (3)
192-205
: Efficient control plane packet queueing.The batch processing and queue management strategy (dropping oldest packets when memory pool is exhausted) is appropriate for control plane packets.
207-213
: Correct control plane node configuration.The node is properly marked with
GR_NODE_FLAG_CONTROL_PLANE
and configured as a terminal node.
215-219
: Appropriate ICMP type registrations.The registered ICMP types (destination unreachable, TTL exceeded, and echo reply) are the correct ones for control plane handling.
modules/ip6/control/icmp6.c (3)
92-114
: Clean metadata access using icmp6_mbuf_data.The direct use of the mbuf data structure simplifies packet metadata access and aligns well with the new graph-based architecture.
174-196
: Consistent implementation with IPv4 control plane.The node structure and processing logic maintain excellent consistency with the IPv4 implementation.
198-204
: Comprehensive ICMPv6 error type coverage.The registration includes all relevant ICMPv6 error types that require control plane processing, including IPv6-specific ones like packet too big.
modules/infra/control/loop_output.c (1)
79-96
: LGTM!The node registration and interface type registration are correctly implemented for both IPv4 and IPv6.
modules/infra/control/graph.c (1)
383-391
: LGTM!The control plane node affinity setting is correctly implemented with proper error handling.
modules/ip/control/nexthop.c (1)
146-233
: LGTM!The ARP probe processing function is well-implemented with proper error handling and packet queue management.
|
||
if (worker->stats_ctl) | ||
for (unsigned i = 0; i < worker->stats_ctl->n_stats; i++) { | ||
const struct node_stats *n = &w_stats->stats[i]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Critical: Wrong stats array accessed.
Line 82 is accessing the wrong stats array - it should use worker->stats_ctl->stats[i]
instead of w_stats->stats[i]
.
Apply this diff to fix the array access:
- const struct node_stats *n = &w_stats->stats[i];
+ const struct node_stats *n = &worker->stats_ctl->stats[i];
🤖 Prompt for AI Agents
In modules/infra/api/stats.c at line 82, the code incorrectly accesses the stats
array using w_stats->stats[i]. To fix this, replace w_stats->stats[i] with
worker->stats_ctl->stats[i] to ensure the correct stats array is accessed as
intended.
err: | ||
rte_graph_destroy(graph); | ||
rte_graph_destroy(ctl_graph); | ||
rte_graph_destroy(dplane_graph); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add validation before destroying graphs in error path
The error path attempts to destroy graphs without checking if they're valid (not RTE_GRAPH_ID_INVALID).
Apply this diff to fix:
err:
- rte_graph_destroy(graph);
- rte_graph_destroy(ctl_graph);
- rte_graph_destroy(dplane_graph);
+ if (graph != RTE_GRAPH_ID_INVALID)
+ rte_graph_destroy(graph);
+ if (ctl_graph != RTE_GRAPH_ID_INVALID)
+ rte_graph_destroy(ctl_graph);
+ if (dplane_graph != RTE_GRAPH_ID_INVALID)
+ rte_graph_destroy(dplane_graph);
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
err: | |
rte_graph_destroy(graph); | |
rte_graph_destroy(ctl_graph); | |
rte_graph_destroy(dplane_graph); | |
err: | |
if (graph != RTE_GRAPH_ID_INVALID) | |
rte_graph_destroy(graph); | |
if (ctl_graph != RTE_GRAPH_ID_INVALID) | |
rte_graph_destroy(ctl_graph); | |
if (dplane_graph != RTE_GRAPH_ID_INVALID) | |
rte_graph_destroy(dplane_graph); |
🤖 Prompt for AI Agents
In modules/infra/control/graph.c around lines 296 to 299, the error handling
code calls rte_graph_destroy on graph variables without checking if they are
valid. To fix this, add validation to ensure each graph variable is not equal to
RTE_GRAPH_ID_INVALID before calling rte_graph_destroy on it, preventing invalid
destruction calls.
goto next; | ||
} | ||
if (remote->iface_id != nh->iface_id) | ||
ABORT(IP4_F " nexthop lookup gives wrong interface", &ip); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Consider less aggressive error handling for interface mismatch
Using ABORT for interface mismatch might be too aggressive, similar to the IPv6 version.
Consider logging and dropping:
-if (remote->iface_id != nh->iface_id)
- ABORT(IP4_F " nexthop lookup gives wrong interface", &ip);
+if (remote->iface_id != nh->iface_id) {
+ LOG(ERR, IP4_F " nexthop lookup gives wrong interface", &dst);
+ goto next;
+}
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In modules/ip/control/nexthop.c at line 79, replace the ABORT call used for
interface mismatch with a less aggressive approach by logging the error and
dropping the packet instead of aborting. Modify the code to log the interface
mismatch issue clearly and then safely return or drop the packet to avoid
crashing the program.
// Flush all held packets. | ||
struct rte_mbuf *held = nh->held_pkts_head; | ||
while (held != NULL) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add null check before accessing nexthop fields
The code accesses nh->held_pkts_head
without verifying that nh
is not NULL. The nh
pointer could be NULL if the neighbor lookup failed or if it's a multicast/unspecified address.
Add null check:
// Flush all held packets.
+if (nh != NULL) {
struct rte_mbuf *held = nh->held_pkts_head;
while (held != NULL) {
And close the condition block after line 306:
nh->held_pkts_tail = NULL;
nh->held_pkts = 0;
+}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
// Flush all held packets. | |
struct rte_mbuf *held = nh->held_pkts_head; | |
while (held != NULL) { | |
// Flush all held packets. | |
if (nh != NULL) { | |
struct rte_mbuf *held = nh->held_pkts_head; | |
while (held != NULL) { | |
struct rte_mbuf *next = held->next; | |
/* existing packet flushing logic */ | |
held = next; | |
} | |
nh->held_pkts_tail = NULL; | |
nh->held_pkts = 0; | |
} |
🤖 Prompt for AI Agents
In modules/ip6/control/nexthop.c around lines 291 to 293, the code accesses
nh->held_pkts_head without checking if nh is NULL, which can cause a crash if nh
is NULL. Add a null check for nh before accessing its fields, wrapping the
existing code block that uses nh in an if (nh != NULL) condition, and close this
condition block after line 306 to ensure safe access.
modules/ip6/datapath/ndp_rs_input.c
Outdated
|
||
enum { | ||
CONTROL, | ||
ROUTER_SOLLICIT, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix typo in enum and node name.
The enum value and node name contain a typo: "SOLLICIT" should be "SOLICIT". This should be corrected to maintain consistency with standard networking terminology.
Apply this diff to fix the typo:
enum {
- ROUTER_SOLLICIT,
+ ROUTER_SOLICIT,
INVAL,
EDGE_COUNT,
};
- next = ROUTER_SOLLICIT;
+ next = ROUTER_SOLICIT;
.next_nodes = {
- [ROUTER_SOLLICIT] = "ndp_router_sollicit",
+ [ROUTER_SOLICIT] = "ndp_router_solicit",
[INVAL] = "ndp_rs_input_inval",
},
Note: The node name "ndp_router_sollicit" should also be updated to "ndp_router_solicit" in the corresponding node registration.
Also applies to: 60-60, 75-75
🤖 Prompt for AI Agents
In modules/ip6/datapath/ndp_rs_input.c at lines 17, 60, and 75, correct the typo
"SOLLICIT" to "SOLICIT" in the enum value and update the node name from
"ndp_router_sollicit" to "ndp_router_solicit" in the node registration to align
with standard networking terminology.
Add pending patch for pipeline mode. Signed-off-by: Christophe Fontaine <[email protected]>
As we run the 'control output' node on the control plane thread, we can cleanup some code, and remove the signals & threads which were used to convert a signal from the dataplane to an event. Signed-off-by: Christophe Fontaine <[email protected]>
With the graph split across 2 lcores, we need to gather the statistics for the control plane as well. Signed-off-by: Christophe Fontaine <[email protected]>
Add a generic counter (sink) for all packets processed by the control plane. Migrate the nh4_unreachable_cb to its own node and update the other related nodes. Signed-off-by: Christophe Fontaine <[email protected]>
ICMP input callbacks are replaced by registering ICMP types to specific graph nodes using icmp_input_register_type. The ICMP control plane queue is now filled by a dedicated node "icmp_input_ctl", not by a callback. The ICMP mbuf data structure now includes a timestamp for response time calculation. Signed-off-by: Christophe Fontaine <[email protected]>
The ARP input handling logic is moved from callback-based control plane code to a dedicated DPDK graph node (arp_probe). Signed-off-by: Christophe Fontaine <[email protected]>
Similar to ip/icmp, using the pipeline mode, convert the callback to a node running on the control plane lcore. Signed-off-by: Christophe Fontaine <[email protected]>
The IPv6 nexthop unreachable handler (nh6_unreachable_cb) is now implemented as a DPDK graph node (nh6_unreachable_process). The node handles packets that cannot be delivered due to unreachable nexthops, with explicit edges for error and drop conditions (e.g., invalid route, nexthop allocation failure, route insertion failure, hold queue full). Signed-off-by: Christophe Fontaine <[email protected]>
The NDP router solicitation input callback (ndp_router_sollicit_input_cb) is replaced by a dedicated DPDK graph node (ndp_router_sollicit_process). Packets are now sent directly to the "ndp_router_sollicit" node instead of using the control output callback mechanism. Signed-off-by: Christophe Fontaine <[email protected]>
The Neighbor Discovery Protocol (NDP) probe logic is moved from a callback-based control plane handler (ndp_probe_input_cb) to a dedicated DPDK graph node (ndp_probe). The obsolete callback and related handler registration code are removed from both headers and implementation files. Signed-off-by: Christophe Fontaine <[email protected]>
Usage of a negative offset for rte_pktmbuf_mtod_offset is not only ugly, but also doesn't guarantee that the header is actually valid. Introduce a new helper to extend an existing mbuf data, so we can add a new metadata to the existing priv. This allows the icmp priv data to include the IP info. Fixes: 0a39900 ("ip: add ping and traceroute commands") Signed-off-by: Christophe Fontaine <[email protected]>
IP6 header was stored in control_output_mbuf_data. To have a similar code as icmp, extends the ip6_local_mbuf_data to store the timestamp. Signed-off-by: Christophe Fontaine <[email protected]>
Remove all references to control output node and the associated code. This control_output node is no longer required as all features are implemented as nodes. Signed-off-by: Christophe Fontaine <[email protected]>
3015e1a
to
397e7e2
Compare
continue; | ||
for (unsigned i = 0; i < w_stats->n_stats; i++) { | ||
const struct node_stats *n = &w_stats->stats[i]; | ||
const char *name = rte_node_id_to_name(n->node_id); | ||
const struct rte_node_register *nr = gr_node_info_get(n->node_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possible NULL dereference: gr_node_info_get()
may return NULL, but its result is used to access ->node
without checking.
const struct rte_node_register *nr = gr_node_info_get(n->node_id) | ||
->node; | ||
char name[RTE_NODE_NAMESIZE]; | ||
snprintf( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Return value of snprintf
is not checked; truncation of name
would be silent.
This patch series introduces a significant refactor of the control plane and exception path handling.
The main changes are as follows:
Pipeline Mode & Control Plane Graphs:
The DPDK graph pipeline mode is now enabled, with explicit separation between dataplane and control plane graphs. Control plane nodes are marked with a new GR_NODE_FLAG_CONTROL_PLANE flag and are executed on dedicated control threads.
Node-Based Exception Handling:
All major control plane exception handlers (ICMP, ARP, NDP, unreachable callbacks, etc.) have been converted from callback-based mechanisms to dedicated DPDK graph nodes. This improves modularity, visibility, and performance.
Graph and Worker Infrastructure:
The worker and graph infrastructure has been updated to manage both dataplane and control plane graphs, including proper initialization, destruction, and statistics gathering for both.
Code Cleanup:
Obsolete callback registration, thread management, and affinity code have been removed. All exception and control path logic is now handled via graph nodes.
Summary by Sourcery
Refactor the control plane to run as a separate DPDK graph pipeline and convert all callback-based exception and control logic into dedicated control-plane graph nodes executed on their own threads.
New Features:
Enhancements:
Documentation:
Chores:
Summary by Sourcery
Refactor the control plane to run as separate DPDK graph pipelines by splitting the existing worker graph into base, dataplane, and control-plane variants with dedicated control threads and affinity settings.
New Features:
Enhancements:
Documentation:
Chores:
Summary by CodeRabbit
New Features
Refactor
Bug Fixes
Chores
Documentation