Skip to content

Conversation

christophefontaine
Copy link
Collaborator

@christophefontaine christophefontaine commented Aug 15, 2025

  • Still refining code and the CLI, will work on the load balance algorithm in a subsequent PR.

Summary by CodeRabbit

  • New Features

    • Support for blackhole nexthops (IPv4/IPv6) causing matched traffic to be dropped.
    • Nexthop groups with per-packet load balancing for IPv4/IPv6 and group management APIs.
    • CLI: create/list blackhole or group nexthops, show group members, and modify group membership.
  • Bug Fixes

    • Route deletion now respects actual nexthop types.
  • Chores

    • Increased multipath limit to 64.

Copy link

coderabbitai bot commented Aug 15, 2025

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

Walkthrough

Adds two new nexthop types: BLACKHOLE and GROUP. API headers extend gr_nexthop with group capacity and group set/show messages. Nexthop creation, deletion, equality/free, and control-layer logic updated to accept blackhole and group types; grout translation and nexthop add/del now handle blackhole and group nexthops. CLI supports adding/showing/setting groups and blackhole nexthops. Route listing prints “blackhole” when appropriate. Datapath: registers blackhole drop nodes for IPv4/IPv6, adds group selection in ICMP handlers, and introduces ip/ip6 load-balance nodes. Build config enables multipath=64.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • maxime-leroy
  • rjarry

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@christophefontaine christophefontaine force-pushed the ecmp branch 6 times, most recently from 47c5af8 to 3e638de Compare August 16, 2025 08:16
@christophefontaine
Copy link
Collaborator Author

@coderabbitai review

Copy link

coderabbitai bot commented Aug 16, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai bot requested review from maxime-leroy and rjarry August 16, 2025 08:35
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

🔭 Outside diff range comments (2)
modules/ip6/control/route.c (1)

291-301: Potential NULL deref + wrong type source (LPM vs exact) + UAF on flag clear

  • nh comes from rib6_lookup (LPM). It can be NULL (-> crash on nh->type).
  • Even if non-NULL, using LPM to fetch type for an exact route delete is incorrect; type may not correspond to the exact rn being removed, leading to spurious EINVAL.
  • Clearing nh->flags after rib6_delete risks use-after-free when ref_count was 1 (delete decrefs and may free).

Fix by looking up the exact rn, clear gateway flag before deletion, then delete using the exact nh type.

Apply this diff:

-	nh = rib6_lookup(req->vrf_id, GR_IFACE_ID_UNDEF, &req->dest.ip);
-	ret = rib6_delete(
-		req->vrf_id, GR_IFACE_ID_UNDEF, &req->dest.ip, req->dest.prefixlen, nh->type
-	);
-	if (ret == -ENOENT && req->missing_ok)
-		ret = 0;
-
-	if (nh && nh->ref_count == 1)
-		nh->flags &= ~GR_NH_F_GATEWAY;
+	nh = rib6_lookup_exact(
+		req->vrf_id, GR_IFACE_ID_UNDEF, &req->dest.ip, req->dest.prefixlen
+	);
+	if (nh == NULL) {
+		ret = -ENOENT;
+		if (req->missing_ok)
+			ret = 0;
+	} else {
+		/* Clear gateway flag before deletion if this was the last holder */
+		if (nh->ref_count == 1)
+			nh->flags &= ~GR_NH_F_GATEWAY;
+		ret = rib6_delete(
+			req->vrf_id, GR_IFACE_ID_UNDEF, &req->dest.ip, req->dest.prefixlen, nh->type
+		);
+		if (ret == -ENOENT && req->missing_ok)
+			ret = 0;
+	}
modules/ip/control/route.c (1)

267-275: Potential NULL deref + wrong type source (LPM vs exact) + UAF on flag clear

  • nh comes from rib4_lookup (LPM). It can be NULL (-> crash on nh->type).
  • Using LPM to fetch type for an exact delete can mismatch the route's actual type, causing EINVAL spuriously.
  • Clearing nh->flags after rib4_delete risks use-after-free when ref_count was 1 (delete decrefs and may free).

Fix by using exact lookup to get the nh to delete, clear the gateway flag before deletion if it's the last holder, then delete with the exact nh type.

Apply this diff:

-	nh = rib4_lookup(req->vrf_id, req->dest.ip);
-	ret = rib4_delete(req->vrf_id, req->dest.ip, req->dest.prefixlen, nh->type);
-	if (ret == -ENOENT && req->missing_ok)
-		ret = 0;
-
-	if (nh && nh->ref_count == 1)
-		nh->flags &= ~GR_NH_F_GATEWAY;
+	nh = rib4_lookup_exact(req->vrf_id, req->dest.ip, req->dest.prefixlen);
+	if (nh == NULL) {
+		ret = -ENOENT;
+		if (req->missing_ok)
+			ret = 0;
+	} else {
+		/* Clear gateway flag before deletion if this was the last holder */
+		if (nh->ref_count == 1)
+			nh->flags &= ~GR_NH_F_GATEWAY;
+		ret = rib4_delete(req->vrf_id, req->dest.ip, req->dest.prefixlen, nh->type);
+		if (ret == -ENOENT && req->missing_ok)
+			ret = 0;
+	}
🧹 Nitpick comments (10)
subprojects/packagefiles/frr/meson-add-dependency-definition.patch (1)

106-106: Avoid hidden drift: make multipath a meson option

Consider a meson option (e.g., -Dfrr_multipath=64) to keep this in sync with GR_NH_GROUP_MAX and ease overrides in CI.

Example snippet within this meson file (conceptual):

  • Define option in project options: option('frr_multipath', type: 'integer', value: 64)
  • Replace 64 with get_option('frr_multipath')
modules/ip/cli/route.c (1)

82-98: Also render GROUP nexthops in NEXT_HOP column

Blackhole rendering looks good. For completeness and consistency with new types, add a GROUP branch (e.g., show “group” or “group()”) instead of falling through to AF-based formatting.

-		if (route->nh.type == GR_NH_T_BLACKHOLE)
-			scols_line_sprintf(line, 2, "blackhole");
-		else
-			switch (route->nh.af) {
+		if (route->nh.type == GR_NH_T_BLACKHOLE) {
+			scols_line_sprintf(line, 2, "blackhole");
+		} else if (route->nh.type == GR_NH_T_GROUP) {
+			if (route->nh.nh_id != GR_NH_ID_UNSET)
+				scols_line_sprintf(line, 2, "group(%u)", route->nh.nh_id);
+			else
+				scols_line_sprintf(line, 2, "group");
+		} else {
+			switch (route->nh.af) {
 			case GR_AF_UNSPEC:
 				if (iface_from_id(c, route->nh.iface_id, &iface) < 0)
 					scols_line_sprintf(line, 2, "%u", route->nh.iface_id);
 				else
 					scols_line_sprintf(line, 2, "%s", iface.name);
 				break;
 			case GR_AF_IP4:
 				scols_line_sprintf(line, 2, IP4_F, &route->nh.ipv4);
 				break;
 			case GR_AF_IP6:
 				scols_line_sprintf(line, 2, IP6_F, &route->nh.ipv6);
 				break;
 			}
+		}
modules/ip6/cli/route.c (1)

82-98: Mirror GROUP nexthop rendering for IPv6

Same suggestion as IPv4: render GROUP explicitly (e.g., “group” or “group()”) so users don’t see misleading AF-based info for groups.

-		if (route->nh.type == GR_NH_T_BLACKHOLE)
-			scols_line_sprintf(line, 2, "blackhole");
-		else
-			switch (route->nh.af) {
+		if (route->nh.type == GR_NH_T_BLACKHOLE) {
+			scols_line_sprintf(line, 2, "blackhole");
+		} else if (route->nh.type == GR_NH_T_GROUP) {
+			if (route->nh.nh_id != GR_NH_ID_UNSET)
+				scols_line_sprintf(line, 2, "group(%u)", route->nh.nh_id);
+			else
+				scols_line_sprintf(line, 2, "group");
+		} else {
+			switch (route->nh.af) {
 			case GR_AF_UNSPEC:
 				if (iface_from_id(c, route->nh.iface_id, &iface) < 0)
 					scols_line_sprintf(line, 2, "%u", route->nh.iface_id);
 				else
 					scols_line_sprintf(line, 2, "%s", iface.name);
 				break;
 			case GR_AF_IP4:
 				scols_line_sprintf(line, 2, IP4_F, &route->nh.ipv4);
 				break;
 			case GR_AF_IP6:
 				scols_line_sprintf(line, 2, IP6_F, &route->nh.ipv6);
 				break;
 			}
+		}
modules/ip/datapath/ip_input.c (1)

32-41: Guard against out-of-range nexthop types indexing nh_type_edges

nh_type_edges has a fixed size (256). If gr_nh_type_t ever exceeds this bound, indexing invokes UB. Add a compile-time guard where the enum is defined or here as a sanity check.

Example (outside this hunk, near gr_nexthop.h or alongside nh_type_edges):

_Static_assert(GR_NH_T_MAX < 256, "nh_type_edges[] size too small for nexthop types");

Also applies to: 159-162

modules/ip6/datapath/ip6_loadbalance.c (1)

31-46: Defensive checks for group selection

Two edge cases worth guarding:

  • d->nh should be non-NULL and type GROUP when this node runs.
  • Selected member nh->nh_group[idx] could be NULL (e.g., transient membership).

A tiny guard prevents crashes and routes such cases to NO_NEXTHOP.

@@
-        nh = d->nh;
+        nh = d->nh;
         edge = OUTPUT;
-        if (nh->nh_grp_count == 0) {
+        if (!nh || nh->nh_grp_count == 0) {
             edge = NO_NEXTHOP;
             goto next;
         }
-        d->nh = nh->nh_group[mbuf->hash.rss % nh->nh_grp_count];
+        {
+            const struct nexthop *m = nh->nh_group[mbuf->hash.rss % nh->nh_grp_count];
+            if (m == NULL) {
+                edge = NO_NEXTHOP;
+                goto next;
+            }
+            d->nh = m;
+        }
modules/ip/datapath/ip_loadbalance.c (1)

31-46: Harden against NULL group members and unexpected nh

Same as IPv6 path: guard d->nh and the selected member pointer to avoid crashes in transient or misconfigured scenarios.

@@
-        nh = d->nh;
+        nh = d->nh;
         edge = OUTPUT;
-        if (nh->nh_grp_count == 0) {
+        if (!nh || nh->nh_grp_count == 0) {
             edge = NO_NEXTHOP;
             goto next;
         }
-        d->nh = nh->nh_group[mbuf->hash.rss % nh->nh_grp_count];
+        {
+            const struct nexthop *m = nh->nh_group[mbuf->hash.rss % nh->nh_grp_count];
+            if (m == NULL) {
+                edge = NO_NEXTHOP;
+                goto next;
+            }
+            d->nh = m;
+        }
modules/infra/control/nexthop.c (1)

672-676: Nit: resetting type to L3 in free() is redundant

nexthop_decref() memset()s the object right after calling ops->free(), so setting nh->type to L3 is unnecessary. Safe to remove for clarity.

 static void nh_type_group_free(struct nexthop *nh) {
-    nh->type = GR_NH_T_L3;
     memset(&nh->nh_group, 0, sizeof(nh->nh_group));
 }
frr/rt_grout.c (1)

863-889: Ensure AF is UNSPEC for BLACKHOLE nexthops

You pre-set AF from the NHE AFI earlier. For BLACKHOLE, force AF to UNSPEC to avoid leaking an unrelated AF down the stack.

 case NEXTHOP_TYPE_BLACKHOLE:
-    gr_nh->type = GR_NH_T_BLACKHOLE;
+    gr_nh->type = GR_NH_T_BLACKHOLE;
+    gr_nh->af = GR_AF_UNSPEC;
     break;
modules/infra/api/gr_nexthop.h (1)

77-78: Optional: static assert GR_NH_GROUP_MAX fits nh_grp_count type

nh_grp_count is uint8_t. Consider a compile-time check to prevent future regressions if GR_NH_GROUP_MAX changes.

Example:

_Static_assert(GR_NH_GROUP_MAX <= UINT8_MAX, "nh_grp_count is uint8_t");
modules/infra/cli/nexthop.c (1)

389-395: Fix help text for CLEAR command

The CLEAR variant empties a group, not appends. Update the description for clarity.

Apply this diff:

-  "Append a nexthop to a group.",
+  "Clear all nexthops from a group.",
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 3b45ce5 and 3e638de.

⛔ Files ignored due to path filters (1)
  • docs/graph.svg is excluded by !**/*.svg
📒 Files selected for processing (18)
  • frr/rt_grout.c (5 hunks)
  • modules/infra/api/gr_nexthop.h (5 hunks)
  • modules/infra/api/nexthop.c (3 hunks)
  • modules/infra/cli/nexthop.c (7 hunks)
  • modules/infra/control/nexthop.c (3 hunks)
  • modules/ip/cli/route.c (1 hunks)
  • modules/ip/control/route.c (1 hunks)
  • modules/ip/datapath/icmp_local_send.c (1 hunks)
  • modules/ip/datapath/ip_input.c (2 hunks)
  • modules/ip/datapath/ip_loadbalance.c (1 hunks)
  • modules/ip/datapath/meson.build (1 hunks)
  • modules/ip6/cli/route.c (1 hunks)
  • modules/ip6/control/route.c (1 hunks)
  • modules/ip6/datapath/icmp6_local_send.c (1 hunks)
  • modules/ip6/datapath/ip6_input.c (2 hunks)
  • modules/ip6/datapath/ip6_loadbalance.c (1 hunks)
  • modules/ip6/datapath/meson.build (1 hunks)
  • subprojects/packagefiles/frr/meson-add-dependency-definition.patch (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (7)
modules/ip/cli/route.c (3)
cli/table.c (1)
  • scols_line_sprintf (10-23)
modules/infra/cli/iface.c (1)
  • iface_from_id (130-143)
modules/infra/control/gr_iface.h (1)
  • iface (15-21)
modules/ip/datapath/ip_loadbalance.c (6)
modules/infra/datapath/gr_mbuf.h (1)
  • gr_mbuf_is_traced (52-54)
modules/infra/datapath/trace.c (1)
  • gr_mbuf_trace_add (461-488)
modules/infra/control/gr_graph.h (1)
  • rte_node_enqueue_x1 (22-25)
modules/ip6/datapath/ip6_loadbalance.c (1)
  • loadbalance_register (51-54)
modules/ip/datapath/ip_input.c (1)
  • ip_input_register_nexthop_type (34-41)
modules/ip/datapath/ip_output.c (1)
  • ip_output_register_nexthop_type (45-52)
modules/ip6/datapath/ip6_loadbalance.c (6)
modules/infra/datapath/gr_mbuf.h (1)
  • gr_mbuf_is_traced (52-54)
modules/infra/datapath/trace.c (1)
  • gr_mbuf_trace_add (461-488)
modules/infra/control/gr_graph.h (1)
  • rte_node_enqueue_x1 (22-25)
modules/ip/datapath/ip_loadbalance.c (1)
  • loadbalance_register (51-54)
modules/ip6/datapath/ip6_input.c (1)
  • ip6_input_register_nexthop_type (35-42)
modules/ip6/datapath/ip6_output.c (1)
  • ip6_output_register_nexthop_type (41-48)
modules/infra/control/nexthop.c (5)
main/api.c (1)
  • init (493-495)
modules/ip/datapath/fib4.c (1)
  • init (167-169)
modules/ip6/datapath/fib6.c (1)
  • init (181-183)
main/event.c (1)
  • gr_event_register_serializer (34-54)
main/module.c (1)
  • gr_register_module (55-67)
frr/rt_grout.c (3)
modules/infra/api/nexthop.c (2)
  • nh_add (92-165)
  • nh_del (173-195)
modules/infra/cli/nexthop.c (2)
  • nh_add (61-106)
  • nh_del (108-121)
frr/zebra_dplane_grout.c (1)
  • grout_client_send_recv (179-216)
modules/infra/api/nexthop.c (5)
modules/infra/control/gr_iface.h (1)
  • iface (15-21)
modules/infra/cli/nexthop.c (3)
  • nh_add (61-106)
  • nh_group_set (262-285)
  • nh_group_show (123-165)
main/gr_module.h (1)
  • api_out (16-19)
modules/infra/control/nexthop.c (2)
  • nexthop_lookup_by_id (481-488)
  • nexthop_incref (535-537)
modules/ip/api/dnat44.c (1)
  • _init (176-181)
modules/infra/cli/nexthop.c (5)
cli/gr_cli.h (1)
  • arg_u32 (70-76)
cli/ecoli.c (2)
  • arg_str (85-95)
  • with_help (15-24)
modules/infra/api/nexthop.c (3)
  • nh_group_show (299-319)
  • nh_group_set (249-291)
  • nh_list (220-241)
api/gr_api_client_impl.h (1)
  • gr_api_client_send_recv (67-129)
api/gr_net_types.h (1)
  • gr_af_name (31-41)
🔇 Additional comments (20)
modules/ip/datapath/meson.build (2)

19-19: ip_loadbalance.c added to build: LGTM

Build integration looks correct.


19-19: ip_loadbalance node registration verified

The file exists and correctly registers its datapath node and drop handler in modules/ip/datapath/ip_loadbalance.c:

  • Line 71: GR_NODE_REGISTER(info_loadbalance)
  • Line 72: GR_DROP_REGISTER(ip_lb_no_nexthop)
subprojects/packagefiles/frr/meson-add-dependency-definition.patch (1)

106-106: Multipath=64 support confirmed
FRR’s ./configure --enable-multipath=64 is supported in 10.x (up to 999) and zebra enforces the compile-time ECMP limit of 64. No changes needed.

modules/ip6/datapath/meson.build (2)

13-13: ip6_loadbalance.c added to build: LGTM

Consistent with IPv6 GROUP nexthop support.


13-13: ip6_loadbalance.c: file and node registration confirmed

modules/ip6/datapath/ip6_loadbalance.c exists and registers:

  • GR_NODE_REGISTER(info_loadbalance)
  • GR_DROP_REGISTER(ip6_lb_no_nexthop)

No further changes needed.

modules/ip6/datapath/ip6_input.c (1)

154-154: BLACKHOLE path wiring looks good

Registering GR_NH_T_BLACKHOLE to ip6_blackhole and adding the drop node closes the loop for IPv6 blackholes. No further action from my side.

Also applies to: 189-189

modules/ip/datapath/ip_input.c (2)

159-162: BLACKHOLE nexthop wiring looks correct

Registering GR_NH_T_BLACKHOLE to "ip_blackhole" integrates cleanly with the nh_type_edges lookup and the drop path.


190-195: Drop-node registration for ip_blackhole is consistent

Registering the ip_blackhole drop path complements the registration above and ensures packets are dropped properly.

modules/ip6/datapath/ip6_loadbalance.c (1)

51-54: Registration matches IPv4 and keeps graph consistent

Mapping GROUP to "ip6_forward" on input and to "ip6_loadbalance" on output mirrors IPv4 and fits the pipeline.

modules/ip/datapath/ip_loadbalance.c (1)

51-54: Input/output nexthop-type wiring is appropriate

Registration mirrors IPv6 and maintains the intended path to the load-balancer node.

modules/infra/control/nexthop.c (3)

309-323: Allowing ops registration for BLACKHOLE and GROUP is correct

Switch extension is fine; invalidations are properly ABORTed for duplicates/missing callbacks.


331-346: Type whitelisting includes BLACKHOLE and GROUP

Creation guard now permits the new types; good gatekeeping before allocation.


682-687: Registering GROUP type ops at init is good

Ensures equality/free hooks are available before use.

frr/rt_grout.c (2)

193-200: BLACKHOLE nexthop translation to FRR is correct

Mapping to NEXTHOP_TYPE_BLACKHOLE with bh_type=BLACKHOLE_NULL and AF_UNSPEC is aligned with FRR expectations. Weight=1 default is reasonable.


823-825: Delegation to grout_add_del_nexthop_group is the right split

Early group handling keeps single-NH flow untouched. Good separation of concerns.

modules/infra/api/gr_nexthop.h (1)

158-162: String names for new types: LGTM

Mappings for BLACKHOLE and GROUP are consistent and user-friendly.

modules/infra/cli/nexthop.c (1)

62-80: nh_add control flow for blackhole/group: LGTM

Early exit to the send-path for BLACKHOLE and GROUP types is clean and avoids unnecessary AF/MAC parsing.

Also applies to: 101-106

modules/infra/api/nexthop.c (3)

42-46: nh_add type dispatch and validation: LGTM (clear separation and errno semantics)

Clean split between BLACKHOLE/GROUP/L3 paths; validation helpers return errno-style codes for early API-out; later device ops return negatives and are normalized at the end. This flow is consistent.

Also applies to: 87-91, 99-116


186-189: Double-check ops->del for BLACKHOLE

nh_del allows BLACKHOLE types, but unconditionally calls ops->del(nh) using nh->af. For non-L3 (e.g., BLACKHOLE), af may be GR_AF_UNSPEC. Confirm that nexthop_af_ops_get(GR_AF_UNSPEC) is valid, or guard the call for non-L3 types.

Would you like me to scan for the nexthop_af_ops_get implementation and its callers to verify it gracefully handles GR_AF_UNSPEC?


299-319: nh_group_show response allocation/size: LGTM

Fixed-size response with bounded array; count set appropriately; server-side is correct.

Comment on lines 262 to 293
static cmd_status_t nh_group_set(const struct gr_api_client *c, const struct ec_pnode *p) {
const struct ec_pnode *n = ec_pnode_find(p, "NH_ID");
const struct ec_strvec *v = ec_pnode_get_strvec(n);
struct gr_nh_group_set_req req = {.nh_grp_count = 0};

if (arg_u32(p, "ID", &req.group_id) < 0)
return CMD_ERROR;

if (n) {
if (v == NULL)
return CMD_ERROR;

req.nh_grp_count = ec_strvec_len(v);
for (size_t i = 0; i < req.nh_grp_count; i++) {
const char *str = ec_strvec_val(v, i);
req.nh_ids[i] = atoi(str);
}
}

if (gr_api_client_send_recv(c, GR_NH_GROUP_SET, sizeof(req), &req, NULL) < 0)
return CMD_ERROR;

return CMD_SUCCESS;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Avoid atoi overflow; parse NH_IDs safely and bound-check nh_grp_count

atoi returns int and silently overflows above INT_MAX. You accept NH_ID up to UINT32_MAX-1 in the grammar, so values in (INT_MAX, UINT32_MAX) will be mis-parsed. Also, defensively ensure nh_grp_count <= GR_NH_GROUP_MAX even if the parser already limits it.

Apply this diff:

   if (n) {
     if (v == NULL)
       return CMD_ERROR;

-    req.nh_grp_count = ec_strvec_len(v);
+    req.nh_grp_count = ec_strvec_len(v);
+    if (req.nh_grp_count > GR_NH_GROUP_MAX)
+      return CMD_ERROR;
     for (size_t i = 0; i < req.nh_grp_count; i++) {
       const char *str = ec_strvec_val(v, i);
-      req.nh_ids[i] = atoi(str);
+      char *end = NULL;
+      errno = 0;
+      unsigned long tmp = strtoul(str, &end, 10);
+      if (errno != 0 || end == str || *end != '\0' || tmp > UINT32_MAX - 1 || tmp == 0)
+        return CMD_ERROR;
+      req.nh_ids[i] = (uint32_t)tmp;
     }
   }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
static cmd_status_t nh_group_set(const struct gr_api_client *c, const struct ec_pnode *p) {
const struct ec_pnode *n = ec_pnode_find(p, "NH_ID");
const struct ec_strvec *v = ec_pnode_get_strvec(n);
struct gr_nh_group_set_req req = {.nh_grp_count = 0};
if (arg_u32(p, "ID", &req.group_id) < 0)
return CMD_ERROR;
if (n) {
if (v == NULL)
return CMD_ERROR;
req.nh_grp_count = ec_strvec_len(v);
for (size_t i = 0; i < req.nh_grp_count; i++) {
const char *str = ec_strvec_val(v, i);
req.nh_ids[i] = atoi(str);
}
}
if (gr_api_client_send_recv(c, GR_NH_GROUP_SET, sizeof(req), &req, NULL) < 0)
return CMD_ERROR;
return CMD_SUCCESS;
}
static cmd_status_t nh_group_set(const struct gr_api_client *c, const struct ec_pnode *p) {
const struct ec_pnode *n = ec_pnode_find(p, "NH_ID");
const struct ec_strvec *v = ec_pnode_get_strvec(n);
struct gr_nh_group_set_req req = {.nh_grp_count = 0};
if (arg_u32(p, "ID", &req.group_id) < 0)
return CMD_ERROR;
if (n) {
if (v == NULL)
return CMD_ERROR;
req.nh_grp_count = ec_strvec_len(v);
if (req.nh_grp_count > GR_NH_GROUP_MAX)
return CMD_ERROR;
for (size_t i = 0; i < req.nh_grp_count; i++) {
const char *str = ec_strvec_val(v, i);
char *end = NULL;
errno = 0;
unsigned long tmp = strtoul(str, &end, 10);
if (errno != 0 || end == str || *end != '\0' || tmp > UINT32_MAX - 1 || tmp == 0)
return CMD_ERROR;
req.nh_ids[i] = (uint32_t)tmp;
}
}
if (gr_api_client_send_recv(c, GR_NH_GROUP_SET, sizeof(req), &req, NULL) < 0)
return CMD_ERROR;
return CMD_SUCCESS;
}

@christophefontaine
Copy link
Collaborator Author

@coderabbitai review

Copy link

coderabbitai bot commented Aug 16, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
modules/infra/control/nexthop.c (1)

662-674: Group equality now checks both count and members — resolves prior concern

Comparing nh_grp_count first and iterating over the active entries fixes the previously-flagged equality issue where differing sizes could be considered equal.

🧹 Nitpick comments (1)
modules/infra/control/nexthop.c (1)

676-679: Avoid mutating nh->type in free; also clear nh_grp_count for clarity

Setting nh->type to L3 inside the free callback is redundant (the caller memset’s the whole struct right after). Clearing nh_grp_count makes intent explicit if this callback is ever reused outside the current decref path.

Apply this diff:

 static void nh_type_group_free(struct nexthop *nh) {
-	nh->type = GR_NH_T_L3;
-	memset(&nh->nh_group, 0, sizeof(nh->nh_group));
+	nh->nh_grp_count = 0;
+	memset(&nh->nh_group, 0, sizeof(nh->nh_group));
 }
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 3e638de and e95840a.

⛔ Files ignored due to path filters (1)
  • docs/graph.svg is excluded by !**/*.svg
📒 Files selected for processing (12)
  • frr/rt_grout.c (5 hunks)
  • modules/infra/api/gr_nexthop.h (5 hunks)
  • modules/infra/api/nexthop.c (3 hunks)
  • modules/infra/cli/nexthop.c (7 hunks)
  • modules/infra/control/nexthop.c (3 hunks)
  • modules/ip/datapath/icmp_local_send.c (1 hunks)
  • modules/ip/datapath/ip_loadbalance.c (1 hunks)
  • modules/ip/datapath/meson.build (1 hunks)
  • modules/ip6/datapath/icmp6_local_send.c (1 hunks)
  • modules/ip6/datapath/ip6_loadbalance.c (1 hunks)
  • modules/ip6/datapath/meson.build (1 hunks)
  • subprojects/packagefiles/frr/meson-add-dependency-definition.patch (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (11)
  • modules/ip/datapath/meson.build
  • modules/ip/datapath/icmp_local_send.c
  • modules/ip6/datapath/meson.build
  • modules/ip/datapath/ip_loadbalance.c
  • subprojects/packagefiles/frr/meson-add-dependency-definition.patch
  • modules/ip6/datapath/icmp6_local_send.c
  • frr/rt_grout.c
  • modules/ip6/datapath/ip6_loadbalance.c
  • modules/infra/api/gr_nexthop.h
  • modules/infra/cli/nexthop.c
  • modules/infra/api/nexthop.c
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: build-and-tests (clang-18, none, debugoptimized, ubuntu-24.04)
  • GitHub Check: build-and-tests (clang-15, none, debugoptimized, ubuntu-22.04)
  • GitHub Check: build-and-tests (gcc-14, address, debug, ubuntu-24.04)
  • GitHub Check: build-and-tests (gcc-13, none, debugoptimized, ubuntu-24.04, -Dfrr=enabled)
  • GitHub Check: build-and-tests (clang-16, none, debugoptimized, ubuntu-24.04)
🔇 Additional comments (3)
modules/infra/control/nexthop.c (3)

315-317: Extending type-ops registry to BLACKHOLE/GROUP looks correct; confirm BLACKHOLE needs no ops

The switch now recognizes both types. Since there’s no BLACKHOLE ops registered in init, this implies no type-specific equality/free is needed for BLACKHOLE. Please confirm that assumption holds across call sites (e.g., no reliance on ops->free/equal for BLACKHOLE).


341-343: Allowing BLACKHOLE/GROUP in nexthop_new is correct

Creation path now accepts both types and preserves existing validation semantics.


690-691: Registering GROUP type ops at init — good wiring

Ensures GROUP gets the equal/free hooks early; integrates cleanly with nexthop_equal and decref.

@christophefontaine
Copy link
Collaborator Author

christophefontaine commented Oct 15, 2025

rebased on main, will clean & test that PR this week before switching from draft to "ready to review".

Add nexthop type "GROUP", which is an aggregation of multiple nexthops.
This may be used for ECMP, in order to configure up to
GR_NEXTHOP_GROUP_MAX nexthop for a route.

In a group, nexthop are added with a relative weight:

grout# nexthop add group id 333 members nh 45 weight 3 nh 42
grout# nexthop type group
VRF  ID   ORIGIN  IFACE  TYPE   INFO
0    333  user           group  id(45/3) id(42/1)

Signed-off-by: Christophe Fontaine <[email protected]>
Using the configured nexthop GR_NH_T_GROUP, we select one
of the nexthop based on the input RSS.
As of now, we consider all weights being equal to "1".
A further commit will take into account the route weight.

Signed-off-by: Christophe Fontaine <[email protected]>
Setting the ident field in the ICMP request message
is useful for debuging and bringing stability in the future tests.

Signed-off-by: Christophe Fontaine <[email protected]>
icmp(6)_local_send access the nexthop to select the proper gateway.
This can't work with the nexthop group. As we can't use the rss field,
use the "ident" field instead.

Add smoke test for loadbalance.

Signed-off-by: Christophe Fontaine <[email protected]>
Using the new nexthop type group, allow multiple nexthop
as a gateway from frr.

The following zebra conf:
!
ip route 192.168.1.20/32 cv0
ip route 192.168.1.20/32 192.168.210.3
ip route 192.168.1.20/32 192.168.210.254
!

translates into these rib nexthops:
ID: 35 (zebra)
     RefCnt: 2
     Uptime: 00:00:07
     VRF: default(No AFI)
     Nexthop Count: 3
     Valid, Installed
     Depends: (20) (24) (30)
           is directly connected, cv0 (vrf default), weight 1
           via 192.168.210.3, cv0 (vrf default), weight 1
           via 192.168.210.254, cv0 (vrf default), weight 1

With grout displays the proper NHs:
grout# nexthop type group
VRF  ID  ORIGIN  IFACE  TYPE   INFO
0    35  user           group  id(20/1) id(24/1) id(30/1)

Signed-off-by: Christophe Fontaine <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants