Stop adding kubernetes.azure.com/cluster label to unbounded-managed nodes#17

Merged
bcho merged 15 commits into main from
copilot/remove-kubernetes-label-unbounded-node
May 8, 2026

Contributor

Copilot AI commented Apr 9, 2026

Unbounded-managed nodes on AKS should not be stamped with kubernetes.azure.com/cluster, an AKS-internal label that should appear only on natively provisioned nodes. This PR stops adding that label and adds an unbounded-managed kube-proxy for external site nodes, so that ClusterIP service traffic works without modifying the provider-owned AKS kube-proxy DaemonSet.

Changes

  • Stop adding kubernetes.azure.com/cluster to unbounded-managed AKS node labels.
  • Simplify AKS provider detection now that the cluster label no longer needs to be scraped from system nodes.
  • Add a managed kube-proxy controller that creates one unbounded-net-kube-proxy-<site> DaemonSet per Site.
  • Label eligible site nodes with net.unbounded-cloud.io/kube-proxy=managed only when they are not AKS/provider-managed and are not already covered by a provider kube-proxy DaemonSet.
  • Reuse the cluster kube-system/kube-proxy image when available, with runtime config and flag overrides for the managed kube-proxy image and enablement.
  • Add the unbounded-net-kube-proxy ServiceAccount, system:node-proxier binding, and controller DaemonSet RBAC.
  • Document the design, scheduling behavior, bootstrapping details, and AKS validation in designs/managed-kube-proxy.md.
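The labeling rule and the per-Site DaemonSet naming described in the list above can be sketched roughly as follows. This is a minimal illustration, not the controller code from this PR: the `siteNode` type, its `ProviderManaged` and `CoveredByProvider` fields, and all function names are hypothetical. Only the label key and value and the `unbounded-net-kube-proxy-<site>` name pattern come from the description.

```go
package main

const (
	// Label key/value and DaemonSet name prefix as given in the PR description.
	managedProxyLabelKey = "net.unbounded-cloud.io/kube-proxy"
	managedProxyLabelVal = "managed"
	daemonSetPrefix      = "unbounded-net-kube-proxy-"
)

// siteNode is a hypothetical, simplified stand-in for a Kubernetes Node.
type siteNode struct {
	Labels map[string]string
	// Assumption: true when the node is AKS/provider-managed.
	ProviderManaged bool
	// Assumption: true when a provider kube-proxy DaemonSet already covers the node.
	CoveredByProvider bool
}

// managedKubeProxyDaemonSetName returns the per-Site DaemonSet name.
func managedKubeProxyDaemonSetName(site string) string {
	return daemonSetPrefix + site
}

// wantsManagedKubeProxy reports whether the node is eligible for the
// managed kube-proxy: not provider-managed and not already covered.
func wantsManagedKubeProxy(n siteNode) bool {
	return !n.ProviderManaged && !n.CoveredByProvider
}

// desiredLabels returns a copy of the node's labels, adding the managed
// kube-proxy label only for eligible nodes. The input map is not mutated.
func desiredLabels(n siteNode) map[string]string {
	out := make(map[string]string, len(n.Labels)+1)
	for k, v := range n.Labels {
		out[k] = v
	}
	if wantsManagedKubeProxy(n) {
		out[managedProxyLabelKey] = managedProxyLabelVal
	}
	return out
}
```

In this sketch, a node with `ProviderManaged: true` keeps its labels unchanged, while an externally joined node in a Site named `test` would gain the managed label and be targeted by the DaemonSet named `unbounded-net-kube-proxy-test`.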

Validation

  • go test ./internal/net/config ./internal/net/controller ./cmd/unbounded-net-controller
  • go build -o bin/unbounded-net-controller ./cmd/unbounded-net-controller
  • git diff --check
  • make net-manifests unbounded-net-controller
  • Live AKS validation: an externally joined node in the test site could not reach unbounded-net-controller through its ClusterIP before managed kube-proxy was deployed; after unbounded-net-kube-proxy-test was scheduled, ClusterIP 10.0.203.248:9999 became reachable.

Copilot AI changed the title from "[WIP] Remove kubernetes.azure.com/cluster label from unbounded managed node" to "Stop adding kubernetes.azure.com/cluster label to unbounded-managed nodes" Apr 9, 2026
Copilot finished work on behalf of bcho April 9, 2026 23:55
Copilot AI requested a review from bcho April 9, 2026 23:55
Member

@bcho bcho left a comment

Comment thread internal/cloudprovider/provider_test.go Outdated
Copilot finished work on behalf of bcho April 10, 2026 00:04
Copilot AI requested a review from bcho April 10, 2026 00:04
Member

bcho commented Apr 13, 2026

Partial kubelet logs:

52d7ff8bc80-cni-conf\") pod \"unbounded-net-node-n52gj\" (UID: \"bb3a49e0-a945-4c23-b5e1-252d7ff8bc80\") " pod="unbounded-net/unbounded-net-node-n52gj"
Apr 13 21:27:50 test-vm kubelet[160]: I0413 21:27:50.756260     160 reconciler_common.go:251] "operationExecutor.VerifyControllerAttachedVolume started for volume \"wireguard\" (UniqueName: \"kubernetes.io/host-path/bb3a49e0-a945-4c23-b5e1-252d7ff8bc80-wireguard\") pod \"unbounded-net-node-n52gj\" (UID: \"bb3a49e0-a945-4c23-b5e1-252d7ff8bc80\") " pod="unbounded-net/unbounded-net-node-n52gj"
Apr 13 21:27:50 test-vm kubelet[160]: I0413 21:27:50.756272     160 reconciler_common.go:251] "operationExecutor.VerifyControllerAttachedVolume started for volume \"tmp\" (UniqueName: \"kubernetes.io/empty-dir/bb3a49e0-a945-4c23-b5e1-252d7ff8bc80-tmp\") pod \"unbounded-net-node-n52gj\" (UID: \"bb3a49e0-a945-4c23-b5e1-252d7ff8bc80\") " pod="unbounded-net/unbounded-net-node-n52gj"
Apr 13 21:27:50 test-vm kubelet[160]: I0413 21:27:50.756283     160 reconciler_common.go:251] "operationExecutor.VerifyControllerAttachedVolume started for volume \"vartmp\" (UniqueName: \"kubernetes.io/empty-dir/bb3a49e0-a945-4c23-b5e1-252d7ff8bc80-vartmp\") pod \"unbounded-net-node-n52gj\" (UID: \"bb3a49e0-a945-4c23-b5e1-252d7ff8bc80\") " pod="unbounded-net/unbounded-net-node-n52gj"
Apr 13 21:27:50 test-vm kubelet[160]: I0413 21:27:50.756294     160 reconciler_common.go:251] "operationExecutor.VerifyControllerAttachedVolume started for volume \"iproute2\" (UniqueName: \"kubernetes.io/host-path/bb3a49e0-a945-4c23-b5e1-252d7ff8bc80-iproute2\") pod \"unbounded-net-node-n52gj\" (UID: \"bb3a49e0-a945-4c23-b5e1-252d7ff8bc80\") " pod="unbounded-net/unbounded-net-node-n52gj"
Apr 13 21:27:50 test-vm kubelet[160]: I0413 21:27:50.756304     160 reconciler_common.go:251] "operationExecutor.VerifyControllerAttachedVolume started for volume \"runtime-config\" (UniqueName: \"kubernetes.io/configmap/bb3a49e0-a945-4c23-b5e1-252d7ff8bc80-runtime-config\") pod \"unbounded-net-node-n52gj\" (UID: \"bb3a49e0-a945-4c23-b5e1-252d7ff8bc80\") " pod="unbounded-net/unbounded-net-node-n52gj"
Apr 13 21:27:50 test-vm kubelet[160]: I0413 21:27:50.756314     160 reconciler_common.go:251] "operationExecutor.VerifyControllerAttachedVolume started for volume \"controller-ca\" (UniqueName: \"kubernetes.io/configmap/bb3a49e0-a945-4c23-b5e1-252d7ff8bc80-controller-ca\") pod \"unbounded-net-node-n52gj\" (UID: \"bb3a49e0-a945-4c23-b5e1-252d7ff8bc80\") " pod="unbounded-net/unbounded-net-node-n52gj"
Apr 13 21:27:50 test-vm kubelet[160]: I0413 21:27:50.756325     160 reconciler_common.go:251] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kube-api-access-fvv86\" (UniqueName: \"kubernetes.io/projected/bb3a49e0-a945-4c23-b5e1-252d7ff8bc80-kube-api-access-fvv86\") pod \"unbounded-net-node-n52gj\" (UID: \"bb3a49e0-a945-4c23-b5e1-252d7ff8bc80\") " pod="unbounded-net/unbounded-net-node-n52gj"
Apr 13 21:27:50 test-vm kubelet[160]: I0413 21:27:50.756335     160 reconciler_common.go:251] "operationExecutor.VerifyControllerAttachedVolume started for volume \"cni-bin\" (UniqueName: \"kubernetes.io/host-path/bb3a49e0-a945-4c23-b5e1-252d7ff8bc80-cni-bin\") pod \"unbounded-net-node-n52gj\" (UID: \"bb3a49e0-a945-4c23-b5e1-252d7ff8bc80\") " pod="unbounded-net/unbounded-net-node-n52gj"
Apr 13 21:28:12 test-vm kubelet[160]: I0413 21:28:12.746571     160 kubelet_node_status.go:439] "Fast updating node status as it just became ready"
Apr 13 21:28:14 test-vm kubelet[160]: I0413 21:28:14.714188     160 pod_startup_latency_tracker.go:104] "Observed pod startup duration" pod="unbounded-net/unbounded-net-node-n52gj" podStartSLOduration=9.421655684 podStartE2EDuration="24.71417647s" podCreationTimestamp="2026-04-13 21:27:50 +0000 UTC" firstStartedPulling="2026-04-13 21:27:52.696865787 +0000 UTC m=+4.538787414" lastFinishedPulling="2026-04-13 21:28:07.989386583 +0000 UTC m=+19.831308200" observedRunningTime="2026-04-13 21:28:08.70728391 +0000 UTC m=+20.549205537" watchObservedRunningTime="2026-04-13 21:28:14.71417647 +0000 UTC m=+26.556098097"

Comment thread cmd/machina/machina/controller/ssh_integration_test.go Outdated
Copilot finished work on behalf of bcho April 13, 2026 21:59
Copilot AI requested a review from bcho April 13, 2026 21:59
@bcho bcho marked this pull request as ready for review April 13, 2026 22:05
@bcho bcho requested a review from a team April 13, 2026 22:05
Member

@bcho bcho left a comment

@copilot please include path internal/cloudprovider to kind e2e test target

Contributor Author

Copilot AI commented Apr 13, 2026

@copilot please include path internal/cloudprovider to kind e2e test target

Done in bf56973 — added internal/cloudprovider/** to both the pull_request and push path triggers in .github/workflows/agent-e2e-kind.yaml.

Copilot finished work on behalf of bcho April 13, 2026 22:13
Copilot AI requested a review from bcho April 13, 2026 22:13
Copilot finished work on behalf of bcho April 13, 2026 22:40
Copilot AI requested a review from bcho April 13, 2026 22:40
jwilder previously approved these changes Apr 24, 2026
Member

@bcho bcho left a comment

@copilot resolve conflicts

Co-authored-by: bcho <1975118+bcho@users.noreply.github.com>
Contributor Author

Copilot AI commented May 7, 2026

@copilot resolve conflicts

Resolved in ec648c7. Kept both internal/cloudprovider/** (our addition) and pkg/agent/** (added in main) in the workflow path triggers.

Copilot finished work on behalf of bcho May 7, 2026 17:24
Copilot AI requested a review from bcho May 7, 2026 17:24
Member

@bcho bcho left a comment

fix lint issue

Contributor Author

Copilot AI commented May 7, 2026

fix lint issue

Fixed in a404b35 — removed the double blank line at line 70 in provider_test.go that was causing the gofmt lint failure.

Copilot finished work on behalf of bcho May 7, 2026 17:36
Copilot AI requested a review from bcho May 7, 2026 17:36
@bcho bcho enabled auto-merge (squash) May 7, 2026 20:43
@bcho bcho disabled auto-merge May 8, 2026 00:55
@bcho bcho enabled auto-merge (squash) May 8, 2026 00:55
@bcho bcho merged commit 4a1d336 into main May 8, 2026
37 of 54 checks passed
@bcho bcho deleted the copilot/remove-kubernetes-label-unbounded-node branch May 8, 2026 03:48

Development

Successfully merging this pull request may close these issues.

[agent] Stop adding kubernetes.azure.com/cluster label to unbounded managed node

4 participants