Enable kubernetes_node_scale benchmark (up to 5k nodes) on AWS EKS with Karpenter #6512
kiryl-filatau wants to merge 17 commits into GoogleCloudPlatform:master
Conversation
# Output can be quite large, so we'll conditionally suppress it.
['get', resource_type, '-o', 'json'],
timeout=60 * 5,  # 5 minutes for large clusters (e.g. 1000 pods)
suppress_logging=NUM_PODS.value > 20,
def _PostCreate(self):
  """Performs post-creation steps for the cluster."""
  super()._PostCreate()
  # Karpenter controller resources: default 1/1Gi; scale up when node_scale target is set.
Can we just not specify anything & let Karpenter decide? Or is this indeed necessary? It seems clever but a little annoying / bad user experience by Karpenter.
These are the resources for the Karpenter controller pod (the node where Karpenter itself runs). Karpenter doesn't manage that node, so it can't "decide" these values; we have to set them ourselves. For runs with ~10 nodes, 1/1Gi is sufficient; we only increase them when node_scale is 500+ or 1000+.
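The sizing logic described above could be sketched roughly like this. The thresholds match the comment (500+, 1000+), but the scaled-up CPU/memory values are illustrative assumptions, not necessarily the exact ones in the PR:

```python
# Sketch: pick Karpenter controller pod resources from the node-scale target.
# The 500/1000 thresholds follow the discussion; the scaled-up values
# (2/4Gi, 4/8Gi) are assumed for illustration.
def controller_resources(node_target: int) -> dict:
  """Returns CPU/memory requests and limits for the Karpenter controller pod."""
  if node_target >= 1000:
    cpu, memory = '4', '8Gi'
  elif node_target >= 500:
    cpu, memory = '2', '4Gi'
  else:
    cpu, memory = '1', '1Gi'  # default, sufficient for ~10-node runs
  return {
      'requests': {'cpu': cpu, 'memory': memory},
      'limits': {'cpu': cpu, 'memory': memory},
  }
```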
'v'
+ full_version.strip().strip('"').split(f'{self.cluster_version}-v')[1]
)
# NodePool CPU limit: scale with benchmark target (nodes * 2 + 5%), min 1000.
Does the machine type matter here as well? If I am using a larger machine type, do I need to also set a larger CPU limit? This again seems a little annoying to have to set manually (but maybe makes sense given Karpenter can be machine-type agnostic).
Makes sense to include machine type adjustment, I’ll think about how to cover it.
Thanks.
I added the eks_karpenter_limits_vcpu_per_node flag so the Karpenter NodePool CPU limit can be tuned when nodes use more than 2 vCPUs. The default remains 2 (same behavior as before).
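The resulting limit computation can be sketched as follows. `nodepool_cpu_limit` is a hypothetical helper name, with the flag value passed in as `vcpus_per_node`; the formula (5% headroom, floor of 1000) comes from the diff comment above:

```python
import math

# Sketch of the NodePool CPU limit: nodes * vCPUs-per-node with 5% headroom,
# never below 1000. vcpus_per_node defaults to 2, matching the flag default.
def nodepool_cpu_limit(num_nodes: int, vcpus_per_node: int = 2) -> int:
  return max(1000, math.ceil(num_nodes * vcpus_per_node * 1.05))
```

With the default, 10 nodes yields the 1000 floor and 5000 nodes yields 10500, matching the examples in the PR summary; setting the flag to 4 doubles the large-run limit.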
suppress_failure=lambda stdout, stderr, retcode: (
    'no matching resources found' in stderr.lower()
    or 'timed out' in stderr.lower()
    or 'context deadline exceeded' in stderr.lower()
These look very similar to the RETRYABLE_KUBECTL_ERRORS list:
Just use kubectl.RunRetryableKubectlCommand instead and get these for free. If that code is missing some of these (like 'timed out'), then consider adding them. It looks like suppress_failure is supported too, so you can mix both - which would probably be good for 'no matching resources found', as that sounds like a wait/this-command-specific error message to ignore.
@hubatish
Updated: EKS cleanup now uses RunRetryableKubectlCommand with suppress_failure only for "no resources found" style messages; the retryable list is extended and matching is case-insensitive. Please check.
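A minimal sketch of such a predicate, assuming the `(stdout, stderr, retcode)` signature shown in the diff above. Retryable errors like timeouts are left to the retry wrapper; only "missing resource" messages are suppressed, matched case-insensitively:

```python
# Sketch: suppress only "no resources found" style failures; leave retryable
# errors (timeouts, context deadline exceeded) to the kubectl retry wrapper.
_IGNORABLE_MESSAGES = ('no matching resources found', 'no resources found')

def suppress_missing_resources(stdout: str, stderr: str, retcode: int) -> bool:
  if retcode == 0:
    return False  # command succeeded; nothing to suppress
  err = stderr.lower()  # case-insensitive matching
  return any(msg in err for msg in _IGNORABLE_MESSAGES)
```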
),
)
max_retries = 5
backoff_seconds = 10
While this backoff logic looks pretty reasonable, prefer reusing the backoff logic in vm_util.Retry, which means moving this code to a subfunction and adding said decorator.
I have updated the code, please take a look.
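For reference, the suggested shape could look like the sketch below: a minimal exponential-backoff decorator in the spirit of vm_util.Retry (not PKB's actual implementation), applied to a subfunction that holds the previously inline retry loop:

```python
import functools
import time

# Sketch: a minimal exponential-backoff retry decorator, standing in for
# vm_util.Retry. The decorated subfunction replaces an inline retry loop.
def retry(max_retries=5, backoff_seconds=10, exceptions=(Exception,)):
  def decorator(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
      delay = backoff_seconds
      for attempt in range(max_retries):
        try:
          return func(*args, **kwargs)
        except exceptions:
          if attempt == max_retries - 1:
            raise  # out of retries; propagate the last error
          time.sleep(delay)
          delay *= 2  # exponential backoff
    return wrapper
  return decorator
```

The cleanup step (e.g. deleting an orphan ENI) then becomes a small decorated subfunction instead of carrying its own max_retries/backoff_seconds loop.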
| """Stop watching the cluster for node add/remove events.""" | ||
| polled_events = self._cluster.GetEvents() | ||
|
|
||
| # Resolve machine type only for current nodes; use "unknown" for the rest. |
Oh, this makes sense. Was this causing the cluster to take a long time querying everything?
Yep, it was the main reason.
if name in _current_node_names:
  machine_type = _GetMachineTypeFromNodeName(self._cluster, name)
else:
  machine_type = "unknown"
Something around here is probably what is causing the TypeError.
So here you use "unknown". I wonder if a randomly chosen different machine's type would be better instead; likely in a big scaling scenario they'll all use the same one.
It will be "unknown" anyway: by the time the info is gathered, the nodes from scaleUP1 were already removed.
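The single-pass lookup discussed above could be sketched like this, assuming `current_nodes` is a name-to-machine-type mapping built from one `kubectl get nodes` call (function names here are illustrative, not the PR's):

```python
# Sketch: resolve machine types with one list-nodes pass. Nodes that were
# already removed (e.g. by the first scale-down) fall back to "unknown"
# instead of triggering a per-node kubectl call.
def machine_types_for_events(event_node_names, current_nodes):
  """current_nodes: {node_name: machine_type} from a single kubectl call."""
  return {
      name: current_nodes.get(name, 'unknown')
      for name in event_node_names
  }
```

On a 5k-node run this replaces thousands of per-node lookups with a single cluster query plus dictionary reads.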
)
if k8s_cluster is None:
  if not isinstance(
      benchmark_spec.container_cluster, kubernetes_cluster.KubernetesCluster
I was gonna say swap to raise instead of return, but the return if None seems quite reasonable.
'Default value - do not install unless explicitly requested',
)
flags.DEFINE_integer(
    'eks_karpenter_limits_vcpu_per_node',
Use flagholder: https://absl.readthedocs.io/en/latest/absl.flags.html#absl.flags.FlagHolder
Also not sure if this is generally the right spot for these. Ideally both should probably go in config_overrides, with this one maybe setting CPU size from vm_spec and the other coming in a follow-up CL.
Summary
Enables running the kubernetes_node_scale benchmark (0→5k→0→5k nodes) on AWS EKS with Karpenter. The benchmark scales a deployment with pod anti-affinity, measures scale-up, scale-down, and a second scale-up, then tears down the cluster.
Main changes
- kubernetes_node_scale benchmark — template and scaling logic (scale up, scale down, phases), metrics collection, and timeouts tuned for large runs.
- EKS + Karpenter — NodePool template (instance types including t, CPU limit derived from the scale target), EKS/Karpenter cluster lifecycle and cleanup.
- Karpenter scaling by node count — NodePool CPU limit is computed from kubernetes_scale_num_nodes: max(1000, ceil(nodes × 2 × 1.05)) (e.g. 10 nodes → 1000, 5k → 10500). Controller pod resources scale with the same flag. One configuration works for both small and 5k-node runs.
- Teardown robustness — orphan ENI deletion in _CleanupKarpenter: retry with backoff on AWS throttling (RequestLimitExceeded), treat "ENI not found" as success; uses suppress_failure for these cases.
- Tracker — single get nodes pass in _StopWatchingForNodeChanges; resolve machine type only for current nodes and use "unknown" for the rest, to avoid thousands of kubectl calls on 5k-node runs.
- Tests — kubernetes_scale_benchmark_test mocks updated to return valid kubectl -o json output ({"items": [...]}) so tests pass after GetStatusConditionsForResourceType was switched from jsonpath to full JSON.
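The mock shape described in the Tests bullet can be sketched with unittest.mock. The `(stdout, stderr, retcode)` return shape and the mocked call name are assumptions for illustration; the point is that the mocked stdout must now be a full JSON document with an "items" list, not a jsonpath string:

```python
import json
from unittest import mock

# Sketch: a kubectl mock whose stdout is full `-o json` output, as the
# switched GetStatusConditionsForResourceType implementation expects.
fake_stdout = json.dumps({'items': [
    {'status': {'conditions': [{'type': 'Ready', 'status': 'True'}]}},
]})
run_kubectl = mock.Mock(return_value=(fake_stdout, '', 0))

# The code under test parses the whole document rather than a jsonpath result.
stdout, _, _ = run_kubectl(['get', 'nodes', '-o', 'json'])
parsed = json.loads(stdout)
```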