Aggregate CPU and Memory accross all nodes instead of per node to fix AWS jobs by ronaldngounou · Pull Request #3844 · kubernetes/perf-tests

ronaldngounou · 2026-02-24T08:47:12Z

Overview

Contributes to issue #3843

Contributes to naming inconsistency issue with the AWS jobs, preventing LoadResources to be meaningful as it is not taking the average across all the nodes

2 possibilities were considered to fix:
Option 1: Arrange for the AWS nodes to have well-defined names
Option 2. Change the metrics gathering/display to just average over all the nodes without regard to their names

GCE tests don't have the same issue because they have a more consistent naming for the nodes whereas AWS logs the instance-id in the nodename. It makes it insconsisent to see the resource usage. Therefore, this PR averages CPU/Memory across all the nodes.

In this PR, option 2 was selected.

Updated parseResourceUsageData to take the average accross the nodes

Testing

> go test parser.go parser_test.go
ok      command-line-arguments  0.370s

Deployed perfdash on localhost:8080 after following https://github.com/kubernetes/perf-tests/blob/master/perfdash/README.md

make push
make run

[done] Confirming that CL2 generates the desired result files:

LoadResources uses parseResourceUsageData and has the ResourceUsageSummary prefix, which corresponds to: https://storage.googleapis.com/kubernetes-ci-logs/pr-logs/pull/perf-tests/3844/pull-perf-tests-ec2-master-scale-performance-100/2028715726881165312/artifacts/ResourceUsageSummary_load_2026-03-03T06:50:12Z.json
- Job execution: https://gcsweb.k8s.io/gcs/kubernetes-ci-logs/pr-logs/pull/perf-tests/3844/pull-perf-tests-ec2-master-scale-performance-100/2028715726881165312/artifacts/

The artifacts are not containing the new resourceUsageAggregate struct added because this change is not modifying the type resourceUsagePercentiles map[string][]resourceUsages rather, it adds type usageAggregateAtPercentiles map[string]resourceUsageAggregate which is the new struct computed. I am not modifying the struct below, which contains the resources.

perf-tests/perfdash/parser.go

Line 135 in 8d22586

usage[name][percentile] = resourceUsage{cpu, memory}

However, I after running the perfdash locally, I'm able to see the aggregation across all the nodes.

Result after

URL = http://localhost:8080/#/?jobname=gce-5000Nodes&metriccategoryname=E2E&metricname=LoadResources&Resource=CPU_Average&PodName=all-nodes after the PR

Before

https://perf-dash.k8s.io/#/?jobname=gce-5000Nodes&metriccategoryname=E2E&metricname=LoadResources&PodName=kube-proxy-control-plane-us-east1-b%2Fkube-proxy&Resource=CPU

The all-nodes is more useful for AWS jobs as the namings differ.

ronaldngounou · 2026-02-24T08:49:02Z

/test pull-perf-tests-ec2-master-scale-performance-5000

ronaldngounou · 2026-02-24T08:56:07Z

/test pull-perf-tests-ec2-master-scale-performance-100

ronaldngounou · 2026-02-24T09:09:15Z

Running only ec2-master-scale-performance-100 is sufficient. We only run 5000 nodes test to debug performance issues.

ronaldngounou · 2026-02-24T09:49:14Z

/assign @mengqiy

aojea · 2026-02-24T17:22:24Z

Example on what @ronaldngounou is mentioning

GCE https://perf-dash.k8s.io/#/?jobname=gce-100Nodes-master&metriccategoryname=E2E&metricname=LoadResources&PodName=kube-proxy-nodes-us-east1-c%2Fkube-proxy&Resource=CPU

AWS https://perf-dash.k8s.io/#/?jobname=aws-100Nodes&metriccategoryname=E2E&metricname=LoadResources&PodName=i-056392ead8e6be597%2Fkubelet&Resource=CPU

the result is that we can not see in aws a trend, since each execution will create different nodes.

@ronaldngounou can we make this change so it is backward compatible and instead of replacing the existing resources, we add a new "fake" resource that we call "aggregated" or "summary" or something like that?
I can not understand from the code if we can break other reporting that depend on the node names

/assign @wojtek-t @mborsz

ronaldngounou · 2026-02-25T01:28:13Z

Sounds good @aojea. I will address that note and update my PR.

ronaldngounou · 2026-03-03T03:31:52Z

@wojtek-t @mborsz PR is ready for review please. Thank you

ronaldngounou · 2026-03-03T06:14:52Z

/test pull-perf-tests-ec2-master-scale-performance-100

aojea · 2026-03-03T07:57:05Z

/lgtm

Thanks

k8s-ci-robot · 2026-03-03T07:57:21Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: aojea, ronaldngounou
Once this PR has been reviewed and has the lgtm label, please ask for approval from mborsz. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

perfdash/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

ronaldngounou · 2026-03-04T02:32:00Z

/test pull-perf-tests-ec2-master-scale-performance-100

…WS jobs Signed-off-by: Ronald Ngounou <ronald.ngounou@yahoo.com>

k8s-ci-robot · 2026-03-06T05:59:05Z

New changes are detected. LGTM label has been removed.

k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 24, 2026

k8s-ci-robot requested review from mborsz and wojtek-t February 24, 2026 08:47

k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Feb 24, 2026

ronaldngounou marked this pull request as draft February 24, 2026 08:49

k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 24, 2026

ronaldngounou marked this pull request as ready for review February 24, 2026 08:49

k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 24, 2026

k8s-ci-robot requested a review from shyamjvs February 24, 2026 08:49

ronaldngounou marked this pull request as draft February 24, 2026 08:50

k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 24, 2026

ronaldngounou changed the title ~~Aggregate CPU and Memory accross all nodes instead of per node~~ Aggregate CPU and Memory accross all nodes instead of per node to fix AWS jobs Feb 24, 2026

k8s-ci-robot assigned mengqiy Feb 24, 2026

ronaldngounou marked this pull request as ready for review February 24, 2026 09:49

k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 24, 2026

k8s-ci-robot assigned mborsz and wojtek-t Feb 24, 2026

ronaldngounou closed this Mar 3, 2026

ronaldngounou force-pushed the fix-nodename-aggregation branch from db52e1a to bcd2372 Compare March 3, 2026 03:16

k8s-ci-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Mar 3, 2026

ronaldngounou reopened this Mar 3, 2026

ronaldngounou marked this pull request as draft March 3, 2026 03:19

k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 3, 2026

ronaldngounou force-pushed the fix-nodename-aggregation branch from bc81329 to a0c7cac Compare March 3, 2026 03:30

k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Mar 3, 2026

ronaldngounou marked this pull request as ready for review March 3, 2026 03:30

k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 3, 2026

ronaldngounou force-pushed the fix-nodename-aggregation branch from a0c7cac to 4d72c6d Compare March 3, 2026 04:03

k8s-ci-robot assigned aojea Mar 3, 2026

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 3, 2026

Aggregate CPU and Memory acros all nodes instead of per node to fix A…

2e0371a

…WS jobs Signed-off-by: Ronald Ngounou <ronald.ngounou@yahoo.com>

ronaldngounou force-pushed the fix-nodename-aggregation branch from 4d72c6d to 2e0371a Compare March 6, 2026 05:59

k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 6, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Aggregate CPU and Memory accross all nodes instead of per node to fix AWS jobs#3844

Aggregate CPU and Memory accross all nodes instead of per node to fix AWS jobs#3844
ronaldngounou wants to merge 1 commit intokubernetes:masterfrom
ronaldngounou:fix-nodename-aggregation

ronaldngounou commented Feb 24, 2026 •

edited

Loading

Uh oh!

ronaldngounou commented Feb 24, 2026

Uh oh!

ronaldngounou commented Feb 24, 2026

Uh oh!

ronaldngounou commented Feb 24, 2026

Uh oh!

ronaldngounou commented Feb 24, 2026

Uh oh!

aojea commented Feb 24, 2026

Uh oh!

ronaldngounou commented Feb 25, 2026

Uh oh!

ronaldngounou commented Mar 3, 2026 •

edited

Loading

Uh oh!

ronaldngounou commented Mar 3, 2026

Uh oh!

aojea commented Mar 3, 2026

Uh oh!

k8s-ci-robot commented Mar 3, 2026

Uh oh!

ronaldngounou commented Mar 4, 2026

Uh oh!

k8s-ci-robot commented Mar 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

Conversation

ronaldngounou commented Feb 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Overview

Testing

Result after

Before

Uh oh!

ronaldngounou commented Feb 24, 2026

Uh oh!

ronaldngounou commented Feb 24, 2026

Uh oh!

ronaldngounou commented Feb 24, 2026

Uh oh!

ronaldngounou commented Feb 24, 2026

Uh oh!

aojea commented Feb 24, 2026

Uh oh!

ronaldngounou commented Feb 25, 2026

Uh oh!

ronaldngounou commented Mar 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ronaldngounou commented Mar 3, 2026

Uh oh!

aojea commented Mar 3, 2026

Uh oh!

k8s-ci-robot commented Mar 3, 2026

Uh oh!

ronaldngounou commented Mar 4, 2026

Uh oh!

k8s-ci-robot commented Mar 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

ronaldngounou commented Feb 24, 2026 •

edited

Loading

ronaldngounou commented Mar 3, 2026 •

edited

Loading