@roshkhatri
Member

@roshkhatri roshkhatri commented Nov 25, 2025

This adds workflow improvements for the PR and Release benchmarks, which now run on
c8g.metal-48xl for ARM64 and c7i.metal-48xl for X86.

Cluster mode: disabled
TLS: disabled
io-threads: 1, 9
Pipelining: 1, 10
Clients: 1600
Benchmark threads: 90
Data size: 16, 96
Commands: SET, GET
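
Assuming the benchmark runs every combination of the multi-valued settings above (the description does not say so explicitly), the list expands to 16 configurations per machine. A minimal sketch of that expansion follows; the dictionary keys and the `expand` helper are illustrative and are not the framework's actual config names.

```python
from itertools import product

# Multi-valued settings from the list above; key names are illustrative only.
matrix = {
    "io_threads": [1, 9],
    "pipelining": [1, 10],
    "data_size": [16, 96],
    "command": ["SET", "GET"],
}

# Single-valued settings apply unchanged to every run.
fixed = {
    "clients": 1600,
    "benchmark_threads": 90,
    "cluster_mode": False,
    "tls": False,
}


def expand(matrix, fixed):
    """Return one dict per combination of the multi-valued settings."""
    keys = list(matrix)
    return [
        {**fixed, **dict(zip(keys, values))}
        for values in product(*matrix.values())
    ]


configs = expand(matrix, fixed)
print(len(configs))  # 2 * 2 * 2 * 2 combinations
```

With 5 iterations per test, as described further down, this assumed matrix would amount to 80 benchmark runs per machine.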

c8g.metal-48xl Spec: https://aws.amazon.com/ec2/instance-types/c8g/
c7i.metal-48xl Spec: https://aws.amazon.com/ec2/instance-types/c7i/

vCPU: 192
NUMA nodes: 2
Memory (GiB): 384
Network Bandwidth (Gbps): 50

PR benchmarking runs on an ARM64 machine, as it has been seen to be more consistent.
Additionally, it runs 5 iterations of each test and posts the average along with other statistical metrics:

  • CI99%: 99% Confidence Interval - range where the true population mean is likely to fall
  • PI99%: 99% Prediction Interval - range where a single future observation is likely to fall
  • CV: Coefficient of Variation - relative variability (σ/μ × 100%)

Note: Values with (n=X, σ=Y, CV=Z%, CI99%=±W%, PI99%=±V%) indicate averages from X runs with standard deviation Y, coefficient of variation Z%, 99% confidence interval margin of error ±W% of the mean, and 99% prediction interval margin of error ±V% of the mean. CI bounds [A, B] and PI bounds [C, D] show the actual interval ranges.
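
To make these metrics concrete, here is a minimal sketch of how they could be computed for one test. It assumes the sample standard deviation (n - 1 denominator) and Student-t intervals; the PR does not state which formulas the framework uses. The `summarize` helper and the sample values are hypothetical and are not real benchmark results.

```python
import math
import statistics

# Two-sided 99% Student-t critical values t(0.995, df) for small samples.
T_CRIT_99 = {2: 9.925, 3: 5.841, 4: 4.604, 5: 4.032, 9: 3.250}


def summarize(samples):
    """Summarize repeated runs of one test (assumed formulas, see lead-in)."""
    n = len(samples)
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)  # sample standard deviation (n - 1)
    t = T_CRIT_99[n - 1]  # degrees of freedom = n - 1

    ci_margin = t * stdev / math.sqrt(n)          # uncertainty of the mean
    pi_margin = t * stdev * math.sqrt(1 + 1 / n)  # spread of one future run

    return {
        "n": n,
        "mean": mean,
        "stdev": stdev,
        "cv_pct": stdev / mean * 100,
        "ci99_pct": ci_margin / mean * 100,
        "pi99_pct": pi_margin / mean * 100,
        "ci99_bounds": (mean - ci_margin, mean + ci_margin),
        "pi99_bounds": (mean - pi_margin, mean + pi_margin),
    }


# Hypothetical throughput values from 5 iterations of one test.
stats = summarize([100.0, 102.0, 98.0, 101.0, 99.0])
print(stats)
```

Under these assumptions the prediction interval is always wider than the confidence interval, because it covers the scatter of a single run on top of the uncertainty of the mean.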

For comparing between versions, it adds a workflow that runs on both ARM64 and X86 machines. It also posts the comparison between versions, like this: #2580 (comment)

Signed-off-by: Roshan Khatri <[email protected]>
@roshkhatri roshkhatri marked this pull request as ready for review November 25, 2025 02:02
@roshkhatri roshkhatri changed the title from "Align PR and Release benchmarking with new changes" to "Add PR and Release benchmark with new changes in framework" Nov 25, 2025
@codecov

codecov bot commented Nov 25, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 72.43%. Comparing base (8ea7f13) to head (cec12f6).
⚠️ Report is 3 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #2871      +/-   ##
============================================
- Coverage     72.44%   72.43%   -0.01%     
============================================
  Files           128      128              
  Lines         70415    70439      +24     
============================================
+ Hits          51011    51026      +15     
- Misses        19404    19413       +9     

see 19 files with indirect coverage changes

@sarthakaggarwal97
Contributor

@roshkhatri looks like we are adding x86 as well? I am not sure if we should add x86 runs if we are not confident on the stability and numbers yet.

Contributor

@rainsupreme rainsupreme left a comment

LGTM! 👍

I wrote a few notes but you don't have to fix them in this PR.

"io-threads": [1,9],
"benchmark-threads": 90,
"server_cpu_range": "0-8",
"client_cpu_range": "144-191,48-95"
Contributor

Curious why the range is split. Does it have something to do with NUMA nodes?

Member Author

Yes, for X86 the NUMA nodes have their cores split.
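
The range strings in the config snippet above use the comma-and-dash CPU list notation that tools such as `taskset -c` also accept. A small sketch of expanding them follows; `parse_cpu_range` is a hypothetical helper, not part of the benchmark framework. As background, offered as an assumption rather than something stated in this thread: on many two-socket x86 hosts with SMT, Linux numbers the sibling hardware threads of a socket in a second block, so one NUMA node covers two non-contiguous CPU ranges. `lscpu` on the machine would confirm the actual layout.

```python
def parse_cpu_range(spec):
    """Expand a CPU list such as "144-191,48-95" into a list of CPU ids."""
    cpus = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, end = (int(x) for x in part.split("-"))
            cpus.extend(range(start, end + 1))  # ranges are inclusive
        else:
            cpus.append(int(part))
    return cpus


server_cpus = parse_cpu_range("0-8")
client_cpus = parse_cpu_range("144-191,48-95")

# The two sets must not overlap, or server and clients would share cores.
assert not set(server_cpus) & set(client_cpus)
print(len(server_cpus), len(client_cpus))
```

For the values in the snippet this gives 9 server CPUs, which matches the largest `io-threads` setting, and 96 client CPUs for the 90 benchmark threads.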

python-version: "3.10"
cache: "pip"

- name: Install dependencies
Contributor

I feel like I've seen this checkout and dependency setup stuff in multiple places. Is there some way we could deduplicate it and avoid the possibility of bugs from having accidental differences between them over time?

with:
  path: artifacts

- name: Combine results and create comprehensive report
Contributor

For the record, I'm not a fan of putting "business logic" in yml files like this. I'd prefer this to be in a script that we call, but I'm not going to make a big fuss in this PR. It's no worse than what you're replacing.

Member Author

That's true. I was thinking the other way: I didn't want to add another script that would be used for just one simple purpose 😅

@roshkhatri
Member Author

> @roshkhatri looks like we are adding x86 as well? I am not sure if we should add x86 runs if we are not confident on the stability and numbers yet.

Yes, we would still like to get benchmark numbers for X86 when doing releases. The PR benchmark only uses ARM64, though.

Contributor

@rainsupreme rainsupreme left a comment

Updates look fine 👍
