Background
The perf-regression.yaml workflow (powered by github-action-benchmark) compares each run's benchmark result against a single cached previous sample. This approach was disabled in PR #7307 after it fired a false regression alert for BenchmarkRunEnumeration/Multiproto (432,036 allocs/op vs 201,337 allocs/op, ratio 2.15×), exceeding the alert threshold.
Root Cause
Single-sample cached comparisons are highly sensitive to noise:
- RunEnumeration/Multiproto measured locally at 1,174,753 allocs/op (-benchtime=1x) vs 1,047,282 allocs/op (-benchtime=10x).
- RunEnumeration/Default measured at ~54.4 M allocs/op (-benchtime=1x) vs ~25.6 M allocs/op (-benchtime=10x).
The metric is dominated by how many iterations the harness happens to run, setup/teardown overhead, and cached outliers — not real regressions in the code.
Proposed Solution
Replace the current single-sample approach with a statistically sound methodology:
- Collect multiple samples — run each benchmark suite with -count=N (e.g., -count=10) on both the base commit and the head commit, storing the raw testing.B output.
- Analyse with benchstat — use golang.org/x/perf/cmd/benchstat to compare the two sample sets. benchstat applies a statistical test (Welch's t-test by default) and reports a p-value alongside the delta, suppressing noise-driven alerts.
- Threshold on significance — only flag a regression when benchstat reports a statistically significant change (e.g., p < 0.05) and the effect size exceeds a meaningful threshold (e.g., Δ > 10 %).
- CI integration — store per-commit sample files as GitHub Actions artifacts (or in a dedicated branch/cache), diff them on PRs, and post the benchstat table as a PR comment.
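The gating rule in the steps above (significance and effect size must both trip) can be sketched in a few lines. This is illustrative only — the helper names are hypothetical, not project code, and the p-value uses a normal approximation to the t distribution, which is adequate for -count=10-sized samples; real CI would simply parse benchstat's output instead.

```python
# Sketch of the proposed alert rule: flag a regression only when Welch's
# t-test reports a significant difference AND the relative slowdown
# exceeds a minimum threshold. Hypothetical helper names; the p-value is
# a normal approximation to the t distribution (fine for illustration).
import math
from statistics import mean, variance

def welch_p_value(a, b):
    """Approximate two-sided p-value of Welch's t-test for samples a, b."""
    se2 = variance(a) / len(a) + variance(b) / len(b)
    if se2 == 0:
        return 1.0  # identical constant samples: no evidence of change
    t = (mean(b) - mean(a)) / math.sqrt(se2)
    return math.erfc(abs(t) / math.sqrt(2))  # normal approximation

def should_alert(base, head, alpha=0.05, min_delta=0.10):
    """Alert only if the change is significant (p < alpha) and > min_delta."""
    delta = (mean(head) - mean(base)) / mean(base)
    return welch_p_value(base, head) < alpha and delta > min_delta

# Illustrative allocs/op samples from ten -count runs (made-up numbers).
base = [100, 102, 98, 101, 99, 100, 103, 97, 101, 99]
print(should_alert(base, [x + 30 for x in base]))  # clear 30% regression -> True
print(should_alert(base, [x + 5 for x in base]))   # significant but only 5% -> False
```

Note the second call: with low variance even a 5 % shift is statistically significant, which is exactly why the effect-size threshold is needed in addition to the p-value.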
Possible tooling options:
- benchstat (official Go toolchain)
- continuous-benchmark with a custom script that pre-aggregates samples
- A bespoke GitHub Actions workflow that builds the benchmarks, runs them with -count=10, saves the output as an artifact, then diffs against the previous run's artifact using benchstat
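As a rough sketch, the bespoke-workflow option could look like the fragment below. All names, paths, and the artifact layout are assumptions for illustration, not the repository's actual configuration; a real workflow would also need the PR-comment step.

```yaml
# Illustrative sketch only -- job names, file names, and versions are
# assumptions, not existing configuration in this repository.
name: benchstat-regression
on: pull_request
jobs:
  bench:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0          # need both base and head commits
      - uses: actions/setup-go@v5
      - name: Benchmark base commit
        run: |
          git checkout ${{ github.event.pull_request.base.sha }}
          go test -run '^$' -bench . -count=10 ./... > base.txt
      - name: Benchmark head commit
        run: |
          git checkout ${{ github.event.pull_request.head.sha }}
          go test -run '^$' -bench . -count=10 ./... > head.txt
      - name: Compare with benchstat
        run: |
          go install golang.org/x/perf/cmd/benchstat@latest
          benchstat base.txt head.txt | tee benchstat.txt
      - uses: actions/upload-artifact@v4
        with:
          name: benchstat
          path: benchstat.txt
```

The `-run '^$'` filter skips unit tests so only benchmarks execute; running both commits in one job keeps the hardware identical for the two sample sets.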
References
- .github/workflows/perf-regression.yaml
- benchstat documentation: https://pkg.go.dev/golang.org/x/perf/cmd/benchstat