This document explains the regression and reporting workflow around CDT performance benchmarks. For the benchmark inventory and instructions for adding new Criterion benchmarks, see benches/README.md.
The performance workflow uses Criterion benchmark output plus scripts/performance_analysis.py to:
- compare current results with saved baselines;
- report regressions and improvements;
- generate Markdown reports for PR or release review;
- keep local and CI checks using the same benchmark contract.
The default CI contract is benches/ci_performance_suite.rs. It is intentionally smaller than the full exploratory benchmark suite so it can provide a stable
regression signal across platforms.
just bench-ci # Run the CI regression benchmark contract
just perf-check # Compare current results against the latest baseline
just perf-check 5.0 # Use a stricter 5% regression threshold
just perf-baseline # Save current results as a timestamped baseline
just perf-baseline tag # Save current results with a descriptive tag
just perf-report # Generate a Markdown performance report
just perf-trends 7 # Summarize recent baseline trends

Useful direct analyzer commands:
uv run performance-analysis --no-run
uv run performance-analysis --no-run --threshold 5.0
uv run performance-analysis --compare performance_baselines/baseline_pre-change.json
uv run performance-analysis --report performance-report.md

just perf-check returns exit code 1 when regressions exceed the threshold. Local callers may treat that as blocking. In CI, benchmark noise is reported but does not by itself fail the PR workflow.
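
The exit-code behavior described above can be sketched as follows. This is a minimal illustration, not the logic of scripts/performance_analysis.py: the function names and the dictionary shape (benchmark name mapped to mean time in nanoseconds) are assumptions made for the example.

```python
def percent_change(baseline_ns: float, current_ns: float) -> float:
    """Return the relative change in percent; positive means slower."""
    if baseline_ns <= 0:
        raise ValueError("baseline time must be positive")
    return (current_ns - baseline_ns) / baseline_ns * 100.0


def check_exit_code(
    baseline: dict[str, float],
    current: dict[str, float],
    threshold_pct: float,
) -> int:
    """Return 1 if any shared benchmark slowed down by more than the threshold."""
    for name, baseline_ns in baseline.items():
        if name not in current:
            continue  # benchmark missing from the current run: nothing to compare
        if percent_change(baseline_ns, current[name]) > threshold_pct:
            return 1
    return 0
```

A local wrapper could pass the result to sys.exit() to block on regressions, while a CI job could log the same result without failing the workflow.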
The performance workflow:
- runs the CI benchmark suite on pull requests;
- compares PR results with the main-branch baseline;
- comments with regressions, improvements, stable benchmarks, and new benchmarks;
- uploads report artifacts;
- saves updated baselines on main after successful merges.
Regression comments should be reviewed carefully, especially for changes touching geometry construction, move proposal enumeration, action calculation, Metropolis simulation, output generation, or validation.
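
The four comment categories listed above (regressions, improvements, stable benchmarks, new benchmarks) could be derived roughly as in the sketch below. The function name, the input shape, and the symmetric use of one threshold for both regressions and improvements are assumptions for illustration; the real analyzer may classify differently.

```python
def classify_benchmarks(
    baseline: dict[str, float],
    current: dict[str, float],
    threshold_pct: float,
) -> dict[str, list[str]]:
    """Group current benchmark names by how they compare with the baseline."""
    groups: dict[str, list[str]] = {
        "regressions": [],
        "improvements": [],
        "stable": [],
        "new": [],
    }
    for name in sorted(current):
        if name not in baseline:
            groups["new"].append(name)  # no baseline entry to compare against
            continue
        if baseline[name] <= 0:
            raise ValueError(f"baseline time for {name!r} must be positive")
        change = (current[name] - baseline[name]) / baseline[name] * 100.0
        if change > threshold_pct:
            groups["regressions"].append(name)
        elif change < -threshold_pct:
            groups["improvements"].append(name)
        else:
            groups["stable"].append(name)
    return groups
```

Sorting the names keeps the grouping deterministic, which makes successive PR comments easier to compare.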
The CI suite focuses on release-relevant CDT paths:
- open-boundary and toroidal triangulation construction;
- topology, foliation, causality, and simplex-classification validation;
- individual ergodic move attempts;
- proposal-site iteration and single-step Metropolis proposal planning;
- short random-move sweeps;
- short Metropolis simulations.
The broader Criterion suite includes exploratory groups for geometry queries, cache behavior, action calculations, simulation analysis, and validation. Those are documented in benches/README.md.
Before a performance-sensitive change:
just bench-ci
just perf-baseline pre-change

During development:
just perf-check 15.0

Before review:
just perf-check
just perf-report

For optimization PRs, include a short performance summary:
## Performance Impact
- proposal-site enumeration: 18% faster on the CI suite
- short Metropolis runs: stable within threshold
- memory allocation: no new persistent allocations in the hot path

- Main-branch baselines are saved by CI.
- Feature baselines can be saved locally with descriptive tags.
- Release baselines should use version tags.
- The analyzer keeps recent baselines and report artifacts for comparison.
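
As a rough illustration of tagged versus timestamped baselines, the sketch below builds a file path for each case. The baseline_<tag>.json pattern mirrors the performance_baselines/baseline_pre-change.json path used with --compare; the timestamp format, the tag sanitizing rule, and the function name are assumptions, not documented analyzer behavior.

```python
import re
from datetime import datetime, timezone
from pathlib import Path


def baseline_path(tag=None, now=None, directory="performance_baselines"):
    """Return a baseline file path from a tag, or from a UTC timestamp if no tag is given."""
    if tag:
        # Assumed rule: collapse characters unsafe in file names into hyphens.
        label = re.sub(r"[^A-Za-z0-9._-]+", "-", tag).strip("-")
        if not label:
            raise ValueError("tag contains no usable characters")
    else:
        moment = now or datetime.now(timezone.utc)
        label = moment.strftime("%Y%m%d_%H%M%S")  # assumed timestamp format
    return Path(directory) / f"baseline_{label}.json"
```

For example, baseline_path("pre-change") yields performance_baselines/baseline_pre-change.json under these assumptions.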
Keep baseline names descriptive enough to recover the comparison later:
just perf-baseline pre-proposal-cache
just perf-baseline v0.1.1

No benchmark results found
: Run just bench-ci first. Use uv run performance-analysis --no-run only after Criterion JSON output exists.
No baseline found for comparison
: Save one with just perf-baseline initial, or compare directly against a known baseline file.
High variance
: Close other CPU-heavy work, rerun the benchmark, and compare trends rather than one noisy sample. Treat large changes reported in PR comments as investigation prompts, not automatic proof of a regression.
Need deeper timing data
: Run a focused Criterion group from benches/README.md, inspect the HTML report under target/criterion/, or use platform-specific
profilers. Memory profiling is not exposed as a Cargo feature in this crate; use external profilers or targeted benchmark instrumentation.
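
To check whether Criterion JSON output exists and to look at raw timings, a small script can walk target/criterion/ directly. The sketch below assumes Criterion's usual layout, where each benchmark directory holds a new/estimates.json file whose mean.point_estimate is in nanoseconds. The function name is hypothetical and this is not how scripts/performance_analysis.py is necessarily implemented.

```python
import json
from pathlib import Path


def load_criterion_means(criterion_dir="target/criterion"):
    """Map each benchmark id to its mean point estimate from Criterion's output."""
    root = Path(criterion_dir)
    if not root.is_dir():
        return {}  # no Criterion output yet: run the benchmarks first
    means = {}
    for path in sorted(root.glob("**/new/estimates.json")):
        # The benchmark id is the directory path above "new", relative to the root.
        bench_id = path.parent.parent.relative_to(root).as_posix()
        with path.open(encoding="utf-8") as handle:
            estimates = json.load(handle)
        means[bench_id] = estimates["mean"]["point_estimate"]
    return means
```

An empty result corresponds to the "No benchmark results found" case above: the benchmarks have not been run, or their output lives elsewhere.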
- benches/ci_performance_suite.rs: stable CI regression contract
- benches/cdt_benchmarks.rs: broader Criterion benchmark groups
- scripts/performance_analysis.py: baseline comparison and report generation
- .github/workflows/performance.yml: CI performance workflow
- performance_baselines/: saved local and CI baselines