The goal of this file is to describe the common mistakes and confusion points
an agent might face as they work in this codebase.
If you ever encounter something in the project that surprises you,
alert the developer working with you and record it in the CLAUDE.md file so that future agents do not hit the same issue.
Bencher is a continuous benchmarking platform that detects and prevents performance regressions.
- Bencher API Server (`services/api`) - see `services/api/CLAUDE.md`
- `bencher` CLI (`services/cli`) - CLI for the REST API
- Bencher Console (`services/console`) - see `services/console/CLAUDE.md`
- Bare Metal `runner` (`services/runner`) - Bare Metal benchmark runner
Version control uses Jujutsu (jj) with Git.
- Practice test-driven development (TDD)
- All new code should be designed for testability and maintainability
- All changes should include appropriate unit and integration tests

Build:

    cargo build

Run the tests:

    cargo nextest run

Test a single package:

    cargo nextest run -p my_package

`nextest` does not support doctests, so also run:

    cargo test --doc

Crates that depend on `bencher_valid` will need to specify either:

- `server` feature for server-side usage

      cargo nextest run -p my_package --features server

- `client` feature for client-side usage

      cargo nextest run -p my_package --features client

Otherwise, you will see:

    use of undeclared type `Regex`

Format, lint, and check:

    cargo fmt
    cargo clippy --no-deps --all-targets --all-features -- -Dwarnings
    cargo check --no-default-features

When modifying `target_os = "linux"` crates (`bencher_init`, `bencher_rootfs`, `bencher_runner`, `bencher_runner_cli`), also run the cross-compilation checks locally:

    ./scripts/clippy.sh # Runs clippy natively + cross-compiles to x86_64-unknown-linux-gnu
    ./scripts/test.sh --linux-only # Cross-compiles tests for the Linux-only crates

These scripts require a cross-compiler (`zig`, `x86_64-linux-gnu-gcc`, or `x86_64-unknown-linux-gnu-gcc`)
and the `x86_64-unknown-linux-gnu` Rust target.
The clippy script will install the target automatically and warn if no cross-compiler is found.
- Always run `cargo fmt` and `cargo clippy` when testing or before committing
- Run `cargo fmt` one final time after all changes are complete (including any generated code or lint fixes), since clippy fixes and other automated changes can introduce formatting drift
- Use `#[expect(...)]` instead of `#[allow(...)]` for lint suppression
- Do NOT suppress a lint outside of a test module without explicit approval
- All dependency versions go in the workspace `Cargo.toml`
- When reviewing code, also check:
  - `cargo check --no-default-features`
  - `cargo gen-types` (if the API changed at all)
- Use idiomatic, strong types instead of `String` and `serde_json::Value` where possible
- Database model fields should use strong validated types (e.g., `ProjectId`, `ProjectUuid`, `ProjectName`, `DateTime`, `VersionNumber`) with Diesel `ToSql`/`FromSql` impls rather than raw primitives (`i32`, `i64`, `String`). All conversion happens inside the Diesel impls, not in the model layer.
- Avoid `select!` macros - use `futures_concurrency::stream::Merge::merge`
- Prefer stream combinators (`try_fold`, `try_for_each`, `try_collect`, etc.) over manual `while let Some(chunk) = stream.next().await` loops when processing streams
- All time-based tests should be deterministic and use time manipulation, not real wall-clock time
- Use `bencher_json::Clock::Custom` (behind the `test-clock` feature) to inject a fake clock in tests instead of calling `DateTime::now()` directly. `Clock` is available on `ApiContext`.
- For unit tests without access to `ApiContext`/`Clock`, use `bencher_json::DateTime::TEST` (a fixed deterministic const). Enable `test-clock` in `bencher_json` dev-dependencies to access it.
- Most wire type definitions are in the `bencher_valid` or `bencher_json` crate
- Always pass strong types (`MyTypeId`, `MyTypeUuid`, etc.) into a function instead of its stringly typed equivalent, even in tests
- Do NOT use shared, global mutable state
- Always use `thiserror` for error types in libraries and production binaries (`services/`). Do not use `anyhow` in those crates. `anyhow` is acceptable in `tasks/` crates (build tasks, test harnesses) where convenience outweighs structured errors.
- Do not use `Box<dyn Error>` (or `Box<dyn std::error::Error + Send + Sync>`) as a return type. Use `HttpError` for API endpoint errors or define specific `thiserror` error enums. The only acceptable uses of `Box<dyn Error>` are when wrapping third-party APIs that return boxed errors (e.g., diesel migrations, dropshot server creation).
dyn std::any::Anywithout explicit justification and approval - When adding workspace dependencies without extra options (no
optional, nofeatures), use the shorthanddep.workspace = trueform instead ofdep = { workspace = true } - Use
camino(Utf8Path/Utf8PathBuf) for file paths whenever practical instead ofstd::path::Path/PathBuf. Exception:tempfile::tempdir()in tests may usestd::pathsince it returnsTempDirwith&Path; convert viaUtf8Path::from_path()at the boundary when needed. - Use
clapfor CLI argument parsing- The
clapstruct definitions should live in a separateparsermodule - The subcommand handler logic should live in a separate module named after the binary for production code (ie
bencher) or a module namedtaskfortasks/*crates - Do NOT use
num_argson flags inbencher run— it usestrailing_var_arg = trueto matchdocker runsemantics, andnum_argsconflicts with trailing vararg parsing. Validate collection sizes at the type/deserialization layer instead (e.g.,TryFromimpls inbencher_json).
- The
- Prefer destructuring a struct (`let Self { field1, field2, .. } = self;` or `let Foo { .. } = foo;`) over individual field access (`.field1`, `.field2`) when consuming or converting all fields. This ensures the compiler flags a build error when a field is added, preventing silent omissions.
- Use macros for database connection access. Each of these macros has both a single-use form and an expanded closure-like form for repeated use in the same scope.
  - `public_conn!()` - For read-only public access
    - This optionally takes in a `PublicUser`
  - `auth_conn!()` - For read-only authenticated access
  - `write_conn!()` - For single writer access
- Use `diesel::QueryResult<T>` instead of `Result<T, diesel::result::Error>`; it is a type alias and more idiomatic
- Database write methods that may be called both standalone and from within an outer transaction should NOT wrap in `conn.transaction()` internally. Instead, the standalone callers should wrap the call in a transaction. This avoids unnecessary SQLite savepoints when the method is called from batch operations.
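
The strong-type and destructuring guidelines above can be illustrated with a minimal sketch. Every name in it (`WidgetId`, `WidgetName`, `QueryWidget`, `JsonWidget`, `is_first_widget`) is hypothetical and exists only for illustration; none of them are real Bencher types, and real models would carry validation and Diesel impls that are omitted here.

```rust
/// Hypothetical strongly typed identifier, standing in for types like `ProjectId`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct WidgetId(pub i32);

/// Hypothetical strongly typed name, standing in for types like `ProjectName`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct WidgetName(pub String);

/// Hypothetical database model (not a real Bencher type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct QueryWidget {
    pub id: WidgetId,
    pub name: WidgetName,
    pub archived: bool,
}

/// Hypothetical wire type (not a real Bencher type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct JsonWidget {
    pub name: WidgetName,
    pub archived: bool,
}

impl QueryWidget {
    /// Consume the model and convert it into its wire representation.
    pub fn into_json(self) -> JsonWidget {
        // Every field is named and there is no trailing `..`, so adding a
        // field to `QueryWidget` makes this pattern fail to compile until the
        // new field is handled (or explicitly ignored) here.
        let Self { id: _, name, archived } = self;
        JsonWidget { name, archived }
    }
}

/// Takes the strongly typed ID rather than a raw `i32`, so callers cannot
/// pass an unrelated integer by accident.
pub fn is_first_widget(id: WidgetId) -> bool {
    id == WidgetId(1)
}
```

In this sketch the pattern in `into_json` deliberately has no trailing `..`: a rest pattern would accept newly added fields silently, whereas the exhaustive form turns a forgotten field into a compile error.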
Shell scripts are used very sparingly. Prefer creating tasks in tasks/ (invoked via cargo aliases). Administrative-only tasks go in xtask/. Shell scripts are only acceptable as ultra-lightweight wrappers around commands like git or docker.
Defined in .cargo/config.toml:
- `cargo xtask` - Administrative tasks
- `cargo gen-types` / `cargo gen-spec` / `cargo gen-ts` - Type generation
- `cargo test-api` - API testing and DB seeding
- `cargo test-runner` - Runner integration tests (requires Linux + KVM)
- PRs are opened against the `devel` branch
- Deploy to Bencher Cloud: reset `cloud` to `devel` and push
- After successful deploy: CI resets `main` to `cloud`
- Release tags (e.g., `v0.5.10`) are created off `devel`
Rust is the single source of truth for types:
- Rust structs annotated with `#[typeshare]` in `bencher_json` and other crates
- `cargo gen-types` generates OpenAPI spec (`services/api/openapi.json`) and TypeScript types (`services/console/src/types/bencher.ts`)
- `bencher_valid` is compiled to WASM for browser-side validation
- `bencher_client` is auto-generated from the OpenAPI spec via progenitor
When adding a new crate, update all three Dockerfiles: