feat(benchmarks): Add Claude UI benchmark harness #427

3 issues

xcodebuildmcp-test-boundary-review: Found 3 issues (1 high, 2 medium)

High

Unit test spawns real `python3` process without injection - `src/benchmarks/claude-ui/__tests__/claude-ui-benchmark.test.ts:22-39`

The `runParserScript` helper spawns a real `python3` process directly through `node:child_process`, bypassing the executor overrides installed by the safety setup. As a result, the unit test run invokes an actual external binary.

Medium

Tests use real OS filesystem because log-writer is not injected into `dismissFirstRunPrompts` - `src/benchmarks/claude-ui/__tests__/first-run-preflight.test.ts:24-29`

These tests create real temp directories and read actual files from disk to verify log output. The filesystem dependency should be injected (as `logWriter`) so the tests can stay fully in-memory, consistent with the pattern used in `prepareTemporarySimulator`.
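
A minimal sketch of what an injected log writer could look like. The `LogWriter` interface, `InMemoryLogWriter`, and `dismissPromptsSketch` are hypothetical names for illustration; the real signature of `dismissFirstRunPrompts` and the shape used by `prepareTemporarySimulator` are not shown in this review and may differ.

```typescript
// Hypothetical interface: the only filesystem-facing surface the function needs.
interface LogWriter {
  append(line: string): void;
}

// Test double that keeps log lines in memory instead of writing to disk.
class InMemoryLogWriter implements LogWriter {
  readonly lines: string[] = [];

  append(line: string): void {
    this.lines.push(line);
  }
}

// Hypothetical stand-in for dismissFirstRunPrompts: it logs through the
// injected writer, so a test never has to create a temp directory.
function dismissPromptsSketch(prompts: string[], logWriter: LogWriter): number {
  for (const prompt of prompts) {
    logWriter.append(`dismissed: ${prompt}`);
  }
  return prompts.length;
}
```

A test can then pass an `InMemoryLogWriter` and assert on its `lines` array, while production code passes a writer backed by the real filesystem.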

Benchmark test spawns real python3 subprocess, bypassing executor safety overrides

The `runParserScript` helper calls `spawn('python3', args)` directly from `node:child_process`. This makes `npm test` depend on Python 3 being installed and bypasses the framework-executor overrides in `vitest-executor-safety.setup.ts`. Wrap the parser invocation in an injectable function so tests can stub it.
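
One possible shape for that injectable function, as a sketch only. `ParserRunner`, `spawnPythonParser`, `parseBenchmarkLog`, and the `parse_log.py` script name are hypothetical and do not come from the PR; only `spawn` from `node:child_process` is a real API.

```typescript
import { spawn } from "node:child_process";

// Hypothetical result shape for one parser invocation.
interface ParserResult {
  stdout: string;
  exitCode: number;
}

// Hypothetical seam: anything that can run the parser with the given arguments.
type ParserRunner = (args: string[]) => Promise<ParserResult>;

// Default runner for production use; this is the only place a real process is spawned.
const spawnPythonParser: ParserRunner = (args) =>
  new Promise<ParserResult>((resolve, reject) => {
    const child = spawn("python3", args);
    let stdout = "";
    child.stdout.on("data", (chunk: Buffer) => {
      stdout += chunk.toString();
    });
    child.on("error", reject);
    child.on("close", (code) => resolve({ stdout, exitCode: code ?? 1 }));
  });

// The caller takes the runner as a parameter, so a test can pass a stub
// and never touch python3. The script name here is an assumption.
async function parseBenchmarkLog(
  logPath: string,
  run: ParserRunner = spawnPythonParser,
): Promise<unknown> {
  const { stdout, exitCode } = await run(["parse_log.py", logPath]);
  if (exitCode !== 0) {
    throw new Error(`parser exited with code ${exitCode}`);
  }
  return JSON.parse(stdout);
}
```

In a unit test, the second argument would be a stub that returns canned `stdout`, which keeps the run independent of a local Python installation and leaves the executor safety overrides in effect.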

⏱ 4m 17s · 1.1M in / 44.8k out · $2.37
