feat(benchmarks): Add Claude UI benchmark harness #427
3 issues
xcodebuildmcp-test-boundary-review: Found 3 issues (1 high, 2 medium)
High
Unit test spawns real `python3` process without injection - `src/benchmarks/claude-ui/__tests__/claude-ui-benchmark.test.ts:22-39`
The `runParserScript` helper spawns a real `python3` process via `node:child_process` directly, bypassing the safety setup's executor overrides; this test calls an actual external binary in the unit test run.
Medium
Tests use real OS filesystem because log-writer is not injected into `dismissFirstRunPrompts` - `src/benchmarks/claude-ui/__tests__/first-run-preflight.test.ts:24-29`
These tests create real temp directories and read actual files from disk to verify log output; the filesystem dependency should be injected (as `logWriter`) so tests can stay fully in-memory, consistent with the pattern used in `prepareTemporarySimulator`.
Benchmark test spawns real python3 subprocess, bypassing executor safety overrides
The `runParserScript` helper calls `spawn('python3', args)` directly from `node:child_process`, which makes `npm test` depend on Python 3 being installed and bypasses the framework-executor overrides in `vitest-executor-safety.setup.ts`; wrap the parser invocation in an injectable function so tests can stub it.
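One way the suggested injection could look is sketched below. The names `ParserRunner`, `ParserResult`, `parseBenchmarkLog`, `createStubRunner`, and the script name `parse_log.py` are hypothetical and do not come from the PR; the sketch only assumes the parser prints JSON to stdout.

```typescript
// Hypothetical names throughout; none of these are taken from the PR.
interface ParserResult {
  exitCode: number;
  stdout: string;
  stderr: string;
}

// The seam: production wiring would pass a runner backed by the project's
// executor, while tests pass a stub and never touch node:child_process.
type ParserRunner = (scriptPath: string, args: string[]) => Promise<ParserResult>;

// Code under test receives the runner instead of spawning python3 itself.
async function parseBenchmarkLog(
  logPath: string,
  runParser: ParserRunner,
): Promise<unknown> {
  // 'parse_log.py' is an assumed script name used for illustration only.
  const result = await runParser('parse_log.py', [logPath]);
  if (result.exitCode !== 0) {
    throw new Error(`parser failed (${result.exitCode}): ${result.stderr}`);
  }
  return JSON.parse(result.stdout);
}

// Test-side stub: records each call and returns a canned result.
function createStubRunner(result: ParserResult) {
  const calls: Array<{ scriptPath: string; args: string[] }> = [];
  const run: ParserRunner = async (scriptPath, args) => {
    calls.push({ scriptPath, args });
    return result;
  };
  return { run, calls };
}
```

With a seam like this, the unit test asserts on the recorded calls and the parsed value, and no Python installation is needed for `npm test`.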
⏱ 4m 17s · 1.1M in / 44.8k out · $2.37
Annotations
Check failure on line 39 in src/benchmarks/claude-ui/__tests__/claude-ui-benchmark.test.ts
github-actions / warden: xcodebuildmcp-test-boundary-review
Unit test spawns real `python3` process without injection
The `runParserScript` helper spawns a real `python3` process via `node:child_process` directly, bypassing the safety setup's executor overrides; this test calls an actual external binary in the unit test run.
Check warning on line 29 in src/benchmarks/claude-ui/__tests__/first-run-preflight.test.ts
github-actions / warden: xcodebuildmcp-test-boundary-review
Tests use real OS filesystem because log-writer is not injected into `dismissFirstRunPrompts`
These tests create real temp directories and read actual files from disk to verify log output; the filesystem dependency should be injected (as `logWriter`) so tests can stay fully in-memory, consistent with the pattern used in `prepareTemporarySimulator`.
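A minimal sketch of the `logWriter` injection, under stated assumptions: the real signature of `dismissFirstRunPrompts` is not shown in this review, so `dismissPromptsSketch` below is a hypothetical stand-in, and the `LogWriter` interface and `createInMemoryLogWriter` helper are likewise illustrative.

```typescript
// Hypothetical shapes; the actual dismissFirstRunPrompts signature is not
// shown in the review, so this stand-in only illustrates the injection.
interface LogWriter {
  write(line: string): void;
}

// In-memory writer for tests: collects lines instead of touching the disk.
function createInMemoryLogWriter(): LogWriter & { lines: string[] } {
  const lines: string[] = [];
  return {
    lines,
    write(line: string): void {
      lines.push(line);
    },
  };
}

// Stand-in for the function under test. It logs through the injected writer,
// so the caller decides whether lines go to a file or to memory.
function dismissPromptsSketch(prompts: string[], logWriter: LogWriter): number {
  let dismissed = 0;
  for (const prompt of prompts) {
    logWriter.write(`dismissed: ${prompt}`);
    dismissed += 1;
  }
  return dismissed;
}
```

A test can then assert directly on `lines` without creating a temp directory, while production code would pass a file-backed writer with the same interface.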