test(ci): de-flake wall-clock-sensitive tests (fuzzy-match perf + AgentModelsTab toast) by alan5543 · Pull Request #213 · Beever-AI/beever-atlas

alan5543 · 2026-06-01T04:43:42Z

Why

CI / Backend (Python 3.12) failed on the push to main after #211 merged — not because of #211, but because test_fuzzy_match_10k_under_500ms asserts a hard 500ms wall-clock bound that shared GitHub runners can't reliably hit (636ms on that run; ~200ms locally). The same class of failure hit AgentModelsTab.test.tsx on #209's CI earlier the same day (toast waitFor timed out at 5082ms vs its 5000ms budget).

What

| Test | Before | After |
| --- | --- | --- |
| `test_fuzzy_match_10k_under_500ms` | single run, hard `<500ms` | best-of-3 runs, `<500ms` locally / `<1500ms` when `CI=true`; renamed to `test_fuzzy_match_10k_perf_budget` |
| `AgentModelsTab` preset toast | inner `waitFor` 5s, outer 15s | inner 10s, outer 20s |

The perf bound is relaxed, not removed — it still catches the O(n²)-style algorithmic regressions it exists for.
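For readers who want the shape of the change, here is a minimal sketch of a best-of-3 timing check with a CI-dependent budget. It is not the code from this PR: `best_of`, `perf_budget_seconds`, and the `fuzzy_match_10k` workload are hypothetical stand-ins, and only the numbers (3 runs, 500ms locally, 1500ms when `CI=true`) come from the description above.

```python
import os
import time
from typing import Callable


def best_of(fn: Callable[[], object], runs: int = 3) -> float:
    """Return the fastest wall-clock duration in seconds over `runs` calls."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    # The minimum is the run least disturbed by scheduler or CPU-frequency noise.
    return min(timings)


def perf_budget_seconds(local: float = 0.5, ci_multiplier: float = 3.0) -> float:
    """Return 0.5s locally and 3x that (1.5s) when the CI variable is "true"."""
    if os.environ.get("CI", "").lower() == "true":
        return local * ci_multiplier
    return local


def fuzzy_match_10k() -> None:
    # Hypothetical stand-in for the real 10k-entry fuzzy-match workload.
    names = [f"entity-{i}" for i in range(10_000)]
    _ = [name for name in names if "42" in name]


def test_fuzzy_match_10k_perf_budget() -> None:
    elapsed = best_of(fuzzy_match_10k, runs=3)
    budget = perf_budget_seconds()
    assert elapsed < budget, (
        f"best-of-3 took {elapsed * 1000:.0f}ms, budget {budget * 1000:.0f}ms"
    )
```

Taking the minimum rather than the mean is a deliberate choice in this sketch: one slow run caused by a noisy neighbour on a shared runner does not fail the test, while a genuinely slower algorithm is slow on every run and still trips the bound.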

Verification

- `pytest tests/test_graph_protocol.py -k fuzzy` → 9 passed; ruff format/check clean
- `vitest run AgentModelsTab.test.tsx` → 5 passed

🤖 Generated with Claude Code

Two tests assert wall-clock timing and intermittently fail on shared GitHub runners (both bit us on 2026-06-01):

- test_fuzzy_match_10k_under_500ms: hard 500ms bound, failed at 636ms on a push run to main with code that runs ~200ms locally. Now best-of-3 runs (damps scheduler/CPU-frequency noise) with a 3x budget when CI=true. Renamed to test_fuzzy_match_10k_perf_budget. Still catches what it exists for: order-of-magnitude algorithmic regressions.
- AgentModelsTab "preset card" toast: inner waitFor 5000ms timeout failed at 5082ms on a PR run. Bumped to 10s inner / 20s outer.

Constraint: perf test must still catch O(n^2) regressions; bound relaxed, not removed
Rejected: skipping perf test on CI entirely | loses regression coverage where it matters most
Confidence: high
Scope-risk: narrow
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
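The headroom argument can be checked with simple arithmetic. The sketch below uses only the figures quoted in this PR (about 200ms locally, 636ms on the failing CI run, 500ms and 1500ms budgets). The 10x slowdown factor is an assumption standing in for an "order-of-magnitude" regression, not a measurement, and `within_budget` is a hypothetical helper.

```python
LOCAL_BASELINE_MS = 200   # "~200ms locally" per the PR description
FLAKY_CI_RUN_MS = 636     # the CI run that failed the old hard bound
LOCAL_BUDGET_MS = 500     # budget when CI is not set
CI_BUDGET_MS = 1500       # 3x budget when CI=true

# Assumption: an "order-of-magnitude" regression is modelled as a 10x slowdown.
REGRESSION_FACTOR = 10


def within_budget(elapsed_ms: float, budget_ms: float) -> bool:
    """Mirror the test's strict less-than comparison."""
    return elapsed_ms < budget_ms


# Runner noise: the same code failed the old bound but fits the CI budget.
noise_old = within_budget(FLAKY_CI_RUN_MS, LOCAL_BUDGET_MS)
noise_new = within_budget(FLAKY_CI_RUN_MS, CI_BUDGET_MS)

# Regression: a 10x slowdown exceeds the relaxed CI budget, both from the
# local baseline and from the slower timing observed on the shared runner.
regressed_local_ms = LOCAL_BASELINE_MS * REGRESSION_FACTOR
regressed_ci_ms = FLAKY_CI_RUN_MS * REGRESSION_FACTOR
caught_local = not within_budget(regressed_local_ms, CI_BUDGET_MS)
caught_ci = not within_budget(regressed_ci_ms, CI_BUDGET_MS)

print(noise_old, noise_new, caught_local, caught_ci)
```

Under these assumptions the relaxed budget absorbs the observed runner noise while a tenfold slowdown still fails, which is the trade-off the "Constraint" trailer describes.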

alan5543 merged commit 30ed8d1 into main Jun 1, 2026
9 checks passed

alan5543 deleted the fix/deflake-ci-timing-tests branch June 1, 2026 04:48

alan5543 mentioned this pull request Jun 3, 2026

feat(adk): ADK 2.x follow-ups — Workflow graph pipeline + native tool-error callback #222

Merged

alan5543 merged 1 commit into main from fix/deflake-ci-timing-tests
