Releases: CTlanston/claude-code-247
v1.0.0 — Local-first 24/7 multi-repo Claude Code coworker
Summary
This is the first GA release of claude-code-247.
It provides a local-first, multi-repo, 24/7 Claude Code coworker system using:
- local Claude Code worker mode (subscription auth, no Anthropic API key required)
- launchd daemon runtime (4 services: dashboard / orchestrator / dispatcher / backup)
- Docker-based worker execution
- GitHub as source-of-truth collaboration plane
- FastAPI + HTMX dashboard with a real-time Apple-style watchdog page
- CLI / Claude Remote-compatible command surface
- SQLite-backed state machine
- real Gemini 2.5 Pro validator
- real OpenAI-compatible validator
- risk scoring
- guarded auto-merge
- failure replay
- per-phase worker_exits instrumentation
Production proof (already shipped in v1.0.0-beta.2)
v1.0.0-beta.2 proved the pure auto-merge production path end-to-end on a
real GitHub repository (auto-evo-playground#59):
| Item | Value |
|---|---|
| Worker auth | local_claude_code (subscription) |
| Anthropic worker spend | $0.00 |
| Gemini real validator | PASS confidence 1.0 |
| OpenAI real validator (gpt-4o) | PASS confidence 1.0 |
| Risk score | 0 (low) |
| Merge ruling | AUTO_MERGE |
| Real PR auto-merged in | 3 seconds (pr_created → merged) |
M21 hardening (on top of beta.2)
- worker_exits schema v4: phase lifecycle columns status / started_at / finished_at / error_type via additive migration
- dispatcher instrumentation for 7 phases (prepare_workspace, worker, validators, risk_score, merge_policy, push, open_pr, auto_merge), each carrying phase-specific metadata
- claude247 task-phases --task <id> CLI alias
- 6 failure-mode integration drills, all PASS: secret-in-diff blocks merge / validator disagreement blocks / high-risk path blocks / budget exceeded defers task / stop-all emergency kill / gh merge failure records phase exit
- 19-gate GA_GATE.md contract, with explicit POST_GA_BACKLOG so no safety check gets quietly demoted
- 526 → 566 tests passing (added 40 tests for the M22b watchdog read-only status board)
M22b — read-only watchdog status board
M22b adds a new read-only watchdog dashboard:
- claude247 status-board / claude247 watchdog CLI (plain / JSON / Markdown output)
- /status-board HTML page with Apple-style Activity ring for soak progress, auto-refresh every 15s (configurable), pause/resume, live indicator dot, EN ↔ 中文 language toggle, dark mode auto-follow
- /status-board.json machine-readable endpoint
- Live aggregation of: release state, soak progress, launchd / dispatcher / dashboard / notifier / doctor health, queue + signals, GA gate status (parsed from GA_GATE.md), worker usage (runs / cost / active workers / by role / by auth_mode)
- Strict read-only contract: a regression test asserts that invoking the CLI does not change any row count in tasks, commands, system_state, logs, or alerts. Safe to run while the dispatcher is mid-tick.
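The read-only contract can be illustrated with a row-count snapshot taken before and after the call under test. This is a minimal sketch, not the project's actual regression test: the table names come from the notes above, while `row_counts`, `assert_read_only`, and the `action` callable (a stand-in for invoking the status-board CLI) are hypothetical.

```python
import sqlite3
from typing import Callable, Dict

# Table names are taken from the release notes; their schemas are not shown there.
WATCHED_TABLES = ("tasks", "commands", "system_state", "logs", "alerts")


def row_counts(conn: sqlite3.Connection) -> Dict[str, int]:
    """Return the current row count of every watched table."""
    return {
        table: conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        for table in WATCHED_TABLES
    }


def assert_read_only(conn: sqlite3.Connection, action: Callable[[], object]) -> None:
    """Run `action` (hypothetical stand-in for the CLI call) and fail on any count change."""
    before = row_counts(conn)
    action()
    after = row_counts(conn)
    assert before == after, f"row counts changed: {before} -> {after}"
```

A row-count comparison only catches inserts and deletes; an in-place UPDATE would pass unnoticed, so a stricter check might also hash table contents.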
Explicit soak waiver
The original 24h soak gate was NOT fully completed before this release.
The owner explicitly waived the full 24h soak requirement after
~9h+ of healthy soak evidence, with all observable signals green:
| Signal | Observed |
|---|---|
| Elapsed soak at decision time | ~9h 12m / 24h (~38.4%) |
| launchd services loaded | 4 / 4 |
| Dashboard /healthz | OK |
| Dispatcher healthy idle ticks | ~1182 (every 30s, all "idle: queue empty") |
| Backup job (daily 03 local) | Completed — claude247-20260525T011703Z.db (2.06 MB) created; dispatcher continued unaffected after |
| Active tasks | 0 |
| Stuck tasks | 0 |
| Orphan running commands | 0 |
| New alerts since T0 | 0 |
| Structured log errors since T0 | 0 |
| Anthropic worker spend since T0 | $0.0000 |
Known yellow flag
One early SQLite schema-migration race occurred at T0 + 7 minutes.
It was a one-time transient failure: dispatcher.err.log recorded a
single sqlite3.OperationalError: no such column: started_at
traceback (originating in memory/db.py::init_db), then the file
stopped growing. The subsequent ~1182 dispatcher ticks all succeeded.
The error did not repeat. Most plausible root cause: schema.sql
referenced the new M21-P2 columns before the in-place ALTER TABLE
migration finished on that particular tick.
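One way to rule out this class of race is to make the column addition idempotent and to run it before any statement that references the new columns. The sketch below is an illustration under assumptions, not the code in memory/db.py: the column names come from the M21 notes, while `ensure_phase_columns` and the TEXT column types are hypothetical.

```python
import sqlite3
from typing import List

# Column names come from the M21 notes; the TEXT type is an assumption.
PHASE_COLUMNS = {
    "status": "TEXT",
    "started_at": "TEXT",
    "finished_at": "TEXT",
    "error_type": "TEXT",
}


def ensure_phase_columns(conn: sqlite3.Connection, table: str = "worker_exits") -> List[str]:
    """Add any missing phase-lifecycle columns and return the names that were added."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    added = []
    for name, sql_type in PHASE_COLUMNS.items():
        if name not in existing:
            conn.execute(f"ALTER TABLE {table} ADD COLUMN {name} {sql_type}")
            added.append(name)
    return added
```

Calling this at startup, before loading any SQL that mentions started_at, keeps a second call harmless because it finds nothing left to add.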
Post-GA follow-up (required)
- Record the final T+24h soak result after wall-clock crosses 2026-05-25T21:46Z (the dispatcher T0 plus 24h). The watchdog dashboard auto-detects this and will flip soak.result from PARTIAL to PASS (or FAIL).
- File the result in a follow-up M22c_SOAK_FINAL.md.
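The PARTIAL to PASS (or FAIL) flip can be pictured as a small function of T0 and the current time. This is a sketch under assumptions: `soak_status` and its `healthy` flag (standing in for "all observed signals green") are hypothetical, and the real watchdog may decide FAIL differently.

```python
from datetime import datetime, timedelta, timezone

SOAK_WINDOW = timedelta(hours=24)


def soak_status(t0: datetime, now: datetime, healthy: bool) -> dict:
    """Summarize soak progress; `healthy` is a hypothetical all-signals-green flag."""
    elapsed = now - t0
    # Dividing two timedeltas gives a float; clamp it to the range [0, 1].
    fraction = max(0.0, min(elapsed / SOAK_WINDOW, 1.0))
    if elapsed < SOAK_WINDOW:
        result = "PARTIAL"
    else:
        result = "PASS" if healthy else "FAIL"
    return {"result": result, "percent": round(fraction * 100, 1)}
```

With T0 at 2026-05-24T21:46Z, any `now` before 2026-05-25T21:46Z yields PARTIAL, which matches the state recorded at release time.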
Safety model (unchanged from beta.2)
Auto-merge is guarded by, in order:
- risk scoring (orchestrator/risk_score.py)
- allowed / forbidden path policy
- secret scanner (drill-tested: blocks merge if any match)
- real Gemini validator
- real OpenAI-compatible validator
- validator-disagreement gate (drill-tested: routes to human)
- high-risk block (drill-tested: never auto-merges)
- failure replay (claude247 replay)
- immutable command audit trail in commands table
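How these gates combine into a merge ruling can be sketched as a short decision function. This is a simplified illustration, not the logic in the project's merge policy: `GateInputs`, `merge_ruling`, and the "BLOCKED" label are hypothetical, and only AUTO_MERGE and WAITING_APPROVAL appear in these notes.

```python
from dataclasses import dataclass


@dataclass
class GateInputs:
    secret_hits: int      # matches reported by the secret scanner
    gemini_verdict: str   # e.g. "PASS" or "NEEDS_HUMAN"
    openai_verdict: str
    risk_level: str       # "low", "medium" or "high"


def merge_ruling(g: GateInputs) -> str:
    """Return AUTO_MERGE only when every gate passes; otherwise hold the merge."""
    if g.secret_hits > 0:
        return "BLOCKED"  # hypothetical label; the notes only say the merge is blocked
    if g.gemini_verdict != g.openai_verdict:
        return "WAITING_APPROVAL"  # validator disagreement routes to a human
    if g.gemini_verdict != "PASS":
        return "WAITING_APPROVAL"  # agreement on a non-PASS verdict still needs a human
    if g.risk_level != "low":
        return "WAITING_APPROVAL"  # elevated risk never auto-merges in this sketch
    return "AUTO_MERGE"
```

The beta.2 production proof corresponds to the last branch: no secret hits, both validators PASS, and risk score 0 (low).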
The runtime safety gate system.allow_remote_writes defaults to
false. No git push, no PR merge, no GitHub write API call may
execute unless this flag is true AND the repo is enabled: true
in repos.yaml.
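The two-condition gate can be expressed as a single fail-closed check. A minimal sketch, assuming the system config and the repo entry are already parsed into dictionaries; `remote_write_allowed` is a hypothetical name, not a function from the project.

```python
def remote_write_allowed(system_cfg: dict, repo_cfg: dict) -> bool:
    """True only if the global flag is on AND the repo entry is enabled."""
    # `is True` rejects missing keys and truthy non-booleans such as the
    # string "true", so an incomplete or mistyped config fails closed.
    return system_cfg.get("allow_remote_writes") is True and repo_cfg.get("enabled") is True
```

Any push, PR merge, or GitHub write call would sit behind this check and be skipped when it returns False.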
Quick start
make install # venv + deps + launchd plists
claude247 doctor
claude247 repo add # CLI onboarding wizard
claude247 status-board --plain # read-only watchdog
open http://127.0.0.1:8423/status-board # live watchdog page
claude247 start --repo my-repo --goal "refactor X" # kick off a task
Remaining post-GA backlog
These are explicit non-blockers, filed so they don't get promoted to
blocker via informal scope creep:
- Record final T+24h soak result (the carry-over from the waiver)
- Optional deeper multi-day soak observation (7-day, etc.)
- Schema v5 migration to add runs.input_tokens / runs.output_tokens columns + worker write-through, enabling token-level rate display in the watchdog dashboard
- More advanced dashboard analytics (cost trends, per-repo PR throughput)
- Multi-machine HA (single-Mac is by design)
- Cloud-hosted dashboard with team RBAC (local-first is by design)
- Qdrant live test with a real embedding key
- BSD-sed compatibility fix for scripts/doctor_launchd.sh
- doctor improvements:
  - detect "the dashboard busy on port 8423 is OUR own daemon" rather than warning
  - validator-key check should use load_runtime_config() instead of bare os.environ so it reflects what the dispatcher actually sees
Pre-release lineage
| Tag | Date | Headline |
|---|---|---|
| v1.0.0-alpha.0 | 2026-05-24 | v1 transformation complete |
| v1.0.0-alpha.1 | 2026-05-24 | Real multi-repo E2E validation |
| v1.0.0-beta.0 | 2026-05-24 | Beta-readiness milestone |
| v1.0.0-beta.1 | 2026-05-24 | Beta stabilization |
| v1.0.0-beta.2 | 2026-05-24 | Pure auto-merge production proof |
| v1.0.0 | 2026-05-25 | GA release with explicit owner soak waiver |
v1.0.0-beta.2 — Pure auto-merge production proof
Eight-commit milestone (M20) on top of v1.0.0-beta.1. The headline: the system landed a real PR on a real GitHub repo with both real validators returning PASS, no human approval, and $0 Anthropic worker spend — the missing claim from beta.1.
See M20_PRODUCTION_PROOF_REPORT.md for the full report.
Headline
| Item | Value |
|---|---|
| Validator panel | Gemini PASS conf 1.0 + OpenAI PASS conf 1.0 (real, gpt-4o) |
| Risk | 0 (low) |
| Merge ruling | AUTO_MERGE |
| PR | auto-evo-playground#59 (MERGED) |
| merge commit | 6c583d5e1b39f03f800591770fe2b715340f12a7 |
| Worker auth | local_claude_code (subscription) |
| Anthropic worker spend | $0.00 |
| Total wall clock (enqueue → merged) | 3m 28s |
What got fixed (M20-P1b through M20-P3j)
The first M20-P3 E2E attempt didn't auto-merge. Five distinct evidence-pipeline gaps surfaced over the iteration series; each got a regression-tested fix.
| Fix | Commit | What it does |
|---|---|---|
| M20-P1b | 5aed874 |
env_loader.discover_env_paths() probes <user_config_dir>/secrets.env (launchd-spawned daemons need this — no shell wrapper to source it). Also strips bash export VAR=value prefix. |
| M20-P3b | 1eb9d10 |
gateway/commands/dispatcher_cmd.py calls env_loader.load_chain(discover_env_paths(cwd=None)) instead of the legacy single-file load(). |
| M20-P3d | 1c18d49-adjacent |
Default OpenAI model gpt-5 (org-gated) → gpt-4o (widely available, supports response_format: json_object). Operators can opt up via config.validators.openai.model. |
| M20-P3g | 1c18d49 |
EvidenceCollector resolves base ref from task_spec.default_branch (with origin/<branch> and HEAD fallbacks) instead of always using HEAD. Necessary because Claude CLI commits its work mid-roleloop. |
| M20-P3i | ef30853 |
snapshot_diff_body_safe includes untracked files via git ls-files --others, synthesizing new-file diffs. Claude CLI often writes new files without staging. |
| M20-P3j | cca1858 |
runner/role_loop.py calls a new _refresh_diff_evidence helper after the coder and after each repair so the internal reviewer sees fresh state. Without this, the stale review.md was misleading the external OpenAI validator. |
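The M20-P1b fix for the bash export prefix can be illustrated with a small line parser. This is a sketch of the general technique, not the project's env_loader: `parse_env_line` is a hypothetical helper, and its quote handling is an assumption.

```python
from typing import Optional, Tuple


def parse_env_line(line: str) -> Optional[Tuple[str, str]]:
    """Parse one line of a .env-style file into (name, value), or None to skip it."""
    text = line.strip()
    if not text or text.startswith("#"):
        return None  # blank line or comment
    if text.startswith("export "):
        # Accept shell-style `export VAR=value` by dropping the keyword.
        text = text[len("export "):].lstrip()
    name, sep, value = text.partition("=")
    name = name.strip()
    if not sep or not name:
        return None  # not an assignment
    value = value.strip()
    # Drop one pair of matching surrounding quotes, if present.
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
        value = value[1:-1]
    return name, value
```

Accepting the export form means one secrets.env file can be both sourced from a shell and read directly by a launchd-spawned daemon.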
Operational additions
- launchd daemon mode installed and verified (4 services: dashboard KeepAlive + orchestrator / dispatcher / backup scheduled). Dashboard /healthz is live; all 4 plists now set CLAUDE247_CONFIG.
- 24h soak plan written with baseline + health-check commands + failure conditions + stop/uninstall paths; see M20_SOAK_PLAN.md.
Validated against CTlanston/auto-evo-playground
The clamp utility task ran end-to-end:
| Stage | Outcome |
|---|---|
| Worker (subscription claude CLI) | Wrote clamp(value, min_value, max_value) + 13 unit tests |
| Workspace pytest | 92 passed |
| BR-001 safe diff body | clean, real implementation visible, secret-scan PASS |
| BR-002 env chain | OPENAI_API_KEY + GEMINI_API_KEY loaded from secrets.env via the M20-P1b chain |
| BR-003 worker_exits | 3 rows, all tests/success |
| Gemini real verdict | PASS conf 1.0 |
| OpenAI real verdict (gpt-4o) | PASS conf 1.0 |
| Risk score | 0 (low) |
| Merge policy decision | AUTO_MERGE |
| Auto-merge wall-clock | 3 seconds (pr_created → merged) |
Test posture
$ .venv/bin/python -m pytest -q --no-cov
502 passed in ~13s
- BR-001 / BR-002 / BR-003 / M19-F1 fix: as shipped in beta.1 (492)
- M20-P1b: +5 tests (secrets.env discovery, collision rules, missing-file graceful, ANTHROPIC-in-env doesn't flip mode, bash export strip)
- M20-P3g: +2 tests (committed agent-branch visible, default_branch fallback)
- M20-P3i: +3 tests (untracked new file in diff, forbidden untracked omitted, .evidence/ not listed)
What's still pending
- The 24h soak observation itself: baseline is recorded, but the t+1h / t+6h / t+24h checkpoints in M20_SOAK_PLAN.md need an operator to run (or for the daemons to simply sit idle for 24h).
- Deeper worker_exits instrumentation outside run_named_commands.
- Qdrant live test (key not present in test env).
Pre-release status
This is still a pre-release. With M20 done, the system has now
demonstrated a real end-to-end auto-merge with two real top-shelf
validators, real GitHub writes, real auth, $0 Anthropic worker spend,
and launchd 24/7 readiness. The remaining gap before a v1.0.0 GA
tag is the observed soak window and any deeper instrumentation /
hardening you want before lifting the pre-release flag.
v1.0.0-beta.1 — Beta stabilization
Eight-commit milestone (M19) built on top of v1.0.0-beta.0. The job was not to add product features — it was to close the three beta-readiness backlog items, prove BR-001 + BR-002 + BR-003 working end-to-end on a real GitHub repo, and ship release+main consistency so the public page reflects reality.
See M19_BETA_STABILIZATION_REPORT.md and REAL_E2E_REPORT_M19.md for the full synthesis.
What's in
| Phase | Title | Commit |
|---|---|---|
| M19-P0 | Remote consistency report + missing GH release for beta.0 | e2766f2 |
| M19-P1 | BR-001 — safe diff body to validators | c6d6aeb |
| M19-P2 | BR-002 — deterministic env + config resolution | b65e7f6 |
| M19-P3 | BR-003 — per-phase worker_exits observability | 687144c |
| M19-P4 | Full test pass + doctor + status recorded | 1e53ce2 |
| M19-P5 | Third real E2E (first run surfaced finding M19-F1) | 71dbb40 |
| M19-P5b | Fix M19-F1: secret_scanner false-positive on lowercase Python vars | 396c153 |
| M19-P5c | E2E rerun proving fix in production: Gemini real PASS conf 1.0 | 17cffac |
What this delivers
BR-001 — Safe diff body to validators
- New EvidenceCollector.snapshot_diff_body_safe() produces .evidence/diff_body_safe.md and .evidence/diff_body_metadata.json.
- Default forbidden-path floor (.env, secrets/**, .github/**, CLAUDE.md, AGENTS.md, PEM/key files) merged with task spec.
- Per-file git diff filtered through orchestrator.secret_scanner; any hit → body redacted to a summary + secret_scan.status = BLOCKED.
- Per-file and total byte caps with truncation marker.
- JudgeInput gained diff_body_safe + diff_body_metadata; evidence_prompt() adds a ## DIFF_BODY section + the directive instruction telling the validator to return NEEDS_HUMAN if the body is insufficient.
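The forbidden-path floor can be pictured as a glob filter applied to changed paths before any diff body is collected. A minimal sketch under assumptions: `is_forbidden` and `visible_paths` are hypothetical helpers, "*.pem" and "*.key" are assumed spellings of "PEM/key files", and the project's real matcher may treat globs differently.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List

# Floor taken from the notes; "*.pem" and "*.key" are assumptions.
DEFAULT_FORBIDDEN = [".env", "secrets/**", ".github/**", "CLAUDE.md", "AGENTS.md", "*.pem", "*.key"]


def is_forbidden(path: str, patterns: Iterable[str]) -> bool:
    """True if the repo-relative path matches any forbidden glob."""
    # fnmatchcase lets "*" match "/" as well, so "secrets/**" covers nested files.
    return any(fnmatchcase(path, pattern) for pattern in patterns)


def visible_paths(changed: Iterable[str], task_forbidden: Iterable[str] = ()) -> List[str]:
    """Keep only changed paths that no default or task-level pattern forbids."""
    patterns = list(DEFAULT_FORBIDDEN) + list(task_forbidden)
    return [p for p in changed if not is_forbidden(p, patterns)]
```

In this sketch the bare ".env" pattern matches only a top-level file; covering nested copies would need an extra pattern such as "*/.env".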
BR-002 — Deterministic env + config resolution
- New precedence chain for config: CLI --config > $CLAUDE247_CONFIG > $CLAUDE247_CONFIG_DIR/config.yaml > <cwd>/.claude247/config.yaml
- New precedence chain for env values: already-set os.environ wins absolutely; among .env files, project-root .env > CWD .env > <config_dir>/.env.
- New RuntimeConfig dataclass surfaced via doctor for diagnostics.
- launchd plists now set CLAUDE247_CONFIG so launchd-launched workers resolve config without shell profile.
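The config precedence chain reads naturally as a first-match lookup. The sketch below follows the order listed above; `resolve_config_path` is a hypothetical name, and checking file existence only for the CWD fallback is an assumption about behavior the notes do not spell out.

```python
from pathlib import Path
from typing import Mapping, Optional


def resolve_config_path(
    cli_config: Optional[str],
    env: Mapping[str, str],
    cwd: Path,
) -> Optional[Path]:
    """Return the first config candidate in precedence order, or None."""
    if cli_config:  # 1. explicit --config flag
        return Path(cli_config)
    if env.get("CLAUDE247_CONFIG"):  # 2. full path from the environment
        return Path(env["CLAUDE247_CONFIG"])
    if env.get("CLAUDE247_CONFIG_DIR"):  # 3. config directory from the environment
        return Path(env["CLAUDE247_CONFIG_DIR"]) / "config.yaml"
    candidate = cwd / ".claude247" / "config.yaml"  # 4. project-local fallback
    return candidate if candidate.exists() else None
```

Passing `env` explicitly instead of reading os.environ inside the function keeps the lookup deterministic and easy to test, which is the point of BR-002.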
BR-003 — Per-phase worker_exits observability
- New worker_exits SQLite table (schema v3, additive migration).
- New classify_failure(phase, exit_code, command, stderr, verdict) heuristic returns one of 14 canonical labels (test_failure, claude_cli_failure, auth_failure, docker_failure, git_failure, github_failure, validator_failure, merge_policy_block, timeout, policy_block, …, unknown_failure).
- evidence_collector.run_named_commands writes a row per command.
- handle_explain_stuck surfaces worker_exits in its summary.
- New claude247 worker-exits --task <id> CLI with --plain / --json.
M19-F1 — Secret scanner false positive fixed
- The pre-existing env_var_assign regex used (?im), which made ordinary Python tokens = text.split() match (TOKEN + s, case-insensitive). Removed the i flag so only conventional all-uppercase env-var names match. Regression-tested.
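The effect of dropping the i flag can be reproduced with a stand-in pattern. The regex below is hypothetical, written in the spirit of env_var_assign; the project's actual pattern is not shown in these notes.

```python
import re

# Hypothetical pattern body; the project's real env_var_assign regex is not shown.
BODY = r"^[A-Z_]*(?:TOKEN|SECRET|KEY|PASSWORD)[A-Z_]*\s*=\s*\S+"

BEFORE = re.compile(r"(?im)" + BODY)  # case-insensitive + multiline (pre-fix behavior)
AFTER = re.compile(r"(?m)" + BODY)    # multiline only (post-fix behavior)

python_line = "tokens = text.split()"
env_line = "OPENAI_API_KEY=example-value"

# With the i flag, "token" satisfies TOKEN and the trailing "s" is absorbed by
# [A-Z_]*, so ordinary Python code is flagged as a secret assignment.
flagged_before = BEFORE.search(python_line) is not None
flagged_after = AFTER.search(python_line) is not None
still_caught = AFTER.search(env_line) is not None
```

Under these assumptions the Python line is flagged only by the case-insensitive variant, while the uppercase env assignment is caught by both.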
Validated against CTlanston/auto-evo-playground
Two live E2E runs of the same task (dedupe_words):
| Field | Run 1 (pre-M19-F1 fix) | Run 2 (post-M19-F1 fix) |
|---|---|---|
| PR | #54 (closed) | #55 |
| secret_scan.status | BLOCKED (false positive on tokens) | PASS (clean) |
| diff_body_safe.md | REDACTED summary | Full unified diff |
| Gemini verdict | NEEDS_HUMAN conf 0.1 | PASS conf 1.0 |
| OpenAI verdict | mock NEEDS_HUMAN | mock NEEDS_HUMAN |
| Final state | WAITING_APPROVAL | WAITING_APPROVAL |
| Anthropic spend | $0.00 | $0.00 |
Auto-merge was correctly held in both runs by orthogonal gates
(M18-P1's mock-validator block + the validator_disagreement factor's
medium-risk promotion). Per directive: we did not set up an OpenAI key
for this milestone, so we explicitly do not claim a full
auto-merge proof. What we do claim is much stronger than the
pre-fix state: BR-001 produces a real diff body that a real top-shelf
validator can read and judge correctly, and BR-002 picks up the
operator's existing GEMINI_API_KEY from ~/.claude-code-247/.env
without any shell-profile help.
Test posture
$ .venv/bin/python -m pytest -q --no-cov
492 passed in 13.07s
- BR-001: 18 new tests (evidence_diff_body_safe + validator_receives_diff_body + secret_hit_blocks_diff_body_validator + 1 tweak to test_judge_contract)
- BR-002: 28 new tests (env_loader_precedence + env_loader_cwd_support + doctor_reports_config_source + launchd_plist_sets_config_env)
- BR-003: 25 new tests (worker_exit_record + worker_exit_classification + explain_stuck_uses_worker_exit + integration test_failed_worker_writes_exit_record)
- M19-F1: 2 new tests in test_secret_scanner.py (lowercase var not flagged + uppercase env var still caught)
- Total Δ: +73 tests on top of the 419 baseline at beta.0.
Doctor
✓ config source: loaded /Users/lanston/.claude-code-247/config.yaml (kind=user); env files probed: 2
✓ auth mode: worker_mode=local_claude_code, usable=True
✓ sqlite db init: schema v3 at .../state/claude247.db
What's intentionally still not in scope for beta
- Multi-machine HA: single-Mac is by design.
- Cross-org auth: local-first, one user, one machine.
- Docker runner outside dev mode: local backend covers stated scope.
- Dashboard auth: binds to 127.0.0.1 deliberately.
- Real-validator auto-merge demo with both Gemini + OpenAI: gated on the operator setting OPENAI_API_KEY; the system is ready for it.
Pre-release
This is a pre-release. Production-ready (v1.0.0) is gated on a
clean full E2E with both real validators returning a real PASS
on a non-trivial diff. The plumbing is in place; the only missing
input is an OpenAI API key.
v1.0.0-beta.0 — Beta-readiness milestone
Five-phase hardening (M18-P0..P4) on top of v1.0.0-alpha.1. M18 explicitly avoided new features; the goal was to move the product from "alpha harness" to "beta-ready live-ops" by hardening auth, validators, daemon, webhooks, and proving a clean second end-to-end on a real GitHub repo.
See BETA_READINESS_REPORT.md for the full synthesis.
Phases shipped
| Phase | Title | Commit |
|---|---|---|
| M18-P0 | Subscription/local auth: worker_mode + no silent ANTHROPIC API fallback | 334ed46 |
| M18-P1 | Real OpenAI validator + mock-cannot-silently-pass-auto-merge gate | 9dacd5d |
| M18-P2 | launchd hardening: doctor_launchd.sh + extended doctor fields + plist tests | 712a639 |
| M18-P3 | Live ngrok webhook validation + explicit handle_ping | 5170197 |
| M18-P4 | Second real E2E proving reduced API spend + cleaner auto-merge path | d50949f |
Validated against CTlanston/auto-evo-playground (P4)
Real task: normalize_whitespace(text) queued via claude247 start, dispatched, role-loop ran, tests passed, PR opened (#53 draft), Gemini judged, merge policy routed to WAITING_APPROVAL.
| Item | Value |
|---|---|
| claude-code-247 tag commit | d50949f |
| pytest | 419 passing |
| Worker auth mode | local_claude_code (subscription CLI) |
| Anthropic API spend | $0.00 (down from ~$1.50 in alpha.1 for similar-shape task) |
| Gemini verdict | NEEDS_HUMAN (honest — see Finding 1 below) |
| OpenAI verdict | openai-mock (no key in env-loader scope — see Finding 2) |
| Merge decision | WAITING_APPROVAL (mock validator gate held correctly) |
What "beta-ready" means here
The documented product works end-to-end on a real GitHub repository, the auth path is honest about what it spends, validators are honest about what they ran with, the daemon path is inspectable, and live webhook delivery has been observed. It does not mean every backlog item is closed.
Mocked vs real
| Component | Mode |
|---|---|
| Claude Code worker | real (subscription, local CLI) |
| Gemini 2.5 Pro | real |
| OpenAI validator | mock (env-loader CWD scope — see Finding 2) |
| GitHub push + PR | real |
| Auto-merge | not exercised (validator gate correctly held) |
| Webhook receiver | real (P3 live ngrok delivery; 200 OK, signature verified) |
| Qdrant | sqlite-fts fallback |
| Docker runner | local subprocess (daemon offline) |
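The "signature verified" result for the webhook receiver refers to GitHub's HMAC-SHA256 scheme, where the X-Hub-Signature-256 header carries "sha256=" followed by the hex digest of the raw request body. The sketch below shows that check in isolation; `verify_github_signature` is a hypothetical helper, not the project's handler.

```python
import hashlib
import hmac


def verify_github_signature(secret: bytes, body: bytes, header: str) -> bool:
    """Check an X-Hub-Signature-256 header value against the raw request body."""
    if not header.startswith("sha256="):
        return False
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, so a mismatch does not leak its position.
    return hmac.compare_digest(expected.encode("utf-8"), header.encode("utf-8"))
```

The digest must be computed over the exact bytes received, before any JSON parsing, because re-serialized JSON may differ from what GitHub signed.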
Non-blocking backlog filed against beta.0
These were surfaced by the P4 live run and are filed as follow-up. They do not block the tag and are being addressed in the next milestone.
- BR-001: JudgeInput includes diff_summary.md (stat) but not the textual diff body. Real validators correctly refuse to verify byte-identical preservation without seeing the body. (caps real-validator PASS rate)
- BR-002: env_loader.load() reads ~/.claude-code-247/.env only; project/CWD .env is ignored, so an OPENAI_API_KEY in the active shell is not picked up and the validator runs as mock. (config UX)
- BR-003: Dispatcher worker_exit summary lacks phase/classification/stderr detail, making post-mortem of failed runs harder than it should be. (observability)
Intentionally not in scope for beta
- Multi-machine HA: single-Mac is by design.
- Cross-org auth: the product is local-first; one user, one machine.
- Docker runner outside dev mode: local backend covers the stated 24/7 single-Mac scope.
- Dashboard auth: it binds to 127.0.0.1 deliberately.
Pre-release
This is a pre-release. Production-ready (v1.0.0) is gated on the backlog items above being closed and a clean third E2E with real validators returning a real PASS.
v1.0.0-alpha.1 — Real multi-repo E2E validation
Validated against CTlanston/auto-evo-playground
Real task: Add slugify utility with tests (PR #51 merged via this system).
Validated path
CLI → command_queue → dispatcher → repo_registry → workspace clone
→ role loop (planner/coder/reviewer via claude --print, real)
→ tests (python3 -m pytest in workspace, 15 new passing tests)
→ evidence package (all spec §10 artifacts)
→ real Gemini 2.5 Pro validator (PASS, confidence 1.0)
→ mock OpenAI validator (propagated reviewer NEEDS_HUMAN)
→ risk score (40, medium — only factor=validator_disagreement)
→ merge_policy → WAITING_APPROVAL (correct per §9.5)
→ claude247 approve-merge → mark_ready → gh pr merge --squash --admin
→ merged to main
Results
| Item | Value |
|---|---|
| claude-code-247 commit | cd563e8 |
| auto-evo-playground merge commit | 7a3414f5 |
| PR | #51 (merged) |
| Tests (claude-code-247) | 368 passing |
| Tests (auto-evo-playground new) | 15 passing |
| Gemini verdict | PASS, conf 1.0 |
| OpenAI verdict | NEEDS_HUMAN (mock — no key) |
| Risk score | 40 (medium) |
| Merge decision | approval-path → merged |
Mocked vs real
| Component | Mode |
|---|---|
| Claude Code worker | real (subscription billed to ANTHROPIC_API_KEY) |
| Gemini 2.5 Pro | real |
| OpenAI validator | mock (no OPENAI_API_KEY) |
| GitHub push + PR + merge | real |
| Webhook receiver | not exercised (needs public endpoint) |
| Qdrant | sqlite-fts fallback |
| ntfy | configured only; live push not verified |
| Docker runner backend | local subprocess (Docker daemon offline) |
6 live-discovery fixes shipped this release
- claude CLI 2.1.142: prompt now via stdin (--allowedTools <tools...> is variadic and swallowed the positional)
- Honest auth_mode: detects ANTHROPIC_API_KEY and labels anthropic_api instead of silently saying local_claude_code
- Spec test_command guidance: use python3 -m pytest (worker subprocess doesn't inherit venv bin)
- gh pr merge --auto opt-in via repo.auto_merge.auto: true (was always-on; failed on most repos)
- gh pr merge --admin opt-in via repo.auto_merge.admin: true (needed for protected base branches)
- gh pr ready auto-called before merge (drafts are rejected by GitHub merge API)
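The last three fixes amount to building the gh invocation from per-repo opt-ins instead of hard-coding flags. A minimal sketch: `merge_commands` is a hypothetical helper that only assembles argument lists, and the --squash default mirrors the validated path above; nothing here runs gh.

```python
from typing import List


def merge_commands(pr_number: int, auto: bool = False, admin: bool = False) -> List[List[str]]:
    """Build the gh invocations for merging a PR, with --auto/--admin as opt-ins."""
    # Mark the PR ready first: draft PRs are rejected by the merge API.
    commands = [["gh", "pr", "ready", str(pr_number)]]
    merge = ["gh", "pr", "merge", str(pr_number), "--squash"]
    if auto:   # corresponds to repo.auto_merge.auto: true
        merge.append("--auto")
    if admin:  # corresponds to repo.auto_merge.admin: true
        merge.append("--admin")
    commands.append(merge)
    return commands
```

Returning argument lists rather than a shell string lets each command go to subprocess.run without shell quoting, and makes the opt-in logic testable without touching GitHub.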
Remaining gaps (next milestones)
- OPENAI_API_KEY to get both-real validator panel
- scripts/install_launchd.sh to enable 24/7 daemon mode
- Live webhook test against a real GitHub repo (needs ngrok / public host)
- Live Qdrant verification (currently sqlite-fts fallback)
See REAL_E2E_REPORT.md for the full step-by-step walk.
v1.0.0-alpha.0 — production v1 transformation complete
What this is
claude-code-247 is a local-first, multi-repo, 24/7 autonomous coding coworker. The Mac stays on; one orchestrator process (run under launchd) dispatches per-task workers inside Docker containers, talks to GitHub as the source of truth, and exposes a FastAPI + HTMX dashboard plus a mobile-friendly claude247 CLI for remote control.
Status
- 348 tests passing (<8s on Python 3.13)
- 49/49 acceptance items in DEFINITION_OF_DONE.md
- Original 28-item spec (M0–M10) + 6 follow-up milestones (M11, M11.5, M12, M13, M14, M15)
- Live Gemini 2.5 Pro adapter verified end-to-end
Quick start
git clone https://github.com/CTlanston/claude-code-247.git
cd claude-code-247
make install
claude247 doctor
# Add a repo + start a task
claude247 repo add
claude247 start --repo my-repo --goal "refactor the auth middleware"
# Optional: install launchd daemons (dashboard + dispatcher + nightly backup)
scripts/install_launchd.sh
open http://127.0.0.1:8423
What ships in this release
| Layer | Modules |
|---|---|
| Orchestrator | scheduler, dispatcher (13 command handlers), task/command/budget/risk/merge_policy managers, ci_poller, gc, metrics, webhook handlers, system pause flags, env loader |
| Runner | Docker image + worker driver + planner/coder/reviewer/repair role loop (memory-aware) |
| Validators | Gemini 2.5 Pro + OpenAI-compatible judges with evidence-only contract; N-validator panel (N≥1) |
| Memory | SQLite schema, .agent/*.md per repo, Qdrant + SQLite-FTS5 backends, daily/weekly compiler, planner-prompt injection |
| Gateway CLI | claude247 with 17 commands — status/repos/start/pause/resume/stop/explain-stuck/approve-merge/reject-merge/risk/tasks/task/logs/memory/replay/dispatcher/doctor |
| Dashboard | FastAPI + HTMX, /metrics (Prometheus), /webhooks/github (HMAC-SHA256), pagination |
| Operations | launchd plists (dashboard + orchestrator + dispatcher + backup), workspace/log GC, orphan-command recovery, SQLite .backup rotation |
| Docs | docs/{ARCHITECTURE, INSTALL, REMOTE_DISPATCH, SECURITY, MEMORY, AUTO_MERGE_POLICY, VALIDATORS, REPO_ONBOARDING, OPERATIONS}.md |
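The "SQLite .backup rotation" item in the Operations row can be approximated with Python's online backup API. This is a sketch under assumptions, not the project's backup job: `backup_db` and its `keep` retention count are hypothetical, and the file-name pattern follows the backup artifact named in the v1.0.0 soak table.

```python
import sqlite3
from datetime import datetime, timezone
from pathlib import Path


def backup_db(src_path: str, backup_dir: str, keep: int = 7) -> Path:
    """Copy the live database with SQLite's online backup API, then prune old copies."""
    out_dir = Path(backup_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target = out_dir / f"claude247-{stamp}.db"

    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(str(target))
    try:
        src.backup(dst)  # page-by-page copy that tolerates concurrent writers
    finally:
        dst.close()
        src.close()

    # Timestamped names sort chronologically, so the oldest files come first.
    backups = sorted(out_dir.glob("claude247-*.db"))
    for old in backups[:-max(keep, 1)]:
        old.unlink()
    return target
```

Using the backup API instead of copying the file gives a consistent snapshot while the dispatcher keeps writing, which fits the observation that the dispatcher continued unaffected after the backup.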
Safety gates (all on by default)
- system.allow_remote_writes: false (global push/merge gate)
- Per-repo auto_merge.enabled: false
- forbidden_paths non-empty required at onboarding
- Validators require agreement; disagreement routes to human
- All push/merge gh calls triple-gated
Notes
- Old Auto-Evo + AutoDev v3 implementation is preserved at archive/auto-evo/
- This release force-pushed main to replace the prior squashed snapshot; anyone with the old clone needs to git fetch + reset