Releases: CTlanston/claude-code-247

v1.0.0 — Local-first 24/7 multi-repo Claude Code coworker

25 May 07:17

Summary

This is the first GA release of claude-code-247.

It provides a local-first, multi-repo, 24/7 Claude Code coworker system using:

  • local Claude Code worker mode (subscription auth, no Anthropic API key required)
  • launchd daemon runtime (4 services: dashboard / orchestrator / dispatcher / backup)
  • Docker-based worker execution
  • GitHub as source-of-truth collaboration plane
  • FastAPI + HTMX dashboard with a real-time Apple-style watchdog page
  • CLI / Claude Remote-compatible command surface
  • SQLite-backed state machine
  • real Gemini 2.5 Pro validator
  • real OpenAI-compatible validator
  • risk scoring
  • guarded auto-merge
  • failure replay
  • per-phase worker_exits instrumentation

Production proof (already shipped in v1.0.0-beta.2)

v1.0.0-beta.2 proved the pure auto-merge production path end-to-end on a
real GitHub repository (auto-evo-playground#59):

| Item | Value |
| --- | --- |
| Worker auth | local_claude_code (subscription) |
| Anthropic worker spend | $0.00 |
| Gemini real validator | PASS confidence 1.0 |
| OpenAI real validator (gpt-4o) | PASS confidence 1.0 |
| Risk score | 0 (low) |
| Merge ruling | AUTO_MERGE |
| Real PR | auto-merged in 3 seconds (pr_created → merged) |

M21 hardening (on top of beta.2)

  • worker_exits schema v4 — phase lifecycle columns status /
    started_at / finished_at / error_type via additive migration
  • dispatcher instrumentation for 8 phases (prepare_workspace,
    worker, validators, risk_score, merge_policy, push,
    open_pr, auto_merge), each carrying phase-specific metadata
  • claude247 task-phases --task <id> CLI alias
  • 6 failure-mode integration drills, all PASS:
    secret-in-diff blocks merge / validator disagreement blocks /
    high-risk path blocks / budget exceeded defers task /
    stop-all emergency kill / gh merge failure records phase exit
  • 19-gate GA_GATE.md contract, with explicit POST_GA_BACKLOG
    so no safety check gets quietly demoted
  • 526 → 566 tests passing (added 40 tests for the M22b watchdog
    read-only status board)

M22b — read-only watchdog status board

M22b adds a new read-only watchdog dashboard:

  • claude247 status-board / claude247 watchdog CLI (plain / JSON /
    Markdown output)
  • /status-board HTML page with Apple-style Activity ring for
    soak progress, auto-refresh every 15s (configurable), pause/resume,
    live indicator dot, EN ↔ 中文 language toggle, dark mode
    auto-follow
  • /status-board.json machine-readable endpoint
  • Live aggregation of: release state, soak progress, launchd /
    dispatcher / dashboard / notifier / doctor health, queue + signals,
    GA gate status (parsed from GA_GATE.md), worker usage
    (runs / cost / active workers / by role / by auth_mode)
  • Strict read-only contract — a regression test asserts that
    invoking the CLI does not change any row count in tasks,
    commands, system_state, logs, or alerts. Safe to run while
    the dispatcher is mid-tick.
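
The read-only contract can be checked with a row-count snapshot taken before and after the call. The following is a minimal sketch, assuming an in-memory SQLite database that holds the five tables named above. `render_status_board` is a hypothetical stand-in for the real CLI entry point, and the column layouts are illustrative only.

```python
import sqlite3

WATCHED_TABLES = ("tasks", "commands", "system_state", "logs", "alerts")


def row_counts(conn: sqlite3.Connection) -> dict:
    """Return the current row count of every watched table."""
    return {
        table: conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        for table in WATCHED_TABLES
    }


def render_status_board(conn: sqlite3.Connection) -> dict:
    """Hypothetical read-only aggregation: it only issues SELECTs."""
    active = conn.execute(
        "SELECT COUNT(*) FROM tasks WHERE state = 'running'"
    ).fetchone()[0]
    alerts = conn.execute("SELECT COUNT(*) FROM alerts").fetchone()[0]
    return {"active_tasks": active, "alerts": alerts}


def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway database; the schema here is an assumption."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, state TEXT)")
    for table in WATCHED_TABLES[1:]:
        conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, body TEXT)")
    conn.execute("INSERT INTO tasks (state) VALUES ('done')")
    conn.commit()
    return conn


conn = make_demo_db()
before = row_counts(conn)
board = render_status_board(conn)
after = row_counts(conn)
assert before == after, "status board must not insert or delete rows"
```

Comparing counts catches inserts and deletes but not in-place updates; a stricter variant could hash each table's contents instead.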

Explicit soak waiver

The original 24h soak gate was NOT fully completed before this release.

The owner explicitly waived the full 24h soak requirement after
~9h+ of healthy soak evidence, with all observable signals green:

| Signal | Observed |
| --- | --- |
| Elapsed soak at decision time | ~9h 12m / 24h (~38.4%) |
| launchd services loaded | 4 / 4 |
| Dashboard /healthz | OK |
| Dispatcher healthy idle ticks | ~1182 (every 30s, all "idle: queue empty") |
| Backup job (daily 03 local) | Completed; claude247-20260525T011703Z.db (2.06 MB) created; dispatcher continued unaffected afterward |
| Active tasks | 0 |
| Stuck tasks | 0 |
| Orphan running commands | 0 |
| New alerts since T0 | 0 |
| Structured log errors since T0 | 0 |
| Anthropic worker spend since T0 | $0.0000 |

Known yellow flag

One early SQLite schema-migration race occurred at T0 + 7 minutes.
It was a one-time transient failure: dispatcher.err.log recorded a
single sqlite3.OperationalError: no such column: started_at
traceback (originating in memory/db.py::init_db), then the file
stopped growing. The subsequent ~1182 dispatcher ticks all succeeded.
The error did not repeat. Most plausible root cause: schema.sql
referenced the new M21-P2 columns before the in-place ALTER TABLE
migration finished on that particular tick.
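
One way to avoid this class of race is to add any missing columns before a statement references them. The sketch below assumes a `worker_exits` table and the four lifecycle columns named in the M21 notes. The helper name `ensure_columns` and the SQL types are hypothetical; this is not the project's `memory/db.py`.

```python
import sqlite3

# Phase lifecycle columns named in the M21 notes; the SQL types are assumptions.
LIFECYCLE_COLUMNS = {
    "status": "TEXT",
    "started_at": "TEXT",
    "finished_at": "TEXT",
    "error_type": "TEXT",
}


def ensure_columns(conn, table, columns):
    """Add any missing columns and return the names that were added."""
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    added = []
    for name, sql_type in columns.items():
        if name not in existing:
            conn.execute(f"ALTER TABLE {table} ADD COLUMN {name} {sql_type}")
            added.append(name)
    conn.commit()
    return added


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE worker_exits (id INTEGER PRIMARY KEY, phase TEXT)")

first = ensure_columns(conn, "worker_exits", LIFECYCLE_COLUMNS)
second = ensure_columns(conn, "worker_exits", LIFECYCLE_COLUMNS)  # no-op rerun

# Only after the migration is it safe to reference the new columns.
rows = conn.execute("SELECT started_at FROM worker_exits").fetchall()
```

Because the helper is idempotent, it can run on every startup ahead of any schema script that mentions the new columns.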

Post-GA follow-up (required)

  • Record the final T+24h soak result after wall-clock crosses
    2026-05-25T21:46Z (the dispatcher T0 plus 24h). The watchdog
    dashboard auto-detects this and will flip soak.result from
    PARTIAL to PASS (or FAIL).
  • File the result in a follow-up M22c_SOAK_FINAL.md.
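
The PARTIAL to PASS transition described above amounts to comparing elapsed wall-clock time against the 24h window. A rough sketch follows; the function name `soak_status` and the `healthy` flag that drives FAIL are assumptions made for illustration, not the watchdog's actual logic.

```python
from datetime import datetime, timedelta, timezone

SOAK_WINDOW = timedelta(hours=24)


def soak_status(t0, now, healthy=True):
    """Return (fraction_complete, result) for a 24h soak that began at t0."""
    elapsed = now - t0
    fraction = min(max(elapsed / SOAK_WINDOW, 0.0), 1.0)
    if not healthy:
        return fraction, "FAIL"
    return fraction, "PASS" if elapsed >= SOAK_WINDOW else "PARTIAL"


# T0 implied by the deadline above: 2026-05-25T21:46Z minus 24h.
t0 = datetime(2026, 5, 24, 21, 46, tzinfo=timezone.utc)
at_waiver = soak_status(t0, t0 + timedelta(hours=9, minutes=12))
at_deadline = soak_status(t0, t0 + SOAK_WINDOW)
```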

Safety model (unchanged from beta.2)

Auto-merge is guarded by, in order:

  1. risk scoring (orchestrator/risk_score.py)
  2. allowed / forbidden path policy
  3. secret scanner (drill-tested: blocks merge if any match)
  4. real Gemini validator
  5. real OpenAI-compatible validator
  6. validator-disagreement gate (drill-tested: routes to human)
  7. high-risk block (drill-tested: never auto-merges)
  8. failure replay (claude247 replay)
  9. immutable command audit trail in commands table

The runtime safety gate system.allow_remote_writes defaults to
false. No git push, no PR merge, no GitHub write API call may
execute unless this flag is true AND the repo is enabled: true
in repos.yaml.
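
The guard chain and the write gate can be sketched together. This is a minimal illustration, not the project's merge-policy code. The `Evidence` fields, the `BLOCKED` label, and the rule that medium risk waits for approval are assumptions made for the example. Only `AUTO_MERGE`, `WAITING_APPROVAL`, `allow_remote_writes`, and `enabled` come from these notes.

```python
from dataclasses import dataclass, field


@dataclass
class Evidence:
    risk_level: str                                # "low", "medium", or "high"
    forbidden_path_hit: bool = False
    secret_hit: bool = False
    verdicts: dict = field(default_factory=dict)   # validator name -> verdict
    mock_validators: tuple = ()                    # validators that ran as mocks


def merge_ruling(ev: Evidence) -> str:
    """Illustrative ruling: any failed guard stops the auto-merge."""
    if ev.forbidden_path_hit or ev.secret_hit:
        return "BLOCKED"               # hypothetical label for a hard block
    if ev.mock_validators:
        return "WAITING_APPROVAL"      # a mock verdict cannot approve a merge
    if not ev.verdicts or any(v != "PASS" for v in ev.verdicts.values()):
        return "WAITING_APPROVAL"      # disagreement or non-PASS goes to a human
    if ev.risk_level != "low":
        return "WAITING_APPROVAL"      # high risk never auto-merges
    return "AUTO_MERGE"


def remote_writes_allowed(system_cfg: dict, repo_cfg: dict) -> bool:
    """True only when the global flag AND the per-repo flag are both true."""
    return (
        system_cfg.get("allow_remote_writes", False) is True
        and repo_cfg.get("enabled", False) is True
    )


beta2 = Evidence("low", verdicts={"gemini": "PASS", "openai": "PASS"})
alpha1 = Evidence(
    "medium",
    verdicts={"gemini": "PASS", "openai": "NEEDS_HUMAN"},
    mock_validators=("openai",),
)
```

Under these assumptions `merge_ruling(beta2)` yields `AUTO_MERGE` and `merge_ruling(alpha1)` yields `WAITING_APPROVAL`, matching the outcomes reported for those releases.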

Quick start

make install                                        # venv + deps + launchd plists
claude247 doctor
claude247 repo add                                  # CLI onboarding wizard
claude247 status-board --plain                      # read-only watchdog
open http://127.0.0.1:8423/status-board             # live watchdog page
claude247 start --repo my-repo --goal "refactor X"  # kick off a task

Remaining post-GA backlog

These are explicit non-blockers, filed so they don't get promoted to
blocker via informal scope creep:

  • Record final T+24h soak result (the carry-over from the waiver)
  • Optional deeper multi-day soak observation (7-day, etc.)
  • Schema v5 migration to add runs.input_tokens / runs.output_tokens
    columns + worker write-through, enabling token-level rate display in
    the watchdog dashboard
  • More advanced dashboard analytics (cost trends, per-repo PR
    throughput)
  • Multi-machine HA (single-Mac is by design)
  • Cloud-hosted dashboard with team RBAC (local-first is by design)
  • Qdrant live test with a real embedding key
  • BSD-sed compatibility fix for scripts/doctor_launchd.sh
  • doctor improvements:
    • detect "the dashboard busy on port 8423 is OUR own daemon"
      rather than warning
    • validator-key check should use load_runtime_config() instead of
      bare os.environ so it reflects what the dispatcher actually
      sees

Pre-release lineage

| Tag | Date | Headline |
| --- | --- | --- |
| v1.0.0-alpha.0 | 2026-05-24 | v1 transformation complete |
| v1.0.0-alpha.1 | 2026-05-24 | Real multi-repo E2E validation |
| v1.0.0-beta.0 | 2026-05-24 | Beta-readiness milestone |
| v1.0.0-beta.1 | 2026-05-24 | Beta stabilization |
| v1.0.0-beta.2 | 2026-05-24 | Pure auto-merge production proof |
| v1.0.0 | 2026-05-25 | GA release with explicit owner soak waiver |

v1.0.0-beta.2 — Pure auto-merge production proof

24 May 21:46

Eight-commit milestone (M20) on top of v1.0.0-beta.1. The headline: the system landed a real PR on a real GitHub repo with both real validators returning PASS, no human approval, and $0 Anthropic worker spend — the missing claim from beta.1.

See M20_PRODUCTION_PROOF_REPORT.md for the full report.

Headline

| Item | Value |
| --- | --- |
| Validator panel | Gemini PASS conf 1.0 + OpenAI PASS conf 1.0 (real, gpt-4o) |
| Risk | 0 (low) |
| Merge ruling | AUTO_MERGE |
| PR | auto-evo-playground#59 (MERGED) |
| merge commit | 6c583d5e1b39f03f800591770fe2b715340f12a7 |
| Worker auth | local_claude_code (subscription) |
| Anthropic worker spend | $0.00 |
| Total wall clock (enqueue → merged) | 3m 28s |

What got fixed (M20-P1b through M20-P3j)

The first M20-P3 E2E attempt didn't auto-merge. Five distinct evidence-pipeline gaps surfaced over the iteration series; each got a regression-tested fix.

| Fix | Commit | What it does |
| --- | --- | --- |
| M20-P1b | 5aed874 | env_loader.discover_env_paths() probes <user_config_dir>/secrets.env (launchd-spawned daemons need this because there is no shell wrapper to source it). Also strips the bash export VAR=value prefix. |
| M20-P3b | 1eb9d10 | gateway/commands/dispatcher_cmd.py calls env_loader.load_chain(discover_env_paths(cwd=None)) instead of the legacy single-file load(). |
| M20-P3d | 1c18d49-adjacent | Default OpenAI model gpt-5 (org-gated) → gpt-4o (widely available, supports response_format: json_object). Operators can opt up via config.validators.openai.model. |
| M20-P3g | 1c18d49 | EvidenceCollector resolves base ref from task_spec.default_branch (with origin/<branch> and HEAD fallbacks) instead of always using HEAD. Necessary because Claude CLI commits its work mid-roleloop. |
| M20-P3i | ef30853 | snapshot_diff_body_safe includes untracked files via git ls-files --others, synthesizing new-file diffs. Claude CLI often writes new files without staging. |
| M20-P3j | cca1858 | runner/role_loop.py calls a new _refresh_diff_evidence helper after the coder and after each repair so the internal reviewer sees fresh state. Without this, the stale review.md was misleading the external OpenAI validator. |
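
The `export` stripping from M20-P1b, combined with the BR-002 rule that an already-set `os.environ` value always wins, can be illustrated as follows. This is a sketch under stated assumptions: `parse_env_text` and `apply_env` are hypothetical names, not the real `env_loader` API, and the quote handling is deliberately crude.

```python
import os


def parse_env_text(text: str) -> dict:
    """Parse KEY=value lines, tolerating a leading bash-style `export `."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        key, sep, value = line.partition("=")
        if not sep or not key.strip():
            continue  # not an assignment; skip rather than guess
        # Crude unquoting: drop surrounding single or double quotes.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def apply_env(values: dict, environ=None) -> list:
    """Set only keys that are absent; return the keys that were set."""
    environ = os.environ if environ is None else environ
    applied = []
    for key, value in values.items():
        if key not in environ:
            environ[key] = value
            applied.append(key)
    return applied


sample = 'export OPENAI_API_KEY="sk-test"\n# comment\nGEMINI_API_KEY=abc\n'
parsed = parse_env_text(sample)
fake_environ = {"GEMINI_API_KEY": "from-shell"}
applied = apply_env(parsed, fake_environ)
```

Passing a plain dict as `environ` keeps the example from touching the real process environment.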

Operational additions

  • launchd daemon mode installed and verified (4 services: dashboard
    KeepAlive + orchestrator / dispatcher / backup scheduled).
    Dashboard /healthz is live; all 4 plists now set CLAUDE247_CONFIG.
  • 24h soak plan written with baseline + health-check commands +
    failure conditions + stop/uninstall paths — see
    M20_SOAK_PLAN.md.

Validated against CTlanston/auto-evo-playground

The clamp utility task ran end-to-end:

| Stage | Outcome |
| --- | --- |
| Worker (subscription claude CLI) | Wrote clamp(value, min_value, max_value) + 13 unit tests |
| Workspace pytest | 92 passed |
| BR-001 safe diff body | clean, real implementation visible, secret-scan PASS |
| BR-002 env chain | OPENAI_API_KEY + GEMINI_API_KEY loaded from secrets.env via the M20-P1b chain |
| BR-003 worker_exits | 3 rows, all tests/success |
| Gemini real verdict | PASS conf 1.0 |
| OpenAI real verdict (gpt-4o) | PASS conf 1.0 |
| Risk score | 0 (low) |
| Merge policy decision | AUTO_MERGE |
| Auto-merge wall-clock | 3 seconds (pr_created → merged) |

Test posture

$ .venv/bin/python -m pytest -q --no-cov
502 passed in ~13s
  • BR-001 / BR-002 / BR-003 / M19-F1 fix: as shipped in beta.1 (492)
  • M20-P1b: +5 tests (secrets.env discovery, collision rules, missing-file
    graceful, ANTHROPIC-in-env doesn't flip mode, bash export strip)
  • M20-P3g: +2 tests (committed agent-branch visible, default_branch fallback)
  • M20-P3i: +3 tests (untracked new file in diff, forbidden untracked omitted,
    .evidence/ not listed)

What's still pending

  • The 24h soak observation itself — baseline is recorded, but the
    t+1h / t+6h / t+24h checkpoints in M20_SOAK_PLAN.md need an
    operator to run (or for the daemons to simply sit idle for 24h).
  • Deeper worker_exits instrumentation outside run_named_commands.
  • Qdrant live test (key not present in test env).

Pre-release status

This is still a pre-release. With M20 done, the system has now
demonstrated a real end-to-end auto-merge with two real top-shelf
validators, real GitHub writes, real auth, $0 Anthropic worker spend,
and launchd 24/7 readiness. The remaining gap before a v1.0.0 GA
tag is the observed soak window and any deeper instrumentation /
hardening you want before lifting the pre-release flag.

v1.0.0-beta.1 — Beta stabilization

24 May 20:02

Pre-release

Eight-commit milestone (M19) built on top of v1.0.0-beta.0. The job was not to add product features — it was to close the three beta-readiness backlog items, prove BR-001 + BR-002 + BR-003 working end-to-end on a real GitHub repo, and ship release+main consistency so the public page reflects reality.

See M19_BETA_STABILIZATION_REPORT.md and REAL_E2E_REPORT_M19.md for the full synthesis.

What's in

| Phase | Title | Commit |
| --- | --- | --- |
| M19-P0 | Remote consistency report + missing GH release for beta.0 | e2766f2 |
| M19-P1 | BR-001: safe diff body to validators | c6d6aeb |
| M19-P2 | BR-002: deterministic env + config resolution | b65e7f6 |
| M19-P3 | BR-003: per-phase worker_exits observability | 687144c |
| M19-P4 | Full test pass + doctor + status recorded | 1e53ce2 |
| M19-P5 | Third real E2E (first run surfaced finding M19-F1) | 71dbb40 |
| M19-P5b | Fix M19-F1: secret_scanner false-positive on lowercase Python vars | 396c153 |
| M19-P5c | E2E rerun proving fix in production: Gemini real PASS conf 1.0 | 17cffac |

What this delivers

BR-001 — Safe diff body to validators

  • New EvidenceCollector.snapshot_diff_body_safe() produces
    .evidence/diff_body_safe.md and .evidence/diff_body_metadata.json.
  • Default forbidden-path floor (.env, secrets/**, .github/**,
    CLAUDE.md, AGENTS.md, PEM/key files) merged with task spec.
  • Per-file git diff filtered through orchestrator.secret_scanner;
    any hit → body redacted to a summary + secret_scan.status = BLOCKED.
  • Per-file and total byte caps with truncation marker.
  • JudgeInput gained diff_body_safe + diff_body_metadata;
    evidence_prompt() adds a ## DIFF_BODY section + the directive
    instruction telling the validator to return NEEDS_HUMAN if the body
    is insufficient.
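
The filtering steps above (forbidden-path floor, secret scan, byte caps) can be sketched as one function. This is an illustration under assumptions, not `EvidenceCollector.snapshot_diff_body_safe()` itself: the `has_secret` callback stands in for `orchestrator.secret_scanner`, the cap values and marker text are made up, and the PEM/key patterns are omitted for brevity.

```python
from fnmatch import fnmatchcase

# Part of the default floor listed above.
FORBIDDEN = (".env", "secrets/**", ".github/**", "CLAUDE.md", "AGENTS.md")
TRUNCATION_MARKER = "\n[truncated]\n"  # hypothetical marker text


def _cap(text: str, limit: int) -> str:
    """Trim text to at most `limit` UTF-8 bytes and flag the cut."""
    raw = text.encode("utf-8")
    if len(raw) <= limit:
        return text
    return raw[:limit].decode("utf-8", errors="ignore") + TRUNCATION_MARKER


def safe_diff_body(file_diffs, has_secret, per_file_cap=4000, total_cap=20000):
    """Return (body, metadata) for a {path: diff_text} mapping."""
    allowed = {
        path: diff
        for path, diff in sorted(file_diffs.items())
        if not any(fnmatchcase(path, pattern) for pattern in FORBIDDEN)
    }
    omitted = sorted(set(file_diffs) - set(allowed))
    if any(has_secret(diff) for diff in allowed.values()):
        # Any hit redacts the whole body down to a summary.
        summary = f"REDACTED: secret scanner hit; {len(allowed)} file(s) withheld"
        return summary, {"secret_scan": {"status": "BLOCKED"}, "omitted": omitted}
    sections = [
        f"### {path}\n{_cap(diff, per_file_cap)}" for path, diff in allowed.items()
    ]
    body = _cap("\n".join(sections), total_cap)
    return body, {"secret_scan": {"status": "PASS"}, "omitted": omitted}


body, meta = safe_diff_body(
    {"src/a.py": "+x = 1", ".env": "+KEY=value"}, has_secret=lambda diff: False
)
```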

BR-002 — Deterministic env + config resolution

  • New precedence chain for config:
    CLI --config > $CLAUDE247_CONFIG > $CLAUDE247_CONFIG_DIR/config.yaml > <cwd>/.claude247/config.yaml
  • New precedence chain for env values: already-set os.environ wins
    absolutely; among .env files, project-root .env > CWD .env >
    <config_dir>/.env.
  • New RuntimeConfig dataclass surfaced via doctor for diagnostics.
  • launchd plists now set CLAUDE247_CONFIG so launchd-launched
    workers resolve config without shell profile.
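
The config precedence chain reads naturally as a first-match lookup. A minimal sketch follows, assuming the resolver returns the first candidate without checking that the file exists; `resolve_config_path` and the source labels are hypothetical names.

```python
import os
from pathlib import Path


def resolve_config_path(cli_config=None, environ=None, cwd=None):
    """Return (path, source) for the first candidate in the precedence chain."""
    environ = os.environ if environ is None else environ
    cwd = Path.cwd() if cwd is None else Path(cwd)
    if cli_config:
        return Path(cli_config), "cli"
    if environ.get("CLAUDE247_CONFIG"):
        return Path(environ["CLAUDE247_CONFIG"]), "env:CLAUDE247_CONFIG"
    if environ.get("CLAUDE247_CONFIG_DIR"):
        config_dir = Path(environ["CLAUDE247_CONFIG_DIR"])
        return config_dir / "config.yaml", "env:CLAUDE247_CONFIG_DIR"
    return cwd / ".claude247" / "config.yaml", "cwd"


# A launchd plist that sets CLAUDE247_CONFIG needs no shell profile:
path, source = resolve_config_path(
    environ={"CLAUDE247_CONFIG": "/opt/c247/config.yaml"}, cwd="/tmp"
)
```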

BR-003 — Per-phase worker_exits observability

  • New worker_exits SQLite table (schema v3, additive migration).
  • New classify_failure(phase, exit_code, command, stderr, verdict)
    heuristic returns one of 14 canonical labels (test_failure,
    claude_cli_failure, auth_failure, docker_failure, git_failure,
    github_failure, validator_failure, merge_policy_block, timeout,
    policy_block, …, unknown_failure).
  • evidence_collector.run_named_commands writes a row per command.
  • handle_explain_stuck surfaces worker_exits in its summary.
  • New claude247 worker-exits --task <id> CLI with --plain / --json.
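
To make the classification idea concrete, here is a rough heuristic that uses the signature and a subset of the labels listed above. The matching rules themselves are assumptions for illustration; the project's actual `classify_failure` may weigh its inputs quite differently.

```python
def classify_failure(phase, exit_code, command, stderr, verdict=None):
    """Heuristic sketch: map a failed phase to one canonical label."""
    text = (stderr or "").lower()
    cmd = (command or "").lower()
    # Exit code 124 is what GNU `timeout` returns when the limit is hit.
    if exit_code == 124 or "timed out" in text:
        return "timeout"
    if "unauthorized" in text or "authentication" in text:
        return "auth_failure"
    if phase == "merge_policy" and verdict == "NEEDS_HUMAN":
        return "merge_policy_block"
    if phase == "validators":
        return "validator_failure"
    if cmd.startswith("docker "):
        return "docker_failure"
    if cmd.startswith("gh "):
        return "github_failure"
    if cmd.startswith("git "):
        return "git_failure"
    if cmd.startswith("claude "):
        return "claude_cli_failure"
    if "pytest" in cmd or phase == "tests":
        return "test_failure"
    return "unknown_failure"


label = classify_failure("tests", 1, "python3 -m pytest", "1 failed", None)
```

Checking the most specific signals first (timeout, auth) keeps a generic command prefix from masking the real cause.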

M19-F1 — Secret scanner false positive fixed

  • The pre-existing env_var_assign regex used (?im), which made
    ordinary Python tokens = text.split() match (TOKEN + s,
    case-insensitive). Removed the i flag so only conventional
    all-uppercase env-var names match. Regression-tested.
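
The effect of the `i` flag can be reproduced with a small stand-in pattern. The regex below is hypothetical (the project's actual `env_var_assign` pattern is not shown in these notes); it only demonstrates how case-insensitive matching turns `tokens = ...` into a hit.

```python
import re

# Hypothetical pattern, written to mimic an env-var assignment detector.
PATTERN = r"^\s*[A-Z0-9_]*(?:TOKEN|SECRET|PASSWORD|API_KEY)[A-Z0-9_]*\s*=\s*\S+"

case_insensitive = re.compile(PATTERN, re.IGNORECASE | re.MULTILINE)  # like (?im)
case_sensitive = re.compile(PATTERN, re.MULTILINE)                    # like (?m)

python_line = "tokens = text.split()"
env_line = "GITHUB_TOKEN=example-value"

# With the i flag, "token" + "s" satisfies TOKEN + [A-Z0-9_]*.
false_positive = case_insensitive.search(python_line) is not None
# Without it, lowercase Python names no longer match ...
fixed = case_sensitive.search(python_line) is None
# ... while conventional uppercase env-var names still do.
still_caught = case_sensitive.search(env_line) is not None
```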

Validated against CTlanston/auto-evo-playground

Two live E2E runs of the same task (dedupe_words):

| Field | Run 1 (pre-M19-F1 fix) | Run 2 (post-M19-F1 fix) |
| --- | --- | --- |
| PR | #54 (closed) | #55 |
| secret_scan.status | BLOCKED (false positive on tokens) | PASS (clean) |
| diff_body_safe.md | REDACTED summary | Full unified diff |
| Gemini verdict | NEEDS_HUMAN conf 0.1 | PASS conf 1.0 |
| OpenAI verdict | mock NEEDS_HUMAN | mock NEEDS_HUMAN |
| Final state | WAITING_APPROVAL | WAITING_APPROVAL |
| Anthropic spend | $0.00 | $0.00 |

Auto-merge was correctly held in both runs by orthogonal gates
(M18-P1's mock-validator block + the validator_disagreement factor's
medium-risk promotion). Per directive: we did not set up an OpenAI key
for this milestone, so we explicitly do not claim a full
auto-merge proof. What we do claim is much stronger than the
pre-fix state: BR-001 produces a real diff body that a real top-shelf
validator can read and judge correctly, and BR-002 picks up the
operator's existing GEMINI_API_KEY from ~/.claude-code-247/.env
without any shell-profile help.

Test posture

$ .venv/bin/python -m pytest -q --no-cov
492 passed in 13.07s
  • BR-001: 18 new tests (evidence_diff_body_safe + validator_receives_diff_body + secret_hit_blocks_diff_body_validator + 1 tweak to test_judge_contract)
  • BR-002: 28 new tests (env_loader_precedence + env_loader_cwd_support + doctor_reports_config_source + launchd_plist_sets_config_env)
  • BR-003: 25 new tests (worker_exit_record + worker_exit_classification + explain_stuck_uses_worker_exit + integration test_failed_worker_writes_exit_record)
  • M19-F1: 2 new tests in test_secret_scanner.py (lowercase var not flagged + uppercase env var still caught)
  • Total Δ: +73 tests on top of the 419 baseline at beta.0.

Doctor

✓ config source: loaded /Users/lanston/.claude-code-247/config.yaml (kind=user); env files probed: 2
✓ auth mode: worker_mode=local_claude_code, usable=True
✓ sqlite db init: schema v3 at .../state/claude247.db

What's intentionally still not in scope for beta

  • Multi-machine HA — single-Mac is by design.
  • Cross-org auth — local-first, one user, one machine.
  • Docker runner outside dev mode — local backend covers stated scope.
  • Dashboard auth — binds to 127.0.0.1 deliberately.
  • Real-validator auto-merge demo with both Gemini + OpenAI — gated on
    the operator setting OPENAI_API_KEY; the system is ready for it.

Pre-release

This is a pre-release. Production-ready (v1.0.0) is gated on a
clean full E2E with both real validators returning a real PASS
on a non-trivial diff. The plumbing is in place; the only missing
input is an OpenAI API key.

v1.0.0-beta.0 — Beta-readiness milestone

24 May 19:13

Five-phase hardening (M18-P0..P4) on top of v1.0.0-alpha.1. M18 explicitly avoided new features; the goal was to move the product from "alpha harness" to "beta-ready live-ops" by hardening auth, validators, daemon, webhooks, and proving a clean second end-to-end on a real GitHub repo.

See BETA_READINESS_REPORT.md for the full synthesis.

Phases shipped

| Phase | Title | Commit |
| --- | --- | --- |
| M18-P0 | Subscription/local auth: worker_mode + no silent ANTHROPIC API fallback | 334ed46 |
| M18-P1 | Real OpenAI validator + mock-cannot-silently-pass-auto-merge gate | 9dacd5d |
| M18-P2 | launchd hardening: doctor_launchd.sh + extended doctor fields + plist tests | 712a639 |
| M18-P3 | Live ngrok webhook validation + explicit handle_ping | 5170197 |
| M18-P4 | Second real E2E proving reduced API spend + cleaner auto-merge path | d50949f |

Validated against CTlanston/auto-evo-playground (P4)

Real task: normalize_whitespace(text) queued via claude247 start, dispatched, role-loop ran, tests passed, PR opened (#53 draft), Gemini judged, merge policy routed to WAITING_APPROVAL.

| Item | Value |
| --- | --- |
| claude-code-247 tag commit | d50949f |
| pytest | 419 passing |
| Worker auth mode | local_claude_code (subscription CLI) |
| Anthropic API spend | $0.00 (down from ~$1.50 in alpha.1 for similar-shape task) |
| Gemini verdict | NEEDS_HUMAN (honest; see Finding 1 below) |
| OpenAI verdict | openai-mock (no key in env-loader scope; see Finding 2) |
| Merge decision | WAITING_APPROVAL (mock validator gate held correctly) |

What "beta-ready" means here

The documented product works end-to-end on a real GitHub repository, the auth path is honest about what it spends, validators are honest about what they ran with, the daemon path is inspectable, and live webhook delivery has been observed. It does not mean every backlog item is closed.

Mocked vs real

| Component | Mode |
| --- | --- |
| Claude Code worker | real (subscription, local CLI) |
| Gemini 2.5 Pro | real |
| OpenAI validator | mock (env-loader CWD scope; see Finding 2) |
| GitHub push + PR | real |
| Auto-merge | not exercised (validator gate correctly held) |
| Webhook receiver | real (P3 live ngrok delivery; 200 OK, signature verified) |
| Qdrant | sqlite-fts fallback |
| Docker runner | local subprocess (daemon offline) |

Non-blocking backlog filed against beta.0

These were surfaced by the P4 live run and are filed as follow-up. They do not block the tag and are being addressed in the next milestone.

  • BR-001: JudgeInput includes diff_summary.md (stat) but not the textual diff body. Real validators correctly refuse to verify byte-identical preservation without seeing the body. (caps real-validator PASS rate)
  • BR-002: env_loader.load() reads ~/.claude-code-247/.env only; project/CWD .env is ignored, so an OPENAI_API_KEY in the active shell runs as mock. (config UX)
  • BR-003: Dispatcher worker_exit summary lacks phase/classification/stderr detail, making post-mortem of failed runs harder than it should be. (observability)

Intentionally not in scope for beta

  • Multi-machine HA — single-Mac is by design.
  • Cross-org auth — the product is local-first; one user, one machine.
  • Docker runner outside dev mode — local backend covers the stated 24/7 single-Mac scope.
  • Dashboard auth — it binds to 127.0.0.1 deliberately.

Pre-release

This is a pre-release. Production-ready (v1.0.0) is gated on the backlog items above being closed and a clean third E2E with real validators returning a real PASS.

v1.0.0-alpha.1 — Real multi-repo E2E validation

24 May 17:34

Validated against CTlanston/auto-evo-playground

Real task: Add slugify utility with tests (PR #51 merged via this system).

Validated path

CLI → command_queue → dispatcher → repo_registry → workspace clone
→ role loop (planner/coder/reviewer via claude --print, real)
→ tests (python3 -m pytest in workspace, 15 new passing tests)
→ evidence package (all spec §10 artifacts)
→ real Gemini 2.5 Pro validator (PASS, confidence 1.0)
→ mock OpenAI validator (propagated reviewer NEEDS_HUMAN)
→ risk score (40, medium — only factor=validator_disagreement)
→ merge_policy → WAITING_APPROVAL (correct per §9.5)
→ claude247 approve-merge → mark_ready → gh pr merge --squash --admin
→ merged to main

Results

| Item | Value |
| --- | --- |
| claude-code-247 commit | cd563e8 |
| auto-evo-playground merge commit | 7a3414f5 |
| PR | #51 (merged) |
| Tests (claude-code-247) | 368 passing |
| Tests (auto-evo-playground new) | 15 passing |
| Gemini verdict | PASS, conf 1.0 |
| OpenAI verdict | NEEDS_HUMAN (mock, no key) |
| Risk score | 40 (medium) |
| Merge decision | approval-path → merged |

Mocked vs real

| Component | Mode |
| --- | --- |
| Claude Code worker | real (subscription billed to ANTHROPIC_API_KEY) |
| Gemini 2.5 Pro | real |
| OpenAI validator | mock (no OPENAI_API_KEY) |
| GitHub push + PR + merge | real |
| Webhook receiver | not exercised (needs public endpoint) |
| Qdrant | sqlite-fts fallback |
| ntfy | configured only; live push not verified |
| Docker runner backend | local subprocess (Docker daemon offline) |

6 live-discovery fixes shipped this release

  1. claude CLI 2.1.142: prompt now via stdin (--allowedTools <tools...> is variadic and swallowed the positional)
  2. Honest auth_mode: detects ANTHROPIC_API_KEY and labels anthropic_api instead of silently saying local_claude_code
  3. Spec test_command guidance: use python3 -m pytest (worker subprocess doesn't inherit venv bin)
  4. gh pr merge --auto opt-in via repo.auto_merge.auto: true (was always-on; failed on most repos)
  5. gh pr merge --admin opt-in via repo.auto_merge.admin: true (needed for protected base branches)
  6. gh pr ready auto-called before merge (drafts are rejected by GitHub merge API)
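
Fixes 4 through 6 boil down to how the `gh` command line is assembled. The sketch below builds the argument lists without running them; `merge_commands` is a hypothetical helper, and `--squash` is taken from the validated path above rather than from any documented default.

```python
def merge_commands(pr_number: int, auto_merge: dict, is_draft: bool) -> list:
    """Return the gh invocations (as argv lists) needed to merge one PR."""
    commands = []
    if is_draft:
        # Drafts cannot be merged, so mark the PR ready first.
        commands.append(["gh", "pr", "ready", str(pr_number)])
    merge = ["gh", "pr", "merge", str(pr_number), "--squash"]
    if auto_merge.get("auto", False):    # repo.auto_merge.auto: true
        merge.append("--auto")
    if auto_merge.get("admin", False):   # repo.auto_merge.admin: true
        merge.append("--admin")
    commands.append(merge)
    return commands


plan = merge_commands(51, {"admin": True}, is_draft=True)
```

Both flags default to off here, which mirrors the opt-in behavior described in the list.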

Remaining gaps (next milestones)

  • OPENAI_API_KEY to get both-real validator panel
  • scripts/install_launchd.sh to enable 24/7 daemon mode
  • Live webhook test against a real GitHub repo (needs ngrok / public host)
  • Live Qdrant verification (currently sqlite-fts fallback)

See REAL_E2E_REPORT.md for the full step-by-step walk.

v1.0.0-alpha.0 — production v1 transformation complete

24 May 16:10

What this is

claude-code-247 is a local-first, multi-repo, 24/7 autonomous coding coworker. The Mac stays on; one orchestrator process (run under launchd) dispatches per-task workers inside Docker containers, talks to GitHub as the source of truth, and exposes a FastAPI + HTMX dashboard plus a mobile-friendly claude247 CLI for remote control.

Status

  • 348 tests passing (<8s on Python 3.13)
  • 49/49 acceptance items in DEFINITION_OF_DONE.md
  • Original 28-item spec (M0–M10) + 6 follow-up milestones (M11, M11.5, M12, M13, M14, M15)
  • Live Gemini 2.5 Pro adapter verified end-to-end

Quick start

git clone https://github.com/CTlanston/claude-code-247.git
cd claude-code-247
make install
claude247 doctor

# Add a repo + start a task
claude247 repo add
claude247 start --repo my-repo --goal "refactor the auth middleware"

# Optional: install launchd daemons (dashboard + dispatcher + nightly backup)
scripts/install_launchd.sh
open http://127.0.0.1:8423

What ships in this release

| Layer | Modules |
| --- | --- |
| Orchestrator | scheduler, dispatcher (13 command handlers), task/command/budget/risk/merge_policy managers, ci_poller, gc, metrics, webhook handlers, system pause flags, env loader |
| Runner | Docker image + worker driver + planner/coder/reviewer/repair role loop (memory-aware) |
| Validators | Gemini 2.5 Pro + OpenAI-compatible judges with evidence-only contract; N-validator panel (N≥1) |
| Memory | SQLite schema, .agent/*.md per repo, Qdrant + SQLite-FTS5 backends, daily/weekly compiler, planner-prompt injection |
| Gateway | CLI claude247 with 17 commands: status/repos/start/pause/resume/stop/explain-stuck/approve-merge/reject-merge/risk/tasks/task/logs/memory/replay/dispatcher/doctor |
| Dashboard | FastAPI + HTMX, /metrics (Prometheus), /webhooks/github (HMAC-SHA256), pagination |
| Operations | launchd plists (dashboard + orchestrator + dispatcher + backup), workspace/log GC, orphan-command recovery, SQLite .backup rotation |
| Docs | docs/{ARCHITECTURE, INSTALL, REMOTE_DISPATCH, SECURITY, MEMORY, AUTO_MERGE_POLICY, VALIDATORS, REPO_ONBOARDING, OPERATIONS}.md |
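
For the `/webhooks/github` HMAC-SHA256 check listed in the Dashboard row, GitHub sends an `X-Hub-Signature-256` header of the form `sha256=<hex digest>` computed over the raw request body. A minimal verification sketch follows; the function name is hypothetical, and this is not the dashboard's actual handler.

```python
import hashlib
import hmac


def verify_github_signature(secret: bytes, body: bytes, header: str) -> bool:
    """Check an X-Hub-Signature-256 header against the raw request body."""
    prefix = "sha256="
    if not header or not header.startswith(prefix):
        return False
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    received = header[len(prefix):]
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), received.encode())


secret = b"example-secret"
payload = b'{"zen": "Keep it logically awesome."}'
good_header = "sha256=" + hmac.new(secret, payload, hashlib.sha256).hexdigest()
ok = verify_github_signature(secret, payload, good_header)
```

The digest must be computed over the exact bytes received, before any JSON parsing or re-serialization.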

Safety gates (all on by default)

  • system.allow_remote_writes: false — global push/merge gate
  • Per-repo auto_merge.enabled: false
  • forbidden_paths non-empty required at onboarding
  • Validators require agreement; disagreement routes to human
  • All push/merge gh calls triple-gated

Notes

  • Old Auto-Evo + AutoDev v3 implementation is preserved at archive/auto-evo/
  • This release force-pushed main to replace the prior squashed snapshot — anyone with the old clone needs to git fetch + reset