claude-code-247

Your local-first, multi-repo, 24/7 autonomous coding coworker. The Mac stays on; a locally authenticated Claude Code drives Docker-isolated workers across every repo in your registry, opens draft PRs on GitHub, runs external validators, scores risk, and merges low-risk changes automatically. Anything riskier is gated on approval from your phone.

v1.0.0 GA shipped 2026-05-25 (with an owner-waived 24h soak gate). v2.2.0-rc2 is the current TypeScript production-grade tag for this single-operator system. Per ADR-0013 Path A, no v2.2.0 GA tag is created under the current release policy.

Latest: conversational-cockpit-v1. The P0-P7 workbook is complete: a chat-first Operator Cockpit now uses local Claude Code for clarification and planning, local Codex for implementation, and an isolated Gemini validator as the evidence-only hard gate for PRs. The strict real smoke, browser quality smoke, in-app browser validation, full test gate, and external Gemini evidence-only validator all passed on 2026-06-03. See WORKBOOK_v3.md and docs/SESSION_LOG_v3.md.

The architecture has converged to a TypeScript-only, event-sourced, three-plane design per ADR-0010; the dual-kernel section further down is retained as v1.0.0 history.

Quick Start

The fastest way to try the current product surface is the Operator Cockpit. Running pnpm cockpit:dev starts the TypeScript daemon on port 7247 and the web app on port 7248.

# 1. Install workspace dependencies
pnpm install

# 2. Start the local daemon + Operator Cockpit web UI
pnpm cockpit:dev

# 3. Open the chat-first cockpit
open http://127.0.0.1:7248

Then run the core operator flow:

Describe the mission in the chat composer.
Let Claude ask as many clarification questions as needed; the server will not unlock roadmap or coding until confidence is at least 95%.
Generate PRD / ADR / Roadmap once the clarification gate unlocks.
Approve the roadmap only when it matches the intended outcome.
Start execution; Codex performs the implementation in the repo-bound worker.
Read progress, evidence, Gemini verdicts, and PR-gate status inline in the same conversation thread.
Re-check the Draft PR gate only after Gemini PASS and repo policy allow it.

The cockpit defaults to local CLI / subscription-style usage: claude-cli for planner/clarifier and codex-cli for coding. It must not silently fall back to a paid API. If either local engine cannot run, the UI shows a current HOLD with a recovery action rather than fake success. Gemini is the only default external validator, and it sees evidence only.
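The no-silent-fallback rule above can be sketched as a resolver that returns either a runnable local engine or an explicit HOLD. This is a minimal illustration, not the cockpit's actual code: the `EngineProbe` type, the `resolveEngine` function, and the reason and recovery strings are hypothetical. Only the engine names `claude-cli` and `codex-cli` come from the text.

```typescript
// Hypothetical sketch: resolve a local engine or HOLD. There is no paid-API branch.
type LocalEngine = "claude-cli" | "codex-cli";

// Assumed shape: the caller has already probed whether each CLI can run.
interface EngineProbe {
  engine: LocalEngine;
  available: boolean;
  detail?: string; // optional reason the probe failed
}

type EngineResolution =
  | { kind: "ready"; engine: LocalEngine }
  | { kind: "hold"; reason: string; recoveryAction: string };

function resolveEngine(
  role: "planner" | "coder",
  probes: EngineProbe[],
): EngineResolution {
  // Planner and clarifier use claude-cli; the coder uses codex-cli.
  const wanted: LocalEngine = role === "planner" ? "claude-cli" : "codex-cli";
  const probe = probes.find((p) => p.engine === wanted);
  if (probe !== undefined && probe.available) {
    return { kind: "ready", engine: wanted };
  }
  const suffix = probe !== undefined && probe.detail ? `: ${probe.detail}` : "";
  return {
    kind: "hold",
    reason: `${wanted} cannot run${suffix}`,
    recoveryAction: `Install or re-authenticate ${wanted}, then retry`,
  };
}
```

The point of the design is that the return type has no third variant: a caller cannot receive a "succeeded via some other provider" result, so the UI can only render a ready engine or a current HOLD.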

🆕 conversational-cockpit-v1 — P0-P7 complete

The new cockpit is intentionally simple: one conversation stream plus a thin status strip. Clarification, roadmap generation, worker progress, evidence, Gemini decisions, and PR-gate results all appear inline as chat bubbles or cards. The old multi-panel cockpit surface is no longer the product default.

The P0-P7 workbook closed with these verified properties:

Local engine split — Claude Code is the clarifier/planner; Codex is the coder. Evidence records planner=claude-cli / coder=codex-cli and local subscription auth modes.
95% understanding gate — roadmap and coding are server-blocked until the planner reaches at least 95% confidence and there are no pending questions. Claude may ask zero, one, or many follow-up questions; the contract is confidence, not a fixed question count.
Conversation UI — the cockpit renders as a single chat thread and thin status strip. Legacy sidebar, inspector, tabs, and Project Pulse panels are no longer part of the default flow.
Gemini hard gate — Gemini is the default evidence-only validator. Draft PR creation is blocked unless the latest Gemini verdict is PASS, and remote writes are still blocked unless repo policy explicitly enables them.
Team memory — repo/operator Tier 1 memory and event-derived Tier 2 memory are injected into worker context; repeated Gemini rejection lessons can be compiled back into repo memory.
Hybrid execution — small missions use a single worker run; large roadmaps can execute through a per-node DAG with node evidence and honest failure handling.
Real end-to-end validation — strict real smoke, browser quality smoke, full tests, in-app browser checks, and an independent Gemini evidence-only validator all passed. Latest strict report: evidence/launch/operator-cockpit-real-smoke-2026-06-03T17-03-35-037Z.md.
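The 95% understanding gate listed above reduces to a two-part predicate: enough confidence and no open questions. The sketch below assumes confidence is reported as a fraction between 0 and 1; the `PlannerState` shape and the `canUnlockRoadmap` name are hypothetical and are not taken from the server code.

```typescript
// Hypothetical planner state; field names are assumptions for illustration.
interface PlannerState {
  confidence: number; // assumed to be a fraction in [0, 1]
  pendingQuestions: number; // clarification questions still unanswered
}

const MIN_CONFIDENCE = 0.95; // the 95% threshold stated in the text

// Roadmap and coding unlock only when both conditions hold.
function canUnlockRoadmap(state: PlannerState): boolean {
  return state.confidence >= MIN_CONFIDENCE && state.pendingQuestions === 0;
}
```

Note that the number of questions already asked does not appear in the predicate, which matches the stated contract: confidence, not a fixed question count.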

🆕 v2.4.0-patch1 — real end-to-end loop proven (E2E-Harvest)

The core value loop has now run end-to-end on real LLM work for the first time: a subscription Claude coder running inside Docker writes real code → evidence is collected → two independent validator families (OpenAI + Gemini) judge it on evidence only → a real draft PR is opened on GitHub → token/cost usage is persisted. Architecture decision: ADR-0019.
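One way to express the "two independent families" requirement is to group verdicts by family and count the families that passed. This is a sketch under assumptions: the `ValidatorResult` shape and `dualFamilyPass` are hypothetical names, and treating a family as passing only when every one of its verdicts is PASS is an interpretation, not a documented rule.

```typescript
type Family = "openai" | "gemini";
type Verdict = "PASS" | "FAIL";

// Hypothetical result record: one verdict from one validator family.
interface ValidatorResult {
  family: Family;
  verdict: Verdict;
}

// Returns true when at least `requiredFamilies` distinct families passed.
// Assumption: a single FAIL from a family disqualifies that family.
function dualFamilyPass(
  results: ValidatorResult[],
  requiredFamilies: number = 2,
): boolean {
  const byFamily = new Map<Family, boolean>();
  for (const r of results) {
    const soFar = byFamily.get(r.family) ?? true;
    byFamily.set(r.family, soFar && r.verdict === "PASS");
  }
  const passing = Array.from(byFamily.values()).filter((ok) => ok).length;
  return passing >= requiredFamilies;
}
```

Counting families rather than verdicts means two PASS verdicts from the same family do not satisfy the policy.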

Historical stage: ProductionHardened_v2.4_Ready (superseded by WORKBOOK_v3.md). This is a single-operator system; system.allow_remote_writes defaults to false and gates every outward write — git push, PR creation, and merge alike.

What's new in the technical surface

claude-in-Docker runner — packages/runner/src/claude-docker-runner.ts runs the subscription Claude CLI inside a container against a per-task git worktree, honoring the image's /entrypoint.sh contract (writes /workspace/prompt.txt; sets CLAUDE_ROLE / CLAUDE_MODEL / CLAUDE_PERMISSION_MODE / CLAUDE_ALLOWED_TOOLS; reads back /workspace/result.json). A static + runtime preflight (preflightClaudeDockerEnvironment / preflightRuntime) fails fast with a HOLD-CLAUDE-DOCKER-IMAGE or HOLD-CLAUDE-AUTH-IN-DOCKER reason rather than ever falling back silently to the paid API.
runner:e2e1 derived image — packages/runner/docker/Dockerfile.e2e1 and entrypoint-e2e1.sh. The patched entrypoint writes /workspace/cli-envelope.json (the raw CLI usage envelope) before normalization, so authoritative token counts survive (the stock result.json reported 0/0).
Subscription auth via OAuth token — inject AEDEV_CLAUDE_OAUTH_TOKEN (from claude setup-token) → CLAUDE_CODE_OAUTH_TOKEN inside the container. The macOS keychain credential is host-bound and 401s inside a Linux container, so the token path is the proven, keychain-free option. All ANTHROPIC_* paid-API env vars are stripped from the container.
model_usage accounting + live cost roller — insertModelUsage persists input/output tokens + cost per run and emits a model.usage.recorded event. Local subscription usage is tracked by run count + cost, never reported as $0. The daemon now feeds a long-lived CostRoller (seeded from model_usage on boot so spend survives a restart) and exposes cost_total_usd / cost_per_pr_usd_7d / cost_event_count on /metrics. Only known costs are summed — subscription-unknown stays 0, never fabricated.
Dual-family validators — OpenAI- and Gemini-family judges score the evidence package only (never the coder's conversation or chain-of-thought). The merge policy requires two independent families to pass.
Structured ClarificationGate (ADR-0020) — packages/daemon/src/clarification-gate.ts scores mission ambiguity deterministically (no LLM, no token spend) over four signals; above the threshold (trigger_threshold: 50 in config/policies.yaml) it asks ≤4 questions before any coder runs and writes a verifiable clarified-spec.md. Decision: ADR-0020.
Autonomous draft-PR closure — the daemon's mission loop now opens a real draft PR on an AUTO_MERGE decision via DraftPrGate over GhGitRemoteWriter / GhDraftPrCreator (runner plane), instead of stopping at a mock merge. The gate fail-closes on allow_remote_writes, repo.enabled, and forbidden paths, so the no-push default is preserved — with the flag false (default) the loop opens nothing. This folds the proven scripts/e2e1-real-loop.ts path into the loop.
Real-diff forbidden-path gate — forbidden-path detection reads the runner's changed-paths.json (the actual git diff file list) rather than regexing evidence prose, and feeds the merge policy's hard BLOCK (mission-runner.ts).
/github/sync is gated — the GitHub PR-sync route now fails closed with REMOTE_WRITES_DISABLED unless system.allow_remote_writes is true (it was previously guarded only by the presence of a GitHub token).
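The subscription-auth item in the list above describes two environment rules: map AEDEV_CLAUDE_OAUTH_TOKEN to CLAUDE_CODE_OAUTH_TOKEN, and strip every ANTHROPIC_* variable. A minimal sketch of that transformation follows. The `buildContainerEnv` helper is hypothetical, and reusing HOLD-CLAUDE-AUTH-IN-DOCKER for a missing token is an assumption about how the real preflight classifies that case.

```typescript
// Hypothetical helper: derive the container environment from a candidate env map.
function buildContainerEnv(
  candidateEnv: Record<string, string | undefined>,
): Record<string, string> {
  const token = candidateEnv["AEDEV_CLAUDE_OAUTH_TOKEN"];
  if (!token) {
    // Fail fast instead of letting the container fall back to another auth path.
    throw new Error("HOLD-CLAUDE-AUTH-IN-DOCKER: AEDEV_CLAUDE_OAUTH_TOKEN is not set");
  }
  const env: Record<string, string> = {};
  for (const [key, value] of Object.entries(candidateEnv)) {
    if (value === undefined) continue;
    if (key.startsWith("ANTHROPIC_")) continue; // strip paid-API variables
    if (key === "AEDEV_CLAUDE_OAUTH_TOKEN") continue; // re-injected under its container name
    env[key] = value;
  }
  env["CLAUDE_CODE_OAUTH_TOKEN"] = token;
  return env;
}
```

In practice the candidate map would probably be a short allowlist (for example the CLAUDE_ROLE and CLAUDE_MODEL values set by the runner) rather than the whole host environment; the stripping step is what guarantees a paid-API key cannot reach the container either way.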

Operator Cockpit

Operator Cockpit is the human control plane for the local-first coding coworker. It is intended to feel more like Claude Code Desktop than a passive dashboard: chat first, explicit clarification, visible execution progress, and safety gates that stay obvious.

The current conversational surface includes:

Single chat workspace — one conversation thread for clarification, planning, execution, evidence, Gemini verdicts, and PR-gate outcomes.
Thin status strip — the only persistent chrome is stage, current action, progress, and pending approval count.
Structured clarification cards — Claude's follow-up questions are answerable through choices and free-form replies, with the original question and answer transcript sent back to Claude on follow-up.
Provider and token transparency — major planner/worker/validator actions expose whether they used claude-cli, codex-cli, mock/test mode, or Gemini, plus token/cost data when available.
Current-only HOLDs — active blockers are shown prominently, while superseded historical HOLDs remain in logs/events instead of stale top banners.
Safety-preserving PR gate — draft PR creation remains blocked unless the latest Gemini verdict is PASS, system.allow_remote_writes is true, and repo policy explicitly permits outward writes.
Repo-bound worker (trust model) — when you select a repo and press Start, the worker executes inside an isolated git worktree of that repo (checked out at the committed HEAD, so your working tree and branches are untouched), never an empty scratch directory. If the selected repo is missing, disabled, or not a git repository, the mission HOLDs (HOLD-TARGET-REPO-UNAVAILABLE) rather than writing throwaway files and reporting "done". Evidence records the real changed-paths.json, repo path, and worktree path; touching a forbidden path (.env*, secrets/**, .github/**, AGENTS.md, CLAUDE.md) blocks the merge gate.
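The forbidden-path rule in the trust model can be illustrated with a small matcher over the changed-file list. This is a sketch, not the gate's implementation: `globToRegExp` and `findForbiddenPaths` are hypothetical helpers, and the glob semantics (patterns anchored at the repo root, `*` staying within one path segment, `**` spanning segments) are assumptions. Only the pattern list itself comes from the text.

```typescript
// Patterns quoted from the trust-model description above.
const FORBIDDEN_PATTERNS = [".env*", "secrets/**", ".github/**", "AGENTS.md", "CLAUDE.md"];

// Minimal glob support (assumed semantics): "**" spans segments, "*" does not cross "/".
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+^${}()|[\]\\?]/g, "\\$&");
  const pattern = escaped
    .replace(/\*\*/g, "\u0000") // protect "**" while "*" is rewritten
    .replace(/\*/g, "[^/]*")
    .replace(/\u0000/g, ".*");
  return new RegExp(`^${pattern}$`);
}

// Returns the subset of repo-relative changed paths that hit a forbidden pattern.
function findForbiddenPaths(
  changedPaths: string[],
  patterns: string[] = FORBIDDEN_PATTERNS,
): string[] {
  const matchers = patterns.map(globToRegExp);
  return changedPaths.filter((p) => matchers.some((m) => m.test(p)));
}
```

A non-empty result would feed the merge gate's hard BLOCK. Because the input is the real diff file list rather than evidence prose, a worker cannot avoid the check by simply not mentioning a file.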

For the detailed UX v2 implementation brief, see docs/handoff/operator-cockpit-ux-v2-prd-2026-05-31.md.

Running the E2E loop

# 0. One-time: capture a keychain-free subscription token
claude setup-token            # store the sk-ant-oat... value where your secrets live

# 1. Build the runner:e2e1 image (authoritative token counts)
docker build -f packages/runner/docker/Dockerfile.e2e1 \
  -t claude-code-247/runner:e2e1 packages/runner/docker

# 2. Real end-to-end loop: docker Claude coder → dual-family → draft PR → model_usage
#    (draft-only; never merges. Needs the OAuth token + OPENAI/GEMINI keys in env.)
node_modules/.bin/tsx scripts/e2e1-real-loop.ts

# 3. ClarificationGate shadow walk (deterministic; spends no LLM tokens)
node_modules/.bin/tsx scripts/e2e2-clarification-shadow-walk.ts

Safety model: these scripts pass allowRemoteWrites: true in-process to a draft-only PR gate; the global system.allow_remote_writes stays false. Because they pre-approve the mission, they deliberately bypass the daemon's approval path — so no ntfy phone approval is requested. To exercise the real approval flow (medium/high-risk merge, API fallback, etc.), run a mission through the daemon's IntakeService, which pushes an ntfy notification to your phone for approve/reject.
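The safety model above relies on a gate that takes the remote-write permission as an explicit input and fails closed when it is absent. A minimal sketch, with assumptions labeled: `DraftPrGateInput` and `checkDraftPrGate` are hypothetical names, and the REPO_DISABLED and FORBIDDEN_PATH reason codes are invented for illustration. REMOTE_WRITES_DISABLED is the only reason string taken from this document.

```typescript
type GateDecision = { allowed: true } | { allowed: false; reason: string };

// Hypothetical input shape for a draft-only PR gate.
interface DraftPrGateInput {
  allowRemoteWrites?: boolean; // omitted means false, mirroring the global default
  repoEnabled: boolean;
  forbiddenPaths: string[]; // forbidden files found in the real diff
}

// Fail closed: every check must pass explicitly before a draft PR may be opened.
function checkDraftPrGate(input: DraftPrGateInput): GateDecision {
  if (input.allowRemoteWrites !== true) {
    return { allowed: false, reason: "REMOTE_WRITES_DISABLED" };
  }
  if (!input.repoEnabled) {
    return { allowed: false, reason: "REPO_DISABLED" }; // hypothetical code
  }
  if (input.forbiddenPaths.length > 0) {
    return { allowed: false, reason: "FORBIDDEN_PATH" }; // hypothetical code
  }
  return { allowed: true };
}
```

Under this shape, a script that passes `allowRemoteWrites: true` in-process changes only its own call; any caller that omits the field still gets the closed default.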

⚡ Architecture today (v1.0.0) — dual kernel, single product

The dual-kernel layout below describes the system as of v1.0.0 GA and is retained as history. v2.0 collapses it to a single TypeScript control plane and removes the Python tree entirely. See V2_ARCHITECTURE.md for the target architecture and the stage-by-stage plan.

claude-code-247 is one product OS with two cooperating kernels:

| Layer | Implementation | Role |
| --- | --- | --- |
| Control plane | TypeScript `aedev` (pnpm monorepo) | Primary CLI, daemon, dashboard, state machine, mission intake, roadmap, task graph, approvals, memory, risk, preview/deploy orchestration, evidence bundle. |
| Execution kernel | Python `claude247` (v1.0.0 GA) | Mature Docker worker runtime, headless `claude --print` invocation, Gemini + OpenAI judges, GitHub PR creation. Invoked by `aedev` during the parity window. |
| Bridge | `@aedev/claude247-bridge` | Enqueues tasks into the Python state DB, polls status, imports evidence back into `aedev`'s SQLite. |

This dual-kernel design is recorded in ADR-0009, which supersedes ADR-0008. aedev is the primary entry point for new product-OS work; the Python kernel continues to drive worker execution and validator orchestration until the TypeScript runtime reaches parity (see docs/aedev-prototype-status.md for the parity gate list). Both ADRs will be superseded by ADR-0010 in Stage A of the v2.0 plan.

What you get

Multi-repo from day one. One registry, many repos. Per-repo budget, risk policy, allowed/forbidden paths.
Local-first execution. Mac + Docker. Your authenticated Claude Code session is the default; the paid API is opt-in.
Mobile control. claude247 status --plain and claude247 status-board --plain are built for SMS-sized output. ntfy.sh pushes for approvals and stuck tasks.
External validator isolation. Gemini 2.5 Pro and an OpenAI-compatible judge see only the evidence package — never the Coder's conversation.
Risk-scored auto-merge. Every change gets a 0–100 risk score: low risk merges automatically, medium asks your phone, and high blocks.
Long-term memory that compiles failures, lessons, and decisions back into per-repo .agent/*.md files.
Failure replay for any task.
Live read-only watchdog dashboard (new in v1.0.0 / M22b) — see below.
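The three-way risk outcome in the list above (auto-merge, ask the phone, block) can be sketched as a banding function. Everything numeric here is an assumption: the README does not state the band boundaries or whether a higher score means more risk, so the boundaries are passed in as parameters and the sketch assumes higher is riskier. `RiskBands`, `decideMerge`, ASK_PHONE, and BLOCK are hypothetical names; AUTO_MERGE appears elsewhere in this document.

```typescript
type MergeDecision = "AUTO_MERGE" | "ASK_PHONE" | "BLOCK";

// Hypothetical band boundaries; the real values live in the project's policy config.
interface RiskBands {
  lowMax: number; // scores at or below this are treated as low risk
  mediumMax: number; // scores above lowMax and at or below this are medium risk
}

// Assumption: higher score means higher risk.
function decideMerge(score: number, bands: RiskBands): MergeDecision {
  if (!Number.isFinite(score) || score < 0 || score > 100) {
    return "BLOCK"; // fail closed on an out-of-range score
  }
  if (score <= bands.lowMax) return "AUTO_MERGE";
  if (score <= bands.mediumMax) return "ASK_PHONE";
  return "BLOCK";
}
```

For example, with illustrative bands `{ lowMax: 30, mediumMax: 70 }`, a score of 50 would route to phone approval.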

Legacy / CLI Quick Start

aedev is the primary control plane. The Python claude247 kernel is installed alongside it during the parity window and handles worker execution underneath.

# 1. Install the Python execution kernel (mature, GA v1.0.0)
make install                    # creates venv + installs deps + launchd plists
claude247 doctor                # verify kernel environment

# 2. Install the TypeScript control plane
pnpm install
pnpm -r build

# 3. Initialize aedev home (~/.aedev/)
aedev init

# 4. Start the aedev daemon (port 7247) — control plane + dashboard
aedev daemon start
open http://localhost:7247

# 5. Submit a mission via the control plane (two-step approval)
aedev intake "refactor the auth middleware in repo my-repo"
aedev mission list              # find the mission id
aedev mission approve <id>      # explicit approval — no self-approve

# 6. Inspect status / tasks via the control plane
aedev status --plain
aedev task list

# 7. Read-only watchdog (Python kernel) — phone-friendly
claude247 status-board --plain
claude247 watchdog --plain
claude247 status-board --json
claude247 status-board --write-md M22_WATCHDOG_DASHBOARD.md

During the parity window, some kernel-level operations are still invoked directly via claude247 (worker launch, validator orchestration, GitHub PR creation). The @aedev/claude247-bridge package routes aedev missions through the Python kernel automatically — see ADR-0009 and docs/aedev-prototype-status.md.

Live watchdog dashboard

A read-only operations dashboard for "is the 24/7 daemon actually OK right now?" Designed to be safe to run from a phone while the dispatcher is mid-tick — the SQL is SELECT-only and the contract is asserted by a regression test (tests/unit/test_status_board.py::test_read_only_does_not_mutate_db).

Web (Apple-style): http://127.0.0.1:8423/status-board

Activity-ring soak progress (recolors green / blue / red by state) using only inline SVG + CSS — no charting library
Auto-refresh every 15s (configurable 5 / 15 / 30 / 60s / off); fetches /status-board.json, updates DOM in place, briefly tints cards that changed — no full reload, no flicker
EN ↔ 中文 language toggle with localStorage persistence
Dark mode follows prefers-color-scheme
Live indicator dot in the top bar — pulsing green when live, amber when paused, red when a fetch fails
Pause / resume / refresh-now controls with a morphing play/pause SVG button
Zero external dependencies — no CDN, no font files, no JS library; the whole page is ~25KB inline

CLI:

claude247 status-board --plain
# Claude247 Watchdog Dashboard
# Generated: 2026-05-25T...
#
# Release State / Soak Progress / Runtime Health
# Queue / Task State / Recent Signals / GA Gates / Usage

JSON: http://127.0.0.1:8423/status-board.json

{
  "generated_at": "...",
  "release_state": { "main_sha": "...", "ga_status": "..." },
  "soak":          { "t0": "...", "progress_percent": 38, "result": "PARTIAL" },
  "runtime_health":{ "launchd_loaded": 4, "dispatcher": "healthy", ... },
  "queue":         { "active_tasks": 0, "orphan_commands": 0, ... },
  "signals":       { "new_critical_errors": 0, "alert_storm": false, ... },
  "ga_gates":      { "passed": 18, "total": 19, "recommendation": "..." },
  "usage":         { "runs_total": 0, "active_workers": 0, ... }
}
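For consumers of /status-board.json, a partial TypeScript type plus a one-line summarizer might look like the sketch below. It types only the fields visible in the sample above; the elided fields are left out rather than guessed. The `StatusBoard` interface and the `summarize` function are hypothetical and not part of the project.

```typescript
// Partial view of the payload: only fields shown in the sample are typed.
interface StatusBoard {
  generated_at: string;
  soak: { t0: string; progress_percent: number; result: string };
  runtime_health: { launchd_loaded: number; dispatcher: string };
  queue: { active_tasks: number; orphan_commands: number };
  signals: { new_critical_errors: number; alert_storm: boolean };
  ga_gates: { passed: number; total: number; recommendation: string };
}

// Collapse the board into a single phone-friendly line.
function summarize(board: StatusBoard): string {
  return [
    `soak ${board.soak.progress_percent}% (${board.soak.result})`,
    `dispatcher ${board.runtime_health.dispatcher}`,
    `gates ${board.ga_gates.passed}/${board.ga_gates.total}`,
    `critical ${board.signals.new_critical_errors}`,
  ].join(" | ");
}
```

Given the sample values above, `summarize` yields `soak 38% (PARTIAL) | dispatcher healthy | gates 18/19 | critical 0`.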

The watchdog reads M20_SOAK_RESULT.md to auto-discover the dispatcher T0; pass --t0 2026-05-24T21:46Z to override.
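Soak progress relative to T0 is simple elapsed-time arithmetic over the 24h window. The sketch below is an illustration, not the watchdog's code: `soakProgressPercent` is a hypothetical name, and clamping to 0–100 and rounding to the nearest integer are assumptions.

```typescript
const SOAK_HOURS = 24; // the 24h soak window named in this README

// Percentage of the soak window elapsed at `now`, clamped to [0, 100].
function soakProgressPercent(
  t0Iso: string,
  now: Date,
  soakHours: number = SOAK_HOURS,
): number {
  const t0 = Date.parse(t0Iso);
  if (Number.isNaN(t0)) {
    throw new Error(`invalid T0: ${t0Iso}`);
  }
  const elapsedMs = now.getTime() - t0;
  const ratio = elapsedMs / (soakHours * 60 * 60 * 1000);
  return Math.round(Math.min(1, Math.max(0, ratio)) * 100);
}
```

With T0 at 2026-05-24T21:46Z, a reading taken 9h 12m later gives 9.2 / 24 ≈ 0.383, which rounds to 38, the same figure as `progress_percent` in the sample JSON.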

Status

v1.0.0 GA — released 2026-05-25 (Python claude247 kernel).
The first GA release. See RELEASE_NOTES_GA.md for the full notes, GA_GATE.md for the 19-gate GA contract, and M22_GA_DECISION_REPORT.md for the GA decision record.
Soak gate was explicitly waived by the owner after ~9h 12m of healthy soak evidence (4/4 launchd loaded, ~1182 dispatcher idle ticks, backup completed, 0 alerts, 0 orphan commands, $0 Anthropic worker spend). Final T+24h observation is a post-GA follow-up — the watchdog dashboard will auto-flip soak.result to PASS or FAIL once wall-clock crosses 2026-05-25T21:46Z.
Pre-release history (alpha.0 → beta.2) preserved on GitHub.
v2.2.0-rc2 is production grade for the TypeScript line — single TypeScript daemon, Python tree removed, HOLD as first-class state, closed-loop approval (ntfy/Tailscale), push-time security gate, resumable moves, cross-platform supervisor, chaos drills, Agent Mesh, RoadmapAgent, and Sentinel. The formal policy is docs/operations/release-policy.md.
No v2.1.0 or v2.2.0 GA tag is expected under the current policy. The expected v2 release references are v2.1.0-rc1, v2.1.0-rc2, v2.2.0-rc1, and v2.2.0-rc2.

Documentation

v2 TypeScript line:

V2_ARCHITECTURE.md — full v2.0 architecture and stage-by-stage implementation plan (start here)
docs/operations/release-policy.md — current release-grade and tag policy

v1.0.0 (current GA):

RELEASE_NOTES_GA.md — v1.0.0 release notes
GA_GATE.md — 19-gate GA contract + owner-waiver policy
M22_GA_DECISION_REPORT.md — GA decision record
M20_SOAK_RESULT.md — soak observation + waiver record
DEFINITION_OF_DONE.md — DoD checklist
CHANGELOG.md — release history
docs/ARCHITECTURE.md — module map and data flow (v1.0.0)
docs/INSTALL.md — full install + uninstall + doctor
docs/REMOTE_DISPATCH.md — phone / Remote / Dispatch operating guide
docs/SECURITY.md — secret hygiene, forbidden paths, approval flow
docs/MEMORY.md — vector + .agent file architecture
docs/AUTO_MERGE_POLICY.md — risk scoring and merge gates
docs/VALIDATORS.md — Gemini + OpenAI judge contracts
docs/REPO_ONBOARDING.md — adding repos
docs/OPERATIONS.md — day-to-day operating playbook

Working on the TypeScript control plane (`aedev`)

# Install dependencies (Node.js ≥ 20, pnpm ≥ 10 required)
pnpm install

# Run all tests
pnpm test

# Type-check across the workspace
pnpm typecheck

# Lint
pnpm lint

# Opt-in real subprocess smoke tests (require `claude` and/or Docker on PATH)
AEDEV_SMOKE_CLAUDE=1 pnpm test --filter @aedev/runner
AEDEV_SMOKE_DOCKER=1 pnpm test --filter @aedev/runner

# Start the daemon (port 7247) — serves the dashboard + REST API
cd packages/daemon && pnpm start
open http://localhost:7247

Architecture decisions for aedev: docs/adr/ (ADR-0001 through ADR-0009).

TS runtime parity gates: docs/aedev-prototype-status.md.

License

Internal.
