feat: elevate codex with claude depth and shared runtime features#1

Merged
mumit merged 3 commits into main from codex/elevate-with-claude-depth
May 1, 2026

@mumit mumit commented May 1, 2026

Summary

Two commits forming one logical elevation effort:

  1. feat: deepen claude-dev-team parity — closes the v1.0.0 audit gaps: documents the stage-renumbering divergence, fleshes out role prompts (24 → 100-170 lines each), adds the Stage 0 safety stoplist + budget-gate config + async-checkpoint config, hardens approval-derivation.js with file locking and atomic writes, ports the audit-phases reference, and replaces the row-existence parity check with a deep content-depth check.

  2. feat: elevate codex with claude depth and shared runtime features — ports claude-dev-team's prose depth AND adds three new runtime features neither framework had.
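The file locking and atomic writes mentioned for approval-derivation.js can be pictured with a minimal sketch. This is not the hook's actual code: the helper names `withLock` and `writeFileAtomic`, the lock-file convention, and the `approval.json` file name are all assumptions made for illustration.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: take an exclusive lock by creating a lock file.
// The 'wx' flag makes openSync throw EEXIST if the file already exists,
// so only one process at a time gets past this line.
function withLock(lockPath, fn) {
  const fd = fs.openSync(lockPath, 'wx');
  try {
    return fn();
  } finally {
    fs.closeSync(fd);
    fs.unlinkSync(lockPath);
  }
}

// Hypothetical helper: write to a temp file next to the target, then rename.
// A rename within one directory replaces the target in a single step on POSIX
// filesystems, so readers never observe a half-written file.
function writeFileAtomic(targetPath, contents) {
  const tmpPath = `${targetPath}.${process.pid}.tmp`;
  fs.writeFileSync(tmpPath, contents);
  fs.renameSync(tmpPath, targetPath);
}

// Usage: the file name is illustrative, not taken from the repository.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'approval-'));
const target = path.join(dir, 'approval.json');
withLock(`${target}.lock`, () => {
  writeFileAtomic(target, JSON.stringify({ approved: true }));
});
```

This sketch fails fast when the lock is already held; a real hook would more likely retry with a timeout and clean up stale locks.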

Prose depth ported from claude

  • .codex/rules/pipeline.md 145 → 393 lines (review shape scoped/matrix, READ-ONLY Reviewer Rule, gate merge strategy, review round limit, stage durations, parallelism)
  • .codex/rules/gates.md 74 → 280 lines (per-stage extra-field examples)
  • .codex/rules/coding-principles.md 62 → 151 lines
  • .codex/rules/execution-profiles.md 17 → 106 lines (full local/app-worktree/cloud model with parallelism patterns)

New narrative artifacts

  • EXAMPLE.md: 218-line end-to-end pipeline walkthrough (codex-ized password-reset feature)
  • CHANGELOG.md — versions v1.0.0, v1.1.0 (this), v1.2.0 (unreleased placeholder)
  • CONTRIBUTING.md — local dev setup, test/lint/parity commands, PR conventions, stage-numbering note

New runtime features (neither framework had these)

| Feature | Script | What it does |
| --- | --- | --- |
| Budget tracking | scripts/budget.js | Honors budget.enabled in .codex/config.yml; writes pipeline/budget.md; emits stage-budget.json ESCALATE on overrun (or warns). init/update/check subcommands. |
| Async-checkpoint auto-pass | applyCheckpointAutoPass() in codex-team.js | Honors checkpoints.{a,b,c}.auto_pass_when config (no_warnings, all_criteria_passed); writes CHECKPOINT-AUTO-PASS: line to context. Stoplist override prevents auto-pass on security-sensitive runs. |
| Pipeline visualization | scripts/visualize.js | Generates a Mermaid stateDiagram-v2 of the active pipeline run, color-coded by gate status (PASS/FAIL/ESCALATE/missing). Writes pipeline/diagram.md. |
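As a rough illustration of the budget check described above, the decision could look something like the following. Only `budget.enabled` and the ESCALATE/warn outcomes come from this PR; the `limit` and `escalate` keys, the `spent` argument, and the function name `checkBudget` are assumptions, not the contents of scripts/budget.js.

```javascript
// Hypothetical config shape: only `budget.enabled` is named in the PR text.
// `limit`, `escalate`, and `spent` are illustrative stand-ins.
function checkBudget(config, spent) {
  const budget = (config && config.budget) || {};

  // Disabled mode is a no-op: nothing is checked and nothing is emitted.
  if (!budget.enabled) {
    return { status: 'SKIPPED' };
  }

  // Within budget: the stage may proceed.
  if (spent <= budget.limit) {
    return { status: 'PASS', spent, limit: budget.limit };
  }

  // Overrun: escalate by default, or only warn when escalation is turned off.
  const status = budget.escalate === false ? 'WARN' : 'ESCALATE';
  return { status, spent, limit: budget.limit };
}

const result = checkBudget({ budget: { enabled: true, limit: 100 } }, 140);
console.log(result.status);
```

Returning a plain result object keeps the decision separate from the file writes (pipeline/budget.md, stage-budget.json), which makes the escalate and warn paths easy to test in isolation.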

Test rigor

125 → 169 tests across 16 suites. New tests:

  • tests/budget.test.js — 15 tests (init, update, check escalate/warn paths, disabled-mode no-op)
  • tests/checkpoints.test.js — 15 tests (each condition + null default + stoplist override)
  • tests/visualize.test.js — 14 tests (empty/active/complete pipelines, valid Mermaid syntax)
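The checkpoint cases listed above (each condition, the null default, and the stoplist override) can be sketched as one small decision function. The config keys `checkpoints.<id>.auto_pass_when`, `no_warnings`, and `all_criteria_passed` come from the PR; the function name `shouldAutoPass` and the `run` fields (`warnings`, `criteria`, `stoplistHit`) are assumptions, not the signature of applyCheckpointAutoPass().

```javascript
// Hypothetical decision helper. The `run` object shape is illustrative.
function shouldAutoPass(config, checkpoint, run) {
  const rule = config?.checkpoints?.[checkpoint]?.auto_pass_when ?? null;

  // Null default: no auto-pass configured, so a human must approve.
  if (rule === null) return false;

  // Stoplist override: security-sensitive runs never auto-pass.
  if (run.stoplistHit) return false;

  if (rule === 'no_warnings') {
    return run.warnings.length === 0;
  }
  if (rule === 'all_criteria_passed') {
    // Require at least one criterion so an empty list does not pass vacuously.
    return run.criteria.length > 0 && run.criteria.every((c) => c.passed);
  }

  // Unknown condition names are treated as "do not auto-pass".
  return false;
}

const config = { checkpoints: { a: { auto_pass_when: 'no_warnings' } } };
const cleanRun = { warnings: [], criteria: [], stoplistHit: false };
console.log(shouldAutoPass(config, 'a', cleanRun));
```

Checking the stoplist before any condition means a misconfigured or overly permissive rule still cannot bypass review on a sensitive run.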

Plus all earlier deepening tests (parity-check main() + mutation tests, role-prompt line-count checks, config-key validation, audit-phases reference).

Stage numbering preserved

Codex keeps its collapsed numbering (Stage 5 = pre-review with security_review_required flag). The translation table to claude's Stage 4.5a/4.5b lives in docs/parity/claude-dev-team-parity.md under "Stage Numbering Divergence".

Test plan

  • npm test — all 169 pass
  • npm run doctor — all PASS
  • npm run parity:check — passes (deep check)
  • npm run lint — passes
  • npm run budget -- init — works (no-op when disabled)
  • npm run visualize — writes pipeline/diagram.md with valid Mermaid
  • npm run pipeline -- "Test feature" — workspace bootstraps cleanly
  • npm run next — track-aware advancement still works
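For readers who want a feel for what "valid Mermaid" means in the visualize step, here is a minimal sketch of rendering a stateDiagram-v2 with one class per gate status. The stage ids, colors, and the function name `renderDiagram` are assumptions; the real scripts/visualize.js may structure its output differently.

```javascript
// Hypothetical colors; only the four statuses come from the PR text.
const CLASS_DEFS = {
  pass: 'fill:#c8e6c9',
  fail: 'fill:#ffcdd2',
  escalate: 'fill:#ffe0b2',
  missing: 'fill:#eeeeee',
};

// Render stages as a linear Mermaid state diagram, colored by gate status.
// Each stage is { id, status }, where status is PASS, FAIL, ESCALATE, or absent.
function renderDiagram(stages) {
  const lines = ['stateDiagram-v2'];
  for (const [name, style] of Object.entries(CLASS_DEFS)) {
    lines.push(`  classDef ${name} ${style}`);
  }

  // Chain the stages from the start marker in pipeline order.
  let previous = '[*]';
  for (const stage of stages) {
    lines.push(`  ${previous} --> ${stage.id}`);
    previous = stage.id;
  }

  // Attach a style class per stage; absent or unknown statuses map to "missing".
  for (const stage of stages) {
    const key = String(stage.status || '').toLowerCase();
    const cls = CLASS_DEFS[key] ? key : 'missing';
    lines.push(`  class ${stage.id} ${cls}`);
  }
  return lines.join('\n');
}

const diagram = renderDiagram([
  { id: 'stage1', status: 'PASS' },
  { id: 'stage2', status: 'FAIL' },
  { id: 'stage3' },
]);
console.log(diagram);
```

Stage ids are kept free of spaces because Mermaid state identifiers are single tokens; a longer label would need a separate state description line.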

🤖 Generated with Claude Code

mumit and others added 3 commits May 1, 2026 13:25
Address audit findings where v1.0.0 had structural parity but shallow
content parity. Document the stage-renumbering divergence, port
behavioral content from claude agents into role prompts, add the
safety stoplist + budget gate + async checkpoints from claude's Stage
0, harden the approval-derivation hook with file locking and atomic
writes, port the audit-phases reference, and replace the row-existence
parity check with a deep content check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Port claude-dev-team's pipeline rules depth (review shape, READ-ONLY rule,
gate merge strategy, round limit, durations, parallelism), gates per-stage
extra-field examples, coding principles, and execution profiles. Add
EXAMPLE.md walkthrough (218 lines), CHANGELOG.md, and CONTRIBUTING.md.

Implement three runtime features neither framework had: budget tracking
(scripts/budget.js), async-checkpoint auto-pass logic, and Mermaid pipeline
visualization (scripts/visualize.js). 44 new tests; 169/169 passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

When two gate files land in the same millisecond (common on fast CI
filesystems), mtime-only sort is unstable and may pick the wrong
"latest" gate. Add filename localeCompare as a stable secondary sort
so latest-mode validation is reproducible.

Resolves CI flake on tests/gate-validator.test.js:164
"validates every gate when requested".
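The tie-break described in this commit can be sketched as a two-key comparator: modification time first, then file name via localeCompare. The record shape (`name`, `mtimeMs`), the function name `sortGatesNewestFirst`, and the choice of descending order are assumptions; the commit message does not say which direction the validator sorts.

```javascript
// Hypothetical gate records: { name, mtimeMs }. Sorts newest first and
// breaks mtime ties by file name so the result is the same on every run.
function sortGatesNewestFirst(gates) {
  return [...gates].sort(
    (a, b) => b.mtimeMs - a.mtimeMs || b.name.localeCompare(a.name)
  );
}

const gates = [
  { name: 'stage-01.json', mtimeMs: 1000 },
  { name: 'stage-02.json', mtimeMs: 1000 }, // same millisecond as stage-01
  { name: 'stage-00.json', mtimeMs: 500 },
];

const latest = sortGatesNewestFirst(gates)[0];
console.log(latest.name);
```

With the secondary key in place, the "latest" gate no longer depends on the order in which the directory listing happened to return the files.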
@mumit mumit merged commit e388c52 into main May 1, 2026
4 checks passed