Make review comment and refactor marker conflicts convergent #402

Merged
loning merged 10 commits into ChronoAIProject:dev from louis4li:fix/pr-review-thread-completion-gate on Jun 4, 2026

@louis4li louis4li commented Jun 1, 2026

Summary

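  • Clarify that a fix driven by a PR review comment must reply to and close the original thread, or explicitly escalate.
  • Keep strict provenance for self-doc-comment, and require issue-only refactor markers to be normalized to iter/cluster markers.
  • Route deterministic marker format conflicts to retry-fix, so they do not stall on human escalation.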

Test plan

  • python3 skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py

⟦AI:AUTO-LOOP⟧

louis4li and others added 2 commits June 1, 2026 15:29
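Add the completion definition for PR review comment fixes to the GitHub State Contract: a fix driven by a review comment must reply to and resolve the original thread, or be explicitly escalated, so that a code commit alone does not leave the GitHub state loop unclosed.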

⟦AI:AUTO-LOOP⟧

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
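Trigger: aevatarAI/aevatar#1624, where the review-gate was stuck on repeated rejects over the self-doc marker identity format
Behavior type: tighten codex-refactor-loop prompt routing, keep strict self-doc-comment provenance, and list issue-only marker normalization as in-scope for fix-codex
Equivalent semantics: conflicts of this kind over non-canonical Refactor marker identity are no longer escalated to a human; retry-fix carries an exact normalization instruction and keeps converging
Future reuse: downstream hosts with HOST_REFACTOR_COMMENT_POLICY=self-doc-comment can maintain iter/cluster provenance while avoiding stalls on deterministic text-format conflicts
Failure trace ownership: test_refactor_comment_policy_prompt_contract.py covers the implement/review-fix/quality/reflector contracts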

⟦AI:AUTO-LOOP⟧

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@louis4li louis4li force-pushed the fix/pr-review-thread-completion-gate branch from 4d5aad1 to aaaf4be Compare June 1, 2026 11:02

loning commented Jun 1, 2026

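🤖 Tests review: missing regression test for the review-thread closure contract

TL;DR

  • What this is: an independent review of PR 402 from the tests perspective.
  • Conclusion: reject, because the new PR review comment fix completion contract is not locked by any mechanical test.
  • Next step: add a source-regression test for the GitHub State Contract in SKILL.md; if enforcement logic already exists, also add or point to a behavior test.

Details

The prompt changes for normalizing non-canonical refactor markers already have test coverage: test_refactor_comment_policy_prompt_contract.py asserts META_RESOLVED:retry-fix:<exact normalization instruction> and "not human escalation", and it passes when run locally.

The blocker is elsewhere: skills/codex-refactor-loop/SKILL.md:532 adds a GitHub State Contract rule requiring that a fix driven by a PR review comment reply to and resolve the original thread, or explicitly escalate. This is a new process constraint, but it currently appears only in the SKILL.md text, with no source-regression test or behavior test covering it. Under this repository's rule that "behavior constraints must land in a mechanical verification mechanism", this has to be added before merge.


📎 Full raw codex output (archived for reference)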
---
pr: 402
role: tests
verdict: reject
---

## Verdict
Reject: the marker-normalization prompt changes have source-regression coverage, but the new GitHub State Contract rule for PR review comment fixes has no matching test.

## Evidence
- `skills/codex-refactor-loop/SKILL.md:532` adds a new completion contract: PR review comment fixes are incomplete until the original review thread is replied to and resolved, or explicitly escalated. `rg "PR review comment fix|review-thread closure|original thread|replied to and resolved"` finds this only in `SKILL.md`, with no matching source-regression or behavior test under `skills/codex-refactor-loop/scripts/`.
- `skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py:122` adequately locks the new deterministic marker-normalization prompt route (`META_RESOLVED:retry-fix:<exact normalization instruction>` and "not human escalation"), and `python3 -m unittest skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py` passes. The gap is specifically the separate review-thread closure contract added to `SKILL.md`.

## What would change your verdict
Add a focused source-regression test that reads `SKILL.md` and asserts the GitHub State Contract contains the `PR review comment fix` row with the required reply-and-resolve-or-explicitly-escalate completion semantics. If there is existing controller/comment-monitor logic that enforces review-thread closure, add or point to a behavior test for that path as well.

⟦AI:AUTO-LOOP⟧
REVIEW_DONE:402:tests:reject

⟦AI:AUTO-LOOP⟧


loning commented Jun 1, 2026

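🤖 Quality review passed: no readability or simplicity blockers in PR 402

TL;DR

  • What this is: I re-checked the prompt and source-regression test changes in PR 402 from the code quality perspective.
  • Conclusion: approve; no issues found with naming, dead code, over-abstraction, complexity, or unrelated changes.
  • Next step: the controller can continue aggregating the other reviewers' verdicts.

Details

This set of changes folds "non-canonical refactor marker identity" into the existing flow: the implement prompt spells out the canonical marker, the review-fix prompt classifies deterministic marker normalization under the existing (A) fixable in-scope path, the meta-reflector retries the fix only when stalled, and reviewer-quality is required to give the exact expected marker. No new helper, interface, script entry point, or second set of parser rules is added.

On the test side, only the source-regression string assertions in test_refactor_comment_policy_prompt_contract.py are extended, to lock the prompt contract. From a quality standpoint this is appropriately narrow verification, not an unused public surface.


📎 Full raw codex output (archived for reference)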
    ---
    pr: 402
    role: quality
    verdict: approve
    ---
    
    ## Verdict
    Approve: the PR keeps the change focused on the review-comment/refactor-marker convergence rules, with readable prompt edits and no new complexity, dead code, or unrelated cleanup.
    
    ## Evidence
    - skills/codex-refactor-loop/prompts/implement.md:34: The canonical marker identity rule is stated directly at the existing self-doc policy point, so it avoids a second flow or helper vocabulary.
    - skills/codex-refactor-loop/prompts/review-fix.md:47: The fix-codex rule classifies marker normalization as the existing (A) fixable in-scope path, which is simple and avoids an unnecessary human-decision branch.
    - skills/codex-refactor-loop/prompts/meta-reflector-stalled.md:36: The stalled-route addition is a single deterministic retry rule scoped to non-canonical marker identity, not a broad new escalation mechanism.
    - skills/codex-refactor-loop/prompts/reviewer-quality.md:29: The reviewer instruction asks for the exact expected canonical marker, which keeps feedback concrete and reviewable.
    - skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py:122: The new source-regression test locks the new routing language without introducing unused helpers, public APIs, or duplicated parser logic.
    
    ⟦AI:AUTO-LOOP⟧
    REVIEW_DONE:402:quality:approve

⟦AI:AUTO-LOOP⟧


loning commented Jun 1, 2026

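🤖 Architect review blocked: the new GitHub closure contract lacks mechanical verification

TL;DR

  • What this is: an independent review of PR 402 from the architect perspective.
  • Conclusion: reject; the PR adds a PR review comment fix completion contract, but the tests cover only the marker normalization prompt.
  • Next step: add a source-regression test for the SKILL.md GitHub State Contract row; if the contract is meant to actually drive thread reply/resolve, matching behavior coverage is needed as well.

Details

skills/codex-refactor-loop/SKILL.md:532 adds a GitHub state completion definition: a fix driven by a review comment must reply to and resolve the original thread, or explicitly escalate. This is a process/behavior contract, not ordinary explanatory text.

The new tests concentrate on the refactor marker normalization wording at skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py:48-128 and do not lock the PR review comment fix row, the review-thread closure wording, or the GitHub State Contract table. Under the CLAUDE.md / AGENTS.md rules, a behavior change must ship with a behavior test plus a source-regression test, and governance/process rules must enter the repository together with their mechanical verification.

Suggested minimal fix: add a source-regression test that reads skills/codex-refactor-loop/SKILL.md and asserts the PR review comment fix row and the reply/resolve-or-escalate semantics. If this contract is already meant to be enforced by controller/comment handling, add the corresponding behavior test too.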
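
A minimal sketch of what such a source-regression check could look like, in Python. The row wording inside `SKILL_TEXT` and the helper names `find_contract_row` and `assert_review_thread_contract` are assumptions for illustration; a real test would read `skills/codex-refactor-loop/SKILL.md` from disk and assert the exact text used there.

```python
from typing import Optional

# Stand-in for the contents of skills/codex-refactor-loop/SKILL.md.
# The row wording is illustrative only, not copied from the PR.
SKILL_TEXT = """\
## GitHub State Contract

| State | Completion definition |
| --- | --- |
| PR review comment fix | Completion includes review-thread closure: reply to and resolve the original thread, or explicitly escalate. |
"""


def find_contract_row(text: str, row_name: str) -> Optional[str]:
    """Return the first markdown table row whose first cell equals row_name."""
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if cells[0] == row_name:
            return line
    return None


def assert_review_thread_contract(text: str) -> None:
    """Literal assertions that lock the row and its closure/escalation wording."""
    row = find_contract_row(text, "PR review comment fix")
    assert row is not None, "PR review comment fix row is missing"
    assert "review-thread closure" in row, "closure wording is missing"
    assert "reply to and resolve" in row, "reply+resolve wording is missing"
    assert "escalate" in row, "explicit escalation exit is missing"


assert_review_thread_contract(SKILL_TEXT)
```

A check of this kind only guards the wording; it fails if the row is dropped or weakened, but it says nothing about whether any code path enforces the rule.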


📎 Full raw codex output (archived for reference)
---
pr: 402
role: architect
verdict: reject
---

## Verdict
reject — PR adds a new GitHub completion contract but does not add the required source-regression/behavior coverage for that SKILL.md contract change.

## Evidence
- `skills/codex-refactor-loop/SKILL.md:532` adds a new required visible state contract: `PR review comment fix | Completion includes review-thread closure...`; `skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py:48-128` only locks refactor-comment marker normalization prompt text and has no assertion for `PR review comment fix`, `review-thread closure`, or the GitHub State Contract table. This regresses CLAUDE.md / AGENTS.md: `A behavior change must ship with a **behavior test** (asserting the behavior itself) + a **source-regression test** (literal assertions on SKILL.md section headings, narrow allowlist literals, authorization-source paths, etc.), to prevent "docs changed but implementation not" or the reverse.`
- Same gap also violates the governance coupling rule in CLAUDE.md / AGENTS.md: `Governance first: architectural / process rules enter the repository together with their mechanical verification; if either is missing, it is not patched over with wording.`

## What would change your verdict
Add mechanical coverage for the new `GitHub State Contract` row. Minimum acceptable fix: a source-regression test that reads `skills/codex-refactor-loop/SKILL.md` and asserts the `PR review comment fix` row plus the closure/escalation wording. If this is intended to drive actual GitHub thread reply/resolve behavior, add the matching behavior test or wire it into the existing controller/comment handling tests so the new completion definition is executable, not prose-only.

⟦AI:AUTO-LOOP⟧
REVIEW_DONE:402:architect:reject

⟦AI:AUTO-LOOP⟧

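…rule (fill the review-gate gap)

Responds to the PR #402 review-gate (architect + tests reject: the new GitHub State Contract completion rule lacks mechanical coverage):
test_refactor_comment_policy_prompt_contract.py adds source-regression assertions that lock the new SKILL.md rule text
+ the prompt changes it requires (a review-comment fix must reply to/close the thread or escalate, marker normalization, format conflicts go through retry-fix).
Adds locking tests only; louis4li's product logic is unchanged.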

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

loning commented Jun 1, 2026

🤖 tests review r2: approve

TL;DR

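  • What this is: the second-round review of PR 402 from the tests perspective.
  • Conclusion: approve; the prompt-contract changes have matching source-regression coverage.
  • Next step: the controller can include this round's tests verdict in the review-gate summary.

Details

This diff mainly changes  and the contract text of several prompts, and extends  to match. The new tests cover three key behaviors: a PR review comment fix must close the loop on the original review thread; under  the marker identity must keep the  form; and deterministic marker normalization must go through retry-fix rather than human escalation.

I checked the diff for test-quality risks: no new sleep/delay pacing, no  or manual category, and no weakened assertions. Target test run: , result , OK.


📎 Full raw codex output (archived for reference)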
---
pr: 402
role: tests
verdict: approve
---

## Verdict
Test coverage and test quality are adequate for this prompt-contract diff.

## Evidence
- `skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py:25` adds a source-regression assertion for the new `GitHub State Contract` row requiring PR review-thread closure or explicit escalation.
- `skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py:63` extends the existing `self-doc-comment` contract test to lock canonical `Refactor (iter${ITERATION}/${CLUSTER_ID})` marker identity and the fix-codex non-human-decision route.
- `skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py:137` adds focused coverage for deterministic marker normalization routing to `META_RESOLVED:retry-fix`, which covers the new `meta-reflector-stalled.md` behavior text.
- The diff adds no `sleep`/`delay`, no skip/manual-test marker, and no weakened assertions; the changed test remains source-regression style, matching this repository's prompt-contract test convention.
- Verification run: `python3 skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py` passed, `Ran 7 tests in 0.002s`.

⟦AI:AUTO-LOOP⟧
REVIEW_DONE:402:tests:approve

⟦AI:AUTO-LOOP⟧


loning commented Jun 1, 2026

🤖 tests review r2: approve (corrected summary)

TL;DR

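  • What this is: the second-round review of PR 402 from the tests perspective.
  • Conclusion: approve; the prompt-contract changes have matching source-regression coverage.
  • What the maintainer needs to do: no action needed; the controller can include this in the review-gate summary.

Details

The raw artifact in the previous comment is complete, but the inline code in its body was eaten by shell expansion; this is a clean version.

This diff mainly changes SKILL.md and the contract text of several prompts, and extends test_refactor_comment_policy_prompt_contract.py to match. The new tests cover three key behaviors: a PR review comment fix must close the loop on the original review thread; under self-doc-comment the marker identity must keep the Refactor (iter${ITERATION}/${CLUSTER_ID}) form; and deterministic marker normalization must go through META_RESOLVED:retry-fix, not human escalation.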
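
To make the marker-identity rule concrete, here is a rough Python sketch of a canonical-form check that emits a retry-fix instruction instead of escalating. It assumes ITERATION is an integer and CLUSTER_ID is a short alphanumeric token. The issue-only marker shape `Refactor (#1624)`, the wording after the `META_RESOLVED:retry-fix:` prefix, and the function name are all hypothetical and not taken from the PR.

```python
import re
from typing import Optional

# Assumed shapes: ITERATION is an integer, CLUSTER_ID a short token.
CANONICAL = re.compile(r"Refactor \(iter(\d+)/([A-Za-z0-9_-]+)\)")
# Any "Refactor (...)" marker, canonical or not.
ANY_MARKER = re.compile(r"Refactor \(([^)]*)\)")


def normalization_instruction(comment: str, iteration: int, cluster_id: str) -> Optional[str]:
    """Return a retry-fix instruction for a non-canonical marker, else None."""
    found = ANY_MARKER.search(comment)
    if found is None or CANONICAL.fullmatch(found.group(0)):
        return None  # no marker, or the marker is already canonical
    expected = f"Refactor (iter{iteration}/{cluster_id})"
    # The instruction wording after the prefix is hypothetical.
    return f"META_RESOLVED:retry-fix:replace '{found.group(0)}' with '{expected}'"


print(normalization_instruction("// Refactor (#1624): split parser", 3, "c-12"))
```

Because the expected marker can be computed from the iteration and cluster, the conflict is deterministic, which is the argument for routing it back to fix rather than to a human.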

I checked the diff for test-quality risks: no new sleep/delay pacing, no [Skip] or manual category, and no weakened assertions. Target test run: python3 skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py, result: Ran 7 tests in 0.002s, OK.


📎 Full raw codex output (archived for reference)
---
pr: 402
role: tests
verdict: approve
---

## Verdict
Test coverage and test quality are adequate for this prompt-contract diff.

## Evidence
- `skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py:25` adds a source-regression assertion for the new `GitHub State Contract` row requiring PR review-thread closure or explicit escalation.
- `skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py:63` extends the existing `self-doc-comment` contract test to lock canonical `Refactor (iter${ITERATION}/${CLUSTER_ID})` marker identity and the fix-codex non-human-decision route.
- `skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py:137` adds focused coverage for deterministic marker normalization routing to `META_RESOLVED:retry-fix`, which covers the new `meta-reflector-stalled.md` behavior text.
- The diff adds no `sleep`/`delay`, no skip/manual-test marker, and no weakened assertions; the changed test remains source-regression style, matching this repository's prompt-contract test convention.
- Verification run: `python3 skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py` passed, `Ran 7 tests in 0.002s`.

⟦AI:AUTO-LOOP⟧
REVIEW_DONE:402:tests:approve

⟦AI:AUTO-LOOP⟧


loning commented Jun 1, 2026

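🤖 Architect review: blocked, the new completion contract lacks a behavior guard

TL;DR

  • What this is: an architect re-review of PR 402, checking the CLAUDE.md/AGENTS.md architecture constraints.
  • Conclusion: reject; SKILL.md adds a PR review thread completion rule, but so far there are only textual assertions.
  • Next step: implement codex needs to add actual completion enforcement and a behavior test, while keeping the existing source-regression test.

Details

The newly added rule is a behavior contract: a fix driven by a review comment cannot count as complete just because of a local FIX_DONE or because the next reviewer round passes; it must also reply to and resolve the original GitHub review thread, or explicitly escalate. The diff currently only asserts in test_refactor_comment_policy_prompt_contract.py that SKILL.md contains this sentence. That guards against the documentation being lost, but it does not prove that the controller/fix-completion path actually checks GitHub thread closure.

CLAUDE.md requires this kind of behavior change to land with both a behavior test and a source-regression test. The source-regression test now exists; what is missing is a behavior test that fails when "the original thread has not been replied to/resolved and there is no escalation evidence", along with the corresponding execution path.


📎 Full raw codex output (archived for reference)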
---
pr: 402
role: architect
verdict: reject
---

## Verdict
Reject: the PR adds a new review-thread completion contract, but only pins the wording with source-regression assertions and does not add behavior coverage for the new required GitHub-side completion step.

## Evidence
- `skills/codex-refactor-loop/SKILL.md:532` adds a new required state transition: "PR review comment fix" completion is incomplete until the original thread is replied to and resolved, or explicitly escalated. This is a behavior contract, not just vocabulary. CLAUDE.md requires: "A behavior change must ship with a **behavior test** (asserting the behavior itself) + a **source-regression test** (literal assertions on SKILL.md section headings, narrow allowlist literals, authorization-source paths, etc.), to prevent \"docs changed but implementation not\" or the reverse."
- `skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py:25` only adds source-regression checks that the `SKILL.md` sentence exists. I found no behavior test or implementation path that verifies a PR-review-comment-driven fix cannot be considered complete until the original GitHub review thread is replied to and resolved, or explicitly escalated. CLAUDE.md also requires: "**Changes must be verifiable**: behavior constraints must land in a mechanical verification mechanism (behavior test / source-regression test / section lint); a constraint carried only by \"the agent should remember\" is treated as not landed."

## What would change your verdict
Add the missing behavior enforcement and behavior test for the new completion contract. Concretely: make the controller/fix-completion path record or check the original PR review thread reply+resolution or explicit escalation before treating a PR review comment fix as complete, and add a behavior test that fails when a fix completes without that GitHub thread closure/escalation evidence. Keep the existing source-regression assertions as the documentation guard.

⟦AI:AUTO-LOOP⟧
REVIEW_DONE:402:architect:reject

⟦AI:AUTO-LOOP⟧

@loning loning added crnd:lifecycle:managed loop-managed item crnd:phase:fixing fix codex in flight crnd:human:auto auto-advancing, no human needed crnd:phase:reviewing review-gate reviewers in flight and removed crnd:phase:fixing fix codex in flight labels Jun 1, 2026

loning commented Jun 1, 2026

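📊 Current status: review complete, waiting for the author to update (no controller intervention needed)

Dimension
Phase: reviewing (automatic review complete)
Automatic review verdicts: architect: reject (requires a behavior test / source-regression test for the new GitHub closure completion contract; see the architect review comment above) · tests: approve
PR source: fork (louis4li:fix/pr-review-thread-completion-gate), CI all green, base dev
Why the loop does not auto-dispatch a fix: the PR head is on a fork branch, and auto-loop cannot push fixes to the fork; adding the behavior guard requires the author to update the fork, or a maintainer to decide to take the work over onto a loop-owned branch
Human intervention needed: optional. Once the author adds the behavior/source-regression test per the architect feedback and pushes to that branch, re-review starts automatically; alternatively a maintainer can direct a takeover

Added crnd:lifecycle:managed so the PR is included in comment monitoring. After the author updates the head, the next wakeup automatically re-dispatches the 3 reviewers.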

🤖 controller status banner

⟦AI:AUTO-LOOP⟧

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

louis4li commented Jun 2, 2026

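Updated the PR head per the latest architect review feedback: cfc25723eb1b2b5cc9a06c878ba40abcf82ed1eb

This update adds the behavior gate for the PR review comment fix completion contract, so it no longer relies only on SKILL.md/source-regression text:

  • Added behavior enforcement / guard coverage for review-thread closure;
  • Kept the matching source-regression coverage;
  • Corresponding commit: cfc25723 fix(skill): add behavior gate for review thread closure

Requesting that auto-loop pick up the architect verdict again in the next wakeup/re-review.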

⟦AI:AUTO-LOOP⟧


loning commented Jun 3, 2026

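📊 Current status: Consensus-rnd Phase review-gate (❌ no human intervention needed)

Dimension
Phase: review-gate r1
Related PR: #402 (base dev, CI all green, mergeable)
codex tasks: architect / tests / quality reviewers dispatched
Human intervention needed: ❌ no (auto-merge once consensus is reached)

Next automatic steps: 3 reviewer outputs → truth table reject=0 & approve≥1 → merge into dev.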
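
The truth table can be sketched in Python as below. The `REVIEW_DONE:<pr>:<role>:<verdict>` line shape and the merge condition come from this thread, and `FIX` for any reject follows the later status banners. The `WAIT` outcome, the function names, and the "latest verdict per role wins" rule are assumptions for illustration rather than the controller's actual logic.

```python
from collections import Counter
from typing import Dict, Iterable


def parse_review_done(lines: Iterable[str], pr: int) -> Dict[str, str]:
    """Collect the latest verdict per reviewer role from REVIEW_DONE lines."""
    latest: Dict[str, str] = {}
    for line in lines:
        parts = line.strip().split(":")
        if len(parts) == 4 and parts[0] == "REVIEW_DONE" and parts[1] == str(pr):
            latest[parts[2]] = parts[3]  # assumed: a later verdict replaces an earlier one
    return latest


def gate_decision(verdicts: Dict[str, str]) -> str:
    """reject=0 and approve>=1 merges; any reject routes back to fix."""
    counts = Counter(verdicts.values())
    if counts["reject"] > 0:
        return "FIX"
    if counts["approve"] >= 1:
        return "MERGE"
    return "WAIT"  # hypothetical outcome when no verdicts have arrived yet


round_two = parse_review_done(
    [
        "REVIEW_DONE:402:quality:approve",
        "REVIEW_DONE:402:tests:approve",
        "REVIEW_DONE:402:architect:reject",
    ],
    pr=402,
)
print(gate_decision(round_two))
```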

🤖 controller status banner

⟦AI:AUTO-LOOP⟧


loning commented Jun 3, 2026

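🤖 Quality review passed

TL;DR

  • What this is: an independent code quality review of PR 402, looking only at readability, naming, simplicity, complexity, and dead code.
  • Conclusion: approve; no quality issues that must be fixed.
  • Next step: the controller can include this reviewer result in the review gate.

Details

I read the PR file list and full diff using a three-dot diff, and opened the full contents of the touched files. The new ReviewThreadCompletionEvidence is a small evidence value object whose name states the business intent directly; validate_review_thread_completion has few branches and a single responsibility; ControllerActions.validate_review_fix_completion gives the completion validation surface a real call path, so it is not a dangling public surface.

The prompt and SKILL.md changes are focused on review-thread completion and deterministic marker normalization; I saw no unrelated cleanup, duplicated abstraction, comment padding, or added complexity.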
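
For orientation, a fail-closed validator of this shape might look roughly like the Python sketch below. Only the names `ReviewThreadCompletionEvidence` and `validate_review_thread_completion` come from the PR; every field name, the return convention (a blocking reason or `None`), and the branch order are assumptions, so this should not be read as the code in review_fix_dispatch.py.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReviewThreadCompletionEvidence:
    # All field names are assumptions for illustration.
    driven_by_review_thread: bool
    thread_id: Optional[str] = None
    replied: bool = False
    resolved: bool = False
    escalated: bool = False


def validate_review_thread_completion(
    evidence: ReviewThreadCompletionEvidence,
) -> Optional[str]:
    """Return None when the fix may complete, else the reason it is blocked."""
    if not evidence.driven_by_review_thread:
        return None  # ordinary fixes are not subject to the gate
    if evidence.escalated:
        return None  # explicit escalation is the second valid exit
    if not evidence.thread_id:
        return "missing original review thread id"
    if not evidence.replied:
        return "original review thread has no reply"
    if not evidence.resolved:
        return "original review thread is not resolved"
    return None


open_thread = ReviewThreadCompletionEvidence(True, "T1", replied=True)
print(validate_review_thread_completion(open_thread))
```

Returning a reason string rather than a bare boolean is one possible design choice: it lets a caller surface why completion was blocked without a second lookup.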


📎 Full raw codex output (archived for reference)
---
pr: 402
role: quality
verdict: approve
---

## Verdict
Approve: the PR is focused, readable, and does not introduce naming, dead-code, abstraction, complexity, comment, or drive-by quality issues under `HOST_REFACTOR_COMMENT_POLICY=none`.

## Evidence
- `skills/codex-refactor-loop/scripts/codex_refactor_loop/review_fix_dispatch.py:54`: `ReviewThreadCompletionEvidence` names the business evidence directly, stays as a small value object, and avoids a generic manager/helper abstraction.
- `skills/codex-refactor-loop/scripts/codex_refactor_loop/review_fix_dispatch.py:69`: `validate_review_thread_completion` is 13 lines with simple fail-closed branches; no method-size or cyclomatic-complexity smell.
- `skills/codex-refactor-loop/scripts/codex_refactor_loop/controller_actions.py:574`: the controller wrapper is a direct call-through to the validation helper, so the new completion surface is reachable and not dead code.
- `skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:95`: tests cover non-thread fixes, replied/resolved completion, explicit escalation, missing thread evidence, and unresolved/replied-only failures without duplicated setup large enough to require extraction.
- `skills/codex-refactor-loop/SKILL.md:532`: the new state-contract row is scoped to PR review comment completion and does not add unrelated cleanup or extra process surface.

⟦AI:AUTO-LOOP⟧
REVIEW_DONE:402:quality:approve

⟦AI:AUTO-LOOP⟧


loning commented Jun 3, 2026

🤖 Test coverage review passed

TL;DR

  • What this is: an independent review of PR 402 from the tests perspective.
  • Conclusion: approve; the new completion gate has both behavior tests and source-regression coverage.
  • Next step: the controller can continue converging per the review-gate truth table.

Details

I checked three things: behavior coverage for the new ReviewThreadCompletionEvidence / validate_review_thread_completion, the ControllerActions.validate_review_fix_completion wrapper path, and the source-regression assertions for the new rules in SKILL.md and the prompts. The corresponding tests cover the non-review-thread scenario, the original thread replied+resolved, explicit escalation, missing thread id, not replied, not resolved, and the controller wrapper failing closed.

Local focused verification passed: python3 -m pytest -q skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py -> 8 passed, 24 subtests passed. I found no new sleep/delay pacing, skip/manual markers, weakened assertions, or mock-only pseudo-coverage.


📎 Full raw codex output (archived for reference)

pr: 402
role: tests
verdict: approve

Verdict

Approve: the PR adds focused behavior coverage for the new review-thread completion gate and source-regression coverage for the new SKILL/prompt contract.

Evidence

  • skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:95 covers the non-review-thread path, so ordinary review-fix completion is not accidentally blocked by the new gate.
  • skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:104 and skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:114 cover the two intended completion exits: original thread replied+resolved, or explicit escalation evidence.
  • skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:125, skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:135, and skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:155 cover fail-closed behavior for missing thread id, unreplied thread, unresolved thread, and the ControllerActions.validate_review_fix_completion wrapper.
  • skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py:25 locks the new SKILL.md GitHub state contract text requiring PR review-thread closure.
  • skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py:63 and skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py:137 add source-regression assertions for canonical refactor marker identity and deterministic retry-fix routing, matching the prompt changes.
  • Focused verification run: python3 -m pytest -q skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py passed with 8 passed, 24 subtests passed.
  • I found no added sleep/delay pacing, no skip/manual test markers, no assertion loosening, and no mock-only pseudo-coverage in the PR diff.

⟦AI:AUTO-LOOP⟧
REVIEW_DONE:402:tests:approve

⟦AI:AUTO-LOOP⟧


loning commented Jun 3, 2026

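🤖 Architect review: approve from the architecture perspective

TL;DR

  • What this is: the independent review result from the architect reviewer for PR 402.
  • Conclusion: approve; no CLAUDE.md / AGENTS.md architecture compliance blockers found.
  • Next step: the controller can include this role's result in the review truth table.

Details

I reviewed the actual changes under skills/codex-refactor-loop/ using a three-dot diff. What is added is a narrow validation boundary for review-thread completion: only fixes driven by a review comment must prove that the original thread was replied to and resolved, or explicitly escalated. No new git/gh/lifecycle authority is added, and the host production SSOT is not moved into .refactor-loop/host.env.

In addition, deterministic refactor marker normalization stays on the fix/retry path rather than human escalation. This matches the "human intervention must be honest" boundary: escalate to a maintainer only when a human is genuinely needed for a product, strategy, governance, or permission decision.


📎 Full raw codex output (archived for reference)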

pr: 402
role: architect
verdict: approve

Verdict

approve - no architecture compliance concerns; the PR stays within the codex-refactor-loop skill contract and adds a narrow review-thread completion gate without new lifecycle authority.

Evidence

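  • skills/codex-refactor-loop/SKILL.md:532 maps the changed review-comment behavior into the existing GitHub State Contract. This follows CLAUDE.md: "Clear boundaries, layered responsibilities: this file carries repository-level constitutional constraints that cross skill boundaries; a single skill's workflow details, term definitions, and current state are maintained by that skill and are not copied back into this file."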
  • skills/codex-refactor-loop/scripts/codex_refactor_loop/review_fix_dispatch.py:54 defines ReviewThreadCompletionEvidence, and :69-:81 only fail-close validates replied/resolved/escalated evidence. It does not add git, gh, label, merge, close, tag, release, or generic lifecycle authority, preserving CLAUDE.md's daemon/controller carveout boundaries.
  • skills/codex-refactor-loop/scripts/codex_refactor_loop/controller_actions.py:574 exposes the completion check through an existing controller helper surface rather than a new actor/store split. No new *WriteActor, *ReadActor, or *Store split appears in the diff.
  • skills/codex-refactor-loop/prompts/review-fix.md:47 and skills/codex-refactor-loop/prompts/meta-reflector-stalled.md:36 keep deterministic marker normalization on the fix/retry path, not human escalation. This aligns with CLAUDE.md: "Human intervention must be honest: escalate state to a maintainer only when a human is genuinely needed for a product, strategy, governance, or permission decision; do not use a human as the fallback exit for repeated reviewer rejects."
  • The changed files are all under skills/codex-refactor-loop/ and its tests/prompts. I found no external repo references, host production SSOT movement into .refactor-loop/host.env, schema/protocol surface expansion, dead wrapper, or parallel pathway.

⟦AI:AUTO-LOOP⟧
REVIEW_DONE:402:architect:approve

⟦AI:AUTO-LOOP⟧


loning commented Jun 3, 2026

📊 Current status: Consensus-rnd Phase review-gate consensus reached (❌ no human intervention needed)

Dimension
PR: #402 Make review comment and refactor marker conflicts convergent
Truth table: reject=0 approve=3 comment=0 → MERGE
Reviewers: architect ✅ / tests ✅ / quality ✅
CI: contract-tests / manifest-version-sync / skill-degradation / lint-advisory all green

Merging.

🤖 controller status banner

⟦AI:AUTO-LOOP⟧


loning commented Jun 3, 2026

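🤖 Quality re-review passed: no readability blockers

TL;DR

  • What this is: the code quality re-review result for PR 402.
  • Conclusion: approve; no blockers found in naming, dead code, over-abstraction, duplication, or complexity.
  • Next step: the controller can continue aggregating this round's results per the review-gate truth table.

Details

I checked the new review-thread completion validator, the controller action forwarding point, the prompt contract updates, and the corresponding source-regression/behavior tests against a three-dot diff and the full context of the touched files. The new names fit the business semantics, method bodies are short and linear, and the prompt changes are focused on review thread closure and deterministic marker normalization; I found no unrelated cleanup and no violation of the refactor self-doc policy.


📎 Full raw codex output (archived for reference)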
---
pr: 402
role: quality
verdict: approve
---

## Verdict
Approve: the PR is focused, the new names describe review-thread completion evidence directly, and I found no dead code, over-engineering, avoidable duplication, or refactor self-doc policy issue under `none`.

## Evidence
- `skills/codex-refactor-loop/scripts/codex_refactor_loop/review_fix_dispatch.py:54`: `ReviewThreadCompletionEvidence` is a small domain data carrier with fields that map directly to the completion contract; no generic manager/helper naming or unnecessary interface layer.
- `skills/codex-refactor-loop/scripts/codex_refactor_loop/review_fix_dispatch.py:69`: `validate_review_thread_completion` is short and linear, with three explicit fail-closed checks and no complexity smell.
- `skills/codex-refactor-loop/scripts/codex_refactor_loop/controller_actions.py:574`: `validate_review_fix_completion` is reachable through controller actions and is covered by `test_controller_completion_path_fails_closed_for_open_review_thread`; the underlying validator is also directly covered for accept/block cases.
- `skills/codex-refactor-loop/prompts/review-fix.md:47`, `skills/codex-refactor-loop/prompts/reviewer-quality.md:28`, and `skills/codex-refactor-loop/prompts/meta-reflector-stalled.md:36`: prompt edits are narrow to deterministic marker normalization and do not add unrelated workflow cleanup.

⟦AI:AUTO-LOOP⟧
REVIEW_DONE:402:quality:approve

⟦AI:AUTO-LOOP⟧


loning commented Jun 3, 2026

🤖 Tests review: approve

TL;DR

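  • What this is: an independent review of PR 402 from the tests perspective.
  • Conclusion: approve; the new review-thread completion gate and the marker normalization prompt contract both have matching tests.
  • Next step: the controller can treat this role as passed; there are no must-fix items from the tests perspective.

Details

The key paths of the new logic are all covered: fixes not driven by a review thread are not blocked by mistake; fixes driven by a review thread must have the original thread id, a reply, and a resolved state, or explicit escalation evidence; and the ControllerActions.validate_review_fix_completion controller wrapper path also has a failure case. The new invariants on the documentation and prompt side are locked with source-regression tests as well, including the PR review-thread closure contract in SKILL.md and the rule that deterministic marker normalization must go back to fix rather than escalate to a human.

I saw no new sleep/delay test pacing, no [Skip] / manual-category escapes, no mock-only pseudo-coverage, and no strong assertions weakened. I verified the relevant test command: python3 -m unittest skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py, result: 8 tests passed.


📎 Full raw codex output (archived for reference)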

pr: 402
role: tests
verdict: approve

Verdict

Test coverage and quality are adequate for the review-thread completion gate and deterministic marker-normalization prompt contract changes.

Evidence

  • skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:95 covers the non-thread-driven no-op path, so the new gate does not accidentally block ordinary review-fix completions.
  • skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:104 and skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:114 cover both valid completion exits: original thread replied+resolved, or explicit escalation evidence.
  • skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:125, skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:135, and skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:155 cover fail-closed behavior for missing thread identity, unreplied/unresolved threads, and the ControllerActions.validate_review_fix_completion wrapper.
  • skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py:25 source-regression locks the new SKILL.md GitHub state contract requiring PR review-thread closure.
  • skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py:63 and skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py:137 source-regression lock canonical refactor marker identity and route deterministic marker normalization back to fix rather than human escalation.
  • No PR diff adds sleep/delay pacing, [Skip], manual-category escapes, mock-only pseudo-coverage, or weakened assertions. Related tests pass with python3 -m unittest skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py.

⟦AI:AUTO-LOOP⟧
REVIEW_DONE:402:tests:approve

⟦AI:AUTO-LOOP⟧


loning commented Jun 3, 2026

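🤖 Architect review r2: the completion gate is not yet wired into the main route

TL;DR

  • What this is: an architect-perspective re-review of the review-thread completion gate in PR #402 (Make review comment and refactor marker conflicts convergent).
  • Conclusion: reject, because the rule is written into SKILL.md but the FIX_DONE main route can still go straight back to review-gate.
  • Next step: fix-codex needs to wire the check for an unresolved original review thread into the real completion path, and add a route-level behavior test.

Details

The business rule this PR adds is: a fix driven by a PR review comment counts as complete only after the original thread is replied to and resolved, or explicitly escalated. The rule itself is reasonable, but the current diff only adds ReviewThreadCompletionEvidence and a validator, and exposes them through ControllerActions.validate_review_fix_completion().

The problem is that the main path does not use it. wakeup_plan.py still maps FIX_DONE directly to review-gate / reviewer-codex; it does not build evidence from the live PR review thread or a durable artifact, and it does not fail closed when replied+resolved is missing. The result is a bypass helper: tests can call the validator directly and pass, but after a real FIX_DONE the loop may still dispatch reviewers or enter a merge-eligible gate.

What needs to change: before FIX_DONE goes back to reviewers or to any merge-eligible review gate, the route must read the original review thread state or explicit escalation evidence; it must block when the thread is unreplied/unresolved, and a route-level behavior test must cover this path.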
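
The difference between a standalone helper and a route-level gate can be sketched in Python as follows. Here the FIX_DONE routing function itself consults thread state and fails closed, so there is no path to review-gate that skips the check. `route_fix_done`, `ThreadState`, the injected `fetch_thread_state` callable (a stand-in for a live GitHub lookup or a durable artifact), and the `"blocked"` label are all hypothetical; only `FIX_DONE` and `review-gate` are names taken from the review.

```python
from typing import Callable, Dict, Optional, TypedDict


class ThreadState(TypedDict):
    replied: bool
    resolved: bool
    escalated: bool


def route_fix_done(
    review_thread_id: Optional[str],
    fetch_thread_state: Callable[[str], Optional[ThreadState]],
) -> str:
    """Decide where a FIX_DONE event goes next; fail closed on missing evidence."""
    if review_thread_id is None:
        return "review-gate"  # the fix was not driven by a PR review comment
    state = fetch_thread_state(review_thread_id)
    if state is None:
        return "blocked"  # no evidence could be derived: fail closed
    if state["escalated"] or (state["replied"] and state["resolved"]):
        return "review-gate"
    return "blocked"


# A dict lookup stands in for the live thread query in this sketch.
states: Dict[str, ThreadState] = {
    "T1": {"replied": True, "resolved": True, "escalated": False},
    "T2": {"replied": True, "resolved": False, "escalated": False},
    "T3": {"replied": False, "resolved": False, "escalated": True},
}
print(route_fix_done("T2", states.get))
```

A route-level behavior test would then assert on the output of the routing function for an unresolved thread, rather than on the validator in isolation.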


📎 Full raw codex output (archived for reference)
---
pr: 402
role: architect
verdict: reject
---

## Verdict
reject - the PR declares a new PR-review-thread completion gate, but the actual `FIX_DONE` route can still proceed to review-gate without validating or producing original-thread reply/resolution evidence.

## Evidence
- `skills/codex-refactor-loop/SKILL.md:532` adds the new contract: "fixes driven by PR review comments are incomplete until the original thread is replied to and resolved, or explicitly escalated." But `skills/codex-refactor-loop/scripts/codex_refactor_loop/wakeup_plan.py:479-485` still maps every `FIX_DONE` directly to `review-gate`, and `:517-523` still assigns `FIX_DONE` to `reviewer-codex`; there is no call to the new completion validator on that route. This violates CLAUDE.md: "**Changes must be verifiable**: behavior constraints must land in a mechanical verification mechanism (behavior test / source-regression test / section lint); a constraint carried only by \"the agent should remember\" is treated as not landed."
- `skills/codex-refactor-loop/scripts/codex_refactor_loop/review_fix_dispatch.py:54-81` defines `ReviewThreadCompletionEvidence` and `validate_review_thread_completion`, and `skills/codex-refactor-loop/scripts/codex_refactor_loop/controller_actions.py:574-575` exposes a wrapper, but `git grep` on the PR head shows those are only referenced by the wrapper and tests. No controller helper, wakeup action, `FIX_DONE` consumer, or GitHub GraphQL thread resolver builds this evidence from live PR review threads or blocks completion when it is absent. This worsens the architecture by adding a parallel advisory check instead of making the new business rule part of the single review-fix mainline, contrary to CLAUDE.md: "**Single trunk, plugin extension**: each skill has one authoritative main path; new capabilities attach as submodules/scripts, and a parallel second system is forbidden."
- `skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:95-163` only unit-tests direct calls to the validator/wrapper. It does not assert that a review-thread-driven `FIX_DONE` without replied+resolved evidence is blocked before reviewer re-dispatch or merge. For a behavior rule added to the GitHub state contract, helper-only coverage is insufficient under CLAUDE.md: "**Governance first**: architectural / process rules enter the repository together with their mechanical verification; if either is missing, it is not patched over with wording."

## What would change your verdict
Wire the completion gate into the actual review-fix completion path: before `FIX_DONE` routes back to reviewers or any merge-eligible review gate, derive durable evidence from the original PR review thread or live GitHub state, require replied+resolved or explicit escalation, and fail closed otherwise. Add a behavior test for the route itself, not only the standalone validator, showing an unresolved original review thread prevents reviewer re-dispatch/merge and produces the expected blocked/escalation action.

⟦AI:AUTO-LOOP⟧
REVIEW_DONE:402:architect:reject

⟦AI:AUTO-LOOP⟧

@loning
Contributor

loning commented Jun 3, 2026

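📊 Current status: Consensus-rnd Phase review-gate FIX (❌ No, waiting on the fork author)

Dimension
PR: #402 (fork PR, head louis4li/..., cross-repo)
r2 truth table: reject=1 approve=2 → FIX (re-reviewed after catching up with dev)
architect reject: declares a PR-review-thread completion gate, but the FIX_DONE path can still enter review-gate without producing reply/resolution evidence for the original thread
tests/quality: approve

Note: this PR comes from a fork branch; the controller has no write access to the fork and cannot push a fix automatically. The fork author needs to add the "original thread reply/resolution evidence check" called for by the architect r2 reject, then re-trigger review-gate. See the r2 reviewer comments for details.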
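
The truth-table outcomes quoted in this thread (reject=1 approve=2 → FIX, reject=0 && approve>=1 → MERGE, approve/approve/comment → MERGE_WITH_COMMENTS) can be sketched as a small decision function. The function name, how a "comment" verdict splits MERGE from MERGE_WITH_COMMENTS, and the fallback when there is no approval are assumptions for illustration; the real gate also checks CI, mergeability, and head binding.

```python
from collections import Counter
from typing import Iterable


def review_gate_decision(verdicts: Iterable[str]) -> str:
    """Map reviewer verdicts ("approve", "reject", "comment") to a gate decision."""
    counts = Counter(verdicts)
    if counts["reject"] > 0:
        return "FIX"  # any reject sends the PR back to a fix round
    if counts["approve"] >= 1:
        # Assumption: a non-blocking "comment" verdict turns MERGE into
        # MERGE_WITH_COMMENTS, matching the approve/approve/comment case.
        return "MERGE_WITH_COMMENTS" if counts["comment"] > 0 else "MERGE"
    # Assumption: with no approval the gate waits for more reviewer evidence.
    return "WAIT_OR_REDISPATCH"
```

For the state in this banner, `review_gate_decision(["reject", "approve", "approve"])` yields `FIX`.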

🤖 controller status banner

⟦AI:AUTO-LOOP⟧

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
louis4li and others added 2 commits June 3, 2026 17:29
Seed review-thread completion evidence during fix dispatch and require either original-thread closure or sourced meta escalation before FIX_DONE can continue, so the gate no longer depends on manually-created artifacts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Keep the seeded review-thread completion contract in the prompt while reducing duplicate wording and preserving the source-regression anchors.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
loning added a commit that referenced this pull request Jun 3, 2026
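Root cause: the reviewer artifact frontmatter carries no head_sha, so _review_evidence_from_artifact cannot obtain the reviewed head SHA → every reviewing PR (#480/#473/#402) is always judged WAIT_OR_REDISPATCH:invalid_reviewer_evidence:missing_reviewed_head_sha, and wakeup-plan does not redispatch reviewers automatically → the headless review→merge flow is stuck permanently (blocking the release).

Fix (implemented by the codex hotfix worker, committed after controller verification):
- When controller_actions dispatches reviewers it reads headRefOid, fails closed if it is missing, and injects the authoritative HEAD_SHA into the rendered prompt.
- The reviewer-{architect,tests,quality}.md frontmatter writes head_sha: ${HEAD_SHA}.
- wakeup_runner._review_head_sha_for: when the artifact lacks head_sha, recover it from the controller-rendered prompt/log (without relying on worker self-reporting, to avoid repeating the markerless failure); the live-head staleness check and the merge truth-table semantics are unchanged.
- wakeup_plan projects an executable dispatch_reviewers self-healing action for OPEN reviewing PRs whose reviewer head is missing or stale.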
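
As a rough illustration of head-bound reviewer evidence, the sketch below accepts an artifact only when it records the head SHA it reviewed and that SHA equals the live PR head. `ReviewEvidence` and `check_review_evidence` are hypothetical names, and the stale-head reason string is invented for the example; only the missing-head reason is quoted from the commit message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReviewEvidence:
    """Hypothetical stand-in for one reviewer artifact's parsed frontmatter."""
    role: str
    verdict: str
    head_sha: Optional[str]  # None when the frontmatter has no head_sha


def check_review_evidence(evidence: ReviewEvidence, live_head_sha: str) -> str:
    """Return "ok" or a fail-closed reason for a single reviewer artifact."""
    if not evidence.head_sha:
        # Reason string as quoted in the commit message.
        return "invalid_reviewer_evidence:missing_reviewed_head_sha"
    if evidence.head_sha != live_head_sha:
        # Hypothetical reason string for evidence bound to an older head.
        return "invalid_reviewer_evidence:stale_reviewed_head_sha"
    return "ok"
```

Either failure would be a cue to redispatch that reviewer role rather than to merge.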

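Verification: the controller independently re-ran test_wakeup_runner+review_gate+wakeup_plan+controller_actions, 293 passed, and the marker/prompt contract tests, 30 passed; the review-gate safety semantics (a stale head still does not merge) are locked by 13 review-gate tests.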

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
loning added a commit that referenced this pull request Jun 3, 2026
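…merge loop closure

Follow-up to r7: r7 bound reviewer evidence to head_sha, but the review-evidence-redispatch self-healing action projected by wakeup_plan was forced to status_only by the closed-action-projection finalizer (its kind is not in EXECUTABLE_ACTION_KINDS), so the runner never applied it → review evidence missing a head on #480/#473/#402 could not be redispatched automatically → review_gate stayed at WAIT_OR_REDISPATCH → concurrency drained to 0.

Fix (implemented by the codex hotfix worker, committed after controller verification):
- Add review-evidence-redispatch to wakeup_plan EXECUTABLE_ACTION_KINDS; the finalizer keeps runner authority and no longer forces status_only.
- controller_actions.dispatch_reviewers reads stale_review_roles and redispatches only the stale roles, and _pending_review_spawn_exists skips any in-flight dispatch-reviewers:<pr>:<role>:r1 intent (idempotent, no double dispatch); it still goes through r7's HEAD_SHA injection, so the redispatched reviewers' new evidence is head-bound.
- The merge-gate truth table is unchanged; this only makes the existing self-healing action executable and targets it at stale roles.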
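
The idempotent redispatch described above can be pictured as a filter over the stale roles that skips any role whose dispatch intent is already in flight. This is a sketch under assumptions: `roles_to_redispatch` and the set-of-strings representation of pending intents are hypothetical; only the intent key format `dispatch-reviewers:<pr>:<role>:r1` comes from the commit message.

```python
from typing import Iterable, List, Set


def roles_to_redispatch(
    pr: int,
    stale_roles: Iterable[str],
    pending_intents: Set[str],
    round_tag: str = "r1",
) -> List[str]:
    """Return the stale reviewer roles that have no in-flight dispatch intent."""
    selected = []
    for role in stale_roles:
        intent_key = f"dispatch-reviewers:{pr}:{role}:{round_tag}"
        if intent_key in pending_intents:
            continue  # already dispatched; skipping keeps the action idempotent
        selected.append(role)
    return selected
```

Calling it twice with the first call's intents recorded in `pending_intents` returns an empty list the second time, which is the no-double-dispatch property.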

Verification: the controller independently re-ran test_wakeup_plan+wakeup_runner+review_gate+controller_actions, 295 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
loning added a commit that referenced this pull request Jun 3, 2026
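…R headless reaches the review_gate decision

Root cause: review_gate's CI read, PrChecksProjection.check_pr, fails closed immediately (ci_unavailable:pull_api_failure) on a single non-zero return from gh api pulls/<n> or commits/<sha>/check-runs, with no retry; the rate limit is healthy, so these are transient blips → #473/#402 (MERGEABLE + CI green) are blocked by a transient failure on every tick and never reach MERGE/FIX.

Fix (minimal, touching only PrChecksProjection; #322 ReleaseRequiredChecksProjection is left alone):
- _run_api_read: up to 3 deterministic retries (no sleep) for the two api reads; stop on the first success, and still fail closed on persistent failure (safe). No behavior change on the happy path.
- Starvation finding (verified by the worker): review_gate is already applied, and a blocked ci_unavailable row can be re-evaluated (it is not terminal or duplicate-suppressed), so run_once needs no change.
- e2e: test_review_gate_e2e adds a case where the first pulls read fails transiently and the PR still merges; test_pr_checks adds retry coverage.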
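
A bounded, sleep-free retry of the kind described for `_run_api_read` might look like the sketch below. The callable-based signature and the `(returncode, output)` tuple are assumptions, and the example reads "up to 3" as three total attempts, which the commit message does not spell out.

```python
from typing import Callable, Tuple

MAX_API_READ_ATTEMPTS = 3  # assumption: three attempts in total


def run_api_read(run: Callable[[], Tuple[int, str]]) -> Tuple[bool, str]:
    """Call `run` until it returns exit code 0 or the attempts are used up."""
    last_output = ""
    for _ in range(MAX_API_READ_ATTEMPTS):
        returncode, output = run()
        if returncode == 0:
            return True, output  # stop at the first success
        last_output = output
    # Persistent failure: the caller should still fail closed.
    return False, last_output
```

Because there is no sleep, the retry absorbs single blips without adding pacing to tests, and a command that is wrong every time still fails after the last attempt.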

Verification: the controller independently re-ran test_pr_checks+wakeup_runner+review_gate+review_gate_e2e+wakeup_plan+controller_actions, 309 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
loning added a commit that referenced this pull request Jun 3, 2026
…ady PR headless merge, backlog can drain

Root cause (verified): on every tick wakeup_plan orders SPAWN actions (implement intent + design-consensus #495/#496/#490/#491/#418, index 0-5) ahead of the lifecycle review_gate actions (#480/#473/#402, index 6-8); the old run_once logic was `if applied_spawns>0 and not is_spawn_action: break` → as soon as any spawn was dispatched, the first lifecycle action broke out of the whole tick. design-consensus almost always has spawn work → review_gate/merge starve forever → #473 (approve/approve/comment → MERGE_WITH_COMMENTS, MERGEABLE + CI green) never merges, and the active count never drops.

Fix (minimal, in the core dispatch loop):
- Introduce consumes_spawn_budget = is_spawn_action or _uses_spawn_budget(action); when over budget, `continue` (not break) so the loop can still reach later drain-only lifecycle actions.
- _uses_spawn_budget: dispatch_reviewers=True; review_gate=True only when the decision is FIX (which dispatches a fix worker); otherwise drain-only actions such as review_gate (MERGE/MERGE_WITH_COMMENTS) and close=False → they do not consume spawn budget and are still evaluated and executed after the floor is filled.
- Keep r6's spawn launch-failure break/retry and the dispatch_design_consensus no-intents skip-and-continue; the truth table, head-binding, CI-green, and mergeable safety semantics are unchanged.
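
The difference between `break` and `continue` in the dispatch loop can be shown with a toy version of the tick. The dict-shaped actions, the `kind`/`decision`/`name` keys, and both function names are hypothetical; the sketch only reproduces the budgeting rule described in the list above.

```python
from typing import Dict, List


def uses_spawn_budget(action: Dict[str, str]) -> bool:
    """Lifecycle actions that launch a worker count against the spawn budget."""
    if action["kind"] == "dispatch_reviewers":
        return True
    if action["kind"] == "review_gate":
        return action.get("decision") == "FIX"  # FIX dispatches a fix worker
    return False  # merge, close, and similar drain-only actions are free


def run_once(actions: List[Dict[str, str]], spawn_budget: int) -> List[str]:
    """Apply actions in order, skipping (not stopping at) over-budget spawns."""
    applied = []
    used = 0
    for action in actions:
        consumes = action["kind"] == "spawn" or uses_spawn_budget(action)
        if consumes:
            if used >= spawn_budget:
                continue  # over budget: skip this one, keep scanning
            used += 1
        applied.append(action["name"])
    return applied
```

With a budget of 1 and the order spawn, spawn, merge-ready review_gate, the merge-ready gate still runs; with `break` in place of `continue` it would never be reached.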

Result: each tick both fills the floor with spawns and evaluates drain-only lifecycle actions such as merge/close, so the backlog can drain.

Verification: the controller independently re-ran test_wakeup_runner+review_gate+review_gate_e2e+wakeup_plan+controller_actions+pr_checks, 309 passed; added test_wakeup_runner_lifecycle_review_gate_not_starved_after_spawn_batch.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
loning added a commit that referenced this pull request Jun 3, 2026
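… malformed: review→merge root cause

Root cause (confirmed by measurement): _run_command inserted --repo at full[1:1] (before the subcommand) for every gh command. gh does not accept --repo as a pre-subcommand global flag, and gh api does not take --repo at all (the repo is in the URL): `gh --repo O/R api repos/O/R/pulls/473` → rc=1 "unknown flag: --repo". Every gh call made by wakeup-runner was therefore malformed: the CI read (gh api pulls/check-runs via PrChecksProjection(runner=_run_command)) → ci_unavailable:pull_api_failure; mergeability (gh pr view) → mergeability_unavailable; live-state and the rest likewise. → The review gate could never read CI/mergeable → #473 (approve/approve/comment=MERGE_WITH_COMMENTS, green), #402, and #480 never landed headlessly. r11's retry merely repeated the same malformed command. Direct PrChecksProjection() calls and direct gh calls use the plain default runner (no injection) and so worked, which masked the bug.

Minimal fix: _run_command injects --repo only for `gh pr`/`gh issue`, placing it in a legal position after the subcommand (and its verb); `gh api` is left untouched; it never injects before the subcommand; and it does not duplicate --repo if one is already present.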
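
The placement rule for `--repo` can be sketched as a pure argv transformation. `with_repo_flag` is a hypothetical helper, not the actual `_run_command`; it assumes the argv starts with `gh` and that the verb is the token right after `pr` or `issue`.

```python
from typing import List, Sequence

REPO_SCOPED_SUBCOMMANDS = ("pr", "issue")


def with_repo_flag(argv: Sequence[str], repo: str) -> List[str]:
    """Return argv with --repo placed after `gh <subcommand> <verb>` when applicable."""
    full = list(argv)
    if len(full) < 2 or full[0] != "gh" or full[1] not in REPO_SCOPED_SUBCOMMANDS:
        return full  # e.g. `gh api ...` carries the repo in its URL
    if any(arg in ("--repo", "-R") or arg.startswith("--repo=") for arg in full):
        return full  # already targeted; do not duplicate the flag
    insert_at = min(3, len(full))  # after the subcommand and its verb
    return full[:insert_at] + ["--repo", repo] + full[insert_at:]
```

An argv-shape test then only needs to assert on the returned list, for example that `gh api` commands come back unchanged.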

Verification: the controller independently re-ran test_wakeup_runner+review_gate+review_gate_e2e+pr_checks+wakeup_plan+controller_actions, 312 passed; added an argv-shape test and an e2e test (a stubbed subprocess.run rejects the malformed `gh --repo ... api`, and the ready PR still merges).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@loning
Contributor

loning commented Jun 3, 2026

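🤖 Tests-perspective review: approved

TL;DR

  • What this is: the tests reviewer's independent verdict on PR #402 ("Make review comment and refactor marker conflicts converge").
  • Where things stand / conclusion: the verdict is approve; the new completion gate and prompt contract both have corresponding test coverage.
  • What the maintainer needs to do OR the controller's next step: the controller can continue converging per the review-gate truth table.

Details

I only looked at test coverage and test quality. ReviewThreadCompletionEvidence is the evidence container a review-thread-driven fix needs in order to complete; the related tests cover the non-review-thread path, the passing path where the thread is replied to and resolved, the passing path with explicit escalation, the fail-closed paths for missing thread_id, no reply, and not resolved, and the controller-facing wrapper.

Source-regression tests also cover the new rules: the GitHub State Contract in SKILL.md must state that a PR review comment fix needs the original thread replied to and resolved, or an explicit escalation; deterministic marker normalization in meta-reflector-stalled.md must go to retry-fix, not human escalation. No sleep/delay test pacing, [Skip], manual bypass, weakened assertions, or mock-only pseudo-coverage was added.

The verification commands were run on the PR head archive: python3 skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py, 9 tests OK; python3 skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py, 7 tests OK.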


📎 Full raw codex output (archived for reference)

pr: 402
role: tests
head_sha: 6fdbbba
verdict: approve

Verdict

Test coverage and test quality are adequate for the PR diff.

Evidence

  • skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:95 covers the non-review-thread path so ordinary fixes are not blocked by the new completion gate.
  • skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:104 and skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:135 cover the core business outcome: review-thread-driven fixes pass only when the original thread is replied to and resolved, and fail closed when either condition is missing.
  • skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:155 covers the controller-facing wrapper, not only the standalone helper.
  • skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py:25 adds source-regression coverage for the new GitHub State Contract wording around PR review thread closure.
  • skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py:137 adds source-regression coverage for deterministic marker normalization routing to retry-fix rather than human escalation.
  • No new sleep/delay test pacing, [Skip]/manual-test bypass, assertion weakening, or mock-only pseudo-coverage was introduced in the PR diff.
  • Verification run on PR head archive passed: python3 skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py ran 9 tests OK; python3 skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py ran 7 tests OK.

⟦AI:AUTO-LOOP⟧
REVIEW_DONE:402:tests:approve

⟦AI:AUTO-LOOP⟧

@loning
Contributor

loning commented Jun 3, 2026

🤖 Quality review approve

TL;DR

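  • What this is: a re-review of PR 402 from the readability / naming / simplicity / dead code angle.
  • Conclusion: approve; no quality issues that require changes were found.
  • Next step: the controller can feed this review into the review gate as quality=approve.

Details

I reviewed the files the PR touches using the three-dot diff origin/dev...origin/fix/pr-review-thread-completion-gate and opened the actual file context. The new ReviewThreadCompletionEvidence is a value object for review-thread fix completion evidence, and validate_review_thread_completion is the matching linear validation function; the names fit the business intent, with no generic names such as Manager / Helper.

The executable code changes are concentrated in skills/codex-refactor-loop/scripts/codex_refactor_loop/review_fix_dispatch.py:54 and skills/codex-refactor-loop/scripts/codex_refactor_loop/controller_actions.py:574. The method bodies are short with few branches; the prompt changes are likewise focused on deterministic marker normalization and review-thread closure, and I saw no unrelated cleanup or duplicated fragments that need extracting.

Verification: I ran the two narrow test files in a temporary git archive directory of the PR head; test_review_fix_dispatch.py passed 9 tests and test_refactor_comment_policy_prompt_contract.py passed 7 tests.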


📎 Full raw codex output (archived for reference)

pr: 402
role: quality
head_sha: 6fdbbba
verdict: approve

Verdict

Approve: the PR keeps the review-thread completion change small and readable, with domain-specific naming and no quality-level dead code, over-engineering, or unrelated cleanup.

Evidence

  • skills/codex-refactor-loop/scripts/codex_refactor_loop/review_fix_dispatch.py:54: ReviewThreadCompletionEvidence is a concrete value object for the new completion gate, not a generic abstraction.
  • skills/codex-refactor-loop/scripts/codex_refactor_loop/review_fix_dispatch.py:69: validate_review_thread_completion is short, linear, and names the three acceptable completion states clearly: non-thread-driven, explicit escalation, or replied plus resolved original thread.
  • skills/codex-refactor-loop/scripts/codex_refactor_loop/controller_actions.py:574: the controller-facing wrapper is reachable through the new behavior test and does not add branching or lifecycle behavior.
  • skills/codex-refactor-loop/prompts/review-fix.md:47, skills/codex-refactor-loop/prompts/reviewer-quality.md:29, and skills/codex-refactor-loop/prompts/meta-reflector-stalled.md:36: prompt edits stay focused on deterministic marker normalization and do not introduce unrelated wording churn.
  • Verification on the PR head archive: python3 skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py ran 9 tests and passed; python3 skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py ran 7 tests and passed.

⟦AI:AUTO-LOOP⟧
REVIEW_DONE:402:quality:approve

⟦AI:AUTO-LOOP⟧

@loning
Contributor

loning commented Jun 3, 2026

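🤖 Architecture review: reject, the review-thread completion contract is not yet wired into the actual route

TL;DR

  • What this is: an independent architect-perspective review of PR 402.
  • Where things stand / conclusion: reject, because the new completion contract for PR review comment fixes does not yet stop FIX_DONE from going straight to the next reviewer round.
  • What the maintainer needs to do OR the controller's next step: implement codex needs to wire the completion evidence into the existing review-fix completion path, or narrow the SKILL wording so that it does not declare a completion gate.

Details

The problem this time is not the direction of the rule but where it lands. FIX_DONE is the marker a fix worker emits to say "fix round complete"; when the controller sees it, it continues to review-gate, the PR reviewer consensus gate flow. At skills/codex-refactor-loop/SKILL.md:532 the PR adds the completion contract "a PR review comment fix must reply to and resolve the original thread, or explicitly escalate", but the routing table in the same SKILL, at skills/codex-refactor-loop/SKILL.md:504, still says "Dispatch reviewers again" directly after FIX_DONE, with no requirement to check this new evidence.

The code has the same gap: skills/codex-refactor-loop/scripts/codex_refactor_loop/review_fix_dispatch.py:69 adds validate_review_thread_completion, a helper that checks whether the original review thread has been replied to and resolved; skills/codex-refactor-loop/scripts/codex_refactor_loop/controller_actions.py:574 merely exposes it as validate_review_fix_completion. The PR diff shows no call to it from the existing FIX_DONE route, dispatch, or pre-merge path. Under CLAUDE.md's "changes must be verifiable" and "governance first" clauses, an architectural rule must land in both mechanical verification and the actual path; it cannot rely on an agent remembering to call a helper.

The required fix is specific: wire this evidence gate into the actual route for review-fix completion. The behavior test must cover two cases: a review-thread-driven FIX_DONE that lacks reply/resolve evidence cannot dispatch the next reviewer round, and only replied+resolved or explicit escalation evidence allows it to continue.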


📎 Full raw codex output (archived for reference)
---
pr: 402
role: architect
head_sha: 6fdbbba603fc38507ab3efcbf53a032f353f3d0f
verdict: reject
---

## Verdict
reject: the PR adds a review-thread completion contract, but the existing FIX_DONE route can still proceed to reviewer redispatch without that evidence.

## Evidence
- skills/codex-refactor-loop/SKILL.md:532 adds a new GitHub State Contract row: "PR review comment fix | Completion includes review-thread closure: fixes driven by PR review comments are incomplete until the original thread is replied to and resolved, or explicitly escalated." But the phase routing table still says skills/codex-refactor-loop/SKILL.md:504 "`FIX_DONE` | Dispatch reviewers again." with no required review-thread completion gate. This regresses the CLAUDE.md clause: "**Changes must be verifiable**: behavioral constraints must land in a mechanical verification mechanism (behavior test / source-regression test / section lint); a constraint carried only by 'the agent should remember' is treated as not landed."
- skills/codex-refactor-loop/scripts/codex_refactor_loop/review_fix_dispatch.py:69 defines `validate_review_thread_completion`, and skills/codex-refactor-loop/scripts/codex_refactor_loop/controller_actions.py:574 exposes `validate_review_fix_completion`, but PR diff evidence shows no route/dispatch/merge-path caller that enforces it before `FIX_DONE` advances the review-gate. That violates the same CLAUDE.md clause above and the governance clause: "**Governance first**: architectural / process rules and their corresponding mechanical verification enter the repository together; if either is missing, the gap is not papered over with wording."
- No scope, external repo, protocol/schema, host production SSOT, actor-split, or compatibility-shim concerns found in the changed files.

## What would change your verdict
Wire the new completion evidence into the existing review-fix completion path before `FIX_DONE` can trigger reviewer redispatch, or narrow the SKILL contract so it only documents an explicit helper API rather than declaring completion incomplete. Add a behavior test that exercises the actual route: a review-thread-driven `FIX_DONE` with missing reply/resolution evidence must not dispatch the next reviewer round, while replied+resolved or explicit escalation evidence may proceed.

⟦AI:AUTO-LOOP⟧
REVIEW_DONE:402:architect:reject

⟦AI:AUTO-LOOP⟧

@loning
Contributor

loning commented Jun 3, 2026

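🤖 Tests-perspective review: approved

TL;DR

  • What this is: the tests reviewer's independent verdict on PR #402 ("Make review comment and refactor marker conflicts converge").
  • Where things stand / conclusion: the verdict is approve; the new completion gate and prompt contract both have corresponding test coverage.
  • What the maintainer needs to do OR the controller's next step: the controller can continue converging per the review-gate truth table; the previous tests comment was corrupted by shell escaping, so treat this one as authoritative.

Details

I only looked at test coverage and test quality. ReviewThreadCompletionEvidence is the evidence container a review-thread-driven fix needs in order to complete; the related tests cover the non-review-thread path, the passing path where the thread is replied to and resolved, the passing path with explicit escalation, the fail-closed paths for missing thread_id, no reply, and not resolved, and the controller-facing wrapper.

Source-regression tests also cover the new rules: the GitHub State Contract in SKILL.md must state that a PR review comment fix needs the original thread replied to and resolved, or an explicit escalation; deterministic marker normalization in meta-reflector-stalled.md must go to retry-fix, not human escalation. No sleep/delay test pacing, [Skip], manual bypass, weakened assertions, or mock-only pseudo-coverage was added.

The verification commands were run on the PR head archive: python3 skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py, 9 tests OK; python3 skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py, 7 tests OK.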


📎 Full raw codex output (archived for reference)

pr: 402
role: tests
head_sha: 6fdbbba
verdict: approve

Verdict

Test coverage and test quality are adequate for the PR diff.

Evidence

  • skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:95 covers the non-review-thread path so ordinary fixes are not blocked by the new completion gate.
  • skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:104 and skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:135 cover the core business outcome: review-thread-driven fixes pass only when the original thread is replied to and resolved, and fail closed when either condition is missing.
  • skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:155 covers the controller-facing wrapper, not only the standalone helper.
  • skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py:25 adds source-regression coverage for the new GitHub State Contract wording around PR review thread closure.
  • skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py:137 adds source-regression coverage for deterministic marker normalization routing to retry-fix rather than human escalation.
  • No new sleep/delay test pacing, [Skip]/manual-test bypass, assertion weakening, or mock-only pseudo-coverage was introduced in the PR diff.
  • Verification run on PR head archive passed: python3 skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py ran 9 tests OK; python3 skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py ran 7 tests OK.

⟦AI:AUTO-LOOP⟧
REVIEW_DONE:402:tests:approve

⟦AI:AUTO-LOOP⟧

@loning
Contributor

loning commented Jun 3, 2026

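🤖 fix-codex blocked: the current worker is not in the PR head worktree

TL;DR

  • What this is: the result of the reject-fix attempt for PR 402 round 1.
  • Where things stand / conclusion: the architect's reject is a valid blocker, but the current directory is auto-refact-dev, not the PR head fix/pr-review-thread-completion-gate, so the code cannot be changed safely.
  • What the maintainer needs to do OR the controller's next step: the controller needs to redispatch fix-codex to the PR head worktree and then run the same fix.

Details

The blocking demand comes from architect: skills/codex-refactor-loop/SKILL.md:504 still says FIX_DONE redispatches reviewers directly, while the new contract at skills/codex-refactor-loop/SKILL.md:532 requires a review-thread-driven fix to reply to and resolve the original review thread, or explicitly escalate. The demand should be met by wiring validate_review_thread_completion, the "review-thread completion evidence check" helper, in ahead of the actual FIX_DONE → dispatch_reviewers route, and by adding a behavior test.

I applied no code changes, because the runtime evidence shows that PR 402's head is fix/pr-review-thread-completion-gate while the current git branch --show-current is auto-refact-dev, and git worktree list --porcelain shows no worktree for that PR head either. The fix prompt explicitly forbids checkout and forbids modifying other clusters/PRs, so editing source here would land the fix on the wrong branch.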
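
The branch check that led to this block can be expressed as a small preflight, shown here as a sketch. `fix_preflight` is a hypothetical function; it assumes the caller has already obtained the PR head ref (for example from `gh pr view --json headRefName`) and the current branch (from `git branch --show-current`), and it returns the blocked marker in the format used in this comment.

```python
from typing import Optional


def fix_preflight(
    pr: int, round_number: int, head_ref: str, current_branch: str
) -> Optional[str]:
    """Return None when the worker is on the PR head, else a FIX_BLOCKED marker."""
    if current_branch == head_ref:
        return None  # safe to edit: changes land on the PR's own branch
    # Wrong worktree: refuse to edit rather than check out another branch.
    return f"FIX_BLOCKED:{pr}:round-{round_number}:other:wrong-worktree"
```

Running the check before any edit keeps a misplaced worker from committing a fix to an unrelated branch.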


📎 Full raw codex output (archived for reference)
# Fix report for PR 402 round 1

## Applied
- None. The current worktree is `auto-refact-dev`, while PR 402 head is `fix/pr-review-thread-completion-gate`; applying source edits here would modify the wrong branch.

## Rejected as false positive
- None.

## Blocked (cannot fix this round)
- reviewer:architect's demand: `skills/codex-refactor-loop/SKILL.md:504` still routes `FIX_DONE` directly to reviewer redispatch while the new completion contract at `skills/codex-refactor-loop/SKILL.md:532` requires review-thread reply+resolution or explicit escalation. This is a valid in-scope demand, but this fix-codex session is not on PR 402's head branch. `gh pr view 402 --json baseRefName,headRefName` reports `baseRefName=dev`, `headRefName=fix/pr-review-thread-completion-gate`; `git branch --show-current` reports `auto-refact-dev`; `git worktree list --porcelain` shows no worktree for `fix/pr-review-thread-completion-gate`. The prompt forbids checkout and says not to touch other cluster PRs, so applying the fix here would violate the branch/scope boundary.

## Build status
- build: n=skipped
- tests: n=skipped

## Recommendation for next round
- controller routes to a fix worker running in a worktree attached to PR head `fix/pr-review-thread-completion-gate`, then applies the valid architect demand: wire `validate_review_thread_completion` into the actual `FIX_DONE` to `dispatch_reviewers` path and add a behavior test proving missing review-thread reply/resolution blocks reviewer redispatch while replied+resolved or explicit escalation proceeds.
- `FIX_BLOCKED:402:round-1:other:wrong-worktree`

⟦AI:AUTO-LOOP⟧
FIX_BLOCKED:402:round-1:other:wrong-worktree

⟦AI:AUTO-LOOP⟧

Align the SKILL routing table with the implemented review-thread completion gate so FIX_DONE redispatch is documented as conditional for review-thread-driven fixes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@loning
Contributor

loning commented Jun 4, 2026

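🤖 Tests review approved

TL;DR

  • What this is: an independent tests-perspective review of PR 402.
  • Where things stand / conclusion: approve; the new review-thread completion gate is protected by behavior tests and source-regression tests.
  • What the maintainer needs to do OR the controller's next step: the controller can include this reviewer result in the review truth table.

Details

I checked the three-dot diff, the Python/Markdown files actually touched, and the corresponding test files. ReviewThreadCompletionEvidence is the evidence object for the review-fix completion gate; the tests cover the non-review-thread path, the original thread replied to and resolved, explicit escalation, missing thread_id, unreplied/unresolved, and the ControllerActions.validate_review_fix_completion wrapper path.

The rule changes to the prompts and SKILL.md are also locked by source-regression tests, including that a PR review comment fix must complete closure of the original review thread, canonical marker identity, and that deterministic marker normalization should go to retry-fix rather than human escalation. The diff adds no sleep/delay, [Skip], manual-test markers, weakened assertions, or mock-only pseudo-coverage.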


📎 Full raw codex output (archived for reference)
---
pr: 402
role: tests
head_sha: 066f84c291186d4a7b6748778c994a80c8db197c
verdict: approve
---

## Verdict
Test coverage and quality are adequate for the review-thread completion gate and prompt contract changes.

## Evidence
- `skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:95` covers the non-review-thread path, so unrelated fixes are not forced through the new completion gate.
- `skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:104` and `skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:114` cover the two intended successful outcomes: original thread replied+resolved, or explicit escalation evidence.
- `skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:125`, `skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:135`, and `skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:155` cover fail-closed behavior for missing thread id, unreplied/unresolved original thread, and the `ControllerActions.validate_review_fix_completion` wrapper path.
- `skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py:25` locks the new `SKILL.md` GitHub state contract text requiring review-thread closure.
- `skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py:63` and `skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py:137` lock the prompt no-regression rules for canonical marker identity and deterministic marker normalization routing.
- The PR diff adds no `sleep/delay`, `[Skip]`, manual-test marker, assertion weakening, or mock-only pseudo-coverage in touched tests.

⟦AI:AUTO-LOOP⟧
REVIEW_DONE:402:tests:approve

⟦AI:AUTO-LOOP⟧

@loning
Contributor

loning commented Jun 4, 2026

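🤖 architect review: reject, the review-thread completion gate is not wired into the runtime path

TL;DR

  • What this is: an independent architecture compliance review of PR 402.
  • Where things stand / conclusion: reject; the new PR review-thread closure contract lands only in a helper and tests and is not wired into the FIX_DONE completion route.
  • What the maintainer needs to do OR the controller's next step: the controller should dispatch fix-codex to wire the original-thread reply/resolve evidence gate in ahead of reviewer redispatch.

Details

I see that the PR adds a PR review comment fix contract: a fix driven by a review comment must reply to and resolve the original review thread, or explicitly escalate; this is a GitHub-visible state requirement. The problem is that the current runtime path still maps FIX_DONE (the marker with which fix-codex declares the fix complete) directly to dispatch_reviewers (the controller action that redispatches reviewers) without validating ReviewThreadCompletionEvidence (evidence that the original thread was replied to, resolved, or escalated).

This violates CLAUDE.md's verifiable-change requirement: a new behavior contract cannot stop at documentation and an unwired helper. The FIX_DONE -> dispatch_reviewers transition needs to fail closed first: when a review-thread-driven fix lacks the original thread id, a reply, a resolved state, or escalation evidence, the controller must wait or escalate and must not go on to dispatch reviewers.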


📎 Full raw codex output (archived for reference)
---
pr: 402
role: architect
head_sha: 066f84c291186d4a7b6748778c994a80c8db197c
verdict: reject
---

## Verdict
reject — the PR adds a mandatory PR review-thread completion contract but does not wire that fail-closed check into the `FIX_DONE` runtime transition, so the architecture can still declare a review-comment fix complete without replying/resolving the original thread.

## Evidence
- `skills/codex-refactor-loop/SKILL.md:532` adds the contract: `PR review comment fix | Completion includes review-thread closure: fixes driven by PR review comments are incomplete until the original thread is replied to and resolved, or explicitly escalated.` This is a net-new workflow requirement.
- `skills/codex-refactor-loop/scripts/codex_refactor_loop/review_fix_dispatch.py:69` defines `validate_review_thread_completion(...)`, and lines 76-81 fail closed on missing `thread_id`, missing reply, or unresolved thread. `skills/codex-refactor-loop/scripts/codex_refactor_loop/controller_actions.py:574` only exposes this as `validate_review_fix_completion(...)`; the diff contains no production caller for that method.
- `skills/codex-refactor-loop/scripts/codex_refactor_loop/wakeup_plan.py:484` still classifies `FIX_DONE` as `review-gate`, and `skills/codex-refactor-loop/scripts/codex_refactor_loop/wakeup_plan.py:2199` still maps `FIX_DONE` directly to `dispatch_reviewers`. That bypasses the new completion evidence entirely, so a review-thread-driven fix can advance without the original thread being replied/resolved or explicitly escalated.
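- PROJECT_RULES clause: `Behavior changes must come with a **behavior test** (asserting the behavior itself) + a **source-regression test** (literal assertions on SKILL.md section headings, narrow allowlist literals, authorized-source paths, and the like), to prevent "docs changed but implementation did not" or the reverse.` The PR adds source-regression coverage and a helper-level behavior test, but not the actual workflow behavior at the controller route that marks a fix complete.
- PROJECT_RULES clause: `Changes must be verifiable and evidence-based; failures, gaps, and out-of-scope commitments must be exposed explicitly, and passing must never be bought with silent assumptions or disabled tests.` The current route silently assumes `FIX_DONE` is enough evidence even for PR review-comment fixes, contradicting the new evidence requirement.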

## What would change your verdict
Wire review-thread completion evidence into the production completion path before `FIX_DONE` can route to reviewer redispatch: either make `wakeup_plan.controller_action_from_marker("FIX_DONE...")` / the runner action require validated `ReviewThreadCompletionEvidence` for review-thread-driven fixes, or route incomplete evidence to wait/escalation instead of `dispatch_reviewers`. Add a behavior test for the actual `FIX_DONE -> dispatch_reviewers` route proving open/unreplied original review threads block and replied+resolved or explicit escalation passes.

⟦AI:AUTO-LOOP⟧
REVIEW_DONE:402:architect:reject

⟦AI:AUTO-LOOP⟧

@loning
Contributor

loning commented Jun 4, 2026

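🤖 Quality review: comment

TL;DR

  • What this is: a quality review of PR 402 from the readability / simplicity angle.
  • Where things stand / conclusion: the verdict is comment; the main code is readable and I saw no dead code or over-abstraction, but one documentation count is inconsistent.
  • Controller's next step: either have fix-codex fix the stale number in the bootstrap section in passing, or treat it as advisory.

Details

I found no quality issue that needs to block the merge. The new ReviewThreadCompletionEvidence is the "review thread completion evidence object", carrying facts such as the original review thread's id, whether it was replied to, and whether it was resolved; it has two call paths, the controller and the wakeup planner, so it is not an orphaned public surface. The review-thread gate added in wakeup_plan.py is also split into several helpers with business names, and each function is a readable size.

The only thing I suggest fixing: skills/codex-refactor-loop/SKILL.md:486 now reads "7 restart-helper-managed daemons", but the same bootstrap anti-pattern section still says "six required restart-helper-managed daemons". Later maintainers will see two different counts; I suggest changing it to "7 required" or removing the number.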


📎 Full raw codex output (archived for reference)
---
pr: 402
role: quality
head_sha: 066f84c291186d4a7b6748778c994a80c8db197c
verdict: comment
---

## Verdict
Comment: code shape is readable and focused, but one touched SKILL.md sentence leaves a stale count that should be cleaned up.

## Evidence
- skills/codex-refactor-loop/SKILL.md:486 and skills/codex-refactor-loop/SKILL.md:494: the diff changes bootstrap to "ensure all 7 restart-helper-managed daemons" while the same bootstrap anti-pattern block still says "fewer than the six required restart-helper-managed daemons". This is a small readability/maintenance issue because the contract now contains two visible daemon counts in the same local section.
- skills/codex-refactor-loop/scripts/codex_refactor_loop/review_fix_dispatch.py:54: `ReviewThreadCompletionEvidence` names the domain object directly; its validation helper is reachable from `ControllerActions.validate_review_fix_completion` and `wakeup_plan.completed_marker_actions`, so I do not see dead code or a single-implementer abstraction problem.
- skills/codex-refactor-loop/scripts/codex_refactor_loop/wakeup_plan.py:694: `_apply_fix_done_review_thread_gate` is small and delegates artifact parsing, escalation lookup, and live GraphQL lookup to named helpers; the branch count stays readable inside the existing planner style.
- skills/codex-refactor-loop/scripts/test_wakeup_plan.py:68: `write_fake_gh` is already a large fixture factory and grows further with review-thread cases, but this is test fixture debt around an existing pattern rather than a must-fix regression for this PR.

## What would change your verdict (only if comment or reject)
Update the stale bootstrap anti-pattern text from "six required restart-helper-managed daemons" to "7 required restart-helper-managed daemons" or remove the number so the section has a single source of truth.

⟦AI:AUTO-LOOP⟧
REVIEW_DONE:402:quality:comment

⟦AI:AUTO-LOOP⟧

@loning
Contributor

loning commented Jun 4, 2026

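🤖 fix-codex round 2: no code changes; the architect reject is already satisfied by the current PR head

TL;DR

  • What this is: the fix-codex result for round 2 of PR #402 ("Make review comment and refactor marker conflicts converge").
  • Where things stand / conclusion: the only reject is from architect, but its demand is already implemented in the current origin/pr/402 and covered by behavior tests, so this round applies 0 code fixes and rejects 1 false positive.
  • Controller's next step: redispatch review-gate; per the truth table reject=0 && approve>=1 -> MERGE, this reject is expected to clear.

Details

FIX_DONE is the routing marker emitted after a fix worker finishes; architect requires that it not go straight to dispatch_reviewers but first check whether the original PR review thread has been replied to and resolved, or has an explicit escalation. The current PR head already satisfies this: wakeup_plan.py calls _apply_fix_done_review_thread_gate(...) on every FIX_DONE completed marker, and on failure it changes the action to status_only and removes controller_action, so wakeup-runner does not go on to run the reviewer dispatch.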
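
The blocking behavior attributed to `_apply_fix_done_review_thread_gate` in this comment can be sketched as a pure transformation on an action dict. The route name, the blocked_reason prefix, and the three removed keys come from the fix report; the function name without the underscore, the `kind` key, and the `failure_reason` parameter are assumptions for illustration only.

```python
from typing import Any, Dict, Optional


def apply_fix_done_review_thread_gate(
    action: Dict[str, Any], failure_reason: Optional[str]
) -> Dict[str, Any]:
    """Return the action unchanged when evidence validated, else a blocked copy.

    `failure_reason` is None when the review-thread completion evidence passed.
    """
    if failure_reason is None:
        return dict(action)  # reviewer redispatch may proceed
    gated = dict(action)
    gated["kind"] = "status_only"  # hypothetical key for the action type
    gated["route"] = "review-thread-completion-gate"
    gated["blocked_reason"] = f"review_thread_completion_incomplete:{failure_reason}"
    # Without these keys the runner has nothing executable to apply.
    for key in ("controller_action", "runner_authority", "no_generic_command"):
        gated.pop(key, None)
    return gated
```

A route-level behavior test would assert that the blocked copy has no `controller_action`, which is the property the architect review asked for.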

我用 origin/pr/402 的 archive 副本验证,避免当前工作树上其他分支改动干扰。compileall 通过,相关 137 个测试通过;第一次验证失败是因为 shell 继承了 CONSENSUS_RND_HOST_ENV=.config/consensus-rnd/host.env,而 /tmp archive 副本没有该文件,重新 env -u CONSENSUS_RND_HOST_ENV 后通过。
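The gate behavior described above (downgrade to status_only and strip the fields that would let the runner act) might look roughly like the following. This is a sketch under stated assumptions: the action is modeled as a plain dict with a `kind` key, and `apply_fix_done_gate` is a hypothetical stand-in for `_apply_fix_done_review_thread_gate`, whose real signature takes `repo_root, ctx, action`.

```python
def apply_fix_done_gate(action, evidence_ok, reason=""):
    """Return the action unchanged when evidence passes, else a blocked copy.

    Hypothetical sketch; the dict layout is an assumption for illustration.
    """
    if evidence_ok:
        return action
    blocked = dict(action)  # copy so the caller's action is not mutated
    blocked["kind"] = "status_only"
    blocked["route"] = "review-thread-completion-gate"
    blocked["blocked_reason"] = f"review_thread_completion_incomplete:{reason}"
    # Without these keys the runner has nothing executable to apply.
    for key in ("controller_action", "runner_authority", "no_generic_command"):
        blocked.pop(key, None)
    return blocked
```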


📎 Full raw codex output (archived for reference)

Fix report for PR 402 round 2

Applied

  • None. No in-scope code fix was needed because the only reject demand is already satisfied by the current PR head (origin/pr/402 / 066f84c291186d4a7b6748778c994a80c8db197c).
  • refactor self-doc: not applicable (HOST_REFACTOR_COMMENT_POLICY=none)

Rejected as false positive

  • reviewer:architect cited skills/codex-refactor-loop/scripts/codex_refactor_loop/wakeup_plan.py:484 and skills/codex-refactor-loop/scripts/codex_refactor_loop/wakeup_plan.py:2199 as still routing FIX_DONE directly to dispatch_reviewers. That is stale against the actual PR three-dot diff for origin/dev...origin/pr/402: the PR includes skills/codex-refactor-loop/scripts/codex_refactor_loop/wakeup_plan.py, and the current file calls _apply_fix_done_review_thread_gate(repo_root, ctx, action) for every FIX_DONE completed marker at wakeup_plan.py:688-689.
  • reviewer:architect demanded that FIX_DONE -> dispatch_reviewers require validated ReviewThreadCompletionEvidence or route incomplete evidence to wait/escalation. Current PR head does exactly that at wakeup_plan.py:694-717: _apply_fix_done_review_thread_gate loads review-thread completion evidence, calls validate_review_thread_completion(evidence), and on failure converts the action to status_only, sets route = "review-thread-completion-gate", records blocked_reason = "review_thread_completion_incomplete:...", and removes controller_action, runner_authority, and no_generic_command.
  • reviewer:architect demanded behavior coverage for the actual route. Current PR head includes it in skills/codex-refactor-loop/scripts/test_wakeup_plan.py: test_fix_done_with_unresolved_original_review_thread_blocks_dispatch_reviewers asserts unresolved original review threads become status_only with no controller_action; test_fix_done_review_thread_completion_artifact_allows_dispatch_reviewers asserts replied+resolved evidence still dispatches reviewers; test_fix_done_explicit_escalation_allows_unresolved_review_thread asserts clean escalation evidence passes; additional tests cover live unresolved, unknown live state, pagination, missing repo slug, and clean-marker escalation source requirements.
  • Runner bypass concern is also false positive on current PR head: skills/codex-refactor-loop/scripts/codex_refactor_loop/wakeup_runner.py:168-178 iterates only the closed action projection from build_plan, skips status_only actions, and dispatches dispatch_reviewers only from an action that survives plan validation. It does not call controller_action_from_marker("FIX_DONE...") directly.
  • reviewer:quality had a comment, not a blocking reject, about stale "six required restart-helper-managed daemons" wording. Per prompt, comments are context and not mandatory fix demands in this round. This remains advisory context for a later cleanup.

Blocked (cannot fix this round)

  • None.

Build status

  • build: pass (env -u CONSENSUS_RND_HOST_ENV python3 -m compileall skills/codex-refactor-loop/scripts skills/sshx -q in an origin/pr/402 archive verification copy)
  • tests: pass (env -u CONSENSUS_RND_HOST_ENV python3 -m unittest skills/codex-refactor-loop/scripts/test_wakeup_plan.py skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py; 137 tests, OK, in an origin/pr/402 archive verification copy)
  • note: the first verification attempt failed because the parent shell exported CONSENSUS_RND_HOST_ENV=.config/consensus-rnd/host.env, which is not present in the /tmp archive verification copy. Re-running with env -u CONSENSUS_RND_HOST_ENV verified the PR behavior.

Recommendation for next round

  • Expect the architect reject to clear because the requested production FIX_DONE route gate and behavior tests are already present on the current PR head. The quality stale-count comment is advisory and should not block the fixed truth table (reject=0 && approve>=1 -> MERGE).

⟦AI:AUTO-LOOP⟧
FIX_DONE:402:round-2:applied-0:rejected-1:blocked-0

⟦AI:AUTO-LOOP⟧

@loning
Contributor

loning commented Jun 4, 2026

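🤖 Quality review passed

TL;DR

  • What this is: the readability / simplicity quality review result for PR #402 ("Make review comment and refactor marker conflicts convergeable").
  • Conclusion: approve; no issues found with naming, dead code, over-abstraction, duplicated logic, or self-doc violations under HOST_REFACTOR_COMMENT_POLICY=none.
  • Next step for the controller: this verdict can go into this round's review-gate aggregation.

Details

I reviewed the actual PR files using the three-dot diff origin/dev...origin/fix/pr-review-thread-completion-gate and opened the touched files in full context. ReviewThreadCompletionEvidence is the evidence object that records whether the original PR review thread was replied to, resolved, or explicitly escalated; its name expresses a business fact. validate_review_thread_completion is only 13 lines, and its branches map directly to the fail-closed conditions, so there is no complexity or abstraction-bloat problem.

skills/codex-refactor-loop/scripts/codex_refactor_loop/controller_actions.py:574: validate_review_fix_completion is the controller API boundary, that is, the entry point through which the controller invokes review-fix completion validation. It is covered by tests starting at skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:95, which shows that the new public surface is not dead code. The prompt changes stay focused on review comment completion and refactor marker normalization and do not spread source comments such as Refactor / Old pattern / New principle into the none policy.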
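To make the fail-closed branches concrete, here is a minimal sketch of an evidence object and validator in the spirit of the ones reviewed above. The field names, the reason strings, and the `ThreadEvidence` / `validate` names are all assumptions for illustration; the actual `ReviewThreadCompletionEvidence` fields are not shown on this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ThreadEvidence:
    """Hypothetical stand-in for ReviewThreadCompletionEvidence."""
    thread_driven: bool
    thread_id: Optional[str] = None
    replied: bool = False
    resolved: bool = False
    escalated: bool = False


def validate(evidence):
    """Return None when the fix may advance, else a failure reason."""
    if not evidence.thread_driven:
        return None  # not driven by a review thread: nothing to close
    if not evidence.thread_id:
        return "missing_thread_id"
    if evidence.escalated:
        return None  # an explicit escalation is an accepted outcome
    if not evidence.replied:
        return "thread_not_replied"
    if not evidence.resolved:
        return "thread_not_resolved"
    return None
```

Returning a reason string rather than raising keeps the caller free to record it in a blocked_reason field.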

Verification run:

python3 skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py && python3 skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py

Result: both scripts passed.


📎 Full raw codex output (archived for reference)
---
pr: 402
role: quality
head_sha: 066f84c291186d4a7b6748778c994a80c8db197c
verdict: approve
---

## Verdict
Approve: the PR keeps the completion gate readable and focused, with no quality-blocking naming, dead code, over-engineering, duplication, or self-doc policy regression under `none`.

## Evidence
- `skills/codex-refactor-loop/scripts/codex_refactor_loop/review_fix_dispatch.py:54`: `ReviewThreadCompletionEvidence` names the business evidence being carried, and the fields are direct review-thread completion facts rather than generic helper state.
- `skills/codex-refactor-loop/scripts/codex_refactor_loop/review_fix_dispatch.py:69`: `validate_review_thread_completion` is 13 lines with three straightforward fail-closed branches; no method-size or cyclomatic-complexity concern.
- `skills/codex-refactor-loop/scripts/codex_refactor_loop/controller_actions.py:574`: `validate_review_fix_completion` is a thin controller boundary over the shared validator, and `skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:95` through `skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:112` exercise both the non-thread and completed-thread paths.
- `skills/codex-refactor-loop/prompts/review-fix.md:47`: the prompt change stays on the PR's stated marker-normalization/refactor-comment policy path; it does not add unconditional `Refactor` / `Old pattern` / `New principle` source comments under the `none` policy.
- Local verification passed: `python3 skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py && python3 skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py`.

⟦AI:AUTO-LOOP⟧
REVIEW_DONE:402:quality:approve

⟦AI:AUTO-LOOP⟧

@loning
Contributor

loning commented Jun 4, 2026

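🤖 Architecture re-review passed

TL;DR

  • What this is: the independent architect r3 re-review of PR 402, checking the CLAUDE/AGENTS architecture boundaries.
  • Conclusion: approve; the current FIX_DONE review-thread gate introduces no new lifecycle overreach.
  • Next step for the controller: aggregate this round's verdicts according to the review truth table.

Details

Against live PR head 066f84c291186d4a7b6748778c994a80c8db197c, I rechecked three points: whether FIX_DONE still flows directly into reviewer redispatch, whether .refactor-loop/state/review-thread-completion/prN.json is treated as the host production SSOT, and whether the new rule has mechanical verification. The verdict is pass: wakeup_plan.py applies the review-thread gate before emitting the wakeup-runner action and, when evidence is incomplete, changes the action to status_only and removes controller_action / runner_authority, so the runner does not execute the redispatch.

ReviewThreadCompletionEvidence is the review-thread completion evidence structure added by this PR. It records whether the fix was driven by a PR review thread, the original thread id, whether the thread was replied to and resolved, and whether a clean escalation marker exists. The key point is that it is not a standalone source of authority: the plan still re-queries the live GitHub review-thread state, and an unknown state, an unresolved thread, an unresolved thread found during pagination, or a missing repo slug all fail closed.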


📎 Full raw codex output (archived for reference)
---
pr: 402
role: architect
head_sha: 066f84c291186d4a7b6748778c994a80c8db197c
verdict: approve
---

## Verdict
approve: no architectural compliance concerns; merge OK from architect angle.

## Evidence
- `skills/codex-refactor-loop/SKILL.md:522` narrows the `FIX_DONE` route so review-thread-driven fixes only redispatch reviewers after review-thread completion evidence passes; this aligns with CLAUDE.md: "Changes must be verifiable: behavioral constraints must be backed by a mechanical verification method (behavior test / source-regression test / paragraph lint)".
- `skills/codex-refactor-loop/scripts/codex_refactor_loop/wakeup_plan.py:688` applies `_apply_fix_done_review_thread_gate` before the wakeup-runner action is emitted; `wakeup_plan.py:701-712` converts incomplete evidence to `status_only`, removes `controller_action`, `runner_authority`, and `no_generic_command`, and keeps the action non-lifecycle. This preserves CLAUDE.md #396: "`wakeup-runner` may only consume the evidence-bound closed action projection produced by `wakeup-plan`".
- `skills/codex-refactor-loop/scripts/codex_refactor_loop/wakeup_plan.py:720-741` treats `.refactor-loop/state/review-thread-completion/prN.json` as skill-private completion evidence and rechecks live GitHub review-thread state before allowing dispatch; this does not make `.refactor-loop/host.env` or `.refactor-loop/` a host production SSOT, matching FI-002: "skill-private runtime directories such as `.refactor-loop/` must not become host production configuration or ledger SSOT."
- `skills/codex-refactor-loop/scripts/test_wakeup_plan.py:1247-1482` covers unresolved, resolved, live-state-unknown, paginated, missing repo slug, and clean escalation cases; `skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py` adds source-regression coverage for the SKILL.md/prompt contract.

⟦AI:AUTO-LOOP⟧
REVIEW_DONE:402:architect:approve

⟦AI:AUTO-LOOP⟧

@loning
Contributor

loning commented Jun 4, 2026

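🤖 tests review r3: approve

TL;DR

  • What this is: the test-quality re-review of PR 402, scoped to test coverage and test quality only.
  • Current status / conclusion: approve; the new gate, seed, and prompt contract on current head 066f84c291186d4a7b6748778c994a80c8db197c each have matching tests.
  • Next step for the controller: this tests verdict can go into this round's review-gate aggregation.

Details

I re-read the current PR head using the three-dot diff rather than a stale local branch ref. ReviewThreadCompletionEvidence is the evidence object checked before a review-fix can complete; it now has direct behavior tests covering the non-review-thread path, the replied-and-resolved success path, the explicit human escalation path, and the failure paths for a missing thread id, an unreplied thread, an unresolved thread, and a malformed escalation.

wakeup_plan.py is the read-only action projection for wakeup-runner; the new tests cover that a FIX_DONE marker must pass the review-thread completion gate before it is converted to dispatch_reviewers. The key boundaries are also tested: without a seed, unrelated unresolved threads do not block; with a seed and a live unresolved thread, the action becomes status_only; local resolved evidence cannot bypass a live unresolved state; and GraphQL query failure, pagination, and a missing repo slug all fail closed. The related tests were run against the current head in an archive: 137 tests, OK.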


📎 Full raw codex output (archived for reference)
---
pr: 402
role: tests
head_sha: 066f84c291186d4a7b6748778c994a80c8db197c
verdict: approve
---

## Verdict
Test coverage and quality are adequate for the review-thread completion gate and refactor marker convergence changes.

## Evidence
- `skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:121` covers seeding `.refactor-loop/state/review-thread-completion/pr<N>.json` from a live unresolved review thread, while `:158`, `:198`, and `:217` cover stale-seed cleanup, lookup failure fail-closed seed behavior, and GraphQL pagination.
- `skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py:265`, `:274`, `:284`, `:295`, `:307`, `:317`, and `:337` cover the new `validate_review_thread_completion` branches: non-thread-driven bypass, replied+resolved success, explicit escalation success, malformed escalation rejection, missing thread id rejection, unreplied/unresolved rejection, and the `ControllerActions.validate_review_fix_completion` wrapper path.
- `skills/codex-refactor-loop/scripts/test_wakeup_plan.py:1217` covers normal `FIX_DONE -> dispatch_reviewers` projection with the new `review_thread_completion_evidence` precondition and no generic command fields.
- `skills/codex-refactor-loop/scripts/test_wakeup_plan.py:1236`, `:1247`, `:1276`, and `:1301` cover the core completion gate outcomes: no seeded review-thread artifact ignores unrelated unresolved threads, unresolved original thread blocks dispatch, resolved original thread dispatches reviewers, and stale local resolved evidence cannot bypass live unresolved state.
- `skills/codex-refactor-loop/scripts/test_wakeup_plan.py:1324`, `:1352`, `:1376`, `:1408`, `:1431`, and `:1457` cover explicit escalation with clean marker source, local escalation without clean marker source, unknown live state, unresolved outdated thread, paginated live thread lookup, and missing repo slug.
- `skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py:25`, `:35`, and `:43` lock the new `SKILL.md` and `review-fix.md` source contracts for review-thread closure before `FIX_DONE` can advance.
- `skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py:87` and `:161` lock the source-regression rules for canonical refactor marker identity and deterministic marker normalization routing.
- The touched tests add no `sleep/delay`, `[Skip]`, manual-test marker, assertion weakening, or mock-only pseudo-coverage. I verified current PR head with `env -u CONSENSUS_RND_HOST_ENV python3 -m unittest skills/codex-refactor-loop/scripts/test_wakeup_plan.py skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py` in an `origin/pr/402` archive: 137 tests passed.

⟦AI:AUTO-LOOP⟧
REVIEW_DONE:402:tests:approve

⟦AI:AUTO-LOOP⟧

@loning loning merged commit da51991 into ChronoAIProject:dev Jun 4, 2026
4 checks passed
@loning loning added crnd:phase:merged merged and removed crnd:phase:reviewing review-gate reviewers in flight labels Jun 4, 2026
loning added a commit that referenced this pull request Jun 5, 2026
…act-dev

# Conflicts:
#	skills/codex-refactor-loop/scripts/codex_refactor_loop/controller_actions.py
#	skills/codex-refactor-loop/scripts/codex_refactor_loop/wakeup_plan.py
#	skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py
#	skills/codex-refactor-loop/scripts/test_wakeup_plan.py
loning added a commit that referenced this pull request Jun 6, 2026
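* #471 hotfix: wakeup_plan completed-marker projection is filtered by the open-managed set; closed/merged targets are not projected (stops the graphql bleed: removes wakeup_runner's per-item revalidation of already-terminated targets)

* #410: SKILL/prompt comments now cite the durable contract anchor instead of one-off .refactor-loop/runs/*; adds the narrow test_generated_artifacts_not_authority source-regression (#472)

* #471 hotfix2: wakeup-plan projects only the latest valid marker per target; older rounds no longer project a dispatchable action (the final cut to stop graphql backoff)

* skill: strengthen the no-gap rule: with 0 codex and active work there is no observation-mode exemption; the controller must intervene and dispatch codex within the same turn (+source-regression)

* #470: when rendering the direct-post prompt, inline _github-post-rules as a fixed token so it is reachable from the host worktree cwd (avoids bare relative paths) (#479)

* #478 hotfix: daemon spawn switches to a launch-and-forget independent supervisor (start_new_session, no wait), so a daemon reload does not lose in-flight workers; the daemon prints a clear status line every tick

* #471: wakeup-plan latest-open-fresh projection: completed-marker open-target/latest + release-rollup only latest-per-sha, verified against a local ref with ahead>0 (stops stale projection/graphql) (#482)

* #481: consensus->implement readiness-dedup gate (stops the daemon from redispatching closed/has-PR/in-flight targets) (#486)

* #481: consensus->implement readiness-dedup gate; closed/closing-PR/remote-branch/in-flight targets do not project a dispatchable action, stopping duplicates/races

* #481 fix r1: the dedup gate reuses the concurrency monitor owner surface to measure in-flight work instead of reimplementing the daemon algorithm (addr architect)

* #481 fix r2: add a behavior test for the controller helper guard (addr tests)

* #485: consensus->implement is serialized per scope_paths conflict group (one executable per group, the rest status_only), reusing batching-heuristics to prevent parallel conflicting PRs (#489)

* #475: wakeup-runner WakeupApplyBudget: spawn-worker actions are batched up to the floor deficit per hard_gate.dispatch_required (lifecycle actions remain 1/tick), fixing 1/tick linear recovery (#492)

* hotfix(#475 ext): the batch budget now includes dispatch_design_consensus, so a solver triplet is fully dispatched in one tick (non-lifecycle, worker dispatch); fixes design-consensus still running at 1/tick low concurrency

* hotfix: headless spawn-intent launch: a skipped duplicate no longer breaks the whole spawn batch, a stale applied ledger (missing log) can be retried, and helper_exit:3 gains a source diagnostic event; fixes a pile-up of 248 intents with 0 concurrency

* hotfix r2: a blocked lifecycle action no longer dead-stops the whole tick: stale blocked actions (close-drop target OPEN / publish without a verified head) are skipped/suppressed and subsequent spawn-batch launches continue, bringing headless concurrency up to the floor

* hotfix r3: root-cause fix for headless 0 concurrency: the detached spawn-codex launch explicitly passes env_for_subprocess (fixes the env lost by the #478 decouple, which caused EXIT=127); a failed log (non-zero EXIT) can retry the launch, while EXIT=0/unfinished still suppresses duplicates; in the tick status summary, subsequent launches are no longer masked by the first blocked action

* hotfix r4: generic blocked-not-dead-stop in run_once: any blocked/skipped non-spawn action no longer dead-stops the whole tick, and subsequent spawn-batch launches continue; wakeup-plan suppresses stale actions whose precondition is bound to fail; root-cause fix for headless concurrency=0

* hotfix r5: on the wakeup-plan projection side, suppress stale close_managed_drop (a drop-close bound to fail because the target is actually OPEN) to status_only, so it no longer occupies the tick and blocks subsequent design-consensus spawn launches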

* hotfix r6: wakeup-runner run_once no longer misclassifies a no-intents dispatch_design_consensus (helper_exit:3) as a codex launch failure that dead-stops the whole tick

Root cause: dispatch_design_consensus is a spawn-batch action. When the solver triplet is incomplete (e.g. #496 minimal/structural had EXIT=0 but did not emit a SOLVER_DONE marker), the helper returns exit 3 (NO_INTENTS); _spawn_launch_failure matches the "helper_exit:" prefix and returns True → run_once breaks, the whole tick dead-stops, and subsequent reviewers/other design-consensus/implement actions are not dispatched → headless concurrency=0.

Fix: in run_once, a blocked spawn-batch action whose controller_action != spawn_codex_harness_background (i.e. dispatch_design_consensus) is treated as a routing no-op with skip-and-continue, and subsequent spawn-batch launches proceed; only a genuine spawn_codex_harness_background launch failure takes the existing retry/break path. This extends the r4 blocked-not-dead-stop invariant to design-consensus spawn-batch actions.

behavior test: test_wakeup_runner_design_consensus_no_intents_does_not_dead_stop_later_spawn_batch (without the fix, the later spawn does not launch and the test FAILs).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

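* hotfix r7: review head_sha binding: unblocks the self-driven headless review→merge path

Root cause: the reviewer artifact frontmatter carried no head_sha, so _review_evidence_from_artifact could not obtain the reviewed head SHA → every reviewing PR (#480/#473/#402) always evaluated to WAIT_OR_REDISPATCH:invalid_reviewer_evidence:missing_reviewed_head_sha, and wakeup-plan did not automatically redispatch reviewers → headless review→merge was permanently stuck (blocking the release).

Fix (implemented by the codex hotfix worker, committed after controller verification):
- When dispatching reviewers, controller_actions reads headRefOid, fails closed if it is missing, and injects the authoritative HEAD_SHA into the rendered prompt.
- The reviewer-{architect,tests,quality}.md frontmatter writes head_sha: ${HEAD_SHA}.
- wakeup_runner._review_head_sha_for: when the artifact lacks head_sha, recover it from the controller-rendered prompt/log (without relying on worker self-reporting, to avoid repeating the markerless failure); the live-head staleness check and the merge truth table semantics are unchanged.
- wakeup_plan projects an executable dispatch_reviewers self-healing action for OPEN reviewing PRs with a missing/stale reviewer head.

Verification: the controller independently re-ran test_wakeup_runner+review_gate+wakeup_plan+controller_actions with 293 passing, and the marker/prompt contracts with 30 passing; the review-gate safety semantics (a stale head still does not merge) are locked by 13 review-gate tests.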

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

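* hotfix r8: make the review-evidence-redispatch self-healing action executable: closes the headless review→merge loop

Follows r7: r7 bound reviewer evidence to head_sha, but the review-evidence-redispatch self-healing action projected by wakeup_plan was forced to status_only by the closed-action-projection finalizer (its kind was not in EXECUTABLE_ACTION_KINDS), so the runner never applied it → review evidence missing a head for #480/#473/#402 could not be redispatched automatically → review_gate always returned WAIT_OR_REDISPATCH → concurrency drained to 0.

Fix (implemented by the codex hotfix worker, committed after controller verification):
- wakeup_plan EXECUTABLE_ACTION_KINDS adds review-evidence-redispatch; the finalizer keeps runner authority and no longer forces status_only.
- controller_actions.dispatch_reviewers reads stale_review_roles and redispatches only the stale roles, and _pending_review_spawn_exists skips any in-flight dispatch-reviewers:<pr>:<role>:r1 intent (idempotent, no double dispatch); it still uses r7's HEAD_SHA injection so the redispatched reviewer's new evidence is head-bound.
- The merge-gate truth table is unchanged; this only makes the existing self-healing action executable and targets it at stale roles.

Verification: the controller independently re-ran test_wakeup_plan+wakeup_runner+review_gate+controller_actions with 295 passing.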

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

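* hotfix r9: reviewer redispatch bumps to the next round + absolute cd + latest-evidence-per-role: actually unblocks headless review→merge

Root cause (same class: suppressing on "artifact exists" rather than "success + distinguishing dimension"): reviewer redispatch was hard-coded back to r1 and collided with an earlier successful (EXIT=0) r1 log from the old head → _harness_spawn_intent_log_suppresses_retry judged it "already done" and suppressed it → the review that should rerun for the current head never ran → review_gate always returned WAIT_OR_REDISPATCH → concurrency 0. In addition, the intent cd was the relative "." → the headless launch layer did not project it.

Minimal fix (per maintainer directive: the file name is the unique id, no wrapper --id is added, and the change stays within the existing r<R> naming convention without altering public naming):
- controller_actions.dispatch_reviewers: _next_review_round takes the highest existing round for that PR/role and adds 1, rendering the r<N+1> prompt/log/output/intent; the pending-spawn idempotency guard uses the real round number; the cd in _append_harness_spawn_intent becomes the absolute path str(cd.resolve()), so headless projection and launch work.
- wakeup_plan.latest_reviewer_heads / wakeup_runner._latest_review_evidence_by_role: changed to "take each role's own latest-round evidence" (rather than the single globally latest round), so redispatching only stale roles does not discard other roles' valid head-bound evidence; merge still requires every role's evidence to be bound to the live head, so safety semantics are unchanged.

Verification: the controller independently re-ran test_wakeup_plan+wakeup_runner+review_gate+controller_actions with 299 passing.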

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
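The "highest existing round + 1" rule from r9 can be sketched as follows. This is a hypothetical helper, not `_next_review_round` itself: it assumes existing rounds are discoverable from intent ids of the form `dispatch-reviewers:<pr>:<role>:r<N>`, which is the id shape quoted in r8.

```python
import re


def next_review_round(intent_ids, pr, role):
    """Return max existing round for (pr, role) plus one; 1 if none exist.

    Hypothetical sketch; the id format is assumed from the r8 message.
    """
    pattern = re.compile(
        rf"^dispatch-reviewers:{pr}:{re.escape(role)}:r(\d+)$"
    )
    rounds = [int(m.group(1)) for m in map(pattern.match, intent_ids) if m]
    return max(rounds, default=0) + 1
```

Because the round number is part of the file or intent name, a successful r1 log from an old head can no longer suppress the rerun for the current head.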

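* hotfix r10: the review_gate action carries the reviewed head_sha + e2e locks down the full headless review→decision path

Root cause (the fifth spot in the review path's head_sha plumbing): when wakeup_plan projected the review_gate action for REVIEW_DONE, it read the head only from the log (which has none) → action.head_sha was empty → wakeup_runner._review_gate got stuck at WAIT_OR_REDISPATCH:missing_action_reviewed_head_sha; the evidence was already head-bound, but the gate had nothing to compare it against.

Fix (minimal, following the file-name-is-the-id approach):
- wakeup_plan.completed_marker_actions: the REVIEW_DONE action's head_sha is now resolved by _review_done_action_head_sha from the authoritative evidence surface (latest_reviewer_heads, including the head injected into the prompt by r7; or the live PR head when every role is head-bound; otherwise falling back to the log). It is populated only from real sources and does not bypass the gate's head comparison, so merge safety semantics are unchanged.
- Adds test_review_gate_e2e.py: from a real wakeup_plan projection → WakeupRunner apply, asserting that the two concrete decisions FIX (a reject is present) and MERGE (all approve + CI green) land; this locks the whole review→decision path end to end so no further hidden layers surface.
- test_wakeup_plan.py: adds a planner regression for the prompt-bound reviewed head.

Verification: the controller independently re-ran test_wakeup_plan+wakeup_runner+review_gate+controller_actions+review_gate_e2e with 302 passing.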

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

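* hotfix r11: PrChecksProjection adds a bounded retry for transient gh-api failures, so a ready PR reaches the review_gate decision headlessly

Root cause: review_gate's CI read, PrChecksProjection.check_pr, failed closed immediately (ci_unavailable:pull_api_failure) on a single non-zero return from gh api pulls/<n> or commits/<sha>/check-runs, with no retry; the rate limit was healthy and the failure was a transient blip → #473/#402 (MERGEABLE + CI green) were blocked by a transient failure every tick and never reached MERGE/FIX.

Fix (minimal, touches only PrChecksProjection; does not touch the #322 ReleaseRequiredChecksProjection):
- _run_api_read: up to 3 deterministic retries (no sleep) for both api reads, stopping on success; persistent failure still fails closed (safe). No behavior change on the happy path.
- Starvation conclusion (verified by the worker): review_gate is already applied, and a blocked ci_unavailable row can be re-evaluated (it is not terminal duplicate-suppressed), so run_once needs no change.
- e2e: test_review_gate_e2e adds a "still merges when the first pulls read fails transiently" case; test_pr_checks adds retry coverage.

Verification: the controller independently re-ran test_pr_checks+wakeup_runner+review_gate+review_gate_e2e+wakeup_plan+controller_actions with 309 passing.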

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
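A bounded, sleep-free retry of the kind r11 describes could be sketched like this. `run_api_read` here is a hypothetical stand-in for `_run_api_read`; it assumes the runner is a callable returning a `subprocess.CompletedProcess` and reads "up to 3" as three total attempts, which the message does not spell out.

```python
from subprocess import CompletedProcess
from typing import Callable, Sequence


def run_api_read(
    runner: Callable[[Sequence[str]], CompletedProcess],
    argv: Sequence[str],
    attempts: int = 3,
) -> CompletedProcess:
    """Call runner(argv) up to `attempts` times, stopping at the first success.

    The last result is returned either way, so a persistent failure still
    reaches the caller and can fail closed there.
    """
    result = runner(argv)
    for _ in range(attempts - 1):
        if result.returncode == 0:
            break
        result = runner(argv)  # immediate retry, no sleep
    return result
```

As r13 later notes, a retry like this only helps with genuinely transient failures; it repeats a malformed command unchanged.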

* hotfix r12: run_once no longer lets the spawn batch starve drain-only lifecycle actions: ready PRs merge headlessly and the backlog can drain

Root cause (verified): every tick, wakeup_plan orders SPAWN actions (implement intent + design-consensus #495/#496/#490/#491/#418, index 0-5) ahead of lifecycle review_gate (#480/#473/#402, index 6-8); the old run_once logic `if applied_spawns>0 and not is_spawn_action: break` meant that once any spawn was dispatched, the first lifecycle action broke the whole tick. design-consensus almost always has live spawns → review_gate/merge starved forever → #473 (approve/approve/comment → MERGE_WITH_COMMENTS, MERGEABLE + CI green) never merged and the active count never dropped.

Fix (minimal, core dispatch loop):
- Introduce consumes_spawn_budget = is_spawn_action or _uses_spawn_budget(action); when over budget, `continue` (not break) so later drain-only lifecycle actions are still reached.
- _uses_spawn_budget: dispatch_reviewers=True; review_gate=True only when the decision is FIX (which dispatches a fix worker); otherwise review_gate (MERGE/MERGE_WITH_COMMENTS), close, and other drain-only actions=False → they do not consume spawn budget and are still evaluated and executed after the floor is filled.
- Keeps r6's spawn launch-failure break/retry and the dispatch_design_consensus no-intents skip-and-continue; the truth table/head-binding/CI-green/mergeable safety semantics are unchanged.

Result: each tick both fills the floor with spawns and evaluates drain-only lifecycle actions such as merge/close, so the backlog can drain.

Verification: the controller independently re-ran test_wakeup_runner+review_gate+review_gate_e2e+wakeup_plan+controller_actions+pr_checks with 309 passing; adds test_wakeup_runner_lifecycle_review_gate_not_starved_after_spawn_batch.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
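The `continue`-instead-of-`break` change in r12 is easiest to see in a toy loop. The sketch below is hypothetical: actions are plain dicts with a precomputed `consumes_spawn_budget` flag, whereas the real run_once derives that flag from the action kind and the gate decision.

```python
def run_once(actions, spawn_budget):
    """Apply actions in plan order; return the names that were applied.

    Hypothetical sketch: budget-consuming actions past the budget are
    skipped, but the scan continues so drain-only actions still run.
    """
    applied = []
    used = 0
    for action in actions:
        if action["consumes_spawn_budget"]:
            if used >= spawn_budget:
                continue  # over budget: skip this one, do not end the tick
            used += 1
        applied.append(action["name"])
    return applied
```

With a budget of 1 and a plan of two spawns followed by a merge, the first spawn and the merge both run in the same tick; under the old `break` the merge would have been starved.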

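* hotfix r13: fix WakeupRunner._run_command injecting --repo before the gh subcommand, which made every gh call malformed: the root cause of the review→merge stall

Root cause (confirmed by measurement): for every gh command, _run_command inserted --repo at full[1:1] (before the subcommand). gh does not accept --repo as a pre-subcommand global flag, and gh api does not take --repo at all (the repo is in the URL): `gh --repo O/R api repos/O/R/pulls/473` → rc=1 "unknown flag: --repo". As a result every gh call from wakeup-runner was malformed: the CI read (gh api pulls/check-runs via PrChecksProjection(runner=_run_command)) → ci_unavailable:pull_api_failure; mergeability (gh pr view) → mergeability_unavailable; live-state and others likewise. → The review gate could never read CI/mergeable → #473 (approve/approve/comment=MERGE_WITH_COMMENTS, green), #402, and #480 never landed headlessly. r11's retry merely repeated the same malformed command. Calling PrChecksProjection() or gh directly uses the plain default runner (no injection), so those paths worked and masked the bug.

Minimal fix: _run_command injects --repo only for `gh pr`/`gh issue`, at the valid position after the subcommand (and its verb); `gh api` is left untouched; never inject before the subcommand; do not duplicate --repo if it is already present.

Verification: the controller independently re-ran test_wakeup_runner+review_gate+review_gate_e2e+pr_checks+wakeup_plan+controller_actions with 312 passing; adds an argv-shape test + e2e (a stubbed subprocess.run rejects malformed `gh --repo ... api`, and the ready PR still merges).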

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* hotfix r14: managed draft PR ready-then-merge: the review gate no longer hard-blocks on pr_draft

Root cause (surfaced after r13 fixed the gh calls): review_gate's _review_gate_mergeability_error treated isDraft=True as a hard block, WAIT_OR_REDISPATCH:pr_draft. But this loop is draft-until-consensus: PRs are opened as drafts until review consensus, and merge_pr does ready-then-merge when the decision is MERGE/MERGE_WITH_COMMENTS. So a managed draft with a green gate (e.g. #473 approve/approve/comment) stayed stuck on pr_draft and never landed headlessly.

Minimal fix:
- wakeup_runner: remove the isDraft→pr_draft hard block from the mergeability check (mergeable!=MERGEABLE, i.e. CONFLICTING/DIRTY, is still blocked).
- controller_actions.merge_pr: for a live draft PR, first verify crnd:lifecycle:managed and then run gh pr ready (a non-managed draft is not readied; it writes CONTROLLER_ACTION_BLOCKED:target-not-managed and returns 2); the existing ready-then-merge is kept.
- The truth table/head-binding/CI-green/conflicting-blocks safety semantics are unchanged; readying is limited to managed PRs.

Verification: the controller independently re-ran test_wakeup_runner+review_gate+review_gate_e2e+controller_actions+wakeup_plan+pr_checks with 316 passing; adds managed-draft merge + a real plan→runner→controller ready-then-merge e2e + conflicting/non-managed safety cases.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* #409: remove the .refactor-loop/host.env runtime fallback; CONSENSUS_RND_HOST_ENV is the sole host fact locator (fail-closed when missing); migrate test fixtures that relied on the legacy fallback (#473)

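* hotfix r15 (sshx consensus B, part 1): unify gh argv construction into a single build_gh_argv helper to prevent a recurrence of the r13-class --repo bug

Hardening step 1 of the unanimous sshx consensus B: consolidate the scattered, duplicated gh --repo handling into build_gh_argv(slug, argv) in codex_refactor_loop/gh_invoke.py: gh api never gets --repo (the repo is in the URL); gh pr/issue get --repo <slug> only after the subcommand and only when missing; never before the subcommand. The three API call sites, wakeup_runner._run_command / ControllerActions.gh / PrChecksProjection, all go through this helper, and their inline --repo heuristics are removed. Behavior is unchanged (including the r13 fix).

Verification: characterization + source-regression are locked in test_gh_invoke.py (9 tests); the controller independently re-ran test_gh_invoke+pr_checks+wakeup_runner+review_gate+review_gate_e2e+controller_actions+wakeup_plan with 326 passing; running pr_checks directly with --help remains compatible.

Note: this is hardening (recurrence prevention) and does not directly raise concurrency; the concurrency bottleneck is in the upstream implement→publish stage and is handled separately next.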

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
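The three rules stated for build_gh_argv (never for `gh api`, after the subcommand for `gh pr`/`gh issue`, never duplicated) fit in a few lines. This is a minimal sketch that reuses the helper's name and `(slug, argv)` signature from the message but is not the code in gh_invoke.py; placing the flag right after the verb and recognizing `-R` and `--repo=` as "already present" are assumptions.

```python
def build_gh_argv(slug, argv):
    """Return argv with `--repo <slug>` added only where gh accepts it.

    Sketch under stated assumptions; not the project's implementation.
    """
    args = list(argv)
    if len(args) < 2 or args[0] != "gh" or args[1] not in ("pr", "issue"):
        return args  # e.g. `gh api ...`: the repo is already in the URL
    if any(a in ("--repo", "-R") or a.startswith("--repo=") for a in args):
        return args  # never duplicate an explicit repo flag
    insert_at = min(3, len(args))  # after the subcommand and its verb
    return args[:insert_at] + ["--repo", slug] + args[insert_at:]
```

Keeping this in one helper means a call site can no longer reintroduce the pre-subcommand `gh --repo O/R api ...` shape by accident.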

* hotfix r16 (sshx consensus): unblock implement→publish to remove the concurrency bottleneck: markerless/stale implement redispatch + safety-gated publish

sshx codex-cli triplet consensus (minimal/structural/delete + meta-judge): 10 design-solving issues had reached consensus but were stuck at implement→publish (implement ran earlier but was markerless/uncommitted/on a stale base/without a PR; dispatch_consensus_implementation was suppressed by local_iter_branch and publish was suppressed by verified_pr_head_unavailable) → no new dispatchable work → concurrency ~1.

Fix (wakeup_plan/wakeup_runner/controller_actions/git.py):
- Success-aware redispatch: if the implement branch/worktree exists but there is no valid clean IMPLEMENT_DONE:ok marker (markerless/failed), or the base is stale → redispatch; fresh_safe_worktree resets/recreates the canonical worktree to the current origin/$INTEGRATION_BRANCH; only genuine in-flight work / an existing clean PR / a clean marker awaiting publish suppress redispatch.
- Publish gating (strictly prevents publishing stale/wrong-scope work): publish_implementation_output runs commit→push→open PR→dispatch reviewers only with a canonical identity (refactor/iter<issue>-<candidate> + matching actual worktree branch) + a fresh integration base (merge-base==origin/integration; a stale base blocks) + a managed open issue + no duplicate PR + a non-empty scoped diff; otherwise it blocks.
- wakeup_plan replaces the verified_pr_head suppression with canonical_implementation_identity/single_linked_managed_issue/clean_scoped_diff.

Verification: the controller independently re-ran wakeup_plan+wakeup_runner+review_gate+review_gate_e2e+controller_actions+pr_checks+gh_invoke with 330 passing.

Known follow-up: the worker changed the PR title/body/commit text to English, violating the "GitHub artifacts in Chinese" policy; the next commit restores Chinese.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

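* hotfix r17: clear terminal non-publishable implement logs before a fresh consensus-implement redispatch, so stuck implements actually start

Follows r16: r16 made markerless/stale implements redispatch (resetting the worktree), but the old EXIT=1/markerless implement log still occupied the spawn's target_log_absent precondition → the fresh implement did not start (the same "suppress on existence" class, at the implement-spawn-log layer).

Fix (controller_actions, success-aware): before a fresh consensus-implementation dispatch writes the spawn intent, clear only terminal non-publishable implement logs (EXIT!=0 failure / EXIT=0 without an IMPLEMENT_DONE:ok marker); genuinely in-flight logs (no EXIT) and clean publish-ready logs (EXIT=0 + clean marker) are left untouched. A stuck markerless/failed implement can then actually launch after redispatch, run on a fresh base, and produce a clean marker → r16 gated publish.

Verification: the controller independently re-ran wakeup_plan+wakeup_runner+review_gate+review_gate_e2e+controller_actions+pr_checks+gh_invoke with 333 passing.

Design note: the concurrency bottleneck is fixed with deterministic logic (ready=clean EXIT=0+marker; parallel=disjoint scope_paths), without introducing LLM-in-loop dispatch decisions (daemon-first; the sshx codex triplet unanimously revised: an LLM may at most run in shadow mode and does not enter the decision loop).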

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

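* hotfix r18: implement-lifecycle converges on a single success-aware predicate + the runner path clears stale logs itself: headless self-healing, no more manual log clearing

Follows r16/r17 (r17 put the log clearing in the suppressed dispatch path, so it never ran). Adds implement_lifecycle.py as the single source of truth, classify_implement_attempt → in_flight | publish_ready | redispatch, shared by plan/runner/controller:
- On the runner spawn-apply path, _spawn_log_suppresses_retry suppresses only for in_flight|publish_ready; redispatch (EXIT!=0/markerless/stale-base) is not suppressed, and _clear_redispatchable_spawn_log clears the stale terminal log at the point where the spawn is actually applied → the fresh implement really starts (fixes the r17 misplacement, no manual clearing).
- Publish happens only for publish_ready (clean IMPLEMENT_DONE:ok + canonical identity + fresh base + non-empty scoped diff); a clean marker on a stale base becomes redispatch rather than a deadlock (#421 edge).
- Deterministic logic (not LLM-in-loop; daemon-first); the merge-gate/#191/#322/#396 safety semantics are unchanged.

Verification: the controller independently re-ran wakeup_plan+wakeup_runner+review_gate+review_gate_e2e+controller_actions+pr_checks+gh_invoke with 339 passing.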

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
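The three-way classification introduced in r18 reduces to a short predicate. The sketch below is a simplification under stated assumptions: `classify` and its three inputs are hypothetical, and the real classify_implement_attempt also checks canonical identity and a non-empty scoped diff, which are omitted here.

```python
def classify(exit_code, has_ok_marker, base_is_fresh):
    """Classify an implement attempt as in_flight, publish_ready, or redispatch.

    Hypothetical simplification of the r18 predicate.
    exit_code is None while the worker has not written an EXIT line yet.
    """
    if exit_code is None:
        return "in_flight"
    if exit_code == 0 and has_ok_marker and base_is_fresh:
        return "publish_ready"
    # Failed exit, markerless clean exit, or stale base: try again.
    return "redispatch"
```

Only `redispatch` lets a fresh spawn clear the old log; the other two states suppress a retry, which is the "success-aware" part of the design.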

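* Fix the phase9-router triplet-evidence-invalid false positive that wedged headless design-consensus

When the issue source snapshot injected into the solver prompt contained the
previous design-consensus round's audit trail (peer solver log paths),
_peer_solver_reference_violation scanned the whole prompt, misclassified it as
a peer-isolation violation, and failed closed by refusing to dispatch the
meta-judge, permanently wedging every issue whose body contained historical
consensus records.

Fix: the peer-isolation scan covers only the router-controlled regions,
stripping the issue-author content between '## Issue source snapshot' and
'## Full solver template'; genuine peer references injected by the router
header / solver template are still blocked. Adds two behavior tests covering
snapshot exclusion and router-region-still-blocked.

Headless dogfood measurement: after reload, 418/437/474/490/491/493/494/495 all
resumed self-driven solver→judge→consensus, and #494 has self-driven through to
implement.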

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
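The region-scoped scan can be illustrated with plain string slicing. This is a hypothetical sketch, not `_peer_solver_reference_violation`: the two heading strings come from the commit message, while the function names, the substring-based peer match, and the choice to scan the whole prompt when a heading is missing are assumptions.

```python
SNAPSHOT_HEADING = "## Issue source snapshot"
TEMPLATE_HEADING = "## Full solver template"


def router_controlled_text(prompt):
    """Drop the issue-author region between the two headings, if both exist."""
    start = prompt.find(SNAPSHOT_HEADING)
    if start == -1:
        return prompt
    end = prompt.find(TEMPLATE_HEADING, start)
    if end == -1:
        return prompt  # assumption: scan everything when the region is open-ended
    return prompt[:start] + prompt[end:]


def has_peer_reference(prompt, peer_names):
    """True when a peer name appears in the router-controlled text."""
    scanned = router_controlled_text(prompt)
    return any(name in scanned for name in peer_names)
```

A peer log path quoted inside the snapshot is ignored, while the same path in the router header or the solver template is still caught.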

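* CLAUDE.md: add the consensus-rnd:foundational-invariants sentinel block

The dogfood bootstrap's check-project-rules probe requires PROJECT_RULES to
contain this fixed-point block; when it is missing, the probe fails closed and
blocks the headless restart. The content is the probe-generated canonical
FI-001~007, mirroring the existing design philosophy without introducing new
constraints.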

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Fix the headless gap where a markerless implement wedged publish

When the implement worker exits cleanly (EXIT=0) but writes the IMPLEMENT_DONE marker only
into the run artifact (runs/implement-issue-<id>.md) rather than at the end of the log
(where codex stdout places the marker is unreliable), completed_marker_actions scans only
log markers → markerless → no publish action is generated → the design→implement→PR chain
stalls after implement. Observed on #421: EXIT=0 + artifact has
IMPLEMENT_DONE:issue-421:ok + 9 files staged, yet no PR, and the label is still stuck at
design-solving.

Fix: for a clean-exit implement-issue log that is markerless, fall back to reading the
IMPLEMENT_DONE marker at the end of the run artifact, reusing the artifact-first pattern
that review verdicts already use. Narrow scope (implement-issue + clean-exit only), with
2 behavior tests; the full wakeup_plan suite of 138 tests is green.
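
A rough sketch of the log-first, artifact-fallback lookup follows. It works on text rather than files to stay self-contained; the regex, the 30-line tail, and the function names are assumptions for illustration, not the project's actual reader.

```python
import re
from typing import Optional

# Assumed marker shape, based on the IMPLEMENT_DONE:issue-421:ok example above.
MARKER_RE = re.compile(r"^IMPLEMENT_DONE:issue-\d+:\w+$")


def last_standalone_marker(text: str, tail_lines: int = 30) -> Optional[str]:
    """Return the last line in the tail that is exactly a marker, else None."""
    for line in reversed(text.splitlines()[-tail_lines:]):
        candidate = line.strip()
        if MARKER_RE.match(candidate):
            return candidate
    return None


def implement_marker(log_text: str, artifact_text: Optional[str],
                     clean_exit: bool) -> Optional[str]:
    """Prefer the log marker; consult the run artifact only after a clean exit."""
    marker = last_standalone_marker(log_text)
    if marker is not None:
        return marker
    if clean_exit and artifact_text is not None:
        return last_standalone_marker(artifact_text)
    return None
```

Keeping the fallback behind `clean_exit` mirrors the narrow scope stated in the commit: a failed run with a marker only in the artifact is still treated as markerless.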

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Complete markerless implement publish: revalidation also reads the run artifact marker

8aeea89 fixed the wakeup_plan detection side, but wakeup-runner's source-marker
revalidation (_source_log_has_clean_marker) still scanned only the log, so markerless
implement publish was still rejected (clean_exit_marker_missing). Completion: for a
clean-exit implement-issue log that is markerless, revalidation falls back to the marker
in runs/implement-issue-<id>.md, consistent with the detection side; with a behavior
test, the full wakeup_runner suite of 76 tests is green.

Note: dogfood runs confirm that markerless is a systemic root cause (it also affects
readiness redispatch, review verdicts, and more) and should be unified systematically
(see #491 logging/observability and the follow-up marker-reading consolidation); this
commit only completes the implement publish chain.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

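* More detailed logging: the wakeup-runner tick shows full status counts + details of masked blocked/skipped actions

The old _wakeup_tick_action looked only at results[0]; when one tick handled multiple
actions, a blocked one (e.g. a failed publish) was masked by a successful spawn and
completely invisible; graphql-backoff always showed a misleading 'remaining=unknown'.
During dogfood debugging this forced a trip through the ledger every time to learn what
really happened.

Change: the tick log shows [applied=N,blocked=M,...] full status counts + the specific
reason+action of masked blocked/skipped results (graphql-backoff is reported separately
as a whole-tick gate, with unknown removed); the dispatched headline is kept. Tests are
updated to reflect the improved visibility; the full wakeup_runner suite of 76 tests is green.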

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

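* Complete the third link of markerless implement: the readiness predicate reads the run artifact marker

classify_implement_attempt (the single success-aware predicate from r18) scanned only the
log for IMPLEMENT_DONE:ok; markerless log (marker landed in the artifact) →
redispatch:markerless → readiness does not suppress → redispatch overwrites finished work
(dogfood: #421 redispatched 4 times; #493 IMPLEMENT_DONE:ok → redispatch → :partial,
work lost).

Completion: on a markerless log, fall back to IMPLEMENT_DONE:ok in
runs/implement-<cluster>.md, consistent with detection (8aeea89) / revalidation
(dfacc64). Narrow scope + r17/r18 compatible: only :ok is accepted; partial/failed/truly
markerless (no artifact) still goes through redispatch recovery. The existing
markerless-redispatch tests (wakeup_plan/wakeup_runner/controller_actions, 304 tests in
total) are all green, plus 2 new behavior tests.

This connects the three markerless links of implement→PR (detection/revalidation/
readiness); review-chain markerless handling and the systematic unified abstraction are
handed to the #499 consensus.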

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* #500: design-consensus orchestration redesign: phase9-router is the sole dispatch owner; remove wakeup-runner double dispatch

Lands the #500 r4 consensus (structural framing, no-new-schema):
- phase9-router becomes the sole design-consensus worker intent owner, adding the
  META_RESOLVED:re-design → marker.round+1 three-way solver route
- Remove wakeup-runner's dispatch_design_consensus (eliminates the double-dispatch helper_exit:3)
- wakeup-runner keeps only closed action apply (drop→close, consensus→implement)
- dispatch_consensus_implementation moves to crnd:phase:implementing after readiness
- No new schema (reuses HARNESS_SPAWN_INTENT)

Fixes headless P0: reflector META_RESOLVED had no consumer + the phase9-router/wakeup-runner
double dispatch caused a 0-codex deadlock. 14 files +207/-282; verification: 471 tests all green.

The consensus was reached by the controller manually driving design-consensus r1→r4 (while the daemon was stopped).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* #501 fix r1: add a failure-branch test for the dispatch_consensus_implementation label transition

tests reviewer reject: the new label transition (gh issue edit→implementing) was tested only
on the success path; there was no failure-branch test showing that dispatch stops before
worktree/render/intent when gh issue edit returns nonzero. This fix adds that test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

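* Fix headless deadlock: the label removal set keeps only canonical crnd:*; drop historical aliases

gh issue/pr edit --remove-label fails the whole command with rc=1 when a label does not
exist in the repository. ISSUE_LABELS_REMOVE/PR_LABELS_REMOVE included cleanup_aliases
(emoji/legacy names); the repository does not maintain those historical labels, so
_move_issue_to_implementing_phase failed permanently, consensus→implement dispatch
wedged, and wakeup_runner self-drive deadlocked at 0 codex.

Per maintainer directive, historical label compatibility is removed (historical labels
are no longer handled): the removal set lists only canonical crnd:* phase/human/stuck
labels. Adds a source-regression test locking both sets to canonical-only, no alias.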

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Fix #500 regression: the phase9-router production ctx lost GH_REPO_SLUG, killing all design-consensus intake

#500 (e733b1a) changed the ctx construction in Phase9Router.__init__ from
`LoopContext.load(repo_root=repo_root)` to
`LoopContext.load(repo_root=repo_root, env={"REPO_ROOT": str(repo_root)})`.
The restricted env does not read os.environ, so the ctx of the daemon (started via source
host.env) had gh_repo_slug=None → the first line of `_open_design_consensus_issues`
fail-closed returns [] → DesignConsensusIssueIntake silently never dispatches → headless
cannot start design-consensus for any issue; `_require_open_source_issue` also fails
without a slug (marker routes such as the #490 redesign stall at
phase9-source-state-unavailable).

Fix: revert to `LoopContext.load(repo_root=repo_root)`, which reads the process environment
to resolve host facts. Tests always pass an explicit ctx via `Phase9Router(ctx=...)` and do
not rely on the restricted env; a new regression test locks the production repo_root
construction path resolving GH_REPO_SLUG.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

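* Add STALE_REVIVAL_HOURS: automatically re-trigger a stuck implement that has not advanced past the threshold (env-configurable, default 3h)

In headless mode, a partial/failed/markerless implement log leaves the already queued
dispatch-consensus-implementation spawn intent permanently stuck on the runner's
target_log_absent precondition (and dispatch_consensus_implementation, which would clear
the log, is itself suppressed by pending_implement_intent) → a stuck implement never
self-heals (e.g. the 10h partial-wedge on #421/494/493/474/498).

Adds the host-tunable STALE_REVIVAL_HOURS (default 3; non-positive/invalid values fall
back to 3h). When wakeup_plan projects an implement spawn intent, if the target implement
log is redispatchable (partial/failed/markerless/stale-base) and its mtime exceeds the
threshold, it clears the log → target_log_absent passes → implement is redispatched
automatically. Safety gates: in-flight (no terminal EXIT) is never cleared, publish-ready
(:ok+diff) is not cleared, and the age gate prevents clearing a log that has just finished.

The env is registered in the SKILL.md host env surface matrix + host.env.example; covered
by 5 behavior tests + a matrix↔example symmetry test.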
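
The threshold parsing and age gate could look roughly like the sketch below. `stale_revival_seconds` is a name that appears later in this log, but its real signature is not shown here; the `env` parameter, `should_clear_log`, and the classification strings are assumptions made for illustration.

```python
import math

DEFAULT_STALE_REVIVAL_HOURS = 3.0
# Classifications named in the commit message as redispatchable.
REDISPATCHABLE = {"partial", "failed", "markerless", "stale-base"}


def stale_revival_seconds(env: dict) -> float:
    """Parse STALE_REVIVAL_HOURS; non-positive or invalid values fall back to 3h."""
    try:
        hours = float(env.get("STALE_REVIVAL_HOURS", ""))
    except (TypeError, ValueError):
        hours = DEFAULT_STALE_REVIVAL_HOURS
    if not math.isfinite(hours) or hours <= 0:
        hours = DEFAULT_STALE_REVIVAL_HOURS
    return hours * 3600.0


def should_clear_log(classification: str, log_mtime: float,
                     now: float, env: dict) -> bool:
    """Clear only redispatchable logs that are older than the threshold.

    in_flight and publish_ready are not in REDISPATCHABLE, so they are
    never cleared by this (hypothetical) helper.
    """
    if classification not in REDISPATCHABLE:
        return False
    return (now - log_mtime) > stale_revival_seconds(env)
```

The age gate is what keeps a log that finished a moment ago from being wiped before publish has a chance to look at it.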

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

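* Extend stale-revival to dead codex: in_flight with no EXIT, past the threshold, and absent from the liveness list is also re-triggered

The most common headless wedge is codex/supervisor dying midway (e.g. truncated by a
daemon kill); the log stays in_flight forever (no terminal EXIT) → classify always returns
in_flight → it is never redispatched. Previously only redispatchable
(partial/failed/markerless) logs were cleared, missing this half.

Now, when the log mtime exceeds the threshold, _revive_stale_redispatchable_implement_log
first clears by the redispatchable rule; otherwise, if classify returns in_flight and the
monitor liveness list has no live process for that log, the log is judged to belong to a
dead worker and is cleared. Safety: spawn-codex's no-output stall window is far below the
default 3h, so a log still in_flight past the threshold must be a dead supervisor; the
monitor liveness check adds a second safeguard, so a live codex is never killed. Adds 3
behavior tests (dead codex revived / fresh untouched / live process untouched).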

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

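* Add a manual trigger: consensus-rnd-cli revive-implements (re-trigger stuck implements immediately, without waiting for the threshold)

_revive_stale_redispatchable_implement_log gains a force parameter: it skips the
stale_revival_seconds() age gate, but under force an in_flight log is cleared only after
the monitor liveness check proves there is no live process (a running codex is never
killed). Adds force_revive_stuck_implements(), which scans all implement-issue-*.log and
force-revives them, and the new CLI command revive-implements (authority delete-log). The
auto path with force=False is unchanged. Adds 4 behavior tests.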

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

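* Fix env leak: the test suite isolates the ambient host.env at import time (running TEST_CMD in a worktree no longer fails spuriously)

When a daemon-spawned implement worker runs the full TEST_CMD in a worktree, the inherited
relative CONSENSUS_RND_HOST_ENV=.config/consensus-rnd/host.env leaks into the temp-repo
LoopContext.load(repo_root=tmp) → resolves to tmp/.config/..., which does not exist →
spurious LoopContextError (observed: concurrency_monitor 48 + restart_daemons 23 + … ≈151
errors) → the full suite is never green → implement is always :partial → never reaches a PR.

unittest discover imports every test_*.py during collection before running anything; the
new test_aaa_host_env_isolation.py (first in alphabetical order) pops the host-injected
variables at import time, so the whole suite is isolated before any test runs.
Verification: full discover with the locator set → exit 0, all green (previously 151
errors under the same conditions); targeted before/after shows 48/23/1 errors → OK.
Pure test module; does not touch production/daemon.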
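
An import-time isolation module of this kind might be sketched as follows. Only CONSENSUS_RND_HOST_ENV is named in the commit message; the tuple of variables and the helper name are hypothetical, and the real test_aaa_host_env_isolation.py may differ.

```python
import os

# Assumption: the real module may list more host-injected variables.
HOST_INJECTED_VARS = ("CONSENSUS_RND_HOST_ENV",)


def isolate_ambient_host_env(environ=os.environ) -> list:
    """Remove host-injected variables and return the names that were removed."""
    removed = []
    for name in HOST_INJECTED_VARS:
        if environ.pop(name, None) is not None:
            removed.append(name)
    return removed


# Runs when the module is imported, i.e. during unittest collection,
# before any test body executes.
isolate_ambient_host_env()
```

Passing `environ` as a parameter keeps the helper testable against a plain dict without touching the real process environment.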

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

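* Fix the markerless keystone: add a companion-artifact fallback for solver/judge markers

solver/judge codex sometimes exits with EXIT=0 while the SOLVER_DONE/META_JUDGE_DONE
marker lands only in the run artifact, not at the end of the log, so phase9-router judges
the triplet incomplete and never dispatches the judge, wedging the design-consensus round
(observed on #505 r1 structural: 214 embedded occurrences, 0 at line start).

Extends the existing implement artifact-marker fallback to solver/judge: when a clean-exit
log has no standalone marker at its end, fall back to the companion run artifact
runs/<log-stem>.md. Purely additive (behavior is unchanged when the log has a marker),
prefix-scoped, exact filename regex, reusing the safe standalone marker extraction
(embedded/echoed markers are not trusted). Does not weaken consensus (3 marked solvers +
judge are still required; only the marker source is more robust). phase9/router.py +
wakeup_plan.py + behavior + source-regression tests; full suite 1367 green.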

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Fix publish stale-base with no recovery + stop :ok churn: let implementing produce a draft PR

When implement is :ok but the worktree base is behind origin/$INTEGRATION_BRANCH, publish
used to hard-block (publish_implementation_stale_base) and never recover; stale-revival
then treated the stale-base :ok implement as redispatchable and kept redispatching the
same stale worktree → endless churn filling the floor, with zero real PRs for a long time
(implement→first PR never opened).

Now: before publish, merge origin/$INTEGRATION_BRANCH in the worktree to recover the base
(on conflict, merge --abort + an explicit reason handed to the existing
resolver/re-dispatch); a clean `:ok` is no longer churned by stale-revival (terminal
success goes through publish-with-recovery, not a re-run); an :ok without an open PR
creates a draft PR (draft before consensus, per #300) + Closes #N. The controller still
owns git topology, the worker does not commit/push, and the review-gate/merge truth table
is not bypassed. controller_actions.py + wakeup_plan.py + wakeup_runner.py +
behavior/source-regression tests; full suite 1374 green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Fix markerless implement: with EXIT=0 + a real diff but no clean marker, synthesize a publish candidate, ending the stale-revival empty-redispatch loop

The publish build/test gate serves as the safety net.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

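* Fix the publish ordering bug + dispatch a codex fallback resolver on merge failure: commit the worker diff first, then merge the fresh base; a real conflict no longer wedges but dispatches the resolver; resumable idempotently

The emptiness check now uses git diff HEAD; also fixes the same kind of git diff problem in wakeup_plan stale-suppression.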

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

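* Implement issue #504 (#519)

* Fix incomplete headless review-fix prompt rendering: bind every placeholder (PR_NUMBER/title/round/branch/REVIEW_*_PATH, etc.), unblocking fix→merge for rejected PRs

Previously headless _dispatch_review_fix bound only FIX_OUTPUT_PATH; the remaining ${} placeholders were left unresolved, so the fix codex could not read the reject evidence.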

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

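* Implement issue #496 (#520)

* Fix the cd bug that ran review-fix codex on main: run it in the PR worktree instead, and fail closed when the worktree is missing

Previously _dispatch_review_fix used cd=repo_root → the fix modified main (polluting the daemon checkout) and never landed on the PR branch (rejected PRs never merged).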

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Implement issue #508 (#523)

* Fix wakeup-plan action ordering: review_gate/publish are ordered before new-work spawns, resolving spawn-budget starvation so rejected PRs can go fix→merge

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

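* CLAUDE.md: add constitutional engineering rules: exceptions must be raised and logged diagnosably, never swallowed or silenced; tests must assert real behavior, and meaningless tests are forbidden (maintainer directive)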

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

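* Implement issue #503 (#521)

* Add general basic engineering rules, clarifying object-oriented design requirements and implementation details

* Implement issue #514 (#524)

* Refine the general object-oriented design guidelines, adding core principles and implementation-level details

* Implement issue #494 (#526)

* Implement issue #517 (#528)

* Implement issue #505 (#510)

* Implement issue #505

* #510 fix r1: add phase9-router behavior test coverage (in response to the tests reviewer reject)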

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

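* Fix the headless gap where FIX_DONE does not commit fix output: wakeup-runner commits+pushes the fix worktree before the FIX_DONE re-review

In headless mode the fix codex emits FIX_DONE but the worker never commits; FIX_DONE routed directly to dispatch_reviewers,
so the fix output stayed uncommitted in the worktree forever, reviewers kept reviewing the old head, rejects never converged,
and review→merge was blocked for every PR needing a fix round. Mirroring the interactive controller: before a FIX_DONE-sourced
dispatch_reviewers, if the fix worktree is dirty then git add+commit+safe_push; if clean, no-op. Includes 2 behavior tests (dirty commits / clean skips).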

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Implement issue #491 (#531)

* Implement issue #490 (#529)

* Implement issue #490

* Implement issue #490

* PR #529 review-fix output

* Implement issue #499 (#512)

* Implement issue #499

* PR #512 review-fix output

* Implement issue #474 (#532)

* Implement issue #474

* Implement issue #474

* Implement issue #474

* Implement issue #474

* PR #532 review-fix output

* Implement issue #476 (#518)

* Implement issue #476

* PR #518 review-fix output

* Implement issue #509 (#527)

* Implement issue #509

* Implement issue #509

* spawn-claim recycle now uses an artifact-aware terminal marker: after the companion log is deleted by the 24h retention, a finished task's claim can still be recycled (by reading the runs/ artifact) and no longer permanently blocks redispatch; a running task with no terminal marker is still not recycled (prevents double-spawn)

* Implement issue #421: pre-release clean-room integration test regime (#534)

* Implement issue #421

* rebase #421 onto fresh integration: resolve the test_cli_command_router conflict + verify

* Implement issue #493: test discipline constitutional amendment (behavior-first) (#535)

* Implement issue #493

* rebase #493 onto fresh integration: resolve conflicts + verify

* PR #535 review-fix r1: addressed tests+quality reject (behavior-first test adjustments)

* Implement issue #506 (#536)

* Implement issue #506

* Implement issue #506

* Implement issue #506

* Implement issue #506

* PR #536 review-fix output

* Root-cause fix for churn: wakeup-plan applies open_closing_pr/terminal/in-flight suppression to the consensus-impl spawn-intent (aligned with dispatch_consensus_implementation), so issues that already have a PR are no longer redispatched for implement

* Implement issue #507 (#525)

* Implement issue #507

* Implement issue #507

* PR #525 review-fix output

* Implement issue #516 (#530)

* Implement issue #516

* Implement issue #516

* #530 fix r1: add a behavior test for the ManagedWorkSnapshot fetch-failure stale-cache fallback (in response to the tests reviewer reject)

* PR #530 review-fix output

* PR #530 review-fix r2: typed ManagedWorkItem projection (addresses architect type-boundary reject + tests + contract-tests)

* PR #530 review-fix r3: anchor ManagedWorkSnapshot read surface in authorization mirror + cross-artifact source-regression (addresses architect governance reject)

* PR #530 review-fix r4: add canonical+legacy managed-label discovery regression test (addresses tests reject)

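* Fix the worker marker reader misjudging identical repeated markers as a conflict

codex often echoes the terminal marker 2-3 times (including diff + prefix lines), causing
read_worker_terminal_marker to return duplicate_or_conflicting, so the review truth table
never converged and all headless review→merge wedged. Now: identical repeated markers are
folded into one valid marker, and duplicate_or_conflicting is returned only when 2+
distinct marker values exist. Covers _last_final_marker / _sentinel_adjacent_marker /
_marker_from_companion_artifact, preserving malformed and foreign-prefix detection.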
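
The folding rule can be shown in isolation with a small sketch. It assumes the standalone markers have already been extracted from the log tail; the function name and the "markerless" reason string are illustrative, and malformed/foreign-prefix detection is out of scope here.

```python
from typing import List, Optional, Tuple


def fold_terminal_markers(markers: List[str]) -> Tuple[Optional[str], Optional[str]]:
    """Collapse identical repeats; report a conflict only for distinct values.

    Returns (marker, None) on success or (None, reason) on failure.
    """
    distinct = set(markers)
    if not distinct:
        return None, "markerless"
    if len(distinct) > 1:
        return None, "duplicate_or_conflicting"
    return markers[0], None
```

For example, three echoes of `FIX_DONE:pr-1:ok` fold to that single marker, while one `:ok` plus one `:partial` still fails closed.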

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Reserve implementation PR for issue #537 (#540)

⟦AI:AUTO-LOOP⟧

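* Implement issue #511 (#522)

* Implement issue #511

* PR #522 review-fix output

* PR #522 review-fix output

* Remove the .refactor-loop run artifact that slipped into the PR (architect r8 reject)

resolve-pr522-merge.md is a one-off merge-resolution summary, a generated artifact that should not enter the durable diff
(CLAUDE.md: "generated artifacts are not a source of truth"). The resolution notes remain in the commit message / PR body.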

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

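* Fix the early PR reservation crash (#511 regression, blocking all implement dispatch)

_reserve_implementation_pr opened the reservation PR with the worker's rich body path, but that file does not yet exist
at reservation time → FileNotFoundError (not caught by except RuntimeError) →
the wakeup-runner tick crashed → no consensus issue could be implemented → loop P0 wedge.
Fix: before reservation, write a placeholder body if the body is missing (an "## issue #N" implementation heading + Closes + sentinel;
publish rejects the placeholder, forcing the worker to produce a real body); reservation is idempotent (an existing open PR is returned directly;
an existing remote head is reset to integration + force-with-lease push); OSError is caught.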

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Reserve implementation PR for issue #543 (#544)

⟦AI:AUTO-LOOP⟧

* Reserve implementation PR for issue #541 (#546)

⟦AI:AUTO-LOOP⟧

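* revert early-PR reservation (#507): no more empty placeholder PRs; a PR opens only once publish has real content

dispatch_consensus_implementation no longer reserves/opens a PR; publish_implementation_output
restores the pre-#507 behavior: only when implement produces a non-empty diff does it commit+push+open_pr_with_label (using the worker's
real title/body, verified non-placeholder); an empty diff opens no PR, merges nothing, and closes no issue. Removes the placeholder mechanism
(_reserve_implementation_pr / _placeholder_*) and changes the early_pr_missing projection in wakeup_plan/runner.
Root cause: the empty placeholder PR was merged empty by reviewers before implement publish, falsely closing #537/#541/#543.
Full suite 1496 passed.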

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Implement issue #542: remove the redundant head_sha from the redispatch payload (#547)

Remove the duplicate head_sha payload from review redispatch

* hotfix: a stale consensus-implement intent pointing at a deleted worktree no longer suppresses fresh re-dispatch

The pending_implement_intent branch of consensus_implementation_suppressed_reason gains a worktree-existence
guard (wired to the existing _canonical_consensus_worktree_exists): it suppresses only when the stale pending intent's canonical
worktree still exists. After the early-PR reservation revert (#507), the pending implement intents of #537/#541/#543
pointed at deleted worktrees; the old logic suppressed fresh re-dispatch while wakeup-runner kept
spawning the stale intent into a dead cd, causing an os error 2 / EXIT=1 loop that blocked the autonomous consensus→implement chain.
Now a missing worktree lets fresh dispatch through to rebuild the worktree.

Adds test_wakeup_plan cases (missing worktree does not suppress / existing worktree still suppresses) + updates
the test_controller_actions subTest that locked the old behavior (the pending case creates a canonical worktree to preserve the
reject-before-fresh_safe_worktree intent). Full suite 1557 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

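* hotfix: a spawn-claim lock whose holder pid is dead can be recycled (self-healing, prevents a cleared-log permanent HELD)

_existing_claim_is_recyclable gains a pid-liveness fallback: when the holder pid recorded in the lock is dead (os.kill(pid, 0)
raises ProcessLookupError) the claim can be recycled, keeping the existing terminal-marker recycling and live-spawn protection
(pid alive + no marker is still not recycled) + metadata-mismatch fail-closed. Root cause: a failed implement (EXIT=1)
leaves a stale lock; dispatch_consensus_implementation clears the log on redispatch → the #490 spawn-claim recycles only when the log has
EXIT= → the cleared log has no marker → the dead process's lock cannot be recycled → permanent SPAWN_CLAIM_HELD deadlock,
blocking consensus→implement redispatch (#537/#541/#543). Adds 7 behavior tests. Full suite 1560 passed.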
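
The pid-liveness probe relies on standard POSIX behavior: signal 0 performs the existence and permission checks without delivering a signal. The sketch below is POSIX-only, and `claim_is_recyclable` is a simplified, hypothetical stand-in for `_existing_claim_is_recyclable` (it ignores the metadata-mismatch check).

```python
import os


def pid_is_alive(pid: int) -> bool:
    """Probe a process with signal 0; nothing is actually delivered."""
    if pid <= 0:
        # 0 and negative values address process groups, not a single holder.
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def claim_is_recyclable(holder_pid: int, has_terminal_marker: bool) -> bool:
    """Recycle on a terminal marker, or when the lock holder is gone."""
    if has_terminal_marker:
        return True
    return not pid_is_alive(holder_pid)
```

A live holder without a terminal marker stays protected, which matches the live-spawn guard the commit says it preserves.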

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

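* hotfix: publish_implementation_output accepts an implementation diff the worker already committed

_require_publish_implementation_diff previously looked only at git diff HEAD --quiet (uncommitted changes);
once the worker commits by itself (on its own branch, isolated and harmless) the working tree is clean → misjudged as no_diff → no PR opened.
Now: uncommitted changes OR a committed delta of HEAD relative to the integration base (merge-base..HEAD non-empty)
counts as a publishable diff; _commit_publish_implementation_diff skips the commit and returns success when the working tree is already clean
(the worker already committed), no longer misreporting publish_commit_failed. Fixes the #541-style worker-commit→no_diff→no PR+
repeated daemon dispatch. Adds a behavior test. codex full suite passes with DISCOVER_STATUS:0.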

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Implement issue #437: bounded cleanup of .refactor-loop historical generated artifacts and stale worktrees (#548)

* #437: add the canonical RuntimeRetention owner, demoting log-retention to a one-version alias; cleanup covers generated files+pending-events compaction+stale worktree; CLAUDE.md adds a narrow #437 carveout

* PR #548 review-fix output

* PR #548 review-fix output

* PR #548 review-fix output

* PR #548 review-fix output

* PR #548 review-fix output

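* hotfix: autonomously resolve stale-base conflicts on CONFLICTING managed review PRs

Wires the existing but orphaned prompts/rebase-resolve.md into the headless path: when a managed review PR
is mergeable=CONFLICTING (base is behind), the controller starts git merge
--no-commit --no-ff origin/<integration> in the PR worktree; if clean it commits+pushes directly, and on conflict it dispatches
the rebase-resolve codex (resolves+stages inside the isolated worktree, commit/push forbidden); after receiving
REBASE_RESOLVE_DONE the controller commits + safe_push, pushing only the PR head. The result then passes CI+
review-gate again. Adds dispatch_pr_rebase_resolve / commit_push_resolved_pr_rebase
(controller_actions) + wakeup_plan projection + wakeup_runner application.

Safety: touches only refactor/iter<N>-* PR branches, an explicit guard refuses to touch integration/review_base,
commits only after diff-filter=U verifies there are no unresolved conflicts, every abort path runs git merge --abort leaving a clean worktree,
and it is idempotent (NOOP if the base is already contained / a resolve is in flight). Fixes the root cause of stale-base churn permanently
stalling the loop (one PR merges, all the others become CONFLICTING, and nobody resolves them). codex full suite 1589 passed.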

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

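* hotfix: enrich live mergeability before the rebase_resolve_actions projection (fixes a production gap)

The autonomous-resolve projection read item.mergeable without enriching first, but a production GhItem comes from a snapshot
whose mergeable is empty → always skipped → CONFLICTING PRs never project dispatch_pr_rebase_resolve →
the loop still stalls. Reuses _with_live_mergeability from the same file (review_evidence_redispatch_actions
already uses the same pattern). Adds a regression test for empty-mergeable-needs-live-fetch (the original test mocked mergeable
and did not cover this production path).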

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* hotfix: add the dispatch_pr_rebase_resolve kind to EXECUTABLE_ACTION_KINDS (executable end to end)

The else branch of _close_projection_action defaults to status_only=True; the rebase action's kind
stale-base-conflicting-pr was not in EXECUTABLE_ACTION_KINDS → always status_only, the runner does not
apply it, and CONFLICTING PRs are never resolved autonomously. Adds that kind (+ the commit_push completion kind if needed)
to the set; the controller_action is already in RUNNER_NAMED_HELPER_ACTIONS, so it then stays executable. Adds
an end-to-end wakeup-plan pipeline test (the original test covered only the single function rebase_resolve_actions and missed the
status_only gate of closed-action-projection). codex full suite 1593 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

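* hotfix: gate the rebase-resolve commit/dispatch projection on the actual worktree merge state (stops stale-marker noise)

commit_push_resolved_pr_rebase previously projected solely from the persistent REBASE_RESOLVE_DONE marker; after the worktree
had committed+pushed (#549) or aborted (#550/#551) the marker remained → fail-closed noise every tick +
repeated 0-codex (zero_streak alarms) + the stale marker blocked fresh resolves for #550/#551. New gate: project an executable
commit_push only when the worktree really has a merge in progress (MERGE_HEAD) and nothing unmerged;
otherwise project nothing. A PR whose worktree is clean and that is still CONFLICTING gets an executable dispatch_pr_rebase_resolve again
(a stale DONE marker no longer suppresses it permanently). Adds an end-to-end test. codex full suite passes.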

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Implement issue #502: remove the legacy label compatibility layer: purge the full emoji/legacy alias set (cleanup_aliases / per-spec  (#549)

* Implement issue #502: remove the legacy label compatibility layer: purge the full emoji/legacy alias set (cleanup_aliases / per-spec

* Fix PR #549 review r3: adapt the #502 label purge to the current label code (contract-tests + architect/tests reject)

fix codex round-3 applied-4: aligns the #502 legacy label compatibility layer purge with recently added label code,
fixing the contract-tests failure + architect/tests reject. Changes labels.py + SKILL.md + test_wakeup_plan.py.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

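* Fix the headless review_gate ci_failed deadlock: move reject→FIX ahead of the ci/mergeability checks

In _review_gate_decision the ci_error/mergeability checks used to come before reject>0→FIX → a PR with red CI
returned WAIT_OR_REDISPATCH:ci_failed and waited even when a reviewer rejected, yet CI will not turn green without a fix → deadlock
(#548/#549 both hit it and needed a manually dispatched fix). Moves reject>0→FIX ahead of the ci/mergeability checks
(still after the reviewer-validity/head checks) → on reject, a fix is dispatched autonomously even with red CI (the fix addresses reject+CI together).
The MERGE/MERGE_WITH_COMMENTS decision is completely unchanged (still requires ci-green + mergeable + reject=0 + approve>=1),
so there is no risk of a wrong merge. Adds test_wakeup_runner end-to-end tests (reject+ci-red→FIX; reject=0+ci-red still WAIT;
happy path MERGE unchanged; stale-head+reject still WAIT). full suite 1600 passed.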

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

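* Correct the review_gate order: move the mergeability check ahead of reject→FIX (CONFLICTING does not run review/fix concurrently)

The previous fix (a125d19) moved reject→FIX ahead of ci+mergeability, which broke the ci_failed deadlock but also routed
CONFLICTING+reject PRs to FIX → running concurrently with the independent dispatch_pr_rebase_resolve in the same
worktree (#550 showed a 98-file mid-merge + concurrent reviewers). Moves the mergeability_error check ahead of
reject→FIX: CONFLICTING→WAIT (rebase-resolve first, no concurrent review/fix); MERGEABLE+
red CI+reject→FIX (deadlock fix preserved); ci_error stays after reject→FIX (WAIT only for pure red CI with no reject).
The MERGE decision is unchanged. test_wakeup_runner 90 OK (+conflicting case).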
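
The resulting check order can be summarized in a small sketch. This is not the real `_review_gate_decision`: the `PrState` fields and the reason suffixes other than `ci_failed` are hypothetical, and MERGE_WITH_COMMENTS is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class PrState:
    evidence_valid: bool  # reviewer evidence is valid and bound to the live head
    mergeable: str        # e.g. "MERGEABLE" or "CONFLICTING"
    ci_green: bool
    reject: int
    approve: int


def review_gate_decision(pr: PrState) -> str:
    """Evaluate the gates in the order the two commits above settle on."""
    if not pr.evidence_valid:
        return "WAIT_OR_REDISPATCH:invalid_reviewer_evidence"
    if pr.mergeable != "MERGEABLE":
        # Conflicts go to rebase-resolve first; no concurrent review/fix.
        return "WAIT_OR_REDISPATCH:mergeability"
    if pr.reject > 0:
        # Dispatched even when CI is red, since the fix may repair CI too.
        return "FIX"
    if not pr.ci_green:
        return "WAIT_OR_REDISPATCH:ci_failed"
    if pr.approve >= 1:
        return "MERGE"
    return "WAIT_OR_REDISPATCH:no_approval"
```

Note that MERGE is reachable only with green CI, a mergeable PR, zero rejects, and at least one approval, so reordering the earlier gates does not widen the merge condition.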

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

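* Implement issue #498: codex fallback adjudication/generation mechanism for headless daemon fail-closed points (narrow allowlist, not a general es (#550)

* Implement issue #498

* Implement issue #498: codex fallback adjudication/generation mechanism for headless daemon fail-closed points (narrow allowlist, not a general es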

* PR #550 review-fix output

* Add patrol-inspector inspection issue intake (#551)

* Implement issue #541

* PR #551 review-fix output

* PR #551 review-fix output

* Fix PR #551 contract-tests: align the patrol 8th daemon + safe_push/sync_dev test host.env

#551 patrol adds patrol_inspector_daemon (the 8th daemon) → test_anti_stop_restart expectation 7→8;
test_controller_lib_safe_push / test_sync_dev add the CONSENSUS_RND_HOST_ENV host.env setup
(the code needs the host.env locator per the host-config contract; rebase-resolve had dropped the base's new test version).
Assertions were not weakened (the daemon count assertion is tightened to 8). full discovery 1634 OK + sshx 13 OK.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Fix PR #551 contract-tests (clean-env): the e2e fixture sets META_ESCALATION_STUCK_HOURS to isolate the de-dup test

test_batch_and_per_task_lock_do_not_duplicate_spawn failed under a clean env (CI): the #506
meta-escalation fired because the fixture's mocked state was judged stuck, producing a repository-stalled RunnerResult that polluted
the de-dup assertion (expected [] got [RunnerResult]). The fixture host.env+env sets
META_ESCALATION_STUCK_HOURS=999999 to disable meta-escalation for this test and focus on de-dup behavior. Assertions were not weakened. clean discovery all passes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Fix publish-gap: implement readiness uses git status --porcelain to detect a staged diff

classify_implement_attempt / the wakeup_runner pre-publish check used git diff --quiet, which detects only unstaged
changes; after the worker correctly runs git add -A, the staged diff was misjudged as empty_scoped_diff, so a finished implement
(e.g. #553: clean EXIT=0 + an 885-line real staged diff + valid title/body artifacts) could never produce a PR,
and wakeup_runner kept idly re-dispatching children. Now git status --porcelain runs first and any staged/
unstaged/untracked change counts as publish-ready; only a truly clean tree falls back to git diff. The publish-time PR creation
path is unchanged (it does not repeat the #507 early-PR empty commit/placeholder deadlock).
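
A minimal sketch of classifying `git status --porcelain` (v1 short format) output follows. In that format the first column is the index (staged) status, the second is the working-tree status, and `??` marks untracked files. The helper names are hypothetical and this is not the project's readiness predicate.

```python
import subprocess


def summarize_porcelain(output: str) -> dict:
    """Count staged, unstaged, and untracked entries in porcelain v1 output."""
    summary = {"staged": 0, "unstaged": 0, "untracked": 0}
    for line in output.splitlines():
        if len(line) < 3:
            continue
        index_status, worktree_status = line[0], line[1]
        if index_status == "?" and worktree_status == "?":
            summary["untracked"] += 1
            continue
        if index_status not in " !":
            summary["staged"] += 1
        if worktree_status not in " !":
            summary["unstaged"] += 1
    return summary


def has_publishable_changes(output: str) -> bool:
    """Any staged, unstaged, or untracked entry counts as a change."""
    return any(summarize_porcelain(output).values())


def worktree_status(worktree: str) -> str:
    """Run git in the given worktree (requires git on PATH)."""
    result = subprocess.run(
        ["git", "-C", worktree, "status", "--porcelain"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout
```

A tree where everything was staged with `git add -A` yields lines such as `A  new.py`, which `git diff --quiet` alone would not report but this summary counts as staged.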

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

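* Fix the patrol_inspector daemon startup crash: DaemonHeartbeatLease uses the constructor, not from_env

In #551 the patrol.py daemon branch called DaemonHeartbeatLease.from_env() (a method that does not exist) →
AttributeError, crashing at startup; the daemon never ran successfully. Changed to DaemonHeartbeatLease(
"patrol_inspector_daemon", ctx.repo_root), consistent with closed_label_reconciler/wakeup_runner,
so the heartbeat file name matches the patrol_inspector_daemon.ts expected by the restart helper.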

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Fix the residual publish-gap layer: read the companion artifact for duplicate-marker logs + supersede stale spawn intents

Layer 2 residue: the implement worker echoes IMPLEMENT_DONE:issue-N:ok several times in the log →
worker_markers returns duplicate_or_conflicting_log_marker fail-closed → the classifier treats it as
markerless and does not check the companion artifact → no completed-marker action → no publish → the stale
harness-spawn-intent:dispatch-consensus-implementation:N remains, causing pending_implement_intent.

Fix: (1) on a duplicate marker, classify_implement_attempt / completed_marker_actions
narrowly read the companion .refactor-loop/runs/implement-*.md (implement logs only; accepted only when the last 30 lines contain exactly
1 IMPLEMENT_DONE:ok); (2) suppress_publish_superseded_implementation_spawn_intents:
only when an issue already has an executable publish_implementation_output is the corresponding
dispatch-consensus-implementation spawn intent suppressed to status_only (implementation_ready_to_publish).
This does not repeat the #507 deadlock: spawns are suppressed only after publish is ready (no missed dispatch), publish is still fully validated (not early/
duplicated), and the wakeup-runner/publish authorization scope is unchanged. Live verification: #553 now projects an executable publish.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Implement issue #555 (#557)

* Fix publish-gap layer 3: wakeup_runner clean_exit revalidation reuses the artifact fallback for duplicate-marker logs

pubgap2 fixed the plan-side duplicate-marker artifact fallback, but wakeup_runner's
_source_log_has_clean_marker (the clean_exit_source_marker precondition revalidation) still used the strict
reader and returned clean_exit_marker_missing for #553's duplicate IMPLEMENT_DONE echo → blocking publish.
Now the strict reader goes first, and the fallback applies only when duplicate_or_conflicting_log_marker + is_implement_log +
_implement_run_artifact_done_marker exactly matches the projected marker. It reuses the same plan-side
helper and does not weaken the safety gate (reviewer/release/rebase/non-implement/other fail reasons stay strict).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

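* Wire dispatch_remote_ci_fix into wakeup_runner: the pipeline fixes ci-red autonomously (within #396)

children passed review but contract-tests were red (the restart/supervisor daemon was changed without syncing the daemon-count test
masked by the local TEST_CMD); wakeup_runner lacked a dispatch_remote_ci_fix implementation, so
ci-red stayed status_only forever and was never fixed autonomously. Mirroring the existing dispatch_reviewers + review-fix commit-push:
each failing check dispatches one remote-ci-fix.md worker (no commit authority); after REMOTE_CI_FIX_DONE:<check>:ok
the controller pushes via safe_push, while :infra/:blocked is left to the maintainer. Durable retry cap=2 per
pr:head_sha:check (state/remote-ci-fix-attempts.json); past the limit it stops with RETRY_CAP. Within the #396
'dispatch remote-ci worker' scope: the worker has no lifecycle/commit authority, and no new authorization is added.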
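
A durable per-key retry cap of this kind could be sketched as below. The key shape `pr:head_sha:check` and the cap of 2 come from the commit message; the JSON layout of the state file and all function names are assumptions for illustration.

```python
import json
from pathlib import Path

RETRY_CAP = 2


def attempt_key(pr: int, head_sha: str, check: str) -> str:
    """Build the pr:head_sha:check key described in the commit message."""
    return f"{pr}:{head_sha}:{check}"


def allow_attempt(attempts: dict, key: str, cap: int = RETRY_CAP) -> bool:
    """Record one attempt and return True, or return False once the cap is hit."""
    if attempts.get(key, 0) >= cap:
        return False
    attempts[key] = attempts.get(key, 0) + 1
    return True


def load_attempts(state_file: Path) -> dict:
    """Read the durable counter file; a missing file means no attempts yet."""
    if not state_file.exists():
        return {}
    return json.loads(state_file.read_text())


def save_attempts(state_file: Path, attempts: dict) -> None:
    """Persist the counters so the cap survives daemon restarts."""
    state_file.parent.mkdir(parents=True, exist_ok=True)
    state_file.write_text(json.dumps(attempts, indent=2, sort_keys=True))
```

Because the head SHA is part of the key, a new push to the PR starts a fresh budget for the same check, while repeated failures on one head stop after two dispatches.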

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

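* Fix wakeup_runner single-helper failure starving the whole tick: break only on a direct spawn launch failure

run_once used to break the whole tick on any spawn-budget blocked result whose reason contained helper_exit:*. The review-fix
helper_exit:3 (WORKTREE_MISSING) for rollup PR #560 (head rollup/<sha> has no worktree) →
break → starved the independent ci-red dispatch for #558/#561, leaving the runner at noop with 0 dispatches (0 codex) for ~30min.
Now: scanning stops only on a direct spawn_codex_harness_background launch failure (is_spawn_action + _spawn_launch_failure);
a blocked named helper (review_gate FIX / dispatch_remote_ci_fix) is recorded and then continue,
so actions for independent PRs later in the same tick are still processed. A single PR's helper failure no longer blocks global dispatch.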

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Tighten the unattended-wakeup and large-issue decomposition authorization tests (#559)

* Implement issue #556

* Implement issue #556

* PR #559 review-fix output

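* Fix identical-duplicate terminal markers being misjudged as conflicting: the reader tolerates identical repeats

The remote-ci-fix worker echoes REMOTE_CI_FIX_DONE:check:ok twice → read_worker_terminal_marker
returns duplicate_or_conflicting → the runner cannot read it → the worker fix is not committed → the retry cap is hit, and #558 stalls.
Root-cause fix (worker_markers._sentinel_adjacent_marker): when there is no sentinel-adjacent marker but the tail has 2+
standalone markers that are all identical, return that marker. One fix covers implement/ci-fix/all future markers.
conflicting is not weakened: different markers still fail, malformed still fails, and a single non-terminal marker still does not terminate.
Does not extend #396; wakeup_runner still revalidates clean_exit_source_marker before commit/push.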

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Autonomous rollup PR (maintainer-directive 2026-06-06): singleton + CI-only + detailed body + auto-squash-merge

Implements the maintainer directive "a rollup only needs CI to pass, no review... auto squash once green... automatic by default, configurable via env, keep only one rollup open":
- Singleton: when an open rollup already exists (head rollup/ + base review-base), force-with-lease update its head to the current
  integration SHA + refresh title/body; create a new one only when none exists, never multiplying.
- CI-only: rollup PRs are excluded from reviewer/review-fix/remote-ci-fix dispatch (they skip the cluster 3-reviewer gate).
- Detailed title/body: the number of commits by which integration leads review-base + range + commit summaries (Chinese is hex-encoded to keep source-EN-only).
- Auto-squash-merge: host.env ROLLUP_AUTO_MERGE (default auto; manual waits for a human). owner-gated + rollup-PR-only
  narrow allowlist + a hard gate on all exact-SHA required checks being green (ReleaseRequiredChecksProjection) → gh pr merge
  --squash --delete-branch; branch protection/host-policy failure → WAIT, left to a human, no force. Every failure writes a diagnostic pending event.
- Dual-anchored authorization: runtime-exceptions.md#rollup-autonomous-merge-2026-06-06 + SKILL.md + host.env surface matrix;
  durable artifact = maintainer-directives/2026-06-06. Fully covered by source-regression + behavior tests. Regression 1619 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
loning added a commit that referenced this pull request Jun 6, 2026
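* #471 hotfix: the wakeup_plan completed-marker projection is filtered by the open-managed set; closed/merged targets are not projected (stops the graphql bleed: removes wakeup_runner's per-item revalidation of already-terminated targets)

* #410: SKILL/prompt comments now cite durable contract anchors instead of one-off .refactor-loop/runs/*; adds the narrow source-regression test_generated_artifacts_not_authority (#472)

* #471 hotfix2: wakeup-plan projects only the latest valid marker per target; older rounds no longer project dispatchable actions (the final cut to stop graphql backoff)

* skill: strengthen the no-gap rule: 0 codex + active work has no observe-mode exemption; the controller must intervene and dispatch codex within the same turn (+source-regression)

* #470: when rendering direct-post prompts, inline _github-post-rules via a fixed token so it is reachable from the host worktree cwd (avoids bare relative paths) (#479)

* #478 hotfix: daemon spawn switches to a launch-and-forget independent supervisor (start_new_session, no wait), so a daemon reload does not lose in-flight workers; the daemon prints a clear status line every tick

* #471: wakeup-plan latest-open-fresh projection: completed-marker open-target/latest + release-rollup only latest-per-sha with ahead>0 verified against the local ref (stops stale projection/graphql) (#482)

* #481: consensus->implement readiness-dedup gate (stops the daemon redispatching closed/has-PR/in-flight targets) (#486)

* #481: consensus->implement readiness-dedup gate; closed/closing-PR/remote-branch/in-flight targets are not projected as dispatchable, stopping duplicates/races

* #481 fix r1: the dedup gate reuses the concurrency monitor owner surface to measure in-flight instead of reimplementing the daemon algorithm (addr architect)

* #481 fix r2: add a controller helper guard behavior test (addr tests)

* #485: consensus->implement is serialized per scope_paths conflict group (one executable per group, the rest status_only), reusing batching-heuristics to prevent parallel conflicting PRs (#489)

* #475: wakeup-runner WakeupApplyBudget: spawn-worker actions are batched up to the floor deficit per hard_gate.dispatch_required (lifecycle actions stay at 1/tick), fixing 1/tick linear recovery (#492)

* hotfix(#475 ext): the batch budget includes dispatch_design_consensus: the solver triplet is dispatched in full within one tick (non-lifecycle, worker dispatch), fixing design-consensus still running at 1/tick low concurrency

* hotfix: headless spawn-intent launch: a skipped duplicate no longer breaks the whole spawn batch, a stale applied ledger (log missing) can be retried, and helper_exit:3 gains a source diagnostic event; fixes the 248-intent pile-up/0 concurrency

* hotfix r2: a blocked lifecycle action no longer dead-stops the whole tick: stale blocked actions (close-drop target OPEN / publish without a verified head) are skipped/suppressed and the following spawn-batch keeps launching, bringing headless concurrency up to the floor

* hotfix r3: root-cause fix for headless 0 concurrency: the detached spawn-codex launch passes env_for_subprocess explicitly (fixes the #478 decouple dropping env and causing EXIT=127); a failed log (non-zero EXIT) can retry the launch, while EXIT=0/unfinished still suppresses duplicates; in the tick status summary, later launches are no longer masked by the first blocked result

* hotfix r4: generic blocked-not-dead-stop in run_once: any blocked/skipped non-spawn action no longer dead-stops the whole tick, and the following spawn-batch keeps launching; wakeup-plan suppresses stale actions whose precondition is doomed to fail; root-cause fix for headless concurrency=0

* hotfix r5: on the projection side, wakeup-plan suppresses a stale close_managed_drop (a drop-close whose target is actually OPEN is doomed to fail) to status_only, so it no longer occupies the tick and blocks the following design-consensus spawn launch

* hotfix r6: wakeup-runner run_once no longer misjudges a no-intents dispatch_design_consensus (helper_exit:3) as a codex launch failure that dead-stops the whole tick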

根因:dispatch_design_consensus 是 spawn-batch action;当 solver triplet 不完整(如 #496 minimal/structural EXIT=0 但未 emit SOLVER_DONE marker)时 helper 返回 exit 3(NO_INTENTS),_spawn_launch_failure 命中 "helper_exit:" 前缀判 True → run_once break,整 tick dead-stop,后续 reviewers/其他 design-consensus/implement 全部不派 → headless 并发=0。

修法:run_once 中 blocked 的 spawn-batch action 若 controller_action != spawn_codex_harness_background(即 dispatch_design_consensus),按 routing no-op skip-and-continue,继续 launch 后续 spawn-batch;仅真正 spawn_codex_harness_background launch failure 才走既有 retry/break。延续 r4 blocked-not-dead-stop 不变量到 design-consensus spawn-batch action。

behavior test: test_wakeup_runner_design_consensus_no_intents_does_not_dead_stop_later_spawn_batch(无修复时 later spawn 不 launch 而 FAIL)。

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

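* hotfix r7: review head_sha binding: unblock the self-driven headless review→merge path

Root cause: the reviewer artifact frontmatter carried no head_sha, so _review_evidence_from_artifact could not obtain the reviewed head SHA → every PR under review (#480/#473/#402) was always judged WAIT_OR_REDISPATCH:invalid_reviewer_evidence:missing_reviewed_head_sha, and wakeup-plan did not re-dispatch reviewers automatically → headless review→merge was stuck permanently (blocking the release).

Fix (implemented by the codex hotfix worker, committed after controller verification):
- controller_actions reads headRefOid when dispatching reviewers, fails closed if it is missing, and injects the authoritative HEAD_SHA into the rendered prompt.
- reviewer-{architect,tests,quality}.md frontmatter writes head_sha: ${HEAD_SHA}.
- wakeup_runner._review_head_sha_for: when the artifact lacks head_sha, recover it from the controller-rendered prompt/log (not relying on worker self-report, to avoid repeating the markerless failure); the live-head staleness check and merge truth-table semantics are unchanged.
- wakeup_plan projects an executable dispatch_reviewers self-heal action for OPEN reviewing PRs whose reviewer head is missing or stale.

Verification: the controller independently re-ran test_wakeup_runner+review_gate+wakeup_plan+controller_actions: 293 passed; marker/prompt contracts: 30 passed; review-gate safety semantics (a stale head still does not merge) are locked by 13 review-gate tests.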

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

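* hotfix r8: make the review-evidence-redispatch self-heal action executable, closing the headless review→merge loop

Follow-up to r7: r7 bound reviewer evidence to head_sha, but the review-evidence-redispatch self-heal action projected by wakeup_plan was forced to status_only by the closed-action-projection finalizer (its kind was not in EXECUTABLE_ACTION_KINDS), so the runner never applied it → review evidence lacking a head for #480/#473/#402 could not be re-dispatched automatically → review_gate stayed at WAIT_OR_REDISPATCH → concurrency drained to 0.

Fix (implemented by the codex hotfix worker, committed after controller verification):
- wakeup_plan adds review-evidence-redispatch to EXECUTABLE_ACTION_KINDS; the finalizer preserves runner authority and no longer forces status_only.
- controller_actions.dispatch_reviewers reads stale_review_roles and re-dispatches only the stale roles, and _pending_review_spawn_exists skips any in-flight dispatch-reviewers:<pr>:<role>:r1 intent (idempotent, no double dispatch); it still goes through r7's HEAD_SHA injection so the re-dispatched reviewer's new evidence is head-bound.
- The merge-gate truth table is unchanged; the change only makes the existing self-heal action executable and targets it at stale roles.

Verification: the controller independently re-ran test_wakeup_plan+wakeup_runner+review_gate+controller_actions: 295 passed.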

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

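* hotfix r9: reviewer re-dispatch bumps to the next round, uses an absolute cd, and reads the latest evidence per role: actually unblocks headless review→merge

Root cause (same class: suppression keyed on "an artifact exists" rather than "success plus the distinguishing dimension"): reviewer re-dispatch was hard-coded back to r1 and collided with the earlier successful (EXIT=0) r1 log from the old head → _harness_spawn_intent_log_suppresses_retry judged it "already done" and suppressed it → the review that should re-run for the current head never ran → review_gate stayed at WAIT_OR_REDISPATCH → concurrency 0. In addition, the intent cd was the relative "." → the headless launch layer did not project it.

Minimal fix (per maintainer directive: the filename is the unique id, no wrapper --id is added, and the existing r<R> naming convention and public naming are kept):
- controller_actions.dispatch_reviewers: _next_review_round takes the highest existing round for that PR/role plus 1 and renders the r<N+1> prompt/log/output/intent; the pending-spawn idempotency guard uses the real round number; the cd in _append_harness_spawn_intent becomes the absolute path str(cd.resolve()), so headless projection and launch work.
- wakeup_plan.latest_reviewer_heads / wakeup_runner._latest_review_evidence_by_role: now take the latest round of evidence for each role (rather than the single globally latest round), so re-dispatching only the stale roles does not discard other roles' valid head-bound evidence; merge still requires every role's evidence to be bound to the live head, so safety semantics are unchanged.

Verification: the controller independently re-ran test_wakeup_plan+wakeup_runner+review_gate+controller_actions: 299 passed.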

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
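
The round bump described in r9 can be sketched as below. This is a minimal illustration, not the project's `_next_review_round`: the function name `next_review_round` and the idea of passing intent ids as plain strings are assumptions; only the `dispatch-reviewers:<pr>:<role>:r<N>` id shape comes from the commit text.

```python
import re
from typing import Iterable

# Intent id shape taken from the commit text: dispatch-reviewers:<pr>:<role>:r<N>
INTENT_RE = re.compile(r"^dispatch-reviewers:(\d+):([a-z]+):r(\d+)$")


def next_review_round(intent_ids: Iterable[str], pr: int, role: str) -> int:
    """Return the highest existing round for (pr, role) plus 1 (hypothetical helper)."""
    rounds = []
    for intent_id in intent_ids:
        match = INTENT_RE.match(intent_id)
        if match is None:
            continue  # ignore ids that do not follow the expected shape
        if int(match.group(1)) == pr and match.group(2) == role:
            rounds.append(int(match.group(3)))
    # With no prior round the first dispatch is r1.
    return max(rounds, default=0) + 1


existing = [
    "dispatch-reviewers:473:tests:r1",
    "dispatch-reviewers:473:tests:r2",
    "dispatch-reviewers:473:architect:r1",
    "dispatch-reviewers:402:tests:r5",
]
tests_round = next_review_round(existing, 473, "tests")
quality_round = next_review_round(existing, 473, "quality")
```

Keying the round on both PR and role means a stale role gets a fresh, unsuppressed log name without disturbing the other roles' evidence.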

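* hotfix r10: the review_gate action carries the reviewed head_sha, plus an e2e test locking the whole headless review→decision path

Root cause (the 5th break in the review-path head_sha chain): when wakeup_plan projected the review_gate action for REVIEW_DONE it read the head only from the log (which has none) → action.head_sha was empty → wakeup_runner._review_gate got stuck at WAIT_OR_REDISPATCH:missing_action_reviewed_head_sha; the evidence was already head-bound, but the gate could not even perform the comparison.

Fix (minimal, following the filename-is-the-id approach):
- wakeup_plan.completed_marker_actions: the REVIEW_DONE action's head_sha is now resolved by _review_done_action_head_sha from the authoritative evidence surface (latest_reviewer_heads, which includes the head injected into the prompt by r7; or the live PR head when all roles are head-bound; otherwise it falls back to the log). It is filled only from real sources and does not bypass the gate's head comparison, so merge safety semantics are unchanged.
- New test_review_gate_e2e.py: from a real wakeup_plan projection → WakeupRunner apply, asserting that the two concrete decisions FIX (a reject exists) and MERGE (all approve + CI green) land; this locks the whole review→decision path end to end so that no further hidden layer can surface.
- test_wakeup_plan.py: add a planner regression for the prompt-bound reviewed head.

Verification: the controller independently re-ran test_wakeup_plan+wakeup_runner+review_gate+controller_actions+review_gate_e2e: 302 passed.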

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

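* hotfix r11: PrChecksProjection adds a bounded retry for transient gh-api failures, so ready PRs reach the review_gate decision headlessly

Root cause: review_gate's CI read, PrChecksProjection.check_pr, failed closed immediately (ci_unavailable:pull_api_failure) on a single non-zero return from gh api pulls/<n> or commits/<sha>/check-runs, with no retry; the rate limit was healthy and the failures were transient blips → #473/#402 (MERGEABLE + CI green) were blocked by a transient failure every tick and never reached MERGE/FIX.

Fix (minimal, touching only PrChecksProjection; #322 ReleaseRequiredChecksProjection is untouched):
- _run_api_read: up to 3 deterministic retries (no sleep) for both api reads, stopping on success; persistent failure still fails closed (safe). No behavior change on the happy path.
- Starvation conclusion (verified by the worker): review_gate was already being applied, and a blocked ci_unavailable row can be re-evaluated (it is not terminal duplicate-suppressed), so run_once needs no change.
- e2e: test_review_gate_e2e adds a "first pulls read fails transiently, PR still merges" case; test_pr_checks adds retry coverage.

Verification: the controller independently re-ran test_pr_checks+wakeup_runner+review_gate+review_gate_e2e+wakeup_plan+controller_actions: 309 passed.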

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
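
The bounded, sleep-free retry from r11 might look roughly like this. It is a sketch under assumptions, not the real `_run_api_read`: the `run_api_read` name, the `runner` callable returning `(returncode, stdout)`, the `ApiReadError` type, and reading "up to 3" as three attempts in total are all hypothetical.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical runner shape: takes an argv, returns (returncode, stdout).
Runner = Callable[[Sequence[str]], Tuple[int, str]]


class ApiReadError(RuntimeError):
    """Hypothetical fail-closed error raised once every attempt has failed."""


def run_api_read(runner: Runner, argv: Sequence[str], attempts: int = 3) -> str:
    """Run a read-only API command, retrying immediately on non-zero exit."""
    last_rc = None
    for _ in range(attempts):
        rc, out = runner(argv)
        if rc == 0:
            return out  # stop on the first success
        last_rc = rc
    # Persistent failure still fails closed instead of guessing a CI state.
    raise ApiReadError(f"ci_unavailable:pull_api_failure rc={last_rc}")


calls = []


def flaky_runner(argv: Sequence[str]) -> Tuple[int, str]:
    """Fails on the first call only, simulating a transient blip."""
    calls.append(list(argv))
    return (1, "") if len(calls) == 1 else (0, '{"state": "open"}')


result = run_api_read(flaky_runner, ["gh", "api", "repos/O/R/pulls/473"])
```

Because the retries are deterministic and bounded, the happy path is unchanged and a genuinely broken read still surfaces as a blocked action on the same tick.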

* hotfix r12: run_once no longer lets the spawn-batch starve drain-only lifecycle actions: ready PRs merge headlessly and the backlog can drain

Root cause (verified): every tick, wakeup_plan orders SPAWN actions (implement intents + design-consensus #495/#496/#490/#491/#418, index 0-5) before the lifecycle review_gate actions (#480/#473/#402, index 6-8); with the old run_once logic `if applied_spawns>0 and not is_spawn_action: break`, as soon as any spawn was dispatched the first lifecycle action broke the whole tick. Design-consensus almost always has spawn work → review_gate/merge starved forever → #473 (approve/approve/comment → MERGE_WITH_COMMENTS, MERGEABLE + CI green) never merged and the active count never dropped.

Fix (minimal, core dispatch loop):
- Introduce consumes_spawn_budget = is_spawn_action or _uses_spawn_budget(action); when over budget, `continue` (not break) so that later drain-only lifecycle actions are still reached.
- _uses_spawn_budget: dispatch_reviewers=True; review_gate=True only when the decision is FIX (which dispatches a fix worker); otherwise review_gate (MERGE/MERGE_WITH_COMMENTS), close, and other drain-only actions=False → they consume no spawn budget and are still evaluated and executed after the floor is filled.
- Keep r6's spawn launch-failure break/retry and the dispatch_design_consensus no-intents skip-and-continue; truth table, head-binding, CI-green, and mergeable safety semantics are unchanged.

Result: each tick both fills the floor with spawns and evaluates drain-only lifecycle actions such as merge/close, so the backlog can drain.

Verification: the controller independently re-ran test_wakeup_runner+review_gate+review_gate_e2e+wakeup_plan+controller_actions+pr_checks: 309 passed; adds test_wakeup_runner_lifecycle_review_gate_not_starved_after_spawn_batch.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
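
The `continue`-instead-of-`break` change in r12 can be illustrated with a toy dispatch loop. Everything here is a simplified assumption: the action dictionaries, the `kind`/`decision` keys, and the single integer budget are hypothetical stand-ins for the real runner state.

```python
from typing import Dict, List


def uses_spawn_budget(action: Dict[str, str]) -> bool:
    """Decide whether an action consumes spawn budget (simplified sketch)."""
    kind = action["kind"]
    if kind in ("spawn", "dispatch_reviewers"):
        return True
    if kind == "review_gate":
        # Only a FIX decision dispatches a worker; MERGE variants are drain-only.
        return action.get("decision") == "FIX"
    return False


def run_once(actions: List[Dict[str, str]], spawn_budget: int) -> List[str]:
    """Apply actions in order, skipping (not stopping) once the budget is spent."""
    applied: List[str] = []
    spawns = 0
    for action in actions:
        if uses_spawn_budget(action):
            if spawns >= spawn_budget:
                continue  # over budget: skip this one, keep scanning the plan
            spawns += 1
        applied.append(action["id"])
    return applied


plan = [
    {"id": "a", "kind": "spawn"},
    {"id": "b", "kind": "spawn"},
    {"id": "c", "kind": "spawn"},
    {"id": "d", "kind": "review_gate", "decision": "MERGE"},
    {"id": "e", "kind": "review_gate", "decision": "FIX"},
]
applied_ids = run_once(plan, spawn_budget=2)
```

With a `break` in place of the `continue`, action `d` would never be reached once the budget ran out, which is the starvation the commit describes.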

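* hotfix r13: fix WakeupRunner._run_command injecting --repo before the gh subcommand, which made every gh call malformed: the root cause of the review→merge stall

Root cause (confirmed by measurement): _run_command inserted --repo at full[1:1] (before the subcommand) for all gh commands. gh does not accept --repo as a pre-subcommand global flag, and gh api does not take --repo at all (the repo is in the URL): `gh --repo O/R api repos/O/R/pulls/473` → rc=1 "unknown flag: --repo". So every gh call from wakeup-runner was malformed: the CI read (gh api pulls/check-runs via PrChecksProjection(runner=_run_command)) → ci_unavailable:pull_api_failure; mergeability (gh pr view) → mergeability_unavailable; live-state and others likewise. → The review gate could never read CI/mergeable → #473 (approve/approve/comment=MERGE_WITH_COMMENTS, green), #402, and #480 never landed headlessly. r11's retry only repeated the same malformed command. Direct PrChecksProjection() and direct gh calls used the plain default runner (no injection) and worked, which masked the bug.

Minimal fix: _run_command injects --repo only for `gh pr`/`gh issue`, at the legal position after the subcommand (and its verb); `gh api` is left entirely alone; it never injects before the subcommand; an existing --repo is not duplicated.

Verification: the controller independently re-ran test_wakeup_runner+review_gate+review_gate_e2e+pr_checks+wakeup_plan+controller_actions: 312 passed; adds an argv-shape test plus an e2e test (a stubbed subprocess.run rejects malformed `gh --repo ... api`, and the ready PR still merges).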

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* hotfix r14: managed draft PR ready-then-merge: the review gate no longer hard-blocks on pr_draft

Root cause (surfaced once r13 fixed the gh calls): review_gate's _review_gate_mergeability_error treated isDraft=True as a hard block, WAIT_OR_REDISPATCH:pr_draft. But this loop is draft-until-consensus: a PR is opened as a draft until review consensus, and merge_pr performs ready-then-merge when the decision is MERGE/MERGE_WITH_COMMENTS. So a managed draft with a green gate (e.g. #473 approve/approve/comment) stayed stuck at pr_draft and never landed headlessly.

Minimal fix:
- wakeup_runner: remove the isDraft→pr_draft hard block from the mergeability check (mergeable!=MERGEABLE, i.e. CONFLICTING/DIRTY, is still blocked).
- controller_actions.merge_pr: for a live draft PR, verify crnd:lifecycle:managed first and then run gh pr ready (a non-managed draft is not readied; it writes CONTROLLER_ACTION_BLOCKED:target-not-managed and returns 2); the existing ready-then-merge is kept.
- Truth table, head-binding, CI-green, and conflicting-blocks safety semantics are unchanged; readying is limited to managed PRs.

Verification: the controller independently re-ran test_wakeup_runner+review_gate+review_gate_e2e+controller_actions+wakeup_plan+pr_checks: 316 passed; adds a managed-draft merge test, a real plan→runner→controller ready-then-merge e2e, and conflicting/non-managed safety cases.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

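* #409: remove the .refactor-loop/host.env runtime fallback; CONSENSUS_RND_HOST_ENV is the only host fact locator (fail-closed when missing); migrate test fixtures that relied on the legacy fallback (#473)

* hotfix r15 (sshx consensus B, part 1): unify gh argv construction in a single build_gh_argv helper to prevent a recurrence of the r13-class --repo bug

Hardening step 1 of the unanimous sshx consensus B: consolidate the scattered, duplicated gh --repo handling into build_gh_argv(slug, argv) in codex_refactor_loop/gh_invoke.py: gh api never gets --repo (the repo is in the URL); gh pr/issue gets --repo <slug> only after the subcommand and only when it is missing; never before the subcommand. The three API call sites (wakeup_runner._run_command / ControllerActions.gh / PrChecksProjection) all go through this helper, and their inline --repo heuristics are deleted. Behavior is unchanged (including the r13 fix).

Verification: characterization + source-regression tests are locked in test_gh_invoke.py (9 tests); the controller independently re-ran test_gh_invoke+pr_checks+wakeup_runner+review_gate+review_gate_e2e+controller_actions+wakeup_plan: 326 passed; running pr_checks directly remains compatible with --help.

Note: this is hardening (preventing recurrence) and does not directly raise concurrency; the concurrency bottleneck is in the upstream implement→publish stage, which is handled separately next.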

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
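
The argv rules stated for `build_gh_argv(slug, argv)` can be sketched as follows. The signature is taken from the commit text, but the body is an assumption written for illustration: it assumes `argv` starts with `"gh"`, that the verb follows the subcommand, and that `-R` and `--repo=...` also count as an existing repo flag.

```python
from typing import List, Sequence


def build_gh_argv(slug: str, argv: Sequence[str]) -> List[str]:
    """Return argv with --repo added only where the commit text says it is legal."""
    args = list(argv)
    if len(args) < 2 or args[0] != "gh":
        return args
    # `gh api` carries the repo in the URL, so it is never touched.
    if args[1] not in ("pr", "issue"):
        return args
    already_set = any(
        arg in ("--repo", "-R") or arg.startswith("--repo=") for arg in args
    )
    if already_set:
        return args
    # Insert after the subcommand and its verb, never before the subcommand.
    insert_at = min(3, len(args))
    return args[:insert_at] + ["--repo", slug] + args[insert_at:]


api_argv = build_gh_argv("O/R", ["gh", "api", "repos/O/R/pulls/473"])
pr_argv = build_gh_argv("O/R", ["gh", "pr", "view", "473"])
```

Keeping the rule in one pure function makes the argv shape easy to lock with characterization tests, which is the stated purpose of the hardening.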

* hotfix r16 (sshx consensus): unblock implement→publish to remove the concurrency bottleneck: re-dispatch markerless/stale implements and publish behind safety gates

sshx codex-cli triplet consensus (minimal/structural/delete + meta-judge): 10 design-solving issues had reached consensus but were stuck at implement→publish (implement had run earlier but was markerless / uncommitted / on a stale base / without a PR; dispatch_consensus_implementation was suppressed by local_iter_branch and publish by verified_pr_head_unavailable) → no new dispatchable work → concurrency ~1.

Fix (wakeup_plan/wakeup_runner/controller_actions/git.py):
- Success-aware re-dispatch: re-dispatch when the implement branch/worktree exists but has no valid clean IMPLEMENT_DONE:ok marker (markerless/failed) or the base is stale; fresh_safe_worktree resets/recreates the canonical worktree at the current origin/$INTEGRATION_BRANCH; only genuinely in-flight work, an existing clean PR, or a clean marker awaiting publish suppresses it.
- Publish gating (strictly prevents publishing stale or wrong-scope work): publish_implementation_output runs commit→push→open PR→dispatch reviewers only with a canonical identity (refactor/iter<issue>-<candidate> matching the worktree's actual branch), a fresh integration base (merge-base==origin/integration; a stale base blocks), a managed open issue, no duplicate PR, and a non-empty scoped diff; otherwise it blocks.
- wakeup_plan replaces the verified_pr_head suppression with canonical_implementation_identity/single_linked_managed_issue/clean_scoped_diff.

Verification: the controller independently re-ran wakeup_plan+wakeup_runner+review_gate+review_gate_e2e+controller_actions+pr_checks+gh_invoke: 330 passed.

Known follow-up: the worker changed the PR title/body/commit text to English, violating the "GitHub artifacts in Chinese" policy; the next commit restores Chinese.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

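* hotfix r17: clear terminal non-publishable implement logs before a fresh consensus-implement re-dispatch, so that stuck implements actually start

Follow-up to r16: r16 re-dispatched markerless/stale implements (resetting the worktree), but the old EXIT=1/markerless implement log still occupied the spawn's target_log_absent precondition → the fresh implement did not start (the same "suppress on existence" class, at the implement-spawn-log layer).

Fix (controller_actions, success-aware): before a fresh consensus-implementation dispatch writes its spawn intent, clear only terminal non-publishable implement logs (EXIT!=0 failures / EXIT=0 without an IMPLEMENT_DONE:ok marker); genuinely in-flight logs (no EXIT) and clean publish-ready logs (EXIT=0 + clean marker) are left untouched. A stuck markerless/failed implement can then really launch after re-dispatch, run on a fresh base, and produce a clean marker → r16 gated publish.

Verification: the controller independently re-ran wakeup_plan+wakeup_runner+review_gate+review_gate_e2e+controller_actions+pr_checks+gh_invoke: 333 passed.

Design note: the concurrency bottleneck is fixed with deterministic logic (ready = clean EXIT=0 + marker; parallel = disjoint scope_paths) without introducing LLM-in-loop dispatch decisions (daemon-first; the sshx codex trio unanimously chose revise: an LLM is at most shadow-mode and stays out of the decision loop).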

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

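* hotfix r18: implement-lifecycle converges on a single success-aware predicate, and the runner path clears stale logs itself: headless self-heals with no manual log clearing

Follow-up to r16/r17 (r17 put the log clearing inside the suppressed dispatch path, so it never ran). Adds implement_lifecycle.py as the single source of truth, classify_implement_attempt → in_flight | publish_ready | redispatch, shared by plan/runner/controller:
- The runner spawn-apply path's _spawn_log_suppresses_retry suppresses only for in_flight|publish_ready; redispatch (EXIT!=0/markerless/stale-base) does not suppress, and _clear_redispatchable_spawn_log clears the stale terminal log at the point where the spawn is actually applied → the fresh implement really starts (fixes the r17 misplacement, no manual clearing).
- Publish happens only for publish_ready (clean IMPLEMENT_DONE:ok + canonical identity + fresh base + non-empty scoped diff); a clean marker on a stale base becomes redispatch rather than a deadlock (#421 edge).
- Deterministic logic (no LLM-in-loop; daemon-first); merge-gate/#191/#322/#396 safety semantics are unchanged.

Verification: the controller independently re-ran wakeup_plan+wakeup_runner+review_gate+review_gate_e2e+controller_actions+pr_checks+gh_invoke: 339 passed.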

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
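
A three-way predicate of the kind r18 describes could be sketched like this. It is not the project's `classify_implement_attempt`: the log line shapes (`EXIT=<n>` and `IMPLEMENT_DONE:<cluster>:ok` on their own lines) and the `base_is_fresh` parameter are assumptions, and the canonical-identity and scoped-diff checks are deliberately left out.

```python
import re

# Assumed log conventions: a terminal "EXIT=<code>" line and a standalone ok marker.
EXIT_RE = re.compile(r"^EXIT=(\d+)[ \t]*$", re.MULTILINE)
OK_MARKER_RE = re.compile(r"^IMPLEMENT_DONE:[\w-]+:ok[ \t]*$", re.MULTILINE)


def classify_implement_attempt(log_text: str, base_is_fresh: bool) -> str:
    """Classify an implement attempt as in_flight, publish_ready, or redispatch."""
    exits = EXIT_RE.findall(log_text)
    if not exits:
        return "in_flight"  # no terminal EXIT yet: the worker may still be running
    if exits[-1] != "0":
        return "redispatch"  # failed run
    if OK_MARKER_RE.search(log_text) is None:
        return "redispatch"  # clean exit but markerless
    if not base_is_fresh:
        return "redispatch"  # clean marker on a stale base
    return "publish_ready"


running = classify_implement_attempt("working...\n", base_is_fresh=True)
ready = classify_implement_attempt(
    "IMPLEMENT_DONE:issue-421:ok\nEXIT=0\n", base_is_fresh=True
)
```

Routing every caller through one classifier is what prevents the plan, runner, and controller from disagreeing about whether an existing log should suppress a retry.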

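* Fix the phase9-router triplet-evidence-invalid misjudgment that wedged headless design-consensus

When the issue source snapshot injected into the solver prompt contained the previous
design-consensus round's audit trail (peer solver log paths), _peer_solver_reference_violation
scanned the whole prompt, misread it as a peer-isolation violation, and failed closed by
refusing to dispatch the meta-judge, permanently wedging every issue whose body contained
historical consensus records.

Fix: the peer-isolation scan covers only router-controlled regions and strips the issue-author
content between '## Issue source snapshot' and '## Full solver template'; real peer references
injected via the router header or the solver template are still blocked.
Adds two behavior tests covering snapshot exclusion and router-region blocking.

Headless dogfood run: after reload, 418/437/474/490/491/493/494/495 all
resumed the self-driven solver→judge→consensus flow, and #494 has self-driven to implement.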

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
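
The region-scoped scan can be sketched as below. The two heading strings come from the commit text; the function names, the substring-based peer check, and the choice to scan everything when the closing heading is absent are assumptions made for illustration.

```python
from typing import Iterable

SNAPSHOT_HEADING = "## Issue source snapshot"
TEMPLATE_HEADING = "## Full solver template"


def router_controlled_text(prompt: str) -> str:
    """Drop the issue-author region between the two headings (hypothetical helper)."""
    start = prompt.find(SNAPSHOT_HEADING)
    if start == -1:
        return prompt
    end = prompt.find(TEMPLATE_HEADING, start)
    if end == -1:
        # Assumption: without a closing heading, scan everything (fail closed).
        return prompt
    return prompt[:start] + prompt[end:]


def peer_reference_violation(prompt: str, peer_log_names: Iterable[str]) -> bool:
    """Report a violation only for peer references outside the snapshot region."""
    scanned = router_controlled_text(prompt)
    return any(name in scanned for name in peer_log_names)


peers = ["solver-minimal-r1.log"]
history_only = (
    "router header\n"
    "## Issue source snapshot\n"
    "previous round: solver-minimal-r1.log\n"
    "## Full solver template\n"
    "template body\n"
)
leaked_in_header = "router header solver-minimal-r1.log\n" + history_only
```

The snapshot is author-written history, so excluding it removes the false positive while the router header and template, which the router controls, remain fully scanned.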

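* CLAUDE.md: add the consensus-rnd:foundational-invariants sentinel block

The dogfood bootstrap's check-project-rules probe requires PROJECT_RULES to contain this
fixed-point block; when it is missing the probe fails closed and blocks headless restart. The content
is the canonical FI-001~007 generated by the probe; it mirrors the existing design philosophy and introduces no new constraints.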

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Fix the headless gap where a markerless implement wedged publish

When an implement worker exits cleanly (EXIT=0) but writes the IMPLEMENT_DONE marker only into the run
artifact (runs/implement-issue-<id>.md) rather than at the end of the log (where codex stdout markers
land is unreliable), completed_marker_actions scanned only log markers → markerless → no
publish action was generated → the design→implement→PR chain stalled after implement. Observed on #421: EXIT=0,
the artifact had IMPLEMENT_DONE:issue-421:ok, and 9 files were staged, yet there was no PR and the label was still
design-solving.

Fix: for a clean-exit implement-issue log that is markerless, fall back to reading
the IMPLEMENT_DONE marker at the end of the run artifact, reusing the artifact-first pattern already used for
review verdicts. Narrow scope (implement-issue + clean-exit only), with 2
behavior tests; the full wakeup_plan suite of 138 tests is green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Complete markerless implement publish: revalidation also reads the run artifact marker

8aeea89 fixed the wakeup_plan detection side, but wakeup-runner's source-marker
revalidation (_source_log_has_clean_marker) still scanned only the log, so markerless implement
publish was still rejected (clean_exit_marker_missing). Completion: for a
clean-exit implement-issue log that is markerless, revalidation falls back to the marker in
runs/implement-issue-<id>.md, consistent with the detection side; includes a behavior test,
and the full wakeup_runner suite of 76 tests is green.

Note: dogfood runs confirm that markerless is a systemic root cause (it also affects readiness re-dispatch, review
verdicts, and more) and should be unified systematically (see #491 logging/observability and the follow-up marker-reading
consolidation); this commit only completes the implement publish chain.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

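* More detailed logging: the wakeup-runner tick shows full status counts plus the details of masked blocked/skipped actions

The old _wakeup_tick_action looked only at results[0]; when one tick processed several actions, a blocked one (e.g.
a failed publish) was masked by a successful spawn and completely invisible, and graphql-backoff always showed a misleading
'remaining=unknown'. During dogfood debugging this forced a ledger dig every time to learn what had really happened.

Change: the tick log shows full status counts [applied=N,blocked=M,...] plus the specific reason+action of each masked
blocked/skipped entry (graphql-backoff is reported separately as a whole-tick gate, without 'unknown'); the dispatched
headline is kept. Tests are updated to reflect the improved visibility; the full wakeup_runner suite of 76 tests is green.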

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

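* Complete the third link of markerless implement: the readiness predicate reads the run artifact marker

classify_implement_attempt (r18's single success-aware predicate) scanned only the log for
IMPLEMENT_DONE:ok; a markerless log (marker landed in the artifact) → redispatch:markerless →
readiness did not suppress → re-dispatch overwrote completed work (dogfood: #421 was re-dispatched 4 times; #493 went
IMPLEMENT_DONE:ok → re-dispatch → :partial and lost work).

Completion: for a markerless log, fall back to reading IMPLEMENT_DONE:ok from runs/implement-<cluster>.md,
consistent with detection (8aeea89) and revalidation (dfacc64). Narrow scope and
r17/r18 compatible: only :ok is accepted; partial, failed, and truly markerless (no artifact) attempts still take redispatch
recovery. The existing markerless-redispatch tests (304 tests across wakeup_plan/wakeup_runner/
controller_actions) are all green, plus 2 new behavior tests.

This completes the three markerless links of implement→PR (detection/revalidation/readiness);
review-chain markerless handling and a systematic unified abstraction are handed to the #499 consensus.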

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* #500: design-consensus orchestration redesign: phase9-router is the sole dispatch owner; remove wakeup-runner's duplicate dispatch

Lands the #500 r4 consensus (structural framing, no-new-schema):
- phase9-router becomes the sole owner of design-consensus worker intents and adds the
  META_RESOLVED:re-design → marker.round+1 three-way solver route
- Remove wakeup-runner's dispatch_design_consensus (eliminates the duplicate-dispatch helper_exit:3)
- wakeup-runner keeps only closed action apply (drop→close, consensus→implement)
- After readiness, dispatch_consensus_implementation moves the issue to crnd:phase:implementing
- No new schema (reuses HARNESS_SPAWN_INTENT)

Fixes headless P0: reflector META_RESOLVED had no consumer, and the phase9-router/wakeup-runner
duplicate dispatch caused a 0-codex deadlock. 14 files, +207/-282; verification: 471 tests all green.

The consensus was reached by the controller manually driving design-consensus r1→r4 (while the daemon was stopped).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* #501 fix r1: add a failure-branch test for the dispatch_consensus_implementation label transition

The tests reviewer rejected: the new label transition (gh issue edit→implementing) was tested only on the success
path, with no failure-branch test showing that dispatch stops before worktree/render/intent when gh issue edit returns nonzero.
This fix adds that test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

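* Fix headless deadlock: the label removal set keeps only canonical crnd:* labels; historical aliases are deleted

gh issue/pr edit --remove-label fails the whole command with rc=1 when a label does not exist in the repository.
ISSUE_LABELS_REMOVE/PR_LABELS_REMOVE included cleanup_aliases (emoji/legacy names), and
the repository does not maintain those historical labels, so _move_issue_to_implementing_phase failed permanently,
consensus→implement dispatch wedged, and wakeup_runner deadlocked with 0 self-driven codex.

Per maintainer instruction, historical label compatibility is removed (historical labels are no longer handled): the removal sets list only
canonical crnd:* phase/human/stuck labels. Adds a source-regression test locking
both sets as canonical-only with no aliases.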

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

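* Fix #500 regression: the production phase9-router ctx lost GH_REPO_SLUG, killing all design-consensus intake

#500 (e733b1a) changed the ctx construction in Phase9Router.__init__ from
`LoopContext.load(repo_root=repo_root)` to
`LoopContext.load(repo_root=repo_root, env={"REPO_ROOT": str(repo_root)})`.
The restricted env does not read os.environ, so the ctx of the daemon (started by sourcing host.env) had
gh_repo_slug=None → the first line of `_open_design_consensus_issues` failed closed and returned []
→ DesignConsensusIssueIntake silently never dispatched → headless could not start
design-consensus for any issue; `_require_open_source_issue` also failed without a slug (marker routes
such as the #490 redesign were stuck at phase9-source-state-unavailable).

Fix: revert to `LoopContext.load(repo_root=repo_root)`, which reads the process environment to resolve host facts.
Tests always pass an explicit ctx via `Phase9Router(ctx=...)` and do not rely on the restricted env; adds a regression test locking
GH_REPO_SLUG resolution on the production repo_root construction path.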

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

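* Add STALE_REVIVAL_HOURS: a stuck implement that has not progressed past the threshold is re-triggered automatically (env-configurable, default 3h)

In headless mode, a partial/failed/markerless implement log leaves the already queued
dispatch-consensus-implementation spawn intent permanently stuck on the runner's
target_log_absent precondition (and dispatch_consensus_implementation, which clears the log, is in turn
suppressed by pending_implement_intent) → a stuck implement does not self-heal (e.g. the
10h partial-wedge of #421/494/493/474/498).

Adds the host-tunable STALE_REVIVAL_HOURS (default 3; non-positive or invalid values fall back to 3h). When wakeup_plan
projects an implement spawn intent, if the target implement log is redispatchable
(partial/failed/markerless/stale-base) and its mtime is past the threshold, the log is cleared →
target_log_absent passes → the implement is re-dispatched automatically. Safety gates: in-flight logs (no terminal EXIT) are never cleared,
publish-ready logs (:ok + diff) are not cleared, and the age gate prevents clearing a run that has only just finished.

The env var is registered in the SKILL.md host env surface matrix and host.env.example; it is covered by 5 behavior
tests plus a matrix↔example symmetry test.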

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
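
The fallback rule for `STALE_REVIVAL_HOURS` (default 3, with non-positive or invalid values falling back to 3h) can be sketched as follows. The `stale_revival_seconds` name appears in a later commit message, but this body, its `env` parameter, and the acceptance of fractional hours are assumptions.

```python
import math
import os
from typing import Mapping, Optional

DEFAULT_STALE_REVIVAL_HOURS = 3.0


def stale_revival_seconds(env: Optional[Mapping[str, str]] = None) -> float:
    """Return the revival threshold in seconds, falling back to the 3h default."""
    source = os.environ if env is None else env
    raw = source.get("STALE_REVIVAL_HOURS", "")
    try:
        hours = float(raw)
    except (TypeError, ValueError):
        hours = DEFAULT_STALE_REVIVAL_HOURS  # unset or not a number
    if not math.isfinite(hours) or hours <= 0:
        hours = DEFAULT_STALE_REVIVAL_HOURS  # non-positive, NaN, or infinite
    return hours * 3600.0


default_seconds = stale_revival_seconds({})
custom_seconds = stale_revival_seconds({"STALE_REVIVAL_HOURS": "5"})
```

Falling back to the default rather than raising keeps a mistyped host setting from disabling self-healing altogether.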

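* Extend stale-revival to dead codex: an in_flight log with no EXIT, past the threshold, and absent from the liveness list is also re-triggered

The most common headless wedge is a codex/supervisor dying midway (e.g. cut off when the daemon is killed):
the log stays in_flight forever (no terminal EXIT) → classify always returns in_flight → it is never re-dispatched.
Previously only redispatchable (partial/failed/markerless) logs were cleared, which missed this half.

Now, when the log mtime is past the threshold, _revive_stale_redispatchable_implement_log first clears it as
redispatchable if applicable; otherwise, if classify returns in_flight and the monitor liveness list has no
live process for that log, it is judged a dead worker and cleared. Safety: spawn-codex's no-output stall window is far shorter than
the default 3h, so a log still in_flight past the threshold must belong to a dead supervisor; the monitor liveness check adds
a second safeguard, so a live codex is never killed. Adds 3 behavior tests (dead codex revived / fresh log untouched /
live process untouched).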

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

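* Add a manual trigger: consensus-rnd-cli revive-implements (re-trigger stuck implements immediately, without waiting for the threshold)

_revive_stale_redispatchable_implement_log gains a force parameter: it skips the stale_revival_seconds()
age gate, but under force an in_flight log is cleared only after the monitor liveness check proves there is no live process (a running
codex is never killed). Adds force_revive_stuck_implements(), which scans all implement-issue-*.log files and
force-revives them, and the new CLI command revive-implements (authority delete-log). The auto path with
force=False is unchanged. Adds 4 behavior tests.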

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

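* Fix env leak: the test suite isolates the ambient host.env at import time (running TEST_CMD in a worktree no longer fails spuriously)

When a daemon-spawned implement worker runs the full TEST_CMD in a worktree, the inherited relative
CONSENSUS_RND_HOST_ENV=.config/consensus-rnd/host.env leaked into the temp-repo
LoopContext.load(repo_root=tmp) → it resolved to tmp/.config/..., which does not exist → spurious
LoopContextError (measured: concurrency_monitor 48 + restart_daemons 23 + … ≈151 errors)
→ the full suite was never green → implement was always :partial → it never reached a PR.

unittest discover imports every test_*.py during collection before running anything; the new
test_aaa_host_env_isolation.py (first alphabetically) pops the host-injected variables at import time,
so the whole suite is isolated before any test runs. Verification: a full discover run with the locator set → exit 0, all green
(previously 151 errors under the same conditions); targeted before/after runs show 48/23/1 errors → OK.
Test-only module; production and daemon code are untouched.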

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

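* Fix the markerless keystone: solver/judge markers gain a companion-artifact fallback

A solver/judge codex sometimes exits with EXIT=0 while the SOLVER_DONE/META_JUDGE_DONE marker lands only in the run
artifact and not at the end of the log, so phase9-router judges the triplet incomplete and never dispatches the judge,
and the design-consensus round wedges (measured on #505 r1 structural: 214 embedded occurrences, 0 at line start).

Extend the existing implement artifact-marker fallback to solver/judge: when a clean-exit log
has no standalone marker at its end, fall back to the companion run artifact runs/<log-stem>.md. Purely
additive (behavior is unchanged when the log has a marker), prefix-scoped, exact filename regex, and it reuses
the safe standalone-marker extraction (embedded or echoed markers are not trusted). Consensus is not weakened (3 marked solvers +
a judge are still required; only the marker source becomes more robust). phase9/router.py + wakeup_plan.py + behavior +
source-regression tests; full suite 1367 green.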

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

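* Fix publish stale-base with no recovery and stop :ok churn, so that implementing can produce a draft PR

When implement was :ok but the worktree base lagged origin/$INTEGRATION_BRANCH, publish used to
hard-block (publish_implementation_stale_base) and never recover; stale-revival in turn treated the stale-base
:ok implement as redispatchable and kept re-dispatching the same stale worktree → endless churn that filled the
floor, with no real PR for a long time (the first PR after implement could never be opened).

Now: before publish, merge origin/$INTEGRATION_BRANCH in the worktree to recover the base (on conflict,
merge --abort and hand an explicit reason to the existing resolver/re-dispatch); a clean `:ok` is no longer churned by
stale-revival (a terminal success goes through publish-with-recovery and is not re-run); an :ok without an open PR
creates a draft PR (draft before consensus, per #300) with Closes #N. The controller still owns git topology;
the worker does not commit/push and does not bypass the review-gate/merge truth table. controller_actions.py +
wakeup_plan.py + wakeup_runner.py + behavior/source-regression tests; full suite
1374 green.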

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Fix markerless implement: with EXIT=0 and a real diff but no clean marker, synthesize a publish candidate, ending the stale-revival empty re-dispatch loop

The publish build/test gate serves as the safety net.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

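* Fix the publish ordering bug and dispatch a codex fallback resolver when merge fails: commit the worker diff first, then merge the fresh base; a real conflict no longer wedges but dispatches the resolver, and the run resumes idempotently

The emptiness check now uses git diff HEAD; also fixes the same class of git diff problem in wakeup_plan stale-suppression.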

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

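* Implement issue #504 (#519)

* Fix incomplete headless review-fix prompt rendering: bind all placeholders (PR_NUMBER, title, round, branch, REVIEW_*_PATH, etc.), unblocking fix→merge for rejected PRs

Previously headless _dispatch_review_fix bound only FIX_OUTPUT_PATH; the remaining ${} placeholders were unresolved, so the fix codex could not read the reject evidence.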

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

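* Implement issue #496 (#520)

* Fix the cd bug that dispatched review-fix codex onto main: run in the PR worktree instead, and fail closed when the worktree is missing

Previously _dispatch_review_fix used cd=repo_root → the fix modified main (polluting the daemon checkout) and never landed on the PR branch (so rejected PRs never merged).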

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Implement issue #508 (#523)

* Fix wakeup-plan action ordering: review_gate/publish are ordered before new-work spawns, resolving spawn-budget starvation → rejected PRs can be fixed→merged

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

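* CLAUDE.md adds constitutional engineering rules: exceptions must be raised and logged diagnosably, never swallowed or silenced; tests must assert real behavior, and meaningless tests are forbidden (maintainer directive)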

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

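* Implement issue #503 (#521)

* Add general engineering ground rules, specifying object-oriented design requirements and implementation details

* Implement issue #514 (#524)

* Refine the general object-oriented design guidelines, adding core principles and implementation-level details

* Implement issue #494 (#526)

* Implement issue #517 (#528)

* Implement issue #505 (#510)

* Implement issue #505

* #510 fix r1: add phase9-router behavior test coverage (responding to the tests reviewer reject)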

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

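* Fix the headless gap where FIX_DONE did not commit the fix output: wakeup-runner commits and pushes the fix worktree before the FIX_DONE re-review

In headless mode the fix codex produced FIX_DONE but the worker never committed, and FIX_DONE routed straight to dispatch_reviewers,
so the fix output stayed uncommitted in the worktree, reviewers kept reviewing the old head, and rejects never converged, blocking
review→merge for every PR that needed a fix round. Mirroring the interactive controller: before a FIX_DONE-sourced dispatch_reviewers, if the fix worktree is
dirty, run git add + commit + safe_push; if it is clean, no-op. Includes 2 behavior tests (dirty commits / clean skips).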

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Implement issue #491 (#531)

* Implement issue #490 (#529)

* Implement issue #490

* Implement issue #490

* PR #529 review-fix output

* Implement issue #499 (#512)

* Implement issue #499

* PR #512 review-fix output

* Implement issue #474 (#532)

* Implement issue #474

* Implement issue #474

* Implement issue #474

* Implement issue #474

* PR #532 review-fix output

* Implement issue #476 (#518)

* Implement issue #476

* PR #518 review-fix output

* Implement issue #509 (#527)

* Implement issue #509

* Implement issue #509

* spawn-claim recycle now uses an artifact-aware terminal marker: after the companion log is deleted by 24h retention, the claim of a finished task can still be recycled (by reading the runs/ artifact) and no longer blocks re-dispatch permanently; a running task without a terminal marker is still not recycled (prevents double-spawn)

* Implement issue #421: pre-release clean-room integration test regime (#534)

* Implement issue #421

* rebase #421 onto fresh integration: resolve the test_cli_command_router conflict + verify

* Implement issue #493: test-discipline constitutional amendment (behavior-first) (#535)

* Implement issue #493

* rebase #493 onto fresh integration: resolve conflicts + verify

* PR #535 review-fix r1: addressed tests+quality reject (behavior-first test adjustments)

* Implement issue #506 (#536)

* Implement issue #506

* Implement issue #506

* Implement issue #506

* Implement issue #506

* PR #536 review-fix output

* Root-cause fix for churn: wakeup-plan applies open_closing_pr/terminal/in-flight suppression to consensus-impl spawn intents (aligned with dispatch_consensus_implementation), so issues that already have a PR are no longer re-dispatched for implement

* Implement issue #507 (#525)

* Implement issue #507

* Implement issue #507

* PR #525 review-fix output

* Implement issue #516 (#530)

* Implement issue #516

* Implement issue #516

* #530 fix r1: add a behavior test for the ManagedWorkSnapshot fetch-failure stale-cache fallback (responding to the tests reviewer reject)

* PR #530 review-fix output

* PR #530 review-fix r2: typed ManagedWorkItem projection (addresses architect type-boundary reject + tests + contract-tests)

* PR #530 review-fix r3: anchor ManagedWorkSnapshot read surface in authorization mirror + cross-artifact source-regression (addresses architect governance reject)

* PR #530 review-fix r4: add canonical+legacy managed-label discovery regression test (addresses tests reject)

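* Fix the worker marker reader misjudging identical repeated markers as a conflict

codex often echoes the terminal marker 2-3 times (including inside a diff and on prefixed lines), which made
read_worker_terminal_marker return duplicate_or_conflicting, so the review truth table
never converged and every headless review→merge wedged. Now identical repeated markers collapse into
a valid marker, and duplicate_or_conflicting is returned only when 2+ distinct marker values exist.
Covers _last_final_marker / _sentinel_adjacent_marker /
_marker_from_companion_artifact; malformed and foreign-prefix detection is kept.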

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
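
The collapse rule can be sketched in a few lines. This is an illustration only: `resolve_terminal_marker`, the `(status, value)` return shape, and the `"missing"` status are hypothetical, and the real reader also performs malformed and foreign-prefix checks that are omitted here.

```python
from typing import List, Optional, Tuple


def resolve_terminal_marker(markers: List[str]) -> Tuple[str, Optional[str]]:
    """Collapse identical repeats; flag a conflict only for 2+ distinct values."""
    # dict.fromkeys removes duplicates while keeping first-seen order.
    distinct = list(dict.fromkeys(markers))
    if not distinct:
        return ("missing", None)
    if len(distinct) == 1:
        return ("ok", distinct[0])
    return ("duplicate_or_conflicting", None)


echoed = resolve_terminal_marker(
    ["REVIEW_DONE:approve", "REVIEW_DONE:approve", "REVIEW_DONE:approve"]
)
conflicting = resolve_terminal_marker(["REVIEW_DONE:approve", "REVIEW_DONE:reject"])
```

Comparing distinct values rather than counting occurrences keeps the safety property (two different verdicts are never silently resolved) while tolerating harmless echoes.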

* Reserve implementation PR for issue #537 (#540)

⟦AI:AUTO-LOOP⟧

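* Implement issue #511 (#522)

* Implement issue #511

* PR #522 review-fix output

* PR #522 review-fix output

* Remove the .refactor-loop run artifact that slipped into the PR (architect r8 reject)

resolve-pr522-merge.md is a one-off merge-resolution summary, i.e. a generated artifact, and should not enter the durable diff
(CLAUDE.md: "generated artifacts are not a source of truth"). The resolution notes stay in the commit message / PR body.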

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

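* Fix the early PR reservation crash (#511 regression, blocking all implement dispatch)

_reserve_implementation_pr opened the reservation PR using the path of the worker's rich body, but that file did not yet exist at
reservation time → FileNotFoundError (not caught by except RuntimeError) →
the wakeup-runner tick crashed → no consensus issue could be implemented → loop P0 wedge.
Fix: before reservation, write a placeholder body if the body is missing (## issue #N implementation + Closes + sentinel;
publish rejects the placeholder, forcing the worker to produce a real body); reservation is idempotent (an existing open PR is returned directly;
an existing remote head is reset to integration and pushed with force-with-lease); OSError is caught.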

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Reserve implementation PR for issue #543 (#544)

⟦AI:AUTO-LOOP⟧

* Reserve implementation PR for issue #541 (#546)

⟦AI:AUTO-LOOP⟧

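* revert early-PR reservation (#507): no more empty placeholder PRs; a PR is opened only once publish has real content

dispatch_consensus_implementation no longer reserves or opens a PR; publish_implementation_output
returns to the pre-#507 behavior: commit + push + open_pr_with_label happen only when implement yields a non-empty diff (using the worker's
real title/body, verified to be non-placeholder); an empty diff opens no PR, merges nothing, and closes no issue. The placeholder mechanism
(_reserve_implementation_pr / _placeholder_*) is deleted, and the early_pr_missing projection in wakeup_plan/runner is changed.
Root cause: empty placeholder PRs were merged empty by reviewers before implement publish, falsely closing #537/#541/#543.
Full suite: 1496 passed.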

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Implement issue #542: remove the redundant head_sha from the redispatch payload (#547)

Remove the duplicate head_sha payload from review redispatch

* hotfix: a stale consensus-implement intent pointing at a deleted worktree no longer suppresses fresh re-dispatch

The pending_implement_intent branch of consensus_implementation_suppressed_reason gains a worktree-existence
guard (wired to the existing _canonical_consensus_worktree_exists): it suppresses only when the stale pending intent's canonical
worktree still exists. After the revert of early-PR reservation (#507), the pending
implement intents of #537/#541/#543 pointed at deleted worktrees; the old logic suppressed fresh re-dispatch while wakeup-runner kept
spawning the stale intent into a dead cd, causing an os error 2 / EXIT=1 loop that blocked the autonomous consensus→implement chain.
Now a missing worktree lets the fresh dispatch through to rebuild the worktree.

Adds test_wakeup_plan cases (missing worktree does not suppress / existing worktree still suppresses) and updates
the test_controller_actions subTest that locked the old behavior (the pending case creates a canonical worktree to preserve the
reject-before-fresh_safe_worktree intent). Full suite: 1557 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

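* hotfix: a spawn-claim lock whose holder pid is dead can be recycled (self-heal, prevents a permanent HELD on a cleared log)

_existing_claim_is_recyclable gains a pid-liveness fallback: the claim can be recycled when the holder pid recorded in the lock is dead
(os.kill(pid, 0) raises ProcessLookupError), while keeping the existing terminal-marker recycling and live-spawn protection
(a live pid with no marker is still not recycled) plus the metadata-mismatch fail-closed. Root cause: a failed implement (EXIT=1)
left a stale lock; when dispatch_consensus_implementation re-dispatched, it cleared the log → the #490 spawn-claim recycles only when the log has
EXIT= → the cleared log had no marker → the dead process's lock could not be recycled → permanent SPAWN_CLAIM_HELD deadlock,
blocking consensus→implement re-dispatch (#537/#541/#543). Adds 7 behavior tests. Full suite: 1560 passed.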

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
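
The pid-liveness fallback can be sketched with the standard signal-0 probe. The helper names and the two-argument shape of `claim_is_recyclable` are assumptions, and the sketch is POSIX-only: on Windows `os.kill` with an arbitrary signal terminates the process instead of probing it.

```python
import os


def pid_is_dead(pid: int) -> bool:
    """Return True only when the OS reports that no such process exists (POSIX)."""
    if pid <= 0:
        return False  # invalid pid: do not treat it as recyclable (fail closed)
    try:
        os.kill(pid, 0)  # signal 0 checks for existence without delivering a signal
    except ProcessLookupError:
        return True
    except PermissionError:
        return False  # the process exists but belongs to another user
    return False


def claim_is_recyclable(holder_pid: int, has_terminal_marker: bool) -> bool:
    """A claim is recyclable if the task finished or its holder process is gone."""
    return has_terminal_marker or pid_is_dead(holder_pid)


own_process_dead = pid_is_dead(os.getpid())
live_claim_recyclable = claim_is_recyclable(os.getpid(), has_terminal_marker=False)
```

A live holder with no terminal marker stays protected, so the fallback only releases locks that no running process can still be using.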

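* hotfix: publish_implementation_output accepts an implementation diff that the worker has already committed

_require_publish_implementation_diff previously checked only git diff HEAD --quiet (uncommitted changes);
once the worker committed by itself (on its own branch, isolated and harmless) the working tree was clean → misjudged as no_diff → no PR was opened.
Now a diff is considered publishable when there are uncommitted changes OR HEAD has a committed delta relative to the integration base (merge-base..HEAD is non-empty);
_commit_publish_implementation_diff skips the commit and returns success when the working tree is already clean (the worker has committed),
and no longer misreports publish_commit_failed. Fixes the #541-class worker-commit→no_diff→no PR plus
repeated daemon dispatch. Adds a behavior test. The codex full suite passes with DISCOVER_STATUS:0.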

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Implement issue #437: bounded cleanup mechanism for .refactor-loop historical generated artifacts and stale worktrees (#548)

* #437: add the canonical RuntimeRetention owner and demote log-retention to a one-version alias; cleanup covers generated files + pending-events compaction + stale worktrees; CLAUDE.md gains a narrow #437 carveout

* PR #548 review-fix output

* PR #548 review-fix output

* PR #548 review-fix output

* PR #548 review-fix output

* PR #548 review-fix output

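* hotfix: autonomously resolve stale-base conflicts on CONFLICTING managed review PRs

Wires the existing but orphaned prompts/rebase-resolve.md into the headless path. When a managed review PR is
mergeable=CONFLICTING (base is behind), the controller starts git merge --no-commit --no-ff
origin/<integration> in the PR worktree. A clean merge is committed and pushed directly; on conflict it
dispatches a rebase-resolve codex (resolves and stages inside the isolated worktree; commit/push forbidden), and
after REBASE_RESOLVE_DONE the controller commits and safe_push pushes only the PR head. The result then goes
through CI and review-gate again. Adds dispatch_pr_rebase_resolve / commit_push_resolved_pr_rebase
(controller_actions), the wakeup_plan projection, and the wakeup_runner apply step.

Safety: only refactor/iter<N>-* PR branches are touched; an explicit guard refuses to touch integration/review_base;
diff-filter=U verifies there are no unresolved conflicts before commit; every abort path runs git merge --abort
and leaves a clean worktree; the action is idempotent (NOOP if the base is already contained or a resolve is in
flight). Fixes the root cause of stale-base churn stalling the loop permanently (one PR merges, all others become
CONFLICTING, and nobody resolves them). codex full suite: 1589 passed.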

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

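* hotfix: enrich live mergeability before the rebase_resolve_actions projection (fixes a production gap)

The autonomous-resolve projection read item.mergeable without enriching it first, but in production GhItem comes
from a snapshot whose mergeable field is empty → always skipped → CONFLICTING PRs never projected
dispatch_pr_rebase_resolve → the loop stayed stalled. Reuses _with_live_mergeability from the same file
(review_evidence_redispatch_actions already follows this pattern). Adds a regression test for an empty mergeable
field that requires a live fetch (the original test mocked mergeable and missed this production path).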

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* hotfix: add the dispatch_pr_rebase_resolve kind to EXECUTABLE_ACTION_KINDS (executable end to end)

The else branch of _close_projection_action defaults to status_only=True. The rebase action's kind,
stale-base-conflicting-pr, was not in EXECUTABLE_ACTION_KINDS → always status_only, the runner never applied it,
and CONFLICTING PRs were never resolved autonomously. Adds that kind (plus the commit_push completion kind if
needed) to the set; controller_action is already in RUNNER_NAMED_HELPER_ACTIONS, so it then stays executable.
Adds an end-to-end wakeup-plan pipeline test (the original test covered only the rebase_resolve_actions function
and missed the status_only gate in closed-action-projection). codex full suite: 1593 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

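* hotfix: gate the rebase-resolve commit/dispatch projection on the actual worktree merge state (stops stale-marker noise)

commit_push_resolved_pr_rebase was previously projected from the persistent REBASE_RESOLVE_DONE marker alone.
After the worktree had committed and pushed (#549) or aborted (#550/#551) the marker remained → fail-closed noise
every tick, repeated 0-codex ticks (triggering the zero_streak alarm), and the stale marker blocked fresh resolves
for #550/#551. New gate: project an executable commit_push only when the worktree really has a merge in progress
(MERGE_HEAD) and no unmerged paths; otherwise project nothing. A PR whose worktree is clean but which is still
CONFLICTING gets an executable dispatch_pr_rebase_resolve again (a stale DONE marker no longer suppresses it
permanently). Adds end-to-end tests. codex full suite passes.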

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Implement issue #502: remove the legacy label compatibility layer: purge the full emoji/legacy alias set (cleanup_aliases / per-spec (#549)

* Implement issue #502: remove the legacy label compatibility layer: purge the full emoji/legacy alias set (cleanup_aliases / per-spec

* Fix PR #549 review r3: adapt the #502 label purge to the current label code (contract-tests + architect/tests reject)

fix codex round-3 applied-4: aligns the #502 legacy-label compatibility purge with recently added label code and
fixes the contract-tests failure plus the architect/tests rejects. Changes labels.py, SKILL.md, and test_wakeup_plan.py.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

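* Fix the headless review_gate ci_failed deadlock: move reject→FIX ahead of the ci/mergeability checks

In _review_gate_decision the ci_error/mergeability checks used to run before reject>0→FIX → a PR with red CI
returned WAIT_OR_REDISPATCH:ci_failed and waited even when reviewers rejected it, yet CI cannot turn green without
a fix → deadlock (#548 and #549 both hit it and needed a manually dispatched fix). Moves reject>0→FIX ahead of
the ci/mergeability checks (still after the reviewer-validity/head checks) → on reject a fix is dispatched
autonomously even with red CI (the fix addresses both the reject and CI). The MERGE/MERGE_WITH_COMMENTS decision
is unchanged (still requires ci-green + mergeable + reject=0 + approve>=1), so there is no risk of a wrong merge.
Adds test_wakeup_runner end-to-end tests (reject+ci-red→FIX; reject=0+ci-red still WAIT; happy-path MERGE
unchanged; stale-head+reject still WAIT). Full suite: 1600 passed.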

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

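* Correct the review_gate order: move the mergeability check ahead of reject→FIX (CONFLICTING PRs do not run review/fix concurrently)

The previous fix (a125d19) moved reject→FIX ahead of ci+mergeability. That broke the ci_failed deadlock but also
routed CONFLICTING+reject PRs to FIX → running concurrently with the independent dispatch_pr_rebase_resolve in the
same worktree (#550 showed a 98-file mid-merge with a reviewer running at the same time). Moves the
mergeability_error check ahead of reject→FIX: CONFLICTING→WAIT (rebase-resolve first, no concurrent review/fix);
MERGEABLE + red CI + reject→FIX (deadlock fix preserved); ci_error stays after reject→FIX (WAIT only for red CI
with no reject). The MERGE decision is unchanged. test_wakeup_runner: 90 OK (including the new conflicting case).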
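
The resulting check order can be illustrated with a small decision function. This is a sketch under assumptions, not the real `_review_gate_decision`: `GateInput` and its fields are hypothetical, only `WAIT_OR_REDISPATCH:ci_failed` is a reason string taken from the text above (the other reason suffixes are invented for illustration), and MERGE_WITH_COMMENTS is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateInput:
    evidence_valid: bool  # reviewer evidence is valid and bound to the live head
    mergeable: bool       # False when the PR is CONFLICTING
    ci_green: bool
    approve: int
    reject: int


def review_gate_decision(g: GateInput) -> str:
    # 1. Reviewer-validity/head checks always come first.
    if not g.evidence_valid:
        return "WAIT_OR_REDISPATCH:invalid_reviewer_evidence"
    # 2. Mergeability before reject: a CONFLICTING PR waits for rebase-resolve.
    if not g.mergeable:
        return "WAIT_OR_REDISPATCH:not_mergeable"
    # 3. A reject dispatches a fix even when CI is red (breaks the deadlock).
    if g.reject > 0:
        return "FIX"
    # 4. Red CI with no reject waits.
    if not g.ci_green:
        return "WAIT_OR_REDISPATCH:ci_failed"
    # 5. Merge requires at least one approval.
    if g.approve >= 1:
        return "MERGE"
    return "WAIT_OR_REDISPATCH:no_approval"
```

Placing step 2 above step 3 is what keeps a fix worker and a rebase-resolve worker out of the same worktree at the same time.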

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Implement issue #498: codex fallback adjudication/generation mechanism for headless daemon fail-closed points (narrow allowlist, not a general es (#550)

* Implement issue #498

* Implement issue #498: codex fallback adjudication/generation mechanism for headless daemon fail-closed points (narrow allowlist, not a general es

* PR #550 review-fix output

* Add patrol-inspector inspection issue intake (#551)

* Implement issue #541

* PR #551 review-fix output

* PR #551 review-fix output

* Fix PR #551 contract-tests: align with the patrol 8th daemon and give the safe_push/sync_dev tests a host.env

#551 patrol adds patrol_inspector_daemon (the 8th daemon) → the test_anti_stop_restart expectation changes from 7 to 8;
test_controller_lib_safe_push / test_sync_dev now set CONSENSUS_RND_HOST_ENV host.env
(per the host-config contract the code needs a host.env locator; rebase-resolve had dropped the base's newer test version).
Assertions are not weakened (the daemon-count assertion is tightened to 8). Full discovery: 1634 OK, plus sshx 13 OK.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Fix PR #551 contract-tests (clean env): the e2e fixture sets META_ESCALATION_STUCK_HOURS to isolate the de-dup test

test_batch_and_per_task_lock_do_not_duplicate_spawn failed in a clean env (CI): the #506 meta-escalation judged
the fixture's mock state as stuck and fired, producing a repository-stalled RunnerResult that polluted the
de-dup assertion (expected [] got [RunnerResult]). The fixture's host.env and env now set
META_ESCALATION_STUCK_HOURS=999999 to disable meta-escalation for this test and keep it focused on de-dup
behavior. Assertions are not weakened. Clean discovery passes in full.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

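* Fix the publish gap: implement readiness uses git status --porcelain to detect staged diffs

classify_implement_attempt and the pre-publish check in wakeup_runner used git diff --quiet, which only detects
unstaged changes. After a worker correctly ran git add -A, the staged diff was misclassified as
empty_scoped_diff, so a finished implement (such as #553: clean EXIT=0, 885 lines of real staged diff, valid
title/body artifacts) could never produce a PR, and wakeup_runner kept re-dispatching children to no effect.
Now git status --porcelain runs first and any staged/unstaged/untracked entry counts as publish-ready; only a
truly clean tree falls back to git diff. The publish-time PR creation path is unchanged (it does not repeat the
#507 early-PR empty-commit/placeholder deadlock).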
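
As an illustration of why porcelain output catches what `git diff --quiet` misses, the sketch below classifies the lines of `git status --porcelain` (v1 format: two status columns `XY`, a space, then the path). The function names are hypothetical and the sketch parses a captured string instead of invoking git, so it is not the repository's readiness check.

```python
from typing import Dict


def summarize_porcelain(output: str) -> Dict[str, int]:
    """Count staged, unstaged, and untracked entries in porcelain v1 output."""
    summary = {"staged": 0, "unstaged": 0, "untracked": 0}
    for line in output.splitlines():
        if len(line) < 3:
            continue  # not a status entry
        index_status, worktree_status = line[0], line[1]
        if index_status == "?" and worktree_status == "?":
            summary["untracked"] += 1
            continue
        if index_status == "!":
            continue  # ignored entries (only listed with --ignored)
        if index_status != " ":
            summary["staged"] += 1    # change recorded in the index
        if worktree_status != " ":
            summary["unstaged"] += 1  # change only in the working tree
    return summary


def is_publish_ready(output: str) -> bool:
    """Any staged, unstaged, or untracked entry counts as a publishable diff."""
    return any(summarize_porcelain(output).values())
```

A worker that ran `git add -A` yields lines such as `A  new_module.py`, which count as staged here even though the working tree has no unstaged changes.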

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Implement issue #552

* PR #558 remote-ci-fix output

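* Remove the orphaned flaky phase9-router test that #552 added out of scope

fb2f461 (the #552 implementation) added test_phase9_router_tick_summary_is_router_local_diagnostic to the
otherwise untouched phase9 router. It is unrelated to the #552 scope (narrowing restart daemon authority), and
the test calls self.router.tick() without mocking the managed-work-snapshot fetch → in CI without GitHub, the
fetch-failed diagnostic line pollutes the output and the anchored regex fails to match (flaky), blocking CI.
main never had this test. Removed under the delete-first and scope-closure rules; if a router diagnostic test is
valuable, it should be added properly, in hermetic form, in a phase9-focused PR.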

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
loning added a commit that referenced this pull request Jun 6, 2026
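* #471 hotfix: filter the wakeup_plan completed-marker projection by the open-managed set; closed/merged targets are not projected (stops the graphql bleed: removes wakeup_runner's per-item revalidation of already-finished targets)

* #410: SKILL/prompt comments now cite the durable contract anchor instead of one-off .refactor-loop/runs/* files; adds the narrow source-regression test_generated_artifacts_not_authority (#472)

* #471 hotfix2: wakeup-plan projects only the latest valid marker per target; older rounds no longer project dispatchable actions (the final cut against graphql backoff)

* skill: strengthen the no-gap rule: with 0 codex and active work there is no observation-mode exemption; the controller must intervene and dispatch codex in the same turn (+source-regression)

* #470: inline _github-post-rules via a fixed token when rendering the direct-post prompt, so it is reachable from the host worktree cwd (avoids bare relative paths) (#479)

* #478 hotfix: daemon spawn becomes a launch-and-forget independent supervisor (start_new_session, no wait), so a daemon reload does not lose in-flight workers; the daemon prints a clear status line every tick

* #471: wakeup-plan latest-open-fresh projection: completed-marker limited to open targets/latest, and release-rollup limited to latest-per-sha with ahead>0 verified against the local ref (stops stale projection/graphql) (#482)

* #481: consensus->implement readiness-dedup gate (stops the daemon from re-dispatching closed/has-PR/in-flight targets) (#486)

* #481: consensus->implement readiness-dedup gate; closed/closing-PR/remote-branch/in-flight targets do not project a dispatchable action, stopping duplicates/races

* #481 fix r1: the dedup gate measures in-flight work through the concurrency monitor owner surface instead of reimplementing the daemon algorithm (addresses architect)

* #481 fix r2: add a controller helper guard behavior test (addresses tests)

* #485: serialize consensus->implement by scope_paths conflict group (one executable per group, the rest status_only), reusing batching-heuristics to prevent parallel conflicting PRs (#489)

* #475: wakeup-runner WakeupApplyBudget: spawn-worker actions are batched up to the floor deficit per hard_gate.dispatch_required (lifecycle actions stay at 1/tick); fixes 1/tick linear recovery (#492)

* hotfix(#475 ext): the batch budget now includes dispatch_design_consensus, so a solver triplet is dispatched in one tick (it is worker dispatch, not lifecycle); fixes design-consensus still running at 1/tick low concurrency

* hotfix: headless spawn-intent launch: a skipped duplicate no longer breaks the whole spawn batch, a stale applied ledger entry (log missing) can be retried, and helper_exit:3 emits a source diagnostic event; fixes 248 piled-up intents / 0 concurrency

* hotfix r2: a blocked lifecycle action no longer dead-stops the whole tick: stale blocked actions (close-drop with target OPEN / publish without a verified head) are skipped/suppressed and later spawn-batch launches continue, bringing headless concurrency up to the floor

* hotfix r3: root-cause fix for headless 0 concurrency: the detached spawn-codex launch passes env_for_subprocess explicitly (fixes the env lost by the #478 decouple, which caused EXIT=127); a failed log (non-zero EXIT) can be relaunched, while EXIT=0/unfinished still suppresses duplicates; in the tick status summary, later launches are no longer masked by the first blocked action

* hotfix r4: general blocked-not-dead-stop in run_once: no blocked/skipped non-spawn action dead-stops the whole tick any more, and later spawn-batch launches continue; wakeup-plan suppresses stale actions whose preconditions are bound to fail; root-cause fix for headless concurrency=0

* hotfix r5: the wakeup-plan projection side suppresses stale close_managed_drop (a drop-close whose target is actually OPEN is bound to fail) to status_only, so it no longer occupies the tick and blocks later design-consensus spawn launches

* hotfix r6: wakeup-runner run_once no longer misclassifies a no-intents dispatch_design_consensus (helper_exit:3) as a codex launch failure that dead-stops the whole tick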

Root cause: dispatch_design_consensus is a spawn-batch action. When the solver triplet is incomplete (for example #496 minimal/structural exited with EXIT=0 but emitted no SOLVER_DONE marker) the helper returns exit 3 (NO_INTENTS); _spawn_launch_failure matched the "helper_exit:" prefix and returned True → run_once broke out, the whole tick dead-stopped, and later reviewers/other design-consensus/implement actions were never dispatched → headless concurrency=0.

Fix: in run_once, a blocked spawn-batch action whose controller_action != spawn_codex_harness_background (that is, dispatch_design_consensus) is treated as a routing no-op (skip and continue) and later spawn-batch launches proceed; only a genuine spawn_codex_harness_background launch failure takes the existing retry/break path. This extends the r4 blocked-not-dead-stop invariant to the design-consensus spawn-batch action.

behavior test: test_wakeup_runner_design_consensus_no_intents_does_not_dead_stop_later_spawn_batch (without the fix the later spawn is not launched and the test FAILs).



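* hotfix r7: review head_sha binding: unblock self-driven headless review→merge

Root cause: the reviewer artifact frontmatter carried no head_sha, so _review_evidence_from_artifact could not obtain the reviewed head SHA → every reviewing PR (#480/#473/#402) was always judged WAIT_OR_REDISPATCH:invalid_reviewer_evidence:missing_reviewed_head_sha, and wakeup-plan did not re-dispatch reviewers automatically → headless review→merge was stuck permanently (blocking the release).

Fix (implemented by a codex hotfix worker, committed after controller verification):
- controller_actions reads headRefOid when dispatching reviewers, fails closed if it is missing, and injects the authoritative HEAD_SHA into the rendered prompt.
- reviewer-{architect,tests,quality}.md frontmatter writes head_sha: ${HEAD_SHA}.
- wakeup_runner._review_head_sha_for: when the artifact lacks head_sha, recover it from the controller-rendered prompt/log (no reliance on worker self-reporting, avoiding a repeat of the markerless failure); the stale live-head check and the merge truth-table semantics are unchanged.
- wakeup_plan projects an executable dispatch_reviewers self-heal action for OPEN reviewing PRs with a missing/stale reviewer head.

Verification: the controller independently reran test_wakeup_runner+review_gate+wakeup_plan+controller_actions (293 passed) and the marker/prompt contracts (30 passed); review-gate safety semantics (a stale head still does not merge) are locked by 13 review-gate tests.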



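* hotfix r8: make the review-evidence-redispatch self-heal action executable: close the headless review→merge loop

Follows r7: r7 bound reviewer evidence to head_sha, but the review-evidence-redispatch self-heal action projected by wakeup_plan was forced to status_only by the closed-action-projection finalizer (its kind was not in EXECUTABLE_ACTION_KINDS), so the runner never applied it → review evidence missing a head for #480/#473/#402 could not be re-dispatched automatically → review_gate stayed at WAIT_OR_REDISPATCH → concurrency drained to 0.

Fix (implemented by a codex hotfix worker, committed after controller verification):
- wakeup_plan EXECUTABLE_ACTION_KINDS adds review-evidence-redispatch; the finalizer preserves runner authority and no longer forces status_only.
- controller_actions.dispatch_reviewers reads stale_review_roles and re-dispatches only the stale roles, and _pending_review_spawn_exists skips any dispatch-reviewers:<pr>:<role>:r1 intent already in flight (idempotent, no double dispatch); it still goes through r7's HEAD_SHA injection, so the re-dispatched reviewer's new evidence is head-bound.
- The merge-gate truth table is unchanged; this only makes the existing self-heal action executable and targets it at stale roles.

Verification: the controller independently reran test_wakeup_plan+wakeup_runner+review_gate+controller_actions: 295 passed.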



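* hotfix r9: reviewer re-dispatch bumps to the next round + absolute cd + latest-evidence-per-role: genuinely unblocks headless review→merge

Root cause (same class: suppressing on "artifact exists" rather than "succeeded + distinguishing dimension"): reviewer re-dispatch was hard-coded back to r1 and collided with the r1 log that had succeeded earlier (EXIT=0) on the old head → _harness_spawn_intent_log_suppresses_retry judged it "already done" and suppressed it → the review that should rerun on the current head never ran → review_gate stayed at WAIT_OR_REDISPATCH → concurrency 0. In addition the intent cd was the relative "." → the headless launch layer did not project it.

Minimal fix (per maintainer directive: the filename is the only id, no wrapper --id is added, and the existing r<R> naming convention stays in place without changing public names):
- controller_actions.dispatch_reviewers: _next_review_round takes the highest existing round for that PR/role plus 1 and renders the r<N+1> prompt/log/output/intent; the pending-spawn idempotency guard uses the real round number; the cd in _append_harness_spawn_intent becomes the absolute path str(cd.resolve()), so headless projection and launch work.
- wakeup_plan.latest_reviewer_heads / wakeup_runner._latest_review_evidence_by_role: changed to "take each role's own latest-round evidence" (rather than the single globally latest round), so re-dispatching only stale roles does not drop other roles' valid head-bound evidence; merge still requires every role's evidence to be bound to the live head, so safety semantics are unchanged.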
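
The two rules in the list above can be sketched as follows. This is an illustrative sketch, not the code of `_next_review_round` or `_latest_review_evidence_by_role`: it assumes round numbers can be recovered from intent ids shaped like `dispatch-reviewers:<pr>:<role>:r<N>` (the shape quoted in r8), and the tuple layout of the evidence records is hypothetical.

```python
import re
from typing import Dict, Iterable, Tuple

# Assumed id shape, taken from the r8 notes: dispatch-reviewers:<pr>:<role>:r<N>
_INTENT_RE = re.compile(r"^dispatch-reviewers:(\d+):([a-z]+):r(\d+)$")


def next_review_round(intent_ids: Iterable[str], pr: int, role: str) -> int:
    """Highest existing round for this PR/role plus one (1 when none exists)."""
    highest = 0
    for intent_id in intent_ids:
        match = _INTENT_RE.match(intent_id)
        if match is None:
            continue
        if int(match.group(1)) == pr and match.group(2) == role:
            highest = max(highest, int(match.group(3)))
    return highest + 1


def latest_evidence_by_role(
    evidence: Iterable[Tuple[str, int, str]],
) -> Dict[str, Tuple[int, str]]:
    """Keep, per role, the (round, head_sha) of that role's own latest round."""
    latest: Dict[str, Tuple[int, str]] = {}
    for role, round_number, head_sha in evidence:
        if role not in latest or round_number > latest[role][0]:
            latest[role] = (round_number, head_sha)
    return latest
```

Selecting per role matters when only one role is re-dispatched: a role that stayed at r1 keeps its evidence even though another role has moved on to r2.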



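* hotfix r10: the review_gate action carries the reviewed head_sha + e2e locks the full headless review→decision path

Root cause (the fifth place in the review-path head_sha chain): when wakeup_plan projected the review_gate action for REVIEW_DONE it read the head only from the log (which has no head) → action.head_sha was empty → wakeup_runner._review_gate stuck at WAIT_OR_REDISPATCH:missing_action_reviewed_head_sha; the evidence was already head-bound but the gate had nothing to compare against.

Fix (minimal, following the filename-is-the-id approach):
- wakeup_plan.completed_marker_actions: the REVIEW_DONE action's head_sha is now resolved by _review_done_action_head_sha from the authoritative evidence surface (latest_reviewer_heads including the head injected into the prompt by r7; or the live PR head when every role is head-bound; otherwise fall back to the log). It is filled only from real sources and does not bypass the gate's head comparison, so merge safety semantics are unchanged.
- New test_review_gate_e2e.py: from a real wakeup_plan projection → WakeupRunner apply, asserting that the two concrete decisions FIX (with a reject) and MERGE (all approve + green CI) are reached; this locks the whole review→decision path end to end and prevents further hidden layers from surfacing.
- test_wakeup_plan.py: adds a planner regression for the prompt-bound reviewed head.

Verification: the controller independently reran test_wakeup_plan+wakeup_runner+review_gate+controller_actions+review_gate_e2e: 302 passed.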



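* hotfix r11: PrChecksProjection adds a bounded retry for transient gh-api failures, so a ready PR reaches the review_gate decision headlessly

Root cause: the CI read for review_gate, PrChecksProjection.check_pr, failed closed immediately (ci_unavailable:pull_api_failure) on a single non-zero return from gh api pulls/<n> or commits/<sha>/check-runs, with no retry. The rate limit was healthy, so these were transient blips → #473/#402 (MERGEABLE + green CI) were blocked by a transient failure every tick and never reached MERGE/FIX.

Fix (minimal, touches only PrChecksProjection; #322 ReleaseRequiredChecksProjection is untouched):
- _run_api_read: up to 3 deterministic retries (no sleep) for both api reads; stops on the first success and still fails closed on persistent failure (safe). No behavior change on the happy path.
- starvation conclusion (verified by the worker): review_gate is already applied, and a blocked ci_unavailable row can be re-evaluated (it is not a terminal duplicate-suppressed row), so run_once needs no change.
- e2e: test_review_gate_e2e adds a "first pulls read fails transiently, still merges" case; test_pr_checks adds retry coverage.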
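
A bounded, sleep-free retry of this kind might look like the sketch below. It is not the repository's `_run_api_read`: the `Runner` callable type and the function name are hypothetical, and the sketch interprets the limit as three attempts in total, which is an assumption about how the commit counts retries.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical runner shape: takes an argv, returns (return_code, stdout).
Runner = Callable[[Sequence[str]], Tuple[int, str]]


def run_api_read(runner: Runner, argv: Sequence[str],
                 max_attempts: int = 3) -> Tuple[int, str]:
    """Call runner up to max_attempts times, stopping at the first success.

    No sleep between attempts, so the call stays deterministic. When every
    attempt fails, the last failure is returned so the caller can fail closed.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    result: Tuple[int, str] = (1, "")
    for _ in range(max_attempts):
        result = runner(argv)
        if result[0] == 0:
            return result  # the happy path returns after a single call
    return result
```

Because the loop stops on the first zero return code, a healthy read costs exactly one call, matching the "no behavior change on the happy path" note.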

Verification: the controller independently reran test_pr_checks+wakeup_runner+review_gate+review_gate_e2e+wakeup_plan+controller_actions: 309 passed.



* hotfix r12: run_once no longer lets the spawn batch starve drain-only lifecycle actions, so ready PRs merge headlessly and the backlog can drain

Root cause (verified): every tick wakeup_plan ordered SPAWN actions (implement intent + design-consensus #495/#496/#490/#491/#418, index 0-5) ahead of the lifecycle review_gate actions (#480/#473/#402, index 6-8). The old run_once logic `if applied_spawns>0 and not is_spawn_action: break` meant that once any spawn was dispatched, the first lifecycle action broke out of the whole tick. design-consensus almost always has spawn work → review_gate/merge starved forever → #473 (approve/approve/comment → MERGE_WITH_COMMENTS, MERGEABLE + green CI) never merged, and the active count never dropped.

Fix (minimal, core dispatch loop):
- Introduces consumes_spawn_budget = is_spawn_action or _uses_spawn_budget(action); when over budget the loop uses `continue` (not break) so it can still reach later drain-only lifecycle actions.
- _uses_spawn_budget: dispatch_reviewers=True; review_gate=True only when the decision is FIX (which dispatches a fix worker); otherwise review_gate (MERGE/MERGE_WITH_COMMENTS), close, and other drain-only actions=False → they do not consume spawn budget and are still evaluated and executed after the floor is filled.
- Keeps r6's spawn launch-failure break/retry and the dispatch_design_consensus no-intents skip-and-continue; truth-table/head-binding/CI-green/mergeable safety semantics are unchanged.
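
The difference between `break` and `continue` in that loop can be shown with a toy scheduler. This is a simplified sketch, not `run_once`: actions are reduced to hypothetical `(name, consumes_spawn_budget)` pairs, and launch failures, retries, and blocked actions are left out.

```python
from typing import List, Sequence, Tuple


def apply_tick(actions: Sequence[Tuple[str, bool]], spawn_budget: int) -> List[str]:
    """Return the names of the actions applied in one tick.

    Each action is (name, consumes_spawn_budget). Budget-consuming actions
    past the budget are skipped with `continue`, so drain-only actions that
    come later in the plan are still reached.
    """
    applied: List[str] = []
    spawns_used = 0
    for name, consumes_spawn_budget in actions:
        if consumes_spawn_budget:
            if spawns_used >= spawn_budget:
                continue  # over budget: skip this one, keep scanning the plan
            spawns_used += 1
        applied.append(name)
    return applied
```

With a budget of two and a plan of three spawns followed by a merge decision, the merge still runs in the same tick, which is the drain behavior the fix is after.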

Result: each tick both fills the floor with spawns and evaluates drain-only lifecycle actions such as merge/close, so the backlog can drain.

Verification: the controller independently reran test_wakeup_runner+review_gate+review_gate_e2e+wakeup_plan+controller_actions+pr_checks: 309 passed; adds test_wakeup_runner_lifecycle_review_gate_not_starved_after_spawn_batch.



* hotfix r13: fix WakeupRunner._run_command injecting --repo before the gh subcommand, which made every gh call malformed (the review→merge root cause)

Root cause (confirmed by measurement): _run_command inserted --repo at full[1:1] (before the subcommand) for every gh command. gh does not accept --repo as a pre-subcommand global flag, and gh api does not take --repo at all (the repo is in the URL): `gh --repo O/R api repos/O/R/pulls/473` → rc=1 "unknown flag: --repo". Every gh call from wakeup-runner was therefore malformed: the CI read (gh api pulls/check-runs via PrChecksProjection(runner=_run_command)) → ci_unavailable:pull_api_failure; mergeability (gh pr view) → mergeability_unavailable; live-state and others likewise. → the review gate could never read CI/mergeable → #473 (approve/approve/comment=MERGE_WITH_COMMENTS, green), #402, and #480 never landed headlessly. r11's retry merely repeated the same malformed command. A direct PrChecksProjection() or direct gh call uses the plain default runner (no injection) and works, which masked the bug.

Minimal fix: _run_command injects --repo only for `gh pr`/`gh issue`, in a legal position after the subcommand (and its verb); `gh api` is left untouched; there is never pre-subcommand injection; an existing --repo is not duplicated.

Verification: the controller independently reran test_wakeup_runner+review_gate+review_gate_e2e+pr_checks+wakeup_plan+controller_actions: 312 passed; adds an argv-shape test + e2e (stub subprocess.run rejects malformed `gh --repo ... api`; the ready PR still merges).



* hotfix r14: managed draft PR ready-then-merge: the review gate no longer hard-blocks on pr_draft

Root cause (surfaced after r13 fixed the gh calls): _review_gate_mergeability_error in review_gate treated isDraft=True as a hard block, WAIT_OR_REDISPATCH:pr_draft. But this loop is draft-until-consensus: a PR is opened as a draft until review consensus, and merge_pr does ready-then-merge when the decision is MERGE/MERGE_WITH_COMMENTS. A managed draft with a green gate (such as #473, approve/approve/comment) was therefore stuck on pr_draft forever and never landed headlessly.

Minimal fix:
- wakeup_runner: removes the isDraft→pr_draft hard block from the mergeability check (mergeable!=MERGEABLE, i.e. CONFLICTING/DIRTY, is still blocked).
- controller_actions.merge_pr: for a live draft PR, verify crnd:lifecycle:managed before running gh pr ready (a non-managed draft is not readied; it writes CONTROLLER_ACTION_BLOCKED:target-not-managed and returns 2); the existing ready-then-merge is kept.
- Truth-table/head-binding/CI-green/conflicting-blocks safety semantics are unchanged; readying is limited to managed PRs.

Verification: the controller independently reran test_wakeup_runner+review_gate+review_gate_e2e+controller_actions+wakeup_plan+pr_checks: 316 passed; adds managed-draft merge + a real plan→runner→controller ready-then-merge e2e + conflicting/non-managed safety cases.



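* #409: remove the .refactor-loop/host.env runtime fallback; CONSENSUS_RND_HOST_ENV is the only host fact locator (fails closed when missing); migrate test fixtures that relied on the legacy fallback (#473)

* hotfix r15 (part of sshx consensus B): unify gh argv construction in a single build_gh_argv helper to prevent a recurrence of the r13-class --repo bug

Hardening step 1 of the unanimous sshx consensus B: consolidates the scattered, duplicated gh --repo handling into build_gh_argv(slug, argv) in codex_refactor_loop/gh_invoke.py: gh api never gets --repo (the repo is in the URL); gh pr/issue get --repo <slug> only after the subcommand and only when missing; never pre-subcommand. The three API call sites, wakeup_runner._run_command / ControllerActions.gh / PrChecksProjection, all go through the helper, and their inline --repo heuristics are deleted. Behavior is unchanged (including the r13 fix).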
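
The three rules can be sketched as a pure function. This is an illustration under assumptions, not the helper in `codex_refactor_loop/gh_invoke.py`: it assumes `argv` excludes the leading `gh`, it places the flag right after the subcommand and its verb, and it also treats `-R` (the short form of `--repo` in the GitHub CLI) as an existing repo flag.

```python
from typing import List, Sequence

# Subcommands that get a repo flag in this sketch.
_REPO_SCOPED_SUBCOMMANDS = {"pr", "issue"}


def build_gh_argv(slug: str, argv: Sequence[str]) -> List[str]:
    """Build a full gh command line from argv (assumed to exclude 'gh')."""
    full = ["gh", *argv]
    if not argv or argv[0] not in _REPO_SCOPED_SUBCOMMANDS:
        return full  # e.g. `gh api ...`: the repo is part of the URL
    has_repo_flag = any(
        arg in ("--repo", "-R") or arg.startswith("--repo=") for arg in argv
    )
    if has_repo_flag:
        return full  # never duplicate an existing repo flag
    # Insert after `gh <subcommand> <verb>`, never before the subcommand.
    insert_at = min(3, len(full))
    return full[:insert_at] + ["--repo", slug] + full[insert_at:]
```

For example, `build_gh_argv("O/R", ["pr", "view", "473"])` places the flag after `view`, while an `api` call passes through unchanged.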

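Verification: characterization + source-regression tests are locked in test_gh_invoke.py (9 tests); the controller independently reran test_gh_invoke+pr_checks+wakeup_runner+review_gate+review_gate_e2e+controller_actions+wakeup_plan: 326 passed; running pr_checks directly remains compatible with --help.

Note: this is hardening (recurrence prevention) and does not raise concurrency directly; the concurrency bottleneck is in the upstream implement→publish stage and is handled separately next.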



* hotfix r16 (sshx consensus): open up implement→publish to remove the concurrency bottleneck: markerless/stale implement re-dispatch + safely gated publish

sshx codex-cli triplet consensus (minimal/structural/delete + meta-judge): 10 design-solving issues had reached consensus but were stuck at implement→publish (implement had run earlier but was markerless/uncommitted/on a stale base/without a PR; dispatch_consensus_implementation was suppressed by local_iter_branch and publish by verified_pr_head_unavailable) → no new dispatchable work → concurrency ~1.

Fix (wakeup_plan/wakeup_runner/controller_actions/git.py):
- success-aware re-dispatch: re-dispatch when the implement branch/worktree exists but there is no valid clean IMPLEMENT_DONE:ok marker (markerless/failed) or the base is stale; fresh_safe_worktree resets/recreates the canonical worktree at the current origin/$INTEGRATION_BRANCH; only genuine in-flight work, an existing clean PR, or a clean marker awaiting publish suppresses.
- publish gate (strictly guards against publishing stale/wrong-scope work): publish_implementation_output runs commit→push→open PR→dispatch reviewers only with canonical identity (refactor/iter<issue>-<candidate> + matching actual worktree branch) + fresh integration base (merge-base==origin/integration; a stale base blocks) + managed open issue + no duplicate PR + non-empty scoped diff; otherwise it blocks.
- wakeup_plan uses canonical_implementation_identity/single_linked_managed_issue/clean_scoped_diff in place of the verified_pr_head suppression.

Verification: the controller independently reran wakeup_plan+wakeup_runner+review_gate+review_gate_e2e+controller_actions+pr_checks+gh_invoke: 330 passed.

Known follow-up: the worker switched the PR title/body/commit text to English, violating the "GitHub artifacts in Chinese" policy; the next commit restores Chinese.



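* hotfix r17: clear terminal non-publishable implement logs before a fresh consensus-implement re-dispatch, so a stuck implement really starts

Follows r16: r16 made markerless/stale implement re-dispatch (resetting the worktree), but the old EXIT=1/markerless implement log still occupied the spawn's target_log_absent precondition → the fresh implement did not start (the same "suppress on existence" class, at the implement-spawn-log layer).

Fix (controller_actions, success-aware): before writing the spawn intent, a fresh consensus-implementation dispatch clears only terminal non-publishable implement logs (EXIT!=0 failure / EXIT=0 without an IMPLEMENT_DONE:ok marker); genuinely in-flight logs (no EXIT) and clean publish-ready logs (EXIT=0 + clean marker) are left alone. A stuck markerless/failed implement can then really launch after re-dispatch, run on a fresh base, and produce a clean marker → r16 gated publish.

Verification: the controller independently reran wakeup_plan+wakeup_runner+review_gate+review_gate_e2e+controller_actions+pr_checks+gh_invoke: 333 passed.

Design note: the concurrency bottleneck is fixed with deterministic logic (ready=clean EXIT=0+marker; parallel=disjoint scope_paths) without introducing LLM-in-loop dispatch decisions (daemon-first; the sshx codex trio unanimously revised: an LLM may at most run in shadow mode and stays out of the decision loop).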



* hotfix r18: converge implement-lifecycle on a single success-aware predicate + the runner path clears stale logs itself: headless self-healing, no more manual log clearing

Follows r16/r17 (r17 put the log clearing in the suppressed dispatch path, so it never ran). Adds implement_lifecycle.py as the single source of truth, classify_implement_attempt → in_flight | publish_ready | redispatch, shared by plan/runner/controller:
- In the runner spawn-apply path, _spawn_log_suppresses_retry suppresses only for in_flight|publish_ready; redispatch (EXIT!=0/markerless/stale-base) does not suppress, and _clear_redispatchable_spawn_log clears the stale terminal log where the spawn is actually applied → the fresh implement really starts (fixes the r17 misplacement; no manual clearing).
- publish only on publish_ready (clean IMPLEMENT_DONE:ok + canonical identity + fresh base + non-empty scoped diff); a clean marker on a stale base becomes redispatch instead of a deadlock (#421 edge).
- Deterministic logic (not LLM-in-loop; daemon-first); merge-gate/#191/#322/#396 safety semantics are unchanged.

Verification: the controller independently reran wakeup_plan+wakeup_runner+review_gate+review_gate_e2e+controller_actions+pr_checks+gh_invoke: 339 passed.



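* Fix the phase9-router triplet-evidence-invalid false positive that wedged headless design-consensus

When the issue source snapshot injected into the solver prompt contained the previous round's design-consensus
audit trail (peer solver log paths), _peer_solver_reference_violation
scanned the whole prompt, misjudged it as a peer-isolation violation, and failed closed by refusing to dispatch
the meta-judge, permanently wedging every issue whose body contained historical consensus records.

Fix: the peer-isolation scan covers only router-controlled regions and strips the issue-author content between
'## Issue source snapshot' and '## Full solver template'; real peer references injected through the
router header / solver template are still blocked.
Adds two behavior tests covering snapshot exclusion and router-region blocking.

dogfood headless measurement: after reload, 418/437/474/490/491/493/494/495 all
resumed self-driven solver→judge→consensus, and #494 has self-driven to implement.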
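
The scoping rule can be sketched as plain string handling. This is an assumption-laden illustration, not `_peer_solver_reference_violation`: the helper names are hypothetical, peer references are modeled as simple substrings, and the choice to scan the whole prompt when the closing heading is missing is this sketch's own fail-closed decision.

```python
from typing import Iterable

SNAPSHOT_START = "## Issue source snapshot"
SNAPSHOT_END = "## Full solver template"


def router_controlled_text(prompt: str) -> str:
    """Drop the issue-author snapshot region, keeping router-controlled text."""
    start = prompt.find(SNAPSHOT_START)
    if start == -1:
        return prompt
    end = prompt.find(SNAPSHOT_END, start + len(SNAPSHOT_START))
    if end == -1:
        # No closing heading: scan everything rather than hide a violation.
        return prompt
    return prompt[:start] + prompt[end:]


def has_peer_reference(prompt: str, peer_names: Iterable[str]) -> bool:
    """True when any peer name appears outside the snapshot region."""
    scanned = router_controlled_text(prompt)
    return any(name in scanned for name in peer_names)
```

A peer log path quoted inside the issue body is ignored, while the same path in the router header or in the solver template still counts as a violation.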



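* CLAUDE.md: add the consensus-rnd:foundational-invariants sentinel block

The check-project-rules probe in the dogfood bootstrap requires PROJECT_RULES to contain this
fixed-point block; when it is missing the probe fails closed and blocks the headless restart. The content is
the canonical FI-001~007 generated by the probe; it mirrors the existing design philosophy and introduces no new constraints.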



* Fix the headless gap where a markerless implement blocked publish

When an implement worker exits cleanly (EXIT=0) but writes the IMPLEMENT_DONE marker only to the run
artifact (runs/implement-issue-<id>.md) and not to the end of the log (the landing place of codex stdout
markers is unreliable), completed_marker_actions scanned only log markers → markerless → no publish action was
generated → the design→implement→PR chain stalled after implement. Measured on #421: EXIT=0, the artifact had
IMPLEMENT_DONE:issue-421:ok, 9 files were staged, yet there was no PR and the label stayed at design-solving.

Fix: for a clean-exit implement-issue log that is markerless, fall back to reading the IMPLEMENT_DONE marker at
the end of the run artifact, reusing the artifact-first pattern already used for review verdicts. Narrow scope
(implement-issue + clean-exit only), with 2 behavior tests; the full wakeup_plan suite of 138 tests is green.



* Complete markerless implement publish: revalidation also reads the run artifact marker

8aeea89 fixed the wakeup_plan detection side, but wakeup-runner's source-marker
revalidation (_source_log_has_clean_marker) still scanned only the log, so markerless implement
publish was still rejected (clean_exit_marker_missing). Completion: for a clean-exit implement-issue log that is
markerless, revalidation falls back to the marker in runs/implement-issue-<id>.md, consistent with the detection
side; with a behavior test, and the full wakeup_runner suite of 76 tests is green.

Note: dogfood measurement confirms markerless is a systemic root problem (it also affects readiness re-dispatch,
review verdicts, and more) and should be unified systematically (see #491 logging/observability and the later
marker-reading consolidation); this commit covers only the implement publish chain.



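* More detailed logging: the wakeup-runner tick shows counts for every status + details of masked blocked/skipped actions

The old _wakeup_tick_action looked only at results[0]; when a tick processed several actions, a blocked one (such
as a failed publish) was masked by a successful spawn and completely invisible, and graphql-backoff always showed
a misleading 'remaining=unknown'. During dogfood debugging this forced a trip through the ledger every time to
learn what actually happened.

Change: the tick log shows [applied=N,blocked=M,...] counts for every status + the concrete reason+action of
masked blocked/skipped entries (graphql-backoff is reported separately as a whole-tick gate, without unknown);
the dispatched headline is kept. Tests are updated to reflect the improved visibility; the full wakeup_runner
suite of 76 tests is green.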



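* Complete the third link of markerless implement: the readiness predicate reads the run artifact marker

classify_implement_attempt (the r18 single success-aware predicate) scanned only the log for
IMPLEMENT_DONE:ok; a markerless log (marker in the artifact) → redispatch:markerless →
readiness did not suppress → re-dispatch overwrote finished work (dogfood measurement: #421 re-dispatched 4 times;
#493 IMPLEMENT_DONE:ok → re-dispatch → :partial, work lost).

Completion: for a markerless log, fall back to IMPLEMENT_DONE:ok in runs/implement-<cluster>.md,
consistent with detection (8aeea89) / revalidation (dfacc64). Narrow scope and
r17/r18 compatible: only :ok is accepted; partial/failed/truly markerless (no artifact) still take redispatch
recovery. The existing markerless-redispatch tests (304 tests across wakeup_plan/wakeup_runner/
controller_actions) are all green, plus 2 new behavior tests.

This completes the three markerless links on implement→PR (detection/revalidation/readiness);
markerless handling on the review chain and a systematic unified abstraction are handed to the #499 consensus.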



* #500: design-consensus orchestration redesign: phase9-router is the only dispatch owner; remove wakeup-runner's duplicate dispatch

Lands the #500 r4 consensus (structural framing, no-new-schema):
- phase9-router becomes the only design-consensus worker intent owner and adds the
  META_RESOLVED:re-design → marker.round+1 three-way solver route
- removes wakeup-runner's dispatch_design_consensus (eliminating the duplicate-dispatch helper_exit:3)
- wakeup-runner keeps only closed action apply (drop→close, consensus→implement)
- dispatch_consensus_implementation moves to crnd:phase:implementing after readiness
- no new schema (reuses HARNESS_SPAWN_INTENT)

Fixes headless P0: the reflector META_RESOLVED had no consumer, and the phase9-router/wakeup-runner
duplicate dispatch caused a 0-codex deadlock. 14 files, +207/-282; verification: 471 tests all green.

The consensus was reached with the controller manually driving design-consensus r1→r4 (while the daemon was stopped).



* #501 fix r1: add a label-transition failure-branch test for dispatch_consensus_implementation

tests reviewer reject: the new label transition (gh issue edit→implementing) was tested only on the success
path; there was no failure-branch test showing that dispatch stops before worktree/render/intent when
gh issue edit returns nonzero. This fix adds that test.



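* Fix a headless deadlock: the label removal sets keep only canonical crnd:* labels; historical aliases are deleted

gh issue/pr edit --remove-label fails the whole command with rc=1 when a label does not exist in the repository.
ISSUE_LABELS_REMOVE/PR_LABELS_REMOVE included cleanup_aliases (emoji/legacy names)
that the repository does not maintain, so _move_issue_to_implementing_phase failed permanently,
consensus→implement dispatch stalled, and wakeup_runner self-drive deadlocked at 0 codex.

Per maintainer instruction, historical label compatibility is removed (historical labels are no longer handled):
the removal sets list only canonical crnd:* phase/human/stuck labels. Adds a source-regression test that locks
both sets to canonical-only with no aliases.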



* Fix a #500 regression: the phase9-router production ctx lost GH_REPO_SLUG, killing all design-consensus intake

#500 (e733b1a) changed the ctx construction in Phase9Router.__init__ from
`LoopContext.load(repo_root=repo_root)` to
`LoopContext.load(repo_root=repo_root, env={"REPO_ROOT": str(repo_root)})`.
The restricted env does not read os.environ, so the ctx of the daemon (started by sourcing host.env) had
gh_repo_slug=None → the first line of `_open_design_consensus_issues` failed closed and returned []
→ DesignConsensusIssueIntake silently never dispatched → headless could not start
design-consensus for any issue; `_require_open_source_issue` also failed without a slug (marker routes
such as the #490 redesign were stuck at phase9-source-state-unavailable).

Fix: revert to `LoopContext.load(repo_root=repo_root)`, which reads the process environment to resolve host facts.
Tests always pass an explicit ctx via `Phase9Router(ctx=...)` and do not rely on the restricted env; a new
regression test locks the production repo_root construction path resolving GH_REPO_SLUG.



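* Add STALE_REVIVAL_HOURS: a stuck implement that has not advanced for longer than the threshold is re-triggered automatically (configurable via env, default 3h)

In headless mode, a partial/failed/markerless implement log leaves the queued
dispatch-consensus-implementation spawn intent permanently stuck on the runner's
target_log_absent precondition (and dispatch_consensus_implementation, which clears the log, is itself
suppressed by pending_implement_intent) → a stuck implement does not self-heal (for example the
10h partial wedge on #421/494/493/474/498).

Adds the host-tunable STALE_REVIVAL_HOURS (default 3; non-positive/invalid values fall back to 3h). When
wakeup_plan projects an implement spawn intent, if the target implement log is redispatchable
(partial/failed/markerless/stale-base) and its mtime is past the threshold, the log is cleared →
target_log_absent passes → implement is re-dispatched automatically. Safety gates: in-flight (no terminal EXIT)
is never cleared, publish-ready (:ok+diff) is not cleared, and the age gate prevents clearing a log that just finished.

The env var is registered in the SKILL.md host env surface matrix and host.env.example; 5 behavior
tests plus a matrix↔example symmetry test cover it.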
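
The fallback rule for the threshold can be sketched as below. This is a minimal sketch, not the repository's `stale_revival_seconds()`: accepting fractional hours and rejecting NaN or infinite values are assumptions of this illustration, and the `env` parameter exists only to make the function easy to exercise.

```python
import math
import os
from typing import Mapping, Optional

DEFAULT_STALE_REVIVAL_HOURS = 3.0


def stale_revival_seconds(env: Optional[Mapping[str, str]] = None) -> float:
    """Read STALE_REVIVAL_HOURS and fall back to 3h on missing or bad values."""
    source = os.environ if env is None else env
    raw = source.get("STALE_REVIVAL_HOURS", "")
    try:
        hours = float(raw)
    except (TypeError, ValueError):
        hours = DEFAULT_STALE_REVIVAL_HOURS  # unset or not a number
    if not math.isfinite(hours) or hours <= 0:
        hours = DEFAULT_STALE_REVIVAL_HOURS  # non-positive, NaN, or infinite
    return hours * 3600.0


def log_is_stale(mtime: float, now: float, threshold_seconds: float) -> bool:
    """Age gate: only logs older than the threshold are revival candidates."""
    return (now - mtime) > threshold_seconds
```

The age gate is what keeps a log that finished a moment ago from being cleared, since `log_is_stale` stays false until the full threshold has elapsed.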



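* Extend stale-revival to dead codex: in_flight with no EXIT, older than the threshold, and absent from the live list is also re-triggered

The most common headless wedge is a codex/supervisor dying midway (for example truncated when the daemon is killed):
the log stays in_flight forever (no terminal EXIT) → classify always returns in_flight → never redispatched.
Previously only redispatchable (partial/failed/markerless) logs were cleared, missing this half.

Now, when the log mtime is past the threshold, _revive_stale_redispatchable_implement_log first clears it if
redispatchable; otherwise, if classify returns in_flight and the monitor's live list has no
live process for that log, it is judged a dead worker and cleared. Safety: the spawn-codex no-output stall window
is far shorter than the default 3h, so a log still in_flight past the threshold must belong to a dead supervisor;
the monitor liveness check adds a second safeguard so a live codex is never killed. Adds 3 behavior tests
(dead codex revived / fresh untouched / live process untouched).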



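* Add a manual trigger: consensus-rnd-cli revive-implements (re-trigger stuck implements immediately, without waiting for the threshold)

_revive_stale_redispatchable_implement_log gains a force parameter: it skips the stale_revival_seconds()
age gate, but under force an in_flight log is cleared only after the monitor liveness check proves there is no
live process (a running codex is never killed). Adds force_revive_stuck_implements(), which scans every
implement-issue-*.log and force-revives it, and the new CLI command revive-implements (authority delete-log).
The auto path with force=False is unchanged. Adds 4 behavior tests.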



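* Fix an env leak: the test suite isolates the ambient host.env at import time (running TEST_CMD in a worktree no longer fails spuriously)

When an implement worker spawned by the daemon runs the full TEST_CMD in a worktree, the inherited relative
CONSENSUS_RND_HOST_ENV=.config/consensus-rnd/host.env leaks into temp-repo
LoopContext.load(repo_root=tmp) → resolves to the nonexistent tmp/.config/... → spurious
LoopContextError (measured: concurrency_monitor 48 + restart_daemons 23 + … ≈151 errors)
→ the full suite never goes green → implement is always :partial → it never reaches a PR.

unittest discover imports every test_*.py during collection before running anything; the new
test_aaa_host_env_isolation.py (first alphabetically) pops the host-injected variables at import time,
so the whole suite is isolated before any test runs. Verification: a full discover run with the locator set →
exit 0, all green (previously 151 errors under the same conditions); targeted before/after runs show 48/23/1
errors → OK. Test-only module; production/daemon code is untouched.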



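* Fix the markerless keystone: solver/judge markers gain a companion-artifact fallback

A solver/judge codex sometimes exits with EXIT=0 but leaves the SOLVER_DONE/META_JUDGE_DONE marker only in the
run artifact and not at the end of the log, so phase9-router judges the triplet incomplete, never dispatches the
judge, and the design-consensus round stalls (measured on #505 r1 structural: 214 embedded occurrences, 0 at line start).

Extends the existing implement artifact-marker fallback to solver/judge: when a clean-exit log
has no standalone marker at its end, read it from the companion run artifact runs/<log-stem>.md as a fallback.
Purely additive (behavior is unchanged when the log has a marker), prefix-scoped, with an exact filename regex,
and reusing the safe standalone-marker extraction (embedded/echoed markers are not trusted). Consensus is not
weakened (3 marked solvers + judge are still required; only the marker source becomes more robust).
phase9/router.py + wakeup_plan.py + behavior + source-regression tests; full suite: 1367 green.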
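
The log-first, artifact-second lookup can be sketched as below. This is an illustration under assumptions, not the router's extraction code: the function names, the 20-line tail window, and the example marker text are hypothetical, and "standalone" is modeled as "the line starts with the marker prefix".

```python
from typing import Optional


def standalone_marker(text: str, prefix: str, tail_lines: int = 20) -> Optional[str]:
    """Return the last line in the tail that starts with prefix, else None.

    Markers that merely appear inside a longer line (echoed or quoted) are
    ignored, because only a line-start occurrence is treated as trustworthy.
    """
    for line in reversed(text.splitlines()[-tail_lines:]):
        if line.startswith(prefix):
            return line.strip()
    return None


def resolve_marker(log_text: str, artifact_text: Optional[str], prefix: str,
                   clean_exit: bool) -> Optional[str]:
    """Prefer the log marker; fall back to the artifact only on a clean exit."""
    marker = standalone_marker(log_text, prefix)
    if marker is not None:
        return marker  # additive: a log marker keeps the old behavior
    if not clean_exit or artifact_text is None:
        return None    # no fallback for failed runs or missing artifacts
    return standalone_marker(artifact_text, prefix)
```

Restricting the fallback to clean exits keeps a failed run from being promoted by a marker left behind in an older artifact.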



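* Fix publish stale-base having no recovery + stop :ok churn: let implementing produce a draft PR

When implement is :ok but the worktree base is behind origin/$INTEGRATION_BRANCH, publish used to hard-block
(publish_implementation_stale_base) and never recover; stale-revival in turn treated the stale-base
:ok implement as redispatchable and kept re-dispatching the same stale worktree → endless churn that filled the
floor, with no formal PR for a long time (implement→first PR could never open).

Now: before publish, merge origin/$INTEGRATION_BRANCH in the worktree to recover the base (on conflict,
merge --abort and hand off to the existing resolver/re-dispatch with an explicit reason); a clean `:ok` is no
longer churned by stale-revival (a terminal success takes publish-with-recovery and is not rerun); an :ok
without an open PR creates a draft PR (draft before consensus, per #300) + Closes #N. The controller still owns
git topology; the worker does not commit/push and does not bypass the review-gate/merge truth table.
controller_actions.py + wakeup_plan.py + wakeup_runner.py + behavior/source-regression tests; full suite:
1374 green.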



* 修 markerless implement:EXIT=0+真diff 无干净 marker 时合成 publish 候选,止 stale-revival 空重派死循环

publish build/test 闸做安全网。



* 修 publish 顺序 bug + merge 失败派 codex 兜底 resolver:先 commit worker diff 再 merge fresh base,真冲突不 wedge 改派 resolver,幂等续跑

空判据改 git diff HEAD;顺带修 wakeup_plan stale-suppression 同类 git diff 问题。



* 实现 issue #504 (#519)

* 修 headless review-fix prompt 渲染不全:绑定全部 placeholder(PR_NUMBER/标题/轮次/分支/REVIEW_*_PATH 等),解锁 reject PR 的 fix→merge

之前 headless _dispatch_review_fix 只绑 FIX_OUTPUT_PATH,其余 ${} 未解析致 fix codex 读不到 reject 证据。



* 实现 issue #496 (#520)

* 修 review-fix codex 派到 main 的 cd bug:改在 PR worktree 跑,worktree 缺失 fail-closed

之前 _dispatch_review_fix cd=repo_root → fix 改 main(污染 daemon checkout)且不落 PR 分支(reject PR 永不 merge)。



* 实现 issue #508 (#523)

* 修 wakeup-plan action 排序:review_gate/publish 排到新工作 spawn 前,解 spawn-budget 饿死→reject PR 可被 fix→merge



* CLAUDE.md 加宪法工程规则:异常必抛出+记可诊断日志严禁吞掉/静默;测试必断言真实行为禁无意义测试(maintainer 指令)



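* Implement issue #503 (#521)

* Add general engineering ground rules, clarifying object-oriented design requirements and implementation details

* Implement issue #514 (#524)

* Refine the general object-oriented design guidelines, adding core principles and implementation-level details

* Implement issue #494 (#526)

* Implement issue #517 (#528)

* Implement issue #505 (#510)

* Implement issue #505

* #510 fix r1: add phase9-router behavior test coverage (addresses the tests reviewer reject)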



---------



* 修复 headless FIX_DONE 不提交 fix 输出 gap:wakeup-runner 在 FIX_DONE re-review 前先 commit+push fix worktree

headless 下 fix codex 产出 FIX_DONE 但 worker 从不 commit,FIX_DONE 直接路由 dispatch_reviewers,
fix 输出永远留在 worktree 未提交,reviewers 永远审旧 head,reject 不收敛,阻断所有需 fix 轮的 PR
review→merge。镜像 interactive controller:FIX_DONE 源的 dispatch_reviewers 前,若 fix worktree
dirty 则 git add+commit+safe_push,clean 则 no-op。附 2 个 behavior test(dirty 提交/clean 跳过)。



* Implement issue #491 (#531)

* Implement issue #490 (#529)

* Implement issue #490

* Implement issue #490

* PR #529 review-fix output

* Implement issue #499 (#512)

* Implement issue #499

* PR #512 review-fix output

* Implement issue #474 (#532)

* Implement issue #474

* Implement issue #474

* Implement issue #474

* Implement issue #474

* PR #532 review-fix output

* Implement issue #476 (#518)

* Implement issue #476

* PR #518 review-fix output

* Implement issue #509 (#527)

* Implement issue #509

* Implement issue #509

* spawn-claim recycle 改用 artifact-aware terminal marker:companion log 被 24h retention 删除后,已完成任务的 claim 仍可回收(读 runs/ artifact),不再永久阻断重派;running 任务无终止 marker 仍不回收(防 double-spawn)

* 实现 issue #421:发行前 clean-room 集成测试制度 (#534)

* 实现 issue #421

* rebase #421 onto fresh integration:解决 test_cli_command_router 冲突 + 校验

* 实现 issue #493:测试纪律修宪(behavior-first) (#535)

* 实现 issue #493

* rebase #493 onto fresh integration:解决冲突 + 校验

* PR #535 review-fix r1:addressed tests+quality reject(behavior-first test 调整)

* 实现 issue #506 (#536)

* 实现 issue #506

* 实现 issue #506

* 实现 issue #506

* 实现 issue #506

* PR #536 review-fix output

* churn 根治:wakeup-plan 给 consensus-impl spawn-intent 套 open_closing_pr/terminal/in-flight 抑制(对齐 dispatch_consensus_implementation),已有 PR 的 issue 不再被重派 implement

* 实现 issue #507 (#525)

* 实现 issue #507

* 实现 issue #507

* PR #525 review-fix output

* 实现 issue #516 (#530)

* 实现 issue #516

* 实现 issue #516

* #530 fix r1:补 ManagedWorkSnapshot fetch-failure stale-cache fallback 行为测试(应对 tests reviewer reject)

* PR #530 review-fix output

* PR #530 review-fix r2:typed ManagedWorkItem projection(addresses architect 类型边界 reject + tests + contract-tests)

* PR #530 review-fix r3:anchor ManagedWorkSnapshot read surface in authorization mirror + cross-artifact source-regression(addresses architect governance reject)

* PR #530 review-fix r4:add canonical+legacy managed-label discovery regression test(addresses tests reject)

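* Fix the worker marker reader misclassifying identical repeated markers as a conflict

codex often echoes the terminal marker 2-3 times (including inside diffs and prefixed lines), so
read_worker_terminal_marker returned duplicate_or_conflicting, the review truth table
never converged, and every headless review→merge stalled. Identical repeated markers are now collapsed into
one valid marker; duplicate_or_conflicting is returned only when 2+ distinct marker values exist.
This covers _last_final_marker / _sentinel_adjacent_marker /
_marker_from_companion_artifact, and preserves malformed and foreign-prefix detection.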
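
The collapse rule can be sketched as follows. This is not the real read_worker_terminal_marker: the regex, the "missing" status, and the function name read_terminal_marker are assumptions for illustration, and only standalone line-start markers are considered.

```python
import re

# Assumption: a terminal marker is a whole line such as
# "IMPLEMENT_DONE:issue-421:ok"; the real grammar may be stricter.
MARKER_RE = re.compile(r"^(?P<marker>[A-Z_]+_DONE:\S+)\s*$")


def read_terminal_marker(log_text):
    """Return (status, marker) for the standalone markers in a log."""
    values = []
    for line in log_text.splitlines():
        match = MARKER_RE.match(line)
        if match:
            values.append(match.group("marker"))
    distinct = set(values)
    if not distinct:
        return ("missing", None)
    if len(distinct) > 1:
        # Only genuinely different marker values count as a conflict.
        return ("duplicate_or_conflicting", None)
    # Identical repeats collapse into one valid marker.
    return ("ok", values[-1])
```

Comparing the set of distinct values, rather than the count of occurrences, is the whole change: an echoed marker no longer looks different from a marker emitted once.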



* Reserve implementation PR for issue #537 (#540)

⟦AI:AUTO-LOOP⟧

* 实现 issue #511 (#522)

* 实现 issue #511

* PR #522 review-fix output

* PR #522 review-fix output

* 移除误入 PR 的 .refactor-loop 运行 artifact(architect r8 reject)

resolve-pr522-merge.md 是一次性合并解决 summary,属生成物,不应进 durable diff
(CLAUDE.md「生成物不当事实源」)。解决说明保留在 commit message / PR body。



---------



* 修复早期 PR reservation 崩溃(#511 回归,阻断所有 implement 派发)

_reserve_implementation_pr 用 worker 富正文路径开 reservation PR,但该文件在
reservation 时尚不存在 → FileNotFoundError(未被 except RuntimeError 捕获)→
wakeup-runner tick 崩溃 → 任何 consensus issue 无法 implement → loop P0 wedge。
修复:reservation 前若 body 缺失则写占位正文(## issue #N 实现 + Closes + sentinel,
publish 会拒占位逼 worker 产真正文);reservation 幂等(已有 open PR 直接返回;
已有 remote head 则 reset 到 integration + force-with-lease push);捕获 OSError。



* Reserve implementation PR for issue #543 (#544)

⟦AI:AUTO-LOOP⟧

* Reserve implementation PR for issue #541 (#546)

⟦AI:AUTO-LOOP⟧

* revert early-PR reservation(#507):不再开空占位 PR,publish 出真内容才开 PR

dispatch_consensus_implementation 不再 reserve/开 PR;publish_implementation_output
恢复 pre-#507 行为——implement 出非空 diff 才 commit+push+open_pr_with_label(用 worker
真 title/body,验证非占位);空 diff 不开 PR、不 merge、不关 issue。删除占位机制
(_reserve_implementation_pr / _placeholder_*),改 wakeup_plan/runner 的 early_pr_missing 投影。
根因:空占位 PR 被 reviewer 在 implement publish 前空合,假性关闭 #537/#541/#543。
全量套件 1496 passed。



* 实现 issue #542:删除 redispatch payload 冗余 head_sha (#547)

移除 review redispatch 的重复 head_sha payload

* hotfix: stale consensus-implement intent 指向已删 worktree 不再 suppress fresh re-dispatch

consensus_implementation_suppressed_reason 的 pending_implement_intent 分支增加 worktree 存在性
守卫(接入既有 _canonical_consensus_worktree_exists):仅当 stale pending intent 的 canonical
worktree 仍存在时才 suppress。revert early-PR reservation(#507)后,#537/#541/#543 的 pending
implement intent 指向已删除 worktree,旧逻辑 suppress fresh re-dispatch,而 wakeup-runner 不断把
stale intent spawn 进死 cd 致 os error 2 / EXIT=1 死循环,阻断 consensus→implement 自主链路。
现 worktree 缺失即放行 fresh dispatch 重建 worktree。

补 test_wakeup_plan(worktree 缺失不 suppress / worktree 存在仍 suppress)+ 更新
test_controller_actions 锁旧行为的 subTest(pending case 建 canonical worktree 以保留
reject-before-fresh_safe_worktree 意图)。全套件 1557 passed。



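* hotfix: a spawn-claim lock whose holder pid is dead can be recycled (self-healing, prevents a permanent HELD after a cleared log)

_existing_claim_is_recyclable gains a pid-liveness fallback: when the holder pid recorded in the lock is dead
(os.kill(pid, 0) raises ProcessLookupError), the claim can be recycled. The existing terminal-marker recycling and live-spawn protection
(pid alive + no marker is still not recycled) and the metadata-mismatch fail-closed are preserved. Root cause: a failed implement (EXIT=1)
leaves a stale lock; dispatch_consensus_implementation clears the log on re-dispatch; #490 spawn-claim only recycles when the log has
EXIT=; a cleared log has no marker; so the dead process's lock cannot be recycled and SPAWN_CLAIM_HELD deadlocks permanently,
blocking consensus→implement re-dispatch (#537/#541/#543). Adds 7 behavior tests. Full suite: 1560 passed.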
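
A minimal POSIX-only sketch of the pid-liveness fallback follows. The helper names pid_is_alive and claim_is_recyclable are hypothetical, and the real _existing_claim_is_recyclable also checks lock metadata, which is omitted here.

```python
import os


def pid_is_alive(pid):
    """Probe a pid with signal 0 (POSIX only; no signal is delivered)."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process: the holder is dead
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def claim_is_recyclable(holder_pid, has_terminal_marker):
    """Recycle on a terminal marker, or when the holder pid is dead."""
    return has_terminal_marker or not pid_is_alive(holder_pid)
```

A live holder without a terminal marker stays protected, which matches the double-spawn guard the commit says it keeps.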



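* hotfix: publish_implementation_output accepts an implementation diff the worker has already committed

_require_publish_implementation_diff previously checked only git diff HEAD --quiet (uncommitted changes);
once a worker committed on its own (on its own branch, isolated and harmless), the working tree was clean and was misclassified as no_diff, so no PR was opened.
A diff now counts as publishable when there are uncommitted changes OR HEAD has a committed delta relative to the integration base (merge-base..HEAD non-empty);
_commit_publish_implementation_diff skips the commit and returns success when the working tree is already clean (the worker committed),
instead of misreporting publish_commit_failed. Fixes the #541-class worker-commit→no_diff→no PR +
repeated daemon dispatch. Adds a behavior test. The codex full suite passes with DISCOVER_STATUS:0.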



* 实现 issue #437: .refactor-loop 历史生成物与陈旧 worktree 的有界清理机制 (#548)

* #437: 新增 canonical RuntimeRetention owner,log-retention 降为一版 alias;清理覆盖 generated files+pending-events compaction+stale worktree;CLAUDE.md 加 #437 narrow carveout

* PR #548 review-fix output

* PR #548 review-fix output

* PR #548 review-fix output

* PR #548 review-fix output

* PR #548 review-fix output

* hotfix: CONFLICTING managed review PR 的 stale-base 冲突自主 resolve

接线既有但孤儿的 prompts/rebase-resolve.md 进 headless 路径:managed review PR
mergeable=CONFLICTING(base 落后)时,controller 在 PR worktree 起 git merge
--no-commit --no-ff origin/<integration>——干净则直接 commit+push,有冲突则派
rebase-resolve codex(隔离 worktree 内解+stage,禁 commit/push),收到
REBASE_RESOLVE_DONE 后 controller commit + safe_push 仅推 PR head。结果再过 CI+
review-gate 把关。新增 dispatch_pr_rebase_resolve / commit_push_resolved_pr_rebase
(controller_actions)+ wakeup_plan 投影 + wakeup_runner 应用。

安全:只动 refactor/iter<N>-* PR 分支,显式 guard 拒碰 integration/review_base,
diff-filter=U 验未解冲突才 commit,任何 abort 路径 git merge --abort 留干净 worktree,
幂等(已含 base / 解析在飞则 NOOP)。修 stale-base churn 致 loop 永久停滞(一个 PR
合了其它全 CONFLICTING 无人解)的根因。codex 全套件 1589 passed。



* hotfix: rebase_resolve_actions 投影前 enrich live mergeability(修生产 gap)

自主-resolve 投影读 item.mergeable 前未 enrich,但生产 GhItem 来自 snapshot
mergeable 为空 → 永远 skip → CONFLICTING PR 不投影 dispatch_pr_rebase_resolve →
loop 仍停滞。复用同文件 _with_live_mergeability(review_evidence_redispatch_actions
已用同模式)。补 mergeable-为空-需-live-fetch 的回归测试(原测试 mock 了 mergeable
未覆盖此生产路径)。



* hotfix: dispatch_pr_rebase_resolve kind 进 EXECUTABLE_ACTION_KINDS(end-to-end 可执行)

_close_projection_action 的 else 分支默认 status_only=True;rebase action 的 kind
stale-base-conflicting-pr 不在 EXECUTABLE_ACTION_KINDS → 永远 status_only,runner 不
apply,CONFLICTING PR 永不自主 resolve。加该 kind(+ commit_push completion kind 如需)
进集合;controller_action 已在 RUNNER_NAMED_HELPER_ACTIONS 故随即保持 executable。补
端到端 wakeup-plan pipeline 测试(原测试只测 rebase_resolve_actions 单函数,漏了 closed-
action-projection 的 status_only 关卡)。codex 全套件 1593 passed。



* hotfix: rebase-resolve commit/dispatch 投影 gate 在实际 worktree merge 态(止 stale-marker 噪音)

commit_push_resolved_pr_rebase 之前只凭持久 REBASE_RESOLVE_DONE marker 投影,worktree
已 commit+push(#549)或 abort(#550/#551)后 marker 仍在 → 每 tick fail-closed 噪音 +
反复 0-codex(zero_streak 触警)+ stale marker 挡 #550/#551 fresh resolve。现 gate:仅当
worktree 真有 merge-in-progress(MERGE_HEAD)且无 unmerged 才投 executable commit_push;
否则不投。并让 worktree clean + 仍 CONFLICTING 的 PR 重新拿 executable dispatch_pr_rebase_resolve
(stale DONE marker 不再永久压制)。补端到端测试。codex 全套件通过。



* 实现 issue #502: 删除遗留 label 兼容层:purge emoji/legacy alias 全套(cleanup_aliases / per-spec  (#549)

* 实现 issue #502: 删除遗留 label 兼容层:purge emoji/legacy alias 全套(cleanup_aliases / per-spec

* 修复 PR #549 review r3:#502 label-purge 适配当前 label 代码(contract-tests + architect/tests reject)

fix codex round-3 applied-4:对齐 #502 遗留 label 兼容层 purge 与近期新增 label 代码,
修 contract-tests 失败 + architect/tests reject。改 labels.py + SKILL.md + test_wakeup_plan.py。



---------



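* Fix the headless review_gate ci_failed deadlock: move reject→FIX ahead of the ci/mergeability checks

_review_gate_decision previously ran the ci_error/mergeability checks before reject>0→FIX, so a PR with red CI
returned WAIT_OR_REDISPATCH:ci_failed and waited even when reviewers had rejected it; CI never turns green without a fix, hence a deadlock
(#548 and #549 both hit it and needed a manually dispatched fix). Move reject>0→FIX ahead of the ci/mergeability checks
(still after the reviewer-validity/head checks), so a reject dispatches fix autonomously even with red CI (the fix addresses both the reject and CI).
The MERGE/MERGE_WITH_COMMENTS decision is unchanged (still requires ci-green + mergeable + reject=0 + approve>=1),
so there is no risk of a wrong merge. Adds test_wakeup_runner end-to-end tests (reject+ci-red→FIX; reject=0+ci-red still WAIT;
happy path MERGE unchanged; stale-head+reject still WAIT). Full suite: 1600 passed.



* Correct the review_gate ordering: move the mergeability check ahead of reject→FIX (CONFLICTING does not run review/fix concurrently)

The previous fix (a125d19) moved reject→FIX ahead of ci+mergeability, which broke the ci_failed deadlock but also
routed CONFLICTING+reject PRs to FIX, running concurrently with the independent dispatch_pr_rebase_resolve in the same
worktree (#550 showed a 98-file mid-merge with concurrent reviewers). Move the mergeability_error check ahead of
reject→FIX: CONFLICTING→WAIT (rebase-resolve first, no concurrent review/fix); MERGEABLE +
red CI + reject→FIX (deadlock fix retained); ci_error stays after reject→FIX (WAIT only for red CI with no reject).
The MERGE decision is unchanged. test_wakeup_runner: 90 OK (plus the conflicting case).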
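
The resulting check order can be sketched as a pure function. This is an illustration, not the real _review_gate_decision: the parameter names and the reason suffixes not_mergeable and no_approval are assumptions, and mapping a reviewer comment to MERGE_WITH_COMMENTS is inferred from the approve/approve/comment example elsewhere in this log.

```python
def review_gate_decision(*, evidence_valid, mergeable, ci_green,
                         rejects, approves, comments=0):
    """Apply the checks in the order the two commits above settle on."""
    # 1. Reviewer validity / head binding comes first.
    if not evidence_valid:
        return "WAIT_OR_REDISPATCH:invalid_reviewer_evidence"
    # 2. Mergeability: a CONFLICTING PR waits for rebase-resolve.
    if mergeable != "MERGEABLE":
        return "WAIT_OR_REDISPATCH:not_mergeable"  # hypothetical reason
    # 3. A reject routes to FIX even when CI is red.
    if rejects > 0:
        return "FIX"
    # 4. Red CI with no reject waits.
    if not ci_green:
        return "WAIT_OR_REDISPATCH:ci_failed"
    # 5. Merge needs at least one approve.
    if approves >= 1:
        return "MERGE_WITH_COMMENTS" if comments > 0 else "MERGE"
    return "WAIT_OR_REDISPATCH:no_approval"  # hypothetical reason
```

Swapping steps 2 and 3 reproduces the a125d19 behavior, where a CONFLICTING PR with a reject was routed to FIX.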



* 实现 issue #498: headless daemon fail-closed 点的 codex 兜底裁决/生成机制(narrow allowlist,非通用 es (#550)

* 实现 issue #498

* 实现 issue #498: headless daemon fail-closed 点的 codex 兜底裁决/生成机制(narrow allowlist,非通用 es

* PR #550 review-fix output

* 新增 patrol-inspector 巡检 issue intake (#551)

* 实现 issue #541

* PR #551 review-fix output

* PR #551 review-fix output

* 修复 PR #551 contract-tests:对齐 patrol 第8 daemon + safe_push/sync_dev 测试 host.env

#551 patrol 新增 patrol_inspector_daemon(第8 daemon)→ test_anti_stop_restart 期望 7→8;
test_controller_lib_safe_push / test_sync_dev 补 CONSENSUS_RND_HOST_ENV host.env 设置
(代码按 host-config contract 需 host.env locator,rebase-resolve 丢了 base 新测试版本)。
断言未削弱(daemon 数断言强化为8)。full discovery 1634 OK + sshx 13 OK。



* 修复 PR #551 contract-tests(clean-env):e2e fixture 设 META_ESCALATION_STUCK_HOURS 隔离 de-dup 测试

test_batch_and_per_task_lock_do_not_duplicate_spawn 在 clean env(CI)下失败:#506 meta-
escalation 因 fixture mock 态被判 stuck 而 fire,产出 repository-stalled RunnerResult 污染
de-dup 断言(expected [] got [RunnerResult])。fixture host.env+env 设 META_ESCALATION_STUCK_
HOURS=999999 禁该测试的 meta-escalation,聚焦 de-dup 行为。断言未削弱。clean discovery 全过。



---------



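* Fix the publish gap: implement readiness uses git status --porcelain to detect staged diffs

classify_implement_attempt and the wakeup_runner pre-publish check used git diff --quiet, which only detects unstaged
changes; after a worker correctly ran git add -A, the staged diff was misclassified as empty_scoped_diff, so a completed implement
(e.g. #553: clean EXIT=0 + an 885-line real staged diff + valid title/body artifacts) could never produce a PR,
and wakeup_runner kept idly re-dispatching the children. git status --porcelain is now checked first: any staged,
unstaged, or untracked change counts as publish-ready, and only a truly clean tree falls back to git diff. The publish-time PR creation
path is unchanged (it does not repeat the #507 early-PR empty-commit/placeholder deadlock).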
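
A minimal sketch of the porcelain-based check is shown below. The function name worktree_has_changes and the injectable run parameter are hypothetical; they are not the signature of classify_implement_attempt.

```python
import subprocess


def worktree_has_changes(worktree, run=subprocess.run):
    """True when the worktree has staged, unstaged, or untracked changes."""
    proc = run(
        ["git", "status", "--porcelain"],
        cwd=worktree,
        capture_output=True,
        text=True,
        check=True,
    )
    # --porcelain prints one line per changed path ("A  file" for a staged
    # add, " M file" for an unstaged edit, "?? file" for untracked), and
    # prints nothing for a clean tree.
    return any(line.strip() for line in proc.stdout.splitlines())
```

Unlike plain git diff --quiet, which compares only the working tree against the index, this reports changes that have already been staged with git add -A.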



* 实现 issue #552

* PR #558 remote-ci-fix output

* 移除 #552 越界加的孤立 flaky phase9-router 测试

fb2f461(#552 实现)给未改动的 phase9 router 加了 test_phase9_router_tick_summary_is_
router_local_diagnostic,与 #552「收窄 restart daemon 权威」scope 无关,且该测试 call
self.router.tick() 不 mock managed-work-snapshot fetch → CI 无 GitHub 时 fetch-failed
诊断行污染输出致 anchored regex 不匹配(flaky),挡 CI。main 从无此测试。按删除优先+scope
收口移除;若 router 诊断测试有价值应在 phase9-focused PR 以 hermetic 形式正式加。



---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
loning added a commit that referenced this pull request Jun 6, 2026
* #471 hotfix: wakeup_plan completed-marker 投影按 open-managed 集过滤,closed/merged target 不投影(止血 graphql:消除 wakeup_runner 对已终结 target 的逐条重校验)

* #410: SKILL/prompt 注释改引 durable contract anchor,不引用一次性 .refactor-loop/runs/*;新增 test_generated_artifacts_not_authority 窄 source-regression (#472)

* #471 hotfix2: wakeup-plan 同一 target 只投影最新有效 marker,旧轮次不再投影 dispatchable action(止 graphql backoff 最后一刀)

* skill: 强化 no-gap 规则 — 0 codex + active work 无观察模式豁免,必须当 turn 立即介入派 codex(+source-regression)

* #470: direct-post prompt 渲染时 fixed-token 内联 _github-post-rules,host worktree cwd 可达(避免裸相对路径) (#479)

* #478 hotfix: daemon spawn 改 launch-and-forget 独立 supervisor(start_new_session 不 wait),daemon reload 不丢 in-flight worker;daemon 每 tick 清晰状态行

* #471: wakeup-plan latest-open-fresh 投影 — completed-marker open-target/latest + release-rollup 仅 latest-per-sha 且 ahead>0 本地 ref 验证(止 stale 投影/graphql) (#482)

* #481: consensus->implement readiness-dedup gate(止 daemon 重派 closed/has-PR/in-flight) (#486)

* #481: consensus->implement readiness-dedup gate, closed/closing-PR/remote-branch/in-flight target 不投影 dispatchable, 止重复/race

* #481 fix r1: dedup gate 复用 concurrency monitor owner surface 测量 in-flight,不重实现 daemon 算法(addr architect)

* #481 fix r2: 补 controller helper guard behavior test(addr tests)

* #485: consensus->implement 按 scope_paths 冲突分组串行(同组一个 executable,其余 status_only),复用 batching-heuristics,防并行冲突 PR (#489)

* #475: wakeup-runner WakeupApplyBudget — spawn-worker action 按 hard_gate.dispatch_required 批量到 floor deficit(lifecycle action 仍 1/tick),修 1/tick 线性恢复 (#492)

* hotfix(#475 ext): batch budget 纳入 dispatch_design_consensus —— solver triplet 一 tick 派齐(非 lifecycle,worker 派发),修 design-consensus 仍 1/tick 低并发

* hotfix: headless spawn-intent launch — skipped duplicate 不再 break 整批 spawn,stale applied ledger(log 缺失)可重试,helper_exit:3 加来源诊断事件;修 248 intent 堆积/0 并发

* hotfix r2: blocked lifecycle action 不再 dead-stop 整 tick — 跳过/抑制 stale blocked(close-drop target OPEN / publish 无 verified head)继续 launch 后续 spawn-batch,使 headless 并发拉到 floor

* hotfix r3: 根治 headless 0 并发 — detached spawn-codex launch 显式传 env_for_subprocess(修 #478 decouple 丢 env 致 EXIT=127);失败 log(非0 EXIT)可重试 launch,EXIT=0/未完成仍抑制重复;tick 状态汇总后续 launch 不被首个 blocked 掩盖

* hotfix r4: run_once 通用 blocked-not-dead-stop — 任何 blocked/skipped non-spawn action 不再 dead-stop 整 tick,继续 launch 后续 spawn-batch;wakeup-plan 抑制 precondition 注定失败的 stale action;根治 headless 并发=0

* hotfix r5: wakeup-plan 投影侧 suppress stale close_managed_drop(target 实际 OPEN 的 drop-close 注定失败)为 status_only,不再占 tick 挡住后续 design-consensus spawn launch

* hotfix r6: wakeup-runner run_once 不再把 no-intents 的 dispatch_design_consensus(helper_exit:3)误判为 codex launch failure dead-stop 整 tick

根因:dispatch_design_consensus 是 spawn-batch action;当 solver triplet 不完整(如 #496 minimal/structural EXIT=0 但未 emit SOLVER_DONE marker)时 helper 返回 exit 3(NO_INTENTS),_spawn_launch_failure 命中 "helper_exit:" 前缀判 True → run_once break,整 tick dead-stop,后续 reviewers/其他 design-consensus/implement 全部不派 → headless 并发=0。

修法:run_once 中 blocked 的 spawn-batch action 若 controller_action != spawn_codex_harness_background(即 dispatch_design_consensus),按 routing no-op skip-and-continue,继续 launch 后续 spawn-batch;仅真正 spawn_codex_harness_background launch failure 才走既有 retry/break。延续 r4 blocked-not-dead-stop 不变量到 design-consensus spawn-batch action。

behavior test: test_wakeup_runner_design_consensus_no_intents_does_not_dead_stop_later_spawn_batch(无修复时 later spawn 不 launch 而 FAIL)。

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* hotfix r7: review head_sha binding — 打通 headless review→merge 自驱

根因:reviewer artifact frontmatter 不带 head_sha,_review_evidence_from_artifact 取不到所审 head SHA → 所有 reviewing PR(#480/#473/#402)恒判 WAIT_OR_REDISPATCH:invalid_reviewer_evidence:missing_reviewed_head_sha,且 wakeup-plan 不自动重派 reviewer → review→merge headless 永久卡死(挡住发版)。

修法(codex hotfix worker 实现,controller 验证后提交):
- controller_actions 派 reviewer 时取 headRefOid,缺则 fail-closed,注入权威 HEAD_SHA 进 rendered prompt。
- reviewer-{architect,tests,quality}.md frontmatter 写 head_sha: ${HEAD_SHA}。
- wakeup_runner._review_head_sha_for:artifact 缺 head_sha 时从 controller-rendered prompt/log 回收(不依赖 worker 自报,避免 markerless 覆辙);live-head 陈旧判定与 merge 真值表语义不变。
- wakeup_plan 为 missing/stale reviewer head 的 OPEN reviewing PR 投射可执行 dispatch_reviewers 自愈动作。

验证:controller 独立重跑 test_wakeup_runner+review_gate+wakeup_plan+controller_actions 293 通过、marker/prompt 合同 30 通过;review-gate 安全语义(stale head 仍不 merge)由 13 项 review-gate 测试锁住。

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* hotfix r8: 让 review-evidence-redispatch 自愈动作可执行 — 打通 headless review→merge 闭环

承接 r7:r7 让 reviewer 证据绑定 head_sha,但 wakeup_plan 投射的 review-evidence-redispatch 自愈动作被 closed-action-projection finalizer 强制 status_only(其 kind 不在 EXECUTABLE_ACTION_KINDS),runner 永不应用 → #480/#473/#402 缺 head 的 review 证据无法自动重派 → review_gate 恒 WAIT_OR_REDISPATCH → 并发耗干到 0。

修法(codex hotfix worker 实现,controller 验证后提交):
- wakeup_plan EXECUTABLE_ACTION_KINDS 加 review-evidence-redispatch,finalizer 保留 runner authority 不再强制 status_only。
- controller_actions.dispatch_reviewers 读 stale_review_roles 只重派陈旧角色,且 _pending_review_spawn_exists 跳过已有在飞 dispatch-reviewers:<pr>:<role>:r1 intent(幂等不双派);仍走 r7 的 HEAD_SHA 注入,使重派 reviewer 的新证据 head-bound。
- merge-gate 真值表不变,仅让既有自愈动作可执行 + 定向到陈旧角色。

验证:controller 独立重跑 test_wakeup_plan+wakeup_runner+review_gate+controller_actions 295 通过。

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* hotfix r9: reviewer 重派 bump 到下一轮 + 绝对 cd + latest-evidence-per-role — 真正打通 headless review→merge

根因(同一类:按"产物存在"而非"成功+区分维度"抑制):reviewer 重派硬编码回 r1,撞上旧 head 早先成功(EXIT=0)的 r1 log → _harness_spawn_intent_log_suppresses_retry 判"做过了"抑制 → 当前 head 该重跑的 review 永不跑 → review_gate 恒 WAIT_OR_REDISPATCH → 并发 0。且 intent cd 是相对 "." → headless 启动层不投射。

最小修法(按 maintainer directive:文件名即唯一 id,不加 wrapper --id,留在现有 r<R> 命名约定内不改公开命名):
- controller_actions.dispatch_reviewers:_next_review_round 取该 PR/role 现有最大轮 +1,渲染 r<N+1> 的 prompt/log/output/intent;pending-spawn 幂等 guard 按真实轮号;_append_harness_spawn_intent 的 cd 改 str(cd.resolve()) 绝对路径,使 headless 投射+启动成立。
- wakeup_plan.latest_reviewer_heads / wakeup_runner._latest_review_evidence_by_role:改为"每 role 取各自最新轮证据"(而非全局最新单轮),使只重派陈旧 role 时不丢其它 role 已 head-bound 的有效证据;merge 仍要求每 role 证据绑定 live head,安全语义不变。

验证:controller 独立重跑 test_wakeup_plan+wakeup_runner+review_gate+controller_actions 299 通过。

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* hotfix r10: review_gate action 带上 reviewed head_sha + e2e 锁死 headless review→decision 全路

根因(review path head_sha 串接的第 5 处):wakeup_plan 投影 REVIEW_DONE 的 review_gate action 时只读 log 的 head(log 无 head)→ action.head_sha 空 → wakeup_runner._review_gate 卡 WAIT_OR_REDISPATCH:missing_action_reviewed_head_sha,证据虽已 head-bound 但 gate 连比都没法比。

修法(最小、按文件名即 id 思路):
- wakeup_plan.completed_marker_actions:REVIEW_DONE action 的 head_sha 改由 _review_done_action_head_sha 从权威证据面解析(latest_reviewer_heads 含 r7 prompt 注入的 head;或全 role head-bound 时取 live PR head;否则回落 log)。只从真实来源填充,不绕过 gate 的 head 比对,merge 安全语义不变。
- 新增 test_review_gate_e2e.py:从真实 wakeup_plan 投影 → WakeupRunner apply,断言 FIX(有 reject)与 MERGE(全 approve+CI 绿)两条具体决策落地——端到端锁死整条 review→decision,杜绝再冒隐藏层。
- test_wakeup_plan.py:加 prompt-bound reviewed head 的 planner 回归。

验证:controller 独立重跑 test_wakeup_plan+wakeup_runner+review_gate+controller_actions+review_gate_e2e 302 通过。

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

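* hotfix r11: PrChecksProjection adds a bounded retry for transient gh-api failures, so a ready PR reaches the review_gate decision headlessly

Root cause: the review_gate CI read, PrChecksProjection.check_pr, failed closed immediately (ci_unavailable:pull_api_failure) on a single non-zero return from gh api pulls/<n> or commits/<sha>/check-runs, with no retry. The rate limit was healthy and these were transient blips, so #473/#402 (MERGEABLE + green CI) were blocked by a transient failure on every tick and never reached MERGE/FIX.

Fix (minimal, touching only PrChecksProjection; #322 ReleaseRequiredChecksProjection is untouched):
- _run_api_read: both api reads are retried deterministically up to 3 times (no sleep), stopping on the first success; persistent failure still fails closed (safe). No behavior change on the happy path.
- starvation conclusion (verified by the worker): review_gate was already being applied, and blocked ci_unavailable rows can be re-evaluated (they are not terminal duplicate-suppressed), so run_once needs no change.
- e2e: test_review_gate_e2e adds a "first pulls read fails transiently, still merges" case; test_pr_checks adds retry coverage.

Verification: the controller independently re-ran test_pr_checks+wakeup_runner+review_gate+review_gate_e2e+wakeup_plan+controller_actions: 309 passed.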
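
The bounded retry can be sketched as below. The name run_api_read, the (returncode, stdout) tuple shape, and reading "up to 3 times" as 3 total attempts are all assumptions; the real _run_api_read may count differently.

```python
def run_api_read(read_once, max_attempts=3):
    """Call read_once until it succeeds or the attempt budget is spent.

    read_once returns (returncode, stdout). There is no sleep between
    attempts, so the loop is deterministic and bounded.
    """
    result = None
    for _ in range(max_attempts):
        result = read_once()
        if result[0] == 0:
            return result  # first success stops the loop
    return result  # persistent failure: the caller still fails closed
```

Returning the last failing result unchanged keeps the fail-closed path intact: the caller sees the same non-zero return it would have seen without the retry.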

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* hotfix r12: run_once no longer lets the spawn-batch starve drain-only lifecycle actions, so ready PRs merge headlessly and the backlog can drain

Root cause (verified): on every tick wakeup_plan orders SPAWN actions (implement intent + design-consensus #495/#496/#490/#491/#418, index 0-5) ahead of the lifecycle review_gate actions (#480/#473/#402, index 6-8); the old run_once logic `if applied_spawns>0 and not is_spawn_action: break` broke out of the whole tick at the first lifecycle action once any spawn had been dispatched. design-consensus almost always has spawn work, so review_gate/merge starved permanently: #473 (approve/approve/comment → MERGE_WITH_COMMENTS, MERGEABLE + green CI) never merged and the active count never dropped.

Fix (minimal, in the core dispatch loop):
- Introduce consumes_spawn_budget = is_spawn_action or _uses_spawn_budget(action); when over budget, `continue` (not break) so that later drain-only lifecycle actions are still reached.
- _uses_spawn_budget: dispatch_reviewers=True; review_gate=True only when the decision is FIX (which dispatches a fix worker); otherwise review_gate (MERGE/MERGE_WITH_COMMENTS), close, and other drain-only actions=False, so they consume no spawn budget and are still evaluated and executed after the floor is filled.
- Keep r6's spawn launch-failure break/retry and the dispatch_design_consensus no-intents skip-and-continue; the truth-table/head-binding/CI-green/mergeable safety semantics are unchanged.

Result: each tick both fills the floor with spawns and evaluates drain-only lifecycle actions such as merge/close, so the backlog can drain.

Verification: the controller independently re-ran test_wakeup_runner+review_gate+review_gate_e2e+wakeup_plan+controller_actions+pr_checks: 309 passed; adds test_wakeup_runner_lifecycle_review_gate_not_starved_after_spawn_batch.
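
A toy version of the loop makes the break-versus-continue difference concrete. The dict-based action shape and the simplified run_once signature are hypothetical; the real loop also handles launch failures and blocked actions.

```python
def uses_spawn_budget(action):
    """Lifecycle actions that would themselves dispatch a worker."""
    kind = action["kind"]
    if kind == "dispatch_reviewers":
        return True
    if kind == "review_gate":
        return action.get("decision") == "FIX"
    return False  # MERGE, close and other drain-only actions


def run_once(actions, spawn_budget):
    """Apply actions in plan order; return the names that were applied."""
    applied, spawns = [], 0
    for action in actions:
        consumes = action.get("is_spawn", False) or uses_spawn_budget(action)
        if consumes and spawns >= spawn_budget:
            continue  # over budget: skip this one, keep scanning
        if consumes:
            spawns += 1
        applied.append(action["name"])
    return applied
```

With a break in place of the continue, a drain-only review_gate that sits after an over-budget spawn in the plan would never be evaluated.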

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

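* hotfix r13: fix WakeupRunner._run_command injecting --repo before the gh subcommand, which made every gh call malformed (root cause of the review→merge stall)

Root cause (confirmed by measurement): _run_command inserted --repo at full[1:1] (before the subcommand) for every gh command. gh does not accept --repo as a pre-subcommand global flag, and gh api does not take --repo at all (the repo is in the URL): `gh --repo O/R api repos/O/R/pulls/473` → rc=1 "unknown flag: --repo". Every gh call from wakeup-runner was therefore malformed: the CI read (gh api pulls/check-runs via PrChecksProjection(runner=_run_command)) → ci_unavailable:pull_api_failure; mergeability (gh pr view) → mergeability_unavailable; live-state and the rest likewise. The review gate could never read CI/mergeable state, so #473 (approve/approve/comment=MERGE_WITH_COMMENTS, green), #402, and #480 never landed headlessly. The r11 retry merely repeated the same malformed command. A direct PrChecksProjection() or a direct gh call uses the plain default runner (no injection) and works, which masked the bug.

Minimal fix: _run_command injects --repo only for `gh pr`/`gh issue`, at a valid position after the subcommand (and its verb); `gh api` is left untouched; --repo is never injected pre-subcommand; an existing --repo is not duplicated.

Verification: the controller independently re-ran test_wakeup_runner+review_gate+review_gate_e2e+pr_checks+wakeup_plan+controller_actions: 312 passed; adds an argv-shape test plus an e2e (a stubbed subprocess.run rejects the malformed `gh --repo ... api`, and the ready PR still merges).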

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* hotfix r14: managed draft PR ready-then-merge — review gate 不再硬挡 pr_draft

根因(r13 修好 gh 调用后浮现):review_gate 的 _review_gate_mergeability_error 把 isDraft=True 当 WAIT_OR_REDISPATCH:pr_draft 硬挡。但本 loop draft-until-consensus:PR 开为 draft 直到 review 共识,merge_pr 在决策 MERGE/MERGE_WITH_COMMENTS 时 ready-then-merge。于是绿 gate 的 managed draft(如 #473 approve/approve/comment)永远卡 pr_draft、永不 headless 落地。

最小修法:
- wakeup_runner:删除 mergeability 检查里的 isDraft→pr_draft 硬挡(仍挡 mergeable!=MERGEABLE,即 CONFLICTING/DIRTY)。
- controller_actions.merge_pr:对 live draft PR 先校验 crnd:lifecycle:managed 再 gh pr ready(非 managed draft 不 ready,写 CONTROLLER_ACTION_BLOCKED:target-not-managed 并返回 2);保留既有 ready-then-merge。
- 真值表/head-binding/CI-green/conflicting-blocks 安全语义不变;readying 仅限 managed PR。

验证:controller 独立重跑 test_wakeup_runner+review_gate+review_gate_e2e+controller_actions+wakeup_plan+pr_checks 316 通过;新增 managed-draft merge + 真实 plan→runner→controller ready-then-merge e2e + conflicting/非managed 安全用例。

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* #409: 移除 .refactor-loop/host.env runtime fallback,CONSENSUS_RND_HOST_ENV 为唯一 host fact locator(缺失 fail-closed);迁移依赖 legacy fallback 的 test fixture (#473)

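* hotfix r15 (sshx consensus B, part 1): unify gh argv construction in a single build_gh_argv helper to prevent a recurrence of the r13-class --repo bug

Hardening step 1 of the unanimous sshx consensus B: consolidate the scattered, duplicated gh --repo handling into build_gh_argv(slug, argv) in codex_refactor_loop/gh_invoke.py. gh api never gets --repo (the repo is in the URL); gh pr/issue gets --repo <slug> only after the subcommand and only when it is missing; never pre-subcommand. The three API call sites (wakeup_runner._run_command / ControllerActions.gh / PrChecksProjection) all go through the helper, and their inline --repo heuristics are deleted. Behavior is unchanged (including the r13 fix).

Verification: characterization + source-regression tests are locked in test_gh_invoke.py (9 tests); the controller independently re-ran test_gh_invoke+pr_checks+wakeup_runner+review_gate+review_gate_e2e+controller_actions+wakeup_plan: 326 passed; running pr_checks directly remains --help compatible.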
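
The rules stated for build_gh_argv can be sketched as follows. This reuses the helper's name and (slug, argv) signature from the commit message, but the body is a guess at the behavior, not the code in gh_invoke.py; in particular, inserting directly after the verb and treating -R as an existing repo flag are assumptions.

```python
def build_gh_argv(slug, argv):
    """Return argv with --repo added only where the stated rules allow."""
    argv = list(argv)
    if len(argv) < 2 or argv[0] != "gh":
        return argv
    # gh api carries the repo in its URL path and is never touched.
    if argv[1] not in ("pr", "issue"):
        return argv
    # Do not duplicate a repo flag that the caller already supplied.
    for arg in argv:
        if arg in ("--repo", "-R") or arg.startswith("--repo="):
            return argv
    # Insert after the subcommand and its verb (e.g. "gh pr view"),
    # never before the subcommand.
    insert_at = 3 if len(argv) >= 3 else 2
    return argv[:insert_at] + ["--repo", slug] + argv[insert_at:]
```

Routing every call site through one function means the placement rule is tested once, instead of being re-derived by each caller's inline heuristic.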

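Note: this is hardening (recurrence prevention) and does not directly raise concurrency; the concurrency bottleneck is upstream in the implement→publish stage and is handled separately next.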

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* hotfix r16 (sshx 共识): 打通 implement→publish 解锁并发瓶颈 — markerless/stale implement 重派 + 安全门控发布

sshx codex-cli triplet 共识(minimal/structural/delete + meta-judge):10 个 design-solving issue 已达成共识但卡在 implement→publish(implement 早跑过却 markerless/未提交/过期 base/没 PR;dispatch_consensus_implementation 被 local_iter_branch 抑制、publish 被 verified_pr_head_unavailable 抑制)→ 无新可派工作 → 并发 ~1。

修法(wakeup_plan/wakeup_runner/controller_actions/git.py):
- success-aware 重派:implement 分支/worktree 存在但无 valid clean IMPLEMENT_DONE:ok marker(markerless/failed)或 base 过期 → 重派;fresh_safe_worktree 把 canonical worktree reset/recreate 到当前 origin/$INTEGRATION_BRANCH;仅 genuine in-flight / 已有干净 PR / 待发布 clean marker 才抑制。
- 发布门控(严防误发 stale/wrong-scope):publish_implementation_output 仅在 canonical identity(refactor/iter<issue>-<candidate> + worktree 实际分支匹配)+ fresh integration base(merge-base==origin/integration,过期 base block)+ managed open issue + 无重复 PR + 非空 scoped diff 时,commit→push→open PR→派 reviewers;否则 block。
- wakeup_plan 用 canonical_implementation_identity/single_linked_managed_issue/clean_scoped_diff 替代 verified_pr_head 抑制。

验证:controller 独立重跑 wakeup_plan+wakeup_runner+review_gate+review_gate_e2e+controller_actions+pr_checks+gh_invoke 330 通过。

已知待修(follow-up):worker 把 PR title/body/commit 文案改成英文,违反「GitHub artifact 中文」策略,下一 commit 恢复中文。

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* hotfix r17: fresh consensus-implement 重派前清除 terminal 非可发布 implement log — 让卡住的 implement 真正启动

承接 r16:r16 让 markerless/stale implement 重派(reset worktree),但旧的 EXIT=1/markerless implement log 仍占 spawn 的 target_log_absent 前置 → fresh implement 不启动(同一类「按存在抑制」,implement-spawn-log 层)。

修法(controller_actions,success-aware):fresh consensus-implementation dispatch 写 spawn intent 前,仅清除 terminal 非可发布 implement log(EXIT!=0 失败 / EXIT=0 无 IMPLEMENT_DONE:ok marker);保留 genuinely in-flight(无 EXIT)与 clean publish-ready(EXIT=0 + 干净 marker)log 不动。于是卡住的 markerless/failed implement 重派后能真正 launch、跑在 fresh base、产出干净 marker → r16 门控发布。

验证:controller 独立重跑 wakeup_plan+wakeup_runner+review_gate+review_gate_e2e+controller_actions+pr_checks+gh_invoke 333 通过。

设计说明:并发瓶颈用确定性逻辑修对(ready=clean EXIT=0+marker;parallel=scope_paths 不相交),不引入 LLM-in-loop 分发判断(daemon-first;sshx codex 三人组一致 revise:LLM 最多 shadow-mode,不入决策回路)。

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* hotfix r18: implement-lifecycle 收敛为单一 success-aware 谓词 + runner 路径自清陈旧 log — headless 自愈,不再需手动清 log

承接 r16/r17(r17 把清 log 放进被抑制的 dispatch 路径,从没执行)。新增 implement_lifecycle.py 单一事实源 classify_implement_attempt → in_flight | publish_ready | redispatch,plan/runner/controller 共用:
- runner spawn-apply 路径 _spawn_log_suppresses_retry 仅在 in_flight|publish_ready 时抑制;redispatch(EXIT!=0/markerless/stale-base) 不抑制且 _clear_redispatchable_spawn_log 在 spawn 实际应用处清掉陈旧终态 log → fresh implement 真启动(修 r17 misplacement,免手动清)。
- publish 仅 publish_ready(clean IMPLEMENT_DONE:ok + canonical identity + fresh base + 非空 scoped diff);clean-marker 但 stale-base 改 redispatch 而非死锁(#421 edge)。
- 确定性逻辑(非 LLM-in-loop;daemon-first);merge-gate/#191/#322/#396 安全语义不变。

验证:controller 独立重跑 wakeup_plan+wakeup_runner+review_gate+review_gate_e2e+controller_actions+pr_checks+gh_invoke 339 通过。

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* 修复 phase9-router triplet-evidence-invalid 误判致 headless design-consensus wedge

solver prompt 注入的 issue source snapshot 含上一轮 design-consensus 的
audit-trail(peer solver log 路径)时,_peer_solver_reference_violation
扫描整个 prompt 误判为 peer-isolation 违规,fail-closed 拒派 meta-judge,
使所有 body 含历史 consensus 记录的 issue 永久 wedge。

修复:peer-isolation 扫描只覆盖 router-controlled 区域,剥离
'## Issue source snapshot' 到 '## Full solver template' 之间的 issue
作者内容;router header / solver template 注入的真实 peer 引用仍拦截。
加两个 behavior test 覆盖 snapshot-排除 与 router-区域-仍拦截。

dogfood headless 实测:reload 后 418/437/474/490/491/493/494/495 全部
恢复 solver→judge→consensus 自驱,#494 已自驱到 implement。

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* CLAUDE.md: 加 consensus-rnd:foundational-invariants sentinel block

dogfood bootstrap 的 check-project-rules probe 要求 PROJECT_RULES 含此
fixed-point block;缺失时 probe fail-closed 阻断 headless restart。内容
为 probe 生成的 canonical FI-001~007,镜像既有设计哲学,不引入新约束。

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* 修复 markerless implement 致 publish 卡死的 headless gap

implement worker 干净退出(EXIT=0)但把 IMPLEMENT_DONE marker 只写进 run
artifact(runs/implement-issue-<id>.md)而非 log 末尾时(codex stdout marker
落点不可靠),completed_marker_actions 只扫 log marker → markerless → 不生成
publish action → design→implement→PR 链卡在 implement 后。实测 #421:EXIT=0
+ artifact 有 IMPLEMENT_DONE:issue-421:ok + 9 文件 staged,却无 PR、label 仍
停在 design-solving。

修复:对 clean-exit 的 implement-issue log,log markerless 时 fallback 读取
run artifact 末尾的 IMPLEMENT_DONE marker,复用 review verdict 已有的
artifact-first 模式。窄作用域(仅 implement-issue + clean-exit),配 2 个
behavior test;全套 wakeup_plan 138 tests green。

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

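* Complete markerless implement publish: revalidation also reads the run artifact marker

8aeea89 fixed the wakeup_plan detection side, but wakeup-runner's source-marker
revalidation (_source_log_has_clean_marker) still scanned only the log, so a
markerless implement publish was still rejected (clean_exit_marker_missing).
This completes the fix: for a clean-exit implement-issue log that is markerless,
revalidation falls back to the marker in runs/implement-issue-<id>.md,
consistent with the detection side. Includes a behavior test; the full
wakeup_runner suite of 76 tests is green.

Note: dogfood runs confirm that markerless is a systemic root cause (it also
affects readiness redispatch, review verdicts, and other places) and should be
unified systematically (see #491 logging/observability and the follow-up
marker-reading consolidation); this commit patches only the implement publish chain.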

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

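* More detailed logging: the wakeup-runner tick shows full status counts plus details of masked blocked/skipped actions

The old _wakeup_tick_action looked only at results[0], so when one tick handled
multiple actions a blocked one (such as a failed publish) was masked by a
successful spawn and completely invisible; graphql-backoff always displayed a
misleading 'remaining=unknown'. During dogfood debugging this forced a ledger
lookup every time just to learn what had actually happened.

Change: the tick log shows full status counts as [applied=N,blocked=M,...] plus
the specific reason+action of masked blocked/skipped entries (graphql-backoff,
as a whole-tick gate, is reported separately and without "unknown"); the
dispatched headline is kept. Tests are updated to reflect the improved
visibility; the full wakeup_runner suite of 76 tests is green.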

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

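* Complete the third link of markerless implement: the readiness predicate reads the run artifact marker

classify_implement_attempt (the single success-aware predicate from r18) scanned
only the log for IMPLEMENT_DONE:ok, so a markerless log (marker landed in the
artifact) → redispatch:markerless → readiness does not suppress → the redispatch
overwrites completed work (dogfood: #421 was redispatched 4 times; #493 went
IMPLEMENT_DONE:ok → redispatch → :partial and lost work).

Completion: when the log is markerless, fall back to IMPLEMENT_DONE:ok in
runs/implement-<cluster>.md, consistent with detection (8aeea89) and
revalidation (dfacc64). Narrow scope and r17/r18 compatible: only :ok is
accepted; partial, failed, and truly markerless (no artifact) logs still go to
redispatch recovery. The existing markerless-redispatch tests (304 tests across
wakeup_plan/wakeup_runner/controller_actions) are all green, and 2 behavior tests are added.

This connects all three markerless links of implement→PR (detection/revalidation/readiness);
markerless on the review chain and the systematic unifying abstraction are left to the #499 consensus.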

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* #500: design-consensus orchestration redesign: phase9-router is the sole dispatch owner; remove the wakeup-runner double dispatch

Lands the #500 r4 consensus (structural framing, no-new-schema):
- phase9-router becomes the sole owner of design-consensus worker intents and adds
  the META_RESOLVED:re-design → marker.round+1 three-way solver route
- Remove wakeup-runner's dispatch_design_consensus (eliminates the double-dispatch helper_exit:3)
- wakeup-runner keeps only closed action apply (drop→close, consensus→implement)
- dispatch_consensus_implementation moves to crnd:phase:implementing after readiness
- No new schema (reuses HARNESS_SPAWN_INTENT)

Fixes a headless P0: reflector META_RESOLVED had no consumer, and the
phase9-router/wakeup-runner double dispatch caused a 0-codex deadlock.
14 files, +207/-282; verification: all 471 tests green.

The consensus was reached by the controller manually driving design-consensus r1→r4 (while the daemon was stopped).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* #501 fix r1: add the label-transition failure-branch test for dispatch_consensus_implementation

Tests reviewer reject: the new label transition (gh issue edit→implementing) was
tested only on the success path and lacked a failure-branch test showing that
dispatch stops before worktree/render/intent when gh issue edit returns nonzero.
This fix adds that test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

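* Fix headless deadlock: the label removal set keeps only canonical crnd:* labels; historical aliases are deleted

gh issue/pr edit --remove-label fails the whole command with rc=1 when a label
does not exist in the repository. ISSUE_LABELS_REMOVE/PR_LABELS_REMOVE included
cleanup_aliases (emoji/legacy names) that the repository does not maintain, so
_move_issue_to_implementing_phase failed permanently, consensus→implement
dispatch wedged, and wakeup_runner deadlocked at 0 codex.

Per maintainer instruction, historical label compatibility is removed
(historical labels are no longer handled): the removal set lists only canonical
crnd:* phase/human/stuck labels. A new source-regression test locks both sets
to canonical-only with no aliases.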

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

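* Fix #500 regression: the phase9-router production ctx lost GH_REPO_SLUG, killing all design-consensus intake

#500 (e733b1a) changed the ctx construction in Phase9Router.__init__ from
`LoopContext.load(repo_root=repo_root)` to
`LoopContext.load(repo_root=repo_root, env={"REPO_ROOT": str(repo_root)})`.
The restricted env does not read os.environ, so the ctx of the daemon (started
via source host.env) had gh_repo_slug=None → the first line of
`_open_design_consensus_issues` failed closed and returned [] →
DesignConsensusIssueIntake silently never dispatched → headless could not start
design-consensus for any issue; `_require_open_source_issue` also failed for
lack of a slug (marker routes such as the #490 redesign were stuck at
phase9-source-state-unavailable).

Fix: revert to `LoopContext.load(repo_root=repo_root)`, which reads the process
environment to resolve host facts. Tests always pass an explicit ctx via
`Phase9Router(ctx=...)` and do not depend on the restricted env; a new
regression test locks the production repo_root construction path resolving GH_REPO_SLUG.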

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

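* Add STALE_REVIVAL_HOURS: automatically re-trigger a stuck implement that has not advanced past the threshold (env-configurable, default 3h)

In headless mode, a partial/failed/markerless implement log leaves the already
queued dispatch-consensus-implementation spawn intent permanently stuck on the
runner's target_log_absent precondition (and the log-clearing
dispatch_consensus_implementation is in turn suppressed by
pending_implement_intent) → a stuck implement never self-heals (for example the
10h partial-wedge on #421/494/493/474/498).

Adds the host-tunable STALE_REVIVAL_HOURS (default 3; non-positive or invalid
values fall back to 3h). When wakeup_plan projects an implement spawn intent
and the target implement log is redispatchable
(partial/failed/markerless/stale-base) with an mtime older than the threshold,
it clears the log → target_log_absent passes → implement is redispatched
automatically. Safety gates: in-flight logs (no terminal EXIT) are never
cleared, publish-ready logs (:ok+diff) are not cleared, and the age gate
prevents clearing a log that has only just finished.

The env var is registered in the SKILL.md host env surface matrix and
host.env.example; it is covered by 5 behavior tests plus a matrix↔example symmetry test.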

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
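
The threshold parsing described above (default 3, non-positive or invalid values fall back to 3h) might look roughly like this. Only the `STALE_REVIVAL_HOURS` name, the default, and the `stale_revival_seconds` name appear in this log; the signature, the `is_stale` helper, and the acceptance of fractional hours are assumptions.

```python
import math
from typing import Mapping

DEFAULT_STALE_REVIVAL_HOURS = 3.0


def stale_revival_seconds(env: Mapping[str, str]) -> float:
    """Parse STALE_REVIVAL_HOURS; fall back to 3h on bad or non-positive input."""
    raw = env.get("STALE_REVIVAL_HOURS", "")
    try:
        hours = float(raw)
    except ValueError:
        hours = DEFAULT_STALE_REVIVAL_HOURS
    # Reject zero, negatives, NaN, and infinity rather than trusting them.
    if not (math.isfinite(hours) and hours > 0):
        hours = DEFAULT_STALE_REVIVAL_HOURS
    return hours * 3600.0


def is_stale(log_mtime: float, now: float, env: Mapping[str, str]) -> bool:
    """Age gate: only logs strictly older than the threshold qualify."""
    return now - log_mtime > stale_revival_seconds(env)
```

Passing the environment as a mapping keeps the helper testable without touching `os.environ`; a caller would typically pass `os.environ` itself.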

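* Extend stale-revival to cover dead codex: in_flight logs with no EXIT, older than the threshold, and absent from the liveness list are also re-triggered

The most common headless wedge is a codex/supervisor dying midway (for example
cut off when the daemon is killed): the log stays in_flight forever (no
terminal EXIT) → classify always returns in_flight → never redispatched.
Previously only redispatchable (partial/failed/markerless) logs were cleared,
which missed this half.

Now, when the log mtime exceeds the threshold,
_revive_stale_redispatchable_implement_log first clears the log if it is
redispatchable; otherwise, if classify returns in_flight and the monitor
liveness list has no live process for that log, the worker is judged dead and
the log is cleared. Safety: spawn-codex's no-output stall window is far shorter
than the default 3h, so a log still in_flight past the threshold must belong to
a dead supervisor; the monitor liveness check is a second safeguard, so a live
codex is never killed. Adds 3 behavior tests (dead codex revived / fresh log
untouched / live process untouched).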

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

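* Add a manual trigger: consensus-rnd-cli revive-implements (re-trigger stuck implements immediately, without waiting for the threshold)

_revive_stale_redispatchable_implement_log gains a force parameter that skips
the stale_revival_seconds() age gate; under force, an in_flight log is still
cleared only after the monitor liveness check proves there is no live process
(a running codex is never killed). Adds force_revive_stuck_implements(), which
scans all implement-issue-*.log files and force-revives them, and the new CLI
command revive-implements (authority delete-log). The auto path keeps
force=False unchanged. Adds 4 behavior tests.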

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

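* Fix env leak: the test suite isolates the ambient host.env at import time (running TEST_CMD in a worktree no longer fails spuriously)

When a daemon-spawned implement worker runs the full TEST_CMD inside a worktree,
the inherited relative CONSENSUS_RND_HOST_ENV=.config/consensus-rnd/host.env
leaks into the temp-repo LoopContext.load(repo_root=tmp) → it resolves to
tmp/.config/..., which does not exist → spurious LoopContextError (observed:
concurrency_monitor 48 + restart_daemons 23 + … ≈151 errors) → the suite never
goes green → implement is always :partial → it never reaches a PR.

unittest discover imports every test_*.py during collection before running
anything; the new test_aaa_host_env_isolation.py (first alphabetically) pops
the host-injected variables at import time, so the whole suite is isolated
before any test runs. Verification: full discover with the locator set → exit 0,
all green (previously 151 errors under the same conditions); targeted
before/after runs show 48/23/1 errors → OK. This is a test-only module;
production and the daemon are untouched.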

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

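* Fix the markerless keystone: add a companion-artifact fallback for solver/judge markers

A solver/judge codex sometimes exits with EXIT=0 while the
SOLVER_DONE/META_JUDGE_DONE marker lands only in the run artifact and not at
the end of the log, so phase9-router judges the triplet incomplete, never
dispatches the judge, and the design-consensus round wedges (observed on #505
r1 structural: 214 embedded occurrences, 0 at line start).

Extends the existing implement artifact-marker fallback to solver/judge: when a
clean-exit log has no standalone marker at its end, fall back to the companion
run artifact runs/<log-stem>.md. Purely additive (behavior is unchanged when
the log has a marker), prefix-scoped, with an exact filename regex, and reusing
the safe standalone-marker extraction (embedded/echoed markers are not
trusted). Consensus is not weakened (3 marked solvers + judge are still
required; only the marker source becomes more robust). Touches
phase9/router.py + wakeup_plan.py + behavior and source-regression tests;
full suite 1367 green.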

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

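* Fix publish stale-base having no recovery and stop :ok churn: let implementing produce a draft PR

When implement is :ok but the worktree base is behind
origin/$INTEGRATION_BRANCH, publish used to hard-block
(publish_implementation_stale_base) and never recover; stale-revival then
treated the stale-base :ok implement as redispatchable and kept redispatching
the same stale worktree → endless churn that filled the floor, with no real PR
for long stretches (the first implement→PR never opened).

Now: before publish, merge origin/$INTEGRATION_BRANCH in the worktree to
restore the base (on conflict, merge --abort plus an explicit reason handed to
the existing resolver/re-dispatch); a clean `:ok` is no longer churned by
stale-revival (terminal success goes through publish-with-recovery and is not
rerun); an :ok with no open PR creates a draft PR (draft before consensus, per
#300) with Closes #N. The controller still owns git topology; workers do not
commit/push and do not bypass the review-gate/merge truth table. Touches
controller_actions.py + wakeup_plan.py + wakeup_runner.py + behavior and
source-regression tests; full suite 1374 green.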

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Fix markerless implement: with EXIT=0 and a real diff but no clean marker, synthesize a publish candidate to stop the stale-revival empty-redispatch loop

The publish build/test gate acts as the safety net.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

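* Fix the publish ordering bug and dispatch a codex fallback resolver on merge failure: commit the worker diff first, then merge the fresh base; a real conflict no longer wedges but dispatches the resolver, and the run resumes idempotently

The emptiness check now uses git diff HEAD; this also fixes the same kind of git diff problem in wakeup_plan stale-suppression.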

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

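* Implement issue #504 (#519)

* Fix incomplete headless review-fix prompt rendering: bind every placeholder (PR_NUMBER, title, round, branch, REVIEW_*_PATH, etc.), unblocking fix→merge for rejected PRs

Previously headless _dispatch_review_fix bound only FIX_OUTPUT_PATH; the remaining ${} placeholders stayed unresolved, so the fix codex could not read the reject evidence.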

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

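* Implement issue #496 (#520)

* Fix the cd bug that dispatched the review-fix codex onto main: run it in the PR worktree, and fail closed when the worktree is missing

Previously _dispatch_review_fix used cd=repo_root → the fix modified main (polluting the daemon checkout) and never landed on the PR branch (so a rejected PR never merged).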

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Implement issue #508 (#523)

* Fix wakeup-plan action ordering: review_gate/publish are ordered before new-work spawns, resolving spawn-budget starvation → rejected PRs can go fix→merge

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

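* CLAUDE.md: add constitutional engineering rules: exceptions must be raised and logged diagnosably, never swallowed or silenced; tests must assert real behavior, and meaningless tests are forbidden (maintainer instruction)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Implement issue #503 (#521)

* Add general basic engineering rules, making object-oriented design requirements and implementation details explicit

* Implement issue #514 (#524)

* Refine the general object-oriented design guidelines, adding core principles and implementation-level details

* Implement issue #494 (#526)

* Implement issue #517 (#528)

* Implement issue #505 (#510)

* Implement issue #505

* #510 fix r1: add phase9-router behavior test coverage (in response to the tests reviewer reject)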

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

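* Fix the headless gap where FIX_DONE did not commit fix output: wakeup-runner commits and pushes the fix worktree before the FIX_DONE re-review

In headless mode the fix codex emits FIX_DONE but the worker never commits, and FIX_DONE routes straight to dispatch_reviewers,
so the fix output stays uncommitted in the worktree, reviewers keep reviewing the old head, rejects never converge, and every PR
that needs a fix round is blocked from review→merge. Mirrors the interactive controller: before a FIX_DONE-sourced dispatch_reviewers,
if the fix worktree is dirty it runs git add + commit + safe_push; if clean it is a no-op. Includes 2 behavior tests (dirty commits / clean skips).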

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Implement issue #491 (#531)

* Implement issue #490 (#529)

* Implement issue #490

* Implement issue #490

* PR #529 review-fix output

* Implement issue #499 (#512)

* Implement issue #499

* PR #512 review-fix output

* Implement issue #474 (#532)

* Implement issue #474

* Implement issue #474

* Implement issue #474

* Implement issue #474

* PR #532 review-fix output

* Implement issue #476 (#518)

* Implement issue #476

* PR #518 review-fix output

* Implement issue #509 (#527)

* Implement issue #509

* Implement issue #509

* spawn-claim recycle now uses an artifact-aware terminal marker: after the companion log is deleted by 24h retention, the claim of a completed task can still be recycled (by reading the runs/ artifact) and no longer blocks redispatch permanently; a running task with no terminal marker is still not recycled (prevents double-spawn)

* Implement issue #421: a pre-release clean-room integration testing regime (#534)

* Implement issue #421

* rebase #421 onto fresh integration: resolve the test_cli_command_router conflict + verify

* Implement issue #493: test-discipline constitutional amendment (behavior-first) (#535)

* Implement issue #493

* rebase #493 onto fresh integration: resolve conflicts + verify

* PR #535 review-fix r1: addressed tests+quality reject (behavior-first test adjustments)

* Implement issue #506 (#536)

* Implement issue #506

* Implement issue #506

* Implement issue #506

* Implement issue #506

* PR #536 review-fix output

* Root-cause fix for churn: wakeup-plan applies open_closing_pr/terminal/in-flight suppression to consensus-impl spawn intents (aligned with dispatch_consensus_implementation), so an issue that already has a PR is no longer redispatched to implement

* Implement issue #507 (#525)

* Implement issue #507

* Implement issue #507

* PR #525 review-fix output

* Implement issue #516 (#530)

* Implement issue #516

* Implement issue #516

* #530 fix r1: add a behavior test for the ManagedWorkSnapshot fetch-failure stale-cache fallback (in response to the tests reviewer reject)

* PR #530 review-fix output

* PR #530 review-fix r2: typed ManagedWorkItem projection (addresses architect type-boundary reject + tests + contract-tests)

* PR #530 review-fix r3: anchor ManagedWorkSnapshot read surface in authorization mirror + cross-artifact source-regression (addresses architect governance reject)

* PR #530 review-fix r4: add canonical+legacy managed-label discovery regression test (addresses tests reject)

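* Fix the worker marker reader misjudging identical repeated markers as a conflict

codex often echoes the terminal marker 2-3 times (including in diffs and
prefixed lines), which made read_worker_terminal_marker return
duplicate_or_conflicting, so the review truth table never converged and every
headless review→merge wedged. Now identical repeated markers are folded into
one valid marker, and duplicate_or_conflicting is returned only when 2 or more
distinct marker values exist. Covers _last_final_marker /
_sentinel_adjacent_marker / _marker_from_companion_artifact, and keeps the
malformed and foreign-prefix detection.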

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
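
The folding rule above (identical repeats are valid, two or more distinct values conflict) can be illustrated with a small sketch. `read_terminal_marker`, the marker regex, and the `marker_missing` reason are hypothetical; only `duplicate_or_conflicting` is named in the commit message, and the real reader also handles malformed and foreign-prefix markers, which this sketch omits.

```python
import re
from typing import Optional, Tuple

# Hypothetical marker shape such as "FIX_DONE:pr-1:ok"; standalone lines only.
MARKER_RE = re.compile(r"^[A-Z_]+_DONE:\S+$")


def read_terminal_marker(log_text: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (marker, error); exactly one of the two is None."""
    values = [
        line.strip()
        for line in log_text.splitlines()
        if MARKER_RE.match(line.strip())
    ]
    distinct = set(values)
    if not distinct:
        return None, "marker_missing"
    if len(distinct) > 1:
        # Two different verdicts cannot both be trusted: fail closed.
        return None, "duplicate_or_conflicting"
    # Any number of identical echoes collapses to the single value.
    return values[0], None
```

Comparing the set of distinct values, rather than counting occurrences, is what lets a worker echo the same marker repeatedly without tripping the conflict path.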

* Reserve implementation PR for issue #537 (#540)

⟦AI:AUTO-LOOP⟧

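* Implement issue #511 (#522)

* Implement issue #511

* PR #522 review-fix output

* PR #522 review-fix output

* Remove a .refactor-loop run artifact that slipped into the PR (architect r8 reject)

resolve-pr522-merge.md is a one-off merge-resolution summary, a generated artifact that does not belong in the durable diff
(CLAUDE.md: "generated artifacts are not a source of truth"). The resolution notes stay in the commit message / PR body.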

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

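* Fix the early PR reservation crash (#511 regression that blocked all implement dispatch)

_reserve_implementation_pr opened the reservation PR using the worker's rich body path, but that file
does not exist yet at reservation time → FileNotFoundError (not caught by except RuntimeError) →
the wakeup-runner tick crashed → no consensus issue could be implemented → loop P0 wedge.
Fix: before reservation, write a placeholder body if the body is missing (## issue #N 实现 + Closes + sentinel;
publish rejects the placeholder, forcing the worker to produce a real body); reservation is idempotent (an existing open PR is returned directly;
an existing remote head is reset to integration and pushed with force-with-lease); OSError is caught.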

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Reserve implementation PR for issue #543 (#544)

⟦AI:AUTO-LOOP⟧

* Reserve implementation PR for issue #541 (#546)

⟦AI:AUTO-LOOP⟧

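* revert early-PR reservation (#507): no longer open an empty placeholder PR; a PR is opened only once publish has real content

dispatch_consensus_implementation no longer reserves or opens a PR; publish_implementation_output
returns to its pre-#507 behavior: only when implement produces a non-empty diff does it commit + push + open_pr_with_label (using the worker's
real title/body, verified not to be a placeholder); an empty diff opens no PR, merges nothing, and closes no issue. Removes the placeholder mechanism
(_reserve_implementation_pr / _placeholder_*) and changes the early_pr_missing projection in wakeup_plan/runner.
Root cause: empty placeholder PRs were merged empty by reviewers before implement publish, falsely closing #537/#541/#543.
Full suite: 1496 passed.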

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Implement issue #542: remove the redundant head_sha from the redispatch payload (#547)

Remove the duplicate head_sha payload from review redispatch

* hotfix: a stale consensus-implement intent pointing at a deleted worktree no longer suppresses fresh re-dispatch

The pending_implement_intent branch of consensus_implementation_suppressed_reason gains a worktree-existence
guard (wired to the existing _canonical_consensus_worktree_exists): it suppresses only when the stale pending intent's canonical
worktree still exists. After the revert of early-PR reservation (#507), the pending
implement intents of #537/#541/#543 pointed at deleted worktrees; the old logic suppressed fresh re-dispatch while wakeup-runner kept spawning
the stale intent into a dead cd, producing an os error 2 / EXIT=1 loop that blocked the autonomous consensus→implement chain.
Now a missing worktree lets fresh dispatch through to rebuild the worktree.

Adds test_wakeup_plan cases (missing worktree does not suppress / existing worktree still suppresses) and updates
the test_controller_actions subTest that locked the old behavior (the pending case creates the canonical worktree to keep
the reject-before-fresh_safe_worktree intent). Full suite: 1557 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

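* hotfix: a spawn-claim lock is recyclable once its holder pid is dead (self-healing; prevents a cleared log from being HELD forever)

_existing_claim_is_recyclable gains a pid-liveness fallback: the claim is recyclable when the holder pid recorded in the lock is dead
(os.kill(pid, 0) raises ProcessLookupError). It keeps the existing terminal-marker recycling, the live-spawn protection
(pid alive + no marker is still not recycled), and the metadata-mismatch fail-closed. Root cause: a failed implement (EXIT=1)
left a stale lock; the dispatch_consensus_implementation redispatch cleared the log → the #490 spawn-claim recycles only when the log has
EXIT= → the cleared log has no marker → the dead process's lock cannot be recycled → SPAWN_CLAIM_HELD deadlocks permanently
and blocks consensus→implement redispatch (#537/#541/#543). Adds 7 behavior tests. Full suite: 1560 passed.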

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
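
A minimal sketch of the pid-liveness probe mentioned above, assuming a POSIX host where signal 0 only checks that the process exists and delivers nothing. `pid_is_alive` and `claim_is_recyclable` are illustrative names; the real `_existing_claim_is_recyclable` also validates lock metadata, which is left out here.

```python
import os


def pid_is_alive(pid: int) -> bool:
    """Probe a pid with signal 0 (POSIX semantics; no signal is delivered)."""
    if pid <= 0:
        return False  # 0 and negatives address process groups, not one pid
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process: the holder is dead
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def claim_is_recyclable(holder_pid: int, has_terminal_marker: bool) -> bool:
    """Recycle on a terminal marker, or when the lock holder is dead."""
    return has_terminal_marker or not pid_is_alive(holder_pid)
```

Treating `PermissionError` as "alive" keeps the check conservative: a lock is recycled only when the holder is provably gone.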

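* hotfix: publish_implementation_output accepts an implementation diff the worker has already committed

_require_publish_implementation_diff used to look only at git diff HEAD --quiet (uncommitted changes);
once the worker committed on its own (on its own branch, isolated and harmless) the working tree was clean → misjudged as no_diff → no PR opened.
Now a diff is considered publishable when there are uncommitted changes OR HEAD has a committed delta relative to the integration base (merge-base..HEAD is non-empty);
_commit_publish_implementation_diff skips the commit and returns success when the working tree is already clean (the worker already committed),
no longer misreporting publish_commit_failed. Fixes the #541-style worker-commit → no_diff → no PR +
repeated daemon dispatch. Adds a behavior test. The full codex suite passes with DISCOVER_STATUS:0.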

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

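* Implement issue #437: a bounded cleanup mechanism for historical generated artifacts and stale worktrees under .refactor-loop (#548)

* #437: add the canonical RuntimeRetention owner and demote log-retention to a one-version alias; cleanup covers generated files + pending-events compaction + stale worktrees; CLAUDE.md adds a narrow #437 carveout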

* PR #548 review-fix output

* PR #548 review-fix output

* PR #548 review-fix output

* PR #548 review-fix output

* PR #548 review-fix output

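* hotfix: autonomously resolve stale-base conflicts on CONFLICTING managed review PRs

Wires the existing but orphaned prompts/rebase-resolve.md into the headless path: when a managed review PR
has mergeable=CONFLICTING (base behind), the controller starts git merge
--no-commit --no-ff origin/<integration> in the PR worktree. If the merge is clean it commits and pushes directly; on conflicts it dispatches
a rebase-resolve codex (which resolves and stages inside the isolated worktree and may not commit/push), and after receiving
REBASE_RESOLVE_DONE the controller commits and safe_push pushes only the PR head. The result then goes through CI and the
review-gate again. Adds dispatch_pr_rebase_resolve / commit_push_resolved_pr_rebase
(controller_actions) + the wakeup_plan projection + the wakeup_runner apply.

Safety: only refactor/iter<N>-* PR branches are touched; an explicit guard refuses to touch integration/review_base;
it commits only after checking for unresolved conflicts with diff-filter=U; every abort path runs git merge --abort to leave a clean worktree;
and it is idempotent (NOOP when the base is already included or a resolve is in flight). Fixes the root cause of stale-base churn stalling the loop permanently (once one PR
merged, all the others became CONFLICTING with nobody resolving them). Full codex suite: 1589 passed.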

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

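* hotfix: enrich live mergeability before the rebase_resolve_actions projection (fixes a production gap)

The autonomous-resolve projection read item.mergeable without enriching it first, but a production GhItem comes from a snapshot
whose mergeable is empty → always skipped → CONFLICTING PRs never projected dispatch_pr_rebase_resolve →
the loop stayed stalled. Reuses _with_live_mergeability from the same file (review_evidence_redispatch_actions
already uses the same pattern). Adds a regression test for the empty-mergeable-needs-live-fetch case (the original test mocked mergeable
and did not cover this production path).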

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* hotfix: the dispatch_pr_rebase_resolve kind joins EXECUTABLE_ACTION_KINDS (executable end-to-end)

The else branch of _close_projection_action defaults to status_only=True; the rebase action's kind,
stale-base-conflicting-pr, was not in EXECUTABLE_ACTION_KINDS → always status_only, the runner never
applied it, and CONFLICTING PRs were never resolved autonomously. Adds that kind (plus the commit_push completion kind if needed)
to the set; the controller_action is already in RUNNER_NAMED_HELPER_ACTIONS, so it stays executable. Adds
an end-to-end wakeup-plan pipeline test (the original test covered only the rebase_resolve_actions function and missed the
status_only gate in closed-action-projection). Full codex suite: 1593 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

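* hotfix: gate the rebase-resolve commit/dispatch projection on the actual worktree merge state (stops stale-marker noise)

commit_push_resolved_pr_rebase used to be projected purely from the persistent REBASE_RESOLVE_DONE marker; after the worktree
had been committed and pushed (#549) or aborted (#550/#551) the marker was still there → fail-closed noise on every tick +
repeated 0-codex (tripping the zero_streak alarm) + the stale marker blocking a fresh resolve for #550/#551. Gate now: an executable
commit_push is projected only when the worktree really has a merge in progress (MERGE_HEAD) and no unmerged paths;
otherwise nothing is projected. A PR whose worktree is clean but is still CONFLICTING gets an executable dispatch_pr_rebase_resolve again
(a stale DONE marker no longer suppresses it forever). Adds an end-to-end test. The full codex suite passes.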

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Implement issue #502: remove the legacy label compatibility layer: purge the full emoji/legacy alias set (cleanup_aliases / per-spec  (#549)

* Implement issue #502: remove the legacy label compatibility layer: purge the full emoji/legacy alias set (cleanup_aliases / per-spec

* Fix PR #549 review r3: adapt the #502 label purge to the current label code (contract-tests + architect/tests reject)

fix codex round-3 applied-4: aligns the #502 legacy label compatibility layer purge with recently added label code,
fixing the contract-tests failure and the architect/tests reject. Changes labels.py + SKILL.md + test_wakeup_plan.py.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

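* Fix the headless review_gate ci_failed deadlock: move reject→FIX ahead of the ci/mergeability checks

In _review_gate_decision the ci_error/mergeability checks used to come before reject>0→FIX → a PR with red CI
returned WAIT_OR_REDISPATCH:ci_failed and just waited even when a reviewer had rejected it, and CI never goes green without a fix → deadlock
(both #548 and #549 hit it and needed a manually dispatched fix). reject>0→FIX now runs before the ci/mergeability checks
(still after the reviewer-validity/head checks) → on reject a fix is dispatched autonomously even with red CI (the fix addresses both the reject and CI).
The MERGE/MERGE_WITH_COMMENTS decision is completely unchanged (it still requires ci-green + mergeable + reject=0 + approve>=1),
so there is no risk of a wrong merge. Adds end-to-end test_wakeup_runner tests (reject+ci-red→FIX; reject=0+ci-red still WAIT;
happy-path MERGE unchanged; stale-head+reject still WAIT). Full suite: 1600 passed.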

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

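* Correct the review_gate order: the mergeability check moves ahead of reject→FIX (CONFLICTING PRs get no concurrent review/fix)

The previous fix (a125d19) moved reject→FIX ahead of ci+mergeability, which broke the ci_failed deadlock but also routed
CONFLICTING+reject PRs to FIX → running concurrently with the independent dispatch_pr_rebase_resolve on the same
worktree (#550 showed a 98-file mid-merge with reviewers running concurrently). The mergeability_error check now comes
before reject→FIX: CONFLICTING→WAIT (rebase-resolve first, no concurrent review/fix); MERGEABLE +
red CI + reject→FIX (the deadlock fix is kept); ci_error stays after reject→FIX (WAIT only for red CI with no reject).
The MERGE decision is unchanged. test_wakeup_runner: 90 OK (plus the conflicting case).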

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
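
The check order these two commits settle on can be summarized as a small decision function. This is a sketch under assumptions: `GateInput` and its fields are invented for illustration, the detailed `WAIT_OR_REDISPATCH:<reason>` suffixes are dropped, and `MERGE_WITH_COMMENTS` is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateInput:
    evidence_valid: bool  # reviewer evidence is valid and bound to the live head
    mergeable: bool       # False models mergeable=CONFLICTING
    ci_green: bool
    rejects: int
    approves: int


def review_gate_decision(g: GateInput) -> str:
    """Apply the checks in the order described in the commit messages."""
    if not g.evidence_valid:
        return "WAIT_OR_REDISPATCH"  # reviewer-validity/head checks come first
    if not g.mergeable:
        return "WAIT_OR_REDISPATCH"  # rebase-resolve first, no concurrent fix
    if g.rejects > 0:
        return "FIX"                 # dispatched even when CI is red
    if not g.ci_green:
        return "WAIT_OR_REDISPATCH"  # red CI with no reject just waits
    if g.approves >= 1:
        return "MERGE"               # ci-green + mergeable + reject=0 + approve>=1
    return "WAIT_OR_REDISPATCH"
```

Placing the mergeability check above the reject check is what keeps a CONFLICTING PR out of the FIX path, while a MERGEABLE PR with red CI and a reject still reaches FIX.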

* Implement issue #498: a codex fallback adjudication/generation mechanism for headless daemon fail-closed points (narrow allowlist, not a general es (#550)

* Implement issue #498

* Implement issue #498: a codex fallback adjudication/generation mechanism for headless daemon fail-closed points (narrow allowlist, not a general es

* PR #550 review-fix output

* Add patrol-inspector patrol issue intake (#551)

* Implement issue #541

* PR #551 review-fix output

* PR #551 review-fix output

* Fix PR #551 contract-tests: align with the patrol 8th daemon + host.env for the safe_push/sync_dev tests

#551 patrol adds patrol_inspector_daemon (the 8th daemon) → test_anti_stop_restart now expects 8 instead of 7;
test_controller_lib_safe_push / test_sync_dev add the CONSENSUS_RND_HOST_ENV host.env setup
(per the host-config contract the code needs a host.env locator; the rebase-resolve had dropped the base's newer test version).
Assertions are not weakened (the daemon count assertion is tightened to 8). Full discovery: 1634 OK + sshx 13 OK.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Fix PR #551 contract-tests (clean-env): the e2e fixture sets META_ESCALATION_STUCK_HOURS to isolate the de-dup test

test_batch_and_per_task_lock_do_not_duplicate_spawn failed in a clean env (CI): the #506 meta-escalation
fired because the fixture's mock state was judged stuck, producing a repository-stalled RunnerResult that polluted
the de-dup assertion (expected [] got [RunnerResult]). The fixture host.env + env set META_ESCALATION_STUCK_HOURS=999999
to disable meta-escalation for this test and focus on de-dup behavior. Assertions are not weakened. Clean discovery passes in full.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

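* Fix publish-gap: implement readiness uses git status --porcelain to detect a staged diff

classify_implement_attempt and the wakeup_runner pre-publish check used git diff --quiet, which detects only unstaged
changes; after a worker correctly ran git add -A, the staged diff was misjudged as empty_scoped_diff, so a completed implement
(such as #553: clean EXIT=0 + 885 lines of real staged diff + valid title/body artifacts) never produced a PR,
and wakeup_runner kept re-dispatching children to no effect. Now git status --porcelain runs first, and any staged,
unstaged, or untracked change counts as publish-ready; git diff is the fallback only when the tree is truly clean. The publish-time PR creation
path is unchanged (no repeat of the #507 early-PR empty commit/placeholder deadlock).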

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
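
As a rough illustration of the readiness check above: `git status --porcelain` prints one line per staged, unstaged, or untracked path, so any non-empty output means there is something to publish. The two function names below are hypothetical, and the sketch omits the `git diff` fallback the commit mentions.

```python
import subprocess
from pathlib import Path


def has_publishable_changes(porcelain_output: str) -> bool:
    """Any non-blank porcelain line is a staged, unstaged, or untracked path."""
    return any(line.strip() for line in porcelain_output.splitlines())


def worktree_is_publish_ready(worktree: Path) -> bool:
    """Run git status --porcelain in the worktree and classify its output."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=worktree,
        capture_output=True,
        text=True,
        check=True,  # a failing git command raises instead of reading as clean
    )
    return has_publishable_changes(result.stdout)
```

Keeping the parsing separate from the subprocess call lets the classification be tested on captured output without a real repository.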

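* Fix the patrol_inspector daemon startup crash: construct DaemonHeartbeatLease with its constructor, not from_env

The daemon branch of #551's patrol.py called DaemonHeartbeatLease.from_env() (a method that does not exist) →
AttributeError crashed it at startup, and the daemon never ran successfully. It now uses DaemonHeartbeatLease(
"patrol_inspector_daemon", ctx.repo_root), consistent with closed_label_reconciler/wakeup_runner,
so the heartbeat file name matches the patrol_inspector_daemon.ts expected by the restart helper.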

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Fix the remaining publish-gap layer: duplicate-marker logs read the companion artifact + superseded spawn intents are replaced

Remaining layer 2: the implement worker echoes IMPLEMENT_DONE:issue-N:ok several times in the log →
worker_markers returns duplicate_or_conflicting_log_marker and fails closed → the classifier treats it as
markerless and does not check the companion artifact → no completed-marker action → no publish → the stale
harness-spawn-intent:dispatch-consensus-implementation:N remains and causes pending_implement_intent.

Fix: (1) on a duplicate marker, classify_implement_attempt / completed_marker_actions
narrowly read the companion .refactor-loop/runs/implement-*.md (implement logs only, accepted only when the last 30 lines contain exactly
1 IMPLEMENT_DONE:ok); (2) suppress_publish_superseded_implementation_spawn_intents:
only when an issue already has an executable publish_implementation_output is the corresponding dispatch-consensus-implementation
spawn intent suppressed to status_only (implementation_ready_to_publish).
This does not repeat the #507 deadlock: spawn is suppressed only after publish is ready (no missed dispatch), publish is still fully validated (not early,
not duplicated), and the wakeup-runner/publish authorization scope is unchanged. Live verification: #553 now projects an executable publish.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Implement issue #555 (#557)

* Fix publish-gap layer 3: wakeup_runner clean_exit revalidation reuses the artifact fallback for duplicate-marker logs

pubgap2 fixed the plan-side duplicate-marker artifact fallback, but wakeup_runner's
_source_log_has_clean_marker (the clean_exit_source_marker precondition revalidation) still used the strict
reader and returned clean_exit_marker_missing for #553's duplicate IMPLEMENT_DONE echoes → publish blocked.
Now the strict reader goes first, and the fallback applies only when duplicate_or_conflicting_log_marker + is_implement_log +
_implement_run_artifact_done_marker exactly matches the projected marker. It reuses the same plan-side
helper and does not weaken the safety gate (reviewer/release/rebase/non-implement/other fail reasons stay strict).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Implement issue #553

* PR #561 review-fix output

* PR #561 remote-ci-fix output

* PR #561 review-fix output

* Narrow restart daemon authority to the owner-local command surface (#558)

* #471 hotfix: the wakeup_plan completed-marker projection filters by the open-managed set; closed/merged targets are not projected (stops the graphql bleeding: removes wakeup_runner's per-item revalidation of already terminated targets)

* #410: SKILL/prompt comments now cite durable contract anchors instead of one-off .refactor-loop/runs/*; add the narrow source-regression test_generated_artifacts_not_authority (#472)

* #471 hotfix2: wakeup-plan projects only the latest valid marker per target; older rounds no longer project dispatchable actions (the final cut against graphql backoff)

* skill: strengthen the no-gap rule: 0 codex + active work has no observation-mode exemption; the controller must intervene and dispatch codex within the same turn (+source-regression)

* #470: when rendering direct-post prompts, inline _github-post-rules via a fixed token so it is reachable from the host worktree cwd (avoiding bare relative paths) (#479)

* #478 hotfix: daemon spawn switches to a launch-and-forget independent supervisor (start_new_session, no wait), so a daemon reload does not lose in-flight workers; the daemon prints a clear status line every tick

* #471: wakeup-plan latest-open-fresh projection: completed-marker open-target/latest + release-rollup only latest-per-sha with ahead>0 verified against the local ref (stops stale projection/graphql) (#482)

* #481: consensus->implement readiness-dedup gate (stops the daemon redispatching closed/has-PR/in-flight targets) (#486)

* #481: consensus->implement readiness-dedup gate: closed/closing-PR/remote-branch/in-flight targets are not projected as dispatchable, stopping duplicates/races

* #481 fix r1: the dedup gate reuses the concurrency monitor owner surface to measure in-flight work instead of reimplementing the daemon algorithm (addr architect)

* #481 fix r2: add the controller helper guard behavior test (addr tests)

* #485: consensus->implement is serialized by scope_paths conflict groups (one executable per group, the rest status_only), reusing batching-heuristics to prevent parallel conflicting PRs (#489)

* #475: wakeup-runner WakeupApplyBudget: spawn-worker actions are batched up to the floor deficit per hard_gate.dispatch_required (lifecycle actions stay at 1/tick), fixing the 1/tick linear recovery (#492)

* hotfix(#475 ext): the batch budget includes dispatch_design_consensus, so the solver triplet is dispatched in one tick (a worker dispatch, not lifecycle), fixing design-consensus still running at 1/tick low concurrency

* hotfix: headless spawn-intent launch: a skipped duplicate no longer breaks the whole spawn batch, a stale applied ledger (log missing) can be retried, and helper_exit:3 gets a source-diagnostic event; fixes the pile-up of 248 intents at 0 concurrency

* hotfix r2: a blocked lifecycle action no longer dead-stops the whole tick: stale blocked actions (close-drop target OPEN / publish with no verified head) are skipped/suppressed and the following spawn-batch still launches, bringing headless concurrency up to the floor

* hotfix r3: root-cause fix for headless 0 concurrency: the detached spawn-codex launch passes env_for_subprocess explicitly (fixes the #478 decouple dropping the env and causing EXIT=127); a failed log (nonzero EXIT) can be relaunched, while EXIT=0/unfinished logs still suppress duplicates; in the tick status summary, later launches are no longer masked by the first blocked action

* hotfix r4: run_once generic blocked-not-dead-stop: any blocked/skipped non-spawn action no longer dead-stops the whole tick and the following spawn-batch still launches; wakeup-plan suppresses stale actions whose precondition is doomed to fail; root-cause fix for headless concurrency=0

* hotfix r5: on the projection side, wakeup-plan suppresses a stale close_managed_drop (a drop-close doomed to fail because the target is actually OPEN) to status_only, so it no longer occupies the tick and blocks the later design-consensus spawn launch

* hotfix r6: wakeup-runner run_once no longer misjudges a no-intents dispatch_design_consensus (helper_exit:3) as a codex launch failure that dead-stops the whole tick

Root cause: dispatch_design_consensus is a spawn-batch action; when the solver triplet is incomplete (for example #496 minimal/structural exited with EXIT=0 but did not emit the SOLVER_DONE marker) the helper returns exit 3 (NO_INTENTS), _spawn_launch_failure matches the "helper_exit:" prefix and returns True → run_once breaks, the whole tick dead-stops, and the later reviewers/other design-consensus/implement actions are not dispatched → headless concurrency=0.

Fix: in run_once, a blocked spawn-batch action whose controller_action != spawn_codex_harness_background (that is, dispatch_design_consensus) is treated as a routing no-op and skip-and-continue applies, so the following spawn-batch still launches; only a real spawn_codex_harness_background launch failure takes the existing retry/break path. This extends the r4 blocked-not-dead-stop invariant to design-consensus spawn-batch actions.

behavior test: test_wakeup_runner_design_consensus_no_intents_does_not_dead_stop_later_spawn_batch (without the fix, the later spawn does not launch and the test FAILs).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

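* hotfix r7: review head_sha binding: unblock the self-driven headless review→merge

Root cause: the reviewer artifact frontmatter carried no head_sha, so _review_evidence_from_artifact could not obtain the reviewed head SHA → every reviewing PR (#480/#473/#402) was permanently judged WAIT_OR_REDISPATCH:invalid_reviewer_evidence:missing_reviewed_head_sha, and wakeup-plan did not automatically redispatch reviewers → headless review→merge was stuck for good (blocking the release).

Fix (implemented by a codex hotfix worker, committed after controller verification):
- When dispatching reviewers, controller_actions fetches headRefOid, fails closed if it is missing, and injects the authoritative HEAD_SHA into the rendered prompt.
- The reviewer-{architect,tests,quality}.md frontmatter writes head_sha: ${HEAD_SHA}.
- wakeup_runner._review_head_sha_for: when the artifact lacks head_sha, recover it from the controller-rendered prompt/log (not relying on the worker's self-report, avoiding a repeat of the markerless failure); the live-head staleness check and the merge truth table semantics are unchanged.
- wakeup_plan projects an executable dispatch_reviewers self-healing action for OPEN reviewing PRs with a missing/stale reviewer head.

Verification: the controller independently reran test_wakeup_runner+review_gate+wakeup_plan+controller_actions, 293 passed, and the marker/prompt contracts, 30 passed; the review-gate safety semantics (a stale head still does not merge) are locked by 13 review-gate tests.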

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

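* hotfix r8: make the review-evidence-redispatch self-healing action executable: close the headless review→merge loop

Follow-up to r7: r7 bound reviewer evidence to head_sha, but the review-evidence-redispatch self-healing action projected by wakeup_plan was forced to status_only by the closed-action-projection finalizer (its kind was not in EXECUTABLE_ACTION_KINDS), so the runner never applied it → review evidence lacking a head on #480/#473/#402 could not be redispatched automatically → review_gate stayed at WAIT_OR_REDISPATCH → concurrency drained to 0.

Fix (implemented by a codex hotfix worker, committed after controller verification):
- wakeup_plan EXECUTABLE_ACTION_KINDS adds review-evidence-redispatch, and the finalizer keeps runner authority instead of forcing status_only.
- controller_actions.dispatch_reviewers reads stale_review_roles and redispatches only the stale roles, and _pending_review_spawn_exists skips any already in-flight dispatch-reviewers:<pr>:<role>:r1 intent (idempotent, no double dispatch); it still goes through r7's HEAD_SHA injection so the redispatched reviewers' new evidence is head-bound.
- The merge-gate truth table is unchanged; this only makes the existing self-healing action executable and targets it at stale roles.

Verification: the controller independently reran test_wakeup_plan+wakeup_runner+review_gate+controller_actions, 295 passed.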

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

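* hotfix r9: reviewer redispatch bumps to the next round + absolute cd + latest-evidence-per-role: actually unblock headless review→merge

Root cause (the same class: suppressing on "an artifact exists" rather than on "success + a distinguishing dimension"): reviewer redispatch was hardcoded back to r1 and collided with the earlier successful (EXIT=0) r1 log from the old head → _harness_spawn_intent_log_suppresses_retry judged it "already done" and suppressed it → the review that should rerun for the current head never ran → review_gate stayed at WAIT_OR_REDISPATCH → concurrency 0. In addition, the intent cd was the relative "." → the headless launch layer did not project it.

Minimal fix (per maintainer directive: the filename is the unique id, no wrapper --id is added, and the existing r<R> naming convention is kept without changing public naming):
- controller_actions.dispatch_reviewers: _next_review_round takes the highest existing round for the PR/role plus 1 and renders the r<N+1> prompt/log/output/intent; the pending-spawn idempotency guard uses the real round number; the cd in _append_harness_spawn_intent becomes the absolute path str(cd.resolve()), so headless projection and launch work.
- wakeup_plan.latest_reviewer_heads / wakeup_runner._latest_review_evidence_by_role: changed to "take each role's own latest-round evidence" (rather than the single globally latest round), so redispatching only stale roles does not drop other roles' valid head-bound evidence; merge still requires every role's evidence to be bound to the live head, so the safety semantics are unchanged.

Verification: the controller independently reran test_wakeup_plan+wakeup_runner+review_gate+controller_actions, 299 passed.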

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

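* hotfix r10: carry the reviewed head_sha on the review_gate action + an e2e test locking the full headless review→decision path

Root cause (the fifth gap in the review-path head_sha wiring): when wakeup_plan projected the review_gate action for REVIEW_DONE, it read the head only from the log (which has no head) → action.head_sha was empty → wakeup_runner._review_gate stalled at WAIT_OR_REDISPATCH:missing_action_reviewed_head_sha; the evidence was already head-bound, but the gate had nothing to compare it against.

Fix (minimal, following the file-name-is-id approach):
- wakeup_plan.completed_marker_actions: the REVIEW_DONE action's head_sha is now resolved by _review_done_action_head_sha from the authoritative evidence surface (latest_reviewer_heads, including the head injected into the prompt by r7; or the live PR head when all roles are head-bound; otherwise falling back to the log). It is populated only from real sources and does not bypass the gate's head comparison, so merge safety semantics are unchanged.
- Add test_review_gate_e2e.py: from a real wakeup_plan projection → WakeupRunner apply, asserting that both concrete decisions land, FIX (a reject is present) and MERGE (all approve + CI green); this locks the whole review→decision path end to end so no further hidden layer can surface.
- test_wakeup_plan.py: add a planner regression for the prompt-bound reviewed head.

Verification: the controller independently re-ran test_wakeup_plan+wakeup_runner+review_gate+controller_actions+review_gate_e2e; 302 passed.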

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

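* hotfix r11: add bounded retry to PrChecksProjection for transient gh-api failures, so ready PRs reach the review_gate decision headlessly

Root cause: the CI read for review_gate, PrChecksProjection.check_pr, failed closed immediately (ci_unavailable:pull_api_failure) on a single non-zero return from gh api pulls/<n> or commits/<sha>/check-runs, with no retry; the rate limit was healthy and these were transient blips → #473/#402 (MERGEABLE + CI green) were blocked by a transient failure on every tick and never reached MERGE/FIX.

Fix (minimal, touching only PrChecksProjection; #322 ReleaseRequiredChecksProjection is untouched):
- _run_api_read: up to 3 deterministic retries (no sleep) for the two api reads, stopping on the first success; persistent failure still fails closed (safe). The happy path is unchanged.
- Starvation finding (verified by the worker): review_gate was already being applied, and a blocked ci_unavailable row can be re-evaluated (it is not terminal duplicate-suppressed), so run_once needs no change.
- e2e: test_review_gate_e2e gains a "merge still happens when the first pulls read fails transiently" case; test_pr_checks gains retry coverage.

Verification: the controller independently re-ran test_pr_checks+wakeup_runner+review_gate+review_gate_e2e+wakeup_plan+co…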
loning added a commit that referenced this pull request Jun 6, 2026
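* #471 hotfix: wakeup_plan completed-marker projection is filtered by the open-managed set, so closed/merged targets are not projected (stops the graphql bleed: eliminates wakeup_runner's per-item revalidation of already-terminal targets)

* #410: SKILL/prompt comments now cite durable contract anchors instead of one-off .refactor-loop/runs/*; add the narrow source-regression test_generated_artifacts_not_authority (#472)

* #471 hotfix2: wakeup-plan projects only the latest valid marker per target; older rounds no longer project a dispatchable action (the final cut to stop graphql backoff)

* skill: strengthen the no-gap rule: 0 codex + active work gets no observation-mode exemption; the controller must intervene and dispatch codex within the same turn (+source-regression)

* #470: when rendering the direct-post prompt, inline _github-post-rules via a fixed token so it is reachable from the host worktree cwd (avoids bare relative paths) (#479)

* #478 hotfix: daemon spawn becomes launch-and-forget with an independent supervisor (start_new_session, no wait), so a daemon reload does not lose in-flight workers; the daemon prints a clear status line every tick

* #471: wakeup-plan latest-open-fresh projection: completed-marker limited to open targets/latest + release-rollup only latest-per-sha and verified against the local ref with ahead>0 (stops stale projection/graphql) (#482)

* #481: consensus->implement readiness-dedup gate (stops the daemon from re-dispatching closed/has-PR/in-flight targets) (#486)

* #481: consensus->implement readiness-dedup gate; closed/closing-PR/remote-branch/in-flight targets do not project a dispatchable action, stopping duplicates/races

* #481 fix r1: the dedup gate reuses the concurrency monitor owner surface to measure in-flight work instead of reimplementing the daemon algorithm (addresses architect)

* #481 fix r2: add a behavior test for the controller helper guard (addresses tests)

* #485: consensus->implement is grouped by scope_paths conflicts and serialized (one executable per group, the rest status_only), reusing batching-heuristics to prevent parallel conflicting PRs (#489)

* #475: wakeup-runner WakeupApplyBudget: spawn-worker actions are batched up to the floor deficit per hard_gate.dispatch_required (lifecycle actions remain 1/tick), fixing 1/tick linear recovery (#492)

* hotfix(#475 ext): the batch budget now includes dispatch_design_consensus, so the solver triplet is dispatched in a single tick (non-lifecycle, worker dispatch); fixes design-consensus still running at 1/tick low concurrency

* hotfix: headless spawn-intent launch: a skipped duplicate no longer breaks the whole spawn batch, a stale applied ledger (log missing) can be retried, and helper_exit:3 gains a source-diagnostic event; fixes the pile-up of 248 intents at 0 concurrency

* hotfix r2: a blocked lifecycle action no longer dead-stops the whole tick: stale blocked actions (close-drop with target OPEN / publish without a verified head) are skipped/suppressed and the following spawn-batch still launches, pulling headless concurrency up to the floor

* hotfix r3: root fix for headless 0 concurrency: the detached spawn-codex launch explicitly passes env_for_subprocess (fixes the env lost by the #478 decoupling, which caused EXIT=127); a failed log (non-zero EXIT) can be re-launched, while EXIT=0/unfinished still suppresses duplicates; in the tick status summary, later launches are no longer masked by the first blocked action

* hotfix r4: generic blocked-not-dead-stop in run_once: any blocked/skipped non-spawn action no longer dead-stops the whole tick, and the following spawn-batch still launches; wakeup-plan suppresses stale actions whose precondition is bound to fail; root fix for headless concurrency=0

* hotfix r5: on the projection side, wakeup-plan suppresses stale close_managed_drop (a drop-close whose target is actually OPEN is bound to fail) to status_only, so it no longer occupies the tick and blocks later design-consensus spawn launches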

* hotfix r6: wakeup-runner run_once no longer misjudges a no-intents dispatch_design_consensus (helper_exit:3) as a codex launch failure that dead-stops the whole tick

Root cause: dispatch_design_consensus is a spawn-batch action; when the solver triplet is incomplete (e.g. #496 minimal/structural exited with EXIT=0 but did not emit the SOLVER_DONE marker) the helper returns exit 3 (NO_INTENTS), _spawn_launch_failure matches the "helper_exit:" prefix and returns True → run_once breaks, the whole tick dead-stops, and the subsequent reviewers/other design-consensus/implement actions are never dispatched → headless concurrency=0.

Fix: in run_once, a blocked spawn-batch action whose controller_action != spawn_codex_harness_background (i.e. dispatch_design_consensus) is treated as a routing no-op (skip-and-continue), and the following spawn-batch still launches; only a genuine spawn_codex_harness_background launch failure takes the existing retry/break path. This extends r4's blocked-not-dead-stop invariant to design-consensus spawn-batch actions.

behavior test: test_wakeup_runner_design_consensus_no_intents_does_not_dead_stop_later_spawn_batch (without the fix, the later spawn does not launch and the test FAILs).



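* hotfix r7: review head_sha binding, enabling self-driven headless review→merge

Root cause: the reviewer artifact frontmatter carried no head_sha, so _review_evidence_from_artifact could not obtain the reviewed head SHA → every reviewing PR (#480/#473/#402) was permanently judged WAIT_OR_REDISPATCH:invalid_reviewer_evidence:missing_reviewed_head_sha, and wakeup-plan did not re-dispatch reviewers automatically → headless review→merge was stuck permanently (blocking the release).

Fix (implemented by the codex hotfix worker, committed after controller verification):
- controller_actions fetches headRefOid when dispatching reviewers, fails closed if it is missing, and injects the authoritative HEAD_SHA into the rendered prompt.
- The reviewer-{architect,tests,quality}.md frontmatter writes head_sha: ${HEAD_SHA}.
- wakeup_runner._review_head_sha_for: when the artifact lacks head_sha, recover it from the controller-rendered prompt/log (without relying on the worker's self-report, avoiding a repeat of the markerless failure); the live-head staleness check and merge truth-table semantics are unchanged.
- wakeup_plan projects an executable dispatch_reviewers self-heal action for OPEN reviewing PRs whose reviewer head is missing/stale.

Verification: the controller independently re-ran test_wakeup_runner+review_gate+wakeup_plan+controller_actions (293 passed) and the marker/prompt contract tests (30 passed); the review-gate safety semantics (a stale head still does not merge) are locked by 13 review-gate tests.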



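* hotfix r8: make the review-evidence-redispatch self-heal action executable, closing the headless review→merge loop

Follows r7: r7 bound reviewer evidence to head_sha, but the review-evidence-redispatch self-heal action projected by wakeup_plan was forced to status_only by the closed-action-projection finalizer (its kind was not in EXECUTABLE_ACTION_KINDS), so the runner never applied it → review evidence missing a head for #480/#473/#402 could not be re-dispatched automatically → review_gate stayed at WAIT_OR_REDISPATCH → concurrency drained to 0.

Fix (implemented by the codex hotfix worker, committed after controller verification):
- wakeup_plan: add review-evidence-redispatch to EXECUTABLE_ACTION_KINDS; the finalizer keeps runner authority and no longer forces status_only.
- controller_actions.dispatch_reviewers reads stale_review_roles and re-dispatches only the stale roles, and _pending_review_spawn_exists skips any in-flight dispatch-reviewers:<pr>:<role>:r1 intent (idempotent, no double dispatch); it still goes through r7's HEAD_SHA injection, so evidence from re-dispatched reviewers is head-bound.
- The merge-gate truth table is unchanged; the existing self-heal action merely becomes executable and is targeted at the stale roles.

Verification: the controller independently re-ran test_wakeup_plan+wakeup_runner+review_gate+controller_actions; 295 passed.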



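* hotfix r9: bump reviewer re-dispatch to the next round + absolute cd + latest-evidence-per-role, truly closing headless review→merge

Root cause (same class: suppressing on "artifact exists" rather than "succeeded + distinguishing dimension"): reviewer re-dispatch was hard-coded back to r1 and collided with the r1 log that had succeeded earlier (EXIT=0) on the old head → _harness_spawn_intent_log_suppresses_retry judged it "already done" and suppressed it → the review that should re-run on the current head never ran → review_gate stayed at WAIT_OR_REDISPATCH → concurrency 0. In addition, the intent cd was the relative path "." → the headless launch layer did not project it.

Minimal fix (per maintainer directive: the file name is the unique id, no wrapper --id is added, and the existing r<R> naming convention is kept so public naming does not change):
- controller_actions.dispatch_reviewers: _next_review_round takes the highest existing round for that PR/role plus 1 and renders the r<N+1> prompt/log/output/intent; the pending-spawn idempotency guard uses the real round number; the cd in _append_harness_spawn_intent becomes the absolute path str(cd.resolve()), so headless projection and launch work.
- wakeup_plan.latest_reviewer_heads / wakeup_runner._latest_review_evidence_by_role: changed to "take each role's own latest-round evidence" (rather than the single globally latest round), so re-dispatching only the stale roles does not discard other roles' valid head-bound evidence; merge still requires every role's evidence to be bound to the live head, so safety semantics are unchanged.

Verification: the controller independently re-ran test_wakeup_plan+wakeup_runner+review_gate+controller_actions; 299 passed.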
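
The two r9 rules (next round = highest existing round + 1, and latest evidence chosen per role) can be sketched as follows. This is a minimal illustration, not the repository's code: the function names, the `Evidence` tuple, and the regular expression are hypothetical, and the id shape is assumed to follow the `dispatch-reviewers:<pr>:<role>:r<N>` intent form quoted above.

```python
import re
from typing import Dict, Iterable, NamedTuple

# Assumed id shape, modelled on "dispatch-reviewers:<pr>:<role>:r1" from the log.
INTENT_RE = re.compile(r"^dispatch-reviewers:(\d+):([a-z-]+):r(\d+)$")


def next_review_round(intent_ids: Iterable[str], pr: int, role: str) -> int:
    """Return the highest existing round for this PR/role plus 1 (1 if none)."""
    rounds = []
    for intent_id in intent_ids:
        match = INTENT_RE.match(intent_id)
        if match and int(match.group(1)) == pr and match.group(2) == role:
            rounds.append(int(match.group(3)))
    return max(rounds, default=0) + 1


class Evidence(NamedTuple):
    """Hypothetical reviewer evidence record."""
    role: str
    round_no: int
    head_sha: str


def latest_evidence_by_role(evidence: Iterable[Evidence]) -> Dict[str, Evidence]:
    """Keep each role's own latest-round evidence, not one global latest round."""
    latest: Dict[str, Evidence] = {}
    for item in evidence:
        current = latest.get(item.role)
        if current is None or item.round_no > current.round_no:
            latest[item.role] = item
    return latest
```

With this shape, re-dispatching only the `tests` role to r2 leaves the `architect` r1 evidence in place, and a merge check can then compare every role's `head_sha` against the live head.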



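* hotfix r10: carry the reviewed head_sha on the review_gate action + an e2e test locking the full headless review→decision path

Root cause (the fifth gap in the review-path head_sha wiring): when wakeup_plan projected the review_gate action for REVIEW_DONE, it read the head only from the log (which has no head) → action.head_sha was empty → wakeup_runner._review_gate stalled at WAIT_OR_REDISPATCH:missing_action_reviewed_head_sha; the evidence was already head-bound, but the gate had nothing to compare it against.

Fix (minimal, following the file-name-is-id approach):
- wakeup_plan.completed_marker_actions: the REVIEW_DONE action's head_sha is now resolved by _review_done_action_head_sha from the authoritative evidence surface (latest_reviewer_heads, including the head injected into the prompt by r7; or the live PR head when all roles are head-bound; otherwise falling back to the log). It is populated only from real sources and does not bypass the gate's head comparison, so merge safety semantics are unchanged.
- Add test_review_gate_e2e.py: from a real wakeup_plan projection → WakeupRunner apply, asserting that both concrete decisions land, FIX (a reject is present) and MERGE (all approve + CI green); this locks the whole review→decision path end to end so no further hidden layer can surface.
- test_wakeup_plan.py: add a planner regression for the prompt-bound reviewed head.

Verification: the controller independently re-ran test_wakeup_plan+wakeup_runner+review_gate+controller_actions+review_gate_e2e; 302 passed.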



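* hotfix r11: add bounded retry to PrChecksProjection for transient gh-api failures, so ready PRs reach the review_gate decision headlessly

Root cause: the CI read for review_gate, PrChecksProjection.check_pr, failed closed immediately (ci_unavailable:pull_api_failure) on a single non-zero return from gh api pulls/<n> or commits/<sha>/check-runs, with no retry; the rate limit was healthy and these were transient blips → #473/#402 (MERGEABLE + CI green) were blocked by a transient failure on every tick and never reached MERGE/FIX.

Fix (minimal, touching only PrChecksProjection; #322 ReleaseRequiredChecksProjection is untouched):
- _run_api_read: up to 3 deterministic retries (no sleep) for the two api reads, stopping on the first success; persistent failure still fails closed (safe). The happy path is unchanged.
- Starvation finding (verified by the worker): review_gate was already being applied, and a blocked ci_unavailable row can be re-evaluated (it is not terminal duplicate-suppressed), so run_once needs no change.
- e2e: test_review_gate_e2e gains a "merge still happens when the first pulls read fails transiently" case; test_pr_checks gains retry coverage.

Verification: the controller independently re-ran test_pr_checks+wakeup_runner+review_gate+review_gate_e2e+wakeup_plan+controller_actions; 309 passed.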
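
A bounded, sleep-free retry of the kind r11 describes might look like the sketch below. It is an assumption-laden illustration rather than the actual `_run_api_read`: the runner signature, the `(returncode, stdout)` result shape, and the reading of "up to 3" as three attempts in total are all hypothetical.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical runner result: (returncode, stdout).
RunResult = Tuple[int, str]


def run_api_read(
    run: Callable[[Sequence[str]], RunResult],
    argv: Sequence[str],
    attempts: int = 3,
) -> RunResult:
    """Call run(argv) up to `attempts` times with no sleep; stop on first success."""
    result: RunResult = (1, "")
    for _ in range(attempts):
        result = run(argv)
        if result[0] == 0:
            return result
    # Persistent failure: hand back the last non-zero result so that the
    # caller can still fail closed (for example as ci_unavailable).
    return result
```

Because a successful first call returns immediately, the happy path issues exactly one command, which matches the "no behavior change" claim in the fix list.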



* hotfix r12: run_once no longer lets spawn-batch starve drain-only lifecycle actions, so ready PRs merge headlessly and the backlog can drain

Root cause (verified): on every tick wakeup_plan placed the SPAWN actions (implement intent + design-consensus #495/#496/#490/#491/#418, index 0-5) ahead of the lifecycle review_gate actions (#480/#473/#402, index 6-8); under the old run_once logic `if applied_spawns>0 and not is_spawn_action: break`, as soon as any spawn had been dispatched, the first lifecycle action broke the whole tick. design-consensus almost always has spawn work → review_gate/merge starved forever → #473 (approve/approve/comment → MERGE_WITH_COMMENTS, MERGEABLE + CI green) never merged, and the active count never dropped.

Fix (minimal, core dispatch loop):
- Introduce consumes_spawn_budget = is_spawn_action or _uses_spawn_budget(action); when over budget, `continue` (not break) so later drain-only lifecycle actions are still reached.
- _uses_spawn_budget: dispatch_reviewers=True; review_gate=True only when the decision is FIX (which dispatches a fix worker); otherwise review_gate (MERGE/MERGE_WITH_COMMENTS), close, and other drain-only actions=False → they do not consume spawn budget and are still evaluated and executed after the floor is filled.
- Keep r6's spawn launch-failure break/retry and the dispatch_design_consensus no-intents skip-and-continue; truth table/head-binding/CI-green/mergeable safety semantics are unchanged.

Result: each tick both fills the floor with spawns and evaluates drain-only lifecycle actions such as merge/close, so the backlog can drain.

Verification: the controller independently re-ran test_wakeup_runner+review_gate+review_gate_e2e+wakeup_plan+controller_actions+pr_checks; 309 passed; adds test_wakeup_runner_lifecycle_review_gate_not_starved_after_spawn_batch.
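
The difference between `break` and `continue` in the dispatch loop can be illustrated with a reduced model. The `Action` dataclass and this `run_once` are hypothetical stand-ins that only mirror the budget rule described above; the real loop also handles launch failures and blocked actions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Action:
    """Hypothetical planned action."""
    name: str
    is_spawn: bool = False
    # True for e.g. dispatch_reviewers, or a review_gate whose decision is FIX.
    uses_spawn_budget: bool = False


def run_once(actions: List[Action], spawn_budget: int) -> List[str]:
    """Apply actions in order; over-budget spawns are skipped, not tick-ending."""
    applied: List[str] = []
    spawns = 0
    for action in actions:
        consumes_spawn_budget = action.is_spawn or action.uses_spawn_budget
        if consumes_spawn_budget:
            if spawns >= spawn_budget:
                continue  # keep scanning so drain-only actions are still reached
            spawns += 1
        applied.append(action.name)
    return applied
```

In this model a merge-type review_gate placed after six spawn actions is still applied in the same tick, whatever the budget.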



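* hotfix r13: fix WakeupRunner._run_command injecting --repo before the gh subcommand, which made every gh call malformed; this was the root cause of the review→merge stall

Root cause (confirmed by measurement): _run_command inserted --repo at full[1:1] (before the subcommand) for every gh command. gh does not accept --repo as a pre-subcommand global flag, and gh api does not take --repo at all (the repo is in the URL): `gh --repo O/R api repos/O/R/pulls/473` → rc=1 "unknown flag: --repo". As a result every gh call from wakeup-runner was malformed: CI reads (gh api pulls/check-runs via PrChecksProjection(runner=_run_command)) → ci_unavailable:pull_api_failure; mergeability (gh pr view) → mergeability_unavailable; live-state and others likewise. → the review gate could never read CI/mergeable → #473 (approve/approve/comment=MERGE_WITH_COMMENTS, green) and #402/#480 never landed headlessly. r11's retry merely repeated the same malformed command. A direct PrChecksProjection() or direct gh call uses the plain default runner (no injection) and therefore worked, which masked the bug.

Minimal fix: _run_command injects --repo only for `gh pr`/`gh issue`, at the legal position after the subcommand (and its verb); `gh api` is left untouched; it never injects pre-subcommand; an existing --repo is not duplicated.

Verification: the controller independently re-ran test_wakeup_runner+review_gate+review_gate_e2e+pr_checks+wakeup_plan+controller_actions; 312 passed; adds an argv-shape test + e2e (a stubbed subprocess.run rejects the malformed `gh --repo ... api`, and the ready PR still merges).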



* hotfix r14: managed draft PR ready-then-merge; the review gate no longer hard-blocks on pr_draft

Root cause (surfaced once r13 fixed the gh calls): review_gate's _review_gate_mergeability_error treated isDraft=True as a hard block, WAIT_OR_REDISPATCH:pr_draft. But this loop is draft-until-consensus: a PR is opened as a draft until review consensus, and merge_pr does ready-then-merge when the decision is MERGE/MERGE_WITH_COMMENTS. So a managed draft with a green gate (e.g. #473 approve/approve/comment) was stuck at pr_draft forever and never landed headlessly.

Minimal fix:
- wakeup_runner: remove the isDraft→pr_draft hard block from the mergeability check (it still blocks mergeable!=MERGEABLE, i.e. CONFLICTING/DIRTY).
- controller_actions.merge_pr: for a live draft PR, verify crnd:lifecycle:managed before gh pr ready (a non-managed draft is not readied; it writes CONTROLLER_ACTION_BLOCKED:target-not-managed and returns 2); the existing ready-then-merge is kept.
- Truth table/head-binding/CI-green/conflicting-blocks safety semantics are unchanged; readying is limited to managed PRs.

Verification: the controller independently re-ran test_wakeup_runner+review_gate+review_gate_e2e+controller_actions+wakeup_plan+pr_checks; 316 passed; adds managed-draft merge + a real plan→runner→controller ready-then-merge e2e + conflicting/non-managed safety cases.



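* #409: remove the .refactor-loop/host.env runtime fallback; CONSENSUS_RND_HOST_ENV is the sole host fact locator (fail-closed when missing); migrate test fixtures that depended on the legacy fallback (#473)

* hotfix r15 (sshx consensus B, part 1): unify gh argv construction in a single build_gh_argv helper, preventing a recurrence of the r13-class --repo bug

Hardening step 1 of the unanimous sshx consensus B: consolidate the scattered, duplicated gh --repo handling into build_gh_argv(slug, argv) in codex_refactor_loop/gh_invoke.py: gh api never gets --repo (the repo is in the URL); gh pr/issue get --repo <slug> only after the subcommand and only when it is missing; never pre-subcommand. The three API call sites, wakeup_runner._run_command / ControllerActions.gh / PrChecksProjection, all go through this helper, and their inline --repo heuristics are deleted. Behavior is unchanged (including the r13 fix).

Verification: characterization + source-regression are locked in test_gh_invoke.py (9 tests); the controller independently re-ran test_gh_invoke+pr_checks+wakeup_runner+review_gate+review_gate_e2e+controller_actions+wakeup_plan; 326 passed; running pr_checks directly remains --help compatible.

Note: this is hardening (preventing recurrence) and does not directly raise concurrency; the concurrency bottleneck is in the upstream implement→publish stage, which is handled separately next.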
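
The argv rules stated for r13/r15 can be sketched as a pure function. This is only an illustration of those rules, not the contents of gh_invoke.py: the body, the `REPO_SCOPED_SUBCOMMANDS` constant, and the choice to insert right after the verb are assumptions.

```python
from typing import List, Sequence

# Subcommands that receive --repo in this sketch (per the rules above).
REPO_SCOPED_SUBCOMMANDS = {"pr", "issue"}


def build_gh_argv(slug: str, argv: Sequence[str]) -> List[str]:
    """Return argv with `--repo <slug>` added only where the rules allow it."""
    full = list(argv)
    if len(full) < 3 or full[0] != "gh":
        return full
    if full[1] not in REPO_SCOPED_SUBCOMMANDS:
        return full  # e.g. `gh api`: the repo is already part of the URL
    already_scoped = any(
        arg in ("--repo", "-R") or arg.startswith("--repo=") for arg in full
    )
    if already_scoped:
        return full  # never duplicate an explicit repo flag
    # Insert after the subcommand and its verb, never before the subcommand.
    return full[:3] + ["--repo", slug] + full[3:]
```

For example, `build_gh_argv("O/R", ["gh", "pr", "view", "473"])` yields `gh pr view --repo O/R 473`, while a `gh api repos/O/R/pulls/473` call passes through unchanged.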



* hotfix r16 (sshx consensus): open up implement→publish to unblock the concurrency bottleneck: markerless/stale implement re-dispatch + safety-gated publish

sshx codex-cli triplet consensus (minimal/structural/delete + meta-judge): 10 design-solving issues had reached consensus but were stuck at implement→publish (implement had run earlier but was markerless/uncommitted/on a stale base/without a PR; dispatch_consensus_implementation was suppressed by local_iter_branch, and publish was suppressed by verified_pr_head_unavailable) → no new dispatchable work → concurrency ~1.

Fix (wakeup_plan/wakeup_runner/controller_actions/git.py):
- Success-aware re-dispatch: when the implement branch/worktree exists but there is no valid clean IMPLEMENT_DONE:ok marker (markerless/failed), or the base is stale → re-dispatch; fresh_safe_worktree resets/recreates the canonical worktree to the current origin/$INTEGRATION_BRANCH; only genuinely in-flight work / an existing clean PR / a clean marker awaiting publish suppress it.
- Publish gate (strictly guarding against publishing stale/wrong-scope work): publish_implementation_output does commit→push→open PR→dispatch reviewers only with a canonical identity (refactor/iter<issue>-<candidate> + the worktree's actual branch matches) + a fresh integration base (merge-base==origin/integration; a stale base blocks) + a managed open issue + no duplicate PR + a non-empty scoped diff; otherwise it blocks.
- wakeup_plan replaces the verified_pr_head suppression with canonical_implementation_identity/single_linked_managed_issue/clean_scoped_diff.

Verification: the controller independently re-ran wakeup_plan+wakeup_runner+review_gate+review_gate_e2e+controller_actions+pr_checks+gh_invoke; 330 passed.

Known follow-up: the worker changed the PR title/body/commit text to English, violating the "GitHub artifacts in Chinese" policy; the next commit restores Chinese.



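* hotfix r17: clear terminal non-publishable implement logs before a fresh consensus-implement re-dispatch, so stuck implements actually start

Follows r16: r16 re-dispatches markerless/stale implements (resetting the worktree), but the old EXIT=1/markerless implement log still occupied the spawn's target_log_absent precondition → the fresh implement did not start (the same "suppress on existence" class, at the implement-spawn-log layer).

Fix (controller_actions, success-aware): before a fresh consensus-implementation dispatch writes the spawn intent, clear only terminal non-publishable implement logs (EXIT!=0 failure / EXIT=0 without an IMPLEMENT_DONE:ok marker); leave genuinely in-flight logs (no EXIT) and clean publish-ready logs (EXIT=0 + clean marker) untouched. A stuck markerless/failed implement can then really launch after re-dispatch, run on a fresh base, and produce a clean marker → r16's gated publish.

Verification: the controller independently re-ran wakeup_plan+wakeup_runner+review_gate+review_gate_e2e+controller_actions+pr_checks+gh_invoke; 333 passed.

Design note: the concurrency bottleneck is fixed with deterministic logic (ready=clean EXIT=0+marker; parallel=disjoint scope_paths), without introducing LLM-in-loop dispatch decisions (daemon-first; the sshx codex trio unanimously voted revise: an LLM may run at most in shadow mode and does not enter the decision loop).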



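* hotfix r18: converge the implement lifecycle onto a single success-aware predicate + the runner path clears stale logs itself; headless self-heals and manual log clearing is no longer needed

Follows r16/r17 (r17 put the log clearing in the suppressed dispatch path, so it never ran). Adds implement_lifecycle.py as the single source of truth, classify_implement_attempt → in_flight | publish_ready | redispatch, shared by plan/runner/controller:
- On the runner's spawn-apply path, _spawn_log_suppresses_retry suppresses only for in_flight|publish_ready; redispatch (EXIT!=0/markerless/stale-base) is not suppressed, and _clear_redispatchable_spawn_log clears the stale terminal log at the point where the spawn is actually applied → the fresh implement really starts (fixes r17's misplacement, no manual clearing).
- publish happens only for publish_ready (clean IMPLEMENT_DONE:ok + canonical identity + fresh base + non-empty scoped diff); a clean marker on a stale base becomes redispatch instead of a deadlock (#421 edge).
- Deterministic logic (not LLM-in-loop; daemon-first); merge-gate/#191/#322/#396 safety semantics are unchanged.

Verification: the controller independently re-ran wakeup_plan+wakeup_runner+review_gate+review_gate_e2e+controller_actions+pr_checks+gh_invoke; 339 passed.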
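
A three-way predicate in the spirit of r18 could be reduced to the sketch below. The signature is hypothetical and deliberately simplified: the real classify_implement_attempt is described as also checking canonical identity and a non-empty scoped diff, which are folded into the boolean inputs here.

```python
from typing import Optional


def classify_implement_attempt(
    exit_code: Optional[int],
    has_ok_marker: bool,
    base_is_fresh: bool,
) -> str:
    """Return 'in_flight', 'publish_ready', or 'redispatch'.

    exit_code is None while the log has no terminal EXIT line yet.
    """
    if exit_code is None:
        return "in_flight"
    if exit_code == 0 and has_ok_marker and base_is_fresh:
        return "publish_ready"
    # EXIT != 0, markerless, or a clean marker on a stale base.
    return "redispatch"


def spawn_log_suppresses_retry(state: str) -> bool:
    """Only in-flight or publish-ready attempts suppress a fresh spawn."""
    return state in ("in_flight", "publish_ready")
```

Keeping the decision in one function means the planner, runner, and controller cannot drift apart on what counts as "done".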



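* Fix the phase9-router triplet-evidence-invalid false positive that wedged headless design-consensus

When the issue source snapshot injected into the solver prompt contained the audit trail of the previous design-consensus round (peer solver log paths), _peer_solver_reference_violation scanned the whole prompt, misjudged it as a peer-isolation violation, and failed closed by refusing to dispatch the meta-judge, so every issue whose body contained historical consensus records was wedged permanently.

Fix: the peer-isolation scan covers only router-controlled regions, stripping the issue-author content between '## Issue source snapshot' and '## Full solver template'; real peer references injected by the router header / solver template are still blocked. Adds two behavior tests covering snapshot exclusion and router-region-still-blocked.

Measured on the dogfood headless run: after reload, 418/437/474/490/491/493/494/495 all resumed self-driven solver→judge→consensus, and #494 has self-driven to implement.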
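
The region-limited scan can be sketched with plain string slicing. The two heading strings come from the commit message above; the function names, the substring match on peer paths, and the choice to scan the whole prompt when the closing heading is absent are assumptions made for this illustration.

```python
from typing import Iterable

SNAPSHOT_START = "## Issue source snapshot"
SNAPSHOT_END = "## Full solver template"


def router_controlled_text(prompt: str) -> str:
    """Drop the issue-author snapshot region; keep router-controlled text."""
    start = prompt.find(SNAPSHOT_START)
    if start == -1:
        return prompt
    end = prompt.find(SNAPSHOT_END, start)
    if end == -1:
        # Assumption: without a closing heading, stay conservative and
        # scan everything rather than guess where the snapshot ends.
        return prompt
    return prompt[:start] + prompt[end:]


def peer_reference_violation(prompt: str, peer_paths: Iterable[str]) -> bool:
    """True if any peer solver path appears outside the snapshot region."""
    scanned = router_controlled_text(prompt)
    return any(path in scanned for path in peer_paths)
```

A peer log path quoted inside the snapshot is ignored, while the same path in the router header or solver template still trips the check.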



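* CLAUDE.md: add the consensus-rnd:foundational-invariants sentinel block

The dogfood bootstrap's check-project-rules probe requires PROJECT_RULES to contain this fixed-point block; when it is missing the probe fails closed and blocks the headless restart. The content is the canonical FI-001~007 generated by the probe; it mirrors the existing design philosophy and introduces no new constraints.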



* Fix the headless gap where a markerless implement left publish stuck

When an implement worker exits cleanly (EXIT=0) but writes the IMPLEMENT_DONE marker only into the run artifact (runs/implement-issue-<id>.md) rather than at the end of the log (where the codex stdout marker lands is unreliable), completed_marker_actions scans only the log for markers → markerless → no publish action is generated → the design→implement→PR chain stalls after implement. Observed on #421: EXIT=0 + the artifact has IMPLEMENT_DONE:issue-421:ok + 9 files staged, yet there is no PR and the label is still design-solving.

Fix: for a clean-exit implement-issue log that is markerless, fall back to reading the IMPLEMENT_DONE marker at the end of the run artifact, reusing the artifact-first pattern that review verdicts already use. Narrow scope (implement-issue + clean-exit only), with 2 behavior tests; the full wakeup_plan suite of 138 tests is green.



* Complete markerless implement publish: revalidation also reads the run artifact marker

8aeea89 fixed the wakeup_plan detection side, but wakeup-runner's source-marker revalidation (_source_log_has_clean_marker) still scanned only the log, so a markerless implement publish was still rejected (clean_exit_marker_missing). Completion: for a clean-exit implement-issue log that is markerless, revalidation falls back to the marker in runs/implement-issue-<id>.md, consistent with the detection side; with a behavior test, the full wakeup_runner suite of 76 tests is green.

Note: the dogfood run confirmed that markerless is a systemic root cause (it also affects readiness re-dispatch, review verdicts, and other places) and should be unified systematically (see #491 logging/observability and the later marker-reading consolidation); this commit only completes the implement publish chain.



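* More detailed logging: the wakeup-runner tick shows counts for every status + details of masked blocked/skipped actions

The old _wakeup_tick_action looked only at results[0]; when a tick processed several actions, a blocked one (e.g. a failed publish) was masked by a successful spawn and was completely invisible; graphql-backoff always showed a misleading 'remaining=unknown'. During dogfood debugging this forced a trip through the ledger every time to learn what had really happened.

Change: the tick log shows the full status counts [applied=N,blocked=M,...] + the concrete reason+action of masked blocked/skipped results (graphql-backoff is reported separately as a whole-tick gate, with unknown removed); the dispatched headline is kept. Tests are updated to reflect the improved visibility; the full wakeup_runner suite of 76 tests is green.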



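* Complete the third link of markerless implement: the readiness predicate reads the run artifact marker

classify_implement_attempt (r18's single success-aware predicate) scanned only the log for IMPLEMENT_DONE:ok; a markerless log (marker in the artifact) → redispatch:markerless → readiness did not suppress → the re-dispatch overwrote finished work (dogfood observations: #421 re-dispatched 4 times; #493 IMPLEMENT_DONE:ok→re-dispatch→:partial, losing work).

Completion: for a markerless log, fall back to IMPLEMENT_DONE:ok in runs/implement-<cluster>.md, consistent with detection (8aeea89) / revalidation (dfacc64). Narrow scope + r17/r18 compatible: only :ok is accepted; partial/failed/truly markerless (no artifact) attempts still take redispatch recovery. The existing markerless-redispatch tests (304 tests across wakeup_plan/wakeup_runner/controller_actions) are all green, and 2 behavior tests are added.

This completes the three markerless links of implement→PR (detection/revalidation/readiness); markerless handling on the review chain and a systematic unifying abstraction are handed to the #499 consensus.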



* #500: design-consensus orchestration redesign: phase9-router is the sole dispatch owner, and wakeup-runner's duplicate dispatch is removed

Lands the #500 r4 consensus (structural framing, no-new-schema):
- phase9-router becomes the sole owner of design-consensus worker intents, adding the
  META_RESOLVED:re-design → marker.round+1 three-way solver route
- Remove wakeup-runner's dispatch_design_consensus (eliminating the duplicate-dispatch helper_exit:3)
- wakeup-runner keeps only closed action apply (drop→close, consensus→implement)
- After readiness, dispatch_consensus_implementation transitions to crnd:phase:implementing
- No new schema (reuses HARNESS_SPAWN_INTENT)

Fixes the headless P0: the reflector's META_RESOLVED had no consumer + phase9-router/wakeup-runner duplicate dispatch caused a 0-codex deadlock. 14 files +207/-282; verification: 471 tests all green.

The consensus was reached by the controller manually driving design-consensus r1→r4 (while the daemon was stopped).



* #501 fix r1: add the label-transition failure-branch test for dispatch_consensus_implementation

tests reviewer reject: the new label transition (gh issue edit→implementing) tested only the success path and lacked a failure-branch test asserting that dispatch stops before worktree/render/intent when gh issue edit returns non-zero. This fix adds that test.



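* Fix headless deadlock: the label removal set keeps only canonical crnd:* labels; historical aliases are deleted

gh issue/pr edit --remove-label fails the whole command with rc=1 if a label does not exist in the repository. ISSUE_LABELS_REMOVE/PR_LABELS_REMOVE included cleanup_aliases (emoji/legacy names) that the repository does not maintain, so _move_issue_to_implementing_phase failed permanently, consensus→implement dispatch stalled, and wakeup_runner's self-drive deadlocked at 0 codex.

Per the maintainer's instruction, historical label compatibility is dropped (historical labels are no longer handled): the removal sets list only canonical crnd:* phase/human/stuck labels. A new source-regression locks both sets as canonical-only with no aliases.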



* Fix #500 regression: the phase9-router production ctx lost GH_REPO_SLUG, killing all design-consensus intake

#500 (e733b1a) changed the ctx construction in Phase9Router.__init__ from
`LoopContext.load(repo_root=repo_root)` to
`LoopContext.load(repo_root=repo_root, env={"REPO_ROOT": str(repo_root)})`.
The restricted env does not read os.environ, so for the daemon (started by sourcing host.env) the ctx has gh_repo_slug=None → the first line of `_open_design_consensus_issues` fails closed and returns [] → DesignConsensusIssueIntake silently never dispatches → headless cannot start design-consensus for any issue; `_require_open_source_issue` also fails for lack of a slug (marker routes such as the #490 redesign get stuck at phase9-source-state-unavailable).

Fix: revert to `LoopContext.load(repo_root=repo_root)`, which reads the process environment to resolve host facts. Tests always use an explicit ctx via `Phase9Router(ctx=...)` and do not depend on the restricted env; a new regression test locks that the production repo_root construction path resolves GH_REPO_SLUG.



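* Add STALE_REVIVAL_HOURS: a stuck implement that has not advanced for longer than the threshold is re-triggered automatically (configurable via env, default 3h)

In headless mode, a partial/failed/markerless implement log leaves the already-queued dispatch-consensus-implementation spawn intent permanently stuck on the runner's target_log_absent precondition (and dispatch_consensus_implementation, which clears the log, is in turn suppressed by pending_implement_intent) → a stuck implement does not self-heal (e.g. the 10h partial-wedge of #421/494/493/474/498).

Adds the host-tunable STALE_REVIVAL_HOURS (default 3; non-positive/invalid values fall back to 3h). When wakeup_plan projects an implement spawn intent, if the target implement log is redispatchable (partial/failed/markerless/stale-base) and its mtime is past the threshold, the log is cleared → target_log_absent passes → the implement is re-dispatched automatically. Safety gates: in-flight (no terminal EXIT) is never cleared, publish-ready (:ok+diff) is not cleared, and the age gate prevents clearing a log that has only just finished.

The env is registered in the SKILL.md host env surface matrix + host.env.example; covered by 5 behavior tests + a matrix↔example symmetry test.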
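
The fallback rule for STALE_REVIVAL_HOURS (default 3, non-positive or invalid values fall back to 3h) can be sketched as a small parser. The name stale_revival_seconds appears later in this log, but this body, the optional `env` parameter, and the acceptance of fractional hours are assumptions.

```python
import math
import os
from typing import Mapping, Optional

DEFAULT_STALE_REVIVAL_HOURS = 3.0


def stale_revival_seconds(env: Optional[Mapping[str, str]] = None) -> float:
    """Parse STALE_REVIVAL_HOURS; invalid or non-positive values mean 3h."""
    source = os.environ if env is None else env
    raw = source.get("STALE_REVIVAL_HOURS", "")
    try:
        hours = float(raw)
    except ValueError:
        hours = DEFAULT_STALE_REVIVAL_HOURS
    # Rejects 0, negatives, NaN and infinity in one place.
    if not math.isfinite(hours) or hours <= 0:
        hours = DEFAULT_STALE_REVIVAL_HOURS
    return hours * 3600.0


def past_age_gate(log_mtime: float, now: float, threshold_seconds: float) -> bool:
    """True once the log has been idle longer than the threshold."""
    return (now - log_mtime) > threshold_seconds
```

Passing the environment as a mapping keeps the parser testable without mutating the process environment.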



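* Extend stale-revival to cover dead codex: a log that is in_flight with no EXIT, older than the threshold, and absent from the liveness list is also re-triggered

The most common headless wedge is a codex/supervisor dying midway (e.g. cut off when the daemon is killed); the log stays in_flight forever (no terminal EXIT) → classify always returns in_flight → it is never redispatched. Previously only redispatchable (partial/failed/markerless) logs were cleared, missing this half.

Now, when the log mtime is past the threshold, _revive_stale_redispatchable_implement_log first clears it if it is redispatchable; otherwise, if classify returns in_flight and the monitor's liveness list has no live process for that log, it is judged a dead worker and cleared. Safety: spawn-codex's no-output stall window is far smaller than the default 3h, so a log still in_flight past the threshold must belong to a dead supervisor; the monitor liveness check adds a second safeguard, so a live codex is never killed. Adds 3 behavior tests (dead codex revived / fresh log untouched / live process untouched).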



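* Add a manual trigger: consensus-rnd-cli revive-implements (re-trigger stuck implements immediately, without waiting for the threshold)

_revive_stale_redispatchable_implement_log gains a force parameter: it skips the stale_revival_seconds() age gate, but under force an in_flight log is cleared only after the monitor liveness check proves there is no live process (a running codex is never killed). Adds force_revive_stuck_implements(), which scans all implement-issue-*.log files and force-revives them, and the new CLI command revive-implements (authority delete-log). The auto path with force=False is unchanged. Adds 4 behavior tests.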



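* Fix env leak: the test suite isolates the ambient host.env at import time (running TEST_CMD in a worktree no longer fails spuriously)

When an implement worker spawned by the daemon runs the full TEST_CMD in a worktree, the inherited relative CONSENSUS_RND_HOST_ENV=.config/consensus-rnd/host.env leaks into the temp-repo LoopContext.load(repo_root=tmp) → it resolves to tmp/.config/..., which does not exist → a spurious LoopContextError (measured: concurrency_monitor 48 + restart_daemons 23 + … ≈151 errors) → the full suite never goes green → implement is always :partial → it never reaches a PR.

unittest discover imports all test_*.py files during collection before running anything; the new test_aaa_host_env_isolation.py (first in alphabetical order) pops the host-injected variables at import time, so the whole suite is isolated before any test runs. Verification: full discover with the locator set → exit 0, all green (previously 151 errors under the same conditions); targeted before/after runs confirm 48/23/1 errors → OK. A pure test module; production/daemon code is untouched.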
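
An import-time isolation module of this kind might look like the sketch below. Only CONSENSUS_RND_HOST_ENV is named in the commit message; the tuple of variables, the helper name, and the returned mapping are hypothetical.

```python
import os
from typing import Dict, MutableMapping

# Assumption: the real module may pop more host-injected variables than this.
HOST_INJECTED_VARS = ("CONSENSUS_RND_HOST_ENV",)


def isolate_host_env(environ: MutableMapping[str, str]) -> Dict[str, str]:
    """Remove host-injected variables and return what was removed."""
    removed: Dict[str, str] = {}
    for name in HOST_INJECTED_VARS:
        if name in environ:
            removed[name] = environ.pop(name)
    return removed


# Runs when the module is imported, i.e. during unittest collection,
# before any test in the suite executes.
isolate_host_env(os.environ)
```

The approach relies on the file sorting first among the test modules, so that the pop happens before any other module reads the environment.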



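* Fix the markerless keystone: add a companion-artifact fallback for solver/judge markers

A solver/judge codex sometimes exits with EXIT=0 but the SOLVER_DONE/META_JUDGE_DONE marker lands only in the run artifact, not at the end of the log, so phase9-router judges the triplet incomplete and never dispatches the judge, and the design-consensus round stalls (observed on #505 r1 structural: 214 embedded occurrences, 0 at line start).

Extends the existing implement artifact-marker fallback to solver/judge: when a clean-exit log has no standalone marker at its end, fall back to the companion run artifact runs/<log-stem>.md. Purely additive (behavior is unchanged when the log has a marker), prefix-scoped, with an exact filename regex, and reusing the safe standalone-marker extraction (embedded/echoed markers are not trusted). It does not weaken consensus (3 marked solvers + a judge are still required; only the marker source becomes more robust). phase9/router.py + wakeup_plan.py + behavior + source-regression tests; full suite 1367 green.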



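* Fix publish stale-base having no recovery + stop :ok churn: let implementing issues open a draft PR

When implement is :ok but the worktree base is behind origin/$INTEGRATION_BRANCH, publish used to hard-block (publish_implementation_stale_base) and never recovered; stale-revival in turn treated the stale-base :ok implement as redispatchable and kept re-dispatching into the same stale worktree → endless churn that filled the floor, with no real PR for a long time (the first PR after implement could never be opened).

Changed to: before publishing, merge origin/$INTEGRATION_BRANCH in the worktree to restore the base (on conflict, merge --abort + an explicit reason handed to the existing resolver/re-dispatch); a clean `:ok` is no longer churned by stale-revival (terminal success goes through publish-with-recovery and is not re-run); an :ok with no open PR creates a draft PR (draft before consensus, per #300) + Closes #N. The controller still owns the git topology, the worker does not commit/push, and the review-gate/merge truth table is not bypassed. controller_actions.py + wakeup_plan.py + wakeup_runner.py + behavior/source-regression tests; full suite 1374 green.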



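* Fix markerless implement: when EXIT=0 + a real diff has no clean marker, synthesize a publish candidate, ending the stale-revival empty re-dispatch loop

The publish build/test gate acts as the safety net.



* Fix the publish ordering bug + dispatch a codex fallback resolver on merge failure: commit the worker diff first, then merge the fresh base; a real conflict no longer wedges but dispatches a resolver, and the run resumes idempotently

The emptiness check now uses git diff HEAD; also fixes the same class of git diff problem in wakeup_plan stale-suppression.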



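* Implement issue #504 (#519)

* Fix incomplete headless review-fix prompt rendering: bind every placeholder (PR_NUMBER/title/round/branch/REVIEW_*_PATH, etc.), unblocking fix→merge for rejected PRs

Previously headless _dispatch_review_fix bound only FIX_OUTPUT_PATH; the remaining ${} placeholders were left unresolved, so the fix codex could not read the reject evidence.



* Implement issue #496 (#520)

* Fix the cd bug that dispatched the review-fix codex onto main: run it in the PR worktree instead, and fail closed if the worktree is missing

Previously _dispatch_review_fix used cd=repo_root → the fix modified main (polluting the daemon checkout) and never landed on the PR branch (so the rejected PR never merged).



* Implement issue #508 (#523)

* Fix wakeup-plan action ordering: review_gate/publish are ordered before new-work spawns, relieving spawn-budget starvation → rejected PRs can be fixed→merged



* CLAUDE.md: add constitutional engineering rules: exceptions must be raised and logged diagnosably, never swallowed/silenced; tests must assert real behavior, and meaningless tests are forbidden (maintainer instruction)



* Implement issue #503 (#521)

* Add general basic engineering rules, specifying object-oriented design requirements and implementation details

* Implement issue #514 (#524)

* Refine the general object-oriented design guidelines, adding core principles and implementation-level details

* Implement issue #494 (#526)

* Implement issue #517 (#528)

* Implement issue #505 (#510)

* Implement issue #505

* #510 fix r1: add phase9-router behavior test coverage (responding to the tests reviewer reject)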



---------



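* Fix the headless gap where FIX_DONE does not commit the fix output: wakeup-runner commits + pushes the fix worktree before the FIX_DONE re-review

In headless mode the fix codex produces FIX_DONE but the worker never commits; FIX_DONE routes straight to dispatch_reviewers, so the fix output stays uncommitted in the worktree forever, reviewers keep reviewing the old head, rejects do not converge, and review→merge is blocked for every PR that needs a fix round. Mirroring the interactive controller: before a FIX_DONE-sourced dispatch_reviewers, if the fix worktree is dirty, run git add + commit + safe_push; if clean, it is a no-op. Includes 2 behavior tests (dirty commits / clean skips).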



* Implement issue #491 (#531)

* Implement issue #490 (#529)

* Implement issue #490

* Implement issue #490

* PR #529 review-fix output

* Implement issue #499 (#512)

* Implement issue #499

* PR #512 review-fix output

* Implement issue #474 (#532)

* Implement issue #474

* Implement issue #474

* Implement issue #474

* Implement issue #474

* PR #532 review-fix output

* Implement issue #476 (#518)

* Implement issue #476

* PR #518 review-fix output

* Implement issue #509 (#527)

* Implement issue #509

* Implement issue #509

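* spawn-claim recycle now uses an artifact-aware terminal marker: after the companion log is deleted by the 24h retention, the claim of a finished task can still be recycled (by reading the runs/ artifact) and no longer blocks re-dispatch permanently; a running task with no terminal marker is still not recycled (prevents double-spawn)

* Implement issue #421: pre-release clean-room integration test regime (#534)

* Implement issue #421

* rebase #421 onto fresh integration: resolve the test_cli_command_router conflict + verify

* Implement issue #493: test discipline constitutional amendment (behavior-first) (#535)

* Implement issue #493

* rebase #493 onto fresh integration: resolve conflicts + verify

* PR #535 review-fix r1: addressed tests+quality reject (behavior-first test adjustments)

* Implement issue #506 (#536)

* Implement issue #506

* Implement issue #506

* Implement issue #506

* Implement issue #506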

* PR #536 review-fix output

* Root fix for churn: wakeup-plan applies open_closing_pr/terminal/in-flight suppression to consensus-impl spawn intents (aligned with dispatch_consensus_implementation), so issues that already have a PR are no longer re-dispatched for implement

* Implement issue #507 (#525)

* Implement issue #507

* Implement issue #507

* PR #525 review-fix output

* Implement issue #516 (#530)

* Implement issue #516

* Implement issue #516

* #530 fix r1: add a behavior test for the ManagedWorkSnapshot fetch-failure stale-cache fallback (responding to the tests reviewer reject)

* PR #530 review-fix output

* PR #530 review-fix r2: typed ManagedWorkItem projection (addresses architect type-boundary reject + tests + contract-tests)

* PR #530 review-fix r3: anchor ManagedWorkSnapshot read surface in authorization mirror + cross-artifact source-regression (addresses architect governance reject)

* PR #530 review-fix r4: add canonical+legacy managed-label discovery regression test (addresses tests reject)

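* Fix the worker marker reader misjudging identical repeated markers as a conflict

codex often echoes the terminal marker 2-3 times (inside a diff + on prefixed lines), which made read_worker_terminal_marker return duplicate_or_conflicting; the review truth table never converged and every headless review→merge stalled. Changed so that identical repeated markers collapse into one valid marker, and duplicate_or_conflicting is returned only when 2+ distinct marker values exist. Covers _last_final_marker / _sentinel_adjacent_marker / _marker_from_companion_artifact, keeping malformed and foreign-prefix detection.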
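
The collapse rule (identical repeats are one marker; only 2+ distinct values conflict) can be sketched as below. The marker prefixes appear in this log, but the regular expression, the `(status, value)` return shape, and the restriction to whole-line markers are assumptions for illustration; the real reader also handles malformed and foreign-prefix markers.

```python
import re
from typing import Iterable, List, Optional, Tuple

# Assumed marker shape: a known prefix followed by colon-separated fields,
# alone on its line (prefixed or embedded echoes are ignored in this sketch).
MARKER_RE = re.compile(
    r"^(?:IMPLEMENT_DONE|REVIEW_DONE|FIX_DONE|SOLVER_DONE|META_JUDGE_DONE):\S+$"
)


def read_terminal_marker(lines: Iterable[str]) -> Tuple[str, Optional[str]]:
    """Return ('ok', marker), ('missing', None) or ('duplicate_or_conflicting', None)."""
    distinct: List[str] = []
    for line in lines:
        candidate = line.strip()
        if MARKER_RE.match(candidate) and candidate not in distinct:
            distinct.append(candidate)
    if not distinct:
        return ("missing", None)
    if len(distinct) > 1:
        return ("duplicate_or_conflicting", None)
    return ("ok", distinct[0])
```

Three echoes of `IMPLEMENT_DONE:issue-421:ok` therefore resolve to a single valid marker, while an `:ok` next to a `:partial` for the same issue is still rejected.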



* Reserve implementation PR for issue #537 (#540)

⟦AI:AUTO-LOOP⟧

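* Implement issue #511 (#522)

* Implement issue #511

* PR #522 review-fix output

* PR #522 review-fix output

* Remove the .refactor-loop run artifact that slipped into the PR (architect r8 reject)

resolve-pr522-merge.md is a one-off merge-resolution summary, a generated artifact that should not enter the durable diff (CLAUDE.md: "generated artifacts are not a source of truth"). The resolution notes are kept in the commit message / PR body.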



---------



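* Fix the early PR reservation crash (#511 regression, blocking all implement dispatch)

_reserve_implementation_pr opened the reservation PR using the worker's rich-body path, but that file does not yet exist at reservation time → FileNotFoundError (not caught by except RuntimeError) → the wakeup-runner tick crashed → no consensus issue could be implemented → loop P0 wedge. Fix: before reservation, if the body is missing, write a placeholder body (## issue #N 实现 + Closes + sentinel; publish rejects the placeholder, forcing the worker to produce a real body); reservation is idempotent (an existing open PR is returned directly; an existing remote head is reset to integration + force-with-lease push); OSError is caught.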



* Reserve implementation PR for issue #543 (#544)

⟦AI:AUTO-LOOP⟧

* Reserve implementation PR for issue #541 (#546)

⟦AI:AUTO-LOOP⟧

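* revert early-PR reservation (#507): no empty placeholder PR is opened any more; a PR is opened only when publish has real content

dispatch_consensus_implementation no longer reserves/opens a PR; publish_implementation_output returns to pre-#507 behavior: only when implement produces a non-empty diff does it commit + push + open_pr_with_label (using the worker's real title/body, validated as non-placeholder); an empty diff opens no PR, merges nothing, and closes no issue. The placeholder mechanism (_reserve_implementation_pr / _placeholder_*) is deleted, and the early_pr_missing projection in wakeup_plan/runner is changed. Root cause: empty placeholder PRs were merged empty by the reviewer before implement publish, spuriously closing #537/#541/#543. Full suite: 1496 passed.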



* Implement issue #542: remove the redundant head_sha from the redispatch payload (#547)

Remove the duplicate head_sha payload from review redispatch

* hotfix: a stale consensus-implement intent pointing at a deleted worktree no longer suppresses fresh re-dispatch

The pending_implement_intent branch of consensus_implementation_suppressed_reason
gains a worktree-existence guard (wired to the existing
_canonical_consensus_worktree_exists): suppress only when the stale pending
intent's canonical worktree still exists. After the early-PR reservation revert
(#507), the pending implement intents for #537/#541/#543 pointed at deleted
worktrees; the old logic suppressed fresh re-dispatch while wakeup-runner kept
spawning the stale intent into a dead cd, producing an os error 2 / EXIT=1 loop
that blocked the autonomous consensus→implement chain. Now a missing worktree
lets fresh dispatch through to rebuild the worktree.

Adds test_wakeup_plan cases (missing worktree does not suppress / existing
worktree still suppresses) and updates the test_controller_actions subTest that
locked the old behavior (the pending case creates a canonical worktree to keep
the reject-before-fresh_safe_worktree intent). Full suite: 1557 passed.



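* hotfix: a spawn-claim lock whose holder pid is dead can be recycled (self-healing, prevents a permanent HELD after a cleared log)

_existing_claim_is_recyclable gains a pid-liveness fallback: when the holder pid
recorded in the lock is dead (os.kill(pid, 0) raises ProcessLookupError) the
lock can be recycled. The existing terminal-marker recycling and live-spawn
protection (pid alive + no marker is still not recycled) and the
metadata-mismatch fail-closed are kept. Root cause: a failed implement (EXIT=1)
leaves a stale lock; dispatch_consensus_implementation clears the log on
re-dispatch → #490 spawn-claim recycles only when the log has EXIT= → a cleared
log has no marker → the dead process's lock cannot be recycled → permanent
SPAWN_CLAIM_HELD deadlock, blocking consensus→implement re-dispatch
(#537/#541/#543). Adds 7 behavior tests. Full suite: 1560 passed.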
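
A pid-liveness probe of this kind can be sketched as follows, assuming a POSIX host where signal 0 checks for existence without delivering a signal. The helper names `pid_is_alive` and `claim_is_recyclable` are hypothetical, and the sketch leaves out the metadata-mismatch check mentioned above.

```python
import os


def pid_is_alive(pid: int) -> bool:
    """Probe a process with signal 0, which tests existence without signalling."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process: the lock holder is dead
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def claim_is_recyclable(holder_pid: int, has_terminal_marker: bool) -> bool:
    """Hypothetical rule: recycle on a terminal marker, or when the holder is dead."""
    return has_terminal_marker or not pid_is_alive(holder_pid)
```

A live holder without a terminal marker stays protected, which matches the live-spawn protection the commit says it keeps.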



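* hotfix: publish_implementation_output accepts an implementation diff the worker already committed

_require_publish_implementation_diff previously looked only at
git diff HEAD --quiet (uncommitted changes); once the worker committed on its
own (on its own branch, isolated and harmless) the working tree was clean →
misjudged as no_diff → no PR opened. Now either uncommitted changes OR a
committed delta of HEAD relative to the integration base (merge-base..HEAD
non-empty) counts as a publishable diff; _commit_publish_implementation_diff
skips the commit and returns success when the working tree is already clean
(worker already committed), and no longer misreports publish_commit_failed.
Fixes the #541-class failure worker-commit→no_diff→no PR + repeated daemon
dispatch. Adds a behavior test. codex full suite passes with DISCOVER_STATUS:0.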



* Implement issue #437: bounded cleanup of historical .refactor-loop generated artifacts and stale worktrees (#548)

* #437: add a canonical RuntimeRetention owner; log-retention is demoted to a one-version alias; cleanup covers generated files + pending-events compaction + stale worktrees; CLAUDE.md gains a narrow #437 carveout

* PR #548 review-fix output

* PR #548 review-fix output

* PR #548 review-fix output

* PR #548 review-fix output

* PR #548 review-fix output

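* hotfix: autonomously resolve stale-base conflicts on CONFLICTING managed review PRs

Wires the existing but orphaned prompts/rebase-resolve.md into the headless
path: when a managed review PR has mergeable=CONFLICTING (its base is behind),
the controller runs git merge --no-commit --no-ff origin/<integration> in the PR
worktree. If the merge is clean it commits + pushes directly; if there are
conflicts it dispatches a rebase-resolve codex (resolve + stage inside the
isolated worktree, commit/push forbidden), and after REBASE_RESOLVE_DONE the
controller commits and safe_push pushes only the PR head. The result then goes
through CI + review-gate. Adds dispatch_pr_rebase_resolve /
commit_push_resolved_pr_rebase (controller_actions) + the wakeup_plan
projection + the wakeup_runner application.

Safety: touches only refactor/iter<N>-* PR branches; an explicit guard refuses
to touch integration/review_base; unresolved conflicts are checked with
diff-filter=U before committing; every abort path runs git merge --abort to
leave a clean worktree; idempotent (NOOP when the base is already included or a
resolve is in flight). Fixes the root cause of stale-base churn stalling the
loop permanently (one PR merges, every other PR becomes CONFLICTING, and nobody
resolves them). codex full suite: 1589 passed.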



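* hotfix: rebase_resolve_actions enriches live mergeability before projecting (fixes a production gap)

The autonomous-resolve projection read item.mergeable without enriching it
first, but a production GhItem comes from a snapshot whose mergeable is empty →
always skipped → CONFLICTING PRs never projected dispatch_pr_rebase_resolve →
the loop stayed stalled. Reuses _with_live_mergeability from the same file
(review_evidence_redispatch_actions already uses the same pattern). Adds a
regression test for the mergeable-empty-needs-live-fetch case (the original
test mocked mergeable and did not cover this production path).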



* hotfix: add the dispatch_pr_rebase_resolve kind to EXECUTABLE_ACTION_KINDS (executable end to end)

The else branch of _close_projection_action defaults to status_only=True; the
rebase action's kind stale-base-conflicting-pr was not in
EXECUTABLE_ACTION_KINDS → always status_only, the runner never applied it, and
CONFLICTING PRs were never resolved autonomously. Adds that kind (+ the
commit_push completion kind if needed) to the set; controller_action is already
in RUNNER_NAMED_HELPER_ACTIONS, so it then stays executable. Adds an end-to-end
wakeup-plan pipeline test (the original test covered only the single
rebase_resolve_actions function and missed the status_only gate in
closed-action-projection). codex full suite: 1593 passed.



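* hotfix: gate the rebase-resolve commit/dispatch projection on the actual worktree merge state (stops stale-marker noise)

commit_push_resolved_pr_rebase was previously projected from the persistent
REBASE_RESOLVE_DONE marker alone; after the worktree had committed + pushed
(#549) or aborted (#550/#551) the marker was still there → fail-closed noise on
every tick + repeated 0-codex (zero_streak alarm) + the stale marker blocking a
fresh resolve for #550/#551. New gate: an executable commit_push is projected
only when the worktree really has a merge in progress (MERGE_HEAD) and nothing
unmerged; otherwise nothing is projected. A PR whose worktree is clean but
which is still CONFLICTING also gets an executable dispatch_pr_rebase_resolve
again (a stale DONE marker no longer suppresses it permanently). Adds an
end-to-end test. codex full suite passes.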



* Implement issue #502: remove the legacy label compatibility layer: purge the full emoji/legacy alias set (cleanup_aliases / per-spec  (#549)

* Implement issue #502: remove the legacy label compatibility layer: purge the full emoji/legacy alias set (cleanup_aliases / per-spec

* Fix PR #549 review r3: adapt the #502 label purge to the current label code (contract-tests + architect/tests reject)

fix codex round-3 applied-4: align the #502 legacy label compatibility-layer
purge with recently added label code; fixes the contract-tests failure and the
architect/tests reject. Changes labels.py + SKILL.md + test_wakeup_plan.py.



---------



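* Fix headless review_gate ci_failed deadlock: move reject→FIX ahead of the ci/mergeability checks

_review_gate_decision previously ran the ci_error/mergeability checks before
reject>0→FIX → a PR with red CI returned WAIT_OR_REDISPATCH:ci_failed and waited
idly even when reviewers had rejected it, yet CI cannot turn green without a
fix → deadlock (#548/#549 both hit it and needed a manually dispatched fix).
Moves reject>0→FIX ahead of the ci/mergeability checks (still after the
reviewer-validity/head checks) → on reject, a fix is dispatched autonomously
even when CI is red (the fix addresses reject + CI together).
The MERGE/MERGE_WITH_COMMENTS decision is completely unchanged (still requires
ci-green + mergeable + reject=0 + approve>=1), so there is no wrong-merge risk.
Adds test_wakeup_runner end-to-end tests (reject+ci-red→FIX; reject=0+ci-red
still WAIT; happy path MERGE unchanged; stale-head+reject still WAIT). full
suite 1600 passed.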



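* Correct review_gate ordering: move the mergeability check ahead of reject→FIX (CONFLICTING never runs review/fix concurrently)

The previous fix (a125d19) moved reject→FIX ahead of ci+mergeability. That broke
the ci_failed deadlock but also routed CONFLICTING+reject PRs to FIX → running
concurrently with the independent dispatch_pr_rebase_resolve in the same
worktree (#550 showed a 98-file mid-merge + a concurrent reviewer). Moves the
mergeability_error check ahead of reject→FIX: CONFLICTING→WAIT (rebase-resolve
first, no concurrent review/fix); MERGEABLE + CI red + reject→FIX (deadlock fix
kept); ci_error stays after reject→FIX (WAIT only for pure CI red with no
reject). MERGE decision unchanged. test_wakeup_runner 90 OK (+conflicting case).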
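
The resulting check order can be sketched as a small decision function. This is an illustrative model only: the `GateInput` fields and every reason suffix except `WAIT_OR_REDISPATCH:ci_failed` are assumptions, and the MERGE_WITH_COMMENTS branch is omitted.

```python
from dataclasses import dataclass


@dataclass
class GateInput:
    # All field names are illustrative, not the project's real data model.
    evidence_valid: bool  # reviewer evidence is valid and bound to the live head
    mergeable: bool       # False stands for mergeable=CONFLICTING
    ci_green: bool
    reject: int
    approve: int


def review_gate_decision(g: GateInput) -> str:
    if not g.evidence_valid:
        return "WAIT_OR_REDISPATCH:invalid_reviewer_evidence"
    if not g.mergeable:
        # Conflicts go to rebase-resolve first; no concurrent fix in the worktree.
        return "WAIT_OR_REDISPATCH:mergeability"
    if g.reject > 0:
        # Checked before CI so that red CI cannot starve the fix dispatch.
        return "FIX"
    if not g.ci_green:
        return "WAIT_OR_REDISPATCH:ci_failed"
    if g.approve >= 1:
        return "MERGE"
    return "WAIT_OR_REDISPATCH:no_approval"
```

Because MERGE is reachable only after every earlier check has passed, reordering the WAIT/FIX branches does not widen the set of inputs that merge.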



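* Implement issue #498: codex fallback adjudication/generation mechanism for headless daemon fail-closed points (narrow allowlist, not a general es (#550)

* Implement issue #498

* Implement issue #498: codex fallback adjudication/generation mechanism for headless daemon fail-closed points (narrow allowlist, not a general es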

* PR #550 review-fix output

* Add patrol-inspector patrol issue intake (#551)

* Implement issue #541

* PR #551 review-fix output

* PR #551 review-fix output

* Fix PR #551 contract-tests: align with patrol's 8th daemon + host.env in the safe_push/sync_dev tests

#551 patrol adds patrol_inspector_daemon (the 8th daemon) → the
test_anti_stop_restart expectation goes 7→8; test_controller_lib_safe_push /
test_sync_dev gain the CONSENSUS_RND_HOST_ENV host.env setting (per the
host-config contract the code needs a host.env locator; rebase-resolve had
dropped the base's newer test version). Assertions are not weakened (the
daemon-count assertion is strengthened to 8). full discovery 1634 OK + sshx 13 OK.



* Fix PR #551 contract-tests (clean-env): the e2e fixture sets META_ESCALATION_STUCK_HOURS to isolate the de-dup test

test_batch_and_per_task_lock_do_not_duplicate_spawn failed under a clean env
(CI): #506 meta-escalation fired because the fixture's mock state was judged
stuck, and the resulting repository-stalled RunnerResult polluted the de-dup
assertion (expected [] got [RunnerResult]). The fixture's host.env + env now set
META_ESCALATION_STUCK_HOURS=999999 to disable meta-escalation for this test and
keep it focused on de-dup behavior. Assertions are not weakened. Clean
discovery passes in full.



---------



* Fix publish-gap: implement readiness uses git status --porcelain to detect a staged diff

classify_implement_attempt / the wakeup_runner pre-publish check used
git diff --quiet, which detects only unstaged changes; after a worker correctly
ran git add -A the staged diff was misjudged as empty_scoped_diff, so a finished
implement (e.g. #553: clean EXIT=0 + an 885-line real staged diff + valid
title/body artifacts) could never produce a PR, and wakeup_runner kept idly
re-dispatching children. Now git status --porcelain is checked first: any
staged/unstaged/untracked entry means publish-ready, and only a truly clean
tree falls back to git diff. The publish-time PR creation path is unchanged
(it does not repeat the #507 early-PR empty-commit/placeholder deadlock).
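
The readiness check can be sketched by parsing the porcelain v1 short format, where each entry is `XY <path>` (X is the index status, Y the working-tree status, `??` an untracked file). The function names here are hypothetical; only the `git status --porcelain` command itself comes from the commit message.

```python
import subprocess


def classify_entries(porcelain_output: str) -> dict:
    """Count staged, unstaged and untracked entries in porcelain v1 output."""
    counts = {"staged": 0, "unstaged": 0, "untracked": 0}
    for line in porcelain_output.splitlines():
        if len(line) < 3:
            continue
        x, y = line[0], line[1]
        if x == "?" and y == "?":
            counts["untracked"] += 1
            continue
        if x != " ":
            counts["staged"] += 1  # change recorded in the index
        if y != " ":
            counts["unstaged"] += 1  # change only in the working tree
    return counts


def porcelain_has_changes(porcelain_output: str) -> bool:
    """Any staged, unstaged or untracked entry counts as a change."""
    return any(classify_entries(porcelain_output).values())


def worktree_is_publish_ready(repo_dir: str) -> bool:
    """Run git in repo_dir; stdout is left unstripped because column 1 matters."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    return porcelain_has_changes(result.stdout)
```

A staged-only tree such as `A  new.py` is reported as changed here, which is the case the commit says the unstaged-only `git diff --quiet` check missed.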



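* Fix patrol_inspector daemon startup crash: build DaemonHeartbeatLease with its constructor, not from_env

The daemon branch of #551's patrol.py called DaemonHeartbeatLease.from_env()
(a method that does not exist) → AttributeError crash at startup, so the daemon
never ran successfully. Changed to DaemonHeartbeatLease(
"patrol_inspector_daemon", ctx.repo_root), consistent with
closed_label_reconciler/wakeup_runner; the heartbeat filename matches the
patrol_inspector_daemon.ts that the restart helper expects.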



* Fix publish-gap residual layer: duplicate-marker logs read the companion artifact + superseded spawn intents are retired

Layer-2 residue: the implement worker echoes IMPLEMENT_DONE:issue-N:ok several
times in the log → worker_markers returns duplicate_or_conflicting_log_marker
and fails closed → the classifier treats the log as markerless and does not
check the companion artifact → no completed-marker action → no publish → the
stale harness-spawn-intent:dispatch-consensus-implementation:N remains and
causes pending_implement_intent.

Fix: (1) on a duplicate marker, classify_implement_attempt /
completed_marker_actions narrowly read the companion
.refactor-loop/runs/implement-*.md (implement logs only; accepted only when the
last 30 lines contain exactly one IMPLEMENT_DONE:ok);
(2) suppress_publish_superseded_implementation_spawn_intents: only when an
issue already has an executable publish_implementation_output is the matching
dispatch-consensus-implementation spawn intent suppressed to status_only
(implementation_ready_to_publish).
This does not repeat the #507 deadlock: spawn is suppressed only after publish
is ready (no missed dispatch), publish is still fully validated (neither early
nor duplicated), and the wakeup-runner/publish authorization scope is
unchanged. Live verification: #553 now projects an executable publish.
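
The narrow "exactly one marker in the last 30 lines" acceptance rule can be sketched as below. The regular expression and the function name `artifact_done_marker` are assumptions; the 30-line window and the `IMPLEMENT_DONE:issue-N:ok` shape come from the commit message.

```python
import re
from typing import Optional

# Marker shape as quoted in the commit message: IMPLEMENT_DONE:issue-N:ok
DONE_RE = re.compile(r"^IMPLEMENT_DONE:issue-(\d+):ok$")


def artifact_done_marker(artifact_text: str, tail_lines: int = 30) -> Optional[str]:
    """Return the marker only if the last `tail_lines` lines hold exactly one."""
    tail = artifact_text.splitlines()[-tail_lines:]
    hits = [line.strip() for line in tail if DONE_RE.match(line.strip())]
    return hits[0] if len(hits) == 1 else None
```

Zero hits and multiple hits both return `None`, so the fallback stays fail-closed whenever the artifact is ambiguous.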



* Implement issue #555 (#557)

* Fix publish-gap layer 3: wakeup_runner clean_exit revalidation reuses the artifact fallback for duplicate-marker logs

pubgap2 fixed the plan-side duplicate-marker artifact fallback, but
wakeup_runner's _source_log_has_clean_marker (the clean_exit_source_marker
precondition revalidation) still used the strict reader and returned
clean_exit_marker_missing for #553's duplicate IMPLEMENT_DONE echo → publish
blocked. Now the strict reader runs first, and the fallback applies only when
duplicate_or_conflicting_log_marker + is_implement_log +
_implement_run_artifact_done_marker exactly matches the projected marker.
Reuses the same plan-side helper and does not weaken the safety gate
(reviewer/release/rebase/non-implement/other fail reasons stay strict).



* Implement issue #553

* PR #561 review-fix output

* PR #561 remote-ci-fix output

* PR #561 review-fix output

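* Narrow restart daemon authority to the owner-local command surface (#558)

* #471 hotfix: the wakeup_plan completed-marker projection filters by the open-managed set; closed/merged targets are not projected (stops the graphql bleed: removes wakeup_runner's per-item revalidation of already-terminated targets)

* #410: SKILL/prompt comments now cite durable contract anchors instead of one-off .refactor-loop/runs/*; adds the narrow source-regression test_generated_artifacts_not_authority (#472)

* #471 hotfix2: wakeup-plan projects only the latest valid marker per target; older rounds no longer project dispatchable actions (the final cut that stops graphql backoff)

* skill: strengthen the no-gap rule: 0 codex + active work has no observation-mode exemption; intervene and dispatch codex immediately in the same turn (+source-regression)

* #470: when rendering the direct-post prompt, inline _github-post-rules via fixed-token so it is reachable from the host worktree cwd (avoids bare relative paths) (#479)

* #478 hotfix: daemon spawn switches to a launch-and-forget independent supervisor (start_new_session, no wait); a daemon reload no longer loses in-flight workers; the daemon prints a clear status line on every tick

* #471: wakeup-plan latest-open-fresh projection: completed-marker open-target/latest + release-rollup only latest-per-sha with ahead>0 verified against the local ref (stops stale projection/graphql) (#482)

* #481: consensus->implement readiness-dedup gate (stops the daemon re-dispatching closed/has-PR/in-flight targets) (#486)

* #481: consensus->implement readiness-dedup gate; closed/closing-PR/remote-branch/in-flight targets do not project dispatchable actions, stopping duplicates/races

* #481 fix r1: the dedup gate reuses the concurrency monitor owner surface to measure in-flight work instead of reimplementing the daemon algorithm (addr architect)

* #481 fix r2: add a controller helper guard behavior test (addr tests)

* #485: consensus->implement serializes by scope_paths conflict group (one executable per group, the rest status_only), reusing batching-heuristics to prevent parallel conflicting PRs (#489)

* #475: wakeup-runner WakeupApplyBudget: spawn-worker actions are batched up to the floor deficit per hard_gate.dispatch_required (lifecycle actions stay at 1/tick); fixes 1/tick linear recovery (#492)

* hotfix(#475 ext): the batch budget now includes dispatch_design_consensus: the solver triplet is dispatched in full within one tick (non-lifecycle, worker dispatch); fixes design-consensus still running at 1/tick low concurrency

* hotfix: headless spawn-intent launch: a skipped duplicate no longer breaks the whole spawn batch; a stale applied ledger (log missing) can be retried; helper_exit:3 gains a source diagnostic event; fixes 248 intents piling up / 0 concurrency

* hotfix r2: a blocked lifecycle action no longer dead-stops the whole tick: stale blocked actions (close-drop target OPEN / publish without a verified head) are skipped/suppressed and the subsequent spawn-batch keeps launching, so headless concurrency reaches the floor

* hotfix r3: root-cause fix for headless 0 concurrency: the detached spawn-codex launch passes env_for_subprocess explicitly (fixes the #478 decoupling dropping env and causing EXIT=127); a failed log (non-zero EXIT) can retry the launch, while EXIT=0/unfinished still suppress duplicates; in the tick status summary, later launches are no longer masked by the first blocked action

* hotfix r4: run_once generic blocked-not-dead-stop: any blocked/skipped non-spawn action no longer dead-stops the whole tick, and the subsequent spawn-batch keeps launching; wakeup-plan suppresses stale actions whose preconditions are doomed to fail; root-cause fix for headless concurrency=0

* hotfix r5: the wakeup-plan projection side suppresses a stale close_managed_drop (a drop-close whose target is actually OPEN is doomed to fail) to status_only, so it no longer occupies the tick and blocks later design-consensus spawn launches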

* hotfix r6: wakeup-runner run_once no longer misjudges a no-intents dispatch_design_consensus (helper_exit:3) as a codex launch failure that dead-stops the whole tick

Root cause: dispatch_design_consensus is a spawn-batch action; when the solver triplet is incomplete (e.g. #496 minimal/structural finished with EXIT=0 but did not emit the SOLVER_DONE marker) the helper returns exit 3 (NO_INTENTS), _spawn_launch_failure matches the "helper_exit:" prefix and returns True → run_once breaks, the whole tick dead-stops, and the subsequent reviewers/other design-consensus/implement are never dispatched → headless concurrency=0.

Fix: in run_once, a blocked spawn-batch action whose controller_action != spawn_codex_harness_background (i.e. dispatch_design_consensus) is treated as a routing no-op, skip-and-continue, and the subsequent spawn-batch keeps launching; only a genuine spawn_codex_harness_background launch failure takes the existing retry/break path. This extends the r4 blocked-not-dead-stop invariant to design-consensus spawn-batch actions.

behavior test: test_wakeup_runner_design_consensus_no_intents_does_not_dead_stop_later_spawn_batch (without the fix the later spawn is not launched and the test FAILs).



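* hotfix r7: review head_sha binding: unblock self-driven headless review→merge

Root cause: the reviewer artifact frontmatter carried no head_sha, so _review_evidence_from_artifact could not obtain the reviewed head SHA → every reviewing PR (#480/#473/#402) was always judged WAIT_OR_REDISPATCH:invalid_reviewer_evidence:missing_reviewed_head_sha, and wakeup-plan did not re-dispatch reviewers automatically → headless review→merge was stuck permanently (blocking the release).

Fix (implemented by the codex hotfix worker, committed after controller verification):
- controller_actions fetches headRefOid when dispatching reviewers, fails closed if it is missing, and injects the authoritative HEAD_SHA into the rendered prompt.
- The reviewer-{architect,tests,quality}.md frontmatter writes head_sha: ${HEAD_SHA}.
- wakeup_runner._review_head_sha_for: when the artifact lacks head_sha, recover it from the controller-rendered prompt/log (without relying on the worker's self-report, to avoid repeating the markerless failure); the live-head staleness check and the merge truth-table semantics are unchanged.
- wakeup_plan projects an executable dispatch_reviewers self-healing action for OPEN reviewing PRs with a missing/stale reviewer head.

Verification: the controller independently re-ran test_wakeup_runner+review_gate+wakeup_plan+controller_actions (293 passed) and the marker/prompt contracts (30 passed); the review-gate safety semantics (a stale head still does not merge) are locked by 13 review-gate tests.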



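* hotfix r8: make the review-evidence-redispatch self-healing action executable: close the headless review→merge loop

Follow-up to r7: r7 bound reviewer evidence to head_sha, but the review-evidence-redispatch self-healing action projected by wakeup_plan was forced to status_only by the closed-action-projection finalizer (its kind was not in EXECUTABLE_ACTION_KINDS), so the runner never applied it → review evidence missing a head on #480/#473/#402 could not be re-dispatched automatically → review_gate stayed at WAIT_OR_REDISPATCH → concurrency drained to 0.

Fix (implemented by the codex hotfix worker, committed after controller verification):
- wakeup_plan EXECUTABLE_ACTION_KINDS adds review-evidence-redispatch; the finalizer keeps runner authority and no longer forces status_only.
- controller_actions.dispatch_reviewers reads stale_review_roles and re-dispatches only the stale roles, and _pending_review_spawn_exists skips an already in-flight dispatch-reviewers:<pr>:<role>:r1 intent (idempotent, no double dispatch); it still goes through r7's HEAD_SHA injection, so the re-dispatched reviewer's new evidence is head-bound.
- The merge-gate truth table is unchanged; this only makes the existing self-healing action executable and targets it at stale roles.

Verification: the controller independently re-ran test_wakeup_plan+wakeup_runner+review_gate+controller_actions, 295 passed.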



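* hotfix r9: reviewer re-dispatch bumps to the next round + absolute cd + latest-evidence-per-role: actually unblock headless review→merge

Root cause (same class of bug: suppressing on "artifact exists" rather than on "succeeded + distinguishing dimension"): reviewer re-dispatch was hard-coded back to r1 and collided with the r1 log that had succeeded earlier (EXIT=0) on the old head → _harness_spawn_intent_log_suppresses_retry judged it "already done" and suppressed it → the review that should re-run on the current head never ran → review_gate stayed at WAIT_OR_REDISPATCH → concurrency 0. In addition, the intent cd was the relative "." → the headless launch layer did not project it.

Minimal fix (per maintainer directive: the filename is the unique id, no wrapper --id is added, and the change stays within the existing r<R> naming convention without altering public naming):
- controller_actions.dispatch_reviewers: _next_review_round takes the highest existing round for that PR/role + 1 and renders the r<N+1> prompt/log/output/intent; the pending-spawn idempotency guard uses the real round number; the cd in _append_harness_spawn_intent becomes the absolute path str(cd.resolve()), so headless projection + launch work.
- wakeup_plan.latest_reviewer_heads / wakeup_runner._latest_review_evidence_by_role: changed to "each role takes its own latest-round evidence" (instead of the single global latest round), so re-dispatching only the stale roles does not discard other roles' valid head-bound evidence; merge still requires every role's evidence to be bound to the live head, so the safety semantics are unchanged.

Verification: the controller independently re-ran test_wakeup_plan+wakeup_runner+review_gate+controller_actions, 299 passed.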
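
The "highest existing round + 1" rule can be sketched as follows. The filename pattern `reviewer-<role>-pr<N>-r<R>.md` is a made-up convention for illustration (the commit only states that an r<R> suffix exists), and `next_review_round` here is a hypothetical stand-in for the project's _next_review_round.

```python
import re
from typing import Iterable

# Hypothetical filename convention for illustration only:
#   reviewer-<role>-pr<N>-r<R>.md
NAME_RE = re.compile(
    r"^reviewer-(?P<role>[a-z]+)-pr(?P<pr>\d+)-r(?P<round>\d+)\.md$"
)


def next_review_round(filenames: Iterable[str], pr: int, role: str) -> int:
    """Return the highest existing round for (pr, role) plus one; 1 if none exist."""
    rounds = [0]
    for name in filenames:
        m = NAME_RE.match(name)
        if m and int(m["pr"]) == pr and m["role"] == role:
            rounds.append(int(m["round"]))
    return max(rounds) + 1
```

Keying the round on both PR and role means a re-dispatched role gets a fresh filename and cannot collide with an earlier successful log for the same role.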



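* hotfix r10: the review_gate action carries the reviewed head_sha + an e2e test locks the full headless review→decision path

Root cause (the 5th spot in the review-path head_sha chain): when wakeup_plan projected the review_gate action for REVIEW_DONE it read only the log's head (the log has no head) → action.head_sha was empty → wakeup_runner._review_gate was stuck at WAIT_OR_REDISPATCH:missing_action_reviewed_head_sha; the evidence was already head-bound, but the gate could not even run the comparison.

Fix (minimal, following the filename-is-the-id approach):
- wakeup_plan.completed_marker_actions: the REVIEW_DONE action's head_sha is now resolved by _review_done_action_head_sha from the authoritative evidence surface (latest_reviewer_heads, which includes the head injected by the r7 prompt; or the live PR head when every role is head-bound; otherwise it falls back to the log). It is filled only from real sources and does not bypass the gate's head comparison; merge safety semantics are unchanged.
- Adds test_review_gate_e2e.py: from a real wakeup_plan projection → WakeupRunner apply, asserting that both concrete decisions land, FIX (with a reject) and MERGE (all approve + CI green). This locks the whole review→decision path end to end and keeps further hidden layers from surfacing.
- test_wakeup_plan.py: adds a planner regression for the prompt-bound reviewed head.

Verification: the controller independently re-ran test_wakeup_plan+wakeup_runner+review_gate+controller_actions+review_gate_e2e, 302 passed.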



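* hotfix r11: PrChecksProjection adds a bounded retry for transient gh-api failures: ready PRs reach the review_gate decision headlessly

Root cause: the review_gate CI read, PrChecksProjection.check_pr, failed closed immediately (ci_unavailable:pull_api_failure) on a single non-zero return from gh api pulls/<n> or commits/<sha>/check-runs, with no retry; the rate limit was healthy and the failure was a transient blip → #473/#402 (MERGEABLE + CI green) were blocked by transient failures on every tick and never reached MERGE/FIX.

Fix (minimal, touches only PrChecksProjection; does not touch #322 ReleaseRequiredChecksProjection):
- _run_api_read: at most 3 deterministic retries (no sleep) for the two api reads; it stops on the first success, and a persistent failure still fails closed (safe). No behavior change on the happy path.
- starvation conclusion (verified by the worker): review_gate is already applied, and a blocked ci_unavailable row can be re-evaluated (it is not terminal duplicate-suppressed), so run_once needs no change.
- e2e: test_review_gate_e2e adds a "first pulls read fails transiently, still merges" case; test_pr_checks adds retry coverage.

Verification: the controller independently re-ran test_pr_checks+wakeup_runner+review_gate+review_gate_e2e+wakeup_plan+co…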
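
The bounded, sleep-free retry can be sketched with an injected read callable. This is a hedged illustration: `run_api_read` and its `(returncode, output)` contract are assumptions, and the commit does not say whether "3" counts total attempts or retries after the first call; the sketch treats it as total attempts.

```python
from typing import Callable, Optional, Tuple


def run_api_read(
    read: Callable[[], Tuple[int, str]], attempts: int = 3
) -> Optional[str]:
    """Call `read` up to `attempts` times; return its output on the first success.

    `read` returns (returncode, output). There is no sleep between attempts,
    and a persistent failure returns None so the caller can fail closed.
    """
    for _ in range(attempts):
        returncode, output = read()
        if returncode == 0:
            return output
    return None
```

Injecting the callable keeps the loop deterministic under test: a fake read can fail once and then succeed without touching the network.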

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Labels

crnd:human:auto auto-advancing, no human needed crnd:lifecycle:managed loop-managed item crnd:phase:merged merged
