Skip to content
3 changes: 2 additions & 1 deletion skills/codex-refactor-loop/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -519,7 +519,7 @@ Routing is marker-driven, but markers are trusted only after `EXIT=0` at the tai
| Latest complete Consensus-rnd Phase review-gate reviewer round resolves to `WAIT_EXPLICIT_APPROVAL` | Surface comments and wait; do not merge or dispatch fix. |
| Latest complete Consensus-rnd Phase review-gate reviewer round resolves to `FIX` | Dispatch fix codex for next round using reject evidence as blocking input. |
| Consensus-rnd Phase review-gate gate incomplete or invalid (`WAIT_OR_REDISPATCH`) | Wait or re-dispatch the missing/invalid reviewer; never merge. |
| `FIX_DONE` | Dispatch reviewers again. |
| `FIX_DONE` | Dispatch reviewers again only after review-thread completion evidence gate passes for review-thread-driven fixes; otherwise status-only block. |
| `TEST_ADD_DONE` | Commit/push and resume CI watch. |

No-gap policy:
Expand Down Expand Up @@ -547,6 +547,7 @@ Required visible updates:
| Stuck timeout | Status explaining timeout and reflector dispatch. |
| Iteration complete | Rollup PR banner and next audit dispatch. |
| Skill bug fix | Commit visible on integration branch. |
| PR review comment fix | Completion includes review-thread closure: fixes driven by PR review comments are incomplete until the original thread is replied to and resolved, or explicitly escalated. |

Status card templates and escalation ASCII diagrams are in [status and escalation templates](#status-and-escalation-templates).

Expand Down
2 changes: 1 addition & 1 deletion skills/codex-refactor-loop/prompts/implement.md
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,7 @@ ${SCOPE_PATHS}
Old pattern: ${OLD_PATTERN}
New principle: ${NEW_PRINCIPLE}
```
3-5 行内;不是 changelog,是代码自我说明。
3-5 行内;不是 changelog,是代码自我说明。The marker identity is exactly `Refactor (iter${ITERATION}/${CLUSTER_ID})`; do not replace it with issue-only identities such as `Refactor (issue1525)`. If `${CLUSTER_ID}` is missing, use `cluster-issue${ISSUE_NUMBER}` and record that fallback in the implementation summary.
- missing/empty/default/`none`:MUST NOT add `Refactor (...)`, `Old pattern`, `New principle`, or `iterN/cluster` refactor-history source comments. Put the rationale in the implementation summary and include exactly: `refactor self-doc: not applicable (HOST_REFACTOR_COMMENT_POLICY=none)`.
3. **不新增功能**:不引入新接口、新 flag、新模块;只清理违反点。新增极小辅助类型的注释也必须遵守 `${HOST_REFACTOR_COMMENT_POLICY}`:missing/empty/default/`none` 时不得写 `refactor helper`, `no behavior change`, `Old`, `New`, 或 `iterN` 等 refactor-history source comments;如确需源码注释,只写面向业务行为的准确英文说明。仅 explicit `self-doc-comment` 时才按第 2 条既有 Refactor self-documentation 格式写注释。
4. **测试**:按 `verification_hints` 跑测试,必须通过;测试不足必须补;任何 `sleep/delay` 轮询测试必须改为确定性断言。
Expand Down
2 changes: 2 additions & 0 deletions skills/codex-refactor-loop/prompts/meta-reflector-stalled.md
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,8 @@ If any of answers 1-3 is yes, do not emit `META_RESOLVED:escalate-human`. Emit `

`META_RESOLVED:drop:<reason>` is valid for false-positive/wontfix cases and for phase9-no-framing cases where Consensus-rnd Phase design-consensus stayed unchanged across multi-round solver evidence and continued polling would be waste. Do not use `drop` to bypass architect/quality rejects that have an actionable Consensus-rnd Phase design-consensus or maintainer-directive route.

Repeated rejects over deterministic in-scope text normalization are not human escalation. If `HOST_REFACTOR_COMMENT_POLICY=self-doc-comment` and the blocker is only non-canonical refactor marker identity, such as `Refactor (issue1525)` versus `Refactor (iterN/cluster-XXX)`, emit `META_RESOLVED:retry-fix:<exact normalization instruction>` when the canonical value can be derived from iteration/cluster metadata or the reviewer provided the concrete replacement pattern.

Valid outputs:

- `META_RESOLVED:retry-fix:<reason>`
Expand Down
10 changes: 7 additions & 3 deletions skills/codex-refactor-loop/prompts/review-fix.md
Original file line number Diff line number Diff line change
Expand Up @@ -44,7 +44,7 @@ Categorize each demand into one of:

For each fix:
- Open the file fully (not just the hunk) to make a context-aware change.
- Preserve/add refactor self-doc comments only when `${HOST_REFACTOR_COMMENT_POLICY}=self-doc-comment`, and keep those source comments English-only. When `${HOST_REFACTOR_COMMENT_POLICY}` is missing/empty/default/`none`, do not add `Refactor (...)`, `Old pattern`, `New principle`, or `iterN/cluster` refactor-history source comments; keep rationale in the fix report/external artifact and include `refactor self-doc: not applicable (HOST_REFACTOR_COMMENT_POLICY=none)`. If a reviewer demands those comments under `none`, classify it as a host-policy conflict/false-positive and record that evidence in the fix report. Any other policy value is invalid and fail-closed; do not guess.
- Preserve/add refactor self-doc comments only when `${HOST_REFACTOR_COMMENT_POLICY}=self-doc-comment`, and keep those source comments English-only. When enabled, non-canonical marker identity is (A) fixable in-scope: normalize issue-only identities such as `Refactor (issue1525)` to `Refactor (iter${ITERATION}/${CLUSTER_ID})`. If `${CLUSTER_ID}` is missing, derive it from the PR branch, audit artifact, or implement summary; if still absent, use `cluster-issue${ISSUE_NUMBER}` and record the derivation in the fix report. Do not emit `FIX_BLOCKED:human-decision` for deterministic marker normalization. When `${HOST_REFACTOR_COMMENT_POLICY}` is missing/empty/default/`none`, do not add `Refactor (...)`, `Old pattern`, `New principle`, or `iterN/cluster` refactor-history source comments; keep rationale in the fix report/external artifact and include `refactor self-doc: not applicable (HOST_REFACTOR_COMMENT_POLICY=none)`. If a reviewer demands those comments under `none`, classify it as a host-policy conflict/false-positive and record that evidence in the fix report. Any other policy value is invalid and fail-closed; do not guess.
- New test files: follow existing host test naming conventions, single behavior per test, no `sleep/delay`, no `[Skip]`, no mock-only assertions.
- New non-test code stays minimal and reuses existing patterns.

Expand All @@ -60,7 +60,11 @@ cd $REPO_ROOT && \

Pick the test projects whose code you changed; do NOT run the full solution test suite (too slow). If build fails → fix or `FIX_BLOCKED:build:<short>`.

### Step 4 — Write fix artifact
### Step 4 — Close review-thread completion evidence when seeded

If `$REPO_ROOT/.refactor-loop/state/review-thread-completion/pr${PR_NUMBER}.json` exists with `review_thread_driven=true`, treat it as blocking completion evidence: reply to and resolve the seeded original PR review thread, then update the artifact to set `replied=true` and `resolved=true`. If `thread_id` is empty, GitHub closure fails, or escalation lacks an exact clean-exit `.refactor-loop/logs/*.log` `META_RESOLVED:escalate-human:<short>` source, record the reason in `${FIX_OUTPUT_PATH}` and emit `FIX_BLOCKED:${PR_NUMBER}:round-${FIX_ROUND}:other:review-thread-completion`.

### Step 5 — Write fix artifact

Write `${FIX_OUTPUT_PATH}` with this structure:

Expand All @@ -86,7 +90,7 @@ Write `${FIX_OUTPUT_PATH}` with this structure:
- <if blocked, say "controller routes to reflector/meta-layer" + paste the FIX_BLOCKED line>
```

### Step 5 — Emit marker
### Step 6 — Emit marker

End your output with EXACTLY one of:

Expand Down
2 changes: 1 addition & 1 deletion skills/codex-refactor-loop/prompts/reviewer-quality.md
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,7 @@ You are **one of N independent reviewers**.
- [ ] **No under-engineering**: ≥3 near-identical inline copies of a snippet should be extracted. Inline duplication that violates DRY → comment.
- [ ] **Method size & cyclomatic complexity**: a single new/modified method ≤ 80 lines and ≤ ~15 branches is preferred. Existing host 项目的复杂度分析器 warnings carried unchanged ≠ regression; but adding new ones → comment.
- [ ] **Comments add value**: new comments explain *why* not *what* (the code already says what). Filler comments / commented-out code → comment.
- [ ] **Refactor self-doc comment policy**: read `${HOST_REFACTOR_COMMENT_POLICY}`. missing/empty/default/`none` normalizes to `none`: missing/illegible self-doc must not be a reject reason, rationale belongs in external artifacts, and new Refactor/Old/New/iteration source comments are defects. Explicit `self-doc-comment` is downstream compatibility opt-in: English-only refactor self-doc comments must be present and clear, with Old/New blocks readable to a non-audit reader (no `see issue #X` placeholders, no truncated sentences). Still comment/reject for naming, dead code, complexity, scope creep, or code whose intent cannot be reviewed from names/structure/external artifacts. Any other value is invalid and fail-closed; do not guess.
- [ ] **Refactor self-doc comment policy**: read `${HOST_REFACTOR_COMMENT_POLICY}`. missing/empty/default/`none` normalizes to `none`: missing/illegible self-doc must not be a reject reason, rationale belongs in external artifacts, and new Refactor/Old/New/iteration source comments are defects. Explicit `self-doc-comment` is downstream compatibility opt-in: English-only refactor self-doc comments must be present and clear, with Old/New blocks readable to a non-audit reader (no `see issue #X` placeholders, no truncated sentences). Non-canonical marker identity is a fixable process defect: reject with the exact expected canonical marker, not a redesign or human-decision request. Still comment/reject for naming, dead code, complexity, scope creep, or code whose intent cannot be reviewed from names/structure/external artifacts. Any other value is invalid and fail-closed; do not guess.
- [ ] **No unrelated drive-by changes**: diff stays focused on the cluster intent; one-line "fix typo over there" or "tidy this whitespace" sneaking into a behavior PR → comment.

## Out of scope
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@
from datetime import datetime, timedelta, timezone
from pathlib import Path
from string import Template
from typing import Mapping, Sequence
from typing import Any, Mapping, Sequence

from .active_controller import require_active_controller, write_active_controller_status
from . import labels
Expand All @@ -20,7 +20,11 @@
from .github_body import GitHubBodyError, validate_self_contained_github_body
from .issue_decomposition import load_issue_decomposition_plan
from .release.publisher import ReleasePublishResult, ReleasePublisher
from .review_fix_dispatch import ReviewFixDispatchSpec
from .review_fix_dispatch import (
ReviewFixDispatchSpec,
ReviewThreadCompletionEvidence,
validate_review_thread_completion,
)
from .triage import apply_decision, load_triage_apply_config
from .work_items import extract_closing_issue_numbers
from .workflow_spec import WorkflowSpecError, load_validated_workflow_spec
Expand Down Expand Up @@ -926,8 +930,108 @@ def render_review_fix_prompt(
str(prompt_path),
env=render_env,
)
self._write_review_thread_completion_seed(pr_number)
return spec

def _write_review_thread_completion_seed(self, pr_number: int) -> None:
state_dir = self.ctx.repo_root / ".refactor-loop" / "state" / "review-thread-completion"
state_path = state_dir / f"pr{pr_number}.json"
thread = self._first_unresolved_review_thread(pr_number)
if thread is None:
state_path.unlink(missing_ok=True)
return
state_dir.mkdir(parents=True, exist_ok=True)
state_path.write_text(
json.dumps(
{
"review_thread_driven": True,
"thread_id": thread["id"],
"replied": False,
"resolved": False,
"source": thread["source"],
},
sort_keys=True,
),
encoding="utf-8",
)

def _first_unresolved_review_thread(self, pr_number: int) -> dict[str, Any] | None:
slug = self.ctx.gh_repo_slug
if not slug:
return {"id": "", "source": "live-pr-review-thread-unknown"}
owner, _, repo = slug.partition("/")
if not owner or not repo:
return {"id": "", "source": "live-pr-review-thread-unknown"}
query = (
"query($owner:String!,$repo:String!,$number:Int!,$after:String){ "
"repository(owner:$owner,name:$repo){ pullRequest(number:$number){ "
"reviewThreads(first:100, after:$after){ "
"nodes{ id isResolved } pageInfo{ hasNextPage endCursor } "
"} } } }"
)
after = ""
while True:
args = [
"api",
"graphql",
"-f",
f"owner={owner}",
"-f",
f"repo={repo}",
"-F",
f"number={pr_number}",
"-f",
f"query={query}",
]
if after:
args.extend(["-f", f"after={after}"])
result = subprocess.run(
["gh", *args],
cwd=str(self.ctx.repo_root),
capture_output=True,
text=True,
check=False,
)
if result.returncode != 0:
return {"id": "", "source": "live-pr-review-thread-unknown"}
try:
payload = json.loads(result.stdout or "{}")
except json.JSONDecodeError:
return {"id": "", "source": "live-pr-review-thread-unknown"}
review_threads = (
(((payload.get("data") or {}).get("repository") or {}).get("pullRequest") or {})
.get("reviewThreads")
)
if not isinstance(review_threads, dict):
return {"id": "", "source": "live-pr-review-thread-unknown"}
nodes = review_threads.get("nodes")
if not isinstance(nodes, list):
return {"id": "", "source": "live-pr-review-thread-unknown"}
for node in nodes:
if not isinstance(node, dict):
return {"id": "", "source": "live-pr-review-thread-unknown"}
thread_id = node.get("id")
is_resolved = node.get("isResolved")
if isinstance(thread_id, str) and thread_id and is_resolved is False:
return {"id": thread_id, "source": "live-pr-review-thread"}
if not isinstance(is_resolved, bool):
return {"id": "", "source": "live-pr-review-thread-unknown"}
page_info = review_threads.get("pageInfo")
if not isinstance(page_info, dict):
return {"id": "", "source": "live-pr-review-thread-unknown"}
has_next_page = page_info.get("hasNextPage")
if has_next_page is False:
return None
if has_next_page is not True:
return {"id": "", "source": "live-pr-review-thread-unknown"}
end_cursor = page_info.get("endCursor")
if not isinstance(end_cursor, str) or not end_cursor:
return {"id": "", "source": "live-pr-review-thread-unknown"}
after = end_cursor

def validate_review_fix_completion(self, evidence: ReviewThreadCompletionEvidence) -> None:
validate_review_thread_completion(evidence)

def _resolve_template_input(self, input_path: str) -> Path:
if not input_path.startswith("host:"):
return Path(input_path)
Expand Down
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
"""Review-fix prompt render boundary helpers."""
"""Review-fix prompt render and completion boundary helpers."""

from __future__ import annotations

Expand All @@ -10,6 +10,7 @@

_POSITIVE_INT_RE = re.compile(r"^[1-9][0-9]*$")
_FIX_OUTPUT_RE = re.compile(r"^\.refactor-loop/runs/fix-pr[1-9][0-9]*-round-[1-9][0-9]*-report\.md$")
_ESCALATION_EVIDENCE_RE = re.compile(r"^META_RESOLVED:escalate-human:[A-Za-z0-9._-]+$")


@dataclass(frozen=True)
Expand Down Expand Up @@ -51,6 +52,39 @@ def validate_fix_output_path(path: str) -> str:
return value


@dataclass(frozen=True)
class ReviewThreadCompletionEvidence:
"""Evidence required before a review-thread-driven fix is complete."""

review_thread_driven: bool
thread_id: str = ""
replied: bool = False
resolved: bool = False
escalation_evidence: str = ""


class ReviewThreadCompletionError(ValueError):
"""Raised when review-thread completion evidence is missing or incomplete."""


def validate_review_thread_completion(evidence: ReviewThreadCompletionEvidence) -> None:
"""Fail closed unless original review-thread completion evidence is present."""

if not evidence.review_thread_driven:
return
escalation_evidence = evidence.escalation_evidence.strip()
if escalation_evidence:
if not _ESCALATION_EVIDENCE_RE.fullmatch(escalation_evidence):
raise ReviewThreadCompletionError("review-thread escalation evidence must be META_RESOLVED:escalate-human:<short>")
return
if not evidence.thread_id.strip():
raise ReviewThreadCompletionError("review-thread-driven fix requires original thread_id evidence")
if not evidence.replied:
raise ReviewThreadCompletionError("review-thread-driven fix requires reply evidence on the original thread")
if not evidence.resolved:
raise ReviewThreadCompletionError("review-thread-driven fix requires resolved evidence on the original thread")


def _validate_positive_int(value: int, label: str) -> str:
text = str(value)
if not _POSITIVE_INT_RE.fullmatch(text):
Expand Down
Loading
Loading