diff --git a/skills/codex-refactor-loop/SKILL.md b/skills/codex-refactor-loop/SKILL.md index f525cc6c..3198fd18 100644 --- a/skills/codex-refactor-loop/SKILL.md +++ b/skills/codex-refactor-loop/SKILL.md @@ -519,7 +519,7 @@ Routing is marker-driven, but markers are trusted only after `EXIT=0` at the tai | Latest complete Consensus-rnd Phase review-gate reviewer round resolves to `WAIT_EXPLICIT_APPROVAL` | Surface comments and wait; do not merge or dispatch fix. | | Latest complete Consensus-rnd Phase review-gate reviewer round resolves to `FIX` | Dispatch fix codex for next round using reject evidence as blocking input. | | Consensus-rnd Phase review-gate gate incomplete or invalid (`WAIT_OR_REDISPATCH`) | Wait or re-dispatch the missing/invalid reviewer; never merge. | -| `FIX_DONE` | Dispatch reviewers again. | +| `FIX_DONE` | Dispatch reviewers again only after review-thread completion evidence gate passes for review-thread-driven fixes; otherwise status-only block. | | `TEST_ADD_DONE` | Commit/push and resume CI watch. | No-gap policy: @@ -547,6 +547,7 @@ Required visible updates: | Stuck timeout | Status explaining timeout and reflector dispatch. | | Iteration complete | Rollup PR banner and next audit dispatch. | | Skill bug fix | Commit visible on integration branch. | +| PR review comment fix | Completion includes review-thread closure: fixes driven by PR review comments are incomplete until the original thread is replied to and resolved, or explicitly escalated. | Status card templates and escalation ASCII diagrams are in [status and escalation templates](#status-and-escalation-templates). diff --git a/skills/codex-refactor-loop/prompts/implement.md b/skills/codex-refactor-loop/prompts/implement.md index f76683df..5781168a 100644 --- a/skills/codex-refactor-loop/prompts/implement.md +++ b/skills/codex-refactor-loop/prompts/implement.md @@ -31,7 +31,7 @@ ${SCOPE_PATHS} Old pattern: ${OLD_PATTERN} New principle: ${NEW_PRINCIPLE} ``` - 3-5 行内;不是 changelog,是代码自我说明。 + 3-5 行内;不是 changelog,是代码自我说明。The marker identity is exactly `Refactor (iter${ITERATION}/${CLUSTER_ID})`; do not replace it with issue-only identities such as `Refactor (issue1525)`. If `${CLUSTER_ID}` is missing, use `cluster-issue${ISSUE_NUMBER}` and record that fallback in the implementation summary. - missing/empty/default/`none`:MUST NOT add `Refactor (...)`, `Old pattern`, `New principle`, or `iterN/cluster` refactor-history source comments. Put the rationale in the implementation summary and include exactly: `refactor self-doc: not applicable (HOST_REFACTOR_COMMENT_POLICY=none)`. 3. **不新增功能**:不引入新接口、新 flag、新模块;只清理违反点。新增极小辅助类型的注释也必须遵守 `${HOST_REFACTOR_COMMENT_POLICY}`:missing/empty/default/`none` 时不得写 `refactor helper`, `no behavior change`, `Old`, `New`, 或 `iterN` 等 refactor-history source comments;如确需源码注释,只写面向业务行为的准确英文说明。仅 explicit `self-doc-comment` 时才按第 2 条既有 Refactor self-documentation 格式写注释。 4. **测试**:按 `verification_hints` 跑测试,必须通过;测试不足必须补;任何 `sleep/delay` 轮询测试必须改为确定性断言。 diff --git a/skills/codex-refactor-loop/prompts/meta-reflector-stalled.md b/skills/codex-refactor-loop/prompts/meta-reflector-stalled.md index 4b4e6cb1..7f5d6ff8 100644 --- a/skills/codex-refactor-loop/prompts/meta-reflector-stalled.md +++ b/skills/codex-refactor-loop/prompts/meta-reflector-stalled.md @@ -33,6 +33,8 @@ If any of answers 1-3 is yes, do not emit `META_RESOLVED:escalate-human`. Emit ` `META_RESOLVED:drop:` is valid for false-positive/wontfix cases and for phase9-no-framing cases where Consensus-rnd Phase design-consensus stayed unchanged across multi-round solver evidence and continued polling would be waste. Do not use `drop` to bypass architect/quality rejects that have an actionable Consensus-rnd Phase design-consensus or maintainer-directive route. +Repeated rejects over deterministic in-scope text normalization are not human escalation. If `HOST_REFACTOR_COMMENT_POLICY=self-doc-comment` and the blocker is only non-canonical refactor marker identity, such as `Refactor (issue1525)` versus `Refactor (iterN/cluster-XXX)`, emit `META_RESOLVED:retry-fix:` when the canonical value can be derived from iteration/cluster metadata or the reviewer provided the concrete replacement pattern. + Valid outputs: - `META_RESOLVED:retry-fix:` diff --git a/skills/codex-refactor-loop/prompts/review-fix.md b/skills/codex-refactor-loop/prompts/review-fix.md index 657316c6..3edd667d 100644 --- a/skills/codex-refactor-loop/prompts/review-fix.md +++ b/skills/codex-refactor-loop/prompts/review-fix.md @@ -44,7 +44,7 @@ Categorize each demand into one of: For each fix: - Open the file fully (not just the hunk) to make a context-aware change. -- Preserve/add refactor self-doc comments only when `${HOST_REFACTOR_COMMENT_POLICY}=self-doc-comment`, and keep those source comments English-only. When `${HOST_REFACTOR_COMMENT_POLICY}` is missing/empty/default/`none`, do not add `Refactor (...)`, `Old pattern`, `New principle`, or `iterN/cluster` refactor-history source comments; keep rationale in the fix report/external artifact and include `refactor self-doc: not applicable (HOST_REFACTOR_COMMENT_POLICY=none)`. If a reviewer demands those comments under `none`, classify it as a host-policy conflict/false-positive and record that evidence in the fix report. Any other policy value is invalid and fail-closed; do not guess. +- Preserve/add refactor self-doc comments only when `${HOST_REFACTOR_COMMENT_POLICY}=self-doc-comment`, and keep those source comments English-only. When enabled, non-canonical marker identity is (A) fixable in-scope: normalize issue-only identities such as `Refactor (issue1525)` to `Refactor (iter${ITERATION}/${CLUSTER_ID})`. If `${CLUSTER_ID}` is missing, derive it from the PR branch, audit artifact, or implement summary; if still absent, use `cluster-issue${ISSUE_NUMBER}` and record the derivation in the fix report. Do not emit `FIX_BLOCKED:human-decision` for deterministic marker normalization. When `${HOST_REFACTOR_COMMENT_POLICY}` is missing/empty/default/`none`, do not add `Refactor (...)`, `Old pattern`, `New principle`, or `iterN/cluster` refactor-history source comments; keep rationale in the fix report/external artifact and include `refactor self-doc: not applicable (HOST_REFACTOR_COMMENT_POLICY=none)`. If a reviewer demands those comments under `none`, classify it as a host-policy conflict/false-positive and record that evidence in the fix report. Any other policy value is invalid and fail-closed; do not guess. - New test files: follow existing host test naming conventions, single behavior per test, no `sleep/delay`, no `[Skip]`, no mock-only assertions. - New non-test code stays minimal and reuses existing patterns. @@ -60,7 +60,11 @@ cd $REPO_ROOT && \ Pick the test projects whose code you changed; do NOT run the full solution test suite (too slow). If build fails → fix or `FIX_BLOCKED:build:`. -### Step 4 — Write fix artifact +### Step 4 — Close review-thread completion evidence when seeded + +If `$REPO_ROOT/.refactor-loop/state/review-thread-completion/pr${PR_NUMBER}.json` exists with `review_thread_driven=true`, treat it as blocking completion evidence: reply to and resolve the seeded original PR review thread, then update the artifact to set `replied=true` and `resolved=true`. If `thread_id` is empty, GitHub closure fails, or escalation lacks an exact clean-exit `.refactor-loop/logs/*.log` `META_RESOLVED:escalate-human:` source, record the reason in `${FIX_OUTPUT_PATH}` and emit `FIX_BLOCKED:${PR_NUMBER}:round-${FIX_ROUND}:other:review-thread-completion`. + +### Step 5 — Write fix artifact Write `${FIX_OUTPUT_PATH}` with this structure: @@ -86,7 +90,7 @@ Write `${FIX_OUTPUT_PATH}` with this structure: - ``` -### Step 5 — Emit marker +### Step 6 — Emit marker End your output with EXACTLY one of: diff --git a/skills/codex-refactor-loop/prompts/reviewer-quality.md b/skills/codex-refactor-loop/prompts/reviewer-quality.md index 39f27cf0..72c41162 100644 --- a/skills/codex-refactor-loop/prompts/reviewer-quality.md +++ b/skills/codex-refactor-loop/prompts/reviewer-quality.md @@ -26,7 +26,7 @@ You are **one of N independent reviewers**. - [ ] **No under-engineering**: ≥3 near-identical inline copies of a snippet should be extracted. Inline duplication that violates DRY → comment. - [ ] **Method size & cyclomatic complexity**: a single new/modified method ≤ 80 lines and ≤ ~15 branches is preferred. Existing host 项目的复杂度分析器 warnings carried unchanged ≠ regression; but adding new ones → comment. - [ ] **Comments add value**: new comments explain *why* not *what* (the code already says what). Filler comments / commented-out code → comment. -- [ ] **Refactor self-doc comment policy**: read `${HOST_REFACTOR_COMMENT_POLICY}`. missing/empty/default/`none` normalizes to `none`: missing/illegible self-doc must not be a reject reason, rationale belongs in external artifacts, and new Refactor/Old/New/iteration source comments are defects. Explicit `self-doc-comment` is downstream compatibility opt-in: English-only refactor self-doc comments must be present and clear, with Old/New blocks readable to a non-audit reader (no `see issue #X` placeholders, no truncated sentences). Still comment/reject for naming, dead code, complexity, scope creep, or code whose intent cannot be reviewed from names/structure/external artifacts. Any other value is invalid and fail-closed; do not guess. +- [ ] **Refactor self-doc comment policy**: read `${HOST_REFACTOR_COMMENT_POLICY}`. missing/empty/default/`none` normalizes to `none`: missing/illegible self-doc must not be a reject reason, rationale belongs in external artifacts, and new Refactor/Old/New/iteration source comments are defects. Explicit `self-doc-comment` is downstream compatibility opt-in: English-only refactor self-doc comments must be present and clear, with Old/New blocks readable to a non-audit reader (no `see issue #X` placeholders, no truncated sentences). Non-canonical marker identity is a fixable process defect: reject with the exact expected canonical marker, not a redesign or human-decision request. Still comment/reject for naming, dead code, complexity, scope creep, or code whose intent cannot be reviewed from names/structure/external artifacts. Any other value is invalid and fail-closed; do not guess. - [ ] **No unrelated drive-by changes**: diff stays focused on the cluster intent; one-line "fix typo over there" or "tidy this whitespace" sneaking into a behavior PR → comment. ## Out of scope diff --git a/skills/codex-refactor-loop/scripts/codex_refactor_loop/controller_actions.py b/skills/codex-refactor-loop/scripts/codex_refactor_loop/controller_actions.py index 70977d13..2fb16d4a 100644 --- a/skills/codex-refactor-loop/scripts/codex_refactor_loop/controller_actions.py +++ b/skills/codex-refactor-loop/scripts/codex_refactor_loop/controller_actions.py @@ -11,7 +11,7 @@ from datetime import datetime, timedelta, timezone from pathlib import Path from string import Template -from typing import Mapping, Sequence +from typing import Any, Mapping, Sequence from .active_controller import require_active_controller, write_active_controller_status from . import labels @@ -20,7 +20,11 @@ from .github_body import GitHubBodyError, validate_self_contained_github_body from .issue_decomposition import load_issue_decomposition_plan from .release.publisher import ReleasePublishResult, ReleasePublisher -from .review_fix_dispatch import ReviewFixDispatchSpec +from .review_fix_dispatch import ( + ReviewFixDispatchSpec, + ReviewThreadCompletionEvidence, + validate_review_thread_completion, +) from .triage import apply_decision, load_triage_apply_config from .work_items import extract_closing_issue_numbers from .workflow_spec import WorkflowSpecError, load_validated_workflow_spec @@ -926,8 +930,108 @@ def render_review_fix_prompt( str(prompt_path), env=render_env, ) + self._write_review_thread_completion_seed(pr_number) return spec + def _write_review_thread_completion_seed(self, pr_number: int) -> None: + state_dir = self.ctx.repo_root / ".refactor-loop" / "state" / "review-thread-completion" + state_path = state_dir / f"pr{pr_number}.json" + thread = self._first_unresolved_review_thread(pr_number) + if thread is None: + state_path.unlink(missing_ok=True) + return + state_dir.mkdir(parents=True, exist_ok=True) + state_path.write_text( + json.dumps( + { + "review_thread_driven": True, + "thread_id": thread["id"], + "replied": False, + "resolved": False, + "source": thread["source"], + }, + sort_keys=True, + ), + encoding="utf-8", + ) + + def _first_unresolved_review_thread(self, pr_number: int) -> dict[str, Any] | None: + slug = self.ctx.gh_repo_slug + if not slug: + return {"id": "", "source": "live-pr-review-thread-unknown"} + owner, _, repo = slug.partition("/") + if not owner or not repo: + return {"id": "", "source": "live-pr-review-thread-unknown"} + query = ( + "query($owner:String!,$repo:String!,$number:Int!,$after:String){ " + "repository(owner:$owner,name:$repo){ pullRequest(number:$number){ " + "reviewThreads(first:100, after:$after){ " + "nodes{ id isResolved } pageInfo{ hasNextPage endCursor } " + "} } } }" + ) + after = "" + while True: + args = [ + "api", + "graphql", + "-f", + f"owner={owner}", + "-f", + f"repo={repo}", + "-F", + f"number={pr_number}", + "-f", + f"query={query}", + ] + if after: + args.extend(["-f", f"after={after}"]) + result = subprocess.run( + ["gh", *args], + cwd=str(self.ctx.repo_root), + capture_output=True, + text=True, + check=False, + ) + if result.returncode != 0: + return {"id": "", "source": "live-pr-review-thread-unknown"} + try: + payload = json.loads(result.stdout or "{}") + except json.JSONDecodeError: + return {"id": "", "source": "live-pr-review-thread-unknown"} + review_threads = ( + (((payload.get("data") or {}).get("repository") or {}).get("pullRequest") or {}) + .get("reviewThreads") + ) + if not isinstance(review_threads, dict): + return {"id": "", "source": "live-pr-review-thread-unknown"} + nodes = review_threads.get("nodes") + if not isinstance(nodes, list): + return {"id": "", "source": "live-pr-review-thread-unknown"} + for node in nodes: + if not isinstance(node, dict): + return {"id": "", "source": "live-pr-review-thread-unknown"} + thread_id = node.get("id") + is_resolved = node.get("isResolved") + if isinstance(thread_id, str) and thread_id and is_resolved is False: + return {"id": thread_id, "source": "live-pr-review-thread"} + if not isinstance(is_resolved, bool): + return {"id": "", "source": "live-pr-review-thread-unknown"} + page_info = review_threads.get("pageInfo") + if not isinstance(page_info, dict): + return {"id": "", "source": "live-pr-review-thread-unknown"} + has_next_page = page_info.get("hasNextPage") + if has_next_page is False: + return None + if has_next_page is not True: + return {"id": "", "source": "live-pr-review-thread-unknown"} + end_cursor = page_info.get("endCursor") + if not isinstance(end_cursor, str) or not end_cursor: + return {"id": "", "source": "live-pr-review-thread-unknown"} + after = end_cursor + + def validate_review_fix_completion(self, evidence: ReviewThreadCompletionEvidence) -> None: + validate_review_thread_completion(evidence) + def _resolve_template_input(self, input_path: str) -> Path: if not input_path.startswith("host:"): return Path(input_path) diff --git a/skills/codex-refactor-loop/scripts/codex_refactor_loop/review_fix_dispatch.py b/skills/codex-refactor-loop/scripts/codex_refactor_loop/review_fix_dispatch.py index 8002c380..9102d2b4 100644 --- a/skills/codex-refactor-loop/scripts/codex_refactor_loop/review_fix_dispatch.py +++ b/skills/codex-refactor-loop/scripts/codex_refactor_loop/review_fix_dispatch.py @@ -1,4 +1,4 @@ -"""Review-fix prompt render boundary helpers.""" +"""Review-fix prompt render and completion boundary helpers.""" from __future__ import annotations @@ -10,6 +10,7 @@ _POSITIVE_INT_RE = re.compile(r"^[1-9][0-9]*$") _FIX_OUTPUT_RE = re.compile(r"^\.refactor-loop/runs/fix-pr[1-9][0-9]*-round-[1-9][0-9]*-report\.md$") +_ESCALATION_EVIDENCE_RE = re.compile(r"^META_RESOLVED:escalate-human:[A-Za-z0-9._-]+$") @dataclass(frozen=True) @@ -51,6 +52,39 @@ def validate_fix_output_path(path: str) -> str: return value +@dataclass(frozen=True) +class ReviewThreadCompletionEvidence: + """Evidence required before a review-thread-driven fix is complete.""" + + review_thread_driven: bool + thread_id: str = "" + replied: bool = False + resolved: bool = False + escalation_evidence: str = "" + + +class ReviewThreadCompletionError(ValueError): + """Raised when review-thread completion evidence is missing or incomplete.""" + + +def validate_review_thread_completion(evidence: ReviewThreadCompletionEvidence) -> None: + """Fail closed unless original review-thread completion evidence is present.""" + + if not evidence.review_thread_driven: + return + escalation_evidence = evidence.escalation_evidence.strip() + if escalation_evidence: + if not _ESCALATION_EVIDENCE_RE.fullmatch(escalation_evidence): + raise ReviewThreadCompletionError("review-thread escalation evidence must be META_RESOLVED:escalate-human:") + return + if not evidence.thread_id.strip(): + raise ReviewThreadCompletionError("review-thread-driven fix requires original thread_id evidence") + if not evidence.replied: + raise ReviewThreadCompletionError("review-thread-driven fix requires reply evidence on the original thread") + if not evidence.resolved: + raise ReviewThreadCompletionError("review-thread-driven fix requires resolved evidence on the original thread") + + def _validate_positive_int(value: int, label: str) -> str: text = str(value) if not _POSITIVE_INT_RE.fullmatch(text): diff --git a/skills/codex-refactor-loop/scripts/codex_refactor_loop/wakeup_plan.py b/skills/codex-refactor-loop/scripts/codex_refactor_loop/wakeup_plan.py index 62140c24..a45314bb 100755 --- a/skills/codex-refactor-loop/scripts/codex_refactor_loop/wakeup_plan.py +++ b/skills/codex-refactor-loop/scripts/codex_refactor_loop/wakeup_plan.py @@ -25,6 +25,10 @@ from codex_refactor_loop.restart import restart_managed_daemon_names from codex_refactor_loop.transition_assessment import TransitionAssessmentReader, transition_rank_key from codex_refactor_loop.work_items import ManagedWorkProjection, open_actionable_managed_items +from codex_refactor_loop.review_fix_dispatch import ( + ReviewThreadCompletionEvidence, + validate_review_thread_completion, +) from codex_refactor_loop.workflow_spec import WorkflowSpecError, load_validated_workflow_spec from codex_refactor_loop.workflow_stages import assert_stage_slug @@ -627,10 +631,12 @@ def _extract_completed_marker_line(text: str) -> str | None: return None -def completed_marker_actions(repo_root: Path) -> list[dict[str, Any]]: +def completed_marker_actions(repo_root: Path, ctx: LoopContext | None = None) -> list[dict[str, Any]]: logs_dir = repo_root / ".refactor-loop" / "logs" if not logs_dir.exists(): return [] + if ctx is None: + ctx = LoopContext.load(repo_root=repo_root, env=os.environ, cwd=repo_root, read_only=True) actions: list[dict[str, Any]] = [] for log_path in sorted(logs_dir.glob("*.log"), key=lambda p: p.stat().st_mtime, reverse=True): marker = marker_from_completed_log(log_path) @@ -679,10 +685,146 @@ def completed_marker_actions(repo_root: Path) -> list[dict[str, Any]]: action["no_lifecycle_authority"] = True action.pop("runner_authority", None) action.pop("no_generic_command", None) + if marker.startswith("FIX_DONE"): + _apply_fix_done_review_thread_gate(repo_root, ctx, action) actions.append(action) return actions +def _apply_fix_done_review_thread_gate(repo_root: Path, ctx: LoopContext, action: dict[str, Any]) -> None: + pr_number = action.get("target_number") + if action.get("target_kind") != "PR" or not isinstance(pr_number, int): + return + evidence = _review_thread_completion_evidence(repo_root, ctx, pr_number) + try: + validate_review_thread_completion(evidence) + except ValueError as exc: + action["status_only"] = True + action["no_lifecycle_authority"] = True + action["route"] = "review-thread-completion-gate" + action["blocked_reason"] = f"review_thread_completion_incomplete:{exc}" + action["preconditions"] = [ + *action.get("preconditions", []), + "review_thread_completion_evidence", + ] + action.pop("runner_authority", None) + action.pop("no_generic_command", None) + action.pop("controller_action", None) + else: + action["preconditions"] = [ + *action.get("preconditions", []), + "review_thread_completion_evidence", + ] + + +def _review_thread_completion_evidence(repo_root: Path, ctx: LoopContext, pr_number: int) -> ReviewThreadCompletionEvidence: + artifact = repo_root / ".refactor-loop" / "state" / "review-thread-completion" / f"pr{pr_number}.json" + data: dict[str, Any] = {} + if artifact.is_file(): + try: + loaded = json.loads(artifact.read_text(encoding="utf-8")) + except (OSError, json.JSONDecodeError): + loaded = {} + if isinstance(loaded, dict): + data = loaded + review_thread_driven = bool(data.get("review_thread_driven")) + thread_id = str(data.get("thread_id") or "") + raw_escalation_evidence = str(data.get("escalation_evidence") or "") + escalation_evidence = ( + raw_escalation_evidence + if _has_clean_escalation_marker_source(repo_root, raw_escalation_evidence) + else "" + ) + live_original_thread_resolved = True + if review_thread_driven and not escalation_evidence.strip(): + live_original_thread_resolved = _original_review_thread_is_resolved(ctx, pr_number, thread_id) + return ReviewThreadCompletionEvidence( + review_thread_driven=review_thread_driven, + thread_id=thread_id, + replied=bool(data.get("replied")), + resolved=bool(data.get("resolved")) and live_original_thread_resolved, + escalation_evidence=escalation_evidence, + ) + + +def _has_clean_escalation_marker_source(repo_root: Path, escalation_evidence: str) -> bool: + marker = escalation_evidence.strip() + if not marker.startswith("META_RESOLVED:escalate-human:"): + return False + logs_dir = repo_root / ".refactor-loop" / "logs" + if not logs_dir.exists(): + return False + for log_path in sorted(logs_dir.glob("*.log"), key=lambda p: p.stat().st_mtime, reverse=True): + if marker_from_completed_log(log_path) == marker: + return True + return False + + +def _original_review_thread_is_resolved(ctx: LoopContext, pr_number: int, thread_id: str) -> bool: + if not thread_id.strip(): + return False + slug = str(ctx.host_env.get("GH_REPO_SLUG") or "").strip() + owner, _, repo = slug.partition("/") + if not owner or not repo: + return False + query = ( + "query($owner:String!,$repo:String!,$number:Int!,$after:String){ " + "repository(owner:$owner,name:$repo){ pullRequest(number:$number){ " + "reviewThreads(first:100, after:$after){ " + "nodes{ id isResolved } pageInfo{ hasNextPage endCursor } " + "} } } }" + ) + after = "" + while True: + cmd = [ + "gh", + "api", + "graphql", + "-f", + f"owner={owner}", + "-f", + f"repo={repo}", + "-F", + f"number={pr_number}", + "-f", + f"query={query}", + ] + if after: + cmd.extend(["-f", f"after={after}"]) + payload = run_json(cmd, cwd=ctx.repo_root) + repository = ((payload or {}).get("data") or {}).get("repository") + if not isinstance(repository, dict): + return False + pull_request = repository.get("pullRequest") + if not isinstance(pull_request, dict): + return False + review_threads = pull_request.get("reviewThreads") + if not isinstance(review_threads, dict): + return False + nodes = review_threads.get("nodes") + if not isinstance(nodes, list): + return False + for node in nodes: + if not isinstance(node, dict): + return False + if node.get("id") != thread_id: + continue + is_resolved = node.get("isResolved") + return is_resolved if isinstance(is_resolved, bool) else False + page_info = review_threads.get("pageInfo") + if not isinstance(page_info, dict): + return False + has_next_page = page_info.get("hasNextPage") + if not isinstance(has_next_page, bool): + return False + if not has_next_page: + return False + end_cursor = page_info.get("endCursor") + if not isinstance(end_cursor, str) or not end_cursor: + return False + after = end_cursor + + def consensus_implementation_fields(repo_root: Path, log_path: Path, item: str | None) -> dict[str, Any]: log_match = CONSENSUS_JUDGE_LOG_RE.fullmatch(log_path.name) if log_match is None: @@ -1686,7 +1828,7 @@ def build_plan(repo_root: Path) -> dict[str, Any]: actions.extend(harness_spawn_intent_actions(repo_root, ctx, monitor, gh_items, gh_items_loaded)) actions.extend(maintainer_comment_actions(repo_root, gh_items)) actions.extend(unpushed_worker_output_actions(repo_root, gh_items)) - actions.extend(completed_marker_actions(repo_root)) + actions.extend(completed_marker_actions(repo_root, ctx)) actions.extend(release_rollup_actions(repo_root)) actions.extend(ci_red_actions(repo_root, gh_items)) actions.extend(no_gap_actions(repo_root)) diff --git a/skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py b/skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py index 47d1d717..bca5c83a 100644 --- a/skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py +++ b/skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py @@ -10,13 +10,47 @@ SCRIPT_PATH = Path(__file__).resolve() SKILL_ROOT = SCRIPT_PATH.parents[1] PROMPTS_DIR = SKILL_ROOT / "prompts" +SKILL_MD = SKILL_ROOT / "SKILL.md" def read_prompt(name: str) -> str: return (PROMPTS_DIR / name).read_text(encoding="utf-8") +def read_skill() -> str: + return SKILL_MD.read_text(encoding="utf-8") + + class RefactorCommentPolicyPromptContractTests(unittest.TestCase): + def test_github_state_contract_requires_pr_review_thread_closure(self) -> None: + skill = read_skill() + + self.assertIn("## GitHub State Contract", skill) + self.assertIn("| PR review comment fix |", skill) + self.assertIn("Completion includes review-thread closure", skill) + self.assertIn("fixes driven by PR review comments are incomplete", skill) + self.assertIn("original thread is replied to and resolved", skill) + self.assertIn("or explicitly escalated", skill) + + def test_fix_done_route_mentions_review_thread_completion_gate(self) -> None: + skill = read_skill() + + self.assertIn("| `FIX_DONE` |", skill) + self.assertIn("review-thread completion evidence gate passes", skill) + self.assertIn("review-thread-driven fixes", skill) + self.assertIn("status-only block", skill) + + def test_review_fix_prompt_requires_seeded_review_thread_completion(self) -> None: + review_fix = read_prompt("review-fix.md") + + self.assertIn("Close review-thread completion evidence when seeded", review_fix) + self.assertIn(".refactor-loop/state/review-thread-completion/pr${PR_NUMBER}.json", review_fix) + self.assertIn("reply to and resolve the seeded original PR review thread", review_fix) + self.assertIn("set `replied=true` and `resolved=true`", review_fix) + self.assertIn("FIX_BLOCKED:${PR_NUMBER}:round-${FIX_ROUND}:other:review-thread-completion", review_fix) + self.assertIn("exact clean-exit `.refactor-loop/logs/*.log`", review_fix) + self.assertIn("META_RESOLVED:escalate-human:", review_fix) + def test_default_policy_is_none_and_external_rationale(self) -> None: implement = read_prompt("implement.md") verify = read_prompt("verify.md") @@ -61,10 +95,14 @@ def test_explicit_self_doc_policy_preserves_old_new_requirement(self) -> None: self.assertIn("Old pattern: ${OLD_PATTERN}", implement) self.assertIn("New principle: ${NEW_PRINCIPLE}", implement) self.assertIn("源码注释必须 English-only", implement) + self.assertIn("do not replace it with issue-only identities", implement) self.assertIn("缺失任何一处且无合理 not-applicable 说明 → 标记缺陷", verify) self.assertIn("HOST_REFACTOR_COMMENT_POLICY=self-doc-comment", quality) self.assertIn("must be present and clear", quality) + self.assertIn("Non-canonical marker identity is a fixable process defect", quality) self.assertIn("Preserve/add refactor self-doc comments only when `${HOST_REFACTOR_COMMENT_POLICY}=self-doc-comment`", review_fix) + self.assertIn("non-canonical marker identity is (A) fixable in-scope", review_fix) + self.assertIn("Do not emit `FIX_BLOCKED:human-decision` for deterministic marker normalization", review_fix) def test_none_policy_forbids_refactor_history_source_comments(self) -> None: expected = ( @@ -113,12 +151,21 @@ def test_none_policy_disables_missing_self_doc_rejects(self) -> None: "reviewer-architect.md", "reviewer-quality.md", "review-fix.md", + "meta-reflector-stalled.md", ) ) for token in forbidden_unconditional: with self.subTest(token=token): self.assertNotIn(token, combined) + def test_deterministic_marker_normalization_routes_to_fix(self) -> None: + reflector = read_prompt("meta-reflector-stalled.md") + + self.assertIn("deterministic in-scope text normalization", reflector) + self.assertIn("non-canonical refactor marker identity", reflector) + self.assertIn("META_RESOLVED:retry-fix:", reflector) + self.assertIn("not human escalation", reflector) + def test_invalid_refactor_comment_policy_fails_closed(self) -> None: for name in ( "implement.md", diff --git a/skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py b/skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py index aef8ba9b..f175e2b5 100644 --- a/skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py +++ b/skills/codex-refactor-loop/scripts/test_review_fix_dispatch.py @@ -3,10 +3,13 @@ from __future__ import annotations +import json import shutil +import subprocess import tempfile import unittest from pathlib import Path +from unittest.mock import patch import sys @@ -15,7 +18,12 @@ from codex_refactor_loop.context import LoopContext from codex_refactor_loop.controller_actions import ControllerActions -from codex_refactor_loop.review_fix_dispatch import ReviewFixDispatchSpec +from codex_refactor_loop.review_fix_dispatch import ( + ReviewFixDispatchSpec, + ReviewThreadCompletionError, + ReviewThreadCompletionEvidence, + validate_review_thread_completion, +) class ReviewFixDispatchTests(unittest.TestCase): @@ -23,7 +31,10 @@ def setUp(self) -> None: self.tmp = Path(tempfile.mkdtemp(prefix="review-fix-dispatch-test-")) (self.tmp / ".refactor-loop").mkdir(parents=True) (self.tmp / ".refactor-loop" / "host.env").write_text( - f'export REPO_ROOT="{self.tmp}"\n', + f'export REPO_ROOT="{self.tmp}"\n' + 'export GH_REPO_SLUG="owner/repo"\n' + 'export INTEGRATION_BRANCH="dev"\n' + 'export REVIEW_BASE_BRANCH="dev"\n', encoding="utf-8", ) self.actions = ControllerActions( @@ -60,25 +71,45 @@ def test_validate_rejects_root_report_and_path_escape(self) -> None: ReviewFixDispatchSpec.validate_fix_output_path(value) def test_controller_render_review_fix_prompt_injects_fix_output_path(self) -> None: - spec = self.actions.render_review_fix_prompt( - 269, - 1, - env={ - "PR_NUMBER": "269", - "PR_TITLE": "Review fix render", - "FIX_ROUND": "1", - "MAX_FIX_ROUNDS": "3", - "BASE_BRANCH": "dev", - "HEAD_BRANCH": "impl/issue269", - "REVIEW_ARCHITECT_PATH": ".refactor-loop/runs/review-pr269-architect-r1.md", - "REVIEW_TESTS_PATH": ".refactor-loop/runs/review-pr269-tests-r1.md", - "REVIEW_QUALITY_PATH": ".refactor-loop/runs/review-pr269-quality-r1.md", - "AUDIT_PATH": ".refactor-loop/runs/audit.md", - "IMPLEMENT_SUMMARY_PATH": ".refactor-loop/runs/implement.md", - "PROJECT_RULES": "CLAUDE.md", - "HOST_REFACTOR_COMMENT_POLICY": "self-doc-comment", - }, - ) + with patch("codex_refactor_loop.controller_actions.subprocess.run") as run: + run.return_value = subprocess.CompletedProcess( + ["gh"], + 0, + stdout=json.dumps( + { + "data": { + "repository": { + "pullRequest": { + "reviewThreads": { + "nodes": [], + "pageInfo": {"hasNextPage": False, "endCursor": None}, + } + } + } + } + } + ), + stderr="", + ) + spec = self.actions.render_review_fix_prompt( + 269, + 1, + env={ + "PR_NUMBER": "269", + "PR_TITLE": "Review fix render", + "FIX_ROUND": "1", + "MAX_FIX_ROUNDS": "3", + "BASE_BRANCH": "dev", + "HEAD_BRANCH": "impl/issue269", + "REVIEW_ARCHITECT_PATH": ".refactor-loop/runs/review-pr269-architect-r1.md", + "REVIEW_TESTS_PATH": ".refactor-loop/runs/review-pr269-tests-r1.md", + "REVIEW_QUALITY_PATH": ".refactor-loop/runs/review-pr269-quality-r1.md", + "AUDIT_PATH": ".refactor-loop/runs/audit.md", + "IMPLEMENT_SUMMARY_PATH": ".refactor-loop/runs/implement.md", + "PROJECT_RULES": "CLAUDE.md", + "HOST_REFACTOR_COMMENT_POLICY": "self-doc-comment", + }, + ) self.assertEqual(spec.fix_output_path, ".refactor-loop/runs/fix-pr269-round-1-report.md") prompt = self.tmp / ".refactor-loop" / "prompts" / "fixes" / "fix-pr269-round-1.md" @@ -87,6 +118,233 @@ def test_controller_render_review_fix_prompt_injects_fix_output_path(self) -> No self.assertNotIn("${FIX_OUTPUT_PATH}", rendered) self.assertTrue(prompt.exists()) + def test_controller_render_review_fix_prompt_writes_review_thread_seed(self) -> None: + with patch("codex_refactor_loop.controller_actions.subprocess.run") as run: + run.return_value = subprocess.CompletedProcess( + ["gh"], + 0, + stdout=json.dumps( + { + "data": { + "repository": { + "pullRequest": { + "reviewThreads": { + "nodes": [{"id": "PRRT_kwDOExample", "isResolved": False}], + "pageInfo": {"hasNextPage": False, "endCursor": None}, + } + } + } + } + } + ), + stderr="", + ) + self.actions.render_review_fix_prompt(269, 1) + + seed = self.tmp / ".refactor-loop" / "state" / "review-thread-completion" / "pr269.json" + self.assertTrue(seed.exists()) + data = json.loads(seed.read_text(encoding="utf-8")) + self.assertEqual( + data, + { + "review_thread_driven": True, + "thread_id": "PRRT_kwDOExample", + "replied": False, + "resolved": False, + "source": "live-pr-review-thread", + }, + ) + + def test_controller_render_review_fix_prompt_removes_stale_seed_when_no_unresolved_threads_remain(self) -> None: + state_dir = self.tmp / ".refactor-loop" / "state" / "review-thread-completion" + state_dir.mkdir(parents=True) + seed = state_dir / "pr269.json" + seed.write_text( + json.dumps( + { + "review_thread_driven": True, + "thread_id": "", + "replied": False, + "resolved": False, + "source": "live-pr-review-thread-unknown", + } + ), + encoding="utf-8", + ) + with patch("codex_refactor_loop.controller_actions.subprocess.run") as run: + run.return_value = subprocess.CompletedProcess( + ["gh"], + 0, + stdout=json.dumps( + { + "data": { + "repository": { + "pullRequest": { + "reviewThreads": { + "nodes": [], + "pageInfo": {"hasNextPage": False, "endCursor": None}, + } + } + } + } + } + ), + stderr="", + ) + self.actions.render_review_fix_prompt(269, 1) + + self.assertFalse(seed.exists()) + + def test_controller_render_review_fix_prompt_writes_unknown_seed_on_review_thread_lookup_failure(self) -> None: + with patch("codex_refactor_loop.controller_actions.subprocess.run") as run: + run.return_value = subprocess.CompletedProcess(["gh"], 1, stdout="", stderr="boom") + self.actions.render_review_fix_prompt(269, 1) + + seed = self.tmp / ".refactor-loop" / "state" / "review-thread-completion" / "pr269.json" + self.assertTrue(seed.exists()) + data = json.loads(seed.read_text(encoding="utf-8")) + self.assertEqual( + data, + { + "review_thread_driven": True, + "thread_id": "", + "replied": False, + "resolved": False, + "source": "live-pr-review-thread-unknown", + }, + ) + + def test_controller_render_review_fix_prompt_paginates_review_thread_seed(self) -> None: + responses = [ + subprocess.CompletedProcess( + ["gh"], + 0, + stdout=json.dumps( + { + "data": { + "repository": { + "pullRequest": { + "reviewThreads": { + "nodes": [{"id": "PRRT_kwDOOther", "isResolved": True}], + "pageInfo": {"hasNextPage": True, "endCursor": "cursor1"}, + } + } + } + } + } + ), + stderr="", + ), + subprocess.CompletedProcess( + ["gh"], + 0, + stdout=json.dumps( + { + "data": { + "repository": { + "pullRequest": { + "reviewThreads": { + "nodes": [{"id": "PRRT_kwDOExample", "isResolved": False}], + "pageInfo": {"hasNextPage": False, "endCursor": None}, + } + } + } + } + } + ), + stderr="", + ), + ] + with patch("codex_refactor_loop.controller_actions.subprocess.run", side_effect=responses) as run: + self.actions.render_review_fix_prompt(269, 1) + + seed = self.tmp / ".refactor-loop" / "state" / "review-thread-completion" / "pr269.json" + self.assertTrue(seed.exists()) + self.assertIn("after=cursor1", " ".join(str(arg) for arg in run.call_args_list[1].args[0])) + + def test_review_thread_completion_ignores_non_thread_driven_fix(self) -> None: + validate_review_thread_completion( + ReviewThreadCompletionEvidence( + review_thread_driven=False, + replied=False, + resolved=False, + ) + ) + + def test_review_thread_completion_accepts_replied_and_resolved_original_thread(self) -> None: + validate_review_thread_completion( + ReviewThreadCompletionEvidence( + review_thread_driven=True, + thread_id="PRRT_kwDOExample", + replied=True, + resolved=True, + ) + ) + + def test_review_thread_completion_accepts_explicit_escalation(self) -> None: + validate_review_thread_completion( + ReviewThreadCompletionEvidence( + review_thread_driven=True, + thread_id="PRRT_kwDOExample", + replied=False, + resolved=False, + escalation_evidence="META_RESOLVED:escalate-human:conflicting-review-thread", + ) + ) + + def test_review_thread_completion_rejects_malformed_escalation_evidence(self) -> None: + with self.assertRaisesRegex(ReviewThreadCompletionError, "escalation evidence"): + validate_review_thread_completion( + ReviewThreadCompletionEvidence( + review_thread_driven=True, + thread_id="PRRT_kwDOExample", + replied=False, + resolved=False, + escalation_evidence="anything nonempty", + ) + ) + + def test_review_thread_completion_blocks_missing_original_thread_evidence(self) -> None: + with self.assertRaisesRegex(ReviewThreadCompletionError, "thread_id"): + validate_review_thread_completion( + ReviewThreadCompletionEvidence( + review_thread_driven=True, + replied=True, + resolved=True, + ) + ) + + def test_review_thread_completion_blocks_unreplied_or_unresolved_thread(self) -> None: + blocked = ( + ReviewThreadCompletionEvidence( + review_thread_driven=True, + thread_id="PRRT_kwDOExample", + replied=False, + resolved=True, + ), + ReviewThreadCompletionEvidence( + review_thread_driven=True, + thread_id="PRRT_kwDOExample", + replied=True, + resolved=False, + ), + ) + for evidence in blocked: + with self.subTest(replied=evidence.replied, resolved=evidence.resolved): + with self.assertRaises(ReviewThreadCompletionError): + validate_review_thread_completion(evidence) + + def test_controller_completion_path_fails_closed_for_open_review_thread(self) -> None: + with self.assertRaises(ReviewThreadCompletionError): + self.actions.validate_review_fix_completion( + ReviewThreadCompletionEvidence( + review_thread_driven=True, + thread_id="PRRT_kwDOExample", + replied=True, + resolved=False, + ) + ) + if __name__ == "__main__": unittest.main() diff --git a/skills/codex-refactor-loop/scripts/test_wakeup_plan.py b/skills/codex-refactor-loop/scripts/test_wakeup_plan.py index 5cb5d175..e5d48444 100644 --- a/skills/codex-refactor-loop/scripts/test_wakeup_plan.py +++ b/skills/codex-refactor-loop/scripts/test_wakeup_plan.py @@ -298,6 +298,51 @@ def write_fake_gh(self) -> None: esac exit 0 fi + if [[ "$api_path" == "graphql" ]]; then + if [[ -n "${WAKEUP_PLAN_GH_QUERY_LOG:-}" ]]; then + printf 'api graphql %s\n' "$args" >> "$WAKEUP_PLAN_GH_QUERY_LOG" + fi + case "$fixture" in + review_thread_unresolved) + printf '{"data":{"repository":{"pullRequest":{"reviewThreads":{"nodes":[{"id":"PRRT_kwDOExample","isResolved":false,"isOutdated":false}],"pageInfo":{"hasNextPage":false,"endCursor":null}}}}}}\n' + ;; + review_thread_unresolved_unrelated) + printf '{"data":{"repository":{"pullRequest":{"reviewThreads":{"nodes":[{"id":"PRRT_kwDOOther","isResolved":false,"isOutdated":false}],"pageInfo":{"hasNextPage":false,"endCursor":null}}}}}}\n' + ;; + review_thread_unresolved_outdated) + printf '{"data":{"repository":{"pullRequest":{"reviewThreads":{"nodes":[{"id":"PRRT_kwDOExample","isResolved":false,"isOutdated":true}],"pageInfo":{"hasNextPage":false,"endCursor":null}}}}}}\n' + ;; + review_thread_resolved) + printf '{"data":{"repository":{"pullRequest":{"reviewThreads":{"nodes":[{"id":"PRRT_kwDOExample","isResolved":true,"isOutdated":false}],"pageInfo":{"hasNextPage":false,"endCursor":null}}}}}}\n' + ;; + review_thread_paginated_unresolved) + if [[ "$args" == *"after=cursor1"* ]]; then + printf '{"data":{"repository":{"pullRequest":{"reviewThreads":{"nodes":[{"id":"PRRT_kwDOExample","isResolved":false,"isOutdated":false}],"pageInfo":{"hasNextPage":false,"endCursor":null}}}}}}\n' + else + printf '{"data":{"repository":{"pullRequest":{"reviewThreads":{"nodes":[{"id":"PRRT_kwDOOther","isResolved":true,"isOutdated":false}],"pageInfo":{"hasNextPage":true,"endCursor":"cursor1"}}}}}}\n' + fi + ;; + review_thread_graphql_failure) + exit 42 + ;; + review_thread_malformed) + printf '{"data":{"repository":{"pullRequest":{"reviewThreads":{"nodes":null,"pageInfo":{"hasNextPage":false,"endCursor":null}}}}}}\n' + ;; + review_thread_pull_request_null) + printf '{"data":{"repository":{"pullRequest":null}}}\n' + ;; + review_thread_page_info_null) + printf '{"data":{"repository":{"pullRequest":{"reviewThreads":{"nodes":[],"pageInfo":null}}}}}\n' + ;; + review_thread_node_malformed) + printf '{"data":{"repository":{"pullRequest":{"reviewThreads":{"nodes":[null],"pageInfo":{"hasNextPage":false,"endCursor":null}}}}}}\n' + ;; + *) + printf '{"data":{"repository":{"pullRequest":{"reviewThreads":{"nodes":[],"pageInfo":{"hasNextPage":false,"endCursor":null}}}}}}\n' + ;; + esac + exit 0 + fi if [[ "$api_flag1" == "--paginate" && "$api_flag2" == "--slurp" ]]; then if [[ "$fixture" == "ci_red" && "$api_path" == "repos/owner/repo/commits/ci-red-sha/check-runs" ]]; then printf '[{"check_runs":[{"name":"unit","status":"completed","conclusion":"failure","html_url":"https://checks/unit"},{"name":"lint","status":"completed","conclusion":"success","html_url":"https://checks/lint"}]}]\n' @@ -1184,9 +1229,258 @@ def test_fix_done_completed_marker_projects_executable_dispatch_reviewers(self) self.assertEqual(action["target_number"], 77) self.assertEqual(action["target"], {"kind": "PR", "number": 77}) self.assertIn("clean_exit_source_marker", action["preconditions"]) + self.assertIn("review_thread_completion_evidence", action["preconditions"]) for forbidden in ("argv", "shell", "cmd", "command_line", "commands", "env", "git", "gh", "executor"): self.assertNotIn(forbidden, action) + def test_fix_done_without_review_thread_artifact_ignores_unrelated_unresolved_threads(self) -> None: + self.write_completed_log("fix-pr77-r3.log", "FIX_DONE") + + plan = self.run_plan(fixture="review_thread_unresolved") + + action = next(item for item in plan["actions"] if item["action_id"].startswith("completed-marker:fix-pr77-r3")) + self.assertEqual(action["kind"], "completed-marker") + self.assertEqual(action["controller_action"], "dispatch_reviewers") + self.assertEqual(action["runner_authority"], "wakeup-runner-396") + self.assertNotIn("status_only", action) + + def test_fix_done_with_unresolved_original_review_thread_blocks_dispatch_reviewers(self) -> None: + completion_dir = self.repo / ".refactor-loop" / "state" / "review-thread-completion" + completion_dir.mkdir(parents=True) + (completion_dir / "pr77.json").write_text( + json.dumps( + { + "review_thread_driven": True, + "thread_id": "PRRT_kwDOExample", + "replied": True, + "resolved": True, + } + ), + encoding="utf-8", + ) + self.write_completed_log("fix-pr77-r3.log", "FIX_DONE") + + plan = self.run_plan(fixture="review_thread_unresolved") + + action = next(item for item in plan["actions"] if item["action_id"].startswith("completed-marker:fix-pr77-r3")) + self.assertEqual(action["kind"], "completed-marker") + self.assertEqual(action["route"], "review-thread-completion-gate") + self.assertTrue(action["status_only"]) + self.assertTrue(action["no_lifecycle_authority"]) + self.assertIn("review_thread_completion_incomplete", action["blocked_reason"]) + self.assertIn("review_thread_completion_evidence", action["preconditions"]) + self.assertNotIn("controller_action", action) + self.assertNotIn("runner_authority", action) + self.assertNotIn("no_generic_command", action) + + def test_fix_done_review_thread_completion_artifact_allows_dispatch_reviewers(self) -> None: + completion_dir = self.repo / ".refactor-loop" / "state" / "review-thread-completion" + completion_dir.mkdir(parents=True) + (completion_dir / "pr77.json").write_text( + json.dumps( + { + "review_thread_driven": True, + "thread_id": "PRRT_kwDOExample", + "replied": True, + "resolved": True, + } + ), + encoding="utf-8", + ) + self.write_completed_log("fix-pr77-r3.log", "FIX_DONE") + + plan = self.run_plan(fixture="review_thread_resolved") + + action = next(item for item in plan["actions"] if item["action_id"].startswith("completed-marker:fix-pr77-r3")) + self.assertEqual(action["kind"], "completed-marker") + self.assertEqual(action["controller_action"], "dispatch_reviewers") + self.assertEqual(action["runner_authority"], "wakeup-runner-396") + self.assertIn("review_thread_completion_evidence", action["preconditions"]) + self.assertNotIn("status_only", action) + + def test_fix_done_review_thread_completion_artifact_does_not_bypass_live_unresolved_thread(self) -> None: + completion_dir = self.repo / ".refactor-loop" / "state" / "review-thread-completion" + completion_dir.mkdir(parents=True) + (completion_dir / "pr77.json").write_text( + json.dumps( + { + "review_thread_driven": True, + "thread_id": "PRRT_kwDOExample", + "replied": True, + "resolved": True, + } + ), + encoding="utf-8", + ) + self.write_completed_log("fix-pr77-r3.log", "FIX_DONE") + + plan = self.run_plan(fixture="review_thread_unresolved") + + action = next(item for item in plan["actions"] if item["action_id"].startswith("completed-marker:fix-pr77-r3")) + self.assertEqual(action["route"], "review-thread-completion-gate") + self.assertTrue(action["status_only"]) + self.assertNotIn("controller_action", action) + + def test_fix_done_explicit_escalation_allows_unresolved_review_thread(self) -> None: + completion_dir = self.repo / ".refactor-loop" / "state" / "review-thread-completion" + completion_dir.mkdir(parents=True) + (self.logs / "judge-pr77-r1.log").write_text( + "META_RESOLVED:escalate-human:conflicting-review-thread\nEXIT=0\n", + encoding="utf-8", + ) + (completion_dir / "pr77.json").write_text( + json.dumps( + { + "review_thread_driven": True, + "thread_id": "PRRT_kwDOExample", + "replied": False, + "resolved": False, + "escalation_evidence": "META_RESOLVED:escalate-human:conflicting-review-thread", + } + ), + encoding="utf-8", + ) + self.write_completed_log("fix-pr77-r3.log", "FIX_DONE") + + plan = self.run_plan(fixture="review_thread_unresolved") + + action = next(item for item in plan["actions"] if item["action_id"].startswith("completed-marker:fix-pr77-r3")) + self.assertEqual(action["controller_action"], "dispatch_reviewers") + self.assertEqual(action["runner_authority"], "wakeup-runner-396") + self.assertNotIn("status_only", action) + + def test_fix_done_local_escalation_without_clean_marker_source_blocks_unresolved_review_thread(self) -> None: + completion_dir = self.repo / ".refactor-loop" / "state" / "review-thread-completion" + completion_dir.mkdir(parents=True) + (completion_dir / "pr77.json").write_text( + json.dumps( + { + "review_thread_driven": True, + "thread_id": "PRRT_kwDOExample", + "replied": False, + "resolved": False, + "escalation_evidence": "META_RESOLVED:escalate-human:conflicting-review-thread", + } + ), + encoding="utf-8", + ) + self.write_completed_log("fix-pr77-r3.log", "FIX_DONE") + + plan = self.run_plan(fixture="review_thread_unresolved") + + action = next(item for item in plan["actions"] if item["action_id"].startswith("completed-marker:fix-pr77-r3")) + self.assertEqual(action["route"], "review-thread-completion-gate") + self.assertTrue(action["status_only"]) + self.assertNotIn("controller_action", action) + + def test_fix_done_blocks_when_original_review_thread_live_state_is_unknown(self) -> None: + completion_dir = self.repo / ".refactor-loop" / "state" / "review-thread-completion" + completion_dir.mkdir(parents=True) + for fixture in ( + "review_thread_graphql_failure", + "review_thread_malformed", + "review_thread_pull_request_null", + "review_thread_page_info_null", + "review_thread_node_malformed", + ): + with self.subTest(fixture=fixture): + self.logs.joinpath("fix-pr77-r3.log").unlink(missing_ok=True) + (completion_dir / "pr77.json").write_text( + json.dumps( + { + "review_thread_driven": True, + "thread_id": "PRRT_kwDOExample", + "replied": True, + "resolved": True, + } + ), + encoding="utf-8", + ) + self.write_completed_log("fix-pr77-r3.log", "FIX_DONE") + + plan = self.run_plan(fixture=fixture) + + action = next(item for item in plan["actions"] if item["action_id"].startswith("completed-marker:fix-pr77-r3")) + self.assertEqual(action["route"], "review-thread-completion-gate") + self.assertTrue(action["status_only"]) + self.assertNotIn("controller_action", action) + + def test_fix_done_blocks_unresolved_outdated_original_review_thread(self) -> None: + completion_dir = self.repo / ".refactor-loop" / "state" / "review-thread-completion" + completion_dir.mkdir(parents=True) + (completion_dir / "pr77.json").write_text( + json.dumps( + { + "review_thread_driven": True, + "thread_id": "PRRT_kwDOExample", + "replied": True, + "resolved": True, + } + ), + encoding="utf-8", + ) + self.write_completed_log("fix-pr77-r3.log", "FIX_DONE") + + plan = self.run_plan(fixture="review_thread_unresolved_outdated") + + action = next(item for item in plan["actions"] if item["action_id"].startswith("completed-marker:fix-pr77-r3")) + self.assertEqual(action["route"], "review-thread-completion-gate") + self.assertTrue(action["status_only"]) + self.assertNotIn("controller_action", action) + + def test_fix_done_checks_paginated_original_review_thread_before_dispatch_reviewers(self) -> None: + completion_dir = self.repo / ".refactor-loop" / "state" / "review-thread-completion" + completion_dir.mkdir(parents=True) + (completion_dir / "pr77.json").write_text( + json.dumps( + { + "review_thread_driven": True, + "thread_id": "PRRT_kwDOExample", + "replied": True, + "resolved": True, + } + ), + encoding="utf-8", + ) + self.write_completed_log("fix-pr77-r3.log", "FIX_DONE") + + plan = self.run_plan(fixture="review_thread_paginated_unresolved") + + action = next(item for item in plan["actions"] if item["action_id"].startswith("completed-marker:fix-pr77-r3")) + self.assertEqual(action["route"], "review-thread-completion-gate") + self.assertTrue(action["status_only"]) + self.assertNotIn("controller_action", action) + query_log = (self.repo / "gh-query-labels.log").read_text(encoding="utf-8") + self.assertIn("api graphql", query_log) + self.assertIn("after=cursor1", query_log) + + def test_fix_done_blocks_when_original_review_thread_repo_slug_is_missing(self) -> None: + completion_dir = self.repo / ".refactor-loop" / "state" / "review-thread-completion" + completion_dir.mkdir(parents=True) + (completion_dir / "pr77.json").write_text( + json.dumps( + { + "review_thread_driven": True, + "thread_id": "PRRT_kwDOExample", + "replied": True, + "resolved": True, + } + ), + encoding="utf-8", + ) + (self.repo / ".refactor-loop" / "host.env").write_text( + f"REPO_ROOT={self.repo}\nCODEX_FLOOR=5\n", + encoding="utf-8", + ) + self.write_completed_log("fix-pr77-r3.log", "FIX_DONE") + + plan = self.run_plan() + + action = next(item for item in plan["actions"] if item["action_id"].startswith("completed-marker:fix-pr77-r3")) + self.assertEqual(action["route"], "review-thread-completion-gate") + self.assertTrue(action["status_only"]) + self.assertNotIn("controller_action", action) + def test_runner_named_helper_projection_remains_executable(self) -> None: plan = self.run_plan(fixture="unpushed")