chore: release develop#7
Open
github-actions[bot] wants to merge 1 commit into
Open
Conversation
ed3ced8 to
dfddd94
Compare
d22e869 to
56403f0
Compare
Sawmonabo
added a commit
that referenced
this pull request
May 7, 2026
Codex P1 (PR #33 R2): two gaps in the subagent-stage validator forced false-positive round-trips on legitimate halt outcomes. Gap 1 — `addressing: <item-key>` undocumented. The canonical-template responsibility list never told the subagent that concerns deferring a `semantic_work_pending` item must set `addressing: "<exact-item-key>"` to satisfy the validator's per-item check. The match key was a hidden contract. Gap 2 — per-item check fired on halt-states. When subagent returned BLOCKED or NEEDS_CONTEXT (it stopped before completing semantic work), the per-item check still demanded coverage of every pending item, forcing wasted dispatches. Fix: - Contract doc canonical template adds responsibility #5 documenting the `addressing: <item-key>` requirement (renumbers existing #5/#6/#7 -> #6/#7/#8). Validation invariants line clarifies the matching key + halt-state waiver. - JS CANONICAL_TEMPLATE const re-synced verbatim per Plan §Decisions-Locked D-1; D-7 row 12 snapshot test confirms sync. - validateManifestSubagentStage skips the per-item loop when manifest.result is "BLOCKED" or "NEEDS_CONTEXT". DONE_WITH_CONCERNS still requires per-item pairing — waiver narrowly scoped to halts. - Gap message now hints at the contract requirement so future regressions self-document. Tests: 4 new regression tests (BLOCKED waiver, NEEDS_CONTEXT waiver, DONE_WITH_CONCERNS still requires pairing, `addressing` match by exact item-key with `kind` irrelevant). Note: Codex Finding 2 (missing merge-command steps in SKILL.md Phase E) is a deeper architectural drift between spec (pre-merge housekeeping) and plan/SKILL.md (post-merge ordering) — deferred to user direction; not addressed here. Refs: PR #33 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sawmonabo
added a commit
that referenced
this pull request
May 7, 2026
- SKILL.md Phase E step 7: explicitly stage docs/plans/<plan-file> alongside <affected_files>. The Progress Log append in step 6 is orchestrator-controlled (not script-controlled) so the plan file is not in manifest.affected_files — especially under exit 3 (no Done Checklist sub-section) where the script skips checklist ticking entirely. Omitting it would leave the Progress Log edit unstaged, breaking the documented "single commit bundles housekeeping + log" guarantee. - validator: add scriptSchemaViolations parameter + check #9 (immutable superset comparison by composite key kind+field+ns_id). Mirrors check #7's scriptAffectedFiles superset semantics. Without it, a subagent could clear manifest.schema_violations and return DONE/DONE_WITH_CONCERNS, bypassing check #6 BLOCKED enforcement entirely. - 3 regression tests pin the new check #9 boundary: cleared (fail), preserved (pass), and additive (pass — subagent surfacing schema problems the script missed). Closes Codex Findings F-AMP7W + F-AMP7Y on PR #33 (P1). 44 helpers + 116 main tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sawmonabo
added a commit
that referenced
this pull request
May 7, 2026
Path-(a) A2 design closes the system-level half of the contract-half-fix bypass class (F-AMP7Y / F-AMSEL / F-AMWXK). Validator-API parameters were in place but no production caller plumbed them, so the four preservation / iteration checks (#7 / #9 / #10 / #11) silently disabled. Script now writes _script_stage as an immutable snapshot of the four arrays the validator enforces (affected_files / schema_violations / verification_failures / semantic_work_pending). Validator reads the snapshot from the manifest with precedence: explicit scriptXXX param > manifest._script_stage[field] > null. Structural-tampering check fires whenever the snapshot path is relied on (any scriptXXX null), catching secondary attacks where the subagent removes _script_stage to defeat the snapshot defense. Subagent contract: _script_stage is READ-ONLY. Touching it surfaces as a Codex F-AMbIV gap. Documented in the canonical subagent prompt template, the agent definition file, the contract doc § _script_stage immutability, and SKILL.md Phase E step 5. Tests: 4 new F-AMbIV regressions (A2 fallback / missing-snapshot gap / malformed-snapshot gap / param precedence). All 171 housekeeper tests pass (55 helper + 116 main). 14 fixture snapshots updated to mirror the new emission. 42 existing test fixtures bulk-updated with empty _script_stage to satisfy the structural check on legacy code paths. Refs: PR #33 Codex finding F-AMbIV
5 tasks
Sawmonabo
added a commit
that referenced
this pull request
May 10, 2026
* feat(repo): add plan-execution-housekeeper subagent definition Adds the 7th plan-execution subagent role per spec §5.2: blue color, tools Read/Grep/Glob/Edit/Write (Bash omitted to mechanically enforce Plan Invariant I-3 — no git shell-out from this role). Body sections mirror plan-execution-contract-author.md shape: Inputs, Mindset, Hard rules, Decision presentation, Exit states (only the 4 canonical from failure-modes.md per Plan Invariant I-2), Report format, Reference files. Refs: Plan-housekeeper Phase 4 Task 4.1; Spec §5.2. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(repo): add post-merge-housekeeper-contract.md reference doc The contract is the housekeeper subagent's portable reference; sections extract spec §5.3 manifest schema, §5.1 exit codes, §5.3 validation invariants, §7.2 resume diagnostic, §3a.2 completion-rule matrix, §3a.4 file-reference extraction heuristic. Adds the canonical subagent prompt template that buildHousekeeperPrompt (Task 4.3) reproduces verbatim and the Layer 2 snapshot test (Task 4.8) pins. Refs: Plan-housekeeper Phase 4 Task 4.2; Spec §9.1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(repo): fix housekeeper rule-20 sprawl cite in prompt template The Canonical Subagent Prompt Template (Task 4.2 verbatim source + deployed contract) cited `rule 21` for the sprawl-routing case, but Task 4.7 defines rule 20 = sprawl (round-trip dispatch) and rule 21 = schema-violation halt → BLOCKED. Also re-frames the routing prose: sprawl triggers an orchestrator round-trip (DONE_WITH_CONCERNS is a secondary outcome only after weak-justification re-dispatch, not the primary route). Without this fix, Task 4.8's snapshot test would pin the wrong rule number into CI fixtures. Fixes both the canonical contract and the plan's verbatim source so a re-run of Task 4.2 cannot reintroduce the bug. Refs: Plan-housekeeper Phase 4 Task 4.2; Task 4.7 rule 20. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(repo): orchestrator helpers + Layer 2 prompt/validator tests Adds .claude/skills/plan-execution/lib/housekeeper-orchestrator-helpers.mjs exporting buildHousekeeperPrompt (BASE template from contract + conditional AUTO-CREATE / schema-violation augmentations) and validateManifestSubagentStage (catches: pending-item gaps, file-residual placeholders, semantic_edits values P2 placeholders, nested-array placeholders, schema_violations x concerns reconciliation gaps). 8 unit tests TDD-authored. Refs: Plan-housekeeper Phase 4 Task 4.3; Spec §5.3, §8.2. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(repo): lift housekeeper PLACEHOLDER constant + cleanup tmp dirs Code-quality review POLISH items: lift `<TODO subagent prose>` literal to module scope (single source of truth across walkForPlaceholder + validateManifestSubagentStage), and reap Test 6's tmp dir via try/finally + rmSync to match sibling test convention. Refs: Plan-execution-housekeeper Phase 4 Task 4.3 * docs(repo): rewrite SKILL.md Phase E section for housekeeper subagent Phase E now describes the 8-step post-merge housekeeping flow: candidate-lookup over §6 NS catalog, script dispatch (--candidate-ns / --auto-create / NEEDS_CONTEXT on multi-match), script-stage manifest validation, plan-execution-housekeeper subagent dispatch, subagent-stage validation, Progress Log append (moved from before-merge to after-housekeeping per spec §6.1), single bundled commit, push. Per Plan §Decisions-Locked D-3 the housekeeper ticks ALL Done-Checklist boxes for the merged Phase. Footnote at step 1 cross-links to Plan §D-7 fixture matrix. Refs: spec §5.4, §6.1, §6.2, §7.1; Plan-execution-housekeeper Phase 4 Task 4.4. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(repo): mark Phase E D-3 cite as bullet per plan §Task-4.4 wording Plan line 2923 prescribes "A new bullet at the top of Phase E noting ..." but the Task 4.4 rewrite rendered the D-3 line as a paragraph. Prepend the list marker so the rendering matches the plan's "new bullet" framing. Refs: Plan-execution-housekeeper Phase 4 Task 4.4. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(repo): fix housekeeper Phase E Progress Log path drift POLISH from Task 4.4 code-quality review: - Step 6 path drifted to gitignored `.agents/tmp/<session-id>/progress.md`, which is incompatible with step 7's "single git commit on develop" (spec §6.1 line 905 mandates same-commit). Restore plan-body `docs/plans/NNN-*.md` § Progress Log location. - GFM footnote ref `Rules:[^d7]` needs space before `[^d7]`. Mirrored fix in plan source per Task 4.2 rule-20 precedent (commit b5f1e10) so re-runs don't reintroduce the path drift. Refs: Plan-execution-housekeeper Phase 4 Task 4.4 * docs(repo): register 7th plan-execution-housekeeper role in SKILL.md Task 4.5 of plan-execution-housekeeper Phase 4: bumps the orchestrator's role count six → seven, adds `plan-execution-housekeeper` to all three role enumerations (line 29 introduction, line 42 no-git rule, line 515 Reference Files subagent-definitions row), and adds a Reference Files row pointing to `references/post-merge-housekeeper-contract.md` (authored in Task 4.2, commit b5f1e10's predecessor). Set-quantifier sweep includes lines 42 and 515 alongside the explicit line 29 from the plan task spec — per same-class-sweep discipline, the "Six subagent roles" claim mirrors at all three sites. Refs: Plan-execution-housekeeper Phase 4 Task 4.5 * docs(repo): add Phase E recovery + inline lookup rules; reserve BL-109 state-recovery.md: append "Resuming a Phase E housekeeping halt" diagnostic (per spec §7.2) and mirror the four §4.3.2 candidate-lookup rules inline (per spec §4.3.5) so a recovering engineer doesn't need to cross-link to SKILL.md to debug rule-N mismatches. Flag the existing line-174 .agents/tmp/ lifecycle drift with a \`> Note:\` block linking to BL-109. backlog.md: reserve BL-109 to track reconciling state-recovery.md's ".agents/ tmp/ deleted at commit time" claim with lefthook.yml (which has no such job). Refs: spec §6.1, §7.2, §4.3.5, §9.3; Plan §Decisions-Locked D-5. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(repo): add slug-fragility warning above BL-109 heading POLISH from Task 4.6 code-quality review: an inbound anchor in state-recovery.md (the > Note: callout authored alongside BL-109) points to the GitHub-slugified BL-109 heading. The heading contains backticks + dots + slashes that get stripped by GFM, so any future heading reword could silently break the inbound anchor without any hook catching it. Add an HTML comment above the heading documenting the dependency, mirroring the PR #24 / PR #27 canonicalization-sweep lessons. Refs: Plan-execution-housekeeper Phase 4 Task 4.6 * docs(repo): add Phase E housekeeper routing rules 20-21 to failure-modes Adds rules 20 (housekeeper subagent affected_files sprawl detection + round-trip resolution) and 21 (script schema-violation BLOCKED routing reusing rule-4's graceful-drain) to failure-modes.md. Both reuse the existing four canonical exit-states (DONE / DONE_WITH_CONCERNS / NEEDS_CONTEXT / BLOCKED) per spec §7.1 invariant; no new exit-state introduced. Refs: spec §7.1; Plan-execution-housekeeper Phase 4 Task 4.7 * docs(repo): align cross-link cite to canonical "Exit codes" heading POLISH from Task 4.7 review: 4 citation sites used "§ Exit Codes" (capital C) while the contract heading at references/post-merge-housekeeper-contract.md:70 is "## Exit codes" (lowercase). Sweep direction lowercases the citers (not capitalizes the heading), because sentence-case matches: (a) all 5 sibling ## headings in post-merge-housekeeper-contract.md (b) sibling references/preflight-contract.md:13 "## Exit codes" Sites swept (4): - references/failure-modes.md:206 (rule 21, just landed in bc9c03b) - references/state-recovery.md:198 (Task 4.6 commit 5f0d04b) - plan source line 2989 (state-recovery.md verbatim mirror) - plan source line 3120 (failure-modes.md verbatim mirror) Refs: Plan-execution-housekeeper Phase 4 Task 4.7 * test(repo): add Layer 2 D-7 row + I-1/I-2/I-3 invariant unit tests Adds 7 unit tests to post-merge-housekeeper-orchestrator-helpers.test.mjs covering Plan §Decisions-Locked D-7 rows 12-15 (prompt-template snapshot, zod schema parse, affected_files superset, sprawl detection routing) and the I-1/I-2/I-3 invariants (cite-target hook regression, canonical exit- state declaration, no-shell-out for git). Adds new helper detectAffectedFilesSprawl to housekeeper-orchestrator- helpers.mjs (pure function, set-difference) and extends validateManifestSubagentStage with an optional scriptAffectedFiles param that enforces the D-7 row 14 superset check. Adds zod ^4.3.6 to root devDependencies so the schema test resolves the import. Adds the RESULT: prefix to the four canonical exit-states in plan-execution-housekeeper.md so the I-2 invariant test detects all four declarations consistently. Refs: Plan-execution-housekeeper Phase 4 Task 4.8 * docs(repo): fix plan Task-4.9 test command to use glob not directory Plan §Task-4.9 Step 1 prescribed `node --test <directory>/` but Node 22.21 does NOT directory-walk explicit path args — only auto-discovers from cwd. The directory-arg was treated as a module path and threw MODULE_NOT_FOUND. Use `<directory>/*.test.mjs` glob instead. Also corrects the expected-test-count line: Task 4.8 added 7 tests (4 D-7 rows + 3 invariants), not 4 D-7 rows alone. Refs: Plan-execution-housekeeper Phase 4 Task 4.9 * fix(repo): housekeeper schema_violations × concerns per-entry match The validator's reconciliation loop used `.some(c => c.kind === "schema_violation")` which ignored `sv` and let a single generic concerns entry satisfy N violations. Fix: filter concerns by kind once, then match each schema_violations entry structurally on `field` (and `ns_id` when present). Update the gap-message to reference the actual differentiator (e.g. "NS-15.type") instead of the always-literal `sv.kind`. Adds three regression tests: - 2 violations + 1 generic concern → exactly the unmatched 'type' violation gaps - 2 violations + 2 specific concerns → passes - --candidate-ns shape (no ns_id) → matches by `field` alone Refs: PR #33 Codex review thread r3198337816 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(repo): housekeeper validator halt-state waiver + addressing key doc Codex P1 (PR #33 R2): two gaps in the subagent-stage validator forced false-positive round-trips on legitimate halt outcomes. Gap 1 — `addressing: <item-key>` undocumented. The canonical-template responsibility list never told the subagent that concerns deferring a `semantic_work_pending` item must set `addressing: "<exact-item-key>"` to satisfy the validator's per-item check. The match key was a hidden contract. Gap 2 — per-item check fired on halt-states. When subagent returned BLOCKED or NEEDS_CONTEXT (it stopped before completing semantic work), the per-item check still demanded coverage of every pending item, forcing wasted dispatches. Fix: - Contract doc canonical template adds responsibility #5 documenting the `addressing: <item-key>` requirement (renumbers existing #5/#6/#7 -> #6/#7/#8). Validation invariants line clarifies the matching key + halt-state waiver. - JS CANONICAL_TEMPLATE const re-synced verbatim per Plan §Decisions-Locked D-1; D-7 row 12 snapshot test confirms sync. - validateManifestSubagentStage skips the per-item loop when manifest.result is "BLOCKED" or "NEEDS_CONTEXT". DONE_WITH_CONCERNS still requires per-item pairing — waiver narrowly scoped to halts. - Gap message now hints at the contract requirement so future regressions self-document. Tests: 4 new regression tests (BLOCKED waiver, NEEDS_CONTEXT waiver, DONE_WITH_CONCERNS still requires pairing, `addressing` match by exact item-key with `kind` irrelevant). Note: Codex Finding 2 (missing merge-command steps in SKILL.md Phase E) is a deeper architectural drift between spec (pre-merge housekeeping) and plan/SKILL.md (post-merge ordering) — deferred to user direction; not addressed here. Refs: PR #33 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(repo): housekeeper schema_violations matcher discriminates by kind Codex P2 (PR #33 R3): the schema_violations × concerns matcher's predicate `c.field !== sv.field` was trivially false when both sides lacked a `field` (undefined !== undefined), letting a single generic concern absorb multiple distinct-shape violations. Plus the pre-filter `c.kind === "schema_violation"` excluded shapes the script actually emits — `auto_create_title_seed_underivable` (line 899) is a singleton with no `field`/`ns_id`, so it would never find a matching concern. Fix: - Validator drops the kind="schema_violation" pre-filter and adds `c.kind === sv.kind` to the per-violation match predicate. Each violation must now be paired to a concern of the SAME kind. - Gap message handles fieldless violations by falling back to a `(kind: <kind>)` label (was printing literal "undefined") and reflects the actual match-key requirements. - Contract doc (validation invariants line 93 + canonical template responsibility #4) updated to describe the actual three-shape rule (schema_violation OR auto_create_title_seed_underivable singleton) rather than hardcoding `kind: "schema_violation"`. - JS CANONICAL_TEMPLATE const re-synced verbatim per Plan §Decisions-Locked D-1; D-7 row 12 snapshot test confirms sync. - Validator JSDoc check #4 expanded to call out the kind discriminator and the trivial-equality vulnerability it closes. Tests: 3 new regression tests (distinct-kind violations require kind-discriminated concerns, singleton kind matches by kind alone, fieldless-violation gap message format). Refs: PR #33 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(repo): housekeeper kind-contract sweep across agent + spec + plan Codex P1 (PR #33 R3 Finding 4): .claude/agents/plan-execution-housekeeper.md line 41 still mandated `kind: schema_violation` literally even after commit 77f4435 changed the validator to discriminate by sv.kind verbatim. The Finding 3 fix updated the contract doc + JS const + validator, but missed the same-axis prose in 4 other locations across the doc tree — classic canonicalization-sweep miss per `feedback_canonicalization_sweep_completeness.md`. Sweep covers all parallel hard-rule prose + invariants: - .claude/agents/plan-execution-housekeeper.md:41 (Codex flagged direct) - docs/superpowers/specs/.../housekeeper-design.md:617 (spec hard rule) - docs/superpowers/specs/.../housekeeper-design.md:693 (validation invariants) — also brings the invariants prose into alignment with Finding 1's `addressing: <item-key>` + halt-state waiver wording (was stale from the prior PR cycle's runtime fix) - docs/superpowers/specs/.../housekeeper-design.md:953 (failure-mode table) - docs/superpowers/plans/.../housekeeper-implementation.md:2477 (plan hard rule) Skipped (out of canonicalization-sweep scope): - Plan source's prescribed canonical template (lines 2570-2595) - Plan source's prescribed validator code (lines 2785-2839) These are historical implementation guides describing what the implementer was supposed to write at PR-construction time. Subsequent fixes (Findings 1 and 3) evolved the implementation; updating prescription post-hoc is plan-archeology, not contract-alignment. Note: Codex Finding 5 (PRRT_kwDOSCycWc6ALNW_) re-raises the architectural drift originally flagged as Finding 2 (PRRT_kwDOSCycWc6AK48r) — both about missing pre-merge / merge-step commands in SKILL.md Phase E. That question is OUT OF SCOPE for this commit; still awaits user direction on Path A (restructure SKILL.md + plan Task 4.4 per spec §4.1/§6.2 ordering) vs Path B (add D-8 Decision Log + reconcile line-45 contradiction). Refs: PR #33 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(repo): housekeeper validator enforces canonical result + BLOCKED Validator now enforces two contract invariants from references/post-merge-housekeeper-contract.md line 93 that prior rounds missed: - result !== null + ∈ {DONE, DONE_WITH_CONCERNS, NEEDS_CONTEXT, BLOCKED}: blocks any non-canonical exit state (was: undefined slipped through). Closes Codex finding PRRT_kwDOSCycWc6ALVFW (P2). - schema_violations.length > 0 ⇒ result === "BLOCKED": surfacing alone is insufficient; the contract requires the BLOCKED exit state for halt/routing-path determinism. Closes Codex finding PRRT_kwDOSCycWc6ALVFN (P1). Tests: 25 → 30 (added 5 cases — 3 canonical-state, 2 BLOCKED-on-violations). Existing fixtures gained explicit `result` fields so they assert under the new enforcement intentionally rather than passing by accident. JSDoc on validateManifestSubagentStage now enumerates 7 numbered checks in code order; the new canonical-state check is #1 and the new BLOCKED-when-schema-violations check is #6. D-7 row 12 snapshot test (canonical template ↔ JS const verbatim sync) unchanged and continues to pin contract↔JS sync. Refs: PR #33 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(repo): housekeeper requires non-empty semantic_edits payload Validator's pending-item reconciliation used hasOwnProperty.call() which only checks key presence. A subagent returning semantic_edits.compose_status_prose = undefined would pass the check despite zero payload, letting DONE/DONE_WITH_CONCERNS slip through with unresolved semantic work and skipping a needed round-trip. Tightened to require a meaningful payload: not null/undefined, not empty string (after trim), not empty array, not empty object. Numbers/booleans accepted defensively for future contract evolution. Closes Codex finding PRRT_kwDOSCycWc6ALgop (P2). Tests: 30 → 35 (added 4 empty-variant cases + 1 negative control). Helper isMeaningfulPayload extracted to module scope alongside walkForPlaceholder for testability and single-source-of-truth. JSDoc check #2 now clarifies the value-must-be-meaningful rule and cites the contract clause "containing the composed output" from canonical-template responsibility #5. Refs: PR #33 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(repo): housekeeper gaps when declared affected_file is missing Validator's placeholder-scan loop silently skipped any affected_files entry that no longer existed on disk (existsSync gate). A subagent that deleted a declared file (e.g. cross-plan-dependencies.md) would pass validation despite the contract clause that affected_files are files the subagent EDITED — implying they exist post-edit. Loop now pushes a gap when existsSync is false ("declared in affected_files but missing from disk — destructive out-of-scope behavior") AND continues (skipping the placeholder read for the missing file). Existing-file branch unchanged. Closes Codex finding PRRT_kwDOSCycWc6ALnyo (P1). Tests: 35 → 36 (added missing-file gap; negative case covered by existing placeholder-scan fixtures). Contract doc clause at line 93 augmented with explicit parenthetical: deletion of a declared file is a contract violation surfaced by the validator's missing-file gap. JSDoc check #3 now mentions both placeholder-scan AND existence-check since both live in the same loop. Refs: PR #33 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(repo): housekeeper BLOCKED on verification_failures + scope flip Two fixes from the Codex re-review pass on 5b3ff94: Finding 11 (P1): the validator's BLOCKED-on-X enforcement only covered schema_violations (post-Finding-6 fix). The script exit-2 path populates verification_failures and demands the same halt routing, but the validator silently accepted DONE/DONE_WITH_CONCERNS for that case too. Added the mirror check; closes PRRT_kwDOSCycWc6ALt55. Also added the parallel formal-invariant sentence to references/post-merge-housekeeper-contract.md §Validation invariants line 93 — the validator was enforcing a rule the contract only stated narratively (line 79); the contract now states it formally so the validator and the formal invariants list stay in lock-step. Finding 12 (P2): SKILL.md:443 said affected_files was a subset of actual edits. Inverted — the contract doc (line 93) and the subagent-stage validator's D-7 row 14 superset check both go the other way (declared is a superset of actual, so out-of-scope writes are detected). Flipped the direction to match. Closes PRRT_kwDOSCycWc6ALt59. Tests: 36 -> 38 (added BLOCKED-on-verification_failures positive + negative case mirroring the schema_violations pair). JSDoc check #8 added for the new validator check. D-7 row 12 snapshot test (canonical template <-> JS const verbatim sync) unchanged — passes. Refs: PR #33 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(repo): housekeeper SKILL.md sprawl + zero-match prose alignment Two prose-only fixes from the Codex re-review pass on b9c887b: Finding 13 (P1): SKILL.md:449 said out-of-scope subagent edits route straight to DONE_WITH_CONCERNS. The contract (failure-modes.md rule 20 + spec §7.3) requires a round-trip first — the orchestrator re-dispatches the subagent with an explicit (a) revert OR (b) extend-affected_files+ concern choice; DONE_WITH_CONCERNS only fires if the subagent picks (b) with weak justification, not on first detection. Updated the prose to state the round-trip flow inline so a reader cannot route past the re-dispatch step. Closes PRRT_kwDOSCycWc6AL1aI. Finding 14 (P2): SKILL.md:433's Rule 4 ("No-match fallback") said zero matches drop to the step 2 NEEDS_CONTEXT branch — but step 2 line 437 defines "0 candidate matches → --auto-create" and "2+ → NEEDS_CONTEXT" (spec §4.3.2 + the §5.5 candidate-dispatch table at design line 448-450 agree). Inverted the cite: 0 matches now correctly drop to --auto-create (genuinely new work), and the prose calls out that NEEDS_CONTEXT is reserved for the 2+ ambiguity case. Closes PRRT_kwDOSCycWc6AL1aJ. No code or test changes — prose-only alignment to cited contracts. Refs: PR #33 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(repo): housekeeper sprawl routes REDISPATCH not DONE_WITH_CONCERNS Finding 15 (P1) from the manual @codex review on 3ff9b63: The JS function `detectAffectedFilesSprawl` lagged the SKILL.md prose fix landed in 3ff9b63 for Finding 13. The prose now correctly states that out-of-scope edits trigger a round-trip re-dispatch per failure-modes.md rule 20 — the orchestrator offers (a) revert OR (b) extend+justify; only if the subagent picks (b) with weak justification does the orchestrator downgrade to DONE_WITH_CONCERNS. But the JS function still returned `suggestedRouting: "DONE_WITH_CONCERNS"` at first detection, so the canonical helper would have routed Phase E past the rule-20 round-trip into the merge-and-continue path even though the prose says otherwise. This is the JS half of the contract-half-fix pattern (mirror of b9c887b's "validator enforces a rule the contract only states narratively" — same shape, opposite direction: code lagging prose instead of prose lagging code). Fix: - `detectAffectedFilesSprawl` now returns `suggestedRouting: "REDISPATCH"` on first detection plus a `redispatchPromptTemplate` field carrying the verbatim rule-20 (a)/(b) prompt the orchestrator should send back to the subagent (with the out-of-scope file list substituted in) - JSDoc rewritten to describe the round-trip flow + the post-redispatch validation step + when DONE_WITH_CONCERNS is the legitimate outcome (only after the subagent picks (b) with weak justification) - Existing test renamed + retargeted to assert REDISPATCH + the prompt template's structural requirements (file enumeration, (a)/(b) text, affected_files_extension kind name) - New negative-case test: no-sprawl returns `{ sprawl: false }` exactly Tests: 38 -> 39. Full housekeeper suite 116/116 still passes; D-7 row 12 snapshot test (canonical template <-> JS const verbatim sync) unchanged. Closes PRRT_kwDOSCycWc6AL-nZ. Refs: PR #33 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(repo): housekeeper I-1 test asserts caught exception not stderr Finding 16 (P2) from Codex re-review on 7a8a67f. The I-1 invariant test (cite-target hook regression canary) had a nullish-coalescing-chain bug that produced false-positive passes: stderr = e.stderr?.toString() ?? e.stdout?.toString() ?? String(e); assert.equal(stderr, ""); `??` only falls back on null/undefined — not on empty string. If the hook exited nonzero with empty stderr (e.g. error written only to stdout, or process killed before flushing), `e.stderr` was the empty string `""`, the chain stopped there, and the final equality check passed. Cite-target regressions could slip through CI silently. Fix: capture the exception itself in `caughtError`, assert `caughtError === null`, and include `exit code + stderr + stdout` in the failure message for diagnosis. Now any throw fails the test — empty-stderr false-positives are impossible. Tests: 39/39 still pass on the healthy current catalog. The fix sharpens the failure mode without changing the happy-path semantics. Closes PRRT_kwDOSCycWc6AMC3r. Refs: PR #33 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(repo): housekeeper Phase D.5 + auto-merge PR landing per D-8 - SKILL.md: insert Phase D.5 (merge transition: gh pr ready/checks/merge, git pull --ff-only) between Phase D and E so the merged squash-commit Phase E reads from is explicitly produced; rewrite Phase E steps 7-8 to land via housekeeping/PR<N> auto-merge PR (preserves no-direct-push hard rule line 45 + adds PR-CI gate to housekeeping diff). - contract: append §Housekeeping commit landing (branch shape, PR title identical to commit subject, PR body, --auto --squash mechanics, CI failure handling). - plan: add D-8 Decision-Log entry recording the architectural drift + resolution; mark Task 4.4 verbatim Phase E prose SUPERSEDED in PR #33 (historical snapshot retained per ADR superseded-by convention). Closes Codex Findings 2/5/8 on PR #33 (architectural drift between Phase E direct-push prose and SKILL.md no-direct-push hard rule). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(repo): housekeeper scalar payload reject + extension contract align - isMeaningfulPayload: drop catch-all `return true` for booleans/numbers; the contract is composed completion-prose, which can't be a scalar. Prior behavior let a subagent satisfy `semantic_work_pending` with `false`/`0` and pass validation. Adds 2 regression tests pinning the rejection boundary for `false` and `0` payloads. - plan-execution-housekeeper.md report-format: replace stale field reference `concerns[kind=affected_files_extension].path` with the actual flow per failure-modes rule 20 — extended `manifest.affected_files` carries the path; the concerns entry uses `kind` + `addressing` for rationale (no `path` sub-field). Aligns the agent contract with the validator + sprawl-detector + canonical prompt. Closes Codex Findings F-AMLor + F-AMLos on PR #33 (P2). 41 helpers + 116 main tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(repo): housekeeper Phase E plan-file stage + violations preserve - SKILL.md Phase E step 7: explicitly stage docs/plans/<plan-file> alongside <affected_files>. The Progress Log append in step 6 is orchestrator-controlled (not script-controlled) so the plan file is not in manifest.affected_files — especially under exit 3 (no Done Checklist sub-section) where the script skips checklist ticking entirely. Omitting it would leave the Progress Log edit unstaged, breaking the documented "single commit bundles housekeeping + log" guarantee. - validator: add scriptSchemaViolations parameter + check #9 (immutable superset comparison by composite key kind+field+ns_id). Mirrors check #7's scriptAffectedFiles superset semantics. Without it, a subagent could clear manifest.schema_violations and return DONE/DONE_WITH_CONCERNS, bypassing check #6 BLOCKED enforcement entirely. - 3 regression tests pin the new check #9 boundary: cleared (fail), preserved (pass), and additive (pass — subagent surfacing schema problems the script missed). Closes Codex Findings F-AMP7W + F-AMP7Y on PR #33 (P1). 44 helpers + 116 main tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(repo): housekeeper merge-state poll + verification_failures preserve - SKILL.md Phase E step 8: add mergedAt poll AFTER gh pr checks --watch. gh pr merge --auto can leave the PR in a merge queue after CI passes (gh CLI docs: "If required checks have passed, the pull request will be added to the merge queue"), so checks-complete does NOT mean merged in merge-queue repos. Polling gh pr view --json mergedAt -q .mergedAt for a non-empty value waits until the squash-commit is on develop. - validator: add scriptVerificationFailures parameter + check #10 (immutable superset comparison by canonical JSON key — verification_failure entries are small structured shapes without a stable composite key, so JSON.stringify with sorted keys is the comparison key). Mirrors check #9 for schema_violations. Without it, a subagent could clear manifest.verification_failures and return DONE/DONE_WITH_CONCERNS, bypassing check #8's BLOCKED enforcement for exit-2 halt. - 3 regression tests pin the new check #10 boundary: cleared (fail), preserved (pass), and key-order-independent JSON canonicalization (pass — protects against false-positive gaps when the subagent re-stringifies). Closes Codex Findings F-AMSEJ + F-AMSEL on PR #33 (P1). 47 helpers + 116 main tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(repo): housekeeper validator non-file guard for affected_files Codex F-AMU4D (PR #33 P2): the placeholder-scan loop called readFileSync unconditionally after the existsSync gate, throwing EISDIR on directory paths and any other I/O error, crashing the orchestrator instead of routing through the validator's gap collection. Add statSync().isFile() guard + try/catch around statSync and readFileSync; each failure path emits a distinct gap citing F-AMU4D so the failure mode is identifiable. The contract requires affected_files entries to be regular files (the script edits line-level content); a directory is a contract violation that MUST surface as a gap so Phase E can re-dispatch — not an unhandled exception that halts orchestration. Regression test: a directory at the affected_files path now emits a gap matching "is not a regular file" + "Codex F-AMU4D" rather than crashing with EISDIR. Tests: 48 helpers (47 + 1 new) + 116 main pass. * fix(repo): housekeeper validator union pending iteration for F-AMWXK Codex F-AMWXK (PR #33 P1): validateManifestSubagentStage iterated manifest.semantic_work_pending (subagent-written), so a subagent could clear that array + return DONE/DONE_WITH_CONCERNS — the per-item pairing loop ran zero times and emitted zero gaps, letting unaddressed semantic work pass validation. SAME bypass shape as F-AMP7Y (schema_violations) + F-AMSEL (verification_failures). STRUCTURAL DIVERGENCE from checks #9/#10: those guard array-non-emptiness → BLOCKED state with a separate preservation check. F-AMWXK leverages the EXISTING per-item iteration by feeding it the UNION of script-stage snapshot + subagent-written array. The script snapshot is the immutable contract (subagent CANNOT bypass by clearing); the subagent's additions are also iterated (subagent committed to address anything they added). Why union not preservation-check: a separate preservation check would emit "X missing from semantic_work_pending" — forcing 3 round-trips (R1 array-shrink gap → R2 re-add → R3 address). Union semantics keep the existing "X listed but unaddressed" gap actionable in ONE round-trip. Add scriptSemanticWorkPending parameter (defaults null for legacy callsites + tests). Updated JSDoc check #11 documenting the new parameter, the union semantics, and the structural divergence. Regression tests (3 — pin both directions of the boundary plus the union semantics): - cleared semantic_work_pending → 2 gaps for both script-stage items - preserved + addressed → valid - subagent-added items also iterated → 1 gap for unaddressed addition Tests: 51 helpers (48 + 3 new) + 116 main pass. * fix(repo): housekeeper validator manifest _script_stage for F-AMbIV Path-(a) A2 design closes the system-level half of the contract-half-fix bypass class (F-AMP7Y / F-AMSEL / F-AMWXK). Validator-API parameters were in place but no production caller plumbed them, so the four preservation / iteration checks (#7 / #9 / #10 / #11) silently disabled. Script now writes _script_stage as an immutable snapshot of the four arrays the validator enforces (affected_files / schema_violations / verification_failures / semantic_work_pending). Validator reads the snapshot from the manifest with precedence: explicit scriptXXX param > manifest._script_stage[field] > null. Structural-tampering check fires whenever the snapshot path is relied on (any scriptXXX null), catching secondary attacks where the subagent removes _script_stage to defeat the snapshot defense. Subagent contract: _script_stage is READ-ONLY. Touching it surfaces as a Codex F-AMbIV gap. Documented in the canonical subagent prompt template, the agent definition file, the contract doc § _script_stage immutability, and SKILL.md Phase E step 5. Tests: 4 new F-AMbIV regressions (A2 fallback / missing-snapshot gap / malformed-snapshot gap / param precedence). All 171 housekeeper tests pass (55 helper + 116 main). 14 fixture snapshots updated to mirror the new emission. 42 existing test fixtures bulk-updated with empty _script_stage to satisfy the structural check on legacy code paths. Refs: PR #33 Codex finding F-AMbIV * fix(repo): housekeeper dispatch routing helper for F-ANGCa phase e step 4 dispatched the housekeeper subagent unconditionally, but the contract classifies exits 1/4/>=6 as orchestrator-stage halts (operator action required, not subagent work). dispatching over a malformed/absent manifest forced the llm to interpret garbage and emit a result tag based on hallucinated state. encode the dispatch/halt mapping in a tested helper (decideHousekeeperRouting) so skill prose, contract, and any future audit script delegate to one source of truth, mirroring the f-amp7y/f-amsel/ f-amwxk/f-ambiv pattern (move enforcement out of prose, into validators). 10 unit tests pin the mapping per exit class (0/1/2/3/4/5/6/137/-1/nan). Refs: PR #33, F-ANGCa Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(repo): strip review-thread provenance from agent prompt context skill prose, agent rules, contract reference, and the canonical subagent prompt template all carried `Codex F-AMbIV` / `Codex F-ANGCa` / `Codex P1 PR #33` review-thread suffix labels. these are github graphql ids with no semantic content for any reader outside the originating pr review tab — they violate the global rule against referencing the current task / fix / caller in code (claude.md), and they bleed review-process metadata into the housekeeper subagent's prompt every dispatch. the rule's actual rationale already lives in the surrounding prose; the f-tag added zero behavioral signal while consuming prompt tokens. scope is bounded to runtime-loaded surfaces (skill prose, agent rule list, contract reference, canonical_template constant). validator gap messages and jsdoc annotations stay — they're operator-facing / engineer-facing, not loaded into the housekeeper subagent's prompt context. the breadcrumb trail to source threads remains in commit messages and test names. Refs: PR #33 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(repo): housekeeper validator path containment for F-AxXBH Validator's affected_files loop joined declared paths against repoRoot without first checking they were repo-relative. Malformed manifests emitting an absolute path (/etc/passwd) or parent-traversal (../../private) bypassed validation; later step-7 git add fails with "outside repository pathspec", dead-ending housekeeping after a valid:true signal. Encoded containment in assertRepoRelative (lexical isAbsolute + relative startsWith ".."), called BEFORE existsSync. 7 unit tests cover helper branches + validator integration. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(repo): housekeeper validator shape guards F-Axg-w/F-Axg-z Validator's affected_files loop forwarded each entry to assertRepoRelative (isAbsolute), which throws TypeError on non-string input. The schema_violations reconciliation + preservation loops both dereferenced v.kind/field/ns_id without per-element shape check, so a tampered manifest with null or non-object entries crashed mid-validation instead of surfacing structural-tampering as a gap. Sanitize once at the top (cleanedSchemaViolations) and add typeof guard before assertRepoRelative (idx-keyed gap message); downstream loops iterate the cleaned arrays. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(repo): housekeeper validator class-level array sanitize F-AxkNI/J Closes the (manifest.X ?? []).method() bug class across all 6 callsites in the validator. Two sub-classes addressed: (1) container-type — array fields received an OBJECT/scalar instead of an array, crashing .entries()/.some()/.map() with "is not a function"; (2) element-shape — arrays of objects received null/scalar elements, crashing field dereference (.kind/.field/.addressing) with TypeError. Pre-fix sites: schema_violations (recon + preserve), concerns (pending-pair + recon), affected_files (scan + D-7 superset), verification_failures (preserve). Centralized in sanitizeObjectArrayField + sanitizeStringArrayField helpers; consumers iterate cleaned arrays. Same gap-routing pattern as F-AMU4D / F-Axg-z: contract violations surface as gaps with contract-anchored shape hints, not orchestrator crashes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(repo): strip review-thread provenance from housekeeper surfaces Removed two classes of ephemeral PR-thread identifiers leaked into runtime test/code/comment surfaces under .claude/skills/plan-execution/: - F-tag suffixes (F-AMSEL/F-AMU4D/F-AMP7Y/F-AMNGCa/F-AMbIV/F-AMWXK/ F-Axg-w/F-Axg-z/F-AxXBH/F-AxkNI/F-AxkNJ/F-AMLor/F-ANGCa) - Full GraphQL thread IDs (PRRT_kwDO...) and review permalink IDs (Codex thread rNNNNNNNN — three call-sites) Both classes describe Codex review-thread provenance: base64 GraphQL IDs that are ephemeral (rot when the PR is closed/squashed) and opaque (no resolution path for a future reader). Per the standing rule, these identifiers belong in commit messages and PR thread replies, never in runtime-loaded test/code/comment/gap-message surfaces. Replaced identity-based test assertions (/Codex F-XXX/.test(g)) with behavior-based assertions (e.g., g.includes("script-stage verification_failure")) so future refactors that rename gap-message text don't silently break test coverage on a stale identifier match. Known residual: Bug-N markers (Bug-4/5/8/9) remain inline in the same files. They're a separate internal-triage convention without external docs/backlog cross-references; stripping them was out of scope for this authorization and is deferred as follow-up. Test count unchanged: 254/254 pass. Refs: PR #33 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(repo): sanitize semantic_work_pending before union spread Codex P1 finding on housekeeper-orchestrator-helpers.mjs:349 — the `[...effScript, ...(manifest.semantic_work_pending ?? [])]` union spread crashes "object is not iterable" when a malformed subagent manifest sets `semantic_work_pending` to a non-array. The `??` only coalesces null/ undefined; an object/scalar short-circuits the coalesce, returns the bad value, and the spread throws. Bypasses the intended re-dispatch path — bad manifest structure becomes orchestrator failure instead of a recoverable validation gap. Same bug class as the prior class-level fix (sanitize helpers for schema_violations / concerns / verification_failures / affected_files), missed at the spread surface for semantic_work_pending. The earlier fix scope only covered method-call sites (`(arr ?? []).method()`); this commit extends to the spread surface. Fix: add cleanedSemanticWorkPending = sanitizeStringArrayField( manifest.semantic_work_pending, ...) upstream alongside the other sanitize calls; route both consumption sites through the cleaned array. Inline asymmetry note: ONLY the manifest field is sanitized; effScriptSemanticWorkPending is upstream-resolved via Array.isArray (script-stage = trusted, manifest = subagent-output = untrusted). Test added (integration test mirrors the AxkNJ proof pattern): manifest with _script_stage.semantic_work_pending populated and tampered manifest.semantic_work_pending produces BOTH a structural-tampering gap AND the canonical "X listed but absent" gap on the script-stage pending item — proves a tampered live field doesn't mask preservation enforcement. Test count: 255/255 pass (was 254; +1 new test for this finding). Refs: PR #33 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(repo): housekeeper validator script-side element sanitize Sanitize elements of `effScriptVerificationFailures` and `effScriptSchemaViolations` BEFORE the preservation iteration loops via the existing `sanitizeObjectArrayField` helper. The container-type guard in check #12 only verifies the field is an array — element-shape was unguarded, so a tampered `manifest._script_stage.verification_failures` containing `[null]` crashed `Object.keys(null)` inside the `failureKey` closure (and `manifest._script_stage.schema_violations` containing `[null]` TypeError'd on `sv.kind` dereference). Same fix shape as the manifest-side sanitize already in place for the four object-array fields. The companion finding (manifest `_script_stage` as immutable baseline) is deferred — see thread reply for architectural rationale. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(repo): strip ephemeral Bug-N labels from housekeeper surfaces Same class as the F-XXXX review-thread suffix cleanup (commit 2715e01). Bug-N labels were ephemeral session-tracking identifiers used during the initial finding cycle for this PR — they have no meaning outside the originating review session and rot the moment the PR squashes. Replaced each occurrence with a description of the bug class or contract violation the test/branch defends; preserved all surrounding rationale. Touches: - post-merge-housekeeper.test.mjs (Bug-9, Bug-8, Bug-4 ×2, Bug-5 headers + 1 test name) - post-merge-housekeeper.mjs (Bug-4 ×2 inline pre-validation comments) 256/256 tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(repo): housekeeper baseline-trust gap on missing scriptXXX When validateManifestSubagentStage runs without all four scriptXXX params and falls back to manifest._script_stage, the validator now emits an explicit gap. _script_stage is subagent-emitted and may be tampered; the orchestrator's stage-1 conversation memory snapshot (read in SKILL.md step 3 BEFORE subagent dispatch) is the only untamperable baseline. Falling back silently allowed preservation bypass when the subagent cleared both the top-level emit field and the snapshot field while keeping shape intact. SKILL.md step 5, the housekeeper contract, the agent prompt, and the embedded CANONICAL_TEMPLATE all updated to describe the orchestrator-plumbs-stage-1-snapshot trust model. _script_stage is demoted from "immutable" to "redundant integrity signal" — the primary defense is the validator's gap, the snapshot is defense- in-depth. Test fixtures updated to plumb scriptXXX explicitly (mirroring the production path); two new tests cover the baseline-trust gap positive (omit scriptXXX → gap fires) and negative (all four plumbed → suppressed) cases. Refs: PR #33 finding F-AyDCN Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(repo): housekeeper contract — multi-candidate manifest shape Multi-candidate runs (--candidate-ns NS-XX,NS-YY) emit a divergent manifest shape: matched_entry is null with matched_entries[] carrying per-NS metadata, and mechanical_edits.{status_flip,mermaid_class_swap} are absent (omitted entirely) with plural status_flips[] / mermaid_class_swaps[] arrays in their place. Canonical fixture is scripts/__tests__/fixtures/14-multi-candidate-happy-path. The contract previously described only the single-candidate shape, and the canonical subagent prompt template's responsibility #1 told the housekeeper to replace `<TODO subagent prose>` placeholders only in mechanical_edits.status_flip.to_line — a subagent following the prompt literally on a multi-candidate run would skip the status_flips[] entries. The validator's on-disk placeholder scan would have caught the omission on the next round-trip (the safety net is shape-agnostic), but the wasted round-trip was avoidable. Updated responsibility #1 in both helpers.mjs CANONICAL_TEMPLATE and the contract's §Canonical Subagent Prompt Template (kept verbatim in sync for the buildHousekeeperPrompt parity test) to cover both shapes, added a multi-candidate-shape paragraph to the Manifest schema section, and updated the validation-invariants line to acknowledge the plural keys. Refs: PR #33 finding F-A2u4r Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(repo): housekeeper validator entry guard for malformed manifest A malformed subagent output (JSON literal `null`, undefined, scalar, or array root) would crash on the first `manifest.X` dereference at helpers.mjs L214 before gap collection ran, throwing TypeError instead of returning a structured validation gap. That broke Phase E's contract-violation recovery path — the orchestrator caught a runtime crash instead of a routable gap, and could not re-dispatch / surface the contract violation to the user. Added an entry guard at function start that catches null, undefined, non-object scalars, and array roots. Discriminates on actualKind for actionable messaging and short-circuits with a structured gap so downstream code can trust manifest is a non-null non-array object. Four dedicated tests cover null / undefined / scalar (number+string) / array roots. 89/89 pass. Refs: PR #33 finding F-A24wj Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
b3cd7fb to
79ca427
Compare
4 tasks
Sawmonabo
added a commit
that referenced
this pull request
May 11, 2026
- step 5 no longer contains a `cp manifest stage1` fallback. By step 5 the manifest has been mutated by the subagent dispatch, so the copy would alias the stage-1 baseline to tampered state and silently disable preservation checks #7/#9/#10/#11. (Codex P1) - step 3 snapshot promoted from "optional but recommended" to REQUIRED, with the cp command moved into a fenced code block. Drops the misleading "non-fatal but signals reduced preservation coverage" claim — omitting --stage1 actually forces a check #12 round-trip loop (exit 2). (Codex P2) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sawmonabo
added a commit
that referenced
this pull request
May 11, 2026
- CLI exit-1 routing gated on definitional-narration gaps only (Codex R4 P1). The narration scenario inherently fires 3 gaps (result non-canonical, narration_mode_detected, per-item unaddressed). Exit 1 must NOT win when a non-definitional integrity gap (baseline-trust check #12, preservation #7/#9/#10/#11, halt-state #8, schema-surfacing #5, placeholder leak, etc.) also fires — auto-deviation off untrusted _script_stage would compound the bug. Uses an allow-list of three patterns; new gap classes default to integrity (exit 2). - Auto-deviation fallback recipe (SKILL.md §) now sets result conditionally on _script_stage halt-state arrays (Codex R4 P2). schema_violations or verification_failures non-empty → BLOCKED; otherwise DONE_WITH_CONCERNS. Hard-coding DONE_WITH_CONCERNS would deterministically re-fail check #8 and trap the flow. - Adds child_process-spawned regression test exercising both routing paths (narration-only → exit 1; narration + --stage1 omitted → exit 2 via baseline-trust gap). 115/115 pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sawmonabo
added a commit
that referenced
this pull request
May 11, 2026
…#53) * chore(repo): add plan-execution housekeeper narration-mode guardrails Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(repo): address Codex P2 findings on PR #53 - narration check sources hasPendingWork from effScriptSemanticWorkPending (trusted stage-1 baseline) instead of manifest.semantic_work_pending (subagent-emitted). A bad subagent clearing the top-level array no longer demotes routing from narration auto-deviation to generic round-trip. - CLI wrapper rejects --stage1 with no path argument (exit 3) instead of silently ignoring it. Adds regression test for the explicit-param > _script_stage precedence chain. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(repo): address Codex P1 + P2 findings on PR #53 fixup - step 5 no longer contains a `cp manifest stage1` fallback. By step 5 the manifest has been mutated by the subagent dispatch, so the copy would alias the stage-1 baseline to tampered state and silently disable preservation checks #7/#9/#10/#11. (Codex P1) - step 3 snapshot promoted from "optional but recommended" to REQUIRED, with the cp command moved into a fenced code block. Drops the misleading "non-fatal but signals reduced preservation coverage" claim — omitting --stage1 actually forces a check #12 round-trip loop (exit 2). (Codex P2) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(repo): fix stdout flush race in validate-subagent-manifest Wrap the script body in main() returning the exit code, then assign `process.exitCode = main()`. The previous unconditional `process.exit(N)` after `process.stdout.write(JSON)` could terminate Node before async stdout writes drained when stdout is piped (e.g., `... | jq`, CI log collectors), intermittently truncating the JSON payload and breaking orchestrator parsing even when validation succeeded. (Codex PR #53 R3 P1) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(repo): address Codex R4 P1 + P2 findings on PR #53 - CLI exit-1 routing gated on definitional-narration gaps only (Codex R4 P1). The narration scenario inherently fires 3 gaps (result non-canonical, narration_mode_detected, per-item unaddressed). Exit 1 must NOT win when a non-definitional integrity gap (baseline-trust check #12, preservation #7/#9/#10/#11, halt-state #8, schema-surfacing #5, placeholder leak, etc.) also fires — auto-deviation off untrusted _script_stage would compound the bug. Uses an allow-list of three patterns; new gap classes default to integrity (exit 2). - Auto-deviation fallback recipe (SKILL.md §) now sets result conditionally on _script_stage halt-state arrays (Codex R4 P2). schema_violations or verification_failures non-empty → BLOCKED; otherwise DONE_WITH_CONCERNS. Hard-coding DONE_WITH_CONCERNS would deterministically re-fail check #8 and trap the flow. - Adds child_process-spawned regression test exercising both routing paths (narration-only → exit 1; narration + --stage1 omitted → exit 2 via baseline-trust gap). 115/115 pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
79ca427 to
de8b339
Compare
254b7c8 to
4d01b87
Compare
18df547 to
e794cd7
Compare
4 tasks
d4f34fe to
203b4a5
Compare
203b4a5 to
150960b
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
🤖 I have created a release beep boop
contracts: 1.0.0
1.0.0 (2026-05-30)
Features
client-sdk: 1.0.0
1.0.0 (2026-05-30)
Features
This PR was generated with Release Please. See documentation.