Skip to content

chore: release develop#7

Open
github-actions[bot] wants to merge 1 commit into
developfrom
release-please--branches--develop
Open

chore: release develop#7
github-actions[bot] wants to merge 1 commit into
developfrom
release-please--branches--develop

Conversation

@github-actions
Copy link
Copy Markdown

@github-actions github-actions Bot commented Apr 27, 2026

🤖 I have created a release beep boop

contracts: 1.0.0

1.0.0 (2026-05-30)

Features

  • client-sdk: add Plan-002 Phase 5 membership client surface (dff6523)
  • contracts: add PtyHost interface (Plan-024 T-024-2-1) (#45) (ed71ee1)
  • contracts: add session, event, and error payload schemas (#8) (6166fa9)
  • contracts: Plan-002 P1 invite/membership/presence/channel contracts (#102) (347d62b)
  • control-plane: add HTTP/SSE substrate (Plan-008 Phase 1) (#21) (a362809)
  • control-plane: add session directory service (create/read/join) (#10) (c723b18)
  • control-plane: close Plan-001 P5 residuals + promote to completed (#87) (bc33f30)
  • control-plane: Plan-002 P3 presence + ChannelList projection (#108) (1b8e865)
  • daemon: add JSON-RPC wire substrate (Plan-007 Phase 2) (#17) (3d8ef0e)
  • daemon: add session.* handlers + SDK transport (Plan-007 Phase 3) (#19) (0e5599d)
  • daemon: RustSidecarPtyHost + AIS_PTY_BACKEND opt-in (Plan-024 P3) (#56) (be58baa)
  • desktop: scaffold Plan-023 Phase 1 workspace substrate (#70) (dcb9644)
  • repo: close BL-102/103/105 (protocolVersion + errors + events) (b7041e3)
  • repo: scaffold V1 monorepo + engineering CI surface (#6) (ca22530)
  • sidecar-rust-pty: bootstrap crate + protocol + framing (Plan-024 P1) (#42) (4b0a37e)
client-sdk: 1.0.0

1.0.0 (2026-05-30)

Features

  • client-sdk: add Plan-002 Phase 5 membership client surface (dff6523)
  • control-plane: add HTTP/SSE substrate (Plan-008 Phase 1) (#21) (a362809)
  • daemon: add session.* handlers + SDK transport (Plan-007 Phase 3) (#19) (0e5599d)
  • repo: close BL-102/103/105 (protocolVersion + errors + events) (b7041e3)
  • repo: scaffold V1 monorepo + engineering CI surface (#6) (ca22530)

This PR was generated with Release Please. See documentation.

@github-actions github-actions Bot requested a review from Sawmonabo as a code owner April 27, 2026 06:38
@github-actions github-actions Bot force-pushed the release-please--branches--develop branch 12 times, most recently from ed3ced8 to dfddd94 Compare May 4, 2026 01:08
@github-actions github-actions Bot force-pushed the release-please--branches--develop branch 2 times, most recently from d22e869 to 56403f0 Compare May 6, 2026 00:49
Sawmonabo added a commit that referenced this pull request May 7, 2026
Codex P1 (PR #33 R2): two gaps in the subagent-stage validator forced
false-positive round-trips on legitimate halt outcomes.

Gap 1 — `addressing: <item-key>` undocumented. The canonical-template
responsibility list never told the subagent that concerns deferring a
`semantic_work_pending` item must set `addressing: "<exact-item-key>"`
to satisfy the validator's per-item check. The match key was a hidden
contract.

Gap 2 — per-item check fired on halt-states. When subagent returned
BLOCKED or NEEDS_CONTEXT (it stopped before completing semantic work),
the per-item check still demanded coverage of every pending item,
forcing wasted dispatches.

Fix:
- Contract doc canonical template adds responsibility #5 documenting
  the `addressing: <item-key>` requirement (renumbers existing
  #5/#6/#7 -> #6/#7/#8). Validation invariants line clarifies the
  matching key + halt-state waiver.
- JS CANONICAL_TEMPLATE const re-synced verbatim per Plan
  §Decisions-Locked D-1; D-7 row 12 snapshot test confirms sync.
- validateManifestSubagentStage skips the per-item loop when
  manifest.result is "BLOCKED" or "NEEDS_CONTEXT". DONE_WITH_CONCERNS
  still requires per-item pairing — waiver narrowly scoped to halts.
- Gap message now hints at the contract requirement so future
  regressions self-document.

Tests: 4 new regression tests (BLOCKED waiver, NEEDS_CONTEXT waiver,
DONE_WITH_CONCERNS still requires pairing, `addressing` match by
exact item-key with `kind` irrelevant).

Note: Codex Finding 2 (missing merge-command steps in SKILL.md
Phase E) is a deeper architectural drift between spec (pre-merge
housekeeping) and plan/SKILL.md (post-merge ordering) — deferred to
user direction; not addressed here.

Refs: PR #33

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sawmonabo added a commit that referenced this pull request May 7, 2026
- SKILL.md Phase E step 7: explicitly stage docs/plans/<plan-file> alongside
  <affected_files>. The Progress Log append in step 6 is orchestrator-controlled
  (not script-controlled) so the plan file is not in manifest.affected_files —
  especially under exit 3 (no Done Checklist sub-section) where the script
  skips checklist ticking entirely. Omitting it would leave the Progress Log
  edit unstaged, breaking the documented "single commit bundles housekeeping
  + log" guarantee.
- validator: add scriptSchemaViolations parameter + check #9 (immutable
  superset comparison by composite key kind+field+ns_id). Mirrors check #7's
  scriptAffectedFiles superset semantics. Without it, a subagent could clear
  manifest.schema_violations and return DONE/DONE_WITH_CONCERNS, bypassing
  check #6 BLOCKED enforcement entirely.
- 3 regression tests pin the new check #9 boundary: cleared (fail), preserved
  (pass), and additive (pass — subagent surfacing schema problems the script
  missed).
Closes Codex Findings F-AMP7W + F-AMP7Y on PR #33 (P1). 44 helpers + 116 main
tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sawmonabo added a commit that referenced this pull request May 7, 2026
Path-(a) A2 design closes the system-level half of the contract-half-fix
bypass class (F-AMP7Y / F-AMSEL / F-AMWXK). Validator-API parameters were
in place but no production caller plumbed them, so the four preservation /
iteration checks (#7 / #9 / #10 / #11) silently disabled.

Script now writes _script_stage as an immutable snapshot of the four
arrays the validator enforces (affected_files / schema_violations /
verification_failures / semantic_work_pending). Validator reads the
snapshot from the manifest with precedence: explicit scriptXXX param >
manifest._script_stage[field] > null. Structural-tampering check fires
whenever the snapshot path is relied on (any scriptXXX null), catching
secondary attacks where the subagent removes _script_stage to defeat
the snapshot defense.

Subagent contract: _script_stage is READ-ONLY. Touching it surfaces as
a Codex F-AMbIV gap. Documented in the canonical subagent prompt
template, the agent definition file, the contract doc § _script_stage
immutability, and SKILL.md Phase E step 5.

Tests: 4 new F-AMbIV regressions (A2 fallback / missing-snapshot gap /
malformed-snapshot gap / param precedence). All 171 housekeeper tests
pass (55 helper + 116 main). 14 fixture snapshots updated to mirror
the new emission. 42 existing test fixtures bulk-updated with empty
_script_stage to satisfy the structural check on legacy code paths.

Refs: PR #33 Codex finding F-AMbIV
Sawmonabo added a commit that referenced this pull request May 10, 2026
* feat(repo): add plan-execution-housekeeper subagent definition

Adds the 7th plan-execution subagent role per spec §5.2: blue color,
tools Read/Grep/Glob/Edit/Write (Bash omitted to mechanically enforce
Plan Invariant I-3 — no git shell-out from this role). Body sections
mirror plan-execution-contract-author.md shape: Inputs, Mindset, Hard
rules, Decision presentation, Exit states (only the 4 canonical from
failure-modes.md per Plan Invariant I-2), Report format, Reference
files.

Refs: Plan-housekeeper Phase 4 Task 4.1; Spec §5.2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(repo): add post-merge-housekeeper-contract.md reference doc

The contract is the housekeeper subagent's portable reference; sections
extract spec §5.3 manifest schema, §5.1 exit codes, §5.3 validation
invariants, §7.2 resume diagnostic, §3a.2 completion-rule matrix, §3a.4
file-reference extraction heuristic. Adds the canonical subagent prompt
template that buildHousekeeperPrompt (Task 4.3) reproduces verbatim and
the Layer 2 snapshot test (Task 4.8) pins.

Refs: Plan-housekeeper Phase 4 Task 4.2; Spec §9.1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(repo): fix housekeeper rule-20 sprawl cite in prompt template

The Canonical Subagent Prompt Template (Task 4.2 verbatim source +
deployed contract) cited `rule 21` for the sprawl-routing case, but
Task 4.7 defines rule 20 = sprawl (round-trip dispatch) and rule 21 =
schema-violation halt → BLOCKED. Also re-frames the routing prose:
sprawl triggers an orchestrator round-trip (DONE_WITH_CONCERNS is a
secondary outcome only after weak-justification re-dispatch, not the
primary route). Without this fix, Task 4.8's snapshot test would pin
the wrong rule number into CI fixtures.

Fixes both the canonical contract and the plan's verbatim source so a
re-run of Task 4.2 cannot reintroduce the bug.

Refs: Plan-housekeeper Phase 4 Task 4.2; Task 4.7 rule 20.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(repo): orchestrator helpers + Layer 2 prompt/validator tests

Adds .claude/skills/plan-execution/lib/housekeeper-orchestrator-helpers.mjs
exporting buildHousekeeperPrompt (BASE template from contract + conditional
AUTO-CREATE / schema-violation augmentations) and validateManifestSubagentStage
(catches: pending-item gaps, file-residual placeholders, semantic_edits values
P2 placeholders, nested-array placeholders, schema_violations x concerns
reconciliation gaps). 8 unit tests TDD-authored.

Refs: Plan-housekeeper Phase 4 Task 4.3; Spec §5.3, §8.2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(repo): lift housekeeper PLACEHOLDER constant + cleanup tmp dirs

Code-quality review POLISH items: lift `<TODO subagent prose>` literal
to module scope (single source of truth across walkForPlaceholder +
validateManifestSubagentStage), and reap Test 6's tmp dir via
try/finally + rmSync to match sibling test convention.

Refs: Plan-execution-housekeeper Phase 4 Task 4.3

* docs(repo): rewrite SKILL.md Phase E section for housekeeper subagent

Phase E now describes the 8-step post-merge housekeeping flow: candidate-lookup
over §6 NS catalog, script dispatch (--candidate-ns / --auto-create / NEEDS_CONTEXT
on multi-match), script-stage manifest validation, plan-execution-housekeeper
subagent dispatch, subagent-stage validation, Progress Log append (moved from
before-merge to after-housekeeping per spec §6.1), single bundled commit, push.

Per Plan §Decisions-Locked D-3 the housekeeper ticks ALL Done-Checklist boxes
for the merged Phase. Footnote at step 1 cross-links to Plan §D-7 fixture matrix.

Refs: spec §5.4, §6.1, §6.2, §7.1; Plan-execution-housekeeper Phase 4 Task 4.4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(repo): mark Phase E D-3 cite as bullet per plan §Task-4.4 wording

Plan line 2923 prescribes "A new bullet at the top of Phase E noting ..."
but the Task 4.4 rewrite rendered the D-3 line as a paragraph. Prepend the
list marker so the rendering matches the plan's "new bullet" framing.

Refs: Plan-execution-housekeeper Phase 4 Task 4.4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(repo): fix housekeeper Phase E Progress Log path drift

POLISH from Task 4.4 code-quality review:

- Step 6 path drifted to gitignored `.agents/tmp/<session-id>/progress.md`,
  which is incompatible with step 7's "single git commit on develop" (spec
  §6.1 line 905 mandates same-commit). Restore plan-body
  `docs/plans/NNN-*.md` § Progress Log location.
- GFM footnote ref `Rules:[^d7]` needs space before `[^d7]`.

Mirrored fix in plan source per Task 4.2 rule-20 precedent (commit b5f1e10)
so re-runs don't reintroduce the path drift.

Refs: Plan-execution-housekeeper Phase 4 Task 4.4

* docs(repo): register 7th plan-execution-housekeeper role in SKILL.md

Task 4.5 of plan-execution-housekeeper Phase 4: bumps the orchestrator's
role count six → seven, adds `plan-execution-housekeeper` to all three
role enumerations (line 29 introduction, line 42 no-git rule, line 515
Reference Files subagent-definitions row), and adds a Reference Files
row pointing to `references/post-merge-housekeeper-contract.md`
(authored in Task 4.2, commit b5f1e10's predecessor).

Set-quantifier sweep includes lines 42 and 515 alongside the explicit
line 29 from the plan task spec — per same-class-sweep discipline, the
"Six subagent roles" claim mirrors at all three sites.

Refs: Plan-execution-housekeeper Phase 4 Task 4.5

* docs(repo): add Phase E recovery + inline lookup rules; reserve BL-109

state-recovery.md: append "Resuming a Phase E housekeeping halt" diagnostic
(per spec §7.2) and mirror the four §4.3.2 candidate-lookup rules inline (per
spec §4.3.5) so a recovering engineer doesn't need to cross-link to SKILL.md
to debug rule-N mismatches. Flag the existing line-174 .agents/tmp/ lifecycle
drift with a \`> Note:\` block linking to BL-109.

backlog.md: reserve BL-109 to track reconciling state-recovery.md's ".agents/
tmp/ deleted at commit time" claim with lefthook.yml (which has no such job).

Refs: spec §6.1, §7.2, §4.3.5, §9.3; Plan §Decisions-Locked D-5.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(repo): add slug-fragility warning above BL-109 heading

POLISH from Task 4.6 code-quality review: an inbound anchor in
state-recovery.md (the > Note: callout authored alongside BL-109)
points to the GitHub-slugified BL-109 heading. The heading contains
backticks + dots + slashes that get stripped by GFM, so any future
heading reword could silently break the inbound anchor without any
hook catching it. Add an HTML comment above the heading documenting
the dependency, mirroring the PR #24 / PR #27 canonicalization-sweep
lessons.

Refs: Plan-execution-housekeeper Phase 4 Task 4.6

* docs(repo): add Phase E housekeeper routing rules 20-21 to failure-modes

Adds rules 20 (housekeeper subagent affected_files sprawl detection +
round-trip resolution) and 21 (script schema-violation BLOCKED routing
reusing rule-4's graceful-drain) to failure-modes.md. Both reuse the
existing four canonical exit-states (DONE / DONE_WITH_CONCERNS /
NEEDS_CONTEXT / BLOCKED) per spec §7.1 invariant; no new exit-state
introduced.

Refs: spec §7.1; Plan-execution-housekeeper Phase 4 Task 4.7

* docs(repo): align cross-link cite to canonical "Exit codes" heading

POLISH from Task 4.7 review: 4 citation sites used "§ Exit Codes"
(capital C) while the contract heading at
references/post-merge-housekeeper-contract.md:70 is "## Exit codes"
(lowercase). Sweep direction lowercases the citers (not capitalizes
the heading), because sentence-case matches:
  (a) all 5 sibling ## headings in post-merge-housekeeper-contract.md
  (b) sibling references/preflight-contract.md:13 "## Exit codes"

Sites swept (4):
- references/failure-modes.md:206 (rule 21, just landed in bc9c03b)
- references/state-recovery.md:198 (Task 4.6 commit 5f0d04b)
- plan source line 2989 (state-recovery.md verbatim mirror)
- plan source line 3120 (failure-modes.md verbatim mirror)

Refs: Plan-execution-housekeeper Phase 4 Task 4.7

* test(repo): add Layer 2 D-7 row + I-1/I-2/I-3 invariant unit tests

Adds 7 unit tests to post-merge-housekeeper-orchestrator-helpers.test.mjs
covering Plan §Decisions-Locked D-7 rows 12-15 (prompt-template snapshot,
zod schema parse, affected_files superset, sprawl detection routing) and
the I-1/I-2/I-3 invariants (cite-target hook regression, canonical exit-
state declaration, no-shell-out for git).

Adds new helper detectAffectedFilesSprawl to housekeeper-orchestrator-
helpers.mjs (pure function, set-difference) and extends
validateManifestSubagentStage with an optional scriptAffectedFiles param
that enforces the D-7 row 14 superset check. Adds zod ^4.3.6 to root
devDependencies so the schema test resolves the import. Adds the RESULT:
prefix to the four canonical exit-states in plan-execution-housekeeper.md
so the I-2 invariant test detects all four declarations consistently.

Refs: Plan-execution-housekeeper Phase 4 Task 4.8

* docs(repo): fix plan Task-4.9 test command to use glob not directory

Plan §Task-4.9 Step 1 prescribed `node --test <directory>/` but Node
22.21 does NOT directory-walk explicit path args — only auto-discovers
from cwd. The directory-arg was treated as a module path and threw
MODULE_NOT_FOUND. Use `<directory>/*.test.mjs` glob instead.

Also corrects the expected-test-count line: Task 4.8 added 7 tests
(4 D-7 rows + 3 invariants), not 4 D-7 rows alone.

Refs: Plan-execution-housekeeper Phase 4 Task 4.9

* fix(repo): housekeeper schema_violations × concerns per-entry match

The validator's reconciliation loop used `.some(c => c.kind === "schema_violation")`
which ignored `sv` and let a single generic concerns entry satisfy N violations.
Fix: filter concerns by kind once, then match each schema_violations entry
structurally on `field` (and `ns_id` when present). Update the gap-message to
reference the actual differentiator (e.g. "NS-15.type") instead of the
always-literal `sv.kind`. Adds three regression tests:

- 2 violations + 1 generic concern → exactly the unmatched 'type' violation gaps
- 2 violations + 2 specific concerns → passes
- --candidate-ns shape (no ns_id) → matches by `field` alone

Refs: PR #33 Codex review thread r3198337816

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(repo): housekeeper validator halt-state waiver + addressing key doc

Codex P1 (PR #33 R2): two gaps in the subagent-stage validator forced
false-positive round-trips on legitimate halt outcomes.

Gap 1 — `addressing: <item-key>` undocumented. The canonical-template
responsibility list never told the subagent that concerns deferring a
`semantic_work_pending` item must set `addressing: "<exact-item-key>"`
to satisfy the validator's per-item check. The match key was a hidden
contract.

Gap 2 — per-item check fired on halt-states. When subagent returned
BLOCKED or NEEDS_CONTEXT (it stopped before completing semantic work),
the per-item check still demanded coverage of every pending item,
forcing wasted dispatches.

Fix:
- Contract doc canonical template adds responsibility #5 documenting
  the `addressing: <item-key>` requirement (renumbers existing
  #5/#6/#7 -> #6/#7/#8). Validation invariants line clarifies the
  matching key + halt-state waiver.
- JS CANONICAL_TEMPLATE const re-synced verbatim per Plan
  §Decisions-Locked D-1; D-7 row 12 snapshot test confirms sync.
- validateManifestSubagentStage skips the per-item loop when
  manifest.result is "BLOCKED" or "NEEDS_CONTEXT". DONE_WITH_CONCERNS
  still requires per-item pairing — waiver narrowly scoped to halts.
- Gap message now hints at the contract requirement so future
  regressions self-document.

Tests: 4 new regression tests (BLOCKED waiver, NEEDS_CONTEXT waiver,
DONE_WITH_CONCERNS still requires pairing, `addressing` match by
exact item-key with `kind` irrelevant).

Note: Codex Finding 2 (missing merge-command steps in SKILL.md
Phase E) is a deeper architectural drift between spec (pre-merge
housekeeping) and plan/SKILL.md (post-merge ordering) — deferred to
user direction; not addressed here.

Refs: PR #33

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(repo): housekeeper schema_violations matcher discriminates by kind

Codex P2 (PR #33 R3): the schema_violations × concerns matcher's
predicate `c.field !== sv.field` was trivially false when both sides
lacked a `field` (undefined !== undefined), letting a single generic
concern absorb multiple distinct-shape violations. Plus the pre-filter
`c.kind === "schema_violation"` excluded shapes the script actually
emits — `auto_create_title_seed_underivable` (line 899) is a singleton
with no `field`/`ns_id`, so it would never find a matching concern.

Fix:
- Validator drops the kind="schema_violation" pre-filter and adds
  `c.kind === sv.kind` to the per-violation match predicate. Each
  violation must now be paired to a concern of the SAME kind.
- Gap message handles fieldless violations by falling back to a
  `(kind: <kind>)` label (was printing literal "undefined") and
  reflects the actual match-key requirements.
- Contract doc (validation invariants line 93 + canonical template
  responsibility #4) updated to describe the actual three-shape
  rule (schema_violation OR auto_create_title_seed_underivable
  singleton) rather than hardcoding `kind: "schema_violation"`.
- JS CANONICAL_TEMPLATE const re-synced verbatim per Plan
  §Decisions-Locked D-1; D-7 row 12 snapshot test confirms sync.
- Validator JSDoc check #4 expanded to call out the kind
  discriminator and the trivial-equality vulnerability it closes.

Tests: 3 new regression tests (distinct-kind violations require
kind-discriminated concerns, singleton kind matches by kind alone,
fieldless-violation gap message format).

Refs: PR #33

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(repo): housekeeper kind-contract sweep across agent + spec + plan

Codex P1 (PR #33 R3 Finding 4): .claude/agents/plan-execution-housekeeper.md
line 41 still mandated `kind: schema_violation` literally even after
commit 77f4435 changed the validator to discriminate by sv.kind verbatim.
The Finding 3 fix updated the contract doc + JS const + validator, but
missed the same-axis prose in 4 other locations across the doc tree —
classic canonicalization-sweep miss per
`feedback_canonicalization_sweep_completeness.md`.

Sweep covers all parallel hard-rule prose + invariants:
- .claude/agents/plan-execution-housekeeper.md:41 (Codex flagged direct)
- docs/superpowers/specs/.../housekeeper-design.md:617 (spec hard rule)
- docs/superpowers/specs/.../housekeeper-design.md:693 (validation invariants)
  — also brings the invariants prose into alignment with Finding 1's
  `addressing: <item-key>` + halt-state waiver wording (was stale from
  the prior PR cycle's runtime fix)
- docs/superpowers/specs/.../housekeeper-design.md:953 (failure-mode table)
- docs/superpowers/plans/.../housekeeper-implementation.md:2477 (plan hard rule)

Skipped (out of canonicalization-sweep scope):
- Plan source's prescribed canonical template (lines 2570-2595)
- Plan source's prescribed validator code (lines 2785-2839)
These are historical implementation guides describing what the implementer
was supposed to write at PR-construction time. Subsequent fixes (Findings 1
and 3) evolved the implementation; updating prescription post-hoc is
plan-archeology, not contract-alignment.

Note: Codex Finding 5 (PRRT_kwDOSCycWc6ALNW_) re-raises the architectural
drift originally flagged as Finding 2 (PRRT_kwDOSCycWc6AK48r) — both
about missing pre-merge / merge-step commands in SKILL.md Phase E. That
question is OUT OF SCOPE for this commit; still awaits user direction on
Path A (restructure SKILL.md + plan Task 4.4 per spec §4.1/§6.2 ordering)
vs Path B (add D-8 Decision Log + reconcile line-45 contradiction).

Refs: PR #33

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(repo): housekeeper validator enforces canonical result + BLOCKED

Validator now enforces two contract invariants from
references/post-merge-housekeeper-contract.md line 93 that prior
rounds missed:

- result !== null + ∈ {DONE, DONE_WITH_CONCERNS, NEEDS_CONTEXT, BLOCKED}:
  blocks any non-canonical exit state (was: undefined slipped through).
  Closes Codex finding PRRT_kwDOSCycWc6ALVFW (P2).

- schema_violations.length > 0 ⇒ result === "BLOCKED":
  surfacing alone is insufficient; the contract requires the BLOCKED
  exit state for halt/routing-path determinism.
  Closes Codex finding PRRT_kwDOSCycWc6ALVFN (P1).

Tests: 25 → 30 (added 5 cases — 3 canonical-state, 2 BLOCKED-on-violations).
Existing fixtures gained explicit `result` fields so they assert under
the new enforcement intentionally rather than passing by accident.

JSDoc on validateManifestSubagentStage now enumerates 7 numbered checks
in code order; the new canonical-state check is #1 and the new
BLOCKED-when-schema-violations check is #6.

D-7 row 12 snapshot test (canonical template ↔ JS const verbatim sync)
unchanged and continues to pin contract↔JS sync.

Refs: PR #33

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(repo): housekeeper requires non-empty semantic_edits payload

Validator's pending-item reconciliation used hasOwnProperty.call() which
only checks key presence. A subagent returning
semantic_edits.compose_status_prose = undefined would pass the check
despite zero payload, letting DONE/DONE_WITH_CONCERNS slip through with
unresolved semantic work and skipping a needed round-trip.

Tightened to require a meaningful payload: not null/undefined, not
empty string (after trim), not empty array, not empty object.
Numbers/booleans accepted defensively for future contract evolution.

Closes Codex finding PRRT_kwDOSCycWc6ALgop (P2).

Tests: 30 → 35 (added 4 empty-variant cases + 1 negative control).
Helper isMeaningfulPayload extracted to module scope alongside
walkForPlaceholder for testability and single-source-of-truth.

JSDoc check #2 now clarifies the value-must-be-meaningful rule and
cites the contract clause "containing the composed output" from
canonical-template responsibility #5.

Refs: PR #33

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(repo): housekeeper gaps when declared affected_file is missing

Validator's placeholder-scan loop silently skipped any affected_files
entry that no longer existed on disk (existsSync gate). A subagent that
deleted a declared file (e.g. cross-plan-dependencies.md) would pass
validation despite the contract clause that affected_files are files
the subagent EDITED — implying they exist post-edit.

Loop now pushes a gap when existsSync is false ("declared in
affected_files but missing from disk — destructive out-of-scope
behavior") AND continues (skipping the placeholder read for the
missing file). Existing-file branch unchanged.

Closes Codex finding PRRT_kwDOSCycWc6ALnyo (P1).

Tests: 35 → 36 (added missing-file gap; negative case covered by
existing placeholder-scan fixtures).

Contract doc clause at line 93 augmented with explicit parenthetical:
deletion of a declared file is a contract violation surfaced by the
validator's missing-file gap.

JSDoc check #3 now mentions both placeholder-scan AND existence-check
since both live in the same loop.

Refs: PR #33

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(repo): housekeeper BLOCKED on verification_failures + scope flip

Two fixes from the Codex re-review pass on 5b3ff94:

Finding 11 (P1): the validator's BLOCKED-on-X enforcement only covered
schema_violations (post-Finding-6 fix). The script exit-2 path populates
verification_failures and demands the same halt routing, but the validator
silently accepted DONE/DONE_WITH_CONCERNS for that case too. Added the
mirror check; closes PRRT_kwDOSCycWc6ALt55. Also added the parallel
formal-invariant sentence to references/post-merge-housekeeper-contract.md
§Validation invariants line 93 — the validator was enforcing a rule the
contract only stated narratively (line 79); the contract now states it
formally so the validator and the formal invariants list stay in lock-step.

Finding 12 (P2): SKILL.md:443 said affected_files was a subset of actual
edits. Inverted — the contract doc (line 93) and the subagent-stage
validator's D-7 row 14 superset check both go the other way (declared
is a superset of actual, so out-of-scope writes are detected). Flipped
the direction to match. Closes PRRT_kwDOSCycWc6ALt59.

Tests: 36 -> 38 (added BLOCKED-on-verification_failures positive +
negative case mirroring the schema_violations pair).

JSDoc check #8 added for the new validator check.

D-7 row 12 snapshot test (canonical template <-> JS const verbatim sync)
unchanged — passes.

Refs: PR #33

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(repo): housekeeper SKILL.md sprawl + zero-match prose alignment

Two prose-only fixes from the Codex re-review pass on b9c887b:

Finding 13 (P1): SKILL.md:449 said out-of-scope subagent edits route
straight to DONE_WITH_CONCERNS. The contract (failure-modes.md rule 20
+ spec §7.3) requires a round-trip first — the orchestrator re-dispatches
the subagent with an explicit (a) revert OR (b) extend-affected_files+
concern choice; DONE_WITH_CONCERNS only fires if the subagent picks (b)
with weak justification, not on first detection. Updated the prose to
state the round-trip flow inline so a reader cannot route past the
re-dispatch step. Closes PRRT_kwDOSCycWc6AL1aI.

Finding 14 (P2): SKILL.md:433's Rule 4 ("No-match fallback") said zero
matches drop to the step 2 NEEDS_CONTEXT branch — but step 2 line 437
defines "0 candidate matches → --auto-create" and "2+ → NEEDS_CONTEXT"
(spec §4.3.2 + the §5.5 candidate-dispatch table at design line 448-450
agree). Inverted the cite: 0 matches now correctly drop to --auto-create
(genuinely new work), and the prose calls out that NEEDS_CONTEXT is
reserved for the 2+ ambiguity case. Closes PRRT_kwDOSCycWc6AL1aJ.

No code or test changes — prose-only alignment to cited contracts.

Refs: PR #33

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(repo): housekeeper sprawl routes REDISPATCH not DONE_WITH_CONCERNS

Finding 15 (P1) from the manual @codex review on 3ff9b63:

The JS function `detectAffectedFilesSprawl` lagged the SKILL.md prose fix
landed in 3ff9b63 for Finding 13. The prose now correctly states that
out-of-scope edits trigger a round-trip re-dispatch per failure-modes.md
rule 20 — the orchestrator offers (a) revert OR (b) extend+justify; only
if the subagent picks (b) with weak justification does the orchestrator
downgrade to DONE_WITH_CONCERNS. But the JS function still returned
`suggestedRouting: "DONE_WITH_CONCERNS"` at first detection, so the
canonical helper would have routed Phase E past the rule-20 round-trip
into the merge-and-continue path even though the prose says otherwise.

This is the JS half of the contract-half-fix pattern (mirror of
b9c887b's "validator enforces a rule the contract only states
narratively" — same shape, opposite direction: code lagging prose
instead of prose lagging code).

Fix:

- `detectAffectedFilesSprawl` now returns `suggestedRouting: "REDISPATCH"`
  on first detection plus a `redispatchPromptTemplate` field carrying the
  verbatim rule-20 (a)/(b) prompt the orchestrator should send back to
  the subagent (with the out-of-scope file list substituted in)
- JSDoc rewritten to describe the round-trip flow + the post-redispatch
  validation step + when DONE_WITH_CONCERNS is the legitimate outcome
  (only after the subagent picks (b) with weak justification)
- Existing test renamed + retargeted to assert REDISPATCH + the prompt
  template's structural requirements (file enumeration, (a)/(b) text,
  affected_files_extension kind name)
- New negative-case test: no-sprawl returns `{ sprawl: false }` exactly

Tests: 38 -> 39. Full housekeeper suite 116/116 still passes; D-7 row 12
snapshot test (canonical template <-> JS const verbatim sync) unchanged.

Closes PRRT_kwDOSCycWc6AL-nZ.

Refs: PR #33

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(repo): housekeeper I-1 test asserts caught exception not stderr

Finding 16 (P2) from Codex re-review on 7a8a67f.

The I-1 invariant test (cite-target hook regression canary) had a
nullish-coalescing-chain bug that produced false-positive passes:

    stderr = e.stderr?.toString() ?? e.stdout?.toString() ?? String(e);
    assert.equal(stderr, "");

`??` only falls back on null/undefined — not on empty string. If the
hook exited nonzero with empty stderr (e.g. error written only to
stdout, or process killed before flushing), `e.stderr` was the empty
string `""`, the chain stopped there, and the final equality check
passed. Cite-target regressions could slip through CI silently.

Fix: capture the exception itself in `caughtError`, assert
`caughtError === null`, and include `exit code + stderr + stdout` in
the failure message for diagnosis. Now any throw fails the test —
empty-stderr false-positives are impossible.

Tests: 39/39 still pass on the healthy current catalog. The fix
sharpens the failure mode without changing the happy-path semantics.

Closes PRRT_kwDOSCycWc6AMC3r.

Refs: PR #33

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(repo): housekeeper Phase D.5 + auto-merge PR landing per D-8

- SKILL.md: insert Phase D.5 (merge transition: gh pr ready/checks/merge,
  git pull --ff-only) between Phase D and E so the merged squash-commit
  Phase E reads from is explicitly produced; rewrite Phase E steps 7-8
  to land via housekeeping/PR<N> auto-merge PR (preserves no-direct-push
  hard rule line 45 + adds PR-CI gate to housekeeping diff).
- contract: append §Housekeeping commit landing (branch shape, PR title
  identical to commit subject, PR body, --auto --squash mechanics, CI
  failure handling).
- plan: add D-8 Decision-Log entry recording the architectural drift +
  resolution; mark Task 4.4 verbatim Phase E prose SUPERSEDED in PR #33
  (historical snapshot retained per ADR superseded-by convention).

Closes Codex Findings 2/5/8 on PR #33 (architectural drift between Phase
E direct-push prose and SKILL.md no-direct-push hard rule).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(repo): housekeeper scalar payload reject + extension contract align

- isMeaningfulPayload: drop catch-all `return true` for booleans/numbers;
  the contract is composed completion-prose, which can't be a scalar.
  Prior behavior let a subagent satisfy `semantic_work_pending` with
  `false`/`0` and pass validation. Adds 2 regression tests pinning the
  rejection boundary for `false` and `0` payloads.
- plan-execution-housekeeper.md report-format: replace stale field
  reference `concerns[kind=affected_files_extension].path` with the
  actual flow per failure-modes rule 20 — extended `manifest.affected_files`
  carries the path; the concerns entry uses `kind` + `addressing` for
  rationale (no `path` sub-field). Aligns the agent contract with the
  validator + sprawl-detector + canonical prompt.

Closes Codex Findings F-AMLor + F-AMLos on PR #33 (P2). 41 helpers + 116
main tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(repo): housekeeper Phase E plan-file stage + violations preserve

- SKILL.md Phase E step 7: explicitly stage docs/plans/<plan-file> alongside
  <affected_files>. The Progress Log append in step 6 is orchestrator-controlled
  (not script-controlled) so the plan file is not in manifest.affected_files —
  especially under exit 3 (no Done Checklist sub-section) where the script
  skips checklist ticking entirely. Omitting it would leave the Progress Log
  edit unstaged, breaking the documented "single commit bundles housekeeping
  + log" guarantee.
- validator: add scriptSchemaViolations parameter + check #9 (immutable
  superset comparison by composite key kind+field+ns_id). Mirrors check #7's
  scriptAffectedFiles superset semantics. Without it, a subagent could clear
  manifest.schema_violations and return DONE/DONE_WITH_CONCERNS, bypassing
  check #6 BLOCKED enforcement entirely.
- 3 regression tests pin the new check #9 boundary: cleared (fail), preserved
  (pass), and additive (pass — subagent surfacing schema problems the script
  missed).
Closes Codex Findings F-AMP7W + F-AMP7Y on PR #33 (P1). 44 helpers + 116 main
tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(repo): housekeeper merge-state poll + verification_failures preserve

- SKILL.md Phase E step 8: add mergedAt poll AFTER gh pr checks --watch.
  gh pr merge --auto can leave the PR in a merge queue after CI passes
  (gh CLI docs: "If required checks have passed, the pull request will be
  added to the merge queue"), so checks-complete does NOT mean merged in
  merge-queue repos. Polling gh pr view --json mergedAt -q .mergedAt for
  a non-empty value waits until the squash-commit is on develop.
- validator: add scriptVerificationFailures parameter + check #10
  (immutable superset comparison by canonical JSON key — verification_failure
  entries are small structured shapes without a stable composite key, so
  JSON.stringify with sorted keys is the comparison key). Mirrors check #9
  for schema_violations. Without it, a subagent could clear
  manifest.verification_failures and return DONE/DONE_WITH_CONCERNS,
  bypassing check #8's BLOCKED enforcement for exit-2 halt.
- 3 regression tests pin the new check #10 boundary: cleared (fail),
  preserved (pass), and key-order-independent JSON canonicalization (pass —
  protects against false-positive gaps when the subagent re-stringifies).

Closes Codex Findings F-AMSEJ + F-AMSEL on PR #33 (P1). 47 helpers + 116 main
tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(repo): housekeeper validator non-file guard for affected_files

Codex F-AMU4D (PR #33 P2): the placeholder-scan loop called readFileSync
unconditionally after the existsSync gate, throwing EISDIR on directory
paths and any other I/O error, crashing the orchestrator instead of
routing through the validator's gap collection.

Add statSync().isFile() guard + try/catch around statSync and readFileSync;
each failure path emits a distinct gap citing F-AMU4D so the failure
mode is identifiable. The contract requires affected_files entries to
be regular files (the script edits line-level content); a directory is
a contract violation that MUST surface as a gap so Phase E can re-dispatch
— not an unhandled exception that halts orchestration.

Regression test: a directory at the affected_files path now emits a
gap matching "is not a regular file" + "Codex F-AMU4D" rather than
crashing with EISDIR.

Tests: 48 helpers (47 + 1 new) + 116 main pass.

* fix(repo): housekeeper validator union pending iteration for F-AMWXK

Codex F-AMWXK (PR #33 P1): validateManifestSubagentStage iterated
manifest.semantic_work_pending (subagent-written), so a subagent could
clear that array + return DONE/DONE_WITH_CONCERNS — the per-item pairing
loop ran zero times and emitted zero gaps, letting unaddressed semantic
work pass validation. SAME bypass shape as F-AMP7Y (schema_violations)
+ F-AMSEL (verification_failures).

STRUCTURAL DIVERGENCE from checks #9/#10: those guard
array-non-emptiness → BLOCKED state with a separate preservation check.
F-AMWXK leverages the EXISTING per-item iteration by feeding it the
UNION of script-stage snapshot + subagent-written array. The script
snapshot is the immutable contract (subagent CANNOT bypass by clearing);
the subagent's additions are also iterated (subagent committed to
address anything they added).

Why union not preservation-check: a separate preservation check would
emit "X missing from semantic_work_pending" — forcing 3 round-trips
(R1 array-shrink gap → R2 re-add → R3 address). Union semantics keep
the existing "X listed but unaddressed" gap actionable in ONE round-trip.

Add scriptSemanticWorkPending parameter (defaults null for legacy
callsites + tests). Updated JSDoc check #11 documenting the new
parameter, the union semantics, and the structural divergence.

Regression tests (3 — pin both directions of the boundary plus the
union semantics):
- cleared semantic_work_pending → 2 gaps for both script-stage items
- preserved + addressed → valid
- subagent-added items also iterated → 1 gap for unaddressed addition

Tests: 51 helpers (48 + 3 new) + 116 main pass.

* fix(repo): housekeeper validator manifest _script_stage for F-AMbIV

Path-(a) A2 design closes the system-level half of the contract-half-fix
bypass class (F-AMP7Y / F-AMSEL / F-AMWXK). Validator-API parameters were
in place but no production caller plumbed them, so the four preservation /
iteration checks (#7 / #9 / #10 / #11) silently disabled.

Script now writes _script_stage as an immutable snapshot of the four
arrays the validator enforces (affected_files / schema_violations /
verification_failures / semantic_work_pending). Validator reads the
snapshot from the manifest with precedence: explicit scriptXXX param >
manifest._script_stage[field] > null. Structural-tampering check fires
whenever the snapshot path is relied on (any scriptXXX null), catching
secondary attacks where the subagent removes _script_stage to defeat
the snapshot defense.

Subagent contract: _script_stage is READ-ONLY. Touching it surfaces as
a Codex F-AMbIV gap. Documented in the canonical subagent prompt
template, the agent definition file, the contract doc § _script_stage
immutability, and SKILL.md Phase E step 5.

Tests: 4 new F-AMbIV regressions (A2 fallback / missing-snapshot gap /
malformed-snapshot gap / param precedence). All 171 housekeeper tests
pass (55 helper + 116 main). 14 fixture snapshots updated to mirror
the new emission. 42 existing test fixtures bulk-updated with empty
_script_stage to satisfy the structural check on legacy code paths.

Refs: PR #33 Codex finding F-AMbIV

* fix(repo): housekeeper dispatch routing helper for F-ANGCa

phase e step 4 dispatched the housekeeper subagent unconditionally, but
the contract classifies exits 1/4/>=6 as orchestrator-stage halts (operator
action required, not subagent work). dispatching over a malformed/absent
manifest forced the llm to interpret garbage and emit a result tag based
on hallucinated state. encode the dispatch/halt mapping in a tested helper
(decideHousekeeperRouting) so skill prose, contract, and any future audit
script delegate to one source of truth, mirroring the f-amp7y/f-amsel/
f-amwxk/f-ambiv pattern (move enforcement out of prose, into validators).

10 unit tests pin the mapping per exit class (0/1/2/3/4/5/6/137/-1/nan).

Refs: PR #33, F-ANGCa

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(repo): strip review-thread provenance from agent prompt context

skill prose, agent rules, contract reference, and the canonical subagent
prompt template all carried `Codex F-AMbIV` / `Codex F-ANGCa` / `Codex P1
PR #33` review-thread suffix labels. these are github graphql ids with no
semantic content for any reader outside the originating pr review tab —
they violate the global rule against referencing the current task / fix /
caller in code (claude.md), and they bleed review-process metadata into
the housekeeper subagent's prompt every dispatch. the rule's actual
rationale already lives in the surrounding prose; the f-tag added zero
behavioral signal while consuming prompt tokens.

scope is bounded to runtime-loaded surfaces (skill prose, agent rule list,
contract reference, canonical_template constant). validator gap messages
and jsdoc annotations stay — they're operator-facing / engineer-facing,
not loaded into the housekeeper subagent's prompt context. the breadcrumb
trail to source threads remains in commit messages and test names.

Refs: PR #33

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(repo): housekeeper validator path containment for F-AxXBH

Validator's affected_files loop joined declared paths against repoRoot
without first checking they were repo-relative. Malformed manifests
emitting an absolute path (/etc/passwd) or parent-traversal
(../../private) bypassed validation; later step-7 git add fails with
"outside repository pathspec", dead-ending housekeeping after a
valid:true signal. Encoded containment in assertRepoRelative
(lexical isAbsolute + relative startsWith ".."), called BEFORE
existsSync. 7 unit tests cover helper branches + validator integration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(repo): housekeeper validator shape guards F-Axg-w/F-Axg-z

Validator's affected_files loop forwarded each entry to
assertRepoRelative (isAbsolute), which throws TypeError on non-string
input. The schema_violations reconciliation + preservation loops both
dereferenced v.kind/field/ns_id without per-element shape check, so a
tampered manifest with null or non-object entries crashed
mid-validation instead of surfacing structural-tampering as a gap.
Sanitize once at the top (cleanedSchemaViolations) and add typeof
guard before assertRepoRelative (idx-keyed gap message); downstream
loops iterate the cleaned arrays.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(repo): housekeeper validator class-level array sanitize F-AxkNI/J

Closes the (manifest.X ?? []).method() bug class across all 6 callsites
in the validator. Two sub-classes addressed: (1) container-type — array
fields received an OBJECT/scalar instead of an array, crashing
.entries()/.some()/.map() with "is not a function"; (2) element-shape
— arrays of objects received null/scalar elements, crashing field
dereference (.kind/.field/.addressing) with TypeError. Pre-fix sites:
schema_violations (recon + preserve), concerns (pending-pair + recon),
affected_files (scan + D-7 superset), verification_failures (preserve).
Centralized in sanitizeObjectArrayField + sanitizeStringArrayField
helpers; consumers iterate cleaned arrays. Same gap-routing pattern
as F-AMU4D / F-Axg-z: contract violations surface as gaps with
contract-anchored shape hints, not orchestrator crashes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(repo): strip review-thread provenance from housekeeper surfaces

Removed two classes of ephemeral PR-thread identifiers leaked into runtime
test/code/comment surfaces under .claude/skills/plan-execution/:

  - F-tag suffixes (F-AMSEL/F-AMU4D/F-AMP7Y/F-AMNGCa/F-AMbIV/F-AMWXK/
    F-Axg-w/F-Axg-z/F-AxXBH/F-AxkNI/F-AxkNJ/F-AMLor/F-ANGCa)
  - Full GraphQL thread IDs (PRRT_kwDO...) and review permalink IDs
    (Codex thread rNNNNNNNN — three call-sites)

Both classes describe Codex review-thread provenance: base64 GraphQL IDs
that are ephemeral (rot when the PR is closed/squashed) and opaque (no
resolution path for a future reader). Per the standing rule, these
identifiers belong in commit messages and PR thread replies, never in
runtime-loaded test/code/comment/gap-message surfaces.

Replaced identity-based test assertions (/Codex F-XXX/.test(g)) with
behavior-based assertions (e.g., g.includes("script-stage
verification_failure")) so future refactors that rename gap-message text
don't silently break test coverage on a stale identifier match.

Known residual: Bug-N markers (Bug-4/5/8/9) remain inline in the same
files. They're a separate internal-triage convention without external
docs/backlog cross-references; stripping them was out of scope for this
authorization and is deferred as follow-up.

Test count unchanged: 254/254 pass.

Refs: PR #33

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(repo): sanitize semantic_work_pending before union spread

Codex P1 finding on housekeeper-orchestrator-helpers.mjs:349 — the
`[...effScript, ...(manifest.semantic_work_pending ?? [])]` union spread
crashes "object is not iterable" when a malformed subagent manifest sets
`semantic_work_pending` to a non-array. The `??` only coalesces null/
undefined; an object/scalar short-circuits the coalesce, returns the bad
value, and the spread throws. Bypasses the intended re-dispatch path —
bad manifest structure becomes orchestrator failure instead of a
recoverable validation gap.

Same bug class as the prior class-level fix (sanitize helpers for
schema_violations / concerns / verification_failures / affected_files),
missed at the spread surface for semantic_work_pending. The earlier fix
scope only covered method-call sites (`(arr ?? []).method()`); this
commit extends to the spread surface.

Fix: add cleanedSemanticWorkPending = sanitizeStringArrayField(
manifest.semantic_work_pending, ...) upstream alongside the other
sanitize calls; route both consumption sites through the cleaned array.
Inline asymmetry note: ONLY the manifest field is sanitized;
effScriptSemanticWorkPending is upstream-resolved via Array.isArray
(script-stage = trusted, manifest = subagent-output = untrusted).

Test added (integration test mirrors the AxkNJ proof pattern): manifest
with _script_stage.semantic_work_pending populated and tampered
manifest.semantic_work_pending produces BOTH a structural-tampering gap
AND the canonical "X listed but absent" gap on the script-stage pending
item — proves a tampered live field doesn't mask preservation enforcement.

Test count: 255/255 pass (was 254; +1 new test for this finding).

Refs: PR #33

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(repo): housekeeper validator script-side element sanitize

Sanitize elements of `effScriptVerificationFailures` and
`effScriptSchemaViolations` BEFORE the preservation iteration loops via
the existing `sanitizeObjectArrayField` helper. The container-type guard
in check #12 only verifies the field is an array — element-shape was
unguarded, so a tampered `manifest._script_stage.verification_failures`
containing `[null]` crashed `Object.keys(null)` inside the `failureKey`
closure (and `manifest._script_stage.schema_violations` containing `[null]`
TypeError'd on `sv.kind` dereference). Same fix shape as the manifest-side
sanitize already in place for the four object-array fields.

The companion finding (manifest `_script_stage` as immutable baseline) is
deferred — see thread reply for architectural rationale.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(repo): strip ephemeral Bug-N labels from housekeeper surfaces

Same class as the F-XXXX review-thread suffix cleanup (commit 2715e01).
Bug-N labels were ephemeral session-tracking identifiers used during the
initial finding cycle for this PR — they have no meaning outside the
originating review session and rot the moment the PR squashes. Replaced
each occurrence with a description of the bug class or contract violation
the test/branch defends; preserved all surrounding rationale.

Touches:
- post-merge-housekeeper.test.mjs (Bug-9, Bug-8, Bug-4 ×2, Bug-5 headers + 1 test name)
- post-merge-housekeeper.mjs (Bug-4 ×2 inline pre-validation comments)

256/256 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(repo): housekeeper baseline-trust gap on missing scriptXXX

When validateManifestSubagentStage runs without all four scriptXXX
params and falls back to manifest._script_stage, the validator now
emits an explicit gap. _script_stage is subagent-emitted and may be
tampered; the orchestrator's stage-1 conversation memory snapshot
(read in SKILL.md step 3 BEFORE subagent dispatch) is the only
untamperable baseline. Falling back silently allowed preservation
bypass when the subagent cleared both the top-level emit field and
the snapshot field while keeping shape intact.

SKILL.md step 5, the housekeeper contract, the agent prompt, and
the embedded CANONICAL_TEMPLATE all updated to describe the
orchestrator-plumbs-stage-1-snapshot trust model. _script_stage is
demoted from "immutable" to "redundant integrity signal" — the
primary defense is the validator's gap, the snapshot is defense-
in-depth.

Test fixtures updated to plumb scriptXXX explicitly (mirroring the
production path); two new tests cover the baseline-trust gap
positive (omit scriptXXX → gap fires) and negative (all four
plumbed → suppressed) cases.

Refs: PR #33 finding F-AyDCN

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(repo): housekeeper contract — multi-candidate manifest shape

Multi-candidate runs (--candidate-ns NS-XX,NS-YY) emit a divergent
manifest shape: matched_entry is null with matched_entries[] carrying
per-NS metadata, and mechanical_edits.{status_flip,mermaid_class_swap}
are absent (omitted entirely) with plural status_flips[] /
mermaid_class_swaps[] arrays in their place. Canonical fixture is
scripts/__tests__/fixtures/14-multi-candidate-happy-path.

The contract previously described only the single-candidate shape, and
the canonical subagent prompt template's responsibility #1 told the
housekeeper to replace `<TODO subagent prose>` placeholders only in
mechanical_edits.status_flip.to_line — a subagent following the prompt
literally on a multi-candidate run would skip the status_flips[]
entries. The validator's on-disk placeholder scan would have caught
the omission on the next round-trip (the safety net is shape-agnostic),
but the wasted round-trip was avoidable.

Updated responsibility #1 in both helpers.mjs CANONICAL_TEMPLATE and
the contract's §Canonical Subagent Prompt Template (kept verbatim in
sync for the buildHousekeeperPrompt parity test) to cover both shapes,
added a multi-candidate-shape paragraph to the Manifest schema section,
and updated the validation-invariants line to acknowledge the plural
keys.

Refs: PR #33 finding F-A2u4r

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(repo): housekeeper validator entry guard for malformed manifest

A malformed subagent output (JSON literal `null`, undefined, scalar,
or array root) would crash on the first `manifest.X` dereference at
helpers.mjs L214 before gap collection ran, throwing TypeError instead
of returning a structured validation gap. That broke Phase E's
contract-violation recovery path — the orchestrator caught a runtime
crash instead of a routable gap, and could not re-dispatch / surface
the contract violation to the user.

Added an entry guard at function start that catches null, undefined,
non-object scalars, and array roots. Discriminates on actualKind for
actionable messaging and short-circuits with a structured gap so
downstream code can trust manifest is a non-null non-array object.

Four dedicated tests cover null / undefined / scalar (number+string) /
array roots. 89/89 pass.

Refs: PR #33 finding F-A24wj

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot force-pushed the release-please--branches--develop branch 4 times, most recently from b3cd7fb to 79ca427 Compare May 11, 2026 07:27
Sawmonabo added a commit that referenced this pull request May 11, 2026
- step 5 no longer contains a `cp manifest stage1` fallback. By step 5 the
  manifest has been mutated by the subagent dispatch, so the copy would
  alias the stage-1 baseline to tampered state and silently disable
  preservation checks #7/#9/#10/#11. (Codex P1)
- step 3 snapshot promoted from "optional but recommended" to REQUIRED,
  with the cp command moved into a fenced code block. Drops the misleading
  "non-fatal but signals reduced preservation coverage" claim — omitting
  --stage1 actually forces a check #12 round-trip loop (exit 2). (Codex P2)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sawmonabo added a commit that referenced this pull request May 11, 2026
- CLI exit-1 routing gated on definitional-narration gaps only (Codex
  R4 P1). The narration scenario inherently fires 3 gaps (result
  non-canonical, narration_mode_detected, per-item unaddressed). Exit 1
  must NOT win when a non-definitional integrity gap (baseline-trust
  check #12, preservation #7/#9/#10/#11, halt-state #8, schema-surfacing
  #5, placeholder leak, etc.) also fires — auto-deviation off untrusted
  _script_stage would compound the bug. Uses an allow-list of three
  patterns; new gap classes default to integrity (exit 2).
- Auto-deviation fallback recipe (SKILL.md §) now sets result
  conditionally on _script_stage halt-state arrays (Codex R4 P2).
  schema_violations or verification_failures non-empty → BLOCKED;
  otherwise DONE_WITH_CONCERNS. Hard-coding DONE_WITH_CONCERNS would
  deterministically re-fail check #8 and trap the flow.
- Adds child_process-spawned regression test exercising both routing
  paths (narration-only → exit 1; narration + --stage1 omitted → exit 2
  via baseline-trust gap). 115/115 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sawmonabo added a commit that referenced this pull request May 11, 2026
…#53)

* chore(repo): add plan-execution housekeeper narration-mode guardrails

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(repo): address Codex P2 findings on PR #53

- narration check sources hasPendingWork from effScriptSemanticWorkPending
  (trusted stage-1 baseline) instead of manifest.semantic_work_pending
  (subagent-emitted). A bad subagent clearing the top-level array no longer
  demotes routing from narration auto-deviation to generic round-trip.
- CLI wrapper rejects --stage1 with no path argument (exit 3) instead of
  silently ignoring it. Adds regression test for the explicit-param >
  _script_stage precedence chain.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(repo): address Codex P1 + P2 findings on PR #53 fixup

- step 5 no longer contains a `cp manifest stage1` fallback. By step 5 the
  manifest has been mutated by the subagent dispatch, so the copy would
  alias the stage-1 baseline to tampered state and silently disable
  preservation checks #7/#9/#10/#11. (Codex P1)
- step 3 snapshot promoted from "optional but recommended" to REQUIRED,
  with the cp command moved into a fenced code block. Drops the misleading
  "non-fatal but signals reduced preservation coverage" claim — omitting
  --stage1 actually forces a check #12 round-trip loop (exit 2). (Codex P2)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(repo): fix stdout flush race in validate-subagent-manifest

Wrap the script body in main() returning the exit code, then assign
`process.exitCode = main()`. The previous unconditional `process.exit(N)`
after `process.stdout.write(JSON)` could terminate Node before async
stdout writes drained when stdout is piped (e.g., `... | jq`, CI log
collectors), intermittently truncating the JSON payload and breaking
orchestrator parsing even when validation succeeded. (Codex PR #53 R3 P1)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(repo): address Codex R4 P1 + P2 findings on PR #53

- CLI exit-1 routing gated on definitional-narration gaps only (Codex
  R4 P1). The narration scenario inherently fires 3 gaps (result
  non-canonical, narration_mode_detected, per-item unaddressed). Exit 1
  must NOT win when a non-definitional integrity gap (baseline-trust
  check #12, preservation #7/#9/#10/#11, halt-state #8, schema-surfacing
  #5, placeholder leak, etc.) also fires — auto-deviation off untrusted
  _script_stage would compound the bug. Uses an allow-list of three
  patterns; new gap classes default to integrity (exit 2).
- Auto-deviation fallback recipe (SKILL.md §) now sets result
  conditionally on _script_stage halt-state arrays (Codex R4 P2).
  schema_violations or verification_failures non-empty → BLOCKED;
  otherwise DONE_WITH_CONCERNS. Hard-coding DONE_WITH_CONCERNS would
  deterministically re-fail check #8 and trap the flow.
- Adds child_process-spawned regression test exercising both routing
  paths (narration-only → exit 1; narration + --stage1 omitted → exit 2
  via baseline-trust gap). 115/115 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot force-pushed the release-please--branches--develop branch from 79ca427 to de8b339 Compare May 12, 2026 00:30
@github-actions github-actions Bot force-pushed the release-please--branches--develop branch 6 times, most recently from 254b7c8 to 4d01b87 Compare May 19, 2026 01:40
@github-actions github-actions Bot force-pushed the release-please--branches--develop branch 10 times, most recently from 18df547 to e794cd7 Compare May 26, 2026 02:25
@github-actions github-actions Bot force-pushed the release-please--branches--develop branch 4 times, most recently from d4f34fe to 203b4a5 Compare May 29, 2026 01:01
@github-actions github-actions Bot force-pushed the release-please--branches--develop branch from 203b4a5 to 150960b Compare May 30, 2026 01:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

0 participants