Skip to content

feat(repo): housekeeper subagent + Phase E orchestrator wiring#33

Merged
Sawmonabo merged 45 commits into
developfrom
feat/plan-execution-housekeeper-subagent
May 10, 2026
Merged

feat(repo): housekeeper subagent + Phase E orchestrator wiring#33
Sawmonabo merged 45 commits into
developfrom
feat/plan-execution-housekeeper-subagent

Conversation

@Sawmonabo
Copy link
Copy Markdown
Owner

Summary

  • Adds the 7th plan-execution subagent role: plan-execution-housekeeper (color: blue; tools: Read, Grep, Glob, Edit, Write).
  • Adds references/post-merge-housekeeper-contract.md with manifest schema, exit codes, validation invariants, recovery diagnostic, completion-rule matrix, file-reference extraction heuristic.
  • SKILL.md Phase E section rewritten with candidate-lookup prose (per spec §4.3.2 / §4.3.3) and 8-step ordering (per spec §6.1 — Progress Log moves AFTER housekeeping into the same commit).
  • "Six subagent roles" → "Seven subagent roles" at SKILL.md line 29; Reference Files section gains the contract row; line-42 "five of six" → "six of seven" set-quantifier mirror updated.
  • state-recovery.md: Phase E housekeeping recovery section (per spec §7.2) + inline candidate-lookup rules (per spec §4.3.5) + flag for the line-174 .agents/tmp/ lifecycle drift (deferred reconciliation per spec §9.3, tracked as BL-109).
  • failure-modes.md: routing rules 20 + 21 added — rule 20 covers affected_files sprawl round-trip, rule 21 covers schema-violation BLOCKED routing reusing rule 4 (no new exit-state introduced per spec §7.1 invariant).
  • Layer 2 tests (15 total in helper test file): prompt construction (3 modes: happy / AUTO-CREATE / schema-violation) + manifest-stage validation (pending-item coverage + leftover-placeholder detection + schema_violation reconciliation) + 4 D-7-row tests (snapshot, zod parse, affected_files superset, sprawl detection) + 3 invariant tests (I-1 cite-target hook regression, I-2 canonical exit-state declaration, I-3 no-shell-out for git).
  • New helper detectAffectedFilesSprawl in housekeeper-orchestrator-helpers.mjs (pure set-difference function for routing rule 20).
  • zod ^4.3.6 added to root devDependencies (was workspace-only; needed for the §5.3 schema test).

Refs

  • Spec: docs/superpowers/specs/2026-05-03-plan-execution-housekeeper-design.md §5.2, §5.3, §5.4, §6.1, §6.2, §7.1, §7.2, §7.3, §9.1, §9.3, §10.1
  • Plan: docs/superpowers/plans/2026-05-03-plan-execution-housekeeper-implementation.md §Phase 4

Test plan

  • CI green
  • Codex 👍
  • 0 unresolved threads
  • All Layer 1 + Layer 2 tests pass (188 tests, 0 fail — verified locally)
  • Manual Layer 3 (subagent E2E) run pre-merge: node .claude/skills/plan-execution/scripts/__tests__/run-subagent-against-fixtures.mjs against fixtures 01 + 05 + 09. Note: Layer 3 runner is NOT in this PR's scope (spec §8.3 marks it manual + non-CI).

🤖 Generated with Claude Code

Sawmonabo and others added 15 commits May 6, 2026 18:41
Adds the 7th plan-execution subagent role per spec §5.2: blue color,
tools Read/Grep/Glob/Edit/Write (Bash omitted to mechanically enforce
Plan Invariant I-3 — no git shell-out from this role). Body sections
mirror plan-execution-contract-author.md shape: Inputs, Mindset, Hard
rules, Decision presentation, Exit states (only the 4 canonical from
failure-modes.md per Plan Invariant I-2), Report format, Reference
files.

Refs: Plan-housekeeper Phase 4 Task 4.1; Spec §5.2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The contract is the housekeeper subagent's portable reference; sections
extract spec §5.3 manifest schema, §5.1 exit codes, §5.3 validation
invariants, §7.2 resume diagnostic, §3a.2 completion-rule matrix, §3a.4
file-reference extraction heuristic. Adds the canonical subagent prompt
template that buildHousekeeperPrompt (Task 4.3) reproduces verbatim and
the Layer 2 snapshot test (Task 4.8) pins.

Refs: Plan-housekeeper Phase 4 Task 4.2; Spec §9.1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Canonical Subagent Prompt Template (Task 4.2 verbatim source +
deployed contract) cited `rule 21` for the sprawl-routing case, but
Task 4.7 defines rule 20 = sprawl (round-trip dispatch) and rule 21 =
schema-violation halt → BLOCKED. Also re-frames the routing prose:
sprawl triggers an orchestrator round-trip (DONE_WITH_CONCERNS is a
secondary outcome only after weak-justification re-dispatch, not the
primary route). Without this fix, Task 4.8's snapshot test would pin
the wrong rule number into CI fixtures.

Fixes both the canonical contract and the plan's verbatim source so a
re-run of Task 4.2 cannot reintroduce the bug.

Refs: Plan-housekeeper Phase 4 Task 4.2; Task 4.7 rule 20.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds .claude/skills/plan-execution/lib/housekeeper-orchestrator-helpers.mjs
exporting buildHousekeeperPrompt (BASE template from contract + conditional
AUTO-CREATE / schema-violation augmentations) and validateManifestSubagentStage
(catches: pending-item gaps, file-residual placeholders, semantic_edits values
P2 placeholders, nested-array placeholders, schema_violations x concerns
reconciliation gaps). 8 unit tests TDD-authored.

Refs: Plan-housekeeper Phase 4 Task 4.3; Spec §5.3, §8.2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code-quality review POLISH items: lift `<TODO subagent prose>` literal
to module scope (single source of truth across walkForPlaceholder +
validateManifestSubagentStage), and reap Test 6's tmp dir via
try/finally + rmSync to match sibling test convention.

Refs: Plan-execution-housekeeper Phase 4 Task 4.3
Phase E now describes the 8-step post-merge housekeeping flow: candidate-lookup
over §6 NS catalog, script dispatch (--candidate-ns / --auto-create / NEEDS_CONTEXT
on multi-match), script-stage manifest validation, plan-execution-housekeeper
subagent dispatch, subagent-stage validation, Progress Log append (moved from
before-merge to after-housekeeping per spec §6.1), single bundled commit, push.

Per Plan §Decisions-Locked D-3 the housekeeper ticks ALL Done-Checklist boxes
for the merged Phase. Footnote at step 1 cross-links to Plan §D-7 fixture matrix.

Refs: spec §5.4, §6.1, §6.2, §7.1; Plan-execution-housekeeper Phase 4 Task 4.4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Plan line 2923 prescribes "A new bullet at the top of Phase E noting ..."
but the Task 4.4 rewrite rendered the D-3 line as a paragraph. Prepend the
list marker so the rendering matches the plan's "new bullet" framing.

Refs: Plan-execution-housekeeper Phase 4 Task 4.4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
POLISH from Task 4.4 code-quality review:

- Step 6 path drifted to gitignored `.agents/tmp/<session-id>/progress.md`,
  which is incompatible with step 7's "single git commit on develop" (spec
  §6.1 line 905 mandates same-commit). Restore plan-body
  `docs/plans/NNN-*.md` § Progress Log location.
- GFM footnote ref `Rules:[^d7]` needs space before `[^d7]`.

Mirrored fix in plan source per Task 4.2 rule-20 precedent (commit b5f1e10)
so re-runs don't reintroduce the path drift.

Refs: Plan-execution-housekeeper Phase 4 Task 4.4
Task 4.5 of plan-execution-housekeeper Phase 4: bumps the orchestrator's
role count six → seven, adds `plan-execution-housekeeper` to all three
role enumerations (line 29 introduction, line 42 no-git rule, line 515
Reference Files subagent-definitions row), and adds a Reference Files
row pointing to `references/post-merge-housekeeper-contract.md`
(authored in Task 4.2, commit b5f1e10's predecessor).

Set-quantifier sweep includes lines 42 and 515 alongside the explicit
line 29 from the plan task spec — per same-class-sweep discipline, the
"Six subagent roles" claim mirrors at all three sites.

Refs: Plan-execution-housekeeper Phase 4 Task 4.5
state-recovery.md: append "Resuming a Phase E housekeeping halt" diagnostic
(per spec §7.2) and mirror the four §4.3.2 candidate-lookup rules inline (per
spec §4.3.5) so a recovering engineer doesn't need to cross-link to SKILL.md
to debug rule-N mismatches. Flag the existing line-174 .agents/tmp/ lifecycle
drift with a \`> Note:\` block linking to BL-109.

backlog.md: reserve BL-109 to track reconciling state-recovery.md's ".agents/
tmp/ deleted at commit time" claim with lefthook.yml (which has no such job).

Refs: spec §6.1, §7.2, §4.3.5, §9.3; Plan §Decisions-Locked D-5.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
POLISH from Task 4.6 code-quality review: an inbound anchor in
state-recovery.md (the > Note: callout authored alongside BL-109)
points to the GitHub-slugified BL-109 heading. The heading contains
backticks + dots + slashes that get stripped by GFM, so any future
heading reword could silently break the inbound anchor without any
hook catching it. Add an HTML comment above the heading documenting
the dependency, mirroring the PR #24 / PR #27 canonicalization-sweep
lessons.

Refs: Plan-execution-housekeeper Phase 4 Task 4.6
Adds rules 20 (housekeeper subagent affected_files sprawl detection +
round-trip resolution) and 21 (script schema-violation BLOCKED routing
reusing rule-4's graceful-drain) to failure-modes.md. Both reuse the
existing four canonical exit-states (DONE / DONE_WITH_CONCERNS /
NEEDS_CONTEXT / BLOCKED) per spec §7.1 invariant; no new exit-state
introduced.

Refs: spec §7.1; Plan-execution-housekeeper Phase 4 Task 4.7
POLISH from Task 4.7 review: 4 citation sites used "§ Exit Codes"
(capital C) while the contract heading at
references/post-merge-housekeeper-contract.md:70 is "## Exit codes"
(lowercase). Sweep direction lowercases the citers (not capitalizes
the heading), because sentence-case matches:
  (a) all 5 sibling ## headings in post-merge-housekeeper-contract.md
  (b) sibling references/preflight-contract.md:13 "## Exit codes"

Sites swept (4):
- references/failure-modes.md:206 (rule 21, just landed in bc9c03b)
- references/state-recovery.md:198 (Task 4.6 commit 5f0d04b)
- plan source line 2989 (state-recovery.md verbatim mirror)
- plan source line 3120 (failure-modes.md verbatim mirror)

Refs: Plan-execution-housekeeper Phase 4 Task 4.7
Adds 7 unit tests to post-merge-housekeeper-orchestrator-helpers.test.mjs
covering Plan §Decisions-Locked D-7 rows 12-15 (prompt-template snapshot,
zod schema parse, affected_files superset, sprawl detection routing) and
the I-1/I-2/I-3 invariants (cite-target hook regression, canonical exit-
state declaration, no-shell-out for git).

Adds new helper detectAffectedFilesSprawl to housekeeper-orchestrator-
helpers.mjs (pure function, set-difference) and extends
validateManifestSubagentStage with an optional scriptAffectedFiles param
that enforces the D-7 row 14 superset check. Adds zod ^4.3.6 to root
devDependencies so the schema test resolves the import. Adds the RESULT:
prefix to the four canonical exit-states in plan-execution-housekeeper.md
so the I-2 invariant test detects all four declarations consistently.

Refs: Plan-execution-housekeeper Phase 4 Task 4.8
Plan §Task-4.9 Step 1 prescribed `node --test <directory>/` but Node
22.21 does NOT directory-walk explicit path args — only auto-discovers
from cwd. The directory-arg was treated as a module path and threw
MODULE_NOT_FOUND. Use `<directory>/*.test.mjs` glob instead.

Also corrects the expected-test-count line: Task 4.8 added 7 tests
(4 D-7 rows + 3 invariants), not 4 D-7 rows alone.

Refs: Plan-execution-housekeeper Phase 4 Task 4.9
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fa932d4567

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread .claude/skills/plan-execution/lib/housekeeper-orchestrator-helpers.mjs Outdated
The validator's reconciliation loop used `.some(c => c.kind === "schema_violation")`
which ignored `sv` and let a single generic concerns entry satisfy N violations.
Fix: filter concerns by kind once, then match each schema_violations entry
structurally on `field` (and `ns_id` when present). Update the gap-message to
reference the actual differentiator (e.g. "NS-15.type") instead of the
always-literal `sv.kind`. Adds three regression tests:

- 2 violations + 1 generic concern → exactly the unmatched 'type' violation gaps
- 2 violations + 2 specific concerns → passes
- --candidate-ns shape (no ns_id) → matches by `field` alone

Refs: PR #33 Codex review thread r3198337816

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 027ef550a2

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread .claude/skills/plan-execution/lib/housekeeper-orchestrator-helpers.mjs Outdated
Comment thread .claude/skills/plan-execution/SKILL.md Outdated
Codex P1 (PR #33 R2): two gaps in the subagent-stage validator forced
false-positive round-trips on legitimate halt outcomes.

Gap 1 — `addressing: <item-key>` undocumented. The canonical-template
responsibility list never told the subagent that concerns deferring a
`semantic_work_pending` item must set `addressing: "<exact-item-key>"`
to satisfy the validator's per-item check. The match key was a hidden
contract.

Gap 2 — per-item check fired on halt-states. When subagent returned
BLOCKED or NEEDS_CONTEXT (it stopped before completing semantic work),
the per-item check still demanded coverage of every pending item,
forcing wasted dispatches.

Fix:
- Contract doc canonical template adds responsibility #5 documenting
  the `addressing: <item-key>` requirement (renumbers existing
  #5/#6/#7 -> #6/#7/#8). Validation invariants line clarifies the
  matching key + halt-state waiver.
- JS CANONICAL_TEMPLATE const re-synced verbatim per Plan
  §Decisions-Locked D-1; D-7 row 12 snapshot test confirms sync.
- validateManifestSubagentStage skips the per-item loop when
  manifest.result is "BLOCKED" or "NEEDS_CONTEXT". DONE_WITH_CONCERNS
  still requires per-item pairing — waiver narrowly scoped to halts.
- Gap message now hints at the contract requirement so future
  regressions self-document.

Tests: 4 new regression tests (BLOCKED waiver, NEEDS_CONTEXT waiver,
DONE_WITH_CONCERNS still requires pairing, `addressing` match by
exact item-key with `kind` irrelevant).

Note: Codex Finding 2 (missing merge-command steps in SKILL.md
Phase E) is a deeper architectural drift between spec (pre-merge
housekeeping) and plan/SKILL.md (post-merge ordering) — deferred to
user direction; not addressed here.

Refs: PR #33

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Sawmonabo
Copy link
Copy Markdown
Owner Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6208124855

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Codex P2 (PR #33 R3): the schema_violations × concerns matcher's
predicate `c.field !== sv.field` was trivially false when both sides
lacked a `field` (undefined !== undefined), letting a single generic
concern absorb multiple distinct-shape violations. Plus the pre-filter
`c.kind === "schema_violation"` excluded shapes the script actually
emits — `auto_create_title_seed_underivable` (line 899) is a singleton
with no `field`/`ns_id`, so it would never find a matching concern.

Fix:
- Validator drops the kind="schema_violation" pre-filter and adds
  `c.kind === sv.kind` to the per-violation match predicate. Each
  violation must now be paired to a concern of the SAME kind.
- Gap message handles fieldless violations by falling back to a
  `(kind: <kind>)` label (was printing literal "undefined") and
  reflects the actual match-key requirements.
- Contract doc (validation invariants line 93 + canonical template
  responsibility #4) updated to describe the actual three-shape
  rule (schema_violation OR auto_create_title_seed_underivable
  singleton) rather than hardcoding `kind: "schema_violation"`.
- JS CANONICAL_TEMPLATE const re-synced verbatim per Plan
  §Decisions-Locked D-1; D-7 row 12 snapshot test confirms sync.
- Validator JSDoc check #4 expanded to call out the kind
  discriminator and the trivial-equality vulnerability it closes.

Tests: 3 new regression tests (distinct-kind violations require
kind-discriminated concerns, singleton kind matches by kind alone,
fieldless-violation gap message format).

Refs: PR #33

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 77f44354ad

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread .claude/agents/plan-execution-housekeeper.md Outdated
Comment thread .claude/skills/plan-execution/SKILL.md Outdated
Codex P1 (PR #33 R3 Finding 4): .claude/agents/plan-execution-housekeeper.md
line 41 still mandated `kind: schema_violation` literally even after
commit 77f4435 changed the validator to discriminate by sv.kind verbatim.
The Finding 3 fix updated the contract doc + JS const + validator, but
missed the same-axis prose in 4 other locations across the doc tree —
classic canonicalization-sweep miss per
`feedback_canonicalization_sweep_completeness.md`.

Sweep covers all parallel hard-rule prose + invariants:
- .claude/agents/plan-execution-housekeeper.md:41 (Codex flagged direct)
- docs/superpowers/specs/.../housekeeper-design.md:617 (spec hard rule)
- docs/superpowers/specs/.../housekeeper-design.md:693 (validation invariants)
  — also brings the invariants prose into alignment with Finding 1's
  `addressing: <item-key>` + halt-state waiver wording (was stale from
  the prior PR cycle's runtime fix)
- docs/superpowers/specs/.../housekeeper-design.md:953 (failure-mode table)
- docs/superpowers/plans/.../housekeeper-implementation.md:2477 (plan hard rule)

Skipped (out of canonicalization-sweep scope):
- Plan source's prescribed canonical template (lines 2570-2595)
- Plan source's prescribed validator code (lines 2785-2839)
These are historical implementation guides describing what the implementer
was supposed to write at PR-construction time. Subsequent fixes (Findings 1
and 3) evolved the implementation; updating prescription post-hoc is
plan-archeology, not contract-alignment.

Note: Codex Finding 5 (PRRT_kwDOSCycWc6ALNW_) re-raises the architectural
drift originally flagged as Finding 2 (PRRT_kwDOSCycWc6AK48r) — both
about missing pre-merge / merge-step commands in SKILL.md Phase E. That
question is OUT OF SCOPE for this commit; still awaits user direction on
Path A (restructure SKILL.md + plan Task 4.4 per spec §4.1/§6.2 ordering)
vs Path B (add D-8 Decision Log + reconcile line-45 contradiction).

Refs: PR #33

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dac2396634

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread .claude/skills/plan-execution/lib/housekeeper-orchestrator-helpers.mjs Outdated
Comment thread .claude/skills/plan-execution/SKILL.md Outdated
Validator now enforces two contract invariants from
references/post-merge-housekeeper-contract.md line 93 that prior
rounds missed:

- result !== null + ∈ {DONE, DONE_WITH_CONCERNS, NEEDS_CONTEXT, BLOCKED}:
  blocks any non-canonical exit state (was: undefined slipped through).
  Closes Codex finding PRRT_kwDOSCycWc6ALVFW (P2).

- schema_violations.length > 0 ⇒ result === "BLOCKED":
  surfacing alone is insufficient; the contract requires the BLOCKED
  exit state for halt/routing-path determinism.
  Closes Codex finding PRRT_kwDOSCycWc6ALVFN (P1).

Tests: 25 → 30 (added 5 cases — 3 canonical-state, 2 BLOCKED-on-violations).
Existing fixtures gained explicit `result` fields so they assert under
the new enforcement intentionally rather than passing by accident.

JSDoc on validateManifestSubagentStage now enumerates 7 numbered checks
in code order; the new canonical-state check is #1 and the new
BLOCKED-when-schema-violations check is #6.

D-7 row 12 snapshot test (canonical template ↔ JS const verbatim sync)
unchanged and continues to pin contract↔JS sync.

Refs: PR #33

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8deaf8155e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread .claude/skills/plan-execution/lib/housekeeper-orchestrator-helpers.mjs Outdated
Validator's pending-item reconciliation used hasOwnProperty.call() which
only checks key presence. A subagent returning
semantic_edits.compose_status_prose = undefined would pass the check
despite zero payload, letting DONE/DONE_WITH_CONCERNS slip through with
unresolved semantic work and skipping a needed round-trip.

Tightened to require a meaningful payload: not null/undefined, not
empty string (after trim), not empty array, not empty object.
Numbers/booleans accepted defensively for future contract evolution.

Closes Codex finding PRRT_kwDOSCycWc6ALgop (P2).

Tests: 30 → 35 (added 4 empty-variant cases + 1 negative control).
Helper isMeaningfulPayload extracted to module scope alongside
walkForPlaceholder for testability and single-source-of-truth.

JSDoc check #2 now clarifies the value-must-be-meaningful rule and
cites the contract clause "containing the composed output" from
canonical-template responsibility #5.

Refs: PR #33

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8bae2f15d0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread .claude/skills/plan-execution/lib/housekeeper-orchestrator-helpers.mjs Outdated
Validator's placeholder-scan loop silently skipped any affected_files
entry that no longer existed on disk (existsSync gate). A subagent that
deleted a declared file (e.g. cross-plan-dependencies.md) would pass
validation despite the contract clause that affected_files are files
the subagent EDITED — implying they exist post-edit.

Loop now pushes a gap when existsSync is false ("declared in
affected_files but missing from disk — destructive out-of-scope
behavior") AND continues (skipping the placeholder read for the
missing file). Existing-file branch unchanged.

Closes Codex finding PRRT_kwDOSCycWc6ALnyo (P1).

Tests: 35 → 36 (added missing-file gap; negative case covered by
existing placeholder-scan fixtures).

Contract doc clause at line 93 augmented with explicit parenthetical:
deletion of a declared file is a contract violation surfaced by the
validator's missing-file gap.

JSDoc check #3 now mentions both placeholder-scan AND existence-check
since both live in the same loop.

Refs: PR #33

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7bec222acc

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Sawmonabo added 2 commits May 7, 2026 00:20
Codex F-AMU4D (PR #33 P2): the placeholder-scan loop called readFileSync
unconditionally after the existsSync gate, throwing EISDIR on directory
paths and any other I/O error, crashing the orchestrator instead of
routing through the validator's gap collection.

Add statSync().isFile() guard + try/catch around statSync and readFileSync;
each failure path emits a distinct gap citing F-AMU4D so the failure
mode is identifiable. The contract requires affected_files entries to
be regular files (the script edits line-level content); a directory is
a contract violation that MUST surface as a gap so Phase E can re-dispatch
— not an unhandled exception that halts orchestration.

Regression test: a directory at the affected_files path now emits a
gap matching "is not a regular file" + "Codex F-AMU4D" rather than
crashing with EISDIR.

Tests: 48 helpers (47 + 1 new) + 116 main pass.
Codex F-AMWXK (PR #33 P1): validateManifestSubagentStage iterated
manifest.semantic_work_pending (subagent-written), so a subagent could
clear that array + return DONE/DONE_WITH_CONCERNS — the per-item pairing
loop ran zero times and emitted zero gaps, letting unaddressed semantic
work pass validation. SAME bypass shape as F-AMP7Y (schema_violations)
+ F-AMSEL (verification_failures).

STRUCTURAL DIVERGENCE from checks #9/#10: those guard
array-non-emptiness → BLOCKED state with a separate preservation check.
F-AMWXK leverages the EXISTING per-item iteration by feeding it the
UNION of script-stage snapshot + subagent-written array. The script
snapshot is the immutable contract (subagent CANNOT bypass by clearing);
the subagent's additions are also iterated (subagent committed to
address anything they added).

Why union not preservation-check: a separate preservation check would
emit "X missing from semantic_work_pending" — forcing 3 round-trips
(R1 array-shrink gap → R2 re-add → R3 address). Union semantics keep
the existing "X listed but unaddressed" gap actionable in ONE round-trip.

Add scriptSemanticWorkPending parameter (defaults null for legacy
callsites + tests). Updated JSDoc check #11 documenting the new
parameter, the union semantics, and the structural divergence.

Regression tests (3 — pin both directions of the boundary plus the
union semantics):
- cleared semantic_work_pending → 2 gaps for both script-stage items
- preserved + addressed → valid
- subagent-added items also iterated → 1 gap for unaddressed addition

Tests: 51 helpers (48 + 3 new) + 116 main pass.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7bec222acc

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread .claude/skills/plan-execution/lib/housekeeper-orchestrator-helpers.mjs Outdated
Path-(a) A2 design closes the system-level half of the contract-half-fix
bypass class (F-AMP7Y / F-AMSEL / F-AMWXK). Validator-API parameters were
in place but no production caller plumbed them, so the four preservation /
iteration checks (#7 / #9 / #10 / #11) silently disabled.

Script now writes _script_stage as an immutable snapshot of the four
arrays the validator enforces (affected_files / schema_violations /
verification_failures / semantic_work_pending). Validator reads the
snapshot from the manifest with precedence: explicit scriptXXX param >
manifest._script_stage[field] > null. Structural-tampering check fires
whenever the snapshot path is relied on (any scriptXXX null), catching
secondary attacks where the subagent removes _script_stage to defeat
the snapshot defense.

Subagent contract: _script_stage is READ-ONLY. Touching it surfaces as
a Codex F-AMbIV gap. Documented in the canonical subagent prompt
template, the agent definition file, the contract doc § _script_stage
immutability, and SKILL.md Phase E step 5.

Tests: 4 new F-AMbIV regressions (A2 fallback / missing-snapshot gap /
malformed-snapshot gap / param precedence). All 171 housekeeper tests
pass (55 helper + 116 main). 14 fixture snapshots updated to mirror
the new emission. 42 existing test fixtures bulk-updated with empty
_script_stage to satisfy the structural check on legacy code paths.

Refs: PR #33 Codex finding F-AMbIV
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7bec222acc

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread .claude/skills/plan-execution/SKILL.md Outdated
Sawmonabo and others added 2 commits May 8, 2026 20:45
phase e step 4 dispatched the housekeeper subagent unconditionally, but
the contract classifies exits 1/4/>=6 as orchestrator-stage halts (operator
action required, not subagent work). dispatching over a malformed/absent
manifest forced the llm to interpret garbage and emit a result tag based
on hallucinated state. encode the dispatch/halt mapping in a tested helper
(decideHousekeeperRouting) so skill prose, contract, and any future audit
script delegate to one source of truth, mirroring the f-amp7y/f-amsel/
f-amwxk/f-ambiv pattern (move enforcement out of prose, into validators).

10 unit tests pin the mapping per exit class (0/1/2/3/4/5/6/137/-1/nan).

Refs: PR #33, F-ANGCa

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
skill prose, agent rules, contract reference, and the canonical subagent
prompt template all carried `Codex F-AMbIV` / `Codex F-ANGCa` / `Codex P1
PR #33` review-thread suffix labels. these are github graphql ids with no
semantic content for any reader outside the originating pr review tab —
they violate the global rule against referencing the current task / fix /
caller in code (claude.md), and they bleed review-process metadata into
the housekeeper subagent's prompt every dispatch. the rule's actual
rationale already lives in the surrounding prose; the f-tag added zero
behavioral signal while consuming prompt tokens.

scope is bounded to runtime-loaded surfaces (skill prose, agent rule list,
contract reference, canonical_template constant). validator gap messages
and jsdoc annotations stay — they're operator-facing / engineer-facing,
not loaded into the housekeeper subagent's prompt context. the breadcrumb
trail to source threads remains in commit messages and test names.

Refs: PR #33

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 902314e2be

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread .claude/skills/plan-execution/lib/housekeeper-orchestrator-helpers.mjs Outdated
Validator's affected_files loop joined declared paths against repoRoot
without first checking they were repo-relative. Malformed manifests
emitting an absolute path (/etc/passwd) or parent-traversal
(../../private) bypassed validation; later step-7 git add fails with
"outside repository pathspec", dead-ending housekeeping after a
valid:true signal. Encoded containment in assertRepoRelative
(lexical isAbsolute + relative startsWith ".."), called BEFORE
existsSync. 7 unit tests cover helper branches + validator integration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6966fd9ef3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Validator's affected_files loop forwarded each entry to
assertRepoRelative (isAbsolute), which throws TypeError on non-string
input. The schema_violations reconciliation + preservation loops both
dereferenced v.kind/field/ns_id without per-element shape check, so a
tampered manifest with null or non-object entries crashed
mid-validation instead of surfacing structural-tampering as a gap.
Sanitize once at the top (cleanedSchemaViolations) and add typeof
guard before assertRepoRelative (idx-keyed gap message); downstream
loops iterate the cleaned arrays.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d69e5f7522

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread .claude/skills/plan-execution/lib/housekeeper-orchestrator-helpers.mjs Outdated
Comment thread .claude/skills/plan-execution/lib/housekeeper-orchestrator-helpers.mjs Outdated
Closes the (manifest.X ?? []).method() bug class across all 6 callsites
in the validator. Two sub-classes addressed: (1) container-type — array
fields received an OBJECT/scalar instead of an array, crashing
.entries()/.some()/.map() with "is not a function"; (2) element-shape
— arrays of objects received null/scalar elements, crashing field
dereference (.kind/.field/.addressing) with TypeError. Pre-fix sites:
schema_violations (recon + preserve), concerns (pending-pair + recon),
affected_files (scan + D-7 superset), verification_failures (preserve).
Centralized in sanitizeObjectArrayField + sanitizeStringArrayField
helpers; consumers iterate cleaned arrays. Same gap-routing pattern
as F-AMU4D / F-Axg-z: contract violations surface as gaps with
contract-anchored shape hints, not orchestrator crashes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 48f21daac1

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread .claude/skills/plan-execution/lib/housekeeper-orchestrator-helpers.mjs Outdated
Removed two classes of ephemeral PR-thread identifiers leaked into runtime
test/code/comment surfaces under .claude/skills/plan-execution/:

  - F-tag suffixes (F-AMSEL/F-AMU4D/F-AMP7Y/F-AMNGCa/F-AMbIV/F-AMWXK/
    F-Axg-w/F-Axg-z/F-AxXBH/F-AxkNI/F-AxkNJ/F-AMLor/F-ANGCa)
  - Full GraphQL thread IDs (PRRT_kwDO...) and review permalink IDs
    (Codex thread rNNNNNNNN — three call-sites)

Both classes describe Codex review-thread provenance: base64 GraphQL IDs
that are ephemeral (rot when the PR is closed/squashed) and opaque (no
resolution path for a future reader). Per the standing rule, these
identifiers belong in commit messages and PR thread replies, never in
runtime-loaded test/code/comment/gap-message surfaces.

Replaced identity-based test assertions (/Codex F-XXX/.test(g)) with
behavior-based assertions (e.g., g.includes("script-stage
verification_failure")) so future refactors that rename gap-message text
don't silently break test coverage on a stale identifier match.

Known residual: Bug-N markers (Bug-4/5/8/9) remain inline in the same
files. They're a separate internal-triage convention without external
docs/backlog cross-references; stripping them was out of scope for this
authorization and is deferred as follow-up.

Test count unchanged: 254/254 pass.

Refs: PR #33

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2715e011c2

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread .claude/skills/plan-execution/lib/housekeeper-orchestrator-helpers.mjs Outdated
Codex P1 finding on housekeeper-orchestrator-helpers.mjs:349 — the
`[...effScript, ...(manifest.semantic_work_pending ?? [])]` union spread
crashes "object is not iterable" when a malformed subagent manifest sets
`semantic_work_pending` to a non-array. The `??` only coalesces null/
undefined; an object/scalar short-circuits the coalesce, returns the bad
value, and the spread throws. Bypasses the intended re-dispatch path —
bad manifest structure becomes orchestrator failure instead of a
recoverable validation gap.

Same bug class as the prior class-level fix (sanitize helpers for
schema_violations / concerns / verification_failures / affected_files),
missed at the spread surface for semantic_work_pending. The earlier fix
scope only covered method-call sites (`(arr ?? []).method()`); this
commit extends to the spread surface.

Fix: add cleanedSemanticWorkPending = sanitizeStringArrayField(
manifest.semantic_work_pending, ...) upstream alongside the other
sanitize calls; route both consumption sites through the cleaned array.
Inline asymmetry note: ONLY the manifest field is sanitized;
effScriptSemanticWorkPending is upstream-resolved via Array.isArray
(script-stage = trusted, manifest = subagent-output = untrusted).

Test added (integration test mirrors the AxkNJ proof pattern): manifest
with _script_stage.semantic_work_pending populated and tampered
manifest.semantic_work_pending produces BOTH a structural-tampering gap
AND the canonical "X listed but absent" gap on the script-stage pending
item — proves a tampered live field doesn't mask preservation enforcement.

Test count: 255/255 pass (was 254; +1 new test for this finding).

Refs: PR #33

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f20bdc249e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Sanitize elements of `effScriptVerificationFailures` and
`effScriptSchemaViolations` BEFORE the preservation iteration loops via
the existing `sanitizeObjectArrayField` helper. The container-type guard
in check #12 only verifies the field is an array — element-shape was
unguarded, so a tampered `manifest._script_stage.verification_failures`
containing `[null]` crashed `Object.keys(null)` inside the `failureKey`
closure (and `manifest._script_stage.schema_violations` containing `[null]`
TypeError'd on `sv.kind` dereference). Same fix shape as the manifest-side
sanitize already in place for the four object-array fields.

The companion finding (manifest `_script_stage` as immutable baseline) is
deferred — see thread reply for architectural rationale.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f1ce845260

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Sawmonabo and others added 2 commits May 8, 2026 23:14
Same class as the F-XXXX review-thread suffix cleanup (commit 2715e01).
Bug-N labels were ephemeral session-tracking identifiers used during the
initial finding cycle for this PR — they have no meaning outside the
originating review session and rot the moment the PR squashes. Replaced
each occurrence with a description of the bug class or contract violation
the test/branch defends; preserved all surrounding rationale.

Touches:
- post-merge-housekeeper.test.mjs (Bug-9, Bug-8, Bug-4 ×2, Bug-5 headers + 1 test name)
- post-merge-housekeeper.mjs (Bug-4 ×2 inline pre-validation comments)

256/256 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When validateManifestSubagentStage runs without all four scriptXXX
params and falls back to manifest._script_stage, the validator now
emits an explicit gap. _script_stage is subagent-emitted and may be
tampered; the orchestrator's stage-1 conversation memory snapshot
(read in SKILL.md step 3 BEFORE subagent dispatch) is the only
untamperable baseline. Falling back silently allowed preservation
bypass when the subagent cleared both the top-level emit field and
the snapshot field while keeping shape intact.

SKILL.md step 5, the housekeeper contract, the agent prompt, and
the embedded CANONICAL_TEMPLATE all updated to describe the
orchestrator-plumbs-stage-1-snapshot trust model. _script_stage is
demoted from "immutable" to "redundant integrity signal" — the
primary defense is the validator's gap, the snapshot is defense-
in-depth.

Test fixtures updated to plumb scriptXXX explicitly (mirroring the
production path); two new tests cover the baseline-trust gap
positive (omit scriptXXX → gap fires) and negative (all four
plumbed → suppressed) cases.

Refs: PR #33 finding F-AyDCN

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 682b3c4ecc

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread .claude/skills/plan-execution/lib/housekeeper-orchestrator-helpers.mjs Outdated
Multi-candidate runs (--candidate-ns NS-XX,NS-YY) emit a divergent
manifest shape: matched_entry is null with matched_entries[] carrying
per-NS metadata, and mechanical_edits.{status_flip,mermaid_class_swap}
are absent (omitted entirely) with plural status_flips[] /
mermaid_class_swaps[] arrays in their place. Canonical fixture is
scripts/__tests__/fixtures/14-multi-candidate-happy-path.

The contract previously described only the single-candidate shape, and
the canonical subagent prompt template's responsibility #1 told the
housekeeper to replace `<TODO subagent prose>` placeholders only in
mechanical_edits.status_flip.to_line — a subagent following the prompt
literally on a multi-candidate run would skip the status_flips[]
entries. The validator's on-disk placeholder scan would have caught
the omission on the next round-trip (the safety net is shape-agnostic),
but the wasted round-trip was avoidable.

Updated responsibility #1 in both helpers.mjs CANONICAL_TEMPLATE and
the contract's §Canonical Subagent Prompt Template (kept verbatim in
sync for the buildHousekeeperPrompt parity test) to cover both shapes,
added a multi-candidate-shape paragraph to the Manifest schema section,
and updated the validation-invariants line to acknowledge the plural
keys.

Refs: PR #33 finding F-A2u4r

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: df5a798f21

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

A malformed subagent output (JSON literal `null`, undefined, scalar,
or array root) would crash on the first `manifest.X` dereference at
helpers.mjs L214 before gap collection ran, throwing TypeError instead
of returning a structured validation gap. That broke Phase E's
contract-violation recovery path — the orchestrator caught a runtime
crash instead of a routable gap, and could not re-dispatch / surface
the contract violation to the user.

Added an entry guard at function start that catches null, undefined,
non-object scalars, and array roots. Discriminates on actualKind for
actionable messaging and short-circuits with a structured gap so
downstream code can trust manifest is a non-null non-array object.

Four dedicated tests cover null / undefined / scalar (number+string) /
array roots. 89/89 pass.

Refs: PR #33 finding F-A24wj

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Sawmonabo Sawmonabo merged commit 8a8856e into develop May 10, 2026
8 checks passed
@Sawmonabo Sawmonabo deleted the feat/plan-execution-housekeeper-subagent branch May 10, 2026 01:38
Sawmonabo added a commit that referenced this pull request May 11, 2026
`902314e` swept review-thread suffixes from runtime-loaded surfaces (skill
prose, agent rules, contract reference, canonical prompt template) but
missed a single `F-A24wj:` prefix in a Layer-2 unit-test comment. The tag
is a Codex GraphQL ID with no semantic content for any reader outside the
originating PR review tab; the surrounding prose already names the bug
class (manifest = null short-circuit gap) without the tag.

Detection: `grep -rE 'F-A[A-Za-z0-9]{4,}|Codex F-' .claude/ packages/ scripts/`
now returns zero hits across runtime-loaded surfaces.

Comment is a pure edit — no test logic touched; 108-test file + full
386-test plan-execution skill suite pass locally.

Refs: PR #33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant