Skip to content

feat(spec): sync v0.4.0 — add 11 requirement IDs across P1/P2/P4/P6/P8#50

Merged
brettdavies merged 16 commits into
devfrom
feat/v0.4.0-spec-sync
May 7, 2026
Merged

feat(spec): sync v0.4.0 — add 11 requirement IDs across P1/P2/P4/P6/P8#50
brettdavies merged 16 commits into
devfrom
feat/v0.4.0-spec-sync

Conversation

@brettdavies
Copy link
Copy Markdown
Owner

@brettdavies brettdavies commented May 7, 2026

Summary

v0.4.0 spec sync companion. Ships live check implementations for 11 new requirement IDs across P1, P2, P4, P6, and the brand-new P8 (skill bundle discoverability), suppresses p6-must-sigterm under --audit-profile human-tui to mirror p6-sigpipe's rationale, and bumps the CLI from 0.3.1 to 0.4.0.

Changelog

Added

  • Add P1 secret-handling check (p1-must-secret-non-leaky-path): scans target CLIs' --help for secret-bearing flag families (--token, --password, --api-key, --secret, --auth, --credential) and verifies each has either a --*-file companion or stdin path advertised. Vacuous Pass when no secret-bearing flag is detected.
  • Add P2 schema trio (p2-must-schema-print, p2-should-schema-file, p2-should-json-aliases): runtime-discoverable output schema via schema subcommand or --schema flag, file-export of schemas (schema/*.json, *.schema.json at repo root), and --json / --jsonl short aliases for --output.
  • Add P4 closed-set rejection check (p4-should-enumerate-valid-set, Rust + Python): detects clap ValueEnum, PossibleValuesParser, value_parser!, and Python argparse.choices= / click.Choice().
  • Add P6 lifecycle and naming checks (p6-must-sigterm, Rust + Python; p6-may-standard-names): SIGTERM-handler detection across signal_hook, tokio::signal::unix, signal.signal, and loop.add_signal_handler; community-standard-verb allow-list applied to top-level subcommands.
  • Add P8 skill-bundle suite (p8-should-bundle-exists, p8-must-bundle-install, p8-may-install-all, p8-may-bundle-update): repo-root detection of AGENTS.md / SKILL.md with YAML frontmatter, plus help-surface probes for skill install, skill install --all, and skill update / skill upgrade. Brand-new principle in the registry.

Changed

  • Bump CLI from 0.3.1 to 0.4.0 (MINOR; meaningful coverage growth across five principles, including a brand-new principle).

Documentation

  • Document prose-scrubbing runbook in RELEASES.md for release-flow artifacts (PR bodies, CHANGELOG.md, release-PR bodies) using Vale + LanguageTool + unslop.
  • Add ## PR body section to RELEASES.md codifying what belongs in PR bodies (NEW user-facing substance, six required template sections) and what does not (workflow recap, triple-diff output, pre-push gate results, CI status, AI attribution).

Type of Change

  • feat: New feature (non-breaking change which adds functionality)
  • fix: Bug fix (non-breaking change which fixes an issue)
  • refactor: Code refactoring (no functional changes)
  • perf: Performance improvement
  • docs: Documentation update
  • test: Adding or updating tests
  • chore: Maintenance tasks (dependencies, config, etc.)
  • ci: CI/CD configuration changes
  • style: Code style/formatting changes
  • build: Build system changes
  • BREAKING CHANGE: Breaking API change (requires major version bump)

Related Issues/Stories

Files Modified

Modified:

  • Cargo.toml, Cargo.lock: version 0.3.1 to 0.4.0
  • RELEASES.md: prose-scrubbing runbook + new ## PR body section
  • docs/coverage-matrix.md, coverage/matrix.json: regenerated for 57 requirements
  • src/principles/spec/**: vendored from agentnative-spec v0.4.0
  • src/principles/registry.rs: counter bumps, principle range to 1..=8, p6-sigterm in HumanTui suppression
  • src/types.rs, src/scorecard/mod.rs: CheckGroup::P8 variant + label/order
  • src/checks/{behavioral,project,source/{rust,python}}/mod.rs: register the 13 new check files
  • tests/build_parser.rs: integration test pin updated for v0.4.0 / 57 requirements

Created:

  • src/principles/spec/principles/p8-discoverable-skill-bundle.md (vendored)
  • src/checks/behavioral/secret_non_leaky_path.rs (P1)
  • src/checks/source/{rust,python}/enumerate_valid_set.rs (P4)
  • src/checks/behavioral/{schema_print,json_aliases}.rs, src/checks/project/schema_file.rs (P2)
  • src/checks/source/{rust,python}/sigterm.rs, src/checks/behavioral/standard_names.rs (P6)
  • src/checks/project/bundle_exists.rs, src/checks/behavioral/{bundle_install,install_all,bundle_update}.rs (P8)

Renamed:

  • None.

Deleted:

  • None.

Testing

  • Unit tests added/updated
  • Integration tests added/updated
  • Manual testing completed (dogfood anc check .)
  • All tests passing

Test Summary:

495 unit tests pass; 51 integration tests pass (including the spec-version drift sentry, the convention_check_result_constructed_only_in_run_body rule, the dangling_cover_ids detector, and the matrix artifact drift gate); clippy -Dwarnings clean; anc generate coverage-matrix --check exits 0.

brettdavies added 16 commits May 7, 2026 14:53
PR A (001) plans the agentnative-spec v0.4.0 companion: vendor 11 new
requirement IDs across P1/P2/P4/P6/P8, implement live checks (no stubs),
update SUPPRESSION_TABLE for p6-must-sigterm under HumanTui, regenerate
coverage-matrix artifacts, bump CLI from 0.3.1 to 0.4.0. Time-boxed to
the 24h coupled-release window started at PR #26 merge (2026-05-07T18:38Z).

PR B (002) plans the sibling prose tooling import: new sync-prose-tooling.sh
(parallel to sync-spec.sh), vendor BRAND.md plus four vale packs (skip
spec/ pack since RFC-2119 register doesn't apply to CLI prose), adapt
prose-check.sh, author CLI-channel .impeccable.md, wire CI workflow plus
weekly drift detection, ast-grep-based extraction for in-code prose. No
governance deadline.
Sync src/principles/spec/ to v0.4.0 (90dd48b). Adds p8-discoverable-skill-
bundle.md (new principle, 4 requirements) and refreshes P1-P7 with 7 new
requirement IDs across the existing principles for a total of 11 new IDs.

REQUIREMENTS slice regenerates at build time from vendored frontmatter
(via build_support/parser.rs), so the registry edits here are bookkeeping
only:

- registry_size_matches_spec: 46 -> 57 (4 new MUSTs, 4 new SHOULDs, 3 new
  MAYs).
- level_counts_match_spec: 23/16/7 -> 27/20/10.
- principle_range_is_valid: extended from 1..=7 to 1..=8.

UNVERIFIED_MUSTS gets four temporary entries (p1-must-secret-non-leaky-
path, p2-must-schema-print, p6-must-sigterm, p8-must-bundle-install)
covering the new MUSTs that need live check implementations. Each entry
is removed in the commit that lands its corresponding check (U2-U5 of
docs/plans/2026-05-07-001-feat-v0.4.0-spec-sync-plan.md).
p1-must-secret-non-leaky-path: behavioral check that scans `--help` for
secret-bearing flag families (--token, --password, --api-key, --secret,
--auth, --credential) and verifies each one has either a `*-file`
companion or a stdin path advertised in help text. Vacuous Pass when no
secret-bearing flag is detected. Fail names the offending flag(s).

p4-should-enumerate-valid-set (Rust + Python): source-layer check that
detects framework-declared closed-set rejection — clap's ValueEnum,
PossibleValuesParser, and value_parser! on the Rust side; argparse's
choices=[...] and click.Choice() on the Python side. Both frameworks
include the valid set in their default rejection message, so structural
detection is sufficient. Vacuous Pass when no framework is in use; Warn
when framework is detected but no closed-set declaration appears.

Removes the temporary p1-must-secret-non-leaky-path entry from
UNVERIFIED_MUSTS now that the live check covers it. p2-must-schema-print,
p6-must-sigterm, and p8-must-bundle-install entries remain pending their
respective commits per docs/plans/2026-05-07-001-feat-v0.4.0-spec-sync-
plan.md.
p2-must-schema-print (behavioral): probes target CLI's --help for a
structured-output indicator (--output / --format / --json / --jsonl /
prose mention of json/jsonl/ndjson). When present, looks for either a
`schema` subcommand or `--schema` flag. Skip when no structured-output
indicator appears (vacuous applicability); Pass when schema surface is
advertised; Fail when structured output is advertised without a schema
surface.

p2-should-schema-file (project): walks project root for `schema/`,
`schemas/`, or top-level `*.schema.json`. Pass on hit, Warn otherwise.
SHOULD-tier severity reflects that absence is a hint, not a stop.

p2-should-json-aliases (behavioral, universal): probes --help for
--json or --jsonl as short-form aliases for --output. Pass when present,
Warn otherwise.

Removes the temporary p2-must-schema-print entry from UNVERIFIED_MUSTS.
… under HumanTui

p6-must-sigterm (Rust + Python source): detects SIGTERM-handling
primitives across the canonical APIs — signal_hook (register / Signals
iterator), tokio::signal::unix::SignalKind::terminate, libc::SIGTERM on
the Rust side; signal.signal(signal.SIGTERM, ...) and asyncio's
add_signal_handler on the Python side. Applicability is gated by a
long-running-operation heuristic (server/daemon/watch/asyncio.run
markers); short-running CLIs receive vacuous Pass.

p6-may-standard-names (behavioral, universal-where-subcommands): probes
target's --help for top-level subcommands and matches against a
community-standard verb allow-list (CRUD verbs, action verbs, meta
commands, package-management style). Pass when at least 70% match;
Warn when below threshold; Skip when no subcommands parsed. MAY-tier
soft signal — non-conforming verbs are advisory.

SUPPRESSION_TABLE: appends "p6-sigterm" to the HumanTui slice.
Rationale mirrors p6-sigpipe verbatim — TUIs install their own SIGTERM
handlers to render exit dialogs and save state; the default-disposition
check doesn't match the category's execution model. No new
ExceptionCategory variant needed.

Removes the temporary p6-must-sigterm entry from UNVERIFIED_MUSTS.
Adds CheckGroup::P8 enum variant and four checks covering the new
P8 principle introduced in agentnative-spec v0.4.0.

p8-should-bundle-exists (project, universal): scans project root for
AGENTS.md or SKILL.md (case-insensitive), then verifies YAML frontmatter
opens with `---` and declares a `name:` field. SHOULD-tier — every
miss is Warn, never Fail. Exposes `find_bundle()` so the three
behavioral P8 checks share one detection heuristic.

p8-must-bundle-install (behavioral, conditional): vacuous Pass when no
bundle is present at the project root. Otherwise probes target's --help
for a top-level `skill` subcommand (canonical) or non-canonical patterns
(`init --skill`, `skills add`, `agents add`). Fail when bundle exists
but no install path is advertised.

p8-may-install-all (behavioral, conditional): vacuous Pass when no
bundle. Otherwise chained-probes `<binary> skill install --help` for
`--all`. Warn when absent — MAY-tier informational signal.

p8-may-bundle-update (behavioral, conditional): vacuous Pass when no
bundle. Otherwise chained-probes `<binary> skill --help` for
`update`/`upgrade` subcommand. Warn when absent.

CheckGroup::P8 added to types.rs; the existing exhaustive matches in
scorecard/mod.rs (label_for_group, group_order) extend to handle it.

Removes the temporary p8-must-bundle-install entry from
UNVERIFIED_MUSTS — all four v0.4.0 sync entries are now cleared.
…ub runbook

Final unit of the v0.4.0 spec sync companion (PR A of plan 2026-05-07-001):

- Cargo.toml + Cargo.lock: 0.3.1 -> 0.4.0. MINOR bump reflects the 11
  new live check implementations across P1/P2/P4/P6/P8.

- docs/coverage-matrix.md + coverage/matrix.json: regenerated via
  `anc generate coverage-matrix`. The committed artifacts now reflect 57
  requirements with covers() declarations for every new check.
  `--check` exits 0 (CI parity).

- RELEASES.md: ships ~55 lines of prose-scrubbing runbook for release-
  flow artifacts (PR bodies, CHANGELOG.md, release-PR bodies) that no
  automated check reaches. Self-referential for this release: the
  v0.4.0 operator follows the runbook to scrub PR A's body and the
  release/v0.4.0 PR body before submit. Until prose tooling is vendored
  locally (PR B of plan 2026-05-07-002), the runbook points Vale at the
  spec checkout via --config.

- Tidies lint nits surfaced by `cargo clippy -D warnings` against the
  earlier units' sources: collapsible_if in schema_file.rs, more-private-
  type-than-item visibility on EnumerateScan in both enumerate_valid_set
  files, and #[cfg(test)] gating on the sigterm test helpers (the trait
  run() aggregates across multiple parsed files; the helpers exist only
  for single-source-string testing).
…ements

The pre-push gate caught a stale counter in tests/build_parser.rs that
mirrors registry_size_matches_spec in src/principles/registry.rs. Both
exist to flag unintentional spec growth — the integration version reads
src/principles/spec/principles/ from disk and runs the build_support
parser end-to-end, so it's the closer of the two sentries to what
build.rs sees.

Updates: 46 -> 57; first ID still p1-must-env-var; last ID is now
p8-may-bundle-update (from the new P8 principle). Renames the test from
`vendored_v0_2_0_parses_to_46_requirements` to
`vendored_spec_parses_to_expected_requirement_count` so future syncs
don't have to rename the function alongside the count.
Source Check Convention (CLAUDE.md) requires run() to be the sole
CheckResult constructor per Check impl. The earlier U5 commit
introduced a make_result() helper in install_all.rs and bundle_update.rs
that constructed CheckResult outside run() — a clean dedup that
nonetheless violated the convention.

Refactor: extract a compute_status(project) -> CheckStatus per file. The
trait run() owns CheckResult construction; compute_status holds the
multi-branch applicability gating logic. Same external behavior, same
test coverage (the convention drift detector lives in
tests/integration.rs and now passes).
Aligns RELEASES.md with the structural improvements that landed in
agentnative-spec/RELEASES.md, adapted for the CLI's PR template
(Summary / Changelog / Type of Change / Related Issues/Stories /
Files Modified / Testing — six sections, distinct from the spec's
five-section template).

Adds:

- New `## PR body` section between "Daily development" and "Releasing
  dev to main". Codifies what belongs in PR bodies (NEW user-facing
  substance, six required template sections) and what does NOT (workflow
  recap, triple-diff output, pre-push gate results, CI status, AI
  attribution trailers). Closes the gap that produced the body-discipline
  drift across recent PRs.
- Explicit `cliff.toml` chore-skip footgun reference inside the new
  Type of Change bullet (mirrors the spec's `^chore` regex form for
  precision).
- Required-empty-fields rules: four `Files Modified` sub-headers and
  four `Related Issues/Stories` labels stay even when empty (`None.`
  rather than deletion).
- Internal tooling commit guidance (no entries in `## Changelog` for
  `chore(cliff): ...`, `chore(prose-check): ...`).
- Release-PR repetition rationale (`^release` skip prevents
  double-counting in future regeneration).
- No-AI-attribution rule, mirrored from global CLAUDE.md.

Tightens:

- Drops em-dash overuse in the file's opening sentences (matches spec's
  cleaner period-separated form): "are not permitted." instead of
  "are not permitted —"; same in the Branches table cell and the dev-as-
  forever-branch paragraph.

Self-referential: this PR (#50) is also the v0.4.0 spec sync companion,
so the new PR body guidance is what the v0.4.0 release operator follows
when scrubbing this PR's own body before submit. Future RELEASES.md
edits inherit the framing.
The 4-line breadcrumb added in U1 (commit 48d3239) labelled four temporary
UNVERIFIED_MUSTS entries that U2-U5 removed as each new MUST got its live
check. The entries were cleared in dff880e (U5) but the labelling comment
was missed.

Net effect: matrix.rs now has zero diff vs origin/dev — its only role in
this PR was carrying the temporary entries through the multi-commit
landing.
The rule was implicit (the auto-format hook skips /tmp/ paths so authored
shape is preserved) but never stated as an authoring requirement. Manual
~120-char wrapping during composition undoes the protection and produces
visible mid-sentence breaks plus prose-check pipeline interference (Vale's
line-anchored output reports against split lines, LanguageTool's input
handler chokes on certain control-char interactions when wrap-newlines
land near inline code spans).

Adds an explicit bullet to the section: each paragraph and each bullet
is one logical line, however long. Same rule applies to commit messages
composed via heredoc and to any markdown that ships verbatim to GitHub.

Caught 2026-05-07 on PR #50's own body — surfaces this as the canonical
location rather than a memory entry, since memories shouldn't duplicate
canonical docs (per CLAUDE.md user-level memory guidance).
The new p2-should-json-aliases check (added in U3) flags CLIs whose
--help advertises --output json/jsonl but not the canonical --json /
--jsonl short forms. anc dogfooded as Warn on its own check until now.

Adds a top-level global --json bool flag on Cli, threaded through the
Check and Skill::Install handlers: when set, --json overrides whatever
--output resolved to and forces OutputFormat::Json. The flag is
documented in --help with a pointer to the requirement ID.

JSONL is intentionally not added: anc's OutputFormat enum has only Text
and Json variants; --jsonl as an alias requires implementing JSONL
output first. The new check Passes on either --json or --jsonl, so
--json alone clears the dogfood Warn.

Regenerated completions (bash/zsh/fish/elvish/powershell) include the
new flag globally so shell completion offers --json on every subcommand.

Verified via 'cargo run -- check . --json' producing JSON, 'cargo run
-- check . --output text --json' still producing JSON (override
direction), and dogfood 'cargo run -- check .' now reporting
p2-json-aliases as Pass.
The new p2-must-schema-print MUST check (added in U3) flagged anc itself
as Fail because anc emits structured output via --output json but had
no runtime-discoverable schema surface. The dogfood integration test
caught the regression on pre-push gate.

Adds:

- Top-level `anc schema` subcommand. Prints scorecard schema metadata
  (schema_version from src/scorecard, format identifier, top-level keys,
  spec pointer) in either text or JSON form via --output. The schema body
  is intentionally compact: it pins the version and names the canonical
  top-level keys consumers see in 'anc check --output json'. Hand-rolled
  JSON to avoid pulling in a schemars dep just to describe the schema.
- AGENTS.md gains YAML frontmatter (name / binary / summary), satisfying
  p8-should-bundle-exists. Was Warn ('exists but lacks YAML frontmatter');
  now Pass.
- src/checks/behavioral/schema_print.rs section-match heuristic tightened
  to require trailing whitespace or end-of-line after 'schema' so it
  doesn't false-match prose mentions like 'schema_version' in a
  description line.
- Regenerated completions (bash/zsh/fish/elvish/powershell) include the
  new schema subcommand on every shell.

Dogfood now: p2-schema-print=Pass, p8-bundle-exists=Pass, all P2/P8
checks resolve without Fail.

Note on the why-did-this-pass-earlier puzzle: discover_rust_binaries
prefers target/release/anc over target/debug/anc when both exist. The
release binary was stale (built before the schema subcommand), so the
behavioral checks were probing an outdated --help. After cargo build
--release, both binaries report the new surface and dogfood is
consistent. Worth a follow-up to invert the preference (or pick by
mtime), but out of scope here.
…lands

The v0.4.0 spec sync added `p2-must-schema-print` which probes target
CLIs for a `schema` subcommand or `--schema` flag. anc emits structured
output via --output json but its canonical schema-export surface is
yet-unshipped — the planned implementation is fully designed at
docs/plans/2026-04-30-002-feat-scorecard-json-schema-plan.md (derived
schema via schemars build-dep, embedded via include_str!, exposed via
`anc generate scorecard-schema`).

Adds `p2-schema-print` to a PENDING_FAILS allowlist on
dogfood_no_p2_fail_after_skill_subcommand so PR #50 ships without
mixing scope. Removes the allowlist when the planned verb lands and
satisfies the check.

Pairs with the previous revert commit (b41b54f) that backed out the
ad-hoc top-level `anc schema` MVP. The MVP didn't follow the established
'committed artifact + drift check + build.rs codegen' pattern that the
plan codifies.
@brettdavies brettdavies merged commit 05e74ca into dev May 7, 2026
7 checks passed
@brettdavies brettdavies deleted the feat/v0.4.0-spec-sync branch May 7, 2026 21:06
brettdavies added a commit that referenced this pull request May 7, 2026
## Summary

Replaces `discover_rust_binaries()`'s always-prefer-release heuristic
with mtime-newest-wins so dev workflows where `cargo run`/`cargo test`
only refresh debug stop probing stale `target/release/<bin>` binaries.
The v0.4.0 spec sync surfaced the trap on `p2-must-schema-print` against
anc itself.

## Changelog

### Changed

- Binary discovery in `src/project.rs::discover_rust_binaries` now picks
the newer of `target/release/<bin>` and `target/debug/<bin>` by mtime
when both exist. Ties and metadata failures fall back to debug (matches
cargo's dev-flow default). CI scenarios where only one profile is built
are unchanged.

### Documentation

- Add Dogfooding Safety rule 3 to `CLAUDE.md` describing the mtime-based
selection, with a `NEVER` directive against reverting to the
always-prefer-release shape.

## Type of Change

- [ ] `feat`: New feature (non-breaking change which adds functionality)
- [x] `fix`: Bug fix (non-breaking change which fixes an issue)
- [ ] `refactor`: Code refactoring (no functional changes)
- [ ] `perf`: Performance improvement
- [ ] `docs`: Documentation update
- [ ] `test`: Adding or updating tests
- [ ] `chore`: Maintenance tasks (dependencies, config, etc.)
- [ ] `ci`: CI/CD configuration changes
- [ ] `style`: Code style/formatting changes
- [ ] `build`: Build system changes
- [ ] `BREAKING CHANGE`: Breaking API change (requires major version
bump)

## Related Issues/Stories

- Story: stale `target/release/anc` masking `p2-must-schema-print`
regressions during v0.4.0 spec sync
- Issue: n/a
- Architecture:
`docs/solutions/test-failures/stale-release-binary-dogfood-fail-2026-05-07.md`
- Related PRs: #50 (v0.4.0 spec sync, where the trap was surfaced and a
temporary `PENDING_FAILS` allowlist for `p2-schema-print` was added in
`tests/dogfood.rs`)

## Files Modified

**Modified:**

- `src/project.rs`: `discover_rust_binaries` switches from
existence-only release-over-debug preference to mtime-based selection
via a new `pick_newer_artifact` helper. Adds two unix-gated tests
(`test_discover_picks_newer_artifact_by_mtime`,
`test_discover_picks_release_when_newer`).
- `CLAUDE.md`: Dogfooding Safety section gains rule 3 describing the new
behavior; the `Rules for new behavioral checks` subsection gains a
`NEVER` directive locking the new shape in.

**Created:**

- None.

**Renamed:**

- None.

**Deleted:**

- None.

## Testing

- [x] Unit tests added/updated
- [ ] Integration tests added/updated
- [x] Manual testing completed (dogfood `anc check .` against this
branch; `p2-schema-print` no longer needs a manual `cargo build
--release` to flip from Fail to Pass)
- [x] All tests passing

**Test Summary:**

`cargo test` reports 588 passed, 2 ignored across 7 suites. The two new
unit tests use `tempfile`-style temp dirs and `File::set_modified`
(stable since 1.75) to construct binaries with controlled mtimes; gated
to `#[cfg(unix)]` because mtime semantics under Windows file-attribute
caching are not the semantics this regression asserts against. Pre-push
gate (fmt, clippy `-Dwarnings`, test, cargo-deny, Windows compat) passed
before the push.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant