Skip to content

ai-facing-docs + adapters: align surfaces with current behavior (PR 6 of architecture vNext)#214

Merged
garagon merged 9 commits into
mainfrom
ai-facing-docs-adapters
May 11, 2026
Merged

ai-facing-docs + adapters: align surfaces with current behavior (PR 6 of architecture vNext)#214
garagon merged 9 commits into
mainfrom
ai-facing-docs-adapters

Conversation

@garagon
Copy link
Copy Markdown
Owner

@garagon garagon commented May 11, 2026

Summary

The AI-facing surfaces (llms.txt, AGENTS.md, bin/about.sh, guard/SKILL.md, plus the public READMEs) carried claims from earlier rounds that no longer matched the system: "any AI coding agent", "Three-tier safety", "28 block rules", "zero dependencies", "no telemetry". Adapters under adapters/ were the single source of truth for capability evidence, but a stale last_verified could silently age out because no validator looked at the date.

This PR aligns every agent-facing surface with the current behavior and adds a freshness/schema gate for adapters.

What changes

Agent-facing docs

  • llms.txt + AGENTS.md: replaced "any agent" framing with the explicit verified-adapter set (claude, codex, cursor, gemini, opencode). Added sections for custom workflow stacks, the artifact-trust contract, and the privacy posture. Removed hardcoded rule counts; both now point at guard/rules.json as the source of truth.
  • bin/about.sh: headline changed from "Sprint quality framework" to "Local workflow framework". The adapter list is generated from adapters/*.json filenames at runtime (awk join — paste -sd ', ' alternates delimiters on macOS). Added the custom workflow stacks section and the privacy posture.
  • guard/SKILL.md + guard/bin/check-dangerous.sh header: replaced "three-tier permission system" with the layered pipeline the code actually runs (block rules → allowlist → phase-aware concurrency → in-project fast-path → sprint phase gate → budget gate → warn rules). The check-dangerous header comment matches the same order.
  • README.md + README.es.md: removed the literal block/warn rule counts ("35 block rules total", "9 warn rules total"); both now refer to guard/rules.json with a jq snippet to query the live count.

Adapter validation

  • bin/check-adapters.sh (new): validates every adapters/<host>.json against reference/host-adapter-schema.md:
    • Root must be a JSON object.
    • Required keys (host, schema_version, last_verified, verification, skill_discovery, bash_guard, write_guard, phase_gate, install_target, doctor_checks) are present.
    • schema_version is in the supported set (currently "1").
    • Scalar fields are strings (no install_target: 123).
    • Host enum: claude, cursor, codex, opencode, gemini.
    • host field must equal the filename basename (no mislabeled file).
    • Capability values are in the documented enum (unsupported, instructions_only, detectable, hooked, enforced, host_dependent).
    • skill_discovery is in its enum (native, rules_file, extension, skill_folder, ...).
    • verification is an object with method (in ci/manual/unknown) and non-empty evidence.
    • doctor_checks is a non-empty array of strings.
    • last_verified is a strict YYYY-MM-DD (no GNU-date relative shapes like yesterday), parseable, and not in the future.
    • Freshness policy: warn after 30 days, fail after 60 days for README-listed adapters. NANOSTACK_ALLOW_STALE_ADAPTERS=1 downgrades the fail to a warning for manual re-runs.
    • Single-host mode: bin/check-adapters.sh <host> validates one adapter. A filter with no matching file fails ("no adapters/.json found"). The cross-check that detects README-listed-but-missing adapters is scoped to the filter so a partial fixture does not produce spurious failures.
    • README cross-check anchors at $NANOSTACK_ROOT/README.md (not the caller's cwd) so the script works from any directory.
    • --json mode emits a parseable summary even on root-type failures.

Lint

Two new jobs in .github/workflows/lint.yml:

  • ai-facing-docs-consistency:

    • Forbidden-overclaim grep across llms.txt + AGENTS.md + bin/about.sh + guard/SKILL.md + READMEs (case-insensitive: "any AI coding agent", "any agent that reads SKILL", "zero dependencies", "full engineering team", "no telemetry, no remote calls", "three-tier guard/safety/permission system").
    • Every adapter filename under adapters/ must be mentioned in AGENTS.md, llms.txt, README.md, and README.es.md.
    • Reverse direction: no canonical adapter name (claude/cursor/codex/opencode/gemini) may be advertised when its JSON file is missing. Matches three shapes: inline backticked, "verified adapter X" copy, and bullet entries like - Cursor.
    • Sprint order matches the canonical default in bin/about.sh.
    • No hardcoded block/warn rule counts in any agent-facing surface.
  • adapter-freshness: runs bin/check-adapters.sh.

E2E

ci/e2e-adapter-freshness.sh (new): 15 cells, 43 checks. Covers fresh set passes, missing field fails, enum violation fails, README-listed missing fails, stale fails after 60 days, stale non-README warns, override downgrades to warn, --json parses, documented capability enum honored, empty capability fails, missing verification fails, unparseable date reported, host-filename mismatch fails, non-string doctor_checks fails, future date fails, non-object verification typed-fails, scalar type wrong fails, root-type wrong typed-fails, --json still parses on root-type failure, non-ISO date rejected, single-host filter scoped, filter typo fails.

What does not change

  • The adapter JSON schema is unchanged. Existing adapters/*.json files all pass the new validator on first run.
  • Default sprint order, guard pipeline behavior, and artifact contracts are documented; no runtime behavior is changed in this PR beyond a tightened doc grep.
  • All 14 existing E2E suites pass unchanged.

Codex review trail (9 iterations)

Pass Finding Fix
1 Capability enum drifted from reference/host-adapter-schema.md; required-field loop was truncated; empty-string capabilities passed; unparseable date silent-exited. Aligned enum, expanded required check, rejected empty strings, wrapped parse in set +e.
2 host field could mismatch the filename basename; reverse-adapter lint was one-directional. Added filename match check; reverse direction in lint.
3 README path was cwd-dependent; doctor_checks element type not checked. Anchored at $NANOSTACK_ROOT; element-type guard.
4 Filter typo silently passed; schema_version was not validated. Track FILTER_MATCHED; added SCHEMA_VERSION_ENUM.
5 Bullet form - Cursor missed by reverse lint; future dates suppressed freshness; verification non-object access crashed jq. Added bullet match shape; future-date guard; removed unguarded early read.
6 Non-object JSON root crashed jq; scalar types not checked; READMEs absent from lint. Added root-type check; scalar-type loop; extended both lint directions to READMEs.
7 GNU date -d accepted non-ISO values like "yesterday" on Ubuntu. Strict YYYY-MM-DD regex guard before parsing.
8 Single-host filter cross-checked unrelated README-listed adapters. Scope the cross-check to the filter.
9 Final pass clean. None.

Final codex pass clean: "I did not find a discrete bug that would break existing behavior or invalidate the new checks."

Tests

All checks green locally (15 suites):

  • ci/e2e-user-flows.sh: 100 checks.
  • ci/e2e-custom-stack-flows.sh: 40 checks.
  • ci/e2e-custom-stack-examples.sh: 51 checks.
  • ci/e2e-artifact-trust.sh: 29 checks.
  • ci/e2e-structured-artifacts.sh: 31 checks.
  • ci/e2e-graph-aware-session.sh: 61 checks.
  • ci/e2e-custom-routing.sh: 35 checks.
  • ci/e2e-adapter-freshness.sh: 43 checks (new).
  • ci/e2e-delivery-matrix.sh: 17 cells.
  • ci/e2e-examples.sh: 40 checks.
  • ci/e2e-think-flows.sh: 32 checks.
  • ci/e2e-think-archetypes.sh: 25 checks.
  • ci/e2e-onboarding-flows.sh: 34 checks.
  • tests/run.sh (local-only): 83 tests.
  • tests/e2e-user-flows.sh (local-only): 5 checks.

Total: 626 checks across 15 suites.

Files

  • bin/check-adapters.sh (new): adapter schema + freshness validator.
  • ci/e2e-adapter-freshness.sh (new): 15-cell, 43-check harness.
  • llms.txt: rewritten around verified adapters + custom workflow stacks.
  • AGENTS.md: rewritten to point at adapters/*.json as source of truth.
  • bin/about.sh: adapter list generated from filenames; custom-stack and privacy sections.
  • guard/SKILL.md: pipeline order documented; no hardcoded rule counts.
  • guard/bin/check-dangerous.sh: header comment matches the new pipeline.
  • README.md + README.es.md: removed hardcoded block/warn counts; reference guard/rules.json.
  • .github/workflows/lint.yml: ai-facing-docs-consistency + adapter-freshness jobs.
  • .github/workflows/e2e.yml: e2e-adapter-freshness job.

Spec

PR 6 of the 2026-05-10 Nanostack Architecture Audit vNext. Closes the P2 findings "Guard And Agent Docs Are Stale Relative To Current Behavior" and "Adapter Verification Has No Freshness Policy". Acceptance criteria from the spec, all met:

  • No "any AI coding agent", "any agent", "zero dependencies", "full engineering team", "no telemetry", or "three-tier guard" in public/agent-facing docs (forbidden-overclaim grep).
  • llms.txt, AGENTS.md, README.md, README.es.md, and bin/about.sh --print agree on adapter set, sprint order, telemetry framing, and framework positioning (lint locks).
  • Lint derives guard rule counts from guard/rules.json (no hardcoded counts allowed by lint).
  • A malformed adapter JSON fails lint (cells 2, 3, 7a, 7c, 7e, 7l, 7m).
  • A README-listed adapter missing from adapters/ fails lint (cell 4 + reverse direction in lint).
  • A stale adapter beyond threshold fails scheduled/manual verification (cells 5, 7j).

Next

PR 7: Secret Write Policy Parity. Closes the P2 "Secret Write Guard Is Narrower Than Secret Read Guard". Targets guard/bin/check-write.sh so real credential basenames (credentials.json, service-account*.json, firebase-adminsdk*.json) are blocked at write time, mirroring the read-time guard added in a previous round.

Test plan

  • bash ci/e2e-adapter-freshness.sh reports 43 of 43 checks passed.
  • bin/check-adapters.sh exits 0 against the live adapters/.
  • bin/check-adapters.sh --json returns a parseable summary.
  • All existing CI E2E suites still pass.
  • New lint jobs ai-facing-docs-consistency and adapter-freshness pass.

garagon added 9 commits May 11, 2026 02:59
… of architecture vNext)

The AI-facing surfaces (llms.txt, AGENTS.md, bin/about.sh,
guard/SKILL.md, plus the public READMEs) carried claims from earlier
rounds: "any AI coding agent", "Three-tier safety", "28 block rules",
"zero dependencies", "no telemetry". The README and AGENTS docs had
landed on the verified-adapters phrasing in a prior round, but the
agent-facing manifests had not been updated and the guard tier model
was already six tiers, not three. Adapters had no freshness policy:
adapters/<host>.json was the single source of truth for capability
evidence, but a stale last_verified could silently age out.

This PR aligns every agent-facing surface with the current shape of
the system and adds a freshness gate.

llms.txt + AGENTS.md
  Replaced "any agent" framing with the explicit verified-adapter
  set (claude, codex, cursor, gemini, opencode). Added sections for
  custom workflow stacks, the artifact trust contract, and the
  privacy posture. Removed hardcoded rule counts; the docs point
  at guard/rules.json as the source of truth.

bin/about.sh
  Headline changed from "Sprint quality framework" to "Local
  workflow framework". The adapter list is now generated from
  adapters/*.json filenames (awk join — paste -sd ', ' alternates
  delimiters on macOS). Added the custom workflow stacks section
  and the privacy posture. Sprint order
  (/think -> /nano -> build -> /review -> /security -> /qa -> /ship)
  stays the canonical default.

guard/SKILL.md + guard/bin/check-dangerous.sh
  Replaced "three-tier permission system" with the layered pipeline
  the code actually runs: block rules first, then allowlist, then
  phase-aware concurrency (PR 1 of this round), then in-project
  fast-path, then sprint phase gate, then budget gate, then warn
  rules. The header comment in check-dangerous.sh matches the same
  order. Hardcoded rule counts are gone; doc points at guard/rules.json
  with a jq snippet to query the live count.

README.md + README.es.md
  Removed the literal block-rule count ("35 block rules total",
  "9 warn rules total"). Both now refer to guard/rules.json.

bin/check-adapters.sh (new)
  Validates every adapters/<host>.json: required keys, host in the
  known set, capability fields in the enforcement enum,
  skill_discovery in its enum, last_verified parseable. Freshness
  policy: warn after 30 days, fail after 60 days for README-listed
  adapters, NANOSTACK_ALLOW_STALE_ADAPTERS=1 downgrades the fail to
  a warn for manual re-runs. Cross-check: a README-listed adapter
  missing from adapters/ fails. --json output for CI consumers.

ci/e2e-adapter-freshness.sh (new)
  8 cells, 13 checks against tmp adapters/ fixtures so the live
  repo adapters never need to be mutated to exercise the failure
  paths. Locks: fresh set passes, missing field fails, enum
  violation fails, README-listed-but-missing fails, stale README
  adapter fails after 60 days, stale non-listed adapter warns,
  override downgrades the fail, --json parses.

Lint adds two jobs:
  - ai-facing-docs-consistency: forbidden overclaim grep across
    llms.txt + AGENTS.md + bin/about.sh + guard/SKILL.md + READMEs;
    every adapter filename must appear in AGENTS.md and llms.txt;
    sprint order matches in bin/about.sh; no hardcoded block/warn
    rule counts in any agent-facing surface.
  - adapter-freshness: runs bin/check-adapters.sh.

Closes the P2 findings "Guard And Agent Docs Are Stale Relative To
Current Behavior" and "Adapter Verification Has No Freshness Policy"
from the 2026-05-10 Nanostack Architecture Audit vNext.
Four findings from the PR 6 first Codex review:

[P2] The capability enum the validator accepted
(enforced, reported, instructions_only, unsupported, unknown)
drifted from reference/host-adapter-schema.md, which documents
unsupported / instructions_only / detectable / hooked / enforced
/ host_dependent. An adapter using a documented value like
detectable or hooked failed the new lint while undocumented
values like reported passed. Aligned the enum with the schema doc
and added a separate verification.method enum (ci/manual/unknown).

[P2] The required-field loop only checked six fields, so an
adapter could omit schema_version, verification, install_target,
or doctor_checks and still pass. Expanded the loop to cover every
required field from the schema doc and added shape checks: the
verification block must be an object with a non-empty
method + evidence, and doctor_checks must be a non-empty array.

[P2] Empty-string capability values bypassed the enum check
because the previous code used `[ -n "$val" ] && in_enum`. A
README-listed adapter with `"bash_guard": ""` then passed as OK.
Empty values are now reported as `<field> is empty` errors.

[P3] An unparseable last_verified caused parse_iso_date to exit
non-zero under set -e, which short-circuited the rest of the
function and produced a silent exit instead of the documented
"does not parse as a date" failure. Wrapped the parse in
`set +e ... set -e` so record_result always runs.

Cells 7a-7d of ci/e2e-adapter-freshness.sh lock each fix:
documented enum values pass, empty capability fails, missing
verification fails with the explicit message, unparseable date
exits 1 with the documented diagnostic.
…s 2)

Two findings from the PR 6 second Codex review:

[P2] The schema requires adapters/<host>.json to match the .host
field, but check-adapters.sh only verified .host was in the known
set. A mislabeled file (cursor.json with host=claude) passed the
validator AND satisfied the README missing-file cross-check
because the path existed. CI could then ship a duplicated adapter
while advertising the wrong host as verified. The validator now
fails any adapter whose host field does not equal the filename
basename, with a clear "does not match filename" message.

[P3] The ai-facing-docs-consistency lint job was one-directional:
it required every JSON file under adapters/ to be mentioned in
AGENTS.md + llms.txt, but not the reverse. If an adapter file
was removed or renamed, the docs could keep advertising the old
adapter name forever and the lint stayed green. The job now also
checks each name in the canonical adapter set (claude, cursor,
codex, opencode, gemini) and fails when the docs reference a
name whose JSON file is missing.

Cell 7e of ci/e2e-adapter-freshness.sh locks the validator's
host-filename match: a cursor.json fixture with host=claude
exits 1 with the documented diagnostic.
… pass 3)

Two findings from the PR 6 third Codex review:

[P2] check-adapters.sh's README_LISTED computation greped the
plain "README.md" path, which resolves relative to the caller's
cwd. A script invoked from anywhere outside the repo root
produced an empty list, so an official adapter older than the
60-day threshold quietly downgraded to a warn and the command
exited 0. Anchored the path at $NANOSTACK_ROOT.

[P3] doctor_checks was checked for array shape + non-empty, but
not for element type. The schema in reference/host-adapter-schema.md
says string[], and downstream doctor/setup code uses each entry
as a check name; a numeric or object entry would silently break
runtime lookups. Added a jq element-type guard.

Cells 7f and 7g lock both: running check-adapters.sh from a
sibling directory still fails a stale README-listed adapter;
[123] for doctor_checks fails with the documented message.
Two findings from the PR 6 fourth Codex review:

[P2] Single-host mode (bin/check-adapters.sh <host>) silently
returned an empty success when the filter matched no adapter
file. A typo like `check-adapters.sh codxe` exited 0 and CI or a
maintainer could believe the requested adapter had been
validated. The script now tracks whether any file matched the
filter and reports `FAIL  <name>: no adapters/<name>.json found`
when zero matched.

[P2] schema_version was only checked for presence, not value.
An adapter declaring schema_version="2" passed even though
reference/host-adapter-schema.md says the current supported
version is "1". Added a SCHEMA_VERSION_ENUM (currently "1") and a
member check so a forward-incompatible adapter cannot land
silently. Bumping the schema means adding to the enum in the
same commit that updates downstream consumers.

Cells 7h and 7i lock both: a typo filter exits 1 with the
documented message; an adapter with schema_version="2" exits 1
with the documented "not in supported set" message.
…tion (codex pass 5)

Three findings from the PR 6 fifth Codex review:

[P2] The reverse adapter lint matched backticked `cursor` and the
"verified adapter ... cursor" copy in body text, but not the
common llms.txt shape "- Cursor" under "Verified adapters". So a
removed adapter's bullet could stay advertised. Added a third
match shape: a list-bullet line whose first non-space token is the
adapter name.

[P2] A future last_verified date (typo like 2099-01-01) produced a
negative age and the adapter stayed OK, suppressing every
freshness warning. The script now treats then_epoch > NOW_EPOCH
as a typed failure with the message
"last_verified=... is in the future".

[P3] The initial variable-extraction block read .verification.method
directly, so a non-object verification field crashed jq under set
-e before the typed-failure branch could report it. --json mode
emitted a raw jq error instead of a parseable summary. Removed
the early read; the method now comes from the type-guarded block
that already runs after the .verification | type == "object"
check.

Cells 7j and 7k lock both validator fixes: a 2099-01-01 fixture
exits 1 with the future-date message; a string-shaped
verification field reports "verification is not an object" and
exits 1.
Two findings from the PR 6 sixth Codex review:

[P2] An adapter file whose root is valid JSON but not an object
(e.g. `[]` or a scalar) passed the initial `jq -e .` check and
crashed the next `.host` read under set -e. --json mode then
emitted a raw jq error instead of a parseable summary, and the
remaining adapters were not checked. The validator now asserts
`type == "object"` on the root and reports
"root is not a JSON object" as a typed failure. Same gap on the
required scalar fields: only key presence was checked, so
`install_target: 123` passed. Added a scalar type-check loop for
host / schema_version / last_verified / skill_discovery /
bash_guard / write_guard / phase_gate / install_target so a
wrong-type value reports "is not a string".

[P2] The ai-facing-docs-consistency lint only loaded AGENTS.md +
llms.txt. README.md + README.es.md are the load-bearing public
claim for verified adapters, but they were absent from the loop
in both directions: a removed adapter could keep advertising in
the READMEs, and a new adapter could be missing from them.
Extended the forward (every adapter must be mentioned) and
reverse (no removed adapter may stay advertised) checks to the
two READMEs.

Cells 7l and 7m lock both: `[]` as the adapter root exits 1 with
the typed message and --json still parses; install_target: 123
exits 1 with the documented "is not a string" message.
GNU `date -d` on the Ubuntu CI runner accepts non-ISO values like
"yesterday" or "04/25/2026", so a malformed last_verified used to
parse cleanly and pass the freshness gate. Codex caught the
permissive parse on the PR 6 seventh review pass.

parse_iso_date now requires the value to match
[0-9]{4}-[0-9]{2}-[0-9]{2} before calling date(1). A non-matching
value drops to the "does not parse as a date" branch and exits 1
with the documented diagnostic.

Cell 7n locks both forms: "yesterday" and "04/25/2026" each exit
1 with the documented message.
In documented single-host mode (`check-adapters.sh <host>`), the
README cross-check still iterated over every README-listed
adapter. A partial checkout or test fixture whose README mentioned
claude AND cursor would make `check-adapters.sh claude` fail on
the missing cursor.json even though the caller asked only for
claude. Codex flagged the cross-host bleed on the PR 6 eighth
review pass.

The cross-check loop now `continue`s when $FILTER is set and the
README-listed name does not match the filter. No-filter mode is
unchanged.

Cell 7o locks both branches: filter=claude passes when cursor.json
is missing; the same setup with no filter still fails.
@garagon garagon merged commit 5416a42 into main May 11, 2026
60 checks passed
@garagon garagon deleted the ai-facing-docs-adapters branch May 11, 2026 09:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant