ai-facing-docs + adapters: align surfaces with current behavior (PR 6 of architecture vNext) #214
Merged
… of architecture vNext)
The AI-facing surfaces (llms.txt, AGENTS.md, bin/about.sh,
guard/SKILL.md, plus the public READMEs) carried claims from earlier
rounds: "any AI coding agent", "Three-tier safety", "28 block rules",
"zero dependencies", "no telemetry". The README and AGENTS docs had
landed on the verified-adapters phrasing in a prior round, but the
agent-facing manifests had not been updated and the guard tier model
was already six tiers, not three. Adapters had no freshness policy:
adapters/<host>.json was the single source of truth for capability
evidence, but a stale last_verified could silently age out.
This PR aligns every agent-facing surface with the current shape of
the system and adds a freshness gate.
llms.txt + AGENTS.md
Replaced "any agent" framing with the explicit verified-adapter
set (claude, codex, cursor, gemini, opencode). Added sections for
custom workflow stacks, the artifact trust contract, and the
privacy posture. Removed hardcoded rule counts; the docs point
at guard/rules.json as the source of truth.
bin/about.sh
Headline changed from "Sprint quality framework" to "Local
workflow framework". The adapter list is now generated from
adapters/*.json filenames (awk join — paste -sd ', ' alternates
delimiters on macOS). Added the custom workflow stacks section
and the privacy posture. Sprint order
(/think -> /nano -> build -> /review -> /security -> /qa -> /ship)
stays the canonical default.
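The filename-driven adapter list could be produced along these lines. This is a minimal sketch, not the code in bin/about.sh: the function name `adapter_list` and the directory argument are hypothetical, and it only assumes that adapters are stored as `<host>.json`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the adapter names found in <dir>/*.json as "a, b, c".
adapter_list() {
  local dir="$1" f
  for f in "$dir"/*.json; do
    [ -e "$f" ] || continue      # skip the unexpanded glob when the directory is empty
    basename "$f" .json
  done | awk 'NR > 1 { printf ", " } { printf "%s", $0 } END { if (NR) print "" }'
}
```

The awk join writes the full two-character separator between names, which avoids the behavior of `paste -d`, where the delimiter argument is treated as a list of single characters used in turn.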
guard/SKILL.md + guard/bin/check-dangerous.sh
Replaced "three-tier permission system" with the layered pipeline
the code actually runs: block rules first, then allowlist, then
phase-aware concurrency (PR 1 of this round), then in-project
fast-path, then sprint phase gate, then budget gate, then warn
rules. The header comment in check-dangerous.sh matches the same
order. Hardcoded rule counts are gone; doc points at guard/rules.json
with a jq snippet to query the live count.
README.md + README.es.md
Removed the literal block-rule count ("35 block rules total",
"9 warn rules total"). Both now refer to guard/rules.json.
bin/check-adapters.sh (new)
Validates every adapters/<host>.json: required keys, host in the
known set, capability fields in the enforcement enum,
skill_discovery in its enum, last_verified parseable. Freshness
policy: warn after 30 days, fail after 60 days for README-listed
adapters, NANOSTACK_ALLOW_STALE_ADAPTERS=1 downgrades the fail to
a warn for manual re-runs. Cross-check: a README-listed adapter
missing from adapters/ fails. --json output for CI consumers.
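As a rough illustration of the freshness policy above, the decision can be reduced to a small function. This is a sketch under assumptions: `freshness_status` is a hypothetical name, the age is passed in as a whole number of days, and the strict greater-than boundaries are a guess at what "after 30 days" and "after 60 days" mean in the real script.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the freshness policy; not the code in bin/check-adapters.sh.
# Args: <age in days> <"yes" if the adapter is README-listed>. Prints ok, warn, or fail.
freshness_status() {
  local age_days="$1" readme_listed="$2"
  if [ "$age_days" -gt 60 ] && [ "$readme_listed" = "yes" ]; then
    # The override downgrades the failure to a warning for manual re-runs.
    if [ "${NANOSTACK_ALLOW_STALE_ADAPTERS:-0}" = "1" ]; then
      echo warn
    else
      echo fail
    fi
  elif [ "$age_days" -gt 30 ]; then
    echo warn                     # stale but not README-listed, or only 31 to 60 days old
  else
    echo ok
  fi
}
```

For example, `freshness_status 90 yes` prints `fail`, while the same call with `NANOSTACK_ALLOW_STALE_ADAPTERS=1` set prints `warn`.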
ci/e2e-adapter-freshness.sh (new)
8 cells, 13 checks against tmp adapters/ fixtures so the live
repo adapters never need to be mutated to exercise the failure
paths. Locks: fresh set passes, missing field fails, enum
violation fails, README-listed-but-missing fails, stale README
adapter fails after 60 days, stale non-listed adapter warns,
override downgrades the fail, --json parses.
Lint adds two jobs:
- ai-facing-docs-consistency: forbidden overclaim grep across
llms.txt + AGENTS.md + bin/about.sh + guard/SKILL.md + READMEs;
every adapter filename must appear in AGENTS.md and llms.txt;
sprint order matches in bin/about.sh; no hardcoded block/warn
rule counts in any agent-facing surface.
- adapter-freshness: runs bin/check-adapters.sh.
Closes the P2 findings "Guard And Agent Docs Are Stale Relative To
Current Behavior" and "Adapter Verification Has No Freshness Policy"
from the 2026-05-10 Nanostack Architecture Audit vNext.
Four findings from the PR 6 first Codex review:
[P2] The capability enum the validator accepted (enforced, reported, instructions_only, unsupported, unknown) drifted from reference/host-adapter-schema.md, which documents unsupported / instructions_only / detectable / hooked / enforced / host_dependent. An adapter using a documented value like detectable or hooked failed the new lint while undocumented values like reported passed. Aligned the enum with the schema doc and added a separate verification.method enum (ci/manual/unknown).
[P2] The required-field loop only checked six fields, so an adapter could omit schema_version, verification, install_target, or doctor_checks and still pass. Expanded the loop to cover every required field from the schema doc and added shape checks: the verification block must be an object with a non-empty method + evidence, and doctor_checks must be a non-empty array.
[P2] Empty-string capability values bypassed the enum check because the previous code used `[ -n "$val" ] && in_enum`. A README-listed adapter with `"bash_guard": ""` then passed as OK. Empty values are now reported as `<field> is empty` errors.
[P3] An unparseable last_verified caused parse_iso_date to exit non-zero under set -e, which short-circuited the rest of the function and produced a silent exit instead of the documented "does not parse as a date" failure. Wrapped the parse in `set +e ... set -e` so record_result always runs.
Cells 7a-7d of ci/e2e-adapter-freshness.sh lock each fix: documented enum values pass, empty capability fails, missing verification fails with the explicit message, unparseable date exits 1 with the documented diagnostic.
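The empty-string bypass is easy to reproduce in isolation. The sketch below is illustrative only: `check_capability` and the "not in enum" wording are hypothetical, while the enum values are the ones the schema doc is said to document. With the old `[ -n "$val" ] && in_enum` shape, an empty value skipped the enum test entirely; here it is reported first.

```shell
#!/usr/bin/env bash
# Illustrative only; function names and the "not in enum" message are hypothetical.
CAPABILITY_ENUM="unsupported instructions_only detectable hooked enforced host_dependent"

in_enum() {  # in_enum <value> <space-separated enum>
  local v
  for v in $2; do
    if [ "$v" = "$1" ]; then return 0; fi
  done
  return 1
}

check_capability() {  # check_capability <field> <value>; prints a diagnostic on failure
  local field="$1" val="$2"
  if [ -z "$val" ]; then
    echo "$field is empty"        # an empty value is an error, not a skipped check
    return 1
  fi
  if ! in_enum "$val" "$CAPABILITY_ENUM"; then
    echo "$field=$val not in enum"
    return 1
  fi
}
```

Calling `check_capability bash_guard ""` prints `bash_guard is empty` and returns non-zero, which is the behavior cell 7b is described as locking.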
…s 2)
Two findings from the PR 6 second Codex review:
[P2] The schema requires adapters/<host>.json to match the .host field, but check-adapters.sh only verified .host was in the known set. A mislabeled file (cursor.json with host=claude) passed the validator AND satisfied the README missing-file cross-check because the path existed. CI could then ship a duplicated adapter while advertising the wrong host as verified. The validator now fails any adapter whose host field does not equal the filename basename, with a clear "does not match filename" message.
[P3] The ai-facing-docs-consistency lint job was one-directional: it required every JSON file under adapters/ to be mentioned in AGENTS.md + llms.txt, but not the reverse. If an adapter file was removed or renamed, the docs could keep advertising the old adapter name forever and the lint stayed green. The job now also checks each name in the canonical adapter set (claude, cursor, codex, opencode, gemini) and fails when the docs reference a name whose JSON file is missing.
Cell 7e of ci/e2e-adapter-freshness.sh locks the validator's host-filename match: a cursor.json fixture with host=claude exits 1 with the documented diagnostic.
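A host-to-filename check of the kind described here can be very small. In this sketch the function name and the exact message layout are hypothetical (only the phrase "does not match filename" comes from the description above), and the host value is passed in as an argument rather than read from the JSON file.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: compare an adapter's host field with its filename basename.
host_matches_filename() {  # <path to adapter file> <host value read from that file>
  local path="$1" host="$2" base
  base="$(basename "$path" .json)"
  if [ "$host" != "$base" ]; then
    echo "FAIL $base: host=$host does not match filename"
    return 1
  fi
}
```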
… pass 3)
Two findings from the PR 6 third Codex review:
[P2] check-adapters.sh's README_LISTED computation grepped the plain "README.md" path, which resolves relative to the caller's cwd. A script invoked from anywhere outside the repo root produced an empty list, so an official adapter older than the 60-day threshold quietly downgraded to a warn and the command exited 0. Anchored the path at $NANOSTACK_ROOT.
[P3] doctor_checks was checked for array shape + non-empty, but not for element type. The schema in reference/host-adapter-schema.md says string[], and downstream doctor/setup code uses each entry as a check name; a numeric or object entry would silently break runtime lookups. Added a jq element-type guard.
Cells 7f and 7g lock both: running check-adapters.sh from a sibling directory still fails a stale README-listed adapter; [123] for doctor_checks fails with the documented message.
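To show why the anchored path matters, here is a sketch that computes a README-listed set from an explicit root instead of the current directory. Everything in it is an assumption made for illustration: the `readme_listed` name, the word-match heuristic, and passing the root as an argument where the real script is described as using $NANOSTACK_ROOT.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: list the known hosts mentioned in <root>/README.md.
KNOWN_HOSTS="claude codex cursor gemini opencode"

readme_listed() {  # <repo root>
  local root="$1" h out=""
  for h in $KNOWN_HOSTS; do
    # The path is anchored at the root, so the caller's cwd does not matter.
    if grep -qiw -- "$h" "$root/README.md" 2>/dev/null; then
      out="$out $h"
    fi
  done
  echo "${out# }"
}
```

Because the README path is built from the root argument, the result is the same whether the function is called from the repo root or from a sibling directory.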
Two findings from the PR 6 fourth Codex review:
[P2] Single-host mode (bin/check-adapters.sh <host>) silently returned an empty success when the filter matched no adapter file. A typo like `check-adapters.sh codxe` exited 0 and CI or a maintainer could believe the requested adapter had been validated. The script now tracks whether any file matched the filter and reports `FAIL <name>: no adapters/<name>.json found` when zero matched.
[P2] schema_version was only checked for presence, not value. An adapter declaring schema_version="2" passed even though reference/host-adapter-schema.md says the current supported version is "1". Added a SCHEMA_VERSION_ENUM (currently "1") and a member check so a forward-incompatible adapter cannot land silently. Bumping the schema means adding to the enum in the same commit that updates downstream consumers.
Cells 7h and 7i lock both: a typo filter exits 1 with the documented message; an adapter with schema_version="2" exits 1 with the documented "not in supported set" message.
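The filter-match tracking can be sketched as a flag that is set inside the loop and checked after it. The function name `validate_adapters` and the `OK <name>` line are placeholders for whatever the real per-adapter validation does; only the failure message is taken from the description above.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of single-host mode with a "did anything match" flag.
validate_adapters() {  # <adapters dir> [host filter]
  local dir="$1" filter="${2:-}" matched=0 f name
  for f in "$dir"/*.json; do
    [ -e "$f" ] || continue
    name="$(basename "$f" .json)"
    if [ -n "$filter" ] && [ "$name" != "$filter" ]; then continue; fi
    matched=1
    echo "OK $name"               # stand-in for the real per-adapter checks
  done
  # A filter that matched no file is a failure, not an empty success.
  if [ -n "$filter" ] && [ "$matched" -eq 0 ]; then
    echo "FAIL $filter: no adapters/$filter.json found"
    return 1
  fi
}
```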
…tion (codex pass 5)
Three findings from the PR 6 fifth Codex review:
[P2] The reverse adapter lint matched backticked `cursor` and the "verified adapter ... cursor" copy in body text, but not the common llms.txt shape "- Cursor" under "Verified adapters". So a removed adapter's bullet could stay advertised. Added a third match shape: a list-bullet line whose first non-space token is the adapter name.
[P2] A future last_verified date (typo like 2099-01-01) produced a negative age and the adapter stayed OK, suppressing every freshness warning. The script now treats then_epoch > NOW_EPOCH as a typed failure with the message "last_verified=... is in the future".
[P3] The initial variable-extraction block read .verification.method directly, so a non-object verification field crashed jq under set -e before the typed-failure branch could report it. --json mode emitted a raw jq error instead of a parseable summary. Removed the early read; the method now comes from the type-guarded block that already runs after the .verification | type == "object" check.
Cells 7j and 7k lock both validator fixes: a 2099-01-01 fixture exits 1 with the future-date message; a string-shaped verification field reports "verification is not an object" and exits 1.
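The future-date guard amounts to one comparison before the age is computed. In this sketch the epochs are passed in as arguments so the arithmetic is visible; the function name is hypothetical and the real script's variable handling may differ.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: reject a last_verified that lies in the future, else print age in days.
age_days_or_fail() {  # <last_verified epoch> <now epoch> <raw last_verified string>
  local then_epoch="$1" now_epoch="$2" raw="$3"
  if [ "$then_epoch" -gt "$now_epoch" ]; then
    echo "last_verified=$raw is in the future"
    return 1
  fi
  echo $(( (now_epoch - then_epoch) / 86400 ))   # whole days, rounded down
}
```

Without the guard, a future date yields a negative age, which stays below both freshness thresholds and so reads as fresh.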
Two findings from the PR 6 sixth Codex review:
[P2] An adapter file whose root is valid JSON but not an object (e.g. `[]` or a scalar) passed the initial `jq -e .` check and crashed the next `.host` read under set -e. --json mode then emitted a raw jq error instead of a parseable summary, and the remaining adapters were not checked. The validator now asserts `type == "object"` on the root and reports "root is not a JSON object" as a typed failure. Same gap on the required scalar fields: only key presence was checked, so `install_target: 123` passed. Added a scalar type-check loop for host / schema_version / last_verified / skill_discovery / bash_guard / write_guard / phase_gate / install_target so a wrong-type value reports "is not a string".
[P2] The ai-facing-docs-consistency lint only loaded AGENTS.md + llms.txt. README.md + README.es.md are the load-bearing public claim for verified adapters, but they were absent from the loop in both directions: a removed adapter could keep advertising in the READMEs, and a new adapter could be missing from them. Extended the forward (every adapter must be mentioned) and reverse (no removed adapter may stay advertised) checks to the two READMEs.
Cells 7l and 7m lock both: `[]` as the adapter root exits 1 with the typed message and --json still parses; install_target: 123 exits 1 with the documented "is not a string" message.
GNU `date -d` on the Ubuntu CI runner accepts non-ISO values like
"yesterday" or "04/25/2026", so a malformed last_verified used to
parse cleanly and pass the freshness gate. Codex caught the
permissive parse on the PR 6 seventh review pass.
parse_iso_date now requires the value to match
[0-9]{4}-[0-9]{2}-[0-9]{2} before calling date(1). A non-matching
value drops to the "does not parse as a date" branch and exits 1
with the documented diagnostic.
Cell 7n locks both forms: "yesterday" and "04/25/2026" each exit
1 with the documented message.
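A shape guard of this kind can be written without date(1) at all. The sketch below uses a `case` glob as a portable stand-in for the regex quoted above; the function name and the "shape ok" output are hypothetical, and matching the whole string is an assumption about how the real check is anchored.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: accept only strings shaped like YYYY-MM-DD before calling date(1).
check_date_shape() {  # <last_verified value>
  case "$1" in
    [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9])
      echo "shape ok"
      ;;
    *)
      echo "does not parse as a date"
      return 1
      ;;
  esac
}
```

This only checks the shape: a value such as 2026-13-45 still passes the guard and has to be rejected by the date parse that follows.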
In documented single-host mode (`check-adapters.sh <host>`), the README cross-check still iterated over every README-listed adapter. A partial checkout or test fixture whose README mentioned claude AND cursor would make `check-adapters.sh claude` fail on the missing cursor.json even though the caller asked only for claude. Codex flagged the cross-host bleed on the PR 6 eighth review pass. The cross-check loop now `continue`s when $FILTER is set and the README-listed name does not match the filter. No-filter mode is unchanged. Cell 7o locks both branches: filter=claude passes when cursor.json is missing; the same setup with no filter still fails.
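The scoped cross-check can be sketched as a loop that skips README-listed names outside the filter. The function name, argument order, and failure wording are hypothetical; the README-listed names are passed in as a space-separated string to keep the example self-contained.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the README cross-check, scoped to an optional host filter.
cross_check() {  # <adapters dir> <space-separated README-listed names> [filter]
  local dir="$1" listed="$2" filter="${3:-}" name rc=0
  for name in $listed; do
    # In single-host mode, skip README-listed names the caller did not ask for.
    if [ -n "$filter" ] && [ "$name" != "$filter" ]; then continue; fi
    if [ ! -f "$dir/$name.json" ]; then
      echo "FAIL $name: README-listed but missing"
      rc=1
    fi
  done
  return "$rc"
}
```

With only claude.json present and a README that lists claude and cursor, the filtered call for claude stays silent, while the unfiltered call reports cursor as missing.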
Summary
The AI-facing surfaces (`llms.txt`, `AGENTS.md`, `bin/about.sh`, `guard/SKILL.md`, plus the public READMEs) carried claims from earlier rounds that no longer matched the system: "any AI coding agent", "Three-tier safety", "28 block rules", "zero dependencies", "no telemetry". Adapters under `adapters/` were the single source of truth for capability evidence, but a stale `last_verified` could silently age out because no validator looked at the date.
This PR aligns every agent-facing surface with the current behavior and adds a freshness/schema gate for adapters.
What changes
Agent-facing docs
- `llms.txt` + `AGENTS.md`: replaced "any agent" framing with the explicit verified-adapter set (claude, codex, cursor, gemini, opencode). Added sections for custom workflow stacks, the artifact-trust contract, and the privacy posture. Removed hardcoded rule counts; both now point at `guard/rules.json` as the source of truth.
- `bin/about.sh`: headline changed from "Sprint quality framework" to "Local workflow framework". The adapter list is generated from `adapters/*.json` filenames at runtime (awk join; `paste -sd ', '` alternates delimiters on macOS). Added the custom workflow stacks section and the privacy posture.
- `guard/SKILL.md` + `guard/bin/check-dangerous.sh` header: replaced "three-tier permission system" with the layered pipeline the code actually runs (block rules → allowlist → phase-aware concurrency → in-project fast-path → sprint phase gate → budget gate → warn rules). The check-dangerous header comment matches the same order.
- `README.md` + `README.es.md`: removed the literal block/warn rule counts ("35 block rules total", "9 warn rules total"); both now refer to `guard/rules.json` with a `jq` snippet to query the live count.
Adapter validation
- `bin/check-adapters.sh` (new): validates every `adapters/<host>.json` against `reference/host-adapter-schema.md`:
  - `schema_version` is in the supported set (currently `"1"`).
  - Required scalar fields must be strings (rejects `install_target: 123`).
  - `host` field must equal the filename basename (no mislabeled file).
  - Capability fields are in the documented enum (`unsupported`, `instructions_only`, `detectable`, `hooked`, `enforced`, `host_dependent`).
  - `skill_discovery` is in its enum (`native`, `rules_file`, `extension`, `skill_folder`, ...).
  - `verification` is an object with method (in `ci`/`manual`/`unknown`) and non-empty evidence.
  - `doctor_checks` is a non-empty array of strings.
  - `last_verified` is a strict `YYYY-MM-DD` (no GNU-date relative shapes like `yesterday`), parseable, and not in the future.
  - Freshness policy: warn after 30 days, fail after 60 days for README-listed adapters. `NANOSTACK_ALLOW_STALE_ADAPTERS=1` downgrades the fail to a warning for manual re-runs.
- `bin/check-adapters.sh <host>` validates one adapter. A filter with no matching file fails ("no adapters/<name>.json found"). The cross-check that detects README-listed-but-missing adapters is scoped to the filter so a partial fixture does not produce spurious failures.
- The README path is anchored at `$NANOSTACK_ROOT/README.md` (not the caller's cwd) so the script works from any directory.
- `--json` mode emits a parseable summary even on root-type failures.
Lint
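Two new jobs in `.github/workflows/lint.yml`:
- `ai-facing-docs-consistency`:
  - Every adapter file under `adapters/` must be mentioned in AGENTS.md, llms.txt, README.md, and README.es.md.
  - The reverse check also matches list-bullet lines such as `- Cursor`.
- `adapter-freshness`: runs `bin/check-adapters.sh`.
E2E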
- `ci/e2e-adapter-freshness.sh` (new): 15 cells, 43 checks. Covers fresh set passes, missing field fails, enum violation fails, README-listed missing fails, stale fails after 60 days, stale non-README warns, override downgrades to warn, --json parses, documented capability enum honored, empty capability fails, missing verification fails, unparseable date reported, host-filename mismatch fails, non-string doctor_checks fails, future date fails, non-object verification typed-fails, scalar type wrong fails, root-type wrong typed-fails, --json still parses on root-type failure, non-ISO date rejected, single-host filter scoped, filter typo fails.
What does not change
- `adapters/*.json` files all pass the new validator on first run.
Codex review trail (9 iterations)
- Pass 1: … `reference/host-adapter-schema.md`; required-field loop was truncated; empty-string capabilities passed; unparseable date silent-exited. Fix: … `set +e`.
- Pass 2: `host` field could mismatch the filename basename; reverse-adapter lint was one-directional.
- Pass 3: … `doctor_checks` element type not checked. Fix: … `$NANOSTACK_ROOT`; element-type guard.
- Pass 4: … `schema_version` was not validated. Fix: … `FILTER_MATCHED`; added `SCHEMA_VERSION_ENUM`.
- Pass 5: `- Cursor` missed by reverse lint; future dates suppressed freshness; verification non-object access crashed jq.
- Pass 7: `date -d` accepted non-ISO values like "yesterday" on Ubuntu. Fix: … `YYYY-MM-DD` regex guard before parsing.
Final codex pass clean: "I did not find a discrete bug that would break existing behavior or invalidate the new checks."
Tests
All checks green locally (15 suites):
- `ci/e2e-user-flows.sh`: 100 checks.
- `ci/e2e-custom-stack-flows.sh`: 40 checks.
- `ci/e2e-custom-stack-examples.sh`: 51 checks.
- `ci/e2e-artifact-trust.sh`: 29 checks.
- `ci/e2e-structured-artifacts.sh`: 31 checks.
- `ci/e2e-graph-aware-session.sh`: 61 checks.
- `ci/e2e-custom-routing.sh`: 35 checks.
- `ci/e2e-adapter-freshness.sh`: 43 checks (new).
- `ci/e2e-delivery-matrix.sh`: 17 cells.
- `ci/e2e-examples.sh`: 40 checks.
- `ci/e2e-think-flows.sh`: 32 checks.
- `ci/e2e-think-archetypes.sh`: 25 checks.
- `ci/e2e-onboarding-flows.sh`: 34 checks.
- `tests/run.sh` (local-only): 83 tests.
- `tests/e2e-user-flows.sh` (local-only): 5 checks.
Total: 626 checks across 15 suites.
Files
- `bin/check-adapters.sh` (new): adapter schema + freshness validator.
- `ci/e2e-adapter-freshness.sh` (new): 15-cell, 43-check harness.
- `llms.txt`: rewritten around verified adapters + custom workflow stacks.
- `AGENTS.md`: rewritten to point at adapters/*.json as source of truth.
- `bin/about.sh`: adapter list generated from filenames; custom-stack and privacy sections.
- `guard/SKILL.md`: pipeline order documented; no hardcoded rule counts.
- `guard/bin/check-dangerous.sh`: header comment matches the new pipeline.
- `README.md` + `README.es.md`: removed hardcoded block/warn counts; reference guard/rules.json.
- `.github/workflows/lint.yml`: `ai-facing-docs-consistency` + `adapter-freshness` jobs.
- `.github/workflows/e2e.yml`: `e2e-adapter-freshness` job.
Spec
PR 6 of the 2026-05-10 Nanostack Architecture Audit vNext. Closes the P2 findings "Guard And Agent Docs Are Stale Relative To Current Behavior" and "Adapter Verification Has No Freshness Policy". Acceptance criteria from the spec, all met:
Next
PR 7: Secret Write Policy Parity. Closes the P2 "Secret Write Guard Is Narrower Than Secret Read Guard". Targets `guard/bin/check-write.sh` so real credential basenames (`credentials.json`, `service-account*.json`, `firebase-adminsdk*.json`) are blocked at write time, mirroring the read-time guard added in a previous round.
Test plan
- `bash ci/e2e-adapter-freshness.sh` reports 43 of 43 checks passed.
- `bin/check-adapters.sh` exits 0 against the live adapters/.
- `bin/check-adapters.sh --json` returns a parseable summary.
- `ai-facing-docs-consistency` and `adapter-freshness` pass.