18 changes: 18 additions & 0 deletions .github/workflows/e2e.yml
@@ -170,6 +170,24 @@ jobs:
chmod +x ci/e2e-onboarding-flows.sh
ci/e2e-onboarding-flows.sh

e2e-adapter-freshness:
name: E2E Adapter Schema + Freshness (8 cells)
runs-on: ubuntu-latest
timeout-minutes: 5
# PR 6 of the 2026-05-10 architecture audit. Locks bin/check-adapters.sh
# end-to-end against tmp adapters/ fixtures: malformed JSON fails,
# README-listed adapter missing fails, stale beyond 60 days fails
# for README-listed hosts, NANOSTACK_ALLOW_STALE_ADAPTERS=1
# downgrades to warn, --json output parses.
steps:
- uses: actions/checkout@v4
- name: jq is present
run: jq --version
- name: Run adapter freshness E2E
run: |
chmod +x ci/e2e-adapter-freshness.sh
ci/e2e-adapter-freshness.sh

e2e-custom-routing:
name: E2E Custom Routing Contract (8 cells)
runs-on: ubuntu-latest
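
The job comment above describes a three-way outcome for adapter freshness. A minimal sketch of that decision follows, assuming the adapter's age in days is already known. The function name `freshness_status` is hypothetical, and how `bin/check-adapters.sh` actually derives the age is not shown in this diff.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the freshness decision described in the job comment.
# Only the 60-day limit and NANOSTACK_ALLOW_STALE_ADAPTERS come from the diff.
set -eu

# Prints ok, warn, or fail for an adapter that is $1 days old.
freshness_status() {
  age_days=$1
  if [ "$age_days" -le 60 ]; then
    echo ok
  elif [ "${NANOSTACK_ALLOW_STALE_ADAPTERS:-0}" = "1" ]; then
    echo warn   # stale, but the override downgrades the failure
  else
    echo fail   # stale beyond 60 days and no override
  fi
}

fresh=$(NANOSTACK_ALLOW_STALE_ADAPTERS=0 freshness_status 10)
strict=$(NANOSTACK_ALLOW_STALE_ADAPTERS=0 freshness_status 90)
relaxed=$(NANOSTACK_ALLOW_STALE_ADAPTERS=1 freshness_status 90)
echo "fresh=$fresh strict=$strict relaxed=$relaxed"
```

Each call runs inside a command substitution, so the override variable does not leak between cases.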
116 changes: 116 additions & 0 deletions .github/workflows/lint.yml
@@ -3366,6 +3366,122 @@ jobs:
fi
echo "OK: guard concurrency tier uses the phase registry."

ai-facing-docs-consistency:
name: AI-facing docs agree on adapters, sprint, guard, privacy
runs-on: ubuntu-latest
# PR 6 of the 2026-05-10 architecture audit. llms.txt, AGENTS.md,
# bin/about.sh, guard/SKILL.md, and the public READMEs must not
# ship stale overclaims and must agree on the verified adapter
# set, the default sprint order, the layered guard structure,
# and the privacy posture. Lint catches drift; the architect
# round audited the surface state once.
permissions:
contents: read
steps:
- uses: actions/checkout@v4
- name: No forbidden overclaims in agent-facing docs
run: |
set -e
fail=0
patterns='any AI coding agent|any agent that reads SKILL|zero dependencies|full engineering team|no telemetry, no remote calls|three-tier guard|three-tier safety|three-tier permission system'
for f in llms.txt AGENTS.md bin/about.sh guard/SKILL.md README.md README.es.md; do
[ -f "$f" ] || continue
if grep -nEi "$patterns" "$f"; then
echo "FAIL: $f contains a forbidden overclaim above."
fail=1
fi
done
exit $fail
- name: Verified adapters set is the same across surfaces
run: |
set -e
fail=0
# Source of truth: filenames under adapters/.
adapters_truth=$(find adapters -maxdepth 1 -name "*.json" -type f \
| sed 's|.*/||; s|\.json$||' | sort | tr '\n' ' ')
# Every public surface that names verified adapters must
# mention each one shipped under adapters/. Codex flagged
# the partial coverage on the PR 6 sixth review pass: the
# READMEs are the load-bearing public claim, and they were
# not in this loop.
for name in $adapters_truth; do
[ -z "$name" ] && continue
for f in AGENTS.md llms.txt README.md README.es.md; do
if ! grep -qi -- "$name" "$f"; then
echo "FAIL: $f does not mention adapter '$name'"
fail=1
fi
done
done
# Reverse direction: AGENTS.md, llms.txt, and the READMEs
# must NOT advertise a verified adapter that adapters/ no
# longer ships. Codex flagged the one-directional check on
# the PR 6 second review pass: a removed adapter could stay
# advertised in the public docs forever.
known_adapters="claude cursor codex opencode gemini"
for candidate in $known_adapters; do
case " $adapters_truth " in
*" $candidate "*) ;;
*)
for f in AGENTS.md llms.txt README.md README.es.md; do
# Match three shapes:
# - inline backticked -> `cursor`
# - "verified adapter" copy -> "verified adapter cursor"
# - bullet entry under a "Verified adapters" header -> "- Cursor"
# Codex flagged the bare-bullet miss on the PR 6
# fifth review pass and the README omission on the
# sixth review pass.
if grep -qiE "(\`${candidate}\`|verified adapter[^.]*${candidate}|^[[:space:]]*[-*][[:space:]]+${candidate}\b)" "$f"; then
echo "FAIL: $f advertises adapter '$candidate' but adapters/${candidate}.json is missing."
fail=1
fi
done
;;
esac
done
exit $fail
- name: Default sprint order matches across surfaces
run: |
set -e
fail=0
# The canonical order is documented as
# /think -> /nano -> build -> /review -> /security -> /qa -> /ship.
# Spot-check that bin/about.sh lists the phases in that order on one line.
if ! grep -qE '/think.*/nano.*build.*/review.*/security.*/qa.*/ship' bin/about.sh; then
echo "FAIL: bin/about.sh sprint order does not match the canonical default."
fail=1
fi
exit $fail
- name: Guard doc references rules.json instead of a hand-maintained count
run: |
set -e
# Hand-maintained "28 block rules and 9 warn rules" used to
# live in guard/SKILL.md. PR 6 of the audit removed that
# hardcoding. If a future commit adds back a specific block
# or warn rule count to any public doc, the lint fails so the
# count never drifts from the JSON. Generic "rule count"
# mentions that point at guard/rules.json are still allowed.
if grep -nE '[0-9]+ (block|warn) rules' guard/SKILL.md README.md README.es.md AGENTS.md llms.txt bin/about.sh 2>/dev/null; then
echo "FAIL: a numeric block/warn rule count was found in a public doc."
echo " Replace with guard/rules.json as the source of truth."
exit 1
fi
echo "OK: no hardcoded rule counts."

adapter-freshness:
name: Adapter schema + freshness
runs-on: ubuntu-latest
# PR 6 of the 2026-05-10 architecture audit. Runs bin/check-adapters.sh
# so a stale or malformed adapter cannot ship to main.
permissions:
contents: read
steps:
- uses: actions/checkout@v4
- name: Run check-adapters.sh
run: |
chmod +x bin/check-adapters.sh
bin/check-adapters.sh

custom-routing-contract:
name: Custom routing contract wired into resolve.sh
runs-on: ubuntu-latest
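
The reverse-direction check in `ai-facing-docs-consistency` can be tried in isolation. The sketch below wraps the same regular expression in a hypothetical helper named `advertises` and feeds it a made-up doc fragment. It assumes GNU grep, since `\b` is an extension.

```shell
#!/usr/bin/env bash
# Sketch: does a doc still advertise an adapter in one of the three shapes
# the lint looks for? The helper name and sample text are hypothetical.
set -eu

# Reads text on stdin; returns 0 when adapter "$1" is advertised.
advertises() {
  candidate=$1
  grep -qiE "(\`${candidate}\`|verified adapter[^.]*${candidate}|^[[:space:]]*[-*][[:space:]]+${candidate}\b)"
}

doc=$(printf '%s\n' '## Verified adapters' '- Cursor' '- Claude Code')

if printf '%s\n' "$doc" | advertises cursor; then cursor_hit=yes; else cursor_hit=no; fi
if printf '%s\n' "$doc" | advertises gemini; then gemini_hit=yes; else gemini_hit=no; fi
echo "cursor=$cursor_hit gemini=$gemini_hit"
```

The bare bullet `- Cursor` is caught by the third alternative, which is the shape the fifth review pass found missing.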
31 changes: 16 additions & 15 deletions AGENTS.md
@@ -1,27 +1,28 @@
# Nanostack: Agent Discovery

This file lists all available skills for all supported agents (Claude Code, Cursor, Codex, OpenCode, Gemini CLI).
Each skill folder contains a `SKILL.md` for agent discovery and an `agents/openai.yaml` for OpenAI-compatible agents.
This file lists the skills shipped by Nanostack for the verified adapters: Claude Code, Cursor, OpenAI Codex, OpenCode, and Gemini CLI. Each skill folder contains a `SKILL.md` for adapter discovery and an `agents/openai.yaml` for OpenAI-compatible agents. Adapter capability evidence lives in `adapters/<host>.json`; treat the JSON as the single source of truth for what a given host actually enforces (hook execution, write guard, phase gate).

## Available Skills
## Available skills

| Skill | Directory | Description |
|-------|-----------|-------------|
| think | `think/` | Strategic product thinking. Three modes (Founder/Startup/Builder) with calibrated intensity. YC-grade forcing questions, CEO cognitive patterns, manual delivery test. |
| nano | `plan/` | Implementation planning. Scope assessment, step-by-step plans with verification, product standards. |
| review | `review/` | Two-pass code review. Structural then adversarial. Scope drift detection against plan. Conflict detection with /security. |
| qa | `qa/` | Quality assurance. Browser, API, CLI and debug testing with Playwright. WTF heuristic. |
| security | `security/` | Security audit. OWASP Top 10, STRIDE, dependency scanning. Cross-references /review for conflicts. Graded report (A-F). |
| ship | `ship/` | Shipping pipeline. PR creation, CI monitoring, post-merge verification. Generates sprint journal on success. |
| guard | `guard/` | Three-tier safety. Allowlist, in-project bypass, 28 block rules with safer alternatives. Configurable in guard/rules.json. |
| think | `think/` | Strategic product thinking with calibrated intensity per archetype. Saves a structured artifact (value proposition, scope mode, target user, narrowest wedge, key risk, premise validation). |
| nano | `plan/` | Implementation planning. Planned files, plan approval, scope assessment, product standards. |
| review | `review/` | Two-pass code review (structural + adversarial). Scope drift detection against /nano. Conflict precedence with /security. |
| qa | `qa/` | Browser, API, CLI, or debug testing. WTF heuristic. |
| security | `security/` | OWASP Top 10 + STRIDE audit. Cross-references /review for conflicts. |
| ship | `ship/` | Pre-flight, PR creation, CI monitoring, post-deploy verification. Generates the sprint journal on success. |
| guard | `guard/` | Block and warn rules on Bash + Write/Edit. Phase concurrency, sprint phase gate, and budget gate run inside the same pipeline. Rule counts live in `guard/rules.json`. |
| conductor | `conductor/` | Multi-agent sprint orchestrator. Parallel sessions via claim/complete protocol with atomic file locking. |

## Know-how Pipeline
## Custom workflow stacks

Skills automatically save artifacts to `.nanostack/` and cross-reference each other. `/ship` generates a sprint journal. The vault at `.nanostack/know-how/` works as an Obsidian vault. Run `bin/discard-sprint.sh` to clean up bad sessions.
Custom stacks declare their own phases in `.nanostack/config.json` (`custom_phases` + `phase_graph`) and live under `<store>/skills/<name>/`. They get the same lifecycle support as the built-in sprint (graph-aware progression, concurrency enforcement, artifact trust, schema validation, routing intent through `phase_context`). The contract is in `reference/custom-stack-contract.md`; `examples/custom-stack-template/compliance-release/` is a worked example.

## Usage
## Know-how pipeline

Skills automatically save artifacts to `.nanostack/`. Downstream skills read upstream artifacts through `bin/resolve.sh`, which honors the artifact-trust contract (PR 2) and the routing contract for custom skills (PR 5). `/ship` generates a sprint journal. `bin/discard-sprint.sh` cleans up bad sessions.

Each skill's `SKILL.md` contains the full instructions. Read it and follow the process described.
## Usage

Supporting files (templates, references, checklists, scripts) are in subdirectories. Read them when referenced by the SKILL.md.
Each skill's `SKILL.md` contains the full instructions. Read it and follow the process described. Supporting files (templates, references, checklists, scripts) live in subdirectories and are referenced from the SKILL.md when needed.
4 changes: 2 additions & 2 deletions README.es.md
@@ -201,14 +201,14 @@ Los agentes cometen errores. Corren `rm -rf` cuando querían `rm -r`, hacen forc

Cada comando de Bash pasa por estos seis tiers, en este orden:

1. **Block rules**: las reglas de bloqueo corren primero. 35 reglas cubren borrado masivo (`rm -rf .`, `find . -delete`), destrucción de historia (`git push --force`), lecturas de secretos (`.env`, `*.pem`), drops de DB, deploys a producción y ejecución remota (`curl | sh`). Una coincidencia bloquea aunque el binario esté en el allowlist de abajo.
1. **Block rules**: las reglas de bloqueo corren primero. Cubren borrado masivo (`rm -rf .`, `find . -delete`), destrucción de historia (`git push --force`), lecturas de secretos (`.env`, `*.pem`), drops de DB, deploys a producción y ejecución remota (`curl | sh`). Una coincidencia bloquea aunque el binario esté en el allowlist de abajo. La fuente de verdad es [`guard/rules.json`](guard/rules.json); para ver el conteo actual: `jq '[.tiers.block.rules[].id] | length' guard/rules.json`.
2. **Allowlist**: para comandos que pasaron las block rules, los allowlisteados (`git status`, `ls`, `cat`, `jq`, etc.) saltan el resto.
3. **In-project**: operaciones que solo tocan archivos del repo actual pasan. El control de versiones es la red de seguridad.
4. **Concurrencia por fase**: durante fases read-only (review, qa, security), las operaciones de escritura quedan bloqueadas para evitar race conditions.
5. **Phase gate**: cuando hay un sprint activo, `git commit` y `git push` quedan bloqueados hasta que existan artifacts frescos de review, security y qa.
6. **Budget gate**: cuando el sprint tiene un presupuesto y se gastó 95%+, todos los comandos no-allowlist quedan bloqueados.

Plus 9 reglas de advertencia para operaciones que requieren atención sin llegar a bloqueo.
Más un tier de reglas de advertencia (`warn`) para operaciones que requieren atención sin llegar a bloqueo. Las definiciones también viven en `guard/rules.json`.

Las herramientas Write, Edit y MultiEdit pasan por su propio hook (`guard/bin/check-write.sh`) que niega rutas protegidas: archivos de secretos (`.env` y variantes, `*.pem`, `*.key`, llaves SSH) y directorios de sistema o usuario-secreto (`/etc`, `/var`, `/usr/bin`, `~/.ssh`, `~/.aws`, `~/.kube`). Los symlinks se resuelven antes de matchear, así que un `mylink/config -> ~/.ssh/config` se trata como destino resuelto.

4 changes: 2 additions & 2 deletions README.md
@@ -443,7 +443,7 @@ AI agents make mistakes. They run `rm -rf` when they mean `rm -r`, force push to

Inspired by [Claude Code auto mode](https://www.anthropic.com/engineering/claude-code-auto-mode), guard evaluates every Bash command through six tiers in this order:

**Tier 1: Block rules.** Patterns for mass deletion, history destruction, database drops, production deploys, remote code execution, secret reads, security degradation and safety bypasses run first. A match exits 1 immediately, even if the command's binary is on the allowlist below. This ordering closes the bypass class where `find . -delete` or `cat .env` slipped past Tier 2 because `find` and `cat` were on the allowlist. 35 block rules total.
**Tier 1: Block rules.** Patterns for mass deletion, history destruction, database drops, production deploys, remote code execution, secret reads, security degradation and safety bypasses run first. A match exits 1 immediately, even if the command's binary is on the allowlist below. This ordering closes the bypass class where `find . -delete` or `cat .env` slipped past Tier 2 because `find` and `cat` were on the allowlist. Block rule definitions live in [`guard/rules.json`](guard/rules.json); query the live count with `jq '[.tiers.block.rules[].id] | length' guard/rules.json`.

**Tier 2: Allowlist.** After block rules clear, commands like `git status`, `ls`, `cat`, `jq` skip the remaining checks. They are read-only or otherwise side-effect-free for safe arguments.

@@ -455,7 +455,7 @@ Inspired by [Claude Code auto mode](https://www.anthropic.com/engineering/claude

**Tier 6: Budget gate.** When a sprint budget is set and 95%+ spent, all non-allowlisted commands are blocked. The agent can still run safe commands (`ls`, `git status`, `cat`) to save work, but cannot execute builds, tests, or deploys. Bypass with `NANOSTACK_SKIP_BUDGET=1`.

Plus a Tier 7 of warn rules for operations that need attention but not blocking. 9 warn rules total.
Plus a Tier 7 of warn rules for operations that need attention but not blocking. Warn rule definitions also live in `guard/rules.json`.
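
As a quick illustration of the count query, the sketch below runs it against a throwaway file. The rule ids are made up, and the `.tiers.warn.rules` path is an assumption that mirrors the block path quoted above; check the real `guard/rules.json` before relying on it. Requires `jq`.

```shell
#!/usr/bin/env bash
# Sketch: count rules with jq against a hypothetical stand-in for rules.json.
set -eu

rules=$(mktemp)
trap 'rm -f "$rules"' EXIT

# Hypothetical sample; only the .tiers.block.rules[].id shape is from the README.
cat > "$rules" <<'JSON'
{
  "tiers": {
    "block": { "rules": [ { "id": "example-mass-delete" }, { "id": "example-force-push" } ] },
    "warn":  { "rules": [ { "id": "example-warn" } ] }
  }
}
JSON

block_count=$(jq '[.tiers.block.rules[].id] | length' "$rules")
# Assumed path for warn rules, by analogy with the block tier.
warn_count=$(jq '[.tiers.warn.rules[].id] | length' "$rules")
echo "block=$block_count warn=$warn_count"
```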

### Write and Edit are hooked too

25 changes: 22 additions & 3 deletions bin/about.sh
@@ -1,7 +1,8 @@
#!/usr/bin/env bash
# about.sh — Generate compact self-description for agents
# Writes .nanostack/ABOUT.md with skills, flow, key commands.
# Any agent (Cursor, Codex, Claude Code) can read this to understand nanostack.
# Verified adapters: Claude Code, Cursor, OpenAI Codex, OpenCode, Gemini CLI.
# Adapter capabilities live in adapters/<host>.json.
#
# Usage: about.sh Generate/update ABOUT.md
# about.sh --print Print to stdout instead of file
@@ -21,9 +22,21 @@ SESSIONS=$(find "$NANOSTACK_STORE/sessions" -name "*.json" -type f 2>/dev/null |
HAS_CONFIG="no"
[ -f "$NANOSTACK_STORE/config.json" ] && HAS_CONFIG="yes"

# Adapter list: read names from adapters/*.json so this stays in sync
# with the single source of truth. Falls back to the canonical five if
# the adapters directory is missing or empty. `paste -sd ', '` cycles
# through the delimiter list one character per join (comma, then
# space), so we use awk for a clean comma-space join.
ADAPTER_LIST=""
if [ -d "$NANOSTACK_ROOT/adapters" ]; then
ADAPTER_LIST=$(find "$NANOSTACK_ROOT/adapters" -maxdepth 1 -name "*.json" -type f 2>/dev/null \
| sed 's|.*/||; s|\.json$||' | sort | awk 'NR>1{printf ", "} {printf "%s",$0} END{print ""}')
fi
[ -z "$ADAPTER_LIST" ] && ADAPTER_LIST="claude, codex, cursor, gemini, opencode"

DOC="# Nanostack

Sprint quality framework. Turns your AI agent into an engineering team.
Local workflow framework for AI coding agents. The built-in sprint plus a framework for declaring your own custom workflow stacks. Verified adapters: $ADAPTER_LIST.

## Flow

@@ -56,14 +69,20 @@ Sprint quality framework. Turns your AI agent into an engineering team.
| bin/doctor.sh | Know-how health check. |
| bin/capture-failure.sh | Log what went wrong (no /compound needed). |

## Custom workflow stacks

Declare your own phases in \`.nanostack/config.json\` (\`custom_phases\` + \`phase_graph\`) and put the skill under \`<store>/skills/<name>/\`. Conductor scheduling, guard concurrency, the artifact contract, session lifecycle, next-step output, and the resolver all consume the same phase registry. See \`reference/custom-stack-contract.md\` and \`examples/custom-stack-template/compliance-release/\`.

## State

All data in \`.nanostack/\`:
- Artifacts: \`.nanostack/<phase>/<timestamp>.json\`
- Artifacts: \`.nanostack/<phase>/<timestamp>.json\` with SHA-256 integrity field.
- Solutions: \`.nanostack/know-how/solutions/{bug,pattern,decision}/\`
- Briefs: \`.nanostack/know-how/briefs/\`
- Audit log: \`.nanostack/audit.log\`

There is no Nanostack cloud. Telemetry is opt-in and documented in \`reference/telemetry.md\`.

## This Project

- Solutions: $SOLUTIONS
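
The comment in `bin/about.sh` explains why the adapter list is joined with awk rather than `paste`. A minimal sketch of the difference, using three sample names in place of the real `adapters/*.json` listing:

```shell
#!/usr/bin/env bash
# Sketch: compare paste and awk for a comma-space join.
set -eu

names=$(printf '%s\n' opencode claude cursor)

# paste treats ', ' as a list of single-character delimiters and cycles
# through it: comma for the first join, space for the second, and so on.
pasted=$(printf '%s\n' "$names" | sort | paste -sd ', ' -)

# awk prints ", " before every record except the first.
joined=$(printf '%s\n' "$names" | sort | awk 'NR>1{printf ", "} {printf "%s",$0} END{print ""}')

echo "paste: $pasted"   # paste: claude,cursor opencode
echo "awk:   $joined"   # awk:   claude, cursor, opencode
```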