custom-routing: phase_context contract for custom skills (PR 5 of architecture vNext)#213

Merged: garagon merged 5 commits into main from custom-routing-contract on May 11, 2026

@garagon garagon commented May 11, 2026

Summary

Custom skills could declare dependency edges (via phase_graph or depends_on), but nothing else. resolve.sh loaded their upstream artifacts and stopped. Skills that needed strict integrity reimplemented the check; skills that needed solutions or diarizations stayed dependency-blind.

This PR adds a phase_context block to .nanostack/config.json that resolve.sh reads when the active phase is custom. Skills declare what shape of context they need; the resolver applies the declared fields, loads matching artifacts, and surfaces the applied routing in a new routing block of its JSON output.

What changes

bin/resolve.sh reads phase_context.<phase> from the active config (project-local .nanostack/config.json OR global ~/.nanostack/config.json, same resolution as the phase registry). Fields:

| Field | Default | Effect |
| --- | --- | --- |
| trust | "normal" | "strict" rejects integrity_missing artifacts (so only verified paths land in upstream_artifacts); "normal" keeps the historical lenient load. upstream_status always reports the actual trust state. |
| upstream_required | [] | Surfaces declared-required upstreams in routing.upstream_required AND merges them into the lookup set, so a routed-only upstream still produces an upstream_status entry. |
| upstream_optional | [] | Surfaces declared-optional upstreams in routing.upstream_optional AND merges them into the lookup set. |
| max_age_days | per-phase default | Overrides the per-phase max age window. CLI --max-age still takes precedence. |
| solutions.tags | [] | Loads matching solutions from <store>/know-how/solutions (case-insensitive literal substring across file content). |
| solutions.limit | 10 (default applied for the lookup; surfaced in routing when tags are declared) | Caps the number of solution paths. |
| diarizations.paths / diarizations.keywords | [] | Loads diarizations whose subject: contains any declared path or keyword (case-insensitive literal substring). Does not require a git diff. |
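
For orientation, here is a minimal sketch of what a phase_context entry using these fields might look like. The phase name license-audit, every value, and the exact nesting of solutions and diarizations are assumptions inferred from the dotted field names above; the authoritative schema is the "Custom routing contract" section of reference/custom-stack-contract.md.

```shell
# Write a hypothetical project-local config into a throwaway directory.
# Phase name and all values are illustrative, not taken from the repo.
config_dir="$(mktemp -d)/.nanostack"
mkdir -p "$config_dir"
cat > "$config_dir/config.json" <<'EOF'
{
  "phase_context": {
    "license-audit": {
      "trust": "strict",
      "upstream_required": ["review"],
      "upstream_optional": ["security"],
      "max_age_days": 14,
      "solutions": { "tags": ["license"], "limit": 5 },
      "diarizations": {
        "paths": ["app/users/[id]/page.tsx"],
        "keywords": ["licensing"]
      }
    }
  }
}
EOF
echo "wrote $config_dir/config.json"
```

Omitting any field should fall back to the default listed in the table.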

resolve.sh's JSON output gains a routing block that surfaces the applied context. routing.declared is false for core phases and for custom phases without a context entry, so existing consumers that only read upstream_artifacts / upstream_status continue to work unchanged.
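
A consumer that wants to react to the new block can treat a missing routing object the same as routing.declared = false. The sketch below assumes jq is installed; both sample payloads are hypothetical and use only field names mentioned in this PR, so the real output may carry more keys.

```shell
# Hypothetical resolver outputs: one with the new routing block, one without.
resolver_output='{"upstream_artifacts":{"review":"/tmp/review.json"},"upstream_status":{"review":"verified"},"routing":{"declared":true,"trust":"strict"}}'
legacy_output='{"upstream_artifacts":{},"upstream_status":{}}'

# Print "true" only when routing.declared is present and true.
# jq's // operator falls back on both null and false, so a missing
# routing block and an explicit false give the same answer.
routing_declared() {
  printf '%s' "$1" | jq -r '.routing.declared // false'
}

routing_declared "$resolver_output"   # true
routing_declared "$legacy_output"     # false
```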

reference/custom-stack-contract.md gains a "Custom routing contract" section with the schema, field semantics, and a worked example. EXTENDING.md points to it.

ci/e2e-custom-routing.sh is a new harness: 10 cells, 35 checks. Covers backward compat, missing required upstream surfaces explicitly, strict trust drops integrity_missing, normal trust keeps it, max_age_days override, solution_tags filter (literal), diarization filter (literal), routed-only upstreams, global config fallback, JSON-safe diarization output (subjects with quotes/backslashes), default limit reported when tags are declared.

Lint adds custom-routing-contract: greps that resolve.sh reads phase_context, emits the routing block, and surfaces every routing_* field; and that the contract doc mentions phase_context.

What does not change

  • Core phases (plan, review, qa, security, ship, etc.) ignore phase_context. Their routing stays in the existing hardcoded case statement in resolve.sh.
  • Custom skills with no phase_context entry keep their pre-PR-5 behavior: upstreams from depends_on / phase_graph, no solutions or diarizations.
  • upstream_artifacts shape is preserved. upstream_status keeps its 5-value vocabulary (verified, integrity_missing, integrity_mismatch, missing, not_applicable).
  • The artifact-trust contract from PR 2 is unchanged; this PR just chooses when to apply strict vs lenient.
  • All existing E2E suites pass unchanged: user-flows, custom-stack-flows, custom-stack-examples, artifact-trust, structured-artifacts, graph-aware-session, delivery-matrix, examples, think-flows, think-archetypes, onboarding-flows.

Codex review trail (4 iterations)

| Pass | Finding | Fix |
| --- | --- | --- |
| 1 | Routed-only upstreams (declared in upstream_required / upstream_optional but not in depends_on) were echoed in routing.* but never resolved. | Merge those names into UPSTREAM before the lookup loop, dedupe against existing entries. |
| 2 | Phase context was only read from $NANOSTACK_STORE/config.json; the global ~/.nanostack/config.json fallback was missed. Diarization path/keyword filter used regex grep, so app/users/[id]/page.tsx matched app/users/i/page.tsx. | Use _nano_phases_resolve_config (same resolution as the registry). Switch diarization grep to -F. |
| 3 | Solution tags also used regex grep; next.js matched unrelated content. Diarization JSON was built via string concatenation, so a subject containing a quote produced invalid JSON. | Switch solution-tag grep to -F. Build the diarization array through jq with --arg path / --arg subject / --arg age_days. |
| 4 | routing.solutions.limit reported null when tags were declared but limit omitted, while the lookup applied the documented default (10). | Report the default in the routing block when tags are declared; stay null when no phase_context exists. |

Final pass clean: "I did not identify any actionable correctness issues introduced by the diff."

Tests

All checks green locally (14 suites):

  • ci/e2e-user-flows.sh: 100 checks.
  • ci/e2e-custom-stack-flows.sh: 40 checks.
  • ci/e2e-custom-stack-examples.sh: 51 checks.
  • ci/e2e-artifact-trust.sh: 29 checks.
  • ci/e2e-structured-artifacts.sh: 31 checks.
  • ci/e2e-graph-aware-session.sh: 61 checks.
  • ci/e2e-custom-routing.sh: 35 checks (new).
  • ci/e2e-delivery-matrix.sh: 17 cells.
  • ci/e2e-examples.sh: 40 checks.
  • ci/e2e-think-flows.sh: 32 checks.
  • ci/e2e-think-archetypes.sh: 25 checks.
  • ci/e2e-onboarding-flows.sh: 34 checks.
  • tests/run.sh (local-only): 83 tests.
  • tests/e2e-user-flows.sh (local-only): 5 checks.

Total: 583 checks across 14 suites.

Spec

PR 5 of the 2026-05-10 Nanostack Architecture Audit vNext. Closes the P2 finding "Custom Routing Is Dependency-Only". Acceptance criteria from the spec, all met:

  • A custom skill can ask for strict upstream artifacts without local helper code (cell 3).
  • A custom skill can request solution search tags (cells 6, 7d).
  • A custom skill can request diarizations by path or keyword (cells 7, 7c, 7e).
  • Missing required upstreams are explicit in upstream_status, not silently null (cell 2).
  • Backward compatibility: custom skills with no phase_context block keep current behavior (cell 1).

Files

  • bin/resolve.sh: phase_context reader + trust application + solution/diarization lookup + routing in output.
  • ci/e2e-custom-routing.sh (new): 10-cell, 35-check harness.
  • reference/custom-stack-contract.md: "Custom routing contract" section.
  • EXTENDING.md: pointer to the new contract.
  • .github/workflows/lint.yml: new custom-routing-contract job.
  • .github/workflows/e2e.yml: new e2e-custom-routing job.

Next

PR 6: AI-Facing Docs + Adapter Freshness. Closes the P2 findings "Guard And Agent Docs Are Stale Relative To Current Behavior" and "Adapter Verification Has No Freshness Policy". Targets llms.txt, AGENTS.md, bin/about.sh, guard/SKILL.md, adapters/*.json, plus a new bin/check-adapters.sh.

Test plan

  • bash ci/e2e-custom-routing.sh reports 35 of 35 checks passed.
  • A custom skill with phase_context.trust = "strict" rejects a legacy artifact (no integrity field) and the resolver reports upstream_status[phase] = "integrity_missing" + upstream_artifacts[phase] = null.
  • A custom skill with solutions.tags = ["license"] loads matching files from <store>/know-how/solutions.
  • All existing CI E2E suites still pass.
  • New lint job custom-routing-contract passes.
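
The strict-trust item in the test plan boils down to a small decision: which value lands in upstream_artifacts for a given status and trust level. The function below is a sketch of that rule only, not the code in bin/resolve.sh; artifact_for is a hypothetical helper, and treating every status other than verified and integrity_missing as "no artifact" is an assumption.

```shell
# Print the value that would be surfaced in upstream_artifacts.
# Args: <upstream_status> <trust> <artifact path>
artifact_for() {
  status="$1" trust="$2" path="$3"
  case "$status" in
    verified)
      printf '%s\n' "$path" ;;
    integrity_missing)
      # strict drops legacy artifacts; normal keeps the lenient load
      if [ "$trust" = "strict" ]; then printf 'null\n'; else printf '%s\n' "$path"; fi ;;
    *)
      # assumption: mismatch / missing / not_applicable never load
      printf 'null\n' ;;
  esac
}

artifact_for integrity_missing strict /tmp/review.json   # null
artifact_for integrity_missing normal /tmp/review.json   # /tmp/review.json
artifact_for verified          strict /tmp/review.json   # /tmp/review.json
```

In every case upstream_status would still report the actual state, so a consumer can tell a rejected artifact from a missing one.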

garagon added 5 commits May 11, 2026 02:24
…hitecture vNext)

Custom skills could only declare dependency edges. resolve.sh
loaded their upstream artifacts but did not honor any other
routing intent. Skills that needed strict integrity reimplemented
the check; skills that needed solutions or diarizations stayed
dependency-blind.

This change introduces a phase_context block in
.nanostack/config.json that resolve.sh reads when the active phase
is custom. Fields:

  trust:              "strict" rejects integrity_missing artifacts
                      (only verified paths land in upstream_artifacts);
                      "normal" keeps the historical lenient load.
  upstream_required:  list of phase names the skill considers
                      required. Missing required upstreams already
                      report upstream_status[phase] = "missing";
                      the routing block makes the intent explicit.
  upstream_optional:  list of phase names the skill considers
                      optional. Informational; consumers can soften
                      missing-warning logic.
  max_age_days:       per-phase age override. CLI --max-age still
                      takes precedence.
  solutions.tags:     non-empty list triggers a content-match load
                      from <store>/know-how/solutions, limited by
                      solutions.limit (default 10).
  diarizations.paths
  diarizations.keywords: non-empty lists load matching diarizations
                      whose subject contains any path or keyword
                      (case-insensitive substring). Does not require
                      a git diff to be present.
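
The max_age_days precedence in the field list can be sketched as a three-way fallback. effective_max_age is a hypothetical helper, the empty string stands in for "not provided", and 30 is the custom-phase default mentioned later in this PR; how resolve.sh actually threads these values is not shown here.

```shell
# Precedence sketch: CLI --max-age > phase_context max_age_days > per-phase default.
# Args: <cli value or ""> <declared max_age_days or ""> <per-phase default>
effective_max_age() {
  cli="$1" declared="$2" phase_default="$3"
  if [ -n "$cli" ]; then
    printf '%s\n' "$cli"
  elif [ -n "$declared" ]; then
    printf '%s\n' "$declared"
  else
    printf '%s\n' "$phase_default"
  fi
}

effective_max_age ""  ""   30   # 30
effective_max_age ""  "14" 30   # 14
effective_max_age "7" "14" 30   # 7
```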

bin/resolve.sh's JSON output gains a `routing` block that surfaces
the applied context block (routing.declared is false for core
phases and for custom phases without a context entry). Existing
consumers that only read upstream_artifacts and upstream_status
continue to work unchanged.

reference/custom-stack-contract.md gains a "Custom routing
contract" section with the schema, field semantics, and a
worked example. EXTENDING.md points to it from the "Two
starting points" section.

ci/e2e-custom-routing.sh is a new 8-cell harness (25 checks)
that locks each contract surface: backward compat (no
phase_context), missing required upstream surfaces explicitly,
strict trust drops integrity_missing, normal trust keeps it,
max_age_days override, solution_tags filter, diarization paths
filter, routing block shape. Wired into .github/workflows/e2e.yml.

Lint adds custom-routing-contract: greps that resolve.sh reads
phase_context and emits the routing block with every routing_*
field, and that the contract doc mentions phase_context.

Closes the P2 finding "Custom Routing Is Dependency-Only" from
the 2026-05-10 Nanostack Architecture Audit vNext.

When a custom skill declared upstream_required or upstream_optional
in phase_context without ALSO listing those phases in depends_on or
the graph, resolve.sh only echoed the lists in routing.* and never
actually resolved the artifact. upstream_status stayed empty for
those phases. Codex caught the gap on the PR 5 first review pass:
declaring upstream_optional: ["security"] without depends_on left
security absent from upstream_status entirely.

The phase_context reader now merges upstream_required and
upstream_optional into the UPSTREAM list (deduping against entries
already added from depends_on / phase_graph). The default age for
those merged entries follows routing.max_age_days when set,
otherwise the per-phase 30-day custom default. CLI --max-age still
takes precedence on the override pass.
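
The merge-and-dedupe step described above can be sketched in plain POSIX shell. This assumes UPSTREAM is a space-separated list of bare phase names, which may not match the real variable's format in resolve.sh; ROUTED is a hypothetical name for the combined upstream_required and upstream_optional lists.

```shell
# Entries already collected from depends_on / phase_graph.
UPSTREAM="review"
# Names declared in upstream_required + upstream_optional (illustrative).
ROUTED="review security"

for name in $ROUTED; do
  case " $UPSTREAM " in
    *" $name "*) ;;                     # already present, skip
    *) UPSTREAM="$UPSTREAM $name" ;;    # routed-only upstream, append
  esac
done

echo "$UPSTREAM"   # review security
```

Padding the list with spaces on both sides keeps a name such as "view" from matching inside "review".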

Cell 7a of ci/e2e-custom-routing.sh locks the behavior: a license-
audit skill that only depends on review can declare
upstream_optional: ["security"] and see security artifacts land
in upstream_artifacts + upstream_status.

… pass 2)

Two findings from the PR 5 second Codex review:

[P2] resolve.sh used $NANOSTACK_STORE/config.json directly when
reading phase_context. nano_phase_kind already consults the global
~/.nanostack/config.json as a fallback, so a project without a
local config silently lost its routing intent. resolve.sh now
calls _nano_phases_resolve_config (the same helper the registry
uses), so routing reads from whichever config the registry chose.

[P2] The diarization path/keyword filter used `grep -qi --` which
interprets regex metacharacters. A declared path like
app/users/[id]/page.tsx then matched unrelated subjects such as
app/users/i/page.tsx because [id] read as a character class.
Switched to `grep -qiF --` so the filters behave as literal
substring matches across the whole subject line.
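
The difference between the two grep invocations is easy to reproduce outside the resolver. The sketch below uses the path from this finding; matches is a hypothetical helper and the subject strings are made up for illustration.

```shell
declared_path='app/users/[id]/page.tsx'
decoy_subject='subject: app/users/i/page.tsx'
exact_subject='subject: app/users/[id]/page.tsx'

# usage: matches <grep flags> <pattern> <subject>; prints yes or no
matches() {
  if printf '%s\n' "$3" | grep $1 -- "$2"; then echo yes; else echo no; fi
}

regex_decoy=$(matches -qi  "$declared_path" "$decoy_subject")   # yes: [id] is a character class
fixed_decoy=$(matches -qiF "$declared_path" "$decoy_subject")   # no: -F compares literally
fixed_exact=$(matches -qiF "$declared_path" "$exact_subject")   # yes

echo "regex/decoy=$regex_decoy fixed/decoy=$fixed_decoy fixed/exact=$fixed_exact"
```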

Cells 7b and 7c lock both fixes: 7b sets phase_context only in the
fake-HOME global config and asserts routing.trust = strict
propagates; 7c declares an [id] path and asserts only the exact
subject loads (no decoy match).

…pass 3)

Two findings from the PR 5 third Codex review:

[P2] solution_tags also used `grep -lri`, which treats each tag as
a regex. A tag like "next.js" or "app/users/[id]" would match
unrelated solutions because of the `.` or `[id]` characters.
Switched to `grep -lriF` so the filter behaves as a literal
substring match, as the diarization paths now do.
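
The same effect shows up when grep lists files recursively, which is how the solutions lookup is described. This is a throwaway reproduction with two made-up files in a temp directory, not the layout of <store>/know-how/solutions.

```shell
solutions_dir="$(mktemp -d)"
printf 'Upgrading next.js middleware\n'        > "$solutions_dir/exact.md"
printf 'Notes on nextxjs, an unrelated tool\n' > "$solutions_dir/decoy.md"

# Count the files each variant reports; tr strips padding some wc builds add.
regex_hits=$(grep -lri  -- 'next.js' "$solutions_dir" | wc -l | tr -d ' ')
fixed_hits=$(grep -lriF -- 'next.js' "$solutions_dir" | wc -l | tr -d ' ')

echo "regex: $regex_hits, literal: $fixed_hits"   # regex: 2, literal: 1
```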

[P2] The custom-phase diarizations block built JSON via string
concatenation (`"$DIAR_RESULTS{...}"`). Any subject or path
containing a quote, backslash, or other JSON metacharacter
produced invalid JSON and broke the final `jq --argjson
diarizations` parse. The block now iterates jq calls so each
{path, subject, age_days} object is built with proper escaping.
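
To see why concatenation breaks, compare it with a jq-built object for a subject that contains quotes. This sketch assumes jq is installed; the path and age are made up, and converting age_days with tonumber is an assumption about the desired output type rather than something this PR states.

```shell
subject='app/"weird"/path.tsx'
path='/tmp/diarizations/weird.md'
age_days=3

# String concatenation: the embedded quotes terminate the JSON string early.
naive="{\"path\":\"$path\",\"subject\":\"$subject\",\"age_days\":$age_days}"

# jq escapes every --arg value; --arg always passes strings, hence tonumber.
safe=$(jq -cn --arg path "$path" --arg subject "$subject" --arg age_days "$age_days" \
  '{path: $path, subject: $subject, age_days: ($age_days | tonumber)}')

printf '%s' "$naive" | jq -e . >/dev/null 2>&1 || echo "naive: invalid JSON"
printf '%s' "$safe" | jq -r '.subject'   # app/"weird"/path.tsx
```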

Cells 7d and 7e lock both: 7d declares a "next.js" tag and asserts
only the exact-tag file loads (not a nextxjs decoy); 7e declares a
diarization path filter and asserts a subject like
app/"weird"/path.tsx lands intact in the JSON output.

…ss 4)

When a custom phase declared solutions.tags but omitted
solutions.limit, the resolver applied the documented default
(10) for the actual lookup but reported routing.solutions.limit
as null in its output. Consumers auditing the routing block
would see a value that matched neither the documented default
nor the actual lookup. Codex flagged the silent default on the PR 5
fourth review pass.

The output now reports 10 when tags are declared and limit is
omitted, and stays null when no phase_context was declared at
all (no implicit value leak). Cell 8a covers both branches.
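
The reporting rule from this pass can be written as a small fallback. reported_limit is a hypothetical helper and empty strings stand in for "not declared"; echoing an explicit limit when no tags are declared is an assumption, since this commit only describes the two branches exercised by cell 8a.

```shell
# Value surfaced as routing.solutions.limit.
# Args: <declared tags or ""> <declared limit or "">
reported_limit() {
  tags="$1" limit="$2"
  if [ -n "$limit" ]; then
    printf '%s\n' "$limit"     # explicit limit wins
  elif [ -n "$tags" ]; then
    printf '10\n'              # documented default, now reported
  else
    printf 'null\n'            # nothing declared, no implicit value
  fi
}

reported_limit "license" ""    # 10
reported_limit "license" "5"   # 5
reported_limit ""        ""    # null
```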
@garagon garagon merged commit 4adf4f6 into main May 11, 2026
58 checks passed
@garagon garagon deleted the custom-routing-contract branch May 11, 2026 05:50