diff --git a/.shared/workflows/level-sync.md b/.shared/workflows/level-sync.md
index 4b06bb2..166db7c 100644
--- a/.shared/workflows/level-sync.md
+++ b/.shared/workflows/level-sync.md
@@ -18,6 +18,20 @@ flowchart TD
     FILTER -->|no| SHOW_ALL[Display capabilities for all levels]
 ```
 
+## Why a Single Source of Truth
+
+`docs/capability-levels.md` is the authoritative document for which capabilities belong to which level. The registry file `registry/levels.yml` is a generated artifact — never edited directly.
+
+The capability-level mapping is a design decision documented in prose with rationale. If `levels.yml` were editable independently, the prose document and the registry would inevitably drift apart. Generating from one source keeps the mapping consistent and the reasoning discoverable.
+
+## Why Three Subcommands
+
+- **sync** writes the registry — used when the design document changes and the registry must catch up.
+- **diff** reveals drift without modifying anything — safe to run as a check in CI or before a release.
+- **list** is a read-only query for human convenience — no file I/O, just formatted output.
+
+Separating these prevents accidental overwrites when you only wanted to inspect.
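Reviewer sketch (not part of the patch): the separation between the three subcommands can be illustrated as three small functions, where only one of them produces a new registry. This is a minimal sketch under stated assumptions. The function names, the `Levels` type, and the in-memory dictionaries standing in for `docs/capability-levels.md` and `registry/levels.yml` are all hypothetical and do not come from the repository.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical in-memory shape: level name -> capabilities in matrix order.
Levels = Dict[str, List[str]]


def diff_levels(source: Levels, registry: Levels) -> Dict[str, Tuple]:
    """Report drift as {level: (registry value, source value)}; modifies nothing."""
    drift = {}
    for level in sorted(set(source) | set(registry)):
        want, have = source.get(level), registry.get(level)
        if want != have:
            drift[level] = (have, want)
    return drift


def sync_levels(source: Levels) -> Levels:
    """Regenerate the registry from the source alone, preserving source order."""
    return {level: list(caps) for level, caps in source.items()}


def list_levels(registry: Levels, level: Optional[str] = None) -> str:
    """Format one level, or all levels when no filter is given; read-only."""
    chosen = registry if level is None else {level: registry.get(level, [])}
    return "\n".join(f"{name}: {', '.join(caps)}" for name, caps in chosen.items())
```

Because `sync_levels` never reads the old registry, running it twice on the same source gives the same result, which is the idempotency the Constraints section asks for.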
+
 ## Parsing Contract
 
 The matrix in `docs/capability-levels.md` under "Capability Assessment Matrix":
@@ -75,5 +89,5 @@ levels:
 ## Constraints
 
 - **Source of truth**: `docs/capability-levels.md` — never edit `registry/levels.yml` directly
-- **Idempotent**: Running `sync` twice produces identical output
-- **No reordering**: Capabilities listed in the order they appear in the matrix
+- **Idempotent**: Running `sync` twice produces identical output — safe to re-run without side effects, and makes CI checks deterministic
+- **No reordering**: Capabilities listed in the order they appear in the matrix — produces stable diffs and predictable output
diff --git a/.shared/workflows/qa-smoke-test.md b/.shared/workflows/qa-smoke-test.md
index 71e9d97..352ae96 100644
--- a/.shared/workflows/qa-smoke-test.md
+++ b/.shared/workflows/qa-smoke-test.md
@@ -20,15 +20,31 @@ flowchart TD
     CLEANUP --> PASS([PASS])
 ```
 
+## Why This Sequence
+
+The five steps form a progressive confidence chain — each step depends on the previous one succeeding and exercises a different workflow:
+
+1. **Create** exercises `rule-creation` — can we produce a valid rule from scratch?
+2. **Validate all** exercises `rule-validation` — does the new rule coexist with existing rules without breaking anything?
+3. **Update** exercises `rule-update` — can we modify a rule in place without corrupting it?
+4. **Re-validate** is the regression gate — did the update break what creation built? This catches the class of bugs where an update workflow silently damages fields that the creation workflow set correctly.
+5. **Cleanup** ensures the test is self-contained — no artifacts leak into the real rule set.
+
+Skipping step 4 would miss regression bugs. Skipping step 2 would miss cross-rule conflicts. The order mirrors the lifecycle of a real rule: create, validate, modify, re-validate.
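Reviewer sketch (not part of the patch): the confidence chain described above amounts to "run in order, stop at the first failure, clean up regardless". A minimal sketch, assuming each workflow step can be represented as a callable that returns `True` on success. `run_smoke_test`, `Step`, and the callables are hypothetical names used only for illustration.

```python
from typing import Callable, List, Tuple

# Hypothetical step shape: (label, callable returning True on success).
Step = Tuple[str, Callable[[], bool]]


def run_smoke_test(steps: List[Step], cleanup: Callable[[], None]) -> Tuple[bool, List[str]]:
    """Run steps in order, stop at the first failure, and always run cleanup."""
    executed: List[str] = []
    try:
        for name, step in steps:
            executed.append(name)
            if not step():
                # Later steps depend on this one, so there is no point continuing.
                return False, executed
        return True, executed
    finally:
        # Runs on success, on failure, and on exceptions alike.
        cleanup()
```

Putting cleanup in `finally` rather than as a fifth entry in `steps` is one way to make it unconditional: a failed update in step 3 still removes whatever step 1 created.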
+
+## Why Cleanup Must Always Run
+
+Test artifacts (the `CORE:S:9999` smoke test rule) would pollute real validation runs, appear in coordinate map checks, and confuse git status. Even if step 3 fails, the directory from step 1 still exists. Unconditional cleanup prevents stale test rules from accumulating.
+
 ## Test Sequence
 
-| Step | Workflow | Input | Verify With |
-|------|----------|-------|-------------|
-| 1 | rule-creation | `/generate-rule CORE:S:9999 structure "Smoke Test"` | qa-checklist.md#generate-rule |
-| 2 | rule-validation | all rules | qa-checklist.md#validate-rules |
-| 3 | rule-update | `/update-rule CORE:S:9999 "Add test pattern"` | qa-checklist.md#update-rule |
-| 4 | rule-validation | all rules | Test rule still passes |
-| 5 | cleanup | `rm -rf core/structure/smoke-test/` | directory deleted |
+| Step | Workflow        | Input                                            | Verify With                    |
+|------|-----------------|--------------------------------------------------|--------------------------------|
+| 1    | rule-creation   | `/generate-rule CORE:S:9999 structure "Smoke Test"` | qa-checklist.md#generate-rule  |
+| 2    | rule-validation | all rules                                        | qa-checklist.md#validate-rules |
+| 3    | rule-update     | `/update-rule CORE:S:9999 "Add test pattern"`    | qa-checklist.md#update-rule    |
+| 4    | rule-validation | all rules                                        | Test rule still passes         |
+| 5    | cleanup         | `rm -rf core/structure/smoke-test/`              | directory deleted              |
 
 ## When to Run
 
diff --git a/.shared/workflows/rule-creation.md b/.shared/workflows/rule-creation.md
index 3e6253c..50ffef3 100644
--- a/.shared/workflows/rule-creation.md
+++ b/.shared/workflows/rule-creation.md
@@ -10,16 +10,36 @@ flowchart TD
     DET --> SOURCES[Find backing sources in docs/sources.yml]
     SEM --> SOURCES
     MECH --> SOURCES
-    SOURCES --> GEN[Generate rule.md + rule.yml + tests/pass/ + tests/fail/]
+    SOURCES --> GEN[Generate rule.md + rule.yml + tests/pass/ + tests/fail]
     GEN --> RESOLVE[Resolve templates for validation]
     RESOLVE --> VALID{OpenGrep exit code?}
     VALID -->|0 or 1| SAVE[Save files with templates intact]
     VALID -->|2| FIX2[Fix syntax error] --> RESOLVE
     VALID -->|7| FIX7[Add positive pattern] --> RESOLVE
     SAVE --> REFS[Update coordinate-map, capability-levels if needed]
-    REFS --> CHANGELOG[/add-changelog-entry]
+    REFS --> CHANGELOG[add-changelog-entry]
 ```
 
+## Why Type Classification Comes First
+
+The type decision (mechanical / deterministic / semantic) sets the ceiling on what detection methods a rule can use:
+
+- **Mechanical** rules check structural facts — file exists, line count thresholds, section presence. No pattern matching. These are the cheapest to run and the most reliable, but can only detect what's countable or locatable.
+- **Deterministic** rules use OpenGrep patterns to match or reject content. They can detect specific textual violations without human judgment. Most rules land here.
+- **Semantic** rules require an LLM judgment call — the violation can't be reduced to a pattern. These are the most expensive and least deterministic, so they're a last resort, not a default.
+
+Choosing the type early prevents over-engineering (writing OpenGrep patterns for a rule that only needs `file_exists`) or under-engineering (trying to pattern-match something that genuinely needs judgment).
+
+## Why Sources Before Generation
+
+Every rule must reference at least one entry in `docs/sources.yml`. This is not bureaucracy — it's the evidence chain. A rule without a backing source is an arbitrary opinion. Sources ground rules in published best practices, official documentation, or empirical research, which makes the framework defensible when users ask "why does this rule exist?"
+
+If no existing source covers the rule, a new source entry must be added first. The rule and its justification enter the system together.
+
+## Why Validate Before Save
+
+The resolve → validate loop catches broken patterns before they're committed. A rule that passes schema validation but has malformed OpenGrep syntax would fail silently until someone runs the test harness — possibly much later, in a different context. Validating at creation time makes the failure immediate and attributable.
+
 ## Edge Cases
 
 **No existing source backs the rule:**
diff --git a/.shared/workflows/rule-update.md b/.shared/workflows/rule-update.md
index 4b11c25..aa824bf 100644
--- a/.shared/workflows/rule-update.md
+++ b/.shared/workflows/rule-update.md
@@ -17,6 +17,28 @@ flowchart TD
     UPDATEMD --> REPORT[Report changes]
 ```
 
+## Why Coordinates and Slugs Are Immutable
+
+Rule coordinates (e.g., `CORE:S:0005`) and directory slugs are external references. Other rules, changelogs, documentation, and the coordinate map all point to them. Renaming a coordinate would silently break every reference and create phantom entries in the registry.
+
+If a rule's scope changes enough to warrant a new coordinate, tombstone the old one and create a new rule.
+
+## Why Templates Must Survive the Save
+
+The resolve → validate → save cycle has a critical invariant: **stored files contain templates, never resolved values**.
+
+Templates like `{{instruction_files}}` make rules portable across agents. A core rule resolved for Claude (`**/CLAUDE.md`) would fail for Codex (`codex.md`). Resolution is ephemeral — it exists only for validation. Saving resolved values would lock a rule to a single agent's configuration.
+
+## Why the Fix Loop Exists
+
+OpenGrep exit codes signal distinct problems:
+
+- **Exit 2** (syntax error): The pattern itself is malformed. Fix the YAML/regex and re-validate.
+- **Exit 7** (no positive pattern): OpenGrep requires at least one positive match to anchor the rule. Add a `pattern` or `pattern-regex` before retrying.
+- **Exit 0 or 1** (valid): Pattern is syntactically correct regardless of whether it matched anything.
+
+The loop prevents saving patterns that would fail at runtime in the test harness.
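Reviewer sketch (not part of the patch): the exit-code handling above can be read as a small dispatch plus a retry loop. A minimal sketch, assuming validation is exposed as a callable that returns the exit code. `next_action`, `fix_loop`, and the `validate` and `fix` callables are hypothetical, and the `max_attempts` bound is an added assumption; the flowchart itself loops without a limit. Template resolution is assumed to happen inside `validate`, so the rule passed around keeps its templates.

```python
from typing import Callable


def next_action(exit_code: int) -> str:
    """Map an exit code, as described in the prose above, to the next workflow step."""
    if exit_code in (0, 1):
        return "save"  # syntactically valid, whether or not anything matched
    if exit_code == 2:
        return "fix-syntax"
    if exit_code == 7:
        return "add-positive-pattern"
    raise ValueError(f"exit code {exit_code} is not covered by this workflow")


def fix_loop(
    rule: str,
    validate: Callable[[str], int],
    fix: Callable[[str, str], str],
    max_attempts: int = 5,  # assumption: the documented loop has no explicit bound
) -> str:
    """Validate, fix, and re-validate until the rule is safe to save."""
    for _ in range(max_attempts):
        action = next_action(validate(rule))
        if action == "save":
            return rule
        rule = fix(rule, action)
    raise RuntimeError(f"rule still invalid after {max_attempts} attempts")
```

Raising on an unlisted exit code, rather than treating it as valid, keeps the sketch from saving a rule in a state the workflow does not describe.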
+
 ## Constraints
 
 **NEVER change:**
diff --git a/.shared/workflows/rule-validation.md b/.shared/workflows/rule-validation.md
index c93033a..a52f57d 100644
--- a/.shared/workflows/rule-validation.md
+++ b/.shared/workflows/rule-validation.md
@@ -17,6 +17,24 @@ flowchart TD
     NEXT -->|no| SUMMARY[Summary output]
 ```
 
+## Why Three Layers in This Order
+
+Validation runs schema, then contract, then OpenGrep. The ordering is deliberate:
+
+1. **Schema validation** catches structural errors (missing fields, wrong types, bad format) with zero external dependencies. It's the cheapest check and filters out rules that would cause confusing downstream failures.
+
+2. **Contract validation** confirms that `rule.md` and `rule.yml` agree — same coordinate, matching check IDs, consistent type declarations. This catches the class of bugs where one file was updated but the other wasn't. It requires both files to be schema-valid first.
+
+3. **OpenGrep validation** runs the actual patterns against the syntax checker. This is the most expensive step and requires template resolution (file I/O, agent config loading). Running it last means we only pay that cost for rules that are already structurally sound.
+
+Reversing the order would waste time running OpenGrep on rules with missing fields, or mask contract errors behind pattern syntax failures.
+
+## Why Template Resolution Happens Before OpenGrep Only
+
+Schema validation checks the template syntax itself — `{{instruction_files}}` must appear as-is in the stored file. Resolving templates before schema validation would hide template errors.
+
+OpenGrep, however, needs real glob paths to validate pattern syntax. A pattern targeting `{{instruction_files}}` is not a valid path — it must be resolved to `**/CLAUDE.md` (or equivalent) before the pattern engine can parse it.
+
 ## Template Resolution
 
 Before OpenGrep validation:
@@ -25,11 +43,11 @@ Before OpenGrep validation:
 2. Replace variables from `vars:` section
 3. Create temp resolved file for validation
 
-| Template | Example Value (claude) |
-|----------|------------------------|
+| Template                | Example Value (claude)                  |
+|-------------------------|-----------------------------------------|
 | `{{instruction_files}}` | `**/CLAUDE.md`, `.claude/rules/**/*.md` |
-| `{{rules_dir}}` | `.claude/rules` |
-| `{{skills_dir}}` | `.claude/skills` |
+| `{{rules_dir}}`         | `.claude/rules`                         |
+| `{{skills_dir}}`        | `.claude/skills`                        |
 
 ## Output Format
 
diff --git a/CHANGELOG.md b/CHANGELOG.md
index d302aba..5684893 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,13 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.4.1] - 2026-02-17
+
+Workflow documentation — added reasoning prose to all shared workflows.
+
+### Changed
+- **Workflows**: Added "why" prose sections to 5 workflows (level-sync, rule-validation, rule-update, rule-creation, qa-smoke-test) — mermaid flowcharts show the what, new prose sections explain the reasoning behind key decisions, constraints, and sequencing
+
 ## [0.4.0] - 2026-02-16
 
 Rules consolidation — 47 rules reduced to 17 focused rules, GitHub Copilot agent added, new governance and maintenance categories.
diff --git a/README.md b/README.md
index 2fc0362..1bdaf2a 100644
--- a/README.md
+++ b/README.md
@@ -3,7 +3,7 @@
 Validation rules for AI agent instruction files (CLAUDE.md, .cursorrules, copilot-instructions.md).
 Community-maintained.
 
-**Version:** 0.4.0
+**Version:** 0.4.1
 
 ### Pre-1.0 — moving fast, API still evolving, feedback welcome.
 
diff --git a/UNRELEASED.md b/UNRELEASED.md
index 1e3761d..79e701b 100644
--- a/UNRELEASED.md
+++ b/UNRELEASED.md
@@ -1 +1 @@
-# Unreleased
\ No newline at end of file
+# Unreleased
diff --git a/VERSION b/VERSION
index 1d0ba9e..267577d 100644
--- a/VERSION
+++ b/VERSION
@@ -1 +1 @@
-0.4.0
+0.4.1