18 changes: 16 additions & 2 deletions .shared/workflows/level-sync.md
@@ -18,6 +18,20 @@ flowchart TD
FILTER -->|no| SHOW_ALL[Display capabilities for all levels]
```

## Why a Single Source of Truth

`docs/capability-levels.md` is the authoritative document for which capabilities belong to which level. The registry file `registry/levels.yml` is a generated artifact — never edited directly.

The capability-level mapping is a design decision documented in prose with rationale. If `levels.yml` were editable independently, the prose document and the registry would inevitably drift apart. Generating from one source keeps the mapping consistent and the reasoning discoverable.

## Why Three Subcommands

- **sync** writes the registry — used when the design document changes and the registry must catch up.
- **diff** reveals drift without modifying anything — safe to run as a check in CI or before a release.
- **list** is a read-only query for human convenience: it writes no files and only prints formatted output.

Separating these prevents accidental overwrites when you only wanted to inspect.
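
The split between inspecting and writing can be pictured with a minimal Python sketch. The function names (`render_registry`, `diff_registry`, `sync_registry`) and the YAML shape are hypothetical, and the level-to-capability mapping is assumed to be parsed already; this is an illustration, not the project's implementation.

```python
from pathlib import Path


def render_registry(levels: dict[str, list[str]]) -> str:
    """Render a level -> capabilities mapping as YAML-like text (assumed shape)."""
    lines = ["levels:"]
    # dicts and lists keep insertion order, so nothing is reordered here
    for level, capabilities in levels.items():
        lines.append(f"  {level}:")
        lines.extend(f"    - {cap}" for cap in capabilities)
    return "\n".join(lines) + "\n"


def diff_registry(levels: dict[str, list[str]], registry_path: Path) -> bool:
    """Return True when the registry has drifted from the source. Never writes."""
    current = registry_path.read_text() if registry_path.exists() else ""
    return current != render_registry(levels)


def sync_registry(levels: dict[str, list[str]], registry_path: Path) -> None:
    """Overwrite the registry with freshly generated content."""
    registry_path.write_text(render_registry(levels))
```

Because `sync_registry` always writes the full rendered text, running it twice leaves the file byte-identical, and `diff_registry` can run in CI without any risk of modifying the registry.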

## Parsing Contract

The matrix in `docs/capability-levels.md` under "Capability Assessment Matrix":
@@ -75,5 +89,5 @@ levels:
## Constraints

- **Source of truth**: `docs/capability-levels.md` — never edit `registry/levels.yml` directly
- **Idempotent**: Running `sync` twice produces identical output
- **No reordering**: Capabilities listed in the order they appear in the matrix
- **Idempotent**: Running `sync` twice produces identical output — safe to re-run without side effects, and makes CI checks deterministic
- **No reordering**: Capabilities listed in the order they appear in the matrix — produces stable diffs and predictable output
30 changes: 23 additions & 7 deletions .shared/workflows/qa-smoke-test.md
@@ -20,15 +20,31 @@ flowchart TD
CLEANUP --> PASS([PASS])
```

## Why This Sequence

The five steps form a progressive confidence chain: each step depends on the previous one succeeding, and together they exercise the creation, validation, and update workflows:

1. **Create** exercises `rule-creation` — can we produce a valid rule from scratch?
2. **Validate all** exercises `rule-validation` — does the new rule coexist with existing rules without breaking anything?
3. **Update** exercises `rule-update` — can we modify a rule in place without corrupting it?
4. **Re-validate** is the regression gate — did the update break what creation built? This catches the class of bugs where an update workflow silently damages fields that the creation workflow set correctly.
5. **Cleanup** ensures the test is self-contained — no artifacts leak into the real rule set.

Skipping step 4 would miss regression bugs. Skipping step 2 would miss cross-rule conflicts. The order mirrors the lifecycle of a real rule: create, validate, modify, re-validate.

## Why Cleanup Must Always Run

Test artifacts (the `CORE:S:9999` smoke test rule) would pollute real validation runs, appear in coordinate map checks, and confuse git status. Even if step 3 fails, the directory from step 1 still exists. Unconditional cleanup prevents stale test rules from accumulating.
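
Unconditional cleanup maps naturally onto `try`/`finally`. The sketch below is a hypothetical Python runner (the name `run_smoke_test` and the callable-based steps are assumptions for illustration); the actual cleanup step is the `rm -rf` command in the test sequence.

```python
import shutil
from pathlib import Path
from typing import Callable


def run_smoke_test(steps: list[Callable[[], None]], artifact_dir: Path) -> bool:
    """Run steps in order and stop at the first failure.

    The artifact directory is removed whether the steps pass or fail.
    """
    try:
        for step in steps:
            step()
        return True
    except Exception:
        return False
    finally:
        # Runs on success and on failure, so no test rule is left behind
        shutil.rmtree(artifact_dir, ignore_errors=True)
```

If the update step raises, the function still returns `False` only after the directory created by the first step has been deleted.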

## Test Sequence

| Step | Workflow | Input | Verify With |
|------|----------|-------|-------------|
| 1 | rule-creation | `/generate-rule CORE:S:9999 structure "Smoke Test"` | qa-checklist.md#generate-rule |
| 2 | rule-validation | all rules | qa-checklist.md#validate-rules |
| 3 | rule-update | `/update-rule CORE:S:9999 "Add test pattern"` | qa-checklist.md#update-rule |
| 4 | rule-validation | all rules | Test rule still passes |
| 5 | cleanup | `rm -rf core/structure/smoke-test/` | directory deleted |
| Step | Workflow | Input | Verify With |
|------|-----------------|--------------------------------------------------|--------------------------------|
| 1 | rule-creation | `/generate-rule CORE:S:9999 structure "Smoke Test"` | qa-checklist.md#generate-rule |
| 2 | rule-validation | all rules | qa-checklist.md#validate-rules |
| 3 | rule-update | `/update-rule CORE:S:9999 "Add test pattern"` | qa-checklist.md#update-rule |
| 4 | rule-validation | all rules | Test rule still passes |
| 5 | cleanup | `rm -rf core/structure/smoke-test/` | directory deleted |

## When to Run

24 changes: 22 additions & 2 deletions .shared/workflows/rule-creation.md
@@ -10,16 +10,36 @@ flowchart TD
DET --> SOURCES[Find backing sources in docs/sources.yml]
SEM --> SOURCES
MECH --> SOURCES
SOURCES --> GEN[Generate rule.md + rule.yml + tests/pass/ + tests/fail/]
SOURCES --> GEN[Generate rule.md + rule.yml + tests/pass/ + tests/fail]
GEN --> RESOLVE[Resolve templates for validation]
RESOLVE --> VALID{OpenGrep exit code?}
VALID -->|0 or 1| SAVE[Save files with templates intact]
VALID -->|2| FIX2[Fix syntax error] --> RESOLVE
VALID -->|7| FIX7[Add positive pattern] --> RESOLVE
SAVE --> REFS[Update coordinate-map, capability-levels if needed]
REFS --> CHANGELOG[/add-changelog-entry]
REFS --> CHANGELOG[add-changelog-entry]
```

## Why Type Classification Comes First

The type decision (mechanical / deterministic / semantic) sets the ceiling on what detection methods a rule can use:

- **Mechanical** rules check structural facts — file exists, line count thresholds, section presence. No pattern matching. These are the cheapest to run and the most reliable, but can only detect what's countable or locatable.
- **Deterministic** rules use OpenGrep patterns to match or reject content. They can detect specific textual violations without human judgment. Most rules land here.
- **Semantic** rules require an LLM judgment call — the violation can't be reduced to a pattern. These are the most expensive and least deterministic, so they're a last resort, not a default.

Choosing the type early prevents over-engineering (writing OpenGrep patterns for a rule that only needs `file_exists`) or under-engineering (trying to pattern-match something that genuinely needs judgment).
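
To make the "mechanical" category concrete, here is a minimal Python sketch of two structural checks. The function names and signatures are hypothetical (only the idea of a `file_exists` check comes from the text above); the point is that neither check needs a pattern engine or a judgment call.

```python
from pathlib import Path


def check_file_exists(path: Path) -> bool:
    """Structural fact: the file is present."""
    return path.is_file()


def check_max_lines(path: Path, limit: int) -> bool:
    """Structural fact: the file has at most `limit` lines."""
    return len(path.read_text().splitlines()) <= limit
```

A rule that can be expressed this way has no need for OpenGrep patterns, which is the over-engineering case described above.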

## Why Sources Before Generation

Every rule must reference at least one entry in `docs/sources.yml`. This is not bureaucracy — it's the evidence chain. A rule without a backing source is an arbitrary opinion. Sources ground rules in published best practices, official documentation, or empirical research, which makes the framework defensible when users ask "why does this rule exist?"

If no existing source covers the rule, a new source entry must be added first. The rule and its justification enter the system together.
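
The source requirement can be enforced with a small pre-generation check. This is a sketch under assumptions: `missing_sources` is a hypothetical helper, and source entries are assumed to be identified by plain string IDs already loaded from `docs/sources.yml`.

```python
def missing_sources(rule_sources: list[str], known_sources: set[str]) -> list[str]:
    """Return the referenced source IDs that are not defined.

    Raises ValueError when the rule references no source at all.
    """
    if not rule_sources:
        raise ValueError("rule must reference at least one source")
    return [source for source in rule_sources if source not in known_sources]
```

A non-empty result means the new source entries have to be added before the rule is generated.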

## Why Validate Before Save

The resolve → validate loop catches broken patterns before they're committed. A rule that passes schema validation but has malformed OpenGrep syntax would fail silently until someone runs the test harness — possibly much later, in a different context. Validating at creation time makes the failure immediate and attributable.

## Edge Cases

**No existing source backs the rule:**
22 changes: 22 additions & 0 deletions .shared/workflows/rule-update.md
@@ -17,6 +17,28 @@ flowchart TD
UPDATEMD --> REPORT[Report changes]
```

## Why Coordinates and Slugs Are Immutable

Rule coordinates (e.g., `CORE:S:0005`) and directory slugs are external references. Other rules, changelogs, documentation, and the coordinate map all point to them. Renaming a coordinate would silently break every reference and create phantom entries in the registry.

If a rule's scope changes enough to warrant a new coordinate, tombstone the old one and create a new rule.

## Why Templates Must Survive the Save

The resolve → validate → save cycle has a critical invariant: **stored files contain templates, never resolved values**.

Templates like `{{instruction_files}}` make rules portable across agents. A core rule resolved for Claude (`**/CLAUDE.md`) would fail for Codex (`codex.md`). Resolution is ephemeral — it exists only for validation. Saving resolved values would lock a rule to a single agent's configuration.
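
A minimal Python sketch of the invariant, assuming a simple `{{name}}` substitution (the helper `resolve_templates` and the `paths:` line are hypothetical; the agent values are the ones quoted above). Resolution returns a new string and leaves the stored text untouched.

```python
import re

TEMPLATE = re.compile(r"\{\{(\w+)\}\}")


def resolve_templates(text: str, agent_vars: dict[str, str]) -> str:
    """Return a resolved copy of `text`; raises KeyError for unknown templates."""
    return TEMPLATE.sub(lambda match: agent_vars[match.group(1)], text)


# The stored form keeps the template; resolved copies exist only for validation
stored = 'paths: "{{instruction_files}}"'
resolved_claude = resolve_templates(stored, {"instruction_files": "**/CLAUDE.md"})
resolved_codex = resolve_templates(stored, {"instruction_files": "codex.md"})
```

The same stored rule yields a different resolved copy per agent, which is why only `stored` should ever be written back to disk.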

## Why the Fix Loop Exists

OpenGrep exit codes signal distinct problems:

- **Exit 2** (syntax error): The pattern itself is malformed. Fix the YAML/regex and re-validate.
- **Exit 7** (no positive pattern): OpenGrep requires at least one positive pattern to anchor the rule. Add a `pattern` or `pattern-regex` before retrying.
- **Exit 0 or 1** (valid): Pattern is syntactically correct regardless of whether it matched anything.

The loop prevents saving patterns that would fail at runtime in the test harness.
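
The exit-code handling above can be sketched as a small dispatch plus a retry loop. The function names, the action labels, and the `max_attempts` cap are assumptions added for illustration; only the meaning of exit codes 0, 1, 2, and 7 comes from the list above.

```python
from typing import Callable


def next_action(exit_code: int) -> str:
    """Map an OpenGrep exit code to the next step in the loop."""
    if exit_code in (0, 1):
        return "save"
    if exit_code == 2:
        return "fix-syntax"
    if exit_code == 7:
        return "add-positive-pattern"
    raise ValueError(f"unexpected exit code: {exit_code}")


def fix_loop(
    validate: Callable[[], int],
    fix: Callable[[str], None],
    max_attempts: int = 5,  # assumed cap so the loop cannot spin forever
) -> bool:
    """Validate, apply the indicated fix, and retry until the pattern is valid."""
    for _ in range(max_attempts):
        action = next_action(validate())
        if action == "save":
            return True
        fix(action)
    return False
```

An unexpected exit code raises instead of being treated as valid, so an unknown failure mode cannot slip through to the save step.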

## Constraints

**NEVER change:**
26 changes: 22 additions & 4 deletions .shared/workflows/rule-validation.md
@@ -17,6 +17,24 @@ flowchart TD
NEXT -->|no| SUMMARY[Summary output]
```

## Why Three Layers in This Order

Validation runs schema, then contract, then OpenGrep. The ordering is deliberate:

1. **Schema validation** catches structural errors (missing fields, wrong types, bad format) with zero external dependencies. It's the cheapest check and filters out rules that would cause confusing downstream failures.

2. **Contract validation** confirms that `rule.md` and `rule.yml` agree — same coordinate, matching check IDs, consistent type declarations. This catches the class of bugs where one file was updated but the other wasn't. It requires both files to be schema-valid first.

3. **OpenGrep validation** runs the actual patterns against the syntax checker. This is the most expensive step and requires template resolution (file I/O, agent config loading). Running it last means we only pay that cost for rules that are already structurally sound.

Reversing the order would waste time running OpenGrep on rules with missing fields, or mask contract errors behind pattern syntax failures.
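
The cheapest-first ordering amounts to a short-circuiting pipeline. In this hypothetical Python sketch, each layer is a callable that returns a list of error strings; `validate_rule` and the `Check` alias are illustrative names, not the project's API.

```python
from typing import Callable, Optional

Check = Callable[[dict], list[str]]


def validate_rule(
    rule: dict, layers: list[tuple[str, Check]]
) -> tuple[Optional[str], list[str]]:
    """Run layers in order and stop at the first one that reports errors.

    Returns (failing_layer_name, errors), or (None, []) when every layer passes.
    """
    for name, check in layers:
        errors = check(rule)
        if errors:
            # Later, more expensive layers are never reached
            return name, errors
    return None, []
```

Passing the layers as `[("schema", ...), ("contract", ...), ("opengrep", ...)]` means the expensive OpenGrep check runs only for rules that are already schema-valid and contract-consistent.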

## Why Template Resolution Happens Before OpenGrep Only

Schema validation checks the template syntax itself — `{{instruction_files}}` must appear as-is in the stored file. Resolving templates before schema validation would hide template errors.

OpenGrep, however, needs real glob paths to validate pattern syntax. A pattern targeting `{{instruction_files}}` is not a valid path — it must be resolved to `**/CLAUDE.md` (or equivalent) before the pattern engine can parse it.

## Template Resolution

Before OpenGrep validation:
@@ -25,11 +43,11 @@ Before OpenGrep validation:
2. Replace variables from `vars:` section
3. Create temp resolved file for validation

| Template | Example Value (claude) |
|----------|------------------------|
| Template | Example Value (claude) |
|-------------------------|-----------------------------------------|
| `{{instruction_files}}` | `**/CLAUDE.md`, `.claude/rules/**/*.md` |
| `{{rules_dir}}` | `.claude/rules` |
| `{{skills_dir}}` | `.claude/skills` |
| `{{rules_dir}}` | `.claude/rules` |
| `{{skills_dir}}` | `.claude/skills` |

## Output Format

7 changes: 7 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,13 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.4.1] - 2026-02-17

Workflow documentation — added reasoning prose to all shared workflows.

### Changed
- **Workflows**: Added "why" prose sections to 5 workflows (level-sync, rule-validation, rule-update, rule-creation, qa-smoke-test) — mermaid flowcharts show the what, new prose sections explain the reasoning behind key decisions, constraints, and sequencing

## [0.4.0] - 2026-02-16

Rules consolidation — 47 rules reduced to 17 focused rules, GitHub Copilot agent added, new governance and maintenance categories.
2 changes: 1 addition & 1 deletion README.md
@@ -3,7 +3,7 @@
Validation rules for AI agent instruction files (CLAUDE.md, .cursorrules, copilot-instructions.md).
Community-maintained.

**Version:** 0.4.0 <!-- source of truth: VERSION file -->
**Version:** 0.4.1 <!-- source of truth: VERSION file -->

### Pre-1.0 — moving fast, API still evolving, feedback welcome.

2 changes: 1 addition & 1 deletion UNRELEASED.md
@@ -1 +1 @@
# Unreleased
# Unreleased
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
0.4.0
0.4.1