diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json index 538dedd..6e79722 100644 --- a/.claude-plugin/marketplace.json +++ b/.claude-plugin/marketplace.json @@ -12,7 +12,7 @@ "name": "dev-workflows", "source": "./dev-workflows", "strict": true, - "version": "0.16.17", + "version": "0.17.0", "description": "Skills + Subagents for backend development - Use skills for coding guidance, or run recipe workflows for full orchestrated agentic coding with specialized agents", "author": { "name": "Shinsuke Kagawa", @@ -68,6 +68,7 @@ "./skills/recipe-fullstack-implement", "./skills/recipe-implement", "./skills/recipe-plan", + "./skills/recipe-prepare-implementation", "./skills/recipe-reverse-engineer", "./skills/recipe-review", "./skills/recipe-task", @@ -81,7 +82,7 @@ "name": "dev-workflows-frontend", "source": "./dev-workflows-frontend", "strict": true, - "version": "0.16.17", + "version": "0.17.0", "description": "Skills + Subagents for React/TypeScript - Use skills for coding guidance, or run recipe workflows for full orchestrated agentic coding with specialized agents", "author": { "name": "Shinsuke Kagawa", @@ -149,7 +150,7 @@ "name": "dev-skills", "source": "./dev-skills", "strict": true, - "version": "0.16.17", + "version": "0.17.0", "description": "Lightweight skills for users with existing workflows - coding best practices, testing principles, and design guidelines without recipe workflows or agents", "author": { "name": "Shinsuke Kagawa", diff --git a/README.md b/README.md index 2713ccf..dcb4730 100644 --- a/README.md +++ b/README.md @@ -207,6 +207,7 @@ All workflow entry points use the `recipe-` prefix to distinguish them from know | `/recipe-task` | Execute single task with precision | Bug fixes, small changes | | `/recipe-design` | Create design documentation | Architecture planning | | `/recipe-plan` | Generate work plan from design | Planning phase | +| `/recipe-prepare-implementation` | Verify implementation readiness and resolve gaps | Pre-build check that the work plan is implementable | | `/recipe-build` | Execute from existing task plan | Resume implementation | | `/recipe-fullstack-build` | Execute fullstack task plan | Resume cross-layer implementation (requires both plugins) | | `/recipe-review` | Verify code against design docs | Post-implementation check | @@ -498,6 +499,16 @@ A: Yes! **[codex-workflows](https://github.com/shinpr/codex-workflows)** provide A: `dev-skills` provides only coding best practices as skills (`coding-principles`, `testing-principles`, etc.) β€” no workflow recipes or agents. `dev-workflows` includes the same skills plus recipes like `/recipe-implement` and specialized agents for full orchestrated development. Use `dev-skills` if you already have your own orchestration and just want the knowledge guides. They should not be installed together. See [Skills Only](#skills-only-for-users-with-existing-workflows) for details and switching instructions. +**Q: Should I commit the work plan and task files in `docs/plans/`?** + +A: No. Recipes treat `docs/plans/` as ephemeral working state β€” work plans, task files, prep tasks, review-fix tasks, and intermediate analysis files all live there during a recipe run and are cleaned up at the end. Add the following line to your project's `.gitignore` so working state stays out of git: + +``` +docs/plans/ +``` + +PRDs, ADRs, UI Specs, and Design Docs live in their own directories (`docs/prd/`, `docs/adr/`, `docs/ui-spec/`, `docs/design/`) and are intended to be committed. 
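For orientation, a sketch of the docs/ layout these conventions produce; directory roles are taken from the answer above, and only `docs/plans/` is ignored:

```
docs/
├── prd/        # committed: Product Requirements Documents
├── adr/        # committed: Architecture Decision Records
├── ui-spec/    # committed: UI Specs (prototypes under assets/)
├── design/     # committed: Design Docs
└── plans/      # git-ignored: ephemeral work plans, task files, prep and review-fix tasks
```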
+ --- ## πŸ”Œ Contributing External Plugins diff --git a/agents/acceptance-test-generator.md b/agents/acceptance-test-generator.md index 2916939..7eba9ef 100644 --- a/agents/acceptance-test-generator.md +++ b/agents/acceptance-test-generator.md @@ -37,7 +37,8 @@ Test type definitions, budgets, and ROI calculations are specified in **integrat Key points: - **Integration Tests**: MAX 3 per feature, created alongside implementation -- **E2E Tests**: MAX 1-2 per feature, executed in final phase only +- **fixture-e2e**: MAX 3 per feature, created alongside the UI feature phase, ROI β‰₯ 20 beyond reserved slot +- **service-integration-e2e**: MAX 1-2 per feature, executed only in the final phase, ROI > 50 beyond reserved slot ## 4-Phase Generation Process @@ -103,9 +104,9 @@ For each valid AC from Phase 1: **Output**: Candidate pool with ROI metadata -### Phase 3: ROI-Based Selection (Two-Pass #2) +### Phase 3: ROI-Based Selection and Lane Assignment (Two-Pass #2) -ROI calculation formula and cost table are defined in **integration-e2e-testing skill**. +ROI calculation formula and cost table are defined in **integration-e2e-testing skill**. Lane definitions and selection rules are also in that skill. **Selection Algorithm**: @@ -118,34 +119,50 @@ ROI calculation formula and cost table are defined in **integration-e2e-testing 3. **Push-Down Analysis**: ``` Can this be unit-tested? β†’ Remove from integration/E2E pool - Already integration-tested? β†’ Keep as E2E candidate IF part of multi-step user journey (see definition in integration-e2e-testing skill) - Already integration-tested AND NOT part of multi-step journey? β†’ Remove from E2E pool + Already integration-tested AND verifiable in-process? β†’ Remove from E2E pool ``` -4. **Sort by ROI** (descending order) +4. **Lane assignment** (E2E candidates only): + - Default to `fixture-e2e` for any UI journey verifiable with mocked backend / fixture-driven state + - Promote to `service-integration-e2e` only when the verification depends on real cross-service behavior. A candidate qualifies for `service-integration-e2e` when ANY of the following must be asserted: + - Data persists across a real DB write (e.g., row inserted/updated in the actual database under test) + - A downstream service receives a real event/message (e.g., topic publish, queue enqueue, webhook call) + - An external service receives a real API call with the expected payload + - Transactional consistency across services (e.g., two-phase commit, saga compensation) +5. **Sort by ROI** within each lane (descending) β€” this is the single ranking step; Phase 4 budget enforcement consumes this ranked list directly without re-sorting. -**Output**: Ranked, deduplicated candidate list +**Output**: Ranked, deduplicated candidate list with lane assigned per E2E candidate. ### Phase 4: Budget Enforcement **Hard Limits per Feature**: - **Integration Tests**: MAX 3 tests -- **E2E Tests**: MAX 1-2 tests total, composed of: - - 1 reserved slot (emitted regardless of ROI) when feature contains a **user-facing** multi-step user journey (see definition and classification in integration-e2e-testing skill) +- **fixture-e2e**: MAX 3 tests, no ROI gate. 
When the feature contains a **user-facing** multi-step user journey, the highest-ROI journey candidate is reserved (emitted regardless of ranking) +- **service-integration-e2e**: MAX 1-2 tests, composed of: + - 1 reserved slot (emitted regardless of ROI) when the journey's correctness depends on real cross-service behavior that fixture-e2e cannot verify - Up to 1 additional slot requiring ROI > 50 **Selection Algorithm**: ``` -1. Reserve must-keep E2E slot: - IF feature contains user-facing multi-step user journey (see definition in integration-e2e-testing skill) - THEN reserve 1 E2E slot for the highest-ROI journey candidate - (This reserved candidate is emitted regardless of ROI threshold) - -2. Sort remaining candidates by ROI (descending) - -3. Select top N within budget: +1. Reserve fixture-e2e slot: + IF feature contains user-facing multi-step user journey + THEN reserve 1 fixture-e2e slot for the highest-ROI journey candidate + +2. Reserve service-integration-e2e slot (only if needed): + IF the reserved journey's verification requires ANY of: + - data persists across a real DB write + - downstream service receives a real event/message + - external service receives a real API call with expected payload + - transactional consistency across services + THEN reserve 1 service-integration-e2e slot for that journey + +3. Walk the candidate list (already sorted by ROI within each lane in Phase 3 step 5) + and select within budget: - Integration: Pick top 3 highest-ROI - - E2E (additional beyond reserved): Pick up to 1 more IF ROI score > 50 + - fixture-e2e (additional beyond reserved): Pick up to remaining budget IF ROI β‰₯ 20 + - service-integration-e2e (additional beyond reserved): Pick up to 1 more IF ROI > 50 + + Leave budget intentionally unfilled when no remaining candidate clears the lane's threshold. ``` **Output**: Final test set @@ -180,81 +197,109 @@ The examples below use `//` comment syntax. 
Adapt to the project's language (e.g [Test: 'AC1: Failed payment displays error without creating order'] ``` -### E2E Test File +### fixture-e2e Test File + +``` +// [Feature Name] fixture-e2e Test - Design Doc: [filename] +// Generated: [date] | Budget Used: 1/3 fixture-e2e +// Test Type: Browser-level UI verification with mocked backend / fixture-driven state +// Implementation Timing: Alongside the UI feature implementation + +[Import statement using detected test framework] + +[Test suite using detected framework syntax] + // User Journey: Click Dismiss β†’ card disappears β†’ undo banner appears β†’ Undo restores card + // ROI: 60 (BV:6 Γ— Freq:7 + Legal:0 + Defect:8) | reserved slot: user-facing multi-step journey + // Verification: UI state transitions are observable in the browser + // @category: fixture-e2e + // @lane: fixture-e2e + // @dependency: full-ui (mocked backend) + // @complexity: medium + [Test: 'User Journey: Dismiss-then-Undo restores the card to its original state'] +``` + +### service-integration-e2e Test File ``` -// [Feature Name] E2E Test - Design Doc: [filename] -// Generated: [date] | Budget Used: 1/2 E2E -// Test Type: End-to-End Test -// Implementation Timing: After all feature implementations complete +// [Feature Name] service-integration-e2e Test - Design Doc: [filename] +// Generated: [date] | Budget Used: 1/2 service-integration-e2e +// Test Type: End-to-end against running local stack +// Implementation Timing: Executed only in the final phase [Import statement using detected test framework] [Test suite using detected framework syntax] - // User Journey: Complete purchase flow (browse β†’ add to cart β†’ checkout β†’ payment β†’ confirmation) - // ROI: 119 (BV:10 Γ— Freq:10 + Legal:10 + Defect:9) | reserved slot: multi-step journey - // Verification: End-to-end user experience from product selection to order confirmation - // @category: e2e + // User Journey: Complete purchase flow (browse β†’ checkout β†’ payment β†’ confirmation persisted in DB) + // ROI: 119 (BV:10 Γ— Freq:10 + Legal:10 + Defect:9) | reserved slot: cross-service correctness + // Verification: Order persists in DB and confirmation event reaches downstream consumer + // @category: service-integration-e2e + // @lane: service-integration-e2e // @dependency: full-system // @complexity: high - [Test: 'User Journey: Complete product purchase from browse to confirmation email'] + [Test: 'User Journey: Complete purchase persists order and emits confirmation event'] ``` ### Generation Report -**When E2E tests are emitted:** +**When both E2E lanes are emitted:** ```json { "status": "completed", "feature": "payment", "generatedFiles": { "integration": "tests/payment.int.test.[ext]", - "e2e": "tests/payment.e2e.test.[ext]" + "fixtureE2e": "tests/payment.fixture.e2e.test.[ext]", + "serviceE2e": "tests/payment.service.e2e.test.[ext]" }, - "budgetUsage": { "integration": "2/3", "e2e": "1/2" }, - "e2eAbsenceReason": null + "budgetUsage": { "integration": "2/3", "fixtureE2e": "1/3", "serviceE2e": "1/2" }, + "e2eAbsenceReason": { "fixtureE2e": null, "serviceE2e": null } } ``` -**When no E2E tests are emitted:** +**When only fixture-e2e is emitted:** ```json { "status": "completed", "feature": "payment", "generatedFiles": { "integration": "tests/payment.int.test.[ext]", - "e2e": null + "fixtureE2e": "tests/payment.fixture.e2e.test.[ext]", + "serviceE2e": null }, - "budgetUsage": { "integration": "2/3", "e2e": "0/2" }, - "e2eAbsenceReason": "no_multi_step_journey" + "budgetUsage": { "integration": "2/3", 
"fixtureE2e": "2/3", "serviceE2e": "0/2" }, + "e2eAbsenceReason": { "fixtureE2e": null, "serviceE2e": "no_real_service_dependency" } } ``` -**When no integration tests are emitted:** +**When no E2E tests are emitted:** ```json { "status": "completed", "feature": "config-update", "generatedFiles": { - "integration": null, - "e2e": null + "integration": "tests/config.int.test.[ext]", + "fixtureE2e": null, + "serviceE2e": null }, - "budgetUsage": { "integration": "0/3", "e2e": "0/2" }, - "e2eAbsenceReason": "no_multi_step_journey" + "budgetUsage": { "integration": "1/3", "fixtureE2e": "0/3", "serviceE2e": "0/2" }, + "e2eAbsenceReason": { "fixtureE2e": "no_multi_step_journey", "serviceE2e": "no_multi_step_journey" } } ``` -**Contract**: Both `generatedFiles.integration` and `generatedFiles.e2e` are always present as keys. Value is a file path string when generated, `null` when not generated. `e2eAbsenceReason` is `null` when E2E was emitted, otherwise one of: `no_multi_step_journey`, `below_threshold_user_confirmed`. +**Contract**: +- `generatedFiles.integration`, `generatedFiles.fixtureE2e`, `generatedFiles.serviceE2e` are always present as keys. Value is a file path string when generated, `null` when not. +- `e2eAbsenceReason` is an object with `fixtureE2e` and `serviceE2e` keys. Each value is `null` when that lane emitted, otherwise one of: `no_multi_step_journey`, `below_threshold_user_confirmed`, `no_real_service_dependency` (service-integration-e2e only β€” meaning the journey is verifiable in fixture-e2e). ## Test Meta Information Assignment Each test case MUST have the following standard annotations for test implementation planning: -- **@category**: core-functionality | integration | edge-case | ux -- **@dependency**: none | [component names] | full-system +- **@category**: core-functionality | integration | edge-case | ux | fixture-e2e | service-integration-e2e +- **@lane**: integration | fixture-e2e | service-integration-e2e +- **@dependency**: none | [component names] | full-ui (mocked backend) | full-system - **@complexity**: low | medium | high -These annotations are used when planning and prioritizing test implementation. +These annotations are used when planning and prioritizing test implementation. The `@lane` annotation is the source of truth for budget accounting and CI gating. 
## Constraints and Quality Standards diff --git a/agents/quality-fixer-frontend.md b/agents/quality-fixer-frontend.md index fde3f68..737f385 100644 --- a/agents/quality-fixer-frontend.md +++ b/agents/quality-fixer-frontend.md @@ -103,9 +103,11 @@ Return one of the following as the final response (see Output Format for schemas ### Testing Quality (React Testing Library) - **Test Coverage**: Minimum 60% coverage for frontend code - - Atoms: 70% target - - Molecules: 65% target - - Organisms: 60% target + - When the project adopts Atomic Design (atoms / molecules / organisms layering): + - Atoms: 70% target + - Molecules: 65% target + - Organisms: 60% target + - When the project uses a different component architecture (Feature-based, Container-Presenter, etc.): apply 60% as the baseline and raise the target for foundational/leaf components (those reused across many features) to 70% - **User-Observable Behavior**: Test what users see and interact with - **MSW for API Mocking**: Use Mock Service Worker for API mocking - **Test Behavior Over Internals**: Test observable behavior and outputs, not internal state diff --git a/agents/task-decomposer.md b/agents/task-decomposer.md index acbf5cc..aeff0d1 100644 --- a/agents/task-decomposer.md +++ b/agents/task-decomposer.md @@ -104,8 +104,12 @@ Decompose tasks based on implementation strategy patterns determined in implemen |---|---| | Existing code modification | The existing implementation files being modified, their tests, related Design Doc sections | | New component/feature | Adjacent implementations in the same layer/domain, Design Doc interface contracts | + | Frontend component implementation | UI Spec component section (use the section heading the work plan's UI Spec Component β†’ Task Mapping cites), Design Doc interface contracts, adjacent components in the same layer | + | Frontend integration / fixture-e2e test | UI Spec component section including the State x Display Matrix and Interaction Definition tables, the implemented component code, fixture data files | | Test implementation | Test skeleton comments/annotations, the target code being tested, actual API/auth flows | - | E2E environment setup | Current environment config (startup scripts, docker-compose or equivalent), seed scripts, existing fixture patterns, application auth flow | + | fixture-e2e environment setup | Existing fixture data files, the API mock layer the project uses (e.g., MSW for JS/TS, WireMock for JVM, responses for Python), browser harness configuration (Playwright by default) | + | service-integration-e2e environment setup | Local startup scripts (docker-compose or equivalent), seed scripts, application auth flow, external service stubs | + | Cross-package boundary implementation | Both sides of the boundary as listed in the work plan's Connection Map (owner modules and expected signal), the contract definition between them | | Bug fix / refactor | The affected code paths, related test coverage, error reproduction context | | Behavior replacement / rewrite | The existing implementation being replaced, its observable outputs, Design Doc Verification Strategy section | @@ -115,6 +119,8 @@ Decompose tasks based on implementation strategy patterns determined in implemen - Be specific with file paths: `src/orders/checkout`, `docs/design/payment.md` β€” not "the order module" or "related code" - When the target is a section within a file, write the file path and add a search hint: `docs/design/payment.md (Β§ Payment Flow)` or `src/orders/checkout (processOrder 
function)` - When test skeletons exist for the task, always include them as Investigation Targets + - When the work plan contains a UI Spec Component β†’ Task Mapping table, propagate the matching component section to every task in that row (see UI Spec Propagation below) + - When the work plan contains a Connection Map, propagate the boundary rows touching this task's target files (see Connection Map Propagation below) 7. **Implementation Pattern Consistency** When including implementation samples, MUST ensure strict compliance with the Design Doc implementation approach that forms the basis of the work plan @@ -144,6 +150,29 @@ When the work plan header includes a Quality Assurance Mechanisms table, propaga 3. **Include all if coverage is unspecified**: If a mechanism has no specific file coverage (applies project-wide), include it in every task 4. **Omit when no match**: If no mechanisms match a task's target files, omit the "Quality Assurance Mechanisms" section from that task +## UI Spec Propagation + +When the work plan contains a UI Spec Component β†’ Task Mapping table, propagate component references to each implementation task as follows: + +1. **Lookup by task ID**: For each row in the mapping table, locate the task(s) listed in the "Covered By Task(s)" column +2. **Append a single line to Investigation Targets**: Add one line per matched component in the task's Investigation Targets section. The line format is `[ui-spec path] (Β§ [component heading])`, where `` is appended only when the row lists specific states. + + - When no states are listed: `docs/ui-spec/foo-ui-spec.md (Β§ Component: AlertCard)` + - When states are listed: `docs/ui-spec/foo-ui-spec.md (Β§ Component: AlertCard β€” verify default + loading + error states)` + + This is the entire entry β€” do not also add a separate parenthetical line. The state hint is part of the same line. +3. **One row β†’ one or more tasks**: A component can be split across multiple tasks; propagate the same line to each +4. **Skip when not provided**: If the work plan has no UI Spec Component β†’ Task Mapping table, skip this propagation step + +## Connection Map Propagation + +When the work plan contains a Connection Map table, propagate boundary context to each implementation task as follows: + +1. **Lookup by task ID**: For each row in the Connection Map, locate the task(s) listed in the "Covered By Task(s)" column +2. **Append to Investigation Targets**: Add the boundary's owner module file paths on both sides to each matched task's Investigation Targets +3. **Add a "Boundary Context" note in the task body**: Record the boundary identifier and expected signal verbatim from the Connection Map row, so the executor knows what observable evidence the implementation must produce +4. **Skip when not provided**: If the work plan has no Connection Map, skip this propagation step + ## Task File Template See task template in documentation-criteria skill for details. @@ -243,6 +272,8 @@ Please execute decomposed tasks according to the order. 
- [ ] Appropriate granularity (1-5 files/task) - [ ] Investigation Targets specified for every task (specific file paths, not vague categories) - [ ] Quality Assurance Mechanisms from work plan header propagated to relevant tasks +- [ ] UI Spec Component β†’ Task Mapping rows propagated to matching tasks (when work plan has the table) +- [ ] Connection Map boundary rows propagated to matching tasks (when work plan has the table) - [ ] Clear completion criteria setting - [ ] Overall design document creation - [ ] Implementation efficiency and rework prevention (pre-identification of common processing, clarification of impact scope) diff --git a/agents/task-executor-frontend.md b/agents/task-executor-frontend.md index 84b4605..7a521fa 100644 --- a/agents/task-executor-frontend.md +++ b/agents/task-executor-frontend.md @@ -55,7 +55,7 @@ Use the appropriate run command based on the `packageManager` field in package.j ### Step1: Design Deviation Check (Any YES β†’ Immediate Escalation) β–‘ Interface definition change needed? (Props type/structure/name changes) -β–‘ Component hierarchy violation needed? (e.g., Atomβ†’Organism direct dependency) +β–‘ Component hierarchy violation needed? (e.g., skipping a layer in the project's adopted architecture β€” Atomβ†’Organism in Atomic Design, leafβ†’container in Container-Presenter, etc.) β–‘ Data flow direction reversal needed? (e.g., child component updating parent state without callback) β–‘ New external library/API addition needed? β–‘ Need to ignore type definitions in Design Doc? diff --git a/agents/technical-designer-frontend.md b/agents/technical-designer-frontend.md index b42988f..f0bc862 100644 --- a/agents/technical-designer-frontend.md +++ b/agents/technical-designer-frontend.md @@ -119,7 +119,7 @@ Must be performed when creating Design Doc: 1. **Approach Selection Criteria** - Execute Phase 1-4 of implementation-approach skill to select strategy - **Vertical Slice**: Complete by feature unit, minimal component dependencies, early value delivery - - **Horizontal Slice**: Implementation by component layer (Atomsβ†’Moleculesβ†’Organisms), important common components, design consistency priority + - **Horizontal Slice**: Implementation by component layer (e.g., Atomsβ†’Moleculesβ†’Organisms when Atomic Design is adopted; otherwise the project's foundationalβ†’composite layering), important common components, design consistency priority - **Hybrid**: Composite, handles complex requirements - Document selection reason (record results of metacognitive strategy selection process) @@ -326,7 +326,7 @@ class Button extends React.Component { **Design Doc**: Component hierarchy diagram and data flow diagram are mandatory. Add state transition diagram and sequence diagram for complex cases. **React Diagrams**: -- Component hierarchy (Atoms β†’ Molecules β†’ Organisms β†’ Templates β†’ Pages) +- Component hierarchy following the project's adopted architecture (e.g., Atoms β†’ Molecules β†’ Organisms β†’ Templates β†’ Pages for Atomic Design; feature-folder tree for Feature-based; container vs presenter split for Container-Presenter) - Props flow diagram (parent β†’ child data flow) - State management diagram (Context, custom hooks) - User interaction flow (click β†’ state update β†’ re-render) diff --git a/agents/ui-spec-designer.md b/agents/ui-spec-designer.md index d4c997d..cad3a10 100644 --- a/agents/ui-spec-designer.md +++ b/agents/ui-spec-designer.md @@ -103,6 +103,8 @@ Execute file output immediately (considered approved at execution). 
- [ ] If prototype provided: prototype is placed in `docs/ui-spec/assets/` - [ ] All TBDs in Open Items have owner and deadline - [ ] All UI Spec requirements align with PRD requirements +- [ ] **Component heading uniqueness**: Every component is documented under a section heading whose text is unique within this UI Spec. Use the format `## Component: [ComponentName]` (or `### Component: [ComponentName]` when nested under a screen). Downstream agents (work-planner Step 5a, task-decomposer UI Spec Propagation) reference components by exact heading text β€” duplicate or paraphrased headings break the propagation chain. + - **Disambiguation rule**: When two components share a base name (e.g., the same `AlertCard` rendered as a banner variant and as an inline variant), append a parenthetical qualifier to make each heading unique: `Component: AlertCard (Banner variant)` and `Component: AlertCard (Inline variant)`. Verify uniqueness with a final pass: extract all `Component: ` headings, confirm zero duplicates ## Important Design Principles diff --git a/agents/work-planner.md b/agents/work-planner.md index 1de8715..9cef86a 100644 --- a/agents/work-planner.md +++ b/agents/work-planner.md @@ -38,22 +38,40 @@ Choose Strategy A (TDD) if test skeletons are provided, Strategy B (implementati - Final phase is always Quality Assurance **E2E Gap Check (all strategies)**: -After determining which test skeletons are available, check whether E2E skeletons are absent. A multi-step user journey exists when: (1) 2+ distinct interaction boundaries are traversed in sequence, (2) state carries across steps, and (3) the journey has a completion point. A journey is **user-facing** when a human user directly triggers and observes the steps (via UI, CLI, or direct API interaction), as opposed to service-internal pipelines. +After determining which test skeletons are available, check the two E2E lanes (fixture-e2e, service-integration-e2e β€” see integration-e2e-testing skill) independently. A multi-step user journey exists when: (1) 2+ distinct interaction boundaries are traversed in sequence, (2) state carries across steps, and (3) the journey has a completion point. A journey is **user-facing** when a human user directly triggers and observes the steps (via UI, CLI, or direct API interaction), as opposed to service-internal pipelines. ``` -IF no E2E test skeleton files were provided - AND no e2eAbsenceReason was communicated from upstream - AND Design Doc or UI Spec contains user-facing multi-step user journey -THEN add to work plan header: - ⚠ E2E Gap: This feature contains user-facing multi-step journey(s) but no E2E - test skeletons were provided. Consider running the test skeleton generation - step to evaluate E2E test candidates before final phase. - Detected journeys: [list journey descriptions and AC references] +fixture-e2e gap: + IF no fixture-e2e skeleton was provided + AND (e2eAbsenceReason.fixtureE2e is null + OR e2eAbsenceReason.fixtureE2e was not communicated) + AND Design Doc or UI Spec contains user-facing multi-step user journey + THEN add to work plan header: + ⚠ fixture-e2e Gap: This feature contains user-facing multi-step journey(s) + but no fixture-e2e skeleton was provided. Consider running the test + skeleton generation step to evaluate fixture-e2e candidates before the + UI implementation phase. 
+ Detected journeys: [list journey descriptions and AC references] + +service-integration-e2e gap: + IF no service-integration-e2e skeleton was provided + AND (e2eAbsenceReason.serviceE2e is null + OR e2eAbsenceReason.serviceE2e was not communicated) + AND Design Doc indicates the journey requires real cross-service + verification (data persistence across services, transactional + consistency, external service contract) + THEN add to work plan header: + ⚠ service-integration-e2e Gap: This feature crosses service boundaries + where correctness depends on real cross-service behavior, but no + service-integration-e2e skeleton was provided. + Detected boundaries: [list crossings and AC references] ``` -When an `e2eAbsenceReason` is provided (e.g., `no_multi_step_journey`, `below_threshold_user_confirmed`), E2E absence is intentional β€” skip this gap check. +The "was not communicated" branch covers the scenario where the upstream planning flow skipped test skeleton generation entirely β€” in that case the absence reason field is not even passed to work-planner, so the gap check still runs. -This check applies regardless of whether Strategy A or B was selected. Integration-only skeletons being provided does not imply E2E coverage. Service-internal journeys (async pipelines, service-to-service sagas) are not flagged here β€” they may still warrant E2E through the normal ROI path. +When an `e2eAbsenceReason` for a lane carries a value (e.g., `no_multi_step_journey`, `below_threshold_user_confirmed`, `no_real_service_dependency`), absence in that lane is intentional β€” skip the gap check for that lane. + +This check applies regardless of whether Strategy A or B was selected. Integration-only skeletons being provided does not imply E2E coverage. Service-internal journeys (async pipelines, service-to-service sagas) are not flagged for the reserved-slot rule but may still warrant service-integration-e2e through the normal ROI path. **Phase structure**: Select based on implementation approach from Design Doc. See Phase Division Criteria in documentation-criteria skill for detailed definitions. Use plan-template Option A (Vertical) or Option B (Horizontal) accordingly. @@ -73,12 +91,47 @@ Map each extracted item to a covering task. Items may be covered by a dedicated Record the mapping in the Design-to-Plan Traceability table (see plan template). If an item has no covering task, set Gap Status to `gap` with justification in Notes. Gaps with justification require user confirmation before plan approval. +### 5a. Map UI Spec Components to Tasks (when UI Spec provided) + +When a UI Spec is among the inputs, also map components and states to the tasks that implement them. task-decomposer reads this mapping in Step 6 to populate each task's Investigation Targets, so without this step the UI Spec never reaches the executor. + +For each component documented in the UI Spec: +1. Identify the component's section heading exactly as it appears in the UI Spec (the heading is the reference key β€” see ui-spec-designer's heading uniqueness rule) +2. Identify which states (default / loading / empty / error / partial) the implementation must cover +3. Identify the task(s) in this plan that implement the component or its tests + +Record the mapping in the **UI Spec Component β†’ Task Mapping** table (see plan template). One row per component. Components with no covering task are flagged as `gap` requiring user confirmation, identical to the Design-to-Plan Traceability rule. + +### 5b. 
Map Cross-Package Boundaries to Tasks (when implementation crosses runtime/deployment boundaries) + +When the implementation crosses a runtime or deployment boundary, build a Connection Map so task-decomposer can propagate boundary context to each affected task. + +**A boundary qualifies for the Connection Map only when ALL of the following hold**: +- The two sides run in separate processes, services, or runtimes (e.g., web client ↔ HTTP server, service A ↔ service B over a network, frontend bundle ↔ backend handler) +- A serialized contract crosses between them (HTTP request/response, message envelope, RPC call, event payload) +- A failure on one side produces an observable signal on the other (status code, missing field, timeout, dropped message) + +**Excluded β€” these are NOT boundaries for the Connection Map**: +- A package importing a sibling utility, type definition, or shared constant from the same monorepo (in-process, no serialized contract) +- Internal layering within the same runtime (e.g., handler β†’ usecase β†’ repository) +- Source code dependencies that compile/bundle into the same artifact + +For each qualifying boundary: +1. Identify the boundary (e.g., `web β†’ API gateway`, `service-A β†’ service-B`, `frontend β†’ shared client β†’ backend handler`) +2. Identify the owner module/package on each side +3. Identify the expected signal that confirms the boundary works (e.g., HTTP 200 with schema X, message published to topic Y, row inserted in table Z) +4. Identify the task(s) that implement either side of the boundary + +Record the mapping in the **Connection Map** table (see plan template). Omit this section entirely when no qualifying boundary exists. + ### 6. Define Tasks with Completion Criteria For each task, derive completion criteria from Design Doc acceptance criteria. Apply the 3-element completion definition (Implementation Complete, Quality Complete, Integration Complete). ### 7. Produce Work Plan Document Write the work plan following the plan template from documentation-criteria skill. Include Phase Structure Diagram and Task Dependency Diagram (mermaid). +The plan header MUST include the line `Implementation Readiness: pending`. The marker contract: it takes one of three values β€” `pending` (initial, set here by work-planner), `ready` (verification completed with no remaining gaps), or `escalated` (verification completed with remaining gaps). The producer that promotes the marker beyond `pending` and the consumer that reads it before execution are external orchestration concerns owned outside this agent. + ## Input Parameters - **mode**: `create` (default) | `update` @@ -129,7 +182,8 @@ Create Red state tests based on unit test definitions provided from previous pro **Test Implementation Timing and Placement**: - Unit tests: Phase 0 Red β†’ Green during implementation - Integration tests: Create and execute at completion of relevant feature implementation (include in phase tasks like "[Feature name] implementation with integration test creation") -- E2E tests: Execute only in final phase (execution only, no separate implementation needed) +- fixture-e2e tests: Create and execute alongside the UI feature phase (include in phase tasks like "[Feature name] UI implementation with fixture-e2e creation"). 
These run in CI without infrastructure setup +- service-integration-e2e tests: Execute only in the final phase (these depend on local stack and tend to be too slow/heavy for per-task cycles) #### Meta Information Utilization Analyze meta information (@category, @dependency, @complexity, etc.) included in test definitions, @@ -166,22 +220,29 @@ Read test skeleton files (integration tests, E2E tests) with the Read tool and e #### Step 3: Extract Environment Prerequisites from E2E Skeletons -When E2E test skeletons are provided, scan for environment prerequisites in two stages: +When E2E test skeletons are provided, scan for environment prerequisites in two stages. Apply the lane-aware rules below β€” fixture-e2e and service-integration-e2e have very different prerequisite shapes. -**Stage 1: Detect precondition patterns** β€” scan all E2E skeletons and list every detected precondition: -- `Preconditions:` or `Arrange:` comment annotations mentioning seed data, test users, subscriptions, or specific DB state -- `@dependency: full-system` combined with auth/login setup code +**Stage 1: Detect precondition patterns** β€” scan each E2E skeleton (read its `@lane` header to know which lane applies) and list every detected precondition: +- `Preconditions:` or `Arrange:` comment annotations mentioning seed data, test users, fixtures, or specific UI/DB state +- `@dependency: full-ui (mocked backend)` combined with fixture loaders or API mock handlers (e.g., MSW for JS/TS, route interception in the project's browser harness β€” fixture-e2e) +- `@dependency: full-system` combined with auth/login setup code (service-integration-e2e) - References to environment variables (`E2E_*`, `TEST_*`) -- External service references requiring HTTP mock/intercept patterns in test code +- External service references requiring HTTP mock/intercept patterns + +**Stage 2: Generate setup tasks** β€” for each detected precondition, create a corresponding Phase 0 task. Common categories by lane: + +For **fixture-e2e**: +- **Fixture data** β†’ "Create fixture data files for [feature] UI states" +- **Mock backend** β†’ "Configure API mock layer for fixture-e2e (e.g., MSW for JS/TS, WireMock for JVM, responses for Python β€” use the project's standard)" +- **Browser harness** β†’ "Set up the browser harness for fixture-e2e (Playwright by default; no live services required)" -**Stage 2: Generate setup tasks** β€” for each detected precondition, create a corresponding Phase 0 task. Common categories include: -- **Seed data** β†’ "Create E2E seed data script (test users, required records)" -- **Auth fixture** β†’ "Implement E2E auth fixture using application's login flow" -- **External service mocks** β†’ "Configure external service mocks for E2E tests" -- **Environment configuration** β†’ "Define E2E environment variables and document setup" -- **Other detected preconditions** β†’ Create a setup task matching the detected category +For **service-integration-e2e**: +- **Seed data** β†’ "Create seed data script for service-integration-e2e (test users, required records)" +- **Auth fixture** β†’ "Implement auth fixture using application's login flow" +- **External service stubs** β†’ "Configure external service stubs for service-integration-e2e" +- **Environment configuration** β†’ "Define service-integration-e2e environment variables and document local startup" -Place all environment setup tasks in Phase 0 (before any implementation tasks). Mark with `@category: e2e-setup` for traceability. 
+Place all environment setup tasks in Phase 0 (before any implementation tasks). Mark with `@category: e2e-setup` and `@lane:` matching the target lane for traceability. #### Step 4: Classify and Place Tests @@ -189,7 +250,8 @@ Place all environment setup tasks in Phase 0 (before any implementation tasks). - Setup items (Mock preparation, measurement tools, Helpers, etc.) β†’ Prioritize in Phase 1 - Unit tests (individual functions) β†’ Start from Phase 0 with Red-Green-Refactor - Integration tests β†’ Place as create/execute tasks when relevant feature implementation is complete -- E2E tests β†’ Place as execute-only tasks in final phase +- fixture-e2e tests β†’ Place as create/execute tasks alongside the relevant UI feature implementation +- service-integration-e2e tests β†’ Place as execute-only tasks in final phase - Non-functional requirement tests (performance, UX, etc.) β†’ Place in quality assurance phase - Risk levels ("high risk", "required", etc.) β†’ Move to earlier phases @@ -234,6 +296,12 @@ When creating work plans, **Phase Structure Diagrams** and **Task Dependency Dia - [ ] Design-to-Plan Traceability table complete (all DD technical requirements categorized and mapped) - [ ] No `gap` entries without justification - [ ] All justified `gap` entries flagged for user confirmation before plan approval +- [ ] UI Spec Component β†’ Task Mapping table complete (when UI Spec provided) + - [ ] Every UI Spec component has a covering task, OR an explicit `gap` justification + - [ ] Component reference uses the UI Spec section heading exactly as it appears in the document +- [ ] Connection Map table complete (when implementation crosses packages/services) + - [ ] Every boundary lists owner modules and expected signal + - [ ] Every boundary maps to at least one covering task on each side - [ ] Verification Strategy extracted from Design Doc and included in plan header - [ ] Adopted Quality Assurance Mechanisms extracted from Design Doc and included in plan header - [ ] Phase structure matches implementation approach (vertical β†’ value unit phases, horizontal β†’ layer phases) @@ -242,7 +310,8 @@ When creating work plans, **Phase Structure Diagrams** and **Task Dependency Dia - [ ] Quality assurance exists in final phase - [ ] Test skeleton file paths listed in corresponding phases (when provided) - [ ] E2E environment prerequisites addressed (when E2E skeletons provided) - - [ ] Seed data, auth fixture, and external service mock tasks generated + - [ ] fixture-e2e prerequisites: fixture data, mocked backend, browser harness tasks generated when applicable + - [ ] service-integration-e2e prerequisites: seed data, auth fixture, external service stub tasks generated when applicable - [ ] Environment setup tasks placed in Phase 0 - [ ] Test design information reflected (only when provided) - [ ] Setup tasks placed in first phase diff --git a/dev-skills/.claude-plugin/plugin.json b/dev-skills/.claude-plugin/plugin.json index 9acca9d..b9a6521 100644 --- a/dev-skills/.claude-plugin/plugin.json +++ b/dev-skills/.claude-plugin/plugin.json @@ -1,7 +1,7 @@ { "name": "dev-skills", "description": "Lightweight skills for users with existing workflows - coding best practices, testing principles, and design guidelines without recipe workflows or agents", - "version": "0.16.17", + "version": "0.17.0", "author": { "name": "Shinsuke Kagawa", "url": "https://github.com/shinpr" diff --git a/dev-skills/skills/documentation-criteria/references/plan-template.md 
b/dev-skills/skills/documentation-criteria/references/plan-template.md index 1120318..084ec79 100644 --- a/dev-skills/skills/documentation-criteria/references/plan-template.md +++ b/dev-skills/skills/documentation-criteria/references/plan-template.md @@ -5,6 +5,7 @@ Type: feature|fix|refactor Estimated Duration: X days Estimated Impact: X files Related Issue/PR: #XXX (if any) +Implementation Readiness: pending ## Related Documents - Design Doc(s): @@ -46,6 +47,26 @@ Maps each Design Doc technical requirement to the covering task(s). One row per **Gap Status values**: `covered` (task exists), `gap` (no task β€” requires justification in Notes, user confirmation required before plan approval) +## UI Spec Component β†’ Task Mapping + +Include this section when a UI Spec is among the inputs. Maps each component documented in the UI Spec to the task(s) that implement it. task-decomposer reads this table to populate each task's Investigation Targets with the corresponding UI Spec section. Omit the section when no UI Spec exists. + +| UI Spec Component (section heading) | States to Cover | Covered By Task(s) | Gap Status | Notes | +|---|---|---|---|---| +| [Use the UI Spec heading exactly as written, e.g., "Β§ Component: AlertCard"] | [default / loading / empty / error / partial β€” list the states the implementation must produce] | [Phase X Task Y] | covered | | + +**Reference key rule**: The component identifier in column 1 is the UI Spec section heading (verbatim). ui-spec-designer enforces unique component headings so this reference resolves to exactly one section. + +**Gap Status values**: `covered` (task exists), `gap` (no task β€” requires justification in Notes, user confirmation required before plan approval) + +## Connection Map + +Include this section when the implementation crosses more than one package, service, or process boundary. Document each boundary so task-decomposer can propagate boundary context to the implementation tasks on each side. Omit the section when the implementation stays within a single package. + +| Boundary | Owner (left side) | Owner (right side) | Expected Signal | Covered By Task(s) | +|---|---|---|---|---| +| [e.g., "web client β†’ API gateway"] | [module/package on the request side] | [module/package on the response side] | [Observable evidence the boundary works β€” e.g., "HTTP 200 with response matching ContractA", "row inserted in tableB", "message published to topicC"] | [Phase X Task Y on each side] | + ## Objective [Why this change is necessary, what problem it solves] diff --git a/dev-skills/skills/documentation-criteria/references/ui-spec-template.md b/dev-skills/skills/documentation-criteria/references/ui-spec-template.md index b66681b..9b8b60d 100644 --- a/dev-skills/skills/documentation-criteria/references/ui-spec-template.md +++ b/dev-skills/skills/documentation-criteria/references/ui-spec-template.md @@ -59,6 +59,8 @@ Map PRD acceptance criteria to prototype references. Skip this section if no pro ### Component: [ComponentName] +> Component heading uniqueness: every `Component: [ComponentName]` heading must be unique within this UI Spec. work-planner and task-decomposer reference components by exact heading text β€” duplicate names or paraphrased headings break the propagation to implementation tasks. 
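> For instance (component names illustrative), two variants of the same component get distinct headings rather than a repeated one:

```markdown
### Component: AlertCard (Banner variant)
### Component: AlertCard (Inline variant)
```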
+ #### State x Display Matrix | State | Default | Loading | Empty | Error | Partial | diff --git a/dev-skills/skills/integration-e2e-testing/SKILL.md b/dev-skills/skills/integration-e2e-testing/SKILL.md index 6ad9889..801459c 100644 --- a/dev-skills/skills/integration-e2e-testing/SKILL.md +++ b/dev-skills/skills/integration-e2e-testing/SKILL.md @@ -7,14 +7,21 @@ description: Integration and E2E test design principles, ROI calculation, test s ## References -**E2E test design with Playwright**: See [references/e2e-design.md](references/e2e-design.md) for UI Spec-driven E2E test candidate selection and Playwright test architecture. +**E2E test design**: See [references/e2e-design.md](references/e2e-design.md) for UI Spec-driven E2E test candidate selection and browser test architecture. The reference uses Playwright as the default browser harness; substitute the project's standard when different. ## Test Type Definition and Limits -| Test Type | Purpose | Scope | Limit per Feature | Implementation Timing | -|-----------|---------|-------|-------------------|----------------------| -| Integration | Verify component interactions | Partial system integration | MAX 3 | Created alongside implementation | -| E2E | Verify critical user journeys | Full system | MAX 1-2 | Executed in final phase only | +| Test Type | Purpose | Scope | External Deps | Limit per Feature | Implementation Timing | +|-----------|---------|-------|---------------|-------------------|----------------------| +| Integration | Verify component interactions in-process | Partial system integration (in-process modules; for UI components, the framework's in-process renderer e.g., RTL+MSW for React/TS) | Mocked or in-process | MAX 3 | Created alongside implementation | +| fixture-e2e | Verify UI behavior in a browser with deterministic fixtures | Full UI flow with mocked backend / fixture-driven state | Mocked / fixture only β€” no live services | MAX 3 | Created alongside the UI feature | +| service-integration-e2e | Verify critical user journeys against a running local stack | Full system across services | Live local services or stubs | MAX 1-2 | Executed only in the final phase | + +**Lane selection (E2E only)**: +- Default lane for user-facing UI journeys is **fixture-e2e** β€” it runs a real browser against deterministic fixtures, catches the bugs that unit/integration tests miss (button no-op, state never updates, navigation breaks), and runs in CI without infrastructure setup +- Add **service-integration-e2e** only when the journey's correctness depends on real cross-service behavior (data persistence, transactional consistency, external service contracts) that cannot be faked safely + +The two E2E lanes are budgeted independently β€” having a fixture-e2e for a journey does not consume the service-integration-e2e budget and vice versa. ## Behavior-First Principle @@ -43,20 +50,29 @@ ROI Score = Business Value Γ— User Frequency + Legal Requirement Γ— 10 + Defect Higher ROI Score = higher priority within its test type. No normalization or capping is applied β€” the raw score is used directly for ranking. Deduplication is a separate step that removes candidates entirely; it does not modify scores. -### ROI Threshold for E2E +### ROI Thresholds by Lane + +The two E2E lanes have very different ownership costs and use independent thresholds. -E2E tests have high ownership cost (creation, execution, and maintenance are each 3-10Γ— higher than integration tests). 
To justify creation, an E2E candidate (beyond the must-keep reserved slot) requires **ROI Score > 50**. +| Lane | ROI threshold | Rationale | +|------|---------------|-----------| +| fixture-e2e | ROI β‰₯ 20 (beyond reserved slot) | Cost is comparable to integration tests once the harness exists; the floor avoids filling MAX 3 with low-signal tests when fewer would suffice | +| service-integration-e2e | ROI > 50 (beyond reserved slot) | Creation, execution, and maintenance cost is 3-10Γ— higher than integration; reserve for journeys whose value cannot be proven any other way | + +Reserved slot rules (see Multi-Step User Journey Definition below) apply per lane and override the threshold (the reserved candidate is emitted regardless of its ROI score). Below-floor candidates beyond the reserved slot are not emitted, leaving budget intentionally unfilled rather than padding with low-value tests. ### ROI Calculation Examples | Scenario | BV | Freq | Legal | Defect | ROI Score | Test Type | Selection Outcome | |----------|----|------|-------|--------|-----------|-----------|-------------------| -| Core checkout flow | 10 | 9 | true | 9 | 109 | E2E | Selected (reserved slot: user-facing multi-step journey) | -| Payment error handling | 8 | 3 | false | 7 | 31 | E2E | Below threshold (31 < 50), not selected | -| Profile save flow | 7 | 6 | false | 6 | 48 | E2E | Below threshold (48 < 50), not selected | +| Core checkout UI flow | 10 | 9 | true | 9 | 109 | fixture-e2e | Selected (reserved slot: user-facing multi-step journey, browser-level verification with fixtures) | +| Core checkout against live payment service | 10 | 9 | true | 9 | 109 | service-integration-e2e | Selected (real-service correctness above ROI threshold) | +| Dismiss button updates UI state | 6 | 7 | false | 8 | 50 | fixture-e2e | Selected (rank 2 of 3 fixture-e2e budget) | +| Payment error message display | 5 | 4 | false | 7 | 27 | fixture-e2e | Selected (rank 3 of 3 fixture-e2e budget) | +| Optional filter toggle | 3 | 4 | false | 2 | 14 | fixture-e2e | Not selected (rank 4, budget full) | +| Payment retry against real provider | 8 | 3 | false | 7 | 31 | service-integration-e2e | Below ROI threshold (31 < 50), not selected | | DB persistence check | 8 | 8 | false | 8 | 72 | Integration | Selected (rank 1 of 3) | -| Error message display | 5 | 3 | false | 4 | 19 | Integration | Selected (rank 2 of 3) | -| Optional filter toggle | 3 | 4 | false | 2 | 14 | Integration | Not selected (rank 4, budget full) | +| Pure data transformation | 5 | 3 | false | 4 | 19 | Integration | Selected (rank 2 of 3) | ## Multi-Step User Journey Definition @@ -72,14 +88,14 @@ A feature qualifies as containing a **multi-step user journey** when ALL of the ### User-Facing vs Service-Internal Journeys -Multi-step journeys are further classified for E2E budget decisions: +Multi-step journeys are classified for reserved-slot eligibility: -| Classification | Condition | E2E Reserved Slot | Example | +| Classification | Condition | Reserved Slot Eligibility | Example | |---|---|---|---| -| **User-facing** | A human user directly triggers and observes the steps (via UI, CLI, or direct API interaction) | Eligible | Web checkout flow, CLI setup wizard, mobile onboarding | -| **Service-internal** | Steps are triggered by backend services without direct user interaction | Not eligible (use integration tests) | Async job pipeline, service-to-service saga, scheduled batch processing | +| **User-facing** | A human user directly triggers and observes the steps (via 
UI, CLI, or direct API interaction) | Eligible β€” defaults to **fixture-e2e** reserved slot. Add a service-integration-e2e reserved slot only when the journey's correctness depends on real cross-service behavior | Web checkout flow, CLI setup wizard, mobile onboarding | +| **Service-internal** | Steps are triggered by backend services without direct user interaction | Not eligible for reserved slot β€” use integration tests. Service-integration-e2e through normal ROI > 50 path is still valid when full-system verification is warranted | Async job pipeline, service-to-service saga, scheduled batch processing | -This classification applies only to the reserved E2E slot and the E2E Gap Check. Service-internal journeys are still valid E2E candidates through the normal ROI > 50 path if they warrant full-system verification. +This classification applies only to the reserved-slot rule and the E2E Gap Check. Other selection follows lane-specific ROI rules above. Use this definition when evaluating E2E test candidates and E2E gap detection. @@ -92,12 +108,18 @@ Each test MUST include the following annotations: ``` AC: [Original acceptance criteria text] Behavior: [Trigger] β†’ [Process] β†’ [Observable Result] -@category: core-functionality | integration | edge-case | e2e +@category: core-functionality | integration | edge-case | fixture-e2e | service-integration-e2e +@lane: integration | fixture-e2e | service-integration-e2e @dependency: none | [component names] | full-system @complexity: low | medium | high ROI: [score] ``` +**`@lane` selection rule**: +- `integration` β€” Component interaction in-process, no browser (e.g., RTL+MSW for React/TS, in-process module/handler integration in any language) +- `fixture-e2e` β€” Browser-level UI verification with mocked backend / fixture-driven state +- `service-integration-e2e` β€” Browser-level or end-to-end verification against running local services or stubs + Use the project's comment syntax to wrap these annotations (e.g., `//` for C-family, `#` for Python/Ruby/Shell). ### Verification Items (Optional) @@ -121,9 +143,10 @@ Verification items: ## Test File Naming Convention - Integration tests: `*.int.test.*` or `*.integration.test.*` -- E2E tests: `*.e2e.test.*` +- fixture-e2e tests: `*.fixture.e2e.test.*` (or organize under `tests/e2e/fixture/`) +- service-integration-e2e tests: `*.service.e2e.test.*` (or organize under `tests/e2e/service/`) -The test runner or framework in the project determines the appropriate file extension. +The test runner or framework in the project determines the appropriate file extension. Repos that already use a single `*.e2e.test.*` convention may keep it as long as each file declares `@lane:` in its header β€” the lane annotation is the source of truth for routing and budget accounting. ## Review Criteria diff --git a/dev-skills/skills/integration-e2e-testing/references/e2e-design.md b/dev-skills/skills/integration-e2e-testing/references/e2e-design.md index f4e9e90..45a0174 100644 --- a/dev-skills/skills/integration-e2e-testing/references/e2e-design.md +++ b/dev-skills/skills/integration-e2e-testing/references/e2e-design.md @@ -1,8 +1,21 @@ -# E2E Test Design with Playwright +# E2E Test Design (Browser Harness) + +This reference uses Playwright as the default example throughout because it is the standard E2E browser harness assumed by these workflows. Adapt patterns to the project's chosen framework when different (Cypress, Selenium, etc.); the lane definitions, ROI rules, and budgets remain the same. 
+ +## Two E2E Lanes + +E2E tests in this workflow split into two lanes (see parent skill Test Type Definition): + +| Lane | When | ROI gate | Cost | +|------|------|----------|------| +| **fixture-e2e** | UI journey verification with deterministic fixtures (mocked backend / fixture data) | None β€” selected by ranking within MAX 3 budget | Comparable to integration; runs in CI without infrastructure setup | +| **service-integration-e2e** | Journey correctness depends on real cross-service behavior (data persistence, transactional consistency, external contracts) | ROI > 50 (beyond reserved slot) | 3-10Γ— higher than integration; reserved for what cannot be faked safely | + +Both lanes typically use Playwright; the difference is whether the backend is mocked / fixture-driven or running for real. ## When to Create E2E Tests -E2E tests target **critical user journeys** that span multiple pages or require real browser interaction. Apply the same ROI framework from the parent skill β€” only create E2E tests when ROI > 50. +E2E candidates target **critical user journeys** that span multiple pages or require real browser interaction. Pick the lane based on whether real services are required for the verification. ### Candidate Sources @@ -22,8 +35,8 @@ E2E tests target **critical user journeys** that span multiple pages or require - Responsive behavior across viewports **Use integration tests instead when**: -- Testing single-component state changes β†’ RTL -- Testing API response handling β†’ MSW + RTL +- Testing single-component state changes β†’ in-process component renderer (e.g., RTL for React/TS) +- Testing API response handling β†’ in-process API mock + component renderer (e.g., MSW + RTL for React/TS) - Testing pure data transformations β†’ unit tests ## UI Spec to E2E Test Mapping @@ -41,12 +54,15 @@ When a UI Spec exists, use it as the primary source for E2E test design: Screen Transition: [Screen A] β†’ [Screen B] β†’ [Screen C] AC Reference: AC-{id} User Journey: [Description of what the user accomplishes] -Preconditions: [Auth state, data state] +Lane: fixture-e2e | service-integration-e2e +Preconditions: [Auth state, data state β€” note whether these are fixture-driven or live] Verification Points: - [What to assert at each step] E2E ROI Score: [calculated score] ``` +**Lane decision**: choose `fixture-e2e` by default. Promote to `service-integration-e2e` when the verification requires observing real cross-service behavior (e.g., the test asserts that data persists across a real DB write, or that an external service receives the correct payload). 
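A filled-in candidate in this format might look like the following; the screens and score reuse the core checkout fixture-e2e example from the parent skill and are illustrative:

```
Screen Transition: Product list → Cart → Checkout → Order confirmation
AC Reference: AC-1
User Journey: User completes a purchase from product selection to order confirmation
Lane: fixture-e2e
Preconditions: Authenticated session faked via test cookie; catalog, cart, and payment responses served from fixture data (no live services)
Verification Points:
- Cart reflects the added product
- Checkout form submits and navigates to confirmation
- Confirmation screen shows the order summary from the fixture
E2E ROI Score: 109
```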
+ ## Playwright Test Architecture ### Page Object Pattern @@ -56,9 +72,11 @@ Organize browser interactions through page objects for maintainability: ``` tests/ β”œβ”€β”€ e2e/ -β”‚ β”œβ”€β”€ pages/ # Page objects -β”‚ β”œβ”€β”€ fixtures/ # Test fixtures and helpers -β”‚ └── *.e2e.test.ts # Test files +β”‚ β”œβ”€β”€ pages/ # Page objects (shared across lanes) +β”‚ β”œβ”€β”€ fixtures/ # Test fixtures and helpers (auth, seed) +β”‚ β”œβ”€β”€ data/ # Static fixture data for fixture-e2e +β”‚ β”œβ”€β”€ *.fixture.e2e.test.ts # fixture-e2e test files +β”‚ └── *.service.e2e.test.ts # service-integration-e2e test files ``` ### Test Isolation @@ -81,6 +99,6 @@ When UI Spec defines responsive behavior, test critical breakpoints: ## Budget Enforcement Hard limits per feature (same as parent skill): -- **E2E Tests**: MAX 1-2 tests -- Only generate if ROI score > 50 -- Prefer fewer, comprehensive journey tests over many granular tests +- **fixture-e2e**: MAX 3 tests, no ROI gate (selected by ranking) +- **service-integration-e2e**: MAX 1-2 tests, ROI > 50 beyond the reserved slot +- Prefer fewer, comprehensive journey tests over many granular tests in both lanes diff --git a/dev-skills/skills/test-implement/references/e2e.md b/dev-skills/skills/test-implement/references/e2e.md index 573f765..47cdbe5 100644 --- a/dev-skills/skills/test-implement/references/e2e.md +++ b/dev-skills/skills/test-implement/references/e2e.md @@ -1,5 +1,16 @@ # E2E Test Implementation with Playwright +## Lane Selection + +E2E tests in this workflow split into two lanes (defined in integration-e2e-testing skill): + +| Lane | Backend setup | Use these patterns | +|------|---------------|-------------------| +| **fixture-e2e** | Mocked via `page.route()` or fixture loaders; no live services | Page Object Pattern, Locator Strategy, Assertions, the **Fixture-Based Backend** section below | +| **service-integration-e2e** | Live local stack with real services | All patterns above PLUS the **E2E Environment Prerequisites** section (seed data, auth fixture against real auth flow) | + +The skeleton's `@lane:` annotation declares which lane the test belongs to. Choose implementation patterns to match. + ## Test Framework - **Playwright Test**: `@playwright/test` - Test imports: `import { test, expect } from '@playwright/test'` @@ -10,18 +21,23 @@ ``` tests/ └── e2e/ - β”œβ”€β”€ pages/ # Page objects + β”œβ”€β”€ pages/ # Page objects (shared across lanes) β”‚ β”œβ”€β”€ login.page.ts β”‚ └── dashboard.page.ts - β”œβ”€β”€ fixtures/ # Test fixtures + β”œβ”€β”€ fixtures/ # Test fixtures (auth, seed) β”‚ └── auth.fixture.ts - └── *.e2e.test.ts # Test files + β”œβ”€β”€ data/ # Static fixture data for fixture-e2e + β”‚ └── *.fixture.json + β”œβ”€β”€ *.fixture.e2e.test.ts # fixture-e2e test files + └── *.service.e2e.test.ts # service-integration-e2e test files ``` ### Naming Conventions -- Test files: `{FeatureName}.e2e.test.ts` +- fixture-e2e files: `{FeatureName}.fixture.e2e.test.ts` +- service-integration-e2e files: `{FeatureName}.service.e2e.test.ts` - Page objects: `{PageName}.page.ts` - Fixtures: `{Purpose}.fixture.ts` +- Static fixture data: `{scenario}.fixture.json` ## Page Object Pattern @@ -102,9 +118,46 @@ export const test = base.extend<{ authenticatedPage: Page }>({ }) ``` -## E2E Environment Prerequisites +## Fixture-Based Backend (fixture-e2e) + +fixture-e2e tests run a real browser against deterministic fixtures β€” no live backend, no DB, no external services. 
Use one of these patterns to fake the network: + +### Pattern A: page.route() interception + +```typescript +test('Dismiss-then-Undo restores card', async ({ page }) => { + // Arrange: intercept all backend calls with deterministic responses + await page.route('**/api/cards', async (route) => { + await route.fulfill({ json: cardsFixture }) + }) + await page.route('**/api/cards/*/dismiss', async (route) => { + await route.fulfill({ status: 204 }) + }) + + await page.goto('/cards') + await page.getByRole('button', { name: 'Dismiss' }).first().click() + await page.getByRole('button', { name: 'Undo' }).click() + + await expect(page.getByText(cardsFixture[0].title)).toBeVisible() +}) +``` + +### Pattern B: Fixture loader injection + +```typescript +// data/cards-with-dismiss.fixture.json β€” committed alongside the test +// Loaded via a route helper or app-level test mode +``` -E2E tests require a running application with real data state. Unlike unit/integration tests, environment setup is part of E2E test implementation scope. +**Principles for fixture-e2e**: +- Backend is faked, not running. No `npm run start:backend` required to execute these tests +- Fixtures are versioned in the repo (`tests/e2e/data/`) so tests are deterministic across machines +- Auth, when needed, is faked too (set a test cookie via `page.context().addCookies()` or use a fixture-mode bypass) +- These tests run in CI without provisioning external infrastructure + +## E2E Environment Prerequisites (service-integration-e2e) + +service-integration-e2e tests require a running application with real data state. Unlike fixture-e2e, environment setup is part of test implementation scope. ### Seed Data Strategy @@ -163,16 +216,16 @@ export const test = base.extend<{ playerPage: Page }>({ - Store test credentials in environment variables only (`E2E_*` prefixed) - If the auth flow requires specific user records, seed them in the fixture -### Environment Checklist +### Environment Checklist (service-integration-e2e only) -Before E2E tests can pass, verify: +Before service-integration-e2e tests can pass, verify: - [ ] Application is running and accessible at `baseURL` - [ ] Database has required seed data (test users, subscriptions, content) - [ ] Authentication flow works with test credentials - [ ] Environment variables are set (`E2E_*` prefixed) -- [ ] External services are either available or mocked via `page.route()` +- [ ] External services are either available or stubbed -When the work plan includes dedicated environment setup tasks (Phase 0), follow those tasks. When no setup tasks exist in the plan, address missing prerequisites as part of the E2E test implementation task itself. +When the work plan includes dedicated environment setup tasks (Phase 0 β€” see work-planner E2E Environment Prerequisites extraction), follow those tasks. When no setup tasks exist in the plan, address missing prerequisites as part of the test implementation task itself, OR consider whether the verification could move to fixture-e2e instead. ## Locator Strategy @@ -235,18 +288,36 @@ test.describe('responsive navigation', () => { ## Skeleton Comment Format -E2E test skeletons follow the same annotation format as integration tests (adapt comment syntax to the project's language): +E2E test skeletons follow the same annotation format as integration tests (adapt comment syntax to the project's language). The `@lane` annotation routes the test to the correct implementation patterns. 
+ +### fixture-e2e example +```typescript +// AC: [Original acceptance criteria text] +// Behavior: [User action] β†’ [System response] β†’ [Observable result in browser] +// @category: fixture-e2e +// @lane: fixture-e2e +// @dependency: full-ui (mocked backend) +// @complexity: medium +// ROI: [score] +test('AC1: [Description]', async ({ page }) => { + // Arrange: load fixture data, intercept network + // Act: user interaction + // Assert: observable browser state +}) +``` +### service-integration-e2e example ```typescript // AC: [Original acceptance criteria text] -// Behavior: [User action] β†’ [System response] β†’ [Observable result] -// @category: e2e +// Behavior: [User action] β†’ [System response across services] β†’ [Observable cross-service result] +// @category: service-integration-e2e +// @lane: service-integration-e2e // @dependency: full-system // @complexity: high // ROI: [score] -test('AC1: [Description]', async ({ page }) => { - // Arrange: [Setup description] - // Act: [Action description] - // Assert: [Verification description] +test('AC1: [Description]', async ({ page, request }) => { + // Arrange: seed real data, real auth + // Act: user interaction + // Assert: observable result + cross-service evidence (DB row, downstream event) }) ``` diff --git a/dev-skills/skills/test-implement/references/frontend.md b/dev-skills/skills/test-implement/references/frontend.md index e605e28..ff3098f 100644 --- a/dev-skills/skills/test-implement/references/frontend.md +++ b/dev-skills/skills/test-implement/references/frontend.md @@ -19,9 +19,15 @@ ### Coverage Requirements (ADR-0002 Compliant) **Component-specific targets**: + +When the project adopts Atomic Design (atoms / molecules / organisms layering): - Atoms (Button, Text, etc.): 70% or higher - Molecules (FormField, etc.): 65% or higher - Organisms (Header, Footer, etc.): 60% or higher + +When the project uses a different component architecture (Feature-based, Container-Presenter, etc.): apply 60% as the baseline and raise the target for foundational/leaf components (those reused across many features) to 70%. + +Component-architecture-independent targets: - Custom Hooks: 65% or higher - Utils: 70% or higher diff --git a/dev-skills/skills/typescript-rules/SKILL.md b/dev-skills/skills/typescript-rules/SKILL.md index 99a6f50..947ab7b 100644 --- a/dev-skills/skills/typescript-rules/SKILL.md +++ b/dev-skills/skills/typescript-rules/SKILL.md @@ -62,7 +62,7 @@ function isUser(value: unknown): value is User { **Component Design Criteria** - **Function components only**: Official React recommendation, optimizable by modern tooling (Exception: Error Boundary requires class component) - **Custom Hooks**: Standard pattern for logic reuse and dependency injection -- **Component Hierarchy**: Atoms β†’ Molecules β†’ Organisms β†’ Templates β†’ Pages +- **Component Hierarchy**: Use the project's adopted component architecture. When the project uses Atomic Design: Atoms β†’ Molecules β†’ Organisms β†’ Templates β†’ Pages. 
When the project uses Feature-based, Container-Presenter, or another structure: follow that structure consistently and document the chosen layering in the project README or design doc - **Co-location**: Place tests, styles, and related files alongside components **State Management Patterns** diff --git a/dev-workflows-frontend/.claude-plugin/plugin.json b/dev-workflows-frontend/.claude-plugin/plugin.json index 5d39771..1d0c511 100644 --- a/dev-workflows-frontend/.claude-plugin/plugin.json +++ b/dev-workflows-frontend/.claude-plugin/plugin.json @@ -1,7 +1,7 @@ { "name": "dev-workflows-frontend", "description": "Skills + Subagents for React/TypeScript - Use skills for coding guidance, or run recipe workflows for full orchestrated agentic coding with specialized agents", - "version": "0.16.17", + "version": "0.17.0", "author": { "name": "Shinsuke Kagawa", "url": "https://github.com/shinpr" diff --git a/dev-workflows-frontend/agents/acceptance-test-generator.md b/dev-workflows-frontend/agents/acceptance-test-generator.md index 2916939..7eba9ef 100644 --- a/dev-workflows-frontend/agents/acceptance-test-generator.md +++ b/dev-workflows-frontend/agents/acceptance-test-generator.md @@ -37,7 +37,8 @@ Test type definitions, budgets, and ROI calculations are specified in **integrat Key points: - **Integration Tests**: MAX 3 per feature, created alongside implementation -- **E2E Tests**: MAX 1-2 per feature, executed in final phase only +- **fixture-e2e**: MAX 3 per feature, created alongside the UI feature phase, ROI β‰₯ 20 beyond reserved slot +- **service-integration-e2e**: MAX 1-2 per feature, executed only in the final phase, ROI > 50 beyond reserved slot ## 4-Phase Generation Process @@ -103,9 +104,9 @@ For each valid AC from Phase 1: **Output**: Candidate pool with ROI metadata -### Phase 3: ROI-Based Selection (Two-Pass #2) +### Phase 3: ROI-Based Selection and Lane Assignment (Two-Pass #2) -ROI calculation formula and cost table are defined in **integration-e2e-testing skill**. +ROI calculation formula and cost table are defined in **integration-e2e-testing skill**. Lane definitions and selection rules are also in that skill. **Selection Algorithm**: @@ -118,34 +119,50 @@ ROI calculation formula and cost table are defined in **integration-e2e-testing 3. **Push-Down Analysis**: ``` Can this be unit-tested? β†’ Remove from integration/E2E pool - Already integration-tested? β†’ Keep as E2E candidate IF part of multi-step user journey (see definition in integration-e2e-testing skill) - Already integration-tested AND NOT part of multi-step journey? β†’ Remove from E2E pool + Already integration-tested AND verifiable in-process? β†’ Remove from E2E pool ``` -4. **Sort by ROI** (descending order) +4. **Lane assignment** (E2E candidates only): + - Default to `fixture-e2e` for any UI journey verifiable with mocked backend / fixture-driven state + - Promote to `service-integration-e2e` only when the verification depends on real cross-service behavior. A candidate qualifies for `service-integration-e2e` when ANY of the following must be asserted: + - Data persists across a real DB write (e.g., row inserted/updated in the actual database under test) + - A downstream service receives a real event/message (e.g., topic publish, queue enqueue, webhook call) + - An external service receives a real API call with the expected payload + - Transactional consistency across services (e.g., two-phase commit, saga compensation) +5. 
**Sort by ROI** within each lane (descending) β€” this is the single ranking step; Phase 4 budget enforcement consumes this ranked list directly without re-sorting. -**Output**: Ranked, deduplicated candidate list +**Output**: Ranked, deduplicated candidate list with lane assigned per E2E candidate. ### Phase 4: Budget Enforcement **Hard Limits per Feature**: - **Integration Tests**: MAX 3 tests -- **E2E Tests**: MAX 1-2 tests total, composed of: - - 1 reserved slot (emitted regardless of ROI) when feature contains a **user-facing** multi-step user journey (see definition and classification in integration-e2e-testing skill) +- **fixture-e2e**: MAX 3 tests, no ROI gate. When the feature contains a **user-facing** multi-step user journey, the highest-ROI journey candidate is reserved (emitted regardless of ranking) +- **service-integration-e2e**: MAX 1-2 tests, composed of: + - 1 reserved slot (emitted regardless of ROI) when the journey's correctness depends on real cross-service behavior that fixture-e2e cannot verify - Up to 1 additional slot requiring ROI > 50 **Selection Algorithm**: ``` -1. Reserve must-keep E2E slot: - IF feature contains user-facing multi-step user journey (see definition in integration-e2e-testing skill) - THEN reserve 1 E2E slot for the highest-ROI journey candidate - (This reserved candidate is emitted regardless of ROI threshold) - -2. Sort remaining candidates by ROI (descending) - -3. Select top N within budget: +1. Reserve fixture-e2e slot: + IF feature contains user-facing multi-step user journey + THEN reserve 1 fixture-e2e slot for the highest-ROI journey candidate + +2. Reserve service-integration-e2e slot (only if needed): + IF the reserved journey's verification requires ANY of: + - data persists across a real DB write + - downstream service receives a real event/message + - external service receives a real API call with expected payload + - transactional consistency across services + THEN reserve 1 service-integration-e2e slot for that journey + +3. Walk the candidate list (already sorted by ROI within each lane in Phase 3 step 5) + and select within budget: - Integration: Pick top 3 highest-ROI - - E2E (additional beyond reserved): Pick up to 1 more IF ROI score > 50 + - fixture-e2e (additional beyond reserved): Pick up to remaining budget IF ROI β‰₯ 20 + - service-integration-e2e (additional beyond reserved): Pick up to 1 more IF ROI > 50 + + Leave budget intentionally unfilled when no remaining candidate clears the lane's threshold. ``` **Output**: Final test set @@ -180,81 +197,109 @@ The examples below use `//` comment syntax. 
Adapt to the project's language (e.g
   [Test: 'AC1: Failed payment displays error without creating order']
 ```
 
-### E2E Test File
+### fixture-e2e Test File
+
+```
+// [Feature Name] fixture-e2e Test - Design Doc: [filename]
+// Generated: [date] | Budget Used: 1/3 fixture-e2e
+// Test Type: Browser-level UI verification with mocked backend / fixture-driven state
+// Implementation Timing: Alongside the UI feature implementation
+
+[Import statement using detected test framework]
+
+[Test suite using detected framework syntax]
+  // User Journey: Click Dismiss → card disappears → undo banner appears → Undo restores card
+  // ROI: 50 (BV:6 × Freq:7 + Legal:0 + Defect:8) | reserved slot: user-facing multi-step journey
+  // Verification: UI state transitions are observable in the browser
+  // @category: fixture-e2e
+  // @lane: fixture-e2e
+  // @dependency: full-ui (mocked backend)
+  // @complexity: medium
+  [Test: 'User Journey: Dismiss-then-Undo restores the card to its original state']
+```
+
+### service-integration-e2e Test File
 
 ```
-// [Feature Name] E2E Test - Design Doc: [filename]
-// Generated: [date] | Budget Used: 1/2 E2E
-// Test Type: End-to-End Test
-// Implementation Timing: After all feature implementations complete
+// [Feature Name] service-integration-e2e Test - Design Doc: [filename]
+// Generated: [date] | Budget Used: 1/2 service-integration-e2e
+// Test Type: End-to-end against running local stack
+// Implementation Timing: Executed only in the final phase
 
 [Import statement using detected test framework]
 
 [Test suite using detected framework syntax]
-  // User Journey: Complete purchase flow (browse → add to cart → checkout → payment → confirmation)
-  // ROI: 119 (BV:10 × Freq:10 + Legal:10 + Defect:9) | reserved slot: multi-step journey
-  // Verification: End-to-end user experience from product selection to order confirmation
-  // @category: e2e
+  // User Journey: Complete purchase flow (browse → checkout → payment → confirmation persisted in DB)
+  // ROI: 119 (BV:10 × Freq:10 + Legal:10 + Defect:9) | reserved slot: cross-service correctness
+  // Verification: Order persists in DB and confirmation event reaches downstream consumer
+  // @category: service-integration-e2e
+  // @lane: service-integration-e2e
   // @dependency: full-system
   // @complexity: high
-  [Test: 'User Journey: Complete product purchase from browse to confirmation email']
+  [Test: 'User Journey: Complete purchase persists order and emits confirmation event']
 ```
 
 ### Generation Report
 
-**When E2E tests are emitted:**
+**When both E2E lanes are emitted:**
 ```json
 {
   "status": "completed",
   "feature": "payment",
   "generatedFiles": {
     "integration": "tests/payment.int.test.[ext]",
-    "e2e": "tests/payment.e2e.test.[ext]"
+    "fixtureE2e": "tests/payment.fixture.e2e.test.[ext]",
+    "serviceE2e": "tests/payment.service.e2e.test.[ext]"
   },
-  "budgetUsage": { "integration": "2/3", "e2e": "1/2" },
-  "e2eAbsenceReason": null
+  "budgetUsage": { "integration": "2/3", "fixtureE2e": "1/3", "serviceE2e": "1/2" },
+  "e2eAbsenceReason": { "fixtureE2e": null, "serviceE2e": null }
 }
 ```
 
-**When no E2E tests are emitted:**
+**When only fixture-e2e is emitted:**
```json
 {
   "status": "completed",
   "feature": "payment",
   "generatedFiles": {
     "integration": "tests/payment.int.test.[ext]",
-    "e2e": null
+    "fixtureE2e": "tests/payment.fixture.e2e.test.[ext]",
+    "serviceE2e": null
   },
-  "budgetUsage": { "integration": "2/3", "e2e": "0/2" },
-  "e2eAbsenceReason": "no_multi_step_journey"
+  "budgetUsage": { "integration": "2/3",
"fixtureE2e": "2/3", "serviceE2e": "0/2" }, + "e2eAbsenceReason": { "fixtureE2e": null, "serviceE2e": "no_real_service_dependency" } } ``` -**When no integration tests are emitted:** +**When no E2E tests are emitted:** ```json { "status": "completed", "feature": "config-update", "generatedFiles": { - "integration": null, - "e2e": null + "integration": "tests/config.int.test.[ext]", + "fixtureE2e": null, + "serviceE2e": null }, - "budgetUsage": { "integration": "0/3", "e2e": "0/2" }, - "e2eAbsenceReason": "no_multi_step_journey" + "budgetUsage": { "integration": "1/3", "fixtureE2e": "0/3", "serviceE2e": "0/2" }, + "e2eAbsenceReason": { "fixtureE2e": "no_multi_step_journey", "serviceE2e": "no_multi_step_journey" } } ``` -**Contract**: Both `generatedFiles.integration` and `generatedFiles.e2e` are always present as keys. Value is a file path string when generated, `null` when not generated. `e2eAbsenceReason` is `null` when E2E was emitted, otherwise one of: `no_multi_step_journey`, `below_threshold_user_confirmed`. +**Contract**: +- `generatedFiles.integration`, `generatedFiles.fixtureE2e`, `generatedFiles.serviceE2e` are always present as keys. Value is a file path string when generated, `null` when not. +- `e2eAbsenceReason` is an object with `fixtureE2e` and `serviceE2e` keys. Each value is `null` when that lane emitted, otherwise one of: `no_multi_step_journey`, `below_threshold_user_confirmed`, `no_real_service_dependency` (service-integration-e2e only β€” meaning the journey is verifiable in fixture-e2e). ## Test Meta Information Assignment Each test case MUST have the following standard annotations for test implementation planning: -- **@category**: core-functionality | integration | edge-case | ux -- **@dependency**: none | [component names] | full-system +- **@category**: core-functionality | integration | edge-case | ux | fixture-e2e | service-integration-e2e +- **@lane**: integration | fixture-e2e | service-integration-e2e +- **@dependency**: none | [component names] | full-ui (mocked backend) | full-system - **@complexity**: low | medium | high -These annotations are used when planning and prioritizing test implementation. +These annotations are used when planning and prioritizing test implementation. The `@lane` annotation is the source of truth for budget accounting and CI gating. 
## Constraints and Quality Standards diff --git a/dev-workflows-frontend/agents/quality-fixer-frontend.md b/dev-workflows-frontend/agents/quality-fixer-frontend.md index fde3f68..737f385 100644 --- a/dev-workflows-frontend/agents/quality-fixer-frontend.md +++ b/dev-workflows-frontend/agents/quality-fixer-frontend.md @@ -103,9 +103,11 @@ Return one of the following as the final response (see Output Format for schemas ### Testing Quality (React Testing Library) - **Test Coverage**: Minimum 60% coverage for frontend code - - Atoms: 70% target - - Molecules: 65% target - - Organisms: 60% target + - When the project adopts Atomic Design (atoms / molecules / organisms layering): + - Atoms: 70% target + - Molecules: 65% target + - Organisms: 60% target + - When the project uses a different component architecture (Feature-based, Container-Presenter, etc.): apply 60% as the baseline and raise the target for foundational/leaf components (those reused across many features) to 70% - **User-Observable Behavior**: Test what users see and interact with - **MSW for API Mocking**: Use Mock Service Worker for API mocking - **Test Behavior Over Internals**: Test observable behavior and outputs, not internal state diff --git a/dev-workflows-frontend/agents/task-decomposer.md b/dev-workflows-frontend/agents/task-decomposer.md index acbf5cc..aeff0d1 100644 --- a/dev-workflows-frontend/agents/task-decomposer.md +++ b/dev-workflows-frontend/agents/task-decomposer.md @@ -104,8 +104,12 @@ Decompose tasks based on implementation strategy patterns determined in implemen |---|---| | Existing code modification | The existing implementation files being modified, their tests, related Design Doc sections | | New component/feature | Adjacent implementations in the same layer/domain, Design Doc interface contracts | + | Frontend component implementation | UI Spec component section (use the section heading the work plan's UI Spec Component β†’ Task Mapping cites), Design Doc interface contracts, adjacent components in the same layer | + | Frontend integration / fixture-e2e test | UI Spec component section including the State x Display Matrix and Interaction Definition tables, the implemented component code, fixture data files | | Test implementation | Test skeleton comments/annotations, the target code being tested, actual API/auth flows | - | E2E environment setup | Current environment config (startup scripts, docker-compose or equivalent), seed scripts, existing fixture patterns, application auth flow | + | fixture-e2e environment setup | Existing fixture data files, the API mock layer the project uses (e.g., MSW for JS/TS, WireMock for JVM, responses for Python), browser harness configuration (Playwright by default) | + | service-integration-e2e environment setup | Local startup scripts (docker-compose or equivalent), seed scripts, application auth flow, external service stubs | + | Cross-package boundary implementation | Both sides of the boundary as listed in the work plan's Connection Map (owner modules and expected signal), the contract definition between them | | Bug fix / refactor | The affected code paths, related test coverage, error reproduction context | | Behavior replacement / rewrite | The existing implementation being replaced, its observable outputs, Design Doc Verification Strategy section | @@ -115,6 +119,8 @@ Decompose tasks based on implementation strategy patterns determined in implemen - Be specific with file paths: `src/orders/checkout`, `docs/design/payment.md` β€” not "the order module" or "related 
code" - When the target is a section within a file, write the file path and add a search hint: `docs/design/payment.md (Β§ Payment Flow)` or `src/orders/checkout (processOrder function)` - When test skeletons exist for the task, always include them as Investigation Targets + - When the work plan contains a UI Spec Component β†’ Task Mapping table, propagate the matching component section to every task in that row (see UI Spec Propagation below) + - When the work plan contains a Connection Map, propagate the boundary rows touching this task's target files (see Connection Map Propagation below) 7. **Implementation Pattern Consistency** When including implementation samples, MUST ensure strict compliance with the Design Doc implementation approach that forms the basis of the work plan @@ -144,6 +150,29 @@ When the work plan header includes a Quality Assurance Mechanisms table, propaga 3. **Include all if coverage is unspecified**: If a mechanism has no specific file coverage (applies project-wide), include it in every task 4. **Omit when no match**: If no mechanisms match a task's target files, omit the "Quality Assurance Mechanisms" section from that task +## UI Spec Propagation + +When the work plan contains a UI Spec Component β†’ Task Mapping table, propagate component references to each implementation task as follows: + +1. **Lookup by task ID**: For each row in the mapping table, locate the task(s) listed in the "Covered By Task(s)" column +2. **Append a single line to Investigation Targets**: Add one line per matched component in the task's Investigation Targets section. The line format is `[ui-spec path] (Β§ [component heading])`, where `` is appended only when the row lists specific states. + + - When no states are listed: `docs/ui-spec/foo-ui-spec.md (Β§ Component: AlertCard)` + - When states are listed: `docs/ui-spec/foo-ui-spec.md (Β§ Component: AlertCard β€” verify default + loading + error states)` + + This is the entire entry β€” do not also add a separate parenthetical line. The state hint is part of the same line. +3. **One row β†’ one or more tasks**: A component can be split across multiple tasks; propagate the same line to each +4. **Skip when not provided**: If the work plan has no UI Spec Component β†’ Task Mapping table, skip this propagation step + +## Connection Map Propagation + +When the work plan contains a Connection Map table, propagate boundary context to each implementation task as follows: + +1. **Lookup by task ID**: For each row in the Connection Map, locate the task(s) listed in the "Covered By Task(s)" column +2. **Append to Investigation Targets**: Add the boundary's owner module file paths on both sides to each matched task's Investigation Targets +3. **Add a "Boundary Context" note in the task body**: Record the boundary identifier and expected signal verbatim from the Connection Map row, so the executor knows what observable evidence the implementation must produce +4. **Skip when not provided**: If the work plan has no Connection Map, skip this propagation step + ## Task File Template See task template in documentation-criteria skill for details. @@ -243,6 +272,8 @@ Please execute decomposed tasks according to the order. 
- [ ] Appropriate granularity (1-5 files/task) - [ ] Investigation Targets specified for every task (specific file paths, not vague categories) - [ ] Quality Assurance Mechanisms from work plan header propagated to relevant tasks +- [ ] UI Spec Component β†’ Task Mapping rows propagated to matching tasks (when work plan has the table) +- [ ] Connection Map boundary rows propagated to matching tasks (when work plan has the table) - [ ] Clear completion criteria setting - [ ] Overall design document creation - [ ] Implementation efficiency and rework prevention (pre-identification of common processing, clarification of impact scope) diff --git a/dev-workflows-frontend/agents/task-executor-frontend.md b/dev-workflows-frontend/agents/task-executor-frontend.md index 84b4605..7a521fa 100644 --- a/dev-workflows-frontend/agents/task-executor-frontend.md +++ b/dev-workflows-frontend/agents/task-executor-frontend.md @@ -55,7 +55,7 @@ Use the appropriate run command based on the `packageManager` field in package.j ### Step1: Design Deviation Check (Any YES β†’ Immediate Escalation) β–‘ Interface definition change needed? (Props type/structure/name changes) -β–‘ Component hierarchy violation needed? (e.g., Atomβ†’Organism direct dependency) +β–‘ Component hierarchy violation needed? (e.g., skipping a layer in the project's adopted architecture β€” Atomβ†’Organism in Atomic Design, leafβ†’container in Container-Presenter, etc.) β–‘ Data flow direction reversal needed? (e.g., child component updating parent state without callback) β–‘ New external library/API addition needed? β–‘ Need to ignore type definitions in Design Doc? diff --git a/dev-workflows-frontend/agents/technical-designer-frontend.md b/dev-workflows-frontend/agents/technical-designer-frontend.md index b42988f..f0bc862 100644 --- a/dev-workflows-frontend/agents/technical-designer-frontend.md +++ b/dev-workflows-frontend/agents/technical-designer-frontend.md @@ -119,7 +119,7 @@ Must be performed when creating Design Doc: 1. **Approach Selection Criteria** - Execute Phase 1-4 of implementation-approach skill to select strategy - **Vertical Slice**: Complete by feature unit, minimal component dependencies, early value delivery - - **Horizontal Slice**: Implementation by component layer (Atomsβ†’Moleculesβ†’Organisms), important common components, design consistency priority + - **Horizontal Slice**: Implementation by component layer (e.g., Atomsβ†’Moleculesβ†’Organisms when Atomic Design is adopted; otherwise the project's foundationalβ†’composite layering), important common components, design consistency priority - **Hybrid**: Composite, handles complex requirements - Document selection reason (record results of metacognitive strategy selection process) @@ -326,7 +326,7 @@ class Button extends React.Component { **Design Doc**: Component hierarchy diagram and data flow diagram are mandatory. Add state transition diagram and sequence diagram for complex cases. 
**React Diagrams**: -- Component hierarchy (Atoms β†’ Molecules β†’ Organisms β†’ Templates β†’ Pages) +- Component hierarchy following the project's adopted architecture (e.g., Atoms β†’ Molecules β†’ Organisms β†’ Templates β†’ Pages for Atomic Design; feature-folder tree for Feature-based; container vs presenter split for Container-Presenter) - Props flow diagram (parent β†’ child data flow) - State management diagram (Context, custom hooks) - User interaction flow (click β†’ state update β†’ re-render) diff --git a/dev-workflows-frontend/agents/ui-spec-designer.md b/dev-workflows-frontend/agents/ui-spec-designer.md index d4c997d..cad3a10 100644 --- a/dev-workflows-frontend/agents/ui-spec-designer.md +++ b/dev-workflows-frontend/agents/ui-spec-designer.md @@ -103,6 +103,8 @@ Execute file output immediately (considered approved at execution). - [ ] If prototype provided: prototype is placed in `docs/ui-spec/assets/` - [ ] All TBDs in Open Items have owner and deadline - [ ] All UI Spec requirements align with PRD requirements +- [ ] **Component heading uniqueness**: Every component is documented under a section heading whose text is unique within this UI Spec. Use the format `## Component: [ComponentName]` (or `### Component: [ComponentName]` when nested under a screen). Downstream agents (work-planner Step 5a, task-decomposer UI Spec Propagation) reference components by exact heading text β€” duplicate or paraphrased headings break the propagation chain. + - **Disambiguation rule**: When two components share a base name (e.g., the same `AlertCard` rendered as a banner variant and as an inline variant), append a parenthetical qualifier to make each heading unique: `Component: AlertCard (Banner variant)` and `Component: AlertCard (Inline variant)`. Verify uniqueness with a final pass: extract all `Component: ` headings, confirm zero duplicates ## Important Design Principles diff --git a/dev-workflows-frontend/agents/work-planner.md b/dev-workflows-frontend/agents/work-planner.md index 1de8715..9cef86a 100644 --- a/dev-workflows-frontend/agents/work-planner.md +++ b/dev-workflows-frontend/agents/work-planner.md @@ -38,22 +38,40 @@ Choose Strategy A (TDD) if test skeletons are provided, Strategy B (implementati - Final phase is always Quality Assurance **E2E Gap Check (all strategies)**: -After determining which test skeletons are available, check whether E2E skeletons are absent. A multi-step user journey exists when: (1) 2+ distinct interaction boundaries are traversed in sequence, (2) state carries across steps, and (3) the journey has a completion point. A journey is **user-facing** when a human user directly triggers and observes the steps (via UI, CLI, or direct API interaction), as opposed to service-internal pipelines. +After determining which test skeletons are available, check the two E2E lanes (fixture-e2e, service-integration-e2e β€” see integration-e2e-testing skill) independently. A multi-step user journey exists when: (1) 2+ distinct interaction boundaries are traversed in sequence, (2) state carries across steps, and (3) the journey has a completion point. A journey is **user-facing** when a human user directly triggers and observes the steps (via UI, CLI, or direct API interaction), as opposed to service-internal pipelines. 
``` -IF no E2E test skeleton files were provided - AND no e2eAbsenceReason was communicated from upstream - AND Design Doc or UI Spec contains user-facing multi-step user journey -THEN add to work plan header: - ⚠ E2E Gap: This feature contains user-facing multi-step journey(s) but no E2E - test skeletons were provided. Consider running the test skeleton generation - step to evaluate E2E test candidates before final phase. - Detected journeys: [list journey descriptions and AC references] +fixture-e2e gap: + IF no fixture-e2e skeleton was provided + AND (e2eAbsenceReason.fixtureE2e is null + OR e2eAbsenceReason.fixtureE2e was not communicated) + AND Design Doc or UI Spec contains user-facing multi-step user journey + THEN add to work plan header: + ⚠ fixture-e2e Gap: This feature contains user-facing multi-step journey(s) + but no fixture-e2e skeleton was provided. Consider running the test + skeleton generation step to evaluate fixture-e2e candidates before the + UI implementation phase. + Detected journeys: [list journey descriptions and AC references] + +service-integration-e2e gap: + IF no service-integration-e2e skeleton was provided + AND (e2eAbsenceReason.serviceE2e is null + OR e2eAbsenceReason.serviceE2e was not communicated) + AND Design Doc indicates the journey requires real cross-service + verification (data persistence across services, transactional + consistency, external service contract) + THEN add to work plan header: + ⚠ service-integration-e2e Gap: This feature crosses service boundaries + where correctness depends on real cross-service behavior, but no + service-integration-e2e skeleton was provided. + Detected boundaries: [list crossings and AC references] ``` -When an `e2eAbsenceReason` is provided (e.g., `no_multi_step_journey`, `below_threshold_user_confirmed`), E2E absence is intentional β€” skip this gap check. +The "was not communicated" branch covers the scenario where the upstream planning flow skipped test skeleton generation entirely β€” in that case the absence reason field is not even passed to work-planner, so the gap check still runs. -This check applies regardless of whether Strategy A or B was selected. Integration-only skeletons being provided does not imply E2E coverage. Service-internal journeys (async pipelines, service-to-service sagas) are not flagged here β€” they may still warrant E2E through the normal ROI path. +When an `e2eAbsenceReason` for a lane carries a value (e.g., `no_multi_step_journey`, `below_threshold_user_confirmed`, `no_real_service_dependency`), absence in that lane is intentional β€” skip the gap check for that lane. + +This check applies regardless of whether Strategy A or B was selected. Integration-only skeletons being provided does not imply E2E coverage. Service-internal journeys (async pipelines, service-to-service sagas) are not flagged for the reserved-slot rule but may still warrant service-integration-e2e through the normal ROI path. **Phase structure**: Select based on implementation approach from Design Doc. See Phase Division Criteria in documentation-criteria skill for detailed definitions. Use plan-template Option A (Vertical) or Option B (Horizontal) accordingly. @@ -73,12 +91,47 @@ Map each extracted item to a covering task. Items may be covered by a dedicated Record the mapping in the Design-to-Plan Traceability table (see plan template). If an item has no covering task, set Gap Status to `gap` with justification in Notes. Gaps with justification require user confirmation before plan approval. +### 5a. 
Map UI Spec Components to Tasks (when UI Spec provided) + +When a UI Spec is among the inputs, also map components and states to the tasks that implement them. task-decomposer reads this mapping in Step 6 to populate each task's Investigation Targets, so without this step the UI Spec never reaches the executor. + +For each component documented in the UI Spec: +1. Identify the component's section heading exactly as it appears in the UI Spec (the heading is the reference key β€” see ui-spec-designer's heading uniqueness rule) +2. Identify which states (default / loading / empty / error / partial) the implementation must cover +3. Identify the task(s) in this plan that implement the component or its tests + +Record the mapping in the **UI Spec Component β†’ Task Mapping** table (see plan template). One row per component. Components with no covering task are flagged as `gap` requiring user confirmation, identical to the Design-to-Plan Traceability rule. + +### 5b. Map Cross-Package Boundaries to Tasks (when implementation crosses runtime/deployment boundaries) + +When the implementation crosses a runtime or deployment boundary, build a Connection Map so task-decomposer can propagate boundary context to each affected task. + +**A boundary qualifies for the Connection Map only when ALL of the following hold**: +- The two sides run in separate processes, services, or runtimes (e.g., web client ↔ HTTP server, service A ↔ service B over a network, frontend bundle ↔ backend handler) +- A serialized contract crosses between them (HTTP request/response, message envelope, RPC call, event payload) +- A failure on one side produces an observable signal on the other (status code, missing field, timeout, dropped message) + +**Excluded β€” these are NOT boundaries for the Connection Map**: +- A package importing a sibling utility, type definition, or shared constant from the same monorepo (in-process, no serialized contract) +- Internal layering within the same runtime (e.g., handler β†’ usecase β†’ repository) +- Source code dependencies that compile/bundle into the same artifact + +For each qualifying boundary: +1. Identify the boundary (e.g., `web β†’ API gateway`, `service-A β†’ service-B`, `frontend β†’ shared client β†’ backend handler`) +2. Identify the owner module/package on each side +3. Identify the expected signal that confirms the boundary works (e.g., HTTP 200 with schema X, message published to topic Y, row inserted in table Z) +4. Identify the task(s) that implement either side of the boundary + +Record the mapping in the **Connection Map** table (see plan template). Omit this section entirely when no qualifying boundary exists. + ### 6. Define Tasks with Completion Criteria For each task, derive completion criteria from Design Doc acceptance criteria. Apply the 3-element completion definition (Implementation Complete, Quality Complete, Integration Complete). ### 7. Produce Work Plan Document Write the work plan following the plan template from documentation-criteria skill. Include Phase Structure Diagram and Task Dependency Diagram (mermaid). +The plan header MUST include the line `Implementation Readiness: pending`. The marker contract: it takes one of three values β€” `pending` (initial, set here by work-planner), `ready` (verification completed with no remaining gaps), or `escalated` (verification completed with remaining gaps). 
The producer that promotes the marker beyond `pending` and the consumer that reads it before execution are external orchestration concerns owned outside this agent. + ## Input Parameters - **mode**: `create` (default) | `update` @@ -129,7 +182,8 @@ Create Red state tests based on unit test definitions provided from previous pro **Test Implementation Timing and Placement**: - Unit tests: Phase 0 Red β†’ Green during implementation - Integration tests: Create and execute at completion of relevant feature implementation (include in phase tasks like "[Feature name] implementation with integration test creation") -- E2E tests: Execute only in final phase (execution only, no separate implementation needed) +- fixture-e2e tests: Create and execute alongside the UI feature phase (include in phase tasks like "[Feature name] UI implementation with fixture-e2e creation"). These run in CI without infrastructure setup +- service-integration-e2e tests: Execute only in the final phase (these depend on local stack and tend to be too slow/heavy for per-task cycles) #### Meta Information Utilization Analyze meta information (@category, @dependency, @complexity, etc.) included in test definitions, @@ -166,22 +220,29 @@ Read test skeleton files (integration tests, E2E tests) with the Read tool and e #### Step 3: Extract Environment Prerequisites from E2E Skeletons -When E2E test skeletons are provided, scan for environment prerequisites in two stages: +When E2E test skeletons are provided, scan for environment prerequisites in two stages. Apply the lane-aware rules below β€” fixture-e2e and service-integration-e2e have very different prerequisite shapes. -**Stage 1: Detect precondition patterns** β€” scan all E2E skeletons and list every detected precondition: -- `Preconditions:` or `Arrange:` comment annotations mentioning seed data, test users, subscriptions, or specific DB state -- `@dependency: full-system` combined with auth/login setup code +**Stage 1: Detect precondition patterns** β€” scan each E2E skeleton (read its `@lane` header to know which lane applies) and list every detected precondition: +- `Preconditions:` or `Arrange:` comment annotations mentioning seed data, test users, fixtures, or specific UI/DB state +- `@dependency: full-ui (mocked backend)` combined with fixture loaders or API mock handlers (e.g., MSW for JS/TS, route interception in the project's browser harness β€” fixture-e2e) +- `@dependency: full-system` combined with auth/login setup code (service-integration-e2e) - References to environment variables (`E2E_*`, `TEST_*`) -- External service references requiring HTTP mock/intercept patterns in test code +- External service references requiring HTTP mock/intercept patterns + +**Stage 2: Generate setup tasks** β€” for each detected precondition, create a corresponding Phase 0 task. Common categories by lane: + +For **fixture-e2e**: +- **Fixture data** β†’ "Create fixture data files for [feature] UI states" +- **Mock backend** β†’ "Configure API mock layer for fixture-e2e (e.g., MSW for JS/TS, WireMock for JVM, responses for Python β€” use the project's standard)" +- **Browser harness** β†’ "Set up the browser harness for fixture-e2e (Playwright by default; no live services required)" -**Stage 2: Generate setup tasks** β€” for each detected precondition, create a corresponding Phase 0 task. 
Common categories include: -- **Seed data** β†’ "Create E2E seed data script (test users, required records)" -- **Auth fixture** β†’ "Implement E2E auth fixture using application's login flow" -- **External service mocks** β†’ "Configure external service mocks for E2E tests" -- **Environment configuration** β†’ "Define E2E environment variables and document setup" -- **Other detected preconditions** β†’ Create a setup task matching the detected category +For **service-integration-e2e**: +- **Seed data** β†’ "Create seed data script for service-integration-e2e (test users, required records)" +- **Auth fixture** β†’ "Implement auth fixture using application's login flow" +- **External service stubs** β†’ "Configure external service stubs for service-integration-e2e" +- **Environment configuration** β†’ "Define service-integration-e2e environment variables and document local startup" -Place all environment setup tasks in Phase 0 (before any implementation tasks). Mark with `@category: e2e-setup` for traceability. +Place all environment setup tasks in Phase 0 (before any implementation tasks). Mark with `@category: e2e-setup` and `@lane:` matching the target lane for traceability. #### Step 4: Classify and Place Tests @@ -189,7 +250,8 @@ Place all environment setup tasks in Phase 0 (before any implementation tasks). - Setup items (Mock preparation, measurement tools, Helpers, etc.) β†’ Prioritize in Phase 1 - Unit tests (individual functions) β†’ Start from Phase 0 with Red-Green-Refactor - Integration tests β†’ Place as create/execute tasks when relevant feature implementation is complete -- E2E tests β†’ Place as execute-only tasks in final phase +- fixture-e2e tests β†’ Place as create/execute tasks alongside the relevant UI feature implementation +- service-integration-e2e tests β†’ Place as execute-only tasks in final phase - Non-functional requirement tests (performance, UX, etc.) β†’ Place in quality assurance phase - Risk levels ("high risk", "required", etc.) 
β†’ Move to earlier phases @@ -234,6 +296,12 @@ When creating work plans, **Phase Structure Diagrams** and **Task Dependency Dia - [ ] Design-to-Plan Traceability table complete (all DD technical requirements categorized and mapped) - [ ] No `gap` entries without justification - [ ] All justified `gap` entries flagged for user confirmation before plan approval +- [ ] UI Spec Component β†’ Task Mapping table complete (when UI Spec provided) + - [ ] Every UI Spec component has a covering task, OR an explicit `gap` justification + - [ ] Component reference uses the UI Spec section heading exactly as it appears in the document +- [ ] Connection Map table complete (when implementation crosses packages/services) + - [ ] Every boundary lists owner modules and expected signal + - [ ] Every boundary maps to at least one covering task on each side - [ ] Verification Strategy extracted from Design Doc and included in plan header - [ ] Adopted Quality Assurance Mechanisms extracted from Design Doc and included in plan header - [ ] Phase structure matches implementation approach (vertical β†’ value unit phases, horizontal β†’ layer phases) @@ -242,7 +310,8 @@ When creating work plans, **Phase Structure Diagrams** and **Task Dependency Dia - [ ] Quality assurance exists in final phase - [ ] Test skeleton file paths listed in corresponding phases (when provided) - [ ] E2E environment prerequisites addressed (when E2E skeletons provided) - - [ ] Seed data, auth fixture, and external service mock tasks generated + - [ ] fixture-e2e prerequisites: fixture data, mocked backend, browser harness tasks generated when applicable + - [ ] service-integration-e2e prerequisites: seed data, auth fixture, external service stub tasks generated when applicable - [ ] Environment setup tasks placed in Phase 0 - [ ] Test design information reflected (only when provided) - [ ] Setup tasks placed in first phase diff --git a/dev-workflows-frontend/skills/documentation-criteria/references/plan-template.md b/dev-workflows-frontend/skills/documentation-criteria/references/plan-template.md index 1120318..084ec79 100644 --- a/dev-workflows-frontend/skills/documentation-criteria/references/plan-template.md +++ b/dev-workflows-frontend/skills/documentation-criteria/references/plan-template.md @@ -5,6 +5,7 @@ Type: feature|fix|refactor Estimated Duration: X days Estimated Impact: X files Related Issue/PR: #XXX (if any) +Implementation Readiness: pending ## Related Documents - Design Doc(s): @@ -46,6 +47,26 @@ Maps each Design Doc technical requirement to the covering task(s). One row per **Gap Status values**: `covered` (task exists), `gap` (no task β€” requires justification in Notes, user confirmation required before plan approval) +## UI Spec Component β†’ Task Mapping + +Include this section when a UI Spec is among the inputs. Maps each component documented in the UI Spec to the task(s) that implement it. task-decomposer reads this table to populate each task's Investigation Targets with the corresponding UI Spec section. Omit the section when no UI Spec exists. + +| UI Spec Component (section heading) | States to Cover | Covered By Task(s) | Gap Status | Notes | +|---|---|---|---|---| +| [Use the UI Spec heading exactly as written, e.g., "Β§ Component: AlertCard"] | [default / loading / empty / error / partial β€” list the states the implementation must produce] | [Phase X Task Y] | covered | | + +**Reference key rule**: The component identifier in column 1 is the UI Spec section heading (verbatim). 
ui-spec-designer enforces unique component headings so this reference resolves to exactly one section. + +**Gap Status values**: `covered` (task exists), `gap` (no task β€” requires justification in Notes, user confirmation required before plan approval) + +## Connection Map + +Include this section when the implementation crosses more than one package, service, or process boundary. Document each boundary so task-decomposer can propagate boundary context to the implementation tasks on each side. Omit the section when the implementation stays within a single package. + +| Boundary | Owner (left side) | Owner (right side) | Expected Signal | Covered By Task(s) | +|---|---|---|---|---| +| [e.g., "web client β†’ API gateway"] | [module/package on the request side] | [module/package on the response side] | [Observable evidence the boundary works β€” e.g., "HTTP 200 with response matching ContractA", "row inserted in tableB", "message published to topicC"] | [Phase X Task Y on each side] | + ## Objective [Why this change is necessary, what problem it solves] diff --git a/dev-workflows-frontend/skills/documentation-criteria/references/ui-spec-template.md b/dev-workflows-frontend/skills/documentation-criteria/references/ui-spec-template.md index b66681b..9b8b60d 100644 --- a/dev-workflows-frontend/skills/documentation-criteria/references/ui-spec-template.md +++ b/dev-workflows-frontend/skills/documentation-criteria/references/ui-spec-template.md @@ -59,6 +59,8 @@ Map PRD acceptance criteria to prototype references. Skip this section if no pro ### Component: [ComponentName] +> Component heading uniqueness: every `Component: [ComponentName]` heading must be unique within this UI Spec. work-planner and task-decomposer reference components by exact heading text β€” duplicate names or paraphrased headings break the propagation to implementation tasks. + #### State x Display Matrix | State | Default | Loading | Empty | Error | Partial | diff --git a/dev-workflows-frontend/skills/integration-e2e-testing/SKILL.md b/dev-workflows-frontend/skills/integration-e2e-testing/SKILL.md index 6ad9889..801459c 100644 --- a/dev-workflows-frontend/skills/integration-e2e-testing/SKILL.md +++ b/dev-workflows-frontend/skills/integration-e2e-testing/SKILL.md @@ -7,14 +7,21 @@ description: Integration and E2E test design principles, ROI calculation, test s ## References -**E2E test design with Playwright**: See [references/e2e-design.md](references/e2e-design.md) for UI Spec-driven E2E test candidate selection and Playwright test architecture. +**E2E test design**: See [references/e2e-design.md](references/e2e-design.md) for UI Spec-driven E2E test candidate selection and browser test architecture. The reference uses Playwright as the default browser harness; substitute the project's standard when different. 
## Test Type Definition and Limits -| Test Type | Purpose | Scope | Limit per Feature | Implementation Timing | -|-----------|---------|-------|-------------------|----------------------| -| Integration | Verify component interactions | Partial system integration | MAX 3 | Created alongside implementation | -| E2E | Verify critical user journeys | Full system | MAX 1-2 | Executed in final phase only | +| Test Type | Purpose | Scope | External Deps | Limit per Feature | Implementation Timing | +|-----------|---------|-------|---------------|-------------------|----------------------| +| Integration | Verify component interactions in-process | Partial system integration (in-process modules; for UI components, the framework's in-process renderer e.g., RTL+MSW for React/TS) | Mocked or in-process | MAX 3 | Created alongside implementation | +| fixture-e2e | Verify UI behavior in a browser with deterministic fixtures | Full UI flow with mocked backend / fixture-driven state | Mocked / fixture only β€” no live services | MAX 3 | Created alongside the UI feature | +| service-integration-e2e | Verify critical user journeys against a running local stack | Full system across services | Live local services or stubs | MAX 1-2 | Executed only in the final phase | + +**Lane selection (E2E only)**: +- Default lane for user-facing UI journeys is **fixture-e2e** β€” it runs a real browser against deterministic fixtures, catches the bugs that unit/integration tests miss (button no-op, state never updates, navigation breaks), and runs in CI without infrastructure setup +- Add **service-integration-e2e** only when the journey's correctness depends on real cross-service behavior (data persistence, transactional consistency, external service contracts) that cannot be faked safely + +The two E2E lanes are budgeted independently β€” having a fixture-e2e for a journey does not consume the service-integration-e2e budget and vice versa. ## Behavior-First Principle @@ -43,20 +50,29 @@ ROI Score = Business Value Γ— User Frequency + Legal Requirement Γ— 10 + Defect Higher ROI Score = higher priority within its test type. No normalization or capping is applied β€” the raw score is used directly for ranking. Deduplication is a separate step that removes candidates entirely; it does not modify scores. -### ROI Threshold for E2E +### ROI Thresholds by Lane + +The two E2E lanes have very different ownership costs and use independent thresholds. -E2E tests have high ownership cost (creation, execution, and maintenance are each 3-10Γ— higher than integration tests). To justify creation, an E2E candidate (beyond the must-keep reserved slot) requires **ROI Score > 50**. +| Lane | ROI threshold | Rationale | +|------|---------------|-----------| +| fixture-e2e | ROI β‰₯ 20 (beyond reserved slot) | Cost is comparable to integration tests once the harness exists; the floor avoids filling MAX 3 with low-signal tests when fewer would suffice | +| service-integration-e2e | ROI > 50 (beyond reserved slot) | Creation, execution, and maintenance cost is 3-10Γ— higher than integration; reserve for journeys whose value cannot be proven any other way | + +Reserved slot rules (see Multi-Step User Journey Definition below) apply per lane and override the threshold (the reserved candidate is emitted regardless of its ROI score). Below-floor candidates beyond the reserved slot are not emitted, leaving budget intentionally unfilled rather than padding with low-value tests. 
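Read as a decision rule, the formula and the per-lane floors above amount to a few lines of code. The sketch below is only a restatement for clarity; the type and function names are illustrative rather than part of the skill.

```typescript
// Illustrative restatement of the ROI formula and the per-lane floors described above.
type Lane = 'fixture-e2e' | 'service-integration-e2e'

interface E2eCandidate {
  businessValue: number      // BV
  userFrequency: number      // Freq
  legalRequirement: boolean  // Legal
  defectDetection: number    // Defect
  lane: Lane
  reservedSlot: boolean      // reserved candidates are emitted regardless of score
}

// ROI Score = BV * Freq + (Legal ? 10 : 0) + Defect (raw score, no normalization or capping)
const roiScore = (c: E2eCandidate): number =>
  c.businessValue * c.userFrequency + (c.legalRequirement ? 10 : 0) + c.defectDetection

// Floor beyond the reserved slot: >= 20 for fixture-e2e, > 50 for service-integration-e2e
const clearsFloor = (c: E2eCandidate): boolean =>
  c.reservedSlot || (c.lane === 'fixture-e2e' ? roiScore(c) >= 20 : roiScore(c) > 50)
```

The MAX 3 / MAX 1-2 budgets still apply on top of this check; the floor only decides whether a ranked candidate may fill remaining budget in its lane.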
### ROI Calculation Examples | Scenario | BV | Freq | Legal | Defect | ROI Score | Test Type | Selection Outcome | |----------|----|------|-------|--------|-----------|-----------|-------------------| -| Core checkout flow | 10 | 9 | true | 9 | 109 | E2E | Selected (reserved slot: user-facing multi-step journey) | -| Payment error handling | 8 | 3 | false | 7 | 31 | E2E | Below threshold (31 < 50), not selected | -| Profile save flow | 7 | 6 | false | 6 | 48 | E2E | Below threshold (48 < 50), not selected | +| Core checkout UI flow | 10 | 9 | true | 9 | 109 | fixture-e2e | Selected (reserved slot: user-facing multi-step journey, browser-level verification with fixtures) | +| Core checkout against live payment service | 10 | 9 | true | 9 | 109 | service-integration-e2e | Selected (real-service correctness above ROI threshold) | +| Dismiss button updates UI state | 6 | 7 | false | 8 | 50 | fixture-e2e | Selected (rank 2 of 3 fixture-e2e budget) | +| Payment error message display | 5 | 4 | false | 7 | 27 | fixture-e2e | Selected (rank 3 of 3 fixture-e2e budget) | +| Optional filter toggle | 3 | 4 | false | 2 | 14 | fixture-e2e | Not selected (rank 4, budget full) | +| Payment retry against real provider | 8 | 3 | false | 7 | 31 | service-integration-e2e | Below ROI threshold (31 < 50), not selected | | DB persistence check | 8 | 8 | false | 8 | 72 | Integration | Selected (rank 1 of 3) | -| Error message display | 5 | 3 | false | 4 | 19 | Integration | Selected (rank 2 of 3) | -| Optional filter toggle | 3 | 4 | false | 2 | 14 | Integration | Not selected (rank 4, budget full) | +| Pure data transformation | 5 | 3 | false | 4 | 19 | Integration | Selected (rank 2 of 3) | ## Multi-Step User Journey Definition @@ -72,14 +88,14 @@ A feature qualifies as containing a **multi-step user journey** when ALL of the ### User-Facing vs Service-Internal Journeys -Multi-step journeys are further classified for E2E budget decisions: +Multi-step journeys are classified for reserved-slot eligibility: -| Classification | Condition | E2E Reserved Slot | Example | +| Classification | Condition | Reserved Slot Eligibility | Example | |---|---|---|---| -| **User-facing** | A human user directly triggers and observes the steps (via UI, CLI, or direct API interaction) | Eligible | Web checkout flow, CLI setup wizard, mobile onboarding | -| **Service-internal** | Steps are triggered by backend services without direct user interaction | Not eligible (use integration tests) | Async job pipeline, service-to-service saga, scheduled batch processing | +| **User-facing** | A human user directly triggers and observes the steps (via UI, CLI, or direct API interaction) | Eligible β€” defaults to **fixture-e2e** reserved slot. Add a service-integration-e2e reserved slot only when the journey's correctness depends on real cross-service behavior | Web checkout flow, CLI setup wizard, mobile onboarding | +| **Service-internal** | Steps are triggered by backend services without direct user interaction | Not eligible for reserved slot β€” use integration tests. Service-integration-e2e through normal ROI > 50 path is still valid when full-system verification is warranted | Async job pipeline, service-to-service saga, scheduled batch processing | -This classification applies only to the reserved E2E slot and the E2E Gap Check. Service-internal journeys are still valid E2E candidates through the normal ROI > 50 path if they warrant full-system verification. 
+This classification applies only to the reserved-slot rule and the E2E Gap Check. Other selection follows lane-specific ROI rules above. Use this definition when evaluating E2E test candidates and E2E gap detection. @@ -92,12 +108,18 @@ Each test MUST include the following annotations: ``` AC: [Original acceptance criteria text] Behavior: [Trigger] β†’ [Process] β†’ [Observable Result] -@category: core-functionality | integration | edge-case | e2e +@category: core-functionality | integration | edge-case | fixture-e2e | service-integration-e2e +@lane: integration | fixture-e2e | service-integration-e2e @dependency: none | [component names] | full-system @complexity: low | medium | high ROI: [score] ``` +**`@lane` selection rule**: +- `integration` β€” Component interaction in-process, no browser (e.g., RTL+MSW for React/TS, in-process module/handler integration in any language) +- `fixture-e2e` β€” Browser-level UI verification with mocked backend / fixture-driven state +- `service-integration-e2e` β€” Browser-level or end-to-end verification against running local services or stubs + Use the project's comment syntax to wrap these annotations (e.g., `//` for C-family, `#` for Python/Ruby/Shell). ### Verification Items (Optional) @@ -121,9 +143,10 @@ Verification items: ## Test File Naming Convention - Integration tests: `*.int.test.*` or `*.integration.test.*` -- E2E tests: `*.e2e.test.*` +- fixture-e2e tests: `*.fixture.e2e.test.*` (or organize under `tests/e2e/fixture/`) +- service-integration-e2e tests: `*.service.e2e.test.*` (or organize under `tests/e2e/service/`) -The test runner or framework in the project determines the appropriate file extension. +The test runner or framework in the project determines the appropriate file extension. Repos that already use a single `*.e2e.test.*` convention may keep it as long as each file declares `@lane:` in its header β€” the lane annotation is the source of truth for routing and budget accounting. ## Review Criteria diff --git a/dev-workflows-frontend/skills/integration-e2e-testing/references/e2e-design.md b/dev-workflows-frontend/skills/integration-e2e-testing/references/e2e-design.md index f4e9e90..45a0174 100644 --- a/dev-workflows-frontend/skills/integration-e2e-testing/references/e2e-design.md +++ b/dev-workflows-frontend/skills/integration-e2e-testing/references/e2e-design.md @@ -1,8 +1,21 @@ -# E2E Test Design with Playwright +# E2E Test Design (Browser Harness) + +This reference uses Playwright as the default example throughout because it is the standard E2E browser harness assumed by these workflows. Adapt patterns to the project's chosen framework when different (Cypress, Selenium, etc.); the lane definitions, ROI rules, and budgets remain the same. 
+ +## Two E2E Lanes + +E2E tests in this workflow split into two lanes (see parent skill Test Type Definition): + +| Lane | When | ROI gate | Cost | +|------|------|----------|------| +| **fixture-e2e** | UI journey verification with deterministic fixtures (mocked backend / fixture data) | None β€” selected by ranking within MAX 3 budget | Comparable to integration; runs in CI without infrastructure setup | +| **service-integration-e2e** | Journey correctness depends on real cross-service behavior (data persistence, transactional consistency, external contracts) | ROI > 50 (beyond reserved slot) | 3-10Γ— higher than integration; reserved for what cannot be faked safely | + +Both lanes typically use Playwright; the difference is whether the backend is mocked / fixture-driven or running for real. ## When to Create E2E Tests -E2E tests target **critical user journeys** that span multiple pages or require real browser interaction. Apply the same ROI framework from the parent skill β€” only create E2E tests when ROI > 50. +E2E candidates target **critical user journeys** that span multiple pages or require real browser interaction. Pick the lane based on whether real services are required for the verification. ### Candidate Sources @@ -22,8 +35,8 @@ E2E tests target **critical user journeys** that span multiple pages or require - Responsive behavior across viewports **Use integration tests instead when**: -- Testing single-component state changes β†’ RTL -- Testing API response handling β†’ MSW + RTL +- Testing single-component state changes β†’ in-process component renderer (e.g., RTL for React/TS) +- Testing API response handling β†’ in-process API mock + component renderer (e.g., MSW + RTL for React/TS) - Testing pure data transformations β†’ unit tests ## UI Spec to E2E Test Mapping @@ -41,12 +54,15 @@ When a UI Spec exists, use it as the primary source for E2E test design: Screen Transition: [Screen A] β†’ [Screen B] β†’ [Screen C] AC Reference: AC-{id} User Journey: [Description of what the user accomplishes] -Preconditions: [Auth state, data state] +Lane: fixture-e2e | service-integration-e2e +Preconditions: [Auth state, data state β€” note whether these are fixture-driven or live] Verification Points: - [What to assert at each step] E2E ROI Score: [calculated score] ``` +**Lane decision**: choose `fixture-e2e` by default. Promote to `service-integration-e2e` when the verification requires observing real cross-service behavior (e.g., the test asserts that data persists across a real DB write, or that an external service receives the correct payload). 
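+One possible way to route the two lanes in the harness, shown for Playwright (the default assumed here; project names, paths, and the config itself are illustrative):
+
+```typescript
+// playwright.config.ts: separate projects per lane so CI can run them independently
+import { defineConfig } from '@playwright/test'
+
+export default defineConfig({
+  testDir: 'tests/e2e',
+  projects: [
+    {
+      name: 'fixture-e2e',              // mocked backend, no infrastructure needed
+      testMatch: /.*\.fixture\.e2e\.test\.ts$/,
+    },
+    {
+      name: 'service-integration-e2e',  // requires the running local stack
+      testMatch: /.*\.service\.e2e\.test\.ts$/,
+    },
+  ],
+})
+```
+
+With a split like this, `npx playwright test --project=fixture-e2e` can run on every push while the service lane is reserved for the final phase.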
+ ## Playwright Test Architecture ### Page Object Pattern @@ -56,9 +72,11 @@ Organize browser interactions through page objects for maintainability: ``` tests/ β”œβ”€β”€ e2e/ -β”‚ β”œβ”€β”€ pages/ # Page objects -β”‚ β”œβ”€β”€ fixtures/ # Test fixtures and helpers -β”‚ └── *.e2e.test.ts # Test files +β”‚ β”œβ”€β”€ pages/ # Page objects (shared across lanes) +β”‚ β”œβ”€β”€ fixtures/ # Test fixtures and helpers (auth, seed) +β”‚ β”œβ”€β”€ data/ # Static fixture data for fixture-e2e +β”‚ β”œβ”€β”€ *.fixture.e2e.test.ts # fixture-e2e test files +β”‚ └── *.service.e2e.test.ts # service-integration-e2e test files ``` ### Test Isolation @@ -81,6 +99,6 @@ When UI Spec defines responsive behavior, test critical breakpoints: ## Budget Enforcement Hard limits per feature (same as parent skill): -- **E2E Tests**: MAX 1-2 tests -- Only generate if ROI score > 50 -- Prefer fewer, comprehensive journey tests over many granular tests +- **fixture-e2e**: MAX 3 tests, no ROI gate (selected by ranking) +- **service-integration-e2e**: MAX 1-2 tests, ROI > 50 beyond the reserved slot +- Prefer fewer, comprehensive journey tests over many granular tests in both lanes diff --git a/dev-workflows-frontend/skills/recipe-front-build/SKILL.md b/dev-workflows-frontend/skills/recipe-front-build/SKILL.md index a9d9daa..df9c996 100644 --- a/dev-workflows-frontend/skills/recipe-front-build/SKILL.md +++ b/dev-workflows-frontend/skills/recipe-front-build/SKILL.md @@ -20,33 +20,51 @@ Work plan: $ARGUMENTS ## Pre-execution Prerequisites -### Task File Existence Check -```bash -# Check work plans -! ls -la docs/plans/*.md | grep -v template | tail -5 +### Implementation Readiness Check -# Check task files -! ls docs/plans/tasks/*.md 2>/dev/null || echo "No task files found" -``` +Before any task processing, locate the work plan to gate against. Resolution rule: +1. List task files in `docs/plans/tasks/` matching the single-layer pattern `{plan-name}-task-*.md`. Layer-aware fullstack tasks (`{plan-name}-backend-task-*.md` / `{plan-name}-frontend-task-*.md`) are excluded here so a stale fullstack run does not redirect this recipe to the wrong work plan +2. From the matched files, also exclude every file matching any of these patterns β€” they originate from other workflow phases and are not implementation tasks for this run's plan: `*-task-prep-*.md` (readiness preflight tasks), `_overview-*.md` (decomposition overview file), `*-phase*-completion.md` (per-phase completion files), `review-fixes-*.md` (post-implementation review fixes), `integration-tests-*-task-*.md` (integration-test add-on scaffolding) +3. For each remaining file, extract the `{plan-name}` prefix as the segment that appears before `-task-` +4. When at least one task file matches, the work plan is `docs/plans/{plan-name}.md` for the prefix that has the most recent task-file mtime; ties broken by the lexicographically last `{plan-name}` +5. When no task file matches the restricted pattern, the work plan is the most-recent-mtime non-template `.md` in `docs/plans/` + +Read the work plan header and find the line `Implementation Readiness: `. Apply this rule: + +| Status | Action | +|--------|--------| +| `ready` | Proceed to Consumed Task Set computation | +| `escalated` | Read the work plan's Readiness Report section, surface remaining gaps to the user via AskUserQuestion: "Implementation Readiness is `escalated` with the following remaining gaps: [list]. Continue execution? (y/n)". 
On `y` proceed; on `n` stop | +| `pending` | Present via AskUserQuestion: "Implementation Readiness is `pending`. To verify the work plan is implementable, run `/recipe-prepare-implementation [plan-path]` first, then resume. That recipe is provided by the dev-workflows plugin β€” when only this frontend plugin is installed, install dev-workflows to use it, or continue without preflight. Continue without preflight? (y/n)". On `y` proceed; on `n` stop | +| absent (line missing) | Treat as `pending` β€” older work plans created before the readiness marker existed should be preflighted explicitly | + +### Consumed Task Set + +Compute the **Consumed Task Set** for this run β€” the exact files this recipe owns, executes, and later deletes. Use the same restricted pattern as the Implementation Readiness Check: + +1. List task files in `docs/plans/tasks/` matching the single-layer pattern `{plan-name}-task-*.md` for the `{plan-name}` resolved by the readiness check. Layer-aware fullstack tasks are excluded +2. Exclude every file matching: `*-task-prep-*.md`, `_overview-*.md`, `*-phase*-completion.md`, `review-fixes-*.md`, `integration-tests-*-task-*.md` (these originate from other workflow phases) + +Every subsequent reference to "task files" in this recipe β€” Task Generation Decision Flow, Task Execution Cycle iteration, and Final Cleanup β€” uses this set, not the unrestricted `docs/plans/tasks/*.md` glob. ### Task Generation Decision Flow -Analyze task file existence state and determine the action required: +Analyze the Consumed Task Set and determine the action required: | State | Criteria | Next Action | |-------|----------|-------------| -| Tasks exist | .md files in tasks/ directory | User's execution instruction serves as batch approval β†’ Enter autonomous execution immediately | -| No tasks + plan exists | Plan exists but no task files | Confirm with user β†’ run task-decomposer | -| Neither exists + Design Doc exists | No plan or task files, but docs/design/*.md exists | Invoke work-planner to create work plan from Design Doc, then proceed to task decomposition | -| Neither exists | No plan, no task files, no Design Doc | Report missing prerequisites to user and stop | +| Tasks exist | Consumed Task Set is non-empty | User's execution instruction serves as batch approval β†’ Enter autonomous execution immediately | +| No tasks + plan exists | Consumed Task Set is empty but the resolved work plan exists | Confirm with user β†’ run task-decomposer | +| Neither exists + Design Doc exists | No plan, no Consumed Task Set, but `docs/design/*.md` exists | Invoke work-planner to create work plan from Design Doc, then proceed to task decomposition | +| Neither exists | No plan, no Consumed Task Set, no Design Doc | Report missing prerequisites to user and stop | ## Task Decomposition Phase (Conditional) -When task files don't exist: +When the Consumed Task Set is empty: ### 1. User Confirmation ``` -No task files found. +No task files in the Consumed Task Set. Work plan: docs/plans/[plan-name].md Generate tasks from the work plan? (y/n): @@ -59,17 +77,14 @@ Invoke task-decomposer using Agent tool: - `prompt`: "Read work plan at docs/plans/[plan-name].md and decompose into atomic tasks. Output: Individual task files in docs/plans/tasks/. Granularity: 1 task = 1 commit = independently executable" ### 3. Verify Generation -```bash -# Verify generated task files -! 
ls -la docs/plans/tasks/*.md | head -10 -``` +Recompute the Consumed Task Set using the same restricted pattern from the Consumed Task Set section above. Confirm it is now non-empty. If it is still empty, escalate to the user β€” task-decomposer either failed silently or produced files that don't match the expected pattern. -**Flow**: Task generation β†’ Autonomous execution (in this order) +**Flow**: Task generation β†’ Consumed Task Set recompute β†’ Autonomous execution (in this order) ## Pre-execution Checklist -- [ ] Confirmed task files exist in docs/plans/tasks/ -- [ ] Identified task execution order (dependencies) +- [ ] Confirmed Consumed Task Set is non-empty (computed in the Consumed Task Set section above) +- [ ] Identified task execution order within the Consumed Task Set (dependencies) - [ ] **Environment check**: Can I execute per-task commit cycle? - If commit capability unavailable β†’ Escalate before autonomous mode - Other environments (tests, quality tools) β†’ Subagents will escalate @@ -77,7 +92,7 @@ Invoke task-decomposer using Agent tool: ## Task Execution Cycle (4-Step Cycle) **MANDATORY EXECUTION CYCLE**: `task-executor-frontend β†’ escalation check β†’ quality-fixer-frontend β†’ commit` -For EACH task, YOU MUST: +For EACH task in the Consumed Task Set, YOU MUST: 1. **Register tasks using TaskCreate**: Register work steps. Always include first task "Map preloaded skills to applicable concrete rules" and final task "Verify the mapped rules before final JSON" 2. **Agent tool** (subagent_type: "dev-workflows-frontend:task-executor-frontend") β†’ Pass task file path in prompt, receive structured response 3. **CHECK task-executor-frontend response**: @@ -127,7 +142,18 @@ After all task cycles finish, run verification agents **in parallel** before the - Re-run only the failed verifiers (by the criteria in step 2) - Repeat until all pass or `blocked` β†’ Escalate to user -4. **All passed** β†’ Proceed to completion report +4. **All passed** β†’ Proceed to Final Cleanup + +## Final Cleanup + +Before the completion report, delete the implementation task files this recipe consumed. Their work is committed; `docs/plans/` is ephemeral working state and is not retained between recipe runs: + +- Delete every file in the Consumed Task Set +- Delete every file matching `docs/plans/tasks/{plan-name}-phase*-completion.md` (the per-phase completion files generated by task-decomposer for this `{plan-name}`) +- Delete the corresponding `docs/plans/tasks/_overview-{plan-name}.md` if present +- Preserve the work plan itself (`docs/plans/{plan-name}.md`) β€” the user decides whether to delete it after final review + +If task files cannot be deleted (filesystem error), report the failure but do not block the completion report. ## Output Example Frontend implementation phase completed. @@ -135,4 +161,5 @@ Frontend implementation phase completed. 
- Implemented tasks: [number] tasks - Quality checks: All passed (Lighthouse, bundle size, tests) - Commits: [number] commits created +- Cleanup: Task files removed from docs/plans/tasks/ diff --git a/dev-workflows-frontend/skills/recipe-front-plan/SKILL.md b/dev-workflows-frontend/skills/recipe-front-plan/SKILL.md index 2be4a60..80d3670 100644 --- a/dev-workflows-frontend/skills/recipe-front-plan/SKILL.md +++ b/dev-workflows-frontend/skills/recipe-front-plan/SKILL.md @@ -39,24 +39,25 @@ Follow the planning process below: - Present options if multiple exist (can be specified with $ARGUMENTS) ### Step 2: Test Skeleton Generation Confirmation - - Confirm with user whether to generate test skeletons (integration + E2E) first - - If user wants generation: acceptance-test-generator generates both integration and E2E test skeletons + - Confirm with user whether to generate test skeletons (integration + fixture-e2e + service-integration-e2e) first + - If user wants generation: acceptance-test-generator generates skeletons across all applicable lanes - Invoke acceptance-test-generator using Agent tool: - `subagent_type`: "dev-workflows-frontend:acceptance-test-generator" - `description`: "Test skeleton generation" - If UI Spec exists: `prompt: "Generate test skeletons from Design Doc at [path]. UI Spec at [ui-spec path]."` - If no UI Spec: `prompt: "Generate test skeletons from Design Doc at [path]."` - - Pass integration test file path, E2E test file path (or null), and e2eAbsenceReason to work-planner according to subagents-orchestration-guide "acceptance-test-generator β†’ work-planner" section + - Pass integration test file path, fixture-e2e and service-integration-e2e file paths (or null per lane), and e2eAbsenceReason (per lane) to work-planner according to subagents-orchestration-guide "acceptance-test-generator β†’ work-planner" section ### Step 3: Work Plan Creation Invoke work-planner using Agent tool: - `subagent_type`: "dev-workflows-frontend:work-planner" - `description`: "Work plan creation" -- If test skeletons were generated in Step 2: - - When `generatedFiles.e2e` is not null: - `prompt`: "Create work plan from Design Doc at [path]. Integration test file: [integration test path]. E2E test file: [E2E test path]. Integration tests are created simultaneously with each phase implementation, E2E tests are executed only in final phase." - - When `generatedFiles.e2e` is null: - `prompt`: "Create work plan from Design Doc at [path]. Integration test file: [integration test path]. No E2E test skeletons were generated (reason: [e2eAbsenceReason]). Integration tests are created simultaneously with each phase implementation." +- If test skeletons were generated in Step 2, build the prompt by listing every lane's status: + - Always include: "Integration test file: [path or 'not generated']" + - For each E2E lane (`fixtureE2e`, `serviceE2e`): + - When `generatedFiles.` is not null: "[lane] test file: [path]" + - When `generatedFiles.` is null: "No [lane] skeleton generated (reason: [e2eAbsenceReason.])" + - Append placement guidance: "Integration tests are created simultaneously with each phase implementation. fixture-e2e tests are created alongside the UI feature phase. service-integration-e2e tests are executed only in the final phase." - If test skeletons were not generated: `prompt`: "Create work plan from Design Doc at [path]." 
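+For illustration, the per-lane prompt assembly could look like this (TypeScript sketch; field names follow the acceptance-test-generator report contract, the wording is only an example, and the bullet rules above are authoritative):
+
+```typescript
+type Lane = 'fixtureE2e' | 'serviceE2e'
+
+interface SkeletonReport {
+  generatedFiles: { integration: string | null } & Record<Lane, string | null>
+  e2eAbsenceReason: Record<Lane, string | null>
+}
+
+function buildWorkPlannerPrompt(designDocPath: string, report: SkeletonReport): string {
+  const parts = [
+    `Create work plan from Design Doc at ${designDocPath}.`,
+    `Integration test file: ${report.generatedFiles.integration ?? "not generated"}.`,
+  ]
+  for (const lane of ['fixtureE2e', 'serviceE2e'] as const) {
+    const path = report.generatedFiles[lane]
+    parts.push(
+      path !== null
+        ? `${lane} test file: ${path}.`
+        : `No ${lane} skeleton generated (reason: ${report.e2eAbsenceReason[lane]}).`
+    )
+  }
+  parts.push(
+    'Integration tests are created simultaneously with each phase implementation. ' +
+      'fixture-e2e tests are created alongside the UI feature phase. ' +
+      'service-integration-e2e tests are executed only in the final phase.'
+  )
+  return parts.join(' ')
+}
+```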
diff --git a/dev-workflows-frontend/skills/recipe-front-review/SKILL.md b/dev-workflows-frontend/skills/recipe-front-review/SKILL.md index fa45fce..8e24255 100644 --- a/dev-workflows-frontend/skills/recipe-front-review/SKILL.md +++ b/dev-workflows-frontend/skills/recipe-front-review/SKILL.md @@ -16,9 +16,10 @@ disable-model-invocation: true - Compliance validation β†’ performed by code-reviewer - Security validation β†’ performed by security-reviewer -- Fix implementation β†’ performed by task-executor-frontend -- Quality checks β†’ performed by quality-fixer-frontend -- Re-validation β†’ performed by code-reviewer / security-reviewer +- **Code-side fix path**: Fix implementation β†’ task-executor-frontend; Quality checks β†’ quality-fixer-frontend; Re-validation β†’ code-reviewer / security-reviewer +- **Design-side update path**: DD revision β†’ technical-designer-frontend (update mode); DD review β†’ document-reviewer; cross-DD consistency β†’ design-sync (when multiple DDs exist); Re-validation β†’ code-reviewer + +The design-side path applies when the discrepancy reflects code that was correct but the Design Doc became stale, rather than code that violated the Design Doc. Design Doc (uses most recent if omitted): $ARGUMENTS @@ -63,36 +64,73 @@ Invoke security-reviewer using Agent tool: **Report both results independently using subagent output fields only**: +Before presenting to the user, the orchestrator computes a recommended route per finding using the rule below (this rule is internal β€” do not include it in the user-facing prompt): + +| Finding pattern | Recommended route | +|-----------------|-------------------| +| `dd_violation` where the code intent matches the original requirement but the Design Doc captured a different design | `d` (Design-side update) | +| `dd_violation` where the code drifted from a still-correct Design Doc | `c` (Code-side fix) | +| `reliability` / `security` / `maintainability` findings | `c` (Code-side fix) | + +Then present to the user (label each finding with its recommended route, grouped by route): + ``` Code Compliance: [complianceRate from code-reviewer] Verdict: [verdict from code-reviewer] Identifier Match Rate: [identifierMatchRate from code-reviewer] Acceptance Criteria: - [fulfilled] [item] (confidence: [high/medium/low]) - - [partially_fulfilled] [item]: [gap] β€” [suggestion] - - [unfulfilled] [item]: [gap] β€” [suggestion] + - [partially_fulfilled] [item]: [gap] β€” [suggestion] [recommended: c | d] + - [unfulfilled] [item]: [gap] β€” [suggestion] [recommended: c | d] Identifier Mismatches: - - [identifier]: DD=[designDocValue] Code=[codeValue] at [location] + - [identifier]: DD=[designDocValue] Code=[codeValue] at [location] [recommended: c | d] Quality Findings: - - [category] [location]: [description] β€” [rationale] + - [category] [location]: [description] β€” [rationale] [recommended: c] Security Review: [status from security-reviewer] Findings by category: - - [confirmed_risk] [location]: [description] β€” [rationale] - - [defense_gap] [location]: [description] β€” [rationale] - - [hardening] [location]: [description] β€” [rationale] - - [policy] [location]: [description] β€” [rationale] + - [confirmed_risk] [location]: [description] β€” [rationale] [recommended: c] + - [defense_gap] [location]: [description] β€” [rationale] [recommended: c] + - [hardening] [location]: [description] β€” [rationale] [recommended: c] + - [policy] [location]: [description] β€” [rationale] [recommended: c] Notes: [notes from security-reviewer, if 
present] -Execute fixes? (y/n): +Resolve discrepancies β€” confirm or override the recommended route per finding: + c) Code-side fix β€” code violates Design Doc; modify code to match + d) Design-side update β€” code is correct; Design Doc is stale, revise it + s) Skip β€” accept current state without changes ``` -If both pass and user selects `n`: Skip Steps 5-10, proceed to Step 11. +Use AskUserQuestion. The default offer is **"accept all recommended routes"** β€” a single confirmation for the typical case where the orchestrator's recommendations are correct. When the user wants to override, collect per-finding c/d/s decisions instead. If the user selects `s` for everything: skip Steps 5-10, proceed to Step 11. ### Step 5: Execute Skill Execute Skill: documentation-criteria (for task file template) +### Step 5d: Design-Side Update + +Run this step only when the user routed at least one finding to `d`. When all routes are `c` or `s`, skip directly to Step 6. + +1. Invoke technical-designer-frontend in update mode using Agent tool: + - `subagent_type`: "dev-workflows-frontend:technical-designer-frontend" + - `description`: "Design Doc update from review findings" + - `prompt`: "Update Design Doc at [path] in update mode. The implementation has diverged in the following ways that the team has decided to ratify in the design rather than in the code: [list of `d`-routed findings with codeLocation and designDocValue from $STEP_2_OUTPUT]. Reflect the current code behavior in the relevant sections and add a history entry." + +2. Invoke document-reviewer to verify the updated Design Doc: + - `subagent_type`: "dev-workflows-frontend:document-reviewer" + - `description`: "Document review of updated Design Doc" + - `prompt`: "Review updated Design Doc at [path] for consistency and completeness." + +3. When multiple Design Docs exist (`ls docs/design/*.md | grep -v template | wc -l > 1`), invoke design-sync: + - `subagent_type`: "dev-workflows-frontend:design-sync" + - `description`: "Cross-DD consistency check" + - `prompt`: "source_design: [updated DD path]. Detect conflicts across all Design Docs after the update." + - When `sync_status: conflicts_found`: present conflicts to the user; resolution requires re-invoking technical-designer-frontend for affected DDs. + +4. After Step 5d completes: + - If the user selected `d` for all findings (no `c` routes) β†’ skip Steps 6-8, proceed to Step 9 for re-validation + - If the user selected both `d` and `c` β†’ re-evaluate the `c`-routed findings against the updated DD and drop any that are now satisfied by the DD revision; then proceed to Step 6 with the remaining `c` findings + ### Step 6: Create Task File Create task file at `docs/plans/tasks/review-fixes-YYYYMMDD.md` @@ -117,7 +155,7 @@ Invoke quality-fixer-frontend using Agent tool: Invoke code-reviewer using Agent tool: - `subagent_type`: "dev-workflows-frontend:code-reviewer" - `description`: "Re-validate compliance" -- `prompt`: "Re-validate Design Doc compliance after fixes. Design Doc: [path]. Implementation files: [file list]. Prior compliance issues: $STEP_2_OUTPUT. Verify each prior issue is resolved." +- `prompt`: "Re-validate Design Doc compliance after fixes. Design Doc: [path]. Implementation files: [file list]. Prior compliance issues: $STEP_2_OUTPUT. Verify each prior issue is resolved (whether resolved code-side or design-side)." 
### Step 10: Re-validate security-reviewer @@ -126,7 +164,16 @@ Invoke security-reviewer using Agent tool (only if security fixes were applied): - `description`: "Re-validate security" - `prompt`: "Re-validate security after fixes. Prior findings: $STEP_3_OUTPUT. Design Doc: [path]. Implementation files: [file list]." -### Step 11: Final Report +### Step 11: Final Cleanup and Report + +Delete the review-fix task file this recipe created (if any). Its work is committed; `docs/plans/` is ephemeral working state and is not retained between recipe runs: + +- Delete `docs/plans/tasks/review-fixes-YYYYMMDD.md` if it exists + +If the file cannot be deleted (filesystem error), report the failure but do not block the final report. + +Then present the final report: + ``` Code Compliance: Initial: [X]% @@ -139,9 +186,11 @@ Security Review: Remaining issues: - [items requiring manual intervention] + +Cleanup: review-fixes task file removed ``` -## Auto-fixable Items +## Auto-fixable Items (code-side path) - Simple unimplemented acceptance criteria - Error handling additions - Contract definition fixes @@ -151,10 +200,16 @@ Remaining issues: ## Non-fixable Items - Fundamental business logic changes - Architecture-level modifications -- Design Doc deficiencies - Committed secrets (blocked β†’ human intervention) -**Scope**: Design Doc compliance validation, security review, and auto-fixes. +## Design-Side Update Triggers +Discrepancies suitable for the design-side path (code is correct, DD became stale): +- Identifier renames where the new identifier reflects the team's current naming +- Behavioral changes that match the original requirement intent better than what the DD captured +- Component splits or merges where the new structure is sound and the DD documented the prior structure +- New ACs that the implementation already satisfies but the DD never enumerated + +**Scope**: Design Doc compliance validation, security review, code-side auto-fixes, and design-side update routing. ## Scope Boundary for Subagents diff --git a/dev-workflows-frontend/skills/subagents-orchestration-guide/SKILL.md b/dev-workflows-frontend/skills/subagents-orchestration-guide/SKILL.md index dd4f205..ca2e2c1 100644 --- a/dev-workflows-frontend/skills/subagents-orchestration-guide/SKILL.md +++ b/dev-workflows-frontend/skills/subagents-orchestration-guide/SKILL.md @@ -111,7 +111,7 @@ Autonomous execution MUST stop and wait for user input at these points. | Design | After design-sync completes consistency verification | Approve Design Doc | | Work Plan | After work-planner creates plan | Batch approval for implementation phase | -**After batch approval**: Autonomous execution proceeds without stops until completion or escalation +**After batch approval**: Autonomous execution proceeds without stops until completion or escalation. ## Scale Determination and Document Requirements | Scale | File Count | PRD | ADR | Design Doc | Work Plan | @@ -184,7 +184,7 @@ Subagents respond in JSON format. 
Key fields for orchestrator decisions: - **design-sync**: sync_status (synced/conflicts_found) - **integration-test-reviewer**: status (approved/needs_revision/blocked), requiredFixes - **security-reviewer**: status (approved/approved_with_notes/needs_revision/blocked), findings, notes, requiredFixes -- **acceptance-test-generator**: status, generatedFiles (integration: path|null, e2e: path|null), budgetUsage, e2eAbsenceReason (null when E2E emitted, otherwise: no_multi_step_journey|below_threshold_user_confirmed) +- **acceptance-test-generator**: status, generatedFiles.{integration,fixtureE2e,serviceE2e} (path|null per lane), budgetUsage per lane, e2eAbsenceReason per E2E lane (null when emitted; reason enum is owned by acceptance-test-generator and integration-e2e-testing skill) ## Handling Requirement Changes @@ -229,7 +229,9 @@ Always start with requirement-analyzer, then select the minimum planning flow re | Medium | requirement-analyzer β†’ codebase-analyzer β†’ optional UI Spec β†’ optional ADR β†’ Design Doc β†’ code-verifier β†’ document-reviewer β†’ design-sync β†’ acceptance-test-generator β†’ work-planner β†’ task-decomposer | | Small | requirement-analyzer β†’ work-planner | -After the planning flow completes and the user grants batch approval, execute the task execution cycle: `task-executor β†’ quality-fixer β†’ commit` for each task. See "Autonomous Execution Mode" below for full per-task details. At Small scale this cycle still applies β€” implementation runs through `task-executor`, not orchestrator-direct edits. +After the planning flow completes and the user grants batch approval, the work plan carries an `Implementation Readiness:` header (work-planner emits `pending`; promotion to `ready` or `escalated` is an external orchestration concern). External orchestration also decides when and how to act on this marker; this guide does not invoke any orchestrator above the agent layer. + +Then execute the task execution cycle: `task-executor β†’ quality-fixer β†’ commit` for each task. See "Autonomous Execution Mode" below for full per-task details. At Small scale this cycle still applies β€” implementation runs through `task-executor`, not orchestrator-direct edits. Each agent name in the chain is invoked via the Agent tool (per "Orchestrator's Permitted Tools" above). @@ -397,21 +399,13 @@ Register overall phases using TaskCreate. Update each phase with TaskUpdate as i #### HC-06: acceptance-test-generator β†’ work-planner - **Pass to acceptance-test-generator**: - - Design Doc: [path] - - UI Spec: [path] (if exists) + **Pass to acceptance-test-generator**: Design Doc path; UI Spec path (if exists). - **Orchestrator verification items**: - - Verify `generatedFiles.integration` is a valid path (when not null) and the file exists - - Verify `generatedFiles.e2e` is a valid path (when not null) and the file exists - - When `generatedFiles.e2e` is null, verify `e2eAbsenceReason` is present β€” this is intentional absence, not an error + **Orchestrator verification**: Every non-null `generatedFiles.` path exists on disk. For each null lane, `e2eAbsenceReason.` is present (intentional absence, not an error). 
- **Pass to work-planner**: - - Integration test file: [path] (create and execute simultaneously with each phase implementation) - - E2E test file: [path] or null (execute only in final phase, when provided) - - E2E absence reason: [reason] (when E2E is null β€” pass this so work-planner can skip E2E Gap Check for intentional absence) + **Pass to work-planner**: integration / fixture-e2e / service-integration-e2e file paths (or null per lane), per-lane absence reasons, plus timing guidance β€” integration tests are created alongside each phase implementation, fixture-e2e tests are created alongside the UI feature phase, service-integration-e2e tests are executed only in the final phase. - **On error**: Escalate to user if integration file generation failed unexpectedly (status != completed). E2E being null with a valid absence reason is not an error. + **On error**: Escalate to user when status != completed and integration file generation failed unexpectedly. A null E2E lane with a valid absence reason is not an error. 3. **ADR Status Management**: Update ADR status after user decision (Accepted/Rejected) diff --git a/dev-workflows-frontend/skills/subagents-orchestration-guide/references/monorepo-flow.md b/dev-workflows-frontend/skills/subagents-orchestration-guide/references/monorepo-flow.md index 4304e07..840c9dc 100644 --- a/dev-workflows-frontend/skills/subagents-orchestration-guide/references/monorepo-flow.md +++ b/dev-workflows-frontend/skills/subagents-orchestration-guide/references/monorepo-flow.md @@ -27,7 +27,7 @@ This reference defines the orchestration flow for projects spanning multiple lay | 11 | code-verifier | Verify **Frontend** Design Doc against existing code | Frontend verification | | 12 | document-reviewer Γ—2 | Review each Design Doc (with code-verifier results as `code_verification`) | Reviews | | 13 | design-sync | Cross-layer consistency verification (source: frontend Design Doc) **[Stop]** | Sync status | -| 14 | acceptance-test-generator | Integration/E2E test skeleton from cross-layer contracts | Test skeletons | +| 14 | acceptance-test-generator | Integration + fixture-e2e + service-integration-e2e test skeletons from cross-layer contracts (per-lane) | Test skeletons | | 15 | work-planner | Work plan from all Design Docs **[Stop: Batch approval]** | Work plan | ### Medium Scale Fullstack (3-5 Files) - 13 Steps @@ -45,7 +45,7 @@ This reference defines the orchestration flow for projects spanning multiple lay | 9 | code-verifier | Verify **Frontend** Design Doc against existing code | Frontend verification | | 10 | document-reviewer Γ—2 | Review each Design Doc (with code-verifier results as `code_verification`) | Reviews | | 11 | design-sync | Cross-layer consistency verification (source: frontend Design Doc) **[Stop]** | Sync status | -| 12 | acceptance-test-generator | Integration/E2E test skeleton from cross-layer contracts | Test skeletons | +| 12 | acceptance-test-generator | Integration + fixture-e2e + service-integration-e2e test skeletons from cross-layer contracts (per-lane) | Test skeletons | | 13 | work-planner | Work plan from all Design Docs **[Stop: Batch approval]** | Work plan | ### Parallelization in Multi-Agent Steps diff --git a/dev-workflows-frontend/skills/test-implement/references/e2e.md b/dev-workflows-frontend/skills/test-implement/references/e2e.md index 573f765..47cdbe5 100644 --- a/dev-workflows-frontend/skills/test-implement/references/e2e.md +++ b/dev-workflows-frontend/skills/test-implement/references/e2e.md @@ -1,5 +1,16 @@ # E2E 
Test Implementation with Playwright +## Lane Selection + +E2E tests in this workflow split into two lanes (defined in integration-e2e-testing skill): + +| Lane | Backend setup | Use these patterns | +|------|---------------|-------------------| +| **fixture-e2e** | Mocked via `page.route()` or fixture loaders; no live services | Page Object Pattern, Locator Strategy, Assertions, the **Fixture-Based Backend** section below | +| **service-integration-e2e** | Live local stack with real services | All patterns above PLUS the **E2E Environment Prerequisites** section (seed data, auth fixture against real auth flow) | + +The skeleton's `@lane:` annotation declares which lane the test belongs to. Choose implementation patterns to match. + ## Test Framework - **Playwright Test**: `@playwright/test` - Test imports: `import { test, expect } from '@playwright/test'` @@ -10,18 +21,23 @@ ``` tests/ └── e2e/ - β”œβ”€β”€ pages/ # Page objects + β”œβ”€β”€ pages/ # Page objects (shared across lanes) β”‚ β”œβ”€β”€ login.page.ts β”‚ └── dashboard.page.ts - β”œβ”€β”€ fixtures/ # Test fixtures + β”œβ”€β”€ fixtures/ # Test fixtures (auth, seed) β”‚ └── auth.fixture.ts - └── *.e2e.test.ts # Test files + β”œβ”€β”€ data/ # Static fixture data for fixture-e2e + β”‚ └── *.fixture.json + β”œβ”€β”€ *.fixture.e2e.test.ts # fixture-e2e test files + └── *.service.e2e.test.ts # service-integration-e2e test files ``` ### Naming Conventions -- Test files: `{FeatureName}.e2e.test.ts` +- fixture-e2e files: `{FeatureName}.fixture.e2e.test.ts` +- service-integration-e2e files: `{FeatureName}.service.e2e.test.ts` - Page objects: `{PageName}.page.ts` - Fixtures: `{Purpose}.fixture.ts` +- Static fixture data: `{scenario}.fixture.json` ## Page Object Pattern @@ -102,9 +118,46 @@ export const test = base.extend<{ authenticatedPage: Page }>({ }) ``` -## E2E Environment Prerequisites +## Fixture-Based Backend (fixture-e2e) + +fixture-e2e tests run a real browser against deterministic fixtures β€” no live backend, no DB, no external services. Use one of these patterns to fake the network: + +### Pattern A: page.route() interception + +```typescript +test('Dismiss-then-Undo restores card', async ({ page }) => { + // Arrange: intercept all backend calls with deterministic responses + await page.route('**/api/cards', async (route) => { + await route.fulfill({ json: cardsFixture }) + }) + await page.route('**/api/cards/*/dismiss', async (route) => { + await route.fulfill({ status: 204 }) + }) + + await page.goto('/cards') + await page.getByRole('button', { name: 'Dismiss' }).first().click() + await page.getByRole('button', { name: 'Undo' }).click() + + await expect(page.getByText(cardsFixture[0].title)).toBeVisible() +}) +``` + +### Pattern B: Fixture loader injection + +```typescript +// data/cards-with-dismiss.fixture.json β€” committed alongside the test +// Loaded via a route helper or app-level test mode +``` -E2E tests require a running application with real data state. Unlike unit/integration tests, environment setup is part of E2E test implementation scope. +**Principles for fixture-e2e**: +- Backend is faked, not running. 
No `npm run start:backend` required to execute these tests +- Fixtures are versioned in the repo (`tests/e2e/data/`) so tests are deterministic across machines +- Auth, when needed, is faked too (set a test cookie via `page.context().addCookies()` or use a fixture-mode bypass) +- These tests run in CI without provisioning external infrastructure + +## E2E Environment Prerequisites (service-integration-e2e) + +service-integration-e2e tests require a running application with real data state. Unlike fixture-e2e, environment setup is part of test implementation scope. ### Seed Data Strategy @@ -163,16 +216,16 @@ export const test = base.extend<{ playerPage: Page }>({ - Store test credentials in environment variables only (`E2E_*` prefixed) - If the auth flow requires specific user records, seed them in the fixture -### Environment Checklist +### Environment Checklist (service-integration-e2e only) -Before E2E tests can pass, verify: +Before service-integration-e2e tests can pass, verify: - [ ] Application is running and accessible at `baseURL` - [ ] Database has required seed data (test users, subscriptions, content) - [ ] Authentication flow works with test credentials - [ ] Environment variables are set (`E2E_*` prefixed) -- [ ] External services are either available or mocked via `page.route()` +- [ ] External services are either available or stubbed -When the work plan includes dedicated environment setup tasks (Phase 0), follow those tasks. When no setup tasks exist in the plan, address missing prerequisites as part of the E2E test implementation task itself. +When the work plan includes dedicated environment setup tasks (Phase 0 β€” see work-planner E2E Environment Prerequisites extraction), follow those tasks. When no setup tasks exist in the plan, address missing prerequisites as part of the test implementation task itself, OR consider whether the verification could move to fixture-e2e instead. ## Locator Strategy @@ -235,18 +288,36 @@ test.describe('responsive navigation', () => { ## Skeleton Comment Format -E2E test skeletons follow the same annotation format as integration tests (adapt comment syntax to the project's language): +E2E test skeletons follow the same annotation format as integration tests (adapt comment syntax to the project's language). The `@lane` annotation routes the test to the correct implementation patterns. 
+ +### fixture-e2e example +```typescript +// AC: [Original acceptance criteria text] +// Behavior: [User action] β†’ [System response] β†’ [Observable result in browser] +// @category: fixture-e2e +// @lane: fixture-e2e +// @dependency: full-ui (mocked backend) +// @complexity: medium +// ROI: [score] +test('AC1: [Description]', async ({ page }) => { + // Arrange: load fixture data, intercept network + // Act: user interaction + // Assert: observable browser state +}) +``` +### service-integration-e2e example ```typescript // AC: [Original acceptance criteria text] -// Behavior: [User action] β†’ [System response] β†’ [Observable result] -// @category: e2e +// Behavior: [User action] β†’ [System response across services] β†’ [Observable cross-service result] +// @category: service-integration-e2e +// @lane: service-integration-e2e // @dependency: full-system // @complexity: high // ROI: [score] -test('AC1: [Description]', async ({ page }) => { - // Arrange: [Setup description] - // Act: [Action description] - // Assert: [Verification description] +test('AC1: [Description]', async ({ page, request }) => { + // Arrange: seed real data, real auth + // Act: user interaction + // Assert: observable result + cross-service evidence (DB row, downstream event) }) ``` diff --git a/dev-workflows-frontend/skills/test-implement/references/frontend.md b/dev-workflows-frontend/skills/test-implement/references/frontend.md index e605e28..ff3098f 100644 --- a/dev-workflows-frontend/skills/test-implement/references/frontend.md +++ b/dev-workflows-frontend/skills/test-implement/references/frontend.md @@ -19,9 +19,15 @@ ### Coverage Requirements (ADR-0002 Compliant) **Component-specific targets**: + +When the project adopts Atomic Design (atoms / molecules / organisms layering): - Atoms (Button, Text, etc.): 70% or higher - Molecules (FormField, etc.): 65% or higher - Organisms (Header, Footer, etc.): 60% or higher + +When the project uses a different component architecture (Feature-based, Container-Presenter, etc.): apply 60% as the baseline and raise the target for foundational/leaf components (those reused across many features) to 70%. + +Component-architecture-independent targets: - Custom Hooks: 65% or higher - Utils: 70% or higher diff --git a/dev-workflows-frontend/skills/typescript-rules/SKILL.md b/dev-workflows-frontend/skills/typescript-rules/SKILL.md index 99a6f50..947ab7b 100644 --- a/dev-workflows-frontend/skills/typescript-rules/SKILL.md +++ b/dev-workflows-frontend/skills/typescript-rules/SKILL.md @@ -62,7 +62,7 @@ function isUser(value: unknown): value is User { **Component Design Criteria** - **Function components only**: Official React recommendation, optimizable by modern tooling (Exception: Error Boundary requires class component) - **Custom Hooks**: Standard pattern for logic reuse and dependency injection -- **Component Hierarchy**: Atoms β†’ Molecules β†’ Organisms β†’ Templates β†’ Pages +- **Component Hierarchy**: Use the project's adopted component architecture. When the project uses Atomic Design: Atoms β†’ Molecules β†’ Organisms β†’ Templates β†’ Pages. 
When the project uses Feature-based, Container-Presenter, or another structure: follow that structure consistently and document the chosen layering in the project README or design doc - **Co-location**: Place tests, styles, and related files alongside components **State Management Patterns** diff --git a/dev-workflows/.claude-plugin/plugin.json b/dev-workflows/.claude-plugin/plugin.json index 025d5cf..ff70ff4 100644 --- a/dev-workflows/.claude-plugin/plugin.json +++ b/dev-workflows/.claude-plugin/plugin.json @@ -1,7 +1,7 @@ { "name": "dev-workflows", "description": "Skills + Subagents for backend development - Use skills for coding guidance, or run recipe workflows for full orchestrated agentic coding with specialized agents", - "version": "0.16.17", + "version": "0.17.0", "author": { "name": "Shinsuke Kagawa", "url": "https://github.com/shinpr" diff --git a/dev-workflows/agents/acceptance-test-generator.md b/dev-workflows/agents/acceptance-test-generator.md index 2916939..7eba9ef 100644 --- a/dev-workflows/agents/acceptance-test-generator.md +++ b/dev-workflows/agents/acceptance-test-generator.md @@ -37,7 +37,8 @@ Test type definitions, budgets, and ROI calculations are specified in **integrat Key points: - **Integration Tests**: MAX 3 per feature, created alongside implementation -- **E2E Tests**: MAX 1-2 per feature, executed in final phase only +- **fixture-e2e**: MAX 3 per feature, created alongside the UI feature phase, ROI β‰₯ 20 beyond reserved slot +- **service-integration-e2e**: MAX 1-2 per feature, executed only in the final phase, ROI > 50 beyond reserved slot ## 4-Phase Generation Process @@ -103,9 +104,9 @@ For each valid AC from Phase 1: **Output**: Candidate pool with ROI metadata -### Phase 3: ROI-Based Selection (Two-Pass #2) +### Phase 3: ROI-Based Selection and Lane Assignment (Two-Pass #2) -ROI calculation formula and cost table are defined in **integration-e2e-testing skill**. +ROI calculation formula and cost table are defined in **integration-e2e-testing skill**. Lane definitions and selection rules are also in that skill. **Selection Algorithm**: @@ -118,34 +119,50 @@ ROI calculation formula and cost table are defined in **integration-e2e-testing 3. **Push-Down Analysis**: ``` Can this be unit-tested? β†’ Remove from integration/E2E pool - Already integration-tested? β†’ Keep as E2E candidate IF part of multi-step user journey (see definition in integration-e2e-testing skill) - Already integration-tested AND NOT part of multi-step journey? β†’ Remove from E2E pool + Already integration-tested AND verifiable in-process? β†’ Remove from E2E pool ``` -4. **Sort by ROI** (descending order) +4. **Lane assignment** (E2E candidates only): + - Default to `fixture-e2e` for any UI journey verifiable with mocked backend / fixture-driven state + - Promote to `service-integration-e2e` only when the verification depends on real cross-service behavior. A candidate qualifies for `service-integration-e2e` when ANY of the following must be asserted: + - Data persists across a real DB write (e.g., row inserted/updated in the actual database under test) + - A downstream service receives a real event/message (e.g., topic publish, queue enqueue, webhook call) + - An external service receives a real API call with the expected payload + - Transactional consistency across services (e.g., two-phase commit, saga compensation) +5. **Sort by ROI** within each lane (descending) β€” this is the single ranking step; Phase 4 budget enforcement consumes this ranked list directly without re-sorting. 
-**Output**: Ranked, deduplicated candidate list +**Output**: Ranked, deduplicated candidate list with lane assigned per E2E candidate. ### Phase 4: Budget Enforcement **Hard Limits per Feature**: - **Integration Tests**: MAX 3 tests -- **E2E Tests**: MAX 1-2 tests total, composed of: - - 1 reserved slot (emitted regardless of ROI) when feature contains a **user-facing** multi-step user journey (see definition and classification in integration-e2e-testing skill) +- **fixture-e2e**: MAX 3 tests, no ROI gate. When the feature contains a **user-facing** multi-step user journey, the highest-ROI journey candidate is reserved (emitted regardless of ranking) +- **service-integration-e2e**: MAX 1-2 tests, composed of: + - 1 reserved slot (emitted regardless of ROI) when the journey's correctness depends on real cross-service behavior that fixture-e2e cannot verify - Up to 1 additional slot requiring ROI > 50 **Selection Algorithm**: ``` -1. Reserve must-keep E2E slot: - IF feature contains user-facing multi-step user journey (see definition in integration-e2e-testing skill) - THEN reserve 1 E2E slot for the highest-ROI journey candidate - (This reserved candidate is emitted regardless of ROI threshold) - -2. Sort remaining candidates by ROI (descending) - -3. Select top N within budget: +1. Reserve fixture-e2e slot: + IF feature contains user-facing multi-step user journey + THEN reserve 1 fixture-e2e slot for the highest-ROI journey candidate + +2. Reserve service-integration-e2e slot (only if needed): + IF the reserved journey's verification requires ANY of: + - data persists across a real DB write + - downstream service receives a real event/message + - external service receives a real API call with expected payload + - transactional consistency across services + THEN reserve 1 service-integration-e2e slot for that journey + +3. Walk the candidate list (already sorted by ROI within each lane in Phase 3 step 5) + and select within budget: - Integration: Pick top 3 highest-ROI - - E2E (additional beyond reserved): Pick up to 1 more IF ROI score > 50 + - fixture-e2e (additional beyond reserved): Pick up to remaining budget IF ROI β‰₯ 20 + - service-integration-e2e (additional beyond reserved): Pick up to 1 more IF ROI > 50 + + Leave budget intentionally unfilled when no remaining candidate clears the lane's threshold. ``` **Output**: Final test set @@ -180,81 +197,109 @@ The examples below use `//` comment syntax. 
Adapt to the project's language (e.g
 [Test: 'AC1: Failed payment displays error without creating order']
 ```
 
-### E2E Test File
+### fixture-e2e Test File
+
+```
+// [Feature Name] fixture-e2e Test - Design Doc: [filename]
+// Generated: [date] | Budget Used: 1/3 fixture-e2e
+// Test Type: Browser-level UI verification with mocked backend / fixture-driven state
+// Implementation Timing: Alongside the UI feature implementation
+
+[Import statement using detected test framework]
+
+[Test suite using detected framework syntax]
+  // User Journey: Click Dismiss β†’ card disappears β†’ undo banner appears β†’ Undo restores card
+  // ROI: 50 (BV:6 Γ— Freq:7 + Legal:0 + Defect:8) | reserved slot: user-facing multi-step journey
+  // Verification: UI state transitions are observable in the browser
+  // @category: fixture-e2e
+  // @lane: fixture-e2e
+  // @dependency: full-ui (mocked backend)
+  // @complexity: medium
+  [Test: 'User Journey: Dismiss-then-Undo restores the card to its original state']
+```
+
+### service-integration-e2e Test File
 
 ```
-// [Feature Name] E2E Test - Design Doc: [filename]
-// Generated: [date] | Budget Used: 1/2 E2E
-// Test Type: End-to-End Test
-// Implementation Timing: After all feature implementations complete
+// [Feature Name] service-integration-e2e Test - Design Doc: [filename]
+// Generated: [date] | Budget Used: 1/2 service-integration-e2e
+// Test Type: End-to-end against running local stack
+// Implementation Timing: Executed only in the final phase
 
 [Import statement using detected test framework]
 
 [Test suite using detected framework syntax]
-  // User Journey: Complete purchase flow (browse β†’ add to cart β†’ checkout β†’ payment β†’ confirmation)
-  // ROI: 119 (BV:10 Γ— Freq:10 + Legal:10 + Defect:9) | reserved slot: multi-step journey
-  // Verification: End-to-end user experience from product selection to order confirmation
-  // @category: e2e
+  // User Journey: Complete purchase flow (browse β†’ checkout β†’ payment β†’ confirmation persisted in DB)
+  // ROI: 119 (BV:10 Γ— Freq:10 + Legal:10 + Defect:9) | reserved slot: cross-service correctness
+  // Verification: Order persists in DB and confirmation event reaches downstream consumer
+  // @category: service-integration-e2e
+  // @lane: service-integration-e2e
   // @dependency: full-system
   // @complexity: high
-  [Test: 'User Journey: Complete product purchase from browse to confirmation email']
+  [Test: 'User Journey: Complete purchase persists order and emits confirmation event']
 ```
 
 ### Generation Report
 
-**When E2E tests are emitted:**
+**When both E2E lanes are emitted:**
 ```json
 {
   "status": "completed",
   "feature": "payment",
   "generatedFiles": {
     "integration": "tests/payment.int.test.[ext]",
-    "e2e": "tests/payment.e2e.test.[ext]"
+    "fixtureE2e": "tests/payment.fixture.e2e.test.[ext]",
+    "serviceE2e": "tests/payment.service.e2e.test.[ext]"
   },
-  "budgetUsage": { "integration": "2/3", "e2e": "1/2" },
-  "e2eAbsenceReason": null
+  "budgetUsage": { "integration": "2/3", "fixtureE2e": "1/3", "serviceE2e": "1/2" },
+  "e2eAbsenceReason": { "fixtureE2e": null, "serviceE2e": null }
 }
 ```
 
-**When no E2E tests are emitted:**
+**When only fixture-e2e is emitted:**
 ```json
 {
   "status": "completed",
   "feature": "payment",
   "generatedFiles": {
     "integration": "tests/payment.int.test.[ext]",
-    "e2e": null
+    "fixtureE2e": "tests/payment.fixture.e2e.test.[ext]",
+    "serviceE2e": null
   },
-  "budgetUsage": { "integration": "2/3", "e2e": "0/2" },
-  "e2eAbsenceReason": "no_multi_step_journey"
+  "budgetUsage": { "integration": "2/3",
"fixtureE2e": "2/3", "serviceE2e": "0/2" }, + "e2eAbsenceReason": { "fixtureE2e": null, "serviceE2e": "no_real_service_dependency" } } ``` -**When no integration tests are emitted:** +**When no E2E tests are emitted:** ```json { "status": "completed", "feature": "config-update", "generatedFiles": { - "integration": null, - "e2e": null + "integration": "tests/config.int.test.[ext]", + "fixtureE2e": null, + "serviceE2e": null }, - "budgetUsage": { "integration": "0/3", "e2e": "0/2" }, - "e2eAbsenceReason": "no_multi_step_journey" + "budgetUsage": { "integration": "1/3", "fixtureE2e": "0/3", "serviceE2e": "0/2" }, + "e2eAbsenceReason": { "fixtureE2e": "no_multi_step_journey", "serviceE2e": "no_multi_step_journey" } } ``` -**Contract**: Both `generatedFiles.integration` and `generatedFiles.e2e` are always present as keys. Value is a file path string when generated, `null` when not generated. `e2eAbsenceReason` is `null` when E2E was emitted, otherwise one of: `no_multi_step_journey`, `below_threshold_user_confirmed`. +**Contract**: +- `generatedFiles.integration`, `generatedFiles.fixtureE2e`, `generatedFiles.serviceE2e` are always present as keys. Value is a file path string when generated, `null` when not. +- `e2eAbsenceReason` is an object with `fixtureE2e` and `serviceE2e` keys. Each value is `null` when that lane emitted, otherwise one of: `no_multi_step_journey`, `below_threshold_user_confirmed`, `no_real_service_dependency` (service-integration-e2e only β€” meaning the journey is verifiable in fixture-e2e). ## Test Meta Information Assignment Each test case MUST have the following standard annotations for test implementation planning: -- **@category**: core-functionality | integration | edge-case | ux -- **@dependency**: none | [component names] | full-system +- **@category**: core-functionality | integration | edge-case | ux | fixture-e2e | service-integration-e2e +- **@lane**: integration | fixture-e2e | service-integration-e2e +- **@dependency**: none | [component names] | full-ui (mocked backend) | full-system - **@complexity**: low | medium | high -These annotations are used when planning and prioritizing test implementation. +These annotations are used when planning and prioritizing test implementation. The `@lane` annotation is the source of truth for budget accounting and CI gating. 
## Constraints and Quality Standards diff --git a/dev-workflows/agents/task-decomposer.md b/dev-workflows/agents/task-decomposer.md index acbf5cc..aeff0d1 100644 --- a/dev-workflows/agents/task-decomposer.md +++ b/dev-workflows/agents/task-decomposer.md @@ -104,8 +104,12 @@ Decompose tasks based on implementation strategy patterns determined in implemen |---|---| | Existing code modification | The existing implementation files being modified, their tests, related Design Doc sections | | New component/feature | Adjacent implementations in the same layer/domain, Design Doc interface contracts | + | Frontend component implementation | UI Spec component section (use the section heading the work plan's UI Spec Component β†’ Task Mapping cites), Design Doc interface contracts, adjacent components in the same layer | + | Frontend integration / fixture-e2e test | UI Spec component section including the State x Display Matrix and Interaction Definition tables, the implemented component code, fixture data files | | Test implementation | Test skeleton comments/annotations, the target code being tested, actual API/auth flows | - | E2E environment setup | Current environment config (startup scripts, docker-compose or equivalent), seed scripts, existing fixture patterns, application auth flow | + | fixture-e2e environment setup | Existing fixture data files, the API mock layer the project uses (e.g., MSW for JS/TS, WireMock for JVM, responses for Python), browser harness configuration (Playwright by default) | + | service-integration-e2e environment setup | Local startup scripts (docker-compose or equivalent), seed scripts, application auth flow, external service stubs | + | Cross-package boundary implementation | Both sides of the boundary as listed in the work plan's Connection Map (owner modules and expected signal), the contract definition between them | | Bug fix / refactor | The affected code paths, related test coverage, error reproduction context | | Behavior replacement / rewrite | The existing implementation being replaced, its observable outputs, Design Doc Verification Strategy section | @@ -115,6 +119,8 @@ Decompose tasks based on implementation strategy patterns determined in implemen - Be specific with file paths: `src/orders/checkout`, `docs/design/payment.md` β€” not "the order module" or "related code" - When the target is a section within a file, write the file path and add a search hint: `docs/design/payment.md (Β§ Payment Flow)` or `src/orders/checkout (processOrder function)` - When test skeletons exist for the task, always include them as Investigation Targets + - When the work plan contains a UI Spec Component β†’ Task Mapping table, propagate the matching component section to every task in that row (see UI Spec Propagation below) + - When the work plan contains a Connection Map, propagate the boundary rows touching this task's target files (see Connection Map Propagation below) 7. **Implementation Pattern Consistency** When including implementation samples, MUST ensure strict compliance with the Design Doc implementation approach that forms the basis of the work plan @@ -144,6 +150,29 @@ When the work plan header includes a Quality Assurance Mechanisms table, propaga 3. **Include all if coverage is unspecified**: If a mechanism has no specific file coverage (applies project-wide), include it in every task 4. 
**Omit when no match**: If no mechanisms match a task's target files, omit the "Quality Assurance Mechanisms" section from that task +## UI Spec Propagation + +When the work plan contains a UI Spec Component β†’ Task Mapping table, propagate component references to each implementation task as follows: + +1. **Lookup by task ID**: For each row in the mapping table, locate the task(s) listed in the "Covered By Task(s)" column +2. **Append a single line to Investigation Targets**: Add one line per matched component in the task's Investigation Targets section. The line format is `[ui-spec path] (Β§ [component heading])`, with a ` β€” verify [states]` hint appended inside the parentheses only when the row lists specific states. + + - When no states are listed: `docs/ui-spec/foo-ui-spec.md (Β§ Component: AlertCard)` + - When states are listed: `docs/ui-spec/foo-ui-spec.md (Β§ Component: AlertCard β€” verify default + loading + error states)` + + This is the entire entry β€” do not also add a separate parenthetical line. The state hint is part of the same line. +3. **One row β†’ one or more tasks**: A component can be split across multiple tasks; propagate the same line to each +4. **Skip when not provided**: If the work plan has no UI Spec Component β†’ Task Mapping table, skip this propagation step + +## Connection Map Propagation + +When the work plan contains a Connection Map table, propagate boundary context to each implementation task as follows: + +1. **Lookup by task ID**: For each row in the Connection Map, locate the task(s) listed in the "Covered By Task(s)" column +2. **Append to Investigation Targets**: Add the boundary's owner module file paths on both sides to each matched task's Investigation Targets +3. **Add a "Boundary Context" note in the task body**: Record the boundary identifier and expected signal verbatim from the Connection Map row, so the executor knows what observable evidence the implementation must produce +4. **Skip when not provided**: If the work plan has no Connection Map, skip this propagation step + ## Task File Template See task template in documentation-criteria skill for details. @@ -243,6 +272,8 @@ Please execute decomposed tasks according to the order. - [ ] Appropriate granularity (1-5 files/task) - [ ] Investigation Targets specified for every task (specific file paths, not vague categories) - [ ] Quality Assurance Mechanisms from work plan header propagated to relevant tasks +- [ ] UI Spec Component β†’ Task Mapping rows propagated to matching tasks (when work plan has the table) +- [ ] Connection Map boundary rows propagated to matching tasks (when work plan has the table) - [ ] Clear completion criteria setting - [ ] Overall design document creation - [ ] Implementation efficiency and rework prevention (pre-identification of common processing, clarification of impact scope) diff --git a/dev-workflows/agents/work-planner.md b/dev-workflows/agents/work-planner.md index 1de8715..9cef86a 100644 --- a/dev-workflows/agents/work-planner.md +++ b/dev-workflows/agents/work-planner.md @@ -38,22 +38,40 @@ Choose Strategy A (TDD) if test skeletons are provided, Strategy B (implementati - Final phase is always Quality Assurance **E2E Gap Check (all strategies)**: -After determining which test skeletons are available, check whether E2E skeletons are absent. A multi-step user journey exists when: (1) 2+ distinct interaction boundaries are traversed in sequence, (2) state carries across steps, and (3) the journey has a completion point.
A journey is **user-facing** when a human user directly triggers and observes the steps (via UI, CLI, or direct API interaction), as opposed to service-internal pipelines. +After determining which test skeletons are available, check the two E2E lanes (fixture-e2e, service-integration-e2e β€” see integration-e2e-testing skill) independently. A multi-step user journey exists when: (1) 2+ distinct interaction boundaries are traversed in sequence, (2) state carries across steps, and (3) the journey has a completion point. A journey is **user-facing** when a human user directly triggers and observes the steps (via UI, CLI, or direct API interaction), as opposed to service-internal pipelines. ``` -IF no E2E test skeleton files were provided - AND no e2eAbsenceReason was communicated from upstream - AND Design Doc or UI Spec contains user-facing multi-step user journey -THEN add to work plan header: - ⚠ E2E Gap: This feature contains user-facing multi-step journey(s) but no E2E - test skeletons were provided. Consider running the test skeleton generation - step to evaluate E2E test candidates before final phase. - Detected journeys: [list journey descriptions and AC references] +fixture-e2e gap: + IF no fixture-e2e skeleton was provided + AND (e2eAbsenceReason.fixtureE2e is null + OR e2eAbsenceReason.fixtureE2e was not communicated) + AND Design Doc or UI Spec contains user-facing multi-step user journey + THEN add to work plan header: + ⚠ fixture-e2e Gap: This feature contains user-facing multi-step journey(s) + but no fixture-e2e skeleton was provided. Consider running the test + skeleton generation step to evaluate fixture-e2e candidates before the + UI implementation phase. + Detected journeys: [list journey descriptions and AC references] + +service-integration-e2e gap: + IF no service-integration-e2e skeleton was provided + AND (e2eAbsenceReason.serviceE2e is null + OR e2eAbsenceReason.serviceE2e was not communicated) + AND Design Doc indicates the journey requires real cross-service + verification (data persistence across services, transactional + consistency, external service contract) + THEN add to work plan header: + ⚠ service-integration-e2e Gap: This feature crosses service boundaries + where correctness depends on real cross-service behavior, but no + service-integration-e2e skeleton was provided. + Detected boundaries: [list crossings and AC references] ``` -When an `e2eAbsenceReason` is provided (e.g., `no_multi_step_journey`, `below_threshold_user_confirmed`), E2E absence is intentional β€” skip this gap check. +The "was not communicated" branch covers the scenario where the upstream planning flow skipped test skeleton generation entirely β€” in that case the absence reason field is not even passed to work-planner, so the gap check still runs. -This check applies regardless of whether Strategy A or B was selected. Integration-only skeletons being provided does not imply E2E coverage. Service-internal journeys (async pipelines, service-to-service sagas) are not flagged here β€” they may still warrant E2E through the normal ROI path. +When an `e2eAbsenceReason` for a lane carries a value (e.g., `no_multi_step_journey`, `below_threshold_user_confirmed`, `no_real_service_dependency`), absence in that lane is intentional β€” skip the gap check for that lane. + +This check applies regardless of whether Strategy A or B was selected. Integration-only skeletons being provided does not imply E2E coverage. 
Service-internal journeys (async pipelines, service-to-service sagas) are not flagged for the reserved-slot rule but may still warrant service-integration-e2e through the normal ROI path. **Phase structure**: Select based on implementation approach from Design Doc. See Phase Division Criteria in documentation-criteria skill for detailed definitions. Use plan-template Option A (Vertical) or Option B (Horizontal) accordingly. @@ -73,12 +91,47 @@ Map each extracted item to a covering task. Items may be covered by a dedicated Record the mapping in the Design-to-Plan Traceability table (see plan template). If an item has no covering task, set Gap Status to `gap` with justification in Notes. Gaps with justification require user confirmation before plan approval. +### 5a. Map UI Spec Components to Tasks (when UI Spec provided) + +When a UI Spec is among the inputs, also map components and states to the tasks that implement them. task-decomposer reads this mapping in Step 6 to populate each task's Investigation Targets, so without this step the UI Spec never reaches the executor. + +For each component documented in the UI Spec: +1. Identify the component's section heading exactly as it appears in the UI Spec (the heading is the reference key β€” see ui-spec-designer's heading uniqueness rule) +2. Identify which states (default / loading / empty / error / partial) the implementation must cover +3. Identify the task(s) in this plan that implement the component or its tests + +Record the mapping in the **UI Spec Component β†’ Task Mapping** table (see plan template). One row per component. Components with no covering task are flagged as `gap` requiring user confirmation, identical to the Design-to-Plan Traceability rule. + +### 5b. Map Cross-Package Boundaries to Tasks (when implementation crosses runtime/deployment boundaries) + +When the implementation crosses a runtime or deployment boundary, build a Connection Map so task-decomposer can propagate boundary context to each affected task. + +**A boundary qualifies for the Connection Map only when ALL of the following hold**: +- The two sides run in separate processes, services, or runtimes (e.g., web client ↔ HTTP server, service A ↔ service B over a network, frontend bundle ↔ backend handler) +- A serialized contract crosses between them (HTTP request/response, message envelope, RPC call, event payload) +- A failure on one side produces an observable signal on the other (status code, missing field, timeout, dropped message) + +**Excluded β€” these are NOT boundaries for the Connection Map**: +- A package importing a sibling utility, type definition, or shared constant from the same monorepo (in-process, no serialized contract) +- Internal layering within the same runtime (e.g., handler β†’ usecase β†’ repository) +- Source code dependencies that compile/bundle into the same artifact + +For each qualifying boundary: +1. Identify the boundary (e.g., `web β†’ API gateway`, `service-A β†’ service-B`, `frontend β†’ shared client β†’ backend handler`) +2. Identify the owner module/package on each side +3. Identify the expected signal that confirms the boundary works (e.g., HTTP 200 with schema X, message published to topic Y, row inserted in table Z) +4. Identify the task(s) that implement either side of the boundary + +Record the mapping in the **Connection Map** table (see plan template). Omit this section entirely when no qualifying boundary exists. + ### 6. 
Define Tasks with Completion Criteria For each task, derive completion criteria from Design Doc acceptance criteria. Apply the 3-element completion definition (Implementation Complete, Quality Complete, Integration Complete). ### 7. Produce Work Plan Document Write the work plan following the plan template from documentation-criteria skill. Include Phase Structure Diagram and Task Dependency Diagram (mermaid). +The plan header MUST include the line `Implementation Readiness: pending`. The marker contract: it takes one of three values β€” `pending` (initial, set here by work-planner), `ready` (verification completed with no remaining gaps), or `escalated` (verification completed with remaining gaps). The producer that promotes the marker beyond `pending` and the consumer that reads it before execution are external orchestration concerns owned outside this agent. + ## Input Parameters - **mode**: `create` (default) | `update` @@ -129,7 +182,8 @@ Create Red state tests based on unit test definitions provided from previous pro **Test Implementation Timing and Placement**: - Unit tests: Phase 0 Red β†’ Green during implementation - Integration tests: Create and execute at completion of relevant feature implementation (include in phase tasks like "[Feature name] implementation with integration test creation") -- E2E tests: Execute only in final phase (execution only, no separate implementation needed) +- fixture-e2e tests: Create and execute alongside the UI feature phase (include in phase tasks like "[Feature name] UI implementation with fixture-e2e creation"). These run in CI without infrastructure setup +- service-integration-e2e tests: Execute only in the final phase (these depend on local stack and tend to be too slow/heavy for per-task cycles) #### Meta Information Utilization Analyze meta information (@category, @dependency, @complexity, etc.) included in test definitions, @@ -166,22 +220,29 @@ Read test skeleton files (integration tests, E2E tests) with the Read tool and e #### Step 3: Extract Environment Prerequisites from E2E Skeletons -When E2E test skeletons are provided, scan for environment prerequisites in two stages: +When E2E test skeletons are provided, scan for environment prerequisites in two stages. Apply the lane-aware rules below β€” fixture-e2e and service-integration-e2e have very different prerequisite shapes. -**Stage 1: Detect precondition patterns** β€” scan all E2E skeletons and list every detected precondition: -- `Preconditions:` or `Arrange:` comment annotations mentioning seed data, test users, subscriptions, or specific DB state -- `@dependency: full-system` combined with auth/login setup code +**Stage 1: Detect precondition patterns** β€” scan each E2E skeleton (read its `@lane` header to know which lane applies) and list every detected precondition: +- `Preconditions:` or `Arrange:` comment annotations mentioning seed data, test users, fixtures, or specific UI/DB state +- `@dependency: full-ui (mocked backend)` combined with fixture loaders or API mock handlers (e.g., MSW for JS/TS, route interception in the project's browser harness β€” fixture-e2e) +- `@dependency: full-system` combined with auth/login setup code (service-integration-e2e) - References to environment variables (`E2E_*`, `TEST_*`) -- External service references requiring HTTP mock/intercept patterns in test code +- External service references requiring HTTP mock/intercept patterns + +**Stage 2: Generate setup tasks** β€” for each detected precondition, create a corresponding Phase 0 task. 
Common categories by lane: + +For **fixture-e2e**: +- **Fixture data** β†’ "Create fixture data files for [feature] UI states" +- **Mock backend** β†’ "Configure API mock layer for fixture-e2e (e.g., MSW for JS/TS, WireMock for JVM, responses for Python β€” use the project's standard)" +- **Browser harness** β†’ "Set up the browser harness for fixture-e2e (Playwright by default; no live services required)" -**Stage 2: Generate setup tasks** β€” for each detected precondition, create a corresponding Phase 0 task. Common categories include: -- **Seed data** β†’ "Create E2E seed data script (test users, required records)" -- **Auth fixture** β†’ "Implement E2E auth fixture using application's login flow" -- **External service mocks** β†’ "Configure external service mocks for E2E tests" -- **Environment configuration** β†’ "Define E2E environment variables and document setup" -- **Other detected preconditions** β†’ Create a setup task matching the detected category +For **service-integration-e2e**: +- **Seed data** β†’ "Create seed data script for service-integration-e2e (test users, required records)" +- **Auth fixture** β†’ "Implement auth fixture using application's login flow" +- **External service stubs** β†’ "Configure external service stubs for service-integration-e2e" +- **Environment configuration** β†’ "Define service-integration-e2e environment variables and document local startup" -Place all environment setup tasks in Phase 0 (before any implementation tasks). Mark with `@category: e2e-setup` for traceability. +Place all environment setup tasks in Phase 0 (before any implementation tasks). Mark with `@category: e2e-setup` and `@lane:` matching the target lane for traceability. #### Step 4: Classify and Place Tests @@ -189,7 +250,8 @@ Place all environment setup tasks in Phase 0 (before any implementation tasks). - Setup items (Mock preparation, measurement tools, Helpers, etc.) β†’ Prioritize in Phase 1 - Unit tests (individual functions) β†’ Start from Phase 0 with Red-Green-Refactor - Integration tests β†’ Place as create/execute tasks when relevant feature implementation is complete -- E2E tests β†’ Place as execute-only tasks in final phase +- fixture-e2e tests β†’ Place as create/execute tasks alongside the relevant UI feature implementation +- service-integration-e2e tests β†’ Place as execute-only tasks in final phase - Non-functional requirement tests (performance, UX, etc.) β†’ Place in quality assurance phase - Risk levels ("high risk", "required", etc.) 
β†’ Move to earlier phases @@ -234,6 +296,12 @@ When creating work plans, **Phase Structure Diagrams** and **Task Dependency Dia - [ ] Design-to-Plan Traceability table complete (all DD technical requirements categorized and mapped) - [ ] No `gap` entries without justification - [ ] All justified `gap` entries flagged for user confirmation before plan approval +- [ ] UI Spec Component β†’ Task Mapping table complete (when UI Spec provided) + - [ ] Every UI Spec component has a covering task, OR an explicit `gap` justification + - [ ] Component reference uses the UI Spec section heading exactly as it appears in the document +- [ ] Connection Map table complete (when implementation crosses packages/services) + - [ ] Every boundary lists owner modules and expected signal + - [ ] Every boundary maps to at least one covering task on each side - [ ] Verification Strategy extracted from Design Doc and included in plan header - [ ] Adopted Quality Assurance Mechanisms extracted from Design Doc and included in plan header - [ ] Phase structure matches implementation approach (vertical β†’ value unit phases, horizontal β†’ layer phases) @@ -242,7 +310,8 @@ When creating work plans, **Phase Structure Diagrams** and **Task Dependency Dia - [ ] Quality assurance exists in final phase - [ ] Test skeleton file paths listed in corresponding phases (when provided) - [ ] E2E environment prerequisites addressed (when E2E skeletons provided) - - [ ] Seed data, auth fixture, and external service mock tasks generated + - [ ] fixture-e2e prerequisites: fixture data, mocked backend, browser harness tasks generated when applicable + - [ ] service-integration-e2e prerequisites: seed data, auth fixture, external service stub tasks generated when applicable - [ ] Environment setup tasks placed in Phase 0 - [ ] Test design information reflected (only when provided) - [ ] Setup tasks placed in first phase diff --git a/dev-workflows/skills/documentation-criteria/references/plan-template.md b/dev-workflows/skills/documentation-criteria/references/plan-template.md index 1120318..084ec79 100644 --- a/dev-workflows/skills/documentation-criteria/references/plan-template.md +++ b/dev-workflows/skills/documentation-criteria/references/plan-template.md @@ -5,6 +5,7 @@ Type: feature|fix|refactor Estimated Duration: X days Estimated Impact: X files Related Issue/PR: #XXX (if any) +Implementation Readiness: pending ## Related Documents - Design Doc(s): @@ -46,6 +47,26 @@ Maps each Design Doc technical requirement to the covering task(s). One row per **Gap Status values**: `covered` (task exists), `gap` (no task β€” requires justification in Notes, user confirmation required before plan approval) +## UI Spec Component β†’ Task Mapping + +Include this section when a UI Spec is among the inputs. Maps each component documented in the UI Spec to the task(s) that implement it. task-decomposer reads this table to populate each task's Investigation Targets with the corresponding UI Spec section. Omit the section when no UI Spec exists. + +| UI Spec Component (section heading) | States to Cover | Covered By Task(s) | Gap Status | Notes | +|---|---|---|---|---| +| [Use the UI Spec heading exactly as written, e.g., "Β§ Component: AlertCard"] | [default / loading / empty / error / partial β€” list the states the implementation must produce] | [Phase X Task Y] | covered | | + +**Reference key rule**: The component identifier in column 1 is the UI Spec section heading (verbatim). 
ui-spec-designer enforces unique component headings so this reference resolves to exactly one section. + +**Gap Status values**: `covered` (task exists), `gap` (no task β€” requires justification in Notes, user confirmation required before plan approval) + +## Connection Map + +Include this section when the implementation crosses more than one package, service, or process boundary. Document each boundary so task-decomposer can propagate boundary context to the implementation tasks on each side. Omit the section when the implementation stays within a single package. + +| Boundary | Owner (left side) | Owner (right side) | Expected Signal | Covered By Task(s) | +|---|---|---|---|---| +| [e.g., "web client β†’ API gateway"] | [module/package on the request side] | [module/package on the response side] | [Observable evidence the boundary works β€” e.g., "HTTP 200 with response matching ContractA", "row inserted in tableB", "message published to topicC"] | [Phase X Task Y on each side] | + ## Objective [Why this change is necessary, what problem it solves] diff --git a/dev-workflows/skills/documentation-criteria/references/ui-spec-template.md b/dev-workflows/skills/documentation-criteria/references/ui-spec-template.md index b66681b..9b8b60d 100644 --- a/dev-workflows/skills/documentation-criteria/references/ui-spec-template.md +++ b/dev-workflows/skills/documentation-criteria/references/ui-spec-template.md @@ -59,6 +59,8 @@ Map PRD acceptance criteria to prototype references. Skip this section if no pro ### Component: [ComponentName] +> Component heading uniqueness: every `Component: [ComponentName]` heading must be unique within this UI Spec. work-planner and task-decomposer reference components by exact heading text β€” duplicate names or paraphrased headings break the propagation to implementation tasks. + #### State x Display Matrix | State | Default | Loading | Empty | Error | Partial | diff --git a/dev-workflows/skills/integration-e2e-testing/SKILL.md b/dev-workflows/skills/integration-e2e-testing/SKILL.md index 6ad9889..801459c 100644 --- a/dev-workflows/skills/integration-e2e-testing/SKILL.md +++ b/dev-workflows/skills/integration-e2e-testing/SKILL.md @@ -7,14 +7,21 @@ description: Integration and E2E test design principles, ROI calculation, test s ## References -**E2E test design with Playwright**: See [references/e2e-design.md](references/e2e-design.md) for UI Spec-driven E2E test candidate selection and Playwright test architecture. +**E2E test design**: See [references/e2e-design.md](references/e2e-design.md) for UI Spec-driven E2E test candidate selection and browser test architecture. The reference uses Playwright as the default browser harness; substitute the project's standard when different. 
## Test Type Definition and Limits -| Test Type | Purpose | Scope | Limit per Feature | Implementation Timing | -|-----------|---------|-------|-------------------|----------------------| -| Integration | Verify component interactions | Partial system integration | MAX 3 | Created alongside implementation | -| E2E | Verify critical user journeys | Full system | MAX 1-2 | Executed in final phase only | +| Test Type | Purpose | Scope | External Deps | Limit per Feature | Implementation Timing | +|-----------|---------|-------|---------------|-------------------|----------------------| +| Integration | Verify component interactions in-process | Partial system integration (in-process modules; for UI components, the framework's in-process renderer e.g., RTL+MSW for React/TS) | Mocked or in-process | MAX 3 | Created alongside implementation | +| fixture-e2e | Verify UI behavior in a browser with deterministic fixtures | Full UI flow with mocked backend / fixture-driven state | Mocked / fixture only β€” no live services | MAX 3 | Created alongside the UI feature | +| service-integration-e2e | Verify critical user journeys against a running local stack | Full system across services | Live local services or stubs | MAX 1-2 | Executed only in the final phase | + +**Lane selection (E2E only)**: +- Default lane for user-facing UI journeys is **fixture-e2e** β€” it runs a real browser against deterministic fixtures, catches the bugs that unit/integration tests miss (button no-op, state never updates, navigation breaks), and runs in CI without infrastructure setup +- Add **service-integration-e2e** only when the journey's correctness depends on real cross-service behavior (data persistence, transactional consistency, external service contracts) that cannot be faked safely + +The two E2E lanes are budgeted independently β€” having a fixture-e2e for a journey does not consume the service-integration-e2e budget and vice versa. ## Behavior-First Principle @@ -43,20 +50,29 @@ ROI Score = Business Value Γ— User Frequency + Legal Requirement Γ— 10 + Defect Higher ROI Score = higher priority within its test type. No normalization or capping is applied β€” the raw score is used directly for ranking. Deduplication is a separate step that removes candidates entirely; it does not modify scores. -### ROI Threshold for E2E +### ROI Thresholds by Lane + +The two E2E lanes have very different ownership costs and use independent thresholds. -E2E tests have high ownership cost (creation, execution, and maintenance are each 3-10Γ— higher than integration tests). To justify creation, an E2E candidate (beyond the must-keep reserved slot) requires **ROI Score > 50**. +| Lane | ROI threshold | Rationale | +|------|---------------|-----------| +| fixture-e2e | ROI β‰₯ 20 (beyond reserved slot) | Cost is comparable to integration tests once the harness exists; the floor avoids filling MAX 3 with low-signal tests when fewer would suffice | +| service-integration-e2e | ROI > 50 (beyond reserved slot) | Creation, execution, and maintenance cost is 3-10Γ— higher than integration; reserve for journeys whose value cannot be proven any other way | + +Reserved slot rules (see Multi-Step User Journey Definition below) apply per lane and override the threshold (the reserved candidate is emitted regardless of its ROI score). Below-floor candidates beyond the reserved slot are not emitted, leaving budget intentionally unfilled rather than padding with low-value tests. 
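As a worked illustration of the scoring and floor rules above, the following TypeScript sketch applies the formula BV x Freq + (legal ? 10 : 0) + defect, which the examples in the next table follow; the type and function names are illustrative only and are not part of the skill contract.

```typescript
// Sketch of the ROI formula and per-lane floors described above; names are illustrative.
type E2eLane = "fixture-e2e" | "service-integration-e2e";

interface Candidate {
  businessValue: number;    // 1-10
  userFrequency: number;    // 1-10
  legalRequirement: boolean;
  defectDetection: number;  // 1-10
  reservedSlot: boolean;    // user-facing multi-step journey (defaults to fixture-e2e)
}

// ROI = BV x Freq + (legal ? 10 : 0) + defect, matching the worked examples below.
function roiScore(c: Candidate): number {
  return c.businessValue * c.userFrequency + (c.legalRequirement ? 10 : 0) + c.defectDetection;
}

// Reserved-slot candidates are emitted regardless of score; everything else must
// clear the lane floor. Budget caps (MAX 3 / MAX 1-2) are enforced afterwards.
function clearsFloor(c: Candidate, lane: E2eLane): boolean {
  if (c.reservedSlot) return true;
  const score = roiScore(c);
  return lane === "fixture-e2e" ? score >= 20 : score > 50;
}

// "Dismiss button updates UI state": 6 x 7 + 0 + 8 = 50 -> clears the fixture-e2e floor.
const dismiss: Candidate = { businessValue: 6, userFrequency: 7, legalRequirement: false, defectDetection: 8, reservedSlot: false };
console.log(roiScore(dismiss), clearsFloor(dismiss, "fixture-e2e"));
```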
### ROI Calculation Examples | Scenario | BV | Freq | Legal | Defect | ROI Score | Test Type | Selection Outcome | |----------|----|------|-------|--------|-----------|-----------|-------------------| -| Core checkout flow | 10 | 9 | true | 9 | 109 | E2E | Selected (reserved slot: user-facing multi-step journey) | -| Payment error handling | 8 | 3 | false | 7 | 31 | E2E | Below threshold (31 < 50), not selected | -| Profile save flow | 7 | 6 | false | 6 | 48 | E2E | Below threshold (48 < 50), not selected | +| Core checkout UI flow | 10 | 9 | true | 9 | 109 | fixture-e2e | Selected (reserved slot: user-facing multi-step journey, browser-level verification with fixtures) | +| Core checkout against live payment service | 10 | 9 | true | 9 | 109 | service-integration-e2e | Selected (real-service correctness above ROI threshold) | +| Dismiss button updates UI state | 6 | 7 | false | 8 | 50 | fixture-e2e | Selected (rank 2 of 3 fixture-e2e budget) | +| Payment error message display | 5 | 4 | false | 7 | 27 | fixture-e2e | Selected (rank 3 of 3 fixture-e2e budget) | +| Optional filter toggle | 3 | 4 | false | 2 | 14 | fixture-e2e | Not selected (rank 4, budget full) | +| Payment retry against real provider | 8 | 3 | false | 7 | 31 | service-integration-e2e | Below ROI threshold (31 < 50), not selected | | DB persistence check | 8 | 8 | false | 8 | 72 | Integration | Selected (rank 1 of 3) | -| Error message display | 5 | 3 | false | 4 | 19 | Integration | Selected (rank 2 of 3) | -| Optional filter toggle | 3 | 4 | false | 2 | 14 | Integration | Not selected (rank 4, budget full) | +| Pure data transformation | 5 | 3 | false | 4 | 19 | Integration | Selected (rank 2 of 3) | ## Multi-Step User Journey Definition @@ -72,14 +88,14 @@ A feature qualifies as containing a **multi-step user journey** when ALL of the ### User-Facing vs Service-Internal Journeys -Multi-step journeys are further classified for E2E budget decisions: +Multi-step journeys are classified for reserved-slot eligibility: -| Classification | Condition | E2E Reserved Slot | Example | +| Classification | Condition | Reserved Slot Eligibility | Example | |---|---|---|---| -| **User-facing** | A human user directly triggers and observes the steps (via UI, CLI, or direct API interaction) | Eligible | Web checkout flow, CLI setup wizard, mobile onboarding | -| **Service-internal** | Steps are triggered by backend services without direct user interaction | Not eligible (use integration tests) | Async job pipeline, service-to-service saga, scheduled batch processing | +| **User-facing** | A human user directly triggers and observes the steps (via UI, CLI, or direct API interaction) | Eligible β€” defaults to **fixture-e2e** reserved slot. Add a service-integration-e2e reserved slot only when the journey's correctness depends on real cross-service behavior | Web checkout flow, CLI setup wizard, mobile onboarding | +| **Service-internal** | Steps are triggered by backend services without direct user interaction | Not eligible for reserved slot β€” use integration tests. Service-integration-e2e through normal ROI > 50 path is still valid when full-system verification is warranted | Async job pipeline, service-to-service saga, scheduled batch processing | -This classification applies only to the reserved E2E slot and the E2E Gap Check. Service-internal journeys are still valid E2E candidates through the normal ROI > 50 path if they warrant full-system verification. 
+This classification applies only to the reserved-slot rule and the E2E Gap Check. Other selection follows lane-specific ROI rules above. Use this definition when evaluating E2E test candidates and E2E gap detection. @@ -92,12 +108,18 @@ Each test MUST include the following annotations: ``` AC: [Original acceptance criteria text] Behavior: [Trigger] β†’ [Process] β†’ [Observable Result] -@category: core-functionality | integration | edge-case | e2e +@category: core-functionality | integration | edge-case | fixture-e2e | service-integration-e2e +@lane: integration | fixture-e2e | service-integration-e2e @dependency: none | [component names] | full-system @complexity: low | medium | high ROI: [score] ``` +**`@lane` selection rule**: +- `integration` β€” Component interaction in-process, no browser (e.g., RTL+MSW for React/TS, in-process module/handler integration in any language) +- `fixture-e2e` β€” Browser-level UI verification with mocked backend / fixture-driven state +- `service-integration-e2e` β€” Browser-level or end-to-end verification against running local services or stubs + Use the project's comment syntax to wrap these annotations (e.g., `//` for C-family, `#` for Python/Ruby/Shell). ### Verification Items (Optional) @@ -121,9 +143,10 @@ Verification items: ## Test File Naming Convention - Integration tests: `*.int.test.*` or `*.integration.test.*` -- E2E tests: `*.e2e.test.*` +- fixture-e2e tests: `*.fixture.e2e.test.*` (or organize under `tests/e2e/fixture/`) +- service-integration-e2e tests: `*.service.e2e.test.*` (or organize under `tests/e2e/service/`) -The test runner or framework in the project determines the appropriate file extension. +The test runner or framework in the project determines the appropriate file extension. Repos that already use a single `*.e2e.test.*` convention may keep it as long as each file declares `@lane:` in its header β€” the lane annotation is the source of truth for routing and budget accounting. ## Review Criteria diff --git a/dev-workflows/skills/integration-e2e-testing/references/e2e-design.md b/dev-workflows/skills/integration-e2e-testing/references/e2e-design.md index f4e9e90..45a0174 100644 --- a/dev-workflows/skills/integration-e2e-testing/references/e2e-design.md +++ b/dev-workflows/skills/integration-e2e-testing/references/e2e-design.md @@ -1,8 +1,21 @@ -# E2E Test Design with Playwright +# E2E Test Design (Browser Harness) + +This reference uses Playwright as the default example throughout because it is the standard E2E browser harness assumed by these workflows. Adapt patterns to the project's chosen framework when different (Cypress, Selenium, etc.); the lane definitions, ROI rules, and budgets remain the same. + +## Two E2E Lanes + +E2E tests in this workflow split into two lanes (see parent skill Test Type Definition): + +| Lane | When | ROI gate | Cost | +|------|------|----------|------| +| **fixture-e2e** | UI journey verification with deterministic fixtures (mocked backend / fixture data) | None β€” selected by ranking within MAX 3 budget | Comparable to integration; runs in CI without infrastructure setup | +| **service-integration-e2e** | Journey correctness depends on real cross-service behavior (data persistence, transactional consistency, external contracts) | ROI > 50 (beyond reserved slot) | 3-10Γ— higher than integration; reserved for what cannot be faked safely | + +Both lanes typically use Playwright; the difference is whether the backend is mocked / fixture-driven or running for real. 
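To make the lane contrast concrete, here is a minimal sketch of a fixture-e2e test using Playwright (the default harness assumed above): the browser is real, and every backend call is answered from inline fixture data via route interception. The route, selectors, fixture shape, and a configured `baseURL` are assumptions for illustration, not an implementation from this repository.

```typescript
// fixture-e2e sketch: real browser, mocked backend via Playwright route interception.
// The route, selectors, and fixture shape are invented for illustration.
import { test, expect } from "@playwright/test";

const alertsFixture = [{ id: "a1", title: "Disk usage high", dismissed: false }];

// @lane: fixture-e2e
// @dependency: full-ui (mocked backend)
test("User Journey: Dismiss-then-Undo restores the card", async ({ page }) => {
  // Every backend call is answered from fixture data; no live services are involved.
  await page.route("**/api/alerts", (route) =>
    route.fulfill({ status: 200, contentType: "application/json", body: JSON.stringify(alertsFixture) })
  );

  await page.goto("/alerts");
  await page.getByRole("button", { name: "Dismiss" }).first().click();
  await expect(page.getByText("Alert dismissed")).toBeVisible();

  await page.getByRole("button", { name: "Undo" }).click();
  await expect(page.getByText("Disk usage high")).toBeVisible();
});
```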
## When to Create E2E Tests -E2E tests target **critical user journeys** that span multiple pages or require real browser interaction. Apply the same ROI framework from the parent skill β€” only create E2E tests when ROI > 50. +E2E candidates target **critical user journeys** that span multiple pages or require real browser interaction. Pick the lane based on whether real services are required for the verification. ### Candidate Sources @@ -22,8 +35,8 @@ E2E tests target **critical user journeys** that span multiple pages or require - Responsive behavior across viewports **Use integration tests instead when**: -- Testing single-component state changes β†’ RTL -- Testing API response handling β†’ MSW + RTL +- Testing single-component state changes β†’ in-process component renderer (e.g., RTL for React/TS) +- Testing API response handling β†’ in-process API mock + component renderer (e.g., MSW + RTL for React/TS) - Testing pure data transformations β†’ unit tests ## UI Spec to E2E Test Mapping @@ -41,12 +54,15 @@ When a UI Spec exists, use it as the primary source for E2E test design: Screen Transition: [Screen A] β†’ [Screen B] β†’ [Screen C] AC Reference: AC-{id} User Journey: [Description of what the user accomplishes] -Preconditions: [Auth state, data state] +Lane: fixture-e2e | service-integration-e2e +Preconditions: [Auth state, data state β€” note whether these are fixture-driven or live] Verification Points: - [What to assert at each step] E2E ROI Score: [calculated score] ``` +**Lane decision**: choose `fixture-e2e` by default. Promote to `service-integration-e2e` when the verification requires observing real cross-service behavior (e.g., the test asserts that data persists across a real DB write, or that an external service receives the correct payload). 
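For contrast, a sketch of the same harness used in the service-integration-e2e lane, where the decisive assertion is the real cross-service effect rather than the rendered UI. Endpoint paths and test IDs are assumptions, and a configured `baseURL` plus a running local stack are presumed.

```typescript
// service-integration-e2e sketch: same harness, but the decisive assertion is the
// real cross-service effect. Endpoints and test IDs are assumptions.
import { test, expect } from "@playwright/test";

// @lane: service-integration-e2e
// @dependency: full-system
test("User Journey: purchase persists the order", async ({ page, request }) => {
  await page.goto("/checkout");
  await page.getByRole("button", { name: "Place order" }).click();
  await expect(page.getByText("Order confirmed")).toBeVisible();

  // Real-service verification: the order exists behind the API, not just in the UI.
  const orderId = await page.getByTestId("order-id").textContent();
  const response = await request.get(`/api/orders/${orderId}`);
  expect(response.ok()).toBeTruthy();
});
```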
+ ## Playwright Test Architecture ### Page Object Pattern @@ -56,9 +72,11 @@ Organize browser interactions through page objects for maintainability: ``` tests/ β”œβ”€β”€ e2e/ -β”‚ β”œβ”€β”€ pages/ # Page objects -β”‚ β”œβ”€β”€ fixtures/ # Test fixtures and helpers -β”‚ └── *.e2e.test.ts # Test files +β”‚ β”œβ”€β”€ pages/ # Page objects (shared across lanes) +β”‚ β”œβ”€β”€ fixtures/ # Test fixtures and helpers (auth, seed) +β”‚ β”œβ”€β”€ data/ # Static fixture data for fixture-e2e +β”‚ β”œβ”€β”€ *.fixture.e2e.test.ts # fixture-e2e test files +β”‚ └── *.service.e2e.test.ts # service-integration-e2e test files ``` ### Test Isolation @@ -81,6 +99,6 @@ When UI Spec defines responsive behavior, test critical breakpoints: ## Budget Enforcement Hard limits per feature (same as parent skill): -- **E2E Tests**: MAX 1-2 tests -- Only generate if ROI score > 50 -- Prefer fewer, comprehensive journey tests over many granular tests +- **fixture-e2e**: MAX 3 tests, no ROI gate (selected by ranking) +- **service-integration-e2e**: MAX 1-2 tests, ROI > 50 beyond the reserved slot +- Prefer fewer, comprehensive journey tests over many granular tests in both lanes diff --git a/dev-workflows/skills/recipe-add-integration-tests/SKILL.md b/dev-workflows/skills/recipe-add-integration-tests/SKILL.md index da21964..b05f111 100644 --- a/dev-workflows/skills/recipe-add-integration-tests/SKILL.md +++ b/dev-workflows/skills/recipe-add-integration-tests/SKILL.md @@ -160,6 +160,14 @@ Check quality-fixer response: On `approved` from quality-fixer: - Commit test files using Bash with message format: "test: add [layer] integration tests for [feature name]" +### Step 9: Final Cleanup + +After all task files have been processed and committed, delete the task files this recipe created. Their work is committed; `docs/plans/` is ephemeral working state and is not retained between recipe runs: + +- Delete every file matching `docs/plans/tasks/integration-tests-backend-task-*.md` and `docs/plans/tasks/integration-tests-frontend-task-*.md` created during this run + +If task files cannot be deleted (filesystem error), report the failure but do not block completion. + ## Scope Boundary for Subagents Append the following block to every subagent prompt invoked from this recipe: diff --git a/dev-workflows/skills/recipe-build/SKILL.md b/dev-workflows/skills/recipe-build/SKILL.md index 7907366..d56dbb3 100644 --- a/dev-workflows/skills/recipe-build/SKILL.md +++ b/dev-workflows/skills/recipe-build/SKILL.md @@ -20,33 +20,51 @@ Work plan: $ARGUMENTS ## Pre-execution Prerequisites -### Task File Existence Check -```bash -# Check work plans -! ls -la docs/plans/*.md | grep -v template | tail -5 +### Implementation Readiness Check -# Check task files -! ls docs/plans/tasks/*.md 2>/dev/null || echo "No task files found" -``` +Before any task processing, locate the work plan to gate against. Resolution rule: +1. List task files in `docs/plans/tasks/` matching the single-layer pattern `{plan-name}-task-*.md`. Layer-aware fullstack tasks (`{plan-name}-backend-task-*.md` / `{plan-name}-frontend-task-*.md`) are excluded here so a stale fullstack run does not redirect this recipe to the wrong work plan +2. 
From the matched files, also exclude every file matching any of these patterns β€” they originate from other workflow phases and are not implementation tasks for this run's plan: `*-task-prep-*.md` (readiness preflight tasks), `_overview-*.md` (decomposition overview file), `*-phase*-completion.md` (per-phase completion files), `review-fixes-*.md` (post-implementation review fixes), `integration-tests-*-task-*.md` (integration-test add-on scaffolding) +3. For each remaining file, extract the `{plan-name}` prefix as the segment that appears before `-task-` +4. When at least one task file matches, the work plan is `docs/plans/{plan-name}.md` for the prefix that has the most recent task-file mtime; ties broken by the lexicographically last `{plan-name}` +5. When no task file matches the restricted pattern, the work plan is the most-recent-mtime non-template `.md` in `docs/plans/` + +Read the work plan header and find the line `Implementation Readiness: `. Apply this rule: + +| Status | Action | +|--------|--------| +| `ready` | Proceed to Consumed Task Set computation | +| `escalated` | Read the work plan's Readiness Report section, surface remaining gaps to the user via AskUserQuestion: "Implementation Readiness is `escalated` with the following remaining gaps: [list]. Continue execution? (y/n)". On `y` proceed; on `n` stop | +| `pending` | Present via AskUserQuestion: "Implementation Readiness is `pending`. Run `/recipe-prepare-implementation [plan-path]` first to verify the work plan is implementable, then resume. Continue without preflight? (y/n)". On `y` proceed; on `n` stop | +| absent (line missing) | Treat as `pending` β€” older work plans created before the readiness marker existed should be preflighted explicitly | + +### Consumed Task Set + +Compute the **Consumed Task Set** for this run β€” the exact files this recipe owns, executes, and later deletes. Use the same restricted pattern as the Implementation Readiness Check: + +1. List task files in `docs/plans/tasks/` matching the single-layer pattern `{plan-name}-task-*.md` for the `{plan-name}` resolved by the readiness check. Layer-aware fullstack tasks are excluded +2. Exclude every file matching: `*-task-prep-*.md`, `_overview-*.md`, `*-phase*-completion.md`, `review-fixes-*.md`, `integration-tests-*-task-*.md` (these originate from other workflow phases) + +Every subsequent reference to "task files" in this recipe β€” Task Generation Decision Flow, Task Execution Cycle iteration, and Final Cleanup β€” uses this set, not the unrestricted `docs/plans/tasks/*.md` glob. 
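The resolution rule above is mechanical enough to sketch. The following TypeScript illustrates the include/exclude patterns, plan-name extraction, and mtime ranking; it is not a script that exists in this repository, and the regexes only approximate the glob patterns named in the prose.

```typescript
// Illustration of the Consumed Task Set resolution rule; not a real script in this repo.
import { readdirSync, statSync } from "node:fs";
import { join } from "node:path";

const TASK_DIR = "docs/plans/tasks";
const EXCLUDE = [
  /-task-prep-/,                 // readiness preflight tasks
  /^_overview-/,                 // decomposition overview file
  /-phase\d+-completion\.md$/,   // per-phase completion files
  /^review-fixes-/,              // post-implementation review fixes
  /^integration-tests-.+-task-/, // integration-test add-on scaffolding
];

const isLayerAware = (f: string) => f.includes("-backend-task-") || f.includes("-frontend-task-");
const isExcluded = (f: string) => EXCLUDE.some((re) => re.test(f));

// Step 4: pick the plan whose task files have the most recent mtime;
// ties go to the lexicographically last plan name.
function resolvePlanName(): string | null {
  const latestByPlan = new Map<string, number>();
  for (const f of readdirSync(TASK_DIR)) {
    const m = f.match(/^(.+?)-task-.+\.md$/);
    if (!m || isLayerAware(f) || isExcluded(f)) continue;
    const mtime = statSync(join(TASK_DIR, f)).mtimeMs;
    latestByPlan.set(m[1], Math.max(latestByPlan.get(m[1]) ?? 0, mtime));
  }
  const ranked = [...latestByPlan.entries()].sort((a, b) => b[1] - a[1] || b[0].localeCompare(a[0]));
  return ranked.length > 0 ? ranked[0][0] : null;
}

// The Consumed Task Set: single-layer tasks for the resolved plan, minus other-phase files.
function consumedTaskSet(planName: string): string[] {
  return readdirSync(TASK_DIR)
    .filter((f) => f.startsWith(`${planName}-task-`) && f.endsWith(".md"))
    .filter((f) => !isLayerAware(f) && !isExcluded(f))
    .map((f) => join(TASK_DIR, f));
}
```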
### Task Generation Decision Flow -Analyze task file existence state and determine the action required: +Analyze the Consumed Task Set and determine the action required: | State | Criteria | Next Action | |-------|----------|-------------| -| Tasks exist | .md files in tasks/ directory | User's execution instruction serves as batch approval β†’ Enter autonomous execution immediately | -| No tasks + plan exists | Plan exists but no task files | Confirm with user β†’ run task-decomposer | -| Neither exists + Design Doc exists | No plan or task files, but docs/design/*.md exists | Invoke work-planner to create work plan from Design Doc, then proceed to task decomposition | -| Neither exists | No plan, no task files, no Design Doc | Report missing prerequisites to user and stop | +| Tasks exist | Consumed Task Set is non-empty | User's execution instruction serves as batch approval β†’ Enter autonomous execution immediately | +| No tasks + plan exists | Consumed Task Set is empty but the resolved work plan exists | Confirm with user β†’ run task-decomposer | +| Neither exists + Design Doc exists | No plan, no Consumed Task Set, but `docs/design/*.md` exists | Invoke work-planner to create work plan from Design Doc, then proceed to task decomposition | +| Neither exists | No plan, no Consumed Task Set, no Design Doc | Report missing prerequisites to user and stop | ## Task Decomposition Phase (Conditional) -When task files don't exist: +When the Consumed Task Set is empty: ### 1. User Confirmation ``` -No task files found. +No task files in the Consumed Task Set. Work plan: docs/plans/[plan-name].md Generate tasks from the work plan? (y/n): @@ -59,17 +77,14 @@ Invoke task-decomposer using Agent tool: - `prompt`: "Read work plan at docs/plans/[plan-name].md and decompose into atomic tasks. Output: Individual task files in docs/plans/tasks/. Granularity: 1 task = 1 commit = independently executable" ### 3. Verify Generation -```bash -# Verify generated task files -! ls -la docs/plans/tasks/*.md | head -10 -``` +Recompute the Consumed Task Set using the same restricted pattern from the Consumed Task Set section above. Confirm it is now non-empty. If it is still empty, escalate to the user β€” task-decomposer either failed silently or produced files that don't match the expected pattern. -**Flow**: Task generation β†’ Autonomous execution (in this order) +**Flow**: Task generation β†’ Consumed Task Set recompute β†’ Autonomous execution (in this order) ## Pre-execution Checklist -- [ ] Confirmed task files exist in docs/plans/tasks/ -- [ ] Identified task execution order (dependencies) +- [ ] Confirmed Consumed Task Set is non-empty (computed in the Consumed Task Set section above) +- [ ] Identified task execution order within the Consumed Task Set (dependencies) - [ ] **Environment check**: Can I execute per-task commit cycle? - If commit capability unavailable β†’ Escalate before autonomous mode - Other environments (tests, quality tools) β†’ Subagents will escalate @@ -77,7 +92,7 @@ Invoke task-decomposer using Agent tool: ## Task Execution Cycle (4-Step Cycle) **MANDATORY EXECUTION CYCLE**: `task-executor β†’ escalation check β†’ quality-fixer β†’ commit` -For EACH task, YOU MUST: +For EACH task in the Consumed Task Set, YOU MUST: 1. **Register tasks using TaskCreate**: Register work steps. Always include first task "Map preloaded skills to applicable concrete rules" and final task "Verify the mapped rules before final JSON" 2. 
**Agent tool** (subagent_type: "dev-workflows:task-executor") β†’ Pass task file path in prompt, receive structured response 3. **CHECK task-executor response**: @@ -127,7 +142,18 @@ After all task cycles finish, run verification agents **in parallel** before the - Re-run only the failed verifiers (by the criteria in step 2) - Repeat until all pass or `blocked` β†’ Escalate to user -4. **All passed** β†’ Proceed to completion report +4. **All passed** β†’ Proceed to Final Cleanup + +## Final Cleanup + +Before the completion report, delete the implementation task files this recipe consumed. Their work is committed; `docs/plans/` is ephemeral working state and is not retained between recipe runs: + +- Delete every file in the Consumed Task Set +- Delete every file matching `docs/plans/tasks/{plan-name}-phase*-completion.md` (the per-phase completion files generated by task-decomposer for this `{plan-name}`) +- Delete the corresponding `docs/plans/tasks/_overview-{plan-name}.md` if present +- Preserve the work plan itself (`docs/plans/{plan-name}.md`) β€” the user decides whether to delete it after final review + +If task files cannot be deleted (filesystem error), report the failure but do not block the completion report. ## Output Example Implementation phase completed. @@ -135,4 +161,5 @@ Implementation phase completed. - Implemented tasks: [number] tasks - Quality checks: All passed - Commits: [number] commits created +- Cleanup: Task files removed from docs/plans/tasks/ diff --git a/dev-workflows/skills/recipe-fullstack-build/SKILL.md b/dev-workflows/skills/recipe-fullstack-build/SKILL.md index 7d4bb8e..6488bf2 100644 --- a/dev-workflows/skills/recipe-fullstack-build/SKILL.md +++ b/dev-workflows/skills/recipe-fullstack-build/SKILL.md @@ -28,33 +28,51 @@ Work plan: $ARGUMENTS ## Pre-execution Prerequisites -### Task File Existence Check -```bash -# Check work plans -! ls -la docs/plans/*.md | grep -v template | tail -5 +### Implementation Readiness Check -# Check task files -! ls docs/plans/tasks/*.md 2>/dev/null || echo "No task files found" -``` +Before any task processing, locate the work plan to gate against. Resolution rule: +1. List task files in `docs/plans/tasks/` matching the layer-aware patterns `{plan-name}-backend-task-*.md` and `{plan-name}-frontend-task-*.md` only. Single-layer tasks (`{plan-name}-task-*.md`) are excluded here so a stale single-layer run does not redirect this recipe to the wrong work plan +2. From the matched files, also exclude every file matching any of these patterns β€” they originate from other workflow phases and are not implementation tasks for this run's plan: `*-task-prep-*.md` (readiness preflight tasks), `_overview-*.md` (decomposition overview file), `*-phase*-completion.md` (per-phase completion files), `review-fixes-*.md` (post-implementation review fixes), `integration-tests-*-task-*.md` (integration-test add-on scaffolding) +3. For each remaining file, extract the `{plan-name}` prefix as the segment that appears before `-backend-task-` or `-frontend-task-` +4. When at least one task file matches, the work plan is `docs/plans/{plan-name}.md` for the prefix that has the most recent task-file mtime; ties broken by the lexicographically last `{plan-name}` +5. When no task file matches the restricted pattern, the work plan is the most-recent-mtime non-template `.md` in `docs/plans/` + +Read the work plan header and find the line `Implementation Readiness: `. 
Apply this rule: + +| Status | Action | +|--------|--------| +| `ready` | Proceed to Consumed Task Set computation | +| `escalated` | Read the work plan's Readiness Report section, surface remaining gaps to the user via AskUserQuestion: "Implementation Readiness is `escalated` with the following remaining gaps: [list]. Continue execution? (y/n)". On `y` proceed; on `n` stop | +| `pending` | Present via AskUserQuestion: "Implementation Readiness is `pending`. Run `/recipe-prepare-implementation [plan-path]` first to verify the work plan is implementable, then resume. Continue without preflight? (y/n)". On `y` proceed; on `n` stop | +| absent (line missing) | Treat as `pending` β€” older work plans created before the readiness marker existed should be preflighted explicitly | + +### Consumed Task Set + +Compute the **Consumed Task Set** for this run β€” the exact files this recipe owns, executes, and later deletes. Use the same restricted pattern as the Implementation Readiness Check: + +1. List task files in `docs/plans/tasks/` matching the layer-aware patterns `{plan-name}-backend-task-*.md` and `{plan-name}-frontend-task-*.md` for the `{plan-name}` resolved by the readiness check. Single-layer tasks are excluded +2. Exclude every file matching: `*-task-prep-*.md`, `_overview-*.md`, `*-phase*-completion.md`, `review-fixes-*.md`, `integration-tests-*-task-*.md` (these originate from other workflow phases) + +Every subsequent reference to "task files" in this recipe β€” Task Generation Decision Flow, Task Execution Cycle iteration, and Final Cleanup β€” uses this set, not the unrestricted `docs/plans/tasks/*.md` glob. ### Task Generation Decision Flow -Analyze task file existence state and determine the action required: +Analyze the Consumed Task Set and determine the action required: | State | Criteria | Next Action | |-------|----------|-------------| -| Tasks exist | .md files in tasks/ directory | User's execution instruction serves as batch approval β†’ Enter autonomous execution immediately | -| No tasks + plan exists | Plan exists but no task files | Confirm with user β†’ run task-decomposer | -| Neither exists + Design Doc exists | No plan or task files, but docs/design/*.md exists | Invoke work-planner to create work plan from Design Doc(s), then proceed to task decomposition | -| Neither exists | No plan, no task files, no Design Doc | Report missing prerequisites to user and stop | +| Tasks exist | Consumed Task Set is non-empty | User's execution instruction serves as batch approval β†’ Enter autonomous execution immediately | +| No tasks + plan exists | Consumed Task Set is empty but the resolved work plan exists | Confirm with user β†’ run task-decomposer | +| Neither exists + Design Doc exists | No plan, no Consumed Task Set, but `docs/design/*.md` exists | Invoke work-planner to create work plan from Design Doc(s), then proceed to task decomposition | +| Neither exists | No plan, no Consumed Task Set, no Design Doc | Report missing prerequisites to user and stop | ## Task Decomposition Phase (Conditional) -When task files don't exist: +When the Consumed Task Set is empty: ### 1. User Confirmation ``` -No task files found. +No task files in the Consumed Task Set. Work plan: docs/plans/[plan-name].md Generate tasks from the work plan? (y/n): @@ -67,22 +85,19 @@ Invoke task-decomposer using Agent tool: - `prompt`: "Read work plan at docs/plans/[plan-name].md and decompose into atomic tasks. Output: Individual task files in docs/plans/tasks/. 
Granularity: 1 task = 1 commit = independently executable. Use layer-aware naming: {plan}-backend-task-{n}.md, {plan}-frontend-task-{n}.md based on Target files paths." ### 3. Verify Generation -```bash -# Verify generated task files -! ls -la docs/plans/tasks/*.md | head -10 -``` +Recompute the Consumed Task Set using the same restricted pattern from the Consumed Task Set section above. Confirm it is now non-empty. If it is still empty, escalate to the user β€” task-decomposer either failed silently or produced files that don't match the expected pattern. ## Pre-execution Checklist -- [ ] Confirmed task files exist in docs/plans/tasks/ -- [ ] Identified task execution order (dependencies) +- [ ] Confirmed Consumed Task Set is non-empty (computed in the Consumed Task Set section above) +- [ ] Identified task execution order within the Consumed Task Set (dependencies) - [ ] **Environment check**: Can I execute per-task commit cycle? - If commit capability unavailable β†’ Escalate before autonomous mode - Other environments (tests, quality tools) β†’ Subagents will escalate ## Task Execution Cycle (Filename-Pattern-Based) -**MANDATORY**: Route agents by task filename pattern from monorepo-flow.md reference. +**MANDATORY**: For each task in the Consumed Task Set, route agents by task filename pattern from monorepo-flow.md reference. ### Agent Routing Table @@ -144,7 +159,18 @@ After all task cycles finish, run verification agents **in parallel** before the - Re-run only the failed verifiers (by the criteria in step 2) - Repeat until all pass or `blocked` β†’ Escalate to user -4. **All passed** β†’ Proceed to completion report +4. **All passed** β†’ Proceed to Final Cleanup + +## Final Cleanup + +Before the completion report, delete the implementation task files this recipe consumed. Their work is committed; `docs/plans/` is ephemeral working state and is not retained between recipe runs: + +- Delete every file in the Consumed Task Set +- Delete every file matching `docs/plans/tasks/{plan-name}-phase*-completion.md` (the per-phase completion files generated by task-decomposer for this `{plan-name}`) +- Delete the corresponding `docs/plans/tasks/_overview-{plan-name}.md` if present +- Preserve the work plan itself (`docs/plans/{plan-name}.md`) β€” the user decides whether to delete it after final review + +If task files cannot be deleted (filesystem error), report the failure but do not block the completion report. ## Output Example Fullstack implementation phase completed. @@ -152,4 +178,5 @@ Fullstack implementation phase completed. - Implemented tasks: [number] tasks (backend: X, frontend: Y) - Quality checks: All passed - Commits: [number] commits created +- Cleanup: Task files removed from docs/plans/tasks/ diff --git a/dev-workflows/skills/recipe-fullstack-implement/SKILL.md b/dev-workflows/skills/recipe-fullstack-implement/SKILL.md index 37db7a4..44012f1 100644 --- a/dev-workflows/skills/recipe-fullstack-implement/SKILL.md +++ b/dev-workflows/skills/recipe-fullstack-implement/SKILL.md @@ -101,6 +101,17 @@ When user responds to questions: - Run quality-fixer (layer-appropriate) before every commit - Obtain user approval before Edit/Write/MultiEdit outside autonomous mode +### Implementation Readiness Check (between work-planner approval and task-decomposer) + +After work-planner completes and the user grants batch approval, before invoking task-decomposer, read the work plan header and find the line `Implementation Readiness: `. 
Apply this rule: + +| Status | Action | +|--------|--------| +| `ready` | Proceed to task-decomposer | +| `escalated` | Read the work plan's Readiness Report section, surface remaining gaps to the user via AskUserQuestion: "Implementation Readiness is `escalated` with the following remaining gaps: [list]. Continue execution? (y/n)". On `y` proceed; on `n` stop | +| `pending` | Present via AskUserQuestion: "Implementation Readiness is `pending`. Run `/recipe-prepare-implementation [plan-path]` first to verify the work plan is implementable, then resume. Continue without preflight? (y/n)". On `y` proceed; on `n` stop | +| absent (line missing) | Treat as `pending` β€” older work plans created before the readiness marker existed should be preflighted explicitly | + ## Scope Boundary for Subagents Append the following block to every subagent prompt invoked from this recipe: @@ -150,14 +161,26 @@ After all task cycles finish, run verification agents **in parallel** before the - Re-run only the failed verifiers (by the criteria in step 2) - Repeat until all pass or `blocked` β†’ Escalate to user -4. **All passed** β†’ Proceed to completion report +4. **All passed** β†’ Proceed to Final Cleanup + +### Final Cleanup + +Before the completion report, delete the implementation task files this recipe consumed. Their work is committed; `docs/plans/` is ephemeral working state and is not retained between recipe runs: + +- Delete every file matching `docs/plans/tasks/{plan-name}-backend-task-*.md` and `docs/plans/tasks/{plan-name}-frontend-task-*.md` (the `{plan-name}` derived from the work plan path used in this run) +- Delete every file matching `docs/plans/tasks/{plan-name}-phase*-completion.md` (the per-phase completion files generated by task-decomposer) +- Delete the corresponding `docs/plans/tasks/_overview-{plan-name}.md` if present +- Preserve the work plan itself (`docs/plans/{plan-name}.md`) β€” the user decides whether to delete it after final review + +If task files cannot be deleted (filesystem error), report the failure but do not block the completion report. 
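A minimal sketch of this cleanup pass, assuming Node's `fs` API and a hypothetical `planName` derived from the work plan path used in this run; the recipe performs the deletions through its normal tooling, so this only illustrates which filename patterns are removed and that the work plan itself is left untouched:

```typescript
import { readdirSync, rmSync } from "node:fs";
import { join } from "node:path";

// Hypothetical values for illustration; the recipe derives planName from the work plan path.
const planName = "example-feature";
const tasksDir = "docs/plans/tasks";

// Patterns this recipe consumed; docs/plans/{plan-name}.md itself is deliberately absent.
const consumed = [
  new RegExp(`^${planName}-backend-task-.+\\.md$`),
  new RegExp(`^${planName}-frontend-task-.+\\.md$`),
  new RegExp(`^${planName}-phase.*-completion\\.md$`),
  new RegExp(`^_overview-${planName}\\.md$`),
];

for (const file of readdirSync(tasksDir)) {
  if (!consumed.some((pattern) => pattern.test(file))) continue;
  try {
    rmSync(join(tasksDir, file));
  } catch (err) {
    // A filesystem error is reported but never blocks the completion report.
    console.warn(`Cleanup: could not delete ${file}: ${(err as Error).message}`);
  }
}
```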
### Test Information Communication After acceptance-test-generator execution, when invoking work-planner (subagent_type: "dev-workflows:work-planner"), communicate: - Generated integration test file path (from `generatedFiles.integration`) -- Generated E2E test file path or null (from `generatedFiles.e2e`) -- E2E absence reason (from `e2eAbsenceReason`, when E2E is null) -- Explicit note that integration tests are created simultaneously with implementation, E2E tests are executed after all implementations (when E2E path is provided) +- Generated fixture-e2e test file path or null (from `generatedFiles.fixtureE2e`) +- Generated service-integration-e2e test file path or null (from `generatedFiles.serviceE2e`) +- Per-lane E2E absence reason (from `e2eAbsenceReason.fixtureE2e` and `e2eAbsenceReason.serviceE2e`, when each lane is null) +- Explicit note: integration tests are created simultaneously with implementation, fixture-e2e tests are created alongside the UI feature phase, service-integration-e2e tests are executed only in the final phase ## Execution Method diff --git a/dev-workflows/skills/recipe-implement/SKILL.md b/dev-workflows/skills/recipe-implement/SKILL.md index 2ea9d3b..c6f0ad0 100644 --- a/dev-workflows/skills/recipe-implement/SKILL.md +++ b/dev-workflows/skills/recipe-implement/SKILL.md @@ -81,6 +81,19 @@ When user responds to questions: - Run quality-fixer before every commit - Obtain user approval before Edit/Write/MultiEdit outside autonomous mode +### Implementation Readiness Check (between work-planner approval and task-decomposer) + +After work-planner completes and the user grants batch approval, before invoking task-decomposer, read the work plan header and find the line `Implementation Readiness: `. Apply this rule: + +| Status | Action | +|--------|--------| +| `ready` | Proceed to task-decomposer | +| `escalated` | Read the work plan's Readiness Report section, surface remaining gaps to the user via AskUserQuestion: "Implementation Readiness is `escalated` with the following remaining gaps: [list]. Continue execution? (y/n)". On `y` proceed; on `n` stop | +| `pending` | Present via AskUserQuestion: "Implementation Readiness is `pending`. Run `/recipe-prepare-implementation [plan-path]` first to verify the work plan is implementable, then resume. Continue without preflight? (y/n)". On `y` proceed; on `n` stop | +| absent (line missing) | Treat as `pending` β€” older work plans created before the readiness marker existed should be preflighted explicitly | + +This check applies to all scales (Small / Medium / Large) because recipe-implement is the scale-agnostic orchestrator. + ## Scope Boundary for Subagents Append the following block to every subagent prompt invoked from this recipe: @@ -129,14 +142,26 @@ After all task cycles finish, run verification agents **in parallel** before the - Re-run only the failed verifiers (by the criteria in step 2) - Repeat until all pass or `blocked` β†’ Escalate to user -4. **All passed** β†’ Proceed to completion report +4. **All passed** β†’ Proceed to Final Cleanup + +### Final Cleanup + +Before the completion report, delete the implementation task files this recipe consumed. 
Their work is committed; `docs/plans/` is ephemeral working state and is not retained between recipe runs: + +- Delete every file matching `docs/plans/tasks/{plan-name}-task-*.md` (the `{plan-name}` derived from the work plan path used in this run) +- Delete every file matching `docs/plans/tasks/{plan-name}-phase*-completion.md` (the per-phase completion files generated by task-decomposer) +- Delete the corresponding `docs/plans/tasks/_overview-{plan-name}.md` if present +- Preserve the work plan itself (`docs/plans/{plan-name}.md`) β€” the user decides whether to delete it after final review + +If task files cannot be deleted (filesystem error), report the failure but do not block the completion report. ### Test Information Communication After acceptance-test-generator execution, when invoking work-planner (subagent_type: "dev-workflows:work-planner"), communicate: - Generated integration test file path (from `generatedFiles.integration`) -- Generated E2E test file path or null (from `generatedFiles.e2e`) -- E2E absence reason (from `e2eAbsenceReason`, when E2E is null) -- Explicit note that integration tests are created simultaneously with implementation, E2E tests are executed after all implementations (when E2E path is provided) +- Generated fixture-e2e test file path or null (from `generatedFiles.fixtureE2e`) +- Generated service-integration-e2e test file path or null (from `generatedFiles.serviceE2e`) +- Per-lane E2E absence reason (from `e2eAbsenceReason.fixtureE2e` and `e2eAbsenceReason.serviceE2e`, when each lane is null) +- Explicit note: integration tests are created simultaneously with implementation, fixture-e2e tests are created alongside the UI feature phase, service-integration-e2e tests are executed only in the final phase ## Execution Method diff --git a/dev-workflows/skills/recipe-plan/SKILL.md b/dev-workflows/skills/recipe-plan/SKILL.md index bb0fecd..d6844f5 100644 --- a/dev-workflows/skills/recipe-plan/SKILL.md +++ b/dev-workflows/skills/recipe-plan/SKILL.md @@ -38,20 +38,21 @@ Follow the planning process below: - Check for existence of design documents, notify user if none exist - Present options if multiple exist (can be specified with $ARGUMENTS) -### Step 2: E2E Test Skeleton Generation Confirmation - - Confirm with user whether to generate E2E test skeleton first - - If user wants generation: Generate test skeleton with acceptance-test-generator +### Step 2: Test Skeleton Generation Confirmation + - Confirm with user whether to generate test skeletons (integration + E2E lanes) first + - If user wants generation: invoke acceptance-test-generator - Pass generation results to next process according to subagents-orchestration-guide skill coordination specification ### Step 3: Work Plan Creation Invoke work-planner using Agent tool: - `subagent_type`: "dev-workflows:work-planner" - `description`: "Work plan creation" -- If test skeletons were generated in Step 2: - - When `generatedFiles.e2e` is not null: - `prompt`: "Create work plan from Design Doc at [path]. Integration test file: [integration test path]. E2E test file: [E2E test path]. Integration tests are created simultaneously with each phase implementation, E2E tests are executed only in final phase." - - When `generatedFiles.e2e` is null: - `prompt`: "Create work plan from Design Doc at [path]. Integration test file: [integration test path]. No E2E test skeletons were generated (reason: [e2eAbsenceReason]). Integration tests are created simultaneously with each phase implementation." 
+- If test skeletons were generated in Step 2, build the prompt by listing every lane's status: + - Always include: "Integration test file: [path or 'not generated']" + - For each E2E lane (`fixtureE2e`, `serviceE2e`): + - When `generatedFiles.` is not null: "[lane] test file: [path]" + - When `generatedFiles.` is null: "No [lane] skeleton generated (reason: [e2eAbsenceReason.])" + - Append placement guidance: "Integration tests are created simultaneously with each phase implementation. fixture-e2e tests are created alongside the UI feature phase. service-integration-e2e tests are executed only in the final phase." - If test skeletons were not generated: `prompt`: "Create work plan from Design Doc at [path]." diff --git a/dev-workflows/skills/recipe-prepare-implementation/SKILL.md b/dev-workflows/skills/recipe-prepare-implementation/SKILL.md new file mode 100644 index 0000000..43b1b8f --- /dev/null +++ b/dev-workflows/skills/recipe-prepare-implementation/SKILL.md @@ -0,0 +1,192 @@ +--- +name: recipe-prepare-implementation +description: Verifies the work plan is implementable end-to-end and resolves verification-lane / fixture / E2E-environment gaps before the build phase begins. Use when "implement-ready/verification readiness/lane setup/E2E environment missing" is mentioned, or before any build phase begins on a work plan whose readiness has not been preflight-checked. +disable-model-invocation: true +--- + +**Context**: Optional readiness phase between work-plan approval and recipe-*-build. Confirms the implementation will be observable from Phase 1 onward and resolves any gaps via Phase 0 tasks. Exits no-op when the readiness criteria already pass, so the recipe is safe to invoke unconditionally. + +## Orchestrator Definition + +**Core Identity**: "I am an orchestrator." (see subagents-orchestration-guide skill) + +**Execution Protocol**: +1. **Delegate all work through Agent tool** β€” invoke sub-agents, pass deliverable paths between them, and report results (permitted tools: see subagents-orchestration-guide "Orchestrator's Permitted Tools") +2. **Self-contained scope**: When gaps are found, this recipe BOTH generates resolution tasks AND executes them through the standard 4-step cycle. Recipe completes only when readiness criteria pass or remaining gaps are escalated. +3. **No-op exit**: When the readiness scan finds no failing criteria, generate no resolution tasks and exit immediately. The only file modifications in this branch are to the work plan itself β€” promoting the `Implementation Readiness:` header to `ready` and persisting the Readiness Report section. No code or test files are touched. + +Work plan: $ARGUMENTS + +## When This Recipe Applies + +Run before any recipe-*-build invocation when ANY of the following hold: +- Work plan was created from a Design Doc whose Verification Strategy references commands, files, functions, or endpoints not yet present in the codebase +- Work plan includes E2E test skeletons (seed data, auth fixture, environment variables, or external mocks may be unaddressed) +- Work plan touches UI components without a fixture entry or development route to render their visual states +- The team has not previously confirmed the local lane runs end-to-end for this feature area + +When none of the above hold, the readiness scan in Step 2 will find zero failing criteria and the recipe exits no-op (see Context at the top of this skill). 
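For the no-op branch described above, a minimal sketch of its only file modification — promoting the readiness header — assuming Node's `fs` API and a hypothetical plan path; persisting the Readiness Report section (Step 3) is omitted here, and the orchestrator performs the edit through its normal tooling rather than a script:

```typescript
import { readFileSync, writeFileSync } from "node:fs";

const planPath = "docs/plans/example-feature.md"; // hypothetical path resolved from $ARGUMENTS
const plan = readFileSync(planPath, "utf8");

const updated = /^Implementation Readiness:/m.test(plan)
  ? // Promote the existing marker (e.g., pending -> ready) in place.
    plan.replace(/^Implementation Readiness:.*$/m, "Implementation Readiness: ready")
  : // Older plans without the marker: insert it after the Related Issue/PR line.
    plan.replace(/^(Related Issue\/PR:.*)$/m, "$1\nImplementation Readiness: ready");

writeFileSync(planPath, updated);
```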
+ +## Readiness Criteria + +Each criterion is a measurable check producing `pass`, `fail`, or `not_applicable` with cited evidence. + +| ID | Criterion | Pass evidence | +|----|-----------|---------------| +| R1 | Verification Strategy references resolve | Every command, file path, function, endpoint, and test referenced in the work plan's Verification Strategy section either exists in the codebase (verified via Glob/Grep) or is the deliverable of a task already in this plan | +| R2 | E2E preconditions addressed | When E2E skeletons exist: every precondition mentioned in skeleton comments (seed data, auth fixture, env var, external mock) is present in the codebase or covered by a Phase 0 task in this plan | +| R3 | Phase 1 observability | The first implementation phase contains at least one task whose Operation Verification Methods can execute at task completion using only artifacts that exist before the task starts (existing code, prior Phase 0 task deliverables, or the task's own outputs) | +| R4 | UI rendering surface | When the plan implements UI components: a fixture entry, dev route, Storybook story, or equivalent rendering surface exists for the impacted components, OR a Phase 0 task adds one | +| R5 | Local lane procedure | The work plan or a referenced doc records the commands needed to start the system locally for manual verification (start commands, default ports, seed steps) | + +R4 and R5 are evaluated only when their triggering signals appear in the work plan; otherwise mark `not_applicable`. + +## Pre-execution Prerequisites + +```bash +# Verify the work plan exists +! ls -la docs/plans/*.md | grep -v template | tail -5 +``` + +**State check**: +- Work plan exists β†’ Proceed to Step 1 +- No work plan β†’ Stop and report: "An approved work plan is required. Complete the upstream planning phase first, then re-invoke this recipe." + +## Execution Flow + +### Step 1: Load Inputs + +Read the work plan path passed in `$ARGUMENTS`. Extract: +- Verification Strategy section (Correctness Proof Method + Early Verification Point) +- Quality Assurance Mechanisms table +- Design-to-Plan Traceability table +- Test skeleton references listed in the plan header +- Phase structure with each phase's tasks +- Referenced Design Doc(s) and UI Spec (when present) + +### Step 2: Readiness Scan + +For each criterion R1–R5: +1. Execute the scan defined in Readiness Criteria using Read / Glob / Grep +2. Record the result: `pass` / `fail` / `not_applicable` +3. Cite evidence: file:line for `pass`, the unresolved reference for `fail`, the missing trigger signal for `not_applicable` + +Build the Readiness Report (see Output Format) regardless of outcome. + +### Step 3: No-op Check + +When every applicable criterion is `pass` (zero `fail`): +- Append (or replace, if already present) a `## Implementation Readiness Report` section in the work plan immediately after the header block, using the same Readiness Report markdown defined in Output Format below +- Update the work plan header `Implementation Readiness:` line to `ready` (insert it after `Related Issue/PR:` if absent) +- Present the Readiness Report to the user +- Exit with `outcome: ready, gaps_resolved: 0` +- The work plan modifications above are the only file modifications in this branch + +When one or more criteria are `fail` β†’ proceed to Step 4. + +### Step 4: Plan Resolution Tasks + +For each `fail` criterion: +1. 
Determine the smallest concrete task that closes the gap (examples: "Add fixture entry for ComponentX covering loading/empty/error states", "Add seed script for E2E user fixtures", "Document local startup commands in docs/run/local.md") +2. Decide the task's **layer** by matching every target file path against the markers below: + - **backend** when every target file path matches one of: `**/api/**`, `**/server/**`, `**/services/**`, `**/backend/**`, `**/handlers/**`, `**/repositories/**` + - **frontend** when every target file path matches one of: `**/components/**`, `**/pages/**`, `**/web/**`, `**/frontend/**`, `**/*.tsx`, `**/*.jsx` + - **mixed** (target files span both backend and frontend markers) β†’ escalate to user; ask the user to split the gap into per-layer tasks + - **unrecognized** (any target file matches neither backend nor frontend markers β€” e.g., `docs/**`, `scripts/**`, root-level configs, fixture data files outside the markers above) β†’ escalate to user; ask the user to either (a) decide which layer's executor / quality-fixer should run the task, or (b) update the markers if the project uses different paths + + Apply the rules in the order above. The first matching rule wins; "unrecognized" is the final fallback rather than a catch-all that defaults to backend. +3. Create a Phase 0 task file at `docs/plans/tasks/{plan-name}-backend-task-prep-{NN}.md` (backend) or `docs/plans/tasks/{plan-name}-frontend-task-prep-{NN}.md` (frontend) using the task template from documentation-criteria skill. The `-task-prep-` segment lets recipe-prepare-implementation distinguish prep tasks from implementation tasks while keeping the existing `{plan-name}-{layer}-task-*` matcher used by other recipes +4. Update the work plan to insert these tasks as Phase 0 (before Phase 1) + +Present the proposed resolution task list to the user with AskUserQuestion. Proceed only after explicit approval β€” this is the single human gate inside this recipe. + +### Step 5: Execute Resolution Tasks + +For each resolution task, run the standard 4-step cycle (see subagents-orchestration-guide "Task Management: 4-Step Cycle"): + +1. **Agent tool** β€” route by filename layer segment: + - `*-backend-task-prep-*` β†’ `subagent_type: "dev-workflows:task-executor"` + - `*-frontend-task-prep-*` β†’ `subagent_type: "dev-workflows-frontend:task-executor-frontend"` + - Filename without a recognized layer segment β†’ escalate (the file should not exist; Step 4 prevents this) +2. Check escalation per orchestration-guide +3. **quality-fixer** β€” route by the same filename layer segment: + - `*-backend-task-prep-*` β†’ `"dev-workflows:quality-fixer"` + - `*-frontend-task-prep-*` β†’ `"dev-workflows-frontend:quality-fixer-frontend"` +4. **Commit** when quality-fixer returns `approved` + +Append the Scope Boundary block (below) to every subagent prompt. + +### Step 6: Re-scan, Persist Readiness Report, Update Header, Cleanup, Exit + +1. **Re-scan**: Re-run the Step 2 readiness scan after all resolution tasks are committed. + +2. **Persist Readiness Report into work plan body**: Append (or replace, if already present) a `## Implementation Readiness Report` section in the work plan immediately after the header block. Use the same Readiness Report markdown defined in Output Format below. Downstream recipe-*-build / recipe-*-implement read this section when the header is `escalated` to surface remaining gaps to the user. + +3. 
**Update work plan header**: Locate the line `Implementation Readiness: pending` in the work plan and rewrite it based on the re-scan outcome: + + | Re-scan result | New header value | + |----------------|------------------| + | All applicable criteria `pass` | `Implementation Readiness: ready` | + | One or more `fail` remain | `Implementation Readiness: escalated` | + + If the line is absent (older work plan format), insert it after the `Related Issue/PR:` line. + +4. **Final Cleanup**: Delete every prep task file this recipe created for the current `{plan-name}` (`docs/plans/tasks/{plan-name}-backend-task-prep-*.md` and `docs/plans/tasks/{plan-name}-frontend-task-prep-*.md`) AND the phase-completion file generated for prep phases (`docs/plans/tasks/{plan-name}-phase0-completion.md` when present, since prep tasks live in Phase 0). Prep task files for other plans are out of scope β€” this recipe deletes only what it created for the current run. Their work is committed; `docs/plans/` is ephemeral working state and is not retained between recipe runs. The work plan itself is preserved for the downstream recipe-*-build / recipe-*-implement. + +5. **Exit**: + + | Re-scan result | Action | + |----------------|--------| + | All applicable criteria `pass` | Exit with `outcome: ready, gaps_resolved: N` and final Readiness Report | + | One or more `fail` remain | Exit with `outcome: escalated` β€” present remaining failures to the user with the next-action recommendation. Treat the re-scan as the terminal evaluation; further resolution requires the user to re-invoke this recipe with updated inputs. | + +## Scope Boundary for Subagents + +Append the following block to every subagent prompt invoked from this recipe: + +``` +Scope boundary for subagents: +Operate within the task scope and referenced files in the prompt. +Use loaded skills to execute that scope. +Escalate when the required fix or investigation falls outside that scope. +``` + +## Output Format + +Final report presented to the user at exit: + +``` +## Implementation Readiness Report + +Work plan: [path] +Outcome: ready | escalated +Gaps resolved: [N] + +### Readiness Criteria + +| ID | Result | Evidence | +|----|--------|----------| +| R1 | pass / fail / not_applicable | [file:line OR "missing: "] | +| R2 | ... | ... | +| R3 | ... | ... | +| R4 | ... | ... | +| R5 | ... | ... | + +### Resolution Tasks Executed (when gaps_resolved > 0) +- [task file path] β€” [one-line summary] β€” committed +- ... 
+ +### Remaining Gaps (when outcome is escalated) +- [criterion ID]: [unresolved reference] β€” Next action: [recommendation] +``` + +## Completion Criteria + +- [ ] Work plan loaded and Verification Strategy / E2E references / Phase structure extracted +- [ ] Readiness scan run with per-criterion result and evidence recorded +- [ ] No-op exit when all `pass`, OR resolution tasks generated, approved, and executed via the 4-step cycle +- [ ] Re-scan run after the last resolution task commits +- [ ] `## Implementation Readiness Report` section persisted into the work plan body +- [ ] Work plan header `Implementation Readiness:` line updated to `ready` or `escalated` +- [ ] Prep task files (and Phase 0 phase-completion file when generated) deleted from `docs/plans/tasks/` +- [ ] Final report presented to the user diff --git a/dev-workflows/skills/recipe-review/SKILL.md b/dev-workflows/skills/recipe-review/SKILL.md index 4280e72..1ef961c 100644 --- a/dev-workflows/skills/recipe-review/SKILL.md +++ b/dev-workflows/skills/recipe-review/SKILL.md @@ -16,11 +16,10 @@ disable-model-invocation: true - Compliance validation β†’ performed by code-reviewer - Security validation β†’ performed by security-reviewer -- Fix implementation β†’ performed by task-executor -- Quality checks β†’ performed by quality-fixer -- Re-validation β†’ performed by code-reviewer / security-reviewer +- **Code-side fix path**: Fix implementation β†’ task-executor; Quality checks β†’ quality-fixer; Re-validation β†’ code-reviewer / security-reviewer +- **Design-side update path**: DD revision β†’ technical-designer (update mode); DD review β†’ document-reviewer; cross-DD consistency β†’ design-sync (when multiple DDs exist); Re-validation β†’ code-reviewer -Orchestrator invokes sub-agents and passes structured JSON between them. +Orchestrator invokes sub-agents and passes structured JSON between them. The design-side path applies when the discrepancy reflects code that was correct but the Design Doc became stale, rather than code that violated the Design Doc. 
Design Doc (uses most recent if omitted): $ARGUMENTS @@ -65,36 +64,73 @@ Invoke security-reviewer using Agent tool: **Report both results independently using subagent output fields only**: +Before presenting to the user, the orchestrator computes a recommended route per finding using the rule below (this rule is internal β€” do not include it in the user-facing prompt): + +| Finding pattern | Recommended route | +|-----------------|-------------------| +| `dd_violation` where the code intent matches the original requirement but the Design Doc captured a different design | `d` (Design-side update) | +| `dd_violation` where the code drifted from a still-correct Design Doc | `c` (Code-side fix) | +| `reliability` / `security` / `maintainability` findings | `c` (Code-side fix) | + +Then present to the user (label each finding with its recommended route, grouped by route): + ``` Code Compliance: [complianceRate from code-reviewer] Verdict: [verdict from code-reviewer] Identifier Match Rate: [identifierMatchRate from code-reviewer] Acceptance Criteria: - [fulfilled] [item] (confidence: [high/medium/low]) - - [partially_fulfilled] [item]: [gap] β€” [suggestion] - - [unfulfilled] [item]: [gap] β€” [suggestion] + - [partially_fulfilled] [item]: [gap] β€” [suggestion] [recommended: c | d] + - [unfulfilled] [item]: [gap] β€” [suggestion] [recommended: c | d] Identifier Mismatches: - - [identifier]: DD=[designDocValue] Code=[codeValue] at [location] + - [identifier]: DD=[designDocValue] Code=[codeValue] at [location] [recommended: c | d] Quality Findings: - - [category] [location]: [description] β€” [rationale] + - [category] [location]: [description] β€” [rationale] [recommended: c] Security Review: [status from security-reviewer] Findings by category: - - [confirmed_risk] [location]: [description] β€” [rationale] - - [defense_gap] [location]: [description] β€” [rationale] - - [hardening] [location]: [description] β€” [rationale] - - [policy] [location]: [description] β€” [rationale] + - [confirmed_risk] [location]: [description] β€” [rationale] [recommended: c] + - [defense_gap] [location]: [description] β€” [rationale] [recommended: c] + - [hardening] [location]: [description] β€” [rationale] [recommended: c] + - [policy] [location]: [description] β€” [rationale] [recommended: c] Notes: [notes from security-reviewer, if present] -Execute fixes? (y/n): +Resolve discrepancies β€” confirm or override the recommended route per finding: + c) Code-side fix β€” code violates Design Doc; modify code to match + d) Design-side update β€” code is correct; Design Doc is stale, revise it + s) Skip β€” accept current state without changes ``` -If both pass and user selects `n`: Skip Steps 5-10, proceed to Step 11. +Use AskUserQuestion. The default offer is **"accept all recommended routes"** β€” a single confirmation for the typical case where the orchestrator's recommendations are correct. When the user wants to override, collect per-finding c/d/s decisions instead. If the user selects `s` for everything: skip Steps 5-10, proceed to Step 11. ### Step 5: Execute Skill Execute Skill: documentation-criteria (for task file template) +### Step 5d: Design-Side Update + +Run this step only when the user routed at least one finding to `d`. When all routes are `c` or `s`, skip directly to Step 6. + +1. 
Invoke technical-designer in update mode using Agent tool: + - `subagent_type`: "dev-workflows:technical-designer" + - `description`: "Design Doc update from review findings" + - `prompt`: "Update Design Doc at [path] in update mode. The implementation has diverged in the following ways that the team has decided to ratify in the design rather than in the code: [list of `d`-routed findings with codeLocation and designDocValue from $STEP_2_OUTPUT]. Reflect the current code behavior in the relevant sections and add a history entry." + +2. Invoke document-reviewer to verify the updated Design Doc: + - `subagent_type`: "dev-workflows:document-reviewer" + - `description`: "Document review of updated Design Doc" + - `prompt`: "Review updated Design Doc at [path] for consistency and completeness." + +3. When multiple Design Docs exist (`ls docs/design/*.md | grep -v template | wc -l > 1`), invoke design-sync: + - `subagent_type`: "dev-workflows:design-sync" + - `description`: "Cross-DD consistency check" + - `prompt`: "source_design: [updated DD path]. Detect conflicts across all Design Docs after the update." + - When `sync_status: conflicts_found`: present conflicts to the user; resolution requires re-invoking technical-designer for affected DDs. + +4. After Step 5d completes: + - If the user selected `d` for all findings (no `c` routes) β†’ skip Steps 6-8, proceed to Step 9 for re-validation + - If the user selected both `d` and `c` β†’ re-evaluate the `c`-routed findings against the updated DD and drop any that are now satisfied by the DD revision; then proceed to Step 6 with the remaining `c` findings + ### Step 6: Create Task File Create task file at `docs/plans/tasks/review-fixes-YYYYMMDD.md` @@ -119,7 +155,7 @@ Invoke quality-fixer using Agent tool: Invoke code-reviewer using Agent tool: - `subagent_type`: "dev-workflows:code-reviewer" - `description`: "Re-validate compliance" -- `prompt`: "Re-validate Design Doc compliance after fixes. Prior compliance issues: $STEP_2_OUTPUT. Verify each prior issue is resolved." +- `prompt`: "Re-validate Design Doc compliance after fixes. Prior compliance issues: $STEP_2_OUTPUT. Verify each prior issue is resolved (whether resolved code-side or design-side)." ### Step 10: Re-validate security-reviewer @@ -128,7 +164,15 @@ Invoke security-reviewer using Agent tool (only if security fixes were applied): - `description`: "Re-validate security" - `prompt`: "Re-validate security after fixes. Prior findings: $STEP_3_OUTPUT. Design Doc: [path]. Implementation files: [file list]." -### Step 11: Final Report +### Step 11: Final Cleanup and Report + +Delete the review-fix task file this recipe created (if any). Its work is committed; `docs/plans/` is ephemeral working state and is not retained between recipe runs: + +- Delete `docs/plans/tasks/review-fixes-YYYYMMDD.md` if it exists + +If the file cannot be deleted (filesystem error), report the failure but do not block the final report. 
+ +Then present the final report: ``` Code Compliance: @@ -142,9 +186,11 @@ Security Review: Remaining issues: - [items requiring manual intervention] + +Cleanup: review-fixes task file removed ``` -## Auto-fixable Items +## Auto-fixable Items (code-side path) - Simple unimplemented acceptance criteria - Error handling additions - Contract definition fixes @@ -154,10 +200,16 @@ Remaining issues: ## Non-fixable Items - Fundamental business logic changes - Architecture-level modifications -- Design Doc deficiencies - Committed secrets (blocked β†’ human intervention) -**Scope**: Design Doc compliance validation, security review, and auto-fixes. +## Design-Side Update Triggers +Discrepancies suitable for the design-side path (code is correct, DD became stale): +- Identifier renames where the new identifier reflects the team's current naming +- Behavioral changes that match the original requirement intent better than what the DD captured +- Component splits or merges where the new structure is sound and the DD documented the prior structure +- New ACs that the implementation already satisfies but the DD never enumerated + +**Scope**: Design Doc compliance validation, security review, code-side auto-fixes, and design-side update routing. ## Scope Boundary for Subagents diff --git a/dev-workflows/skills/subagents-orchestration-guide/SKILL.md b/dev-workflows/skills/subagents-orchestration-guide/SKILL.md index dd4f205..ca2e2c1 100644 --- a/dev-workflows/skills/subagents-orchestration-guide/SKILL.md +++ b/dev-workflows/skills/subagents-orchestration-guide/SKILL.md @@ -111,7 +111,7 @@ Autonomous execution MUST stop and wait for user input at these points. | Design | After design-sync completes consistency verification | Approve Design Doc | | Work Plan | After work-planner creates plan | Batch approval for implementation phase | -**After batch approval**: Autonomous execution proceeds without stops until completion or escalation +**After batch approval**: Autonomous execution proceeds without stops until completion or escalation. ## Scale Determination and Document Requirements | Scale | File Count | PRD | ADR | Design Doc | Work Plan | @@ -184,7 +184,7 @@ Subagents respond in JSON format. 
Key fields for orchestrator decisions: - **design-sync**: sync_status (synced/conflicts_found) - **integration-test-reviewer**: status (approved/needs_revision/blocked), requiredFixes - **security-reviewer**: status (approved/approved_with_notes/needs_revision/blocked), findings, notes, requiredFixes -- **acceptance-test-generator**: status, generatedFiles (integration: path|null, e2e: path|null), budgetUsage, e2eAbsenceReason (null when E2E emitted, otherwise: no_multi_step_journey|below_threshold_user_confirmed) +- **acceptance-test-generator**: status, generatedFiles.{integration,fixtureE2e,serviceE2e} (path|null per lane), budgetUsage per lane, e2eAbsenceReason per E2E lane (null when emitted; reason enum is owned by acceptance-test-generator and integration-e2e-testing skill) ## Handling Requirement Changes @@ -229,7 +229,9 @@ Always start with requirement-analyzer, then select the minimum planning flow re | Medium | requirement-analyzer β†’ codebase-analyzer β†’ optional UI Spec β†’ optional ADR β†’ Design Doc β†’ code-verifier β†’ document-reviewer β†’ design-sync β†’ acceptance-test-generator β†’ work-planner β†’ task-decomposer | | Small | requirement-analyzer β†’ work-planner | -After the planning flow completes and the user grants batch approval, execute the task execution cycle: `task-executor β†’ quality-fixer β†’ commit` for each task. See "Autonomous Execution Mode" below for full per-task details. At Small scale this cycle still applies β€” implementation runs through `task-executor`, not orchestrator-direct edits. +After the planning flow completes and the user grants batch approval, the work plan carries an `Implementation Readiness:` header (work-planner emits `pending`; promotion to `ready` or `escalated` is an external orchestration concern). External orchestration also decides when and how to act on this marker; this guide does not invoke any orchestrator above the agent layer. + +Then execute the task execution cycle: `task-executor β†’ quality-fixer β†’ commit` for each task. See "Autonomous Execution Mode" below for full per-task details. At Small scale this cycle still applies β€” implementation runs through `task-executor`, not orchestrator-direct edits. Each agent name in the chain is invoked via the Agent tool (per "Orchestrator's Permitted Tools" above). @@ -397,21 +399,13 @@ Register overall phases using TaskCreate. Update each phase with TaskUpdate as i #### HC-06: acceptance-test-generator β†’ work-planner - **Pass to acceptance-test-generator**: - - Design Doc: [path] - - UI Spec: [path] (if exists) + **Pass to acceptance-test-generator**: Design Doc path; UI Spec path (if exists). - **Orchestrator verification items**: - - Verify `generatedFiles.integration` is a valid path (when not null) and the file exists - - Verify `generatedFiles.e2e` is a valid path (when not null) and the file exists - - When `generatedFiles.e2e` is null, verify `e2eAbsenceReason` is present β€” this is intentional absence, not an error + **Orchestrator verification**: Every non-null `generatedFiles.` path exists on disk. For each null lane, `e2eAbsenceReason.` is present (intentional absence, not an error). 
- **Pass to work-planner**: - - Integration test file: [path] (create and execute simultaneously with each phase implementation) - - E2E test file: [path] or null (execute only in final phase, when provided) - - E2E absence reason: [reason] (when E2E is null β€” pass this so work-planner can skip E2E Gap Check for intentional absence) + **Pass to work-planner**: integration / fixture-e2e / service-integration-e2e file paths (or null per lane), per-lane absence reasons, plus timing guidance β€” integration tests are created alongside each phase implementation, fixture-e2e tests are created alongside the UI feature phase, service-integration-e2e tests are executed only in the final phase. - **On error**: Escalate to user if integration file generation failed unexpectedly (status != completed). E2E being null with a valid absence reason is not an error. + **On error**: Escalate to user when status != completed and integration file generation failed unexpectedly. A null E2E lane with a valid absence reason is not an error. 3. **ADR Status Management**: Update ADR status after user decision (Accepted/Rejected) diff --git a/dev-workflows/skills/subagents-orchestration-guide/references/monorepo-flow.md b/dev-workflows/skills/subagents-orchestration-guide/references/monorepo-flow.md index 4304e07..840c9dc 100644 --- a/dev-workflows/skills/subagents-orchestration-guide/references/monorepo-flow.md +++ b/dev-workflows/skills/subagents-orchestration-guide/references/monorepo-flow.md @@ -27,7 +27,7 @@ This reference defines the orchestration flow for projects spanning multiple lay | 11 | code-verifier | Verify **Frontend** Design Doc against existing code | Frontend verification | | 12 | document-reviewer Γ—2 | Review each Design Doc (with code-verifier results as `code_verification`) | Reviews | | 13 | design-sync | Cross-layer consistency verification (source: frontend Design Doc) **[Stop]** | Sync status | -| 14 | acceptance-test-generator | Integration/E2E test skeleton from cross-layer contracts | Test skeletons | +| 14 | acceptance-test-generator | Integration + fixture-e2e + service-integration-e2e test skeletons from cross-layer contracts (per-lane) | Test skeletons | | 15 | work-planner | Work plan from all Design Docs **[Stop: Batch approval]** | Work plan | ### Medium Scale Fullstack (3-5 Files) - 13 Steps @@ -45,7 +45,7 @@ This reference defines the orchestration flow for projects spanning multiple lay | 9 | code-verifier | Verify **Frontend** Design Doc against existing code | Frontend verification | | 10 | document-reviewer Γ—2 | Review each Design Doc (with code-verifier results as `code_verification`) | Reviews | | 11 | design-sync | Cross-layer consistency verification (source: frontend Design Doc) **[Stop]** | Sync status | -| 12 | acceptance-test-generator | Integration/E2E test skeleton from cross-layer contracts | Test skeletons | +| 12 | acceptance-test-generator | Integration + fixture-e2e + service-integration-e2e test skeletons from cross-layer contracts (per-lane) | Test skeletons | | 13 | work-planner | Work plan from all Design Docs **[Stop: Batch approval]** | Work plan | ### Parallelization in Multi-Agent Steps diff --git a/package.json b/package.json index c86c1ec..9347314 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "claude-code-workflows", - "version": "0.16.17", + "version": "0.17.0", "private": true, "type": "module", "engines": { diff --git a/skills/documentation-criteria/references/plan-template.md 
b/skills/documentation-criteria/references/plan-template.md index 1120318..084ec79 100644 --- a/skills/documentation-criteria/references/plan-template.md +++ b/skills/documentation-criteria/references/plan-template.md @@ -5,6 +5,7 @@ Type: feature|fix|refactor Estimated Duration: X days Estimated Impact: X files Related Issue/PR: #XXX (if any) +Implementation Readiness: pending ## Related Documents - Design Doc(s): @@ -46,6 +47,26 @@ Maps each Design Doc technical requirement to the covering task(s). One row per **Gap Status values**: `covered` (task exists), `gap` (no task β€” requires justification in Notes, user confirmation required before plan approval) +## UI Spec Component β†’ Task Mapping + +Include this section when a UI Spec is among the inputs. Maps each component documented in the UI Spec to the task(s) that implement it. task-decomposer reads this table to populate each task's Investigation Targets with the corresponding UI Spec section. Omit the section when no UI Spec exists. + +| UI Spec Component (section heading) | States to Cover | Covered By Task(s) | Gap Status | Notes | +|---|---|---|---|---| +| [Use the UI Spec heading exactly as written, e.g., "Β§ Component: AlertCard"] | [default / loading / empty / error / partial β€” list the states the implementation must produce] | [Phase X Task Y] | covered | | + +**Reference key rule**: The component identifier in column 1 is the UI Spec section heading (verbatim). ui-spec-designer enforces unique component headings so this reference resolves to exactly one section. + +**Gap Status values**: `covered` (task exists), `gap` (no task β€” requires justification in Notes, user confirmation required before plan approval) + +## Connection Map + +Include this section when the implementation crosses more than one package, service, or process boundary. Document each boundary so task-decomposer can propagate boundary context to the implementation tasks on each side. Omit the section when the implementation stays within a single package. + +| Boundary | Owner (left side) | Owner (right side) | Expected Signal | Covered By Task(s) | +|---|---|---|---|---| +| [e.g., "web client β†’ API gateway"] | [module/package on the request side] | [module/package on the response side] | [Observable evidence the boundary works β€” e.g., "HTTP 200 with response matching ContractA", "row inserted in tableB", "message published to topicC"] | [Phase X Task Y on each side] | + ## Objective [Why this change is necessary, what problem it solves] diff --git a/skills/documentation-criteria/references/ui-spec-template.md b/skills/documentation-criteria/references/ui-spec-template.md index b66681b..9b8b60d 100644 --- a/skills/documentation-criteria/references/ui-spec-template.md +++ b/skills/documentation-criteria/references/ui-spec-template.md @@ -59,6 +59,8 @@ Map PRD acceptance criteria to prototype references. Skip this section if no pro ### Component: [ComponentName] +> Component heading uniqueness: every `Component: [ComponentName]` heading must be unique within this UI Spec. work-planner and task-decomposer reference components by exact heading text β€” duplicate names or paraphrased headings break the propagation to implementation tasks. 
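A hedged sketch of how that uniqueness rule could be checked mechanically, assuming the UI Spec is a single markdown file and component headings use the `### Component: Name` form from this template; ui-spec-designer enforces the rule by convention, so the script and its path are purely illustrative:

```typescript
import { readFileSync } from "node:fs";

const spec = readFileSync("docs/ui-spec/example.md", "utf8"); // hypothetical UI Spec path

// Collect every "### Component: ..." heading and flag exact-text duplicates.
const headings = [...spec.matchAll(/^###\s+Component:\s*(.+?)\s*$/gm)].map((m) => m[1]);
const seen = new Set<string>();
const duplicates = headings.filter((name) => (seen.has(name) ? true : (seen.add(name), false)));

if (duplicates.length > 0) {
  // Duplicate headings break the exact-heading references used by work-planner and task-decomposer.
  console.error(`Duplicate component headings: ${[...new Set(duplicates)].join(", ")}`);
}
```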
+ #### State x Display Matrix | State | Default | Loading | Empty | Error | Partial | diff --git a/skills/integration-e2e-testing/SKILL.md b/skills/integration-e2e-testing/SKILL.md index 6ad9889..801459c 100644 --- a/skills/integration-e2e-testing/SKILL.md +++ b/skills/integration-e2e-testing/SKILL.md @@ -7,14 +7,21 @@ description: Integration and E2E test design principles, ROI calculation, test s ## References -**E2E test design with Playwright**: See [references/e2e-design.md](references/e2e-design.md) for UI Spec-driven E2E test candidate selection and Playwright test architecture. +**E2E test design**: See [references/e2e-design.md](references/e2e-design.md) for UI Spec-driven E2E test candidate selection and browser test architecture. The reference uses Playwright as the default browser harness; substitute the project's standard when different. ## Test Type Definition and Limits -| Test Type | Purpose | Scope | Limit per Feature | Implementation Timing | -|-----------|---------|-------|-------------------|----------------------| -| Integration | Verify component interactions | Partial system integration | MAX 3 | Created alongside implementation | -| E2E | Verify critical user journeys | Full system | MAX 1-2 | Executed in final phase only | +| Test Type | Purpose | Scope | External Deps | Limit per Feature | Implementation Timing | +|-----------|---------|-------|---------------|-------------------|----------------------| +| Integration | Verify component interactions in-process | Partial system integration (in-process modules; for UI components, the framework's in-process renderer e.g., RTL+MSW for React/TS) | Mocked or in-process | MAX 3 | Created alongside implementation | +| fixture-e2e | Verify UI behavior in a browser with deterministic fixtures | Full UI flow with mocked backend / fixture-driven state | Mocked / fixture only β€” no live services | MAX 3 | Created alongside the UI feature | +| service-integration-e2e | Verify critical user journeys against a running local stack | Full system across services | Live local services or stubs | MAX 1-2 | Executed only in the final phase | + +**Lane selection (E2E only)**: +- Default lane for user-facing UI journeys is **fixture-e2e** β€” it runs a real browser against deterministic fixtures, catches the bugs that unit/integration tests miss (button no-op, state never updates, navigation breaks), and runs in CI without infrastructure setup +- Add **service-integration-e2e** only when the journey's correctness depends on real cross-service behavior (data persistence, transactional consistency, external service contracts) that cannot be faked safely + +The two E2E lanes are budgeted independently β€” having a fixture-e2e for a journey does not consume the service-integration-e2e budget and vice versa. ## Behavior-First Principle @@ -43,20 +50,29 @@ ROI Score = Business Value Γ— User Frequency + Legal Requirement Γ— 10 + Defect Higher ROI Score = higher priority within its test type. No normalization or capping is applied β€” the raw score is used directly for ranking. Deduplication is a separate step that removes candidates entirely; it does not modify scores. -### ROI Threshold for E2E +### ROI Thresholds by Lane + +The two E2E lanes have very different ownership costs and use independent thresholds. -E2E tests have high ownership cost (creation, execution, and maintenance are each 3-10Γ— higher than integration tests). To justify creation, an E2E candidate (beyond the must-keep reserved slot) requires **ROI Score > 50**. 
+| Lane | ROI threshold | Rationale | +|------|---------------|-----------| +| fixture-e2e | ROI β‰₯ 20 (beyond reserved slot) | Cost is comparable to integration tests once the harness exists; the floor avoids filling MAX 3 with low-signal tests when fewer would suffice | +| service-integration-e2e | ROI > 50 (beyond reserved slot) | Creation, execution, and maintenance cost is 3-10Γ— higher than integration; reserve for journeys whose value cannot be proven any other way | + +Reserved slot rules (see Multi-Step User Journey Definition below) apply per lane and override the threshold (the reserved candidate is emitted regardless of its ROI score). Below-floor candidates beyond the reserved slot are not emitted, leaving budget intentionally unfilled rather than padding with low-value tests. ### ROI Calculation Examples | Scenario | BV | Freq | Legal | Defect | ROI Score | Test Type | Selection Outcome | |----------|----|------|-------|--------|-----------|-----------|-------------------| -| Core checkout flow | 10 | 9 | true | 9 | 109 | E2E | Selected (reserved slot: user-facing multi-step journey) | -| Payment error handling | 8 | 3 | false | 7 | 31 | E2E | Below threshold (31 < 50), not selected | -| Profile save flow | 7 | 6 | false | 6 | 48 | E2E | Below threshold (48 < 50), not selected | +| Core checkout UI flow | 10 | 9 | true | 9 | 109 | fixture-e2e | Selected (reserved slot: user-facing multi-step journey, browser-level verification with fixtures) | +| Core checkout against live payment service | 10 | 9 | true | 9 | 109 | service-integration-e2e | Selected (real-service correctness above ROI threshold) | +| Dismiss button updates UI state | 6 | 7 | false | 8 | 50 | fixture-e2e | Selected (rank 2 of 3 fixture-e2e budget) | +| Payment error message display | 5 | 4 | false | 7 | 27 | fixture-e2e | Selected (rank 3 of 3 fixture-e2e budget) | +| Optional filter toggle | 3 | 4 | false | 2 | 14 | fixture-e2e | Not selected (rank 4, budget full) | +| Payment retry against real provider | 8 | 3 | false | 7 | 31 | service-integration-e2e | Below ROI threshold (31 < 50), not selected | | DB persistence check | 8 | 8 | false | 8 | 72 | Integration | Selected (rank 1 of 3) | -| Error message display | 5 | 3 | false | 4 | 19 | Integration | Selected (rank 2 of 3) | -| Optional filter toggle | 3 | 4 | false | 2 | 14 | Integration | Not selected (rank 4, budget full) | +| Pure data transformation | 5 | 3 | false | 4 | 19 | Integration | Selected (rank 2 of 3) | ## Multi-Step User Journey Definition @@ -72,14 +88,14 @@ A feature qualifies as containing a **multi-step user journey** when ALL of the ### User-Facing vs Service-Internal Journeys -Multi-step journeys are further classified for E2E budget decisions: +Multi-step journeys are classified for reserved-slot eligibility: -| Classification | Condition | E2E Reserved Slot | Example | +| Classification | Condition | Reserved Slot Eligibility | Example | |---|---|---|---| -| **User-facing** | A human user directly triggers and observes the steps (via UI, CLI, or direct API interaction) | Eligible | Web checkout flow, CLI setup wizard, mobile onboarding | -| **Service-internal** | Steps are triggered by backend services without direct user interaction | Not eligible (use integration tests) | Async job pipeline, service-to-service saga, scheduled batch processing | +| **User-facing** | A human user directly triggers and observes the steps (via UI, CLI, or direct API interaction) | Eligible β€” defaults to **fixture-e2e** reserved slot. 
Add a service-integration-e2e reserved slot only when the journey's correctness depends on real cross-service behavior | Web checkout flow, CLI setup wizard, mobile onboarding | +| **Service-internal** | Steps are triggered by backend services without direct user interaction | Not eligible for reserved slot β€” use integration tests. Service-integration-e2e through normal ROI > 50 path is still valid when full-system verification is warranted | Async job pipeline, service-to-service saga, scheduled batch processing | -This classification applies only to the reserved E2E slot and the E2E Gap Check. Service-internal journeys are still valid E2E candidates through the normal ROI > 50 path if they warrant full-system verification. +This classification applies only to the reserved-slot rule and the E2E Gap Check. Other selection follows lane-specific ROI rules above. Use this definition when evaluating E2E test candidates and E2E gap detection. @@ -92,12 +108,18 @@ Each test MUST include the following annotations: ``` AC: [Original acceptance criteria text] Behavior: [Trigger] β†’ [Process] β†’ [Observable Result] -@category: core-functionality | integration | edge-case | e2e +@category: core-functionality | integration | edge-case | fixture-e2e | service-integration-e2e +@lane: integration | fixture-e2e | service-integration-e2e @dependency: none | [component names] | full-system @complexity: low | medium | high ROI: [score] ``` +**`@lane` selection rule**: +- `integration` β€” Component interaction in-process, no browser (e.g., RTL+MSW for React/TS, in-process module/handler integration in any language) +- `fixture-e2e` β€” Browser-level UI verification with mocked backend / fixture-driven state +- `service-integration-e2e` β€” Browser-level or end-to-end verification against running local services or stubs + Use the project's comment syntax to wrap these annotations (e.g., `//` for C-family, `#` for Python/Ruby/Shell). ### Verification Items (Optional) @@ -121,9 +143,10 @@ Verification items: ## Test File Naming Convention - Integration tests: `*.int.test.*` or `*.integration.test.*` -- E2E tests: `*.e2e.test.*` +- fixture-e2e tests: `*.fixture.e2e.test.*` (or organize under `tests/e2e/fixture/`) +- service-integration-e2e tests: `*.service.e2e.test.*` (or organize under `tests/e2e/service/`) -The test runner or framework in the project determines the appropriate file extension. +The test runner or framework in the project determines the appropriate file extension. Repos that already use a single `*.e2e.test.*` convention may keep it as long as each file declares `@lane:` in its header β€” the lane annotation is the source of truth for routing and budget accounting. ## Review Criteria diff --git a/skills/integration-e2e-testing/references/e2e-design.md b/skills/integration-e2e-testing/references/e2e-design.md index f4e9e90..45a0174 100644 --- a/skills/integration-e2e-testing/references/e2e-design.md +++ b/skills/integration-e2e-testing/references/e2e-design.md @@ -1,8 +1,21 @@ -# E2E Test Design with Playwright +# E2E Test Design (Browser Harness) + +This reference uses Playwright as the default example throughout because it is the standard E2E browser harness assumed by these workflows. Adapt patterns to the project's chosen framework when different (Cypress, Selenium, etc.); the lane definitions, ROI rules, and budgets remain the same. 
+ +## Two E2E Lanes + +E2E tests in this workflow split into two lanes (see parent skill Test Type Definition): + +| Lane | When | ROI gate | Cost | +|------|------|----------|------| +| **fixture-e2e** | UI journey verification with deterministic fixtures (mocked backend / fixture data) | None β€” selected by ranking within MAX 3 budget | Comparable to integration; runs in CI without infrastructure setup | +| **service-integration-e2e** | Journey correctness depends on real cross-service behavior (data persistence, transactional consistency, external contracts) | ROI > 50 (beyond reserved slot) | 3-10Γ— higher than integration; reserved for what cannot be faked safely | + +Both lanes typically use Playwright; the difference is whether the backend is mocked / fixture-driven or running for real. ## When to Create E2E Tests -E2E tests target **critical user journeys** that span multiple pages or require real browser interaction. Apply the same ROI framework from the parent skill β€” only create E2E tests when ROI > 50. +E2E candidates target **critical user journeys** that span multiple pages or require real browser interaction. Pick the lane based on whether real services are required for the verification. ### Candidate Sources @@ -22,8 +35,8 @@ E2E tests target **critical user journeys** that span multiple pages or require - Responsive behavior across viewports **Use integration tests instead when**: -- Testing single-component state changes β†’ RTL -- Testing API response handling β†’ MSW + RTL +- Testing single-component state changes β†’ in-process component renderer (e.g., RTL for React/TS) +- Testing API response handling β†’ in-process API mock + component renderer (e.g., MSW + RTL for React/TS) - Testing pure data transformations β†’ unit tests ## UI Spec to E2E Test Mapping @@ -41,12 +54,15 @@ When a UI Spec exists, use it as the primary source for E2E test design: Screen Transition: [Screen A] β†’ [Screen B] β†’ [Screen C] AC Reference: AC-{id} User Journey: [Description of what the user accomplishes] -Preconditions: [Auth state, data state] +Lane: fixture-e2e | service-integration-e2e +Preconditions: [Auth state, data state β€” note whether these are fixture-driven or live] Verification Points: - [What to assert at each step] E2E ROI Score: [calculated score] ``` +**Lane decision**: choose `fixture-e2e` by default. Promote to `service-integration-e2e` when the verification requires observing real cross-service behavior (e.g., the test asserts that data persists across a real DB write, or that an external service receives the correct payload). 
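A minimal fixture-e2e sketch under those rules, assuming Playwright with network-level mocking; the file name, route, selectors, and ROI value are hypothetical and only illustrate the lane annotations and the fixture-driven (no live backend) setup:

```typescript
// tests/e2e/checkout.fixture.e2e.test.ts (hypothetical)
// AC: User completes checkout and sees an order confirmation
// Behavior: submit order form -> client posts /api/orders -> confirmation screen renders
// @category: fixture-e2e
// @lane: fixture-e2e
// @dependency: none
// @complexity: medium
// ROI: 109
import { test, expect } from "@playwright/test";

test("checkout journey shows confirmation with a fixture-backed order API", async ({ page }) => {
  // fixture-e2e: the backend is mocked at the network boundary, so the run is deterministic.
  await page.route("**/api/orders", (route) =>
    route.fulfill({
      status: 201,
      contentType: "application/json",
      body: JSON.stringify({ id: "order-123" }),
    }),
  );

  await page.goto("/checkout");
  await page.getByLabel("Card number").fill("4242 4242 4242 4242");
  await page.getByRole("button", { name: "Place order" }).click();

  await expect(page.getByText("Order order-123 confirmed")).toBeVisible();
});
```

Promoting the same journey to service-integration-e2e would drop the `page.route` mock and assert against a running local stack instead — exactly the cost the ROI > 50 gate exists to justify.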
+ ## Playwright Test Architecture ### Page Object Pattern @@ -56,9 +72,11 @@ Organize browser interactions through page objects for maintainability: ``` tests/ β”œβ”€β”€ e2e/ -β”‚ β”œβ”€β”€ pages/ # Page objects -β”‚ β”œβ”€β”€ fixtures/ # Test fixtures and helpers -β”‚ └── *.e2e.test.ts # Test files +β”‚ β”œβ”€β”€ pages/ # Page objects (shared across lanes) +β”‚ β”œβ”€β”€ fixtures/ # Test fixtures and helpers (auth, seed) +β”‚ β”œβ”€β”€ data/ # Static fixture data for fixture-e2e +β”‚ β”œβ”€β”€ *.fixture.e2e.test.ts # fixture-e2e test files +β”‚ └── *.service.e2e.test.ts # service-integration-e2e test files ``` ### Test Isolation @@ -81,6 +99,6 @@ When UI Spec defines responsive behavior, test critical breakpoints: ## Budget Enforcement Hard limits per feature (same as parent skill): -- **E2E Tests**: MAX 1-2 tests -- Only generate if ROI score > 50 -- Prefer fewer, comprehensive journey tests over many granular tests +- **fixture-e2e**: MAX 3 tests, no ROI gate (selected by ranking) +- **service-integration-e2e**: MAX 1-2 tests, ROI > 50 beyond the reserved slot +- Prefer fewer, comprehensive journey tests over many granular tests in both lanes diff --git a/skills/recipe-add-integration-tests/SKILL.md b/skills/recipe-add-integration-tests/SKILL.md index da21964..b05f111 100644 --- a/skills/recipe-add-integration-tests/SKILL.md +++ b/skills/recipe-add-integration-tests/SKILL.md @@ -160,6 +160,14 @@ Check quality-fixer response: On `approved` from quality-fixer: - Commit test files using Bash with message format: "test: add [layer] integration tests for [feature name]" +### Step 9: Final Cleanup + +After all task files have been processed and committed, delete the task files this recipe created. Their work is committed; `docs/plans/` is ephemeral working state and is not retained between recipe runs: + +- Delete every file matching `docs/plans/tasks/integration-tests-backend-task-*.md` and `docs/plans/tasks/integration-tests-frontend-task-*.md` created during this run + +If task files cannot be deleted (filesystem error), report the failure but do not block completion. + ## Scope Boundary for Subagents Append the following block to every subagent prompt invoked from this recipe: diff --git a/skills/recipe-build/SKILL.md b/skills/recipe-build/SKILL.md index 7907366..d56dbb3 100644 --- a/skills/recipe-build/SKILL.md +++ b/skills/recipe-build/SKILL.md @@ -20,33 +20,51 @@ Work plan: $ARGUMENTS ## Pre-execution Prerequisites -### Task File Existence Check -```bash -# Check work plans -! ls -la docs/plans/*.md | grep -v template | tail -5 +### Implementation Readiness Check -# Check task files -! ls docs/plans/tasks/*.md 2>/dev/null || echo "No task files found" -``` +Before any task processing, locate the work plan to gate against. Resolution rule: +1. List task files in `docs/plans/tasks/` matching the single-layer pattern `{plan-name}-task-*.md`. Layer-aware fullstack tasks (`{plan-name}-backend-task-*.md` / `{plan-name}-frontend-task-*.md`) are excluded here so a stale fullstack run does not redirect this recipe to the wrong work plan +2. From the matched files, also exclude every file matching any of these patterns β€” they originate from other workflow phases and are not implementation tasks for this run's plan: `*-task-prep-*.md` (readiness preflight tasks), `_overview-*.md` (decomposition overview file), `*-phase*-completion.md` (per-phase completion files), `review-fixes-*.md` (post-implementation review fixes), `integration-tests-*-task-*.md` (integration-test add-on scaffolding) +3. 
For each remaining file, extract the `{plan-name}` prefix as the segment that appears before `-task-` +4. When at least one task file matches, the work plan is `docs/plans/{plan-name}.md` for the prefix that has the most recent task-file mtime; ties broken by the lexicographically last `{plan-name}` +5. When no task file matches the restricted pattern, the work plan is the most-recent-mtime non-template `.md` in `docs/plans/` + +Read the work plan header and find the line `Implementation Readiness: `. Apply this rule: + +| Status | Action | +|--------|--------| +| `ready` | Proceed to Consumed Task Set computation | +| `escalated` | Read the work plan's Readiness Report section, surface remaining gaps to the user via AskUserQuestion: "Implementation Readiness is `escalated` with the following remaining gaps: [list]. Continue execution? (y/n)". On `y` proceed; on `n` stop | +| `pending` | Present via AskUserQuestion: "Implementation Readiness is `pending`. Run `/recipe-prepare-implementation [plan-path]` first to verify the work plan is implementable, then resume. Continue without preflight? (y/n)". On `y` proceed; on `n` stop | +| absent (line missing) | Treat as `pending` β€” older work plans created before the readiness marker existed should be preflighted explicitly | + +### Consumed Task Set + +Compute the **Consumed Task Set** for this run β€” the exact files this recipe owns, executes, and later deletes. Use the same restricted pattern as the Implementation Readiness Check: + +1. List task files in `docs/plans/tasks/` matching the single-layer pattern `{plan-name}-task-*.md` for the `{plan-name}` resolved by the readiness check. Layer-aware fullstack tasks are excluded +2. Exclude every file matching: `*-task-prep-*.md`, `_overview-*.md`, `*-phase*-completion.md`, `review-fixes-*.md`, `integration-tests-*-task-*.md` (these originate from other workflow phases) + +Every subsequent reference to "task files" in this recipe β€” Task Generation Decision Flow, Task Execution Cycle iteration, and Final Cleanup β€” uses this set, not the unrestricted `docs/plans/tasks/*.md` glob. 
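+
+Illustrative only: the orchestrator applies this matching with its file tools rather than by running code. A TypeScript sketch of the same rules (function and constant names are assumptions):
+
+```ts
+import { readdirSync, statSync } from 'node:fs';
+import { join } from 'node:path';
+
+const TASKS_DIR = 'docs/plans/tasks';
+// Mirrors the exclusion list above: files owned by other workflow phases.
+const EXCLUDED = [/-task-prep-/, /^_overview-/, /-phase\d+-completion\.md$/, /^review-fixes-/, /^integration-tests-/];
+
+function consumedTaskSet(): { planName: string; files: string[] } | null {
+  const candidates = readdirSync(TASKS_DIR)
+    .filter((f) => /-task-.*\.md$/.test(f))              // single-layer pattern {plan-name}-task-*.md
+    .filter((f) => !/-(backend|frontend)-task-/.test(f)) // layer-aware fullstack tasks are excluded
+    .filter((f) => !EXCLUDED.some((p) => p.test(f)));
+  if (candidates.length === 0) return null;              // fall back to the most recent work plan
+
+  // Resolve {plan-name}: prefix before "-task-", ranked by most recent task-file mtime,
+  // ties broken by the lexicographically last plan name.
+  const newestByPlan = new Map<string, number>();
+  for (const f of candidates) {
+    const plan = f.slice(0, f.indexOf('-task-'));
+    const mtime = statSync(join(TASKS_DIR, f)).mtimeMs;
+    newestByPlan.set(plan, Math.max(newestByPlan.get(plan) ?? 0, mtime));
+  }
+  const planName = [...newestByPlan.entries()]
+    .sort((a, b) => b[1] - a[1] || b[0].localeCompare(a[0]))[0][0];
+
+  return { planName, files: candidates.filter((f) => f.startsWith(`${planName}-task-`)).sort() };
+}
+```
+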
### Task Generation Decision Flow -Analyze task file existence state and determine the action required: +Analyze the Consumed Task Set and determine the action required: | State | Criteria | Next Action | |-------|----------|-------------| -| Tasks exist | .md files in tasks/ directory | User's execution instruction serves as batch approval β†’ Enter autonomous execution immediately | -| No tasks + plan exists | Plan exists but no task files | Confirm with user β†’ run task-decomposer | -| Neither exists + Design Doc exists | No plan or task files, but docs/design/*.md exists | Invoke work-planner to create work plan from Design Doc, then proceed to task decomposition | -| Neither exists | No plan, no task files, no Design Doc | Report missing prerequisites to user and stop | +| Tasks exist | Consumed Task Set is non-empty | User's execution instruction serves as batch approval β†’ Enter autonomous execution immediately | +| No tasks + plan exists | Consumed Task Set is empty but the resolved work plan exists | Confirm with user β†’ run task-decomposer | +| Neither exists + Design Doc exists | No plan, no Consumed Task Set, but `docs/design/*.md` exists | Invoke work-planner to create work plan from Design Doc, then proceed to task decomposition | +| Neither exists | No plan, no Consumed Task Set, no Design Doc | Report missing prerequisites to user and stop | ## Task Decomposition Phase (Conditional) -When task files don't exist: +When the Consumed Task Set is empty: ### 1. User Confirmation ``` -No task files found. +No task files in the Consumed Task Set. Work plan: docs/plans/[plan-name].md Generate tasks from the work plan? (y/n): @@ -59,17 +77,14 @@ Invoke task-decomposer using Agent tool: - `prompt`: "Read work plan at docs/plans/[plan-name].md and decompose into atomic tasks. Output: Individual task files in docs/plans/tasks/. Granularity: 1 task = 1 commit = independently executable" ### 3. Verify Generation -```bash -# Verify generated task files -! ls -la docs/plans/tasks/*.md | head -10 -``` +Recompute the Consumed Task Set using the same restricted pattern from the Consumed Task Set section above. Confirm it is now non-empty. If it is still empty, escalate to the user β€” task-decomposer either failed silently or produced files that don't match the expected pattern. -**Flow**: Task generation β†’ Autonomous execution (in this order) +**Flow**: Task generation β†’ Consumed Task Set recompute β†’ Autonomous execution (in this order) ## Pre-execution Checklist -- [ ] Confirmed task files exist in docs/plans/tasks/ -- [ ] Identified task execution order (dependencies) +- [ ] Confirmed Consumed Task Set is non-empty (computed in the Consumed Task Set section above) +- [ ] Identified task execution order within the Consumed Task Set (dependencies) - [ ] **Environment check**: Can I execute per-task commit cycle? - If commit capability unavailable β†’ Escalate before autonomous mode - Other environments (tests, quality tools) β†’ Subagents will escalate @@ -77,7 +92,7 @@ Invoke task-decomposer using Agent tool: ## Task Execution Cycle (4-Step Cycle) **MANDATORY EXECUTION CYCLE**: `task-executor β†’ escalation check β†’ quality-fixer β†’ commit` -For EACH task, YOU MUST: +For EACH task in the Consumed Task Set, YOU MUST: 1. **Register tasks using TaskCreate**: Register work steps. Always include first task "Map preloaded skills to applicable concrete rules" and final task "Verify the mapped rules before final JSON" 2. 
**Agent tool** (subagent_type: "dev-workflows:task-executor") β†’ Pass task file path in prompt, receive structured response 3. **CHECK task-executor response**: @@ -127,7 +142,18 @@ After all task cycles finish, run verification agents **in parallel** before the - Re-run only the failed verifiers (by the criteria in step 2) - Repeat until all pass or `blocked` β†’ Escalate to user -4. **All passed** β†’ Proceed to completion report +4. **All passed** β†’ Proceed to Final Cleanup + +## Final Cleanup + +Before the completion report, delete the implementation task files this recipe consumed. Their work is committed; `docs/plans/` is ephemeral working state and is not retained between recipe runs: + +- Delete every file in the Consumed Task Set +- Delete every file matching `docs/plans/tasks/{plan-name}-phase*-completion.md` (the per-phase completion files generated by task-decomposer for this `{plan-name}`) +- Delete the corresponding `docs/plans/tasks/_overview-{plan-name}.md` if present +- Preserve the work plan itself (`docs/plans/{plan-name}.md`) β€” the user decides whether to delete it after final review + +If task files cannot be deleted (filesystem error), report the failure but do not block the completion report. ## Output Example Implementation phase completed. @@ -135,4 +161,5 @@ Implementation phase completed. - Implemented tasks: [number] tasks - Quality checks: All passed - Commits: [number] commits created +- Cleanup: Task files removed from docs/plans/tasks/ diff --git a/skills/recipe-front-build/SKILL.md b/skills/recipe-front-build/SKILL.md index a9d9daa..df9c996 100644 --- a/skills/recipe-front-build/SKILL.md +++ b/skills/recipe-front-build/SKILL.md @@ -20,33 +20,51 @@ Work plan: $ARGUMENTS ## Pre-execution Prerequisites -### Task File Existence Check -```bash -# Check work plans -! ls -la docs/plans/*.md | grep -v template | tail -5 +### Implementation Readiness Check -# Check task files -! ls docs/plans/tasks/*.md 2>/dev/null || echo "No task files found" -``` +Before any task processing, locate the work plan to gate against. Resolution rule: +1. List task files in `docs/plans/tasks/` matching the single-layer pattern `{plan-name}-task-*.md`. Layer-aware fullstack tasks (`{plan-name}-backend-task-*.md` / `{plan-name}-frontend-task-*.md`) are excluded here so a stale fullstack run does not redirect this recipe to the wrong work plan +2. From the matched files, also exclude every file matching any of these patterns β€” they originate from other workflow phases and are not implementation tasks for this run's plan: `*-task-prep-*.md` (readiness preflight tasks), `_overview-*.md` (decomposition overview file), `*-phase*-completion.md` (per-phase completion files), `review-fixes-*.md` (post-implementation review fixes), `integration-tests-*-task-*.md` (integration-test add-on scaffolding) +3. For each remaining file, extract the `{plan-name}` prefix as the segment that appears before `-task-` +4. When at least one task file matches, the work plan is `docs/plans/{plan-name}.md` for the prefix that has the most recent task-file mtime; ties broken by the lexicographically last `{plan-name}` +5. When no task file matches the restricted pattern, the work plan is the most-recent-mtime non-template `.md` in `docs/plans/` + +Read the work plan header and find the line `Implementation Readiness: `. 
Apply this rule: + +| Status | Action | +|--------|--------| +| `ready` | Proceed to Consumed Task Set computation | +| `escalated` | Read the work plan's Readiness Report section, surface remaining gaps to the user via AskUserQuestion: "Implementation Readiness is `escalated` with the following remaining gaps: [list]. Continue execution? (y/n)". On `y` proceed; on `n` stop | +| `pending` | Present via AskUserQuestion: "Implementation Readiness is `pending`. To verify the work plan is implementable, run `/recipe-prepare-implementation [plan-path]` first, then resume. That recipe is provided by the dev-workflows plugin β€” when only this frontend plugin is installed, install dev-workflows to use it, or continue without preflight. Continue without preflight? (y/n)". On `y` proceed; on `n` stop | +| absent (line missing) | Treat as `pending` β€” older work plans created before the readiness marker existed should be preflighted explicitly | + +### Consumed Task Set + +Compute the **Consumed Task Set** for this run β€” the exact files this recipe owns, executes, and later deletes. Use the same restricted pattern as the Implementation Readiness Check: + +1. List task files in `docs/plans/tasks/` matching the single-layer pattern `{plan-name}-task-*.md` for the `{plan-name}` resolved by the readiness check. Layer-aware fullstack tasks are excluded +2. Exclude every file matching: `*-task-prep-*.md`, `_overview-*.md`, `*-phase*-completion.md`, `review-fixes-*.md`, `integration-tests-*-task-*.md` (these originate from other workflow phases) + +Every subsequent reference to "task files" in this recipe β€” Task Generation Decision Flow, Task Execution Cycle iteration, and Final Cleanup β€” uses this set, not the unrestricted `docs/plans/tasks/*.md` glob. ### Task Generation Decision Flow -Analyze task file existence state and determine the action required: +Analyze the Consumed Task Set and determine the action required: | State | Criteria | Next Action | |-------|----------|-------------| -| Tasks exist | .md files in tasks/ directory | User's execution instruction serves as batch approval β†’ Enter autonomous execution immediately | -| No tasks + plan exists | Plan exists but no task files | Confirm with user β†’ run task-decomposer | -| Neither exists + Design Doc exists | No plan or task files, but docs/design/*.md exists | Invoke work-planner to create work plan from Design Doc, then proceed to task decomposition | -| Neither exists | No plan, no task files, no Design Doc | Report missing prerequisites to user and stop | +| Tasks exist | Consumed Task Set is non-empty | User's execution instruction serves as batch approval β†’ Enter autonomous execution immediately | +| No tasks + plan exists | Consumed Task Set is empty but the resolved work plan exists | Confirm with user β†’ run task-decomposer | +| Neither exists + Design Doc exists | No plan, no Consumed Task Set, but `docs/design/*.md` exists | Invoke work-planner to create work plan from Design Doc, then proceed to task decomposition | +| Neither exists | No plan, no Consumed Task Set, no Design Doc | Report missing prerequisites to user and stop | ## Task Decomposition Phase (Conditional) -When task files don't exist: +When the Consumed Task Set is empty: ### 1. User Confirmation ``` -No task files found. +No task files in the Consumed Task Set. Work plan: docs/plans/[plan-name].md Generate tasks from the work plan? 
(y/n): @@ -59,17 +77,14 @@ Invoke task-decomposer using Agent tool: - `prompt`: "Read work plan at docs/plans/[plan-name].md and decompose into atomic tasks. Output: Individual task files in docs/plans/tasks/. Granularity: 1 task = 1 commit = independently executable" ### 3. Verify Generation -```bash -# Verify generated task files -! ls -la docs/plans/tasks/*.md | head -10 -``` +Recompute the Consumed Task Set using the same restricted pattern from the Consumed Task Set section above. Confirm it is now non-empty. If it is still empty, escalate to the user β€” task-decomposer either failed silently or produced files that don't match the expected pattern. -**Flow**: Task generation β†’ Autonomous execution (in this order) +**Flow**: Task generation β†’ Consumed Task Set recompute β†’ Autonomous execution (in this order) ## Pre-execution Checklist -- [ ] Confirmed task files exist in docs/plans/tasks/ -- [ ] Identified task execution order (dependencies) +- [ ] Confirmed Consumed Task Set is non-empty (computed in the Consumed Task Set section above) +- [ ] Identified task execution order within the Consumed Task Set (dependencies) - [ ] **Environment check**: Can I execute per-task commit cycle? - If commit capability unavailable β†’ Escalate before autonomous mode - Other environments (tests, quality tools) β†’ Subagents will escalate @@ -77,7 +92,7 @@ Invoke task-decomposer using Agent tool: ## Task Execution Cycle (4-Step Cycle) **MANDATORY EXECUTION CYCLE**: `task-executor-frontend β†’ escalation check β†’ quality-fixer-frontend β†’ commit` -For EACH task, YOU MUST: +For EACH task in the Consumed Task Set, YOU MUST: 1. **Register tasks using TaskCreate**: Register work steps. Always include first task "Map preloaded skills to applicable concrete rules" and final task "Verify the mapped rules before final JSON" 2. **Agent tool** (subagent_type: "dev-workflows-frontend:task-executor-frontend") β†’ Pass task file path in prompt, receive structured response 3. **CHECK task-executor-frontend response**: @@ -127,7 +142,18 @@ After all task cycles finish, run verification agents **in parallel** before the - Re-run only the failed verifiers (by the criteria in step 2) - Repeat until all pass or `blocked` β†’ Escalate to user -4. **All passed** β†’ Proceed to completion report +4. **All passed** β†’ Proceed to Final Cleanup + +## Final Cleanup + +Before the completion report, delete the implementation task files this recipe consumed. Their work is committed; `docs/plans/` is ephemeral working state and is not retained between recipe runs: + +- Delete every file in the Consumed Task Set +- Delete every file matching `docs/plans/tasks/{plan-name}-phase*-completion.md` (the per-phase completion files generated by task-decomposer for this `{plan-name}`) +- Delete the corresponding `docs/plans/tasks/_overview-{plan-name}.md` if present +- Preserve the work plan itself (`docs/plans/{plan-name}.md`) β€” the user decides whether to delete it after final review + +If task files cannot be deleted (filesystem error), report the failure but do not block the completion report. ## Output Example Frontend implementation phase completed. @@ -135,4 +161,5 @@ Frontend implementation phase completed. 
- Implemented tasks: [number] tasks - Quality checks: All passed (Lighthouse, bundle size, tests) - Commits: [number] commits created +- Cleanup: Task files removed from docs/plans/tasks/ diff --git a/skills/recipe-front-plan/SKILL.md b/skills/recipe-front-plan/SKILL.md index 2be4a60..80d3670 100644 --- a/skills/recipe-front-plan/SKILL.md +++ b/skills/recipe-front-plan/SKILL.md @@ -39,24 +39,25 @@ Follow the planning process below: - Present options if multiple exist (can be specified with $ARGUMENTS) ### Step 2: Test Skeleton Generation Confirmation - - Confirm with user whether to generate test skeletons (integration + E2E) first - - If user wants generation: acceptance-test-generator generates both integration and E2E test skeletons + - Confirm with user whether to generate test skeletons (integration + fixture-e2e + service-integration-e2e) first + - If user wants generation: acceptance-test-generator generates skeletons across all applicable lanes - Invoke acceptance-test-generator using Agent tool: - `subagent_type`: "dev-workflows-frontend:acceptance-test-generator" - `description`: "Test skeleton generation" - If UI Spec exists: `prompt: "Generate test skeletons from Design Doc at [path]. UI Spec at [ui-spec path]."` - If no UI Spec: `prompt: "Generate test skeletons from Design Doc at [path]."` - - Pass integration test file path, E2E test file path (or null), and e2eAbsenceReason to work-planner according to subagents-orchestration-guide "acceptance-test-generator β†’ work-planner" section + - Pass integration test file path, fixture-e2e and service-integration-e2e file paths (or null per lane), and e2eAbsenceReason (per lane) to work-planner according to subagents-orchestration-guide "acceptance-test-generator β†’ work-planner" section ### Step 3: Work Plan Creation Invoke work-planner using Agent tool: - `subagent_type`: "dev-workflows-frontend:work-planner" - `description`: "Work plan creation" -- If test skeletons were generated in Step 2: - - When `generatedFiles.e2e` is not null: - `prompt`: "Create work plan from Design Doc at [path]. Integration test file: [integration test path]. E2E test file: [E2E test path]. Integration tests are created simultaneously with each phase implementation, E2E tests are executed only in final phase." - - When `generatedFiles.e2e` is null: - `prompt`: "Create work plan from Design Doc at [path]. Integration test file: [integration test path]. No E2E test skeletons were generated (reason: [e2eAbsenceReason]). Integration tests are created simultaneously with each phase implementation." +- If test skeletons were generated in Step 2, build the prompt by listing every lane's status: + - Always include: "Integration test file: [path or 'not generated']" + - For each E2E lane (`fixtureE2e`, `serviceE2e`): + - When `generatedFiles.[lane]` is not null: "[lane] test file: [path]" + - When `generatedFiles.[lane]` is null: "No [lane] skeleton generated (reason: [e2eAbsenceReason for that lane])" + - Append placement guidance: "Integration tests are created simultaneously with each phase implementation. fixture-e2e tests are created alongside the UI feature phase. service-integration-e2e tests are executed only in the final phase." - If test skeletons were not generated: `prompt`: "Create work plan from Design Doc at [path]."
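+
+For illustration, a sketch of how the lane-by-lane prompt could be assembled; the field names follow the orchestration guide's output convention, and the helper itself is an assumption since the orchestrator composes this prompt in prose:
+
+```ts
+// Hypothetical shape of the acceptance-test-generator result.
+interface SkeletonResult {
+  generatedFiles: { integration: string | null; fixtureE2e: string | null; serviceE2e: string | null };
+  e2eAbsenceReason: { fixtureE2e?: string; serviceE2e?: string };
+}
+
+function buildWorkPlannerPrompt(designDocPath: string, r: SkeletonResult): string {
+  const parts = [`Create work plan from Design Doc at ${designDocPath}.`];
+  parts.push(`Integration test file: ${r.generatedFiles.integration ?? 'not generated'}.`);
+  for (const lane of ['fixtureE2e', 'serviceE2e'] as const) {
+    const path = r.generatedFiles[lane];
+    parts.push(
+      path !== null
+        ? `${lane} test file: ${path}.`
+        : `No ${lane} skeleton generated (reason: ${r.e2eAbsenceReason[lane] ?? 'unspecified'}).`,
+    );
+  }
+  parts.push(
+    'Integration tests are created simultaneously with each phase implementation. ' +
+      'fixture-e2e tests are created alongside the UI feature phase. ' +
+      'service-integration-e2e tests are executed only in the final phase.',
+  );
+  return parts.join(' ');
+}
+```
+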
diff --git a/skills/recipe-front-review/SKILL.md b/skills/recipe-front-review/SKILL.md index fa45fce..8e24255 100644 --- a/skills/recipe-front-review/SKILL.md +++ b/skills/recipe-front-review/SKILL.md @@ -16,9 +16,10 @@ disable-model-invocation: true - Compliance validation β†’ performed by code-reviewer - Security validation β†’ performed by security-reviewer -- Fix implementation β†’ performed by task-executor-frontend -- Quality checks β†’ performed by quality-fixer-frontend -- Re-validation β†’ performed by code-reviewer / security-reviewer +- **Code-side fix path**: Fix implementation β†’ task-executor-frontend; Quality checks β†’ quality-fixer-frontend; Re-validation β†’ code-reviewer / security-reviewer +- **Design-side update path**: DD revision β†’ technical-designer-frontend (update mode); DD review β†’ document-reviewer; cross-DD consistency β†’ design-sync (when multiple DDs exist); Re-validation β†’ code-reviewer + +The design-side path applies when the discrepancy reflects code that was correct but the Design Doc became stale, rather than code that violated the Design Doc. Design Doc (uses most recent if omitted): $ARGUMENTS @@ -63,36 +64,73 @@ Invoke security-reviewer using Agent tool: **Report both results independently using subagent output fields only**: +Before presenting to the user, the orchestrator computes a recommended route per finding using the rule below (this rule is internal β€” do not include it in the user-facing prompt): + +| Finding pattern | Recommended route | +|-----------------|-------------------| +| `dd_violation` where the code intent matches the original requirement but the Design Doc captured a different design | `d` (Design-side update) | +| `dd_violation` where the code drifted from a still-correct Design Doc | `c` (Code-side fix) | +| `reliability` / `security` / `maintainability` findings | `c` (Code-side fix) | + +Then present to the user (label each finding with its recommended route, grouped by route): + ``` Code Compliance: [complianceRate from code-reviewer] Verdict: [verdict from code-reviewer] Identifier Match Rate: [identifierMatchRate from code-reviewer] Acceptance Criteria: - [fulfilled] [item] (confidence: [high/medium/low]) - - [partially_fulfilled] [item]: [gap] β€” [suggestion] - - [unfulfilled] [item]: [gap] β€” [suggestion] + - [partially_fulfilled] [item]: [gap] β€” [suggestion] [recommended: c | d] + - [unfulfilled] [item]: [gap] β€” [suggestion] [recommended: c | d] Identifier Mismatches: - - [identifier]: DD=[designDocValue] Code=[codeValue] at [location] + - [identifier]: DD=[designDocValue] Code=[codeValue] at [location] [recommended: c | d] Quality Findings: - - [category] [location]: [description] β€” [rationale] + - [category] [location]: [description] β€” [rationale] [recommended: c] Security Review: [status from security-reviewer] Findings by category: - - [confirmed_risk] [location]: [description] β€” [rationale] - - [defense_gap] [location]: [description] β€” [rationale] - - [hardening] [location]: [description] β€” [rationale] - - [policy] [location]: [description] β€” [rationale] + - [confirmed_risk] [location]: [description] β€” [rationale] [recommended: c] + - [defense_gap] [location]: [description] β€” [rationale] [recommended: c] + - [hardening] [location]: [description] β€” [rationale] [recommended: c] + - [policy] [location]: [description] β€” [rationale] [recommended: c] Notes: [notes from security-reviewer, if present] -Execute fixes? 
(y/n): +Resolve discrepancies β€” confirm or override the recommended route per finding: + c) Code-side fix β€” code violates Design Doc; modify code to match + d) Design-side update β€” code is correct; Design Doc is stale, revise it + s) Skip β€” accept current state without changes ``` -If both pass and user selects `n`: Skip Steps 5-10, proceed to Step 11. +Use AskUserQuestion. The default offer is **"accept all recommended routes"** β€” a single confirmation for the typical case where the orchestrator's recommendations are correct. When the user wants to override, collect per-finding c/d/s decisions instead. If the user selects `s` for everything: skip Steps 5-10, proceed to Step 11. ### Step 5: Execute Skill Execute Skill: documentation-criteria (for task file template) +### Step 5d: Design-Side Update + +Run this step only when the user routed at least one finding to `d`. When all routes are `c` or `s`, skip directly to Step 6. + +1. Invoke technical-designer-frontend in update mode using Agent tool: + - `subagent_type`: "dev-workflows-frontend:technical-designer-frontend" + - `description`: "Design Doc update from review findings" + - `prompt`: "Update Design Doc at [path] in update mode. The implementation has diverged in the following ways that the team has decided to ratify in the design rather than in the code: [list of `d`-routed findings with codeLocation and designDocValue from $STEP_2_OUTPUT]. Reflect the current code behavior in the relevant sections and add a history entry." + +2. Invoke document-reviewer to verify the updated Design Doc: + - `subagent_type`: "dev-workflows-frontend:document-reviewer" + - `description`: "Document review of updated Design Doc" + - `prompt`: "Review updated Design Doc at [path] for consistency and completeness." + +3. When multiple Design Docs exist (`ls docs/design/*.md | grep -v template | wc -l > 1`), invoke design-sync: + - `subagent_type`: "dev-workflows-frontend:design-sync" + - `description`: "Cross-DD consistency check" + - `prompt`: "source_design: [updated DD path]. Detect conflicts across all Design Docs after the update." + - When `sync_status: conflicts_found`: present conflicts to the user; resolution requires re-invoking technical-designer-frontend for affected DDs. + +4. After Step 5d completes: + - If the user selected `d` for all findings (no `c` routes) β†’ skip Steps 6-8, proceed to Step 9 for re-validation + - If the user selected both `d` and `c` β†’ re-evaluate the `c`-routed findings against the updated DD and drop any that are now satisfied by the DD revision; then proceed to Step 6 with the remaining `c` findings + ### Step 6: Create Task File Create task file at `docs/plans/tasks/review-fixes-YYYYMMDD.md` @@ -117,7 +155,7 @@ Invoke quality-fixer-frontend using Agent tool: Invoke code-reviewer using Agent tool: - `subagent_type`: "dev-workflows-frontend:code-reviewer" - `description`: "Re-validate compliance" -- `prompt`: "Re-validate Design Doc compliance after fixes. Design Doc: [path]. Implementation files: [file list]. Prior compliance issues: $STEP_2_OUTPUT. Verify each prior issue is resolved." +- `prompt`: "Re-validate Design Doc compliance after fixes. Design Doc: [path]. Implementation files: [file list]. Prior compliance issues: $STEP_2_OUTPUT. Verify each prior issue is resolved (whether resolved code-side or design-side)." 
### Step 10: Re-validate security-reviewer @@ -126,7 +164,16 @@ Invoke security-reviewer using Agent tool (only if security fixes were applied): - `description`: "Re-validate security" - `prompt`: "Re-validate security after fixes. Prior findings: $STEP_3_OUTPUT. Design Doc: [path]. Implementation files: [file list]." -### Step 11: Final Report +### Step 11: Final Cleanup and Report + +Delete the review-fix task file this recipe created (if any). Its work is committed; `docs/plans/` is ephemeral working state and is not retained between recipe runs: + +- Delete `docs/plans/tasks/review-fixes-YYYYMMDD.md` if it exists + +If the file cannot be deleted (filesystem error), report the failure but do not block the final report. + +Then present the final report: + ``` Code Compliance: Initial: [X]% @@ -139,9 +186,11 @@ Security Review: Remaining issues: - [items requiring manual intervention] + +Cleanup: review-fixes task file removed ``` -## Auto-fixable Items +## Auto-fixable Items (code-side path) - Simple unimplemented acceptance criteria - Error handling additions - Contract definition fixes @@ -151,10 +200,16 @@ Remaining issues: ## Non-fixable Items - Fundamental business logic changes - Architecture-level modifications -- Design Doc deficiencies - Committed secrets (blocked β†’ human intervention) -**Scope**: Design Doc compliance validation, security review, and auto-fixes. +## Design-Side Update Triggers +Discrepancies suitable for the design-side path (code is correct, DD became stale): +- Identifier renames where the new identifier reflects the team's current naming +- Behavioral changes that match the original requirement intent better than what the DD captured +- Component splits or merges where the new structure is sound and the DD documented the prior structure +- New ACs that the implementation already satisfies but the DD never enumerated + +**Scope**: Design Doc compliance validation, security review, code-side auto-fixes, and design-side update routing. ## Scope Boundary for Subagents diff --git a/skills/recipe-fullstack-build/SKILL.md b/skills/recipe-fullstack-build/SKILL.md index 7d4bb8e..6488bf2 100644 --- a/skills/recipe-fullstack-build/SKILL.md +++ b/skills/recipe-fullstack-build/SKILL.md @@ -28,33 +28,51 @@ Work plan: $ARGUMENTS ## Pre-execution Prerequisites -### Task File Existence Check -```bash -# Check work plans -! ls -la docs/plans/*.md | grep -v template | tail -5 +### Implementation Readiness Check -# Check task files -! ls docs/plans/tasks/*.md 2>/dev/null || echo "No task files found" -``` +Before any task processing, locate the work plan to gate against. Resolution rule: +1. List task files in `docs/plans/tasks/` matching the layer-aware patterns `{plan-name}-backend-task-*.md` and `{plan-name}-frontend-task-*.md` only. Single-layer tasks (`{plan-name}-task-*.md`) are excluded here so a stale single-layer run does not redirect this recipe to the wrong work plan +2. From the matched files, also exclude every file matching any of these patterns β€” they originate from other workflow phases and are not implementation tasks for this run's plan: `*-task-prep-*.md` (readiness preflight tasks), `_overview-*.md` (decomposition overview file), `*-phase*-completion.md` (per-phase completion files), `review-fixes-*.md` (post-implementation review fixes), `integration-tests-*-task-*.md` (integration-test add-on scaffolding) +3. For each remaining file, extract the `{plan-name}` prefix as the segment that appears before `-backend-task-` or `-frontend-task-` +4. 
When at least one task file matches, the work plan is `docs/plans/{plan-name}.md` for the prefix that has the most recent task-file mtime; ties broken by the lexicographically last `{plan-name}` +5. When no task file matches the restricted pattern, the work plan is the most-recent-mtime non-template `.md` in `docs/plans/` + +Read the work plan header and find the line `Implementation Readiness: `. Apply this rule: + +| Status | Action | +|--------|--------| +| `ready` | Proceed to Consumed Task Set computation | +| `escalated` | Read the work plan's Readiness Report section, surface remaining gaps to the user via AskUserQuestion: "Implementation Readiness is `escalated` with the following remaining gaps: [list]. Continue execution? (y/n)". On `y` proceed; on `n` stop | +| `pending` | Present via AskUserQuestion: "Implementation Readiness is `pending`. Run `/recipe-prepare-implementation [plan-path]` first to verify the work plan is implementable, then resume. Continue without preflight? (y/n)". On `y` proceed; on `n` stop | +| absent (line missing) | Treat as `pending` β€” older work plans created before the readiness marker existed should be preflighted explicitly | + +### Consumed Task Set + +Compute the **Consumed Task Set** for this run β€” the exact files this recipe owns, executes, and later deletes. Use the same restricted pattern as the Implementation Readiness Check: + +1. List task files in `docs/plans/tasks/` matching the layer-aware patterns `{plan-name}-backend-task-*.md` and `{plan-name}-frontend-task-*.md` for the `{plan-name}` resolved by the readiness check. Single-layer tasks are excluded +2. Exclude every file matching: `*-task-prep-*.md`, `_overview-*.md`, `*-phase*-completion.md`, `review-fixes-*.md`, `integration-tests-*-task-*.md` (these originate from other workflow phases) + +Every subsequent reference to "task files" in this recipe β€” Task Generation Decision Flow, Task Execution Cycle iteration, and Final Cleanup β€” uses this set, not the unrestricted `docs/plans/tasks/*.md` glob. ### Task Generation Decision Flow -Analyze task file existence state and determine the action required: +Analyze the Consumed Task Set and determine the action required: | State | Criteria | Next Action | |-------|----------|-------------| -| Tasks exist | .md files in tasks/ directory | User's execution instruction serves as batch approval β†’ Enter autonomous execution immediately | -| No tasks + plan exists | Plan exists but no task files | Confirm with user β†’ run task-decomposer | -| Neither exists + Design Doc exists | No plan or task files, but docs/design/*.md exists | Invoke work-planner to create work plan from Design Doc(s), then proceed to task decomposition | -| Neither exists | No plan, no task files, no Design Doc | Report missing prerequisites to user and stop | +| Tasks exist | Consumed Task Set is non-empty | User's execution instruction serves as batch approval β†’ Enter autonomous execution immediately | +| No tasks + plan exists | Consumed Task Set is empty but the resolved work plan exists | Confirm with user β†’ run task-decomposer | +| Neither exists + Design Doc exists | No plan, no Consumed Task Set, but `docs/design/*.md` exists | Invoke work-planner to create work plan from Design Doc(s), then proceed to task decomposition | +| Neither exists | No plan, no Consumed Task Set, no Design Doc | Report missing prerequisites to user and stop | ## Task Decomposition Phase (Conditional) -When task files don't exist: +When the Consumed Task Set is empty: ### 1. 
User Confirmation ``` -No task files found. +No task files in the Consumed Task Set. Work plan: docs/plans/[plan-name].md Generate tasks from the work plan? (y/n): @@ -67,22 +85,19 @@ Invoke task-decomposer using Agent tool: - `prompt`: "Read work plan at docs/plans/[plan-name].md and decompose into atomic tasks. Output: Individual task files in docs/plans/tasks/. Granularity: 1 task = 1 commit = independently executable. Use layer-aware naming: {plan}-backend-task-{n}.md, {plan}-frontend-task-{n}.md based on Target files paths." ### 3. Verify Generation -```bash -# Verify generated task files -! ls -la docs/plans/tasks/*.md | head -10 -``` +Recompute the Consumed Task Set using the same restricted pattern from the Consumed Task Set section above. Confirm it is now non-empty. If it is still empty, escalate to the user β€” task-decomposer either failed silently or produced files that don't match the expected pattern. ## Pre-execution Checklist -- [ ] Confirmed task files exist in docs/plans/tasks/ -- [ ] Identified task execution order (dependencies) +- [ ] Confirmed Consumed Task Set is non-empty (computed in the Consumed Task Set section above) +- [ ] Identified task execution order within the Consumed Task Set (dependencies) - [ ] **Environment check**: Can I execute per-task commit cycle? - If commit capability unavailable β†’ Escalate before autonomous mode - Other environments (tests, quality tools) β†’ Subagents will escalate ## Task Execution Cycle (Filename-Pattern-Based) -**MANDATORY**: Route agents by task filename pattern from monorepo-flow.md reference. +**MANDATORY**: For each task in the Consumed Task Set, route agents by task filename pattern from monorepo-flow.md reference. ### Agent Routing Table @@ -144,7 +159,18 @@ After all task cycles finish, run verification agents **in parallel** before the - Re-run only the failed verifiers (by the criteria in step 2) - Repeat until all pass or `blocked` β†’ Escalate to user -4. **All passed** β†’ Proceed to completion report +4. **All passed** β†’ Proceed to Final Cleanup + +## Final Cleanup + +Before the completion report, delete the implementation task files this recipe consumed. Their work is committed; `docs/plans/` is ephemeral working state and is not retained between recipe runs: + +- Delete every file in the Consumed Task Set +- Delete every file matching `docs/plans/tasks/{plan-name}-phase*-completion.md` (the per-phase completion files generated by task-decomposer for this `{plan-name}`) +- Delete the corresponding `docs/plans/tasks/_overview-{plan-name}.md` if present +- Preserve the work plan itself (`docs/plans/{plan-name}.md`) β€” the user decides whether to delete it after final review + +If task files cannot be deleted (filesystem error), report the failure but do not block the completion report. ## Output Example Fullstack implementation phase completed. @@ -152,4 +178,5 @@ Fullstack implementation phase completed. 
- Implemented tasks: [number] tasks (backend: X, frontend: Y) - Quality checks: All passed - Commits: [number] commits created +- Cleanup: Task files removed from docs/plans/tasks/ diff --git a/skills/recipe-fullstack-implement/SKILL.md b/skills/recipe-fullstack-implement/SKILL.md index 37db7a4..44012f1 100644 --- a/skills/recipe-fullstack-implement/SKILL.md +++ b/skills/recipe-fullstack-implement/SKILL.md @@ -101,6 +101,17 @@ When user responds to questions: - Run quality-fixer (layer-appropriate) before every commit - Obtain user approval before Edit/Write/MultiEdit outside autonomous mode +### Implementation Readiness Check (between work-planner approval and task-decomposer) + +After work-planner completes and the user grants batch approval, before invoking task-decomposer, read the work plan header and find the line `Implementation Readiness: `. Apply this rule: + +| Status | Action | +|--------|--------| +| `ready` | Proceed to task-decomposer | +| `escalated` | Read the work plan's Readiness Report section, surface remaining gaps to the user via AskUserQuestion: "Implementation Readiness is `escalated` with the following remaining gaps: [list]. Continue execution? (y/n)". On `y` proceed; on `n` stop | +| `pending` | Present via AskUserQuestion: "Implementation Readiness is `pending`. Run `/recipe-prepare-implementation [plan-path]` first to verify the work plan is implementable, then resume. Continue without preflight? (y/n)". On `y` proceed; on `n` stop | +| absent (line missing) | Treat as `pending` β€” older work plans created before the readiness marker existed should be preflighted explicitly | + ## Scope Boundary for Subagents Append the following block to every subagent prompt invoked from this recipe: @@ -150,14 +161,26 @@ After all task cycles finish, run verification agents **in parallel** before the - Re-run only the failed verifiers (by the criteria in step 2) - Repeat until all pass or `blocked` β†’ Escalate to user -4. **All passed** β†’ Proceed to completion report +4. **All passed** β†’ Proceed to Final Cleanup + +### Final Cleanup + +Before the completion report, delete the implementation task files this recipe consumed. Their work is committed; `docs/plans/` is ephemeral working state and is not retained between recipe runs: + +- Delete every file matching `docs/plans/tasks/{plan-name}-backend-task-*.md` and `docs/plans/tasks/{plan-name}-frontend-task-*.md` (the `{plan-name}` derived from the work plan path used in this run) +- Delete every file matching `docs/plans/tasks/{plan-name}-phase*-completion.md` (the per-phase completion files generated by task-decomposer) +- Delete the corresponding `docs/plans/tasks/_overview-{plan-name}.md` if present +- Preserve the work plan itself (`docs/plans/{plan-name}.md`) β€” the user decides whether to delete it after final review + +If task files cannot be deleted (filesystem error), report the failure but do not block the completion report. 
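+
+A minimal sketch of the cleanup scope, using Node's fs for illustration (the orchestrator deletes with its own file tools; the helper name is hypothetical):
+
+```ts
+import { readdirSync, rmSync } from 'node:fs';
+import { join } from 'node:path';
+
+// Returns any files that could not be deleted so they can be reported without blocking.
+function finalCleanup(planName: string): string[] {
+  const dir = 'docs/plans/tasks';
+  const targets = readdirSync(dir).filter(
+    (f) =>
+      f.startsWith(`${planName}-backend-task-`) ||
+      f.startsWith(`${planName}-frontend-task-`) ||
+      (f.startsWith(`${planName}-phase`) && f.endsWith('-completion.md')) ||
+      f === `_overview-${planName}.md`,
+  );
+  const failed: string[] = [];
+  for (const f of targets) {
+    try {
+      rmSync(join(dir, f)); // the work plan in docs/plans/ itself is never touched
+    } catch {
+      failed.push(f);
+    }
+  }
+  return failed;
+}
+```
+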
### Test Information Communication After acceptance-test-generator execution, when invoking work-planner (subagent_type: "dev-workflows:work-planner"), communicate: - Generated integration test file path (from `generatedFiles.integration`) -- Generated E2E test file path or null (from `generatedFiles.e2e`) -- E2E absence reason (from `e2eAbsenceReason`, when E2E is null) -- Explicit note that integration tests are created simultaneously with implementation, E2E tests are executed after all implementations (when E2E path is provided) +- Generated fixture-e2e test file path or null (from `generatedFiles.fixtureE2e`) +- Generated service-integration-e2e test file path or null (from `generatedFiles.serviceE2e`) +- Per-lane E2E absence reason (from `e2eAbsenceReason.fixtureE2e` and `e2eAbsenceReason.serviceE2e`, when each lane is null) +- Explicit note: integration tests are created simultaneously with implementation, fixture-e2e tests are created alongside the UI feature phase, service-integration-e2e tests are executed only in the final phase ## Execution Method diff --git a/skills/recipe-implement/SKILL.md b/skills/recipe-implement/SKILL.md index 2ea9d3b..c6f0ad0 100644 --- a/skills/recipe-implement/SKILL.md +++ b/skills/recipe-implement/SKILL.md @@ -81,6 +81,19 @@ When user responds to questions: - Run quality-fixer before every commit - Obtain user approval before Edit/Write/MultiEdit outside autonomous mode +### Implementation Readiness Check (between work-planner approval and task-decomposer) + +After work-planner completes and the user grants batch approval, before invoking task-decomposer, read the work plan header and find the line `Implementation Readiness: `. Apply this rule: + +| Status | Action | +|--------|--------| +| `ready` | Proceed to task-decomposer | +| `escalated` | Read the work plan's Readiness Report section, surface remaining gaps to the user via AskUserQuestion: "Implementation Readiness is `escalated` with the following remaining gaps: [list]. Continue execution? (y/n)". On `y` proceed; on `n` stop | +| `pending` | Present via AskUserQuestion: "Implementation Readiness is `pending`. Run `/recipe-prepare-implementation [plan-path]` first to verify the work plan is implementable, then resume. Continue without preflight? (y/n)". On `y` proceed; on `n` stop | +| absent (line missing) | Treat as `pending` β€” older work plans created before the readiness marker existed should be preflighted explicitly | + +This check applies to all scales (Small / Medium / Large) because recipe-implement is the scale-agnostic orchestrator. + ## Scope Boundary for Subagents Append the following block to every subagent prompt invoked from this recipe: @@ -129,14 +142,26 @@ After all task cycles finish, run verification agents **in parallel** before the - Re-run only the failed verifiers (by the criteria in step 2) - Repeat until all pass or `blocked` β†’ Escalate to user -4. **All passed** β†’ Proceed to completion report +4. **All passed** β†’ Proceed to Final Cleanup + +### Final Cleanup + +Before the completion report, delete the implementation task files this recipe consumed. 
Their work is committed; `docs/plans/` is ephemeral working state and is not retained between recipe runs: + +- Delete every file matching `docs/plans/tasks/{plan-name}-task-*.md` (the `{plan-name}` derived from the work plan path used in this run) +- Delete every file matching `docs/plans/tasks/{plan-name}-phase*-completion.md` (the per-phase completion files generated by task-decomposer) +- Delete the corresponding `docs/plans/tasks/_overview-{plan-name}.md` if present +- Preserve the work plan itself (`docs/plans/{plan-name}.md`) β€” the user decides whether to delete it after final review + +If task files cannot be deleted (filesystem error), report the failure but do not block the completion report. ### Test Information Communication After acceptance-test-generator execution, when invoking work-planner (subagent_type: "dev-workflows:work-planner"), communicate: - Generated integration test file path (from `generatedFiles.integration`) -- Generated E2E test file path or null (from `generatedFiles.e2e`) -- E2E absence reason (from `e2eAbsenceReason`, when E2E is null) -- Explicit note that integration tests are created simultaneously with implementation, E2E tests are executed after all implementations (when E2E path is provided) +- Generated fixture-e2e test file path or null (from `generatedFiles.fixtureE2e`) +- Generated service-integration-e2e test file path or null (from `generatedFiles.serviceE2e`) +- Per-lane E2E absence reason (from `e2eAbsenceReason.fixtureE2e` and `e2eAbsenceReason.serviceE2e`, when each lane is null) +- Explicit note: integration tests are created simultaneously with implementation, fixture-e2e tests are created alongside the UI feature phase, service-integration-e2e tests are executed only in the final phase ## Execution Method diff --git a/skills/recipe-plan/SKILL.md b/skills/recipe-plan/SKILL.md index bb0fecd..d6844f5 100644 --- a/skills/recipe-plan/SKILL.md +++ b/skills/recipe-plan/SKILL.md @@ -38,20 +38,21 @@ Follow the planning process below: - Check for existence of design documents, notify user if none exist - Present options if multiple exist (can be specified with $ARGUMENTS) -### Step 2: E2E Test Skeleton Generation Confirmation - - Confirm with user whether to generate E2E test skeleton first - - If user wants generation: Generate test skeleton with acceptance-test-generator +### Step 2: Test Skeleton Generation Confirmation + - Confirm with user whether to generate test skeletons (integration + E2E lanes) first + - If user wants generation: invoke acceptance-test-generator - Pass generation results to next process according to subagents-orchestration-guide skill coordination specification ### Step 3: Work Plan Creation Invoke work-planner using Agent tool: - `subagent_type`: "dev-workflows:work-planner" - `description`: "Work plan creation" -- If test skeletons were generated in Step 2: - - When `generatedFiles.e2e` is not null: - `prompt`: "Create work plan from Design Doc at [path]. Integration test file: [integration test path]. E2E test file: [E2E test path]. Integration tests are created simultaneously with each phase implementation, E2E tests are executed only in final phase." - - When `generatedFiles.e2e` is null: - `prompt`: "Create work plan from Design Doc at [path]. Integration test file: [integration test path]. No E2E test skeletons were generated (reason: [e2eAbsenceReason]). Integration tests are created simultaneously with each phase implementation." 
+- If test skeletons were generated in Step 2, build the prompt by listing every lane's status: + - Always include: "Integration test file: [path or 'not generated']" + - For each E2E lane (`fixtureE2e`, `serviceE2e`): + - When `generatedFiles.[lane]` is not null: "[lane] test file: [path]" + - When `generatedFiles.[lane]` is null: "No [lane] skeleton generated (reason: [e2eAbsenceReason for that lane])" + - Append placement guidance: "Integration tests are created simultaneously with each phase implementation. fixture-e2e tests are created alongside the UI feature phase. service-integration-e2e tests are executed only in the final phase." - If test skeletons were not generated: `prompt`: "Create work plan from Design Doc at [path]." diff --git a/skills/recipe-prepare-implementation/SKILL.md b/skills/recipe-prepare-implementation/SKILL.md new file mode 100644 index 0000000..43b1b8f --- /dev/null +++ b/skills/recipe-prepare-implementation/SKILL.md @@ -0,0 +1,192 @@ +--- +name: recipe-prepare-implementation +description: Verifies the work plan is implementable end-to-end and resolves verification-lane / fixture / E2E-environment gaps before the build phase begins. Use when "implement-ready/verification readiness/lane setup/E2E environment missing" is mentioned, or before any build phase begins on a work plan whose readiness has not been preflight-checked. +disable-model-invocation: true +--- + +**Context**: Optional readiness phase between work-plan approval and recipe-*-build. Confirms the implementation will be observable from Phase 1 onward and resolves any gaps via Phase 0 tasks. Exits no-op when the readiness criteria already pass, so the recipe is safe to invoke unconditionally. + +## Orchestrator Definition + +**Core Identity**: "I am an orchestrator." (see subagents-orchestration-guide skill) + +**Execution Protocol**: +1. **Delegate all work through Agent tool** β€” invoke sub-agents, pass deliverable paths between them, and report results (permitted tools: see subagents-orchestration-guide "Orchestrator's Permitted Tools") +2. **Self-contained scope**: When gaps are found, this recipe BOTH generates resolution tasks AND executes them through the standard 4-step cycle. Recipe completes only when readiness criteria pass or remaining gaps are escalated. +3. **No-op exit**: When the readiness scan finds no failing criteria, generate no resolution tasks and exit immediately. The only file modifications in this branch are to the work plan itself β€” promoting the `Implementation Readiness:` header to `ready` and persisting the Readiness Report section. No code or test files are touched. + +Work plan: $ARGUMENTS + +## When This Recipe Applies + +Run before any recipe-*-build invocation when ANY of the following hold: +- Work plan was created from a Design Doc whose Verification Strategy references commands, files, functions, or endpoints not yet present in the codebase +- Work plan includes E2E test skeletons (seed data, auth fixture, environment variables, or external mocks may be unaddressed) +- Work plan touches UI components without a fixture entry or development route to render their visual states +- The team has not previously confirmed the local lane runs end-to-end for this feature area + +When none of the above hold, the readiness scan in Step 2 will find zero failing criteria and the recipe exits no-op (see Context at the top of this skill). + +## Readiness Criteria + +Each criterion is a measurable check producing `pass`, `fail`, or `not_applicable` with cited evidence.
+ +| ID | Criterion | Pass evidence | +|----|-----------|---------------| +| R1 | Verification Strategy references resolve | Every command, file path, function, endpoint, and test referenced in the work plan's Verification Strategy section either exists in the codebase (verified via Glob/Grep) or is the deliverable of a task already in this plan | +| R2 | E2E preconditions addressed | When E2E skeletons exist: every precondition mentioned in skeleton comments (seed data, auth fixture, env var, external mock) is present in the codebase or covered by a Phase 0 task in this plan | +| R3 | Phase 1 observability | The first implementation phase contains at least one task whose Operation Verification Methods can execute at task completion using only artifacts that exist before the task starts (existing code, prior Phase 0 task deliverables, or the task's own outputs) | +| R4 | UI rendering surface | When the plan implements UI components: a fixture entry, dev route, Storybook story, or equivalent rendering surface exists for the impacted components, OR a Phase 0 task adds one | +| R5 | Local lane procedure | The work plan or a referenced doc records the commands needed to start the system locally for manual verification (start commands, default ports, seed steps) | + +R4 and R5 are evaluated only when their triggering signals appear in the work plan; otherwise mark `not_applicable`. + +## Pre-execution Prerequisites + +```bash +# Verify the work plan exists +! ls -la docs/plans/*.md | grep -v template | tail -5 +``` + +**State check**: +- Work plan exists β†’ Proceed to Step 1 +- No work plan β†’ Stop and report: "An approved work plan is required. Complete the upstream planning phase first, then re-invoke this recipe." + +## Execution Flow + +### Step 1: Load Inputs + +Read the work plan path passed in `$ARGUMENTS`. Extract: +- Verification Strategy section (Correctness Proof Method + Early Verification Point) +- Quality Assurance Mechanisms table +- Design-to-Plan Traceability table +- Test skeleton references listed in the plan header +- Phase structure with each phase's tasks +- Referenced Design Doc(s) and UI Spec (when present) + +### Step 2: Readiness Scan + +For each criterion R1–R5: +1. Execute the scan defined in Readiness Criteria using Read / Glob / Grep +2. Record the result: `pass` / `fail` / `not_applicable` +3. Cite evidence: file:line for `pass`, the unresolved reference for `fail`, the missing trigger signal for `not_applicable` + +Build the Readiness Report (see Output Format) regardless of outcome. + +### Step 3: No-op Check + +When every applicable criterion is `pass` (zero `fail`): +- Append (or replace, if already present) a `## Implementation Readiness Report` section in the work plan immediately after the header block, using the same Readiness Report markdown defined in Output Format below +- Update the work plan header `Implementation Readiness:` line to `ready` (insert it after `Related Issue/PR:` if absent) +- Present the Readiness Report to the user +- Exit with `outcome: ready, gaps_resolved: 0` +- The work plan modifications above are the only file modifications in this branch + +When one or more criteria are `fail` β†’ proceed to Step 4. + +### Step 4: Plan Resolution Tasks + +For each `fail` criterion: +1. Determine the smallest concrete task that closes the gap (examples: "Add fixture entry for ComponentX covering loading/empty/error states", "Add seed script for E2E user fixtures", "Document local startup commands in docs/run/local.md") +2. 
Decide the task's **layer** by matching every target file path against the markers below: + - **backend** when every target file path matches one of: `**/api/**`, `**/server/**`, `**/services/**`, `**/backend/**`, `**/handlers/**`, `**/repositories/**` + - **frontend** when every target file path matches one of: `**/components/**`, `**/pages/**`, `**/web/**`, `**/frontend/**`, `**/*.tsx`, `**/*.jsx` + - **mixed** (target files span both backend and frontend markers) β†’ escalate to user; ask the user to split the gap into per-layer tasks + - **unrecognized** (any target file matches neither backend nor frontend markers β€” e.g., `docs/**`, `scripts/**`, root-level configs, fixture data files outside the markers above) β†’ escalate to user; ask the user to either (a) decide which layer's executor / quality-fixer should run the task, or (b) update the markers if the project uses different paths + + Apply the rules in the order above. The first matching rule wins; "unrecognized" is the final fallback rather than a catch-all that defaults to backend. +3. Create a Phase 0 task file at `docs/plans/tasks/{plan-name}-backend-task-prep-{NN}.md` (backend) or `docs/plans/tasks/{plan-name}-frontend-task-prep-{NN}.md` (frontend) using the task template from documentation-criteria skill. The `-task-prep-` segment lets recipe-prepare-implementation distinguish prep tasks from implementation tasks while keeping the existing `{plan-name}-{layer}-task-*` matcher used by other recipes +4. Update the work plan to insert these tasks as Phase 0 (before Phase 1) + +Present the proposed resolution task list to the user with AskUserQuestion. Proceed only after explicit approval β€” this is the single human gate inside this recipe. + +### Step 5: Execute Resolution Tasks + +For each resolution task, run the standard 4-step cycle (see subagents-orchestration-guide "Task Management: 4-Step Cycle"): + +1. **Agent tool** β€” route by filename layer segment: + - `*-backend-task-prep-*` β†’ `subagent_type: "dev-workflows:task-executor"` + - `*-frontend-task-prep-*` β†’ `subagent_type: "dev-workflows-frontend:task-executor-frontend"` + - Filename without a recognized layer segment β†’ escalate (the file should not exist; Step 4 prevents this) +2. Check escalation per orchestration-guide +3. **quality-fixer** β€” route by the same filename layer segment: + - `*-backend-task-prep-*` β†’ `"dev-workflows:quality-fixer"` + - `*-frontend-task-prep-*` β†’ `"dev-workflows-frontend:quality-fixer-frontend"` +4. **Commit** when quality-fixer returns `approved` + +Append the Scope Boundary block (below) to every subagent prompt. + +### Step 6: Re-scan, Persist Readiness Report, Update Header, Cleanup, Exit + +1. **Re-scan**: Re-run the Step 2 readiness scan after all resolution tasks are committed. + +2. **Persist Readiness Report into work plan body**: Append (or replace, if already present) a `## Implementation Readiness Report` section in the work plan immediately after the header block. Use the same Readiness Report markdown defined in Output Format below. Downstream recipe-*-build / recipe-*-implement read this section when the header is `escalated` to surface remaining gaps to the user. + +3. 
**Update work plan header**: Locate the line `Implementation Readiness: pending` in the work plan and rewrite it based on the re-scan outcome: + + | Re-scan result | New header value | + |----------------|------------------| + | All applicable criteria `pass` | `Implementation Readiness: ready` | + | One or more `fail` remain | `Implementation Readiness: escalated` | + + If the line is absent (older work plan format), insert it after the `Related Issue/PR:` line. + +4. **Final Cleanup**: Delete every prep task file this recipe created for the current `{plan-name}` (`docs/plans/tasks/{plan-name}-backend-task-prep-*.md` and `docs/plans/tasks/{plan-name}-frontend-task-prep-*.md`) AND the phase-completion file generated for prep phases (`docs/plans/tasks/{plan-name}-phase0-completion.md` when present, since prep tasks live in Phase 0). Prep task files for other plans are out of scope β€” this recipe deletes only what it created for the current run. Their work is committed; `docs/plans/` is ephemeral working state and is not retained between recipe runs. The work plan itself is preserved for the downstream recipe-*-build / recipe-*-implement. + +5. **Exit**: + + | Re-scan result | Action | + |----------------|--------| + | All applicable criteria `pass` | Exit with `outcome: ready, gaps_resolved: N` and final Readiness Report | + | One or more `fail` remain | Exit with `outcome: escalated` β€” present remaining failures to the user with the next-action recommendation. Treat the re-scan as the terminal evaluation; further resolution requires the user to re-invoke this recipe with updated inputs. | + +## Scope Boundary for Subagents + +Append the following block to every subagent prompt invoked from this recipe: + +``` +Scope boundary for subagents: +Operate within the task scope and referenced files in the prompt. +Use loaded skills to execute that scope. +Escalate when the required fix or investigation falls outside that scope. +``` + +## Output Format + +Final report presented to the user at exit: + +``` +## Implementation Readiness Report + +Work plan: [path] +Outcome: ready | escalated +Gaps resolved: [N] + +### Readiness Criteria + +| ID | Result | Evidence | +|----|--------|----------| +| R1 | pass / fail / not_applicable | [file:line OR "missing: "] | +| R2 | ... | ... | +| R3 | ... | ... | +| R4 | ... | ... | +| R5 | ... | ... | + +### Resolution Tasks Executed (when gaps_resolved > 0) +- [task file path] β€” [one-line summary] β€” committed +- ... 
+ +### Remaining Gaps (when outcome is escalated) +- [criterion ID]: [unresolved reference] β€” Next action: [recommendation] +``` + +## Completion Criteria + +- [ ] Work plan loaded and Verification Strategy / E2E references / Phase structure extracted +- [ ] Readiness scan run with per-criterion result and evidence recorded +- [ ] No-op exit when all `pass`, OR resolution tasks generated, approved, and executed via the 4-step cycle +- [ ] Re-scan run after the last resolution task commits +- [ ] `## Implementation Readiness Report` section persisted into the work plan body +- [ ] Work plan header `Implementation Readiness:` line updated to `ready` or `escalated` +- [ ] Prep task files (and Phase 0 phase-completion file when generated) deleted from `docs/plans/tasks/` +- [ ] Final report presented to the user diff --git a/skills/recipe-review/SKILL.md b/skills/recipe-review/SKILL.md index 4280e72..1ef961c 100644 --- a/skills/recipe-review/SKILL.md +++ b/skills/recipe-review/SKILL.md @@ -16,11 +16,10 @@ disable-model-invocation: true - Compliance validation β†’ performed by code-reviewer - Security validation β†’ performed by security-reviewer -- Fix implementation β†’ performed by task-executor -- Quality checks β†’ performed by quality-fixer -- Re-validation β†’ performed by code-reviewer / security-reviewer +- **Code-side fix path**: Fix implementation β†’ task-executor; Quality checks β†’ quality-fixer; Re-validation β†’ code-reviewer / security-reviewer +- **Design-side update path**: DD revision β†’ technical-designer (update mode); DD review β†’ document-reviewer; cross-DD consistency β†’ design-sync (when multiple DDs exist); Re-validation β†’ code-reviewer -Orchestrator invokes sub-agents and passes structured JSON between them. +Orchestrator invokes sub-agents and passes structured JSON between them. The design-side path applies when the discrepancy reflects code that was correct but the Design Doc became stale, rather than code that violated the Design Doc. 
Design Doc (uses most recent if omitted): $ARGUMENTS @@ -65,36 +64,73 @@ Invoke security-reviewer using Agent tool: **Report both results independently using subagent output fields only**: +Before presenting to the user, the orchestrator computes a recommended route per finding using the rule below (this rule is internal β€” do not include it in the user-facing prompt): + +| Finding pattern | Recommended route | +|-----------------|-------------------| +| `dd_violation` where the code intent matches the original requirement but the Design Doc captured a different design | `d` (Design-side update) | +| `dd_violation` where the code drifted from a still-correct Design Doc | `c` (Code-side fix) | +| `reliability` / `security` / `maintainability` findings | `c` (Code-side fix) | + +Then present to the user (label each finding with its recommended route, grouped by route): + ``` Code Compliance: [complianceRate from code-reviewer] Verdict: [verdict from code-reviewer] Identifier Match Rate: [identifierMatchRate from code-reviewer] Acceptance Criteria: - [fulfilled] [item] (confidence: [high/medium/low]) - - [partially_fulfilled] [item]: [gap] β€” [suggestion] - - [unfulfilled] [item]: [gap] β€” [suggestion] + - [partially_fulfilled] [item]: [gap] β€” [suggestion] [recommended: c | d] + - [unfulfilled] [item]: [gap] β€” [suggestion] [recommended: c | d] Identifier Mismatches: - - [identifier]: DD=[designDocValue] Code=[codeValue] at [location] + - [identifier]: DD=[designDocValue] Code=[codeValue] at [location] [recommended: c | d] Quality Findings: - - [category] [location]: [description] β€” [rationale] + - [category] [location]: [description] β€” [rationale] [recommended: c] Security Review: [status from security-reviewer] Findings by category: - - [confirmed_risk] [location]: [description] β€” [rationale] - - [defense_gap] [location]: [description] β€” [rationale] - - [hardening] [location]: [description] β€” [rationale] - - [policy] [location]: [description] β€” [rationale] + - [confirmed_risk] [location]: [description] β€” [rationale] [recommended: c] + - [defense_gap] [location]: [description] β€” [rationale] [recommended: c] + - [hardening] [location]: [description] β€” [rationale] [recommended: c] + - [policy] [location]: [description] β€” [rationale] [recommended: c] Notes: [notes from security-reviewer, if present] -Execute fixes? (y/n): +Resolve discrepancies β€” confirm or override the recommended route per finding: + c) Code-side fix β€” code violates Design Doc; modify code to match + d) Design-side update β€” code is correct; Design Doc is stale, revise it + s) Skip β€” accept current state without changes ``` -If both pass and user selects `n`: Skip Steps 5-10, proceed to Step 11. +Use AskUserQuestion. The default offer is **"accept all recommended routes"** β€” a single confirmation for the typical case where the orchestrator's recommendations are correct. When the user wants to override, collect per-finding c/d/s decisions instead. If the user selects `s` for everything: skip Steps 5-10, proceed to Step 11. ### Step 5: Execute Skill Execute Skill: documentation-criteria (for task file template) +### Step 5d: Design-Side Update + +Run this step only when the user routed at least one finding to `d`. When all routes are `c` or `s`, skip directly to Step 6. + +1. 
Invoke technical-designer in update mode using Agent tool: + - `subagent_type`: "dev-workflows:technical-designer" + - `description`: "Design Doc update from review findings" + - `prompt`: "Update Design Doc at [path] in update mode. The implementation has diverged in the following ways that the team has decided to ratify in the design rather than in the code: [list of `d`-routed findings with codeLocation and designDocValue from $STEP_2_OUTPUT]. Reflect the current code behavior in the relevant sections and add a history entry." + +2. Invoke document-reviewer to verify the updated Design Doc: + - `subagent_type`: "dev-workflows:document-reviewer" + - `description`: "Document review of updated Design Doc" + - `prompt`: "Review updated Design Doc at [path] for consistency and completeness." + +3. When multiple Design Docs exist (`ls docs/design/*.md | grep -v template | wc -l > 1`), invoke design-sync: + - `subagent_type`: "dev-workflows:design-sync" + - `description`: "Cross-DD consistency check" + - `prompt`: "source_design: [updated DD path]. Detect conflicts across all Design Docs after the update." + - When `sync_status: conflicts_found`: present conflicts to the user; resolution requires re-invoking technical-designer for affected DDs. + +4. After Step 5d completes: + - If the user selected `d` for all findings (no `c` routes) β†’ skip Steps 6-8, proceed to Step 9 for re-validation + - If the user selected both `d` and `c` β†’ re-evaluate the `c`-routed findings against the updated DD and drop any that are now satisfied by the DD revision; then proceed to Step 6 with the remaining `c` findings + ### Step 6: Create Task File Create task file at `docs/plans/tasks/review-fixes-YYYYMMDD.md` @@ -119,7 +155,7 @@ Invoke quality-fixer using Agent tool: Invoke code-reviewer using Agent tool: - `subagent_type`: "dev-workflows:code-reviewer" - `description`: "Re-validate compliance" -- `prompt`: "Re-validate Design Doc compliance after fixes. Prior compliance issues: $STEP_2_OUTPUT. Verify each prior issue is resolved." +- `prompt`: "Re-validate Design Doc compliance after fixes. Prior compliance issues: $STEP_2_OUTPUT. Verify each prior issue is resolved (whether resolved code-side or design-side)." ### Step 10: Re-validate security-reviewer @@ -128,7 +164,15 @@ Invoke security-reviewer using Agent tool (only if security fixes were applied): - `description`: "Re-validate security" - `prompt`: "Re-validate security after fixes. Prior findings: $STEP_3_OUTPUT. Design Doc: [path]. Implementation files: [file list]." -### Step 11: Final Report +### Step 11: Final Cleanup and Report + +Delete the review-fix task file this recipe created (if any). Its work is committed; `docs/plans/` is ephemeral working state and is not retained between recipe runs: + +- Delete `docs/plans/tasks/review-fixes-YYYYMMDD.md` if it exists + +If the file cannot be deleted (filesystem error), report the failure but do not block the final report. 
+ +Then present the final report: ``` Code Compliance: @@ -142,9 +186,11 @@ Security Review: Remaining issues: - [items requiring manual intervention] + +Cleanup: review-fixes task file removed ``` -## Auto-fixable Items +## Auto-fixable Items (code-side path) - Simple unimplemented acceptance criteria - Error handling additions - Contract definition fixes @@ -154,10 +200,16 @@ Remaining issues: ## Non-fixable Items - Fundamental business logic changes - Architecture-level modifications -- Design Doc deficiencies - Committed secrets (blocked β†’ human intervention) -**Scope**: Design Doc compliance validation, security review, and auto-fixes. +## Design-Side Update Triggers +Discrepancies suitable for the design-side path (code is correct, DD became stale): +- Identifier renames where the new identifier reflects the team's current naming +- Behavioral changes that match the original requirement intent better than what the DD captured +- Component splits or merges where the new structure is sound and the DD documented the prior structure +- New ACs that the implementation already satisfies but the DD never enumerated + +**Scope**: Design Doc compliance validation, security review, code-side auto-fixes, and design-side update routing. ## Scope Boundary for Subagents diff --git a/skills/subagents-orchestration-guide/SKILL.md b/skills/subagents-orchestration-guide/SKILL.md index dd4f205..ca2e2c1 100644 --- a/skills/subagents-orchestration-guide/SKILL.md +++ b/skills/subagents-orchestration-guide/SKILL.md @@ -111,7 +111,7 @@ Autonomous execution MUST stop and wait for user input at these points. | Design | After design-sync completes consistency verification | Approve Design Doc | | Work Plan | After work-planner creates plan | Batch approval for implementation phase | -**After batch approval**: Autonomous execution proceeds without stops until completion or escalation +**After batch approval**: Autonomous execution proceeds without stops until completion or escalation. ## Scale Determination and Document Requirements | Scale | File Count | PRD | ADR | Design Doc | Work Plan | @@ -184,7 +184,7 @@ Subagents respond in JSON format. Key fields for orchestrator decisions: - **design-sync**: sync_status (synced/conflicts_found) - **integration-test-reviewer**: status (approved/needs_revision/blocked), requiredFixes - **security-reviewer**: status (approved/approved_with_notes/needs_revision/blocked), findings, notes, requiredFixes -- **acceptance-test-generator**: status, generatedFiles (integration: path|null, e2e: path|null), budgetUsage, e2eAbsenceReason (null when E2E emitted, otherwise: no_multi_step_journey|below_threshold_user_confirmed) +- **acceptance-test-generator**: status, generatedFiles.{integration,fixtureE2e,serviceE2e} (path|null per lane), budgetUsage per lane, e2eAbsenceReason per E2E lane (null when emitted; reason enum is owned by acceptance-test-generator and integration-e2e-testing skill) ## Handling Requirement Changes @@ -229,7 +229,9 @@ Always start with requirement-analyzer, then select the minimum planning flow re | Medium | requirement-analyzer β†’ codebase-analyzer β†’ optional UI Spec β†’ optional ADR β†’ Design Doc β†’ code-verifier β†’ document-reviewer β†’ design-sync β†’ acceptance-test-generator β†’ work-planner β†’ task-decomposer | | Small | requirement-analyzer β†’ work-planner | -After the planning flow completes and the user grants batch approval, execute the task execution cycle: `task-executor β†’ quality-fixer β†’ commit` for each task. 
See "Autonomous Execution Mode" below for full per-task details. At Small scale this cycle still applies β€” implementation runs through `task-executor`, not orchestrator-direct edits. +After the planning flow completes and the user grants batch approval, the work plan carries an `Implementation Readiness:` header (work-planner emits `pending`; promotion to `ready` or `escalated` is an external orchestration concern). External orchestration also decides when and how to act on this marker; this guide does not invoke any orchestrator above the agent layer. + +Then execute the task execution cycle: `task-executor β†’ quality-fixer β†’ commit` for each task. See "Autonomous Execution Mode" below for full per-task details. At Small scale this cycle still applies β€” implementation runs through `task-executor`, not orchestrator-direct edits. Each agent name in the chain is invoked via the Agent tool (per "Orchestrator's Permitted Tools" above). @@ -397,21 +399,13 @@ Register overall phases using TaskCreate. Update each phase with TaskUpdate as i #### HC-06: acceptance-test-generator β†’ work-planner - **Pass to acceptance-test-generator**: - - Design Doc: [path] - - UI Spec: [path] (if exists) + **Pass to acceptance-test-generator**: Design Doc path; UI Spec path (if exists). - **Orchestrator verification items**: - - Verify `generatedFiles.integration` is a valid path (when not null) and the file exists - - Verify `generatedFiles.e2e` is a valid path (when not null) and the file exists - - When `generatedFiles.e2e` is null, verify `e2eAbsenceReason` is present β€” this is intentional absence, not an error + **Orchestrator verification**: Every non-null `generatedFiles.` path exists on disk. For each null lane, `e2eAbsenceReason.` is present (intentional absence, not an error). - **Pass to work-planner**: - - Integration test file: [path] (create and execute simultaneously with each phase implementation) - - E2E test file: [path] or null (execute only in final phase, when provided) - - E2E absence reason: [reason] (when E2E is null β€” pass this so work-planner can skip E2E Gap Check for intentional absence) + **Pass to work-planner**: integration / fixture-e2e / service-integration-e2e file paths (or null per lane), per-lane absence reasons, plus timing guidance β€” integration tests are created alongside each phase implementation, fixture-e2e tests are created alongside the UI feature phase, service-integration-e2e tests are executed only in the final phase. - **On error**: Escalate to user if integration file generation failed unexpectedly (status != completed). E2E being null with a valid absence reason is not an error. + **On error**: Escalate to user when status != completed and integration file generation failed unexpectedly. A null E2E lane with a valid absence reason is not an error. 3. 
**ADR Status Management**: Update ADR status after user decision (Accepted/Rejected) diff --git a/skills/subagents-orchestration-guide/references/monorepo-flow.md b/skills/subagents-orchestration-guide/references/monorepo-flow.md index 4304e07..840c9dc 100644 --- a/skills/subagents-orchestration-guide/references/monorepo-flow.md +++ b/skills/subagents-orchestration-guide/references/monorepo-flow.md @@ -27,7 +27,7 @@ This reference defines the orchestration flow for projects spanning multiple lay | 11 | code-verifier | Verify **Frontend** Design Doc against existing code | Frontend verification | | 12 | document-reviewer Γ—2 | Review each Design Doc (with code-verifier results as `code_verification`) | Reviews | | 13 | design-sync | Cross-layer consistency verification (source: frontend Design Doc) **[Stop]** | Sync status | -| 14 | acceptance-test-generator | Integration/E2E test skeleton from cross-layer contracts | Test skeletons | +| 14 | acceptance-test-generator | Integration + fixture-e2e + service-integration-e2e test skeletons from cross-layer contracts (per-lane) | Test skeletons | | 15 | work-planner | Work plan from all Design Docs **[Stop: Batch approval]** | Work plan | ### Medium Scale Fullstack (3-5 Files) - 13 Steps @@ -45,7 +45,7 @@ This reference defines the orchestration flow for projects spanning multiple lay | 9 | code-verifier | Verify **Frontend** Design Doc against existing code | Frontend verification | | 10 | document-reviewer Γ—2 | Review each Design Doc (with code-verifier results as `code_verification`) | Reviews | | 11 | design-sync | Cross-layer consistency verification (source: frontend Design Doc) **[Stop]** | Sync status | -| 12 | acceptance-test-generator | Integration/E2E test skeleton from cross-layer contracts | Test skeletons | +| 12 | acceptance-test-generator | Integration + fixture-e2e + service-integration-e2e test skeletons from cross-layer contracts (per-lane) | Test skeletons | | 13 | work-planner | Work plan from all Design Docs **[Stop: Batch approval]** | Work plan | ### Parallelization in Multi-Agent Steps diff --git a/skills/test-implement/references/e2e.md b/skills/test-implement/references/e2e.md index 573f765..47cdbe5 100644 --- a/skills/test-implement/references/e2e.md +++ b/skills/test-implement/references/e2e.md @@ -1,5 +1,16 @@ # E2E Test Implementation with Playwright +## Lane Selection + +E2E tests in this workflow split into two lanes (defined in integration-e2e-testing skill): + +| Lane | Backend setup | Use these patterns | +|------|---------------|-------------------| +| **fixture-e2e** | Mocked via `page.route()` or fixture loaders; no live services | Page Object Pattern, Locator Strategy, Assertions, the **Fixture-Based Backend** section below | +| **service-integration-e2e** | Live local stack with real services | All patterns above PLUS the **E2E Environment Prerequisites** section (seed data, auth fixture against real auth flow) | + +The skeleton's `@lane:` annotation declares which lane the test belongs to. Choose implementation patterns to match. 
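+
+When the two lanes need to run (and gate CI) independently, one option is to key separate Playwright projects off the file-name conventions defined under Naming Conventions below. A minimal sketch, assuming the default `tests/e2e` layout; the project names are illustrative:
+
+```typescript
+// playwright.config.ts (hypothetical lane split keyed off test file suffixes)
+import { defineConfig } from '@playwright/test'
+
+export default defineConfig({
+  testDir: 'tests/e2e',
+  projects: [
+    {
+      // fixture-e2e: mocked backend, deterministic, safe to run on every push
+      name: 'fixture-e2e',
+      testMatch: /.*\.fixture\.e2e\.test\.ts/,
+    },
+    {
+      // service-integration-e2e: requires the live local stack with seed data and real auth
+      name: 'service-integration-e2e',
+      testMatch: /.*\.service\.e2e\.test\.ts/,
+    },
+  ],
+})
+```
+
+A single lane can then be run in isolation, e.g. `npx playwright test --project=fixture-e2e`.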
+ ## Test Framework - **Playwright Test**: `@playwright/test` - Test imports: `import { test, expect } from '@playwright/test'` @@ -10,18 +21,23 @@ ``` tests/ └── e2e/ - β”œβ”€β”€ pages/ # Page objects + β”œβ”€β”€ pages/ # Page objects (shared across lanes) β”‚ β”œβ”€β”€ login.page.ts β”‚ └── dashboard.page.ts - β”œβ”€β”€ fixtures/ # Test fixtures + β”œβ”€β”€ fixtures/ # Test fixtures (auth, seed) β”‚ └── auth.fixture.ts - └── *.e2e.test.ts # Test files + β”œβ”€β”€ data/ # Static fixture data for fixture-e2e + β”‚ └── *.fixture.json + β”œβ”€β”€ *.fixture.e2e.test.ts # fixture-e2e test files + └── *.service.e2e.test.ts # service-integration-e2e test files ``` ### Naming Conventions -- Test files: `{FeatureName}.e2e.test.ts` +- fixture-e2e files: `{FeatureName}.fixture.e2e.test.ts` +- service-integration-e2e files: `{FeatureName}.service.e2e.test.ts` - Page objects: `{PageName}.page.ts` - Fixtures: `{Purpose}.fixture.ts` +- Static fixture data: `{scenario}.fixture.json` ## Page Object Pattern @@ -102,9 +118,46 @@ export const test = base.extend<{ authenticatedPage: Page }>({ }) ``` -## E2E Environment Prerequisites +## Fixture-Based Backend (fixture-e2e) + +fixture-e2e tests run a real browser against deterministic fixtures β€” no live backend, no DB, no external services. Use one of these patterns to fake the network: + +### Pattern A: page.route() interception + +```typescript +test('Dismiss-then-Undo restores card', async ({ page }) => { + // Arrange: intercept all backend calls with deterministic responses + await page.route('**/api/cards', async (route) => { + await route.fulfill({ json: cardsFixture }) + }) + await page.route('**/api/cards/*/dismiss', async (route) => { + await route.fulfill({ status: 204 }) + }) + + await page.goto('/cards') + await page.getByRole('button', { name: 'Dismiss' }).first().click() + await page.getByRole('button', { name: 'Undo' }).click() + + await expect(page.getByText(cardsFixture[0].title)).toBeVisible() +}) +``` + +### Pattern B: Fixture loader injection + +```typescript +// data/cards-with-dismiss.fixture.json β€” committed alongside the test +// Loaded via a route helper or app-level test mode +``` -E2E tests require a running application with real data state. Unlike unit/integration tests, environment setup is part of E2E test implementation scope. +**Principles for fixture-e2e**: +- Backend is faked, not running. No `npm run start:backend` required to execute these tests +- Fixtures are versioned in the repo (`tests/e2e/data/`) so tests are deterministic across machines +- Auth, when needed, is faked too (set a test cookie via `page.context().addCookies()` or use a fixture-mode bypass) +- These tests run in CI without provisioning external infrastructure + +## E2E Environment Prerequisites (service-integration-e2e) + +service-integration-e2e tests require a running application with real data state. Unlike fixture-e2e, environment setup is part of test implementation scope. 
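+
+Before the seed and auth steps below, it can help to fail fast when the live stack is simply not up. A minimal sketch of a preflight global setup for the service-integration-e2e project; the `E2E_BASE_URL` variable and the `/health` endpoint are illustrative assumptions, not part of this workflow's contract:
+
+```typescript
+// global-setup.service-e2e.ts (hypothetical preflight for the service-integration-e2e lane)
+import { request, type FullConfig } from '@playwright/test'
+
+export default async function globalSetup(config: FullConfig) {
+  // Resolve the target URL from the project config or an E2E_* environment variable
+  const baseURL = config.projects[0]?.use.baseURL ?? process.env.E2E_BASE_URL
+  if (!baseURL) {
+    throw new Error('service-integration-e2e: baseURL / E2E_BASE_URL is not set')
+  }
+
+  // Assumed readiness probe; replace with whatever health check the application exposes
+  const api = await request.newContext({ baseURL })
+  const health = await api.get('/health')
+  if (!health.ok()) {
+    throw new Error(`service-integration-e2e: app at ${baseURL} is not healthy (${health.status()})`)
+  }
+  await api.dispose()
+}
+```
+
+Wire it in as the setup step for the service lane only (for example via the config's `globalSetup` option or a dedicated setup project), so fixture-e2e runs stay unaffected.
+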
### Seed Data Strategy @@ -163,16 +216,16 @@ export const test = base.extend<{ playerPage: Page }>({ - Store test credentials in environment variables only (`E2E_*` prefixed) - If the auth flow requires specific user records, seed them in the fixture -### Environment Checklist +### Environment Checklist (service-integration-e2e only) -Before E2E tests can pass, verify: +Before service-integration-e2e tests can pass, verify: - [ ] Application is running and accessible at `baseURL` - [ ] Database has required seed data (test users, subscriptions, content) - [ ] Authentication flow works with test credentials - [ ] Environment variables are set (`E2E_*` prefixed) -- [ ] External services are either available or mocked via `page.route()` +- [ ] External services are either available or stubbed -When the work plan includes dedicated environment setup tasks (Phase 0), follow those tasks. When no setup tasks exist in the plan, address missing prerequisites as part of the E2E test implementation task itself. +When the work plan includes dedicated environment setup tasks (Phase 0 β€” see work-planner E2E Environment Prerequisites extraction), follow those tasks. When no setup tasks exist in the plan, address missing prerequisites as part of the test implementation task itself, OR consider whether the verification could move to fixture-e2e instead. ## Locator Strategy @@ -235,18 +288,36 @@ test.describe('responsive navigation', () => { ## Skeleton Comment Format -E2E test skeletons follow the same annotation format as integration tests (adapt comment syntax to the project's language): +E2E test skeletons follow the same annotation format as integration tests (adapt comment syntax to the project's language). The `@lane` annotation routes the test to the correct implementation patterns. 
+ +### fixture-e2e example +```typescript +// AC: [Original acceptance criteria text] +// Behavior: [User action] β†’ [System response] β†’ [Observable result in browser] +// @category: fixture-e2e +// @lane: fixture-e2e +// @dependency: full-ui (mocked backend) +// @complexity: medium +// ROI: [score] +test('AC1: [Description]', async ({ page }) => { + // Arrange: load fixture data, intercept network + // Act: user interaction + // Assert: observable browser state +}) +``` +### service-integration-e2e example ```typescript // AC: [Original acceptance criteria text] -// Behavior: [User action] β†’ [System response] β†’ [Observable result] -// @category: e2e +// Behavior: [User action] β†’ [System response across services] β†’ [Observable cross-service result] +// @category: service-integration-e2e +// @lane: service-integration-e2e // @dependency: full-system // @complexity: high // ROI: [score] -test('AC1: [Description]', async ({ page }) => { - // Arrange: [Setup description] - // Act: [Action description] - // Assert: [Verification description] +test('AC1: [Description]', async ({ page, request }) => { + // Arrange: seed real data, real auth + // Act: user interaction + // Assert: observable result + cross-service evidence (DB row, downstream event) }) ``` diff --git a/skills/test-implement/references/frontend.md b/skills/test-implement/references/frontend.md index e605e28..ff3098f 100644 --- a/skills/test-implement/references/frontend.md +++ b/skills/test-implement/references/frontend.md @@ -19,9 +19,15 @@ ### Coverage Requirements (ADR-0002 Compliant) **Component-specific targets**: + +When the project adopts Atomic Design (atoms / molecules / organisms layering): - Atoms (Button, Text, etc.): 70% or higher - Molecules (FormField, etc.): 65% or higher - Organisms (Header, Footer, etc.): 60% or higher + +When the project uses a different component architecture (Feature-based, Container-Presenter, etc.): apply 60% as the baseline and raise the target for foundational/leaf components (those reused across many features) to 70%. + +Component-architecture-independent targets: - Custom Hooks: 65% or higher - Utils: 70% or higher diff --git a/skills/typescript-rules/SKILL.md b/skills/typescript-rules/SKILL.md index 99a6f50..947ab7b 100644 --- a/skills/typescript-rules/SKILL.md +++ b/skills/typescript-rules/SKILL.md @@ -62,7 +62,7 @@ function isUser(value: unknown): value is User { **Component Design Criteria** - **Function components only**: Official React recommendation, optimizable by modern tooling (Exception: Error Boundary requires class component) - **Custom Hooks**: Standard pattern for logic reuse and dependency injection -- **Component Hierarchy**: Atoms β†’ Molecules β†’ Organisms β†’ Templates β†’ Pages +- **Component Hierarchy**: Use the project's adopted component architecture. When the project uses Atomic Design: Atoms β†’ Molecules β†’ Organisms β†’ Templates β†’ Pages. When the project uses Feature-based, Container-Presenter, or another structure: follow that structure consistently and document the chosen layering in the project README or design doc - **Co-location**: Place tests, styles, and related files alongside components **State Management Patterns**