diff --git a/{{cookiecutter.project_slug}}/.github/workflows/test.yml b/{{cookiecutter.project_slug}}/.github/workflows/test.yml index 0c12862..59f9f97 100644 --- a/{{cookiecutter.project_slug}}/.github/workflows/test.yml +++ b/{{cookiecutter.project_slug}}/.github/workflows/test.yml @@ -6,9 +6,6 @@ on: - main pull_request: -env: - UV_SYSTEM_PYTHON: "1" - jobs: quality: runs-on: ubuntu-latest @@ -30,8 +27,15 @@ jobs: curl -LsSf https://astral.sh/uv/install.sh | sh echo "$HOME/.local/bin" >> "$GITHUB_PATH" - - name: Sync dependencies - run: uv sync --dev + - name: Verify uv installation + run: | + uv --version + which uv + + - name: Install dependencies (workspace) + run: | + uv sync --dev --all-packages + uv pip list - name: Ruff lint run: uv run ruff check src/ tests/ diff --git a/{{cookiecutter.project_slug}}/CLAUDE.md b/{{cookiecutter.project_slug}}/CLAUDE.md index 027854d..bcb549e 100644 --- a/{{cookiecutter.project_slug}}/CLAUDE.md +++ b/{{cookiecutter.project_slug}}/CLAUDE.md @@ -1,144 +1,780 @@ -# Development Rules +# CellSem Agentic Workflow - Development Guide -## Test-Driven Development (MANDATORY) -1. Write unit and integration tests FIRST -2. Tests must fail initially (red) -3. Commit tests before implementation -4. Write minimal code to pass tests (green) -5. Refactor while keeping tests green, commit +**Template for building robust agentic workflows with integrated validation** -## TDD Workflow Commands (using uv) +This CLAUDE.md should be copied to each new agentic workflow project and customized for that project's specifics. + +--- + +## Development Philosophy + +### Core Principle: Scope Rings + +**Every project follows this sequence:** + +``` +Ring 0 (MVP - Ship First): Core value proposition +Ring 1 (After validation): User-requested enhancements +Ring 2 (If valuable): Advanced features +Ring 3 (Speculative): Experiments +``` + +**RULE: Cannot work on Ring N+1 until Ring N is shipped and validated with users** + +**Timeline:** +(treat week numbers as relative timings/durations here - actual agentic dev may be faster) +- Week 0: Validate constraints +- Week 1-2: Build Ring 0 +- Week 2-3: Ship & get user feedback +- Week 4+: Iterate based on feedback + +--- + +## Understanding the Scaffold + +This template provides **infrastructure** and **optional patterns**. See `SCAFFOLD_GUIDE.md` for complete decision trees. + +### Infrastructure (Keep Always) + +These prevent technical debt and ensure consistency: + +- ✅ **JSON schemas in `schemas/`** - Schema-first design (generate Pydantic models programmatically) +- ✅ **YAML prompts co-located with agents/services** - Declarative, versionable prompts +- ✅ **Prompt naming convention**: `{agent_name}.prompt.yaml` or `{service_name}.{purpose}.prompt.yaml` + - Examples: `agents/annotator.prompt.yaml`, `services/deepsearch.query.prompt.yaml` + - Easy to find: `find . 
-name "*.prompt.yaml"` + - Easy to review: `git diff **/*.prompt.yaml` +- ✅ **Test structure** - `unit/` and `integration/` with pytest markers +- ✅ **Tooling configs** - pytest, ruff, mypy, sphinx in `pyproject.toml` +- ✅ **Dotenv bootstrap** - Environment management via `.env` files + +### Optional Components (Evaluate for Ring 0) + +These are proven patterns - use if Ring 0 needs them, otherwise DELETE: + +- **`graphs/`** - Multi-step workflow orchestration + - Keep if: Complex branching workflows needed + - Delete if: Single agent or linear flow sufficient + - See: `src/{{cookiecutter.package_name}}/graphs/README.md` + +- **`validation/`** - Cross-cutting validation logic + - Keep if: Shared validation across 2+ services + - Delete if: Simple Pydantic validation sufficient + - See: `src/{{cookiecutter.package_name}}/validation/README.md` + +- **`agents/example_agent.py`** - Example agent with infrastructure patterns + - EXAMPLE code: Replace with your domain logic + - INFRASTRUCTURE patterns: Keep schema-first, prompt-first approach + +- **`deep-research-client` dependency** + - Keep if: Using Perplexity/deep research in Ring 0 + - Delete if: Not needed for MVP (remove from `pyproject.toml`) + +### Week 0 Includes Scaffold Review + +1. Define Ring 0 scope (update sections below) +2. Review `SCAFFOLD_GUIDE.md` decision trees +3. **Delete unused optional components** (graphs/, validation/, etc.) +4. **Remove unused dependencies** (deep-research-client if not needed) +5. Keep infrastructure, replace example code with domain logic +6. Update README.md with your project description + +**Key Principle**: Infrastructure ≠ premature abstraction. Schema files and prompt files are infrastructure that prevent technical debt. + +### Prompt File Organization + +**INFRASTRUCTURE**: Always store prompts in separate YAML files (not hardcoded in code). + +**Location**: Co-locate prompts with the agents/services that use them. + +**Naming Convention**: Use `.prompt.yaml` suffix for easy discoverability: +- `{agent_name}.prompt.yaml` - For single-purpose agents +- `{service_name}.{purpose}.prompt.yaml` - For services with multiple prompts + +**Examples**: +- `src/{{cookiecutter.package_name}}/agents/annotator.prompt.yaml` +- `src/{{cookiecutter.package_name}}/services/deepsearch.query.prompt.yaml` +- `src/{{cookiecutter.package_name}}/services/deepsearch.summary.prompt.yaml` + +**Benefits**: +- Easy discovery: `find . -name "*.prompt.yaml"` +- Clear ownership: prompt lives next to implementation +- Easy review: `git diff **/*.prompt.yaml` +- Grepable: search for prompt changes across project +- Version controlled: track prompt evolution in git + +**Pattern**: +```yaml +# Co-located with agent/service that uses it +system_prompt: | + You are an AI assistant specialized in {domain}. + +user_prompt: | + Process this {task_type}: {input_data} + +presets: + openai-gpt4: + provider: openai + model: gpt-4 + temperature: 0.1 +``` + +**Load in code**: +```python +from pathlib import Path +import yaml + +def load_prompt(prompt_file: str) -> dict: + """Load co-located prompt file.""" + prompt_path = Path(__file__).parent / prompt_file + return yaml.safe_load(prompt_path.read_text()) + +# Usage +prompt_config = load_prompt("my_agent.prompt.yaml") +``` + +--- + +## Project-Specific Configuration + +**[CUSTOMIZE THIS SECTION FOR EACH PROJECT]** + +### Ring 0 Scope (MVP) + +- [ ] Core feature 1 +- [ ] Core feature 2 +- [ ] Basic output format + +**STOP after Ring 0. Share with users. 
Get feedback.** + +### Ring 1 Scope (After User Validation) + +- [ ] TBD based on user feedback + +### Architecture Vision + + +**Core design:** +- Service pattern: [describe] +- Schema-first: Pydantic models from JSON schema +- Configuration: [describe preset system] + +**What NOT to do yet:** +- ❌ Don't add multi-provider support unless clearly stated use case (wait for 2nd provider need) +- ❌ Don't build abstract base classes (wait for 2+ concrete cases) +- ❌ Don't optimize for scale (wait for scale problems) BUT do warn about any poorly scaling or costly anti-patterns (e.g. multiple LLM calls passing the same info) + +### Known Constraints + + + +**Example:** +```markdown +## Perplexity deep research API (Tested YYYY-MM-DD) +- ❌ Does NOT respect JSON schema in system prompt +- ✅ DOES work with schema in user message as part of request for larger report +- Tested: 10/10 successful parses +``` + +--- + +## Week 0: Validation Phase (REQUIRED) + +**Before writing production code:** + +### 1. Test External Services (1-2 days) +- Write 5-10 simple test scripts for each external API +- Document behavior quirks in CONSTRAINTS.md +- Test edge cases, error modes, rate limits +- **Deliverable:** CONSTRAINTS.md with tested facts + +### 2. Create Scope Rings (1 day) +- Define Ring 0 (MVP) clearly - what's the minimum value? +- Identify Ring 1 candidates (defer until feedback) +- **Deliverable:** SCOPE_RINGS.md + +### 3. Update This CLAUDE.md +- Fill in Ring 0 scope above +- Document architectural decisions +- List what NOT to do yet +- **Deliverable:** Project-specific CLAUDE.md + +**Week 0 prevents:** Building elaborate systems on wrong assumptions + +--- + +## Test-Driven Development + +### Integration Tests: ALWAYS Required + +**From Day 1:** +- Integration tests with REAL external services +- Tests FAIL HARD if no API keys +- Forces validation against real behavior +- Catches API quirks immediately + +**Example:** +```python +@pytest.mark.integration +def test_perplexity_json_output(): + """Test real Perplexity API with JSON schema.""" + if not os.getenv("PERPLEXITY_API_KEY"): + pytest.fail("PERPLEXITY_API_KEY required for integration tests") + + # Test with real API + result = query_perplexity(...) + assert valid_json(result) +``` + +### TDD: When to Use + +**Use TDD for:** +- ✅ Parsers, validators, data transformers (clear inputs/outputs) +- ✅ Bug fixes (red → green → refactor) +- ✅ Core domain logic (once understood) + +**Don't use TDD for:** +- ❌ Exploratory prototypes +- ❌ Trying different prompt strategies +- ❌ Initial API integration experiments + +**TDD Workflow:** ```bash -# Install dependencies and sync environment -uv sync --dev # Install all dependencies including dev tools +# 1. Red: Write failing test +uv run pytest tests/test_parser.py -k test_new_feature # Should fail -# Run tests -uv run pytest # All tests -uv run pytest -m unit # Unit tests only -uv run pytest -m integration # Integration tests only -uv run pytest --cov # With coverage +# 2. Green: Minimal implementation +# ... write code ... +uv run pytest tests/test_parser.py -k test_new_feature # Should pass + +# 3. Refactor: Improve while tests stay green +``` + +### Coverage Targets + +**MVP Phase (Week 1-3):** +- Target: 60% coverage +- Focus on critical paths +- Integration tests > unit test coverage + +**Post-MVP (Week 4+):** +- Target: 80%+ coverage +- Comprehensive test suite +- Add edge cases -# Code quality (run before committing!) 
+--- + +## Code Quality: Phase-Appropriate Standards + +### MVP Phase (Week 1-3): Relaxed + +**Focus:** Deliver value, validate approach + +```bash +# Run these manually (not blocking) +uv run mypy src/ # Type checking (encouraged) +uv run ruff check --fix src/ tests/ # Lint uv run ruff format src/ tests/ # Format code -uv run ruff check --fix src/ tests/ # Lint and auto-fix -uv run mypy src/ # Type checking +``` -# Documentation (run before committing!) -python scripts/check-docs.py # Build docs and check for errors -cd docs && uv run sphinx-build . _build/html -W # Alternative direct command +**Standards:** +- ✅ Integration tests (required) +- ✅ Type hints (encouraged, not enforced) +- ✅ Linting (run manually, don't block commits) +- ✅ Coverage: 60% target +- ❌ NO pre-commit hooks yet (patterns not stable) -# Pre-commit hooks (recommended) -uv run pre-commit install # Install git hooks -uv run pre-commit run --all-files # Run on all files +### Post-MVP Phase (Week 4+): Strict -# Add new dependencies -uv add requests # Add runtime dependency -uv add --dev pytest # Add development dependency +**Focus:** Sustainable, maintainable code + +```bash +# Install pre-commit hooks +uv run pre-commit install + +# These now run automatically on commit +uv run pytest --cov --cov-fail-under=80 +uv run mypy src/ +uv run ruff check src/ tests/ +uv run ruff format src/ tests/ -# Environment management -uv sync # Sync dependencies (production only) -uv sync --dev # Sync with development dependencies ``` -## Code Quality Strategy -- **Pre-commit hooks**: Auto-run ruff to lint AND format /mypy to type check before each commit -- **Local checks**: Always run `ruff format` and `ruff check --fix` before pushing -- **GitHub Actions**: Runs same checks on PRs - should never fail if run locally -- **IDE integration**: Configure your editor to run formatters on save +**Standards:** +- ✅ Pre-commit hooks enforced +- ✅ Type checking enforced (mypy) +- ✅ Linting enforced (ruff) +- ✅ Coverage: 80%+ required +- ✅ CI/CD checks -## Environment Configuration -- **ALWAYS use dotenv**: Use `from dotenv import load_dotenv; load_dotenv()` for environment variables, never use `os.getenv()` directly -- **Never hardcode secrets**: All API keys, emails, and sensitive data must come from .env files -- **Environment precedence**: Constructor params > environment variables > sensible defaults +**When to transition:** After Ring 0 shipped, user feedback received, code patterns stabilizing -## FORBIDDEN Patterns -- Mock data generation for integration tests -- Simulated API responses -- Dummy database connections -- Placeholder implementations -- Integration tests that pass without real integration -- Skipping failing tests with pytest.mark.skip +--- + +## Testing Commands (using uv) + +```bash +# Environment setup +uv sync --dev # Install all dependencies including dev tools + +# Running tests +uv run pytest # All tests +uv run pytest -m unit # Unit tests only (CI uses this) +uv run pytest -m integration # Integration tests (local only) +uv run pytest --cov # With coverage +uv run pytest --cov --cov-fail-under=60 # MVP phase +uv run pytest --cov --cov-fail-under=80 # Post-MVP phase + +# Code quality +uv run mypy src/ # Type check +uv run ruff check --fix src/ tests/ # Lint + auto-fix +uv run ruff format src/ tests/ # Format + +# Documentation +python scripts/check-docs.py # Build and check docs +cd docs && uv run sphinx-build . 
_build/html -W # Alternative + +# Pre-commit (Post-MVP) +uv run pre-commit install # Install hooks +uv run pre-commit run --all-files # Run on all files + +# Dependencies +uv add requests # Runtime dependency +uv add --dev pytest # Dev dependency +``` + +--- ## Required Test Structure -- Unit tests: tests/unit/ (fast, isolated, no external deps) -- Integration tests: tests/integration/ (environment-dependent behavior) -- Fixtures with real connection setup/teardown -- Coverage minimum: 80% -- All tests must use pytest markers (@pytest.mark.unit or @pytest.mark.integration) + +``` +tests/ +├── unit/ # Fast, isolated, no external deps +│ ├── test_parsers.py +│ ├── test_validators.py +│ └── ... +├── integration/ # Real external services +│ ├── test_perplexity.py +│ ├── test_deepsearch.py +│ └── ... +└── conftest.py # Shared fixtures +``` + +**All tests must use markers:** +```python +@pytest.mark.unit # Unit test +@pytest.mark.integration # Integration test +``` + +**Integration test requirements:** +- Real API connections (no mocks) +- Fail hard if no credentials +- Document expected behavior +- Test error modes (rate limits, network failures) + +--- ## Integration Testing Strategy -**Local Development (Real APIs Only):** -- Integration tests REQUIRE real API keys (e.g. OPENAI_API_KEY, ANTHROPIC_API_KEY) -- Tests FAIL HARD if no API keys are present -- Forces developers to test against real APIs before pushing -- Pre-commit hooks enforce integration test passage -- Command: `uv run pytest -m integration` - -**CI/GitHub Actions (Unit Tests Only):** -- NO integration tests run in CI to avoid mock complexity -- Only unit tests run (fast, no external dependencies) -- Command: `uv run pytest -m unit` + +### Local Development (Real APIs) +- Integration tests REQUIRE real API keys +- Tests FAIL HARD if credentials missing +- Forces validation against real services +- Run: `uv run pytest -m integration` + +### CI/GitHub Actions (Unit Tests Only) +- NO integration tests in CI (avoid API costs/mocking) +- Only unit tests (fast, reliable) +- Run: `uv run pytest -m unit` **Rationale:** -- Local: Mandatory real API testing ensures integration quality -- CI: Simple, fast, reliable unit test validation -- Avoids complex mock maintenance and false confidence -- Forces developers to have working API access +- Local: Mandatory real API testing ensures quality +- CI: Simple, fast validation +- Avoids mock complexity and false confidence + +--- + +## FORBIDDEN Patterns + +**Never:** +- ❌ Mock data for integration tests (use real APIs) +- ❌ Simulated API responses in integration tests +- ❌ Skipping tests with `pytest.mark.skip` (fix or remove) +- ❌ Ring 1+ features before Ring 0 ships +- ❌ Building generic architecture before specific case works +- ❌ Rewriting existing code without documented reason + +**Required:** +- ✅ Real API integration tests from Day 1 +- ✅ Ship Ring 0 within 2-3 weeks +- ✅ Get user feedback before Ring 1 +- ✅ Extend existing code when possible + +--- + +## Architecture Requirements + +### Schema-First Pattern + +**JSON Schema is the source of truth: use it to generate Pydantic models** +```python +# 1. Define JSON schema (not Pydantic) +schema = { + "type": "object", + "properties": {...}, + "additionalProperties": False +} + +# 2. Generate Pydantic model from JSON schema +# Use cellsem-llm-client utilities +Model = create_model_from_json_schema(schema) + +# 3. 
Validate and correct LLM outputs +result = Model.model_validate(llm_output) # Strict validation +# OR +result = Model.model_validate(llm_output, strict=False) # Drop extra fields with warning +``` + +**Modular schemas:** +- Separate business logic from domain (biology) +- Reusable components +- Shared between core and validation packages + +### Core Libraries + +**Use:** +- `cellsem-llm-client` for LLM agents and generation of pydantic models +- `deepsearch-client` for DeepSearch queries +- `pydantic-ai` for orchestration graphs + +**DeepSearch calls belong in services layer, not scattered through code** + +### Orchestration: PydanticAI Graphs + +```python +# Define workflow as graph +workflow = Graph() +workflow.add_node("query", query_agent) +workflow.add_node("validate", validation_agent) +workflow.add_edge("query", "validate") + +# Declarative, inspectable, debuggable +result = workflow.run(input_data) +``` + +### Declarative Workflows + +**Prefer declarative over imperative:** + +**Prompts in YAML:** +```yaml +# prompts/gene_annotation.yaml +system_prompt: | + You are a gene program annotator. + {instructions} + +user_prompt: | + Analyze these genes: {gene_list} + +presets: + perplexity-sonar: + provider: perplexity + model: sonar-pro + temperature: 0.1 +``` + +**Benefits:** +- Easy to modify without code changes +- Versioned separately from logic +- Testable in isolation +- Self-documenting + +### Transparency & Debuggability + +**Required:** +- ✅ Save intermediate outputs at each step +- ✅ Structured output directory: `outputs/{project}/{query}/{timestamp}/` +- ✅ Ability to resume from any step +- ✅ Dry-run mode (show all prompts/calls without executing) + +**Example:** +```python +# Save intermediate results +def run_workflow(input_data, output_dir, start_step=None): + if start_step is None or start_step <= 1: + result1 = step1(input_data) + save_json(result1, f"{output_dir}/step1_output.json") + else: + result1 = load_json(f"{output_dir}/step1_output.json") + + if start_step is None or start_step <= 2: + result2 = step2(result1) + save_json(result2, f"{output_dir}/step2_output.json") + # ... 
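+    # Resume example (sketch): pass start_step to reuse saved outputs, e.g.
+    # run_workflow(input_data, output_dir, start_step=2)  # reruns from step 2, reloading step 1's saved JSON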
+``` + +--- + +## Scripts & CLI + +### Core Runner Script + +**Every workflow needs a runner supporting:** + +```bash +# Single run +workflow-runner --input genes.txt --output results/ + +# Batch mode +workflow-runner --batch inputs/ --output results/ + +# Dry run (show plan without executing) +workflow-runner --input genes.txt --dry-run + +# Resume from step +workflow-runner --input genes.txt --output results/ --resume-from step3 +``` + +**Requirements:** +- Distributed with package (installed as console script) +- Single run, batch, and dry-run modes +- Clear progress output +- Error handling with helpful messages + +**Anti-pattern:** Encoding workflow logic in scripts instead of core package + +--- + +## Repository Structure + +### Two-Package Architecture (UV Workspace) + +This project uses **UV workspace** to manage two separately publishable packages: + +``` +{{cookiecutter.project_slug}}/ +├── pyproject.toml # UV workspace configuration +├── src/ +│ ├── {{cookiecutter.package_name}}/ # CORE PACKAGE +│ │ ├── pyproject.toml # Core package config +│ │ └── {{cookiecutter.package_name}}/ # Source code +│ │ ├── __init__.py # Bootstrap with dotenv +│ │ ├── agents/ # Agent orchestration +│ │ │ └── *.prompt.yaml # Co-located prompts +│ │ ├── graphs/ # Workflow graphs (OPTIONAL) +│ │ ├── schemas/ # JSON schemas (source of truth) +│ │ ├── services/ # LLM + API integrations +│ │ │ └── *.prompt.yaml # Co-located prompts +│ │ ├── utils/ # Supporting utilities +│ │ └── validation/ # Cross-cutting validations (OPTIONAL) +│ └── {{cookiecutter.package_name}}_validation_tools/ # VALIDATION PACKAGE (OPTIONAL) +│ ├── pyproject.toml # Validation package config +│ └── {{cookiecutter.package_name}}_validation_tools/ +│ ├── comparisons/ # Compare workflow runs +│ ├── metrics/ # Quality metrics +│ └── visualizations/ # Analysis plots +├── tests/ +│ ├── unit/ +│ └── integration/ +├── docs/ +├── scripts/ +│ └── check-docs.py +├── SCAFFOLD_GUIDE.md # Scaffold decision guide +└── CLAUDE.md # This file +``` + +**Core package** (`{{cookiecutter.package_name}}`): +- **Always keep**: Contains all workflow logic +- **Owns schemas**: Only location for JSON schemas +- **Prompts co-located**: `*.prompt.yaml` files next to agents/services +- Publish: `pip install {{cookiecutter.package_name}}` + +**Validation package** (`{{cookiecutter.package_name}}_validation_tools`): +- **OPTIONAL**: Delete entire directory if Ring 0 doesn't need validation tooling +- **Depends on core**: Imports schemas and models from core package +- **No schema duplication**: Uses `from {{cookiecutter.package_name}}.schemas import ...` +- Publish: `pip install {{cookiecutter.package_name}}-validation-tools` + +**UV Workspace benefits:** +- Single `uv sync` installs both packages in development mode +- Shared lockfile (`uv.lock`) for reproducibility +- Independent publishing to PyPI +- Clear dependency graph (validation → core) + +--- + +## Environment Configuration + +**ALWAYS use dotenv:** +```python +from dotenv import load_dotenv +load_dotenv() + +# Then use os.getenv() +api_key = os.getenv("PERPLEXITY_API_KEY") +``` + +**Precedence:** +1. Constructor parameters (explicit) +2. Environment variables (.env file) +3. Sensible defaults + +**Never:** +- ❌ Hardcode secrets +- ❌ Commit .env files +- ❌ Use `os.getenv()` without `load_dotenv()` + +--- ## Documentation Requirements -- Google-style docstrings for all public functions/classes -- **RST syntax in docstrings**: Use `.. 
code-block:: python` instead of markdown ```python -- Auto-generated API docs via Sphinx + AutoAPI -- Manual docs in docs/ using MyST markdown -- Build docs: `python scripts/check-docs.py` (recommended) or `sphinx-build docs docs/_build` -- **Always run docs check before committing** to catch RST syntax errors - -## MVP Definition -For each feature: + +**Google-style docstrings:** +```python +def query_deepsearch(gene_list: list[str], model: str = "sonar-pro") -> dict: + """Query DeepSearch API for gene program annotation. + + Args: + gene_list: List of gene symbols to annotate + model: DeepSearch model to use (default: sonar-pro) + + Returns: + Dictionary containing annotation results with keys: + - programs: List of identified gene programs + - citations: Supporting references + - confidence: Confidence scores + + Raises: + APIError: If DeepSearch request fails + ValidationError: If response doesn't match schema + + Example: + .. code-block:: python + + result = query_deepsearch(["TP53", "BRCA1"]) + programs = result["programs"] + """ +``` + +**RST syntax in docstrings:** +- Use `.. code-block:: python` (not markdown backticks) +- Check with: `python scripts/check-docs.py` + +**Documentation structure:** +- Auto-generated API docs (Sphinx + AutoAPI) +- Manual guides in docs/ (MyST markdown) +- ALWAYS run docs check before committing + +--- + +## Planning Requirements + +### For Each Feature + +**Include:** 1. Clear, testable goal -2. Integration test demonstrating real API connection -3. Error handling for actual failure modes (network, malformed data, rate limits) -4. No feature complete until real integration test passes +2. Integration test demonstrating real API behavior +3. Error handling for actual failure modes: + - Network failures + - Malformed data + - Rate limits + - Authentication errors +4. Critique: Potential issues/risks with approach + +**Template:** +```markdown +## Feature: [Name] + +### Goal +[What value does this provide?] + +### Integration Test +[How will we test with real APIs?] + +### Error Modes +- Network failure: [handling] +- Malformed response: [handling] +- Rate limit: [handling] + +### Critique +- Risk 1: [mitigation] +- Risk 2: [mitigation] +``` + +### MVP Definition + +**Each feature is not complete until:** +- ✅ Real integration test passes +- ✅ Error handling implemented +- ✅ Documented in code and docs + +--- + +## Decision Checklist + +**Before writing production code, verify:** + +- [ ] **Week 0 complete?** CONSTRAINTS.md, SCOPE_RINGS.md, CLAUDE.md updated +- [ ] **Is this Ring 0?** If no, STOP until Ring 0 ships +- [ ] **Have we tested the external API?** Integration test first +- [ ] **Can we extend existing code?** Don't rewrite without reason +- [ ] **Is architecture documented in CLAUDE.md?** Agent needs guidance +- [ ] **Are we in Week 3+ without user feedback?** Time to share + +--- + +## Red Flags: Stop and Review + +**Warning signs:** +- [ ] Ring 1+ features before Ring 0 ships +- [ ] Rewriting existing code without documented reason +- [ ] Building custom generic architecture (use/contribute to template) +- [ ] Week 3+ without sharing with users +- [ ] No integration tests with real APIs +- [ ] Missing architectural vision in CLAUDE.md + +**If any checked:** Pause. Review [[development-principles]]. Refocus on Ring 0. + +--- -## Planning +## Success Metrics -Each proposed plan of work should include an MVP and a critique section detailing potential issues/risks with the approach. 
+**Ship fast:** +- Ring 0 shipped: Week 2-3 (not Week 5+) +- User feedback: Week 3 +- Ring 1 decisions: Based on feedback -## Architecture +**Build right:** +- Integration tests from Day 1 +- Real API validation (no mocks) +- Coverage: 60% (MVP) → 80% (Post-MVP) +- Sustainable patterns (template infrastructure) -Use cellsem-llm-client for agents. -Use deepsearch-client for deepsearch (deepsearch calls belong in services) +**Iterate smart:** +- Rapid experiments + user feedback +- Extend existing code when possible +- Contribute patterns back to template -### Schema first pattern: - - Schema is ALWAYS defined as JSON schema NOT pydantic schema. - - pydantic models are ALWAYS derived from JSON schema - using cellsem-llm-client - - All JSON schema content generated by agents/deepsearch is validated and corrected by pydantic - - where schema specified additionalFields: False, pydantic must be set to drop any additional fields with a warning rather than hard failing. - - JSON schema should be modular with modules strictly separating business logic from biology +--- -### Orchestration: - - Agents are orchestrated via pydanticAI graphs +## References -### Declarative workflows: - - As far as possible, prefer declarative structure: - - Prompts (including system prompts) + interpolation defined in YAML - - YAML files define preset combinations: prompts, provider model, other API metadata (keys and values) - - If declarative use systems for orchestration via pydanticAI graphs are possible, then please use and document them. +- [[development-principles]] - Lessons from Langpa retrospective +- [[workflows]] - Integration with research workflows +- CellSemAgenticWorkflow template repository - [URL] -### Transparency: - - It must be possible to save and inspect intermediate files in workflows - - It must be possible to run workflows starting from multiple steps +--- -### Scripts: - - All workflows should have a core runner script supporting: - - single run - - batch mode - - dry-run (a single run returning a report of all prompts, agents, API calls in order.) - - The runner script must be distributed with the package - - Be careful to avoid encoding workflows in scripts rather than in the core package. +## Customization Checklist -### Repo structure: -The repo must consist of two packages: A core package + a validation package. Where relevant these MUST share schemas. They MUST also share an outputs directory and standard ways to structure these. For example, outputs/{project}/{query}/{timestamp}/(results_files). +**When starting new project from this template:** -Validation package must be capable of basic stats - precision, recall, F1, ROC curves, heat map bubble plots... +- [ ] Fill in "Ring 0 Scope" section above +- [ ] Document architectural decisions +- [ ] List "What NOT to do yet" +- [ ] Complete Week 0 validation +- [ ] Create CONSTRAINTS.md +- [ ] Create SCOPE_RINGS.md +- [ ] Update this CLAUDE.md with project specifics +- [ ] Share this with agent for each session +**This CLAUDE.md guides the agent. Keep it updated as decisions evolve.** diff --git a/{{cookiecutter.project_slug}}/README.md b/{{cookiecutter.project_slug}}/README.md index ada96b2..139a84d 100644 --- a/{{cookiecutter.project_slug}}/README.md +++ b/{{cookiecutter.project_slug}}/README.md @@ -10,6 +10,15 @@ ## 🚀 Quick Start +### Understanding This Scaffold + +This project was generated from a standardized template. 
**See `SCAFFOLD_GUIDE.md` for**: +- What's **infrastructure** (keep always) vs **optional** (evaluate for your Ring 0) +- What's **example code** (replace with your domain logic) +- **Decision trees** for each component (graphs/, validation/, etc.) + +**Week 0 Task**: Review scaffold and remove components not needed for your Ring 0 MVP. + ### Installation ```bash @@ -61,41 +70,93 @@ bootstrap() Documentation lives in `docs/` and is built with Sphinx + MyST. Run `python scripts/check-docs.py` to build with warnings-as-errors before each commit. Publish the rendered HTML via GitHub Pages or your preferred static host. +## 📦 Package Structure + +This project contains **two independently publishable packages** managed as a UV workspace: + +### Core Package +```bash +pip install {{cookiecutter.package_name}} +``` +Main workflow package with agents, services, and orchestration. **Always keep this package.** + +### Validation Tools (Optional) +```bash +pip install {{cookiecutter.package_name}}-validation-tools +``` +Tools for comparing runs, computing metrics, and visualizing results. + +**Note**: Validation package is **OPTIONAL**. Delete `src/{{cookiecutter.package_name}}_validation_tools/` if not needed for your Ring 0 MVP. See `SCAFFOLD_GUIDE.md` for guidance. + +## 🛠️ Development + +This is a **UV workspace** - a single `uv sync` installs both packages: + +```bash +# Install both packages in development mode +uv sync --dev + +# Run tests for all packages +uv run pytest + +# Lint all packages +uv run ruff check src/ tests/ +``` + ## ✨ Current Features +- ✅ **Two-package architecture** - Core + optional validation tools +- ✅ **UV workspace** - Modern multi-package management - ✅ **Agentic workflow scaffold** with strict TDD guardrails (`CLAUDE.md`) - ✅ **Unit & integration test suites** pre-configured with pytest markers - ✅ **Docs + automation scripts** for Sphinx builds - ✅ **Environment bootstrap** handled via `python-dotenv` -- ✅ **uv-first packaging** (`pyproject.toml` with Ruff, MyPy, pytest config) - ✅ **Integrated clients**: [`cellsem_llm_client`](https://github.com/Cellular-Semantics/cellsem_llm_client) for LLMs and [`deep-research-client`](https://github.com/monarch-initiative/deep-research-client) for Deepsearch workflows - ✅ **Pydantic AI graph orchestration**: `pydantic-ai` agent surfaces graph nodes safely with typed deps +- ✅ **Schema-first design**: JSON schemas → Pydantic models +- ✅ **Prompt co-location**: `*.prompt.yaml` files next to agents/services ## 🏗️ Architecture ``` {{cookiecutter.project_slug}}/ -├── src/{{cookiecutter.package_name}}/ -│ ├── agents/ # Agent classes coordinating workflows -│ ├── graphs/ # Optional workflow graphs powered by Pydantic -│ ├── schemas/ # Shared IO models and contracts -│ └── services/ # LLM + Deepsearch integration layers -├── tests/unit/ # Fast, isolated tests -├── tests/integration/ # Real API + IO validation (no mocks) -├── docs/ # Sphinx configuration and content -└── scripts/ # Tooling helpers (docs, chores, etc.) 
+├── pyproject.toml # UV workspace config +├── src/ +│ ├── {{cookiecutter.package_name}}/ # CORE PACKAGE +│ │ ├── pyproject.toml +│ │ └── {{cookiecutter.package_name}}/ +│ │ ├── agents/ # Agent orchestration +│ │ ├── graphs/ # Workflow graphs (OPTIONAL) +│ │ ├── schemas/ # JSON schemas (source of truth) +│ │ ├── services/ # LLM + API integrations +│ │ ├── utils/ # Supporting utilities +│ │ └── validation/ # Cross-cutting validations (OPTIONAL) +│ └── {{cookiecutter.package_name}}_validation_tools/ # VALIDATION PACKAGE (OPTIONAL) +│ ├── pyproject.toml +│ └── {{cookiecutter.package_name}}_validation_tools/ +│ ├── comparisons/ # Compare workflow runs +│ ├── metrics/ # Quality metrics +│ └── visualizations/ # Analysis plots +├── tests/ +│ ├── unit/ # Fast, isolated tests +│ └── integration/ # Real API validation (no mocks) +├── docs/ # Sphinx configuration and content +└── scripts/ # Tooling helpers (docs, chores, etc.) ``` -Optional workflow graphs powered by Pydantic ensure orchestration definitions are validated before agents execute them, keeping schema and runtime behaviors aligned. - -- `src/{{cookiecutter.package_name}}/agents`: Agent entrypoints coordinating services and schemas -- `src/{{cookiecutter.package_name}}/graphs`: Optional workflow graphs powered by Pydantic + pydantic-ai -- `src/{{cookiecutter.package_name}}/schemas`: JSON Schema contracts describing outputs + business rules -- `src/{{cookiecutter.package_name}}/services`: Concrete integrations (CellSem LLM client, Deepsearch) -- `src/{{cookiecutter.package_name}}/utils`: Repo-specific tooling/helpers that support workflows without being agents -- `src/{{cookiecutter.package_name}}/validation`: Cross-cutting workflow validations (schema checks, service registration guards) - -Workflow validations live in src/{{cookiecutter.package_name}}/validation. Use this module to centralize logic that inspects graphs, schemas, or services before workflows execute. +**Core package** (always keep): +- `agents/`: Agent classes coordinating workflows (prompts co-located as `*.prompt.yaml`) +- `graphs/`: Optional workflow graphs powered by Pydantic + pydantic-ai +- `schemas/`: JSON Schema contracts (source of truth for data models) +- `services/`: LLM and API integrations (CellSem LLM client, Deepsearch) +- `utils/`: Supporting utilities +- `validation/`: Cross-cutting validations (OPTIONAL - delete if not needed) + +**Validation package** (optional - delete if Ring 0 doesn't need): +- `comparisons/`: Tools for comparing workflow runs +- `metrics/`: Quality metrics (precision, recall, F1, etc.) +- `visualizations/`: Analysis plots (heatmaps, ROC curves, etc.) +- Imports schemas and models from core package (no duplication) ### Graph Agents with pydantic-ai diff --git a/{{cookiecutter.project_slug}}/SCAFFOLD_GUIDE.md b/{{cookiecutter.project_slug}}/SCAFFOLD_GUIDE.md new file mode 100644 index 0000000..a2586ea --- /dev/null +++ b/{{cookiecutter.project_slug}}/SCAFFOLD_GUIDE.md @@ -0,0 +1,275 @@ +# Project Scaffold Guide + +This guide helps you understand the generated project structure and decide what to keep for your Ring 0 MVP. + +## Two-Package Architecture + +This project uses a **two-package structure** managed as a UV workspace for separation of concerns: + +### 1. Core Package: `{{cookiecutter.package_name}}` + +**Always Keep**: This is your main workflow package. 
+ +**Location**: `src/{{cookiecutter.package_name}}/` + +**Contains**: +- Agents, services, orchestration +- Core business logic +- Schemas (source of truth - **only location for schemas**) +- Prompts (co-located with agents/services) + +**Publish**: `pip install {{cookiecutter.package_name}}` + +### 2. Validation Package: `{{cookiecutter.package_name}}_validation_tools` + +**OPTIONAL**: Delete entire directory if not needed for Ring 0. + +**Location**: `src/{{cookiecutter.package_name}}_validation_tools/` + +**Contains**: +- Workflow output comparison tools (`comparisons/`) +- Quality metrics (`metrics/` - precision, recall, F1, etc.) +- Visualizations (`visualizations/` - heatmaps, ROC curves, plots) + +**Depends on**: Core package (imports schemas and models from core) + +**Publish**: `pip install {{cookiecutter.package_name}}-validation-tools` + +**Keep if**: +- Need to compare workflow runs +- Need quality metrics for evaluation +- Need visualizations for analysis +- Building tools for workflow validation + +**Delete if**: +- No comparison/analysis needed in Ring 0 +- Simple workflows without evaluation needs +- Not building validation tooling + +### UV Workspace Benefits + +- **Single `uv sync`**: Installs both packages in development mode +- **Shared lockfile**: `uv.lock` ensures reproducibility across team +- **Local development**: Edit either package, changes reflected immediately +- **Independent publishing**: Each package can be published to PyPI separately +- **Clear dependencies**: Validation depends on core, not vice versa + +--- + +## Infrastructure (Always Keep) + +These components prevent technical debt and ensure consistency across all projects: + +- **`src/{{cookiecutter.package_name}}/__init__.py`** - Bootstrap with dotenv loading +- **`src/{{cookiecutter.package_name}}/schemas/`** - JSON schemas directory (schema-first design) +- **`*.prompt.yaml` files** - Co-located with agents/services that use them +- **`tests/unit/`** and **`tests/integration/`** - Testing structure with pytest markers +- **`pyproject.toml`**, tooling configs - Development infrastructure (ruff, mypy, pytest, sphinx) +- **`.github/workflows/`** - CI/CD pipeline +- **`.githooks/`** - Pre-commit quality checks + +### Prompt File Naming Convention + +**Always use `.prompt.yaml` suffix** for easy identification and discoverability: + +**Examples:** +- `src/{{cookiecutter.package_name}}/agents/annotator.prompt.yaml` - Single prompt for agent +- `src/{{cookiecutter.package_name}}/services/deepsearch.query.prompt.yaml` - Specific purpose +- `src/{{cookiecutter.package_name}}/services/deepsearch.summary.prompt.yaml` - Another purpose + +**Benefits:** +- Easy to find all prompts: `find . 
-name "*.prompt.yaml"` +- Clear ownership: prompt lives next to code that uses it +- Easy to review: `git diff **/*.prompt.yaml` +- Grepable: search for prompt-related changes across project +- Version controlled: track prompt evolution in git + +--- + +## Optional Components (Evaluate for Ring 0) + +### `graphs/` - Workflow Orchestration + +**Keep if** your Ring 0 needs: +- Multi-step workflows with branching logic +- Complex dependencies between steps +- Dynamic routing based on runtime conditions +- Type-safe workflow definitions + +**Delete if**: +- Single agent or linear flow sufficient +- Simple sequential operations +- No branching or complex routing needed + +**Provided**: Working example of pydantic-ai graph orchestration with typed dependencies + +--- + +### `validation/` - Cross-cutting Validations + +**Keep if** your Ring 0 has: +- Complex validation logic used across multiple services +- Business rules that span multiple components +- Schema validations beyond simple Pydantic models + +**Delete if**: +- Simple validation in service layer is sufficient +- No shared validation logic across components + +**Ring 0 guidance**: Likely not needed. Add in Ring 1+ if you discover duplicated validation logic. + +**Alternative**: Keep validation in service layer until pattern emerges (don't premature abstract). + +**Provided**: Empty directory with README explaining usage patterns + +--- + +### `agents/` - Agent Classes + +**Keep if** your Ring 0 has: +- Multiple agents with shared patterns +- Complex agent coordination +- Agent orchestration needs + +**Delete if**: +- Single simple agent is sufficient +- No shared patterns between agents yet + +**Provided**: Example agent demonstrating schema-first and prompt-first patterns + +--- + +### `deep-research-client` Integration + +**Keep if** your Ring 0 needs: +- Deep research workflows +- Perplexity API integration +- Literature search and synthesis + +**Delete if**: +- Not using deep research capabilities in Ring 0 +- Different research tool needed + +**Action**: Remove from `pyproject.toml` dependencies if not needed + +--- + +## Example Code (Replace with Domain Logic) + +Files and code marked with `# EXAMPLE` comments are working demonstrations: + +- **Purpose**: Show proven patterns (schema-first, prompt-first, co-located prompts) +- **Action**: Replace with your domain-specific logic +- **Keep**: The patterns and infrastructure +- **Maintain**: Test structure and documentation style + +--- + +## Week 0 Checklist + +Use this checklist during your Week 0 validation phase: + +- [ ] **Define Ring 0 scope** in CLAUDE.md (update "Ring 0 Scope" section) +- [ ] **Review each directory** against your Ring 0 requirements +- [ ] **Delete unused optional components** (graphs/, validation/, agents/ if not needed) +- [ ] **Remove unused dependencies** from pyproject.toml (e.g., deep-research-client) +- [ ] **Replace example code** with first real use case +- [ ] **Update README.md** with your project description and purpose +- [ ] **Create .env file** with required API keys +- [ ] **Run integration tests** to validate API access +- [ ] **Document architectural decisions** in CLAUDE.md + +--- + +## Decision Tree + +``` +Is your Ring 0 a single, simple agent? +├─ YES → Delete: graphs/, validation/, example agents/ +│ Keep: schemas/, tests/, tooling, one real agent +│ +└─ NO → Multi-step workflow? + ├─ YES → Keep: graphs/, agents/ + │ Evaluate: validation/ (probably defer to Ring 1) + │ + └─ NO → Linear multi-agent? 
+ Keep: agents/ + Delete: graphs/, validation/ +``` + +--- + +## Common Patterns to Follow + +### 1. Schema-First Design + +```python +# schemas/my_input.schema.json +{ + "$comment": "Define business logic in JSON schema first", + "type": "object", + "properties": { + "query": {"type": "string"}, + "max_results": {"type": "integer", "default": 10} + }, + "required": ["query"] +} + +# Then generate Pydantic model programmatically +# (using cellsem-llm-client utilities) +``` + +### 2. Prompt-First Design + +```yaml +# agents/my_agent.prompt.yaml +system_prompt: | + You are an AI assistant specialized in {domain}. + +user_prompt: | + {task_description} + +presets: + openai-gpt4: + provider: openai + model: gpt-4 + temperature: 0.1 +``` + +```python +# agents/my_agent.py +def load_prompt(prompt_file: str) -> dict: + """Load co-located prompt file.""" + prompt_path = Path(__file__).parent / prompt_file + return yaml.safe_load(prompt_path.read_text()) +``` + +### 3. Test Structure + +```python +# tests/unit/test_parser.py +@pytest.mark.unit +def test_parser_logic(): + """Fast, isolated, no external dependencies.""" + pass + +# tests/integration/test_api.py +@pytest.mark.integration +def test_real_api_connection(): + """Real API, fail hard if no credentials.""" + if not os.getenv("API_KEY"): + pytest.fail("API_KEY required for integration tests") + # Test with real API +``` + +--- + +## Questions? + +- Review `CLAUDE.md` for full development philosophy and ring-based approach +- See directory-specific `README.md` files for detailed component guidance +- Check `README.md` for quick start and architecture overview + +--- + +**Remember**: Infrastructure ≠ premature abstraction. Schema files, prompt files, and testing patterns are infrastructure that prevent technical debt. 
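+---
+
+## Appendix: Hand-Checking a Schema
+
+A minimal sketch for sanity-checking sample inputs against a JSON schema during Week 0, using the `jsonschema` dependency directly. This is only a manual check; the real workflow generates Pydantic models from these schemas via `cellsem-llm-client` utilities. The file path below assumes the bundled `example_input.schema.json` in the default workspace layout.
+
+```python
+import json
+from pathlib import Path
+
+from jsonschema import ValidationError, validate
+
+# Assumed path to the bundled example schema (adjust for your package name/layout)
+schema_path = Path("src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/schemas/example_input.schema.json")
+schema = json.loads(schema_path.read_text())
+
+try:
+    validate(instance={"query": "What is agentic AI?", "max_results": 5}, schema=schema)
+    print("valid")
+except ValidationError as exc:
+    print(f"invalid: {exc.message}")
+```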
diff --git a/{{cookiecutter.project_slug}}/pyproject.toml b/{{cookiecutter.project_slug}}/pyproject.toml index 26eecbd..3766b58 100644 --- a/{{cookiecutter.project_slug}}/pyproject.toml +++ b/{{cookiecutter.project_slug}}/pyproject.toml @@ -1,20 +1,12 @@ -[project] -name = "{{cookiecutter.project_slug}}" -version = "0.1.0" -description = "{{cookiecutter.description}}" -readme = "README.md" -requires-python = ">= {{cookiecutter.python_version}}" -dependencies = [ - "python-dotenv>=1.0.1", - "pydantic>=2.7.0", - "pydantic-ai>=1.16.0", - "jsonschema>=4.22.0", - "cellsem-llm-client @ git+https://github.com/Cellular-Semantics/cellsem_llm_client.git@main", - "deep-research-client @ git+https://github.com/monarch-initiative/deep-research-client.git@main", +[tool.uv.workspace] +members = [ + "src/{{cookiecutter.package_name}}", + "src/{{cookiecutter.package_name}}_validation_tools", ] -[project.optional-dependencies] -dev = [ +[tool.uv] +managed = true +dev-dependencies = [ "pytest>=8.0.0", "pytest-cov>=4.1.0", "ruff>=0.5.0", @@ -25,17 +17,8 @@ dev = [ "pre-commit>=3.7.0", ] -[build-system] -requires = ["setuptools>=68.0"] -build-backend = "setuptools.build_meta" - -[tool.setuptools.packages.find] -where = ["src"] - -[tool.setuptools.package-data] -"{{cookiecutter.package_name}}" = ["schemas/*.schema.json"] - [tool.pytest.ini_options] +testpaths = ["tests"] addopts = "-m 'unit'" markers = [ "unit: fast, isolated tests", @@ -53,14 +36,14 @@ select = ["E", "F", "I", "UP", "B", "SIM"] quote-style = "double" [tool.mypy] -packages = ["{{cookiecutter.package_name}}"] +packages = ["{{cookiecutter.package_name}}", "{{cookiecutter.package_name}}_validation_tools"] python_version = "{{cookiecutter.python_version}}" disallow_untyped_defs = true strict_optional = true warn_unused_ignores = true [tool.coverage.run] -source = ["src/{{cookiecutter.package_name}}"] - -[tool.uv] -managed = true +source = [ + "src/{{cookiecutter.package_name}}", + "src/{{cookiecutter.package_name}}_validation_tools", +] diff --git a/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/README.md b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/README.md new file mode 100644 index 0000000..e3b4fd7 --- /dev/null +++ b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/README.md @@ -0,0 +1,26 @@ +# {{cookiecutter.project_name}} - Core Package + +Main agentic workflow package containing agents, services, and orchestration logic. + +## Installation + +```bash +pip install {{cookiecutter.package_name}} +``` + +## Usage + +See main project README at repository root for complete documentation. + +## Package Contents + +- **agents/** - Agent classes coordinating workflow execution +- **graphs/** - Workflow orchestration (optional, delete if not needed) +- **schemas/** - JSON schemas (source of truth for data models) +- **services/** - LLM and API integration layer +- **utils/** - Supporting utilities +- **validation/** - Cross-cutting validations (optional) + +## Development + +This package is part of a UV workspace. See repository root for development instructions. 
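+
+## Quick Example
+
+A minimal sketch using the bundled example agent (EXAMPLE code; replace it with your own domain agents). `ExampleInput` and `run_example_agent` live in `agents/example_agent.py`:
+
+```python
+from {{cookiecutter.package_name}}.agents.example_agent import ExampleInput, run_example_agent
+
+# Input model mirrors schemas/example_input.schema.json
+result = run_example_agent(ExampleInput(query="What is agentic AI?", max_results=5))
+print(result.status)  # "completed"
+print(result.result)
+```
+
+Integration tests against real APIs remain the source of truth: `uv run pytest -m integration`.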
diff --git a/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/pyproject.toml b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/pyproject.toml new file mode 100644 index 0000000..c68dded --- /dev/null +++ b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/pyproject.toml @@ -0,0 +1,33 @@ +[project] +name = "{{cookiecutter.package_name}}" +version = "0.1.0" +description = "{{cookiecutter.description}}" +readme = "README.md" +requires-python = ">= {{cookiecutter.python_version}}" +dependencies = [ + "python-dotenv>=1.0.1", + "pydantic>=2.7.0", + "pydantic-ai>=1.16.0", + "jsonschema>=4.22.0", + "pyyaml>=6.0", + "cellsem-llm-client @ git+https://github.com/Cellular-Semantics/cellsem_llm_client.git@main", + "deep-research-client @ git+https://github.com/monarch-initiative/deep-research-client.git@main", +] + +[project.optional-dependencies] +dev = [ + "pytest>=8.0.0", + "pytest-cov>=4.1.0", + "ruff>=0.5.0", + "mypy>=1.10.0", +] + +[build-system] +requires = ["setuptools>=68.0"] +build-backend = "setuptools.build_meta" + +[tool.setuptools.packages.find] +where = ["."] + +[tool.setuptools.package-data] +"{{cookiecutter.package_name}}" = ["schemas/*.schema.json", "**/*.prompt.yaml"] diff --git a/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/__init__.py b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/__init__.py similarity index 100% rename from {{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/__init__.py rename to {{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/__init__.py diff --git a/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/agents/__init__.py b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/agents/__init__.py similarity index 100% rename from {{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/agents/__init__.py rename to {{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/agents/__init__.py diff --git a/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/agents/example_agent.prompt.yaml b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/agents/example_agent.prompt.yaml new file mode 100644 index 0000000..b52193d --- /dev/null +++ b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/agents/example_agent.prompt.yaml @@ -0,0 +1,48 @@ +# EXAMPLE: Replace with your domain-specific prompts +# INFRASTRUCTURE: Prompts co-located with agents/services that use them +# Naming convention: {agent_name}.prompt.yaml or {service_name}.{purpose}.prompt.yaml + +system_prompt: | + You are an AI assistant helping with agentic workflow tasks. + + Your role is to process user queries and provide helpful, accurate responses + following the configured parameters. + + EXAMPLE: Replace this system prompt with your domain-specific instructions. + +user_prompt: | + Process the following query: + + Query: {query} + + Provide a comprehensive response based on the available context and your knowledge. + + EXAMPLE: Replace this user prompt template with your domain-specific format. + Use {variable_name} for template variables that will be filled at runtime. 
+ +# Preset configurations for different LLM providers/models +# INFRASTRUCTURE: Define model presets here for easy switching +presets: + openai-gpt4: + provider: openai + model: gpt-4 + temperature: 0.1 + max_tokens: 1000 + + openai-gpt4-turbo: + provider: openai + model: gpt-4-turbo-preview + temperature: 0.1 + max_tokens: 2000 + + anthropic-claude: + provider: anthropic + model: claude-3-sonnet-20240229 + temperature: 0.1 + max_tokens: 1000 + +# EXAMPLE: Add your domain-specific presets +# Choose appropriate models for your use case: +# - Fast/cheap models for simple tasks +# - Powerful models for complex reasoning +# - Specialized models for specific domains diff --git a/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/agents/example_agent.py b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/agents/example_agent.py new file mode 100644 index 0000000..6aedea9 --- /dev/null +++ b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/agents/example_agent.py @@ -0,0 +1,138 @@ +"""Example agent demonstrating infrastructure patterns. + +EXAMPLE: Replace this entire file with your domain-specific agent logic. +INFRASTRUCTURE: The patterns shown here (schema-first, prompt-first, co-located prompts) are standard. +""" + +from __future__ import annotations + +from pathlib import Path +from typing import Any + +import yaml +from pydantic import BaseModel + + +class ExampleInput(BaseModel): + """Example input model. + + EXAMPLE: This would be generated from schemas/example_input.schema.json. + INFRASTRUCTURE: Always generate Pydantic models from JSON schemas programmatically. + """ + + query: str + max_results: int = 10 + + +class ExampleOutput(BaseModel): + """Example output model. + + EXAMPLE: Replace with your domain output structure. + """ + + status: str + result: str + metadata: dict[str, Any] + + +def load_prompt(prompt_file: str) -> dict[str, Any]: + """Load co-located prompt file. + + INFRASTRUCTURE: Always load prompts from YAML files, never hardcode them. + Co-locate prompts with the agents/services that use them. + + Args: + prompt_file: Name of .prompt.yaml file in same directory (e.g., "example_agent.prompt.yaml") + + Returns: + Dictionary containing prompt configuration (system_prompt, user_prompt, presets) + + Example: + .. code-block:: python + + prompt_config = load_prompt("example_agent.prompt.yaml") + system_prompt = prompt_config["system_prompt"] + user_prompt = prompt_config["user_prompt"].format(query="test") + """ + prompt_path = Path(__file__).parent / prompt_file + if not prompt_path.exists(): + raise FileNotFoundError(f"Prompt file not found: {prompt_path}") + return yaml.safe_load(prompt_path.read_text()) + + +def run_example_agent(input_data: ExampleInput) -> ExampleOutput: + """Execute example agent workflow. + + EXAMPLE: Replace with your domain-specific agent logic. + INFRASTRUCTURE: Keep the pattern: + 1. Load co-located prompt + 2. Validate input with Pydantic (from JSON schema) + 3. Execute workflow + 4. Return validated output + + Args: + input_data: Validated input model (generated from JSON schema) + + Returns: + Validated output model + + Example: + .. 
code-block:: python + + from {{cookiecutter.package_name}}.agents.example_agent import run_example_agent, ExampleInput + + result = run_example_agent( + ExampleInput(query="What is agentic AI?", max_results=5) + ) + print(result.status) # "completed" + """ + # INFRASTRUCTURE: Load co-located prompt + prompt_config = load_prompt("example_agent.prompt.yaml") + + # EXAMPLE: Your actual workflow logic would go here + # - Use prompt_config to build LLM request + # - Call external services (LLM, APIs, etc.) + # - Process results + # - Return validated output + + # For now, just demonstrate the pattern + system_prompt = prompt_config["system_prompt"] + user_prompt = prompt_config["user_prompt"].format(query=input_data.query) + + # Simulated processing (replace with real logic) + result = ExampleOutput( + status="completed", + result=f"Processed query: {input_data.query}", + metadata={ + "max_results": input_data.max_results, + "prompt_used": "example_agent.prompt.yaml", + "model": prompt_config.get("presets", {}).get("openai-gpt4", {}).get("model", "unknown"), + }, + ) + + return result + + +# EXAMPLE: Additional helper functions for your domain +# INFRASTRUCTURE: Keep functions small, testable, well-documented + + +def validate_agent_prerequisites() -> bool: + """Validate agent has required configuration and dependencies. + + INFRASTRUCTURE: Good pattern for startup validation. + + Returns: + True if prerequisites met, False otherwise + """ + # Check prompt file exists + prompt_path = Path(__file__).parent / "example_agent.prompt.yaml" + if not prompt_path.exists(): + return False + + # Add other prerequisite checks as needed + # - API keys present + # - Required services available + # - Configuration valid + + return True diff --git a/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/graphs/README.md b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/graphs/README.md new file mode 100644 index 0000000..5b68e6a --- /dev/null +++ b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/graphs/README.md @@ -0,0 +1,124 @@ +# Graphs - Workflow Orchestration + +**Status**: OPTIONAL + +## Purpose + +This directory contains workflow orchestration logic using `pydantic-ai` for type-safe, declarative multi-step workflows. 
+ +## Keep This Directory If + +Your Ring 0 MVP needs: + +- ✅ **Multi-step workflows** with branching logic +- ✅ **Complex dependencies** between workflow steps +- ✅ **Dynamic routing** based on runtime conditions +- ✅ **Type-safe workflow definitions** validated at parse time +- ✅ **Inspectable orchestration** for debugging and visualization + +## Delete This Directory If + +Your Ring 0 MVP has: + +- ❌ **Single agent** with no workflow orchestration +- ❌ **Simple linear flow** (step 1 → step 2 → step 3) +- ❌ **No branching** or conditional routing needed + +## What's Provided + +### `definitions.py` +- `GraphNode`: Atomic unit representing a workflow step +- `WorkflowGraph`: Declarative workflow definition with entrypoint and nodes +- `route()` method: Navigate the graph by node ID + +### `graph_agent.py` +- Pydantic AI agent for graph-based orchestration +- Typed dependencies (`GraphDependencies`) +- Structured output (`GraphNode`) +- Tool registration for graph navigation + +## Example Usage + +```python +from {{cookiecutter.package_name}}.graphs import ( + WorkflowGraph, + GraphNode, + build_graph_agent, + GraphDependencies +) + +# Define workflow declaratively +workflow = WorkflowGraph( + name="research_workflow", + entrypoint="query", + nodes=[ + GraphNode( + id="query", + description="Query knowledge base", + service="deepsearch_service", + next=["analyze"] + ), + GraphNode( + id="analyze", + description="Analyze results", + service="analysis_service", + next=["summarize"] + ), + GraphNode( + id="summarize", + description="Generate summary", + service="summary_service" + ), + ], +) + +# Execute with pydantic-ai agent +agent = build_graph_agent() +result = agent.run_sync( + "Navigate to next node", + deps=GraphDependencies(graph=workflow, current_node="query") +) +``` + +## When to Add This in Ring 0 + +**Add immediately if**: +- Your core value proposition involves multi-step orchestration +- You're building a workflow engine or pipeline system +- Branching logic is fundamental to your MVP + +**Defer to Ring 1+ if**: +- You can ship Ring 0 with a linear flow +- Orchestration complexity isn't core to initial value +- You're still discovering the workflow structure + +## Alternatives for Simple Cases + +For linear workflows, consider: + +```python +# Simple function composition (no graphs needed) +def simple_workflow(input_data: str) -> dict: + step1_result = query_service(input_data) + step2_result = analyze_service(step1_result) + return summarize_service(step2_result) +``` + +Only reach for graph orchestration when you have **proven need** for: +- Branching/conditional logic +- Dynamic routing +- Workflow reusability +- Complex dependencies + +## Architecture Notes + +- **Declarative**: Workflows defined as data (inspectable, serializable) +- **Type-safe**: Pydantic validation at definition time +- **Testable**: Mock individual nodes, test routing logic independently +- **Observable**: Easy to add logging, metrics at node transitions + +## See Also + +- `SCAFFOLD_GUIDE.md` - Full decision tree for optional components +- `CLAUDE.md` - Ring-based development philosophy +- [Pydantic AI Docs](https://github.com/pydantic/pydantic-ai) - Framework documentation diff --git a/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/graphs/__init__.py b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/graphs/__init__.py similarity index 100% rename from {{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/graphs/__init__.py 
rename to {{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/graphs/__init__.py diff --git a/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/graphs/definitions.py b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/graphs/definitions.py similarity index 76% rename from {{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/graphs/definitions.py rename to {{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/graphs/definitions.py index 3933985..7d10522 100644 --- a/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/graphs/definitions.py +++ b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/graphs/definitions.py @@ -1,4 +1,10 @@ -"""Pydantic-powered graph primitives for orchestrating workflows.""" +"""Pydantic-powered graph primitives for orchestrating workflows. + +OPTIONAL: Delete entire graphs/ directory if Ring 0 doesn't need workflow orchestration. +INFRASTRUCTURE: If you keep this, the pattern (Pydantic models, typed dependencies) is standard. + +See: src/{{cookiecutter.package_name}}/graphs/README.md for guidance on when to use. +""" from __future__ import annotations diff --git a/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/graphs/graph_agent.py b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/graphs/graph_agent.py similarity index 100% rename from {{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/graphs/graph_agent.py rename to {{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/graphs/graph_agent.py diff --git a/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/schemas/__init__.py b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/schemas/__init__.py similarity index 100% rename from {{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/schemas/__init__.py rename to {{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/schemas/__init__.py diff --git a/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/schemas/example_input.schema.json b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/schemas/example_input.schema.json new file mode 100644 index 0000000..a509017 --- /dev/null +++ b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/schemas/example_input.schema.json @@ -0,0 +1,28 @@ +{ + "$schema": "http://json-schema.org/draft-07/schema#", + "$id": "example_input.schema.json", + "$comment": "EXAMPLE: Replace with your domain-specific input schema. 
INFRASTRUCTURE: Always define schemas in separate JSON files, then generate Pydantic models programmatically using cellsem-llm-client utilities.", + "title": "ExampleInput", + "description": "Example input schema demonstrating schema-first design pattern", + "type": "object", + "properties": { + "query": { + "type": "string", + "description": "The user query to process", + "minLength": 1, + "examples": [ + "What is agentic AI?", + "Explain multi-agent systems" + ] + }, + "max_results": { + "type": "integer", + "description": "Maximum number of results to return", + "minimum": 1, + "maximum": 100, + "default": 10 + } + }, + "required": ["query"], + "additionalProperties": false +} diff --git a/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/schemas/workflow_output.schema.json b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/schemas/workflow_output.schema.json similarity index 85% rename from {{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/schemas/workflow_output.schema.json rename to {{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/schemas/workflow_output.schema.json index 0f145be..33e2dbf 100644 --- a/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/schemas/workflow_output.schema.json +++ b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/schemas/workflow_output.schema.json @@ -1,5 +1,6 @@ { "$schema": "https://json-schema.org/draft/2020-12/schema", + "$comment": "EXAMPLE: Replace with your domain-specific workflow output schema. INFRASTRUCTURE: Always define schemas in separate JSON files, then generate Pydantic models programmatically using cellsem-llm-client utilities.", "title": "WorkflowOutput", "description": "Canonical structured response produced by {{cookiecutter.project_name}} agents.", "type": "object", diff --git a/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/services/__init__.py b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/services/__init__.py similarity index 100% rename from {{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/services/__init__.py rename to {{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/services/__init__.py diff --git a/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/utils/__init__.py b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/utils/__init__.py similarity index 100% rename from {{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/utils/__init__.py rename to {{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/utils/__init__.py diff --git a/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/validation/README.md b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/validation/README.md new file mode 100644 index 0000000..1606914 --- /dev/null +++ b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/validation/README.md @@ -0,0 +1,148 @@ +# Validation - Cross-cutting Concerns + +**Status**: OPTIONAL (currently empty) + +## Purpose + +This directory is for validation logic that is: +- **Shared** across multiple services/agents +- **Cross-cutting** (not specific to one component) +- **Complex** enough to warrant 
centralization + +## Keep This Directory If + +Your Ring 0 MVP has: + +- ✅ **Complex business rules** used across multiple components +- ✅ **Schema validations** beyond simple Pydantic models +- ✅ **Data quality checks** applied to multiple data sources +- ✅ **Compliance validations** (HIPAA, GDPR, etc.) affecting multiple services + +## Delete This Directory If + +Your Ring 0 MVP has: + +- ❌ **Simple validations** handled by Pydantic models +- ❌ **Service-specific validation** (keep in service layer) +- ❌ **No duplicated validation logic** across components + +## Ring 0 Guidance + +**Most projects should DELETE this directory for Ring 0.** + +Why? Because: +1. You likely don't have duplicated validation logic yet +2. Premature abstraction adds complexity +3. Better to keep validation in service layer until patterns emerge + +**Add validation/ in Ring 1+** when you discover: +- Same validation logic copy-pasted across 2+ components +- Complex validation rules that deserve their own module +- Clear separation between business logic and validation needed + +## Example Use Cases (Ring 1+) + +### Service Registration Validation + +```python +# validation/service_registry.py +def ensure_services_registered( + service_names: list[str], + available: list[str] +) -> None: + """Validate all required services are registered.""" + missing = set(service_names) - set(available) + if missing: + raise ValueError(f"Missing services: {missing}") +``` + +### Workflow Output Validation + +```python +# validation/workflow_output.py +from jsonschema import validate +from {{cookiecutter.package_name}}.schemas import load_schema + +def validate_workflow_output(data: dict) -> None: + """Validate output against workflow_output.schema.json.""" + schema = load_schema("workflow_output.schema.json") + validate(instance=data, schema=schema) +``` + +### Cross-service Data Quality + +```python +# validation/data_quality.py +def validate_gene_list(genes: list[str]) -> list[str]: + """Validate and normalize gene symbols across services.""" + # Complex validation shared by multiple services + # - Format checking + # - Normalization + # - Duplicate removal + pass +``` + +## Alternative Approaches for Ring 0 + +### Option 1: Validation in Pydantic Models + +```python +# Simple validations in domain models +from pydantic import BaseModel, field_validator + +class GeneQuery(BaseModel): + genes: list[str] + + @field_validator('genes') + @classmethod + def validate_gene_format(cls, v: list[str]) -> list[str]: + # Validation logic here + return v +``` + +### Option 2: Validation in Service Layer + +```python +# Keep validation close to where it's used +class DeepSearchService: + def query(self, genes: list[str]) -> dict: + # Validate inputs + self._validate_genes(genes) + # Execute query + return self._execute_query(genes) + + def _validate_genes(self, genes: list[str]) -> None: + # Service-specific validation + pass +``` + +## When to Centralize Validation + +Wait until you see these patterns: + +1. **Duplication**: Same validation in 2+ places +2. **Complexity**: Validation logic is complex enough to test independently +3. **Reuse**: Multiple services need identical validation +4. **Compliance**: Regulatory requirements span multiple components + +## Decision Tree + +``` +Do you have validation logic used in 2+ components? +├─ NO → DELETE validation/ directory +│ Keep validation in Pydantic models or service methods +│ +└─ YES → Is it complex enough to test independently? 
+ ├─ NO → Consider extracting to shared utility first + │ Don't need full validation/ module yet + │ + └─ YES → Move to validation/ directory + Add comprehensive tests + Document validation rules +``` + +## See Also + +- `SCAFFOLD_GUIDE.md` - Full decision tree for optional components +- `CLAUDE.md` - Ring-based development philosophy (defer abstraction) +- [Pydantic Validators](https://docs.pydantic.dev/latest/concepts/validators/) - Built-in validation patterns diff --git a/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/validation/__init__.py b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/validation/__init__.py similarity index 100% rename from {{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/validation/__init__.py rename to {{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}/{{cookiecutter.package_name}}/validation/__init__.py diff --git a/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}_validation_tools/README.md b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}_validation_tools/README.md new file mode 100644 index 0000000..8c02d35 --- /dev/null +++ b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}_validation_tools/README.md @@ -0,0 +1,40 @@ +# {{cookiecutter.project_name}} - Validation Tools + +**Status**: OPTIONAL + +Validation and analysis tools for comparing runs, computing metrics, and visualizing results. + +## Delete This Package If + +Your Ring 0 MVP doesn't need: +- Workflow output comparison +- Quality metrics (precision, recall, etc.) +- Visualizations and analysis + +See `SCAFFOLD_GUIDE.md` for guidance. + +## Installation + +```bash +pip install {{cookiecutter.package_name}}-validation-tools +``` + +## Structure + +- **comparisons/** - Compare workflow outputs across runs +- **metrics/** - Quality metrics (precision, recall, F1, etc.) +- **visualizations/** - Plots, heatmaps, ROC curves + +## Usage + +This package imports schemas and models from the core package: + +```python +from {{cookiecutter.package_name}}.schemas import load_schema +from {{cookiecutter.package_name}}_validation_tools.metrics import calculate_f1 +from {{cookiecutter.package_name}}_validation_tools.visualizations import plot_heatmap +``` + +## Development + +This package is part of a UV workspace. See repository root for development instructions. 
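## Putting It Together (Illustrative)

A sketch of how the subpackages are intended to combine once you add your domain logic. The helper names (`compare_runs`, `calculate_metrics`) follow the subpackage READMEs and are placeholders, not shipped implementations; the file paths are examples only.

```python
# Illustrative only: these helpers are described in the subpackage READMEs
# and remain placeholders until implemented for your domain.
import json
from pathlib import Path

from {{cookiecutter.package_name}}_validation_tools.comparisons import compare_runs
from {{cookiecutter.package_name}}_validation_tools.metrics import calculate_metrics

# Diff a candidate run against a baseline run
comparison = compare_runs(
    run1_output="outputs/baseline/final.json",
    run2_output="outputs/candidate/final.json",
)
print(f"Similarity: {comparison.similarity_score:.2f}")

# Score the candidate run against a gold standard
candidate = json.loads(Path("outputs/candidate/final.json").read_text())
gold = json.loads(Path("data/gold_standard.json").read_text())
metrics = calculate_metrics(predictions=candidate, ground_truth=gold)
print(f"F1: {metrics.f1:.3f}")
```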
diff --git a/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}_validation_tools/pyproject.toml b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}_validation_tools/pyproject.toml new file mode 100644 index 0000000..5ca0389 --- /dev/null +++ b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}_validation_tools/pyproject.toml @@ -0,0 +1,27 @@ +[project] +name = "{{cookiecutter.package_name}}-validation-tools" +version = "0.1.0" +description = "Validation and analysis tools for {{cookiecutter.project_name}}" +readme = "README.md" +requires-python = ">= {{cookiecutter.python_version}}" +dependencies = [ + "{{cookiecutter.package_name}}", + "pandas>=2.0.0", + "numpy>=1.24.0", + "matplotlib>=3.7.0", + "seaborn>=0.12.0", + "scikit-learn>=1.3.0", +] + +[project.optional-dependencies] +dev = [ + "pytest>=8.0.0", + "pytest-cov>=4.1.0", +] + +[build-system] +requires = ["setuptools>=68.0"] +build-backend = "setuptools.build_meta" + +[tool.setuptools.packages.find] +where = ["."] diff --git a/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}_validation_tools/{{cookiecutter.package_name}}_validation_tools/__init__.py b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}_validation_tools/{{cookiecutter.package_name}}_validation_tools/__init__.py new file mode 100644 index 0000000..9b4191f --- /dev/null +++ b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}_validation_tools/{{cookiecutter.package_name}}_validation_tools/__init__.py @@ -0,0 +1,11 @@ +"""Validation and analysis tools for {{cookiecutter.project_name}}. + +OPTIONAL: Delete this entire package if not needed for your Ring 0 MVP. + +This package provides tools for validating, comparing, and analyzing +workflow outputs from the {{cookiecutter.package_name}} core package. + +Schemas are imported from the core package - no duplication. +""" + +__version__ = "0.1.0" diff --git a/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}_validation_tools/{{cookiecutter.package_name}}_validation_tools/comparisons/README.md b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}_validation_tools/{{cookiecutter.package_name}}_validation_tools/comparisons/README.md new file mode 100644 index 0000000..71ac22a --- /dev/null +++ b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}_validation_tools/{{cookiecutter.package_name}}_validation_tools/comparisons/README.md @@ -0,0 +1,24 @@ +# Comparisons + +Tools for comparing workflow runs and outputs. 
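The `compare_runs` helper used in the example below is not implemented by the scaffold. A minimal sketch, assuming workflow outputs are flat JSON objects and a key-by-key diff is sufficient, could look like:

```python
# Minimal sketch of compare_runs (illustrative; replace with domain logic).
# Assumption: each output file contains a flat, top-level JSON object.
import json
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class ComparisonResult:
    differences: dict[str, tuple[object, object]] = field(default_factory=dict)
    similarity_score: float = 0.0


def compare_runs(run1_output: str, run2_output: str) -> ComparisonResult:
    """Compare two workflow output files key by key."""
    run1 = json.loads(Path(run1_output).read_text())
    run2 = json.loads(Path(run2_output).read_text())

    keys = set(run1) | set(run2)
    differences = {
        key: (run1.get(key), run2.get(key))
        for key in keys
        if run1.get(key) != run2.get(key)
    }
    # Fraction of keys that agree; 1.0 when both outputs are empty
    similarity = (1.0 - len(differences) / len(keys)) if keys else 1.0
    return ComparisonResult(differences=differences, similarity_score=similarity)
```

Replace the naive key diff with comparison logic appropriate to your domain (for example, field-level tolerances or set comparison of annotations).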
+ +## Use Cases + +- Compare results from different workflow versions +- Diff analysis between parameter configurations +- Side-by-side output comparison +- Regression testing (current vs baseline) + +## Example + +```python +from {{cookiecutter.package_name}}_validation_tools.comparisons import compare_runs + +result = compare_runs( + run1_output="outputs/run1/final.json", + run2_output="outputs/run2/final.json" +) + +print(result.differences) +print(result.similarity_score) +``` diff --git a/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}_validation_tools/{{cookiecutter.package_name}}_validation_tools/comparisons/__init__.py b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}_validation_tools/{{cookiecutter.package_name}}_validation_tools/comparisons/__init__.py new file mode 100644 index 0000000..c841ff4 --- /dev/null +++ b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}_validation_tools/{{cookiecutter.package_name}}_validation_tools/comparisons/__init__.py @@ -0,0 +1,9 @@ +"""Tools for comparing workflow runs and outputs. + +EXAMPLE: Add your domain-specific comparison logic here. + +Common patterns: +- Compare outputs from different workflow runs +- Diff analysis between versions +- Side-by-side result comparison +""" diff --git a/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}_validation_tools/{{cookiecutter.package_name}}_validation_tools/metrics/README.md b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}_validation_tools/{{cookiecutter.package_name}}_validation_tools/metrics/README.md new file mode 100644 index 0000000..8b59045 --- /dev/null +++ b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}_validation_tools/{{cookiecutter.package_name}}_validation_tools/metrics/README.md @@ -0,0 +1,27 @@ +# Metrics + +Quality metrics for evaluating workflow performance. + +## Common Metrics + +- **Precision**: True positives / (True positives + False positives) +- **Recall**: True positives / (True positives + False negatives) +- **F1 Score**: Harmonic mean of precision and recall +- **Accuracy**: Correct predictions / Total predictions +- **ROC-AUC**: Area under ROC curve +- **PR-AUC**: Area under precision-recall curve + +## Example + +```python +from {{cookiecutter.package_name}}_validation_tools.metrics import calculate_metrics + +metrics = calculate_metrics( + predictions=workflow_output, + ground_truth=gold_standard +) + +print(f"Precision: {metrics.precision:.3f}") +print(f"Recall: {metrics.recall:.3f}") +print(f"F1: {metrics.f1:.3f}") +``` diff --git a/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}_validation_tools/{{cookiecutter.package_name}}_validation_tools/metrics/__init__.py b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}_validation_tools/{{cookiecutter.package_name}}_validation_tools/metrics/__init__.py new file mode 100644 index 0000000..b564f5d --- /dev/null +++ b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}_validation_tools/{{cookiecutter.package_name}}_validation_tools/metrics/__init__.py @@ -0,0 +1,10 @@ +"""Metrics for evaluating workflow quality. + +EXAMPLE: Add your domain-specific metrics here. 
+ +Common metrics: +- Precision, recall, F1 score +- Accuracy, specificity, sensitivity +- ROC-AUC, PR-AUC +- Custom domain metrics +""" diff --git a/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}_validation_tools/{{cookiecutter.package_name}}_validation_tools/visualizations/README.md b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}_validation_tools/{{cookiecutter.package_name}}_validation_tools/visualizations/README.md new file mode 100644 index 0000000..47310c1 --- /dev/null +++ b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}_validation_tools/{{cookiecutter.package_name}}_validation_tools/visualizations/README.md @@ -0,0 +1,26 @@ +# Visualizations + +Analysis visualizations for workflow results. + +## Available Visualizations + +- **Heatmaps**: Result matrices, correlation analysis +- **ROC Curves**: Classification performance +- **Precision-Recall Curves**: Threshold analysis +- **Confusion Matrices**: Classification breakdown +- **Time Series**: Performance over time +- **Comparison Charts**: Side-by-side analysis + +## Example + +```python +from {{cookiecutter.package_name}}_validation_tools.visualizations import plot_roc_curve + +fig = plot_roc_curve( + y_true=ground_truth, + y_scores=predictions, + title="Workflow Performance" +) + +fig.savefig("outputs/roc_curve.png") +``` diff --git a/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}_validation_tools/{{cookiecutter.package_name}}_validation_tools/visualizations/__init__.py b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}_validation_tools/{{cookiecutter.package_name}}_validation_tools/visualizations/__init__.py new file mode 100644 index 0000000..22fe86d --- /dev/null +++ b/{{cookiecutter.project_slug}}/src/{{cookiecutter.package_name}}_validation_tools/{{cookiecutter.package_name}}_validation_tools/visualizations/__init__.py @@ -0,0 +1,12 @@ +"""Visualizations for workflow analysis. + +EXAMPLE: Add your domain-specific visualizations here. + +Common visualizations: +- Heatmaps of results +- ROC curves +- Confusion matrices +- Precision-recall curves +- Time series plots +- Comparison charts +"""
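# --- Illustrative sketch (not part of the scaffold contract) -----------------
# The README for this subpackage references a plot_roc_curve helper; the sketch
# below shows one way it could look. It assumes binary labels plus probability
# scores, and uses matplotlib and scikit-learn, both declared dependencies of
# this package. Replace or extend for your domain.

import matplotlib.pyplot as plt
from matplotlib.figure import Figure
from sklearn.metrics import auc, roc_curve


def plot_roc_curve(y_true, y_scores, title: str = "ROC Curve") -> Figure:
    """Plot a ROC curve for binary predictions and return the figure."""
    fpr, tpr, _ = roc_curve(y_true, y_scores)
    roc_auc = auc(fpr, tpr)

    fig, ax = plt.subplots()
    ax.plot(fpr, tpr, label=f"AUC = {roc_auc:.3f}")
    ax.plot([0, 1], [0, 1], linestyle="--", label="Chance")
    ax.set_xlabel("False positive rate")
    ax.set_ylabel("True positive rate")
    ax.set_title(title)
    ax.legend(loc="lower right")
    return fig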