AI Engineering Framework

Production-grade patterns, templates, and methodologies for building AI-powered systems — from Claude.md design and agent orchestration through governance, observability, and enterprise rollout.

Repository Status

Field	Value
Version	v2.0.0
Last updated	May 2026
Content	52 files · 11 directories
Status	🟢 Stable — ready for use

Repository Identity

What this is: A reusable knowledge base for teams building AI systems. Covers Claude.md design, agent orchestration, multi-stage pipelines, LLM routing, token optimisation, evaluation, governance, and enterprise rollout. Vendor-neutral throughout.

What this is not: A product, SaaS, or managed service. It is a collection of tested patterns and copy-paste templates.

Primary use case: Projects that ingest unstructured text, extract structured knowledge, and expose it through interactive interfaces.

Secondary use case: Any project using Claude Code with multi-agent orchestration, regardless of domain.

Reading Paths

Starting a New AI Project

templates/claude-md-template.md — set up your project's CLAUDE.md
frameworks/claude-md/DESIGN_GUIDE.md — understand what goes where
architecture/SYSTEM_DESIGN.md — reference architecture and deployment topologies
architecture/PIPELINE_PATTERNS.md — how to structure your pipeline

Designing a Multi-Agent System

frameworks/agents/MULTI_AGENT_DESIGN.md — coordination topologies, memory architectures, failure modes
frameworks/agents/AGENT_FRONTMATTER_SPEC.md — official YAML schema
templates/agent-template.md — copy-paste starting point
frameworks/agents/examples/example-agent.md — annotated example

Optimising Token Costs and Session Quality

operations/TOKEN_OPTIMISATION.md — how tokens work and where projects waste them
operations/COST_MODELLING.md — estimating API spend with worked examples
operations/CONTEXT_MANAGEMENT.md — GREEN/YELLOW/RED/BLACK budget tiers
operations/SESSION_PROTOCOLS.md — session start, end, and recovery

Understanding Claude.md Best Practices

frameworks/claude-md/DESIGN_GUIDE.md — from monolith to modular hub
frameworks/claude-md/RESTRUCTURE_METHODOLOGY.md — 4-phase migration plan
frameworks/claude-md/examples/monolithic-before.md — what not to do
frameworks/claude-md/examples/restructured-after.md — the target state

Governing and Operating AI Systems

governance/AI_GOVERNANCE_FRAMEWORK.md — accountability structures, maturity model, risk tiers
governance/RISK_ASSESSMENT.md — four-quadrant risk taxonomy and mitigation patterns
governance/RESPONSIBLE_AI_CHECKLIST.md — pre-deployment, operational, and periodic review checklists
templates/governance-review-template.md — governance review template

Evaluating and Observing AI Quality

observability/EVALUATION_FRAMEWORK.md — what to evaluate, evaluation strategies, red teaming
observability/OBSERVABILITY_PATTERNS.md — instrumentation, logging, alerting, dashboards
observability/QUALITY_ASSURANCE.md — test pyramid, prompt testing, CI/CD integration, canary deployment
templates/evaluation-template.md — evaluation plan template

Repository Structure

Directory	What it contains	When to use it
`architecture/`	System patterns: pipelines, LLM routing, data contracts, graph construction	When designing how data flows
`design/`	Visual language, colour systems, interaction patterns for data-dense interfaces	When building a visualisation or dashboard UI
`docs/`	Deep analyses: Claude.md comparison, extracted patterns, integration guides	When you need the reasoning behind framework decisions
`enterprise/`	Operating models, rollout playbook, scaling patterns	When deploying AI at enterprise scale
`frameworks/`	Methodologies: Claude.md design, agents, skills, rules	When designing your AI system's control layer
`governance/`	AI governance framework, risk assessment, responsible AI checklist	When making AI systems accountable and auditable
`observability/`	Evaluation framework, observability patterns, quality assurance	When measuring and monitoring AI quality
`operations/`	Running guides: token budgets, session protocols, cost modelling	When optimising running costs and session quality
`research/`	Research maps, agent comparisons, database evaluations	When making technology selection decisions
`templates/`	Copy-paste starting points for every key artefact	When starting a new agent, skill, rule, or plan

File Index

architecture/

File	Purpose
`DATA_CONTRACTS.md`	Schema contracts, state machines, structured outputs, schema evolution, and contract testing
`ENTITY_RESOLUTION.md`	Merging entity mentions into canonical entities; blocking strategies, evaluation, incremental resolution
`GRAPH_CONSTRUCTION.md`	Vendor-neutral knowledge graph construction: property graph vs. RDF, construction pipeline, link analysis
`LLM_ROUTING.md`	Multi-model task assignment, dynamic routing, capability dispatch, cost-quality Pareto frontier
`PIPELINE_PATTERNS.md`	Text-to-graph pipeline stages, event-driven patterns, idempotency, SLO design
`SYSTEM_DESIGN.md`	Reference architecture, deployment topologies, failure domains, synchronous/asynchronous processing
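
As a rough illustration of the multi-model task assignment that `LLM_ROUTING.md` covers, the sketch below picks the cheapest model that is capable enough for a task. It is not the method defined in that file: the `ModelOption` fields, the integer capability scale, and the `route` function are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str
    capability: int            # hypothetical scale: higher means stronger reasoning
    cost_per_1k_tokens: float  # hypothetical price; 0.0 for a local model


def route(task_difficulty: int, models: list[ModelOption]) -> ModelOption:
    """Pick the cheapest model whose capability meets the task difficulty."""
    capable = [m for m in models if m.capability >= task_difficulty]
    if not capable:
        raise ValueError("no model is capable enough for this task")
    return min(capable, key=lambda m: m.cost_per_1k_tokens)
```

With a list containing a free local model and two paid models, `route(1, models)` returns the local model, and only tasks with a higher difficulty reach the paid ones.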

design/

File	Purpose
`COLOUR_SYSTEMS.md`	Three-tier token architecture, functional colour assignments, dark/light mode, accessibility checklist
`INTERACTION_PATTERNS.md`	State machines, node/edge interaction states, temporal navigation, performance budgets
`VISUAL_LANGUAGE.md`	Design philosophy, layered canvas architecture, visual grammar, information hierarchy

docs/

File	Purpose
`CLAUDE_MD_COMPARATIVE_ANALYSIS.md`	Deep comparison of five production CLAUDE.md files across three project archetypes
`EXTRACTED_PATTERNS.md`	47 patterns from production Claude.md files for direct reuse
`INTEGRATION_ANALYSIS.md`	Compatibility-aware improvement plan cross-referencing multiple AI framework resources
`PRODUCT_ARCHITECTURE_GUIDE.md`	Claude API, entity resolution, holistic system view, model orchestration, learning path

enterprise/

File	Purpose
`OPERATING_MODEL.md`	Team structure archetypes, roles, Centre of Excellence, decision rights matrix, cost allocation
`ROLLOUT_PLAYBOOK.md`	Six-phase deployment playbook from discovery to optimisation, with exit criteria and rollback patterns
`SCALING_PATTERNS.md`	Infrastructure, data, team, quality, cost, and organisational adoption scaling

frameworks/

File	Purpose
`agents/AGENT_FRONTMATTER_SPEC.md`	All official YAML frontmatter fields for Claude Code agent files
`agents/MULTI_AGENT_DESIGN.md`	Automation tiers, coordination topologies, memory architectures, failure modes, agent evaluation
`agents/examples/example-agent.md`	Annotated agent definition with all fields explained
`claude-md/DESIGN_GUIDE.md`	Official guidance for designing production-grade CLAUDE.md files
`claude-md/RESTRUCTURE_METHODOLOGY.md`	Phase-by-phase migration from monolithic to lean hub
`claude-md/examples/monolithic-before.md`	Annotated over-grown CLAUDE.md — the anti-pattern
`claude-md/examples/restructured-after.md`	Same CLAUDE.md after restructure — the target state
`rules/RULE_DESIGN.md`	Path-scoped constraints that load automatically on matching file paths
`rules/examples/example-rule.md`	Annotated rendering rule with all constraints explained
`skills/SKILL_DESIGN.md`	When and how to create skills; rigid vs. flexible distinction; migration pattern
`skills/examples/example-skill.md`	Complete session-end skill with all steps annotated

governance/

File	Purpose
`AI_GOVERNANCE_FRAMEWORK.md`	Accountability structures, control mechanisms, governance maturity model, risk tiers
`RESPONSIBLE_AI_CHECKLIST.md`	Pre-deployment, operational, and periodic review checklists
`RISK_ASSESSMENT.md`	Four-quadrant risk taxonomy, risk rating matrix, mitigation patterns, risk register format

observability/

File	Purpose
`EVALUATION_FRAMEWORK.md`	What to evaluate, evaluation strategies, ground truth, metrics taxonomy, red teaming
`OBSERVABILITY_PATTERNS.md`	Instrumentation, structured logging, distributed tracing, alerting, dashboard patterns
`QUALITY_ASSURANCE.md`	AI test pyramid, prompt testing, output quality gates, CI/CD integration, canary deployment

operations/

File	Purpose
`CONTEXT_MANAGEMENT.md`	Context window economics, GREEN/YELLOW/RED/BLACK budget model, splitting strategies
`COST_MODELLING.md`	API spend estimation, pipeline cost breakdown, monthly projections
`SESSION_PROTOCOLS.md`	Session start, end, recovery, and logging protocols
`TOKEN_OPTIMISATION.md`	How tokens work, where projects waste them, compression strategies

research/

File	Purpose
`GRAPH_DATABASE_COMPARISON.md`	Comparison of 7 graph database systems for knowledge graph applications
`KNOWLEDGE_GRAPH_RESEARCH_MAP.md`	Vendor-neutral research map for enterprise knowledge graph systems
`RESEARCH_AGENT_COMPARISON.md`	Comparison of AI research agents: GPT Researcher, Gemini Deep Research, Perplexity, Claude

templates/

File	Purpose
`agent-template.md`	Blank agent with all YAML frontmatter fields
`claude-md-template.md`	Fill-in-the-blanks CLAUDE.md starter
`data-contract-template.md`	Schema contract between pipeline stages
`evaluation-template.md`	Evaluation plan with metrics, ground truth, and results log
`governance-review-template.md`	Governance review covering controls, quality metrics, risk register
`handoff-envelope-template.md`	Context-transfer format between agents or sessions
`risk-triage-template.md`	RED/YELLOW/GREEN feature classification before implementation
`rollout-plan-template.md`	Six-phase rollout plan with success metrics and phase log
`rule-template.md`	Blank path-scoped rule
`session-end-template.md`	Mandatory session-end checklist with 7-step protocol
`session-start-template.md`	Mandatory session-start checklist
`skill-template.md`	Blank skill with frontmatter and step structure
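
To make the handoff envelope listed above more concrete, here is a minimal sketch of one way to build such an envelope and render it as YAML. The field names (`from_agent`, `to_agent`, `state`, `assumptions`, `next_steps`) and the `HandoffEnvelope` class are hypothetical and may not match `handoff-envelope-template.md`; the rendering uses only the standard library and relies on the fact that JSON double-quoted strings are also valid YAML scalars.

```python
import json
from dataclasses import dataclass, field


@dataclass
class HandoffEnvelope:
    """Hypothetical envelope; field names are assumptions for illustration."""
    from_agent: str
    to_agent: str
    state: str
    assumptions: list[str] = field(default_factory=list)
    next_steps: list[str] = field(default_factory=list)

    def to_yaml(self) -> str:
        # json.dumps yields double-quoted scalars, which are also valid YAML.
        lines = [
            f"from_agent: {json.dumps(self.from_agent)}",
            f"to_agent: {json.dumps(self.to_agent)}",
            f"state: {json.dumps(self.state)}",
        ]
        for key, items in (("assumptions", self.assumptions),
                           ("next_steps", self.next_steps)):
            if items:
                lines.append(f"{key}:")
                lines.extend(f"  - {json.dumps(item)}" for item in items)
            else:
                lines.append(f"{key}: []")
        return "\n".join(lines) + "\n"
```

Quoting every scalar is a deliberate simplification: it avoids YAML's implicit typing (for example, an unquoted `no` being read as a boolean) at the cost of slightly noisier output.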

Key Principles

Facts in Claude.md, procedures in skills. Claude.md holds what every session needs; skills load on demand. Keep CLAUDE.md under 200 lines.
Path-scoped rules. Rules in .claude/rules/ load only when matching files are open, so they consume no context the rest of the time.
Local-first model routing. Use the cheapest model that can do the job. Reserve paid, higher-capability models for tasks that require complex reasoning.
Extract first, remove second. Create the replacement before deleting the original. Never leave a capability gap.
Token budget zones. GREEN → YELLOW → RED → BLACK. Monitor and act at each threshold.
Handoff envelopes prevent context loss. Structured YAML between agents or sessions preserves assumptions, state, and next steps.
Hard bans with incident provenance. Each non-negotiable rule traces to a real failure. Rules without incidents are preferences, not bans.
Governance is not optional. Every production AI system needs defined accountability, a risk register, and an audit trail. Size the governance to the risk tier.
Evaluation before deployment. Never deploy a model or prompt change without running the evaluation suite. Quality is a CI gate, not an afterthought.
Roll out in phases. Discovery → POC → Pilot → Limited Production → Full Production → Optimisation. Each phase validates the next phase's assumptions.
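
The token budget zones principle can be sketched as a small classifier. This is an illustration only, not the model defined in `operations/CONTEXT_MANAGEMENT.md`: the threshold fractions (0.50, 0.75, 0.90) and the names `Zone` and `classify_zone` are assumptions chosen for the example.

```python
from enum import Enum


class Zone(Enum):
    GREEN = "GREEN"
    YELLOW = "YELLOW"
    RED = "RED"
    BLACK = "BLACK"


# Hypothetical thresholds: fraction of the context window at which each
# zone begins, checked from most to least severe. Substitute the values
# your project actually uses.
THRESHOLDS = [(0.90, Zone.BLACK), (0.75, Zone.RED), (0.50, Zone.YELLOW)]


def classify_zone(tokens_used: int, context_window: int) -> Zone:
    """Map current token usage to a budget zone."""
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    if tokens_used < 0:
        raise ValueError("tokens_used must not be negative")
    fraction = tokens_used / context_window
    for lower_bound, zone in THRESHOLDS:
        if fraction >= lower_bound:
            return zone
    return Zone.GREEN
```

A session loop could call `classify_zone` after each turn and trigger whatever action the project attaches to a zone change, such as summarising or handing off.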

Hard Rules (for this repository)

NO project-specific references — all examples use generic [PLACEHOLDER] syntax
NO personal identifiers (names, emails, org names) in any content file
NO hardcoded live URLs that can go stale — reference by description instead
NO content that describes a specific real project
NO shortening or summarising source material — preserve full technical substance
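
Two of these rules lend themselves to an automated check. The sketch below scans text for email addresses and hardcoded URLs. The function name `find_violations` and both regular expressions are illustrative assumptions, not tooling that ships with this repository, and the patterns are deliberately simple, so they will miss edge cases and cannot detect names or organisation names.

```python
import re

# Deliberately simple patterns; both are assumptions for illustration.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
URL_RE = re.compile(r"https?://\S+")


def find_violations(text: str) -> list[tuple[int, str, str]]:
    """Return (line_number, rule, match) for each suspected violation."""
    violations = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in EMAIL_RE.finditer(line):
            violations.append((line_number, "personal identifier", match.group()))
        for match in URL_RE.finditer(line):
            violations.append((line_number, "hardcoded URL", match.group()))
    return violations
```

Text that uses only `[PLACEHOLDER]` syntax produces an empty list, so a check like this could run over content files before a change is merged.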
