Principled-Evolution · kmadan · May 14, 2026 · Apr 29, 2025 · May 27, 2025 · Jun 17, 2025
diff --git a/AGENTS.md b/AGENTS.md
@@ -0,0 +1,87 @@
+# Agent Instructions — AICertify
+
+This file is the canonical operational guide for AI coding agents working in this repository (Claude Code, Cursor, Codex, Windsurf, Gemini CLI, Copilot, etc.). Tool-specific files (`CLAUDE.md`, `GEMINI.md`) inherit from this and add only platform-specific notes.
+
+## What this project is
+
+**AICertify** is a Python framework that evaluates AI applications against regulatory frameworks using [Open Policy Agent (OPA)](https://www.openpolicyagent.org/) policies sourced from the sister project [gopal](https://github.com/Principled-Evolution/gopal). It produces audit-ready compliance reports in PDF, Markdown, JSON, and HTML.
+
+The user-facing surface is:
+
+1. **Python API** — `aicertify.application`, `aicertify.regulations`, evaluator classes
+2. **CLI** — `python -m aicertify.cli`
+3. **Reports** — generated artifacts in `examples/outputs/` you can show to an auditor
+
+## Repository layout
+
+```
+aicertify/                          Python package
+├── __init__.py                     Public API (re-exports the surface a user sees)
+├── cli.py                          Argparse CLI entry — see "Useful commands" below
+├── application.py / regulations.py User-facing fluent API
+├── api.py / contract_models.py     Contract data model (Pydantic)
+├── evaluators/                     Pluggable evaluators (Fairness, ContentSafety, …)
+├── opa_policies/                   Vendored Rego policy tree (mirrors gopal layout)
+│   ├── global/v1/                  Cross-cutting categories
+│   ├── international/              EU AI Act, NIST, India, …
+│   ├── industry_specific/          Healthcare, BFS, Automotive, Aviation, …
+│   └── operational/                AIOps, cost, corporate
+├── report_generation/              ReportLab-based PDF + Markdown + HTML + JSON writers
+└── assets/                         Logos, images
+examples/
+├── quickstart.py                   End-to-end demo — the canonical "does this work?" script
+├── sample_contract.json            Contract structure reference
+└── outputs/                        Pre-generated reports for inspection
+tests/                              pytest tests
+pyproject.toml                      Poetry-managed project metadata
+```
+
+## Useful commands
+
+```bash
+# Install (editable)
+pip install -e .
+
+# Run the end-to-end demo (writes reports/ into the repo)
+python examples/quickstart.py
+
+# CLI evaluation
+python -m aicertify.cli \
+  --contract examples/sample_contract.json \
+  --policy aicertify/opa_policies/international/eu_ai_act/v1 \
+  --report-format pdf \
+  --output-dir reports/
+
+# Tests
+pytest tests/ -v
+
+# Lint / format
+ruff check aicertify/
+black aicertify/
+```
+
+## Conventions
+
+- **Python 3.12 only** (see `requires-python` in `pyproject.toml`). Do not use syntax or standard-library features that require 3.13 or later.
+- **Line length 88** (black + ruff default). Don't change this.
+- **Pydantic v2** — when defining models, use `model_config` not `class Config`.
+- **OPA policies** — do not hand-edit `.rego` files under `aicertify/opa_policies/` (including its subdirectories) without first considering whether the change should land upstream in [gopal](https://github.com/Principled-Evolution/gopal). They are a vendored copy.
+- **Reports are user-facing legal artifacts** — when changing report generation, run the quickstart and inspect the output PDF before claiming success.
+- **Avoid breaking the public API** — anything re-exported from `aicertify/__init__.py` is part of the user-facing contract. If you must break it, bump the version and record the change in `CHANGELOG.md`.
+
+## What NOT to do
+
+- Don't pin new dependencies aggressively — many users will install AICertify alongside their own stack; tight pins create conflicts.
+- Don't replace OPA with another policy engine. The whole value proposition is policy-as-code in Rego.
+- Don't commit `reports/` or `examples/outputs/` content unless deliberately updating the canonical samples.
+- Don't introduce calls to closed-source SaaS services (Credo AI, etc.) — this is a self-hostable, on-prem-capable framework.
+- Don't add a feature that requires `python >= 3.13` — the project is intentionally pinned to 3.12.
+
+## Sister projects
+
+- **[gopal](https://github.com/Principled-Evolution/gopal)** — the upstream Rego policy library. If you find yourself adding a new framework, contribute the `.rego` there, then vendor it here.
+- **[Open Policy Agent](https://www.openpolicyagent.org/)** — the engine.
+
+## Conservatism
+
+The author prefers **surgical changes**: do only what was asked, present the plan first when there's any ambiguity, and ask before introducing new abstractions. Critique your own design once for elegance, DRY, KISS, and explainability before presenting it.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -0,0 +1,43 @@
+# Changelog
+
+All notable changes to **AICertify** are documented in this file.
+
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
+and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+
+## [Unreleased]
+
+### Added
+- Centered HTML hero, ordered badge wall, value-prop tagline, and five programmatically generated marketing diagrams in the README (regulatory coverage, architecture, comparison vs Fairlearn / AIF360 / Microsoft RAI Toolbox / Credo AI, report anatomy, end-to-end flow).
+- `AGENTS.md` and `CLAUDE.md` — operational instructions for AI coding agents working in this repository.
+- `skills/` directory with four Claude Code skills: `run-compliance-check`, `evaluate-contract`, `explain-regulation`, `draft-policy`. Each is available as a slash command once copied into `~/.claude/skills/`.
+- Comparison table vs Fairlearn / IBM AI Fairness 360 / Microsoft RAI Toolbox / Credo AI in the README.
+- `diagrams/generate_diagrams.py` — reproducible matplotlib script that regenerates every marketing PNG.
+
+### Changed
+- README rewritten for product-page clarity: value-prop first, then quickstart, then differentiation, then coverage.
+
+## [0.7.0] - 2025-04
+
+### Added
+- Reporting subsystem (`aicertify.report_generation`) producing audit-ready artifacts in PDF (via ReportLab), Markdown, JSON, and HTML.
+- Quickstart example (`examples/quickstart.py`) wiring sample interactions through the EU AI Act policy set and emitting a full report.
+- Pluggable evaluator classes — `FairnessEvaluator`, `ContentSafetyEvaluator`, `RiskManagementEvaluator`, `ComplianceEvaluator`.
+- Sample pre-generated reports under `examples/outputs/` for EU AI Act, loan evaluation, and medical diagnosis use cases.
+
+### Changed
+- OPA policies migrated to the standalone [gopal](https://github.com/Principled-Evolution/gopal) library; AICertify now vendors the policy tree under `aicertify/opa_policies/` via Git submodule.
+- Enhanced logging across the evaluation pipeline.
+- Pre-commit hooks: `ruff`, `black`, security checks.
+
+### Fixed
+- Security: bumped `protobuf` to 5.29.5 and `pycares` to 4.9.0 to resolve security advisories.
+- Security: bumped `transformers` and `setuptools` to resolve security alerts.
+- Auto-labeling workflow no longer produces excessive labels.
+
+## Earlier history
+
+For changes prior to 0.7.0, see the [Git log](https://github.com/Principled-Evolution/aicertify/commits/main).
+
+[Unreleased]: https://github.com/Principled-Evolution/aicertify/compare/v0.7.0...HEAD
+[0.7.0]: https://github.com/Principled-Evolution/aicertify/releases/tag/v0.7.0
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -0,0 +1,51 @@
+# Claude Code — Project Context for AICertify
+
+> **Read [AGENTS.md](AGENTS.md) first.** This file inherits all of those instructions and only adds Claude Code-specific notes.
+
+## What you're working on
+
+**AICertify** evaluates AI applications against regulatory frameworks (EU AI Act, NIST AI RMF, +13 more) using [OPA](https://www.openpolicyagent.org/) policies from [gopal](https://github.com/Principled-Evolution/gopal). The deliverable is an audit-ready compliance report (PDF / Markdown / JSON / HTML).
+
+## Fast orientation
+
+When the user asks a question about the codebase, consult these sources in order:
+
+1. **For "how does the eval flow work?"** → read [examples/quickstart.py](examples/quickstart.py). It's the canonical user path.
+2. **For "what's covered by regulation X?"** → look in `aicertify/opa_policies/<domain>/<framework>/v1/`.
+3. **For "what does a report look like?"** → open one in `examples/outputs/`.
+4. **For "what's the public Python API?"** → read `aicertify/__init__.py` — it's the surface contract.
+
+## Conventions Claude Code should respect
+
+- **Use TodoWrite** for multi-step tasks (this is the user's preferred tracking).
+- **Run the quickstart** before claiming a feature works end-to-end. Type checks and tests verify code correctness; the quickstart verifies *user-facing* correctness.
+- **When editing OPA policies in `aicertify/opa_policies/`**, remind the user that these are vendored from [gopal](https://github.com/Principled-Evolution/gopal) and the upstream copy should usually be edited first.
+- **Reports** — when changing report templates, open the generated PDF (`examples/outputs/eu_ai_act/`) and inspect it visually. Don't assume rendering correctness from the code alone.
+
+## Useful skills
+
+This project ships Claude Code skills in [`skills/`](skills/). To register them, copy them into your Claude Code skills directory:
+
+```bash
+mkdir -p ~/.claude/skills && cp -r skills/* ~/.claude/skills/
+```
+
+Available slash commands once registered:
+
+- `/run-compliance-check` — Run the end-to-end quickstart and surface the generated report
+- `/evaluate-contract` — Evaluate a user-supplied contract JSON against any supported framework
+- `/explain-regulation` — Walk every policy in a framework directory and explain what it checks
+- `/draft-policy` — Scaffold a new Rego policy with metadata, default rule, and a test sibling
+
+See [`skills/README.md`](skills/README.md) for the full list and conventions.
+
+## What NOT to do in this repo
+
+- Don't run `python examples/quickstart.py` repeatedly without cleaning `reports/` — the directory accumulates artifacts.
+- Don't add Python ≥3.13 syntax (project pins to 3.12).
+- Don't switch package managers. Poetry is canonical; `pip install -e .` also works for users.
+- Don't claim MCP-compatibility yet — an MCP server is on the roadmap, not shipped.
+
+## Session etiquette
+
+The author prefers terse, surgical responses. Present the plan first when ambiguous; ask before introducing new abstractions.