Merged
2 changes: 1 addition & 1 deletion .well-known/agents-shipgate.json
@@ -31,7 +31,7 @@
"trust_model": "static_by_default",
"schemas": {
"manifest": "https://raw.githubusercontent.com/ThreeMoonsLab/agents-shipgate/main/docs/manifest-v0.1.json",
"report": "https://raw.githubusercontent.com/ThreeMoonsLab/agents-shipgate/main/docs/report-schema.v0.16.json",
"report": "https://raw.githubusercontent.com/ThreeMoonsLab/agents-shipgate/main/docs/report-schema.v0.17.json",
"packet": "https://raw.githubusercontent.com/ThreeMoonsLab/agents-shipgate/main/docs/packet-schema.v0.5.json",
"checks_catalog": "https://raw.githubusercontent.com/ThreeMoonsLab/agents-shipgate/main/docs/checks.json"
},
8 changes: 5 additions & 3 deletions AGENTS.md
@@ -249,8 +249,9 @@ Other stable top-level fields:
- `findings[].provenance_kind` (v0.15+, per-finding rule provenance — `static_declaration | ast_extraction | keyword_heuristic | regex_heuristic | policy_pack`; independent of `confidence`, useful for filtering heuristic-only findings)
- `findings[].blocks_release` (v0.16+, explicit release-policy blockers from Action Surface Diff policies)
- `action_surface_facts` / `action_surface_diff` (v0.16+, deterministic action snapshot and base/head action delta)
- `release_decision.contribution_rules[]` (v0.17+, per-finding audit of how each finding contributed to the decision; one row per `report.findings` entry, with `category` ∈ `{blocker, review_item, excluded}` and `rule` ∈ `{policy_block_new, severity_block_new, policy_baseline_accepted, severity_baseline_accepted, review_required, sub_threshold, suppressed}`)

The full schema is at [`docs/report-schema.v0.16.json`](docs/report-schema.v0.16.json) (current; emitted reports carry `report_schema_version: "0.16"`). v0.16 adds first-class Action Surface Diff fields, on top of v0.15's per-finding `provenance_kind` enum, v0.14's `insufficient_evidence` value in the `release_decision.decision`/`agent_summary.verdict` enums, and v0.13's `codex_plugin_surface` block. Older reports validate against [`docs/report-schema.v0.15.json`](docs/report-schema.v0.15.json) (frozen reference). What's-stable is documented in [STABILITY.md](STABILITY.md).
The full schema is at [`docs/report-schema.v0.17.json`](docs/report-schema.v0.17.json) (current; emitted reports carry `report_schema_version: "0.17"`). v0.17 adds the per-finding `release_decision.contribution_rules[]` audit, on top of v0.16's first-class Action Surface Diff fields, v0.15's per-finding `provenance_kind` enum, v0.14's `insufficient_evidence` value in the `release_decision.decision`/`agent_summary.verdict` enums, and v0.13's `codex_plugin_surface` block. Older reports validate against [`docs/report-schema.v0.16.json`](docs/report-schema.v0.16.json) (frozen reference). What's-stable is documented in [STABILITY.md](STABILITY.md).

**Release gating signal**: prefer `release_decision.decision` (`"blocked" | "review_required" | "insufficient_evidence" | "passed"`) over `summary.status`. The new field is **baseline-aware** — a baseline-matched critical surfaces in `release_decision.review_items` (accepted debt), not `release_decision.blockers`. `summary.status` stays baseline-blind for v0.7 compatibility, so a baseline-matched-only critical produces both `summary.status = "release_blockers_detected"` AND `release_decision.decision = "review_required"` (intentional divergence — see [STABILITY.md](STABILITY.md#release_decisiondecision-vs-summarystatus)). `insufficient_evidence` (added v0.14) signals that the scan saw too many low-confidence tools or source-loader warnings to be trustworthy; consumers that switch on the enum must fall back to `review_required` for unknown future values.
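A minimal sketch of a consumer that follows the gating guidance above: it reads `release_decision.decision` and falls back to `review_required` for values it does not recognise. The function name `gating_decision` and the inline sample are hypothetical, and treating a *missing* field the same way as an unknown value is an assumption of this sketch, not something the docs state.

```python
import json

# The four enum values documented for release_decision.decision.
KNOWN_DECISIONS = {"blocked", "review_required", "insufficient_evidence", "passed"}


def gating_decision(report: dict) -> str:
    """Return the release decision, mapping unknown values to review_required."""
    decision = report.get("release_decision", {}).get("decision")
    if decision in KNOWN_DECISIONS:
        return decision
    # Unknown future value (or, by assumption, a missing field): fall back
    # to the conservative review_required outcome.
    return "review_required"


# A baseline-matched-only critical: summary.status and the decision diverge.
sample = json.loads(
    '{"summary": {"status": "release_blockers_detected"},'
    ' "release_decision": {"decision": "review_required"}}'
)
print(gating_decision(sample))
```

The sample deliberately carries `summary.status = "release_blockers_detected"` to show that the sketch never consults that field.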

@@ -316,7 +317,7 @@ validation and [`docs/manifest-v0.1.md`](docs/manifest-v0.1.md) for prose.
### Where is the report schema?

Parse `agents-shipgate-reports/report.json` and validate against
[`docs/report-schema.v0.16.json`](docs/report-schema.v0.16.json) (current).
[`docs/report-schema.v0.17.json`](docs/report-schema.v0.17.json) (current).
Older reports (`report_schema_version: "0.10"`) validate against the
frozen [`docs/report-schema.v0.10.json`](docs/report-schema.v0.10.json).
Do not scrape Markdown when JSON is available.
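One way to pick the schema that matches a given report is to derive the file name from `report_schema_version`. This is only a sketch: `schema_path_for` is a hypothetical helper, and it assumes a schema file following the `report-schema.v<version>.json` pattern exists for the version in question, which the docs confirm only for the versions they list. The validation step itself needs a JSON Schema validator and is not shown.

```python
from pathlib import Path


def schema_path_for(report: dict, docs_dir: str = "docs") -> Path:
    """Build the schema path for the version a report declares.

    Assumes the naming pattern docs/report-schema.v<version>.json; the file
    may not exist for every historical version.
    """
    version = report["report_schema_version"]
    return Path(docs_dir) / f"report-schema.v{version}.json"


current = schema_path_for({"report_schema_version": "0.17"})
legacy = schema_path_for({"report_schema_version": "0.10"})
print(current.as_posix())
print(legacy.as_posix())
```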
@@ -354,7 +355,8 @@ For the short, current statement of "which fields to read", see [`docs/agent-con
| What | Path | Stable |
|---|---|---|
| Manifest schema | [`docs/manifest-v0.1.json`](docs/manifest-v0.1.json) | `0.1` |
| Report schema (current) | [`docs/report-schema.v0.16.json`](docs/report-schema.v0.16.json) | `0.16` |
| Report schema (current) | [`docs/report-schema.v0.17.json`](docs/report-schema.v0.17.json) | `0.17` |
| Report schema (v0.16 frozen reference) | [`docs/report-schema.v0.16.json`](docs/report-schema.v0.16.json) | `0.16` |
| Report schema (v0.15 frozen reference) | [`docs/report-schema.v0.15.json`](docs/report-schema.v0.15.json) | `0.15` |
| Report schema (v0.14 frozen reference) | [`docs/report-schema.v0.14.json`](docs/report-schema.v0.14.json) | `0.14` |
| Report schema (v0.13 frozen reference) | [`docs/report-schema.v0.13.json`](docs/report-schema.v0.13.json) | `0.13` |
16 changes: 16 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,22 @@

## Unreleased

- Added `release_decision.contribution_rules[]` — a deterministic
per-finding audit of how each finding contributed to the release
decision (M8 of the Trust Hardening Pass). Bumps
`report_schema_version` to `0.17`. Exactly one row per
`report.findings` entry (including suppressed) with `category` ∈
`{blocker, review_item, excluded}` and `rule` ∈ `{policy_block_new,
severity_block_new, policy_baseline_accepted,
severity_baseline_accepted, review_required, sub_threshold,
suppressed}`. The new `STABILITY.md` "Release decision truth table"
documents which `(rule, category)` pair fires for every
`(blocks_release, severity, baseline_status, fail_on)` combination.
Additive only: no semantic change to `decision`, `blockers[]`,
`review_items[]`, `fail_policy.exit_code`, or strict-mode exit codes —
the audit reflects existing behavior, it does not modify it. The
field defaults to `[]` for legacy reports loaded via
`explain-finding` so consumers never need an existence check.
- Replaced the hardcoded `if/elif` source-dispatch in `cli/scan.py` with a
real `ToolSourceAdapter` Protocol and `AdapterRegistry`. Every loader
(MCP, OpenAPI, OpenAI Agents SDK, Google ADK, LangChain, CrewAI, n8n,
8 changes: 4 additions & 4 deletions README.md
@@ -190,7 +190,7 @@ Set `pr_comment: "true"` to post a compact PR summary:

## What it produces

- **Tool-Use Readiness Report** — `agents-shipgate-reports/report.{md,json,sarif}`. Markdown for human release review, JSON for tools and coding agents (current schema [v0.16](docs/report-schema.v0.16.json); gating signal is `release_decision.decision`; v0.16 adds first-class Action Surface Diff fields on top of v0.15's per-finding `provenance_kind`), SARIF for GitHub code-scanning workflows.
- **Tool-Use Readiness Report** — `agents-shipgate-reports/report.{md,json,sarif}`. Markdown for human release review, JSON for tools and coding agents (current schema [v0.17](docs/report-schema.v0.17.json); gating signal is `release_decision.decision`; v0.17 adds the per-finding `release_decision.contribution_rules[]` audit on top of v0.16's first-class Action Surface Diff fields and v0.15's per-finding `provenance_kind`), SARIF for GitHub code-scanning workflows.
- **Release Evidence Packet** — `agents-shipgate-reports/packet.{md,json,html}` (and `packet.pdf` with the `[pdf]` extras). Reviewer-shaped synthesis with fixed sections, including tool-surface and action-surface diffs when available. Governed by [packet schema v0.5](docs/packet-schema.v0.5.json) — see [STABILITY.md §Release Evidence Packet](STABILITY.md#release-evidence-packet-v05).

## Exit codes
@@ -226,7 +226,7 @@ Agents Shipgate is designed to be agent-friendly. If you're a coding agent (Clau
- **[`prompts/`](prompts/)** — reusable prompts for common workflows
- **[`skills/agents-shipgate/`](skills/agents-shipgate/)** + **[`.claude/commands/shipgate.md`](.claude/commands/shipgate.md)** — self-contained Claude Code skill (bundled prompts and CI recipe) and `/shipgate` slash command. See [`docs/agents/use-with-claude-code.md`](docs/agents/use-with-claude-code.md) to install in your own project.
- **[`docs/ai-search-summary.md`](docs/ai-search-summary.md)** — human-readable summary for AI search, answer engines, and coding agents
- **[`docs/manifest-v0.1.json`](docs/manifest-v0.1.json)** + **[`docs/report-schema.v0.16.json`](docs/report-schema.v0.16.json)** — JSON Schemas for live editor validation (current; emitted reports carry `report_schema_version: "0.16"`). v0.16 adds `action_surface_facts` and `action_surface_diff`; v0.15 added the per-finding `provenance_kind` enum. Read `release_decision.decision` for release gating in new consumers; read `agent_summary.first_recommended_action` for a deterministic next step.
- **[`docs/manifest-v0.1.json`](docs/manifest-v0.1.json)** + **[`docs/report-schema.v0.17.json`](docs/report-schema.v0.17.json)** — JSON Schemas for live editor validation (current; emitted reports carry `report_schema_version: "0.17"`). v0.17 adds `release_decision.contribution_rules[]` (per-finding decision audit); v0.16 added `action_surface_facts` and `action_surface_diff`; v0.15 added the per-finding `provenance_kind` enum. Read `release_decision.decision` for release gating in new consumers; read `agent_summary.first_recommended_action` for a deterministic next step.
- **[`docs/checks.json`](docs/checks.json)** — machine-readable check catalog

Every command has a `--json` form. Errors emit a structured `next_action` line on stderr when `AGENTS_SHIPGATE_AGENT_MODE=1`.
@@ -414,7 +414,7 @@ Agents Shipgate is a static, manifest-first scanner. It is intentionally narrow:
- It does not verify runtime behavior, latency, prompt quality, or routing decisions.
- It does not replace dynamic security testing or human security review of the underlying systems.
- It only inspects what is declared in `shipgate.yaml`, local OpenAPI specs, MCP exports, simple OpenAI API artifacts, optional SDK AST metadata, static Google ADK/LangChain/CrewAI inputs, and static Codex plugin package metadata; tools that are not declared or statically discoverable are not scanned.
- The manifest remains `version: "0.1"` so existing configs keep working. Current reports carry `report_schema_version: "0.16"` (additive over v0.15's provenance enum, adding `action_surface_facts` and `action_surface_diff`) while preserving the stable payload contract documented in the report schema.
- The manifest remains `version: "0.1"` so existing configs keep working. Current reports carry `report_schema_version: "0.17"` (additive over v0.16, adding `release_decision.contribution_rules[]` — a deterministic per-finding audit of how each finding contributed to the release decision) while preserving the stable payload contract documented in the report schema.

See [ROADMAP.md](ROADMAP.md) for what is planned next.

@@ -491,7 +491,7 @@ readers and AI search ingest.
- [Check catalog](docs/checks.md)
- [Policy packs](docs/policy-packs.md)
- [Baseline workflow](docs/baseline.md)
- [JSON report schema v0.16](docs/report-schema.v0.16.json)
- [JSON report schema v0.17](docs/report-schema.v0.17.json)
- [Trust model](docs/trust-model.md)
- [AI search summary](docs/ai-search-summary.md)
- [Design partners](docs/design-partners.md)
26 changes: 26 additions & 0 deletions STABILITY.md
@@ -96,6 +96,7 @@ In `agents-shipgate-reports/report.json`, the following are guaranteed:
- `findings[].blocks_release` (v0.16+) — explicit release-policy blocking bit. Built-in and user-defined Action Surface Diff policies, plus declarative policy-pack rules with `block: true`, set it for findings that must block release when active and unbaselined; ordinary severity-based gating still works for existing checks.
- `action_surface_facts.actions[]` (v0.16+) — deterministic current action snapshot: action id, operation, effect, normalized risk tags, scopes, approval policy, safeguards, evidence, input fields, and stable hashes.
- `action_surface_diff.{enabled, base, summary, added, removed, modified, notes}` (v0.16+) — reviewer-facing delta for what the agent can do vs. a prior report or v0.4 baseline. Policy findings derived from this diff can set `findings[].blocks_release=true` and affect `release_decision.decision` and strict-mode exit behavior.
- `release_decision.contribution_rules[].{finding_id, fingerprint, check_id, category, rule, rationale}` (v0.17+) — deterministic per-finding audit of how each finding contributed to the release decision. Required + always present (defaults to `[]` for legacy reports loaded via `explain-finding`). Exactly one row per `report.findings` entry, including suppressed findings, so the audit set is exhaustive over the full findings list. `category` enum: `blocker | review_item | excluded`. `rule` enum: `policy_block_new | severity_block_new | policy_baseline_accepted | severity_baseline_accepted | review_required | sub_threshold | suppressed`. The (rule, category) pairs the gate can produce are exhaustively documented in [Release decision truth table](#release-decision-truth-table) below — reading the contribution rule is sufficient to predict the outcome for that finding without re-deriving the decision logic. The audit cannot disagree with `release_decision.{blockers,review_items}[]`: the same classification powers both. Adding `contribution_rules` does not change any existing behavior — `decision`, `blockers[]`, `review_items[]`, `fail_policy.exit_code`, and strict-mode exit codes are byte-identical to v0.16.
- `baseline.{matched_count, new_count, resolved_count, path}` (when `--baseline` is used)
- `tool_inventory[].{name, source_type, source_ref, risk_tags, auth_scopes, owner, confidence}`
- `loaded_plugins[].{name, value, distribution, version, check_id}`
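The "exactly one row per `report.findings` entry" guarantee for `release_decision.contribution_rules[]` can be checked by a consumer with a few lines. The sketch below is illustrative only: `audit_contributions` is a hypothetical helper, the sample report is made up, and the check is meant for v0.17+ reports (legacy reports default the field to `[]`, so the row count would not match there).

```python
from collections import Counter


def audit_contributions(report: dict) -> Counter:
    """Count contribution rows per category after checking the row count."""
    findings = report.get("findings", [])
    rows = report.get("release_decision", {}).get("contribution_rules", [])
    # v0.17+ contract: one audit row per finding, suppressed ones included.
    if len(rows) != len(findings):
        raise ValueError(
            f"expected {len(findings)} contribution rows, found {len(rows)}"
        )
    return Counter(row["category"] for row in rows)


# Made-up report: one new critical finding and one suppressed finding.
sample_report = {
    "findings": [{"severity": "critical"}, {"severity": "low", "suppressed": True}],
    "release_decision": {
        "contribution_rules": [
            {"category": "blocker", "rule": "severity_block_new"},
            {"category": "excluded", "rule": "suppressed"},
        ]
    },
}
counts = audit_contributions(sample_report)
print(dict(counts))
```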
@@ -126,6 +127,31 @@ These are **intentionally different signals**, kept apart for backwards compatib
| `release_decision.decision` | yes — baseline-matched criticals appear in `review_items`, not `blockers` | **yes (v0.8+)** |
| `summary.status` | no — any unsuppressed critical flips status to `release_blockers_detected` | preserved for v0.7 callers |

#### Release decision truth table

The classification below is the contract for how every active finding lands in `release_decision.{blockers, review_items}[]` and which `contribution_rules[].rule` (v0.17+) fires for it. Same shape as the v0.8 implementation: this section documents existing behavior, it does not change it. Suppressed findings (`finding.suppressed=true`) are excluded entirely from the active set and audited as `category="excluded", rule="suppressed"`.

Notation: `fail_on` is `release_decision.fail_policy.fail_on` after `ci_mode` resolution (advisory → empty, strict → `["critical"]`, plus any explicit `--fail-on` override). `blocker_severities` = `{critical} ∪ fail_on`. `review_tier` = `{critical, high, medium}` (or any severity when `requires_human_review=true`).

| `blocks_release` | severity | baseline_status | severity in `blocker_severities`? | severity in `review_tier`? | category | `rule` | strict-mode exit |
|---|---|---|---|---|---|---|---|
| true | any | new / null | n/a | n/a | **blocker** | `policy_block_new` | 20 |
| true | any | matched | n/a | yes | review_item | `policy_baseline_accepted` | 0 (with `--baseline-mode new-findings`) |
| true | any | matched | n/a | no | excluded | `policy_baseline_accepted` | 0 (with `--baseline-mode new-findings`) |
| true | any | resolved | n/a | n/a | excluded | (not produced; resolved findings are absent from the active set) | 0 |
| false | any | new / null | yes | n/a | **blocker** | `severity_block_new` | 20 |
| false | any | matched | yes | yes | review_item | `severity_baseline_accepted` | 0 (with `--baseline-mode new-findings`) |
| false | any | matched | yes | no | excluded | `severity_baseline_accepted` | 0 (with `--baseline-mode new-findings`) |
| false | any | new / null | no | yes | review_item | `review_required` | 0 |
| false | any | matched | no | yes | review_item | `review_required` | 0 |
| false | any | new / null / matched | no | no | excluded | `sub_threshold` | 0 |
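The table can be read as a small decision function. The following is a sketch under stated assumptions, not the project's implementation: `classify` is a hypothetical name, it expects only active (unsuppressed) findings whose `baseline_status` is `"new"`, `None`, or `"matched"`, and it takes `fail_on` as already resolved from `ci_mode`.

```python
REVIEW_TIER = {"critical", "high", "medium"}


def classify(blocks_release, severity, baseline_status, fail_on,
             requires_human_review=False):
    """Return (category, rule) for one active finding, per the table above."""
    blocker_severities = {"critical"} | set(fail_on)
    in_review_tier = severity in REVIEW_TIER or requires_human_review
    matched = baseline_status == "matched"

    if blocks_release:
        if not matched:
            return ("blocker", "policy_block_new")
        category = "review_item" if in_review_tier else "excluded"
        return (category, "policy_baseline_accepted")

    if severity in blocker_severities:
        if not matched:
            return ("blocker", "severity_block_new")
        category = "review_item" if in_review_tier else "excluded"
        return (category, "severity_baseline_accepted")

    # Not a blocker by policy or severity: review tier decides the rest.
    if in_review_tier:
        return ("review_item", "review_required")
    return ("excluded", "sub_threshold")


print(classify(False, "critical", "matched", fail_on=[]))
print(classify(True, "low", "new", fail_on=[]))
```

Resolved findings are not handled because, per the table, they are absent from the active set and never reach this step.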

**Why baseline-matched policy findings drop to `review_items`, not `blockers`.** `blocks_release=true` represents an explicit *policy* decision (Action Surface Diff rule, `action_surface:` manifest entry, or policy-pack rule with `block: true`) that the finding must block release **on first appearance**. A baseline accepts technical debt that already passed prior review — the project agreed to ship with that finding present. Treating baselined policy debt as a hard blocker would defeat the purpose of `baseline save`. The baseline-aware drop is symmetric for severity-driven blockers and policy blockers: both land in `review_items` once accepted into the baseline, both become hard blockers if newly introduced.

**Why `severity ∈ blocker_severities + matched + below review_tier` lands in `excluded`, not `review_items`.** A finding whose severity isn't in `{critical, high, medium}` (and which doesn't carry `requires_human_review=true`) has nothing for a human reviewer to act on per the v0.8 contract — it's been baselined and isn't severe enough to warrant attention. v0.17 records this in the audit so the (rare) edge case isn't silently invisible, but the `blockers[]`/`review_items[]` lists themselves are unchanged.

**Why exit code 20 depends on `--baseline-mode`.** `release_decision.{blockers, review_items}[]` always include the full set computed against `report.findings` (with suppressed excluded). The strict-mode exit code, however, is computed from `baseline_filtered_active(report, new_findings_only=...)` — when `--baseline-mode new-findings` is set (the default for the GitHub Action when `baseline:` is provided), baseline-matched policy and severity blockers are filtered out before the exit check, so exit is `0`. With `new_findings_only=False`, a matched policy blocker still triggers exit 20. The `release_decision` block remains baseline-aware in all cases; only the exit-code path changes mode.
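The exit-code behavior described above can be approximated as follows. This is a hypothetical stand-in for illustration, not the project's `baseline_filtered_active`: the name `strict_exit_code`, the dictionary-shaped findings, and the per-finding `baseline_status` and `suppressed` keys are assumptions of the sketch.

```python
def strict_exit_code(findings, fail_on, new_findings_only):
    """Return 20 if any finding still blocks after baseline filtering, else 0."""
    blocker_severities = {"critical"} | set(fail_on)
    # Suppressed findings never take part in the exit check.
    active = [f for f in findings if not f.get("suppressed", False)]
    if new_findings_only:
        # Mirrors --baseline-mode new-findings: drop baseline-matched findings.
        active = [f for f in active if f.get("baseline_status") != "matched"]
    blocking = any(
        f.get("blocks_release", False) or f.get("severity") in blocker_severities
        for f in active
    )
    return 20 if blocking else 0


# One baseline-matched policy blocker and nothing else.
matched_policy = [
    {"blocks_release": True, "severity": "high", "baseline_status": "matched"}
]
print(strict_exit_code(matched_policy, ["critical"], new_findings_only=True))
print(strict_exit_code(matched_policy, ["critical"], new_findings_only=False))
```

The same finding yields a different exit code in each mode, while its place in `release_decision.review_items[]` stays the same.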

Concretely: a scan with one baseline-matched critical and zero new findings produces `summary.status = "release_blockers_detected"` AND `release_decision.decision = "review_required"`. Both are correct under their respective contracts. New consumers should read `release_decision.decision`.

### Check IDs
3 changes: 2 additions & 1 deletion docs/INDEX.md
@@ -21,9 +21,10 @@ A single entry point for human readers and AI agents walking the `docs/` tree.
- [`checks.md`](checks.md) — full check catalog (human-readable)
- [`checks.json`](checks.json) — machine-readable check catalog (regenerated each release)
- [`manifest-v0.1.json`](manifest-v0.1.json) — JSON Schema for `shipgate.yaml`
- [`report-schema.v0.16.json`](report-schema.v0.16.json) — JSON Schema for `report.json` (current; emitted reports carry `report_schema_version: "0.16"`, which adds first-class Action Surface Diff fields)
- [`report-schema.v0.17.json`](report-schema.v0.17.json) — JSON Schema for `report.json` (current; emitted reports carry `report_schema_version: "0.17"`, which adds the per-finding `release_decision.contribution_rules[]` audit on top of v0.16's first-class Action Surface Diff fields)
- [`agent-action-guide.md`](agent-action-guide.md) — per-category recipe for what to do with a finding (canonical fix per check category, last-resort suppression rules)
- [`upstream-integrations.md`](upstream-integrations.md) — per-framework 60-second drop-in for adding Shipgate to an existing project (OpenAI Agents SDK, LangChain, CrewAI, ADK, MCP-only, OpenAPI-only, OpenAI Messages API, Anthropic Messages API)
- [`report-schema.v0.16.json`](report-schema.v0.16.json) — frozen v0.16 reference schema; pre-v0.17 reports validate against this
- [`report-schema.v0.15.json`](report-schema.v0.15.json) — frozen v0.15 reference schema; pre-v0.16 reports validate against this
- [`report-schema.v0.14.json`](report-schema.v0.14.json) — frozen v0.14 reference schema; pre-v0.15 reports validate against this
- [`report-schema.v0.13.json`](report-schema.v0.13.json) — frozen v0.13 reference schema; pre-v0.14 reports validate against this