| source-of-truth | true |
|---|---|
| owner | API Governance Lead |
| last-verified | 2026-04-11 |
| doc-type | architecture |
DashClaw is a focused policy firewall and governance runtime for AI agent fleets. It provides the minimal infrastructure needed to intercept, govern, and verify agent actions before they reach production systems.
| Surface | Path | Purpose |
|---|---|---|
| Mission Control | `/mission-control` | Strategic posture, interventions, and live decision stream. |
| Decisions | `/decisions` | Visual causal chain ledger of all governed actions. |
| Setup | `/setup` | Readiness verification and instance health. |
| Connect | `/connect` | The 8-minute path to first governed action. |
| Agent Profiles | `/agents/[agentId]` | Governance-focused agent profile with trust posture, decision history, assumptions, signals, and policies. |
| Policy Builder | `/policies` | Shields-first policy experience with pre-built safety switches, custom policy management, and guard activity feed. |
| Analytics | `/analytics` | Cost trends, action volume, agent/type breakdowns, policy enforcement stats, and token efficiency. |
- Environment: Vitest with `jsdom`.
- Location: `__tests__/unit/` for unit tests.
- Command: `npm run test` (watch mode) or `npm run test -- --run` (CI).
Every PR to main must pass CI checks:
- `npm run openapi:check`: Detects stable API contract drift.
- `npm run test -- --run`: Core runtime unit tests.
DashClaw is organized into three distinct tiers to prevent platform bloat.
These 8 endpoints define the DashClaw category. They are mandatory for governance.
| Route | Purpose | SDK Method |
|---|---|---|
| `/api/guard` | Policy evaluation | `guard()` |
| `/api/actions` | Lifecycle recording (`createAction`, `updateOutcome`, `waitForApproval` polling target) | `createAction()`, `updateOutcome()`, `getAction()` |
| `/api/approvals` | Human review queue, accessed as `/api/actions/:id/approve` via `next.config.js` rewrite | `approveAction()` |
| `/api/assumptions` | Reasoning integrity (also reachable as `/api/actions/assumptions` via rewrite) | `recordAssumption()` |
| `/api/signals` | Anomaly detection (also reachable as `/api/actions/signals` via rewrite) | `getSignals()` |
| `/api/policies` | Policy management | -- |
| `/api/policies/generate` | AI policy generator (natural language → guard policies, `dry_run` preview mode) | -- |
| `/api/health` | System readiness | -- |
Route aliasing: `next.config.js` rewrites `/api/actions/signals`, `/api/actions/assumptions`, and `/api/actions/:id/approve` to their canonical destinations for backward compatibility with legacy SDK paths. Both forms are live; new code should target the canonical routes listed above.
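As a rough illustration of the aliasing described above, the sketch below declares the three legacy paths with the Next.js `rewrites()` config hook. The source paths and the `/api/signals` and `/api/assumptions` destinations come from the route table; the exact destination for the approval route (`/api/approvals/:id`) is an assumption made for illustration and may not match the repository's actual `next.config.js`.

```javascript
// Legacy SDK paths mapped to canonical routes.
// NOTE: the approval destination shape is hypothetical.
const legacyRewrites = [
  { source: '/api/actions/signals', destination: '/api/signals' },
  { source: '/api/actions/assumptions', destination: '/api/assumptions' },
  { source: '/api/actions/:id/approve', destination: '/api/approvals/:id' },
];

// In a real next.config.js this object would be the module's export.
const nextConfig = {
  async rewrites() {
    return legacyRewrites;
  },
};
```

Because rewrites are transparent to the caller, both the legacy and canonical URLs keep working, which matches the "both forms are live" statement.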
| Route | Purpose | Schedule |
|---|---|---|
| `/api/integrations/health` | Integration credential health status | On demand |
| `/api/cron/signals` | Signal detection + notification pipeline | Every 5 min |
| `/api/cron/integration-health` | Credential validation for all orgs | Every 6 hours |
| `/api/pairings` | Agent identity pairing enrollment | On demand |
| `/api/identities` | Approved agent identity management | On demand |
| `/api/doctor` | Diagnostic health checks across DB, config, auth, deploy, SDK, governance | On demand |
| `/api/doctor/fix` | Apply safe auto-fixes (migrations, default policy); local-scope env fixes are blocked via the API | On demand |
The doctor also ships as `npm run doctor` (local, full filesystem access) and `dashclaw doctor` (remote via `@dashclaw/cli`). The shared engine lives at `app/lib/doctor/`.
Modular intelligence features that consume runtime data.
- Compliance: Audit evidence and reporting.
- Drift: Detection of reasoning and metric drift.
- Evaluations: LLM-as-judge accuracy scoring.
- Scoring: Multi-dimensional risk profiles.
- Execution Studio (Phase 2 complete): Workflow templates, model strategies, knowledge collections, capability registry. Capabilities are now invocable via `POST /api/capabilities/:id/invoke`. Workflows can now execute via `POST /api/workflows/templates/:id/execute` with 3 step types (`prompt`, `capability_invoke`, `knowledge_search`).
- Billing & Metering: Tier-based quota enforcement (free/pro/business/enterprise), Stripe Checkout + Customer Portal, cost aggregation API, monthly meter reset cron.
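To make the tier-based quota enforcement in the last bullet concrete, here is a minimal sketch of a quota check with a grace buffer. The names `PLAN_LIMITS` and `checkQuota` echo identifiers this document attributes to `usage.js`, but the numeric limits, the 10% grace ratio, and the return shape are all hypothetical. The document also notes that all plans are currently unlimited, so the numbers are purely illustrative.

```javascript
// Hypothetical monthly action limits per tier (illustrative values only).
const PLAN_LIMITS = {
  free: 1000,
  pro: 50000,
  business: 500000,
  enterprise: Infinity,
};

// Hypothetical grace buffer: tolerate 10% overage before hard-blocking.
const GRACE_RATIO = 0.1;

function checkQuota(plan, used) {
  const limit = PLAN_LIMITS[plan];
  if (limit === undefined) throw new Error(`Unknown plan: ${plan}`);

  // Under the limit: nothing to report.
  if (used < limit) return { allowed: true, status: 'ok', limit };

  // Between the limit and the hard cap: still allowed, but flagged.
  const hardCap = limit + Math.floor(limit * GRACE_RATIO);
  if (used < hardCap) return { allowed: true, status: 'grace', limit };

  // At or beyond the hard cap: reject.
  return { allowed: false, status: 'exceeded', limit };
}
```

A three-state result (`ok`, `grace`, `exceeded`) lets a caller warn an operator before requests start failing, which is the usual motivation for a grace buffer.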
| Route | Purpose |
|---|---|
| `GET /api/actions/:actionId/graph` | Read-only execution graph (nodes + edges) for a governed action, reusing existing trace data plus correlated assumptions and open loops. Powers the Graph tab on decision replay. |
| `GET/POST /api/workflows/templates` | List or create reusable workflow templates. |
| `GET/PATCH /api/workflows/templates/:templateId` | Fetch or update a template; PATCH bumps version on step changes. |
| `POST /api/workflows/templates/:templateId/duplicate` | Clone a template as a new draft. |
| `POST /api/workflows/templates/:templateId/launch` | Launch a template: creates an `action_records` row with `trigger=workflow:<id>` and `WORKFLOW_LAUNCH_META=<json>` in `reasoning`, resolving any linked model strategy into a snapshot. No schema columns added to `action_records`. |
| `GET/POST /api/model-strategies` | List or create model/provider strategy records (`primary`, `fallback`, `costSensitivity`, `maxBudgetUsd`, `maxRetries`). |
| `GET/PATCH/DELETE /api/model-strategies/:strategyId` | Fetch, update (merges config), or delete a strategy. Delete nulls the soft reference on any linked `workflow_templates`. |
| `POST /api/model-strategies/:strategyId/complete` | Execute a chat completion using this strategy. Resolves BYOK provider credentials from org settings, walks the fallback chain (primary → fallback providers), enforces `maxBudgetUsd`, records cost. Supports `task_mode` overrides via `taskModes` config. Returns normalized `{ content, provider, model, usage, cost_usd, fallback_used }`. |
| `GET/POST /api/knowledge/collections` | List or create knowledge collections (metadata-only in Phase 1; no embedding/retrieval). |
| `GET/PATCH /api/knowledge/collections/:collectionId` | Fetch or update a collection. |
| `GET/POST /api/knowledge/collections/:collectionId/items` | List or add items. Adding an item bumps the parent collection's `doc_count` and transitions `ingestion_status` from empty → pending. |
| `POST /api/knowledge/collections/:collectionId/sync` | Caller-invoked ingestion: fetches `source_uri` content for pending items, chunks text (~500 tokens with overlap), generates embeddings via BYOK OpenAI key (`text-embedding-3-small`), stores in the `knowledge_chunks` table (pgvector). Updates item status (pending → indexed/failed) and collection metadata. Bounded to 50 items per call. |
| `POST /api/knowledge/collections/:collectionId/search` | Semantic search over chunked + embedded content. Embeds the query, uses pgvector cosine distance (`<=>`) to return top-k results with similarity scores, chunk content, and source item metadata. |
| `GET/POST /api/capabilities` | Searchable capability registry. GET supports `category`, `risk_level`, and `search` (ILIKE on name/description/tags) filters. |
| `GET/PATCH /api/capabilities/:capabilityId` | Fetch or update a capability record. |
| `POST /api/capabilities/:capabilityId/invoke` | Invoke an HTTP capability through the full governance loop: guard evaluation, action recording, BYOK auth resolution, request/response mapping, timeout handling, outcome tracking. Supports blocked (403), pending_approval (202), success (200). |
| `GET/POST /api/capabilities/:capabilityId/access` | List or create capability access rules. Per-agent allow/deny/require_approval. Null `agent_id` = org-wide default. |
| `DELETE /api/capabilities/:capabilityId/access/:ruleId` | Delete an access rule. |
| `GET /api/capabilities/:capabilityId/access/check?agent_id=X` | Evaluate effective access for an agent against a capability. |
| `POST /api/workflows/templates/:templateId/execute` | Execute a workflow synchronously (120s max). Runs steps sequentially with rolling context. 3 step types: `prompt` (LLM via model strategy), `capability_invoke` (HTTP capability), `knowledge_search` (semantic search). Each step creates a child action record and a `workflow_step_results` row with full input/output. Steps support optional `condition` (template truthiness check; the step is skipped if falsy) and `continue_on_failure` (proceed on step failure instead of aborting). Guard on launch. |
| `GET /api/workflows/templates/:templateId/runs` | List past workflow executions for a template. Joins `action_records` (parent) with `workflow_step_results` for step counts. Supports `status`, `agent_id`, `limit`, `offset` filters. |
| `GET /api/workflows/templates/:templateId/runs/:runActionId` | Fetch full run detail: parent action metadata + all step results with complete input/output JSON. Powers the run detail page at `/workflows/:id/runs/:runId`. |
| `POST /api/workflows/templates/:templateId/runs/:runActionId/resume` | Resume a failed workflow run from the last completed checkpoint. Reuses prior step outputs (`reused` status), creates a new run, and continues execution from the first non-completed step. Supports optional `from_step` override and `variables` override. |
| `GET /api/analytics` | Cost and usage analytics aggregation: trends, action volume, agent/type breakdowns, policy enforcement stats, token efficiency. |
| `GET /api/guard/decisions` | Guard decision history with filters (agent, action type, outcome, date range). |
| `GET /api/agents/[agentId]/profile` | Agent governance profile aggregation: trust posture, decision history, assumptions, signals, and policies. |
| `GET /api/usage/costs` | Cost aggregation by action type and daily totals for the billing period. |
| `POST /api/billing/checkout` | Create Stripe Checkout Session for pro/business subscription. |
| `GET /api/billing/portal` | Create Stripe Customer Portal link for subscription management. |
| `POST /api/webhooks/stripe` | Stripe webhook handler (`checkout.session.completed`, `subscription.updated/deleted`, `invoice.payment_failed`). |
| `GET /api/cron/reset-meters` | Monthly meter archive + reset (Vercel Cron, 1st of month). |
| `GET /api/operations/feed` | Unified operations feed aggregating pending approvals, failed actions (24h), risk signals, degraded capabilities, degraded integrations, and stale loops. Supports `category`, `severity`, `limit`, `offset` filters. Sorted by severity then timestamp. Powers the Mission Control operations feed. |
| `GET /api/operations/summary` | Org-level runtime metrics: decision throughput (1h/24h), latency (p50/p95), approval backlog (count/oldest/avg wait), workflow health (running/failed/completed/avg duration), capability health (healthy/degraded/failing). Powers the Runtime Summary card on Mission Control. |
| `POST /api/workflows/templates/:templateId/runs/:runActionId/cancel` | Cancel a running workflow. Updates the parent action and any running step results to cancelled status. Only works on running workflows. |
| `GET/POST /api/artifacts` | List or create durable artifacts. Supports `action_id`, `step_id`, `agent_id`, `type` filters. Artifacts are linked to actions and workflow steps. Workflow step outputs are auto-captured as JSON artifacts. |
| `GET/DELETE /api/artifacts/:artifactId` | Fetch or delete a single artifact with full content. |
| `GET /api/actions/:actionId/artifacts` | List artifacts linked to a specific governed action. |
| `POST /api/artifacts/evidence-bundle` | Generate an evidence bundle for an action: bundles governance records, child steps, and linked artifacts into a single structured object. Optionally persists the bundle as an artifact. |
| `POST /api/mcp` | MCP Streamable HTTP endpoint: JSON-RPC handler for MCP tool calls and resource reads. Powers the `@dashclaw/mcp-server` remote transport. |
All routes are org-scoped via getOrgId(request) and follow the existing route.js → repository pattern with apiErrorResponse on failure. Eight new tables (workflow_templates, model_strategies, knowledge_collections, knowledge_collection_items, capabilities, capability_access_rules, workflow_step_results, artifacts) are appended to drizzle/0000_clammy_falcon.sql and applied idempotently by scripts/auto-migrate.mjs on deploy.
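The sequential execution model described for the `execute` route in the table above (rolling context, optional `condition`, `continue_on_failure`) can be sketched roughly as follows. This is a simplified, synchronous illustration and not the actual `workflow-executor.js`: the function name `runWorkflow`, the handler signatures, and the use of a callback for `condition` (the source describes a template truthiness check) are all assumptions.

```javascript
// Run steps in order, sharing a rolling context between them.
// `handlers` maps a step type to a function (hypothetical signature).
function runWorkflow(steps, handlers, variables = {}) {
  const context = { variables, steps: {} };
  const results = [];

  for (const step of steps) {
    // Optional condition: skip the step when it evaluates falsy.
    if (step.condition && !step.condition(context)) {
      results.push({ id: step.id, status: 'skipped' });
      continue;
    }
    try {
      const handler = handlers[step.type];
      if (!handler) throw new Error(`Unknown step type: ${step.type}`);
      const output = handler(step, context);
      context.steps[step.id] = { output }; // visible to later steps
      results.push({ id: step.id, status: 'completed', output });
    } catch (err) {
      results.push({ id: step.id, status: 'failed', error: err.message });
      // Abort unless the step opted in to continue_on_failure.
      if (!step.continue_on_failure) return { status: 'failed', results };
    }
  }
  return { status: 'completed', results };
}

// Stub handlers standing in for search, HTTP, and LLM calls.
const handlers = {
  knowledge_search: (step) => ({ hits: [`doc about ${step.query}`] }),
  capability_invoke: () => { throw new Error('upstream timeout'); },
  prompt: (_step, ctx) => ({
    text: `Summary of ${ctx.steps.search.output.hits.length} hit(s)`,
  }),
};

const run = runWorkflow([
  { id: 'search', type: 'knowledge_search', query: 'refund policy' },
  { id: 'notify', type: 'capability_invoke', continue_on_failure: true },
  { id: 'summarize', type: 'prompt' },
  { id: 'extra', type: 'prompt', condition: (ctx) => ctx.variables.verbose },
], handlers);
```

In this run the `notify` step fails but execution continues because it sets `continue_on_failure`, and `extra` is skipped because `variables.verbose` is not set.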
Legacy features from the "Agent Platform" era (Messaging, CRM, Workspace, Memory Health). These are physically quarantined to maintain a small, stable runtime boundary.
- `db.js`: Shared database connection (Neon/Postgres).
- `guard.js`: The evaluation engine for intent vs. policy. Integrates predictive risk scoring, adjusting risk based on historical failure rates, action velocity, and optional LLM assessment.
- `signals.js`: Anomaly computation (Autonomy Spikes, Stale Actions).
- `readiness.mjs`: Instance verification for the `/setup` page.
- `org.js`: Multi-tenant scoping and role helpers.
- `integration-health.js`: Per-provider credential validation (OpenAI, Anthropic, Slack, etc).
- `notification-adapters/`: Native governance alert delivery (Slack, Discord, Linear, GitHub, Email).
- `capability-invoke.js`: HTTP capability invocation engine: auth resolution (bearer/api_key), request/response mapping, AbortController timeout.
- `mapping.js`: Dot-path request/response mapper for capability invocations. URL variable substitution from org settings.
- `workflow-executor.js`: Sequential workflow executor that iterates steps, manages rolling context, dispatches to step handlers, and creates child action records.
- `step-handlers.js`: Step type handlers for workflow execution (`knowledge_search`, `capability_invoke`, `prompt`).
- `template-vars.js`: Variable substitution engine for workflow step configs. Resolves `${variables.x}` and `${steps.step_id.output.y}`.
- `usage.js`: Plan limits (`PLAN_LIMITS`), quota enforcement with grace buffer (`checkQuota`), meter increment/read, cost estimation. Note: all plans are currently unlimited while DashClaw is open-source.
- `billing.js`: Token cost estimation for LLM calls (`DEFAULT_PRICING` for 20+ models).
- `policy-generator.js`: LLM-powered natural language to guard policy conversion with prompt construction, response parsing, validation, and dry-run preview.
- `predictive-risk.js`: Statistical + LLM-enhanced risk scoring for guard evaluations. Queries historical action outcomes and optionally consults an LLM for high-stakes actions. Controlled by the `PREDICTIVE_RISK_ENABLED` and `PREDICTIVE_RISK_THRESHOLD` settings.
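The placeholder syntax attributed to `template-vars.js` above (`${variables.x}` and `${steps.step_id.output.y}`) suggests a dot-path lookup against the workflow context. The sketch below shows one way such substitution could work; `getPath` and `renderTemplate` are hypothetical names, and rendering unresolved placeholders as empty strings is an assumption rather than documented behavior.

```javascript
// Walk a dot path such as "steps.search.output.title" through an object.
function getPath(obj, path) {
  return path.split('.').reduce(
    (node, key) => (node == null ? undefined : node[key]),
    obj
  );
}

// Replace every ${...} placeholder with the value found at that path.
// Unresolved placeholders become '' (an assumption for this sketch).
function renderTemplate(template, context) {
  return template.replace(/\$\{([^}]+)\}/g, (_match, path) => {
    const value = getPath(context, path.trim());
    if (value === undefined || value === null) return '';
    return typeof value === 'object' ? JSON.stringify(value) : String(value);
  });
}

const context = {
  variables: { customer: 'Acme' },
  steps: { search: { output: { title: 'Refund policy' } } },
};

// Single quotes keep ${...} literal so the engine, not JavaScript, resolves it.
const rendered = renderTemplate(
  'Summarize ${steps.search.output.title} for ${variables.customer}',
  context
);
// rendered === 'Summarize Refund policy for Acme'
```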
DashClaw ships two independently versioned SDK artifacts from this repo, both
published under the dashclaw npm package:
| Import | Source | Version | Role |
|---|---|---|---|
| `import { DashClaw } from 'dashclaw'` | `sdk/dashclaw.js` | 2.11.1 | Canonical v2 surface for all new work |
| `import { DashClaw } from 'dashclaw/legacy'` | `sdk/legacy/dashclaw-v1.js` | same package | Compatibility layer for older integrations |
The v2 canonical surface exposes 80 methods across these domains (count
verified against sdk/dashclaw.js on 2026-04-11):
| Domain | Count | Representative methods |
|---|---|---|
| Core Governance | 8 | guard, createAction, updateOutcome, getAction, getPendingApprovals, approveAction, recordAssumption, waitForApproval |
| Decision Integrity | 3 | registerOpenLoop, resolveOpenLoop, getSignals |
| Operational | 2 | heartbeat, reportConnections |
| Learning & Optimization | 4 | getLearningVelocity, getLearningCurves, getLessons, renderPrompt |
| Scoring Profiles | 18 | createScorer, createScoringProfile, scoreWithProfile, autoCalibrate, risk templates CRUD |
| Messaging | 2 | sendMessage, getInbox |
| Handoffs | 2 | createHandoff, getLatestHandoff |
| Security Scanning | 1 | scanPromptInjection |
| Feedback | 1 | submitFeedback |
| Context Threads | 3 | createThread, addThreadEntry, closeThread |
| Bulk Sync | 1 | syncState |
| Sessions | 5 | createSession, getSession, updateSession, listSessions, getSessionEvents |
| Execution Studio — Graph | 1 | getActionGraph |
| Execution Studio — Workflow Templates | 6 | listWorkflowTemplates, createWorkflowTemplate, updateWorkflowTemplate, duplicateWorkflowTemplate, launchWorkflowTemplate, getWorkflowTemplate |
| Execution Studio — Model Strategies | 6 | listModelStrategies, createModelStrategy, updateModelStrategy, deleteModelStrategy, completeWithStrategy, getModelStrategy |
| Execution Studio — Knowledge Collections | 8 | listKnowledgeCollections, createKnowledgeCollection, addKnowledgeCollectionItem, syncKnowledgeCollection, searchKnowledgeCollection, etc. |
| Execution Studio — Capability Runtime | 9 | listCapabilities, invokeCapability, testCapability, getCapabilityHealth, listCapabilityHealth, getCapabilityHistory, plus CRUD and canonical claw.execution.capabilities.* namespace |
Plus the synchronous helper actionContext(actionId) for auto-tagging
messages and assumptions to a governed action.
Minimum viable governance loop (most agents only need these 5):
guard → createAction → (optional) waitForApproval → updateOutcome +
recordAssumption. The full surface is additive — see
sdk/README.md for exhaustive reference and the canonical
HITL flow.
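The five-method loop above can be sketched as a single helper. The method names (`guard`, `createAction`, `waitForApproval`, `updateOutcome`, `recordAssumption`) come from this document; their argument and return shapes, the decision values `'block'` and `'require_approval'`, and the helper names `runGovernedAction` and `createFakeClaw` are assumptions for illustration. An in-memory stand-in replaces the real client so the sketch runs without a DashClaw instance; consult `sdk/README.md` for the actual signatures.

```javascript
// Run one unit of work under governance. `claw` is any object exposing
// the five methods; payload shapes here are hypothetical.
async function runGovernedAction(claw, intent, doWork) {
  const decision = await claw.guard(intent);
  if (decision.decision === 'block') return { status: 'blocked' };

  const action = await claw.createAction(intent);
  if (decision.decision === 'require_approval') {
    await claw.waitForApproval(action.id); // optional human-in-the-loop gate
  }

  let outcome;
  try {
    outcome = { status: 'completed', output: await doWork() };
  } catch (err) {
    outcome = { status: 'failed', error: err.message };
  }
  await claw.updateOutcome(action.id, outcome);
  await claw.recordAssumption({
    action_id: action.id,
    assumption: intent.assumption,
  });
  return outcome;
}

// In-memory stand-in that records which methods were called, in order.
function createFakeClaw(decision) {
  const calls = [];
  const record = (name, result) => async () => {
    calls.push(name);
    return result;
  };
  return {
    calls,
    guard: record('guard', { decision }),
    createAction: record('createAction', { id: 'act_1' }),
    waitForApproval: record('waitForApproval', {}),
    updateOutcome: record('updateOutcome', {}),
    recordAssumption: record('recordAssumption', {}),
  };
}
```

Recording the outcome in both the success and failure paths keeps the decision ledger complete even when the agent's work throws.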
Legacy surface (dashclaw/legacy) adds ~2800 lines of compatibility
methods: pairing/identity, SSE events, wrapClient, compliance exports,
drift detection, activity logs, webhooks CRUD, prompt template CRUD, and the
full evaluation-run harness. See
docs/sdk-parity.md for the domain-level parity matrix.
Working examples for governed agent patterns across frameworks: OpenAI, Anthropic, LangGraph, CrewAI, AutoGen, Claude Managed Agents (custom tools), Claude Managed Agents (MCP), and Claude Managed Agents (MCP + Governance Skill, recommended). Each example demonstrates the full governance loop (guard, record, outcome) within its framework's execution model. See examples/README.md for the full list.
DashClaw is Decision Infrastructure, not an Agent Framework. We do not provide tools for agents to "work" (Calendar, Email, Chat). We provide the infrastructure to "govern" their work.
Rule: If a feature helps an agent achieve a goal, it is a Platform feature (Archived). If it helps an operator govern a goal, it is a Runtime feature (Core).