Skip to content

oritera/Cairn

Repository files navigation

Cairn Banner

Cairn

More Than Just AI Penetration Testing — Towards General State-Space Search

Tencent TCH

Cairn is a general-purpose problem-solving engine.
It defines no roles, no workflows. Given an origin and a goal, it searches for a path through an unknown state space.
AI Penetration Testing is one such problem — and a proven one.

Discord X

Cairn runtime screenshot

What is Cairn?

Penetration testing is fundamentally a directed search through a near-infinite state space:

  • Origin: known (target IP, target system)
  • Goal: defined (get a shell, capture the flag)
  • Path: unknown

This structure is not unique to penetration testing. Vulnerability research, mathematical proof, CTF challenges — any problem with a clear starting point, a clear success condition, and an unknown path in between shares the same shape.

Cairn is built for this class of problems. Penetration testing is the first domain it has been validated on.

The engine is built on a Blackboard Architecture with an explicit fact-intent graph. Three primitives are all it needs:

Concept Meaning
Fact A confirmed, objective finding written to the board
Intent A declared direction of exploration, not yet executed
Hint Human judgment injected at any time; absorbed by agents on the next read

The graph grows from origin toward goal. Every new Fact is a stepping stone; every Intent is a step into the unknown.

Agent Workers run an OODA loop — Observe the full graph, Orient to the current state, Decide on next intents, Act to explore — and write their findings back as new Facts. Workers have no fixed roles. Tasks are generated at runtime from the graph's current state, not from predefined job descriptions.

Agents coordinate exclusively through the shared board (Stigmergy). No direct communication. No information silos.

Cairn in Action

cairn.mp4

How It Works

Three task types, all executed by the same Worker:

Task What it does Output
Bootstrap At project start, attempts to solve the problem directly Fact + possible Complete
Reason Reads the full graph: is the goal met? What should be explored next? Complete / new Intents / no-op
Explore Claims one Intent, executes the exploration, reports findings One Fact

System architecture:

          ┌──────────────────────────────────┐
          │           Cairn Server           │
          │    Facts + Intents + Hints       │
          └─────────────────┬────────────────┘
                            │
                     Read / Write API
                            │
          ┌─────────────────┴────────────────┐
          │             Dispatcher           │
          │   Schedules tasks, manages       │
          │   containers, writes protocol    │
          └──────────┬───────────────┬───────┘
                     │               │
     ┌───────────────┴──┐     ┌──────┴──────────────┐
     │  Worker Container│     │  Worker Container   │
     │   (Project A)    │     │   (Project B)       │
     │  ┌────┐  ┌────┐  │     │  ┌────┐  ┌────┐     │
     │  │ W. │  │ W. │  │     │  │ W. │  │ W. │     │
     │  └────┘  └────┘  │     │  └────┘  └────┘     │
     └──────────────────┘     └─────────────────────┘

Cairn Server maintains graph consistency only.

Cairn Dispatcher reads the graph, schedules tasks, spins up and tears down worker containers, and is the sole writer to the protocol. Each project gets its own Worker Container; multiple Agent Workers run concurrently inside it. Agent Workers only receive a prompt and return structured output.

Supported worker backends: Claude Code, Codex, and Pi.

Results

Tencent Cloud Hackathon · AI Penetration Testing Challenge · 2nd Edition

610 teams · 1,345 participants · top universities and security firms across China

Metric Value
Problems solved 54 / 54 — only team to AK
Final ranking 3rd

The system had never been tested before the competition. The full pipeline came online for the first time at 4 AM on race day. No training, no tuning, no domain-specific tooling. Zero MCP tools, zero RAG, zero predefined agent roles.

Further Reading

Getting Started

Prerequisites

  • macOS or Linux
  • Python ≥ 3.12
  • Docker

Pull required images

Both setup methods require the worker container image:

docker pull --platform=linux/amd64 ghcr.io/oritera/cairn-worker-container:latest

Docker Compose (recommended)

Pull the base image used to build Cairn:

docker pull ghcr.io/astral-sh/uv:python3.13-trixie

Edit dispatch.yaml and fill in your LLM endpoints and API keys, then start both services:

docker compose up --build

This starts cairn-server on port 8000 and cairn-dispatcher once the server passes its health check. The dispatcher mounts dispatch.yaml from the project root and connects to Docker via the host socket. Data is persisted to ./datas/cairn/.

Manual

Edit dispatch.yaml and fill in your LLM endpoints and API keys, then:

# Start the server
uv run --project cairn cairn serve
 
# Run the dispatcher
uv run --project cairn cairn dispatch --config dispatch.yaml
 
# Run startup health checks only
uv run --project cairn cairn dispatch --config dispatch.yaml --startup-healthcheck-only

Disclaimer

Cairn is a general-purpose problem-solving engine. Although it supports penetration testing, CTF solving, security assessment, and vulnerability research workflows, it is intended to be used only in environments where you have explicit authorization to operate.

You are solely responsible for how you use this project. Do not use Cairn against systems, networks, applications, or data without clear prior permission from the owner or operator. Unauthorized security testing, exploitation, or data access may be illegal and may cause harm.

The developers and contributors of this project do not endorse or accept responsibility for any misuse, abuse, damage, loss, or legal consequences arising from its use. By using this project, you agree to ensure that your activities comply with all applicable laws, regulations, contractual obligations, and professional or organizational policies in your jurisdiction.

⚖️ License

This project is licensed under GNU AGPLv3 for personal and educational use.

Commercial Use: If you wish to use this project in a commercial or proprietary environment without the AGPL-3.0 open-source obligations, please contact me to obtain a commercial license.

Contributions: By submitting a Pull Request, you agree that your contributions may be used under both the AGPL-3.0 and the project's commercial license.