Status: Draft for community review
Author: Charles (@oop1013)
Last updated: 2026-05-03
License: CC-BY-4.0 (spec) / MIT (reference implementations)
Karnel exists to make a company of one human and one hundred agents capable of doing what a company of ten thousand humans used to do.
It is the substrate that turns small teams into ocean-scale autonomous output — by organizing AI agents into purposeful, persistent, owned groups that work, build, and maintain on the team's behalf.
The category we are building does not yet have a name. We propose: the One-Person Company Stack. Karnel is its first reference implementation.
Today's "AI agents" are not really agents. They are sessions inside walled gardens (Claude.ai, ChatGPT) or stateless scripts inside frameworks (LangGraph, CrewAI). When the session ends, when the framework process exits, when the provider changes its policy — the "agent" is gone. There is no entity that persists, accumulates judgment, and can move.
Open-source agent runtimes (Hermes Agent, others) solved part of this: skills can be portable, memory can persist across sessions, runtime is open. But existing runtimes are still single-agent at heart — designed around one chat-driven assistant that gets better at tasks over time. They do not have first-class structures for groups of agents working together toward shared goals, nor for the coordination, capital, and memory primitives a small autonomous company needs to operate.
Karnel proposes a different primitive:
A company is a group of agents — possibly with one or a few humans coordinating — bound by a shared mission, shared memory, shared resources, and a coordination protocol. The company persists across runtimes, models, and time. Each member of the company has portable identity. The company itself is a portable, ownable, transferable entity.
Karnel is the spec for what such a company looks like, plus reference implementations for hosting, coordinating, and evolving it. The runtime layer is replaceable. The agents are replaceable. The company is not.
- Group is first-class. A single agent is a special case (group of size one). All design decisions are made at the group level first; single-agent semantics are derived.
- Identity lives in the entity, not the runtime. Both individual agents and groups have portable identity. Hermes, a custom Python loop, a future runtime: all should be able to host the same agent or group without losing continuity.
- Multi-model is mandatory. Different roles need different cognitive tiers. Spec requires explicit model-class declaration per role. A frontier model writing every CSV row is as wrong as MIT graduates wiping tables.
- Foreign agents welcome. A group can include members that are not Karnel-native (a Hermes agent, a custom script, even a human). The substrate provides adapters; it does not require homogeneity.
- Portability is a feature, not a bug. Any Karnel agent or group can be exported, audited, forked, migrated, or deleted by its owner at any time. Lock-in is explicitly out of scope.
- Composable, not monolithic. Manifest, memory, skills, goals, ledger, treasury, coordination: all separable concerns with separate specs. You can adopt one piece without committing to the rest.
- Compatible with existing ecosystems. Karnel skills are a strict superset of Hermes/Anthropic SKILL.md format. Karnel memory has documented adapters for plur engrams. Karnel can coexist with hermeshub and agentskills.io as skill sources.
- Strategic, not just tactical. Most agent systems learn at the skill level ("how to do X better"). Karnel adds first-class structures for goal hierarchies, hypothesis tracking, and decision attribution, so groups can also learn at the strategic level ("which X is worth doing").
- Goal-neutral substrate. Karnel does not specify what a group should accomplish. Run a newsletter business. Run a research lab. Maintain an open-source project. Operate a creative studio. The substrate is the same; the mission is the user's.
- Spec first, code second. This document defines the contract. Implementations follow. Multiple implementations are expected and encouraged.
Karnel has four layers:
+-----------------------------------------------------------+
| Layer 4: Self-improvement loops |
| (group review, member roster optimization, |
| coordination tuning, memory consolidation) |
+-----------------------------------------------------------+
| Layer 3: Substrate |
| (hosting, scheduling, resource accounting, |
| lifecycle management, foreign agent adapters) |
+-----------------------------------------------------------+
| Layer 2: Group primitive |
| (manifest, members, coordination, group memory, |
| group ledger, group treasury, lifecycle state) |
+-----------------------------------------------------------+
| Layer 1: Agent primitive |
| (manifest, memory, skills, goals, ledger) |
+-----------------------------------------------------------+
Layers 1 and 2 are the protocol. Layers 3 and 4 are the runtime contract. The spec primarily defines Layers 1 and 2; Layers 3 and 4 are described as runtime requirements.
A Karnel agent is the union of five files/directories, all portable, inspectable, and version-controllable:
my-agent/
├── manifest.yaml # who this agent is
├── memory/ # what it remembers
├── skills/ # what it knows how to do
├── goals/ # what it is trying to accomplish (when standalone)
└── ledger/ # what it has done and what happened
When an agent is a member of a group, its goals/ may be subordinated to the group's goals or absent entirely; group-level goals supersede. Memory and ledger remain agent-specific (the agent's subjective experience), even within a group.
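The layout above lends itself to a mechanical check. The following is a minimal sketch, not part of the spec: `check_agent_dir` and `REQUIRED` are hypothetical names, and leaving `goals/` out of the required set is an assumption that follows the group-bound rule described above.

```python
import tempfile
from pathlib import Path

# Components named in the agent layout. goals/ is left out on purpose:
# treating it as optional is an assumption of this sketch.
REQUIRED = ("manifest.yaml", "memory", "skills", "ledger")


def check_agent_dir(root: Path) -> list[str]:
    """Return the required components missing from an agent directory."""
    return [name for name in REQUIRED if not (root / name).exists()]


def demo() -> tuple[list[str], list[str]]:
    """Build a throwaway agent directory and check it before and after a repair."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "my-agent"
        (root / "memory").mkdir(parents=True)
        (root / "skills").mkdir()
        (root / "manifest.yaml").write_text('karnel_spec_version: "0.2"\n', encoding="utf-8")
        before = check_agent_dir(root)   # ledger/ has not been created yet
        (root / "ledger").mkdir()
        after = check_agent_dir(root)
        return before, after
```

A runtime would presumably run a check like this before loading the manifest, so that a partially copied agent fails early instead of starting with no ledger.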
manifest.yaml:
karnel_spec_version: "0.2"
agent:
  id: "agt_01H8ZQK..." # ULID, immutable
  name: "ainfra-strategist"
  created_at: "2026-05-03T10:00:00Z"
  owner: "did:key:z6Mk..." # owner identity (DID or pubkey for v0)
  agent_kind: "karnel-native" # karnel-native | foreign
purpose: |
  Strategic reasoning for a newsletter business in AI infrastructure.
  Tracks hypotheses, allocates capital, adjusts goals based on outcomes.
capabilities_declared:
  - "strategic_review"
  - "capital_allocation"
  - "goal_tree_management"
constraints:
  hard:
    - "never spend > $20/week on inference without owner approval"
    - "never publish strategic decisions to public without owner approval"
  soft:
    - "prefer concrete hypotheses over abstract reasoning"
runtime_requirements:
  model_class: "frontier-reasoning" # see Cognitive Tiers section
  tools_required:
    - "memory_access"
    - "ledger_write"
  scheduler: "weekly_cron + on_escalation"
components:
  memory: "./memory"
  skills: "./skills"
  goals: "./goals" # optional if group-bound
  ledger: "./ledger"

Memory has three tiers, each a separate file or directory:
Episodic (memory/episodic.jsonl) — append-only event log.
Semantic (memory/semantic/) — vectorized index for retrieval; embedding-model-agnostic.
Strategic (memory/strategic.yaml) — the agent's opinion of itself and its environment over time.
Strategic memory is the differentiator. It contains:
hypotheses:
  - id: "hyp_01"
    statement: "Subject line specificity drives newsletter open rate more than send time"
    evidence_for: ["ledger/run_03", "ledger/run_07"]
    evidence_against: ["ledger/run_05"]
    confidence: 0.65
    last_reviewed: "2026-04-28"
priors:
  - "Topics with concrete code outperform pure analysis 1.4x on engagement"
self_model:
  strengths_observed: ["technical synthesis", "hypothesis generation"]
  weaknesses_observed: ["headline writing: critic rejects 40% of drafts"]
open_questions: ["is paid conversion gated by content or signup friction?"]

Memory implementations are pluggable but interface-standardized. Three reference implementations are expected:
- File-based (default, works for solo operation)
- SQLite + FTS5 (Hermes-compatible)
- Vector DB + structured store (production scale)
Implementations must conform to the Memory Interface (defined in Annex A).
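Annex A is not yet drafted, so the following is only a guess at the smallest useful slice of such an interface, shown with the file-based default. The `EpisodicMemory` protocol, its `append` and `events` methods, and the `FileEpisodicMemory` class are all hypothetical names introduced for illustration; only the `memory/episodic.jsonl` path comes from the text above.

```python
import json
import tempfile
from pathlib import Path
from typing import Iterator, Protocol


class EpisodicMemory(Protocol):
    """Hypothetical slice of the Memory Interface: append and replay only."""

    def append(self, event: dict) -> None: ...
    def events(self) -> Iterator[dict]: ...


class FileEpisodicMemory:
    """File-based episodic tier backed by memory/episodic.jsonl."""

    def __init__(self, agent_root: Path) -> None:
        self.path = agent_root / "memory" / "episodic.jsonl"
        self.path.parent.mkdir(parents=True, exist_ok=True)

    def append(self, event: dict) -> None:
        # "a" mode only ever adds lines; existing lines are never rewritten.
        with self.path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(event, sort_keys=True) + "\n")

    def events(self) -> Iterator[dict]:
        if not self.path.exists():
            return
        with self.path.open(encoding="utf-8") as fh:
            for line in fh:
                if line.strip():
                    yield json.loads(line)


def demo() -> list[dict]:
    """Write two events and replay them in order."""
    with tempfile.TemporaryDirectory() as tmp:
        mem: EpisodicMemory = FileEpisodicMemory(Path(tmp) / "my-agent")
        mem.append({"type": "run_started", "run": "run_03"})
        mem.append({"type": "run_finished", "run": "run_03"})
        return list(mem.events())
```

A SQLite or vector-backed implementation could satisfy the same two methods, which is the sense in which the tiers are pluggable but interface-standardized.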
Karnel skills are a strict superset of the Hermes/Anthropic SKILL.md format.
skills/
├── research-arxiv-trending/
│ ├── SKILL.md # standard SKILL.md (Hermes-compatible)
│ ├── karnel.yaml # Karnel-specific metadata (optional)
│ ├── references/
│ └── templates/
karnel.yaml:
karnel_skill_version: "0.2"
provenance:
  source: "self-authored" # or "hermeshub:<id>", "agentskills:<id>"
  created_at: "2026-04-12T..."
  created_by_run: "run_01H..."
usage:
  total_invocations: 47
  success_rate: 0.91
attribution:
  contributed_to_outcomes: ["ledger/run_07", "ledger/run_12"]

Existing Hermes skills work as-is in Karnel. Karnel-aware runtimes use the optional metadata for self-improvement decisions.
When the agent is standalone, goals/current.yaml holds the agent's mission tree. When the agent is group-bound, this directory may be absent or contain only role-specific objectives derived from the group's goals.
The ledger is an append-only record of every run, decision, and outcome.
ledger/
├── runs/
│ └── run_01H.../
│ ├── manifest.json # what this run was
│ ├── steps.jsonl # tool calls, decisions, costs
│ └── outcome.yaml # what happened
└── decisions/
└── dec_01H.../
├── context.yaml
├── reasoning.md
└── outcome.yaml # filled when observable
Ledger entries are append-only and content-addressed (hash chain).
Spec v0.2 defines hash chain only; cryptographic attestation (signing per entry) is a v0.3+ extension. Hooks are present in the schema (signature field reserved).
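One way to picture the hash chain is the sketch below. The entry field names (`prev_hash`, `payload`, `hash`), the all-zero genesis value, and the choice of SHA-256 over canonical JSON are assumptions for illustration; the spec text above only states that entries are hash-chained and that a `signature` field is reserved.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(chain: list[dict], payload: dict) -> dict:
    """Append a new entry that commits to everything before it."""
    prev = chain[-1]["hash"] if chain else GENESIS
    entry = {
        "prev_hash": prev,
        "payload": payload,
        "hash": entry_hash(prev, payload),
        "signature": None,  # reserved; signing is a v0.3+ extension
    }
    chain.append(entry)
    return entry


def verify_chain(chain: list[dict]) -> bool:
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True
```

Because each hash covers the previous one, changing an old entry invalidates every later link. That makes tampering detectable, but it does not identify who wrote an entry, which is the gap per-entry signing is meant to close.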
A group is the unit Karnel is fundamentally about. A group represents a "company" — an organized set of agents (and optionally humans) bound by a shared mission, with shared memory, shared resources, and a coordination protocol.
my-group/
├── manifest.yaml # who this group is
├── members/ # references to member agents
├── coordination/ # how members work together
├── memory/ # group-shared memory
├── goals/ # group's mission tree
├── ledger/ # group-level decisions and outcomes
└── treasury/ # group's resources
karnel_spec_version: "0.2"
group:
  id: "grp_01H..."
  name: "ainfra-newsletter-co"
  created_at: "2026-05-03T..."
  owner: "did:key:z6Mk..."
  lifecycle: "work" # work | build | maintain
mission: |
  Operate a profitable newsletter business in the AI infrastructure niche.
  Grow paid subscribers. Maintain quality. Develop adjacent products
  (community, courses) when audience supports.
vision: |
  Become a default reading source for serious AI infrastructure operators
  in 18 months.
constraints:
  hard:
    - "weekly_inference_budget < $50"
    - "never publish without critic approval"
    - "never send to unsubscribed addresses"
  soft:
    - "voice consistency: technical, conversational, founder POV"
    - "prefer original analysis over aggregation"
members:
  - role: "strategist"
    agent_ref: "./members/strategist.yaml"
    cadence: "weekly"
    model_class: "frontier-reasoning"
  - role: "researcher"
    agent_ref: "./members/researcher.yaml"
    cadence: "daily"
    model_class: "fast-general"
  - role: "writer"
    agent_ref: "./members/writer.yaml"
    cadence: "weekly"
    model_class: "fast-general"
  - role: "critic"
    agent_ref: "./members/critic.yaml"
    cadence: "per_publication"
    model_class: "frontier-reasoning"
  - role: "human_owner"
    agent_kind: "human"
    contact: "owner@example.com"
    escalation_only: true
coordination_ref: "./coordination/protocol.yaml"
components:
  memory: "./memory"
  goals: "./goals"
  ledger: "./ledger"
  treasury: "./treasury"

Each member is referenced (not embedded). A member can be:
- Karnel-native agent: full agent directory, owned by the group or owned independently and "loaned" to the group
- Foreign agent: an external agent (Hermes-hosted, API-based, custom) accessed through an adapter
- Human: a person with declared escalation paths
# members/researcher.yaml example for Karnel-native member
member:
  role: "researcher"
  agent_kind: "karnel-native"
  agent_ref:
    location: "./agents/researcher/" # local agent directory
    # or: "agt_01H..." (registry lookup)
  membership:
    joined_at: "2026-05-03T..."
    contract:
      cadence: "daily"
      task_class: "research_synthesis"
      max_cost_per_run: 0.50

# Foreign agent example (Hermes-hosted)
member:
  role: "specialist_writer"
  agent_kind: "foreign"
  adapter:
    type: "hermes"
    endpoint: "http://localhost:7474"
    skill_to_invoke: "long-form-writer"
  membership:
    joined_at: "2026-05-10T..."
    contract:
      cadence: "on_demand"
      max_cost_per_run: 1.00

Adapter spec for foreign agents is defined in Annex B.
The coordination protocol defines how members work together. Three default coordination types are provided (hierarchical, flat, network); custom types are allowed.
# coordination/protocol.yaml
coordination_type: "hierarchical" # hierarchical | flat | network
hierarchical:
  decision_authority: "strategist"
  task_dispatch:
    - source: "strategist"
      targets: ["researcher", "writer"]
      mode: "directive"
  review_chain:
    - actor: "writer"
      action: "draft"
    - actor: "critic"
      action: "review"
      blocking: true
    - actor: "strategist"
      action: "approve_publication"
      blocking: true
  escalation_paths:
    - condition: "cost_overrun_predicted"
      to: "human_owner"
    - condition: "critic_rejection_3x"
      to: "strategist"
    - condition: "strategic_pivot_needed"
      to: "human_owner"
communication:
  default: "structured_message" # not free-form chat
  message_schema: "see Annex C"
  archival: "all_messages_to_ledger"

Communication between members is structured by default, not free-form chat. Free-form is allowed but expensive and discouraged for routine coordination. Removing communication overhead is a primary mechanism by which 1+100 outperforms 100 humans.
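To make the review chain and escalation paths in the protocol example concrete, here is a minimal sketch of how a runtime might walk them. `run_review_chain`, `escalate`, and the `decide` callback are hypothetical names. Treating a non-blocking step's verdict as advisory is an assumption, since the protocol example does not say what a failed non-blocking step means.

```python
from typing import Callable

# Mirrors review_chain and escalation_paths from the protocol example.
REVIEW_CHAIN = [
    {"actor": "writer", "action": "draft"},
    {"actor": "critic", "action": "review", "blocking": True},
    {"actor": "strategist", "action": "approve_publication", "blocking": True},
]
ESCALATION_PATHS = {
    "cost_overrun_predicted": "human_owner",
    "critic_rejection_3x": "strategist",
    "strategic_pivot_needed": "human_owner",
}


def run_review_chain(chain: list[dict], decide: Callable[[str, str], bool]) -> dict:
    """Walk the chain in order; a rejected blocking step stops the run."""
    completed: list[str] = []
    for step in chain:
        approved = decide(step["actor"], step["action"])
        if not approved and step.get("blocking", False):
            return {"status": "blocked", "blocked_by": step["actor"], "completed": completed}
        completed.append(step["action"])  # non-blocking verdicts are advisory here
    return {"status": "done", "blocked_by": None, "completed": completed}


def escalate(condition: str) -> str:
    """Look up which role receives an escalation for a named condition."""
    return ESCALATION_PATHS[condition]
```

For example, `run_review_chain(REVIEW_CHAIN, lambda actor, action: actor != "critic")` stops at the critic with only the draft completed, and `escalate("critic_rejection_3x")` routes to the strategist.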
Group memory is distinct from agent memory. It contains what the group as a whole knows.
memory/
├── episodic.jsonl # group-level events (publications, milestones, pivots)
├── semantic/ # shared knowledge (vector index over group docs)
└── strategic.yaml # group-level hypotheses and self-model
Members can read group memory; writes to strategic memory are gated by coordination protocol (typically: only strategist writes, after strategic review).
Members maintain their own subjective memory in their agent directory. Group memory is the consensus-reality view.
The goal tree at the group level supersedes member-level goals.
mission: "Operate a profitable newsletter business in AI infrastructure."
active_goals:
  - id: "g_subs"
    statement: "1,000 free subscribers by 2026-08-01"
    metrics: [...]
    sub_goals:
      - id: "g_subs_traffic"
        statement: "5K weekly visitors via SEO"
        owner_role: "researcher"
      - id: "g_subs_referral"
        statement: "Implement referral mechanism"
        owner_role: "strategist"
        status: "blocked"
        blocker: "owner_decision_pending"
  - id: "g_quality"
    statement: "Maintain >40% open rate"
    metrics: [...]
    owner_role: "critic"
review_cadence:
  strategic: "weekly" # group strategic review
  tactical: "per_run"

The group ledger records group-level decisions and outcomes. It is distinct from individual member ledgers (which record what each member did).
ledger/
├── decisions/ # group-level strategic decisions
├── outcomes/ # outcomes attributed to goals
└── coordination_log.jsonl # routine coordination events
The treasury holds group-level resources. This is where multi-model cost reasoning, budget enforcement, and capital allocation live.
# treasury/balances.yaml
financial:
  budget_weekly: 50.00
  spent_this_week: 12.30
  revenue_mtd: 142.00
inference_budget:
  by_model_class:
    frontier-reasoning: 15.00
    fast-general: 30.00
    bulk-cheap: 5.00
api_quotas:
  anthropic_remaining_pct: 0.78

# treasury/allocation_rules.yaml
weekly_allocation:
  content_production: 60%
  growth_experiments: 20%
  member_inference: 20%
approval_gates:
  - any_action_above: 25.00
    requires: "human_owner"
  - any_recurring_above: 5.00_per_week
    requires: "human_owner"

Treasury implementations may be simple (file-based budget tracking) or sophisticated (real wallet integration). Spec defines interface; payment mechanism (BYOK API key vs. x402 native vs. owner-provisioned SaaS) is implementation-tier.
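At the simple, file-based end of that range, budget enforcement could look like the sketch below. It uses the numbers from the treasury example (weekly budget 50.00, 12.30 spent, approval required above 25.00). The `Treasury` dataclass, `check_spend`, and the three return labels are hypothetical. Checking the weekly budget before the approval gate is a design assumption, not something the spec prescribes.

```python
from dataclasses import dataclass


@dataclass
class Treasury:
    budget_weekly: float
    spent_this_week: float
    approval_threshold: float  # corresponds to any_action_above


def check_spend(t: Treasury, amount: float) -> str:
    """Classify a proposed one-off spend as 'deny', 'needs_approval', or 'allow'."""
    if t.spent_this_week + amount > t.budget_weekly:
        return "deny"            # would exceed the weekly budget outright
    if amount > t.approval_threshold:
        return "needs_approval"  # routed to human_owner per approval_gates
    return "allow"


treasury = Treasury(budget_weekly=50.00, spent_this_week=12.30, approval_threshold=25.00)
small = check_spend(treasury, 5.00)    # within budget and under the gate
large = check_spend(treasury, 30.00)   # within budget, but above 25.00
over = check_spend(treasury, 40.00)    # 12.30 + 40.00 exceeds 50.00
```

A wallet-backed implementation would presumably keep the same decision function and replace only where the balances come from.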
A core Karnel primitive is the explicit declaration of a cognitive tier per role. This eliminates the "MIT graduates wiping tables" problem: spending frontier-level reasoning on work that does not need it.
Spec-defined tiers:
| Tier | Class name | Use case | Example models (illustrative) |
|---|---|---|---|
| T1 | frontier-reasoning | Strategic, judgment-heavy, irreversible decisions | Opus, GPT-5, top reasoning models |
| T2 | fast-general | Production work, drafting, structured tasks | Sonnet, GPT-4-class |
| T3 | bulk-cheap | High-volume, low-judgment, repetitive | Haiku, DeepSeek-V, Qwen |
| T4 | local-private | Sensitive data, offline | Llama, local quantized models |
Manifest declares tier classes, not specific model names. Runtime resolves tier-class to concrete model based on availability and configuration.
This separation ensures model migration (Sonnet → next-gen Sonnet) is automatic, while role assignment (strategist needs frontier) is portable.
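Annex D is still pending, so the resolution rule below is only one plausible reading: the runtime keeps an ordered candidate list per tier and picks the first model that is currently available. `resolve_model`, the `candidates` mapping, and the `vendor/...` model ids are placeholders, not real model names or a defined API.

```python
TIERS = ("frontier-reasoning", "fast-general", "bulk-cheap", "local-private")


def resolve_model(model_class: str, candidates: dict[str, list[str]], available: set[str]) -> str:
    """Map a declared tier class to the first available concrete model."""
    if model_class not in TIERS:
        raise ValueError(f"unknown model_class: {model_class}")
    for model in candidates.get(model_class, []):
        if model in available:
            return model
    raise LookupError(f"no available model for tier {model_class}")


# Runtime-side configuration (hypothetical ids), ordered by preference.
candidates = {"fast-general": ["vendor/model-next", "vendor/model-current"]}

today = resolve_model("fast-general", candidates, {"vendor/model-current"})
later = resolve_model("fast-general", candidates, {"vendor/model-current", "vendor/model-next"})
```

The manifest never changes between `today` and `later`; only the runtime's availability set does, which is the sense in which migration is automatic while the role assignment stays portable.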
A group can include members that are not Karnel-native. This is critical: it lets users incorporate Hermes agents, custom scripts, or future agent systems without rewriting them.
Foreign agent adapter contract (Annex B summary):
- Adapter declares the foreign agent's task_classes accepted
- Adapter translates Karnel structured messages to foreign agent's input format
- Adapter writes outcomes back to group ledger in Karnel format
- Adapter declares cost_estimation_function (best-effort)
Reference adapters in v0.2:
- Hermes adapter (adapter_type: "hermes")
- HTTP API adapter (adapter_type: "http")
- Local script adapter (adapter_type: "subprocess")
- Human adapter (adapter_type: "human")
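As an illustration of the four adapter obligations, the sketch below wraps a local command in the spirit of the subprocess adapter. Annex B and Annex C are not yet written, so the class name `SubprocessAdapter`, its methods, the message shape (`task_class` plus `payload`), and the in-memory `ledger` list standing in for the group ledger are all assumptions.

```python
import json
import subprocess
import sys
from dataclasses import dataclass, field

# Stand-in "foreign agent": reads JSON on stdin, prints the text upper-cased.
ECHO_UPPER = "import json, sys; print(json.load(sys.stdin)['text'].upper())"


@dataclass
class SubprocessAdapter:
    """Hypothetical local-script adapter around an external command."""

    command: list[str]
    task_classes: list[str]                 # declared task classes accepted
    flat_cost: float = 0.0                  # best-effort cost estimate per run
    ledger: list[dict] = field(default_factory=list)  # stand-in for the group ledger

    def estimate_cost(self, message: dict) -> float:
        return self.flat_cost

    def invoke(self, message: dict) -> dict:
        if message["task_class"] not in self.task_classes:
            raise ValueError(f"task_class not accepted: {message['task_class']}")
        # Translate the structured message to the foreign format: JSON on stdin.
        proc = subprocess.run(
            self.command, input=json.dumps(message["payload"]),
            capture_output=True, text=True, check=True,
        )
        outcome = {
            "task_class": message["task_class"],
            "output": proc.stdout.strip(),
            "cost_estimate": self.estimate_cost(message),
        }
        self.ledger.append(outcome)         # write the outcome back in one shape
        return outcome


adapter = SubprocessAdapter(
    command=[sys.executable, "-c", ECHO_UPPER],
    task_classes=["research_synthesis"],
    flat_cost=0.01,
)
result = adapter.invoke({"task_class": "research_synthesis", "payload": {"text": "hello"}})
```

An HTTP or Hermes adapter would differ only in `invoke`; the declaration, translation, ledger write-back, and cost estimate stay in the same places.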
A Karnel-compatible runtime promises to:
- Honor manifest constraints. Hard constraints are enforced as gates; soft as preferences in prompts.
- Resolve runtime requirements. Map declared cognitive tier to concrete models. Map declared tools to available implementations.
- Append, never mutate. Memory and ledger entries are append-only.
- Run the standard loops. Tactical execution, strategic review, coordination dispatch, lifecycle management.
- Be replaceable. A runtime must be able to hand off a group/agent to another runtime cleanly.
- Support foreign agents. Adapter mechanism described in Annex B.
The spec does not require the runtime to be open-source, only that it implements the above. A proprietary runtime that hosts Karnel groups is fine — what matters is that the group can leave.
Reference runtimes (planned):
- Karnel Local: minimal Python implementation for self-hosting; reference impl
- Karnel Cloud: managed always-on hosting (commercial; companion product)
- Karnel-on-Hermes: optional adapter-driven mode where some members run on Hermes
The first-party reference agent suite is Karnel.ai — opinionated Karnel-native agents (strategist, researcher, writer, critic templates). Karnel.ai is one possible source of agents; foreign agents are equally first-class.
Karnel groups improve themselves at multiple horizons.
After each task run, a review process decides:
- Should a new skill be authored from this run?
- Should existing skills be updated?
- Should episodic memory be condensed into semantic memory?
Reference implementation borrows heavily from Hermes' review fork. Karnel-aware adaptation: scoped to memory + skills toolsets only; runs in isolation.
The group's strategist (or a dedicated review process) reviews:
- Goal progress: which sub-goals advanced, which stalled?
- Hypothesis status: what evidence accumulated?
- Strategic memory updates: new priors, retired hypotheses
- Roster check: any member underperforming?
- Resource check: was budget allocation optimal?
Output: updated goals/current.yaml, updated memory/strategic.yaml, optional roster change proposal (requires owner approval per coordination protocol).
- Mission alignment check
- Coordination protocol audit (any communication overhead patterns?)
- Capital allocation rule review
Background process (default cadence: weekly) that:
- Grades skill library (most-used / least-used / consolidate / prune)
- Identifies stale strategic hypotheses
- Suggests skill exports to public marketplaces
Bundled and hub-distributed skills are protected from mutation.
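A curation pass could grade skills from the `usage` and `provenance` fields of each skill's karnel.yaml. The sketch below is illustrative only: `grade_skill`, its grade labels, and both thresholds are made up. It also assumes that any skill whose source is not "self-authored" counts as protected, which is one reading of the mutation rule above.

```python
def grade_skill(meta: dict, min_uses: int = 5, min_success: float = 0.5) -> str:
    """Grade one skill from its metadata; thresholds are arbitrary defaults."""
    if meta["provenance"]["source"] != "self-authored":
        return "protected"        # hub-distributed or bundled: never mutated
    usage = meta["usage"]
    if usage["total_invocations"] < min_uses:
        return "least-used"
    if usage["success_rate"] < min_success:
        return "prune-candidate"
    return "keep"


# Values taken from the karnel.yaml example earlier in this document.
own_skill = {
    "provenance": {"source": "self-authored"},
    "usage": {"total_invocations": 47, "success_rate": 0.91},
}
hub_skill = {
    "provenance": {"source": "hermeshub:<id>"},
    "usage": {"total_invocations": 2, "success_rate": 0.10},
}
grades = [grade_skill(own_skill), grade_skill(hub_skill)]
```

Checking provenance first means a poorly performing hub skill is reported as protected, not queued for pruning, so the curator can at most stop using it.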
To migrate a group from runtime A to runtime B:
- Runtime A receives migrate(group_id, destination).
- Runtime A halts the group's loops, flushes state to disk.
- Group directory transferred (any transport).
- Runtime B validates manifest, resolves runtime requirements. Foreign-agent adapters re-establish connection.
- Runtime B starts the group's loops. Group resumes: same group.id, same memory, same goals, continued ledger.
Member agents that are owned by the group migrate with it. Member agents that are independently owned re-establish membership via their own re-binding step.
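The transfer and validation steps of the migration can be sketched with a plain directory copy. This is a toy under stated assumptions: the loops are taken to be already halted, a local copy stands in for "any transport", and the manifest is scanned with a naive line match where a real runtime would use a YAML parser. `migrate`, `read_group_id`, and the id `grp_demo` are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def read_group_id(group_dir: Path) -> str:
    """Naive scan for the first id: field; not a substitute for YAML parsing."""
    for line in (group_dir / "manifest.yaml").read_text(encoding="utf-8").splitlines():
        stripped = line.strip()
        if stripped.startswith("id:"):
            return stripped.split(":", 1)[1].split("#")[0].strip().strip('"')
    raise ValueError("manifest.yaml has no id field")


def migrate(source_dir: Path, dest_root: Path) -> Path:
    """Copy a halted group's directory and confirm its identity survived."""
    group_id = read_group_id(source_dir)
    dest_dir = dest_root / source_dir.name
    shutil.copytree(source_dir, dest_dir)        # transfer the whole directory
    if read_group_id(dest_dir) != group_id:      # validate on the receiving side
        raise RuntimeError("group id changed during transfer")
    return dest_dir


def demo() -> tuple[str, bool]:
    """Migrate a tiny group between two temporary 'runtimes'."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "runtime_a" / "my-group"
        (src / "ledger").mkdir(parents=True)
        (src / "manifest.yaml").write_text('group:\n  id: "grp_demo"  # hypothetical id\n', encoding="utf-8")
        (src / "ledger" / "coordination_log.jsonl").write_text("{}\n", encoding="utf-8")
        dest = migrate(src, Path(tmp) / "runtime_b")
        return read_group_id(dest), (dest / "ledger" / "coordination_log.jsonl").exists()
```

Because the group is just a directory, the destination runtime sees the same id and the same ledger files, so the ledger continues from where it stopped.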
- Cross-group settlement / payment. AgentPay-style structure is a separate spec building on top.
- Public agent/group marketplace. AgentMart-style discovery and transfer is a separate spec.
- Cryptographic attestation full implementation. v0.2 has hash chain only; signing in v0.3+.
- Multi-owner manifest. v0.2 assumes single owner. Fractional ownership in v0.4+.
- Privacy / encryption at rest. Implementation concern in v0.2; standardized in v0.3+.
- Agent-native monetary actions. x402 / autonomous payment is implementation-tier; spec stays neutral.
- Cross-jurisdictional legal compliance. Out of scope; handled by deployment context.
- Group manifest format: YAML chosen for human readability. Strong arguments for JSON (toolchain) or TOML welcome.
- Coordination protocol DSL: Current proposal is structured YAML. Should there be a proper DSL (like a workflow language)?
- Memory interface stability: Pluggable memory implementations need stable interface. Annex A is v0.2 first-pass.
- Foreign agent adapter generality: Annex B covers Hermes, HTTP, subprocess, human. What's missing?
- Ledger integrity vs. mutation needs: Append-only is clean but archiving / consolidation needs a mechanism. Proposal: snapshot-and-supersede pattern. Feedback wanted.
- Tier naming: frontier-reasoning / fast-general / bulk-cheap. Do better names exist?
This is a draft. The goal of publishing before implementation completion is to find wrong design decisions while they are cheap to change.
- Issues / discussions on this repo with specific objections or alternatives
- Implementations of any single component welcome — we expect multiple reference implementations to coexist
- Particularly interested in feedback from: Hermes Agent contributors, plur authors, hermeshub maintainers, anyone running an autonomous agent system in production for >30 days, anyone running a 1-human-many-agent operation today
This spec stands on prior work by:
- NousResearch / Hermes Agent — for proving open, model-agnostic agent runtimes work, and for the SKILL.md format which Karnel skills extend.
- plur — for the open engram memory concept that informs Karnel's episodic tier.
- Anthropic Skills — for the SKILL.md primitive itself.
- Paperclip — for demonstrating that "AI as workforce" is the right organizational frame, even though the implementation is closed.
- The broader open-source agent ecosystem — whose existence makes a portability / orchestration layer worth defining.
- Annex A: Memory Interface specification
- Annex B: Foreign Agent Adapter specification
- Annex C: Structured Message Schema (inter-member communication)
- Annex D: Cognitive Tier Resolution rules
- Annex E: Migration Protocol detailed specification
These annexes will be drafted after v0.2 receives initial community feedback. v0.2 ships with the main document; annexes are stubs marked "RFC pending."
- v0.1: Single-agent identity layer (private draft, May 2026)
- v0.2 (this): Group primitive as first-class, cognitive tiers, foreign agents (May 2026)
- v0.3 (planned): Cryptographic attestation, encryption-at-rest, full annexes
- v1.0 (planned): After 6 months of reference implementation use, evidence-driven cleanup