Battle-tested plugins for Hermes Agent: zero core patches required. Drop in, enable, restart.
1. async-delegate/
Spawn background subagents without blocking the current conversation turn.
A Hermes Agent plugin that adds true async task delegation: fire off a subagent to work on something in the background while you keep chatting. When the task finishes, a notification is automatically injected back into the originating session.

    +--------------+  delegate_async   +------------------+
    |    Agent     | ----------------> |     Subagent     |
    |    (turn)    |  returns task_id  |  (hermes chat)   |
    |              |  immediately      |  runs in bg      |
    |              |                   +--------+---------+
    |  continues   |                            |
    |  chatting    |                   .done file written
    |  normally    |                            |
    +------+-------+                            |
           |            +--------------+        |
           |            |   Watcher    | <------+
           |            |   Thread     |  polls every 5s
           |            |   (daemon)   |
           |            +------+-------+
           |                   |
           | <-----------------+
           |  notification injected
           |  (queue or steer)
           v

Architecture:
- `delegate_async` tool: The agent calls this to spawn a background `hermes chat` process. Returns a `task_id` immediately. The agent's current turn is NOT blocked.
- File-based coordination: Each task gets a set of files in `~/.hermes/async-tasks/` (JSON metadata, prompt, wrapper script, output, error, done marker).
- Watcher thread: A daemon thread polls for `.done` files every 5 seconds. On completion, it injects a notification back into the originating session.
- Session injection: Uses the gateway's internal APIs to deliver the notification, with a fallback via the `pre_llm_call` hook.

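
The watcher step can be pictured with a minimal sketch. This is not the plugin's actual code: the function names, the `<task_id>.done` naming scheme, and the `notify` callback are assumptions made for illustration. Only the 5-second polling of `.done` markers from a daemon thread comes from the description above.

```python
import threading
from pathlib import Path
from typing import Callable


def find_done_tasks(task_dir: Path, seen: set[str]) -> list[str]:
    """Return task IDs whose .done marker is new, recording them in `seen`."""
    new_ids = []
    # Assumption: each marker is named <task_id>.done
    for marker in sorted(task_dir.glob("*.done")):
        task_id = marker.stem
        if task_id not in seen:
            seen.add(task_id)
            new_ids.append(task_id)
    return new_ids


def start_watcher(
    task_dir: Path,
    notify: Callable[[str], None],
    interval: float = 5.0,
) -> threading.Event:
    """Poll task_dir from a daemon thread; set the returned Event to stop."""
    stop = threading.Event()
    seen: set[str] = set()

    def loop() -> None:
        while not stop.is_set():
            for task_id in find_done_tasks(task_dir, seen):
                notify(task_id)  # hypothetical hook for session injection
            stop.wait(interval)  # sleeps, but wakes early when stopped

    threading.Thread(target=loop, daemon=True, name="async-watcher").start()
    return stop
```

Because the thread is a daemon, it never keeps the gateway process alive on shutdown, and `stop.wait(interval)` lets it exit promptly instead of finishing a full sleep.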
| Mode | Behavior | Use For |
|---|---|---|
| Queue (default) | Notification waits for the current turn to finish, then delivers as a clean new turn | Background research, lookups, fire-and-forget tasks |
| Steer | Notification is interleaved into the agent's active tool loop mid-turn | Results that might change what the agent is currently doing (API checks, validation, etc.) |

| Tool | Description |
|---|---|
| `delegate_async` | Spawns a background subagent. Returns `task_id` immediately. |
| `check_async_tasks` | Checks a specific task or lists all tasks. Includes a result preview for completed tasks. |

| Hook | Purpose |
|---|---|
| `pre_gateway_dispatch` | Captures the GatewayRunner reference and session routing from incoming messages |
| `pre_llm_call` | Fallback: scans for completed tasks before each LLM call |
| `on_session_end` | Cleans up task files older than 24 hours |

- File-based, not database: Simple, debuggable, no migration headaches.
- Session injection, not webhooks: Works in any deployment, no external HTTP endpoints.
- Dual routing lookup: In-memory dict (fast) + JSON fallback (survives gateway restarts).

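
The dual routing lookup could look roughly like the sketch below. The `RoutingTable` class, its method names, and the single-file JSON layout are hypothetical; the source only states that an in-memory dict is backed by a JSON fallback that survives gateway restarts.

```python
import json
from pathlib import Path
from typing import Optional


class RoutingTable:
    """Hypothetical task_id -> session routing store (illustration only)."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self._cache: dict[str, dict] = {}  # fast path, lost on restart

    def _load(self) -> dict:
        if not self.path.exists():
            return {}
        return json.loads(self.path.read_text())

    def put(self, task_id: str, route: dict) -> None:
        self._cache[task_id] = route
        data = self._load()
        data[task_id] = route
        self.path.write_text(json.dumps(data))  # durable copy

    def get(self, task_id: str) -> Optional[dict]:
        if task_id in self._cache:
            return self._cache[task_id]
        # Slow path: a restarted gateway starts with an empty cache
        route = self._load().get(task_id)
        if route is not None:
            self._cache[task_id] = route
        return route
```

A fresh `RoutingTable` pointed at the same file still resolves routes written before the restart, which is the property the design decision is after.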

    cp -r async-delegate ~/.hermes/plugins/async-delegate
    # Add to config.yaml:
    # plugins:
    #   enabled:
    #     - async-delegate
    # Restart gateway

No additional config needed. Drop the plugin folder into `~/.hermes/plugins/` and restart the gateway.
2. multi-agent-context/
Injects shared channel/group history into agent context so agents can see what other agents said, without triggering infinite reply loops. Supports Discord (REST API) and Telegram (shared SQLite).
Running multiple Hermes agents in the same Discord channel creates a dilemma with no good built-in solution:

| Discord Trigger Mode | Problem |
|---|---|
| `require_mention: true` | Agents only respond when @mentioned, BUT they see only the message they were tagged in, with zero context of what anyone else said before. They respond blind. |
| `trigger: "all"` | Agents see every message, BUT they respond to each other's messages in an infinite loop, burning tokens until you shut them down. |

You're forced to choose between agents that are deaf and agents that won't shut up. There is no middle ground in Hermes' built-in config.
This plugin gives you both: agents get full channel context (so they understand what's happening) but only speak when @mentioned (so they don't loop).

    +------------------------------------------------------+
    |  Discord Channel                                     |
    |                                                      |
    |  User: "@Furina look at this screenshot"             |
    |  Zhongli: "I think it's a bug in run_agent.py"       |
    |  Nahida: "Actually the issue is in the compressor"   |
    |                                                      |
    |  +------------------------------------------------+  |
    |  | Furina receives @mention                       |  |
    |  |                                                |  |
    |  | WITHOUT plugin:                                |  |
    |  |   Sees ONLY: "@Furina look at this screenshot" |  |
    |  |   -> "What screenshot? What are we talking     |  |
    |  |      about? I have no context!"                |  |
    |  |                                                |  |
    |  | WITH multi-agent-context plugin:               |  |
    |  |   Sees: Full channel history injected via      |  |
    |  |         pre_llm_call hook                      |  |
    |  |   -> "Ah! Zhongli says run_agent.py, Nahida    |  |
    |  |      says compressor. Let me check both!"      |  |
    |  +------------------------------------------------+  |
    +------------------------------------------------------+

How it works under the hood:

Discord:
- Every time an agent is about to call the LLM (`pre_llm_call` hook), the plugin fetches the last N messages from the current Discord channel/thread via the bot token
- Formats them into a clean `[Recent Thread/Channel History]` block
- Injects that block as context into the current turn

Telegram:
- After every LLM response (`post_llm_call` hook), the plugin writes the user message and bot response to a shared SQLite database in WAL mode
- On the next turn (`pre_llm_call` hook), it reads recent turns from that shared DB
- Formats them as `[Recent Group History]` and injects them as context

Both platforms: the agent now knows what everyone said, but still only responds when triggered by its normal config (mention, keyword, etc.).
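
As a rough illustration of the Discord formatting step, the sketch below builds the history block from already-fetched messages, skipping the bot's own messages and rewriting raw user mentions. The message dict shape (`author_id`, `content`), the `names` lookup, and the `author: text` line format are assumptions; only the `[Recent Thread/Channel History]` header and the idea of self-filtering and mention cleanup come from this README.

```python
import re

# Discord user mentions arrive as <@123> or <@!123>
MENTION_RE = re.compile(r"<@!?(\d+)>")


def sanitize_mentions(text: str, names: dict[str, str]) -> str:
    """Replace raw mention markup with a readable @name."""
    return MENTION_RE.sub(
        lambda m: "@" + names.get(m.group(1), "user"), text
    )


def format_history(
    messages: list[dict], own_id: str, names: dict[str, str]
) -> str:
    """Build the context block, oldest message first."""
    lines = ["[Recent Thread/Channel History]"]
    for msg in messages:
        if msg["author_id"] == own_id:
            continue  # self-filtering: drop the bot's own messages
        author = names.get(msg["author_id"], "user")
        lines.append(f"{author}: {sanitize_mentions(msg['content'], names)}")
    return "\n".join(lines)
```

The resulting string is what a `pre_llm_call` hook would hand back as extra context for the turn.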

Key features:
- Multi-platform (v2.0): Discord (REST API) and Telegram (shared SQLite), both working simultaneously
- Contextvar-aware: Reads thread/channel/chat IDs from `gateway.session_context`, so no hardcoded IDs are needed
- Self-filtering (Discord): Strips the bot's own messages from history (no echo chamber)
- Cross-process shared state (Telegram): SQLite WAL mode enables multiple agent processes to read/write the same DB safely
- Cached (Discord): 10-second TTL prevents redundant API calls within the same turn
- Rate-limit handling: Respects Discord's `429 Retry-After`
- Mention sanitization: Strips Discord's `<@id>` formatting for readability
- Auto-pruning (Telegram): Messages older than 48 hours are automatically cleaned from the DB

Telegram's Bot API has no message history endpoint: unlike Discord, you can't just fetch recent messages from a group chat. Worse, when running multiple Hermes agent processes (one per bot), they cannot share in-memory state: each process has its own Python runtime, so a message received by one agent is invisible to the others.
The result: Telegram agents are deaf to each other, unable to build on what another agent just said.
The solution is a shared SQLite database on disk in WAL (Write-Ahead Logging) mode, which allows safe concurrent reads and writes across processes:

    +------------------------------------------------------+
    |  Telegram Group Chat                                 |
    |                                                      |
    |  User: "@Zhongli what's the status?"                 |
    |                                                      |
    |  +- Zhongli process --+  +- Nahida process -------+  |
    |  |                    |  |                        |  |
    |  | post_llm_call:     |  | pre_llm_call:          |  |
    |  | writes turn to ----+--+-> reads recent turns   |  |
    |  | shared SQLite      |  | from shared SQLite     |  |
    |  |                    |  |                        |  |
    |  | Nahida now sees:   |  | "Zhongli: All systems  |  |
    |  |                    |  | nominal, PR #42        |  |
    |  |                    |  | merged!"               |  |
    |  +--------------------+  +------------------------+  |
    |                                                      |
    |              +--- shared SQLite DB ---+              |
    |              | /root/.hermes/data/    |              |
    |              | multi_agent_tg_shared  |              |
    |              | .db (WAL mode)         |              |
    |              +------------------------+              |
    +------------------------------------------------------+

- `post_llm_call` hook: After every Telegram turn, writes the triggering user message and the bot's response to the shared DB
- `pre_llm_call` hook: Before the next LLM call, reads recent turns from the shared DB and injects them as context
- WAL mode: Readers and a writer in different processes can work concurrently without blocking each other (SQLite still serializes the writes themselves)
- Auto-pruning: Messages older than 48 hours are automatically deleted to keep the DB lean

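
A minimal sketch of the shared-database pattern, using only the standard `sqlite3` module. The `turns` table, its columns, and the function names are assumptions for illustration, not the plugin's real schema; the WAL pragma, the write-then-read flow, and the 48-hour pruning follow the description above.

```python
import sqlite3
import time
from typing import Optional


def open_shared_db(path: str) -> sqlite3.Connection:
    """Open (or create) the shared DB and switch it to WAL mode."""
    conn = sqlite3.connect(path, timeout=5.0)  # wait briefly on write locks
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS turns ("  # hypothetical schema
        "chat_id TEXT, speaker TEXT, content TEXT, ts REAL)"
    )
    conn.commit()
    return conn


def write_turn(
    conn: sqlite3.Connection,
    chat_id: str,
    speaker: str,
    content: str,
    now: Optional[float] = None,
    max_age_h: float = 48.0,
) -> None:
    """Persist one message and prune rows older than max_age_h hours."""
    now = time.time() if now is None else now
    with conn:  # one transaction: commit on success, rollback on error
        conn.execute(
            "INSERT INTO turns VALUES (?, ?, ?, ?)",
            (chat_id, speaker, content, now),
        )
        conn.execute(
            "DELETE FROM turns WHERE ts < ?", (now - max_age_h * 3600,)
        )


def read_recent(
    conn: sqlite3.Connection, chat_id: str, limit: int = 20
) -> list[tuple[str, str]]:
    """Return up to `limit` most recent (speaker, content) rows, oldest first."""
    rows = conn.execute(
        "SELECT speaker, content FROM turns WHERE chat_id = ? "
        "ORDER BY ts DESC, rowid DESC LIMIT ?",
        (chat_id, limit),
    ).fetchall()
    return rows[::-1]
```

In this picture each agent process opens its own connection to the same file: one calls `write_turn` after its response, and the others call `read_recent` before their next LLM call.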
The plugin now registers two hooks:

| Hook | Trigger | Platforms | Purpose |
|---|---|---|---|
| `pre_llm_call` | Before every LLM call | Discord + Telegram | Injects channel/group history as context |
| `post_llm_call` | After every LLM response | Telegram only | Persists the turn to the shared SQLite DB |

Both platforms work simultaneously: Discord uses the REST API to fetch history, Telegram uses the shared SQLite database.

    cp -r multi-agent-context ~/.hermes/plugins/multi-agent-context
    # Add to config.yaml:
    # plugins:
    #   enabled:
    #     - multi-agent-context
    # Keep require_mention: true (or your preferred trigger) in Discord config
    # Restart gateway

| Variable | Default | Description |
|---|---|---|
| `MULTI_AGENT_HISTORY_COUNT` | `20` | Number of recent messages to inject as context (both platforms) |
| `DISCORD_BOT_TOKEN` | (auto-set) | Discord bot token, set automatically by Hermes |
| `MULTI_AGENT_BOT_NAME` | (profile name) | Display name for this bot in Telegram shared history |
| `MULTI_AGENT_TG_DB_PATH` | `/root/.hermes/data/multi_agent_tg_shared.db` | Path to the shared SQLite database |

3. kanban-context/
Injects recent Kanban board activity into agent context so agents can see what tasks are moving through the board, without requiring explicit board queries.
The Hermes Kanban system powers multi-agent work queues with dependency chains, worker claims, and automatic task promotion. But the board lives in a SQLite database that agents never read during conversation. Workers using the `kanban_*` tools see their assigned task, but orchestrators and conversation agents have zero visibility into:
- Tasks being created and moving through the pipeline
- Blocked items affecting downstream work
- Completed tasks whose outputs may be useful
- Worker progress notes (heartbeats)
This plugin hooks into `pre_llm_call` and reads the last N events from the shared Kanban SQLite database. It injects a structured context block before every LLM call:

    [Recent Kanban Activity]
    - [2h ago] [kanban] **Design auth schema** (created → ready)
    - [30m ago] [kanban] **Implement auth API** (completed)
    - [5m ago] [linkedin-content] **Weekly trends post** (in progress: scraper running)
    [End Kanban Activity]

- Multi-board: Scans both default and named boards (`kanban/boards/*/kanban.db`)
- Chronological merge: Events from all boards are sorted by time
- 13 event kinds recognised: created, assigned, claimed, completed, blocked, unblocked, heartbeat, spawned, archived, commented, linked, edited, promoted
- No extra dependencies: Uses Python stdlib (`sqlite3`, `json`, `os`)
- Path resolution via `get_hermes_home()`: Works with any profile or `HERMES_HOME` override

The multi-agent-context plugin (above) shares conversation history across Telegram/Discord bots. kanban-context complements it by sharing board history: together they give agents both conversational and operational context.

| Variable | Default | Description |
|---|---|---|
| `KANBAN_CONTEXT_EVENT_LIMIT` | `10` | Max events to inject per pre-LLM context block |
| `KANBAN_CONTEXT_LOOKBACK_H` | `12` | Lookback window in hours |

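
To show how the event limit and lookback window might interact with the multi-board merge, here is a small sketch that works on already-loaded events. The event dict shape (`ts`, `title`, `detail`) and the function names are assumptions; the real plugin reads these rows from the Kanban SQLite files, which is left out here.

```python
def format_age(seconds: float) -> str:
    """Render an age as '5m ago', '2h ago', or '3d ago'."""
    if seconds < 3600:
        return f"{int(seconds // 60)}m ago"
    if seconds < 86400:
        return f"{int(seconds // 3600)}h ago"
    return f"{int(seconds // 86400)}d ago"


def build_block(
    events_by_board: dict[str, list[dict]],
    now: float,
    limit: int = 10,          # mirrors KANBAN_CONTEXT_EVENT_LIMIT
    lookback_h: float = 12.0,  # mirrors KANBAN_CONTEXT_LOOKBACK_H
) -> str:
    """Merge events from all boards by time and format the context block."""
    cutoff = now - lookback_h * 3600
    merged = [
        (ev["ts"], board, ev)
        for board, events in events_by_board.items()
        for ev in events
        if ev["ts"] >= cutoff  # drop anything outside the lookback window
    ]
    merged.sort(key=lambda item: item[0])  # chronological, oldest first
    lines = ["[Recent Kanban Activity]"]
    for ts, board, ev in merged[-limit:]:  # keep only the newest `limit`
        age = format_age(now - ts)
        lines.append(f"- [{age}] [{board}] **{ev['title']}** ({ev['detail']})")
    lines.append("[End Kanban Activity]")
    return "\n".join(lines)
```

Filtering by the lookback window before applying the limit means a quiet board yields a short block instead of padding it with stale events.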

    cp -r kanban-context ~/.hermes/plugins/kanban-context
    # Add to config.yaml:
    # plugins:
    #   enabled:
    #     - kanban-context
    # Restart gateway

For multi-profile setups, symlink or copy into each profile's plugins dir. See `kanban-context/README.md` for details.
4. native-vision/ (⚠️ DEPRECATED)
⚠️ Now a built-in feature in Hermes Agent v0.11.0+, so this plugin is no longer needed. Kept here for historical reference.
Bypass the auxiliary vision model and send images directly to vision-capable main LLMs (GPT-4o, Claude Sonnet 4, GLM-5V-Turbo, etc.).
- What it solved: Hermes routes all image analysis through an aux vision model (e.g., qwen-vl), even when your main model can see images natively. This adds latency, cost, and information loss (text description ≠ seeing pixels).
- How it worked: Runtime monkey-patching with signature-gated defensive checks. Inserts `[NATIVE_VISION_IMAGES:...]` markers into the text pipeline, then expands them into multimodal content blocks before the API call.
- Survives updates: If Hermes changes a method signature, that patch silently skips itself instead of crashing.
- Patches 5 methods across `gateway.run`, `cli`, and `run_agent`, all via `register(ctx)`.

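
The "signature-gated" idea can be sketched with `inspect.signature`: a patch is applied only when the target still has the parameter list the patch was written against. The helper name and its arguments are hypothetical and do not reflect the plugin's actual internals.

```python
import inspect
from typing import Any, Callable, Sequence


def patch_if_signature_matches(
    owner: Any,
    name: str,
    expected_params: Sequence[str],
    make_wrapper: Callable[[Callable], Callable],
) -> bool:
    """Replace owner.name with a wrapper only if its parameters match.

    Returns True when the patch was applied, False when it was skipped.
    """
    original = getattr(owner, name, None)
    if original is None:
        return False  # target was removed or renamed upstream
    try:
        params = list(inspect.signature(original).parameters)
    except (TypeError, ValueError):
        return False  # not introspectable, so do not risk patching
    if params != list(expected_params):
        return False  # signature changed: skip instead of crashing later
    setattr(owner, name, make_wrapper(original))
    return True
```

Returning a boolean rather than raising keeps a signature mismatch from taking down the host application, at the cost of the feature quietly turning off, so logging the skipped patches would be a sensible addition.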
| Setting | Default | Description |
|---|---|---|
| `native_vision_enabled` | `true` | Master on/off toggle |
| `max_image_dimension` | `1024` | Resize max side in px (saves tokens) |
| `max_total_image_tokens` | `100000` | Token budget for all images combined |
| `vision_models` | (see file) | Model name allowlist (substring match) |

- Hermes Agent v0.11.0+ with plugin system support
- Python 3.11+
- `async-delegate`: No extra dependencies; uses Python stdlib only
- `multi-agent-context`: `pip install requests` (usually already installed). The Telegram path uses Python's built-in `sqlite3`, so it needs no extra deps.
- `kanban-context`: No extra dependencies; uses Python stdlib only (`sqlite3`, `json`, `os`)

    # Global plugin location (for reference):
    #   ~/.hermes/plugins/<plugin-name>/
    # Each agent needs its own copy or symlink:
    for agent in furina raiden zhongli nahida; do
      mkdir -p ~/.hermes/profiles/${agent}/plugins/
      ln -sf ~/.hermes/plugins/<plugin-name> \
        ~/.hermes/profiles/${agent}/plugins/<plugin-name>
    done

Then enable the plugin in each agent's `~/.hermes/profiles/{agent}/config.yaml`.
See the Hermes Plugin Development Guide for full details on the plugin system.
MIT: use freely, modify freely, contribute back if you'd like!