Multiple Hermes Agents on a VPS

Run 1, 2, or N AI bots on Telegram from a single VPS. Your choice of model. Your choice of sharing. One curl | bash to ship.

Quick Install · Getting Started · Why this repo · Step-by-step · Providers · Troubleshooting

Quick Install

curl -fsSL https://raw.githubusercontent.com/Demonbane18/hermes-agent-setup/main/bootstrap.sh | bash

Works on any Linux/macOS/WSL2 VPS that has the Hermes Agent CLI installed. The bootstrap walks you through parent folder → gateway names → sharing strategy → LLM provider → model and writes everything in place. Existing files are never overwritten and you confirm before anything is created.

No Hermes CLI yet? Install it first with the upstream one-liner:
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
Then re-run the bootstrap above. Bare-metal walkthrough lives in Manual install fallback; the Hostinger 1-click path lives in Part 1.

Already have a ~/gateways/ setup? The bootstrap detects it and offers to extend it (add more bots, auto-inheriting your strategy + provider) or create a separate brand-new parent folder so your existing one stays untouched.

After installation:

cd ~/gateways                          # or whichever parent you chose
ls -la <gateway>/                      # list files (.env is hidden — needs -a)
nano <gateway>/.env                    # or vim, micro, $EDITOR — paste BotFather token + API keys
./run.sh all                           # start every discovered gateway

Why nano and not $EDITOR? Many fresh VPS shells don't have $EDITOR set, so a literal $EDITOR /path/.env expands to nothing and bash tries to execute the .env file (which fails with Permission denied because we chmod 600 it). Use a concrete editor name. The bootstrap script auto-detects one for its next-steps banner.
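
If you script this step yourself, one way to avoid the empty-$EDITOR trap is to resolve a concrete editor name before calling it. The sketch below is an assumption about how such a check could look, not a copy of what bootstrap.sh does; the function name pick_editor and the candidate list are illustrative.

```shell
# Print the editor to use: $EDITOR when it is set, otherwise the first
# candidate found on PATH. Returns non-zero if nothing is available.
# (pick_editor and the candidate list are illustrative, not from bootstrap.sh.)
pick_editor() {
  if [ -n "${EDITOR:-}" ]; then
    printf '%s\n' "$EDITOR"
    return 0
  fi
  for candidate in nano vim vi micro; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  echo "no editor found; install nano or set EDITOR" >&2
  return 1
}
```

Usage: editor="$(pick_editor)" && "$editor" <gateway>/.env. This assumes $EDITOR holds a single command name with no arguments.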

Getting Started

Once your gateways are running, everything lives in <parent>/run.sh:

./run.sh                               # start every discovered gateway (alias of all)
./run.sh list                          # list discovered gateway names
./run.sh status                        # show running PIDs + which gateway each serves
./run.sh stop                          # stop every gateway
./run.sh stop <name>                   # stop one gateway
./run.sh <name>                        # start one gateway in the foreground
./run.sh --help                        # full help

# Add another bot later (auto-detects strategy + provider from existing setup)
curl -fsSL https://raw.githubusercontent.com/Demonbane18/hermes-agent-setup/main/bootstrap.sh \
    | bash -s -- --add --parent ~/gateways --names <new-name>

# Check the bootstrap script version (useful when curl-piping after updates)
curl -fsSL https://raw.githubusercontent.com/Demonbane18/hermes-agent-setup/main/bootstrap.sh \
    | bash -s -- --version

Full setup walkthrough → · LLM Provider Reference → · Sharing Strategies → · Troubleshooting →

Who this is for: complete beginners. If you've never SSH'd into a server, never deployed a bot, never edited a YAML file, you're in the right place. Every command has a one-line plain-English explanation. Estimated time: 30–90 minutes with the Hostinger 1-click, 1–2 hours if you install manually.

What you'll have at the end: any number of Telegram bots that text you back, write their own skills over time, run on your choice of LLM provider (Xiaomi MiMo, Anthropic, OpenAI, OpenRouter, Gemini, Groq, DeepSeek, Ollama, or anything OpenAI-compatible), and quietly keep themselves in sync with your laptop's coding sessions. Each bot's brain is yours to isolate or share.

If this saves you a Saturday, star the repo so the next person can find it.

Quick Install
Getting Started
Why this guide exists
What is Hermes Agent?
Hermes Agent vs. OpenClaw — why this guide picked Hermes
Profiles vs. Multi-Gateway — four ways to share (or not)
Why one container, many gateways (and not N Docker containers)
Prerequisites & Costs
How it works (visual primer for first-timers)
- What's a symlink? (and what exactly are we symlinking?)
- Host VM vs container vs persistent volume
- What happens when you send a Telegram message
- The four kinds of "memory"
- Plain-English glossary of the rest
Part 1: Spin up the VPS with Hostinger's One-Click Install
Part 2: Connect Your First Telegram Bot
Part 3: Multi-Gateway Setup — Flexible N-Gateway Pattern
- 3.1 Choose your shape
- 3.2 Lay out the directory
- 3.5 The token injector — inject_config.py
- 3.6 The launcher — run.sh
- 3.8 systemd (optional)
- 3.9 One-command bootstrap (bootstrap.sh)
- 3.10 Sharing strategies — reference deep-dive
- 3.11 Adding a new gateway later
- 3.12 Cross-gateway handoff (_shared/handoff/)
- 3.13 Set bot commands in @BotFather
- 3.14 LLM Provider Reference
Part 4: OpenRouter API Setup
Part 5: Xiaomi MiMo (free / cheap inference)
Part 6: Add an Obsidian Second Brain
- 6.0 What is an Obsidian vault, in plain English?
- 6.4 Level it up: Karpathy's LLM Wiki pattern
Part 7: hermes-context — Sync with Claude Code on your laptop
Part 8: Connect Hermes Desktop
- 8.1 What Hermes Desktop is actually connecting to
- 8.2 Root VPS multi-gateway setup: pick one gateway
- 8.3 Hostinger one-click Docker setup: connect to the container gateway
- 8.4 Switching between work and personal
Architecture Diagrams
Real-Life Examples
Troubleshooting
Resources

Why this guide exists

Every Hermes tutorial out there shows you how to run one bot. The official docs walk you through hermes profile create if you want more, which gives you fully isolated brains.

But that's not what most of us actually want. Most of us have one head and several voices — a calm life-copilot for personal stuff, a direct technical operator for work, maybe a fitness coach, maybe a private CFO. We want every one of those bots to remember the same projects, learn from the same skills, and read the same notes. Only the personality should differ.

This repo is the exact setup, written for beginners, that gives you:

A working Hermes Agent on a Hostinger VPS in under 30 minutes using their one-click installer
Two — or three, or N — Telegram bots with different personalities sharing one memory and one skill library
OpenRouter + Xiaomi MiMo with automatic fallback so you never get stuck on a dead model
An Obsidian vault the bots can read and write to like a second brain
A hermes-context GitHub repo that bridges your laptop's Claude Code sessions to your VPS bots

I've been running this for over a month. Migrated to Xiaomi MiMo for primary inference. Below is the exact setup, copy-pasteable.

What is Hermes Agent?

Hermes Agent is an open-source AI agent by Nous Research. Think of it as a Linux assistant that lives on your computer or on a server, one you can talk to from anywhere: Telegram, Discord, the terminal, scheduled cron jobs.

What makes it different from "just a chat bot":

Feature	What it means in plain English
Tools	It can read/write files, run shell commands, search the web, fetch URLs, and call MCP servers (a way for AI to use external apps like Gmail or Calendar).
Memory	Important things you tell it get saved to a `MEMORY.md` file. Full sessions land in a local SQLite database.
Skills	Self-written markdown "recipes" the agent creates after solving something tricky, so next time it doesn't start from zero. After a week of use you'll have a small custom library.
BYOM	Bring your own model. OpenRouter, Anthropic, OpenAI, Xiaomi MiMo, Z.AI, MiniMax, local models via Ollama. Configure once; Hermes routes the calls.

Their docs at hermes-agent.nousresearch.com are excellent. Read them after this guide for anything I gloss over.

Hermes Agent vs. OpenClaw — why this guide picked Hermes

If you've been agent-curious for more than a weekend, you've heard of OpenClaw — Peter Steinberger's "personal AI assistant, the lobster way," shipped November 2025. It's good. It's the reason a lot of people first felt "oh, I can run my own agent." Hermes Agent (NousResearch/hermes-agent, 135k stars) shipped February 2026 and chose a different bet:

The Agent That Grows With You. — that's the Hermes tagline, and it's not marketing fluff. It's the architecture.

OpenClaw is gateway-first: wide channel coverage, broad integrations, low setup friction, reactive tool use. Hermes is agent-first: every successful task gets distilled into a reusable skill, persistent memory accumulates across sessions, and the agent you have at week 4 is meaningfully better than the one you booted on day 1.

Side-by-side tradeoffs

	OpenClaw	Hermes Agent
Bet	Breadth. Reach every channel, integrate every tool.	Depth. Compound knowledge across sessions.
Persistent memory	Notes/recall, no automatic skill formation	`MEMORY.md` + auto-generated `skills/` from solved problems
Self-improvement	None native — you copy patterns by hand	Skills extracted, refined, and reused across runs
Channels	Many out of the box (broad)	Telegram, Discord, Slack, WhatsApp, Signal, Email, CLI — and more
Multi-agent	First-class — multiple personas across channels	Possible via multi-gateway (this guide's whole topic)
Sandboxing	Lighter — mostly local	Five backends: local, Docker, SSH, Singularity, Modal
Scheduled automations	Add-on territory	Native — natural-language cron, runs unattended through the gateway
Web/browser control	Tool-level	Built-in: web search, browser automation, vision, image gen, TTS
Subagent delegation	Limited	Isolated subagents w/ own conversations, terminals, RPC scripts
Migration	—	Hermes setup wizard auto-detects `~/.openclaw/` and imports it
Maturity signal	Earlier mover, larger integration catalog	135k stars in <3 months, faster release cadence, Nous Research-backed

The full Hermes feature list (for the skim-readers)

Straight from hermes-agent.nousresearch.com:

Lives where you do — Telegram, Discord, Slack, WhatsApp, Signal, Email, CLI, and a growing list of platforms. Start on one, pick up on another.
Grows the longer it runs — persistent memory and auto-generated skills. Learns your projects and never forgets how it solved a problem.
Scheduled automations — natural-language cron for reports, backups, and briefings. Runs unattended through the gateway.
Delegates & parallelizes — isolated subagents with their own conversations, terminals, and Python RPC scripts. Zero-context-cost pipelines.
Real sandboxing — five backends: local, Docker, SSH, Singularity, Modal. Container hardening and namespace isolation.
Full web & browser control — web search, browser automation, vision, image generation, text-to-speech, multi-model reasoning.

Profiles vs. Multi-Gateway — four ways to share (or not)

There are four patterns for running multiple Hermes bots. The first one is upstream's profile system; the other three are the multi-gateway sharing strategies this guide ships in bootstrap.sh. Pick the row that matches your situation:

	Profiles (official upstream)	Multi-Gateway: `isolated` (default)	Multi-Gateway: `shared-skills`	Multi-Gateway: `shared-both`
Memory (`memories/`)	Isolated per profile	Isolated per bot	Isolated per bot	Shared (`_shared/memories/`)
Skills (`skills/`)	Isolated per profile	Isolated per bot	Shared (`_shared/skills/`)	Shared (`_shared/skills/`)
Sessions	Isolated per profile	Isolated per bot	Isolated per bot	Isolated per bot
Obsidian vault	n/a	Shared (the durable layer)	Shared (the durable layer)	Shared
System prompt	Same default	Different per bot	Different per bot	Different per bot
Bot tokens	One per profile	One per bot	One per bot	One per bot
Cross-bot recall	None	Only via Obsidian (deliberate)	Only via Obsidian + skills (deliberate-ish)	Yes — automatic
Leak risk	None	None — pure isolation	Low — facts cross over only via the vault	Personal facts can surface anywhere
Process isolation	Separate Hermes processes	Shared Hermes runtime	Shared Hermes runtime	Shared Hermes runtime
Best for	Strict tenant isolation, compliance	Distinct personas, max separation	One skill library, separate memory streams	One head, many voices

Rule of thumb:

Freelancer juggling clients who must never see each other's data, or compliance boundary required → profiles. Upstream-supported, fully separate everything.

You want each bot to remember only its own conversations and write to a shared Obsidian vault for anything durable → isolated (the new default).

You're maintaining one skill library that every bot should benefit from, but you want each bot's MEMORY.md to stay tight and on-topic → shared-skills.

All your bots are personas of the same you (work + life + coach) and "what did I tell the work bot yesterday?" should just work from the personal bot → shared-both (the historical default).
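
The three multi-gateway strategies differ only in which folders become symlinks into _shared/. A minimal sketch of that layout follows, assuming the _shared/skills/ and _shared/memories/ paths from the table above. The function name link_gateway is hypothetical, and bootstrap.sh may create the folders differently.

```shell
# link_gateway PARENT NAME STRATEGY
# Creates PARENT/NAME with skills/ and memories/ either as real folders
# (isolated) or as symlinks into PARENT/_shared/ (shared).
# Illustrative sketch only; not the code bootstrap.sh runs.
link_gateway() {
  parent="$1"; name="$2"; strategy="$3"
  case "$strategy" in
    isolated)      shared="" ;;
    shared-skills) shared="skills" ;;
    shared-both)   shared="skills memories" ;;
    *) echo "unknown strategy: $strategy" >&2; return 1 ;;
  esac
  mkdir -p "$parent/$name/sessions"   # sessions are never shared
  for dir in skills memories; do
    target="$parent/$name/$dir"
    if [ -e "$target" ] || [ -L "$target" ]; then
      continue                        # never overwrite what is already there
    fi
    case " $shared " in
      *" $dir "*)
        mkdir -p "$parent/_shared/$dir"
        ln -s "../_shared/$dir" "$target" ;;
      *)
        mkdir -p "$target" ;;
    esac
  done
}
```

For example, link_gateway ~/gateways work shared-skills followed by link_gateway ~/gateways personal shared-skills leaves both bots with a skills symlink pointing at the same _shared/skills folder and a private memories folder each.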

Why one container, many gateways (and not N Docker containers)

Reasonable question: if Hermes already runs in a Docker container on the Hostinger 1-click, why not just spin up two containers — hermes-work, hermes-personal — and call it a day? Tried it. Don't.

The whole point of this guide is one shared brain, many voices. Containers are a unit of isolation. Voices that share a brain don't want isolation between them — they want the opposite. So you separate at the right layer: one container holds the brain, N gateway processes inside it present different faces to Telegram.

Every concern stacks the same direction:

	One container, N gateways (this guide)	N containers, one gateway each
Shared brain	A symlink inside the same filesystem. One inode, every gateway process sees the exact same bytes the moment they're written.	A bind-mounted volume across containers. Two processes from two containers writing to the same SQLite/flat-file `memories/` race each other; nothing in Hermes coordinates locks across containers. Eventually you corrupt a memory file and don't notice for a week.
RAM	Gateway processes are ~50–100 MB each. Four bots ≈ 200–400 MB total. Comfortable on KVM 2's 8 GB with room left for MCP servers and the model client.	A full Hermes container idles around 1–2 GB once Python, MCP servers, and the model client are loaded. Four containers ≈ 4–8 GB just sitting there. KVM 2 starts swapping; you're forced to KVM 4 before adding a single skill.
Hostinger upgrade path	The 1-click template manages exactly one container (`hermes-agent`). Restart, upgrade, rollback — already wired up.	Hostinger's template doesn't know about your extra containers. Their upgrades touch only the one they shipped. You inherit lifecycle for the rest — base image bumps, Python version drift, MCP version drift, all of it.
MCP servers & model client	One set of MCP processes, one OpenRouter/MiMo client pool, one cron daemon. Shared across every gateway.	Every container starts its own MCP stack and opens its own model connections. Multiplied API session count, multiplied warm-up time, multiplied debug surface when something misbehaves.
Cron / scheduled skills	One crontab. A single 6 AM "morning brief" task can read memories the work bot wrote yesterday and DM the personal bot the result.	Cron lives where? Pick a container. Now that container needs read access to the others' state, which means more bind mounts, which means we're back to the corruption problem in row 1.
Cross-bot handoff	Drop a file into `~/gateways/_shared/handoff/` (§3.12) — every gateway sees it instantly, same filesystem.	Requires a shared bind mount plus filesystem-event coordination across container boundaries. Doable, fragile, and you'll debug it at 11 PM the first time inotify drops an event.
Operational surface	One `run.sh`, one log stream, one `tmux`/systemd unit. Adding a fifth bot is one more folder + symlinks + 60 seconds.	N `docker-compose` services (or N `docker run` invocations), N log streams, N restart policies, N env files to keep in sync. Adding a fifth bot is a config change everywhere.
Backups	`docker cp hermes-agent:/<volume>/skills ./skills-backup` and you have everything. One volume to snapshot.	N volumes, possibly across N containers, each with partial state. You can do it; you just have to remember which container owns which authoritative copy.
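
The RAM row is simple multiplication, and it is worth redoing with your own bot count. The snippet below only reuses the per-process and per-container figures quoted in the table. It treats 1 GB as 1024 MB and ignores the shared Hermes runtime, so read the result as a rough comparison rather than a measurement.

```shell
# Back-of-the-envelope RAM estimate using the figures from the table above.
bots=4
gw_min_mb=50;   gw_max_mb=100     # per gateway process
ct_min_mb=1024; ct_max_mb=2048    # per idle container (1-2 GB)

gw_low=$((bots * gw_min_mb));  gw_high=$((bots * gw_max_mb))
ct_low=$((bots * ct_min_mb));  ct_high=$((bots * ct_max_mb))

echo "One container, N gateways: ${gw_low}-${gw_high} MB"
echo "N containers:              ${ct_low}-${ct_high} MB"
```

Change bots to the number of voices you plan to run and compare both lines against the 8 GB on KVM 2.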

A few honest cases where extra containers do make sense — and none of them apply to "I want a work bot and a personal bot":

Hard tenant isolation (multiple paying clients, compliance boundary, can't-leak-ever data). Use profiles for this, not extra containers — that's exactly what the upstream profile system was built for. See the table at Profiles vs. Multi-Gateway.
Different Hermes versions side-by-side (you're testing an upgrade). Spin up a second container temporarily, point it at a copy of the volume, throw it away when you're done. Not a permanent setup.
Genuinely different runtimes (one bot needs a GPU passthrough, another doesn't). Different problem, different shape.

For the "one builder, two-to-five voices, one shared brain" case this guide is built around, the bare-VPS-style pattern inside the existing container is the cheap, durable, boring choice. And boring is what you want from infra you don't think about.

Prerequisites & Costs

Realistic, no surprises:

Item	Cost	Notes
Hostinger VPS (Hermes Agent 1-click)	₱995.68/mo (renews ₱819/mo) — about $18–20 USD (as of May 6, 2026)	KVM 2 default: 2 vCPU, 8 GB RAM, 100 GB NVMe, 8 TB bandwidth. Comfortable for 2–4 gateways.
Domain	Optional	Only if you want public webhooks. SSH-only setup needs none.
Telegram bots	Free	Created via @BotFather. One per voice.
OpenRouter credits	$5 minimum top-up	Pay-as-you-go. ~$2–5/month for moderate use.
Xiaomi MiMo	Free with Orbit, otherwise ~free at Token Plan tier	See §5.1.
Obsidian	Free	Local-first notes app. Optional but transformative.
Time to set up	30–90 minutes	The 1-click cuts the original 1–2 hour estimate in half.

You also need:

An SSH client. macOS/Linux already have one. Windows: install Termius or use the built-in ssh in PowerShell.
A Telegram account.
A GitHub account (for the optional hermes-context repo).

How it works (visual primer for first-timers)

Before you start clicking and copy-pasting, here are four ideas this guide leans on hard. Skim the diagrams once. If something later feels confusing, scroll back here.

You don't need to understand all of this to follow the steps. The commands work as written. This section is here for the moment you ask "wait, what is this actually doing?" — usually around Part 3.

What's a symlink? (and what exactly are we symlinking?)

A symlink (short for symbolic link) is a Linux pointer that makes one folder look like it's in two places at once. It's not a copy — both names lead to the same actual files on disk. Edit through either name, both reflect the change.

Picture two filing cabinets:

WITHOUT a symlink (the regular way)
────────────────────────────────────────
~/gateways/work/                   ~/gateways/personal/
├── memories/   (real folder)    ├── memories/   (a SECOND real folder)
│   └── MEMORY.md                │   └── MEMORY.md   (different file!)
└── skills/                      └── skills/
    └── docker.md                      └── (empty — you'd have to copy)

  ↑ Two folders. Two MEMORY.md files. Two copies of every skill.
  ↑ "Remember I prefer pnpm" → only the work bot remembers.

WITH a symlink (this guide's pattern)
────────────────────────────────────────
~/gateways/work/                   ~/gateways/personal/
├── memories/   (real folder)    ├── memories/   → ../work/memories/
│   └── MEMORY.md                │     (a pointer, NOT a copy)
└── skills/                      └── skills/     → ../work/skills/
    └── docker.md                        (a pointer, NOT a copy)

  ↑ One real folder. The "personal" name is just a shortcut to it.
  ↑ "Remember I prefer pnpm" → ONE MEMORY.md updated → BOTH bots see it.


What we symlink in this guide:

Folder	Symlinked?	Why
`memories/`	Yes in `shared-both` strategy; no in `isolated` or `shared-skills`	"Did I tell the work bot about X?" should also work from the personal bot.
`skills/`	Yes in `shared-skills` and `shared-both`; no in `isolated`	A recipe written by one bot is procedural knowledge, so every bot that shares the library benefits.
`sessions/`	No, never	Conversations stay private to the bot they happened in.
`config.yaml`, `.env`	No, never	Each bot needs its own bot token, system prompt, and personality settings.

Verifying a symlink looks right:

ls -la ~/gateways/personal/
# A symlink shows an arrow:
# lrwxrwxrwx 1 root root  35 May  6 09:11 memories -> /root/gateways/work/memories
# drwxr-xr-x 2 root root  64 May  6 09:10 sessions       (real folder, no arrow)

The l at the very start of the line means "this is a link." If you see d (directory), it's a real folder, not a symlink — fix it before continuing.
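
If you would rather not eyeball ls output, the same check can be scripted with the shell's built-in file tests. This is a small sketch; check_link is a hypothetical helper name, not something the repo ships.

```shell
# check_link PATH: succeed only if PATH is a symlink whose target exists.
# (check_link is an illustrative helper, not part of run.sh or bootstrap.sh.)
check_link() {
  if [ ! -L "$1" ]; then
    echo "NOT a symlink: $1"
    return 1
  fi
  if [ ! -e "$1" ]; then
    echo "dangling symlink: $1"
    return 1
  fi
  echo "ok: $1 -> $(readlink "$1")"
}
```

Usage: check_link ~/gateways/personal/memories. A real folder or a link whose target is missing makes the function return non-zero, so it can gate later steps in a script.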

Host VM vs container vs persistent volume

If you used the Hostinger 1-click (Part 1), Hermes runs inside a Docker container, which runs inside the VPS Hostinger gave you. Three layers, each with a different shell, different filesystem, and different rules. Mixing them up is the #1 source of "wait, where did my files go?" confusion.

The three "places" you can be working in:

Where you are	How you got there	What you can see
Your laptop	(you live here)	Your local files, Obsidian, Claude Code
Host VM	`ssh root@<vps-ip>`	Docker daemon, container logs, the host filesystem (mostly empty)
Inside the container	`docker exec -it hermes-agent bash` (run on the host)	The `hermes` CLI, your gateways folder, the persistent volume

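The persistent volume is the one part of the container's filesystem that Docker stores outside the container itself. Anything inside the volume survives both a docker restart and a full container rebuild (for example, a template upgrade). Anything inside the container but outside the volume (like apt install packages or files in /tmp) is lost as soon as the container is removed and recreated. That's why this guide always tells you to put ~/gateways/, your hermes-context clone, and your Obsidian vault under the persistent path.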

What happens when you send a Telegram message

Useful to see end-to-end the first time. This is the work bot, shared-both strategy, MiMo as the primary model:

The same flow happens for the personal bot — but because Brain is symlinked, both bots are reading and writing the same memories/ and skills/. That's the whole magic in one diagram.

The four kinds of "memory"

Hermes has more than one place it remembers things, and they behave differently. Conflating them is the second-most-common source of confusion (the first is host-vs-container).

Type	What it is	Lifetime	Sharing pattern in this guide
memories/	A markdown file (`MEMORY.md`) the agent appends to whenever something seems worth remembering. Mostly automatic, mostly noisy.	Until you delete it	Shared in `shared-both` · Isolated in `isolated` and `shared-skills`
skills/	Markdown recipes the agent writes after solving a problem. Procedural ("how to deploy n8n"), not personal.	Until you delete it	Shared via symlink in `shared-skills` and `shared-both` · Isolated in `isolated`
sessions/	A SQLite database with full conversation transcripts. Searchable, replayable, big.	Until you delete it	Never shared — your conversations stay where they happened
Obsidian vault	A folder of `.md` files you also see in Obsidian on your laptop. The canon — what you've decided is durable.	Until you delete it (and you can `git revert`)	Shared via path — one `OBSIDIAN_VAULT_PATH` in every `.env`

Mental model: memories/ is what the agent scribbles on a napkin. skills/ is the cookbook. sessions/ is the diary. The vault is the binder you keep on the shelf — the only one you curate, the only one that's truly yours.

Plain-English glossary of the rest

Quick definitions for terms used elsewhere in this guide. Skim now, refer back when one of these surprises you.

Term	Plain English	Where it shows up
VPS	A "virtual private server" — a slice of someone's real machine that behaves like your own Linux box.	Part 1
KVM	The kind of virtualization Hostinger uses — gives you a real Linux kernel, not just a sandboxed shell. Translation: "this is a real computer for you, not a shared cubicle."	Prerequisites
SSH	"Secure shell." A tool that opens a terminal on a remote computer over an encrypted connection. `ssh root@1.2.3.4` = "log in as `root` on that IP."	§1.2
Bot token	A long random string BotFather gives you. It's the password for that one Telegram bot. Anyone with it can send messages as your bot — treat it like a password.	§2.1
API key	Same idea, different service. A long string a model provider (OpenRouter, MiMo) gives you so they can identify and bill you. Goes in `.env`, never in code.	Part 4
`.env` file	A plain-text file holding secrets and per-deployment knobs (`OPENROUTER_API_KEY=…`, `OBSIDIAN_VAULT_PATH=…`). Hermes reads it on startup. Never commit it to git.	§3.5
`config.yaml`	The bot's non-secret settings — which model, which fallback, which provider. Safe to commit. The token gets injected at runtime by `inject_config.py`.	§3.6
`chmod 600`	"Only the owner can read or write this file." We do this to `.env` files because they hold tokens. Anything else can be read by other users on the machine.	§3.5
Symlink	See above — a folder/file alias. Lets two paths refer to the same actual data.	Part 3
Container / Docker	A self-contained running copy of an app, packaged with everything it needs. The Hostinger 1-click is a Docker container holding Hermes.	§1.3
Persistent volume	A folder Docker keeps separate from the container so your data survives when the container itself is restarted/replaced.	§1.4
`docker exec -it … bash`	"Open an interactive shell inside the running container." This is how you get to where Hermes lives.	§1.5
systemd	Linux's built-in "make this thing start on boot and restart if it dies" service manager. We use it to keep `run.sh` running.	§3.8
tmux	A terminal multiplexer — lets you start a long-running command in a "session" that survives even after you close your SSH window. Detach with `Ctrl+B` then `D`; re-attach later with `tmux attach`.	§2.4
cron	Linux's scheduler. "Run this command every 15 minutes / every Tuesday / every hour." We use it to keep the hermes-context repo synced.	§7.4
Fallback provider	A second model/API that gets used automatically if the primary is down or rate-limited. In this guide: MiMo primary, OpenRouter fallback.	§5.5
MCP server	"Model Context Protocol" — a standard way for AI agents to call external tools (Gmail, Calendar, Slack, …). Hermes can plug in MCP servers as new capabilities.	§"What is Hermes Agent?"
Gateway	In this guide: one Hermes process attached to one Telegram bot. Two bots = two gateways. The "multi-gateway pattern" is many of these sharing a brain.	Part 3
Profile (Hermes term)	Hermes's official feature for fully isolated bots. Different from gateways: profiles wall off everything, gateways can choose what to share.	§
`git pull --rebase`	"Get the latest from GitHub, but lay any of my unpushed changes neatly on top of it instead of making a merge bubble." Cleaner history, same end state.	§7.3

Part 1: Spin up the VPS with Hostinger's One-Click Install

Hostinger ships a one-click Docker template for Hermes Agent: hostinger.com/ph/vps/docker/hermes-agent. Click Deploy, pay, log in. You skip every step a normal install requires — Docker, Python, the setup-hermes.sh ceremony, all of it. Hermes comes up inside a container with a persistent Docker volume that survives restarts and template upgrades, which means your skills, memories, sessions, and config are safe as long as you don't blow the volume away.

This section is the new path. It replaces the old manual git clone + setup-hermes.sh walkthrough — that's now the Manual install fallback at the bottom.

1.1 Click Deploy

Go to hostinger.com/ph/vps/docker/hermes-agent.
On the right-hand panel, the default plan is KVM 2 — 2 vCPU, 8 GB RAM, 100 GB NVMe, 8 TB bandwidth, ₱549/mo (~$8–10 USD) introductory. That's exactly what this guide assumes. If you'll run more than 4 gateways or heavy MCP integrations, bump up to KVM 4 instead.
Click the purple Deploy button.
Hostinger asks you to sign in / sign up, pick a billing cycle, and pay. Standard checkout flow.
After payment, Hostinger drops you into the VPS provisioning screen. Choose:
- A data center close to you (lower latency to Telegram).
- A strong root password — save in your password manager. Don't rely solely on Hostinger's emailed copy.
- A hostname like hermes if it asks. Cosmetic.
Click Continue / Finish. Provisioning takes 2–4 minutes. When it's done you get a server IP. Copy it.

What this does: Hostinger spins up a fresh KVM virtual machine with Docker pre-installed and a Hermes Agent container already running on it. Everything you'd otherwise do by hand — install Python, clone the repo, run setup-hermes.sh, configure systemd — is replaced by a running container with a mounted persistent volume.

1.2 SSH into the VPS host

Open a terminal on your laptop:

ssh root@<your-vps-ip>

Type yes when asked about the host fingerprint, then paste your root password.

What this does: Opens a remote shell on the host VM (the box Hostinger gave you). The Hermes Agent itself lives one level deeper, inside a Docker container running on this host. Almost everything in this guide happens inside the container, but you start at the host shell.

1.3 Confirm the Hermes container is running

docker ps

You should see a single running container with an image like nousresearch/hermes-agent or hermes-agent and a status of Up X minutes (healthy). Note its name (the NAMES column, right-most in the default docker ps output), likely hermes-agent or hermes. You'll use that name in the next step.

If the container is missing or in a Restarting loop:

docker ps -a              # show all containers, even stopped/crashed ones
docker logs <name> --tail=200

The logs almost always tell you what's missing (usually a model API key — fixable in Part 4).
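
If you want a script to make the same judgment you just made by eye, the STATUS text from docker ps can be sorted into a few buckets. The sketch below assumes the usual status wording ("Up ... (healthy)", "Restarting ...", "Exited ..."); the helper name classify_status is hypothetical.

```shell
# classify_status "STATUS TEXT": map a `docker ps` status string to one word.
# (classify_status is an illustrative helper; pass it the STATUS column text.)
classify_status() {
  case "$1" in
    "Up "*"(healthy)"*)   echo healthy ;;
    "Up "*"(unhealthy)"*) echo unhealthy ;;
    "Up "*)               echo running ;;
    "Restarting"*)        echo restarting ;;
    "Exited"*)            echo exited ;;
    "")                   echo missing ;;
    *)                    echo unknown ;;
  esac
}
```

Usage on the host: classify_status "$(docker ps -a --filter name=<name> --format '{{.Status}}')". Anything other than healthy or running is the cue to read docker logs as described above.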

1.4 Find the persistent volume (this is where your data lives)

docker inspect <container-name> --format '{{json .Mounts}}' | jq

You'll see a JSON block. Look for the entry with "Type": "volume" and read the Destination field — that's the path inside the container where Hermes keeps its state (commonly /root/.hermes, /data, or /app/data). Whatever it is, all the gateway folders we'll create later (~/gateways/work, etc.) must live underneath that path so they survive restarts. The default container HOME is usually under that mount; the rest of this guide assumes it is.

Why this matters: Files written to non-mounted paths live in the container's own writable layer. They survive a plain docker restart, but they disappear as soon as the container is removed and recreated, for example during an image or template upgrade. The persistent volume is what makes Hermes self-improving over time: your skills accumulate, your memories persist, your conversations are recoverable.
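
Once you know the Destination path, a quick string check tells you whether a folder you are about to create sits underneath it. This is a minimal sketch: is_under is a hypothetical helper, it compares path strings only (it does not resolve symlinks or ..), and the mount path should be passed without a trailing slash.

```shell
# is_under PATH MOUNT: succeed if PATH equals MOUNT or lies beneath it.
# Pure string comparison; symlinks and ".." are not resolved.
is_under() {
  case "$1/" in
    "$2"/*) return 0 ;;
    *)      return 1 ;;
  esac
}
```

Usage inside the container, with /root/.hermes standing in for whatever Destination you found: is_under "$HOME/gateways" /root/.hermes && echo "persistent" || echo "will be lost when the container is recreated".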

1.5 Enter the container

From here on, "inside the box" means inside the Hermes container, not on the Hostinger host VM:

docker exec -it <container-name> bash

The prompt changes (often to something like root@<container-id>:~#). You're now inside the container. The hermes command is on PATH here, not on the host.

Tip: You'll be doing this a lot. Add a host-side alias once and forget it: echo 'alias hermes-shell="docker exec -it hermes-agent bash"' >> ~/.bashrc && source ~/.bashrc. Then just type hermes-shell to drop in.

1.6 Verify Hermes is alive (inside the container)

hermes --version

You should see a version number (e.g. Hermes Agent 0.x.x). If you get command not found here, the container image is malformed — open a Hostinger support ticket; this isn't something you should have to fix by hand.

1.7 Pick a model (first time)

Still inside the container:

hermes model

This opens an interactive picker. Pick OpenRouter for now and follow the prompts. We'll layer on Xiaomi MiMo as the primary later in Part 5. You'll need an OpenRouter key first — see Part 4 if you want to set that up before continuing.

What this does: Tells Hermes which AI model to call when you message it. OpenRouter is one API key, hundreds of models — perfect default.

1.8 What survives a restart, what doesn't

The 1-click template's persistent volume covers:

skills/ (every recipe the agent has ever written)
memories/ (the MEMORY.md files)
sessions/ (your conversation history & SQLite DB)
config.yaml per gateway
.env per gateway (provided you put them under the mounted path — we will)
Any cron jobs you register via hermes -p <profile> cron add

The volume does not cover:

Anything outside the mounted path (system packages you apt install inside the container, files in /tmp, etc.).
The host VM's filesystem outside the volume binding.

When you docker restart hermes-agent or Hostinger upgrades the template, all the items above stick around. That's the whole point of using their 1-click — the upgrade story is solved for you.

Don't docker volume rm. Removing the named Docker volume Hostinger created deletes every memory and skill the agent has accumulated. If you do need to nuke and start over, back up skills/ and memories/ first by copying them to the host with docker cp <container>:/<volume-path>/skills ./skills-backup.

Manual install fallback

If for some reason you can't or won't use the Hostinger 1-click — you're hosting elsewhere, the template was unavailable, you want bare metal — install Hermes manually on a fresh Ubuntu 24.04 box.

Just running on your laptop/desktop? Read this first.

If you're installing Hermes directly on your own computer (not on a remote VPS), the official quickstart is the simplest path of all — and most of this guide is overkill for that case:

→ hermes-agent.nousresearch.com/docs/user-guide/profiles

Why this might be all you need: A lot of what this guide solves only matters when Hermes lives on a remote VPS that has to keep running while your laptop sleeps — persistent Docker volumes, restart-survival, cron across container reboots, dual-gateway symlinks for shared brain across personas. If everything sits on one machine and you're happy using Hermes's built-in profiles to separate contexts (work vs. personal vs. a client project), follow the upstream profiles guide and stop there. Come back to this README when you outgrow it — typically when you want the bot reachable while the laptop is closed, or you want one shared brain across multiple voices instead of isolated profile copies.

Quick install (one-liner, recommended for VPS):

curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

What this does: Pulls the upstream installer, installs Python deps, clones Hermes into ~/.hermes/, and puts the hermes CLI on your $PATH. Takes 2–3 minutes on a fresh KVM 2.

Manual install (if you want to see every step):

# Update the system & install dependencies
apt update && apt upgrade -y
apt install -y python3 python3-pip python3-venv git tmux curl

# Clone & install Hermes
git clone https://github.com/NousResearch/hermes-agent.git ~/.hermes/hermes-agent
cd ~/.hermes/hermes-agent
./setup-hermes.sh

Either path leaves you with the same result: the hermes command is global on the host. Run hermes --version to confirm, then hermes model and continue with Part 2.

Bare-metal mental model: When Hermes lives directly on the host (not in a container), every reference to "inside the container" in the rest of this guide just means "on the host shell." Skip the docker exec step, use the native systemd unit in §3.8 instead of the Docker variants, and the rest of the commands work as-is.

Part 2: Connect Your First Telegram Bot

Before doing the multi-gateway dance, get one bot working end-to-end. If one bot works, two will work, and so will twenty.

Container note: From here on, all hermes … commands run inside the Hermes container. If you're using the Hostinger 1-click, drop in with docker exec -it hermes-agent bash (or your hermes-shell alias from §1.5) before running anything in this section. Bare-metal users can ignore this — your shell is already in the right place.

2.0 Prerequisites checklist

Before §2.1, confirm you have:

A VPS with root or sudo access. Hostinger 1-click (Part 1) or any Ubuntu/Debian box.

The hermes CLI installed. hermes --version should print a version. Bare-metal one-liner if it isn't:

curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

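Python 3.10+ with PyYAML. python3 -c "import yaml" should exit 0. If not: apt install -y python3-yaml on Debian/Ubuntu (recent releases, including Ubuntu 24.04, reject a system-wide pip3 install as an externally managed environment), or pip3 install pyyaml inside a virtualenv.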
At least one Telegram bot token per gateway you'll be running. You'll create them in §2.1 and §3.3.
One LLM API key. Default is Xiaomi MiMo (free tokens — see Part 5). OpenRouter as fallback (Part 4). Swap to any provider Hermes supports.

2.1 Create a bot in Telegram

Open Telegram and start a chat with @BotFather.
Send /newbot.
Pick a display name (e.g., "Work Hermes") and a username ending in bot (e.g., my_work_hermes_bot).
BotFather replies with a bot token — a long string like 1234567890:AABBccDDeeFFggHHiiJJkkLLmmNNooPP. Copy it. This is the password to your bot. Don't share it.
While you're here, message @userinfobot and send /start. It'll reply with your Telegram user ID (a number). Copy that too.

2.2 Tell Hermes about the bot

Back on the VPS:

hermes gateway setup

Paste the bot token when prompted. Paste your Telegram user ID. Done.

What this does: Saves the token in Hermes's config so it knows which bot to attach to and saves your user ID so the bot only listens to you (not random strangers who find the bot).

2.3 Run it (test mode)

hermes gateway run

Open Telegram, message your new bot ("hello"), and watch it reply.

Press Ctrl+C to stop.

2.4 Run it as a service (survives reboots)

On the Hostinger 1-click (Docker): the container itself has --restart unless-stopped baked in by the template, so the gateway process inside it just needs to start when the container starts. Inside the container:

hermes gateway install                    # writes the in-container service file
hermes gateway run &                      # start it for this session
disown                                    # detach from your shell so it survives logout

For a sturdier in-container setup, run the gateway under tmux (already installed in the template):

tmux new-session -d -s hermes 'hermes gateway run'
tmux ls                                   # confirm session 'hermes' exists

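A tmux session does not survive a container restart: when the container restarts (or the host reboots), Docker re-launches the container, but you have to start the session again with the same tmux new-session command. While it is running, watch it with tmux attach -t hermes from inside the container. We'll replace this with a multi-gateway launcher in Part 3, so don't over-invest here.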

On bare-metal install:

hermes gateway install
systemctl start hermes-gateway.service
systemctl enable hermes-gateway.service   # auto-start on reboot
systemctl status hermes-gateway.service

You now have one working bot. If this is all you wanted, you can stop here. Most people stop here. But the cool part is next.

Part 3: Multi-Gateway Setup — Flexible N-Gateway Pattern

This is the section that makes this guide different from every other Hermes tutorial. N Telegram bots — one, two, twenty — each with its own voice, one universal launcher, and three picks for how much they share.

Pick the parent folder name (~/gateways, ~/agents, ~/hermes-bots). Pick the gateway names (work, personal, client-acme, home-automation — anything alphanumeric). Pick the sharing strategy:

	`isolated`	`shared-skills`	`shared-both`
`memories/`	per-gateway	per-gateway	shared via `_shared/memories/`
`skills/`	per-gateway	shared via `_shared/skills/`	shared via `_shared/skills/`
Best for	Distinct personas, max separation	One skill library, separate memory streams	One brain, many voices
Default?	✅		(was the historical default)

Two ways to do everything in this Part:

Manual (§3.1–3.8) — read the steps, run the commands, understand each piece. Use this the first time.
Bootstrap (§3.9) — bootstrap.sh does §3.1–3.8 for you in one command, with prompts for the choices above.

Container note (Hostinger 1-click users): every command in this part runs inside the Hermes container. Drop in with docker exec -it hermes-agent bash first. The ~/gateways/ path used throughout is assumed to sit inside the container's persistent Docker volume, which is what lets everything you create here survive container restarts and template upgrades. Confirm it with df -h ~: the Mounted on column should show the volume Destination you saw in §1.4. If ~ isn't on the volume, replace ~/gateways/ with the actual mount path (e.g. /data/gateways/) everywhere below.
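If you script any of this, a small guard can catch a parent folder that sits outside the volume. This is a minimal sketch: the helper name is_under is hypothetical, /data is only one of the example mount paths from §1.4, and the check compares path strings without asking Docker anything.

```shell
# is_under PATH MOUNT: succeed if PATH equals MOUNT or sits below it.
# Pure string comparison on absolute paths; symlinks are not resolved.
is_under() (
  path=${1%/}
  mount=${2%/}
  case "$path/" in
    "$mount"/*) exit 0 ;;
    *)          exit 1 ;;
  esac
)

# Hypothetical value: substitute the Destination you saw in §1.4.
VOLUME_DEST=/data
if is_under "$HOME/gateways" "$VOLUME_DEST"; then
  echo "ok: $HOME/gateways is on the persistent volume"
else
  echo "warning: use $VOLUME_DEST/gateways instead of $HOME/gateways"
fi
```

The trailing slash added before matching is what keeps /database from being mistaken for a path under /data.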

Bare-VPS users (no Hermes installed yet): if you're not on the Hostinger 1-click and you don't already have the hermes CLI on this box, install it first with the one-liner before doing anything in this part:
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
Confirm with hermes --version. Full bare-metal walkthrough (deps, manual clone, alternate paths) is in §Manual install fallback. The rest of Part 3 then runs on your normal host shell — every "inside the container" instruction collapses to "on the host."

3.1 Choose your shape

Decide four things up front. The bootstrap script (§3.9) prompts for these; if you're doing it manually, write them down now:

Choice	Default	Examples
Parent folder	`~/gateways`	`~/agents`, `~/hermes-bots`, `/opt/hermes`
Gateway count	2	1, 3, 20
Gateway names	`gateway-1`, `gateway-2`, …	`work`, `personal`, `client-acme`, `home-automation`
Sharing strategy	`isolated`	`shared-skills`, `shared-both`

Name rules: alphanumeric + -/_, no leading dot or underscore. The leading-underscore rule matters because <parent>/_shared/ is reserved for canonical shared dirs (memories + skills + handoff) and the launcher silently skips any sibling whose name starts with _.
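The name rules are easy to check before creating any folders. The function below is a sketch of those rules as stated here, not the validation bootstrap.sh actually runs; valid_gateway_name is a made-up helper name.

```shell
# valid_gateway_name NAME: succeed only for names made of letters, digits,
# '-' and '_' that do not start with '.' or '_'.
valid_gateway_name() {
  case "$1" in
    ''|.*|_*)          return 1 ;;  # empty, leading dot, or leading underscore
    *[!A-Za-z0-9_-]*)  return 1 ;;  # any character outside the allowed set
    *)                 return 0 ;;
  esac
}

for name in work client-acme _shared .hidden "bad name"; do
  if valid_gateway_name "$name"; then
    echo "ok:       $name"
  else
    echo "rejected: $name"
  fi
done
```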

For the rest of this Part the worked example is ~/gateways/work + ~/gateways/personal, no shared dirs (isolated strategy). Substitute your own values if you picked different ones.

3.2 Lay out the directory

The shape depends on the strategy you picked.

Strategy: isolated (default)

~/gateways/
├── run.sh                  # universal launcher (templates/run.sh.template)
├── inject_config.py        # universal token injector
├── work/
│   ├── .env                # work bot token + system prompt
│   ├── config.yaml
│   ├── memories/           (real folder)
│   ├── skills/             (real folder)
│   └── sessions/
└── personal/
    ├── .env                # personal bot token + system prompt
    ├── config.yaml
    ├── memories/           (real folder, separate from work's)
    ├── skills/             (real folder, separate from work's)
    └── sessions/

Strategy: shared-skills

~/gateways/
├── run.sh
├── inject_config.py
├── _shared/
│   ├── skills/             ◄─── canonical (real folder)
│   └── handoff/            (cross-gateway handoff, see §3.12)
├── work/
│   ├── .env
│   ├── config.yaml
│   ├── memories/           (real folder, work-only)
│   ├── skills/    ──►      symlink → _shared/skills
│   └── sessions/
└── personal/
    ├── .env
    ├── config.yaml
    ├── memories/           (real folder, personal-only)
    ├── skills/    ──►      symlink → _shared/skills
    └── sessions/

Strategy: shared-both

~/gateways/
├── run.sh
├── inject_config.py
├── _shared/
│   ├── memories/           ◄─── canonical (real folder)
│   ├── skills/             ◄─── canonical (real folder)
│   └── handoff/
├── work/
│   ├── .env
│   ├── config.yaml
│   ├── memories/  ──►      symlink → _shared/memories
│   ├── skills/    ──►      symlink → _shared/skills
│   └── sessions/
└── personal/
    ├── .env
    ├── config.yaml
    ├── memories/  ──►      symlink → _shared/memories
    ├── skills/    ──►      symlink → _shared/skills
    └── sessions/

Build it. Stop the single bot first if you have one running:

hermes gateway stop 2>/dev/null || pkill -f "hermes gateway run" || true

What this does: tries the polite shutdown first, falls back to pkill so an old gateway isn't holding your token when you launch the new fleet.

PARENT=~/gateways
GATEWAYS=(work personal)                # your names
STRATEGY=isolated                       # or shared-skills / shared-both

mkdir -p "$PARENT"
cd "$PARENT"

# 1) Shared dirs (only for non-isolated strategies)
if [ "$STRATEGY" != "isolated" ]; then
  mkdir -p _shared/handoff
  [ "$STRATEGY" = "shared-both" ] && mkdir -p _shared/memories
  mkdir -p _shared/skills
fi

# 2) Per-gateway scaffolding (let `hermes setup` create memories/skills/sessions)
for gw in "${GATEWAYS[@]}"; do
  mkdir -p "$PARENT/$gw"
  (cd "$PARENT/$gw" && hermes setup)    # accept blank token; .env injects it later
done

# 3) Apply the symlinks the strategy requires
for gw in "${GATEWAYS[@]}"; do
  case "$STRATEGY" in
    shared-skills)
      rm -rf "$PARENT/$gw/skills"
      ln -s "$PARENT/_shared/skills" "$PARENT/$gw/skills"
      ;;
    shared-both)
      rm -rf "$PARENT/$gw/memories" "$PARENT/$gw/skills"
      ln -s "$PARENT/_shared/memories" "$PARENT/$gw/memories"
      ln -s "$PARENT/_shared/skills"   "$PARENT/$gw/skills"
      ;;
    isolated)
      : ;; # nothing to symlink
  esac
done

# Verify (any non-isolated gateway should show arrows; isolated prints nothing)
ls -la "$PARENT/${GATEWAYS[0]}" | grep '^l'

What this does: picks the strategy, creates the parent + (optional) _shared/ skeleton, then runs hermes setup once per gateway to materialise its real memories/, skills/, and sessions/ directories. Finally, it replaces the per-gateway memories/ or skills/ with symlinks for the strategies that share. The isolated branch leaves everything as real folders.

3.3 Per-gateway `.env` files

Each gateway needs its own bot token. Create one bot per gateway in @BotFather (/newbot → name → username). BotFather hands you a fresh 45–46 character token each time. Reusing a token across two gateways will fail with 409 Conflict from Telegram — one polling process per token, no exceptions.

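This repo ships templates/.env.template and a worked example at gateways/work/.env.example. Copy one of them into each gateway and fill in the real values. The templates/ paths here and in §3.4–3.6 are relative to the root of your clone of this repo, so cd there first (the build script in §3.2 left you in $PARENT):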

for gw in "${GATEWAYS[@]}"; do
  cp templates/.env.template "$PARENT/$gw/.env"
  chmod 600 "$PARENT/$gw/.env"
  nano "$PARENT/$gw/.env"          # or vim / micro / $EDITOR
done

The variables you fill in (per gateway):

Variable	What goes in it
`HERMES_TELEGRAM_BOT_TOKEN`	The token BotFather gave you for this bot. Different per gateway.
`TELEGRAM_ALLOWED_USERS`	Comma-separated Telegram user IDs allowed to talk to this bot (@userinfobot). Blank = anyone.
`HERMES_EPHEMERAL_SYSTEM_PROMPT`	The bot's behavior, tools, and constraints (the "voice"). Multi-line.
`XIAOMI_MIMO_API_KEY`	MiMo Token Plan / Orbit key — see §5.2. Replace with whichever provider key your gateway is configured for — `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `GEMINI_API_KEY`, etc. (§3.14).
`OPENROUTER_API_KEY`	Fallback model key — see Part 4. Drop this line if you're not using OpenRouter as the fallback.
`OBSIDIAN_VAULT_PATH`	Absolute path to your vault — see Part 6. Same path in every gateway's `.env` (that's how cross-bot durable knowledge works).

Why chmod 600: Hermes refuses to load .env files that are world-readable. Your bot token is the password to your bot — locking it to your user only is non-negotiable.
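To audit permissions across every gateway at once, something like the following works. It is a sketch: loose_env_files is a hypothetical helper, and it assumes the layout from §3.2 where each .env sits one level below the parent folder.

```shell
# loose_env_files PARENT: print every <gateway>/.env under PARENT whose mode
# is not exactly 600. Prints nothing when all of them are locked down.
loose_env_files() {
  find "$1" -mindepth 2 -maxdepth 2 -name .env ! -perm 600 -print
}

# Throwaway demo tree so nothing real is touched.
demo=$(mktemp -d)
mkdir -p "$demo/work" "$demo/personal"
: > "$demo/work/.env";     chmod 600 "$demo/work/.env"
: > "$demo/personal/.env"; chmod 644 "$demo/personal/.env"
loose_env_files "$demo"        # lists only personal/.env
rm -rf "$demo"
```

Run it as loose_env_files ~/gateways and chmod 600 anything it prints.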

Identity vs behavior split. The .env HERMES_EPHEMERAL_SYSTEM_PROMPT carries the operational rules (response format, tool usage, constraints). Identity and personality (who this bot is) live in a separate SOUL.md next to it — see the worked example at gateways/work/SOUL.md. The split keeps slow-changing identity separate from frequently-tweaked behavior.

3.4 Per-gateway `config.yaml`

config.yaml is the model + auxiliary + Telegram block. Safe to commit (no secrets). The bot token is injected at startup by inject_config.py (§3.5) so this file stays clean.

The repo ships templates/config.yaml.template — copy into each gateway:

for gw in "${GATEWAYS[@]}"; do
  cp templates/config.yaml.template "$PARENT/$gw/config.yaml"
done

Tweak model.default per gateway if you want different model tiers per voice (e.g. mimo-v2.5-pro for technical bots, mimo-v2-flash for casual ones). The auxiliary: block routes compression and title-generation to a model — set this or you'll get the "No auxiliary LLM provider configured" warning and lose middle context on long conversations.

Different provider per gateway? Totally fine — config.yaml is per-gateway, so the work bot can run on anthropic and the personal bot on xiaomi-mimo while sharing memories. Full provider list and the manual swap recipe live in §3.14 LLM Provider Reference. The bootstrap (--provider) sets this for you at install time.

3.5 The token injector — `inject_config.py`

Reads HERMES_TELEGRAM_BOT_TOKEN from the gateway's environment (already sourced from .env by run.sh) and patches it into config.yaml under platforms.telegram.token. That way the secret stays in .env (gitignored), and config.yaml stays safe to commit.

Drop the universal version into the parent folder (one copy serves every gateway):

cp templates/inject_config.py.template "$PARENT/inject_config.py"

Behavior:

Reads HERMES_HOME from the env (set by run.sh per gateway).
Reads TOKEN_ENV if you want to override the env var name (default HERMES_TELEGRAM_BOT_TOKEN).
Strips \r and \n from the token aggressively — covers Windows-edited .env files where \r can survive .strip().
Writes atomically (.yaml.tmp → replace) so you never get a half-written config.
Uses width=float("inf") so a 46-char token isn't line-wrapped (which would corrupt it on next read).

You should not need to edit this file. Same script works for every gateway.

3.6 The launcher — `run.sh`

The launcher is gateway-agnostic — it auto-discovers every sibling directory that has both .env and config.yaml and silently skips anything starting with _ (so _shared/ is never started as a runaway gateway).

Drop the universal version into the parent folder:

cp templates/run.sh.template "$PARENT/run.sh"
chmod +x "$PARENT/run.sh"

CLI surface:

./run.sh                  # alias of `all`
./run.sh all | both       # start every discovered gateway
./run.sh list             # one gateway name per line
./run.sh status           # running PIDs annotated with gateway names
./run.sh stop             # stop every gateway
./run.sh stop <name>      # stop only one gateway
./run.sh <name>           # start only <name> in the foreground
./run.sh --help           # usage

Discovery rules (verbatim from the script):

Sibling has .env AND config.yaml.
Sibling name does not start with _.
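Those two rules amount to a short loop. The sketch below re-implements them for illustration only; it is not copied from the shipped run.sh, and discover_gateways is a hypothetical name.

```shell
# discover_gateways PARENT: print one gateway name per line.
# A sibling qualifies when it is a directory, its name does not start with
# '_', and it contains both .env and config.yaml.
discover_gateways() (
  for dir in "$1"/*/; do
    [ -d "$dir" ] || continue            # the glob matched nothing
    name=$(basename "$dir")
    case "$name" in _*) continue ;; esac
    [ -f "$dir/.env" ] && [ -f "$dir/config.yaml" ] && echo "$name"
  done
  exit 0
)

# Throwaway demo tree: _shared is skipped, draft lacks config.yaml.
demo=$(mktemp -d)
mkdir -p "$demo/work" "$demo/_shared" "$demo/draft"
: > "$demo/work/.env"; : > "$demo/work/config.yaml"
: > "$demo/_shared/.env"; : > "$demo/_shared/config.yaml"
: > "$demo/draft/.env"
discover_gateways "$demo"                # prints: work
rm -rf "$demo"
```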

Per-gateway behavior:

Runs in a subshell, ( … ), so env vars from one gateway never leak into the next.
Source the gateway's .env.
Sanity-check HERMES_TELEGRAM_BOT_TOKEN length; warn if not 45–46 chars.
python3 inject_config.py to patch the token into config.yaml.
cd <gateway> and exec hermes gateway run.

You should not need to edit this file. Same script works for any parent path (~/gateways, ~/agents, /opt/hermes).
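If a bot refuses to start, the token sanity check from the per-gateway steps is worth reproducing by hand. This is a rough sketch of that check under the 45–46 character rule quoted above; clean_token and token_length_ok are hypothetical names, and the token is a made-up placeholder.

```shell
# clean_token VALUE: print VALUE with carriage returns and newlines removed.
clean_token() {
  printf '%s' "$1" | tr -d '\r\n'
}

# token_length_ok VALUE: succeed when the cleaned token is 45 or 46 characters.
token_length_ok() (
  token=$(clean_token "$1")
  len=${#token}
  [ "$len" -ge 45 ] && [ "$len" -le 46 ]
)

# Made-up 46-character value with a stray Windows line ending appended.
dummy=$(printf '1234567890:ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghi\r')
if token_length_ok "$dummy"; then
  echo "token length looks right"
else
  echo "warning: unexpected token length" >&2
fi
```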

3.7 First run + verification

cd "$PARENT"
./run.sh list                 # confirms which gateways were discovered
./run.sh all

Send "hello" to each bot from Telegram. Each replies in its own voice.

Verify the directories Hermes wrote to:

ls -la "$PARENT/${GATEWAYS[0]}/sessions"      # per-bot sessions DB
ls -la "$PARENT/${GATEWAYS[0]}/memories"      # MEMORY.md (per-bot OR symlink, depending on strategy)
ls -la "$PARENT/${GATEWAYS[0]}/skills"        # skill files (per-bot OR symlink)

# Expected per strategy:
#   isolated      memories/ + skills/ are real dirs in BOTH gateways
#   shared-skills memories/ real per-bot, skills/ → _shared/skills
#   shared-both   memories/ + skills/ → _shared/...

./run.sh status should print one annotated line per running PID, e.g. pid=12345 gateway=work. Press Ctrl+C in the launcher's terminal to stop everything (the trap propagates to all children).

3.8 systemd (optional, but recommended for production)

How you keep run.sh all alive on reboots depends on whether you're on Hostinger 1-click (Docker) or bare-metal.

Bare-metal — native systemd:

sudo tee /etc/systemd/system/hermes-gateways.service >/dev/null <<EOF
[Unit]
Description=Hermes Multi Telegram Gateways
After=network.target

[Service]
Type=simple
User=$USER
WorkingDirectory=$PARENT
ExecStart=$PARENT/run.sh all
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now hermes-gateways.service
journalctl -u hermes-gateways.service -f      # tail

The repo ships a substitutable template at templates/systemd-hermes-gateways.service.template if you prefer a file copy with {{PARENT}} substitution.
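Filling in the placeholder is a one-line sed. The sketch below assumes {{PARENT}} is the only placeholder and uses a stand-in template written on the spot, since the real file's contents are not reproduced here; render_template is a hypothetical helper.

```shell
# render_template TEMPLATE PARENT: print TEMPLATE with every {{PARENT}} replaced.
# '|' is the sed delimiter so slashes in the path need no escaping; this
# sketch assumes the path contains no '|', '&' or backslash.
render_template() {
  sed "s|{{PARENT}}|$2|g" "$1"
}

# Stand-in template (not the shipped file) rendered for /root/gateways.
tmpl=$(mktemp)
printf 'WorkingDirectory={{PARENT}}\nExecStart={{PARENT}}/run.sh all\n' > "$tmpl"
render_template "$tmpl" /root/gateways
rm -f "$tmpl"
```

Redirect the output to /etc/systemd/system/hermes-gateways.service (via sudo tee) once it looks right.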

Hostinger 1-click (Docker), simplest — tmux inside the container:

# Inside the container
tmux new-session -d -s hermes "cd $PARENT && ./run.sh all"
tmux attach -t hermes              # watch logs; Ctrl+B then D to detach

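To relaunch after a container restart, append a one-liner to ~/.bashrc. It runs the next time an interactive shell opens in the container (for example your next docker exec -it hermes-agent bash), not at container start: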

echo "tmux has-session -t hermes 2>/dev/null || tmux new-session -d -s hermes 'cd $PARENT && ./run.sh all'" >> ~/.bashrc

Hostinger 1-click (Docker), more robust — host-side systemd wrapping docker exec:

Run on the host VM, not inside the container. Replace hermes-agent with whatever docker ps shows for the container name and substitute the in-container parent path:

# /etc/systemd/system/hermes-gateways.service
[Unit]
Description=Hermes Multi Telegram Gateways (in Docker)
After=docker.service
Requires=docker.service

[Service]
Type=simple
# The leading "-" lets the unit start even when pkill finds nothing to kill (exit 1)
ExecStartPre=-/usr/bin/docker exec hermes-agent pkill -f 'hermes gateway run'
ExecStart=/usr/bin/docker exec hermes-agent /root/gateways/run.sh all
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target

Prefer one unit per bot? Drop a templated unit at /etc/systemd/system/hermes-gateway@.service with ExecStart=/usr/bin/docker exec hermes-agent /root/gateways/run.sh %i (on bare-metal, drop the docker exec prefix and use your own parent path), then systemctl enable --now hermes-gateway@work hermes-gateway@personal. Cleaner per-bot logs via journalctl -u hermes-gateway@work.

3.9 One-command bootstrap (`bootstrap.sh`)

Everything in §3.1–3.8 in one command:

curl -fsSL https://raw.githubusercontent.com/Demonbane18/hermes-agent-setup/main/bootstrap.sh | bash

Or from a clone:

git clone https://github.com/Demonbane18/hermes-agent-setup.git
cd hermes-agent-setup
./bootstrap.sh

The script auto-detects three modes.

Mode A — fresh setup. Run with no flags or args. It scans $HOME, /root, /home/* (and any --scan-path you pass) for an existing Hermes parent folder; if none is found it walks you through:

Parent folder (default ~/gateways)
Number of gateways (default 2)
Names (default gateway-1, gateway-2, …)
Sharing strategy (1 = isolated [default], 2 = shared-skills, 3 = shared-both)
LLM provider (1 = xiaomi-mimo [default], 2 = openrouter, 3 = anthropic, 4 = openai, 5 = gemini, 6 = groq, 7 = deepseek, 8 = minimax, 9 = zai, 10 = ollama, 11 = custom).
Default model for that provider (picker shows hard-coded list — see §3.14).
Fallback provider (default: openrouter unless primary is openrouter; pass none to skip).

Then it builds the layout, installs run.sh and inject_config.py, and prints next-steps with the exact paths to edit. Both the per-gateway config.yaml and .env are generated for the chosen provider — keys, base URL, and model lines are all in sync.

Mode B — extend an existing setup. If the scan finds a parent folder already shaped like a Hermes setup (has run.sh + inject_config.py + ≥1 gateway with config.yaml), you'll get:

found existing setup: /root/gateways
Extend it? [Y]:

The script auto-detects which strategy your existing parent uses (by checking whether the first gateway's memories/ and skills/ are symlinks vs real dirs) and applies the same strategy to any new gateways. You can override with --strategy.
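The symlink test behind that detection is simple enough to run yourself. This is a sketch of the idea as described, not the bootstrap's own code; detect_strategy is a hypothetical name.

```shell
# detect_strategy GATEWAY_DIR: print isolated, shared-skills, or shared-both
# depending on whether memories/ and skills/ in GATEWAY_DIR are symlinks.
detect_strategy() {
  if [ -L "$1/memories" ] && [ -L "$1/skills" ]; then
    echo shared-both
  elif [ -L "$1/skills" ]; then
    echo shared-skills
  else
    echo isolated
  fi
}

# Throwaway demo: one gateway with real memories/ and a linked skills/.
demo=$(mktemp -d)
mkdir -p "$demo/_shared/skills" "$demo/work/memories"
ln -s "$demo/_shared/skills" "$demo/work/skills"
detect_strategy "$demo/work"             # prints: shared-skills
rm -rf "$demo"
```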

Mode C — non-interactive (scripted automation). Flag-driven, no prompts:

# Fresh: 3 isolated gateways with default xiaomi-mimo + openrouter fallback
./bootstrap.sh --parent ~/gateways --count 3 --names alpha,beta,gamma \
    --strategy isolated --non-interactive

# Anthropic primary, OpenRouter fallback
./bootstrap.sh --parent ~/agents --count 1 --names assistant \
    --strategy isolated --provider anthropic --model claude-opus-4-7 \
    --non-interactive

# OpenAI primary, no fallback
./bootstrap.sh --parent ~/bots --count 2 --names work,personal \
    --strategy shared-both --provider openai --model gpt-4o \
    --no-fallback --non-interactive

# Local Ollama, no API keys
./bootstrap.sh --parent ~/local --count 1 --names dev \
    --provider ollama --model llama3.1:8b --no-fallback --non-interactive

# Add 2 gateways to an existing parent (strategy + provider auto-inherited)
./bootstrap.sh --add --parent ~/gateways --names delta,epsilon --non-interactive

Provider-related flags: --provider <name> · --model <id> · --base-url <url> · --key-env <VAR> · --fallback-provider <name|none> · --fallback-model <id> · --no-fallback.

Full flag list and the per-provider defaults: ./bootstrap.sh --help or §3.14.

Safety invariants (locked behavior):

Never overwrites existing .env or config.yaml. Skipped paths are logged as [skip] <path> exists.
Backs up run.sh and inject_config.py to .bak exactly once when replacing with a non-matching universal template. If .bak already exists, leaves both files alone and warns.
All new symlinks use absolute paths.
Every newly-created .env gets chmod 600.
--dry-run prints every action without executing.
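The first two invariants follow a common pattern that you can reuse in your own scripts. The sketch below illustrates that pattern and is not lifted from bootstrap.sh; install_if_absent and backup_once are hypothetical names, and only the [skip] log format comes from the list above.

```shell
# install_if_absent SRC DEST: copy SRC to DEST only when DEST does not exist.
install_if_absent() {
  if [ -e "$2" ]; then
    echo "[skip] $2 exists"
  else
    cp "$1" "$2"
  fi
}

# backup_once FILE: copy FILE to FILE.bak unless a backup is already there.
# Returns 1 with a warning when FILE.bak exists, so the caller can leave
# FILE alone instead of clobbering the only backup.
backup_once() {
  if [ -e "$1.bak" ]; then
    echo "warning: $1.bak already exists; leaving $1 untouched" >&2
    return 1
  fi
  cp -p "$1" "$1.bak"
}
```

A caller would replace run.sh only after backup_once run.sh succeeds, which is what keeps a second run from overwriting the original backup.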

3.10 Sharing strategies — reference deep-dive

The three strategies in this guide differ in what the bots share and how cross-pollination happens. Pick by use case, not by feel.

isolated — each gateway's memories/ and skills/ are real, separate folders. Default. Use when:

You want each bot to remember only its own conversations (max separation).
A second human is in the loop (a partner, an assistant, a teammate piping into a shared on-call bot) and you don't want one bot's notes surfacing in another.
You're A/B-testing personalities and don't want one's drift contaminating the other.
You'd rather promote durable knowledge into the Obsidian vault — both bots read it, but only what you deliberately commit crosses over.

~/gateways/
├── work/
│   ├── memories/     (real, work-only)
│   └── skills/       (real, work-only)
└── personal/
    ├── memories/     (real, personal-only)
    └── skills/       (real, personal-only)

shared-skills — skills/ is symlinked to _shared/skills/; memories/ stays per-bot. Use when:

You write skills once and want every bot to benefit.
You still want each bot's MEMORY.md to stay tight and on-topic.
You like the idea of one client-tailored persona per gateway, all sharing your hard-won automation library.

~/gateways/
├── _shared/
│   └── skills/       (canonical, all bots share)
├── work/
│   ├── memories/     (real, work-only)
│   └── skills/  →    _shared/skills
└── personal/
    ├── memories/     (real, personal-only)
    └── skills/  →    _shared/skills

shared-both — memories/ AND skills/ symlinked to _shared/. The "one head, many voices" classic. Use when:

Every bot is a persona of the same you — work-you, personal-you, coach-you.
You want "what did I tell the work bot yesterday?" to just work from the personal bot.
Cross-pollination is a feature, not a leak.

~/gateways/
├── _shared/
│   ├── memories/     (canonical, all bots share)
│   └── skills/       (canonical, all bots share)
├── work/
│   ├── memories/  →  _shared/memories
│   └── skills/    →  _shared/skills
└── personal/
    ├── memories/  →  _shared/memories
    └── skills/    →  _shared/skills

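Migration. Switching strategies on an existing setup is an rm/mkdir/ln -s recipe, explicitly out of scope for bootstrap.sh. To go from isolated → shared-both, for example: stop the launcher, mkdir -p ~/gateways/_shared, mv ~/gateways/work/memories ~/gateways/_shared/memories, move the other gateway's folder aside (mv ~/gateways/personal/memories ~/gateways/personal/memories.bak) so the link is not created inside it, then ln -s ~/gateways/_shared/memories ~/gateways/work/memories and ln -s ~/gateways/_shared/memories ~/gateways/personal/memories. Use absolute targets: a relative target such as _shared/memories is resolved from the gateway folder and would dangle. Repeat the same steps for skills/. Take a snapshot first.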
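As a function, the memories/ half of an isolated → shared-both move might look like the sketch below. share_memories is a hypothetical helper, PARENT must be an absolute path so the links are absolute, and the memories.bak name for the set-aside copies is an assumption of this sketch. Stop the launcher before running anything like it.

```shell
# share_memories PARENT CANONICAL OTHER...: move CANONICAL's memories/ into
# PARENT/_shared/memories, then point every listed gateway at it with
# absolute symlinks. Each OTHER gateway's old folder is kept as memories.bak.
share_memories() (
  parent=$1; canonical=$2; shift 2
  if [ -e "$parent/_shared/memories" ]; then
    echo "refusing: $parent/_shared/memories already exists" >&2
    exit 1
  fi
  mkdir -p "$parent/_shared"
  mv "$parent/$canonical/memories" "$parent/_shared/memories" || exit 1
  ln -s "$parent/_shared/memories" "$parent/$canonical/memories"
  for gw in "$@"; do
    mv "$parent/$gw/memories" "$parent/$gw/memories.bak" || exit 1
    ln -s "$parent/_shared/memories" "$parent/$gw/memories"
  done
)

# Example call (not executed here): share_memories "$HOME/gateways" work personal
```

Merging anything useful out of memories.bak into the shared folder is a manual step; the sketch only preserves it.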

Profiles is a fourth pattern, not a fourth strategy. hermes profile create (upstream) gives you fully isolated processes with their own everything — different binary invocations, different env, different files. Use profiles when you need hard tenant isolation (separate clients, compliance boundary, can't-leak-ever data). Multi-gateway is for one operator with several voices on one box.

3.11 Adding a new gateway later

Two paths.

Manual (60 seconds plus prompt-tuning time):

NAME=coach
PARENT=~/gateways
# Detect the strategy from the first real gateway (has config.yaml, no leading _)
FIRST=""
for d in "$PARENT"/*/; do
  n=$(basename "$d")
  case "$n" in _*) continue ;; esac
  [ -f "$d/config.yaml" ] && { FIRST=$n; break; }
done
if [ -L "$PARENT/$FIRST/memories" ] && [ -L "$PARENT/$FIRST/skills" ]; then STRATEGY=shared-both
elif [ -L "$PARENT/$FIRST/skills" ]; then STRATEGY=shared-skills
else STRATEGY=isolated; fi

mkdir -p "$PARENT/$NAME"
(cd "$PARENT/$NAME" && hermes setup)        # blank token, .env injects later

case "$STRATEGY" in
  shared-skills)
    rm -rf "$PARENT/$NAME/skills"
    ln -s "$PARENT/_shared/skills" "$PARENT/$NAME/skills"
    ;;
  shared-both)
    rm -rf "$PARENT/$NAME/memories" "$PARENT/$NAME/skills"
    ln -s "$PARENT/_shared/memories" "$PARENT/$NAME/memories"
    ln -s "$PARENT/_shared/skills"   "$PARENT/$NAME/skills"
    ;;
esac

cp "$PARENT/work/.env.example" "$PARENT/$NAME/.env" 2>/dev/null \
  || cp templates/.env.template "$PARENT/$NAME/.env"
chmod 600 "$PARENT/$NAME/.env"
"${EDITOR:-nano}" "$PARENT/$NAME/.env"      # paste new BotFather token + this bot's voice

systemctl restart hermes-gateways.service   # or: "$PARENT/run.sh" stop && "$PARENT/run.sh" all
"$PARENT/run.sh" list                        # confirm new gateway picked up

run.sh auto-discovers the new folder by virtue of it containing .env + config.yaml. No edits to run.sh needed.

Bootstrap (zero ceremony):

./bootstrap.sh --add --parent ~/gateways --names coach
# or interactively
./bootstrap.sh

The bootstrap detects your existing strategy and applies it automatically. Add multiple at once: --names coach,fitness,finance.

Removing a bot: stop the launcher, rm -rf ~/gateways/<name>, restart. The shared _shared/memories/ and _shared/skills/ are untouched because they don't live inside the gateway folder.

Bot tokens are unique per bot. Don't reuse a token across two gateways — Telegram only allows one polling process per token (409 Conflict from getUpdates). Always create a fresh BotFather bot for each new gateway.
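A copy-pasted .env is the usual way a token ends up in two places. The sketch below lists the gateways that share one, without printing the token itself. It assumes each .env holds a plain HERMES_TELEGRAM_BOT_TOKEN=value line (optionally quoted, no export prefix); duplicate_token_gateways is a hypothetical name.

```shell
# duplicate_token_gateways PARENT: print the name of every gateway whose
# HERMES_TELEGRAM_BOT_TOKEN also appears in another gateway's .env.
# Prints nothing when all tokens are unique.
duplicate_token_gateways() {
  for file in "$1"/*/.env; do
    [ -f "$file" ] || continue
    # Strip carriage returns and quote characters around the value.
    token=$(sed -n 's/^HERMES_TELEGRAM_BOT_TOKEN=//p' "$file" | tr -d '\r\042\047')
    [ -n "$token" ] && printf '%s %s\n' "$token" "$(basename "$(dirname "$file")")"
  done \
    | sort \
    | awk '$1 == prev { print prev_name; print $2 } { prev = $1; prev_name = $2 }' \
    | sort -u
}

# Example call (prints nothing on a healthy setup): duplicate_token_gateways ~/gateways
```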

3.12 Cross-gateway handoff (`_shared/handoff/`)

When memories/ is isolated (the isolated and shared-skills strategies), bots can't read each other's transient context. That's the whole point. But sometimes you genuinely do want them to coordinate — the work bot wants to tell the personal bot "deadline shifted to Tuesday" without dumping its full memory across the wall.

The pattern: a <parent>/_shared/handoff/ directory with plain markdown files. Both bots Read and Write to it via their normal tool calls. No symlinks, no IPC, no protocol — just files in a shared folder, deliberate writes, deliberate reads.

This repo ships a starter at gateways/_shared/handoff/README.md. Convention used in production:

<parent>/_shared/handoff/
├── README.md              # explains the protocol
├── weekend-handoff.md     # work bot writes Friday 6pm; personal bot reads
└── weekend-notes.md       # personal bot writes Sunday 6pm; work bot reads

Day(s)	Owning bot	What gets written
Mon–Fri	work	Sprint state, ops, automations, blockers
Fri 18:00 (cron)	work	Writes `weekend-handoff.md`: what's pending, what to watch over the weekend
Sat–Sun	personal	Reads handoff at the start of weekend sessions; appends to `weekend-notes.md`
Mon 07:30 (cron)	work	Reads `weekend-notes.md`, summarises into context, archives the file

Make it work in three steps:

Create the folder (bootstrap.sh does this for non-isolated strategies; for isolated create it manually: mkdir -p ~/gateways/_shared/handoff).
Add a shared skill so every bot knows the convention. Drop handoff-protocol.md into the shared skills/ (or every gateway's skills/ if isolated). Example skill body in gateways/_shared/handoff/README.md.
Add cron jobs (per gateway, via hermes -p work cron add and hermes -p personal cron add) for the Friday-write and Monday-read times.

Three or more bots? Add a third file. Or split by topic instead of calendar. The pattern is "named markdown files in a shared folder, written and read on schedule." Adapt freely.
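On disk, the Monday "archive the file" step is just a rename. The sketch below shows one way to do it; the archive/ subfolder, the date prefix, and the archive_handoff name are all assumptions of this example rather than part of the shipped convention.

```shell
# archive_handoff HANDOFF_DIR FILE: move FILE into HANDOFF_DIR/archive/ with a
# date prefix so the next cycle starts with an empty slot.
archive_handoff() (
  dir=$1; file=$2
  [ -f "$dir/$file" ] || exit 0          # nothing was written this cycle
  mkdir -p "$dir/archive"
  mv "$dir/$file" "$dir/archive/$(date +%Y-%m-%d)-$file"
)

# Example call (not executed here):
#   archive_handoff ~/gateways/_shared/handoff weekend-notes.md
```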

3.13 Set bot commands in @BotFather (small QoL win)

Once each bot is running, tell BotFather what slash-commands it accepts. Users get autocomplete; you get a tidy menu in the Telegram chat. Per bot:

/setcommands
@WorkAgentBot          ← pick from BotFather's list

start - Start a conversation
help - Show available commands
new - Begin a fresh session (resets context)

Repeat for each gateway. Five minutes total. Skippable if you don't care about polish.

3.14 LLM Provider Reference

The default in this guide is Xiaomi MiMo primary + OpenRouter fallback (cheap, fast, the math works). But Hermes is BYOM — bring your own model. Anthropic, OpenAI, Gemini, Groq, DeepSeek, MiniMax, Z.AI, local Ollama, or anything OpenAI-compatible all work the same way: edit two files (config.yaml + .env).

bootstrap.sh accepts a --provider flag and prompts for one when interactive — see §3.9. The default stays xiaomi-mimo so existing setups in this repo are unchanged.

Supported providers (hard-coded model lists)

The script ships with these providers wired up. Model lists are accurate as of the script's commit date — providers add and deprecate models constantly, so hit the provider's /v1/models endpoint for the live list and edit config.yaml if you need a model that isn't in the picker.

Provider	Models bundled in bootstrap	Base URL	Env var
xiaomi-mimo (default)	`mimo-v2.5-pro`, `mimo-v2-flash`	`https://token-plan-{sgp,ams,cn}.xiaomimimo.com/v1`	`XIAOMI_MIMO_API_KEY`
openrouter	`anthropic/claude-sonnet-4`, `anthropic/claude-opus-4`, `openai/gpt-4o`, `google/gemini-2.5-pro`, `meta-llama/llama-3.3-70b-instruct`	(built-in)	`OPENROUTER_API_KEY`
anthropic	`claude-opus-4-7`, `claude-sonnet-4-6`, `claude-haiku-4-5`	(built-in)	`ANTHROPIC_API_KEY`
openai	`gpt-4o`, `gpt-4o-mini`, `o1`, `o3-mini`	(built-in)	`OPENAI_API_KEY`
gemini	`gemini-2.5-pro`, `gemini-2.5-flash`	(built-in)	`GEMINI_API_KEY`
groq	`llama-3.3-70b-versatile`, `mixtral-8x7b-32768`, `llama-3.1-8b-instant`	`https://api.groq.com/openai/v1`	`GROQ_API_KEY`
deepseek	`deepseek-chat`, `deepseek-reasoner`	`https://api.deepseek.com/v1`	`DEEPSEEK_API_KEY`
minimax	`minimax-m2.7`	`https://api.minimax.chat/v1`	`MINIMAX_API_KEY`
zai	`glm-4-plus`, `glm-4-flash`	`https://open.bigmodel.cn/api/paas/v4`	`ZAI_API_KEY`
ollama (local)	`llama3.1:8b`, `qwen2.5:14b`, `deepseek-r1:14b`, `mistral:7b`	`http://localhost:11434/v1`	(none)
custom	you supply	you supply	you supply

Built-in vs custom: "built-in" providers go directly under model.provider: (no custom_providers: entry needed). The rest are OpenAI-compatible endpoints declared under custom_providers: with a name, base_url, and key_env.
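
The naming rule is small enough to express as a lookup. This sketch assumes the built-in set is exactly the four providers marked "(built-in)" in the table; `provider_ref` is a hypothetical helper name, not something bootstrap.sh exposes.

```shell
# provider_ref NAME: print the value to put under model.provider.
# Built-in providers use the bare name; everything else is custom:<name>.
provider_ref() {
  case "$1" in
    anthropic|openai|gemini|openrouter) printf '%s\n' "$1" ;;
    *) printf 'custom:%s\n' "$1" ;;
  esac
}

provider_ref anthropic     # prints anthropic
provider_ref groq          # prints custom:groq
provider_ref xiaomi-mimo   # prints custom:xiaomi-mimo
```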

Switching at install time (bootstrap)

# Anthropic primary, OpenRouter fallback (default fallback)
./bootstrap.sh --parent ~/gateways --count 1 --names assistant \
    --strategy isolated --provider anthropic --model claude-opus-4-7 \
    --non-interactive

# OpenAI primary, no fallback
./bootstrap.sh --parent ~/bots --count 2 --names work,personal \
    --strategy shared-both --provider openai --model gpt-4o \
    --no-fallback --non-interactive

# Local Ollama, no API keys at all
./bootstrap.sh --parent ~/local --count 1 --names dev \
    --provider ollama --model llama3.1:8b --no-fallback --non-interactive

# Custom OpenAI-compatible endpoint
./bootstrap.sh --parent ~/gateways --count 1 --names myagent \
    --provider custom --model my-llm-v1 \
    --base-url https://api.example.com/v1 --key-env EXAMPLE_API_KEY \
    --no-fallback --non-interactive

Run bootstrap.sh --help for the full flag list and per-provider defaults.

Switching manually (already-running setup)

If a gateway already exists, bootstrap won't touch its .env or config.yaml (per the safety invariants). Edit them by hand. Two files per gateway, four edits total.

Step 1 — <gateway>/config.yaml: swap the model block (and the matching auxiliary block — the compression model should match the primary's context window).

Built-in provider (anthropic, openai, gemini, openrouter):

model:
  default: claude-sonnet-4-6
  provider: anthropic                  # bare name = built-in
  api_mode: chat_completions
  fallback_providers:
    - provider: openrouter
      model: anthropic/claude-sonnet-4

custom_providers: []                    # empty when both primary + fallback are built-in

providers:
  anthropic:
    key_env: ANTHROPIC_API_KEY          # default name; override here if you use a different env var

auxiliary:
  compression:
    provider: anthropic
    model: claude-sonnet-4-6
  title_generation:
    provider: anthropic
    model: claude-sonnet-4-6

Custom OpenAI-compatible (groq, deepseek, ollama, minimax, zai, anything else):

model:
  default: llama-3.3-70b-versatile
  provider: custom:groq                 # `custom:<name>` references custom_providers below
  api_mode: chat_completions
  fallback_providers:
    - provider: openrouter
      model: meta-llama/llama-3.3-70b-instruct

custom_providers:
  - name: groq
    base_url: https://api.groq.com/openai/v1
    key_env: GROQ_API_KEY

providers: {}

auxiliary:
  compression:
    provider: custom:groq
    model: llama-3.3-70b-versatile
  title_generation:
    provider: custom:groq
    model: llama-3.3-70b-versatile

Step 2 — <gateway>/.env: drop the old API key var, add the new one.

- XIAOMI_MIMO_API_KEY=tp-...
+ ANTHROPIC_API_KEY=sk-ant-api03-...
  OPENROUTER_API_KEY=sk-or-v1-...

(Keep OPENROUTER_API_KEY only if you're using openrouter as the fallback. Otherwise remove that line too.)

Step 3 — restart: cd ~/gateways && ./run.sh stop && ./run.sh all. The launcher re-reads .env, inject_config.py re-injects the bot token, and Hermes picks up the new model on first message.
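
Before restarting, it can help to confirm that the two files agree. The sketch below is a hypothetical helper (not part of this repo) that reads every `key_env:` name out of config.yaml and checks that .env has a non-empty value for it. It only sees names that are written out in config.yaml, so a built-in provider relying on its default variable name is not covered.

```shell
# check_keys DIR: confirm every key_env named in DIR/config.yaml has a
# non-empty value in DIR/.env. Returns non-zero if any are missing.
check_keys() {
  dir="$1"
  missing=0
  # Pull the variable name out of every "key_env: NAME" line.
  for var in $(sed -n 's/^[[:space:]]*key_env:[[:space:]]*\([A-Za-z_][A-Za-z0-9_]*\).*$/\1/p' "$dir/config.yaml" | sort -u); do
    if grep -Eq "^${var}=.+" "$dir/.env"; then
      echo "ok: $var"
    else
      echo "missing: $var"
      missing=1
    fi
  done
  return "$missing"
}

# Demo against a throwaway gateway folder (placeholder values only).
demo="$(mktemp -d)"
printf 'custom_providers:\n  - name: groq\n    base_url: https://api.groq.com/openai/v1\n    key_env: GROQ_API_KEY\n' > "$demo/config.yaml"
printf 'OPENROUTER_API_KEY=sk-or-v1-placeholder\n' > "$demo/.env"
check_keys "$demo" || echo "fix .env before restarting"
```

Against a real gateway you would run `check_keys ~/gateways/work` and only restart once every line says ok.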

Where to find the latest model IDs

Live /v1/models endpoint (works for any OpenAI-compatible provider):

curl -s "$BASE_URL/models" -H "Authorization: Bearer $API_KEY" | jq '.data[].id'

Anthropic: docs.anthropic.com/en/docs/about-claude/models/overview
OpenAI: platform.openai.com/docs/models
Gemini: ai.google.dev/gemini-api/docs/models
OpenRouter: openrouter.ai/models
MiMo: platform.xiaomimimo.com
Groq: console.groq.com/docs/models
DeepSeek: api-docs.deepseek.com
Ollama (local): ollama list for installed models, ollama.com/library for the catalog.

Hermes upstream provider list

Hermes Agent itself supports more than what bootstrap wires up out of the box. Check hermes config providers --help (CLI) or the upstream docs at hermes-agent.nousresearch.com for the canonical list. The upstream BYOM tagline mentions: OpenRouter, Anthropic, OpenAI, Xiaomi MiMo, Z.AI, MiniMax, Google Gemini, Groq, DeepSeek, and local models via Ollama. Anything beyond that list works through custom_providers: if it's OpenAI-compatible — same shape as the recipe above.

Why hard-coded model lists? Auto-fetching live model lists from every provider would mean a network hit for every bootstrap run, plus a separate auth flow per provider just to discover IDs. Cheaper and more reliable to ship a list, document where the live list lives, and let the user paste the latest model name. Update the list in bootstrap.sh's provider_models() function when models change materially.

Part 4: OpenRouter API Setup

OpenRouter is one API key, hundreds of models. Best safety-net default for Hermes.

Sign up at openrouter.ai.
Go to Keys → Create Key, copy the sk-or-v1-… value.

Paste it into every gateway .env file:

# Append the key to each gateway's .env
for env in ~/gateways/*/.env; do
  echo "OPENROUTER_API_KEY=sk-or-v1-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" >> "$env"
done

Top up $5–10. Visit the OpenRouter rankings page to see what's good right now.

My current picks:

Use case	Model	Why
Default chat	`minimax/minimax-m2.7`	Strong agentic, great tool calls, cheap
Coding-heavy	`anthropic/claude-sonnet-4.6`	Best reasoning when stakes matter
Long context	`google/gemini-2.5-pro`	1M context, fast
Cheap auxiliary	`minimax/minimax-m2-air`	Great for compression, titles

OpenRouter is the fallback in this guide. Primary is Xiaomi MiMo (next section).

Part 5: Xiaomi MiMo (free / cheap inference)

Xiaomi MiMo is Xiaomi's open-source LLM family. They run a generous Token Plan subscription (~700M tokens/month at Pro tier) and have an official Hermes Agent integration.

For builders running an agent like Hermes 24/7, MiMo is the best price-to-tool-call-quality ratio I've found — and the entry point is even better if you can get into Xiaomi's developer program.

5.1 The Xiaomi MiMo Orbit Program (recommended path — free tokens)

The MiMo Orbit Program is Xiaomi's developer/early-builder program for the MiMo platform. Once accepted, you get a generous monthly token allowance on the Token Plan for free, plus early access to new MiMo models. For an agent that talks to you all day across multiple Telegram bots, this is what makes the math work — you can run all your gateways through MiMo as the primary model and only fall through to OpenRouter when the endpoint hiccups.

Background reading & community discussion: r/XiaomiGlobal — Xiaomi MiMo Orbit Program — what people are getting in, what they're shipping with it, current acceptance signal.

How to onboard

Create a Xiaomi MiMo platform account at platform.xiaomimimo.com. Use a real email you check — approval emails go there.
Find the Orbit Program page in the dashboard (look under Programs, Developer, or the announcement banner — the exact label has shifted as Xiaomi has iterated the program). If you can't find it from the dashboard, the Reddit thread above usually has a current direct link.
Submit the application. Typical fields:
- Who you are (GitHub / X / personal site).
- What you're building — be specific. "A personal multi-bot Hermes Agent setup with N Telegram gateways sharing one brain" is a strong, concrete pitch. Vague "I want to try MiMo" applications get deprioritized.
- Estimated daily/monthly token usage. Honest numbers — Hermes plus 2–4 gateways with average chatter and compression typically lands somewhere between 5M and 50M tokens/month, well inside Orbit limits.
- Tool-call use case. Mention agentic behavior (file reads, shell, MCP servers, scheduled jobs) — MiMo is tuned for tool calls and the team likes seeing it used that way.
Wait for approval. Anywhere from a few hours to a few days depending on intake volume. Approval lands in your platform dashboard and over email.
Once approved, your account gets the Orbit-tier Token Plan automatically. Skip step 5.2's "Subscribe Plan" — your subscription is comped.
Generate the API key at Subscription Details → Create API Key and grab your Dedicated Base URL (next subsection). Then jump to §5.3 to wire MiMo into your gateways.

Not accepted (yet)? No drama. The paid Token Plan starts cheap, and you can re-apply to Orbit later — the rest of this section works identically either way. You can also keep using OpenRouter as primary and add MiMo whenever you do get in.

5.2 Get a key

If you came here via Orbit, your account already has the Token Plan attached — skip step 2.

Sign up at platform.xiaomimimo.com (or sign in if Orbit accepted you above).
Go to Token Plan → Subscribe Plan (paid path) — OR rely on the comped Orbit subscription.
Once subscribed, Subscription Details → Create API Key. Copy it immediately — Xiaomi only shows it once.

On the same page, find your Dedicated Base URL. It's region-specific:

Region	Base URL
Singapore (Asia-Pacific)	`https://token-plan-sgp.xiaomimimo.com/v1`
Amsterdam (Europe)	`https://token-plan-ams.xiaomimimo.com/v1`
China (mainland)	`https://token-plan-cn.xiaomimimo.com/v1`

Use whichever your dashboard shows you. They are NOT interchangeable. Hitting the wrong endpoint is the #1 cause of HTTP 401 errors.

5.3 Configure Hermes for MiMo

Edit each gateway's config.yaml (or just the first one — the others can copy it). The model + auxiliary block is per-gateway, so you can give the work bot the Pro model and the personal bot the Flash model if you want different tiers per voice:

model:
  default: mimo-v2.5-pro
  provider: custom:xiaomi-mimo
  api_mode: chat_completions
  fallback_providers:
    - provider: openrouter
      model: minimax/minimax-m2.7

custom_providers:
  - name: xiaomi-mimo
    base_url: https://token-plan-sgp.xiaomimimo.com/v1
    key_env: XIAOMI_MIMO_API_KEY

auxiliary:
  compression:
    provider: custom:xiaomi-mimo
    model: mimo-v2.5-pro
  title_generation:
    provider: custom:xiaomi-mimo
    model: mimo-v2.5-pro

For lighter-weight gateways (personal, coach, etc.), swap the default to mimo-v2-flash in their own config.yaml — cheaper, faster, fine for casual chat. Heavy-lift gateways (work, finance) keep mimo-v2.5-pro.

⚠️ Compression context mismatch — keep auxiliary on mimo-v2.5-pro

If you route auxiliary.compression to mimo-v2-flash, Hermes will warn at startup:
Compression model mimo-v2-flash context is 262,144 tokens,
but the main model mimo-v2.5-pro's compression threshold was 524,288 tokens.
Auto-lowered this session's threshold to 262,144 tokens.
What this means: the compression model has a smaller context window than the main model's compression trigger, so Hermes halves how much history it tries to compress at once, with only this startup warning to tell you. You lose middle context faster.

Fix: keep both auxiliary.compression.model and auxiliary.title_generation.model on mimo-v2.5-pro (524k context), even on flash-default gateways. The cost delta on auxiliary calls is negligible compared to losing conversation memory.

Only downgrade auxiliary to flash if you've also lowered the main model's compression.threshold to ≤262,144 in config.yaml — otherwise the threshold mismatch wins and Hermes auto-clamps you anyway.
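
The arithmetic behind the warning is just "take the smaller number". This is an illustration of that clamp using the figures from the log above, not Hermes code; `effective_threshold` is a made-up name.

```shell
# effective_threshold MAIN_THRESHOLD AUX_CONTEXT: the smaller of the two,
# mirroring the auto-lowering described in the startup warning.
effective_threshold() {
  if [ "$2" -lt "$1" ]; then
    echo "$2"
  else
    echo "$1"
  fi
}

effective_threshold 524288 262144   # prints 262144 (flash as auxiliary: clamped)
effective_threshold 524288 524288   # prints 524288 (pro as auxiliary: unchanged)
```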

Add the key to every gateway's .env:

for env in ~/gateways/*/.env; do
  echo "XIAOMI_MIMO_API_KEY=tp-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" >> "$env"
done
# (then dedupe/edit each file in nano if needed)
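
If you expect to re-run that loop (a rotated key, a typo), appending leaves duplicate lines behind. A minimal sketch of an idempotent alternative follows; `set_env_var` is a hypothetical helper, not shipped in this repo, and the key values are placeholders.

```shell
# set_env_var FILE KEY VALUE: replace KEY's line if present, otherwise append.
set_env_var() {
  file="$1"; key="$2"; value="$3"
  touch "$file"
  tmp="$(mktemp)"
  grep -v "^${key}=" "$file" > "$tmp" || true   # keep every other line
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  cat "$tmp" > "$file"                          # write in place so the file mode (600) is kept
  rm -f "$tmp"
}

# Demo on a throwaway file: two calls still leave exactly one line.
demo_env="$(mktemp)"
set_env_var "$demo_env" XIAOMI_MIMO_API_KEY tp-old-placeholder
set_env_var "$demo_env" XIAOMI_MIMO_API_KEY tp-new-placeholder
cat "$demo_env"   # prints XIAOMI_MIMO_API_KEY=tp-new-placeholder
```

To use it for real, call it inside the same `for env in ~/gateways/*/.env` loop instead of `echo ... >>`.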

5.4 Test before launching

source ~/gateways/work/.env
curl -s -o /dev/null -w "HTTP %{http_code}\n" \
  https://token-plan-sgp.xiaomimimo.com/v1/models \
  -H "Authorization: Bearer $XIAOMI_MIMO_API_KEY"

You want HTTP 200. If you get 401, either the key belongs to a different regional endpoint or it picked up whitespace during copy-paste. Swap the URL above for the one your Xiaomi dashboard shows and re-check the key.
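
Stray whitespace is hard to spot by eye, so here is a sketch of a check that finds it without printing the secret itself. `find_dirty_values` is a hypothetical helper; it reports only the line number and variable name of any value that starts or ends with a space, tab, or carriage return.

```shell
# find_dirty_values FILE: list KEY=value lines whose value has leading or
# trailing whitespace. Exits non-zero when every line is clean.
find_dirty_values() {
  awk -F= '
    /^[A-Za-z_][A-Za-z0-9_]*=/ {
      value = substr($0, length($1) + 2)
      if (value ~ /^[ \t\r]/ || value ~ /[ \t\r]$/) {
        print NR ": " $1
        found = 1
      }
    }
    END { exit !found }
  ' "$1"
}

# Demo: the first key has a trailing space, the second is clean.
demo_env="$(mktemp)"
printf 'XIAOMI_MIMO_API_KEY=tp-placeholder \nOPENROUTER_API_KEY=sk-or-v1-placeholder\n' > "$demo_env"
find_dirty_values "$demo_env" || echo "no stray whitespace found"
```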

5.5 Why MiMo + OpenRouter fallback

MiMo is fast, free-ish at the Token Plan tier, and specifically optimized for tool calls — the exact thing an agent does all day. OpenRouter is the safety net: if your Token Plan runs out or the endpoint hiccups, your bot still replies via MiniMax. Best of both worlds.

Part 6: Add an Obsidian Second Brain

Obsidian is a free, local-first markdown notes app. It treats a folder of .md files as a "vault" and adds powerful linking, search, and graph views on top. Because it's just markdown files, Hermes can read and write to it directly.

If you went isolated or shared-skills in §3.10, this section is not optional. Obsidian is your only cross-bot knowledge layer once each gateway has its own isolated memories/. The vault is the canon; MEMORY.md is scratch. Read on with that frame in mind.

6.0 What is an Obsidian vault, in plain English?

If you've never used Obsidian, here's the whole thing in 90 seconds:

A vault is just a folder on your disk. Not a cloud account, not a database — a plain folder. You can open it in Finder/Explorer and see .md files.
Each note is a plain markdown file (my-note.md). You can read them in any text editor. If Obsidian disappeared tomorrow, your notes would still be readable.
Notes can link to each other using [[double-bracket]] syntax. Obsidian renders those links as clickable, tracks backlinks automatically, and shows a graph view that maps how everything connects.
"Sync" is whatever you want it to be: paid Obsidian Sync, Syncthing, iCloud Drive, or just a git repo. In this guide we use git, because we already have a hermes-context repo for it.

Why this matters for an AI agent: because the vault is "just a folder of .md files", Hermes can read, write, grep, and ls into it with the same shell tools any developer would. No special API. No vector DB. No webhook glue. The agent just uses your filesystem like a thoughtful collaborator who keeps adding to a shared notebook.

This is the unlock: instead of treating "memory" as a black box inside the model, you give Hermes a real, inspectable, version-controlled set of markdown files you can also read, edit, and curate in Obsidian on your laptop. You stay in the loop. The agent does the legwork.

6.1 Create the vault on the VPS

Hostinger 1-click users: create the vault folder inside the container and under the persistent volume mount — otherwise it disappears on restart. If ~ is on the volume (it usually is), this just works:

# Inside the container — replace <project-slug> with your own short name
# (e.g., your client's name, your side-project's codename, or just "main").
PROJECT="<project-slug>"
mkdir -p ~/hermes-context/active-projects/"$PROJECT"/vault-sync/"$PROJECT"-vault

What this does: makes a per-project pocket inside hermes-context/ so the same Hermes setup can host multiple vaults (one per client / side-project / domain) without them stepping on each other. The folder name is just a label — the agent only cares about the path you put in OBSIDIAN_VAULT_PATH next.

To open the same vault in Obsidian on your laptop, you have three options:

Best: push the folder up via the hermes-context git repo (Part 7) and git pull it locally. Edits go both ways through normal git workflow.
Real-time: point Obsidian Sync or Syncthing at a host-side directory bind-mounted into the container (requires editing the Hostinger template's compose file — advanced).
Read-only mirror: rsync the folder to your laptop on a cron.

For most people, the git approach is plenty — see Part 7.

6.2 Tell Hermes where it is

Add to every gateway's .env file (use the same $PROJECT slug you picked above):

PROJECT="<project-slug>"   # same value you used in 6.1
for env in ~/gateways/*/.env; do
  echo "OBSIDIAN_VAULT_PATH=/root/hermes-context/active-projects/$PROJECT/vault-sync/$PROJECT-vault" >> "$env"
done

What this does: writes the absolute path of your vault into each bot's .env so the obsidian_vault skill can find it. Both gateways read the same path — that's how durable knowledge crosses bots.
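
A typo in the slug gives you a path that points nowhere, and nothing complains until a bot tries to save a note. The sketch below checks every gateway's .env against the filesystem. `check_vault_paths` is a hypothetical helper; if the variable was appended more than once, this sketch looks at the last occurrence, which is an assumption about precedence rather than documented Hermes behavior.

```shell
# check_vault_paths PARENT: for each PARENT/*/.env, confirm that
# OBSIDIAN_VAULT_PATH is set and points at an existing directory.
check_vault_paths() {
  parent="$1"
  status=0
  for env in "$parent"/*/.env; do
    [ -f "$env" ] || continue
    vault="$(sed -n 's/^OBSIDIAN_VAULT_PATH=//p' "$env" | tail -n 1)"
    if [ -n "$vault" ] && [ -d "$vault" ]; then
      echo "ok: $env"
    else
      echo "bad: $env"
      status=1
    fi
  done
  return "$status"
}

# Demo with two throwaway gateways, one pointing at a missing folder.
demo="$(mktemp -d)"
mkdir -p "$demo/work" "$demo/personal" "$demo/vault"
echo "OBSIDIAN_VAULT_PATH=$demo/vault" > "$demo/work/.env"
echo "OBSIDIAN_VAULT_PATH=$demo/does-not-exist" > "$demo/personal/.env"
check_vault_paths "$demo" || echo "at least one gateway points at a missing vault"
```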

6.3 Give Hermes a skill to use it

Skills are markdown files in ~/gateways/work/skills/ (which is the shared one thanks to our symlink). Create obsidian_vault.md:

---
name: obsidian_vault
description: Read and write notes in the user's Obsidian vault.
---

# Obsidian Vault Skill

The user's Obsidian vault is at `$OBSIDIAN_VAULT_PATH`.

When the user asks you to:

- "save this to my vault" / "add to my notes" → write a new `.md` file in the vault
- "what do I have on X?" → search the vault with grep/ripgrep
- "remind me about Y" → search both your memory AND the vault

## Use frontmatter on new notes:

---
date: <ISO date>
source: hermes-<bot-name>
tags: [auto-generated, ...]
---

Always confirm before overwriting an existing file.

Done. The agent now treats your vault as an extension of its memory. Both bots have access because the skills/ folder is shared.
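
To make the skill's rules concrete, here is roughly what a "save this to my vault" action amounts to on disk. It is a sketch of the behavior the skill asks for, not how Hermes implements it; `save_note` and its arguments are hypothetical, and the tag list is trimmed to the one tag named in the skill.

```shell
# save_note VAULT SLUG BOT TEXT: write VAULT/SLUG.md with frontmatter,
# refusing to overwrite an existing note.
save_note() {
  vault="$1"; slug="$2"; bot="$3"; text="$4"
  note="$vault/$slug.md"
  if [ -e "$note" ]; then
    echo "exists, not overwriting: $note" >&2
    return 1
  fi
  {
    echo "---"
    echo "date: $(date +%Y-%m-%d)"
    echo "source: hermes-$bot"
    echo "tags: [auto-generated]"
    echo "---"
    echo
    echo "$text"
  } > "$note"
}

# Demo in a throwaway vault.
demo_vault="$(mktemp -d)"
save_note "$demo_vault" deadline-change work "Deadline shifted to Tuesday."
grep -rl "Deadline" "$demo_vault"   # the "what do I have on X?" lookup
```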

6.4 Level it up: Karpathy's LLM Wiki pattern

So far Part 6 has given you "a folder of notes the agent writes to." That's already useful. But you can go a step further and turn the vault into a self-maintaining personal wiki — and the playbook for doing that comes from Andrej Karpathy.

Read the original first (10 min): Karpathy's LLM Wiki gist →

Everything in this section is my adaptation of that pattern for a multi-bot Hermes setup. The gist is the canonical source — read it for the full thinking. This subsection just shows how to wire it up specifically.

6.4.1 What is the LLM Wiki, in plain English?

Most "AI + your docs" setups (NotebookLM, ChatGPT file uploads, vanilla RAG) work like this: you upload sources, the model retrieves chunks at query time, generates an answer. Nothing accumulates. Ask the same kind of question two weeks later and the model re-derives everything from scratch.

Karpathy flips this. Instead of just retrieving, the LLM incrementally builds and maintains a persistent wiki — a structured, interlinked collection of markdown files that sits between you and the raw sources. When a new source arrives, the LLM:

Reads it
Writes a summary page
Updates 5–15 existing pages (entities, concepts, themes) to integrate the new info
Flags contradictions with what's already there
Refreshes an index.md and appends to a log.md

The result: a knowledge base that compounds over time, instead of one that's reconstituted on every query. Cross-references are already there. Synthesis already exists. The LLM does the boring bookkeeping; you do the curating and the asking.

In Karpathy's framing: Obsidian is the IDE; the LLM is the programmer; the wiki is the codebase.

6.4.2 The four-layer structure

The pattern has four pieces. They map cleanly onto folders inside your Obsidian vault.

Layer	What lives there	Who writes it
`raw/`	Your immutable sources — clipped articles, papers, podcast transcripts, screenshots. The LLM reads these, never edits them.	You (or Obsidian Web Clipper)
`wiki/`	LLM-generated markdown. Entity pages, concept pages, source summaries, synthesis. Cross-linked with `[[wikilinks]]`.	The LLM, every time it ingests
`index.md` + `log.md`	`index.md` = catalog organized by category. `log.md` = chronological append-only changelog (`## [2026-05-06] ingest \| Article Title`).	The LLM, automatically
Schema (`SOUL.md` / `CLAUDE.md`)	The rules the LLM follows when ingesting and maintaining. This is what makes it disciplined instead of chaotic. You and the LLM co-evolve it over time.	You + the LLM together

6.4.3 Why this pairs perfectly with Hermes Agent

The LLM Wiki pattern was originally written for one human + one LLM agent (Codex / Claude Code) working side-by-side. A multi-gateway Hermes setup is unusually well-suited to it for four reasons:

Hermes already has the running shell. Karpathy's pattern needs an LLM that can ls, cat, grep, and Edit — Hermes has all four as first-class tools. No extra wiring.
OBSIDIAN_VAULT_PATH is already shared across every gateway. Every bot reads and writes the same wiki. A source ingested via the work bot at 9 AM is queryable from the personal bot at 9 PM with no syncing.
hermes-context already provides the git layer. Karpathy points out that "the wiki is just a git repo of markdown files — you get version history, branching, collab for free." We already have that repo. The wiki literally lives inside it.
The skills/ folder is the perfect home for the schema. A skill called llm_wiki_maintainer.md defines the rules every gateway follows. Because skills are symlinked across all bots, all bots maintain the wiki the same way.

6.4.4 What it looks like in motion

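When you message any bot with a new source, the flow mirrors the list in 6.4.1: the bot saves the source under raw/, reads it, writes a summary page, updates the entity and concept pages it touches, flags contradictions, refreshes index.md, appends a line to log.md, and replies with what changed.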

6.4.5 Benefits — what this actually buys you

This is the part that makes the math work. You can be skeptical of "AI second brain" pitches in general (rightly), so here are the concrete wins specific to Hermes + LLM Wiki:

Benefit	What it means in practice
Compounding knowledge	Every source you drop in makes the wiki richer, not just bigger. Page on "API rate limits" gets sharper after the 5th article on the topic, not noisier.
No vector DB, no embeddings	The LLM uses `index.md` to navigate the wiki. Karpathy notes this scales to ~100 sources / hundreds of pages without RAG infra. You stay in plain markdown the whole time.
You stay in the loop	Obsidian on your laptop shows you exactly what the agent wrote. Disagree? Edit the page. The LLM will respect the edit on the next pass. No black-box memory drift.
Multi-bot, one canon	The work bot ingests an article during a meeting; the personal bot can answer questions about it on the train home. Same wiki, different voices.
Git-native everything	The wiki is a git folder inside `hermes-context`. Bad ingest? `git revert`. Want to fork the wiki for a side project? Branch. Want a teammate to add to it? Standard PR flow.
Self-improving along TWO axes	Hermes already self-improves procedurally via `skills/`. The LLM Wiki adds factual self-improvement via `wiki/`. Procedure + knowledge, both compounding.
Graceful degradation	If you stop using the wiki, you still have a folder of human-readable markdown notes. Nothing locked in. Nothing to migrate.

6.4.6 Why I chose this for my setup

Three reasons, honest:

MEMORY.md was getting noisy. Letting Hermes append every interesting fact to one ever-growing markdown file is fine for a week and chaotic by month two. The LLM Wiki gives the agent a structured place to graduate important facts to, with links instead of scrolling.
The vault was a passive dumping ground. Before the LLM Wiki, my Obsidian vault was a one-way street — Hermes wrote to it, I rarely opened it. With the wiki structure (and the schema telling the agent to maintain index.md + log.md), the vault has a job now, and Obsidian is genuinely useful as a browse-and-curate UI on top of it.
Multi-gateway makes it 2× better, not 2× more work. With a single bot, the LLM Wiki is just a nice personal pattern. With several bots sharing one wiki via OBSIDIAN_VAULT_PATH, it becomes the only sensible design — because it's the only place where multi-bot durable knowledge can plausibly live without leaking everything via shared memories/. It also pairs perfectly with the isolated and shared-skills strategies: isolate the noise, share the canon.

6.4.7 Setup: scaffold the wiki layout

Inside the Hermes container, lay out the wiki structure inside your vault: raw/, wiki/ with its four subfolders, an assets/ folder, and the two top-level files:

# Inside the Hermes container
cd "$OBSIDIAN_VAULT_PATH"
mkdir -p raw wiki/{sources,entities,concepts,synthesis} assets
touch index.md log.md

Drop a starter index.md:

# Wiki Index

> Auto-maintained by Hermes. Last updated: <date>

## Entities

<!-- The LLM will list entity pages here -->

## Concepts

<!-- The LLM will list concept pages here -->

## Sources

<!-- One entry per ingested file in raw/ -->

## Synthesis

<!-- Cross-source themes -->

Drop a starter log.md:

# Wiki Log

Append-only. Newest at the bottom.

## [<today>] init | wiki scaffolded

6.4.8 Add the schema as a Hermes skill

This is the file that turns the agent into a disciplined wiki maintainer. Because skills/ is symlinked across every gateway, all bots will follow the same rules.

Save as ~/gateways/work/skills/llm_wiki_maintainer.md:

---
name: llm_wiki_maintainer
description: Maintain the user's LLM Wiki inside the Obsidian vault.
---

# LLM Wiki Maintainer

The user's Obsidian vault at `$OBSIDIAN_VAULT_PATH` is structured as
a Karpathy-style LLM Wiki:

- `raw/` — immutable sources you READ but never edit
- `wiki/` — entity, concept, source-summary, synthesis pages YOU WRITE
- `index.md` — catalog of every wiki page (you keep this updated)
- `log.md` — append-only changelog of every ingest / change

## When the user says "save this" or "ingest this":

1. Drop the source into `raw/<descriptive-slug>.md`. NEVER modify raw/ later.
2. Read the source. Identify entities, concepts, and themes.
3. Read `index.md` to find existing pages that should be updated.
4. Write a one-page summary at `wiki/sources/<slug>.md` with frontmatter
   (date, source-url, tags) and `[[wikilinks]]` to entity/concept pages.
5. Update existing entity & concept pages — fold new info in, flag
   contradictions explicitly with >**CONFLICT** quotes.
6. Append one line to `log.md`: `## [<ISO-date>] ingest | <Title>`
7. Refresh `index.md` so new pages appear under the right category.
8. Reply with: pages touched, contradictions found, follow-up questions.

## When the user asks a knowledge question:

1. Read `index.md` first to navigate.
2. Drill into the relevant 2–5 wiki pages.
3. Answer with citations: "(see [[page-name]])".
4. If the answer is thin, surface that — don't fabricate.

## Never:

- Edit anything in `raw/`.
- Force-push or rewrite history (the vault is in a git repo).
- Delete a wiki page without telling the user; archive to `wiki/archive/` instead.
- Save secrets / tokens / personal credentials anywhere in the vault.

## Conventions:

- Filenames: kebab-case, no spaces. `pricing-tiers.md` not `Pricing Tiers.md`.
- Wikilinks: `[[entity-name]]` for any cross-page reference.
- Frontmatter on every wiki page (date, last-updated, source-count).
- Synthesis pages start with a one-paragraph thesis, then evidence with backlinks.

That's the schema. From here on, every gateway maintains the wiki the same way. Drop a source via the work bot at 9 AM; ask the personal bot about it at 9 PM; same answer, drawn from the same compounding knowledge base.
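
Two of the schema's conventions are mechanical enough to sketch in shell: kebab-case filenames and the one-line log entry from step 6. Both function names are hypothetical, and this `slugify` is ASCII-only (any other character collapses into a hyphen), which is an assumption you may want to relax.

```shell
# slugify TITLE: lower-case the title, turn runs of anything that is not
# a-z or 0-9 into a single hyphen, and trim hyphens from both ends.
slugify() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//'
}

# log_ingest VAULT TITLE: append the changelog line for one ingest.
log_ingest() {
  printf '## [%s] ingest | %s\n' "$(date +%Y-%m-%d)" "$2" >> "$1/log.md"
}

# Demo in a throwaway vault.
demo_vault="$(mktemp -d)"
slugify "Pricing Tiers"                     # prints pricing-tiers
log_ingest "$demo_vault" "Pricing Tiers"
tail -n 1 "$demo_vault/log.md"
```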

6.4.9 Going further

The Karpathy gist has more depth than this section can fairly cover — workflow flavors (one-at-a-time vs batch ingest), Obsidian Web Clipper for fast source capture, Dataview/Marp plugins, lint passes, the trade-offs around "human in the loop" vs full autonomy. Read it. It's short.

Karpathy — LLM Wiki (gist) — the canonical source.

When you adapt the schema for your domain (research vs personal vs business), update llm_wiki_maintainer.md accordingly. The agent meets you where the rules are.

Part 7: hermes-context — Sync with Claude Code on your laptop

This is the bridge between your Claude Code coding sessions on your laptop and your Hermes Agent on the VPS. The two need to stay in sync — what you decided in Claude this morning, Hermes should know about by lunch.

The pattern:

A GitHub repo called hermes-context (mine: Demonbane18/hermes-context) with this structure:

hermes-context/
├── active-projects/    ← what you're working on now
├── session-notes/      ← timestamped Claude session handoffs
└── snippets/           ← reusable code/config fragments

A post-session hook in Claude Code that auto-commits and pushes a session summary to hermes-context when you end a coding session.
A scheduled cron job in Hermes that pulls the repo every 15 minutes.
A slash command /hermes_context_autosync that does an immediate manual pull.

7.1 Set up the repo

Hostinger 1-click: these git operations run inside the container, not on the host. The clone target must be on the persistent volume so it survives restarts.

# Inside the container
cd ~
git clone https://github.com/<you>/hermes-context.git
cd hermes-context
mkdir -p active-projects session-notes snippets
echo "# Hermes Context" > README.md
git add . && git commit -m "init" && git push

If the commit above fails because the container has no git identity configured, set one once and re-run the commit:

git config --global user.email "hermes-bot@your-domain.invalid"
git config --global user.name  "Hermes Bot"

7.2 Tell Hermes about it

Add to every gateway's .env file:

for env in ~/gateways/*/.env; do
  echo "HERMES_CONTEXT_REPO=/root/hermes-context" >> "$env"
done

7.3 Create the autosync skill

Save as ~/gateways/work/skills/hermes_context_autosync.md:

---
name: hermes_context_autosync
description: Pull/push sync for hermes-context Git repo
trigger: /hermes_context_autosync
---

# Hermes Context Auto Sync

The hermes-context repo at `$HERMES_CONTEXT_REPO` is the bridge between
the user's Claude Code sessions on their laptop and you (Hermes) on the VPS.

## When the user runs /hermes_context_autosync:

1. `cd $HERMES_CONTEXT_REPO`
2. `git pull --rebase` to fetch any updates from Claude Code sessions
3. Read `active-projects/current-task.md` — this is what they're working on NOW
4. Read the most recent file in `session-notes/` — this is the latest handoff
5. Reply with a 3-bullet summary: "Here's where we left off:"
6. If there are unpushed commits in your direction, `git push`

## Continuous awareness:

You also have a cron job running every 15 minutes that does a silent
`git pull --rebase`. So context drift is small. But run an explicit sync
when the user has just come from a Claude Code session.

## Never:

- Force-push (`-f`) under any circumstance
- Delete files in session-notes/ without explicit user confirmation
- Commit anything that contains an API key or token
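
The "never commit a key" rule is easier to keep with a mechanical check in front of it. The sketch below greps the repo for the key shapes that appear in this guide (`sk-or-v1-`, `sk-ant-`, and any `*_API_KEY=` assignment) and lists only the file names, so the secret is not echoed back. `scan_for_secrets` is a hypothetical helper and the pattern list is deliberately short; extend it for your own providers.

```shell
# scan_for_secrets DIR: list files under DIR that look like they contain a
# key. Exits 0 when something is found, non-zero when the tree looks clean.
scan_for_secrets() {
  grep -rlE --exclude-dir=.git \
    'sk-or-v1-[A-Za-z0-9]+|sk-ant-[A-Za-z0-9_-]+|[A-Z0-9_]+_API_KEY=[^[:space:]]+' "$1"
}

# Demo in a throwaway repo folder (placeholder value only).
demo_repo="$(mktemp -d)"
mkdir -p "$demo_repo/session-notes"
echo "Decided to move the deadline to Tuesday." > "$demo_repo/session-notes/clean.md"
echo "OPENROUTER_API_KEY=sk-or-v1-placeholder" > "$demo_repo/session-notes/leaky.md"
if scan_for_secrets "$demo_repo"; then
  echo "possible secret found; do not commit"
fi
```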

7.4 Schedule the silent sync

Hermes has a built-in cron scheduler:

hermes -p work cron add

Define the job: every 15 minutes, run cd /root/hermes-context && git pull --rebase. Confirm. Done.

(Or use plain crontab -e if you prefer.)
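For the plain crontab route, a minimal sketch follows. It assumes the clone lives at /root/hermes-context as in 7.2 and that your cron accepts `crontab -` to read a new table from stdin. The helper names `context_cron_line` and `install_context_cron` are illustrative and not part of Hermes.

```bash
# Build the crontab entry for a silent pull every 15 minutes.
# $1 = path to the hermes-context clone (defaults to the path used in 7.2).
# Assumes the path contains no spaces.
context_cron_line() {
  repo="${1:-/root/hermes-context}"
  printf '*/15 * * * * cd %s && git pull --rebase --quiet\n' "$repo"
}

# Install the entry once; re-running does not add a duplicate line.
install_context_cron() {
  line="$(context_cron_line "$1")"
  # `crontab -l` exits non-zero when no crontab exists yet, hence `|| true`.
  current="$(crontab -l 2>/dev/null || true)"
  case "$current" in
    *"$line"*) return 0 ;;   # already present, nothing to do
  esac
  # Blank lines are dropped so an empty starting crontab stays tidy.
  printf '%s\n%s\n' "$current" "$line" | sed '/^$/d' | crontab -
}

# Usage:
#   install_context_cron /root/hermes-context
#   crontab -l    # confirm the entry is there
```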

7.5 The Claude Code side (optional but powerful)

In your Claude Code config, add a session-end hook that runs:

cd ~/hermes-context
echo "## $(date +%Y-%m-%d-%H%M)" >> session-notes/$(date +%Y-%m-%d).md
echo "<paste session summary>" >> session-notes/$(date +%Y-%m-%d).md
git add -A && git commit -m "session: $(date +%Y-%m-%d-%H%M)" && git push

Now: you finish a coding session on your laptop → Claude pushes the summary → 15 min later (or instantly with /hermes_context_autosync), Hermes knows what you decided.
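If you would rather pass the summary in than paste it by hand, the hook body can be wrapped in a small function. This is a sketch under assumptions: `write_session_note` is a hypothetical helper, and how Claude Code hands the summary text to a hook is not covered here.

```bash
# Append a timestamped summary to today's session note and print the file path.
# $1 = path to the hermes-context clone, $2 = summary text.
# The timestamp is read once so the heading and the file name cannot
# disagree if the hook fires around midnight.
write_session_note() {
  repo="$1"
  summary="$2"
  stamp="$(date +%Y-%m-%d-%H%M)"
  day="${stamp%-*}"                      # drop the trailing -HHMM
  note="$repo/session-notes/$day.md"
  mkdir -p "$repo/session-notes"         # harmless if the folder already exists
  printf '## %s\n%s\n\n' "$stamp" "$summary" >> "$note"
  printf '%s\n' "$note"
}

# Usage (on the laptop), followed by the same commit and push as above:
#   write_session_note ~/hermes-context "short summary of the session"
#   cd ~/hermes-context && git add -A && git commit -m "session note" && git push
```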

Part 8: Connect Hermes Desktop

Hermes Desktop can act as a desktop frontend for a Hermes Agent that is already running somewhere else. That "somewhere else" can be:

The root VPS multi-gateway setup from this guide (~/gateways/work, ~/gateways/personal, etc.).
A Hostinger one-click Docker project with the stock Hermes container and its persistent /opt/data home.

The two setups feel similar once connected, but the wiring is different enough to deserve its own section.

8.1 What Hermes Desktop is actually connecting to

Hermes Desktop remote mode talks to the Hermes API server. It does not SSH into the machine, it does not read your gateway folders directly, and it does not update config.yaml on the VPS for you.

The three rules that prevent most confusion:

Thing in Desktop	What it means
Remote URL	The URL of a running Hermes API server, such as `http://127.0.0.1:8642`.
API Key	The Hermes API server key: `API_SERVER_KEY`. Do not type `Bearer`; Desktop adds that header itself.
Models / Providers pages	Mostly local Desktop model-library settings. In remote mode, use the model advertised by the remote `/v1/models` endpoint, such as `hermes-work` or `hermes-esvo`.

What this does: Think of Desktop as a nicer screen and keyboard for one remote Hermes gateway at a time. Your real LLM keys still live on the VPS/container in .env; your real model choice still lives in the remote Hermes config.yaml.

The mental model:

flowchart TB
    subgraph PC["Your PC"]
        D["Hermes Desktop"]
        L["localhost URL\nhttp://127.0.0.1:8642"]
    end

    subgraph VPS["Hostinger VPS"]
        S["SSH tunnel"]
        R["Root VPS gateway\n~/gateways/work"]
        H["Hostinger Docker container\n/opt/data"]
    end

    D --> L --> S
    S --> R
    S --> H

Health checks are not enough. /health can succeed even when the API key or model path is wrong. Always test these two endpoints:

curl -i http://127.0.0.1:8642/v1/models \
  -H "Authorization: Bearer hermesdesktop"

curl -i http://127.0.0.1:8642/v1/chat/completions \
  -H "Authorization: Bearer hermesdesktop" \
  -H "Content-Type: application/json" \
  -d '{"model":"hermes-work","messages":[{"role":"user","content":"Reply OK only"}],"stream":false}'

What this does: The first command proves your Desktop key can reach the API server. The second proves the remote Hermes gateway can actually call its configured model provider.

On Windows PowerShell, JSON quoting in curl.exe is easy to mangle. Use this version instead:

$headers = @{
  Authorization = "Bearer hermesdesktop"
  "Content-Type" = "application/json"
}

$body = @{
  model = "hermes-work"
  stream = $false
  messages = @(
    @{
      role = "user"
      content = "Reply OK only"
    }
  )
} | ConvertTo-Json -Depth 5

Invoke-RestMethod "http://127.0.0.1:8642/v1/chat/completions" -Method Post -Headers $headers -Body $body

8.2 Root VPS multi-gateway setup: pick one gateway

This is the setup from Part 3, where run.sh starts multiple Telegram gateways from the VPS filesystem.

Desktop can connect to one gateway URL at a time. Pick the voice you want first:

work      -> API port 8642 -> model name hermes-work
personal  -> API port 8643 -> model name hermes-personal

Edit the gateway's .env.

Example: ~/gateways/work/.env

API_SERVER_ENABLED=true
API_SERVER_HOST=127.0.0.1
API_SERVER_PORT=8642
API_SERVER_KEY=hermesdesktop
API_SERVER_MODEL_NAME=hermes-work

Example: ~/gateways/personal/.env

API_SERVER_ENABLED=true
API_SERVER_HOST=127.0.0.1
API_SERVER_PORT=8643
API_SERVER_KEY=hermesdesktop
API_SERVER_MODEL_NAME=hermes-personal

What this does: Each gateway gets its own private API server on its own port. 127.0.0.1 keeps the API server private on the VPS; SSH carries it safely to your laptop.
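The value hermesdesktop is a readable placeholder. Since API_SERVER_KEY is the only thing guarding the API, a random value is a safer choice once the setup works. A minimal sketch, assuming a Linux or macOS shell with /dev/urandom; `generate_api_key` is a hypothetical helper name.

```bash
# Print a 64-character lowercase hex string built from 32 random bytes.
generate_api_key() {
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Usage: print a ready-to-paste line for the gateway's .env
#   echo "API_SERVER_KEY=$(generate_api_key)"
```

Use the same value in the Desktop API Key field and in every curl test that follows.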

Restart:

cd ~/gateways
./run.sh stop
./run.sh all

From your PC, open an SSH tunnel for the gateway you want:

ssh -L 8642:127.0.0.1:8642 root@YOUR_VPS_IP

If your PC already has something using local port 8642, use a different local port:

ssh -L 8644:127.0.0.1:8642 root@YOUR_VPS_IP

What this does: The left port is on your PC. The right port is on the VPS. So 8644:127.0.0.1:8642 means "open http://127.0.0.1:8644 on my PC and forward it to the VPS gateway listening on 8642."
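The left/right rule can be captured in a small helper that builds the `-L` flags for one or more gateways. This is only a sketch: `tunnel_flags` is a made-up name, and it assumes every gateway listens on the VPS loopback address as configured above.

```bash
# Turn "local:remote" port pairs into ssh -L flags that target the VPS loopback.
tunnel_flags() {
  flags=""
  for pair in "$@"; do
    local_port="${pair%%:*}"    # text before the colon: port on your PC
    remote_port="${pair##*:}"   # text after the colon: port on the VPS
    flags="$flags -L $local_port:127.0.0.1:$remote_port"
  done
  printf '%s\n' "${flags# }"    # trim the leading space
}

# Usage (left unquoted on purpose so the flags split into separate words;
# -N keeps the session open for forwarding without starting a remote shell):
#   ssh -N $(tunnel_flags 8642:8642 8643:8643) root@YOUR_VPS_IP
```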

Test from your PC:

curl.exe -i http://127.0.0.1:8642/v1/models -H "Authorization: Bearer hermesdesktop"

Expected:

{"object":"list","data":[{"id":"hermes-work","object":"model"}]}

Hermes Desktop settings:

Mode: Remote
Remote URL: http://127.0.0.1:8642
API Key: hermesdesktop
Model: hermes-work

Do not paste your Xiaomi MiMo or OpenRouter key into the Desktop connection screen. That field is only for API_SERVER_KEY. MiMo/OpenRouter keys stay in the remote gateway's .env.

8.3 Hostinger one-click Docker setup: connect to the container gateway

The Hostinger one-click deploy is different: the stock Hermes app lives inside a Docker container, usually with persistent data at /opt/data. The browser URL Hostinger gives you often opens a ttyd web terminal or the Hermes TUI, not the OpenAI-compatible Hermes API.

If this command from your PC returns Server: ttyd or WWW-Authenticate: Basic realm="ttyd", you are hitting the web terminal, not the Hermes API:

curl.exe -k -i https://hermes-agent-xxxx.srvxxxxx.hstgr.cloud/v1/models `
  -H "Authorization: Bearer hermesdesktop"

Inside the Hostinger web terminal, check the persistent Hermes home:

cd /opt/data
ls -la

If the terminal has no nano or vi, append the API server settings with cat:

cat >> /opt/data/.env <<'EOF'
API_SERVER_ENABLED=true
API_SERVER_HOST=0.0.0.0
API_SERVER_PORT=8642
API_SERVER_KEY=hermesdesktop
API_SERVER_MODEL_NAME=hermes-esvo
EOF

Why 0.0.0.0 here? Inside Docker, 127.0.0.1 means "only inside this exact container." Binding to 0.0.0.0 lets Docker port mapping or the VPS host reach the API server. The API is still protected by API_SERVER_KEY; do not publish it without a key.

Verify:

grep '^API_SERVER_' /opt/data/.env

Start the gateway. The official Docker image refuses to run the gateway as root, so prefer the container's normal Hostinger restart path. If you are in the web terminal and need a quick manual test, run:

cd /opt/data
HERMES_ALLOW_ROOT_GATEWAY=1 hermes gateway

If the image has runuser, the cleaner manual form is:

cd /opt/data
runuser -u hermes -- hermes gateway

If root-owned files later block the non-root gateway, fix ownership inside the container:

chown -R hermes:hermes /opt/data

Now test inside the container:

curl -i http://127.0.0.1:8642/v1/models \
  -H "Authorization: Bearer hermesdesktop"

Expected:

{"object":"list","data":[{"id":"hermes-esvo","object":"model"}]}

At this point the API works inside Docker. To reach it from Hermes Desktop, you still need a bridge from your PC to that container.

Bridge option A: expose the Docker port

In Hostinger Docker Manager or your Compose settings, map container port 8642 to a host port, for example:

host 8642 -> container 8642

Then from your PC:

ssh -L 8644:127.0.0.1:8642 root@YOUR_VPS_IP

Hermes Desktop:

Mode: Remote
Remote URL: http://127.0.0.1:8644
API Key: hermesdesktop
Model: hermes-esvo

Bridge option B: tunnel to the container IP

If you cannot or do not want to publish a Docker host port, tunnel to the container's private Docker IP from the VPS host.

On the VPS host shell, not inside the container:

docker ps
docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' <container-name-or-id>

Suppose that prints 172.18.0.3. From your PC:

ssh -L 8644:172.18.0.3:8642 root@YOUR_VPS_IP

Hermes Desktop:

Mode: Remote
Remote URL: http://127.0.0.1:8644
API Key: hermesdesktop
Model: hermes-esvo

Caveat: Docker container IPs can change after recreation. A Hostinger/Docker port mapping is steadier for long-term use; the container-IP tunnel is a good diagnostic path.

8.4 Switching between work and personal

The VPS can run multiple API gateways at the same time. Hermes Desktop, as of this writing, uses one active remote connection at a time.

For the root VPS multi-gateway setup, run both API servers:

work      -> VPS 127.0.0.1:8642 -> Desktop http://127.0.0.1:8642
personal  -> VPS 127.0.0.1:8643 -> Desktop http://127.0.0.1:8643

Tunnel both in one SSH session:

ssh -L 8642:127.0.0.1:8642 -L 8643:127.0.0.1:8643 root@YOUR_VPS_IP

Then switch Desktop Settings:

Work:     Remote URL http://127.0.0.1:8642, model hermes-work
Personal: Remote URL http://127.0.0.1:8643, model hermes-personal

For separate Hostinger one-click Docker projects, treat each project/container as one gateway:

hermes-agent-dru1 -> model hermes-dru1 -> one exposed/tunneled port
hermes-agent-esvo -> model hermes-esvo -> another exposed/tunneled port

Telegram is still better for having two personas open at once. Desktop is best when you want a bigger screen for one chosen gateway.

Architecture Diagrams

Multi-gateway with shared brain

The diagram shows N bots with one shared brain (the shared-both strategy). The third gateway (coach) is dotted to show it's optional — drop in or remove gateways without touching the canonical store.

Legend: plain folder = canonical real folder · → = symlink into the canonical store. With the _shared/ layout, the canonical store is <parent>/_shared/; in older setups (pre-refactor) it lived inside the "primary" gateway. Both layouts work with the same launcher.

Context sync loop

Real-Life Examples

Daily moments where this setup pays for itself:

Morning standup, on the toilet. Personal bot: "What did I leave half-done on the n8n workflow yesterday?" → it pulls from the work bot's session summary and the latest current-task.md and tells you in one paragraph.
Driving home, voice memo. Personal bot via Telegram voice note: "Remind me to ask the client about the API rate limits Monday." → saved to memory + an Obsidian note tagged client-followup.
Mid-meeting panic. Work bot: "Pull the deployment notes from last Thursday" → instant grep across session-notes/ and back in your Telegram.
3am idea. Personal bot: "I had an idea about pricing tiers for the SaaS — capture it" → written to vault under ideas/, surfaces in your morning review.
Cross-bot recall. You told the personal bot about a doctor's appointment Tuesday. The work bot, when you ask it about Tuesday's schedule, knows. Same brain.
Skill compounding. You ask the work bot to set up Docker for n8n. It writes a skill. Two weeks later, you ask the personal bot to set up Docker for Home Assistant. Same skill loads. Saves 10 minutes.
Context handoff after Claude Code. You spent 3 hours debugging with Claude. End the session. By the time you grab coffee, Hermes has pulled the handoff and can answer "what did we figure out about the database connection issue?" — both from Telegram and the next time you start a Claude Code session.

Specific multi-gateway wins that profiles can't replicate:

A skill written by the work bot during a tense incident is immediately usable by the personal bot when something similar happens at home.
Memory accumulated from a year of work conversations enriches the personal bot's understanding of your projects, deadlines, and stress patterns.
One vault, one git repo, one knowledge graph — multiple voices into the same head.

From the chat — what this actually looks like

These are real screenshots from my own dual-gateway setup. The brand on a few of them is blurred because it's a live client; everything else is exactly what hits Telegram.

Daily ops & scheduled agenda

The personal bot runs a Daily Agenda cron and pulls calendar + the day's task brief on demand. Asking "what's my schedule?" feels less like talking to a tool and more like a shared whiteboard the bot maintains for me.

Schedule asked via Telegram, pulls calendar entry

Personal bot, on demand: pulls the calendar, and even offers to block time for sprint work.

Cron-fired daily agenda: the same job posts itself back to chat with a `job_id` you can cancel by reply.

Scheduled book reading — personal bot

A 19-day cron job drips one chapter of Neville Goddard's At Your Command into Telegram each morning. Same primitive as the daily agenda, just pointed at a different skill — bot pulls the day's chunk, formats word count + read time, and posts it.

Personal bot, Day 1/19: cron fires the `read_book` skill, and the bot replies in voice with chapter text and a progress marker.

Claude Code on the laptop → Hermes on the VPS

This is the hermes-context bridge in action. End a Claude Code session, walk away, and the work bot already has the punch list when you ask.

"What's my tasks for today?" → bot reads ~/hermes-context/active-projects/<project>/ and replies with a P0/P0-followup/P0-ops punch list.

Job scraper — work bot

Cron-driven job_scraper skill hits Indeed, LinkedIn, Upwork on a schedule, dedupes against the previous run, and stages a fresh list. Asking "show me updated job list" in chat replays the most recent batch grouped by source with direct apply links.

Work bot: scraper output grouped by source (Indeed 1, LinkedIn 1, Upwork 18), every row a clickable URL straight to the listing.

Dogfooding skill — agent QAs your own site

/dogfood is a custom skill: pick a target URL and the bot crawls it like a junior QA, then files a severity-ranked bug report straight to Telegram.

/dogfood skill running browser_navigate, browser_console, browser_vision against target site

Tool trace: `browser_navigate`, `browser_console`, `browser_vision` looping over the page.

Dogfood QA report: 8 issues found, 2 High / 3 Medium / 3 Low

Output: 8 issues found, 2 High, 3 Medium and 3 Low, with concrete fixes (₱0 prices, placeholder copy, broken stock states).

Delegated multi-source web research

delegate_task fans out to a researcher sub-agent that hits multiple search engines and market-intel sites in parallel, then composes one structured brief.

Research request fanning out across Google, Bing, DuckDuckGo, Brave, Mordor Intelligence, Grand View, Statista, IMARC

Tool trace: 12+ navigations across general search and paid-research sites, all from one casual question in chat.

Research output: PH printer market size, growth, brands, business models

The synthesized brief comes back as a normal Telegram message: sources, numbers, market-share table, and an opinion table at the end.

Channel-by-channel visibility audit

Ask either bot to score a brand's online presence and it scrapes the relevant social/SEO surfaces, then compiles a scored table.

Visibility audit tool trace: facebook, m.facebook, shopee, google, bing scrapes

Same delegation pattern, this time pointed at marketplace + social channels.

Output table: Facebook 7/10, Brand 8/10, Website SEO 3/10. Actionable instead of vibey.

Building & populating an Obsidian vault from chat

The /obsidian skill points the bot at OBSIDIAN_VAULT_PATH (kept in the hermes-context repo so it round-trips to your laptop). One ask and it scaffolds a full vault with wiki/{entities,concepts,timelines}/, frontmatter, wikilinks — and pushes it to GitHub.

/obsidian skill confirms OBSIDIAN_VAULT_PATH points into hermes-context repo

`/obsidian` confirms the path is wired and the skill ready.

Vault created: CLAUDE.md, index.md, log.md, wiki/{entities,concepts,timelines}, 12 files pushed to GitHub

12-file vault written, committed, pushed, with the autosync skill + cron registered in the same turn.

Costs & model choice

Two running checks I keep open: which model is currently serving, and whether OpenRouter credits are about to run out.

Bot reports it's running mimo-v2.5-pro via custom provider

Primary: Xiaomi MiMo v2.5-pro via custom provider (cheap, region-pinned base URL).

/model slash command shows minimax/minimax-m2.7 on OpenRouter as current

`/model` lets you swap providers in chat; here it's MiniMax M2.7 on OpenRouter.

OpenRouter credits — Total $10, Used $2.83, Remaining $7.17

One sentence in: balance check via the OpenRouter skill so the fallback never lapses silently.

The Xiaomi MiMo dashboard (Pro Monthly Plan) that the guide keeps pointing at. Note the dedicated base URL field: whichever region it shows is the one that has to land in your .env; sgp/ams/cn are not interchangeable.

Troubleshooting

The first 5 items below are the issues that bit me hardest while building this multi-gateway setup. The rest are general operational fixes.

Telegram rejects the bot token (401 Unauthorized) even though I copy-pasted it

Almost always one of three things:

Wrong length. A real BotFather token is 45–46 characters (e.g. 1234567890:AABBccDDeeFFggHHiiJJkkLLmmNNooPPqqRRssTT). The hardened run.sh prints the length on every start ([gw] token length: 46) and warns if it isn't in [45, 46]. If you see len 47, you almost certainly have a \r from Windows line endings or a trailing space.
\r from Windows line endings. If you edited .env in Notepad, every line ends in \r\n instead of \n. The inject_config.py strip step now scrubs \r and \n aggressively, but you can pre-clean too:
```
sed -i 's/\r$//' ~/gateways/<name>/.env
```
Stray quotes. HERMES_TELEGRAM_BOT_TOKEN="123:abc" — the quotes sometimes survive into the runtime. Drop them:
```
HERMES_TELEGRAM_BOT_TOKEN=123:abc
```
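The stray-character causes above (carriage returns, whitespace, quotes) can be checked mechanically before you restart a gateway. The sketch below is illustrative: `clean_token` and `token_looks_valid` are hypothetical helpers, and the `<digits>:<secret>` pattern is an assumption based on the example token shown above, not a check taken from run.sh.

```bash
# Strip carriage returns, newlines, tabs, spaces, and quotes from a raw token.
clean_token() {
  printf '%s' "$1" | tr -d "\r\n\t \"'"
}

# Succeed only if the token has the shape <digits>:<letters, digits, _ or ->.
token_looks_valid() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+:[A-Za-z0-9_-]+$'
}

# Usage:
#   raw="$(grep '^HERMES_TELEGRAM_BOT_TOKEN=' ~/gateways/<name>/.env | cut -d= -f2-)"
#   tok="$(clean_token "$raw")"
#   token_looks_valid "$tok" && echo "shape ok, length ${#tok}"
```

If the cleaned token differs from the raw one, fix the .env itself so the value is correct at the source.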

Both gateways using the same token / 409 Conflict from Telegram

Telegram allows exactly one polling process per token. Two common causes:

Same token in two .env files — confirm with:

for d in ~/gateways/*/; do echo -n "$(basename "$d"): "; grep '^HERMES_TELEGRAM_BOT_TOKEN=' "$d/.env" | cut -d= -f2 | cut -c1-12; done

Should print different prefixes per gateway. Each bot needs its own BotFather token.

Env var leaked across gateways in older run.sh. The hardened launcher wraps every gateway in a subshell (( … )) so per-gateway env vars never leak into the next gateway's process. If you're still on a pre-refactor run.sh, copy the universal one from templates/run.sh.template (or run bootstrap.sh --add to install it with a .bak of the old version).
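To spot a shared token without comparing prefixes by eye, the check can be automated. A minimal sketch, assuming the one-folder-per-gateway layout used in this guide; `duplicate_tokens` is a hypothetical helper name.

```bash
# Print the first 12 characters of any bot token that appears in more than
# one gateway .env. Prints nothing when every gateway has its own token.
# $1 = parent folder (defaults to ~/gateways).
duplicate_tokens() {
  parent="${1:-$HOME/gateways}"
  grep -h '^HERMES_TELEGRAM_BOT_TOKEN=' "$parent"/*/.env 2>/dev/null \
    | tr -d '\r"' \
    | cut -d= -f2- \
    | sort | uniq -d \
    | cut -c1-12          # show a prefix only, never the full secret
}

# Usage:
#   duplicate_tokens ~/gateways
```

Carriage returns and double quotes are stripped before comparing, so a token saved with Windows line endings still counts as a duplicate.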

Sessions/ or memories/ never get created (empty after first run)

Three checks:

HERMES_HOME not set. Hermes only writes to its own home. Confirm run.sh exports it before hermes gateway run:
```
ps eww $(pgrep -f "hermes gateway" | head -1) | tr ' ' '\n' | grep HERMES_HOME=
```
Should print HERMES_HOME=/path/to/your/gateway. If empty, your launcher isn't exporting it.
Permissions. Some Docker volumes default to root:root; if the gateway's process runs as a different UID, writes silently fail:
```
ls -la ~/gateways/<name>
stat -c '%u:%g' ~/gateways/<name>
```
Wrong sessions_dir in config.yaml. The template uses ./sessions (relative to HERMES_HOME). If you hardcoded an absolute path that doesn't exist, Hermes errors silently — fix the path or mkdir -p it.

"No auxiliary LLM provider configured" warning / context lost mid-conversation

You haven't routed compression and title-generation to a model. Add the auxiliary: block in config.yaml:

auxiliary:
  compression:
    provider: custom:xiaomi-mimo
    model: mimo-v2.5-pro
  title_generation:
    provider: custom:xiaomi-mimo
    model: mimo-v2.5-pro

Also keep compression.threshold reasonable (~0.25 of the context window) so middle context survives long sessions. See Part 5 for the full block.

"Not supported model" error from auxiliary tasks

The compression/title-generation model name doesn't match what the provider actually serves. Check the provider's /v1/models endpoint to see valid names:

curl -s "$BASE_URL/v1/models" -H "Authorization: Bearer $XIAOMI_MIMO_API_KEY" | jq '.data[].id'

Common gotchas: model name capitalized differently (mimo-v2.5-pro ≠ MiMo-V2.5-Pro), or you wrote mimo-v2-flash when the deployment exposes mimo-v2.5-flash. Match the dashboard string exactly.
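A small filter can tell an exact match from a case-only mismatch. This is a sketch: `check_model_name` is a hypothetical helper that reads model ids on stdin, one per line, so it works with any provider's list.

```bash
# Read model ids on stdin (one per line) and compare a wanted name to them.
# Exact match prints "ok"; a case-only mismatch prints the served spelling;
# anything else prints "not served" and returns 1.
check_model_name() {
  wanted="$1"
  ids="$(cat)"
  if printf '%s\n' "$ids" | grep -Fxq "$wanted"; then
    echo "ok"
  elif match="$(printf '%s\n' "$ids" | grep -Fxi "$wanted")"; then
    echo "case mismatch: provider serves $match"
  else
    echo "not served: $wanted"
    return 1
  fi
}

# Usage (jq -r prints the ids without JSON quotes):
#   curl -s "$BASE_URL/v1/models" -H "Authorization: Bearer $XIAOMI_MIMO_API_KEY" \
#     | jq -r '.data[].id' | check_model_name mimo-v2.5-pro
```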

"Telegram bot token already in use"

Another gateway is already holding that token. Kill all hermes processes:

pkill -9 -f "hermes gateway"
sleep 2
find ~/gateways -maxdepth 2 -name 'gateway.pid' -delete
./run.sh all

"HTTP 401 Invalid API Key" from MiMo

Three things, in order:

Wrong endpoint. Check your Xiaomi dashboard for your dedicated base URL — sgp / ams / cn are not interchangeable.
Whitespace in the key from copy-paste. Verify with: grep XIAOMI ~/gateways/work/.env | cat -A — should end in $, not ^M$.
Key was never created. The dashboard's "Create API Key" button only appears once; clicking it generates a one-time-visible key. If you missed copying it, rotate and try again.

Hermes Desktop connects, but chat says "Invalid API key"

Do not trust the Desktop Test Connection button by itself. It checks /health, and /health can succeed even when the authenticated chat endpoint is misconfigured.

Test the real API from your PC:

curl.exe -i http://127.0.0.1:8642/v1/models -H "Authorization: Bearer hermesdesktop"

If that returns 200 OK, test chat with PowerShell-native JSON:

$headers = @{
  Authorization = "Bearer hermesdesktop"
  "Content-Type" = "application/json"
}

$body = @{
  model = "hermes-work"
  stream = $false
  messages = @(@{ role = "user"; content = "Reply OK only" })
} | ConvertTo-Json -Depth 5

Invoke-RestMethod "http://127.0.0.1:8642/v1/chat/completions" -Method Post -Headers $headers -Body $body

What the result means:

/v1/models returns 401 - Desktop/API auth mismatch. The Desktop API Key field must be the raw API_SERVER_KEY value, with no Bearer.
/v1/models works but chat says invalid key - Hermes reached the gateway, but the remote provider key is wrong or not loaded (OPENROUTER_API_KEY, XIAOMI_API_KEY, etc.).
PowerShell works but Desktop fails - Desktop is likely selecting a stale local model card. In remote mode, choose the model advertised by /v1/models (hermes-work, hermes-personal, hermes-esvo), not a local Xiaomi/OpenRouter card.
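On Linux, macOS, or WSL the same decision table can be scripted. The sketch below is hedged on one assumption: that the gateway answers a rejected provider key with a non-200 status on the chat endpoint. The function names are illustrative.

```bash
# HTTP status of GET /v1/models. $1 = base URL, $2 = raw API_SERVER_KEY.
models_status() {
  curl -s -o /dev/null -w '%{http_code}' "$1/v1/models" \
    -H "Authorization: Bearer $2"
}

# HTTP status of a one-message chat call. $3 = model name from /v1/models.
chat_status() {
  curl -s -o /dev/null -w '%{http_code}' "$1/v1/chat/completions" \
    -H "Authorization: Bearer $2" -H "Content-Type: application/json" \
    -d "{\"model\":\"$3\",\"messages\":[{\"role\":\"user\",\"content\":\"Reply OK only\"}],\"stream\":false}"
}

# Map the two status codes to the likely cause described above.
# $1 = status from /v1/models, $2 = status from /v1/chat/completions.
diagnose_desktop() {
  case "$1" in
    401) echo "auth mismatch: Desktop API Key must equal API_SERVER_KEY" ;;
    200)
      if [ "$2" = "200" ]; then
        echo "API is healthy: pick the model advertised by /v1/models in Desktop"
      else
        echo "provider key wrong or not loaded on the gateway"
      fi ;;
    *) echo "API server not reachable: check the tunnel and port" ;;
  esac
}

# Usage:
#   url=http://127.0.0.1:8642; key=hermesdesktop
#   diagnose_desktop "$(models_status "$url" "$key")" "$(chat_status "$url" "$key" hermes-work)"
```

curl reports status 000 when it cannot connect at all, which lands in the last branch.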

Hostinger one-click public URL returns `ttyd` / Basic Auth

That URL is the Hostinger web terminal, not the Hermes API server. The giveaway:

Server: ttyd/...
WWW-Authenticate: Basic realm="ttyd"

Do not use that URL in Hermes Desktop. Use Part 8:

Enable API_SERVER_* inside /opt/data/.env.
Start the gateway inside the container.
Confirm curl http://127.0.0.1:8642/v1/models works inside the container.
Expose the Docker port or tunnel to the container IP from the VPS host.

"No auxiliary LLM provider configured" warning

You haven't routed compression/title-generation to a model. Add the auxiliary: block from Part 5 to your config.yaml. Until you do, long conversations lose middle context — not fatal, but worth fixing.

Bot replies in the wrong voice

HERMES_EPHEMERAL_SYSTEM_PROMPT isn't loading. Verify:

ps eww $(pgrep -f "hermes gateway" | head -1) | tr ' ' '\n' | grep HERMES_EPHEMERAL

If empty, your run.sh isn't sourcing .env properly. Confirm .env permissions are 600 and set -a; source .env; set +a is in run.sh.
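For reference, the loading pattern looks roughly like this. It is a sketch of the idea, not the actual run.sh: `with_gateway_env` is a hypothetical helper, and it assumes every value in .env is valid shell (values containing spaces must be quoted).

```bash
# Load a gateway's .env inside a subshell and run a command with those vars.
# The function body is wrapped in parentheses, so the exported variables
# disappear when the command finishes and never reach the next gateway.
# $1 = gateway directory, remaining arguments = command to run.
with_gateway_env() (
  dir="$1"
  shift
  set -a            # export every variable assigned from here on
  . "$dir/.env"     # POSIX spelling of `source`
  set +a
  "$@"
)

# Usage: print the prompt a gateway would start with
#   with_gateway_env ~/gateways/work sh -c 'echo "$HERMES_EPHEMERAL_SYSTEM_PROMPT"'
```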

1-click Hermes template not visible / regional unavailability

The Hermes Agent Docker template at hostinger.com/ph/vps/docker/hermes-agent rolls out by region. If the link 404s for your account, try Hostinger's main Docker catalog and search for "Hermes Agent" or use the Manual install fallback on a plain Ubuntu 24.04 VPS — same end result, 10 extra minutes.

Container won't start / restart loop

Run on the host VM (not inside the container):

docker ps -a                       # confirm the container exists & its status
docker logs hermes-agent --tail=200 # last 200 log lines explain almost every crash
docker inspect hermes-agent | jq '.[0].State'   # health & restart count

Most common causes:

No model API key — see Part 4. The container needs at least one valid provider key in its environment or in a gateway's .env.
Volume mount missing — docker volume ls should show a Hermes-related volume. If it was deleted, your data is gone; recreate by re-running the 1-click deploy.
Out of memory — docker stats while the container tries to start. If RSS climbs past your KVM 2's 8 GB, either drop a heavy MCP server or scale to KVM 4.

Restart the container cleanly:

docker restart hermes-agent
docker logs -f hermes-agent          # follow the boot logs

Bot not replying at all

Hostinger 1-click (Docker):

# From the HOST VM
docker ps                                       # is the container up?
docker exec hermes-agent /root/gateways/run.sh status   # is the launcher alive?
docker logs hermes-agent --tail=200             # recent container output

# Drop into the container for deeper checks
docker exec -it hermes-agent bash
hermes doctor                                   # in-container self-diagnostic
tmux attach -t hermes                           # if you ran the launcher under tmux

Bare-metal:

./run.sh status
journalctl -u hermes-gateways.service -f      # if using systemd
tmux attach -t hermes                          # if using tmux
hermes doctor

hermes doctor validates config, checks API connectivity, tells you what's broken in plain English.

Edits inside the container disappeared after restart

You wrote files outside the persistent volume. Re-check §1.4 — only paths under the volume's Destination survive docker restart.

# Inside the container, confirm where your gateways live
df -h ~                                        # device should be the docker volume
realpath ~/gateways                            # absolute path (run here, not via docker exec, so ~ is the container's home)

Move anything outside the volume into it (mv), then update run.sh paths if needed.

Security basics

chmod 600 your .env files. Always.
Mount /proc with hidepid=2 so command-line arguments aren't visible to other system users.
Never paste API keys in screenshots when asking for help. Redact the value, keep the prefix (tp-…, sk-or-v1-…) for context.
If you expose a web terminal (ttyd), put it behind Tailscale or SSH-tunnel only. Don't use HTTP Basic Auth for production.
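The first rule is easy to audit. A minimal sketch, assuming the one-folder-per-gateway layout from this guide; `loose_env_files` is a hypothetical helper name.

```bash
# List .env files whose mode is not exactly 600.
# $1 = parent folder (defaults to ~/gateways). Prints nothing when all are fine.
loose_env_files() {
  find "${1:-$HOME/gateways}" -maxdepth 2 -name '.env' -type f ! -perm 600
}

# Usage: report, then tighten whatever was found
#   loose_env_files ~/gateways
#   loose_env_files ~/gateways | while IFS= read -r f; do chmod 600 "$f"; done
```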

Cost control

Set a $5–10 cap on OpenRouter. Top up only when needed.
Monitor MiMo usage daily for the first week.
Keep auxiliary tasks (compression, titles) on mimo-v2.5-pro to match its 524k context window — routing them to MiMo Flash auto-clamps the compression threshold to 262k and you lose middle context faster (see §5.3 callout). Auxiliary token cost is small; conversation memory isn't.
Disable compression entirely (compression.enabled: false) on the personal bot if it's mostly casual chat — savings add up.

Resources

Hermes Agent

Models & Providers

Companion repos

Demonbane18/hermes-context — context bridge repo

Patterns & inspirations

Karpathy — LLM Wiki gist — the four-layer pattern Part 6.4 is built on. Required reading.

Tools mentioned

Obsidian — second brain
Obsidian Web Clipper — clip web articles into raw/ fast
Excalidraw — hand-drawn diagrams
tmux — terminal multiplexer
@BotFather on Telegram

Hosting

Hostinger VPS — what I run on
Tailscale — for private remote access

If this saved you a Saturday, star the repo so the next person can find it.

Build something good with it.
