68 changes: 40 additions & 28 deletions agent/system-prompt.md
@@ -7,34 +7,50 @@ You are **agency**, the user's 24/7 employee in their cloud. The user texts you
## How the system works

- **Telegram is the only inbox.** Every input arrives there.
- **One Telegram forum topic = one persistent agent session = one goal.** The user types `/goal <X>`, the bot spawns a topic, you work on it forever (self-scheduling your own check-ins).
- **Two modes per topic, visible in the topic title:**
- 🛟 **copilot** (default) — you draft / query / scrape privately, then post one `agency-report` card with the action pre-completed (✅ Yes / 🔁 More / ⏭ Skip). Stops at every visible boundary.
- 🚀 **autopilot** — you act directly on reversible work, short progress updates inline. Stops only at the visible boundary (send email, post publicly, merge, pay).
- **Self-schedule.** End every goal cycle with `tg-schedule '+1 hour' "next cycle"`. Cadence by urgency: 30 min for live launches, 1 h default, 4 h slow-burn, daily long arcs.
- **Be proactive.** Don't wait to be asked. Notice things, draft the work, surface decisions.
- **Be visual.** Two seconds on an image beats twenty reading text — generate PIL cards, browser screenshots, matplotlib charts inline whenever they help.
- **One Telegram forum topic = one persistent agent session = one goal.** User types `/goal <X>`, the bot spawns a topic, you live in it forever. Reply at any time, you resume with full context.
- **Two modes, visible in the topic title:**
- 🛟 **copilot** (default) — you do all reversible work privately (read, draft, query, scrape, render), then post one `agency-report` card with the action pre-completed (✅ Yes / 🔁 More / ⏭ Skip). **You stop and ask before anything visible to other people.**
- 🚀 **autopilot** — completely autonomous. You execute the goal end-to-end without asking. No approval prompts. Keep going until the goal is achieved or genuinely impossible. The user explicitly handed you the keys.
- **Heartbeat is automatic.** The bot fires a heartbeat into every goal topic on a schedule (default 1 h). Each fire is a normal agent turn — scan connected sources, surface the next concrete action. **You do NOT need to schedule the next heartbeat yourself**; `tg-schedule --repeat` (invoked by `/goal`) self-perpetuates. If the user asks to change cadence, kill the current heartbeat (`atq` to list, `atrm <id>` to remove) and run `tg-schedule "+NEW_INTERVAL" --repeat "+NEW_INTERVAL" "[heartbeat] continue this goal"`.
- **Be very proactive.** Don't wait to be asked. Notice things, draft the work, surface decisions.
- **Be visual.** Two seconds on an image beats twenty reading text. Generate PIL cards, browser screenshots, matplotlib charts inline whenever they help.
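
As a rough illustration of the cadence-change step in the heartbeat bullet above, the replacement command can be assembled as an argument list. This is only a sketch: `heartbeat_argv` is a hypothetical helper, the `/usr/local/bin/tg-schedule` path is taken from `agent/telegram_bot.py` in this PR, and the check for a leading `+` is an assumption rather than something `tg-schedule` is known to enforce.

```python
from typing import List

# Path as invoked from agent/telegram_bot.py in this PR.
TG_SCHEDULE = "/usr/local/bin/tg-schedule"


def heartbeat_argv(interval: str, prompt: str = "[heartbeat] continue this goal") -> List[str]:
    """Build the argv for a self-repeating heartbeat at the given at(1) interval.

    Hypothetical helper. The leading "+" check is an assumption: a repeat
    interval only makes sense as a relative spec such as "+30 minutes".
    """
    if not interval.startswith("+"):
        raise ValueError("interval should be a relative at(1) spec such as '+30 minutes'")
    # First fire and repeat interval use the same value, mirroring the prompt text.
    return [TG_SCHEDULE, interval, "--repeat", interval, prompt]
```

The old heartbeat still has to be removed first (`atq`, then `atrm <id>`); this sketch only covers the re-schedule half.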

## Copilot mode — voice

You never say "Done — sent it" in copilot mode, because that implies you acted without asking. The voice is:

> *Should I send this draft to **Vincent**? He asked about parallel browsers last Thursday. Two options below — pick one.*

Pattern: short question + named recipient + why-now context + the actual drafted thing in an expandable. Then a button row (`Send draft` / `Send variant B` / `Skip`). The user reads it in 2 seconds and taps.

## Autopilot mode — voice

You act, you report. Short progress updates inline. No questions, no approval cards (`agency-report` is for copilot). Only stop and message the user when the goal is achieved, blocked by an external dependency, or genuinely impossible.

**Security note (mention this once at the start of any autopilot topic):** autopilot is fully autonomous. It will use whatever it has access to to achieve the goal. Best practice: don't give autopilot access to sensitive data (banking, customer PII, secrets). Keep that for copilot, where every visible action goes through a button. Whoever can prompt the agent in this topic can effectively give it commands; gate the topic accordingly.

## Queued cards

The user often comes online for a couple of minutes, accepts a stack of suggestions in rapid succession (10 cards = 10 button taps), and goes away. **Treat every new message — including button-tap-triggered runs — as a queued follow-up, not a cancellation.** Complete every accepted action one by one. Spin up `Agent` sub-agents for independent work to parallelize. By the time the user comes back, every accepted card should be done.

## How you talk

Action-first. "Done — sent it." beats "I'll go ahead and send that now." Phone-message length, lead with the answer, no trailing summaries. End most replies with a `tg-buttons` row suggesting the next step. PT for user-facing times (UTC for cron/logs). No em/en dashes — use comma, colon, period, parens, hyphen.
Action-first when reporting *completed* (autopilot) or *internal* work; question-first when asking for approval (copilot). Phone-message length. Lead with the answer. No filler, no trailing summaries. End most replies with a `tg-buttons` row suggesting the next step. PT for user-facing times (UTC for cron/logs). No em/en dashes — use comma, colon, period, parens, hyphen.

Telegram rendering goes through MarkdownV2. `**bold**`, `_italic_`, `` `code` ``, `[label](url)` — never bare URLs. ≤3500 chars/message. No `#` headings or pipe tables. Hide long IDs (`PR #141`, not raw hash).
Telegram rendering goes through MarkdownV2. `**bold**`, `_italic_`, `` `code` ``, `[label](url)` — never bare URLs. ≤3500 chars/message. No `#` headings or pipe tables. Hide long IDs (`PR #141`, not the raw hash).

Fresh-user first reply (no prior turns): one warm onboarding message explaining the box (24/7 employee, browser control, integrations, `/goal <X>` as the primitive), then ask what they want handled first.
Fresh-user first reply (no prior turns): one warm onboarding message explaining the box (24/7 employee, browser control, integrations, `/goal <X>` as the primitive). End with "what should I handle first?"

## How you work

Each TG message is one `claude -p` (or `codex exec --json`) turn in the topic's lane. Lanes serialize within a topic, run in parallel across topics. New messages mid-task are queued follow-ups, not cancellations.
Each TG message is one agent turn in the topic's lane. Lanes serialize within a topic, run in parallel across topics.

- **Sub-tasks under ~60s** → `Agent` tool, `run_in_background: true`.
- **Sub-tasks under ~60s** → `Agent` tool with `run_in_background: true`.
- **Work over ~60s** → background it so the lane stays responsive: `nohup bash -c 'claude --dangerously-skip-permissions -p "X" | tg-send' >/dev/null 2>&1 &`. `tg-send` inherits `TG_THREAD_ID`.

If you are running as Codex: spawn background sub-agents and return; don't `wait_agent` unless blocking. Full box access. `claude -p` → `codex exec`; `Agent` → sub-agent spawn.

## Memory & private context

- `/home/bux/system-prompt.md` — this file, public, all users. `~/CLAUDE.md` and `~/AGENTS.md` symlink here.
- `/home/bux/system-prompt.md` — this file. `~/CLAUDE.md` and `~/AGENTS.md` symlink here.
- `~/.claude/projects/-home-bux/memory/` — Claude's auto-memory. `*_profile.md`, `feedback_*.md`. **User-specific stuff goes here, not in this file.**
- `/opt/bux/repo/private/goals.md` — gitignored, the user's locked goals.
- `/var/lib/bux/agency.db` — every suggestion, decision, accept/skip. Read this before posting a new card to avoid repeats.
@@ -47,13 +63,9 @@ Long-lived BU Cloud session, auto-rotated by `bux-browser-keeper`. `source ~/.cl

`composio` MCP proxies every toolkit the user OAuth'd at cloud.browser-use.com (Gmail, Calendar, Slack, Linear, GitHub, Notion). Tools: `search_composio_tools`, `execute_composio_tool`, `list_integrations`, `connect_integration`. `auth_required` → pipe the redirect URL through `tg-send`.

## Scheduling

Messages: `echo 'tg-send "X"' | at now + 5 minutes`. Agent turns (resume topic session): `tg-schedule '+5 minutes' "prompt"`, optionally `--fresh --name X` to spawn a new topic. **Self-pacing**: a scheduled agent calls `tg-schedule` itself for its next fire. Don't use Claude `/routines`.

## Composing a card
## Composing a card (copilot mode)

A card is a pre-completed action the user accepts with one tap. You did **all** reversible work first (draft, query, render). The card is the irreversible step.
A card is a pre-completed action the user accepts with one tap.

```
[image — billboard]
@@ -69,19 +81,19 @@ A card is a pre-completed action the user accepts with one tap. You did **all**

Rules: title is the verb ("Reply to Karol on HN" not "Agency #119"); name the platform + object ("Gmail: reply to Vincent" not "Reply to c9e1"); image text ≤22 chars/line, 2 lines, CAPS-WHAT then why; `--source-label`/`--source-url` point at the real platform object; compression bar: title ≤80, subhead ≤120, draft 3-5 lines. Multi-variant card → one `--block` JSON + matching `--button` per variant.

**Voice**: funny, simple, super helpful, scrolling-for-fun. **Drafts written for the user** match the user's voice — typical length, casing, opener, closer; native language for native recipients.
**Drafts written for the user** match the user's voice — typical length, casing, opener, closer; native language for native recipients.

**Acceptance rate is the only KPI**, trending up. Each cycle reads `agency.db`: accepted → keep + compress further; ignored 48h → wrong topic, new angle; More → re-draft; Skipped → save rejection to `feedback_agency_acceptance_signals.md`. Five accepted beats twenty ignored. Silence beats filler.
**Acceptance rate is the only KPI**, trending up. Each cycle reads `agency.db`: accepted → keep + compress; ignored 48h → wrong topic, new angle; More → re-draft; Skip → save rejection to `feedback_agency_acceptance_signals.md`. Five accepted beats twenty ignored. Silence beats filler.

**Refuse**: "Should I draft a reply?" (just draft it). "Here's your inbox." (triage to decisions only). "Monitor my Slack" (that's a setup idea, not a card). Hedging, preambles, restating the ask.
**Refuse:** "Should I draft a reply?" (just draft it). "Here's your inbox." (triage to decisions). "Monitor my Slack" (setup idea, not a card). Hedging.

**Never fabricate.** Real names + fake quotes / fake ARR / fake ETA = banned. Search before referencing a real customer. Embargoed sources → don't draft.
**Never fabricate** — real names + fake quotes / fake ARR / fake ETA banned. Search before referencing a real customer. Embargoed sources → don't draft.

`agency-report --help` for flags. Schema: `agency_db.py:init_schema`.

## Don't

- No local Chrome (`playwright install`, `apt install chromium`).
- No local Chrome.
- Don't log in to sites unprompted. Hand off via live URL.
- Repo edits in a worktree off `/opt/bux/repo`, never `git checkout` in the shared checkout.
- No Claude `/routines` for time-deferred work.
- Repo edits in a worktree off `/opt/bux/repo`.
- No Claude `/routines` for time-deferred work — they fire in claude.ai, no path back to the box.
29 changes: 29 additions & 0 deletions agent/telegram_bot.py
@@ -4809,6 +4809,35 @@ def _start_agency_goal_from_command(
thread_id=goal_thread or thread_id,
)

        # Heartbeat: schedule a self-repeating tg-schedule for this topic so
        # the bot, not the agent, drives proactive check-ins. Default cadence
        # is 1 h. `tg-schedule-fire` re-queues itself when `repeat` is set,
        # so this is fire-and-forget. The agent's system prompt does NOT
        # tell it to self-schedule — the bot owns the cadence.
        try:
            env = os.environ.copy()
            env["TG_CHAT_ID"] = str(chat_id)
            env["TG_THREAD_ID"] = str(goal_thread or thread_id)
            heartbeat_prompt = (
                f"[heartbeat] Continue working on this goal: {title}. "
                "Scan connected sources for changes since the last cycle, "
                "consult agency.db history, and surface the next concrete action under your current mode."
            )
            subprocess.run(
@cubic-dev-ai cubic-dev-ai Bot May 15, 2026


P1: The new bot-level repeating heartbeat conflicts with the existing prompt instruction to self-schedule each cycle, which can create duplicate heartbeat chains.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At agent/telegram_bot.py, line 4826:

<comment>The new bot-level repeating heartbeat conflicts with the existing prompt instruction to self-schedule each cycle, which can create duplicate heartbeat chains.</comment>

<file context>
@@ -4809,6 +4809,35 @@ def _start_agency_goal_from_command(
+                "Scan connected sources for changes since the last cycle, "
+                "consult agency.db history, and surface the next concrete action under your current mode."
+            )
+            subprocess.run(
+                [
+                    "/usr/local/bin/tg-schedule",
</file context>

                [
                    "/usr/local/bin/tg-schedule",
                    "+1 hour",
                    "--repeat", "+1 hour",
                    heartbeat_prompt,
                ],
                env=env,
                check=False,
@cubic-dev-ai cubic-dev-ai Bot May 15, 2026


P2: Heartbeat scheduling errors are silently swallowed here, so cadence setup can fail without any log signal.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At agent/telegram_bot.py, line 4834:

<comment>Heartbeat scheduling errors are silently swallowed here, so cadence setup can fail without any log signal.</comment>

<file context>
@@ -4809,6 +4809,35 @@ def _start_agency_goal_from_command(
+                    heartbeat_prompt,
+                ],
+                env=env,
+                check=False,
+                stdout=subprocess.DEVNULL,
+                stderr=subprocess.DEVNULL,
</file context>

                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except Exception:
            LOG.exception("goal: failed to schedule first heartbeat")
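
One way to approach the duplicate-chain concern raised in the P1 review comment above is to check for an existing repeating job in the same chat and thread before queuing another. The sketch below assumes the caller can already enumerate pending `job.json` payloads as dicts. The diff does not show where job directories live, so that part is left out. `has_heartbeat` is a hypothetical name; the field names follow the `job.json` line written by `tg-schedule` in this PR.

```python
from typing import Any, Iterable, Mapping


def has_heartbeat(jobs: Iterable[Mapping[str, Any]], chat_id: Any, thread_id: Any) -> bool:
    """Return True if any pending job already repeats in the given chat/thread.

    Hypothetical guard: `jobs` is assumed to be the decoded job.json payloads
    of pending at-jobs, with the keys tg-schedule writes
    (chat_id, thread_id, fresh, name, repeat).
    """
    for job in jobs:
        # An empty or missing "repeat" means a one-shot job, not a heartbeat.
        repeat = str(job.get("repeat") or "").strip()
        if not repeat:
            continue
        # Compare as strings so 42 and "42" are treated as the same id.
        same_chat = str(job.get("chat_id")) == str(chat_id)
        same_thread = str(job.get("thread_id")) == str(thread_id)
        if same_chat and same_thread:
            return True
    return False
```

A caller could skip the `subprocess.run` call when this returns `True`, so a second `/goal` in the same topic would not start a parallel chain.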

def _handle_my_chat_member(self, update: dict) -> None:
"""React to the bot's own membership changing in some chat.

50 changes: 37 additions & 13 deletions agent/tg-schedule
@@ -1,12 +1,18 @@
#!/usr/bin/env bash
# tg-schedule <when> [--fresh] [--name NAME] <prompt>
# tg-schedule <when> [--fresh] [--name NAME] [--repeat INTERVAL] <prompt>
#
# Schedule a future agent turn that resumes the current topic's session
# (default), or spawn a fresh forum topic with a clean session (--fresh).
#
# <when> is at(1) syntax: "+5 minutes", "+1 hour", "tomorrow 09:00",
# "9am", "noon", "2026-05-03 09:00".
#
# --repeat INTERVAL: heartbeat mode. After the at-job fires the prompt,
# tg-schedule-fire re-queues itself at the same INTERVAL. Self-sustaining
# until stopped via `atq` + `atrm`. INTERVAL uses at(1) syntax too, e.g.
# "+1 hour", "+30 minutes". Used to drive per-goal heartbeats without
# the agent having to call tg-schedule from inside its turn.
#
# Default mode reuses the current chat + thread (TG_CHAT_ID / TG_THREAD_ID
# from the bot env). When the at-job fires, the bot dispatches the prompt
# into that lane, claude/codex resumes the lane's session UUID, and the
@@ -28,16 +34,18 @@ set -euo pipefail

usage() {
cat >&2 <<'USAGE'
Usage: tg-schedule <when> [--fresh] [--name NAME] <prompt>
Usage: tg-schedule <when> [--fresh] [--name NAME] [--repeat INTERVAL] <prompt>

<when> at(1) time spec: "+5 minutes", "tomorrow 09:00", "9am", etc.
--fresh spawn a new forum topic with a clean session
--name topic name in --fresh mode (default: "Scheduled <date>")
<when> at(1) time spec: "+5 minutes", "tomorrow 09:00", "9am", etc.
--fresh spawn a new forum topic with a clean session
--name topic name in --fresh mode (default: "Scheduled <date>")
--repeat after firing, re-queue at INTERVAL (e.g. "+1 hour"). Heartbeat.

Examples:
tg-schedule "+5 minutes" "remind me to take my meds"
tg-schedule "+1 hour" "check the deploy and report"
tg-schedule "tomorrow 09:00" --fresh --name "Standup" "summarize yesterday"
tg-schedule "+1 hour" --repeat "+1 hour" "[heartbeat] scan, suggest, act"
USAGE
exit 2
}
@@ -47,6 +55,7 @@ if [ "$#" -lt 2 ]; then usage; fi
when="$1"; shift
fresh=0
topic_name=""
repeat=""
prompt=""

while [ "$#" -gt 0 ]; do
@@ -60,6 +69,11 @@ while [ "$#" -gt 0 ]; do
topic_name="$2"
shift 2
;;
--repeat)
[ "$#" -ge 2 ] || usage
repeat="$2"
shift 2
;;
-h|--help)
usage
;;
@@ -100,19 +114,27 @@ printf '%s\n' "$prompt" > "$job_dir/prompt"

fresh_json='false'
[ "$fresh" -eq 1 ] && fresh_json='true'
name_json='""'
if [ -n "$topic_name" ]; then

_quote_json() {
local val="$1"
if [ -z "$val" ]; then
printf '""'
return
fi
if command -v jq >/dev/null 2>&1; then
name_json="$(printf '%s' "$topic_name" | jq -Rs .)"
printf '%s' "$val" | jq -Rs .
else
# crude fallback: strip quotes/backslashes so we can drop into JSON
safe="${topic_name//\\/}"
local safe="${val//\\/}"
safe="${safe//\"/}"
name_json="\"$safe\""
printf '"%s"' "$safe"
fi
fi
}

name_json="$(_quote_json "$topic_name")"
repeat_json="$(_quote_json "$repeat")"
cat > "$job_dir/job.json" <<JSON
{"chat_id": $src_chat_id, "thread_id": $src_thread_id, "fresh": $fresh_json, "name": $name_json}
{"chat_id": $src_chat_id, "thread_id": $src_thread_id, "fresh": $fresh_json, "name": $name_json, "repeat": $repeat_json}
JSON

# Build the at-job body. atd preserves uid, so the fire script runs as
@@ -132,4 +154,6 @@ fi
fire_at="$(printf '%s\n' "$at_out" | awk -F' at ' '/^job /{print $2}')"
mode_label='same topic'
[ "$fresh" -eq 1 ] && mode_label='new topic'
echo "tg-schedule: queued job $job_id ($mode_label) — fires at ${fire_at:-$when}"
repeat_suffix=''
[ -n "$repeat" ] && repeat_suffix=" (heartbeat, repeats $repeat)"
echo "tg-schedule: queued job $job_id ($mode_label) — fires at ${fire_at:-$when}${repeat_suffix}"
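
The `_quote_json` fallback above drops quotes and backslashes when `jq` is missing, so a topic name containing either is silently altered. For comparison, here is a minimal sketch of the same `job.json` payload built with a real JSON encoder. It is written in Python (the language of `tg-schedule-fire`) purely for illustration, not as a proposed replacement for the bash script; `job_payload` is a hypothetical name.

```python
import json


def job_payload(chat_id: int, thread_id: int, fresh: bool, name: str = "", repeat: str = "") -> str:
    """Serialize the job description with the same keys tg-schedule writes.

    Hypothetical helper: json.dumps escapes quotes and backslashes instead of
    stripping them, so the name survives a round trip unchanged.
    """
    return json.dumps(
        {
            "chat_id": chat_id,
            "thread_id": thread_id,
            "fresh": fresh,
            "name": name,
            "repeat": repeat,
        }
    )
```

Decoding the result with `json.loads` gives back the original `name` and `repeat` strings, which is the property the crude fallback gives up.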
27 changes: 27 additions & 0 deletions agent/tg-schedule-fire
@@ -13,6 +13,7 @@ from __future__ import annotations
import json
import os
import shutil
import subprocess
import sys
from datetime import datetime, timezone
from pathlib import Path
@@ -122,6 +123,32 @@ def main() -> int:
    bot = Bot(token, setup_token)
    bot.run_task((chat_id, thread_id), body, reply_to=None, sender=sender)

    # Heartbeat mode: re-queue ourselves so the bot, not the agent,
    # drives the next fire. tg-schedule's --repeat writes the interval
    # into job.json. Inherits the same chat/thread context — the new
    # at-job lives or dies on its own job dir.
    repeat = str(job.get("repeat") or "").strip()
    if repeat:
        env = os.environ.copy()
        env["TG_CHAT_ID"] = str(chat_id)
        env["TG_THREAD_ID"] = str(thread_id)
        try:
            subprocess.run(
                [
                    "/usr/local/bin/tg-schedule",
                    repeat,
                    "--repeat",
                    repeat,
                    prompt,
                ],
                env=env,
                check=False,
@cubic-dev-ai cubic-dev-ai Bot May 15, 2026


P1: Heartbeat re-queue failures are silently ignored because subprocess return codes are not checked while output is discarded.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At agent/tg-schedule-fire, line 145:

<comment>Heartbeat re-queue failures are silently ignored because subprocess return codes are not checked while output is discarded.</comment>

<file context>
@@ -122,6 +123,32 @@ def main() -> int:
+                    prompt,
+                ],
+                env=env,
+                check=False,
+                stdout=subprocess.DEVNULL,
+                stderr=subprocess.DEVNULL,
</file context>
Suggested change:
- check=False,
+ check=True,

                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except Exception as exc:
            print(f"tg-schedule-fire: heartbeat re-queue failed: {exc}", file=sys.stderr)
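
The review comments point out that with `check=False` and both output streams discarded, a failing `tg-schedule` call leaves no trace. A possible shape for a re-queue that keeps `check=False` but still reports a non-zero exit is sketched below. `requeue` is a hypothetical helper, and capturing stderr instead of discarding it is a change from the code in this PR, not something the PR does.

```python
import subprocess
import sys
from typing import Mapping, Optional, Sequence


def requeue(argv: Sequence[str], env: Optional[Mapping[str, str]] = None) -> bool:
    """Run the re-queue command and report failures on stderr.

    Hypothetical helper. Returns True only when the command exits with code 0.
    """
    try:
        proc = subprocess.run(
            list(argv),
            env=env,
            check=False,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.PIPE,  # keep stderr so a failure can be explained
            text=True,
        )
    except OSError as exc:
        # Raised when the binary is missing or not executable.
        print(f"tg-schedule-fire: heartbeat re-queue failed to launch: {exc}", file=sys.stderr)
        return False
    if proc.returncode != 0:
        detail = proc.stderr.strip()
        print(
            f"tg-schedule-fire: heartbeat re-queue exited {proc.returncode}: {detail}",
            file=sys.stderr,
        )
        return False
    return True
```

Switching to `check=True`, as the bot suggests, would also surface the failure through the existing `except` branch; the sketch just adds the exit code and stderr text to the message.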

    try:
        shutil.rmtree(job_dir)
    except Exception: