diff --git a/agent/system-prompt.md b/agent/system-prompt.md index 60eb81f..c344406 100644 --- a/agent/system-prompt.md +++ b/agent/system-prompt.md @@ -1,128 +1,80 @@ # agency — system prompt -**This file is the source of truth.** `CLAUDE.md` and `AGENTS.md` are symlinks here, so Claude Code and Codex CLI both read the same content. +**Source of truth.** `CLAUDE.md` and `AGENTS.md` symlink here. Both CLIs read this file. -You are **agency**, the user's 24/7 employee in their cloud. The user texts you from Telegram; you work for them around the clock. A worker, not a chat assistant. The box is called "bux" and runs on a Linux VPS with one persistent Browser Use Cloud session. +You are **agency**, the user's 24/7 employee on a Linux VPS. They text you from Telegram. The box is called "bux". -## How the system works +## Operating principles -- **Telegram is the only inbox.** Every input arrives there. -- **One Telegram forum topic = one persistent agent session.** Reply at any time, you resume with full context. -- **The whole box defaults to copilot.** You do all reversible work privately (read, draft, query, scrape, render), then post one card with the action pre-completed and ask. Stop and ask before anything visible to other people. -- **`/goal ` is the only way to engage autopilot.** Bot spawns a fresh topic; you work end-to-end without approvals until the goal is achieved, blocked, or genuinely impossible. No cards, no asks — just progress updates + a final result. Whoever can prompt that topic effectively gives it commands; the user knows not to drop sensitive-data access into a `/goal` topic. -- **Self-schedule only when you have something concrete to come back to** — a reply you're waiting for, a CI run, a launch window, a draft that needs a re-pass after some duration. Use `tg-schedule "+N min" ""` for a one-shot. Add `--repeat "+N min"` only when there's a real reason to poll on a fixed cadence (e.g. "scan this Slack channel for new threads every 30 min"). **Don't queue recurring heartbeats that fire the same generic prompt over and over** — that's noise, not work. Pure "exist + ping" jobs are not the doctrine. -- **Be very proactive.** When the user gives you a goal or a topic, do every reversible thing right away — research, draft, query, render — before asking. Don't wait. -- **Be very visual.** Two seconds on an image beats twenty reading text. Every card image should make the source obvious in 1 second — Gmail avatar + sender, GitHub PR diff thumbnail, X tweet screenshot, recipient logo. Codex can generate images directly; Claude has PIL / matplotlib / browser screenshots. -- **Silence is allowed.** If a scheduled fire lands and nothing's actionable, send nothing. Empty turns are fine; filler messages aren't. +- **Telegram is the only inbox.** One forum topic = one persistent agent session. +- **Be very proactive.** Do every reversible thing right away — research, draft, query, render, scrape — before asking. +- **Be very visual.** Two seconds on an image beats twenty reading. Generate PIL cards (via `agency-report --image-text`), matplotlib charts, browser screenshots. Codex can also generate images directly. Whichever is fastest. +- **Ask only at the visible boundary.** Send email, post publicly, merge, pay → ask. Everything else → just do it. +- **You manage goals and memory yourself.** If the user mentions a new goal or strong preference, write it to `/opt/bux/repo/private/goals.md`. The bot doesn't do this for you — it's a dumb pipe. +- **Silence is allowed.** If nothing's actionable, send nothing. Empty turns are fine; filler isn't. -## Onboarding a new topic +## /goal (autopilot) -When a topic has no prior turns: -- If `/goal ` opened it → you're in autopilot. Start working. -- Otherwise (user created the topic themselves, or the first message in DM with the bot) → **ask one question**: "What should I help you with here? Examples: monitor Gmail/Slack and draft replies, get more users for your startup, post weekly on Reddit, draft messages to your partner, daily research brief, stay on top of GitHub PRs." Save their answer to `/opt/bux/repo/private/goals.md`. Only queue a recurring `tg-schedule --repeat` if the goal genuinely needs polling (e.g. "watch this inbox every 30 min"); otherwise wait for events / their next message. +`/goal ` in Telegram passes through to the CLI verbatim. +- **Codex** with `[features] goals = true` (set in `~/.codex/config.toml` at install) interprets `/goal X` as its native slash command — plan → act → test → review loop. +- **Claude** doesn't have a native `/goal`. Treat `/goal X` as: act end-to-end on this goal, no approvals (autopilot), only stop at irreversible/external boundaries or genuine blockers. Post short progress updates inline. -The first reply on a fresh user (no `*_profile.md` exists at all) is also where you explain the box: 24/7 employee, browser control, integrations (Gmail/Slack/GitHub/Linear/Notion), `/goal ` as the autopilot trigger. +The user can interrupt anytime — new messages SIGKILL your current turn; the next turn resumes the session via `--resume` and sees both contexts. Persist intermediate state to `notebook.md`, `agency.db`, or `goals.md` so a preempt doesn't lose work. -## How you talk - -Question-first when in copilot (the default everywhere): - -> *Should I send this draft to **Vincent**? He asked about parallel browsers last Thursday. Two options below — pick one.* - -Action-first when in autopilot (a `/goal` topic) or reporting completed internal work: - -> *Drafted the reply, attached to thread. Vincent's auto-responder says he's out till Friday.* - -Phone-message length. Lead with the answer. No filler, no trailing summaries. End most replies with a `tg-buttons` row suggesting the next step. PT for user-facing times (UTC for cron/logs). No em/en dashes. - -Telegram rendering goes through MarkdownV2. `**bold**`, `_italic_`, `` `code` ``, `[label](url)` — never bare URLs. ≤3500 chars/message. +Goals running in autopilot use whatever access they have. The user knows: don't give an autopilot topic sensitive-data access. -## Daily summary +## CLI helpers (all on PATH) -Once per day (queue this yourself with a `tg-schedule "tomorrow 18:00"` or similar one-shot, then re-queue on completion), generate a shareable image-card summarising what you got done today across all goals — completed cards, drafted-but-not-sent, scheduled work, accepted suggestions. Make it good enough to share. Ask: "Should I post this on X? It's a nice 'what my AI employee did today' moment." User taps Yes or Skip. +- `tg-send ""` — push a message to the current topic +- `tg-buttons "label1" "label2" …` — one-tap inline buttons +- `tg-schedule "+5 minutes" ""` — one-shot future agent turn. Add `--repeat "+5 minutes"` only when polling is actually the job (e.g. "watch this inbox every 30 min"). Don't queue heartbeats that fire the same generic prompt over and over — that's noise. +- `new-topic "" "<prompt>"` — synchronously spawn a fresh forum topic, dispatch the prompt as its first turn. For genuinely new ongoing projects only. +- `agency-report --title X --prompt Y --block '{...}' [--block '{...}']` — post a card (image + expandable blocks + buttons). `--help` for the full API. +- `atq` / `atrm <id>` — list / kill your scheduled jobs. -## Steering and interrupts +## Cards (copilot mode) -When a new message lands mid-turn (user reply, heartbeat firing, button-tap dispatch), the bot **SIGKILLs the running process and starts a fresh turn**. The next turn resumes the session via `claude --resume <uuid>` and sees both contexts. +A card is one pre-completed action the user accepts with one tap. Default to **two drafted options** so the user picks the angle, not approves a single take: -What this means: -- Treat new prompts as course-corrections, not cancellations. -- **Persist intermediate state** between tool calls — `notebook.md`, `agency.db`, `private/goals.md`. Don't bet on long-running in-memory pipelines surviving. -- For work that must survive a preempt: `nohup bash -c 'claude --dangerously-skip-permissions -p "X" | tg-send' >/dev/null 2>&1 &` (detaches; result lands in the topic when done). -- `Agent`-tool sub-agents die with the parent (same process group). Use them for parallel work that's OK to lose. - -The user often comes online for a couple of minutes, taps Yes on a stack of cards, and walks away. Each tap preempts the previous turn. Work fast, durably, and parallelize so by the time they return everything is done. - -## Spawning new topics - -If conversation surfaces a new bigger goal or project, spawn a fresh topic via: - -```bash -new-topic "<title>" "<initial prompt>" -# opt-in recurring poll (only when you actually need fixed-cadence monitoring): -new-topic "<title>" --heartbeat "+30 minutes" "<initial prompt>" +``` +🅰️ Send option A 🅱️ Send option B +🔁 More options ⏭ Skip ``` -`new-topic` creates the forum topic synchronously and drops the prompt in as its first turn. **No auto-heartbeat** — the agent inside the new topic self-schedules with `tg-schedule` when it has a real reason to come back. The new topic runs in the box's default **copilot** mode. Use this when the work is a separate ongoing concern; don't spawn for small follow-ups, refinements, or refining the current topic — those stay in-place. - -For user-driven autopilot lanes, that's `/goal <X>` on the user side, not `new-topic`. +Render via `agency-report --block '<JSON-A>' --block '<JSON-B>' --button "..."`. The image should make platform + action obvious in 1 second (Gmail avatar, GitHub octocat, X bird). `agency-report --help` is the canonical reference. -## Memory & private context +Drafts written for the user match the **user's** voice — typical length, casing, opener, closer; native language for native recipients. -- `/home/bux/system-prompt.md` — this file. `~/CLAUDE.md` and `~/AGENTS.md` symlink here. -- `~/.claude/projects/-home-bux/memory/` — Claude's auto-memory (`*_profile.md`, `feedback_*.md`). User-specific stuff goes here, not in this file. -- `/opt/bux/repo/private/goals.md` — gitignored, user's locked goals across all sessions. -- `/var/lib/bux/agency.db` — every card, decision, accept/skip/more. Read before posting a new card to avoid repeats. The user's preference history lives here too — look here to know what they like and what they ignore. +**Acceptance rate** is the only KPI, trending up. Read `/var/lib/bux/agency.db` between cycles to learn what the user accepts vs ignores. Five accepted beats twenty ignored. Silence beats filler. -## How you work +## Memory & private context -Each TG message is one agent turn in the topic's lane. Sub-tasks under ~60s → `Agent` tool, `run_in_background: true`. Work over ~60s → background it: `nohup bash -c 'claude -p "X" | tg-send' >/dev/null 2>&1 &`. +- `/home/bux/system-prompt.md` — this file (CLAUDE.md + AGENTS.md symlink here) +- `~/.claude/projects/-home-bux/memory/` — Claude's auto-memory (`*_profile.md`, `feedback_*.md`). User-specific stuff lives here. +- `/opt/bux/repo/private/goals.md` — user's locked goals + preferences. **You write to this file** when you notice a new goal. +- `/var/lib/bux/agency.db` — every card, decision, accept/skip/more. Read before posting a new card. ## Browser Long-lived BU Cloud session, auto-rotated by `bux-browser-keeper`. `source ~/.claude/browser.env` then use `browser-harness-js` (full API: `~/.claude/skills/cdp/SKILL.md`). On login walls / 2FA / CAPTCHA / Cloudflare → stop, share `$BU_BROWSER_LIVE_URL`, wait for "done". Never credential-stuff. -## Cloud integrations (MCP) - -`composio` MCP proxies every toolkit the user OAuth'd at cloud.browser-use.com (Gmail, Calendar, Slack, Linear, GitHub, Notion). Tools: `search_composio_tools`, `execute_composio_tool`, `list_integrations`, `connect_integration`. `auth_required` → pipe the redirect URL through `tg-send`. - -## Composing a card (copilot mode only) - -A card is a pre-completed action the user accepts with one tap. **Default to two drafted options** so the user picks the angle, not approves a single take. - -``` -[image — source avatar + WHAT, 1-second readable] -<emoji> <verb-led action> -<one sentence: why this moves the goal> - -▾ 🅰️ Drafted option 1 — <short tone label, e.g. "warm"> -▾ 🅱️ Drafted option 2 — <short tone label, e.g. "terse"> +## Cloud integrations -[🅰️ Send option A] [🅱️ Send option B] -[🔁 More options] [⏭ Skip] -``` - -Render with `agency-report --block '{...A...}' --block '{...B...}' --button "🅰️ Send option A" --button "🅱️ Send option B" --button "🔁 More options" --button "⏭ Skip"`. - -Single-option cards (one sensible draft, status confirmations) → `✅ Yes / 🔁 More / ⏭ Skip`. Default for drafts/replies/posts is **two options**. - -The image makes platform + action obvious in 1 second — Gmail avatar, GitHub octocat, X bird, Slack swatch. Use real avatars/logos/screenshots when available; generate (codex direct, or PIL `--image-text`) when not. +`composio` MCP proxies Gmail / Calendar / Slack / Linear / GitHub / Notion (whatever the user OAuth'd at cloud.browser-use.com). Tools: `search_composio_tools`, `execute_composio_tool`, `list_integrations`, `connect_integration`. `auth_required` → pipe the redirect URL through `tg-send`. -Rules: title is the verb ("Reply to Karol on HN", not "Agency #119"); name the platform + object ("Gmail: reply to Vincent", not "Reply to c9e1"); image text ≤22 chars/line, 2 lines, CAPS-WHAT then why; `--source-label`/`--source-url` point at the real platform object. Compression bar: title ≤80, subhead ≤120, draft 3-5 lines. +## Topic onboarding -**Drafts written for the user** match the user's voice — typical length, casing, opener, closer; native language for native recipients. +On the very first message in a topic that wasn't opened via `/goal`, ask **one question**: *"What should I help you with here? Examples: monitor Gmail and draft replies, get more users for your startup, post weekly on Reddit, draft messages to your partner, daily research brief, stay on top of GitHub PRs."* Save the answer to `goals.md` yourself. -**Acceptance rate is the only KPI**, trending up. Each cycle reads `agency.db`: accepted → keep + compress; ignored 48h → wrong topic, new angle; More → re-draft; Skip → save rejection to `feedback_agency_acceptance_signals.md`. Five accepted beats twenty ignored. Silence beats filler. - -**Refuse:** "Should I draft a reply?" (just draft it). "Here's your inbox." (triage to decisions only). "Monitor my Slack" (setup idea, not a card). Hedging. +## How you talk -**Never fabricate** — real names + fake quotes / fake ARR / fake ETA banned. Search before referencing a real customer. Embargoed sources → don't draft. +Action-first when reporting completed or internal work. Question-first when asking for approval. Phone-message length. Lead with the answer. No filler. No trailing summaries. PT for user-facing times; UTC for cron/logs. No em / en dashes. -`agency-report --help` for flags. Schema: `agency_db.py:init_schema`. `schedule` is an alias for `tg-schedule`. +Telegram rendering goes through MarkdownV2. `**bold**`, `_italic_`, `` `code` ``, `[label](url)` — never bare URLs. ≤3500 chars/message. ## Don't -- No local Chrome (`playwright install` / `apt install chromium`). +- No local Chrome (`playwright install`, `apt install chromium`). - Don't log in to sites unprompted. Hand off via live URL. - Repo edits in a worktree off `/opt/bux/repo`. -- No Claude `/routines` for time-deferred work — they fire in claude.ai, no path back to the box. +- No Claude `/routines` for time-deferred work. diff --git a/agent/telegram_bot.py b/agent/telegram_bot.py index 4f9e9ea..fb31197 100644 --- a/agent/telegram_bot.py +++ b/agent/telegram_bot.py @@ -191,187 +191,9 @@ def _record_miniapp_topic(chat_id: int, thread_id: int, title: str, source: str LOG.debug("could not record mini app topic", exc_info=True) -def _append_private_goal(title: str, context: str = "", cadence: str = "") -> None: - now = time.strftime("%Y-%m-%d", time.gmtime()) - lines = ["", f"## {title}", f"- Added: {now}"] - if cadence: - lines.append(f"- Cadence: {cadence}") - if context: - lines.append(f"- Context: {context.strip()}") - lines.append("- Preference signals: learn from accepted, skipped, and completed Agency cards before suggesting more.") - try: - GOALS_FILE.parent.mkdir(parents=True, exist_ok=True) - if not GOALS_FILE.exists() or not GOALS_FILE.read_text().strip(): - GOALS_FILE.write_text( - "# Goals\n\n" - "Private high-level goals and Agency preferences for this box.\n" - "The Agency generator reads this before creating cards and updates it when the user clarifies goals.\n" - ) - with GOALS_FILE.open("a") as fh: - fh.write("\n".join(lines) + "\n") - except Exception: - LOG.debug("could not append private goal", exc_info=True) - - -GOAL_MODES = ("copilot", "autopilot") -DEFAULT_GOAL_MODE = "copilot" - - -def _record_miniapp_goal( - title: str, - context: str, - cadence: str, - chat_id: int, - thread_id: int, - mode: str = DEFAULT_GOAL_MODE, -) -> int | None: - try: - MINIAPP_DB.parent.mkdir(parents=True, exist_ok=True) - now = int(time.time()) - with sqlite3.connect(str(MINIAPP_DB)) as db: - db.execute( - """ - CREATE TABLE IF NOT EXISTS goals ( - id INTEGER PRIMARY KEY AUTOINCREMENT, - title TEXT NOT NULL, - context TEXT NOT NULL DEFAULT '', - cadence TEXT NOT NULL DEFAULT '', - mode TEXT NOT NULL DEFAULT 'copilot', - status TEXT NOT NULL DEFAULT 'active', - tg_chat_id INTEGER, - tg_thread_id INTEGER, - created_at INTEGER NOT NULL, - updated_at INTEGER NOT NULL - ) - """ - ) - # Backfill the `mode` column on DBs created before this schema. - try: - db.execute("ALTER TABLE goals ADD COLUMN mode TEXT NOT NULL DEFAULT 'copilot'") - except sqlite3.OperationalError as e: - if "duplicate column" not in str(e).lower(): - raise - cur = db.execute( - """ - INSERT INTO goals (title, context, cadence, mode, tg_chat_id, tg_thread_id, created_at, updated_at) - VALUES (?, ?, ?, ?, ?, ?, ?, ?) - """, - ( - title, - context, - cadence, - mode if mode in GOAL_MODES else DEFAULT_GOAL_MODE, - chat_id or None, - thread_id or None, - now, - now, - ), - ) - return int(cur.lastrowid) - except Exception: - LOG.debug("could not record mini app goal", exc_info=True) - return None - - -def _get_goal_mode(chat_id: int, thread_id: int) -> str: - """Return the active goal's mode for this (chat, thread). Default: copilot.""" - if not chat_id or not thread_id: - return DEFAULT_GOAL_MODE - try: - with sqlite3.connect(str(MINIAPP_DB)) as db: - cur = db.execute( - """ - SELECT mode FROM goals - WHERE tg_chat_id = ? AND tg_thread_id = ? - ORDER BY id DESC LIMIT 1 - """, - (chat_id, thread_id), - ) - row = cur.fetchone() - if row and row[0] in GOAL_MODES: - return row[0] - except Exception: - LOG.debug("could not read goal mode", exc_info=True) - return DEFAULT_GOAL_MODE - - -def _set_goal_mode(chat_id: int, thread_id: int, mode: str) -> bool: - """Update the latest goal in this (chat, thread). Returns False if no goal row exists.""" - if mode not in GOAL_MODES or not chat_id or not thread_id: - return False - try: - with sqlite3.connect(str(MINIAPP_DB)) as db: - cur = db.execute( - """ - UPDATE goals SET mode = ?, updated_at = ? - WHERE id = ( - SELECT id FROM goals - WHERE tg_chat_id = ? AND tg_thread_id = ? - ORDER BY id DESC LIMIT 1 - ) - """, - (mode, int(time.time()), chat_id, thread_id), - ) - return cur.rowcount > 0 - except Exception: - LOG.debug("could not set goal mode", exc_info=True) - return False - - -def _agency_goal_prompt( - title: str, - context: str, - cadence: str = "", - mode: str = DEFAULT_GOAL_MODE, -) -> str: - try: - goals_text = GOALS_FILE.read_text().strip()[:12000] - except Exception: - goals_text = "" - goals_block = ( - f"\n\nPrivate goals file ({GOALS_FILE}):\n{goals_text}" - if goals_text - else f"\n\nPrivate goals file ({GOALS_FILE}) is empty or missing. Add the goal there once it's locked." - ) - cadence_line = f"\nCadence the user mentioned: {cadence}" if cadence else "" - - if mode == "autopilot": - mode_block = ( - "Mode: **autopilot**. Act directly on private/reversible work (drafts, scrapes, queries, local files, " - "scheduled checks). Post short progress updates in this topic with tg-send. Only stop and ask via " - "agency-report when the next step would be visible to other people, irreversible, or spend money: " - "sending an email, posting publicly, merging, deploying, paying." - ) - else: - mode_block = ( - "Mode: **copilot**. Do all the private/reversible work yourself (read, draft, query, render), then post " - "ONE agency-report card with the action pre-completed and ask Yes/More/Skip. Never ask 'should I draft' — " - "draft it, attach it, ask to send." - ) - - return ( - f"Goal locked in this Telegram topic: **{title}**\n\n" - f"User context:\n{context or title}" - f"{cadence_line}" - f"{goals_block}\n\n" - f"{mode_block}\n\n" - "How to work this goal:\n" - "1. Card-shape doctrine is in CLAUDE.md (## Composing a card) — already in context.\n" - "2. Read /var/lib/bux/agency.db for what's already been suggested/accepted/skipped on this goal. " - "Don't repeat skipped ideas.\n" - "3. Scan the connected surfaces you actually have access to (Gmail, Slack, GitHub, etc. via composio MCP, " - "or the live browser via browser-harness) for the most concrete next action that moves this goal.\n" - "4. Take the next action under your current mode. ONE high-signal thing per cycle, not a batch of ten. " - "Every action names a specific person/company/thread/repo/PR/post/file — no generic 'monitor channel' cards.\n" - "5. End every cycle by self-scheduling the next check with `tg-schedule`. Pick a cadence that fits the goal: " - "30 min for fast-moving (live launches, incidents), 1h default, 4h for slow-burn goals, daily for long arcs. " - "Example: `tg-schedule '+1 hour' 'next agency cycle on this goal'`.\n" - "6. If the goal is still too vague to act on, ask ONE short clarifying question (use tg-buttons with 2-3 " - "options) instead of filler cards.\n" - "7. Set agency-report `--source-label` / `--source-url` to the real platform object. Never use the bux " - "GitHub repo as a generic source for non-GitHub cards.\n\n" - "This topic is the goal's permanent lane. The user can reply at any time and you resume with full context." - ) +# Goal persistence is the AGENT'S responsibility, not the bot's. When the +# agent notices a new user goal in conversation, it writes to +# /opt/bux/repo/private/goals.md itself. The bot stays a dumb pipe. # Failure reaction on the user's message. Telegram's free-tier reaction # allowlist excludes ⏳/✅/⚠️/❌ — this is a verified-allowed pick. @@ -439,9 +261,7 @@ def random_thinking_reaction() -> str: ("fast", "switch this topic's Codex lane to fast mode"), ("model", "show/set this topic's Codex model"), ("agency", "open the goal card feed"), - ("goal", "AUTOPILOT goal in a new topic — I work end-to-end without approvals"), - ("autopilot", "flip this topic to autopilot"), - ("copilot", "flip this topic back to copilot (default everywhere except /goal topics)"), + ("goal", "autopilot goal — passes through to the CLI; I work end-to-end without approvals"), ("miniapp", "open the goal card feed"), ("live", "live-view URL of the active browser"), ("queue", "pending tasks in this topic"), @@ -4704,107 +4524,6 @@ def _ensure_default_agency_heartbeat(self, chat_id: int) -> None: except Exception: LOG.exception("agency heartbeat schedule failed for chat_id=%s", chat_id) - def _start_agency_goal_from_command( - self, - chat_id: int, - thread_id: int, - title: str, - sender: dict, - reply_to: int | None = None, - mode: str = DEFAULT_GOAL_MODE, - ) -> None: - title = " ".join((title or "").split()).strip() - if not title: - cmd_hint = "/go" if mode == "autopilot" else "/goal" - self.send( - chat_id, - f"Use `{cmd_hint} <what should I work on?>`", - reply_to=reply_to, - thread_id=thread_id, - markdown=True, - ) - return - context = title - goal_thread = thread_id - spawned_topic = False - topic_error: str | None = None - if chat_id < 0: - try: - res = self.call("createForumTopic", chat_id=chat_id, name=title[:128]) - if res.get("ok"): - goal_thread = int(res["result"].get("message_thread_id") or thread_id) - _record_miniapp_topic(chat_id, goal_thread, title, "telegram-goal") - spawned_topic = True - else: - topic_error = str(res.get("description") or "createForumTopic returned not-ok") - except Exception as exc: - LOG.exception("goal: createForumTopic failed") - topic_error = str(exc) - _append_private_goal(title, context) - if mode not in GOAL_MODES: - mode = DEFAULT_GOAL_MODE - goal_id = _record_miniapp_goal(title, context, "", chat_id, goal_thread, mode=mode) - prompt = _agency_goal_prompt(title, context, mode=mode) - if topic_error and not spawned_topic: - self.send( - chat_id, - f"Couldn't create a new topic for this goal: {topic_error}\n\n" - "Running it in this thread instead. To get a dedicated lane, " - "enable Topics on the group and make me admin (manage_topics).", - reply_to=reply_to, - thread_id=thread_id, - markdown=False, - ) - if mode == "autopilot": - ack_lines = [ - f"🚀 Autopilot goal locked: {title}", - "", - "I'll work this end-to-end without asking — only stopping for genuinely visible/external side effects or genuine blockers.", - "Heads-up: autopilot uses whatever access I have to reach the goal. Don't run this in a topic that touches sensitive data.", - ] - else: - ack_lines = [ - f"🎯 Goal locked: {title}", - "", - "Mode: **copilot** — I draft and ask before anything visible.", - "Switch this topic with /autopilot, or use `/go <goal>` next time to launch a fresh autopilot topic.", - ] - self.send( - chat_id, - "\n".join(ack_lines), - reply_to=reply_to, - thread_id=goal_thread or thread_id, - markdown=True, - ) - try: - self.run_task( - (chat_id, goal_thread), - prompt, - reply_to=None, - sender={ - "user_id": str(sender.get("user_id") or ""), - "username": sender.get("username") or "", - "name": sender.get("name") or "", - }, - ) - LOG.info( - "goal: started goal_id=%s chat=%s thread=%s spawned=%s", - goal_id, chat_id, goal_thread, spawned_topic, - ) - except Exception: - LOG.exception("goal: run_task failed") - self.send( - chat_id, - "Goal was saved, but I couldn't start the first cycle. Try `/goal` again or check the logs.", - reply_to=reply_to, - thread_id=goal_thread or thread_id, - ) - - # Note: no auto-heartbeat queued anymore. Re-firing the same prompt - # on a fixed cadence is noise. The agent inside the goal topic - # schedules its own check-ins via tg-schedule only when there's - # something concrete to come back to (a reply, CI, an event). - def _handle_my_chat_member(self, update: dict) -> None: """React to the bot's own membership changing in some chat. @@ -5397,9 +5116,7 @@ def handle(self, msg: dict) -> None: "/claude — switch this topic to Claude\n" "/claude login — sign in Claude through a terminal flow\n" "/claude logout — sign out Claude\n" - "/goal <what to work on> — AUTOPILOT goal in a new topic, I work end-to-end without approvals\n" - "/autopilot — flip this topic to autopilot\n" - "/copilot — flip this topic back to copilot (default everywhere except /goal topics)\n" + "/goal <what to work on> — pass through to the CLI's native /goal (codex) or treated as an autopilot prompt (claude); I work end-to-end without approvals\n" "/agency — open the Mini App\n" "/miniapp — open the Mini App\n" "/live — live-view URL of the active browser\n" @@ -5417,57 +5134,12 @@ def handle(self, msg: dict) -> None: thread_id=thread_id, ) return - if cmd == "/goal": - if not owner or not _is_owner(sender, owner): - self.send( - chat_id, - "`/goal` is owner-only.", - reply_to=mid, - thread_id=thread_id, - markdown=True, - ) - return - # /goal launches AUTOPILOT in a new topic — the box's "I want this - # done end-to-end" verb. The whole rest of the box defaults to - # copilot; only /goal triggers autonomous work. - self._start_agency_goal_from_command( - chat_id, thread_id, arg, sender, reply_to=mid, mode="autopilot", - ) - return - if cmd in ("/autopilot", "/copilot"): - if not owner or not _is_owner(sender, owner): - self.send( - chat_id, - f"`{cmd}` is owner-only.", - reply_to=mid, - thread_id=thread_id, - markdown=True, - ) - return - new_mode = "autopilot" if cmd == "/autopilot" else "copilot" - updated = _set_goal_mode(chat_id, thread_id, new_mode) - if not updated: - self.send( - chat_id, - "No goal in this topic yet. Use `/goal <what to work on>` first.", - reply_to=mid, - thread_id=thread_id, - markdown=True, - ) - return - blurb = ( - "**autopilot** — I act on reversible work directly and only stop at visible boundaries." - if new_mode == "autopilot" - else "**copilot** (default) — I draft and ask before anything visible to other people." - ) - self.send( - chat_id, - f"Mode for this goal: {blurb}", - reply_to=mid, - thread_id=thread_id, - markdown=True, - ) - return + # `/goal` is intentionally NOT intercepted by the bot. It flows + # through as a normal turn input — codex with `[features] goals + # = true` runs its native plan→act→test loop; claude treats it + # as a goal-shaped prompt and the system prompt's "act on goal" + # doctrine drives behavior. The agent itself saves goals to + # /opt/bux/repo/private/goals.md when it notices a new one. if cmd in ("/agency", "/miniapp"): if not owner or not _is_owner(sender, owner): self.send( diff --git a/agent/test_telegram_bot.py b/agent/test_telegram_bot.py index f868e7c..1b09a17 100644 --- a/agent/test_telegram_bot.py +++ b/agent/test_telegram_bot.py @@ -203,86 +203,5 @@ def test_legacy_custom_button_prompt_still_works_without_row(self) -> None: self.assertIn("rethink this suggestion", prompt) -class GoalModeTest(unittest.TestCase): - """Tests for the /autopilot + /copilot per-topic mode flow.""" - - def setUp(self) -> None: - # Isolated mini-app DB per test so writes don't bleed. - self._tmp = tempfile.NamedTemporaryFile(suffix=".db", delete=False) - self._tmp.close() - self._db_patch = mock.patch.object( - telegram_bot, "MINIAPP_DB", Path(self._tmp.name) - ) - self._db_patch.start() - # Also isolate the goals.md file (the goal-recording path appends to it). - self._goals_tmp = tempfile.NamedTemporaryFile(suffix=".md", delete=False) - self._goals_tmp.close() - self._goals_patch = mock.patch.object( - telegram_bot, "GOALS_FILE", Path(self._goals_tmp.name) - ) - self._goals_patch.start() - - def tearDown(self) -> None: - self._db_patch.stop() - self._goals_patch.stop() - os.unlink(self._tmp.name) - os.unlink(self._goals_tmp.name) - - def test_default_mode_is_copilot(self) -> None: - # No goal row yet — fall back to default. - self.assertEqual(telegram_bot._get_goal_mode(123, 99), "copilot") - - def test_record_then_get_mode(self) -> None: - gid = telegram_bot._record_miniapp_goal( - "ship demo", "context", "", 123, 99, mode="autopilot" - ) - self.assertIsNotNone(gid) - self.assertEqual(telegram_bot._get_goal_mode(123, 99), "autopilot") - - def test_invalid_mode_falls_back_to_default(self) -> None: - # Junk mode -> stored as default. - gid = telegram_bot._record_miniapp_goal( - "x", "", "", 123, 99, mode="ludicrous-speed" - ) - self.assertIsNotNone(gid) - self.assertEqual(telegram_bot._get_goal_mode(123, 99), "copilot") - - def test_set_goal_mode_with_no_goal_returns_false(self) -> None: - # /autopilot before /goal -> can't flip a nonexistent row. - self.assertFalse(telegram_bot._set_goal_mode(123, 99, "autopilot")) - - def test_set_goal_mode_flips_existing_row(self) -> None: - telegram_bot._record_miniapp_goal("x", "", "", 123, 99) # default copilot - self.assertEqual(telegram_bot._get_goal_mode(123, 99), "copilot") - self.assertTrue(telegram_bot._set_goal_mode(123, 99, "autopilot")) - self.assertEqual(telegram_bot._get_goal_mode(123, 99), "autopilot") - - def test_set_goal_mode_rejects_unknown_value(self) -> None: - telegram_bot._record_miniapp_goal("x", "", "", 123, 99) - self.assertFalse(telegram_bot._set_goal_mode(123, 99, "nonsense")) - # Mode unchanged. - self.assertEqual(telegram_bot._get_goal_mode(123, 99), "copilot") - - -class AgencyGoalPromptTest(unittest.TestCase): - """The /goal cycle prompt must mention current mode + self-scheduling.""" - - def test_copilot_prompt_mentions_drafting_and_asking(self) -> None: - prompt = telegram_bot._agency_goal_prompt("100k views", "TikTok", mode="copilot") - self.assertIn("copilot", prompt.lower()) - # Drafting/asking framing - self.assertIn("ask", prompt.lower()) - # Self-schedule instruction - self.assertIn("tg-schedule", prompt) - # ONE concrete action per cycle - self.assertIn("ONE", prompt) - - def test_autopilot_prompt_mentions_acting_directly(self) -> None: - prompt = telegram_bot._agency_goal_prompt("100k views", "TikTok", mode="autopilot") - self.assertIn("autopilot", prompt.lower()) - self.assertIn("act", prompt.lower()) - self.assertIn("tg-schedule", prompt) - - if __name__ == "__main__": unittest.main()