exospherehost · NiveditJain · May 4, 2026 · coderabbitai · May 4, 2026
diff --git a/drafts/2026-05-04T112801Z.md b/drafts/2026-05-04T112801Z.md
@@ -0,0 +1,57 @@
+# Reply to OP on "Stop Treating Agent Sandboxes as Cattle"
+
+- **HN:** https://news.ycombinator.com/item?id=48004872
+- **Story:** "Stop Treating Agent Sandboxes as Cattle" (id=48004872, 1 point, 0 comments at draft time, 6 hours old, OP `iacguy`, links out to a blog post by Utpal Nadiger on opencomputer.dev)
+- **Status:** draft (pending manual post)
+
+## The post
+
+Top-level reply to the OP. The article is a direct rebuttal to "The agent harness belongs outside the sandbox" (id=47990675, a Mendral article we already covered in PR #17). It argues you can run the harness inside the sandbox after all, with three rebuttals:
+
+1. **Credentials**: Solved by an egress proxy (Fly's Tokenizer / mitmproxy / IMDS / Vault pattern). The sandbox holds an opaque handle; the proxy substitutes the real secret at the boundary. The sandbox never sees real credential material.
+2. **Idle compute cost**: Solved by VM hibernation (~25ms resume) plus elasticity (autoscaling from 1GB/1vCPU to 16GB/4vCPU), so idle reasoning runs at the bottom tier and the VM bursts to 16GB only for builds.
+3. **Cattle vs pets durability**: Solved by a third option, "git branches for VMs": hibernate to survive planned restarts, checkpoint to survive hard failures, fork to explore alternatives in parallel.
+
+The thread is a substantive technical post inviting design discussion (Show HN-shaped, even though not formally a Show HN). Per the FailProof thread-fit gate, that's the second allowed shape. PocketOS-style failures (Cursor agent's volumeDelete on Railway during a credential-mismatch fix) are explicitly cited in the earlier articles surrounding this rebuttal, so an intent-gating policy snippet is on-topic rather than opportunistic.
+
+## My reply
+
+```
+(disclosure: I work on FailProof AI: https://github.com/exospherehost/failproofai)
+
+The egress-proxy and hibernate-fork story is good infrastructure plumbing, but it sidesteps a question orthogonal to where the harness sits: should this specific tool call happen at all? PocketOS-style failures (agent runs volumeDelete while fixing a credential mismatch) happen entirely inside a sandbox holding a properly scoped, proxy-authenticated token. The proxy can't see model intent; checkpoint-and-fork lets you survive a bad call but not prevent it.
+
+A PreToolUse policy closes that gap independently of sandbox placement:
+
+  import { customPolicies, allow, deny } from "failproofai";
+
+  customPolicies.add({
+    name: "deny-volume-delete",
+    match: { events: ["PreToolUse"] },
+    fn: ({ toolName, toolInput }) => {
+      if (toolName !== "Bash") return allow();
+      const cmd = toolInput?.command ?? "";
+      if (/\bvolumeDelete\b|\brailway\s+volumes?\s+delete\b/i.test(cmd)) {
+        return deny("volumeDelete blocked; snapshot first.");
+      }
+      return allow();
+    },
+  });
+
+Pets-vs-cattle-vs-git-branches answers durability; this answers intent.
+```
+
+## Insight for the FailProof team
+
+- The opencomputer.dev / Mendral debate is a useful framing wedge for FailProof's positioning. Both sides are arguing about *infrastructure* (where the harness lives, how credentials transit, how the VM survives restarts). FailProof operates on a different axis: agent decision-time at the PreToolUse boundary, before the call is dispatched at all. **Action:** lift this "third axis" framing into a short blog post — *Three orthogonal layers of agent reliability: sandbox isolation, network credential boundary, and tool-call policy* — diagrammed against the same incidents (PocketOS, Replit, Bedrock prompt-cache miss) the rest of this space already references. The diagram is the marketing hook.
+- The article spends most of its space rebutting the *outside-the-sandbox* argument by saying "egress proxy + hibernate solves it." That's a real point, but it concedes credential exfiltration is the only thing the credential layer protects against. Worth being explicit in our marketing copy that egress proxies and PreToolUse policies *compose*: the proxy keeps secrets out of the sandbox, the policy keeps unsafe tool calls from reaching the proxy in the first place. Both/and.
+- "Git branches for VMs" (hibernate / checkpoint / fork) is a neat primitive worth understanding. The fork-to-explore-alternatives idea (try three migration strategies in parallel from one checkpoint) intersects with FailProof's instruct() guidance flow. A future custom-policy idea: `route-destructive-to-fork` — when the agent issues a destructive verb, instead of allow/deny, fork the sandbox, run the call in the fork, surface the diff, let the user pick. We'd need opencomputer.dev-style fork primitives to wire it; not a priority but a sketch worth filing.
+- Author is Utpal Nadiger at digger / opencomputer.dev, replying to Andrea Luzzardi at Mendral. Both teams are reasonable people building agent infrastructure; relationship is "friendly disagreement on a design question," not adversarial. Comment should engage on the design merits (it does), not pile on either side.
+
+## Notes / findings
+
+- Thread is fresh (6h, 1 point, 0 comments at draft time). The reply will be the first comment, so visibility depends on whether the thread gains any traction at all. Per the gate this is *not* a long-tail / saturated thread; it's an embryonic /newest item where an early thoughtful comment can plausibly seed engagement, or be the only comment if the thread dies. Either way the comment stands on its own as substantive on-topic content.
+- The reply form is present and the thread is not `[dead]` / `[flagged]`. Confirmed via `document.querySelector("form textarea[name=\"text\"]")` returning truthy and the page text containing neither marker.
+- The custom-policy snippet is *new* relative to prior FailProof drafts in this repo. The closest paraphrases are `comments/2026-04-29T043958Z.md` (`block-drop-database`) and PR #17 (`no-rm-rf`). The `deny-volume-delete` snippet here ties directly to the PocketOS / Railway incident this article's ecosystem references, not to a generic "destructive command" intent. Cross-thread duplicate guard satisfied.
+- Discovery path: `/news` page 1 + page 2, `/show` page 1 + page 2, `/active`, Algolia searches for `claude deleted`, `agent guardrails`, `claude code`, `cursor agent`, `agent rm`, `agent sandbox`, `MCP security`, `prompt injection`, `agent reliability`, `claude code hooks`, `agent pushed main`, `claude committed`, `cursor deleted`, `secrets leaked agent`, `claude code leaked`, `claude force push` — across past day, past week, past month. Most agent-failure-shaped threads with engagement were already covered by existing PRs; this one was the strongest uncovered fit.
+- Found and ruled out: 48006894 ("Agent Control Room - Auth0 for AI Agents", 0 comments, just a link with no OP text), 48006682 ("Show HN: Valk Guard - dangerous SQL", 0 comments, too sparse), 48000137 ("Babysitting the Agent", 0 comments), 47963910 ("Show HN: Npx LLM-safe-haven", 0 comments), 48002442 ("Agentic Coding Is a Trap", saturated 263-comment vent thread, fails the gate), 47989883 ("VS Code Co-Authored-by Copilot", saturated 823-comment vent thread on a Microsoft consent issue not at the agent-tool-call layer), 47963204 ("Claude Code refuses requests for OpenClaw", saturated 718-comment thread).
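
The `deny-volume-delete` matcher in the draft reply can be exercised on its own before posting. The sketch below is a minimal, standalone version: `allow` and `deny` are local stand-ins rather than the `failproofai` exports, `checkVolumeDelete` and the `ToolCall` shape are hypothetical names for illustration, and the sample commands are made-up strings chosen to probe the regex, not verified CLI invocations.

```typescript
// Local stand-ins for the decision helpers. These are NOT the failproofai
// exports; they only give the matcher something to return when run alone.
type Decision = { decision: "allow" } | { decision: "deny"; reason: string };

const allow = (): Decision => ({ decision: "allow" });
const deny = (reason: string): Decision => ({ decision: "deny", reason });

// Same pattern as the draft reply: the bare identifier or the CLI spelling.
const VOLUME_DELETE = /\bvolumeDelete\b|\brailway\s+volumes?\s+delete\b/i;

// Hypothetical shape of the fields the draft policy reads from its context.
interface ToolCall {
  toolName: string;
  toolInput?: { command?: string };
}

// Mirrors the body of the draft policy's `fn`, minus the registration call.
function checkVolumeDelete({ toolName, toolInput }: ToolCall): Decision {
  if (toolName !== "Bash") return allow();
  const cmd = toolInput?.command ?? "";
  return VOLUME_DELETE.test(cmd)
    ? deny("volumeDelete blocked; snapshot first.")
    : allow();
}

// Made-up sample commands that probe the edges of the pattern.
const samples: ToolCall[] = [
  { toolName: "Bash", toolInput: { command: "railway volume delete my-vol" } },
  { toolName: "Bash", toolInput: { command: `curl -d 'mutation { volumeDelete(volumeId: "abc") }' https://example.invalid/graphql` } },
  { toolName: "Bash", toolInput: { command: "railway volume list" } },
  { toolName: "Bash", toolInput: { command: "railway volume --yes delete my-vol" } },
  { toolName: "Read", toolInput: { command: "volumeDelete" } },
];

for (const call of samples) {
  const result = checkVolumeDelete(call);
  console.log(result.decision, "<-", call.toolName, call.toolInput?.command);
}
```

With this pattern, the sample that places a flag between `volume` and `delete` is allowed, as is any call whose tool is not `Bash`. If the reply is meant to claim broad coverage, the pattern may need loosening; as a short illustration of intent gating it is probably fine as drafted.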