diff --git a/drafts/2026-05-04T122844Z.md b/drafts/2026-05-04T122844Z.md new file mode 100644 index 0000000..2fbca14 --- /dev/null +++ b/drafts/2026-05-04T122844Z.md @@ -0,0 +1,55 @@ +# Reply draft: Kirikiri Show HN, mobile supervision asymmetry vs hook-layer gating + +**Status:** draft (pending manual post) + +**HN:** https://news.ycombinator.com/item?id=47996198 + +**Story / OP:** Show HN: Kirikiri - A mobile IDE for Claude Code (iOS, open source) by Husena (no external link, OP body in thread). + +**The post (OP body):** + +> Claude Code runs in a terminal. Phones don't have good terminals. +> That's the problem Kirikiri solves. +> +> Connect to Google Cloud Shell or any SSH server, run Claude Code +> there, and interact with it from your phone. Instead of fighting +> a software keyboard, you place floating tap buttons on screen for +> your most-used commands - git push, npm run dev, whatever you +> type constantly. +> +> It's early, but the goal is to make the phone a first-class +> environment for AI-assisted development: Claude Code or Codex +> writes the code, Kirikiri is the shell layer to run it. +> +> Built with Flutter + dartssh2. Plugin system lets you add +> extensions from GitHub repos. +> +> iOS: https://apps.apple.com/jp/app/kirikiri/id6764003368 +> Source: https://github.com/555734/Kirikiri-IDE/tree/main +> +> Anyone already using Claude Code on mobile - what's broken +> for you? + +**My reply (paste verbatim into the HN composer):** + +``` +(disclosure: I work on FailProof AI: https://github.com/exospherehost/failproofai) + +The thing that's broken for me about driving Claude Code from a phone isn't the keyboard, it's the supervision asymmetry. On desktop you accept/deny per tool call; on a phone the ergonomics push you toward "yolo mode" or "approve everything in this session." Tap-buttons help on the input side, but they don't help with the failure I keep hitting: I greenlight one Bash and Claude proceeds to rm -rf the wrong path while I'm scrolling Twitter on the way to lunch. + +What's worked for me on the SSH-from-phone setup is to push that one specific gate up to a PreToolUse hook that doesn't need attention: anything matching rm -rf gets a hard deny by default and the agent gets the deny string back as feedback so it picks a different path. Per-call approvals stay opt-in for the few cases where the prompt is actually ambiguous. +``` + +**Insight for the FailProof team:** + +Mobile / SSH-from-phone is a distinct user segment we don't talk about in the docs. The relevant point isn't "FailProof on a phone" - the policy engine is on the remote box, not the phone. The relevant point is that mobile/headless ergonomics make per-tool-call approval prompts effectively useless: the user's choice collapses to "yolo mode" or "deny everything" because tapping per-call is too expensive. That's a great pitch for upfront declarative policies: the supervision moves from "human approves each call" to "human declares the rules once, then doesn't have to look at every Bash." Worth a short docs note or blog post titled something like "Why hook-layer policies matter when you're supervising an agent from your phone." Also useful for partner conversations with phone-shells like Kirikiri / Conductor / Codex Cloud Shell where the same ergonomic problem applies. Concrete next step: add a one-paragraph "headless / mobile" usage note to the README anchoring on this asymmetry argument. + +**Notes / findings:** + +- Thread is 24h old, 0 comments at the time of drafting. Show HN with low traction. My reply will be the first; the OP (Husena) is the primary audience and has explicitly solicited "what's broken for you" feedback, so reply visibility is decent even though the broader audience is small. +- Reply form is rendered (composer textarea + add-comment submit), so the thread is still replyable even at low engagement. +- Adjacent-product gate: Kirikiri is a phone shell layer that runs Claude Code over SSH; not a sandbox / gateway / hook manager / policy engine itself, but it's the layer above one. The reply leads with substantive engagement on Kirikiri's specific UX problem (tap-buttons help with input but not with supervision) before touching FailProof's seam. +- Voice: ASCII punctuation only. Used straight quotes, hyphens, colons, semicolons. No em / en dashes, no curly quotes, no Unicode arrows or fancy ellipsis. Length ~146 words including disclosure. +- Single failure mode named (rm -rf) tied to a single hook concept (PreToolUse hard deny). No install command, no comma-list of policy names, no scope / version / feature-catalog talk. +- Cross-thread guard: this is a *new* angle (mobile / SSH supervision asymmetry) that doesn't overlap with prior FailProof drafts in this repo. Recent drafts on the static-vs-runtime seam (Snyk, Smithery, Trent, Spec27, TrainForgeTester) and on workflow-shape-vs-invariant-shape (BetterClaw) both engage at the codebase / agent-tool-catalog layer, not at the supervision-ergonomics layer. Safe. +- Discovery path: /ask -> /show -> /newest -> hn.algolia.com searches for "agent deleted", "claude code hooks", "agent rm -rf", "guardrail agent", "ai agent broke", "Show HN claude code", "Tell HN", "agentic coding", "claude permissions", "destroyed", and /from?site=lookups. Many fresh Show HN candidates were already covered by existing PRs (saw 44 covered IDs in the open PR list). Kirikiri stood out as fresh, adjacent, and OP-soliciting.