exospherehost · NiveditJain · May 10, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -1,5 +1,10 @@
 # Changelog
 
+## 0.0.10-beta.12 — 2026-05-10
+
+### Features
+- Pi: enforce `Stop` policies (`require-commit-before-stop`, `require-push-before-stop`, etc.) on the next user turn via `before_agent_start` injection. Pi's `AgentEndEvent` has no Result type — by the time it fires, Pi's agent loop has already exited, so a deny return cannot keep Pi running the way Claude's exit-2-from-Stop can. Empirically observed: a user on Pi with `require-commit-before-stop` enabled saw the deny `reason` ("You have uncommitted changes …") propagate to Pi but Pi exited anyway. Fix: the `pi-extension/index.ts` shim captures any deny `reason` from `agent_end` into a per-`sessionId` in-memory map, then on the next `before_agent_start` (Pi v0.73.x — fires after the user submits a prompt, before the agent loop runs) returns `{systemPrompt: <event.systemPrompt> + "\n\n" + reason}` so the LLM sees a `MANDATORY ACTION REQUIRED` directive at the top of its next turn. The map is one-shot per drain and is cleared on every `session_shutdown` reason (including `quit`), so a stale gate cannot leak into a fresh session started in the same Pi process. `policy-evaluator.ts` now emits the `MANDATORY ACTION REQUIRED from failproofai (policy: …)` wrapper inside `reason` for `cli === "pi" && eventType === "Stop"` (both the deny and instruct paths), mirroring the Cursor/Gemini/Copilot/OpenCode Stop branches; non-Stop Pi events keep the existing flat `{permission, reason}` shape. Bounded by Pi process lifetime — same bound Claude has on exit-2-from-Stop (kill the agent and the gate is missed). New shim unit tests cover capture/drain/one-shot/`session_shutdown`-clear/no-pending-noop/missing-systemPrompt and a new e2e test pins the binary's stdout shape (`{permission:"deny", reason:/MANDATORY ACTION REQUIRED.*require-commit-before-stop.*uncommitted changes/}`) for `agent_end` in a dirty repo (#PR).
+
 ## 0.0.10-beta.11 — 2026-05-10
 
 ### Fixes

diff --git a/CLAUDE.md b/CLAUDE.md
@@ -218,18 +218,19 @@ pi-extension. Same self-reference caveat applies — do **not** install the
 standard `npx` form from inside this repo.
 
 **Pi limitations vs. Claude semantics** (verified against pi-coding-agent
-v0.72.1 d.ts; the `pi-extension/` shim subscribes to 7 events but Pi's API
+v0.73.1 d.ts; the `pi-extension/` shim subscribes to 8 events but Pi's API
 caps what each handler can do):
 
-| Pi event           | → Claude event   | Veto / mutate? | Notes |
-|--------------------|------------------|----------------|-------|
-| `tool_call`        | PreToolUse       | ✅ block      | Full deny support via `{block, reason}`. |
-| `user_bash`        | PreToolUse       | ✅ block      | Full deny support. |
-| `input`            | UserPromptSubmit | ✅ block      | Full deny support. |
-| `session_start`    | SessionStart     | observation   | No return-value effect on Pi. |
-| `tool_result`      | PostToolUse      | observation   | `ToolResultEventResult` exposes `{content, details, isError}` for mutation but no `block`. PostToolUse is observation/sanitize anyway, matching Claude semantics. |
-| `agent_end`        | Stop             | observation   | Pi's agent loop has already exited; we cannot keep Pi running the way Claude's exit-2-from-Stop can. `require-*-before-stop` policies still RUN — their findings land in the activity store + stderr — but the stop is not vetoed. |
-| `session_shutdown` | SessionEnd       | observation   | Symmetry only. |
+| Pi event             | → Claude event   | Veto / mutate?  | Notes |
+|----------------------|------------------|-----------------|-------|
+| `tool_call`          | PreToolUse       | ✅ block        | Full deny support via `{block, reason}`. |
+| `user_bash`          | PreToolUse       | ✅ block        | Full deny support. |
+| `input`              | UserPromptSubmit | ✅ block        | Full deny support. |
+| `session_start`      | SessionStart     | observation     | No return-value effect on Pi. |
+| `tool_result`        | PostToolUse      | observation     | `ToolResultEventResult` exposes `{content, details, isError}` for mutation but no `block`. PostToolUse is observation/sanitize anyway, matching Claude semantics. |
+| `agent_end`          | Stop             | shifted (next turn) | Pi's `AgentEndEvent` has no Result type — we cannot retry the same loop the way Claude's exit-2-from-Stop can. The shim captures any deny `reason` and stashes it keyed by sessionId for the next `before_agent_start` handler to drain. The 5 `require-*-before-stop` builtins thus enforce by gating the NEXT user turn's system prompt. Bounded by Pi process lifetime — same bound Claude has on exit-2-from-Stop. |
+| `before_agent_start` | (Pi-only handoff) | systemPrompt   | Drains any pending Stop deny captured at `agent_end`, returning `{systemPrompt: <event.systemPrompt> + "\n\n" + reason}` so the LLM sees the MANDATORY ACTION directive before its next turn. Multiple extensions chain. No injection when no block is pending. |
+| `session_shutdown`   | SessionEnd       | observation     | Symmetry only. Also clears any pending stop-block for the session id (every reason, not just `new`/`resume`/`fork`). |
 
 **Instruct (`additionalContext`) on Pi `tool_call`** — Pi's
 `ToolCallEventResult` shape is `{block?, reason?}` only; there's no

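Reviewer note on "Multiple extensions chain" in the `before_agent_start` row: the sketch below is one way to picture that chaining, assuming each handler receives the `systemPrompt` produced by the handler before it. `runChain`, `otherExtension`, and `failproofaiWithPending` are hypothetical names for illustration; this is not Pi's actual dispatch code.

```typescript
interface BeforeAgentStartEvent {
  systemPrompt?: string;
}
interface BeforeAgentStartResult {
  systemPrompt?: string;
}
type Handler = (event: BeforeAgentStartEvent) => BeforeAgentStartResult | undefined;

// Hypothetical host-side runner (an assumption about how chaining behaves).
// Each handler sees the prompt left by the previous one; returning undefined
// leaves the prompt unchanged.
function runChain(base: string, handlers: Handler[]): string {
  let current = base;
  for (const handler of handlers) {
    const result = handler({ systemPrompt: current });
    if (result?.systemPrompt !== undefined) current = result.systemPrompt;
  }
  return current;
}

// Some unrelated extension that also appends to the system prompt.
const otherExtension: Handler = (e) => ({
  systemPrompt: `${e.systemPrompt ?? ""}\n\n[other extension note]`,
});

// Stand-in for the shim's handler: appends only when a Stop deny is pending.
const failproofaiWithPending = (pending: string | undefined): Handler => (e) =>
  pending ? { systemPrompt: `${e.systemPrompt ?? ""}\n\n${pending}` } : undefined;
```

Under that assumption, appending to `event.systemPrompt` (rather than replacing it) keeps whatever earlier extensions contributed, and returning `undefined` when nothing is pending leaves the chain untouched.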
diff --git a/__tests__/e2e/hooks/pi-integration.e2e.test.ts b/__tests__/e2e/hooks/pi-integration.e2e.test.ts
@@ -208,6 +208,39 @@ describe("E2E: Pi integration — hook protocol (handler-only)", () => {
     }
   });
 
+  it("agent_end with require-commit-before-stop in a dirty repo emits MANDATORY ACTION reason", () => {
+    const env = createPiEnv();
+    try {
+      // Make env.cwd a git repo with an uncommitted file so
+      // require-commit-before-stop returns deny. This is the bridge to the
+      // shim-side handoff: the binary's stdout MUST be a Pi-flat
+      // `{permission:"deny", reason}` payload whose reason carries the
+      // "MANDATORY ACTION REQUIRED" wrapper. The shim captures that
+      // reason on agent_end and re-injects it via before_agent_start (the
+      // in-process map handoff is covered by the shim unit tests).
+      execSync("git init -q && git config user.email t@t && git config user.name t", { cwd: env.cwd });
+      writeFileSync(resolve(env.cwd, "dirty.txt"), "uncommitted\n");
+      writeConfig(env.cwd, ["require-commit-before-stop"]);
+      const result = runHook(
+        "agent_end",
+        PiPayloads.agentEnd(env.cwd),
+        { homeDir: env.home, cli: "pi" },
+      );
+      // Stop on Pi uses the MANDATORY ACTION wrapping (not the generic
+      // `Blocked <displayTool> by failproofai because: …` wording that
+      // assertPiDeny matches), so we inline the deny shape checks here.
+      expect(result.exitCode).toBe(0);
+      expect(result.parsed?.permission).toBe("deny");
+      expect(result.parsed?.hookSpecificOutput).toBeUndefined();
+      const reason = String(result.parsed?.reason ?? "");
+      expect(reason).toMatch(/MANDATORY ACTION REQUIRED/);
+      expect(reason).toMatch(/require-commit-before-stop/);
+      expect(reason).toMatch(/uncommitted changes/i);
+    } finally {
+      env.cleanup();
+    }
+  });
+
   it("agent-settings guard: Bash read of .pi/settings.json is denied", () => {
     const env = createPiEnv();
     try {

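Reviewer note: the e2e test above pins the binary's stdout to the Pi-flat `{permission, reason}` shape. A minimal sketch of how a shim could turn that stdout into the `{block, reason}` decision the handlers act on follows. `parsePiDecision` is a hypothetical helper, not the repo's `callPolicy`, and failing open on unparseable output is an assumption made here for illustration.

```typescript
// Decision shape the shim acts on; mirrors the `decision.block` and
// `decision.reason` fields read in pi-extension/index.ts.
interface Decision {
  block: boolean;
  reason?: string;
}

// Hypothetical helper: map the hook binary's stdout to a Decision.
function parsePiDecision(stdout: string): Decision {
  const text = stdout.trim();
  // Empty stdout means allow, matching the shim unit tests' default mock.
  if (text === "") return { block: false };

  let parsed: unknown;
  try {
    parsed = JSON.parse(text);
  } catch {
    // Assumption: unparseable output fails open rather than blocking.
    return { block: false };
  }
  if (typeof parsed !== "object" || parsed === null) return { block: false };

  const { permission, reason } = parsed as { permission?: unknown; reason?: unknown };
  if (permission !== "deny") return { block: false };
  return typeof reason === "string" ? { block: true, reason } : { block: true };
}
```

The `agent_end` handler only stores a pending block when both `block` and `reason` are set, so a deny without a string reason would not produce an injection on the next turn.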
diff --git a/__tests__/hooks/pi-extension-shim.test.ts b/__tests__/hooks/pi-extension-shim.test.ts
@@ -24,10 +24,23 @@ interface PiExtensionApi {
 
 const captured: CapturedCall[] = [];
 
+/** Per-event stdout replies for the spawnSync mock. Tests that need to
+ *  simulate a deny from the binary set `mockSpawnReplyByEvent[<eventName>]`
+ *  to a JSON string before invoking the matching handler. Any event not
+ *  in the map gets the default empty stdout. */
+const mockSpawnReplyByEvent: Record<string, string | undefined> = {};
+
+function eventNameFromArgs(args: string[]): string | undefined {
+  const i = args.indexOf("--hook");
+  return i >= 0 ? args[i + 1] : undefined;
+}
+
 vi.mock("node:child_process", () => ({
   spawnSync: (_cmd: string, args: string[], opts: { input?: string }) => {
     captured.push({ args: args ?? [], payload: JSON.parse(opts?.input ?? "{}") });
-    return { pid: 0, output: [], status: 0, signal: null, stderr: "", stdout: "" };
+    const evt = eventNameFromArgs(args ?? []);
+    const stdout = (evt && mockSpawnReplyByEvent[evt]) ?? "";
+    return { pid: 0, output: [], status: 0, signal: null, stderr: "", stdout };
   },
 }));
 
@@ -43,6 +56,7 @@ describe("pi-extension shim — sessionId resolution via on-disk discovery", ()
 
   beforeEach(async () => {
     captured.length = 0;
+    for (const k of Object.keys(mockSpawnReplyByEvent)) delete mockSpawnReplyByEvent[k];
     handlers = {};
     piRoot = mkdtempSync(join(tmpdir(), "pi-shim-test-"));
     originalEnv = process.env.PI_SESSIONS_DIR;
@@ -240,4 +254,122 @@ describe("pi-extension shim — sessionId resolution via on-disk discovery", ()
   });
 });
 
+/**
+ * Pi cannot veto `agent_end` directly (Pi's AgentEndEvent has no Result type).
+ * The shim captures any deny reason and re-injects it as a `systemPrompt`
+ * suffix on the next `before_agent_start`. These tests cover that handoff.
+ */
+describe("pi-extension shim — agent_end → before_agent_start stop-block handoff", () => {
+  let handlers: Record<string, (event: unknown) => unknown> = {};
+  let piRoot: string;
+  let originalEnv: string | undefined;
+  const SID = "ffffffff-ffff-ffff-ffff-ffffffffffff";
+
+  beforeEach(async () => {
+    captured.length = 0;
+    for (const k of Object.keys(mockSpawnReplyByEvent)) delete mockSpawnReplyByEvent[k];
+    handlers = {};
+    piRoot = mkdtempSync(join(tmpdir(), "pi-shim-handoff-"));
+    originalEnv = process.env.PI_SESSIONS_DIR;
+    process.env.PI_SESSIONS_DIR = piRoot;
+    // Seed a transcript so resolveSessionId returns a stable id.
+    const dir = join(piRoot, piEncodeCwd("/proj"));
+    mkdirSync(dir, { recursive: true });
+    writeFileSync(join(dir, `2026-05-09T00-00-00-000Z_${SID}.jsonl`), "{}\n");
+    vi.resetModules();
+    const mod = await import("../../pi-extension/index");
+    mod.default({ on: (name, fn) => { handlers[name] = fn; } });
+  });
+
+  afterEach(() => {
+    if (originalEnv === undefined) delete process.env.PI_SESSIONS_DIR;
+    else process.env.PI_SESSIONS_DIR = originalEnv;
+    rmSync(piRoot, { recursive: true, force: true });
+  });
+
+  it("agent_end deny is captured and drained on next before_agent_start as a systemPrompt suffix", () => {
+    mockSpawnReplyByEvent["agent_end"] = JSON.stringify({
+      permission: "deny",
+      reason: "MANDATORY ACTION REQUIRED from failproofai (policy: require-commit-before-stop): commit now.",
+    });
+    handlers.agent_end({ type: "agent_end", cwd: "/proj" });
+    // No reply value from agent_end (Pi cannot veto stop).
+    const result = handlers.before_agent_start({
+      type: "before_agent_start",
+      prompt: "next prompt",
+      systemPrompt: "BASE",
+      cwd: "/proj",
+    }) as { systemPrompt?: string } | undefined;
+    expect(result?.systemPrompt).toBe(
+      "BASE\n\nMANDATORY ACTION REQUIRED from failproofai (policy: require-commit-before-stop): commit now.",
+    );
+  });
+
+  it("before_agent_start with no pending block returns undefined", () => {
+    const result = handlers.before_agent_start({
+      type: "before_agent_start",
+      prompt: "p",
+      systemPrompt: "BASE",
+      cwd: "/proj",
+    });
+    expect(result).toBeUndefined();
+  });
+
+  it("the stop-block is one-shot: a second before_agent_start in the same session does not re-fire", () => {
+    mockSpawnReplyByEvent["agent_end"] = JSON.stringify({ permission: "deny", reason: "X" });
+    handlers.agent_end({ type: "agent_end", cwd: "/proj" });
+    const first = handlers.before_agent_start({ type: "before_agent_start", systemPrompt: "B", cwd: "/proj" }) as { systemPrompt?: string };
+    expect(first?.systemPrompt).toBe("B\n\nX");
+    const second = handlers.before_agent_start({ type: "before_agent_start", systemPrompt: "B", cwd: "/proj" });
+    expect(second).toBeUndefined();
+  });
+
+  it("session_shutdown clears the pending stop-block (quit reason too, not just new/resume/fork)", () => {
+    mockSpawnReplyByEvent["agent_end"] = JSON.stringify({ permission: "deny", reason: "X" });
+    handlers.agent_end({ type: "agent_end", cwd: "/proj" });
+    handlers.session_shutdown({ type: "session_shutdown", reason: "quit", cwd: "/proj" });
+    // Even though `quit` retains the cached sessionId, the pending block must
+    // be dropped so a future before_agent_start (e.g. in the next session
+    // started in this process) doesn't inherit a stale gate.
+    const result = handlers.before_agent_start({
+      type: "before_agent_start",
+      systemPrompt: "B",
+      cwd: "/proj",
+    });
+    expect(result).toBeUndefined();
+  });
+
+  it("agent_end with allow (empty stdout) does NOT set a pending block", () => {
+    // Default mock returns empty stdout → callPolicy returns {block:false}.
+    handlers.agent_end({ type: "agent_end", cwd: "/proj" });
+    const result = handlers.before_agent_start({
+      type: "before_agent_start",
+      systemPrompt: "B",
+      cwd: "/proj",
+    });
+    expect(result).toBeUndefined();
+  });
+
+  it("before_agent_start without a resolvable sessionId is a no-op", () => {
+    // Use a cwd that has no on-disk transcript — sessionId discovery returns
+    // undefined and the handler must early-return without throwing.
+    const result = handlers.before_agent_start({
+      type: "before_agent_start",
+      systemPrompt: "B",
+      cwd: "/no-such-cwd",
+    });
+    expect(result).toBeUndefined();
+  });
+
+  it("before_agent_start with no systemPrompt in the event still injects (uses empty base)", () => {
+    mockSpawnReplyByEvent["agent_end"] = JSON.stringify({ permission: "deny", reason: "Y" });
+    handlers.agent_end({ type: "agent_end", cwd: "/proj" });
+    const result = handlers.before_agent_start({
+      type: "before_agent_start",
+      cwd: "/proj",
+    }) as { systemPrompt?: string };
+    expect(result?.systemPrompt).toBe("\n\nY");
+  });
+});
+
 import { afterEach } from "vitest";
diff --git a/pi-extension/index.ts b/pi-extension/index.ts
@@ -231,6 +231,24 @@ function discoverPiSessionId(cwd: string): string | undefined {
  *  across multiple workspace roots) can't cross-attribute. Cleared on
  *  session_shutdown reasons `new`/`resume`/`fork` (Pi reuses the process). */
 const cachedSessionIdByCwd = new Map<string, string>();
+
+/** Pending Stop-policy deny reason from agent_end, keyed by sessionId.
+ *  Drained by before_agent_start on the next user turn in the same Pi
+ *  process. Cleared on every session_shutdown.
+ *
+ *  Why this exists: Pi's agent_end has no Result type — the agent loop
+ *  has already exited when it fires, so a deny return cannot keep Pi
+ *  running the way Claude's exit-2-from-Stop does. The closest analog
+ *  is to capture the deny here and re-inject it as a MANDATORY ACTION
+ *  system-prompt addition on the NEXT before_agent_start, which fires
+ *  after the user submits a prompt but before the agent loop runs.
+ *  Best-effort: bounded by the Pi process lifetime — same bound Claude
+ *  has on exit-2-from-Stop (kill the agent and the gate is missed).
+ *
+ *  Why per-session not per-cwd: a Pi process can host multiple sessions
+ *  via /resume and /fork; per-cwd would cross-attribute a stale block
+ *  from a prior session into a fresh one. */
+const pendingStopBlockBySession = new Map<string, string>();
 function resolveSessionId(eventSessionId: string | undefined, cwd: string): string | undefined {
   if (eventSessionId) {
     cachedSessionIdByCwd.set(cwd, eventSessionId);
@@ -302,6 +320,17 @@ interface PiAgentEndEvent {
   sessionId?: string;
 }
 
+/** Pi v0.73.x before_agent_start event payload. Fires once per turn,
+ *  after the user submits a prompt but before the agent loop runs. */
+interface PiBeforeAgentStartEvent {
+  type?: string;
+  prompt?: string;
+  /** The fully assembled system prompt for this turn — we append to it. */
+  systemPrompt?: string;
+  cwd?: string;
+  sessionId?: string;
+}
+
 interface PiExtensionApi {
   on(event: string, handler: (event: unknown) => unknown): void;
 }
@@ -384,21 +413,51 @@ export default function failproofaiBridge(pi: PiExtensionApi) {
     return undefined;
   });
 
-  // agent_end → Stop. Observation-only on Pi: the agent loop has already
-  // exited when this fires, so a deny decision cannot keep Pi running the
-  // way Claude's exit-2-from-Stop can. We still forward so the 5
-  // require-*-before-stop builtins run and log their findings (visible in
-  // the dashboard's activity feed and stderr) — best-effort visibility.
+  // agent_end → Stop. Pi cannot veto agent_end (the agent loop has already
+  // exited when this fires — see the AgentEndEvent typedef in pi-coding-agent
+  // which has NO Result type). Instead we capture any deny reason and stash
+  // it keyed by sessionId for the next before_agent_start handler to drain.
+  // The 5 require-*-before-stop builtins thus enforce by gating the NEXT
+  // user turn's system prompt rather than by retrying the same loop. If the
+  // user kills Pi between turns, the gate is missed — same bound Claude has.
   pi.on("agent_end", (event: unknown): unknown => {
     const e = event as PiAgentEndEvent;
-    callPolicy("agent_end", {
-      session_id: resolveSessionId(e.sessionId, resolveCwd(e.cwd)),
-      cwd: resolveCwd(e.cwd),
+    const cwd = resolveCwd(e.cwd);
+    const sessionId = resolveSessionId(e.sessionId, cwd);
+    const decision = callPolicy("agent_end", {
+      session_id: sessionId,
+      cwd,
       hook_event_name: "Stop",
     });
+    if (decision.block && decision.reason && sessionId) {
+      pendingStopBlockBySession.set(sessionId, decision.reason);
+      debug(`agent_end deny stored for session=${sessionId}`);
+    }
     return undefined;
   });
 
+  // before_agent_start → drain any pending Stop-policy deny captured at
+  // agent_end. This is Pi's only first-class channel to influence the next
+  // turn before the LLM call: the result type accepts a `systemPrompt`
+  // replacement (chained across extensions) and an optional injected
+  // CustomMessage. We only return systemPrompt — sufficient for the LLM to
+  // see the MANDATORY ACTION directive immediately, and avoids polluting the
+  // visible conversation history with framework chrome. The reason text
+  // already carries the policy-attributed MANDATORY ACTION wording from
+  // policy-evaluator's Pi-Stop branch.
+  pi.on("before_agent_start", (event: unknown): unknown => {
+    const e = event as PiBeforeAgentStartEvent;
+    const cwd = resolveCwd(e.cwd);
+    const sessionId = resolveSessionId(e.sessionId, cwd);
+    if (!sessionId) return undefined;
+    const pending = pendingStopBlockBySession.get(sessionId);
+    if (!pending) return undefined;
+    pendingStopBlockBySession.delete(sessionId);
+    debug(`before_agent_start drains stop-block for session=${sessionId}`);
+    const base = e.systemPrompt ?? "";
+    return { systemPrompt: `${base}\n\n${pending}` };
+  });
+
   // session_shutdown → SessionEnd. Observation-only; emits a SessionEnd
   // record so per-session telemetry has a clean close. Reset the per-cwd
   // sessionId cache for shutdown reasons that mean "Pi is starting a new
@@ -414,9 +473,19 @@ export default function failproofaiBridge(pi: PiExtensionApi) {
       reason: e.reason,
       hook_event_name: "SessionEnd",
     });
+    // Capture sessionId BEFORE the cache reset so we delete the pending
+    // entry under the just-ending session's id. After resetSessionIdCache,
+    // a subsequent resolveSessionId would re-discover from disk and could
+    // bind to a different (stale) file — wrong key for the cleanup below.
+    const sessionId = resolveSessionId(e.sessionId, cwd);
     if (e.reason === "new" || e.reason === "resume" || e.reason === "fork") {
       resetSessionIdCache(cwd);
     }
+    // Drop any pending Stop-policy deny for this session on every shutdown
+    // reason: `quit` ends the session for good (don't leave a dead entry in
+    // the map); `new`/`resume`/`fork` start a different session in the same
+    // process and must not inherit the prior session's gate.
+    if (sessionId) pendingStopBlockBySession.delete(sessionId);
     return undefined;
   });
 }
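
Reviewer note on the "per-session not per-cwd" rationale in the `pendingStopBlockBySession` doc comment: the toy comparison below shows the cross-attribution that keying by cwd would allow. The maps, `recordDeny`, and the session ids are illustrative stand-ins, not the shim's own state.

```typescript
// Two candidate keying schemes for a pending Stop deny.
const byCwd = new Map<string, string>();
const bySession = new Map<string, string>();

// Record the same deny under both schemes so they can be compared.
function recordDeny(cwd: string, sessionId: string, reason: string): void {
  byCwd.set(cwd, reason);
  bySession.set(sessionId, reason);
}

// Session A ends a turn with uncommitted changes in /proj.
recordDeny("/proj", "session-a", "commit your changes");

// A different session (for example after /fork) starts in the same cwd
// and the same process. It looks up any pending block for itself.
const staleViaCwd = byCwd.get("/proj"); // inherits session A's reason
const freshViaSession = bySession.get("session-b"); // nothing pending
```

Keyed by cwd, the new session would have session A's directive injected into its first turn; keyed by session id, it starts clean, which is the behavior the diff relies on alongside the `session_shutdown` cleanup.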