Skip to content
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,9 @@

## Unreleased

### Fixes
- Make `require-*-before-stop` policies actually enforce on OpenCode (sst/opencode). Empirically observed: a Stop hook fired (visible in the dashboard activity feed) but the agent stopped without retry, identical failure mode to Cursor pre-#318 and Copilot pre-#299. Root cause: the OpenCode plugin shim subscribes to `session.idle` (canonical `Stop`) and `policy-evaluator.ts` had no `cli === "opencode"` branch for `Stop` / `SubagentStop`, so OpenCode fell through to the generic Claude-shape `exitCode: 2 + stderr` response — which the shim's `applyDecision` turns into `throw new Error(reason)` from inside the `session.idle` event callback. By that point OpenCode has already gone idle; the throw is logged at most. The shim does have a working force-retry channel: when stdout JSON contains `hookSpecificOutput.additionalContext`, it submits the text via `client.session.prompt({path: {id: sessionID}, body: {parts: [{type: "text", text}]}})`, which becomes a new user message that re-triggers the agent loop — exactly the same model as Cursor's `followup_message` (#318) and Copilot's `{decision: "block", reason}` (#299). New `cli === "opencode" && eventType in {Stop, SubagentStop}` arms in both the deny and instruct paths of `policy-evaluator.ts` emit `{hookSpecificOutput: {additionalContext: <MANDATORY ACTION reasonText>}}` ahead of the generic Stop fall-through. The shim's `applyDecision` is promoted to `async` and now awaits `client.session.prompt` for `Stop` / `SubagentStop` events specifically (fire-and-forget for tool events, so we don't add SDK round-trip latency on the hot path) — without the await, OpenCode could tear down the plugin context before the SDK call completes. New unit tests in `policy-evaluator.test.ts` pin the deny + instruct shapes for opencode (Stop and SubagentStop) plus a regression test confirming PreToolUse still uses the Claude `permissionDecision: "deny"` shape; new tests in `opencode-plugin-shim.test.ts` assert the shim awaits `client.session.prompt` on `session.idle` (handler stays pending until the SDK call resolves), swallows SDK rejection (agent is exiting anyway), and still throws on exit-2 stderr (back-compat with stale binaries). Pi `agent_end` remains observation-only by upstream design — Pi's agent loop has already exited when `agent_end` fires and the `AgentEndEventResult` exposes no `block` field, documented in `CLAUDE.md`. Gemini `AfterAgent` (canonical `Stop`) was already correctly emitting `{decision: "block", reason}`; new unit tests in `policy-evaluator.test.ts` pin both the deny and instruct shapes to prevent regression.

Comment thread
NiveditJain marked this conversation as resolved.
## 0.0.10-beta.7 — 2026-05-08

### Fixes
Expand Down
62 changes: 62 additions & 0 deletions __tests__/hooks/opencode-plugin-shim.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -385,6 +385,68 @@ describe("OpenCode plugin shim — translation of binary response to plugin acti
await hooks["permission.ask"]!({ tool: "bash", sessionID: "s" }, out);
expect(out.status).toBe("deny");
});

it("event session.idle + additionalContext → AWAITS client.session.prompt (Stop force-retry)", async () => {
// For Stop events the prompt is the only force-retry channel — it MUST
// land before the plugin handler returns or OpenCode tears down the
// plugin context. Verify by making session.prompt return a Promise that
// resolves on a tick we control: if the handler awaits, it won't return
// until we tick.
let resolvePrompt: () => void = () => {};
const promptPromise = new Promise<void>((res) => { resolvePrompt = res; });
const client = { session: { prompt: vi.fn().mockReturnValue(promptPromise) } };

responses.push({
status: 0,
stdout: JSON.stringify({ hookSpecificOutput: { additionalContext: "Run tests before stopping" } }),
stderr: "",
});
const { plugin } = await setup();
const hooks = await plugin({ client, directory: "/repo" });

// Fire the handler and observe that it does NOT resolve until prompt resolves.
let handlerSettled = false;
const handlerDone = (async () => {
await hooks.event!({ event: { type: "session.idle", properties: { sessionID: "ses_1" } } });
handlerSettled = true;
})();

// Yield to the microtask queue; the handler should still be pending.
await new Promise((r) => setImmediate(r));
expect(handlerSettled).toBe(false);
expect(client.session.prompt).toHaveBeenCalledTimes(1);
const arg = client.session.prompt.mock.calls[0][0] as { path: { id: string }; body: { parts: Array<{ type: string; text: string }> } };
expect(arg.path.id).toBe("ses_1");
expect(arg.body.parts[0]).toEqual({ type: "text", text: "Run tests before stopping" });

// Resolve the SDK call; the handler should now finish.
resolvePrompt();
await handlerDone;
expect(handlerSettled).toBe(true);
});

it("event session.idle + SDK rejection on prompt is swallowed (agent already exiting)", async () => {
responses.push({
status: 0,
stdout: JSON.stringify({ hookSpecificOutput: { additionalContext: "..." } }),
stderr: "",
});
const client = { session: { prompt: vi.fn().mockRejectedValue(new Error("network")) } };
const { plugin } = await setup();
const hooks = await plugin({ client, directory: "/repo" });
await expect(
hooks.event!({ event: { type: "session.idle", properties: { sessionID: "ses_1" } } }),
).resolves.toBeUndefined();
});

it("event session.idle + exit 2 → still throws (back-compat with stale binaries)", async () => {
responses.push({ status: 2, stdout: "", stderr: "MANDATORY: commit before stopping" });
const { plugin } = await setup();
const hooks = await plugin({ client: fakeClient(), directory: "/repo" });
await expect(
hooks.event!({ event: { type: "session.idle", properties: { sessionID: "ses_1" } } }),
).rejects.toThrow(/MANDATORY: commit before stopping/);
});
});

describe("OpenCode plugin shim — spawn options and registration", () => {
Expand Down
144 changes: 144 additions & 0 deletions __tests__/hooks/policy-evaluator.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -251,6 +251,63 @@ describe("hooks/policy-evaluator", () => {
expect(parsed.followup_message).toContain("subagent verification pending");
});

it("OpenCode Stop + instruct emits {hookSpecificOutput.additionalContext} JSON on stdout (NOT exit 2)", async () => {
// Mirrors the Copilot/Cursor instruct-on-Stop fixes: without a per-CLI
// branch, opencode instruct verdicts fall through to exit-2 + stderr,
// which the OpenCode shim throws from inside session.idle — a no-op
// because the agent has already gone idle. The shim's only working
// force-retry channel is hookSpecificOutput.additionalContext routed
// through client.session.prompt.
registerPolicy("verify", "desc", () => ({
decision: "instruct",
reason: "needs verification",
}), { events: ["Stop"] });

const result = await evaluatePolicies("Stop", {}, { cli: "opencode" });
expect(result.exitCode).toBe(0);
expect(result.stderr).toBe("");
expect(result.decision).toBe("instruct");
const parsed = JSON.parse(result.stdout) as { hookSpecificOutput?: { additionalContext?: string } };
expect(parsed.hookSpecificOutput?.additionalContext).toContain("MANDATORY ACTION REQUIRED");
expect(parsed.hookSpecificOutput?.additionalContext).toContain("needs verification");
});

it("OpenCode SubagentStop + instruct also emits {hookSpecificOutput.additionalContext} JSON", async () => {
// Forward-compat parity with the SubagentStop deny test — see comment
// there for why we widen even though OpenCode doesn't yet expose
// subagent boundaries to plugins.
registerPolicy("verify", "desc", () => ({
decision: "instruct",
reason: "subagent verification pending",
}), { events: ["SubagentStop"] });

const result = await evaluatePolicies("SubagentStop", {}, { cli: "opencode" });
expect(result.exitCode).toBe(0);
expect(result.stderr).toBe("");
expect(result.decision).toBe("instruct");
const parsed = JSON.parse(result.stdout) as { hookSpecificOutput?: { additionalContext?: string } };
expect(parsed.hookSpecificOutput?.additionalContext).toContain("subagent verification pending");
});

it("Gemini Stop + instruct emits {decision:'block', reason} JSON on stdout (force-retry via AfterAgent)", async () => {
// Confirms the existing cli==="gemini" Stop arm in the instruct path
// emits the documented force-retry shape. Pairs with the deny-path
// test in the "Stop event deny format" describe block.
registerPolicy("verify", "desc", () => ({
decision: "instruct",
reason: "needs verification",
}), { events: ["Stop"] });

const result = await evaluatePolicies("Stop", {}, { cli: "gemini" });
expect(result.exitCode).toBe(0);
expect(result.stderr).toBe("");
expect(result.decision).toBe("instruct");
const parsed = JSON.parse(result.stdout) as { decision?: string; reason?: string };
expect(parsed.decision).toBe("block");
expect(parsed.reason).toContain("MANDATORY ACTION REQUIRED");
expect(parsed.reason).toContain("needs verification");
});

it("accumulates multiple instruct messages", async () => {
registerPolicy("first", "desc", () => ({
decision: "instruct",
Expand Down Expand Up @@ -605,6 +662,93 @@ describe("hooks/policy-evaluator", () => {
await evaluatePolicies("Stop", {});
expect(secondPolicyCalled.value).toBe(false);
});

it("OpenCode Stop deny emits {hookSpecificOutput.additionalContext} JSON on stdout (NOT exit 2)", async () => {
// OpenCode's `session.idle` event is notification-only — by the time
// the plugin handler fires, the agent has already gone idle and a
// thrown error from the handler does not force-retry. The shim's only
// working force-retry channel is `client.session.prompt(...)`, which
// it routes through `hookSpecificOutput.additionalContext`. Without
// this branch, the 5 require-*-before-stop builtins were
// observation-only on OpenCode — the deny was logged in the dashboard
// but the agent stopped silently.
registerPolicy("stop-blocker", "desc", () => ({
decision: "deny",
reason: "changes not committed",
}), { events: ["Stop"] });

const result = await evaluatePolicies("Stop", {}, { cli: "opencode" });
expect(result.exitCode).toBe(0);
expect(result.stderr).toBe("");
const parsed = JSON.parse(result.stdout) as { hookSpecificOutput?: { additionalContext?: string; permissionDecision?: string } };
expect(parsed.hookSpecificOutput?.additionalContext).toContain("MANDATORY ACTION REQUIRED");
expect(parsed.hookSpecificOutput?.additionalContext).toContain("changes not committed");
// Load-bearing: must NOT emit exit-2 (the shim would `throw` from
// session.idle, which OpenCode logs but ignores) or
// permissionDecision: "deny" (the shim would also throw on that path).
expect(parsed.hookSpecificOutput?.permissionDecision).toBeUndefined();
expect(result.decision).toBe("deny");
});

it("OpenCode SubagentStop deny also emits {hookSpecificOutput.additionalContext} JSON (forward-compat)", async () => {
// OpenCode does not yet expose subagent boundaries to plugins, but
// custom policies subscribing to SubagentStop should still get the
// force-retry shape if OpenCode adds the bus event later. Mirrors the
// Cursor + Copilot SubagentStop widening.
registerPolicy("subagent-blocker", "desc", () => ({
decision: "deny",
reason: "subagent left work undone",
}), { events: ["SubagentStop"] });

const result = await evaluatePolicies("SubagentStop", {}, { cli: "opencode" });
expect(result.exitCode).toBe(0);
const parsed = JSON.parse(result.stdout) as { hookSpecificOutput?: { additionalContext?: string } };
expect(parsed.hookSpecificOutput?.additionalContext).toContain("MANDATORY ACTION REQUIRED");
expect(parsed.hookSpecificOutput?.additionalContext).toContain("subagent left work undone");
expect(result.decision).toBe("deny");
});

it("OpenCode PreToolUse deny still uses Claude permissionDecision shape (regression: tool-event path unchanged)", async () => {
// The Stop arm is added INSIDE the cli==="opencode" branch so the
// generic Claude shape below it is unchanged for tool events. The
// shim's applyDecision throws on permissionDecision: "deny" — that's
// the working path for tool events.
registerPolicy("tool-blocker", "desc", () => ({
decision: "deny",
reason: "blocked tool",
}), { events: ["PreToolUse"] });

const result = await evaluatePolicies(
"PreToolUse",
{ tool_name: "Bash", tool_input: { command: "ls" } },
{ cli: "opencode" },
);
expect(result.exitCode).toBe(0);
const parsed = JSON.parse(result.stdout) as { hookSpecificOutput?: { permissionDecision?: string; additionalContext?: string } };
expect(parsed.hookSpecificOutput?.permissionDecision).toBe("deny");
// No additionalContext on tool events — that's a Stop-only shape.
expect(parsed.hookSpecificOutput?.additionalContext).toBeUndefined();
});

it("Gemini Stop deny emits {decision:'block', reason} JSON on stdout (force-retry via AfterAgent)", async () => {
// Per Gemini's hooks docs (https://geminicli.com/docs/hooks/), the
// AfterAgent hook (canonical "Stop") force-retries when the hook
// returns `{decision: "block", reason}` on stdout. Exit-2 is per-
// action only ("turn continues") and would NOT trigger the retry.
registerPolicy("stop-blocker", "desc", () => ({
decision: "deny",
reason: "tests not run",
}), { events: ["Stop"] });

const result = await evaluatePolicies("Stop", {}, { cli: "gemini" });
expect(result.exitCode).toBe(0);
expect(result.stderr).toBe("");
const parsed = JSON.parse(result.stdout) as { decision?: string; reason?: string };
expect(parsed.decision).toBe("block");
expect(parsed.reason).toContain("MANDATORY ACTION REQUIRED");
expect(parsed.reason).toContain("tests not run");
expect(result.decision).toBe("deny");
});
});

describe("workflow policy chain integration", () => {
Expand Down
Loading