Run Claude and Codex ACP adapters in VM containers#996
Conversation
❌ Tests failed — 4/1265 failed
Failed tests
|
Greptile SummaryThis PR replaces the host-process ACP runtimes for Claude and Codex with VM-backed container runtimes (matching the existing Hermes pattern), adds a shared
Confidence Score: 4/5Safe to merge with awareness of the reliability concerns around version-pinning and the 5-second polling cadence. The core runtime migration is well-structured and mirrors the proven Hermes pattern. The VM deduplication fix is correct, the readiness-probe retry loop is necessary given the slow npm-install entrypoint, and error handling in the WebSocket open handler was correctly added. The main concerns are operational: both new container runtimes install latest packages on every restart risking version drift and slow cold starts, the /targets HTTP handler propagates subprocess errors as 500s rather than gracefully returning an empty list, the adapter health check now polls every 5 s unconditionally, and TerminalTargetId is duplicated across three files instead of living in the shared package. codex-container-runtime.ts and claude-container-runtime.ts (start commands), terminal.ts (listRunningContainers error path), and useAgents.ts (polling interval). Important Files Changed
Sequence DiagramsequenceDiagram
participant App as Application (main.ts)
participant SCBE as startContainerRuntimeBestEffort
participant CR as ClaudeRuntime / CodexRuntime / HermesRuntime
participant VM as VmRuntime (ensureReady)
participant Ctr as Container (nerdctl)
participant UI as Agent UI
App->>SCBE: configureClaudeRuntime()
App->>SCBE: configureCodexRuntime()
App->>SCBE: configureHermesRuntime()
SCBE->>CR: executeAction install
SCBE->>CR: executeAction start
CR->>VM: ensureReady()
Note over VM: module-level promise map deduplicates concurrent calls
VM-->>CR: VM ready
CR->>Ctr: nerdctl pull image
CR->>Ctr: nerdctl run npm install then sleep
CR->>Ctr: readinessProbe loop up to 120 s
Ctr-->>CR: command -v claude/codex OK
CR-->>App: runtime ready
UI->>App: "GET /terminal/targets?agentId=X"
App->>Ctr: ContainerCli.ps() running containers
App-->>UI: targets array
UI->>App: "WS /terminal/ws?target=claude&agentId=X"
App->>Ctr: nerdctl exec -it -e HOME -w agentHome /bin/sh
Ctr-->>UI: PTY stream
Prompt To Fix All With AIFix the following 5 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 5
packages/browseros-agent/apps/server/src/api/routes/terminal.ts:115-128
**Unhandled error from `listRunningContainers` causes 500 on `/targets`**
If the VM is unavailable or `limactl` fails, `deps.listRunningContainers()` throws and the entire route handler propagates a 500. The frontend silently ignores non-OK responses, so users see a stale/empty target dropdown during VM startup with no server-side diagnostic logged. A try/catch that falls back to `runningContainers = undefined` (skipping the filter) or returns `{ targets: [] }` would be more resilient.
### Issue 2 of 5
packages/browseros-agent/apps/agent/entrypoints/app/agents/AgentsPage.tsx:463-471
**`inferTerminalTarget` matches on display label strings**
The fallback path for resolving the terminal target compares `runtimeLabel.toLowerCase()` against hard-coded strings like `'claude code'`. If a display name is updated elsewhere (e.g., in `RuntimeDescriptor.displayName`), this silently returns `null` and the "Open terminal" action becomes a no-op without any user feedback. Prefer using the `adapter` field from `harnessAgentLookup` exclusively, or derive a stable adapter ID that doesn't depend on UI labels.
### Issue 3 of 5
packages/browseros-agent/apps/agent/entrypoints/app/agents/AgentTerminal.tsx:155
**`TerminalTargetId` and `TerminalTargetOption` are duplicated client-side**
`TerminalTargetId` is defined independently in `AgentTerminal.tsx`, `AgentsPage.tsx`, and on the server in `terminal-session.ts`. When a new target is added server-side, the two client-side type definitions and the `parseTargetId` guard must also be updated manually. Since these are already exported from the shared `@browseros/shared` package, consider exporting the type (or a plain constant array of valid IDs) from there to keep a single source of truth.
### Issue 4 of 5
packages/browseros-agent/apps/agent/entrypoints/app/agents/useAgents.ts:72-73
**Unconditional 5-second adapter health polling**
`refetchInterval: 5_000` runs whenever the component is mounted with `enabled = true`, regardless of whether any container-backed adapter is registered. Each poll triggers a server-side request; at the server this may also invoke `ContainerCli.ps()` (a subprocess). Consider polling only when at least one container runtime is known to be starting, or backing off once all adapters are healthy.
### Issue 5 of 5
packages/browseros-agent/apps/server/src/lib/agents/runtime/codex-container-runtime.ts:50-52
**`apt-get update` and `@latest` npm installs run on every container restart**
`CODEX_START_COMMAND` runs `apt-get update` and installs `@openai/codex@latest` each time the container starts. On a server restart the 120 s readiness timeout re-applies, and the installed package version may differ from the previous run. Similarly `CLAUDE_CODE_START_COMMAND` pins `@anthropic-ai/claude-code@latest`, making both runtimes susceptible to spontaneous version drift between restarts. Pinning to an explicit version (or using a baked image) would make the behaviour more reproducible.
Reviews (1): Last reviewed commit: "feat(agent): run claude and codex in VM ..." | Re-trigger Greptile |
| .get('/targets', async (c) => { | ||
| let runningContainers: Set<string> | undefined | ||
| if (deps.listRunningContainers) { | ||
| runningContainers = new Set(await deps.listRunningContainers()) | ||
| } | ||
| return c.json({ | ||
| targets: listTerminalTargets({ | ||
| browserosDir: deps.browserosDir, | ||
| agentId: c.req.query('agentId'), | ||
| runningContainers, | ||
| openclawContainerName: deps.containerName, | ||
| }), | ||
| }) | ||
| }) |
There was a problem hiding this comment.
Unhandled error from
listRunningContainers causes 500 on /targets
If the VM is unavailable or limactl fails, deps.listRunningContainers() throws and the entire route handler propagates a 500. The frontend silently ignores non-OK responses, so users see a stale/empty target dropdown during VM startup with no server-side diagnostic logged. A try/catch that falls back to runningContainers = undefined (skipping the filter) or returns { targets: [] } would be more resilient.
Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/browseros-agent/apps/server/src/api/routes/terminal.ts
Line: 115-128
Comment:
**Unhandled error from `listRunningContainers` causes 500 on `/targets`**
If the VM is unavailable or `limactl` fails, `deps.listRunningContainers()` throws and the entire route handler propagates a 500. The frontend silently ignores non-OK responses, so users see a stale/empty target dropdown during VM startup with no server-side diagnostic logged. A try/catch that falls back to `runningContainers = undefined` (skipping the filter) or returns `{ targets: [] }` would be more resilient.
How can I resolve this? If you propose a fix, please make it concise.| </div> | ||
| ) | ||
| } | ||
|
|
||
| function inferTerminalTarget(label: string): TerminalTargetId | null { | ||
| const lower = label.toLowerCase() | ||
| if (lower === 'claude code') return 'claude' | ||
| if (lower === 'codex') return 'codex' | ||
| if (lower === 'hermes') return 'hermes' |
There was a problem hiding this comment.
inferTerminalTarget matches on display label strings
The fallback path for resolving the terminal target compares runtimeLabel.toLowerCase() against hard-coded strings like 'claude code'. If a display name is updated elsewhere (e.g., in RuntimeDescriptor.displayName), this silently returns null and the "Open terminal" action becomes a no-op without any user feedback. Prefer using the adapter field from harnessAgentLookup exclusively, or derive a stable adapter ID that doesn't depend on UI labels.
Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/browseros-agent/apps/agent/entrypoints/app/agents/AgentsPage.tsx
Line: 463-471
Comment:
**`inferTerminalTarget` matches on display label strings**
The fallback path for resolving the terminal target compares `runtimeLabel.toLowerCase()` against hard-coded strings like `'claude code'`. If a display name is updated elsewhere (e.g., in `RuntimeDescriptor.displayName`), this silently returns `null` and the "Open terminal" action becomes a no-op without any user feedback. Prefer using the `adapter` field from `harnessAgentLookup` exclusively, or derive a stable adapter ID that doesn't depend on UI labels.
How can I resolve this? If you propose a fix, please make it concise.| enabled: Boolean(baseUrl) && !urlLoading && enabled, | ||
| refetchInterval: enabled ? 5_000 : false, |
There was a problem hiding this comment.
Unconditional 5-second adapter health polling
refetchInterval: 5_000 runs whenever the component is mounted with enabled = true, regardless of whether any container-backed adapter is registered. Each poll triggers a server-side request; at the server this may also invoke ContainerCli.ps() (a subprocess). Consider polling only when at least one container runtime is known to be starting, or backing off once all adapters are healthy.
Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/browseros-agent/apps/agent/entrypoints/app/agents/useAgents.ts
Line: 72-73
Comment:
**Unconditional 5-second adapter health polling**
`refetchInterval: 5_000` runs whenever the component is mounted with `enabled = true`, regardless of whether any container-backed adapter is registered. Each poll triggers a server-side request; at the server this may also invoke `ContainerCli.ps()` (a subprocess). Consider polling only when at least one container runtime is known to be starting, or backing off once all adapters are healthy.
How can I resolve this? If you propose a fix, please make it concise.| export interface CodexRuntimeConfig { | ||
| browserosDir: string | ||
| codexHarnessHostDir: string |
There was a problem hiding this comment.
apt-get update and @latest npm installs run on every container restart
CODEX_START_COMMAND runs apt-get update and installs @openai/codex@latest each time the container starts. On a server restart the 120 s readiness timeout re-applies, and the installed package version may differ from the previous run. Similarly CLAUDE_CODE_START_COMMAND pins @anthropic-ai/claude-code@latest, making both runtimes susceptible to spontaneous version drift between restarts. Pinning to an explicit version (or using a baked image) would make the behaviour more reproducible.
Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/browseros-agent/apps/server/src/lib/agents/runtime/codex-container-runtime.ts
Line: 50-52
Comment:
**`apt-get update` and `@latest` npm installs run on every container restart**
`CODEX_START_COMMAND` runs `apt-get update` and installs `@openai/codex@latest` each time the container starts. On a server restart the 120 s readiness timeout re-applies, and the installed package version may differ from the previous run. Similarly `CLAUDE_CODE_START_COMMAND` pins `@anthropic-ai/claude-code@latest`, making both runtimes susceptible to spontaneous version drift between restarts. Pinning to an explicit version (or using a baked image) would make the behaviour more reproducible.
How can I resolve this? If you propose a fix, please make it concise.
Summary
Test Plan
Follow-up