Skip to content

[codex] Add saved browser conversation follow-up API#231

Draft
pdurlej wants to merge 1 commit into
steipete:mainfrom
pdurlej:codex/browser-follow-up-api
Draft

[codex] Add saved browser conversation follow-up API#231
pdurlej wants to merge 1 commit into
steipete:mainfrom
pdurlej:codex/browser-follow-up-api

Conversation

@pdurlej
Copy link
Copy Markdown
Contributor

@pdurlej pdurlej commented May 30, 2026

Summary

Adds a public follow-up API for continuing an existing saved ChatGPT browser conversation without mutating the parent Oracle session.

  • Adds oracle follow-up <parentSessionId> [prompt] with --prompt, --slug, --wait/--no-wait, and --no-recover.
  • Adds MCP tool follow_up returning sessionId, parentSessionId, status, and logTail.
  • Creates child browser sessions linked with parentSessionId / followUpOfSessionId, copying parent browser config while forcing follow-up runs out of Deep Research mode.
  • Keeps follow-up v1 prompt-only and rejects new files/attachments with a clear error.
  • Uses the detached runner/finalizer path so browser work can survive MCP client timeouts.

Recovery fixes

This also hardens the long-browser-run recovery path:

  • Adds oracle-await render-poll helper for stale status:"running" sessions.
  • Adds a detached browser finalizer that renders/rechecks orphaned sessions.
  • Treats status:"error" plus an existing non-empty artifacts/transcript.md as a completed captured result, covering wrapper cleanup crashes such as Node/undici setTypeOfService EINVAL after the answer was already saved.

Validation

  • pnpm exec vitest run tests/cli/sessionFinalizer.test.ts tests/cli/browserFollowUp.test.ts tests/mcp/followUp.test.ts tests/cli/promptRequirement.test.ts tests/cli/sessionCommand.test.ts tests/mcp/sessions.test.ts tests/mcp/consult.test.ts tests/browser/recoverConversation.test.ts tests/browser/sessionRunner.test.ts tests/browser/index.test.ts tests/cli/integrationCli.test.ts
    • 11 files, 137 tests passed
  • pnpm run typecheck
  • pnpm run build
  • shell smoke: oracle-await returns READY for status:error when artifacts/transcript.md is present

Notes

Upstream already has --browser-follow-up for planned extra turns inside a fresh browser consult and API --followup for Responses API chaining. This PR covers a different path: sending a new prompt to an already saved/recoverable ChatGPT browser conversation as a separate child session.

@clawsweeper
Copy link
Copy Markdown

clawsweeper Bot commented May 30, 2026

Codex review: needs real behavior proof before merge. Reviewed May 30, 2026, 1:37 AM ET / 05:37 UTC.

Summary
Adds a saved ChatGPT browser conversation follow-up command, MCP follow_up tool, detached browser finalizer/recovery helper, session parent metadata, docs, and tests.

Reproducibility: not applicable. as a feature PR. Source review confirms current main has API follow-up and planned in-run browser follow-ups, but not this saved browser conversation child-session API.

Review metrics: 3 noteworthy metrics.

  • Diff size: 19 files, +1542/-36. The patch spans CLI, MCP, browser recovery, session metadata, skill docs, and tests, so compatibility review matters before merge.
  • Public surfaces: 1 CLI command, 1 MCP tool, 1 helper script. New user-facing entry points need packaging, docs, and upgrade behavior to line up.
  • Existing MCP tool changed: 1 existing tool behavior changed. The consult tool is not only extended; its browser execution semantics change for existing clients.

Merge readiness
Overall: 🧂 unranked krab
Proof: 🧂 unranked krab
Patch quality: 🦐 gold shrimp
Result: blocked until real behavior proof from a real setup is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • [P1] Add redacted real behavior proof for oracle follow-up or MCP follow_up against a saved ChatGPT browser conversation, including the child transcript and unchanged parent session.
  • Preserve or explicitly gate the existing MCP browser consult blocking behavior.
  • Package or re-scope oracle-await, and update session lineage tests for browser parent metadata.

Proof guidance:

  • [P1] Needs real behavior proof before merge: The PR lists tests/builds and a prose shell smoke, but this browser/MCP runtime change still needs redacted terminal logs, live output, or a recording showing an actual saved ChatGPT follow-up completing. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Mantis proof suggestion
A real ChatGPT browser follow-up is best verified with visible browser or terminal proof rather than unit tests alone. A maintainer can ask Mantis to capture proof by posting a new PR comment that starts with the OpenClaw Mantis account mention, followed by:

visual task: verify oracle follow-up sends a prompt to a saved ChatGPT browser conversation and captures the child transcript without changing the parent session.

Risk before merge

  • [P1] Merging as-is changes the existing MCP browser consult contract: local browser calls can return pending/running status after the short poll window instead of a final answer.
  • [P1] The new oracle-await instructions may fail for normal package users because the helper is not exposed through package bins or included npm files.
  • [P1] The child browser sessions carry parent metadata, but current status lineage code will not render them as children unless it is taught the new fields.
  • [P1] The PR changes browser runtime/recovery behavior without posted real ChatGPT/browser follow-up proof.

Maintainer options:

  1. Preserve MCP consult compatibility (recommended)
    Keep existing browser MCP consults blocking by default and make early detached return an explicit mode or a carefully documented timeout fallback.
  2. Accept an MCP contract break deliberately
    Maintainers can choose the new pending-status behavior, but it should be documented as an intentional API behavior change with upgrade guidance.

Next step before merge

  • [P1] Needs maintainer judgment on the MCP compatibility change and contributor-supplied real browser proof before any repair lane is useful.

Security
Cleared: No concrete security or supply-chain regression found; the diff adds no dependencies, workflows, permissions, or new secret-handling paths.

Review findings

  • [P1] Preserve blocking MCP browser consults by default — src/mcp/tools/consult.ts:663
  • [P2] Ship oracle-await where users can run it — skills/oracle/scripts/oracle-await:1
  • [P2] Wire browser follow-ups into session lineage — src/cli/browserFollowUp.ts:143-144
Review details

Best possible solution:

Land the saved-browser follow-up API only after preserving existing MCP consult compatibility, shipping or scoping the helper correctly, wiring browser parent lineage into session display, and adding real browser proof.

Do we have a high-confidence way to reproduce the issue?

Not applicable as a feature PR. Source review confirms current main has API follow-up and planned in-run browser follow-ups, but not this saved browser conversation child-session API.

Is this the best way to solve the issue?

No. The direction is useful, but the current patch is not the best merge shape until the MCP compatibility change is narrowed or explicitly approved, helper distribution is fixed, and browser lineage/proof are added.

Full review comments:

  • [P1] Preserve blocking MCP browser consults by default — src/mcp/tools/consult.ts:663
    This branch sends every local browser MCP consult through the detached path and only waits ORACLE_MCP_BROWSER_WAIT_MS before returning status. Existing clients that call consult for a browser answer can now receive pending/running plus logs instead of the final answer, so keep the current blocking behavior by default or make early detach an explicit/timeout-fallback mode.
    Confidence: 0.86
  • [P2] Ship oracle-await where users can run it — skills/oracle/scripts/oracle-await:1
    The PR body and runbook tell operators to run oracle-await, but the script is only added under skills/oracle/scripts/ while package.json still publishes only the oracle and oracle-mcp bins and excludes the skills directory. A normal package install will not put this command on PATH, so either package it as a real bin or scope the docs to the skill script path.
    Confidence: 0.82
  • [P2] Wire browser follow-ups into session lineage — src/cli/browserFollowUp.ts:143-144
    The child session records parentSessionId/followUpOfSessionId, but the status tree still resolves lineage only from previousResponseId and followupSessionId. Browser follow-up children have no Responses API previous id, so they will show as root sessions instead of linked children unless the lineage resolver/display is updated for these new fields.
    Confidence: 0.78

Overall correctness: patch is incorrect
Overall confidence: 0.86

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 6019a199e44c.

Label changes

Label changes:

  • add P2: This is a useful but non-urgent browser/MCP feature with bounded compatibility risks.
  • add merge-risk: 🚨 compatibility: The PR changes existing MCP browser consult return behavior and adds public commands/helpers whose availability affects existing users and agents.
  • add rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🧂 unranked krab and patch quality is 🦐 gold shrimp.
  • add status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: The PR lists tests/builds and a prose shell smoke, but this browser/MCP runtime change still needs redacted terminal logs, live output, or a recording showing an actual saved ChatGPT follow-up completing. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Label justifications:

  • P2: This is a useful but non-urgent browser/MCP feature with bounded compatibility risks.
  • merge-risk: 🚨 compatibility: The PR changes existing MCP browser consult return behavior and adds public commands/helpers whose availability affects existing users and agents.
  • rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🧂 unranked krab and patch quality is 🦐 gold shrimp.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: The PR lists tests/builds and a prose shell smoke, but this browser/MCP runtime change still needs redacted terminal logs, live output, or a recording showing an actual saved ChatGPT follow-up completing. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.
Evidence reviewed

What I checked:

  • Repository policy read: AGENTS.md was read fully; browser work requires careful recovery/live smoke consideration and release changelog edits should stay release-owned. (AGENTS.md:4, 6019a199e44c)
  • Current main has related but different follow-up paths: Current main exposes API --followup and in-run --browser-follow-up, but no saved browser conversation follow-up command or MCP follow_up tool. (bin/oracle-cli.ts:380, 6019a199e44c)
  • PR changes existing MCP browser consult behavior: The PR head routes local browser MCP consults through a detached worker and returns status after a short poll window instead of always running performSessionRun inline. (src/mcp/tools/consult.ts:663, 854f5a257201)
  • Helper script is not packaged as a command: The PR adds oracle-await only under the skill scripts path while package.json still publishes only oracle and oracle-mcp and does not include the skills directory in npm files. (package.json:16, 854f5a257201)
  • Lineage display does not read new browser parent fields: Current session lineage resolves only Responses API previousResponseId/followupSessionId, while the PR creates browser children with parentSessionId/followUpOfSessionId and no previousResponseId. (src/cli/sessionLineage.ts:54, 6019a199e44c)
  • Feature history: Browser multi-turn consults and saved conversation recovery have recent/current-main history in the same area, including commits c888fd3 and 6019a19. (src/browser/recoverConversation.ts:27, 6019a199e44c)

Likely related people:

  • Peter Steinberger: Introduced browser multi-turn ChatGPT consult support and appears throughout the central browser/session implementation history. (role: feature history owner; confidence: high; commits: c888fd3cee1a, abb7c9a7d9c8; files: src/browser/index.ts, bin/oracle-cli.ts, src/mcp/tools/consult.ts)
  • Bruce Weaver: Recently added saved URL recovery for --harvest/--live, which this PR builds on for saved browser conversation recovery. (role: recent area contributor; confidence: high; commits: 6019a199e44c; files: src/browser/recoverConversation.ts, src/cli/browserTabs.ts, tests/browser/recoverConversation.test.ts)
  • Cheul: Introduced existing CLI follow-up support and lineage display for Responses API sessions. (role: follow-up lineage contributor; confidence: medium; commits: 2202c337b0bd, c2f833848d0a; files: bin/oracle-cli.ts, src/sessionManager.ts, src/cli/sessionLineage.ts)
  • Piotr Durlej: Has prior merged work in browser multi-turn guardrails and reliability, beyond being the proposer of this branch. (role: recent browser workflow contributor; confidence: medium; commits: 843d5a525bf9, 8dc0b4115eac; files: docs/browser-mode.md, src/browser/index.ts, skills/oracle/SKILL.md)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P2 Normal priority bug or improvement with limited blast radius. merge-risk: 🚨 compatibility 🚨 Merging this PR could break existing users, config, migrations, defaults, or upgrades. labels May 30, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

merge-risk: 🚨 compatibility 🚨 Merging this PR could break existing users, config, migrations, defaults, or upgrades. P2 Normal priority bug or improvement with limited blast radius. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant