fix(browser): extract Deep Research report from out-of-process iframe #232

Open
umutkeltek wants to merge 6 commits into steipete:main from umutkeltek:fix/deep-research-oopif-extraction

@umutkeltek umutkeltek commented May 30, 2026

Problem

Deep Research runs via the browser engine stopped returning results. The session hangs until timeout (or stays stuck running), and harvest captures only the "ChatGPT said:" placeholder instead of the report.

Root cause

ChatGPT moved the Deep Research report out of the inline assistant turn and into a doubly-nested, out-of-process sandboxed iframe:

chatgpt.com/c/…                            → assistantCount: 0  (no report in the main DOM)
 └─ connector_openai_deep_research (OOPIF)  → body empty (sandbox shell)
     └─ nested same-origin iframe           → the actual report ("Research completed in …")

waitForDeepResearchCompletion has two extraction paths:

  • readDeepResearchFrameResult — reads the main page's frame tree via isolated worlds. The OOPIF does not appear in the main page's frame tree, so findDeepResearchFrameId returns null. It can never see the report.
  • readDeepResearchTargetResult — attaches to the iframe's own CDP target and walks its nested frames. This one can read the report.
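
To illustrate why the in-page path comes back empty, here is a minimal sketch of a frame-tree search. The `FrameTreeNode` shape follows the result of CDP's `Page.getFrameTree`, but `findFrameIdByUrl` and the sample URLs are hypothetical stand-ins for illustration, not the project's `findDeepResearchFrameId`.

```typescript
// Minimal shape of a CDP Page.getFrameTree result (only the fields used here).
interface FrameTreeNode {
  frame: { id: string; url: string };
  childFrames?: FrameTreeNode[];
}

// Hypothetical helper: depth-first search for the first frame whose URL contains `needle`.
function findFrameIdByUrl(node: FrameTreeNode, needle: string): string | null {
  if (node.frame.url.includes(needle)) return node.frame.id;
  for (const child of node.childFrames ?? []) {
    const hit = findFrameIdByUrl(child, needle);
    if (hit !== null) return hit;
  }
  return null;
}

// Legacy/inline rendering: the report frame is part of the page's own tree.
const inlineTree: FrameTreeNode = {
  frame: { id: "main", url: "https://chatgpt.com/c/example" },
  childFrames: [
    { frame: { id: "f1", url: "https://example.invalid/connector_openai_deep_research" } },
  ],
};

// OOPIF rendering: the connector frame lives in a separate target,
// so the main page's tree has no matching child at all.
const oopifTree: FrameTreeNode = {
  frame: { id: "main", url: "https://chatgpt.com/c/example" },
};

const inlineHit = findFrameIdByUrl(inlineTree, "connector_openai_deep_research");
const oopifHit = findFrameIdByUrl(oopifTree, "connector_openai_deep_research");
```

With the OOPIF shape the search has nothing to match, which is the `null` result described above.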

The dispatch was Page ? frameResult : client ? targetResult : null. In production both Page and client are always passed (see index.ts and reattach.ts), so it always took the broken frame path and the working target path was effectively dead code. The completion poll never saw finished (no assistant turn in the main DOM either), so the run timed out.
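
The dead-code claim can be checked with a small truth table. This sketch reduces the dispatch expression quoted above to booleans; `legacyDispatch` and the `ReadPath` labels are illustrative names, not code from the repository.

```typescript
type ReadPath = "frame" | "target" | null;

// The pre-fix selection as described above: Page wins whenever it is present.
function legacyDispatch(hasPage: boolean, hasClient: boolean): ReadPath {
  return hasPage ? "frame" : hasClient ? "target" : null;
}

// Every combination of inputs, to show which ones can reach the target path.
const dispatchTable = [
  { hasPage: true, hasClient: true }, // production shape
  { hasPage: true, hasClient: false },
  { hasPage: false, hasClient: true }, // the only shape that reaches "target"
  { hasPage: false, hasClient: false },
].map((row) => ({ ...row, path: legacyDispatch(row.hasPage, row.hasClient) }));
```

Only the row without a Page selects the target path, and production never passes that shape.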

A second guard (hasActiveScopedResearch) also depended on a main-DOM assistant turn that no longer exists, which would have blocked the report even after fixing dispatch.

Fix

  • Prefer the target-attach path (reaches the OOPIF + nested frames); fall back to the in-page frame path for legacy/inline rendering.
  • Treat a target-confirmed completion as authoritative even when the main DOM exposes no assistant turn, since the report now lives entirely in the OOPIF.

The scoped-staleness guard is preserved for the in-page frame path (target-confirmed completions read the live connector iframe directly, so they don't need the main-DOM corroboration).
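
A minimal sketch of the relaxed guard, assuming a read records which path produced it. The `DeepResearchRead` shape and `shouldAcceptCompletion` are hypothetical; only the `hasActiveScopedResearch` name comes from the description above, and it is modelled here as a plain boolean.

```typescript
// Hypothetical result shape; `source` records which extraction path produced the read.
interface DeepResearchRead {
  source: "target" | "frame";
  completed: boolean;
  text: string;
}

// Sketch of the relaxed guard: during a scoped run, a completed frame read still needs
// main-DOM corroboration, while a completed target read is accepted on its own.
function shouldAcceptCompletion(
  read: DeepResearchRead,
  scopedRun: boolean,
  hasActiveScopedResearch: boolean,
): boolean {
  if (!read.completed) return false;
  if (read.source === "target") return true;
  return !scopedRun || hasActiveScopedResearch;
}

// Production shape after the OOPIF change: scoped run, no assistant turn in the main DOM.
const targetAccepted = shouldAcceptCompletion(
  { source: "target", completed: true, text: "Research completed in ..." },
  true,
  false,
);
// The staleness protection still applies to the in-page frame path.
const staleFrameAccepted = shouldAcceptCompletion(
  { source: "frame", completed: true, text: "older report" },
  true,
  false,
);
```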

Verification

Reattaching to a real completed Deep Research session now works end-to-end through the actual CLI:

Monitoring Deep Research (timeout: 40min)...
Deep Research completed (6s elapsed)
[browser] Saved deep-research-report artifact … deep-research-report.md (26.7 KB)
Reattach succeeded; session marked completed.

Previously this same flow extracted only "ChatGPT said:" and ran to timeout.

Tests

  • Adds a regression test: production shape (both Page and client passed, scoped run, minTurnIndex >= 0) where the report is only reachable via the target path and the main DOM has no assistant turn → returns the report instead of hanging.
  • Existing staleness-protection test (does not complete from an unscoped frame result during a scoped run) still passes — the guard relaxation only applies to target-confirmed completions.
  • deepResearch.test.ts: 28 passed. Full browser suite: 399 passed / 1 skipped.

Known follow-up (not in this PR)

readDeepResearchTargetResult enumerates targets browser-wide (Target.getTargets). With multiple ChatGPT tabs open in the persistent profile, each holding a completed Deep Research report, a run could in principle pick up another tab's report. A follow-up should scope target enumeration to the current page session (the client is already connected to the page target) and drop the browser-wide scan.


Follow-up: cross-tab scoping + wrapper-path binding + fresh live proof

The first review flagged that target-first extraction scanned browser-wide CDP targets (cross-tab leak). Addressed in two commits:

  • 53b7b331 — scope discovery to the current page via page-session auto-attach; drop the browser-wide Target.getTargets/attachToTarget scan.
  • 95903b61 — the page-scoping only held for direct-tab clients. On the browser-WSEndpoint path, client is a session-bound wrapper (createSessionBoundChromeClient) whose raw send is browser-level, so Target.setAutoAttach via send still went browser-wide there. Now the wrapper is tagged with its page session id (oraclePageSessionId) and auto-attach is bound to it explicitly. The cross-tab boundary now holds on both connection paths.
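
The binding in 95903b61 can be sketched as follows. The client shape is an assumption for illustration (a raw `send` that takes an optional session id, plus the `oraclePageSessionId` tag named above); `setPageScopedAutoAttach` and the recording fake are hypothetical, and the `Target.setAutoAttach` parameter values are one plausible choice, not necessarily the ones the PR uses.

```typescript
// Minimal client shape assumed for this sketch.
interface SessionTaggedClient {
  send(method: string, params?: Record<string, unknown>, sessionId?: string): Promise<unknown>;
  oraclePageSessionId?: string;
}

// Enable or disable auto-attach, always passing the page session id when the client has one.
// On a direct-tab client the tag is undefined and `send` keeps its default session.
async function setPageScopedAutoAttach(client: SessionTaggedClient, enable: boolean): Promise<void> {
  await client.send(
    "Target.setAutoAttach",
    { autoAttach: enable, waitForDebuggerOnStart: false, flatten: true },
    client.oraclePageSessionId,
  );
}

// Recording fake standing in for a session-bound wrapper, in the spirit of the regression test.
const autoAttachCalls: Array<{ method: string; sessionId: string | undefined }> = [];
const wrapperClient: SessionTaggedClient = {
  oraclePageSessionId: "page-session-1",
  send: async (method, _params, sessionId) => {
    autoAttachCalls.push({ method, sessionId });
    return {};
  },
};

// Enable, then disable, as the real flow would around a scan.
const autoAttachDone: Promise<void> = (async () => {
  await setPageScopedAutoAttach(wrapperClient, true);
  await setPageScopedAutoAttach(wrapperClient, false);
})();
```

Passing the session id on both the enable and the disable call matters: an unbound disable would otherwise be sent at browser level on the wrapper path.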

Regression tests (both verified to FAIL against the pre-fix source):

  • foreign completed Deep Research target is never read (page-scoped discovery)
  • every Target.setAutoAttach on a session-bound wrapper client is bound to the page session

Fresh live multi-tab proof (latest head 95903b61) — a signed-in Chrome with two completed Deep Research tabs open; the real waitForDeepResearchCompletion run scoped to each tab returns that tab's own report, no cross-contamination:

{
  "tabA": { "name": "v2 (Routing Cost Map)",  "len": 27099,
            "head": "Current LLM Inference Routing Cost Map / Scope and reading notes ...",
            "hasOwn_RoutingMap": true,  "leaked_fromB": false },
  "tabB": { "name": "v1 (Provider Cost Map)", "len": 14317,
            "head": "LLM Inference Provider Cost Map / Scope and normalization ...",
            "hasOwn_ProviderMap": true, "leaked_fromA": false }
}
// ISOLATION PASS — each tab returned its OWN report, no cross-contamination

Both connector OOPIFs (connector_openai_deep_research.*) were present simultaneously; tab A extracted the Routing report and tab B the Provider report, each from its own page-scoped session.

Verification: tests/browser/deepResearch.test.ts 30 passed; full browser suite 401 passed / 1 skipped; typecheck (changed files) / format / lint clean.


Update — candidate-order fix, clean-up, and re-proof at HEAD (f5428df9)

Addressing the re-review of the page-session-binding commit:

Correctness (70375b47, 8a9b98c2)

  • A page can expose more than one Deep Research iframe target. Target scanning now returns a completed read immediately and only keeps the best in-progress/text-bearing read when none completed — it no longer returns the first in-progress target and misses a later completed OOPIF.
  • waitForDeepResearchCompletion no longer lets an incomplete target read suppress the in-page frame fallback; a completed target read stays authoritative, otherwise the frame path runs (preserving legacy inline/mixed rendering — the compatibility concern).
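
The scan order described in the first bullet can be sketched like this. `TargetRead` and `scanTargetReads` are hypothetical names, and "best" is interpreted here as "most text", which is an assumption rather than the PR's exact rule.

```typescript
// Hypothetical shape of one read from a page-scoped target session.
interface TargetRead {
  completed: boolean;
  text: string;
}

// Scan every page-scoped read: a completed one wins immediately; otherwise keep the
// in-progress read with the most text.
function scanTargetReads(reads: ReadonlyArray<TargetRead | null>): TargetRead | null {
  let best: TargetRead | null = null;
  for (const read of reads) {
    if (read === null) continue;
    if (read.completed) return read;
    if (best === null || read.text.length > best.text.length) best = read;
  }
  return best;
}

// A stale in-progress target attached before the completed report no longer hides it.
const scanned = scanTargetReads([
  { completed: false, text: "Researching..." },
  { completed: true, text: "Research completed in ..." },
]);
```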

Clean-up (f5428df9)

  • Extracted the read-selection logic into a documented, unit-tested helper pickPreferredDeepResearchRead; renamed the misleading frameResult local to read.
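
For readers who want the selection rules in one place, here is a minimal sketch of such a helper. It is not the PR's `pickPreferredDeepResearchRead`; the name `pickPreferredReadSketch`, the `Read` shape, and the tie-break for two in-progress reads (longer text wins, target on ties) are assumptions.

```typescript
interface Read {
  completed: boolean;
  text: string;
}

function pickPreferredReadSketch(targetRead: Read | null, frameRead: Read | null): Read | null {
  // A completed target read is authoritative.
  if (targetRead?.completed) return targetRead;
  // An incomplete target read must not hide a completed in-page read (legacy inline path).
  if (frameRead?.completed) return frameRead;
  // Nothing completed: keep whichever in-progress read exists for progress logging.
  if (targetRead === null) return frameRead;
  if (frameRead === null) return targetRead;
  return frameRead.text.length > targetRead.text.length ? frameRead : targetRead;
}

// Legacy inline case: no target read, completed in-page read.
const legacyPick = pickPreferredReadSketch(null, { completed: true, text: "inline report" });
// Mixed case: in-progress target, completed in-page read.
const mixedPick = pickPreferredReadSketch(
  { completed: false, text: "Researching..." },
  { completed: true, text: "inline report" },
);
```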

Tests (each verified to fail against the pre-fix source):

  • first-target-in-progress + later-target-completed → returns the completed report
  • in-progress target + completed in-page frame → returns the frame report (legacy inline)
  • 6 direct unit tests for the selection helper, including the legacy no-target + completed-frame case
  • tests/browser/deepResearch.test.ts: all pass; full browser suite 410 passed / 1 skipped; typecheck (changed files) / format / lint clean.

Fresh live multi-tab proof at HEAD f5428df9 — signed-in Chrome, two completed Deep Research tabs open simultaneously (both connector OOPIFs present); the real waitForDeepResearchCompletion scoped to each tab returns that tab's own report:

{
  "head": "f5428df9",
  "tabA": { "name": "v2 (Routing Cost Map)",  "len": 27099,
            "head": "Current LLM Inference Routing Cost Map / Scope and reading notes ...",
            "hasOwn_RoutingMap": true,  "leaked_fromB": false },
  "tabB": { "name": "v1 (Provider Cost Map)", "len": 14317,
            "head": "LLM Inference Provider Cost Map / Scope and normalization ...",
            "hasOwn_ProviderMap": true, "leaked_fromA": false }
}
// ISOLATION PASS — each tab returned its OWN report, no cross-contamination

ChatGPT now renders Deep Research reports inside a doubly-nested,
out-of-process sandboxed iframe (connector_openai_deep_research.*.
oaiusercontent.com) instead of an inline assistant turn. That OOPIF does
not appear in the main page's frame tree, so the in-page isolated-world
extraction (readDeepResearchFrameResult) can never find it. Because that
path was preferred whenever a Page was present, the capable
target-attach path (readDeepResearchTargetResult) was dead code in
production: waitForDeepResearchCompletion never detected completion and
ran to timeout, leaving the session stuck "running" and harvesting only
the "ChatGPT said:" placeholder.

Prefer the target-attach path (which reaches the OOPIF and its nested
frames) and fall back to the in-page frame path for legacy inline
rendering. Treat a target-confirmed completion as authoritative even
when the main DOM exposes no assistant turn, since the report now lives
entirely in the OOPIF and the hasActiveScopedResearch heuristic no
longer holds there.

Verified end-to-end: reattaching to a completed Deep Research session
now detects completion in ~6s and saves the full report where it
previously timed out. Adds a regression test for the production shape
(both Page and client passed, scoped run, report only reachable via the
target path).

clawsweeper Bot commented May 30, 2026

Codex review: needs maintainer review before merge. Reviewed May 30, 2026, 7:14 AM ET / 11:14 UTC.

Summary
This PR changes Deep Research browser completion to prefer page-scoped CDP target extraction for OOPIF reports, tags session-bound Chrome clients with a page session id, and adds regression tests plus changelog coverage.

Reproducibility: yes, at source level. Current main chooses the in-page frame path whenever Page is present, and the production browser/reattach callers pass both Page and client. I did not run a live ChatGPT session, but the PR body includes fresh redacted live proof for the external OOPIF behavior.

Review metrics: 2 noteworthy metrics.

  • Diff size: 4 files, +659/-48. The implementation is focused but touches CDP report extraction, session binding, tests, and release notes.
  • Regression coverage: 544 test lines added. Most of the diff is targeted coverage for cross-tab scoping, session-bound wrappers, candidate ordering, and legacy fallback behavior.

Merge readiness
Overall: 🦞 diamond lobster
Proof: 🦀 challenger crab
Patch quality: 🦞 diamond lobster
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Risk before merge

  • [P1] This depends on ChatGPT's current OOPIF/CDP target shape, so maintainers should treat the supplied live proof and focused tests as the acceptance evidence for that external browser boundary.
  • [P1] Because the patch extracts report text from sandboxed iframe targets in a persistent Chrome profile, page-session scoping remains the critical safety property even though this head removes browser-wide target enumeration and adds cross-tab proof.

Maintainer options:

  1. Accept scoped CDP extraction (recommended)
    Land this as the browser regression fix if maintainers accept the page-session scoped auto-attach proof as sufficient for persistent Chrome profiles.
  2. Pause for maintainer live smoke
    Hold the PR until a maintainer repeats a signed-in Deep Research smoke with two completed tabs and confirms no cross-tab report capture.

Next step before merge

  • [P2] No repair lane is needed; the remaining action is maintainer acceptance of the scoped CDP extraction and release timing for this browser fix.

Security
Cleared: No discrete security or supply-chain defect was found; the security-sensitive cross-tab CDP extraction risk is addressed by page-scoped code, regression tests, live proof, and a clean GitGuardian scan.

Review details

Best possible solution:

Land the page-scoped target extraction if maintainers accept the CDP boundary proof, then verify a release smoke on Deep Research before shipping.

Do we have a high-confidence way to reproduce the issue?

Yes, at source level: current main chooses the in-page frame path whenever Page is present, and the production browser/reattach callers pass both Page and client. I did not run a live ChatGPT session, but the PR body includes fresh redacted live proof for the external OOPIF behavior.

Is this the best way to solve the issue?

Yes, this is the narrowest maintainable direction I found: prefer the target path that can see OOPIF reports, preserve the in-page fallback, and bind target discovery to the page session. The remaining question is maintainer acceptance of the CDP boundary risk, not a concrete code repair.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 6019a199e44c.

Label changes

Label justifications:

  • P1: The PR addresses a current Deep Research browser regression where completed runs can time out or harvest only the placeholder instead of the report.
  • merge-risk: 🚨 compatibility: The patch changes the Deep Research completion path from Page-first to target-first, with fallback coverage for legacy inline/frame rendering.
  • merge-risk: 🚨 session-state: A scoping error could save a completed report from another ChatGPT tab into the current Oracle session.
  • merge-risk: 🚨 security-boundary: The patch intentionally reads report text from sandboxed OOPIF targets through CDP, so the page-session boundary is security-sensitive.
  • rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦀 challenger crab and patch quality is 🦞 diamond lobster.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (live_output): The PR body contains copied live CLI output and a fresh redacted two-tab Deep Research proof at head f5428df showing each tab returned its own report without cross-contamination.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body contains copied live CLI output and a fresh redacted two-tab Deep Research proof at head f5428df showing each tab returned its own report without cross-contamination.
Evidence reviewed

What I checked:

  • Current main production dispatch: Current main still chooses the in-page frame reader whenever Page is present, so the target reader is not reached in the production shape where callers pass both Page and client. (src/browser/actions/deepResearch.ts:206, 6019a199e44c)
  • Production callers pass both Page and client: Browser run and reattach paths call waitForDeepResearchCompletion with Runtime, Page, and client, matching the PR body's described broken dispatch shape. (src/browser/index.ts:1206, 6019a199e44c)
  • Target-first extraction at PR head: The PR head reads the CDP target path first, falls back to the in-page frame path when the target read is missing or incomplete, and treats a completed target read as authoritative. (src/browser/actions/deepResearch.ts:214, f5428df94497)
  • Page-scoped CDP discovery: The PR head removes the browser-wide Target.getTargets scan from the patched path and uses Target.setAutoAttach with the page session id when available. (src/browser/actions/deepResearch.ts:451, f5428df94497)
  • Session-bound wrapper support: The PR head tags the session-bound Chrome client with oraclePageSessionId so raw Target.* sends can stay bound to the page session on the browser-WSEndpoint path. (src/browser/chromeLifecycle.ts:472, f5428df94497)
  • Regression coverage: The added tests cover cross-tab isolation, session-bound wrapper page-session binding, completed-target ordering, legacy in-page fallback, and the OOPIF scoped-run completion path. (tests/browser/deepResearch.test.ts:483, f5428df94497)

Likely related people:

  • steipete: Introduced ChatGPT Deep Research mode and appears in current blame/API history for the central browser Deep Research code path. (role: feature owner and recent area contributor; confidence: high; commits: dff95f2a99b5, b91502c9c70b, abb7c9a7d9c8; files: src/browser/actions/deepResearch.ts, src/browser/chromeLifecycle.ts)
  • pdurlej: Recent browser control work touched chromeLifecycle and the timeline shows involvement around this browser review surface. (role: recent browser area contributor; confidence: medium; commits: 8465379ce0fd; files: src/browser/chromeLifecycle.ts)
  • dedene: Authored the direct CDP attach-running work that is adjacent to the session-bound Chrome client and remote/browser attach behavior this PR relies on. (role: adjacent CDP/session contributor; confidence: medium; commits: 1f36413f89df; files: src/browser/chromeLifecycle.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

clawsweeper Bot added the labels proof: sufficient, rating: 🧂 unranked krab, status: ⏳ waiting on author, P1, merge-risk: 🚨 session-state, and merge-risk: 🚨 security-boundary on May 30, 2026.
Address review: the target-attach OOPIF extraction enabled in this PR used
readDeepResearchTargetResult, which enumerated browser-wide CDP targets
(Target.getTargets) and returned the first completed Deep Research target it
could read. In a shared/persistent Chrome profile with another completed Deep
Research tab, that could save the other tab's report into the current Oracle
session (cross-tab session-state / privacy leak).

Scope discovery to the current Oracle-controlled page: `client` is connected to
the conversation page target, so page-session auto-attach only surfaces THIS
page's related targets (its Deep Research OOPIF). Drop the browser-wide
Target.getTargets / attachToTarget enumeration (and the now-unneeded
setDiscoverTargets); only auto-attached, page-scoped sessions are treated as
belonging to this run.

Add a regression test with two Deep Research targets — the current page's OOPIF
(in progress) plus a foreign COMPLETED report visible browser-wide — asserting
the foreign report is never read (the run times out instead). Verified the test
fails against the pre-scoping source and passes after. Full browser suite green
(400 passed); document the fix + scoping in the changelog.
@umutkeltek
Author

Addressed the P1 (browser-wide target scan / cross-tab leak) in 53b7b331.

Fix

readDeepResearchTargetResult no longer enumerates Target.getTargets (browser-wide). client is connected to the conversation page target, so enabling Target.setAutoAttach on that session only surfaces this page's related targets (its Deep Research OOPIF subframe). Discovery is now limited to those auto-attached, page-scoped sessions; the browser-wide getTargets/attachToTarget enumeration (and the now-unneeded setDiscoverTargets) is removed. A foreign completed Deep Research tab in a shared/persistent profile can no longer be read into the current session.

Regression test

Added does not return a foreign completed Deep Research report from another tab: the current page's OOPIF auto-attaches but is still in progress, while a foreign completed report is visible via the browser-wide Target.getTargets scan. The test asserts the foreign report is never read (the run times out) and that the browser-wide scan is not consulted.

Proof it's a real guard: against the pre-scoping source the test fails (foreignAttachCalled === true — the foreign target was attached and its report returned); with the scoping fix it passes.

  • tests/browser/deepResearch.test.ts: 29 passed. Full browser suite: 400 passed / 1 skipped. Typecheck (changed file) / format / lint clean.
  • Changelog updated with the OOPIF fix + the cross-tab scoping.

@clawsweeper re-review

clawsweeper Bot commented May 30, 2026

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

clawsweeper Bot added the label status: 📣 needs proof and removed proof: sufficient and status: ⏳ waiting on author on May 30, 2026.
Re-review found the page-scoping fix only held on direct-tab clients. On the
browser-WSEndpoint / remote-Chrome path, `client` is a session-bound wrapper
(createSessionBoundChromeClient) whose domain methods are session-bound but
whose raw `send` is the browser-level send. readDeepResearchTargetResult issues
Target.setAutoAttach via raw `send`, so on that path it auto-attached
browser-wide and could still read another tab's completed Deep Research report
into the current session.

Tag the session-bound wrapper with its page session id (oraclePageSessionId)
and pass it explicitly to Target.setAutoAttach (enable + disable). For a direct
tab client this is undefined and `send` already defaults to the page session, so
behavior there is unchanged; on the wrapper path the auto-attach is now bound to
the page session and stays scoped to this tab.

Add a regression test for the session-bound wrapper path asserting every
setAutoAttach call is bound to the page session (it fails against the un-bound
source, where a foreign completed report leaks in). Full browser suite green
(401 passed); changelog updated.
@umutkeltek
Author

Both follow-up P1s resolved + fresh live proof on the latest head.

Wrapper-path binding (95903b61) — the prior page-scoping only held for direct-tab clients. On the browser-WSEndpoint path the session-bound wrapper's raw send is browser-level, so Target.setAutoAttach still went browser-wide. Now the wrapper carries its page session id (oraclePageSessionId) and auto-attach is bound to it explicitly. New regression test asserts every setAutoAttach on a session-bound wrapper is page-bound — verified to fail against the un-bound source.

Fresh live multi-tab proof (latest head) — signed-in Chrome, two completed Deep Research tabs open simultaneously (both connector_openai_deep_research OOPIFs present). The real waitForDeepResearchCompletion scoped to each tab returned that tab's own report:

  • tab A → "Current LLM Inference Routing Cost Map" (27,099 chars), no leak from B
  • tab B → "LLM Inference Provider Cost Map" (14,317 chars), no leak from A
  • ISOLATION PASS — no cross-contamination.

Redacted result + commit details in the PR body. tests/browser/deepResearch.test.ts 30 passed; full browser suite 401 passed.

@clawsweeper re-review

clawsweeper Bot commented May 30, 2026

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

clawsweeper Bot added the labels proof: sufficient, status: ⏳ waiting on author, and merge-risk: 🚨 compatibility, and removed status: 📣 needs proof on May 30, 2026.
Re-review found two candidate-ordering gaps in the target-scan path:

- readDeepResearchTargetResult returned the first in-progress / text-bearing
  session, so when a page exposed more than one Deep Research iframe target
  (e.g. a stale in-progress one attached before the completed report) it could
  return the incomplete one and miss the completed OOPIF. It now scans all
  page-scoped sessions, returns a completed read immediately, and only falls
  back to the best in-progress/text-bearing read when none completed.
- waitForDeepResearchCompletion used `targetResult ?? frameResult`, so an
  incomplete (in-progress) target read suppressed the in-page frame fallback. A
  completed target read is still authoritative; otherwise the frame path runs
  and a completed in-page result is preferred.

Add a regression test: two page targets (in-progress attached first, completed
second) must return the completed report. Verified it fails fast against the
prior early-return loop and passes after. Full browser suite green (402).

Adds the second regression case requested in review: when the target-attach read
is only in-progress but the in-page frame path has a completed report, the
incomplete target read must not suppress the frame fallback. Verified it fails
against the pre-fix source and passes after. Pairs with the existing
first-target-in-progress + later-target-completed test.
… helper

Clean up the result-selection logic flagged in review. The nested ternary that
chose between the target-attach read and the in-page frame read is replaced by a
named, unit-tested helper (pickPreferredDeepResearchRead) and the local is
renamed from the misleading `frameResult` to `read` (it can hold either source).

Behaviour is unchanged: a completed read wins (target preferred), otherwise the
best in-progress/text-bearing read is kept for progress logging; an incomplete
target read never suppresses a completed in-page result (legacy/inline path).

Adds direct unit tests for every selection branch, including the legacy-inline
case (no target read + completed in-page read returns the in-page report). Full
browser suite green (409 passed).
clawsweeper Bot added the labels rating: 🦞 diamond lobster and status: 👀 ready for maintainer look, and removed rating: 🧂 unranked krab and status: ⏳ waiting on author on May 30, 2026.
@umutkeltek
Author

Addressed the re-review of the page-session-binding commit, plus a clean-up pass and fresh proof at the new HEAD.

Correctness (70375b47, 8a9b98c2) — target scanning now returns a completed read immediately and only keeps the best in-progress read when none completed (no longer returns the first in-progress target and misses a later completed OOPIF); and an incomplete target read no longer suppresses the in-page frame fallback (a completed target read stays authoritative — this preserves the legacy inline/mixed rendering path, the compatibility concern).

Clean-up (f5428df9): extracted the read-selection logic into a documented, unit-tested helper (pickPreferredDeepResearchRead) and renamed the misleading frameResult local to read.

Tests (each proven to fail against the pre-fix source): first-target-in-progress + later-target-completed; in-progress target + completed in-page frame (legacy inline); 6 unit tests for the selection helper incl. the no-target + completed-frame legacy case. tests/browser/deepResearch.test.ts all pass; full browser suite 410 passed; typecheck (changed files) / format / lint clean. An independent code-review pass found nothing substantive.

Fresh live multi-tab proof at HEAD f5428df9 (in the PR body): signed-in Chrome, two completed Deep Research tabs open at once, each scoped extraction returns that tab's own report — ISOLATION PASS, no cross-contamination.

@clawsweeper re-review

clawsweeper Bot commented May 30, 2026

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@umutkeltek
Author

I was kinda obsessed with the rating. It would be very logical to gamify PRs like this in the future.

@yeqiuqiu123

Downstream validation from a real Hermes/Oracle autonomous Deep Research workflow:

  • Reproduced on released @steipete/oracle v0.13.0: completed ChatGPT Deep Research sessions harvested successfully at the CLI transport layer (rc=0, state completed) but output only the placeholder ChatGPT said: / 13 bytes.
  • Built and ran this PR head locally (f5428df94497b354d9646d4cc8634e070c59d0d7) as a profile-local hotfix.
  • Re-harvested a known completed Deep Research session through oracle session <slug> --harvest --write-output ... and the same workflow produced real report content instead of the 13-byte placeholder.
  • End-to-end runner gate also passed: the downstream harvester classified the patched output as completed/valid rather than garbage/empty.

Minimal observed before/after:

before v0.13.0:
  rc=0
  state=completed
  output_size=13
  output="ChatGPT said:"

after PR head f5428df9:
  rc=0
  state=completed
  output_size=1476
  output contains expected Deep Research marker/content

This is not just a unit-test-only issue for us; it was blocking an autonomous Research Intern pipeline because completed reports were being discarded as empty extraction artifacts. The PR head fixes the real downstream failure mode.
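
A downstream gate of the kind described above could be as small as the sketch below. `isPlaceholderOutput` is a hypothetical check written for illustration, not the harvester used in that pipeline; the only fact taken from the report is the 13-byte "ChatGPT said:" placeholder.

```typescript
const PLACEHOLDER = "ChatGPT said:";

// Hypothetical downstream check: treat output as an empty extraction when, after trimming,
// it is blank or consists only of the placeholder.
function isPlaceholderOutput(output: string): boolean {
  const trimmed = output.trim();
  return trimmed.length === 0 || trimmed === PLACEHOLDER;
}

// The placeholder is plain ASCII, so its character count equals its byte size.
const placeholderSize = PLACEHOLDER.length;

const beforeIsGarbage = isPlaceholderOutput("ChatGPT said:");
const afterIsGarbage = isPlaceholderOutput("ChatGPT said:\nResearch completed in ...");
```

A check like this only catches the exact failure mode in this PR; it says nothing about whether a longer output is a complete report.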

Labels

  • merge-risk: 🚨 compatibility: Merging this PR could break existing users, config, migrations, defaults, or upgrades.
  • merge-risk: 🚨 security-boundary: Merging this PR could weaken sandboxing, authorization, credentials, or sensitive data.
  • merge-risk: 🚨 session-state: Merging this PR could lose, corrupt, stale, or mis-associate session or agent state.
  • P1: Urgent regression or broken agent/channel workflow affecting real users now.
  • proof: sufficient: Contributor real behavior proof is sufficient.
  • rating: 🦞 diamond lobster: Very strong PR readiness with only minor maintainer review expected.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR.
