
feat(server): add wait and custom code tools #984

Open
Nikhil (shadowfax92) wants to merge 3 commits into dev from polecat/topaz/bosmain-6fd-rework

Conversation

@shadowfax92
Contributor

Fixes #422

Summary

  • Register wait_for and browser_run_code MCP tools for BrowserOS server clients.
  • Add fixed-time waits, single-condition text/selector appearance/disappearance waits, and custom async page-context code execution.
  • Tighten wait_for validation so ambiguous condition combinations fail loudly, and make selectorGone wait for prior selector presence before reporting disappearance.
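A minimal sketch of the single-condition rule described above. The field names follow the tool description; `WaitArgs` and `validateWaitArgs` are hypothetical, and rejecting the zero-condition case is an assumption rather than something this PR confirms.

```typescript
// Hypothetical shape of the wait_for arguments; only the field names come from the PR.
interface WaitArgs {
  text?: string
  textGone?: string
  selector?: string
  selectorGone?: string
  time?: number
}

// Returns an error message when the arguments are ambiguous, or null when valid.
// Assumption: zero conditions is also treated as an error in this sketch.
function validateWaitArgs(args: WaitArgs): string | null {
  const conditions = [
    args.text,
    args.textGone,
    args.selector,
    args.selectorGone,
    args.time,
  ]
  const conditionCount = conditions.filter((value) => value !== undefined).length
  if (conditionCount !== 1) {
    return 'Provide exactly one wait condition: text, textGone, selector, selectorGone, or time.'
  }
  return null
}
```

Counting defined fields keeps the rule independent of which pair was combined, so `{ text, selector }` fails the same way as `{ text, time }`.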

Test plan

  • bun --env-file=.env.development test ./tests/tools/registry.test.ts ./tests/browser/wait-for.test.ts
  • bun run test:browser
  • bunx biome check apps/server/src/agent/prompt.ts apps/server/src/browser/browser.ts apps/server/src/tools/navigation.ts apps/server/src/tools/registry.ts apps/server/src/tools/snapshot.ts apps/server/src/tools/tool-label-registry.ts apps/server/tests/tools/navigation.test.ts apps/server/tests/tools/observation.test.ts apps/server/tests/tools/registry.test.ts apps/server/tests/browser/wait-for.test.ts
  • bun run typecheck from packages/browseros-agent/apps/server

Local blockers

  • FORCE_COLOR=1 bun scripts/build/server.ts --target=darwin-arm64 --ci is blocked locally because apps/server/.env.production is missing.
  • bun run test:tools is blocked locally because /Applications/BrowserOS.app/Contents/MacOS/BrowserOS is not installed in this worktree environment; the run was stopped after repeated identical launch failures.

@shadowfax92
Contributor Author

Claude (@claude) review

@shadowfax92
Contributor Author

Greptile (@greptileai) review

@claude

claude Bot commented May 9, 2026

Claude finished Nikhil (@shadowfax92)'s task in 2m 6s


Code Review: feat(server): add wait and custom code tools

  • Gather context and understand the request
  • Read changed files
  • Analyze code for bugs, correctness, security, and style
  • Post review feedback

Overall this is a solid PR. The conditionCount validation, selectorGoneWasPresent state machine, and browser_run_code args-serialization pattern are all well-executed. Test coverage is thorough. A few issues below, one of which is a real correctness bug.


Bug: textGone succeeds spuriously if the text was never present

apps/server/src/browser/browser.ts, lines 683–688

if (opts.textGone) {
  const result = await session.Runtime.evaluate({
    expression: `!(document.body?.innerText?.includes(${JSON.stringify(opts.textGone)}) ?? false)`,
    returnByValue: true,
  })
  if (result.result?.value === true) return true   // ← fires on first poll if text absent
}

If the text is simply not on the page yet (page still loading, element not rendered), this returns true immediately and incorrectly. The PR description says it tightened selectorGone to "wait for prior selector presence before reporting disappearance" — but that same fix was not applied to textGone. The two conditions should behave symmetrically.

The fix mirrors what selectorGone does: track a textGoneWasSeen flag and only return true after the text first appeared and then disappeared.
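A minimal sketch of that guard, assuming a probe that resolves `true` while the text is present. `waitForTextGone`, `Probe`, and the parameter names are hypothetical and not taken from browser.ts; the 500 ms default interval is likewise an assumption.

```typescript
// Hypothetical probe: resolves true while the text is currently on the page.
type Probe = () => Promise<boolean>

async function waitForTextGone(
  isPresent: Probe,
  timeoutMs: number,
  intervalMs = 500,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs
  let wasSeen = false

  while (true) {
    const present = await isPresent()
    if (present) {
      wasSeen = true // the text has been observed at least once
    } else if (wasSeen) {
      return true // seen earlier and absent now: a real disappearance
    }
    const remaining = deadline - Date.now()
    if (remaining <= 0) return false // never seen, or still present at the deadline
    // Sleep for one interval, but never past the deadline.
    await new Promise<void>((resolve) =>
      setTimeout(resolve, Math.min(intervalMs, remaining)),
    )
  }
}
```

With this shape, a page where the text never renders times out with `false` instead of succeeding on the first poll.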



Minor: Grammatically awkward success messages for gone-conditions

apps/server/src/tools/navigation.ts, lines 391–393

response.text(`Found ${target} on page.`)
// → "Found text "Loading complete" to disappear on page."
// → "Found selector ".spinner" to disappear on page."

The target string for textGone/selectorGone already contains "to disappear", so the prefix "Found" and suffix "on page" produce an odd sentence. Consider a branch:

const isGoneCondition = args.textGone !== undefined || args.selectorGone !== undefined
response.text(isGoneCondition ? `Condition met: ${target}.` : `Found ${target} on page.`)

Nit: Redundant fallback in handler

apps/server/src/tools/navigation.ts, line 335

const timeout = args.timeout ?? 10_000

args.timeout already has .default(10000) in the Zod schema, so the ?? fallback is dead code. args.timeout will always be a number after Zod parsing.


Nit: Fragile setTimeout monkey-patching in tests

apps/server/tests/browser/wait-for.test.ts, lines 37–42

globalThis.setTimeout = ((callback: () => void) => {
  callback()
  return undefined as ReturnType<typeof setTimeout>
}) as typeof setTimeout

undefined as ReturnType<typeof setTimeout> is only safe here because the return value is never used, but it requires a cast that'll confuse future readers. A cleaner alternative is to mock the resolveSession to return a slow session and use a real short timeout, or inject a clock via a constructor option. The try/finally restore is correct — good hygiene there.
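One way the clock-injection alternative could look, as a rough sketch: the polling helper takes an optional `sleep` function, so a test can pass a fake that resolves immediately without touching `globalThis.setTimeout`. `pollUntil`, `Sleep`, and the option names are hypothetical and do not appear in the PR.

```typescript
// Hypothetical injectable sleep; production code falls back to a real timer.
type Sleep = (ms: number) => Promise<void>

const realSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms))

async function pollUntil(
  check: () => Promise<boolean>,
  opts: { attempts: number; intervalMs: number; sleep?: Sleep },
): Promise<boolean> {
  const sleep = opts.sleep ?? realSleep
  for (let i = 0; i < opts.attempts; i++) {
    if (await check()) return true
    // Skip the pause after the final attempt; nothing is left to wait for.
    if (i < opts.attempts - 1) await sleep(opts.intervalMs)
  }
  return false
}
```

A test can then record the requested delays with a fake `sleep` and assert on them, with no cast and no global state to restore.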


Pre-existing, but amplified: no execution timeout for runCode

apps/server/src/browser/browser.ts, lines 988–993

const result = await session.Runtime.evaluate({
  expression,
  returnByValue: true,
  awaitPromise: true,   // ← will hang if the async code never resolves
})

evaluate_script has the same issue, but browser_run_code surfaces it more prominently because it's explicitly designed for arbitrary user-authored async logic. A well-named code parameter that takes function bodies will invite while(true){} or unresolved Promises. Consider documenting the lack of timeout, or adding a Promise.race wrapper around the submitted expression with a configurable deadline.
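A rough sketch of the `Promise.race` idea, with a hypothetical `withDeadline` helper that is not part of this PR. It only guards against promises that never settle: a synchronous `while(true){}` blocks the event loop it runs on, so a timer racing inside the same page context would never fire, and racing on the server side would not stop the page-side code either.

```typescript
// Hypothetical helper: rejects if `work` has not settled within `ms` milliseconds.
function withDeadline<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms}ms`)), ms)
  })
  // Whichever settles first wins; the timer is cleared either way so it
  // does not keep the process alive after `work` finishes.
  return Promise.race([work, deadline]).finally(() => {
    if (timer !== undefined) clearTimeout(timer)
  })
}
```

The losing promise is not cancelled by the race, so this bounds how long the caller waits rather than how long the submitted code runs.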


Positive notes

  • conditionCount guard is clean and catches all ambiguous combinations early.
  • selectorGoneWasPresent correctly prevents false positives for elements that were never rendered.
  • runCode's JSON.stringify(args ?? {}) for the args channel is the right pattern — prevents code-injection through the args parameter.
  • Registry comment counts (// Navigation (9), // Observation (10)) are correct.
  • System-prompt additions are minimal and accurate.
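To illustrate the args-channel point above, here is a minimal sketch of how such an expression might be assembled. `buildExpression` is hypothetical and the real format in browser.ts may differ; the only part taken from the review is passing `JSON.stringify(args ?? {})` as the argument.

```typescript
// Hypothetical builder for the string handed to Runtime.evaluate.
function buildExpression(code: string, args?: Record<string, unknown>): string {
  // JSON.stringify escapes quotes and backslashes, so the result parses as a
  // plain object literal and string values cannot terminate the surrounding source.
  const serializedArgs = JSON.stringify(args ?? {})
  // `code` itself is intentionally arbitrary; only the args channel is protected.
  return `(async (args) => {\n${code}\n})(${serializedArgs})`
}
```

Values reach the user code as data through `args`, rather than being spliced into the function body as source text.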

@github-actions
Contributor

github-actions Bot commented May 9, 2026

✅ Tests passed — 1238/1242

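| Suite | Passed | Failed | Skipped |
| --- | --- | --- | --- |
| agent | 80/80 | 0 | 0 |
| build | 9/9 | 0 | 0 |
| eval | 93/93 | 0 | 0 |
| server-agent | 261/261 | 0 | 0 |
| server-api | 203/203 | 0 | 0 |
| server-browser | 8/8 | 0 | 0 |
| server-integration | 9/10 | 0 | 1 |
| server-lib | 242/242 | 0 | 0 |
| server-root | 60/63 | 0 | 3 |
| server-skills | 31/31 | 0 | 0 |
| server-tools | 242/242 | 0 | 0 |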


@greptile-apps
Contributor

greptile-apps Bot commented May 9, 2026

Greptile Summary

This PR registers two new MCP tools — wait_for (now with text/selector disappearance and fixed-time modes) and browser_run_code (async custom code execution in the page context) — and wires them into the tool registry, prompt guidance, and label registry.

  • waitFor in browser.ts gains textGone/selectorGone variants with was-present guards so disappearance is never reported before the target was ever seen; the loop boundary is also tightened to <= with a precise sleep.
  • browser_run_code wraps user code as an async (args) => { … } body, forwards the CDP exceptionDetails on throw, and serialises the return value via returnByValue.
  • New unit tests cover false-positive suppression for both textGone/selectorGone, the validation gate for ambiguous combined conditions, and end-to-end async execution in a real browser page.

Confidence Score: 5/5

Safe to merge — the new tools are well-guarded, the was-present logic is correct, and the test coverage is thorough.

The core polling logic, was-present guards for both textGone and selectorGone, runCode error propagation, and the single-condition validation gate are all correct. Tests cover false-positive suppression, fixed-time waits, error paths, and value serialisation.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| packages/browseros-agent/apps/server/src/browser/browser.ts | Adds textGone/selectorGone polling with was-present guards and the new runCode CDP evaluate helper; loop boundary fix looks correct. |
| packages/browseros-agent/apps/server/src/tools/navigation.ts | Extends wait_for with four new condition types and a fixed-time mode; validation gate and target-string building are correct, minor error-message wording issue noted. |
| packages/browseros-agent/apps/server/src/tools/snapshot.ts | Adds browser_run_code tool that delegates to Browser.runCode; error propagation and value serialisation paths are handled correctly. |
| packages/browseros-agent/apps/server/src/tools/registry.ts | Removes the temporary wait_for disable comment and registers both new tools; count comments updated correctly. |
| packages/browseros-agent/apps/server/tests/browser/wait-for.test.ts | New unit tests cover both was-present suppression and successful disappearance detection for textGone and selectorGone. |
| packages/browseros-agent/apps/server/tests/tools/registry.test.ts | New registry tests cover tool registration, fixed-time wait, condition-count validation, disappearance forwarding, and browser_run_code success/error paths. |

Sequence Diagram

sequenceDiagram
    participant Agent
    participant wait_for tool
    participant Browser.waitFor
    participant Page (CDP)

    Agent->>wait_for tool: { page, text/textGone/selector/selectorGone/time }

    alt time only
        wait_for tool->>wait_for tool: setTimeout(time)
        wait_for tool-->>Agent: { found: true, timeout: time }
    else text / selector
        wait_for tool->>Browser.waitFor: { text/selector, timeout }
        loop every 500ms until deadline
            Browser.waitFor->>Page (CDP): Runtime.evaluate(innerText.includes / querySelector)
            Page (CDP)-->>Browser.waitFor: true / false
        end
        Browser.waitFor-->>wait_for tool: found (bool)
        wait_for tool-->>Agent: { found, target, timeout } + snapshot if found
    else textGone / selectorGone
        wait_for tool->>Browser.waitFor: { textGone/selectorGone, timeout }
        loop every 500ms until deadline
            Browser.waitFor->>Page (CDP): Runtime.evaluate
            Page (CDP)-->>Browser.waitFor: present (bool)
            Note over Browser.waitFor: Set wasPresent=true on first true, return true after wasPresent && absent
        end
        Browser.waitFor-->>wait_for tool: found (bool)
        wait_for tool-->>Agent: { found, target, timeout } + snapshot if found
    end

    Agent->>browser_run_code tool: { page, code, args }
    browser_run_code tool->>Browser.runCode: (page, code, args)
    Browser.runCode->>Page (CDP): Runtime.evaluate(async (args) => { code })(args)
    Page (CDP)-->>Browser.runCode: result / exceptionDetails
    Browser.runCode-->>browser_run_code tool: { value, description } or { error }
    browser_run_code tool-->>Agent: text + structured value, or error
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
packages/browseros-agent/apps/server/src/tools/navigation.ts:351-356
The error message only calls out `time` as the disallowed combination partner, which implies to a caller (or an LLM) that combining two non-time conditions like `{ text, selector }` is valid. Since the guard rejects any `conditionCount > 1`, the message should reflect the actual rule.

```suggestion
    if (conditionCount > 1) {
      response.error(
        'Provide exactly one wait condition: text, textGone, selector, selectorGone, or time.',
      )
      return
    }
```

### Issue 2 of 2
packages/browseros-agent/apps/server/src/tools/navigation.ts:335
Zod's `.default(10000)` on the `timeout` field guarantees `args.timeout` is always a number in the handler, so the `?? 10_000` fallback is dead code. This is worth removing under the team rule about keeping dead code out of the codebase.

```suggestion
    const timeout = args.timeout
```

Reviews (2): Last reviewed commit: "fix: require prior presence for textGone..."

Comment on lines +682 to +688
if (opts.textGone) {
  const result = await session.Runtime.evaluate({
    expression: `!(document.body?.innerText?.includes(${JSON.stringify(opts.textGone)}) ?? false)`,
    returnByValue: true,
  })
  if (result.result?.value === true) return true
}
Contributor


P1 textGone returns true prematurely on an unloaded page

The expression !(document.body?.innerText?.includes(...) ?? false) evaluates to true when document.body is null or undefined (page still loading). The optional chain short-circuits to undefined, undefined ?? false becomes false, and !false is true, so the condition reports the text as "gone" even before any content has rendered. Contrast with the text check which uses the same pattern affirmatively — it correctly returns false (not found) when the body is absent.
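The evaluation order described above can be reproduced outside a browser. In this small sketch, `FakeDocument` is a hypothetical stand-in for the page's `document`, and the two functions copy the negated and affirmative forms of the expression.

```typescript
// Illustrative stand-in for `document`; body may be null while the page loads.
type FakeDocument = { body?: { innerText?: string } | null }

// Negated form used by the original textGone check.
const goneNegated = (doc: FakeDocument, text: string): boolean =>
  !(doc.body?.innerText?.includes(text) ?? false)

// Affirmative form used by the text check.
const isPresent = (doc: FakeDocument, text: string): boolean =>
  doc.body?.innerText?.includes(text) ?? false

const loading: FakeDocument = { body: null }
const rendered: FakeDocument = { body: { innerText: "Loading, please wait" } }

// With a null body the chain yields undefined, `?? false` yields false,
// and the negation turns that into true: the text is reported as "gone".
const falsePositive = goneNegated(loading, "Loading")
// The affirmative check fails safe on the same input.
const safeNegative = isPresent(loading, "Loading")
// Once the body exists and contains the text, the negated form is false.
const stillThere = goneNegated(rendered, "Loading")
```

Here `falsePositive` is `true` while `safeNegative` and `stillThere` are both `false`, which is the asymmetry the comment points out.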


Comment on lines +682 to +688
if (opts.textGone) {
  const result = await session.Runtime.evaluate({
    expression: `!(document.body?.innerText?.includes(${JSON.stringify(opts.textGone)}) ?? false)`,
    returnByValue: true,
  })
  if (result.result?.value === true) return true
}
Contributor


P1 textGone lacks the "was-present" guard that selectorGone has

selectorGone only reports success after the element was observed at least once (via selectorGoneWasPresent), preventing a false positive when the selector simply never existed. textGone has no equivalent guard and returns true on the very first poll if the text is already absent. A caller doing waitFor({ textGone: 'Loading…' }) before the page has rendered any text would receive found: true immediately, even though the text never appeared.


@shadowfax92
Contributor Author

Addressed the correctness item in 5960bce: textGone now mirrors selectorGone by requiring prior presence before disappearance succeeds. Also cleaned up the gone-condition success message and removed the timer monkey-patch from the browser wait tests.

@shadowfax92
Contributor Author

Greptile (@greptileai) review

@shadowfax92
Contributor Author

Claude (@claude) review

@shadowfax92
Contributor Author

Follow-up pushed in 5960bce: textGone now mirrors selectorGone with a prior-presence guard; gone-condition success text now uses Condition met; wait-for tests use real polling instead of timer monkey-patching.

@claude

claude Bot commented May 9, 2026

Claude finished Nikhil (@shadowfax92)'s task in 1m 27s


Code Review: feat(server): add wait and custom code tools

  • Gather context and understand the request
  • Read changed files (focusing on 5960bce fixes)
  • Analyze code for bugs, correctness, security, and style
  • Post review feedback

The fixes in 5960bce look correct. All items from the previous review have been addressed.

textGone fix is correct

apps/server/src/browser/browser.ts, lines 683–692

The fix properly mirrors selectorGone: the expression now tests for text presence (not absence), textGoneWasPresent is set on the first true result, and return true only fires after the text was seen and is now gone. This eliminates both false-positive paths — unloaded page body and text-never-present.

The unit tests in tests/browser/wait-for.test.ts cover both cases cleanly:

  • [false, false, false] + short timeout → false (never-present guard)
  • [true, false] + 600ms timeout → true (appear-then-disappear happy path)

The approach of injecting pre-staged evaluation values via createBrowserWithEvaluationValues is simpler and more readable than the previous setTimeout monkey-patch.

Success message fix is clean

The foundMessage branch in navigation.ts:392–395 correctly distinguishes gone conditions from appear conditions, producing grammatical output in both cases.


Remaining minor nit (not a blocker)

apps/server/src/tools/navigation.ts, line 335

const timeout = args.timeout ?? 10_000

args.timeout has .default(10000) in the Zod schema, so the ?? 10_000 fallback is unreachable. No behavior issue, but it's dead code.


Pre-existing: no execution timeout for runCode

apps/server/src/browser/browser.ts, lines 994–998

awaitPromise: true with no deadline will hang if the user's async code never resolves. This was called out in the previous review as pre-existing/amplified — still worth a follow-up issue to document the limitation or add a Promise.race wrapper, but not a blocker for this PR.


The correctness bug is fixed, tests are thorough, and the PR is in good shape. The remaining items are nits that can be cleaned up separately.


Development

Successfully merging this pull request may close these issues.

Add wait_for and browser_run_code tools

1 participant