[codex] Make ChatGPT image generation easy for MCP agents by umutkeltek · Pull Request #229 · steipete/oracle

umutkeltek · 2026-05-29T17:52:48Z

Summary

Add a dedicated chatgpt_image MCP tool that wraps ChatGPT browser image generation with a small agent-friendly input shape.
Add generateImage and outputPath to the general consult tool so advanced callers can still trigger the CLI --generate-image image-aware wait/download path directly.
Return saved session artifacts, including a convenience images array with paths and ChatGPT file metadata, in structured MCP output.
Document the new agent image workflow in MCP/browser docs, README, and changelog.

Verification

pnpm exec vitest run tests/mcp*.test.ts tests/mcp/**/*.test.ts
pnpm exec vitest run tests/mcp/chatgptImage.test.ts tests/mcp/utils.test.ts tests/mcp/consult.test.ts
pnpm exec oxfmt --check CHANGELOG.md README.md docs/browser-mode.md docs/mcp.md src/mcp/server.ts src/mcp/tools/consult.ts src/mcp/tools/chatgptImage.ts src/mcp/types.ts src/mcp/utils.ts tests/mcp/chatgptImage.test.ts tests/mcp/consult.test.ts tests/mcp/utils.test.ts
pnpm exec oxlint src/mcp/server.ts src/mcp/tools/consult.ts src/mcp/tools/chatgptImage.ts src/mcp/types.ts src/mcp/utils.ts tests/mcp/chatgptImage.test.ts tests/mcp/consult.test.ts tests/mcp/utils.test.ts
pnpm exec tsgo --noEmit fails on existing repo-wide issues unrelated to this patch: scripts/check.ts Bun/esbuild target typing and tests/browser/chatgptImages.test.ts fetch mock typing under Node 26.

Review follow-ups (commits `cfa4f047`, `4ee828de`)

Security hardening

Constrained agent-supplied generateImage / outputPath to ORACLE_HOME_DIR by default (.. traversal rejected); ORACLE_MCP_ALLOW_EXTERNAL_OUTPUT=1 opts into external paths. CLI --generate-image / --output unaffected.
Symlink hardening (P1): the boundary check now resolves symlinks in the existing path prefix via realpath (deepest existing ancestor + non-existent remainder), so a symlinked parent under the Oracle home (e.g. ~/.oracle/generated -> /tmp/evil) can no longer smuggle a write outside the boundary. Regression tests cover symlink escape for both fields (rejected by default), a within-home symlink (allowed), and an escape allowed under the explicit opt-in. The escape tests were verified to fail against the prior lexical check.
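The "deepest existing ancestor + non-existent remainder" idea can be sketched as follows. This is a minimal illustration, not the code of resolveMcpOutputPath: the helper names `realpathDeepestExisting` and `isInsideHome` are hypothetical, relative paths are assumed to resolve against the home directory, and dangling symlinks are not handled.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: realpath the deepest ancestor that exists, then
// re-append the path segments that have not been created yet.
function realpathDeepestExisting(target: string): string {
  let current = path.resolve(target);
  const remainder: string[] = [];
  for (;;) {
    try {
      return path.join(fs.realpathSync(current), ...remainder);
    } catch (err) {
      const code = (err as { code?: string }).code;
      if (code !== "ENOENT" && code !== "ENOTDIR") throw err;
      const parent = path.dirname(current);
      if (parent === current) throw err; // reached the filesystem root
      remainder.unshift(path.basename(current));
      current = parent;
    }
  }
}

// Hypothetical boundary check: compare real paths, not lexical prefixes.
function isInsideHome(homeDir: string, requested: string): boolean {
  const realHome = fs.realpathSync(homeDir);
  const realTarget = realpathDeepestExisting(path.resolve(homeDir, requested));
  const rel = path.relative(realHome, realTarget);
  return rel !== ".." && !rel.startsWith(".." + path.sep) && !path.isAbsolute(rel);
}

// Demo: a throwaway home whose "generated" directory is a symlink pointing outside.
const sandbox = fs.mkdtempSync(path.join(os.tmpdir(), "containment-"));
const home = path.join(sandbox, "home");
const outside = path.join(sandbox, "outside");
fs.mkdirSync(home);
fs.mkdirSync(outside);
fs.symlinkSync(outside, path.join(home, "generated"), "dir");

const escapesViaSymlink = isInsideHome(home, "generated/out.png"); // false
const staysInside = isInsideHome(home, "images/new/out.png"); // true
const escapesViaDotDot = isInsideHome(home, "../out.png"); // false

fs.rmSync(sandbox, { recursive: true, force: true });
```

A purely lexical check would accept `generated/out.png` here, because the string still starts with the home directory even though the write would land in `outside`.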

Correctness / contract

Default image path carries a random suffix (no collision under parallel agent calls).
chatgpt_image output reuses the typed consult output shape (images / artifacts / resolved).
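As a rough sketch of the collision fix: a timestamp-only file name repeats when two calls land in the same millisecond, while a random suffix keeps them distinct. The `chatgpt-image-<base36 time>` prefix follows the name quoted in this thread; the suffix length, separator, and the function name `defaultImagePath` are assumptions made for illustration.

```typescript
import { randomBytes } from "node:crypto";
import * as path from "node:path";

// Hypothetical default-path builder: timestamp plus a random hex suffix.
function defaultImagePath(dir: string, now: number = Date.now()): string {
  const stamp = now.toString(36);
  const suffix = randomBytes(6).toString("hex"); // 12 hex characters
  return path.join(dir, `chatgpt-image-${stamp}-${suffix}.png`);
}

// Simulate 100 calls that all observe the same millisecond.
const fixedNow = 1780000000000;
const samplePaths = Array.from({ length: 100 }, () =>
  defaultImagePath("generated", fixedNow),
);
const uniqueCount = new Set(samplePaths).size;
```

With the clock pinned to one value, the timestamp part is identical for every call, so uniqueness here comes entirely from the random suffix.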

Live behavior proof (P1) — real chatgpt_image MCP run (redacted), GPT-5.5 browser session, image downloaded via cookie-backed URL and saved under ORACLE_HOME_DIR/generated/:

// structuredContent (redacted)
{
  "status": "completed",
  "models": [{ "model": "gpt-5.5", "status": "completed", "usage": { "totalTokens": 81 } }],
  "images": [{
    "kind": "image",
    "path": "~/.oracle/generated/pr229-proof.png",
    "mimeType": "image/png",
    "sizeBytes": 962854,
    "width": 1254,
    "height": 1254,
    "fileId": "file_…<redacted>",
    "sourceUrl": "https://chatgpt.com/backend-api/estuary/content?id=file_…&sig=<redacted>"
  }],
  "artifacts": [
    { "kind": "image", "path": "~/.oracle/generated/pr229-proof.png", "sizeBytes": 962854 },
    { "kind": "transcript", "path": "~/.oracle/sessions/<id>/artifacts/transcript.md" }
  ],
  "requestedOutputPath": "~/.oracle/generated/pr229-proof.png"
}

Saved file verified on disk: PNG image data, 1254 x 1254, 8-bit/color RGB (963 KB).
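For callers consuming this output, a small typed reader might look like the sketch below. The field names mirror the redacted example above; which fields are optional, and the names `SavedImage`, `isSavedImage`, and `imagePaths`, are assumptions rather than the declared schema in src/mcp/types.ts.

```typescript
// Assumed shape of one entry in structuredContent.images.
interface SavedImage {
  kind: "image";
  path: string;
  mimeType?: string;
  sizeBytes?: number;
  width?: number;
  height?: number;
  fileId?: string;
  sourceUrl?: string;
}

// Narrow an unknown value to SavedImage by checking the two fields we rely on.
function isSavedImage(value: unknown): value is SavedImage {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return v.kind === "image" && typeof v.path === "string";
}

// Collect saved image paths from a tool result's structuredContent.
function imagePaths(structuredContent: unknown): string[] {
  if (typeof structuredContent !== "object" || structuredContent === null) return [];
  const images = (structuredContent as { images?: unknown }).images;
  return Array.isArray(images) ? images.filter(isSavedImage).map((img) => img.path) : [];
}

const sample = {
  status: "completed",
  images: [
    {
      kind: "image",
      path: "~/.oracle/generated/pr229-proof.png",
      mimeType: "image/png",
      sizeBytes: 962854,
      width: 1254,
      height: 1254,
    },
  ],
};
const found = imagePaths(sample); // ["~/.oracle/generated/pr229-proof.png"]
```

Returning an empty array for a missing or malformed `images` field keeps the caller's happy path simple; a stricter client could treat that case as an error instead.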

Verification: tests/mcp 41 passed; oxfmt --check clean; oxlint clean on the listed files; no new typecheck errors in the changed files (the 3 pre-existing tsgo errors remain in untouched files).

Update — remote image-output handling (`8f221a7a`)

Addresses the re-review [P1]: image output was not threaded through the remote browser-service protocol, so chatgpt_image/consult image runs could not fulfill the structuredContent.images contract when ORACLE_REMOTE_HOST is configured.

Fail closed: when a remote host is configured and image output (generateImage/outputPath) is requested, the run is rejected with a clear error pointing at the local-browser path, instead of silently returning no images or proceeding into a remote run that can't deliver artifacts. Local image generation is unaffected (proven end-to-end above).

Regression test (remote host + token + generateImage → isError), verified to fail against the pre-fix source.
Documented in docs/mcp.md + changelog.
Full MCP suite 42 passed; format / lint / typecheck (changed files) clean.
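The fail-closed rule can be pictured as a guard that runs before any browser work starts. The sketch below is an assumption about shape only: the function name `rejectRemoteImageOutput`, the request interface, and the error wording are hypothetical, while `isError` plus a text `content` entry follows the usual MCP tool-result form.

```typescript
interface ImageRequest {
  generateImage?: string;
  outputPath?: string;
}

interface ToolResult {
  isError: boolean;
  content: { type: "text"; text: string }[];
}

// Return an error result when image output is requested but a remote
// browser service is configured; return null when the run may proceed.
function rejectRemoteImageOutput(
  req: ImageRequest,
  env: Record<string, string | undefined>,
): ToolResult | null {
  const remoteHost = env.ORACLE_REMOTE_HOST?.trim();
  const wantsImageOutput = Boolean(req.generateImage || req.outputPath);
  if (!remoteHost || !wantsImageOutput) return null;
  return {
    isError: true,
    content: [
      {
        type: "text",
        text:
          "Image output is not supported with a remote browser service. " +
          "Unset ORACLE_REMOTE_HOST and run with a local browser.",
      },
    ],
  };
}

const blocked = rejectRemoteImageOutput(
  { generateImage: "out.png" },
  { ORACLE_REMOTE_HOST: "remote.example:9473" },
);
const localRun = rejectRemoteImageOutput({ generateImage: "out.png" }, {});
const remoteTextOnly = rejectRemoteImageOutput({}, { ORACLE_REMOTE_HOST: "remote.example:9473" });
```

In a real handler the second argument would typically be `process.env`; passing it in explicitly keeps the guard easy to test.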

clawsweeper · 2026-05-29T17:53:55Z

Codex review: needs maintainer review before merge. Reviewed May 30, 2026, 7:30 AM ET / 11:30 UTC.

Summary
The PR adds an MCP chatgpt_image tool, exposes image output fields on consult, returns saved image artifacts, hardens MCP output-path containment, fails closed for remote browser-service image output, and updates docs/tests/changelog.

Reproducibility: not applicable; this is a feature PR rather than a bug report. The PR body provides a redacted live MCP chatgpt_image run, and the source/tests show the intended paths.

Review metrics: 2 noteworthy metrics.

Changed Surface: 13 files, +1116/-266. The diff spans MCP API, browser image artifact handling, docs, changelog, and focused tests, so maintainers should review it as a feature plus boundary change.
Public MCP Tools: 1 added, 1 changed. The PR adds chatgpt_image and expands consult, which changes the public MCP surface.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🦞 diamond lobster
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

none.

Risk before merge

[P1] The branch introduces an MCP-controlled host file-write surface; it is constrained to ORACLE_HOME_DIR by default with symlink-aware checks, but maintainers still need to accept that security boundary and the explicit external-output opt-in.
[P1] Remote browser-service users will receive a clear error for image output until artifact transfer is designed; that is safer than silent loss, but it is still a maintainer-visible product choice for the new feature.

Maintainer options:

Accept The Bounded MCP Write Surface (recommended)
Merge after maintainers accept ORACLE_HOME_DIR default containment, symlink-aware validation, explicit external-output opt-in, and remote fail-closed behavior as the intended MCP policy.
Tighten The Output Policy First
If maintainers want a smaller boundary, require a dedicated configured output directory or remove the external-output opt-in before merge.

Next step before merge

[P2] No narrow automation repair remains; the next step is maintainer judgment on the MCP file-write boundary and remote image-output policy.

Security
Cleared: No concrete security defect remains after the path-containment, symlink, opt-in external-output, and remote fail-closed changes; the new boundary still needs maintainer acceptance.

Review details

Best possible solution:

Merge the bounded local-browser MCP image path if maintainers accept the security boundary; otherwise require a stricter output-directory policy or a remote artifact-transfer design before merge.

Do we have a high-confidence way to reproduce the issue?

Not applicable: this is a feature PR rather than a bug report. The PR body provides a redacted live MCP chatgpt_image run, and the source/tests show the intended paths.

Is this the best way to solve the issue?

Yes, with maintainer acceptance of the boundary: the implementation reuses the existing ChatGPT image wait/download path, constrains MCP-controlled paths, and fails closed for unsupported remote-service image output.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 6019a199e44c.

Label changes

Label changes:

add rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
add status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (live_output): The PR body includes redacted live MCP output showing a completed chatgpt_image run with populated image metadata and an on-disk PNG verification.
remove rating: 🦐 gold shrimp: Current PR rating is rating: 🐚 platinum hermit, so this older rating label is no longer current.
remove status: ⏳ waiting on author: Current PR status label is status: 👀 ready for maintainer look.

Label justifications:

P2: This is a bounded MCP/browser feature with normal maintainer-review urgency and limited blast radius.
merge-risk: 🚨 security-boundary: The PR gives MCP callers a new way to request host file writes for generated images, guarded by containment and an explicit opt-in for external paths.
rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (live_output): The PR body includes redacted live MCP output showing a completed chatgpt_image run with populated image metadata and an on-disk PNG verification.
proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes redacted live MCP output showing a completed chatgpt_image run with populated image metadata and an on-disk PNG verification.

Evidence reviewed

What I checked:

Repository policy read: AGENTS.md was present and fully read; its browser/MCP smoke guidance and changelog grouping note were relevant to this review. (AGENTS.md:1, 6019a199e44c)
New MCP image wrapper: chatgpt_image forces browser mode, maps the image output path to generateImage, defaults reference files to upload, and delegates to the shared consult runner. (src/mcp/tools/chatgptImage.ts:87, 8f221a7a4c3f)
MCP output containment: MCP-supplied generateImage and outputPath are resolved through existing symlink ancestors and rejected unless the real target stays under the Oracle home, unless explicit external output is enabled. (src/mcp/utils.ts:65, 8f221a7a4c3f)
Remote fail-closed behavior: When a remote browser service is configured and image output is requested, the MCP consult path returns an error instead of silently running without transferable artifacts. (src/mcp/tools/consult.ts:645, 8f221a7a4c3f)
Regression coverage: Tests cover remote-service fail-closed behavior and path rejection for traversal and symlink escapes, including both generateImage and outputPath. (tests/mcp/consult.test.ts:361, 8f221a7a4c3f)
Current-main provenance: Current main blame for the central MCP, browser-image, and remote-service files points to the v0.13.0 release commit as the available local history root for this shallow checkout. (src/mcp/tools/consult.ts:1, abb7c9a7d9c8)

Likely related people:

Peter Steinberger: Current-main blame for the central MCP and browser image files points to the v0.13.0 release commit by this author, and the local shortlog shows dominant activity in the affected area. (role: feature history owner / release integrator; confidence: high; commits: abb7c9a7d9c8; files: src/mcp/tools/consult.ts, src/mcp/utils.ts, src/browser/chatgptImages.ts)
pdurlej: Release notes credit this handle on closely related MCP/browser consult, dry-run, model-strategy, follow-up, and generated-image artifact work; the PR timeline also shows them being mentioned on this review thread. (role: adjacent feature contributor; confidence: medium; commits: abb7c9a7d9c8; files: CHANGELOG.md, docs/mcp.md, src/mcp/tools/consult.ts)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

Address review feedback on the chatgpt_image / consult image-output work.

Security: constrain agent-supplied generateImage / outputPath to the Oracle home directory by default (resolveMcpOutputPath in src/mcp/utils.ts). MCP clients are less trusted than the CLI user, so an agent must not be able to write generated images or saved responses to arbitrary host paths; `..` traversal is rejected by the resolved-prefix check. Opt into external paths with ORACLE_MCP_ALLOW_EXTERNAL_OUTPUT=1. A rejected path now returns a clean MCP error instead of throwing. CLI --generate-image / --output are unaffected.

Correctness: the default image output path (and the browser-side session-less fallback) now carry a random suffix so concurrent agent calls in the same millisecond cannot resolve to the same path and overwrite each other.

Contract: chatgpt_image reuses the typed consult output shape (images / artifacts / resolved) instead of loose z.record(...any), so the advertised structuredContent.images shape is actually declared.

Tests: allow/reject path coverage for mapConsultToRunOptions, the consult MCP handler, and the chatgpt_image handler; default-path uniqueness assertion.

umutkeltek · 2026-05-30T10:06:29Z

Did an independent pass on this and pushed a follow-up commit (cfa4f047) to address the review. Summary of findings + what changed.

Findings

The bot's two P1s (output-path security boundary, needs real proof) are fair. Three refinements after reading the actual write path (chatgptImages.ts, sessionRunner.ts):

outputPath is the stronger write primitive, not the image bytes. The image saver partly self-limits: resolveSiblingImagePath forces a content-type extension on extensionless paths, and the bytes are ChatGPT-rendered image data (not attacker-chosen). But outputPath writes the model's text response verbatim (sessionRunner.ts:991-1007), that text is fully caller-steerable, and its only guard is "not inside the sessions dir", which is a state-corruption guard, not a security boundary. The write path runs mkdir -p on the parent, overwrites any existing file, and on EPERM even falls back to another path. Containment has to cover both fields.
Default image path collides under parallelism (bot missed this). resolveDefaultImageOutputPath was chatgpt-image-${Date.now().toString(36)}.png. Two concurrent chatgpt_image calls in the same millisecond resolve to the same path and overwrite each other — wrong default for a tool built for parallel agents.
Output schema didn't declare its advertised shape. images/artifacts/resolved were z.record(z.string(), z.any()), so the "returns saved paths in structuredContent.images" contract wasn't actually typed.

Changes in `cfa4f047`

Containment (both fields): resolveMcpOutputPath constrains agent-supplied generateImage / outputPath to ORACLE_HOME_DIR by default; .. traversal is rejected by the resolved-prefix check. Opt into external paths with ORACLE_MCP_ALLOW_EXTERNAL_OUTPUT=1. Rejected paths return a clean MCP error (not a throw). CLI --generate-image / --output are untouched; the boundary is MCP-only.
Collision fix: default image path (and the browser-side session-less fallback) now carry a random suffix.
Contract: chatgpt_image output reuses the typed consult output shape (images / artifacts / resolved) so the structuredContent shape is consistent and declared.
Tests (the bot's P1 rank-up move): allow/reject path coverage for mapConsultToRunOptions, the consult MCP handler, and the chatgpt_image handler, plus a default-path-uniqueness assertion. Full MCP + image suites green (43 tests); format/lint clean.
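A compact way to picture the opt-in flag and the "clean error, not a throw" behavior is a function that returns a decision object instead of raising. Everything below is a sketch under assumptions: `decideOutputPath` and `PathDecision` are hypothetical names, relative paths are assumed to resolve against the home directory, and the check is lexical only, so it shares the symlink gap listed under "Still open".

```typescript
import * as path from "node:path";

type PathDecision = { ok: true; path: string } | { ok: false; error: string };

// Decide whether an agent-supplied output path is acceptable.
function decideOutputPath(
  requested: string,
  homeDir: string,
  env: Record<string, string | undefined>,
): PathDecision {
  const root = path.resolve(homeDir);
  const resolved = path.resolve(root, requested);
  // Explicit opt-in: external paths are allowed as-is.
  if (env.ORACLE_MCP_ALLOW_EXTERNAL_OUTPUT === "1") return { ok: true, path: resolved };
  const rel = path.relative(root, resolved);
  const inside =
    rel !== "" && rel !== ".." && !rel.startsWith(".." + path.sep) && !path.isAbsolute(rel);
  if (inside) return { ok: true, path: resolved };
  return {
    ok: false,
    error:
      `Output path ${requested} is outside the Oracle home directory. ` +
      "Set ORACLE_MCP_ALLOW_EXTERNAL_OUTPUT=1 to allow external paths.",
  };
}

const homeDir = path.join(path.sep, "home", "agent", ".oracle");
const insideHome = decideOutputPath("generated/a.png", homeDir, {});
const traversal = decideOutputPath("../../etc/passwd", homeDir, {});
const optedIn = decideOutputPath("../../tmp/a.png", homeDir, {
  ORACLE_MCP_ALLOW_EXTERNAL_OUTPUT: "1",
});
```

A handler can map `{ ok: false }` straight to an MCP result with `isError: true`, which keeps rejected paths from surfacing as unhandled exceptions.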

Still open

Residual symlink escape: path.resolve doesn't resolve symlinks, so a symlinked dir under ORACLE_HOME_DIR pointing outside could bypass the prefix check. Left as a known follow-up (would need realpath on the existing parent).
Live proof: the cheap dryRun:true path now exercises resolution + containment in tests, but a single redacted real ChatGPT image run (cookie download → populated structuredContent.images[]) is still worth attaching before merge — automation can't supply that.

Address review: the boundary check in resolveMcpOutputPath was lexical (path.resolve prefix), so a pre-existing symlinked directory under ORACLE_HOME_DIR (e.g. ~/.oracle/generated -> /tmp/evil) could pass the string-prefix test while the actual image/text write landed outside the documented MCP containment boundary.

Resolve the target through symlinks before the boundary check: realpath the deepest existing ancestor of the requested path and re-append the not-yet-created remainder, then compare against the realpath of the Oracle home. A symlinked parent that escapes the home is now rejected unless ORACLE_MCP_ALLOW_EXTERNAL_OUTPUT=1 is set.

Tests: symlink-escape regression for BOTH generateImage and outputPath (rejected by default), a within-home symlink (allowed), and an escape allowed under the explicit opt-in. Verified the escape tests fail against the prior lexical check and pass after.

Docs/changelog note the symlink-safe containment.

umutkeltek · 2026-05-30T10:29:04Z

Both remaining P1s addressed.

Symlink hardening (4ee828de) — resolveMcpOutputPath no longer relies on a lexical prefix check. It resolves symlinks in the existing path prefix via realpath (deepest existing ancestor + the not-yet-created remainder) and compares against the realpath of the Oracle home, so a symlinked parent under ORACLE_HOME_DIR (e.g. generated -> /tmp/evil) can't smuggle a write outside the boundary. Regression tests cover symlink escape for both generateImage and outputPath (rejected by default), a within-home symlink (allowed), and an escape allowed under ORACLE_MCP_ALLOW_EXTERNAL_OUTPUT=1. Verified the escape tests fail against the prior lexical check and pass after.

Live behavior proof — ran a real chatgpt_image MCP call end-to-end (redacted output in the PR body): GPT-5.5 browser session → image generated → cookie-backed download → saved under ORACLE_HOME_DIR/generated/ with a populated structuredContent.images[] (path, mimeType, sizeBytes: 962854, 1254x1254, fileId). File verified on disk as PNG image data, 1254 x 1254.

tests/mcp 41 passed; oxfmt/oxlint clean on the listed files; PR body updated with details.

@clawsweeper re-review

clawsweeper · 2026-05-30T10:29:07Z

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

State: Complete
Detail: The targeted re-review finished, the durable review comment was updated, and the synced verdict was routed.
Run: https://github.com/openclaw/clawsweeper/actions/runs/26682407602
Updated: 2026-05-30T11:21:31.230Z

umutkeltek · 2026-05-30T11:09:33Z

No code changes since the last review cleared the security defect (paths confined under the realpathed Oracle home, symlink escapes tested, external writes gated behind ORACLE_MCP_ALLOW_EXTERNAL_OUTPUT), with live chatgpt_image proof in the body. Requesting a fresh pass to confirm the current verdict.

@clawsweeper re-review

clawsweeper · 2026-05-30T11:09:35Z

🦞👀
ClawSweeper picked this up.

Command router queued. I will update this comment with the next step.

Re-review progress:

State: Complete
Detail: The targeted re-review finished, the durable review comment was updated, and the synced verdict was routed.
Run: https://github.com/openclaw/clawsweeper/actions/runs/26682311627
Updated: 2026-05-30T11:15:22.402Z

Review found that consult/chatgpt_image image output was not threaded through the remote browser-service protocol: when ORACLE_REMOTE_HOST is configured the generated image is written on the remote host and never transferred back, so the promised structuredContent.images contract could not be fulfilled (the run would silently return no images, or proceed into a remote run that can't deliver them). Reject image output explicitly when a remote host is configured, with a clear error pointing the caller at the local-browser path. Local image generation is unaffected (verified end-to-end). Adds a regression test (remote host + token + generateImage -> isError) and documents the limitation in docs/mcp.md + changelog.

umutkeltek · 2026-05-30T11:25:06Z

Closed the remote image-output [P1] from the last pass.

8f221a7a — image output now fails closed when a remote browser service is configured (ORACLE_REMOTE_HOST): the remote executor doesn't transfer image artifacts back, so rather than silently returning empty structuredContent.images, the run is rejected with a clear error pointing at the local-browser path. Local image generation is unchanged (proven live in the body). Regression test added (remote host + token + generateImage → isError, verified to fail against the pre-fix source); documented in docs/mcp.md + changelog. Full MCP suite 42 passed; format/lint/typecheck clean.

@clawsweeper re-review

clawsweeper · 2026-05-30T11:25:09Z

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

State: Complete
Detail: The targeted re-review finished, the durable review comment was updated, and the synced verdict was routed.
Run: https://github.com/openclaw/clawsweeper/actions/runs/26682599412
Updated: 2026-05-30T11:31:20.245Z

Expose ChatGPT image outputs to MCP agents

d4df728

umutkeltek force-pushed the codex/mcp-chatgpt-image-artifacts branch from d54ef7e to d4df728 May 29, 2026 18:06

umutkeltek changed the title ~~[codex] Expose ChatGPT image outputs to MCP agents~~ [codex] Make ChatGPT image generation easy for MCP agents May 29, 2026

clawsweeper Bot added the rating: 🧂 unranked krab label and removed the rating: 🦪 silver shellfish label May 29, 2026

umutkeltek marked this pull request as ready for review May 29, 2026 21:22

umutkeltek wants to merge 4 commits into steipete:main from umutkeltek:codex/mcp-chatgpt-image-artifacts
