[codex] Make ChatGPT image generation easy for MCP agents#229
[codex] Make ChatGPT image generation easy for MCP agents#229umutkeltek wants to merge 4 commits into
Conversation
|
Codex review: needs maintainer review before merge. Reviewed May 30, 2026, 7:30 AM ET / 11:30 UTC. Summary Reproducibility: not applicable. this is a feature PR rather than a bug report. The PR body provides a redacted live MCP Review metrics: 2 noteworthy metrics.
Merge readiness Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch. Rank-up moves:
Risk before merge
Maintainer options:
Next step before merge
Security Review detailsBest possible solution: Merge the bounded local-browser MCP image path if maintainers accept the security boundary; otherwise require a stricter output-directory policy or a remote artifact-transfer design before merge. Do we have a high-confidence way to reproduce the issue? Not applicable: this is a feature PR rather than a bug report. The PR body provides a redacted live MCP Is this the best way to solve the issue? Yes, with maintainer acceptance of the boundary: the implementation reuses the existing ChatGPT image wait/download path, constrains MCP-controlled paths, and fails closed for unsupported remote-service image output. AGENTS.md: found and applied where relevant. Codex review notes: model gpt-5.5, reasoning high; reviewed against 6019a199e44c. Label changesLabel changes:
Label justifications:
Evidence reviewedWhat I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics. How this review workflow works
|
d54ef7e to
d4df728
Compare
Address review feedback on the chatgpt_image / consult image-output work. Security: constrain agent-supplied generateImage / outputPath to the Oracle home directory by default (resolveMcpOutputPath in src/mcp/utils.ts). MCP clients are less trusted than the CLI user, so an agent must not be able to write generated images or saved responses to arbitrary host paths; `..` traversal is rejected by the resolved-prefix check. Opt into external paths with ORACLE_MCP_ALLOW_EXTERNAL_OUTPUT=1. A rejected path now returns a clean MCP error instead of throwing. CLI --generate-image / --output are unaffected. Correctness: the default image output path (and the browser-side session-less fallback) now carry a random suffix so concurrent agent calls in the same millisecond cannot resolve to the same path and overwrite each other. Contract: chatgpt_image reuses the typed consult output shape (images / artifacts / resolved) instead of loose z.record(...any), so the advertised structuredContent.images shape is actually declared. Tests: allow/reject path coverage for mapConsultToRunOptions, the consult MCP handler, and the chatgpt_image handler; default-path uniqueness assertion.
|
Did an independent pass on this and pushed a follow-up commit ( FindingsThe bot's two P1s (output-path security boundary, needs real proof) are fair. Two refinements after reading the actual write path (
Changes in
|
Address review: the boundary check in resolveMcpOutputPath was lexical (path.resolve prefix), so a pre-existing symlinked directory under ORACLE_HOME_DIR (e.g. ~/.oracle/generated -> /tmp/evil) could pass the string-prefix test while the actual image/text write landed outside the documented MCP containment boundary. Resolve the target through symlinks before the boundary check: realpath the deepest existing ancestor of the requested path and re-append the not-yet- created remainder, then compare against the realpath of the Oracle home. A symlinked parent that escapes the home is now rejected unless ORACLE_MCP_ALLOW_EXTERNAL_OUTPUT=1 is set. Tests: symlink-escape regression for BOTH generateImage and outputPath (rejected by default), a within-home symlink (allowed), and an escape allowed under the explicit opt-in. Verified the escape tests fail against the prior lexical check and pass after. Docs/changelog note the symlink-safe containment.
|
Both remaining P1s addressed. Symlink hardening ( Live behavior proof — ran a real
@clawsweeper re-review |
|
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
|
|
No code changes since the last review cleared the security defect (paths confined under the realpathed Oracle home, symlink escapes tested, external writes gated behind @clawsweeper re-review |
|
🦞👀 Command router queued. I will update this comment with the next step. Re-review progress:
|
Review found that consult/chatgpt_image image output was not threaded through the remote browser-service protocol: when ORACLE_REMOTE_HOST is configured the generated image is written on the remote host and never transferred back, so the promised structuredContent.images contract could not be fulfilled (the run would silently return no images, or proceed into a remote run that can't deliver them). Reject image output explicitly when a remote host is configured, with a clear error pointing the caller at the local-browser path. Local image generation is unaffected (verified end-to-end). Adds a regression test (remote host + token + generateImage -> isError) and documents the limitation in docs/mcp.md + changelog.
|
Closed the remote image-output [P1] from the last pass.
@clawsweeper re-review |
|
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
|
Summary
chatgpt_imageMCP tool that wraps ChatGPT browser image generation with a small agent-friendly input shape.generateImageandoutputPathto the generalconsulttool so advanced callers can still trigger the CLI--generate-imageimage-aware wait/download path directly.imagesarray with paths and ChatGPT file metadata, in structured MCP output.Verification
pnpm exec vitest run tests/mcp*.test.ts tests/mcp/**/*.test.tspnpm exec vitest run tests/mcp/chatgptImage.test.ts tests/mcp/utils.test.ts tests/mcp/consult.test.tspnpm exec oxfmt --check CHANGELOG.md README.md docs/browser-mode.md docs/mcp.md src/mcp/server.ts src/mcp/tools/consult.ts src/mcp/tools/chatgptImage.ts src/mcp/types.ts src/mcp/utils.ts tests/mcp/chatgptImage.test.ts tests/mcp/consult.test.ts tests/mcp/utils.test.tspnpm exec oxlint src/mcp/server.ts src/mcp/tools/consult.ts src/mcp/tools/chatgptImage.ts src/mcp/types.ts src/mcp/utils.ts tests/mcp/chatgptImage.test.ts tests/mcp/consult.test.ts tests/mcp/utils.test.tspnpm exec tsgo --noEmitfails on existing repo-wide issues unrelated to this patch:scripts/check.tsBun/esbuild target typing andtests/browser/chatgptImages.test.tsfetch mock typing under Node 26.Review follow-ups (commits
cfa4f047,4ee828de)Security hardening
generateImage/outputPathtoORACLE_HOME_DIRby default (..traversal rejected);ORACLE_MCP_ALLOW_EXTERNAL_OUTPUT=1opts into external paths. CLI--generate-image/--outputunaffected.realpath(deepest existing ancestor + non-existent remainder), so a symlinked parent under the Oracle home (e.g.~/.oracle/generated -> /tmp/evil) can no longer smuggle a write outside the boundary. Regression tests cover symlink escape for both fields (rejected by default), a within-home symlink (allowed), and an escape allowed under the explicit opt-in. The escape tests were verified to fail against the prior lexical check.Correctness / contract
chatgpt_imageoutput reuses the typedconsultoutput shape (images/artifacts/resolved).Live behavior proof (P1) — real
chatgpt_imageMCP run (redacted), GPT-5.5 browser session, image downloaded via cookie-backed URL and saved underORACLE_HOME_DIR/generated/:Saved file verified on disk:
PNG image data, 1254 x 1254, 8-bit/color RGB(963 KB).Verification:
tests/mcp41 passed;oxfmt --checkclean;oxlintclean on the listed files; no new typecheck errors in the changed files (the 3 pre-existingtsgoerrors remain in untouched files).Update — remote image-output handling (
8f221a7a)Addresses the re-review [P1]: image output was not threaded through the remote browser-service protocol, so
chatgpt_image/consultimage runs could not fulfill thestructuredContent.imagescontract whenORACLE_REMOTE_HOSTis configured.Fail closed: when a remote host is configured and image output (
generateImage/outputPath) is requested, the run is rejected with a clear error pointing at the local-browser path, instead of silently returning no images or proceeding into a remote run that can't deliver artifacts. Local image generation is unaffected (proven end-to-end above).generateImage→isError), verified to fail against the pre-fix source.docs/mcp.md+ changelog.