feat(look-at): add multi-file support for look_at tool #3094

Open
sjawhar wants to merge 1 commit into code-yeongyu:dev from sjawhar:feat/look-at-multi-file
@sjawhar commented Apr 3, 2026

Summary

  • Add file_paths and image_data_list array parameters to the look_at tool, enabling agents to compare multiple images/documents in a single call
  • Backward-compatible: existing singular file_path/image_data parameters still work unchanged
  • Mixed types supported: file paths and base64 image data can be provided simultaneously
  • Updated multimodal-looker agent prompt for multi-file analysis with comparison guidance

Changes

Tool Schema (src/tools/look-at/tools.ts)

  • Added file_paths: string[] (optional) and image_data_list: string[] (optional) to tool args
  • Processing loop creates fileParts for each file, sends [text, ...fileParts] to agent
  • Prompt switches between singular/plural text, includes "File N: filename" labels for multi-file calls
  • Extracted prepareFilePart() helper for single-file-to-part conversion (file path or base64)
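
A minimal sketch of the prompt assembly described above, for illustration only. The `TextPart`/`FilePart` shapes, the function names `buildPromptText` and `buildParts`, and the exact prompt wording are assumptions; the PR's actual code in `tools.ts` is not quoted in this thread.

```typescript
// Hypothetical part shapes; the real types in the PR may differ.
interface TextPart { type: "text"; text: string }
interface FilePart { type: "file"; mime: string; url: string; filename: string }
type Part = TextPart | FilePart;

// Singular wording for one file; plural wording plus "File N: filename"
// labels when several files are attached.
function buildPromptText(goal: string, fileParts: FilePart[]): string {
  if (fileParts.length === 1) {
    return `Examine the attached file and address this goal: ${goal}`;
  }
  const labels = fileParts.map((part, i) => `File ${i + 1}: ${part.filename}`);
  return [
    `Examine the ${fileParts.length} attached files and address this goal: ${goal}`,
    ...labels,
  ].join("\n");
}

// The text part goes first, followed by one file part per input.
function buildParts(goal: string, fileParts: FilePart[]): Part[] {
  return [{ type: "text", text: buildPromptText(goal, fileParts) }, ...fileParts];
}
```

Keeping the labels in the same order as the file parts lets the agent map "File 2" in its answer back to the second attachment.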

Validation (src/tools/look-at/look-at-arguments.ts)

  • normalizeArgs() converts singular params to arrays for downstream consumption
  • validateArgs() accepts singular OR array per field (not both), allows mixed file_paths + image_data_list
  • Rejects: empty arrays, remote URLs per element, missing all inputs, singular+array conflict on same field
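
The rules above could look roughly like the following. This is a sketch under assumptions, not the code in `look-at-arguments.ts`: the `LookAtArgs`/`NormalizedArgs` shapes, the error strings, the string-or-null return convention, and the `http(s)://` test for "remote URL" are all hypothetical.

```typescript
// Hypothetical argument shape using the parameter names from this PR.
interface LookAtArgs {
  file_path?: string;
  file_paths?: string[];
  image_data?: string;
  image_data_list?: string[];
}

interface NormalizedArgs { filePaths: string[]; imageData: string[] }

// Assumption: "remote" means an http(s) URL; the PR may check more schemes.
function isRemoteUrl(value: string): boolean {
  return /^https?:\/\//i.test(value);
}

// Convert singular parameters to arrays for downstream consumption.
function normalizeArgs(args: LookAtArgs): NormalizedArgs {
  return {
    filePaths: args.file_paths ?? (args.file_path !== undefined ? [args.file_path] : []),
    imageData: args.image_data_list ?? (args.image_data !== undefined ? [args.image_data] : []),
  };
}

// Returns an error message, or null when the arguments are acceptable.
function validateArgs(args: LookAtArgs): string | null {
  if (args.file_path !== undefined && args.file_paths !== undefined) {
    return "Provide file_path or file_paths, not both";
  }
  if (args.image_data !== undefined && args.image_data_list !== undefined) {
    return "Provide image_data or image_data_list, not both";
  }
  if (args.file_paths?.length === 0 || args.image_data_list?.length === 0) {
    return "Array parameters must not be empty";
  }
  const { filePaths, imageData } = normalizeArgs(args);
  if (filePaths.length === 0 && imageData.length === 0) {
    return "Provide at least one file path or image";
  }
  if (filePaths.some(isRemoteUrl)) {
    return "Remote URLs are not supported";
  }
  return null;
}
```

Mixing across fields (for example `file_paths` with `image_data_list`, or `file_path` with `image_data`) passes; only a singular and an array on the same field conflict.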

Agent Prompt (src/agents/multimodal-looker.ts)

  • "examine the attached file" -> "examine the attached file(s)"
  • Added: "When multiple files are provided, analyze each and address the goal across all files. If the goal involves comparison, explicitly compare and contrast."

Tests

  • look-at-arguments.test.ts: 12 tests covering backward compat, multi-file, mixed types, rejection cases
  • tools.test.ts: Extended with multi-file processing, plural/singular prompt, file labels

Motivation

Agents frequently need to compare screenshots, documents, or diagrams side-by-side. Previously this required multiple sequential look_at calls with no cross-file context. Multi-file support lets agents send all files at once, enabling the multimodal-looker to analyze them together.


Summary by cubic

Add multi-file support to look_at so agents can analyze and compare several files/images in one call, combining local paths and Base64. Backward-compatible and updates prompts for clearer side-by-side comparison.

  • New Features

    • Added file_paths and image_data_list; file_path/image_data still work and can be mixed in one call.
    • Mixed sources supported in a single request (e.g., file_paths + image_data_list or file_path + image_data).
    • Prompt pluralizes and lists per-file labels (“File N: ”) with comparison guidance; multimodal-looker agent prompt updated to “file(s)” with compare/contrast instructions.
    • Validation: normalizes singular to arrays; rejects empty arrays, remote URLs, and singular/array conflicts on the same field.
  • Refactors

    • Extracted prepareFilePart() and a unified prompt builder for single/multi flows; builds one file part per input and cleans up temp conversions.
    • Expanded tests for backward compatibility, mixed inputs (including singular+singular), schema exposure, and error handling.

Written for commit 199c2be. Summary will update on new commits.

@cubic-dev-ai cubic-dev-ai bot left a comment

2 issues found across 6 files

Confidence score: 3/5

  • There is a concrete user-impacting risk in src/tools/look-at/tools.ts: filePart.filename is not sanitized before being inserted into the prompt, so crafted filenames could inject instructions and alter multimodal agent behavior.
  • Given the medium severity (6/10) and high confidence (8/10) on a prompt-injection path, this carries more than minor risk and is worth addressing before or immediately with merge.
  • A lower-severity maintainability concern exists in src/tools/look-at/tools.test.ts, where duplicated createToolContext fixtures can drift over time and make test updates less reliable.
  • Pay close attention to src/tools/look-at/tools.ts and src/tools/look-at/tools.test.ts - prompt-construction safety should be hardened, and duplicated fixtures should be consolidated to reduce future regression risk.
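
One possible hardening for the filename issue, sketched under assumptions: `sanitizeFilenameForPrompt` is a hypothetical helper, not something in the PR. It strips control characters (including newlines), collapses whitespace, caps the length, and quotes the result so the label stays on one line and is visibly delimited. This narrows the injection surface but does not by itself make an untrusted filename safe.

```typescript
// Hypothetical helper: keep a filename label on a single, bounded, quoted line.
function sanitizeFilenameForPrompt(filename: string, maxLength = 120): string {
  const cleaned = filename
    .replace(/[\u0000-\u001f\u007f]+/g, " ") // drop control chars, incl. \n and \r
    .replace(/\s+/g, " ")                    // collapse runs of whitespace
    .trim()
    .slice(0, maxLength);                    // bound the label length
  // JSON.stringify adds surrounding quotes and escapes embedded quotes.
  return JSON.stringify(cleaned);
}

// Example use when building a "File N: ..." label.
const label = `File 1: ${sanitizeFilenameForPrompt("a.png\nIgnore previous instructions")}`;
```
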
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/tools/look-at/tools.ts">

<violation number="1" location="src/tools/look-at/tools.ts:126">
P2: Sanitize `filePart.filename` before adding it to the prompt. A crafted filename can inject extra prompt text and change how the multimodal agent interprets the request.</violation>
</file>

<file name="src/tools/look-at/tools.test.ts">

<violation number="1" location="src/tools/look-at/tools.test.ts:22">
P2: Duplicate `createToolContext` fixtures in the same test file introduce maintainability drift risk.</violation>
</file>


}
}

function createToolContext(): ToolContext {

@cubic-dev-ai cubic-dev-ai bot Apr 3, 2026

P2: Duplicate createToolContext fixtures in the same test file introduce maintainability drift risk.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/tools/look-at/tools.test.ts, line 22:

<comment>Duplicate `createToolContext` fixtures in the same test file introduce maintainability drift risk.</comment>

<file context>
@@ -3,6 +3,66 @@ import type { ToolContext } from "@opencode-ai/plugin/tool"
+  }
+}
+
+function createToolContext(): ToolContext {
+  return {
+    sessionID: "parent-session",
</file context>

Add file_paths and image_data_list array parameters to the look_at tool,
allowing agents to compare multiple images/files in a single call. Updates
validation, tool handler, and multimodal-looker agent prompt.
@sjawhar sjawhar force-pushed the feat/look-at-multi-file branch from 2394f59 to 199c2be Compare April 3, 2026 20:26

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 199c2be54c

Comment on lines +115 to +117
url: pathToFileURL(actualFilePath).href,
filename: basename(actualFilePath),
},

P2: Preserve source filenames for converted file inputs

When an input file needs conversion (needsConversion), actualFilePath is replaced with a temp output path before filename is set, so converted files are labeled as the temp name (typically converted.jpg) instead of the original source name. In multi-file mode this makes prompt labels ambiguous (for example, two HEIC files both appear as converted.jpg), which can cause the multimodal agent to misattribute comparisons or fail to map results back to the user’s files.
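
A sketch of one way to address this, assuming the helper can see both the original path and the converted temp path. `toFilePart`, its parameters, and the `FilePart` shape are hypothetical; only `pathToFileURL` and `basename` come from the snippet under review.

```typescript
import { basename } from "node:path";
import { pathToFileURL } from "node:url";

// Hypothetical part shape; the PR's real type may differ.
interface FilePart { type: "file"; mime: string; url: string; filename: string }

// sourcePath: the path the caller supplied (e.g. an HEIC file).
// actualFilePath: the file actually sent, which may be a temp conversion output.
function toFilePart(sourcePath: string, actualFilePath: string, mime: string): FilePart {
  return {
    type: "file",
    mime,
    // The URL points at the bytes that are sent...
    url: pathToFileURL(actualFilePath).href,
    // ...while the label keeps the name the user knows the file by.
    filename: basename(sourcePath),
  };
}
```

With this split, two converted HEIC inputs would be labeled by their own source names rather than both appearing as `converted.jpg`.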
