runtime/claude_code: materialize image blocks to tmpfile + extract image_cache module#1151

Open
benhoverter wants to merge 7 commits into RightNow-AI:main from benhoverter:feat/runtime-image-cache

Summary

Makes inbound image ContentBlocks viewable by the Claude Code CLI driver, which cannot fetch URLs or read in-memory bytes — it can only Read files on disk. Images are now materialized to a content-addressed tmpfile under $HOME/.openfang/tmp/images/ and the directory is granted to the CLI via --add-dir. Also extracts the cache into a sibling module so future producers (outbound Discord) can reuse it.

Four commits, no merge commits, builds clean workspace-wide.

Commits

  1. 1203647 runtime/claude_code: materialize images to tmpfile so the CLI's Read tool can view them
    Adds tmpfile materialization in the claude_code driver. Also introduces ContentBlock::Image::source_url: Option<String> in openfang-types and threads it through every consumer (api, channels/bridge, all four model drivers for pattern-match exhaustiveness, agent_loop, compactor). The field is wired through serde and matches but is not consumed by any driver on this branch — it is the load-bearing schema change for the rest of the series and for outbound Discord (separate PR). Includes a regression test (bridge::tests::download_image_to_blocks_populates_source_url) asserting source_url is populated on the HTTP fetch path.

  2. 021c6df fix(runtime/claude_code): atomically publish materialized image tmpfiles
    Replaces the non-atomic fs::write with write-tmp + rename(2). Closes a torn-file window where a re-render racing a Read tool call could read a partially-written image. Orphan .tmp files are reaped by the existing 24h TTL sweep.

  3. 46e5c8c fix(runtime/claude_code): pass --add-dir for materialized image tmpfiles
    The CLI's Read tool refuses paths outside the agent workspace unless --add-dir'd (we don't want --dangerously-skip-permissions as the only escape hatch). Adds the grant to both streaming and non-streaming command builders. The directory is per-user and content-addressed, so the grant is narrow.

  4. c93318f refactor(runtime): extract image_cache module
    Lifts the cache helpers (image_tmp_dir, ext_for_mime, materialize_image, sweep_old_image_tmpfiles, IMAGE_TMP_TTL_SECS, sweep guard) out of the driver into crates/openfang-runtime/src/image_cache.rs. No behavior change — extracted helpers are byte-identical. Module is pub so channel adapters can resolve cached paths from base64 image blocks before forwarding them outbound.

Notes for reviewers

  • Trivial textual conflict with #1143 (fix(channels/discord): surface image attachments to text-only providers). Both branches edit download_image_to_blocks in crates/openfang-channels/src/bridge.rs: #1143 added file:// support, while this PR sets source_url on the resulting ContentBlock::Image. The conflict is mechanical (one extra field on a struct literal) and I'll resolve it on whichever branch lands second.
  • source_url field added but not read on this branch. It's wired through serde and every pattern match for type-safety, but no driver consumes it here. The consumer ships in the outbound-Discord PR (forthcoming) where it lets us round-trip a Discord CDN URL back to Discord without re-uploading bytes.
  • Scope is runtime-only. No Discord, no channel-adapter behavior changes (the bridge.rs edit only sets the new field on download).

Test plan

  • cargo check --workspace clean (only pre-existing imap-proto future-incompat warning)
  • cargo test -p openfang-channels --lib clean (477 passed)
  • Inbound image via Discord URL → Claude Code driver materializes → CLI Read succeeds (validated end-to-end with a real receipt image; daemon log confirms materialization + CLI describes contents)
  • download_image_to_blocks populates source_url on the resulting ContentBlock::Image (new regression test, bridge::tests::download_image_to_blocks_populates_source_url)

Atomicity (commit 2) and the 24h TTL sweep (commit 4) are verified by code inspection: rename(2) is POSIX-atomic and the module extraction is byte-identical with no call-site change.

…tool can view them

Adds image materialization in the claude_code driver: inbound image
ContentBlocks are written to $HOME/.openfang/tmp/images/ so the Claude
CLI can view them via its Read tool (it cannot fetch URLs or read
in-memory bytes).
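
As a rough illustration of what "content-addressed" means here, the sketch below derives the file name from the image bytes, so the same image always maps to the same path and a second call is a cache hit. It is not the code from this PR: `materialize_sketch`, the extension table, and the use of the standard library's `DefaultHasher` are assumptions made to keep the example dependency-free (a cryptographic digest would be the more usual choice for a real cache).

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs;
use std::hash::{Hash, Hasher};
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical MIME-to-extension table; the real helper may cover more types.
fn ext_for_mime_sketch(mime: &str) -> &'static str {
    match mime {
        "image/png" => "png",
        "image/jpeg" => "jpg",
        "image/gif" => "gif",
        "image/webp" => "webp",
        _ => "bin",
    }
}

/// Write `bytes` under `dir` using a name derived from the content itself.
/// Assumption: DefaultHasher stands in for whatever digest the PR uses.
fn materialize_sketch(dir: &Path, mime: &str, bytes: &[u8]) -> io::Result<PathBuf> {
    fs::create_dir_all(dir)?;

    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    let name = format!("{:016x}.{}", hasher.finish(), ext_for_mime_sketch(mime));

    let path = dir.join(name);
    // Cache hit: identical bytes already produced this exact path.
    if !path.exists() {
        fs::write(&path, bytes)?;
    }
    Ok(path)
}
```

Calling `materialize_sketch` twice with the same bytes returns the same path and writes the file only once. The plain `fs::write` used here is the non-atomic form that a later commit in the series replaces.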

Also introduces `ContentBlock::Image::source_url: Option<String>` in
openfang-types and threads it through every consumer:
  - openfang-channels/bridge.rs (sets source_url when downloading from URL)
  - openfang-api/openai_compat.rs + routes.rs (serde round-trip)
  - openfang-runtime drivers: anthropic, gemini, openai, vertex (pattern
    match exhaustiveness)
  - openfang-runtime: agent_loop, compactor (pattern match exhaustiveness)
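
To show why one new field touches so many files, here is a reduced sketch of the enum. The variant set is trimmed and serde derives are omitted; only `media_type`, `data`, and `source_url` come from this PR, and `describe` is a hypothetical consumer.

```rust
/// Simplified stand-in for the type in openfang-types (assumption: the real
/// enum has more variants and serde attributes).
#[derive(Debug, Clone, PartialEq)]
enum ContentBlock {
    Text {
        text: String,
    },
    Image {
        media_type: String,
        data: String, // base64 payload
        source_url: Option<String>, // new field; None when the origin is unknown
    },
}

/// Hypothetical consumer that destructures every field by name.
fn describe(block: &ContentBlock) -> String {
    match block {
        ContentBlock::Text { text } => text.clone(),
        // A pattern that lists fields without `..` stops compiling when a
        // field is added, so each such match must at least acknowledge
        // `source_url`, even if it ignores the value.
        ContentBlock::Image { media_type, data, source_url: _ } => {
            format!("image ({}, {} base64 chars)", media_type, data.len())
        }
    }
}
```

Consumers that do not need the URL can bind it to `_` as above, which matches the "wired through pattern matches only" state described for this branch.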

The `source_url` field is the load-bearing schema change for downstream
work: later commits in this series consume it for cache lookup, and
outbound Discord (separate PR) uses it to round-trip image URLs back to
Discord without re-uploading bytes. No driver reads it on this branch
yet; it is wired through serde and pattern matches only.

`materialize_image` previously wrote bytes directly to the
content-addressed destination via `fs::write`, which truncates-then-writes
non-atomically. Two concurrent renders of the same image (or a
re-render racing a Read tool invocation) could produce a torn,
partially-written file readable by the CLI's Read tool — a real risk
under load now that file-sharing is a first-class feature.

Switch to write-tmp + rename(2): write the decoded bytes to a unique
sibling tmpfile (suffixed with pid + nanos), then atomically rename
into the content-addressed destination. rename(2) is atomic on the
same filesystem, so readers either see the full file or nothing.
Loser of a race still rename-replaces with byte-identical content.
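
The write-tmp + rename pattern described above can be sketched with the standard library alone. `publish_atomically` and the exact tmpfile naming are illustrative assumptions, not the code in this commit.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::process;
use std::time::{SystemTime, UNIX_EPOCH};

/// Write `bytes` to a unique sibling tmpfile, then rename it over `dest`.
/// Readers of `dest` see either the old state or the complete new file.
fn publish_atomically(dest: &Path, bytes: &[u8]) -> io::Result<()> {
    // pid + nanos keeps concurrent writers from sharing a tmpfile name.
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_nanos())
        .unwrap_or(0);
    let file_name = dest
        .file_name()
        .and_then(|n| n.to_str())
        .unwrap_or("image");
    let tmp = dest.with_file_name(format!("{}.{}.{}.tmp", file_name, process::id(), nanos));

    fs::write(&tmp, bytes)?;

    // The tmpfile is a sibling of `dest`, so both are on the same filesystem,
    // which is the condition for the rename to be atomic.
    if let Err(err) = fs::rename(&tmp, dest) {
        let _ = fs::remove_file(&tmp); // best-effort cleanup on failure
        return Err(err);
    }
    Ok(())
}
```

Keeping the tmpfile in the destination directory is the important design choice: a tmpfile on a different filesystem would turn the rename into a copy and lose the atomicity guarantee.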

Orphan .tmp files from crashed processes are reaped by the existing
24h TTL sweep (mtime-based).

The CLI's Read tool refuses paths outside the agent's working
directory unless explicitly granted via --add-dir (or unless
--dangerously-skip-permissions is set, which we don't want to rely on
as the only escape hatch). Materialized images live under
$HOME/.openfang/tmp/images/, which is outside the agent workspace, so
without --add-dir the materialization is a dead-end whenever
skip_permissions is false.
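
A minimal sketch of the grant, assuming a shared helper that both command builders call. `grant_image_dir` is a hypothetical name; only the `--add-dir` flag and the image directory come from this PR.

```rust
use std::path::Path;
use std::process::Command;

/// Append `--add-dir <image_tmp_dir>` so the CLI's Read tool may open
/// materialized images that live outside the agent workspace.
fn grant_image_dir(cmd: &mut Command, image_tmp_dir: &Path) {
    cmd.arg("--add-dir").arg(image_tmp_dir);
}
```

Routing both the streaming and non-streaming builders through one helper like this would keep the two argument lists from drifting apart.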

Append --add-dir <image_tmp_dir> to both the non-streaming and
streaming Command builders. The directory is per-user and
content-addressed, so the grant is narrow and idempotent.

Lift the content-addressed image tmpfile cache out of the Claude Code
driver and into a sibling module so the upcoming outbound Discord
file-sharing path can reuse the same cache. No behavior change — the
extracted helpers (image_tmp_dir, ext_for_mime, materialize_image,
sweep_old_image_tmpfiles, IMAGE_TMP_TTL_SECS, sweep guard) are
byte-identical to the previous private impl.

The driver now imports image_tmp_dir / materialize_image /
spawn_sweep_once from crate::image_cache. The new module is publicly
exported so producers outside the runtime crate (channel adapters)
can resolve materialized paths from base64 image blocks before
forwarding them to outbound transports.

`materialize_image` is content-addressed and idempotent: a re-render
of the same image returns the existing path without rewriting. As a
side effect, the tmpfile's mtime never advanced past its original
write — so the 24h TTL sweep, which gates on `meta.modified()`,
could GC a tmpfile still actively referenced by an in-scope
ContentBlock::Image in a long-running conversation.

Refresh mtime via `File::set_modified(SystemTime::now())` (futimens
on Unix) on every cache hit. Read-only fd is sufficient: futimens
only requires file ownership, not write access.

Best-effort: any failure is debug-logged and the cached path is
returned anyway — worst case is the prior 24h-GC behavior.
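
The interaction between the cache-hit refresh and the mtime-based sweep can be sketched as below. The function names and `SKETCH_TTL` are hypothetical, and the read-only open relies on the Unix behavior described above.

```rust
use std::fs::{self, File};
use std::io;
use std::path::Path;
use std::time::{Duration, SystemTime};

/// Hypothetical stand-in for the 24h TTL.
const SKETCH_TTL: Duration = Duration::from_secs(24 * 60 * 60);

/// Best-effort mtime refresh on a cache hit; failures are only logged.
fn touch_on_cache_hit(path: &Path) {
    let result = File::open(path).and_then(|f| f.set_modified(SystemTime::now()));
    if let Err(err) = result {
        eprintln!("mtime refresh failed for {}: {err}", path.display());
    }
}

/// Remove files whose mtime is older than `ttl`; returns how many were removed.
fn sweep_stale(dir: &Path, ttl: Duration) -> io::Result<usize> {
    let now = SystemTime::now();
    let mut removed = 0;
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let modified = entry.metadata()?.modified()?;
        // A file stamped in the future counts as age zero.
        let age = now.duration_since(modified).unwrap_or(Duration::ZERO);
        if age > ttl && fs::remove_file(entry.path()).is_ok() {
            removed += 1;
        }
    }
    Ok(removed)
}
```

With the refresh in place, a file that keeps getting cache hits always looks younger than the TTL to the sweep, while a file nobody references ages out as before.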

Tests: cache-hit refreshes mtime and survives a sweep that would
otherwise GC the file; companion test confirms the sweep does
remove genuinely stale files.

ContentBlock::Image was being stringified to `[Image: {mime}]`,
silently dropping the `source_url` populated by the inbound Discord
path. That field exists so the outbound path (PR-C) can re-fetch
the original CDN-hosted image and re-attach it post-compaction —
without it, every compaction event quietly severed an image from
its CDN origin.

Emit `[Image: {mime} @ {url}]` when `source_url` is `http://` or
`https://`. `file://` (local tmpfile materialization) and any other
schemes fall back to the legacy mime-only form: those are internal
and must not leak into compacted summaries that may be persisted,
logged, or shipped across processes.
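
The scheme filter described above amounts to a small match. `image_placeholder` is a hypothetical name for illustration; the two output formats are the ones quoted in this commit message.

```rust
/// Render an image block for a compacted summary. Only http(s) URLs are
/// included; file:// and any other scheme fall back to the mime-only form
/// so internal paths never reach persisted or logged summaries.
fn image_placeholder(media_type: &str, source_url: Option<&str>) -> String {
    match source_url {
        Some(url) if url.starts_with("http://") || url.starts_with("https://") => {
            format!("[Image: {media_type} @ {url}]")
        }
        _ => format!("[Image: {media_type}]"),
    }
}
```

An allow-list of schemes (rather than a deny-list for `file://`) means any scheme added later stays hidden by default.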

Tests cover all four arms (https, http, file://, None).
…args

Rust 1.94 / clippy 1.94 (CI runner image 20260413.86.1) flags the
`&media_type` and `&data` borrows at claude_code.rs:199 as
`needless_borrow` — both are already `&String` from destructuring
`ContentBlock::Image` by reference, and `materialize_image` takes
`&str` / `&[u8]`, so the compiler was re-dereferencing immediately.

Pure toolchain-drift fix; no behavior change. cargo test
-p openfang-runtime --lib → 958/958 green.