Skip to content

feat(channels/discord) Outbound file/image attachments + image_cache hardening#1162

Open
benhoverter wants to merge 28 commits intoRightNow-AI:mainfrom
benhoverter:feat/outbound-file-attach
Open

feat(channels/discord) Outbound file/image attachments + image_cache hardening#1162
benhoverter wants to merge 28 commits intoRightNow-AI:mainfrom
benhoverter:feat/outbound-file-attach

Conversation

@benhoverter
Copy link
Copy Markdown
Contributor

PR-C: Outbound file/image attachments + image_cache hardening

Branch: feat/outbound-file-attachupstream/main
Stack: independent of PR-A (#1143) and PR-B; rebased on upstream/main so it can land in any order. 27 commits, +4,141 / -64 across channels, runtime, and types.

Summary

Lets agents send images and files outbound through the Discord channel — and tightens the inbound image cache that feeds Claude Code's Read tool.

Two concerns, one branch (kept stacked here per maintainer preference to avoid base-branch profusion):

  1. Outbound attachments (channels/bridge, channels/discord, new outbound_attach.rs)

    • New <openfang:attach path="..." /> / url="..." /> parser in the bridge — agents emit it inline; the bridge extracts and routes to channel-specific outbound paths.
    • Discord outbound now handles ChannelContent::File, Image URL, and a true Multipart arm that batches N attachments into a single Discord message via multipart/form-data with greedy byte-cap packing (Discord's 25 MB / 10-file limits).
    • SSRF guard with redirect revalidation, scrubbed logs, content-length / streaming download cap aborts.
    • Body-aware 429 retry-after honored on the multi-file path; partial-send failures emit a structured grep-friendly WARN.
    • Parallel fetch via try_join_all; Fetcher trait extracted so tests can stub.
  2. Inbound image_cache hardening (runtime/image_cache, runtime/claude_code, runtime/compactor)

    • Extracted image_cache into its own module (was inline in claude_code.rs).
    • Filename-retaining tmpfile names: <hash16>__<sanitized_name>.ext so tmp/images/ is greppable instead of an opaque-hash junkyard. Cache-hit dedup still works via <hash>*.<ext> glob — first-name-wins on rename collisions.
    • Atomic publish (rename-from-temp), mtime refresh on cache hit, --add-dir plumbed so Claude Code's Read tool can actually open the materialized files.
    • Compactor now preserves http(s) source_url across compaction so re-materialization works after compaction events.

Out of scope (intentional)

  • Telegram / WhatsApp outbound attachments — touched only for type-signature parity.
  • Pruning pre-deploy bare-hash entries in tmp/images/ — harmless coexistence; no migration needed.

Test plan

  • cargo check clean
  • cargo clippy -D warnings clean (only a transitive imap-proto v0.10.2 future-incompat note, not ours)
  • cargo test -p openfang-channels532 unit + 9 integration passed, 0 failed, including:
    • test_multipart_outbound_multifile_429_* (body-aware retry-after, both single-shot and aggregated paths)
    • greedy-pack chunker byte-cap test
    • streaming download size-cap abort test
    • per-block resolver dispatch in mixed Multipart
  • Live smoke against the local-main daemon (Discord channel):
    • Inbound: filename retention verified (<hash16>__arize_app_01.png, <hash16>__pc_thumbnail_02.png).
    • Cache-hit dedup: same bytes re-uploaded → single entry, mtime refreshed, no bare-hash sibling.
    • First-name-wins on rename: Arize_app_01111.png re-upload reused the original __arize_app_01.png entry.
    • Outbound <openfang:attach .../> round-trip succeeded for both URL and absolute-path forms.

Notes for review

  • cargo fmt --check reports drift on files we didn't touch (kernel.rs, drivers/{anthropic,gemini,openai,vertex}.rs). Same upstream-wide drift currently red on main. Will be addressed by the separate fmt-only PR; not a blocker for this one.
  • The Fetcher trait + SSRF guard combo is the security-sensitive surface; flagging it for closer review.

🤖 Generated with Claude Code

benhoverter and others added 27 commits May 2, 2026 23:35
…tool can view them

Adds image materialization in the claude_code driver: inbound image
ContentBlocks are written to $HOME/.openfang/tmp/images/ so the Claude
CLI can view them via its Read tool (it cannot fetch URLs or read
in-memory bytes).

Also introduces `ContentBlock::Image::source_url: Option<String>` in
openfang-types and threads it through every consumer:
  - openfang-channels/bridge.rs (sets source_url when downloading from URL)
  - openfang-api/openai_compat.rs + routes.rs (serde round-trip)
  - openfang-runtime drivers: anthropic, gemini, openai, vertex (pattern
    match exhaustiveness)
  - openfang-runtime: agent_loop, compactor (pattern match exhaustiveness)

The `source_url` field is the load-bearing schema change for downstream
work: later commits in this series consume it for cache lookup, and
outbound Discord (separate PR) uses it to round-trip image URLs back to
Discord without re-uploading bytes. No driver reads it on this branch
yet — it is wired through serde and pattern matches only.
`materialize_image` previously wrote bytes directly to the
content-addressed destination via `fs::write`, which truncates-then-writes
non-atomically. Two concurrent renders of the same image (or a
re-render racing a Read tool invocation) could produce a torn,
partially-written file readable by the CLI's Read tool — a real risk
under load now that file-sharing is a first-class feature.

Switch to write-tmp + rename(2): write the decoded bytes to a unique
sibling tmpfile (suffixed with pid + nanos), then atomically rename
into the content-addressed destination. rename(2) is atomic on the
same filesystem, so readers either see the full file or nothing.
Loser of a race still rename-replaces with byte-identical content.

Orphan .tmp files from crashed processes are reaped by the existing
24h TTL sweep (mtime-based).
The CLI's Read tool refuses paths outside the agent's working
directory unless explicitly granted via --add-dir (or unless
--dangerously-skip-permissions is set, which we don't want to rely on
as the only escape hatch). Materialized images live under
$HOME/.openfang/tmp/images/, which is outside the agent workspace, so
without --add-dir the materialization is a dead-end whenever
skip_permissions is false.

Append --add-dir <image_tmp_dir> to both the non-streaming and
streaming Command builders. The directory is per-user and
content-addressed, so the grant is narrow and idempotent.
Lift the content-addressed image tmpfile cache out of the Claude Code
driver and into a sibling module so the upcoming outbound Discord
file-sharing path can reuse the same cache. No behavior change — the
extracted helpers (image_tmp_dir, ext_for_mime, materialize_image,
sweep_old_image_tmpfiles, IMAGE_TMP_TTL_SECS, sweep guard) are
byte-identical to the previous private impl.

The driver now imports image_tmp_dir / materialize_image /
spawn_sweep_once from crate::image_cache. The new module is publicly
exported so producers outside the runtime crate (channel adapters)
can resolve materialized paths from base64 image blocks before
forwarding them to outbound transports.
`materialize_image` is content-addressed and idempotent: a re-render
of the same image returns the existing path without rewriting. As a
side effect, the tmpfile's mtime never advanced past its original
write — so the 24h TTL sweep, which gates on `meta.modified()`,
could GC a tmpfile still actively referenced by an in-scope
ContentBlock::Image in a long-running conversation.

Refresh mtime via `File::set_modified(SystemTime::now())` (futimens
on Unix) on every cache hit. Read-only fd is sufficient: futimens
only requires file ownership, not write access.

Best-effort: any failure is debug-logged and the cached path is
returned anyway — worst case is the prior 24h-GC behavior.

Tests: cache-hit refreshes mtime and survives a sweep that would
otherwise GC the file; companion test confirms the sweep does
remove genuinely stale files.
ContentBlock::Image was being stringified to `[Image: {mime}]`,
silently dropping the `source_url` populated by the inbound Discord
path. That field exists so the outbound path (PR-C) can re-fetch
the original CDN-hosted image and re-attach it post-compaction —
without it, every compaction event quietly severed an image from
its CDN origin.

Emit `[Image: {mime} @ {url}]` when `source_url` is `http://` or
`https://`. `file://` (local tmpfile materialization) and any other
schemes fall back to the legacy mime-only form: those are internal
and must not leak into compacted summaries that may be persisted,
logged, or shipped across processes.

Tests cover all four arms (https, http, file://, None).
…args

Rust 1.94 / clippy 1.94 (CI runner image 20260413.86.1) flags the
`&media_type` and `&data` borrows at claude_code.rs:199 as
`needless_borrow` — both are already `&String` from destructuring
`ContentBlock::Image` by reference, and `materialize_image` takes
`&str` / `&[u8]`, so the compiler was re-dereferencing immediately.

Pure toolchain-drift fix; no behavior change. cargo test
-p openfang-runtime --lib → 958/958 green.
Pick 3a of the Discord file-passing plan: extend the URL-flavored File
variant with optional mime and size metadata so adapters can pass
attachment context through to bridges. FileData (bytes-flavored) is
unchanged; size is implicit in data.len() and mime_type already exists.

Match-arm sites in bridge.rs, telegram.rs, whatsapp.rs use `..` to stay
forward-compatible. Construction sites in telegram.rs and kernel.rs
pass `mime: None, size: None` for now; Discord inbound (PR-A) will
populate them.

Refs: projects/openfang-fork/discord-file-passing-plan.md
Pick 3a-bis of the Discord file-passing plan: teach the multimodal
image fetcher to handle file:// URLs by reading from local disk
instead of going through reqwest. PR-A (Discord inbound) will
materialize attachments to a shared inbox dir and emit
ChannelContent::Image { url: "file://..." }, so this branch is what
unblocks vision on inbox-materialized images after the Discord CDN
URL has expired.

Implementation:
- Branch on url.strip_prefix("file://"); local read uses tokio::fs::read.
- HTTP path unchanged. Both paths converge on (Vec<u8>, Option<String>)
  before the existing 5MB cap, magic-byte sniffing, and base64 path.
- No content-type header on file:// — magic-byte detection and URL
  extension fallback do all the media-type work, which is fine since
  detect_image_magic and media_type_from_url already exist.
- No new deps. Vec<u8> instead of bytes::Bytes to avoid pulling in
  the bytes crate as a direct dep.
- No URL percent-decoding: the inbox writer (PR-A) controls filenames
  and avoids characters that would need encoding.

Refs: projects/openfang-fork/discord-file-passing-plan.md (step 2)
Adds a single tracing::debug! at the top of parse_discord_message that
dumps the full payload JSON. Silent at default `info` level; enable with
`RUST_LOG=openfang_channels::discord=debug` to capture real attachment
JSON when developing the file-passing parse code.

Logs before any filters (bot, allowed_users, allowed_guilds, empty
content) so attachment-only messages are visible too.
Discord MESSAGE_CREATE payloads with attachments were previously parsed
in a way that either dropped the attachment (when text was present, only
the text was kept) or dropped the whole message (when text was empty,
the early `content.is_empty()` return killed bare-image posts). The
result on text-only providers like claude-code: silent drops, then
hallucinated acknowledgements of content the model never saw.

This rewires the inbound path end-to-end:

* types: add ChannelContent::Multipart(Vec<ChannelContent>) so a single
  inbound message can carry a caption + one or more attachments as
  sibling blocks. Doc forbids nesting; consumers debug_assert.

* discord: classify attachments by MIME (with extension fallback for
  bot-relayed payloads that omit content_type) and a 5 MB vision-size
  cap matching Anthropic's image block limit. Vision-eligible images
  become ChannelContent::Image; everything else becomes File. Emit
  Multipart whenever text and attachments coexist, or when there are
  multiple attachments.

* bridge: flat-map Multipart in both dispatch paths — into Vec<ContentBlock>
  for multimodal-capable providers, and into a newline-joined text
  descriptor for text-flatten providers.

* telegram: add the Multipart arm to send_to_user for exhaustive-match
  parity; flattens defensively.

* claude_code driver: render Image blocks as
  "[attachment: <mime> image, ~N KB — not viewable on this provider]"
  instead of dropping them. The model still cannot see the image, but
  it can acknowledge it coherently rather than confabulating.

Adds 9 discord parser tests covering all (text, attachment-count) shapes
plus MIME edge cases, and 2 claude_code driver tests covering captioned
and bare-image rendering.
…RL arms

Add `download_url_to_bytes` helper and wire `ChannelContent::File` /
`ChannelContent::Image` through to the existing `api_send_attachment`
multipart path, so outbound messages constructed as URL pass-throughs
(by the bridge or by the agent's `channel_send` tool with `file_url`)
upload as proper attachments instead of falling through to the
"(Unsupported content type)" warning.

Constants:
- URL_FETCH_TIMEOUT = 15s (per-request, matches sibling REST calls).
- URL_FETCH_MAX_BYTES = 25 MiB (matches Discord's non-Nitro upload cap;
  enforced both pre-flight via Content-Length and via streamed chunks
  to cover servers that omit/lie about Content-Length).

Helpers (extracted as pure functions for ease of testing without an
HTTP mock layer):
- strip_mime_params: normalize Content-Type by dropping `; charset=...`.
- derive_filename_from_url + percent_decode_lossy: pull a filename out
  of the URL path tail.
- resolve_file_filename / resolve_file_mime: fallback chain for the File
  arm (explicit field → URL tail → response CT → extension lookup →
  application/octet-stream).
- resolve_image_filename / resolve_image_mime: fallback chain for the
  Image arm (URL tail → response CT → extension → image/png default).

`api_send_attachment` now takes `impl Into<bytes::Bytes>`, so callers
passing a `Vec<u8>` (the existing FileData arm) and callers passing
`Bytes` from `download_url_to_bytes` (the new arms) both avoid an
extra allocation.

Test approach: dev-deps unchanged. The size-cap tests use a hand-rolled
`tokio::net::TcpListener` server so we can craft a Content-Length header
independent of the body length (axum always sets a truthful CL, which
defeats the streaming-cap test). The fallback-chain tests are pure
unit tests against the helper functions — no HTTP needed. 9 new tests:

- test_strip_mime_params_basic
- test_derive_filename_from_url
- test_resolve_file_filename_chain
- test_resolve_file_mime_chain
- test_resolve_image_filename_chain
- test_resolve_image_mime_chain
- test_download_size_cap_via_content_length
- test_download_size_cap_via_streaming
- test_download_under_cap_succeeds_and_returns_bytes_and_ct

`cargo check -p openfang-channels` and `cargo test -p openfang-channels
--lib discord::` both pass (40/40).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…bing for URL fetch

Closes the SSRF surface on `download_url_to_bytes`, which was reachable via
agent-supplied URLs in `ChannelContent::File` / `Image` outbound arms. A
prompt-injected agent could previously have caused the bot to fetch
`http://169.254.169.254/...` and exfiltrate cloud-metadata creds to Discord.

Changes:

1. URLs are now parsed with `url::Url`. Schemes other than http/https are
   refused with a clear error.
2. SSRF host guard: DNS-resolves the host via `tokio::net::lookup_host` and
   refuses if any resolved address is loopback, RFC1918 / link-local /
   unique-local / CGNAT, multicast, unspecified, broadcast, or the literal
   cloud-metadata IP. IPv4-mapped IPv6 is unwrapped and re-checked.
3. Per-request reqwest client with a custom redirect policy: caps at 3 hops
   and re-applies the literal-IP SSRF check on every hop's URL. DNS
   re-resolution inside the sync redirect callback is documented as out of
   scope; the threat model is a malicious URL, not a malicious DNS server.
4. New `redact_url` helper strips query string + fragment from every URL
   that gets interpolated into a `warn!`/`error!`/returned `Err`. Discord
   CDN URLs carry HMAC-style `ex`/`is`/`hm` params that must not leak into
   operator log aggregators.
5. `BytesMut::with_capacity` pre-sizes the body buffer using
   `min(content_length.unwrap_or(64 KiB), URL_FETCH_MAX_BYTES)`, removing
   ~24 reallocations on the 25 MiB happy path.
6. Explicit User-Agent header `openfang-channels-discord/<crate-version>`
   on the download request.
7. The streaming-cap test now uses a side-channel `AtomicUsize` counter
   on a chunk-by-chunk server. It asserts the server stopped writing well
   short of the full payload, proving the client aborts mid-stream rather
   than buffering everything and complaining at the end.

Plus new SSRF unit tests covering loopback, RFC1918 (10.x and 192.168.x),
the 169.254.169.254 cloud-metadata canary, non-http schemes, IPv6 loopback
and the IPv4-mapped form of the metadata IP, plus a `redact_url` test and
a `is_blocked_v4` canary test pinning the blocked ranges. 40 -> 49 tests.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…hing

- Add ChannelContent::Multipart handling in DiscordAdapter::send.
- Concatenate Text blocks into a single caption, chunk attachments into
  groups of 10, send one multipart POST per chunk with caption on first.
- Mixed Image+File blocks supported; per-block resolver dispatch.
- Fail-fast on any fetch error; no partial sends.
- Delete stale "parent recursion path" comment now that this arm IS the path.
- Add api_send_attachments (plural); refactor api_send_attachment as a
  thin wrapper. Add cfg(test) api_base_override + api_base() helper so
  tests can point the adapter at a local axum stub.

Tests: caption concat, empty-caption suppression, >10 chunking, caption-only
fallback, mixed Image+File one-shot, plus mid-batch failure / empty / unknown
nesting. cargo build / cargo test / cargo clippy -- -D warnings all clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… Multipart

The previous mixed-types test used two FileData blocks, which never
exercised the Image-vs-File resolver dispatch introduced in the new
Multipart arm. Replace it with a test that constructs a real
Image{url} + File{url} Multipart, serves both from local HTTP fixture
servers (bypassed via a new #[cfg(test)] ssrf_bypass flag, modelled on
the existing api_base_override pattern), and asserts that:

- the Image part's filename + Content-Type come from resolve_image_*
  (server CT preserved, URL path tail used as filename)
- the File part's filename + Content-Type come from resolve_file_*
  (explicit filename/mime from the File{} block take precedence)
- the caption is in payload_json.content
- exactly one multipart POST is issued

Also adds spawn_fixture_server (a configurable-CT sibling of
spawn_raw_http_server), CapturedFile metadata capture in the Discord
stub, and test_adapter_with_base_and_ssrf_bypass helper.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replaces the `#[cfg(test)] ssrf_bypass: bool` field on `DiscordAdapter`
with an `Arc<dyn Fetcher>` injected at construction time. The wire-level
HTTP fetch (`do_http_fetch`) is now a free fn shared by:

- `ProductionFetcher` — runs `resolve_and_check_host` SSRF preflight,
  then `do_http_fetch`. Default for `DiscordAdapter::new`.
- `PermissiveFetcher` (cfg(test)) — same wire path, no SSRF preflight,
  for tests that need to talk to 127.0.0.1 stubs.

Test helper `test_adapter_with_base_and_ssrf_bypass` swaps in
`PermissiveFetcher` instead of toggling a bool. `test_fetch` calls
`do_http_fetch` directly. All 57 discord tests pass; clippy clean
under `-D warnings`.
Introduces an `AttachmentSource` enum (Resolved | UrlImage | UrlFile)
that the classification pass produces directly. The previous code
walked `parts` twice: once to bucket Text vs unknowns, then again to
fetch and resolve every non-Text variant.

Now there's a single match-arm block over `parts` that emits
`AttachmentSource`s into a typed Vec. A small `resolve_attachment_source`
helper then turns each source into the `(bytes, filename, mime)` tuple
the multipart helper consumes — fail-fast preserved, order preserved,
existing error message format ("Multipart fetch failed for {url}: …")
preserved. All 57 discord tests pass; clippy clean.

Sets the shape for step 4 (parallel fetch via try_join_all) — the
serial resolver call site is now the only place that needs to change.
The `if i == 0 { ATTACHMENT_FIELD_NAME.to_string() } else { format!(…) }`
branch was redundant — both arms produced identical strings. Replaced
with the unconditional format. The pinned constant is now `#[cfg(test)]`
and the format-invariant assertion (`format!("files[{}]", 0) ==
ATTACHMENT_FIELD_NAME`) was added so a future refactor that drifts the
two apart breaks loudly. 57/57 discord tests pass; clippy clean.
Replaces the serial resolve loop in the Multipart arm with
`futures::future::try_join_all`. For an N-URL Multipart this drops
latency from sum-of-RTTs to max-of-RTT while keeping every existing
invariant: input order is preserved (so `files[i]` still lines up
with the original block index), fail-fast cancels remaining fetches
on the first error, and the SSRF + size-cap + log-scrub behavior
is unchanged because all of that lives inside `Fetcher::fetch`.

Side change: `resolve_attachment_source`'s error type is widened to
`Box<dyn Error + Send + Sync>` so the `try_join_all` future is `Send`
(the surrounding `ChannelAdapter::send` future has to be `Send` per
`#[async_trait]`). The looser `Box<dyn Error>` returned by `send` is
recovered via an explicit unsize coercion at the await site —
removing auto traits from a trait object isn't exposed via `From`,
so a one-line `.map_err` does it. 57/57 discord tests pass; clippy
clean.
Discord rejects requests over ~25 MiB total; the previous count-only
chunker (≤ 10 attachments) could trivially overshoot — 10×3 MiB = 30 MiB
→ silent 413. Adds CHUNK_TOTAL_CAP_BYTES = 24 MiB (1 MiB headroom for
multipart envelope) and a `chunk_attachments` helper that greedy-packs
under both the count cap (10) and the byte cap, preserving input order.
Oversized single attachments still go in their own chunk so the helper
always makes progress (Discord still rejects, matching pre-existing
behavior).

Tests: a unit test pinning the helper's count/byte/overflow/empty
behavior, plus an end-to-end test that sends 3×10 MiB FileData blocks
through `send()` and asserts the chunker emits 2 chunks (20 + 10 MiB)
with the caption only on the first. The discord-stub's axum router
gets `DefaultBodyLimit::disable()` so the 20 MiB chunk doesn't trip
axum's 2 MiB default. 59/59 discord tests pass; clippy clean.
Adds an integration test that fronts the multipart endpoint with a stub
that returns 429 (Retry-After: 0, body retry_after: 0) on the first
POST and 200 on the second. Sending a 3-attachment Multipart must
produce exactly 2 POSTs against the same chunk and both must carry
files[0..2] (the retry rebuilds the form, doesn't drop entries).

The body-aware retry behavior was already shared across single- and
multi-file paths via `api_send_attachments`; this test pins the
multi-file path so a future refactor that lifts retry handling can't
silently regress it. 60/60 discord tests pass; clippy clean.
The previous warn line buried the partial-send signal in prose; an
operator grepping for "why are some files showing and some not?" had
to read the full message to find the relevant bits. Switch to a
structured WARN with `event = "discord_multipart_partial_send"` plus
`chunks_sent` / `chunks_total` / `failed_chunk_index` fields so a
single greppable token surfaces the partial-send class. The prose
message is preserved as the event description for human readers.
60/60 discord tests pass; clippy clean.
Three review findings, all S2 (should-fix), addressed in one commit:

1. Drop the duplicated `failed_chunk_index` field from the partial-send
   WARN. It encoded the same value as `chunks_sent`; two field names
   for the same datum is a misnaming smell. Kept `chunks_sent` and
   `chunks_total` — the failed index is recoverable as `chunks_sent`.

2. Inline the now-vestigial `download_url_to_bytes` wrapper. Post-
   Fetcher-trait it was a one-line passthrough to `self.fetcher.fetch`.
   Six call sites (production + SSRF tests) now hit the trait method
   directly; one less hop for readers.

3. Add a header-only 429 retry test. The existing test sets both
   header and body so a regression that drops body-aware parsing would
   pass. New test sends 429 with `Retry-After: 0` and an empty body,
   pinning the header-fallback path independently.

61/61 discord tests pass; clippy clean.
Second-pass reviewer flagged the body+header and header-only 429 tests
as ~95% identical: Axum router scaffolding, multipart-drain loop,
CapturedPost accumulation, and post-assertions were duplicated verbatim.

Extracts:
- `spawn_429_then_ok_stub(first_response)` — closure-driven stub that
  returns the configured response on attempt 0 and 200 OK after.
- `caption_plus_n_files(caption, n)` — builds a Multipart of N
  FileData blocks named `a.txt`, `b.txt`, ….
- `assert_all_attempts_carry_files(captured, n)` — pins the
  `files[0..n)` field-name invariant across every retry attempt.

Each test is now ~15 lines that document only what differs (the 429
response shape and attachment count). 61/61 discord tests pass; clippy
clean.
Discord's CDN edges occasionally advertise `content-encoding: gzip` (or
deflate/brotli) on PNG/JPEG passthroughs while the body is raw,
uncompressed image bytes. With the default `reqwest::Client::new()` and
the workspace's gzip/deflate/brotli features all enabled, reqwest's
transparent-decompression layer chokes on the PNG/JPEG header and
returns "error decoding response body" only on `bytes().await` (not on
`send()`), causing `download_image_to_blocks` to silently fall back to a
text-only block — the user's image never reaches the model.

Build the client explicitly with no_gzip/no_deflate/no_brotli so the
request advertises identity encoding and the body is read raw. Also set
a User-Agent (some CDN edges 403 clients without one) and a 30s timeout
aligned with the upstream 5 MB cap.

Repro: send an image attachment via Discord; the daemon logs
`Failed to read image bytes: error decoding response body` and the turn
appends as text-only with `appended_has_image=false`. After this fix the
PNG bytes are read and emitted as an Image content block as intended.
Teach `send_response` to recognise `<openfang:attach path="…" [name="…"]
[spoiler="true"] [caption="…"]/>` markers in agent text, resolve them
against an allow-root (default `$HOME/.openfang/`), read the bytes, and
emit `ChannelContent::FileData` blocks. Single attachment goes out as
`FileData`; multiple (or text + attachments) go out as `Multipart`,
which the existing `discord::send` chunker (10/msg, 24 MiB aggregate)
handles unchanged.

Parsing happens BEFORE channel formatting — telegram HTML / slack
mrkdwn would escape `<` and break detection downstream.

Per-directive failures (path missing, outside allow-root, oversized,
not a regular file) are logged at WARN and the marker is silently
dropped from the outgoing message — partial success rather than
failing the whole reply.

Allow-root canonicalisation closes the symlink-escape vector. Paths
must be absolute. Per-file cap is 25 MiB; per-message cap is 10
attachments (Discord's hard limit). MIME table mirrors `tool_runner`'s
`channel_send` extension table for inbound/outbound symmetry.

10 unit tests cover: no markers → NoMarkers; single resolves to
FileData; caption appended; spoiler prefix; name override; outside
allow-root rejected; relative path rejected; multiple resolved;
malformed marker preserved; mime table.
Files under ~/.openfang/tmp/images/ have been content-addressed by
sha256(bytes) since the cache was introduced. Idempotent and correct,
but unfriendly: a human grepping the dir or recalling "the TCC png"
sees only opaque hashes.

Plumb a best-effort `original_name: Option<&str>` through
`materialize_image`. When present, the resulting filename becomes
`<hash16>__<sanitized>.<ext>`; when absent, the legacy `<hash16>.<ext>`
shape is preserved. Cache-hit lookup now globs `<hash16>*.<ext>` so a
re-render with or without a name converges on the existing tmpfile —
preserving the dedup property that makes the cache useful.

The driver-side caller (claude_code) derives the hint from the source
URL's last path segment, which already encodes the filename for Discord
CDN, Telegram file API, S3, etc. — no protocol/wire-format changes
needed. file:// source URLs (used by future inbox materialization)
short-circuit to None to avoid double-suffixing.

Tests cover: name appended in expected shape, dedup across
named/unnamed renders, sanitization edge cases (path traversal,
collapsing punctuation, length cap, Unicode collapse to underscore).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@benhoverter benhoverter changed the title Outbound file/image attachments + image_cache hardening feat(channels/discord) Outbound file/image attachments + image_cache hardening May 5, 2026
`File::set_modified` calls `SetFileTime` on Windows, which requires the
handle to have FILE_WRITE_ATTRIBUTES. A read-only `File::open` works on
Unix (`futimens` only checks ownership) but panics with PermissionDenied
on the windows-latest CI runner. Switch all three call sites — production
`touch_mtime` and two tests — to `OpenOptions::new().write(true).open(...)`.

Fixes Test / windows-latest in PR RightNow-AI#1162.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant