fix: [#788] surface OpenAI nested `error.message` instead of `Bad Request: {?:?}` (#789)
Conversation
`provider/error.ts` extracted `errMsg` via `body.message || body.error || body.error?.message`. For OpenAI's standard `{error: {message, type, code}}` response shape, `body.error` short-circuits as a truthy object, the `typeof errMsg === "string"` guard fails, and the parser falls through to dumping the raw response body. Telemetry (2026-05-04) caught users seeing `APIError: Bad Request: {?:?}` (App Insights redacts string values) and retrying with the same broken model selection because the surfaced error gave no clue about the underlying cause.
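A minimal sketch of the failure mode. The names `extractBuggy` and `ErrorBody` are illustrative stand-ins, not the actual `provider/error.ts` code:

```typescript
// Hypothetical reconstruction of the buggy extraction chain.
type ErrorBody = { message?: unknown; error?: unknown }

function extractBuggy(body: ErrorBody): string | undefined {
  // body.error is a truthy OBJECT for OpenAI's shape, so the OR-chain
  // stops there and body.error?.message is never reached.
  const errMsg = body.message || body.error || (body.error as any)?.message
  // The string guard then fails and the caller dumps the raw body.
  return typeof errMsg === "string" ? errMsg : undefined
}

// OpenAI's standard error shape:
const openai = {
  error: { message: "model `gpt-99` does not exist", type: "invalid_request_error", code: "model_not_found" },
}
console.log(extractBuggy(openai)) // undefined: the nested message is lost
```

With the shapes the other providers use (`{message: "..."}` or a string `body.error`), the chain happened to work, which is why the bug only surfaced for OpenAI responses.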
Switch to an explicit-typeof ternary matching `parseStreamError`'s pattern (line 153). Now handles all three shapes safely:
- `{error: {message: "..."}}` (OpenAI) — extracts nested message
- `{message: "..."}` (Anthropic et al.) — extracts top-level message
- `{error: "..."}` (some others) — extracts string error
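The fixed extraction can be sketched as follows (a hedged reconstruction of the pattern, not the literal `provider/error.ts` source; `extractMessage` is an illustrative name):

```typescript
type ErrorBody = { message?: unknown; error?: unknown }

// Each candidate is typeof-checked before use, so a truthy object can
// never short-circuit past a valid string elsewhere in the body.
function extractMessage(body: ErrorBody): string | undefined {
  const nested = (body.error as { message?: unknown } | undefined)?.message
  return typeof nested === "string"
    ? nested // {error: {message: "..."}} (OpenAI)
    : typeof body.message === "string"
      ? body.message // {message: "..."} (Anthropic et al.)
      : typeof body.error === "string"
        ? body.error // {error: "..."} (some others)
        : undefined // fall through to the raw responseBody dump
}
```

Note the ordering: a non-string `body.error.message` (array, object) simply fails its typeof check and the ternary keeps probing, so it cannot block a valid top-level `body.message`.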
Wrapped in `altimate_change start — upstream_fix:` markers; upstream has the same bug. Filing upstream PR in parallel.
## Tests
Six tests added to `error.test.ts` covering:
- OpenAI nested shape extracts the inner message and does not leak the structured body
- Top-level message field still works
- Plain-string `body.error` still works
- Object `body.error` without a `message` key falls through to top-level `body.message`
- Non-string `body.error.message` (array/object) does not block a valid `body.message` (regression guard for the same class of bug)
- All-non-string-fields body falls through to raw responseBody dump (existing last-resort behavior preserved)
Suite: 386/386 provider tests pass. Typecheck clean. Marker check passes.
Reviewed by GPT 5.4, Gemini 3.1 Pro, Kimi K2.5 via /consensus:code-review — all approved with the test coverage and ternary improvements that landed before commit.
Closes #788
* release: v0.7.1 — provider error handling pass

  A focused pre-release review (5-persona + 2-persona re-review) widened v0.7.1 from a single OpenAI-message-extraction fix into a tight provider-error pass. All changes are surgical, marker-disciplined, and pinned by 40 adversarial tests in `release-v0.7.1-adversarial.test.ts`.

  User-facing fixes:
  - Bedrock / AWS Lambda `errorMessage` shape now extracted by both `parseAPICallError` and `parseStreamError`.
  - `parseStreamError` no longer falls through to `JSON.stringify(e)` for non-OpenAI codes — uses the same string-typeof chain as the API path.
  - `model_not_found` short-circuits the OpenAI 404 retry storm; the user sees the actionable error on attempt 1 instead of after 5 silent retries.
  - `altimate models` discoverability hint appended on `model_not_found`.

  Privacy / defense-in-depth:
  - `Telemetry.maskString` redacts email addresses and internal hostnames (`*.local` / `*.internal` / RFC1918 / IMDS) so the unwrapped provider text from #789 doesn't leak more PII than the prior JSON-quote shred.
  - `MessageV2.APIError.metadata.url` masks internal hosts (incl. IPv6 loopback / ULA / link-local, AWS IMDS) and strips basic-auth userinfo before the URL flows into local storage / share / telemetry.
  - `responseBody` capped at 4KB at the `parseAPICallError` boundary.

  Docs:
  - New "Provider API Errors" section in `docs/docs/reference/troubleshooting.md`.

  CI gates passed locally:
  - typecheck (`tsgo --noEmit`): clean
  - marker guard (`analyze.ts --markers --base main --strict`): clean
  - pre-release (`bun run pre-release`): all 4 checks pass; binary starts
  - targeted tests: 102/102 (provider/error + adversarial + branding + retry)
  - full opencode suite: 8011 pass / 503 skip / 0 fail

  Closes #788

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address PR #794 review feedback (CodeRabbit + cubic)

  CodeRabbit and cubic flagged three classes of issue on the v0.7.1 PR.
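The `model_not_found` short-circuit described above can be sketched roughly as below. The function name, signature, and code list are assumptions for illustration; only the `model_not_found` fast-fail behavior comes from the commit message:

```typescript
// Codes that retrying cannot fix: fail fast on attempt 1 instead of
// burning the full retry budget on an identical 404.
const NON_RETRYABLE = new Set(["model_not_found"])

function isRetryable(status: number, code?: string): boolean {
  if (code && NON_RETRYABLE.has(code)) return false
  // 429 and 5xx are transient; other 4xx are caller errors.
  return status === 429 || status >= 500
}
```

When the short-circuit fires, the surfaced error also carries the `altimate models` discoverability hint, per the commit above.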
  All three are addressed here; CHANGELOG, troubleshooting doc, and code are now consistent.

  1. Hint spelling consistency (CR Minor / cubic P3)
     - Three different forms shipped in the original PR:
       - error.ts:290 -> "Run `altimate models` to see available models."
       - troubleshooting.md:47 -> "altimate-code models <provider>"
       - CHANGELOG.md:21 -> "Run `altimate-code models` to see available models."
     - Aligned to the form actually emitted by the code: `altimate models` (no dash, no provider arg). The troubleshooting.md auth-login example is also aligned to `altimate auth login` to match error.ts:99.

  2. maskString missing IMDS + IPv6 (CR Major / cubic P1)
     - The regex covered only RFC1918 + *.local / *.internal, while the paired maskInternalHost in error.ts already covered AWS IMDS (169.254.169.254), IPv6 loopback ([::1]), ULA (fc00::/7), and link-local (fe80::/10).
     - Extended the regex so an error string containing the same URL is redacted with the same coverage as the metadata.url path.
     - Added query-string/fragment chars (`+`, `#`, `,`, `;`) to the trailing char class so secrets after `?` don't survive past the `<internal-host>` marker.
     - 5 new adversarial tests for IMDS, IPv6 loopback/ULA/link-local, and the query/fragment leak guard.

  3. maskInternalHost only stripped userinfo for internal hosts (cubic P1)
     - Previously, `https://user:pass@10.0.0.5/x` redacted host AND userinfo, but `https://user:pass@api.openai.com/x` preserved `user:pass@`. A credential in a public-host URL is at least as sensitive as one in an internal-host URL — strip userinfo on EVERY URL, regardless of internal/public classification.
     - 1 new adversarial test for the public-host userinfo strip.

  Also expanded the upstream_fix marker comment in parseAPICallError to mention all four extractor shapes (OpenAI nested, Anthropic-style top-level, Bedrock errorMessage, legacy string error) so readers don't have to reverse-engineer the chain.
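The "strip userinfo on every URL, mask only internal hosts" policy from item 3 can be sketched with the WHATWG `URL` API. `maskUrl` is a hypothetical helper, and the internal-host checks below are a simplified subset (the real `maskInternalHost` also covers IPv6 loopback / ULA / link-local):

```typescript
// Strip credentials from EVERY URL; mask the host only when internal.
function maskUrl(raw: string): string {
  let url: URL
  try {
    url = new URL(raw)
  } catch {
    return raw // not a parseable URL; leave for string-level masking
  }
  url.username = "" // basic-auth userinfo is sensitive on any host
  url.password = ""
  const host = url.hostname
  const internal =
    host === "169.254.169.254" || // AWS IMDS
    host === "0.0.0.0" ||
    /^10\.|^192\.168\.|^172\.(1[6-9]|2\d|3[01])\./.test(host) || // RFC1918
    /\.(local|internal)$/.test(host)
  if (internal) {
    return `${url.protocol}//<internal-host>${url.pathname}`
  }
  return url.toString()
}
```

Clearing `username`/`password` on a `URL` drops the `user:pass@` segment from serialization, which is what makes the public-host case (`api.openai.com`) safe without touching its hostname.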
  Tests: 46 release-v0.7.1 adversarial cases (40 -> 46), 238/238 across provider+adversarial+upgrade+retry+telemetry suites. Typecheck clean. Marker guard --strict clean.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: wrap capResponseBody call sites in altimate_change markers

  The capResponseBody helper was wrapped in markers, but the two call sites in parseAPICallError (the context_overflow and api_error returns) were unmarked. The marker analyzer --strict caught this. Same fix as the "hint comment expanded" pass — every modified line in an upstream-shared file needs to live inside a marker block.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: wrap finalMessage usage in altimate_change markers

* fix: address PR #794 round-2 review feedback (cubic P1 + CR nits)

  cubic flagged a real P1 on the previous round's commit, and CodeRabbit posted a few low-value nits worth picking up while we're here.

  P1 (cubic) — the `Telemetry.maskString` URL regex missed URLs carrying basic-auth userinfo before the host. `https://admin:hunter2@10.0.0.5/x` fell through unredacted because the regex started with the host alternation. Added an optional `(?:[^\/\s@]+@)?` userinfo prefix so the whole URL — credentials + host — is captured into `<internal-host>`.

  CR quick wins:
  - Hoisted the `Telemetry` import to the top of the adversarial test file (mid-file imports work in Bun, but tooling that scans headers misses them).
  - Tightened the `/models` hint test to assert the full canonical string ``Run `altimate models` to see available models.`` so docs / changelog / code can't quietly diverge again.

  Defense-in-depth from CR nits:
  - Added `0.0.0.0` (any-interface bind) to maskInternalHost — it shows up in misconfigured proxy URLs and is functionally internal.
  - Added a one-line comment to the silent `catch {}` in isOpenAiErrorRetryable so a future refactor doesn't "fix" it into something that throws on malformed JSON.
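The round-2 P1 can be demonstrated with a simplified stand-in for the real `Telemetry.maskString` pattern. The regex below is illustrative (email redaction and the IPv6 alternations are omitted for brevity); the point is the optional userinfo prefix quoted from the commit:

```typescript
// Optional "(?:[^\/\s@]+@)?" prefix captures "admin:hunter2@" into the
// same match as the host, so credentials are redacted rather than left
// dangling in front of the <internal-host> marker.
const INTERNAL_URL =
  /(?:[^\/\s@]+@)?(?:10(?:\.\d{1,3}){3}|192\.168(?:\.\d{1,3}){2}|169\.254\.169\.254|[\w-]+(?:\.[\w-]+)*\.(?:local|internal))(?::\d+)?[^\s]*/g

function maskString(s: string): string {
  return s.replace(INTERNAL_URL, "<internal-host>")
}
```

The trailing `[^\s]*` also swallows path, query string, and fragment, so a secret after `?` does not survive past the marker.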
  Also added 2 new adversarial tests pinning the userinfo+host case and the `0.0.0.0` case (240/240 across the 5 affected files, up from 238).

  Deferred to v0.7.2: extracting `makeAPICallError` to a shared fixture (refactor, scope creep), reusing `codeFromBody` in `isOpenAiErrorRetryable` (signature change), IPv4-mapped IPv6 (`::ffff:10.0.0.1`) coverage, and the byte-vs-char `RESPONSE_BODY_CAP` rename.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: pin prototype-pollution test to a real malicious payload

  CodeRabbit caught that the __proto__ regression guard was hollow: JSON.stringify({__proto__: ...}) produces `{}` — `__proto__` in object-literal syntax is the prototype setter, not an enumerable own property, so it never makes it into the JSON. The test was serializing an empty payload and asserting the prototype wasn't polluted (trivially true).

  Replaced the JSON.stringify call with a hand-written JSON literal containing an explicit "__proto__" key. parseAPICallError's JSON.parse now actually receives the malicious key, exercising the regression guard against a future refactor that switches to Object.assign / spread.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address PR #794 round-4 nits + bump release date

  CodeRabbit quick wins:
  1. Wrap the prototype-pollution tests in try/finally. If a regression ever breaks the guards, Object.prototype stays polluted for the rest of the suite — cascading noise instead of a localized failure. The finally block restores (or deletes) the original key value.
  2. Pin the 4KB truncation boundary exactly. The test was asserting `length < 5000`, which would still pass if a future refactor moved the cap from 4096 → 4500 → 4900 — none of those are the documented guarantee. Now asserts the exact 4096-char prefix and the full truncation suffix `…[truncated 95904 chars]`. The cap is now the load-bearing assertion, not an upper bound.
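A sketch of the cap with the exact boundary the round-4 test pins. The helper name comes from the commit messages; its body and the suffix format are inferred from the quoted assertion (`…[truncated 95904 chars]` for a 100 000-char body):

```typescript
const RESPONSE_BODY_CAP = 4096 // chars, not bytes (rename deferred to v0.7.2)

function capResponseBody(body: string): string {
  if (body.length <= RESPONSE_BODY_CAP) return body
  const dropped = body.length - RESPONSE_BODY_CAP
  // Exact 4096-char prefix plus an explicit truncation marker, so the
  // test can assert the boundary rather than a loose upper bound.
  return body.slice(0, RESPONSE_BODY_CAP) + `…[truncated ${dropped} chars]`
}
```

With this shape, asserting the 4096-char prefix and the full suffix makes the cap itself load-bearing: moving it to 4500 would break the test, whereas `length < 5000` would not.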
  Also bumped the CHANGELOG date to today (2026-05-06) since the tag pushes today.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: deflake tracing-display-crash flushSync test

  CI run 25448250105 hit:

      (fail) flushSync — crash recovery > flushSync preserves all accumulated data [126.00ms]

  The test used two `await new Promise((r) => setTimeout(r, 50))` waits for async snapshot operations to settle. The neighboring test in the same describe block already documents this exact failure mode and uses `await tracer.flush()` as the deterministic antidote (see the comment at line 215: "deterministic vs sleep(50) which flakes on slow CI runners").

  Replaced both setTimeout waits with `tracer.flush()`. Locally now ~700ms (was ~1.2s with the sleeps) and 5/5 passes in a row. The test exercises the same crash-recovery invariants with no timing dependency.

  This is a pre-existing flake unrelated to v0.7.1 scope, but it was blocking the release PR's CI — fixed in scope per the project's "no flaky tests" rule.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
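The deflake pattern in miniature. This `Tracer` is a hypothetical stand-in for the real one; the point is awaiting in-flight work instead of sleeping for a guessed duration:

```typescript
// Track every in-flight write so tests can await completion instead of
// guessing with setTimeout: sleep(50) passes locally, flakes on slow CI.
class Tracer {
  private pending = new Set<Promise<void>>()
  public events: string[] = []

  record(event: string): void {
    const p = Promise.resolve().then(() => {
      this.events.push(event) // stands in for an async snapshot write
    })
    this.pending.add(p)
    p.finally(() => this.pending.delete(p))
  }

  // Deterministic: resolves only once every recorded write has landed,
  // including writes enqueued while earlier ones were settling.
  async flush(): Promise<void> {
    while (this.pending.size > 0) {
      await Promise.all([...this.pending])
    }
  }
}
```

In the test, `await tracer.flush()` replaces both `await new Promise((r) => setTimeout(r, 50))` waits, removing the timing dependency entirely.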
## What does this PR do?
Fixes `provider/error.ts` so OpenAI 4xx errors with the standard `{error: {message, type, code}}` shape surface their inner `message` to the user, instead of the previous fallthrough behavior that produced `APIError: Bad Request: {?:?}` (the `{?:?}` is the App Insights pipeline scrubbing string values from the raw structured body).

The original OR-chain `body.message || body.error || body.error?.message` short-circuited on `body.error` (an object), failed the `typeof === "string"` guard, and dumped the raw `responseBody`. Replaced with an explicit-typeof ternary that handles the three known shapes cleanly and matches `parseStreamError`'s defensive pattern at line 153.

Wrapped in `// altimate_change start — upstream_fix:` markers per project policy. Upstream has the same bug; a parallel upstream PR is being filed so we can drop our marker if upstream takes the fix.

## Type of change
## Issue for this PR
Closes #788
## How did you verify your code works?

- `bun test packages/opencode/test/provider/error.test.ts` — 24/24 pass (18 existing + 6 new)
- `bun test packages/opencode/test/provider/` — 386/386 pass (full provider suite)
- `bunx tsgo --noEmit` — clean (exit 0)
- `bun run script/upstream/analyze.ts --markers --base main --strict` — passes (changes wrapped in markers)
- `/consensus:code-review` — all approved with the test coverage and ternary improvements applied before commit

## Tests added

- OpenAI nested shape extracts the inner message
- Top-level `body.message` still works (non-regression)
- Plain-string `body.error` still works (non-regression)
- `body.error` without a `message` key falls through to top-level `body.message`
- Non-string `body.error.message` (array/object) does not block a valid `body.message` (defensive guard for the same class of bug)
- All-non-string-fields body falls through to the raw `responseBody` (existing last-resort behavior preserved)

## Checklist
## Summary by cubic

Fixes error message extraction for OpenAI 4xx responses by reading nested `error.message` instead of dumping the raw body. Users now see actionable errors (e.g., model not found) instead of "Bad Request: {?:?}", meeting Linear #788.

- Extraction order is now `body.error.message` → `body.message` → string `body.error`, avoiding object short-circuits and matching stream parser behavior.
- No longer dumps the raw `responseBody` when a valid message exists; covered by new tests.

Written for commit 3a7ec72.