Fix Anthropic thinking with tool calls #1160
base: main
Conversation
**Root Cause:** The auto-disable logic was too aggressive: it disabled thinking whenever it found a tool call without a thinking block in the same AI message. During streaming, thinking blocks and tool calls are often stored as separate IContent items, causing the auto-disable to trigger incorrectly and permanently disable thinking.

**Changes:**

1. **Improved auto-disable detection (AnthropicProvider.ts lines 936-1008)**
   - Enhanced logic to look back up to 3 previous messages for orphaned thinking blocks
   - Only disables thinking if no thinking is found anywhere in recent history
   - Added detailed debug logging to track when and why thinking gets disabled
2. **Extended orphaned-thinking merging (AnthropicProvider.ts lines 1170-1201)**
   - Increased the look-back window from 1 to 3 previous messages
   - Can now find and merge thinking blocks that are several messages away from tool calls
   - Handles non-consecutive AI messages (e.g., separated by tool results)
3. **Comprehensive test coverage (AnthropicProvider.thinking.test.ts)**
   - Added 4 new tests for multi-turn thinking persistence
   - Tests verify thinking persists across tool calls in various scenarios
   - Tests confirm auto-disable only triggers when genuinely appropriate

**Result:** Thinking blocks now persist throughout entire conversations, even with tool calls. The auto-disable safety mechanism remains in place but only triggers in rare edge cases where thinking blocks are genuinely missing.

Fixes #1150
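The lookback described in change 1 can be sketched as follows. The types and the helper name `hasRecentThinking` are illustrative stand-ins for this description, not the actual `AnthropicProvider` code:

```typescript
// Simplified sketch of the 3-message lookback: before auto-disabling thinking,
// scan the current AI message and up to LOOKBACK earlier AI messages for a
// thinking block. Tool results and user turns are skipped, mirroring the
// "non-consecutive AI messages" case in the PR description.
type Block =
  | { type: 'thinking'; thinking: string; signature?: string }
  | { type: 'tool_call'; id: string }
  | { type: 'text'; text: string };

interface Msg {
  speaker: 'ai' | 'human' | 'tool';
  blocks: Block[];
}

const LOOKBACK = 3;

// Returns true if a thinking block exists in this AI message or in up to
// LOOKBACK earlier AI messages; only when this is false should the
// auto-disable safety mechanism fire.
function hasRecentThinking(history: Msg[], index: number): boolean {
  let aiSeen = 0;
  for (let i = index; i >= 0 && aiSeen <= LOOKBACK; i--) {
    const msg = history[i];
    if (msg.speaker !== 'ai') continue; // tool results etc. don't count toward the window
    if (msg.blocks.some((b) => b.type === 'thinking')) return true;
    aiSeen++;
  }
  return false;
}
```

Under this sketch, a tool-call-only AI message no longer triggers auto-disable as long as a thinking block appeared in a nearby earlier AI message.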
Summary by CodeRabbit
Walkthrough: Implements cross-message detection and merging of orphaned thinking blocks.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant Client
    participant Provider as AnthropicProvider
    participant History as MessageHistory
    participant AnthropicAPI as Anthropic API
    Note over Provider,History: look back up to 3 prior AI messages for orphaned thinking
    User->>Client: user input (may trigger tool call)
    Client->>Provider: build ProviderCallOptions (contents, settings)
    Provider->>History: inspect recent messages (lookback up to 3 AI messages) for thinking/tool_use/tool_call
    alt orphaned thinking found
        History-->>Provider: return thinking block(s) with signature/sourceField
        Provider->>Provider: merge thinking into assistant/tool_use payload (preserve signature)
    else no signed thinking & reasoning enabled
        Provider->>Provider: insert redacted_thinking placeholder
    end
    Provider->>AnthropicAPI: send assembled request (includes thinking/tool_use/redacted_thinking)
    AnthropicAPI-->>Provider: response (streaming or final)
    Provider->>Client: return response / stream
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 3 passed | ❌ 1 inconclusive
📜 Recent review details
Configuration used: Organization UI | Review profile: CHILL | Plan: Pro
📒 Files selected for processing (12)
🚧 Files skipped from review as they are similar to previous changes (1)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
📚 Learning: 2026-01-13T19:28:00.789Z (applied to files)
📚 Learning: 2025-11-16T22:51:26.374Z (applied to files)
🧬 Code graph analysis (4)
- packages/core/src/services/history/ContentConverters.ts (1)
- packages/core/src/core/geminiChat.ts (2)
- packages/core/src/providers/anthropic/AnthropicProvider.ts (1)
- packages/core/src/providers/anthropic/AnthropicProvider.issue1150.test.ts (2)
🔇 Additional comments (14)
WARNING: LLxprt PR Review infrastructure failure
The automated reviewer failed with exit code 1. Please inspect the workflow logs (LLxprt section) and re-run once resolved.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@packages/core/src/providers/anthropic/AnthropicProvider.thinking.test.ts`:
- Around line 1083-1085: The test is checking for a 'redacted_thinking' type but
AnthropicContentBlock doesn't include it, causing a TS2367 error; update the
local type used when reading assistantMsg!.content (used in the hasThinking
computation) to include 'redacted_thinking' so the union is widened. For
example, introduce a local alias (e.g., TestBlock = AnthropicContentBlock | {
type: 'redacted_thinking' }) and cast assistantMsg!.content to TestBlock[] when
computing hasThinking, or cast the checked value to unknown before the .some
check to satisfy the type system.
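A minimal sketch of the suggested widening; `AnthropicContentBlock` is reproduced here in simplified form, and `hasThinking` is an illustrative extraction of the test's check:

```typescript
// Local test-side union, simplified from the test file's definition.
type AnthropicContentBlock =
  | { type: 'text'; text: string }
  | { type: 'thinking'; thinking: string; signature: string }
  | { type: 'tool_use'; id: string; name: string; input: unknown };

// The suggested alias widens the union so comparing against
// 'redacted_thinking' no longer trips TS2367 (comparison between
// non-overlapping types) under strict settings.
type TestBlock = AnthropicContentBlock | { type: 'redacted_thinking'; data: string };

function hasThinking(content: TestBlock[]): boolean {
  return content.some(
    (b) => b.type === 'thinking' || b.type === 'redacted_thinking',
  );
}
```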
🧹 Nitpick comments (1)
packages/core/src/providers/anthropic/AnthropicProvider.ts (1)
1226-1257: Consider preventing orphaned thinking reuse/duplication.

When you pull a thinking-only block from earlier history, it can be reattached to multiple tool_call messages within the 3-message window, and the original thinking-only message still gets emitted, potentially duplicating context and inflating tokens. A small pre-merge/consume strategy (e.g., tracking "consumed" orphan indices and skipping reuse) would keep the request tighter.
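The consume-once strategy suggested here could look roughly like this; `mergeOrphans` and `Item` are hypothetical simplifications, not the provider's actual types:

```typescript
// Sketch: merge each orphaned thinking-only message into at most one later
// tool-call message, then drop the consumed original so its content is
// neither reused by another tool call nor emitted twice.
interface Item {
  kind: 'thinking' | 'tool_call' | 'text';
  text: string;
}

function mergeOrphans(messages: Item[][]): Item[][] {
  const consumed = new Set<number>();
  const merged = messages.map((m) => [...m]);
  for (let i = 0; i < merged.length; i++) {
    const msg = merged[i];
    const needsThinking =
      msg.some((b) => b.kind === 'tool_call') &&
      !msg.some((b) => b.kind === 'thinking');
    if (!needsThinking) continue;
    // look back up to 3 messages for an unconsumed thinking-only message
    for (let j = i - 1; j >= Math.max(0, i - 3); j--) {
      if (consumed.has(j)) continue; // each orphan is merged at most once
      const prev = messages[j];
      if (prev.length > 0 && prev.every((b) => b.kind === 'thinking')) {
        merged[i] = [...prev, ...msg]; // thinking blocks must come first
        consumed.add(j);
        break;
      }
    }
  }
  // drop consumed thinking-only messages so their content isn't duplicated
  return merged.filter((_, idx) => !consumed.has(idx));
}
```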
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- packages/core/src/providers/anthropic/AnthropicProvider.thinking.test.ts
- packages/core/src/providers/anthropic/AnthropicProvider.ts
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: acoliver
Repo: vybestack/llxprt-code PR: 0
File: :0-0
Timestamp: 2026-01-13T19:28:00.789Z
Learning: In the llxprt-code codebase (`packages/core/src/providers/anthropic/AnthropicProvider.ts`), Anthropic's API returns `contentBlock.input` as an already-parsed JavaScript object, not a JSON string. The code was incorrectly calling `JSON.stringify(contentBlock.input)` before passing it to `processToolParameters()`, which was designed for OpenAI-style string parameters. This causes arrays and other complex types to be corrupted into strings (e.g., `paths` array becomes a string `"[\"**/*.toml\"]"` instead of actual array). The fix is to use `contentBlock.input` directly without stringifying for Anthropic provider.
Learnt from: e2720pjk
Repo: vybestack/llxprt-code PR: 583
File: packages/core/src/providers/openai/OpenAIProvider.ts:935-959
Timestamp: 2025-11-16T22:51:26.374Z
Learning: In the llxprt-code codebase (packages/core/src/providers/openai/OpenAIProvider.ts), tools like `run_shell_command` use internal streaming only for real-time UI updates during execution, but each tool execution produces exactly ONE final `ToolResponseBlock` containing the full result. The streaming chunks are never sent to the LLM and are not converted into multiple tool messages. The OpenAI Chat Completions API requires that each tool call (tool_call_id) corresponds to exactly one message with role 'tool', so duplicate tool response detection that removes subsequent tool messages with the same tool_call_id is correct and necessary for API compliance.
📚 Learning: 2026-01-13T19:28:00.789Z
Learnt from: acoliver
Repo: vybestack/llxprt-code PR: 0
File: :0-0
Timestamp: 2026-01-13T19:28:00.789Z
Learning: In the llxprt-code codebase (`packages/core/src/providers/anthropic/AnthropicProvider.ts`), Anthropic's API returns `contentBlock.input` as an already-parsed JavaScript object, not a JSON string. The code was incorrectly calling `JSON.stringify(contentBlock.input)` before passing it to `processToolParameters()`, which was designed for OpenAI-style string parameters. This causes arrays and other complex types to be corrupted into strings (e.g., `paths` array becomes a string `"[\"**/*.toml\"]"` instead of actual array). The fix is to use `contentBlock.input` directly without stringifying for Anthropic provider.
Applied to files:
packages/core/src/providers/anthropic/AnthropicProvider.ts
📚 Learning: 2025-12-18T14:06:22.557Z
Learnt from: acoliver
Repo: vybestack/llxprt-code PR: 0
File: :0-0
Timestamp: 2025-12-18T14:06:22.557Z
Learning: OpenAIResponsesProvider (packages/core/src/providers/openai-responses/OpenAIResponsesProvider.ts) currently bypasses the ephemeral truncation system by using direct `JSON.stringify(toolResponseBlock.result)` and needs to be updated to support ephemeral settings like the other providers.
Applied to files:
- packages/core/src/providers/anthropic/AnthropicProvider.ts
- packages/core/src/providers/anthropic/AnthropicProvider.thinking.test.ts
📚 Learning: 2025-11-16T22:51:26.374Z
Learnt from: e2720pjk
Repo: vybestack/llxprt-code PR: 583
File: packages/core/src/providers/openai/OpenAIProvider.ts:935-959
Timestamp: 2025-11-16T22:51:26.374Z
Learning: In the llxprt-code codebase (packages/core/src/providers/openai/OpenAIProvider.ts), tools like `run_shell_command` use internal streaming only for real-time UI updates during execution, but each tool execution produces exactly ONE final `ToolResponseBlock` containing the full result. The streaming chunks are never sent to the LLM and are not converted into multiple tool messages. The OpenAI Chat Completions API requires that each tool call (tool_call_id) corresponds to exactly one message with role 'tool', so duplicate tool response detection that removes subsequent tool messages with the same tool_call_id is correct and necessary for API compliance.
Applied to files:
- packages/core/src/providers/anthropic/AnthropicProvider.ts
- packages/core/src/providers/anthropic/AnthropicProvider.thinking.test.ts
📚 Learning: 2025-12-14T16:16:39.939Z
Learnt from: acoliver
Repo: vybestack/llxprt-code PR: 0
File: :0-0
Timestamp: 2025-12-14T16:16:39.939Z
Learning: MiniMax M2 model (hf:MiniMax/MiniMax-M2) requires reasoning.stripFromContext set to "none" to preserve interleaved thinking tokens in conversation history, as dropping thinking content harms agent reliability.
Applied to files:
packages/core/src/providers/anthropic/AnthropicProvider.ts
🧬 Code graph analysis (1)
packages/core/src/providers/anthropic/AnthropicProvider.ts (1)
packages/core/src/services/history/IContent.ts (1)
ThinkingBlock(177-191)
⏰ Context from checks skipped due to timeout of 270000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Test (macos-latest, 24.x)
- GitHub Check: Test (ubuntu-latest, 24.x)
- GitHub Check: Test (windows-latest, 24.x)
- GitHub Check: E2E Test (Linux) - sandbox:none
- GitHub Check: E2E Test (Linux) - sandbox:docker
- GitHub Check: E2E Test (macOS)
- GitHub Check: Slow E2E - Win
🔇 Additional comments (2)
packages/core/src/providers/anthropic/AnthropicProvider.thinking.test.ts (1)
1005-1160: Great multi-turn coverage for the issue regression.

These scenarios hit the tool/thinking split cases and verify the persistence logic clearly.
packages/core/src/providers/anthropic/AnthropicProvider.ts (1)
936-1023: Cross-message thinking validation looks solid.

The lookback-based check reads clearly and should prevent the false disables seen in the issue.
```ts
const hasThinking = (
  assistantMsg!.content as AnthropicContentBlock[]
).some((b) => b.type === 'thinking' || b.type === 'redacted_thinking');
```
Type union doesn’t include redacted_thinking.
Line 1083 compares against redacted_thinking, but AnthropicContentBlock currently excludes it, which can trigger a TS2367 error in stricter configs. Consider widening the local union.
Suggested update (local test type)
```diff
 type AnthropicContentBlock =
   | { type: 'text'; text: string }
   | {
       type: 'thinking';
       thinking: string;
       signature: string;
     }
+  | {
+      type: 'redacted_thinking';
+      data: string;
+    }
   | {
       type: 'tool_use';
       id: string;
       name: string;
       input: unknown;
     };
```
Code Coverage Summary
CLI Package - Full Text Report
Core Package - Full Text Report
For detailed HTML reports, please see the 'coverage-reports-24.x-ubuntu-latest' artifact from the main CI run.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/core/src/providers/anthropic/AnthropicProvider.ts (1)
1205-1287: Redaction policy can be bypassed when orphaned thinking is merged.
When `stripFromContext = 'all'` (or `includeInContext = false`), a tool_call message that borrows orphaned thinking will include the full thought because `shouldRedactThinking` only checks the current index. That leaks reasoning back into the request and violates the redaction policy.

🐛 Suggested fix (propagate redaction to borrowed thinking)
```diff
-      // Check if this message's thinking should be redacted (stripped but preserved as placeholder)
-      const shouldRedactThinking =
-        redactedThinkingIndices.has(contentIndex);
+      // Track if we attach orphaned thinking from a previous message
+      let orphanedThinkingIndex: number | null = null;
@@
-          if (
-            prevThinkingBlocks.length > 0 &&
-            prevNonThinkingBlocks.length === 0
-          ) {
+          if (
+            prevThinkingBlocks.length > 0 &&
+            prevNonThinkingBlocks.length === 0
+          ) {
+            orphanedThinkingIndex = contentIndex - i;
             this.getLogger().debug(
               () =>
                 `[AnthropicProvider] Found orphaned thinking block ${i} messages back, merging with tool_use message at index ${contentIndex}`,
             );
             anthropicThinkingBlocks = prevThinkingBlocks;
             break;
           }
         }
       }
     }
+
+      const shouldRedactThinking =
+        shouldStripAll ||
+        redactedThinkingIndices.has(contentIndex) ||
+        (orphanedThinkingIndex !== null &&
+          redactedThinkingIndices.has(orphanedThinkingIndex));

       if (anthropicThinkingBlocks.length > 0) {
```
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- packages/core/src/providers/anthropic/AnthropicProvider.issue1150-repro.test.ts
- packages/core/src/providers/anthropic/AnthropicProvider.ts
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: acoliver
Repo: vybestack/llxprt-code PR: 0
File: :0-0
Timestamp: 2026-01-13T19:28:00.789Z
Learning: In the llxprt-code codebase (`packages/core/src/providers/anthropic/AnthropicProvider.ts`), Anthropic's API returns `contentBlock.input` as an already-parsed JavaScript object, not a JSON string. The code was incorrectly calling `JSON.stringify(contentBlock.input)` before passing it to `processToolParameters()`, which was designed for OpenAI-style string parameters. This causes arrays and other complex types to be corrupted into strings (e.g., `paths` array becomes a string `"[\"**/*.toml\"]"` instead of actual array). The fix is to use `contentBlock.input` directly without stringifying for Anthropic provider.
Learnt from: e2720pjk
Repo: vybestack/llxprt-code PR: 583
File: packages/core/src/providers/openai/OpenAIProvider.ts:935-959
Timestamp: 2025-11-16T22:51:26.374Z
Learning: In the llxprt-code codebase (packages/core/src/providers/openai/OpenAIProvider.ts), tools like `run_shell_command` use internal streaming only for real-time UI updates during execution, but each tool execution produces exactly ONE final `ToolResponseBlock` containing the full result. The streaming chunks are never sent to the LLM and are not converted into multiple tool messages. The OpenAI Chat Completions API requires that each tool call (tool_call_id) corresponds to exactly one message with role 'tool', so duplicate tool response detection that removes subsequent tool messages with the same tool_call_id is correct and necessary for API compliance.
Learnt from: acoliver
Repo: vybestack/llxprt-code PR: 0
File: :0-0
Timestamp: 2025-12-14T16:16:39.939Z
Learning: MiniMax M2 model (hf:MiniMax/MiniMax-M2) requires reasoning.stripFromContext set to "none" to preserve interleaved thinking tokens in conversation history, as dropping thinking content harms agent reliability.
📚 Learning: 2026-01-13T19:28:00.789Z
Learnt from: acoliver
Repo: vybestack/llxprt-code PR: 0
File: :0-0
Timestamp: 2026-01-13T19:28:00.789Z
Learning: In the llxprt-code codebase (`packages/core/src/providers/anthropic/AnthropicProvider.ts`), Anthropic's API returns `contentBlock.input` as an already-parsed JavaScript object, not a JSON string. The code was incorrectly calling `JSON.stringify(contentBlock.input)` before passing it to `processToolParameters()`, which was designed for OpenAI-style string parameters. This causes arrays and other complex types to be corrupted into strings (e.g., `paths` array becomes a string `"[\"**/*.toml\"]"` instead of actual array). The fix is to use `contentBlock.input` directly without stringifying for Anthropic provider.
Applied to files:
- packages/core/src/providers/anthropic/AnthropicProvider.issue1150-repro.test.ts
- packages/core/src/providers/anthropic/AnthropicProvider.ts
📚 Learning: 2025-12-18T14:06:22.557Z
Learnt from: acoliver
Repo: vybestack/llxprt-code PR: 0
File: :0-0
Timestamp: 2025-12-18T14:06:22.557Z
Learning: OpenAIResponsesProvider (packages/core/src/providers/openai-responses/OpenAIResponsesProvider.ts) currently bypasses the ephemeral truncation system by using direct `JSON.stringify(toolResponseBlock.result)` and needs to be updated to support ephemeral settings like the other providers.
Applied to files:
- packages/core/src/providers/anthropic/AnthropicProvider.issue1150-repro.test.ts
- packages/core/src/providers/anthropic/AnthropicProvider.ts
📚 Learning: 2025-11-16T22:51:26.374Z
Learnt from: e2720pjk
Repo: vybestack/llxprt-code PR: 583
File: packages/core/src/providers/openai/OpenAIProvider.ts:935-959
Timestamp: 2025-11-16T22:51:26.374Z
Learning: In the llxprt-code codebase (packages/core/src/providers/openai/OpenAIProvider.ts), tools like `run_shell_command` use internal streaming only for real-time UI updates during execution, but each tool execution produces exactly ONE final `ToolResponseBlock` containing the full result. The streaming chunks are never sent to the LLM and are not converted into multiple tool messages. The OpenAI Chat Completions API requires that each tool call (tool_call_id) corresponds to exactly one message with role 'tool', so duplicate tool response detection that removes subsequent tool messages with the same tool_call_id is correct and necessary for API compliance.
Applied to files:
- packages/core/src/providers/anthropic/AnthropicProvider.issue1150-repro.test.ts
- packages/core/src/providers/anthropic/AnthropicProvider.ts
📚 Learning: 2025-12-14T16:16:39.939Z
Learnt from: acoliver
Repo: vybestack/llxprt-code PR: 0
File: :0-0
Timestamp: 2025-12-14T16:16:39.939Z
Learning: MiniMax M2 model (hf:MiniMax/MiniMax-M2) requires reasoning.stripFromContext set to "none" to preserve interleaved thinking tokens in conversation history, as dropping thinking content harms agent reliability.
Applied to files:
packages/core/src/providers/anthropic/AnthropicProvider.ts
🧬 Code graph analysis (2)
packages/core/src/providers/anthropic/AnthropicProvider.issue1150-repro.test.ts (3)
- packages/core/src/test-utils/runtime.ts (1): createProviderWithRuntime (82-108)
- packages/core/src/providers/test-utils/providerTestConfig.ts (1): TEST_PROVIDER_CONFIG (23-25)
- packages/core/src/services/history/IContent.ts (2): IContent (21-40), ThinkingBlock (177-191)
packages/core/src/providers/anthropic/AnthropicProvider.ts (1)
- packages/core/src/services/history/IContent.ts (1): ThinkingBlock (177-191)
🔇 Additional comments (8)
packages/core/src/providers/anthropic/AnthropicProvider.issue1150-repro.test.ts (7)
1-78: Solid SDK/prompt mocking setup.
Mocks keep the suite deterministic and isolate provider behavior.
79-146: Runtime + settings wiring looks correct.
The setup cleanly mirrors provider runtime usage and reasoning ephemerals.
147-158: Helper for provider call options is clean and reusable.
Nice way to keep tests concise and consistent.
160-230: Good coverage for long-lookback thinking presence.
This captures the regression scenario cleanly.
232-284: Covers the non-thinking `sourceField` flow well.
Ensures reasoning remains enabled in mixed-source histories.
286-359: Redacted placeholder assertion is on point.
Validates the Anthropic-required `redacted_thinking` insertion.
361-497: Complex multi-tool-call scenario is well exercised.
This test should prevent regressions in orphaned-thinking handling.packages/core/src/providers/anthropic/AnthropicProvider.ts (1)
936-1020: Cross-message tool-call scan is a good guardrail.
The lookback + logging improves observability and preserves thinking continuity.
Anthropic extended thinking requires every assistant message to start with a thinking/redacted_thinking block when thinking is enabled. Previously we only merged thinking-only messages into subsequent tool-call messages, causing 400 errors when the next message was text-only.

Changes:
- AnthropicProvider: merge thinking-only AI messages into the next AI message regardless of whether it contains tool calls
- useGeminiStream: filter out functionCall parts when submitting tool responses to avoid creating duplicate tool_use blocks
- geminiChat: improved history normalization for tool interactions
- ContentConverters: preserve thinking block signatures during conversion

Fixes #1150
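The constraint motivating this change can be expressed as a small check; the shapes below are hypothetical simplifications of the Anthropic message format, not provider code:

```typescript
// Sketch of the invariant: with extended thinking enabled, a non-empty
// assistant message whose first block is neither thinking nor
// redacted_thinking will be rejected by the API with a 400.
interface ApiBlock {
  type: 'thinking' | 'redacted_thinking' | 'text' | 'tool_use';
}

interface ApiMessage {
  role: 'user' | 'assistant';
  content: ApiBlock[];
}

function violatesThinkingContract(messages: ApiMessage[]): boolean {
  return messages.some(
    (m) =>
      m.role === 'assistant' &&
      m.content.length > 0 &&
      m.content[0].type !== 'thinking' &&
      m.content[0].type !== 'redacted_thinking',
  );
}
```

Merging thinking-only messages into the next AI message regardless of tool calls keeps this predicate false for every assembled request, which is why the fix covers text-only follow-ups too.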
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/core/src/providers/anthropic/AnthropicProvider.ts (1)
1120-1186: Fix missing thinking block fallback for tool_use messages when thinking is enabled.

When `shouldIncludeThinking` is true and tool calls exist, the code only logs a warning if no signed thinking blocks are found (even after lookback). It then continues to emit `tool_use` without a preceding `thinking` or `redacted_thinking` block. This violates Anthropic's API contract: the assistant's message must start with a thinking block when extended thinking is enabled, or the API returns an `invalid_request` error.

Inject a `redacted_thinking` block with a signature hash as a fallback in this scenario:

🛠️ Suggested fix
```diff
       if (
         anthropicThinkingBlocks.length === 0 &&
         shouldIncludeThinking &&
         toolCallBlocks.length > 0
       ) {
         this.getLogger().warn(
           () =>
             `[AnthropicProvider] Missing thinking blocks with signatures for tool_use message at index ${contentIndex}. Injecting redacted placeholder.`,
         );
+        const fallbackSignature = createHash('sha256')
+          .update(toolCallBlocks.map((tc) => tc.id).join(':'))
+          .digest('hex');
+        contentArray.push({
+          type: 'redacted_thinking',
+          data: fallbackSignature,
+        });
       }
```
♻️ Duplicate comments (1)
packages/core/src/providers/anthropic/AnthropicProvider.thinking.test.ts (1)
30-43: Test union should include `redacted_thinking` to match usage.

The local `AnthropicContentBlock` union omits `redacted_thinking`, but later checks use it, which can trigger TS2367 in stricter configs.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
- packages/cli/src/ui/hooks/useGeminiStream.test.tsx
- packages/cli/src/ui/hooks/useGeminiStream.ts
- packages/core/src/core/geminiChat.runtime.test.ts
- packages/core/src/core/geminiChat.ts
- packages/core/src/providers/anthropic/AnthropicProvider.issue1150-repro.test.ts
- packages/core/src/providers/anthropic/AnthropicProvider.issue1150.test.ts
- packages/core/src/providers/anthropic/AnthropicProvider.thinking.test.ts
- packages/core/src/providers/anthropic/AnthropicProvider.ts
- packages/core/src/services/history/ContentConverters.test.ts
- packages/core/src/services/history/ContentConverters.ts
- packages/core/src/services/history/HistoryService.test.ts
- packages/core/src/services/history/IContent.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/core/src/providers/anthropic/AnthropicProvider.issue1150-repro.test.ts
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: acoliver
Repo: vybestack/llxprt-code PR: 0
File: :0-0
Timestamp: 2026-01-13T19:28:00.789Z
Learning: In the llxprt-code codebase (`packages/core/src/providers/anthropic/AnthropicProvider.ts`), Anthropic's API returns `contentBlock.input` as an already-parsed JavaScript object, not a JSON string. The code was incorrectly calling `JSON.stringify(contentBlock.input)` before passing it to `processToolParameters()`, which was designed for OpenAI-style string parameters. This causes arrays and other complex types to be corrupted into strings (e.g., `paths` array becomes a string `"[\"**/*.toml\"]"` instead of actual array). The fix is to use `contentBlock.input` directly without stringifying for Anthropic provider.
📚 Learning: 2026-01-13T19:28:00.789Z
Learnt from: acoliver
Repo: vybestack/llxprt-code PR: 0
File: :0-0
Timestamp: 2026-01-13T19:28:00.789Z
Learning: In the llxprt-code codebase (`packages/core/src/providers/anthropic/AnthropicProvider.ts`), Anthropic's API returns `contentBlock.input` as an already-parsed JavaScript object, not a JSON string. The code was incorrectly calling `JSON.stringify(contentBlock.input)` before passing it to `processToolParameters()`, which was designed for OpenAI-style string parameters. This causes arrays and other complex types to be corrupted into strings (e.g., `paths` array becomes a string `"[\"**/*.toml\"]"` instead of actual array). The fix is to use `contentBlock.input` directly without stringifying for Anthropic provider.
Applied to files:
- packages/core/src/services/history/HistoryService.test.ts
- packages/core/src/services/history/ContentConverters.ts
- packages/core/src/providers/anthropic/AnthropicProvider.thinking.test.ts
- packages/core/src/core/geminiChat.ts
- packages/core/src/services/history/ContentConverters.test.ts
- packages/core/src/providers/anthropic/AnthropicProvider.ts
- packages/core/src/providers/anthropic/AnthropicProvider.issue1150.test.ts
- packages/cli/src/ui/hooks/useGeminiStream.test.tsx
📚 Learning: 2025-11-16T22:51:26.374Z
Learnt from: e2720pjk
Repo: vybestack/llxprt-code PR: 583
File: packages/core/src/providers/openai/OpenAIProvider.ts:935-959
Timestamp: 2025-11-16T22:51:26.374Z
Learning: In the llxprt-code codebase (packages/core/src/providers/openai/OpenAIProvider.ts), tools like `run_shell_command` use internal streaming only for real-time UI updates during execution, but each tool execution produces exactly ONE final `ToolResponseBlock` containing the full result. The streaming chunks are never sent to the LLM and are not converted into multiple tool messages. The OpenAI Chat Completions API requires that each tool call (tool_call_id) corresponds to exactly one message with role 'tool', so duplicate tool response detection that removes subsequent tool messages with the same tool_call_id is correct and necessary for API compliance.
Applied to files:
- packages/cli/src/ui/hooks/useGeminiStream.ts
- packages/core/src/core/geminiChat.runtime.test.ts
- packages/core/src/providers/anthropic/AnthropicProvider.thinking.test.ts
- packages/core/src/core/geminiChat.ts
- packages/core/src/providers/anthropic/AnthropicProvider.ts
- packages/core/src/providers/anthropic/AnthropicProvider.issue1150.test.ts
- packages/cli/src/ui/hooks/useGeminiStream.test.tsx
🧬 Code graph analysis (4)
packages/core/src/services/history/ContentConverters.ts (1)
packages/core/src/services/history/IContent.ts (1)
ThinkingBlock(177-191)
packages/core/src/core/geminiChat.ts (2)
packages/core/src/services/history/IContent.ts (2)
ThinkingBlock (177-191), IContent (21-40)
packages/core/src/services/history/ContentConverters.ts (1)
ContentConverters(36-455)
packages/core/src/providers/anthropic/AnthropicProvider.ts (1)
packages/core/src/services/history/IContent.ts (1)
ThinkingBlock(177-191)
packages/core/src/providers/anthropic/AnthropicProvider.issue1150.test.ts (2)
packages/core/src/providers/anthropic/AnthropicProvider.ts (1)
AnthropicProvider (76-2624)
packages/core/src/services/history/IContent.ts (2)
IContent (21-40), ThinkingBlock (177-191)
🔇 Additional comments (14)
packages/core/src/services/history/IContent.ts (1)
232-240: LGTM! Thinking block validation correctly enforces signature requirements.

The validation logic properly distinguishes between Anthropic thinking blocks (which require signatures per the API contract) and other thinking sources. The early return for empty thoughts is a good defensive check.
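A hedged sketch of the validation rule described above, assuming the simplified shape below (the real `ThinkingBlock` type lives in `IContent.ts` and has more fields):

```typescript
// Sketch: Anthropic thinking blocks (sourceField 'thinking') are valid only
// when signed; empty thoughts are dropped regardless of source; other
// sources don't require signatures.
interface ThinkingBlockLike {
  thought: string;
  signature?: string;
  sourceField?: 'thinking' | 'thought';
}

function isValidThinkingBlock(b: ThinkingBlockLike): boolean {
  if (b.thought.trim() === '') return false; // defensive: drop empty thoughts
  if (b.sourceField === 'thinking') return Boolean(b.signature); // Anthropic contract
  return true; // non-Anthropic sources need no signature
}
```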
packages/core/src/services/history/HistoryService.test.ts (1)
746-759: LGTM! Good coverage for the new unsigned thinking block validation.

This test correctly verifies that Anthropic thinking blocks (identified by `sourceField: 'thinking'`) without a signature are treated as invalid content and filtered from curated history. This aligns with the validation changes in `IContent.ts`.

packages/cli/src/ui/hooks/useGeminiStream.ts (1)
1423-1428: LGTM! Correct filtering of functionCall parts from tool response submission.

This change properly prevents `functionCall` parts from being resent when submitting tool responses as a continuation. Tool calls originate from the model and should not be included in user/tool role messages sent back to the API. The filtering ensures only `functionResponse` and other relevant parts are submitted.

packages/core/src/core/geminiChat.runtime.test.ts (2)
278-369: LGTM! Comprehensive test for thinking block retention with tool calls.

This test correctly verifies that when `reasoning.includeInContext` is enabled, thinking blocks are preserved alongside tool calls in the curated history. The test:
- Mocks a response containing both thinking and tool_call blocks
- Verifies the thinking block appears in the same history entry as the tool call
- Uses appropriate assertions to confirm the expected behavior
This provides good coverage for the PR's objective of preserving thinking visibility across tool-call turns.
137-138: LGTM! Setting `reasoning.includeInContext` to `true` enables thinking block preservation for this test context.

packages/cli/src/ui/hooks/useGeminiStream.test.tsx (2)
687-784: LGTM! Excellent test coverage for the functionCall filtering behavior.

This test properly validates that when tool responses are submitted:

- `functionCall` parts are filtered out from the payload
- `functionResponse` and `text` parts are retained and sent to the stream

The test setup mirrors realistic scenarios where response parts may contain both the original function call and its response, ensuring the implementation correctly handles this case.
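The filtering these tests cover can be sketched as a one-line predicate; `Part` here is a simplified stand-in for the Gemini part type, and `stripFunctionCalls` is a hypothetical name:

```typescript
// Sketch: drop model-originated functionCall parts from a continuation
// payload, keeping functionResponse and text parts intact.
interface Part {
  text?: string;
  functionCall?: { name: string };
  functionResponse?: { name: string; response: unknown };
}

function stripFunctionCalls(parts: Part[]): Part[] {
  return parts.filter((p) => p.functionCall === undefined);
}
```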
543-543: LGTM! Adding an explicit `errorType: undefined` satisfies the interface requirements and makes the test fixtures type-complete.

packages/core/src/services/history/ContentConverters.test.ts (4)
2-7: LGTM! The import addition of `ThinkingBlock` enables proper type assertions in the new thinking-related tests.
187-214: LGTM! Good test coverage for thinking signature preservation from Gemini content.

This test verifies that when converting Gemini content with thinking parts to IContent:

- The `thoughtSignature` maps to `signature`
- The `thought: true` indicator results in `sourceField: 'thought'`
- The thought text is preserved correctly
216-243: LGTM! Proper test for Anthropic thinking metadata preservation.

This test correctly validates that explicit `llxprtSourceField: 'thinking'` metadata (used for Anthropic thinking blocks) is preserved during Gemini→IContent conversion, ensuring cross-provider thinking block handling works correctly.
369-392: LGTM! Comprehensive test for IContent→Gemini thinking conversion.

This test verifies the reverse direction: when converting IContent with a ThinkingBlock back to Gemini format:

- `signature` maps to `thoughtSignature`
- `sourceField: 'thinking'` maps to `llxprtSourceField: 'thinking'`
- `thought: true` is set on the part

This completes the round-trip test coverage for thinking block metadata.
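The mappings listed above round-trip as sketched below; the shapes are simplified stand-ins, not the real `ContentConverters` API:

```typescript
// Sketch of the bidirectional thinking-metadata mapping:
// signature <-> thoughtSignature, sourceField <-> llxprtSourceField.
interface IThinking {
  type: 'thinking';
  thought: string;
  signature?: string;
  sourceField?: string;
}

interface GeminiThoughtPart {
  thought: true;
  text: string;
  thoughtSignature?: string;
  llxprtSourceField?: string;
}

function toGemini(b: IThinking): GeminiThoughtPart {
  return {
    thought: true, // marks the part as a thinking part
    text: b.thought,
    thoughtSignature: b.signature,
    llxprtSourceField: b.sourceField,
  };
}

function toIContent(p: GeminiThoughtPart): IThinking {
  return {
    type: 'thinking',
    thought: p.text,
    signature: p.thoughtSignature,
    sourceField: p.llxprtSourceField,
  };
}
```

A lossless round trip here is what keeps Anthropic signatures intact when history passes through the Gemini representation.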
packages/core/src/services/history/ContentConverters.ts (1)
109-125: Thinking metadata propagation looks solid.

The updated mapping preserves `signature` and `sourceField` cleanly in both directions.

Also applies to: 228-241
packages/core/src/providers/anthropic/AnthropicProvider.issue1150.test.ts (1)
160-291: Good regression coverage for issue #1150 scenarios.

The multi-turn/tool-call test setup looks thorough and aligns well with the bug's reproduction cases.
packages/core/src/core/geminiChat.ts (1)
165-179: Thought-part handling and history attachment look good.

The ThoughtPart guard and thought-block extraction/attachment align with `includeInContext` behavior and keep history consistent.
Also applies to: 2509-2615
Fixes #1150