Streaming tool_use with adaptive thinking: terminated stop_reason on long tool input content #1317
Environment
- Model: claude-opus-4-6
- API: Messages API with streaming enabled
- Thinking: adaptive (also tested with effort: "high")
Description
When using streaming with adaptive thinking and tool_use, the API returns `stop_reason: terminated` when the tool input `content` parameter exceeds approximately 6-8KB of text. The model is generating a tool call with a large string parameter (e.g. file content), but the response stream terminates before the tool input block is complete.
Reproduction
- Register a simple tool with a `content: string` parameter
- Send a messages request with `stream: true` and `thinking: { type: "adaptive" }`
- Prompt the model to call the tool with ~180+ lines of English text in the content parameter
- The streamed response terminates mid-generation with `stop_reason: terminated`
- Retrying produces the same result consistently
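The steps above can be sketched as a request payload. This is only a hedged sketch: the tool name `save_file`, its description, and the prompt wording are illustrative assumptions not taken from the report, and the payload is constructed but not sent.

```typescript
// Sketch of the failing request shape described in the report.
// The tool name "save_file" and the prompt text are hypothetical.

// ~180 lines of English text, past the observed failure threshold (~7KB).
const longText = Array.from(
  { length: 180 },
  (_, i) => `Line ${i + 1}: the quick brown fox jumps over the lazy dog.`,
).join("\n");

const request = {
  model: "claude-opus-4-6",
  max_tokens: 42666,
  stream: true,
  thinking: { type: "adaptive" },
  tools: [
    {
      name: "save_file", // hypothetical tool name
      description: "Saves the given text content",
      input_schema: {
        type: "object",
        properties: { content: { type: "string" } },
        required: ["content"],
      },
    },
  ],
  messages: [
    {
      role: "user",
      content: `Call save_file with exactly this text:\n${longText}`,
    },
  ],
};

// The requested tool input comfortably exceeds the ~6-8KB range
// where the stream reportedly terminates.
console.log(Buffer.byteLength(longText, "utf8"));
```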
Observations
- Tool input content under ~150 lines (~5KB) consistently succeeds
- Tool input content over ~180 lines (~7KB) consistently fails
- `max_tokens` is well above the needed output size (42666)
- The tool execute function never runs: truncation happens at the API streaming response level, not client-side
- Without adaptive thinking, the same content size succeeds (not fully confirmed)
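Given the thresholds above, a caller can guard tool input size client-side before sending. This is a sketch; the 6000-byte cutoff is an assumption splitting the observed ~5KB success / ~7KB failure range, not a documented limit.

```typescript
// Client-side guard based on the observed thresholds: inputs under ~5KB
// consistently succeeded and inputs over ~7KB consistently failed, so warn
// somewhere in between. The 6000-byte cutoff is an assumption.
const RISKY_INPUT_BYTES = 6000;

function isRiskyToolInput(content: string): boolean {
  return Buffer.byteLength(content, "utf8") > RISKY_INPUT_BYTES;
}
```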
Expected behavior
The API should complete the tool_use input block regardless of content size (within max_tokens), or return a meaningful error.
Workaround
Split large tool input content across multiple smaller tool calls.
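A minimal sketch of the splitting workaround, assuming splitting on line boundaries and a chunk budget safely under the ~5KB size that consistently succeeded; the function name and 4096-byte limit are illustrative choices.

```typescript
// Split large tool input into line-aligned chunks, each staying under a
// byte budget safely below the ~5KB size that consistently succeeded.
// Each chunk can then be sent in its own smaller tool call.
const MAX_CHUNK_BYTES = 4096;

function splitContent(text: string, maxBytes: number = MAX_CHUNK_BYTES): string[] {
  const chunks: string[] = [];
  let current = "";
  for (const line of text.split("\n")) {
    const candidate = current === "" ? line : `${current}\n${line}`;
    if (Buffer.byteLength(candidate, "utf8") > maxBytes && current !== "") {
      // Adding this line would exceed the budget: flush and start a new chunk.
      chunks.push(current);
      current = line;
    } else {
      current = candidate;
    }
  }
  if (current !== "") chunks.push(current);
  return chunks;
}
```

Joining the chunks with `"\n"` reconstructs the original text, since splits only occur at line boundaries; a single line larger than the budget is emitted as its own (oversized) chunk rather than broken mid-line.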