
Streaming tool_use with adaptive thinking: terminated stop_reason on long tool input content #1317

@OneSpiral


Environment

  • Model: claude-opus-4-6
  • API: Messages API with streaming enabled
  • Thinking: adaptive (also tested with effort: "high")

Description

When streaming with adaptive thinking and tool_use, the API returns stop_reason: terminated once the tool input content parameter exceeds roughly 6-8 KB of text. The model generates a tool call with a large string parameter (e.g., file content), but the response stream terminates before the tool input block is complete.

Reproduction

  1. Register a simple tool with a content: string parameter
  2. Send a messages request with stream: true and thinking: { type: "adaptive" }
  3. Prompt the model to call the tool with roughly 180 or more lines of English text in the content parameter
  4. The streamed response terminates mid-generation with stop_reason: terminated
  5. Retrying produces the same result consistently
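The steps above can be sketched as a request builder. This is a minimal sketch assuming the Python SDK's request shape; the tool name `save_file`, the filler prompt, and the exact thinking parameter spelling are illustrative and mirror the report's configuration, not a confirmed API schema.

```python
def build_request(content_lines: int = 180) -> dict:
    """Build Messages API request params matching the failing configuration.

    Tool name and prompt text are hypothetical; model, max_tokens, stream,
    and thinking settings are taken from the report above.
    """
    filler = "\n".join(
        f"line {i}: some representative English prose text"
        for i in range(content_lines)
    )
    return {
        "model": "claude-opus-4-6",
        "max_tokens": 42666,
        "stream": True,
        "thinking": {"type": "adaptive"},
        "tools": [{
            "name": "save_file",
            "description": "Save text content to a file",
            "input_schema": {
                "type": "object",
                "properties": {"content": {"type": "string"}},
                "required": ["content"],
            },
        }],
        "messages": [{
            "role": "user",
            "content": "Call save_file with the following text as content:\n"
                       + filler,
        }],
    }
```

With 180 filler lines the user message is around 8 KB, which is inside the failing range observed below; dropping `content_lines` to 150 or fewer should land in the succeeding range.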

Observations

  • Tool input content under ~150 lines (~5KB) consistently succeeds
  • Tool input content over ~180 lines (~7KB) consistently fails
  • max_tokens is set well above the required output size (42666)
  • The tool execute function never runs — truncation happens at the API streaming response level, not client-side
  • Without adaptive thinking, the same content size succeeds (not fully confirmed)

Expected behavior

The API should complete the tool_use input block regardless of content size (within max_tokens), or return a meaningful error.

Workaround

Split large tool input content across multiple smaller tool calls.
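A line-aligned chunker is one way to implement this workaround; each chunk can then be sent as a separate, smaller tool call. This is a sketch only: the 4 KB default is an assumption chosen to stay under the ~5 KB size that consistently succeeds above.

```python
def chunk_text(text: str, max_bytes: int = 4096) -> list[str]:
    """Split text into line-aligned chunks of at most max_bytes (UTF-8).

    A single line longer than max_bytes becomes its own oversized chunk;
    this sketch does not split within a line.
    """
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for line in text.splitlines(keepends=True):
        line_size = len(line.encode("utf-8"))
        if current and size + line_size > max_bytes:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += line_size
    if current:
        chunks.append("".join(current))
    return chunks
```

Concatenating the chunks reproduces the original text exactly, so the receiving tool can reassemble the content in order.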
