OpenAICompatClient drops reasoning content on tool calls #114

@antoinezambelli

Problem

OpenAICompatClient never captures reasoning/thinking content. VLLMClient, OllamaClient, and LlamafileClient attach reasoning to ToolCall.reasoning, but OpenAICompatClient._parse_tool_calls (src/forge/clients/openai_compat.py:153-172) takes no reasoning parameter, and neither send() nor send_stream() ever looks at a reasoning field or <think> tags. So for any reasoning model behind an OpenAI-compatible endpoint, the entire chain-of-thought is dropped on every tool call: no REASONING message is emitted downstream, which breaks reasoning replay (full / keep-last).

This is the same class of bug as #110 (fixed for vLLM + Ollama in #113), but for the generic OpenAI-compat client. It was deliberately deferred from #113 because it needs more design surface than the other two clients.

Scope / design notes (decided during the #113 parity work)

  • No raw-content fallback. vLLM/Ollama/llamafile fall back to raw content as reasoning because they target local instruct models that narrate before a tool call. OpenAICompatClient is the deliberately provider-agnostic client (Groq/Together/OpenRouter/hosted instruct/etc.), where a content preamble alongside a tool call is routinely legitimate user-facing text, not chain-of-thought. Labeling it reasoning would mis-route it and, under reasoning_replay=none (the default), silently drop a real assistant turn. So capture reasoning only from (a) the canonical structured fields reasoning_content / reasoning / reasoning_text (see forge/core/reasoning.py:REASONING_MESSAGE_FIELDS) or (b) <think> tags via forge.prompts.think_tags.extract_think_tags, never from bare content.
  • No think constructor flag. Unlike vLLM/Ollama, this client has no think flag today. Don't add one — the downstream reasoning_replay policy (default none) already controls whether reasoning is serialized.
  • Strip <think> tags from TextResponse content (parity with the other clients).
  • Attach reasoning to the first ToolCall only; make the new _parse_tool_calls reasoning param keyword-with-default to avoid breaking positional callers.
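
A minimal sketch of the design notes above, for discussion only. It is not the forge implementation: ToolCall here is a simplified stand-in, split_think_tags is a hypothetical substitute for forge.prompts.think_tags.extract_think_tags (whose real signature is not shown in this issue), and the field tuple just copies the three names listed above. Preferring structured fields over <think> tags when both are present is an assumption, as is reading "keyword-with-default" as keyword-only.

```python
import json
import re
from dataclasses import dataclass
from typing import Any, Optional

# Field names quoted in this issue; the canonical list is
# forge/core/reasoning.py:REASONING_MESSAGE_FIELDS.
REASONING_FIELDS = ("reasoning_content", "reasoning", "reasoning_text")

# Hypothetical stand-in for forge.prompts.think_tags.extract_think_tags.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_think_tags(content: str) -> tuple[str, Optional[str]]:
    """Return (content with <think> blocks removed, joined think text or None)."""
    thoughts = [t.strip() for t in _THINK_RE.findall(content)]
    cleaned = _THINK_RE.sub("", content).strip()
    joined = "\n".join(t for t in thoughts if t)
    return cleaned, joined or None


def extract_reasoning(message: dict[str, Any]) -> tuple[str, Optional[str]]:
    """Return (visible content, reasoning) for one assistant message."""
    cleaned, tagged = split_think_tags(message.get("content") or "")
    # Assumption: a structured field wins over <think> tags if both exist.
    for field in REASONING_FIELDS:
        value = message.get(field)
        if isinstance(value, str) and value.strip():
            return cleaned, value.strip()
    # No raw-content fallback: bare content is never labeled reasoning.
    return cleaned, tagged


@dataclass
class ToolCall:
    """Simplified stand-in for the real ToolCall type."""
    name: str
    arguments: dict[str, Any]
    reasoning: Optional[str] = None


def parse_tool_calls(
    raw_calls: list[dict[str, Any]],
    *,
    reasoning: Optional[str] = None,  # keyword-only, so positional callers are unaffected
) -> list[ToolCall]:
    """Parse OpenAI-style tool calls, attaching reasoning to the first one only."""
    calls = []
    for index, raw in enumerate(raw_calls):
        fn = raw["function"]
        calls.append(
            ToolCall(
                name=fn["name"],
                arguments=json.loads(fn["arguments"] or "{}"),
                reasoning=reasoning if index == 0 else None,
            )
        )
    return calls


# Usage: a <think> block plus a user-facing preamble next to two tool calls.
message = {
    "content": "<think>Need the weather first.</think>Checking the forecast now.",
    "tool_calls": [
        {"function": {"name": "get_weather", "arguments": '{"city": "Paris"}'}},
        {"function": {"name": "get_time", "arguments": "{}"}},
    ],
}
visible, reasoning = extract_reasoning(message)
calls = parse_tool_calls(message["tool_calls"], reasoning=reasoning)
```

In this sketch the preamble "Checking the forecast now." stays visible content, the <think> text lands on the first ToolCall only, and a message with bare content and no structured field or tags yields no reasoning at all.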
