fix(api): normalize chat-completion tool calls into Responses events (#608) #735

Open
chronoai-shining wants to merge 2 commits into develop from fix/608-chat-completion-tool-calls

@chronoai-shining
Collaborator

Summary

  • parseChatCompletionStream now accumulates delta.tool_calls chunks per index and emits synthesized response.output_item.done / function_call events on flush. The playground tool-use loop already matches on that shape, so zero changes were needed in chatService.ts.
  • Flushes on any of: a non-null finish_reason (tool_calls, stop, or any other value), upstream [DONE], or stream EOF. A flushed guard makes the flush idempotent, so multiple end signals never produce duplicate events.
  • Parallel tool calls supported (one done event per index, emitted in index order). Missing index falls back to 0.
  • 6 new tests appended to llm.test.ts; 17/17 in the file are green.
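
The per-index accumulation described above can be sketched roughly as follows. This is a minimal illustration, not the code in llm.ts: the `ToolCallDelta` type and the `mergeToolCallDelta` helper are hypothetical names, and the chunk shape assumes the usual chat-completion streaming format (`id` and `function.name` on the first chunk, `function.arguments` arriving in fragments).

```typescript
// Hypothetical types for illustration; the real ToolCallAccumulator in llm.ts may differ.
interface ToolCallDelta {
  index?: number;
  id?: string;
  function?: { name?: string; arguments?: string };
}

interface ToolCallAccumulator {
  id: string;
  name: string;
  arguments: string;
}

// Merge one delta.tool_calls[] entry into the buffer for its index.
function mergeToolCallDelta(
  buffers: Map<number, ToolCallAccumulator>,
  delta: ToolCallDelta,
): void {
  const index = delta.index ?? 0; // missing index falls back to 0
  let acc = buffers.get(index);
  if (!acc) {
    acc = { id: "", name: "", arguments: "" };
    buffers.set(index, acc);
  }
  if (delta.id) acc.id = delta.id;
  if (delta.function?.name) acc.name = delta.function.name;
  // The arguments JSON string arrives in pieces and is concatenated as-is.
  if (delta.function?.arguments) acc.arguments += delta.function.arguments;
}

const buffers = new Map<number, ToolCallAccumulator>();
mergeToolCallDelta(buffers, {
  index: 0,
  id: "call_1",
  function: { name: "execute_in_sandbox", arguments: '{"cmd":' },
});
// No index on this chunk, so it is merged into the buffer for index 0.
mergeToolCallDelta(buffers, { function: { arguments: '"ls"}' } });
console.log(buffers.get(0)); // arguments is now '{"cmd":"ls"}'
```

The arguments string is only parsed after the flush, since any individual fragment is usually not valid JSON on its own.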

Test plan

  • bun test src/clients/nyxid/llm.test.ts — 17/17 green
  • bun run typecheck:api — no new errors in llm.ts
  • Full ornn-api suite — no new regressions
  • Local manual check: configure a DeepSeek (chat-completion) provider for playground and run a runtime-based skill. execute_in_sandbox should be called by the loop, not rendered as text. Combined with the session reuse from #531 ([Bug] [Playground Sandbox] Installed CLI state is not preserved across tool calls), multi-round runtime skills should now work end-to-end on chat-completion providers.

Fixes #608

…608)

After #574, chat-completion providers were routed to /chat/completions
with text deltas translated into Responses-API events — but the stream
parser dropped delta.tool_calls. The playground tool-use loop only
matches on Responses-API response.output_item.done with
item.type=function_call, so a tool call from a DeepSeek-style
provider never reached the loop. Models would emit
execute_in_sandbox(...) and the JSON arrived as plain assistant text
instead of triggering the sandbox — runtime-based and mixed skills
appeared to respond but never actually ran.
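
To make the failure mode concrete, here is a rough sketch of the kind of match the tool-use loop performs. The event shapes are simplified assumptions, and `extractToolCall` is a hypothetical helper, not the actual code in chatService.ts or the real outputItemDoneEventSchema.

```typescript
// Simplified, assumed event shapes; the real schemas carry more fields.
type StreamEvent =
  | { type: "response.output_text.delta"; delta: string }
  | {
      type: "response.output_item.done";
      item: { type: string; call_id?: string; name?: string; arguments?: string };
    };

// Hypothetical helper: returns the tool call only for a function_call item.
function extractToolCall(
  event: StreamEvent,
): { name: string; arguments: string } | null {
  if (event.type !== "response.output_item.done") return null;
  if (event.item.type !== "function_call") return null;
  return { name: event.item.name ?? "", arguments: event.item.arguments ?? "" };
}

// Before the fix: the call surfaces as plain text, so the loop sees nothing to run.
const asText: StreamEvent = {
  type: "response.output_text.delta",
  delta: '{"name":"execute_in_sandbox"}',
};

// After the fix: a synthesized done event carries the same call in the matched shape.
const asItem: StreamEvent = {
  type: "response.output_item.done",
  item: {
    type: "function_call",
    call_id: "call_1",
    name: "execute_in_sandbox",
    arguments: "{}",
  },
};

console.log(extractToolCall(asText)); // null
console.log(extractToolCall(asItem)?.name); // "execute_in_sandbox"
```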

Fix in parseChatCompletionStream:

- Per-index Map<number, ToolCallAccumulator> carries {id, name,
  arguments}. Each delta.tool_calls[] chunk merges into its buffer
  (id+name on the first chunk, arguments JSON string accumulates).
- A turn flushes when ANY of: explicit finish_reason (tool_calls /
  stop / anything non-null), upstream [DONE], or stream EOF. A
  `flushed` guard makes flush idempotent so multiple end signals
  never produce duplicate events.
- The synthesized event matches the shape chatService already parses
  via outputItemDoneEventSchema, so zero changes are needed in the
  playground loop — its existing pendingToolCall capture works for
  both upstream formats.
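
A minimal sketch of the idempotent flush, assuming a closure-based guard. `createFlusher` and `FunctionCallDoneEvent` are hypothetical names, and the `call_id` field is an assumption about the synthesized item; the real event shape is whatever outputItemDoneEventSchema accepts.

```typescript
interface ToolCallAccumulator {
  id: string;
  name: string;
  arguments: string;
}

// Assumed shape of the synthesized event; field names are illustrative.
interface FunctionCallDoneEvent {
  type: "response.output_item.done";
  item: { type: "function_call"; call_id: string; name: string; arguments: string };
}

// Returns a flush function that emits buffered calls at most once.
function createFlusher(
  buffers: Map<number, ToolCallAccumulator>,
): () => FunctionCallDoneEvent[] {
  let flushed = false;
  return () => {
    if (flushed) return []; // later end signals are no-ops
    flushed = true;
    return [...buffers.entries()]
      .sort(([a], [b]) => a - b) // emit in index order
      .map(([, acc]): FunctionCallDoneEvent => ({
        type: "response.output_item.done",
        item: {
          type: "function_call",
          call_id: acc.id,
          name: acc.name,
          arguments: acc.arguments,
        },
      }));
  };
}

// Two parallel calls, deliberately inserted out of order.
const pending = new Map<number, ToolCallAccumulator>();
pending.set(1, { id: "call_b", name: "execute_in_sandbox", arguments: "{}" });
pending.set(0, { id: "call_a", name: "execute_in_sandbox", arguments: "{}" });

const flush = createFlusher(pending);
const onFinishReason = flush(); // two events: call_a, then call_b
const onDone = flush(); // empty: already flushed
console.log(onFinishReason.length, onDone.length); // 2 0
```

Calling the same flush function from the finish_reason, [DONE], and EOF paths keeps the guard in one place instead of tracking which signal arrived first.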

Parallel tool calls within one turn are supported (one done event
per index, emitted in index order). Missing `index` falls back to 0.
Streams that close without [DONE] or finish_reason still flush at
EOF so buffered calls are never lost.

Coverage: 6 new tests in llm.test.ts cover chunked accumulation +
finish_reason flush, EOF flush without [DONE], parallel tool calls,
idempotent flush across finish_reason+[DONE], intermixed text+tool
deltas (event order), and missing index fallback. 17/17 green in
the file; the full ornn-api suite shows no new regressions (the 18
remaining failures are unrelated and pre-existing, in
validateSkillFrontmatter, #649).

Fixes #608

Development

Successfully merging this pull request may close these issues.

[Bug] [Playground Runtime] runtime / mixed skills do not execute sandbox tool when the LLM provider uses chat-completion format
