
feat(llm-sdk): add reasoning/thinking support via OpenAI Responses API + OpenRouter passthrough #92

Open
ankushchhabradelta4infotech-ai wants to merge 48 commits into YourGPT:release/alpha from ankushchhabradelta4infotech-ai:fix/llm-sdk

Conversation

@ankushchhabradelta4infotech-ai
Contributor

Description

Adds reasoning/thinking token support to the llm-sdk for OpenRouter-routed models. OpenAI's o1/o3/o4 and gpt-5 series hide chain-of-thought on the chat-completions endpoint — this PR routes those models through OpenAI's Responses API to surface reasoning summaries, and adds streaming reasoning_content passthrough for other OpenRouter models (e.g. DeepSeek R1, Claude with extended thinking).
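As a rough illustration of the routing decision described above, the sketch below picks a streaming path from the model ID. The prefixes come from the model patterns listed under Changes; the helper name `pickStreamPath`, the `StreamPath` type, and the assumption that `disableThinking` simply falls back to chat-completions are hypothetical and not taken from the PR's code.

```typescript
// Hypothetical helper: the PR's real detection lives in doStream and may differ.
const REASONING_MODEL_PREFIXES = [
  "openai/o1",
  "openai/o3",
  "openai/o4",
  "openai/gpt-5",
];

type StreamPath = "responses-api" | "chat-completions";

function pickStreamPath(modelId: string, disableThinking = false): StreamPath {
  // Assumption: opting out of thinking keeps the model on the chat-completions path.
  if (disableThinking) return "chat-completions";
  const id = modelId.toLowerCase();
  const isReasoningModel = REASONING_MODEL_PREFIXES.some((prefix) =>
    id.startsWith(prefix),
  );
  return isReasoningModel ? "responses-api" : "chat-completions";
}
```

Under these assumptions, `pickStreamPath("openai/o3-mini")` selects the Responses API path, while a model such as `deepseek/deepseek-r1` stays on chat-completions and relies on the reasoning_content passthrough instead.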

Changes

  • adapters/openai.ts: resolves the provider name dynamically from baseUrl (openai / openrouter / xai / azure / google); adds a disableThinking config flag; adds a full streamWithResponsesAPI path that maps Responses API SSE events (response.reasoning_summary_text.delta, response.output_text.delta, function-call events) back to the same StreamEvent shapes the chat-completions path emits
  • providers/openrouter/provider.ts: supportsThinking flipped to true; doStream detects OpenAI reasoning model IDs (openai/o1*, openai/o3*, openai/o4*, openai/gpt-5*) and delegates to doStreamResponsesAPI; adds reasoning_content / include_reasoning passthrough for other reasoning models (DeepSeek R1 etc.); adds a disableThinking provider option to opt out
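The event mapping in the first bullet could look roughly like the sketch below. The two Responses API event names are the ones listed above; the `SketchStreamEvent` union, its `thinking:delta` / `message:delta` type strings, and the `mapResponsesEvent` function are hypothetical stand-ins, since the SDK's actual StreamEvent shapes are not shown here.

```typescript
// Hypothetical output shapes; the SDK's real StreamEvent union may differ.
type SketchStreamEvent =
  | { type: "thinking:delta"; content: string }
  | { type: "message:delta"; content: string };

// Minimal view of a parsed Responses API SSE payload.
interface ResponsesSSEEvent {
  type: string;
  delta?: string;
}

function mapResponsesEvent(ev: ResponsesSSEEvent): SketchStreamEvent | null {
  switch (ev.type) {
    case "response.reasoning_summary_text.delta":
      // Reasoning summary text is surfaced as thinking content.
      return ev.delta ? { type: "thinking:delta", content: ev.delta } : null;
    case "response.output_text.delta":
      // Regular answer text keeps flowing as message content.
      return ev.delta ? { type: "message:delta", content: ev.delta } : null;
    default:
      // Function-call and lifecycle events are left out of this sketch.
      return null;
  }
}
```

Keeping the mapping in one pure function means both streaming paths can feed the same downstream consumer, which is the stated goal of emitting identical shapes.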

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update
  • Refactoring (no functional changes)

Testing

  • I've tested this locally
  • I've added/updated tests
  • All existing tests pass

Checklist

  • My code follows the project's style guidelines
  • I've updated the documentation (if needed)
  • I've added tests that prove my fix/feature works
  • New and existing tests pass locally

Screenshots (if applicable)

…into fix/sdk-ui

# Conflicts:
#	packages/copilot-sdk/src/chat/classes/AbstractChat.ts
…Vite 8 examples

- generative-ui-demo: port of experimental generative UI to Vite+Express, recharts, dark UI with prompt suggestions sidebar
- skills-demo: new demo for server-side skills system with loadSkills(), 3 skill files (code-review/concise-mode/customer-support), branching toggle via allowEdit prop

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…l activation

- Replace generic skills with SaaS-focused: revenue-intelligence, customer-health, incident-runbook
- Full dark dashboard UI (Bricolage Grotesque + JetBrains Mono, navy/indigo palette)
- Live metrics bar, nav with module links, AI Copilot badge
- Skill cards animate on load: scan line sweep → expand → capability reveal with staggered fade-in
- Skill state detection via load_skill toolRenderer watching for tool call completion
- Branching toggle, demo prompt injectors for demo recording
- Updated server system prompt for SaaS context

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds @yourgpt/llm-sdk/fallback subpath export:
- createFallbackChain() with priority and round-robin routing
- Per-model retries with exponential/fixed backoff before fallback
- FallbackExhaustedError with per-model failure breakdown
- MemoryRoutingStore (default) + pluggable RoutingStore interface
- onRetry / onFallback observability callbacks
- Two-tier error detection (class-based for complete(), message-regex for stream())

Includes fallback-demo example and docs page.
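To make the retry-then-fallback behaviour concrete, here is a minimal standalone sketch of priority routing with per-model retries and exponential backoff. It does not use or reproduce the createFallbackChain() API; `runChain`, `RetryOptions`, and the injected `sleep` are hypothetical names chosen for illustration.

```typescript
type Attempt<T> = () => Promise<T>;

interface RetryOptions {
  retries: number; // extra attempts per model after the first one
  baseDelayMs: number; // delay before the first retry; doubles on each retry
  sleep?: (ms: number) => Promise<void>; // injectable so tests can skip real waits
}

async function runChain<T>(
  models: { name: string; call: Attempt<T> }[],
  opts: RetryOptions,
): Promise<{ model: string; value: T }> {
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((r) => setTimeout(r, ms)));
  const failures: Record<string, string[]> = {};

  // Priority routing: try models in the order given.
  for (const m of models) {
    const errors: string[] = [];
    failures[m.name] = errors;
    for (let attempt = 0; attempt <= opts.retries; attempt++) {
      try {
        return { model: m.name, value: await m.call() };
      } catch (err) {
        errors.push(err instanceof Error ? err.message : String(err));
        // Back off before retrying the same model; no wait before falling back.
        if (attempt < opts.retries) {
          await sleep(opts.baseDelayMs * 2 ** attempt);
        }
      }
    }
  }
  // Every model exhausted: report a per-model failure breakdown.
  throw new Error("all models failed: " + JSON.stringify(failures));
}
```

The per-model error list mirrors the idea behind a failure breakdown: when the chain is exhausted, the caller can see which model failed and why, rather than only the last error.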

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
# Conflicts:
#	pnpm-lock.yaml
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…o and generative-ui-demo

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ensions with composite

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ion and UI improvements

- Added support for dynamic skills registration via API, allowing skills to be registered at runtime.
- Updated server to handle dynamic skills, including new endpoints for skill management.
- Introduced a new skill for frontend design with detailed guidelines.
- Improved UI with new font styles (DM Sans and DM Mono) and animations for skill activation.
- Enhanced the skills panel to display both static and dynamic skills, improving user experience.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Added 'vaul' version 1.1.2 and 'tw-animate-css' version 1.4.0 to dependencies.
- Updated 'yourgpt-server-demo' with new dependencies including 'cors', 'dotenv', 'express', and 'ws'.
- Updated devDependencies for 'yourgpt-server-demo' with '@types/cors', '@types/express', '@types/ws', 'tsx', and 'typescript'.
- Updated integrity checks for '@types/ws' and 'vaul' in the lockfile.
…vior

- 'multi-step' (default) — one bubble per server agent iteration (OpenAI/LiteLLM style), default unchanged
- 'single-turn' — all iterations collapsed into one bubble per user turn (Vercel AI SDK / Claude.ai style)

Usage: <CopilotProvider streamMode="single-turn" />

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Deleted the comprehensive documentation for the tool management branch, which included details on features, configurations, and known issues related to the tool management stack across copilot-sdk and llm-sdk.
- Replaced the existing system prompt to reflect the new focus on HR operations, including employee onboarding and performance reviews.
- Removed outdated skills related to revenue intelligence, customer health, frontend design, and incident management.
- Introduced new skills for employee onboarding and performance reviews with updated icons and configurations.
- Adjusted loading behavior to prevent duplicate skill calls in the conversation history.
- Updated initial welcome message to align with the new HR Copilot theme.
… from package files

- Updated the version in package.json to 2.1.4-alpha.3.
- Excluded source map files from the package distribution by adding "!dist/**/*.map" to the files array.
…ect event ordering

Previously, server-side tools were executed post-loop after the adapter's for-await
finished, causing message:end to arrive before action:end(result). This broke client
message splitting and rendered skill cards below the assistant response instead of above.

Changes:
- llm-sdk/runtime: execute server tools inline in the case "action:end" branch, before
  message:end arrives naturally from the adapter, removing all event-ordering hacks
- copilot-sdk/AbstractChat: split message turn on toolResults.size > 0 (not just text)
  so tool-only turns are correctly finalized at message:end
- copilot-sdk/AbstractChat: skip server-side assistantWithToolCalls in done handler
  to prevent duplicate tool card renders in the UI
- examples/playground: use workspace:* deps and add transpilePackages for Turbopack

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Introduced a new tool for retrieving the current date, including its day of the week.
- Updated tool rendering logic to handle the new date tool and improve UI feedback during execution.
- Adjusted the handling of assistant messages to prevent duplicate rendering of tool results in the chat interface.
…handling

- Introduced a new provider configuration for "yourgpt-server" in constants and types.
- Added a proxy route for yourgpt-server to handle streaming and non-streaming requests.
- Updated PlaygroundPage to conditionally check for API keys based on the selected provider.
- Bumped versions for copilot-sdk and llm-sdk to reflect recent changes.
…cleanup

- Replace all sidebar icons with Hugeicons duotone-rounded (AiMagic, AiBook,
  MagicWand, Puzzle, BubbleChat, SlidersHorizontal, FileCode, ServerStack, AiChip1)
- Move Generative UI to root sidebar; rewrite with two-approach structure
  (toolRenderers vs AI-generated iframe via useGenerativeUI/HtmlRenderer)
- Move branching from chat/ to advanced/; merge message-actions into chat/ui
- Collapse skills/ folder to flat skills.mdx (fixes fumadocs dropdown bug)
- Move context/ pages into advanced/; move headless into customizations/
- Add 18 permanent redirects in next.config.mjs for all moved/deleted routes
- Add AiMagic, AiBook, BubbleChat, FileCode, MagicWand, Puzzle,
  ServerStack, SlidersHorizontal icon components

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add @yourgpt/llm-sdk/fireworks provider (OpenAI-compatible, lazy OpenAI client)
- Fix tool call streaming: track by index instead of tc.id (Fireworks repeats
  the same tc.id across chunks, breaking multi-chunk argument accumulation)
- Add examples/fireworks-demo — Vite+Express app with 26-model selector,
  browse link to fireworks.ai/models, tested with kimi-k2p5 + deepseek-v3p2
- Add docs: providers/fireworks.mdx and providers/openrouter.mdx (closes YourGPT#79)
- Update providers/index.mdx and meta.json to include both new providers
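The index-based tracking fix can be sketched as follows. The delta shape follows the OpenAI-compatible streaming format (`index`, optional `id`, partial `function.arguments`); the `accumulateToolCalls` function and `PendingToolCall` type are hypothetical and only illustrate the idea, not the adapter's actual code.

```typescript
// One streamed tool-call fragment in the OpenAI-compatible chunk format.
interface ToolCallDelta {
  index: number;
  id?: string;
  function?: { name?: string; arguments?: string };
}

interface PendingToolCall {
  id: string;
  name: string;
  args: string; // raw JSON text, concatenated across chunks
}

function accumulateToolCalls(deltas: ToolCallDelta[]): PendingToolCall[] {
  // Key by index: it stays stable per call even when a provider
  // resends the same id on every chunk.
  const byIndex = new Map<number, PendingToolCall>();
  for (const d of deltas) {
    let call = byIndex.get(d.index);
    if (!call) {
      call = { id: d.id ?? "", name: "", args: "" };
      byIndex.set(d.index, call);
    }
    if (d.id && !call.id) call.id = d.id;
    if (d.function?.name) call.name = d.function.name;
    if (d.function?.arguments) call.args += d.function.arguments;
  }
  return Array.from(byIndex.entries())
    .sort((a, b) => a[0] - b[0])
    .map(([, call]) => call);
}
```

With this keying, two chunks that both carry the same id for index 0 are merged into a single call whose argument fragments concatenate into valid JSON.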

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sahil5963 and others added 18 commits April 8, 2026 16:26
- Introduced a new `storage.mdx` file detailing session persistence and chat history management.
- Removed outdated `session.mdx` and `meta.json` files to streamline documentation.
- Included examples for basic setup, browser storage, server persistence, and the `useThreadManager` hook.
- Added information on data persistence, auto-eviction, and API contract for server storage.
- Removed outdated headless routes from `next.config.mjs` to streamline navigation.
- Added new documentation for headless features, including `useCopilotEvent` and `useMessageMeta`, to enhance user understanding.
- Introduced client-side and server-side skills documentation, detailing registration and usage.
- Created new `headless` and `skills` sections in the documentation, including examples and best practices for implementation.
- Deleted the `skills.mdx` file as part of the restructuring to improve clarity and organization.
- Modified the PATH variable in the pre-commit script to include the user's pnpm directory for better compatibility in non-interactive shells.
All four legacy adapters (OpenAI, Azure, Google, Ollama) were yielding
action:call events but never emitting action:end, so the runtime could not
tell when a tool call had finished and never triggered execution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add native Together AI provider using OpenAI-compatible API with
streaming, tool calling, vision, and JSON mode support. Includes
full docs page, Express+Vite demo app, and comprehensive test suite.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… beta release notes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Vercel uses npm which doesn't support pnpm's workspace: protocol,
causing EUNSUPPORTEDPROTOCOL errors during install.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tProvider

- Add createTogetherAI() legacy factory (returns AIProvider for createRuntime)
- Rebuild demo from Express+Vite to Next.js app using the proper
  copilot-sdk pattern: createRuntime + runtime.handleRequest on backend,
  CopilotProvider + CopilotChat on frontend
- Sidebar with model selector, setup guide, and links
- 11 verified models across DeepSeek, Llama, Qwen, and more

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…Runtime pattern

- Add fallback chain with priority strategy + retry logic to demo API route
- Add fallback toggle UI in sidebar showing chain order
- Update docs: createTogetherAI + createRuntime as primary pattern,
  fallback chain section, cleaned up model list
- Add REST test script for all models
- Bump llm-sdk to 2.1.9

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the tool-call-based generative UI with a text-streaming approach
(similar to Claude Artifacts pattern). The AI writes HTML wrapped in
<GENUI> tags as part of its text response, which streams naturally and
renders progressively in a sandboxed iframe.
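A minimal sketch of how accumulated response text might be split while it streams is shown below. It assumes a literal `<GENUI>` opening tag without attributes and a matching `</GENUI>` closing tag; the `splitGenUI` function and `GenUISplit` type are hypothetical and are not the SDK's parser.

```typescript
interface GenUISplit {
  before: string; // plain text preceding the tag
  html: string | null; // HTML seen so far, or null if no tag has started
  complete: boolean; // true once the closing tag has arrived
  after: string; // plain text following the closing tag
}

function splitGenUI(text: string): GenUISplit {
  const open = "<GENUI>";
  const close = "</GENUI>"; // assumption: the block ends with this closing tag
  const start = text.indexOf(open);
  if (start === -1) {
    return { before: text, html: null, complete: false, after: "" };
  }
  const bodyStart = start + open.length;
  const end = text.indexOf(close, bodyStart);
  if (end === -1) {
    // Still streaming: expose the partial HTML so it can render progressively.
    return {
      before: text.slice(0, start),
      html: text.slice(bodyStart),
      complete: false,
      after: "",
    };
  }
  return {
    before: text.slice(0, start),
    html: text.slice(bodyStart, end),
    complete: true,
    after: text.slice(end + close.length),
  };
}
```

Calling this on every text update returns the partial HTML as soon as the opening tag appears, which is what lets a frame re-render as more markup arrives.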

Key changes:

SDK (copilot-sdk/experimental):
- New useGenerativeUI() hook returns wrapMessage for CopilotChat
- New generativeUISystemPrompt() helper for backend system prompts
- New GenUIFrame component: iframe with postMessage, auto-height via
  ResizeObserver, copilot bridge API for interactivity
- Simplified types: removed chart/table/stat/card types, HTML-only
- Deleted CardRenderer, TableRenderer, StatRenderer

SDK (core streaming improvements):
- Progressive action:args emission in Anthropic/OpenAI adapters
- tool-call-start/delta events in providers and stream-text
- Runtime: proper action:args accumulation (don't null currentToolCall)
- AbstractChat: handle action events when streamState is null
- processChunk: parsePartialJson for streaming arg extraction
- connected-chat: attach unmatched streaming executions to messages
- ChatWithTools: allow action:start/args for client tools
- AbstractAgentLoop: reuse streaming executions

Example (generative-ui-demo):
- Rewritten with sidebar, dicebear avatars, copilot logo
- Uses generativeUISystemPrompt() + loadSkills() on backend
- Skills system with frontend-design skill (eager)
- SaaS dashboard copilot context (Acme Inc.)

Docs:
- Rewritten generative-ui.mdx with two approaches documented
- Updated BETA-FEATURES.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- AbstractChat: check toolResults.size in message:end so tool-only turns
  reset streamState and don't attach next turn's tools to previous message
- AbstractChat: in single-turn mode, take insertChainParentId from the message
  before the streaming message, not from streamState.messageId
- AbstractChat: skip role:tool messages from done.messages in single-turn
  (already in streamState.toolResults, re-inserting creates duplicates)
- connected-chat: restrict liveExecutions to last assistant message only;
  prefer metadata.toolExecutions for historical messages
- connected-chat: filter unmatched executions to pending/executing only
  so stale completed cards don't bleed into new streaming message
- connected-chat: null-safe message filter to prevent render crashes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previously request.systemPrompt || this.config.systemPrompt meant a
client-sent prompt could silently override the server-configured one,
breaking generativeUISystemPrompt and any server-side prompt setup.
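One way to express the corrected precedence is sketched below. The commit text only states the old expression; the reversed order and the `resolveSystemPrompt` helper are assumptions for illustration, not the actual change.

```typescript
// Hypothetical helper: the server-configured prompt wins whenever it is set,
// and the client-sent prompt is used only as a fallback.
function resolveSystemPrompt(
  configPrompt: string | undefined,
  requestPrompt: string | undefined,
): string | undefined {
  return configPrompt || requestPrompt;
}
```

Under this assumption a client can still supply a prompt when the server configures none, but it can no longer replace one the server has set.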

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@vercel

vercel Bot commented May 1, 2026

@ankushchhabradelta4infotech-ai is attempting to deploy a commit to the Delta4 Infotech Team on Vercel.

A member of the Team first needs to authorize it.



2 participants