[Feature] Add response_format support, fix max_completion_tokens for gpt-5.x/o-series, and add Anthropic structured output in fallback chain #93

@ankushchhabradelta4infotech-ai

Description

Summary

Several provider compatibility issues in @yourgpt/llm-sdk prevent the fallback chain from working correctly with newer OpenAI models and structured output requests.

Problem / Use Case

We use createFallbackChain + createRuntime to fail over between providers automatically. The three issues below stop this from working correctly in production.

Issues

1. Newer OpenAI models (gpt-5.x, o1, o3, o4-*) fail with max_tokens

These models require max_completion_tokens instead of max_tokens, and do not accept a custom temperature. Sending max_tokens causes a 400 error:

400 Unsupported parameter: 'max_tokens' is not supported with this model.
Use 'max_completion_tokens' instead.

Fix — detect by model name pattern (no hardcoding needed):

// \b keeps the pattern from matching in the middle of an unrelated model name
const isNewGen = /\b(o1|o3|o4|gpt-5)/.test(modelId);
// use max_completion_tokens if isNewGen, otherwise max_tokens
// skip temperature if isNewGen

Needs to be applied in:

  • src/adapters/openai.ts: complete() and stream()
  • src/providers/openai/provider.ts: doGenerate() and doStream()
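
A minimal sketch of how the parameter selection could look in those four call sites. The helper names (isNewGenModel, buildTokenParams) and the TokenConfig type are hypothetical and not part of the SDK; the regex is a word-boundary variant of the pattern above.

```typescript
// Hypothetical helper, for illustration only; not existing SDK code.
interface TokenConfig {
  temperature?: number;
  maxTokens?: number;
}

// \b lets the pattern match "o3-mini" or "openai/o3-mini" but not a
// name that merely contains "o3" in the middle of a longer token.
const NEW_GEN_PATTERN = /\b(o1|o3|o4|gpt-5)/;

function isNewGenModel(modelId: string): boolean {
  return NEW_GEN_PATTERN.test(modelId);
}

// Builds only the token/temperature part of the request body.
function buildTokenParams(
  modelId: string,
  config: TokenConfig = {}
): Record<string, number> {
  const params: Record<string, number> = {};
  if (isNewGenModel(modelId)) {
    // Newer models: rename the token limit and drop temperature.
    if (config.maxTokens !== undefined) {
      params.max_completion_tokens = config.maxTokens;
    }
  } else {
    if (config.maxTokens !== undefined) {
      params.max_tokens = config.maxTokens;
    }
    if (config.temperature !== undefined) {
      params.temperature = config.temperature;
    }
  }
  return params;
}
```

The result can be spread into the request body, e.g. { model, messages, ...buildTokenParams(model, request.config) }, so complete(), stream(), doGenerate() and doStream() share one code path.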

2. response_format / json_schema is never forwarded to the provider

ChatRequest.config only has temperature and maxTokens. There is no responseFormat field, so structured output requests silently return plain text instead of JSON.

This breaks all frontend calls that use response_format: { type: "json_schema", json_schema: ... } with OpenAI and Gemini models.

Fix — add responseFormat to ChatRequest.config:

config?: {
  temperature?: number;
  maxTokens?: number;
  responseFormat?: {
    type: 'text' | 'json_object' | 'json_schema';
    json_schema?: { name: string; strict?: boolean; schema: Record<string, unknown> };
  };
}

And forward it in:

  • src/adapters/openai.ts: complete() and stream() → pass as response_format
  • src/providers/openai/provider.ts: doGenerate() and doStream() → pass as response_format
  • src/providers/google/provider.ts: doGenerate() and doStream() → pass as response_format
  • src/core/generate-text.ts → forward responseFormat to doGenerate/doStream
  • src/types/language-model.ts → add responseFormat to DoGenerateParams
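
A rough sketch of the forwarding step for the OpenAI-style request body. It assumes the proposed responseFormat shape can be passed through unchanged as response_format; withResponseFormat is a hypothetical helper name, not existing SDK code.

```typescript
// Mirrors the responseFormat type proposed above.
type ResponseFormat = {
  type: "text" | "json_object" | "json_schema";
  json_schema?: {
    name: string;
    strict?: boolean;
    schema: Record<string, unknown>;
  };
};

// Hypothetical helper: returns a copy of the request body with
// response_format attached when the caller asked for one.
function withResponseFormat(
  body: Record<string, unknown>,
  responseFormat?: ResponseFormat
): Record<string, unknown> {
  if (!responseFormat) {
    return body; // nothing requested, leave the body untouched
  }
  if (responseFormat.type === "json_schema" && !responseFormat.json_schema) {
    // Fail early instead of sending a request the provider will reject.
    throw new Error("responseFormat.type 'json_schema' requires json_schema");
  }
  return { ...body, response_format: responseFormat };
}
```

Throwing on a json_schema request without a schema is a design choice in this sketch; it surfaces the problem at the call site rather than as a provider-side 400.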

3. Anthropic adapter does not support structured output (output_config.format)

Anthropic's API has its own structured output format — output_config.format with { type: "json_schema", schema: {...} } — which is completely different from OpenAI's response_format. The SDK's Anthropic adapter does not support this at all.

When the fallback chain falls back to Anthropic on a json_schema call, the response comes back as plain text instead of structured JSON.

Fix — add responseFormat support to AnthropicAdapter:

  • Map config.responseFormat.json_schema.schema → output_config: { format: { type: "json_schema", schema: ... } } in buildRequestOptions()
  • Only send output_config when responseFormat.type === "json_schema"

Anthropic SDK reference: messages.create({ ..., output_config: { format: { type: "json_schema", schema: {...} } } })
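
The mapping could be sketched roughly as follows. toAnthropicOutputConfig is a hypothetical helper, and the output_config shape is copied from the reference line above rather than checked against a specific Anthropic SDK version.

```typescript
// Mirrors the responseFormat type proposed in issue 2.
type ResponseFormat = {
  type: "text" | "json_object" | "json_schema";
  json_schema?: {
    name: string;
    strict?: boolean;
    schema: Record<string, unknown>;
  };
};

// Assumed Anthropic request fragment, taken from the issue text.
type AnthropicOutputConfig = {
  output_config?: {
    format: { type: "json_schema"; schema: Record<string, unknown> };
  };
};

// Hypothetical helper: returns a fragment to spread into the options
// built by buildRequestOptions(). Empty unless a json_schema is requested.
function toAnthropicOutputConfig(
  responseFormat?: ResponseFormat
): AnthropicOutputConfig {
  if (
    !responseFormat ||
    responseFormat.type !== "json_schema" ||
    !responseFormat.json_schema
  ) {
    return {};
  }
  // Only the schema is carried over; name and strict have no slot
  // in the output_config shape shown above.
  return {
    output_config: {
      format: { type: "json_schema", schema: responseFormat.json_schema.schema },
    },
  };
}
```

Usage would look like { ...options, ...toAnthropicOutputConfig(config?.responseFormat) }, which leaves text and json_object requests unchanged.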


Before submitting:

  • I've searched existing issues to make sure this isn't a duplicate
  • I've read the documentation

Labels: enhancement
