
Add ModelProfile to let model-specific behaviors be configured independent of the model class #1782


Closed · DouweM opened this issue May 20, 2025 · 2 comments · Fixed by #1835
Labels: Feature request

DouweM (Contributor) commented May 20, 2025

Description

Model classes were built for a specific API and model family, encoding not just the hard API spec but also various behaviors related to those specific models' capabilities and limitations, e.g. what JSON schemas or multi-modal input types are supported.

This breaks down when a model class is used with a different API or a different family of models that don't match all of those hard and soft assumptions:

  • OpenAIModel is used with various ostensibly-OpenAI-compatible APIs and a wide variety of models.
  • BedrockModel and GroqModel are used with a wide variety of models.

So far, the biggest class of issues (at least one filed a day) has been JSON schema handling. OpenAIModel and GeminiModel each implement their own transformer (OpenAI, Gemini), but people using OpenAIModel and BedrockModel with other models have been running into API errors with particular models, even when others on the same provider work fine (suggesting this really is model-specific, not something an OpenAI-compatible API "should" handle consistently across all models). Resolving this currently requires manually defining a model subclass and applying one of the existing JSON schema transformers, sometimes with tweaks.
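
To make that concrete, here's a minimal self-contained sketch of such a tweak. The WalkJsonSchema base below is a stand-in for the class of the same name in pydantic-ai (its real signature may differ), and the format-stripping rule is just an example of the kind of model-specific quirk people hit:

from typing import Any

# Stand-in for pydantic-ai's WalkJsonSchema base class (real signature may
# differ): recursively visits a JSON schema, applying `transform` to each node.
class WalkJsonSchema:
    def walk(self, schema: dict[str, Any]) -> dict[str, Any]:
        schema = self.transform(dict(schema))
        if isinstance(schema.get('properties'), dict):
            schema['properties'] = {
                name: self.walk(sub) if isinstance(sub, dict) else sub
                for name, sub in schema['properties'].items()
            }
        if isinstance(schema.get('items'), dict):
            schema['items'] = self.walk(schema['items'])
        return schema

    def transform(self, schema: dict[str, Any]) -> dict[str, Any]:
        return schema  # no-op by default

# Example tweak: some OpenAI-compatible backends reject `format` on string
# fields, so strip it everywhere.
class StripFormat(WalkJsonSchema):
    def transform(self, schema: dict[str, Any]) -> dict[str, Any]:
        schema.pop('format', None)
        return schema

StripFormat().walk({'type': 'object', 'properties': {'when': {'type': 'string', 'format': 'date-time'}}})
# -> {'type': 'object', 'properties': {'when': {'type': 'string'}}}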

We're going to run into something similar with Structured Output Modes (see the sketch after this list):

  • OpenAIModel + pre-4o doesn't support json_schema, only json_object (and tool calls and manual JSON)
  • Some models don't support json_schema or json_object, only tool calls or manual JSON
  • Some models don't support tool calls at all, only manual JSON
  • GeminiModel (and presumably BedrockModel + Gemini) doesn't support json_schema alongside tools
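
As promised above, a sketch of how a profile's supported_output_modes could drive mode selection with graceful fallback. The names match the ModelProfile proposal below; the preference order is purely illustrative:

from typing import Literal

OutputMode = Literal['tool', 'json_schema', 'json_object', 'manual_json']

# Illustrative preference order: richest structured-output mode first.
PREFERENCE: list[OutputMode] = ['json_schema', 'tool', 'json_object', 'manual_json']

def select_output_mode(
    supported: set[OutputMode], requested: OutputMode | None = None
) -> OutputMode:
    if requested is not None:
        if requested not in supported:
            raise ValueError(f'{requested!r} is not supported by this model')
        return requested
    for mode in PREFERENCE:
        if mode in supported:
            return mode
    raise ValueError('model supports no structured output mode')

# e.g. OpenAIModel + pre-4o: no json_schema, so we fall back to tool calls
assert select_output_mode({'tool', 'json_object', 'manual_json'}) == 'tool'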

Built-in tools may also differ between providers used with the same model class (e.g. OpenAIModel + OpenRouter), or between models on the same provider (as some models may not support tool calls at all).

That's not the end of it, unfortunately: we've already seen other axes where different models need different handling to get the most out of them.

I think it's time to pull some model and model family-specific details out of the model classes, generalize them, and allow them to be tweaked on a model-by-model basis.
This'll be somewhat similar to ModelSettings, but instead of properties to be passed directly to the model API, these new properties will determine how PydanticAI builds its request payload to get the most out of each specific model and work around limitations.

There'd be global defaults, model class/family defaults layered on top of that, model-specific overrides provided by the model class file, and the ability for users to tweak the settings further, or even use the settings defined by one model class (e.g. GeminiModel's specification for 2.5 Pro) with another model class (like OpenAIModel + OpenRouter + Gemini).

Because we're basically describing how the model likes to be talked to, I'm leaning towards the name ModelProfile or ModelSpec or something similar -- but very open to other suggestions.

It'd look something like this:

from dataclasses import dataclass, replace
from typing import Any, Literal

@dataclass
class ModelProfile:
    # 'openai'/'gemini' select the existing transformers; a WalkJsonSchema
    # subclass allows custom per-model schema tweaks
    json_schema_transformer: Literal['openai', 'gemini'] | type[WalkJsonSchema]
    supported_output_modes: set[Literal['tool', 'json_schema', 'json_object', 'manual_json']]
    default_output_mode: Literal['tool', 'json_schema', 'json_object', 'manual_json']

    # definitely not all necessary right away, but to give you an idea
    built_in_tools: dict[str, dict[str, Any]]
    manual_json_prompt: str
    tool_use: bool
    strict_tools: bool
    tool_choice: bool
    tool_result_type: Literal['string', 'object']
    multi_modal_input_types: set[Literal['video', 'audio', 'image', 'docs']]
    offer_batch_tool: bool

# models/__init__.py
DEFAULT_PROFILE = ModelProfile(...)

# models/openai.py
DEFAULT_OPENAI_PROFILE = replace(DEFAULT_PROFILE, json_schema_transformer='openai', ...)

OPENAI_PROFILES = {}
OPENAI_PROFILES['gpt-4'] = replace(DEFAULT_OPENAI_PROFILE, supported_output_modes={'tool', 'json_object', 'manual_json'})
OPENAI_PROFILES['gpt-4o'] = replace(OPENAI_PROFILES['gpt-4'], supported_output_modes={'tool', 'json_schema', 'manual_json'})

# models/gemini.py
DEFAULT_GEMINI_PROFILE = replace(DEFAULT_PROFILE, json_schema_transformer='gemini', ...)

GEMINI_PROFILES = {}
GEMINI_PROFILES['gemini-2.0-flash-001'] = replace(DEFAULT_GEMINI_PROFILE)

# models/anthropic.py
DEFAULT_ANTHROPIC_PROFILE = replace(DEFAULT_PROFILE, ...)

ANTHROPIC_PROFILES = {}
ANTHROPIC_PROFILES['claude-3-5-sonnet-20240620'] = replace(DEFAULT_ANTHROPIC_PROFILE, ...)

# models/bedrock.py
DEFAULT_BEDROCK_PROFILE = replace(DEFAULT_PROFILE)

BEDROCK_PROFILES = {}
BEDROCK_PROFILES['us.anthropic.claude-3-5-sonnet-20240620'] = ANTHROPIC_PROFILES['claude-3-5-sonnet-20240620'] # or some cleverer way to read these automatically based on name

# my_agent.py
model = OpenAIModel(model_name='gpt-4o')

# Using OpenAIModel against OpenRouter with a Gemini model and Gemini's profile
model = OpenAIModel(
    'google/gemini-2.0-flash-001',
    provider=OpenAIProvider(base_url='https://openrouter.ai/api/v1', ...),
    profile=GEMINI_PROFILES['gemini-2.0-flash-001'],
)

model = AnthropicModel(model_name='claude-3-5-sonnet-20240620')

model = BedrockModel(model_name='llama3.3', profile=replace(DEFAULT_PROFILE, json_schema_transformer='gemini'))

# could also work, if we merge in the defaults (or just set those on the dataclass/pydantic model?)
model = BedrockModel(model_name='llama3.3', profile=ModelProfile(json_schema_transformer='gemini')) 
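
A minimal sketch of the merge that last line implies, under the assumption (not settled in this proposal) that every ModelProfile field defaults to None meaning "unset":

from dataclasses import fields, replace

def merge_profiles(base: ModelProfile, override: ModelProfile) -> ModelProfile:
    # Layer explicitly-set (non-None) fields of `override` on top of `base`.
    changes = {
        f.name: getattr(override, f.name)
        for f in fields(override)
        if getattr(override, f.name) is not None
    }
    return replace(base, **changes)

model = BedrockModel(
    model_name='llama3.3',
    profile=merge_profiles(DEFAULT_PROFILE, ModelProfile(json_schema_transformer='gemini')),
)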

I'd start by implementing this for json_schema_transformer, as that's the main one causing issues today, but since we have the output modes in the pipeline, I'd rather implement this as a new class from the get-go than as a json_schema_transformer argument set directly on Model.

@dmontagu @Kludex Thoughts? :)

References

No response

@DouweM DouweM self-assigned this May 20, 2025
@DouweM DouweM added the Feature request New feature request label May 20, 2025
@DouweM DouweM marked this as a duplicate of #1735 May 20, 2025
@DouweM DouweM marked this as a duplicate of #1659 May 20, 2025
@DouweM DouweM marked this as a duplicate of #1623 May 20, 2025
@DouweM DouweM marked this as a duplicate of #1649 May 20, 2025
dmontagu (Contributor) commented

This looks good. I think the GEMINI_PROFILES etc. should not include duplicate values for each model name, and instead should just include one key for each distinct profile value, and under the hood we should use a function that selects the appropriate profile as a function of the model name.

I would imagine then that we could allow users to pass either an explicit ModelProfile or a Callable[[str], ModelProfile] into the profile argument of Model.
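
For example, reusing the names from the sketch above (purely illustrative; the profile= argument is the proposed API, not an existing one):

from collections.abc import Callable

def gemini_profile(model_name: str) -> ModelProfile:
    # Map a model name to the closest known profile; fall back to the family default.
    return GEMINI_PROFILES.get(model_name, DEFAULT_GEMINI_PROFILE)

def resolve_profile(
    profile: ModelProfile | Callable[[str], ModelProfile], model_name: str
) -> ModelProfile:
    # What Model could do internally with its `profile` argument.
    return profile(model_name) if callable(profile) else profile

model = GeminiModel('gemini-2.0-flash-001', profile=gemini_profile)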

@Kludex Kludex marked this as not a duplicate of #1735 May 21, 2025
@Kludex Kludex marked this as not a duplicate of #1659 May 21, 2025
@Kludex Kludex marked this as not a duplicate of #1623 May 21, 2025
@Kludex Kludex marked this as a duplicate of #1735 May 21, 2025
@Kludex Kludex marked this as a duplicate of #1659 May 21, 2025
@Kludex Kludex marked this as a duplicate of #1623 May 21, 2025
Kludex (Member) commented May 21, 2025

I'm not sure about the name, but I agree with the idea. 👍
