Different behaviour with Gemini models using OpenAI+OpenRouter #1735


Closed · 2 tasks done
ChenghaoMou opened this issue May 15, 2025 · 5 comments

ChenghaoMou (Contributor)

Initial Checks

Description

When using a Gemini model as follows:

import os

from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider

openrouter_provider = OpenAIProvider(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)
model = OpenAIModel(
    "google/gemini-2.0-flash-001", provider=openrouter_provider
)

It produces a different parameters_json_schema, which causes the model to repeatedly fail to follow the schema, while GeminiModel + VertexProvider rarely fails.

[Screenshot: side-by-side comparison of the generated parameters_json_schema]

Left: OpenAIModel with OpenAIProvider/OpenRouter
Right: GeminiModel with VertexProvider
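
To make the divergence concrete, here is a minimal standalone sketch (not from the original report; Inner and Outer are illustrative stand-ins for the nested GradingReasoning/GroundedRating models defined below) that prints both schema shapes using the internal _GeminiJsonSchema transformer referenced later in this thread:

import json
from typing import Annotated

from pydantic import BaseModel, Field
from pydantic_ai.models.gemini import _GeminiJsonSchema


class Inner(BaseModel):
    detail: Annotated[str, Field(description="A nested field.")]


class Outer(BaseModel):
    inner: Annotated[Inner, Field(description="A nested model, like GroundedRating.reasoning.")]


# Plain Pydantic schema: the nested model ends up behind $defs/$ref,
# which (per this issue) Gemini via OpenRouter follows unreliably.
print(json.dumps(Outer.model_json_schema(), indent=2))

# Gemini-specific transformation (the same _GeminiJsonSchema(...).walk()
# call used in the workaround below), producing the schema shape that
# GeminiModel + VertexProvider sends.
print(json.dumps(_GeminiJsonSchema(Outer.model_json_schema()).walk(), indent=2))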

Error Message
15:50:03.343 agent run
15:50:03.344   chat google/gemini-2.0-flash-001
Traceback (most recent call last):
  File "/Users/chenghao/Developer/ai-reports/.venv/lib/python3.12/site-packages/pydantic_ai/_output.py", line 234, in validate
    output = self.type_adapter.validate_json(tool_call.args, experimental_allow_partial=pyd_allow_partial)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/chenghao/Developer/ai-reports/.venv/lib/python3.12/site-packages/pydantic/type_adapter.py", line 468, in validate_json
    return self.validator.validate_json(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for GroundedRating
reasoning
  Input should be an object [type=model_type, input_value='The speech was well-stru...red and easy to follow.', input_type=str]
    For further information visit https://errors.pydantic.dev/2.11/v/model_type

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/chenghao/Developer/ai-reports/.venv/lib/python3.12/site-packages/pydantic_ai/_agent_graph.py", line 459, in _handle_tool_calls
    result_data = output_tool.validate(call)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/chenghao/Developer/ai-reports/.venv/lib/python3.12/site-packages/pydantic_ai/_output.py", line 244, in validate
    raise ToolRetryError(m) from e
pydantic_ai._output.ToolRetryError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/chenghao/Developer/ai-reports/reproduce/debug.py", line 92, in <module>
    asyncio.run(test_analyze_sales_conversation(data))
  File "/Users/chenghao/.local/share/uv/python/cpython-3.12.6-macos-aarch64-none/lib/python3.12/asyncio/runners.py", line 194, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/Users/chenghao/.local/share/uv/python/cpython-3.12.6-macos-aarch64-none/lib/python3.12/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/chenghao/.local/share/uv/python/cpython-3.12.6-macos-aarch64-none/lib/python3.12/asyncio/base_events.py", line 687, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/Users/chenghao/Developer/ai-reports/reproduce/debug.py", line 77, in test_analyze_sales_conversation
    result = await agent.run(
             ^^^^^^^^^^^^^^^^
  File "/Users/chenghao/Developer/ai-reports/.venv/lib/python3.12/site-packages/pydantic_ai/agent.py", line 451, in run
    async for _ in agent_run:
  File "/Users/chenghao/Developer/ai-reports/.venv/lib/python3.12/site-packages/pydantic_ai/agent.py", line 1812, in __anext__
    next_node = await self._graph_run.__anext__()
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/chenghao/Developer/ai-reports/.venv/lib/python3.12/site-packages/pydantic_graph/graph.py", line 810, in __anext__
    return await self.next(self._next_node)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/chenghao/Developer/ai-reports/.venv/lib/python3.12/site-packages/pydantic_graph/graph.py", line 783, in next
    self._next_node = await node.run(ctx)
                      ^^^^^^^^^^^^^^^^^^^
  File "/Users/chenghao/Developer/ai-reports/.venv/lib/python3.12/site-packages/pydantic_ai/_agent_graph.py", line 380, in run
    async with self.stream(ctx):
               ^^^^^^^^^^^^^^^^
  File "/Users/chenghao/.local/share/uv/python/cpython-3.12.6-macos-aarch64-none/lib/python3.12/contextlib.py", line 217, in __aexit__
    await anext(self.gen)
  File "/Users/chenghao/Developer/ai-reports/.venv/lib/python3.12/site-packages/pydantic_ai/_agent_graph.py", line 394, in stream
    async for _event in stream:
  File "/Users/chenghao/Developer/ai-reports/.venv/lib/python3.12/site-packages/pydantic_ai/_agent_graph.py", line 443, in _run_stream
    async for event in self._events_iterator:
  File "/Users/chenghao/Developer/ai-reports/.venv/lib/python3.12/site-packages/pydantic_ai/_agent_graph.py", line 421, in _run_stream
    async for event in self._handle_tool_calls(ctx, tool_calls):
  File "/Users/chenghao/Developer/ai-reports/.venv/lib/python3.12/site-packages/pydantic_ai/_agent_graph.py", line 464, in _handle_tool_calls
    ctx.state.increment_retries(ctx.deps.max_result_retries)
  File "/Users/chenghao/Developer/ai-reports/.venv/lib/python3.12/site-packages/pydantic_ai/_agent_graph.py", line 70, in increment_retries
    raise exceptions.UnexpectedModelBehavior(
pydantic_ai.exceptions.UnexpectedModelBehavior: Exceeded maximum retries (0) for result validation

Example Code

import asyncio
import os
from typing import Annotated, Literal

from dotenv import load_dotenv
from pydantic import BaseModel, Field
from pydantic_ai import Agent, BinaryContent
from pydantic_ai.models.gemini import GeminiModel
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.google_vertex import GoogleVertexProvider
from pydantic_ai.providers.openai import OpenAIProvider


class GradingReasoning(BaseModel):
    """The reasoning for the grade of the communication."""

    overall_reasoning: Annotated[
        str,
        Field(description="The reasoning for the overall grade of the communication."),
    ]
    grade_justification: Annotated[
        str,
        Field(
            description="The justification for the chosen grade of the communication."
        ),
    ]
    grade_level_justification: Annotated[
        str,
        Field(
            description="The justification for the chosen grade level of the communication."
        ),
    ]


class GroundedRating(BaseModel):
    """A grounded rating is a rating that is related to a specific transcript. Used in model generation."""

    reasoning: Annotated[
        GradingReasoning, Field(description="The reasoning for the rating.")
    ]
    grade: Annotated[
        Literal["a", "b", "c", "d", "e", "f"],
        Field(description="The grade of the communication."),
    ]
    summary: Annotated[str, Field(description="The summary of the issues.")]


load_dotenv(override=True)


async def test_analyze_sales_conversation(audio_bytes: bytes) -> None:
    openrouter_provider = OpenAIProvider(
        base_url="https://openrouter.ai/api/v1",
        api_key=os.environ["OPENROUTER_API_KEY"],
    )
    vertex_provider = GoogleVertexProvider(
        service_account_file=os.environ["SERVICE_ACCOUNT_JSON"],
    )

    gemini_model = GeminiModel("gemini-2.0-flash-001", provider=vertex_provider)
    openai_model = OpenAIModel(
        "google/gemini-2.0-flash-001", provider=openrouter_provider
    )

    agent = Agent(
        model=openai_model,
        # model=gemini_model,
        system_prompt="What can you tell me about the speech?",
        retries=0,
        instrument=True,
    )
    result = await agent.run(
        [
            "Here is the audio file to analyze:",
            BinaryContent(audio_bytes, media_type="audio/wav"),
        ],
        output_type=GroundedRating,
    )
    print(result.output)


if __name__ == "__main__":
    data = open(
        "....wav",
        "rb",
    ).read()
    asyncio.run(test_analyze_sales_conversation(data))

Python, Pydantic AI & LLM client version

pydantic                                     2.11.4
pydantic-ai                                  0.2.0
pydantic-ai-slim                             0.2.0
pydantic-core                                2.33.2
pydantic-evals                               0.2.0
pydantic-graph                               0.2.0
pydantic-settings                            2.9.1
openai                                       1.75.0
google-genai                                 1.15.0
DouweM (Contributor) commented May 19, 2025

@ChenghaoMou Since you're using the OpenAIModel, the Gemini-specific JSON schema transformations aren't applied.

You can apply them manually with a custom OpenAIModel subclass that uses _GeminiJsonSchema, like this:

from dataclasses import replace

from pydantic_ai.models import ModelRequestParameters
from pydantic_ai.models.gemini import _GeminiJsonSchema
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.tools import ToolDefinition

class ORGeminiModel(OpenAIModel):
    def customize_request_parameters(self, model_request_parameters: ModelRequestParameters) -> ModelRequestParameters:
        # Apply the Gemini-specific schema transformation to every tool definition.
        def _customize_tool_def(t: ToolDefinition):
            return replace(t, parameters_json_schema=_GeminiJsonSchema(t.parameters_json_schema).walk())

        return ModelRequestParameters(
            function_tools=[_customize_tool_def(tool) for tool in model_request_parameters.function_tools],
            allow_text_output=model_request_parameters.allow_text_output,
            output_tools=[_customize_tool_def(tool) for tool in model_request_parameters.output_tools],
        )
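
A hypothetical usage sketch, wiring ORGeminiModel into the provider setup from the reproduction above (nothing here beyond names already in this thread):

import os

from pydantic_ai import Agent
from pydantic_ai.providers.openai import OpenAIProvider

openrouter_provider = OpenAIProvider(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)
# Same OpenRouter model name as before, now routed through the
# schema-transforming subclass defined above.
model = ORGeminiModel("google/gemini-2.0-flash-001", provider=openrouter_provider)
agent = Agent(model, system_prompt="What can you tell me about the speech?", retries=0)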

Let me know if that works.

I'll file an issue later to start working out a more general solution to this problem: using a model class built for a specific API (OpenAI) and models (OpenAI's) with a different actual API (OpenRouter) and models (Gemini) can result in unexpected behavior like what you're seeing here.

DouweM self-assigned this May 19, 2025
ChenghaoMou (Contributor, Author)

@DouweM Thanks a ton for the help! I can confirm that it solves my issue. Feel free to close this issue in favor of your later one.

DouweM (Contributor) commented May 26, 2025

@ChenghaoMou With the changes in #1835, which include a new OpenRouterProvider that automatically reads the OPENROUTER_API_KEY env var, and automatic JSON schema transformer selection based on the google/ prefix in the model name, you should be able to drop the ORGeminiModel we created and use just this:

from pydantic_ai.providers.openrouter import OpenRouterProvider

model = OpenAIModel(
    "google/gemini-2.0-flash-001", provider=OpenRouterProvider()
)
agent = Agent(model)

Or this:

model = OpenAIModel("google/gemini-2.0-flash-001", provider="openrouter")
agent = Agent(model)

Or even this:

agent = Agent("openrouter:google/gemini-2.0-flash-001")

If you have a chance, could you verify that that works as expected?

ChenghaoMou (Contributor, Author)

@DouweM thanks for the quick turnaround!

I have tested your branch with OpenAIModel("google/gemini-2.0-flash-001", provider=openrouter_provider).

Here is an issue I found:

File "/Users/chenghao/Developer/ai-reports/.venv/lib/python3.12/site-packages/pydantic_ai/providers/openrouter.py", line 64, in model_profile
  model_name, _ = model_name.split(':', 1)
  ^^^^^^^^^^^^^
ValueError: not enough values to unpack (expected 2, got 1)

The model_profile function seems to expect the format provider/model_name:something:

    def model_profile(self, model_name: str) -> ModelProfile | None:
        provider, model_name = model_name.split('/', 1)
        if provider in _provider_to_profile:
            model_name, _ = model_name.split(':', 1)
            return _provider_to_profile[provider](model_name)
        return None

But if I change it to model_name, *_ = model_name.split(':', 1), then it works fine.
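
For reference, here is the quoted function with that one-line change applied (a sketch; everything except the starred assignment is verbatim from the snippet above):

    def model_profile(self, model_name: str) -> ModelProfile | None:
        provider, model_name = model_name.split('/', 1)
        if provider in _provider_to_profile:
            # The starred target absorbs the empty remainder when no ':'
            # variant suffix (e.g. ':free') is present, so plain model
            # names no longer raise ValueError.
            model_name, *_ = model_name.split(':', 1)
            return _provider_to_profile[provider](model_name)
        return None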

DouweM (Contributor) commented May 27, 2025

@ChenghaoMou Good catch, that's what I get for leaving tests to the end :) Fixed in the PR!
