fix(responses): store complete assistant response in Redis sessions #89

Open
josemaria-vilaplana wants to merge 1 commit into upstream-sync/v1.81.0-stable from fix/redis-session-store-complete-response
Conversation

@josemaria-vilaplana

Summary

  • Fix multi-turn conversations with Gemini thinking models (2.5/3) when using tool calling via the Responses API
  • Previously, Redis session storage only stored input messages, missing the assistant response with tool_calls and provider_specific_fields
  • This caused "Base64 decoding failed for thought_signature" errors on follow-up requests

Changes

  • Extract and store complete assistant message including tool_calls in Redis sessions
  • Preserve provider_specific_fields containing thought_signatures (critical for Gemini thinking models)
  • Handle both streaming and non-streaming response paths
  • Add comprehensive tests for thought_signature preservation (17 new tests)

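The fix described above can be sketched roughly as follows. This is an illustrative, self-contained sketch (plain dicts and a pluggable Redis client), not LiteLLM's actual internals; the helper names `extract_assistant_message` and `store_session` are hypothetical, while the message fields (`tool_calls`, `provider_specific_fields`) follow the OpenAI-style chat schema referenced in the PR.

```python
import json


def extract_assistant_message(response: dict) -> dict:
    """Build a complete assistant message from a non-streaming chat
    completion response, keeping tool_calls and provider_specific_fields.
    (Hypothetical helper; field names follow the OpenAI-style schema.)"""
    choice = response["choices"][0]["message"]
    assistant_msg = {"role": "assistant", "content": choice.get("content")}
    if choice.get("tool_calls"):
        assistant_msg["tool_calls"] = choice["tool_calls"]
    # thought_signatures for Gemini thinking models live here; dropping
    # them is what caused "Base64 decoding failed for thought_signature"
    # on the follow-up turn.
    if choice.get("provider_specific_fields"):
        assistant_msg["provider_specific_fields"] = choice["provider_specific_fields"]
    return assistant_msg


def store_session(redis_client, session_id: str, input_messages, response: dict) -> None:
    """Persist the input messages PLUS the full assistant reply, so the
    next turn replays the complete history (hypothetical sketch, not
    LiteLLM's real session-store API)."""
    messages = list(input_messages)
    messages.append(extract_assistant_message(response))
    redis_client.set(f"session:{session_id}", json.dumps(messages))
```

Before the fix, only `input_messages` reached Redis; the streaming path would need the same treatment after the chunks are re-assembled into a final response.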
Test plan

  • Run unit tests: pytest tests/test_litellm/responses/litellm_completion_transformation/test_thought_signature_preservation.py (17 passed)
  • Run related tests: pytest tests/test_litellm/responses/litellm_completion_transformation/ (98 passed, 1 skipped due to missing dependency)
  • Manual test with Gemini 2.5/3 thinking model + tool calling + multi-turn conversation

🤖 Generated with Claude Code

Previously, Redis session storage only stored input messages, missing the
assistant response with tool_calls and provider_specific_fields. This caused
Gemini thinking models (2.5/3) to fail on multi-turn tool calling conversations
with "Base64 decoding failed for thought_signature" errors.

Changes:
- Extract and store complete assistant message including tool_calls
- Preserve provider_specific_fields containing thought_signatures
- Handle both streaming and non-streaming response paths
- Add comprehensive tests for thought_signature preservation

Fixes multi-turn conversations with Gemini thinking models when using
tool calling via the Responses API.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Collaborator

@mateo-di mateo-di left a comment

@josemaria-vilaplana can we check if we already have a test that verifies this carto-customization in the smoke-ai tests in cloud-native?

@josemaria-vilaplana
Copy link
Author

josemaria-vilaplana commented Feb 6, 2026

> @josemaria-vilaplana can we check if we already have a test that verifies this carto-customization in the smoke-ai tests in cloud-native?

There is no way to cover this with our current testing approach. The failure is intermittent: it only occurs when the base64-encoded thought_signature is larger than 2K, which happens only occasionally depending on the encoded value.
