fix(moonshot): enforce temperature=1 for all Kimi K2 / thinking models #87
Open
ChenglongWang wants to merge 1 commit into
Moonshot rejects any temperature other than 1 for the entire Kimi K2 family (kimi-k2, kimi-k2-turbo, kimi-k2.5, kimi-k2.5-turbo, ...) and the thinking/reasoning preview models, but the registry only matched the literal string "kimi-k2.5", so other variants returned:

    litellm.BadRequestError: MoonshotException - invalid temperature: only 1 is allowed for this model

Fix in two layers:

1. Broaden the Moonshot model_overrides patterns to "kimi-k2" and "kimi-thinking" so all current and future K2 / thinking variants are clamped to temperature=1.0 before the request leaves the client.
2. Add a runtime fallback in LiteLLMProvider.chat(): if the provider still rejects the request with an "only 1 is allowed" / "must be 1" style error, retry once with temperature=1.0 instead of surfacing the error to the user. This guards against future Moonshot releases we haven't registered overrides for yet.

Tests cover override matching for the K2 / thinking family, the moonshot-v1-* pass-through, error-message detection, the single retry path, and the no-infinite-loop guard.
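The first layer above (pattern-based overrides) can be illustrated with a minimal sketch. It assumes the registry stores overrides as `(pattern, params)` pairs and matches them by substring; the names `MOONSHOT_MODEL_OVERRIDES` and `apply_model_overrides` are hypothetical and are not taken from `registry.py`.

```python
from typing import Any

# Hypothetical override table: (substring pattern, parameter overrides).
# The real layout in mira_engine/providers/registry.py may differ.
MOONSHOT_MODEL_OVERRIDES: tuple[tuple[str, dict[str, Any]], ...] = (
    ("kimi-k2", {"temperature": 1.0}),        # kimi-k2, kimi-k2-turbo, kimi-k2.5, ...
    ("kimi-thinking", {"temperature": 1.0}),  # kimi-thinking-preview, ...
)


def apply_model_overrides(model: str, params: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of params with the first matching override applied."""
    merged = dict(params)
    name = model.lower()
    for pattern, overrides in MOONSHOT_MODEL_OVERRIDES:
        if pattern in name:  # substring match, so provider prefixes still match
            merged.update(overrides)
            break
    return merged
```

With this shape, a caller's `temperature=0.7` is replaced by `1.0` for any model name containing `kimi-k2` or `kimi-thinking`, while `moonshot-v1-*` names fall through unchanged.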
Summary
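Fixes the runtime error `litellm.BadRequestError: MoonshotException - invalid temperature: only 1 is allowed for this model`.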
Root cause: Moonshot enforces `temperature=1` on the entire Kimi K2 family (`kimi-k2`, `kimi-k2-turbo`, `kimi-k2.5`, `kimi-k2.5-turbo`, future `k2.x`, ...) and on thinking / reasoning preview models (`kimi-thinking-preview`, `kimi-k2-thinking`, ...). The registry only matched the literal substring `kimi-k2.5`, so every other affected variant let the caller's `0.7` through and got a 400 back.

Changes
- `mira_engine/providers/registry.py`: broaden Moonshot `model_overrides` from `("kimi-k2.5", ...)` to two prefixes that cover the whole family:
  - `kimi-k2`: matches `kimi-k2`, `kimi-k2-turbo`, `kimi-k2.5`, `kimi-k2.5-turbo`, future `k2.6+`
  - `kimi-thinking`: matches `kimi-thinking-preview` and similar reasoning previews
- `mira_engine/providers/litellm_provider.py`: defensive runtime fallback. If the API still rejects the request with an `only 1 is allowed` / `must be 1` style error, retry once with `temperature=1.0`. This guards against future Moonshot variants we haven't registered overrides for yet. There is no infinite-retry risk: we don't retry if `temperature` is already 1.0.
- `tests/providers/test_litellm_provider.py`: 8 new tests:
  - `moonshot-v1-*` pass-through preserved (caller temperature kept)

Test plan
- `pytest tests/providers/test_litellm_provider.py`: 9/9 ✅
- `pytest tests/providers/ tests/test_model_routing.py`: 244/244 ✅
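
The runtime fallback described under Changes could look roughly like the sketch below. It is not the code from `litellm_provider.py`: the helper names `is_temperature_one_error` and `chat_with_temperature_fallback` are hypothetical, and `send` stands in for whatever callable actually issues the request. The marker strings are the ones quoted in this PR.

```python
from typing import Any, Callable

# Error fragments quoted in this PR; the real detection logic may check more.
_TEMPERATURE_ERROR_MARKERS = ("only 1 is allowed", "must be 1")


def is_temperature_one_error(exc: Exception) -> bool:
    """True if the error text looks like a 'temperature must be 1' rejection."""
    text = str(exc).lower()
    return "temperature" in text and any(m in text for m in _TEMPERATURE_ERROR_MARKERS)


def chat_with_temperature_fallback(send: Callable[..., Any], **params: Any) -> Any:
    """Call send(**params); on a temperature rejection, retry once with 1.0."""
    try:
        return send(**params)
    except Exception as exc:
        # No retry if we already sent 1.0 (prevents a retry loop) or if the
        # error is unrelated to the temperature constraint.
        if params.get("temperature") == 1.0 or not is_temperature_one_error(exc):
            raise
        return send(**{**params, "temperature": 1.0})
```

Because the retry happens at most once and is skipped when the request already used `temperature=1.0`, a provider that keeps rejecting the call surfaces its error to the caller instead of looping.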