fix(auto-run): unwedge experiment loop when provider rejects `temperature` by ChenglongWang · Pull Request #78 · MIRA-Intelligence/mira

ChenglongWang · 2026-05-14T09:49:51Z

Summary

Production incident PRJ-0002 (2026-05-14): every azure/anthropic/claude-opus-4-7 call returned

litellm.BadRequestError: Azure_aiException - invalid_request_error: `temperature` is deprecated for this model.

Every auto-run round re-hit the same parameter error until _AUTO_MAX_ROUNDS (20) was exhausted, and the error string was surfaced to the user instead of any research output.

Logs that triggered this PR: ~/Shared/agent-service.log (19 identical _run_agent_loop:580 errors in 13 seconds) and ~/Shared/actions.jsonl (auto-run round 1..20 all firing on the same model with no recovery).

Three independent bugs collided. This PR fixes each in isolation.

Bug 1 — `temperature` attached to models that reject it

New mira_engine/providers/model_compat.py centralises the rule (currently claude-opus-4-7).
AzureOpenAIProvider._supports_temperature consults it on top of its existing gpt-5 / o1 / o3 / o4 blocklist.
LiteLLMProvider.chat pops temperature from the request kwargs for registered models, before model-specific overrides apply.
AnthropicProvider._build_kwargs conditionally skips temperature on every code path (adaptive thinking, enabled thinking, plain).

Adding a model to the blocklist is now a one-line registry change.
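To illustrate the registry idea, here is a minimal sketch of what such a lookup could look like. It is not the contents of model_compat.py: the names `_NO_TEMPERATURE_MODELS` and `model_rejects_temperature`, and the choice to match on the last path segment of the model id, are assumptions made for this example.

```python
# Hypothetical sketch of a shared "rejects temperature" registry.
# Names and matching strategy are assumptions, not the code in this PR.

# One entry per model that rejects the `temperature` parameter.
_NO_TEMPERATURE_MODELS = frozenset({"claude-opus-4-7"})


def model_rejects_temperature(model: str) -> bool:
    """Return True when the bare model name is listed in the registry.

    Provider prefixes such as "azure/anthropic/" are ignored by looking
    only at the last path segment of the model id.
    """
    bare = model.strip().lower().rsplit("/", 1)[-1]
    return bare in _NO_TEMPERATURE_MODELS


if __name__ == "__main__":
    for name in ("claude-opus-4-7", "azure/anthropic/claude-opus-4-7", "gpt-4o"):
        print(name, model_rejects_temperature(name))
```

With a shape like this, each provider asks one function instead of keeping its own list, which is what makes a new entry a one-line change.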

Bug 2 — non-retryable raised exceptions burned the fallback chain

RoutedProviderManager.chat already classified non-retryable error responses via _should_retry_with_fallback, but the except Exception branch unconditionally walked to the next candidate. A raised BadRequestError therefore tried every candidate even though they all fail identically. Apply the same classifier on str(exc) and re-raise immediately on non-retryable errors.
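The control flow described above can be sketched as follows. This is a simplified stand-in, not RoutedProviderManager.chat itself: `chat_with_fallback`, `should_retry_with_fallback`, and the marker tuple are hypothetical, and the real classifier may use different rules.

```python
from typing import Callable, Optional, Sequence

# Assumed markers of permanent request errors; the real classifier may differ.
_NON_RETRYABLE_MARKERS = ("badrequesterror", "invalid_request_error")


def should_retry_with_fallback(error_text: str) -> bool:
    """Return False when the error text looks like a permanent request error."""
    lowered = error_text.lower()
    return not any(marker in lowered for marker in _NON_RETRYABLE_MARKERS)


def chat_with_fallback(candidates: Sequence[Callable[[], str]]) -> str:
    """Try each candidate in order, stopping early on non-retryable errors."""
    last_exc: Optional[Exception] = None
    for call in candidates:
        try:
            return call()
        except Exception as exc:
            # Classify the raised exception the same way as an error response.
            if not should_retry_with_fallback(str(exc)):
                raise  # every other candidate would fail identically
            last_exc = exc
    raise RuntimeError("All candidate models failed for this turn") from last_exc
```

A transient failure such as a timeout still moves on to the next candidate; only errors the classifier marks as permanent short-circuit the chain.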

Bug 3 — auto-run kept spinning on LLM provider errors

_evaluate_continuation only stopped on failure responses when strictHeuristics was on. In PRJ-0002 there was no automation_policy, so it defaulted to relaxed heuristics and kept calling the LLM 20 times with the same parameter error. Add an unconditional _looks_like_llm_provider_error check that halts auto mode with stop_reason="llm provider error" regardless of policy. Detects markers from every provider wrapper (Error calling LLM, Error calling Azure OpenAI, litellm.BadRequestError, Azure_aiException, invalid_request_error, All candidate models failed for this turn, …).
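As a rough illustration of the unconditional check, the sketch below uses the markers named above. The function names, the return shape, and the plain substring matching are assumptions for this example; the actual _evaluate_continuation has more branches and a different signature.

```python
from typing import Optional, Tuple

# Markers listed in this PR description; the real list is longer ("...").
_PROVIDER_ERROR_MARKERS = (
    "Error calling LLM",
    "Error calling Azure OpenAI",
    "litellm.BadRequestError",
    "Azure_aiException",
    "invalid_request_error",
    "All candidate models failed for this turn",
)


def looks_like_llm_provider_error(response_text: str) -> bool:
    """Return True when the text carries a known provider-error marker."""
    return any(marker in response_text for marker in _PROVIDER_ERROR_MARKERS)


def evaluate_continuation(
    response_text: str, strict_heuristics: bool = False
) -> Tuple[bool, Optional[str]]:
    """Return (should_continue, stop_reason) for one auto-run round.

    The provider-error check runs first and deliberately ignores
    `strict_heuristics`, so it halts under relaxed policy too.
    """
    if looks_like_llm_provider_error(response_text):
        return False, "llm provider error"
    # Policy-dependent heuristics would follow here in a fuller version.
    return True, None
```

The point of checking before any policy logic is that a provider error means the model never produced a turn, so repeating the round cannot change the outcome.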

Files

mira_engine/providers/model_compat.py              (new)
mira_engine/providers/azure_openai_provider.py
mira_engine/providers/litellm_provider.py
mira_engine/providers/anthropic_provider.py
mira_engine/agent/routing.py
mira_engine/agent/research_loop.py
tests/providers/test_model_compat.py               (new)
tests/providers/test_azure_openai_provider.py
tests/test_model_routing.py
tests/test_research_loop_core.py

Test plan

pytest tests/providers/test_model_compat.py tests/providers/test_azure_openai_provider.py — 31 passed (covers claude-opus-4-7 under every provider prefix and Azure body builder dropping temperature)
pytest tests/test_model_routing.py — 12 passed (new: retryable-raised falls back; non-retryable-raised does not invoke fallback candidate)
pytest tests/test_research_loop_core.py — 14 passed (new: _looks_like_llm_provider_error markers; _evaluate_continuation halts with "llm provider error" even under relaxed heuristics)
Full regression on tests/providers/, tests/test_model_routing.py, tests/test_research_loop_core.py, tests/test_agent_loop_core.py, tests/test_agent_loop.py — 272 passed
ruff check on newly touched files — clean (pre-existing W293/F841/I001 in unrelated lines confirmed via stash diff)
Re-run a claude-opus-4-7 auto session against staging to confirm the surfaced error is limited to a single round and the loop halts with stop_reason="llm provider error"

…ture`

PRJ-0002 (2026-05-14) wedged on `azure/anthropic/claude-opus-4-7` returning `invalid_request_error: \`temperature\` is deprecated for this model.` Every auto-run round re-hit the same parameter error until `_AUTO_MAX_ROUNDS` (20) was exhausted, surfacing the error to the user instead of any research result.

Three independent bugs collided:

1. `temperature` was unconditionally attached to outbound requests for models that no longer accept it. Centralise the rule in a new `providers.model_compat` registry (currently lists `claude-opus-4-7`) and gate temperature emission on it in the Azure, LiteLLM, and Anthropic providers. Azure's existing `_supports_temperature` rule for `gpt-5`/`o*` deployments is preserved on top.

2. `RoutedProviderManager.chat` blindly walked the fallback chain when a provider RAISED rather than returned an error response, so a permanent 4xx burned every remaining candidate. Apply the same `_should_retry_with_fallback` classification used on the response path; non-retryable exceptions now short-circuit immediately.

3. `_evaluate_continuation` only stopped auto mode on failure responses when `strictHeuristics` was on. LLM-provider errors are NOT experiment outcomes; they mean the model never produced a turn, so the next round will hit the same error. Add an unconditional `_looks_like_llm_provider_error` check that halts auto mode with `stop_reason="llm provider error"` regardless of policy.

Tests cover the model_compat blocklist under every provider prefix, the Azure body builder dropping temperature for `claude-opus-4-7`, non-retryable raised exceptions not burning the fallback chain, and auto-run halting on the exact error text observed in PRJ-0002.

…rovider too

Second occurrence at 2026-05-14 21:09: the temperature error reappeared even after the first PR fix.

Root cause: `OpenAICompatProvider` (used by `custom` provider configs and by `GitHubCopilotProvider` via inheritance) keeps its own `_supports_temperature` rule that only blocked GPT-5 / o1 / o3 / o4 deployments. When a user's OpenAI-compatible endpoint proxies to Azure-hosted `claude-opus-4-7`, this path still attached `temperature` and Azure 400'd with `invalid_request_error: \`temperature\` is deprecated for this model.`

Have `_supports_temperature` also consult the shared `providers.model_compat` registry. Same pattern as Azure / LiteLLM / Anthropic providers from the parent commit.

The error-format trail (`Error: {'message':...}`) comes from `_handle_error` in `openai_compat_provider.py:811`, which confirms this code path is the one the user's config hits.

Adds two regression tests:

- `_supports_temperature` returns False for `claude-opus-4-7` under every provider prefix.
- `_build_kwargs` AND `_build_responses_body` both omit `temperature` from the outbound request body for `azure/anthropic/claude-opus-4-7`.
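The follow-up fix combines two rules in one provider: the older deployment-name check and the shared registry. A minimal sketch of that combination is shown here. `supports_temperature` and `build_request_body` are hypothetical stand-ins for `_supports_temperature` and `_build_kwargs`, and the prefix matching for GPT-5 / o1 / o3 / o4 is an assumption about how the existing rule works.

```python
from typing import Any, Dict, List, Optional

# Stand-in for the shared registry; the real one lives in providers.model_compat.
_REGISTRY_NO_TEMPERATURE = frozenset({"claude-opus-4-7"})

# Assumed form of the pre-existing provider-local rule.
_LOCAL_NO_TEMPERATURE_PREFIXES = ("gpt-5", "o1", "o3", "o4")


def supports_temperature(model: str) -> bool:
    """Apply the provider-local rule first, then the shared registry."""
    bare = model.strip().lower().rsplit("/", 1)[-1]
    if bare.startswith(_LOCAL_NO_TEMPERATURE_PREFIXES):
        return False
    return bare not in _REGISTRY_NO_TEMPERATURE


def build_request_body(
    model: str,
    messages: List[Dict[str, str]],
    temperature: Optional[float] = None,
) -> Dict[str, Any]:
    """Build an outbound body, omitting temperature when unsupported."""
    body: Dict[str, Any] = {"model": model, "messages": messages}
    if temperature is not None and supports_temperature(model):
        body["temperature"] = temperature
    return body
```

Omitting the key entirely, rather than sending a default value, matters here because the provider rejects the parameter's presence, not a particular value.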

ChenglongWang added 2 commits May 14, 2026 17:47

ChenglongWang force-pushed the fix/auto-run-temperature-and-error-loop branch from e133e75 to 2dd9140 May 14, 2026 13:41

ChenglongWang wants to merge 2 commits into feat/project-registry-runtime-isolation from fix/auto-run-temperature-and-error-loop
