fix: graceful fallback when model has no chat_template (MedGemma)#271
fix: graceful fallback when model has no chat_template (MedGemma)#271jackneil wants to merge 1 commit intowaybarrios:mainfrom
Conversation
Models like MedGemma have apply_chat_template() as an inherited method but no chat_template configured, causing ValueError on every request. Now catches ValueError and falls back to plain-text prompt format in both BatchedEngine and SimpleEngine paths. Fixes: MedGemma crashes with "Cannot use apply_chat_template" Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
One correctness concern before merge: the new fallback paths catch broad That is fine for the specific MedGemma case ( I think the fallback should be narrowed to the known missing-template case only — e.g. catch |
Thump604
left a comment
There was a problem hiding this comment.
I rechecked this against current main, and I do not think it is merge-ready in its current shape.
Two blockers:
-
The fallback is still too broad. Catching generic
TypeError/ValueErrorfromapply_chat_template()will silently mask unrelated template regressions. This should be narrowed to the known missing-template case only (theno chat_template configured/Cannot use apply_chat_template when no chat_template is setclass of failure). -
More importantly, this branch does not actually cover the real
SimpleEngine.chat()crash path on currentmain. The non-streaming simple-engine path still delegates straight intoself._model.chat(...)before any of this local fallback logic runs, so a model/processor withapply_chat_template()present but no configuredchat_templatecan still fail before reaching the new fallback code. The current diff mainly helps the local prompt-building/accounting paths, not the full execution path.
Given that the branch is also conflicting, I would rather see a small current-main replacement that fixes the real execution path and adds a regression against that path specifically.
janhilgard
left a comment
There was a problem hiding this comment.
Agreeing with Thump604's review -- the broad TypeError/ValueError catch is the main concern.
A couple of additional notes:
-
simple.pytoken count estimation is lossy. When the fallback triggers,prompt_token_countis estimated aslen(content) // 4. This is a rough heuristic that can be significantly off for non-English text or short prompts. The estimate propagates into theusagefield of the API response, which callers may rely on for billing or context window management. Consider at least logging a warning when this fallback path is taken, so users know the token count is approximate. -
Inconsistent fallback between engines. In
batched.py, the fallback catchesValueErrorfrom the outer try (thehasattr(template_applicator, 'apply_chat_template')path) and also adds a new innerexcept (TypeError, ValueError): passfor the tools-stripped retry. But insimple.py, onlyapply_chat_templateis wrapped. If MedGemma is used with the batched engine, it hits a different code path than with the simple engine. It would be good to unify the fallback logic into a shared helper rather than duplicating it. -
Tests use
object.__new__(BatchedEngine)to create a stub. This is brittle -- ifBatchedEngine.__init__ever adds required state (e.g., a logger, a lock), the tests will break in confusing ways. A lightweight mock or extracting_apply_chat_templateinto a standalone function would be more robust. -
Branch has merge conflicts with current main. Needs a rebase.
The fix itself is needed (MedGemma and similar template-less models should not crash the server), but I agree with Thump604 that narrowing the catch to the specific "no chat_template configured" error message is important to avoid masking real bugs.
|
Hi @jackneil -- this PR has review feedback from April and merge conflicts with current main. Are you still working on it? Happy to review a rebased version. If the work is stalled, we can close it and you're welcome to refile when ready. Will check back in two weeks. |
Summary
apply_chat_template()inherited but nochat_templateconfiguredValueError: Cannot use apply_chat_template when no chat_template is setValueErrorand falls back to plain-text prompt format in both BatchedEngine and SimpleEngineTest plan
🤖 Generated with Claude Code