
Conversation


@enyst enyst commented Oct 23, 2025

Summary

  • Return to simple, robust core family substring matching across the full raw model string
  • Remove fnmatch/globbing and stop using normalization for feature detection
  • Update pattern tables to pure substrings (no wildcards)
  • Adjust tests accordingly and add coverage that validates Bedrock-style names

What & Why
A recent refactor introduced fnmatch-based globbing over a normalized basename. This unintentionally diverged from the prior V0 behavior, where we effectively matched by substring on the full provider/model name. The change broke real-world cases, notably AWS Bedrock, where names embed dotted vendor prefixes and version suffixes inside the basename (e.g., bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0). Patterns like 'claude-3-5-sonnet*' stopped matching after normalization and globbing.

This PR restores the durable invariant: if a meaningful family token (e.g., 'claude-3-5-sonnet', 'gpt-4o', 'o3', 'gemini-2.5-pro') appears anywhere in the model string, the feature applies. This eliminates the pattern maintenance whack-a-mole caused by dotted prefixes and provider-specific suffixes and aligns again with proven behavior in the wild.
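
For illustration, here is a minimal Python sketch of the failure mode; the exact normalization used by the refactor may differ, and the basename split below is only an assumption:

import fnmatch

# Bedrock-style raw model string: provider prefix plus a dotted vendor
# prefix and a version suffix surrounding the family token.
raw = "bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0"

# Assumption for illustration: normalization kept the basename after the
# provider prefix, so the dotted vendor prefix is still present.
basename = raw.split("/", 1)[-1]  # "anthropic.claude-3-5-sonnet-20241022-v2:0"

# fnmatch matches the whole string, so the leading vendor prefix defeats a
# pattern that only has a trailing wildcard.
print(fnmatch.fnmatch(basename, "claude-3-5-sonnet*"))  # False

# Plain substring matching on the full raw string is unaffected.
print("claude-3-5-sonnet" in raw)  # True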

Implementation Details

  1. model_matches() (a minimal sketch follows this list)

    • Lowercase + strip the incoming model string and perform case-insensitive substring checks on the full raw string
    • For each pattern, lowercase/strip and drop any trailing '*' (migration aid); treat the remaining token as a plain substring
    • Return True on first match; False otherwise
    • No use of normalize_model_name() here
  2. Pattern tables: remove '*'

    • FUNCTION_CALLING_PATTERNS, REASONING_EFFORT_PATTERNS, PROMPT_CACHE_PATTERNS, SUPPORTS_STOP_WORDS_FALSE_PATTERNS, RESPONSES_API_PATTERNS now contain pure substrings
    • Provider-qualified entries remain supported by virtue of substring matching against the raw string
  3. normalize_model_name()

    • Not used by matching. Tests exercising normalization for matching were removed to avoid confusion
  4. Tests

    • Remove wildcard expectations; adapt to pure substring semantics
    • Ensure Bedrock coverage: e.g., 'bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0' enables function calling and prompt cache
    • Verify provider-qualified substrings gate as expected (e.g., 'openai/gpt-4o' matches 'openai/gpt-4o' but not 'anthropic/*')
    • Keep conservative defaults for unknown models
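
A minimal sketch of the matching behavior described above; the pattern table is illustrative (the real SDK tables and function signature may differ), and the trailing-'*' stripping is the migration aid mentioned in item 1:

FUNCTION_CALLING_PATTERNS = [
    # Pure substrings, no wildcards; provider-qualified entries still work
    # because matching runs against the full raw string.
    "claude-3-5-sonnet",
    "gpt-4o",
    "gemini-2.5-pro",
    "openai/gpt-4o",
]

def model_matches(model: str, patterns: list[str]) -> bool:
    """Case-insensitive substring match over the full raw model string."""
    haystack = model.strip().lower()
    for pattern in patterns:
        # Migration aid: tolerate a leftover trailing '*' from the old glob tables.
        token = pattern.strip().lower().rstrip("*")
        if token and token in haystack:
            return True
    return False

# Bedrock-style name with a dotted vendor prefix and version suffix still matches.
assert model_matches("bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0",
                     FUNCTION_CALLING_PATTERNS)
# Unknown models stay conservative: no match means the feature stays off.
assert not model_matches("some-unknown-model", FUNCTION_CALLING_PATTERNS)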

Outcomes

  • Clear behavior: if the essential family token appears in the model string, the feature applies
  • Fewer special-case patterns and more durable matching across providers
  • Restores pre-refactor semantics that worked reliably in practice

Checklist

  • Code formatted and linted via pre-commit
  • Updated tests for sdk changes; all impacted sdk tests pass locally

Closes #844



Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant | Base Image                                  | Docs / Tags
golang  | golang:1.21-bookworm                        | Link
java    | eclipse-temurin:17-jdk                      | Link
python  | nikolaik/python-nodejs:python3.12-nodejs22  | Link

Pull (multi-arch manifest)

docker pull ghcr.io/openhands/agent-server:60d1363-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-60d1363-python \
  ghcr.io/openhands/agent-server:60d1363-python

All tags pushed for this build

ghcr.io/openhands/agent-server:60d1363-golang
ghcr.io/openhands/agent-server:v1.0.0a4_golang_tag_1.21-bookworm_binary
ghcr.io/openhands/agent-server:60d1363-java
ghcr.io/openhands/agent-server:v1.0.0a4_eclipse-temurin_tag_17-jdk_binary
ghcr.io/openhands/agent-server:60d1363-python
ghcr.io/openhands/agent-server:v1.0.0a4_nikolaik_s_python-nodejs_tag_python3.12-nodejs22_binary

The 60d1363 tag is a multi-arch manifest (amd64/arm64); your client pulls the right arch automatically.

Cross-repo impact: Fix: OpenHands/OpenHands#11248

…normalize usage

- model_matches now does case-insensitive substring on full raw model
- strip trailing '*' in patterns (migration aid)
- pattern tables converted to plain substrings (no '*')
- drop normalize_model_name and related tests
- update tests to reflect substring semantics and Bedrock coverage

Fixes #844

Co-authored-by: openhands <[email protected]>

github-actions bot commented Oct 23, 2025

Coverage

Coverage Report

File  | Stmts | Miss | Cover | Missing
TOTAL | 10595 | 4647 | 56%   |
report-only-changed-files is enabled. No files were changed during this commit :)

@enyst enyst marked this pull request as draft October 23, 2025 18:51
enyst and others added 4 commits October 23, 2025 18:53
…handling and empty-token skipping

- Patterns are now used exactly as provided (lowercased/stripped)
- No special handling for '*' or empty tokens

Co-authored-by: openhands <[email protected]>
…eature detection

- Validate provider-prefixed Bedrock ids and plain vendor-prefixed names
- Ensure function-calling and prompt-cache features are enabled for Claude families

Co-authored-by: openhands <[email protected]>
…edrock dotted vendor prefixes

- Function-calling: adds claude-sonnet-4-5 and claude-sonnet-4.5, and us.anthropic.* examples
- Prompt cache: keep only supported families; drop unsupported haiku-4.5 dotted vendor case

Co-authored-by: openhands <[email protected]>
… extend tests with dotted vendor forms

- Add claude-haiku-4.5 and claude-haiku-4-5 to PROMPT_CACHE_PATTERNS
- Expand tests for us.anthropic.* and local names for Haiku 4.5

Co-authored-by: openhands <[email protected]>
@enyst enyst added the integration-test Runs the integration tests and comments the results label Oct 23, 2025
@github-actions

Hi! I started running the integration tests on your PR. You will receive a comment with the results shortly.

2 similar comments

@github-actions

🧪 Integration Tests Results

Overall Success Rate: 0.0%
Total Cost: $0.00
Models Tested: 3
Timestamp: 2025-10-23 22:32:12 UTC

📁 Detailed Logs & Artifacts

Click the links below to access detailed agent/LLM logs showing the complete reasoning process for each model. On the GitHub Actions page, scroll down to the 'Artifacts' section to download the logs.

📊 Summary

Model                                              | Success Rate | Tests Passed | Total Tests | Cost
litellm_proxy_deepseek_deepseek_chat               | 0.0%         | 0/7          | 7           | $0.00
litellm_proxy_openai_gpt_5_mini                    | 0.0%         | 0/7          | 7           | $0.00
litellm_proxy_anthropic_claude_sonnet_4_5_20250929 | 0.0%         | 0/7          | 7           | $0.00

📋 Detailed Results

litellm_proxy_deepseek_deepseek_chat

  • Success Rate: 0.0% (0/7)
  • Total Cost: $0.00
  • Run Suffix: litellm_proxy_deepseek_deepseek_chat_4b0e7bd_deepseek_run_N7_20251023_223118

Failed Tests:

  • t07_interactive_commands: Test execution failed: Conversation run failed for id=3381176d-bbf5-4e09-aa99-9d37f0172810: litellm.AuthenticationError: AuthenticationError: Litellm_proxyException - Authentication Error, Invalid proxy server token passed. Received API Key = sk-...T9_Q, Key Hash (Token) =61c9fb32902f3b0764b58f832bcf8f0908410d7664c9b8d9030801eba8dbffde. Unable to find token in cache or LiteLLM_VerificationTokenTable (Cost: $0.00)
  • t01_fix_simple_typo: Test execution failed: Conversation run failed for id=daa50858-8bda-46ca-b17d-1dfb5ed26937: litellm.AuthenticationError: AuthenticationError: Litellm_proxyException - Authentication Error, Invalid proxy server token passed. Received API Key = sk-...T9_Q, Key Hash (Token) =61c9fb32902f3b0764b58f832bcf8f0908410d7664c9b8d9030801eba8dbffde. Unable to find token in cache or LiteLLM_VerificationTokenTable (Cost: $0.00)
  • t02_add_bash_hello: Test execution failed: Conversation run failed for id=caa42434-1fd8-4c0c-98b7-4a2d90387e67: litellm.AuthenticationError: AuthenticationError: Litellm_proxyException - Authentication Error, Invalid proxy server token passed. Received API Key = sk-...T9_Q, Key Hash (Token) =61c9fb32902f3b0764b58f832bcf8f0908410d7664c9b8d9030801eba8dbffde. Unable to find token in cache or LiteLLM_VerificationTokenTable (Cost: $0.00)
  • t04_git_staging: Test execution failed: Conversation run failed for id=d654e711-96c2-46dc-bac9-363f0cf5ac83: litellm.AuthenticationError: AuthenticationError: Litellm_proxyException - Authentication Error, Invalid proxy server token passed. Received API Key = sk-...T9_Q, Key Hash (Token) =61c9fb32902f3b0764b58f832bcf8f0908410d7664c9b8d9030801eba8dbffde. Unable to find token in cache or LiteLLM_VerificationTokenTable (Cost: $0.00)
  • t03_jupyter_write_file: Test execution failed: Conversation run failed for id=8b07fea4-494d-4dc1-bb65-dba27a4ee011: litellm.AuthenticationError: AuthenticationError: Litellm_proxyException - Authentication Error, Invalid proxy server token passed. Received API Key = sk-...T9_Q, Key Hash (Token) =61c9fb32902f3b0764b58f832bcf8f0908410d7664c9b8d9030801eba8dbffde. Unable to find token in cache or LiteLLM_VerificationTokenTable (Cost: $0.00)
  • t06_github_pr_browsing: Test execution failed: Conversation run failed for id=ac31aab6-3f85-4776-8701-6957bf2a1afc: litellm.AuthenticationError: AuthenticationError: Litellm_proxyException - Authentication Error, Invalid proxy server token passed. Received API Key = sk-...T9_Q, Key Hash (Token) =61c9fb32902f3b0764b58f832bcf8f0908410d7664c9b8d9030801eba8dbffde. Unable to find token in cache or LiteLLM_VerificationTokenTable (Cost: $0.00)
  • t05_simple_browsing: Test execution failed: Conversation run failed for id=691a3e8d-1605-4167-9798-3109f516c82a: litellm.AuthenticationError: AuthenticationError: Litellm_proxyException - Authentication Error, Invalid proxy server token passed. Received API Key = sk-...T9_Q, Key Hash (Token) =61c9fb32902f3b0764b58f832bcf8f0908410d7664c9b8d9030801eba8dbffde. Unable to find token in cache or LiteLLM_VerificationTokenTable (Cost: $0.00)

litellm_proxy_openai_gpt_5_mini

  • Success Rate: 0.0% (0/7)
  • Total Cost: $0.00
  • Run Suffix: litellm_proxy_openai_gpt_5_mini_4b0e7bd_gpt5_mini_run_N7_20251023_223120

Failed Tests:

  • t06_github_pr_browsing: Test execution failed: Conversation run failed for id=7ea1018f-3f7d-421f-a3d4-704ce271e6f8: litellm.AuthenticationError: AuthenticationError: Litellm_proxyException - {"error":{"message":"Authentication Error, Invalid proxy server token passed. Received API Key = sk-...T9_Q, Key Hash (Token) =61c9fb32902f3b0764b58f832bcf8f0908410d7664c9b8d9030801eba8dbffde. Unable to find token in cache or LiteLLM_VerificationTokenTable","type":"token_not_found_in_db","param":"key","code":"401"}} (Cost: $0.00)
  • t07_interactive_commands: Test execution failed: Conversation run failed for id=1e745fe5-2578-4960-8be8-9f134b8e40fe: litellm.AuthenticationError: AuthenticationError: Litellm_proxyException - {"error":{"message":"Authentication Error, Invalid proxy server token passed. Received API Key = sk-...T9_Q, Key Hash (Token) =61c9fb32902f3b0764b58f832bcf8f0908410d7664c9b8d9030801eba8dbffde. Unable to find token in cache or LiteLLM_VerificationTokenTable","type":"token_not_found_in_db","param":"key","code":"401"}} (Cost: $0.00)
  • t04_git_staging: Test execution failed: Conversation run failed for id=b29a12b7-02a4-47ca-bcd0-1276d9e7a115: litellm.AuthenticationError: AuthenticationError: Litellm_proxyException - {"error":{"message":"Authentication Error, Invalid proxy server token passed. Received API Key = sk-...T9_Q, Key Hash (Token) =61c9fb32902f3b0764b58f832bcf8f0908410d7664c9b8d9030801eba8dbffde. Unable to find token in cache or LiteLLM_VerificationTokenTable","type":"token_not_found_in_db","param":"key","code":"401"}} (Cost: $0.00)
  • t01_fix_simple_typo: Test execution failed: Conversation run failed for id=fe526d6f-12e4-4d0b-86cd-ee6190e0f417: litellm.AuthenticationError: AuthenticationError: Litellm_proxyException - {"error":{"message":"Authentication Error, Invalid proxy server token passed. Received API Key = sk-...T9_Q, Key Hash (Token) =61c9fb32902f3b0764b58f832bcf8f0908410d7664c9b8d9030801eba8dbffde. Unable to find token in cache or LiteLLM_VerificationTokenTable","type":"token_not_found_in_db","param":"key","code":"401"}} (Cost: $0.00)
  • t02_add_bash_hello: Test execution failed: Conversation run failed for id=6339e0ed-571b-43cd-83b8-741ee3a51af9: litellm.AuthenticationError: AuthenticationError: Litellm_proxyException - {"error":{"message":"Authentication Error, Invalid proxy server token passed. Received API Key = sk-...T9_Q, Key Hash (Token) =61c9fb32902f3b0764b58f832bcf8f0908410d7664c9b8d9030801eba8dbffde. Unable to find token in cache or LiteLLM_VerificationTokenTable","type":"token_not_found_in_db","param":"key","code":"401"}} (Cost: $0.00)
  • t03_jupyter_write_file: Test execution failed: Conversation run failed for id=df8efe23-aad7-4a43-be6f-0e1f0ae2ae23: litellm.AuthenticationError: AuthenticationError: Litellm_proxyException - {"error":{"message":"Authentication Error, Invalid proxy server token passed. Received API Key = sk-...T9_Q, Key Hash (Token) =61c9fb32902f3b0764b58f832bcf8f0908410d7664c9b8d9030801eba8dbffde. Unable to find token in cache or LiteLLM_VerificationTokenTable","type":"token_not_found_in_db","param":"key","code":"401"}} (Cost: $0.00)
  • t05_simple_browsing: Test execution failed: Conversation run failed for id=c7ebc958-d96b-411e-a31f-a1d7a8b295c2: litellm.AuthenticationError: AuthenticationError: Litellm_proxyException - {"error":{"message":"Authentication Error, Invalid proxy server token passed. Received API Key = sk-...T9_Q, Key Hash (Token) =61c9fb32902f3b0764b58f832bcf8f0908410d7664c9b8d9030801eba8dbffde. Unable to find token in cache or LiteLLM_VerificationTokenTable","type":"token_not_found_in_db","param":"key","code":"401"}} (Cost: $0.00)

litellm_proxy_anthropic_claude_sonnet_4_5_20250929

  • Success Rate: 0.0% (0/7)
  • Total Cost: $0.00
  • Run Suffix: litellm_proxy_anthropic_claude_sonnet_4_5_20250929_4b0e7bd_sonnet_run_N7_20251023_223118

Failed Tests:

  • t01_fix_simple_typo: Test execution failed: Conversation run failed for id=442e74c5-486d-4649-b83a-44faf8ecb301: litellm.AuthenticationError: AuthenticationError: Litellm_proxyException - Authentication Error, Invalid proxy server token passed. Received API Key = sk-...T9_Q, Key Hash (Token) =61c9fb32902f3b0764b58f832bcf8f0908410d7664c9b8d9030801eba8dbffde. Unable to find token in cache or LiteLLM_VerificationTokenTable (Cost: $0.00)
  • t02_add_bash_hello: Test execution failed: Conversation run failed for id=8da7a4d4-b9f2-4e00-b64f-c1cc0303e3ae: litellm.AuthenticationError: AuthenticationError: Litellm_proxyException - Authentication Error, Invalid proxy server token passed. Received API Key = sk-...T9_Q, Key Hash (Token) =61c9fb32902f3b0764b58f832bcf8f0908410d7664c9b8d9030801eba8dbffde. Unable to find token in cache or LiteLLM_VerificationTokenTable (Cost: $0.00)
  • t06_github_pr_browsing: Test execution failed: Conversation run failed for id=780227fd-fdf6-488b-8bdd-e82b8e9a7e5f: litellm.AuthenticationError: AuthenticationError: Litellm_proxyException - Authentication Error, Invalid proxy server token passed. Received API Key = sk-...T9_Q, Key Hash (Token) =61c9fb32902f3b0764b58f832bcf8f0908410d7664c9b8d9030801eba8dbffde. Unable to find token in cache or LiteLLM_VerificationTokenTable (Cost: $0.00)
  • t04_git_staging: Test execution failed: Conversation run failed for id=b1d32414-3058-46cd-b012-7f5010c03ac4: litellm.AuthenticationError: AuthenticationError: Litellm_proxyException - Authentication Error, Invalid proxy server token passed. Received API Key = sk-...T9_Q, Key Hash (Token) =61c9fb32902f3b0764b58f832bcf8f0908410d7664c9b8d9030801eba8dbffde. Unable to find token in cache or LiteLLM_VerificationTokenTable (Cost: $0.00)
  • t07_interactive_commands: Test execution failed: Conversation run failed for id=22cf4b92-a21f-4e8d-8921-02785e74cc71: litellm.AuthenticationError: AuthenticationError: Litellm_proxyException - Authentication Error, Invalid proxy server token passed. Received API Key = sk-...T9_Q, Key Hash (Token) =61c9fb32902f3b0764b58f832bcf8f0908410d7664c9b8d9030801eba8dbffde. Unable to find token in cache or LiteLLM_VerificationTokenTable (Cost: $0.00)
  • t03_jupyter_write_file: Test execution failed: Conversation run failed for id=ede7d737-1649-4025-a454-6eba705dc062: litellm.AuthenticationError: AuthenticationError: Litellm_proxyException - Authentication Error, Invalid proxy server token passed. Received API Key = sk-...T9_Q, Key Hash (Token) =61c9fb32902f3b0764b58f832bcf8f0908410d7664c9b8d9030801eba8dbffde. Unable to find token in cache or LiteLLM_VerificationTokenTable (Cost: $0.00)
  • t05_simple_browsing: Test execution failed: Conversation run failed for id=5d32c77b-fe3d-43c7-aede-ff4de873af4e: litellm.AuthenticationError: AuthenticationError: Litellm_proxyException - Authentication Error, Invalid proxy server token passed. Received API Key = sk-...T9_Q, Key Hash (Token) =61c9fb32902f3b0764b58f832bcf8f0908410d7664c9b8d9030801eba8dbffde. Unable to find token in cache or LiteLLM_VerificationTokenTable (Cost: $0.00)

