ping page: per-row Ping buttons + show every model#669

Merged
neoneye merged 2 commits into main from feat/ping-page-rework-and-llm-config on May 3, 2026

Conversation


@neoneye neoneye commented May 3, 2026

Summary

Reworks /ping for troubleshooting individual models, plus two related llm_config/baseline.json updates.

Page behaviour

Before: the page opened an SSE stream that pinged every prioritized model in sequence and appended rows as results came back. Models without a priority field were silently skipped, and there was no way to retry a single model.

After:

  • The full table renders on page load with one row per configured model across every llm_config profile.
  • A new Priority column shows each model's priority value (blank when unset). Prioritized models are listed first (asc), unprioritized after in declaration order.
  • Each row has its own Ping button.
  • A top Ping All button iterates rows sequentially.

Backend

  • New GET /llm-list returns every (profile, llm_name, priority) tuple as JSON.
  • New GET /llm-ping-one?profile=X&llm_name=Y pings one model and returns the result as JSON.
  • Removed GET /llm-ping (SSE) and the matching /ping/stream proxy in frontend_multi_user.
  • New helper get_all_llm_names_with_priority() in llm_factory.py.
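For illustration, the shape of the JSON that `GET /llm-list` returns can be sketched like this — the `llm_config` structure below is hypothetical; only the (profile, llm_name, priority) tuple shape comes from this PR:

```python
# Sketch of the /llm-list response: one row per configured model
# across every profile, with priority left as null when unset.
# The config dict here is an assumption for illustration only.
import json

llm_config = {
    "baseline": [
        {"llm_name": "deepseek-v4-flash", "priority": 2},
        {"llm_name": "openrouter-ling-2.6-flash", "priority": 1},
        {"llm_name": "some-unprioritized-model"},
    ],
}

def llm_list(config):
    """Flatten the config into (profile, llm_name, priority) rows."""
    rows = []
    for profile, models in config.items():
        for m in models:
            rows.append({
                "profile": profile,
                "llm_name": m["llm_name"],
                "priority": m.get("priority"),  # None when unset
            })
    return rows

print(json.dumps(llm_list(llm_config), indent=2))
```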

Bundled llm_config/baseline.json changes

Both caught while exercising the new ping page:

  • Add deepseek-v4-flash (DeepSeek native API at api.deepseek.com/v1, thinking mode explicitly disabled via additional_kwargs.extra_body, $0.14/M input / $0.28/M output).
  • Replace openrouter-elephant-alpha with openrouter-ling-2.6-flash — Elephant Alpha was a stealth alias and now returns 404; production name is inclusionai/ling-2.6-flash (262K context, $0.08/M in / $0.24/M out).
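For reference, the new `deepseek-v4-flash` entry could look roughly like this in `baseline.json`. The field names below are assumptions (the PR only names `additional_kwargs.extra_body`); the real schema may differ:

```json
{
  "deepseek-v4-flash": {
    "base_url": "https://api.deepseek.com/v1",
    "model": "deepseek-v4-flash",
    "additional_kwargs": {
      "extra_body": { "thinking": { "type": "disabled" } }
    },
    "cost_input_usd_per_million": 0.14,
    "cost_output_usd_per_million": 0.28
  }
}
```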

Test plan

  • Open /ping, confirm every model appears (incl. ones without priority).
  • Click per-row Ping on a working model — row updates with success + response time.
  • Click Ping All — rows update sequentially.
  • Verify deepseek-v4-flash pings green with a DEEPSEEK_API_KEY set.
  • Verify openrouter-ling-2.6-flash pings green (no longer 404).

🤖 Generated with Claude Code

neoneye added 2 commits May 3, 2026 21:18
Reworks /ping for troubleshooting individual models.

Old behaviour: page opened an SSE stream that pinged every
*prioritized* model in sequence and appended rows as results
came back. Models without a `priority` field were silently
skipped, and there was no way to retry a single model.

New behaviour:
- The full table renders on page load with one row per
  configured model across every llm_config profile, prioritized
  models first.
- A new "Priority" column shows each model's priority value
  (blank when unset).
- Each row has its own "Ping" button.
- A top "Ping All" button iterates rows sequentially.

Backend:
- New `/llm-list` returns every (profile, llm_name, priority)
  tuple in JSON. Replaces the prior priority-only enumeration
  inside the SSE handler.
- New `/llm-ping-one?profile=X&llm_name=Y` pings one model and
  returns the result as JSON.
- Removed `/llm-ping` (SSE) and the matching `/ping/stream`
  proxy in frontend_multi_user.
- New helper `get_all_llm_names_with_priority()` in
  llm_factory.py returns every model in the profile, prioritized
  first (sorted asc), unprioritized after in declaration order.
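The ordering described above can be sketched as follows — names and the profile shape are illustrative, not the actual llm_factory.py code:

```python
# Prioritized models first (ascending priority), then unprioritized
# models in their original declaration order.
def get_all_llm_names_with_priority(profile):
    """profile: list of (llm_name, priority-or-None) in declaration order."""
    prioritized = [(name, prio) for name, prio in profile if prio is not None]
    unprioritized = [(name, prio) for name, prio in profile if prio is None]
    # sorted() / list.sort() are stable, so models sharing a priority
    # also keep their declaration order.
    prioritized.sort(key=lambda item: item[1])
    return prioritized + unprioritized

models = [("gpt-x", None), ("deepseek-v4-flash", 2), ("ling-2.6-flash", 1)]
print(get_all_llm_names_with_priority(models))
# [('ling-2.6-flash', 1), ('deepseek-v4-flash', 2), ('gpt-x', None)]
```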

Bundled config changes (caught while exercising the new page):
- Add `deepseek-v4-flash` (DeepSeek native API,
  api.deepseek.com/v1, thinking mode disabled via
  `additional_kwargs.extra_body`, $0.14/M in / $0.28/M out).
- Replace `openrouter-elephant-alpha` with
  `openrouter-ling-2.6-flash` — Elephant Alpha was a stealth
  alias and now 404s; production name is
  `inclusionai/ling-2.6-flash` (262K context, $0.08/M in /
  $0.24/M out).

Add:
- openrouter-granite-4.1-8b (ibm-granite/granite-4.1-8b,
  131K context, $0.05/$0.10).
- openrouter-laguna-xs.2-free (poolside/laguna-xs.2:free,
  128K context, free).
- openrouter-nemotron-3-nano-omni-30b-reasoning-free
  (nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free, 256K
  context, free, reasoning model).

Rename:
- deepseek-v4-flash -> deepseek-v4-flash-thinking-disabled.
  The "model" argument is unchanged (deepseek-v4-flash is the
  DeepSeek API id); only the config key gets the suffix to make
  the no-thinking behaviour explicit alongside any future
  thinking-enabled variant.
@neoneye neoneye merged commit e18b2e1 into main May 3, 2026
3 checks passed
@neoneye neoneye deleted the feat/ping-page-rework-and-llm-config branch May 3, 2026 20:05