fix: model routing bugs, per-model cost tracking, and quickstart doc#2
Closed
HelloAnner wants to merge 4 commits into
Closed
Conversation
- Fix jq syntax error in post-tool.sh that silently broke metrics reporting - Split cost tracking: complex and simple models now use independent pricing variables (__INPUT_COST_PER_M_COMPLEX/SIMPLE__) in openclaw.json - Add per-model cost variable substitution in restart.sh - Fix budget exhaustion check (>= to >) to avoid premature shutdown - Fix metrics overview cost estimation to respect per-model pricing - Make GITHUB_USERNAME configurable in dashboard-reporter and github sync - Add docs/quickstart.md with setup, model switching, budget, and dashboard guide Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds a MODEL_TOKEN_BUDGETS config (env var + dashboard settings) that caps cumulative input+output token usage per model, matched by bare model name so the same model served by multiple providers (e.g. z-ai/glm-4.6 and openrouter/glm-4.6) shares a single counter. When a model exceeds its cap, the health-check endpoint emits a MODEL TOKEN BUDGET EXHAUSTED directive picked up by the heartbeat loop, and a non-dismissible red banner appears at the top of every dashboard page. - Add bareModelName() helper for cross-provider model matching - Extend /api/agent/health-check with per-model aggregation, directive injection, and a new modelBudgets response field (exhausted/usage/caps) - Add ModelBudgetBanner client component, mounted globally in app/layout.tsx - Accept and normalize modelTokenBudgets in /api/settings PUT - Wire NEXT_PUBLIC_LLM_MODEL_COMPLEX into the header model label - Document MODEL_TOKEN_BUDGETS semantics in docs/model-routing.md and .env.example, emphasizing bare-name matching across providers Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The dashboard's connection state only reflects whether the `dashboard-reporter` hook has fired heartbeats, and that hook only fires on `agent_end`. In continuous mode the main session never ends (stopReason=toolUse loop), so a running agent whose first LLM call is failing would appear `disconnected` even though the container is perfectly alive — and an agent hitting repeated upstream 401/quota errors would be indistinguishable from one whose container has died. Add a separate LLM health signal derived directly from the openclaw session jsonl, and surface it as its own top-of-page banner so the two failure modes are visually distinct. - Add /api/agent/llm-health — tails latest session jsonl, walks back from the most recent assistant message to classify the LLM as ok/errored/unknown and report the most recent error message and timestamps - Embed the LLM health block inside /api/connection-status so existing consumers get both dimensions in a single request - Add LlmErrorBanner client component, mounted alongside ModelBudgetBanner in the root layout. Renders only when llm.state === "errored", showing the upstream error text and "last ok/fail" relative timestamps Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Protocol-zero-0
added a commit
that referenced
this pull request
Apr 17, 2026
Integrates HelloAnner's fix/model-routing-and-cost-tracking branch with
alpha/v0.1.0's extra features preserved:
- Generic LLM_PROVIDER / LLM_BASE_URL / LLM_API_KEY / LLM_MODEL_* env scheme
supersedes hardcoded minimax/MiniMax-M2.7 defaults. CLAWOSS_PRIMARY_MODEL
etc. kept as optional legacy overrides.
- Per-tier pricing (INPUT_COST_PER_M_COMPLEX / _SIMPLE, with flat fallback)
flows through post-tool.sh, handler.ts, and dashboard-sync.sh.
- heartbeat.every=5m and LLM_* placeholders land in config/openclaw.json;
acp.defaultAgent=codex, loopDetection, logging.level from alpha preserved.
- restart.sh now also exports LLM_* and MODEL_TOKEN_BUDGETS into the
deployed config env block, alongside alpha's CLAWOSS_ROOT / RECORD_* vars.
- Dashboard: llm-health probe, model-budget banner, cost-models lookup,
llm-error banner all merged; env-driven model display in header/gateway.
- .env.example rewritten with LLM_* primary + legacy CLAWOSS_* section and
full Provider Quick Reference block (Gemini, Mistral, DeepSeek, MiniMax,
Moonshot, GLM).
Not a clean replay: 10 conflict files resolved by hand. See the PR body for
the dimension-by-dimension integration notes.
7 tasks
Protocol-zero-0
added a commit
that referenced
this pull request
May 17, 2026
Previously, budget was marked exhausted when usage exactly equaled the budget cap, which prematurely paused the agent before it could spend the full budget. Changed to `>` so the agent can use the entire configured budget before pausing. Credit: HelloAnner first identified this in PR #2 (closed; superseded by the telemetry-driven budget rewrite in PR #9). Carrying the fix forward. Refs #10
Protocol-zero-0
added a commit
that referenced
this pull request
May 17, 2026
Adds a user-facing quickstart covering setup, model switching, budget control, and dashboard usage. PR #9's README does not cover these flows, so we are landing this doc on v6-release independently to keep new contributors onboarded. Original author: HelloAnner. Original PR: #2 (closed; bug-fix portion superseded by PR #9's telemetry-driven runtime rewrite, but docs were not absorbed). Co-Authored-By: HelloAnner <helloanner@gmail.com> Refs #2 #10
Contributor
|
@HelloAnner 感谢非常扎实的工作。我们仔细对照了你 7 个修复点和后续 PR #9 的重构,做一份吸收/取代说明: 已带入 v6-release(致谢继承)
被 PR #9 用更彻底的方案取代(不再单独吸收)
端到端验证你在 huggingface/transformers#45371 跑通的端到端证据,是当前 ClawOSS 仓库里少数有真实上游 PR 落地的验证记录。我们会把这条记录引用进 v1.0 的 release notes。 v1.0 封版 + 后续v6-release 即将打 v1.0 tag,作为"自建 gateway + dashboard + placeholder 渲染"架构的最终封版。我们正在并行做一个 Phase 2 — 把 ClawOSS 重写成 Claude Code 的 Skill + 模板仓库形态,目标是把现在 ~3700 行的基础设施缩到 ~500 行,同时直接绕过你诊断的 7 个 bug 里至少 5 个(它们的本质是"自建基础设施"的副产品,在 Skill 形态下根本不会存在)。 如果你对这个方向感兴趣,欢迎参与 Phase 2 设计 — 你对 7 个 bug 的诊断质量,在那个语境下会变成"为什么不这么做"的最强论据。 基于上述,本 PR 关闭。再次感谢你的工作。 |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
post-tool.shthat silently broke metrics reporting to dashboardopenclaw.jsonrestart.sh>=→>) to avoid premature agent shutdownGITHUB_USERNAMEconfigurable in dashboard-reporter hook and GitHub sync (was hardcoded toBillionClaw)docs/quickstart.mdcovering setup, model switching, budget control, and dashboard usageTest plan
restart.shcorrectly substitutes__INPUT_COST_PER_M_COMPLEX__placeholders>threshold (not>=)🤖 Generated with Claude Code