Token accounting: dedupe fallback by message.id + fresh-vs-cached headline by wan-huiyan · Pull Request #11 · dioptx/cctime

wan-huiyan · 2026-05-29T12:17:48Z

Two related token-reporting changes:

- Dedup fallback to message.id: `deduplicateAssistant` groups streaming chunks by requestId, but transcripts that omit requestId (older Claude Code / partial logs) bypassed it and re-introduced the ~2-3× token inflation. Chunks still share message.id, so it now keys on `requestId ?? message.id`.
- Decompose the "in" headline into fresh vs cached: on a long multi-turn session the "NN in" total is ~97% cheap cache_read; a new `Input X new · Y cached` line separates freshly-billed input from cache reads so the cost is interpretable. Headline total unchanged.

91 tests pass, tsc clean. Supersedes #9 (consolidated with the dedup fix here).

🤖 Generated with Claude Code

…absent (#5)

`deduplicateAssistant` collapses streaming assistant chunks (which share a requestId and each report the same usage) so tokens aren't counted ~2-3x. But assistant rows that OMIT requestId (older Claude Code versions or partial transcripts) hit the `!msg.requestId` guard and passed straight through un-grouped, re-introducing the exact inflation this function exists to prevent.

Those chunks still share `message.id`, so group by `requestId ?? message.id`. Rows with neither key still pass through unchanged; behaviour for requestId-bearing transcripts is identical.

- types.ts: add optional `message.id`
- parser.ts: key dedup on `requestId ?? message.id`; extract a `flush()` helper
- parser.test.ts: +2 cases (merge no-requestId chunks sharing message.id; do NOT merge no-requestId chunks with different message.ids)
- CHANGELOG: Unreleased entry

Defensive: current Claude Code transcripts carry requestId on every assistant row (verified against a real 150-response session; dedup was already correct there), so this closes a latent edge rather than a live miscount.

Full suite 89 passing; tsc clean.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
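For readers without the diff open, the keying change described above can be sketched roughly as follows. This is an illustration only, not the code in parser.ts: the `AssistantRow` shape, the `dedupKey` and `deduplicateAssistantSketch` names, and the choices to group consecutive rows and keep the last chunk of each group are all assumptions.

```typescript
// Hypothetical row shape; the real definitions in types.ts may differ.
interface AssistantRow {
  type: "assistant";
  requestId?: string;
  message: { id?: string; usage: { output_tokens: number } };
}

// Prefer requestId, fall back to message.id; undefined when the row has neither.
function dedupKey(row: AssistantRow): string | undefined {
  return row.requestId ?? row.message.id;
}

// Assumption: chunks of one response are adjacent, and the last chunk of a
// group is the one kept. The real function may merge differently.
function deduplicateAssistantSketch(rows: AssistantRow[]): AssistantRow[] {
  const out: AssistantRow[] = [];
  let pendingKey: string | undefined = undefined;
  let pending: AssistantRow | undefined = undefined;

  // Emit the group collected so far (if any) and reset the state.
  const flush = (): void => {
    if (pending) out.push(pending);
    pending = undefined;
    pendingKey = undefined;
  };

  for (const row of rows) {
    const key = dedupKey(row);
    if (key === undefined) {
      // Neither requestId nor message.id: pass through unchanged.
      flush();
      out.push(row);
      continue;
    }
    if (key !== pendingKey) flush(); // a new group starts here
    pendingKey = key;
    pending = row; // a later chunk replaces an earlier one in the same group
  }
  flush();
  return out;
}
```

Under these assumptions, two rows without requestId that share a `message.id` collapse to one row, while rows with different `message.id` values stay separate, which mirrors the two new cases listed for parser.test.ts.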

The "NN in" headline sums input + cache_read + cache_creation, which on a long multi-turn session is ~97% cheap cache_read, so a user sees an alarming "37M in / $80" without realizing almost none of it is freshly-billed input. Add an `Input X new · Y cached` line that splits freshly-billed input (input at $15/M plus cache_creation at $18.75/M) from cache reads ($1.50/M), so the cost line is interpretable. Applies to the single-session and live views; +2 tests. 89 pass.
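The split described above might be computed along these lines. This is a minimal sketch, not the PR's implementation: the `InputUsage` shape and the `compact`, `splitInput` and `formatInputLine` helpers are hypothetical, the per-million rates are the ones quoted in the commit message, and the token counts in the usage note are invented for illustration.

```typescript
// Hypothetical usage shape, named after the three counters the headline sums.
interface InputUsage {
  input: number;
  cacheCreation: number;
  cacheRead: number;
}

// USD per million tokens, as quoted in the commit message.
const RATE_PER_MILLION = { input: 15, cacheCreation: 18.75, cacheRead: 1.5 } as const;

// Hypothetical compact formatter: 36000000 becomes "36.0M".
function compact(n: number): string {
  if (n >= 1e6) return (n / 1e6).toFixed(1) + "M";
  if (n >= 1e3) return (n / 1e3).toFixed(1) + "K";
  return String(n);
}

function splitInput(u: InputUsage) {
  const fresh = u.input + u.cacheCreation; // freshly-billed input
  const cached = u.cacheRead;              // cheap cache reads
  const total = fresh + cached;            // the unchanged "NN in" headline
  const freshCost =
    (u.input * RATE_PER_MILLION.input +
      u.cacheCreation * RATE_PER_MILLION.cacheCreation) / 1e6;
  const cachedCost = (u.cacheRead * RATE_PER_MILLION.cacheRead) / 1e6;
  return { fresh, cached, total, freshCost, cachedCost };
}

function formatInputLine(u: InputUsage): string {
  const { fresh, cached } = splitInput(u);
  return `Input ${compact(fresh)} new · ${compact(cached)} cached`;
}
```

With made-up counts of 200,000 input, 800,000 cache_creation and 36,000,000 cache_read tokens, the headline total stays at 37M while the new line reads `Input 1.0M new · 36.0M cached`; the input-side cost under the quoted rates is $18 fresh plus $54 cached.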

wan-huiyan and others added 2 commits May 29, 2026 13:17

This was referenced May 29, 2026

feat: decompose the token "in" headline into fresh vs cached #9

Closed

Accurate time breakdown: cap suspensions + union parallel spans (+ parallelism insight) #10

Open

wan-huiyan wants to merge 2 commits into dioptx:main from wan-huiyan:upstream/token-accounting
