Add token counts, timestamps, and model to rollouts #1583

bl-ue · 2025-07-15T15:37:34Z

This PR adds per-message token count info and timestamps and per-chat selected model info to the rollout JSONL files. Note that this change is only in the Rust version; I didn't add this information to the TypeScript version's JSON rollout files, but I can if desired.

Codex support in Splitrail is now implemented and waiting for this PR.

Related: #1572

github-actions · 2025-07-15T15:37:43Z

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

bl-ue · 2025-07-15T15:59:49Z

I have read the CLA Document and I hereby sign the CLA

bl-ue · 2025-07-17T20:35:15Z

Ready for review.

bl-ue · 2025-07-21T16:09:50Z

Hi @bolinfest! This PR adds token, timestamp, and model information to rollout files. This enables our new tool, Splitrail, to track token usage, cost, and throughput for Codex users. It also makes it easier for other tools to do the same. Do you mind taking a look? Thank you!

bl-ue · 2025-07-26T15:56:13Z

cc @aibrahim-oai

aibrahim-oai · 2025-07-26T19:39:34Z

Tools can already calculate tokens used from the rollout info. I don't think there is much benefit in adding them in response items. What is the use?

bl-ue · 2025-07-26T20:42:39Z

Hi @aibrahim-oai! Thank you for reviewing so quickly. Yes, that makes sense, but automatic input caching makes accurate cost calculation impossible. When input exceeds 1024 tokens, it is automatically cached (see here), and the rollout gives no way to determine which part of the input was cached and which wasn't; therefore, a tool that re-tokenizes the rollout can't calculate cost accurately.

In addition, rollouts don't currently store model information, so it's not possible to determine which tokenizer to use, nor can we determine model/token cost for accurate cost calculation. This PR stores model information to address this.

Last but not least, Codex can be used with custom providers, and it would be difficult to perform tokenization on rollouts that use custom models/providers; with open-source models in Ollama, you'd have to download and use a tokenizer, and with providers that don't distribute their tokenizers, you'd have to use an API.
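To illustrate the caching point above, here is a minimal sketch of a cost calculation that needs the cached-token count to be recorded. The struct and field names (`Usage`, `Pricing`, `cached_input_tokens`) and the prices are hypothetical, chosen only for this example; they are not the schema this PR writes nor any provider's real rates.

```rust
/// Token usage for one response. Field names are hypothetical.
#[derive(Debug, Clone, Copy)]
pub struct Usage {
    /// Total input tokens, including any that were served from cache.
    pub input_tokens: u64,
    /// Subset of `input_tokens` billed at the cached rate.
    pub cached_input_tokens: u64,
    pub output_tokens: u64,
}

/// Prices in USD per million tokens. Placeholder values only.
pub struct Pricing {
    pub input: f64,
    pub cached_input: f64,
    pub output: f64,
}

/// Cost in USD, billing cached and uncached input at different rates.
pub fn cost_usd(u: Usage, p: &Pricing) -> f64 {
    // Guard against inconsistent records where cached > total.
    let uncached = u.input_tokens.saturating_sub(u.cached_input_tokens);
    (uncached as f64 * p.input
        + u.cached_input_tokens as f64 * p.cached_input
        + u.output_tokens as f64 * p.output)
        / 1_000_000.0
}
```

With placeholder prices of 2.0 (input), 0.5 (cached input), and 8.0 (output) per million tokens, a response with 10,000 input tokens (8,000 of them cached) and 1,000 output tokens costs 0.016 USD, whereas assuming nothing was cached gives 0.028 USD. A tool that only re-tokenizes the rollout text cannot tell these two cases apart.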

Add token counts, timestamps, and model to rollouts

9bf17f4

bl-ue force-pushed the enhance-recorded-chat-history-071525 branch from 30edd03 to 9bf17f4 July 15, 2025 15:41

github-actions bot added a commit that referenced this pull request Jul 15, 2025

@bl-ue has signed the CLA in #1583

f869517

bl-ue marked this pull request as draft July 16, 2025 01:21

bl-ue added 2 commits July 17, 2025 07:54

Merge branch 'main' of https://github.com/openai/codex into enhance-recorded-chat-history-071525

52a7795

Updates

8137167

bl-ue marked this pull request as ready for review July 17, 2025 20:30

bl-ue added 4 commits July 18, 2025 14:44

Merge branch 'main' into enhance-recorded-chat-history-071525

58cbc24

Merge branch 'main' into enhance-recorded-chat-history-071525

1c38a80

Merge branch 'main' into enhance-recorded-chat-history-071525

0eda033

Merge branch 'main' into enhance-recorded-chat-history-071525

9d03571

bl-ue marked this pull request as draft July 21, 2025 15:17

Use tokio::fs::File's sync_all() method instead of flush() for rollout files

8d87494

This ensures that the metadata is updated as well as the contents, so that file watchers can get immediate updates.
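The difference between the two calls can be sketched as follows. This is an illustration only: it uses `std::fs::File` rather than `tokio::fs::File` so that it runs without extra crates, and the helper name `append_line_synced` is hypothetical, not a function from this PR.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Appends one JSONL line, then asks the OS to persist contents and metadata.
/// Hypothetical helper for illustration.
pub fn append_line_synced(path: &Path, line: &str) -> io::Result<()> {
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    file.write_all(line.as_bytes())?;
    file.write_all(b"\n")?;
    // flush() only drains userspace buffers; std::fs::File has none,
    // so for this type it is effectively a no-op.
    file.flush()?;
    // sync_all() asks the OS to write file contents *and* metadata
    // (such as size and modification time) to the filesystem.
    file.sync_all()?;
    Ok(())
}
```

Calling `sync_all()` after every line trades some write throughput for prompt visibility of both data and metadata, which is the behavior the commit message is after.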

bl-ue marked this pull request as ready for review July 21, 2025 16:09

bl-ue added 2 commits July 23, 2025 09:38

Merge branch 'main' into enhance-recorded-chat-history-071525

479c663

Merge remote-tracking branch 'origin/main' into enhance-recorded-chat-history-071525

cb93638

bl-ue added 4 commits July 26, 2025 14:43

Merge branch 'main' into enhance-recorded-chat-history-071525

a0d9e5c

Merge branch 'main' into enhance-recorded-chat-history-071525

a333c3c

Merge branch 'main' into enhance-recorded-chat-history-071525

37dae82

Make the initial timestamp in rollout files use UTC instead of the local timezone to match the other timestamps and be more standard

036c8f9
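As a rough illustration of what a UTC timestamp looks like, the sketch below formats Unix seconds as an RFC 3339 string with a `Z` suffix using only the standard library. The function name `rfc3339_utc` is hypothetical, and the PR itself presumably relies on a date/time crate rather than hand-rolled arithmetic; the date conversion here is the well-known days-to-civil algorithm.

```rust
/// Formats seconds since the Unix epoch as an RFC 3339 UTC timestamp.
/// Hypothetical helper; not the code used in the PR.
pub fn rfc3339_utc(unix_secs: i64) -> String {
    let days = unix_secs.div_euclid(86_400);
    let secs_of_day = unix_secs.rem_euclid(86_400);

    // Convert days since 1970-01-01 to a proleptic Gregorian date.
    let z = days + 719_468; // shift epoch to 0000-03-01
    let era = z.div_euclid(146_097); // 400-year cycles
    let doe = z.rem_euclid(146_097); // day of era, 0..=146096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // 0..=399
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of March-based year
    let mp = (5 * doy + 2) / 153; // March-based month, 0..=11
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };

    format!(
        "{:04}-{:02}-{:02}T{:02}:{:02}:{:02}Z",
        year,
        month,
        day,
        secs_of_day / 3_600,
        (secs_of_day % 3_600) / 60,
        secs_of_day % 60
    )
}
```

For example, `rfc3339_utc(1_752_593_854)` yields `2025-07-15T15:37:34Z`, the time this PR was opened. Because the string carries no local offset, timestamps written on different machines compare consistently.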
