tokenfit

Fit text into LLM token budgets — estimate, trim, and pack prompts with zero dependencies.

Every app that talks to an LLM eventually fights the same battle: the context window. You have retrieved documents, chat history, logs, and system rules — and they don't all fit. tokenfit is a tiny, dependency-free toolkit that helps you measure how much you have and keep only what fits, without pulling in a megabyte-sized tokenizer.

import { pack, trim, estimateTokens } from "tokenfit";

estimateTokens("How many tokens is this?"); // → 7

// Keep the highest-priority items that fit in 4000 tokens
const { text, dropped } = pack(
  [
    { text: systemRules,   priority: 10 },
    { text: retrievedDocs, priority: 5  },
    { text: chatHistory,   priority: 1  },
  ],
  4000,
);

Why tokenfit?

  • Zero dependencies. No native bindings, no 2 MB vocab files. Drops into edge functions, Cloudflare Workers, browsers, and serverless without a cold-start penalty.
  • Conservative by design. The built-in estimator leans slightly high relative to real BPE tokenizers, so you under-fill rather than overflow the window.
  • Bring your own tokenizer. Need exact counts? Pass tiktoken, Anthropic's tokenizer, or any (text) => number to every API.
  • Three things, done well. estimateTokens, trim, and pack — fully typed, tested, and documented.
  • ESM + CJS + types, with a handy CLI.

Install

npm install tokenfit
# or: pnpm add tokenfit  /  yarn add tokenfit  /  bun add tokenfit

API

estimateTokens(text): number

Fast, dependency-free token estimate. Computes two signals, chars / 4 and words / 0.75, and takes the larger of the two, so it stays conservative for both prose and code.

estimateTokens("");                  // 0
estimateTokens("hello world");       // 3

Estimates are typically within ~10–15% of tiktoken for English and common code. When you need exact counts, supply countTokens (below).
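
To make the heuristic concrete, here is a minimal sketch of a max-of-two-signals estimator. It is not tokenfit's source: the name sketchEstimate, the whitespace-based word split, and rounding each signal up are assumptions, chosen because they reproduce the example values shown in this README.

```typescript
// Hypothetical re-creation of the heuristic described above; not tokenfit's source.
function sketchEstimate(text: string): number {
  const trimmed = text.trim();
  // Assumption: empty or whitespace-only input counts as zero tokens.
  if (trimmed === "") return 0;

  const chars = text.length;
  // Assumption: words are runs of non-whitespace characters.
  const words = trimmed.split(/\s+/).length;

  // Assumption: each signal is rounded up before taking the larger one.
  const byChars = Math.ceil(chars / 4);
  const byWords = Math.ceil(words / 0.75);
  return Math.max(byChars, byWords);
}

console.log(sketchEstimate("hello world"));              // 3
console.log(sketchEstimate("How many tokens is this?")); // 7
```

Taking the larger signal means long identifiers (few words, many characters) and short-word prose (many words, few characters) both push the estimate up rather than down.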

trim(text, budget, options?): string

Trim a string so its token count never exceeds budget. The result — including the ellipsis marker — is guaranteed to fit.

trim(longLog, 2000, { strategy: "start" });   // keep the tail (newest log lines)
trim(bigFile, 1500, { strategy: "middle" });  // keep both ends, drop the middle
trim(article, 500);                            // strategy defaults to "end"

Option        Type                          Default     Description
strategy      "end" | "start" | "middle"    "end"       Which part of the text to drop.
ellipsis      string                        "…"         Marker inserted where text was removed.
countTokens   (text) => number              built-in    Custom token counter.
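
One way a trim with these options could work is a binary search over how many characters to keep, checking each candidate (ellipsis included) against the budget. The sketch below is an illustration under stated assumptions, not tokenfit's implementation: sketchTrim, buildCandidate, and the stand-in counter roughCount are hypothetical names, and the search assumes the counter never decreases as text grows.

```typescript
type Strategy = "end" | "start" | "middle";
type Counter = (text: string) => number;

// Stand-in counter (assumption): roughly one token per 4 characters.
const roughCount: Counter = (t) => Math.ceil(t.length / 4);

// Build the output that keeps `keep` characters of `text` plus the marker.
function buildCandidate(text: string, keep: number, strategy: Strategy, ellipsis: string): string {
  if (strategy === "end") return text.slice(0, keep) + ellipsis; // drop the end
  if (strategy === "start") return ellipsis + text.slice(text.length - keep); // drop the start
  const head = Math.ceil(keep / 2); // "middle": split the kept characters across both ends
  const tail = keep - head;
  return text.slice(0, head) + ellipsis + text.slice(text.length - tail);
}

function sketchTrim(
  text: string,
  budget: number,
  strategy: Strategy = "end",
  ellipsis: string = "…",
  count: Counter = roughCount,
): string {
  if (count(text) <= budget) return text; // already fits, nothing to remove

  // Binary search for the largest number of kept characters that still fits.
  let lo = 0;
  let hi = text.length - 1;
  let best = ""; // returned if even the marker alone exceeds the budget
  while (lo <= hi) {
    const mid = Math.floor((lo + hi) / 2);
    const candidate = buildCandidate(text, mid, strategy, ellipsis);
    if (count(candidate) <= budget) {
      best = candidate;
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return best;
}

const alphabet = "abcdefghijklmnopqrstuvwxyz";
console.log(sketchTrim(alphabet, 3));           // "abcdefghijk…"
console.log(sketchTrim(alphabet, 3, "start"));  // "…pqrstuvwxyz"
console.log(sketchTrim(alphabet, 3, "middle")); // "abcdef…vwxyz"
```

Counting the candidate with the marker already attached is what keeps the final string, ellipsis included, within the budget.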

pack(items, budget, options?): PackResult

Greedily assemble a subset of items that fits the budget, taking the highest priority first and accounting for the tokens used by the separator between items.

const result = pack(
  [
    { text: rules,  priority: 10, id: "rules" },
    { text: docA,   priority: 5,  id: "docA"  },
    { text: docB,   priority: 5,  id: "docB"  },
  ],
  3000,
  { separator: "\n\n---\n\n", trimLast: true },
);

result.text;      // assembled prompt, ≤ 3000 tokens
result.tokens;    // estimated token count of result.text
result.included;  // items that made it in (output order)
result.dropped;   // items left out

Option         Type                          Default     Description
separator      string                        "\n\n"      Inserted between included items.
trimLast       boolean                       false       Trim the first item that does not fit so it fills the leftover budget.
trimStrategy   "end" | "start" | "middle"    "end"       Strategy used when trimLast is on.
countTokens    (text) => number              built-in    Custom token counter.
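
The greedy, separator-aware packing described above can be sketched as follows. This is a simplified illustration, not tokenfit's implementation: sketchPack and roughCount are hypothetical names, trimLast is left out, and two behaviours are assumptions (lower-priority items are still tried after one fails to fit, and the output follows priority order).

```typescript
interface Item { text: string; priority: number; id?: string }
type Counter = (text: string) => number;

// Stand-in counter (assumption): roughly one token per 4 characters.
const roughCount: Counter = (t) => Math.ceil(t.length / 4);

function sketchPack(
  items: Item[],
  budget: number,
  separator: string = "\n\n",
  count: Counter = roughCount,
) {
  // Highest priority first; ties keep their original input order.
  const ordered = items
    .map((item, index) => ({ item, index }))
    .sort((a, b) => b.item.priority - a.item.priority || a.index - b.index);

  const included: Item[] = [];
  const dropped: Item[] = [];
  let text = "";

  for (const { item } of ordered) {
    // The separator is only paid for when something is already included.
    const candidate = included.length === 0 ? item.text : text + separator + item.text;
    if (count(candidate) <= budget) {
      text = candidate;
      included.push(item);
    } else {
      dropped.push(item);
    }
  }
  return { text, tokens: count(text), included, dropped };
}

const result = sketchPack(
  [
    { id: "a", text: "aaaaaaaa", priority: 1 },
    { id: "b", text: "bbbbbbbb", priority: 10 },
    { id: "c", text: "cccccccc", priority: 5 },
  ],
  5,
);
console.log(result.included.map((i) => i.id)); // ["b", "c"]
console.log(result.dropped.map((i) => i.id));  // ["a"]
```

Counting the joined candidate rather than summing per-item counts is what makes the separator cost part of the budget check.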

Bring your own tokenizer

For exact counts, hand any counter to any function:

import { encoding_for_model } from "tiktoken";
import { pack } from "tokenfit";

const enc = encoding_for_model("gpt-4o");
const countTokens = (t: string) => enc.encode(t).length;

pack(items, 8000, { countTokens });
enc.free(); // tiktoken encoders are WASM-backed; release them when you are done

CLI

tokenfit ships a small CLI for shell pipelines:

# Estimate tokens
cat big.log | tokenfit count
tokenfit count README.md

# Trim to a budget (reads stdin or a file)
cat big.log | tokenfit trim -b 2000 -s start
tokenfit trim --budget 500 --strategy middle notes.md

Usage:

tokenfit count [file]                 Estimate tokens (stdin if no file)
tokenfit trim --budget <n> [file]     Trim text to fit a token budget
  --budget, -b <n>      Token budget (required)
  --strategy, -s <s>    end | start | middle   (default: end)
  --ellipsis <str>      Marker for removed text
Recipes

Keep a chat history under budget (newest wins):

// Use the array position as the priority so later (newer) messages win.
const history = messages.map((m, i) => ({ text: m.content, priority: i }));
const { text } = pack(history, 6000, { trimLast: true, trimStrategy: "start" });

Truncate a noisy log before sending it to a model:

const safe = trim(rawLog, 4000, { strategy: "start" }); // keep the most recent lines

How accurate is the estimator?

The default estimator is a heuristic, not a tokenizer. It is designed to be slightly conservative — it tends to estimate a touch high so your prompts fit on the first try. For budgeting, trimming, and packing this is exactly what you want. When you need guaranteed-exact counts (e.g. billing), plug in a real tokenizer via countTokens.
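
To check how an estimator behaves on your own data, you could compare it against an exact counter over a handful of samples. The helper below is a hypothetical sketch (compareCounters is not part of tokenfit); it accepts any two (text) => number functions, so in practice you might pass estimateTokens and a tiktoken-backed counter. The two stand-in counters in the demo exist only so the block runs on its own.

```typescript
type Counter = (text: string) => number;

// Relative error of `estimate` against `exact`; positive means an overestimate.
function relativeError(sample: string, estimate: Counter, exact: Counter): number {
  const truth = exact(sample);
  return truth === 0 ? 0 : (estimate(sample) - truth) / truth;
}

function compareCounters(samples: string[], estimate: Counter, exact: Counter) {
  const errors = samples.map((s) => relativeError(s, estimate, exact));
  const mean = errors.length === 0 ? 0 : errors.reduce((a, b) => a + b, 0) / errors.length;
  // Underestimates are the risky case: they are what overflow a context window.
  const underestimates = errors.filter((e) => e < 0).length;
  const worstUnder = errors.reduce((a, b) => Math.min(a, b), 0);
  return { mean, underestimates, worstUnder };
}

// Stand-in counters for the demo only (assumptions, not real tokenizers).
const byChars: Counter = (t) => Math.ceil(t.length / 4);
const byWords: Counter = (t) => (t.trim() === "" ? 0 : t.trim().split(/\s+/).length);

console.log(compareCounters(["hello world", "a b c d"], byChars, byWords));
// { mean: 0, underestimates: 1, worstUnder: -0.5 }
```

A conservative estimator should show few or no underestimates on representative samples; if it shows many, supplying countTokens is the safer choice.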

Contributors ✨

This project follows the all-contributors specification. Contributions of any kind are welcome — code, docs, bug reports, ideas, reviews! See the emoji key for how each contribution is recognized, and open a PR or issue to get involved.

Thanks go to these wonderful people:

Tung Tran 💻 🚧

License

MIT © Tung Tran
