Context-window controller for LLM agent sessions.
ctxctl keeps long agent conversations under a model's context budget. You hand it
your message history and a strategy (sliding window, priority eviction, or
LLM-based summarization) and it returns the subset that fits.

```shell
pip install ctxctl
```

```python
import openai

from ctxctl import Controller, Message, TokenBudget
from ctxctl.strategies import SlidingWindow

ctrl = Controller(
    budget=TokenBudget(total=128_000, reserved_output=4_000),
    strategy=SlidingWindow(keep_first=1),  # always keep system prompt
)

# `conversation` is your own iterable of turns exposing .role and .text
for turn in conversation:
    ctrl.add(Message(role=turn.role, content=turn.text))

# returns the messages that fit within the budget
prepared = ctrl.fit()
response = openai.chat.completions.create(model="gpt-4o", messages=prepared)
```

| Name | Behavior |
|---|---|
| `SlidingWindow` | Drop oldest non-pinned messages until under budget. |
| `PriorityEvict` | Drop lowest-importance messages first; pinned messages are never dropped. |
| `Summarizer` | Replace old turns with an LLM-generated summary (you supply the LLM call). |

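
To make the sliding-window behavior concrete, here is a small stand-alone sketch. It does not use or reproduce ctxctl's internals, which this README does not show: the `Msg` dataclass, `sliding_window`, and the word-based `count_tokens` are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Msg:
    # Hypothetical stand-in for a chat message; not ctxctl's Message class.
    role: str
    content: str
    pinned: bool = False


def count_tokens(msg: Msg) -> int:
    # Crude assumption: one token per whitespace-separated word.
    return len(msg.content.split())


def sliding_window(messages: list[Msg], budget: int) -> list[Msg]:
    """Drop the oldest non-pinned messages until the total fits the budget."""
    kept = list(messages)
    total = sum(count_tokens(m) for m in kept)
    i = 0
    while total > budget and i < len(kept):
        if kept[i].pinned:
            i += 1  # pinned messages are skipped, never dropped
        else:
            total -= count_tokens(kept[i])
            del kept[i]  # the next-oldest message shifts into index i
    return kept


history = [
    Msg("system", "You are a helpful assistant", pinned=True),  # 5 words
    Msg("user", "first question about parsing"),                # 4 words
    Msg("assistant", "a long first answer with many words"),    # 7 words
    Msg("user", "second question"),                             # 2 words
]
fitted = sliding_window(history, budget=8)
print([m.content for m in fitted])
```

In this sketch the pinned system message and the most recent turn survive, while the two oldest unpinned turns are dropped. If the pinned messages alone exceed the budget, the sketch returns them anyway; how ctxctl handles that case is not stated here.
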
Mark a message `pinned=True` to make it un-evictable (e.g. system prompt, key facts, recent tool output).
Set a higher importance (e.g. `importance=2.0`) so it is evicted later under `PriorityEvict`.
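
The interaction between pinning and importance can be sketched as follows. This is an illustrative model, not ctxctl's implementation: the `Item` dataclass, the `priority_evict` function, the default importance of 1.0, and the oldest-first tie-break are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Item:
    # Hypothetical message record; field names mirror the README's wording.
    content: str
    tokens: int
    pinned: bool = False
    importance: float = 1.0  # assumed default, not confirmed by ctxctl


def priority_evict(items: list[Item], budget: int) -> list[Item]:
    """Drop lowest-importance unpinned items first (oldest first on ties)."""
    total = sum(it.tokens for it in items)
    # Eviction order: least important first, then oldest. Pinned items are excluded.
    candidates = sorted(
        (i for i, it in enumerate(items) if not it.pinned),
        key=lambda i: (items[i].importance, i),
    )
    dropped = set()
    for i in candidates:
        if total <= budget:
            break
        total -= items[i].tokens
        dropped.add(i)
    # Survivors keep their original conversation order.
    return [it for i, it in enumerate(items) if i not in dropped]


items = [
    Item("system prompt", tokens=10, pinned=True),
    Item("small talk", tokens=30),
    Item("key decision", tokens=30, importance=2.0),
    Item("latest question", tokens=20),
]
survivors = priority_evict(items, budget=60)
print([it.content for it in survivors])
```

Here "small talk" is evicted first because it has the lowest importance and is the oldest candidate, which already brings the total to 60 tokens. "key decision" would be the last unpinned item to go.
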
By default ctxctl uses tiktoken with the `cl100k_base` encoding, which is the
encoding GPT-4 uses and a reasonable approximation for Claude, whose tokenizer
differs. You can swap counters:

```python
from ctxctl.counter import TiktokenCounter

ctrl = Controller(budget=..., counter=TiktokenCounter("gpt-4o"))
```

License: Apache-2.0