47 changes: 47 additions & 0 deletions .github/workflows/update_rates.yml
@@ -0,0 +1,47 @@
name: Update Rates

on:
  schedule:
    - cron: "0 6 * * 1"  # Weekly on Mondays at 06:00 UTC
  workflow_dispatch:

permissions:
  contents: write
  pull-requests: write

jobs:
  update-rates:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install uv
        uses: astral-sh/setup-uv@v4
        with:
          version: "latest"

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Install dependencies
        run: uv sync --dev

      - name: Update rate data
        run: uv run python scripts/update_rates.py

      - name: Format code (optional)
        run: uv run ruff format scripts/ src/

      - name: Create Pull Request
        uses: peter-evans/create-pull-request@v6
        with:
          branch: chore/update-rates
          commit-message: "chore: update bundled rate data"
          title: "chore: update bundled rate data"
          body: "Automated rate refresh via scheduled workflow."
          add-paths: |
            src/cellsem_llm_client/tracking/rates.json
            scripts/update_rates.py
3 changes: 2 additions & 1 deletion README.md
@@ -73,6 +73,7 @@ Quick links:
- [Development Guidelines](docs/contributing.md)
- [API Reference](docs/api/cellsem_llm_client/index.rst) (auto-generated)
- [Schema Enforcement](docs/schema_enforcement.md)
- [Cost Tracking](docs/cost_tracking.md)

## ✨ Current Features

@@ -93,7 +94,7 @@ STATUS - beta

- ✅ **Real-time Cost Tracking**: Direct integration with OpenAI and Anthropic usage APIs (aggregate per-key)
- ✅ **Token Usage Metrics**: Detailed tracking of input, output, cached, and thinking tokens
- ✅ **Cost Calculation**: Automated cost computation with fallback rate database (per-request precision)
- ✅ **Cost Calculation**: Automated cost computation with fallback rate database (per-request precision); enabled by default when `track_usage=True` (opt-out available)
- ✅ **Usage Analytics**: Comprehensive reporting and cost optimization insights

### JSON Schema Compliance
53 changes: 53 additions & 0 deletions docs/cost_tracking.md
@@ -0,0 +1,53 @@
# Cost Tracking and Estimation

This guide explains how to track costs using:

- **Estimated per-request costs** (default; computed locally from bundled rates, no provider usage API calls required)
- **Actual key-level usage** (provider-reported, delayed, key-wide)

## Estimated Costs (Per Request)

- Enable tracking on calls: `query_unified(..., track_usage=True)`.
- If you do **not** pass a `cost_calculator`, a `FallbackCostCalculator` with bundled rates is created by default (`auto_cost=True`).
- Opt out by setting `auto_cost=False`.
- Provide your own calculator for custom rates: `query_unified(..., track_usage=True, cost_calculator=my_calc, auto_cost=False)`.
- Rate freshness is exposed as `usage.rate_last_updated` (from the rate source access date).

Example:

```python
from cellsem_llm_client.agents import LiteLLMAgent

agent = LiteLLMAgent(model="gpt-4o", api_key="key")
result = agent.query_unified(
    message="Summarize this.",
    track_usage=True,  # auto cost estimation by default
)
print(result.usage.estimated_cost_usd, result.usage.rate_last_updated)
```
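
To opt out of automatic estimation, or to supply your own calculator, use the `auto_cost` and `cost_calculator` arguments. A minimal sketch; the import path for `FallbackCostCalculator` is an assumption here, so adjust it to wherever the class lives in the tracking package:

```python
from cellsem_llm_client.agents import LiteLLMAgent

# Assumed import path, for illustration only.
from cellsem_llm_client.tracking import FallbackCostCalculator

agent = LiteLLMAgent(model="gpt-4o", api_key="key")

# Opt out: tokens are still tracked, but no cost estimate is attached.
no_estimate = agent.query_unified(
    message="Summarize this.",
    track_usage=True,
    auto_cost=False,
)

# Explicit calculator, mirroring the custom-calculator bullet above.
calc = FallbackCostCalculator()  # bundled rates; call load_default_rates() first if the constructor does not load them
with_calc = agent.query_unified(
    message="Summarize this.",
    track_usage=True,
    cost_calculator=calc,
    auto_cost=False,
)
print(with_calc.usage.estimated_cost_usd)
```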

## Actual Usage (Key-Level)

Use `ApiCostTracker` for provider-reported usage; this is **key-wide** and delayed by a few minutes.

```python
from datetime import date, timedelta
from cellsem_llm_client.tracking.api_trackers import ApiCostTracker

tracker = ApiCostTracker(openai_api_key="sk-...", anthropic_api_key="ak-...")
end = date.today()
start = end - timedelta(days=1)

openai_usage = tracker.get_openai_usage(start_date=start, end_date=end)
print(openai_usage.total_cost, openai_usage.total_requests)
```

Notes:
- Reports aggregate all usage for the API key (not per request).
- Expect a short provider-side delay before usage appears.
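
The same call works over longer windows. A minimal sketch of a rolling seven-day report, reusing the `get_openai_usage` call and attributes shown above (the `$` formatting assumes `total_cost` is a numeric USD value):

```python
from datetime import date, timedelta

from cellsem_llm_client.tracking.api_trackers import ApiCostTracker

tracker = ApiCostTracker(openai_api_key="sk-...", anthropic_api_key="ak-...")

# Key-wide totals for the last seven days; the most recent requests may not
# have appeared yet because of the provider-side reporting delay.
end = date.today()
start = end - timedelta(days=7)
usage = tracker.get_openai_usage(start_date=start, end_date=end)
print(f"OpenAI, last 7 days: {usage.total_requests} requests, ${usage.total_cost:.2f}")
```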

## Rates and Updates

- Bundled rates live in `tracking/rates.json`; `FallbackCostCalculator.load_default_rates()` reads this file.
- `usage.rate_last_updated` shows when the rate data was last refreshed.
- A weekly GitHub Action runs `scripts/update_rates.py` to refresh rates; it opens a PR if rates change.
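
To check freshness without going through the calculator, you can read the bundled file directly. A minimal sketch; the field names follow `scripts/update_rates.py`, and reading via `importlib.resources` assumes the package-data layout declared in `pyproject.toml`:

```python
import json
from importlib import resources

# rates.json ships inside the installed package (see [tool.setuptools.package-data]).
raw = (
    resources.files("cellsem_llm_client.tracking")
    .joinpath("rates.json")
    .read_text(encoding="utf-8")
)
rates = json.loads(raw)

for entry in rates:
    source = entry["source"]
    print(
        f'{entry["provider"]}/{entry["model"]}: '
        f'in ${entry["input_cost_per_1k_tokens"]}/1k, '
        f'out ${entry["output_cost_per_1k_tokens"]}/1k, '
        f'accessed {source.get("access_date")}'
    )
```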
1 change: 1 addition & 0 deletions docs/index.md
@@ -7,6 +7,7 @@
installation
quickstart
schema_enforcement
cost_tracking
contributing
```

29 changes: 29 additions & 0 deletions planning/cost_tracking_improvements.md
@@ -0,0 +1,29 @@
# Cost Tracking Improvements Plan

## Scope
- Default cost estimation when tracking is enabled.
- Opt-out flag for auto cost estimation.
- Documentation updates for estimated vs actual costs.
- Rate updater script and weekly GitHub Action.
- Surface rate freshness date in cost metrics/reporting.

## Tasks
1) **Auto cost estimator default**
- In `query_unified` (and wrappers), when `track_usage=True` and no `cost_calculator` is provided, auto-create a `FallbackCostCalculator()` (loading the default rates) so `estimated_cost_usd` is populated by default.
- Add an opt-out flag (e.g., `auto_cost`, defaulting to `True`) so auto estimation can be disabled.
- Honor provided calculators; if `auto_cost=False`, skip auto estimation. A sketch of this selection logic appears at the end of this plan.

2) **Docs**
- README: note default estimated costs (unless opted out) and link to cost tracking doc.
- New/updated doc (`docs/cost_tracking.md`): per-request estimated costs (auto/explicit calculator, opt-out), key-level actual usage via `ApiCostTracker` with code snippets/caveats (provider delay, key-level totals), and guidance on estimates vs actuals.

3) **Rate updater + weekly Action**
- Add `scripts/update_rates.py` to fetch latest OpenAI/Anthropic pricing and update the rate database used by `FallbackCostCalculator`.
- Add a scheduled GitHub Action (weekly) to run the updater; assume no secrets are needed. If rates change, open a PR or flag a failing status.

4) **Rate freshness in metrics**
- Include rate database last-update date in cost tracking output/metrics so users see pricing freshness.

5) **Execution notes**
- Implement code changes after this plan is approved.
- Keep backward compatibility; new defaults are opt-out.
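
A rough sketch of the calculator-selection behavior described in task 1; this is illustrative only, the helper name and import path are placeholders, and the real wiring inside `query_unified` may differ:

```python
# Assumed import path; placeholder for wherever FallbackCostCalculator lives.
from cellsem_llm_client.tracking import FallbackCostCalculator


def _resolve_cost_calculator(track_usage, cost_calculator=None, auto_cost=True):
    """Hypothetical helper: pick the calculator used to fill estimated_cost_usd."""
    if not track_usage:
        return None  # no usage tracking, so nothing to estimate
    if cost_calculator is not None:
        return cost_calculator  # always honor an explicitly provided calculator
    if auto_cost:
        calc = FallbackCostCalculator()  # bundled rates
        calc.load_default_rates()  # assumed instance method per the docs
        return calc
    return None  # auto_cost=False and no calculator: skip estimation
```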
3 changes: 3 additions & 0 deletions pyproject.toml
@@ -54,6 +54,9 @@ Issues = "https://github.com/Cellular-Semantics/cellsem_llm_client/issues"
[tool.setuptools.packages.find]
where = ["src"]

[tool.setuptools.package-data]
"cellsem_llm_client.tracking" = ["rates.json"]

[tool.pytest.ini_options]
testpaths = ["tests"]
python_files = ["test_*.py"]
109 changes: 109 additions & 0 deletions scripts/update_rates.py
@@ -0,0 +1,109 @@
"""Update bundled rate data for cost estimation.

This script refreshes `src/cellsem_llm_client/tracking/rates.json` with
the latest known pricing (hard-coded here) and stamps the current UTC
access date. It is intended to be run by CI on a schedule.
"""

from __future__ import annotations

import json
from datetime import UTC, datetime
from pathlib import Path

RATE_FILE = Path("src/cellsem_llm_client/tracking/rates.json")

# Maintain current pricing here; update values as provider pricing changes.
CURRENT_RATES = [
    {
        "provider": "openai",
        "model": "gpt-4",
        "input_cost_per_1k_tokens": 0.03,
        "output_cost_per_1k_tokens": 0.06,
        "cached_cost_per_1k_tokens": None,
        "thinking_cost_per_1k_tokens": None,
        "source": {
            "name": "Provider Documentation",
            "url": "https://openai.com/pricing",
        },
    },
    {
        "provider": "openai",
        "model": "gpt-4o",
        "input_cost_per_1k_tokens": 0.005,
        "output_cost_per_1k_tokens": 0.015,
        "cached_cost_per_1k_tokens": None,
        "thinking_cost_per_1k_tokens": None,
        "source": {
            "name": "Provider Documentation",
            "url": "https://openai.com/pricing",
        },
    },
    {
        "provider": "openai",
        "model": "gpt-4o-mini",
        "input_cost_per_1k_tokens": 0.00015,
        "output_cost_per_1k_tokens": 0.0006,
        "cached_cost_per_1k_tokens": 0.000075,
        "thinking_cost_per_1k_tokens": None,
        "source": {
            "name": "Provider Documentation",
            "url": "https://openai.com/pricing",
        },
    },
    {
        "provider": "openai",
        "model": "gpt-3.5-turbo",
        "input_cost_per_1k_tokens": 0.0015,
        "output_cost_per_1k_tokens": 0.002,
        "cached_cost_per_1k_tokens": None,
        "thinking_cost_per_1k_tokens": None,
        "source": {
            "name": "Provider Documentation",
            "url": "https://openai.com/pricing",
        },
    },
    {
        "provider": "anthropic",
        "model": "claude-3-sonnet",
        "input_cost_per_1k_tokens": 0.003,
        "output_cost_per_1k_tokens": 0.015,
        "cached_cost_per_1k_tokens": None,
        "thinking_cost_per_1k_tokens": 0.006,
        "source": {
            "name": "Provider Documentation",
            "url": "https://www.anthropic.com/pricing",
        },
    },
    {
        "provider": "anthropic",
        "model": "claude-3-haiku-20240307",
        "input_cost_per_1k_tokens": 0.00025,
        "output_cost_per_1k_tokens": 0.00125,
        "cached_cost_per_1k_tokens": None,
        "thinking_cost_per_1k_tokens": 0.0005,
        "source": {
            "name": "Provider Documentation",
            "url": "https://www.anthropic.com/pricing",
        },
    },
]


def main() -> None:
    """Write updated rate data with current access_date."""
    now = datetime.now(UTC).replace(microsecond=0).isoformat()
    output = []
    for entry in CURRENT_RATES:
        entry_copy = dict(entry)
        source = dict(entry_copy["source"])
        source["access_date"] = now
        entry_copy["source"] = source
        output.append(entry_copy)

    RATE_FILE.parent.mkdir(parents=True, exist_ok=True)
    RATE_FILE.write_text(json.dumps(output, indent=2, sort_keys=True), encoding="utf-8")


if __name__ == "__main__":
    main()