Benchmark and compare Ollama models across local and cloud endpoints with rich, sortable tables.
- 📋 List models from local and cloud Ollama endpoints
- 📊 Rich tables with sorting by name, size, context length, modification date, TTFT, or TPS
- 🔃 Reverse sort with `--reverse`
- ⏱️ Benchmark time-to-first-token (TTFT) and tokens-per-second (TPS)
- 🔍 Model filtering by exact name or family match (e.g. `llama3` matches `llama3:latest`)
- 📤 Export results to JSON or CSV (stdout or file)
- 🧪 Multi-prompt averaging — 3 prompts per model for robust stats (or use `--prompts` for custom prompts)
- 🧬 Embedding model support — automatically uses `/api/embed` for local embedding models
- 🎨 Beautiful CLI powered by `rich` + `InquirerPy`
- 📜 Benchmark history — every run is auto-saved to a local SQLite database; view past results with `--history`
- 📈 Performance trends — arrows (↑↓→) automatically appear inline next to TTFT/TPS values when historical data is available

From the project directory:
```bash
uv tool install .
```

Or install directly from GitHub:

```bash
uv tool install git+https://github.com/EndoTheDev/OMeter.git
```

This installs the `ometer` command globally, so you can run it from anywhere.

Update:

```bash
uv tool install --upgrade ometer
```

Uninstall:

```bash
uv tool uninstall ometer
```

Add it as a project dependency with uv:

```bash
uv add ometer
```

Or via pip:

```bash
pip install ometer
```

Show the version:
```bash
ometer --version
```

List models with an interactive menu:

```bash
ometer
```

List local models only:

```bash
ometer --local
```

List cloud models only:

```bash
ometer --cloud
```

List both local and cloud models:

```bash
ometer --local --cloud
```

Benchmark time-to-first-token and tokens-per-second:

```bash
ometer --cloud --ttft --tps
```

Benchmark models in parallel for faster results (default is 1 — max 10):

```bash
ometer --cloud --ttft --tps --parallel 4
```

Show per-run breakdown in the table:

```bash
ometer --cloud --ttft --tps --verbose
```

Run with fewer benchmark prompts for faster results (default is 3 — max 3):

```bash
ometer --cloud --ttft --tps --verbose --runs 1
ometer --cloud --ttft --tps --verbose --runs 2
```

Use custom benchmark prompts instead of the built-in defaults (overrides `--runs`):

```bash
ometer --local --ttft --tps --prompts "why is the ocean salty?"
ometer --local --ttft --tps --prompts prompts.txt
```

Pass a filename to read one prompt per line (skips blank lines, strips whitespace).
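
For example, a `prompts.txt` like this (the prompt text below is purely illustrative) would run two prompts against each model:

```text
why is the ocean salty?
explain how a hash map handles collisions
```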
Filter to specific models (exact name or family match, accepts multiple names):
```bash
ometer --model llama3 --ttft --tps
ometer --local --model llama3.2:3b llama3.3:8b --ttft --tps
```
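The family match appears to work on the part of the model name before the `:` tag, as in the `llama3` / `llama3:latest` example above. A rough sketch of that rule (an assumption about the behavior, not OMeter's actual code):

```python
def model_matches(filter_name: str, model_name: str) -> bool:
    """Match either the full model name or its family (the part before ':')."""
    family = model_name.split(":", 1)[0]
    return filter_name in (model_name, family)


print(model_matches("llama3", "llama3:latest"))     # True  (family match)
print(model_matches("llama3.2:3b", "llama3.2:3b"))  # True  (exact match)
print(model_matches("llama3", "llama3.2:3b"))       # False
```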
Sort results by model size (largest first) or name (A–Z):

```bash
ometer --cloud --sort size
ometer --cloud --sort name
```

Sort by context length (largest first) or modification date (newest first):

```bash
ometer --cloud --sort ctx
ometer --local --sort modified
```

Sort by benchmark metrics — TTFT (lowest/best first) and TPS (highest/best first):

```bash
ometer --cloud --ttft --tps --sort ttft
ometer --cloud --ttft --tps --sort tps
```

Reverse any sort order (worst first, Z–A, oldest first):

```bash
ometer --cloud --sort name --reverse
ometer --cloud --ttft --tps --sort tps --reverse
```

Export results as JSON (to stdout or a file):

```bash
ometer --cloud --ttft --tps --json
ometer --cloud --ttft --tps --json results.json
```

Export results as CSV (to stdout or a file):

```bash
ometer --local --ttft --tps --csv
ometer --local --ttft --tps --csv results.csv
```

View benchmark history (latest run per model):
```bash
ometer --history
```

Show all historical runs with full details:

```bash
ometer --history --verbose
```

Filter history to specific models:

```bash
ometer --history --model llama3
```

Export history as JSON or CSV:

```bash
ometer --history --json
ometer --history --csv history.csv
```

Performance trend arrows (↑ improved, ↓ degraded, → stable within 5%) appear inline next to TTFT and TPS values automatically. No flag needed.
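
As a rough illustration of the idea (an assumption about the logic, not necessarily what `history.py` does), a trend can be classified by comparing the latest value against the previous run, treating anything within 5% as stable, and remembering that lower TTFT is better while higher TPS is better:

```python
def trend_arrow(current: float, previous: float, lower_is_better: bool = False) -> str:
    """Classify a metric change as improved (↑), degraded (↓), or stable (→).

    Changes within ±5% of the previous value count as stable.
    Illustrative sketch only, not OMeter's exact implementation.
    """
    if previous == 0:
        return "→"
    change = (current - previous) / previous
    if abs(change) <= 0.05:  # within the 5% stability band
        return "→"
    improved = change < 0 if lower_is_better else change > 0
    return "↑" if improved else "↓"


# TTFT dropped from 0.80 s to 0.60 s: an improvement for a lower-is-better metric.
print(trend_arrow(0.60, 0.80, lower_is_better=True))   # ↑
# TPS slipped from 100 to 97: within 5%, so it reads as stable.
print(trend_arrow(97.0, 100.0))                        # →
```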
See all options:
```bash
ometer --help
```

OMeter looks for a `.env` file in this order, using the first one found:

1. `./.env` — current working directory (project-specific)
2. `~/.env` — home directory (global fallback)
3. `~/.config/ometer/.env` — dedicated config directory (recommended for global installs)
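
Conceptually, this is a first-match walk over those three paths (a sketch of the lookup order, not the actual `config.py` code):

```python
from pathlib import Path

# Candidate .env locations, highest priority first (mirrors the list above).
CANDIDATES = [
    Path.cwd() / ".env",
    Path.home() / ".env",
    Path.home() / ".config" / "ometer" / ".env",
]

def find_env_file() -> Path | None:
    """Return the first .env file that exists, or None if none are found."""
    return next((path for path in CANDIDATES if path.is_file()), None)
```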
Create the config directory and file:
```bash
mkdir -p ~/.config/ometer
cat > ~/.config/ometer/.env << 'EOF'
OLLAMA_CLOUD_BASE_URL=https://ollama.com
OLLAMA_CLOUD_API_KEY=your_api_key_here
OLLAMA_LOCAL_BASE_URL=http://localhost:11434
# Number of benchmark prompts per model (1–3, default 3). Ignored when --prompts is used.
OMETER_RUNS=3
# Number of models benchmarked in parallel (default 1, max 10)
OMETER_PARALLEL=1
EOF
```

The cloud API key is only needed for benchmarking cloud models.

Benchmark results are auto-saved to a local SQLite database. The database path can be overridden:

```bash
export OMETER_HISTORY_DB=/custom/path/history.db
```

By default it lives at `~/.local/share/ometer/ometer_history.db`.
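
Because the history is a plain SQLite file, you can also poke at it directly. This sketch only lists the tables in the database (it makes no assumptions about OMeter's schema):

```python
import os
import sqlite3
from pathlib import Path

# Default location; override with OMETER_HISTORY_DB as shown above.
default_db = Path.home() / ".local" / "share" / "ometer" / "ometer_history.db"
db_path = os.environ.get("OMETER_HISTORY_DB", str(default_db))

with sqlite3.connect(db_path) as conn:
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
    print([name for (name,) in rows])
```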
OMeter has six modules that handle distinct concerns:
```text
User ──► cli.py ────────► config.py ───► api.py ──────► display.py
         │                │              │              │
         arg parsing      .env load      HTTP calls     rich tables
         mode resolve     validate       benchmark      color thresholds
         interactive      clamp          stream         live updates
         export           │              │              │
         │                │              │              history.py
         │                │              │              │
         export.py        │              │              SQLite DB
         │                                               │
         JSON/CSV output                                 auto-save + trend
```

- `cli.py` — Entry point, argument parsing, interactive model selection, export dispatch
- `config.py` — Hierarchical `.env` loading, settings validation and clamping
- `api.py` — HTTP communication with Ollama, TTFT/TPS measurement
- `display.py` — Rich terminal UI, live table updates, percentile-based color coding
- `export.py` — JSON/CSV export formatting and file output
- `history.py` — SQLite-backed benchmark persistence, trend computation, history queries
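
For intuition about what `api.py` measures, here is a standalone sketch against Ollama's streaming `/api/generate` endpoint (not OMeter's actual code): TTFT is the wait until the first streamed token, and TPS is derived from the `eval_count` and `eval_duration` fields Ollama reports in the final chunk.

```python
import json
import time
import urllib.request

def benchmark_once(model: str, prompt: str, base_url: str = "http://localhost:11434"):
    """Measure TTFT (seconds) and TPS for one streamed generation. Illustrative only."""
    request = urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps({"model": model, "prompt": prompt, "stream": True}).encode(),
        headers={"Content-Type": "application/json"},
    )
    start = time.perf_counter()
    ttft = None
    with urllib.request.urlopen(request) as response:
        for line in response:  # Ollama streams one JSON object per line
            chunk = json.loads(line)
            if ttft is None and chunk.get("response"):
                ttft = time.perf_counter() - start  # first token arrived
            if chunk.get("done"):
                # eval_count tokens generated over eval_duration nanoseconds
                tps = chunk["eval_count"] / (chunk["eval_duration"] / 1e9)
                return ttft, tps
    return ttft, None
```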
For detailed documentation, see the docs directory:
- Architecture — Module decomposition, request lifecycle, data entities
- Benchmarking Pipeline — TTFT/TPS methodology, concurrency, color thresholds
- Configuration — Environment variables, CLI flags, loading order
- API Reference — Ollama endpoints, function reference, BenchmarkResult
- Development — Dev setup, running tests, project structure, conventions
MIT License — see LICENSE for details.
Made by EndoTheDev


