Add Ollama warmth lifetime scoring as bounded placement tiebreaker by toasterbook88 · Pull Request #151 · toasterbook88/axis

toasterbook88 · 2026-06-02T14:48:28Z

Summary

Promote the resident-model "is loaded" boolean into a continuous 0.0-1.0 warmth score derived from Ollama's /api/ps expires_at and default_keep_alive. Warmth becomes a bounded tiebreaker at position 10 of the placement rank comparator, after RAM, GPU, pressure, and reservation ratio.

FilterCandidates is unchanged: warmth is consulted only among nodes that already passed eval.Eligible(). It cannot promote an undersized node, and the three-bucket discretization (cold/warm/hot at 0.5 and 0.9) keeps ranking stable.

What ships

ResidentModel.ExpiresAt (time.Time, omitempty) and ResidentModel.WarmthScore (float64, omitempty) in internal/models/types.go
OllamaInfo.DefaultKeepAlive (string, omitempty) for the process-level Ollama default
ApplyOllamaWarmth and DefaultOllamaKeepAlive helpers in internal/facts/local.go (exported for testability)
OllamaDiscoveryScript updated to read ollama ps -qq (JSON path, Ollama 0.3.10+), falling back to the existing awk parser on older Ollama; it also queries /api/ps for default_keep_alive and falls back to 5m when the value is missing or unparseable
modelWarmthRank and warmthToRank in internal/placement/empirical.go (3 buckets: 0 cold, 1 warm, 2 hot at 0.5 and 0.9)
modelWarmthRank wired into rankKey at position 10 of the comparator in internal/placement/ranker.go
11 new tests in internal/placement/warmth_test.go covering: warmth loses to allocatable RAM, warmth breaks ties on equal RAM, warmth is ignored when FilterCandidates rejects, boundary cases (0, 0.5, 0.51, 0.9, 0.91, 1.0, 2.0), highest-relevant-wins, other-runtime warmth is ignored, and time math for zero / future / past ExpiresAt
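The three-bucket mapping listed above can be sketched as follows. The thresholds and the strict `>` comparisons come from the PR description; the constant names and the exact body are assumptions, and treating out-of-range input such as 2.0 as hot is a guess rather than documented behavior.

```go
package main

import "fmt"

// Bucket values as described in the PR: 0 cold, 1 warm, 2 hot.
// The constant names are illustrative only.
const (
	rankCold = 0
	rankWarm = 1
	rankHot  = 2
)

// warmthToRank sketches the discretization with strict ">" comparisons,
// so exactly 0.5 stays cold and exactly 0.9 stays warm.
// Assumption: scores above 1.0 simply fall into the hot bucket.
func warmthToRank(score float64) int {
	switch {
	case score > 0.9:
		return rankHot
	case score > 0.5:
		return rankWarm
	default:
		return rankCold
	}
}

func main() {
	for _, s := range []float64{0, 0.5, 0.51, 0.9, 0.91, 1.0, 2.0} {
		fmt.Printf("score=%.2f rank=%d\n", s, warmthToRank(s))
	}
}
```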

Safety contract

Warmth is strictly a tiebreaker. FilterCandidates (ranker.go) calls eval.Eligible() before any ranking begins; warmth cannot override RAM, GPU, or pressure eligibility.
The 3-bucket discretization (cold/warm/hot) keeps warmth stable under small time changes: the rank changes only when the score crosses 0.5 or 0.9, so transient observations within a bucket cannot induce rank flips.
All new fields use omitempty, so older Ollama, other runtimes (llama-server, mlx_lm.server), and zero/expired ExpiresAt all leave WarmthScore at 0 and behave as cold.
Probe path tolerates ollama ps -qq failing on older Ollama (e.g., 0.23.3 rejects -qq with "unknown shorthand flag") and gracefully falls back to the existing awk parser, which emits no expires_at — those entries remain cold.
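To illustrate the filter-then-rank ordering the contract relies on, here is a reduced sketch. It is not the code in ranker.go: the `node` type, `filterEligible`, and `rank` are hypothetical names, eligibility is reduced to a single RAM check, and the comparator keeps only two of the keys (allocatable RAM, then warmth bucket) instead of the full ten-position comparator.

```go
package main

import (
	"fmt"
	"sort"
)

// node is a hypothetical stand-in for a placement candidate.
type node struct {
	Name           string
	AllocatableRAM int // GiB, illustrative unit
	WarmthRank     int // 0 cold, 1 warm, 2 hot
}

// filterEligible drops nodes that cannot fit the workload at all.
// Warmth is never consulted here, so a hot but undersized node is removed.
func filterEligible(nodes []node, requiredRAM int) []node {
	var out []node
	for _, n := range nodes {
		if n.AllocatableRAM >= requiredRAM {
			out = append(out, n)
		}
	}
	return out
}

// rank orders by RAM first and uses warmth only to break ties.
func rank(nodes []node) []node {
	ranked := append([]node(nil), nodes...) // copy so the input is not mutated
	sort.SliceStable(ranked, func(i, j int) bool {
		if ranked[i].AllocatableRAM != ranked[j].AllocatableRAM {
			return ranked[i].AllocatableRAM > ranked[j].AllocatableRAM
		}
		return ranked[i].WarmthRank > ranked[j].WarmthRank
	})
	return ranked
}

func main() {
	candidates := []node{
		{"small-hot", 8, 2},
		{"mid-cold", 32, 0},
		{"mid-hot", 32, 2},
		{"large-cold", 64, 0},
	}
	for _, n := range rank(filterEligible(candidates, 16)) {
		fmt.Println(n.Name)
	}
}
```

Because warmth sits behind the RAM comparison, it can only reorder nodes that are otherwise tied, which is the property the safety contract describes.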

Quality gates

go build ./... ✓
go test ./... -count=1 ✓
go test -race ./... -count=1 ✓
gofmt -l . clean ✓
go vet ./... clean ✓
make coverage ✓ (knowledge 90.9% / api 80.9% / mcp 83.7% / ui 94.0% / total 69.1%)
./hack/verify-repo-truth.sh ✓
make build ✓ — binary exposes expires_at and default_keep_alive in axis facts --format json

Test coverage

11 new tests in internal/placement/warmth_test.go:

| Test | Locks in |
| --- | --- |
| `TestRankCandidatesWarmthLosesToAllocatableRAM` | small-hot loses to large-cold (warmth cannot promote undersized node) |
| `TestRankCandidatesWarmthBreaksTieOnEqualAllocatableRAM` | warmth resolves ties on equal RAM |
| `TestRankCandidatesWarmthFilteredBeforeRanking` | hot node with insufficient RAM is dropped by `FilterCandidates` |
| `TestWarmthToRankBoundaries` | strict `>` comparisons at 0.5 and 0.9 (exactly 0.5 is cold; exactly 0.9 is warm) |
| `TestModelWarmthRankPicksHighestRelevant` | ranks by highest matching model's warmth, not average |
| `TestModelWarmthRankIgnoresOtherRuntimes` | non-ollama warmth is ignored |
| `TestApplyOllamaWarmthTimeZero` | `time.Time{}` (omitted) → all cold |
| `TestApplyOllamaWarmthInFuturePopulates` | future `ExpiresAt` → `WarmthScore > 0` |
| `TestApplyOllamaWarmthPastExpiresAtIsCold` | past `ExpiresAt` → cold |
| `TestDefaultOllamaKeepAliveFallbacks` | empty/garbage/zero → 5m |
| `TestDefaultOllamaKeepAliveParses` | valid durations parse; 5m default when negative |
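The "highest relevant wins" and "other runtimes ignored" behaviors can be pictured with the sketch below. It is an illustration only: the `residentModel` fields, the `highestRelevantWarmth` name, and the idea that "relevant" means "model name is in the requested set" are all assumptions, and the bucketing step of the real `modelWarmthRank` is left out.

```go
package main

import "fmt"

// residentModel is a hypothetical, trimmed-down view of a resident model.
type residentModel struct {
	Runtime     string
	Name        string
	WarmthScore float64
}

// highestRelevantWarmth returns the maximum warmth among Ollama-hosted models
// whose name is in the wanted set. It takes the maximum rather than the
// average, and skips every non-ollama runtime.
func highestRelevantWarmth(models []residentModel, wanted map[string]bool) float64 {
	best := 0.0
	for _, m := range models {
		if m.Runtime != "ollama" || !wanted[m.Name] {
			continue
		}
		if m.WarmthScore > best {
			best = m.WarmthScore
		}
	}
	return best
}

func main() {
	models := []residentModel{
		{"ollama", "model-a", 0.30},
		{"ollama", "model-b", 0.95},
		{"llama-server", "model-a", 1.00}, // ignored: not ollama
		{"ollama", "model-c", 0.99},       // ignored: not requested
	}
	wanted := map[string]bool{"model-a": true, "model-b": true}
	fmt.Println(highestRelevantWarmth(models, wanted))
}
```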

Backward compatibility

All new fields are omitempty, so older /api/ps payloads and existing fixtures continue to unmarshal cleanly.
Older Ollama (<0.3.10) falls back to the awk parser; warmth for those nodes is 0 (cold), which is the correct conservative behavior.
The ranker comparator insert is additive — no existing comparator field is renamed, retyped, or removed.
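A small sketch of the unmarshal side of this compatibility claim: a payload without the new keys decodes into zero values, which downstream code treats as cold. The struct below is a hypothetical reduction of `ResidentModel`; the `expires_at` key appears in the PR text, while `name` and `warmth_score` are assumed JSON names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// residentModel is a hypothetical subset of the real type.
// Only "expires_at" is confirmed by the PR; the other tags are assumptions.
type residentModel struct {
	Name        string    `json:"name"`
	ExpiresAt   time.Time `json:"expires_at,omitempty"`
	WarmthScore float64   `json:"warmth_score,omitempty"`
}

// decode unmarshals one JSON object into a residentModel.
func decode(payload string) (residentModel, error) {
	var m residentModel
	err := json.Unmarshal([]byte(payload), &m)
	return m, err
}

func main() {
	// Older-style payload: the new keys are simply absent.
	older, err := decode(`{"name":"placeholder-model"}`)
	fmt.Println(older.ExpiresAt.IsZero(), older.WarmthScore, err)

	// Newer-style payload: expires_at is an RFC 3339 timestamp.
	newer, err := decode(`{"name":"placeholder-model","expires_at":"2026-06-02T12:05:00Z"}`)
	fmt.Println(newer.ExpiresAt.IsZero(), err)
}
```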

Notes

No public-repo-sensitive content (no real hostnames, IPs, SSH users, model names beyond generic placeholders, or per-host output) is included in this PR.
Author identity is the project's public-safe contributor handle.

gemini-code-assist

Code Review

This pull request introduces model warmth scoring and ranking for Ollama resident models. It updates the local discovery script to parse JSON from ollama ps -qq and retrieve the process-level default_keep_alive duration, which is then used to compute a continuous warmth score based on the model's expiration time. This score is integrated into the candidate ranking logic as a bounded tiebreaker. Feedback on the changes includes making the embedded Python script more robust against unexpected JSON structures or types, and handling bare integers representing seconds in DefaultOllamaKeepAlive to prevent parsing failures.

Promote the resident-model 'is loaded' boolean into a continuous 0.0-1.0 warmth score derived from Ollama's /api/ps expires_at and default_keep_alive. Warmth becomes a bounded tiebreaker at position 10 of the rank comparator, after RAM, GPU, pressure, and reservation ratio.

FilterCandidates is unchanged: warmth is consulted only among nodes that already passed eval.Eligible(). It cannot promote an undersized node, and the three-bucket discretization (cold/warm/hot at 0.5 and 0.9) keeps ranking stable.

Probe layer reads the new fields when Ollama 0.3.10+ is present ('ollama ps -qq' JSON path), and degrades gracefully to the existing awk parser on older Ollama - no expires_at is emitted in that case and WarmthScore remains 0 (cold). /api/ps is also queried for default_keep_alive, falling back to 5m when missing or unparseable.

Adds ResidentModel.ExpiresAt, ResidentModel.WarmthScore, and OllamaInfo.DefaultKeepAlive (all omitempty, additive JSON), plus ApplyOllamaWarmth / DefaultOllamaKeepAlive helpers in the facts layer and modelWarmthRank in the ranker.

Tests cover: warmth loses to allocatable RAM, warmth breaks ties on equal RAM, warmth is ignored when FilterCandidates rejects, boundary cases (0, 0.5, 0.51, 0.9, 0.91, 1.0), highest-relevant wins, other-runtime warmth is ignored, and time math for zero / future / past ExpiresAt.

…ing robust

Improve robustness of the Ollama resident model discovery script by safely handling non-list JSON formats and string-typed VRAM sizes. Handle bare-integer keep-alive strings in DefaultOllamaKeepAlive by appending the seconds unit ("s") before duration parsing.

Addresses review comments from gemini-code-assist[bot] on PR 151.

Co-Authored-By: Antigravity <noreply@gemini.google.com>
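The keep-alive handling described in this commit and in the test table can be sketched as below. This is an approximation, not the PR's `DefaultOllamaKeepAlive`: the name `parseKeepAlive` is hypothetical, and the body only encodes the behaviors the PR states (empty, garbage, zero, and negative values fall back to 5m; bare integers are read as seconds).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// fallbackKeepAlive mirrors the 5m default named in the PR.
const fallbackKeepAlive = 5 * time.Minute

// parseKeepAlive is a hypothetical sketch of the described parsing rules.
func parseKeepAlive(raw string) time.Duration {
	s := strings.TrimSpace(raw)
	if s == "" {
		return fallbackKeepAlive
	}
	// A bare integer such as "300" carries no unit; treat it as seconds.
	if _, err := strconv.Atoi(s); err == nil {
		s += "s"
	}
	d, err := time.ParseDuration(s)
	if err != nil || d <= 0 {
		// Unparseable, zero, or negative values all use the default.
		return fallbackKeepAlive
	}
	return d
}

func main() {
	for _, in := range []string{"", "garbage", "0", "-30s", "300", "10m"} {
		fmt.Printf("%q -> %s\n", in, parseKeepAlive(in))
	}
}
```

Appending the unit before calling `time.ParseDuration` keeps a single parsing path, since `time.ParseDuration` rejects a non-zero number that has no unit suffix.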

gemini-code-assist Bot reviewed Jun 2, 2026

Comment thread internal/facts/tools.go Outdated

Comment thread internal/facts/local.go

AXIS Contributor and others added 2 commits June 2, 2026 12:15

toasterbook88 force-pushed the feat/o9-ollama-warmth branch from bc78644 to 9ff97d6 Compare June 2, 2026 16:40

toasterbook88 merged commit 5089e7f into main Jun 2, 2026
8 checks passed

toasterbook88 deleted the feat/o9-ollama-warmth branch June 2, 2026 19:36

toasterbook88 merged 2 commits into main from feat/o9-ollama-warmth
