feat(doctor): surface embedding model health to user (#2474)#2674
Conversation
Adds a `check_embedding_model_health` diagnostic to the doctor
`run()` pipeline. When the configured embedding provider is Ollama, it:
1. Probes `GET <ollama_base_url>/api/tags` to verify daemon reachability.
2. Checks whether the configured embedding model (e.g. `bge-m3`) is
listed among installed models.
3. Emits PASS / FAIL `DiagnosticItem` with an actionable fix hint
("Run `ollama pull bge-m3`") on failure.
Uses `reqwest::blocking` with a 3 s timeout so `openhuman doctor`
does not stall on an unresponsive daemon. Non-Ollama providers (cloud,
custom) are skipped with an `Ok` item since they have no local daemon
to probe.
Resolves the first part of tinyhumansai#2474: users can now run `openhuman doctor`
to diagnose a silently-missing embedding model.
…inyhumansai#2474) Adds a new `DomainEvent::EmbeddingModelUnhealthy` variant (domain "memory") that carries: provider, model, fallback_provider, and a pre-formatted human-readable message with an actionable pull command. `report_ollama_health_gate_once` now: - Accepts the intended model name (threaded through from call sites) - Publishes `EmbeddingModelUnhealthy` via `publish_global` after reporting to Sentry, so UI subscribers can surface a notification. The event is published at most once per process (reusing the existing `OLLAMA_HEALTH_REPORTED` latch), consistent with the Sentry gate. `publish_global` is best-effort — no runtime present in test contexts drops the event silently. Log messages at the two fallback call sites now also include the model name for grep-friendly correlation.
…i#2474) `OllamaEmbedder::embed` previously called `anyhow::bail!` directly on a non-2xx status, so the actionable "model not installed — run `ollama pull <model>`" message only appeared if the caller happened to log the error (which most callers do at debug). Now `format_embedding_status_error` is captured before the bail and emitted with `log::warn!([embeddings] ...)` so missing-model failures surface in default-level traces without requiring debug logging.
The doctor `run()` function calls `check_embedding_model_health` which uses `reqwest::blocking::Client`. This panics when called from the async RPC handler because tokio forbids blocking I/O on the runtime thread. Move the entire sync `run()` onto `spawn_blocking`. Closes tinyhumansai#2474
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughAdds embedding-model health diagnostics and reporting: new DomainEvent::EmbeddingModelUnhealthy, a doctor check that probes Ollama /api/tags (3s timeout) and records Ok/Warn/Error, spawn_blocking for the doctor runner, model-aware health-gate publishing on Ollama failures, and improved Ollama embedder warning logs. ChangesEmbedding Model Health Monitoring
Sequence DiagramsequenceDiagram
participant DoctorOps
participant SpawnBlocking
participant DoctorRun
participant EmbeddingCheck
participant OllamaAPI
DoctorOps->>SpawnBlocking: spawn_blocking(doctor::run)
SpawnBlocking->>DoctorRun: doctor::run(config)
DoctorRun->>EmbeddingCheck: check_embedding_model_health(config)
EmbeddingCheck->>OllamaAPI: GET /api/tags (3s timeout)
OllamaAPI-->>EmbeddingCheck: models list or error
EmbeddingCheck-->>DoctorRun: DoctorItem (Ok/Warn/Error)
DoctorRun-->>SpawnBlocking: DoctorReport
SpawnBlocking-->>DoctorOps: RpcOutcome
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/openhuman/memory_store/factories.rs`:
- Around line 44-45: The logs emit raw base_url (variable base_url) which can
contain secrets; create a small redaction helper (e.g., redact_url(base_url:
&str) -> String) that parses the URL (using url::Url) and returns a string with
userinfo and query removed or replaced (e.g., remove username/password, drop
query, or replace host/path with "<redacted>"), then use that helper wherever
base_url is interpolated in the "[memory::factory] ollama..." log/report calls
(the occurrences that currently include base_url at the top-level factory
logging and the other two occurrences) so logs only include the redacted URL.
Ensure the helper is called in the places that construct those log messages
instead of using base_url directly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: bc8c3a0e-f25a-47c6-9123-456637fb0e85
📒 Files selected for processing (5)
src/core/event_bus/events.rssrc/openhuman/doctor/core.rssrc/openhuman/doctor/ops.rssrc/openhuman/memory_store/factories.rssrc/openhuman/memory_tree/score/embed/ollama.rs
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/openhuman/memory_store/factories.rs`:
- Around line 50-51: Log statements that currently interpolate raw Ollama URLs
(the `{url}` and `{base_url}` occurrences) can leak credentials from
OPENHUMAN_OLLAMA_BASE_URL; add a small helper (e.g., redact_ollama_url or
similar) that parses the URL and removes/hides userinfo (username:password) and
any query credentials, then replace direct interpolations of
`{url}`/`{base_url}` in the functions that emit those logs with the redacted
output; ensure all places that log the Ollama base URL — including the locations
that currently print `{url}` and `{base_url}` — use this helper before calling
processLogger/Sentry logging so no raw secrets are emitted.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 3bf17076-370f-444f-9761-e0b21b1cc5b7
📒 Files selected for processing (1)
src/openhuman/memory_store/factories.rs
0bdc0dc to
c7a0a78
Compare
M3gA-Mind
left a comment
There was a problem hiding this comment.
Clean feature addition with a well-targeted panic fix. A few things to address.
Issues
Missing tests for check_embedding_model_health
The function's HTTP paths are reasonably hard to unit-test (live daemon required), but the model-matching logic is pure and testable in isolation:
let model_base = model.split(':').next().unwrap_or(&model);
// … serde_json parse + .any(|entry| { … tag_base == model_base … })This branch — matching "bge-m3" against "bge-m3:latest" in the /api/tags response — is the exact case most likely to cause silent failures in production (users install with a tag, doctor says "not found"). Consider extracting it into a small fn model_matches(installed: &str, configured: &str) -> bool and adding a #[cfg(test)] mod tests block with bge-m3 / bge-m3:latest / nomic-embed-text:latest / bge-m3:v1.0 cases.
Dead unwrap_or branch
let model_base = model.split(':').next().unwrap_or(&model);split(':').next() always returns Some — there is always at least one segment before a :. The unwrap_or branch is unreachable. Use unwrap_or(&model) → .unwrap() or just model.split(':').next().unwrap_or(model.as_str()) to make the intent clear. Minor, but it's dead code in a security-adjacent path.
report_ollama_health_gate_once event publication not tested
The existing gate test verifies the bool return value (first call fires, subsequent calls suppress). It doesn't verify that the EmbeddingModelUnhealthy event is published. Testing event bus emission in unit tests is tricky (requires init_global), but the DEBUG log line is a reasonable proxy — at minimum, a comment in the test noting "event publication is fire-and-forget; verified manually" would prevent future readers from asking why it isn't covered.
spawn_blocking moves ALL doctor checks, not just the embedding probe
doctor::run() is a large sync function (env checks, file system probes, memory DB checks, now the Ollama probe). Wrapping the whole thing in spawn_blocking works correctly — none of that code needs the async runtime. But it means future synchronous checks added to doctor::run() will implicitly inherit this threading model. A doc comment on doctor::run() noting "runs in a blocking thread (via spawn_blocking in ops::doctor_report) — do not call .await here or add non-blocking async calls" would prevent accidental future breakage.
What's correct
spawn_blockingis the right fix for the panic.Config: Clone + Send + 'static(required for the closure) — theconfig.clone()before the closure correctly satisfies the borrow. ✓EmbeddingModelUnhealthyis in the"memory"domain bucket (matches thedomain()arm added at line 605). ✓report_ollama_health_gate_oncecorrectly updated at both call sites ineffective_embedding_settings_probedandcreate_memory_full. ✓publish_globalis infallible (drops event when no receivers registered); the fire-and-forget comment is accurate. ✓- Upgrading the 404 log in
ollama.rstolog::warn!beforeanyhow::bail!is correct — the warn fires before the bail, so the remediation hint appears in logs regardless of whether the caller swallows the error. ✓ - The 3-second timeout on the blocking probe is appropriate for a daemon on localhost. ✓
- The non-Ollama fast path (
if provider != "ollama" { items.push(ok(...)); return; }) correctly skips the probe for cloud/custom providers. ✓ - Existing factory gate tests updated with the new
modelargument. ✓ - CI green, no merge conflicts.
Add the model_matches unit test (or equivalent) and the doc note on doctor::run(), then this is ready.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/openhuman/memory_store/factories.rs (1)
156-176:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftThe health gate still misses the missing-model path.
A 2xx from
/api/tagsis treated as healthy without checking whether the configured model is actually present. If Ollama is up but the model is missing—the root-cause path from the linked issue—startup keeps Ollama selected, no cloud fallback happens, and noEmbeddingModelUnhealthyevent is emitted. Please gate on model presence here as well, ideally by reusing the/api/tagsmatching logic added insrc/openhuman/doctor/core.rs.Also applies to: 250-277, 382-399
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/memory_store/factories.rs` around lines 156 - 176, probe_ollama_reachable currently treats any 2xx from /api/tags as healthy; change it to also verify the configured model is present by parsing the /api/tags response and applying the same model-matching logic used in src/openhuman/doctor/core.rs (reuse that function or logic) instead of returning true immediately on success. Update probe_ollama_reachable (and the equivalent checks at the other mentioned locations) so after client.get(&url).send().await succeeds you deserialize the tags response, run the existing tag->model matching helper, and only return true when the configured model is found; if the model is missing, return false and ensure the caller emits the EmbeddingModelUnhealthy event as before.src/openhuman/doctor/core.rs (1)
883-965:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRedact the Ollama URL before logging or returning diagnostics.
ollama_base_url()can come fromOPENHUMAN_OLLAMA_BASE_URL, so these log lines and doctor messages can echo embedded credentials or query tokens back to logs and the UI. Please emit a sanitized host (or redacted URL) instead ofbase_url/tags_urlhere.As per coding guidelines, "Never log secrets or full PII; always redact sensitive data in debug logs".
♻️ Duplicate comments (1)
src/openhuman/memory_store/factories.rs (1)
48-60:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRe-redact the Ollama URL before sending observability data.
This path still builds the Sentry message from raw
base_url, and the surrounding debug/warn logs in this file interpolate raw{base_url}/{url}again. IfOPENHUMAN_OLLAMA_BASE_URLcarries credentials, they'll leak into logs/observability despite the newhost_tag.🔧 Suggested fix
- // Tags are indexed and grouped on; keep them low-cardinality and free of - // credentials. Full URL stays in the message body for diagnostics. + // Tags and message bodies must stay free of credentials. let host_tag = redact_ollama_host(base_url); let sentry_message = format!( - "ollama embeddings opted-in but daemon unreachable at {base_url}; falling back to cloud embeddings for this session" + "ollama embeddings opted-in but daemon unreachable at host={host_tag}; falling back to cloud embeddings for this session" );As per coding guidelines, "Never log secrets or full PII; always redact sensitive data in debug logs".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/memory_store/factories.rs` around lines 48 - 60, The sentry_message and nearby debug/warn logs still interpolate the raw base_url (or url) and can leak credentials; replace uses of raw base_url/url in the constructed sentry_message and any surrounding logs with the redacted value produced by redact_ollama_host(base_url) (e.g., use host_tag or call redact_ollama_host(url) instead of base_url), and ensure the same redacted value is passed to crate::core::observability::report_error_message so no unredacted URL appears in logs or observability.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/openhuman/doctor/core.rs`:
- Around line 979-985: The current model_matches(installed, configured) and
model_base(model) treat different explicit tags as equal; change model_matches
so it returns true if strings are identical, but if both installed and
configured contain an explicit tag (contain ':'), require exact match (return
false if they differ); only fall back to comparing model_base(installed) ==
model_base(configured) when at least one side is untagged. Keep model_base as
the helper to extract the base name.
---
Outside diff comments:
In `@src/openhuman/memory_store/factories.rs`:
- Around line 156-176: probe_ollama_reachable currently treats any 2xx from
/api/tags as healthy; change it to also verify the configured model is present
by parsing the /api/tags response and applying the same model-matching logic
used in src/openhuman/doctor/core.rs (reuse that function or logic) instead of
returning true immediately on success. Update probe_ollama_reachable (and the
equivalent checks at the other mentioned locations) so after
client.get(&url).send().await succeeds you deserialize the tags response, run
the existing tag->model matching helper, and only return true when the
configured model is found; if the model is missing, return false and ensure the
caller emits the EmbeddingModelUnhealthy event as before.
---
Duplicate comments:
In `@src/openhuman/memory_store/factories.rs`:
- Around line 48-60: The sentry_message and nearby debug/warn logs still
interpolate the raw base_url (or url) and can leak credentials; replace uses of
raw base_url/url in the constructed sentry_message and any surrounding logs with
the redacted value produced by redact_ollama_host(base_url) (e.g., use host_tag
or call redact_ollama_host(url) instead of base_url), and ensure the same
redacted value is passed to crate::core::observability::report_error_message so
no unredacted URL appears in logs or observability.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 6766f9b5-5678-41ae-b46f-7f3f04e17d68
📒 Files selected for processing (3)
src/openhuman/doctor/core.rssrc/openhuman/doctor/core_tests.rssrc/openhuman/memory_store/factories.rs
resolved this issue in further commits |
graycyrus
left a comment
There was a problem hiding this comment.
Solid work. This cleanly addresses the embedding model discoverability gap from #2474.
What this does well:
- The
spawn_blockingfix inops.rsis a real bug fix — callingreqwest::blocking::Clientinside an async runtime panics, and this was happening on everydoctor_reportRPC call. Good catch. check_embedding_model_healthis well-structured: non-Ollama providers get an earlyOk, Ollama gets a reachability probe then a model availability check, each with an actionable error message. The 3s timeout keepsopenhuman doctorsnappy.model_matchescorrectly handles the tagged-vs-untagged semantics —bge-m3:latestmatchesbge-m3, butbge-m3:v1.0won't falsely matchbge-m3:latest. Tests cover both positive and negative cases.- The
EmbeddingModelUnhealthydomain event gives the UI layer exactly what it needs to surface a notification without coupling to the health gate internals. - Upgrading the missing-model 404 from debug to WARN is a good call — users shouldn't need
--debugto see why their embeddings are failing.
Minor note (non-blocking): If Ollama's /api/tags returns valid HTTP but malformed JSON (or a schema change drops the models key), the doctor reports "model NOT installed" rather than "couldn't parse Ollama response." Unlikely in practice, but worth a comment if you're ever back in this code.
CodeRabbit's redaction findings were addressed in the follow-up commits. No additional issues from my side.
| Area | Files | Verdict |
|---|---|---|
| Event bus | events.rs |
Clean — new variant, correct category |
| Doctor core | core.rs, core_tests.rs |
Clean — good probe structure, tested |
| Doctor ops | ops.rs |
Bug fix — spawn_blocking is correct |
| Memory factory | factories.rs |
Clean — event emission is fire-and-forget |
| Embeddings | ollama.rs |
Clean — log level upgrade |
When /api/tags returns invalid JSON or is missing the `models` key, report that explicitly instead of falling through to "model NOT installed" which is misleading. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
ollama pull bge-m3").DomainEvent::EmbeddingModelUnhealthyon the event bus when the startup health gate triggers a cloud fallback — UI layer can now surface a notification.doctor_reportRPC:reqwest::blocking::Clientwas called inside the async runtime, now wrapped inspawn_blocking.Test plan
cargo check— cleancargo fmt --check— cleanopenhuman.doctor_reportwith Ollama down →[Error] Ollama daemon unreachable... Start Ollama, then run: ollama pull bge-m3openhuman.doctor_reportwith cloud provider →[Ok] embedding provider: cloud — no local daemon requiredbge-m3produces 1024-dim embeddings correctlyCloses #2474
Summary by CodeRabbit