Skip to content

feat(doctor): surface embedding model health to user (#2474)#2674

Merged
graycyrus merged 8 commits into
tinyhumansai:mainfrom
YellowSnnowmann:fix/2474-surface-embedding-health
May 27, 2026
Merged

feat(doctor): surface embedding model health to user (#2474)#2674
graycyrus merged 8 commits into
tinyhumansai:mainfrom
YellowSnnowmann:fix/2474-surface-embedding-health

Conversation

@YellowSnnowmann
Copy link
Copy Markdown
Contributor

@YellowSnnowmann YellowSnnowmann commented May 26, 2026

Summary

  • Adds embedding model health check to the doctor diagnostics system: probes Ollama reachability and model availability, reports actionable errors ("Run ollama pull bge-m3").
  • Emits DomainEvent::EmbeddingModelUnhealthy on the event bus when the startup health gate triggers a cloud fallback — UI layer can now surface a notification.
  • Upgrades missing-model 404 log from debug to WARN so it's visible without debug logging.
  • Fixes a panic in doctor_report RPC: reqwest::blocking::Client was called inside the async runtime, now wrapped in spawn_blocking.

Test plan

  • cargo check — clean
  • cargo fmt --check — clean
  • RPC-verified: openhuman.doctor_report with Ollama down[Error] Ollama daemon unreachable... Start Ollama, then run: ollama pull bge-m3
  • RPC-verified: openhuman.doctor_report with cloud provider → [Ok] embedding provider: cloud — no local daemon required
  • Confirmed panic fix: doctor_report no longer crashes on async context
  • Direct Ollama API test: bge-m3 produces 1024-dim embeddings correctly

Closes #2474

Summary by CodeRabbit

  • New Features
    • Added embedding-model health diagnostics that probe configured providers (including local daemons) and surface provider/model status with user-facing guidance when unavailable.
  • Bug Fixes
    • Improved and consolidated warning/error messages for embedding failures and ensured duplicate health alerts are suppressed to one actionable notification.
  • Tests
    • Added unit tests validating model name and tag matching behavior.

Review Change Stack

Adds a `check_embedding_model_health` diagnostic to the doctor
`run()` pipeline. When the configured embedding provider is Ollama, it:

1. Probes `GET <ollama_base_url>/api/tags` to verify daemon reachability.
2. Checks whether the configured embedding model (e.g. `bge-m3`) is
   listed among installed models.
3. Emits PASS / FAIL `DiagnosticItem` with an actionable fix hint
   ("Run `ollama pull bge-m3`") on failure.

Uses `reqwest::blocking` with a 3 s timeout so `openhuman doctor`
does not stall on an unresponsive daemon. Non-Ollama providers (cloud,
custom) are skipped with an `Ok` item since they have no local daemon
to probe.

Resolves the first part of tinyhumansai#2474: users can now run `openhuman doctor`
to diagnose a silently-missing embedding model.
…inyhumansai#2474)

Adds a new `DomainEvent::EmbeddingModelUnhealthy` variant (domain
"memory") that carries: provider, model, fallback_provider, and a
pre-formatted human-readable message with an actionable pull command.

`report_ollama_health_gate_once` now:
- Accepts the intended model name (threaded through from call sites)
- Publishes `EmbeddingModelUnhealthy` via `publish_global` after
  reporting to Sentry, so UI subscribers can surface a notification.

The event is published at most once per process (reusing the existing
`OLLAMA_HEALTH_REPORTED` latch), consistent with the Sentry gate.
`publish_global` is best-effort — no runtime present in test contexts
drops the event silently.

Log messages at the two fallback call sites now also include the model
name for grep-friendly correlation.
…i#2474)

`OllamaEmbedder::embed` previously called `anyhow::bail!` directly on
a non-2xx status, so the actionable "model not installed — run
`ollama pull <model>`" message only appeared if the caller happened to
log the error (which most callers do at debug).

Now `format_embedding_status_error` is captured before the bail and
emitted with `log::warn!([embeddings] ...)` so missing-model failures
surface in default-level traces without requiring debug logging.
The doctor `run()` function calls `check_embedding_model_health` which
uses `reqwest::blocking::Client`. This panics when called from the
async RPC handler because tokio forbids blocking I/O on the runtime
thread. Move the entire sync `run()` onto `spawn_blocking`.

Closes tinyhumansai#2474
@YellowSnnowmann YellowSnnowmann requested a review from a team May 26, 2026 09:13
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 26, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 578ebaf7-4fcd-40d8-8d64-01064eb91564

📥 Commits

Reviewing files that changed from the base of the PR and between 2d2b681 and ec16011.

📒 Files selected for processing (2)
  • src/openhuman/doctor/core.rs
  • src/openhuman/doctor/core_tests.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/openhuman/doctor/core.rs

📝 Walkthrough

Walkthrough

Adds embedding-model health diagnostics and reporting: new DomainEvent::EmbeddingModelUnhealthy, a doctor check that probes Ollama /api/tags (3s timeout) and records Ok/Warn/Error, spawn_blocking for the doctor runner, model-aware health-gate publishing on Ollama failures, and improved Ollama embedder warning logs.

Changes

Embedding Model Health Monitoring

Layer / File(s) Summary
EmbeddingModelUnhealthy event type
src/core/event_bus/events.rs
Adds DomainEvent::EmbeddingModelUnhealthy { provider, model, fallback_provider, message } and routes it to the "memory" domain.
Doctor embedding health check
src/openhuman/doctor/core.rs, src/openhuman/doctor/ops.rs
Adds check_embedding_model_health (probes Ollama /api/tags with 3s timeout, records Ok/Warn/Error and supports base-name model matches), integrates it into doctor::run, and runs doctor::run inside tokio::task::spawn_blocking. Includes model_matches helpers and unit tests.
Health gate event publishing and tests
src/openhuman/memory_store/factories.rs
report_ollama_health_gate_once now accepts (base_url, model), logs host+model context, publishes DomainEvent::EmbeddingModelUnhealthy with provider/model/fallback/message, and unit test updated to verify once-per-process suppression.
Ollama embedder warning logging
src/openhuman/memory_tree/score/embed/ollama.rs
Formats non-success HTTP responses into msg, logs WARN with that message (including ollama pull <model> hint when applicable), then bails with the formatted message.

Sequence Diagram

sequenceDiagram
  participant DoctorOps
  participant SpawnBlocking
  participant DoctorRun
  participant EmbeddingCheck
  participant OllamaAPI
  DoctorOps->>SpawnBlocking: spawn_blocking(doctor::run)
  SpawnBlocking->>DoctorRun: doctor::run(config)
  DoctorRun->>EmbeddingCheck: check_embedding_model_health(config)
  EmbeddingCheck->>OllamaAPI: GET /api/tags (3s timeout)
  OllamaAPI-->>EmbeddingCheck: models list or error
  EmbeddingCheck-->>DoctorRun: DoctorItem (Ok/Warn/Error)
  DoctorRun-->>SpawnBlocking: DoctorReport
  SpawnBlocking-->>DoctorOps: RpcOutcome
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • tinyhumansai/openhuman#2533: Centralizes and improves Ollama error message formatting and failure guidance in the same embedder file.

Suggested reviewers

  • senamakel
  • M3gA-Mind
  • graycyrus

Poem

🐰 I sniffed the models, checked every tag,

The daemon was silent, my ears went zag.
I sounded the gate, sent a tiny alarm,
Fell back to the cloud to keep things from harm.
Hop, patch, and log — now the system's a charm. 🥕

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main change: adding embedding model health checking to the doctor diagnostics system to surface issues to the user.
Linked Issues check ✅ Passed The PR directly addresses the root cause identified in #2474: implementing embedding model health checks, publishing health events, and providing actionable remediation guidance.
Out of Scope Changes check ✅ Passed All changes are scoped to implementing the embedding model health diagnostics feature: event definitions, health-check probes, event publishing, and supporting logging/timeout improvements.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot added feature Net-new user-facing capability or product behavior. memory Memory store, memory tree, recall, summarization, and embeddings in src/openhuman/memory/. labels May 26, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/memory_store/factories.rs`:
- Around line 44-45: The logs emit raw base_url (variable base_url) which can
contain secrets; create a small redaction helper (e.g., redact_url(base_url:
&str) -> String) that parses the URL (using url::Url) and returns a string with
userinfo and query removed or replaced (e.g., remove username/password, drop
query, or replace host/path with "<redacted>"), then use that helper wherever
base_url is interpolated in the "[memory::factory] ollama..." log/report calls
(the occurrences that currently include base_url at the top-level factory
logging and the other two occurrences) so logs only include the redacted URL.
Ensure the helper is called in the places that construct those log messages
instead of using base_url directly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bc8c3a0e-f25a-47c6-9123-456637fb0e85

📥 Commits

Reviewing files that changed from the base of the PR and between de2a78a and c7a0a78.

📒 Files selected for processing (5)
  • src/core/event_bus/events.rs
  • src/openhuman/doctor/core.rs
  • src/openhuman/doctor/ops.rs
  • src/openhuman/memory_store/factories.rs
  • src/openhuman/memory_tree/score/embed/ollama.rs

Comment thread src/openhuman/memory_store/factories.rs
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/memory_store/factories.rs`:
- Around line 50-51: Log statements that currently interpolate raw Ollama URLs
(the `{url}` and `{base_url}` occurrences) can leak credentials from
OPENHUMAN_OLLAMA_BASE_URL; add a small helper (e.g., redact_ollama_url or
similar) that parses the URL and removes/hides userinfo (username:password) and
any query credentials, then replace direct interpolations of
`{url}`/`{base_url}` in the functions that emit those logs with the redacted
output; ensure all places that log the Ollama base URL — including the locations
that currently print `{url}` and `{base_url}` — use this helper before calling
processLogger/Sentry logging so no raw secrets are emitted.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3bf17076-370f-444f-9761-e0b21b1cc5b7

📥 Commits

Reviewing files that changed from the base of the PR and between c7a0a78 and 6e84d09.

📒 Files selected for processing (1)
  • src/openhuman/memory_store/factories.rs

Comment thread src/openhuman/memory_store/factories.rs Outdated
@YellowSnnowmann YellowSnnowmann force-pushed the fix/2474-surface-embedding-health branch from 0bdc0dc to c7a0a78 Compare May 26, 2026 09:52
Copy link
Copy Markdown
Contributor

@M3gA-Mind M3gA-Mind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean feature addition with a well-targeted panic fix. A few things to address.

Issues

Missing tests for check_embedding_model_health

The function's HTTP paths are reasonably hard to unit-test (live daemon required), but the model-matching logic is pure and testable in isolation:

let model_base = model.split(':').next().unwrap_or(&model);
// … serde_json parse + .any(|entry| { … tag_base == model_base … })

This branch — matching "bge-m3" against "bge-m3:latest" in the /api/tags response — is the exact case most likely to cause silent failures in production (users install with a tag, doctor says "not found"). Consider extracting it into a small fn model_matches(installed: &str, configured: &str) -> bool and adding a #[cfg(test)] mod tests block with bge-m3 / bge-m3:latest / nomic-embed-text:latest / bge-m3:v1.0 cases.

Dead unwrap_or branch

let model_base = model.split(':').next().unwrap_or(&model);

split(':').next() always returns Some — there is always at least one segment before a :. The unwrap_or branch is unreachable. Use unwrap_or(&model).unwrap() or just model.split(':').next().unwrap_or(model.as_str()) to make the intent clear. Minor, but it's dead code in a security-adjacent path.

report_ollama_health_gate_once event publication not tested

The existing gate test verifies the bool return value (first call fires, subsequent calls suppress). It doesn't verify that the EmbeddingModelUnhealthy event is published. Testing event bus emission in unit tests is tricky (requires init_global), but the DEBUG log line is a reasonable proxy — at minimum, a comment in the test noting "event publication is fire-and-forget; verified manually" would prevent future readers from asking why it isn't covered.

spawn_blocking moves ALL doctor checks, not just the embedding probe

doctor::run() is a large sync function (env checks, file system probes, memory DB checks, now the Ollama probe). Wrapping the whole thing in spawn_blocking works correctly — none of that code needs the async runtime. But it means future synchronous checks added to doctor::run() will implicitly inherit this threading model. A doc comment on doctor::run() noting "runs in a blocking thread (via spawn_blocking in ops::doctor_report) — do not call .await here or add non-blocking async calls" would prevent accidental future breakage.

What's correct

  • spawn_blocking is the right fix for the panic. Config: Clone + Send + 'static (required for the closure) — the config.clone() before the closure correctly satisfies the borrow. ✓
  • EmbeddingModelUnhealthy is in the "memory" domain bucket (matches the domain() arm added at line 605). ✓
  • report_ollama_health_gate_once correctly updated at both call sites in effective_embedding_settings_probed and create_memory_full. ✓
  • publish_global is infallible (drops event when no receivers registered); the fire-and-forget comment is accurate. ✓
  • Upgrading the 404 log in ollama.rs to log::warn! before anyhow::bail! is correct — the warn fires before the bail, so the remediation hint appears in logs regardless of whether the caller swallows the error. ✓
  • The 3-second timeout on the blocking probe is appropriate for a daemon on localhost. ✓
  • The non-Ollama fast path (if provider != "ollama" { items.push(ok(...)); return; }) correctly skips the probe for cloud/custom providers. ✓
  • Existing factory gate tests updated with the new model argument. ✓
  • CI green, no merge conflicts.

Add the model_matches unit test (or equivalent) and the doc note on doctor::run(), then this is ready.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/openhuman/memory_store/factories.rs (1)

156-176: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

The health gate still misses the missing-model path.

A 2xx from /api/tags is treated as healthy without checking whether the configured model is actually present. If Ollama is up but the model is missing—the root-cause path from the linked issue—startup keeps Ollama selected, no cloud fallback happens, and no EmbeddingModelUnhealthy event is emitted. Please gate on model presence here as well, ideally by reusing the /api/tags matching logic added in src/openhuman/doctor/core.rs.

Also applies to: 250-277, 382-399

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/memory_store/factories.rs` around lines 156 - 176,
probe_ollama_reachable currently treats any 2xx from /api/tags as healthy;
change it to also verify the configured model is present by parsing the
/api/tags response and applying the same model-matching logic used in
src/openhuman/doctor/core.rs (reuse that function or logic) instead of returning
true immediately on success. Update probe_ollama_reachable (and the equivalent
checks at the other mentioned locations) so after client.get(&url).send().await
succeeds you deserialize the tags response, run the existing tag->model matching
helper, and only return true when the configured model is found; if the model is
missing, return false and ensure the caller emits the EmbeddingModelUnhealthy
event as before.
src/openhuman/doctor/core.rs (1)

883-965: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Redact the Ollama URL before logging or returning diagnostics.

ollama_base_url() can come from OPENHUMAN_OLLAMA_BASE_URL, so these log lines and doctor messages can echo embedded credentials or query tokens back to logs and the UI. Please emit a sanitized host (or redacted URL) instead of base_url / tags_url here.

As per coding guidelines, "Never log secrets or full PII; always redact sensitive data in debug logs".

♻️ Duplicate comments (1)
src/openhuman/memory_store/factories.rs (1)

48-60: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Re-redact the Ollama URL before sending observability data.

This path still builds the Sentry message from raw base_url, and the surrounding debug/warn logs in this file interpolate raw {base_url} / {url} again. If OPENHUMAN_OLLAMA_BASE_URL carries credentials, they'll leak into logs/observability despite the new host_tag.

🔧 Suggested fix
-    // Tags are indexed and grouped on; keep them low-cardinality and free of
-    // credentials. Full URL stays in the message body for diagnostics.
+    // Tags and message bodies must stay free of credentials.
     let host_tag = redact_ollama_host(base_url);
     let sentry_message = format!(
-        "ollama embeddings opted-in but daemon unreachable at {base_url}; falling back to cloud embeddings for this session"
+        "ollama embeddings opted-in but daemon unreachable at host={host_tag}; falling back to cloud embeddings for this session"
     );

As per coding guidelines, "Never log secrets or full PII; always redact sensitive data in debug logs".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/memory_store/factories.rs` around lines 48 - 60, The
sentry_message and nearby debug/warn logs still interpolate the raw base_url (or
url) and can leak credentials; replace uses of raw base_url/url in the
constructed sentry_message and any surrounding logs with the redacted value
produced by redact_ollama_host(base_url) (e.g., use host_tag or call
redact_ollama_host(url) instead of base_url), and ensure the same redacted value
is passed to crate::core::observability::report_error_message so no unredacted
URL appears in logs or observability.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/doctor/core.rs`:
- Around line 979-985: The current model_matches(installed, configured) and
model_base(model) treat different explicit tags as equal; change model_matches
so it returns true if strings are identical, but if both installed and
configured contain an explicit tag (contain ':'), require exact match (return
false if they differ); only fall back to comparing model_base(installed) ==
model_base(configured) when at least one side is untagged. Keep model_base as
the helper to extract the base name.

---

Outside diff comments:
In `@src/openhuman/memory_store/factories.rs`:
- Around line 156-176: probe_ollama_reachable currently treats any 2xx from
/api/tags as healthy; change it to also verify the configured model is present
by parsing the /api/tags response and applying the same model-matching logic
used in src/openhuman/doctor/core.rs (reuse that function or logic) instead of
returning true immediately on success. Update probe_ollama_reachable (and the
equivalent checks at the other mentioned locations) so after
client.get(&url).send().await succeeds you deserialize the tags response, run
the existing tag->model matching helper, and only return true when the
configured model is found; if the model is missing, return false and ensure the
caller emits the EmbeddingModelUnhealthy event as before.

---

Duplicate comments:
In `@src/openhuman/memory_store/factories.rs`:
- Around line 48-60: The sentry_message and nearby debug/warn logs still
interpolate the raw base_url (or url) and can leak credentials; replace uses of
raw base_url/url in the constructed sentry_message and any surrounding logs with
the redacted value produced by redact_ollama_host(base_url) (e.g., use host_tag
or call redact_ollama_host(url) instead of base_url), and ensure the same
redacted value is passed to crate::core::observability::report_error_message so
no unredacted URL appears in logs or observability.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6766f9b5-5678-41ae-b46f-7f3f04e17d68

📥 Commits

Reviewing files that changed from the base of the PR and between 6e84d09 and 2d2b681.

📒 Files selected for processing (3)
  • src/openhuman/doctor/core.rs
  • src/openhuman/doctor/core_tests.rs
  • src/openhuman/memory_store/factories.rs

Comment thread src/openhuman/doctor/core.rs
@YellowSnnowmann
Copy link
Copy Markdown
Contributor Author

YellowSnnowmann commented May 26, 2026

Clean feature addition with a well-targeted panic fix. A few things to address.

Issues

Missing tests for check_embedding_model_health

The function's HTTP paths are reasonably hard to unit-test (live daemon required), but the model-matching logic is pure and testable in isolation:

let model_base = model.split(':').next().unwrap_or(&model);
// … serde_json parse + .any(|entry| { … tag_base == model_base … })

This branch — matching "bge-m3" against "bge-m3:latest" in the /api/tags response — is the exact case most likely to cause silent failures in production (users install with a tag, doctor says "not found"). Consider extracting it into a small fn model_matches(installed: &str, configured: &str) -> bool and adding a #[cfg(test)] mod tests block with bge-m3 / bge-m3:latest / nomic-embed-text:latest / bge-m3:v1.0 cases.

Dead unwrap_or branch

let model_base = model.split(':').next().unwrap_or(&model);

split(':').next() always returns Some — there is always at least one segment before a :. The unwrap_or branch is unreachable. Use unwrap_or(&model).unwrap() or just model.split(':').next().unwrap_or(model.as_str()) to make the intent clear. Minor, but it's dead code in a security-adjacent path.

report_ollama_health_gate_once event publication not tested

The existing gate test verifies the bool return value (first call fires, subsequent calls suppress). It doesn't verify that the EmbeddingModelUnhealthy event is published. Testing event bus emission in unit tests is tricky (requires init_global), but the DEBUG log line is a reasonable proxy — at minimum, a comment in the test noting "event publication is fire-and-forget; verified manually" would prevent future readers from asking why it isn't covered.

spawn_blocking moves ALL doctor checks, not just the embedding probe

doctor::run() is a large sync function (env checks, file system probes, memory DB checks, now the Ollama probe). Wrapping the whole thing in spawn_blocking works correctly — none of that code needs the async runtime. But it means future synchronous checks added to doctor::run() will implicitly inherit this threading model. A doc comment on doctor::run() noting "runs in a blocking thread (via spawn_blocking in ops::doctor_report) — do not call .await here or add non-blocking async calls" would prevent accidental future breakage.

What's correct

  • spawn_blocking is the right fix for the panic. Config: Clone + Send + 'static (required for the closure) — the config.clone() before the closure correctly satisfies the borrow. ✓
  • EmbeddingModelUnhealthy is in the "memory" domain bucket (matches the domain() arm added at line 605). ✓
  • report_ollama_health_gate_once correctly updated at both call sites in effective_embedding_settings_probed and create_memory_full. ✓
  • publish_global is infallible (drops event when no receivers registered); the fire-and-forget comment is accurate. ✓
  • Upgrading the 404 log in ollama.rs to log::warn! before anyhow::bail! is correct — the warn fires before the bail, so the remediation hint appears in logs regardless of whether the caller swallows the error. ✓
  • The 3-second timeout on the blocking probe is appropriate for a daemon on localhost. ✓
  • The non-Ollama fast path (if provider != "ollama" { items.push(ok(...)); return; }) correctly skips the probe for cloud/custom providers. ✓
  • Existing factory gate tests updated with the new model argument. ✓
  • CI green, no merge conflicts.

Add the model_matches unit test (or equivalent) and the doc note on doctor::run(), then this is ready.

resolved this issue in further commits
2d2b681

graycyrus
graycyrus previously approved these changes May 26, 2026
Copy link
Copy Markdown
Contributor

@graycyrus graycyrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solid work. This cleanly addresses the embedding model discoverability gap from #2474.

What this does well:

  • The spawn_blocking fix in ops.rs is a real bug fix — calling reqwest::blocking::Client inside an async runtime panics, and this was happening on every doctor_report RPC call. Good catch.
  • check_embedding_model_health is well-structured: non-Ollama providers get an early Ok, Ollama gets a reachability probe then a model availability check, each with an actionable error message. The 3s timeout keeps openhuman doctor snappy.
  • model_matches correctly handles the tagged-vs-untagged semantics — bge-m3:latest matches bge-m3, but bge-m3:v1.0 won't falsely match bge-m3:latest. Tests cover both positive and negative cases.
  • The EmbeddingModelUnhealthy domain event gives the UI layer exactly what it needs to surface a notification without coupling to the health gate internals.
  • Upgrading the missing-model 404 from debug to WARN is a good call — users shouldn't need --debug to see why their embeddings are failing.

Minor note (non-blocking): If Ollama's /api/tags returns valid HTTP but malformed JSON (or a schema change drops the models key), the doctor reports "model NOT installed" rather than "couldn't parse Ollama response." Unlikely in practice, but worth a comment if you're ever back in this code.

CodeRabbit's redaction findings were addressed in the follow-up commits. No additional issues from my side.

Area Files Verdict
Event bus events.rs Clean — new variant, correct category
Doctor core core.rs, core_tests.rs Clean — good probe structure, tested
Doctor ops ops.rs Bug fix — spawn_blocking is correct
Memory factory factories.rs Clean — event emission is fire-and-forget
Embeddings ollama.rs Clean — log level upgrade

When /api/tags returns invalid JSON or is missing the `models` key,
report that explicitly instead of falling through to "model NOT
installed" which is misleading.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai coderabbitai Bot added the rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. label May 27, 2026
@graycyrus graycyrus merged commit e2601a2 into tinyhumansai:main May 27, 2026
40 of 45 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature Net-new user-facing capability or product behavior. memory Memory store, memory tree, recall, summarization, and embeddings in src/openhuman/memory/. rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure.

Projects

None yet

3 participants