test: unblock cargo test --lib on main (3 stale test fixtures) by oxoxDev · Pull Request #2710 · tinyhumansai/openhuman

oxoxDev · 2026-05-26T18:58:24Z

Summary

Unblock cargo test --lib on main by fixing 3 test-side regressions that shipped onto main without their tests being updated.
Compile fix: the ops_tests.rs:631 AutonomySettingsPatch literal was missing the fields added in #2486.
Behavior-update fix #1: the factory_test.rs:768 test was iterating hint:summarization as an unknown-passthrough case after #2690 made it canonical.
Behavior-update fix #2: the chat.rs:218,226 test assertions hard-coded reasoning-v1 even though build_chat_runtime now resolves to summarization-v1 by default after #2690.

Problem

cargo test --lib is red on upstream/main across the Rust Core Coverage, Rust Core Tests + Quality, and Rust Core Tests (Windows — secrets ACL) workflows. Every open PR inherits these failures regardless of what it touches.

Root cause: two recently-merged PRs shipped code changes without updating their tests in the same change.

Failure 1 — `ops_tests.rs:631` (compile-time, hidden by silence)

error[E0063]: missing fields `allow_tool_install`, `allowed_commands`,
  `forbidden_paths` and 3 other fields in initializer of
  `openhuman::config::ops::AutonomySettingsPatch`
  --> src/openhuman/config/ops_tests.rs:631:9

AutonomySettingsPatch gained six Option<…> fields (level, workspace_only, allowed_commands, forbidden_paths, trusted_roots, allow_tool_install) via #2486 (feat(app): UI control for max_actions_per_hour). The test literal at line 631 still set only max_actions_per_hour. All other literals in the same file were updated; this one was missed.
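
For readers unfamiliar with E0063, the failure and its fix can be reproduced with a stand-in type. This is a minimal sketch, not the real struct: `PatchSketch`, `budget_only_patch`, and the field types are assumptions made for illustration, since the actual field types of `AutonomySettingsPatch` are not shown in this PR.

```rust
/// Hypothetical stand-in for `AutonomySettingsPatch`.
/// Only three fields are modeled and their types are assumed.
#[derive(Debug, Default, PartialEq)]
pub struct PatchSketch {
    pub max_actions_per_hour: Option<u32>,
    pub workspace_only: Option<bool>,
    pub allowed_commands: Option<Vec<String>>,
}

/// Builds a patch that sets only the action budget.
pub fn budget_only_patch(limit: u32) -> PatchSketch {
    PatchSketch {
        max_actions_per_hour: Some(limit),
        // Without this line the literal names one of three fields and
        // rustc rejects it with E0063 (missing fields in initializer).
        // With it, every unnamed field takes its `Default` value (`None`),
        // and the literal keeps compiling when more fields are added.
        ..Default::default()
    }
}
```

Because every unnamed field is an `Option`, the struct-update form leaves them as `None`, which matches the "absent" meaning a patch type usually wants.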

Failure 2 — `factory_test.rs:768` (assertion)

thread '…::make_openhuman_backend_forwards_unknown_hint_verbatim' panicked at
  src/openhuman/inference/provider/factory_test.rs:768:9:
  assertion `left == right` failed: hint 'hint:summarization' should pass through unchanged
    left: "summarization-v1"
   right: "hint:summarization"

#2690 (fix: … summarization-v1 tier …) added summarization-v1 as a canonical model tier. is_known_openhuman_tier (src/openhuman/inference/provider/factory.rs:79-98) explicitly includes "hint:summarization", so the factory now translates that hint to summarization-v1 rather than forwarding it verbatim. The test was iterating hint:summarization as an "unknown" passthrough case — that classification is no longer correct.
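
The behavior change can be pictured with a small resolver. The sketch below is an illustration only: `resolve_model_sketch` is a hypothetical function, not the factory's real code, and it models just the one canonical hint this PR discusses. The real `is_known_openhuman_tier` presumably recognizes more values than are shown here.

```rust
/// Hypothetical model of the factory's hint handling after #2690.
/// Only `hint:summarization` is modeled as canonical, because that is
/// the only mapping this PR confirms.
pub fn resolve_model_sketch(requested: &str) -> String {
    match requested {
        // Canonical hint: translated to its tier name.
        "hint:summarization" => "summarization-v1".to_string(),
        // Anything unrecognized is forwarded verbatim.
        other => other.to_string(),
    }
}
```

Under this model, a passthrough test that still lists `hint:summarization` fails exactly as in the panic above: the left side is the translated tier, the right side is the raw hint.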

Failure 3 — `chat.rs:218 / chat.rs:226` (assertion)

thread '…::build_chat_runtime_defaults_to_openhuman_resolved_model' panicked at
  src/openhuman/memory/chat.rs:218:9:
  assertion `left == right` failed
    left: "summarization-v1"
   right: "reasoning-v1"

build_chat_runtime calls create_chat_provider("summarization", …) (src/openhuman/memory/chat.rs:127-147). After #2690 the "summarization" workload routes to summarization-v1 by default. The first assertion still expected reasoning-v1. Same file at line 226 (build_chat_runtime_still_builds_when_cloud_memory_model_is_overridden) is a sister case — but when memory_tree.cloud_llm_model is overridden, the routing actually does fall back to reasoning-v1. Both tests are updated to assert the value the function returns under each scenario.
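
The two scenarios can be summarized as a tiny decision function. This sketch encodes only the two outcomes reported in this PR, not the actual routing code: `expected_chat_model_sketch` and its parameter are hypothetical names, and the parameter merely stands in for whether `memory_tree.cloud_llm_model` is set.

```rust
/// Hypothetical summary of the model that the two `build_chat_runtime`
/// tests expect, keyed on whether the cloud memory model is overridden.
pub fn expected_chat_model_sketch(cloud_llm_model_override: Option<&str>) -> &'static str {
    match cloud_llm_model_override {
        // Override present: per this PR, routing falls back to reasoning-v1.
        Some(_) => "reasoning-v1",
        // No override: the "summarization" workload resolves to its own tier.
        None => "summarization-v1",
    }
}
```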

Solution

Three micro-commits, each cargo check-clean independently:

fix(config/tests): init AutonomySettingsPatch with Default-padded fields — single-line ..Default::default() addition at ops_tests.rs:632. AutonomySettingsPatch derives Default, so the other Option<…> fields keep their previous (absent → None) value.
test(inference): drop hint:summarization from passthrough loop — remove the canonical hint from the iteration and tighten the comment.
test(memory/chat): update expected default model after summarization-v1 — flip the two stale assertions (reasoning-v1 → summarization-v1 for the default; reasoning-v1 is correct for the cloud-override case).

No production code changes. Three test files only.

Submission Checklist

cargo check --tests --lib clean (was E0063 before)
cargo test --lib openhuman::inference::provider::factory clean
cargo test --lib openhuman::memory::chat::tests::build_chat_runtime — 3 / 3 pass
cargo fmt --all -- --check clean
N/A: i18n strings — non-UI change
N/A: frontend typecheck / lint / vitest — no TS surface touched
N/A: docs / changelog — pure test compile + assertion fix
N/A: tests added — this PR restores existing tests; it doesn't change production behavior so no new assertions are warranted

Known still-flaky tests (out of scope for this PR)

A full local cargo test --lib run also fails two tests that pass when run in isolation:

openhuman::memory::ops::documents::tests::envelope_memory_handlers_report_counts_and_statuses surfaces memory_init: "disk I/O error" from sqlite when prior tests in the suite have run and contaminated process-global state. GLOBAL_MEMORY_TEST_LOCK already exists (added in #2649, test(memory): serialize tests that drive the process-global memory client) for cross-test serialization, but the sqlite open path appears to use OnceLock-style caching that breaks when tempdirs are recycled across tests. The same pattern was flagged in earlier memory work.
openhuman::composio::action_tool::tests::mode_toggle_between_calls_is_observed asserts that a direct-mode tool error must NOT contain "no backend session", but under full-suite ordering a prior test leaves a cached backend client visible to this one. Same isolation issue, different domain.

Both pass in isolation:

cargo test --lib envelope_memory_handlers_report_counts_and_statuses → ok
cargo test --lib mode_toggle_between_calls_is_observed → ok

These are pre-existing test-isolation bugs (process-global state across tests), not regressions caused by this PR. Worth a follow-up that audits per-test reset hooks for the memory and composio singletons (RwLock<Option<Arc>> + reset_for_tests pattern is the usual fix shape).
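
A follow-up along those lines might look roughly like the sketch below. It is an assumption-laden illustration of the `RwLock<Option<Arc<_>>>` plus `reset_for_tests` shape, not code from this repository: `ClientSketch`, `CLIENT`, and `client_for` are hypothetical names, and the real singletons may be initialized differently.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical client standing in for the memory or composio singleton.
#[derive(Debug)]
pub struct ClientSketch {
    pub root: String,
}

/// Process-global slot. Unlike a `OnceLock`, it can be cleared and
/// re-initialized, which is what per-test isolation needs.
static CLIENT: RwLock<Option<Arc<ClientSketch>>> = RwLock::new(None);

/// Returns the cached client, creating it from `root` on first use.
/// Later calls ignore `root` and return whatever is cached.
pub fn client_for(root: &str) -> Arc<ClientSketch> {
    let mut slot = CLIENT.write().unwrap();
    slot.get_or_insert_with(|| Arc::new(ClientSketch { root: root.to_string() }))
        .clone()
}

/// Drops the cached client so the next caller starts from a clean slate.
pub fn reset_for_tests() {
    *CLIENT.write().unwrap() = None;
}
```

The stale-cache symptom shows up when a second test passes a fresh tempdir and still receives the first test's client. Calling `reset_for_tests()` at the start of each test (while holding whatever serialization lock the suite already uses) removes that carry-over.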

Impact

Unblocks the cargo test --lib matrix for every open PR (we hit these failures on #2708 and #2709 immediately after pushing).
Zero production behavior change — only test code touched.

Related

Test-side fallout from #2486 (AutonomySettingsPatch field additions) and #2690 (summarization-v1 tier).
Surfaced while running CI on #2708 (wallet quote owner binding) and #2709 (in-process core token transport).

Summary by CodeRabbit

Release Notes

Tests
- Updated test coverage for hint handling and workload routing behavior to accurately reflect current system configurations.

The literal at apply_autonomy_settings_updates_action_budget set only `max_actions_per_hour`, but the patch struct gained six more Option fields (level, workspace_only, allowed_commands, forbidden_paths, trusted_roots, allow_tool_install) and broke `cargo test --lib` on main with E0063. Add `..Default::default()` so the literal stays forward-compatible with field additions.

coderabbitai · 2026-05-26T18:58:40Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e365fa5f-ed06-4383-a665-da3e68848a70

📥 Commits

Reviewing files that changed from the base of the PR and between f03a6d1 and 44dabd7.

📒 Files selected for processing (2)

src/openhuman/inference/provider/factory_test.rs
src/openhuman/memory/chat.rs

✅ Files skipped from review due to trivial changes (1)

src/openhuman/memory/chat.rs

📝 Walkthrough

Walkthrough

This PR updates test documentation and assertions across two files to reflect that hint:summarization and summarization workload routing are now handled as canonical cases with dedicated tier routing, rather than unknown passthrough or fallback behaviors.

Changes

Test expectations: hint canonicalization and model routing

| Layer / File(s) | Summary |
| --- | --- |
| Provider factory: hint:summarization canonicalization (`src/openhuman/inference/provider/factory_test.rs`) | `make_openhuman_backend_forwards_unknown_hint_verbatim` test narrowed to verify passthrough only for truly unrecognized hints like `hint:reaction` and `hint:garbage`, excluding `hint:summarization` as a canonical case. |
| Memory chat: workload routing and model defaults (`src/openhuman/memory/chat.rs`) | Test comments updated to document that `summarization` workload resolves to `summarization-v1` tier and clarify how `memory_tree.cloud_llm_model` override affects cloud-memory routing paths. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Possibly related PRs

tinyhumansai/openhuman#2223: Both PRs adjust src/openhuman/inference/provider/factory_test.rs to align hint:summarization handling with the factory's hint:*/canonical-tier recognition rules (while other unrecognized hint:* values continue to be treated as verbatim passthrough).

Suggested labels

memory

Suggested reviewers

graycyrus
senamakel

Poem

🐰 Hints now have their proper tier,
summarization claims its sphere,
No longer passthrough, now it's clear—
Routing is canonical here! ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main objective: unblocking cargo test --lib by fixing three stale test fixtures. It is specific and directly reflects the primary change. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

After tinyhumansai#2690 added the summarization-v1 tier, hint:summarization became a canonical hint (mapped to summarization-v1 by the factory). The test was iterating over it as an unknown passthrough case, which then failed on `cargo test --lib`. Drop the now-canonical entry and update the comment.

build_chat_runtime resolves the "summarization" workload role, which post-tinyhumansai#2690 routes to summarization-v1 by default. With memory_tree.cloud_llm_model overridden, routing falls back to reasoning-v1. Update both assertions to match current behavior.

oxoxDev · 2026-05-26T19:36:29Z

CI note: Frontend Unit Tests and Frontend Coverage (Vitest) failed on this run, but both are pre-existing main failures unrelated to this PR.

This PR touches only 3 Rust test files (ops_tests.rs, factory_test.rs, chat.rs) — zero frontend overlap.
Verified: the same Vitest job is failing on the latest main run (loopbackOauthListener.test.ts > returns null when shell bind fails — 1 failed / 359 passed / 1 skipped). Same shape on this PR.
Rust-side checks for this PR (Rust Quality fmt+clippy, Type Check TypeScript) are green; test / Rust Core Tests is the gate this PR actually moves.

No fix here — the loopback OAuth test flake should land in a separate frontend-scoped PR.

graycyrus

Three test fixes to unblock cargo test --lib on main after recent changes to AutonomySettingsPatch and summarization-v1 routing.

ops_tests.rs:632 — add ..Default::default() to forward-compatible struct literal that gained fields in #2486.

factory_test.rs:765 — hint:summarization became canonical (maps to summarization-v1 after #2690), so remove from passthrough test iteration. Comment updated.

chat.rs:218,226 — assertions updated to reflect new defaults: build_chat_runtime resolves to summarization-v1 by default (post-#2690), and falls back to reasoning-v1 when cloud model is overridden.

All three commits are correct and independently clean. No production behavior change — test-side only. CI fully green.

oxoxDev requested a review from a team May 26, 2026 18:58

coderabbitai Bot added the working A PR that is being worked on by the team. label May 26, 2026

coderabbitai Bot previously approved these changes May 26, 2026

oxoxDev added 2 commits May 27, 2026 00:51

oxoxDev dismissed coderabbitai[bot]’s stale review via f03a6d1 May 26, 2026 19:21

oxoxDev changed the title ~~fix(config/tests): unblock cargo test --lib (AutonomySettingsPatch E0063)~~ test: unblock cargo test --lib on main (3 stale test fixtures) May 26, 2026

coderabbitai Bot added the rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. label May 26, 2026

coderabbitai Bot previously approved these changes May 26, 2026

Merge remote-tracking branch 'upstream/main' into fix/autonomy-settin…

44dabd7

…gs-patch-test-init # Conflicts: # src/openhuman/inference/provider/factory_test.rs # src/openhuman/memory/chat.rs

oxoxDev dismissed coderabbitai[bot]’s stale review via 44dabd7 May 26, 2026 19:52

coderabbitai Bot added the memory Memory store, memory tree, recall, summarization, and embeddings in src/openhuman/memory/. label May 26, 2026

coderabbitai Bot approved these changes May 26, 2026

graycyrus approved these changes May 27, 2026

oxoxDev wants to merge 4 commits into tinyhumansai:main from oxoxDev:fix/autonomy-settings-patch-test-init
