test: unblock cargo test --lib on main (3 stale test fixtures) #2710

Open
oxoxDev wants to merge 4 commits into
tinyhumansai:mainfrom
oxoxDev:fix/autonomy-settings-patch-test-init

Conversation

@oxoxDev
Contributor

@oxoxDev oxoxDev commented May 26, 2026

Summary

Problem

cargo test --lib is red on upstream/main across the Rust Core Coverage, Rust Core Tests + Quality, and Rust Core Tests (Windows — secrets ACL) workflows. Every open PR inherits these failures regardless of what it touches.

Root cause: two recently-merged PRs shipped code changes without updating their tests in the same change.

Failure 1 — ops_tests.rs:631 (compile-time, hidden by silence)

error[E0063]: missing fields `allow_tool_install`, `allowed_commands`,
  `forbidden_paths` and 3 other fields in initializer of
  `openhuman::config::ops::AutonomySettingsPatch`
  --> src/openhuman/config/ops_tests.rs:631:9

AutonomySettingsPatch gained six Option<…> fields (level, workspace_only, allowed_commands, forbidden_paths, trusted_roots, allow_tool_install) via #2486 (feat(app): UI control for max_actions_per_hour). The test literal at line 631 still set only max_actions_per_hour. All other literals in the same file were updated; this one was missed.
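A minimal sketch of why the literal broke and why struct update syntax fixes it. The struct below is a hypothetical stand-in: the field names come from the description above, but the field types are assumptions and not taken from the real `AutonomySettingsPatch`.

```rust
// Hypothetical stand-in for AutonomySettingsPatch.
// Field names follow the PR description; the types are assumed for illustration.
#[derive(Debug, Default, PartialEq)]
pub struct AutonomySettingsPatch {
    pub level: Option<String>,
    pub workspace_only: Option<bool>,
    pub allowed_commands: Option<Vec<String>>,
    pub forbidden_paths: Option<Vec<String>>,
    pub trusted_roots: Option<Vec<String>>,
    pub allow_tool_install: Option<bool>,
    pub max_actions_per_hour: Option<u32>,
}

/// Builds a patch that sets only the action budget.
pub fn budget_only_patch(limit: u32) -> AutonomySettingsPatch {
    AutonomySettingsPatch {
        max_actions_per_hour: Some(limit),
        // Without this line the literal must name every field, so adding a
        // field to the struct fails compilation with E0063.
        ..Default::default()
    }
}
```

Because every other field is an `Option`, the derived `Default` fills them with `None`, which matches what the test expressed before the struct grew.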

Failure 2 — factory_test.rs:768 (assertion)

thread '…::make_openhuman_backend_forwards_unknown_hint_verbatim' panicked at
  src/openhuman/inference/provider/factory_test.rs:768:9:
  assertion `left == right` failed: hint 'hint:summarization' should pass through unchanged
    left: "summarization-v1"
   right: "hint:summarization"

#2690 (fix: … summarization-v1 tier …) added summarization-v1 as a canonical model tier. is_known_openhuman_tier (src/openhuman/inference/provider/factory.rs:79-98) explicitly includes "hint:summarization", so the factory now translates that hint to summarization-v1 rather than forwarding it verbatim. The test was iterating hint:summarization as an "unknown" passthrough case — that classification is no longer correct.
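The distinction between a canonical hint and a passthrough hint can be sketched as follows. This is an illustration only: `canonical_tier` and `resolve_model` are hypothetical names, not the functions in `factory.rs`, and the sketch models only the one mapping this PR description states.

```rust
/// Hypothetical lookup: returns the canonical tier for a recognized hint.
/// Only the mapping described in this PR is modeled here.
fn canonical_tier(hint: &str) -> Option<&'static str> {
    match hint {
        "hint:summarization" => Some("summarization-v1"),
        _ => None,
    }
}

/// Hypothetical resolver: recognized hints are translated to their tier,
/// anything else is forwarded verbatim.
fn resolve_model(hint: &str) -> String {
    canonical_tier(hint)
        .map(String::from)
        .unwrap_or_else(|| hint.to_owned())
}
```

Under this model a passthrough test should iterate only over hints that have no canonical tier, which is why `hint:summarization` no longer belongs in that loop.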

Failure 3 — chat.rs:218 / chat.rs:226 (assertion)

thread '…::build_chat_runtime_defaults_to_openhuman_resolved_model' panicked at
  src/openhuman/memory/chat.rs:218:9:
  assertion `left == right` failed
    left: "summarization-v1"
   right: "reasoning-v1"

build_chat_runtime calls create_chat_provider("summarization", …) (src/openhuman/memory/chat.rs:127-147). After #2690 the "summarization" workload routes to summarization-v1 by default, but the first assertion still expected reasoning-v1. The test at line 226 of the same file (build_chat_runtime_still_builds_when_cloud_memory_model_is_overridden) is a sister case: when memory_tree.cloud_llm_model is overridden, routing does fall back to reasoning-v1. Both tests now assert the value the function returns in their own scenario.
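The two scenarios the tests cover can be summarized in a small sketch. It assumes the only input that matters is whether `memory_tree.cloud_llm_model` is set; `MemoryTreeConfig` and `expected_chat_model` are hypothetical names, and the real routing in `build_chat_runtime` may depend on more than this.

```rust
/// Hypothetical slice of the config: only the override the tests exercise.
#[derive(Default)]
struct MemoryTreeConfig {
    cloud_llm_model: Option<String>,
}

/// Model name each test is described as expecting, per the PR text:
/// summarization-v1 by default, reasoning-v1 when the cloud model is overridden.
fn expected_chat_model(cfg: &MemoryTreeConfig) -> &'static str {
    match cfg.cloud_llm_model {
        Some(_) => "reasoning-v1",
        None => "summarization-v1",
    }
}
```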

Solution

Three micro-commits, each cargo check-clean independently:

  1. fix(config/tests): init AutonomySettingsPatch with Default-padded fields — single-line ..Default::default() addition at ops_tests.rs:632. AutonomySettingsPatch derives Default, so the other Option<…> fields keep their previous (absent → None) value.
  2. test(inference): drop hint:summarization from passthrough loop — remove the canonical hint from the iteration and tighten the comment.
  3. test(memory/chat): update expected default model after summarization-v1 — flip the two stale assertions (reasoning-v1 → summarization-v1 for the default; reasoning-v1 is correct for the cloud-override case).

No production code changes. Three test files only.

Submission Checklist

  • cargo check --tests --lib clean (was E0063 before)
  • cargo test --lib openhuman::inference::provider::factory clean
  • cargo test --lib openhuman::memory::chat::tests::build_chat_runtime — 3 / 3 pass
  • cargo fmt --all -- --check clean
  • N/A: i18n strings — non-UI change
  • N/A: frontend typecheck / lint / vitest — no TS surface touched
  • N/A: docs / changelog — pure test compile + assertion fix
  • N/A: tests added — this PR restores existing tests; it doesn't change production behavior so no new assertions are warranted

Known still-flaky tests (out of scope for this PR)

A full local run of cargo test --lib also fails two tests that pass in isolation:

  • openhuman::memory::ops::documents::tests::envelope_memory_handlers_report_counts_and_statuses fails with memory_init: "disk I/O error" from sqlite once earlier tests in the suite have contaminated process-global state. GLOBAL_MEMORY_TEST_LOCK already exists for cross-test serialization (added in #2649, "test(memory): serialize tests that drive the process-global memory client"), but the sqlite open path appears to use OnceLock-style caching that breaks when tempdirs are recycled across tests. The same pattern was flagged in earlier memory work.
  • openhuman::composio::action_tool::tests::mode_toggle_between_calls_is_observed — the test asserts a direct-mode tool error must NOT contain "no backend session", but under full-suite ordering a prior test leaves a cached backend client visible to this one. Same isolation issue, different domain.

Both pass in isolation:

cargo test --lib envelope_memory_handlers_report_counts_and_statuses → ok
cargo test --lib mode_toggle_between_calls_is_observed → ok

These are pre-existing test-isolation bugs (process-global state across tests), not regressions caused by this PR. Worth a follow-up that audits per-test reset hooks for the memory and composio singletons (RwLock<Option<Arc>> + reset_for_tests pattern is the usual fix shape).
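The fix shape mentioned above could look roughly like this. It is a sketch under stated assumptions: `Client`, `client_or_init`, and the `CLIENT` static are hypothetical and do not mirror the actual memory or composio singletons. The point is that an `RwLock<Option<Arc<T>>>` slot can be cleared between tests, whereas a `OnceLock` cannot be reset once it is set.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical client type standing in for a process-global singleton.
#[derive(Debug)]
pub struct Client {
    pub root: String,
}

// Resettable slot: None means "not initialized yet".
static CLIENT: RwLock<Option<Arc<Client>>> = RwLock::new(None);

/// Returns the cached client, creating it from `root` on first use.
pub fn client_or_init(root: &str) -> Arc<Client> {
    // Fast path: clone the cached handle; the read guard is dropped
    // at the end of this statement.
    let cached = CLIENT.read().unwrap().clone();
    if let Some(client) = cached {
        return client;
    }
    // Slow path: take the write lock and initialize only if still empty.
    let mut slot = CLIENT.write().unwrap();
    slot.get_or_insert_with(|| Arc::new(Client { root: root.to_owned() }))
        .clone()
}

/// Clears the cached client so the next test starts from a clean slate.
pub fn reset_for_tests() {
    *CLIENT.write().unwrap() = None;
}
```

A test that points the client at a fresh tempdir would call `reset_for_tests()` first, so a handle cached by an earlier test, possibly referring to a deleted directory, is never reused. Tests sharing the slot still need serialization, for example through a lock such as GLOBAL_MEMORY_TEST_LOCK.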

Summary by CodeRabbit

Release Notes

  • Tests
    • Updated test coverage for hint handling and workload routing behavior to accurately reflect current system configurations.

The literal at apply_autonomy_settings_updates_action_budget set only
`max_actions_per_hour`, but the patch struct gained six more Option
fields (level, workspace_only, allowed_commands, forbidden_paths,
trusted_roots, allow_tool_install) and broke `cargo test --lib` on
main with E0063. Add `..Default::default()` so the literal stays
forward-compatible with field additions.
@oxoxDev oxoxDev requested a review from a team May 26, 2026 18:58
@coderabbitai
Contributor

coderabbitai Bot commented May 26, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e365fa5f-ed06-4383-a665-da3e68848a70

📥 Commits

Reviewing files that changed from the base of the PR and between f03a6d1 and 44dabd7.

📒 Files selected for processing (2)
  • src/openhuman/inference/provider/factory_test.rs
  • src/openhuman/memory/chat.rs
✅ Files skipped from review due to trivial changes (1)
  • src/openhuman/memory/chat.rs

📝 Walkthrough

This PR updates test documentation and assertions across two files to reflect that hint:summarization and summarization workload routing are now handled as canonical cases with dedicated tier routing, rather than unknown passthrough or fallback behaviors.

Changes

Test expectations: hint canonicalization and model routing

Layer / File(s) | Summary
Provider factory: hint:summarization canonicalization (src/openhuman/inference/provider/factory_test.rs) | make_openhuman_backend_forwards_unknown_hint_verbatim test narrowed to verify passthrough only for truly unrecognized hints like hint:reaction and hint:garbage, excluding hint:summarization as a canonical case.
Memory chat: workload routing and model defaults (src/openhuman/memory/chat.rs) | Test comments updated to document that summarization workload resolves to summarization-v1 tier and clarify how memory_tree.cloud_llm_model override affects cloud-memory routing paths.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Possibly related PRs

  • tinyhumansai/openhuman#2223: Both PRs adjust src/openhuman/inference/provider/factory_test.rs to align hint:summarization handling with the factory's hint:*/canonical-tier recognition rules (while other unrecognized hint:* values continue to be treated as verbatim passthrough).

Suggested labels

memory

Suggested reviewers

  • graycyrus
  • senamakel

Poem

🐰 Hints now have their proper tier,
summarization claims its sphere,
No longer passthrough, now it's clear—
Routing is canonical here! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The title accurately describes the main objective: unblocking cargo test --lib by fixing three stale test fixtures. It is specific and directly reflects the primary change.
Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request.


@coderabbitai coderabbitai Bot added the working A PR that is being worked on by the team. label May 26, 2026
coderabbitai Bot previously approved these changes May 26, 2026
oxoxDev added 2 commits May 27, 2026 00:51
After tinyhumansai#2690 added the summarization-v1 tier, hint:summarization
became a canonical hint (mapped to summarization-v1 by the factory).
The test was iterating over it as an unknown passthrough case,
which then failed on `cargo test --lib`. Drop the now-canonical
entry and update the comment.

build_chat_runtime resolves the "summarization" workload role, which
post-tinyhumansai#2690 routes to summarization-v1 by default. With
memory_tree.cloud_llm_model overridden, routing falls back to
reasoning-v1. Update both assertions to match current behavior.
@oxoxDev oxoxDev changed the title from "fix(config/tests): unblock cargo test --lib (AutonomySettingsPatch E0063)" to "test: unblock cargo test --lib on main (3 stale test fixtures)" May 26, 2026
@coderabbitai coderabbitai Bot added the rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. label May 26, 2026
coderabbitai Bot previously approved these changes May 26, 2026
@oxoxDev
Contributor Author

oxoxDev commented May 26, 2026

CI note: Frontend Unit Tests and Frontend Coverage (Vitest) failed on this run, but both are pre-existing main failures unrelated to this PR.

  • This PR touches only 3 Rust test files (ops_tests.rs, factory_test.rs, chat.rs) — zero frontend overlap.
  • Verified: the same Vitest job is failing on the latest main run (loopbackOauthListener.test.ts > returns null when shell bind fails — 1 failed / 359 passed / 1 skipped). Same shape on this PR.
  • Rust-side checks for this PR (Rust Quality fmt+clippy, Type Check TypeScript) are green; test / Rust Core Tests is the gate this PR actually moves.

No fix here — the loopback OAuth test flake should land in a separate frontend-scoped PR.

…gs-patch-test-init

# Conflicts:
#	src/openhuman/inference/provider/factory_test.rs
#	src/openhuman/memory/chat.rs
@coderabbitai coderabbitai Bot added the memory Memory store, memory tree, recall, summarization, and embeddings in src/openhuman/memory/. label May 26, 2026
Contributor

@graycyrus graycyrus left a comment

Three test fixes to unblock cargo test --lib on main after recent changes to AutonomySettingsPatch and summarization-v1 routing.

ops_tests.rs:632 — add ..Default::default() to forward-compatible struct literal that gained fields in #2486.

factory_test.rs:765 — hint:summarization became canonical (maps to summarization-v1 after #2690), so remove from passthrough test iteration. Comment updated.

chat.rs:218,226 — assertions updated to reflect new defaults: build_chat_runtime resolves to summarization-v1 by default (post-#2690), and falls back to reasoning-v1 when cloud model is overridden.

All three commits are correct and independently clean. No production behavior change — test-side only. CI fully green.

Labels

memory Memory store, memory tree, recall, summarization, and embeddings in src/openhuman/memory/. rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. working A PR that is being worked on by the team.

2 participants