test: unblock cargo test --lib on main (3 stale test fixtures)#2710
test: unblock cargo test --lib on main (3 stale test fixtures)#2710oxoxDev wants to merge 4 commits into
Conversation
The literal at apply_autonomy_settings_updates_action_budget set only `max_actions_per_hour`, but the patch struct gained six more Option fields (level, workspace_only, allowed_commands, forbidden_paths, trusted_roots, allow_tool_install) and broke `cargo test --lib` on main with E0063. Add `..Default::default()` so the literal stays forward-compatible with field additions.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
✅ Files skipped from review due to trivial changes (1)
📝 WalkthroughWalkthroughThis PR updates test documentation and assertions across two files to reflect that ChangesTest expectations: hint canonicalization and model routing
Estimated code review effort🎯 1 (Trivial) | ⏱️ ~3 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
After tinyhumansai#2690 added the summarization-v1 tier, hint:summarization became a canonical hint (mapped to summarization-v1 by the factory). The test was iterating over it as an unknown passthrough case, which then failed on `cargo test --lib`. Drop the now-canonical entry and update the comment.
build_chat_runtime resolves the "summarization" workload role, which post-tinyhumansai#2690 routes to summarization-v1 by default. With memory_tree.cloud_llm_model overridden, routing falls back to reasoning-v1. Update both assertions to match current behavior.
|
CI note:
No fix here — the loopback OAuth test flake should land in a separate frontend-scoped PR. |
…gs-patch-test-init # Conflicts: # src/openhuman/inference/provider/factory_test.rs # src/openhuman/memory/chat.rs
graycyrus
left a comment
There was a problem hiding this comment.
Three test fixes to unblock cargo test --lib on main after recent changes to AutonomySettingsPatch and summarization-v1 routing.
ops_tests.rs:632 — add ..Default::default() to forward-compatible struct literal that gained fields in #2486.
factory_test.rs:765 — hint:summarization became canonical (maps to summarization-v1 after #2690), so remove from passthrough test iteration. Comment updated.
chat.rs:218,226 — assertions updated to reflect new defaults: build_chat_runtime resolves to summarization-v1 by default (post-#2690), and falls back to reasoning-v1 when cloud model is overridden.
All three commits are correct and independently clean. No production behavior change — test-side only. CI fully green.
Summary
cargo test --libonmainby fixing 3 test-side regressions that have shipped ontomainwithout their tests being updated.ops_tests.rs:631AutonomySettingsPatchliteral missing fields added in OpenHuman 不执行命令和无反馈的分析 #2486.factory_test.rs:768test was iteratinghint:summarizationas an unknown-passthrough case after fix: runtime snapshot timeouts, config perms, stale lock recovery, summarization-v1 tier, loopback OAuth UX #2690 made it canonical.chat.rs:218,226test assertions hard-codedreasoning-v1even thoughbuild_chat_runtimenow resolves tosummarization-v1by default after fix: runtime snapshot timeouts, config perms, stale lock recovery, summarization-v1 tier, loopback OAuth UX #2690.Problem
cargo test --libis red onupstream/mainacross theRust Core Coverage,Rust Core Tests + Quality, andRust Core Tests (Windows — secrets ACL)workflows. Every open PR inherits these failures regardless of what it touches.Root cause: two recently-merged PRs shipped code changes without updating their tests in the same change.
Failure 1 —
ops_tests.rs:631(compile-time, hidden by silence)AutonomySettingsPatchgained sixOption<…>fields (level,workspace_only,allowed_commands,forbidden_paths,trusted_roots,allow_tool_install) via #2486 (feat(app): UI control for max_actions_per_hour). The test literal at line 631 still set onlymax_actions_per_hour. All other literals in the same file were updated; this one was missed.Failure 2 —
factory_test.rs:768(assertion)#2690(fix: … summarization-v1 tier …) addedsummarization-v1as a canonical model tier.is_known_openhuman_tier(src/openhuman/inference/provider/factory.rs:79-98) explicitly includes"hint:summarization", so the factory now translates that hint tosummarization-v1rather than forwarding it verbatim. The test was iteratinghint:summarizationas an "unknown" passthrough case — that classification is no longer correct.Failure 3 —
chat.rs:218 / chat.rs:226(assertion)build_chat_runtimecallscreate_chat_provider("summarization", …)(src/openhuman/memory/chat.rs:127-147). After #2690 the "summarization" workload routes tosummarization-v1by default. The first assertion still expectedreasoning-v1. Same file at line 226 (build_chat_runtime_still_builds_when_cloud_memory_model_is_overridden) is a sister case — but whenmemory_tree.cloud_llm_modelis overridden, the routing actually does fall back toreasoning-v1. Both tests are updated to assert the value the function returns under each scenario.Solution
Three micro-commits, each
cargo check-clean independently:fix(config/tests): init AutonomySettingsPatch with Default-padded fields— single-line..Default::default()addition atops_tests.rs:632.AutonomySettingsPatchderivesDefault, so the otherOption<…>fields keep their previous (absent →None) value.test(inference): drop hint:summarization from passthrough loop— remove the canonical hint from the iteration and tighten the comment.test(memory/chat): update expected default model after summarization-v1— flip the two stale assertions (reasoning-v1→summarization-v1for the default;reasoning-v1is correct for the cloud-override case).No production code changes. Three test files only.
Submission Checklist
cargo check --tests --libclean (was E0063 before)cargo test --lib openhuman::inference::provider::factorycleancargo test --lib openhuman::memory::chat::tests::build_chat_runtime— 3 / 3 passcargo fmt --all -- --checkcleanKnown still-flaky tests (out of scope for this PR)
cargo test --lib(full local run) also fails two tests when run as part of the full suite that PASS in isolation:openhuman::memory::ops::documents::tests::envelope_memory_handlers_report_counts_and_statuses— surfacesmemory_init: "disk I/O error"from sqlite when prior tests in the suite ran and contaminated process-global state.GLOBAL_MEMORY_TEST_LOCKexists already (added in test(memory): serialize tests that drive the process-global memory client #2649) for cross-test serialization but the sqlite open path appears to useOnceLock-style caching that breaks when tempdirs are recycled across tests. Same pattern flagged in earlier memory work.openhuman::composio::action_tool::tests::mode_toggle_between_calls_is_observed— the test asserts a direct-mode tool error must NOT contain"no backend session", but under full-suite ordering a prior test leaves a cached backend client visible to this one. Same isolation issue, different domain.Both pass in isolation:
These are pre-existing test-isolation bugs (process-global state across tests), not regressions caused by this PR. Worth a follow-up that audits per-test reset hooks for the memory and composio singletons (
RwLock<Option<Arc>>+reset_for_testspattern is the usual fix shape).Impact
cargo test --libmatrix for every open PR (we hit these failures on feat(wallet): bind prepared transaction quotes to originating chat session #2708 and feat(core): pass in-process RPC bearer via internal handle, not process env #2709 immediately after pushing).Related
AutonomySettingsPatchfield additions) and fix: runtime snapshot timeouts, config perms, stale lock recovery, summarization-v1 tier, loopback OAuth UX #2690 (summarization-v1tier).Summary by CodeRabbit
Release Notes