Release v0.3.1: MIRA rebrand, bundled engine, mira research CLI, uv runtime, provider hardening by ChenglongWang · Pull Request #88 · MIRA-Intelligence/mira

ChenglongWang · 2026-05-30T16:36:22Z

Summary

Roll-up of all release work since the previous v0.3.0 merge (PR #69) into main. 92 commits, ~709 files touched. Tip of release is tagged as v0.3.1.

Highlights by Theme

🏷️ Rebrand: MedPilot → MIRA (#47)

pip package renamed medpilot → mira-engine
CLI entry points renamed: medpilot → mira, medpilot-agent → mira-engine
Workspace migration: legacy ~/.medpilot/ auto-migrates to ~/.mira/ on first command; MEDPILOT_* env vars are mapped to MIRA_*
GPL licensing + CLA governance added across the deploy flow
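
The workspace and environment migration listed above could be sketched roughly as follows. This is an illustration under stated assumptions, not the engine's code: the function names are hypothetical, the sketch copies rather than moves the directory (the rollback notes say the legacy directory is left intact), and it assumes an explicitly set MIRA_* variable wins over a mapped legacy value.

```python
import shutil
from pathlib import Path


def map_legacy_env(env: dict) -> dict:
    """Return a copy of env with MEDPILOT_* values mirrored to MIRA_*."""
    mapped = dict(env)
    for key, value in env.items():
        if key.startswith("MEDPILOT_"):
            new_key = "MIRA_" + key[len("MEDPILOT_"):]
            # Assumption: an explicitly set MIRA_* variable takes precedence.
            mapped.setdefault(new_key, value)
    return mapped


def migrate_workspace(home: Path) -> bool:
    """Copy ~/.medpilot to ~/.mira once; leave the legacy directory intact."""
    legacy, target = home / ".medpilot", home / ".mira"
    if target.exists() or not legacy.is_dir():
        return False  # already migrated, or nothing to migrate
    shutil.copytree(legacy, target)
    return True
```

Passing the home directory in as an argument keeps the sketch testable against a temporary directory instead of the real user profile.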

📦 Bundled engine + release packaging (#49, #51, #52, #58, #62, #84)

Checked-in PyInstaller spec, embedded uv binary in the bundle, runtime config + release asset support
Windows: bundled engine startup without SCM services, bundled gateway subprocess extraction, asyncio warning silencing
macOS / Windows release matrix unblocked for v0.3.0rc → v0.3.1

🤖 Agent loop split + `mira research` CLI (#55)

AgentLoop → BaseAgentLoop (single-shot, nanobot-style baseline) + ResearchAgentLoop (auto-mode, task_plan, profiles, contracts, automation_policy, cross-round token budget)
New mira research CLI exposes the research loop directly; mira agent keeps the lightweight loop
Background exec jobs and bg companion tool for long-running tasks (#58)
"web" channel renamed to "ui" with quieter engine logs in interactive CLI (#57)

🔬 Research loop quality (#41, #56, #70, #71, #80, #85)

Strict contract completion + runtime profile sync (#41)
Tightened web auto-run re-plan and export guards (#56)
Loosened auto-mode stop heuristics + strictHeuristics toggle (#70)
Replan exhausted queue, anti-confirmation prompt, structured stop reason (#71)
Improved skill routing precision (#80)
Engine lifecycle + guardrails + UI liveness hardening (#85)

🐍 uv runtime + per-project venv isolation (#59, #60, #61, #63, #64, #65, #66, #67, #82)

Opt-in tools.exec.python schema, per-project venv auto-bootstrap on first python command
First-launch uv python install, agent prompt taught venv conventions
mira runtime cache-prune + mira runtime project-gc CLIs
Opt-in pip install → uv pip install rewrite
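
The opt-in `pip install` to `uv pip install` rewrite could be approximated by a small command transformer like the one below. It is a sketch only: the function name and the `enabled` flag are hypothetical, and it assumes the rewrite applies only when the command starts with `pip install`.

```python
import shlex


def rewrite_pip_install(command: str, enabled: bool = True) -> str:
    """Rewrite a leading `pip install ...` into `uv pip install ...`."""
    if not enabled:
        return command  # opt-in: leave commands untouched unless enabled
    tokens = shlex.split(command)
    if tokens[:2] == ["pip", "install"]:
        # Keep "install" and every following argument, prefixed with "uv pip".
        return shlex.join(["uv", "pip", *tokens[1:]])
    return command
```

For example, `rewrite_pip_install("pip install numpy")` yields `uv pip install numpy`, while `python -m pip install numpy` is returned unchanged because it does not start with `pip install`.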

🔌 Provider + UI runtime config (#43, #45, #73, #74, #76, #77, #83, #86, plus ee22fde4)

Unified web bind host/port under gateway config (#45)
Custom provider apiBase prompt fix (#43)
Runtime provider metadata exposed for all supported providers (#73)
Provider proxy + OAuth path support (#74)
UI project bindings isolated by workspace (#76)
Compatibility tracking relocated to mira-ui repo (#77)
Better LLM error display in CLI with provider/model context (#83)
DeepSeek routed through native OpenAI-compatible provider, which sidesteps the LiteLLM reasoning_content round-trip bug (#86)
Moonshot/Kimi temperature enforcement (#84 area)
Runtime routing models preserved on UI config saves

🛠️ Infra / CI / governance

Consolidated workflows + ruff lint cleanup (#81)
Submodule checkout on release CI (skills loading)
Windows release path test stabilization
Compatibility contract validation workflow (#21)
Standardized gateway health + version endpoints (#22)
Local-engine service + self-hosted deployment consolidation (#32)
Project action audit logs + skill invocation tracking (#7)

CLA Acknowledgement

All inbound PRs (referenced above by #NN) have already been individually CLA-checked at merge into release / dev. This is a fast-forward / merge roll-up; no new external contributions are introduced at this step.

Test Evidence

CI green on release head (47fc2ca). Each constituent PR carried its own test evidence at merge time; this PR is a roll-up, not a fresh change.

git rev-list --count origin/main..origin/release   # → 92
git diff --stat origin/main...origin/release | tail -1
# 709 files changed, 82404 insertions(+), 141623 deletions(-)
git tag --points-at origin/release
# v0.3.1

Rollback Notes

Rollback steps: git revert -m 1 <merge-commit-sha> on main, or git reset --hard a649c64 if no follow-up work has landed on main yet.
Data migration impact: workspace auto-migration ~/.medpilot/ → ~/.mira/ is one-way; rollback won't reverse it, but legacy ~/.medpilot/ is left intact on first migration so users can re-point manually if needed.
Safe fallback version: pip install mira-engine==0.3.0 (or the prior medpilot==0.2.x for full pre-rename rollback).

* Enable web auto-run continuation and mode-aware runtime context. Carry run mode from UI metadata into the agent loop, continue pending experiments server-side in auto mode, and tighten stop heuristics so ordinary experiment analysis does not halt execution. Made-with: Cursor
* Apply session-level run mode control to web auto loops. Handle set_mode control messages in the gateway and agent loop so manual/auto switches take effect during active auto execution, and include the latest UI submodule pointer and refreshed template assets. Made-with: Cursor
* Update submodule. Advance the UI submodule pointer to include the manual/auto toggle switch styling update. Made-with: Cursor

* Enable profile-based AGENTS template selection for web sessions. Parse and persist the UI-selected agent profile per session (engineer, research, or default) during prompt construction, and cover the new selection paths with context and web channel tests.
* Update system prompts.
* Add logging functionality.
* Log skill invocations.

Standardize issue linking, test evidence, and rollback notes for per-ticket deploy PRs. Made-with: Cursor

Define a release compatibility mapping with schema-style checks and enforce it in CI so UI and agent versions stay aligned.

Expose machine-readable /health and /version contracts (plus /api aliases) so desktop bootstrap and release compatibility checks can rely on a stable runtime handshake.
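
A client-side handshake check against the /version contract might look like the sketch below. It assumes the response body is a JSON object carrying an `api_contract` field (a later commit in this PR mentions `api_contract: "v1"`). The function name and the tolerant handling of malformed bodies are illustrative choices rather than the UI's actual behavior.

```python
import json


def contract_matches(version_body: str, expected: str = "v1") -> bool:
    """Return True when a /version payload reports the expected contract."""
    try:
        payload = json.loads(version_body)
    except json.JSONDecodeError:
        return False  # not JSON at all: treat as incompatible
    # Assumption: the payload is a JSON object with an "api_contract" key.
    return isinstance(payload, dict) and payload.get("api_contract") == expected
```

The function takes the already-fetched body as a string, so the check can be exercised without a running gateway.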

Introduce a dedicated local engine management CLI with install/start/stop/status/logs/doctor commands and test coverage so deployment workflows no longer depend on tmux sessions.

… stack (#32) * Add medpilot-agent service lifecycle CLI skeleton. Introduce a dedicated local engine management CLI with install/start/stop/status/logs/doctor commands and test coverage so deployment workflows no longer depend on tmux sessions. Made-with: Cursor * Add macOS launchd support for local engine service. Implement launchd-backed install/start/stop/status/doctor behavior and plist generation so desktop local mode can run as a managed user service instead of tmux. Made-with: Cursor * Add rollback-safe manual local engine upgrade flow (#25) * Add manual upgrade command with rollback safeguards. Provide a medpilot-agent upgrade flow that stops service, upgrades package, verifies health, and rolls back on failures, with an operator runbook for manual recovery. Made-with: Cursor * Add local engine structured logs and diagnostics export (#26) * Add structured logging and diagnostics export for local engine. Emit JSONL service lifecycle logs, rotate log files, and support doctor --export diagnostics bundles to speed up support triage. Made-with: Cursor * Add tag-driven agent release pipeline (#27) * Add agent tag-release pipeline for package and executables. Automate cross-platform build/test, PyPI publish, standalone medpilot-agent executable packaging, and checksum generation on version tags. Made-with: Cursor * Add release-train smoke orchestration workflow (#28) * Add release-train orchestration and smoke workflow. Introduce a manual workflow that validates agent/ui tag pairs, runs gateway smoke checks, and publishes release-train summary artifacts for coordinated releases. Made-with: Cursor * Add Linux systemd user-service manager for medpilot-agent (#29) * Add Linux systemd --user service support for medpilot-agent. Introduce systemd unit generation and lifecycle commands with status checks so Linux users can run local engine as a managed per-user service. 
Made-with: Cursor * Add Windows service manager for medpilot-agent (#30) * Add Windows Service support for medpilot-agent lifecycle. Implement Windows service create/start/stop/status/delete flows and tests so local engine management has parity with macOS/Linux service models. Made-with: Cursor * Add optional self-hosted Docker templates and operator guide. (#31) Provide compose and env examples plus upgrade/rollback instructions while keeping Docker explicitly positioned as an advanced deployment path. Made-with: Cursor

Update submodule version.

Switch runtime/docs/tests to the new package name, enable hatch-vcs tag-based versioning, and improve release workflow observability with full tag checkout plus verbose PyPI uploads. Made-with: Cursor

Update compatibility.json with new release train and agent/ui versions.

Introduce project GPL metadata and CLA policy docs, require CLA acknowledgement in PR templates, update release-train cross-repo tag checks for private repos, and bump the UI submodule pointer to the latest governance updates. Made-with: Cursor

Made-with: Cursor

Update submodule version.

Add high-value coverage for agent loop paths, channel/web handlers, config matching, and tool execution edge cases, plus a reusable scoped coverage command to track core coverage targets consistently. Made-with: Cursor

* Add gateway-side data path validation API for UI project setup. This keeps workspace restrictions enforced while allowing the UI to check server path visibility before project creation, and updates the UI submodule to the unified data-source entry flow.
* Remove codecov threshold.

Keep pull_request checks for all PRs while restricting push-triggered test runs to main/dev/release to avoid duplicate CI runs on feature branches. Made-with: Cursor

* Add gateway-side data path validation API for UI project setup. This keeps workspace restrictions enforced while allowing the UI to check server path visibility before project creation, and updates the UI submodule to the unified data-source entry flow. * Add task_plan guardrails to prevent experiment structure drift. This introduces shared task_plan lint/reconcile logic, auto-fix and lint APIs in the web channel, and auto-mode gating so malformed or drifting experiment structures are corrected (or blocked) before they can break downstream UI rendering. * remove codecov threshold. * Harden web session persistence and add experiment snapshots. This makes session history append-only, returns recoverable experiment snapshots for completed entries, and updates UI integration/tests to preserve stable experiment detail views during task_plan drift. * Deduplicate repeated workspace-root update logs. Only emit audit/log entries when /api/config actually changes projects_root, reducing reconnect noise while preserving behavior and test coverage. * Add profile-aware task-plan guardrails with repair and versioning. Enforce required evidence fields by profile, add one-shot auto repair for blocked auto runs, and make strictness configurable via project contract version with updated web APIs/tests/docs. * Auto-fix duplicate experiment IDs in task plans. Reconcile duplicate experiment ids to fresh ExpNNN values so malformed task_plan updates cannot persist ambiguous experiment records. * Surface task-plan ID remaps to agent context. When guardrails auto-reassign duplicate experiment IDs, inject a canonical remap notice into the web message metadata so the LLM reasons over corrected IDs instead of stale references. * Expose profile contract metadata for task-plan UI alignment. 
Derive required field rules from guardrails in a new read-only plan contract endpoint, persist rich experiment evidence fields in snapshots, and wire tests to keep profile-specific requirements synchronized with dashboard rendering.
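
The duplicate-ID reconciliation described in the commit above can be illustrated with a short sketch. It assumes each experiment is a dict with an `id` key and that fresh IDs use a three-digit `ExpNNN` form. The function name and the returned remap shape are hypothetical; the real guardrail code may differ.

```python
import re


def dedupe_experiment_ids(experiments: list) -> dict:
    """Give later duplicates a fresh ExpNNN id; return {index: new_id}."""
    used = {exp["id"] for exp in experiments}
    numbers = [int(m.group(1)) for i in used if (m := re.fullmatch(r"Exp(\d+)", i))]
    next_number = max(numbers, default=0) + 1
    seen = set()
    remap = {}
    for index, exp in enumerate(experiments):
        if exp["id"] in seen:
            # The first occurrence keeps its id; later ones are reassigned.
            exp["id"] = f"Exp{next_number:03d}"
            remap[index] = exp["id"]
            next_number += 1
        seen.add(exp["id"])
    return remap
```

The returned remap is the kind of information a "canonical remap notice" could be built from, so the model reasons over the corrected IDs.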

* feat: sync all features from nanobot v0.15
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* feat(onboard): add guided provider and model setup
  - integrate OAuth providers into onboard flow (no separate provider login command)
  - show provider endpoint connectivity with dim URL display
  - add model examples and validation; allow bare model input after provider selection
  - improve non-wizard onboarding prompts and related docs/tests
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(docker): stabilize cli oauth and config paths
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(bug): fix skill discovery and routing in nested workspace skill directories
* feat(cli): auto-prepend provider prefix to model names during onboard
* fix(core): resolve cron reload loop and enhance gateway with fail-safe protection
  - Fix infinite reload loop in CronService by updating mtime after loading.
  - Add PID file locking and port collision detection to prevent duplicate gateway instances.
  - Introduce '--host' option to gateway and service install commands for LAN access.
  - Improve channel configuration handling for Pydantic models and key normalization.
  - Bump version to 0.3.0 and add comprehensive fail-safe unit tests.
* fix(ci): add psutil dependency and include missing fail-safe tests
* fix(test): resolve CI failures by skipping failsafe in tests and fixing version mismatch
  - Add MEDPILOT_SKIP_GATEWAY_FAILSAVE to globally skip port/PID checks during testing.
  - Revert pyproject.toml version to 0.0.0 to match __init__.py and test expectations.
  - Update gateway CLI tests to match new output format (0.0.0.0:port).
  - Ensure psutil is in dependencies and failsafe tests are included.
* fix(test): extract gateway failsafe to function and global mock in tests
  - Extract PID/port collision check into _gateway_failsafe_check for granular control.
  - Add global mock in tests/conftest.py to bypass failsafe in all existing CLI tests.
  - Update failsafe unit tests to test the check function directly and bypass global mock.
  - Ensure all tests pass without SystemExit(1) due to port collisions in CI environment.
---------
Co-authored-by: Chenglong Wang <ryuu.j.ching@gmail.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
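
The cron reload-loop fix ("updating mtime after loading") follows a common pattern, sketched below with a hypothetical `ReloadingJobStore` class. This is not the CronService code; it only illustrates why recording the modification time after a load stops the store from reloading on every poll.

```python
import json
from pathlib import Path


class ReloadingJobStore:
    """Hypothetical store that reloads a JSON file only when it changes."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.jobs = []
        self.load_count = 0
        self._last_mtime_ns = None

    def maybe_reload(self) -> bool:
        """Reload if the file's mtime differs from the one last recorded."""
        mtime_ns = self.path.stat().st_mtime_ns
        if mtime_ns == self._last_mtime_ns:
            return False
        self.jobs = json.loads(self.path.read_text(encoding="utf-8"))
        # Recording the mtime after loading is the step that breaks the loop:
        # without it, every poll sees a "changed" file and reloads again.
        self._last_mtime_ns = mtime_ns
        self.load_count += 1
        return True
```

Calling `maybe_reload()` twice on an unchanged file loads it once and then returns `False`.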

Initialize the web projects root from the runtime workspace and write /api/config changes back to the currently selected config path so UI workspace edits persist across restarts.

* Add gateway-side data path validation API for UI project setup. This keeps workspace restrictions enforced while allowing the UI to check server path visibility before project creation, and updates the UI submodule to the unified data-source entry flow. * Add task_plan guardrails to prevent experiment structure drift. This introduces shared task_plan lint/reconcile logic, auto-fix and lint APIs in the web channel, and auto-mode gating so malformed or drifting experiment structures are corrected (or blocked) before they can break downstream UI rendering. * remove codecov threshold. * Harden web session persistence and add experiment snapshots. This makes session history append-only, returns recoverable experiment snapshots for completed entries, and updates UI integration/tests to preserve stable experiment detail views during task_plan drift. * Deduplicate repeated workspace-root update logs. Only emit audit/log entries when /api/config actually changes projects_root, reducing reconnect noise while preserving behavior and test coverage. * Add profile-aware task-plan guardrails with repair and versioning. Enforce required evidence fields by profile, add one-shot auto repair for blocked auto runs, and make strictness configurable via project contract version with updated web APIs/tests/docs. * Auto-fix duplicate experiment IDs in task plans. Reconcile duplicate experiment ids to fresh ExpNNN values so malformed task_plan updates cannot persist ambiguous experiment records. * Surface task-plan ID remaps to agent context. When guardrails auto-reassign duplicate experiment IDs, inject a canonical remap notice into the web message metadata so the LLM reasons over corrected IDs instead of stale references. * Expose profile contract metadata for task-plan UI alignment. 
Derive required field rules from guardrails in a new read-only plan contract endpoint, persist rich experiment evidence fields in snapshots, and wire tests to keep profile-specific requirements synchronized with dashboard rendering. * Add references ingestion and policy-driven auto-stop controls for web projects. This introduces references-target uploads with safe zip extraction, persists automation policies from project setup, and switches auto-mode continuation to stop on goals, experiment limits, or token budgets for clearer project control. * Enforce auto-run task_plan checkpoint updates between experiments. Add a checkpoint barrier that detects unchanged running experiments across auto rounds and forces a task_plan sync repair round before continuing, so progress state is persisted incrementally instead of only at the end. * Update UI submodule for manual export-driven results flow. Track the latest MedPilotUI commits that remove output-goal setup, move automation policy defaults, and add user-triggered export actions in the result stage. * Require explicit user request for result deliverables. Remove the natural-conclusion trigger from web agent instructions so final deliverables are generated only when the user explicitly asks for export.

* Add gateway-side data path validation API for UI project setup. This keeps workspace restrictions enforced while allowing the UI to check server path visibility before project creation, and updates the UI submodule to the unified data-source entry flow. * Add task_plan guardrails to prevent experiment structure drift. This introduces shared task_plan lint/reconcile logic, auto-fix and lint APIs in the web channel, and auto-mode gating so malformed or drifting experiment structures are corrected (or blocked) before they can break downstream UI rendering. * remove codecov threshold. * Harden web session persistence and add experiment snapshots. This makes session history append-only, returns recoverable experiment snapshots for completed entries, and updates UI integration/tests to preserve stable experiment detail views during task_plan drift. * Deduplicate repeated workspace-root update logs. Only emit audit/log entries when /api/config actually changes projects_root, reducing reconnect noise while preserving behavior and test coverage. * Add profile-aware task-plan guardrails with repair and versioning. Enforce required evidence fields by profile, add one-shot auto repair for blocked auto runs, and make strictness configurable via project contract version with updated web APIs/tests/docs. * Auto-fix duplicate experiment IDs in task plans. Reconcile duplicate experiment ids to fresh ExpNNN values so malformed task_plan updates cannot persist ambiguous experiment records. * Surface task-plan ID remaps to agent context. When guardrails auto-reassign duplicate experiment IDs, inject a canonical remap notice into the web message metadata so the LLM reasons over corrected IDs instead of stale references. * Expose profile contract metadata for task-plan UI alignment. 
Derive required field rules from guardrails in a new read-only plan contract endpoint, persist rich experiment evidence fields in snapshots, and wire tests to keep profile-specific requirements synchronized with dashboard rendering.
* Add references ingestion and policy-driven auto-stop controls for web projects. This introduces references-target uploads with safe zip extraction, persists automation policies from project setup, and switches auto-mode continuation to stop on goals, experiment limits, or token budgets for clearer project control.
* Enforce auto-run task_plan checkpoint updates between experiments. Add a checkpoint barrier that detects unchanged running experiments across auto rounds and forces a task_plan sync repair round before continuing, so progress state is persisted incrementally instead of only at the end.
* Update UI submodule for manual export-driven results flow. Track the latest MedPilotUI commits that remove output-goal setup, move automation policy defaults, and add user-triggered export actions in the result stage.
* Require explicit user request for result deliverables. Remove the natural-conclusion trigger from web agent instructions so final deliverables are generated only when the user explicitly asks for export.
* Harden auto-run guardrails and persist runtime contract settings. Apply code-level task-plan contract normalization after experiment transitions, ensure websocket messages can persist contract version metadata, and update tests while advancing the UI submodule for synced new-project runtime preferences.
* Enforce strict contract completion flow without placeholder bypass. Fix the auto-run guardrail crash by aligning _guard_task_plan_structure with auto_fix calls, and require strict-mode experiments to request model completion instead of auto-filling missing contract fields.
* Update UI submodule pointer for kickoff language policy. Record the latest UI commit so runtime-profile-contract-sync includes the new language-aware project kickoff prompt behavior.

* Fix: Require explicit apiBase for custom provider - Add validation in make_provider() to raise clear error when custom provider lacks apiBase - Enhance onboard.py to prompt for apiBase when configuring custom provider - Add tests for custom provider apiBase validation - Error message guides users to configure via config.json or onboard wizard * Fix: Prompt for apiBase in non-wizard onboard for custom provider - Add api_base prompt logic for custom provider in non-wizard onboarding flow - When user selects custom provider, prompt for API base URL (required if not set) - If api_base already exists, offer options to update/keep/clear - Consistent with wizard onboarding behavior * fix: add missing prompt for apiBase in custom config * fix: add missing prompt for apiBase in custom config (resolved conflict with dev) - Re-apply custom provider api_base prompt logic on top of latest dev branch - Maintains compatibility with dev branch changes to onboard flow - Prompts for API Base URL when configuring custom provider in non-wizard mode * merge: resolve conflicts with origin/dev (loop.py, web.py, guardrails.py, tests) Agent-Logs-Url: https://github.com/Project-MedPilot/MedPilot/sessions/6a11157f-7b33-47e6-91fe-0d26235cdc68 Co-authored-by: ldxFAIRYTAIL <82999767+ldxFAIRYTAIL@users.noreply.github.com> --------- Co-authored-by: LoveMachine <yqyi@example.com> Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: ldxFAIRYTAIL <82999767+ldxFAIRYTAIL@users.noreply.github.com>

* config: unify web bind host/port under gateway
  Use gateway.host/port as the single bind source for the Web channel and stop duplicating bind settings in channels.web. Add config migration plus regression tests so legacy channels.web host/port values are safely promoted without breaking existing configs.
* chore: normalize config-unification files after origin/dev rebase
  Preserve existing CRLF conventions in files touched during conflict resolution so the branch keeps a minimal, reviewable diff against origin/dev without changing behavior.
* Remove test from other branch.
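
The config migration in the first commit above (promoting legacy channels.web host/port to gateway.host/port) might be sketched as below. The function name is hypothetical and the config is treated as a plain dict. The sketch also assumes an existing gateway value wins over a legacy one, which the commit message does not state.

```python
import copy


def promote_web_bind(config: dict) -> dict:
    """Move legacy channels.web host/port under gateway without overwriting."""
    migrated = copy.deepcopy(config)  # never mutate the caller's config
    web = migrated.get("channels", {}).get("web", {})
    gateway = migrated.setdefault("gateway", {})
    for key in ("host", "port"):
        if key in web:
            value = web.pop(key)
            # Assumption: an existing gateway value wins over the legacy one.
            gateway.setdefault(key, value)
    return migrated
```

Other keys under channels.web are left in place; only the bind settings move.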

The Agent Release workflow on tag ``v0.3.0rc2`` failed across the matrix with three independent issues. Fixing them in one commit so the RC build can re-run cleanly.
1. macOS: the ``Fetch bundled uv binary`` step hit GitHub API HTTP 403 ``rate limit exceeded`` while resolving the latest ``uv`` release tag. Unauthenticated GitHub API requests are capped at 60/h per IP, and Actions runners on macOS share that quota across many jobs.
   - ``scripts/fetch_uv.py``: ``resolve_release_tag`` now sends ``Authorization: Bearer $GITHUB_TOKEN`` (or ``$GH_TOKEN``) when present, lifting the quota well above the unauthenticated cap (GitHub documents 1,000 req/h per repository for the workflow ``GITHUB_TOKEN`` and 5,000 req/h for a personal access token). Also pins the recommended ``Accept`` and ``X-GitHub-Api-Version`` headers.
   - ``.github/workflows/agent-release.yml``: the ``Fetch bundled uv binary`` step now exposes ``GITHUB_TOKEN`` as an env var so the script can pick it up. ``GITHUB_TOKEN`` is the read-only, auto-issued workflow token; no new secret is needed.
2. Windows: ``test_project_gc_lists_venvs`` failed because Rich wrapped the long absolute project path mid-word inside the ``Project`` table cell, splitting the literal ``proj`` substring across a newline (``...test_pro\nject_gc_lists_ven...``). Collapse line breaks before searching for the substring; keep the ``"active"`` assertion as-is since the status column never wraps.
3. Windows: ``test_info_when_enabled`` asserted the literal POSIX string ``"/usr/local/bin/uv"`` against output that ``Path.__str__`` had rendered with Windows separators (``"\\usr\\local\\bin\\uv"``). Derive the expected substring from the same ``Path`` the CLI will render so the assertion is OS-agnostic.
Verified by running ``pytest tests/`` locally (1934 passed, 1 skipped).
Co-authored-by: Cursor <cursoragent@cursor.com>
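
The authenticated request described for ``scripts/fetch_uv.py`` can be sketched with the standard library alone. The function name below is hypothetical, and the header values are the ones GitHub's REST documentation recommends. Treat it as an illustration of the approach rather than the script's actual contents.

```python
import os
import urllib.request


def build_github_request(url: str, env=None) -> urllib.request.Request:
    """Build a GitHub API request, adding a bearer token when one is set."""
    env = os.environ if env is None else env
    headers = {
        "Accept": "application/vnd.github+json",
        "X-GitHub-Api-Version": "2022-11-28",
    }
    # Prefer GITHUB_TOKEN, fall back to GH_TOKEN; stay anonymous otherwise.
    token = env.get("GITHUB_TOKEN") or env.get("GH_TOKEN")
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, headers=headers)
```

Accepting the environment as a parameter lets a test pass a plain dict instead of mutating `os.environ`.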

…#73) The UI needs one authoritative runtime contract so newer providers do not get blocked behind stale frontend heuristics. This publishes the full provider catalog plus structured setup status for localized and actionable connection feedback.

* Remove mira-ui submodule. * fix: isolate UI project bindings by workspace

* chore: relocate compatibility tracking to mira-ui repo
  The compatibility.json file conceptually belongs in the consumer repo, not the producer one: mira-ui is what depends on mira (calls its API), so the UI side should declare which agent versions it's compatible with. Living in mira coupled the agent's release cadence to the UI for no runtime benefit: nothing in mira reads compatibility.json at runtime, and the actual UI ↔ agent wire-format handshake goes through GET /version (served from _API_CONTRACT_VERSION in mira_engine/channels/ui.py).
  This commit deletes the file and its supporting tooling from mira. A companion PR in mira-ui will create compatibility.json there with the same schema, port the validator to Node, and add a tag-time guard to desktop-release.yml.
  Removed:
  - compatibility.json
  - scripts/validate_compatibility.py
  - tests/test_compatibility_validation.py
  Updated docs to reflect the new location:
  - README.md: "Release Compatibility Mapping" section now points to mira-ui and clarifies that mira's only contribution is the api_contract field on /version (sourced from _API_CONTRACT_VERSION).
  - RELEASE_DAY_CHECKLIST.md: pre-release check now runs the Node validator from mira-ui; added an explicit reminder to bump _API_CONTRACT_VERSION + mira-ui's api_contract together when wire format changes.
  - DEPLOYMENT_RELEASE_BLUEPRINT.md: blueprint already anticipated this split ("can be placed in the UI repo"); narrative updated to make it definitive.
  No runtime change: /version still reports api_contract: "v1" as before, and mira-ui still does its compatibility check off the /version response, not off any file in this repo.
  Co-authored-by: Cursor <cursoragent@cursor.com>
* ci: drop validate_compatibility step from tests.yml
  The previous commit deleted scripts/validate_compatibility.py but missed this caller in tests.yml. CI was failing on: python: can't open file '.../scripts/validate_compatibility.py': [Errno 2] No such file or directory
  Compatibility validation now lives in the mira-ui repo (scripts/validate-compatibility.mjs there), so this step has no business in mira's PR-time test workflow anymore.

* Add release-train workflow to main branch. Sync the manual release-train GitHub Actions workflow from deploy so it is visible and runnable from the default branch Actions page. * Sync release-train workflow updates from deploy to main. * Fix release-train CI config for writable model field. * Update .github/workflows/release-train.yml * Update .github/workflows/release-train.yml * Replace local skills with submodule (mira-skills repo) * Remove useless files. * fix(agent): improve skill routing precision with metadata + session memory - Parse scenarios/aliases from SKILL.md frontmatter for matching - Recent skills always get scoring boost (+15), not just on follow-up - Raise minimum score threshold from >0 to >=4 to filter weak matches - Remove follow-up short-circuit, treat it as +10 extra score instead * submodule: update mira-skills submodule reference * fix(agent): improve skill routing precision with metadata + session memory * ci: enable submodules in Tests workflow and drop stale skill test - tests.yml now checks out submodules recursively so mira-skills (newly added as a submodule) is available during pytest. - .gitmodules: switch mira-ui to HTTPS so CI runners without SSH keys can clone it. - Remove tests/agent/test_skill_creator_scripts.py: a nanobot v0.15 leftover that hard-coded the legacy skills/skill-creator layout and expected an init_skill module that does not exist in the new mira-skills submodule. * fix(skills): align medical-image skill references with mira-skills rename The mira-skills submodule renamed the deep-learning medical imaging skill from `medical-image-dl-pipeline` to `medical-image-analysis`, but several references in this repo still pointed at the old name: - mira_engine/agent/skills.py: hardcoded alias key for routing ("去伪影", "monai", "mri", etc.) lived under the old name, so the router could never reach this skill when it lived in the submodule. 
- tests/test_agent_loop_core.py: three fixtures and the active-skills injection assertion referenced the old name. - tests/agent/test_skills_loader.py: routing tests referenced the old name (and would have broken once the alias key was renamed). - README.md: docs still listed the old skill name. Rename every reference to `medical-image-analysis` so routing actually selects the skill that exists in the submodule.
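
The routing changes in the skill-precision commit above (a +15 boost for recent skills, +10 extra on follow-up, and a minimum score of 4) can be illustrated as below. The function and its inputs are hypothetical. In particular, applying the follow-up bonus only to recently used skills is an assumption, since the commit message does not say which skills receive it.

```python
RECENT_BOOST = 15     # recent skills always get this boost
FOLLOW_UP_BOOST = 10  # extra score on follow-up turns (no short-circuit)
MIN_SCORE = 4         # threshold raised from > 0 to >= 4


def rank_skills(base_scores: dict, recent: set, is_follow_up: bool) -> list:
    """Return skill names that clear the threshold, best score first."""
    ranked = []
    for name, score in base_scores.items():
        if name in recent:
            score += RECENT_BOOST
            if is_follow_up:
                # Assumption: the follow-up bonus applies to recent skills only.
                score += FOLLOW_UP_BOOST
        if score >= MIN_SCORE:
            ranked.append((score, name))
    # Sort by descending score, then by name for a stable order.
    return [name for score, name in sorted(ranked, key=lambda p: (-p[0], p[1]))]
```

With these constants, a weak keyword match scoring 3 is filtered out unless the skill was used recently in the session.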

agent-release.yml runs the full pytest suite before building wheels. Now that mira-skills lives in a submodule (PR #80), the runner must clone it or skill-dependent tests fail (skill_names is None, builtin skill list missing 'builtin-skills').

- Delete .github/workflows/ci.yml (nanobot v0.15 leftover that duplicated pytest and only triggered for main/nightly). - Move its `ruff check --select F401,F841` step into tests.yml. - Exclude mira-skills submodule from ruff via pyproject `extend-exclude`. - Fix 8 F401/F841 violations in mira_engine/* that the legacy workflow had surfaced.

* fix: handle conda as .bat file on Windows
  On Windows, conda is a .bat file which cannot be executed directly by subprocess.run without shell=True. Add shell=True on Windows and catch FileNotFoundError as a fallback to prevent crashes.
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor: remove redundant auto_activate_env
  The auto_activate_env function attempted to inject the conda mira environment into PATH at runtime, but this is unnecessary: subprocesses inherit the parent process's environment automatically. When started inside the conda mira env, child processes already have the correct Python and packages available.
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
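
The Windows handling described in the conda commit above could look roughly like this sketch. The function name, the choice of `conda env list` as the probe command, and the injectable `run` and `platform` parameters are illustrative assumptions. Only the `shell=True` on Windows and the `FileNotFoundError` fallback come from the commit message.

```python
import subprocess
import sys


def conda_env_list(run=subprocess.run, platform: str = sys.platform):
    """Return `conda env list` output, or None if conda is unavailable."""
    try:
        result = run(
            ["conda", "env", "list"],
            capture_output=True,
            text=True,
            # conda is a .bat file on Windows, so it must go through the shell.
            shell=(platform == "win32"),
        )
    except FileNotFoundError:
        return None  # conda is not installed or not on PATH
    return result.stdout if result.returncode == 0 else None
```

Injecting `run` and `platform` keeps the Windows branch testable on any operating system without conda installed.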

* fix: improve LLM error display in CLI with provider/model context
  When the LLM provider returns a 500 or other error, the CLI now shows:
  - The provider name and model that failed
  - The raw API error message (without "Error:" prefix)
  - A hint suggesting retry/checking network

  Before: just "Error: Internal Server Error" which was cryptic
  After: structured error output with actionable context

  Also added _is_llm_error() detection to distinguish provider errors from normal agent responses.
* fix: restore _print_agent_response accidentally removed in LLM error fix
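
To illustrate the "before/after" described above, here is a rough sketch of a structured error formatter. The function name `format_llm_error`, its signature, and the exact wording of the output are assumptions for illustration; the CLI's real output format is not shown in this PR.

```python
def format_llm_error(provider: str, model: str, raw: str) -> str:
    """Build a multi-line error with provider/model context (hypothetical)."""
    message = raw.strip()
    # Drop a leading "Error:" so the raw API message is shown on its own.
    if message.lower().startswith("error:"):
        message = message[len("error:"):].strip()
    return "\n".join(
        [
            f"LLM request failed (provider: {provider}, model: {model})",
            f"  {message}",
            "  Hint: retry, or check your network connection.",
        ]
    )


print(format_llm_error("deepseek", "deepseek-chat", "Error: Internal Server Error"))
```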

* Update versions.
* Expand core-module tests to harden release quality.
  Add high-value coverage for agent loop paths, channel/web handlers, config matching, and tool execution edge cases, plus a reusable scoped coverage command to track core coverage targets consistently.
* fix: harden rc6 tests against environment differences
  Use interpreter-path based shell commands and deterministic gateway/DDGS mocks so release candidate test runs pass reliably across local setups.
* fix: stabilize rc6 Windows build test compatibility
  Normalize backup and file URI path handling for Windows and make command/search tests deterministic across shell quoting and file timestamp differences.
* ci(release): checkout submodules so pytest and skills loading work
  agent-release.yml runs the full pytest suite before building wheels. Now that mira-skills lives in a submodule (PR #80), the runner must clone it or skill-dependent tests fail (skill_names is None, builtin skill list missing 'builtin-skills').
* Add native Windows service manager
* Support Windows ARM64 uv bundling
* Mark bundle placeholder runtime as unconfigured
* Fix Windows release CI test failures
* Fix Windows service manager test home path
* Improve macOS LaunchAgent bundle service
* Support per-engine runtime workspace
* Hot reload runtime config from UI saves
* Fix bundled engine service identity
* Stop auto replanning without policy

* Update versions.
* Expand core-module tests to harden release quality.
  Add high-value coverage for agent loop paths, channel/web handlers, config matching, and tool execution edge cases, plus a reusable scoped coverage command to track core coverage targets consistently.
  Made-with: Cursor
* fix: harden rc6 tests against environment differences
  Use interpreter-path based shell commands and deterministic gateway/DDGS mocks so release candidate test runs pass reliably across local setups.
  Made-with: Cursor
* fix: stabilize rc6 Windows build test compatibility
  Normalize backup and file URI path handling for Windows and make command/search tests deterministic across shell quoting and file timestamp differences.
  Made-with: Cursor
* ci(release): checkout submodules so pytest and skills loading work
  agent-release.yml runs the full pytest suite before building wheels. Now that mira-skills lives in a submodule (PR #80), the runner must clone it or skill-dependent tests fail (skill_names is None, builtin skill list missing 'builtin-skills').
* Fix Windows release path tests
* Route normal UI chats through base loop
* Generalize default USER.md template
  The shipped template included personal name, timezone, and research-domain details that are specific to a single contributor. Replace with a neutral, general-purpose profile so new installs do not silently inherit someone else's identity and research preferences.
* Harden macOS launchd bootstrap against in-place bundle upgrades
  After a DMG re-install the user-installed LaunchAgent often refused to update with "Bootstrap failed: 5: Input/output error" and the desktop UI kept talking to the previous engine. Three independent issues conspired to produce this:
    * `launchctl bootout` returns before the previous job has actually exited (especially when it has active aiohttp/WebSocket clients to drain), so the immediate `bootstrap` raced against a half-torn-down label. Poll `launchctl print` until the label leaves the domain before bootstrapping, with a bounded timeout so a stuck job cannot hang the install forever.
    * `install_service` was not transactional: a failed bootstrap could leave the new plist on disk and the persisted state file pointing at an executable that was never actually loaded. Snapshot the previous plist, restore it on bootstrap failure (delete it if there was none), and roll the launchd job back if the base-class state write fails after bootstrap succeeded.
    * `_handle_version` re-read the on-disk manifest on every call, which meant the still-running old engine reported the *new* SHA after the DMG overwrote the manifest file in place, making the desktop UI believe the live engine already matched the bundle and skip the reinstall. Snapshot the engine identity at `UiChannel.__init__` and expose a new `engine_sha256_at_boot` marker field so the UI can distinguish trustworthy boot snapshots from legacy engines.

  Also tighten cleanup: `uninstall_service` now mirrors install by calling `launchctl remove` in addition to `bootout` so the label does not linger in launchd's cache and trip a subsequent reinstall.

  Covered by new pytest cases:
    * teardown waits for launchd to release the label
    * teardown wait is bounded by its timeout
    * failed installs restore the previous plist (or remove a fresh one)
    * failed installs do not update the engine identity state
    * uninstall removes the label from the cache
    * /version exposes engine_sha256_at_boot
    * /version snapshots identity at boot (survives manifest swap)
* fix: preserve DeepSeek reasoning content
* fix: require Windows service install
* fix: refine task plan contract guardrails
* feat: keep UI alive during agent work
* fix: preserve DeepSeek reasoning content
* fix: suppress activity pings for plain progress callbacks
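
The bounded wait described in the launchd hardening commit above could look roughly like this. It is a sketch under stated assumptions, not the code from this PR: `label_loaded` and `wait_for_release` are hypothetical names, the default timeout and poll interval are made up, and it assumes that `launchctl print <service-target>` exits non-zero once the label has left the domain.

```python
import subprocess
import time
from typing import Callable


def label_loaded(service_target: str) -> bool:
    """Assumed check: `launchctl print` succeeds while the job is loaded."""
    result = subprocess.run(
        ["launchctl", "print", service_target],
        capture_output=True,
        check=False,
    )
    return result.returncode == 0


def wait_for_release(
    is_loaded: Callable[[], bool],
    timeout_s: float = 10.0,
    interval_s: float = 0.2,
) -> bool:
    """Poll until `is_loaded()` is False; give up after `timeout_s` seconds.

    Returns True if the label was released, False on timeout, so a stuck
    job cannot block the caller forever.
    """
    deadline = time.monotonic() + timeout_s
    while is_loaded():
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
    return True
```

An installer might call `wait_for_release(lambda: label_loaded(target))` between `bootout` and `bootstrap`, where `target` is a service target such as `gui/<uid>/<label>`; passing the check in as a callable also keeps the wait loop testable without launchd.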

* fix: route DeepSeek through native OpenAI-compatible provider
  LiteLLM strips reasoning_content from assistant messages on the way back to DeepSeek (litellm#26395), so any multi-turn thinking-mode or tool-call conversation 400s after the first response. Route DeepSeek directly through OpenAICompatProvider (the same approach nanobot took in commit 3dfdab7) to preserve reasoning_content end-to-end.
  - registry: give the deepseek spec api.deepseek.com/v1 as default base and enable strip_model_prefix so "deepseek/deepseek-chat" reaches the API as "deepseek-chat".
  - factory: route provider_name == "deepseek" (or deepseek/* models) to OpenAICompatProvider before falling back to LiteLLM.
  - openai_compat: add DeepSeek to the thinking-mode extra_body branch ({"thinking": {"type": "enabled"|"disabled"}}) and backfill empty reasoning_content on legacy assistant tool-call turns so DeepSeek's validator stops rejecting follow-ups on resumed sessions.
  - litellm_provider: drop the now-dead DeepSeek error-retry/backfill path; the native route handles it proactively. Keep the generalized provider_specific_fields reasoning extraction, which still helps any remaining LiteLLM-routed model that hides reasoning under nested keys.
* fix: bring OpenAICompatProvider HTTP timeouts to LiteLLM parity
  Routing DeepSeek through the native OpenAI SDK shortened the effective HTTP timeout from LiteLLM's 6000s to the SDK's 600s read / 5s connect, which is too tight for DeepSeek V4-Pro thinking mode (often quiet for minutes before streaming) and for slow / proxied networks in CN. Users report `APITimeoutError: Request timed out.` on the first turn even though LiteLLM never tripped on the same setup.
  Pass an explicit `httpx.Timeout(connect=30, read=6000, ...)` into the AsyncOpenAI client and let operators tune both legs via MIRA_LLM_CONNECT_TIMEOUT_S / MIRA_LLM_READ_TIMEOUT_S. Garbage env values fall back to the generous defaults instead of crashing provider init.
* fix: relax per-round experiment guardrail and surface provider error inline
  The 'auto-run guard warning: multiple experiments advanced in one round' emission was pure noise: it never actually stopped the loop, but the matching prompt rule (AT MOST ONE terminal transition per turn) was making the model timid and the warning made the UI imply a hard stop. Drop the warning, soften the prompt to a preference, and rely on the existing `running_count <= 1` invariant in task_plan guardrails for the real concurrency limit.
  Also fix the confusing stop-reason ordering: when an LLM call fails the error text lives in `final_content` and is rendered as the assistant reply (below the progress events), so users saw `auto-run stop reason: provider error` followed by the error, making it look like the stop caused the error. Surface a truncated snippet of the error inline with the stop reason so the cause is visible next to the effect.
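
The timeout fix above says that garbage values in MIRA_LLM_CONNECT_TIMEOUT_S / MIRA_LLM_READ_TIMEOUT_S fall back to the defaults instead of crashing provider init. A minimal sketch of that fallback, assuming hypothetical helpers `env_timeout` and `timeout_settings`, and assuming that non-positive or non-finite values also count as garbage:

```python
import math
import os

DEFAULT_CONNECT_S = 30.0   # connect default named in the commit message
DEFAULT_READ_S = 6000.0    # read default named in the commit message


def env_timeout(name: str, default: float, env=None) -> float:
    """Read a timeout in seconds from the environment, tolerating bad input."""
    source = os.environ if env is None else env
    raw = source.get(name)
    if raw is None:
        return default
    try:
        value = float(raw)
    except ValueError:
        return default  # e.g. "abc" or an empty string
    if not math.isfinite(value) or value <= 0:
        return default  # assumption: reject nan, inf, zero, and negatives
    return value


def timeout_settings(env=None) -> dict:
    """Collect both legs that operators can tune."""
    return {
        "connect": env_timeout("MIRA_LLM_CONNECT_TIMEOUT_S", DEFAULT_CONNECT_S, env),
        "read": env_timeout("MIRA_LLM_READ_TIMEOUT_S", DEFAULT_READ_S, env),
    }


print(timeout_settings({"MIRA_LLM_CONNECT_TIMEOUT_S": "abc",
                        "MIRA_LLM_READ_TIMEOUT_S": "120"}))
# prints {'connect': 30.0, 'read': 120.0}
```

The resulting values could then feed the `connect` and `read` arguments of `httpx.Timeout`; the remaining arguments are elided in the commit message, so they are left out of this sketch.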

Resolves a single conflict in .github/workflows/release-train.yml: both sides edited the file independently. main carried 8 pre-rename fixes (writable model field, token check, etc.); release later absorbed all of those concerns implicitly during the MedPilot → MIRA rename (#47) and the gateway config unification (#45).

Resolution: keep release's version verbatim. Discarding main's version is safe because:
- main still imports `medpilot.config.schema` → would ImportError after the package rename.
- main still calls `medpilot gateway` → CLI no longer exists.
- main still sets `cfg.channels.web.host` → field removed by #45 (unify web bind host/port under gateway) and would be rejected by the Config schema.
- Token-check verbosity differences are cosmetic only.

This commit unblocks PR #88 (release → main) by making the branch fast-forward-able / cleanly mergeable.

ChenglongWang and others added 30 commits April 3, 2026 23:18

Add deployment PR template.

16c438f

Standardize issue linking, test evidence, and rollback notes for per-ticket deploy PRs. Made-with: Cursor

Add compatibility contract validation workflow. (#21)

fe3aecb

Define a release compatibility mapping with schema-style checks and enforce it in CI so UI and agent versions stay aligned.

Standardize gateway health and version endpoints. (#22)

6e0ff26

Expose machine-readable /health and /version contracts (plus /api aliases) so desktop bootstrap and release compatibility checks can rely on a stable runtime handshake.

Add medpilot-agent service lifecycle CLI skeleton. (#23)

86b6b9e

Introduce a dedicated local engine management CLI with install/start/stop/status/logs/doctor commands and test coverage so deployment workflows no longer depend on tmux sessions.

Fix PyPI publish workflow.

dd51104

Update submodule version.

Fix build config file.

ead8ec8

Windows platform tests fix.

11cfac6

Rename package to medpilot and derive versions from tags.

814c6bb

Switch runtime/docs/tests to the new package name, enable hatch-vcs tag-based versioning, and improve release workflow observability with full tag checkout plus verbose PyPI uploads. Made-with: Cursor

Update deploy package name.

f7c8a08

Update compatibility.json with new release train and agent/ui versions.

Update versions.

a3be66b

Fix smoke test fail issue.

a3de6a6

Fix release-train token check to avoid workflow parse failures.

038ac3b

Made-with: Cursor

Fix release-train CI config for writable model field.

b5983ae

Made-with: Cursor

Persist project display names in per-project metadata.

620baa0

Update submodule version.

Update versions.

83133be

Expand core-module tests to harden release quality.

f64c6a8

Add high-value coverage for agent loop paths, channel/web handlers, config matching, and tool execution edge cases, plus a reusable scoped coverage command to track core coverage targets consistently. Made-with: Cursor

Expand core-module tests to harden release quality. (#34)

e5ec4a2

Add high-value coverage for agent loop paths, channel/web handlers, config matching, and tool execution edge cases, plus a reusable scoped coverage command to track core coverage targets consistently.

Limit tests push triggers to stable branches.

81f3746

Keep pull_request checks for all PRs while restricting push-triggered test runs to main/dev/release to avoid duplicate CI runs on feature branches. Made-with: Cursor

Persist UI workspace updates to active config file. (#38)

7434801

Initialize web projects root from runtime workspace and write /api/config changes back to the currently selected config path so UI workspace edits remain permanent across restarts.

ChenglongWang and others added 27 commits May 2, 2026 18:26

Merge branch 'dev' into release

e4c5178

Merge branch 'dev' into release

1852545

fix: support provider proxy and oauth paths (#74)

47f7219

fix: isolate UI project bindings by workspace (#76)

4e4151e

* Remove mira-ui submodule. * fix: isolate UI project bindings by workspace

Preserve runtime routing models on UI config saves

ff2637c

Merge branch 'dev' into release

3ffe8ab

fix: stabilize OAuth XDG path expansion on Windows

8c2a6ef

Merge branch 'dev' into release

3aaabb6

Merge branch 'dev' into release

d2054a6

Merge branch 'dev' into release

59dab52

Fix Windows release path tests

db31465

Fix Windows release path tests

9e13fae

Merge branch 'dev' into release

b491502

Merge branch 'dev' into release

47fc2ca

ChenglongWang requested a review from ldxFAIRYTAIL May 30, 2026 17:19

ldxFAIRYTAIL approved these changes May 30, 2026

ldxFAIRYTAIL merged commit 6d577e6 into main May 30, 2026
4 checks passed

ldxFAIRYTAIL merged 93 commits into main from release

ChenglongWang commented May 30, 2026

3 participants
