diff --git a/CHANGELOG.md b/CHANGELOG.md index ed03c4680..a4ae307a7 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -9,6 +9,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## Recent Releases +**v0.1.92 (June 1, 2026)** - Orchestrator Collaborator Refactor & Parallel Search MCP +Refactors the monolithic orchestrator into 49 lazy collaborators with stable delegator call sites, splits focused TUI display helpers into sibling modules, adds characterization coverage for the extraction seams, and introduces a Parallel Web Search MCP registry entry plus runnable example config. + **v0.1.91 (May 27, 2026)** - Config Reliability & Hook Safety Hardens release-critical configuration paths with centralized coordination, timeout, and orchestrator runtime parsing; strict unknown-key validation for typo detection; checklist runtime control wiring; and safer Gemini/Codex native hook path permission precedence. @@ -32,6 +35,36 @@ New `orchestrator.coordination.criteria_mode` option lets evaluation criteria em --- +## [0.1.92] - 2026-06-01 + +### Added +- **Parallel Web Search MCP**: Added a `parallel_search` MCP server registry entry and `massgen/configs/tools/web-search/parallel_search_example.yaml` for Parallel's hosted Search MCP server, supporting anonymous exploratory use and optional `PARALLEL_API_KEY` headers for higher rate limits. +- **Orchestrator Refactor Roadmap**: Added `docs/dev_notes/orchestrator_refactor_roadmap.md` to document the extraction sequence, lessons learned, and high-risk follow-up work left intentionally out of scope. +- **Characterization Coverage**: Added orchestrator and Textual terminal display characterization suites to pin public contracts and extraction seams before continuing deeper refactors. + +### Changed +- **Orchestrator Collaborator Extraction**: `massgen/orchestrator.py` was reduced from 21,599 to 8,574 lines by extracting 49 lazy collaborators into `massgen/orchestrator_collaborators/`. 
Existing methods remain available through thin delegators so current internal and external call sites keep working. +- **Textual Terminal Display Cleanup**: Provider/model display helpers, terminal capability probing, and widget-debug helpers moved out of `textual_terminal_display.py` into focused sibling modules while preserving public imports. +- **Refactor Test Seams**: Existing monkeypatch and mock-stub tests were repointed to the collaborator locations without deleting tests or weakening assertions. + +### Tests +- Added `massgen/tests/test_orchestrator_characterization.py` covering the orchestrator public contract and lazy collaborator access pattern. +- Added `massgen/tests/frontend/test_textual_terminal_display_characterization.py` covering Textual display public exports and helper extraction seams. +- Updated integration/unit coverage around broadcast hooks, restart/external tools, auto trace analysis, essential files, evaluator personas, and orchestrator units for the new collaborator seams. +- Verified targeted characterization and collaborator suites; ruff checks pass for the refactored orchestrator, collaborator package, and Textual display modules. + +### Notes +- Image/Video Edit Capabilities ([#959](https://github.com/massgen/MassGen/issues/959)) remain deferred to v0.1.93. +- Remaining high-risk extraction work for MidStreamInjectionHookInstaller and streaming/coordination cores is documented for follow-up. + +### Technical Details +- **Major Focus**: Shrink MassGen's orchestration core without changing behavior, making future coordination changes easier to isolate and review. 
+- **Key Commits**: `f9227eaf`, `a80281cb`, `efa4dd4c`, `b155a346` +- **PRs Merged**: [#1108](https://github.com/massgen/MassGen/pull/1108) +- **Contributors**: @NormallyGaussian, @ncrispino, @HenryQi and the MassGen team + +--- + ## [0.1.91] - 2026-05-27 ### Added @@ -48,7 +81,7 @@ New `orchestrator.coordination.criteria_mode` option lets evaluation criteria em - Added native hook regression coverage for nested read-only path precedence, protected-path enforcement, and Claude Code `additionalContext` injection conversion. ### Notes -- Image/Video Edit Capabilities ([#959](https://github.com/massgen/MassGen/issues/959)) remain deferred to v0.1.92. +- Image/Video Edit Capabilities ([#959](https://github.com/massgen/MassGen/issues/959)) remain deferred to v0.1.93. ### Technical Details - **Major Focus**: Make release-critical YAML configuration surfaces typo-resistant and parser-complete while hardening native hook path authorization. @@ -81,7 +114,7 @@ New `orchestrator.coordination.criteria_mode` option lets evaluation criteria em ### Notes - Discriminative Criteria Refinements from the v0.1.90 roadmap landed in this release. -- Image/Video Edit Capabilities ([#959](https://github.com/massgen/MassGen/issues/959)) remain deferred to v0.1.92. +- Image/Video Edit Capabilities ([#959](https://github.com/massgen/MassGen/issues/959)) remain deferred to v0.1.93. ### Technical Details - **Major Focus**: Make checklist-gated refinement a stronger optimization loop by improving the loss signal, reducing scoring bias, and preventing low-signal criteria from dominating later rounds. @@ -109,7 +142,7 @@ New `orchestrator.coordination.criteria_mode` option lets evaluation criteria em ### Notes - This release completes the follow-up Antigravity integration pass that v0.1.88 introduced as a first version. -- Discriminative Criteria Refinements landed in v0.1.90; Image/Video Edit Capabilities ([#959](https://github.com/massgen/MassGen/issues/959)) remain deferred to v0.1.92. 
+- Discriminative Criteria Refinements landed in v0.1.90; Image/Video Edit Capabilities ([#959](https://github.com/massgen/MassGen/issues/959)) remain deferred to v0.1.93. ### Technical Details - **Major Focus**: Make Antigravity CLI reliable in real MassGen coordination runs by hardening auth, workspace isolation, workflow-tool semantics, hook integration, and prompt affordance boundaries. @@ -140,7 +173,7 @@ New `orchestrator.coordination.criteria_mode` option lets evaluation criteria em ### Notes - Antigravity CLI (`agy`) must be installed separately with `curl -fsSL https://antigravity.google/cli/install.sh | bash`. - Local mode can use existing Google OAuth state at `~/.gemini/google_accounts.json`; Docker mode requires `GEMINI_API_KEY` or `GOOGLE_API_KEY` because OAuth state does not cross container boundaries. -- Follow-up Antigravity hardening landed in v0.1.89; Discriminative Criteria Refinements landed in v0.1.90; Image/Video Edit Capabilities ([#959](https://github.com/massgen/MassGen/issues/959)) remain deferred to v0.1.92. +- Follow-up Antigravity hardening landed in v0.1.89; Discriminative Criteria Refinements landed in v0.1.90; Image/Video Edit Capabilities ([#959](https://github.com/massgen/MassGen/issues/959)) remain deferred to v0.1.93. ### Technical Details - **Major Focus**: Add Google Antigravity CLI as a first-class MassGen backend while keeping project-local isolation and MassGen workflow/tool semantics intact. diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 89860b0fc..c3c0eb507 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -359,7 +359,7 @@ Create a `.env` file in the `massgen` directory as described in [README](README. ## 🔧 Development Workflow -> **Important**: Our next version is v0.1.92. If you want to contribute, please contribute to the `dev/v0.1.92` branch (or `main` if dev/v0.1.92 doesn't exist yet). +> **Important**: Our next version is v0.1.93. 
If you want to contribute, please contribute to the `dev/v0.1.93` branch (or `main` if dev/v0.1.93 doesn't exist yet). ### 1. Create Feature Branch @@ -367,8 +367,8 @@ Create a `.env` file in the `massgen` directory as described in [README](README. # Fetch latest changes from upstream git fetch upstream -# Create feature branch from dev/v0.1.92 (or main if dev branch doesn't exist yet) -git checkout -b feature/your-feature-name upstream/dev/v0.1.92 +# Create feature branch from dev/v0.1.93 (or main if dev branch doesn't exist yet) +git checkout -b feature/your-feature-name upstream/dev/v0.1.93 ``` ### 2. Make Your Changes @@ -507,7 +507,7 @@ git push origin feature/your-feature-name ``` Then create a pull request on GitHub: -- Base branch: `dev/v0.1.92` (or `main` if dev branch doesn't exist yet) +- Base branch: `dev/v0.1.93` (or `main` if dev branch doesn't exist yet) - Compare branch: `feature/your-feature-name` - Add clear description of changes - Link any related issues @@ -617,7 +617,7 @@ Have a significant feature idea not covered by existing tracks? - [ ] Tests pass locally - [ ] Documentation is updated if needed - [ ] Commit messages follow convention -- [ ] PR targets `dev/v0.1.92` branch (or `main` if dev branch doesn't exist yet) +- [ ] PR targets `dev/v0.1.93` branch (or `main` if dev branch doesn't exist yet) ### PR Description Should Include diff --git a/PR_DRAFT.md b/PR_DRAFT.md index afdb34ba6..7eff2a20d 100644 --- a/PR_DRAFT.md +++ b/PR_DRAFT.md @@ -1,35 +1,75 @@ -# PR Draft: v0.1.91 Config Reliability +# PR Draft: v0.1.92 God-Class Refactor — Collaborator Extraction ## Summary -- Centralize `orchestrator.coordination` YAML parsing in `CoordinationConfig.from_dict()`. -- Centralize top-level `timeout_settings` parsing in `TimeoutConfig.from_dict()`. -- Centralize top-level orchestrator runtime application in `AgentConfig.apply_orchestrator_config()`. -- Keep `cli._parse_coordination_config()` as a compatibility wrapper. 
-- Keep `cli._parse_timeout_config()` and `cli._apply_orchestrator_runtime_params()` as compatibility wrappers. -- Warn on unknown coordination, orchestrator, and timeout keys so typos such as `fast_interation_mode`, `voting_sensitivty`, and `orchestrator_timout_seconds` surface during validation. -- Make strict config validation release-blocking for unknown config key warnings. -- Wire documented subagent timeout fields and planning controls through the centralized parser. -- Wire validated checklist runtime fields `max_checklist_calls_per_round` and `checklist_first_answer` through the centralized orchestrator runtime helper. -- Harden standalone native hook permission enforcement so nested read-only/protected paths override broader writable parent paths. -- Align Claude Code native hook injection tests/docs with the SDK-native `additionalContext` contract. + +Refactor MassGen's two largest files into modular collaborators with **zero breaking changes**, gated by characterization tests written first. Establishes a repeatable extract-collaborators pattern for the remaining cleanup work. + +- **`orchestrator.py`: 21,599 → 8,574 lines (−13,025, 60% reduction)** — 49 collaborator classes extracted into a new `massgen/orchestrator_collaborators/` package. The Orchestrator class keeps thin delegator methods so every existing call site (internal and external) works unchanged. +- **`textual_terminal_display.py`: 14,580 → 14,287** — 3 sibling display modules extracted (`_textual_terminal_capabilities.py`, `_textual_provider_model.py`, `_textual_widget_debug.py`). +- **Two characterization test files** (77 tests) pin the public contract + extraction seams so the refactor is provably safe. +- **Six tracked test files** updated to repoint monkeypatch / mock-stub seams to the new collaborator locations (no test deletions or assertion weakenings). +- **`orchestrator.py` stays a single `.py` module** (not a package) — preserves `Path(__file__).parent / 'skills'` resolution. 
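The thin-delegator arrangement described in the summary bullets can be sketched roughly as follows. This is a minimal illustration, not code from the MassGen repository: `MiniOrchestrator`, `LeadCapGate`, `lead_cap`, and `_check_lead_cap` are hypothetical names invented for this example, and the only assumption taken from the PR text is that collaborators are built lazily and reached through delegator methods on the orchestrator.

```python
from functools import cached_property


class LeadCapGate:
    """Hypothetical collaborator; not a class from the MassGen codebase."""

    def __init__(self, orchestrator):
        # Back-reference: shared mutable state stays owned by the orchestrator.
        self._orchestrator = orchestrator

    def check_lead_cap(self, answers_ahead: int) -> bool:
        # Read state through the back-reference; fall back to a default
        # when __init__ never ran on the orchestrator.
        cap = getattr(self._orchestrator, "lead_cap", 2)
        return answers_ahead <= cap


class MiniOrchestrator:
    """Hypothetical stand-in for a large orchestrator class."""

    def __init__(self, lead_cap: int = 2):
        self.lead_cap = lead_cap

    @cached_property
    def _lead_cap_gate(self) -> LeadCapGate:
        # Built on first access, so it also resolves on instances
        # created with __new__ that skipped __init__.
        return LeadCapGate(self)

    def _check_lead_cap(self, answers_ahead: int) -> bool:
        # Thin delegator: existing call sites and monkeypatches
        # keep targeting this method name on the orchestrator.
        return self._lead_cap_gate.check_lead_cap(answers_ahead)


bare = MiniOrchestrator.__new__(MiniOrchestrator)  # skips __init__
print(bare._check_lead_cap(1))
```

Because the collaborator is created on first attribute access rather than in `__init__`, a test that builds the object with `__new__` can still call the delegator, and patching `_check_lead_cap` on the orchestrator still intercepts every caller that goes through it.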
+ +### Collaborators extracted (49, alphabetical) + +`ActiveCoordinationCleanup`, `AgentOrchestrationSetup`, `AnswerLimitGate`, `AnswerTextNormalizer`, `BootstrapCriteriaEngine`, `BroadcastToolInitializer`, `ChangedocCoordinator`, `ChatFollowupHandler`, `ChecklistGateManager`, `CheckpointCoordinator`, `ContextPathWriteTracker`, `CriteriaEvolutionRunner`, `DockerDiagnostics`, `DspyParaphraseCoordinator`, `EnforcementBufferHelper`, `EssentialFilesHelper`, `EvaluationCriteriaGeneratorCollaborator`, `EvaluatorResultExtractor`, `FairnessGate`, `FinalPresentationRunner`, `FinalResultReporter`, `IsolatedChangeReviewer`, `MetricsReporter`, `MidStreamInjectionHookInstaller` (partial: the 6 pure helpers out of its 18 methods), `NlipRoutingInitializer`, `OrchestratorTimeoutCalculator`, `PeerAnswerVisibilityTracker`, `PersonaInjector`, `PlanningToolInjector`, `PostEvaluationRunner`, `PreCollabHelpers`, `PreviousLogRestorer`, `PromptImproverCollaborator`, `QuestionIrreversibilityAnalyzer`, `RateLimitController`, `RoundEvaluatorGateConfig`, `RoundEvaluatorRunner`, `RoundStartContextQueue`, `RunModeStrategyResolver`, `RuntimeInputDelivery`, `SkillsConfigValidator`, `SnapshotManager`, `StepModeHandler`, `SubagentLifecycleCoordinator`, `SubagentToolInjector`, `ToolMessageHelpers`, `TraceAnalyzerRunner`, `WorkspaceLifecycleManager`, `WorkspaceModalPresenter`. + +### Public contracts preserved + +- `massgen.orchestrator`: `Orchestrator`, `AgentState`, `WORKFLOW_TOOL_NAMES`, `create_orchestrator` (`MassOrchestrator` is intentionally absent; it lives in `massgen/v1/orchestrator.py`). +- `massgen.frontend.displays.textual_terminal_display`: `TextualApp`, `AgentPanel`, `TextualTerminalDisplay`, `ProgressIndicator`, `tui_log`, `EMOJI_FALLBACKS`, `TerminalCapabilityProbe`, `_PrecollabSubagentState`. + +### Implementation patterns established + +- Collaborators exposed via `functools.cached_property` (not eager `__init__` attrs) so tests using `Orchestrator.__new__(Orchestrator)` still resolve them lazily. 
+- Collaborators hold an orchestrator back-reference for shared mutable state; pure helpers are `@staticmethod`. +- All shared state (`_pre_populated_workspaces`, `workflow_tools`, `_bootstrap_criteria_accumulator`, `_subagent_launch_watcher`, `_planning_injection_dirs`, etc.) is mutated via the orchestrator back-ref to preserve single ownership. +- Cross-collaborator method calls route through the orchestrator delegator (e.g. `self._orchestrator._check_fairness_answer_lead_cap(...)`) so monkeypatches at the orchestrator level keep working. + +### Dev-note + +Full roadmap, lessons-learned, and the 6 remaining (high-risk, paused) steps are documented in `docs/dev_notes/orchestrator_refactor_roadmap.md`. ## Issues + - Linear: TBD - GitHub: TBD ## Tests -- `uv run pytest massgen/tests/test_coordination_config_wiring.py massgen/tests/test_config_validator.py massgen/tests/test_validate_all_configs_script.py massgen/tests/test_webui_config_parity.py massgen/tests/test_standalone_checkpoint_config.py -q --tb=short -ra --color=no` -- `uv run pytest massgen/tests/test_config_wiring_refactors.py massgen/tests/test_coordination_config_wiring.py massgen/tests/test_config_validator.py massgen/tests/test_validate_all_configs_script.py massgen/tests/test_webui_config_parity.py massgen/tests/test_standalone_checkpoint_config.py massgen/tests/test_decomposition_bugfixes.py -q --tb=short -ra --color=no` -- `uv run pytest massgen/tests/test_native_hook_adapters.py massgen/tests/test_gemini_cli_hook_script.py massgen/tests/test_codex_hook_script.py massgen/tests/test_gemini_cli_hook_ipc.py massgen/tests/test_codex_hook_ipc.py massgen/tests/test_codex_native_hook_adapter.py -q --tb=short -ra --color=no` -- `uv run pytest massgen/tests/test_api_params_exclusion.py -q --tb=short -ra --color=no` -- `uv run pytest massgen/tests/test_fast_mode.py massgen/tests/test_prompt_improver.py massgen/tests/test_checklist_criteria_presets.py massgen/tests/test_round_evaluator_loop.py 
massgen/tests/test_novelty_injection.py massgen/tests/test_evolving_criteria.py massgen/tests/test_execution_trace_analyzer.py massgen/tests/test_auto_trace_analysis.py massgen/tests/test_coordination_improvements_config.py massgen/tests/test_config_changedoc.py massgen/tests/test_web_review.py -q --tb=short -ra --color=no` -- `uv run python scripts/validate_all_configs.py --strict` -- `uv run python -m py_compile massgen/agent_config.py massgen/cli.py massgen/config_validator.py massgen/mcp_tools/native_hook_adapters/gemini_cli_hook_script.py massgen/mcp_tools/native_hook_adapters/codex_hook_script.py massgen/tests/test_config_wiring_refactors.py massgen/tests/test_coordination_config_wiring.py massgen/tests/test_validate_all_configs_script.py massgen/tests/test_native_hook_adapters.py massgen/tests/test_gemini_cli_hook_script.py massgen/tests/test_codex_hook_script.py` -## Configs Validated -- `scripts/validate_all_configs.py --strict` validated 281 configs under `massgen/configs`. +- New characterization tests (must pass before any extraction can ship): + - `uv run pytest massgen/tests/test_orchestrator_characterization.py -q` (37 tests) + - `uv run pytest massgen/tests/frontend/test_textual_terminal_display_characterization.py -q` (40 cases) +- Suites that exercise extracted collaborators (regression coverage): + - `uv run pytest massgen/tests/test_vote_only_mode.py massgen/tests/test_decomposition_mode.py -q` (load-bearing `_is_vote_only_mode` side effect) + - `uv run pytest massgen/tests/test_orchestrator_skills_injection.py massgen/tests/test_essential_files_manifest.py massgen/tests/test_evaluator_personas.py -q` (MagicMock-stub fixtures rewired to real collaborators) + - `uv run pytest massgen/tests/integration/test_orchestrator_hooks_broadcast_subagents.py massgen/tests/integration/test_orchestrator_restart_and_external_tools.py massgen/tests/test_auto_trace_analysis.py massgen/tests/unit/test_orchestrator_unit.py -q` (monkeypatch seams repointed to 
collaborator) +- Full fast non-API lane: + - `uv run pytest massgen/tests/ -q --tb=no -ra -p no:cacheprovider -m "not live_api and not docker and not expensive and not integration" --ignore=massgen/tests/frontend/test_launch_run_card.py --ignore=massgen/tests/test_interactive_system_prompt.py` +- Lint: `uv run ruff check massgen/orchestrator.py massgen/orchestrator_collaborators/ massgen/frontend/displays/textual_terminal_display.py massgen/frontend/displays/_textual_*.py` — all checks passed. + +## Verification + +- Public-method set of `Orchestrator` is byte-identical to HEAD (no methods silently dropped or renamed). +- `MassOrchestrator` correctly absent from `massgen.orchestrator` (asserted in characterization tests). +- Built-in skills directory still resolves to `/massgen/skills` (`SkillsConfigValidator` anchors via `massgen.orchestrator.__file__`, NOT the collaborator's `__file__`). +- Working tree is the only artifact — no commits. ## Known Test Lane Note -- Full fast non-API lane currently stops during collection on unrelated in-flight tests: - - `massgen/tests/frontend/test_launch_run_card.py` imports missing `massgen.frontend.displays.textual_widgets.launch_run_card` - - `massgen/tests/test_interactive_system_prompt.py` imports missing `InteractiveOrchestratorSection` + +- Independent review (senior-code-reviewer) ran the fast non-API lane against HEAD vs this branch: + - **HEAD baseline: 60 failures** (all unrelated to this PR — pre-existing on `dev/v0.1.92` from in-flight WIP) + - **This branch: 59 failures** + - Net impact: **0 regressions introduced, +1 test newly passing** (`test_lazy_collaborator_accessors_do_not_require_init` — fails on HEAD because the cached-property pattern doesn't exist there; passes here) +- The 60 baseline failures are largely from untracked WIP test files that this PR does not touch (e.g. 
`test_subagent_round_timeouts.py`, `test_interactive_thread_style.py`, `test_subagent_continuation.py`, `test_broadcast_integration.py`, `test_interactive_mode.py`, plus the long-standing `test_timeline_snapshot_scaffold.py` snapshot mismatch). +- The same untracked WIP test-file collection errors as v0.1.91 still apply (`test_launch_run_card.py` missing `launch_run_card` module; `test_interactive_system_prompt.py` missing `InteractiveOrchestratorSection`). + +## Out of scope (paused for follow-up release) + +One remaining concern needs behavior-changing work before extraction: + +- **`MidStreamInjectionHookInstaller` (remaining 12 of 18 methods)** — `_setup_hook_manager_for_agent`, `_setup_codex_mcp_hooks`, `_setup_codex_hybrid_hooks`, `_setup_native_hooks_for_agent`, `_register_round_timeout_hooks`, etc. These contain duplicated `get_injection_content` closures across 3 backend paths. Need a callback-unification pass first (behavior-changing, separate validation surface), THEN extract. The 6 pure helpers (`_close_agent_stream`, `_check_restart_pending`, `_should_defer_restart_for_first_answer`, `_clear_framework_mcp_state`, `_compute_plan_progress_stats`, `_build_tool_result_injection`) ARE extracted in this PR. + +Also out of scope: the streaming/coordination cores (`_stream_agent_execution` 2,239 lines, `_stream_coordination_with_agents` 911, `_coordinate_agents` 541, `__init__` ~557, `chat` 180) — these need structural restructuring rather than pure extraction, deferred to a separate PR. + +The full plan, ordered steps, and applied-during-this-PR lessons learned are in `docs/dev_notes/orchestrator_refactor_roadmap.md` to make the follow-up straightforward. diff --git a/README.md b/README.md index c0917e968..cb68d440a 100644 --- a/README.md +++ b/README.md @@ -69,7 +69,7 @@ This project started with the "threads of thought" and "iterative refinement" id

🆕 Latest Features

-- [v0.1.91 Features](#-latest-features-v0191) +- [v0.1.92 Features](#-latest-features-v0192)
@@ -122,15 +122,15 @@ This project started with the "threads of thought" and "iterative refinement" id

🗺️ Roadmap

-- [Recent Achievements (v0.1.91)](#recent-achievements-v0191) -- [Previous Achievements (v0.0.3 - v0.1.90)](#previous-achievements-v003---v0190) +- [Recent Achievements (v0.1.92)](#recent-achievements-v0192) +- [Previous Achievements (v0.0.3 - v0.1.91)](#previous-achievements-v003---v0191) - [Key Future Enhancements](#key-future-enhancements) - Bug Fixes & Backend Improvements - Advanced Agent Collaboration - Expanded Model, Tool & Agent Integrations - Improved Performance & Scalability - Enhanced Developer Experience -- [v0.1.92 Roadmap](#v0192-roadmap) +- [v0.1.93 Roadmap](#v0193-roadmap)
@@ -155,19 +155,19 @@ This project started with the "threads of thought" and "iterative refinement" id --- -## 🆕 Latest Features (v0.1.91) +## 🆕 Latest Features (v0.1.92) -**🎉 Released: May 27, 2026** +**🎉 Released: June 1, 2026** -**What's New in v0.1.91:** -- **🧭 Centralized Config Wiring** - Coordination, timeout, and top-level orchestrator runtime settings now parse through single source-of-truth helpers. -- **🔎 Config Drift Detection** - Unknown YAML keys in release-critical config surfaces now produce validation warnings, and strict config validation treats them as release blockers. -- **🛡️ Native Hook Permission Safety** - Gemini CLI and Codex standalone hooks now apply nested protected/read-only paths before broader writable workspace rules. +**What's New in v0.1.92:** +- **🧩 Orchestrator Collaborators** - The monolithic orchestrator is split into 49 lazy collaborators while preserving existing public call sites. +- **🧪 Characterization Safety Net** - New orchestrator and Textual display characterization tests pin public contracts and extraction seams. +- **🔎 Parallel Web Search MCP** - A new Parallel Search MCP registry entry and example config support LLM-optimized web research workflows. -**Try v0.1.91 Features:** +**Try v0.1.92 Features:** ```bash -pip install massgen==0.1.91 -uv run massgen --config massgen/configs/features/fast_iteration.yaml "Create an svg of an AI agent coding." +pip install massgen==0.1.92 +uv run massgen --config massgen/configs/tools/web-search/parallel_search_example.yaml "Research the latest advances in multi-agent AI systems" ``` → [See full release history and examples](massgen/configs/README.md#release-history--examples) @@ -1242,18 +1242,20 @@ MassGen is currently in its foundational stage, with a focus on parallel, asynch ⚠️ **Early Stage Notice:** As MassGen is in active development, please expect upcoming breaking architecture changes as we continue to refine and improve the system. 
-### Recent Achievements (v0.1.91) +### Recent Achievements (v0.1.92) -**🎉 Released: May 27, 2026** +**🎉 Released: June 1, 2026** -#### Config Reliability & Hook Safety -- **Centralized Config Wiring**: `CoordinationConfig.from_dict()`, `TimeoutConfig.from_dict()`, and `AgentConfig.apply_orchestrator_config()` now own their respective YAML/runtime surfaces -- **Config Drift Detection**: Unknown coordination, orchestrator, and timeout keys produce validation warnings, and strict validation treats them as release-blocking -- **Checklist Runtime Controls**: `max_checklist_calls_per_round` and `checklist_first_answer` now flow through the centralized orchestrator runtime helper -- **Native Hook Permission Safety**: Gemini CLI and Codex standalone hooks enforce nested protected/read-only paths before broader writable parents -- **Tests**: New parser/validator parity coverage and native hook regression tests guard the release-critical paths +#### Orchestrator Collaborator Refactor & Parallel Search MCP +- **Orchestrator Collaborators**: `orchestrator.py` dropped from 21,599 to 8,574 lines by extracting 49 lazy collaborators into `massgen/orchestrator_collaborators/` +- **Stable Delegator Surface**: Existing public methods remain available through thin delegators, preserving internal and external call sites +- **Textual Display Cleanup**: Provider/model helpers, terminal capability probing, and widget-debug helpers moved into focused sibling modules +- **Parallel Web Search MCP**: New `parallel_search` registry entry and example config support Parallel's hosted Search MCP server +- **Tests**: 77 new characterization cases pin the orchestrator and Textual display public contracts -### Previous Achievements (v0.0.3 - v0.1.90) +### Previous Achievements (v0.0.3 - v0.1.91) + +✅ **Config Reliability & Hook Safety (v0.1.91)**: Centralized coordination, timeout, and orchestrator runtime parsing; strict unknown-key validation; checklist runtime control wiring; and safer 
Gemini/Codex native hook path permission precedence. ✅ **Discriminative Criteria Refinements & Checklist Calibration (v0.1.90)**: Improved checklist-gated refinement quality with discriminative-power pruning, per-criterion feedback, position-bias counterbalancing, deterministic tie-breaking, a unified checklist gate, shared score parsing utilities, and fast-iteration config updates. @@ -1580,9 +1582,9 @@ MassGen is currently in its foundational stage, with a focus on parallel, asynch We welcome community contributions to achieve these goals. -### v0.1.92 Roadmap +### v0.1.93 Roadmap -Version 0.1.92 picks up the image/video edit work deferred from v0.1.86-v0.1.91 and continues multimodal provider-parity work: +Version 0.1.93 picks up the image/video edit work deferred from v0.1.86-v0.1.92 and continues multimodal provider-parity work: #### Planned Features - **Image/Video Edit Capabilities** ([#959](https://github.com/massgen/MassGen/issues/959)): Image and video editing across providers with multi-turn editing workflows via continuation IDs diff --git a/README_PYPI.md b/README_PYPI.md index a2a048b1c..aeff8532f 100644 --- a/README_PYPI.md +++ b/README_PYPI.md @@ -68,7 +68,7 @@ This project started with the "threads of thought" and "iterative refinement" id

🆕 Latest Features

-- [v0.1.91 Features](#-latest-features-v0191) +- [v0.1.92 Features](#-latest-features-v0192)
@@ -121,15 +121,15 @@ This project started with the "threads of thought" and "iterative refinement" id

🗺️ Roadmap

-- [Recent Achievements (v0.1.91)](#recent-achievements-v0191) -- [Previous Achievements (v0.0.3 - v0.1.90)](#previous-achievements-v003---v0190) +- [Recent Achievements (v0.1.92)](#recent-achievements-v0192) +- [Previous Achievements (v0.0.3 - v0.1.91)](#previous-achievements-v003---v0191) - [Key Future Enhancements](#key-future-enhancements) - Bug Fixes & Backend Improvements - Advanced Agent Collaboration - Expanded Model, Tool & Agent Integrations - Improved Performance & Scalability - Enhanced Developer Experience -- [v0.1.92 Roadmap](#v0192-roadmap) +- [v0.1.93 Roadmap](#v0193-roadmap)
@@ -154,19 +154,19 @@ This project started with the "threads of thought" and "iterative refinement" id --- -## 🆕 Latest Features (v0.1.91) +## 🆕 Latest Features (v0.1.92) -**🎉 Released: May 27, 2026** +**🎉 Released: June 1, 2026** -**What's New in v0.1.91:** -- **🧭 Centralized Config Wiring** - Coordination, timeout, and top-level orchestrator runtime settings now parse through single source-of-truth helpers. -- **🔎 Config Drift Detection** - Unknown YAML keys in release-critical config surfaces now produce validation warnings, and strict config validation treats them as release blockers. -- **🛡️ Native Hook Permission Safety** - Gemini CLI and Codex standalone hooks now apply nested protected/read-only paths before broader writable workspace rules. +**What's New in v0.1.92:** +- **🧩 Orchestrator Collaborators** - The monolithic orchestrator is split into 49 lazy collaborators while preserving existing public call sites. +- **🧪 Characterization Safety Net** - New orchestrator and Textual display characterization tests pin public contracts and extraction seams. +- **🔎 Parallel Web Search MCP** - A new Parallel Search MCP registry entry and example config support LLM-optimized web research workflows. -**Try v0.1.91 Features:** +**Try v0.1.92 Features:** ```bash -pip install massgen==0.1.91 -uv run massgen --config massgen/configs/features/fast_iteration.yaml "Create an svg of an AI agent coding." +pip install massgen==0.1.92 +uv run massgen --config massgen/configs/tools/web-search/parallel_search_example.yaml "Research the latest advances in multi-agent AI systems" ``` → [See full release history and examples](massgen/configs/README.md#release-history--examples) @@ -1241,18 +1241,20 @@ MassGen is currently in its foundational stage, with a focus on parallel, asynch ⚠️ **Early Stage Notice:** As MassGen is in active development, please expect upcoming breaking architecture changes as we continue to refine and improve the system. 
-### Recent Achievements (v0.1.91) +### Recent Achievements (v0.1.92) -**🎉 Released: May 27, 2026** +**🎉 Released: June 1, 2026** -#### Config Reliability & Hook Safety -- **Centralized Config Wiring**: `CoordinationConfig.from_dict()`, `TimeoutConfig.from_dict()`, and `AgentConfig.apply_orchestrator_config()` now own their respective YAML/runtime surfaces -- **Config Drift Detection**: Unknown coordination, orchestrator, and timeout keys produce validation warnings, and strict validation treats them as release-blocking -- **Checklist Runtime Controls**: `max_checklist_calls_per_round` and `checklist_first_answer` now flow through the centralized orchestrator runtime helper -- **Native Hook Permission Safety**: Gemini CLI and Codex standalone hooks enforce nested protected/read-only paths before broader writable parents -- **Tests**: New parser/validator parity coverage and native hook regression tests guard the release-critical paths +#### Orchestrator Collaborator Refactor & Parallel Search MCP +- **Orchestrator Collaborators**: `orchestrator.py` dropped from 21,599 to 8,574 lines by extracting 49 lazy collaborators into `massgen/orchestrator_collaborators/` +- **Stable Delegator Surface**: Existing public methods remain available through thin delegators, preserving internal and external call sites +- **Textual Display Cleanup**: Provider/model helpers, terminal capability probing, and widget-debug helpers moved into focused sibling modules +- **Parallel Web Search MCP**: New `parallel_search` registry entry and example config support Parallel's hosted Search MCP server +- **Tests**: 77 new characterization cases pin the orchestrator and Textual display public contracts -### Previous Achievements (v0.0.3 - v0.1.90) +### Previous Achievements (v0.0.3 - v0.1.91) + +✅ **Config Reliability & Hook Safety (v0.1.91)**: Centralized coordination, timeout, and orchestrator runtime parsing; strict unknown-key validation; checklist runtime control wiring; and safer 
Gemini/Codex native hook path permission precedence. ✅ **Discriminative Criteria Refinements & Checklist Calibration (v0.1.90)**: Improved checklist-gated refinement quality with discriminative-power pruning, per-criterion feedback, position-bias counterbalancing, deterministic tie-breaking, a unified checklist gate, shared score parsing utilities, and fast-iteration config updates. @@ -1579,9 +1581,9 @@ MassGen is currently in its foundational stage, with a focus on parallel, asynch We welcome community contributions to achieve these goals. -### v0.1.92 Roadmap +### v0.1.93 Roadmap -Version 0.1.92 picks up the image/video edit work deferred from v0.1.86-v0.1.91 and continues multimodal provider-parity work: +Version 0.1.93 picks up the image/video edit work deferred from v0.1.86-v0.1.92 and continues multimodal provider-parity work: #### Planned Features - **Image/Video Edit Capabilities** ([#959](https://github.com/massgen/MassGen/issues/959)): Image and video editing across providers with multi-turn editing workflows via continuation IDs diff --git a/ROADMAP.md b/ROADMAP.md index f0b1698a8..d6b3d0c92 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -1,10 +1,10 @@ # MassGen Roadmap -**Current Version:** v0.1.91 +**Current Version:** v0.1.92 **Release Schedule:** Mondays, Wednesdays, Fridays @ 9am PT -**Last Updated:** May 27, 2026 +**Last Updated:** June 1, 2026 This roadmap outlines MassGen's development priorities for upcoming releases. Each release focuses on specific capabilities with real-world use cases. @@ -42,12 +42,29 @@ Want to contribute or collaborate on a specific track? 
Reach out to the track ow | Release | Target | Feature | Owner | Use Case | |---------|--------|---------|-------|----------| -| **v0.1.92** | 05/29/26 | Image/Video Edit Capabilities | @ncrispino | Check and support img/video editing capabilities — deferred from v0.1.86-v0.1.91 ([#959](https://github.com/massgen/MassGen/issues/959)) | +| **v0.1.93** | 06/03/26 | Image/Video Edit Capabilities | @ncrispino | Check and support img/video editing capabilities — deferred from v0.1.86-v0.1.92 ([#959](https://github.com/massgen/MassGen/issues/959)) | *All releases ship on MWF @ 9am PT when ready* --- +## ✅ v0.1.92 - Orchestrator Collaborator Refactor & Parallel Search MCP (Completed) + +**Released:** June 1, 2026 + +### Features +- **Orchestrator Collaborator Extraction**: `orchestrator.py` dropped from 21,599 to 8,574 lines by extracting 49 lazy collaborators into `massgen/orchestrator_collaborators/` +- **Stable Delegator Surface**: Public methods remain available through thin delegators, preserving existing internal and external call sites +- **Textual Display Cleanup**: Provider/model helpers, terminal capability probing, and widget-debug helpers moved out of `textual_terminal_display.py` into focused sibling modules +- **Parallel Web Search MCP**: New `parallel_search` MCP registry entry and `massgen/configs/tools/web-search/parallel_search_example.yaml` for Parallel's hosted Search MCP server +- **Refactor Roadmap**: `docs/dev_notes/orchestrator_refactor_roadmap.md` documents remaining high-risk follow-up extraction work +- **Tests**: 77 new characterization cases cover orchestrator and Textual display contracts, with existing integration/unit seams repointed to the collaborators + +### Notes +- Image/Video Edit Capabilities ([#959](https://github.com/massgen/MassGen/issues/959)) remain deferred to v0.1.93. 
+ +--- + ## ✅ v0.1.91 - Config Reliability & Hook Safety (Completed) **Released:** May 27, 2026 @@ -61,7 +78,7 @@ Want to contribute or collaborate on a specific track? Reach out to the track ow - **Tests**: New parser/validator parity coverage and native hook regression tests protect these release-critical paths ### Notes -- Image/Video Edit Capabilities ([#959](https://github.com/massgen/MassGen/issues/959)) remain deferred to v0.1.92. +- Image/Video Edit Capabilities ([#959](https://github.com/massgen/MassGen/issues/959)) remain deferred to v0.1.93. --- @@ -80,7 +97,7 @@ Want to contribute or collaborate on a specific track? Reach out to the track ow ### Notes - Discriminative Criteria Refinements from the roadmap landed in this release. -- Image/Video Edit Capabilities ([#959](https://github.com/massgen/MassGen/issues/959)) remain deferred to v0.1.92. +- Image/Video Edit Capabilities ([#959](https://github.com/massgen/MassGen/issues/959)) remain deferred to v0.1.93. --- @@ -98,7 +115,7 @@ Want to contribute or collaborate on a specific track? Reach out to the track ow ### Notes - This completes the follow-up Antigravity integration pass introduced in v0.1.88. -- Discriminative Criteria Refinements landed in v0.1.90; Image/Video Edit Capabilities ([#959](https://github.com/massgen/MassGen/issues/959)) remain deferred to v0.1.92. +- Discriminative Criteria Refinements landed in v0.1.90; Image/Video Edit Capabilities ([#959](https://github.com/massgen/MassGen/issues/959)) remain deferred to v0.1.93. --- @@ -115,7 +132,7 @@ Want to contribute or collaborate on a specific track? 
Reach out to the track ow - **Tests**: `massgen/tests/test_antigravity_cli_backend.py` covers command construction, config isolation, MCP schema, workflow JSON envelopes, Docker/API-key constraints, hook wiring, and env passthrough ### Notes -- Follow-up Antigravity hardening landed in v0.1.89; Discriminative Criteria Refinements landed in v0.1.90; Image/Video Edit Capabilities ([#959](https://github.com/massgen/MassGen/issues/959)) remain deferred to v0.1.92. +- Follow-up Antigravity hardening landed in v0.1.89; Discriminative Criteria Refinements landed in v0.1.90; Image/Video Edit Capabilities ([#959](https://github.com/massgen/MassGen/issues/959)) remain deferred to v0.1.93. --- @@ -132,7 +149,7 @@ Want to contribute or collaborate on a specific track? Reach out to the track ow - **`bootstrap_subagent` Single-Shot Fix**: `Orchestrator._run_bootstrap_discriminator_step` passes `refine=False` to `spawn_subagent` — the canonical knob `SubagentManager` respects at the orchestrator level (the orchestrator's `max_new_answers_per_agent: 3` default was shadowing coordination-dict overrides) ### Notes -- Image/Video Edit Capabilities ([#959](https://github.com/massgen/MassGen/issues/959)) ultimately carried forward to v0.1.92. +- Image/Video Edit Capabilities ([#959](https://github.com/massgen/MassGen/issues/959)) ultimately carried forward to v0.1.93. - Closes [#1082](https://github.com/massgen/MassGen/issues/1082) (`llms.txt` + `llms-full.txt`) and [#1083](https://github.com/massgen/MassGen/issues/1083) (CrewAI / LangGraph / AutoGen comparison pages). --- @@ -308,7 +325,7 @@ Want to contribute or collaborate on a specific track? 
Reach out to the track ow --- -## 📋 v0.1.92 - Image/Video Edit Capabilities (Deferred from v0.1.86-v0.1.91) +## 📋 v0.1.93 - Image/Video Edit Capabilities (Deferred from v0.1.86-v0.1.92) ### Features diff --git a/ROADMAP_v0.1.92.md b/ROADMAP_v0.1.93.md similarity index 84% rename from ROADMAP_v0.1.92.md rename to ROADMAP_v0.1.93.md index 8953707c9..b69a703cc 100644 --- a/ROADMAP_v0.1.92.md +++ b/ROADMAP_v0.1.93.md @@ -1,14 +1,14 @@ -# MassGen v0.1.92 Roadmap +# MassGen v0.1.93 Roadmap -**Target Release:** May 29, 2026 +**Target Release:** June 3, 2026 ## Overview -Version 0.1.92 picks up the image/video edit work deferred from v0.1.86-v0.1.91 and continues multimodal provider-parity work. +Version 0.1.93 picks up the image/video edit work deferred from v0.1.86-v0.1.92 and continues multimodal provider-parity work. --- -## Feature: Image/Video Edit Capabilities (Deferred from v0.1.86-v0.1.91) +## Feature: Image/Video Edit Capabilities (Deferred from v0.1.86-v0.1.92) **Issue:** [#959](https://github.com/massgen/MassGen/issues/959) **Owner:** @ncrispino @@ -30,6 +30,7 @@ Version 0.1.92 picks up the image/video edit work deferred from v0.1.86-v0.1.91 ## Related Tracks +- **v0.1.92**: Orchestrator collaborator refactor and Parallel Search MCP — 49 collaborator extractions, Textual display helper split, characterization coverage, and a Parallel hosted search example - **v0.1.91**: Config reliability and hook safety — centralized config parsing, strict unknown-key validation, checklist runtime control wiring, and nested native-hook permission precedence - **v0.1.90**: Discriminative criteria refinements and checklist calibration — score-spread pruning, per-criterion feedback, position-bias counterbalancing, unified checklist gate, and shared score utilities - **v0.1.89**: Antigravity CLI full integration and hardening — workflow-mode parity, auth checks, workspace project anchoring, standalone hooks.json, and prompt affordance gating diff --git 
a/docs/announcements/archive/v0.1.91.md b/docs/announcements/archive/v0.1.91.md new file mode 100644 index 000000000..7e913f26e --- /dev/null +++ b/docs/announcements/archive/v0.1.91.md @@ -0,0 +1,77 @@ +# MassGen v0.1.91 Release Announcement + + + +## Release Summary + +We're excited to release MassGen v0.1.91 — Config Reliability & Hook Safety! 🚀 This is a reliability pass for the config and hook paths that releases depend on. Coordination, timeout, and orchestrator runtime settings now go through centralized parsers; validation catches unknown YAML keys earlier; and strict mode turns those typos into release blockers. Checklist runtime controls use the same wiring, while Gemini/Codex standalone hooks respect nested protected paths before broader workspace write permissions. + +## Install + +```bash +pip install massgen==0.1.91 +``` + +## Links + +- **Release notes:** https://github.com/massgen/MassGen/releases/tag/v0.1.91 +- **X post:** [TO BE ADDED AFTER POSTING] +- **LinkedIn post:** [TO BE ADDED AFTER POSTING] + +## Posting Notes + +- **Suggested image:** Use a screenshot of the v0.1.91 release notes. + +--- + +## Full Announcement (for LinkedIn) + +Copy everything below this line, then append content from `feature-highlights.md`: + +--- + +We're excited to release MassGen v0.1.91 — Config Reliability & Hook Safety! 🚀 This is a reliability pass for the config and hook paths that releases depend on. Coordination, timeout, and orchestrator runtime settings now go through centralized parsers; validation catches unknown YAML keys earlier; and strict mode turns those typos into release blockers. Checklist runtime controls use the same wiring, while Gemini/Codex standalone hooks respect nested protected paths before broader workspace write permissions. 
+ +**Key Improvements:** + +🧭 **Centralized Config Wiring**: +- `CoordinationConfig.from_dict()` now owns coordination YAML parsing +- `TimeoutConfig.from_dict()` now owns timeout setting parsing +- `AgentConfig.apply_orchestrator_config()` now owns top-level orchestrator runtime field application +- CLI helpers remain as compatibility wrappers around the centralized paths + +🔎 **Config Drift Detection**: +- Unknown `orchestrator.coordination.*` keys now produce validation warnings +- Unknown top-level `orchestrator.*` and `timeout_settings.*` keys are also flagged +- `scripts/validate_all_configs.py --strict` treats those warnings as release-blocking + +✅ **Checklist Runtime Controls**: +- `max_checklist_calls_per_round` now flows through the centralized orchestrator runtime helper +- `checklist_first_answer` is wired through the same path +- Documented planning and subagent timeout fields have parser coverage + +🛡️ **Native Hook Permission Safety**: +- Gemini CLI and Codex standalone hook scripts now prefer more-specific managed paths +- Nested read-only and protected paths override broader writable parent directories +- Claude Code hook injection tests/docs now match the SDK-native `additionalContext` contract + +🧪 **Tests**: +- New parser/validator parity coverage for coordination config, timeout settings, and top-level orchestrator runtime fields +- Strict config validation tests cover typo detection for release configs +- Native hook regression tests cover nested read-only precedence and protected-path enforcement + +**Getting Started:** + +```bash +pip install massgen==0.1.91 +uv run massgen --config massgen/configs/features/fast_iteration.yaml "Create an svg of an AI agent coding." 
+``` + +Release notes: https://github.com/massgen/MassGen/releases/tag/v0.1.91 + +Feature highlights: + + diff --git a/docs/announcements/current-release.md b/docs/announcements/current-release.md index 7e913f26e..869f76fbf 100644 --- a/docs/announcements/current-release.md +++ b/docs/announcements/current-release.md @@ -1,4 +1,4 @@ -# MassGen v0.1.91 Release Announcement +# MassGen v0.1.92 Release Announcement