Merged
f9227ea
feat: v0.1.92
Henry-811 May 28, 2026
a80281c
feat(mcp): add Parallel Web Search to MCP server registry + example c…
NormallyGaussian May 28, 2026
34a7654
fix(mcp): drop signature-style tool syntax from prompt-facing text
NormallyGaussian May 28, 2026
9616809
fix(mcp): inject optional API key as Bearer header for streamable-htt…
NormallyGaussian May 28, 2026
43f137d
fix(mcp): make parallel_search key-gated for auto-discovery
NormallyGaussian May 28, 2026
1fb5b17
fix(mcp): use 'claude' backend in parallel_search example for headers…
NormallyGaussian May 28, 2026
efa4dd4
refactor: extract 35 collaborators from orchestrator.py + 3 from TUI …
ncrispino Jun 1, 2026
25f7ff9
refactor: extract EvaluatorResultExtractor + QuestionIrreversibilityA…
ncrispino Jun 1, 2026
ed2edb7
refactor: extract EvaluationCriteriaGenerator collaborator
ncrispino Jun 1, 2026
0caff11
refactor: extract RateLimitController collaborator
ncrispino Jun 1, 2026
a4c4f0e
refactor: extract ChangedocCoordinator collaborator
ncrispino Jun 1, 2026
c4b1aa8
refactor: extract PromptImproverCollaborator
ncrispino Jun 1, 2026
d94ac07
refactor: fold memory-merger into WorkspaceLifecycleManager
ncrispino Jun 1, 2026
d6237b1
refactor: extract PreCollabHelpers collaborator
ncrispino Jun 1, 2026
94d8c3d
refactor: fold format_planning_mode_ui into QuestionIrreversibilityAn…
ncrispino Jun 1, 2026
458d489
docs: update PR_DRAFT to reflect 42 collaborators + 57% reduction
ncrispino Jun 1, 2026
fe1d397
refactor: extract AgentOrchestrationSetup (paused step from original …
ncrispino Jun 1, 2026
fac92cd
refactor: fold _rewrite_subagent_mcp_config_files into SubagentToolIn…
ncrispino Jun 1, 2026
4786fd5
refactor: fold _read_evolution_json_from_result into CriteriaEvolutio…
ncrispino Jun 1, 2026
ae2b449
chore: remove unused local Any import in SubagentToolInjector
ncrispino Jun 1, 2026
9c49863
docs: bump PR_DRAFT to 43 collaborators + 58% reduction
ncrispino Jun 1, 2026
9967331
refactor: extract DockerDiagnostics collaborator
ncrispino Jun 1, 2026
a3beb7e
refactor: extract StepModeHandler collaborator
ncrispino Jun 1, 2026
b204f28
refactor: extract ChatFollowupHandler collaborator
ncrispino Jun 1, 2026
5380832
docs: bump PR_DRAFT to 46 collaborators + 59% reduction
ncrispino Jun 1, 2026
4aef1fc
refactor: extract ToolMessageHelpers collaborator
ncrispino Jun 1, 2026
633197a
refactor: fold 2 workspace helpers into WorkspaceLifecycleManager
ncrispino Jun 1, 2026
af6b9ff
refactor: fold _split_combined_spawn_result into TraceAnalyzerRunner
ncrispino Jun 1, 2026
01b6413
refactor: extract EssentialFilesHelper collaborator
ncrispino Jun 1, 2026
56f5471
fix: restore @staticmethod on _format_trace_analyzer_for_memory_static
ncrispino Jun 1, 2026
274ba5f
docs: sync PR_DRAFT to final 48 collaborators / 60% reduction
ncrispino Jun 1, 2026
2d5a4a6
refactor: fold 3 criteria-evolution helpers into CriteriaEvolutionRunner
ncrispino Jun 1, 2026
7ada7b7
refactor: extract EnforcementBufferHelper collaborator
ncrispino Jun 1, 2026
df962b5
docs: sync PR_DRAFT — 49 collaborators
ncrispino Jun 1, 2026
f2987d0
docs: correct PR_DRAFT out-of-scope section
ncrispino Jun 1, 2026
5aa70c4
refactor: fold _should_spawn_trace_analyzer into TraceAnalyzerRunner
ncrispino Jun 1, 2026
f5a0d76
chore: remove unused pathlib.Path import in CriteriaEvolutionRunner
ncrispino Jun 1, 2026
0b23bc0
Merge branch 'dev/v0.1.92' of https://github.com/Leezekun/MassGen int…
ncrispino Jun 1, 2026
b155a34
Merge pull request #1108 from NormallyGaussian/feat/parallel-search-mcp
Henry-811 Jun 1, 2026
f4bc957
docs: docs for v0.1.92
Henry-811 Jun 1, 2026
3e82d2b
Fix parallel mcp example
ncrispino Jun 1, 2026
268436d
Merge pull request #1109 from massgen/docs_for_v0.1.92
Henry-811 Jun 1, 2026
e5bfb47
Fix parallel mcp example
ncrispino Jun 1, 2026
bc1581e
Merge branch 'dev/v0.1.92' of https://github.com/Leezekun/MassGen int…
ncrispino Jun 1, 2026
88 changes: 64 additions & 24 deletions PR_DRAFT.md
@@ -1,35 +1,75 @@
# PR Draft: v0.1.91 Config Reliability
# PR Draft: v0.1.92 God-Class Refactor — Collaborator Extraction

## Summary
- Centralize `orchestrator.coordination` YAML parsing in `CoordinationConfig.from_dict()`.
- Centralize top-level `timeout_settings` parsing in `TimeoutConfig.from_dict()`.
- Centralize top-level orchestrator runtime application in `AgentConfig.apply_orchestrator_config()`.
- Keep `cli._parse_coordination_config()` as a compatibility wrapper.
- Keep `cli._parse_timeout_config()` and `cli._apply_orchestrator_runtime_params()` as compatibility wrappers.
- Warn on unknown coordination, orchestrator, and timeout keys so typos such as `fast_interation_mode`, `voting_sensitivty`, and `orchestrator_timout_seconds` surface during validation.
- Make strict config validation release-blocking for unknown config key warnings.
- Wire documented subagent timeout fields and planning controls through the centralized parser.
- Wire validated checklist runtime fields `max_checklist_calls_per_round` and `checklist_first_answer` through the centralized orchestrator runtime helper.
- Harden standalone native hook permission enforcement so nested read-only/protected paths override broader writable parent paths.
- Align Claude Code native hook injection tests/docs with the SDK-native `additionalContext` contract.

Refactor MassGen's two largest files into modular collaborators with **zero breaking changes**, gated by characterization tests written first. Establishes a repeatable extract-collaborators pattern for the remaining cleanup work.

- **`orchestrator.py`: 21,599 → 8,574 lines (−13,025, 60% reduction)** — 49 collaborator classes extracted into a new `massgen/orchestrator_collaborators/` package. The Orchestrator class keeps thin delegator methods so every existing call site (internal and external) works unchanged.
- **`textual_terminal_display.py`: 14,580 → 14,287** — 3 sibling display modules extracted (`_textual_terminal_capabilities.py`, `_textual_provider_model.py`, `_textual_widget_debug.py`).
- **Two characterization test files** (77 tests) pin the public contract + extraction seams so the refactor is provably safe.
- **Six tracked test files** updated to repoint monkeypatch / mock-stub seams to the new collaborator locations (no test deletions or assertion weakenings).
- **`orchestrator.py` stays a single `.py` module** (not a package) — preserves `Path(__file__).parent / 'skills'` resolution.
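The last bullet's reasoning can be checked with plain path arithmetic. The sketch below is illustrative only: the `/repo/...` paths and the `skills_dir` helper are assumptions, not names from the codebase. It shows why turning the module into a package would move the directory that `Path(__file__).parent / 'skills'` points at.

```python
from pathlib import PurePosixPath


def skills_dir(module_file: str) -> PurePosixPath:
    """Mirror the `Path(__file__).parent / 'skills'` expression (hypothetical helper)."""
    return PurePosixPath(module_file).parent / "skills"


# Layout kept by this PR: orchestrator is a single module file.
as_module = skills_dir("/repo/massgen/orchestrator.py")

# Hypothetical alternative: orchestrator converted to a package.
as_package = skills_dir("/repo/massgen/orchestrator/__init__.py")

print(as_module)   # /repo/massgen/skills
print(as_package)  # /repo/massgen/orchestrator/skills
```

With the package layout, `__file__` would be the package's `__init__.py`, so `.parent` lands one directory deeper and the skills lookup would miss.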

### Collaborators extracted (49, alphabetical)

`ActiveCoordinationCleanup`, `AgentOrchestrationSetup`, `AnswerLimitGate`, `AnswerTextNormalizer`, `BootstrapCriteriaEngine`, `BroadcastToolInitializer`, `ChangedocCoordinator`, `ChatFollowupHandler`, `ChecklistGateManager`, `CheckpointCoordinator`, `ContextPathWriteTracker`, `CriteriaEvolutionRunner`, `DockerDiagnostics`, `DspyParaphraseCoordinator`, `EnforcementBufferHelper`, `EssentialFilesHelper`, `EvaluationCriteriaGeneratorCollaborator`, `EvaluatorResultExtractor`, `FairnessGate`, `FinalPresentationRunner`, `FinalResultReporter`, `IsolatedChangeReviewer`, `MetricsReporter`, `MidStreamInjectionHookInstaller` (partial — 6 of 18 pure helpers), `NlipRoutingInitializer`, `OrchestratorTimeoutCalculator`, `PeerAnswerVisibilityTracker`, `PersonaInjector`, `PlanningToolInjector`, `PostEvaluationRunner`, `PreCollabHelpers`, `PreviousLogRestorer`, `PromptImproverCollaborator`, `QuestionIrreversibilityAnalyzer`, `RateLimitController`, `RoundEvaluatorGateConfig`, `RoundEvaluatorRunner`, `RoundStartContextQueue`, `RunModeStrategyResolver`, `RuntimeInputDelivery`, `SkillsConfigValidator`, `SnapshotManager`, `StepModeHandler`, `SubagentLifecycleCoordinator`, `SubagentToolInjector`, `ToolMessageHelpers`, `TraceAnalyzerRunner`, `WorkspaceLifecycleManager`, `WorkspaceModalPresenter`.

### Public contracts preserved

- `massgen.orchestrator`: `Orchestrator`, `AgentState`, `WORKFLOW_TOOL_NAMES`, `create_orchestrator` (`MassOrchestrator` intentionally absent — lives in `massgen/v1/orchestrator.py`).
- `massgen.frontend.displays.textual_terminal_display`: `TextualApp`, `AgentPanel`, `TextualTerminalDisplay`, `ProgressIndicator`, `tui_log`, `EMOJI_FALLBACKS`, `TerminalCapabilityProbe`, `_PrecollabSubagentState`.

### Implementation patterns established

- Collaborators exposed via `functools.cached_property` (not eager `__init__` attrs) so tests using `Orchestrator.__new__(Orchestrator)` still resolve them lazily.
- Collaborators hold an orchestrator back-reference for shared mutable state; pure helpers are `@staticmethod`.
- All shared state (`_pre_populated_workspaces`, `workflow_tools`, `_bootstrap_criteria_accumulator`, `_subagent_launch_watcher`, `_planning_injection_dirs`, etc.) is mutated via the orchestrator back-ref to preserve single ownership.
- Cross-collaborator method calls route through the orchestrator delegator (e.g. `self._orchestrator._check_fairness_answer_lead_cap(...)`) so monkeypatches at the orchestrator level keep working.
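A minimal sketch of how these patterns might fit together. All class names prefixed with `Sketch`, the `answer_counts` and `lead_cap` attributes, and the method bodies are assumptions made for illustration; only `cached_property`, the back-reference idea, and the `_check_fairness_answer_lead_cap` delegator name come from the list above.

```python
from functools import cached_property


class SketchFairnessGate:
    """Illustrative collaborator; the real FairnessGate internals are not shown here."""

    def __init__(self, orchestrator):
        # Back-reference: shared mutable state stays owned by the orchestrator.
        self._orchestrator = orchestrator

    def check_answer_lead_cap(self, agent_id: str) -> bool:
        orch = self._orchestrator
        return orch.answer_counts.get(agent_id, 0) < orch.lead_cap


class SketchAnswerLimitGate:
    """Second illustrative collaborator that needs the fairness check."""

    def __init__(self, orchestrator):
        self._orchestrator = orchestrator

    def may_answer(self, agent_id: str) -> bool:
        # Cross-collaborator call routed through the orchestrator delegator,
        # so a patch applied at the orchestrator level is still honoured.
        return self._orchestrator._check_fairness_answer_lead_cap(agent_id)


class SketchOrchestrator:
    def __init__(self):
        self.answer_counts = {}
        self.lead_cap = 2

    @cached_property
    def _fairness_gate(self) -> SketchFairnessGate:
        # Built on first access, so it does not depend on __init__ having run.
        return SketchFairnessGate(self)

    @cached_property
    def _answer_limit_gate(self) -> SketchAnswerLimitGate:
        return SketchAnswerLimitGate(self)

    def _check_fairness_answer_lead_cap(self, agent_id: str) -> bool:
        # Thin delegator: existing call sites keep calling the orchestrator.
        return self._fairness_gate.check_answer_lead_cap(agent_id)


# Test-style construction that skips __init__ entirely.
bare = SketchOrchestrator.__new__(SketchOrchestrator)
bare.answer_counts = {"agent_a": 3}
bare.lead_cap = 2
blocked = bare._answer_limit_gate.may_answer("agent_a")

# Patching the delegator on the orchestrator changes what collaborators see.
bare._check_fairness_answer_lead_cap = lambda agent_id: True
patched = bare._answer_limit_gate.may_answer("agent_a")
```

Here `blocked` is `False` (3 answers is not under the cap of 2) and `patched` is `True`. Had the collaborators been assigned eagerly in `__init__`, the `__new__`-constructed instance would raise `AttributeError` on first use instead.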

### Dev-note

Full roadmap, lessons-learned, and the 6 remaining (high-risk, paused) steps are documented in `docs/dev_notes/orchestrator_refactor_roadmap.md`.

## Issues

- Linear: TBD
- GitHub: TBD

## Tests
- `uv run pytest massgen/tests/test_coordination_config_wiring.py massgen/tests/test_config_validator.py massgen/tests/test_validate_all_configs_script.py massgen/tests/test_webui_config_parity.py massgen/tests/test_standalone_checkpoint_config.py -q --tb=short -ra --color=no`
- `uv run pytest massgen/tests/test_config_wiring_refactors.py massgen/tests/test_coordination_config_wiring.py massgen/tests/test_config_validator.py massgen/tests/test_validate_all_configs_script.py massgen/tests/test_webui_config_parity.py massgen/tests/test_standalone_checkpoint_config.py massgen/tests/test_decomposition_bugfixes.py -q --tb=short -ra --color=no`
- `uv run pytest massgen/tests/test_native_hook_adapters.py massgen/tests/test_gemini_cli_hook_script.py massgen/tests/test_codex_hook_script.py massgen/tests/test_gemini_cli_hook_ipc.py massgen/tests/test_codex_hook_ipc.py massgen/tests/test_codex_native_hook_adapter.py -q --tb=short -ra --color=no`
- `uv run pytest massgen/tests/test_api_params_exclusion.py -q --tb=short -ra --color=no`
- `uv run pytest massgen/tests/test_fast_mode.py massgen/tests/test_prompt_improver.py massgen/tests/test_checklist_criteria_presets.py massgen/tests/test_round_evaluator_loop.py massgen/tests/test_novelty_injection.py massgen/tests/test_evolving_criteria.py massgen/tests/test_execution_trace_analyzer.py massgen/tests/test_auto_trace_analysis.py massgen/tests/test_coordination_improvements_config.py massgen/tests/test_config_changedoc.py massgen/tests/test_web_review.py -q --tb=short -ra --color=no`
- `uv run python scripts/validate_all_configs.py --strict`
- `uv run python -m py_compile massgen/agent_config.py massgen/cli.py massgen/config_validator.py massgen/mcp_tools/native_hook_adapters/gemini_cli_hook_script.py massgen/mcp_tools/native_hook_adapters/codex_hook_script.py massgen/tests/test_config_wiring_refactors.py massgen/tests/test_coordination_config_wiring.py massgen/tests/test_validate_all_configs_script.py massgen/tests/test_native_hook_adapters.py massgen/tests/test_gemini_cli_hook_script.py massgen/tests/test_codex_hook_script.py`

## Configs Validated
- `scripts/validate_all_configs.py --strict` validated 281 configs under `massgen/configs`.
- New characterization tests (must pass before any extraction can ship):
  - `uv run pytest massgen/tests/test_orchestrator_characterization.py -q` (37 tests)
  - `uv run pytest massgen/tests/frontend/test_textual_terminal_display_characterization.py -q` (40 cases)
- Suites that exercise extracted collaborators (regression coverage):
  - `uv run pytest massgen/tests/test_vote_only_mode.py massgen/tests/test_decomposition_mode.py -q` (load-bearing `_is_vote_only_mode` side effect)
  - `uv run pytest massgen/tests/test_orchestrator_skills_injection.py massgen/tests/test_essential_files_manifest.py massgen/tests/test_evaluator_personas.py -q` (MagicMock-stub fixtures rewired to real collaborators)
  - `uv run pytest massgen/tests/integration/test_orchestrator_hooks_broadcast_subagents.py massgen/tests/integration/test_orchestrator_restart_and_external_tools.py massgen/tests/test_auto_trace_analysis.py massgen/tests/unit/test_orchestrator_unit.py -q` (monkeypatch seams repointed to collaborator)
- Full fast non-API lane:
  - `uv run pytest massgen/tests/ -q --tb=no -ra -p no:cacheprovider -m "not live_api and not docker and not expensive and not integration" --ignore=massgen/tests/frontend/test_launch_run_card.py --ignore=massgen/tests/test_interactive_system_prompt.py`
- Lint: `uv run ruff check massgen/orchestrator.py massgen/orchestrator_collaborators/ massgen/frontend/displays/textual_terminal_display.py massgen/frontend/displays/_textual_*.py` — all checks passed.

## Verification

- Public-method set of `Orchestrator` is byte-identical to HEAD (no methods silently dropped or renamed).
- `MassOrchestrator` correctly absent from `massgen.orchestrator` (asserted in characterization tests).
- Built-in skills directory still resolves to `<repo>/massgen/skills` (`SkillsConfigValidator` anchors via `massgen.orchestrator.__file__`, NOT the collaborator's `__file__`).
- Working tree is the only artifact — no commits.
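One way the first check in this list could be expressed is a name-level comparison of public methods before and after a refactor. This is a sketch under assumptions: `BeforeRefactor`, `AfterRefactor`, and `_ChatHelper` are toy classes invented for illustration, and comparing names is weaker than the byte-identical comparison the PR describes.

```python
import inspect


def public_method_names(cls) -> frozenset:
    """Collect the names of callable attributes that do not start with an underscore."""
    return frozenset(
        name
        for name, _member in inspect.getmembers(cls, callable)
        if not name.startswith("_")
    )


class BeforeRefactor:
    """Toy stand-in for a class prior to extraction."""

    def chat(self, message):
        return f"echo:{message}"

    def reset(self):
        return None

    def _private_helper(self):
        return None


class _ChatHelper:
    """Toy extracted collaborator."""

    def chat(self, message):
        return f"echo:{message}"


class AfterRefactor:
    """Toy stand-in after extraction: same public surface, logic moved out."""

    def chat(self, message):
        return _ChatHelper().chat(message)

    def reset(self):
        return None


before = public_method_names(BeforeRefactor)
after = public_method_names(AfterRefactor)
dropped = before - after  # methods silently removed or renamed
added = after - before    # methods that appeared unexpectedly
```

Both `dropped` and `added` come out empty for the toy classes, which is the property a characterization test would assert.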

## Known Test Lane Note
- Full fast non-API lane currently stops during collection on unrelated in-flight tests:
  - `massgen/tests/frontend/test_launch_run_card.py` imports missing `massgen.frontend.displays.textual_widgets.launch_run_card`
  - `massgen/tests/test_interactive_system_prompt.py` imports missing `InteractiveOrchestratorSection`

- Independent review (senior-code-reviewer) ran the fast non-API lane against HEAD vs this branch:
  - **HEAD baseline: 60 failures** (all unrelated to this PR; pre-existing on `dev/v0.1.92` from in-flight WIP)
  - **This branch: 59 failures**
  - Net impact: **0 regressions introduced, +1 test newly passing** (`test_lazy_collaborator_accessors_do_not_require_init` fails on HEAD because the cached-property pattern doesn't exist there; it passes here)
- The 60 baseline failures are largely from untracked WIP test files that this PR does not touch (e.g. `test_subagent_round_timeouts.py`, `test_interactive_thread_style.py`, `test_subagent_continuation.py`, `test_broadcast_integration.py`, `test_interactive_mode.py`, plus the long-standing `test_timeline_snapshot_scaffold.py` snapshot mismatch).
- The same untracked WIP test-file collection errors as v0.1.91 still apply (`test_launch_run_card.py` missing `launch_run_card` module; `test_interactive_system_prompt.py` missing `InteractiveOrchestratorSection`).
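The "0 regressions, +1 newly passing" conclusion follows from comparing the two sets of failing test IDs rather than just the counts. A minimal sketch, assuming the failing IDs from each run have already been collected into sets; the `compare_failures` helper and the shortened ID lists are hypothetical, with only the named test taken from the notes above.

```python
def compare_failures(baseline: set, branch: set) -> dict:
    """Split the difference between two failure sets into regressions and fixes."""
    return {
        "regressions": branch - baseline,    # fail on the branch but not on HEAD
        "newly_passing": baseline - branch,  # fail on HEAD but not on the branch
    }


# Shortened, illustrative failure sets (the real runs had 60 and 59 entries).
head_failures = {
    "test_timeline_snapshot_scaffold",
    "test_interactive_mode",
    "test_lazy_collaborator_accessors_do_not_require_init",
}
branch_failures = {
    "test_timeline_snapshot_scaffold",
    "test_interactive_mode",
}

report = compare_failures(head_failures, branch_failures)
```

A count drop from 60 to 59 alone could hide one regression offset by two fixes; the set difference rules that out because `report["regressions"]` is empty.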

## Out of scope (paused for follow-up release)

One remaining concern needs behavior-changing work before extraction:

- **`MidStreamInjectionHookInstaller` (remaining 12 of 18 methods)** — `_setup_hook_manager_for_agent`, `_setup_codex_mcp_hooks`, `_setup_codex_hybrid_hooks`, `_setup_native_hooks_for_agent`, `_register_round_timeout_hooks`, etc. These contain duplicated `get_injection_content` closures across 3 backend paths. Need a callback-unification pass first (behavior-changing, separate validation surface), THEN extract. The 6 pure helpers (`_close_agent_stream`, `_check_restart_pending`, `_should_defer_restart_for_first_answer`, `_clear_framework_mcp_state`, `_compute_plan_progress_stats`, `_build_tool_result_injection`) ARE extracted in this PR.

Also out of scope: the streaming/coordination cores (`_stream_agent_execution` 2,239 lines, `_stream_coordination_with_agents` 911, `_coordinate_agents` 541, `__init__` ~557, `chat` 180) — these need structural restructuring rather than pure extraction, deferred to a separate PR.

The full plan, ordered steps, and applied-during-this-PR lessons learned are in `docs/dev_notes/orchestrator_refactor_roadmap.md` to make the follow-up straightforward.
File renamed without changes.