feat: v0.1.92#1107
Conversation
|
Note Reviews pausedUse the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughBumps package version to 0.1.92, adds PR draft and roadmap, extracts Textual display helpers, moves many orchestrator responsibilities into a new collaborators package, and updates/adds tests to follow the new collaborator wiring. ChangesOrchestrator Refactor & Frontend Extraction
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
Suggested reviewers
✨ Finishing Touches🧪 Generate unit tests (beta)
|
…onfig Adds Parallel Web Systems' hosted Search MCP server alongside the existing Context7, Brave, and Exa entries. - massgen/mcp_tools/server_registry.py: new `parallel_search` entry using `type: streamable-http` and `url: https://search.parallel.ai/mcp`. The server accepts unauthenticated requests by default, so the entry is registered with `requires_api_key: False` and an `optional_api_key_env_var: PARALLEL_API_KEY` for users who want higher rate limits via the `Authorization: Bearer ...` header. - massgen/configs/tools/web-search/parallel_search_example.yaml: full example config modelled on `exa_search_example.yaml`, with a system message that explains Parallel's `objective + search_queries` pattern and `web_search`/`web_fetch` tools, plus a commented-out `headers` block for opting into the API key. - Module docstring updated to list `parallel_search`. Docs: - https://docs.parallel.ai/integrations/mcp/search-mcp - https://docs.parallel.ai/api-reference/search/search - https://docs.parallel.ai/search/best-practices Tested with claude-sonnet-4-6 backend via the new example config.
Addresses CodeRabbit findings (Major) on PR #1108: - server_registry.py: registry description no longer encodes tool invocation shape (web_search(...) / web_fetch(URL ->...)). Reworded as a natural-language description of what the server provides; the actual call schemas come from the MCP server itself. - parallel_search_example.yaml: system message dropped the web_search(objective, search_queries[, ...]) and web_fetch(urls[, objective]) headers, and now leans on the MCP-supplied schemas for argument names. Usage guidance (objective + 2-3 keyword queries, mode preset, citation discipline) is preserved in plain prose. Tools remain referenced by their conceptual function ("web search", "URL extraction") rather than by hardcoded call signatures, so the prompt stays accurate if the underlying tool schema evolves.
…p servers
Codex review identified that the new parallel_search entry advertises
PARALLEL_API_KEY support but get_server_config() only special-cased
context7's --api-key CLI flag. As written, parallel_search would always
connect anonymously even when PARALLEL_API_KEY was set, defeating the
documented higher-rate-limit path.
Generalize get_server_config() so that when a registry entry declares
optional_api_key_env_var and the env var is set:
- stdio context7 keeps its existing --api-key flag injection (unchanged
behavior; explicit branch on server_name)
- streamable-http entries get a default Authorization: Bearer ${key}
header (using setdefault so a registry-declared header takes
precedence)
- everything else is unchanged
Manually verified:
- PARALLEL_API_KEY unset -> no headers added to parallel_search config
- PARALLEL_API_KEY set -> headers.Authorization = "Bearer <key>"
- apply_api_key_logic=False -> no injection
- context7 still injects --api-key when CONTEXT7_API_KEY is set
Codex review on the prior commit flagged a P1 behavioral regression:
marking parallel_search as `requires_api_key: False` put it in the
always-on auto-discovery bucket alongside context7, so every config
with `auto_discover_custom_tools: true` (e.g. subagent_checklist.yaml,
log_analysis.yaml, many others) silently gained an outbound web-search
tool — breaking deterministic/offline assumptions of existing workflows.
Realign with the Brave/Exa convention:
- `requires_api_key: True`, `api_key_env_var: "PARALLEL_API_KEY"`
- Add a baked `headers: {"Authorization": "Bearer ${PARALLEL_API_KEY}"}`
block; the MCP transport's existing env-var substitution unpacks it at
connection time (mirrors how stdio servers consume `env:` like
`BRAVE_API_KEY: "${BRAVE_API_KEY}"`).
- Revert the temporary streamable-http bearer-injection branch added
to `get_server_config()` last commit — no longer needed and made the
registry entry's `optional_api_key_env_var` look like it was opt-in
when it actually gated nothing.
- Description/notes reworded: anonymous use is still possible by adding
the server manually to mcp_servers without the headers block (the
example YAML continues to show that path).
Manually verified:
- PARALLEL_API_KEY unset -> auto-discovery is just [context7]
(no regression vs. main)
- PARALLEL_API_KEY set -> auto-discovery is [context7, parallel_search]
with the headers template ready for MCP substitution
- Missing-key reporting now lists parallel_search alongside
brave_search and exa_search when their env vars are unset
… compat Codex review caught that the example was wired to backend.type: 'claude_code' but used the streamable-http 'headers:' field. ClaudeCodeBackend forwards MCP configs to ClaudeAgentOptions, where the equivalent auth field is a top-level 'authorization:' string (per massgen/backend/docs/MCP_IMPLEMENTATION_CLAUDE_BACKEND.md), not a nested 'headers' map — so users uncommenting the API-key block would have silently kept anonymous mode. Switch the example to backend.type: 'claude' (generic massgen MCP transport), matching the existing streamable_http_test configs whose 'headers:' shape lines up with the registry entry. Added an inline note pointing claude_code users at the 'authorization:' alternative.
…display Splits two god-class files into composable collaborators with zero breaking changes, gated by characterization tests written first. - massgen/orchestrator.py: 21,599 → 10,741 lines (−10,858, 50% reduction) - 35 collaborator modules in massgen/orchestrator_collaborators/ - massgen/frontend/displays/textual_terminal_display.py: 14,580 → 14,287 - 3 sibling display modules (_textual_*.py) The Orchestrator class keeps thin delegator methods so every existing internal/external call site works unchanged. All public contracts preserved (Orchestrator, AgentState, WORKFLOW_TOOL_NAMES, create_orchestrator, and all display public names). orchestrator.py stays a single .py module (not a package) to preserve Path(__file__)-based skill resolution. Two characterization test files pin the public contract + extraction seams. Six tracked test files updated with seam-rewiring (mock-stub or monkeypatch repointing) — no assertion weakening. Full plan + lessons learned in docs/dev_notes/orchestrator_refactor_roadmap.md. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…nalyzer Two more cohesive clusters extracted from orchestrator.py: - EvaluatorResultExtractor: 4 PUBLIC @staticmethod methods (extract_all_evaluator_answers, extract_evaluator_workspace_paths, format_multi_evaluator_result_block, build_task_plan_from_evaluator_verdict) - QuestionIrreversibilityAnalyzer: 1 async method (224 lines) orchestrator.py: 10,381 → 9,966 lines (37 total collaborators, 54% reduction from the original 21,599-line god class). Both collaborators preserve all delegator signatures (including @staticmethod for unbound-call tests in test_round_evaluator_verdict.py). Tests: test_round_evaluator_verdict.py (58/58), test_intelligent_planning_mode.py (14/14), test_orchestrator_characterization.py (37/37) all green. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
3 methods: _generate_and_inject_evaluation_criteria (async 101 lines), _save_evaluation_criteria_to_log, get_generated_evaluation_criteria (PUBLIC, called by cli.py). orchestrator.py: 9,966 → 9,860 lines (38 collaborators, 54% cumulative reduction from 21,599). Class named EvaluationCriteriaGeneratorCollaborator to avoid clash with the existing massgen.evaluation_criteria_generator module that it wraps. State mutations route through orchestrator back-ref so ChecklistGateManager and CriteriaEvolutionRunner (both already extracted) see live state. get_log_session_dir resolved lazily via massgen.orchestrator namespace for test patchability. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2 cohesive methods: _load_rate_limits_from_config (init-time), _apply_agent_startup_rate_limit (async, per-agent cooldown enforcement). orchestrator.py: 9,860 → 9,698 lines (39 collaborators, 55% cumulative reduction from 21,599). _rate_limits + _agent_startup_times state remains on Orchestrator and is read/mutated via back-ref so monkeypatching still works. Tests: characterization (37/37), test_rate_limiter (all green). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
3 methods: _is_changedoc_enabled (heavily monkeypatched by 10+ tests via instance-attr assignment — kept as thin delegator), _gather_agent_changedocs, _sync_applied_context_files_into_final_artifacts (110-line post-presentation file-overlay routine). orchestrator.py: 9,698 → 9,599 lines (40 collaborators, 56% cumulative reduction from 21,599). gather_agent_changedocs routes through self._orchestrator._is_changedoc_enabled() so the test monkeypatch idiom (orch._is_changedoc_enabled = lambda: ...) keeps shadowing at the instance level. get_log_session_dir resolved lazily via massgen.orchestrator namespace. Tests: characterization + answer_path_normalization + bootstrap_criteria + discriminative_pruning + checklist_label_refresh_ordering (105 total green). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Wraps the pre-collab prompt-improvement subagent flow (_improve_and_inject_prompt, async, 77 lines). orchestrator.py: 9,599 → 9,531 lines (41 collaborators, 56% cumulative reduction from 21,599). Class name -Collaborator suffixed to avoid clash with massgen.prompt_improver.PromptImprover that it wraps. _prompt_improved and current_task mutations route through orchestrator back-ref. Tests: characterization + test_prompt_improver + test_orchestrator_timeout_selection (53 total). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
_merge_agent_memories_to_winner (95 lines, post-coordination memory consolidation into the winning agent's workspace) folded into the existing WorkspaceLifecycleManager collaborator since it naturally fits there alongside clear_agent_workspaces / archive_agent_memories / namespace_verification_memory_files. orchestrator.py: 9,531 → 9,440 lines (still 41 collaborators, 56% cumulative reduction from 21,599). Method stays on Orchestrator as a thin delegator. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
5 small cohesive helpers used by 6 existing collaborators (EvaluationCriteriaGenerator, PersonaInjector, PromptImproverCollaborator, CriteriaEvolutionRunner, RoundEvaluatorRunner, BootstrapCriteriaEngine): - _build_parent_agent_configs - _get_parent_workspace - _get_log_directory (@staticmethod) - _make_precollab_started_callback - _notify_precollab_completed (monkeypatched in test_evolving_criteria.py) orchestrator.py: 9,440 → 9,385 lines (42 collaborators, 57% cumulative reduction from 21,599). All 5 methods kept as thin delegators on Orchestrator preserving signatures including @staticmethod. Lazy namespace lookup via `from massgen import orchestrator as _orch_mod` preserves test patches at the massgen.orchestrator namespace. Tests: characterization (37) all green. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…alyzer _format_planning_mode_ui (81 lines, pure formatter — no self references) folded into the existing QuestionIrreversibilityAnalyzer collaborator as a @staticmethod since QIA was already its only caller. Method stays on Orchestrator as a thin delegator. orchestrator.py: 9,385 → 9,316 lines (still 42 collaborators, 57% cumulative). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Reflects current state after all follow-up extractions: PostEvaluationRunner, FinalPresentationRunner, PeerAnswerVisibilityTracker, ChecklistGateManager, MidStreamInjectionHookInstaller (partial), TraceAnalyzerRunner, CriteriaEvolutionRunner, SnapshotManager, RoundEvaluatorRunner, MetricsReporter, PersonaInjector, PreviousLogRestorer, EvaluatorResultExtractor, QuestionIrreversibilityAnalyzer, EvaluationCriteriaGeneratorCollaborator, RateLimitController, ChangedocCoordinator, PromptImproverCollaborator, PreCollabHelpers, and folding memory-merger + format_planning_mode_ui into existing collaborators. orchestrator.py: 21,599 → 9,319 lines (57% reduction). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…plan) Promoted the nested _setup_agent_orchestration function inside __init__ to a regular method on the new AgentOrchestrationSetup collaborator. Closure vars (skills_directory, massgen_skills, load_previous_session_skills) become explicit method args. The two call sites in __init__ (parallel ThreadPoolExecutor + sequential fallback) call through a tiny local wrapper that forwards to self._agent_orchestration_setup.setup_agent_orchestration(). Also folded ensure_workspace_symlinks (PUBLIC, called externally) into the same collaborator. Both kept as thin delegators on Orchestrator. orchestrator.py: 9,319 → 9,228 lines (43 collaborators, 57% cumulative reduction from 21,599). This was the last paused step from the original 29-step plan; only the MidStreamInjectionHookInstaller hook-install methods (12 remaining, behavior-changing) and the streaming/coordination cores remain unrefactored. Tests: characterization + test_orchestrator_skills_injection all green. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…jector _rewrite_subagent_mcp_config_files (56 lines, round-2+ counterpart to _create_subagent_mcp_config which is already on SubagentToolInjector) folded into the existing collaborator. Method kept on Orchestrator as a thin delegator. orchestrator.py: 9,228 → 9,180 lines (still 43 collaborators). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…nRunner 53-line @staticmethod (criteria-evolution result parser, prefer workspace file then fall back to fenced JSON in answer text) folded into the existing CriteriaEvolutionRunner collaborator that was already its only caller. Method stays on Orchestrator as a thin @staticmethod delegator. orchestrator.py: 9,180 → 9,132 lines. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Any is already imported at module top; the redundant local import from the rewrite_subagent_mcp_config_files fold-in tripped ruff F401. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds AgentOrchestrationSetup to the collaborator list (the previously-paused inline-in-__init__ extraction) and reflects post-fold-in line count. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2 stateless Docker-failure diagnostic helpers (~100 lines): - save_docker_logs_on_mcp_failure (capture container state + logs to session log dir when a Docker MCP server disconnects) - get_docker_health (read container health metrics for reliability tracing) Both tolerant of non-Docker mode (no-op when filesystem_manager.docker_manager is absent). Lazy get_log_session_dir lookup via massgen.orchestrator namespace for test patchability. orchestrator.py: 9,133 → 9,057 lines (44 collaborators, 58% cumulative reduction from 21,599). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
4 cohesive step-mode methods (~160 lines): - finalize_step_mode (PUBLIC, called by cli.py + tested unbound) - _continuous_status_updates (async) - _resolve_step_mode_workspace (heavily tested directly) - _resolve_step_mode_stale_paths orchestrator.py: 9,057 → 8,923 lines (45 collaborators, 58.7% cumulative reduction from 21,599). finalize_step_mode is @staticmethod on the collaborator; Orchestrator delegator invokes as StepModeHandler.finalize_step_mode(self, log_dir) to preserve the Orchestrator.finalize_step_mode(MagicMock(), log_dir) unbound-call test pattern. shutil routed via massgen.orchestrator namespace for test patchability (test_answer_path_normalization patches massgen.orchestrator.shutil). Tests: characterization + test_step_mode + test_answer_path_normalization green. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2 chat-flow helpers used by the public chat() entry point: - handle_followup (async generator, follow-up response with planning-mode switching) - build_conversation_context (@staticmethod, message-list → context dict) orchestrator.py: 8,923 → 8,847 lines (46 collaborators, 59% cumulative reduction from 21,599). Lazy lookups via massgen.orchestrator namespace for log_orchestrator_activity / log_stream_chunk preserve test patches. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds StepModeHandler, DockerDiagnostics, ChatFollowupHandler to the collaborator list. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
3 cohesive tool-message helpers for the coordination/streaming loop (~116 lines): - is_tool_related_content (@staticmethod, content-classification guard) - create_tool_error_messages (@staticmethod, per-tool-call error builder) - split_disallowed_workflow_tool_calls (workflow-tool partitioner) orchestrator.py: 8,847 → 8,759 lines (47 collaborators, 59.4% cumulative reduction from 21,599). split_disallowed lazy-imports WORKFLOW_TOOL_NAMES via massgen.orchestrator namespace so tests patching it still fire. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
_seed_plan_execution_workspaces (59 lines, plan/spec artifact seeding) and _copy_workspace_contents (33 lines, @staticmethod, top-level contents copy) folded into the existing WorkspaceLifecycleManager. Both stay on Orchestrator as thin delegators. orchestrator.py: 8,759 → 8,682 lines (still 47 collaborators, 59.8% cumulative reduction from 21,599). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@staticmethod helper (36 lines) that partitions a combined spawn result into evaluator vs trace halves — folded into TraceAnalyzerRunner since the splitting is specifically about extracting the trace half. Stays on Orchestrator as @staticmethod delegator so Orchestrator._split_combined_spawn_result(...) unbound test calls in test_orchestrator_unit.py keep working. orchestrator.py: 8,682 → 8,655 lines. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
_load_essential_files_manifests (35 lines, reads per-agent essential_files_manifest.json from snapshot storage) → new EssentialFilesHelper collaborator. Method kept on Orchestrator as a thin delegator. Test stubs in test_essential_files_manifest.py updated to wire a real EssentialFilesHelper into the MagicMock orchestrator so the bound method routes to real logic (same pattern as the WorkspaceLifecycleManager fix earlier in this session). orchestrator.py: 8,655 → 8,633 lines (48 collaborators, 60% cumulative reduction from 21,599). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
A prior fold-in (split_combined_spawn_result extraction) inadvertently ate the @staticmethod decorator from the immediately-following _format_trace_analyzer_for_memory_static method. The docstring still claimed @staticmethod but the decorator was missing, so unbound Orchestrator._format_trace_analyzer_for_memory_static(arg1, arg2) calls in test_auto_trace_analysis.py failed with "takes 2 positional arguments but 3 were given". Restored the decorator. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds ToolMessageHelpers + EssentialFilesHelper to the collaborator list. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
3 methods only called by CriteriaEvolutionRunner via orch back-ref folded into the collaborator they belong to: - _collect_evolution_input_data (22 lines, trace + history aggregation) - _format_score_history_table (19 lines, @staticmethod) - _format_criteria_for_prompt (15 lines, @staticmethod) All 3 stay on Orchestrator as thin delegators (@staticmethod preserved for the latter two so any unbound-call tests keep working). orchestrator.py: 8,634 → 8,594 lines (still 48 collaborators, 60.2% cumulative reduction from 21,599). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2 streaming-buffer helpers for the enforcement-retry path: - get_buffer_content (@staticmethod, capture preview + total chars) - truncate_enforcement_buffer_content (bound retry-buffer length) orchestrator.py: 8,596 → 8,574 lines (49 collaborators, 60.3% cumulative reduction from 21,599). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds EnforcementBufferHelper. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
AgentOrchestrationSetup was actually extracted in this PR; only the MidStreamInjectionHookInstaller hook-install methods and the streaming/ coordination cores remain out of scope. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
17-line gate (auto_trace_analysis config + round-2+ + no-in-flight-task) folded into the existing TraceAnalyzerRunner. Stays on Orchestrator as thin delegator. orchestrator.py: 8,574 → 8,561 lines (still 49 collaborators). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
With `from __future__ import annotations` the dict[str, Path | None] annotation doesn't need a runtime import. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 2
Note
Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
massgen/orchestrator_collaborators/final_result_reporter.py (1)
37-417: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winUse Google-style docstrings for the new result/reporting methods.
This module adds several externally consumed helpers, but the new methods still use summary-only docstrings instead of the required
Args/Returnssections.As per coding guidelines, "
**/*.py: For new or changed functions, include Google-style docstrings"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/orchestrator_collaborators/final_result_reporter.py` around lines 37 - 417, Update the summary-only docstrings in the new reporting methods to Google-style docstrings including Args and Returns (and Raises where applicable); specifically edit resolve_final_workspace_path, get_vote_results, determine_final_agent_from_states, get_final_result, get_partial_result, ensure_final_directory_on_shutdown, get_all_agent_workspaces, get_coordination_result, and get_status to include parameter descriptions (types and meaning), what the function returns (type and structure), and any exceptions or side effects (e.g., file operations or logging) so they comply with the repository's Google-style docstring requirement.
🟠 Major comments (19)
massgen/orchestrator_collaborators/orchestrator_timeout_calculator.py-37-40 (1)
37-40:⚠️ Potential issue | 🟠 Major | ⚡ Quick winHandle unknown
agent_idbefore indexingagent_states.This path currently raises
KeyErrorfor a stale or already-cleaned-up agent, which turns a timeout check into a round-breaking exception.get_agent_timeout_state()already uses a safe lookup; this helper should do the same.Suggested fix
orchestrator = self._orchestrator timeout_config = orchestrator.config.timeout_config - round_start = orchestrator.agent_states[agent_id].round_start_time + state = orchestrator.agent_states.get(agent_id) + if state is None: + return False + round_start = state.round_start_time🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/orchestrator_collaborators/orchestrator_timeout_calculator.py` around lines 37 - 40, The code directly indexes orchestrator.agent_states[agent_id] which can raise KeyError for stale/cleaned agents; change the lookup to a safe retrieval (mirroring get_agent_timeout_state()) by using orchestrator.agent_states.get(agent_id) (or the existing safe helper) and bail out or return a default timeout when missing before accessing round_start, so the timeout check does not raise an exception.massgen/orchestrator_collaborators/bootstrap_criteria_engine.py-137-147 (1)
137-147:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftDon't cache the discriminator signature before the run outcome is known.
seen.add(content_signature)happens beforerun_bootstrap_discriminator_step()returns, but that callee returns0for both legitimate “nothing new to add” runs and outright failures/import errors/spawn failures. A transient failure therefore marks the answer set as completed and suppresses every future retry for the same content.Please make the callee return an explicit completion status and only record the signature after a real completed discriminator pass.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/orchestrator_collaborators/bootstrap_criteria_engine.py` around lines 137 - 147, The code currently adds content_signature to orch._bootstrap_discriminator_completed_signatures before knowing the outcome of run_bootstrap_discriminator_step(), causing transient failures to be treated as completed; change the flow so run_bootstrap_discriminator_step() returns an explicit completion status (e.g., boolean or enum) and only call seen.add(content_signature) when that status indicates a successful/real completed discriminator pass; update the check here (using content_signature, orch._bootstrap_discriminator_completed_signatures, and run_bootstrap_discriminator_step) to add the signature on success and leave it unadded on failure, and adjust any other callers of run_bootstrap_discriminator_step to handle the new explicit return value.massgen/orchestrator_collaborators/round_evaluator_runner.py-644-654 (1)
644-654:⚠️ Potential issue | 🟠 Major | ⚡ Quick winEmit the display completion status after salvage/degraded handling, not before.
This notification is finalized off the raw
successbit, but the method can still salvage a failed spawn from recovered artifacts or convert a timeout into the degraded-success path. In those cases the UI is toldfailedhere and never corrected, even though orchestration continues with a usable evaluator outcome.Move the completion notification to the final exit paths so it reflects the derived status (
completed,degraded, or terminal failure) exactly once.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/orchestrator_collaborators/round_evaluator_runner.py` around lines 644 - 654, The current code calls orch.coordination_ui.display.notify_runtime_subagent_completed early using the raw success flag (variables: display, orch.coordination_ui, notify_runtime_subagent_completed, parent_agent_id, upcoming_round, tool_call_id, success), which can misreport status if later salvage/degraded logic converts failures into usable outcomes; move this notification out of the early path and invoke it exactly once at the final exit points after all salvage/degraded handling finishes, passing the derived status string ("completed", "degraded", or "failed") so the UI reflects the final outcome.massgen/orchestrator_collaborators/pre_collab_helpers.py-38-40 (1)
38-40:⚠️ Potential issue | 🟠 Major | ⚡ Quick winHandle non-dict backend configs before calling
.items().This helper now assumes
agent.backend.configis always a mapping, but other extracted code in this PR still treats backend config as potentially object-shaped. On those agents,.items()raises and breaks every pre-collab flow that calls_build_parent_agent_configs().Suggested fix
+from collections.abc import Mapping import tempfile from typing import TYPE_CHECKING, Any @@ for agent_id, agent in self._orchestrator.agents.items(): agent_cfg: dict[str, Any] = {"id": agent_id} - if hasattr(agent, "backend") and hasattr(agent.backend, "config"): - backend_cfg = {k: v for k, v in agent.backend.config.items() if k not in ("mcp_servers", "_config_path")} + cfg = getattr(getattr(agent, "backend", None), "config", None) + if isinstance(cfg, Mapping): + backend_cfg = { + k: v + for k, v in cfg.items() + if k not in ("mcp_servers", "_config_path") + } + agent_cfg["backend"] = backend_cfg + elif cfg is not None: + backend_cfg = { + k: v + for k, v in vars(cfg).items() + if k not in ("mcp_servers", "_config_path") + } agent_cfg["backend"] = backend_cfg configs.append(agent_cfg)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/orchestrator_collaborators/pre_collab_helpers.py` around lines 38 - 40, The code in _build_parent_agent_configs assumes agent.backend.config is a mapping and calls .items(), which breaks for object-shaped configs; update the branch that builds backend_cfg to first test the type of agent.backend.config (e.g., isinstance(agent.backend.config, Mapping) from collections.abc) and only call .items() for mappings, otherwise fallback to a safe conversion such as vars(agent.backend.config) or {k: getattr(agent.backend.config, k) for k in dir(agent.backend.config) if not k.startswith("_")} to produce a dict; ensure this logic is applied where agent_cfg["backend"] is set so non-dict backend.configs no longer raise.massgen/orchestrator_collaborators/active_coordination_cleanup.py-49-60 (1)
49-60:⚠️ Potential issue | 🟠 Major | ⚡ Quick winSnapshot
_active_tasksbefore awaiting cancelled tasks.This loop iterates the live
_active_tasksdict and thenawaits inside the iteration. Cancellation callbacks can mutate_active_taskswhile control is yielded, which can raiseRuntimeError: dictionary changed size during iterationand abort the cleanup path.Suggested fix
- for agent_id, task in orch._active_tasks.items(): + for agent_id, task in list(orch._active_tasks.items()): if not task.done(): if not orch.is_orchestrator_timeout: orch.coordination_tracker.track_agent_action(🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/orchestrator_collaborators/active_coordination_cleanup.py` around lines 49 - 60, The loop mutates orch._active_tasks while awaiting cancelled tasks, causing a RuntimeError; fix by snapshotting the items first (e.g., items = list(orch._active_tasks.items()) or collect tasks to cancel in a separate list) and then iterate that snapshot to call orch.coordination_tracker.track_agent_action (using ActionType.CANCELLED when appropriate), task.cancel(), and await the task from the snapshot so the original orch._active_tasks dict cannot change during iteration.massgen/orchestrator_collaborators/checkpoint_coordinator.py-190-295 (1)
190-295:⚠️ Potential issue | 🟠 Major | ⚡ Quick winReset checkpoint state in a
finallypath.
_checkpoint_active,_checkpoint_task, and_checkpoint_participantsare set before the subprocess work starts, but any exception fromspawn(),_copy_subprocess_logs(), event emission, orcleanup()will exit this method without clearing that state. That can leave the orchestrator stuck in checkpoint mode for the rest of the session.The early
"no workspace available"return also happens after_checkpoint_participantsis populated, so stale participant state can leak even without an exception.Suggested direction
- orch._checkpoint_active = True - orch._checkpoint_number += 1 - orch._checkpoint_task = signal.get("task", "") + orch._checkpoint_active = True + orch._checkpoint_number += 1 + orch._checkpoint_task = signal.get("task", "") + orch._checkpoint_participants = {} + + try: + ... + if not parent_workspace: + logger.error("[Checkpoint] No parent workspace for subprocess") + return "Checkpoint failed: no workspace available" + + result = await manager.spawn( + signal=signal, + on_event=_relay_event, + ) + ... + return consensus + finally: + orch._checkpoint_active = False + orch._checkpoint_participants = {}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/orchestrator_collaborators/checkpoint_coordinator.py` around lines 190 - 295, Wrap the subprocess work and all subsequent post-processing (the call to CheckpointSubprocessManager(...), await manager.spawn(...), manager._copy_subprocess_logs(), event emissions, and manager.cleanup()) in a try/finally so orch._checkpoint_active, orch._checkpoint_task and orch._checkpoint_participants are always reset in the finally block; also ensure the early "no workspace available" return path clears those same fields before returning. Use the existing symbols (CheckpointSubprocessManager, manager.spawn, manager._copy_subprocess_logs, manager.cleanup, get_event_emitter) to locate where to add the try/finally and the cleanup assignments.massgen/orchestrator_collaborators/subagent_tool_injector.py-97-111 (1)
97-111:⚠️ Potential issue | 🟠 Major | ⚡ Quick winMake list-backed MCP injection idempotent.
When
mcp_serversis a dict you replace by key, but in the list branch you always append. If this injector runs more than once for the same agent, the same subagent server gets registered repeatedly and the backend will see duplicate tool definitions. Filter the list byself.subagent_server_name(agent_id)before appending the new config.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/orchestrator_collaborators/subagent_tool_injector.py` around lines 97 - 111, The list-backed MCP injection is not idempotent: in the else branch you always append subagent_mcp_config to mcp_servers causing duplicates; modify the logic that handles list mcp_servers (variable mcp_servers, method subagent_server_name(agent_id), and value subagent_mcp_config) to first filter out any existing entries whose server name equals self.subagent_server_name(agent_id) (or whose "name" field matches that value) and then append the new subagent_mcp_config only once, finally writing back to agent.backend.config["mcp_servers"] so repeated runs replace the previous entry instead of duplicating it.massgen/orchestrator_collaborators/subagent_tool_injector.py-542-568 (1)
542-568:⚠️ Potential issue | 🟠 Major | ⚡ Quick winKeep the round-2 rewrite path in sync with the initial config writer.
rewrite_subagent_mcp_config_files()is not reproducing the payload fromcreate_subagent_mcp_config(): it drops the backend-model fallback and never rewrites*_context_paths.json. After a workspace clear, later subagent launches can therefore run with a different config than round 1 even though the docstring says this is the counterpart path. Please share the serialization logic or rewrite the same files/fields here.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/orchestrator_collaborators/subagent_tool_injector.py` around lines 542 - 568, rewrite_subagent_mcp_config_files() currently builds agent_configs but omits the backend-model fallback and never emits the *_context_paths.json (so round-2 payload diverges from create_subagent_mcp_config()); update rewrite_subagent_mcp_config_files() to mirror create_subagent_mcp_config() exactly by: when serializing each agent (agent_cfg in the loop over orch.agents) include the backend model fallback field(s) you add/expect from backend.config (e.g., preserve any "model" or fallback keys removed by the filter), serialize runtime_agent_config.context_paths (or the same attribute create_subagent_mcp_config reads) and write it to mcp_temp_dir / f"{_token}_context_paths.json", and ensure the same three files are produced (*_agent_configs.json, *_coordination_config.json, *_orchestrator_config.json plus *_context_paths.json) using the same to_dict/json serialization logic used in create_subagent_mcp_config() so the round‑2 payload matches round‑1.massgen/orchestrator_collaborators/previous_log_restorer.py-39-42 (1)
39-42: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winUse Google-style docstrings on the new methods.
These new APIs still use summary-only docstrings, so the extracted collaborator is missing the required
Args/Returnsstructure where callers now need it most.As per coding guidelines, "
**/*.py: For new or changed functions, include Google-style docstrings"Also applies to: 42-52, 215-222, 281-282
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/orchestrator_collaborators/previous_log_restorer.py` around lines 39 - 42, The new methods currently have summary-only docstrings; update each changed/new function (e.g., the class __init__ and restore_from_previous_log in PreviousLogRestorer, plus the other modified functions referenced around the file such as the methods at the other changed ranges) to use Google-style docstrings including at minimum an "Args:" section describing parameters and types and a "Returns:" section (or "Raises:" if applicable) so callers can rely on structured doc metadata; ensure the docstring formatting follows Google style (triple-quoted string immediately under the def) and include types and brief descriptions for each parameter and the return value or None.massgen/orchestrator_collaborators/post_evaluation_runner.py-39-44 (1)
39-44: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winUse Google-style docstrings on these new methods.
Both new entry points are missing the required
Args/Returnssections, which makes the restart and post-eval flow harder to follow now that it lives outsideOrchestrator.As per coding guidelines, "
**/*.py: For new or changed functions, include Google-style docstrings"Also applies to: 436-443
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/orchestrator_collaborators/post_evaluation_runner.py` around lines 39 - 44, The two new async entry methods (post_evaluate_answer and the other new method at lines ~436-443) are missing Google-style docstrings; update each function (post_evaluate_answer and the other new entrypoint method name as seen in the diff) to include a proper Google-style docstring with an overall summary line plus an Args section that documents selected_agent_id: str and final_answer: str (or the appropriate parameters for the second method) and a Returns section that describes the AsyncGenerator[StreamChunk] return type and what each yielded StreamChunk represents; ensure parameter types and return behavior (async generator yielding StreamChunk events) are clearly described so the restart and post-eval flows are discoverable.massgen/orchestrator_collaborators/post_evaluation_runner.py-436-489 (1)
436-489:⚠️ Potential issue | 🟠 Major | ⚡ Quick winClear the consumed restart flag in
handle_restart().
post_evaluate_answer()raisesorch.restart_pending, buthandle_restart()never clears it. That stale flag can make the next attempt look like it is still awaiting a restart decision and trigger another restart even after a later approval.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/orchestrator_collaborators/post_evaluation_runner.py` around lines 436 - 489, post_evaluate_answer() sets orch.restart_pending but handle_restart() never clears it, leaving a stale restart flag that can trigger another restart; inside handle_restart (after obtaining orch or before finishing the method) explicitly reset the flag (e.g., set orch.restart_pending = False or None) so the orchestrator no longer appears to be awaiting a restart decision; update any places that check the flag to expect this cleared state and ensure you reference handle_restart and orch.restart_pending when making the change.massgen/orchestrator_collaborators/final_presentation_runner.py-449-458 (1)
449-458:⚠️ Potential issue | 🟠 Major | ⚡ Quick winReset
_presentation_startedin thefinallypath.Line 458 flips the guard to
True, but nothing clears it afterward. After the first presentation attempt, every later call returns "Presentation already in progress", which breaks retries, restarts, and later-turn reuse of this runner.Also applies to: 1127-1237
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/orchestrator_collaborators/final_presentation_runner.py` around lines 449 - 458, The guard orch._presentation_started is set to True but never cleared, causing all subsequent calls to short-circuit; update the presentation runner functions (the block setting orch._presentation_started = True and the duplicate-call path that logs via _orch_mod.logger.warning and yields StreamChunk) to ensure orch._presentation_started is reset to False in a finally path (or via context manager) so it is cleared after success or any exception; apply the same fix to the other occurrence around the 1127-1237 region to restore retries/restarts correctly.massgen/orchestrator_collaborators/workspace_lifecycle_manager.py-23-317 (1)
23-317: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winUse Google-style docstrings for the new workspace lifecycle methods.
This module adds several new entry points, but the docstrings still don't follow the required
Args/Returnsstructure.As per coding guidelines, "
**/*.py: For new or changed functions, include Google-style docstrings"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/orchestrator_collaborators/workspace_lifecycle_manager.py` around lines 23 - 317, Several new public methods (clear_agent_workspaces, archive_agent_memories, namespace_verification_memory_files, merge_agent_memories_to_winner, seed_plan_execution_workspaces, copy_workspace_contents) lack Google-style docstrings; update each docstring to follow the project's Google style (one-line summary, blank line, Args: with parameter names and types, Returns: with type and description where applicable, and Raises: when exceptions are expected) and ensure types for orch/session_id/agent_id/etc. are specified and brief example lines removed so linters accept them. Locate and replace the existing triple-quoted summaries in those functions with the new formatted docstrings and keep the current brief descriptions as the one-line summary. Ensure seed_plan_execution_workspaces and copy_workspace_contents include Returns or None as appropriate and archive_agent_memories/docstrings mention side effects (file I/O) in Raises if relevant.massgen/orchestrator_collaborators/isolated_change_reviewer.py-33-112 (1)
33-112: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winConvert the new helper and entry-point docstrings to Google style.
The extracted review/apply flow is adding new API surface, but the methods still omit the required
Args/Returnssections.As per coding guidelines, "
**/*.py: For new or changed functions, include Google-style docstrings"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/orchestrator_collaborators/isolated_change_reviewer.py` around lines 33 - 112, The new helper functions (_count_diff_files, _normalize_approved_files_by_context, _normalize_approved_hunks_by_context, _is_in_context_prefix, _format_file_list) and the entry-point method review need Google-style docstrings including Args and Returns (and Raises when applicable); update each function/method to keep the existing short description and add an "Args:" section listing parameters with types and a brief description and a "Returns:" section describing the return value and type (for review, also include that it yields AsyncGenerator[StreamChunk] and any contextual behavior), and add "Raises:" only if the function can intentionally raise exceptions—ensure formatting matches other Google-style docstrings in the codebase.massgen/orchestrator_collaborators/previous_log_restorer.py-83-85 (1)
83-85:⚠️ Potential issue | 🟠 Major | ⚡ Quick winSort restored snapshots by round/timestamp, not by label string.
Line 84 uses lexicographic label ordering, so revisions like
agent1.10restore beforeagent1.2. That can leaveanswers[-1]and the restored agent state pointing at an older revision after longer runs. Sort the mappings by numeric round and a monotonic field like timestamp instead.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/orchestrator_collaborators/previous_log_restorer.py` around lines 83 - 85, The current loop sorts answer_snapshots by label string which causes lexicographic order issues (e.g., "agent1.10" before "agent1.2"); change the sort to order mappings by their numeric round and a monotonic timestamp instead: iterate over answer_snapshots.items() sorted by a key that uses mapping['round'] (or parses the numeric round from label if round is not present) and then mapping['timestamp'] (or another monotonic field) so restored mappings and answers[-1] reflect true chronological/round order; update the loop that references answer_snapshots, label, and mapping to use that sorted order.massgen/orchestrator_collaborators/peer_answer_visibility_tracker.py-39-291 (1)
39-291: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winConvert the new method docstrings to Google style.
This extracted collaborator adds a lot of new surface area, but the methods still use summary-only docstrings instead of the required
Args/Returnsformat.As per coding guidelines, "
**/*.py: For new or changed functions, include Google-style docstrings"massgen/orchestrator_collaborators/final_result_reporter.py-342-353 (1)
342-353:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPopulate per-snapshot content from the snapshot itself.
These entries are keyed by snapshot label, but
contentis pulled fromorch.agent_states[agent_id].answer, i.e. the agent's latest answer. Once an agent has multiple snapshots, older labels will all report the newest content, so the API no longer matches the snapshot mapping.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/orchestrator_collaborators/final_result_reporter.py` around lines 342 - 353, The code currently sets content = orch.agent_states[agent_id].answer which returns the agent's latest answer and causes older snapshot labels to show the newest content; update the content assignment to read the per-snapshot content from the snapshot mapping (e.g., content = mapping.get("content") or mapping.get("snapshot_answer")) and only fall back to orch.agent_states[agent_id].answer if the mapping has no snapshot content; change the assignment near where mapping, agent_id, and answers are used (the block that builds the answers list with label, agent_id, answer_path) so each snapshot label reflects its own stored content.massgen/orchestrator_collaborators/workspace_lifecycle_manager.py-60-71 (1)
60-71:⚠️ Potential issue | 🟠 Major | ⚡ Quick winSkip symlinks when repopulating workspaces.
These copy loops restore top-level entries from previous/cancelled workspaces without an
item.is_symlink()guard. That can reintroduce symlinks pointing outside the live workspace, even thoughcopy_workspace_contents()explicitly avoids that later in this file.Also applies to: 83-94
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/orchestrator_collaborators/workspace_lifecycle_manager.py` around lines 60 - 71, The loop that restores top-level entries from previous workspaces (iterating source.iterdir() and calling shutil.copy2 / shutil.copytree) lacks an item.is_symlink() guard and can reintroduce external symlinks; update both copy loops (the one shown and the similar block at lines ~83-94) to skip items where item.is_symlink() is True (e.g., if item.is_symlink(): continue) before performing shutil.copy2 or shutil.copytree, keeping consistency with copy_workspace_contents() which deliberately avoids symlinks; optionally add a brief debug/log statement when skipping a symlink for traceability.massgen/orchestrator_collaborators/final_presentation_runner.py-38-444 (1)
38-444: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winUse Google-style docstrings on the new final-presentation methods.
This extracted runner now owns several core orchestration entry points, but the new methods still omit the required
Args/Returnssections.As per coding guidelines, "
**/*.py: For new or changed functions, include Google-style docstrings"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/orchestrator_collaborators/final_presentation_runner.py` around lines 38 - 444, The new final-presentation methods (yield_existing_answer_finalization, present_final_answer, handle_orchestrator_timeout, determine_final_agent_from_votes, get_final_presentation) lack Google-style docstrings with Args/Returns; update each function/class docstring to follow Google style by adding an Args section describing parameters (types and meaning) and a Returns section describing the AsyncGenerator or return value (including yielded StreamChunk types), and include any raised exceptions or side-effects (e.g., workspace snapshot, event emissions) where relevant so the docstrings for these methods conform to the repository guideline.
🟡 Minor comments (8)
massgen/orchestrator_collaborators/skills_config_validator.py-65-68 (1)
65-68:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winGuard the built-in skills probe with
is_dir()before callingiterdir().If
massgen/skillsexists but is a file or broken symlink, this path raises fromiterdir()and skips the clearerRuntimeErrorbelow. Mirror the external-skills check here so broken installs fail deterministically.Suggested fix
builtin_skills_dir = _builtin_skills_dir() - has_builtin_skills = builtin_skills_dir.exists() and any( + has_builtin_skills = ( + builtin_skills_dir.exists() + and builtin_skills_dir.is_dir() + and any( builtin_skills_dir.iterdir(), - ) + ) + )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/orchestrator_collaborators/skills_config_validator.py` around lines 65 - 68, The probe for built-in skills should check that builtin_skills_dir is a directory before calling iterdir(): update the logic around builtin_skills_dir (returned by _builtin_skills_dir()) and the has_builtin_skills assignment to mirror the external-skills check by adding builtin_skills_dir.is_dir() as a prerequisite to calling builtin_skills_dir.iterdir(); this ensures broken symlinks or files named massgen/skills don't cause an exception and will instead fall through to the existing RuntimeError handling.massgen/orchestrator_collaborators/snapshot_manager.py-201-220 (1)
201-220:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winFix type hints and return type consistency.
- Lines 204-205 use implicit
Optionalwhich violates PEP 484. Use explicit| Nonesyntax.- Line 220 returns
Nonebut the method signature declares-> str. This should be-> str | None.🔧 Proposed fix
async def save_agent_snapshot( self, agent_id: str, - answer_content: str = None, - vote_data: dict[str, Any] = None, + answer_content: str | None = None, + vote_data: dict[str, Any] | None = None, is_final: bool = False, context_data: Any = None, - ) -> str: + ) -> str | None: """See Orchestrator._save_agent_snapshot.""" orch = self._orchestrator logger.info(🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/orchestrator_collaborators/snapshot_manager.py` around lines 201 - 220, The signature of save_agent_snapshot uses implicit Optional types and declares a non-optional return; update the type hints to use explicit union-with-None and make the return type nullable: change the function signature in save_agent_snapshot to return "str | None" and change parameters to "answer_content: str | None = None", "vote_data: dict[str, Any] | None = None", and "context_data: Any | None = None" so annotations are PEP 484-compliant and match the code path that returns None when agent not found.massgen/orchestrator_collaborators/subagent_lifecycle_coordinator.py-84-89 (1)
84-89:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winBackground task exceptions may be silently lost.
The task created by
loop.create_task()is not stored, so any unhandled exception in_maybe_interrupt_background_wait_for_agentwill be silently dropped. Consider adding an error handler or storing the task reference.🛡️ Proposed fix with error callback
- loop.create_task( + task = loop.create_task( orch._maybe_interrupt_background_wait_for_agent( agent_id, trigger=trigger, ), ) + task.add_done_callback( + lambda t: logger.debug( + f"[SubagentLifecycle] Background interrupt task for {agent_id} " + f"{'failed: ' + str(t.exception()) if t.exception() else 'completed'}" + ) if t.exception() else None + )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/orchestrator_collaborators/subagent_lifecycle_coordinator.py` around lines 84 - 89, The created background task for orch._maybe_interrupt_background_wait_for_agent is not stored, so exceptions can be lost; modify the code that calls loop.create_task(...) to capture the Task object (e.g., task = loop.create_task(...)) and attach a done callback that checks task.exception() and logs or forwards the error (or re-raises) via the orchestrator's logger; alternatively register the task in an in-flight tasks collection on the orch instance so it can be awaited/cancelled on shutdown and its exceptions handled centrally.massgen/orchestrator_collaborators/workspace_modal_presenter.py-61-66 (1)
61-66:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winLog workspace-resolution failures instead of silently dropping the workspace tab.
If
get_current_workspace()or the directory probe throws here, the code quietly falls back to the answer-only modal. That changes the visible result but leaves no clue why the workspace tab disappeared. Please log the exception before falling back so regressions in filesystem-manager integration are diagnosable.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/orchestrator_collaborators/workspace_modal_presenter.py` around lines 61 - 66, The try/except that calls fm.get_current_workspace() in workspace_modal_presenter.py currently suppresses all errors; change the except to "except Exception as e:" and log the failure before falling back (e.g., call logger.exception or logger.error with the exception) so the message includes context like "Failed to resolve current workspace" and the exception information; keep the existing fallback behavior that leaves workspace_path unset. Reference the existing call sites get_current_workspace(), Path.is_dir(), Path.iterdir(), and the workspace_path variable when adding the log.massgen/frontend/displays/textual_terminal_display.py-741-749 (1)
741-749:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAdd Google-style docstrings for the touched delegating methods.
The capability-probe change touches
TextualTerminalDisplay.__init__, and the new wrapper methods here are also changed, but they still use missing or non-Google-style docstrings. Please bring these updated methods in line with the repo’s Python docstring requirement.As per coding guidelines,
**/*.py: For new or changed functions, include Google-style docstrings.Also applies to: 832-871, 3957-4000, 4366-4384
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/frontend/displays/textual_terminal_display.py` around lines 741 - 749, The modified TextualTerminalDisplay class methods lack Google-style docstrings; update TextualTerminalDisplay.__init__ and any delegating/wrapper methods you changed to include proper Google-style docstrings (one-line summary, blank line, Args: and Returns: sections as applicable) so they follow the repo convention—ensure each docstring documents parameters like env/on_error/refresh_rate or the wrapper's input/return values and side effects, and add them to the other touched methods referenced in the review (the delegating/wrapper methods within the same class).massgen/orchestrator_collaborators/criteria_evolution_runner.py-434-434 (1)
434-434:⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
zip()withoutstrict=Truemay silently truncate if list lengths differ.If
parse_evolution_responsereturns anevolvedlist with a different length thanold_criteria, thezip()will silently drop items, causing an incorrectevolved_count. Consider usingstrict=True(Python 3.10+) or adding an explicit length check.Suggested fix
- evolved_count = sum(1 for old_c, new_c in zip(evolution_data["current_criteria"], evolved) if old_c.text != new_c.text) + evolved_count = sum(1 for old_c, new_c in zip(evolution_data["current_criteria"], evolved, strict=True) if old_c.text != new_c.text)Or add a defensive length check before the zip if strict mode causes issues with the expected behavior.
Also applies to: 465-465
massgen/orchestrator_collaborators/final_result_reporter.py-402-414 (1)
402-414:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winBuild the agent status payload by
aid, not by zipping two dicts.This assumes
orch.agents.values()andorch.agent_states.values()always stay in identical order. If they ever diverge, the status response can pair one agent's backend status with another agent's coordination state.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/orchestrator_collaborators/final_result_reporter.py` around lines 402 - 414, The current payload builds "agents" by zipping orch.agents.keys() with zip(orch.agents.values(), orch.agent_states.values()), which can mis-pair agent objects and their states if dict ordering diverges; instead iterate over orch.agents.items() (for aid, agent in orch.agents.items()), look up the corresponding state via orch.agent_states[aid] (or .get(aid) with a safe fallback), and construct the map using agent.get_status(), state.answer, and state.has_voted so each aid is guaranteed to be paired with its correct coordination state.massgen/orchestrator_collaborators/persona_injector.py-195-196 (1)
195-196:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winGuard against
Nonereturn fromget_log_session_dir().If
get_log_session_dir()returnsNone, the path operation on line 196 will raise aTypeError. Add a guard or early return.🛡️ Suggested fix
log_dir = _orch_mod.get_log_session_dir() + if log_dir is None: + logger.debug("[Orchestrator] No log session dir; skipping persona save") + return personas_file = log_dir / "generated_personas.yaml"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/orchestrator_collaborators/persona_injector.py` around lines 195 - 196, Guard against get_log_session_dir() returning None before using it: check the value of log_dir (returned by _orch_mod.get_log_session_dir()) and if it's None either return early or raise a clear error, otherwise proceed to construct personas_file = log_dir / "generated_personas.yaml"; ensure any downstream code in PersonaInjector (or persona_injector.py) handles the early-return or error case consistently.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 55433d91-58a3-4af6-8475-1bce8f5be183
📒 Files selected for processing (66)
PR_DRAFT.mddocs/dev_notes/orchestrator_refactor_roadmap.mdmassgen/frontend/displays/_textual_provider_model.pymassgen/frontend/displays/_textual_terminal_capabilities.pymassgen/frontend/displays/_textual_widget_debug.pymassgen/frontend/displays/textual_terminal_display.pymassgen/orchestrator.pymassgen/orchestrator_collaborators/__init__.pymassgen/orchestrator_collaborators/active_coordination_cleanup.pymassgen/orchestrator_collaborators/agent_orchestration_setup.pymassgen/orchestrator_collaborators/answer_limit_gate.pymassgen/orchestrator_collaborators/answer_text_normalizer.pymassgen/orchestrator_collaborators/bootstrap_criteria_engine.pymassgen/orchestrator_collaborators/broadcast_tool_initializer.pymassgen/orchestrator_collaborators/changedoc_coordinator.pymassgen/orchestrator_collaborators/chat_followup_handler.pymassgen/orchestrator_collaborators/checklist_gate_manager.pymassgen/orchestrator_collaborators/checkpoint_coordinator.pymassgen/orchestrator_collaborators/context_path_write_tracker.pymassgen/orchestrator_collaborators/criteria_evolution_runner.pymassgen/orchestrator_collaborators/docker_diagnostics.pymassgen/orchestrator_collaborators/dspy_paraphrase_coordinator.pymassgen/orchestrator_collaborators/enforcement_buffer_helper.pymassgen/orchestrator_collaborators/essential_files_helper.pymassgen/orchestrator_collaborators/evaluation_criteria_generator.pymassgen/orchestrator_collaborators/evaluator_result_extractor.pymassgen/orchestrator_collaborators/fairness_gate.pymassgen/orchestrator_collaborators/final_presentation_runner.pymassgen/orchestrator_collaborators/final_result_reporter.pymassgen/orchestrator_collaborators/isolated_change_reviewer.pymassgen/orchestrator_collaborators/metrics_reporter.pymassgen/orchestrator_collaborators/midstream_injection_hook_installer.pymassgen/orchestrator_collaborators/nlip_routing_initializer.pymassgen/orchestrator_collaborators/orchestrator_timeout_calculator.pymassgen/orchestrator_collaborators/peer_answer_visibility_tracker.pymassgen/orchestrator_collaborators/persona_injector.pymassgen/orchestrator_collaborators/planning_tool_injector.pymassgen/orchestrator_collaborators/post_evaluation_runner.pymassgen/orchestrator_collaborators/pre_collab_helpers.pymassgen/orchestrator_collaborators/previous_log_restorer.pymassgen/orchestrator_collaborators/prompt_improver_collaborator.pymassgen/orchestrator_collaborators/question_irreversibility_analyzer.pymassgen/orchestrator_collaborators/rate_limit_controller.pymassgen/orchestrator_collaborators/round_evaluator_gate_config.pymassgen/orchestrator_collaborators/round_evaluator_runner.pymassgen/orchestrator_collaborators/round_start_context_queue.pymassgen/orchestrator_collaborators/run_mode_strategy_resolver.pymassgen/orchestrator_collaborators/runtime_input_delivery.pymassgen/orchestrator_collaborators/skills_config_validator.pymassgen/orchestrator_collaborators/snapshot_manager.pymassgen/orchestrator_collaborators/step_mode_handler.pymassgen/orchestrator_collaborators/subagent_lifecycle_coordinator.pymassgen/orchestrator_collaborators/subagent_tool_injector.pymassgen/orchestrator_collaborators/tool_message_helpers.pymassgen/orchestrator_collaborators/trace_analyzer_runner.pymassgen/orchestrator_collaborators/workspace_lifecycle_manager.pymassgen/orchestrator_collaborators/workspace_modal_presenter.pymassgen/tests/frontend/test_textual_terminal_display_characterization.pymassgen/tests/integration/test_orchestrator_hooks_broadcast_subagents.pymassgen/tests/integration/test_orchestrator_restart_and_external_tools.pymassgen/tests/test_answer_path_normalization.pymassgen/tests/test_auto_trace_analysis.pymassgen/tests/test_essential_files_manifest.pymassgen/tests/test_evaluator_personas.pymassgen/tests/test_orchestrator_characterization.pymassgen/tests/unit/test_orchestrator_unit.py
✅ Files skipped from review due to trivial changes (1)
- PR_DRAFT.md
feat(mcp): add Parallel Web Search to MCP server registry + example config
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
massgen/configs/tools/web-search/parallel_search_example.yaml (1)
28-30: ⚡ Quick winUse a cost-effective default model in this example config.
Please switch this example to a cost-effective default (or add a short note explaining why a premium model is required here).
As per coding guidelines,
massgen/configs/**/*.yaml: “Prefer cost-effective models (gpt-5-nano, gpt-5-mini, gemini-2.5-flash)”.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/configs/tools/web-search/parallel_search_example.yaml` around lines 28 - 30, The example uses a premium Claude model ("claude-sonnet-4-20250514"); update the example to use a cost-effective default per guidelines by replacing the model value with a recommended cheaper model (e.g., "gpt-5-nano" or "gpt-5-mini" or "gemini-2.5-flash") or add a one-line comment next to the model key explaining why a premium Claude model is required; target the YAML keys "type", "model", and the surrounding example block (where "mcp_servers" is defined) when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@massgen/mcp_tools/server_registry.py`:
- Around line 86-97: The registry entry for "parallel_search" currently sets
requires_api_key=True and hardcodes api_key_env_var/Authorization behavior,
preventing anonymous use; change the base "parallel_search" entry to not force
an API key (set requires_api_key=False or remove the header behavior) and remove
any mandatory Authorization header from the static entry, then update the MCP
config assembly code that builds mcp_servers to inject the Authorization header
only when os.environ.get("PARALLEL_API_KEY") is present (use api_key_env_var to
detect the env var), mirroring the existing optional-key pattern so anonymous
endpoints remain discoverable without headers.
---
Nitpick comments:
In `@massgen/configs/tools/web-search/parallel_search_example.yaml`:
- Around line 28-30: The example uses a premium Claude model
("claude-sonnet-4-20250514"); update the example to use a cost-effective default
per guidelines by replacing the model value with a recommended cheaper model
(e.g., "gpt-5-nano" or "gpt-5-mini" or "gemini-2.5-flash") or add a one-line
comment next to the model key explaining why a premium Claude model is required;
target the YAML keys "type", "model", and the surrounding example block (where
"mcp_servers" is defined) when making the change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: b59d69e2-58a9-42c8-9c84-4d0773e2f237
📒 Files selected for processing (2)
massgen/configs/tools/web-search/parallel_search_example.yamlmassgen/mcp_tools/server_registry.py
|
@coderabbitai pause |
docs: docs for v0.1.92
✅ Actions performedReviews paused. |
PR Title Format
Your PR title must follow the format:
<type>: <brief description>Valid types:
fix:- Bug fixesfeat:- New featuresbreaking:- Breaking changesdocs:- Documentation updatesrefactor:- Code refactoringtest:- Test additions/modificationschore:- Maintenance tasksperf:- Performance improvementsstyle:- Code style changesci:- CI/CD configuration changesExamples:
fix: resolve memory leak in data processingfeat: add export to CSV functionalitybreaking: change API response formatdocs: update installation guideDescription
Brief description of the changes in this PR
Type of change
fix:) - Non-breaking change which fixes an issuefeat:) - Non-breaking change which adds functionalitybreaking:) - Fix or feature that would cause existing functionality to not work as expecteddocs:) - Documentation updatesrefactor:) - Code changes that neither fix a bug nor add a featuretest:) - Adding missing tests or correcting existing testschore:) - Maintenance tasks, dependency updates, etc.perf:) - Code changes that improve performancestyle:) - Changes that do not affect the meaning of the code (formatting, missing semi-colons, etc.)ci:) - Changes to CI/CD configuration files and scriptsChecklist
Pre-commit status
How to Test
Add test method for this PR.
Test CLI Command
Write down the test bash command. If there is pre-requests, please emphasize.
Expected Results
Description/screenshots of expected results.
Additional context
Add any other context about the PR here.
Summary by CodeRabbit
New Features
Documentation
Tests
Chores