feat(mcp): add Parallel Web Search to MCP server registry + example config #1108
NormallyGaussian wants to merge 5 commits into
…onfig

Adds Parallel Web Systems' hosted Search MCP server alongside the existing Context7, Brave, and Exa entries.

- massgen/mcp_tools/server_registry.py: new `parallel_search` entry using `type: streamable-http` and `url: https://search.parallel.ai/mcp`. The server accepts unauthenticated requests by default, so the entry is registered with `requires_api_key: False` and an `optional_api_key_env_var: PARALLEL_API_KEY` for users who want higher rate limits via the `Authorization: Bearer ...` header.
- massgen/configs/tools/web-search/parallel_search_example.yaml: full example config modelled on `exa_search_example.yaml`, with a system message that explains Parallel's `objective + search_queries` pattern and `web_search`/`web_fetch` tools, plus a commented-out `headers` block for opting into the API key.
- Module docstring updated to list `parallel_search`.

Docs:
- https://docs.parallel.ai/integrations/mcp/search-mcp
- https://docs.parallel.ai/api-reference/search/search
- https://docs.parallel.ai/search/best-practices

Tested with claude-sonnet-4-6 backend via the new example config.
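For orientation, the registry entry described in this commit message might look roughly like the dictionary below. This is a minimal sketch, not the actual contents of `server_registry.py`: the constant name `PARALLEL_SEARCH_ENTRY` and the `"name"` key are assumptions made for illustration, and only the field values quoted in the commit message are taken from the PR. Later commits in this thread revise how the API key is handled.

```python
# Hypothetical sketch of the initial registry entry. Only the values quoted in
# the commit message (type, url, requires_api_key, optional_api_key_env_var)
# come from the PR; the constant name and the "name" key are assumptions.
PARALLEL_SEARCH_ENTRY = {
    "name": "parallel_search",
    "type": "streamable-http",
    "url": "https://search.parallel.ai/mcp",
    # The hosted server accepts anonymous requests, so no key is required.
    "requires_api_key": False,
    # Optional key for users who want higher rate limits.
    "optional_api_key_env_var": "PARALLEL_API_KEY",
}
```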
📝 Walkthrough
Registers Parallel Search as an MCP server in the registry and adds an example YAML agent config showing how to connect to Parallel’s hosted MCP, use the …
Changes
Parallel Search MCP Server Integration
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 7 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (7 passed)
Actionable comments posted: 2
🧹 Nitpick comments (1)
massgen/configs/tools/web-search/parallel_search_example.yaml (1)
15-24: ⚡ Quick win
Add a “What happens” execution-flow comment block.
This example has prerequisites/features, but it should also include a short "What happens" flow section for MassGen config convention consistency.
As per coding guidelines, `massgen/configs/**/*.yaml` should "Include 'What happens' comments explaining execution flow."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/configs/tools/web-search/parallel_search_example.yaml` around lines 15 - 24, Add a short "What happens" execution-flow comment block below the existing "Features" comment in the YAML example describing the runtime steps (e.g., input objective + search_queries → web_search returns ranked URLs and LLM excerpts → web_fetch fetches and extracts markdown content → excerpts are compressed and returned), following the MassGen config convention; place this new "What happens" comment block adjacent to the "Features" block so readers of parallel_search_example.yaml can quickly see the end-to-end flow.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@massgen/configs/tools/web-search/parallel_search_example.yaml`:
- Around line 47-66: The system prompt embeds hardcoded tool-call signatures
(e.g., "web_search(objective, search_queries[, ...])" and "web_fetch(urls[,
objective])"); remove those function-like invocation syntaxes and rewrite the
entries for web_search and web_fetch into natural-language descriptions that
describe inputs and behavior (mention objective, list of short search queries,
optional mode, max counts, and returned results) without suggesting exact call
syntax so the tool schemas handle invocation details.
In `@massgen/mcp_tools/server_registry.py`:
- Around line 77-83: The registry entry's "description" string embeds
signature-style tool syntax (mentions like "web_search (...)” and "web_fetch
(...)"); update the description field in the server registry to use
natural-language phrasing that describes the server's capabilities (e.g.,
supports ranked web search results and URL-to-markdown fetching optimized for
LLM consumption, free anonymous use with optional PARALLEL_API_KEY for higher
rate limits) instead of showing function-call shapes; edit the "description"
value (the string literal in the registry) to remove parenthesized signatures
and present the features as plain text.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 0ae94dad-9970-465f-96be-b94991e5c40b
📒 Files selected for processing (2)
massgen/configs/tools/web-search/parallel_search_example.yaml
massgen/mcp_tools/server_registry.py
Addresses CodeRabbit findings (Major) on PR massgen#1108:

- server_registry.py: registry description no longer encodes tool invocation shape (web_search(...) / web_fetch(URL ->...)). Reworded as a natural-language description of what the server provides; the actual call schemas come from the MCP server itself.
- parallel_search_example.yaml: system message dropped the web_search(objective, search_queries[, ...]) and web_fetch(urls[, objective]) headers, and now leans on the MCP-supplied schemas for argument names. Usage guidance (objective + 2-3 keyword queries, mode preset, citation discipline) is preserved in plain prose.

Tools remain referenced by their conceptual function ("web search", "URL extraction") rather than by hardcoded call signatures, so the prompt stays accurate if the underlying tool schema evolves.
Actionable comments posted: 0
…p servers
Codex review identified that the new parallel_search entry advertises
PARALLEL_API_KEY support but get_server_config() only special-cased
context7's --api-key CLI flag. As written, parallel_search would always
connect anonymously even when PARALLEL_API_KEY was set, defeating the
documented higher-rate-limit path.
Generalize get_server_config() so that when a registry entry declares
optional_api_key_env_var and the env var is set:
- stdio context7 keeps its existing --api-key flag injection (unchanged
behavior; explicit branch on server_name)
- streamable-http entries get a default Authorization: Bearer ${key}
header (using setdefault so a registry-declared header takes
precedence)
- everything else is unchanged
Manually verified:
- PARALLEL_API_KEY unset -> no headers added to parallel_search config
- PARALLEL_API_KEY set -> headers.Authorization = "Bearer <key>"
- apply_api_key_logic=False -> no injection
- context7 still injects --api-key when CONTEXT7_API_KEY is set
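The branching described in this commit message can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the code from `server_registry.py`: the two-entry `REGISTRY`, the placeholder Context7 command and arguments, and the extra `env` parameter (added so the behavior can be checked without touching the real environment) are all hypothetical.

```python
import copy
import os
from typing import Any, Mapping, Optional

# Hypothetical two-entry registry; command/args for context7 are placeholders.
REGISTRY: dict[str, dict[str, Any]] = {
    "context7": {
        "type": "stdio",
        "command": "context7-placeholder",
        "args": ["--placeholder"],
        "optional_api_key_env_var": "CONTEXT7_API_KEY",
    },
    "parallel_search": {
        "type": "streamable-http",
        "url": "https://search.parallel.ai/mcp",
        "optional_api_key_env_var": "PARALLEL_API_KEY",
    },
}


def get_server_config(
    server_name: str,
    apply_api_key_logic: bool = True,
    env: Optional[Mapping[str, str]] = None,  # assumption: injected for testability
) -> Optional[dict[str, Any]]:
    """Return a deep copy of a registry entry, injecting an optional API key."""
    env = os.environ if env is None else env
    entry = REGISTRY.get(server_name)
    if entry is None:
        return None
    config = copy.deepcopy(entry)

    key_var = config.get("optional_api_key_env_var")
    key = env.get(key_var) if key_var else None
    if not (apply_api_key_logic and key):
        return config  # nothing to inject

    if server_name == "context7":
        # stdio Context7 keeps its CLI flag injection.
        config["args"].extend(["--api-key", key])
    elif config.get("type") == "streamable-http":
        # setdefault lets a registry-declared Authorization header win.
        config.setdefault("headers", {}).setdefault("Authorization", f"Bearer {key}")
    return config
```

The four manual checks listed above map directly onto calls such as `get_server_config("parallel_search", env={})` and `get_server_config("parallel_search", env={"PARALLEL_API_KEY": "k"})`.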
Codex review on the prior commit flagged a P1 behavioral regression:
marking parallel_search as `requires_api_key: False` put it in the
always-on auto-discovery bucket alongside context7, so every config
with `auto_discover_custom_tools: true` (e.g. subagent_checklist.yaml,
log_analysis.yaml, many others) silently gained an outbound web-search
tool — breaking deterministic/offline assumptions of existing workflows.
Realign with the Brave/Exa convention:
- `requires_api_key: True`, `api_key_env_var: "PARALLEL_API_KEY"`
- Add a baked `headers: {"Authorization": "Bearer ${PARALLEL_API_KEY}"}`
block; the MCP transport's existing env-var substitution unpacks it at
connection time (mirrors how stdio servers consume `env:` like
`BRAVE_API_KEY: "${BRAVE_API_KEY}"`).
- Revert the temporary streamable-http bearer-injection branch added
to `get_server_config()` last commit — no longer needed and made the
registry entry's `optional_api_key_env_var` look like it was opt-in
when it actually gated nothing.
- Description/notes reworded: anonymous use is still possible by adding
the server manually to mcp_servers without the headers block (the
example YAML continues to show that path).
Manually verified:
- PARALLEL_API_KEY unset -> auto-discovery is just [context7]
(no regression vs. main)
- PARALLEL_API_KEY set -> auto-discovery is [context7, parallel_search]
with the headers template ready for MCP substitution
- Missing-key reporting now lists parallel_search alongside
brave_search and exa_search when their env vars are unset
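The gating behavior described in this commit message can be sketched as follows, assuming a simplified registry. The helper names `discoverable_servers`, `missing_keys`, and `substitute_env` are hypothetical, and the `${VAR}` substitution shown here only approximates whatever the MCP transport actually does at connection time.

```python
import re
from typing import Any, Mapping

# Simplified, hypothetical registry mirroring the conventions described above.
REGISTRY: dict[str, dict[str, Any]] = {
    "context7": {"type": "stdio", "requires_api_key": False},
    "brave_search": {
        "type": "stdio",
        "requires_api_key": True,
        "api_key_env_var": "BRAVE_API_KEY",
    },
    "parallel_search": {
        "type": "streamable-http",
        "url": "https://search.parallel.ai/mcp",
        "requires_api_key": True,
        "api_key_env_var": "PARALLEL_API_KEY",
        "headers": {"Authorization": "Bearer ${PARALLEL_API_KEY}"},
    },
}


def discoverable_servers(env: Mapping[str, str]) -> list[str]:
    """Servers eligible for auto-discovery: keyless ones, or ones whose key is set."""
    return [
        name
        for name, entry in REGISTRY.items()
        if not entry["requires_api_key"] or env.get(entry["api_key_env_var"])
    ]


def missing_keys(env: Mapping[str, str]) -> dict[str, str]:
    """Map each key-gated server to the env var that is currently unset."""
    return {
        name: entry["api_key_env_var"]
        for name, entry in REGISTRY.items()
        if entry["requires_api_key"] and not env.get(entry["api_key_env_var"])
    }


_ENV_REF = re.compile(r"\$\{(\w+)\}")


def substitute_env(headers: Mapping[str, str], env: Mapping[str, str]) -> dict[str, str]:
    """Expand ${VAR} references in header values (raises KeyError if unset)."""
    return {k: _ENV_REF.sub(lambda m: env[m.group(1)], v) for k, v in headers.items()}
```

With an empty environment only `context7` is discoverable, which is the "no regression vs. main" case listed above.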
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
massgen/mcp_tools/server_registry.py (1)
113-127: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win
Update docstring to document streamable-http API key injection.
The function now injects optional API keys for `streamable-http` servers (via `Authorization: Bearer <key>` header), not just for `context7` (via CLI `--api-key` arg). The docstring should reflect this enhanced behavior for clarity.
📝 Suggested docstring update
     def get_server_config(server_name: str, apply_api_key_logic: bool = True) -> dict[str, Any] | None:
         """Get configuration for a specific MCP server from registry.

         Args:
             server_name: Name of the server (e.g., "context7", "serena", "brave_search")
    -        apply_api_key_logic: If True, adds optional API keys when available (e.g., Context7)
    +        apply_api_key_logic: If True, injects optional API keys when available:
    +            - For context7: appends '--api-key <value>' to CLI args
    +            - For streamable-http servers: sets 'Authorization: Bearer <value>' header
    +              (preserving any registry-declared Authorization header)

         Returns:
             Deep copy of server configuration dict, or None if server not found

As per coding guidelines, "For new or changed functions, include Google-style docstrings."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/mcp_tools/server_registry.py` around lines 113 - 127, Update the get_server_config(...) docstring to document that when apply_api_key_logic is True the function will also inject optional API keys for streamable-http servers by adding an Authorization: Bearer <key> header (in addition to the existing Context7/CLI --api-key behavior); mention the streamable-http header format and that this injection is optional and only applied when a key is available, keeping Google-style docstring structure and examples consistent with the existing description.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: b1150680-fdac-442d-98f7-ac639ddcfc58
📒 Files selected for processing (1)
massgen/mcp_tools/server_registry.py
Actionable comments posted: 0
🧹 Nitpick comments (1)
massgen/mcp_tools/server_registry.py (1)
139-145: 💤 Low value
Context7 logic appears out of scope for this PR.
These lines add Context7-specific logic to inject optional API keys, but the PR objectives and commit messages only mention adding Parallel Search. While the implementation is correct, this change isn't documented in the PR description or summary.
Consider:
- Adding a note to the PR description explaining this change
- Or moving this to a separate commit/PR if it's unrelated to Parallel Search
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/mcp_tools/server_registry.py` around lines 139 - 145, This change introduces Context7-specific API key injection (see apply_api_key_logic check, server_name == "context7", optional_key_var and is_api_key_available usage that appends to config["args"]) which is out of scope for the Parallel Search PR; either remove/move this block into its own commit/PR focused on Context7 API key behavior, or update the current PR description and summary to explicitly document this new Context7 behavior and why it was included so reviewers see it is intentional.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 4bd7c903-e1bf-475b-b8b2-0992cdd583d6
📒 Files selected for processing (1)
massgen/mcp_tools/server_registry.py
… compat

Codex review caught that the example was wired to backend.type: 'claude_code' but used the streamable-http 'headers:' field. ClaudeCodeBackend forwards MCP configs to ClaudeAgentOptions, where the equivalent auth field is a top-level 'authorization:' string (per massgen/backend/docs/MCP_IMPLEMENTATION_CLAUDE_BACKEND.md), not a nested 'headers' map, so users uncommenting the API-key block would have silently kept anonymous mode.

Switch the example to backend.type: 'claude' (generic massgen MCP transport), matching the existing streamable_http_test configs whose 'headers:' shape lines up with the registry entry. Added an inline note pointing claude_code users at the 'authorization:' alternative.
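The difference between the two config shapes can be illustrated with a small conversion helper. This is a sketch under assumptions: `to_claude_code_shape` is a hypothetical function that does not exist in MassGen, and whether the `authorization` string should carry the `Bearer ` prefix is not stated in this thread, so the header value is carried over unchanged. The referenced MCP_IMPLEMENTATION_CLAUDE_BACKEND.md is the authority on the real format.

```python
from typing import Any


def to_claude_code_shape(server: dict[str, Any]) -> dict[str, Any]:
    """Move headers['Authorization'] to a top-level 'authorization' string.

    Hypothetical helper for illustration only. The input is the generic
    streamable-http shape with a nested 'headers' map; the output uses the
    top-level 'authorization' field described in the commit message.
    """
    converted = {k: v for k, v in server.items() if k != "headers"}
    headers = dict(server.get("headers", {}))  # copy so the input is not mutated
    auth = headers.pop("Authorization", None)
    if auth is not None:
        # Assumption: the value is passed through as-is, prefix included.
        converted["authorization"] = auth
    if headers:
        # Assumption: any remaining headers are kept in place.
        converted["headers"] = headers
    return converted


generic = {
    "type": "streamable-http",
    "url": "https://search.parallel.ai/mcp",
    "headers": {"Authorization": "Bearer ${PARALLEL_API_KEY}"},
}
claude_code_style = to_claude_code_shape(generic)
```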
Actionable comments posted: 0
🧹 Nitpick comments (2)
massgen/configs/tools/web-search/parallel_search_example.yaml (2)
1-24: ⚡ Quick win
Add execution flow comments to the header.
The configuration lacks "What happens" comments explaining the execution flow. Consider adding a section that describes what happens when this configuration runs (e.g., agent receives user query → calls web_search with objective + keywords → processes results → may call web_fetch for specific URLs → synthesizes final response with citations).
As per coding guidelines, "Include 'What happens' comments explaining execution flow".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/configs/tools/web-search/parallel_search_example.yaml` around lines 1 - 24, Add a "What happens" header comment describing the execution flow: state that when the config runs the agent receives the user query, maps it into an objective and 2–3 search_queries, calls web_search (passing objective + search_queries) to get ranked URLs and LLM-optimized excerpts, optionally calls web_fetch with selected URLs and the same objective to retrieve bounded markdown content, and finally synthesizes a response with citations; reference the web_search and web_fetch blocks and the objective/search_queries fields so the reader can locate where each step is configured.
28-29: ⚡ Quick win
Consider using a cost-effective model for this example.
The configuration uses `claude-sonnet-4-20250514`, a premium model. For web search examples, cost-effective models like `gpt-5-nano`, `gpt-5-mini`, or `gemini-2.5-flash` would demonstrate best practices and reduce usage costs for users following this example.
As per coding guidelines, "Prefer cost-effective models (gpt-5-nano, gpt-5-mini, gemini-2.5-flash)".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@massgen/configs/tools/web-search/parallel_search_example.yaml` around lines 28 - 29, The YAML uses the premium model value "claude-sonnet-4-20250514"; change the model field under the tool definition (type: "claude", model: "claude-sonnet-4-20250514") to a cost-effective option such as "gpt-5-nano", "gpt-5-mini", or "gemini-2.5-flash" so examples follow the guideline to prefer lower-cost models for web-search examples.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 127a75a1-78ea-477c-86be-e64f145b135b
📒 Files selected for processing (1)
massgen/configs/tools/web-search/parallel_search_example.yaml
Summary
Adds Parallel Web Search as the first hosted (streamable-http) entry in MassGen's MCP server registry, alongside the existing stdio-based Context7, Brave, and Exa entries.
What's added
massgen/mcp_tools/server_registry.py
New `parallel_search` registry entry:
Module docstring updated to list the new server.
massgen/configs/tools/web-search/parallel_search_example.yaml
Full example modeled on `exa_search_example.yaml`:
Why
Parallel's Search API returns LLM-ranked, compressed excerpts in a single call (replacing multi-step keyword-search loops), and the MCP exposes it via `web_search` and `web_fetch` tools. The objective + queries pattern fits MassGen's multi-agent research scenarios particularly well — agents can issue one structured search per research goal instead of looping.
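To make the "one structured search per research goal" idea concrete, the sketch below assembles the arguments an agent might pass in a single search call. It is illustrative only: `build_search_arguments` is a hypothetical helper, the argument names `objective` and `search_queries` are taken from this PR's description of the pattern, and the authoritative schema is whatever the MCP server itself advertises.

```python
def build_search_arguments(objective: str, search_queries: list[str]) -> dict:
    """Bundle one research goal and a few short keyword queries into one call.

    Hypothetical helper; the real argument schema is supplied by the MCP server.
    """
    objective = objective.strip()
    queries = [q.strip() for q in search_queries if q.strip()]
    if not objective:
        raise ValueError("objective must describe the research goal")
    if not queries:
        raise ValueError("at least one keyword query is required")
    return {"objective": objective, "search_queries": queries}


# One structured request instead of a loop of separate keyword searches.
arguments = build_search_arguments(
    "Compare transport options for hosted MCP servers",
    ["MCP streamable-http", "MCP stdio transport"],
)
```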
Tested
Docs
Summary by CodeRabbit
New Features
Documentation