@vblagoje vblagoje commented Jan 6, 2026

Why

Prevents runtime errors from typos in inputs_from_state and outputs_to_state by validating parameter/output names at tool construction time.

The upstream Haystack PR added validation hooks (_get_valid_inputs() and _get_valid_outputs()) to the base Tool class to catch configuration errors early. Without implementing these methods, MCPTool failed to validate state-mapping parameters correctly because:

  1. MCPTool uses _invoke_tool(**kwargs), so function introspection finds only 'kwargs', not the actual parameters
  2. MCPTool parameters come from the MCP server's JSON schema, not from Python function signatures
  3. MCPTool supports lazy connection (eager_connect=False), where the schema isn't available during init

This led to silent failures when users mistyped parameter names in state-mapping configurations.
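
The first two points can be reproduced without any MCP machinery. The following is a minimal sketch using only the standard library; `invoke_tool` and `add_schema` are hypothetical stand-ins chosen for illustration, not the integration's actual code or a real server's schema.

```python
import inspect


def invoke_tool(**kwargs):
    """Hypothetical stand-in for a generic dispatcher such as _invoke_tool(**kwargs)."""
    return kwargs


# Introspection sees only the catch-all parameter, not the tool's real inputs.
introspected = set(inspect.signature(invoke_tool).parameters)

# Hypothetical JSON schema, shaped like what an MCP server might advertise for 'add'.
add_schema = {
    "type": "object",
    "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
    "required": ["a", "b"],
}
schema_params = set(add_schema["properties"])

print(introspected)           # {'kwargs'}
print(sorted(schema_params))  # ['a', 'b']
```

Validating a state mapping against `introspected` would reject every real parameter name, which is why the schema's `properties` keys are the more useful source of truth here.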

What

  • Implemented the _get_valid_inputs() method in MCPTool to return the valid parameter names from the MCP tool's JSON schema
  • Returns an empty set when eager_connect=False to defer validation until warm_up() is called (the schema is not yet available)
  • Added explicit validation in the warm_up() method for lazy-connection mode to detect errors before first tool use
  • Fixed a test case that had invalid state-mapping parameters: inputs_from_state={"filter": "query_filter"} was changed to {"state_a": "a"} to match the actual 'add' tool parameters
  • Enhanced test coverage with proper validation testing
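
The behaviour described in the first two bullets can be sketched roughly as follows. `SketchMCPTool` is a hypothetical, heavily simplified stand-in (it never connects to a server), and the assumption that an empty set means "skip validation for now" is taken from the description above rather than from the Haystack source.

```python
from typing import Any, Optional


class SketchMCPTool:
    """Hypothetical, simplified stand-in for MCPTool; not the real class."""

    def __init__(
        self,
        name: str,
        parameters: Optional[dict[str, Any]] = None,
        eager_connect: bool = False,
    ):
        self.name = name
        self.eager_connect = eager_connect
        # With a lazy connection the schema stays unknown until warm_up() fetches it.
        self.parameters = parameters if eager_connect else None

    def _get_valid_inputs(self) -> set[str]:
        # Assumption: an empty set signals "nothing to validate against yet".
        if not self.eager_connect or self.parameters is None:
            return set()
        return set(self.parameters.get("properties", {}))


# Hypothetical schema for an 'add' tool with parameters 'a' and 'b'.
schema = {"type": "object", "properties": {"a": {}, "b": {}}}

eager_inputs = SketchMCPTool("add", schema, eager_connect=True)._get_valid_inputs()
lazy_inputs = SketchMCPTool("add", schema)._get_valid_inputs()
```

In this sketch `eager_inputs` contains `a` and `b`, while `lazy_inputs` is empty, so a check at construction time has nothing to reject in lazy mode.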

How can it be used

  from haystack_integrations.tools.mcp import MCPTool, StdioServerInfo

  server_info = StdioServerInfo(command="uvx", args=["my-mcp-server"])

  # ❌ Typo caught at construction time (eager_connect=True):
  tool = MCPTool(
      name="add",
      server_info=server_info,
      eager_connect=True,
      inputs_from_state={"num1": "numbr"}  # Typo: 'numbr' is not a parameter of 'add' (valid: 'a', 'b')
  )
  # ValueError: inputs_from_state maps 'num1' to unknown parameter 'numbr'. 
  #            Valid parameters are: {'a', 'b'}.

  # ❌ Typo caught at warm_up time (eager_connect=False, default):
  tool = MCPTool(
      name="add",
      server_info=server_info,
      inputs_from_state={"num1": "numbr"}  # Typo detected during warm_up
  )
  tool.warm_up()  # or first invoke() call
  # ValueError: inputs_from_state maps 'num1' to unknown parameter 'numbr'.
  #            Valid parameters are: {'a', 'b'}.

  # ✅ Correct usage:
  tool = MCPTool(
      name="add",
      server_info=server_info,
      inputs_from_state={"num1": "a"}  # Valid parameter
  )

How did you test it

  • Fixed the existing test test_mcp_tool_serde_with_state_mapping, which was using invalid parameter names ({"filter": "query_filter"} → {"state_a": "a"})
  • Verified validation works with both eager and lazy connection modes
  • Existing test test_mcp_tool_state_mapping_parameters validates the eager_connect=False path with warm_up() call
  • Integration tests with real MCP server validate end-to-end state-mapping functionality

Notes for the reviewer

Design decisions:

  1. Empty set for lazy mode: When eager_connect=False, _get_valid_inputs() returns an empty set to skip validation during init. This is intentional because the MCP tool schema isn't available yet. Validation happens later in warm_up() when the schema is fetched from the server.
  2. Validation duplication: The validation logic in warm_up() duplicates the logic in Tool.__post_init__(), but this is necessary for early error detection in lazy-connection mode. Without it, errors would only surface during actual tool invocation, defeating the purpose of early validation.
  3. Schema-based validation: Unlike function-based tools that use Python introspection, MCPTool validates against JSON schema properties because MCP tools are defined by their protocol schema, not Python signatures.
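
The deferred check from decisions 1 and 2 might look roughly like the sketch below. `LazySketchTool`, `check_inputs_from_state`, and `fake_fetch` are hypothetical names introduced for illustration; the error text only loosely mirrors the message shown in the usage section, and the real warm_up() does considerably more than this.

```python
from typing import Any, Callable


def check_inputs_from_state(inputs_from_state: dict[str, str], valid_inputs: set[str]) -> None:
    """Raise ValueError if a state key maps to a parameter the schema does not define."""
    for state_key, param in inputs_from_state.items():
        if param not in valid_inputs:
            raise ValueError(
                f"inputs_from_state maps '{state_key}' to unknown parameter '{param}'. "
                f"Valid parameters are: {sorted(valid_inputs)}."
            )


class LazySketchTool:
    """Hypothetical lazy tool: the schema is only fetched in warm_up()."""

    def __init__(self, fetch_schema: Callable[[], dict[str, Any]], inputs_from_state: dict[str, str]):
        self._fetch_schema = fetch_schema
        self.inputs_from_state = inputs_from_state
        self.parameters = None  # unknown until warm_up() succeeds

    def warm_up(self) -> None:
        if self.parameters is not None:
            return  # already warmed up and validated
        schema = self._fetch_schema()
        # Validate before storing, so a failed warm_up() is not silently skipped later.
        check_inputs_from_state(self.inputs_from_state, set(schema.get("properties", {})))
        self.parameters = schema


def fake_fetch() -> dict[str, Any]:
    # Stand-in for the round trip to an MCP server.
    return {"type": "object", "properties": {"a": {}, "b": {}}}


tool = LazySketchTool(fake_fetch, {"num1": "numbr"})  # construction succeeds
try:
    tool.warm_up()
    message = ""
except ValueError as err:
    message = str(err)
```

Here the typo passes construction but is rejected by `warm_up()`, before any invocation, which is the trade-off the duplication buys.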

Testing note: The test fix shows that the previous test was exercising an invalid configuration; it would have failed if the upstream validation had been working for MCPTool. That the old configuration is now rejected indicates the fix works as intended.

@github-actions github-actions bot added the integration:mcp and type:documentation labels Jan 6, 2026
@vblagoje vblagoje marked this pull request as ready for review January 6, 2026 13:42
@vblagoje vblagoje requested a review from a team as a code owner January 6, 2026 13:42
@vblagoje vblagoje requested review from davidsbatista and removed request for a team January 6, 2026 13:42

vblagoje commented Jan 6, 2026

@davidsbatista let me double-check everything in RL apps for potential issues.

vblagoje commented Jan 7, 2026

Verified to work with itinerary agent as well.

@vblagoje vblagoje merged commit 454d9ee into main Jan 7, 2026
10 checks passed
@vblagoje vblagoje deleted the fix-mcp-validation branch January 7, 2026 08:41