@nathaliellenaa nathaliellenaa commented Oct 28, 2025

Description

Add tool parameter type validation.

  • Add new parameters max_question_length and max_input_length to control the maximum size of the question and input fields, respectively. The default is 10000. Users can change the value by passing the parameter when registering an agent or executing a tool.
// Register agent
POST /_plugins/_ml/agents/_register
{
  "name": "Test agent tool",
  "type": "flow",
  "description": "this is a test agent",
  "tools": [
    {
      "type": "AgentTool",
      "description": "A general agent to answer any question",
      "parameters": {
        "agent_id": "9X7xWI0Bpc3sThaJdY9i",
        "max_question_length": 10
      }
    }
  ]
}

// Execute agent
POST /_plugins/_ml/tools/_execute/AgentTool
{
  "parameters": {
    "agent_id": "g-CELJoBkoz7WY_qrUkJ",
    "question": "how many indices in my cluster?",
    "max_question_length": 20000
  }
}

// Sample error response
{
    "status": 400,
    "error": {
        "type": "IllegalArgumentException",
        "reason": "Invalid Request",
        "details": "question length cannot exceed 10 characters"
    }
}
  • Add validations for malformed input

Examples:

POST /_plugins/_ml/tools/_execute/ListIndexTool
{
  "parameters": {
    "question": ["invalid"]
  }
}

// Error response
{
    "status": 400,
    "error": {
        "type": "OpenSearchParseException",
        "reason": "Invalid Request",
        "details": "[question] property isn't a string, but of type [java.util.ArrayList]"
    }
}
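
For readers who want to see the shape of this check, here is a minimal sketch of rejecting a non-string value before a tool runs. It is an illustration only, not the code in this PR: the class name ToolParamTypeCheck and the method readOptionalString are hypothetical, and the sketch throws IllegalArgumentException for simplicity, whereas the response above reports OpenSearchParseException.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not the ml-commons implementation.
final class ToolParamTypeCheck {
    private ToolParamTypeCheck() {}

    // Returns the named parameter as a String, or null when it is absent.
    // Rejects any non-string value (list, map, number, ...) with an error
    // message modeled on the sample response above.
    static String readOptionalString(Map<String, Object> params, String name) {
        Object value = params.get(name);
        if (value == null) {
            return null;
        }
        if (!(value instanceof String)) {
            throw new IllegalArgumentException(
                "[" + name + "] property isn't a string, but of type [" + value.getClass().getName() + "]");
        }
        return (String) value;
    }

    public static void main(String[] args) {
        Map<String, Object> params = new HashMap<>();
        params.put("question", new ArrayList<>(List.of("invalid")));
        try {
            readOptionalString(params, "question");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Returning null for a missing key keeps the parameter optional, so only values of the wrong type are rejected.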

More examples in this doc:
Tool Validation Test.pdf

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@nathaliellenaa nathaliellenaa marked this pull request as ready for review October 31, 2025 01:59
@nathaliellenaa
Contributor Author

Working on adding tests

Contributor

@rithin-pullela-aws rithin-pullela-aws left a comment

Thanks for this PR, Nathalie. These validations have been pending for a long time. I added two small comments; other than that, LGTM!

Comment on lines +187 to +190
String question = parameters.get("question");
if (question != null && question.length() > maxQuestionLength) {
    throw new IllegalArgumentException("question length cannot exceed " + maxQuestionLength + " characters");
}
Contributor

Why would the index mapping tool need a "question" attribute?
It looks like we are not using it anywhere.

Contributor Author

@nathaliellenaa nathaliellenaa Oct 31, 2025

We use it during tool execution. The doc also lists question as a required parameter.

POST /_plugins/_ml/agents/9X7xWI0Bpc3sThaJdY9i/_execute
{
  "parameters": {
    "index": [ "sample-ecommerce" ],
    "question": "What fields are in the sample-ecommerce index?"
  }
}

I tested this, and the tool runs without the question parameter, so I'll update the doc to mark it as optional.
We can keep the question length validation here, since users can still pass the question parameter, but I'll change this read to be optional:

ConfigurationUtils.readStringProperty(TYPE, null, params, "question");

Collaborator

Although question appears in the parameter map passed to tools, it is an agent-level parameter. If an agent runs 10 tools, we only need to validate the question once when the agent runs, rather than 10 times, once per tool.
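
As a rough sketch of that idea, the check could live in one place that the agent calls once before dispatching to any tool. Everything here is an assumption for illustration: the class AgentQuestionValidator and its methods are hypothetical, and only the max_question_length field name, the 10000 default, and the error message come from this PR.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical agent-level validator; not the ml-commons implementation.
final class AgentQuestionValidator {
    static final int DEFAULT_MAX_QUESTION_LENGTH = 10000;
    static final String MAX_QUESTION_LENGTH_FIELD = "max_question_length";

    private AgentQuestionValidator() {}

    // Uses the caller-supplied limit when present, otherwise the default.
    static int resolveMaxQuestionLength(Map<String, String> params) {
        String raw = params.get(MAX_QUESTION_LENGTH_FIELD);
        if (raw == null) {
            return DEFAULT_MAX_QUESTION_LENGTH;
        }
        int limit;
        try {
            limit = Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(MAX_QUESTION_LENGTH_FIELD + " must be an integer", e);
        }
        if (limit <= 0) {
            throw new IllegalArgumentException(MAX_QUESTION_LENGTH_FIELD + " must be positive");
        }
        return limit;
    }

    // Intended to be called once per agent execution, before any tool runs.
    static void validateQuestion(Map<String, String> params) {
        String question = params.get("question");
        int limit = resolveMaxQuestionLength(params);
        if (question != null && question.length() > limit) {
            throw new IllegalArgumentException("question length cannot exceed " + limit + " characters");
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("question", "how many indices in my cluster?");
        params.put(MAX_QUESTION_LENGTH_FIELD, "10");
        try {
            validateQuestion(params); // one check for the whole agent run
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With this shape, the individual tools would not need their own question length fields.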

Comment on lines +90 to +92
public static final int DEFAULT_MAX_QUESTION_LENGTH = 10000;
public static final String MAX_QUESTION_LENGTH_FIELD = "max_question_length";
private int maxQuestionLength = DEFAULT_MAX_QUESTION_LENGTH;
Contributor

Same as above, we don't need the question parameter. The tool just ignores the question param

Contributor Author

Same thing here: we use it during tool execution, and the doc marks it as a required field. Looking at the code, though, the tool runs without the question parameter, so I'll update the doc to make this parameter optional.

We can keep the question length validation here, since users can still pass the question parameter, but I'll change this read to be optional:

ConfigurationUtils.readStringProperty(TYPE, null, params, "question");

@nathaliellenaa
Contributor Author

I found some outdated tool information during testing and will open a doc PR to update it.

Signed-off-by: Nathalie Jonathan <[email protected]>
@ylwu-amzn ylwu-amzn force-pushed the execute_tool_security branch from 78c29fd to f6c3fcc Compare October 31, 2025 04:38
return tool;
}

private static Object parseValue(String value) {
Collaborator

I don't see the benefit of parsing these values. If a tool requires a different type, such as a map or a double/float, the values are still passed to the tool in String format.

@zane-neo
Collaborator

zane-neo commented Nov 4, 2025

High level comment:

  1. Can we add an interface method to org.opensearch.ml.common.spi.tools.Tool, such as getCreateToolParamsDefinition, so that every tool returns either a JSON string like attributes or a Map<String, Class> indicating the type each parameter should have, and then validate the parameters when creating the agent?
  2. question is an agent-level parameter and should be validated in the agent instead of in the tools.
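
One possible shape for the first suggestion, sketched under stated assumptions: ParamDefinitionProvider stands in for the real Tool interface (only the proposed method is modeled), and SampleAgentTool and CreateToolParamsValidator are hypothetical names. The parameter names agent_id and max_question_length are taken from the PR description.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for org.opensearch.ml.common.spi.tools.Tool;
// only the proposed method is modeled here.
interface ParamDefinitionProvider {
    Map<String, Class<?>> getCreateToolParamsDefinition();
}

// Hypothetical tool declaring the types of its creation parameters.
final class SampleAgentTool implements ParamDefinitionProvider {
    @Override
    public Map<String, Class<?>> getCreateToolParamsDefinition() {
        Map<String, Class<?>> definition = new HashMap<>();
        definition.put("agent_id", String.class);
        definition.put("max_question_length", Integer.class);
        return definition;
    }
}

// Hypothetical validator that would run when the agent is created.
final class CreateToolParamsValidator {
    private CreateToolParamsValidator() {}

    static void validate(ParamDefinitionProvider tool, Map<String, Object> params) {
        Map<String, Class<?>> definition = tool.getCreateToolParamsDefinition();
        for (Map.Entry<String, Object> entry : params.entrySet()) {
            Class<?> expected = definition.get(entry.getKey());
            Object value = entry.getValue();
            if (expected == null || value == null) {
                continue; // undeclared or null parameters are left to the tool itself
            }
            if (!expected.isInstance(value)) {
                throw new IllegalArgumentException(
                    "[" + entry.getKey() + "] must be of type [" + expected.getName()
                        + "], but was [" + value.getClass().getName() + "]");
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> params = new HashMap<>();
        params.put("agent_id", "9X7xWI0Bpc3sThaJdY9i");
        params.put("max_question_length", "ten"); // wrong type on purpose
        try {
            validate(new SampleAgentTool(), params);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Declaring the types per tool would let the register API reject a malformed tool configuration up front, instead of failing later at execution time.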
