Check array size at parse time #129997

Draft
wants to merge 8 commits into
base: main

Conversation

drempapis
Contributor

For an index whose mapping defines an array field as nested:

PUT /test-large-array
{
  "mappings": {
    "properties": {
      "array": {
        "type": "nested",
        "properties": {
          "value": { "type": "integer" }
        }
      }
    }
  }
}

Indexing is protected by the index.mapping.nested_objects.limit setting, which caps the number of nested objects allowed per document. However, when an index is created through dynamic mapping without predefined limits, no comparable control is in place. This makes it possible to ingest a poison document: a document containing an array with a very large number of objects, which can severely degrade Elasticsearch cluster performance or even crash nodes through excessive memory consumption and processing overhead.
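To make the failure mode concrete, the sketch below builds a create-index body that sets the limit explicitly, plus a document whose array exceeds it. Only the setting name index.mapping.nested_objects.limit and the mapping come from this description; the helper names `index_body` and `poison_document` and the limit value of 100 are illustrative assumptions.

```python
import json

# Illustrative value only; this PR does not state a specific limit.
NESTED_OBJECTS_LIMIT = 100


def index_body(limit: int) -> dict:
    """Create-index body: the nested mapping shown above plus an explicit limit."""
    return {
        "settings": {"index.mapping.nested_objects.limit": limit},
        "mappings": {
            "properties": {
                "array": {
                    "type": "nested",
                    "properties": {"value": {"type": "integer"}},
                }
            }
        },
    }


def poison_document(n_objects: int) -> dict:
    """A document whose "array" field holds n_objects small objects."""
    return {"array": [{"value": i} for i in range(n_objects)]}


body = index_body(NESTED_OBJECTS_LIMIT)
doc = poison_document(NESTED_OBJECTS_LIMIT + 1)  # one object over the limit
print(json.dumps(body["settings"]))
print(len(doc["array"]))
```

With the nested mapping and the setting in place, a document like `doc` is the kind the limit is meant to reject; the concern raised here is the case where no such mapping or limit was defined up front.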

This preliminary draft addresses that scenario by adding safeguards, enforcing the index.mapping.nested_objects.limit setting, and performing additional validation checks during document parsing. The approach aims to mitigate the risk of ingesting poison documents by combining structural controls with runtime validation that limits the number of objects.
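The general idea of rejecting an oversized document while it is still being parsed can be sketched as follows. This is not the code in this PR, which targets Elasticsearch's Java document parser; it is a minimal Python illustration that uses the standard json module's `object_hook` to count decoded objects and abort once a limit is exceeded. The names `parse_with_object_limit` and `TooManyObjectsError` are hypothetical, and counting every JSON object (the root included) is a simplifying assumption that is coarser than a per-field nested object count.

```python
import json


class TooManyObjectsError(ValueError):
    """Raised when a document contains more JSON objects than allowed."""


def parse_with_object_limit(raw: str, max_objects: int):
    """Parse raw JSON, aborting once more than max_objects objects are decoded.

    Assumption: every JSON object counts toward the limit, the root included.
    """
    seen = 0

    def count_object(obj: dict) -> dict:
        # json calls this hook each time it finishes decoding an object,
        # so the check runs during parsing rather than after it.
        nonlocal seen
        seen += 1
        if seen > max_objects:
            raise TooManyObjectsError(
                f"document has more than {max_objects} objects"
            )
        return obj

    return json.loads(raw, object_hook=count_object)


small = json.dumps({"array": [{"value": i} for i in range(3)]})
large = json.dumps({"array": [{"value": i} for i in range(1000)]})

print(parse_with_object_limit(small, max_objects=10)["array"][0])

try:
    parse_with_object_limit(large, max_objects=10)
except TooManyObjectsError as exc:
    print(f"rejected: {exc}")
```

In this sketch the large document is rejected as soon as the eleventh object is decoded, so the remaining objects are never materialized. That early exit is the property a parse-time check is after.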

2 participants