[BUG] Parsed field data missing when source field not included #3265

dai-chen opened this issue Jan 27, 2025 · 3 comments
Labels
bug (Something isn't working), documentation (Improvements or additions to documentation), PPL (Piped processing language)

Comments

@dai-chen
Collaborator

What is the bug?

When using the PPL parse command, if the source field that was parsed is not included in the subsequent fields command, the parsed field data is missing from the final result.

This behavior appears to be by design in the code: ProjectOperator.java#L76.
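
The behavior described above can be illustrated with a simplified model. This is a sketch only, not the actual ProjectOperator code: `project`, `MISSING`, and the flat row layout are hypothetical names, and it assumes the row reaching the projection carries only the fields named in the fields command.

```python
import re

MISSING = object()  # hypothetical marker for a field absent from the row


def project(row, project_list, parse_exprs):
    """Simplified model of a projection that drops parsed fields
    when their source field is absent from the input row."""
    out = {}
    for name in project_list:
        if name not in parse_exprs:
            # Ordinary field: copy it straight from the row.
            out[name] = row.get(name)
            continue
        source_field, pattern = parse_exprs[name]
        source_value = row.get(source_field, MISSING)
        if source_value is MISSING:
            # Source field not in the row: nothing is emitted for this column.
            continue
        match = re.search(pattern, source_value)
        out[name] = match.group(name) if match else ""
    return out


# Python spells named groups (?P<name>...); the PPL query uses (?<name>...).
PATTERN = r"Request from (?P<service>.+) for client ID (?P<clientId>.+) is OVER_LIMIT"
parse_exprs = {
    "service": ("line.@message", PATTERN),
    "clientId": ("line.@message", PATTERN),
}
fields = ["@timestamp", "service", "clientId"]

# Assumption: only the projected fields were fetched, so line.@message is absent.
row_without_source = {"@timestamp": "2025-01-04 04:00:00"}
row_with_source = {
    "@timestamp": "2025-01-04 04:00:00",
    "line.@message": "Request from AWS for client ID 123 is OVER_LIMIT",
}

result_without = project(row_without_source, fields, parse_exprs)
result_with = project(row_with_source, fields, parse_exprs)
print(result_without)
print(result_with)
```

In this model, the row without the source field yields only `@timestamp`, while the row that still carries `line.@message` yields all three columns.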

How can one reproduce the bug?

POST parse_command_test/_doc
{
  "@timestamp": "2025-01-04T04:00:00Z",
  "line": {
    "@message": "Request from AWS for client ID 123 is OVER_LIMIT"
  }
}

POST _plugins/_ppl
{
  "query": """
    search source=parse_command_test
    | parse line.@message 'Request from (?<service>.+) for client ID (?<clientId>.+) is OVER_LIMIT'
    | fields @timestamp, service, clientId
  """
}

Response (service and clientId appear in the schema but have no values in datarows):
{
  "schema": [
    {
      "name": "@timestamp",
      "type": "timestamp"
    },
    {
      "name": "service",
      "type": "string"
    },
    {
      "name": "clientId",
      "type": "string"
    }
  ],
  "datarows": [
    [
      "2025-01-04 04:00:00"
    ]
  ],
  "total": 1,
  "size": 1
}
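
The mismatch in the response above can be checked mechanically. The following is a minimal sketch that compares each row against the schema; `columns_without_values` is a hypothetical helper, and it assumes that any missing values are the trailing columns, as in this response.

```python
import json

# The response from the reproduction steps, condensed.
response = json.loads("""
{
  "schema": [
    {"name": "@timestamp", "type": "timestamp"},
    {"name": "service", "type": "string"},
    {"name": "clientId", "type": "string"}
  ],
  "datarows": [["2025-01-04 04:00:00"]],
  "total": 1,
  "size": 1
}
""")


def columns_without_values(resp):
    """Return, per row, the schema column names that have no value in that row."""
    names = [col["name"] for col in resp["schema"]]
    return [names[len(row):] for row in resp["datarows"]]


missing = columns_without_values(response)
print(missing)  # [['service', 'clientId']]
```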

What is the expected behavior?

  • If intentional: Provide clear documentation to explain why parsed field data depends on the inclusion of the original field in the fields command.
  • Otherwise: Modify the behavior to ensure parsed fields remain available even if the original field is excluded.
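
Until one of these is done, the report implies a workaround: keep the parse source field in the fields command. The sketch below builds such a request body; `build_ppl_request` is a hypothetical helper, and whether this restores the parsed values should be verified against the plugin version in use.

```python
import json


def build_ppl_request(index, source_field, pattern, fields):
    """Build a _plugins/_ppl request body that keeps the parse source field
    in the fields command."""
    projected = list(fields)
    if source_field not in projected:
        # Append the source field so the parsed fields have data to read from.
        projected.append(source_field)
    query = (
        f"search source={index} "
        f"| parse {source_field} '{pattern}' "
        f"| fields {', '.join(projected)}"
    )
    return {"query": query}


body = build_ppl_request(
    index="parse_command_test",
    source_field="line.@message",
    pattern="Request from (?<service>.+) for client ID (?<clientId>.+) is OVER_LIMIT",
    fields=["@timestamp", "service", "clientId"],
)
print(json.dumps(body, indent=2))
```

The resulting body can be sent with POST _plugins/_ppl in the same way as the query in the reproduction steps.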

Do you have any screenshots?

N/A

Do you have any additional context?

N/A

dai-chen added the bug and untriaged labels on Jan 27, 2025
@RyanL1997
Contributor

Adding an action item to document this behavior for now.

RyanL1997 added the documentation and PPL labels and removed the untriaged label on Jan 28, 2025
@penghuo
Collaborator

penghuo commented Jan 28, 2025

If a parsed field has no data, the result should contain null for it:

  "datarows": [
    [
      "2025-01-04 04:00:00",
      null,
      null
    ]
  ]
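
The suggested output shape can be sketched as a post-processing step on the response. This is illustrative only: `pad_rows_with_null` is a hypothetical helper, it assumes the missing values are always the trailing columns, and an actual fix would belong in the query engine rather than in client code.

```python
def pad_rows_with_null(schema, datarows):
    """Pad each row with None (JSON null) so its width matches the schema."""
    width = len(schema)
    # A negative multiplier yields an empty list, so full rows are left unchanged.
    return [list(row) + [None] * (width - len(row)) for row in datarows]


schema = [
    {"name": "@timestamp", "type": "timestamp"},
    {"name": "service", "type": "string"},
    {"name": "clientId", "type": "string"},
]
datarows = [["2025-01-04 04:00:00"]]

padded = pad_rows_with_null(schema, datarows)
print(padded)  # [['2025-01-04 04:00:00', None, None]]
```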

@dai-chen
Collaborator Author

Tasks:
