Add feedback on Parse JSON processor #9917
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -8,8 +8,7 @@ nav_order: 80 | |
|
||
# parse_json | ||
|
||
The `parse_json` processor parses JSON data for an event, including any nested fields. The processor extracts the JSON pointer data and adds the input event to the extracted fields. | ||
|
||
The `parse_json` processor parses JSON-formatted strings within an event, including nested fields. It can optionally use a JSON pointer to extract a specific part of the source JSON and add the extracted data to the event. | ||
|
||
## Configuration | ||
|
||
|
@@ -24,65 +23,68 @@ This table is autogenerated. Do not edit it. | |
|
||
| Option | Required | Type | Description | | ||
| :--- | :--- | :--- | :--- | | ||
| `source` | No | String | The field in the `event` that will be parsed. Default value is `message`. | | ||
| `destination` | No | String | The destination field of the parsed JSON. Defaults to the root of the `event`. Cannot be `""`, `/`, or any white-space-only `string` because these are not valid `event` fields. | | ||
| `pointer` | No | String | A JSON pointer to the field to be parsed. There is no `pointer` by default, meaning the entire `source` is parsed. The `pointer` can access JSON array indexes as well. If the JSON pointer is invalid then the entire `source` data is parsed into the outgoing `event`. If the key that is pointed to already exists in the `event` and the `destination` is the root, then the pointer uses the entire path of the key. | | ||
| `parse_when` | No | String | Specifies under which conditions the processor should perform parsing. Default is no condition. Accepts an OpenSearch Data Prepper expression string following the [expression syntax]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/expression-syntax/). | | ||
| `overwrite_if_destination_exists` | No | Boolean | Overwrites the destination if set to `true`. Set to `false` to prevent changing a destination value that exists. Defaults to `true`. | | ||
| `delete_source` | No | Boolean | If set to `true` then this will delete the source field. Defaults to `false`. | | ||
| `tags_on_failure` | No | String | A list of strings specifying the tags to be set in the event that the processor fails or an unknown exception occurs during parsing. | ||
| `source` | No | String | The field in the event that will be parsed. Default is `message`. | | ||
| `destination` | No | String | The destination field for the parsed JSON. Defaults to the root of the event. Cannot be `""`, `/`, or any white-space-only string. | | ||
| `pointer` | No | String | A JSON pointer (as defined by [RFC 6901](https://datatracker.ietf.org/doc/html/rfc6901)) to a specific field in the source JSON. If omitted, the entire `source` is parsed. If the pointer is invalid, the full `source` is parsed instead. When writing to the root destination, existing keys will be preserved unless overwritten. | | ||
| `parse_when` | No | String | A condition expression that determines when to parse the field. Accepts a string following the [expression syntax]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/expression-syntax/). | | ||
| `overwrite_if_destination_exists` | No | Boolean | Whether to overwrite the destination field if it already exists. Default is `true`. | | ||
| `delete_source` | No | Boolean | Whether to delete the source field after parsing. Default is `false`. | | ||
| `tags_on_failure` | No | String | A list of tags to apply if parsing fails or an unexpected exception occurs. | | ||
|
||
## Usage | ||
|
||
To get started, create the following `pipeline.yaml` file: | ||
To use the `parse_json` processor, add it to your `pipeline.yaml` configuration file: | ||
|
||
```yaml | ||
parse-json-pipeline: | ||
source: | ||
... | ||
.... | ||
... | ||
processor: | ||
- parse_json: | ||
``` | ||
|
||
### Basic example | ||
|
||
To test the `parse_json` processor with the previous configuration, run the pipeline and paste the following line into your console, then enter `exit` on a new line: | ||
This example parses a JSON message field and flattens the data into the event. | ||
|
||
``` | ||
For example, the following JSON message contains a key-value pair: | ||
|
||
```json | ||
{"outer_key": {"inner_key": "inner_value"}} | ||
``` | ||
{% include copy.html %} | ||
|
||
The `parse_json` processor parses the message into the following format: | ||
In this example event, the original `message` field remains, and the parsed content is added at the root level. Use the `delete_source` option if you want to remove the original field: | ||
Should this be?
The example below includes the original. So putting this sentence about I recommend either of these changes:
|
||
|
||
``` | ||
{"message": {"outer_key": {"inner_key": "inner_value"}}", "outer_key":{"inner_key":"inner_value"}}} | ||
```json | ||
{ | ||
"message": "{\"outer_key\": {\"inner_key\": \"inner_value\"}}", | ||
"outer_key": { | ||
"inner_key": "inner_value" | ||
} | ||
} | ||
``` | ||
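The basic behavior described above can be approximated in a few lines of Python. This is only an illustrative sketch, not Data Prepper's implementation: the helper name `parse_json_event` and its parameters are hypothetical, and it assumes the source field holds a JSON object and that the destination is the root of the event.

```python
import json


def parse_json_event(event, source="message", delete_source=False):
    """Hypothetical helper: parse the JSON string in event[source] and
    merge the resulting keys into the root of a copy of the event."""
    parsed = json.loads(event[source])  # assumes the source holds a JSON object
    result = dict(event)                # leave the input event untouched
    if delete_source:
        result.pop(source, None)        # mimics delete_source: true
    result.update(parsed)               # existing keys are overwritten, as with the default overwrite setting
    return result


event = {"message": '{"outer_key": {"inner_key": "inner_value"}}'}
parsed_event = parse_json_event(event)
without_source = parse_json_event(event, delete_source=True)
```

Here `parsed_event` keeps the original `message` string next to the new `outer_key` object, while `without_source` contains only `outer_key`.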
|
||
### Example with a JSON pointer | ||
### Example using a JSON pointer | ||
|
||
You can use a JSON pointer to parse a selection of the JSON data by specifying the `pointer` option in the configuration. To get started, create the following `pipeline.yaml` file: | ||
You can use the `pointer` option to extract a specific nested field from the JSON data. | ||
|
||
```yaml | ||
parse-json-pipeline: | ||
source: | ||
... | ||
.... | ||
... | ||
processor: | ||
- parse_json: | ||
pointer: "outer_key/inner_key" | ||
pointer: "/outer_key/inner_key" | ||
``` | ||
|
||
To test the `parse_json` processor with the pointer option, run the pipeline, paste the following line into your console, and then enter `exit` on a new line: | ||
Using the same JSON message as the previous example, only the value at the pointer path `/outer_key/inner_key` is extracted and added to the event. If you set `destination`, the extracted value will be added under that field instead: | ||
|
||
``` | ||
{"outer_key": {"inner_key": "inner_value"}} | ||
``` | ||
{% include copy.html %} | ||
|
||
The processor parses the message into the following format: | ||
|
||
{ | ||
"message": "{\"outer_key\": {\"inner_key\": \"inner_value\"}}", | ||
"inner_key": "inner_value" | ||
} | ||
``` | ||
{"message": {"outer_key": {"inner_key": "inner_value"}}", "inner_key": "inner_value"} | ||
``` |
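To make the pointer semantics concrete, the following Python sketch resolves an RFC 6901 JSON pointer against the parsed message and stores the value under the last pointer segment, matching the example output above. It is a sketch under stated assumptions rather than the processor's actual code: `resolve_pointer` is a hypothetical helper, and the key-naming rule is inferred from the example. RFC 6901 requires a non-empty pointer to begin with `/`, which is consistent with this change from `outer_key/inner_key` to `/outer_key/inner_key`.

```python
import json


def resolve_pointer(document, pointer):
    """Hypothetical helper: resolve an RFC 6901 JSON pointer against parsed JSON."""
    if pointer == "":
        return document  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON pointer must start with '/'")
    current = document
    for token in pointer[1:].split("/"):
        # RFC 6901 escapes: "~1" decodes to "/", then "~0" decodes to "~"
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]  # array elements are addressed by index
        else:
            current = current[token]
    return current


message = '{"outer_key": {"inner_key": "inner_value"}}'
event = {"message": message}
pointer = "/outer_key/inner_key"

# Assumption: the extracted value is stored under the last pointer segment.
event[pointer.rsplit("/", 1)[-1]] = resolve_pointer(json.loads(message), pointer)
```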
Maybe it would be clearer to say: