[BUG] ERROR org.opensearch.dataprepper.plugins.source.loghttp.LogHTTPService - Failed to parse the request of size XXX due to: Unrecognized token #3865

Open
canob opened this issue Dec 27, 2023 · 5 comments
Labels
bug Something isn't working

Comments

@canob

canob commented Dec 27, 2023

Describe the bug
After running for a while, Data Prepper starts writing these lines to the log:

2023-12-21T17:50:19,079 [pool-9-thread-71] ERROR org.opensearch.dataprepper.plugins.source.loghttp.LogHTTPService - Failed to parse the request of size 56244 due to: Unrecognized token 'inf': was expecting (JSON String, Number, Array, Object or token 'null', 'true' or 'false')
 at [Source: (com.linecorp.armeria.internal.shaded.fastutil.io.FastByteArrayInputStream); line: 1, column: 25221] (through reference chain: java.util.ArrayList[215])
2023-12-21T17:50:28,018 [pool-9-thread-141] ERROR org.opensearch.dataprepper.plugins.source.loghttp.LogHTTPService - Failed to parse the request of size 56244 due to: Unrecognized token 'inf': was expecting (JSON String, Number, Array, Object or token 'null', 'true' or 'false')
 at [Source: (com.linecorp.armeria.internal.shaded.fastutil.io.FastByteArrayInputStream); line: 1, column: 25221] (through reference chain: java.util.ArrayList[215])
2023-12-26T12:10:19,062 [pool-9-thread-125] ERROR org.opensearch.dataprepper.plugins.source.loghttp.LogHTTPService - Failed to parse the request of size 38988 due to: Unrecognized token 'inf': was expecting (JSON String, Number, Array, Object or token 'null', 'true' or 'false')
 at [Source: (com.linecorp.armeria.internal.shaded.fastutil.io.FastByteArrayInputStream); line: 1, column: 27896] (through reference chain: java.util.ArrayList[240])
2023-12-26T12:10:29,030 [pool-9-thread-63] ERROR org.opensearch.dataprepper.plugins.source.loghttp.LogHTTPService - Failed to parse the request of size 38988 due to: Unrecognized token 'inf': was expecting (JSON String, Number, Array, Object or token 'null', 'true' or 'false')
 at [Source: (com.linecorp.armeria.internal.shaded.fastutil.io.FastByteArrayInputStream); line: 1, column: 27896] (through reference chain: java.util.ArrayList[240])

To Reproduce
yaml config file:

dns-ip-pipeline:
#example event: {"count_source_ip":102,"source_ip":"10.199.0.150","tag":"dns_metrics_query_by_ip_5m"}
  source:
    http:
      ssl: false
      port: 2023
  buffer:
    kafka:
      bootstrap_servers:
        - redpanda-0:9092
      topics:
        - name: dns-ip-pipeline-buffer
          group_id: data-prepper-group
      encryption:
        type: none
  processor:
    - anomaly_detector:
        keys: ["count_source_ip"]
        mode:
            random_cut_forest:
  sink:
    - file:
        path: /logs/dns_metrics_ip_anomalies.json

Expected behavior
After these errors appear in the log, the output file is no longer written.

Environment (please complete the following information):

  • OS: Ubuntu 20.04 LTS, Docker Container
  • Version: 2.6.0

Can anyone explain to me what that error is saying, and how can I correct it?

@dlvenable
Member

dlvenable commented Jan 2, 2024

@canob, do you have any full examples of the complete request? Judging by the request sizes, only part of the message is invalid: there appears to be an invalid token in the request. See this message: Unrecognized token 'inf': was expecting (JSON String, Number, Array, Object or token 'null', 'true' or 'false'). It occurs either 25221 or 27896 characters into the request body.

Also what is sending the data to Data Prepper? Is it FluentBit?
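
If a failing request body can be captured, the column number from the error gives a rough pointer to the offending token. A minimal Python sketch, assuming the body has been saved as a single line of text and that the reported column is a 1-based character offset (the helper name `context_at` and the sample body are hypothetical):

```python
def context_at(body: str, column: int, width: int = 40) -> str:
    """Return the text surrounding a 1-based column in a single-line body."""
    index = max(column - 1, 0)      # convert to a 0-based index
    start = max(index - width, 0)   # clamp at the start of the body
    return body[start:index + width]


# Hypothetical body; a real one would come from a captured request.
sample = '[{"a":1},{"a":inf}]'
print(context_at(sample, column=15, width=5))
```

For a real request, `column` would be the value from the log line (for example 25221), and the printed snippet should show which record and field carry the bad token.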

@dlvenable
Member

Also, from the error log, it appears that the JSON parser has already found a key. It is expecting a JSON value.

Is it possible you might have the value inf without quotes for count_source_ip? Perhaps it means an undefined number.

e.g.

"count_source_ip":inf
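
A bare `inf` like this is not valid JSON, so any strict parser is expected to reject it. The sketch below reproduces the failure with Python's standard `json` module. It is a different parser from the one Data Prepper uses, so it only illustrates the class of error; the sample body is made up.

```python
import json


def first_parse_error(body: str):
    """Return (position, message) of the first JSON error, or None if valid."""
    try:
        json.loads(body)
    except json.JSONDecodeError as err:
        return err.pos, err.msg
    return None


# Made-up body with one good record and one record carrying a bare inf.
body = '[{"count_source_ip":22},{"count_source_ip":inf}]'
error = first_parse_error(body)
if error is not None:
    pos, msg = error
    print(f"{msg} at offset {pos}: {body[pos:pos + 10]!r}")
```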

@dlvenable
Member

We may want to consider adding configuration options to Data Prepper that make JSON processing more lenient. For example, Jackson can optionally accept non-numeric numbers (such as NaN and infinity tokens), which may be the problem in this specific case.
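
To illustrate what "more lenient" could mean, here is a rough Python sketch that replaces bare non-finite tokens with `null` before parsing. This is not an existing Data Prepper option and not how Jackson implements its feature; `lenient_loads` is a hypothetical helper, and mapping the tokens to `null` is just one possible policy.

```python
import json
import re

# Matches either a complete JSON string literal or a bare non-finite token.
# String literals are matched first so their contents are never rewritten.
_TOKEN = re.compile(
    r'"(?:\\.|[^"\\])*"|-?\b(?:inf|infinity|nan)\b',
    re.IGNORECASE,
)


def lenient_loads(body: str):
    """Parse JSON, treating bare inf/infinity/nan tokens as null."""
    def replace(match):
        text = match.group(0)
        # Keep string literals untouched; rewrite only bare tokens.
        return text if text.startswith('"') else "null"

    return json.loads(_TOKEN.sub(replace, body))


print(lenient_loads('[{"count_source_ip":inf,"tag":"inf"}]'))
```

Dropping or quarantining the affected records instead of nulling the field would be an equally reasonable choice, depending on what the downstream processor expects.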

@canob
Author

canob commented Jan 3, 2024

Hi @dlvenable. Thanks for your replies.
My answers:

  • Yes, FluentBit is what sends the metrics logs to Data Prepper. These are the output configs, one to Data Prepper and one to a file (both receive the same records):
[OUTPUT]
    Name http
    Alias dataprepper_dns_metrics_query_by_ip
    Match dns_metrics_query_by_ip_5m
    Host data-prepper
    Port 2023
    URI /log/ingest
    Format json

[OUTPUT]
    Name file
    Path /tmp/fluentbit_output
    Match dns_metrics_query_by_ip_5m
  • Full examples of the request content (the same records that FluentBit sends to the Data Prepper HTTP source, captured with the FluentBit file output):
dns_metrics_query_by_ip_5m: [1704251020.017663002, {"source_ip":"10.199.0.48","count_source_ip":22,"tag":"dns_metrics_query_by_ip_5m"}]
dns_metrics_query_by_ip_5m: [1704251020.017664432, {"source_ip":"10.199.0.38","count_source_ip":67,"tag":"dns_metrics_query_by_ip_5m"}]
dns_metrics_query_by_ip_5m: [1704251020.017665624, {"source_ip":"127.0.0.1","count_source_ip":29,"tag":"dns_metrics_query_by_ip_5m"}]
dns_metrics_query_by_ip_5m: [1704251020.017666101, {"source_ip":"10.199.0.182","count_source_ip":69,"tag":"dns_metrics_query_by_ip_5m"}]
dns_metrics_query_by_ip_5m: [1704251020.017666816, {"source_ip":"10.199.0.46","count_source_ip":61,"tag":"dns_metrics_query_by_ip_5m"}]
dns_metrics_query_by_ip_5m: [1704251020.017667293, {"source_ip":"10.11.6.15","count_source_ip":42,"tag":"dns_metrics_query_by_ip_5m"}]
dns_metrics_query_by_ip_5m: [1704251020.017668008, {"source_ip":"10.11.0.50","count_source_ip":7,"tag":"dns_metrics_query_by_ip_5m"}]
dns_metrics_query_by_ip_5m: [1704251020.017668962, {"source_ip":"10.11.0.247","count_source_ip":8,"tag":"dns_metrics_query_by_ip_5m"}]
dns_metrics_query_by_ip_5m: [1704251020.017669916, {"source_ip":"10.199.0.84","count_source_ip":2,"tag":"dns_metrics_query_by_ip_5m"}]
  • The value of "count_source_ip" has no quotes because it is numeric. The field is always present, is always a positive integer, and is never null.
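
One way to check that claim against the file output is to scan it for lines whose payload is not strict JSON. A minimal Python sketch, assuming each line has the `tag: [timestamp, {record}]` shape shown above (the function name `find_bad_lines` is hypothetical, and the file output may not render values exactly as the HTTP output does):

```python
import json


def _reject_constant(name):
    # Python accepts NaN/Infinity by default; reject them to stay strict.
    raise ValueError(f"non-finite constant: {name}")


def find_bad_lines(lines):
    """Return (line_number, line) pairs whose payload is not strict JSON."""
    bad = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        # Lines look like: tag: [timestamp, {record}]
        _, _, payload = line.partition(": ")
        try:
            json.loads(payload, parse_constant=_reject_constant)
        except ValueError:
            bad.append((number, line))
    return bad
```

Passing an open file handle for the FluentBit output file, for example `find_bad_lines(open(path))`, returns the suspicious lines together with their line numbers.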

Maybe I can try changing the FluentBit HTTP output format to json_stream or json_lines (see https://docs.fluentbit.io/manual/pipeline/outputs/http). I don't remember whether I already tried that and it didn't work.

Let me know if I can provide additional information.

Thanks.

@canob
Author

canob commented Mar 4, 2024

Also, from the error log, it appears that the JSON parser has already found a key. It is expecting a JSON value.

Is it possible you might have the value inf without quotes for count_source_ip? Perhaps it means an undefined number.

e.g.

"count_source_ip":inf

Hi @dlvenable

I am still having issues with this and haven't found a way to solve it. I still get this kind of error log from time to time:

2024-03-01T20:35:20,003 [pool-8-thread-139] ERROR org.opensearch.dataprepper.plugins.source.loghttp.LogHTTPService - Failed to parse the request of size 46797 due to: Unrecognized token 'inf': was expecting (JSON String, Number, Array, Object or token 'null', 'true' or 'false')
 at [Source: (com.linecorp.armeria.internal.shaded.fastutil.io.FastByteArrayInputStream); line: 1, column: 24728] (through reference chain: java.util.ArrayList[211])
2024-03-01T20:35:25,993 [pool-8-thread-18] ERROR org.opensearch.dataprepper.plugins.source.loghttp.LogHTTPService - Failed to parse the request of size 46797 due to: Unrecognized token 'inf': was expecting (JSON String, Number, Array, Object or token 'null', 'true' or 'false')
 at [Source: (com.linecorp.armeria.internal.shaded.fastutil.io.FastByteArrayInputStream); line: 1, column: 24728] (through reference chain: java.util.ArrayList[211])
2024-03-02T05:15:20,034 [pool-8-thread-96] ERROR org.opensearch.dataprepper.plugins.source.loghttp.LogHTTPService - Failed to parse the request of size 28374 due to: Unrecognized token 'inf': was expecting (JSON String, Number, Array, Object or token 'null', 'true' or 'false')
 at [Source: (com.linecorp.armeria.internal.shaded.fastutil.io.FastByteArrayInputStream); line: 1, column: 10741] (through reference chain: java.util.ArrayList[93])
2024-03-02T05:15:28,993 [pool-8-thread-192] ERROR org.opensearch.dataprepper.plugins.source.loghttp.LogHTTPService - Failed to parse the request of size 28374 due to: Unrecognized token 'inf': was expecting (JSON String, Number, Array, Object or token 'null', 'true' or 'false')
 at [Source: (com.linecorp.armeria.internal.shaded.fastutil.io.FastByteArrayInputStream); line: 1, column: 10741] (through reference chain: java.util.ArrayList[93])

Is there a way to enable a more verbose logging level, so that I can see the complete JSON objects that trigger these errors?
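
As a possible workaround outside Data Prepper, a small capture endpoint could receive a copy of the same FluentBit HTTP output and save only the bodies that fail strict JSON parsing. The sketch below uses only Python's standard library. The port, the `bad_requests` directory, and all names are hypothetical, and it assumes the client sends a `Content-Length` header.

```python
import json
import time
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path

CAPTURE_DIR = Path("bad_requests")  # hypothetical directory for captured bodies


def _reject_constant(name):
    # Python accepts NaN/Infinity by default; reject them to stay strict.
    raise ValueError(f"non-finite constant: {name}")


def is_strict_json(body: bytes) -> bool:
    """Return True only if the body decodes and parses as strict JSON."""
    try:
        json.loads(body.decode("utf-8"), parse_constant=_reject_constant)
    except ValueError:  # covers JSONDecodeError and UnicodeDecodeError
        return False
    return True


class CaptureHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        if not is_strict_json(body):
            # Save the raw body so the offending record can be inspected.
            CAPTURE_DIR.mkdir(exist_ok=True)
            (CAPTURE_DIR / f"{time.time_ns()}.json").write_bytes(body)
        self.send_response(200)
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the console quiet


def serve(port: int = 2024):
    """Run the capture endpoint until interrupted (the port is an assumption)."""
    HTTPServer(("0.0.0.0", port), CaptureHandler).serve_forever()
```

Calling `serve()` and adding a second FluentBit `http` output that points at this port would leave the failing bodies on disk for inspection.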

Thanks.
