
EashanKaushik/real-time-log-anomaly-detection


real-time-logs-anomaly-monitoring

In this project, I developed a pipeline that ingests logs from any source and stores them in OpenSearch for anomaly detection. The pipeline works in real time and can be configured to monitor any field in your log file. A typical use case is detecting anomalies, alerting the security team, and presenting executives with a dashboard of KPIs on:

  • WAF (Web Application Firewall) logs
  • System logs
  • CloudWatch logs
  • Splunk logs
  • CloudTrail logs

I have developed a script that generates mock log files on an EC2 instance and stores them in S3. The script is run by a cron job and generates 4 random log files per minute. It can either generate anomalies at random or be told to enforce them.

Architecture Diagram

Log-file

The log file generated by the EC2 instance has the following fields; the highlighted fields are the ones monitored for anomalies. As mentioned above, a Python script generates this log file, so we know which values to expect. The values each monitored field can take, and which of them count as anomalies, are listed below.

  1. action can take the following values: ["ALLOW", "CAPTCHA", "Challenge", "Count", "BLOCK"]; here "BLOCK" is the anomaly
  2. httpSourceName can take the following values: ["ALB", "APIGW", "APPSYNC", "CF", "-"]; here "-" is the anomaly
  3. httpStatus can take the following values: ["200", "202", "307", "308", "403", "404", "502", "504"]; here the 4xx and 5xx codes are the anomalies
  4. country can take the following values: ["US", "UK", "IN", "XX", "YY", "ZZ", "WW"]; here "XX", "YY", "ZZ", and "WW" are the anomalies
  5. httpMethod can take the following values: ["GET", "HEAD", "POST", "DELETE"]; here "DELETE" is the anomaly
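
The field values above can be turned into a small record generator. The following is a minimal sketch, not the repository's actual script: the names `FIELD_VALUES`, `ANOMALOUS_VALUES`, `generate_record`, and `anomalous_fields` are hypothetical, and the `force_anomaly` parameter is an assumption about how anomalies might be enforced.

```python
import random

# Possible values per monitored field, copied from the list above.
FIELD_VALUES = {
    "action": ["ALLOW", "CAPTCHA", "Challenge", "Count", "BLOCK"],
    "httpSourceName": ["ALB", "APIGW", "APPSYNC", "CF", "-"],
    "httpStatus": ["200", "202", "307", "308", "403", "404", "502", "504"],
    "country": ["US", "UK", "IN", "XX", "YY", "ZZ", "WW"],
    "httpMethod": ["GET", "HEAD", "POST", "DELETE"],
}

# Values that count as anomalies for each field (4xx/5xx for httpStatus).
ANOMALOUS_VALUES = {
    "action": {"BLOCK"},
    "httpSourceName": {"-"},
    "httpStatus": {"403", "404", "502", "504"},
    "country": {"XX", "YY", "ZZ", "WW"},
    "httpMethod": {"DELETE"},
}


def generate_record(force_anomaly=None):
    """Build one mock log record.

    force_anomaly (hypothetical parameter): name of a field whose value
    must be anomalous. When None, every field is drawn from its full
    value list, so anomalies appear only by chance.
    """
    record = {}
    for field, values in FIELD_VALUES.items():
        if field == force_anomaly:
            choices = sorted(ANOMALOUS_VALUES[field])
        else:
            choices = values
        record[field] = random.choice(choices)
    return record


def anomalous_fields(record):
    """Return the names of the fields whose value is an anomaly."""
    return [
        field
        for field, bad_values in ANOMALOUS_VALUES.items()
        if record.get(field) in bad_values
    ]
```

For example, `generate_record(force_anomaly="country")` always yields a record whose country is one of "XX", "YY", "ZZ", or "WW", while `generate_record()` leaves anomalies to chance.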

Transformed log-file

Kinesis Firehose transforms the log file before storing it in OpenSearch. The transform adds the following binary flags to each log record:

  • K_action = 1 if action in ["ALLOW", "CAPTCHA", "Challenge", "Count"] else K_action = 0
  • B_action = 1 if action in ["BLOCK"] else B_action = 0
  • K_source = 1 if httpSourceName in ["ALB", "APIGW", "APPSYNC", "CF"] else K_source = 0
  • UNK_source = 1 if httpSourceName not in ["ALB", "APIGW", "APPSYNC", "CF"] else UNK_source = 0
  • G_httpstatus = 1 if httpStatus in ["2xx", "3xx"] else G_httpstatus = 0
  • B_httpstatus = 1 if httpStatus in ["4xx", "5xx"] else B_httpstatus = 0
  • K_country = 1 if country in ["US", "UK", "IN"] else K_country = 0
  • UNK_country = 1 if country not in ["US", "UK", "IN"] else UNK_country = 0
  • common_method = 1 if httpMethod in ["GET", "HEAD", "POST"] else common_method = 0
  • uncommon_method = 1 if httpMethod in ["DELETE"] else uncommon_method = 0
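
The rules above map naturally onto a Firehose data-transformation Lambda. What follows is a hedged sketch rather than the repository's own function: it assumes each incoming record is a single JSON object, that the original fields are kept alongside the new flags, and the helper name `transform` is hypothetical. The event and response shapes (`records`, `recordId`, `result`, base64-encoded `data`) follow the standard Firehose transformation contract.

```python
import base64
import json

KNOWN_ACTIONS = {"ALLOW", "CAPTCHA", "Challenge", "Count"}
KNOWN_SOURCES = {"ALB", "APIGW", "APPSYNC", "CF"}
KNOWN_COUNTRIES = {"US", "UK", "IN"}
COMMON_METHODS = {"GET", "HEAD", "POST"}


def transform(log):
    """Add the 0/1 flags described above to one parsed log record."""
    # The first digit of the status code decides the 2xx/3xx vs 4xx/5xx class.
    status_class = str(log.get("httpStatus", ""))[:1]
    flags = {
        "K_action": int(log.get("action") in KNOWN_ACTIONS),
        "B_action": int(log.get("action") == "BLOCK"),
        "K_source": int(log.get("httpSourceName") in KNOWN_SOURCES),
        "UNK_source": int(log.get("httpSourceName") not in KNOWN_SOURCES),
        "G_httpstatus": int(status_class in ("2", "3")),
        "B_httpstatus": int(status_class in ("4", "5")),
        "K_country": int(log.get("country") in KNOWN_COUNTRIES),
        "UNK_country": int(log.get("country") not in KNOWN_COUNTRIES),
        "common_method": int(log.get("httpMethod") in COMMON_METHODS),
        "uncommon_method": int(log.get("httpMethod") == "DELETE"),
    }
    # Assumption: the original fields are kept next to the new flags.
    return {**log, **flags}


def lambda_handler(event, context):
    """Firehose transformation entry point: decode, transform, re-encode."""
    output = []
    for record in event["records"]:
        try:
            log = json.loads(base64.b64decode(record["data"]))
            payload = json.dumps(transform(log)).encode("utf-8")
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(payload).decode("utf-8"),
            })
        except (ValueError, TypeError, AttributeError):
            # Records that are not valid JSON objects are reported as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

Encoding each condition as a separate 0/1 field gives the anomaly detector numeric values to aggregate per interval, for instance the sum of `B_action`.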

OpenSearch (Anomaly Detector)

Historical Analysis for Block action anomaly

Live HttpStatus anomaly trends

Live unknown country anomaly trends

Alerts

About

A real time log monitoring approach using OpenSearch and Kinesis Streams
