Skip to content

API Gateway monitoring tools, out-of-box dashboard helps you find out performance issue,help improve SLA.

Notifications You must be signed in to change notification settings

zhikaichen123/prometheus-lua-nginx

 
 

Repository files navigation

prometheus-lua-nginx

GitHub stars GitHub followers

Overview

Project prometheus-lua-nginx is a complete solution for openresty based api gateway monitoring, prometheus-lua-nginx use prometheus lua lib and openresty http request life cycle to inject monitoring code, it is async, high performance, and business awareness, it also have built-in grafana dashboard out of box, please see Snapshots

Concept

How to measure your API performance? how stable your service are?

The answer is metrics monitoring:

From last one years, our best practise is the following metrics and dimensions

  • P99(latency percent 99%), P999, P90 this most important metrics, it will impact your users expirence about your apis.
  • Error rate
  • QPS

Dimensions:

  • host, every web site use virtual host to isolate sub business systems
  • endpoint, endpoint is the root access point of your api like /myapi
  • fullurl, this is the detail of one specific api, not aggrated data, it will help you find out the service quality trends
  • SLA, correct response/all request * 100% during a timewindow
  • statuscode, method, scheme

prometheus-lua-nginx use Prometheus to collect request latency metrics, below is the data model

<metric name>{<label name>=<label value>, ...}

for example:

nginx_http_request_duration_seconds_bucket{host="demo1.coohua.com",status="200",scheme="http",method="GET",endpoint="/service",fullurl="/service/latency",le="00.010"} 42
nginx_http_request_duration_seconds_bucket{host="demo1.coohua.com",status="200",scheme="http",method="GET",endpoint="/service",fullurl="/service/latency",le="00.020"} 249
nginx_http_request_duration_seconds_bucket{host="demo1.coohua.com",status="200",scheme="http",method="GET",endpoint="/service",fullurl="/service/latency",le="00.030"} 249
nginx_http_request_duration_seconds_bucket{host="demo1.coohua.com",status="200",scheme="http",method="GET",endpoint="/service",fullurl="/service/latency",le="00.050"} 249
nginx_http_request_duration_seconds_bucket{host="demo1.coohua.com",status="200",scheme="http",method="GET",endpoint="/service",fullurl="/service/latency",le="00.075"} 249
nginx_http_request_duration_seconds_bucket{host="demo1.coohua.com",status="200",scheme="http",method="GET",endpoint="/service",fullurl="/service/latency",le="00.100"} 249
nginx_http_request_duration_seconds_bucket{host="demo1.coohua.com",status="200",scheme="http",method="GET",endpoint="/service",fullurl="/service/latency",le="00.200"} 249
nginx_http_request_duration_seconds_bucket{host="demo1.coohua.com",status="200",scheme="http",method="GET",endpoint="/service",fullurl="/service/latency",le="00.300"} 249
nginx_http_request_duration_seconds_bucket{host="demo1.coohua.com",status="200",scheme="http",method="GET",endpoint="/service",fullurl="/service/latency",le="00.400"} 249
nginx_http_request_duration_seconds_bucket{host="demo1.coohua.com",status="200",scheme="http",method="GET",endpoint="/service",fullurl="/service/latency",le="00.500"} 249
nginx_http_request_duration_seconds_bucket{host="demo1.coohua.com",status="200",scheme="http",method="GET",endpoint="/service",fullurl="/service/latency",le="00.750"} 249
...
nginx_http_request_duration_seconds_bucket{host="demo1.coohua.com",status="499",scheme="http",method="GET",endpoint="/service",fullurl="/service/latency",le="+Inf"} 1
nginx_http_request_duration_seconds_count{host="demo1.coohua.com",status="200",scheme="http",method="GET",endpoint="/service",fullurl="/service/latency"} 249
nginx_http_request_duration_seconds_count{host="demo1.coohua.com",status="499",scheme="http",method="GET",endpoint="/service",fullurl="/service/latency"} 1
nginx_http_request_duration_seconds_sum{host="demo1.coohua.com",status="200",scheme="http",method="GET",endpoint="/service",fullurl="/service/latency"} 2.82
nginx_http_request_duration_seconds_sum{host="demo1.coohua.com",status="499",scheme="http",method="GET",endpoint="/service",fullurl="/service/latency"} 0.008

and then we use grafana to setup a dashboard, and add a prometheus datasource, then use prometheus functions to get the calculate metrics like P99, P999, SLA, ERR rate, etc.

Architecture

Features

  • Base on promethues pull mode to collect metrics
  • Multiple dimension latency metrics collection(host,status code,schema,etc.)
  • Out-of-box grafana dashboards
  • LUA based latency metrics collection, high performance, low memory use, async

Getting started

Quickly start up a demo by docker-compose

# please instal docker-ce, docker-compose correctly
docker-compose up

this compose will install a set of containers for a demo use:

  • a grafana has built in dashboards
  • a backend services for mock latency and the real workload
  • a openresty as proxy with counter.lua and promethues.lua integrated, can collect latency metrics, and expose them by :9145/metrics endpoint
  • a promethues has pre configured previous oprenresty as scrape target

how to access:

  • access grafana by [hostip]:3000
  • access promethues by [hostip]:9090
  • access openresty exposed metrics by [hostip]:9145/metrics
  • access [hostip]:80/service/latency/[latency]/bytes/[bytes] to mock a api call,

    for instance -> curl -H "host:anyhost.com" [hostip]:80/service/latency/10/bytes/10 or use ab to simulate real workload like: ab -n 10 -H "host:anyhost.com" officepc:80/service/latency/10/bytes/10

  • check the dashboard

Building

About Demo build

You should aware that the demo is only for demo use, if you want to use it in your production environment, you need to completely know how the demo and the solution runs. If you are ready, please see ./docker/xxx for more details

Build OpenResty from source and integrate LUA metrics collection code

  • build OpenResty from source, please follow BuildOpenResty
  • copy workdir/conf.d/counter.conf to your conf.d dir
  • copy workdir/lua/prometheus.lua to your openresty lib dir
  • change lua_package_path in counter.conf to define lib location
  • copy /workdir/lua/counter.lua to your openresty lua dir here is our workdir structure
workdir
├── conf.d
│   ├── backend.conf # backend service config, replace it by your real workload
│   └── counter.conf # define lua-prometheus related vars and import deps
├── html
│   ├── 50x.html # demo use static html
│   └── index.html # demo use static html
├── lua
│   ├── counter.lua # metric collection code
│   └── prometheus.lua # prometheus.lua is offical Prometheus LUA Library you can upgrade it from https://github.com/knyar/nginx-lua-prometheus
└── nginx.conf # main config
  • if you just want run current project, please follow:
export RESTY_HOME=/opt/openresty
cd $RESTY_HOME
git clone https://github.com/zrbcool/prometheus-lua-nginx.git
ln -s $RESTY_HOME/prometheus-lua-nginx/workdir $RESTY_HOME/workdir
mkdir $RESTY_HOME/workdir/logs
cd $RESTY_HOME
git clone https://github.com/knyar/nginx-lua-prometheus.git
cp nginx-lua-prometheus/prometheus.lua workdir/lua
$RESTY_HOME/build/nginx/sbin/nginx \
	-p $RESTY_HOME/workdir \
	-c $RESTY_HOME/workdir/nginx.conf

curl localhost:8080
	<p>hello, world</p>
curl localhost:9145/metrics
	# HELP nginx_metric_errors_total Number of nginx-lua-prometheus errors
	# TYPE nginx_metric_errors_total counter
	nginx_metric_errors_total 0
pkill nginx

Contact

E-Mail

Blog

blog.zrbcool.top

Other Topics

github

About

API Gateway monitoring tools, out-of-box dashboard helps you find out performance issue,help improve SLA.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Lua 83.9%
  • Dockerfile 5.8%
  • Shell 5.5%
  • HTML 4.8%