Monitoring Nginx Logs

And increase the stability of your web application

Anas Anjaria
Sep 4, 2022

Are you using Nginx? Are you monitoring your Nginx logs?

Did you know that you can increase the stability and reliability of your web application by monitoring Nginx logs?

I will show you how to monitor Nginx logs, use them as metrics and make the most out of them.

Why monitor Nginx logs?

To ensure the availability of our system: A system is available when it has no failures, or when its failures stay below a certain threshold. By analyzing HTTP status codes, we can determine the failure percentage and check it against that threshold.

HTTP responses over time

Latency: Availability is one metric, but latency is also crucial. We lose customers when our system takes too long to respond.

Removing deprecated endpoints: Sometimes we would like to remove old, deprecated endpoints, but we can't remove them directly because some of our customers might still be using them. By analyzing the historical usage of each endpoint, we can decide when it is safe to remove it.
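To make the first and last points concrete, here is a minimal Python sketch that computes a failure percentage and per-endpoint usage straight from access-log lines. It assumes Nginx's default "combined" log format. The sample lines, the paths in them and the `summarize` helper are made up for illustration; in the setup described in this post, Filebeat and Elasticsearch do this work instead.

```python
import re
from collections import Counter

# Matches the request line and status code of the "combined" log format.
LINE_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')

# Made-up sample lines, for illustration only.
SAMPLE_LINES = [
    '10.0.0.1 - - [04/Sep/2022:10:00:00 +0000] "GET /api/v2/users HTTP/1.1" 200 512 "-" "curl/7.81.0"',
    '10.0.0.2 - - [04/Sep/2022:10:00:01 +0000] "GET /api/v1/users HTTP/1.1" 200 498 "-" "curl/7.81.0"',
    '10.0.0.3 - - [04/Sep/2022:10:00:02 +0000] "POST /api/v2/orders HTTP/1.1" 502 0 "-" "curl/7.81.0"',
    '10.0.0.1 - - [04/Sep/2022:10:00:03 +0000] "GET /api/v2/users HTTP/1.1" 404 153 "-" "curl/7.81.0"',
]


def summarize(lines):
    """Return (server-error percentage, request count per path)."""
    statuses = []
    paths = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip lines that do not look like access-log entries
        statuses.append(int(match.group("status")))
        paths[match.group("path")] += 1
    failed = sum(1 for status in statuses if 500 <= status <= 599)
    failure_pct = 100.0 * failed / len(statuses) if statuses else 0.0
    return failure_pct, paths


failure_pct, usage = summarize(SAMPLE_LINES)
print(f"5xx failure percentage: {failure_pct:.1f}%")
print("requests to /api/v1/users:", usage["/api/v1/users"])
```

A deprecated endpoint whose count stays at zero over a long enough window is a reasonable candidate for removal.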

Basic workflow

The workflow is as follows:

  1. We collect Nginx metrics using Filebeat's Nginx module [1].
  2. Filebeat publishes the collected metrics to Elasticsearch.
  3. Using Kibana, we visualize these metrics.
  4. We notify our team using watchers [2].

The basic workflow for collecting Nginx metrics

Proof of concept

Source code — monitor-nginx-logs

Collect Nginx metrics

We collect Nginx metrics using the following Filebeat configuration.

filebeat.modules:  
- module: nginx  
  access:  
    var.paths: ["/var/log/nginx/host.access.log"]

Metrics are published to Elasticsearch using the following output configuration.

output.elasticsearch:  
  hosts: ["http://elasticsearch:9200"]

You can view the collected metrics using Discover in Kibana at localhost:5601. http.response.status_code is the metric of interest.

Nginx metrics collected via filebeat

Visualize HTTP status code in Kibana

Select Visualize Library and create a visualization

Select TSVB

Group the HTTP status code as shown in the following diagram.

Group HTTP status codes

The screenshot is hard to read even when zoomed in, so here are the values written out.

Left-hand side: http.response.status_code >= 200 and http.response.status_code <= 299

Right-hand side: 200

Similarly, adjust it for the other response codes.
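For reference, the full set of group filters can be generated with a few lines of Python. This is only a sketch: the labels 300, 400 and 500 are assumed by analogy with the 200 label above, and `group_filter` is a hypothetical helper, not part of Kibana.

```python
# Status-code classes to chart; each label follows the "200" used above.
CLASSES = [200, 300, 400, 500]
FIELD = "http.response.status_code"


def group_filter(lower):
    """Build the filter for one class, e.g. 200 covers 200 through 299."""
    upper = lower + 99
    return f"{FIELD} >= {lower} and {FIELD} <= {upper}"


filters = {str(lower): group_filter(lower) for lower in CLASSES}
for label, query in filters.items():
    print(f"{label}: {query}")
```

Each printed line gives the label (right-hand side) followed by the filter (left-hand side) for one group.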

Finally, we can see the HTTP status code over time.

HTTP status code over time

Inform the team upon failed requests

You can use the watcher API [3] to inform your team. The watcher retrieves data at regular intervals and notifies the team when a condition is met, so that the team can act on it.

Suppose we want to trigger the watcher every hour.

"trigger": {  
  "schedule": {  
    "interval": "1h"  
  }  
}

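Because the metrics are stored in Filebeat indices, the watcher searches the following index pattern, where {now/d} resolves to the current date.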

"indices": [  
  "<filebeat-*-{now/d}>"  
]

We can count the failed requests (5xx responses) from the last hour using the following query. Note that the "to" bound of a range aggregation is exclusive.

{
 "size": 0,
 "query": {
  "bool": {
   "must": [
    {
     "range": {
      "@timestamp": {
       "gte": "now-1h"
      }
     }
    },
    {
     "exists": {
      "field": "http.response.status_code"
     }
    }
   ]
  }
 },
 "aggs": {
  "response_code_ranges": {
   "range": {
    "field": "http.response.status_code",
    "keyed": true,
    "ranges": [
     {
      "key": "server_errors",
      "from": 500,
      "to": 600
     }
    ]
   }
  }
 }
}

The following condition triggers our watcher and informs our team when the percentage of failed requests exceeds 0.3%. You can adjust this threshold to suit your needs.

"condition": {  
  "script": {  
    "source": "(double) ctx.payload.aggregations.response_code_ranges.buckets.server_errors.doc_count/(double) ctx.payload.hits.total > params.threshold",  
    "lang": "painless",  
    "params": {  
      "threshold": 0.003  
    }  
  }  
}
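To see what this condition evaluates, here is a rough Python mirror of the Painless script. It is a sketch for reasoning about the threshold, not something Watcher runs. The sample payload is hypothetical, and it assumes that hits.total arrives as a plain integer, as the condition above does. The explicit guard for zero requests is an addition of this sketch.

```python
def should_notify(payload, threshold=0.003):
    """Mirror of the condition: server errors / total hits > threshold."""
    buckets = payload["aggregations"]["response_code_ranges"]["buckets"]
    server_errors = buckets["server_errors"]["doc_count"]
    total = payload["hits"]["total"]
    if total == 0:
        return False  # no requests in the window, nothing to report
    return server_errors / total > threshold


# Hypothetical payload: 5 server errors out of 1000 requests (0.5%).
sample_payload = {
    "hits": {"total": 1000},
    "aggregations": {
        "response_code_ranges": {
            "buckets": {"server_errors": {"doc_count": 5}}
        }
    },
}
print(should_notify(sample_payload))
```

With 5 failures in 1000 requests, the ratio is 0.005, which is above the 0.003 threshold, so the team would be notified.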

To inform the team via Slack, add a Slack action. The following example is copied from https://www.elastic.co/guide/en/elasticsearch/reference/7.17/actions-slack.html

"actions" : {  
  "notify-slack" : {  
    "transform" : { ... },  
    "throttle_period" : "5m",  
    "slack" : {  
      "message" : {  
        "to" : [ "#admins", "@chief-admin" ],   
        "text" : "..."   
      }  
    }  
  }  
}
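Putting the pieces together, the fragments above could be assembled into a single watch body along these lines. This is a sketch under a few assumptions: the trigger, index pattern, query, condition and Slack action are taken from this post, wrapped in Watcher's search input. The elided transform is left out, and the message text stays a placeholder. The range uses 600 as its upper bound because "to" is exclusive.

```python
import json

watch = {
    "trigger": {"schedule": {"interval": "1h"}},
    "input": {
        "search": {
            "request": {
                "indices": ["<filebeat-*-{now/d}>"],
                "body": {
                    "size": 0,
                    "query": {
                        "bool": {
                            "must": [
                                {"range": {"@timestamp": {"gte": "now-1h"}}},
                                {"exists": {"field": "http.response.status_code"}},
                            ]
                        }
                    },
                    "aggs": {
                        "response_code_ranges": {
                            "range": {
                                "field": "http.response.status_code",
                                "keyed": True,
                                # "to" is exclusive, so 600 covers 500 through 599
                                "ranges": [
                                    {"key": "server_errors", "from": 500, "to": 600}
                                ],
                            }
                        }
                    },
                },
            }
        }
    },
    "condition": {
        "script": {
            "source": (
                "(double) ctx.payload.aggregations.response_code_ranges"
                ".buckets.server_errors.doc_count"
                "/(double) ctx.payload.hits.total > params.threshold"
            ),
            "lang": "painless",
            "params": {"threshold": 0.003},
        }
    },
    "actions": {
        "notify-slack": {
            "throttle_period": "5m",
            "slack": {
                # Placeholder recipients and text, as in the copied example.
                "message": {"to": ["#admins", "@chief-admin"], "text": "..."}
            },
        }
    },
}

print(json.dumps(watch, indent=2))
```

The printed JSON is what you would send as the request body of PUT _watcher/watch/<watch_id> [3], with a watch ID of your choice.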

Thanks for reading.

Resources

[1] filebeat-module-nginx.html

[2] watcher-ui.html

[3] watcher-api-put-watch.html

