Anas Anjaria
Anas Anjaria's blog

Follow

Anas Anjaria's blog

Follow
Monitor Aborted MySQL Connections using metricbeat

Photo by Rubaitul Azad on Unsplash

Monitor Aborted MySQL Connections using metricbeat

Anas Anjaria's photo
Anas Anjaria
·Sep 4, 2022·

4 min read

We have watchers in our production system that inform us when something goes wrong with our system. Suddenly, one of the systems becomes unhealthy and it failed to connect MySQL despite valid credentials.

Upon investigation, we noticed that MySQL is rejecting connections due to full disk space. The issue was resolved by increasing the disk space.

Our team decided to collect aborted MySQL connections and disk space metrics. This story only focuses on collecting aborted MySQL connection metrics.

Basic workflow

The workflow is as follows

  1. We collect MySQL metrics using metricbeat’s module [1]
  2. Metricbeat publishes collected metrics to elasticsearch.
  3. Using Kibana, we can visualize this information.
  4. We can inform our team using watchers [2]
Basic workflow of collecting mysql aborted connections
The basic workflow for collecting MySQL metrics

Proof of concept

Source code —monitor-mysql-aborted-connection

Collect MySQL metrics

We collect MySQL metrics using metricbeat as follows:

metricbeat.modules:  
- module: mysql  
  metricsets:  
  - "status"  
  period: 1m  
  hosts: ["tcp(localhost:3306)/"]  
  username: root  
  password: secret

Metrics are published to elastic search using the following code.

output.elasticsearch:  
  hosts: ["http://localhost:9200"]

You can view collected metrics using Discover in kibana at localhost:5601. mysql.status.aborted.connects is the metric of interest.

MySQL metrics collected via metricbeat
MySQL metrics collected via metricbeat

Understanding how to use collected metrics

After collecting metrics, it is crucial to understand how to use this information. Let’s have a look at an example.

If there are no aborted connections, the value remains the same. But if there are aborted connections, it will keep on increasing. It is crucial to note that this metric will not go back to zero when there are no aborted connections but remains at the same level.

MySQL aborted connections over time
Aborted connections over time

It means we are interested in detecting a change in behavior of aborted connections over time, that is, instantaneous change in aborted connections. Therefore, we use the derivative aggregation [3][4].

Furthermore, we all know that the derivative of a constant is 0. If there are no aborted connections, the value remains constant. Hence the derivative would be 0, that is, no aborted connections.

Visualize aborted connections via Kibana

Select Visualize Library

Create a visualization

Select aggregation-based visualization

Select line graph

Select metricbeat index

Adjust the x and y axis as follows

Press Update button to view the result.

Aborted MySQL connections over time
Aborted MySQL connections over time

As you can see, we only had aborted connections ~12:55.

Inform the team about the problem

You can use the watcher API [5] to inform your team. The watcher will retrieve data at regular intervals, and inform our team when a condition is met. Upon receiving notifications, the team acts accordingly.

Suppose we want to trigger watcher at every 1h.

"trigger": {  
  "schedule": {  
    "interval": "1h"  
  }  
}

As we store metrics in metricbeat index, hence

"indices": [  
  "<metricbeat-*-{now/d}>"  
]

We can query MySQL metrics using the following query.

TIP: You can generate the elastic search query by inspecting an element in Chrome or Firefox. For instance, I did the same and adjust the query to make it more readable.

"input": {
  "search": {
   "request": {
    "search_type": "query_then_fetch",
    "indices": [
     "<metricbeat-*-{now/d}>"
    ],
    "body": {
     "size": 0,
     "query": {
      "bool": {
       "must": [
        {
         "range": {
          "@timestamp": {
           "gte": "now-1h"
          }
         }
        },
        {
         "term": {
          "service.type": "mysql"
         }
        },
        {
         "term": {
          "metricset.name": "status"
         }
        }
       ]
      }
     },
     "aggs": {
      "aborted_connections_over_time": {
       "date_histogram": {
        "field": "@timestamp",
        "calendar_interval": "1h"
       },
       "aggs": {
        "max_aborted_connections": {
         "max": {
          "field": "mysql.status.aborted.connects"
         }
        },
        "instantaneous_change": {
         "derivative": {
          "buckets_path": "max_aborted_connections"
         }
        }
       }
      }
     }
    }
   }
  }
 }

The following condition will return true if at least one bucket in the aggregation has instantaneous change (> 0).

"condition": {  
  "array_compare": {  
   "ctx.payload.aggregations.aborted_connections_over_time.buckets": {  
    "path": "instantaneous_change",  
    "gt": {  
     "value": 0  
    }  
   }  
  }  
 }

To inform team via slack

-- copied from https://www.elastic.co/guide/en/elasticsearch/reference/7.17//actions-slack.html


"actions" : {  
  "notify-slack" : {  
    "transform" : { ... },  
    ... 
    "slack" : {  
      "message" : {  
        "to" : [ "#admins", "@chief-admin" ],   
        "text" : "..."   
      }  
    }  
  }  
}

Thanks for reading.

Resources

[1] metricbeat-module-mysql

[2] watcher-ui.html

[3] youtube.com/watch?v=lowavG2SXsQ

[4] search-aggregations-pipeline-derivative-aggregation

[5] watcher-api-put-watch.html


Want to connect?

anasanjaria.bio.link

 
Share this