We have watchers in our production system that inform us when something goes wrong with our system. Suddenly, one of the systems becomes unhealthy and it failed to connect MySQL despite valid credentials.
Upon investigation, we noticed that MySQL is rejecting connections due to full disk space. The issue was resolved by increasing the disk space.
Our team decided to collect aborted MySQL connections and disk space metrics. This story only focuses on collecting aborted MySQL connection metrics.
Basic workflow
The workflow is as follows
- We collect MySQL metrics using metricbeat’s module [1]
- Metricbeat publishes collected metrics to elasticsearch.
- Using Kibana, we can visualize this information.
- We can inform our team using watchers [2]
![]() |
Proof of concept
Source code —monitor-mysql-aborted-connection
Collect MySQL metrics
We collect MySQL metrics using metricbeat as follows:
metricbeat.modules:
- module: mysql
metricsets:
- "status"
period: 1m
hosts: ["tcp(localhost:3306)/"]
username: root
password: secret
Metrics are published to elastic search using the following code.
output.elasticsearch:
hosts: ["http://localhost:9200"]
You can view collected metrics using Discover
in kibana at localhost:5601. mysql.status.aborted.connects
is the metric of interest.
![]() |
Understanding how to use collected metrics
After collecting metrics, it is crucial to understand how to use this information. Let’s have a look at an example.
If there are no aborted connections, the value remains the same. But if there are aborted connections, it will keep on increasing. It is crucial to note that this metric will not go back to zero when there are no aborted connections but remains at the same level.
![]() |
It means we are interested in detecting a change in behavior of aborted connections over time, that is, instantaneous change in aborted connections. Therefore, we use the derivative aggregation [3][4].
Furthermore, we all know that the derivative of a constant is 0. If there are no aborted connections, the value remains constant. Hence the derivative would be 0, that is, no aborted connections.
Visualize aborted connections via Kibana
Select Visualize Library
Create a visualization
Select aggregation-based visualization
Select line graph
Select metricbeat
index
Adjust the x and y axis as follows
Press Update
button to view the result.
![]() |
As you can see, we only had aborted connections ~12:55.
Inform the team about the problem
You can use the watcher API [5] to inform your team. The watcher will retrieve data at regular intervals, and inform our team when a condition is met. Upon receiving notifications, the team acts accordingly.
Suppose we want to trigger watcher at every 1h.
"trigger": {
"schedule": {
"interval": "1h"
}
}
As we store metrics in metricbeat
index, hence
"indices": [
"<metricbeat-*-{now/d}>"
]
We can query MySQL metrics using the following query.
TIP: You can generate the elastic search query by inspecting an element in Chrome or Firefox. For instance, I did the same and adjust the query to make it more readable.
"input": {
"search": {
"request": {
"search_type": "query_then_fetch",
"indices": [
"<metricbeat-*-{now/d}>"
],
"body": {
"size": 0,
"query": {
"bool": {
"must": [
{
"range": {
"@timestamp": {
"gte": "now-1h"
}
}
},
{
"term": {
"service.type": "mysql"
}
},
{
"term": {
"metricset.name": "status"
}
}
]
}
},
"aggs": {
"aborted_connections_over_time": {
"date_histogram": {
"field": "@timestamp",
"calendar_interval": "1h"
},
"aggs": {
"max_aborted_connections": {
"max": {
"field": "mysql.status.aborted.connects"
}
},
"instantaneous_change": {
"derivative": {
"buckets_path": "max_aborted_connections"
}
}
}
}
}
}
}
}
}
The following condition will return true if at least one bucket in the aggregation has instantaneous change (> 0).
"condition": {
"array_compare": {
"ctx.payload.aggregations.aborted_connections_over_time.buckets": {
"path": "instantaneous_change",
"gt": {
"value": 0
}
}
}
}
To inform team via slack
-- copied from https://www.elastic.co/guide/en/elasticsearch/reference/7.17//actions-slack.html
"actions" : {
"notify-slack" : {
"transform" : { ... },
...
"slack" : {
"message" : {
"to" : [ "#admins", "@chief-admin" ],
"text" : "..."
}
}
}
}
Thanks for reading.
Resources
[2] watcher-ui.html
[3] youtube.com/watch?v=lowavG2SXsQ
[4] search-aggregations-pipeline-derivative-aggregation
[5] watcher-api-put-watch.html