Dash conference! July 11-12, NYC

RDS MySQL Dashboard

What is RDS MySQL?

Amazon Relational Database Service (RDS) is a hosted database service in the AWS cloud. Users can choose from several relational database engines, including MySQL, Oracle, SQL Server, Postgres, MariaDB, and Amazon Aurora.

RDS MySQL is the original database engine for the service.

RDS MySQL dashboard overview

RDS MySQL users can access RDS metrics via Amazon CloudWatch and native MySQL metrics from the database instance itself. Each metric type gives you different insights into MySQL performance. Ideally, both RDS and MySQL metrics should be collected for a comprehensive view.

However, it’s not always easy to tell which of the hundreds of RDS MySQL metrics you should focus on. Refer to the image below for an example of a customizable RDS MySQL dashboard in Datadog with the critical metrics to focus on.

MySQL on RDS dashboard template

The following is a widget-by-widget breakdown of the template RDS MySQL dashboard separated into five categories: query volume, disk I/O, connections, AWS resource metrics and replication.

Query volume metrics

Queries per second

The current rate of queries will naturally rise and fall. But it is worth alerting on sudden changes in query volume—drastic drops in throughput, especially, can indicate a serious problem.

Slow queries

Slow_queries is a native MySQL performance metric for query latency that increments every time a query’s execution time exceeds the number of seconds specified by the long_query_time parameter (which can be set in the AWS console).

If the number of slow queries reaches worrisome levels, you will want to identify the actual queries that are executing slowly so you can optimize them.

Reads per second

Depending on whether or not the read is served from the query cache, reads increment one of two MySQL metrics (Com_select or Qcache_hits). This graph sums both metrics.

Highest reads/sec by instance, past hour

This toplist further breaks down the reads per second graph by providing instance-specific data about reads on an hourly basis.

Writes per second

All INSERT, UPDATE, and DELETE commands are rolled into this graph.

Writes/sec by instance, past hour

This metric further breaks down the writes per second graph by providing instance-specific data about writes on an hourly basis.

Disk I/O Metrics

Total I/O

The sum of read IOPS (input/output operations per second) and write IOPS for the MySQL instance. Note there is not a one-to-one correspondence between queries and I/O operations. Monitoring total IOPS ensures that you do not exceed the limits of your chosen instance type. It also helps you understand how much of your workload is served from memory and how much must be retrieved from disk.

Read I/O

The number of read IOPS.

Write I/O

The number of write IOPS.

Disk queue depth

The DiskQueueDepth metric measures the length of the I/O operations queue when storage volumes cannot keep pace with the volume of read and write requests.

Although it is not unusual to have some requests in the disk queue, you might have to investigate if this metric climbs—especially if latency increases as a result.

Read latency per I/O

This metric measures how long your read I/O operations are taking at the disk level. To overcome I/O limitations in read-heavy applications, you can create read replica of the database to serve some of the client read requests.

Write latency per I/O

This metric measures how long your write I/O operations are taking at the disk level.

Connection Metrics

Threads connected

The Threads_connected metric counts connection threads in MySQL (one thread is allocated per connection). This native MySQL metric can be collected at a higher resolution than the CloudWatch DatabaseConnections metric.

Threads running

The Threads_running metric isolates the threads that are actively processing queries.

Replication

Bin log usage on master

This metric is relevant for replication scenarios because it measures the disk usage of binary logs on the master database instance.

Replication lag

The ReplicaLag CloudWatch metric captures the lag time for any read replica. While this metric is usually not actionable, you should investigate your settings and resource usage if the lag is consistently very long.

Replication lag, by instance

This graph further breaks down the ReplicaLag metric by providing instance-specific data.

AWS resource metrics

CPU

This graph monitors CPU utilization. While high CPU utilization is not necessarily a bad sign, it could indicate a bottleneck if your database is performing poorly while it is within IOPS and network limits and appears to have sufficient memory.

Network in

The network traffic received by the MySQL instance per second. Monitoring NetworkReceiveThroughput and NetworkTransmitThroughput metrics helps you identify potential network bottlenecks.

Network out

The network traffic transmitted by the MySQL instance per second.

Free storage space

The CloudWatch FreeStorageSpace metric lets you monitor how much of your allocated storage is still available. AWS recommends that RDS users take action to delete unneeded data or add more storage if disk usage consistently reaches levels of 85 percent or more.

RAM

The CloudWatch FreeableMemory metric tracks the available RAM.

Swap

The CloudWatch SwapUsage metric is used to monitor memory saturation, showing you how much swap is used on the instance.

Monitor RDS MySQL with Datadog

If you’d like to see this dashboard populated with your MySQL metrics, you can try Datadog for free for 14 days. This customizable dashboard will be displayed immediately after you set up the MySQL for RDS integration.

For a deep dive on RDS MySQL metrics and how to monitor them, check out our three-part How to Monitor RDS MySQL series.

 

 

RDS MySQL Dashboard