
Monitor CockroachDB performance metrics with Datadog

Author: Jordan Obey

Published: February 12, 2019

We are excited to announce a new integration with CockroachDB, an open source distributed SQL database. CockroachDB guarantees ACID transactions and aims to make it easy to scale horizontally by adding nodes instead of manually sharding the database. Built to be resilient (much like its namesake insect) and highly available as it scales, CockroachDB recovers from node failures by repairing and rebalancing itself automatically. To help you maintain high performance, you can now monitor CockroachDB clusters with Datadog.

Visualize and alert on hundreds of CockroachDB metrics

Because the Datadog Agent comes with a CockroachDB check already included, you can set up the integration without an additional installation. Once you’ve completed the integration, the Datadog Agent automatically collects hundreds of metrics to help you track the overall performance of your CockroachDB cluster. By monitoring your database with Datadog, you can ensure that it is always available and serving queries, and that it has sufficient resources to maintain high levels of performance.

All the metrics collected from your CockroachDB database are available for alerting, correlation, and graphing on customizable dashboards.

[Image: CockroachDB performance metrics dashboard]
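
As a rough illustration of the kind of data being collected: CockroachDB nodes expose their metrics in Prometheus text format over HTTP (at `/_status/vars`). The sketch below is not the Agent's implementation. It parses a hand-written sample payload in that format, and the helper name `parse_prometheus_text` and the sample values are hypothetical.

```python
def parse_prometheus_text(text):
    """Parse samples from Prometheus text exposition format.

    Returns a dict mapping each series name (including any label set)
    to its float value. Comment lines (# HELP, # TYPE) are skipped.
    This sketch assumes samples carry no trailing timestamp.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated token on the line.
        series, _, value = line.rpartition(" ")
        try:
            samples[series] = float(value)
        except ValueError:
            continue  # ignore lines that do not end in a number
    return samples


# Hand-written sample payload (illustrative values, not real output).
SAMPLE = """\
# TYPE sql_select_count counter
sql_select_count 1523
# TYPE liveness_livenodes gauge
liveness_livenodes 3
"""

metrics = parse_prometheus_text(SAMPLE)
```

In practice the Agent check does this collection for you; the sketch only shows the shape of the data behind the dashboards.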

Monitor data store workload

For virtually any data store use case, you will want to closely monitor the workload. By monitoring CockroachDB query throughput, you can track high-level database utilization and watch for sudden changes (especially drops) that might be symptomatic of a problem. To avoid missing unexpected behavior, you can set up an automated alert in Datadog that detects when query throughput falls outside a forecasted range.

[Image: CockroachDB query throughput]
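
To make the idea of a throughput alert concrete, here is a minimal sketch. It turns cumulative query counts into per-second rates and flags a value that falls far outside the recent range. It uses a simple mean and standard deviation band as a stand-in and is not Datadog's forecasting algorithm; the function names and sample numbers are hypothetical.

```python
from statistics import mean, stdev


def per_second_rates(samples):
    """Convert (timestamp_seconds, cumulative_count) pairs into per-second rates."""
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        # Skip counter resets and non-increasing timestamps.
        if t1 > t0 and c1 >= c0:
            rates.append((c1 - c0) / (t1 - t0))
    return rates


def outside_expected_range(history, latest, k=3.0):
    """Return True if `latest` is more than k standard deviations from the mean of `history`."""
    if len(history) < 2:
        return False  # not enough data to estimate a range
    mu, sigma = mean(history), stdev(history)
    return abs(latest - mu) > k * sigma


# Illustrative samples: steady ~100 queries/s, then a sharp drop.
samples = [(0, 0), (10, 1000), (20, 2010), (30, 2990), (40, 4000), (50, 4100)]
rates = per_second_rates(samples)
dropped = outside_expected_range(rates[:-1], rates[-1])
```

Here the last interval's rate is far below the preceding ones, so `dropped` is `True`.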

Track high-level cluster health

The Datadog integration with CockroachDB also provides higher-level metrics describing the state of your database infrastructure, such as the count of live nodes in your CockroachDB cluster. These distributed nodes store data as a map of key-value pairs, which is divided into ranges. To maintain consistency, CockroachDB uses the Raft consensus protocol, which requires a majority of a range's replicas to be available in order to confirm data store changes. With the default of three replicas per range, a cluster needs at least three nodes to survive the loss of one. Ensure there are enough live nodes by setting up an alert to automatically notify you if the count drops unexpectedly.

[Image: CockroachDB live nodes]
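
The quorum arithmetic behind Raft can be sketched as follows. Raft commits a change once a majority of a group's replicas acknowledge it. The example assumes three replicas per range, which is CockroachDB's default replication factor, and the function names are hypothetical.

```python
def raft_quorum(replicas):
    """Smallest number of replicas that forms a majority."""
    return replicas // 2 + 1


def tolerable_failures(replicas):
    """How many replicas can be lost while a majority remains."""
    return replicas - raft_quorum(replicas)


def can_commit(live_replicas, replicas):
    """True if enough replicas are live to reach a majority."""
    return live_replicas >= raft_quorum(replicas)


# Assumption: three replicas per range.
quorum_3 = raft_quorum(3)
failures_3 = tolerable_failures(3)
```

With three replicas, a range keeps accepting writes after losing one replica but not two, which is why an unexpected drop in live nodes is worth alerting on.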

Add thresholds to your dashboards

You can set up dashboard graphs and counters with thresholds that mark when a metric is above or below a specified value, so you know when your database needs attention. For example, you can take the metric tracking available storage capacity and divide it by the database’s total capacity to quickly calculate the percentage of disk space you have left. You can then use conditional formatting to display the resulting percentage in red if it falls below an acceptable threshold.

[Image: CockroachDB available capacity]
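
The arithmetic described above is simple enough to sketch directly. The function names, the 20 percent threshold, and the byte counts below are hypothetical choices for illustration; in Datadog you would express the same calculation as a dashboard query with conditional formatting.

```python
def percent_available(available_bytes, total_bytes):
    """Percentage of total storage capacity that is still available."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * available_bytes / total_bytes


def threshold_color(percent, red_below=20.0):
    """Mimic conditional formatting: red under the threshold, green otherwise."""
    return "red" if percent < red_below else "green"


# Illustrative values: 150 GB free out of 1000 GB.
pct = percent_available(150 * 10**9, 1000 * 10**9)
color = threshold_color(pct)
```

With these sample values, 15 percent of capacity remains, which falls under the assumed 20 percent threshold and would display in red.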

Monitor database resource utilization

The Agent monitors user CPU time and percentage to measure how busy your database server is. By tracking CPU utilization alongside network throughput and other resource metrics, you can ensure that CockroachDB has sufficient capacity to serve queries quickly, and that you can efficiently troubleshoot any issues that arise.

[Image: CockroachDB CPU time]
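
For readers curious how a cumulative CPU time counter relates to a utilization percentage, here is a minimal sketch. It assumes the counter reports user CPU time in nanoseconds and that 100 percent means one fully busy core; the function name and sample readings are hypothetical.

```python
def cpu_percent(cpu_ns_start, cpu_ns_end, wall_seconds):
    """User CPU utilization over an interval, as a percentage of one core.

    cpu_ns_start / cpu_ns_end: cumulative user CPU time in nanoseconds
    wall_seconds: elapsed wall-clock time between the two readings
    """
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return 100.0 * (cpu_ns_end - cpu_ns_start) / (wall_seconds * 1e9)


# Illustrative readings: 5 seconds of CPU time consumed over 10 seconds.
usage = cpu_percent(0, 5_000_000_000, 10)
```

A value above 100 is possible on multi-core hosts under this convention, since several cores can accumulate CPU time at once.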

Monitor CockroachDB performance in context

We are pleased to add CockroachDB to the growing list of supported integrations in Datadog. With the new integration, you can monitor hundreds of performance and usage metrics from CockroachDB alongside monitoring data from 450+ other technologies, including Amazon Web Services, Kubernetes, and NGINX. Datadog brings together metrics, logs, and distributed tracing to give you a comprehensive view of all the technologies in your stack, enabling you to monitor CockroachDB and all the applications that depend on it.

If you aren’t already using Datadog, get started with a 14-day free trial.