Tiered alerts: Urgency-aware alerting

Author Evan Mouzakitis
@vagelim

Published: January 14, 2016

Datadog alerts are commonly used to identify dips, spikes, or unhealthy trends in your metrics. For instance, you might alert on memory rapidly running out, or a dramatic drop in requested work. Alerts are built to automatically notify individuals or teams via the communication tools you already use, such as PagerDuty or Slack.

But alerts are only effective when they reach the right people. With Datadog’s new tiered alerts, you can trigger alerts that go to different people, or via different channels, depending on the alert severity. Now you can set a single alert to send email or chat notifications for relatively low-urgency issues, and to page the engineer on call if the situation gets worse.

This new feature works with any integration that supports regular metric alerts. The screenshot below shows a tiered alert configured for the haproxy.frontend.denied.req_rate metric.

Defining tiered alerts

The monitor has two threshold values set: an alert threshold (50) and a warning threshold (25). In this example, when the average metric value rises above either threshold, the notification action tied to that threshold is triggered.
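To make the two-threshold behavior concrete, here is a minimal Python sketch of how an averaged value maps to a severity tier. It illustrates the logic only and is not Datadog's implementation; the `classify` function and the tier names are hypothetical, and the thresholds are the ones from the example monitor.

```python
# Thresholds taken from the example monitor in this post.
ALERT_THRESHOLD = 50
WARNING_THRESHOLD = 25


def classify(avg_value, alert=ALERT_THRESHOLD, warning=WARNING_THRESHOLD):
    """Return the severity tier for an averaged metric value.

    Hypothetical helper: the higher threshold is checked first so that a
    value above both thresholds is reported only as an alert.
    """
    if avg_value > alert:
        return "alert"
    if avg_value > warning:
        return "warning"
    return "ok"


if __name__ == "__main__":
    for value in (10, 30, 60):
        print(value, classify(value))
```

A value sitting exactly on a threshold stays in the lower tier here, because the text describes the metric rising above the threshold.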

The right alerts, to the right people

With tiered alerts, you do not need to create multiple monitors for the same metric. You simply configure different severity levels so that no one will be awakened in the middle of the night over a minor issue, and so that major issues will not be missed. With the right people getting the alerts over the right channels, you can reduce alert fatigue significantly!

Try it

To set up your own tiered alert, navigate to the monitors page and choose Metric as your monitor type.

New metric monitor

Choose the metric you want to alert on, and define threshold values that are appropriate for your setup.
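If you prefer to define monitors as code, the same choices can be sketched as a monitor definition. The Python snippet below only builds the definition as a dictionary; it does not call Datadog. The field names follow the shape of Datadog's monitor API as we understand it (where the alert threshold is named `critical`), so check them against the current API reference. The `build_tiered_monitor` helper, the `last_5m` evaluation window, and the monitor name are assumptions made for illustration.

```python
import json


def build_tiered_monitor(metric, alert, warning, window="last_5m", scope="*"):
    """Build a metric-monitor definition with alert and warning thresholds.

    Hypothetical helper. The evaluation window and scope are assumed
    defaults, not values taken from the post.
    """
    # The query carries the alert threshold; the warning threshold
    # is supplied separately under options.
    query = f"avg({window}):avg:{metric}{{{scope}}} > {alert}"
    return {
        "type": "metric alert",
        "name": f"Tiered alert on {metric}",
        "query": query,
        "message": "",  # filled in when you configure notifications
        "options": {"thresholds": {"critical": alert, "warning": warning}},
    }


monitor = build_tiered_monitor("haproxy.frontend.denied.req_rate", 50, 25)

if __name__ == "__main__":
    print(json.dumps(monitor, indent=2))
```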

Set alert conditions

Next, you need to notify the right people. For alerts, wrap your notification message and recipient list in {{#is_alert}} and {{/is_alert}} tags; for warnings use {{#is_warning}} and {{/is_warning}}. See the screenshot below for an example.

Alert the right people

In the above screenshot, you can see the ops channel in Slack will be notified when the metric value exceeds the warning threshold, and Evan will be notified by email when the metric value is greater than the alert threshold.
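For readers who cannot see the screenshot, the message it describes can be approximated in text. The Python sketch below assembles a notification message using the conditional tags named above. The message wording and the `@slack-ops` and `@evan@example.com` handles are placeholders rather than the exact contents of the screenshot, and `tier_block` is a hypothetical helper.

```python
def tier_block(tag, text, recipients):
    """Wrap a message and its recipients in one pair of conditional tags."""
    opening = "{{#" + tag + "}}"
    closing = "{{/" + tag + "}}"
    return f"{opening}\n{text} {' '.join(recipients)}\n{closing}"


# Placeholder wording and handles, chosen for illustration only.
message = "\n".join([
    tier_block("is_warning", "Denied request rate is elevated.", ["@slack-ops"]),
    tier_block("is_alert", "Denied request rate is critical.", ["@evan@example.com"]),
])

if __name__ == "__main__":
    print(message)
```

Keeping each tier's recipients inside its own tag pair is what routes warnings and alerts to different people from a single monitor.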

Already a Datadog customer? You can now set up tiered alerts on any of your integrations, or on your own custom metrics. Otherwise, try it out today by signing up for a free trial of Datadog.