New monitor status page: From alert to investigation in one click
When you’re alerted to a problem in your infrastructure, the last thing you want to do is waste time wondering what to do next. That’s why we have just launched a new feature that lets you quickly get the lay of the land so you can take action. Our new monitor status page brings together a wealth of useful data about any automated monitor and the alerts it generates:
- Timeseries graph of the monitored metric or state
- Detailed history across monitored infrastructure
- Host map showing which hosts are alerting
- Related monitors and their status
The monitor status page is instantly accessible from any Datadog alert you receive, so you can use it to jump-start an investigation. But you can also access the status page when your systems are healthy to explore how often each monitor alerts and in which groups, and to see the detailed history of each monitor over the past days, weeks, or months.
See the history of all your monitors
What happened when?
The new monitor status page displays the history of your monitor and its associated metrics across any timeframe you choose. The history pane allows you to explore monitor trends, see instantly whether a particular issue is acute or chronic, or compare monitored groups with one another. Just as with any timeseries graph in Datadog, you can select an area of interest to zoom in, or pan across time to identify when symptoms first appeared. Uptime statistics show how often each monitored group has triggered an alert.
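To make the idea of an uptime statistic concrete, here is a minimal sketch of how such a figure could be computed for one monitored group. It assumes uptime means the share of the selected timeframe during which the group was not alerting. The function name `uptime_percent` and the list-of-intervals input are hypothetical and chosen for illustration; this is not Datadog's implementation or API.

```python
from datetime import datetime, timedelta


def uptime_percent(alert_intervals, window_start, window_end):
    """Return the share (0-100) of the window not covered by alert intervals.

    alert_intervals is a list of (start, end) datetime pairs for one group.
    Intervals may overlap; overlapping time is counted only once.
    This input format is an assumption made for the example.
    """
    window = (window_end - window_start).total_seconds()
    if window <= 0:
        raise ValueError("window_end must be after window_start")

    # Clip each interval to the window and drop intervals outside it.
    clipped = sorted(
        (max(start, window_start), min(end, window_end))
        for start, end in alert_intervals
        if end > window_start and start < window_end
    )

    # Merge overlapping intervals and total the alerting time.
    alerting = 0.0
    cur_start = cur_end = None
    for start, end in clipped:
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                alerting += (cur_end - cur_start).total_seconds()
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        alerting += (cur_end - cur_start).total_seconds()

    return 100.0 * (1 - alerting / window)


# Example: a 10-hour window with two overlapping alerts covering 2 hours total.
t0 = datetime(2024, 1, 1)
alerts = [
    (t0 + timedelta(hours=1), t0 + timedelta(hours=2)),
    (t0 + timedelta(hours=1, minutes=30), t0 + timedelta(hours=3)),
]
example_uptime = uptime_percent(alerts, t0, t0 + timedelta(hours=10))
```

Running the same calculation per group gives a simple way to compare a chronically alerting group against its healthier peers.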
Is a single noisy host causing problems?
In a distributed environment, your monitors usually keep watch over many components of your infrastructure at once. Now you can easily isolate metrics and alert history from the hosts or groups that are currently alerting. You can also look at alerting hosts alongside their healthy peers to see, for instance, if one particular host has recently been alerting with anomalous frequency.
Dive into affected hosts
See the big picture
The monitor status page includes our recently introduced Host Map functionality to show which of the monitored hosts are alerting. You can see at a glance whether the issue is widespread or confined to a subset of your infrastructure. Size each of your hosts by load or CPU usage to quickly identify overburdened hosts.
Nothing happens in isolation
To find the root cause of a performance issue, you’ll often need to look at the problem from multiple angles. The monitor status page automatically surfaces all related monitors tracking similar metrics or similar hosts, so that you can quickly determine whether other symptoms are emerging that might help you diagnose the problem.
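As a rough illustration of what "related" could mean here, the sketch below treats two monitors as related when they track the same metric or share at least one host. Both the matching rule and the dictionary layout (`name`, `metric`, `hosts`) are assumptions made for this example; the status page's actual matching logic is not described in this post.

```python
def related_monitors(target, monitors):
    """Return names of monitors that share a metric or a host with target.

    Each monitor is a dict with hypothetical keys:
    "name" (str), "metric" (str), and "hosts" (iterable of host names).
    """
    target_hosts = set(target["hosts"])
    related = []
    for monitor in monitors:
        if monitor["name"] == target["name"]:
            continue  # a monitor is not related to itself
        same_metric = monitor["metric"] == target["metric"]
        shared_hosts = bool(target_hosts & set(monitor["hosts"]))
        if same_metric or shared_hosts:
            related.append(monitor["name"])
    return related


# Example data, invented for illustration only.
all_monitors = [
    {"name": "High CPU on web", "metric": "system.cpu.user", "hosts": ["web-1", "web-2"]},
    {"name": "High CPU on db", "metric": "system.cpu.user", "hosts": ["db-1"]},
    {"name": "Disk full on web", "metric": "system.disk.in_use", "hosts": ["web-2"]},
    {"name": "Queue depth", "metric": "queue.depth", "hosts": ["worker-1"]},
]
example_related = related_monitors(all_monitors[0], all_monitors)
```

In this example the first monitor is related to "High CPU on db" through the shared metric and to "Disk full on web" through the shared host, while "Queue depth" has nothing in common with it.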
A new home for your monitor data
We are confident that, by displaying all the data about a monitor in one place, the new status page will help you better understand your monitors and the metrics they’re tracking. The new monitor status page is available for you to explore in Datadog today. Current users can access the status page by clicking the name of any monitor or following the link in any monitor event notification. Not a Datadog customer yet? Sign up for a free trial.