Actionable Alerting

Datadog’s robust alerting capabilities are crucial for the operations team here at Segment. Our team needs to understand the difference between a minor concern and something that needs all hands on deck.

Calvin French-Owen

Co-Founder, Segment

Being able to quickly update alerts and having so many monitors managed so effectively via the API has been very big for us—it's meant that we're very proactive about getting alerted to any system issues before they affect our users.

Aaron Webber

Software Engineer, Nextdoor

Datadog helped us utilize Site Reliability Engineering concepts, allowing us to implement meaningful SLIs and SLOs. We now have 10 times the observability as before at less than half the cost. It just worked.

Matt Ball

Chief Technology Officer, ParkMobile

Feature Overview

Datadog alerts use tags and machine learning to efficiently identify problems in your infrastructure, applications, and services. Every alert is specific, actionable, and contextual, even in large-scale and highly ephemeral environments, which helps minimize downtime and prevent alert fatigue. And with native SLO and SLA tracking, you can prioritize and address the issues that matter most to your business.

Monitor ephemeral systems without fatigue

Use tags to create targeted alerts on sets of hosts, containers, or any other component of your applications or infrastructure, including serverless functions
Automatically apply alerts to new hosts so you can scale up your environment without blind spots
Combine multiple alerts with sophisticated composite monitors to minimize alert noise

Create targeted alerts on any component of your applications or infrastructure.
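
As a rough illustration of tag-scoped and composite alerts, the sketch below builds two request bodies of the kind you could send to Datadog's Monitors API. The field names (name, type, query, message, tags, options) and the query syntax follow the public API as commonly documented, but treat them as assumptions and check the current API reference before use. The metric name, tags, monitor IDs, and the helper functions themselves are placeholders made up for this example, and nothing here makes a network call.

```python
import json


def metric_monitor(name, metric, scope_tags, group_by, threshold, message):
    """Build a request body for a tag-scoped metric monitor (illustrative helper)."""
    scope = ",".join(scope_tags)    # e.g. "env:prod,service:checkout"
    groups = ",".join(group_by)     # e.g. "host"
    # Grouping by a tag key evaluates each group separately, so a new host
    # that carries the scope tags is covered without editing the monitor.
    query = f"avg(last_5m):avg:{metric}{{{scope}}} by {{{groups}}} > {threshold}"
    return {
        "name": name,
        "type": "metric alert",
        "query": query,
        "message": message,
        "tags": list(scope_tags),
        # The critical threshold is kept in sync with the value in the query.
        "options": {"thresholds": {"critical": threshold}},
    }


def composite_monitor(name, monitor_ids, message, operator="&&"):
    """Build a request body that combines existing monitors by ID (illustrative helper)."""
    query = f" {operator} ".join(str(monitor_id) for monitor_id in monitor_ids)
    return {"name": name, "type": "composite", "query": query, "message": message}


if __name__ == "__main__":
    cpu = metric_monitor(
        name="High CPU on checkout hosts",
        metric="system.cpu.user",                      # placeholder metric
        scope_tags=["env:prod", "service:checkout"],   # placeholder tags
        group_by=["host"],
        threshold=90,
        message="CPU is above 90% on {{host.name}}.",
    )
    both = composite_monitor(
        name="CPU high AND error rate high",
        monitor_ids=[1234, 5678],                      # placeholder monitor IDs
        message="Both conditions are firing at once.",
    )
    print(json.dumps(cpu, indent=2))
    print(json.dumps(both, indent=2))
```

The composite body only notifies when both underlying monitors are alerting, which is one way to cut noise from conditions that are harmless on their own.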

Ensure optimal performance with machine learning-powered alerts

Use Watchdog to automatically detect anomalies within your infrastructure, applications, and services
Conduct root cause analysis faster by viewing intelligently grouped anomalies, metrics, and stack traces that are related to the surfaced issue
Start receiving Watchdog alerts immediately, without any setup or configuration

Never lose sight of your SLOs

Create, track, and report on critical SLOs and visualize them on dashboards using customizable widgets
Set SLO targets and thresholds to see at a glance which SLAs may be at risk
View rolling error budgets to better prioritize engineering efforts and deploy new code with confidence
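
To make the idea of an error budget concrete, here is a minimal sketch of the arithmetic behind a request-based SLO. It is not Datadog's implementation; the function name and the returned keys are made up for this example, and it assumes the SLO is expressed as a target percentage of good events over a rolling window.

```python
def error_budget(target_pct, total_events, bad_events):
    """Return how much of an SLO's error budget is left (illustrative helper).

    target_pct   -- SLO target, e.g. 99.9 for "99.9% of requests succeed"
    total_events -- all events observed in the rolling window
    bad_events   -- events in the window that violated the SLI
    """
    # The budget is the number of bad events the target still tolerates.
    allowed_bad = total_events * (100 - target_pct) / 100
    remaining = allowed_bad - bad_events
    remaining_pct = 100 * remaining / allowed_bad if allowed_bad else 0.0
    return {
        "allowed_bad": allowed_bad,
        "remaining": remaining,
        "remaining_pct": remaining_pct,
    }


if __name__ == "__main__":
    budget = error_budget(target_pct=99.9, total_events=1_000_000, bad_events=250)
    print(f"Allowed bad events: {budget['allowed_bad']:.0f}")
    print(f"Budget remaining:   {budget['remaining_pct']:.1f}%")
```

With a 99.9% target over one million requests, roughly 1,000 failures fit in the budget, so 250 failures leave about three quarters of it. A negative remaining value means the budget is already spent for the window.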

Investigate alerts on the go with the Datadog mobile app

View alerts and anomalous data patterns instantly on your mobile device with the Datadog Android and iOS apps
Understand the severity of incidents and monitor the health of all your services with real-time dashboards
Receive alerts anytime, anywhere through seamless mobile integrations with on-call notification services

Receive alerts anytime, anywhere with the Datadog mobile app.

Plug-and-play with your existing workflow

Easily route notifications directly to communication tools like Slack, Hangouts Chat, and Microsoft Teams
Automatically create and update custom Jira tickets so bugs don’t fall through the cracks
Send alerts to webhooks to enrich your existing workflows and trigger custom code
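
As one possible illustration of triggering custom code from a webhook, the sketch below decides what to do with an alert body posted to your own endpoint. The JSON keys ("transition", "priority") are hypothetical names that you would define yourself in the webhook's custom payload template; they are not a fixed Datadog schema. The routing outcomes are likewise made up for the example.

```python
import json


def route_alert(body: str) -> str:
    """Map a webhook body to an action (illustrative helper).

    Expects JSON with two hypothetical keys set in the payload template:
      "transition" -- e.g. "Triggered", "Warn", "Recovered"
      "priority"   -- e.g. "P1" through "P5"
    """
    event = json.loads(body)
    transition = str(event.get("transition", "")).lower()
    priority = str(event.get("priority", "")).upper()

    if transition == "recovered":
        return "resolve"   # close whatever was opened earlier
    if transition == "triggered" and priority in {"P1", "P2"}:
        return "page"      # wake someone up
    if transition in {"triggered", "warn"}:
        return "ticket"    # track it, but do not page
    return "ignore"        # unknown or informational transitions


if __name__ == "__main__":
    sample = json.dumps({"transition": "Triggered", "priority": "P1"})
    print(route_alert(sample))
```

Keeping the decision in a pure function like this makes it easy to test separately from whichever HTTP framework receives the request.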