Introducing Updog.ai: Real-time provider status from Datadog

Brianne Bujnowski

Hugo Puceat

When external SaaS providers or cloud services degrade or go down, engineers often find themselves wondering if the issue they’re encountering is local or more widespread. The answers they find are usually slow to surface, limited in detail, or entirely dependent on the provider’s updates. Vendor-controlled status pages and third-party aggregators don’t provide the timely, independent visibility that’s necessary to quickly and accurately identify the root cause of slowdowns.

Updog.ai is a free, public web page from Datadog that shows the live health status of 30+ popular SaaS providers (such as OpenAI, Zoom, and GitHub) and 13 AWS services. Instead of depending on provider updates, Updog.ai is powered by aggregated, anonymized observability data and AI models. Anyone, not just Datadog customers, can now access independent, real-time visibility into the status of the services they depend on, all in one place.

What’s Updog.ai?

Updog.ai is a public web page that provides a single dashboard for monitoring the near real-time health of major SaaS APIs and AWS services. Coverage includes widely used platforms like OpenAI, GitHub, Slack, Stripe, ServiceNow, Zendesk, and Zoom, as well as AWS services such as Amazon S3, AWS Lambda, and Amazon DynamoDB.

Updog.ai turns anonymized telemetry data from thousands of environments into near real-time status updates, highlighting performance issues or outages as they emerge. Engineers can quickly verify whether a problem is part of a broader incident or confined to their own systems, without waiting on vendor-maintained status pages.

Updog.ai showing live status of major SaaS providers and AWS services.

Updog.ai also offers historical views with up to 90 days of degradation history, which make it easier to identify recurring reliability issues, such as API disruptions that consistently affect customer checkouts. Teams can use these insights to make informed architectural decisions and improve fault tolerance.

Extending observability beyond customer environments

Observability has traditionally been bound by the walls of individual systems, with teams focused on what they could measure within their own environments. Datadog is redefining that boundary by collecting and correlating telemetry data across the entire breadth of our products and customer base. With one of the world’s largest and most diverse streams of telemetry data, we can apply AI models that identify patterns and risks that no single organization can see on its own. This represents a shift from simply helping customers manage their environments to creating shared intelligence.

Updog.ai is an expression of this shift. By analyzing Application Performance Monitoring (APM) data across thousands of organizations, it surfaces systemic error signals that individual teams cannot detect in isolation. In doing so, Updog.ai not only serves engineers in their own environments but also supports the broader community in navigating provider reliability.

Real-time updates powered by telemetry data and AI

Updog.ai builds on the foundation of Datadog’s External Provider Status in-app feature by using:

Aggregated, anonymized APM telemetry data from thousands of organizations
A Bayesian model that infers abnormal error rates across independent customer environments
Correlation across customers and regions to confirm whether degradations are systemic

This approach enables Datadog to detect issues before they appear on vendor-controlled status pages. For example, Updog.ai recently surfaced an Amazon DynamoDB degradation 32 minutes before AWS updated its own status page. The result is a reliable, AI-driven signal that reflects the real-world experience of users around the globe.
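To make the last two items in the list above more concrete, here is a minimal sketch in Python. It is not Datadog's implementation: the post does not describe the model's details, so the sketch assumes a simple Beta-Binomial model with a uniform prior, and the function names, the 2% baseline error rate, the 95% confidence level, and the 50% agreement threshold are all hypothetical choices made for illustration.

```python
import math


def beta_tail_prob(alpha, beta, threshold, steps=20000):
    """Approximate P(rate > threshold) for a Beta(alpha, beta) distribution.

    Uses midpoint-rule integration of the Beta density on (threshold, 1),
    so it needs only the standard library.
    """
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    width = (1.0 - threshold) / steps
    total = 0.0
    for i in range(steps):
        x = threshold + (i + 0.5) * width  # midpoint, strictly inside (0, 1)
        log_pdf = log_norm + (alpha - 1) * math.log(x) + (beta - 1) * math.log(1 - x)
        total += math.exp(log_pdf)
    return min(1.0, total * width)


def org_is_abnormal(errors, requests, baseline_rate,
                    prior=(1.0, 1.0), confidence=0.95):
    """Flag one organization if its error rate is very likely above baseline.

    Assumption: errors ~ Binomial(requests, rate) with a Beta prior on rate,
    which gives the posterior Beta(prior_a + errors, prior_b + successes).
    """
    alpha = prior[0] + errors
    beta = prior[1] + (requests - errors)
    return beta_tail_prob(alpha, beta, baseline_rate) >= confidence


def is_systemic(observations, baseline_rate, min_fraction=0.5):
    """Treat a degradation as systemic when enough independent orgs agree.

    `observations` is a list of (errors, requests) pairs, one per organization.
    """
    if not observations:
        return False
    flagged = sum(org_is_abnormal(e, n, baseline_rate) for e, n in observations)
    return flagged / len(observations) >= min_fraction


if __name__ == "__main__":
    # Hypothetical (errors, requests) counts for calls to a single provider.
    widespread = [(50, 1000), (40, 800), (3, 500), (60, 1200)]
    isolated = [(50, 1000), (2, 800), (3, 500), (5, 1200)]
    print("widespread systemic:", is_systemic(widespread, baseline_rate=0.02))
    print("isolated systemic:", is_systemic(isolated, baseline_rate=0.02))
```

The idea behind the two stages is that a single organization's error spike may come from its own deployment, while elevated error rates seen independently in many environments point to the provider. A production system would presumably also account for regions, time windows, and per-provider baselines, none of which are modeled here.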

Example of DynamoDB degradation detected by Updog.ai before AWS updates.

What’s next: GPU availability monitoring and beyond

This iteration of Updog.ai is just the first step. Over time, its scope will expand beyond availability to include real-time updates for systemic risks, including:

GPU availability monitoring, which will enable AI infrastructure teams to plan their workloads
Spot interruption monitoring, which will enable infrastructure teams to anticipate spot interruptions and run workloads with extra resilience
Cyber attack and vector monitoring, which will provide a view of global malicious actors and the most frequently used attack vectors

Built on anonymized observability data and AI at internet scale, Updog.ai is a comprehensive public resource for real-time service transparency.

Get started with Updog.ai today

Visit Updog.ai today to check the live status of major providers for free. No Datadog account is required. To gain visibility into how these outages impact your own services, explore these features within Datadog by signing up for a 14-day free trial.
