Auth0 strengthens resiliency and service reliability with Datadog | Datadog
Testimonial

About Auth0

Okta provides identity and access management (IAM) solutions for enterprises, securing authentication, authorization, and single sign-on. Auth0, Okta’s customer IAM platform, helps organizations secure applications and manage customer logins at scale.

IT Security
8,000+ Employees
San Francisco
“By migrating our monitoring tools into Datadog, we're able to react faster, reduce time and cost spent on RCA, allowing us to innovate more than ever before.”
Andrew Yu, VP Engineering, Okta

Why Datadog?

  • Unified metrics, traces, and logs across its systems let engineers troubleshoot faster and collaborate more effectively
  • Intuitive dashboards made adoption seamless across teams
  • Comprehensive, cost-effective log retention eliminated data gaps
  • Real-time monitoring empowered engineers to detect issues before customers felt impact

Challenge

To uphold its 99.99% uptime promise and deliver seamless customer experiences, Auth0 set out to centralize its monitoring and log management across 30+ clusters—accelerating incident response and controlling costs at scale.

KEY RESULTS

  • 1,000+ hours saved in root cause analysis (RCA) annually
  • $488K in developer productivity gains: unified data access boosted engineering velocity
  • 94% faster log queries: Flex Logs cut query times from minutes to seconds
  • 19% cost savings on log management

Finding opportunities to strengthen reliability and customer service

Okta is a leading provider of identity and access management (IAM) services that helps enterprises across financial services, healthcare, retail, and government authenticate and secure workforce and customer logins. Auth0 is Okta’s IAM platform dedicated to helping developers secure customer-facing applications, manage logins, and administer access controls. Many customers also use Auth0 to support compliance efforts related to identity, privacy, and security under GDPR, HIPAA, SOC 2, ISO 27001, PCI DSS, and FedRAMP.

As a vital component of its customers’ security infrastructure, Auth0 maintains 99.99% availability of its core services and is always looking for ways to shorten mean time to detect and resolve issues before they impact end-user logins. “If we are down, our customers and their users feel the impact immediately,” notes Matt Drozdz, Senior Engineering Manager for Observability and AI Productivity.

Managing technology effectively at Auth0 is critical given the scale, complexity, and security demands of its environment. In addition to being a prime target for threat actors, Auth0’s infrastructure must continuously adapt to support frequent feature updates, evolving industry standards, and growing customer demand. To sustain this pace, engineers need complete visibility across their systems to act quickly and confidently.

Supporting 99.99% uptime with unified observability

Auth0 proactively identifies ways to make its environment more efficient, resilient, and secure. Delivering on its 99.99% uptime SLA to its customers—equating to only 52 minutes of downtime per year, or 4 minutes per month—requires engineers to detect and resolve issues in seconds, not minutes. “To meet this SLA, it’s critical for engineers to identify issues and resolve them as fast as possible,” says Andy Puch, Senior Software Engineer. “Every second counts.”
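The downtime figures above follow directly from the availability target. As a rough check, the short Python sketch below computes the budget; the function and constant names are illustrative assumptions, not anything taken from Auth0 or Datadog, and the year is assumed to have 365 days.

```python
# Downtime allowed by an availability target.
# A minimal sketch: names are illustrative, and a 365-day year is assumed.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_budget_minutes(availability_pct: float, period_minutes: float) -> float:
    """Return the minutes of downtime permitted over a period at a given availability."""
    return period_minutes * (1 - availability_pct / 100)


yearly = downtime_budget_minutes(99.99, MINUTES_PER_YEAR)
monthly = yearly / 12  # average month

print(f"per year:  {yearly:.1f} min")   # per year:  52.6 min
print(f"per month: {monthly:.1f} min")  # per month: 4.4 min
```

Rounded down, these match the roughly 52 minutes per year and 4 minutes per month quoted above.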

To achieve this, Auth0 launched a strategic, organization-wide initiative toward unified observability with Datadog. The first phase of the plan was to correlate trace data and infrastructure metrics. With metrics and traces integrated into a single platform, developers and monitoring users saw faster query times, which reduced monitoring toil and gave them back valuable time in their day. The team was able to refocus that time on innovation and other business-critical work.

Building on this initial success, the next phase was to bring in all of the team's logs without increasing costs. The team ingests between five and ten billion logs per month. Previously, to keep systems running, it either sampled the data or spread it across 30+ clusters. With Flex Logs, the team retained 100% of its log volume without increasing costs.

“For the first time with Flex Logs, we have comprehensive and affordable log retention and can investigate incidents in minutes, not hours,” explains Puch.

According to Andrew Yu, Vice President of Engineering, migrating Auth0’s logs was a strategic decision to improve developer productivity and customer experience. “By bringing our logging together with our metrics and tracing, our RCA costs have decreased and we can deploy faster than before,” says Yu.

The initiative also transformed how Auth0’s engineers work. In just six weeks, the observability team led a coordinated adoption effort across global engineering groups that included training users, hosting workshops, and tracking dashboard usage. Adoption was immediate and widespread, establishing unified observability as a shared practice across teams and maximizing the value of Auth0’s technology investment.

By consolidating all telemetry in one interface, Auth0 engineers dramatically improved operational efficiency.

How democratization of observability data led to IT transformation

With all logs, metrics, and traces centralized, engineers no longer had to wonder if they were missing data. Every event could be retained, searched, and analyzed instantly. “We now have the confidence to keep everything we need, not just a subset,” says Drozdz. “It’s transformed how our teams work, collaborate, and secure customer trust.”

That visibility had ripple effects.

Building resilience and trust at a global scale

With unified observability, Auth0 engineers are more agile. They are faster at detecting, mitigating, and preventing issues, and better equipped to build secure, high-performing identity products. As digital identity becomes more critical to every online interaction, that agility is essential.

“With AI agents acting autonomously, identity and observability are critical to secure trustworthy decision-making at scale,” says Okta CTO Bhawna Singh.

Okta and Auth0 are at the forefront of this shift, delivering secure identity management for a world where digital trust is essential. Auth0’s commitment to innovation and resilience continues to drive its success. With unified observability at its foundation, the company is shaping the future of secure, intelligent, and reliable digital identity.

Resources

Datadog Modern Log Management & Analytics