PayPal | Datadog

Case Study

PayPal scales observability across global payment applications with Datadog and OpenTelemetry

About PayPal

PayPal has been at the forefront of the digital commerce revolution for more than 25 years, growing into a two-sided network that connects people and businesses in approximately 200 markets worldwide.

Financial services
24,000 Employees
San Jose, California
RapDev
“Datadog's support for interoperability with OpenTelemetry proved to be the best solution for the level of scale, observability, and support we need.”
Tapan Sanghvi, Director of Software Development, PayPal

Why Datadog?

  • Enabled a simplified company-wide OTel consolidation
  • Superior latency measurements and visualization capabilities
  • Intuitive interface reduced learning curve for engineering teams
  • Comprehensive self-service dashboarding and alerts out of the box
  • Strong strategic partnership with direct access to Datadog’s product teams and training resources

Challenge

PayPal needed to implement comprehensive observability across its massive multi-cloud infrastructure while maintaining platform independence and supporting rapid innovation across multiple brands.

Key Results

  • 24/7 monitoring of critical infrastructure and apps
  • Enables real-time visibility into merchant and customer health
  • Reduction in time to insights to identify and fix issues quickly
  • Reduces debugging time through unified metrics and traces

Seeking consolidated observability across a multi-cloud, multi-brand infrastructure

PayPal is a global leader in digital payments, operating one of the world’s largest fintech infrastructures. The organization currently serves hundreds of millions of consumers and tens of millions of merchants across multiple brands including Venmo, Braintree, Honey, and Xoom. 

PayPal’s vast technology stack spans multiple cloud providers (Google Cloud, Microsoft Azure, and Amazon Web Services) and on-premises data centers. The company monitors more than 70,000 physical hosts and 400,000+ container hosts, which together generate more than 200 million metrics per minute.

As PayPal grew its highly complex architecture over the years, it needed an observability platform that could handle enterprise-level scale and multilayered infrastructure. With millions of transactions flowing through interconnected brands, multiple cloud providers, and an intricate service mesh, PayPal required a solution that could provide flexible instrumentation and end-to-end monitoring. “Most of our products span across brands,” says Tapan Sanghvi, director of software development. “When you pay with Venmo, it actually goes through PayPal and Braintree and other brands. We needed to understand what happens end-to-end so we could identify where failures were occurring.”
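To make the end-to-end requirement concrete, the following is a minimal sketch of how one trace ID can be carried across service hops using the W3C Trace Context `traceparent` header, the propagation format OpenTelemetry SDKs use by default. It is an illustration only, not PayPal's implementation: the service names and helper functions are hypothetical, and a real deployment would rely on an OpenTelemetry SDK rather than hand-built headers.

```python
import re
import secrets

# W3C Trace Context header layout: version-traceid-parentid-flags
TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def new_traceparent(sampled: bool = True) -> str:
    """Start a new trace at the first service that sees the request."""
    trace_id = secrets.token_hex(16)  # 32 hex characters
    span_id = secrets.token_hex(8)    # 16 hex characters
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def child_traceparent(incoming: str) -> str:
    """Continue a trace: keep the trace ID and flags, mint a new span ID."""
    match = TRACEPARENT_RE.match(incoming)
    if match is None:
        # Unparseable header: fall back to starting a fresh trace.
        return new_traceparent()
    _version, trace_id, _parent_id, flags = match.groups()
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"


def trace_id_of(header: str) -> str:
    """Extract the trace ID that ties every hop together."""
    match = TRACEPARENT_RE.match(header)
    if match is None:
        raise ValueError(f"invalid traceparent: {header!r}")
    return match.group(2)


if __name__ == "__main__":
    # Hypothetical hop sequence for a single payment request.
    header = new_traceparent()
    for service in ["venmo-app", "paypal-core", "braintree-gateway"]:
        print(f"{service:20s} {header}")
        header = child_traceparent(header)
```

Because every hop reuses the same trace ID, a backend can join the spans emitted by each brand's services into one trace and show where in the chain a failure occurred.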

Additionally, PayPal wanted deeper client-side observability to monitor mobile and web applications and support its rapid innovation. “PayPal moves at a very fast pace and we are trying to get very granular into every single action that the user is taking,” says Sanghvi. “Whether that’s clicking on a button or a link, or how much time they’re spending—we want that real-time visibility.”

Standardizing observability across a global enterprise with OpenTelemetry

PayPal adopted OpenTelemetry (OTel) as the vendor-neutral industry-standard framework to collect telemetry data. Datadog—with its native and hybrid support for OTel and unified, scalable platform—emerged as the ideal solution for PayPal’s observability needs. “OTel provides flexibility to choose observability solutions that fit our scale, performance, and cost needs,” says Sanghvi. “Datadog’s support for interoperability with OTel proved to be the best solution for the level of scale, observability, and support we need.”

Datadog’s out-of-the-box visualizations, dashboards, and self-service analytics offered significant improvements over PayPal’s previous observability solution. Datadog’s OTel support would also allow PayPal to ingest massive volumes of metrics and traces from its multilayered architecture, supporting full-stack observability.

To meet its complex needs, PayPal implemented a hybrid observability strategy, using OTel collectors for its core brand and Datadog agents for other brands. This approach included a custom OTel pipeline for data sampling and processing, which ensured that OTel data flowed into Datadog without major rework. The tight integration between Datadog Application Performance Monitoring (APM) and Real User Monitoring (RUM), as well as critical functionalities like seamless metric-to-trace linking, enabled end-to-end observability and allowed PayPal to easily correlate frontend behavior and business impact with deep backend performance data across its apps and services.
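The case study does not describe how PayPal's custom pipeline samples data, so the following is only a sketch of one common approach, under stated assumptions: a deterministic, trace-ID-based probabilistic sampler that also retains every span of any trace containing an error. The `Span` type, `keep_trace`, and `sample` are hypothetical names introduced for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    trace_id: str          # 32-character lowercase hex string
    service: str
    is_error: bool = False


def keep_trace(trace_id: str, sample_rate: float) -> bool:
    """Deterministic decision: a given trace ID always gets the same answer.

    The low 56 bits of the trace ID are treated as a uniform random number
    and compared against a threshold derived from the sample rate.
    """
    value = int(trace_id[-14:], 16)          # 14 hex chars = 56 bits
    threshold = int(sample_rate * (1 << 56))
    return value < threshold


def sample(spans: list[Span], sample_rate: float) -> list[Span]:
    """Keep sampled traces, plus all spans of any trace that had an error."""
    error_traces = {s.trace_id for s in spans if s.is_error}
    return [
        s for s in spans
        if s.trace_id in error_traces or keep_trace(s.trace_id, sample_rate)
    ]


if __name__ == "__main__":
    batch = [
        Span("0" * 32, "checkout"),
        Span("f" * 32, "checkout"),
        Span("e" * 32, "wallet"),
        Span("e" * 32, "risk", is_error=True),
    ]
    for span in sample(batch, sample_rate=0.5):
        print(span)
```

Deciding from the trace ID rather than a random draw means every collector that sees spans from the same trace reaches the same verdict, which keeps sampled traces complete across services.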

Improving performance and reliability

Datadog’s ability to seamlessly correlate all telemetry and visualize complex service relationships now provides PayPal with deeper visibility into system behavior. This has allowed engineers to move faster, spending less time debugging fragmented systems and more time improving performance and reliability. With Datadog and OTel, observability “becomes a more consolidated solution across PayPal, reducing the learning curve while maintaining consistent use of industry standards,” says Sanghvi.

Beyond technical capabilities, PayPal now views Datadog as a strategic partner for its observability needs. “The access to Datadog’s engineering and product teams was extremely important in the adoption process,” says Sanghvi. 

Datadog further supported PayPal with office hours for live troubleshooting and comprehensive training programs tailored to OpenTelemetry adoption. To ensure smooth adoption across thousands of engineers, PayPal developed a “train the trainer” program, appointing observability champions across 400 application teams to accelerate learning and drive adoption. “All of those things—the trainings, self-service, prior knowledge, office hours—made the adoption of Datadog easy,” Sanghvi explains.

Enabling a rapid pace of innovation

Today, PayPal has successfully transformed its global monitoring capabilities, enabling comprehensive visibility across its vast infrastructure and applications while supporting PayPal engineers daily. “Pretty much everybody now uses Datadog in some way, shape, or form,” notes Sanghvi.

The Datadog platform is also used by other teams across the organization. For example, customer service agents use it to understand merchant and customer health, while product managers track and analyze new product launches, measuring conversion rates, error reduction, and latency improvements. Meanwhile, the Operations team and PayPal Command Center rely on Datadog 24/7 to monitor site health and track systemwide metrics like failed customer interactions across different availability zones and data centers. 
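As a rough illustration of tracking a metric such as failed customer interactions across availability zones, the sketch below groups tagged events by one tag and computes a failure rate per group. The event shape, tag names, and `failure_rate_by_tag` are assumptions made for this example; they do not reflect PayPal's data model or a Datadog API.

```python
from collections import Counter


def failure_rate_by_tag(events: list, tag: str) -> dict:
    """Return the failure rate per value of `tag` (for example, per zone)."""
    totals: Counter = Counter()
    failures: Counter = Counter()
    for event in events:
        key = event["tags"].get(tag, "unknown")
        totals[key] += 1
        if event["failed"]:
            failures[key] += 1
    return {key: failures[key] / totals[key] for key in totals}


if __name__ == "__main__":
    # Hypothetical interaction events carrying infrastructure tags.
    events = [
        {"tags": {"zone": "zone-a", "datacenter": "dc-1"}, "failed": True},
        {"tags": {"zone": "zone-a", "datacenter": "dc-1"}, "failed": False},
        {"tags": {"zone": "zone-b", "datacenter": "dc-2"}, "failed": False},
    ]
    print(failure_rate_by_tag(events, "zone"))
    print(failure_rate_by_tag(events, "datacenter"))
```

The same events can be sliced by any tag they carry, which is what lets one metric answer both zone-level and data-center-level questions.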

Sanghvi says he especially appreciates Datadog’s self-service dashboarding and metrics analysis capabilities. “It gives me quick access to understand all of the metrics that are going in and what tags are associated with them. The navigation from metrics to traces is available across the platform and really speeds up root-cause analysis for our teams.”  

The Datadog platform has proven particularly valuable for high-stakes initiatives. When PayPal recently launched its Modular Checkout transformation, engineers quickly implemented Datadog Real User Monitoring (RUM) to track the rollout. “This was the highest visibility for PayPal’s most senior leaders, and we needed observability into the real-user impact of the deployment,” notes Sanghvi. 

Since the Modular Checkout launch, RUM adoption has expanded to more use cases, helping identify insights not possible before. “Feedback from early adopters is positive, as they can see a significant reduction in Time to Insights to identify and fix issues quickly, enabling a continued world-class experience for our users,” says Sanghvi. 

Going forward, he notes that Datadog’s forward-thinking approach will be valuable as PayPal explores technologies like generative AI and serverless computing. “Datadog is already thinking about enabling observability that we may need six months from now,” says Sanghvi.

Perhaps most importantly, Sanghvi says the Datadog platform has enabled PayPal to maintain its rapid pace of innovation. “Being able to provide observability across all of these applications and infrastructure has been the biggest win,” he says. “If we had to build observability for these products and applications from scratch, it would take us a long time. Being able to rely on Datadog for these capabilities helps us continue to move at a fast pace where we are not blocked by observability.”

Resources

  • Official docs: Getting started with OpenTelemetry and Datadog
  • Solutions: OpenTelemetry
  • Blog: Unify OpenTelemetry and Datadog with the Datadog Distribution of the OTel Collector
  • Blog: How to select your OpenTelemetry deployment