Introducing distribution metrics


Published: July 12, 2018

Moving up the stack

You’ve seen great new products from two of our pillars of observability, Trace and Logs. But I’m here to talk about that pillar of observability that you’ve been relying on since you first started using Datadog, Metrics.

Our customers started by sending us infrastructure metrics: CPU, RSS, queue length, database locks. But more and more, you’re thinking, “Up the stack.” How many people here…? But how do you measure things at arbitrarily high scales? What’s the average aggregate user experience for all of your users? How do those users behave? How much revenue do you realize from your investment in new product development, or expect to realize next quarter?

How many people here send Datadog custom metrics today? That’s great. That’s almost everybody here, and that’s because it’s a common pattern. You’re thinking, “Up the stack.” You’re thinking in terms of services, in terms of business metrics. Metrics that you have to measure globally. You have to measure time on page, dollars per customer, items bought, trips taken, widgets produced, data from fleets of mobile devices and smart sensors. And that’s why I’m excited to introduce global distributions.

Global distributions

Global distributions are a new metric type in Datadog that lets you accurately describe arbitrary tag-level objects, so you can compute, for example, the user experience for the 75th or 99th percentile of your users.

I’m gonna show you a quick demo here. I’ve already started sending a metric to Datadog, a distribution metric to Datadog. In this case, I’ve started sending purchases per customer. Here, I’ve selected my metric. And now, I’m gonna edit its behavior in the app. So, there are some physical attributes that I don’t necessarily care about for this purpose: which host serves the page that the customer lands on, which database shard provides that data. But what I do care about in this case is the campaign that I’m running or, you know, the geographical location that my customer is coming from, or whether they’re on mobile or my website.
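For readers following along, a distribution sample like the one in the demo could be submitted roughly as follows. This is a minimal sketch, not the code used in the talk. It assumes a Datadog Agent is listening for DogStatsD traffic on UDP port 8125 of the local machine, and it uses the DogStatsD datagram type `d` for distributions. The metric name `shop.purchases_per_customer` and the tag values are hypothetical.

```python
import socket


def format_distribution(name, value, tags=None):
    """Build a DogStatsD datagram for one distribution sample.

    The shape is <name>:<value>|d, optionally followed by |#tag1,tag2.
    """
    datagram = f"{name}:{value}|d"
    if tags:
        datagram += "|#" + ",".join(tags)
    return datagram


def send_distribution(name, value, tags=None, host="127.0.0.1", port=8125):
    """Send one sample over UDP to a local Agent (assumed address and port)."""
    payload = format_distribution(name, value, tags).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


if __name__ == "__main__":
    # Hypothetical metric name and tags: one customer made 3 purchases.
    send_distribution(
        "shop.purchases_per_customer",
        3,
        tags=["campaign:summer_sale", "platform:mobile", "geo:emea"],
    )
```

Tags such as the host or database shard can still be attached at submission time. Which tags are kept for aggregation is then chosen in the app, as the demo shows.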

So now, I’ve applied that campaign tag and I’m gonna add an aggregation for this metric, and aggregate globally for the purchases per customer. I’m gonna bring up my dashboard and I’m gonna add a time series widget to use that metric. Here, I’ll select the metric. And again, I’m gonna be able to aggregate globally over those tags that I’ve chosen without worrying about what’s going on underneath. Here, I can see that I can aggregate at the 99th percentile or the 75th percentile or 50th percentile of my customers. If I were to track every single customer, I wouldn’t be able to see the forest for the trees. But if I were to take a subset of my customers, say, my most lucrative or the slowest loading pages, then I would get a warped view of my business.
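To make the idea of aggregating globally over a chosen tag concrete, here is a small sketch that groups raw samples by a `campaign` tag, ignores the `host` tag entirely, and computes nearest-rank percentiles per group. It is an illustration only: it works on exact values held in memory, which is an assumption for the example and not a description of how Datadog computes percentiles. The sample data, tag names, and function names are all hypothetical.

```python
import math
from collections import defaultdict


def percentile(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def percentiles_by_tag(samples, tag, ps=(50, 75, 99)):
    """Group (value, tags) samples by one tag and compute percentiles per group."""
    groups = defaultdict(list)
    for value, tags in samples:
        groups[tags[tag]].append(value)
    return {key: {p: percentile(vals, p) for p in ps} for key, vals in groups.items()}


# Hypothetical purchases-per-customer samples reported by two hosts.
samples = [
    (2, {"campaign": "organic", "host": "web-1"}),
    (3, {"campaign": "organic", "host": "web-2"}),
    (3, {"campaign": "organic", "host": "web-1"}),
    (4, {"campaign": "organic", "host": "web-2"}),
    (0, {"campaign": "summer_sale", "host": "web-1"}),
    (3, {"campaign": "summer_sale", "host": "web-2"}),
    (3, {"campaign": "summer_sale", "host": "web-1"}),
    (12, {"campaign": "summer_sale", "host": "web-2"}),
]

if __name__ == "__main__":
    for campaign, stats in percentiles_by_tag(samples, "campaign").items():
        print(campaign, stats)
```

In this made-up data the two campaigns share the same median, and only the 99th percentile separates them, which is the kind of difference a percentile view is meant to surface.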

Counterintuitively, by tracking the 99th percentile customer, I get a better view, a more accurate view of what all of my customers are experiencing. So, I’m gonna close out of this. And now, you can see I have two dashboards…two widgets: the global 50th, 75th, and 95th percentiles for organic traffic, and underneath that, the same for a campaign that I have placed. Here, I have those same two graphs. And what you can see is that the 50th percentile user, your average user, behaves broadly the same way.

However, it’s at the extremes where things diverge. Here, you can see in the case of the campaign that there are comparatively few average users. People either get to my page and leave quickly or they find what they want and they make a lot of purchases. So, now that I know that, I can make some strategic decisions. How can I funnel those people coming from the campaign down a better happy path where they can make those purchases? Comparing between campaigns, which exhibits a higher return on investment? Global distributions help answer these questions about your customer experience, your campaign effectiveness, and your business KPIs, in addition to those about your infrastructure and application performance. So, global distributions let you monitor objects at arbitrary scales. They give you accurate statistical percentiles. And they’re available in beta today.

Step way back

So, find me on the Expo floor and I’m happy to introduce you to the feature. Brad showed us trace search, which lets you take a microscope to every user in every transaction. With global distributions, you can step back, way back, and see the entire population.