Without unified visibility across your entire stack, it can be difficult to investigate backend dependencies when troubleshooting frontend issues, or to track the source of database failures that originate from bad browser requests. Full-stack visibility gives you the insight you need to pinpoint and resolve incidents quickly.
Datadog Real User Monitoring (RUM) gives you real-time insight into how users are experiencing your application. On the backend, distributed tracing provides visibility into the lifespan of individual requests, as well as key performance metrics including request throughput, latency, and error rates. Now, you can connect your RUM data with corresponding traces, giving you unified, end-to-end visibility into requests as they move across layers of your stack. This provides rich context around problems and helps you more easily locate the backend problem behind a user-facing error, or determine which users are affected by an issue within your stack, whether it is confined to a specific endpoint of your application or to a geographic region.
In this post, we’ll look at how you can use Datadog APM and RUM to more easily investigate application errors and track their impact. We will walk through:
- finding the backend root cause of a rise in frontend errors
- analyzing frontend metrics to gauge user impact from a database slowdown
Datadog RUM can help alert you to problems with your application that are affecting end-user experiences. For example, Error Tracking automatically aggregates similar frontend errors into issues so you can triage them and investigate the most urgent ones.
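To make the idea of aggregating similar errors into issues concrete, here is a minimal sketch. It is not Datadog's grouping algorithm, which this post does not describe. The `fingerprint` rule (error type plus a message with digit runs masked) and the sample error records are assumptions chosen only for illustration.

```python
import re
from collections import defaultdict


def fingerprint(error):
    """Build a grouping key (hypothetical rule, not Datadog's actual logic)."""
    # Mask digit runs so messages that differ only by an ID share a key.
    message = re.sub(r"\d+", "<n>", error["message"])
    return (error["type"], message)


def group_into_issues(errors):
    """Group errors by fingerprint and order issues from most to least frequent."""
    issues = defaultdict(list)
    for error in errors:
        issues[fingerprint(error)].append(error)
    return sorted(issues.items(), key=lambda item: len(item[1]), reverse=True)


# Illustrative frontend errors; real RUM error events carry many more fields.
errors = [
    {"type": "TypeError", "message": "Cannot read item 42"},
    {"type": "TypeError", "message": "Cannot read item 7"},
    {"type": "NetworkError", "message": "Request to /api/cart failed with status 500"},
]

issues = group_into_issues(errors)
for key, members in issues:
    print(key, len(members))
```

Sorting by group size puts the most frequent issue first, which mirrors the triage step of investigating the most urgent issues before the rest.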
Datadog captures key details about the error as well as information about the user session (like the user’s location, device type, and browser) and the page view (including the view path group and URL) that experienced the problem. This helps you determine the scope of the issue, including where exactly in your application it is manifesting and who it is affecting.
But if the root cause of the problem is located somewhere in one of your backend services or dependencies, it can be difficult to find it with frontend data alone. For that, we can pivot to APM.
Because Datadog Real User Monitoring and APM are fully integrated, traces are tagged with frontend data, including the session ID, view ID, and view path group of the user that initiated the request. For example, if we receive an alert about a frontend error, we can use the view ID to pivot to the RUM Explorer to examine the specific view events that resulted in that error. From there, we can move to the Traces tab to see a flame graph visualizing the full trace associated with that view.
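The pivot from a frontend error to the backend works because both sides share an identifier. The sketch below illustrates that join on small in-memory records. The field names (`view_id`, `error_count`, `error`), the `web-store` service, and the sample data are hypothetical stand-ins, not the actual schema of RUM events or APM spans.

```python
from collections import defaultdict

# Hypothetical, simplified RUM view events.
rum_views = [
    {"view_id": "v-1", "session_id": "s-1",
     "path_group": "/department/sofas/product/?", "error_count": 1},
    {"view_id": "v-2", "session_id": "s-2",
     "path_group": "/cart", "error_count": 0},
]

# Hypothetical, simplified backend spans tagged with the originating view ID.
spans = [
    {"trace_id": "t-100", "service": "web-store", "view_id": "v-1", "error": False},
    {"trace_id": "t-100", "service": "product-recommendation", "view_id": "v-1", "error": True},
    {"trace_id": "t-200", "service": "web-store", "view_id": "v-2", "error": False},
]


def failing_services_by_view(views, spans):
    """Map each view that saw a frontend error to the backend services that errored."""
    spans_by_view = defaultdict(list)
    for span in spans:
        spans_by_view[span["view_id"]].append(span)

    result = {}
    for view in views:
        if view["error_count"] == 0:
            continue  # Only views that experienced a frontend error are of interest.
        related = spans_by_view[view["view_id"]]
        result[view["view_id"]] = sorted({s["service"] for s in related if s["error"]})
    return result


print(failing_services_by_view(rum_views, spans))
```

In this toy data, the only view with a frontend error maps to the `product-recommendation` service, which is the same conclusion the flame graph would surface visually.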
In addition to being able to identify the backend service that is causing our frontend problem, visualizing the trace allows us to debug the issue by providing full visibility into metrics, logs, network performance data, and code hotspots, all from within a single pane of glass.
So far, we’ve seen how Datadog’s integration between RUM and APM data lets you pivot from frontend data to view backend traces, letting you locate and troubleshoot the root cause of a problem. Next, we’ll see how RUM can provide deep context around an incident by analyzing who the problem affected.
Let’s say we receive an alert indicating an increased error rate for requests to our product-recommendation service. To investigate, we could start by looking at related traces to localize the error and determine where the service is experiencing problems. Drilling down into a trace, we can see that our product-recommendation service is experiencing timeouts. Viewing the logs associated with the trace reveals multiple attempts by our code to divide by zero, likely causing the problem.
We’ve used APM to identify the cause of our errors (and so can notify the right team to deploy a fix). Next, we can use RUM to find out which users were actually affected and how widespread the incident was.
Our trace includes a top-level span named browser.request, which tracks the request’s full lifecycle. By selecting that span, we can see frontend metadata, including session ID and the view path group. This indicates that the span represents the real user interaction with our application that initiated the backend request that threw the error.
Because Datadog connects traces with associated RUM data, we can see that our trace resulted in a view of the /department/sofas/product/? path group. We can select the path group and pivot to view it in the RUM Explorer. This lets us see, for example, where incoming requests to that path group are coming from and their loading time. Or, we can use the view ID to see the exact page that was rendered for even more context on how the error impacted the user session.
From here, we can view a waterfall breakdown of all resources called during the exact page load that resulted in our backend error and where there was a slowdown.
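To gauge how widespread an incident like this was, one approach is to count the distinct user sessions behind the failing requests, broken down by a frontend attribute such as path group or location. The sketch below assumes error spans have already been exported with their frontend tags. The field names, the second path group, and the sample values are hypothetical.

```python
from collections import defaultdict

# Hypothetical error spans carrying frontend tags (session, path group, country).
error_spans = [
    {"session_id": "s-1", "path_group": "/department/sofas/product/?", "country": "US"},
    {"session_id": "s-1", "path_group": "/department/sofas/product/?", "country": "US"},
    {"session_id": "s-2", "path_group": "/department/sofas/product/?", "country": "FR"},
    {"session_id": "s-3", "path_group": "/department/chairs/product/?", "country": "US"},
]


def affected_sessions(spans, key):
    """Count distinct sessions that hit an error, grouped by the given tag."""
    sessions = defaultdict(set)
    for span in spans:
        # A set deduplicates repeated errors from the same session.
        sessions[span[key]].add(span["session_id"])
    return {value: len(ids) for value, ids in sessions.items()}


print(affected_sessions(error_spans, "path_group"))
print(affected_sessions(error_spans, "country"))
```

Counting sessions rather than raw errors keeps one user who retries repeatedly from inflating the apparent scope of the incident.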
Datadog makes full-stack troubleshooting seamless by bringing together real-user analytics with 100 percent of real-time backend traces. You can easily visualize and correlate frontend data alongside a full breakdown of backend activity from a single view. So, with one pane of glass, you can trace a browser timeout to a database operation, or link an API failure to a typo in a web component.
You can start using Datadog APM and RUM to get complete visibility into your stack today. Or, if you’re new to Datadog, get started with our 14-day free trial.