This feature is currently in private beta. If you’d like to register for access, sign up here.
Running Kubernetes applications requires visibility into not only the overall performance of clusters but also the health of individual pods, deployments, and other resources that make up your environment. Datadog already integrates with your containerized environments and includes features like the Live Containers view and the Container Map, enabling you to easily monitor Kubernetes and container runtime performance in real time and get deep visibility into clusters.
Today we are building on that functionality to provide deeper insight into your Kubernetes workloads from within our Live Containers view, giving you a multidimensional look into your Kubernetes environment. Live Containers now offers curated views for your Kubernetes applications, so you can look at performance data in its appropriate context and surface critical information about every layer of your Kubernetes clusters. You can monitor the state of pods or deployments in a specific namespace or availability zone, view the resource specifications for a failed pod within a deployment, correlate node activity with related logs, and more.
Kubernetes environments consist of several object types that run and operate workloads, and you need the ability to see into every object to troubleshoot issues efficiently. While Live Containers has always provided information about individual containers, it now gives you real-time views into all of your Kubernetes objects, with additional insights into their overall health. This simplifies the complexity of your containerized applications and provides an easy-to-use interface that is tightly integrated with the rest of the Datadog platform.
Datadog also provides quick access to more context for each of your Kubernetes objects. For example, you can search for a specific deployment and use the context menu to drill down to a list of related pods. From this list, you can select an individual pod to view a breakdown of its constituent containers and monitor related events, running processes, traces, logs, and more, all in the same panel. Each panel includes a “YAML” tab, which shows state and configuration data similar to output from the kubectl describe command.
This enables you to troubleshoot critical startup issues such as a pod failing to pull an image or failing readiness probes. With Datadog, you can easily identify costly, underutilized pods or nodes and adjust your deployments accordingly. Monitoring the performance of your pods also enables you to manage the costs of running Kubernetes on managed platforms such as Google Kubernetes Engine and Amazon Elastic Kubernetes Service.
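To make the two startup issues mentioned above concrete, here is a minimal sketch of how the same signals could be read by hand from a pod object. It assumes the input is the parsed JSON of a single pod (for example, the output of kubectl get pod <name> -o json); the function name find_startup_issues and the sample pod are hypothetical, and this is not how Datadog itself detects these conditions.

```python
from typing import Any, Dict, List

# Waiting reasons Kubernetes reports when a container image cannot be pulled.
IMAGE_PULL_REASONS = {"ErrImagePull", "ImagePullBackOff"}


def find_startup_issues(pod: Dict[str, Any]) -> List[str]:
    """Return human-readable startup problems found in a pod object.

    `pod` is assumed to be the parsed JSON of one pod. Only
    status.containerStatuses is inspected.
    """
    issues: List[str] = []
    statuses = pod.get("status", {}).get("containerStatuses", [])
    for cs in statuses:
        name = cs.get("name", "<unknown>")
        state = cs.get("state", {})
        waiting = state.get("waiting")
        if waiting and waiting.get("reason") in IMAGE_PULL_REASONS:
            issues.append(f"{name}: image pull failing ({waiting['reason']})")
        elif "running" in state and not cs.get("ready", False):
            # A running container that is not ready has not passed its
            # readiness probe (or has not passed it yet).
            issues.append(f"{name}: running but not ready (readiness probe may be failing)")
    return issues


# Hypothetical pod with one container stuck pulling its image and one
# container that is running but not ready.
example_pod = {
    "status": {
        "containerStatuses": [
            {"name": "web", "ready": False,
             "state": {"waiting": {"reason": "ImagePullBackOff"}}},
            {"name": "sidecar", "ready": False,
             "state": {"running": {"startedAt": "2020-01-01T00:00:00Z"}}},
        ]
    }
}

for issue in find_startup_issues(example_pod):
    print(issue)
```

A container that has only just started will also show up as running but not ready, so in practice you would want to confirm the condition persists before treating it as a probe failure.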
In addition to providing deep visibility into individual pods, Datadog includes a Cluster Map to give you a 30,000-foot view of your entire Kubernetes environment, so you can review the state of all of your deployments and pods at a glance. The Cluster Map in the screenshot below groups pods by deployment. If there is an issue, such as too many pods crashing, the map automatically highlights the problematic pods in light blue.
If you notice that several pods are failing to spin up within a specific cluster, it could be a sign that the cluster needs more resources or that your deployment is misconfigured. You can click on an affected pod in the map to open its overview panel and troubleshoot further.
If you need to view more details about the state of your Kubernetes objects, you can use one of the new Kubernetes Overview dashboards for pods, deployments, and nodes. We created these dashboards so you can easily get a high-level overview of critical performance data such as the CPU and memory usage of your pods, changes in deployment replicas, and the condition of your nodes.
You can access these dashboards from your dashboard list, or you can easily pivot from an overview panel in Live Containers to a dedicated dashboard that’s automatically filtered with the appropriate tags, similar to host dashboards for individual containers. For example, if you notice that several pods are failing for a specific deployment, you can quickly jump to the pods dashboard to investigate the root cause.
Pods may fail if they use more memory or CPU than their defined limits, or if those limits are poorly configured. You can use the pods dashboard to visualize CPU and memory usage and determine whether unexpected spikes within a specific timeframe led to the failure.
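The check described above can also be sketched in a few lines of code. The example below assumes you already have usage samples as (timestamp, value) pairs in the same unit as the limit; the function name samples_near_limit, the 90 percent threshold, and the sample data are all hypothetical and are not part of any Datadog API.

```python
from typing import List, Tuple

Sample = Tuple[int, float]  # (timestamp in seconds, usage value)


def samples_near_limit(samples: List[Sample], limit: float,
                       threshold: float = 0.9) -> List[Sample]:
    """Return the samples whose value is at or above threshold * limit."""
    cutoff = threshold * limit
    return [(ts, value) for ts, value in samples if value >= cutoff]


# Hypothetical memory usage in MiB for a container with a 512 MiB limit.
memory_samples = [(0, 200.0), (60, 230.0), (120, 480.0), (180, 510.0)]

for ts, value in samples_near_limit(memory_samples, limit=512.0):
    print(f"t={ts}s: {value} MiB is within 10% of the limit")
```

Samples that sit close to the limit just before a pod fails suggest the limit is too low for the workload, while a single isolated spike points more toward unexpected behavior in the application.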
With Datadog, you can monitor every layer of your Kubernetes orchestration—from clusters down to individual pods. This new feature is currently in beta, with additional functionality coming soon. You can submit a request here to access the preview, or sign up for a free trial to start using Datadog.