Streamline Azure Container Monitoring With the Datadog AKS Cluster Extension | Datadog

Author Addie Beach
Author Michael Cronk

Published: January 24, 2024

Azure Kubernetes Service (AKS) enables you to easily deploy and manage containerized applications in Azure while leveraging Microsoft resources such as development tools, security features, and more. As with any Kubernetes service, the sheer volume of containers being orchestrated makes monitoring AKS cluster health challenging, which can slow response times to critical incidents and create bottlenecks around long-term optimizations.

Datadog’s AKS integration already provides visibility into your AKS clusters: once you’ve enabled the integration and deployed the Datadog Agent to your clusters, Datadog automatically begins collecting metrics and logs from your entire AKS setup and organizing them into high-level visualizations. However, many teams install the Datadog Agent with third-party tools such as Helm and Ansible, which can add complexity to workflows and increase overhead. With the Datadog cluster extension for AKS, you can now deploy the Datadog Agent to your Kubernetes clusters directly within Azure, with no other tools needed.

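In this post, we’ll explore how you can quickly deploy the Datadog Agent across your AKS clusters and visualize AKS cluster and control plane activity.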

Quickly deploy the Datadog Agent across your AKS clusters

AKS cluster extensions make it easy to deploy services to your AKS clusters at scale and manage them from Azure Resource Manager. Like other cluster extensions, the Datadog AKS extension provides two methods of installation, enabling you to choose the deployment method that works best for your workflows. One way is to search for the Datadog AKS extension within Azure Marketplace. The extension setup page then enables you to configure details such as the relevant resource group, region, and cluster name.

The setup page for the Datadog AKS Cluster Extension, including options for setting the project and instance details.

Alternatively, you can access the extension setup page directly from your Azure Kubernetes clusters by selecting the service you want to monitor, then choosing the Extensions and applications option from the sidebar.
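Both installation paths above go through the Azure portal. If your team scripts its deployments, cluster extensions can generally also be created with the Azure CLI command `az k8s-extension create`. The following is a minimal sketch, not the documented Datadog procedure: it only assembles the command so you can review it before running it. The resource group, cluster name, extension name, extension type, and setting key are hypothetical placeholders, and Marketplace-published extensions typically also require plan details (`--plan-name`, `--plan-product`, `--plan-publisher`), which are omitted here.

```python
import shlex
from typing import Dict, List, Optional


def build_extension_command(
    resource_group: str,
    cluster_name: str,
    extension_name: str,
    extension_type: str,
    settings: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Assemble the argv for `az k8s-extension create` without running it."""
    command = [
        "az", "k8s-extension", "create",
        "--resource-group", resource_group,
        "--cluster-name", cluster_name,
        "--cluster-type", "managedClusters",  # the cluster type used for AKS
        "--name", extension_name,
        "--extension-type", extension_type,
    ]
    if settings:
        # Settings are passed as space-separated key=value pairs after the flag.
        command.append("--configuration-settings")
        command.extend(f"{key}={value}" for key, value in sorted(settings.items()))
    return command


if __name__ == "__main__":
    # Every value below is a placeholder; look up the real extension type
    # and supported settings on the extension's Marketplace listing.
    argv = build_extension_command(
        resource_group="example-rg",
        cluster_name="example-aks",
        extension_name="datadog",
        extension_type="<extension-type-from-marketplace>",
        settings={"exampleKey": "exampleValue"},
    )
    print(shlex.join(argv))
```

Printing the command rather than executing it keeps the sketch safe to run anywhere and makes the final invocation easy to review or paste into a pipeline.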

Finally, before you can begin collecting AKS metrics in Datadog, you’ll need to enable the Azure integration. You can do so either from the Azure Portal via our Azure Native integration, or within Datadog by accessing the Azure integration tile.

Visualize AKS cluster and control plane activity

Once you’ve deployed the Datadog Agent to your clusters, metrics and logs from your AKS setup immediately begin streaming into Datadog. By using Datadog’s monitors and the out-of-the-box (OOTB) AKS dashboard, you can quickly detect issues in your nodes before they bring processing to a halt.
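To make the monitoring step more concrete, here is a rough sketch of what a CPU monitor definition could look like as a request body for Datadog's Monitors API. It only builds the payload and does not send it. The metric `system.cpu.user`, the `kube_cluster_name` tag value, the 90 percent threshold, and the notification handle are assumptions chosen for illustration, so adjust them to the metrics and tags your clusters actually report.

```python
import json
from typing import Any, Dict


def build_cpu_monitor(cluster_tag: str, critical_pct: float = 90.0) -> Dict[str, Any]:
    """Build a metric-alert monitor payload for high per-node CPU usage."""
    # Doubled braces produce the literal braces Datadog's query syntax expects.
    query = (
        f"avg(last_5m):avg:system.cpu.user{{{cluster_tag}}} "
        f"by {{host}} > {critical_pct:g}"
    )
    return {
        "name": f"High CPU on AKS nodes ({cluster_tag})",
        "type": "metric alert",
        "query": query,
        # Hypothetical notification handle; replace with your own.
        "message": "CPU is high on {{host.name}}. @example-team",
        "tags": ["service:aks", cluster_tag],
        # The critical threshold must match the value used in the query.
        "options": {"thresholds": {"critical": critical_pct}},
    }


if __name__ == "__main__":
    # Placeholder cluster tag, used only to show the resulting JSON.
    print(json.dumps(build_cpu_monitor("kube_cluster_name:example-aks"), indent=2))
```

Grouping the query by host yields one alert per node, which matches the goal of spotting node-level issues early.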

The information Datadog ingests includes logs from the AKS control plane, which manages cluster resources. These logs contain critical information about the status of various orchestration components, including your API server, scheduler, and controller manager. With the AKS dashboard, you can view your control plane logs alongside performance metrics from the rest of your clusters, enabling you to quickly trace the root cause of issues no matter where they occur in your Kubernetes setup.

The out-of-the-box AKS dashboard in Datadog, with metrics such as the cluster and node counts, average CPU utilization by node, and total number of unhealthy clusters displayed.

Let’s say that you receive an alert about a spike in CPU utilization across several nodes. While inspecting the dashboard, you notice an increase in clusters reporting an unhealthy status. Scrolling down to the logs for your control plane components, you also see an increase in error messages for the controller manager. The issues with your controller manager have led to fewer pods being created, causing the CPU on your existing nodes to overload. From here, you can take steps to debug the problematic node and prevent future issues, such as switching to a high-availability cluster with multiple control plane nodes.
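The triage in this scenario boils down to asking which control plane component is producing the errors. As a purely illustrative sketch, the function below tallies error-level log records per component. The record shape (`component` and `status` keys) and the sample records are made up for the example and are not the format Datadog or Azure uses; the component names follow the usual Kubernetes control plane naming.

```python
from typing import Dict, Iterable, Mapping

# Control plane components mentioned in the post.
CONTROL_PLANE_COMPONENTS = (
    "kube-apiserver",
    "kube-scheduler",
    "kube-controller-manager",
)


def count_errors_by_component(
    records: Iterable[Mapping[str, str]],
) -> Dict[str, int]:
    """Count error-level records for each known control plane component."""
    counts = {component: 0 for component in CONTROL_PLANE_COMPONENTS}
    for record in records:
        component = record.get("component")
        is_error = record.get("status", "").lower() == "error"
        # Records from unknown components are ignored.
        if component in counts and is_error:
            counts[component] += 1
    return counts


if __name__ == "__main__":
    # Made-up records, used only to show how the tally behaves.
    sample = [
        {"component": "kube-controller-manager", "status": "error"},
        {"component": "kube-controller-manager", "status": "error"},
        {"component": "kube-apiserver", "status": "info"},
    ]
    for component, errors in count_errors_by_component(sample).items():
        print(f"{component}: {errors}")
```

In practice you would get the same breakdown by filtering and grouping control plane logs in the dashboard; the code simply spells out the comparison being made.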

Start monitoring your AKS clusters in minutes

The Datadog AKS integration helps you catch issues across all your Azure clusters, but using third-party tools to install the Datadog Agent on your services can lead to increased overhead and tool sprawl. With the Datadog AKS cluster extension, you can easily deploy the Datadog Agent to your clusters directly within Azure. Once the extension is enabled and the Agent is installed, metrics and logs from your AKS setup start streaming to the AKS dashboard in Datadog, so you can immediately begin analyzing troubleshooting data.

You can start monitoring your AKS clusters within Datadog by using the AKS integration; see our docs for more information. Or, if you’re new to Datadog, you can sign up for a 14-day free trial.