Helm is a package manager that makes it easy to deploy and manage Kubernetes applications. Our new Helm integration allows you to monitor the availability and status of the Helm-managed applications deployed in your Kubernetes clusters. In this post, we’ll show you how you can visualize the status of your Helm releases and use monitors to notify you of important changes in your Helm environment.
Helm defines applications using charts, which are packaged collections of Kubernetes manifests. When you deploy a chart, Helm creates a new release—an application in your cluster—or applies a new configuration to upgrade an existing release to a new revision.
Datadog’s Helm integration includes an out-of-the-box dashboard that displays data about Helm releases. This allows you to see the status of your Helm-managed applications and helps you spot trends in key Helm activity like installations and upgrades. Template variables at the top of the dashboard allow you to filter your Helm data so you can see releases from a single cluster, for example. In the screenshot below, the dashboard displays a count of releases in the shepherd cluster that failed (shown in red) and a count of releases that are working properly (in green). A list of Helm events shows a history of status changes (including new releases, upgrades, and deletions), and a timeseries graph shows a breakdown of releases by status. The dashboard also provides a detailed overview of information from each Helm release, including the cluster name, datacenter, chart, and revision.
As you upgrade your Helm releases, Datadog tracks changes in the status of each release. If a release fails—for example, if it can’t be scheduled due to a resource constraint—you will be able to detect this change in the event stream widget on the dashboard. You can click the event to see more information, including tags that describe the release and the underlying infrastructure. For even more information, you can pivot to the host dashboard to see the worker node’s resource usage data.
To enable the Helm integration, set the Datadog Helm chart’s datadog.helmCheck.enabled and datadog.helmCheck.collectEvents parameters to true. The integration includes a service check that automatically detects if any Helm releases are in a failed state. You can also enable an out-of-the-box monitor (shown in the screenshot below) to notify your team when a service check fails. By default, the alert triggers if a release fails five consecutive service checks. This can help you quickly detect and troubleshoot issues that could affect the performance of your service, such as a liveness probe that fails due to an unhealthy container. You can start using this monitor right away, and you can clone it to create new versions that are scoped by tags like kube_cluster_name to notify specific teams if a release that they’re responsible for has failed.
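As a minimal sketch of the setup step described above, the chart parameters can be set in a values override file. The fragment below sets datadog.helmCheck.collectEvents together with datadog.helmCheck.enabled, the chart's switch for the check itself. The file name, the release name, and the repository alias in the comments are illustrative assumptions, so confirm both keys against the values reference for the chart version you run.

```yaml
# values.yaml (illustrative file name): overrides for the Datadog Helm chart.
# Assumed usage, with a hypothetical release name and repo alias:
#   helm upgrade --install datadog-agent datadog/datadog -f values.yaml
datadog:
  helmCheck:
    # Enables the Helm check so release status is reported.
    enabled: true
    # Also collects Helm events such as installs, upgrades, and deletions.
    collectEvents: true
```

Keeping these settings in a values file, rather than passing them with --set on the command line, makes the configuration easier to review and to reapply on later upgrades.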
Datadog’s new integration provides deep visibility into your Helm releases alongside monitoring data from the rest of your orchestrated environment. Note that this integration requires Datadog Agent version 7.36.0+ and Cluster Agent version 1.20.0+. See our documentation for more information about monitoring Helm with Datadog. If you’re not yet using Datadog, you can start right away with a 14-day free trial.