Monitor your HCP Vault cluster with Datadog

Author: Addie Beach

Published: December 1, 2021

HashiCorp Cloud Platform (HCP) provides fully managed versions of some of HashiCorp’s most popular offerings, including Vault. With Vault, users have a centralized way to secure, store, and manage access to secrets across distributed systems. HCP Vault handles the day-to-day cluster maintenance, patches, and overall system security, making it easy to deploy a cluster without needing to host or manage your own infrastructure. You can also connect HCP Vault to your existing AWS accounts via VPC peering in order to securely access and modify secrets from your own system.

Datadog’s new HCP Vault integration gives you an out-of-the-box dashboard that lets you start monitoring activity across your cluster right away. You can also analyze how teams across your organization are using Vault and set up alerts to detect possible security vulnerabilities.

Out-of-the-box dashboard for HCP Vault monitoring.

Spot vulnerabilities in token TTLs

HCP Vault manages secret access with tokens. Once a client—such as a user, application, or container—has been successfully authenticated, Vault assigns it a token with the correct access control list (ACL) policies. The client can then use this token to make future requests without needing to repeat the full authentication process.

Tokens may be associated with leases. Each lease has a time-to-live (TTL) setting that determines how long it remains valid. A lease must be manually renewed before its TTL is up; otherwise, both the lease and its associated token expire. Because an expired token can no longer be used, this limits the amount of damage a bad actor can do in the event of an attack. Long TTLs present a security risk, and HashiCorp recommends setting TTL values that are shorter than the default 32 days. A sudden increase in tokens with unexpectedly long TTLs can be a sign of an attack.
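To make the lease arithmetic concrete, here is a minimal Python sketch of the expiry rule described above. It is a simplified model, not Vault's implementation: the function names are hypothetical, renewals are ignored, and the only value taken from the text is the 32-day default.

```python
from datetime import datetime, timedelta, timezone

# The default TTL mentioned in the text above.
DEFAULT_MAX_TTL = timedelta(days=32)


def lease_expires_at(issued_at: datetime, ttl: timedelta) -> datetime:
    """Return the moment a lease lapses if it is never renewed."""
    return issued_at + ttl


def is_expired(issued_at: datetime, ttl: timedelta, now: datetime) -> bool:
    """A lease (and its associated token) expires once its TTL has elapsed."""
    return now >= lease_expires_at(issued_at, ttl)


def has_long_ttl(ttl: timedelta, limit: timedelta = DEFAULT_MAX_TTL) -> bool:
    """Flag TTLs that are not shorter than the limit."""
    return ttl >= limit


# Illustrative values only.
issued = datetime(2021, 12, 1, tzinfo=timezone.utc)
print(lease_expires_at(issued, timedelta(hours=1)))
print(has_long_ttl(timedelta(days=45)))
```

A token issued with a one-hour TTL is only useful to an attacker for that hour, which is why the check above treats anything at or beyond the default as worth a closer look.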

Datadog’s dashboard helps you stay on top of potential TTL-based security threats. Once you enable the integration, Datadog instantly starts collecting data on every Vault token. You can track token counts, grouped by their TTLs (hcp.vault_token_count_by_ttl), as well as set up alerts to automatically notify you of suspicious spikes in the number of tokens that have long TTLs.
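As a rough sketch of what such an alert could look like, the Python snippet below assembles a monitor definition whose query follows Datadog's `aggregation(window):query > threshold` form. The `creation_ttl:30d` tag filter, the threshold, the monitor name, and the message are all assumptions for illustration; check which tags the integration actually reports in your account before relying on them.

```python
def long_ttl_monitor(scope: str, threshold: int, window: str = "last_15m") -> dict:
    """Build a metric monitor definition for long-TTL token counts.

    `scope` is a tag filter such as "creation_ttl:30d" (an assumed tag name).
    """
    if not scope:
        raise ValueError("a tag filter is required to isolate long-TTL tokens")
    query = (
        f"avg({window}):sum:hcp.vault_token_count_by_ttl{{{scope}}} > {threshold}"
    )
    return {
        "name": "Spike in long-TTL HCP Vault tokens",  # hypothetical name
        "type": "metric alert",
        "query": query,
        "message": "Long-TTL token count exceeded the threshold. Check the Vault audit logs.",
    }


# Assumed tag value and threshold; tune both to your own baseline.
monitor = long_ttl_monitor("creation_ttl:30d", threshold=50)
print(monitor["query"])
```

The resulting dictionary could be submitted through the Datadog monitors API or recreated by hand in the monitor UI.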

Graph of HCP Vault token counts, tagged by ttl.

If you identify unusual token activity, you can easily pivot to audit logs for additional context. The logs allow you to pinpoint which clients were assigned the unexpected tokens and when. In the event of a security breach, you can stop new tokens from being assigned and revoke existing tokens. Tracking token revocation latency with the hcp.vault_expire_revoke metric shows how quickly your organization can respond to potential threats.

Visualize client usage across namespaces

Vault uses namespaces to support multiple tenants, including different teams and applications. You can manage data access and storage in Vault separately for every tenant in your organization, with dedicated policies, authentication methods, and tokens for each namespace.

Datadog helps you get granular insights into how and when your clients are interacting with Vault. Since Vault assigns each client a unique token after the authentication process, you can use token counts to judge how many clients are using the system. For example, you can view the number of available tokens in each namespace (hcp.vault_token_count broken down by the namespace tag) to understand which teams in your organization are getting the most out of Vault. You can also gauge traffic to the system with the number of authentication requests submitted (hcp.vault_core_handle_login_request_count).
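The per-namespace breakdown can be sketched as follows. The query string groups the token count by the namespace tag named above; the helper function and the sample numbers are hypothetical and only show how the resulting values might be ranked.

```python
# Metric query grouped by the namespace tag mentioned in the text.
NAMESPACE_QUERY = "sum:hcp.vault_token_count{*} by {namespace}"


def namespace_shares(token_counts: dict) -> list:
    """Return (namespace, share of all tokens) pairs, largest share first."""
    total = sum(token_counts.values())
    if total == 0:
        return []
    shares = [(ns, count / total) for ns, count in token_counts.items()]
    return sorted(shares, key=lambda item: item[1], reverse=True)


# Made-up sample values standing in for the latest point of each series.
sample = {"payments": 120, "platform": 60, "analytics": 20}
for ns, share in namespace_shares(sample):
    print(f"{ns}: {share:.0%}")
```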

Graph of HCP Vault token counts, tagged by namespace.

When combined with metrics for monitoring your cluster’s resource usage (e.g., CPU, memory, and disk usage), namespace token and authentication request counts allow you to optimize how your organization uses Vault. You can decide whether you need to scale your cluster up or down based on traffic and cluster efficiency, and determine accurate spending estimates and limits based on the number of clients per billing period. In addition to potential cost savings, your organization can use this data to effectively track and plan your cloud resources.

For additional insight into Vault usage and security, you can also use Datadog’s HCP Vault dashboard to analyze tokens by the authentication method used to create them (hcp.vault_token_count_by_auth) and the policies they are associated with (hcp.vault_token_count_by_policy). These metrics are useful for identifying abnormal usage patterns that could point to potential threats. With anomaly detection, you can assess unusual activity and analyze it with recent metric history.
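One way to experiment with anomaly detection on these metrics is to wrap a metric query in Datadog's `anomalies()` function, which takes a query, an algorithm name, and a bounds value. The sketch below only builds the query string; the helper name, the choice of algorithm, and the bounds are assumptions to adjust for your own data.

```python
# Algorithm names accepted by the anomalies() function.
ALGORITHMS = ("basic", "agile", "robust")


def anomalies_query(metric_query: str, algorithm: str = "basic", bounds: int = 2) -> str:
    """Wrap a metric query in an anomalies() expression."""
    if algorithm not in ALGORITHMS:
        raise ValueError(f"algorithm must be one of {ALGORITHMS}")
    if bounds < 1:
        raise ValueError("bounds must be at least 1")
    return f"anomalies({metric_query}, '{algorithm}', {bounds})"


print(anomalies_query("sum:hcp.vault_token_count_by_auth{*}"))
print(anomalies_query("sum:hcp.vault_token_count_by_policy{*}", algorithm="agile", bounds=3))
```

A wider bounds value makes the expected band more tolerant, which may suit token counts that fluctuate with deploys or batch jobs.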

Start monitoring HCP Vault with Datadog

With HCP Vault, you get tried-and-true Vault security without any of the overhead of self-hosting. Datadog’s new HCP Vault integration enables you to easily detect threats in your cluster and optimize client usage across your organization. Use our documentation to start streaming HCP Vault metrics and audit logs to your existing account, or get started with a 14-day free trial of Datadog.