Monitor critical Datadog assets and configurations with Audit Trail

Author Jordan Obey
Author Anshum Garg

Published: September 20, 2022

Datadog Audit Trail gives administrators and security team members visibility into how users and teams within their organizations use and interact with Datadog products and observability data. If you are running a service at scale, with dozens of teams and users sharing a growing number of dashboards, pipelines, and alerts, but only a small team of Datadog administrators is dedicated to keeping track of usage patterns and accessibility, it can be difficult to identify how each team’s activity impacts the work of another. With Audit Trail, you get a centralized view of activity in Datadog for valuable usage insights. For example, if a critical dashboard suddenly breaks after a configuration change, Audit Trail enables you to see which user or team made the change, so you can determine what happened and keep your Datadog monitoring setup robust.

In this post, we’ll look at how teams in your organization can fully leverage Audit Trail to track configuration changes and to monitor Datadog activity for security and usage insights.

Track configuration changes

One of the top priorities of Datadog administrators is to ensure that their organizations have a robust monitoring setup, and that developer operational workflows remain uninterrupted—which becomes more challenging as organizations grow. For instance, in a large ecommerce service with thousands of users and hundreds of teams sharing the same dashboards, data sources, and monitors, it can be difficult to determine how changes one team makes can impact the organizational workflow of another.

With Audit Trail, you can investigate unexpected issues by digging into audit events to reveal when segments of the Datadog platform were accessed or modified and by whom. Audit events are particularly useful for uncovering product-specific actions such as:

Dashboard updates

DevOps teams commonly rely on dashboards to monitor the health and performance of critical services. Therefore, any changes to a dashboard, such as the addition or removal of a query or function from a graph, may have unintended consequences and result in decreased visibility into critical troubleshooting telemetry.

If someone makes a breaking change to a critical dashboard, you can navigate to the Audit Trail page (which can be found under “Organization Settings”) and use the search terms Event Name:Dashboard and Action:modified to see recent dashboard changes and the users responsible so that you can follow up accordingly.

audit-trail-01.png

Log management configuration changes

Security teams often use Datadog’s log processing pipelines to parse logs and enrich them with contextual metadata (such as team tags, geography information, IP information, and other infrastructure-related information) so they can better pinpoint threats and set up alerts. To ensure that your logs are properly enriched, and to identify downstream alerts that may be affected, you can use the “Changes made to Pipelines” and “Changes made to Pipeline Processors” tables in the Audit Trail overview dashboard. These tables enable you to quickly see whether pipelines and processors were changed and when those changes took place. For example, an engineer could have accidentally misconfigured grok rules for parsing custom application logs. If you see that an unplanned change was made to a pipeline or processor, you can pivot to the Audit Trail page to explore the event in greater detail and see exactly what changed.

audit-trail-02.png

Changes to custom metrics and tags

Organizations usually monitor business-specific data by using custom metrics with tags on customer IDs, locations, and item types. These tags enable teams to slice and dice metrics so they can quickly access the data they need. Changing or removing custom metrics and tags can lead to disrupted analyst workflows, broken executive dashboards, and other unintended consequences. If a metric or the tags that teams rely on appear to have changed, you can search for the audit events that capture changes made to that metric. In the example below, by viewing recent changes to the shopist.basket.size metric, we can see that it was deleted by a user. You can then follow up with that user to determine whether that deletion was a mistake.

audit-trail-03.png

Monitor Datadog activity for security and usage insights

Another top priority for administrators is ensuring that Datadog is being used securely. To that end, admins need visibility into who has access to their organization’s Datadog account. Audit Trail allows you to track the creation, deletion, and modification of Datadog API and application keys. With this capability, you can set an alert that notifies you if the number of API key deletions exceeds a threshold in a short time frame, which may indicate a disruption in your monitoring setup.

audit-trail-04.png
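
To make the threshold idea concrete, here is a minimal sketch of the underlying logic: a sliding-window check over the timestamps of key-deletion events. It runs locally on hypothetical data and is not how Datadog monitors are implemented. The function name, the sample timestamps, and the threshold values are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def exceeds_threshold(timestamps, threshold, window):
    """Return True if more than `threshold` events fall within any span of length `window`."""
    ordered = sorted(timestamps)
    start = 0
    for end, current in enumerate(ordered):
        # Move the left edge forward until the span fits inside the window.
        while current - ordered[start] > window:
            start += 1
        if end - start + 1 > threshold:
            return True
    return False


base = datetime(2022, 9, 20, 12, 0)
# Hypothetical API key deletions: four within three minutes, one much later.
deletions = [base + timedelta(minutes=m) for m in (0, 1, 2, 3, 90)]

burst = exceeds_threshold(deletions, threshold=3, window=timedelta(minutes=5))
quiet = exceeds_threshold(deletions, threshold=3, window=timedelta(seconds=30))
print(burst, quiet)
```

A short window with a low threshold catches sudden bursts of deletions, while a wider window is better suited to spotting slow, sustained churn.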

In addition to ensuring the security of your Datadog account, insight into current usage can help drive wider and more effective adoption of monitoring tools. For instance, you can configure the Audit Trail page to show the number of unique users for each Datadog feature. This knowledge can help you determine where to focus an enablement plan that encourages teams to increase adoption of specific features for their use cases. For example, if you see that Datadog Application Performance Monitoring (APM) is leveraged by users less than expected, it may mean that teams within your organization need more guidance on how APM can help them meet specific goals.

audit-trail-05.png
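
The unique-user count described above amounts to grouping audit events by feature and counting distinct users in each group. The sketch below shows that aggregation on a small, made-up list of events. The `feature` and `user` keys and the sample addresses are hypothetical and do not reflect the actual audit event schema.

```python
from collections import defaultdict


def unique_users_by_feature(events):
    """Count distinct users per feature from a list of simplified event records."""
    users = defaultdict(set)
    for event in events:
        users[event["feature"]].add(event["user"])
    return {feature: len(names) for feature, names in users.items()}


# Hypothetical, simplified events; real audit events carry many more fields.
events = [
    {"feature": "Dashboard", "user": "ana@example.com"},
    {"feature": "Dashboard", "user": "ben@example.com"},
    {"feature": "Dashboard", "user": "ana@example.com"},
    {"feature": "APM", "user": "ben@example.com"},
]

counts = unique_users_by_feature(events)
print(counts)
```

Features with unexpectedly low counts are natural candidates for an enablement plan.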

You should also consider setting alerts on specific audit events that can impact your monitoring, such as changes to your log retention settings and the enablement of any new features. For instance, if your log retention settings are changed so that you retain more logs than necessary, it can make it harder to find relevant troubleshooting data because you will have a greater volume of logs to analyze.

Enhance your Datadog experience by using Audit Trail today

In this post, we looked at how you can use Audit Trail to see how Datadog is used across your organization. This lets you keep track of configuration changes involving dashboards, monitors, and custom metrics, as well as optimize the security and efficiency of your monitoring efforts by notifying on API key changes or unused Datadog products and features. You can learn more about how Audit Trail provides you with insight into your organization’s Datadog usage by checking out our documentation.

If you’re not already a Datadog customer, sign up today for a 14-day free trial.