How we use Datadog to further our FedRAMP® compliance
Bowen Chen

Dylan Villacis

Following our commitment to achieve High and Impact Level 5 authorization for the Federal Risk and Authorization Management Program (FedRAMP), our sponsoring agency has begun the review process and granted us an “In Process” status for FedRAMP® High authorization. FedRAMP High adds significantly more security controls than the FedRAMP Moderate baseline and strengthens requirements for existing control groups. In our process of implementing the infrastructure and frameworks to meet these stricter standards for our federal data environment, we used out-of-the-box (OOTB) Datadog capabilities for monitoring, logging, misconfiguration and vulnerability detection, and compliance enforcement to help us meet the additional FedRAMP High baseline requirements while maintaining our FedRAMP Moderate baseline.

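In this post, we’ll showcase how we used the following Datadog products to support a wide range of federal security requirements and specific control families: Audit Trail and Log Management, Cloud SIEM, Incident Management, and Cloud Security Misconfigurations and Compliance.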

We’ll also discuss how we used custom dashboards to track the progress of initiatives such as FIPS-validated cryptography and user account compliance.

Support FedRAMP compliance with Datadog’s OOTB capabilities

FedRAMP defines specific requirements cloud providers must meet to handle data from US federal agencies. The program establishes baselines for different authorization to operate (ATO) levels, each composed of hundreds of security controls documented in the National Institute of Standards and Technology’s (NIST) Special Publication 800-53. If an organization can demonstrate that it complies with the required controls and parameters for a given baseline, FedRAMP may grant it authorization to handle federal data at the corresponding impact level.

As one might expect, increasing an organization’s FedRAMP ATO baseline (in our case, from Moderate to High) can be incredibly demanding. It requires a comprehensive audit that investigates an organization’s implementation of each of FedRAMP’s hundreds of required controls. Many of these controls introduce enhanced requirements that must be met at higher baselines, and the audit is followed by iterative cycles of improvement in which teams address any gaps in coverage. To demonstrate our compliance with the FedRAMP High baseline while maintaining our existing FedRAMP Moderate authorization, Datadog engineers have taken full advantage of the security controls and tools available within the Datadog platform. In this section, we’ll discuss how Datadog tools can be used OOTB to support various security controls.

Audit Trail and Log Management

Applicable controls: AU-2, AU-3, AU-9(4), AU-11

AU: Audit and Accountability is a control family designed to ensure that an organization’s system and user activity is recorded and traceable, and that audit records can be surfaced to aid in the investigation of suspicious behavior if the need arises. A large part of Datadog’s approach to complying with AU controls is the use of Datadog Log Management. Log Management ingests logs from across our environment (including hosts, operating systems, containers, running applications, and other infrastructure components) that capture the event types mandated under FedRAMP, which helps us comply with AU-2: Audit Events. Unified service tagging via the Datadog Agent ensures that these logs are automatically tagged with env, service, and version. We also configure tags that record other critical audit information, such as timestamps, event type and name, and user identification, to comply with AU-3: Content of Audit Records. To bolster our compliance with this control, we enrich logs using Datadog Observability Pipelines, which standardizes inconsistent logs and ensures that the necessary audit fields are present.
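To make the idea of checking for required audit fields concrete, here is a minimal sketch in Python. It is not the Observability Pipelines API or Datadog's log schema: the field names `timestamp`, `event_type`, `event_name`, and `user_id`, and the helper `missing_audit_fields`, are hypothetical and chosen only for illustration.

```python
from typing import Any, Mapping

# Hypothetical audit field names, used only for illustration.
REQUIRED_AUDIT_FIELDS = ("timestamp", "event_type", "event_name", "user_id")
# The three tags applied by unified service tagging, as described above.
UNIFIED_SERVICE_TAGS = ("env", "service", "version")


def missing_audit_fields(record: Mapping[str, Any]) -> list[str]:
    """Return the required fields that are absent or empty in a log record."""
    required = REQUIRED_AUDIT_FIELDS + UNIFIED_SERVICE_TAGS
    return [name for name in required if record.get(name) in (None, "")]


record = {
    "timestamp": "2025-01-15T12:00:00Z",
    "event_type": "authentication",
    "event_name": "login_failed",
    "env": "prod",
    "service": "auth-api",
    "version": "1.4.2",
}
print(missing_audit_fields(record))  # ['user_id']
```

A check like this could run before logs are indexed, so that records missing a field are enriched or flagged rather than silently stored.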

Datadog Flex Logs keeps logs from decommissioned hosts available and lets us query them quickly under urgent circumstances. This supports investigations where we need to trace network or user activity across both active and decommissioned hosts, or look into the configuration changes that led to a host being decommissioned. Flex Logs remain accessible in the product for a minimum of 90 days, helping us meet the retention requirements outlined in AU-11: Audit Record Retention.

While Log Management gives us visibility into audit records generated within our environment, we use Datadog Audit Trail to create and store audit events that capture activity within the Datadog platform and its APIs. Under FedRAMP regulations, if you use Datadog for Government to monitor a system subject to FedRAMP that processes federal data, Datadog automatically becomes a part of your authorization boundary, even if that data isn’t explicitly ingested into Datadog.

Audit Trail provides immutable records of administrator activity within Datadog, helping us further fulfill AU-2 by capturing and recording privileged user actions for accountability and traceability. Furthermore, by defining Datadog roles and permissions, we’re able to enforce role-based access control (RBAC) for different Datadog products. This ensures that only authorized users can access, modify, or delete audit logs, which helps us meet the enhanced requirements under AU-9(4): Protection of Audit Information | Access by Subset of Privileged Users.
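The access model described above can be sketched as a simple role-to-permission lookup. This is an illustration only: the role names and permission strings below are hypothetical and do not correspond to Datadog's built-in roles or permission identifiers.

```python
# Hypothetical roles and permission names, not Datadog's built-in ones.
ROLE_PERMISSIONS: dict[str, set[str]] = {
    "audit_admin": {"audit_logs_read", "audit_logs_write"},
    "security_analyst": {"audit_logs_read"},
    "developer": set(),
}


def is_allowed(roles: list[str], permission: str) -> bool:
    """Grant access only if at least one assigned role carries the permission."""
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)


print(is_allowed(["security_analyst"], "audit_logs_read"))   # True
print(is_allowed(["security_analyst"], "audit_logs_write"))  # False
```

Unknown roles resolve to an empty permission set, so the sketch denies by default, which matches the intent of restricting audit data to a subset of privileged users.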

Enforce RBAC from within the Datadog App.

Cloud SIEM

Applicable controls: SI-4, SI-4(5), SI-4(12)

Datadog Cloud SIEM analyzes logs to automatically generate signals when threats are detected in your environment, while surfacing entities that introduce risk. Cloud SIEM is invaluable in helping us monitor and secure our system against threats, supporting our ability to meet the requirements listed under the SI: System and Information Integrity control family. This capability enables compliance with SI-4: Information System Monitoring by continuously querying Log Management data for signs of security violations and indicators of compromise. Cloud SIEM offers hundreds of OOTB detection rules, and you can view how your enabled detection rules align with industry standards using the in-app MITRE ATT&CK Map.

Visualize how your detection rules cover known security threats using the MITRE ATT&CK Map.

In preparation for our FedRAMP audit, our Threat Detection team also leveraged the ability to create custom detection rules to ensure that all relevant security activity under FedRAMP-mandated event types produced actionable signals. By inspecting a signal in Cloud SIEM, responders can quickly record status updates and upload supporting evidence of log and alert reviews, in alignment with FedRAMP requirements under SI-4 and its enhanced requirements. Configuring notification rules supports compliance with enhanced requirements SI-4(5) and SI-4(12) by enabling us to tailor automated alerts based on different security signals and vulnerabilities, and have those alerts sent directly to relevant stakeholders and responders.
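The routing logic behind notification rules can be sketched as follows. This is a simplified model under stated assumptions, not the Cloud SIEM notification rules API: the `Signal` and `NotificationRule` types, the severity ordering, and the recipient handles are all hypothetical.

```python
from dataclasses import dataclass

# Assumed severity scale, lowest to highest.
SEVERITY_ORDER = ["info", "low", "medium", "high", "critical"]


@dataclass(frozen=True)
class Signal:
    rule_name: str
    severity: str
    tags: frozenset[str]


@dataclass(frozen=True)
class NotificationRule:
    min_severity: str
    required_tags: frozenset[str]
    recipients: tuple[str, ...]


def recipients_for(signal: Signal, rules: list[NotificationRule]) -> list[str]:
    """Collect recipients from every rule the signal matches, without duplicates."""
    level = SEVERITY_ORDER.index(signal.severity)
    matched: list[str] = []
    for rule in rules:
        severe_enough = level >= SEVERITY_ORDER.index(rule.min_severity)
        if severe_enough and rule.required_tags <= signal.tags:
            for recipient in rule.recipients:
                if recipient not in matched:
                    matched.append(recipient)
    return matched


signal = Signal("Brute force login", "high", frozenset({"env:prod", "team:identity"}))
rules = [
    NotificationRule("medium", frozenset({"env:prod"}), ("@security-oncall",)),
    NotificationRule("critical", frozenset(), ("@ciso",)),
    NotificationRule("low", frozenset({"team:identity"}), ("@identity-team", "@security-oncall")),
]
print(recipients_for(signal, rules))  # ['@security-oncall', '@identity-team']
```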

Incident Management

Applicable controls: IR-3, IR-4, IR-4(1), IR-5(1), CP-4

Our teams use Datadog Incident Management as the central hub for real-time incident response, as well as for testing and documenting the results of various contingency plans to meet FedRAMP requirements listed under IR: Incident Response and CP: Contingency Planning. For example, IR-4: Incident Handling requires organizations to demonstrate capabilities that include preparation, detection, containment, resolution, and recovery in response to security incidents. At the FedRAMP High baseline, this control includes enhanced requirements such as IR-4(1): Automated Incident Handling Processes.

Incident Management automates core aspects of the response process, such as generating a real-time incident timeline, creating trackable follow-on actions, and populating postmortems from custom templates that are pre-filled with incident variables. This helps us support automated incident tracking and data collection requirements under IR-5(1), an additional enhanced control required at the High baseline. Using the notifications feature, we’re able to configure incident notifications that automatically page relevant service owners and on-call channels based on attributes such as teams, services, and products impacted. Once the necessary responders have been assembled, the incident commander can use Workstreams to assign them to concurrently investigate various leads and document their progress without creating noise for responders who are focusing on other tasks. Responders are also able to quickly notify customers of service degradations by creating in-app banners directly from Incident Management.

Concurrently assign tasks to different investigators without creating noise.

We also use Incident Management to document and coordinate the testing of our incident response and contingency plans. By simulating incidents, we’re able to capture how responders execute planned incident response workflows and evaluate their effectiveness in a central knowledge center. This helps ensure that the results of mock incidents and our responses are easily queryable and audit-ready, demonstrating our compliance with IR-3: Incident Response Testing and CP-4: Contingency Plan Testing. Learn how Datadog manages incidents in this blog post.

Incident Management enables us to conduct incident response and contingency plan testing.

Cloud Security Misconfigurations and Compliance

Applicable controls: SC-13, SC-28

Datadog Cloud Security continuously scans environments to detect resource misconfigurations, security vulnerabilities, and identity risks. It’s an invaluable tool in helping us comply with the SC: System and Communications Protection and CM: Configuration Management control families. For example, SC-13 and SC-28 introduce strict standards for cryptographic protection and the protection of data at rest. Using the OOTB detection rules in Cloud Security Misconfigurations, we’re able to validate that default encryption is enabled on all S3 buckets, that all EBS snapshots that handle federal data are encrypted, and that etcd key-value stores are encrypted at rest.
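Conceptually, each of these checks evaluates a resource's configuration against a rule. The sketch below shows that idea over a small in-memory inventory; the inventory format, the `encrypted_at_rest` field, and the resource IDs are hypothetical, and this is not how Cloud Security Misconfigurations rules are actually written.

```python
from typing import Any

# Hypothetical inventory snapshot; in practice this data would come from
# cloud provider APIs or a configuration scanner.
inventory: list[dict[str, Any]] = [
    {"id": "s3://audit-archive", "type": "s3_bucket", "encrypted_at_rest": True},
    {"id": "snap-0a1b2c", "type": "ebs_snapshot", "encrypted_at_rest": False},
    {"id": "etcd-cluster-1", "type": "etcd", "encrypted_at_rest": True},
    {"id": "s3://scratch", "type": "s3_bucket"},  # field missing entirely
]


def unencrypted_resources(resources: list[dict[str, Any]]) -> list[str]:
    """Return IDs of resources that are not confirmed as encrypted at rest.

    A missing field is treated as a failure, so unknown state never passes.
    """
    return [r["id"] for r in resources if not r.get("encrypted_at_rest", False)]


print(unencrypted_resources(inventory))  # ['snap-0a1b2c', 's3://scratch']
```

Treating a missing field as a failure is a deliberate choice in this sketch: for a data-at-rest control, an unknown encryption state is safer to surface than to ignore.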

Cloud Security Misconfigurations helps us ensure that our data at rest is properly encrypted.

Additionally, Cloud Security Compliance continuously evaluates our organization’s compliance posture against various industry frameworks, including NIST 800-53. It provides benchmark monitoring and reporting functionality for various resource types, including Ubuntu OS, Kubernetes, Docker, and more. We are actively developing configuration benchmarks for additional resources to support FedRAMP Revision 5 STIG requirements.

Cloud Security Compliance automatically identifies the most frequently failed findings across our environment by severity and prevalence. This enables our engineers to prioritize fixing the most important security violations. The feature also breaks down our compliance against different control families and their individual controls. By ensuring that our cloud resources within our Datadog FedRAMP organization pass the required rules for FedRAMP High, we’re able to demonstrate proof of compliance across benchmarked resources with the various requirements listed for each NIST control.
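Ranking failed findings by severity and prevalence can be sketched with a counter and a sort. The exact ordering Cloud Security Compliance uses is not described here, so the tie-breaking below (severity first, then number of failing resources, then rule name) is an assumption, as are the rule names and the severity scale.

```python
from collections import Counter

# Assumed severity scale; higher numbers sort first.
SEVERITY_RANK = {"critical": 4, "high": 3, "medium": 2, "low": 1}


def prioritize(failed: list[tuple[str, str]]) -> list[tuple[str, str, int]]:
    """Rank (rule, severity) failures by severity, then by how often they occur.

    Each input tuple represents one failing resource for a given rule.
    """
    counts = Counter(failed)
    ranked = ((rule, severity, n) for (rule, severity), n in counts.items())
    return sorted(ranked, key=lambda f: (-SEVERITY_RANK[f[1]], -f[2], f[0]))


failed_findings = [
    ("s3-default-encryption", "high"),
    ("s3-default-encryption", "high"),
    ("etcd-encryption-at-rest", "critical"),
    ("ebs-snapshot-encrypted", "high"),
]
for finding in prioritize(failed_findings):
    print(finding)
# ('etcd-encryption-at-rest', 'critical', 1)
# ('s3-default-encryption', 'high', 2)
# ('ebs-snapshot-encrypted', 'high', 1)
```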

Evaluate your compliance against industry frameworks with Cloud Security Compliance.
Note: The image above is taken from a demo organization and does not reflect the compliance posture of internal Datadog organizations.

How we tracked compliance initiatives using custom dashboards

As Datadog prepared to apply for FedRAMP High authorization, our teams needed continuous visibility into the rollout of various compliance initiatives, the implementation status of security controls, and the actual effectiveness of our solutions. In this section, we’ll discuss how we used custom dashboards to track the progress of implementing cryptographic modules accepted under Federal Information Processing Standards (FIPS) 140-2 and of ensuring compliance of internal user accounts.

FIPS-validated cryptography

Compared to FedRAMP Moderate, the High baseline introduces significantly more rigorous expectations for the enforcement and visibility of encryption controls, as detailed in the FIPS 140-2 publication. Controls such as SC-8: Transmission Confidentiality and Integrity, SC-13, and SC-28 expand both the scope of systems and data flows requiring protection, and the precision with which that protection must be validated. These stricter controls required us to implement FIPS-validated cryptography for a wide range of scenarios—such as data at rest, data in transit (e.g., traffic between compute instances and storage services), and every cryptographic operation performed on customer or protected data.

To track the progress of implementing FIPS 140-2–validated cryptography, our engineers developed a custom dashboard that monitors the use of secure configurations across infrastructure running in our organization that handles federal data. This dashboard leverages metadata collected by the Datadog Agent, which is deployed across systems in the environment. The Agent reports configuration and system-level data into the platform, enabling tailored visibility into the adoption of hardened system images and cryptographic settings aligned with US federal standards. Teams use this dashboard to continuously assess compliance posture and quickly identify the status of infrastructure development such as implementing FIPS image labeling, TLS support, and creating FIPS-validated endpoints to send and receive traffic.
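The kind of rollup such a dashboard displays can be approximated as a per-team adoption percentage computed from host metadata. The sketch below is illustrative only: the metadata keys `team`, `fips_image`, and `fips_endpoint` are hypothetical and do not reflect the tags the Datadog Agent actually reports.

```python
from collections import defaultdict
from typing import Any


def fips_adoption_by_team(hosts: list[dict[str, Any]]) -> dict[str, float]:
    """Return the percentage of hosts per team that meet both FIPS criteria."""
    totals: dict[str, int] = defaultdict(int)
    compliant: dict[str, int] = defaultdict(int)
    for host in hosts:
        team = host["team"]
        totals[team] += 1
        # A host counts only if it runs a FIPS-labeled image AND sends
        # traffic to a FIPS-validated endpoint (both keys are hypothetical).
        if host.get("fips_image") and host.get("fips_endpoint"):
            compliant[team] += 1
    return {team: round(100 * compliant[team] / totals[team], 1) for team in totals}


hosts = [
    {"team": "ingest", "fips_image": True, "fips_endpoint": True},
    {"team": "ingest", "fips_image": True, "fips_endpoint": False},
    {"team": "storage", "fips_image": True, "fips_endpoint": True},
]
print(fips_adoption_by_team(hosts))  # {'ingest': 50.0, 'storage': 100.0}
```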

User account compliance

To help enforce internal user account requirements within our organization that handles federal data, we built a custom dashboard to monitor access compliance on an ongoing basis. FedRAMP High introduces stricter requirements for user account activity compared to FedRAMP Moderate, increasing the frequency of account reviews from quarterly to monthly for privileged accounts, and from annually to every six months for non-privileged accounts. We began automating account reviews with the help of dashboards to ensure compliance at the updated intervals while reducing the manual effort required to compile user activity reports. The dashboard ingests account-related metadata from internal sources and continuously evaluates each user’s status against defined internal account requirements. It proactively flags users approaching non-compliance, such as those who have not yet completed their annual training, as well as those already out of compliance, enabling our Corporate IT team to take timely corrective action. This supports access control and personnel screening objectives such as AT-3: Role-Based Training while mitigating the risk of unauthorized access.
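The review cadence described above (monthly for privileged accounts, every six months for non-privileged accounts) lends itself to a small status check. This is a sketch under stated assumptions, not our internal implementation: the intervals are approximated as 30 and 182 days, and the 7-day warning window and status labels are hypothetical.

```python
from datetime import date, timedelta

# Approximations of "monthly" and "every six months"; both are assumptions.
REVIEW_INTERVAL = {True: timedelta(days=30), False: timedelta(days=182)}
# Hypothetical window for flagging accounts that are approaching a deadline.
WARNING_WINDOW = timedelta(days=7)


def review_status(last_review: date, privileged: bool, today: date) -> str:
    """Classify an account as 'compliant', 'approaching', or 'overdue'."""
    due = last_review + REVIEW_INTERVAL[privileged]
    if today > due:
        return "overdue"
    if today >= due - WARNING_WINDOW:
        return "approaching"
    return "compliant"


last = date(2025, 1, 1)
print(review_status(last, privileged=True, today=date(2025, 1, 10)))   # compliant
print(review_status(last, privileged=True, today=date(2025, 1, 25)))   # approaching
print(review_status(last, privileged=True, today=date(2025, 2, 5)))    # overdue
print(review_status(last, privileged=False, today=date(2025, 2, 5)))   # compliant
```

Passing `today` explicitly keeps the function deterministic, which makes it easier to test and to re-run against historical dates during an audit.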

Improve your compliance posture with Datadog

Monitoring highly sensitive data requires the right tools and frameworks, many of which are available on the Datadog platform and can be integrated alongside the tools you already use to monitor your tech stack. To learn more, check out this blog post on why FedRAMP High matters for Government IT teams. You can also learn how our FIPS-enabled Agent helps you monitor highly regulated workloads. Explore our documentation for more details.
