Security and DevOps engineers often spend significant time and effort creating and managing complex, repetitive workflows, such as incident response, honeypotting, and recovery and remediation. Blink is a no-code security platform that enables users to create workflow automations, triggers, and self-service apps to streamline processes, better enforce guardrails, and eliminate operational bottlenecks. These capabilities help reduce the time and effort your security and DevOps teams need to create a more dependable, secure system.
Datadog now offers an out-of-the-box Blink integration and software license via the Datadog Marketplace. Joint users can enhance their infrastructure monitoring and incident response workflows with Blink’s workflow automations, which are powered by native Datadog actions such as creating a monitor, creating events, and querying metrics.
In this post, we’ll show you how to:
- Automate infrastructure monitoring as you scale
- Enhance cyber incident response by automatically triggering Datadog actions
- Enrich your Datadog alerts with Blink workflow automations
Once you’ve enabled the integration, you can create workflow automations that will ensure your infrastructure is being thoroughly monitored as you spin up new cloud resources. For instance, say you’re a platform engineer at a startup that is scaling quickly and runs on Amazon EC2 instances. Because your infrastructure is growing fast, you want to make sure that all your new cloud resources are being monitored to avoid visibility gaps, which could lead to application performance issues and degraded end-user experience. You can leverage a Blink automation for Datadog that scans your EC2 instances to find ones that aren’t yet monitored.
The automation lists all your active Datadog monitors and synthetic tests, then gets all your AWS EC2 instances. It then compares your monitors with your EC2 instances to find any instances that no monitor covers, and reports them to you automatically via Slack. Once you have a list of unmonitored cloud resources, you can create a monitor for each instance to alert you when changes in critical metrics (such as latency, resource usage, and errors) violate SLAs or require investigation.
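The comparison step can be pictured with a minimal Python sketch. This is not Blink's implementation: the function name, the sample data, and the matching rule (an instance counts as monitored if its ID appears in a monitor's query or tags) are all assumptions made for illustration. In a real workflow, the monitor and instance lists would come from Datadog and AWS rather than being hard-coded.

```python
from typing import Dict, Iterable, List


def find_unmonitored_instances(
    instance_ids: Iterable[str], monitors: Iterable[Dict]
) -> List[str]:
    """Return the instance IDs that no monitor query or tag mentions.

    Assumption: a simple substring match on the instance ID stands in
    for whatever matching logic the real automation uses.
    """
    # Flatten each monitor into one searchable string (query plus tags).
    searchable = [
        " ".join([m.get("query", ""), *m.get("tags", [])]) for m in monitors
    ]
    return sorted(
        iid for iid in instance_ids
        if not any(iid in text for text in searchable)
    )


# Hypothetical sample data standing in for the two API responses.
monitors = [
    {
        "name": "High CPU",
        "query": "avg(last_5m):avg:aws.ec2.cpuutilization{host:i-0aaa111} > 90",
        "tags": [],
    },
    {
        "name": "Disk usage",
        "query": "avg(last_5m):avg:system.disk.in_use{*} by {host} > 0.9",
        "tags": ["host:i-0bbb222"],
    },
]
instance_ids = ["i-0aaa111", "i-0bbb222", "i-0ccc333"]

unmonitored = find_unmonitored_instances(instance_ids, monitors)
print(unmonitored)  # ['i-0ccc333']
```

The resulting list is what the workflow would post to Slack and use as input for creating the missing monitors.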
In addition to helping you monitor your infrastructure as you scale up and down, the integration also allows you to automatically trigger Datadog actions from Blink workflows.
For example, say you’re a SecOps engineer on a healthcare application that must adhere to strict compliance regulations. You create a daily scheduled Blink automation that checks Datadog Cloud Security Management for misconfigured resources that could indicate SOC 2 compliance violations. If the check finds any issues that violate SOC 2 requirements, the workflow automation will create a new incident in Datadog. From there, you can pivot to Datadog Incident Management to track your team’s progress in investigating and remediating the issue from a single, consolidated platform. During your investigation, you can use other areas of the Datadog platform, such as Cloud Security Posture Management, to identify any misconfigured cloud resources or other issues that may be at the root of the SOC 2 compliance violation.
Or, say you’re a SecOps engineer at a social media site with a large attack surface. You create a Blink action that is triggered when Datadog detects a Security Signal with a high or critical priority status. This action will then query the Datadog Events Stream for events that occurred during the time frame when the Security Signal was triggered and are tagged with the associated service. To complete the incident response workflow, your Blink action can then automatically send a list of events matching these criteria to the appropriate member of your team via Slack, so they can investigate further and determine the severity of the Security Signal.
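The event lookup in this scenario boils down to two filters: a time window around the signal and a service tag. The sketch below illustrates that logic in Python on in-memory data. The function name, the event shape, and the 15-minute window are assumptions for illustration only; the actual Blink action would query Datadog directly.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def events_for_signal(
    events: List[Dict],
    signal_time: datetime,
    service: str,
    window: timedelta = timedelta(minutes=15),  # assumed window size
) -> List[Dict]:
    """Keep events near the signal time that carry the service tag."""
    start, end = signal_time - window, signal_time + window
    tag = f"service:{service}"
    return [
        e for e in events
        if start <= e["timestamp"] <= end and tag in e["tags"]
    ]


# Hypothetical events standing in for a Datadog query result.
signal_time = datetime(2023, 1, 10, 12, 0, tzinfo=timezone.utc)
events = [
    {"title": "Deploy finished", "tags": ["service:feed"],
     "timestamp": signal_time - timedelta(minutes=5)},
    {"title": "Config change", "tags": ["service:auth"],
     "timestamp": signal_time - timedelta(minutes=2)},
    {"title": "Old restart", "tags": ["service:feed"],
     "timestamp": signal_time - timedelta(hours=3)},
]

matching = events_for_signal(events, signal_time, "feed")
print([e["title"] for e in matching])  # ['Deploy finished']
```

Only the first event survives both filters, so that is the one the workflow would forward to the on-call teammate.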
You can also use Datadog alerts to kick off Blink workflow automations. For example, say you are a DevOps engineer working on a banking app, and your team uses Datadog to monitor your infrastructure. You have set up a monitor in Datadog to alert you when a service running on Kubernetes fails after a recent deployment. Within Blink, you can set up an automated rollback and incident response workflow that initiates every time this Datadog alert triggers.
Datadog sends the alert to Blink, where the workflow automation will be triggered and automatically roll back the most recent deployment to the last stable version and run a performance test to confirm the issue is resolved.
Blink also integrates with communication tools like email, Slack, and Microsoft Teams to optimize incident response and allow teams to notify each other about what’s been done, add decision points or questions to bring clarity and context to your workflows, and collaborate on troubleshooting. To continue our example, you can get, list, and query specific metric metadata from the service failure incident that was created following the alert, and send an interactive notification over Slack—including the test result—to the service owner to ask whether the incident should be closed in Datadog. After a decision has been made, you or the service owner can close the incident directly from Slack.
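To make the order of operations concrete, here is a rough Python sketch of the rollback workflow described above: roll back, run a performance test, then ask the service owner whether to close the incident. Blink builds this as a no-code workflow, so the code is purely illustrative. Every name in it is hypothetical, and the three callables are stand-ins for the real rollback, test, and Slack steps.

```python
from typing import Callable, Dict, List


def handle_deployment_alert(
    alert: Dict,
    rollback: Callable[[str], None],
    run_performance_test: Callable[[str], bool],
    ask_owner_to_close: Callable[[str, bool], bool],
) -> List[str]:
    """Run the workflow steps in order and return a log of what happened."""
    service = alert["service"]
    steps = []

    # Step 1: roll back to the last stable version.
    rollback(service)
    steps.append("rolled_back")

    # Step 2: confirm the rollback fixed the issue.
    passed = run_performance_test(service)
    steps.append("test_passed" if passed else "test_failed")

    # Step 3: send the test result to the owner, who decides on closure.
    should_close = ask_owner_to_close(service, passed)
    steps.append("incident_closed" if should_close else "incident_open")
    return steps


# Stand-in implementations so the sketch runs without external systems.
rolled_back: List[str] = []
steps = handle_deployment_alert(
    {"service": "payments"},
    rollback=rolled_back.append,
    run_performance_test=lambda service: True,
    ask_owner_to_close=lambda service, passed: passed,
)
print(steps)  # ['rolled_back', 'test_passed', 'incident_closed']
```

Passing the steps in as callables keeps the decision point (the owner's answer) separate from the automated steps, which mirrors how the interactive Slack notification sits between the test and the incident closure.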
Datadog’s Blink integration enables you to trigger workflow automations in Blink from Datadog events, as well as trigger native Datadog actions from Blink workflows, helping you enforce compliance and security guardrails across your distributed infrastructure. The integration complements Datadog Incident Management, Cloud Security Monitoring, SIEM, Infrastructure Monitoring, Container Monitoring, and more. If you’re new to Datadog, sign up for a 14-day free trial.
The ability to promote branded marketing tools is a membership benefit offered through the Datadog Partner Network. You can learn more about the Datadog Marketplace in this blog post. If you’re interested in developing an integration or application that you’d like to promote, you can contact us at email@example.com.