Debug Issues and Automate Remediation With Shoreline and Datadog | Datadog

Debug issues and automate remediation with Shoreline and Datadog

Author Thomas Sobolik

Published: April 21, 2022

Shoreline is an incident response automation service that enables DevOps engineers and site reliability engineers (SREs) to quickly debug and remediate issues at scale and develop automated routines for incident management. Using Shoreline’s proprietary Op language, customers can run debug commands across all their hosts simultaneously and then deploy custom scripts via Actions to trigger automated remediations.

We’re pleased to announce the release of the Shoreline Datadog App, enabling our users to leverage these debug and repair features entirely within the Datadog UI. The app is now available for installation from the Datadog integrations page for all joint Shoreline and Datadog customers. In this post, we’ll discuss how you can use the app to issue Shoreline debug commands from an out-of-the-box dashboard, and then connect Shoreline remediations to Datadog alerts.

Generate a remediation dashboard

Once you’ve installed the app, you can access an out-of-the-box Shoreline Remediation Dashboard. Shoreline automatically finds all your Datadog monitors currently tracking metrics on Shoreline-instrumented hosts and displays them in a table via a custom widget. The table also includes each monitor’s alert status as well as a list of relevant tags, such as service, Kubernetes pod, or cloud provider account. Using the dashboard’s monitor filter, you can filter these tags to home in on the alerts you care about.

Auto-generated Shoreline Remediation Dashboard

Real-time debugging and automated remediation

If one of your monitors is firing an alert, you’ll want to investigate the underlying hosts immediately to begin diagnosing the problem. Using the link in each table entry, you can open a side panel to run debug statements and configure automated remediations for a monitor’s relevant hosts—without ever leaving the Datadog UI. In the Debug tab, you can enter debugging commands using Shoreline’s Op language to query for important metrics, state information, command line output, and more from all hosts associated with the monitor you’re investigating. For example, let’s say you’ve received an alert that the CPU usage on a service running on a Kubernetes cluster is high. You can query for the CPU usage on each pod to better understand the source of the problem and then grab logs from the affected pods to view which tasks are running. You can come back to this after remediation to verify that your configured Action successfully fixed the issue.

Debugging issues from the dashboard's Debug sidepanel

Using Shoreline Actions, you can configure remediation scripts to run automatically across all relevant hosts simultaneously. In addition to the Debug tab, the side panel also includes an Automate tab, which enables you to create and manage Shoreline Actions. If your team has already created Actions with Shoreline, you can choose an Action from the dropdown menu and link it to the alert you’re investigating. Or, you can create a new Action directly in the Datadog UI, setting its name and description and inputting the Op statements that the Action will execute. If your Action depends on artifacts like Python or bash scripts, you can upload them directly to guarantee that they will be available to Shoreline as it runs your Action on each host.

Configuring Shoreline Actions using the Automate sidepanel

Once you’ve configured your Action, Shoreline automatically links it to its associated monitor in Datadog. You can choose to run the Action automatically whenever the monitor enters an alert status, or trigger it manually from the dashboard. Additionally, you can track all Shoreline commands—including debug commands and triggered Actions—as Datadog Events, so you can track incident management activity alongside notifications for new code deployments, service health checks, configuration changes across dozens of supported managed service integrations, triggered alerts, and more.

Debug and remediate across thousands of hosts simultaneously

Shoreline’s new Datadog App enables you to execute debug commands and run your Shoreline Actions across thousands of hosts simultaneously directly from Datadog, saving you DevOps overhead, reducing your MTTR, and empowering more engineers and developers at your organization to take on SRE work. You can purchase a subscription to Shoreline or try it by signing up for a free 14-day trial via the Datadog Marketplace. The ability to promote branded monitoring tools in the Datadog Marketplace is one of the benefits of membership in the Datadog Partner Network. You can learn more about the Datadog Marketplace in our blog post, and you can contact us at marketplace@datadog.com if you’re interested in developing an integration or application And if you’re brand new to Datadog, get started with a .