Serverless platforms like AWS Fargate enable teams to focus on delivering value to customers by freeing up time otherwise spent managing infrastructure and operations. However, maintaining a deep level of observability into applications running on these fully managed platforms remains challenging. As a DevOps engineer, SRE, or application developer, understanding the performance of processes running on your serverless infrastructure is critical, as these are the building blocks that power your application and consume system resources.
Datadog Live Processes enables you to see and resolve process-related problems across your infrastructure. We’re excited to announce we’ve just expanded this ability to applications running on AWS Fargate, a fully managed compute engine for Amazon ECS and EKS that many organizations use to simplify the deployment and operation of containerized applications at scale.
Gaining process-level visibility can be difficult in a serverless environment like AWS Fargate, where compute resources are outside your team’s ownership and direct control. Now, you can use Live Processes to see every process running across all your ECS tasks in one place, monitor their resource metrics, isolate processes causing crashes or latency, investigate anomalous behavior with Watchdog, and spot suspicious processes running on your serverless containers. You can also view these processes in the context of other Datadog products, whether you’re investigating an error log in Log Management, a latency metric in APM, or a vulnerability in Cloud Security Management.
In this post, we’ll show you how to use Datadog Live Processes to start investigating processes for your applications running in AWS Fargate.
Even in a fully managed environment like Fargate, problems with your processes can cause issues that teams need to know about. These include application or network latency, processes “stealing” resources from more important processes in your application, container restarts, memory leaks, and even security issues. Monitoring your AWS Fargate processes in Datadog allows you to identify these issues so you can quickly take steps to resolve them and ensure your application runs as expected.
Once you have configured process monitoring in your AWS Fargate environment, navigate to the Live Processes page to view and investigate processes running in AWS Fargate. Datadog pulls in task metadata, so you can use the Task Name, Task Version, and Task Family facets from the facet column on the left side of the page to help you filter and sort processes by these variables and investigate the associated metrics, traces, and logs.
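As a rough sketch of what that configuration step can look like on ECS, the Python snippet below builds a task definition as a plain dictionary. It assumes that process collection requires sharing the task's PID namespace (`pidMode` set to `task`) and a process-collection environment variable on the Agent container. The environment variable name, the family and container names, and the CPU and memory sizes are assumptions for illustration, so check the Datadog documentation for the exact settings that apply to your Agent version.

```python
import json


def build_task_definition(family, app_image):
    """Return an ECS task definition dict with a Datadog Agent sidecar.

    All names and values here are illustrative assumptions, not a
    configuration confirmed by this post.
    """
    return {
        "family": family,
        "requiresCompatibilities": ["FARGATE"],
        "networkMode": "awsvpc",
        "cpu": "512",
        "memory": "1024",
        # Share one PID namespace across the task so the Agent sidecar
        # can see processes in the other containers (assumed requirement).
        "pidMode": "task",
        "containerDefinitions": [
            {"name": "app", "image": app_image, "essential": True},
            {
                "name": "datadog-agent",
                "image": "public.ecr.aws/datadog/agent:latest",
                "essential": True,
                "environment": [
                    # Placeholder; in practice, inject the key as a secret.
                    {"name": "DD_API_KEY", "value": "<YOUR_API_KEY>"},
                    {"name": "ECS_FARGATE", "value": "true"},
                    # Assumed variable name for enabling process collection.
                    {
                        "name": "DD_PROCESS_CONFIG_PROCESS_COLLECTION_ENABLED",
                        "value": "true",
                    },
                ],
            },
        ],
    }


if __name__ == "__main__":
    task_def = build_task_definition("example-family", "example/app:1.0")
    print(json.dumps(task_def, indent=2))
```

The resulting JSON could be adapted into a task definition file and registered with your usual deployment tooling. Building it as a dictionary first makes it easy to review the process-related settings before anything is deployed.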
We’ve also added the AWS Fargate facet, which enables you to filter by containers orchestrated in either ECS or EKS, since Fargate can be used to run containers with both platforms.
By selecting this filter, you can quickly view only the processes running in your Fargate-powered serverless containers. Now, when there is an issue in your Fargate environment (for example, high latency on one of your services), you can easily identify any failing processes, or any tasks consuming abnormally high CPU or memory resources, that could be causing the issue.
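To make that triage step concrete, here is a minimal Python sketch of the same logic applied to process data you have already gathered. It is not a Datadog API. The `ProcessSample` type, the thresholds, and the sample records are all hypothetical and exist only to illustrate flagging processes with unusually high CPU or memory usage.

```python
from dataclasses import dataclass


@dataclass
class ProcessSample:
    """One hypothetical process observation; the fields are illustrative."""
    command: str
    task_family: str
    cpu_percent: float
    rss_mb: float


def flag_heavy_processes(samples, cpu_threshold=80.0, rss_threshold_mb=1024.0):
    """Return samples at or above either threshold, highest CPU first."""
    heavy = [
        s for s in samples
        if s.cpu_percent >= cpu_threshold or s.rss_mb >= rss_threshold_mb
    ]
    return sorted(heavy, key=lambda s: s.cpu_percent, reverse=True)


if __name__ == "__main__":
    # Made-up sample data, not real measurements.
    samples = [
        ProcessSample("nginx", "web", 12.5, 150.0),
        ProcessSample("python worker.py", "jobs", 93.0, 600.0),
        ProcessSample("java -jar api.jar", "api", 40.0, 2048.0),
    ]
    for s in flag_heavy_processes(samples):
        print(f"{s.task_family}: {s.command} cpu={s.cpu_percent}% rss={s.rss_mb}MB")
```

Fixed thresholds keep the example short. In practice, "abnormally high" is usually judged against a process's own baseline rather than a single cutoff.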
In addition, with Datadog Agent v7.50.0+, you can see your AWS Fargate processes alongside data about the containers they run in. Datadog automatically tags your Fargate processes with container metadata, like container name and image, so that you can more easily investigate issues with a process alongside its container and vice versa. For example, if you identify a process that is experiencing a resource or configuration issue, you can see if the container is generating a high number of error logs, or check for AWS Fargate metrics that could indicate an integration or network issue associated with that container, such as a high number of TCP retransmits.
The AWS Fargate facet and the task-specific facets are also available in the Containers view, so you can easily pivot your investigation there and view metrics for the containers you are interested in.
Live Processes for AWS Fargate helps ensure that you and your team have the visibility you need to investigate problems with your applications, services, and infrastructure running in AWS Fargate. For more information, see our documentation on enabling Live Processes for your infrastructure. If you’re not using Datadog yet, sign up for a 14-day free trial.