Monitoring AWS Lambda with Datadog

Author Mallory Mooney

Last updated: April 9, 2021

In Part 2 of this series, we looked at how Amazon’s built-in monitoring services can help you get insights into all of your AWS Lambda functions. In this post, we’ll show you how to use Datadog to monitor all of the metrics emitted by Lambda, as well as function logs and performance data, to get a complete picture of your serverless applications.

Visualize your AWS Lambda metrics with Datadog's out-of-the-box integration dashboard.

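In this post, we will:

  • Enable Datadog’s AWS integration to collect and visualize Lambda metrics
  • Set up Datadog’s Lambda Library for custom metrics, enhanced metrics, and traces
  • Collect and analyze Lambda logs
  • Set up anomaly detection, forecasts, and alerts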

Enable Datadog’s AWS integration

Datadog integrates with AWS Lambda and other services such as Amazon API Gateway, S3, and DynamoDB. If you’re already using Datadog’s AWS integration and your Datadog role has read-only access to Lambda, make sure that “Lambda” is checked in your AWS integration tile and skip to the next section.

Configure AWS Lambda metric collection

To get started, configure IAM role delegation and an IAM policy that grants your Datadog role read-only access to AWS Lambda and any other services you wish to monitor. You can find an example policy in our documentation.

If you use other AWS integrations with Lambda, such as AWS Step Functions or Amazon EFS for Lambda, there are a few permissions that you will need to include in your Datadog IAM policy:

  • states:ListStateMachines: List active Step Functions
  • states:DescribeStateMachine: Get Step Functions metadata and tags
  • elasticfilesystem:DescribeAccessPoints: List active EFS resources connected to Lambda functions
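
As a rough sketch, the permissions listed above could be added to your Datadog IAM policy as one extra statement. The snippet below builds that statement as a plain JavaScript object and prints it as JSON. The `Resource: "*"` scope and the variable names are assumptions made for illustration, so compare the output with the example policy in the documentation before using it.

```javascript
// Sketch: extra IAM policy statement for Step Functions and EFS metadata.
// Assumption: the actions are granted on all resources ("*").
const extraDatadogStatement = {
  Effect: "Allow",
  Action: [
    "states:ListStateMachines",
    "states:DescribeStateMachine",
    "elasticfilesystem:DescribeAccessPoints",
  ],
  Resource: "*",
};

const policyDocument = {
  Version: "2012-10-17",
  Statement: [extraDatadogStatement],
};

// Print the policy so it can be pasted into the IAM console or a template.
console.log(JSON.stringify(policyDocument, null, 2));
```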

Then navigate to the AWS integration tile in your Datadog account. Add your AWS account information, along with the name of the IAM role you configured. Make sure that you select “Lambda” (along with the names of any other services you want to start monitoring).

Enable Datadog's Lambda integration in the AWS integration tile.

Visualize your AWS Lambda metrics

Datadog will automatically start collecting the key Lambda metrics discussed in Part 1, such as invocations, duration, and errors, and generate real-time enhanced metrics for your Lambda functions. You can easily visualize all of this data with Datadog’s out-of-the-box integration and enhanced metrics dashboards, giving you deep visibility into the performance of your Lambda functions.

View all of your Lambda enhanced metrics in Datadog's out-of-the-box integration dashboard

You can also customize your dashboards to include function logs and trace data, as well as metrics from all of your services, not just Lambda. Check out our documentation for more information about creating custom dashboards for your services.

Get more insight with Datadog’s Lambda Library

Though Datadog’s AWS Lambda integration automatically collects standard metrics (e.g., duration, invocations, concurrent executions), you can also set up Datadog’s Lambda Library to get deeper insights from your code. In this section, we’ll show you how the Lambda Library can help you collect custom business metrics, distributed traces, and enhanced metrics from your functions. Datadog’s Lambda Library runs as a part of each function’s runtime, and works with the Datadog Lambda Forwarder to generate high-granularity enhanced metrics and automatically surface actionable insights into your functions. Data collected with the Lambda Library complements the metrics, logs, and other traces that you are already collecting from services outside of Lambda.

Set up the Lambda Library

You can get started by adding the Datadog Lambda Library ARN (Amazon Resource Name) to your function.

Add Datadog's Lambda Library ARN to your function

This ARN requires a region, runtime, and version. Check out our documentation to see supported runtimes and versions. You will also need to add your Datadog API key to the function’s environment variable section.
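
To make the three inputs concrete, here is a minimal sketch of how a layer ARN could be assembled from a region, runtime, and version. The `buildLayerArn` helper is hypothetical, and the account ID and `Datadog-<RUNTIME>` naming are assumptions based on the pattern published in Datadog's documentation, so confirm both there before deploying.

```javascript
// Sketch: assemble a Datadog Lambda Library layer ARN from its three inputs.
// Assumptions: the account ID and "Datadog-<RUNTIME>" layer naming follow the
// pattern in Datadog's documentation; verify both before deploying.
const DATADOG_LAYER_ACCOUNT_ID = "464622532012";

function buildLayerArn({ region, runtime, version }) {
  if (!region || !runtime || !Number.isInteger(version) || version < 1) {
    throw new Error("region, runtime, and a positive integer version are required");
  }
  return `arn:aws:lambda:${region}:${DATADOG_LAYER_ACCOUNT_ID}:layer:Datadog-${runtime}:${version}`;
}

// Example values are placeholders, not a recommendation of a specific release.
console.log(buildLayerArn({ region: "us-east-1", runtime: "Node12-x", version: 1 }));
```

Validating the inputs up front means a missing region or version fails at build time rather than producing an ARN that AWS rejects at deploy time.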

If you use the AWS Serverless Application Model (SAM) or AWS Cloud Development Kit (CDK) to deploy your applications, you can automatically send observability data from your Lambda functions to Datadog with Datadog’s serverless macro.

Custom business metrics

Custom metrics give additional insights into use cases that are unique to your application workflows, such as a user logging into your application, purchasing an item, or updating a user profile.

The Lambda Library can send custom metrics asynchronously or synchronously. Sending metrics asynchronously is recommended because it does not add any overhead to your code, making it an ideal solution for functions that power performance-critical tasks for your applications. To emit metrics asynchronously, add the DD_FLUSH_TO_LOG environment variable to your Lambda function and set it to true. Make sure that you’re using version 3.0.0+ of Datadog’s log forwarder function.

Datadog provides Node.js, Python, Go, Ruby, and Java libraries for instrumenting your functions. To get started, import the appropriate Lambda Library methods and add a wrapper around your function, as seen in the example Node.js function snippet below:

const { datadog, sendDistributionMetric } = require("datadog-lambda-js");

async function customHandler(event, context) {
  sendDistributionMetric(
    "delivery_application.meal_value", // Metric name
    13.54,                             // Metric value
    "item:pizza", "order:online"       // Associated tags
  );
  return {
    statusCode: 200,
    body: "Item purchased for delivery",
  };
}
// Wrap your handler function:
module.exports.customHandler = datadog(customHandler);

As the function code is invoked, the Lambda Library will automatically emit the delivery_application.meal_value metric to Datadog. You can read more about instrumenting your Lambda functions to send custom metrics in our documentation.
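
In practice, tags usually come from the event being processed rather than from hard-coded strings. The sketch below shows one way to derive `key:value` tags from an order. The `buildOrderTags` helper and the shape of the order object are hypothetical, and the normalization (lowercasing, replacing whitespace with underscores) is an assumption chosen to keep tag values consistent.

```javascript
// Sketch: derive Datadog-style "key:value" tags from a hypothetical order object.
function buildOrderTags(order) {
  return [
    `item:${order.item}`,
    `order:${order.channel}`,
  ].map((tag) => tag.toLowerCase().replace(/\s+/g, "_"));
}

const tags = buildOrderTags({ item: "Pizza", channel: "online" });
console.log(tags); // [ 'item:pizza', 'order:online' ]

// Inside a wrapped handler, the tags could then be spread into the call:
// sendDistributionMetric("delivery_application.meal_value", order.total, ...tags);
```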

Enhanced metrics

Along with collecting custom metrics, you will also be able to analyze enhanced metrics from your Lambda functions (collected by Datadog’s Lambda Library and Log Forwarder). Enhanced metrics will show up in Datadog with the aws.lambda.enhanced prefix. These metrics are collected at higher granularity than standard CloudWatch metrics, enabling you to view metric data in near real time in Datadog. For example, while Lambda errors are available as a standard CloudWatch metric, you can create an alert on the enhanced metric (aws.lambda.enhanced.errors) to get higher-granularity insights into potential issues.

Some enhanced metrics (such as billed duration and estimated execution cost) are automatically extracted from your Lambda logs, eliminating the need to create custom queries in CloudWatch. Enhanced metrics also include detailed metadata for your functions such as cold_start and any custom tags you added to your function in the Lambda console.

View a heat map of cold starts for your functions
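
To illustrate what extracting billed duration and estimated cost from logs involves, the sketch below parses the REPORT line that Lambda writes at the end of each invocation. It is not Datadog's implementation. The helper names are hypothetical, and the prices are assumptions (on-demand x86 rates of $0.0000166667 per GB-second and $0.20 per million requests) that vary by region and architecture.

```javascript
// Sketch: derive billed duration and an estimated cost from a Lambda REPORT log line.
// Assumption: on-demand x86 pricing; actual prices vary by region and architecture.
const PRICE_PER_GB_SECOND = 0.0000166667;
const PRICE_PER_REQUEST = 0.0000002;

function parseReportLine(line) {
  const billed = /Billed Duration: ([\d.]+) ms/.exec(line);
  const memorySize = /Memory Size: (\d+) MB/.exec(line);
  const maxMemory = /Max Memory Used: (\d+) MB/.exec(line);
  if (!billed || !memorySize || !maxMemory) {
    return null; // Not a REPORT line, or an unexpected format
  }
  return {
    billedDurationMs: Number(billed[1]),
    memorySizeMb: Number(memorySize[1]),
    maxMemoryUsedMb: Number(maxMemory[1]),
  };
}

function estimateCost({ billedDurationMs, memorySizeMb }) {
  // Compute charge = billed seconds x allocated GB x price, plus the request fee.
  const gbSeconds = (billedDurationMs / 1000) * (memorySizeMb / 1024);
  return gbSeconds * PRICE_PER_GB_SECOND + PRICE_PER_REQUEST;
}

const sample =
  "REPORT RequestId: 00000000-0000-0000-0000-000000000000\t" +
  "Duration: 12.34 ms\tBilled Duration: 100 ms\t" +
  "Memory Size: 128 MB\tMax Memory Used: 18 MB";

const report = parseReportLine(sample);
console.log(report, estimateCost(report));
```

With enhanced metrics, this parsing happens for you, so the values arrive in Datadog as metrics instead of requiring a query like this against raw logs.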

Datadog uses enhanced metrics to automatically generate insights into your functions, so you can see which ones are performing poorly. For example, if a function is using too much memory, Datadog will flag it in the UI and provide more context, such as related traces and logs, for faster troubleshooting.

The Lambda Library can also trace requests across all your Lambda functions instrumented with Datadog’s native tracing libraries and other systems running the Datadog Agent. In the next sections, we’ll show you how to start collecting and analyzing Lambda traces.

Native tracing for AWS Lambda functions

Datadog APM provides tracing libraries that you can use with the Lambda Library in order to natively trace request traffic across your serverless architecture. In the example below, you can see the full path of a request as it travels across services in your environment.

View the full path of a request as it travels across services

The Lambda Library automatically propagates trace context across service boundaries, so you can get end-to-end visibility of all requests, even as they travel across hosts, containers, and AWS Lambda functions. Traces are sent asynchronously so they don’t add any latency overhead to your serverless applications.
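
Conceptually, propagating trace context means attaching the current trace and span IDs to each outgoing request so the next service can continue the same trace. The sketch below illustrates the idea with the HTTP header names that Datadog tracers use. The `injectTraceContext` and `extractTraceContext` helpers are hypothetical stand-ins for what the libraries do for you, not part of any Datadog API.

```javascript
// Sketch: carrying trace context across a service boundary in HTTP headers.
// The helper functions are hypothetical and only illustrate the mechanism.
function injectTraceContext(context, headers = {}) {
  return {
    ...headers,
    "x-datadog-trace-id": context.traceId,
    "x-datadog-parent-id": context.spanId,
  };
}

function extractTraceContext(headers) {
  const traceId = headers["x-datadog-trace-id"];
  const parentId = headers["x-datadog-parent-id"];
  if (!traceId || !parentId) {
    return null; // No upstream context: the receiver would start a new trace
  }
  return { traceId, parentId };
}

// The caller attaches its context to the outgoing request...
const outgoing = injectTraceContext(
  { traceId: "1234567890", spanId: "987654321" },
  { "content-type": "application/json" }
);
// ...and the downstream function reads it to continue the same trace.
console.log(extractTraceContext(outgoing));
```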

Configure tracing

Currently, Datadog APM includes native support for tracing Lambda functions written in Go, Java, Node.js, Ruby, and Python. To get started, you will need to set up (or upgrade) Datadog’s Lambda Library and Lambda Forwarder for your function. Once configured, you can instrument your function code:

index.js

const { datadog } = require("datadog-lambda-js");
const tracer = require("dd-trace").init(); // Any manual tracer config goes here.

// This function will be wrapped in a span
const myFunction = tracer.wrap("my-function", () => {
  // [...]
});

// This function will also be wrapped in a span (based on the current function ARN).
module.exports.hello = datadog((event, context, callback) => {
  myFunction();
  callback(null, {
    statusCode: 200,
    body: "Hello from Lambda!"
  });
});

For AWS SAM and AWS CDK infrastructure, you can also use Datadog’s serverless macro to automatically collect traces from Lambda functions, without any instrumentation. Check out our documentation to learn more about using Datadog’s macro or one of our native tracing libraries with your Lambda functions.

Explore your trace data

To start analyzing trace data from your serverless functions, navigate to Datadog’s Serverless homepage, where you can view key function metrics alongside curated insights into function performance. Datadog provides visualizations you can customize to display the data you deem most important, such as iterator age, concurrent executions, and cold starts. You can also search for a group of functions with tags such as region, CloudFormation Stack name, and whether they were deployed to Lambda@Edge or as a Step Function.

View all your functions in the Serverless homepage

Clicking on a function shows you a full list of invocations, including key metrics, links to associated traces and logs, and insights, such as which invocations used over 95 percent of the function’s allocated memory.

View traces and logs and key metrics for a single AWS Lambda function in the Serverless homepage

In addition to viewing the performance of individual functions, you need a high-level view of your entire microservice infrastructure in order to troubleshoot application issues. Datadog APM automatically generates a Service Map based on your trace data, so you can visualize all your Lambda functions in one place and understand the flow of traffic across microservices in your environment.

View a service map of your AWS Lambda functions and connected services

You can also analyze and explore your Lambda trace data with Trace Search and Analytics. By using any combination of tags, you can quickly filter down to a specific service or function. Trace Search and Analytics also uses the tags that are automatically created with Datadog’s Lambda Library so you can filter functions by tags such as cold_start:true. The graph below displays the top five functions with cold starts over time, broken down by function name. If you like, you can easily export this to a monitor or dashboard.

Analyze your functions with Trace Search and Analytics

So far, we’ve shown you how to collect and analyze data with Datadog’s Lambda integration and Lambda Library. Next, we’ll look at how to collect and analyze your Lambda logs, and then explore how you can get more out of your data with Datadog’s predictive monitoring and alerts.

Monitor AWS Lambda logs with Datadog

Datadog provides a Lambda-based log forwarder that you can use to send logs and other telemetry, such as Amazon S3 events and Amazon Kinesis data stream events, to your account. You can deploy this function to your AWS account using the provided CloudFormation stack. Lambda applications use CloudFormation to package functions, AWS resources, and event sources together in order to perform specific tasks. When you deploy Datadog’s Lambda Forwarder as an application, AWS will automatically create the Lambda function with the appropriate role, add Datadog’s Lambda Library, and create relevant tags that you can search on in Datadog, like functionname and cloud_provider.

Because the log forwarder is a Lambda function, it relies on triggers to execute, which you can let Datadog automatically set up for you. You can choose which AWS services the log forwarder should start collecting logs from (e.g., Lambda, S3, classic ELBs) in the Collect Logs tab of your Datadog account’s AWS integration tile. Alternatively, you can manually set up triggers on S3 buckets or CloudWatch log groups. Once configured, Datadog’s Lambda Forwarder will begin sending logs from Lambda (and any other AWS services you’ve configured) to your Datadog account.
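
Anything your function writes to standard output ends up in its CloudWatch log group, so the format you log in determines how easy those logs are to search once they are forwarded. As a minimal sketch, the hypothetical `formatLogLine` helper below emits one JSON object per line. The field names are illustrative assumptions, not a required schema.

```javascript
// Sketch: emit one JSON object per log line so fields are easy to search on.
// The field names (level, message, orderId) are illustrative assumptions.
function formatLogLine(level, message, fields = {}) {
  return JSON.stringify({
    ...fields,
    level,
    message,
    timestamp: new Date().toISOString(),
  });
}

console.log(formatLogLine("error", "Payment declined", { orderId: "A-1001" }));
```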

Search and analyze your Lambda logs

Datadog enables you to search on, analyze, and easily discover patterns in your logs. You can use identifiers such as the function’s log group or name to search for your logs in the Log Explorer, as seen in the example below.

Explore your AWS Lambda logs in the Log Explorer

Lambda functions generate a large volume of logs, making it difficult to pinpoint issues during an incident or simply monitor the current state of your functions. You can use Log Patterns to help you surface interesting trends in your logs.

For example, if you notice a spike in Lambda errors on your dashboard, you can use Log Patterns to quickly search for the most common types of errors. In the example below, you can see a cluster of function logs for an AccessDeniedException permissions error. The logs provide a stack trace so you can troubleshoot further.

Quickly point out patterns in your AWS Lambda logs with Log Patterns

When you select a pattern, you can click on the View All button to pivot to the Log Explorer and inspect individual logs that exhibit that pattern, or you can analyze trends in your logs by clicking on the Graph button. For example, you can view the most invoked functions or a toplist of the most common function errors. You can then export the graph to a Lambda dashboard to monitor it alongside real-time performance data from your functions.

Visualize your logs with Log Analytics and export to a dashboard

Proactively monitor AWS Lambda with alerts

Once you’re aggregating all your Lambda metrics, logs, and traces with Datadog, you can automatically detect anomalies and forecast trends in key Lambda metrics. You can also set up alerts to quickly find out about issues.

As mentioned earlier, Datadog generates enhanced metrics from your function code and Lambda logs that help you track data such as errors in near real time, memory usage, and estimated costs. You can apply anomaly detection to metrics like max memory used (e.g., aws.lambda.enhanced.max_memory_used) in order to see any unusual trends in memory usage.

View anomalies in memory usage for your Lambda functions

You can also apply a forecast to the estimated_cost metric to determine if your costs are expected to increase, based on historical data.

Forecast trends in your AWS Lambda functions

Alert on critical AWS Lambda metrics

Monitoring Lambda enables you to visualize trends and identify issues during critical outages, but it’s easy to overlook an issue when you are monitoring a large volume of datapoints in complex infrastructures. To ensure that you are aware of critical issues affecting your applications, you can create monitors that notify you about key issues detected in your Lambda metrics, logs, or traces.

For example, you can create an alert to notify you if a function has been throttled frequently over a specific period of time. If you configure the alert to automatically trigger separate notifications per affected function, this saves you from creating duplicate alerts and enables you to get continuous, scalable coverage of your environment, no matter how many functions you’re running.

Create alerts on key AWS Lambda metrics
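
As an illustration, a throttling alert like the one described above could be expressed as a monitor definition. The sketch below builds one as a plain object. The `buildThrottleMonitor` helper is hypothetical, and the 15-minute window, the threshold of 10, and the notification handle are placeholder assumptions to tune for your workload. Grouping the query by the functionname tag is what produces a separate notification per affected function.

```javascript
// Sketch: a monitor definition that alerts per function on frequent throttling.
// Assumptions: window, threshold, and notification handle are placeholders.
function buildThrottleMonitor({ windowMinutes, threshold, notify }) {
  const query =
    `sum(last_${windowMinutes}m):` +
    `sum:aws.lambda.throttles{*} by {functionname}.as_count() > ${threshold}`;
  return {
    name: "Lambda function {{functionname.name}} is being throttled",
    type: "query alert",
    query,
    message: `Throttles exceeded ${threshold} in ${windowMinutes} minutes. ${notify}`,
    options: { thresholds: { critical: threshold } },
  };
}

const monitor = buildThrottleMonitor({
  windowMinutes: 15,
  threshold: 10,
  notify: "@your-team-handle",
});
console.log(JSON.stringify(monitor, null, 2));
```

Keeping the window and threshold as parameters makes it easy to reuse the same definition across environments with different traffic levels.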

Throttles occur when there is not enough capacity for a function, either because available concurrency is used up or because requests are coming in faster than the function can scale. You can use an alert to notify you if you are reaching the threshold of concurrent executions for your account or per region, as seen below.

View the status of all of your Lambda alerts

There are several monitor types, including anomaly detection and forecasts, so you can be notified about only the issues you care about. For example, you can create a forecast alert to notify you a week before you run out of concurrency.

Start monitoring AWS Lambda with Datadog

In this post, we’ve looked at how to get deep visibility into all your AWS Lambda functions with Datadog. Once you integrate Lambda with Datadog, you can monitor the performance of your serverless applications and optimize your functions by analyzing concurrency utilization, memory usage, execution costs, and other metrics. And, if you use Lambda@Edge with Amazon CloudFront, Step Functions, or AppSync on top of your Lambda functions, you can automatically pull in monitoring data from those services with Datadog’s built-in integrations. Check out our AWS documentation for more information.

If you don’t yet have a Datadog account, sign up to start monitoring your AWS Lambda functions today.