---
title: "Monitor Apache Airflow with Datadog"
description: "Get a close look at the performance of the scheduled tasks in your managed workloads."
author: "Jordan Obey"
date: 2020-02-24
tags: ["apm", "apache", "airflow", "apache airflow"]
blog_type_id: the-monitor
locale: en
---

Apache Airflow is an open source system for programmatically creating, scheduling, and monitoring complex workflows, including data processing pipelines. Originally developed by [Airbnb](https://medium.com/airbnb-engineering/airflow-a-workflow-management-platform-46318b977fd8) in 2014, Airflow is now part of the [Apache Software Foundation](https://www.apache.org/) and has an active community of contributing developers.

Airflow represents workflows as [**Directed Acyclic Graphs**](https://airflow.apache.org/docs/stable/concepts.html#dags) **(DAGs)**, which are made up of [**tasks**](https://airflow.apache.org/docs/stable/concepts.html#tasks) written in Python. This allows Airflow users to programmatically build and modify their workflows.

If you use Airflow to orchestrate your workflows, you'll want to keep track of the status of your scheduled tasks and ensure that Airflow executes them as expected. We are pleased to announce Datadog's new integration with [Apache Airflow](https://docs.datadoghq.com/integrations/airflow.md), which takes advantage of Airflow's StatsD plugin to collect metrics with our [DogStatsD](https://docs.datadoghq.com/developers/dogstatsd.md?tab=python) service.
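To make the mechanism concrete, here is a minimal sketch of the plain-text datagram format that StatsD clients emit and DogStatsD accepts (`<name>:<value>|<type>`, with DogStatsD's optional `|#tag:value` suffix). The helper names `format_datagram` and `send_metric`, the sample DAG and task names, and the value are illustrative assumptions, not code from Airflow or the Datadog Agent; port 8125 is the usual DogStatsD UDP port.

```python
import socket


def format_datagram(name, value, metric_type, tags=None):
    """Build a StatsD datagram: <name>:<value>|<type>, plus optional DogStatsD tags."""
    datagram = f"{name}:{value}|{metric_type}"
    if tags:
        datagram += "|#" + ",".join(f"{key}:{val}" for key, val in tags.items())
    return datagram


def send_metric(datagram, host="localhost", port=8125):
    """Send one datagram over UDP (fire-and-forget, no acknowledgement)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram.encode("utf-8"), (host, port))


# Hypothetical task-duration timer, in milliseconds.
example = format_datagram("airflow.dag.example_dag.example_task.duration", 1250, "ms")
print(example)
```

Because the transport is UDP, a client like this does not block or fail when no Agent is listening, which is part of why StatsD-style instrumentation is cheap to leave enabled.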

![Apache-Airflow-dashboard](https://web-assets.dd-static.net/42588/1776301201-monitor-airflow-with-datadog-apache-airflow-dashboard.png)

As soon as you enable our Airflow integration, you will see key metrics like DAG duration and task status populating an [out-of-the-box dashboard](https://app.datadoghq.com/screen/integration/30298/airflow-overview?from_ts=1582297314491&live=true&to_ts=1582300914491), so you can get immediate insight into your Airflow-managed workloads.

## Ensure your DAGs don't drag

Airflow represents each workflow as a series of tasks collected into a DAG. DAGs define the relationships and dependencies between tasks. An [Airflow scheduler](https://airflow.apache.org/docs/stable/scheduler.html) monitors your DAGs and initiates them based on their [schedule](https://airflow.apache.org/docs/stable/scheduler.html#dag-runs). The scheduler then attempts to execute every task within an instantiated DAG (referred to as a [**DAG Run**](https://airflow.apache.org/docs/stable/concepts.html#dag-runs)) in the appropriate order based on each task's dependencies.
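The idea of running tasks "in the appropriate order based on each task's dependencies" can be sketched with Python's standard library. This is not how the Airflow scheduler is implemented; it only illustrates dependency ordering on a hypothetical four-task workflow whose task names are made up for the example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: each task maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform_a": {"extract"},
    "transform_b": {"extract"},
    "load": {"transform_a", "transform_b"},
}

# static_order() yields every task only after all of its dependencies.
execution_order = list(TopologicalSorter(dependencies).static_order())
print(execution_order)
```

The two transform tasks have no dependency on each other, so either may come first; a scheduler is free to run such tasks in parallel. A cycle in the graph would make `static_order()` raise `graphlib.CycleError`, which is the reason workflows must be acyclic.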

Ideally, the scheduler will execute tasks on time and without delay. The higher the latency of your DAG runs, the more likely it is that subsequent DAG runs will start before previous ones have finished executing. A growing number of concurrent DAG runs may cause Airflow to reach the [`max_active_runs`](https://github.com/apache/airflow/blob/83d826b9925ce0eb2bd1fe403f5151fbef310b63/airflow/models/dag.py#L144-L146) limit, at which point it stops scheduling new DAG runs, possibly leading to a [timeout](https://github.com/apache/airflow/blob/83d826b9925ce0eb2bd1fe403f5151fbef310b63/airflow/models/dag.py#L148-L151) of currently scheduled workflows.
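As a rough back-of-the-envelope check on this effect, the sketch below estimates how many runs of one DAG overlap when each run takes longer than the schedule interval. It assumes every run has the same duration and starts on time, which real workloads will not satisfy exactly; the function names and the sample numbers are hypothetical.

```python
import math
from datetime import timedelta


def overlapping_runs(run_duration, schedule_interval):
    """Steady-state number of runs in flight if every run takes run_duration."""
    return math.ceil(run_duration / schedule_interval)


def hits_run_limit(run_duration, schedule_interval, max_active_runs):
    """True when the estimated overlap reaches the max_active_runs limit."""
    return overlapping_runs(run_duration, schedule_interval) >= max_active_runs


# A run that takes 25 minutes on a 10-minute schedule overlaps with two others.
print(overlapping_runs(timedelta(minutes=25), timedelta(minutes=10)))
```

When the estimate approaches your configured limit, shortening the slowest tasks or lengthening the schedule interval are the two levers this arithmetic points to.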

You can use the `airflow.dag.task.duration.avg` metric to monitor the average time it takes to complete a task and help you determine if your DAG runs are lagging or close to timing out. To add context to incoming duration metrics, Datadog's [DogStatsD Mapper](https://www.datadoghq.com/blog/dogstatsd-mapper.md) feature tags your DAG duration metrics with `task_id` and `dag_id` so you can surface particularly slow tasks and DAGs.

![airflow_dag_duration](https://web-assets.dd-static.net/42588/1776301206-monitor-airflow-with-datadog-airflow_dag_duration.png)
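The renaming-and-tagging step described above can be mimicked in a few lines of Python. The real mapping is configured in the Datadog Agent rather than written as code, and the raw metric name pattern used here (`airflow.dag.<dag_id>.<task_id>.duration`) is an assumption about how the DAG and task names are embedded; `map_metric` and the sample names are hypothetical.

```python
import re

# Assumed raw name shape: airflow.dag.<dag_id>.<task_id>.duration
RAW_DURATION = re.compile(r"airflow\.dag\.([^.]+)\.([^.]+)\.duration")


def map_metric(name):
    """Return (mapped_name, tags); names that do not match pass through untagged."""
    match = RAW_DURATION.fullmatch(name)
    if match is None:
        return name, {}
    dag_id, task_id = match.groups()
    return "airflow.dag.task.duration", {"dag_id": dag_id, "task_id": task_id}


print(map_metric("airflow.dag.etl_daily.load_table.duration"))
```

Collapsing many per-task metric names into one name with tags is what lets a single dashboard query be grouped or filtered by `dag_id` and `task_id`.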

For further insight into workflow performance, you can also track metrics from the Airflow scheduler. For instance, the metric `airflow.dagrun.schedule_delay` reports the time between when a DAG run is supposed to start and when it actually starts. When DAG runs are delayed, they can slow down your workflows, potentially causing you to miss [service level agreements](https://airflow.apache.org/docs/stable/concepts.html#slas) (SLAs). If you notice unusually long delays, the Airflow documentation [recommends](https://airflow.apache.org/docs/stable/faq.html#how-to-reduce-airflow-dag-scheduling-latency-in-production) reducing scheduling latency by increasing the scheduler's `max_threads` and `scheduler_heartbeat_sec` settings in your Airflow configuration.

## Keep tabs on queued tasks

Before an Airflow task completes successfully, it goes through a [series of stages](https://airflow.apache.org/docs/stable/concepts.html#task-lifecycle) including `scheduled`, `queued`, and `running`. After tasks have been scheduled and added to a queue, they remain idle until an Airflow worker runs them. Large and complex workflows risk reaching the limit set by Airflow's `concurrency` parameter, which dictates how many tasks Airflow can run at once. When that happens, your queue can balloon with backed-up tasks. With Datadog, you can create an alert to notify you if the number of tasks running in a DAG is about to surpass your concurrency limit, which would inflate your queue and potentially slow down workflow execution.
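The condition behind such an alert is simple to state. The sketch below is only an illustration of the threshold logic, not a Datadog monitor definition; `concurrency_alert`, the 80 percent warning ratio, and the sample numbers are hypothetical choices.

```python
def concurrency_alert(running_tasks, concurrency, warn_ratio=0.8):
    """Return True once running tasks reach warn_ratio of the concurrency limit."""
    if concurrency <= 0:
        raise ValueError("concurrency must be positive")
    return running_tasks / concurrency >= warn_ratio


# 13 of 16 slots in use is about 81%, which is above the 80% warning ratio.
print(concurrency_alert(running_tasks=13, concurrency=16))
```

Alerting on a ratio below 100 percent gives you time to react before tasks actually start backing up in the queue.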

## Dig deeper with Datadog APM

Airflow can use the background job manager [Celery](http://www.celeryproject.org/) (through its `CeleryExecutor`) to distribute tasks across multi-node clusters. Datadog APM [supports](https://docs.datadoghq.com/tracing/setup/python.md#library-compatibility) the Celery [library](http://pypi.datadoghq.com/trace/docs/other_integrations.html#celery), so you can easily trace your tasks. This gives you visibility into the performance of your distributed workflows, for example with [flame graphs](https://docs.datadoghq.com/tracing/visualization/trace.md?tab=spantags) that trace tasks executed by Celery workers as they propagate across your infrastructure. These traces help surface, for instance, where your DAG runs are experiencing latency. Together, our Airflow and Celery integrations can help you gain a complete picture of your workflow performance as you monitor Airflow metrics and distributed traces.

![celery_trace](https://web-assets.dd-static.net/42588/1776301211-monitor-airflow-with-datadog-celery_trace.png)

## Datadog goes with the flow

Datadog is pleased to add [Apache Airflow](https://docs.datadoghq.com/integrations/airflow.md) to our growing list of over 1,000 integrations, so that you can get comprehensive visibility into your managed workflows. If you're currently a Datadog user, make sure you have Datadog Agent version 7.17+ so that you can implement [DogStatsD Mapper](https://docs.datadoghq.com/developers/dogstatsd/dogstatsd_mapper.md)–enabled tagging and get the most out of your Airflow metrics.

If you are not already using Datadog, sign up today for a 14-day free trial.