ECS Fargate: Dual logging to CloudWatch and Datadog | Datadog

ECS Fargate: Dual logging to CloudWatch and Datadog

June 16, 2026

Introduction

Teams adopting Datadog for Amazon ECS Fargate often want to keep shipping logs to Amazon CloudWatch Logs in parallel during a transition period. Long-term dual shipping isn’t recommended (you pay twice for the same logs), but a temporary overlap is reasonable while you validate the new pipeline.

Two patterns exist for getting logs from ECS Fargate into both CloudWatch Logs and Datadog.

Pattern 1: CloudWatch → Lambda Forwarder or Firehose

Containers write logs via the awslogs driver to CloudWatch Logs. From there, logs reach Datadog via one of two forwarding mechanisms: the Datadog Lambda Forwarder or Amazon Data Firehose.

Pros

  • Simple container configuration: just set the awslogs log driver
  • No sidecar containers required
  • CloudWatch Logs is the canonical store; Datadog receives a copy

Cons

  • Forwarding latency: logs arrive in Datadog seconds to minutes after emission
  • Lambda Forwarder adds infrastructure to manage (deployment, IAM, scaling)
  • Log subscription filters have a limit of two per log group (can conflict with other tooling)
  • Higher AWS cost: CloudWatch ingestion + Lambda invocations (or Firehose)

Pattern 2: FireLens log router

FireLens is an ECS log routing mechanism built on Fluent Bit. A Fluent Bit sidecar container runs alongside the application container. ECS routes the application container’s stdout/stderr to the Fluent Bit sidecar, which fans out to multiple destinations simultaneously: CloudWatch Logs and Datadog.

Pros

  • Near-real-time delivery to both destinations
  • No Lambda Forwarder or Firehose infrastructure required
  • Flexible routing: filter, enrich, or transform logs before delivery
  • Single ingestion path: avoids CloudWatch subscription filter limits

Cons

  • Adds a sidecar container to every task definition (CPU/memory overhead)
  • More complex task definition configuration
  • Fluent Bit config errors can silently drop logs; requires testing
  • Datadog API key must be available to the sidecar at runtime

The Fluent Bit init image

The setup described here uses the init-latest image tag rather than the standard stable tag. The difference matters for how Fluent Bit config is managed.

With the standard image, custom config must be baked into a Docker image at build time. Any config change requires rebuilding and pushing a new image, updating the task definition to reference the new image tag, and redeploying. That cycle is slow.

The reference task definition uses init-latest and agent:latest for readability. In production, pin both to specific versions (e.g. init-2.31.11, agent:7.63.0) to prevent upstream image updates from silently changing behavior.

The init image separates config from the image. At container start, a pre-launch process:

  1. Fetches ECS task metadata and exports AWS_REGION, ECS_CLUSTER, ECS_TASK_ARN, and related values as environment variables available inside the Fluent Bit config.
  2. Downloads config files from Amazon S3, identified by aws_fluent_bit_init_s3_<N> environment variables on the container.
  3. Composes a main config that merges the FireLens-generated base config with the downloaded S3 configs.
  4. Launches Fluent Bit with the composed config.

Config files live in S3 and are downloaded fresh on every task start. To update the Fluent Bit config, upload a new file to S3 and force a new task deployment. No Docker build is required. The same image tag runs in every environment; only the S3 config differs.

FireLens step-by-step setup

The setup splits responsibilities between two places:

  • App container logConfiguration: owns the Datadog output. FireLens reads the options here and generates a native Fluent Bit datadog output block at task start.
  • Custom Fluent Bit config in S3: owns the JSON parser filter and the CloudWatch output.

1. Add the Fluent Bit sidecar container

Use the AWS-managed Fluent Bit init image for Fargate:

public.ecr.aws/aws-observability/aws-for-fluent-bit:init-latest

Define the sidecar as a container in your task definition with firelensConfiguration:

{
  "name": "log_router",
  "image": "public.ecr.aws/aws-observability/aws-for-fluent-bit:init-latest",
  "essential": true,
  "firelensConfiguration": {
    "type": "fluentbit",
    "options": {
      "enable-ecs-log-metadata": "true"
    }
  },
  "logConfiguration": {
    "logDriver": "awslogs",
    "options": {
      "awslogs-group": "/ecs/fluent-bit-internal",
      "awslogs-region": "<region>",
      "awslogs-stream-prefix": "fluent-bit"
    }
  }
}

The sidecar’s own logs should go to CloudWatch via awslogs so you can debug routing issues independently of the Fluent Bit pipeline itself.

Do not set config-file-type or config-file-value inside firelensConfiguration.options. On Fargate, config-file-type: s3 is not supported; only the file type is allowed. The file type requires the config to be baked into the container image, which defeats the purpose of the init image. Instead, set aws_fluent_bit_init_s3_* environment variables on the container. This is the init image’s mechanism for loading S3 configs on Fargate.

2. Configure the application container log driver

Set the application container’s logConfiguration to awsfirelens and put the Datadog output options here. FireLens converts these options into a native datadog output block in the merged Fluent Bit config at task start.

{
  "logConfiguration": {
    "logDriver": "awsfirelens",
    "options": {
      "Name": "datadog",
      "Host": "http-intake.logs.<dd-site>",
      "TLS": "on",
      "compress": "gzip",
      "dd_service": "my-service",
      "dd_source": "python",
      "dd_tags": "env:production"
    },
    "secretOptions": [
      { "name": "apikey", "valueFrom": "<dd-api-key-secret-arn>" }
    ]
  }
}

Notes on the options:

  • dd_source should match the technology stack of the application (for example python, nodejs, java, nginx). Datadog uses this value to select the built-in log processing pipeline that parses and enriches the logs. Setting it to an unrecognized value means no automatic parsing is applied.
  • dd_service stamps a service name on every record passing through this output. If the app uses ddtrace log injection (DD_LOGS_INJECTION=true, the default in recent ddtrace), each record also carries a dd.service field. Datadog’s reserved attribute remapper uses the record-level value over the FireLens option. The FireLens option is the backstop for records without injection: sidecars, uninstrumented containers, or any case where injection breaks. Set it to the same value as DD_SERVICE on the app container.
  • compress: gzip enables gzip compression on log uploads to Datadog’s intake. Datadog recommends it; the default is no compression.
  • apikey must arrive via secretOptions so the API key is fetched from Secrets Manager rather than baked into the task definition.
  • The values in <..> are placeholders that need to be replaced with the actual dd-site and secret arn.

Do not also add a [OUTPUT] Name datadog block to the S3 config. FireLens already creates one from these options. A second block would double-ship every record to Datadog.
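The notes above can be turned into a small pre-deploy check. The following is a minimal sketch, not part of any Datadog or AWS tooling: check_app_container is a hypothetical helper that takes one container definition as a Python dict and reports the mistakes called out above (API key outside secretOptions, dd_service drifting from DD_SERVICE).

```python
def check_app_container(container: dict) -> list[str]:
    """Return a list of problems found; an empty list means the checks passed.

    Hypothetical helper: it only inspects the fields discussed in the notes
    above and does not validate the rest of the task definition.
    """
    problems = []
    log_cfg = container.get("logConfiguration", {})
    options = log_cfg.get("options", {})
    secret_names = {s.get("name") for s in log_cfg.get("secretOptions", [])}
    env = {e["name"]: e["value"] for e in container.get("environment", [])}

    if log_cfg.get("logDriver") != "awsfirelens":
        problems.append("logDriver is not awsfirelens")
    if "apikey" in options:
        problems.append("apikey is set in options; move it to secretOptions")
    if "apikey" not in secret_names:
        problems.append("apikey is missing from secretOptions")
    if "DD_SERVICE" in env and env["DD_SERVICE"] != options.get("dd_service"):
        problems.append("dd_service does not match DD_SERVICE")
    return problems


# Example input mirroring the app container shown in this step.
app = {
    "environment": [{"name": "DD_SERVICE", "value": "my-service"}],
    "logConfiguration": {
        "logDriver": "awsfirelens",
        "options": {"Name": "datadog", "dd_service": "my-service"},
        "secretOptions": [{"name": "apikey", "valueFrom": "<dd-api-key-secret-arn>"}],
    },
}
print(check_app_container(app))
```

Running a check like this in CI before registering a task definition catches the two mismatches that are hardest to spot in Datadog afterwards.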

3. Define the parser filter and CloudWatch output in the S3 config

The custom config in S3 owns two things: the JSON parser filter that promotes dd.trace_id and related fields to the top level, and the CloudWatch output. The Datadog output is defined in the app container’s logConfiguration (Step 2), which tells FireLens to generate a native datadog output block.

The two config files

Two files are uploaded to S3:

fluent-bit-parsers.conf (parser definitions only):

[PARSER]
    Name   json
    Format json

fluent-bit-custom.conf (filter and CloudWatch output):

[FILTER]
    Name         parser
    Match        *
    Key_Name     log
    Parser       json
    Reserve_Data True

[OUTPUT]
    Name              cloudwatch_logs
    Match             *
    region            ${AWS_REGION}
    log_group_name    /ecs/my-service
    log_stream_prefix ecs/
    auto_create_group true

Neither file contains a [SERVICE] block. The init process generates one, and FireLens itself does not emit one.

The custom config does not contain a [PARSER] block of its own. Parser definitions belong in the parsers file so the init process can load them with the right flag (see below).

AWS_REGION is automatically available inside the config because the init process injects it from ECS task metadata.

The [FILTER] parser block requires the application to emit structured JSON logs: one JSON object per line on stdout/stderr. The filter parses the log field and promotes the JSON fields (including dd.trace_id) to the top level. If the application emits plain text, the filter silently passes records through unchanged and trace correlation will not work. Dual-ship delivery to both destinations still functions normally.

Reserve_Data True tells the filter to merge the parsed fields into the existing record rather than replace it. Without this option, a successful parse discards everything already on the record, including the ECS metadata fields that FireLens injects when enable-ecs-log-metadata: "true" is set (ecs_cluster, ecs_task_arn, container_name, container_id). With it, the promoted JSON fields and the ECS context fields are both present in the final record sent to CloudWatch and Datadog.
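To make the effect of Reserve_Data concrete, here is a rough Python model of what the parser filter does to a single record. It is an illustration only, not Fluent Bit code: apply_parser_filter is a hypothetical function, and details such as which value wins when a parsed key collides with an existing key are simplified.

```python
import json


def apply_parser_filter(record: dict, key_name: str = "log",
                        reserve_data: bool = True) -> dict:
    """Simplified model of the [FILTER] parser block on one record."""
    raw = record.get(key_name)
    try:
        parsed = json.loads(raw)
    except (TypeError, ValueError):
        # Plain text (or a missing field): the record passes through unchanged.
        return record
    if not isinstance(parsed, dict):
        return record
    if not reserve_data:
        # Without Reserve_Data, only the parsed fields survive.
        return parsed
    # With Reserve_Data, the other fields are kept and the parsed ones added.
    kept = {k: v for k, v in record.items() if k != key_name}
    return {**kept, **parsed}


record = {
    "log": '{"message": "hi", "dd.trace_id": "42"}',
    "ecs_cluster": "my-cluster",
}
print(apply_parser_filter(record))
print(apply_parser_filter(record, reserve_data=False))
```

In the first call the ECS metadata field and the promoted JSON fields sit side by side; in the second, ecs_cluster is gone.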

How the init process loads the files

  1. The init process fetches ECS task metadata and exports AWS_REGION, ECS_CLUSTER, ECS_TASK_ARN, and related values as environment variables available inside the Fluent Bit config.
  2. It downloads each file listed in aws_fluent_bit_init_s3_<N> environment variables on the log_router container.
  3. Files that contain only [PARSER] sections are loaded with the Fluent Bit -R flag (parser definitions). All other files are @INCLUDEd into the composed main config.
  4. The composed main config also @INCLUDEs the FireLens-generated base config, which contains the INPUT and ECS metadata filter, plus the Datadog [OUTPUT] block generated from the app container’s logConfiguration options.
  5. Fluent Bit launches with the composed config.
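Because a file that mixes [PARSER] sections with other sections would not be treated as a parsers file, it can help to check each file before uploading it. The sketch below approximates the rule in step 3; is_parsers_file is a hypothetical helper, and the real init process may classify files differently in edge cases.

```python
import re

# Matches a classic-format section header such as [PARSER] on its own line.
SECTION = re.compile(r"^\s*\[([A-Za-z_]+)\]\s*$", re.MULTILINE)


def is_parsers_file(text: str) -> bool:
    """True if the config has at least one section and all are [PARSER]."""
    sections = [name.upper() for name in SECTION.findall(text)]
    return bool(sections) and all(name == "PARSER" for name in sections)


parsers_conf = "[PARSER]\n    Name   json\n    Format json\n"
custom_conf = (
    "[FILTER]\n    Name parser\n    Match *\n\n"
    "[OUTPUT]\n    Name cloudwatch_logs\n    Match *\n"
)
print(is_parsers_file(parsers_conf), is_parsers_file(custom_conf))
```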

Task definition wiring

Point the init process at the two S3 objects via environment variables on the log_router container:

"environment": [
  { "name": "AWS_REGION",               "value": "<region>" },
  { "name": "aws_fluent_bit_init_s3_1", "value": "arn:aws:s3:::<bucket>/fluent-bit-parsers.conf" },
  { "name": "aws_fluent_bit_init_s3_2", "value": "arn:aws:s3:::<bucket>/fluent-bit-custom.conf" }
]

The numeric suffix controls load order. Additional files can be added by incrementing the suffix.
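If the list of config files is generated rather than hand-written, the environment entries can be built from an ordered list of object keys. This is a small sketch; init_s3_env is a hypothetical helper name, and the bucket and keys are placeholders.

```python
def init_s3_env(bucket: str, keys: list[str]) -> list[dict]:
    """Build aws_fluent_bit_init_s3_<N> entries, numbered from 1 in list order."""
    return [
        {
            "name": f"aws_fluent_bit_init_s3_{index}",
            "value": f"arn:aws:s3:::{bucket}/{key}",
        }
        for index, key in enumerate(keys, start=1)
    ]


entries = init_s3_env(
    "<bucket>", ["fluent-bit-parsers.conf", "fluent-bit-custom.conf"]
)
for entry in entries:
    print(entry)
```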

Upload both files to S3 before starting the service:

aws s3 cp fluent-bit-parsers.conf s3://<bucket>/fluent-bit-parsers.conf
aws s3 cp fluent-bit-custom.conf  s3://<bucket>/fluent-bit-custom.conf

To update either config without touching the task definition, upload a new version and force a new deployment:

aws ecs update-service --cluster <cluster> --service <service> --force-new-deployment

4. Grant IAM permissions

Role | Permission
Task Execution Role | logs:CreateLogGroup, logs:CreateLogStream, logs:PutLogEvents (for sidecar logs via the awslogs driver)
Task Execution Role | secretsmanager:GetSecretValue (for the Datadog API key)
Task Role | s3:GetObject, s3:GetBucketLocation, s3:ListBucket (for the init process to download configs from S3)
Task Role | logs:CreateLogGroup, logs:CreateLogStream, logs:PutLogEvents (for the Fluent Bit cloudwatch_logs plugin at runtime)

The init image is hosted on public ECR; no ecr: pull permissions are needed.

The two roles serve different purposes:

  • Execution role: used by the ECS agent to pull images and fetch secrets before containers start. The awslogs driver (for sidecar container logs) uses this role.
  • Task role: used by containers at runtime. Both the init process (S3 download) and the Fluent Bit cloudwatch_logs plugin (app log shipping) use this role. S3 and runtime CloudWatch Logs permissions must be here, not on the execution role.

The Go S3 client used by the init process calls ListBucket during object retrieval, so s3:ListBucket is required in addition to s3:GetObject.
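As an illustration, the task role permissions from the table can be written as an IAM policy document. The sketch below builds one in Python. The resource scoping (a single config bucket and a single application log group) is an assumption made for this example, and task_role_policy is a hypothetical helper; adjust the resources to your own layout.

```python
import json


def task_role_policy(bucket: str, region: str, account_id: str,
                     log_group: str) -> dict:
    """Build a task role policy covering the S3 and CloudWatch Logs actions above."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    group_arn = f"arn:aws:logs:{region}:{account_id}:log-group:{log_group}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Object-level read of the Fluent Bit config files.
                "Sid": "ReadInitConfigObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"{bucket_arn}/*"],
            },
            {   # Bucket-level calls made while the configs are downloaded.
                "Sid": "LocateAndListConfigBucket",
                "Effect": "Allow",
                "Action": ["s3:GetBucketLocation", "s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {   # Runtime log shipping by the cloudwatch_logs output.
                "Sid": "WriteAppLogs",
                "Effect": "Allow",
                "Action": [
                    "logs:CreateLogGroup",
                    "logs:CreateLogStream",
                    "logs:PutLogEvents",
                ],
                "Resource": [group_arn, f"{group_arn}:*"],
            },
        ],
    }


policy = task_role_policy("<bucket>", "<region>", "<account-id>", "/ecs/my-service")
print(json.dumps(policy, indent=2))
```

The printed JSON can be attached to the task role as an inline policy. The execution role needs its own, separate policy for the awslogs driver and Secrets Manager.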

5. Verify

After deploying:

  1. Check the Fluent Bit sidecar logs in CloudWatch. Look for the init process command line showing both S3 files downloaded. The parsers file should be loaded with -R and the custom config @INCLUDEd. The output should show output:cloudwatch_logs:cloudwatch_logs.0 and output:datadog:datadog.0 both starting. S3 errors (AccessDenied, NoSuchKey) indicate a task role permission or path misconfiguration.

  2. Confirm log events appear in your CloudWatch log group. Use CloudWatch Logs Insights rather than the raw stream view. The stream console paginates at ~1 MB (~1,800 events per page). Under any real load the first page is a partial slice and logs will appear missing even when delivery is healthy. A Logs Insights query avoids this:

    fields @timestamp, level, message
    | sort @timestamp desc
    | limit 50
    
  3. Confirm logs appear in Datadog Log Explorer filtered by service:<your-service-name>. Do not rely on Fluent Bit sidecar 202 responses alone. A 202 is an async acknowledgment of acceptance, not a guarantee that records were indexed and are searchable. The only reliable check is that the logs actually appear in Log Explorer.

  4. Confirm log-trace correlation. Open a trace in APM and click the Logs tab on a span. You should see the matching log records with the same dd.trace_id. If logs flow but correlation is empty, the parser filter is not promoting dd.trace_id to the top level. Check that the application emits structured JSON, not plain text.

Python apps: set PYTHONUNBUFFERED=1 on the application container. Without it, Python buffers stdout/stderr in non-TTY environments (which includes all containers) and INFO-level logs can be lost under sustained load. Only ERROR-level writes tend to flush promptly. The symptom is indistinguishable from a FireLens delivery failure.

Summary

Criterion | CloudWatch → Forwarder | FireLens
Delivery latency | Seconds to minutes | Near-real-time
Infrastructure | Lambda Forwarder | Fluent Bit sidecar
Complexity | Low (container config) | Medium (task def + Fluent Bit config)
Cost | CloudWatch + Lambda | CloudWatch + sidecar compute
Best for | Simple workloads, existing Forwarder setup | New builds, low latency, flexible routing

Transition to Datadog-only log shipping

Once Datadog is validated as the primary log destination, you can stop sending application logs to CloudWatch and simplify the pipeline. The Datadog output lives in the task definition and does not move. Only the CloudWatch output in the S3 config needs to be removed.

What to change

1. Remove the CloudWatch output from fluent-bit-custom.conf

Delete the [OUTPUT] block for cloudwatch_logs. After the change, fluent-bit-custom.conf contains only the parser filter:

[FILTER]
    Name         parser
    Match        *
    Key_Name     log
    Parser       json
    Reserve_Data True
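If the config is managed by a script rather than edited by hand, the same deletion can be automated. The following is a minimal sketch that assumes the classic Fluent Bit config format used in this guide, with each section header at the start of a line; drop_output is a hypothetical helper, not part of Fluent Bit or the AWS tooling.

```python
import re


def drop_output(config_text: str, plugin: str) -> str:
    """Remove every [OUTPUT] section whose Name is the given plugin."""
    # Split the text in front of each line that begins a [SECTION] header.
    blocks = re.split(r"(?m)^(?=\[)", config_text)
    kept = []
    for block in blocks:
        is_output = block.lstrip().upper().startswith("[OUTPUT]")
        names = re.findall(r"(?mi)^\s+Name\s+(\S+)", block)
        if is_output and plugin in names:
            continue  # skip the section being removed
        kept.append(block)
    return "".join(kept).rstrip() + "\n"


dual_ship_conf = """[FILTER]
    Name         parser
    Match        *
    Key_Name     log
    Parser       json
    Reserve_Data True

[OUTPUT]
    Name              cloudwatch_logs
    Match             *
    region            ${AWS_REGION}
    log_group_name    /ecs/my-service
    log_stream_prefix ecs/
    auto_create_group true
"""
print(drop_output(dual_ship_conf, "cloudwatch_logs"))
```

Diff the result against the previous version before uploading, since a wrong config only surfaces after the next task start.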

2. Upload the updated config to S3 and force a new deployment

Config changes do not require rebuilding the Fluent Bit Docker image; the init process re-downloads from S3 on each task start.

aws s3 cp fluent-bit-custom.conf s3://<bucket>/fluent-bit-custom.conf
aws ecs update-service --cluster <cluster> --service <service> --force-new-deployment

3. Keep awslogs on the infrastructure sidecars

Leave the logConfiguration on the log_router and datadog-agent containers unchanged. Their logs should remain in CloudWatch via awslogs so you can debug routing failures independently of the Fluent Bit pipeline.

What does not change

  • The app container’s logConfiguration (awsfirelens with Name: datadog): Datadog routing lives here and is unaffected.
  • The parsers file in S3: still required for log-trace correlation.
  • The [FILTER] parser block in fluent-bit-custom.conf: still required for log-trace correlation.
  • All DD_* environment variables on the app and Agent containers.

Reference: complete task definition

Full task definition for a three-container setup: Fluent Bit log router + Datadog Agent sidecar + application container. Replace placeholder values (<…>) with your own.

The log_router uses the init image (init-latest tag). No custom Docker image is required. Upload both fluent-bit-parsers.conf and fluent-bit-custom.conf to S3 and reference them via aws_fluent_bit_init_s3_1 and aws_fluent_bit_init_s3_2. The taskRoleArn must point to a role separate from the execution role, with S3 read permissions and CloudWatch Logs write permissions.

{
  "family": "my-service",
  "cpu": "1024",
  "memory": "2048",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "executionRoleArn": "<task-execution-role-arn>",
  "taskRoleArn": "<task-role-arn>",
  "containerDefinitions": [
    {
      "name": "log_router",
      "image": "public.ecr.aws/aws-observability/aws-for-fluent-bit:init-latest",
      "essential": true,
      "firelensConfiguration": {
        "type": "fluentbit",
        "options": {
          "enable-ecs-log-metadata": "true"
        }
      },
      "environment": [
        { "name": "AWS_REGION",               "value": "<region>" },
        { "name": "aws_fluent_bit_init_s3_1", "value": "arn:aws:s3:::<bucket>/fluent-bit-parsers.conf" },
        { "name": "aws_fluent_bit_init_s3_2", "value": "arn:aws:s3:::<bucket>/fluent-bit-custom.conf" }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/fluent-bit-internal",
          "awslogs-region": "<region>",
          "awslogs-stream-prefix": "fluent-bit",
          "awslogs-create-group": "true"
        }
      },
      "memoryReservation": 256
    },
    {
      "name": "datadog-agent",
      "image": "public.ecr.aws/datadog/agent:latest",
      "essential": true,
      "dependsOn": [
        { "containerName": "log_router", "condition": "START" }
      ],
      "environment": [
        { "name": "DD_SITE",                       "value": "<dd-site>" },
        { "name": "ECS_FARGATE",                   "value": "true" },
        { "name": "DD_APM_ENABLED",                "value": "true" },
        { "name": "DD_APM_NON_LOCAL_TRAFFIC",      "value": "true" },
        { "name": "DD_DOGSTATSD_NON_LOCAL_TRAFFIC","value": "true" },
        { "name": "DD_PROCESS_AGENT_ENABLED",      "value": "true" },
        { "name": "DD_LOGS_ENABLED",               "value": "false" }
      ],
      "secrets": [
        {
          "name": "DD_API_KEY",
          "valueFrom": "<dd-api-key-secret-arn>"
        }
      ],
      "portMappings": [
        { "containerPort": 8126, "protocol": "tcp" },
        { "containerPort": 8125, "protocol": "udp" }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/datadog-agent",
          "awslogs-region": "<region>",
          "awslogs-stream-prefix": "datadog-agent",
          "awslogs-create-group": "true"
        }
      },
      "memoryReservation": 512
    },
    {
      "name": "app",
      "image": "<your-app-image>",
      "essential": true,
      "dependsOn": [
        { "containerName": "log_router",    "condition": "START" },
        { "containerName": "datadog-agent", "condition": "START" }
      ],
      "environment": [
        { "name": "DD_SERVICE",          "value": "my-service" },
        { "name": "DD_ENV",              "value": "production" },
        { "name": "DD_VERSION",          "value": "1.0" },
        { "name": "DD_AGENT_HOST",       "value": "localhost" },
        { "name": "DD_TRACE_AGENT_PORT", "value": "8126" },
        { "name": "DD_DOGSTATSD_PORT",   "value": "8125" },
        { "name": "DD_TRACE_SAMPLE_RATE","value": "1" },
        { "name": "DD_LOGS_INJECTION",   "value": "true" },
        { "name": "PYTHONUNBUFFERED",    "value": "1" }
      ],
      "logConfiguration": {
        "logDriver": "awsfirelens",
        "options": {
          "Name":       "datadog",
          "Host":       "http-intake.logs.<dd-site>",
          "TLS":        "on",
          "compress":   "gzip",
          "dd_service": "my-service",
          "dd_source":  "python",
          "dd_tags":    "env:production"
        },
        "secretOptions": [
          { "name": "apikey", "valueFrom": "<dd-api-key-secret-arn>" }
        ]
      },
      "memoryReservation": 512
    }
  ]
}

Notes on the app container environment:

  • DD_SERVICE / DD_ENV / DD_VERSION: unified service tagging. DD_SERVICE and DD_ENV must match dd_service and dd_tags in the app container’s logConfiguration.options so that logs and traces appear under the same service in Datadog. DD_VERSION does not need to be in dd_tags — with DD_LOGS_INJECTION=true, ddtrace injects dd.version directly into each log record.
  • DD_AGENT_HOST=localhost: in Fargate’s awsvpc network mode all containers in a task share a network namespace, so the app reaches the Agent on localhost.
  • DD_LOGS_INJECTION=true: enables ddtrace’s automatic injection of dd.trace_id, dd.span_id, dd.service, dd.env, and dd.version into every Python LogRecord. This is the default in recent ddtrace versions; set explicitly for clarity. Use a JSON formatter (for example python-json-logger) that emits these fields as top-level keys.
  • DD_TRACE_SAMPLE_RATE=1: Sample 100% of traces for initial validation so every request is visible in APM while you verify the pipeline end-to-end. Tune down in production for high-volume workloads — Datadog’s intelligent sampling retains error traces and slow traces regardless of this setting.
  • PYTHONUNBUFFERED=1: Python-specific. Forces immediate flush of stdout/stderr so that all log levels reach Fluent Bit. Without it, INFO logs buffer silently and are lost under sustained load.
  • DD_LOGS_ENABLED=false on the Agent: The Agent cannot access the Docker socket on Fargate and therefore cannot collect container logs regardless of this setting. Fluent Bit owns log routing. This setting explicitly prevents the Agent from attempting log collection and producing noisy startup warnings.
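For reference, a JSON formatter of the kind described above can be written with the standard library alone. The sketch below is one possible shape, not the formatter used by the reference application. It assumes that ddtrace log injection attaches the dd.* values as attributes on each LogRecord under those exact names; JsonLineFormatter and build_logger are hypothetical names, and the chosen output keys (timestamp, level, logger, message) are illustrative.

```python
import json
import logging
import sys

# Attribute names assumed to be set on the LogRecord by ddtrace log injection.
DD_FIELDS = ("dd.trace_id", "dd.span_id", "dd.service", "dd.env", "dd.version")


class JsonLineFormatter(logging.Formatter):
    """Emit one JSON object per line with dd.* fields as top-level keys."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        for field in DD_FIELDS:
            value = getattr(record, field, None)
            if value is not None:
                payload[field] = value
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name: str = "app", stream=None) -> logging.Logger:
    """Return a logger that writes JSON lines to stdout (or the given stream)."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonLineFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


log = build_logger()
# Without ddtrace running, the trace ID can be simulated through extra.
log.info("order created", extra={"dd.trace_id": "1234567890"})
```

Each line this formatter writes is a single JSON object, which is what the parser filter in fluent-bit-custom.conf expects in the log field.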