
How to manage log files using logrotate

Authors: Usman Khan and Mallory Mooney

Published: March 4, 2022

Logs are records of system events and activities that provide valuable information used to support a wide range of administrative tasks—from analyzing application performance and debugging system errors to investigating security and compliance issues. Large-scale production environments emit enormous quantities of logs, which can make them more challenging to manage and introduce the risk of losing important data if underlying resources run out of space. Logrotate is a Linux utility designed to simplify log management and maintenance within these types of environments through a process known as log rotation.

In this guide, we’ll walk you through how to customize the logrotate utility to fit your logging needs. But first, we’ll further explore the importance of logging to files, and how log rotation solves some of the challenges that often accompany this recommended practice.

The importance of file-based logging

Modern environments—including those of many Datadog customers—generate large volumes of log data from multiple sources, such as containers, servers, databases, firewalls, and physical network devices. When operating at this scale, it is critical to have robust processes in place that enable you to efficiently collect logs from all of your system components.

Teams that leverage log management solutions like Datadog to centralize, store, and analyze their logs may start by configuring applications to stream them directly to an endpoint. This approach, however, can consume a significant amount of your network’s bandwidth and overwhelm the destination server—especially in larger environments. Sudden spikes in a host’s CPU or network traffic can also create a cascading effect for downstream logging services. For example, log events generated by an influx of traffic can easily overwhelm a log streaming service and exhaust its underlying resources. In these cases, you risk losing valuable log data that you rely on for maintaining compliance, troubleshooting production issues, or conducting security investigations.

Logging data to files, and forwarding them with a log forwarder like the Datadog Agent, offers the following advantages over streaming logs:

  • decouples log generation from log collection
  • separates the logging process from application logic

Both of these benefits help minimize resource consumption and reduce the risk of interference with other critical operations. For example, an application that does not leverage file-based logging may attempt to execute resource-intensive retry logic multiple times in order to resume log streams during an outage. Logging to files ensures that you always have access to the information you need for identifying security threats and troubleshooting application issues, even (or especially) in the case of a production issue.

Rotate your logs with the logrotate utility

Logging to files, while recommended, presents its own set of challenges when it comes to implementation at scale. A primary concern for system administrators is that log files can quickly take up a resource’s available disk space, requiring additional maintenance and on-demand capacity that can significantly drive up costs. On top of that, tasks like searching through log files with commands like grep become more computationally expensive and time consuming as their size increases. Finding the logs you need to troubleshoot an issue can often feel like looking for a needle in a haystack, especially if you need to manually sift through large volumes of log files, individual log files that are too large, or both.

Log rotation, which is managed by tools like logrotate, solves these problems by performing routine maintenance on log files. Rotation refers to the process of creating new files on a schedule (hourly, daily, weekly, and so on) and renaming old files to prevent resources from writing to them. Other routines, such as compressing or removing older files to save disk space, can also be configured with logrotate, complementing the rotation process.

Using log rotation to limit the size of your individual log files makes them easier to parse when you need to determine the root cause of a performance issue or simply want to share an interesting log with a team member via email. For example, rotating logs on a daily basis gives you an immediate starting point for an investigation—when a customer notifies you of an issue that began a week ago, you know to start with the log files that correspond with that time frame.

It’s important to note that log rotation is not a substitute for using a log forwarder. Rather, log rotation works in conjunction with a log forwarding service that ships your logs to external systems, such as servers for remote backup or log management services like Datadog for centralization, search, analysis, visualization, and alerting capabilities.

Get started with logrotate

The logrotate utility is installed on most Linux distributions (e.g., Ubuntu, Red Hat, Debian) by default. In this section, we’ll look at the standard configurations available for Ubuntu 20.04.3 LTS specifically, though these options can be used in most other distributions. We’ll also walk through best practices for setting up logrotate for your environment, such as creating or copying log files to manage rotation, adjusting your rotation schedule, compressing files to save disk space, running scripts for additional processing, and modifying permissions to grant access to logs.

For any Linux distribution, all system events, such as logins, errors, and user activity, are logged in the /var/log/ directory. Logrotate provides two options for managing system logs:

  • an /etc/logrotate.conf configuration file for applying rotation settings globally
  • an /etc/logrotate.d directory for configuring log rotation per package or service installed on the system (e.g., mysql-server, apache)

/etc/logrotate.conf

 
# see "man logrotate" for details
# rotate log files weekly
weekly
# use the syslog group by default, since this is the owning group
# of /var/log/syslog.
su root adm
# keep 4 weeks worth of backlogs
rotate 4
# create new (empty) log files after rotating old ones
create
 
[...]

With the default configuration snippet above, the utility will automatically create new log files in the /var/log/ directory on a weekly basis and maintain a four-week backlog. The su root adm directive instructs logrotate to rotate files as the root user and adm group to prevent permissions issues. You can adjust any of logrotate’s settings based on your needs. For example, you can modify your rotation schedule or permissions to give log forwarders like the Datadog Agent access to logs. For a complete list of logrotate’s configuration options, you can run the man logrotate command.
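As an illustration, the following is a minimal sketch of a tightened global policy. The daily schedule, seven-file backlog, and compression shown here are illustrative choices, not the defaults:

/etc/logrotate.conf

# rotate log files daily instead of weekly
daily
# keep a week's worth of backlogs
rotate 7
# compress rotated files to save disk space
compress
su root adm
create

[...]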

Create or copy log files to manage rotation

Logrotate offers two primary directives for specifying how new and existing files are rotated: create and copytruncate. The create directive, seen in the example above, is the default. In this mode, logrotate renames a log file—for example, an Apache error log file located in the /var/log/apache2 directory—to error.log.1, and then creates a new error.log file in the same directory. This operation triggers based on your rotation schedule.
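As a sketch, a create-mode configuration for that error log might look like the following; the file mode, owner, and group shown are illustrative values:

/etc/logrotate.d/apache2

/var/log/apache2/error.log {
    weekly
    rotate 4
    # recreate the log after rotation with the given mode, owner, and group
    create 644 root adm
}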

The copytruncate directive, on the other hand, instructs logrotate to first create a copy of the original log file and then truncate the original (i.e., delete its contents) in order to reduce its size. This behavior is less disruptive than create mode in that it accommodates continuous, uninterrupted writing to the original file. Using our previous Apache error log example, logrotate will copy the file (e.g., to /var/log/apache2/error.log.1) and then truncate the original—instead of renaming it—so the logging service can continue to write to the file without interruption.

copytruncate mode is useful when a service is not able to close and reopen a log file without restarting. For instance, system daemons and other background processes may not be configured to handle termination signals like SIGHUP to reopen log files. However, this mode is more resource intensive than create mode and can lead to race conditions that result in lost data. For example, any data that was logged after the copy process but before the truncate operation will be lost. Because of these caveats, create mode is recommended for most use cases.
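For comparison, a minimal copytruncate sketch for the same log might look like this:

/etc/logrotate.d/apache2

/var/log/apache2/error.log {
    weekly
    rotate 4
    # copy the log, then truncate the original in place so the service
    # can keep writing to the same open file without interruption
    copytruncate
}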

Adjust your log rotation schedule

Logrotate enables you to rotate on a daily, weekly (default), monthly, or yearly basis—or when a file’s size reaches a certain limit. For services that generate a steady volume of logs, rotating files on a fixed schedule works well. In contrast, services that experience periodic spikes in activity or generate large volumes of logs may require size limits to ensure that files do not consume too much disk space.

Logrotate offers the following options for controlling your size-based rotation:

  • size: rotate when a file grows larger than the specified size, ignoring the time-based schedule entirely
  • minsize: rotate on the time interval, but only once files have reached at least the specified size
  • maxsize: rotate when files exceed the specified size, even if the time interval has not yet elapsed

Depending on your needs, you can apply these settings globally in the /etc/logrotate.conf file or per service in the corresponding configuration file located in the /etc/logrotate.d directory. Configuring size-based rotation per service can be useful for resources that generate a larger volume of logs more quickly, such as an Apache web server.

/etc/logrotate.d/apache2

/var/log/apache2/*.log {
        weekly
        maxsize 1G
}

In the example snippet above, logrotate will rotate any Apache log once it exceeds one gigabyte, without waiting for the weekly schedule (the check happens each time logrotate runs, typically daily via cron). Otherwise, logs will rotate weekly.
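By contrast, a sketch using minsize keeps the weekly schedule but skips files that have not yet reached a threshold; the 100 MB value here is illustrative:

/etc/logrotate.d/apache2

/var/log/apache2/*.log {
    weekly
    # rotate on the weekly schedule, but only if the file
    # has grown to at least 100 MB
    minsize 100M
}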

Compress files to save disk space

Compressing files can complement your rotation schedule and help you save additional disk space. Compression is also recommended for any log files that you want to store long term in a cold-storage server or an Amazon S3 bucket for compliance and auditing purposes. Logrotate provides several configuration options for file compression to fit your needs. Using our previous example, Apache server logs can quickly consume disk space but provide valuable information about performance trends, such as a sudden influx of application requests. Compressing Apache logs as part of your rotation schedule allows you to keep them for troubleshooting while maintaining adequate disk space for underlying resources.

/etc/logrotate.d/apache2

/var/log/apache2/*.log {
    [...]
    compress
    delaycompress
}

The compress directive seen in the snippet above instructs logrotate to automatically compress rotated files with the default gzip compression utility. The configuration also includes the delaycompress directive, which postpones compression to the next rotation cycle. This option ensures that you can access your most recently rotated log file without needing to take extra steps to decompress it.
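If gzip does not fit your needs, logrotate’s compresscmd and compressext directives let you substitute another compression utility. The sketch below assumes xz is installed at /usr/bin/xz on the host:

/etc/logrotate.d/apache2

/var/log/apache2/*.log {
    [...]
    compress
    delaycompress
    # use xz instead of the default gzip, and name rotated files accordingly
    compresscmd /usr/bin/xz
    compressext .xz
}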

Run scripts for additional processing

The configuration options we’ve discussed so far can support most rotation use cases, but there may be instances when you need to run additional processing on rotated logs. Logrotate’s postrotate directive enables you to create custom rules for logs after they are rotated. For example, you can use this directive to modify permissions for a log forwarder.

You can also use the postrotate directive to automatically ship rotated logs to another destination via Rsync, a tool that synchronizes files across local and remote hosts. Rsync can be useful for sending logs to a remote server for scheduled backups. In the example configuration below, the rsync script will execute on all Apache server logs after rotation.

/etc/logrotate.d/apache2

/var/log/apache2/*.log {
    [...]
    sharedscripts
    postrotate
         rsync -avzr /var/log/apache2/*.log-* REMOTE-HOST:/path/to/directory/
    endscript
}

This example also uses the sharedscripts directive for greater control over when logrotate executes a particular script. In this case, logrotate will only run the rsync script once, after all Apache access and error logs are rotated. This configuration ensures that only the latest log files are processed, which helps reduce the risk of sending duplicate files to a backup server.

Modify permissions to grant access to logs

The /var/log/ directory is only accessible by the root user in the adm group, as seen in the default configuration. This setting ensures that logrotate has the appropriate permissions to modify most logs, but there are some scenarios where you may need to update your settings to grant additional access. For example, some versions of the MySQL service, which runs a database server on Linux systems, may only give the mysql user and group access to database logs. In this case, you can update the mysql-server service’s group from mysql to logrotate’s adm group in the associated configuration file, as seen below:

/etc/logrotate.d/mysql-server

/var/log/mysql.log /var/log/mysql/*log {
        [...]
        create 640 mysql adm
        [...]
}

This setting grants the utility access to database logs (e.g., /var/log/mysql/mysql_error.log) so it can successfully manage rotation, while still giving the mysql-server service the ability to write to new log files.

Some environments leverage access control lists (ACLs) for greater control over file and directory permissions, so you may need to modify additional settings to ensure logrotate works with other services. For example, if you want to use the Datadog Agent to collect your logs for analysis, you will need to grant it access to log directories.

The Datadog Agent runs under the dd-agent user and group. To give the Agent access to a log directory, such as /var/log/apache2, you can run the following command in an ACL-enabled environment:

setfacl -m u:dd-agent:rx /var/log/apache2

The command uses the -m (modify) option to grant the dd-agent user read and execute (rx) permissions on the Apache log directory.

It’s important to note that the above command will only apply the ACL setting once, so you need to make sure that the configuration persists for rotated logs. You can accomplish this by creating a new dd-agent_ACL file in the /etc/logrotate.d directory and specifying which logs you want the Agent to forward to Datadog.

/etc/logrotate.d/dd-agent_ACL

/var/log/apache2/access.log /var/log/apache2/error.log {
    postrotate
        /usr/bin/setfacl -m g:dd-agent:rx /var/log/apache2/error.log
        /usr/bin/setfacl -m g:dd-agent:rx /var/log/apache2/access.log
    endscript
}

The configuration above uses a postrotate script to apply the ACL setting for each log after it is rotated, which ensures that logrotate is able to continue rotating logs successfully while enabling the Datadog Agent to collect them.
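Once the Agent can read the directory, you can point it at the files you want to forward. The snippet below is a minimal sketch of an Apache log collection configuration, assuming the Agent’s default configuration layout; the service and source values are illustrative:

/etc/datadog-agent/conf.d/apache.d/conf.yaml

logs:
  - type: file
    path: /var/log/apache2/access.log
    service: apache
    source: apache
  - type: file
    path: /var/log/apache2/error.log
    service: apache
    source: apache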

Organize log files with timestamps

Logs are invaluable for managing and resolving performance issues, but when multiple services are generating thousands of logs during an incident, it’s difficult to cut through the noise. Adding timestamps to log filenames enables you to easily search for the logs created during a period of interest. Timestamps also help establish a standard format across all system logs so that you can follow the timeline of events across multiple services in the appropriate order.

The example configuration below uses logrotate’s dateext and dateformat directives to add a timestamp to the end of a log filename in the DDMMYYYY format (e.g., error.log-21012022):

/etc/logrotate.d/apache2

/var/log/apache2/*.log {
    rotate 14
    daily
    dateext
    dateformat -%d%m%Y
}

With this configuration, logrotate will add the formatted timestamp based on the day that the file was rotated. You can also use the dateyesterday directive to ensure that the file’s timestamp matches the log date instead of its rotation date. The latter option is beneficial if you forward logs directly to Datadog for analysis rather than archiving them to local or cloud storage.
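As a sketch, adding dateyesterday to the previous configuration stamps each rotated file with the prior day’s date:

/etc/logrotate.d/apache2

/var/log/apache2/*.log {
    rotate 14
    daily
    dateext
    dateformat -%d%m%Y
    # use yesterday's date so the filename matches the day
    # the log data was written, not the day it was rotated
    dateyesterday
}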

Debug issues with the logrotate utility

Since logrotate is a primary method for organizing an environment’s logs, you need to make sure that the utility is running as expected. Logrotate is typically scheduled to run on a daily basis via the cron utility, which is a Linux tool for maintaining a system’s environment through scheduled jobs. The cron job can fail if there are issues in one of your logrotate configuration files, such as a misconfigured postrotate script.

For a high-level overview of logrotate’s status, you can run the following command:

cat /var/lib/logrotate/status

Viewing this status file enables you to verify that a log file for a service is included in the utility’s rotation schedule and to see when it was last rotated. If a file is not included in the output, you may need to confirm that the /etc/logrotate.d/ directory includes a configuration file for that service.
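You can also test a configuration directly. Running logrotate in debug mode performs a dry run that prints what would be rotated without modifying any files, and the -f flag forces an immediate rotation:

# dry run: report what logrotate would do without touching any files
sudo logrotate -d /etc/logrotate.conf

# force a rotation now, regardless of the configured schedule
sudo logrotate -f /etc/logrotate.conf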

In addition, you can configure logrotate to run in verbose mode and write its output to a dedicated file by appending the -l flag, followed by a destination path for the output (the /var/log/logrotate.log path below is one common choice), to the logrotate command in the appropriate cron file, as seen below:

/etc/cron.daily/logrotate

#!/bin/sh
[...]
/usr/sbin/logrotate -l /var/log/logrotate.log /etc/logrotate.conf

With verbose mode, the utility will log each operation it executes and any generated errors, giving you better visibility into any issues that may prevent logrotate from running. You can then forward these logs to Datadog so you can easily search them for key events and create alerts to notify you when logrotate generates an error.

Configure resources to log at the appropriate level

Some applications run on platforms that do not support Linux utilities like logrotate, so you may need to implement logrotate to manage system-level logs while leveraging built-in tools to rotate logs at the application level. For example, you may need to use Apache’s rotatelogs program to schedule rotation for server logs if you are running Apache on the Windows operating system.
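As a sketch, Apache’s piped logging can invoke rotatelogs directly from the server configuration; the paths below are illustrative, and on Windows the program is rotatelogs.exe:

# write access logs through rotatelogs, starting a new file
# every 86,400 seconds (24 hours)
CustomLog "|bin/rotatelogs /var/log/apache2/access.%Y%m%d.log 86400" common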

Carefully configuring each of your system resources to log at the appropriate level can help ensure that built-in logging services do not overwrite logrotate’s configuration or interfere with the utility by creating duplicate log files.

Logrotate + Datadog

In this guide, we walked through the various benefits of using logrotate to manage logs, as well as some configuration options to help you customize log rotation for your environment. For greater visibility into all of your logs, including logs managed by logrotate, use the Datadog Agent to collect and forward them to Datadog Log Management, which enables you to search and filter logs by key attributes and leverage them for additional insights into application errors and performance.

Check out our documentation to learn how to configure log collection based on your application’s infrastructure. If you don’t already have a Datadog account, you can sign up for a free trial.