---
title: "How to use ApacheBench for web server performance testing"
description: "Learn how to interpret the results of your ApacheBench tests as you optimize your HTTP backend"
author: "Paul Gottschling"
date: 2019-07-10
tags: ["infrastructure monitoring", "apache", "apachebench", "apache", "testing", "web server"]
blog_type_id: the-monitor
locale: en
---

When developing web services and tuning the infrastructure that runs them, you'll want to make sure that they handle requests quickly enough, and at a high enough volume, to meet your requirements. [ApacheBench](https://httpd.apache.org/docs/2.4/programs/ab.html) (`ab`) is a benchmarking tool that measures the performance of a web server by inundating it with HTTP requests and recording metrics for latency and success. ApacheBench can help you determine how much traffic your HTTP server can sustain before performance degrades, and set baselines for typical response times.

While ApacheBench was designed to benchmark the Apache web server, it is a fully fledged HTTP client that benchmarks actual connections, and you can use it to test the performance of any backend that processes HTTP requests. In this post, we will explain [how ApacheBench works](#how-apachebench-works), and walk you through the ApacheBench metrics that can help you tune your web servers and optimize your applications, including:

- Metrics for [request latency](#request-latency-metrics), which range from percentile breakdowns of [all requests](#percentiles) to durations of specific phases within [TCP connections](#metrics-for-individual-requests).
- Metrics for [request and connection status](#success-metrics), including HTTP [status codes](#non-2xx-responses) and [request failures](#complete-versus-failed-requests).

![Web server access logs in Datadog showing requests from ApacheBench.](https://web-assets.dd-static.net/42588/1776291500-apachebench-apachebench-logs-spike.png)
*Web server access logs in Datadog showing requests from ApacheBench.*

## How ApacheBench works

### Installing ApacheBench

ApacheBench is a command line tool that is bundled in the [`apache2-utils` package](https://packages.debian.org/stretch/apache2-utils). To install `ab`, run the following commands on Debian/Ubuntu platforms:

```bash
apt-get update
apt-get install -y apache2-utils
```

### Configuring ApacheBench

ApacheBench allows you to [configure](https://httpd.apache.org/docs/2.4/programs/ab.html#options) the number of requests to send, a timeout limit, and request headers. `ab` will send the requests, wait for a response (up to a user-specified timeout), and output statistics as a report.

You can run an ApacheBench command with the following format:

```bash
ab <OPTIONS> <WEB_SERVER_ADDRESS>/<PATH>
```

The only required argument is the address of the web server, followed by a `/` (URLs without the trailing `/` will cause `ab` to return an error) and an optional URL path. If you don't specify any options, ApacheBench will send a single request. `ab`'s [options](https://httpd.apache.org/docs/2.4/programs/ab.html) allow you to adjust the volume of requests, as well as (for specialized cases) their headers and request bodies. Some commonly used options include:

- `-n`: The number of requests to send
- `-t`: A duration in seconds after which `ab` will stop sending requests
- `-c`: The number of concurrent requests to make

If you're using both the `-t` and the `-n` flags, note that`-t` should always go first—otherwise ApacheBench will [override](https://www.pinkbike.com/news/Apache-Bench-you-may-be-using-it-incorrectly.html) the value of `-n` and assign the default `-n` value of `50000`, or 50,000 requests. Additionally, `ab` runs on a single thread—the `-c` value tells `ab` how many file descriptors to allocate at a time for TCP connections, not how many HTTP requests to send simultaneously. The `-c` flag does allow `ab` to complete its tests in less time, and simulates a higher number of concurrent connections. The two timeseries graphs below, for example, show the number of concurrent connections as well as requests per second for the command:

```bash
ab -n 100000 -c 1000 <SERVER_ADDRESS>
```

![apachebench-server-metrics](https://web-assets.dd-static.net/42588/1776291508-apachebench-apachebench-server-metrics.png)

### Constraints

When using ApacheBench, you'll want to understand its constraints, which stem from the fact that `ab` measures actual HTTP requests between two servers that may or may not run on the same host. First, you'll want to make sure that the host conducting an `ab` test has enough CPU resources to make all TCP connections you've specified, as well as enough memory available to store the state of each connection (`ab` [allocates memory](https://github.com/apache/httpd/blob/497e991285d7179bb0d81e3f8fae049605f9aaee/support/ab.c#L1822-L1828) based on the number of requests).

![Memory utilization by one invocation of the ab command, as seen in Datadog’s Live Processes view.](https://web-assets.dd-static.net/42588/1776291513-apachebench-apachebench-live-proc-mem.png)
*Memory utilization by one invocation of the ab command, as seen in Datadog’s Live Processes view.*

[Other constraints](https://blog.getpolymorph.com/7-tips-for-heavy-load-testing-with-apache-bench-b1127916b7b6) include the fact that ApacheBench's send-everything-at-once approach may not reflect the way your application handles requests over time; and the distance between the server and `ab` client (e.g., the same host or different availability zones) will affect latency scores. As a result, `ab` requests may not reflect the sorts of latencies you can expect to see in your usual production load. You'll get the most out of ApacheBench when using the relative results of your tests—not "How fast are responses?" but "Are responses getting faster?"—to see how effective your efforts are to optimize your application code, server configuration, load balancing architecture, and so on.

## How to interpret ApacheBench

After finishing its tests, ApacheBench produces a report that resembles the code snippet below. Later in this section, we will explain the different metrics provided in this report, such as:

- [How long](#request-latency-metrics) requests took to complete, including latency for different parts of the request-response cycle
- [How successful](#success-metrics) the requests were, including how many requests received non-2xx responses, and how many requests failed

```text
Server Software:        AmazonS3
Server Hostname:       <SOME_HOST>
Server Port:            443
SSL/TLS Protocol:       TLSv1.2,ECDHE-RSA-AES128-GCM-SHA256,2048,128
TLS Server Name:        <SOME_HOST>

Document Path:          /
Document Length:        45563 bytes

Concurrency Level:      2
Time taken for tests:   3.955 seconds
Complete requests:      100
Failed requests:        0
Total transferred:      4625489 bytes
HTML transferred:       4556300 bytes
Requests per second:    25.29 [#/sec] (mean)
Time per request:       79.094 [ms] (mean)
Time per request:       39.547 [ms] (mean, across all concurrent requests)
Transfer rate:          1142.21 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       40   53   8.1     51      99
Processing:    12   24   9.4     23      98
Waiting:        5   14  10.6     12      95
Total:         57   77  15.0     75     197

Percentage of the requests served within a certain time (ms)
  50%     75
  66%     77
  75%     80
  80%     81
  90%     85
  95%     92
  98%    116
  99%    197
 100%    197 (longest request)
```

### Request latency metrics

#### Top-level request latency metrics

##### Time taken for tests

`Time taken for tests` measures the duration between when ApacheBench first connects to the server and when it receives the final response or is interrupted with Ctrl-C. This metric is useful if you're conducting repeated `ab` tests while optimizing your application or tuning your web server, and want a single score to consult for signs of improvement.

##### Time per request

ApacheBench provides two variations on this metric, and both depend on the number of responses that `ab` has finished processing (`done`), as well as the value of the metric `Time taken for tests` (`timetaken`). Both multiply their results by 1,000 to get a number in milliseconds.

The first `Time per request` metric doesn't take the concurrency value into account:

```text
timetaken * 1000 / done
```

The second version of `Time per request` accounts for the number of concurrent connections the user has configured `ab` to make, using the `-c` option (`concurrency`):

```text
concurrency * timetaken * 1000 / done
```

If you set a `-c` value greater than 1, the second `Time per request` metric should (in theory) provide a more accurate assessment of how long each request takes.

![ab as seen within Datadog’s Live Processes view, demonstrating that ab uses a single CPU thread despite a high concurrency value.](https://web-assets.dd-static.net/42588/1776291517-apachebench-apachebench-live-proc-cpu.png)
*ab as seen within Datadog’s Live Processes view, demonstrating that ab uses a single CPU thread despite a high concurrency value.*

You should treat both `Time taken for tests` and `Time per request` as rough indicators of web server performance under specific levels of load. Tracking these values over time can help you assess your optimization efforts, and unusually high or low values can point to CPU saturation on the web server, code changes, or other events that you'll want to investigate in more detail.

#### Metrics for individual requests

For each connection it makes, `ab` stores five [timestamps](https://github.com/apache/httpd/blob/497e991285d7179bb0d81e3f8fae049605f9aaee/support/ab.c#L259-L263):

- `start`: when `ab` began making a connection with the web server
- `connect`: when `ab` finished making the connection and began writing the request
- `endwrite`: when `ab` finished writing the request
- `beginread`: when `ab` began reading from the response
- `done`: when `ab` closed the connection

`ab` uses these timestamps to create [a `data` object](https://github.com/apache/httpd/blob/497e991285d7179bb0d81e3f8fae049605f9aaee/support/ab.c#L271-L276) for each connection, containing the following properties:

- `starttime`: the `start` property
- `waittime`: How long `ab` spent waiting before it began reading from the response, after it finished writing the request (i.e., `beginread - endwrite`)
- `ctime`: How long `ab` spent making a connection with the web server (i.e., `connect - start`)
- `time`: How long `ab` spent handling the connection, from the beginning of its attempt to contact the server to when the connection closed (i.e., `done - start`)

From there, `ab` uses the `data` object to calculate the remaining latency-related metrics: aggregated connection times, percentiles, and, if you use the `-g` flag, data for individual requests.

#### Aggregated connection times

After inspecting ApacheBench's top-level latency metrics, you can track the duration of different stages of the request-response cycle. `ab` aggregates properties from its per-request `data` objects to calculate the `min`, `mean`, `sd` (standard deviation), `median`, and `max` of each stage, all in milliseconds:

- **Connect:** How long it takes `ab` to establish a TCP connection with the target server before writing the request to the connection (`ctime`)
- **Processing:** How long the connection was open after being created (`time - ctime`)
- **Waiting:** How long `ab` waits after sending the request before beginning to read a response from the connection (`waittime`)
- **Total:** The time elapsed from the moment `ab` attempts to make the connection to the time the connection closes (`time`)

The output will resemble the following:

```text
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       42   71  24.3     63     198
Processing:    13   32  11.5     33      80
Waiting:        5   19  10.7     15      40
Total:         65  103  30.9     95     233
```

These metrics give you a more nuanced understanding than `Time taken for tests` and `Time per request`, allowing you to see which part of the request-response cycle was responsible for the overall latency. Further, these metrics are based on the `data` object for each request, rather than the start and end times of `ab`'s tests.

In the above example, we can see that `Connect` was on average (i.e., in both mean and median) the longest part of the cycle, though also the one with the highest standard deviation. Since the `Connect` metric depends on client latency as well as server latency, we could investigate each of these, determining which side of the connection is responsible for the variation. Then we'd consider focusing our optimization efforts on how our backend handles new TCP connections.

#### Percentiles

The final ApacheBench report also includes a breakdown of request latency percentiles, giving you a more detailed view of request latency distribution than the standard deviations within the `Connection Times` table. The percentile breakdown resembles the following. In the report, "certain time" refers to the value of the `time` property (defined [above](#aggregated-connection-times)).

```text
Percentage of the requests served within a certain time (ms)
  50%     95
  66%    102
  75%    107
  80%    115
  90%    138
  95%    163
  98%    229
  99%    233
 100%    233 (longest request)
```

Unlike the `Connection Times` table, these metrics are not broken down by stage of the request-response cycle. As a result, `ab`'s latency metrics are best viewed together to get a comprehensive view of web server performance:

1. Use the **top-level latency metrics** to get a quick read of the server's overall performance, especially in comparison to other `ab` tests applied in similar conditions.
1. From there, use the higher-granularity **aggregated connection times** to find out which part(s) of the request-response cycle to investigate further.
1. Finally, **percentiles** tell you how much variation there is between the fastest and slowest requests, so you can address [any long-tail latencies](https://en.wikipedia.org/wiki/Long-tail_traffic).

#### Latency of individual requests

ApacheBench can also display data about each connection in tab-separated values (TSV) format, allowing you to calculate values that are not available within the standard `ab` report, such as `wait` time percentiles. This data comes from the same `data` objects that `ab` uses to calculate [`Connection Times`](#aggregated-connection-times) and [percentiles](#percentiles). These per-request values are:

- `starttime`: The time the connection began, printed in the [`ctime()` format](https://en.cppreference.com/w/c/chrono/ctime) (from the `starttime` property of the connection)
- `seconds`: The same value as `starttime`, but given as the number of seconds since the Unix Epoch
- `ctime`: The time it takes to connect to the server before making this request, in milliseconds
- `dtime`: How long an established connection was open, in milliseconds (i.e, `time - ctime`)
- `ttime`: The total time spent on this connection (i.e., the `time` property) in milliseconds
- `wait`: The time spent waiting between sending the request and reading from the response (i.e., the `waittime` property), in milliseconds)

To access per-request data in TSV format, use the `-g` flag in your `ab` command, specifying the path to the output file. For example, let's say you run this command:

```bash
ab -n 100 -g /path/to/plot.tsv <WEB_SERVER_ADDRESS>
```

The first five rows of the data in the **plot.tsv** file will resemble the following.

```text
starttime	seconds	ctime	dtime	ttime	wait
Tue Jun 18 15:56:10 2019	1560887770	46	14	60	6
Tue Jun 18 15:56:07 2019	1560887767	48	15	63	7
Tue Jun 18 15:56:11 2019	1560887771	51	13	63	6
Tue Jun 18 15:56:12 2019	1560887772	47	17	64	12
Tue Jun 18 15:56:06 2019	1560887766	46	20	66	9
```

The `-g` flag gives you flexibility in how you analyze request data. You could, for example, plot the `dtime` of each request in a timeseries graph (i.e., by using [gnuplot](http://www.gnuplot.info/), spreadsheet software, or a graphics library), indicating whether increased load on the server affected the performance of requests after a certain point.

### Success metrics

Since ApacheBench makes real requests to an HTTP server, it's important to understand the sorts of responses your web servers are returning. `ab` allows you to track both non-2xx response codes and errors that arise when making requests.

#### Non-2xx responses

One way to understand the responses from your web servers is to count those that return an error or a failure. This is helpful if your goal is to benchmark deliberately unsuccessful requests (e.g., the time it takes to load a 404 page after a redesign), or requests that are not guaranteed to return a successful response (e.g., in production environments).

It's important to know that `ab` will classify 3xx, 4xx, and 5xx response codes as `Non-2xx responses`, and report this metric in a format similar to this:

```text
Non-2xx responses:      10
```

You can see which status codes make up the "non-2xx" responses by adding the `-v <INTEGER>` option when running `ab`, where a value of `2` will show the headers and body of each response in STDOUT. You can also send HTTP logs from your server to a dedicated monitoring platform like Datadog to aggregate responses by status code.

![ApacheBench logs aggregated by status code in Datadog.](https://web-assets.dd-static.net/42588/1776291524-apachebench-apachebench-logs-view.png)
*ApacheBench logs aggregated by status code in Datadog.*

#### Complete versus failed requests

Connections between ApacheBench and your web server can fail just like any TCP connection, and `ab` counts failures within four different categories:

- **Length:** After the first response, a subsequent response displays varying content length (in bytes)
- **Connect:** `ab`'s attempt to connect to a server results in an error
- **Receive:** `ab` encounters an error when reading data from the connection
- **Exceptions:** `ab` encounters an error while setting up file descriptors for making TCP connections

The following comes from running an `ab` command with the `-n` option set to `1000` and the URL set to `www.google.com/`.

```text
Complete requests:      1000
Failed requests:        992
   (Connect: 0, Receive: 0, Length: 992, Exceptions: 0)
```

The high number of `Length`-related errors is likely due to Google's dynamic content. In a [well-known bug](https://bz.apache.org/bugzilla/show_bug.cgi?id=27888) (since marked as ["a bug which will never be fixed"](https://bz.apache.org/bugzilla/page.cgi?id=fields.html)),`ab` will treat requests as failures if the content length of their responses is variable. This is because `ab` stores the length of the first response it receives, and compares subsequent lengths to that value. This means that you should disregard length-related failures when you're using `ab` to test  dynamic sites.

In general, the `Failed request` metrics monitor activity at the transport layer, i.e., with your TCP connections, rather than at the application layer. If you see nonzero results for `Failed requests`, you'll want to investigate your web server for possible issues with handling TCP connections.

## ApacheBench metrics for better benchmarking

In this post, we've shown you how to use ApacheBench to measure the performance of your HTTP servers, explored some limitations of ApacheBench, and explained the metrics that ApacheBench can provide. You can also use Datadog to get full visibility into your HTTP servers, including [Apache](https://www.datadoghq.com/blog/monitor-apache-web-server-datadog.md), [NGINX](https://www.datadoghq.com/blog/how-to-monitor-nginx-with-datadog.md), and [Gunicorn](https://www.datadoghq.com/blog/monitor-gunicorn-performance.md), along with 1,000 other technologies. Datadog can provide complete visibility into HTTP response latency by analyzing server logs, tracing requests as they travel across distributed services, collecting network metrics from your hosts, and simulating HTTP requests from clients around the world—check out our <!-- Sign-up trigger (free trial) omitted -->.