What is HAProxy?
HAProxy is an open source solution for load balancing and reverse proxying TCP and HTTP requests. It is a robust, high-availability platform that can route around and remove backends that fail its built-in health checks. HAProxy is a keystone of some extremely well-known, high-traffic sites, including Twitter, Instagram, Tumblr, Reddit, and Yelp.
HAProxy dashboard overview
Because a load balancer is the intermediary between client connections and the backend, a misconfigured HAProxy setup will slow everything down. Thus, it is important to monitor frontend connections between the client and HAProxy, connections between HAProxy and your backend servers, and combined metrics such as error codes and server status.
Below is the HAProxy template dashboard in Datadog. In this article you’ll find a breakdown of the metrics on that dashboard, which make for a good starting point for anyone looking to monitor HAProxy.
Here’s a graph-by-graph breakdown of the dashboard, separated into frontend, backend, and combined metrics.
The following metrics tell you about client interactions with the load balancer.
Front 2xx %
The percentage of frontend responses that are successful (2xx status codes).
Front sessions %
Indicates the number of frontend sessions (from clients to HAProxy), as a percentage of HAProxy's total session capacity. In the example dashboard, frontend sessions are well over 80 percent of capacity, meaning that you might want to increase the session cap, migrate your HAProxy server to a bigger box, or add another HAProxy server to the pool.
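If frontend sessions are nearing capacity, one remedy is to raise the per-frontend session cap. A minimal sketch, with illustrative names, addresses, and limits:

```haproxy
frontend http-in
    bind *:80
    # Cap concurrent sessions for this frontend; raise this (within the
    # global maxconn ceiling) if sessions regularly exceed 80% of capacity.
    maxconn 10000
    default_backend web-servers
```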
Frontend network traffic
The volume of network traffic, over time, expressed as a rate (e.g., mebibytes per second).
HAProxy defines a session as being composed of two connections, one from the client to HAProxy, and the other from HAProxy to the appropriate backend server. Thus, frontend sessions represent the number of clients that are connected to HAProxy.
The number of new connections made per second, from clients to HAProxy. HAProxy allows you to cap the number of new sessions per second in order to keep your deployment from creaking under the weight of new visitors.
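The new-session cap is set with the `rate-limit sessions` directive. A minimal sketch, with an illustrative frontend name and limit:

```haproxy
frontend http-in
    bind *:80
    # Accept at most 100 new sessions per second on this frontend;
    # excess connections wait briefly rather than overwhelming the backend.
    rate-limit sessions 100
    default_backend web-servers
```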
Once a session has been created, clients can begin to issue requests. All of a client’s requests are usually contained within a single session.
Frontend denials track the number of requests that are denied because they fail to pass security restrictions.
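Requests are typically denied by `http-request deny` rules gated on an ACL. A minimal sketch, with a hypothetical restricted path:

```haproxy
frontend http-in
    bind *:80
    # Deny requests to a restricted path; each denied request
    # increments the frontend's denied-requests counter.
    acl restricted_path path_beg /admin
    http-request deny if restricted_path
    default_backend web-servers
```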
These metrics inform you about HAProxy’s connections with your backend servers.
Back 2xx %
The number of successful responses from the backend, expressed as a percentage of the total.
Back sessions %
Indicates the number of backend sessions (from HAProxy to your backend servers), as a percent of HAProxy’s total capacity.
The current number of requests sitting unassigned in queue. If your backend is bombarded with connections to the point that you reach your global maxconn limit, HAProxy will seamlessly queue new connections in your system kernel's socket queue until a backend server becomes available or a timeout is reached.
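Both the global ceiling and the per-server limit that governs queuing are set with `maxconn`. A minimal sketch, with illustrative values:

```haproxy
global
    # Process-wide ceiling on concurrent connections; beyond this,
    # new connections wait in the kernel's socket queue.
    maxconn 50000

backend web-servers
    # Requests above a server's maxconn sit in the backend queue
    # until a slot frees up or "timeout queue" expires.
    timeout queue 30s
    server web1 10.0.0.1:80 maxconn 500
```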
This metric indicates failed backend requests and general backend errors. To hunt down the cause of backend errors, correlate this metric with response codes from your frontend and backend servers.
Response time represents the average response time over a sliding window of the last 1,024 requests. Keep in mind that this metric will be zero if you are not using HTTP mode.
Backend network traffic
The volume of network traffic being served by the backend over time, expressed as a rate (e.g., mebibytes per second).
The number of connections from HAProxy to a backend server at any given time. Even if HAProxy reaches its connection limit, it will continue to accept and queue new connections until a backend server becomes available.
The number of connections from HAProxy to backend servers, per second.
Average backend response time by host (ms)
This metric measures the latency of your system. Response times over 500 milliseconds typically result in degradation of application performance and customer experience. Watch this closely.
Time to connect to backend (in ms)
The time it takes for HAProxy to connect to a backend server.
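The upper bound on connect time is configured with `timeout connect`. A minimal sketch of a defaults section, with illustrative values:

```haproxy
defaults
    mode http
    # Abort the attempt if the TCP handshake with a backend
    # server takes longer than 5 seconds to complete.
    timeout connect 5s
    timeout client 50s
    timeout server 50s
```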
Backend retries and redispatches
The number of retries, plus the number of times a request was redispatched to a different backend. When the retry metric creeps above baseline, expect a spike in errors and connection failures.
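Retries and redispatches are governed by the `retries` directive and `option redispatch`. A minimal sketch:

```haproxy
defaults
    # Retry a failed connection to the same server up to 3 times...
    retries 3
    # ...and allow the last retry to be redispatched to another server.
    option redispatch
```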
Average queue time
Average time spent in queue (in milliseconds) for the last 1,024 requests. Since this is an average, the overall result can be skewed by a single request trapped in queue. Keep this value as low as possible.
Backend denials occur when an otherwise benign request would return a response containing sensitive information. You can block these responses with an Access Control List (ACL); a denied response is typically returned to the client as a 502 error.
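Responses are blocked with `http-response deny` rules. A minimal sketch, using a hypothetical internal header as the trigger:

```haproxy
backend web-servers
    # Block any response carrying this (hypothetical) internal header
    # so it never reaches the client; HAProxy returns a 502 instead.
    http-response deny if { res.hdr(X-Internal-Data) -m found }
    server web1 10.0.0.1:80
```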
Shared metrics (frontend and backend)
3xx response codes (F&B)
The number of 3xx redirection codes, which indicate that the client must take additional action to complete the request. This is usually due to some kind of URL redirection.
4xx response codes (F&B)
This represents the number of client error codes. Code 408 tends to crop up when browsers pre-connect and timeout, whereas code 404 could point to a misconfigured application or unruly client.
5xx response codes (F&B)
The number of HTTP server errors. These are usually correlated with a large number of denied responses.
The combined number of errors, including all HTTP response codes, from both the frontend and the backend.
HAProxy servers by status
Once set up, HAProxy can regularly perform health checks on all enabled servers. If a server fails a number of consecutive health checks (three by default, configurable with the fall directive), it is marked DOWN. Monitoring the health of your HAProxy servers gives you the information you need to quickly respond to outages as they occur.
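Active health checks are enabled with `option httpchk` and the `check` keyword on each server line. A minimal sketch, with illustrative server names, addresses, and a hypothetical /health endpoint:

```haproxy
backend web-servers
    # Active HTTP health checks against a health endpoint.
    option httpchk GET /health
    # check inter 2s: probe every 2 seconds;
    # fall 3: mark the server DOWN after 3 consecutive failures;
    # rise 2: mark it UP again after 2 consecutive successes.
    server web1 10.0.0.1:80 check inter 2s fall 3 rise 2
    server web2 10.0.0.2:80 check inter 2s fall 3 rise 2
```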
Monitor frontend connections and backend servers with the HAProxy dashboard
For a deep dive on HAProxy metrics and how to monitor them, check out our three-part How to Monitor HAProxy series.