How to collect Varnish metrics

Jean-Mathieu Saponaro

This post is part 2 of a 3-part series on Varnish monitoring. Part 1 explores the key Varnish metrics available, and Part 3 details how Datadog can help you to monitor Varnish metrics.

How to get the Varnish metrics you need

Varnish Cache ships with useful, precise monitoring and logging tools. As explained in the first post of this series, the most useful of these for monitoring purposes is varnishstat, which gives you a detailed snapshot of Varnish’s current performance. It provides access to in-memory statistics such as cache hits and misses, resource consumption, threads created, and more.

varnishstat

If you run varnishstat from the command line you will see a list of all available Varnish metrics, with values changing in real time. If you add the -1 flag, varnishstat will exit after printing the list one time. Example output below:

$ varnishstat

     MAIN.uptime                  Child process uptime
     MAIN.sess_conn               Sessions accepted
     MAIN.sess_drop               Sessions dropped
     MAIN.sess_fail               Session accept failures
     MAIN.sess_pipe_overflow      Session pipe overflow
     MAIN.client_req              Good client requests received
     MAIN.cache_hit               Cache hits
     MAIN.cache_hitpass           Cache hits for pass
     MAIN.cache_miss              Cache misses
     MAIN.backend_conn            Backend conn. success
     MAIN.backend_unhealthy       Backend conn. not attempted
     MAIN.backend_busy            Backend conn. too many
     MAIN.backend_fail            Backend conn. failures
     MAIN.backend_reuse           Backend conn. reuses
     MAIN.backend_toolate         Backend conn. was closed
     MAIN.backend_recycle         Backend conn. recycles
     MAIN.backend_retry           Backend conn. retry
     MAIN.pools                   Number of thread pools
     MAIN.threads                 Total number of threads
     MAIN.threads_limited         Threads hit max
     MAIN.threads_created         Threads created
     MAIN.threads_destroyed       Threads destroyed
     MAIN.threads_failed          Thread creation failed
     MAIN.thread_queue_len        Length of session queue
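
As a rough sketch of how this one-shot output could be consumed by a script, the Python below parses text in the shape that varnishstat -1 typically prints: a metric name, its current value, a per-second average (or a dot for gauges), and a description. The value columns are not shown in the listing above, so that layout is an assumption here, the sample figures are made up for illustration, and parse_varnishstat is a hypothetical helper name.

```python
def parse_varnishstat(text):
    """Parse `varnishstat -1`-style text into {name: (value, description)}.

    Assumes each line looks like: NAME VALUE AVERAGE DESCRIPTION...
    Lines that do not match that shape are skipped.
    """
    metrics = {}
    for line in text.splitlines():
        parts = line.split(None, 3)
        if len(parts) < 3 or not parts[1].isdigit():
            continue
        name, value = parts[0], int(parts[1])
        description = parts[3] if len(parts) == 4 else ""
        metrics[name] = (value, description)
    return metrics


# Made-up sample in the assumed format (not real measurements).
SAMPLE = """\
MAIN.uptime            86400         1.00 Child process uptime
MAIN.cache_hit         90000         1.04 Cache hits
MAIN.cache_miss        10000         0.12 Cache misses
MAIN.threads             200          .   Total number of threads
"""

stats = parse_varnishstat(SAMPLE)
print(stats["MAIN.threads"][0])
```

Keeping the description alongside the value makes it easy to label the metric later, for example when forwarding it to a dashboard.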

To list only specific metrics, pass their names to the -f flag, separated by commas (and add -1 if you want varnishstat to print once and exit).

For instance, to display the number of threads currently being used, run: varnishstat -f MAIN.threads
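
If you want to issue the same command from a script, a minimal sketch might look like the following. It follows the comma-separated -f convention described above; the exact -f syntax may differ between Varnish versions, so check the varnishstat man page for yours. The helper names build_varnishstat_cmd and run_varnishstat are hypothetical, and run_varnishstat assumes varnishstat is on your PATH.

```python
import subprocess


def build_varnishstat_cmd(fields, once=True):
    """Build an argv list for varnishstat limited to the given fields."""
    cmd = ["varnishstat"]
    if fields:
        # Comma-separated field list, as described in the text above.
        cmd += ["-f", ",".join(fields)]
    if once:
        # -1 makes varnishstat print the list one time and exit.
        cmd.append("-1")
    return cmd


def run_varnishstat(fields):
    """Run varnishstat once and return its raw text output.

    Requires varnishstat on PATH and permission to read Varnish's stats.
    """
    result = subprocess.run(
        build_varnishstat_cmd(fields),
        capture_output=True, text=True, check=True,
    )
    return result.stdout


print(build_varnishstat_cmd(["MAIN.threads"]))
```

Building the argument list separately from running it keeps the command easy to inspect and test on a machine where Varnish is not installed.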

The varnishstat tool is useful on its own when you need to spot-check the health of your cache. However, if Varnish is an important part of your service, you will almost certainly want to graph its performance over time, correlate it with other metrics from across your infrastructure, and be alerted to any problems that arise. To do so, you will probably want to feed the metrics that varnishstat reports into a dedicated monitoring service.
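
To illustrate why tracking values over time matters: counters such as MAIN.cache_hit and MAIN.cache_miss are cumulative, so a meaningful rate comes from the difference between two snapshots. The sketch below computes a simplified hit ratio over an interval (it ignores MAIN.cache_hitpass). The function name hit_ratio and the snapshot values are illustrative assumptions, not real measurements.

```python
def hit_ratio(prev, curr):
    """Cache hit ratio over the interval between two snapshots.

    prev and curr map metric names to cumulative counter values.
    Returns None when no cache lookups happened in the interval
    (or when the counters went backwards, e.g. after a restart).
    """
    hits = curr["MAIN.cache_hit"] - prev["MAIN.cache_hit"]
    misses = curr["MAIN.cache_miss"] - prev["MAIN.cache_miss"]
    if hits < 0 or misses < 0 or hits + misses == 0:
        return None
    return hits / (hits + misses)


# Two made-up snapshots taken some time apart.
earlier = {"MAIN.cache_hit": 90000, "MAIN.cache_miss": 10000}
later = {"MAIN.cache_hit": 90900, "MAIN.cache_miss": 10100}
print(hit_ratio(earlier, later))  # 900 hits out of 1000 lookups
```

A dedicated monitoring service performs this kind of differencing for you, which is one reason to prefer it over ad hoc scripts.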

varnishlog

If you need to debug your system or tune configuration, varnishlog can be a useful tool, as it provides detailed information about each individual request.

Here is an edited example of varnishlog output generated by a single request. A full example would be several times longer:

$ varnishlog

     3727 RxRequest    c GET
     3727 RxProtocol   c HTTP/1.1
     3727 RxHeader     c Content-Type: application/x-www-form-urlencoded;
     3727 RxHeader     c Accept-Encoding: gzip,deflate,sdch
     3727 RxHeader     c Accept-Language: en-US,en;q=0.8
     3727 VCL_return   c hit
     3727 ObjProtocol  c HTTP/1.1
     3727 TxProtocol   c HTTP/1.1
     3727 TxStatus     c 200
     3727 Length       c 316
  […]

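The four columns represent:

Request ID: an arbitrary number identifying the request; lines that share a number belong to the same transaction
Tag: the type of data being logged; tags starting with Rx indicate data Varnish is receiving, and tags starting with Tx indicate data it is sending
c or b: whether the data is exchanged with the client (c) or the backend (b)
Data: the logged content itself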
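
As a small sketch of how such lines could be handled in a script, the Python below splits one line of the sample output into its four columns, assuming they are a request ID, a tag, a client/backend marker, and the logged data. LogEntry and parse_log_line are hypothetical names introduced only for this example.

```python
from typing import NamedTuple


class LogEntry(NamedTuple):
    request_id: int
    tag: str
    side: str   # "c" = client, "b" = backend
    data: str


def parse_log_line(line):
    """Split one varnishlog-style line into its four columns.

    Returns None for lines that do not start with a numeric ID
    (for example the "$ varnishlog" prompt or an elision marker).
    """
    parts = line.split(None, 3)
    if len(parts) < 3 or not parts[0].isdigit():
        return None
    data = parts[3].strip() if len(parts) == 4 else ""
    return LogEntry(int(parts[0]), parts[1], parts[2], data)


entry = parse_log_line("     3727 RxHeader     c Accept-Language: en-US,en;q=0.8")
print(entry.tag, entry.side, entry.data)
```

Splitting on at most three runs of whitespace keeps the data column intact even when it contains spaces, as header values often do.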

varnishlog’s children

You can display a subset of varnishlog’s information via three specialized tools built on top of varnishlog:

varnishtop exposes the log entries that occur most often. You can filter to show the most frequently requested documents, the most common clients or user agents, or other data.
varnishhist returns a histogram of latency for recent requests.
varnishsizes returns a histogram of request size for recent requests.
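
To make the idea behind varnishtop concrete, here is a simplified sketch that counts the most frequent values logged under one tag. It is not how varnishtop itself is implemented; it only illustrates the kind of ranking it provides. The function name top_entries and the sample log lines are made up for illustration, following the four-column layout of the varnishlog example earlier.

```python
from collections import Counter


def top_entries(lines, tag, n=3):
    """Return the n most common data values logged under the given tag."""
    counts = Counter()
    for line in lines:
        parts = line.split(None, 3)
        if len(parts) == 4 and parts[1] == tag:
            counts[parts[3].strip()] += 1
    return counts.most_common(n)


# Hypothetical log lines, not captured from a real server.
SAMPLE_LOG = [
    "3727 RxURL c /index.html",
    "3728 RxURL c /about.html",
    "3729 RxURL c /index.html",
    "3729 TxStatus c 200",
]
print(top_entries(SAMPLE_LOG, "RxURL"))
```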

Conclusion

Which metrics you monitor will depend on your use case, the tools available to you, and whether the insight provided by a given metric justifies the overhead of monitoring it.

At Datadog, we have built an integration with Varnish so that you can begin collecting and monitoring its metrics with a minimum of setup. Learn how Datadog can help you to monitor Varnish in the next and final part of this series of articles.

