How to collect Varnish metrics

Published: July 28, 2015

This post is part 2 of a 3-part series on Varnish monitoring. Part 1 explores the key Varnish metrics available, and Part 3 details how Datadog can help you to monitor Varnish metrics.

How to get the Varnish metrics you need

Varnish Cache ships with precise monitoring and logging tools. As explained in the first post of this series, the most useful of these for monitoring is varnishstat, which gives you a detailed snapshot of Varnish’s current performance. It provides access to in-memory statistics such as cache hits and misses, resource consumption, threads created, and more.

varnishstat

If you run varnishstat from the command line you will see a list of all available Varnish metrics, with values changing in real time. If you add the -1 flag, varnishstat will exit after printing the list one time. Example output below:

$ varnishstat      

     MAIN.uptime                  Child process uptime      
     MAIN.sess_conn               Sessions accepted      
     MAIN.sess_drop               Sessions dropped      
     MAIN.sess_fail               Session accept failures      
     MAIN.sess_pipe_overflow      Session pipe overflow      
     MAIN.client_req              Good client requests received      
     MAIN.cache_hit               Cache hits      
     MAIN.cache_hitpass           Cache hits for pass      
     MAIN.cache_miss              Cache misses      
     MAIN.backend_conn            Backend conn. success      
     MAIN.backend_unhealthy       Backend conn. not attempted      
     MAIN.backend_busy            Backend conn. too many      
     MAIN.backend_fail            Backend conn. failures      
     MAIN.backend_reuse           Backend conn. reuses      
     MAIN.backend_toolate         Backend conn. was closed      
     MAIN.backend_recycle         Backend conn. recycles      
     MAIN.backend_retry           Backend conn. retry      
     MAIN.pools                   Number of thread pools      
     MAIN.threads                 Total number of threads      
     MAIN.threads_limited         Threads hit max      
     MAIN.threads_created         Threads created      
     MAIN.threads_destroyed       Threads destroyed      
     MAIN.threads_failed          Thread creation failed      
     MAIN.thread_queue_len        Length of session queue

To list specific values, pass them with the -f flag, separated by commas (and followed by -1 if needed).

For instance, to display the number of threads currently being used, run: varnishstat -f MAIN.threads
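Because the one-shot output is plain text, it is easy to post-process in a script. As a minimal sketch, the shell function below computes the cache hit ratio, hits / (hits + misses), from `varnishstat -1` output. It assumes each output line has the form name, current value, per-second average, description. The function name `hit_ratio` is purely illustrative and not part of Varnish.

```shell
# hit_ratio: read `varnishstat -1` output on stdin and print the cache hit
# ratio as a percentage. Assumes column 1 is the metric name and column 2
# its current value.
hit_ratio() {
  awk '
    $1 == "MAIN.cache_hit"  { hit = $2 }
    $1 == "MAIN.cache_miss" { miss = $2 }
    END {
      total = hit + miss
      if (total > 0) printf "%.2f\n", 100 * hit / total
      else print "n/a"   # no lookups recorded yet
    }'
}

# Usage on a host running Varnish:
#   varnishstat -1 | hit_ratio
```

Note that these counters accumulate from the moment Varnish starts, so the result is a lifetime ratio rather than a recent one.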


varnishstat is useful as a standalone tool if you need to spot-check the health of your cache. However, if Varnish is an important part of your software service, you will almost certainly want to graph its performance over time, correlate it with other metrics from across your infrastructure, and be alerted about any problems that may arise. To do this, you will probably want to feed the metrics that varnishstat reports into a dedicated monitoring service.
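Most varnishstat values are counters that only grow, so a monitoring service typically samples them at intervals and works with the change between samples. The sketch below illustrates that idea by hand: it compares two saved `varnishstat -1` snapshots and prints a per-second rate for each metric. The function name `snapshot_rates`, the file paths, and the 60-second interval are hypothetical choices for illustration, and the column layout (name, then value) is assumed.

```shell
# snapshot_rates: print the per-second change of every metric found in two
# saved `varnishstat -1` snapshots. Only meaningful for counters that grow
# over time (e.g. MAIN.client_req), not for gauges such as MAIN.threads.
# usage: snapshot_rates OLD_FILE NEW_FILE SECONDS_BETWEEN
snapshot_rates() {
  awk -v secs="$3" '
    NR == FNR    { prev[$1] = $2; next }   # first file: remember old values
    ($1 in prev) { printf "%s %.2f\n", $1, ($2 - prev[$1]) / secs }
  ' "$1" "$2"
}

# Usage on a host running Varnish:
#   varnishstat -1 > /tmp/old; sleep 60; varnishstat -1 > /tmp/new
#   snapshot_rates /tmp/old /tmp/new 60
```

A dedicated monitoring service does the same bookkeeping continuously, and adds storage, graphing, and alerting on top.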

varnishlog

If you need to debug your system or tune configuration, varnishlog can be a useful tool, as it provides detailed information about each individual request.

Here is an edited example of varnishlog output generated by a single request (a full example would be several times longer):

$ varnishlog      

     3727 RxRequest    c GET      
     3727 RxProtocol   c HTTP/1.1      
     3727 RxHeader     c Content-Type: application/x-www-form-urlencoded;      
     3727 RxHeader     c Accept-Encoding: gzip,deflate,sdch      
     3727 RxHeader     c Accept-Language: en-US,en;q=0.8      
     3727 VCL_return   c hit      
     3727 ObjProtocol  c HTTP/1.1      
     3727 TxProtocol   c HTTP/1.1      
     3727 TxStatus     c 200      
     3727 Length       c 316      
  […]

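The 4 columns represent:

  • An ID shared by all the entries that belong to the same request (3727 in the example above)
  • The type of data recorded, known as the tag. Tags starting with “Rx” mark data that Varnish received, and tags starting with “Tx” mark data that Varnish sent.
  • Whether the entry records communication between Varnish and the client (“c”) or between Varnish and the backend (“b”)
  • The data itself, or details about the entry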
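Since every entry follows the same layout, individual values are easy to pull out with standard text tools. The following sketch assumes the layout shown in the example output (ID, tag, “c” or “b” marker, then data, separated by whitespace) and prints the data column for a given tag. `log_field` is a hypothetical helper name, not a Varnish command.

```shell
# log_field: print the data column of every varnishlog line carrying TAG.
# Assumes whitespace-separated columns: ID, tag, c/b marker, data.
# Runs of spaces inside the data column are collapsed to single spaces.
# usage: log_field TAG
log_field() {
  awk -v tag="$1" '
    $2 == tag {
      $1 = $2 = $3 = ""     # blank out the first three columns
      sub(/^ +/, "")        # trim the separators left behind
      print
    }'
}

# Usage on a host running Varnish:
#   varnishlog | log_field TxStatus
```

Applied to the example above, `log_field TxStatus` would print 200.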

varnishlog’s children

You can display a subset of varnishlog’s information via three specialized tools built on top of varnishlog:

  • varnishtop exposes the log entries that occur most often. You can filter to show the most frequently requested documents, the most common clients or user agents, or other data.
  • varnishhist returns a histogram of latency for recent requests.
  • varnishsizes returns a histogram of request size for recent requests.
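To make the idea behind varnishtop concrete, here is a rough sketch of the same kind of ranking applied to captured varnishlog output. It is not how varnishtop is implemented, only an illustration of counting the most frequent entries for one tag. The helper name `top_entries` is hypothetical, and the `RxURL` tag used in the usage line is an assumption based on the “Rx”/“Tx” tag naming in the example above; tag names differ between Varnish versions.

```shell
# top_entries: rank the most frequent data values for TAG in varnishlog
# output read from stdin, most frequent first.
# usage: top_entries TAG [LIMIT]   (LIMIT defaults to 10)
top_entries() {
  awk -v tag="$1" '
    $2 == tag {
      $1 = $2 = $3 = ""; sub(/^ +/, "")   # keep only the data column
      count[$0]++
    }
    END { for (value in count) print count[value], value }
  ' | sort -k1,1nr -k2 | head -n "${2:-10}"
}

# Usage on a saved capture (file name is illustrative):
#   top_entries RxURL 5 < varnishlog-capture.txt
```

For live traffic, varnishtop itself is the better choice, since it updates its ranking continuously.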

Conclusion

Which metrics you monitor will depend on your use case, the tools available to you, and whether the insight provided by a given metric justifies the overhead of monitoring it.

At Datadog, we have built an integration with Varnish so that you can begin collecting and monitoring its metrics with a minimum of setup. Learn how Datadog can help you to monitor Varnish in the next and final part of this series of articles.


Source Markdown for this post is available on GitHub. Questions, corrections, additions, etc.? Please let us know.

