A year ago, I had a serious problem. I was the Director of Engineering at an eCommerce company running a Ruby on Rails website. Our Rails app was quite stable except for one part: our background worker processes kept growing over the course of a day, from 300 MB up to many gigabytes of memory. Eventually our machines would run out of memory and we’d have to restart the service manually to reclaim it.
Of course we were monitoring everything with Datadog. We could see the memory increasing and we could receive an email alert about the problem, but we couldn’t automate the restart. Inspeqtor to the rescue!
Inspeqtor is Linux-based software which monitors your critical application infrastructure on the local machine: processes like MySQL or PostgreSQL, Memcached, Redis, Java VM processes, custom daemons, etc. You define simple rules, and Inspeqtor will verify proper behavior and take action if a rule is broken.
To get started with Inspeqtor, first install it on each machine:
    # For Ubuntu 12.04 and 14.04 LTS
    curl -L https://bit.ly/InspeqtorDEB | sudo bash
    sudo apt-get install inspeqtor
and then write rules about your services running on that machine:
    # /etc/inspeqtor/services.d/background_worker.inq
    check service background_worker
      if memory:rss > 1g then restart, alert
      if cpu:user > 95% for 4 cycles then alert
Here, you can see I’m using Inspeqtor to automate the restart of our background worker service. Inspeqtor scans the worker process every 15 seconds. If the process is using more than 1 GB of resident memory (RSS), Inspeqtor immediately restarts it and sends me an alert. I’ve also added a rule to alert me if the process uses more than 95% of a CPU for 4 cycles (60 seconds).
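To make the "for 4 cycles" idea concrete, here is a minimal Python sketch of a threshold rule that only fires after several breaches in a row. It is not Inspeqtor's implementation: the `ThresholdRule` class, the assumption that the cycles must be consecutive, the reading of `1g` as 1024³ bytes, and the sample values are all mine, for illustration only.

```python
from dataclasses import dataclass


@dataclass
class ThresholdRule:
    """Hypothetical rule: fires once a metric has exceeded `threshold`
    on `cycles` consecutive checks (assumed semantics, not Inspeqtor's code)."""
    threshold: float
    cycles: int = 1
    breaches: int = 0  # consecutive checks above the threshold so far

    def check(self, value: float) -> bool:
        if value > self.threshold:
            self.breaches += 1
        else:
            self.breaches = 0  # any healthy reading resets the streak
        return self.breaches >= self.cycles


GIB = 1024 ** 3  # assumption: "1g" means 1024^3 bytes

# Mirrors the two rules in the .inq file above.
memory_rule = ThresholdRule(threshold=1 * GIB)      # acts on the first breach
cpu_rule = ThresholdRule(threshold=95.0, cycles=4)  # needs 4 breaches in a row

# Made-up CPU readings, one per 15-second scan.
cpu_samples = [97.0, 99.0, 40.0, 96.0, 98.0, 97.5, 99.0]
cpu_fired = [cpu_rule.check(sample) for sample in cpu_samples]
```

In this sketch the dip to 40% resets the streak, so the CPU rule only fires on the last sample, after four readings above 95% in a row. The memory rule has no cycle count, so a single reading above the limit is enough to trigger it.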
It seems too simple to work, right? The trick is that Inspeqtor queries any installed init systems – upstart, systemd, runit or init.d – to find a service with that name and any associated processes so it can collect metrics.
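As a rough illustration of that lookup, the Python sketch below asks systemd for a service's main PID and then reads the process's resident memory from `/proc`. It covers systemd only and assumes a Linux host. The function names are hypothetical, and this is not how Inspeqtor itself is written; it just shows that the init system already knows which process belongs to a service name.

```python
import subprocess
from pathlib import Path
from typing import Optional


def parse_main_pid(systemctl_output: str) -> Optional[int]:
    """Extract the PID from `systemctl show --property MainPID <unit>` output."""
    for line in systemctl_output.splitlines():
        key, _, value = line.partition("=")
        if key == "MainPID" and value.strip().isdigit():
            pid = int(value)
            return pid or None  # systemd reports 0 when no main process exists
    return None


def parse_rss_bytes(proc_status: str) -> Optional[int]:
    """Read VmRSS (reported in kB) from the text of /proc/<pid>/status."""
    for line in proc_status.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1]) * 1024
    return None


def service_rss(unit: str) -> Optional[int]:
    """Resident memory in bytes for a systemd unit, or None if it is not running."""
    result = subprocess.run(
        ["systemctl", "show", "--property", "MainPID", unit],
        capture_output=True, text=True, check=True,
    )
    pid = parse_main_pid(result.stdout)
    if pid is None:
        return None
    return parse_rss_bytes(Path(f"/proc/{pid}/status").read_text())
```

On a systemd machine, `service_rss("background_worker")` would return the worker's RSS in bytes, assuming a unit with that name exists.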
Inspeqtor is open source and free; I also offer Inspeqtor Pro with more features, including StatsD integration. With just a single line of configuration, you can send all metrics collected by Inspeqtor Pro to the Datadog Agent for visualization:
    # /etc/inspeqtor/inspeqtor.conf
    set statsd_location localhost:8125
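For readers curious about what travels over that port, here is a small Python sketch of a StatsD gauge: a plain-text `name:value|g` datagram sent over UDP. The metric name in the usage note is made up for illustration; I am not claiming it matches the names Inspeqtor Pro actually emits.

```python
import socket


def format_gauge(name: str, value: float) -> bytes:
    """Encode a gauge in the StatsD line format: <name>:<value>|g"""
    return f"{name}:{value}|g".encode("ascii")


def send_gauge(name: str, value: float,
               host: str = "localhost", port: int = 8125) -> None:
    """Send one gauge as a single UDP datagram. UDP is fire-and-forget,
    so there is no acknowledgement that the agent received it."""
    payload = format_gauge(name, value)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

Calling `send_gauge("background_worker.memory.rss", 314572800)` would report a 300 MB reading to an agent listening on `localhost:8125`, the same address used in the configuration line above.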
Within seconds your data should appear in the Datadog system. Here you can see Inspeqtor Pro collecting the total memory used by the Apache 2.x service on my website:
Using Inspeqtor Pro and Datadog together works great in my experience. Datadog provides a rich, flexible UI for seeing historical data and finding problems while Inspeqtor allows you to take action to solve those problems with rules that are incredibly easy to write. Take a look through Inspeqtor’s documentation or the Getting Started screencast if you want to learn more.