Monitor Jenkins jobs with Datadog
Jenkins is an open source, Java-based continuous integration server that helps organizations build, test, and deploy projects automatically. Jenkins is widely used, having been adopted by organizations like GitHub, Etsy, LinkedIn, and Datadog.
You can set up Jenkins to test and deploy your software projects every time you commit changes, to trigger new builds upon successful completion of other builds, and to run jobs on a regular schedule. With hundreds of plugins, Jenkins supports a wide variety of use cases.
As shown in the out-of-the-box dashboard below, our Datadog plugin provides more insight into job history and trends than Jenkins’s standard weather reports. You can use the plugin to:
- Set alerts for important build failures
- Identify trends in build durations
- Correlate Jenkins events with performance metrics from other parts of your infrastructure in order to identify and resolve issues
Monitor Jenkins build status in real-time
Once you install the Jenkins-Datadog plugin, Jenkins activities (when a build starts, fails, or succeeds) will start appearing in your Datadog event stream. You will also see what percentage of builds failed within the same job, so that you can quickly spot which jobs are experiencing a higher rate of failure than others.
Remember to blacklist any jobs you don’t want to track by listing them in your plugin configuration.
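To make the per-job failure percentage concrete, here is a minimal Python sketch of the arithmetic. It is not the plugin's code: the `failure_rate_by_job` function, the `(job, result)` record shape, the `"failure"` result string, and the job names are all assumptions made for illustration. The `excluded` set mimics the effect of leaving blacklisted jobs out of the calculation.

```python
from collections import defaultdict


def failure_rate_by_job(builds, excluded=frozenset()):
    """Return {job_name: percentage of that job's builds that failed}.

    `builds` is an iterable of (job_name, result) pairs. This record shape
    is hypothetical, not the plugin's actual event format.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for job, result in builds:
        if job in excluded:
            continue  # skip jobs we chose not to track
        totals[job] += 1
        if result == "failure":
            failures[job] += 1
    return {job: 100.0 * failures[job] / totals[job] for job in totals}


# Hypothetical build history
builds = [
    ("deploy-web", "success"),
    ("deploy-web", "failure"),
    ("nightly-tests", "success"),
    ("nightly-tests", "success"),
    ("scratch-job", "failure"),
]
rates = failure_rate_by_job(builds, excluded={"scratch-job"})
```

Comparing the values in `rates` across jobs is the same comparison the event stream lets you make at a glance.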
Datadog’s Jenkins dashboard gives you a high-level overview of how your jobs are performing. The status widget displays the current status of all jobs that have run in the past day, grouped by success or failure. To explore further, you can also click on the widget to view the jobs that have failed or succeeded in the past day.
You can also see the proportion of successful vs. failed builds, along with the total number of job runs completed over the past four hours.
Datadog also enables you to correlate Jenkins events with application performance metrics to investigate the root cause of an issue. For example, the screenshot below shows that average CPU on the app servers increased sharply after a Jenkins build was completed and deployed (indicated by the pink bar). Your team can use this information as a starting point to investigate whether code changes in the corresponding release are causing the issue.
Visualize job duration metrics
Every time a build is completed, Datadog’s plugin collects its duration as a metric that you can aggregate by job name or any other tag, and graph over time. The screenshot below shows the average job durations over the past four hours, sorted in decreasing order.
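The aggregation described above (average duration per job, sorted from longest to shortest) can be sketched in a few lines of Python. This only illustrates the calculation, not how Datadog computes it. The `average_duration_by_job` function, the `(job, seconds)` sample shape, and the job names are assumptions.

```python
from collections import defaultdict
from statistics import mean


def average_duration_by_job(samples):
    """Return [(job_name, average_seconds), ...] sorted longest first.

    `samples` is an iterable of (job_name, duration_seconds) pairs,
    a hypothetical stand-in for the duration metric grouped by job tag.
    """
    durations = defaultdict(list)
    for job, seconds in samples:
        durations[job].append(seconds)
    averages = {job: mean(values) for job, values in durations.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Hypothetical duration samples, in seconds
samples = [
    ("deploy-web", 120.0),
    ("deploy-web", 180.0),
    ("nightly-tests", 600.0),
    ("lint", 30.0),
]
ranked = average_duration_by_job(samples)
```

Grouping by a different key (for example, a team tag instead of the job name) would only change which value is used as the dictionary key.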
You can also graph and visualize trends in build durations for each job by using Datadog’s robust_trend() linear regression function, as shown in the screenshot below. This graph indicates which jobs’ durations are trending longer over time, so that you can investigate whether there is a problem. If you’re experimenting with changes to your CI pipeline, consulting this graph can help you track the effects of those changes over time.
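To see why a robust trend line is useful for build durations, consider the sketch below. It does not reproduce robust_trend(); it uses the Theil-Sen estimator (the median of all pairwise slopes) as a simple stand-in for a robust regression, and the `theil_sen_slope` function and sample data are assumptions for illustration. The point is that a single unusually slow build barely moves the estimated trend.

```python
from itertools import combinations
from statistics import median


def theil_sen_slope(points):
    """Estimate a trend slope as the median of all pairwise slopes.

    `points` is a sequence of (x, y) pairs, e.g. (build index, duration).
    Pairs that share the same x value are skipped.
    """
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if x2 != x1
    ]
    if not slopes:
        raise ValueError("need at least two points with distinct x values")
    return median(slopes)


# Hypothetical durations in seconds: a steady +2 s per build,
# plus one outlier build (index 3) that took 300 s.
durations = [(0, 100.0), (1, 102.0), (2, 104.0), (3, 300.0), (4, 108.0)]
slope = theil_sen_slope(durations)
```

Here `slope` stays at the underlying 2 seconds per build despite the outlier, whereas an ordinary least-squares fit would be pulled upward by it.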
Use tags to monitor Jenkins jobs
Tags add custom dimensions to your monitoring, so you can focus on what’s important to you right now.
Every Jenkins event, metric, and service check is auto-tagged with branch (if applicable). You can also enable the optional node tag in the plugin settings.
You can create custom tags for the name of the application you’re building, your particular team name (e.g. team=licorice), or any other info that matters to you. For example, if you have multiple jobs that perform nightly builds, you might want to create a descriptive tag that distinguishes them from other types of jobs.
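As a rough illustration of how tags act as extra dimensions, the Python sketch below filters a set of job records by a tag. Everything in it is hypothetical: the record shape, the `parse_tags` and `select_by_tag` helpers, the `schedule=nightly` tag, and the job names are assumptions for the example, and only `team=licorice` comes from the text above.

```python
def parse_tags(raw_tags):
    """Turn ["key=value", ...] into a dict; a bare tag maps to None."""
    tags = {}
    for raw in raw_tags:
        key, sep, value = raw.partition("=")
        tags[key] = value if sep else None
    return tags


def select_by_tag(records, key, value):
    """Return the records whose tags include key=value."""
    return [r for r in records if parse_tags(r["tags"]).get(key) == value]


# Hypothetical job records with custom tags
records = [
    {"job": "api-nightly", "tags": ["team=licorice", "schedule=nightly"]},
    {"job": "api-pr-check", "tags": ["team=licorice"]},
    {"job": "web-nightly", "tags": ["team=hypothetical-team", "schedule=nightly"]},
]
nightly = [r["job"] for r in select_by_tag(records, "schedule", "nightly")]
licorice = [r["job"] for r in select_by_tag(records, "team", "licorice")]
```

The same records can be sliced by team or by schedule without changing the jobs themselves, which is what makes a descriptive tag for nightly builds worthwhile.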