Jenkins Continuous Integration

Alan Mark Berg

February 2015

This article by Alan Mark Berg, the author of Jenkins Continuous Integration Cookbook Second Edition, outlines the main themes surrounding the correct use of a Jenkins server.

Overview

Jenkins (http://jenkins-ci.org/) is a Java-based Continuous Integration (CI) server that supports the discovery of defects early in the software cycle. Thanks to over 1,000 plugins, Jenkins communicates with many types of systems, building software and triggering a wide variety of tests.

CI involves making small changes to software and then building and applying quality assurance processes. Defects do not only occur in the code, but also appear in the naming conventions, documentation, how the software is designed, build scripts, the process of deploying the software to servers, and so on. CI forces the defects to emerge early, rather than waiting for software to be fully produced. If defects are caught in the later stages of the software development life cycle, the process will be more expensive. The cost of repair radically increases as soon as the bugs escape to production. Estimates suggest it is 100 to 1,000 times cheaper to capture defects early. Effective use of a CI server, such as Jenkins, could be the difference between enjoying a holiday and working unplanned hours to heroically save the day. And as you can imagine, in my day job as a senior developer with aspirations to quality assurance, I like long boring days, at least for mission-critical production environments.

Jenkins can automate the building of software regularly and trigger tests, pulling in the results and failing the build based on defined criteria. Failing early via build failure lowers the costs, increases confidence in the software produced, and has the potential to morph subjective processes into an aggressive metrics-based process that the development team feels is unbiased.

Jenkins offers the following:

  • A proven technology that is deployed at large scale in many organizations.
  • Open source code, which is open to review and has no licensing costs.
  • Simple configuration through a web-based GUI. This speeds up job creation, improves consistency, and decreases the maintenance costs.
  • A master-slave topology that distributes the build and testing effort over slave servers with the results automatically accumulated on the master. This topology ensures a scalable, responsive, and stable environment.
  • The ability to call slaves from the cloud. Jenkins can use Amazon services or an Application Service Provider (ASP) such as CloudBees (http://www.cloudbees.com/).
  • No-fuss installation. Installation is as simple as running a single downloaded file named jenkins.war.
  • Many plugins. Over 1,000 plugins support communication, testing, and integration with numerous external applications (https://wiki.jenkins-ci.org/display/JENKINS/Plugins).
  • A straightforward plugin framework. For Java programmers, writing plugins is straightforward. The Jenkins plugin framework has clear interfaces that are easy to extend. The framework uses XStream (http://xstream.codehaus.org/) for persisting configuration information as XML and Jelly (http://commons.apache.org/proper/commons-jelly/) for the creation of parts of the GUI.
  • Support for Groovy scripts. Jenkins can run Groovy scripts both on the master and remotely on slaves. This allows for consistent scripting across operating systems. However, you are not limited to scripting in Groovy. Many administrators like to use Ant, Bash, or Perl scripts, and statisticians and developers with complex analytics requirements like to use the R language.
  • Not just Java. Though highly supportive of Java, Jenkins also supports other languages.
  • A rapid pace of improvement. Jenkins is an agile project; you can see numerous releases each year, pushing improvements rapidly, at http://jenkins-ci.org/changelog.
    There is also a highly stable Long-Term Support Release for the more conservative.
  • Higher code quality. Jenkins pushes up code quality by automatically testing within a short period after code commit and then shouting loudly if build failure occurs.

The importance of continuous testing

In 2002, NIST estimated that software defects were costing America around 60 billion dollars per year (http://www.abeacha.com/NIST_press_release_bugs_cost.htm). Expect the cost to have increased considerably since.

To save money and improve quality, you need to remove defects as early in the software lifecycle as possible. The Jenkins test automation creates a safety net of measurements. Another key benefit is that once you have added tests, it is trivial to develop similar tests for other projects.

Jenkins works well with best practices such as Test Driven Development (TDD) or Behavior Driven Development (BDD). Using TDD, you write tests that fail first and then build the functionality needed to pass the tests. With BDD, the project team writes the description of tests in terms of behavior. This makes the description understandable to a wider audience. The wider audience has more influence over the details of the implementation.

Regression tests increase confidence that you have not broken code while refactoring software. The more coverage of code by tests, the more confidence.

There are a number of good introductions to software metrics. These include a wikibook on the details of the metrics (http://en.wikibooks.org/wiki/Introduction_to_Software_Engineering/Quality/Metrics) and a well-written book by Diomidis Spinellis, Code Quality: The Open Source Perspective.

Remote testing through Jenkins considerably increases the number of dependencies in your infrastructure and thus the maintenance effort. Remote testing is a problem that is domain specific, decreasing the size of the audience that can write tests.

You need to make test writing accessible to a large audience. Embracing the largest possible audience improves the chances that the tests defend the intent of the application.

The technologies highlighted in the Jenkins book include:

  • FitNesse: It is a wiki with which you can write different types of tests. Having a wiki-like language to express and change tests on the fly gives functional administrators, consultants, and the end user a place to express their needs. You will be shown how to run FitNesse tests through Jenkins. FitNesse is also a framework where you can extend Java interfaces to create new testing types. The testing types are called fixtures; there are a number of fixtures available, including ones for database testing, running tools from the command line, and functional testing of web applications.
  • JMeter: It is a popular open source tool for stress testing. It can also be used to functionally test through the use of assertions. JMeter has a GUI that allows you to build test plans. The test plans are then stored in XML format. JMeter is runnable through Maven or Ant scripts. JMeter is very efficient and one instance is normally enough to hit your infrastructure hard. However, for super high load scenarios, JMeter can trigger an array of JMeter instances.
  • Selenium: It is the de facto industry standard for functional testing of web applications. With Selenium IDE, you can record your actions within Firefox, saving them in HTML format to replay later. The tests can be re-run through Maven using Selenium Remote Control (RC). It is common to use Jenkins slaves with different operating systems and browser types to run the tests. The alternative is to use Selenium Grid (https://code.google.com/p/selenium/wiki/Grid2).
  • Selenium WebDriver and TestNG unit tests: A programmer-specific approach to functional testing is to write unit tests using the TestNG framework. The unit tests apply the Selenium WebDriver framework. Selenium RC is a proxy that controls the web browser. In contrast, the WebDriver framework uses native API calls to control the web browser. You can even run the HtmlUnit framework, removing the dependency on a real web browser. This enables OS-independent testing, but removes the ability to test for browser-specific dependencies. WebDriver supports many different browser types.
  • SoapUI: It simplifies the creation of functional tests for web services. The tool can read Web Services Description Language (WSDL) files publicized by web services, using the information to generate the skeleton for functional tests. The GUI makes it easy to understand the process.

Plugins

Jenkins is not only a CI server; it is also a platform to create extra functionality. Once a few concepts are learned, a programmer can adapt available plugins to their organization's needs.

If you see a feature that is missing, it is normally easier to adapt an existing one than to write from scratch. If you are thinking of adapting then the plugin tutorial (https://wiki.jenkins-ci.org/display/JENKINS/Plugin+tutorial) is a good starting point. The tutorial is relevant background information on the infrastructure you use daily.

There is a large amount of information available on plugins. Here are some key points:

  • There are many plugins and more will be developed. To keep up with these changes, you will need to regularly review the available section of the Jenkins plugin manager.
  • Work with the community: If you centrally commit your improvements then they become visible to a wide audience. Under the careful watch of the community, the code is more likely to be reviewed and further improved.
  • Don't reinvent the wheel: With so many plugins, in the majority of situations, it is easier to adapt an already existing plugin than write from scratch.
  • Pinning a plugin prevents it from being updated to a new version through the Jenkins plugin manager. Pinning helps to maintain a stable Jenkins environment.
  • Most plugin workflows are easy to understand. However, as the number of plugins you use expands, the likelihood of an inadvertent configuration error increases.
  • The Jenkins Maven Plugin allows you to run a test Jenkins server from within a Maven build without risk.
  • Conventions save effort: The location of files in plugins matters. For example, you can find the description of a plugin displayed in Jenkins at the file location /src/main/resources/index.jelly.

    By keeping to Jenkins conventions, the amount of source code you write is minimized and the readability is improved.

The three frameworks that are heavily used in Jenkins are as follows:

  • Jelly for the creation of the GUI
  • Stapler for the binding of the Java classes to the URL space
  • XStream for persistence of configuration into XML

Maintenance

In the Jenkins book, we will also provide recipes that support maintenance cycles. For large-scale deployments of Jenkins within diverse enterprise infrastructure, proper maintenance of Jenkins is crucial to planning predictable software cycles. Proper maintenance lowers the risk of the following failures:

  • Storage overflowing with artifacts: If you keep a build history that includes artifacts such as WAR files, large sets of JAR files, or other types of binaries, then your storage space is consumed at a surprising rate. Storage costs have decreased tremendously, but storage usage equates to longer backup times and more communication from slave to master. To minimize the risk of disk overflow, you will need to consider your backup and restore policy and the associated build retention policy expressed in the advanced options of jobs.
  • Script spaghetti: As jobs are written by various development teams, the location and style of the included scripts vary. This makes it difficult for you to keep track. Consider using well-defined locations for your scripts and a scripts repository managed through a plugin.
  • Resource depletion: As memory is consumed or the number of intense jobs increases, Jenkins slows down. Proper monitoring and quick reaction reduce the impact.
  • A general lack of consistency between Jobs due to organic growth: Jenkins is easy to install and use. The ability to seamlessly turn on plugins is addictive. The pace of adoption of Jenkins within an organization can be breathtaking. Without a consistent policy, your teams will introduce lots of plugins and also lots of ways of performing the same work. Conventions improve consistency and readability of Jobs and thus decrease maintenance.
  • New plugins causing exceptions: There are a lot of good plugins being written with rapid version change. In this situation, it is easy for you to accidentally add new versions of plugins with new defects. There have been a number of times when a plugin suddenly stopped working after an upgrade. To combat the risk of plugin exceptions, consider using a sacrificial Jenkins instance before releasing to a critical system.

Jack of all trades

Jenkins has many plugins that allow it to integrate easily into complex and diverse environments. If there is a need that is not directly supported, you can always use a scripting language of your choice and wire that into your jobs. In this section, we'll explore the R plugin and see how it can help you generate great graphics.

R is a popular programming language for statistics (http://en.wikipedia.org/wiki/R_programming_language). It has many hundreds of extensions and has a powerful set of graphical capabilities. In this recipe, we will show you how to use the graphical capabilities of R within your Jenkins Jobs and then point you to some excellent starter resources.

For a full list of plugins that improve the UI of Jenkins including Jenkins' graphical capabilities, visit https://wiki.jenkins-ci.org/display/JENKINS/Plugins#Plugins-UIplugins.

Getting ready

Install the R plugin (https://wiki.jenkins-ci.org/display/JENKINS/R+Plugin). Review the R installation documentation (http://cran.r-project.org/doc/manuals/r-release/R-admin.html).

How to do it...

  1. From the command line, install the R language:
    sudo apt-get install r-base
  2. Review the available R packages:
    apt-cache search r-cran | less
  3. Create a free-style job with the name ch4.powerfull.visualizations.
  4. In the Build section, under Add build step, select Execute R script.
  5. In the Script text area, add the following lines of code:
    paste('=======================================');
    paste('WORKSPACE: ', Sys.getenv('WORKSPACE'))
    paste('BUILD_URL: ', Sys.getenv('BUILD_URL'))
    print('ls /var/lib/jenkins/jobs/R-ME/builds/')
    paste('BUILD_NUMBER: ', Sys.getenv('BUILD_NUMBER'))
    paste('JOB_NAME: ', Sys.getenv('JOB_NAME'))
    paste('JENKINS_HOME: ', Sys.getenv('JENKINS_HOME'))
    paste( 'JOB LOCATION: ', Sys.getenv('JENKINS_HOME'),'/jobs/',
    Sys.getenv('JOB_NAME'),'/builds/', Sys.getenv('BUILD_NUMBER'),"/test.pdf",sep="")
    paste('=======================================');

    filename<-paste('pie_',Sys.getenv('BUILD_NUMBER'),'.pdf',sep="")
    pdf(file=filename)
    slices<- c(1,2,3,3,6,2,2)
    labels <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday","Saturday","Sunday")
    pie(slices, labels = labels, main="Number of failed jobs for each day of the week")

    filename<-paste('freq_',Sys.getenv('BUILD_NUMBER'),'.pdf',sep="")
    pdf(file=filename)
    Number_OF_LINES_OF_ACTIVE_CODE=rnorm(10000, mean=200, sd=50)
    hist(Number_OF_LINES_OF_ACTIVE_CODE,main="Frequency plot of Class Sizes")

    filename<-paste('scatter_',Sys.getenv('BUILD_NUMBER'),'.pdf',sep="")
    pdf(file=filename)
    Y <- rnorm(3000)
    plot(Y,main='Random Data within a normal distribution')
  6. Click on the Save button.
  7. Click on the Build Now icon.
  8. Beneath Build History, click on the Workspace button.
  9. Review the generated graphics by clicking on the freq_1.pdf, pie_1.pdf, and scatter_1.pdf links, as shown in the following screenshot:


The following screenshot is a histogram of the values from the random data generated by the R script during the build process. The data simulates class sizes within a large project:


Another view is a pie chart. The fake data represents the number of failed jobs for each day of the week. If you make this plot against your own values, you might see particularly bad days, such as the day before or after the weekend. This might have implications for how developers work or how motivation is distributed through the week.


Perform the following steps:

  1. Run the job and review the Workspace.
  2. Click on Console Output. You will see output similar to:
    Started by user anonymous
    Building in workspace /var/lib/jenkins/workspace/ch4.Powerfull.Visualizations
    [ch4.Powerfull.Visualizations] $ Rscript /tmp/hudson6203634518082768146.R
    [1] "======================================="
    [1] "WORKSPACE: /var/lib/jenkins/workspace/ch4.Powerfull.Visualizations"
    [1] "BUILD_URL: "
    [1] "ls /var/lib/jenkins/jobs/R-ME/builds/"
    [1] "BUILD_NUMBER: 9"
    [1] "JOB_NAME: ch4.Powerfull.Visualizations"
    [1] "JENKINS_HOME: /var/lib/jenkins"
    [1] "JOB LOCATION: /var/lib/jenkins/jobs/ch4.Powerfull.Visualizations/builds/9/test.pdf"
    [1] "======================================="
    Finished: SUCCESS
  3. Click on Back to Project.
  4. Click on Workspace.

How it works...

With a few lines of R code, you have generated three different well-presented PDF graphs.

The R plugin ran a script as part of the build. The script printed out the WORKSPACE and other Jenkins environment variables to the console:

paste ('WORKSPACE: ', Sys.getenv('WORKSPACE'))

Next, a filename is set with the build number appended to the pie_ string. This allows the script to generate a different filename each time it is run, as shown:

filename <-paste('pie_',Sys.getenv('BUILD_NUMBER'),'.pdf',sep="")
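
When the same script is run outside Jenkins, BUILD_NUMBER is not defined and the filename collapses to pie_.pdf. A minimal sketch of one way to guard against that, assuming a hypothetical helper named build_filename and a fallback label of "local" (neither is part of the recipe):

```r
# Hypothetical helper, not part of the recipe: builds a per-build filename.
# Sys.getenv() returns the value of `unset` when the variable is not defined.
build_filename <- function(prefix, ext = "pdf") {
  build <- Sys.getenv("BUILD_NUMBER", unset = "local")
  paste(prefix, "_", build, ".", ext, sep = "")
}

filename <- build_filename("pie")
print(filename)
```

Inside a Jenkins build the helper produces the same names as the recipe, for example pie_9.pdf for build 9.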

The script now opens output to the location defined in the filename variable through the pdf(file=filename) command. By default, the output directory is the job's workspace.
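
The recipe never closes the PDF devices explicitly and leaves that to the end of the R session. A sketch of a more explicit variant that closes the device with dev.off() once the plot is drawn; the file name example_plot.pdf is an assumption for illustration, and the file is written to R's temporary directory so that the sketch can be run anywhere:

```r
# Assumed file name for illustration; the recipe derives it from BUILD_NUMBER
# and writes into the job's workspace instead of the temporary directory.
filename <- file.path(tempdir(), "example_plot.pdf")

pdf(file = filename)               # open a PDF graphics device
plot(1:10, main = "Example plot")  # draw into the open device
dev.off()                          # close the device so the file is complete

print(file.exists(filename))
```

Closing the device straight after each plot also makes it clear which plotting commands belong to which file.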

Next, we define fake data for the graph, representing the number of failed jobs on any given day of the week. Note that in the simulated world, Friday is a bad day:

slices <- c(1,2,3,3,6,2,2)
labels <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday","Saturday","Sunday")

We then plot the pie chart, as follows:

pie(slices, labels = labels, main="Number of failed jobs for each day of the week")
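
A pie chart shows proportions but not the values behind them. As a small, optional extension, the sketch below adds each day's share of the failures to its label. The variable names percent and labels_pct and the output file name are assumptions for illustration:

```r
slices <- c(1, 2, 3, 3, 6, 2, 2)
labels <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday",
            "Saturday", "Sunday")

# Share of all failures per day, rounded to a whole percentage.
percent <- round(100 * slices / sum(slices))
labels_pct <- paste(labels, " ", percent, "%", sep = "")

# Written to the temporary directory here; a Jenkins job would use the workspace.
pdf(file = file.path(tempdir(), "pie_percent.pdf"))
pie(slices, labels = labels_pct,
    main = "Number of failed jobs for each day of the week")
dev.off()
```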

For the second graph, we generated 10,000 pieces of random data within a normal distribution.
The fake data represents the number of lines of active code that ran for a given job:

Number_OF_LINES_OF_ACTIVE_CODE=rnorm(10000, mean=200, sd=50)

The hist command generates a frequency plot:

hist(Number_OF_LINES_OF_ACTIVE_CODE,main="Frequency plot of Class Sizes")
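
Because rnorm draws new random numbers on every run, each build produces a slightly different histogram. If repeatable output is wanted while experimenting, a seed can be set first. The following sketch assumes an arbitrary seed of 42 and uses plot = FALSE to inspect the bins without drawing them:

```r
set.seed(42)  # arbitrary seed, chosen only so that runs are repeatable
Number_OF_LINES_OF_ACTIVE_CODE <- rnorm(10000, mean = 200, sd = 50)

# Summary statistics of the simulated data.
print(summary(Number_OF_LINES_OF_ACTIVE_CODE))

# plot = FALSE returns the histogram object (breaks, counts) without plotting.
h <- hist(Number_OF_LINES_OF_ACTIVE_CODE, plot = FALSE)
print(sum(h$counts))  # every one of the 10,000 values falls into a bin
```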

The third graph is a scatter plot with 3,000 data points generated at random within a normal distribution.
This represents a typical sampling process, such as the number of potential defects found using Sonar or FindBugs:

Y <- rnorm(3000)
plot(Y,main='Random Data within a normal distribution')

We will leave it as an exercise for the reader to link real data to the graphing capabilities of R.
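
One possible starting point for that exercise is to read the data from a CSV file and count the failures per weekday. The file name failed_builds.csv, its columns build and day, and its contents are all hypothetical. The sketch writes a small sample file first so that it is self-contained:

```r
# Hypothetical input: one row per failed build with the weekday it failed on.
csv_path <- file.path(tempdir(), "failed_builds.csv")
writeLines(c("build,day",
             "101,Monday",
             "102,Friday",
             "103,Friday",
             "104,Tuesday",
             "105,Friday"), csv_path)

builds <- read.csv(csv_path, stringsAsFactors = FALSE)

# table() counts how often each weekday occurs in the day column.
failures <- table(builds$day)
print(failures)

pdf(file = file.path(tempdir(), "pie_real.pdf"))
pie(as.vector(failures), labels = names(failures),
    main = "Number of failed jobs for each day of the week")
dev.off()
```

How the CSV file gets produced, for example by an earlier build step, is left open here.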

There's more...

Here are a couple more points for you to think about.

RStudio or StatET

A popular IDE for R is RStudio (http://www.rstudio.com/). The open source edition is free. The feature set includes a source code editor with code completion and syntax highlighting, integrated help, solid debugging features and a slew of other features.


An alternative for the Eclipse environment is the StatET plugin (http://www.walware.de/goto/statet).

Quickly getting help

The first place to start to learn R is by typing help.start() from the R console. The command launches a browser with an overview of the main documentation.

If you want descriptions of R commands then typing ? before a command generates detailed help documentation. For example, in the recipe we looked at the use of the rnorm command. Typing ?rnorm produces documentation similar to:

The Normal Distribution
Description
Density, distribution function, quantile function and random generation
for the normal distribution with mean equal to mean and standard deviation equal to sd.
Usage
dnorm(x, mean = 0, sd = 1, log = FALSE)
pnorm(q, mean = 0, sd = 1, lower.tail = TRUE, log.p = FALSE)
qnorm(p, mean = 0, sd = 1, lower.tail = TRUE, log.p = FALSE)
rnorm(n, mean = 0, sd = 1)
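
The four functions listed in the usage section share the same mean and sd parameters. A brief sketch of how they relate, using the mean of 200 and standard deviation of 50 from the recipe; the variable names are chosen for illustration:

```r
x <- rnorm(5, mean = 200, sd = 50)    # five random draws
d <- dnorm(200, mean = 200, sd = 50)  # density at the mean
p <- pnorm(200, mean = 200, sd = 50)  # probability of a value <= 200 (0.5)
q <- qnorm(0.5, mean = 200, sd = 50)  # value with half the mass below it (200)

print(x)
print(c(density = d, probability = p, quantile = q))
```

Note that pnorm and qnorm are inverses of each other: the quantile for a probability of 0.5 is the mean itself.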

The Community

Jenkins is not just a CI server; it is also a vibrant and highly active community. Enlightened self-interest dictates participation. There are a number of ways to do this:

Final Comments

An efficient approach to learning how to use Jenkins effectively is to download and install the server and then try out recipes you find in books, on the Internet, or developed by your fellow developers. I wish you good fortune and an interesting learning experience.
