Knapsack with Jenkins Pipeline

If you don't pay close attention to the performance of every single new automated test, the suite's build time quickly becomes unbearable. Even if test speed is a concern for all team members, once the app reaches a certain size, long waits are inevitable.
Implementing continuous integration best practices helps minimise feedback loops, so getting the tests right is a crucial ingredient of a software delivery process.

It's important to have reliable and fast builds on a CI server. Jenkins supports parallel execution of tests for the Java world with splitTests.
However, Jenkins does not provide good support for Ruby out of the box, and the same goes for parallelisation.
The Ruby community is wonderful at providing developers with great productivity tools. parallel_tests allows you to run a test suite in parallel (duh!) and it's great for development. It could be used on CI as well, but given the design of executors and slaves in Jenkins, running multiple processes per executor might throttle other jobs and decrease the overall performance of a worker node.
Here comes Knapsack to the rescue.

Knapsack splits tests into multiple chunks and runs each chunk as a separate process. Since the division is deterministic, the chunks do not have to run at the same time or on the same machine. On top of that, Knapsack tries to do a better job of splitting the tests than other solutions by solving the knapsack problem.

Jenkins Pipeline

Jenkins Pipeline is a DSL on top of Groovy which allows you to define a build flow and treat the build definition as an integral part of a project. I assume you have a basic understanding of CI/Jenkins and the Pipeline plugin. If you've never heard of Jenkins Pipeline, please refer to the documentation. Reading the Pipeline chapter of the handbook is more than enough to understand the examples below.

I'd like to highlight a couple of features which will be useful for our examples.

It's possible to use Groovy to implement custom DSL steps and provide them as a shared library.
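
For example, a custom step can live in the vars/ directory of a shared library. This is only a minimal sketch; the step name bundleExec and its behaviour are purely illustrative:

// vars/bundleExec.groovy (illustrative step name)
// usable in a Jenkinsfile as: bundleExec 'rake spec'
def call(String command) {
    sh "bundle exec ${command}"
}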

Assuming our current build looks like this:

node {  
    checkout scm
    sh 'bundle install --frozen --deployment'
    sh 'bundle exec rake spec'
}

let's parallelise it!

Run in parallel

To run a task in parallel using Jenkins Pipeline:

parallel(  
    'spec1': { node {...} },
    'spec2': { node {...} },
    'rubocop': { node {...} }
)

So by setting the Knapsack environment variables and using the Knapsack rake task, tests can be run in parallel like this:

parallel(  
    'node0': {
        node {
            checkout scm
            sh 'bundle install --frozen --deployment'
            sh 'CI_NODE_INDEX=0 CI_NODE_TOTAL=2 bundle exec rake knapsack:rspec'
        }
    },
    'node1': {
        node {
            checkout scm
            sh 'bundle install --frozen --deployment'
            sh 'CI_NODE_INDEX=1 CI_NODE_TOTAL=2 bundle exec rake knapsack:rspec'
        }
    }
)

The build configuration above would allocate two Jenkins executors and simply execute all the shell commands specified within each node block.
This works and might fulfil basic requirements, but there is a lot of repetition in the code, so we can do better!

Define knapsack DSL function

First, let's think about what we want our DSL to look like.
The requirements are:

  • full control over the node step (e.g. adding custom labels)
  • no repetition of the environment variable exports
  • configurable number of nodes

Jenkins Pipeline's parallel accepts a map from labels to build instructions as an argument. So the desired DSL might look like this:

parallel(  
    knapsack(2) {
        node {
            checkout scm
            sh 'bundle install --frozen --deployment'
            sh 'bundle exec rake knapsack:rspec'
        }
    }
)

knapsack accepts two arguments: the number of parallel nodes and a closure to execute.

def knapsack(Integer ci_node_total, Closure cl) {  
    cl()
}

Let's provide an implementation which builds the map:

def knapsack(Integer ci_node_total, Closure cl) {  
    // Create a map which has `ci_node_total` values
    (0..(ci_node_total - 1)).inject([:]) { nodes, index ->
        nodes["ci_node_${index}"] = {
            withEnv(["CI_NODE_INDEX=$index", "CI_NODE_TOTAL=$ci_node_total"]) {
                cl()
            }
        }
        nodes
    }
}
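
For example, knapsack(2) { ... } produces a map roughly equivalent to this hand-expanded version:

[
    'ci_node_0': { withEnv(['CI_NODE_INDEX=0', 'CI_NODE_TOTAL=2']) { cl() } },
    'ci_node_1': { withEnv(['CI_NODE_INDEX=1', 'CI_NODE_TOTAL=2']) { cl() } }
]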

withEnv can be used both inside and outside the node step. The environment variables defined there are accessible through the env map and are passed to all shell commands invoked by the sh step.
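
As a quick standalone illustration (separate from the knapsack helper; GREETING is just a made-up variable):

node {
    withEnv(['GREETING=hello']) {
        echo "from the env map: ${env.GREETING}"  // read through the env map
        sh 'echo "from the shell: $GREETING"'     // exported to the sh step
    }
}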

The knapsack implementation above should work, but because of Jenkins Pipeline DSL magic and the serialisation of this object we get the following error (kudos to the Jenkins team for significantly improving the error message; it took me a while to figure it out in the past):

java.lang.UnsupportedOperationException: Calling public static java.lang.Object org.codehaus.groovy.runtime.DefaultGroovyMethods.inject(java.util.Collection,java.lang.Object,groovy.lang.Closure) on a CPS-transformed closure is not yet supported (JENKINS-26481); encapsulate in a @NonCPS method, or use Java-style loops  

Let's follow the exception's advice.

Just add @NonCPS before the function definition:

@NonCPS
def knapsack(Integer ci_node_total, Closure cl) {  
...

Or implement an imperative, Java-style solution:

def knapsack(Integer ci_node_total, Closure cl) {  
    def nodes = [:]

    for(int i = 0; i < ci_node_total; i++) {
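        // copy the loop counter into a local variable so each closure captures its own index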
        def index = i;
        nodes["ci_node_${i}"] = {
            withEnv(["CI_NODE_INDEX=$index", "CI_NODE_TOTAL=$ci_node_total"]) {
                cl()
            }
        }
    }

    return nodes;
}

Now we're ready to wrap our job definition with the new DSL directive.
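
A sketch of how the pieces might fit together in a single Jenkinsfile (the helper could equally be exposed as a shared-library step, as mentioned earlier):

// Jenkinsfile (sketch)
def knapsack(Integer ci_node_total, Closure cl) {
    // ...one of the implementations from above...
}

parallel(
    knapsack(2) {
        node {
            checkout scm
            sh 'bundle install --frozen --deployment'
            sh 'bundle exec rake knapsack:rspec'
        }
    }
)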

Full working example

Check out the simple Ruby application to see a working example.

Testing locally with Docker

If you're familiar with Docker, it's possible to run all examples on a local Jenkins instance.

Clone the repository and run the example using Docker Compose:

git clone https://github.com/mknapik/jenkins-knapsack-docker  
cd jenkins-knapsack-docker  
docker-compose up --build  

Visit http://localhost:8080.
Explore the different build variants and check the implementation (switch branches to compare).

Tested with:

  • docker version 1.12.1 (Docker for Mac)
  • docker-compose version 1.8.0