ROSE Compiler Framework/Continuous Integration

Motivation
Without automated continuous integration, we had frequent incidents like:
 * Developer A commits something to the master branch of our central git repository. The commits contain bugs that break the build and take a long time to fix. The central master branch is then left in a broken state for weeks, so nobody can check anything out or in.
 * Developer B does a lot of wonderful work offline for months, but her work is later found to be incompatible with another developer's work and has unresolvable merge conflicts.

Overview
The ROSE project uses a workflow that automates the central principles of continuous integration in order to make integrating the work of different developers a non-event. Because the integration process merges into ROSE only the changes that pass all tests, we encourage all developers to stay in sync with the latest version.

A high-level overview of the development model used by ROSE developers:
 * Step 1: Taking advantage of the distributed source code repositories based on git, each developer first clones their own repository from our central git repository (or its mirrors/clones/forks).
 * Step 2: A feature or a bugfix can then be developed in isolation within the private repository. The developer can create any number of private branches. Each branch should relate to a feature that this developer is working on and be relatively short-lived. The developer can commit changes to the private repository without maintaining an active connection to the shared repository.
 * Step 3: When work is finished and locally tested (make, make check, and make distcheck -j#n), the developer pulls the latest commits from the central repo's master branch.
 * Step 4: The developer then pushes all accumulated commits in the private repository to their branch within the shared repository. We create a dedicated branch within the central repository for each developer and establish access control on the branch so that only an authorized developer can push commits to a particular branch of the shared repository.
 * Step 5-6 (automated): Commits from a developer's private repository are not immediately merged into the master branch of the shared repository.

In fact, we have access control to prevent any developer from pushing commits to the master branch within the shared repository. A continuous integration server called Jenkins actively monitors each developer's branch within the central repository and initiates comprehensive commit tests on the branch once new commits are detected. Finally, Jenkins merges the new commits into the master branch of the central repository if all tests pass. If a single test fails, Jenkins reports the error, and the responsible developer should address the error in their private repository and push improved commits again.
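The steps above can be sketched end to end with throwaway local repositories. This is only an illustration under stated assumptions: the branch names alice-rc and feature-x, the file file.txt, and the run_tests function (a stand-in for make check) are hypothetical, and the "merge" is simplified to a fast-forward push, which may differ from what the real Jenkins jobs do.

```shell
#!/bin/sh
# Sketch of Steps 1-6 with throwaway local repositories (no network needed).
# Hypothetical names: alice-rc, feature-x, run_tests, file.txt.
set -eu

# Fixed identity so commits work without any global git configuration.
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com

work="$(mktemp -d)"
central="$work/central.git"

# Stand-in for the central repository: one commit on master.
git init -q "$work/seed"
git -C "$work/seed" symbolic-ref HEAD refs/heads/master
echo "v1" > "$work/seed/file.txt"
git -C "$work/seed" add file.txt
git -C "$work/seed" commit -q -m "initial commit"
git clone -q --bare "$work/seed" "$central"

# Stand-in for "make check": passes only if the feature is present.
run_tests() { grep -q "feature" "$1/file.txt"; }

# Step 1: clone a private repository.
git clone -q "$central" "$work/dev"
# Step 2: develop on a short-lived private branch.
git -C "$work/dev" checkout -q -b feature-x
echo "feature" >> "$work/dev/file.txt"
git -C "$work/dev" commit -q -am "add feature"
# Step 3: test locally, then pull the latest central master.
run_tests "$work/dev"
git -C "$work/dev" pull -q --no-rebase origin master
# Step 4: push to the developer's own branch, never to master.
git -C "$work/dev" push -q origin HEAD:refs/heads/alice-rc

# Steps 5-6: the CI server tests the branch and merges only on success.
git clone -q --branch alice-rc "$central" "$work/ci"
if run_tests "$work/ci"; then
  git -C "$work/ci" push -q origin alice-rc:refs/heads/master
  result=merged
else
  result=rejected
fi

master_tip="$(git --git-dir="$central" rev-parse refs/heads/master)"
rc_tip="$(git --git-dir="$central" rev-parse refs/heads/alice-rc)"
echo "integration result: $result"
rm -rf "$work"
```

In this sketch the developer never touches master directly; only the CI stand-in advances it, and only after the tests pass.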

As a result, the master branch of the central git repository is mostly stable and is a good candidate for our external releases. On top of the master branch, we run more comprehensive release tests in Jenkins. If all the release tests pass, an external release based on the master branch is made publicly available.

Tests on Jenkins
We use Jenkins ( http://hudson-rose-30:8080/ ) to test commits added to developers' release candidate branches at the central git repository.

The tests are organized into three categories:
 * Integration: tests used to check whether the new commits pass various "make check" rules, compatibility tests, portability tests, configuration tests, and so on. If all tests pass, the commits are merged (or integrated) into the master branch of the central repository.
 * Release: an additional set of tests, using external benchmarks, run against the updated master branch of the central repository. If all tests pass, the head of master is released as a stable snapshot for public file package releases (generated by "make dist").
 * Others: currently for informational purposes only; not used in our production workflow.

So each push (one or more commits to a -rc branch) goes through two stages: the Integration test stage and the Release test stage.

It is each developer's responsibility to make sure their commits pass BOTH stages by fixing any bugs discovered by the tests.
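The ordering of the two stages can be sketched as a small gate. The stage functions below are hypothetical placeholders (the real stages run many Jenkins jobs); the point is only that the Release stage runs on commits that the Integration stage has already merged into master.

```shell
#!/bin/sh
# Sketch: a push is released only if it clears both stages in order.
# The stage bodies are placeholders keyed on made-up push names.

integration_stage() { [ "$1" != "broken-build" ]; }     # stands in for "make check" rules etc.
release_stage()     { [ "$1" != "fails-benchmark" ]; }  # stands in for external benchmarks

# Print the furthest state a push reaches.
pipeline() {
  if ! integration_stage "$1"; then
    echo "rejected: not merged into master"
  elif ! release_stage "$1"; then
    echo "merged into master, not released"
  else
    echo "merged into master and released"
  fi
}

pipeline "good-push"
pipeline "broken-build"
pipeline "fails-benchmark"
```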

Installed Software Packages
Here we list software packages installed and used by Jenkins
 * Yices: /export/tmp.hudson-rose/opt/yices/1.0.34

Development Jenkins
Several configurations:

    GCC_VERSION=4.4.7 BOOST_VERSION=1_47_0 source /nfs/casc/overture/ROSE/opt/rhel6/x86_64/rose_environment.sh __rose__JAVA_VERSION_HOME=/nfs/casc/overture/ROSE/opt/rhel6/x86_64/java/jdk/1.7.0_51

    GCC_VERSION=4.8.1 BOOST_VERSION=1_50_0 source /nfs/casc/overture/ROSE/opt/rhel6/x86_64/rose_environment.sh __rose__JAVA_VERSION_HOME=/nfs/casc/overture/ROSE/opt/rhel6/x86_64/java/jdk/1.7.0_51

Check Testing Results
It is possible to manually track how your commits are doing in the test pipeline within Jenkins ( http://hudson-rose-30:8080/ ), but this can be tedious and overwhelming.

So we provide a dashboard ( http://sealavender:4000/ ) that summarizes the commits to your release candidate branch (-rc) and the pass/fail status of each integration test.

Note: It's possible that all of your testing jobs (finally) pass, but the actual integration is not performed. This typically occurs when one of your jobs has a system failure, for instance, and has to be manually restarted. If you see that all of your jobs have passed but your work has not been integrated, please let the Jenkins administrator know.

Frequently Failed Jobs
See details at ROSE Compiler Framework/Jenkins Failures

Connection to Code Review


In practice, most LLNL developers are now asked to push to GitHub Enterprise for code review first, instead of pushing directly to our central git repository. The synchronization between the GitHub Enterprise code review repositories and our central git repo is automated.

Auto Pull
Auto pull: we have another Jenkins instance ( https://hudson-rose-30:8443/jenkins/ ) which serves as the bridge between GitHub Enterprise and our main production Jenkins.
 * For each private repository on GitHub Enterprise, we have a Jenkins job that monitors the master branch for approved pull (merge) requests. If there are any newly approved commits, the job transfers them to that developer's -reviewed-rc branch in the central repository.

Configuration of the auto pull job:
 * Source code management
 * git: git@github.llnl.gov:account_name/rose.git
 * Branches to build: github/master
 * Build Trigger: Poll SCM, schedule "* * * * *"
 * Execution shell:

    # 1) Add /nfs as remote
    #    `|| true`: don't error if remote exists
    git remote add nfs /nfs/casc/overture/ROSE/git/ROSE.git || true
    git fetch nfs

    # 2) Push to /nfs *-rc
    if [ -n "$(git log --oneline nfs/master..github/master)" ]; then
      git push --force nfs "$GIT_BRANCH":refs/heads/oun-reviewed-rc
    fi
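The condition in the auto pull script relies on the two-dot range A..B, which lists commits reachable from B but not from A; an empty result means there is nothing new to transfer. A minimal sketch of that check in a throwaway repository follows. The branch names central and reviewed are hypothetical stand-ins for nfs/master and github/master, and has_new_commits is an illustrative helper rather than part of the job.

```shell
#!/bin/sh
# Sketch of the "anything new to transfer?" check with a throwaway repository.
# "central" and "reviewed" are hypothetical stand-ins for nfs/master and github/master.
set -eu
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com

repo="$(mktemp -d)"
git init -q "$repo"
git -C "$repo" symbolic-ref HEAD refs/heads/central
git -C "$repo" commit -q --allow-empty -m "base"
git -C "$repo" branch reviewed

# Print "yes" when ref $2 has commits that ref $1 lacks, otherwise "no".
has_new_commits() {
  if [ -n "$(git -C "$repo" log --oneline "$1..$2")" ]; then
    echo yes
  else
    echo no
  fi
}

before="$(has_new_commits central reviewed)"   # same tip: nothing to push
git -C "$repo" checkout -q reviewed
git -C "$repo" commit -q --allow-empty -m "approved change"
after="$(has_new_commits central reviewed)"    # reviewed is now ahead

echo "before: $before, after: $after"
rm -rf "$repo"
```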

Auto Push
Auto push: a Jenkins job is responsible for propagating the latest contents of the central master branch to all private repositories on github.llnl.gov.
 * http://hudson-rose-30:8080/job/Commit-sync-github

The job configuration:
 * Source code management:
 * Git: /nfs/casc/overture/ROSE/git/ROSE.git
 * Branches to build: */master
 * Build Trigger: Build after other projects are built: Commit
 * Execute shell:

    USERS="\
    user1 \
    user2 \
    "

    for user in $USERS; do
      tmpfile="$(mktemp)"
      # Capture stderr of the push; never abort the loop on a failed push.
      ( git push git@github.llnl.gov:"$user"/rose.git origin/master:refs/heads/master 2>"$tmpfile" ) || true
      set +e
      cat "$tmpfile"
      if grep -q "non-fast.*forward" "$tmpfile"; then
        echo "Sending error email to [${user}@llnl.gov] because their github/master is non-fast-forwardable"
        # email details are omitted here.
      fi
    done
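The failure detection in the auto push job is a plain text match on the captured stderr of git push. The sketch below isolates that match so it can be tried without pushing anywhere. The two sample messages are illustrative strings written for this example, not output captured from the real job, and is_non_fast_forward is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: classify captured "git push" stderr the way the job's grep does.
# The sample messages are illustrative, not captured from the real job.

is_non_fast_forward() {
  printf '%s\n' "$1" | grep -q "non-fast.*forward"
}

rejected=' ! [rejected]        master -> master (non-fast-forward)'
accepted='   1a2b3c4..5d6e7f8  master -> master'

for msg in "$rejected" "$accepted"; do
  if is_non_fast_forward "$msg"; then
    echo "notify owner"
  else
    echo "no action"
  fi
done
```

The pattern "non-fast.*forward" matches both the hyphenated and the space-separated spelling, which makes the check tolerant of small wording differences. Rejections that git reports with different wording would not be matched.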

Reproduce Jenkins Job failures
Several key elements are needed:
 * the Jenkins script repository
 * the right version of ROSE
 * the hardware machine
 * the environment

Assume one failed job is https://hudson-rose-44.llnl.gov:9443/jenkins/job/development-compile-with-autotools-default-EDG-RHEL6/892/gcc=4.4.7,label=RHEL6,rhel=6/parameters/

Steps:
 * First, find the node used for the failed Jenkins job from its log.
 * Identify the build parameters:
 * clone_url: rose-dev@rosecompiler1.llnl.gov:rose/scratch/rose
 * commit: 83abd459eee1b575b4e7fab04a9f1dfc4955f02a
 * From the job configuration's Build section, find the script used ( https://hudson-rose-44.llnl.gov:9443/jenkins/job/development-compile-with-autotools-default-EDG-RHEL6/configure ):

    #!/bin/bash -ex

    export GCC_VERSION="$gcc"

    rm -rf ./jenkins-build-scripts/ || exit 1
    git clone rose-dev@rosecompiler1.llnl.gov:jenkins/dev/jenkins-build-scripts.git || exit 1
    source ./jenkins-build-scripts/config/env-Linux.sh $gcc $rhel || exit 1
    ./jenkins-build-scripts/development/development-compile-with-autotools-default-EDG-RHEL6.sh $gcc || exit 1
 * the same configuration page has a configuration matrix with all user defined parameters and values, including gcc versions, OS versions, and others.
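The matrix values for a failed run also appear in the job URL itself (gcc=4.4.7,label=RHEL6,rhel=6 in the example above). As a convenience, they can be pulled out with standard text tools. This is a sketch only: matrix_value is a hypothetical helper and not part of the Jenkins build scripts.

```shell
#!/bin/sh
# Sketch: pull the matrix parameters out of a Jenkins matrix-job URL.
# matrix_value is a hypothetical helper, not part of the Jenkins scripts.

url='https://hudson-rose-44.llnl.gov:9443/jenkins/job/development-compile-with-autotools-default-EDG-RHEL6/892/gcc=4.4.7,label=RHEL6,rhel=6/parameters/'

# Print the value of key $1 from the "k=v,k=v" path segment of URL $2.
matrix_value() {
  printf '%s\n' "$2" | tr '/' '\n' | grep '=' | tr ',' '\n' | sed -n "s/^$1=//p"
}

gcc="$(matrix_value gcc "$url")"
rhel="$(matrix_value rhel "$url")"
echo "gcc=$gcc rhel=$rhel"
```

The extracted values are the ones to pass to env-Linux.sh and to the job script when re-running the build by hand.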

Now manually check out the commit and run the script, passing the right gcc version and rhel version:
 * ../jenkins-build-scripts/config/env-Linux.sh "4.4.7" 6
 * git clone rose-dev@rosecompiler1.llnl.gov:rose/scratch/rose sourcetree
 * cd sourcetree/
 * git checkout -b autopar 83abd459eee1b575b4e7fab04a9f1dfc4955f02a
 * git submodule init
 * git submodule update
 * ../jenkins-build-scripts/development/development-compile-with-autotools-default-EDG-RHEL6.sh 4.4.7

TODO
High priority
 * Add a pre-screening job before manual code review kicks in. The pre-screening job can make sure the code to be reviewed compiles with minimal warning messages and includes the required make check rules to run tests.
 * Enable email notification for the final results of each test.
 * Incrementally add more compilation tests using external benchmarks as integration tests.
 * Initial two jobs: SPEC CPU benchmark + NPB Fortran benchmarks
 * Better integration with GitHub Enterprise
 * Avoid the Auto Push failure due to pending commits on a private repo's master branch.
 * Look into how others are doing this GitHub + Jenkins integration
 * http://www.foraker.com/hudson-github-hooks/
 * https://wiki.jenkins-ci.org/display/JENKINS/Github+Plugin

Third-party software installed for testing in Jenkins:
 * Yices (http://yices.csl.sri.com/)
 * Download Yices 1; the latest version is preferred.
 * Untar the Yices tarball. The resulting directory, named something like yices-1.0.34, is YICES_INSTALL.
 * Pass --with-yices=YICES_INSTALL as an option to ROSE/configure.
 * Add YICES_INSTALL/lib to LD_LIBRARY_PATH on Linux or DYLD_LIBRARY_PATH on Mac, just as you would add Boost/lib to LD_LIBRARY_PATH.
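The Yices steps above can be sketched as a short shell fragment. The default install location used here is a hypothetical example; substitute the directory where you untarred Yices. The fragment only prepares the library path and prints the configure option; it does not run ROSE/configure.

```shell
#!/bin/sh
# Sketch: expose the Yices libraries and build the configure flag.
# The default YICES_INSTALL below is a hypothetical location.
YICES_INSTALL="${YICES_INSTALL:-$HOME/opt/yices-1.0.34}"

case "$(uname -s)" in
  Darwin)
    # Mac: the dynamic loader reads DYLD_LIBRARY_PATH.
    DYLD_LIBRARY_PATH="$YICES_INSTALL/lib${DYLD_LIBRARY_PATH:+:$DYLD_LIBRARY_PATH}"
    export DYLD_LIBRARY_PATH
    ;;
  *)
    # Linux and others: the dynamic loader reads LD_LIBRARY_PATH.
    LD_LIBRARY_PATH="$YICES_INSTALL/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export LD_LIBRARY_PATH
    ;;
esac

configure_flag="--with-yices=$YICES_INSTALL"
echo "pass to ROSE/configure: $configure_flag"
```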