My work with the Institute of Arctic Biology requires managing several fairly complicated software installations in a few different environments. Our target deployment environment is a cluster computer running CentOS and accessible to the whole team. As such, I do not have administrator privileges, and can't use the system package manager, yum, to easily install new packages. Our system administrator is super helpful, but sometimes I am simply experimenting and don't want to potentially disrupt all the other users by pestering the administrator to install something that breaks everyone else's tools.
In addition to the shared cluster computer, I end up working on my main OSX machine, and also on a development Virtual Machine I set up using Vagrant. Each environment has its tradeoffs:
- The shared computer won't allow me to install using the system package manager.
- While I can get most things to work on OSX, there are a few key tools (mainly PEST and gdb at this point) that I can't get to work.
- The Virtual Machine (running Fedora) provides an environment closer to the final deployment environment than OSX, but because of the overhead that comes with a virtual machine, I get annoyed running it all the time.
I like having an easy way to install tools and to track my install steps, because inevitably I have to repeat the process (or a variant of it) in order to maintain my machines, and to help other folks get set up.
Today's problem
I am trying to install PEST++, which is a newer version of PEST, but with at least parts of it written in C++ and with an improved Run Manager. I am hoping to use PEST++ and its Run Manager for running our model (dvmdostem) on our shared, multi-core computer (atlas). The installation instructions for PEST++ list gcc 5.2+ as a requirement for compiling. Darn. I got this to work (after quite a bit of headache) on OSX last week (installing a newer version of gcc using Homebrew). But then I remembered that I was doing all my PEST work on the Vagrant VM because I had never gotten the original PEST program to work. PEST++ is backwards compatible with PEST, so I am hoping to set up a basic PEST run and run it under PEST and PEST++ to make sure that both tools are functional. Then I'll be able to play with the run manager (YAMR - for "Yet Another Run Manager") on the VM, to see if I can get a parallel calibration-optimization to run. Finally, if so, I will need to get this same experiment set up and running using the scheduling manager (SLURM) on the shared computer, atlas. So while I did manage to get PEST++ compiled on OSX, I now need it on Fedora, and shortly after will need it on CentOS.
My Vagrant virtual machine runs Fedora 20, and for Fedora 20 the newest version of GCC in the repos is 4.8.3. I looked into a) updating the entire system to the more recent Fedora 23 (possible, looks kinda tedious and might take a few hours), b) downloading and creating an entirely new Vagrant "box" that is based off of the recent Fedora 23 (also possible, but would take a long time to download and install all the required packages. Plus then I'd have YAVMtM - "Yet Another VM to Manage"), and c) trying to build gcc-5.2+ in my home directory (also probably possible, but potentially quite tedious and difficult to repeat, and fraught with peril in trying to manage my custom gcc alongside the system gcc). And at the end of the day, I am going to run into similar problems trying to install PEST++ on atlas, so finding an easy repeatable solution is by far the best.
EasyBuild to the rescue
I stumbled across EasyBuild which might solve my installation and tool chain problems. Even better, EasyBuild seems to be specially designed for people working with scientific software on HPC systems. So off I go installing EasyBuild, and then promptly putting it to use trying to install GCC-5.2+.
Install EasyBuild
First try, no problems, simply copy-pasting directions from the EasyBuild bootstrap page. Wow, this seems like it is going to be good!
## pick an installation prefix to install EasyBuild to (change this to your liking)
EASYBUILD_PREFIX=$HOME/.local/easybuild
## download script
curl -O https://raw.githubusercontent.com/hpcugent/easybuild-framework/develop/easybuild/scripts/bootstrap_eb.py
## bootstrap EasyBuild
python bootstrap_eb.py $EASYBUILD_PREFIX
## update $MODULEPATH, and load the EasyBuild module
module use $EASYBUILD_PREFIX/modules/all
module load EasyBuild
The sanity check worked as expected:
$ module load EasyBuild
$ module list
Currently Loaded Modulefiles:
1) EasyBuild/2.8.2
$ eb --version
This is EasyBuild 2.8.2 (framework: 2.8.2, easyblocks: 2.8.2) on host localhost.localdomain.
I put the following in my ~/.bashrc so I don't have to think about it in the future:
## Easy build stuff
EASYBUILD_PREFIX=$HOME/.local/easybuild
## update $MODULEPATH, and load the EasyBuild module
module use $EASYBUILD_PREFIX/modules/all
module load EasyBuild
Using EasyBuild to get new GCC compilers
Well the install was easy, so let's see how it works. First I had to read the help a bit to find out that, wow, there is already a recipe for GCC-5.3. This looks like it is going to be awesome...
$ eb --help
$ eb --list-toolchains
$ eb --search gcc
$ eb --search gcc | grep GCC-5
Just to check it out, I try the --dry-run/-D option before running the command for real:
$ eb GCC-5.3.0.eb -D
== temporary log file in case of crash /tmp/eb-4c0Znt/easybuild-0ixe_x.log
Dry run: printing build status of easyconfigs and dependencies
CFGS=/home/vagrant/.local/easybuild/software/EasyBuild/2.8.2/lib/python2.7/site-packages/easybuild_easyconfigs-2.8.2-py2.7.egg/easybuild/easyconfigs
* [ ] $CFGS/m/M4/M4-1.4.17.eb (module: M4/1.4.17)
* [ ] $CFGS/g/GCC/GCC-5.3.0.eb (module: GCC/5.3.0)
== Temporary log file(s) /tmp/eb-4c0Znt/easybuild-0ixe_x.log* have been removed.
== Temporary directory /tmp/eb-4c0Znt has been removed.
I then changed the flag from -D to --robot for automatic dependency resolution, and EasyBuild easily built the first dependency (M4), but then failed while patching GCC itself, complaining that it could not find the patch program:
$ eb GCC-5.3.0.eb --robot
...
...
== COMPLETED: Installation ended successfully
== Results of the build can be found in the log file(s) /home/vagrant/.local/easybuild/software/M4/1.4.17/easybuild/easybuild-M4-1.4.17-20160718.154405.log
== processing EasyBuild easyconfig /home/vagrant/.local/easybuild/software/EasyBuild/2.8.2/lib/python2.7/site-packages/easybuild_easyconfigs-2.8.2-py2.7.egg/easybuild/easyconfigs/g/GCC/GCC-5.3.0.eb
== building and installing GCC/5.3.0...
== fetching files...
== creating build dir, resetting environment...
== unpacking...
== patching...
== FAILED: Installation ended unsuccessfully (build directory: /home/vagrant/.local/easybuild/build/GCC/5.3.0/dummy-): build failed (first 300 chars): cmd "patch -b -p1 -i /home/vagrant/.local/easybuild/software/EasyBuild/2.8.2/lib/python2.7/site-packages/easybuild_easyconfigs-2.8.2-py2.7.egg/easybuild/easyconfigs/g/GCC/mpfr-3.1.3-allpatches-20151029.patch" exited with exitcode 127 and output:
/bin/bash: patch: command not found
== Results of the build can be found in the log file(s) /tmp/eb-Ie1LgC/easybuild-GCC-5.3.0-20160718.154405.hMvfX.log
ERROR: Build of /home/vagrant/.local/easybuild/software/EasyBuild/2.8.2/lib/python2.7/site-packages/easybuild_easyconfigs-2.8.2-py2.7.egg/easybuild/easyconfigs/g/GCC/GCC-5.3.0.eb failed (err: 'build failed (first 300 chars): cmd "patch -b -p1 -i /home/vagrant/.local/easybuild/software/EasyBuild/2.8.2/lib/python2.7/site-packages/easybuild_easyconfigs-2.8.2-py2.7.egg/easybuild/easyconfigs/g/GCC/mpfr-3.1.3-allpatches-20151029.patch" exited with exitcode 127 and output:\n/bin/bash: patch: command not found\n')
Easily fixed with:
$ sudo yum install patch
And then I tried again:
$ eb GCC-5.3.0.eb --robot
...
And an hour or two later, success:
...
...
== permissions...
== packaging...
== COMPLETED: Installation ended successfully
== Results of the build can be found in the log file(s) /home/vagrant/.local/easybuild/software/GCC/5.3.0/easybuild/easybuild-GCC-5.3.0-20160718.171043.log
== Build succeeded for 1 out of 1
== Temporary log file(s) /tmp/eb-dtlcZS/easybuild-sOvAcy.log* have been removed.
== Temporary directory /tmp/eb-dtlcZS has been removed.
Wow, that was actually pretty easy, and now I have an alternate version of gcc. Now I need to figure out how to build stuff on top of it.
Compiling PEST++ using EasyBuild provided GCC-5.3
Even though PEST++'s build process is pretty basic, it took 2 days to figure out how to do the build under EasyBuild. Lots of trial and error, lots of reading the EasyBuild documentation, and even reaching out to ask for help on the EasyBuild IRC channel. While building PEST++ is really straightforward (the main Makefile is less than 75 lines long, and very readable), there are a few things that made it tricky (for me) to use within EasyBuild:
- There is no ./configure step.
- There is no make install step.
- The compiled executable gets moved to a special directory at the end of the Makefile.
- The user needs to hand-edit a few flags at the top of the file.
- The file is basic enough that different configurations are managed by simply commenting/uncommenting different sections of code.
There were several things about EasyBuild (eb) that took me a while to get my head wrapped around:
- An easyconfig is the higher level construct to an easyblock; ideally I won't need to create an easyblock.
- eb wants to download and manage any packages in the ~/.local/easybuild/software/ directory.
- eb doesn't support downloading from github.com (as far as I can tell).
- It is entirely possible (and even recommended, I suspect) to manage my own custom eb scripts outside of the .local/easybuild tree.
- eb wants to compile everything within ~/.local/easybuild/build and expects the make install step to copy the files elsewhere.
- An eb "module" (the result of a successful easybuild) is basically a script (bash?) that sets a bunch of environment variables needed to make your software run and link to the appropriate underlying software stack.
At this point, I had already tried compiling PEST++ before running into the problem of needing a newer version of GCC. I had cloned PEST++ from GitHub and had it living in a directory in my home folder. My initial attempts and many hours were wasted trying to get eb to work directly with this directory. It also took me quite a while to figure out that I could keep my own custom easyconfig and easyblock files (though I found that I didn't actually need to write my own easyblock) in my own directory, instead of including them deep within the .local/easybuild tree with the supplied configs and blocks.
The possible options for the easyconfig file are available by running eb --avail-easyconfig-params. The first part of the easyconfig file is straightforward and easy to create, and it was pretty easy to figure out how to name the file according to eb's conventions (<software>-<version>-<toolchain-version>.eb). Although PEST++ seems to only have officially released v3.0, by poking around on the GitHub page for the project, it looks like they have a commit labeled 3.3 that is pretty recent, so I decided to use that code and call my version of PEST++ 3.3. I will keep my personal configurations in my own creatively named directory: ~/myeasyconfigs/pestpp-3.3-gcc-5.3.0.eb. For now I ignored some of the license settings:
## The easyconfig file for PEST++ (pestpp)
## A short description of the software [default: None]
description = 'Some program...'
## The homepage of the software [default: None]
homepage = 'http://inversemodeler.com'
## Name of software [default: None]
name = "pestpp"
## Software license [default: None]
## List of software license locations [default: None]
## Name and version of toolchain [default: None]
toolchain = {'name': 'GCC', 'version': '5.3.0'}
## Version of software [default: None]
version = '3.3'
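A minimal sketch of the naming convention mentioned above: it derives an easyconfig file name from the same values that go inside the file. The helper easyconfig_filename is hypothetical (it is not part of EasyBuild), and it assumes the toolchain part is written as <toolchain name>-<toolchain version>. Note that the file name used in this post spells gcc in lowercase, whereas this sketch reuses the toolchain name exactly as given.

```python
def easyconfig_filename(name, version, toolchain):
    """Build '<software>-<version>-<toolchain name>-<toolchain version>.eb'.

    Hypothetical helper for illustration only; not an EasyBuild function.
    """
    return "%s-%s-%s-%s.eb" % (name, version,
                               toolchain['name'], toolchain['version'])


if __name__ == "__main__":
    print(easyconfig_filename("pestpp", "3.3",
                              {'name': 'GCC', 'version': '5.3.0'}))
```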
Turns out the best provided "generic" easyblock for me to use is named "MakeCp", which basically covers my case - no configure step, and no install step. It took me a long time to figure this out. I started off figuring I should be able to use a generic, provided easyblock, but couldn't get it to work, so I took a long detour, starting to write my own custom easyblock, but then got on the IRC channel where I got straightened out. So:
easyblock = 'MakeCp'
It turns out that eb is pretty smart and wants to download and extract the source code itself. Because I couldn't figure out how to get it to download straight from GitHub, I ended up downloading a .zip file myself and placing it directly in the folder where eb expected to find it: .local/easybuild/sources/p/pestpp/pestpp-master.zip. I got the link to the zip file by clicking the download link on GitHub for a specific commit. I really should add the checksum to the easyconfig file to avoid confusion over versions in the future. For future reference, I left the source_urls set, but when eb encounters the pestpp-master.zip file that I hand-placed in the right directory, it skips the download step.
source_urls = ['https://codeload.github.com/dwelter/pestpp/zip/master']
sources = ['pestpp-master.zip']
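Since the zip file was placed by hand, recording its checksum would make it obvious later which snapshot of master was actually built. Here is a minimal sketch of computing one with Python's hashlib. The function name file_checksum is made up for illustration, and which hash type the easyconfig checksums parameter expects may depend on the EasyBuild version, so it is worth checking eb --avail-easyconfig-params before pasting a value in.

```python
import hashlib
import os


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Location where the zip was hand-placed (taken from the text above).
    zip_path = os.path.expanduser(
        "~/.local/easybuild/sources/p/pestpp/pestpp-master.zip")
    if os.path.exists(zip_path):
        print(file_checksum(zip_path))
```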
Next up is the meat of the installation - setting up and compiling the code. This also took quite a while to figure out and I had to sift thru the rather extensive log files generated by running eb --debug in order to understand the problems I was having with directories. First it helps to go over what the manual steps would be:
- download zip
- extract zip to some location
- change into the extracted location
- go one step further into the src/ subdirectory of the PEST++ project
- edit the makefile_linux as needed
- run make -f makefile_linux
- test that the generated executable runs
It took lots of reading the eb --help output, trial and error, and checking the logs to figure out which easyconfig-params to set and how to get the directory paths exactly right. To begin with I wanted to start in the project's src directory as that is where the makefile_linux expects to be run from. Then I eventually figured out that I could essentially customize the make command that eb would run by setting buildopts:
start_dir = 'src'
buildopts = '-f makefile_linux'
Looking thru the logs (grep -rn "DEBUG cmd" /tmp/eb-* | less), I can see that the actual command issued by eb will look something like this:
/tmp/eb-YyVdkz/easybuild-8I6qQI.log:2398:== 2016-07-20 00:39:08,810 run.py:392 DEBUG cmd "make -j 4 -f makefile_linux" ...
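The logged command can be read as three pieces glued together: anything placed in the prebuildopts parameter, the make invocation with its parallelism flag, and anything placed in buildopts. The sketch below mimics that assembly. It is an approximation of the behaviour seen in the log, not EasyBuild's actual code, and the function name is hypothetical.

```python
def assemble_make_command(prebuildopts, parallel, buildopts):
    """Approximate the make command line seen in the eb debug log."""
    parts = [prebuildopts.strip(), "make"]
    if parallel:
        parts.append("-j %d" % parallel)
    parts.append(buildopts.strip())
    # Drop empty pieces so unset options leave no stray spaces.
    return " ".join(part for part in parts if part)


if __name__ == "__main__":
    print(assemble_make_command("", 4, "-f makefile_linux"))
```

With an empty prebuildopts, four parallel jobs, and the buildopts value from above, this reproduces the make -j 4 -f makefile_linux string from the log line.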
Of course the build failed because I needed to modify the makefile_linux before running it. But at least I had eb running the correct file! So the next step is figuring out how to patch the makefile with my own custom modifications. eb has a built-in "patching step", so the trick was figuring out how to generate the required patch, and where to put it so that eb could find and apply the patch. To generate the patch, what I ended up doing was modifying the makefile_linux in the original pestpp git repo that I had cloned from GitHub. Then I was able to generate the patch using git diff --no-prefix > ~/myeasyconfigs/tbc-fix-make.patch. I am not sure I completely understand the search path resolution for eb, but if I simply set:
patches = ['tbc-fix-make.patch']
in the easyconfig, and place the .patch file next to my config file, eb will find and apply it. The summary of my patch is:
- hard code the GCCLIBDIR to a path I discovered by finding where EasyBuild installs libquadmath for GCC-5.3.0.
- change to using what looks like the debug CFLAGS and FFLAGS
- add -static-libstdc++ to CFLAGS (as recommended by the compiler after a failed build)
- don't set the EXE_DIR variable (so it can be set outside the makefile and passed in)
Next up was figuring out how to get the appropriate directories set up before running the makefile. The makefile expects to copy the resulting binary up to an exe/ directory, so I have to make sure that exists, or the cp step in the makefile will fail. This was accomplished with eb's pre-build options:
prebuildopts = 'mkdir -p %(builddir)s/pestpp-master/exe/linux && export EXE_DIR="%(builddir)s/pestpp-master/exe/linux" && '
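The %(builddir)s placeholder follows Python's named string-formatting syntax, so its expansion can be previewed without running a build. In the sketch below the builddir value is copied from the log paths that appear in this post. Treating the expansion as a plain % substitution is a simplification of whatever EasyBuild does internally.

```python
prebuildopts = ('mkdir -p %(builddir)s/pestpp-master/exe/linux && '
                'export EXE_DIR="%(builddir)s/pestpp-master/exe/linux" && ')

# Build directory as it appears in the install log for this build.
templates = {
    'builddir': '/home/vagrant/.local/easybuild/build/pestpp/3.3/GCC-5.3.0',
}

# Named %-formatting fills in every %(builddir)s occurrence.
expanded = prebuildopts % templates

if __name__ == "__main__":
    # The trailing '&& ' lets the make command be appended directly.
    print(expanded + 'make -j 4 -f makefile_linux')
```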
Again, grepping thru the logs was key to figuring out how the %(builddir)s template variable was expanded. Finally the software should be making it thru the compilation step! But the resulting software ends up in EasyBuild's build directory tree, and EasyBuild expects to "install" it to a final location. The generic 'MakeCp' easyblock comes with a special parameter to control this, and it just took more trial and error and more log-reading to figure out how to use it. Turns out the install step starts in the root of the project. I should figure out exactly which files we need to "install", but for now I simply specify to go up one level and then take the entire pestpp-master directory (which results from unpacking the original .zip file; this could probably be adjusted by passing different options to the unpacking step). So this setting:
## List of files or dirs to copy [default: []]
files_to_copy = ['../pestpp-master']
works, as you can see from the final log:
$ grep -A 5 "Starting install_step " ~/.local/easybuild/software/pestpp/3.3-GCC-5.3.0/easybuild/easybuild-pestpp-3.3-20160720.125629.log
== 2016-07-20 12:56:03,944 makecp.py:71 DEBUG Starting install_step with files_to_copy: ['../pestpp-master']
== 2016-07-20 12:56:03,944 makecp.py:101 DEBUG List of files matching '../pestpp-master' in start dir /home/vagrant/.local/easybuild/build/pestpp/3.3/GCC-5.3.0/pestpp-master/src: []
== 2016-07-20 12:56:03,944 makecp.py:106 WARNING No files matching '../pestpp-master' found in start dir /home/vagrant/.local/easybuild/build/pestpp/3.3/GCC-5.3.0/pestpp-master/src
== 2016-07-20 12:56:03,944 makecp.py:108 DEBUG List of files matching '../pestpp-master' in /home/vagrant/.local/easybuild/build/pestpp/3.3/GCC-5.3.0/pestpp-master/src: ['/home/vagrant/.local/easybuild/build/pestpp/3.3/GCC-5.3.0/pestpp-master/../pestpp-master']
== 2016-07-20 12:56:03,944 makecp.py:129 DEBUG Copying directory /home/vagrant/.local/easybuild/build/pestpp/3.3/GCC-5.3.0/pestpp-master/../pestpp-master to /home/vagrant/.local/easybuild/software/pestpp/3.3-GCC-5.3.0
== 2016-07-20 12:56:28,609 easyblock.py:2127 DEBUG Not skipping extensions step (skippable: False, skip: None, skipsteps: [], module_only: False, force: True
With all this in place, eb was successfully compiling and installing the software, but still failing to complete because of a failing sanity check. I made a super basic sanity check. It would be awesome to further flesh this out:
sanity_check_paths = {'files':[], 'dirs':['%(installdir)s/pestpp-master/exe/linux/']}
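To show roughly what a check of this shape verifies, and how it could be tightened, here is a sketch. The rule that listed files must exist and listed directories must exist and be non-empty is an assumption about EasyBuild's behaviour rather than something confirmed in this post, and the executable path pestpp-master/exe/linux/pestpp is a guess based on the command name used at the end of this post. The function name is hypothetical.

```python
import os


def missing_sanity_paths(installdir, sanity_check_paths):
    """Return the entries that would fail a simple existence check."""
    missing = []
    for rel in sanity_check_paths.get('files', []):
        if not os.path.isfile(os.path.join(installdir, rel)):
            missing.append(rel)
    for rel in sanity_check_paths.get('dirs', []):
        path = os.path.join(installdir, rel)
        # Assumption: a directory only counts if it exists and is non-empty.
        if not (os.path.isdir(path) and os.listdir(path)):
            missing.append(rel)
    return missing


# A stricter variant of the check: also require the binary itself.
# The file name 'pestpp' is assumed, not taken from the build log.
stricter = {
    'files': ['pestpp-master/exe/linux/pestpp'],
    'dirs': ['pestpp-master/exe/linux'],
}
```

Calling missing_sanity_paths with the install directory and the stricter dictionary returns an empty list only when both the directory and the binary are present.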
There is one final step to being able to easily use the software without digging thru the .local/easybuild/... hierarchy to find the binary and explicitly call it each time, not to mention setting up all the right environment variables so that easy-built software can find the correct libraries at runtime. This is fortunately handled by EasyBuild's "module" concept. I had to ask about this on the IRC channel to figure it out. But basically with one last setting in the easyconfig file, EasyBuild will generate a "module file" for my PEST++ program that can set the appropriate environment variables, and prepend my $PATH with the right location of the EasyBuild-provided PEST++:
modextrapaths = {'PATH':'pestpp-master/exe/linux/'}
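Conceptually, the modextrapaths entry tells the generated module to prepend a directory inside the install location to an environment variable when the module is loaded. The sketch below imitates that effect on a plain dictionary. The function is hypothetical, and the install directory is the one reported in the install log above.

```python
import os


def apply_modextrapaths(env, installdir, modextrapaths):
    """Return a copy of env with each install-relative path prepended."""
    updated = dict(env)
    for var, rel in modextrapaths.items():
        new_entry = os.path.join(installdir, rel)
        current = updated.get(var, '')
        # Prepend so the module's directory wins over existing entries.
        updated[var] = new_entry + (os.pathsep + current if current else '')
    return updated


installdir = '/home/vagrant/.local/easybuild/software/pestpp/3.3-GCC-5.3.0'
env = apply_modextrapaths({'PATH': '/usr/bin'}, installdir,
                          {'PATH': 'pestpp-master/exe/linux/'})

if __name__ == "__main__":
    print(env['PATH'])
```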
Now, PEST++ is available as a "module" that can be turned on or off via the command line - the same way I enabled the GCC-5.3.0 module at the beginning of this whole process:
$ module avail
----------------------- /home/vagrant/.local/easybuild/modules/all ------------------
EasyBuild/2.8.2 GCC/5.3.0 M4/1.4.17 pestpp/3.3-GCC-5.3.0
----------------------- /usr/share/Modules/modulefiles -------------------------------
dot module-git module-info modules null use.own
----------------------- /etc/modulefiles ----------------------------------------------
mpi/openmpi-x86_64
To load it:
$ module load pestpp
Check that it is loaded:
$ module list
Currently Loaded Modulefiles:
1) EasyBuild/2.8.2 2) GCC/5.3.0 3) pestpp/3.3-GCC-5.3.0
And finally, I can run pestpp directly from my shell:
$ pestpp
PEST++ Version 3.3.0
by Dave Welter
Computational Water Resource Engineering
--------------------------------------------------------
usage:
serial run manager:
pest++ pest_ctl_file.pst
YAMR master:
pest++ control_file.pst /H :port
YAMR runner:
pest++ /H hostname:port
GENIE:
pest++ control_file.pst /G hostname:port
external run manager:
pest++ control_file.pst /E
--------------------------------------------------------
Wow, the build works! EasyBuild can compile, install, and check that my PEST++ is functioning properly (although I really would need to make a better test-suite/script to be able to say we are really testing the install...but still). So in theory, now I should be able to duplicate this installation on top of the GCC-5.3.0 stack on atlas in a very short amount of time. Awesome.
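For reference, the fragments scattered through this post can be collected into a single easyconfig. Easyconfig files use Python syntax, so the sketch is shown as Python. The grouping and ordering here are a reconstruction, so treat it as a starting point rather than a verified copy of the original file.

```python
# pestpp-3.3-gcc-5.3.0.eb (reconstructed from the fragments in this post)
easyblock = 'MakeCp'

name = "pestpp"
version = '3.3'

homepage = 'http://inversemodeler.com'
description = 'Some program...'

toolchain = {'name': 'GCC', 'version': '5.3.0'}

# eb skips the download when the zip is already in its sources directory.
source_urls = ['https://codeload.github.com/dwelter/pestpp/zip/master']
sources = ['pestpp-master.zip']

# The patch file sits next to this easyconfig and modifies makefile_linux.
patches = ['tbc-fix-make.patch']

# Run make from src/ with the Linux makefile, creating exe/linux first.
start_dir = 'src'
prebuildopts = 'mkdir -p %(builddir)s/pestpp-master/exe/linux && export EXE_DIR="%(builddir)s/pestpp-master/exe/linux" && '
buildopts = '-f makefile_linux'

# Copy the whole unpacked tree into the install directory.
files_to_copy = ['../pestpp-master']

sanity_check_paths = {'files': [], 'dirs': ['%(installdir)s/pestpp-master/exe/linux/']}

# Put the compiled binary on $PATH when the module is loaded.
modextrapaths = {'PATH': 'pestpp-master/exe/linux/'}
```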