A common request we get is to install some software package on either Gordon or Trestles, sometimes with the implicit assumption that it should be a simple matter of running sudo apt-get install somepackage. Unfortunately, installing a new piece of software on a shared resource like Gordon or Trestles is not that easy because:
- we need to make sure the software will not break another library or program on the system
- we will typically have to install it on all of the compute nodes too, not just the login nodes
- we then have to support that software (and any/all users’ questions about it) since we officially provide it
- our system engineers, who can actually deploy packages, are not the same applications experts who compile the packages
The net result is that installing a new software package system-wide can take weeks or months. When I get requests to install software, I invariably respond that it would be easier and faster for the user (that’s you) to just install the package themselves, and I provide step-by-step instructions on exactly how to do that.
For the sake of anyone who wants to know how to install his or her own software applications on SDSC Trestles or Gordon (or any other Linux or UNIX machine, for that matter), here are some generic guidelines on how to do this.
2. Python Modules
There are several ways Python will let you manage your own set of libraries, and I find the virtualenv package to be the easiest. It creates what amounts to an installation of Python that is local to your home directory, which means that any libraries you install using that special personalized Python will also install into your home directory.
First, download virtualenv from the project’s website, e.g.,
$ wget --no-check-certificate "https://pypi.python.org/packages/source/v/virtualenv/virtualenv-1.11.2.tar.gz"
Then be sure to load the python module that you wish to clone. This
is important because the
python that you will run if you don’t
explicitly load a Python module is the system-wide default, Python 2.6.6.
I recommend using the Python 2.7.x module we provide on both machines, so load
that module before trying to install virtualenv:
$ module load python
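Before going further, it may be worth confirming that the module actually took effect. The following is a minimal sketch, assuming a POSIX-style shell; show_python is a hypothetical helper name I made up for illustration, not something provided on Gordon or Trestles.

```shell
# show_python: hypothetical helper (not part of the module system).
# Prints which "python" is first on your PATH and the version it reports.
show_python() {
    exe=$(command -v python) || {
        echo "no python found on PATH" >&2
        return 1
    }
    echo "interpreter: $exe"
    # Python 2 prints its version to stderr, so merge the two streams.
    "$exe" --version 2>&1
}
```

Running show_python before and after loading the module should show both the interpreter path and the reported version change.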
You must now decide the location of this custom Python you want to install;
~/python27-gordon is a good choice, assuming you are
using Python 2.7 as previously discussed. Then, unpacking and installing
virtualenv is a snap:
$ tar zxvf virtualenv-1.11.2.tar.gz
virtualenv-1.11.2/
virtualenv-1.11.2/AUTHORS.txt
...
virtualenv-1.11.2/virtualenv_support/pip-1.5.2-py2.py3-none-any.whl
virtualenv-1.11.2/virtualenv_support/setuptools-2.1-py2.py3-none-any.whl
$ python virtualenv-1.11.2/virtualenv.py ~/python27-gordon
New python executable in /home/username/python27-gordon/bin/python
Installing setuptools, pip...done.
And that’s all you have to do. Now whenever you want to use your custom Python installation, you will have to issue this command:
$ source ~/python27-gordon/bin/activate
(python27-gordon)$
As you may notice, it modifies your prompt to show that you are in this
custom Python’s “virtual environment.” If you always plan on using this custom
Python, you can go ahead and add the following lines to your ~/.bashrc:
module load python
VIRTUAL_ENV_DISABLE_PROMPT=1 source ~/python27-gordon/bin/activate
Note the VIRTUAL_ENV_DISABLE_PROMPT=1 preceding the “source” command; this option prevents that annoying prompt prefix that virtualenv will otherwise give you every time you log in.
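If the virtualenv directory is ever moved or deleted, an unconditional source line will print an error at every login. One way to guard against that is sketched below; activate_if_present is a hypothetical function name of my own, and the sketch assumes only that the activate script lives at bin/activate inside the environment directory, as shown above.

```shell
# Hypothetical ~/.bashrc helper: activate a virtualenv only if it exists,
# so logins stay quiet if the directory is ever moved or deleted.
activate_if_present() {
    if [ -f "$1/bin/activate" ]; then
        # Set the variable so the activate script leaves the prompt alone.
        VIRTUAL_ENV_DISABLE_PROMPT=1
        . "$1/bin/activate"    # "." is the portable spelling of "source"
    else
        echo "no virtualenv at $1; skipping activation" >&2
        return 1
    fi
}

# Example call for ~/.bashrc (after "module load python"):
# activate_if_present "$HOME/python27-gordon"
```

Using $HOME rather than a relative path means the call works no matter which directory the shell starts in.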
Once you’ve got your virtualenv activated, installing new libraries is easy:
(python27-gordon)$ pip install cutadapt
Downloading/unpacking cutadapt
Downloading cutadapt-1.3.tar.gz (149kB): 149kB downloaded
Running setup.py (path:/home/username/python27-gordon/build/cutadapt/setup.py) egg_info for package cutadapt
...
Successfully installed cutadapt
Cleaning up...
(python27-gordon)$ pip install pyvcf
Downloading/unpacking pyvcf
Downloading PyVCF-0.6.4.tar.gz
...
Successfully installed pyvcf distribute setuptools
Cleaning up...
As you can see, pip automatically downloads and installs dependencies for you, making the task of managing Python libraries under your own user account on our supercomputers pretty easy.
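Because pip installs into whichever Python is currently active, and the prompt prefix may have been turned off, a quick check before running pip install can save some confusion. The sketch below relies on the VIRTUAL_ENV variable that virtualenv’s activate script sets; venv_status is a made-up name for illustration.

```shell
# venv_status: hypothetical helper. The activate script sets VIRTUAL_ENV
# to the environment's directory, so an empty value means nothing is active.
venv_status() {
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        echo "virtualenv active: $VIRTUAL_ENV"
    else
        echo "no virtualenv active" >&2
        return 1
    fi
}
```

If it reports that no virtualenv is active, source the activate script first; otherwise pip will try to install into a Python you probably cannot write to.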
3. Perl Modules
One of the standard ways of maintaining your own Perl libraries installed
into your home directory is using the
local::lib module which,
like Python’s virtualenv, lets you emulate having Perl installed locally.
To get started with local::lib you’ve first got to download and unpack it:
$ wget http://search.cpan.org/CPAN/authors/id/E/ET/ETHER/local-lib-1.008010.tar.gz
$ tar zxvf local-lib-1.008010.tar.gz
local-lib-1.008010/
local-lib-1.008010/Changes
local-lib-1.008010/inc/
...
$ cd local-lib-1.008010
Unlike with Python, we do not have a separate Perl environment module that needs to be
loaded. Once you’re in that
local-lib-1.008010 directory, you can
start the bootstrap process, in which local::lib sets up
your custom Perl installation and installs itself into it. Let’s assume that we want
to install our custom Perl into
~/perl5-gordon (note: use
$HOME instead of ~ in the command):
$ perl Makefile.PL --bootstrap=$HOME/perl5-gordon
Attempting to create directory /home/username/perl5-gordon
...
If you don’t specify a path after the
--bootstrap flag, your
local::lib installation will be in ~/perl5. The
bootstrapping process may take a very long time as CPAN
needs to first configure itself, then install all of the libraries that
local::lib needs to work. After a lot of text scrolls by
(much of which looks like errors; this isn’t necessarily bad), hopefully you
wind up at
...
Checking if your kit is complete...
Looks good
Generating a GNU-style Makefile
Writing Makefile for local::lib
Writing MYMETA.yml and MYMETA.json
Then test and install:
$ make test
...
t/subroutine-in-inc.t .. ok
All tests successful.
Files=8, Tests=35, 0 wallclock secs ( 0.04 usr 0.03 sys + 0.23 cusr 0.07 csys = 0.37 CPU)
Result: PASS
$ make install
Installing /home/username/perl5-gordon/lib/perl5/POD2/PT_BR/local/lib.pod
...
Appending installation info to /home/username/perl5-gordon/lib/perl5/x86_64-linux-thread-multi/perllocal.pod
Now we need to put a few new lines in our ~/.bashrc to
effectively do what that
activate script does for Python’s
virtualenv. The following command prints the lines you need and, through tee -a, appends them to your ~/.bashrc:
$ perl -I$HOME/perl5-gordon/lib/perl5 -Mlocal::lib=$HOME/perl5-gordon | tee -a ~/.bashrc
export PERL_LOCAL_LIB_ROOT="$PERL_LOCAL_LIB_ROOT:/home/username/perl5-gordon";
export PERL_MB_OPT="--install_base /home/username/perl5-gordon";
export PERL_MM_OPT="INSTALL_BASE=/home/username/perl5-gordon";
export PERL5LIB="/home/username/perl5-gordon/lib/perl5:$PERL5LIB";
export PATH="/home/username/perl5-gordon/bin:$PATH";
You should then either log out and log back in, or paste those export lines into your current terminal session to put them into effect. Following that, you should be able to install Perl libraries into your home directory:
$ perl -MCPAN -e 'install(Time::Piece)'
Reading '/home/username/.cpan/Metadata'
Database was generated on Mon, 16 Sep 2013 19:53:02 GMT
Running install for module 'Time::Piece'
...
RJBS/Time-Piece-1.23.tar.gz
/usr/bin/make install -- OK
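If an install seems to target the wrong location, the export lines may simply not be in effect in your current shell. Here is a minimal sketch for checking that, using only the variable names that appear in the export lines; check_local_lib is a hypothetical name, not part of local::lib.

```shell
# check_local_lib: hypothetical helper that reports whether each variable
# from the local::lib export lines is set in the current shell.
check_local_lib() {
    missing=0
    for var in PERL_LOCAL_LIB_ROOT PERL_MB_OPT PERL_MM_OPT PERL5LIB; do
        # Indirect lookup: copy the value of the variable named by $var.
        eval "val=\${$var:-}"
        if [ -n "$val" ]; then
            echo "$var=$val"
        else
            echo "$var is not set"
            missing=1
        fi
    done
    return "$missing"
}
```

Any "is not set" line suggests that you still need to log out and back in, or paste the export lines into the current session.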
4. R Libraries
Users cannot install R libraries globally on our machines, but R makes it
very easy for users to install libraries in their home directories. To do
this, fire up R and when presented with the
> prompt, use the
install.packages() function to install things:
> install.packages('doSNOW')
Installing package(s) into '/opt/R/local/lib'
(as 'lib' is unspecified)
Warning in install.packages("doSNOW") :
'lib = "/opt/R/local/lib"' is not writable
Would you like to use a personal library instead? (y/n)
This warning comes up because you can’t install libraries system-wide as a non-root user. Say y and accept the default, which should be something similar to ~/R/x86_64-unknown-linux-gnu-library/3.0. Pick a mirror and let her rip. If you want to install multiple packages at once, you can pass install.packages() a vector of names, such as c('package1', 'package2'), instead of a single name.
For most packages, this is all you will have to do. However, sometimes R packages depend on other system libraries, and those system libraries might not be in the default search path for the R package installer. When that happens, you might get an error that looks something like this:
> install.packages('rjags');
* installing *source* package 'rjags' ...
** package 'rjags' successfully unpacked and MD5 sums checked
checking for prefix by checking for jags... no
configure: error: "Location of JAGS headers not defined. Use configure arg '--with-jags-include' or environment variable 'JAGS_INCLUDE'"
ERROR: configuration failed for package 'rjags'
* removing '/home/glock/R/x86_64-unknown-linux-gnu-library/3.0/rjags'
The downloaded source packages are in '/tmp/RtmpdE6UYF/downloaded_packages'
Warning message:
In install.packages("rjags") :
installation of package ‘rjags’ had non-zero exit status
The relevant part of the error log is the line beginning with configure: error; the library could
not install because it depends on a system library (as opposed to an R library)
that the install.packages() command could not find.
While fixing this error can be tricky since each R library can have a
different installation procedure, you can pass extra hints to the
install.packages() command to suggest where it can find some of
these system libraries. For example, the jags library is already installed on
Gordon and Trestles, and it can be loaded using module load jags.
After doing this, you can do something like:
> install.packages('rjags', configure.args=c(rjags='--with-jags-include=$JAGSHOME/include/JAGS --with-jags-lib=$JAGSHOME/lib'))
* installing *source* package 'rjags' ...
** package 'rjags' successfully unpacked and MD5 sums checked
checking for prefix by checking for jags... /opt/jags/bin/jags
checking whether the C++ compiler works... yes
...
** testing if installed package can be loaded
* DONE (rjags)
The configure.args argument looks gnarly, but it actually tells R the following:
- the contents of configure.args should be passed to the underlying library’s installer.
- configure.args is a named vector containing special configuration parameters, where the name of each value corresponds to the package to which the special parameters apply.
- --with-jags-include and --with-jags-lib tell the rjags installer where your JAGS library’s include and lib directories are located.
- $JAGSHOME is a variable that gets defined when you load the jags module on Gordon and Trestles. On other systems, you would specify the full path to your jags installation directory instead.
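Since those paths depend on $JAGSHOME being defined in the environment R was started from, a forgotten module load jags can turn into a confusing configure failure. Below is a minimal pre-flight sketch to run in the shell before starting R. check_jags_paths is a hypothetical name, and the two subdirectories it checks are simply the ones passed in the rjags example above.

```shell
# check_jags_paths: hypothetical pre-flight check to run before starting R.
# It verifies that JAGSHOME is set and that the two directories handed to
# --with-jags-include and --with-jags-lib actually exist.
check_jags_paths() {
    if [ -z "${JAGSHOME:-}" ]; then
        echo "JAGSHOME is not set; try 'module load jags' first" >&2
        return 1
    fi
    status=0
    for dir in "$JAGSHOME/include/JAGS" "$JAGSHOME/lib"; do
        if [ -d "$dir" ]; then
            echo "found:   $dir"
        else
            echo "missing: $dir"
            status=1
        fi
    done
    return "$status"
}
```

On a system with a different JAGS layout, the directory names in the loop would need to be adjusted to match.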
Actually knowing what
configure.args to use when generic package
installations fail requires some amount of intuition. If all else fails,
contact your help desk!