Monthly Archives: October 2014

When installing Cactus (using the progressiveCactus repository) I encountered the following issues during compiling:

  1. easy_install not found
    Solution: Needed to remove my ~/.pydistutils.cfg
  2. Dependencies/includes not being found
    Solution: add 'CXXFLAGS=-I"<install_location>/include/"' and 'CFLAGS=-I"<install_location>/include/"' to <install_location>/share/config.site
  3. kyototycoon not compiling as kyotocabinet functions not found (as in issue 27)
    Solution: (as in my comment to the issue) enter the kyototycoon directory and run configure with different flags, then make:

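    ./configure --prefix="$HOME/software" --with-kc="$HOME/software"
    make

    where $HOME/software (i.e. ~/software) is the prefix I am installing to (with subdirs bin, lib, include, man etc). $HOME is safer than ~ here, as bash does not expand a ~ that follows the = in an option such as --prefix=~/software.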

  4. Various cases of -ltokyocabinet not found, or other dependencies missing, in the UCSC submodule stages of compilation (which use makefiles but not configure, so don’t pick up config.site)
    Solution: add:

    cflags += -I"/nfs/users/nfs_j/jl11/software/include/" -L/nfs/users/nfs_j/jl11/software/lib

    to include.mk in submodules cactus, cactus2hal, hal, pinchesAndCacti and matchingAndOrdering
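
The edit in item 4 has to be repeated in five files, so it can be worth scripting. The following is only a minimal sketch: it assumes the submodules sit in a submodules/ directory inside the progressiveCactus checkout, each with an include.mk at its top level (check this against your own checkout), and the function name patch_include_mk is made up for illustration.

```shell
# Append include/library search paths to include.mk in each submodule.
# Usage: patch_include_mk <checkout_dir> <install_prefix>
# Assumed layout: <checkout_dir>/submodules/<name>/include.mk
patch_include_mk() {
    checkout_dir=$1
    prefix=$2
    line="cflags += -I\"$prefix/include/\" -L$prefix/lib"
    for sub in cactus cactus2hal hal pinchesAndCacti matchingAndOrdering; do
        mk="$checkout_dir/submodules/$sub/include.mk"
        if [ ! -f "$mk" ]; then
            echo "skipping $sub: $mk not found" >&2
            continue
        fi
        # Only append if the exact line is not already there,
        # so the function is safe to run more than once.
        if ! grep -qxF -- "$line" "$mk"; then
            # Leading newline guards against a file with no final newline.
            printf '\n%s\n' "$line" >> "$mk"
        fi
    done
}
```

It would be called as, for example, patch_include_mk ~/progressiveCactus "$HOME/software", where both paths are placeholders for your own checkout and prefix.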

Hope that helps anyone who might have been struggling with similar compile issues!

Over the past few months I’ve found myself running large numbers of jobs over an LSF system, for example assembling and annotating thousands of bacterial genomes or imputing thousands of human genomes in 5Mb chunks.

Inevitably some of these jobs fail, and often for a number of different reasons. I thought it might be helpful to share some of the commands I’ve found useful for diagnosing the jobs that have finished. The commands apply to IBM Platform LSF (bsub), but I imagine they have slightly wider applicability.

bjobs -a -x

This command is useful if run just after jobs finish, so that they are still in the history (they are usually cleared after a couple of hours). It will show all jobs that have finished with a non-zero exit code, and also jobs which have underrun/overrun. This is especially useful if you’ve run something that has exited with an error early on, but still returns exit code 0 (e.g. wrong command line parameters).

find . -name "*.o" | xargs grep -L "Successfully completed"

Assuming all your job STDOUT files have the suffix .o (bsub -o), this will show any jobs (files at least) that have not finished with exit code 0.

find – returns all file names which end with .o, searching recursively
xargs – passes these file names to grep as arguments (in batches, rather than one at a time)
grep -L – returns the names of any files which do not contain the given phrase
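
The pipeline above trips over file names containing spaces, and with GNU xargs an empty find result still runs grep once (reading from STDIN). A sketch that avoids both by letting find call grep directly; the function name unfinished_logs is just an illustrative choice.

```shell
# List job output files that lack LSF's "Successfully completed" line.
# Usage: unfinished_logs [dir]   (defaults to the current directory)
unfinished_logs() {
    # -exec ... {} + batches file names much like xargs does, but copes
    # with spaces in names and runs nothing when no files match.
    find "${1:-.}" -type f -name "*.o" \
        -exec grep -L "Successfully completed" {} +
}
```

Only the printed file names are meaningful here; the exit status depends on what grep returned for each batch, so it is not a reliable success indicator.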

find . -name "*.o" | xargs grep -l "MEMLIMIT"

Similar to the above command, except that it returns the jobs that exceeded their memory limit: grep -l (lower case) returns the files that do contain the match.
This makes it easy to find jobs which just need to be resubmitted with higher memory limits.

This and the above command can easily be extended by grepping for different strings in the log files.
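
As one possible extension, here is a sketch that labels every log with a rough outcome in one go. The search strings TERM_MEMLIMIT and TERM_RUNLIMIT are the termination reasons I would expect to see in LSF output files, but treat them as assumptions and check them against your own logs; the function name classify_logs is made up.

```shell
# Print one "<reason><TAB><file>" line per job output file.
# Usage: classify_logs [dir]   (defaults to the current directory)
classify_logs() {
    find "${1:-.}" -type f -name "*.o" | while IFS= read -r f; do
        if grep -q "Successfully completed" "$f"; then
            reason=ok
        elif grep -q "TERM_MEMLIMIT" "$f"; then
            reason=memlimit      # assumed LSF string for memory kills
        elif grep -q "TERM_RUNLIMIT" "$f"; then
            reason=runlimit      # assumed LSF string for run-time kills
        else
            reason=other
        fi
        printf '%s\t%s\n' "$reason" "$f"
    done
}
```

Piping the result through cut -f1 | sort | uniq -c gives a count per outcome, and grepping for ^memlimit gives the list of files to resubmit with more memory.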

find . -name "*.e" | xargs cat

Useful for some tasks: this will display everything written to STDERR, assuming you wrote it to files with the suffix .e (bsub -e). Some software writes its logs to STDERR, but in other cases you would expect this command to return no text.
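
With thousands of jobs the concatenated output gives no hint of which job wrote what. A small sketch that prints only the non-empty STDERR files, each under a header naming the file; show_stderr is a hypothetical name.

```shell
# Print each non-empty STDERR file under a header naming the file.
# Usage: show_stderr [dir]   (defaults to the current directory)
show_stderr() {
    # -size +0 skips empty files, so jobs that wrote nothing to STDERR
    # produce no output at all.
    find "${1:-.}" -type f -name "*.e" -size +0 | while IFS= read -r f; do
        printf '==> %s <==\n' "$f"
        cat "$f"
    done
}
```

If this prints nothing, no job wrote anything to STDERR.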