Software requirements

Which programs are required depends on which parts of the phylogenetic pipeline are used. The following sections describe the programs required for each part of the pipeline, as well as any minimum version requirements.

Genotyping

Both the 'tabix' and the 'bgzip' executables from the Tabix package must be installed. On Debian-based distros, these tools can be installed as follows:

$ sudo apt-get install bcftools samtools tabix
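After installation, it can be useful to confirm that the executables are actually visible on the PATH. The following is a minimal sketch of such a check. The function name 'check_executables' is a hypothetical helper introduced here for illustration and is not part of the pipeline:

```shell
# Hypothetical helper: report which of the named executables are on the PATH.
# Returns a non-zero status if at least one of them is missing.
check_executables() {
    missing=0
    for name in "$@"; do
        if command -v "$name" >/dev/null 2>&1; then
            echo "found: $name"
        else
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}

# The two executables named above as required for genotyping.
check_executables tabix bgzip || echo "Install the missing tools before genotyping."
```

The same helper can be reused for the other parts of the pipeline by passing different names, for example 'check_executables mafft-ginsi mafft-fftnsi'.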

Multiple Sequence Alignment

Note that the pipeline requires the algorithm-specific MAFFT commands (e.g. 'mafft-ginsi', 'mafft-fftnsi') to be available. These are automatically created by the 'make install' command.

On Debian-based distros, MAFFT can be installed as follows:

$ sudo apt-get install mafft

The various mafft-* binaries may not be added to your PATH by default on Debian. If that is the case, the folder containing them can be added to your PATH as follows:

$ echo 'export PATH=/usr/lib/mafft/bin/:$PATH' >> ~/.bashrc
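Appending an export line to ~/.bashrc more than once, or sourcing the file repeatedly, leaves duplicate entries in the PATH. As a sketch of an alternative, the function below adds a folder only if it is not already listed. The name 'prepend_path' is a hypothetical helper, and the folder is the one used in the command above, which is assumed to match your installation:

```shell
# Hypothetical helper: prepend a folder to PATH unless it is already listed.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present; leave PATH unchanged
        *) PATH="$1:$PATH" ;;
    esac
}

# Assumption: the mafft-* binaries live in the folder named above.
prepend_path /usr/lib/mafft/bin/
export PATH

# Warn if the algorithm-specific command is still not visible.
command -v mafft-ginsi >/dev/null 2>&1 || echo "mafft-ginsi not found on the PATH"
```

This only changes the current shell session; to make it permanent, the function and the call would need to be placed in a startup file such as ~/.bashrc.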

Phylogenetic Inference

The pipeline expects a single-threaded binary named 'raxmlHPC' for RAxML. The pipeline expects the ExaML binary to be named 'examl', and the parser binary to be named 'parse-examl'. Compiling and running ExaML requires an MPI implementation (e.g. OpenMPI), even if ExaML is run single-threaded.

Both programs offer a variety of makefiles suited to different server architectures and use cases. If in doubt, use the Makefile.SSE3.gcc makefiles, which are compatible with most modern systems:

$ make -f Makefile.SSE3.gcc

RAxML and MPI (mpirun/mpicc) can be installed as follows on Debian-based distros:

$ sudo apt-get install raxml mpi-default-bin mpi-default-dev
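When RAxML is compiled from source, the resulting binary may not carry the exact name the pipeline expects ('raxmlHPC'). One way to handle this, sketched below, is to create a symbolic link with the expected name in a folder on the PATH. The helper name 'link_as', the source file name 'raxmlHPC-SSE3', and the target folder "$HOME/bin" are all assumptions made for illustration; check the actual output of your build and adjust accordingly:

```shell
# Hypothetical helper: expose a compiled binary under the name the pipeline expects.
# Arguments: absolute path to the binary, target folder, expected name.
link_as() {
    mkdir -p "$2" && ln -sf "$1" "$2/$3"
}

# Assumption: the build left a binary named 'raxmlHPC-SSE3' in the current
# folder, and "$HOME/bin" is on the PATH. Nothing is done if the file is absent.
if [ -x "$PWD/raxmlHPC-SSE3" ]; then
    link_as "$PWD/raxmlHPC-SSE3" "$HOME/bin" raxmlHPC
fi
```

An absolute path is used for the source so that the link does not break when the current folder changes. Copying or renaming the binary would work equally well.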

Testing the pipeline

An example project is included with the phylogenetic pipeline, and it is recommended to run this project in order to verify that the pipeline and required applications have been correctly installed. See the Example projects and data-sets section for a description of how to run this example project.