.. _examples:

Example projects and data-sets
==============================

The PALEOMIX pipeline contains small example projects for the larger pipelines, which are designed to be executed in a short amount of time, and to help verify that the pipelines have been correctly installed.


.. _examples_bam:

BAM Pipeline example project
----------------------------

The example project for the BAM pipeline involves the processing of a small data set consisting of (simulated) ancient sequences derived from the human mitochondrial genome. The runtime of this project on a typical desktop or laptop ranges from around 1 minute to around 1 hour, the latter when full modeling of ancient DNA damage patterns is enabled.

To access this example project, use the 'example' command for the BAM pipeline to copy the project files to a given directory (here, the current directory)::

    $ paleomix bam example .
    $ cd bam_pipeline
    $ paleomix bam run makefile.yaml

The output generated by the pipeline is described in the :ref:`bam_filestructure` section. Please see the :ref:`troubleshooting` section if you run into problems running the pipeline.


.. _examples_phylo:

Phylogenetic Pipeline example project
-------------------------------------

The example project for the phylogenetic pipeline involves the processing and mapping of a small data set consisting of (simulated) sequences derived from the human and primate mitochondrial genomes, followed by the genotyping of gene sequences and the construction of a maximum likelihood phylogeny.

Since this example project starts from raw reads, it requires that the BAM pipeline has been correctly installed, as described in the :ref:`bam_requirements` section. The runtime of this project on a typical desktop or laptop ranges from around 30 minutes to around 1 hour.

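Before starting, it may be worth confirming that the 'paleomix' executable is visible to the shell. The following is only a minimal sketch: the helper name 'have_paleomix' is hypothetical and not part of PALEOMIX, and finding the executable on the PATH does not by itself show that the requirements of the BAM pipeline are met::

    # Hypothetical helper, not part of PALEOMIX: succeeds if an
    # executable named 'paleomix' can be found on the current PATH.
    have_paleomix() {
        command -v paleomix >/dev/null 2>&1
    }

    if have_paleomix; then
        echo "paleomix found at: $(command -v paleomix)"
    else
        echo "paleomix not found on PATH" >&2
    fi

If the executable is not found, the installation instructions should be revisited before running either example.
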
To access this example project, use the 'example' command for the phylogenetic pipeline to copy the project files to a given directory (here, the current directory), and then run the 'setup.sh' script in the root directory to generate the data set::

    $ paleomix phylo example .
    $ cd phylo_pipeline
    $ ./setup.sh

Once the example data has been generated, the two pipelines may be executed::

    $ cd alignment
    $ paleomix bam run makefile.yaml
    $ cd ../phylogeny
    $ paleomix phylo genotype+msa+phylogeny makefile.yaml

The output generated by the pipeline is described in the :ref:`phylo_filestructure` section. Please see the :ref:`troubleshooting` section if you run into problems running the pipeline.


.. _examples_zonkey:

Zonkey Pipeline example project
-------------------------------

The example project for the Zonkey pipeline is based on a synthetic hybrid between a domestic donkey and an Arabian horse (obtained from [Orlando2013]_), using a low number of reads (1200). The runtime of these examples on a typical desktop or laptop ranges from around 30 minutes to around 1 hour, depending on your local configuration.

To access this example project, download the Zonkey reference database (see the 'Prerequisites' section of the :ref:`zonkey_usage` page for instructions), and use the 'example' command for zonkey to copy the project files to a given directory. Here, the current directory is used; to place the example files in a different location, simply replace the '.' with the full path to the desired directory::

    $ paleomix zonkey example database.tar .
    $ cd zonkey_pipeline

The example directory contains 3 BAM files: one containing a nuclear alignment ('nuclear.bam'), one containing a mitochondrial alignment ('mitochondrial.bam'), and one containing a combined nuclear and mitochondrial alignment ('combined.bam'). In addition, a sample table is included which shows how multiple samples may be specified and processed at once.

Each of these may be run as follows::

    # Process only the nuclear BAM;
    # by default, results are saved in 'nuclear.zonkey'
    $ paleomix zonkey run database.tar nuclear.bam

    # Process only the mitochondrial BAM;
    # by default, results are saved in 'mitochondrial.zonkey'
    $ paleomix zonkey run database.tar mitochondrial.bam

    # Process both the nuclear and the mitochondrial BAMs;
    # note that it is necessary to specify an output directory
    $ paleomix zonkey run database.tar nuclear.bam mitochondrial.bam results

    # Process the combined nuclear and mitochondrial BAM;
    # by default, results are saved in 'combined.zonkey'
    $ paleomix zonkey run database.tar combined.bam

    # Process multiple samples; the table corresponds to the four
    # cases listed above.
    $ paleomix zonkey run database.tar samples.txt

Please see the :ref:`troubleshooting` section if you run into problems running the pipeline. The output generated by the pipeline is described in the :ref:`zonkey_filestructure` section.

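After running the individual examples, a quick way to see which runs have produced output is to check for the output locations named in the comments above. The following is only a convenience sketch and not part of PALEOMIX: the function name 'report_zonkey_outputs' is hypothetical, and it only tests that each path exists in the current directory, not that the run completed successfully::

    # Hypothetical helper, not part of PALEOMIX: report which of the
    # output locations named in the example comments currently exist.
    report_zonkey_outputs() {
        for path in nuclear.zonkey mitochondrial.zonkey results combined.zonkey; do
            if [ -e "$path" ]; then
                echo "found: $path"
            else
                echo "missing: $path"
            fi
        done
    }

    report_zonkey_outputs

The contents of each output location are described in the :ref:`zonkey_filestructure` section.
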