This is a collection of software curated by the Florey Institute at The University of Sheffield for analysing microbial genomes. The software is collected in a Docker container that can be run locally on any machine (e.g. Mac OSX, Windows) for testing purposes, or in an HPC environment (via Singularity) for large-scale analysis.
Here is a list of all the software that is available through the Docker container. Each tool is installed in /usr/bin, so it can be called from any directory according to the standard instructions for that tool.
To run the software on your own machine, you will first need to install the Docker container management system. Here is a short overview of Docker for those not familiar with it. The links below can be used to download Docker for Mac and Windows.
Once Docker is set up, you can run the following command in a terminal to start the container:
docker run --rm -it markdunning/microbial-pipeline
You should then notice that the command prompt has changed to reflect the fact that you are inside the home directory of the container. You should now be able to run any of the software installed in the container, e.g.
samtools
You can leave the container by typing exit.
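If the docker command above fails immediately, the most common cause is that Docker is not installed or is not on your PATH. A minimal sketch of a check you could run first is given here; require_cmd is a hypothetical helper name chosen for this example and is not part of Docker.

```shell
#!/bin/sh
# Sketch: confirm that a command is installed before trying to use it.
# require_cmd is a hypothetical helper name, not part of Docker.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "$1 not found on PATH; install it first" >&2
    return 1
}

# Example: only start the container when docker is available
# require_cmd docker && docker run --rm -it markdunning/microbial-pipeline
```

The example call is left commented out so that the snippet can be sourced without starting a container.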
By default, the container is isolated from your own machine. However, you can mount particular directories so that they are accessible inside the container by adding the -v argument. If your fastq files are in a directory called /PATH/TO/MY/DATA, you can map this to a directory called /data inside the container using the following:
docker run --rm -it -v /PATH/TO/MY/DATA/:/data markdunning/microbial-pipeline
You could then run any of the included tools on your data by supplying the path /data to the tool, e.g.
fastqc /data/*
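The mount-and-run steps above can also be combined into a single non-interactive call. The following is a rough sketch, not a tested recipe: run_in_container and DRY_RUN are hypothetical names introduced for this example, and it assumes the image accepts a command as arguments to docker run. By default it only prints the docker command it would run; set DRY_RUN=0 to actually execute it.

```shell
#!/bin/sh
# Sketch: build the docker command for running one tool on a mounted directory.
# IMAGE comes from the text above; run_in_container and DRY_RUN are
# hypothetical names used only for this illustration.
IMAGE="markdunning/microbial-pipeline"

run_in_container() {
    data_dir="$1"   # directory on your own machine holding the input files
    shift           # remaining arguments: the tool and its arguments

    # Refuse to continue if the host directory is missing
    if [ ! -d "$data_dir" ]; then
        echo "No such directory: $data_dir" >&2
        return 1
    fi

    # Resolve to an absolute path before passing it to -v
    abs_dir=$(cd "$data_dir" && pwd)

    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Print the command rather than running it
        echo docker run --rm -v "$abs_dir:/data" "$IMAGE" "$@"
    else
        docker run --rm -v "$abs_dir:/data" "$IMAGE" "$@"
    fi
}
```

For example, run_in_container /PATH/TO/MY/DATA fastqc /data/sample.fastq would target a single file (sample.fastq is a placeholder name). A wildcard such as /data/* has to be expanded inside the container, so in that case the tool would need to be wrapped in a shell, e.g. sh -c 'fastqc /data/*', assuming the image provides sh.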
For various security reasons, Docker is not usually available in an HPC environment such as SHARC at the University of Sheffield. However, a system called Singularity can perform a similar function.
You will first need to log in to SHARC in the usual way and launch an interactive node (adding any memory requirements with the lrmem= option):
qrshx
Commands from inside the container can then be run using:
singularity exec /usr/local/community/Florey/singularity/microbial-pipeline_latest.simg __TOOLNAME__ PATH_TO_YOUR_DATA
e.g. to QC a file with fastqc
singularity exec /usr/local/community/Florey/singularity/microbial-pipeline_latest.simg fastqc MY_FASTQ
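When there are many files to check, the same singularity exec call can be repeated in a loop. The sketch below assumes your reads are gzip-compressed and end in .fastq.gz; qc_all and DRY_RUN are hypothetical names introduced for this example. By default it only prints the commands it would run; set DRY_RUN=0 to execute them on the interactive node.

```shell
#!/bin/sh
# Sketch: run fastqc on every matching file in a directory via the shared image.
# The image path and fastqc come from the text above; qc_all, the *.fastq.gz
# pattern and DRY_RUN are assumptions made for this illustration.
SIMG="/usr/local/community/Florey/singularity/microbial-pipeline_latest.simg"

qc_all() {
    fastq_dir="$1"
    for fq in "$fastq_dir"/*.fastq.gz; do
        # Skip the unexpanded pattern when no files match
        [ -e "$fq" ] || continue
        if [ "${DRY_RUN:-1}" = "1" ]; then
            # Print the command rather than running it
            echo singularity exec "$SIMG" fastqc "$fq"
        else
            singularity exec "$SIMG" fastqc "$fq"
        fi
    done
}
```

Calling qc_all PATH_TO_YOUR_DATA first in dry-run mode is a cheap way to confirm that the expected files are picked up before any compute time is spent.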
The image itself was built with the command:
singularity pull docker://markdunning/microbial-pipeline
The installation of the software is controlled by a Dockerfile, which is kept under version control in the GitHub repository associated with this project. Please feel free to suggest changes to this file via a pull request on GitHub.