High-Performance Computing at the NIH

Freebayes / Ogap on Helix

FreeBayes is a high-performance, flexible, and open-source Bayesian genetic variant detector. It operates on BAM alignment files, which are produced by most contemporary short-read aligners.

In addition to substantial performance improvements over its predecessors (PolyBayes, GigaBayes, and BamBayes), it expands the scope of SNP and small-indel variant calling to populations of individuals with heterogeneous copy number.

FreeBayes is developed by Erik Garrison and Gabor Marth.

 

The environment variables must be set before FreeBayes can be run. The easiest way to do this is with the module commands shown in the example below.

Note that loading the freebayes module also adds the 'ogap' program to your path.

$ module avail freebayes
---------------------- /usr/local/Modules/3.2.9/modulefiles --------------------------------
freebayes/0.9.4          freebayes/0.9.6          freebayes/0.9.8(default)


$ module load freebayes

$ module list
Currently Loaded Modulefiles:
1) freebayes/0.9.8

$ module unload freebayes
$ module load freebayes/0.9.6
$ module list
Currently Loaded Modulefiles:
1) freebayes/0.9.6

$ module show freebayes
-------------------------------------------------------------------
/usr/local/Modules/3.2.9/modulefiles/freebayes/0.9.8:

module-whatis    Sets up freebayes 0.9.8
prepend-path     PATH /usr/local/apps/freebayes/0.9.8/bin
prepend-path     PATH /usr/local/apps/ogap/current
-------------------------------------------------------------------
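
After loading the module, it can be useful to confirm that the programs are actually reachable before starting a long job. The following is a minimal sketch, not part of the modules system or of FreeBayes: require_tools is a hypothetical helper name, and the executable names freebayes and ogap are assumed from the module paths shown above.

```shell
#!/bin/sh
# Hypothetical helper (illustration only): check that each named
# program can be found on PATH, and report the ones that cannot.
require_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    # Returns 0 only if every tool was found.
    return "$missing"
}

# Typical use after 'module load freebayes' (assumed executable names):
#   require_tools freebayes ogap || echo "load the freebayes module first"
```

The function only inspects PATH, so it works the same way whether the path was set by a module or by hand.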

 

How To Use

Example

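The following example is adapted from this file: /usr/local/apps/freebayes/current/examples/pipeline.sh

Because the command passes --stdin, freebayes reads BAM data from standard input, so it must sit at the end of a pipe that supplies the alignments. Consult the original script for how the alignments are supplied.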

     
$ module load freebayes
$ export outdir=/data/$USER/YourOutputDir
$ mkdir -p "$outdir"
$ export reference=/data/$USER/YourReference.fasta
$ export bamlist=YourBamfileNames.txt
$ export cnvmap=YourCNVMAP.bed
$ export region=YourRegion

$ freebayes \
    --min-alternate-count 2 \
    --min-alternate-qsum 40 \
    --pvar 0.0001 \
    --use-mapping-quality \
    --posterior-integration-limits 1,3 \
    --genotype-variant-threshold 4 \
    --site-selection-max-iterations 3 \
    --genotyping-max-iterations 25 \
    --max-complex-gap 3 \
    --cnv-map "$cnvmap" \
    --stdin \
    --region "$region" \
    -f "$reference" \
    | gzip > "$outdir/$region.vcf.gz"
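
Because the output above is written through gzip, a job that is interrupted can leave a truncated file behind. The following is a minimal sketch of a post-run check, assuming only standard gzip and grep; count_vcf_records is a hypothetical helper name and is not provided by FreeBayes.

```shell
#!/bin/sh
# Hypothetical helper (illustration only): verify a gzip-compressed VCF
# and print how many non-header lines it contains.
count_vcf_records() {
    vcf_gz=$1
    # gzip -t exits non-zero if the file is truncated or not gzip data.
    gzip -t "$vcf_gz" || return 1
    # VCF header lines start with '#'; every other line is counted.
    # grep -c exits 1 when the count is 0, so that case is tolerated.
    gzip -cd "$vcf_gz" | grep -vc '^#' || true
}

# Typical use after the freebayes command has finished:
#   count_vcf_records "$outdir/$region.vcf.gz"
```

A count of 0 on a valid file means only that no variant lines were written for that region; it does not by itself indicate an error.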

 

Documentation

https://github.com/ekg/freebayes