The XHMM C++ software suite calls copy number variation (CNV) from next-generation sequencing projects that used exome capture (or, more generally, targeted sequencing).
XHMM uses principal component analysis (PCA) normalization and a hidden Markov model (HMM) to detect and genotype CNVs from normalized read-depth data.
XHMM was explicitly designed to be used with targeted exome sequencing at high coverage (at least 60x - 100x) using Illumina HiSeq (or similar) sequencing of at least ~50 samples. However, no part of XHMM explicitly requires these particular experimental conditions, just high coverage of genomic regions for many samples.
How to Use
XHMM uses environment modules. Type
    module load XHMM
at the prompt. Then type
    xhmm -p params.txt
A params.txt file will need to be created. Here is an example:
    1e-8	6	70	-3	1.00	0	1.00	3	1.00
A parameters file consists of the following 9 tab-delimited values, in this order:
- Exome-wide CNV rate
- Mean number of targets in CNV
- Mean distance between targets within CNV (in KB)
- Mean of DELETION z-score distribution
- Standard deviation of DELETION z-score distribution
- Mean of DIPLOID z-score distribution
- Standard deviation of DIPLOID z-score distribution
- Mean of DUPLICATION z-score distribution
- Standard deviation of DUPLICATION z-score distribution
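Because the nine values must be separated by tabs, it is easy to create a params.txt that looks right but contains spaces. The following Python sketch writes the example values with real tab characters and reads them back as a check. The field names, the helper functions write_params and read_params, and the check that standard deviations are positive are all assumptions made for illustration; they are not part of XHMM.

```python
import os
import tempfile

# Illustrative labels for the nine values listed above, in file order.
# These names are hypothetical; XHMM itself only reads the values by position.
FIELDS = [
    "cnv_rate",
    "mean_targets_in_cnv",
    "mean_target_distance_kb",
    "deletion_mean",
    "deletion_sd",
    "diploid_mean",
    "diploid_sd",
    "duplication_mean",
    "duplication_sd",
]

# The example values from this page, kept as strings so they are written verbatim.
EXAMPLE = ["1e-8", "6", "70", "-3", "1.00", "0", "1.00", "3", "1.00"]


def write_params(path, values):
    """Write the nine values on a single tab-delimited line."""
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} values, got {len(values)}")
    with open(path, "w", newline="") as handle:
        handle.write("\t".join(values) + "\n")


def read_params(path):
    """Read the file back and return the values as floats keyed by FIELDS."""
    with open(path) as handle:
        tokens = handle.readline().rstrip("\n").split("\t")
    if len(tokens) != len(FIELDS):
        raise ValueError(
            f"expected {len(FIELDS)} tab-delimited values, got {len(tokens)}"
        )
    params = dict(zip(FIELDS, (float(token) for token in tokens)))
    # Assumed sanity check: a standard deviation should be positive.
    for name in ("deletion_sd", "diploid_sd", "duplication_sd"):
        if params[name] <= 0:
            raise ValueError(f"{name} must be positive")
    return params


if __name__ == "__main__":
    # Written to a temporary directory here; point this at your own
    # working directory when creating a real params.txt.
    path = os.path.join(tempfile.mkdtemp(), "params.txt")
    write_params(path, EXAMPLE)
    print(read_params(path))
```

If read_params raises an error about the number of values, the file most likely contains spaces where tabs are required.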
See the documentation below for more information.