High-Performance Computing at the NIH

GERMLINE

GERMLINE is a program for discovering long segments shared identical by descent (IBD) between pairs of individuals in a large population. It takes as input genotype or haplotype marker data for individuals (and, optionally, a known pedigree) and generates a list of all pairwise shared segments.

GERMLINE uses a novel hashing and extension algorithm that identifies segments in haplotype data in time proportional to the number of individuals. GERMLINE can currently run on phased or unphased data. We have found that performance is much better with phased data, and that phasing the data and then running GERMLINE is still significantly faster than comparable IBD algorithms. GERMLINE can identify shared segments of any specified length and can allow for any number of mismatching markers.
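The hashing and extension idea can be illustrated with a toy sketch. This is not GERMLINE's implementation: the function name shared_segments, the window and min_windows parameters, and the sample haplotypes are all hypothetical, and the sketch requires exact matches (no mismatching markers). Haplotypes are cut into fixed-size windows, each window is hashed into buckets of identical alleles, and runs of consecutive matching windows are merged into longer segments. Bucketing is linear in the number of individuals, although listing the pairs inside a bucket still grows with the bucket's size.

```python
from collections import defaultdict
from itertools import combinations


def shared_segments(haplotypes, window=4, min_windows=2):
    """Toy seed-and-extend search for identical runs between haplotypes.

    haplotypes: dict of sample name -> allele string (all the same length).
    Returns sorted (name_a, name_b, start_marker, end_marker) tuples,
    with end_marker exclusive.
    """
    n_markers = len(next(iter(haplotypes.values())))
    n_windows = n_markers // window

    # Seed step: hash every window; samples in one bucket match exactly there.
    matches = defaultdict(list)  # (a, b) -> ascending list of window indices
    for w in range(n_windows):
        buckets = defaultdict(list)
        for name, hap in haplotypes.items():
            buckets[hap[w * window:(w + 1) * window]].append(name)
        for names in buckets.values():
            for a, b in combinations(sorted(names), 2):
                matches[(a, b)].append(w)

    # Extension step: merge consecutive matching windows into segments.
    segments = []
    for (a, b), windows in matches.items():
        start = prev = windows[0]
        for w in windows[1:] + [None]:  # None flushes the final run
            if w is not None and w == prev + 1:
                prev = w
                continue
            if prev - start + 1 >= min_windows:
                segments.append((a, b, start * window, (prev + 1) * window))
            if w is not None:
                start = prev = w
    return sorted(segments)


# Made-up 16-marker haplotypes, for illustration only.
example = {
    "A": "0101110010110100",
    "B": "0101110010111111",
    "C": "1111110010110100",
}
result = shared_segments(example, window=4, min_windows=2)
```

Raising min_windows plays the role of a minimum segment length: with min_windows=3 only pairs that match over at least 12 consecutive markers are reported.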

GERMLINE was developed in Itsik Pe'er's lab at Columbia University. GERMLINE website.

For large numbers of Germline jobs, use Germline on Biowulf.

Sample session

In the session below, the user typed germline, 1, CEU.22.map, CEU.22.ped, and newtest; everything else is program output.

[user@helix myproject]$ germline
Please indicate which file format you'll be using for individuals
Enter 1 for PED / Plink
Enter 2 for PHASE / HapMap
1
Please enter the MAP file name
CEU.22.map
Please enter the PED file name
CEU.22.ped
Please indicate output file location:
newtest
0 SNPs have genetic distance
Read Markers 100%
Match Markers 100%
[user@helix myproject]
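For repeated runs, the answers to the prompts can be prepared in advance and supplied on standard input. The following is a minimal sketch, assuming that germline reads its answers from standard input in the order shown in the session above; the helper names build_answers and run_germline are hypothetical, and the file names are the ones from the session.

```python
import shutil
import subprocess


def build_answers(map_file, ped_file, output_prefix, file_format="1"):
    """Return the prompt answers, one per line, in the order the session asks.

    file_format "1" selects PED / Plink input, as in the sample session.
    """
    return "\n".join([file_format, map_file, ped_file, output_prefix]) + "\n"


def run_germline(answers, executable="germline"):
    """Feed the answers to germline on stdin (assumed behaviour).

    Returns None when the executable is not on PATH, otherwise the
    CompletedProcess with captured stdout and stderr.
    """
    if shutil.which(executable) is None:
        return None
    return subprocess.run(
        [executable],
        input=answers,
        text=True,
        capture_output=True,
        check=False,
    )


answers = build_answers("CEU.22.map", "CEU.22.ped", "newtest")
completed = run_germline(answers)
if completed is not None:
    print(completed.stdout)
```

Keeping the answers in a separate function makes it easy to check them before a run, or to write them to a file instead.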

Documentation

Germline website and documentation.