Matthew Stephens - Software for Haplotype Estimation etc
fastPHASE: software for haplotype reconstruction, and estimating missing genotypes from population data
The program fastPHASE implements methods described in
Scheet, P and Stephens, M (2006). A fast and flexible statistical model for large-scale population genotype data: applications to inferring missing genotypes and haplotypic phase. Am J Hum Genet (to appear)
fastPHASE can handle larger datasets than PHASE (e.g., hundreds of thousands of markers in thousands of individuals), but does not provide estimates of recombination rates. Our experiments suggest that its haplotype estimates are slightly less accurate than those from PHASE, but that its missing genotype estimates are similar to, or even slightly better than, those from PHASE.
Note that the download site is under construction, and the following links may not yet work.
The software is free for non-commercial use, and may be licensed for commercial use. To view the terms and conditions, and then proceed to download, click here.
PHASE: software for haplotype reconstruction, and recombination rate estimation from population data
The program PHASE implements methods for estimating haplotypes from population genotype data described in
Stephens, M., and Donnelly, P. (2003). A comparison of Bayesian methods for haplotype reconstruction from population genotype data. American Journal of Human Genetics, 73:1162-1169.
Stephens, M., Smith, N., and Donnelly, P. (2001). A new statistical method for haplotype reconstruction from population data. American Journal of Human Genetics, 68:978-989.
Stephens, M., and Scheet, P. (2005). Accounting for Decay of Linkage Disequilibrium in Haplotype Inference and Missing-Data Imputation. American Journal of Human Genetics, 76:449-462.
The software also incorporates methods for estimating recombination rates, and identifying recombination hotspots:
Crawford et al (2004). Evidence for substantial fine-scale variation in recombination rates across the human genome. Nature Genetics, to appear.
The software is free for non-commercial use, and may be licensed for commercial use. To view the terms and conditions, and then proceed to download, click here.
Instructions for PHASE are included on the download site, and are also available here.
SCAT: Smoothed and Continuous AssignmenTs
The program SCAT (Smoothed and Continuous AssignmenTs) implements a Bayesian statistical method for estimating allele frequencies and assigning samples of unknown (or known) origin across a continuous range of locations, based on genotypes collected at distinct sampling locations. In brief, the method assumes that allele frequencies vary smoothly across the study region. Allele frequencies at any given location are therefore estimated from the observed genotypes at nearby sampling locations, with data from the nearest sampling locations given the greatest weight. Details are given in
Wasser, S., et al (2004). PNAS, 41, 14844-14852.
SCAT is available here.
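The smoothing idea behind SCAT can be illustrated with a small sketch. This is not SCAT's Bayesian method, just a simple distance-weighted average that shows how data from nearer sampling locations can be given greater weight. The function name, the Gaussian weighting, and the bandwidth parameter are all assumptions made for illustration.

```python
import math


def weighted_allele_frequency(target, samples, bandwidth=100.0):
    """Estimate an allele frequency at `target` from nearby sampling locations.

    target:    (x, y) coordinates of the location of interest.
    samples:   list of ((x, y), allele_count, total_alleles) tuples, one per
               sampling location.
    bandwidth: distance scale of the Gaussian weights (hypothetical parameter;
               smaller values give nearby locations more influence).

    Illustrative only: SCAT itself fits a Bayesian spatial model.
    """
    weighted_count = 0.0
    weighted_total = 0.0
    for (x, y), allele_count, total_alleles in samples:
        squared_distance = (x - target[0]) ** 2 + (y - target[1]) ** 2
        # The weight decays smoothly with distance from the target location.
        weight = math.exp(-squared_distance / (2.0 * bandwidth ** 2))
        weighted_count += weight * allele_count
        weighted_total += weight * total_alleles
    if weighted_total == 0.0:
        raise ValueError("no sampling location is close enough to the target")
    return weighted_count / weighted_total


# Two sampling locations 1000 distance units apart, with different frequencies.
samples = [((0.0, 0.0), 8, 10), ((1000.0, 0.0), 2, 10)]
estimate_at_first_site = weighted_allele_frequency((0.0, 0.0), samples)
estimate_at_midpoint = weighted_allele_frequency((500.0, 0.0), samples)
```

At the first sampling location the estimate is dominated by that location's own data, while at the midpoint both locations contribute equally.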
HOTSPOTTER: software for identifying recombination hotspots from population SNP data
This software by Na Li implements methods from
Li, N., and Stephens, M. (2003). Modelling Linkage Disequilibrium, and identifying recombination hotspots using SNP data. Genetics, to appear.
It is available free from here.