Intra- and interpopulation genotype reconstruction from tagging SNPs
- Peristera Paschou1,4,6,
- Michael W. Mahoney2,5,
- Asif Javed3,
- Judith R. Kidd1,
- Andrew J. Pakstis1,
- Sheng Gu1,
- Kenneth K. Kidd1, and
- Petros Drineas3
- 1 Department of Genetics, Yale University School of Medicine, New Haven, Connecticut 06511, USA;
- 2 Department of Mathematics, Yale University, New Haven, Connecticut 06511, USA;
- 3 Department of Computer Science, Rensselaer Polytechnic Institute, Troy, New York 12180, USA
Abstract
The optimal method for selecting tagging SNPs (tSNPs), the applicability of a reference linkage disequilibrium (LD) map to unassayed populations, and the scalability of these methods to genome-wide analysis all remain subjects of debate. We propose novel, scalable matrix algorithms that address these issues, and we evaluate them on genotypic data from 38 populations and four genomic regions (248 SNPs typed for ∼2000 individuals). We also evaluate these algorithms on a second data set consisting of genotypes from the HapMap database (1336 SNPs for four populations) over the same genomic regions. Furthermore, we test these methods in a real association study using a publicly available family data set. The algorithms we use for tSNP selection and for reconstruction of unassayed SNPs do not require haplotype inference and are, in principle, scalable even to genome-wide analysis. Moreover, they are greedy variants of recently developed matrix algorithms with provable performance guarantees. Using a small set of carefully selected tSNPs, we achieve very good reconstruction accuracy of “untyped” genotypes for most of the populations studied. Additionally, we demonstrate quantitatively that the chosen tSNPs exhibit substantial transferability, both within and across geographic regions. Finally, we show that reconstruction can recover significant SNP associations with disease while substantially reducing the amount of genotyping required.
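The general idea of selecting a few informative SNP columns from a genotype matrix and then reconstructing the remaining columns can be illustrated with a small sketch. This is not the authors' algorithm: it assumes a −1/0/+1 genotype encoding, uses a simple greedy rule (pick the column with the largest residual norm, then project it out), and fits an ordinary least-squares map from the selected columns to all columns. The function names `select_tsnps`, `fit_reconstruction`, and `reconstruct`, and the synthetic data in which some SNPs are exact copies or sign flips of others, are all hypothetical and chosen only for illustration.

```python
import numpy as np

def select_tsnps(G, k):
    """Greedily pick up to k columns (SNPs) of the genotype matrix G.

    At each step, take the column with the largest residual norm after
    projecting out the columns chosen so far (an illustrative greedy rule,
    not the published method).
    """
    R = G.astype(float) - G.mean(axis=0)      # center each SNP column
    chosen = []
    for _ in range(k):
        norms = np.linalg.norm(R, axis=0)
        j = int(np.argmax(norms))
        if norms[j] < 1e-12:                  # nothing left to explain
            break
        chosen.append(j)
        q = R[:, j] / norms[j]
        R = R - np.outer(q, q @ R)            # remove the chosen direction
    return chosen

def fit_reconstruction(G_train, tsnps):
    """Least-squares map from tSNP genotypes (plus intercept) to all SNPs."""
    A = np.column_stack([np.ones(len(G_train)), G_train[:, tsnps]])
    X, *_ = np.linalg.lstsq(A, G_train, rcond=None)
    return X

def reconstruct(G_tsnps, X):
    """Predict all SNPs from typed tSNPs and round to the nearest genotype."""
    A = np.column_stack([np.ones(len(G_tsnps)), G_tsnps])
    return np.clip(np.rint(A @ X), -1, 1).astype(int)

def make_genotypes(rng, n):
    """Synthetic data (assumption): 3 independent SNPs plus 3 SNPs that are
    exact copies or sign flips of them, mimicking perfect LD."""
    base = rng.integers(-1, 2, size=(n, 3))   # genotypes in {-1, 0, 1}
    signs = np.array([1, 1, 1, -1, 1, -1])
    return base[:, [0, 1, 2, 0, 1, 2]] * signs

rng = np.random.default_rng(0)
G_train = make_genotypes(rng, 60)   # reference individuals, all SNPs typed
G_test = make_genotypes(rng, 20)    # new individuals, only tSNPs typed

tsnps = select_tsnps(G_train, k=3)
X = fit_reconstruction(G_train, tsnps)
G_pred = reconstruct(G_test[:, tsnps], X)

accuracy = float(np.mean(G_pred == G_test))
print("selected tSNP columns:", tsnps)
print("reconstruction accuracy on held-out individuals:", accuracy)
```

Because the synthetic SNPs are in perfect LD, three tSNPs are enough to recover every genotype in the held-out individuals. Real data would show imperfect LD and population differences, which is why the paper evaluates reconstruction accuracy and transferability empirically.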
Footnotes
4 Present address: Department of Molecular Biology and Genetics, Democritus University of Thrace, Alexandroupoli 68100, Greece.
5 Present address: Yahoo Research Labs, Sunnyvale, California 94089, USA.
6 Corresponding author. E-mail ppaschou{at}mbg.duth.gr; fax 30-25510-30613.
[Supplemental material is available online at www.genome.org.]
Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.5741407
- Received July 6, 2006.
- Accepted November 1, 2006.
Copyright © 2007, Cold Spring Harbor Laboratory Press