Efficient and accurate construction of genetic linkage maps from the minimum spanning tree of a graph
Yonghui Wu et al. PLoS Genet. 2008 Oct.
Abstract
Genetic linkage maps are cornerstones of a wide spectrum of biotechnology applications, including map-assisted breeding, association genetics, and map-assisted gene cloning. During the past several years, the adoption of high-throughput genotyping technologies has been paralleled by a substantial increase in the density and diversity of genetic markers. New genetic mapping algorithms are needed in order to efficiently process these large datasets and accurately construct high-density genetic maps. In this paper, we introduce a novel algorithm to order markers on a genetic linkage map. Our method is based on a simple yet fundamental mathematical property that we prove under rather general assumptions. The validity of this property allows one to determine efficiently the correct order of markers by computing the minimum spanning tree of an associated graph. Our empirical studies obtained on genotyping data for three mapping populations of barley (Hordeum vulgare), as well as extensive simulations on synthetic data, show that our algorithm consistently outperforms the best available methods in the literature, particularly when the input data are noisy or incomplete. The software implementing our algorithm is available in the public domain as a web tool under the name MSTmap.
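The core idea in the abstract, recovering marker order from a minimum spanning tree, can be illustrated with a minimal sketch. This is not the MSTmap implementation. It assumes, for illustration only, that the edge weight between two markers is the Hamming distance between their genotype strings, and that the data are clean enough for the MST to form a single path. The marker names, genotype strings, and function names are all hypothetical.

```python
def hamming(a, b):
    """Number of individuals whose genotype differs between two markers."""
    return sum(x != y for x, y in zip(a, b))


def prim_mst(weights):
    """Return the MST edges of a complete graph given a weight matrix (Prim)."""
    n = len(weights)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Cheapest edge leaving the current tree.
        u, v = min(
            ((i, j) for i in in_tree for j in range(n) if j not in in_tree),
            key=lambda e: weights[e[0]][e[1]],
        )
        edges.append((u, v))
        in_tree.add(v)
    return edges


def order_from_mst(edges, n):
    """If the MST is a simple path, return the nodes along it; else None."""
    adj = {i: [] for i in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    ends = [i for i in adj if len(adj[i]) == 1]
    if len(ends) != 2 or any(len(nb) > 2 for nb in adj.values()):
        return None  # the tree branches, so it does not define an order
    order, prev = [min(ends)], None
    while len(order) < n:
        cur = order[-1]
        nxt = next(x for x in adj[cur] if x != prev)
        prev = cur
        order.append(nxt)
    return order


# Hypothetical markers, given in shuffled order; one character per individual.
names = ["m3", "m1", "m4", "m2"]
genotypes = ["ABBBAABA", "AABBAABB", "ABBBBABA", "AABBAABA"]

weights = [[hamming(a, b) for b in genotypes] for a in genotypes]
order = order_from_mst(prim_mst(weights), len(names))
order_names = [names[i] for i in order]
print(order_names)
```

The recovered order is only defined up to reversal, since a path can be read from either end. When the MST branches, as it can with noisy or missing data, the sketch returns None rather than guessing.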
Conflict of interest statement
The authors have declared that no competing interests exist.
Figures
Figure 1. Average ρ (rho) for thirty runs on simulated data for several choices of the error rates (and no missing data).
The variable n represents the number of individuals, and m represents the number of markers.
Figure 2. An illustration of the MST-based algorithm.
(A) The MST obtained for a synthetic example; the MST is not a TSP yet; the backbone of the MST is shown with dotted edges. (B) An initial TSP obtained from the backbone (see text for details). The dotted edges represent marker pairs in the wrong order. Several local improvement operations are applied to further improve the TSP, namely 2-OPT (C1), node-relocation (C2) and block-optimize (C3). The final TSP is shown in (D).
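Of the local improvement operations named in the caption, 2-OPT is the simplest to illustrate. The sketch below is a generic 2-opt for an open path (reverse a segment whenever that shortens the path). It is not taken from MSTmap, and the distance matrix and function names are hypothetical.

```python
def path_length(order, dist):
    """Total length of the open path that visits nodes in the given order."""
    return sum(dist[a][b] for a, b in zip(order, order[1:]))


def two_opt(order, dist):
    """Repeatedly reverse segments while doing so shortens the path."""
    order = list(order)
    improved = True
    while improved:
        improved = False
        for i in range(len(order) - 1):
            for j in range(i + 1, len(order)):
                # Candidate order with the segment i..j reversed.
                candidate = order[:i] + order[i:j + 1][::-1] + order[j + 1:]
                if path_length(candidate, dist) < path_length(order, dist):
                    order, improved = candidate, True
    return order


# Hypothetical example: five markers at positions 0..4 on a line.
positions = [0, 1, 2, 3, 4]
dist = [[abs(p - q) for q in positions] for p in positions]

start = [0, 2, 1, 3, 4]          # markers 1 and 2 are swapped
result = two_opt(start, dist)
print(result, path_length(result, dist))
```

Recomputing the full path length for every candidate keeps the sketch short but is wasteful. A practical implementation would evaluate only the two edges that change.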
Figure 3. An example of a singleton (double crossover).
Each row refers to an individual and each column refers to a marker locus. Given the current order, the entry (c1, l7) appears to be a possible error because its state differs from both its immediately preceding and following markers.
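The singleton rule described in this caption can be sketched as a simple scan. This is a minimal illustration, assuming two genotype states and no missing data; the function name and the example rows are hypothetical, and how MSTmap treats flagged entries is not shown here.

```python
def find_singletons(genotypes):
    """Return (row, column) pairs whose state differs from both neighbours.

    Rows are individuals and columns are markers in the current order.
    The first and last columns are skipped: they have only one neighbour.
    """
    flagged = []
    for r, row in enumerate(genotypes):
        for c in range(1, len(row) - 1):
            if row[c] != row[c - 1] and row[c] != row[c + 1]:
                flagged.append((r, c))
    return flagged


# Hypothetical data: two individuals scored at ten markers.
rows = [
    "AAAAAABAAA",  # the lone B at the seventh marker is a singleton
    "AAABBBBBBB",  # one ordinary crossover, no singleton
]
singletons = find_singletons(rows)
print(singletons)
```

Indices are zero-based, so the first individual's seventh marker is reported as (0, 6).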
Figure 4. Running time of MSTmap and Record as a function of the error rate, the missing rate, or both.
Every point in the graph is an average of 30 runs. The lines “missing only” correspond to data sets with no errors (η = 0, γ is on the x-axis). Similarly, the lines “error only” correspond to data sets with no missing data (γ = 0, η is on the x-axis), and the lines “error and missing” correspond to data sets with equal missing rate and error rate (η = γ is on the x-axis).