Efficient and accurate construction of genetic linkage maps from the minimum spanning tree of a graph

Yonghui Wu et al. PLoS Genet. 2008 Oct.

Abstract

Genetic linkage maps are cornerstones of a wide spectrum of biotechnology applications, including map-assisted breeding, association genetics, and map-assisted gene cloning. During the past several years, the adoption of high-throughput genotyping technologies has been paralleled by a substantial increase in the density and diversity of genetic markers. New genetic mapping algorithms are needed in order to efficiently process these large datasets and accurately construct high-density genetic maps. In this paper, we introduce a novel algorithm to order markers on a genetic linkage map. Our method is based on a simple yet fundamental mathematical property that we prove under rather general assumptions. The validity of this property allows one to determine efficiently the correct order of markers by computing the minimum spanning tree of an associated graph. Our empirical studies obtained on genotyping data for three mapping populations of barley (Hordeum vulgare), as well as extensive simulations on synthetic data, show that our algorithm consistently outperforms the best available methods in the literature, particularly when the input data are noisy or incomplete. The software implementing our algorithm is available in the public domain as a web tool under the name MSTmap.
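The core idea in the abstract, ordering markers by computing the minimum spanning tree (MST) of an associated graph, can be illustrated with a minimal sketch. The edge weight used here (the number of individuals whose genotypes differ between two markers) and the function names `hamming`, `minimum_spanning_tree`, and `order_from_path` are assumptions made for illustration; they are not taken from the MSTmap software.

```python
def hamming(a, b):
    """Number of individuals whose genotypes differ between two markers."""
    return sum(x != y for x, y in zip(a, b))


def minimum_spanning_tree(markers):
    """Prim's algorithm on the complete marker graph; returns (i, j) edges."""
    n = len(markers)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Cheapest edge leaving the current tree (ties broken by index).
        _, i, j = min(
            (hamming(markers[i], markers[j]), i, j)
            for i in in_tree
            for j in range(n)
            if j not in in_tree
        )
        edges.append((i, j))
        in_tree.add(j)
    return edges


def order_from_path(edges, n):
    """Walk the tree end to end if it is a simple path; otherwise return None."""
    if n == 1:
        return [0]
    adj = {k: [] for k in range(n)}
    for i, j in edges:
        adj[i].append(j)
        adj[j].append(i)
    if any(len(neighbours) > 2 for neighbours in adj.values()):
        return None  # the tree branches, so it does not define an order by itself
    start = min(k for k, neighbours in adj.items() if len(neighbours) == 1)
    order, prev, cur = [start], None, start
    while len(order) < n:
        nxt = [v for v in adj[cur] if v != prev][0]
        order.append(nxt)
        prev, cur = cur, nxt
    return order


# Hypothetical data: each string is one marker scored across six individuals,
# listed in shuffled order.
markers = ["AABBBB", "AAAABB", "ABBBBB", "AAABBB"]
order = order_from_path(minimum_spanning_tree(markers), len(markers))
```

On clean data like this toy example the MST is a single path, so walking it from one end yields a marker order. When the tree branches, as it can with noisy or incomplete data, the sketch returns `None` and further processing is needed.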

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Average ρ (rho) for thirty runs on simulated data for several choices of the error rates (and no missing data).

The variable n represents the number of individuals, and m represents the number of markers.

Figure 2. An illustration of the MST-based algorithm.

(A) The MST obtained for a synthetic example; the MST is not yet a TSP path; the backbone of the MST is shown with dotted edges. (B) An initial TSP path obtained from the backbone (see text for details). The dotted edges represent marker pairs in the wrong order. Several local improvement operations are applied to further improve the TSP path, namely 2-OPT (C1), node-relocation (C2) and block-optimize (C3). The final TSP path is shown in (D).
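Of the local improvement operations named in the caption, 2-OPT is the most widely known. The following is a minimal sketch of the generic 2-opt heuristic on an open path, not the authors' implementation; the names `path_cost` and `two_opt` and the distance matrix `dist` are assumptions for illustration.

```python
def path_cost(order, dist):
    """Sum of distances between consecutive markers along the path."""
    return sum(dist[order[k]][order[k + 1]] for k in range(len(order) - 1))


def two_opt(order, dist):
    """Reverse segments of the path for as long as a reversal shortens it."""
    order = list(order)
    improved = True
    while improved:
        improved = False
        for i in range(len(order) - 1):
            for j in range(i + 1, len(order)):
                # Candidate path with the segment order[i..j] reversed.
                candidate = order[:i] + order[i:j + 1][::-1] + order[j + 1:]
                if path_cost(candidate, dist) < path_cost(order, dist):
                    order = candidate
                    improved = True
    return order


# Hypothetical example: four markers at positions 0..3 on a line,
# starting from an order with markers 1 and 2 swapped.
dist = [[abs(a - b) for b in range(4)] for a in range(4)]
fixed = two_opt([0, 2, 1, 3], dist)
```

This version recomputes the full path cost for each candidate, which keeps the sketch short but is slower than an incremental update of the two affected edges.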

Figure 3. An example of a singleton (double crossover).

Each row refers to an individual and each column refers to a marker locus. Given the current order, the entry (c1, l7) appears to be a possible error because its state differs from the states at both the immediately preceding and the immediately following markers.
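The pattern described in this caption, an entry whose state differs from both of its neighbours in the current marker order, can be flagged with a simple scan. The sketch below assumes complete data (no missing genotypes) and uses a hypothetical function name, `find_singletons`; it shows only the basic pattern and says nothing about how the suspect entries are subsequently handled.

```python
def find_singletons(genotypes):
    """Return (row, column) pairs whose state differs from both neighbours.

    Each row is one individual; columns are markers in the current order.
    """
    suspects = []
    for r, row in enumerate(genotypes):
        # The first and last columns have only one neighbour and are skipped.
        for c in range(1, len(row) - 1):
            if row[c] != row[c - 1] and row[c] != row[c + 1]:
                suspects.append((r, c))
    return suspects


# Hypothetical data: the first individual has an isolated "B" at column 6;
# the second has a single genuine crossover between columns 2 and 3.
genotypes = ["AAAAAABAAA", "AAABBBBBBB"]
suspects = find_singletons(genotypes)
```

A genuine single crossover changes the state once and is not flagged, whereas an isolated state change would require two crossovers in adjacent intervals, which is why it is treated as a likely genotyping error.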

Figure 4. Running time of MSTmap and Record with respect to the error rate, the missing rate, or both.

Every point in the graph is an average of 30 runs. The lines “missing only” correspond to data sets with no errors (η = 0, γ is on the _x_-axis). Similarly, the lines “error only” correspond to data sets with no missing data (γ = 0, η is on the _x_-axis), and the lines “error and missing” correspond to data sets with equal missing rate and error rate (η = γ is on the _x_-axis).
