Likelihood-mapping: a simple method to visualize phylogenetic content of a sequence alignment

K Strimmer et al. Proc Natl Acad Sci U S A. 1997.

Abstract

We introduce a graphical method, likelihood-mapping, to visualize the phylogenetic content of a set of aligned sequences. The method is based on an analysis of the maximum likelihoods for the three fully resolved tree topologies that can be computed for four sequences. The three likelihoods are represented as one point inside an equilateral triangle. The triangle is partitioned into different regions. One region represents star-like evolution, three regions represent a well-resolved phylogeny, and three regions reflect the situation where it is difficult to distinguish between two of the three trees. The location of the likelihoods in the triangle defines the mode of sequence evolution. If n sequences are analyzed, then the likelihoods for each subset of four sequences are mapped onto the triangle. The resulting distribution of points shows whether the data are suitable for a phylogenetic reconstruction or not.
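One way to turn the three quartet likelihoods into a single point is to normalize them so that they sum to one. The sketch below assumes that normalization, p_i = L_i / (L_1 + L_2 + L_3), and works from log-likelihoods for numerical stability. The function name and the example values are illustrative and are not taken from the paper.

```python
import math

def quartet_probabilities(log_likelihoods):
    """Normalize three log-likelihoods into a probability vector (p1, p2, p3)."""
    if len(log_likelihoods) != 3:
        raise ValueError("expected exactly three log-likelihoods")
    # Subtract the maximum before exponentiating to avoid underflow.
    largest = max(log_likelihoods)
    weights = [math.exp(ll - largest) for ll in log_likelihoods]
    total = sum(weights)
    return tuple(w / total for w in weights)

# Hypothetical log-likelihoods for topologies T1, T2, T3.
p = quartet_probabilities([-1203.4, -1210.9, -1211.2])
print(p)
```

Working in log space matters here because likelihoods of real alignments are far too small to exponentiate directly.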

Figures

Figure 1

The fully resolved tree topologies $T_1$, $T_2$, and $T_3$ connecting four sequences $S_1$, $S_2$, $S_3$, and $S_4$.

Figure 2

Map of the probability vector $P = (p_1, p_2, p_3)$ onto an equilateral triangle. Barycentric coordinates are used, i.e., the lengths of the perpendiculars from point $P$ to the triangle sides are equal to the probabilities $p_i$. The corners $T_1$, $T_2$, and $T_3$ represent the three quartet topologies with corresponding coordinates (probabilities) (1, 0, 0), (0, 1, 0), and (0, 0, 1).
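The barycentric mapping in this caption can be sketched as follows. The sketch assumes a triangle of unit height (so that the three perpendicular distances sum to one) and an arbitrary corner placement with T1 at the apex; both choices, and all names, are illustrative rather than the authors' plotting convention.

```python
import math

HEIGHT = 1.0
SIDE = 2.0 * HEIGHT / math.sqrt(3.0)

# Assumed corner placement: T1 at the apex, T2 bottom-left, T3 bottom-right.
CORNERS = ((SIDE / 2.0, HEIGHT), (0.0, 0.0), (SIDE, 0.0))

def to_cartesian(p):
    """Map a probability vector (p1, p2, p3) to a point (x, y) in the triangle."""
    if abs(sum(p) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to one")
    x = sum(pi * cx for pi, (cx, _) in zip(p, CORNERS))
    y = sum(pi * cy for pi, (_, cy) in zip(p, CORNERS))
    return x, y

def distance_to_side(point, a, b):
    """Perpendicular distance from point to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = point, a, b
    cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
    return abs(cross) / math.hypot(bx - ax, by - ay)

point = to_cartesian((0.5, 0.3, 0.2))
t1, t2, t3 = CORNERS
# The distance to the side opposite corner Ti should equal p_i.
print(distance_to_side(point, t2, t3))
print(distance_to_side(point, t1, t3))
print(distance_to_side(point, t1, t2))
```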

Figure 3

(A) Basins of attraction for the three topologies $T_1$, $T_2$, and $T_3$. The gray area shows the region where the probability for tree $T_1$ is largest. In the center $c = (1/3, 1/3, 1/3)$ all trees are equally likely; at the points $x_{12} = (1/2, 1/2, 0)$, $x_{13} = (1/2, 0, 1/2)$, and $x_{23} = (0, 1/2, 1/2)$ two trees have the same likelihood, whereas the remaining one has probability zero. (B) The seven basins of attraction, allowing not only fully resolved trees but also the star phylogeny and three regions where it is not possible to decide between two topologies. The dots indicate the corresponding seven attractors. $A_1$, $A_2$, and $A_3$ are the tree-like regions, $A_{12}$, $A_{13}$, and $A_{23}$ are the net-like regions, and $A_*$ is the star-like area.
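A possible way to assign a point to one of the seven regions in panel B is to pick the nearest attractor. The sketch below assumes Euclidean distance in probability coordinates as the criterion; the caption gives the attractor coordinates but not the exact region boundaries, so treat the rule, the labels, and the function name as assumptions.

```python
# Attractors from the caption: three corners, three edge midpoints, the center.
ATTRACTORS = {
    "A1": (1.0, 0.0, 0.0),
    "A2": (0.0, 1.0, 0.0),
    "A3": (0.0, 0.0, 1.0),
    "A12": (0.5, 0.5, 0.0),
    "A13": (0.5, 0.0, 0.5),
    "A23": (0.0, 0.5, 0.5),
    "A*": (1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0),
}

def classify(p):
    """Return the label of the attractor closest to p (assumed nearest-attractor rule)."""
    def squared_distance(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(ATTRACTORS, key=lambda name: squared_distance(p, ATTRACTORS[name]))

print(classify((0.90, 0.05, 0.05)))  # close to the T1 corner
print(classify((0.40, 0.35, 0.25)))  # close to the center
```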

Figure 4

Effect of sequence length (50, 100, 200, and 500 bp) on the distribution of P vectors for a simulated data set with 16 sequences. (Upper) Sequences evolving along a perfect star phylogeny. (Lower) Sequences evolving along a completely resolved tree. Sequences evolved according to the Jukes–Cantor model, with 0.1 substitutions per site per branch. Each triangle shows the result of one simulation, for which all 1,820 possible P vectors were computed. For the tree-like data (Lower), the number of P vectors appears to decrease with increasing sequence length because identical P vectors fall on top of each other. Longer sequences increase the chance that the probability of the favored tree equals one, so most of the 1,820 P vectors are superimposed in the corners of the triangles (cf. Table 1).
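The figure of 1,820 P vectors is the number of ways to choose four sequences out of 16. A minimal sketch of the quartet enumeration, with hypothetical sequence labels:

```python
from itertools import combinations
from math import comb

def all_quartets(sequence_ids):
    """Yield every unordered subset of four sequence identifiers."""
    return combinations(sequence_ids, 4)

# Hypothetical labels for the 16 simulated sequences.
ids = [f"seq{i}" for i in range(1, 17)]
quartets = list(all_quartets(ids))
print(len(quartets), comb(16, 4))
```

Because the count grows as n choose 4, enumerating every quartet becomes expensive for large alignments.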

Figure 5

Likelihood-mapping analysis for two biological data sets. (Upper) The distribution patterns. (Lower) The occupancies (in percent) for the seven areas of attraction. (A) Cytochrome-b data from ref. . (B) Ribosomal DNA of major arthropod groups (15).
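The occupancies in the lower panels are percentages of points falling in each of the seven areas. A sketch of that tally, assuming each quartet has already been given one of seven region labels (the label strings and the function name are illustrative):

```python
from collections import Counter

REGIONS = ("A1", "A2", "A3", "A12", "A13", "A23", "A*")

def occupancies(labels):
    """Return the percentage of labels falling in each of the seven regions."""
    labels = list(labels)
    if not labels:
        raise ValueError("at least one label is required")
    counts = Counter(labels)
    return {region: 100.0 * counts[region] / len(labels) for region in REGIONS}

# Hypothetical labels for four quartets.
occ = occupancies(["A1", "A1", "A*", "A12"])
print(occ)
```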

Figure 6

Four-cluster likelihood-mapping of ribosomal DNA (15). Sequences were split into four disjoint groups; misc. represents the nonarthropod sequences. The corners of the triangle are labeled with the corresponding tree topologies.
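For the four-cluster variant, a natural reading is that each quartet takes exactly one sequence from each of the four disjoint groups. The sketch below enumerates quartets under that assumption; the group contents and the function name are hypothetical.

```python
from itertools import product

def four_cluster_quartets(group_a, group_b, group_c, group_d):
    """Return all quartets with exactly one sequence from each group."""
    return list(product(group_a, group_b, group_c, group_d))

# Hypothetical groups of sequence labels.
quartets = four_cluster_quartets(["a1", "a2"], ["b1", "b2", "b3"], ["c1"], ["d1", "d2"])
print(len(quartets))
```

Under this reading, the three topologies correspond to the three ways of pairing the four groups, which is why the triangle corners can be labeled with group-level trees.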

References

    1. Swofford D L, Olsen G J, Waddell P J, Hillis D M. In: Molecular Systematics. Hillis D M, Moritz C, Mable B K, editors. Sunderland, MA: Sinauer; 1995. pp. 407–514.
    2. Bandelt H-J, Dress A. Adv Math. 1992;92:47–105.
    3. Dopazo J, Dress A, von Haeseler A. Proc Natl Acad Sci USA. 1993;90:10320–10324.
    4. von Haeseler A, Churchill G A. J Mol Evol. 1993;37:77–85.
    5. Eigen M, Winkler-Oswatitsch R, Dress A. Proc Natl Acad Sci USA. 1988;85:5913–5917.
