Polymorphism and concerted evolution in a tandemly repeated gene family: 5S ribosomal DNA in diploid and allopolyploid cottons
Related papers
Multigene families have provided opportunities for evolutionary biologists to assess molecular evolution processes and phylogenetic reconstructions at deep and shallow systematic levels. However, the use of these markers is not free of technical and analytical challenges. Many evolutionary studies that used the nuclear 5S rDNA gene family rarely used contiguous 5S coding sequences due to the routine use of head-to-tail polymerase chain reaction primers that are anchored to the coding region. Moreover, the 5S coding sequences have been concatenated with independent, adjacent gene units in many studies, creating simulated chimeric genes as the raw data for evolutionary analysis. This practice is based on the tacitly assumed, but rarely tested, hypothesis that strict intra-locus concerted evolution processes are operating in 5S rDNA genes, without any empirical evidence as to whether it holds for the recovered data. The potential pitfalls of analysing the patterns of molecular evolution and reconstructing phylogenies based on these chimeric genes have not been assessed to date. Here, we compared the sequence integrity and phylogenetic behavior of entire versus concatenated 5S coding regions from a real data set obtained from closely related plant species (Medicago, Fabaceae). Our results suggest that within arrays sequence homogenization is partially operating in the 5S coding region, which is traditionally assumed to be highly conserved. Consequently, concatenating 5S genes increases haplotype diversity, generating novel chimeric genotypes that most likely do not exist within the genome. In addition, the patterns of gene evolution are distorted, leading to incorrect haplotype relationships in some evolutionary reconstructions. [5S rDNA; concerted evolution; Medicago; multigene families; phylogeny.]
An overview of evolution in plant 5S DNA
Plant Systematics and Evolution, 1992
The DNA sequence properties of 5S DNA (5S RNA gene plus spacer) from a wide range of families of plants are reviewed with particular reference to the possibility of using the information for phylogenetic inference. Although the database is extremely limited, the available evidence suggests that within a subclass or tribe phylogenetic inference can be made, provided that a knowledge about the number of chromosomal locations of the gene loci (5SDna loci) is available. The evidence suggests little, if any, exchange occurs between the 5S DNA units at different chromosomal loci and the available data favour a mechanism involving amplification/deletion processes for creating structural changes at the 5SDna loci. Sequences originating from species in the families Rosaceae, Poaceae, and Brassicaceae tended to group together in cladistic analyses but with low confidence limits. Surprisingly little of the spacer region showed conservation of sequence that may relate to a function in the control of transcription by RNA polymerase III.
Heredity, 2013
In higher eukaryotes, the 5S rRNA genes occur in tandem units and are arranged either separately (S-type arrangement) or linked to other repeated genes, in most cases to the rDNA locus encoding 18S-5.8S-26S genes (L-type arrangement). Here we used Southern blot hybridisation, PCR and sequencing approaches to analyse genomic organisation of rRNA genes in all large gymnosperm groups, including Coniferales, Ginkgoales, Gnetales and Cycadales. The data are provided for 27 species (21 genera). The 5S units linked to the 35S rDNA units occur in some but not all Gnetales, Coniferales and in Ginkgo (~30% of the species analysed), while the remaining exhibit separate organisation. The linked 5S rRNA genes may occur as single-copy insertions or as short tandems embedded in the 26S-18S rDNA intergenic spacer (IGS). The 5S transcript may be encoded by the same (Ginkgo, Ephedra) or opposite (Podocarpus) DNA strand as the 18S-5.8S-26S genes. In addition, pseudogenised 5S copies were also found in some IGS types. Both L- and S-type units have been largely homogenised across the genomes. Phylogenetic relationships based on the comparison of 5S coding sequences suggest that the 5S genes independently inserted into the IGS at least three times in the course of gymnosperm evolution. Frequent transpositions and rearrangements of basic units indicate relatively relaxed selection pressures imposed on genomic organisation of 5S genes in plants.
PLoS ONE, 2012
Simple sequence repeat (SSR) markers are widely used tools for inferences about genetic diversity, phylogeography and spatial genetic structure. Their applications assume that variation among alleles is essentially caused by an expansion or contraction of the number of repeats and that, accessorily, mutations in the target sequences follow the stepwise mutation model (SMM). Generally speaking, PCR amplicon sizes are used as direct indicators of the number of SSR repeats composing an allele with the data analysis either ignoring the extent of allele size differences or assuming that there is a direct correlation between differences in amplicon size and evolutionary distance. However, without precisely knowing the kind and distribution of polymorphism within an allele (SSR and the associated flanking region (FR) sequences), it is hard to say what kind of evolutionary message is conveyed by such a synthetic descriptor of polymorphism as DNA amplicon size. In this study, we sequenced several SSR alleles in multiple populations of three divergent tree genera and disentangled the types of polymorphisms contained in each portion of the DNA amplicon containing an SSR. The patterns of diversity provided by amplicon size variation, SSR variation itself, insertions/deletions (indels), and single nucleotide polymorphisms (SNPs) observed in the FRs were compared. Amplicon size variation largely reflected SSR repeat number. The amount of variation was as large in FRs as in the SSR itself. The former contributed significantly to the phylogenetic information and sometimes was the main source of differentiation among individuals and populations contained by FR and SSR regions of SSR markers. 
The presence of mutations occurring at different rates within a marker's sequence offers the opportunity to analyse evolutionary events occurring on various timescales, but at the same time calls for caution in the interpretation of SSR marker data when the distribution of within-locus polymorphism is not known.
Incongruent Patterns of Local and Global Genome Size Evolution in Cotton
Genome Research, 2004
Genome sizes in plants vary over several orders of magnitude, reflecting a combination of differentially acting local and global forces such as biases in indel accumulation and transposable element proliferation or removal. To gain insight into the relative role of these and other forces, ∼105 kb of contiguous sequence surrounding the cellulose synthase gene CesA1 was compared for the two coresident genomes (A_T and D_T) of the allopolyploid cotton species, Gossypium hirsutum. These two genomes differ approximately twofold in size, having diverged from a common ancestor ∼5-10 million years ago (Mya) and been reunited in the same nucleus at the time of polyploid formation, ∼1-2 Mya. Gene content, order, and spacing are largely conserved between the two genomes, although a few transposable elements and a single cpDNA fragment distinguish the two homoeologs. Sequence conservation is high in both intergenic and genic regions, with 14 conserved genes detected in both genomes yielding a density of 1 gene every 7.5 kb. In contrast to the twofold overall difference in DNA content, no disparity in size was observed for this 105-kb region, and 555 indels were detected that distinguish the two homoeologous BACs, approximately equally distributed between A_T and D_T in number and aggregate size. The data demonstrate that genome size evolution at this phylogenetic scale is not primarily caused by mechanisms that operate uniformly across different genomic regions and components; instead, the twofold overall difference in DNA content must reflect locally operating forces between gene islands or in largely gene-free regions.
AoB PLANTS, 2015
Several genome duplications have been identified in the evolution of seed plants, providing unique systems for studying karyological processes promoting diversification and speciation. Knowledge about the number of ribosomal DNA (rDNA) loci, together with their chromosomal distribution and structure, provides clues about organismal and molecular evolution at various phylogenetic levels. In this work, we aim to elucidate the evolutionary dynamics of karyological and rDNA site-number variation in all known taxa of subtribe Vellinae, showing a complex scenario of ancestral and more recent polyploid events. Specifically, we aim to infer the ancestral chromosome numbers and patterns of chromosome number variation, assess patterns of variation of both 45S and 5S rDNA families, trends in site-number change of rDNA loci within homoploid and polyploid series, and reconstruct the evolutionary history of rDNA site number using a phylogenetic hypothesis as a framework. The best-fitting model of chromosome number evolution with a high likelihood score suggests that the Vellinae core showing x = 17 chromosomes arose by duplication events from a recent x = 8 ancestor. Our survey suggests more complex patterns of polyploid evolution than previously noted for Vellinae. High polyploidization events (6x, 8x) arose independently in the basal clade Vella castrilensis-V. lucentina, where extant diploid species are unknown. Reconstruction of ancestral rDNA states in Vellinae supports the inference that the ancestral number of loci in the subtribe was two for each multigene family, suggesting that an overall tendency towards a net loss of 5S rDNA loci occurred during the splitting of Vellinae ancestors from the remaining Brassiceae lineages. A contrasting pattern for rDNA site change in both paleopolyploid and neopolyploid species was linked to diversification of Vellinae lineages. 
This suggests dynamic and independent changes in rDNA site number during speciation processes and a significant lack of correlation between 45S and 5S rDNA evolutionary pathways.
Restless 5S: The re-arrangement(s) and evolution of the nuclear ribosomal DNA in land plants
Molecular Phylogenetics and Evolution, 2011
Among eukaryotes two types of nuclear ribosomal DNA (nrDNA) organization have been observed. Either all components, i.e. the small ribosomal subunit, 5.8S, large ribosomal subunit, and 5S occur tandemly arranged or the 5S rDNA forms a separate cluster of its own. Generalizations based on data derived from just a few model organisms have led to a superimposition of structural and evolutionary traits to the entire plant kingdom asserting that plants generally possess separate arrays.
The evolution of ribosomal DNA: divergent paralogues and phylogenetic implications
1997
Although nuclear ribosomal DNA (rDNA) repeats evolve together through concerted evolution, some genomes contain a considerable diversity of paralogous rDNA. This diversity includes not only multiple functional loci but also putative pseudogenes and recombinants. We examined the occurrence of divergent paralogues and recombinants in Gossypium, Nicotiana, Tripsacum, Winteraceae, and Zea ribosomal internal transcribed spacer (ITS) sequences. Some of the divergent paralogues are probably rDNA pseudogenes, since they have low predicted secondary structure stability, high substitution rates, and many deamination-driven substitutions at methylation sites. Under standard PCR conditions, the low-stability paralogues amplified well, while many high-stability paralogues amplified poorly. Under highly denaturing PCR conditions (i.e., with dimethylsulfoxide), both low- and high-stability paralogues amplified well. We also found recombination between divergent paralogues. For phylogenetics, divergent ribosomal paralogues can aid in reconstructing ancestral states and thus serve as good outgroups. Divergent paralogues can also provide companion rDNA phylogenies. However, phylogeneticists must discriminate among families of divergent paralogues and recombinants or suffer from muddled and …
Evolutionary hierarchies of conserved blocks in 5'-noncoding sequences of dicot rbcS genes
BMC Evolutionary Biology, 2007
Background: Evolutionary processes in gene regulatory regions are major determinants of organismal evolution, but exceptionally challenging to study. We explored the possibilities of evolutionary analysis of phylogenetic footprints in 5'-noncoding sequences (NCS) from 27 ribulose-1,5-bisphosphate carboxylase small subunit (rbcS) genes, from three dicot families (Brassicaceae, Fabaceae and Solanaceae).