Elizabeth Kellogg - Academia.edu
Papers by Elizabeth Kellogg
Proceedings of the National Academy of Sciences, 1997
Homeodomain proteins are transcription factors that play a critical role in early development in eukaryotes. These proteins previously have been classified into numerous subgroups whose phylogenetic relationships are unclear. Our phylogenetic analysis of representative eukaryotic sequences suggests that there are two major groups of homeodomain proteins, each containing sequences from angiosperms, metazoa, and fungi. This result, based on parsimony and neighbor-joining analyses of primary amino acid sequences, was supported by two additional features of the proteins. The two protein groups are distinguished by an insertion/deletion in the homeodomain, between helices I and II. In addition, an amphipathic alpha-helical secondary structure in the region N-terminal of the homeodomain is shared by angiosperm and metazoan sequences in one group. These results support the hypothesis that there was at least one duplication of homeobox genes before the origin of angiosperms, fungi, and metazoa. This duplication, in turn, suggests that these proteins had diverse functions early in the evolution of eukaryotes. The shared secondary structure in angiosperm and metazoan sequences points to an ancient conserved functional domain.
Plant Physiol, 2006
Formal description of plant phenotypes and standardized annotation of gene expression and protein localization data require uniform terminology that accurately describes plant anatomy and morphology. This facilitates cross-species comparative studies and quantitative comparison of phenotypes and expression patterns. A major drawback is variable terminology that is used to describe plant anatomy and morphology in publications and genomic databases for different species. The same terms are sometimes applied to different plant structures in different taxonomic groups. Conversely, similar structures are named by their species-specific terms. To address this problem, we created the Plant Structure Ontology (PSO), the first generic ontological representation of anatomy and morphology of a flowering plant. The PSO is intended for a broad plant research community, including bench scientists, curators in genomic databases, and bioinformaticians. The initial releases of the PSO integrated existing ontologies for Arabidopsis (Arabidopsis thaliana), maize (Zea mays), and rice (Oryza sativa); more recent versions of the ontology encompass terms relevant to Fabaceae, Solanaceae, additional cereal crops, and poplar (Populus spp.). Databases such as The Arabidopsis Information Resource, Nottingham Arabidopsis Stock Centre, Gramene, MaizeGDB, and SOL Genomics Network are using the PSO to describe expression patterns of genes and phenotypes of mutants and natural variants and are regularly contributing new annotations to the Plant Ontology database. The PSO is also used in specialized public databases, such as BRENDA, GENEVESTIGATOR, NASCArrays, and others. Over 10,000 gene annotations and phenotype descriptions from participating databases can be queried and retrieved using the Plant Ontology browser.
The PSO, as well as contributed gene associations, can be obtained at www.plantontology.org.
Systematic Biology, Apr 1, 2004
The internal transcribed spacer (ITS) of nuclear ribosomal DNA has been widely used by systematists for reconstructing phylogenies of closely related taxa. Although the occurrence of ITS putative pseudogenes is well documented for many groups of animals and plants, the potential utility of these pseudogenes in phylogenetic analyses has often been underestimated or even ignored in part because of deletions that make unambiguous alignment difficult. In addition, long branches often can lead to spurious relationships, particularly in parsimony analyses. We have discovered unusually high levels of ITS polymorphism (up to 30%, 40%, and 14%, respectively) in three tropical tree species of the coffee family (Rubiaceae), Adinauclea fagifolia, Haldina cordifolia, and Mitragyna rubrostipulata. Both secondary structure stability and patterns of nucleotide substitutions in a highly conserved region (5.8S gene) were used for distinguishing presumed functional sequences from putative pseudogenes. The combination of both criteria was the most powerful approach. The sequences from A. fagifolia appear to be a mix of functional genes and highly distinct putative pseudogenes, whereas those from H. cordifolia and M. rubrostipulata were identified as putative pseudogenes. We explored the potential utility of the identified putative pseudogenes in the phylogenetic analyses of Naucleeae sensu lato. Both Bayesian and parsimony trees identified the same monophyletic groups and indicated that the polymorphisms do not transcend species boundaries, implying that they do not predate the divergence of these three species. The resulting trees are similar to those produced by previous analyses of chloroplast genes.
In contrast to results of previous studies, therefore, divergent putative pseudogenes can be useful for phylogenetic analyses, especially when no sequences of their functional counterparts are available. Our studies clearly show that ITS polymorphism may not necessarily mislead phylogenetic inference. Despite using many different PCR conditions (different primers, higher denaturing temperatures, and absence or presence of DMSO and BSA-TMACl), we recovered only a few functional ITS copies from A. fagifolia and none from H. cordifolia and M. rubrostipulata, which suggests that PCR selection is occurring and/or the presumed functional alleles are located at minor loci (with few ribosomal DNA copies).
To estimate the evolutionary history of the mustard family (Brassicaceae or Cruciferae), we sampled 113 species, representing 101 of the roughly 350 genera and 17 of the 19 tribes of the family, for the chloroplast gene ndhF. The included accessions increase the number of genera sampled over previous phylogenetic studies by four-fold. Using parsimony, likelihood, and Bayesian methods, we reconstructed the phylogeny of the gene and used the Shimodaira-Hasegawa test (S-H test) to compare the phylogenetic results with the most recent tribal classification for the family. The resultant phylogeny allowed a critical assessment of variations in fruit morphology and seed anatomy, upon which the current classification is based. We also used the S-H test to examine the utility of trichome branching patterns for describing monophyletic groups in the ndhF phylogeny. Our phylogenetic results indicate that 97 of 114 ingroup accessions fall into one of 21 strongly supported clades. Some of these clades can themselves be grouped into strongly to moderately supported monophyletic groups. One of these lineages is a novel grouping overlooked in previous phylogenetic studies. Results comparing 30 different scenarios of evolution by the S-H test indicate that five of 12 tribes represented by two or more genera in the study are clearly polyphyletic, although a few tribes are not sampled well enough to establish para- or polyphyly. In addition, branched trichomes likely evolved independently several times in the Brassicaceae, although malpighiaceous and stellate trichomes may each have a single origin.
Nar, 2007
The Plant Ontology Consortium (POC, http://www.plantontology.org) is a collaborative effort among model plant genome databases and plant researchers that aims to create, maintain and facilitate the use of a controlled vocabulary (ontology) for plants. The ontology allows users to ascribe attributes of plant structure (anatomy and morphology) and developmental stages to data types, such as genes and phenotypes, to provide a semantic framework to make meaningful cross-species and database comparisons. The POC builds upon groundbreaking work by the Gene Ontology Consortium (GOC) by adopting and extending the GOC's principles, existing software and database structure. Over the past year, POC has added hundreds of ontology terms to associate with thousands of genes and gene products from Arabidopsis, rice and maize, which are available through a newly updated web-based browser (http://www.plantontology.org/amigo/go.cgi) for viewing, searching and querying. The Consortium has also implemented new functionalities to facilitate the application of PO in genomic research and updated the website to keep the contents current.
Computational Biology, 2008
The Plant Structure Ontology (PSO) is a controlled vocabulary of anatomy and morphology of a generic flowering plant, developed by the Plant Ontology Consortium (POC). The main goal of the POC was to reduce the problem of heterogeneity of terminology used to describe comparable object types in plant genomic databases. PSO provides a standardized set of terms describing anatomical and morphological
Molecular Biology and Evolution
An exception to the generally conservative nature of plastid gene evolution is the gene coding for the β″ subunit of RNA polymerase, rpoC2. Previous work by others has shown that maize and rice have an insertion in the coding region of rpoC2, relative to spinach and tobacco. To assess the distribution of this extra coding sequence, we surveyed a broad phylogenetic sample comprising 55 species from 17 angiosperm families by using Southern hybridization.
American Journal of Botany
American journal of botany, 2014
The Food and Agriculture Organization (FAO) predicts that food production must rise 70% over the next 40 years to meet the demands of a growing population that is expected to reach nine billion by the year 2050. Many facets of basic plant science promoted by the Botanical Society of America are important for agriculture; however, more explicit connections are needed to bridge the gap between basic and applied plant research. This special issue, Speaking of Food: Connecting Basic and Applied Plant Science, was conceived to showcase productive overlaps of basic and applied research to address the challenges posed by feeding billions of people and to stimulate more research, fresh connections, and new paradigms. Contributions to this special issue thus illustrate some interactive areas of study in plant science: historical and modern plant-human interaction, crop and weed origins and evolution, and the effects of natural and artificial selection on crops and their wild relatives. These ...
Genetics
We designate a region of the alcohol dehydrogenase locus (Adh) of the weedy crucifer, Arabidopsis thaliana, as "hypervariable" on the basis of a comparison of sequences from ecotypes Columbia and Landsberg. We found eight synonymous and two replacement mutations in the first 262 nucleotides of exon 4, and an additional two mutations in the contiguous region of intron 3. The rest of the sequence (2611 bp) has just three mutations, all of them confined to noncoding regions. Our survey of the hypervariable region among 37 ecotypes of A. thaliana revealed two predominant haplotypes, corresponding to the Columbia and Landsberg sequences. We identified five additional haplotypes and 4 additional segregating sites. The lack of haplotype diversity is presumably in part a function of low rates of recombination between haplotypes conferred by A. thaliana's tendency to self-fertilize. However, an analysis in 32 ecotypes of 12 genome-wide polymorphic markers distinguishing Columbia and Landsberg ecotypes indicated levels of outcrossing sufficient at least to erode linkage disequilibrium between dispersed markers. We discuss possible evolutionary explanations for the coupled observation of marked divergence within the hypervariable region and a lack of haplotype diversity among ecotypes. The sequence of the region for closely related species argues against the possibility that one allele is the product of introgression. We note (1) that several loss-of-function mutations (both naturally and chemically induced) map to the hypervariable region, and (2) the presence of two amino acid replacement polymorphisms, one of which causes the mobility difference between the two major classes of A. thaliana Adh electrophoretic alleles. We argue that protein polymorphism in such a functionally significant part of the molecule may be subject to balancing selection.
The observed pattern of extensive divergence between the alleles is consistent with this explanation because balancing selection on a particular site maintains linked neutral polymorphisms at intermediate frequencies.
Evolution & Development
LEAFY HULL STERILE1 (LHS1) is an MIKC-type MADS-box gene in the SEPALLATA class. Expression patterns of LHS1 homologs vary among species of grasses, and may be involved in determining palea and lemma morphology, specifying the terminal floret of the spikelet, and sex determination. Here we present LHS1 expression data from Eleusine indica (subfamily Chloridoideae) and Megathyrsus maximus (subfamily Panicoideae) to provide further insights into the hypothesized roles of the gene. E. indica has spikelets with three to eight florets that mature acropetally; E. indica LHS1 (EiLHS1) is expressed in the palea and lemma of all florets. In contrast, M. maximus has spikelets with two florets that mature basipetally; M. maximus LHS1 (MmLHS1) is expressed in the palea and lemma of the distal floret only. These data are consistent with the hypothesis that LHS1 plays a role in determining palea and lemma morphology and specifies the terminal floret of basipetally maturing grass spikelets. Howeve...
American journal of botany, 2015