Sorel Fitz-gibbon | University of California, Los Angeles

Papers by Sorel Fitz-gibbon

A novel Rieske iron-sulfur protein from the hyperthermophilic crenarchaeon Pyrobaculum aerophilum: sequencing of the gene, expression in E. coli and characterization of the protein

Journal of Bioenergetics and Biomembranes, 1999

The crenarchaeon Pyrobaculum aerophilum, with an optimal growth temperature of 100 °C, is one of the most thermophilic organisms known to possess an aerobic respiratory chain. Analysis of DNA sequences from the Pyrobaculum genome project led to the identification of an open reading frame potentially coding for a Rieske iron-sulfur protein. The complete gene (named parR) was cloned and sequenced. The deduced amino acid sequence displays unusual amino acid exchanges and a previously unknown sequence insertion. The N-terminus shows similarities to bacterial signal sequences. Several forms of the gene were expressed in E. coli in order to verify the classification as a Rieske protein and to facilitate biophysical studies. Soluble, thermostable proteins with correctly inserted iron-sulfur clusters were expressed from two versions of the gene. The delta1-23 truncated holo-protein is redox active and displays the typical spectroscopic properties of a Rieske protein. The redox potent...

From Stars to Genes: An Integrated Study of the Prospects for Life in the Cosmos

A novel uracil-DNA glycosylase with broad substrate specificity and an unusual active site

The EMBO Journal, 2002

Uracil-DNA glycosylases (UDGs) catalyse the removal of uracil by flipping it out of the double helix into their binding pockets, where the glycosidic bond is hydrolysed by a water molecule activated by a polar amino acid. Interestingly, the four known UDG families differ in their active site make-up. The activating residues in UNG and SMUG enzymes are aspartates; thermostable UDGs resemble UNG-type enzymes but carry glutamate rather than aspartate residues in their active sites; and the less active MUG/TDG enzymes contain an active site asparagine. We now describe the first member of a fifth UDG family, Pa-UDGb from the hyperthermophilic crenarchaeon Pyrobaculum aerophilum, the active site of which lacks the polar residue that was hitherto thought to be essential for catalysis. Moreover, Pa-UDGb is the first member of the UDG family that efficiently catalyses the removal of an aberrant purine, hypoxanthine, from DNA. We postulate that this enzyme has evolved to counteract the mutagenic threat of cytosine and adenine deamination, which becomes particularly acute in organisms living at elevated temperatures.

Biochemical Characterization of Uracil Processing Activities in the Hyperthermophilic Archaeon Pyrobaculum aerophilum

Journal of Biological Chemistry, 2001

Deamination of cytosine to uracil and 5-methylcytosine to thymine represents a major mutagenic threat, particularly at high temperatures. In double-stranded DNA, these spontaneous hydrolytic reactions give rise to G·U and G·T mispairs, respectively, that must be restored to G·C pairs prior to the next round of DNA replication; if left unrepaired, 50% of progeny DNA would acquire G·C → A·T transition mutations. The genome of the hyperthermophilic archaeon Pyrobaculum aerophilum has been recently shown to encode a protein, Pa-MIG, a member of the endonuclease III family, capable of processing both G·U and G·T mispairs. We now show that this latter activity is undetectable in crude extracts of P. aerophilum. However, uracil residues in G·U mispairs, in A·U pairs, and in single-stranded DNA were efficiently removed in these extracts. These activities were assigned to a ~22-kDa polypeptide named Pa-UDG (P. aerophilum uracil-DNA glycosylase). The recombinant Pa-UDG protein is highly thermostable and displays a considerable degree of homology to the recently described uracil-DNA glycosylases from Archaeoglobus fulgidus and Thermotoga maritima. Interestingly, neither Pa-MIG nor Pa-UDG was inhibited by UGI, a generic inhibitor of the UNG family of uracil glycosylases. Yet a small fraction of the total uracil processing activity present in crude extracts of P. aerophilum was inhibited by this peptide. This implies that the hyperthermophilic archaeon possesses at least a three-pronged defense against the mutagenic threat of hydrolytic deamination of cytosines in its genomic DNA.

The Euryarchaeota, Nature's Medium for Engineering of Single-stranded DNA-binding Proteins

Journal of Biological Chemistry, 2005

The architecture of single-stranded DNA-binding proteins, which play key roles in DNA metabolism, is based on different combinations of the oligonucleotide/oligosaccharide binding (OB) fold. Whereas the polypeptide serving this function in bacteria contains one OB fold, the eukaryotic functional homolog comprises a complex of three proteins, each harboring at least one OB fold. Here we show that unlike these groups of organisms, the Euryarchaeota has exploited the potential in the OB fold to re-invent single-stranded DNA-binding proteins many times. However, the most common form is a protein with two OB folds and one zinc finger domain. We created several deletion mutants of this protein based on its conserved motifs, and from these structures functional chimeras were synthesized, supporting the hypothesis that gene duplication and recombination could lead to novel functional forms of single-stranded DNA-binding proteins. Biophysical studies showed that the orthologs of the two OB fold/one zinc finger replication protein A in Methanosarcina acetivorans and Methanopyrus kandleri exhibit two binding modes, wrapping and stretching of DNA. However, the ortholog in Ferroplasma acidarmanus possessed only the stretching mode. Most interestingly, a second single-stranded DNA-binding protein, FacRPA2, in this archaeon exhibited the wrapping mode. Domain analysis of this protein, which contains a single OB fold, showed that its architecture is similar to the functional homologs thought to be unique to the Crenarchaeotes. Most unexpectedly, genes coding for similar proteins were found in the genomes of eukaryotes, including humans. Although the diversity shown by archaeal single-stranded DNA-binding proteins is unparalleled, the presence of their simplest form in many organisms across all domains of life is of greater evolutionary consequence.

Terminal addition, the Cambrian radiation and the Phanerozoic evolution of bilaterian form

Evolution & Development, 2005

We examine terminal addition, the process of addition of serial elements in a posterior subterminal growth zone during animal development, across modern taxa and fossil material. We argue that terminal addition was the basal condition in Bilateria, and that modification of terminal addition was an important component of the rapid Cambrian evolution of novel bilaterian morphology. We categorize the often-convergent modifications of terminal addition from the presumed ancestral condition. Our focus on terminal addition and its modification highlights trends in the history of animal evolution evident in the fossil record. These trends appear to be the product of departure from the initial terminal addition state, as is evident in evolutionary patterns within fossil groups such as trilobites, but are also more generally related to shifts in types of morphologic change through the early Phanerozoic. Our argument is contingent on dates of metazoan divergence that are roughly convergent with the first appearance of metazoan fossils in the latest Proterozoic and Cambrian, as well as on an inference of homology of terminal addition across bilaterian Metazoa.

Genome-wide analysis on Chlamydomonas reinhardtii reveals impact of hydrogen peroxide on protein stress responses and overlap with other stress transcriptomes

The Plant Journal: for Cell and Molecular Biology, Jan 16, 2015

Reactive oxygen species (ROS) are produced by and have the potential to be damaging to all aerobic organisms. In photosynthetic organisms, they are an unavoidable byproduct of electron transfer in both the chloroplast and mitochondrion. We employ the reference unicellular green alga, Chlamydomonas reinhardtii, to identify the effect of H2O2 on gene expression by monitoring the transcriptome changes in a timecourse experiment. Comparison of transcriptomes from cells sampled immediately prior to addition of H2O2, and 0.5 and 1 h subsequently revealed 1278 differentially abundant transcripts. Of those transcripts that increase in abundance, many encode proteins involved in ROS detoxification, protein degradation and stress-responses, whereas among those that decrease are transcripts encoding proteins involved in photosynthesis and central carbon metabolism. In addition to these transcriptomic adjustments, we observe that H2O2 addition is followed by an accumulation and oxidation of...

Dynamic changes in the transcriptome and methylome of Chlamydomonas reinhardtii throughout its life cycle

Chlamydomonas Genome Resource for Laboratory Strains Reveals a Mosaic of Sequence Variation, Identifies True Strain Histories, and Enables Strain-Specific Studies

The Plant Cell, Jan 25, 2015

Chlamydomonas reinhardtii is a widely used reference organism in studies of photosynthesis, cilia, and biofuels. Most research in this field uses a few dozen standard laboratory strains that are reported to share a common ancestry, but exhibit substantial phenotypic differences. In order to facilitate ongoing Chlamydomonas research and explain the phenotypic variation, we mapped the genetic diversity within these strains using whole-genome resequencing. We identified 524,640 single nucleotide variants and 4812 structural variants among 39 commonly used laboratory strains. Nearly all (98.2%) of the total observed genetic diversity was attributable to the presence of two, previously unrecognized, alternate haplotypes that are distributed in a mosaic pattern among the extant laboratory strains. We propose that these two haplotypes are the remnants of an ancestral cross between two strains with ∼2% relative divergence. These haplotype patterns create a fingerprint for each strain that f...

Distinct Shifts in Microbiota Composition during Drosophila Aging Impair Intestinal Function and Drive Mortality

Cell Reports, 2015

Highlights: Age-related dysbiosis in Drosophila is characterized by Gammaproteobacteria expansion. Dysbiosis predicts age-onset intestinal barrier dysfunction and rapid health decline. Age-related dysbiosis drives changes in excretory function. Loss of commensal control following intestinal barrier dysfunction drives mortality.

Systems-level analysis of N-starvation induced TAG accumulation in a Chlamydomonas starchless mutant

The Plant Cell

To understand the molecular basis underlying increased triacylglycerol (TAG) accumulation in starchless (sta) Chlamydomonas reinhardtii mutants, we undertook comparative time-course transcriptomics of strains CC-4348 (sta6 mutant), CC-4349, a cell wall-deficient (cw) strain purported to represent the parental STA6 strain, and three independent STA6 strains generated by complementation of sta6 (CC-4565/STA6-C2, CC-4566/STA6-C4, and CC-4567/STA6-C6) in the context of N deprivation. Despite N starvation-induced dramatic remodeling of the transcriptome, there were relatively few differences (5 × 10²) observed between sta6 and STA6, the most dramatic of which were increased abundance of transcripts encoding key regulated or rate-limiting steps in central carbon metabolism, specifically isocitrate lyase, malate synthase, transaldolase, fructose bisphosphatase and phosphoenolpyruvate carboxykinase (encoded by ICL1, MAS1, TAL1, FBP1, and PCK1 respectively), suggestive of increased carbon movement toward hexose-phosphate in sta6 by upregulation of the glyoxylate pathway and gluconeogenesis. Enzyme assays validated the increase in isocitrate lyase and malate synthase activities. Targeted metabolite analysis indicated increased succinate, malate, and Glc-6-P and decreased Fru-1,6-bisphosphate, illustrating the effect of these changes. Comparisons of independent data sets in multiple strains allowed the delineation of a sequence of events in the global N starvation response in C. reinhardtii, starting within minutes with the upregulation of alternative N assimilation routes and carbohydrate synthesis and subsequently a more gradual upregulation of genes encoding enzymes of TAG synthesis. Finally, genome resequencing analysis indicated that (1) the deletion in sta6 extends into the neighboring gene encoding respiratory burst oxidase, and (2) a commonly used STA6 strain (CC-4349) as well as the sequenced reference (CC-503) are not congenic with respect to sta6 (CC-4348), underscoring the importance of using complemented strains for more rigorous assignment of phenotype to genotype.

Genome-wide gene order distances support clustering the gram-positive bacteria

Frontiers in Microbiology, 2015

Initially using 143 genomes, we developed a method for calculating the pair-wise distance between prokaryotic genomes using a Monte Carlo method to estimate the conservation of gene order. The method was based on repeatedly selecting five or six non-adjacent random orthologs from each of two genomes and determining if the chosen orthologs were in the same order. The raw distances were then corrected for gene order convergence using an adaptation of the Jukes-Cantor model, as well as using the common distance correction D' = -ln(1-D). First, we compared the distances found via the order of six orthologs to distances found based on ortholog gene content and small subunit rRNA sequences. The Jukes-Cantor gene order distances are reasonably well correlated with the divergence of rRNA (R² = 0.24), especially at rRNA Jukes-Cantor distances of less than 0.2 (R² = 0.52). Gene content is only weakly correlated with rRNA divergence (R² = 0.04) over all distances; however, it is especially strongly correlated at rRNA Jukes-Cantor distances of less than 0.1 (R² = 0.67). This initial work suggests that gene order may be useful in conjunction with other methods to help understand the relatedness of genomes. Using the gene order distances in 143 genomes, the relations of prokaryotes were studied using neighbor joining and agreement subtrees. We then repeated our study of the relations of prokaryotes using gene order in 172 complete genomes better representing a wider diversity of prokaryotes. Consistently, our trees show the Actinobacteria as a sister group to the bulk of the Firmicutes. In fact, the robustness of gene order support was found to be considerably greater for uniting these two phyla than for uniting any of the proteobacterial classes together. The results are supportive of the idea that Actinobacteria and Firmicutes are closely related, which in turn implies a single origin for the gram-positive cell.
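
The sampling idea described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`gene_order_distance`, `log_corrected`) are hypothetical, genomes are assumed to be simple linear lists of ortholog identifiers, "same order" is taken to mean identical relative order of the sampled orthologs, and the non-adjacency requirement, strand orientation, chromosome circularity and the Jukes-Cantor adaptation are all left out. Only the correction D' = -ln(1-D) is taken directly from the text.

```python
import math
import random


def gene_order_distance(order_a, order_b, k=6, trials=10000, seed=0):
    """Monte Carlo estimate of the fraction of k-ortholog samples whose
    relative order differs between two genomes (assumed linear lists)."""
    rng = random.Random(seed)
    # Position of each ortholog identifier in each genome.
    pos_a = {gene: i for i, gene in enumerate(order_a)}
    pos_b = {gene: i for i, gene in enumerate(order_b)}
    shared = sorted(set(pos_a) & set(pos_b))
    if len(shared) < k:
        raise ValueError("not enough shared orthologs to sample")
    mismatches = 0
    for _ in range(trials):
        sample = rng.sample(shared, k)
        # Arrange the same k orthologs by their position in each genome.
        by_a = sorted(sample, key=pos_a.__getitem__)
        by_b = sorted(sample, key=pos_b.__getitem__)
        if by_a != by_b:
            mismatches += 1
    return mismatches / trials


def log_corrected(d):
    """Distance correction D' = -ln(1 - D); infinite when D reaches 1."""
    if d >= 1.0:
        return math.inf
    return -math.log(1.0 - d)
```

Under these assumptions, two identical gene orders give a raw distance of 0.0, and a fully reversed order gives 1.0, because the sketch ignores orientation. A real analysis would need an explicit definition of "same order" for circular chromosomes.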

Phylogenetic profiling

Encyclopedia of Genetics, Genomics, Proteomics and Bioinformatics, 2004

Cloning of a functional 25-hydroxyvitamin D-1α-hydroxylase in zebrafish (Danio rerio)

Cell Biochemistry and Function, 2014

Activation of precursor 25-hydroxyvitamin D3 (25D) to hormonal 1,25-dihydroxyvitamin D3 (1,25D) is a pivotal step in vitamin D physiology, catalysed by the enzyme 25-hydroxyvitamin D-1α-hydroxylase (1α-hydroxylase). To establish new models for assessing the physiological importance of the 1α-hydroxylase-25D-axis, we used Danio rerio (zebrafish) to characterize expression and biological activity of the gene for 1α-hydroxylase (cyp27b1). Treatment of day 5 zebrafish larvae with inactive 25D (5-150 nM) or active 1,25D (0.1-10 nM) induced dose responsive expression (15-95-fold) of the vitamin D-target gene cyp24a1 relative to larvae treated with vehicle, suggesting the presence of Cyp27b1 activity. A full-length zebrafish cyp27b1 cDNA was then generated using RACE and RT-PCR methods. Sequencing of the resulting clone revealed an open reading frame encoding a protein of 505 amino acids with 54% identity to human CYP27B1. Transfection of a cyp27b1 expression vector into HKC-8, a human kid...

Systems-Level Analysis of Nitrogen Starvation-Induced Modifications of Carbon Metabolism in a Chlamydomonas reinhardtii Starchless Mutant

The Plant Cell, 2013

To understand the molecular basis underlying increased triacylglycerol (TAG) accumulation in starchless (sta) Chlamydomonas reinhardtii mutants, we undertook comparative time-course transcriptomics of strains CC-4348 (sta6 mutant), CC-4349, a cell wall-deficient (cw) strain purported to represent the parental STA6 strain, and three independent STA6 strains generated by complementation of sta6 (CC-4565/STA6-C2, CC-4566/STA6-C4, and CC-4567/STA6-C6) in the context of N deprivation. Despite N starvation-induced dramatic remodeling of the transcriptome, there were relatively few differences (5 × 10²) observed between sta6 and STA6, the most dramatic of which were increased abundance of transcripts encoding key regulated or rate-limiting steps in central carbon metabolism, specifically isocitrate lyase, malate synthase, transaldolase, fructose bisphosphatase and phosphoenolpyruvate carboxykinase (encoded by ICL1, MAS1, TAL1, FBP1, and PCK1 respectively), suggestive of increased carbon movement toward hexose-phosphate in sta6 by upregulation of the glyoxylate pathway and gluconeogenesis. Enzyme assays validated the increase in isocitrate lyase and malate synthase activities. Targeted metabolite analysis indicated increased succinate, malate, and Glc-6-P and decreased Fru-1,6-bisphosphate, illustrating the effect of these changes. Comparisons of independent data sets in multiple strains allowed the delineation of a sequence of events in the global N starvation response in C. reinhardtii, starting within minutes with the upregulation of alternative N assimilation routes and carbohydrate synthesis and subsequently a more gradual upregulation of genes encoding enzymes of TAG synthesis. Finally, genome resequencing analysis indicated that (1) the deletion in sta6 extends into the neighboring gene encoding respiratory burst oxidase, and (2) a commonly used STA6 strain (CC-4349) as well as the sequenced reference (CC-503) are not congenic with respect to sta6 (CC-4348), underscoring the importance of using complemented strains for more rigorous assignment of phenotype to genotype.

Genome sequence of the hyperthermophilic crenarchaeon Pyrobaculum aerophilum

Proceedings of the National Academy of Sciences, 2002

We determined and annotated the complete 2.2-megabase genome sequence of Pyrobaculum aerophilum, a facultatively aerobic nitrate-reducing hyperthermophilic (T_opt = 100 °C) crenarchaeon. Clues were found suggesting explanations of the organism's surprising intolerance to sulfur, which may aid in the development of methods for genetic studies of the organism. Many interesting features worthy of further genetic studies were revealed. Whole genome computational analysis confirmed experiments showing that P. aerophilum (and perhaps all crenarchaea) lack 5′ untranslated regions in their mRNAs and thus appear not to use a ribosome-binding site (Shine-Dalgarno)-based mechanism for translation initiation at the 5′ end of transcripts. Inspection of the lengths and distribution of mononucleotide repeat-tracts revealed some interesting features. For instance, it was seen that mononucleotide repeat-tracts of Gs (or Cs) are highly unstable, a pattern expected for an organism deficient in mismatch repair. This result, together with an independent study on mutation rates, suggests a "mutator" phenotype.

Pyrobaculum aerophilum is a hyperthermophilic (T_max = 104 °C, T_opt = 100 °C) and metabolically versatile member of the crenarchaea, which are predominantly anaerobic respirers. Unlike most hyperthermophiles, P. aerophilum can withstand the presence of oxygen, growing efficiently in microaerobic conditions, thus making it relatively easy to work with in the laboratory. Unlike most of its phylogenetic neighbors, the growth of P. aerophilum is inhibited by the presence of elemental sulfur, but it grows well anaerobically using nitrate reduction (1). Here we have determined the complete genome sequence of P. aerophilum IM2, which was isolated from a boiling marine water hole at Maronti Beach, Italy (1). We obtained the sequence by a low coverage random shotgun sequencing strategy, with gap closure and resolution of ambiguities aided by the creation of a genomic fosmid map (2). We present an overview of the features and content of the genome, including a possible explanation of the organism's intolerance to sulfur and evidence of a possible lack of mismatch repair activity. Studies of the genus Pyrobaculum provide important opportunities for understanding the boundaries of life in extreme habitats. In a recent molecular sampling of a deep subsurface geothermal water pool, the only organisms detected were hyperthermophilic archaeal members closely related to Pyrobaculum (3).

Metagenomic signatures of the Peru Margin subseafloor biosphere show a genetically distinct environment

Proceedings of the National Academy of Sciences, 2008

The subseafloor marine biosphere may be one of the largest reservoirs of microbial biomass on Earth and has recently been the subject of debate in terms of the composition of its microbial inhabitants, particularly on sediments from the Peru Margin. A metagenomic analysis was made by using whole-genome amplification and pyrosequencing of sediments from Ocean Drilling Program Site 1229 on the Peru Margin to further explore the microbial diversity and overall community composition within this environment. A total of 61.9 Mb of genetic material was sequenced from sediments at horizons 1, 16, 32, and 50 m below the seafloor. These depths include sediments from both primarily sulfate-reducing and methane-generating regions of the sediment column. Many of the annotated genes, including those encoding ribosomal proteins, corresponded to those from the Chloroflexi and Euryarchaeota. However, analysis of the 16S small-subunit ribosomal genes suggests that Crenarchaeota are the abundant microbial member. Quantitative PCR confirms that uncultivated Crenarchaeota are indeed a major microbial group in these subsurface samples. These findings show that the marine subsurface is a distinct microbial habitat and is different from environments studied by metagenomics, especially because of the predominance of uncultivated archaeal groups.

Selecting protein targets for structural genomics of Pyrobaculum aerophilum: Validating automated fold assignment methods by using binary hypothesis testing

Proceedings of the National Academy of Sciences, 2000

Pyrobaculum Genome. Predicted coding region sequences of the PA genome were obtained from the Jeffrey H. Miller Laboratory of the University of California Los Angeles Molecular Biology Institute and correspond to the 1/1/99 version of the genome. This version contained 2,681 open reading frames (ORFs) predicted to code for proteins.

Membrane-Spanning Proteins. Of the 2,681 PA ORFs, 551 contained membrane-spanning α-helices as determined by MOMENT (7) (PAHOME/TRANSMEMBRANEHELIX PREDICTIONRESULTS). These proteins were excluded from fold recognition and novel fold prediction analysis.

Protein Sequence Databases. The Online Mendelian Inheritance in Man (OMIM) database containing 15,743 sequences was downloaded from the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/Omim/; authored and edited by V. A. McKusick and his colleagues at Johns Hopkins and elsewhere). Similarity searches of the OMIM database were performed by using a local implementation of the Smith-Waterman algorithm (8) with probability values determined by Waterman-Vingron statistics (9, 10).
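
For readers unfamiliar with the Smith-Waterman algorithm mentioned in this excerpt, a minimal score-only sketch is given here. It is an illustration only and not the local implementation used in the paper: the function name `smith_waterman_score` is hypothetical, the match, mismatch and linear gap scores are arbitrary assumptions (protein searches normally use a substitution matrix and affine gap penalties), and the Waterman-Vingron probability estimates are not reproduced.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b.

    Uses the standard Smith-Waterman recurrence with a linear gap
    penalty; the scoring values are illustrative assumptions.
    """
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero, which is what makes the
            # alignment local rather than global.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

Keeping only two rows of the matrix is enough when just the score is needed. Recovering the aligned residues would require storing the full matrix and tracing back from the highest-scoring cell.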

Whole genome-based phylogenetic analysis of free-living microorganisms

Nucleic Acids Research, 1999

A phylogenetic 'tree of life' has been constructed based on the observed presence and absence of families of protein-encoding genes observed in 11 complete genomes of free-living microorganisms. Past attempts to reconstruct the evolutionary relationships of microorganisms have been limited to sets of genes rather than complete genomes. Despite apparent rampant lateral gene transfer among microorganisms, these results indicate a single robust underlying evolutionary history for these organisms. Broadly, the tree produced is very similar to the small subunit rRNA tree although several additional phylogenetic relationships appear to be resolved, including the relationship of Archaeoglobus to the methanogens studied. This result is in contrast to notions that a robust phylogenetic reconstruction of microorganisms is impossible due to their genomes being composed of an incomprehensible amalgam of genes with complicated histories and suggests that this style of genome-wide phylogenetic analysis could become an important method for studying the ancient diversification of life on Earth. Analyses using informational and operational subsets of the genes showed that this 'tree of life' is not dependent on the phylogenetically more consistent informational genes.

Research paper thumbnail of A novel Rieske iron-sulfur protein from the hyperthermophilic crenarchaeon Pyrobaculum aerophilum: sequencing of the gene, expression in E. coli and characterization of the protein

Journal of bioenergetics and biomembranes, 1999

The crenarchaeon Pyrobaculum aerophilum is with an optimal growth temperature of 100 degrees C on... more The crenarchaeon Pyrobaculum aerophilum is with an optimal growth temperature of 100 degrees C one of the most thermophilic organisms known to possess an aerobic respiratory chain. The analysis of DNA sequences from the Pyrobaculum genome project lead to the identification of an open reading frame potentially coding for a Rieske iron-sulfur protein. The complete gene (named parR) was cloned and sequenced. The deduced amino acid sequence displays unusual amino acid exchanges and a so far unknown sequence insertion. The N-terminus shows similarities to bacterial signal sequences. Several forms of the gene were expressed in E. coli in order to verify the classification as a Rieske protein and to facilitate biophysical studies. Soluble, thermo-stable proteins with correctly inserted iron-sulfur clusters were expressed from two versions of the gene. The delta1-23 truncated holo-protein is redox active. It displays the typical spectroscopic properties of a Rieske protein. The redox potent...

Research paper thumbnail of From Stars to Genes: An Integrated Study of the Prospects for Life in the Cosmos

Research paper thumbnail of A novel Rieske iron-sulfur protein from the hyperthermophilic crenarchaeon Pyrobaculum aerophilum: sequencing of the gene, expression in E. coli and characterization of the protein

Journal of bioenergetics and biomembranes, 1999

The crenarchaeon Pyrobaculum aerophilum is with an optimal growth temperature of 100 degrees C on... more The crenarchaeon Pyrobaculum aerophilum is with an optimal growth temperature of 100 degrees C one of the most thermophilic organisms known to possess an aerobic respiratory chain. The analysis of DNA sequences from the Pyrobaculum genome project lead to the identification of an open reading frame potentially coding for a Rieske iron-sulfur protein. The complete gene (named parR) was cloned and sequenced. The deduced amino acid sequence displays unusual amino acid exchanges and a so far unknown sequence insertion. The N-terminus shows similarities to bacterial signal sequences. Several forms of the gene were expressed in E. coli in order to verify the classification as a Rieske protein and to facilitate biophysical studies. Soluble, thermo-stable proteins with correctly inserted iron-sulfur clusters were expressed from two versions of the gene. The delta1-23 truncated holo-protein is redox active. It displays the typical spectroscopic properties of a Rieske protein. The redox potent...

A novel uracil-DNA glycosylase with broad substrate specificity and an unusual active site

The EMBO Journal, 2002

Uracil-DNA glycosylases (UDGs) catalyse the removal of uracil by flipping it out of the double helix into their binding pockets, where the glycosidic bond is hydrolysed by a water molecule activated by a polar amino acid. Interestingly, the four known UDG families differ in their active site make-up. The activating residues in UNG and SMUG enzymes are aspartates; thermostable UDGs resemble UNG-type enzymes but carry glutamate rather than aspartate residues in their active sites; and the less active MUG/TDG enzymes contain an active site asparagine. We now describe the first member of a fifth UDG family, Pa-UDGb from the hyperthermophilic crenarchaeon Pyrobaculum aerophilum, the active site of which lacks the polar residue that was hitherto thought to be essential for catalysis. Moreover, Pa-UDGb is the first member of the UDG family that efficiently catalyses the removal of an aberrant purine, hypoxanthine, from DNA. We postulate that this enzyme has evolved to counteract the mutagenic threat of cytosine and adenine deamination, which becomes particularly acute in organisms living at elevated temperatures.

Biochemical Characterization of Uracil Processing Activities in the Hyperthermophilic Archaeon Pyrobaculum aerophilum

Journal of Biological Chemistry, 2001

Deamination of cytosine to uracil and 5-methylcytosine to thymine represents a major mutagenic threat, particularly at high temperatures. In double-stranded DNA, these spontaneous hydrolytic reactions give rise to G·U and G·T mispairs, respectively, that must be restored to G·C pairs prior to the next round of DNA replication; if left unrepaired, 50% of progeny DNA would acquire G·C → A·T transition mutations. The genome of the hyperthermophilic archaeon Pyrobaculum aerophilum has been recently shown to encode a protein, Pa-MIG, a member of the endonuclease III family, capable of processing both G·U and G·T mispairs. We now show that this latter activity is undetectable in crude extracts of P. aerophilum. However, uracil residues in G·U mispairs, in A·U pairs, and in single-stranded DNA were efficiently removed in these extracts. These activities were assigned to a ~22-kDa polypeptide named Pa-UDG (P. aerophilum uracil-DNA glycosylase). The recombinant Pa-UDG protein is highly thermostable and displays a considerable degree of homology to the recently described uracil-DNA glycosylases from Archaeoglobus fulgidus and Thermotoga maritima. Interestingly, neither Pa-MIG nor Pa-UDG was inhibited by UGI, a generic inhibitor of the UNG family of uracil glycosylases. Yet a small fraction of the total uracil processing activity present in crude extracts of P. aerophilum was inhibited by this peptide. This implies that the hyperthermophilic archaeon possesses at least a three-pronged defense against the mutagenic threat of hydrolytic deamination of cytosines in its genomic DNA.

The Euryarchaeota, Nature's Medium for Engineering of Single-stranded DNA-binding Proteins

Journal of Biological Chemistry, 2005

The architecture of single-stranded DNA-binding proteins, which play key roles in DNA metabolism, is based on different combinations of the oligonucleotide/oligosaccharide binding (OB) fold. Whereas the polypeptide serving this function in bacteria contains one OB fold, the eukaryotic functional homolog comprises a complex of three proteins, each harboring at least one OB fold. Here we show that unlike these groups of organisms, the Euryarchaeota has exploited the potential in the OB fold to re-invent single-stranded DNA-binding proteins many times. However, the most common form is a protein with two OB folds and one zinc finger domain. We created several deletion mutants of this protein based on its conserved motifs, and from these structures functional chimeras were synthesized, supporting the hypothesis that gene duplication and recombination could lead to novel functional forms of single-stranded DNA-binding proteins. Biophysical studies showed that the orthologs of the two OB fold/one zinc finger replication protein A in Methanosarcina acetivorans and Methanopyrus kandleri exhibit two binding modes, wrapping and stretching of DNA. However, the ortholog in Ferroplasma acidarmanus possessed only the stretching mode. Most interestingly, a second single-stranded DNA-binding protein, FacRPA2, in this archaeon exhibited the wrapping mode. Domain analysis of this protein, which contains a single OB fold, showed that its architecture is similar to the functional homologs thought to be unique to the Crenarchaeotes. Most unexpectedly, genes coding for similar proteins were found in the genomes of eukaryotes, including humans. Although the diversity shown by archaeal single-stranded DNA-binding proteins is unparalleled, the presence of their simplest form in many organisms across all domains of life is of greater evolutionary consequence.

Terminal addition, the Cambrian radiation and the Phanerozoic evolution of bilaterian form

Evolution & Development, 2005

We examine terminal addition, the process of addition of serial elements in a posterior subterminal growth zone during animal development, across modern taxa and fossil material. We argue that terminal addition was the basal condition in Bilateria, and that modification of terminal addition was an important component of the rapid Cambrian evolution of novel bilaterian morphology. We categorize the often-convergent modifications of terminal addition from the presumed ancestral condition. Our focus on terminal addition and its modification highlights trends in the history of animal evolution evident in the fossil record. These trends appear to be the product of departure from the initial terminal addition state, as is evident in evolutionary patterns within fossil groups such as trilobites, but are also more generally related to shifts in types of morphologic change through the early Phanerozoic. Our argument is contingent on dates of metazoan divergence that are roughly convergent with the first appearance of metazoan fossils in the latest Proterozoic and Cambrian, as well as on an inference of homology of terminal addition across bilaterian Metazoa.

Genome-wide analysis on Chlamydomonas reinhardtii reveals impact of hydrogen peroxide on protein stress responses and overlap with other stress transcriptomes

The Plant journal : for cell and molecular biology, Jan 16, 2015

Reactive oxygen species (ROS) are produced by and have the potential to be damaging to all aerobic organisms. In photosynthetic organisms, they are an unavoidable byproduct of electron transfer in both the chloroplast and mitochondrion. We employ the reference unicellular green alga, Chlamydomonas reinhardtii, to identify the effect of H2O2 on gene expression by monitoring the transcriptome changes in a timecourse experiment. Comparison of transcriptomes from cells sampled immediately prior to addition of H2O2, and 0.5 and 1 h subsequently revealed 1278 differentially abundant transcripts. Of those transcripts that increase in abundance, many encode proteins involved in ROS detoxification, protein degradation and stress-responses, whereas among those that decrease are transcripts encoding proteins involved in photosynthesis and central carbon metabolism. In addition to these transcriptomic adjustments, we observe that H2O2 addition is followed by an accumulation and oxidation of...

Dynamic changes in the transcriptome and methylome of Chlamydomonas reinhardtii throughout its life cycle

Chlamydomonas Genome Resource for Laboratory Strains Reveals a Mosaic of Sequence Variation, Identifies True Strain Histories, and Enables Strain-Specific Studies

The Plant cell, Jan 25, 2015

Chlamydomonas reinhardtii is a widely used reference organism in studies of photosynthesis, cilia, and biofuels. Most research in this field uses a few dozen standard laboratory strains that are reported to share a common ancestry, but exhibit substantial phenotypic differences. In order to facilitate ongoing Chlamydomonas research and explain the phenotypic variation, we mapped the genetic diversity within these strains using whole-genome resequencing. We identified 524,640 single nucleotide variants and 4812 structural variants among 39 commonly used laboratory strains. Nearly all (98.2%) of the total observed genetic diversity was attributable to the presence of two, previously unrecognized, alternate haplotypes that are distributed in a mosaic pattern among the extant laboratory strains. We propose that these two haplotypes are the remnants of an ancestral cross between two strains with ∼2% relative divergence. These haplotype patterns create a fingerprint for each strain that f...

Distinct Shifts in Microbiota Composition during Drosophila Aging Impair Intestinal Function and Drive Mortality

Cell Reports, 2015

Highlights: Age-related dysbiosis in Drosophila is characterized by Gammaproteobacteria expansion. Dysbiosis predicts age-onset intestinal barrier dysfunction and rapid health decline. Age-related dysbiosis drives changes in excretory function. Loss of commensal control following intestinal barrier dysfunction drives mortality.

Systems-level analysis of N-starvation induced TAG accumulation in a Chlamydomonas starchless mutant

The Plant Cell

To understand the molecular basis underlying increased triacylglycerol (TAG) accumulation in starchless (sta) Chlamydomonas reinhardtii mutants, we undertook comparative time-course transcriptomics of strains CC-4348 (sta6 mutant), CC-4349, a cell wall-deficient (cw) strain purported to represent the parental STA6 strain, and three independent STA6 strains generated by complementation of sta6 (CC-4565/STA6-C2, CC-4566/STA6-C4, and CC-4567/STA6-C6) in the context of N deprivation. Despite N starvation-induced dramatic remodeling of the transcriptome, there were relatively few differences (5 × 10²) observed between sta6 and STA6, the most dramatic of which were increased abundance of transcripts encoding key regulated or rate-limiting steps in central carbon metabolism, specifically isocitrate lyase, malate synthase, transaldolase, fructose bisphosphatase and phosphoenolpyruvate carboxykinase (encoded by ICL1, MAS1, TAL1, FBP1, and PCK1 respectively), suggestive of increased carbon movement toward hexose-phosphate in sta6 by upregulation of the glyoxylate pathway and gluconeogenesis. Enzyme assays validated the increase in isocitrate lyase and malate synthase activities. Targeted metabolite analysis indicated increased succinate, malate, and Glc-6-P and decreased Fru-1,6-bisphosphate, illustrating the effect of these changes. Comparisons of independent data sets in multiple strains allowed the delineation of a sequence of events in the global N starvation response in C. reinhardtii, starting within minutes with the upregulation of alternative N assimilation routes and carbohydrate synthesis and subsequently a more gradual upregulation of genes encoding enzymes of TAG synthesis. Finally, genome resequencing analysis indicated that (1) the deletion in sta6 extends into the neighboring gene encoding respiratory burst oxidase, and (2) a commonly used STA6 strain (CC-4349) as well as the sequenced reference (CC-503) are not congenic with respect to sta6 (CC-4348), underscoring the importance of using complemented strains for more rigorous assignment of phenotype to genotype.

Genome-wide gene order distances support clustering the gram-positive bacteria

Frontiers in Microbiology, 2015

Initially using 143 genomes, we developed a method for calculating the pair-wise distance between prokaryotic genomes using a Monte Carlo method to estimate the conservation of gene order. The method was based on repeatedly selecting five or six non-adjacent random orthologs from each of two genomes and determining if the chosen orthologs were in the same order. The raw distances were then corrected for gene order convergence using an adaptation of the Jukes-Cantor model, as well as using the common distance correction D' = -ln(1-D). First, we compared the distances found via the order of six orthologs to distances found based on ortholog gene content and small subunit rRNA sequences. The Jukes-Cantor gene order distances are reasonably well correlated with the divergence of rRNA (R² = 0.24), especially at rRNA Jukes-Cantor distances of less than 0.2 (R² = 0.52). Gene content is only weakly correlated with rRNA divergence (R² = 0.04) over all distances; however, it is especially strongly correlated at rRNA Jukes-Cantor distances of less than 0.1 (R² = 0.67). This initial work suggests that gene order may be useful in conjunction with other methods to help understand the relatedness of genomes. Using the gene order distances in 143 genomes, the relations of prokaryotes were studied using neighbor joining and agreement subtrees. We then repeated our study of the relations of prokaryotes using gene order in 172 complete genomes better representing a wider diversity of prokaryotes. Consistently, our trees show the Actinobacteria as a sister group to the bulk of the Firmicutes. In fact, the robustness of gene order support was found to be considerably greater for uniting these two phyla than for uniting any of the proteobacterial classes together. The results are supportive of the idea that Actinobacteria and Firmicutes are closely related, which in turn implies a single origin for the gram-positive cell.
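
The Monte Carlo gene order comparison described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes each genome is a list of ortholog identifiers in chromosomal order with each ortholog present once, treats genomes as linear, and ignores the non-adjacency requirement, strand, and the Jukes-Cantor adaptation. The function names `gene_order_distance` and `log_corrected` are hypothetical; only the correction D' = -ln(1-D) is taken from the text.

```python
import math
import random


def gene_order_distance(genome_a, genome_b, k=6, trials=2000, seed=0):
    """Estimate the fraction of random k-ortholog samples whose relative
    order differs between two genomes (simplified, linear genomes)."""
    rng = random.Random(seed)
    # Position of each ortholog along each genome (assumes unique IDs).
    pos_a = {gene: i for i, gene in enumerate(genome_a)}
    pos_b = {gene: i for i, gene in enumerate(genome_b)}
    shared = sorted(set(pos_a) & set(pos_b))
    if len(shared) < k:
        raise ValueError("not enough shared orthologs to sample")
    mismatches = 0
    for _ in range(trials):
        sample = rng.sample(shared, k)
        order_a = sorted(sample, key=pos_a.__getitem__)
        order_b = sorted(sample, key=pos_b.__getitem__)
        if order_a != order_b:
            mismatches += 1
    return mismatches / trials


def log_corrected(d):
    """Apply the distance correction D' = -ln(1 - D)."""
    if d >= 1.0:
        return math.inf
    return -math.log(1.0 - d)


if __name__ == "__main__":
    a = list(range(200))
    b = a[:100] + a[150:] + a[100:150]  # one hypothetical block move
    d = gene_order_distance(a, b)
    print(d, log_corrected(d))
```

Sampling only a handful of orthologs per trial keeps each comparison cheap, and the logarithmic correction stretches raw distances near 1, where many rearrangements would otherwise look alike.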

Phylogenetic profiling

Encyclopedia of Genetics, Genomics, Proteomics and Bioinformatics, 2004

Cloning of a functional 25-hydroxyvitamin D-1α-hydroxylase in zebrafish (Danio rerio)

Cell biochemistry and function, 2014

Activation of precursor 25-hydroxyvitamin D3 (25D) to hormonal 1,25-dihydroxyvitamin D3 (1,25D) is a pivotal step in vitamin D physiology, catalysed by the enzyme 25-hydroxyvitamin D-1α-hydroxylase (1α-hydroxylase). To establish new models for assessing the physiological importance of the 1α-hydroxylase-25D-axis, we used Danio rerio (zebrafish) to characterize expression and biological activity of the gene for 1α-hydroxylase (cyp27b1). Treatment of day 5 zebrafish larvae with inactive 25D (5-150 nM) or active 1,25D (0.1-10 nM) induced dose responsive expression (15-95-fold) of the vitamin D-target gene cyp24a1 relative to larvae treated with vehicle, suggesting the presence of Cyp27b1 activity. A full-length zebrafish cyp27b1 cDNA was then generated using RACE and RT-PCR methods. Sequencing of the resulting clone revealed an open reading frame encoding a protein of 505 amino acids with 54% identity to human CYP27B1. Transfection of a cyp27b1 expression vector into HKC-8, a human kid...

Systems-Level Analysis of Nitrogen Starvation-Induced Modifications of Carbon Metabolism in a Chlamydomonas reinhardtii Starchless Mutant

The Plant Cell, 2013

To understand the molecular basis underlying increased triacylglycerol (TAG) accumulation in starchless (sta) Chlamydomonas reinhardtii mutants, we undertook comparative time-course transcriptomics of strains CC-4348 (sta6 mutant), CC-4349, a cell wall-deficient (cw) strain purported to represent the parental STA6 strain, and three independent STA6 strains generated by complementation of sta6 (CC-4565/STA6-C2, CC-4566/STA6-C4, and CC-4567/STA6-C6) in the context of N deprivation. Despite N starvation-induced dramatic remodeling of the transcriptome, there were relatively few differences (5 × 10²) observed between sta6 and STA6, the most dramatic of which were increased abundance of transcripts encoding key regulated or rate-limiting steps in central carbon metabolism, specifically isocitrate lyase, malate synthase, transaldolase, fructose bisphosphatase and phosphoenolpyruvate carboxykinase (encoded by ICL1, MAS1, TAL1, FBP1, and PCK1 respectively), suggestive of increased carbon movement toward hexose-phosphate in sta6 by upregulation of the glyoxylate pathway and gluconeogenesis. Enzyme assays validated the increase in isocitrate lyase and malate synthase activities. Targeted metabolite analysis indicated increased succinate, malate, and Glc-6-P and decreased Fru-1,6-bisphosphate, illustrating the effect of these changes. Comparisons of independent data sets in multiple strains allowed the delineation of a sequence of events in the global N starvation response in C. reinhardtii, starting within minutes with the upregulation of alternative N assimilation routes and carbohydrate synthesis and subsequently a more gradual upregulation of genes encoding enzymes of TAG synthesis. Finally, genome resequencing analysis indicated that (1) the deletion in sta6 extends into the neighboring gene encoding respiratory burst oxidase, and (2) a commonly used STA6 strain (CC-4349) as well as the sequenced reference (CC-503) are not congenic with respect to sta6 (CC-4348), underscoring the importance of using complemented strains for more rigorous assignment of phenotype to genotype.

Genome sequence of the hyperthermophilic crenarchaeon Pyrobaculum aerophilum

Proceedings of the National Academy of Sciences, 2002

We determined and annotated the complete 2.2-megabase genome sequence of Pyrobaculum aerophilum, a facultatively aerobic nitrate-reducing hyperthermophilic (T_opt = 100°C) crenarchaeon. Clues were found suggesting explanations of the organism's surprising intolerance to sulfur, which may aid in the development of methods for genetic studies of the organism. Many interesting features worthy of further genetic studies were revealed. Whole genome computational analysis confirmed experiments showing that P. aerophilum (and perhaps all crenarchaea) lack 5′ untranslated regions in their mRNAs and thus appear not to use a ribosome-binding site (Shine-Dalgarno)-based mechanism for translation initiation at the 5′ end of transcripts. Inspection of the lengths and distribution of mononucleotide repeat-tracts revealed some interesting features. For instance, it was seen that mononucleotide repeat-tracts of Gs (or Cs) are highly unstable, a pattern expected for an organism deficient in mismatch repair. This result, together with an independent study on mutation rates, suggests a "mutator" phenotype.
Pyrobaculum aerophilum is a hyperthermophilic (T_max = 104°C, T_opt = 100°C) and metabolically versatile member of the crenarchaea, which are predominantly anaerobic respirers. Unlike most hyperthermophiles, P. aerophilum can withstand the presence of oxygen, growing efficiently in microaerobic conditions, thus making it relatively easy to work with in the laboratory. Unlike most of its phylogenetic neighbors, the growth of P. aerophilum is inhibited by the presence of elemental sulfur, but it grows well anaerobically using nitrate reduction (1). Here we have determined the complete genome sequence of P. aerophilum IM2, which was isolated from a boiling marine water hole at Maronti Beach, Italy (1). We obtained the sequence by a low coverage random shotgun sequencing strategy, with gap closure and resolution of ambiguities aided by the creation of a genomic fosmid map (2). We present an overview of the features and content of the genome, including a possible explanation of the organism's intolerance to sulfur and evidence of a possible lack of mismatch repair activity. Studies of the genus Pyrobaculum provide important opportunities for understanding the boundaries of life in extreme habitats. In a recent molecular sampling of a deep subsurface geothermal water pool, the only organisms detected were hyperthermophilic archaeal members closely related to Pyrobaculum (3).

Metagenomic signatures of the Peru Margin subseafloor biosphere show a genetically distinct environment

Proceedings of the National Academy of Sciences, 2008

The subseafloor marine biosphere may be one of the largest reservoirs of microbial biomass on Earth and has recently been the subject of debate in terms of the composition of its microbial inhabitants, particularly on sediments from the Peru Margin. A metagenomic analysis was made by using whole-genome amplification and pyrosequencing of sediments from Ocean Drilling Program Site 1229 on the Peru Margin to further explore the microbial diversity and overall community composition within this environment. A total of 61.9 Mb of genetic material was sequenced from sediments at horizons 1, 16, 32, and 50 m below the seafloor. These depths include sediments from both primarily sulfate-reducing and methane-generating regions of the sediment column. Many of the annotated genes, including those encoding ribosomal proteins, corresponded to those from the Chloroflexi and Euryarchaeota. However, analysis of the 16S small-subunit ribosomal genes suggests that Crenarchaeota are the abundant microbial member. Quantitative PCR confirms that uncultivated Crenarchaeota are indeed a major microbial group in these subsurface samples. These findings show that the marine subsurface is a distinct microbial habitat and is different from environments studied by metagenomics, especially because of the predominance of uncultivated archaeal groups.

Selecting protein targets for structural genomics of Pyrobaculum aerophilum: Validating automated fold assignment methods by using binary hypothesis testing

Proceedings of the National Academy of Sciences, 2000

Pyrobaculum Genome. Predicted coding region sequences of the PA genome were obtained from the Jeffrey H. Miller Laboratory of the University of California Los Angeles Molecular Biology Institute and correspond to the 1/1/99 version of the genome. This version contained 2,681 open reading frames (ORFs) predicted to code for proteins. Membrane-Spanning Proteins. Of the 2,681 PA ORFs, 551 contained membrane-spanning α-helices as determined by MOMENT (7) (PAHOME/TRANSMEMBRANEHELIX PREDICTIONRESULTS). These proteins were excluded from fold recognition and novel fold prediction analysis. Protein Sequence Databases. The Online Mendelian Inheritance in Man (OMIM) database containing 15,743 sequences was downloaded from the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/Omim/; authored and edited by V. A. McKusick and his colleagues at Johns Hopkins and elsewhere). Similarity searches of the OMIM database were performed by using a local implementation of the Smith-Waterman algorithm (8) with probability values determined by Waterman-Vingron statistics (9, 10).
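
For readers unfamiliar with the Smith-Waterman algorithm mentioned above, the following Python sketch computes the best local alignment score between two sequences. It is an illustration only and not the local implementation used in the paper: the scoring values (`match`, `mismatch`, `gap`) are assumed placeholders, a linear gap penalty stands in for whatever gap model was used, protein searches would normally use a substitution matrix, and the Waterman-Vingron significance statistics are not covered.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Uses the Smith-Waterman recurrence with a linear gap penalty.
    The scoring parameters are illustrative assumptions.
    """
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores are floored at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTT", "GGACGGG"))
```

Keeping only two rows of the matrix is enough when only the score is needed; recovering the alignment itself would require the full matrix and a traceback.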

Whole genome-based phylogenetic analysis of free-living microorganisms

Nucleic Acids Research, 1999

A phylogenetic 'tree of life' has been constructed based on the observed presence and absence of families of protein-encoding genes observed in 11 complete genomes of free-living microorganisms. Past attempts to reconstruct the evolutionary relationships of microorganisms have been limited to sets of genes rather than complete genomes. Despite apparent rampant lateral gene transfer among microorganisms, these results indicate a single robust underlying evolutionary history for these organisms. Broadly, the tree produced is very similar to the small subunit rRNA tree although several additional phylogenetic relationships appear to be resolved, including the relationship of Archaeoglobus to the methanogens studied. This result is in contrast to notions that a robust phylogenetic reconstruction of microorganisms is impossible due to their genomes being composed of an incomprehensible amalgam of genes with complicated histories and suggests that this style of genome-wide phylogenetic analysis could become an important method for studying the ancient diversification of life on Earth. Analyses using informational and operational subsets of the genes showed that this 'tree of life' is not dependent on the phylogenetically more consistent informational genes.