Yulia Mostovoy | University of California, San Francisco
Papers by Yulia Mostovoy
PLoS Computational Biology, 2013
One of the outstanding challenges in comparative genomics is to interpret the evolutionary importance of regulatory variation between species. Rigorous molecular evolution-based methods to infer evidence for natural selection from expression data are at a premium in the field, and to date, phylogenetic approaches have not been well-suited to address the question in the small sets of taxa profiled in standard surveys of gene expression. We have developed a strategy to infer evolutionary histories from expression profiles by analyzing suites of genes of common function. In a manner conceptually similar to molecular evolution models in which the evolutionary rates of DNA sequence at multiple loci follow a gamma distribution, we modeled expression of the genes of an a priori-defined pathway with rates drawn from an inverse gamma distribution. We then developed a fitting strategy to infer the parameters of this distribution from expression measurements, and to identify gene groups whose expression patterns were consistent with evolutionary constraint or rapid evolution in particular species. Simulations confirmed the power and accuracy of our inference method. As an experimental testbed for our approach, we generated and analyzed transcriptional profiles of four Saccharomyces yeasts. The results revealed pathways with signatures of constrained and accelerated regulatory evolution in individual yeasts and across the phylogeny, highlighting the prevalence of pathway-level expression change during the divergence of yeast species. We anticipate that our pathway-based phylogenetic approach will be of broad utility in the search to understand the evolutionary relevance of regulatory change.
Proceedings of the National Academy of Sciences of the United States of America, 2010
The search to understand how genomes innovate in response to selection dominates the field of evolutionary biology. Powerful molecular evolution approaches have been developed to test individual loci for signatures of selection. In many cases, however, an organism's response to changes in selective pressure may be mediated by multiple genes, whose products function together in a cellular process or pathway. Here we assess the prevalence of polygenic evolution in pathways in the yeasts Saccharomyces cerevisiae and S. bayanus. We first established short-read sequencing methods to detect cis-regulatory variation in a diploid hybrid between the species. We then tested for the scenario in which selective pressure in one species to increase or decrease the activity of a pathway has driven the accumulation of cis-regulatory variants that act in the same direction on gene expression. Application of this test revealed a variety of yeast pathways with evidence for directional regulatory evolution. In parallel, we also used population genomic sequencing data to compare protein and cis-regulatory variation within and between species. We identified pathways with evidence for divergence within S. cerevisiae, and we detected signatures of positive selection between S. cerevisiae and S. bayanus. Our results point to polygenic, pathway-level change as a common evolutionary mechanism among yeasts. We suggest that pathway analyses, including our test for directional regulatory evolution, will prove to be a relevant and powerful strategy in many evolutionary genomic applications.
Articles by Yulia Mostovoy
Despite tremendous progress in genome sequencing, the basic goal of producing a phased (haplotype-resolved) genome sequence with end-to-end contiguity for each chromosome at reasonable cost and effort is still unrealized. In this study, we describe an approach to performing de novo genome assembly and experimental phasing by integrating the data from Illumina short-read sequencing, 10X Genomics linked-read sequencing, and BioNano Genomics genome mapping to yield a high-quality, phased, de novo assembled human genome.
Genes encoded close to one another on the chromosome are often co-expressed, by a mechanism and regulatory logic that remain poorly understood. We surveyed the yeast genome for tandem gene pairs oriented tail-to-head at which expression antisense to the upstream gene was conserved across species. The intergenic region at most such tandem pairs is a bidirectional promoter, shared by the downstream gene mRNA and the upstream antisense transcript. Genomic analyses of these intergenic loci revealed distinctive patterns of transcription factor regulation. Mutation of a given transcription factor verified its role as a regulator in trans of tandem gene pair loci, including the proximally-initiating upstream antisense transcript and downstream mRNA and the distally-initiating upstream mRNA. To investigate cis-regulatory activity at such a locus, we focused on the stress-induced NAD(P)H dehydratase YKL151C and its downstream neighbor, the metabolic enzyme GPM1. Previous work has implicated the region between these genes in regulation of GPM1 expression; our mutation experiments established its function in rich medium as a repressor in cis of the distally-initiating YKL151C sense RNA, and an activator of the proximally-initiating YKL151C antisense RNA. Wild-type expression of all three transcripts required the transcription factor Gcr2. Thus, at this locus, the intergenic region serves as a focal point of regulatory input, driving antisense expression and mediating the coordinated regulation of YKL151C and GPM1. Together, our findings implicate transcription factors in the joint control of neighboring genes specialized to opposing conditions and the antisense transcripts expressed between them.
Comprehensive whole-genome structural variation detection is challenging with current approaches. With diploid cells as DNA source and the presence of numerous repetitive elements, short-read DNA sequencing cannot be used to detect structural variation efficiently. In this report, we show that genome mapping with long, fluorescently labeled DNA molecules imaged on nanochannel arrays can be used for whole-genome structural variation detection without sequencing. While whole-genome haplotyping is not achieved, local phasing (across >150-kb regions) is routine, as molecules from the parental chromosomes are examined separately. In one experiment, we generated genome maps from a trio from the 1000 Genomes Project, compared the maps against that derived from the reference human genome, and identified structural variations that are >5 kb in size. We find that these individuals have many more structural variants than those published, including some with the potential of disrupting gene function or regulation.
Comparative genomic studies have reported widespread variation in levels of gene expression within and between species. Using these data to infer organism-level trait divergence has proven to be a key challenge in the field. We have used a wild Malaysian population of S. cerevisiae as a test bed in the search to predict and validate trait differences based on observations of regulatory variation. Malaysian yeast, when cultured in standard medium, activated regulatory programs that protect cells from the toxic effects of high iron. Malaysian yeast also showed a hyperactive regulatory response during culture in the presence of excess iron and had a unique growth defect in conditions of high iron. Molecular validation experiments pinpointed the iron metabolism factors AFT1, CCC1, and YAP5 as contributors to these molecular and cellular phenotypes; in genome-scale sequence analyses, a suite of iron toxicity response genes showed evidence for rapid protein evolution in Malaysian yeast. Our findings support a model in which iron metabolism has diverged in Malaysian yeast as a consequence of a change in selective pressure, with Malaysian alleles shifting the dynamic range of iron response to low-iron concentrations and weakening resistance to extreme iron toxicity. By dissecting the iron scarcity specialist behavior of Malaysian yeast, our work highlights the power of expression divergence as a signpost for biologically and evolutionarily relevant variation at the organismal level. Interpreting the phenotypic relevance of gene expression variation is one of the primary challenges of modern genomics.