Yili Chen - Academia.edu

Papers by Yili Chen

Integration of genome and chromatin structure with gene expression profiles to predict c-MYC recognition site binding and function

PLoS Computational Biology, Jan 6, 2007

The MYC genes encode nuclear sequence-specific DNA-binding proteins that are pleiotropic regulators of cellular function, and the c-MYC proto-oncogene is deregulated and/or mutated in most human cancers. Experimental studies of MYC binding to the genome are not fully consistent. While many c-MYC recognition sites can be identified in c-MYC responsive genes, other motif matches, even experimentally confirmed sites, are associated with genes showing no c-MYC response. We have developed a computational model that integrates multiple sources of evidence to predict which genes will bind and be regulated by MYC in vivo. First, a Bayesian network classifier is used to predict those c-MYC recognition sites that are most likely to exhibit high-occupancy binding in chromatin immunoprecipitation studies. This classifier incorporates genomic sequence, experimentally determined genomic chromatin acetylation islands, and predicted methylation status from a computational model estimating the...

Computational and functional analysis of growth hormone-regulated genes identifies the transcriptional repressor Bcl6 as a participant in GH-regulated transcription

Short title: Analysis of GH-regulated gene profile identifies Bcl6

Precis: An integrative computational strategy and functional studies reveal the transcriptional repressor Bcl6 as a novel participant in GH-regulated gene expression in adipocytes.

Expression of heterologous sigma factors enables functional screening of metagenomic and heterologous genomic libraries

Nature Communications, 2015

A key limitation in using heterologous genomic or metagenomic libraries in functional genomics and genome engineering is the low expression of heterologous genes in screening hosts, such as Escherichia coli. To overcome this limitation, here we generate E. coli strains capable of recognizing heterologous promoters by expressing heterologous sigma factors. Among seven sigma factors tested, RpoD from Lactobacillus plantarum (Lpl) appears able to initiate transcription from all sources of DNA. Using the promoter GFP-trap concept, we successfully screen several heterologous and metagenomic DNA libraries, thus enlarging the genomic space that can be functionally sampled in E. coli. As an application, we show that screening fosmid-based Lpl genomic libraries in an E. coli strain with a chromosomally integrated Lpl rpoD enables the identification of Lpl genetic determinants imparting strong ethanol tolerance in E. coli. Transcriptome analysis confirms increased expression of heter...

A genomic-library based discovery of a novel, possibly synthetic, acid-tolerance mechanism in Clostridium acetobutylicum involving non-coding RNAs and ribosomal RNA processing

Metabolic Engineering, 2010

We generated a genomic library from sheared Clostridium acetobutylicum 824 DNA, whereby inserts can be expressed in both directions from the thiolase promoter, P_thl. Serial transfer of library-bearing C. acetobutylicum cultures exposed to increasing butyrate concentrations enriched for inserts containing fragments of rRNA genetic loci. The selected library inserts were placed so that antisense (to the rRNAs) non-coding RNAs (ncRNAs) would be transcribed from P_thl. Different enriched inserts imparted similar butyrate-tolerance characteristics. A minimal tolerance fragment (RDNA7) was identified as the 16S-rRNA promoter region. Expressed on plasmid pRD7 off P_thl, RDNA7 can produce putative ncRNAs termed ncRNA_RD7. C. acetobutylicum 824(pRD7) showed superior resistance to butyrate and other carboxylic acids. Transcriptional analysis of butyrate stress identified 120 differentially expressed genes between 824(pRD7) and 824(pSOS95del). The few upregulated genes included the ffh gene of the putative signal recognition particle (SRP) system. Northern analysis of ncRNA_RD7 and corresponding antisense RNAs demonstrated multiple ncRNA_RD7 molecules in 824(pRD7). Several corresponding antisense RNA molecules were identified both in 824(pRD7) and 824(pSOS95del), but at much higher levels in 824(pRD7). Northern analysis of 16S rRNA expression suggested complex RDNA7-dependent rRNA processing. Our data suggest that by hybridizing against unprocessed rRNA precursors, ncRNA_RD7 alters rRNA processing, and these alterations result in acid tolerance, possibly through a mechanism involving the Ffh protein.

Small RNAs in the Genus Clostridium

mBio, 2011

The genus Clostridium includes major human pathogens and species important to cellulose degradation, the carbon cycle, and biotechnology. Small RNAs (sRNAs) are emerging as crucial regulatory molecules in all organisms, but they have not been investigated in clostridia. Research on sRNAs in clostridia is hindered by the absence of a systematic method to identify sRNA candidates, relegating clostridial sRNA research to a hit-and-miss process. We therefore wanted to develop a method to identify potential sRNAs in the Clostridium genus to open up the field of sRNA research in clostridia. Using comparative genomics analyses combined with predictions of rho-independent terminators and promoters, we predicted sRNAs in 21 clostridial genomes: Clostridium acetobutylicum, C. beijerinckii, C. botulinum (eight strains), C. cellulolyticum, C. difficile, C. kluyveri (two strains), C. novyi, C. perfringens (three strains), C. phytofermentans, C. tetani, and C. thermocellum. Although more than one-third of predicted sRNAs have Shine-Dalgarno (SD) sequences, only one-sixth have a start codon downstream of SD sequences; thus, most of the predicted sRNAs are noncoding RNAs. Quantitative reverse transcription-PCR (Q-RT-PCR) and Northern analysis were employed to test the presence of a randomly chosen set of sRNAs in C. acetobutylicum and several C. botulinum strains, leading to the confirmation of a large fraction of the tested sRNAs. We identified a conserved, novel sRNA which, together with the downstream gene coding for an ATP-binding cassette (ABC) transporter, responds to the antibiotic clindamycin. The number of predicted sRNAs correlated with the physiological function of the species (high for pathogens, low for cellulolytic, and intermediate for solventogenic), but not with 16S rRNA-based phylogeny.
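
The abstract above separates predicted sRNAs that carry a Shine-Dalgarno (SD) sequence from those that also have a start codon downstream of it. The following is a minimal sketch of such a check, not the authors' pipeline. The core motif `GGAGG`, the 4 to 12 nt spacer window, the start-codon set, and all function names are illustrative assumptions that do not come from the paper.

```python
import re

# All three settings below are assumptions chosen for illustration only.
SD_MOTIF = "GGAGG"                    # assumed core Shine-Dalgarno motif
START_CODONS = ("ATG", "GTG", "TTG")  # common bacterial start codons
MIN_SPACER, MAX_SPACER = 4, 12        # assumed SD-to-start spacing (nt)


def normalize(seq):
    """Upper-case the sequence and convert RNA (U) to DNA (T) letters."""
    return seq.upper().replace("U", "T")


def find_sd_sites(seq):
    """Return 0-based start positions of every SD motif (overlaps allowed)."""
    # A zero-width lookahead lets overlapping occurrences be reported.
    return [m.start() for m in re.finditer(f"(?={SD_MOTIF})", normalize(seq))]


def has_start_downstream(seq, sd_pos):
    """True if a start codon begins within the spacer window after the motif."""
    seq = normalize(seq)
    sd_end = sd_pos + len(SD_MOTIF)
    for spacer in range(MIN_SPACER, MAX_SPACER + 1):
        codon = seq[sd_end + spacer: sd_end + spacer + 3]
        if codon in START_CODONS:
            return True
    return False


def classify(seq):
    """Label a candidate as 'no_sd', 'sd_only', or 'sd_with_start'."""
    sites = find_sd_sites(seq)
    if not sites:
        return "no_sd"
    if any(has_start_downstream(seq, pos) for pos in sites):
        return "sd_with_start"
    return "sd_only"
```

Under these assumptions, a candidate labelled `sd_with_start` might encode a small protein, while `no_sd` and `sd_only` candidates are more consistent with noncoding RNAs. A real analysis would need organism-specific motifs and spacing.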
