Yili Chen - Academia.edu
Papers by Yili Chen
PLoS computational biology, Jan 6, 2007
The MYC genes encode nuclear sequence-specific DNA-binding proteins that are pleiotropic regulators of cellular function, and the c-MYC proto-oncogene is deregulated and/or mutated in most human cancers. Experimental studies of MYC binding to the genome are not fully consistent. While many c-MYC recognition sites can be identified in c-MYC responsive genes, other motif matches, even experimentally confirmed sites, are associated with genes showing no c-MYC response. We have developed a computational model that integrates multiple sources of evidence to predict which genes will bind and be regulated by MYC in vivo. First, a Bayesian network classifier is used to predict those c-MYC recognition sites that are most likely to exhibit high-occupancy binding in chromatin immunoprecipitation studies. This classifier incorporates genomic sequence, experimentally determined genomic chromatin acetylation islands, and predicted methylation status from a computational model estimating the...
Precis: An integrative computational strategy and functional studies reveal the transcriptional repressor Bcl6 as a novel participant in GH-regulated gene expression in adipocytes.
Nature communications, 2015
A key limitation in using heterologous genomic or metagenomic libraries in functional genomics and genome engineering is the low expression of heterologous genes in screening hosts, such as Escherichia coli. To overcome this limitation, here we generate E. coli strains capable of recognizing heterologous promoters by expressing heterologous sigma factors. Among seven sigma factors tested, RpoD from Lactobacillus plantarum (Lpl) appears to be capable of initiating transcription from all sources of DNA. Using the promoter GFP-trap concept, we successfully screen several heterologous and metagenomic DNA libraries, thus enlarging the genomic space that can be functionally sampled in E. coli. As an application, we show that screening fosmid-based Lpl genomic libraries in an E. coli strain with a chromosomally integrated Lpl rpoD enables the identification of Lpl genetic determinants imparting strong ethanol tolerance in E. coli. Transcriptome analysis confirms increased expression of heter...
Metabolic Engineering, 2010
We generated a genomic library from sheared Clostridium acetobutylicum 824 DNA, whereby inserts can be expressed in both directions from the thiolase promoter, P_thl. Serial transfer of library-bearing C. acetobutylicum cultures exposed to increasing butyrate concentrations enriched for inserts containing fragments of rRNA genetic loci. The selected library inserts were placed so that non-coding RNAs (ncRNAs) antisense to the rRNAs would be transcribed from P_thl. Different enriched inserts imparted similar butyrate-tolerance characteristics. A minimal tolerance fragment (RDNA7) was identified as the 16S-rRNA promoter region. Expressed from P_thl on plasmid pRD7, RDNA7 can produce putative ncRNAs termed ncRNA_RD7. C. acetobutylicum 824(pRD7) showed superior resistance to butyrate and other carboxylic acids. Transcriptional analysis of butyrate stress identified 120 differentially expressed genes between 824(pRD7) and 824(pSOS95del). The few upregulated genes included the ffh gene of the putative signal recognition particle (SRP) system. Northern analysis of ncRNA_RD7 and corresponding antisense RNAs demonstrated multiple ncRNA_RD7 molecules in 824(pRD7). Several corresponding antisense RNA molecules were identified in both 824(pRD7) and 824(pSOS95del), but at much higher levels in 824(pRD7). Northern analysis of 16S rRNA expression suggested complex RDNA7-dependent rRNA processing. Our data suggest that by hybridizing against unprocessed rRNA precursors, ncRNA_RD7 alters rRNA processing, and these alterations result in acid tolerance, possibly through a mechanism involving the Ffh protein.
mBio, 2011
The genus Clostridium includes major human pathogens and species important to cellulose degradation, the carbon cycle, and biotechnology. Small RNAs (sRNAs) are emerging as crucial regulatory molecules in all organisms, but they have not been investigated in clostridia. Research on sRNAs in clostridia is hindered by the absence of a systematic method to identify sRNA candidates, relegating clostridial sRNA research to a hit-and-miss process. We therefore set out to develop a method to identify potential sRNAs in the Clostridium genus to open up the field of sRNA research in clostridia. Using comparative genomics analyses combined with predictions of rho-independent terminators and promoters, we predicted sRNAs in 21 clostridial genomes: Clostridium acetobutylicum, C. beijerinckii, C. botulinum (eight strains), C. cellulolyticum, C. difficile, C. kluyveri (two strains), C. novyi, C. perfringens (three strains), C. phytofermentans, C. tetani, and C. thermocellum. Although more than one-third of predicted sRNAs have Shine-Dalgarno (SD) sequences, only one-sixth have a start codon downstream of SD sequences; thus, most of the predicted sRNAs are noncoding RNAs. Quantitative reverse transcription-PCR (Q-RT-PCR) and Northern analysis were employed to test the presence of a randomly chosen set of sRNAs in C. acetobutylicum and several C. botulinum strains, leading to the confirmation of a large fraction of the tested sRNAs. We identified a conserved, novel sRNA which, together with the downstream gene coding for an ATP-binding cassette (ABC) transporter, responds to the antibiotic clindamycin. The number of predicted sRNAs correlated with the physiological function of the species (high for pathogens, low for cellulolytic, and intermediate for solventogenic), but not with 16S rRNA-based phylogeny.