Identification of functional cis-regulatory elements by sequential enrichment from a randomized synthetic DNA library
Related papers
The Plant Journal, 2006
Short motifs of many cis-regulatory elements (CREs) can be found in the promoters of most Arabidopsis genes, and this raises the question of how their presence can confer specific regulation. We developed a universal algorithm to test the biological significance of CREs by first identifying every Arabidopsis gene with a CRE and then statistically correlating the presence or absence of the element with the gene expression profile on multiple DNA microarrays. This algorithm was successfully verified for previously characterized abscisic acid, ethylene, sucrose and drought responsive CREs in Arabidopsis, showing that the presence of these elements indeed correlates with treatment-specific gene induction. Later, we used standard motif sampling methods to identify 128 putative motifs induced by excess light, reactive oxygen species and sucrose. Our algorithm was able to filter 20 out of 128 novel CREs which significantly correlated with gene induction by either heat, reactive oxygen species and/or sucrose. The position, orientation and sequence specificity of CREs were tested in silico by analyzing the expression of genes with naturally occurring sequence variations. In three novel CREs the forward orientation correlated with sucrose induction and the reverse orientation with sucrose suppression. The functionality of the predicted novel CREs was experimentally confirmed using Arabidopsis cell-suspension cultures transformed with short promoter fragments or artificial promoters fused with the GUS reporter gene. Our genome-wide analysis opens up new possibilities for in silico verification of the biological significance of newly discovered CREs, and allows for subsequent selection of such CREs for experimental studies.
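The idea of correlating motif presence with treatment-specific induction can be illustrated with a small sketch. The abstract does not name the statistic used, so the one-sided hypergeometric test below is an assumption, as are the function names, the toy promoter sequences and the gene identifiers. The sketch scans both strands for a motif and asks whether motif-carrying genes are over-represented among induced genes.

```python
from math import comb

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def has_motif(promoter, motif):
    """True if the motif occurs on either strand of the promoter."""
    promoter = promoter.upper()
    motif = motif.upper()
    return motif in promoter or reverse_complement(motif) in promoter


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K carry the motif."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def motif_induction_pvalue(promoters, induced, motif):
    """Return (overlap, p-value) for motif presence versus induction.

    promoters: dict gene -> promoter sequence (hypothetical input format)
    induced:   set of genes called induced in one treatment (assumed given)
    """
    with_motif = {g for g, seq in promoters.items() if has_motif(seq, motif)}
    induced_known = induced & set(promoters)
    k = len(with_motif & induced_known)
    p = hypergeom_upper_tail(k, len(promoters), len(with_motif), len(induced_known))
    return k, p


# Toy data, invented for illustration only.
promoters = {
    "g1": "TTACGTGTCA",  # motif on the forward strand
    "g2": "GGCACGTAAT",  # motif on the reverse strand
    "g3": "TTTTAAAACC",
    "g4": "GGGGCCCCAA",
    "g5": "ATATATATAT",
    "g6": "CCCTTTGGGA",
}
overlap, pvalue = motif_induction_pvalue(promoters, {"g1", "g2"}, "ACGTG")
```

In a genome-wide setting the same test would be repeated per motif and per treatment, so some multiple-testing correction would be needed; that step is not shown here.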
Cis-regulatory code of stress-responsive transcription in Arabidopsis thaliana
Proceedings of the National Academy of Sciences, 2011
Environmental stress leads to dramatic transcriptional reprogramming, which is central to plant survival. Although substantial knowledge has accumulated on how a few plant cis-regulatory elements (CREs) function in stress regulation, many more CREs remain to be discovered. In addition, the plant stress cis-regulatory code, i.e., how CREs work independently and/or in concert to specify stress-responsive transcription, is mostly unknown. On the basis of gene expression patterns under multiple stresses, we identified a large number of putative CREs (pCREs) in Arabidopsis thaliana with characteristics of authentic cis-elements. Surprisingly, biotic and abiotic responses are mostly mediated by two distinct pCRE superfamilies. In addition, we uncovered cis-regulatory codes specifying how pCRE presence and absence, combinatorial relationships, location, and copy number can be used to predict stress-responsive expression. Expression prediction models based on pCRE combinations perform significantly better than those based simply on pCRE presence and absence, location, and copy number. Furthermore, instead of a few master combinatorial rules for each stress condition, many rules were discovered, and each appears to control only a small subset of stress-responsive genes. Given there are very few documented interactions between plant CREs, the combinatorial rules we have uncovered significantly contribute to a better understanding of the cis-regulatory logic underlying plant stress response and provide prioritized targets for experimentation.
Keywords: machine learning | motif discovery | transcription factor binding site
2017
Changing environmental conditions are limiting crop productivity and, hence, there is an urgent need to develop stress-tolerant plants. Engineering of cis-regulatory elements (CREs) is an effective strategy to design such plants. Transcription factors (TFs) can be used effectively to manipulate gene expression. However, overlapping expression has been observed for several stress-responsive TFs. In order to design improved plants by cis-engineering, we first need to understand the complex regulatory network of TFs and the cross-talk between them. Advances in systems biology have enabled us to visualize plants from a holistic view during abiotic stress. The current review discusses major transcriptional regulatory networks involved in abiotic stress tolerance, and how a better understanding of these networks may help in designing stress-tolerant plants. Finally, the review mentions some potential approaches to generate stress-tolerant crops to enhance crop productivity, which is the...
PLANT PHYSIOLOGY, 2009
Analysis of gene expression data generated by high-throughput microarray transcript profiling experiments has demonstrated that genes with an overall similar expression pattern are often enriched for similar functions. This guilt-by-association principle can be applied to define modular gene programs, identify cis-regulatory elements, or predict gene functions for unknown genes based on their coexpression neighborhood. We evaluated the potential to use Gene Ontology (GO) enrichment of a gene's coexpression neighborhood as a tool to predict its function but found overall low sensitivity scores (13%-34%). This indicates that for many functional categories, coexpression alone performs poorly to infer known biological gene functions. However, integration of cis-regulatory elements shows that 46% of the gene coexpression neighborhoods are enriched for one or more motifs, providing a valuable complementary source to functionally annotate genes. Through the integration of coexpression data, GO annotations, and a set of known cis-regulatory elements combined with a novel set of evolutionarily conserved plant motifs, we could link many genes and motifs to specific biological functions. Application of our coexpression framework extended with cis-regulatory element analysis on transcriptome data from the cell cycle-related transcription factor OBP1 yielded several coexpressed modules associated with specific cis-regulatory elements. Moreover, our analysis strongly suggests a feed-forward regulatory interaction between OBP1 and the E2F pathway. The ATCOECIS resource (http://bioinformatics.psb.ugent.be/ATCOECIS/) makes it possible to query coexpression data and GO and cis-regulatory element annotations and to submit user-defined gene sets for motif analysis, providing an access point to unravel the regulatory code underlying transcriptional control in Arabidopsis (Arabidopsis thaliana).
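As a rough illustration of what a coexpression neighborhood might look like in practice, the sketch below collects all genes whose expression profile correlates with a query gene above a cutoff. The use of the Pearson coefficient, the 0.9 threshold, the function names and the toy profiles are all assumptions made for this example; the abstract does not state how ATCOECIS defines a neighborhood.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def coexpression_neighborhood(expr, gene, threshold=0.9):
    """Genes whose profile correlates with `gene` at or above `threshold`.

    expr: dict gene -> list of expression values across the same arrays
          (hypothetical input format).
    """
    reference = expr[gene]
    return sorted(
        other
        for other, profile in expr.items()
        if other != gene and pearson(reference, profile) >= threshold
    )


# Toy expression matrix, invented for illustration only.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],   # rises with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],   # anti-correlated with geneA
    "geneD": [1.0, 1.0, 1.0, 1.0],   # flat profile
}
neighbors = coexpression_neighborhood(expr, "geneA")
```

The resulting gene set could then be tested for GO-term or motif enrichment, for example with a hypergeometric test against the genome-wide background.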
PLoS ONE, 2011
Transcriptional regulation is an important mechanism underlying gene expression and has played a crucial role in evolution. The number, position and interactions between cis-elements and transcription factors (TFs) determine the expression pattern of a gene. To identify functionally relevant cis-elements in gene promoters, a phylogenetic shadowing approach with a lipase gene (LIP1) was used. As a proof of concept, in silico analyses of several Brassicaceae LIP1 promoters identified a highly conserved sequence (LIP1 element) that is sufficient to drive strong expression of a reporter gene in planta. A collection of ca. 1,200 Arabidopsis thaliana TF open reading frames (ORFs) was arrayed in a 96-well format (RR library) and a convenient mating-based yeast one-hybrid (Y1H) screening procedure was established. We constructed an episomal plasmid (pTUY1H) to clone the LIP1 element and used it as bait for Y1H screenings. A novel interaction with an HD-ZIP (AtML1) TF was identified and abolished by a 2 bp mutation in the LIP1 element. A role of this interaction in transcriptional regulation was confirmed in planta. In addition, we validated our strategy by reproducing the previously reported interaction between a MYB-CC (PHR1) TF, a central regulator of phosphate starvation responses, and a conserved promoter fragment (IPS1 element) containing its cognate binding sequence. Finally, we established that the LIP1 and IPS1 elements were differentially bound by HD-ZIP and MYB-CC family members in agreement with their genetic redundancy in planta. In conclusion, combining in silico analyses of orthologous gene promoters with Y1H screening of the RR library represents a powerful approach to decipher cis- and trans-regulatory codes.
PLANT PHYSIOLOGY, 2006
The rapidly increasing amount of plant genomic sequences allows for the detection of cis-elements through comparative methods. In addition, large-scale gene expression data for Arabidopsis (Arabidopsis thaliana) have recently become available. Coexpression and evolutionarily conserved sequences are criteria widely used to identify shared cis-regulatory elements. In our study, we employ an integrated approach to combine two sources of information, coexpression and sequence conservation. Best-candidate orthologous promoter sequences were identified by a bidirectional best blast hit strategy in genome survey sequences from Brassica oleracea. The analysis of 779 microarrays from 81 different experiments provided detailed expression information for Arabidopsis genes coexpressed in multiple tissues and under various conditions and developmental stages. We discovered candidate transcription factor binding sites in 64% of the Arabidopsis genes analyzed. Among them, we detected experimentally verified binding sites and showed strong enrichment of shared cis-elements within functionally related genes. This study demonstrates the value of partially shotgun sequenced genomes and their combinatorial use with functional genomics data to address complex questions in comparative genomics.
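The bidirectional best BLAST hit strategy mentioned above can be sketched in a few lines. The code assumes the two BLAST searches have already been run and parsed into (query, subject, bit score) tuples; that input format, the function names and the sequence identifiers are hypothetical, and the sketch does not reproduce any score or coverage filters the authors may have applied.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    hits: iterable of (query, subject, bitscore) tuples.
    On ties, the first hit seen is kept.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Toy hit tables with made-up identifiers and scores.
arabidopsis_vs_brassica = [
    ("At1", "Bo1", 200.0),
    ("At1", "Bo2", 150.0),
    ("At2", "Bo2", 90.0),
]
brassica_vs_arabidopsis = [
    ("Bo1", "At1", 195.0),
    ("Bo2", "At1", 140.0),
    ("Bo2", "At2", 80.0),
]
pairs = bidirectional_best_hits(arabidopsis_vs_brassica, brassica_vs_arabidopsis)
```

Here only At1 and Bo1 choose each other, so At2 is left without a candidate ortholog even though it has a hit. This conservatism is the usual reason for preferring reciprocal hits over one-directional searches.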
cis-Regulatory elements in plant cell signaling
Current Opinion in Plant Biology, 2009
Plant cell signaling pathways are in part dependent on transcriptional regulatory networks comprising circuits of transcription factors (TFs) and regulatory DNA elements that control the expression of target genes. Here, we describe experimental and bioinformatic approaches for identifying potential cis-regulatory elements. We also discuss recent integrative genomics studies aimed at elucidating the functions of cis-regulatory elements in aspects of plant biology, including the circadian clock, interactions with the environment, stress responses, and regulation of growth and development by phytohormones. Finally, we discuss emerging technologies and approaches that offer great potential for accelerating the discovery and functional characterization of cis-elements and interacting TFs, which will help realize the promise of systems biology.
2006
The rapidly increasing amount of plant genomic sequences allows for the detection of cis-elements through comparative methods. In addition, large-scale gene expression data for Arabidopsis thaliana have recently become available. Co-expression and evolutionarily conserved sequences are criteria widely used to identify shared cis-regulatory elements. In our study, we employ an integrated approach to combine two sources of information, co-expression and sequence conservation. Best candidate orthologous promoter sequences were identified by a bidirectional best blast hit strategy in genome survey sequences from Brassica oleracea. The analysis of 779 microarrays from 81 different experiments provided detailed expression information for Arabidopsis genes co-expressed in multiple tissues and under various conditions and developmental stages. We discovered candidate transcription factor binding sites in 64% of the Arabidopsis genes analyzed. Among them, we detected experimentally verified b...