Conserved noncoding sequences in the grasses

Comparative Study

Genome Res. 2003 Sep;13(9):2030-41.

doi: 10.1101/gr.1280703.

Dan Choffnes Inada et al. Genome Res. 2003 Sep.

Abstract

As orthologous genes from related species diverge over time, some sequences are conserved in noncoding regions. In mammals, large phylogenetic footprints, or conserved noncoding sequences (CNSs), are known to be common features of genes. Here we present the first large-scale analysis of plant genes for CNSs. We used maize and rice, maximally diverged members of the grass family of monocots. Using a local sequence alignment set to deliver only significant alignments, we found one or more CNSs in the noncoding regions of the majority of genes studied. Grass genes have dramatically fewer and much smaller CNSs than mammalian genes. Twenty-seven percent of grass gene comparisons revealed no CNSs. Genes functioning in upstream regulatory roles, such as transcription factors, are greatly enriched for CNSs relative to genes encoding enzymes or structural proteins. Further, we show that a CNS cluster in an intron of the knotted1 homeobox gene serves as a site of negative regulation. We show that CNSs in the adh1 gene do not correlate with known cis-acting sites. We discuss the potential meanings of CNSs and their value as analytical tools and evolutionary characters. We advance the idea that many CNSs function to lock in gene regulatory decisions.
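The screening step described in the abstract, keeping only significant local alignments that fall in noncoding regions, can be illustrated with a minimal sketch. It is not the authors' pipeline: the `Hit` record, the `noncoding_hits` helper, the coordinate convention (1-based, inclusive), and the E-value cutoff in the usage lines are all hypothetical choices made for illustration, since the abstract states only that the alignment was "set to deliver only significant alignments."

```python
from typing import List, NamedTuple, Tuple


class Hit(NamedTuple):
    """One local-alignment hit on a gene sequence (hypothetical record)."""
    start: int      # 1-based, inclusive
    end: int        # 1-based, inclusive
    evalue: float


def overlaps(hit: Hit, region: Tuple[int, int]) -> bool:
    """True if the hit shares at least one position with the region."""
    return hit.start <= region[1] and region[0] <= hit.end


def noncoding_hits(hits: List[Hit],
                   exons: List[Tuple[int, int]],
                   max_evalue: float) -> List[Hit]:
    """Keep hits that pass the significance cutoff and touch no exon."""
    return [
        h for h in hits
        if h.evalue <= max_evalue
        and not any(overlaps(h, exon) for exon in exons)
    ]


# Illustrative data only; the cutoff 1e-3 is an assumption, not the paper's value.
hits = [Hit(10, 30, 1e-5), Hit(100, 140, 1e-8), Hit(200, 215, 0.5)]
exons = [(90, 150)]
candidates = noncoding_hits(hits, exons, max_evalue=1e-3)
```

In this toy input the second hit is discarded because it overlaps the exon, and the third because it fails the significance cutoff, leaving a single candidate CNS.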

Figures

Figure 1

The CNS Viewer software displays the bl2seq results of an orthologous pair of ferredoxin3 genes aligned on their start codon. The CNS repeats in the 5′ region of maize have been shown to be in a 5′ UTR intron (Nakano et al. 1997). The text boxes permit quick input to other applications, such as BLASTX or ORF predictors. The button “Remove/Display CNS Annotation” brings up the CNS list with the option to toggle “off.” Once off, a CNS does not appear in the graphic, but the data remain. The dark blue box denotes the experimentally verified exon in maize. The light blue box denotes orthologous space between the rice 5′ATG and the stop.

Figure 2

Examples of CNS graphic readouts from our 52 genes. From the upper right corner and moving clockwise: alcohol dehydrogenase1-F, homeobox gene gnarley1, shrunken1 encoding an enzyme, downstream bZip gene opaque-2, RNA binding motif gene terminal ear1, MADS-box zea agamous2, oleosin16 and 18, and defective kernel1 encoding a membrane-bound calpain. Red lines indicate change of strand. A blue line through a CNS denotes removal during manual proofing of data.

Figure 3

Our 52 genes distributed by total number of CNSs. The most CNS-rich genes are upstream, developmental regulatory genes. The six CNS-richest, in descending order, are lrs1 (bZip-TGA-1a family), kn1 and gn1 (Class I homeobox), te1 (RNA-binding product; apex domain identity), sus1 (encodes an enzyme), and tb1 (transcription factor).

Figure 4

CNSs in 52 grass genes distributed by size.

Figure 5

CNSs may bind a negative regulatory factor(s). Greene and coworkers (1994) positioned nine transposon insertions (colored triangles) within the third intron of the knotted1 Class I homeobox gene in maize. These insertions were found as dominant mutants conferring ectopic gene expression and phenotype. Note how they cluster in a region rich in CNSs. The bottom-most plot is a global alignment of the third intron (Vista; Loots et al. 2002), showing bp identity in a 20-bp sliding window.
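The sliding-window identity plot mentioned in this caption can be approximated with a short sketch. It assumes the two sequences have already been globally aligned and padded with `-` for gaps, and it scores every full window of 20 columns as the percentage of identical, ungapped columns. The function name `window_identity` and the gap handling are illustrative assumptions; Vista's own scoring may differ in detail.

```python
from typing import List


def window_identity(aligned_a: str, aligned_b: str, window: int = 20) -> List[float]:
    """Percent identity for each full window along a pairwise alignment.

    Both inputs must be rows of the same alignment (equal length, '-' for gaps).
    A column counts as identical only if both rows hold the same base.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    # 1 for an identical, ungapped column; 0 otherwise.
    matches = [
        1 if a == b and a != "-" else 0
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
    ]
    if len(matches) < window:
        return []

    # Running sum: add the column entering the window, drop the one leaving.
    total = sum(matches[:window])
    scores = [100.0 * total / window]
    for i in range(window, len(matches)):
        total += matches[i] - matches[i - window]
        scores.append(100.0 * total / window)
    return scores


# Toy example with a 4-column window so the values are easy to check by hand.
demo = window_identity("ACGTACGT", "ACGTTCGT", window=4)
```

Each returned value corresponds to one window start position, so an alignment of length n yields n - window + 1 scores.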

References

    1. Ahn, S. and Tanksley, S.D. 1993. Comparative linkage maps of rice and maize genomes. Proc. Natl. Acad. Sci. 90: 7980–7984.
    2. Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. 1990. Basic local alignment search tool. J. Mol. Biol. 215: 403–410.
    3. Bennetzen, J.L. 2000. Comparative sequence analysis of plant nuclear genomes: Microcolinearity and its many exceptions. Plant Cell 12: 1021–1029.
    4. Blanchette, M. and Tompa, M. 2002. Discovery of regulatory elements by a computational method for phylogenetic footprinting. Genome Res. 12: 739–748.
    5. Bolouri, H. and Davidson, E.H. 2002. Modeling DNA sequence-based cis-regulatory gene networks. Dev. Biol. 246: 2–13.

WEB SITE REFERENCES

    1. http://ncbi.nlm.nih.gov; GenBank.
    2. http://www.tmri.org; Torrey Mesa Research Institute, Syngenta Inc. rice genome portal.
    3. http://oberon.rug.ac.be:8080/PlantCARE/index.html; PlantCARE (cis-acting regulatory elements; Lescot et al. 2002).
    4. http://www.rgp.dna.affrc.go.jp; Rice Genome Project.
    5. http://genomics.cnr.Berkeley.edu/cns/; User = reviewer; password = super (our 52 genes' graphics).
