MECP2 genomic structure and function: insights from ENCODE

Review

Nucleic Acids Res. 2008 Nov;36(19):6035-47.

doi: 10.1093/nar/gkn591. Epub 2008 Sep 27.

Jasmine Singh et al. Nucleic Acids Res. 2008 Nov.

Abstract

MECP2, a relatively small gene located on the human X chromosome, was initially described as having three exons transcribing RNA from which the protein MeCP2 was translated. It is now known to have four exons from which two isoforms are translated; however, there is also evidence of additional functional genomic structures within MECP2, including exons potentially transcribing non-coding RNAs. Accompanying the recognition of a higher level of intricacy within MECP2 has been a recent surge of knowledge about the structure and function of human genes more generally, to the extent that the definition of a gene is being revisited. It is timely now to review the published and novel functional elements within MECP2, which is proving to have a complexity far greater than was previously thought.

Figures

Figure 1.

ENCODE data reveals increased transcriptional complexity within MECP2. This screenshot from the March 2006 assembly of the UCSC Genome Browser (http://genome.ucsc.edu/) compares the previously annotated MECP2 gene structure (RefSeq-annotated genes) with the GENCODE annotation (v3.1, March 2007). The tracks ‘GENCODE RACEfrags’ show the mapped locations of 5′-RACE products from a range of tissues (83). Also displayed is the Affymetrix data representing the genomic source and relative abundance of RNA transcripts detected in the cytoplasm of HeLa cells.

Figure 2.

MECP2 coding regions and functional protein domains. The four annotated exons of the MECP2 gene are magnified and their coding portions are shown in colour, with the non-coding portions represented in grey. The MeCP2_e1 and MeCP2_e2 isoforms are also displayed. The 3′-UTR, with four alternative polyadenylation signals, and the 5′-UTR of each transcript are also shown. The positions of the well-characterized MBD, TRD and nuclear localization signals of the MeCP2 protein, as well as the more recently identified WDR and RG repeat region, are labelled. The distinct tertiary structures and functional domains defined by trypsin cleavage sites (46) are also indicated.

Figure 3.

MECP2 regulatory elements. (a) MECP2 regulatory elements include the _cis_-regulatory elements identified by Liu and Francke (74) and the TargetScanS-identified binding site for miR-132, which has recently been shown to repress MECP2 translation (82). (b) A UCSC Genome Browser screenshot (http://genome.ucsc.edu/, May 2004 assembly) of a 5500-bp region encompassing the promoter and exon 1 is also included. This displays the _cis_-regulatory elements identified in (a) above; CpG islands; repeating elements identified by RepeatMasker; constrained elements defined by ENCODE; single nucleotide polymorphisms (SNPs) from dbSNP build 126 (March 2006 assembly); GENCODE gene annotations (v3.1, March 2007); RIKEN CAGE-predicted transcription start sites (TSSs) on the minus strand; and Yale Regulatory Factor Binding Region (RFBR) clusters and deserts. DNase I hypersensitivity sites are shown for HeLa (epithelioid carcinoma) and IMR90 (fibroblast) cells, as identified by Duke/NHGRI. The binding of RNA polymerase II (RNA pol II) and TATA-binding protein-associated factor 1 (TAF1), as well as the locations of histone modifications, was determined by ChIP studies performed by Ludwig Institute/UCSD. The Sp1-binding sites were identified by Stanford in K562 (chronic myeloid leukaemia) cells.

Figure 4.

Features within the MECP2 genomic region and adjacent genes. A UCSC Genome Browser screenshot of a 160 000-nt region encompassing MECP2 and its immediate neighbouring genes (http://genome.ucsc.edu/, May 2004 assembly) is displayed. The tracks displaying regulatory elements, CpG islands, repetitive elements, constrained elements, SNPs, GENCODE-annotated genes, TSSs, RFBR clusters and deserts, DNase I hypersensitivity, regulatory potential, RNA pol II, TAF1, Sp1 and the histone modifications are all as defined in Figure 3. Also shown are TargetScanS-predicted miRNA-binding sites (March 2006 assembly); deletion-insertion polymorphisms (DIPs) defined by NHGRI; regulatory potential as predicted by ESPERR; CCCTC-binding factor (CTCF) binding sites identified by Sanger in GM06990 (lymphoblastoid) cells; and binding sites for CCAAT/enhancer binding protein epsilon (CEBPe), CTCF, P300, PU1 and retinoic acid receptor alpha (RARA) identified by Affymetrix in HL60 (promyelocytic leukaemia) cells.

References

    1. Mattick JS, Makunin IV. Non-coding RNA. Hum. Mol. Genet. 2006;15(Spec No. 1):R17–R29.
    2. The ENCODE Project Consortium. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature. 2007;447:799–816.
    3. Gerstein MB, Bruce C, Rozowsky JS, Zheng D, Du J, Korbel JO, Emanuelsson O, Zhang ZD, Weissman S, Snyder M. What is a gene, post-ENCODE? History and updated definition. Genome Res. 2007;17:669–681.
    4. Gingeras TR. Origin of phenotypes: genes and transcripts. Genome Res. 2007;17:682–690.
    5. Johannsen W. Elemente der exakten Erblichkeitslehre: Deutsche wesentlich erweiterte Ausgabe in fünfundzwanzig Vorlesungen. Jena: G. Fischer; 1909.
