The role of human bromodomains in chromatin biology and gene transcription

Author manuscript; available in PMC: 2010 Aug 16.

Published in final edited form as: Curr Opin Drug Discov Devel. 2009 Sep;12(5):659–665.

Abstract

The acetylation of histone lysine residues is central to the dynamic regulation of chromatin-based gene transcription. The bromodomain (BRD), a conserved structural module found in chromatin-associated proteins and histone acetyltransferases, is the sole protein domain known to recognize acetyl-lysine residues on proteins. Structural analyses of the recognition of lysine-acetylated peptides derived from histones and cellular proteins by BRDs have provided new insights into both the differences between, and the unifying features of, the selectivity that BRDs exhibit in binding biological ligands. Recent research has highlighted the importance of BRD/acetyl-lysine binding in orchestrating molecular interactions in chromatin biology and regulating gene transcription. These studies suggest that modulating BRD/acetyl-lysine interactions with small molecules may provide new opportunities for the control of gene expression in human health and disease.

Keywords: Acetyl-lysine recognition, bromodomain, gene transcription, lysine acetylation

Introduction

Changes in the structure of chromatin, which is a complex of histone proteins and DNA, are closely coupled to changes in gene transcription. This complex and tightly regulated relationship is made possible by the post-translational modification of the DNA-packing histones present in chromatin. Chromatin contains the entire genomic DNA of eukaryotic cells, and functions as the primary regulator controlling global dynamic changes in gene expression and silencing. Nucleosomes are the primary components of chromatin and pack 147-bp lengths of DNA in two super-helical turns around a histone octamer, which consists of a histone-3-histone-4 (H3-H4) tetramer and two H2A-H2B dimers. These nucleosome core particles are connected by short lengths of DNA between the linker histones H1 and H5 to form a nucleosomal filament within the higher-order structure of the chromatin fiber. Within the chromatin structure, the N- and C-termini of the core histones protrude from the nucleosome particles and are subject to various post-translational modifications, including acetylation, methylation, phosphorylation, ubiquitination and small ubiquitin-like modifier (SUMO)ylation. These site-specific modifications may act collectively in the cell nucleus to orchestrate genomic stability and gene expression or repression [1–3]. Of the various histone modifications, lysine acetylation [4] is the most dynamic, as this modification directs both structural changes to chromatin and gene transcription [5–7].

The dynamic role of lysine acetylation in gene transcription is attributable, at least in part, to the bromodomain (BRD), which is the only protein domain known to act as an acetyl-lysine binding domain [8]. BRD-containing proteins have also been implicated in disease processes, including cancer, inflammation and viral replication. This review describes the structural and functional features of human BRDs in chromatin biology and gene transcription.

The bromodomain fold and acetyl-lysine recognition

The available structures of the BRD from the human transcriptional co-activator PCAF (p300/CREB-binding protein-associated factor) [9,10] and the BRDs from the transcriptional protein TAF1 (transcription initiation factor TFIID subunit 1) [11] reveal that the BRDs of the histone acetyltransferase GCN5 [12,13], the co-activator CBP (CREB binding protein) [14], the BET family protein BRD2 [15], BPTF (the BRD and plant homeodomain [PHD] finger-containing transcription factor) [16] and SNF2L4 (a SWI/SNF remodeling complex protein) [17] all adopt a distinct structural fold of a left-handed four-helix bundle (αZ, αA, αB and αC), termed the 'BRD fold'. The inter-helical αZ-αA (ZA) and αB-αC (BC) loops constitute a hydrophobic pocket that recognizes the acetyl-lysine (Figure 1A). Notably, the structural features of BRD/acetyl-lysine binding differ markedly from those of chromodomain/methyl-lysine binding, in which a methyl-lysine sequence forms an anti-parallel β-strand to the β-barrel structure of the chromodomain [18,19]. The modular nature of the BRD fold enables the BRD to act as a functional unit within a protein, either individually or in combination with other modules.

Figure 1. Structural basis of acetyl-lysine recognition by the bromodomain.

(A) The 3-D structure of the CREB binding protein (CBP) bromodomain (BRD) bound to an H4K20ac peptide (PDB code: 2RNY); and (B) the acetyl-lysine binding site, showing the key interactions between the CBP BRD and an H4K20ac peptide. The peptide is yellow and the side chains of the protein residues are color-coded by atom type.

Despite the structurally conserved BRD fold, the overall sequence similarity between members of the BRD family is not high, and there are significant variations in the sequences of the ZA and BC loops [20]. Nevertheless, the amino acid residues that are engaged in acetyl-lysine recognition are among the most conserved residues in the large BRD family, and correspond to Tyr1125, Tyr1167 and Asn1168 in CBP (Figure 1B) [14,21]. The crystal structure of the yeast GCN5 BRD bound to a histone H4 peptide containing acetylated Lys16 revealed that, in addition to contacting the conserved Tyr364 and Tyr406 residues (corresponding to Tyr1125 and Tyr1167 in CBP, respectively), the acetyl-lysine residue forms a specific hydrogen bond between the oxygen of the acetyl carbonyl group and the side-chain amide nitrogen of the conserved asparagine residue, Asn407 (corresponding to Asn1168 in CBP) [13]. A network of water-mediated hydrogen bonds involving carbonyl groups from the protein backbone at the base of the binding pocket also contributes to acetyl-lysine binding. The critical role of these three conserved amino acid residues in acetyl-lysine recognition has been confirmed by mutagenesis studies [9,10,14], and data demonstrate that most of the BRD family members function as acetyl-lysine binding domains [9]. Significantly, the key Asn1168 residue in CBP (Asn407 in GCN5) is not present in a small subgroup of BRDs, such as that of the transcriptional corepressor TIF1β (transcription intermediate factor 1β) or the sixth BRD in the human Polybromo protein. The former BRD does not bind to lysine-acetylated histones [22], whereas the latter BRD does [Zhou MM: unpublished data], suggesting that there may be another mode of acetyl-lysine binding to the conserved BRD fold.

The human bromodomain family

The human genome encodes 42 BRD-containing proteins, each of which contains between one and six BRDs [23]. The total number of unique individual human BRDs is 56 (2 human proteins, which contain 2 BRDs each, are both annotated as BRD2 and share > 99% sequence identity; their corresponding BRDs are identical in sequence [23]). These numbers can be contrasted with those of BRDs in the yeast genome, which encodes only nine BRD-containing proteins with a total of 14 BRDs [23]. The structural diversity of the human BRD family can be examined indirectly by clustering the 56 BRD sequences into groups that share similar sequence length and at least 35% sequence identity [24]. This yields nine groups, each containing at least two BRDs, and eight outliers (see Table 1).

Table 1. Classification of the human bromodomain family.

| Group | Bromodomains | Reference |
| --- | --- | --- |
| 1 | PCAF, hGCN5, TIF1α, TIF1γ, TRI66, Rack7, BAZ2A, BAZ2B, NURF (isoform 2), Sp100, Sp110, Sp140, BRD2(1), BRD2(2), BRD3(1), BRD3(2), BRD4(1), BRD4(2), TAFII210(1), TAFII210(2), TAF1(1), TAF1(2), BRDT(1) and BRDT(2) | [9,11,12,16,27]* |
| 2 | SNF2L2, SNF2L4, Polybromo(2), Polybromo(4) and Polybromo(5) | [17,29]* |
| 3 | BRD1, BRD7, BRD9, HOTTL and BRD and PHD finger-containing protein 3 | [30]* |
| 4 | BRD 1 of BRD and WD-repeat-containing proteins 1, 2 and 3 | - |
| 5 | BRD 2 of BRD and WD-repeat-containing proteins 1, 2 and 3 | - |
| 6 | ATAD2 and ATAD2B | * |
| 7 | Polybromo(1) and Polybromo(3) | - |
| 8 | CBP and p300 | [14]* |
| 9 | BRD8(1) and BRD8(2) | - |
| Outliers | Polybromo(6), HRX/ALL-1, ASH1L, BAZ1B, BAZ1A, MYND11, TIF1β and CECR2 | [22] |

A number in parentheses denotes the position of the BRD within a protein that contains more than one BRD.
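
The grouping criterion described above (at least 35% pairwise sequence identity) can be illustrated with a minimal sketch. The text does not state which clustering algorithm was used, so single-linkage clustering is assumed here purely for illustration; the sequence-length criterion is ignored, the sequences are assumed to be rows of an existing alignment, and the toy sequences and the function names `percent_identity` and `cluster_by_identity` are hypothetical.

```python
from __future__ import annotations
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identity over alignment columns where neither row has a gap.

    Assumes a and b are rows of the same alignment (equal length, '-' = gap).
    """
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def cluster_by_identity(aligned: dict[str, str], threshold: float = 35.0) -> list[set[str]]:
    """Single-linkage clustering (an assumption, not the authors' stated method).

    Two sequences end up in the same cluster if a chain of pairwise
    identities >= threshold connects them.
    """
    parent = {name: name for name in aligned}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(aligned, 2):
        if percent_identity(aligned[a], aligned[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters: dict[str, set[str]] = {}
    for name in aligned:
        clusters.setdefault(find(name), set()).add(name)
    return list(clusters.values())


# Hypothetical toy alignment; these are NOT real bromodomain sequences.
toy = {
    "seqA": "ACDEFGHIKL",
    "seqB": "ACDEFGHIKV",  # 90% identical to seqA
    "seqC": "ACDEWWWWWW",  # 40% identical to seqA and to seqB
    "seqD": "YYYYYYYYYY",  # shares nothing with the others
}
clusters = cluster_by_identity(toy)
groups = [c for c in clusters if len(c) >= 2]    # analogous to the numbered groups
outliers = [c for c in clusters if len(c) == 1]  # analogous to the outliers
```

With single linkage, a sequence joins a group as soon as it exceeds the threshold against any one member; a stricter linkage rule would generally yield smaller groups and more outliers.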

The dominant BRD group has 24 members, including the BRDs of several double-BRD-containing proteins such as BRD2, BRD3, BRD4, the TFIID (transcription initiation factor) 210-kDa subunit, TAF1 (the TFIID 250-kDa subunit) and the testis-specific protein BRDT [Sanchez R: unpublished data]. The same group includes BRDs from single-BRD-containing proteins such as the histone acetyltransferases PCAF and GCN5, TIF1α, TIF1γ, tripartite motif-containing protein 66, protein kinase C-binding protein 1 (Rack7), BAZ2A (BRD adjacent to zinc finger domain protein 2A), BAZ2B, BRD PHD finger transcription factor isoform 2, and the nuclear proteins Sp100, Sp110 and Sp140 [Sanchez R: unpublished data]. The structures of several proteins in this group have been solved [9,11,12,16]. The RIKEN Structural Genomics/Proteomics Initiative (RSGI) [25,26] and the Structural Genomics Consortium (SGC) [26] have determined the structures of the BRDs of BRD2, BRD3, BRD4, BRDT, TIF1α and BAZ2B. Huang et al determined the solution-state structure of the second BRD of BRD2 and demonstrated that this BRD is monomeric in solution and interacts dynamically with the acetylated Lys12 residue of histone H4 [27].

The second BRD group includes the BRDs of SNF2L2 and Brg-1, and the second, fourth and fifth BRDs of the Polybromo protein. The RSGI and the SGC have determined structures for the SNF2L2 BRD and the fifth BRD of the Polybromo protein [26,28], and research groups at two laboratories recently determined the structure of the Brg-1 BRD [17,29]. Shen et al used a solution structure and data from NMR perturbation studies to demonstrate that the Brg-1 BRD interacts with an H3K14ac peptide [17]. A crystal structure of the same domain exhibited an unusual small β-sheet in the ZA loop [29].

The third group of human BRD-containing proteins includes BRD7 and BRD9, as well as three proteins that contain a PHD in addition to BRDs. Sun et al reported a solution structure for the BRD7 BRD, and used NMR and titration analysis with several acetylated histone peptides to demonstrate that this BRD lacks inherent histone binding specificity in vitro [30]. The RSGI determined the BRD structure of HOTTL (tubulin-tyrosine ligase-like protein 3), one of the BRD and PHD containing proteins [28].

The next two groups of human BRDs correspond to the N-terminal and C-terminal BRDs of three BRD- and tryptophan-aspartate (WD)-repeat-containing proteins, respectively [Sanchez R: unpublished data]. No experimentally determined structures were available for these six BRDs.

The remaining four groups of human BRDs contain pairs of similar BRDs: the BRDs of ATAD2 (ATPase family AAA domain-containing protein 2; ANCCA) and ATAD2B, the structures of which were determined by the RSGI and the SGC [26,28]; the first and third BRDs of human Polybromo; the BRDs of CBP and p300 (solution [14] and crystallographic (SGC) [26] structures were available for the BRD of CBP); and the two BRDs of BRD8.

The eight outliers in this classification of human BRDs correspond to the sixth BRD of Polybromo, and the BRDs of the zinc finger proteins HRX/ALL-1, ASH1L (absent small and homeotic disks protein 1 homolog), BAZ1B, BAZ1A, MYND (myeloid translocation protein 8, Nervy, and DEAF-1) domain-containing protein 11, TIF1β and CECR2 (cat eye syndrome critical region protein 2) [Sanchez R: unpublished data]. Of these BRDs, an experimentally determined structure was available for TIF1β, which formed a structural unit with the adjacent PHD finger [22].

In total, the structures of 23 of the 56 human BRDs have been experimentally determined. Of the 33 BRDs without experimental structures, 21 share > 35% sequence identity with a protein of known structure, and it is therefore possible to construct reasonably accurate homology models for these BRDs [24]. Although these data represent thorough coverage of the structures of the BRD family proteins, no structural information is available for 12 of the 56 human BRDs. The unique sequences of these 12 BRDs of unknown structure (which either have low overall sequence similarity or long insertions with respect to the sequences of proteins with known structures) may contain special structural or functional features that have not yet been observed in the better characterized BRDs. Additionally, although the structural description of the BRDs is almost complete, many more structures of BRD-ligand complexes are required to facilitate a detailed understanding of ligand-binding selectivity. The existence of tandem domains (BRD-BRD, PHD-BRD or BRD-PHD), which are tightly associated with each other in sequence and most likely also in the 3-D structure, adds another level of complexity to the structural description of BRDs.
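
The coverage figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only the counts stated in the text; the variable names and the helper `pct` are illustrative.

```python
# Counts as stated in the text.
total_brds = 56
solved = 23      # experimentally determined structures
modelable = 21   # no structure, but > 35% identity to a protein of known structure

unsolved = total_brds - solved             # BRDs without experimental structures
no_structural_info = unsolved - modelable  # neither solved nor modelable


def pct(n: int, total: int = total_brds) -> float:
    """Share of all human BRDs, as a percentage rounded to one decimal place."""
    return round(100 * n / total, 1)


coverage = {
    "experimental structure": pct(solved),
    "homology-modelable": pct(modelable),
    "experimental or modelable": pct(solved + modelable),
    "no structural information": pct(no_structural_info),
}
for label, value in coverage.items():
    print(f"{label}: {value}%")
```

On these numbers, roughly four in five human BRDs have either an experimental structure or a template suitable for homology modeling, and about one in five has neither.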

The association of bromodomains with other chromatin modules

BRDs are promiscuous domains in that they occur in a variety of proteins with different domain architectures, and can be considered functionally independent (ie, BRD-containing proteins do not all perform the same function) [31,32]. For example, the BRD-containing protein PCAF is a histone lysine acetyltransferase, whereas HRX/ALL-1 is a histone lysine N-methyltransferase and SNF2L2 is an ATP-dependent helicase. More than 15 different domain types have been found within the same proteins as BRDs, including the PHD, PWWP, B-box type zinc finger, RING finger, SAND, FY-rich, SET, TAZ zinc finger, helicase, ATPase, BAH (bromo-adjacent homology) domain, WD40 repeat and MBD (methyl-CpG binding domain) [23].

The domain that is most frequently associated with the BRD is the PHD finger, which is a C4HC3 zinc-finger-like motif present in nuclear proteins. A PHD has been identified in 19 of the 42 human BRD-containing proteins. In 12 of these proteins the PHD and BRD are separated by a short amino acid sequence (< 30 residues) and may form structurally interdependent tandem PHD/BRD arrangements such as those observed in TIF1β [22]. The TIF1β structure contains a distinct scaffold that unifies the two protein modules, in which the Z helix of the BRD forms a hydrophobic core that anchors the other three helices of the BRD on one side and the PHD finger on the other. A comprehensive structure-function analysis correlating transcriptional repression, UBC9 (ubiquitin-conjugating enzyme 9) binding and SUMOylation demonstrated that the PHD finger and BRD of TIF1β cooperate as a functional unit to facilitate lysine SUMOylation, which is required for TIF1β corepressor activity in gene silencing [22,33]. These results identified a unified function for the tandem PHD/BRD as an intramolecular SUMO E3 ligase for transcriptional silencing. The ligase activity is a divergent function for the BRD, which does not bind to lysine-acetylated histones in this form. In contrast to TIF1β, the structure of BPTF, which also contains a PHD finger and a BRD separated by a short linker [16], did not reveal any significant structural interactions between the two domains. In BPTF, the PHD finger recognized the trimethylated Lys4 residue of histone H3 (H3K4me3) [16,34]. However, the histone binding specificity of the BPTF BRD was not established. These examples demonstrate not only that BRD-containing proteins vary significantly in function, but also that the BRD itself may have different binding activities as a consequence of other associated domains.

The second most common domain association for the BRD is an association with another BRD; of the 42 BRD-containing proteins, 12 contain 2 or more BRDs. With the exception of Polybromo, which contains six BRDs, all other proteins with multiple BRDs contain two BRDs. In the transcription initiation factor TAF1 and TFIID 210-kDa subunits, as well as in some of the Polybromo BRD pairs, the two BRDs are separated by short amino acid sequences (< 20 residues). The structure of the TAF1 BRDs suggests that they form a tandem arrangement that binds selectively to multiple acetylated histone H4 peptides [11]. Another tandem BRD arrangement was observed in the yeast Rsc4 protein, which is related to human Polybromo [35]. The yeast Rsc4 structure contains a compact BRD tandem that binds H3K14ac in the second BRD and the acetylated Lys25 residue of Rsc4 itself in the first BRD, suggesting an autoregulatory mechanism [35]. The arrangements of the tandem BRDs in TAF1 and yeast Rsc4 are different; therefore, whether the arrangement of tandem BRDs is protein-specific or evolutionarily conserved is unclear. The structures of additional putative tandem BRDs (such as those from Polybromo) will be necessary to determine whether BRD arrangements are evolutionarily conserved.
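
The protein and domain counts given in this review can be reconciled with a short calculation. The sketch below assumes that both BRD2-annotated entries are counted among the 12 multi-BRD proteins, which the text does not state explicitly; the group sizes are tallied from Table 1, and all variable names are illustrative.

```python
# Counts as stated in the text.
proteins = 42
multi_brd_proteins = 12  # proteins with 2 or more BRDs
polybromo_brds = 6       # Polybromo is the only protein with more than 2 BRDs
duplicate_brds = 2       # assumption: the second BRD2 entry repeats 2 identical BRDs

single_brd_proteins = proteins - multi_brd_proteins  # 30
double_brd_proteins = multi_brd_proteins - 1         # 11, all except Polybromo

total_brds = single_brd_proteins + 2 * double_brd_proteins + polybromo_brds
unique_brds = total_brds - duplicate_brds

# Sizes of groups 1 to 9 followed by the outliers, tallied from Table 1.
table1_sizes = [24, 5, 5, 3, 3, 2, 2, 2, 2, 8]

print(total_brds, unique_brds, sum(table1_sizes))
```

Under that assumption, the 42 proteins carry 58 BRDs in total, of which 56 are unique in sequence, matching both the figure quoted in the text and the sum of the group sizes in Table 1.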

In human proteins, the association of BRDs with domains other than PHDs or BRDs occurs infrequently. For example, BRDs can associate with either PWWP or B-box zinc finger domains, but in each case these domains are present in only four human BRD-containing proteins. Additionally, of the non-BRD domains that associate with BRDs, none occur in as close proximity to a BRD within the protein sequence as either the PHD or BRDs that form the tandem motifs [23]. The association between PHDs and BRDs often observed in human proteins is absent in yeast, although yeast does express proteins that contain both BRDs and PHDs [23].

Functions of human bromodomain proteins

The complexity and variability of the domain composition of human BRD-containing proteins, and the influence of neighboring domains (such as the PHD) on the function of the BRD itself, make it difficult to predict the function of BRD-containing proteins based on sequence similarity alone. Many of the human BRD-containing proteins do not have well-characterized functions, although some have been implicated in disease processes. The most recent data on human BRD-containing protein function and disease involvement are reviewed in this section. However, the specific function of any of the BRDs remains to be elucidated.

BRD4 plays an important role in various biological processes by means of its two BRDs. This protein functions in the inflammatory response as a co-activator for the transcriptional activation of NFκB, via the binding of the BRDs to the acetylated Lys310 residue on the RelA subunit of NFκB [36]. BRD4 also stimulates G1 gene transcription and promotes cell-cycle progression to the S phase [37]. Additionally, BRD4 can control the transcription of viral genes. For example, this protein regulates HIV transcription by inducing the phosphorylation of CDK9 (cyclin-dependent kinase 9) at the Thr29 residue in the HIV transcription initiation complex, thereby inhibiting CDK9 kinase activity and, in turn, HIV transcription [38]. BRD4 also inhibits the proteasomal degradation of the papillomavirus E2 protein [39]. Furthermore, BRD4 associates with Kaposi's sarcoma-associated herpesvirus-encoded LANA-1 (latency-associated nuclear antigen) through molecular interactions involving the C-terminal region [40] and extraterminal domain [41] of BRD4. Additionally, both BRD4 and BRD2 interact with the murine γ-herpesvirus 68 protein orf73, which is required to establish viral latency in vivo [42]. Finally, BRD4 activation may also predict the survival of patients with breast cancer [43]. Crawford et al proposed that the activation of BRD4 manipulates the response of the tumor to its microenvironment in vivo, resulting in a reduction of tumor growth and pulmonary metastasis in mice [44]. Microarray analysis of multiple human mammary tumor cell lines demonstrated that the activation of BRD4 was predictive of progression and/or survival. These results suggest that the dysregulation of BRD4-associated pathways may play an important role in breast cancer progression.

BRD2 and BRD3 have been shown to couple histone acetylation to transcription in vivo: in human 293 cells, these proteins preferentially associated with specific H4 modifications along the entire lengths of genes, and allowed RNA polymerase II to transcribe through the nucleosomes [43]. BRD2 also exhibited histone chaperone activity [43]. In mice, BRD2 is essential for embryonic development [45], and an association between BRD2 and juvenile myoclonic epilepsy in humans has been reported [46].

In mice, the BRD and WD-repeat-containing protein BRWD1 is required for normal spermiogenesis and the oocyte-embryo transition [47]. A mutation in BRWD1 leads to mice that are phenotypically normal but infertile.

The BRD of the transcriptional co-activator p300 was suggested to play a role in the IL-6 signaling pathway, by mediating the interaction of the STAT3 amino-terminal domain with p300, thereby stabilizing enhanceosome assembly [48].

ATAD2 is an estrogen-regulated ATPase co-activator with a BRD that functions in both estrogen receptor α and androgen receptor signaling. This protein is required for the formation of transcriptional coregulator complexes at chromatin and the modification of chromatin [49]. Chen and colleagues suggested that ATAD2 plays an important role in prostate cancer by mediating specific androgen receptor functions involved in cancer cell survival and proliferation [50].

Conclusion

The role of the BRD as the sole protein domain known to recognize acetyl-lysine residues on proteins is more complex than initially thought [9,11,51]. Studies of individual BRDs, which have focused on the structural characterization of the domains and their interactions with ligands, identified varied ligand-binding specificities that were dependent not only on the characteristics of the BRD itself, but also on the other domains (BRD and non-BRD) present in the same protein. Studies of BRD-containing proteins have highlighted the role of these domains in many important biological processes and their association with disease. The characterization of the multiplicity of molecular interactions mediated by BRDs is therefore essential for deciphering the role of individual domains and proteins. This challenging task may be facilitated by the high structural coverage of the human BRD family, which presents a unique opportunity for the rational design of selective small molecules that could serve as tools to modulate and control gene expression in human biology.

Acknowledgements

The authors were supported by the grants GM081713 (Roberto Sanchez), MCB0517352 (Ming-Ming Zhou/R Sanchez), CA87658 and HG004508 (MM Zhou).

Abbreviations

ATAD: ATPase family AAA domain-containing protein
BPTF: BRD and PHD finger-containing transcription factor
BRD: bromodomain
CBP: CREB binding protein
H: histone
PCAF: p300/CREB-binding protein-associated factor
PHD: plant homeodomain
SUMO: small ubiquitin-like modifier
TAF1: transcription initiation factor TFIID subunit 1
TIF: transcription intermediate factor
TFIID: transcription initiation factor
WD: tryptophan-aspartate
ZA: αZ-αA

References

•• of outstanding interest

• of special interest