The role of DNA shape in protein-DNA recognition

Remo Rohs et al. Nature. 2009.

Abstract

The recognition of specific DNA sequences by proteins is thought to depend on two types of mechanism: one that involves the formation of hydrogen bonds with specific bases, primarily in the major groove, and one involving sequence-dependent deformations of the DNA helix. By comprehensively analysing the three-dimensional structures of protein-DNA complexes, here we show that the binding of arginine residues to narrow minor grooves is a widely used mode for protein-DNA recognition. This readout mechanism exploits the phenomenon that narrow minor grooves strongly enhance the negative electrostatic potential of the DNA. The nucleosome core particle offers a prominent example of this effect. Minor-groove narrowing is often associated with the presence of A-tracts, AT-rich sequences that exclude the flexible TpA step. These findings indicate that the ability to detect local variations in DNA shape and electrostatic potential is a general mechanism that enables proteins to use information in the minor groove, which otherwise offers few opportunities for the formation of base-specific hydrogen bonds, to achieve DNA-binding specificity.
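The A-tract description above can be turned into a small sequence scan. The following is a minimal sketch, not the authors' procedure: it assumes an A-tract is a run of A/T bases at least `min_len` long that contains no TpA step. The function name `find_a_tracts` and the default `min_len=3` (borrowed from the Figure 4 caption) are illustrative choices.

```python
import re


def find_a_tracts(seq, min_len=3):
    """Return (start, end, subsequence) tuples, 0-based and end-exclusive.

    Assumed definition (not taken from the paper's methods): a run of A/T
    bases at least min_len long that contains no TpA step.
    """
    seq = seq.upper()
    tracts = []
    # Every maximal run of A/T bases is a candidate region.
    for run in re.finditer(r"[AT]+", seq):
        start = run.start()
        text = run.group()
        piece_start = 0
        # Split the run at each TpA step: the T closes one piece,
        # the A opens the next.
        for i in range(len(text) - 1):
            if text[i] == "T" and text[i + 1] == "A":
                if i + 1 - piece_start >= min_len:
                    tracts.append(
                        (start + piece_start, start + i + 1, text[piece_start:i + 1])
                    )
                piece_start = i + 1
        # Keep the final piece if it is long enough.
        if len(text) - piece_start >= min_len:
            tracts.append((start + piece_start, start + len(text), text[piece_start:]))
    return tracts
```

Under this assumed definition, `AAATTT` counts as a single A-tract, `TTTAAA` is split at its TpA step into two, and `TATA` yields none.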

Figures

Figure 1. Amino acid frequencies in minor grooves

(a) Histograms for each amino acid illustrate the frequency with which they are observed in any minor groove (green), in minor grooves with a width of ≥5.0 Å (blue), and in narrow minor grooves of <5.0 Å width (red). (b) Frequency of AT (W) and GC (S) base pairs in sequences of 229 sites contacted by arginines in narrow minor grooves. The central base pair (boxed) is contacted by arginine. Frequencies are symmetrized by using both complementary strands.
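The symmetrization mentioned in panel (b) can be sketched as follows. This is an illustration only, assuming each site is given as an equal-length, odd-length string centered on the contacted base pair; the function name `symmetrized_ws_frequencies` and the input layout are hypothetical, not taken from the paper.

```python
def symmetrized_ws_frequencies(sites):
    """Per-position fraction of AT (W) and GC (S) base pairs.

    Each site is counted twice, once as given and once as its reverse
    complement, so the resulting profile is symmetric about the center.
    Assumes all sites have the same length and contain only A, C, G, T.
    """
    complement = str.maketrans("ACGT", "TGCA")
    both_strands = []
    for site in sites:
        site = site.upper()
        both_strands.append(site)
        # Reverse complement: complement every base, then reverse.
        both_strands.append(site.translate(complement)[::-1])

    length = len(both_strands[0])
    total = len(both_strands)
    profile = []
    for pos in range(length):
        w_fraction = sum(1 for s in both_strands if s[pos] in "AT") / total
        profile.append({"W": w_fraction, "S": 1 - w_fraction})
    return profile
```

Counting both strands means position i and its mirror position always receive the same W and S fractions, which is why only the distance from the central base pair matters in such a plot.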

Figure 2. Distribution of tetranucleotide sequences according to average minor groove width

Tetranucleotides from structures with a minimum length of one helical turn for which minor groove width can be defined are ordered by average minor groove width (red). The widths of all tetranucleotides are shown (black) and the sequence, average width, and occurrence in our dataset are given in Supplementary Table 2. (a) The 59 unique tetranucleotides from free DNA structures. (b) The set of all 136 unique tetranucleotides derived from protein-DNA complexes.

Figure 3. Specific examples of minor groove shape recognition by arginines

DNA shapes of the binding sites of (a) Ubx-Exd (PDB code 1b8i), (b) MATa1/MATα2 (1akh), (c) Oct-1/PORE (1hf0), (d) the MogR repressor (3fdq), (e) the Tc3 transposase (1u78), and (f) the phage 434 repressor (2or1) are shown in GRASP surface representations, with convex surfaces color-coded in green and concave surfaces in grey/black. Plots of minor groove width (blue) and electrostatic potential in the center of the minor groove (red) are shown below. Arginine contacts (defined by the closest distance between the guanidinium groups and the bases) are indicated. A-tract sequences are highlighted by a solid red line, the TATA box in (e) by a dashed line.

Figure 4. Minor groove shape recognition in the nucleosome

(a) Correlation of minor groove width of the nucleosome core particle (PDB code 1kx5) (blue) and electrostatic potential (red). Arginine contacts (defined by the closest distance between the guanidinium groups and the bases) are indicated. A-tract sequences are highlighted by solid red lines. (b) Schematic representation of the DNA backbone in the nucleosome color-coded by minor groove width (red ≤4.0 Å, pink >4.0 Å and ≤5.0 Å, light blue >5.0 Å and ≤6.0 Å, dark blue >6.0 Å), including all arginines that contact the minor groove. (c) The distribution of A-tracts of length three base pairs or longer in 23,076 yeast nucleosome-bound DNA sequences. (d) Histogram of the occurrence of A-tracts of length three or longer in the same dataset.
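The color coding in panel (b) amounts to a four-way binning of minor groove width. A minimal sketch using the thresholds stated in the caption is given below; the function name `width_color_class` is illustrative and is not part of any published tool.

```python
def width_color_class(width_angstrom):
    """Map a minor groove width in Å to the color bins of Figure 4b."""
    if width_angstrom <= 4.0:
        return "red"
    if width_angstrom <= 5.0:
        return "pink"        # >4.0 Å and <=5.0 Å
    if width_angstrom <= 6.0:
        return "light blue"  # >5.0 Å and <=6.0 Å
    return "dark blue"       # >6.0 Å
```

Note that these bins differ slightly from the narrow-groove cutoff used in Figure 1, where narrow means strictly less than 5.0 Å.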

Figure 5. The biophysical origins of the negative potential of narrow minor grooves

Electrostatic potential in the minor groove of the MogR binding site (PDB code 3fdq), calculated in the presence of a dielectric boundary (ε = 2 in solute and ε = 80 in solvent; solid line) and in the absence of a boundary (ε = 80 in both solute and solvent; dashed line).
