Inferring binding energies from selected binding sites
Yue Zhao et al. PLoS Comput Biol. 2009 Dec.
Abstract
We employ a biophysical model that accounts for the non-linear relationship between binding energy and the statistics of selected binding sites. The model includes the chemical potential of the transcription factor, non-specific binding affinity of the protein for DNA, as well as sequence-specific parameters that may include non-independent contributions of bases to the interaction. We obtain maximum likelihood estimates for all of the parameters and compare the results to standard probabilistic methods of parameter estimation. On simulated data, where the true energy model is known and samples are generated with a variety of parameter values, we show that our method returns much more accurate estimates of the true parameters and much better predictions of the selected binding site distributions. We also introduce a new high-throughput SELEX (HT-SELEX) procedure to determine the binding specificity of a transcription factor in which the initial randomized library and the selected sites are sequenced with next generation methods that return hundreds of thousands of sites. We show that after a single round of selection our method can estimate binding parameters that give very good fits to the selected site distributions, much better than standard motif identification algorithms.
Conflict of interest statement
The authors have declared that no competing interests exist.
Figures
Figure 1. Effect of Mu on binding probabilities.
(A) Prior distribution of binding energy for the Mnt half-site, with equiprobable background frequency. (B) Binding probability as a function of binding energy, according to equation (1). Colors correspond to values of μ. Black: μ = −3.48, Red: μ = −0.85, Blue: μ = 2.2. These values were chosen such that binding probabilities of the consensus sequence are 0.03, 0.3 and 0.9, respectively. No non-specific binding energy is used. (C) Posterior distribution of binding energy.
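The relationship between the chemical potential (the "Mu" of the figure title, written μ here) and the quoted consensus binding probabilities can be checked with a short sketch. Equation (1) is not reproduced in this excerpt, so the logistic form P = 1 / (1 + exp(E − μ)), energies in units of kT, and a consensus energy of 0 are all assumptions; they are at least consistent with the probabilities of 0.03, 0.3 and 0.9 quoted in the caption. The toy prior and energies used for the posterior are hypothetical and only illustrate how selection reweights a prior as in panel (C).

```python
import math

def binding_probability(energy, mu):
    """Assumed form of equation (1): P(bound) = 1 / (1 + exp(energy - mu)).

    Energies and mu are taken to be in units of kT (an assumption).
    """
    return 1.0 / (1.0 + math.exp(energy - mu))

def posterior(prior, energies, mu):
    """Reweight a prior over energies by the binding probability and renormalize."""
    weights = [p * binding_probability(e, mu) for p, e in zip(prior, energies)]
    total = sum(weights)
    return [w / total for w in weights]

# Assumption: the consensus site defines the zero of the energy scale.
CONSENSUS_ENERGY = 0.0

# Values of mu quoted in the caption of Figure 1.
caption_mu = {"black": -3.48, "red": -0.85, "blue": 2.2}

consensus_probs = {
    color: binding_probability(CONSENSUS_ENERGY, mu)
    for color, mu in caption_mu.items()
}
for color, p in consensus_probs.items():
    print(f"{color}: mu = {caption_mu[color]:+.2f} -> P(consensus bound) = {p:.2f}")

# Hypothetical toy prior over four energy levels (not data from the paper).
toy_energies = [0.0, 2.0, 4.0, 6.0]
toy_prior = [0.05, 0.25, 0.45, 0.25]
toy_posterior = posterior(toy_prior, toy_energies, caption_mu["black"])
```

Under these assumptions, a low μ (low protein concentration) shifts the posterior strongly toward low-energy sites, while a high μ leaves it closer to the prior because most sites are bound regardless of energy.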
Figure 2. Examples of Simulation Results.
Top Panel (A–C): Effects of μ. Non-specific energy was set to 30 so as to have negligible effect on binding. (A) μ = −3.48 (B) μ = −0.85 (C) μ = 2.2. Bottom Panel (D–F): Effects of the non-specific energy E_ns at the low concentration limit. μ was set to −100. (D) E_ns = 13.82 (E) E_ns = 11.51 (F) E_ns = 9.21. These values were chosen such that the relative affinity of the consensus sequence to non-specific binding is (D) 1,000,000 (E) 100,000 (F) 10,000.
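The bottom-panel values can be sanity-checked numerically. The sketch below assumes the same logistic binding probability P = 1 / (1 + exp(E − μ)) with energies in kT and a consensus energy of 0, and it treats non-specific binding as if it were a site with energy equal to the non-specific energy (written E_ns here). These are illustrative assumptions, not the paper's full model. In the low concentration limit, P is approximately exp(μ − E), so the ratio of consensus to non-specific binding reduces to exp(E_ns).

```python
import math

def binding_probability(energy, mu):
    # Assumed logistic form; energies and mu in units of kT (an assumption).
    return 1.0 / (1.0 + math.exp(energy - mu))

MU_LOW = -100.0          # caption: mu set to -100 for the low concentration limit
CONSENSUS_ENERGY = 0.0   # assumption: consensus site is the zero of the energy scale

# Non-specific energies quoted for panels D-F.
caption_ens = {"D": 13.82, "E": 11.51, "F": 9.21}

ratios = {}
for panel, e_ns in caption_ens.items():
    p_consensus = binding_probability(CONSENSUS_ENERGY, MU_LOW)
    p_nonspecific = binding_probability(e_ns, MU_LOW)
    # At very negative mu this ratio approaches exp(e_ns).
    ratios[panel] = p_consensus / p_nonspecific
    print(f"{panel}: ratio = {ratios[panel]:.3e}, exp(E_ns) = {math.exp(e_ns):.3e}")
```

The quoted energies are the natural logarithms of 10^6, 10^5 and 10^4 rounded to two decimals, so the computed ratios agree with the caption's round numbers to within about half a percent.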
Figure 3. Re-analysis of Maerkl & Quake data.
(A) Fit of the point estimate of binding energy, as done in the Maerkl & Quake paper. (B) BEEML fit with a PWM energy model and a non-specific energy parameter. (C) BEEML fit with a position-specific di-nucleotide energy model and a non-specific energy parameter. (Note that in a previous analysis of this data there was an error in equation (2), and equation (2) from this paper is the correct model.)
Figure 4. Fit of BEEML and BioProspector model to SELEX data.
Grants and funding
- T32 HG00045/HG/NHGRI NIH HHS/United States
- R01 GM078222/GM/NIGMS NIH HHS/United States
- R01 HG00249/HG/NHGRI NIH HHS/United States
- T32 HG000045/HG/NHGRI NIH HHS/United States
- R01 HG000249/HG/NHGRI NIH HHS/United States