Prediction of Cytochrome P450 Profiles of Environmental Chemicals with QSAR Models Built from Drug-like Molecules
Hongmao Sun et al. Mol Inform. 2012.
Abstract
The human cytochrome P450 (CYP) enzyme family is involved in the biotransformation of many xenobiotics. As part of the U.S. Tox21 Phase I effort, we profiled the CYP activity of approximately three thousand compounds, primarily those of environmental concern, against the human CYP1A2, CYP2C19, CYP2C9, CYP2D6, and CYP3A4 isoforms in a quantitative high-throughput screening (qHTS) format. To evaluate how accurately computational models built from a drug-like library, screened in these five CYP assays under the same conditions, can predict the outcome for an environmental compound library, five support vector machine (SVM) models built from over 17,000 drug-like compounds were challenged to predict the CYP activities of the Tox21 compound collection. Although a large fraction of the test compounds fall outside the applicability domain (AD) of the models, as measured by _k_-nearest neighbor (_k_-NN) similarities, the predictions were largely accurate for the CYP1A2, CYP2C9, and CYP3A4 isozymes, with areas under the receiver operating characteristic curve (AUC-ROC) ranging between 0.82 and 0.84. The lower predictive power of the CYP2C19 model (AUC-ROC = 0.76) is caused by experimental errors, whereas that of the CYP2D6 model (AUC-ROC = 0.76) can be rescued by rebalancing the training data. Our results demonstrate that decomposing molecules into atom types enhanced the coverage of the AD, and that computational models built from drug-like molecules can be used to predict the ability of non-drug-like compounds to interact with these CYPs.
Keywords: Human CYPs; Predictive Capacity; Predictive Toxicology; QSAR models; SVM.
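The AUC-ROC values quoted in the abstract can be read as the probability that a randomly chosen active compound receives a higher model score than a randomly chosen inactive one. The following is a minimal sketch of that rank-based definition, not the authors' evaluation code; the function name `auc_roc` and the toy labels and scores are assumptions made for illustration.

```python
def auc_roc(labels, scores):
    """AUC-ROC via pairwise comparison of actives (1) and inactives (0).

    Counts the fraction of (active, inactive) pairs in which the active
    has the higher score; tied scores count as half a win.
    """
    actives = [s for y, s in zip(labels, scores) if y == 1]
    inactives = [s for y, s in zip(labels, scores) if y == 0]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive compound")

    wins = 0.0
    for a in actives:
        for i in inactives:
            if a > i:
                wins += 1.0
            elif a == i:
                wins += 0.5
    return wins / (len(actives) * len(inactives))


# Hypothetical toy data: two actives, two inactives.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = auc_roc(toy_labels, toy_scores)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of compounds, which keeps the definition explicit; a sort-based rank sum would be preferable for large screening sets.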
Figures
Figure 1
Confusion matrix of a two-class model and common performance metrics calculated from it.
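The caption does not list which metrics the figure shows, so the sketch below computes a set of standard two-class metrics (sensitivity, specificity, precision, accuracy, and the Matthews correlation coefficient) from the four cells of a confusion matrix. The choice of metrics, the function name `confusion_metrics`, and the example counts are assumptions for illustration and are not taken from the paper.

```python
import math


def confusion_metrics(tp, fp, tn, fn):
    """Standard two-class metrics from confusion-matrix counts.

    tp/fn: actives predicted active/inactive.
    tn/fp: inactives predicted inactive/active.
    Undefined ratios (zero denominator) are returned as NaN.
    """
    def ratio(num, den):
        return num / den if den else float("nan")

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "precision": ratio(tp, tp + fp),     # positive predictive value
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }


# Hypothetical counts, chosen only to exercise the function.
example = confusion_metrics(tp=40, fp=10, tn=45, fn=5)
```

Because CYP activity data sets are often imbalanced, accuracy alone can be misleading, which is why class-specific rates and a balanced measure such as MCC are included alongside it.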
Figure 2
Venn diagrams showing the overlaps of (A) CYP1A2, CYP2C9, and CYP2C19 actives, (B) CYP1A2, CYP2C9, and CYP2D6 actives, (C) CYP1A2, CYP2C9, and CYP3A4 actives, and (D) a histogram showing the number of compounds active in 0 to 5 CYP isozymes.
Figure 3
The 5-NN similarity distributions of CYP3A4 test compounds (gray bars) and the percentages of correctly predicted compounds (black lines), binned by averaged 5-NN similarity. (A) Tanimoto similarities using ECFP_4 and (B) cosine similarities using atom typing descriptors.
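The averaged 5-NN similarity used here as an applicability-domain measure can be sketched as follows: for each test compound, compute its similarity to every training compound and average the five highest values. This is an illustrative sketch under stated assumptions, not the authors' implementation. Fingerprints are represented as plain Python sets of on-bit indices and atom-type descriptors as count vectors; generating real ECFP_4 fingerprints or atom-typing descriptors is outside its scope, and all function names and toy data are hypothetical.

```python
import math


def tanimoto(bits_a, bits_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    union = len(bits_a | bits_b)
    return len(bits_a & bits_b) / union if union else 0.0


def cosine(vec_u, vec_v):
    """Cosine similarity between two equal-length descriptor count vectors."""
    dot = sum(x * y for x, y in zip(vec_u, vec_v))
    norm = math.sqrt(sum(x * x for x in vec_u)) * math.sqrt(sum(y * y for y in vec_v))
    return dot / norm if norm else 0.0


def mean_knn_similarity(query, training, similarity, k=5):
    """Average similarity of `query` to its k most similar training compounds."""
    if not training:
        raise ValueError("training set is empty")
    sims = sorted((similarity(query, t) for t in training), reverse=True)
    top = sims[:k]  # fewer than k values if the training set is small
    return sum(top) / len(top)


# Hypothetical toy fingerprints (sets of on-bit indices).
toy_training = [{1, 2}, {1}, {3}]
toy_score = mean_knn_similarity({1, 2}, toy_training, tanimoto, k=2)
```

A test compound with a low averaged similarity lies far from the training data, so its prediction falls outside the applicability domain under this measure. Passing `cosine` instead of `tanimoto` gives the atom-type variant described in panel (B).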