Exploring QSAR of non-nucleoside reverse transcriptase inhibitors by artificial neural networks: HEPT derivatives

Exploring QSAR of Non-Nucleoside Reverse Transcriptase Inhibitors by Neural Networks: TIBO Derivatives

International Journal of Molecular Sciences, 2004

Human Immunodeficiency Virus type 1 (HIV-1) reverse transcriptase is an important target for chemotherapeutic agents against AIDS. 4,5,6,7-Tetrahydro-5-methylimidazo[4,5,1-jk][1,4]benzodiazepin-2(1H)-one (TIBO) derivatives are potent non-nucleoside reverse transcriptase inhibitors (NNRTIs). In the present work, the quantitative structure-activity relationship (QSAR) for a set of 82 TIBO derivatives has been investigated by means of a three-layered neural network (NN). Compared with the models given in the literature, NN has been shown to be a useful tool for QSAR analysis. NN gave good statistical results in both the fitting and prediction processes (0.861 ≤ r² ≤ 0.928, 0.839 ≤ q² ≤ 0.845). The relevant factors controlling the anti-HIV-1 activity of TIBO derivatives have been identified. The results are along the same lines as those of our previous studies on HEPT derivatives and indicate the importance of the hydrophobic parameter in modeling the QSAR for TIBO derivatives.

Artificial neural networks: Non-linear QSAR studies of HEPT derivatives as HIV-1 reverse transcriptase inhibitors

Molecular Diversity, 2004

Research on the structure-activity relationships of molecules with acidic carbon atoms led us to undertake a feasibility study on the determination of their acidity constants by capillary electrophoresis (CE). The studied molecules had diverse structures and were tetronic acid, acetylacetone, diethylmalonate, Meldrum's acid, 3-methylrhodanine, nitroacetic acid ethyl ester, pyrimidine-2,4,6-trione, 3-oxo-3-phenylpropionic acid ethyl ester, 1-phenylbutan-1,3-dione, 5,5-dimethylcyclohexan-1,3-dione and homophthalic anhydride. The pKa range explored by CE was therefore very large (from 3 to 12) and pKa values near 12 were evaluated by mathematical extrapolations. The analyses were carried out in CZE mode using a fused silica capillary grafted (or not) with hexadimethrine. Owing to the electrophoretic behaviour of these compounds according to the pH, their acidity constants could be evaluated and appeared in perfect agreement with the literature data obtained, a few decades ago, by means of potentiometry, spectrometry or conductimetry. The pKa values of homophthalic anhydride and 3-methylrhodanine were evaluated for the first time.

Neural Networks: Application for prediction of Anti-HIV-1 Activity of HEPT Derivatives

Structure-anti-HIV activity relationships were established for a sample of 80 1-[2-hydroxyethoxy-methyl]-6-(phenylthio)thymine (HEPT) derivatives using a three-layer neural network (NN). Eight structural descriptors and physicochemical variables were used to characterize the HEPT derivatives under study. The network's architecture and parameters were optimized in order to obtain good results. All the NN architectures were able to establish a satisfactory relationship between the molecular descriptors and the anti-HIV activity. NN gave better results than other models in the literature and has been shown to be particularly successful at identifying non-linear relationships.

Development of linear and nonlinear predictive QSAR models and their external validation using molecular similarity principle for anti-HIV indolyl aryl sulfones

Journal of Enzyme Inhibition and Medicinal Chemistry, 2008

Quantitative structure-activity relationship (QSAR) studies have been carried out on indolyl aryl sulfones, a class of novel HIV-1 non-nucleoside reverse transcriptase inhibitors, using physicochemical, topological and structural parameters along with appropriate indicator variables. The statistical tools used were linear methods (e.g., stepwise regression analysis, partial least squares (PLS), factor analysis followed by multiple regression (FA-MLR), genetic function approximation combined with multiple linear regression (GFA-MLR) and GFA followed by PLS or G/PLS) and a nonlinear method (artificial neural network or ANN). In the case of physicochemical parameters, GFA-MLR generated the best equation (n = 97, R² = 0.862, Q² = 0.821). Using topological parameters, the best equation (based on leave-one-out Q²) was obtained with the stepwise regression technique (n = 97, R² = 0.867, Q² = 0.811). When topological and physicochemical parameters were used in combination, statistical quality increased to a great extent (n = 97, R² = 0.891, Q² = 0.849 from stepwise regression). Furthermore, the whole dataset was divided into test (25% of the whole dataset) and training (remaining 75%) sets. Models were developed based on the training set, and the predictive potential of such models was checked on the test set. The selection of the training set was based on K-means clustering of the standardized descriptors (topological and physicochemical). In this case also the best results were obtained with stepwise regression (n = 72, R² = 0.906, Q² = 0.853), but the external predictive capacity of this model (R²pred = 0.738) was inferior to that of the model developed with the GFA-MLR technique (R² = 0.883, Q² = 0.823, R²pred = 0.760). However, the squared regression coefficient between observed and predicted activity values of the test set compounds for the best linear model, i.e., GFA-MLR (r² = 0.736), was lower than that of the best nonlinear model developed using an artificial neural network (r² = 0.781). Thus, based on external validation, the ANN models were superior to the linear models. The predictive potential of the best linear equation (stepwise regression model) was superior to that of the previously published CoMFA (Q² = 0.81, SDEP Test = 0.89) on the same data set (Ragno R. et al., J Med Chem 2006, 49, 3172-3184). Furthermore, the physicochemical parameter based models also supported the previous observations based on docking (Ragno R. et al.,
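The abstract above reports two different test-set statistics: R²pred and the squared correlation r² between observed and predicted activities. The following is a minimal Python sketch of how these two quantities are commonly computed, not the authors' code. It assumes the widely used definition R²pred = 1 − Σ(y_obs − y_pred)² / Σ(y_obs − ȳ_train)², which the abstract itself does not spell out. The function names and the toy numbers are hypothetical.

```python
def squared_pearson(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


def r2_pred(observed_test, predicted_test, train_mean):
    """External predictive R^2 (assumed definition, see lead-in).

    The denominator uses the mean activity of the TRAINING set, so a model
    is compared against "always predict the training mean".
    """
    press = sum((o - p) ** 2 for o, p in zip(observed_test, predicted_test))
    ss_ref = sum((o - train_mean) ** 2 for o in observed_test)
    return 1.0 - press / ss_ref


# Made-up test-set activities; the predictions carry a constant offset of +1.
obs = [1.0, 2.0, 3.0]
pred = [2.0, 3.0, 4.0]
print(squared_pearson(obs, pred))           # correlation ignores the offset
print(r2_pred(obs, pred, train_mean=2.0))   # R2pred penalizes the offset
```

The toy data illustrate why the two numbers can disagree: a systematic bias in the predictions leaves the correlation untouched but lowers R²pred.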

QSAR Studies of HEPT Derivatives Using Support Vector Machines

QSAR & Combinatorial Science, 2009

Human Immunodeficiency Virus type 1 reverse transcriptase is an important target for chemotherapeutic agents against AIDS. 1-[2-Hydroxyethoxy-methyl]-6-(phenylthio)thymine (HEPT) derivatives are potent non-nucleoside reverse transcriptase inhibitors. In the present work, the quantitative structure-activity relationship for a set of 79 HEPT derivatives has been investigated by means of support vector machines. The relationships between structure and activity were examined quantitatively using descriptors encoding the steric, hydrophobic, electronic and structural features of the 1-[2-hydroxyethoxy-methyl]-6-(phenylthio)thymine derivatives. The performance and predictive capability of the support vector machines method are investigated and compared with other methods such as artificial neural networks and multiple linear regression. The results indicate that the support vector machines model with the radial basis function kernel can be employed as a powerful tool for quantitative structure-activity relationship studies. The contribution of each descriptor to the structure-activity relationships was evaluated.
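For readers unfamiliar with the radial basis function kernel mentioned above, here is a minimal Python sketch of the standard form K(x, z) = exp(−γ‖x − z‖²). It only illustrates the kernel itself, not the authors' SVM model. The descriptor vectors and the value of `gamma` are made up for the example.

```python
import math


def rbf_kernel(x, z, gamma):
    """Gaussian (RBF) kernel between two descriptor vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def gram_matrix(samples, gamma):
    """Pairwise kernel values for a list of descriptor vectors."""
    return [[rbf_kernel(a, b, gamma) for b in samples] for a in samples]


# Hypothetical, already-scaled descriptor vectors for three compounds.
compounds = [[0.0, 0.0], [1.0, 1.0], [0.1, -0.1]]
for row in gram_matrix(compounds, gamma=0.5):
    print([round(value, 3) for value in row])
```

Similar compounds give kernel values near 1 and dissimilar ones values near 0. Because of this locality, an SVM with this kernel can capture non-linear structure-activity trends.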

Support vector machines: Development of QSAR models for predicting anti-HIV-1 activity of TIBO derivatives

European Journal of Medicinal Chemistry, 2010

The tetrahydroimidazo[4,5,1-jk][1,4]benzodiazepinone (TIBO) derivatives, as non-nucleoside reverse transcriptase inhibitors, occupy a significant place in the treatment of HIV infections. In the present paper, support vector machines (SVM) are used to develop quantitative relationships between the anti-HIV activity and four molecular descriptors of 82 TIBO derivatives. SVM gives good statistical results compared to those given by multiple linear regression and artificial neural networks. The contribution of each descriptor to the structure-activity relationships was evaluated; it indicates the importance of the hydrophobic parameter. The proposed method can be successfully used to predict the anti-HIV activity of TIBO derivatives with only four molecular descriptors, which can be calculated directly from molecular structure alone.

QSAR models for HEPT derivates as NNRTI inhibitors based on Monte Carlo method

European Journal of Medicinal Chemistry, 2014

A series of 107 1-[(2-hydroxyethoxy)-methyl]-6-(phenylthio)thymine (HEPT) derivatives with anti-HIV-1 activity as non-nucleoside reverse transcriptase inhibitors (NNRTIs) has been studied. The Monte Carlo method has been used as a tool to build quantitative structure-activity relationships (QSAR) for anti-HIV-1 activity. The QSAR models were calculated with the molecular structure represented by the simplified molecular input-line entry system and by the molecular graph. Three different splits into training and test sets were examined. The statistical quality of all built models is very good. The best calculated model had the following statistical parameters: r² = 0.8818, q² = 0.8774 for the training set and r² = 0.9360, q² = 0.9243 for the test set. Structural indicators (alerts) for increase and decrease of the IC50 are defined. Using the defined structural alerts, computer-aided design of new potential anti-HIV-1 HEPT derivatives is presented.

A novel simple QSAR model for the prediction of anti-HIV activity using multiple linear regression analysis

Molecular Diversity, 2006

A quantitative structure-activity relationship was obtained by applying Multiple Linear Regression Analysis to a series of 80 1-[2-hydroxyethoxy-methyl]-6-(phenylthio)thymine (HEPT) derivatives with significant anti-HIV activity. For the selection of the best among 37 different descriptors, the Elimination Selection Stepwise Regression Method (ES-SWR) was utilized. The resulting QSAR model (R²CV = 0.8160; S_PRESS = 0.5680) proved to be very accurate both in training and predictive stages.
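The cross-validated statistics quoted above can be illustrated with a small leave-one-out (LOO) sketch in plain Python. It is not the authors' ES-SWR procedure and performs no descriptor selection. It assumes the common definitions R²CV = 1 − PRESS / SS_total and S_PRESS = sqrt(PRESS / (n − k − 1)), where k is the number of descriptors; the abstract does not state which definitions were used. The function names and toy data are hypothetical.

```python
import math


def fit_mlr(X, y):
    """Ordinary least squares with intercept; returns [b0, b1, ..., bk]."""
    rows = [[1.0] + list(x) for x in X]  # prepend the intercept column
    p = len(rows[0])
    # Augmented normal equations (A^T A | A^T y).
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(p):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][p] / aug[i][i] for i in range(p)]


def predict(coef, x):
    """Predicted activity for one descriptor vector."""
    return coef[0] + sum(c * xi for c, xi in zip(coef[1:], x))


def loo_statistics(X, y):
    """Return (r2_cv, s_press) from leave-one-out cross-validation."""
    n, k = len(y), len(X[0])
    press = 0.0
    for i in range(n):
        # Refit the model without compound i, then predict compound i.
        coef = fit_mlr(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(coef, X[i])) ** 2
    y_mean = sum(y) / n
    ss_total = sum((yi - y_mean) ** 2 for yi in y)
    return 1.0 - press / ss_total, math.sqrt(press / (n - k - 1))


# Made-up single-descriptor data, roughly y = 2 + 3x with noise.
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [2.1, 4.9, 8.2, 10.8, 14.1, 17.0]
print(loo_statistics(X, y))
```

Each compound is predicted by a model that never saw it, so R²CV is normally lower than the fitted R² and is a more honest estimate of predictive ability.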

Predictive QSAR modeling of HIV reverse transcriptase inhibitor TIBO derivatives

European Journal of Medicinal Chemistry, 2009

Comparative quantitative structure-activity relationship (QSAR) studies have been carried out on tetrahydroimidazo[4,5,1-jk][1,4]benzodiazepine (TIBO) derivatives as reverse transcriptase inhibitors (n = 70) using topological, structural, physicochemical, electronic and spatial descriptors. The data set was divided into training and test sets using a cluster-based method. Linear models were developed using multiple regression (with stepwise regression, factor analysis and genetic function approximation (GFA) as variable selection tools), partial least squares (PLS), and a combination of factor analysis and partial least squares (FA-PLS). Genetic function approximation (spline) and artificial neural networks (ANN) were used for the development of non-linear models. Using topological and structural descriptors, the best equation was obtained from GFA (spline) based on internal validation (Q² = 0.737), but the model with the best external validation characteristics was obtained with FA-PLS (R²pred = 0.707). When structural, physicochemical, electronic and spatial descriptors were used, the best Q² value (0.740) was obtained from GFA (spline), whereas PLS provided the best R²pred value (0.784). When all descriptors were used in combination, the best R²pred value (0.760) and the best Q² value (0.800) were obtained from ANN and GFA (spline), respectively. The majority of the models satisfied the criteria of external validation recommended by Golbraikh and Tropsha (2002) and the criteria of modified r² (r²m) values of the test set for external validation as suggested by Roy and Roy (2008). In order to further validate selected models, an external set of 10 TIBO derivatives, which fall within the applicability domain of the models and are not shared with the compounds of the present data set, was taken from a different source, and the reverse transcriptase inhibitory activity of these compounds was predicted. Acceptable values of squared correlation coefficients between the observed and predicted values of the external set compounds were obtained from the selected models, suggesting true predictive potential of the models.
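The modified r² statistic mentioned above is usually written as r²m = r² · (1 − sqrt(|r² − r0²|)). Here r² is the squared correlation between observed and predicted test-set activities, and r0² is the analogous quantity for a regression line forced through the origin. The Python sketch below follows that formula. Regressing observed on predicted values is one common convention for r0² and is an assumption here; the abstract does not specify it. The function name and the toy numbers are hypothetical.

```python
import math


def modified_r2(observed, predicted):
    """Return (r2, r0_2, r2m) for a test set (assumed convention, see lead-in)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    r2 = cov * cov / (ss_o * ss_p)

    # Slope of the regression of observed on predicted with zero intercept.
    k = (sum(o * p for o, p in zip(observed, predicted))
         / sum(p * p for p in predicted))
    residual = sum((o - k * p) ** 2 for o, p in zip(observed, predicted))
    r0_2 = 1.0 - residual / ss_o

    r2m = r2 * (1.0 - math.sqrt(abs(r2 - r0_2)))
    return r2, r0_2, r2m


# Made-up activities: predictions are perfectly correlated but shifted by +1.
print(modified_r2([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))
```

In this toy case r² is perfect, yet r²m drops well below 1 because the predictions are systematically shifted. This is the kind of discrepancy the metric is meant to expose.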