In-silico Discovery and Simulated Selection of Multi-target Anti-HIV-1 Inhibitors

A quantitative structure–activity relationship study on HIV-1 integrase inhibitors using genetic algorithm, artificial neural networks and different statistical methods

Arabian Journal of Chemistry, 2016

In this work, a quantitative structure-activity relationship (QSAR) study was carried out on tricyclic phthalimide analogues acting as HIV-1 integrase inhibitors. Forty compounds were used in this study. A genetic algorithm (GA), an artificial neural network (ANN) and multiple linear regression (MLR) were used to construct non-linear and linear QSAR models, and the GA-ANN model proved much better than the other models. For this purpose, ab initio geometry optimization was performed at the B3LYP level with the 6-31G(d) basis set. HyperChem, ChemOffice and Gaussian 98W software were used for geometry optimization of the molecules and calculation of the quantum chemical descriptors. To include some of the correlation energy, the calculation was done with density functional theory (DFT) with the same basis set and Becke's three-parameter hybrid functional using the LYP correlation functional (B3LYP/6-31G(d)). For the calculations in the solution phase, the polarized continuum model (PCM) was used; optimizations at the gas-phase B3LYP/6-31G(d) level were also included for comparison. In the aqueous phase, the root-mean-square errors of the training set and the test set for the GA-ANN model using the jack-knife method were 0.1409 and 0.1804, respectively. In the gas phase, the

QSAR-based drug designing studies on HIV-1 integrase inhibitors

Network Modeling Analysis in Health Informatics and Bioinformatics, 2016

In this study, QSAR modeling was performed to predict the IC50 values of a set of HIV-1 integrase inhibitors using multiple regression and partial least squares, yielding an optimized model for each method. These models were used to predict the activity of a set of test compounds obtained by a chemical similarity search of the training set against the NCBI PubChem database and filtered with the Lipinski rule of five. The test set compounds, with their predicted IC50 values, were further analyzed by molecular docking simulation against HIV-1 integrase, which revealed that the test set compounds have a higher binding affinity than the training set compounds and the market-approved drug raltegravir. The stability of the docked protein-ligand complexes was further validated by molecular dynamics simulations for 20 ns using Gromacs 5.0, and the backbone RMSD was analyzed. Last, ADME-toxicity analysis was carried out for the top docking hit compounds and raltegravir, revealing that the docked compounds have better pharmacological parameters than raltegravir.
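The Lipinski rule-of-five filter mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `Compound` class, the function names and the tolerance of one violation are assumptions, and the descriptor values are expected to come from an external tool.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    """Hypothetical container for precomputed descriptors."""
    name: str
    mol_weight: float   # molecular weight in Da
    logp: float         # octanol-water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors


def lipinski_violations(c: Compound) -> int:
    """Count how many of the four rule-of-five criteria are violated."""
    return sum([
        c.mol_weight > 500,
        c.logp > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ])


def passes_rule_of_five(c: Compound, max_violations: int = 1) -> bool:
    """Assumption: a compound passes with at most one violation.

    Set max_violations=0 for a strict filter.
    """
    return lipinski_violations(c) <= max_violations
```

In a screening workflow, a filter of this kind would be applied to the similarity-search hits before any docking is attempted, since it is far cheaper than a docking run.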

Linear and non-linear quantitative structure-activity relationship models on indole substitution patterns as inhibitors of HIV-1 attachment

Antiviral drugs that inhibit entry of the human immunodeficiency virus (HIV) into target cells are already in different phases of clinical trials. They prevent viral entry and have a highly specific mechanism of action with a low toxicity profile. Few QSAR studies have been performed on this group of inhibitors. This study was performed to develop a quantitative structure-activity relationship (QSAR) model of the biological activity of indole glyoxamide derivatives as inhibitors of the interaction between HIV glycoprotein gp120 and host cell CD4 receptors. Forty different indole glyoxamide derivatives were selected as a sample set and geometrically optimized using Gaussian 98W. Different combinations of multiple linear regression (MLR), genetic algorithms (GA) and artificial neural networks (ANN) were then used to construct the QSAR models. These models were also used to select the most efficient subsets of descriptors in a cross-validation procedure for non-linear log(1/EC50) prediction. The results obtained using GA-ANN were compared with those of the MLR-MLR and MLR-ANN models. A high predictive ability was observed for the MLR, MLR-ANN and GA-ANN models, with root-mean-square errors (RMSE) of 0.99, 0.91 and 0.67, respectively (N = 40). In summary, machine learning methods were highly effective in designing QSAR models when compared with statistical methods.
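The RMSE values quoted above compare observed and predicted log(1/EC50) values; a lower RMSE means a closer fit. A minimal sketch of the metric, assuming the standard definition (the function name is illustrative and the abstract does not give the authors' code):

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted activities."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

Because the metric is expressed in the units of the modeled activity, the three reported values can be compared directly only because all models were fitted to the same 40 compounds.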

Development of QSAR models for predicting anti-HIV-1 activity using the Monte Carlo method

Central European Journal of Chemistry, 2013

The CORAL software (http://www.insilico.eu/coral/) has been examined as a tool for modeling anti-HIV-1 activity by quantitative structure-activity relationships (QSAR) for three different sets: (i) TIBO derivatives (n=82), (ii) anti-HIV-1 activity of 2-amino-6-arylsulfonylbenzonitriles and their congeners (n=64), and (iii) the measured binding affinity for fullerene-based HIV-1 PR inhibitors (n=48). A new global invariant of the molecular structure, ATOMPAIR, which can be calculated from the simplified molecular input line entry system (SMILES), was studied. ATOMPAIR is an indicator of the joint presence of pairs of chemical elements (F, Cl, Br, N, O, S, and P) and three types of bonds (double covalent bond, triple covalent bond, and stereochemical bond). Six random splits into sub-training, calibration, and test sets were examined for each set. For the three aforementioned sets, the use of ATOMPAIR in the modeling process improves the predictive potential of the models for six ra...

Hybrid-genetic algorithm based descriptor optimization and QSAR models for predicting the biological activity of Tipranavir analogs for HIV protease inhibition

Journal of Molecular Graphics and Modelling, 2010

The prediction of biological activity of a chemical compound from its structural features plays an important role in drug design. In this paper, we discuss the quantitative structure activity relationship (QSAR) prediction models developed on a dataset of 170 HIV protease enzyme inhibitors. Various chemical descriptors that encode hydrophobic, topological, geometrical and electronic properties are calculated to represent the structures of the molecules in the dataset. We use the hybrid-GA (genetic algorithm) optimization technique for descriptor space reduction. The linear multiple regression analysis (MLR), correlation-based feature selection (CFS), non-linear decision tree (DT), and artificial neural network (ANN) approaches are used as fitness functions. The selected descriptors represent the overall descriptor space and account well for the binding nature of the considered dataset. These selected features are also human interpretable and can be used to explain the interactions between a drug molecule and its receptor protein (HIV protease). The selected descriptors are then used for developing the QSAR prediction models by using the MLR, DT and ANN approaches. These models are discussed, analyzed and compared to validate and test their performance for this dataset. All three approaches yield the QSAR models with good prediction performance. The models developed by DT and ANN are comparable and have better prediction than the MLR model. For ANN model, weight analysis is carried out to analyze the role of various descriptors in activity prediction. All the prediction models point towards the involvement of hydrophobic interactions. These models can be useful for predicting the biological activity of new untested HIV protease inhibitors and virtual screening for identifying new lead compounds.

Quantitative structure and activity relationship modeling study of anti-HIV-1 RT inhibitors: Genetic function approximation and density function theory methods

Journal of Computational Methods in Molecular Design, 2015

In the present work, quantitative structure-activity relationship studies were performed to explore the structural and physicochemical requirements of 1-[(2-hydroxyethoxy)methyl]-6-(phenylthio)thymine (HEPT) derivatives for anti-HIV activity. QSAR models were developed using steric, electronic and thermodynamic descriptors. Genetic function approximation combined with multiple linear regression (GFA-MLR) was applied, with GFA as the data preprocessing step, to identify the structural and physicochemical requirements for anti-HIV activity. The generated equations were statistically validated using the leave-one-out technique, and the best models were also subjected to leave-5-out cross-validation. The quality of fit and predictive ability of the equations obtained from GFA-MLR are within an acceptable statistical range (explained variance of 91.74% and predicted variance of 74.14%). The robustness of the best models was checked by a Y-randomization test, and they were identified as good predictive models. The coefficients of ALogP, ATSm5 and CrippenLogP show that the activity increases with increasing ALogP, ATSm5 and CrippenLogP of the molecules. The coefficients of C2SP3, VPC-4, SsI, ETA_AlphaP, ETA_Epsilon_1, nAtomP, Petitjean Number and Wlambda2.unity show that the activity decreases with increasing volume, and that Wlambda1.polar of the molecules is detrimental to activity. The information generated from the present study may be useful in the design of more potent HEPT derivatives as anti-HIV agents.
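The leave-one-out validation described above can be illustrated with a small sketch. It assumes a single-descriptor least-squares model rather than the multi-descriptor GFA-MLR equations of the study, and it uses one common definition of the cross-validated Q², with the overall mean of the observed activities in the denominator. All function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("descriptor values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_q2(x, y):
    """Leave-one-out Q^2 = 1 - PRESS / SS_total."""
    x, y = list(x), list(y)
    press = 0.0
    for i in range(len(x)):
        # Refit the model without compound i, then predict compound i.
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    mean_y = sum(y) / len(y)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - press / ss_total
```

Q² is lower than the fitted R² whenever the model does not reproduce the data exactly, which is consistent with the gap between explained variance (91.74%) and predicted variance (74.14%) reported in the abstract.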

Development of linear and nonlinear predictive QSAR models and their external validation using molecular similarity principle for anti-HIV indolyl aryl sulfones

Journal of Enzyme Inhibition and Medicinal Chemistry, 2008

Quantitative structure-activity relationship (QSAR) studies have been carried out on indolyl aryl sulfones, a class of novel HIV-1 non-nucleoside reverse transcriptase inhibitors, using physicochemical, topological and structural parameters along with appropriate indicator variables. The statistical tools used were linear methods (e.g., stepwise regression analysis, partial least squares (PLS), factor analysis followed by multiple regression (FA-MLR), genetic function approximation combined with multiple linear regression (GFA-MLR) and GFA followed by PLS or G/PLS) and a nonlinear method (artificial neural network or ANN). In the case of physicochemical parameters, GFA-MLR generated the best equation (n = 97, R² = 0.862, Q² = 0.821). Using topological parameters, the best equation (based on leave-one-out Q²) was obtained with the stepwise regression technique (n = 97, R² = 0.867, Q² = 0.811). When topological and physicochemical parameters were used in combination, statistical quality increased to a great extent (n = 97, R² = 0.891, Q² = 0.849 from stepwise regression). Furthermore, the whole dataset was divided into test (25% of the whole dataset) and training (remaining 75%) sets. Models were developed based on the training set, and the predictive potential of such models was checked on the test set. The selection of the training set was based on K-means clustering of the standardized descriptors (topological and physicochemical). In this case also the best results were obtained with stepwise regression (n = 72, R² = 0.906, Q² = 0.853), but the external predictive capacity of this model (R²pred = 0.738) was inferior to that of the model developed with the GFA-MLR technique (R² = 0.883, Q² = 0.823, R²pred = 0.760). However, the squared regression coefficient between observed and predicted activity values of the test set compounds for the best linear model, i.e., GFA-MLR (r² = 0.736), was lower than that of the best nonlinear model developed using an artificial neural network (r² = 0.781). Thus, based on external validation, the ANN models were superior to the linear models. The predictive potential of the best linear equation (stepwise regression model) was superior to that of the previously published CoMFA (Q² = 0.81, SDEP Test = 0.89) on the same data set (Ragno R. et al., J Med Chem 2006, 49, 3172-3184). Furthermore, the physicochemical parameter based models also supported the previous observations based on docking (Ragno R. et al.,
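The external predictive capacity reported above as R²pred can be sketched with a commonly used definition, in which test-set errors are compared against deviations from the training-set mean. The abstract does not state the exact formula the authors used, so the definition and the function name below are assumptions.

```python
def r2_pred(test_observed, test_predicted, train_mean):
    """External R^2_pred = 1 - sum((obs - pred)^2) / sum((obs - train_mean)^2).

    train_mean is the mean observed activity of the training set.
    """
    if len(test_observed) != len(test_predicted) or len(test_observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    press = sum((o - p) ** 2 for o, p in zip(test_observed, test_predicted))
    ss_ref = sum((o - train_mean) ** 2 for o in test_observed)
    if ss_ref == 0:
        raise ValueError("test activities all equal the training mean")
    return 1.0 - press / ss_ref
```

Under this definition, a model that always predicts the training-set mean scores 0, so a positive value indicates that the descriptors carry predictive information for compounds outside the training set.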

Molecular modeling studies of HIV-1 non-nucleoside reverse transcriptase inhibitors using 3D-QSAR, virtual screening and docking simulations

Journal of the Serbian Chemical Society, 2019

Acquired immunodeficiency syndrome (AIDS) is a significant human health threat around the world, and the study of anti-HIV drug design has therefore become an important task for today's society. In this paper, a three-dimensional quantitative structure-activity relationship (3D-QSAR) study was conducted on 72 HIV-1 non-nucleoside reverse transcriptase inhibitors (NNRTIs) using Topomer comparative molecular field analysis (Topomer CoMFA). The multiple correlation coefficients of fitting, cross-validation, and external validation were 0.899, 0.788 and 0.942, respectively. The results indicated that the obtained model had both favorable estimation stability and good prediction capability. Topomer Search was used to search for appropriate R groups in the ZINC database. In this way, 14 new compounds were designed, and 12 of them were predicted to be more active than the template molecule. These results strongly suggest that Topomer Search was effective in screening and could be a useful guide in the design of new HIV-1 drugs. The ligands of the template molecule and the newly designed compounds were used for molecular docking to study the interaction of these compounds with the protein receptor. The results showed that the ligands would generally form hydrogen-bonding interactions with the residues Ala28, Asp29, Gly49 and Ile50 of the protein receptor, thereby providing additional insights for the design of even more effective drugs.

A ligand-based approach for the in silico discovery of multi-target inhibitors for proteins associated with HIV infection

Molecular BioSystems, 2012

Acquired immunodeficiency syndrome (AIDS) is a dangerous disease, which damages the immune system cells to the point that the immune system can no longer fight against other infections that it would usually be able to prevent. The causal agent is the human immunodeficiency virus (HIV), and for this reason, the search for more effective chemotherapies against HIV is a challenge for the scientific community. Chemoinformatics and Quantitative Structure-Activity Relationship (QSAR) studies have played an essential role in the design of potent inhibitors for proteins associated with the HIV infection. However, all previous studies took into consideration the discovery of future drug candidates using homogeneous series of compounds against only one protein. This fact limits the use of more efficient anti-HIV chemotherapies. In this work, we develop the first ligand-based approach for the in silico design of multi-target (mt) inhibitors for seven key proteins associated with the HIV infection. Two mt-QSAR models were constructed from a large and heterogeneous database of compounds. The first model was based on linear discriminant analysis (mt-QSAR-LDA) employing fragment-based descriptors. The second model was obtained using artificial neural networks (mt-QSAR-ANN) with global 2D descriptors. Both models correctly classified more than 90% of active and inactive compounds in training and prediction sets. Some fragments were extracted and their contributions to anti-HIV activity through inhibition of the different proteins were calculated using the mt-QSAR-LDA model. New molecules designed from fragments with positive contributions were suggested and correctly predicted by the two models as possible potent and versatile anti-HIV agents.

Introducing Catastrophe-QSAR. Application on Modeling Molecular Mechanisms of Pyridinone Derivative-Type HIV Non-Nucleoside Reverse Transcriptase Inhibitors

International Journal of Molecular Sciences, 2011

The classical method of quantitative structure-activity relationships (QSAR) is enriched using non-linear models, as Thom's polynomials allow either uni- or bi-variate structural parameters. In this context, catastrophe QSAR algorithms are applied to the anti-HIV-1 activity of pyridinone derivatives. This requires calculation of the so-called relative statistical power and of its minimum principle in various QSAR models. A new index, known as the statistical relative power, is constructed as a Euclidean measure for the combined ratio of the Pearson correlation to the algebraic correlation, with normalized Student's t and Fisher tests. First and second order inter-model paths are considered for mono-variate catastrophes, whereas for bi-variate catastrophes the direct minimum path is provided, allowing the QSAR models to be tested for predictive purposes. At this stage, the max-to-min hierarchies of the tested models allow the interaction mechanism to be identified using structural parameter succession and the typical catastrophes involved. Minimized differences between these catastrophe models in the common structurally influential domains that span both the trial and tested compounds identify the "optimal molecular structural domains" and the molecules with the best output with respect to the modeled activity, which in this case is human immunodeficiency virus type 1 HIV-1