HIVprotI: an integrated web based platform for prediction and design of HIV proteins inhibitors

MOESM1 of HIVprotI: an integrated web based platform for prediction and design of HIV proteins inhibitors

2018

Additional file 1. Supporting information including Table S1. Performance of QSAR predictive models on three times randomly picked ~ 10% independent/validation data. These models were developed using remaining ~ 90% data during training/testing respectively for each of the six datasets; Table S2. Details of statistical parameters used for the development of IC50 based QSAR models; Table S3. Details of statistical parameters used for the development of percent inhibition based QSAR models; Table S4. Details of chemical descriptors used in the development of IC50 based QSAR models; Table S5. Details of chemical descriptors used in the development of percent inhibition based QSAR models; Table S6. Details of slopes k (predicted vs. observed inhibition) and k' (observed vs. predicted inhibition) of the regression lines for the QSAR models; Table S7. Details of Y-randomization test performed on the QSAR models; Figure S1. Chemical space analysis of QSAR studies (Table 5) for Protease...
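The slopes k and k' listed for Table S6 can be illustrated with a regression line forced through the origin. The following is a minimal sketch, not the authors' procedure: the function name and the data values are hypothetical, and it assumes that "A vs. B" means A regressed on B.

```python
def slope_through_origin(x, y):
    """Least-squares slope of y = slope * x with the intercept fixed at zero."""
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least one non-zero value")
    return sum(xi * yi for xi, yi in zip(x, y)) / sxx


# Hypothetical observed and predicted activities (illustrative values only).
observed = [5.0, 6.0, 7.0, 8.0]
predicted = [5.2, 5.9, 7.1, 7.8]

# Assumption: "predicted vs. observed" regresses predicted on observed.
k = slope_through_origin(observed, predicted)
# Assumption: "observed vs. predicted" regresses observed on predicted.
k_prime = slope_through_origin(predicted, observed)

print(f"k = {k:.3f}, k' = {k_prime:.3f}")
```

Slopes close to 1 in both directions suggest that predictions are neither systematically stretched nor compressed relative to the observations.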

QSAR models for prediction study of HIV protease inhibitors using support vector machines, neural networks and multiple linear regression

Arabian Journal of Chemistry, 2017

Support vector machines (SVM) are among the most promising machine learning (ML) tools for developing predictive quantitative structure-activity relationship (QSAR) models from molecular descriptors. Multiple linear regression (MLR) and artificial neural networks (ANNs) were also used to construct linear and nonlinear quantitative models for comparison with the SVM results. The predictions agree well with the experimental values of HIV activity, and the results show that the SVM model outperforms the MLR and ANN models. The contribution of each descriptor to the structure-activity relationships was evaluated.

AVCpred: An integrated web server for prediction and design of antiviral compounds

Chemical biology & drug design, 2016

Viral infections constantly jeopardize global public health owing to the lack of effective antiviral therapeutics. Therefore, there is an imperative need to speed up the drug discovery process to identify novel and efficient drug candidates. In the present study, we have developed quantitative structure-activity relationship (QSAR) based models for predicting antiviral compounds (AVCs) against deadly viruses such as human immunodeficiency virus (HIV), hepatitis C virus (HCV), hepatitis B virus (HBV), human herpesvirus (HHV) and 26 others, using publicly available experimental data from the ChEMBL bioactivity database. Support vector machine (SVM) models achieved maximum Pearson correlation coefficients of 0.72, 0.74, 0.66, 0.68 and 0.71 in regression mode and maximum Matthews correlation coefficients of 0.91, 0.93, 0.70, 0.89 and 0.71 in classification mode, respectively, during 10-fold cross-validation. Furthermore, similar performance was observed on the independent validation sets. W...
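The two metrics quoted above follow their standard textbook definitions. The sketch below shows how each can be computed; the function names are hypothetical, binary labels are assumed to be coded as 0 and 1, and this is not the AVCpred implementation.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def matthews_cc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the coefficient is set to 0 when the denominator vanishes.
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

In a 10-fold cross-validation these would typically be evaluated on the held-out predictions, for example `pearson_r(observed, predicted)` in regression mode and `matthews_cc(labels, predicted_labels)` in classification mode.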

Support vector machines: Development of QSAR models for predicting anti-HIV-1 activity of TIBO derivatives

European Journal of Medicinal Chemistry, 2010

The tetrahydroimidazo[4,5,1-jk][1,4]benzodiazepinone (TIBO) derivatives, as non-nucleoside reverse transcriptase inhibitors, hold a significant place in the treatment of HIV infections. In the present paper, support vector machines (SVM) are used to develop quantitative relationships between the anti-HIV activity and four molecular descriptors of 82 TIBO derivatives. The SVM model gives good statistical results compared with those given by multiple linear regression and artificial neural networks. The contribution of each descriptor to the structure-activity relationships was evaluated, and it indicates the importance of the hydrophobic parameter. The proposed method can be used to predict the anti-HIV activity of TIBO derivatives with only four molecular descriptors, all of which can be calculated directly from molecular structure alone.

3D-QSAR and SVM Prediction of BRAF-V600E and HIV Integrase Inhibitors: A Comparative Study and Characterization of Performance with a New Expected Prediction Performance Metric

The results of directly comparing the prediction accuracy of optimized 3D Quantitative Structure-Activity Relationship (3D-QSAR) models and linear Support Vector Machine (SVM) classifiers to identify small molecule inhibitors of the BRAF-V600E and HIV Integrase targets are reported. Performance comparisons were carried out using 303 compounds (68 active) against BRAF-V600E and 204 compounds (159 active) against HIV Integrase. SVM prediction accuracies of 95% (BRAF-V600E) and 100% (HIV Integrase) and 3D-QSAR prediction accuracies of 76% (BRAF-V600E) and 82% (HIV Integrase) were observed. To help explain the better performance of SVM in this comparison, and to help assess whether an SVM or a 3D-QSAR model is likely to perform best for other target ligands of interest, a new Expected Predictive Performance (EPP) metric is introduced. How EPP can be used to help predict future performance of SVM and 3D-QSAR models by quantifying the degree of similarity between candidate compounds and training data is also demonstrated. Results show that the EPP metric is capable of predicting the future prediction accuracy of SVM and 3D-QSAR models within 7% of actual performance.

An automated workflow by using KNIME Analytical Platform: a case study for modelling and predicting HIV-1 protease inhibitors

Progress in Drug Discovery & Biomedical Science

In this study, we have demonstrated an automated workflow using the KNIME Analytical Platform for modelling and predicting potential HIV-1 protease (HIVP) inhibitors. The workflow has been simplified into three steps: 1) retrieve the database of inhibitors for the target disease from the ChEMBL website and well-known drugs from the DrugBank database, 2) generate the descriptors, and 3) select the optimal number of features after training the machine learning models. Our results indicate that the random forest with the auto prediction validation method is the most reliable, with the best R2 value of 0.9394. This workflow can be adapted easily to other diseases, and the quantitative structure-activity relationship (QSAR) model that has been developed can accurately predict in silico how chemical modifications might influence biological behaviour. Overall, the automated workflow which has been presented in this study may significantly reduce the time, cost and efforts nee...

Significantly Improved HIV Inhibitor Efficacy Prediction Employing Proteochemometric Models Generated From Antivirogram Data

Infection with HIV cannot currently be cured; however, it can be controlled by combination treatment with multiple antiretroviral drugs. Given the different viral genotypes of virtually each individual patient, the question arises of which drug combination to use to achieve effective treatment. With the availability of viral genotypic data and clinical phenotypic data, it has become possible to create computational models able to predict an optimal treatment regimen for an individual patient. Current models are based only on sequence data derived from viral genotyping; chemical similarity of drugs is not considered. To explore the added value of including chemical similarity, we applied proteochemometric (PCM) models, which combine chemical and protein target properties in a single bioactivity model. Our dataset was a large-scale clinical database of genotypic and phenotypic information (in total ca. 300,000 drug-mutant bioactivity data points; 4 (NNRTI), 8 (NRTI) or 9 (PI) drugs; and 10,700 (NNRTI), 10,500 (NRTI) or 27,000 (PI) mutants). Our models achieved a prediction error below 0.5 Log Fold Change. Moreover, when directly compared with previously published models derived from sequence data, PCM performed better in resistance classification and in prediction of Log Fold Change (0.76 log units versus 0.91). Furthermore, we were able both to confirm known resistance-conferring mutations and to identify previously unpublished ones in HIV Reverse Transcriptase (e.g. K102Y, T216M) and HIV Protease (e.g. Q18N, N88G) from our dataset. Finally, we applied our models prospectively to the public HIV resistance database from Stanford University, obtaining a correct resistance prediction rate of 84% on the full set (compared to 80% in previous work on a high quality subset). We conclude that proteochemometric models are able to accurately predict phenotypic resistance based on genotypic data, even for novel mutants and mixtures. Furthermore, we add an applicability domain to the prediction, informing the user about the reliability of predictions.
Citation: van Westen GJP, Hendriks A, Wegner JK, IJzerman AP, van Vlijmen HWT, et al. (2013) Significantly Improved HIV Inhibitor Efficacy Prediction Employing Proteochemometric Models Generated From Antivirogram Data. PLoS Comput Biol 9(2): e1002899.
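To make the Log Fold Change unit concrete, here is a minimal sketch. It rests on assumptions that the abstract does not state: fold change is taken as the ratio of mutant IC50 to wild-type IC50, the logarithm is base 10, and the prediction error is a root-mean-square error. All function names are hypothetical.

```python
import math


def log_fold_change(ic50_mutant, ic50_wild_type):
    """log10 of the mutant / wild-type IC50 ratio (assumed definition)."""
    if ic50_mutant <= 0 or ic50_wild_type <= 0:
        raise ValueError("IC50 values must be positive")
    return math.log10(ic50_mutant / ic50_wild_type)


def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))
```

Under these assumptions, a mutant that needs ten times the wild-type drug concentration has a Log Fold Change of 1, so an error below 0.5 corresponds to roughly a threefold uncertainty in the fold change.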

Hybrid-genetic algorithm based descriptor optimization and QSAR models for predicting the biological activity of Tipranavir analogs for HIV protease inhibition

Journal of Molecular Graphics and Modelling, 2010

The prediction of the biological activity of a chemical compound from its structural features plays an important role in drug design. In this paper, we discuss the quantitative structure-activity relationship (QSAR) prediction models developed on a dataset of 170 HIV protease enzyme inhibitors. Various chemical descriptors that encode hydrophobic, topological, geometrical and electronic properties are calculated to represent the structures of the molecules in the dataset. We use the hybrid-GA (genetic algorithm) optimization technique for descriptor space reduction. Multiple linear regression (MLR), correlation-based feature selection (CFS), non-linear decision tree (DT), and artificial neural network (ANN) approaches are used as fitness functions. The selected descriptors represent the overall descriptor space and account well for the binding nature of the considered dataset. These selected features are also human interpretable and can be used to explain the interactions between a drug molecule and its receptor protein (HIV protease). The selected descriptors are then used to develop the QSAR prediction models with the MLR, DT and ANN approaches. These models are discussed, analyzed and compared to validate and test their performance on this dataset. All three approaches yield QSAR models with good prediction performance. The models developed by DT and ANN are comparable and predict better than the MLR model. For the ANN model, weight analysis is carried out to analyze the role of various descriptors in activity prediction. All the prediction models point towards the involvement of hydrophobic interactions. These models can be useful for predicting the biological activity of new untested HIV protease inhibitors and for virtual screening to identify new lead compounds.
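Descriptor space reduction with a genetic algorithm can be sketched as a search over binary masks, where each bit marks a descriptor as selected or dropped. The example below is a plain GA, not the paper's hybrid-GA, and its fitness is a toy stand-in rather than an MLR, CFS, DT or ANN score. Every name and parameter value is hypothetical.

```python
import random


def run_ga(n_descriptors, fitness, pop_size=20, generations=30,
           mutation_rate=0.05, seed=0):
    """Plain GA over binary masks; returns (best_mask, best_fitness).

    Requires n_descriptors >= 2 and pop_size >= 2.
    """
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_descriptors)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    for _ in range(generations):
        def tournament():
            # Pick two distinct individuals and keep the fitter one.
            a, b = rng.sample(population, 2)
            return a if fitness(a) >= fitness(b) else b

        children = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randint(1, n_descriptors - 1)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
        best = max(population, key=fitness)
    return best, fitness(best)


# Toy fitness: reward three "informative" descriptors, penalize mask size.
INFORMATIVE = {0, 3, 7}  # hypothetical indices, for illustration only


def toy_fitness(mask):
    hits = sum(1 for i in INFORMATIVE if mask[i])
    return hits - 0.1 * sum(mask)


best_mask, best_score = run_ga(10, toy_fitness)
print(best_mask, round(best_score, 2))
```

In a real QSAR setting, `toy_fitness` would be replaced by a model-based score, such as a cross-validated regression statistic computed on the selected descriptor columns.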

QSAR prediction of HIV-1 protease inhibitory activities using docking derived molecular descriptors

Journal of theoretical biology, 2015

In this study, the application of a new hybrid docking-quantitative structure-activity relationship (QSAR) methodology to model and predict the HIV-1 protease inhibitory activities of a series of newly synthesized chemicals is reported. This hybrid docking-QSAR approach can provide valuable information about the most important chemical and structural features of the ligands that affect their inhibitory activities. Docking studies were used to find the actual conformations of the chemicals in the active site of HIV-1 protease. The molecular descriptors were then calculated from these conformations. Multiple linear regression (MLR) and least squares support vector machine (LS-SVM) were used as QSAR models. The results show that the statistical parameters of the LS-SVM model are better than those of the MLR model, which indicates that there are some non-linear relations between the selected molecular descriptors and the anti-HIV activities of the chemicals of interest. The correlation coefficient (R)...

Application of support vector machines for prediction of anti-HIV activity of TIBO Derivatives

Chemistry and Materials Research, 2013

The performance and predictive power of support vector machines (SVM) for regression problems in quantitative structure-activity relationships were investigated. The SVM results are superior to those obtained by artificial neural networks and multiple linear regression. These results indicate that the SVM model with the radial basis function kernel can be used as an alternative tool for regression problems in quantitative structure-activity relationships.
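For readers unfamiliar with the radial basis function kernel, the sketch below shows its standard form and how a trained support vector regression model uses it to predict. The support vectors, coefficients, bias and gamma are hypothetical placeholders, not values from the paper.

```python
import math


def rbf_kernel(x, z, gamma=1.0):
    """RBF kernel: exp(-gamma * squared Euclidean distance between x and z)."""
    if len(x) != len(z):
        raise ValueError("vectors must have the same length")
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def svr_predict(x, support_vectors, dual_coefs, bias, gamma=1.0):
    """SVR decision function: weighted sum of kernel values plus a bias."""
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, dual_coefs)) + bias


# Hypothetical model with two support vectors in a two-descriptor space.
support_vectors = [[0.0, 0.0], [1.0, 1.0]]
dual_coefs = [0.5, -0.25]
print(svr_predict([0.0, 0.0], support_vectors, dual_coefs, bias=6.0))
```

The kernel equals 1 for identical descriptor vectors and decays towards 0 as they move apart, with `gamma` controlling how quickly the influence of each support vector falls off.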