Mehdi Ghorbanzadeh | Umeå University

Papers by Mehdi Ghorbanzadeh

Journal of Chemometrics, 2016

The identification of industrial chemicals that may cause developmental effects is of great importance for the early detection of hazardous chemicals. Accordingly, categorical quantitative structure-activity relationship (QSAR) models were developed, based on developmental toxicity profile data for zebrafish from the ToxCast Phase I testing, to predict the toxicity of a large set of high and low production volume chemicals (H/LPVCs). QSARs were created using linear (LDA), quadratic, and partial least squares discriminant analysis with different chemical descriptors. The predictions of the best model (LDA) were compared with those obtained by the freely available QSAR model VEGA, which was built on a dataset with a different chemical domain. The results showed that, despite the similar accuracy (AC) of the two models, the LDA model is more specific than VEGA and shows better agreement between sensitivity (SE) and specificity (SP). Applying a 90% confidence level to the LDA model led to even better predictions, with SE of 0.92, AC of 0.95, and a geometric mean of SE and SP (G) of 0.96 for the prediction set. The LDA model predicted 608 H/LPVCs as toxicants, of which 123 fall inside the applicability domain (AD) of the VEGA model, which predicted 112 of those as toxicants. Among the 112 chemicals predicted as toxic H/LPVCs, 23 have been previously reported as developmental toxicants. The LDA model presented here could be used to identify and prioritize H/LPVCs for subsequent developmental toxicity assessment, as a screening tool for potential developmental effects of new chemicals, and to guide the synthesis of safer alternative chemicals.
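The abstract above reports sensitivity (SE), specificity (SP), accuracy (AC), and the geometric mean of SE and SP (G). As a minimal sketch of how these four figures relate to a binary confusion matrix, the following uses the usual textbook definitions; the function name and the counts are hypothetical and are not taken from the paper.

```python
import math


def classification_metrics(tp, fn, tn, fp):
    """Compute SE, SP, AC and G from binary confusion-matrix counts.

    tp/fn: toxic chemicals predicted toxic / non-toxic
    tn/fp: non-toxic chemicals predicted non-toxic / toxic
    """
    se = tp / (tp + fn)                    # sensitivity: fraction of toxicants found
    sp = tn / (tn + fp)                    # specificity: fraction of non-toxicants found
    ac = (tp + tn) / (tp + fn + tn + fp)   # overall accuracy
    g = math.sqrt(se * sp)                 # geometric mean of SE and SP
    return {"SE": se, "SP": sp, "AC": ac, "G": g}


# Hypothetical counts, chosen only for illustration.
metrics = classification_metrics(tp=40, fn=10, tn=45, fp=5)
for name, value in metrics.items():
    print(f"{name} = {value:.3f}")
```

Because G is a geometric mean, it drops sharply when either SE or SP is low, which is why it is a useful companion to accuracy on imbalanced data sets.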

Molecular Diversity, 2009

Quantitative structure-property relationship models for the prediction of the nematic transition temperature (T_N) were developed using multilinear regression analysis and a feedforward artificial neural network (ANN). A collection of 42 thermotropic liquid crystals was chosen as the data set. The data set was divided into three sets: a training set, an internal test set, and an external test set. The training and internal test sets were used for ANN model development, and the external test set was used to evaluate the predictive power of the model. To build the models, a set of six descriptors was selected by the best multilinear regression procedure of the CODESSA program. These descriptors were: atomic charge weighted partial negatively charged surface area, relative negative charged surface area, polarity parameter/square distance, minimum most negative atomic partial charge, molecular volume, and the A component of the moment of inertia, which encode geometrical and electronic characteristics of molecules. These descriptors were used as inputs to the ANN. The optimized ANN model had a 6:6:1 topology. The standard errors in the calculation of T_N for the training, internal, and external test sets using the ANN model were 1.012, 4.910, and 4.070, respectively. To further evaluate the ANN model, a cross-validation test was performed, which produced the statistic Q² = 0.9796 and a standard deviation of 2.67 based on the predicted residual sum of squares. A diversity test was also performed to confirm the model's stability and predictive capability. The results reveal the suitability of ANN for the prediction of T_N for liquid crystals using molecular structural descriptors.

Talanta, 2011

A set of 69 drug-like compounds with corneal permeability was studied using quantitative and qualitative modeling techniques. Multiple linear regression (MLR) and a multilayer perceptron neural network (MLP-NN) were used to develop quantitative relationships between the corneal permeability and seven molecular descriptors selected by stepwise MLR and sensitivity analysis methods. To evaluate the models, a leave-many-out cross-validation test was performed, which produced the statistics Q² = 0.584 and SPRESS = 0.378 for MLR and Q² = 0.774 and SPRESS = 0.087 for MLP-NN. The results revealed the suitability of MLP-NN for the prediction of corneal permeability. The contribution of each descriptor to the MLP-NN model was evaluated and indicated the importance of the molecular volume and weight. The pattern recognition methods principal component analysis (PCA) and hierarchical clustering analysis (HCA) were employed to investigate possible qualitative relationships between the molecular descriptors and the corneal permeability. The PCA and HCA results showed that the data set contains two groups. The same descriptors used in quantitative modeling were then used as inputs of a counter propagation neural network (CPNN) to classify the compounds into low permeable (LP) and very low permeable (VLP) categories in a supervised manner. The overall classification non-error rate was 95.7% and 95.4% for the training and prediction test sets, respectively. The results revealed the ability of CPNN to correctly recognize the compounds belonging to each category. The proposed models can be used to predict corneal permeability values and to classify compounds as LP or VLP.

European Journal of Medicinal Chemistry, 2010

Drugs were classified according to their milk/plasma concentration ratio (M/P) using a counter propagation artificial neural network (CP-ANN). The features of each drug were encoded by linear free energy relationship (LFER) parameters. These descriptors were used as inputs for developing linear discriminant analysis, quadratic discriminant analysis, least squares support vector machine, and CP-ANN models to classify the potential risk of 154 drugs for lactating women as high risk (M/P > 1) or low risk (M/P < 1). The classification accuracy for the training, internal test, and external test sets was 100.00%, 100.00%, and 90.00%, respectively, for the CP-ANN model, which was the best model. The results revealed the applicability of CP-ANN to the classification of drugs based on their M/P values using LFER parameters.

Chemometrics and Intelligent Laboratory Systems, 2012

Linear and quadratic discriminant analysis and least squares support vector machine (LS-SVM) were used to classify a data set of 326 central nervous system (CNS) drugs as active or inactive CNS agents according to their permeation through the blood-brain barrier. A pool of descriptors was calculated with the DRAGON software, and nine of them were selected based on Wilks' lambda and classification accuracy and used to classify the drugs in the data set. The classification models were validated based on accuracy, sensitivity, specificity, Matthews correlation coefficient, and Cohen's kappa values. The developed LS-SVM model, the superior model, has an accuracy of 96.5% and 96.0%, a Matthews correlation coefficient of 0.930 and 0.920, a Cohen's kappa value of 0.963 and 0.917, and an area under the receiver operating characteristic curve of 0.95 and 0.98 for the training and test sets, respectively. The results of this study indicated the applicability of LS-SVM to the classification of CNS drugs based on their structural descriptors.

Bulletin of the Chemical Society of Japan, 2010

In this work, the aqueous solubilities of 145 drug-like compounds were predicted from their theoretically derived molecular descriptors. The descriptors selected by stepwise multiple subset selection methods are: 1st-order solvation connectivity index, average span R, overall hydrogen bond basicity, and percent of hydrophilic surface area. These descriptors encode molecular features that affect dispersion, hydrophobic, and steric interactions between solute and solvent molecules. To develop quantitative structure-activity relationship (QSAR) models, multiple linear regression, least-squares support vector machine, and artificial neural network (ANN) methods were used with the selected descriptors as inputs. The statistical parameters of these models revealed that the ANN model was superior to the other methods. The standard error (SE), average error (AE), and average absolute error (AAE) for the ANN model are SE = 0.714, AE = −0.178, and AAE = 0.546; for the internal test set, SE = 0.830, AE = −0.056, and AAE = 0.630; and for the external test set, SE = 0.762, AE = −0.431, and AAE = 0.626. Moreover, a leave-many-out cross-validation test was used to further investigate the predictive power and robustness of the model. It led to R²(L10O) = 0.816 and SPRESS = 0.32 for the ANN model, which revealed the reliability of this model.
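The three error statistics quoted above can be illustrated with a short sketch. The abstract does not give its formulas, so this assumes AE is the mean signed error (predicted minus observed), AAE is the mean absolute error, and SE is the root of the summed squared errors divided by n - 1; the function name and data are hypothetical.

```python
import math


def error_statistics(observed, predicted):
    """Return (SE, AE, AAE) under the assumptions stated in the lead-in."""
    errors = [p - o for o, p in zip(observed, predicted)]
    n = len(errors)
    se = math.sqrt(sum(e * e for e in errors) / (n - 1))  # assumed n - 1 denominator
    ae = sum(errors) / n                                  # signed: reveals systematic bias
    aae = sum(abs(e) for e in errors) / n                 # unsigned: typical error size
    return se, ae, aae


# Hypothetical solubility values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]
se, ae, aae = error_statistics(observed, predicted)
print(f"SE = {se:.3f}, AE = {ae:.3f}, AAE = {aae:.3f}")
```

A negative AE, as reported in the abstract, would under this sign convention mean the model predicts slightly low on average, while AAE stays positive regardless of the direction of the errors.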

Environmental Science & Technology, Jan 6, 2015

Thyroid hormone disrupting chemicals (THDCs) interfere with the thyroid hormone system and may induce multiple severe physiological disorders. Indoor dust ingestion is a major route of THDCs exposure in humans, and one of the molecular targets of these chemicals is the hormone transporter transthyretin (TTR). To virtually screen indoor dust contaminants and their metabolites for THDCs targeting TTR, we developed a quantitative structure-activity relationship (QSAR) classification model. The QSAR model was applied to an in-house database including 485 organic dust contaminants reported from literature data and their 433 in silico derived metabolites. The model predicted 37 (7.6%) dust contaminants and 230 (53.1%) metabolites as potential TTR binders. Four new THDCs were identified after testing 23 selected parent dust contaminants in a radio-ligand TTR binding assay; 2,2',4,4'-tetrahydroxybenzophenone, perfluoroheptanesulfonic acid, 3,5,6-trichloro-2-pyridinol, and 2,4,5-tric...

A quantitative structure-retention relationship (QSRR) was employed to predict the retention time (RT, in min) of pesticides using five molecular descriptors selected by a genetic algorithm (GA) as the feature selection technique. The data set was then randomly divided into training and prediction sets. The selected descriptors were used as inputs to multi-linear regression (MLR), multilayer perceptron neural network (MLP-NN), and generalized regression neural network (GR-NN) modeling techniques to build QSRR models. Both linear and nonlinear models showed good predictive ability, and the GR-NN model performed better than the MLR and MLP-NN models. The root mean square error of cross-validation for the training and prediction sets of the GR-NN model was 1.245 and 2.210, and the correlation coefficients (R) were 0.975 and 0.937, respectively, while the squared correlation coefficient of leave-one-out cross-validation (Q²_LOO) for the GR-NN model was 0.951, revealing the reliability of this model. The results indicated that GR-NN could be used as a predictive tool for the RT (min) values of the pesticides under study.
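The leave-one-out statistics mentioned above (Q²_LOO and the root mean square error of cross-validation) can be sketched as follows. For brevity this uses an ordinary least-squares line on a single hypothetical descriptor rather than the five-descriptor GR-NN model of the paper; the function names and data are assumptions made for illustration.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_cross_validation(xs, ys):
    """Return (Q2_LOO, RMSECV) by refitting with each sample held out once."""
    n = len(xs)
    press = 0.0  # predicted residual sum of squares
    for i in range(n):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / n
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total, math.sqrt(press / n)


# Hypothetical descriptor values and retention times (min).
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
retention = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]
q2_loo, rmsecv = loo_cross_validation(descriptor, retention)
print(f"Q2_LOO = {q2_loo:.3f}, RMSECV = {rmsecv:.3f}")
```

Because each prediction comes from a model that never saw the held-out sample, Q²_LOO is generally lower than the fitted R² and gives a more cautious estimate of predictive ability.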

Journal of Separation Science, 2009

In this study, a quantitative structure-retention relationship (QSRR) was used to predict the Kováts retention indices of 180 alkylphenols and their derivatives using multiple linear regression (MLR) and a support vector machine (SVM). After molecular descriptors were calculated for all molecules, the data set was randomly divided into training and test sets. The diversity of the training and test sets was examined by a molecular diversity validation test. Stepwise MLR was then used to select the most important descriptors and to develop the MLR models. The descriptors that appear in these QSRR models are the number of H atoms, the relative number of O atoms, the Balaban index, the relation yz-shadow/yz-rectangle, and the partial charges hydrogen bond donor atoms HDCA(2) index. These descriptors were used as inputs for developing the SVM model. After its parameters were optimized, the SVM was used to calculate the chromatographic retention of the molecules of interest. The values of SE in the calculation of Kováts retention indices for the training and test sets are 0.34 and 0.63, respectively, for the MLR model and 0.35 and 0.63, respectively, for the SVM model. The overall values of average absolute relative error were 13.24 and 13.83 for the MLR and SVM models, respectively. In addition, cross-validation tests were performed to further examine the models. The calculated values of the cross-validation correlation coefficient (Q²) and the standard deviation based on the predicted residual sum of squares are 0.896 and 0.680 for the MLR model and 0.893 and 0.67 for the SVM model. These values and the other statistical parameters obtained for these models reveal the suitability of QSRR for the prediction of Kováts retention indices of alkylphenols using MLR and SVM techniques.

Chemical Research in Toxicology, 2014

For a better understanding of species-specific relative effect potencies (REPs), responses of dioxin-like compounds (DLCs) were assessed. REPs were calculated using chemical-activated luciferase gene expression assays (CALUX) derived from guinea pig, rat, and mouse cell lines. Almost all 20 congeners tested in the rodent cell lines were partial agonists and less efficacious than 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD). For this reason, REPs were calculated for each congener using concentrations at which 20% of the maximal TCDD response was reached (REP20TCDD). REP20TCDD values obtained for PCDD/Fs were comparable with their toxic equivalency factors assigned by the World Health Organization (WHO-TEF), while those for PCBs were in general lower than the WHO-TEF values. Moreover, the guinea pig cell line was the most sensitive as indicated by the 20% effect concentrations of TCDD of 1.5, 5.6, and 11.0 pM for guinea pig, rat, and mouse cells, respectively. A similar response pattern was observed using multivariate statistical analysis between the three CALUX assays and the WHO-TEFs.
The mouse assay showed minor deviation due to higher relative induction potential for 2,3,7,8-tetrachlorodibenzofuran and 2,3,4,6,7,8-hexachlorodibenzofuran and lower for 1,2,3,4,6,7,8-heptachlorodibenzofuran and 3,3',4,4',5-pentachlorobiphenyl (PCB126). 2,3,7,8-Tetrachlorodibenzofuran was more than two times more potent in the mouse assay as compared with that of rat and guinea pig cells, while measured REP20TCDD for PCB126 was lower in mouse cells (0.05) as compared with that of the guinea pig (0.2) and rat (0.07). In order to provide REP20TCDD values for all WHO-TEF assigned compounds, quantitative structure-activity relationship (QSAR) models were developed. The QSAR models showed that specific electronic properties and molecular surface characteristics play important roles in the AhR-mediated response.
In silico derived REP20TCDD values were generally consistent with the WHO-TEFs, with a few exceptions. The QSAR models indicated that, e.g., 1,2,3,7,8-pentachlorodibenzofuran and 1,2,3,7,8,9-hexachlorodibenzofuran were more potent than indicated by their assigned WHO-TEF values, and the non-ortho PCB 81 was predicted, based on the guinea pig model, to be 1 order of magnitude above its WHO-TEF value. By combining in vitro and in silico approaches, REPs were established for all WHO-TEF assigned compounds (except OCDD), which will provide future guidance in testing AhR-mediated responses of DLCs and increase our understanding of species variation in AhR-mediated effects.
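The abstract defines REP20TCDD through the concentration at which a congener reaches 20% of the maximal TCDD response. A minimal sketch of that arithmetic follows, assuming the common convention that a relative potency is the reference (TCDD) concentration divided by the congener concentration at the same effect level. The function name and the 7.5 pM congener concentration are hypothetical; only the 1.5 pM guinea pig TCDD value is taken from the abstract.

```python
def rep20_tcdd(tcdd_ec20_pm, congener_conc_pm):
    """Relative effect potency at 20% of the maximal TCDD response.

    tcdd_ec20_pm:     TCDD concentration (pM) giving 20% of its maximal response
    congener_conc_pm: congener concentration (pM) giving that same response level
    """
    if congener_conc_pm <= 0:
        raise ValueError("congener concentration must be positive")
    return tcdd_ec20_pm / congener_conc_pm


# 1.5 pM is the guinea pig TCDD value from the abstract;
# 7.5 pM is a hypothetical congener concentration used only for illustration.
rep = rep20_tcdd(tcdd_ec20_pm=1.5, congener_conc_pm=7.5)
print(f"REP20TCDD = {rep:.2f}")
```

Anchoring on 20% of the TCDD maximum, rather than on each congener's own half-maximal response, keeps the comparison meaningful for partial agonists that never reach the full TCDD response.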

Industrial & Engineering Chemistry Research, 2012

An artificial neural network was employed to predict the cellular uptake of 109 magnetofluorescent nanoparticles (NPs) in pancreatic cancer cells on the basis of the quantitative structure-activity relationship method. Six descriptors chosen by combining self-organizing map and stepwise multiple linear regression (MLR) techniques were used to correlate the nanostructure of the studied particles with their bioactivity using MLR and multilayered perceptron neural network (MLP-NN) modeling techniques. For the MLR and MLP-NN models, the correlation coefficient was 0.769 and 0.934, and the root-mean-square error was 0.364 and 0.150, respectively. The results of a leave-many-out cross-validation test supported the credibility of MLP-NN for the prediction of cellular uptake of NPs. In addition, sensitivity analysis of the MLP-NN model indicated that the number of hydrogen-bond donor sites in the organic coating of an NP is the predominant factor responsible for cellular uptake.