Generation of artificial neural networks models in anticancer study

Using Neural Models for Evaluation of Biological Activity of Selected Chemical Compounds

Applications of Computational Intelligence in Biology, Current Trends and Open Problems, 2008

The chapter shows how the biological activity of particular chemical compounds can be predicted and evaluated using neural network models. The purpose of the work was to verify the usefulness of various types and structures of neural networks, as well as various training techniques, for predicting the properties of defined chemical compounds before they are studied by laboratory methods. The huge number and variety of chemical compounds that can be synthesized makes predicting any of their properties by computer modeling a very attractive alternative to costly experimental studies. The method described in this chapter may be useful for forecasting various properties of different groups of chemical compounds. The purpose of this chapter is to present the studied problem (and the solutions obtained) from the point of view of neural network technique and the optimization of neural computations. The usefulness and wide-ranging applicability of neural networks have already been shown in hundreds of tasks in different and often very distant fields. Nevertheless, the majority of investigators pursue particular pragmatic ends and treat neural models purely as tools for getting solutions: some particular network is arbitrarily chosen and results are obtained and presented, with little or no discussion of which neural network was used, why it was chosen, and what could have been achieved if another network (or a non-neural method, such as regression) had been applied. As a result, every researcher undertaking a similar problem once more faces the serious methodological question of which network to select, how to train it, and how to present the data in order to obtain the best results.
This chapter presents the results of investigations in which various networks were applied to the same (difficult) problem of predicting the chemical activity of quite a large group of chemical compounds, and different results were obtained. On the basis of these results, we draw conclusions about which networks and learning methods are better and which are worse at solving the considered problem. These conclusions cannot simply be generalized mechanically, because every application of neural networks has its own specifics, but the authors of this chapter hope that their broad and precisely documented studies will prove useful to those who want to apply neural networks and are considering which model to use as a starting point.

Prediction of anticancer molecules using hybrid model developed on molecules screened against NCI-60 cancer cell lines

BMC cancer, 2015

In the past, numerous quantitative structure-activity relationship (QSAR) based models have been developed for predicting anticancer activity for a specific class of molecules against different cancer drug targets. In contrast, limited attempts have been made to predict the anticancer activity of a diverse class of chemicals against a wide variety of cancer cell lines. In this study, we describe a hybrid method developed on thousands of anticancer and non-anticancer molecules tested against National Cancer Institute (NCI) 60 cancer cell lines. Our analysis of anticancer molecules revealed that the majority of anticancer molecules contain 18-24 carbon atoms and are dominated by functional groups like R2NH, R3N, ROH, RCOR, and ROR. It was also observed that certain substructures (e.g., 1-methoxy-4-methylbenzene, 1-methoxy benzene, nitrobenzene, indole, propenyl benzene) are more abundant in anticancer molecules. Next, we developed anticancer molecule prediction models using various machine-l...

Artificial Neural Network-Based Analysis of High-Throughput Screening Data for Improved Prediction of Active Compounds

Journal of Biomolecular Screening, 2009

Artificial neural networks (ANNs) are trained using high-throughput screening (HTS) data to recover active compounds from a large data set. Improved classification performance was obtained by combining predictions made by multiple ANNs. The HTS data, acquired from a methionine aminopeptidase inhibition study, consisted of a library of 43,347 compounds, and the ratio of active to nonactive compounds, RA/N, was 0.0321. Back-propagation ANNs were trained and validated using principal components derived from the physicochemical features of the compounds. With carefully selected training parameters, an ANN recovers one-third of all active compounds from the validation set with a 3-fold gain in RA/N value. Further gains in RA/N values were obtained upon combining the predictions made by a number of ANNs. The generalization property of the back-propagation ANNs was exploited by training those ANNs on the same training samples after initializing them with different sets of random weights. ...
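The arithmetic behind the active-to-nonactive ratio reported above can be made concrete with a short sketch. It assumes the ratio 0.0321 applies to the whole 43,347-compound library, so the resulting counts are estimates and not figures quoted in the abstract; the function names `class_counts` and `enrichment`, and the selected-subset numbers in the usage lines, are hypothetical.

```python
def class_counts(total, ratio):
    """Estimate (actives, nonactives) from a library size and ratio = actives / nonactives."""
    nonactives = total / (1.0 + ratio)
    actives = total - nonactives
    return actives, nonactives


def enrichment(selected_actives, selected_nonactives, baseline_ratio):
    """Fold gain of the active/nonactive ratio in a selected subset over the baseline ratio."""
    return (selected_actives / selected_nonactives) / baseline_ratio


# Library size and baseline ratio as stated in the abstract.
actives, nonactives = class_counts(43_347, 0.0321)
print(f"estimated actives: {actives:.0f}, nonactives: {nonactives:.0f}")

# Hypothetical selection (not data from the study): a subset whose ratio is three times the baseline.
print(f"fold gain: {enrichment(96.3, 1000.0, 0.0321):.2f}")
```

A selection is enriched when its fold gain exceeds 1; the 3-fold gain mentioned above corresponds to a selected subset whose active-to-nonactive ratio is about 0.096.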

Basic concepts of artificial neural network (ANN) modeling and its application in pharmaceutical research

Journal of pharmaceutical and …, 2000

Artificial neural networks (ANNs) are biologically inspired computer programs designed to simulate the way in which the human brain processes information. ANNs gather their knowledge by detecting the patterns and relationships in data and learn (or are trained) through experience, not from programming. An ANN is formed from hundreds of single units, artificial neurons or processing elements (PE), connected with coefficients (weights), which constitute the neural structure and are organised in layers. The power of neural computations comes from connecting neurons in a network. Each PE has weighted inputs, a transfer function and one output. The behavior of a neural network is determined by the transfer functions of its neurons, by the learning rule, and by the architecture itself. The weights are the adjustable parameters and, in that sense, a neural network is a parameterized system. The weighted sum of the inputs constitutes the activation of the neuron. The activation signal is passed through the transfer function to produce a single output of the neuron. The transfer function introduces non-linearity to the network. During training, the inter-unit connections are optimized until the error in predictions is minimized and the network reaches the specified level of accuracy. Once the network is trained and tested it can be given new input information to predict the output. Many types of neural networks have been designed already and new ones are invented every week, but all can be described by the transfer functions of their neurons, by the learning rule, and by the connection formula. ANNs represent a promising modeling technique, especially for data sets having non-linear relationships, which are frequently encountered in pharmaceutical processes. In terms of model specification, artificial neural networks require no knowledge of the data source but, since they often contain many weights that must be estimated, they require large training sets.
In addition, ANNs can combine and incorporate both literature-based and experimental data to solve problems. The various applications of ANNs can be summarised into classification or pattern recognition, prediction and modeling. Supervised associating networks can be applied in pharmaceutical fields as an alternative to conventional response surface methodology. Unsupervised feature-extracting networks represent an alternative to principal component analysis. Non-adaptive unsupervised networks are able to reconstruct their patterns when presented with noisy samples and can be used for image recognition. The potential applications of ANN methodology in the pharmaceutical sciences range from interpretation of analytical data, drug and dosage form design through biopharmacy to clinical pharmacy.
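The description of a processing element in this abstract (weighted inputs, an activation formed as their weighted sum, and a transfer function producing one output) can be sketched in a few lines. This is a minimal illustration, not code from the paper: the bias term, the choice of `tanh` as the transfer function, and the names `neuron_output` and `layer_output` are assumptions, and the weights in the usage lines are arbitrary.

```python
import math


def neuron_output(inputs, weights, bias=0.0, transfer=math.tanh):
    """One processing element: weighted sum of inputs (activation), then the transfer function."""
    # The bias is an assumption; the abstract mentions only weighted inputs.
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return transfer(activation)


def layer_output(inputs, weight_rows, biases, transfer=math.tanh):
    """A layer is a list of neurons that all see the same inputs."""
    return [neuron_output(inputs, w, b, transfer) for w, b in zip(weight_rows, biases)]


# Arbitrary illustrative weights: two inputs feeding a hidden layer of two neurons,
# whose outputs feed a single output neuron.
hidden = layer_output([1.0, 2.0], [[0.5, -0.25], [0.1, 0.3]], [0.0, 0.0])
prediction = neuron_output(hidden, [1.0, -1.0])
print(hidden, prediction)
```

Training, as described above, would adjust the entries of `weight_rows` and `biases` until the prediction error is small; that step is left out of the sketch.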

Prediction of Drug Lipophilicity using Back Propagation Artificial Neural Network Modeling

Oriental Journal of Chemistry, 2014

A quantitative structure-property relationship (QSPR) study was performed to develop models that relate the structures of 150 organic drug compounds to their n-octanol-water partition coefficients (log P o/w). Molecular descriptors were derived solely from the 3D structures of the drug molecules. A genetic algorithm was also applied as a variable selection tool in the QSPR analysis. The models were constructed based on 110 training compounds, and predictive ability was tested on 40 compounds reserved for that purpose. Application of the developed models to this testing set of 40 organic drug compounds demonstrates that the new models are reliable, with good predictive accuracy and simple formulation. Modeling of log P o/w of these compounds as a function of the theoretically derived descriptors was carried out with an artificial neural network (ANN). The neural network employed here is a connected back-propagation model with a 4-4-1 architecture. Four descriptors for these compounds, molecular volume (MV) (geometrical), hydrophilic-lipophilic balance (HLB) (constitutional), hydrogen bond forming ability (HB) (electronic) and polar surface area (PSA) (electrostatic), are taken as inputs for the models. The use of descriptors calculated only from molecular structure eliminates the need for experimental determination of properties for use in the correlation and allows for the estimation of log P o/w for molecules not yet synthesized. The prediction results are in good agreement with the experimental values. The root mean square error of prediction (RMSEP) and squared correlation coefficient (R2) for the ANN model were 0.1838 and 0.9876, respectively, for the log P o/w prediction set.
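The two figures of merit quoted in this abstract, RMSEP and the squared correlation coefficient, can be computed as in the sketch below. It assumes R2 means the squared Pearson correlation between observed and predicted values, which the abstract's wording suggests but does not define; the function names and the log P values in the usage lines are hypothetical and are not data from the study.

```python
import math


def rmsep(observed, predicted):
    """Root mean square error of prediction over a prediction set."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def squared_correlation(observed, predicted):
    """Squared Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


# Hypothetical observed and predicted log P values, for illustration only.
observed = [1.2, 2.5, 3.1, 0.4, 4.0]
predicted = [1.0, 2.7, 3.0, 0.6, 3.8]
print(rmsep(observed, predicted), squared_correlation(observed, predicted))
```

The two measures answer different questions: RMSEP is in the units of the predicted property, while the squared correlation is scale-free and stays at 1 even when predictions are systematically shifted or scaled.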

Machine learning algorithms used in Quantitative structure-activity relationships studies as new approaches in drug discovery

2019 International Conference on Intelligent Systems and Advanced Computing Sciences (ISACS), 2019

Machine learning algorithms have become important tools in the drug discovery process. Nowadays, a variety of machine learning tools are used in quantitative structure-activity relationship (QSAR) studies to establish QSAR models. 2D-QSAR analysis involves the study of quantitative relationships between molecular descriptors and biological activity using machine learning algorithms such as partial least squares (PLS) and artificial neural networks (ANNs). The best linear 2D-QSAR model, developed through partial least squares (PLS), gave high predictive ability (R2 = 0.87, F = 52.80, R2pred = 0.80, Q2 = 0.77). Moreover, the non-linear artificial neural network (ANN) showed better performance with the Levenberg-Marquardt (L-M) algorithm (architecture [3-3-1]: R2 = 0.94, R2pred = 0.81, Q2 = 0.86). These results revealed that a_nO, PEOE_VSA+6 and Vsurf_R are important descriptors on which biological activity depends. Moreover, the retained 3D-QSAR model exhibits the best res...
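The abstract reports Q2 values without defining them. In QSAR work Q2 usually denotes a cross-validated R2, and the sketch below shows one common leave-one-out form, Q2 = 1 - PRESS / SS_tot. Several things here are assumptions: that the paper used leave-one-out cross-validation, that SS_tot is taken about the mean of all observed values, and the use of a one-descriptor least-squares line as a stand-in for the PLS and ANN models. The function names and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept for one descriptor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def q2_loo(xs, ys):
    """Leave-one-out Q2: refit without each compound, predict it, accumulate squared errors."""
    press = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot


# Hypothetical descriptor values and activities, for illustration only.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [2.1, 3.9, 6.2, 7.8, 10.1]
print(q2_loo(descriptor, activity))
```

Because every prediction is made by a model that never saw the compound in question, Q2 is normally lower than the fitted R2 of the same model, which matches the pattern of the values quoted above.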

ANN-QSAR model for selection of anticancer leads from structurally heterogeneous series of compounds

European Journal of Medicinal Chemistry, 2007

Developing a model for predicting the anticancer activity of any class of organic compounds based on molecular structure is a very important goal for medicinal chemists. Different molecular descriptors can be used to solve this problem. Stochastic molecular descriptors, the so-called MARCH-INSIDE approach, have been shown to be very successful in drug design. Nevertheless, the structural diversity of compounds is so vast that we may need non-linear models such as artificial neural networks (ANN) instead of linear ones. SmartMLP-ANN analysis used to model the anticancer activity of organic compounds has shown high average accuracy of 93.79% (train performance) and predictability of 90.88% (validation performance) for the 8:3-MLP topology with different training and predicting series. This ANN model compares favourably with a previous linear discriminant analysis (LDA) model [H. González-Díaz et al., J. Mol. Model 9 (2003) 395] that showed only 80.49% accuracy and 79.34% predictability. The present SmartMLP approach required shorter training times of only 10 h, while previous models give accuracies of 70–89% only after 25–46 h of training. In order to illustrate the practical use of the model in bioorganic medicinal chemistry, we report the in silico prediction and in vitro evaluation of six new synthetic tegafur analogues having IC50 values in a broad range between 37.1 and 138 μg mL−1 for leukemia (L1210/0) and human T-lymphocyte (Molt4/C8, CEM/0) cells. Theoretical predictions coincide very well with experimental results.

Neural network models for predicting the properties of chemical compounds

Fibre Chemistry, 2008

Neural networks are a universal tool used to investigate the dependences between the structure of organic compounds and a broad spectrum of their physicochemical properties. The potential of neural network modeling is not yet exhausted, as the increasing number of publications on their use indicates. Neural network models can solve both classification (for a discrete set of values of the modeled property) and regression problems (for continuous values of the modeled property). The reason for the popularity of neural network models in applied research is their clarity and the fact that no deep knowledge of mathematical statistics is required for their effective use.

A classification scheme for the prediction of essential chemical and biological properties based on the classical neural network approach

Prediction of biological and chemical features of a given chemical structure is a challenging problem for the existing nonlinear mapping performed by neural networks. In combinatorial chemistry, computational approaches are capable of significantly decreasing the amount of synthesis needed for the development of a specific chemical or biological drug. Therefore, the main goal is to distinguish appropriate descriptors from insignificant ones. The experimental design for the classical nonlinear neural network mapping for the approximation of five descriptors and the corresponding reaction of the immune system for the drug development is reported briefly. The results for the different descriptors are presented in comparison.

Predictive statistics and artificial intelligence in the U.S. National Cancer Institute's drug discovery program for cancer and AIDS

Stem Cells, 1994

The National Cancer Institute's drug discovery program screens more than 20,000 chemical compounds and natural products a year for activity against a panel of 60 tumor cell lines in vitro. The result is an information-rich database of patterns that form the basis for what we term an "information-intensive" approach to the process of drug discovery. The first step was a demonstration, both by statistical methods (including the program COMPARE) and by neural networks, that patterns of activity in the screen can be used to predict a compound's mechanism of action. Given this finding, the overall plan has been to develop three large matrices of information: the first (designated A) gives the pattern of activity for each compound tested against each cell line in the screen; the second (S) encodes any of a number of types of 2-D or 3-D structural motifs for each compound; the third (T) indicates each cell's expression of molecular targets (e.g., from 2-dimensional protein gel electrophoresis). Construction and updating of these matrices is an ongoing process. The matrices can be concatenated in various ways to test a variety of specific hypotheses about compounds screened, as well as to "prioritize" candidate compounds for testing. To aid in these efforts, we have developed the DISCOVERY program package, which integrates the matrix data for visual pattern recogni