Novel topological descriptors for analyzing biological networks

Local Topological Signatures for Network-Based Prediction of Biological Function

Lecture Notes in Computer Science, 2013

In biology, similarity in structure or sequence between molecules is often used as evidence of functional similarity. In protein interaction networks, structural similarity of nodes (i.e., proteins) is often captured by comparing node signatures (vectors of topological properties of the neighborhoods surrounding the nodes). In this paper, we ask how well such topological signatures predict protein function, using protein interaction networks of the organism Saccharomyces cerevisiae. To this end, we compare two node signatures from the literature (the graphlet degree vector and a signature based on the graph spectrum) with our own simple node signature based on basic topological properties. We find the connection between topology and protein function to be weak but statistically significant. Surprisingly, our node signature, despite its simplicity, performs on par with the other, more sophisticated node signatures. In fact, we show that just two metrics, the link count and transitivity, are enough to classify protein function at a level on par with the other signatures, suggesting that detailed topological characteristics are unlikely to aid in protein function prediction based on protein interaction networks.
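The two-metric signature mentioned in the abstract above can be illustrated with a minimal sketch. It assumes that "link count" means node degree and that "transitivity" means the local clustering coefficient of a node; the paper's exact definitions may differ. The function name `node_signature` and the toy graph are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def node_signature(adj, node):
    """Return (link count, local transitivity) for one node.

    adj maps each node to the set of its neighbours
    (undirected graph, no self-loops). Assumed definitions, see lead-in.
    """
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        # Fewer than two neighbours: no neighbour pair can be closed.
        return k, 0.0
    # Count neighbour pairs that are themselves linked.
    closed = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return k, closed / (k * (k - 1) / 2)


# Toy graph (hypothetical): a triangle a-b-c plus a pendant node d attached to c.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
signatures = {n: node_signature(toy, n) for n in toy}
```

Each protein is then described by a two-component vector, which could be passed to any standard classifier in place of a longer signature.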

Structural Discrimination of Networks by Using Distance, Degree and Eigenvalue-Based Measures

PLoS ONE, 2012

In chemistry and computational biology, structural graph descriptors have been proven essential for characterizing the structure of chemical and biological networks. It has also been demonstrated that they are useful to derive empirical models for structure-oriented drug design. However, from a more general (complex network-oriented) point of view, investigating mathematical properties of structural descriptors, such as their uniqueness and structural interpretation, is also important for an in-depth understanding of the underlying methods. In this paper, we emphasize the evaluation of the uniqueness of distance, degree and eigenvalue-based measures. Among these are measures that have been recently investigated extensively. We report numerical results using chemical and exhaustively generated graphs and also investigate correlations between the measures.

Citation: Dehmer M, Grabner M, Furtula B (2012) Structural Discrimination of Networks by Using Distance, Degree and Eigenvalue-Based Measures. PLoS ONE 7(7): e38564.

On Valency-Based Molecular Topological Descriptors of Subdivision Vertex-Edge Join of Three Graphs

Symmetry

In studies of quantitative structure–activity relationships (QSARs) and quantitative structure–property relationships (QSPRs), graph invariants are used to estimate the biological activities and properties of chemical compounds. In these studies, degree-based topological indices have a significant place among the other descriptors because they are easy to generate and fast to compute. In this paper, we give results for the first, second, and third Zagreb indices, forgotten index, hyper Zagreb index, reduced first and second Zagreb indices, multiplicative Zagreb indices, redefined versions of the Zagreb indices, first reformulated Zagreb index, harmonic index, atom-bond connectivity index, geometric-arithmetic index, and reduced reciprocal Randić index of a new graph operation called the “subdivision vertex-edge join” of three graphs.
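A few of the degree-based indices listed above are simple enough to sketch directly from an edge list. The sketch uses the standard definitions: first Zagreb index (sum of squared degrees), second Zagreb index (sum of degree products over edges), forgotten index (sum of cubed degrees), and harmonic index (sum of 2/(d_u + d_v) over edges). The example graph is a four-vertex path, not the subdivision vertex-edge join studied in the paper, and the function names are hypothetical.

```python
def degrees(edges):
    """Map each vertex to its degree, given an undirected edge list."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def first_zagreb(edges):
    # M1 = sum over vertices of deg(v)^2
    return sum(d * d for d in degrees(edges).values())


def second_zagreb(edges):
    # M2 = sum over edges uv of deg(u) * deg(v)
    deg = degrees(edges)
    return sum(deg[u] * deg[v] for u, v in edges)


def forgotten_index(edges):
    # F = sum over vertices of deg(v)^3
    return sum(d ** 3 for d in degrees(edges).values())


def harmonic_index(edges):
    # H = sum over edges uv of 2 / (deg(u) + deg(v))
    deg = degrees(edges)
    return sum(2 / (deg[u] + deg[v]) for u, v in edges)


# Illustrative graph: the path 0-1-2-3 with degrees 1, 2, 2, 1.
path4 = [(0, 1), (1, 2), (2, 3)]
```

Because every index here needs only vertex degrees, each runs in time linear in the number of edges, which is the "ease of generation" the abstract refers to.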

Topological Indices of Molecular Graph and Drug Design

International Journal for Research in Applied Science & Engineering Technology (IJRASET), 2022

This article covers the application of topology to molecular graphs and drug design. On the basis of the most recent developments in this area, an overview of the use of topological indices (TIs) in the process of drug design and development is provided. The introduction of concepts used in drug design and discovery, graph theory, and topological indices is the primary goal of the first section of this book. Researchers can learn more about the physical characteristics, chemical reactivity, and biological activity of chemical molecular structures by using topological indices. To compensate for the lack of chemical experiments and to offer a theoretical foundation for the production of medications and chemical materials, topological indices of the chemical structures of materials and drugs are studied. In this article, we concentrate on the family of smart polymers that are frequently utilised in the production of drugs.

Structural Measures for Network Biology Using QuACN

BMC Bioinformatics, 2011

Background: Structural measures for networks have been extensively developed, but many of them have not yet demonstrated their sustainability. That is, it often remains unclear whether a particular measure is useful and feasible for solving a particular problem in network biology. One example is the classification of complex biological networks, for which structural measures are used with the aim of a minimal classification error. Hence, there is a strong need to provide freely available software packages to calculate and demonstrate the appropriate usage of structural graph measures in network biology. Results: Here, we discuss topological network descriptors that are implemented in the R package QuACN and demonstrate their behavior and characteristics by applying them to a set of example graphs. Moreover, we show a representative application to illustrate their capabilities for classifying biological networks. In particular, we infer gene regulatory networks from microarray data and classify them by methods provided by QuACN. Note that QuACN is the first freely available software written in R containing a large number of structural graph measures. Conclusion: The R package QuACN is under ongoing development, and we continuously add promising groups of topological network descriptors. The package can be used to answer intriguing research questions in network biology, e.g., classifying biological data or identifying meaningful biological features, by analyzing the topology of biological networks.

On Distance-Based Topological Descriptors of Chemical Interconnection Networks

Journal of Mathematics

Structure-based topological descriptors of chemical networks enable the prediction of physico-chemical properties and bioactivities of compounds through QSAR/QSPR methods. Topological indices are numerical values that characterise a graph. One of the latest distance-based topological indices is the Mostar index. In this paper, we study the Mostar index, Szeged index, PI index, ABC_GG index, and NGG index for the chain oxide network COX_n, chain silicate network CS_n, ortho chain S_n, and para chain Q_n for the first time. Moreover, analytically closed formulae for these structures are determined.
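As a rough illustration of two of the distance-based indices named above, the sketch below computes the Mostar and Szeged indices by brute force. It relies on the standard definitions: for each edge uv, n_u is the number of vertices strictly closer to u than to v, the Mostar index sums |n_u - n_v| over edges, and the Szeged index sums n_u * n_v. The code assumes a connected, unweighted graph with orderable vertex labels; the example graphs are a small path and a cycle, not the oxide or silicate networks from the paper, and the function names are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def mostar_and_szeged(adj):
    """Return (Mostar index, Szeged index) of a connected graph."""
    dist = {v: bfs_distances(adj, v) for v in adj}
    mostar = szeged = 0
    for u in adj:
        for v in adj[u]:
            if u < v:  # visit each undirected edge once
                n_u = sum(1 for x in adj if dist[u][x] < dist[v][x])
                n_v = sum(1 for x in adj if dist[v][x] < dist[u][x])
                mostar += abs(n_u - n_v)
                szeged += n_u * n_v
    return mostar, szeged


# Illustrative graphs: the path 0-1-2-3 and the 4-cycle 0-1-2-3-0.
path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
cycle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
```

A brute-force check of this kind can be useful for validating closed formulae on small instances of a network family before trusting them for general n.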

Graph representations of molecular similarity measures based on topological resolution

Journal of Computational Methods in Sciences and Engineering, 2005

Graph representations of families of topologies are introduced for a concise description of hierarchies of topologies involved in Resolution-Based Similarity Measures of molecules within a topological context. In the general case of partially ordered families of topological spaces involved in the characterization of similarity relations within large families of molecules, for example, those in pharmaceutical databases, these graphs provide a global characterization of the databases. The newly introduced graph representations also serve as tools for compatibility assessment for database mergers, for example, if the accumulated data in the information systems of two pharmaceutical companies are combined.

Natural/random protein classification models based on star network topological indices

Journal of Theoretical Biology, 2008

The development of complex network graphs permits us to describe any real system, such as social, neural, computer or genetic networks, by transforming real properties into topological indices (TIs). This work uses Randic's star networks to convert protein primary structure data into specific topological indices that are used to construct a natural/random protein classification model. The set of natural proteins contains 1046 protein chains selected from the pre-compiled CulledPDB list from PISCES Dunbrack's Web Lab. This set is characterized by a protein homology of 20%, a structure resolution of 1.6 Å and an R-factor lower than 25%. The set of random amino acid chains contains 1046 sequences, which were generated by a Python script according to the same type of residues and average chain length found in the natural set. A new Sequence to Star Networks (S2SNet) wxPython GUI application (with a Graphviz graphics back-end) was designed by our group to transform any character sequence into the following star network topological indices: Shannon entropy of Markov matrices, trace of connectivity matrices, Harary number, Wiener index, Gutman index, Schultz index, Moreau–Broto indices, Balaban distance connectivity index, Kier–Hall connectivity indices and Randic connectivity index. The model was constructed with the General Discriminant Analysis methods from the STATISTICA package and gave training/predicting set accuracies of 90.77% for the forward stepwise model type. In conclusion, this study extends for the first time the classical TIs to protein star network TIs by proposing a model that can predict whether a protein or protein fragment is natural or random using only the amino acid sequence data. This classification can be used in studies of protein function by replacing some fragments with random amino acid sequences, or to detect fake amino acid sequences or errors in proteins.
These results promote the use of the S2SNet application not only for protein structure analysis but also for mass spectroscopy, clinical proteomics and imaging, or DNA/RNA structure analysis.
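Two of the indices listed in the abstract above, the Wiener index and the Harary number, can be sketched on a small star-shaped graph. This is not the S2SNet implementation and does not reproduce how S2SNet builds a star network from a sequence. It assumes the Wiener index is the sum of shortest-path distances over all unordered vertex pairs and the Harary number is the sum of reciprocal distances over the same pairs (other definitions of the Harary number exist). The graph and function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def distances_from(adj, source):
    """Shortest-path distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def wiener_and_harary(adj):
    """Return (Wiener index, Harary number) of a connected graph."""
    dist = {v: distances_from(adj, v) for v in adj}
    wiener, harary = 0, 0.0
    for u, v in combinations(adj, 2):
        d = dist[u][v]
        wiener += d        # sum of distances
        harary += 1.0 / d  # sum of reciprocal distances (assumed definition)
    return wiener, harary


# Hypothetical star graph: one hub joined to four leaves.
star = {
    "hub": ["a", "b", "c", "d"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
    "d": ["hub"],
}
```

In this star, the four hub-leaf pairs lie at distance 1 and the six leaf-leaf pairs at distance 2, so both indices can be verified by hand.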

Graph kernels for chemical informatics

Neural Networks, 2005

Increased availability of large repositories of chemical compounds is creating new challenges and opportunities for the application of machine learning methods to problems in computational chemistry and chemical informatics. Because chemical compounds are often represented by the graph of their covalent bonds, machine learning methods in this domain must be capable of processing graphical structures with variable size. Here we first briefly review the literature on graph kernels and then introduce three new kernels (Tanimoto, MinMax, Hybrid) based on the idea of molecular fingerprints and counting labeled paths of depth up to d using depth-first search from each possible vertex. The kernels are applied to three classification problems to predict mutagenicity, toxicity, and anti-cancer activity on three publicly available data sets. The kernels achieve performances at least comparable, and most often superior, to those previously reported in the literature, reaching accuracies of 91.5% on the Mutag dataset, 65-67% on the PTC (Predictive Toxicology Challenge) dataset, and 72% on the NCI (National Cancer Institute) dataset. Properties and tradeoffs of these kernels, as well as other proposed kernels that leverage 1D or 3D representations of molecules, are briefly discussed.
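The path-counting idea described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes simple paths (no repeated vertices), labels paths by atom symbols only (bond types are ignored), computes the Tanimoto similarity on the sets of paths present and the MinMax similarity on path counts, and omits the Hybrid kernel. The molecule encoding and all function names are hypothetical.

```python
from collections import Counter


def path_counts(labels, adj, depth):
    """Count labeled simple paths with at most `depth` edges,
    found by depth-first search started from every vertex."""
    counts = Counter()

    def dfs(v, visited, word):
        counts[word] += 1
        if len(visited) - 1 == depth:  # path already has `depth` edges
            return
        for w in adj[v]:
            if w not in visited:
                dfs(w, visited | {w}, word + "-" + labels[w])

    for start in adj:
        dfs(start, {start}, labels[start])
    return counts


def tanimoto(c1, c2):
    """Shared paths divided by all paths present in either molecule."""
    a, b = set(c1), set(c2)
    return len(a & b) / len(a | b)


def minmax(c1, c2):
    """Sum of per-path minimum counts over sum of per-path maximum counts."""
    keys = set(c1) | set(c2)
    num = sum(min(c1[k], c2[k]) for k in keys)
    den = sum(max(c1[k], c2[k]) for k in keys)
    return num / den


# Toy heavy-atom skeletons (hypothetical encoding): C-C-O and C-O.
mol1 = path_counts(["C", "C", "O"], {0: [1], 1: [0, 2], 2: [1]}, depth=2)
mol2 = path_counts(["C", "O"], {0: [1], 1: [0]}, depth=2)
```

The Tanimoto variant only asks whether a path occurs, while MinMax also rewards agreement in how often it occurs, which is why the two values differ for the toy pair above.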