Characterizing the morphology of protein binding patches

Proteins: Structure, Function, and Bioinformatics, 2012

Let the patch of a partner in a protein complex be the collection of atoms accounting for the interaction. To improve our understanding of the structure-function relationship, we present a patch model decoupling the topological and geometric properties. While the geometry is classically encoded by the atomic positions, the topology is recorded in a graph encoding the relative position of concentric shells partitioning the interface atoms. The topological-geometric duality provides the basis of a generic dynamic-programming-based algorithm comparing patches at the shell level, which may favor topological or geometric features. On the biological side, we address four questions, using 249 co-crystallized heterodimers organized in biological families. First, we dissect the morphology of binding patches, and show that Nature enjoyed the topological and geometric degrees of freedom independently while retaining a finite set of qualitatively distinct topological signatures. Second, we argue that our shell-based comparison is effective for performing atomic-level comparisons, and show that topological similarity is less stringent than geometric similarity. We also use the topological versus geometric duality to exhibit topo-rigid patches, whose topology (but not geometry) remains stable upon docking. Third, we use our comparison algorithms to infer specificity-related information amidst a database of complexes. Finally, we exhibit a descriptor outperforming its contenders to predict the binding affinities of the affinity benchmark.

Exploring the Stability of Dimers Through Protein Structure Topology

Protein homodimers pose some intriguing questions about the relation between structure and stability. We approached the problem by means of a topological methodology based on protein contact networks. We correlated local interface descriptors with global structural and energetic properties of the systems under analysis. We demonstrated that the graph energy, formerly applied to the analysis of unconjugated hydrocarbon structures, is the bridge between the topological and energetic description of protein complexes. This is a first step towards the generation of a "protein structural formula", analogous to the molecular graphs in organic chemistry.
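The graph energy mentioned in this abstract is, in its standard definition, the sum of the absolute values of the eigenvalues of a graph's adjacency matrix. The sketch below computes it for small toy graphs in pure Python. The function names, the unweighted adjacency matrix, and the Jacobi eigenvalue routine are illustrative assumptions; the abstract does not say how the authors built or weighted their contact networks.

```python
import math


def symmetric_eigenvalues(matrix, tol=1e-20, max_sweeps=100):
    """Eigenvalues of a real symmetric matrix via cyclic Jacobi rotations."""
    n = len(matrix)
    a = [[float(x) for x in row] for row in matrix]
    for _ in range(max_sweeps):
        # Sum of squared off-diagonal entries; zero means the matrix is diagonal.
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(i + 1, n))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle in [-pi/4, pi/4] that zeroes a[p][q].
                if diff == 0.0:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


def graph_energy(n_nodes, edges):
    """Sum of absolute adjacency eigenvalues of an undirected, unweighted graph."""
    adj = [[0.0] * n_nodes for _ in range(n_nodes)]
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1.0
    return sum(abs(ev) for ev in symmetric_eigenvalues(adj))


if __name__ == "__main__":
    # Toy "contact network": four residues in a ring of contacts.
    print(graph_energy(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))
```

In a contact-network setting the nodes would be residues and the edges would be contacts below some distance cutoff; the cutoff is a modelling choice not specified here.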

An Algebro-Topological Description of Protein Domain Structure

PLoS ONE, 2011

The space of possible protein structures appears vast and continuous, and the relationship between primary, secondary and tertiary structure levels is complex. Protein structure comparison and classification is therefore a difficult but important task since structure is a determinant for molecular interaction and function. We introduce a novel mathematical abstraction based on geometric topology to describe protein domain structure. Using the locations of the backbone atoms and the hydrogen bonds, we build a combinatorial object, a so-called fatgraph. The description is discrete yet gives rise to a 2-dimensional mathematical surface. Thus, each protein domain corresponds to a particular mathematical surface with characteristic topological invariants, such as the genus (number of holes) and the number of boundary components. Both invariants are global fatgraph features reflecting the interconnectivity of the domain by hydrogen bonds. We introduce the notion of robust variables, that is, variables that are robust towards minor changes in the structure/fatgraph, and show that the genus and the number of boundary components are robust. Further, we investigate the distribution of different fatgraph variables and show how only four variables are capable of distinguishing different folds. We use local (secondary) and global (tertiary) fatgraph features to describe domain structures and illustrate that they are useful for classification of domains in CATH. In addition, we combine our method with two other methods, thereby using primary, secondary, and tertiary structure information, and show that we can identify a large percentage of new and unclassified structures in CATH.
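To make the two invariants concrete, here is a minimal sketch of how genus and boundary count can be read off a fatgraph. It assumes a connected, orientable fatgraph with no twisted edges, encoded as two permutations of half-edges: `sigma` (cyclic order around each vertex) and `alpha` (pairing of half-edges into edges). The encoding and function names are illustrative, not the authors' construction from backbone atoms and hydrogen bonds. With V vertices, E edges and r boundary components, the genus g follows from the Euler characteristic, V - E = 2 - 2g - r.

```python
def count_cycles(perm):
    """Number of cycles of a permutation given as a dict {element: image}."""
    seen = set()
    cycles = 0
    for start in perm:
        if start in seen:
            continue
        cycles += 1
        d = start
        while d not in seen:
            seen.add(d)
            d = perm[d]
    return cycles


def fatgraph_invariants(sigma, alpha):
    """Return (genus, boundary_components) of a connected orientable fatgraph.

    sigma: cyclic ordering of half-edges around each vertex.
    alpha: fixed-point-free involution pairing half-edges into edges.
    """
    vertices = count_cycles(sigma)
    edges = count_cycles(alpha)
    # Boundary components are the cycles of the composition sigma after alpha.
    phi = {d: sigma[alpha[d]] for d in sigma}
    boundaries = count_cycles(phi)
    euler = vertices - edges  # Euler characteristic of the surface
    genus = (2 - euler - boundaries) // 2
    return genus, boundaries


if __name__ == "__main__":
    # One vertex with four half-edges in cyclic order 0 -> 1 -> 2 -> 3 -> 0.
    sigma = {0: 1, 1: 2, 2: 3, 3: 0}
    # Interleaved loops: a torus with one boundary component.
    print(fatgraph_invariants(sigma, {0: 2, 2: 0, 1: 3, 3: 1}))
    # Nested loops: a sphere with three boundary components.
    print(fatgraph_invariants(sigma, {0: 1, 1: 0, 2: 3, 3: 2}))
```

The two toy cases share the same vertex and edge counts but differ in how the loops interleave, which is exactly the information the cyclic ordering adds over an ordinary graph.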

Analysis of geometrical and topological attitude for protein-protein interaction

2012

Protein-protein interaction usually takes place on an extended area of the external molecular surfaces that fit each other morphologically. Geometric and topological congruence (i.e. concavity and convexity correspondences) is required to support the neighboring interaction of surface patches belonging to the two protein molecules. It is therefore important to adopt representations and data structures that facilitate the analysis and implementation of techniques for evaluating geometric and topological properties on extended surfaces. These areas of activity are usually roughly “planar” but with local concavity and complexity that must match each other for the interaction to occur. To this end, we suggest a solution different from the one used for ligand-protein interaction, which involves a pocket and a small molecule. The solution suggested here is based on the concavity tree representation. Starting from the convex hull of the protein molecule a recursive process leads t...

A topological approach for analyzing the protein structure

Persistent homology is a new tool from algebraic topology that has so far had considerable success in biological applications, since it uses metrics only for measuring similarities. Embedding the geometric details while focusing on the global shape is the key to the success of persistent homology as an efficient topological data analysis tool. In this work we confirm this assumption (topology embeds geometry) by analyzing the structure of COILED SERINE, a protein estimated to constitute 3-5 percent of the encoded residues in most genomes, and by giving a substitute for the optimal characteristic distance that can be used in the flexibility-rigidity index, a classic method used to simulate molecular movements and flexible behavior, when it comes to atomic rigidity functions. We also analyze interesting patterns in the binding site of the beta sheet generated from the PDB file 2JOX. We will be detecting and giving a simple descripti...

Topological properties of the configurational space of proteins

Understanding the structure and dynamics of proteins is essential for understanding their function. This is a challenging task due to the high complexity of protein structure and the demanding calculations involved. In this work we conduct a topological analysis of the conformational space of several hormone peptides. We used MD simulations to obtain an extensive conformational sampling at low energy, and a topological software package, JavaPlex, to count the number of distinct energy minima. These local energy minima may correspond to stable and highly populated conformations. This information can help us determine the structural and hopefully functional properties of these peptides. Of special interest are proteins with very similar sequences which may exhibit different structural preferences. We focused on two pairs of peptides: Vasopressin and Oxytocin, and human and porcine Galanin. The peptides in each pair are very similar in sequence but, as we discovered, their energetic and structural preferences tend to be rather diverse. In the future we plan to work on larger proteins as well as to detect more robust topological properties.

A Topological Data Analysis of the Protein Structure

A Topological Data Analysis of the Protein Structure, 2023

Persistent homology is a tool from a set of methods called topological data analysis that has so far had considerable success in biological applications, since it uses metrics only for measuring similarities. Embedding the geometric details while focusing on the global shape is the key to the success of persistent homology as an efficient topological data analysis tool; this is investigated in the paper, given the enormous amount of work already done in the field and its seemingly endless results. In this work we confirm this assumption (topology embeds geometry) by displaying the structure of COILED SERINE, a protein estimated to constitute 3-5 percent of the encoded residues in most genomes, and by giving a substitute for the optimal characteristic distance that can be used in the flexibility-rigidity index, a classic method used to simulate molecular movements and flexible behavior, when it comes to atomic rigidity functions. We also analyze interesting patterns in the binding site of the beta sheet generated from the PDB file 2JOX. We detect and give a simple description of the different patterns generated using JavaPlex, producing barcodes and linear statistical results as summary statistics.
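For readers unfamiliar with the flexibility-rigidity index, the sketch below shows where the characteristic distance enters. It assumes the commonly used form in which an atom's rigidity index is a sum of distance-decaying kernel values over the other atoms, and its flexibility index is the reciprocal. The exponential kernel, unit weights, omission of the self-term, and the default `eta` and `kappa` values are all assumptions for illustration; the abstract does not state which kernel or parameters the paper uses.

```python
import math


def rigidity_indices(coords, eta=3.0, kappa=1.0):
    """Rigidity index per atom: sum over other atoms of exp(-(r / eta) ** kappa).

    coords: list of (x, y, z) tuples; eta: characteristic distance (assumed
    to be in the same units as the coordinates); kappa: kernel exponent.
    """
    indices = []
    for i, ri in enumerate(coords):
        mu = 0.0
        for j, rj in enumerate(coords):
            if i == j:
                continue  # self-term omitted in this sketch
            mu += math.exp(-((math.dist(ri, rj) / eta) ** kappa))
        indices.append(mu)
    return indices


def flexibility_indices(coords, eta=3.0, kappa=1.0):
    """Flexibility index per atom, taken as the reciprocal of the rigidity index.

    Requires at least two atoms so that every rigidity index is positive.
    """
    return [1.0 / mu for mu in rigidity_indices(coords, eta, kappa)]


if __name__ == "__main__":
    # Three atoms on a line: the middle one has the most close neighbours.
    atoms = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
    print(rigidity_indices(atoms))
    print(flexibility_indices(atoms))
```

A larger `eta` lets more distant atoms contribute to each rigidity index, which is why the choice of characteristic distance matters for the resulting flexibility profile.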

Do we see what we should see? Describing non-covalent interactions in protein structures including precision

IUCrJ, 2013

The power of X-ray crystal structure analysis as a technique is to 'see where the atoms are'. The results are extensively used by a wide variety of research communities. However, this 'seeing where the atoms are' can give a false sense of security unless the precision of the placement of the atoms has been taken into account. Indeed, the presentation of bond distances and angles to a false precision (i.e. to too many decimal places) is commonplace. This article has three themes. Firstly, a basis for a proper representation of protein crystal structure results is detailed and demonstrated with respect to analyses of Protein Data Bank entries. The basis for establishing the precision of placement of each atom in a protein crystal structure is non-trivial. Secondly, a knowledge base harnessing such a descriptor of precision is presented. It is applied here to the case of salt bridges, i.e. ion pairs, in protein structures; this is the most fundamental place to start with such structure-precision representations since salt bridges are one of the tenets of protein structure stability. Ion pairs also play a central role in protein oligomerization, molecular recognition of ligands and substrates, allosteric regulation, domain motion and α-helix capping. A new knowledge base, SBPS (Salt Bridges in Protein Structures), takes these structural precisions into account and is the first of its kind. The third theme of the article is to indicate natural extensions of the need for such a description of precision, such as those involving metalloproteins and the determination of the protonation states of ionizable amino acids. Overall, it is also noted that this work and these examples are also relevant to protein three-dimensional structure molecular graphics software.
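The point about false precision can be illustrated with a small sketch that attaches an uncertainty to an interatomic distance and prints only as many decimal places as that uncertainty supports. It assumes independent, isotropic positional errors on the two atoms, so that the distance uncertainty is the quadrature sum of the two; the function names and example values are hypothetical, and the abstract does not describe how per-atom precision is actually estimated in the article.

```python
import math


def distance_with_uncertainty(pos_a, pos_b, sigma_a, sigma_b):
    """Distance between two atoms and its standard uncertainty.

    Assumes independent isotropic positional errors, so the uncertainty
    on the distance is sqrt(sigma_a**2 + sigma_b**2).
    """
    distance = math.dist(pos_a, pos_b)
    sigma_d = math.hypot(sigma_a, sigma_b)
    return distance, sigma_d


def format_with_precision(value, sigma):
    """Format value and sigma to the decimal place of sigma's leading digit.

    Requires sigma > 0.
    """
    decimals = max(0, -math.floor(math.log10(sigma)))
    return f"{value:.{decimals}f} +/- {sigma:.{decimals}f}"


if __name__ == "__main__":
    # Hypothetical ion pair: coordinates and uncertainties in angstroms.
    d, s = distance_with_uncertainty((0.0, 0.0, 0.0), (3.0, 0.0, 0.0), 0.3, 0.4)
    print(format_with_precision(d, s))
```

Reporting such a distance to three decimal places would imply a precision that uncertainties of a few tenths of an angstrom do not support.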

A novel method for comparing topological models of protein structures enhanced with ligand information

Bioinformatics, 2008

We introduce TOPS+ strings, a highly abstract string-based model of protein topology that permits efficient computation of structure comparison, and can optionally represent ligand information. In this model, we consider loops as secondary structure elements (SSEs) as well as helices and strands; in addition we represent ligands as first class objects. Interactions between SSEs and between SSEs and ligands are described by incoming/outgoing arcs and ligand arcs, respectively; and SSEs are annotated with arc interaction direction and type. We are able to abstract away from the ligands themselves, to give a model characterized by a regular grammar rather than the context sensitive grammar of the original TOPS model. Our TOPS+ strings model is sufficiently descriptive to obtain biologically meaningful results and has the advantage of permitting fast string-based structure matching and comparison as well as avoiding issues of Non-deterministic Polynomial time (NP)-completeness associated with graph problems. Our structure comparison method is computationally more efficient in identifying distantly related proteins than BLAST, CLUSTALW, SSAP and TOPS because of the compact and abstract string-based representation of protein structure which records both topological and biochemical information including the functionally important loop regions of the protein structures. The accuracy of our comparison method is comparable with that of TOPS. Also, we have demonstrated that our TOPS+ strings method outperforms the TOPS method for the ligand-dependent protein structures and provides biologically meaningful results. Availability: The TOPS+ strings comparison server is available from