Rapid knot detection and application to protein structure prediction

Protein knot server: detection of knots in protein structures

Nucleic Acids Research, 2007

KNOTS (http://knots.mit.edu) is a web server that detects knots in protein structures. Several protein structures have been reported to contain intricate knots. The physiological role of knots and their effect on folding and evolution is an area of active research. The user submits a PDB id or uploads a 3D protein structure in PDB or mmCIF format. The current implementation of the server uses the Alexander polynomial to detect knots. The results of the analysis that are presented to the user are the location of the knot in the structure, the type of the knot and an interactive visualization of the knot. The results can also be downloaded and viewed offline. The server also maintains a regularly updated list of known knots in protein structures.

KnotProt: a database of proteins with knots and slipknots

Nucleic Acids Research, 2014

The protein topology database KnotProt, http://knotprot.cent.uw.edu.pl/, collects information about protein structures with open polypeptide chains forming knots or slipknots. The knotting complexity of the cataloged proteins is presented in the form of a matrix diagram that shows users the knot type of the entire polypeptide chain and of each of its subchains. The pattern visible in the matrix gives the knotting fingerprint of a given protein and permits users to determine, for example, the minimal length of the knotted regions (knot's core size) or the depth of a knot, i.e. how many amino acids can be removed from either end of the cataloged protein structure before converting it from a knot to a different type of knot. In addition, the database presents extensive information about the biological functions, families and fold types of proteins with non-trivial knotting. As an additional feature, the KnotProt database enables users to submit protein or polymer chains and generate their knotting fingerprints.
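The matrix described above can be sketched as a table keyed by subchain endpoints. The sketch below assumes a user-supplied `classify` callable that returns a knot label for an open subchain; that callable, the `"0_1"` label for the unknot, and the function names are all hypothetical and are not part of the KnotProt interface. The tail lengths are a simplified stand-in for the knot depth defined above.

```python
def knotting_fingerprint(chain, classify, min_len=2):
    """Map every subchain (i, j), inclusive indices, to its knot label."""
    n = len(chain)
    return {
        (i, j): classify(chain[i:j + 1])
        for i in range(n)
        for j in range(i + min_len - 1, n)
    }


def knot_core(fingerprint, unknot="0_1"):
    """Return (i, j) of the shortest knotted subchain, or None."""
    knotted = [
        (j - i + 1, i, j)
        for (i, j), kind in fingerprint.items()
        if kind != unknot
    ]
    if not knotted:
        return None
    _, i, j = min(knotted)  # shortest first, ties broken by start index
    return i, j


def knot_tails(core, chain_length):
    """Residues lying outside the core on the N- and C-terminal sides."""
    i, j = core
    return i, chain_length - 1 - j
```

The table needs on the order of n² classifications for a chain of n residues, so a practical implementation would probably simplify the chain or reuse intermediate results rather than classify every subchain from scratch.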

Identifying knots in proteins

Biochemical Society Transactions, 2013

Polypeptide chains form open knots in many proteins. How these knotted proteins fold and finding the evolutionary advantage provided by these knots are among some of the key questions currently being studied in the protein folding field. The detection and identification of protein knots are substantial challenges. Different methods and many variations of them have been employed, but they can give different results for the same protein. In the present article, we review the various knot identification algorithms and compare their relative strengths when applied to the study of knots in proteins. We show that the statistical approach based on the uniform closure method is advantageous in comparison with other methods used to characterize protein knots.
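The abstract names the uniform closure method without giving details, so the following is a sketch under stated assumptions: each closure joins both termini to a single point drawn uniformly from a large sphere around the chain, a user-supplied `classify_closed` callable labels the resulting closed polygon, and the most frequent label is reported. The callable, the default of 100 closures, and the `radius_factor` parameter are hypothetical choices made for illustration.

```python
import math
import random
from collections import Counter


def random_sphere_point(radius, rng):
    """Draw a point uniformly from the surface of a sphere at the origin."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    return (radius * r * math.cos(phi), radius * r * math.sin(phi), radius * z)


def uniform_closure(chain, classify_closed, n_closures=100,
                    radius_factor=100.0, seed=0):
    """Return (dominant knot label, Counter of labels over all closures)."""
    rng = random.Random(seed)
    centroid = tuple(sum(c) / len(chain) for c in zip(*chain))
    extent = max(math.dist(p, centroid) for p in chain)
    radius = radius_factor * max(extent, 1.0)  # sphere far outside the chain
    votes = Counter()
    for _ in range(n_closures):
        offset = random_sphere_point(radius, rng)
        cap = tuple(c + o for c, o in zip(centroid, offset))
        # Appending one far point closes the polygon: last residue -> cap
        # -> first residue.
        votes[classify_closed(list(chain) + [cap])] += 1
    return votes.most_common(1)[0][0], votes
```

Reporting the full vote count alongside the dominant label keeps the statistical character of the method visible: a chain whose closures split almost evenly between two knot types is a different case from one that gives the same type every time.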

Pokefind: a novel topological filter for use with protein structure prediction

Bioinformatics, 2009

Our focus has been on detecting topological properties that are rare in real proteins, but occur more frequently in models generated by protein structure prediction methods such as Rosetta. We previously created the Knotfind algorithm, successfully decreasing the frequency of knotted Rosetta models during CASP6. We observed an additional class of knot-like loops that appeared to be equally un-protein-like and yet do not contain a mathematical knot. These topological features are commonly referred to as slip-knots and are caused by the same mechanisms that result in knotted models. Slip-knots are undetectable by the original Knotfind algorithm. We have generalized our algorithm to detect them, and analyzed CASP6 models built using the Rosetta loop modeling method. After analyzing known protein structures in the PDB, we found that slip-knots do occur in certain proteins, but are rare and fall into a small number of specific classes. Our group used this new Pokefind algorithm to distinguish between these rare real slip-knots and the numerous classes of slip-knots that we discovered in Rosetta models and models submitted by the various CASP7 servers. The goal of this work is to improve future models created by protein structure prediction methods. Both algorithms are able to detect un-protein-like features that current metrics such as GDT are unable to identify, so these topological filters can also be used as additional assessment tools.
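The abstract does not say how Pokefind generalizes Knotfind, so the sketch below only illustrates the definition it relies on: a slip-knot is a chain that is unknotted as a whole but contains a knotted subchain. It assumes a user-supplied `is_knotted` predicate for open chains; that predicate and the names `topology_class` and `filter_models` are hypothetical and do not reproduce either algorithm.

```python
def topology_class(chain, is_knotted, min_len=4):
    """Label a chain as 'knot', 'slipknot' or 'unknotted'."""
    if is_knotted(chain):
        return "knot"
    n = len(chain)
    # Scan proper subchains from longest to shortest; any knotted one
    # inside an unknotted chain marks a slip-knot.
    for length in range(n - 1, min_len - 1, -1):
        for start in range(n - length + 1):
            if is_knotted(chain[start:start + length]):
                return "slipknot"
    return "unknotted"


def filter_models(models, is_knotted):
    """Keep only the models that show neither a knot nor a slip-knot."""
    return [m for m in models if topology_class(m, is_knotted) == "unknotted"]
```

Because a small number of real proteins do contain slip-knots, a filter of this kind would need an allow-list or a finer classification before it could be used to reject models outright.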

KnotProt 2.0: a database of proteins with knots and other entangled structures

Nucleic Acids Research

The KnotProt 2.0 database (the updated version of the KnotProt database) collects information about proteins which form knots and other entangled structures. New features in KnotProt 2.0 include the characterization of both probabilistic and deterministic entanglements which can be formed by disulfide bonds and interactions via ions, a refined characterization of entanglement in terms of knotoids, the identification of the so-called cysteine knots, the possibility to analyze all or a non-redundant set of proteins, and various technical updates. The KnotProt 2.0 database classifies all entangled proteins, represents their complexity in the form of a knotting fingerprint, and presents many biological and geometrical statistics based on these results. Currently the database contains >2000 entangled structures, and it regularly self-updates based on proteins deposited in the Protein Data Bank (PDB).

Proteins containing 6-crossing knot types and their folding pathways

bioRxiv (Cold Spring Harbor Laboratory), 2023

Studying complex protein knots can provide new insights into potential knot folding mechanisms and other fundamental aspects of why and how proteins knot. This paper presents results of a systematic analysis of the 3D structure of proteins with 6-crossing knots predicted by the artificial intelligence program AlphaFold 2. Furthermore, using a coarse-grained native-based model, we found that three representative proteins can self-tie to a 6₃ knot, the most complex knot found in a protein thus far. Because it is not a twist knot, the 6₃ knot cannot be folded via a simple mechanism involving the threading of a single loop. Based on successful trajectories for each protein, we determined that the 6₃ knot is formed after folding a significant part of the protein backbone to the native conformation. Moreover, we found that there are two distinct knotting mechanisms, which are described here. Also, building on a loop flipping theory developed earlier, we present two new theories of protein folding involving the creation and threading of two loops, and explain how our theories can describe the successful folding trajectories for each of the three representative 6₃-knotted proteins.

PackHelix: A tool for helix-sheet packing during protein structure prediction

Proteins: Structure, Function, and Bioinformatics, 2011

The three-dimensional structure of a protein is organized around the packing of its secondary structure elements. Predicting the topology and constructing the geometry of structural motifs involving α-helices and/or β-strands are therefore key steps for accurate prediction of protein structure. While many efforts have focused on how to pack helices and on how to sample exhaustively the topologies and geometries of multiple strands forming a β-sheet in a protein, there has been little progress on generating native-like packing of helices on sheets. We describe a method that can generate the packing of multiple helices on a given β-sheet for αβα sandwich type protein folds. This method mines the results of a statistical analysis of the conformations of αβ₂ motifs in protein structures to provide input values for the geometric attributes of the packing of a helix on a sheet. It then proceeds with a geometric builder that generates multiple arrangements of the helices on the sheet of interest by sampling through these values and performing consistency checks that guarantee proper loop geometry between the helices and the strands, minimal number of collisions between the helices, and proper formation of a hydrophobic core. The method is implemented as a module of ProteinShop. Our results show that it produces structures that are within 4-6 Å RMSD of the native one, regardless of the number of helices that need to be packed, though this number may increase if the protein has several helices between two consecutive strands in the sequence that pack on the sheet formed by these two strands.

New 6₃ knot and other knots in human proteome from AlphaFold predictions

2022

AlphaFold is a new, highly accurate machine learning protein structure prediction method that outperforms other methods. Recently this method was used to predict the structure of 98.5% of human proteins. We analyze here the structure of these AlphaFold-predicted human proteins for the presence of knots. We found that the human proteome contains 65 robustly knotted proteins, including the most complex type of a knot yet reported in proteins. That knot type, denoted 6₃ in mathematical notation, would necessitate a more complex folding path than any knotted proteins characterized to date. In some cases AlphaFold structure predictions are not highly accurate, which either makes their topology hard to verify or results in topological artifacts. We deposited the knotted and potentially knotted structures that we found, together with structures whose knots are artifacts, in a database available at: https://knotprot.cent.uw.edu.pl/alphafold.

Knot localization in proteins

Biochemical Society Transactions, 2013

The backbones of proteins form linear chains. In the case of some proteins, these chains can be characterized as forming linear open knots. The knot type of the full chain reveals only limited information about the entanglement of the chain since, for example, subchains of an unknotted protein can form knots and subchains of a knotted protein can form different types of knots than the entire protein. To understand fully the entanglement within the backbone of a given protein, a complete analysis of the knotting within all of the subchains of that protein is necessary. In the present article, we review efforts to characterize the full knotting complexity within individual proteins and present a matrix that conveys information about various aspects of protein knotting. For a given protein, this matrix identifies the precise localization of knotted regions and shows the knot types formed by all subchains. The pattern in the matrix can be considered as a knotting fingerprint of that pro...

Knotting pathways in proteins

Biochemical Society Transactions, 2013

Most proteins, in order to perform their biological function, have to fold to a compact native state. The increasing number of knotted and slipknotted proteins identified suggests that proteins are able to manoeuvre around topological barriers during folding. In the present article, we review the current progress in elucidating the knotting process in proteins. Although we concentrate on theoretical approaches, where a knotted topology can be unambiguously detected, comparison with experiments is also reviewed. Numerical simulations suggest that the folding process for small knotted proteins is composed of twisted loop formation and then threading by either slipknot geometries or flipping. As the size of the knotted proteins increases, particularly for more deeply threaded termini, the prevalence of traps in the free energy landscape also increases. Thus, in the case of longer knotted and slipknotted proteins, the folding mechanism is probably supported by chaperones. Overall, resul...