Liam McGuffin - Academia.edu

Papers by Liam McGuffin

Are the integrin binding motifs within SARS CoV-2 spike protein and MHC class II alleles playing the key role in COVID-19?

Frontiers in Immunology, Jul 10, 2023

Previous studies on the RGD motif (aa403-405) within the SARS CoV-2 spike (S) protein receptor binding domain (RBD) suggest that RGD motif-binding integrin(s) may play an important role in infection of the host cells. We also discuss the possible role of two other integrin binding motifs present in the S protein, LDI (aa585-587) and ECD (aa661-663), which are used by some other viruses in the course of infection. MultiFOLD models for protein structure analysis show that the ECD motif is clearly accessible in the S protein, whereas the RGD and LDI motifs are partially accessible. Furthermore, the amino acids in Epstein-Barr virus (EBV) protein gp42 that play a very important role in binding to the HLA-DRB1 molecule, and in the subsequent evasion of the immune response, are also present in the S protein heptad repeat-2. Our MultiFOLD model analyses show that these amino acids are clearly accessible on the surface of each S protein chain, both as monomers and in the homotrimer complex, and bind to the HLA-DRB1 β chain. Therefore, they may play the same role in SARS CoV-2 immune evasion as in EBV infection. Prediction of MHC class II binding peptides within the S protein shows that the RGD motif is present in the core 9-mer peptide IRGDEVRQI within two 15-mer peptides that bind strongly to HLA-DRB1*03:01 and HLA-DRB3*01:01, suggesting that the RGD motif may be a potential immune epitope. Accordingly, infected HLA-DRB1*03:01 or HLA-DRB3*01:01 positive individuals may develop high affinity anti-RGD motif antibodies that react with the RGD motif in host proteins such as fibrinogen, thrombin or von Willebrand factor, affecting haemostasis or participating in autoimmune disorders.

The binding site distance test score: a robust method for the assessment of predicted protein binding sites

Bioinformatics, Sep 22, 2010

Motivation: We propose a novel method for scoring the accuracy of protein binding site predictions, the Binding-site Distance Test (BDT) score. Recently, the Matthews Correlation Coefficient (MCC) has been used to evaluate binding site predictions, both by developers of new methods and by the assessors for the community-wide prediction experiment, CASP8. While being a rigorous scoring method, the MCC does not take into account the actual 3D location of the predicted residues relative to the observed binding site. Thus, an incorrectly predicted site that is nevertheless close to the observed binding site will obtain an identical score to the same number of non-binding residues predicted at random. The MCC is also somewhat affected by the subjectivity of determining observed binding residues and the ambiguity of choosing distance cutoffs. By contrast, the BDT method produces continuous scores ranging between 0 and 1, relating to the distance between the predicted and observed residues. Residues predicted close to the binding site will score higher than those more distant, providing a better reflection of the true accuracy of predictions. The CASP8 function predictions were evaluated using both the MCC and BDT methods and the scores were compared. The BDT was found to strongly correlate with the MCC scores while also being less susceptible to the subjectivity of defining binding residues. We therefore suggest that this new simple score is a potentially more robust method for future evaluations of protein-ligand binding site predictions.
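The limitation of the MCC described in this abstract can be illustrated with a small sketch. The code below computes the standard MCC for two hypothetical predictions that each miss one observed residue: one miss lies next to the site, the other far away. The residue numbers are invented for illustration, and `toy_distance_score` is only a stand-in that uses sequence separation instead of 3D distance; it is not the published BDT formula, and its `d0` parameter is an assumption.

```python
import math

def mcc(predicted, observed, all_residues):
    """Standard Matthews Correlation Coefficient over residue sets."""
    tp = len(predicted & observed)
    fp = len(predicted - observed)
    fn = len(observed - predicted)
    tn = len(all_residues - (predicted | observed))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def toy_distance_score(predicted, observed, d0=3.0):
    """Hypothetical distance-aware score in [0, 1] (NOT the published BDT).

    Each predicted residue scores 1 / (1 + (d / d0)^2), where d is its
    separation from the nearest observed residue; sequence separation is
    used here as a stand-in for a 3D distance.
    """
    scores = []
    for p in predicted:
        d = min(abs(p - o) for o in observed)
        scores.append(1.0 / (1.0 + (d / d0) ** 2))
    return sum(scores) / len(scores)

# Hypothetical 100-residue protein with an observed site at residues 10-13.
all_residues = set(range(1, 101))
observed = {10, 11, 12, 13}
near_miss = {10, 11, 12, 14}  # wrong residue is adjacent to the site
far_miss = {10, 11, 12, 90}   # wrong residue is far from the site

mcc_near = mcc(near_miss, observed, all_residues)
mcc_far = mcc(far_miss, observed, all_residues)
toy_near = toy_distance_score(near_miss, observed)
toy_far = toy_distance_score(far_miss, observed)

print(mcc_near, mcc_far)  # the two MCC values are identical
print(toy_near, toy_far)  # the near miss scores higher
```

Because the MCC only counts true and false positives and negatives, both predictions receive the same value, whereas any score that weights residues by distance separates them.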

ModFOLD6: an accurate web server for the global and local quality estimation of 3D protein models

Nucleic Acids Research, Apr 29, 2017

Methods that reliably estimate the likely similarity between the predicted and native structures of proteins have become essential for driving the acceptance and adoption of three-dimensional protein models by life scientists. ModFOLD6 is the latest version of our leading resource for Estimates of Model Accuracy (EMA), which uses a pioneering hybrid quasi-single model approach. The ModFOLD6 server integrates scores from three pure-single model methods and three quasi-single model methods using a neural network to estimate local quality scores. Additionally, the server provides three options for producing global score estimates, depending on the requirements of the user: (i) ModFOLD6 rank, which is optimized for ranking/selection, (ii) ModFOLD6 cor, which is optimized for correlations of predicted and observed scores and (iii) ModFOLD6 global for balanced performance. The ModFOLD6 methods rank among the top few for EMA, according to independent blind testing by the CASP12 assessors. The ModFOLD6 server is also continuously automatically evaluated as part of the CAMEO project, where significant performance gains have been observed compared to our previous server and other publicly available servers. The ModFOLD6 server is freely available at: http://www.reading.ac.uk/bioinf/ModFOLD/.

Rapid model quality assessment for protein structure predictions using the comparison of multiple models without structural alignments

Bioinformatics, Nov 6, 2009

Motivation: The accurate prediction of the quality of 3D models is a key component of successful protein tertiary structure prediction methods. Currently, clustering- or consensus-based Model Quality Assessment Programs (MQAPs) are the most accurate methods for predicting 3D model quality; however, they are often CPU intensive as they carry out multiple structural alignments in order to compare numerous models. In this study, we describe ModFOLDclustQ, a novel MQAP that compares 3D models of proteins without the need for CPU intensive structural alignments by utilizing the Q measure for model comparisons. The ModFOLDclustQ method is benchmarked against the top established methods in terms of both accuracy and speed. In addition, the ModFOLDclustQ scores are combined with those from our older ModFOLDclust method to form a new method, ModFOLDclust2, that aims to provide increased prediction accuracy with negligible computational overhead. Results: The ModFOLDclustQ method is competitive with leading clustering-based MQAPs for the prediction of global model quality, yet it is up to 150 times faster than the previous version of the ModFOLDclust method at comparing models of small proteins (<60 residues) and over five times faster at comparing models of large proteins (>800 residues). Furthermore, a significant improvement in accuracy can be gained over the previous clustering-based MQAPs by combining the scores from ModFOLDclustQ and ModFOLDclust to form the new ModFOLDclust2 method, with little impact on the overall time taken for each prediction.
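The idea of comparing two models without a structural alignment can be sketched as follows. The function compares the internal residue-residue distances of two models of the same sequence, so no superposition is required. The Gaussian form, the fixed `sigma` and the normalisation are assumptions made for this illustration and are not taken from the paper; the coordinates are invented.

```python
import math
from itertools import combinations

def q_style_similarity(coords_a, coords_b, sigma=1.0):
    """Alignment-free similarity in (0, 1] between two models.

    coords_a and coords_b are equal-length lists of (x, y, z) positions,
    one per residue, in the same residue order. The score averages a
    Gaussian of the difference between corresponding pairwise distances.
    The functional form and sigma are illustrative assumptions.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("models must have the same number of residues")
    pairs = list(combinations(range(len(coords_a)), 2))
    if not pairs:
        raise ValueError("at least two residues are required")
    total = 0.0
    for i, j in pairs:
        d_a = math.dist(coords_a[i], coords_a[j])
        d_b = math.dist(coords_b[i], coords_b[j])
        total += math.exp(-((d_a - d_b) ** 2) / (2.0 * sigma ** 2))
    return total / len(pairs)

# Hypothetical four-residue model.
model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]
# The same model rotated by 90 degrees about z and translated.
moved = [(-y + 10.0, x - 5.0, z + 2.0) for x, y, z in model_a]
# A model with one displaced residue.
distorted = model_a[:3] + [(0.0, 9.0, 0.0)]

same_score = q_style_similarity(model_a, moved)
distorted_score = q_style_similarity(model_a, distorted)
print(same_score, distorted_score)
```

Internal distances are unchanged by rotation and translation, so the moved copy scores essentially 1 without any superposition step, while the distorted model scores lower.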

The ModFOLD server for the quality assessment of protein structural models

Bioinformatics, Jan 9, 2008

The reliable assessment of the quality of protein structural models is fundamental to the progress of structural bioinformatics. The ModFOLD server provides access to two accurate techniques for the global and local prediction of the quality of 3D models of proteins. The first is ModFOLD, a fast Model Quality Assessment Program (MQAP) used for the global assessment of either single or multiple models. The second is ModFOLDclust, a more intensive method that carries out clustering of multiple models and provides per-residue local quality assessment.

Insertion and Deletion Events, Their Molecular Mechanisms, and Their Impact on Sequence Alignments

University of California Press eBooks, Nov 3, 2009

Using Local Protein Model Quality Estimates to Guide a Molecular Dynamics-Based Refinement Strategy

MAP4K4 expression in cardiomyocytes: multiple isoforms, multiple phosphorylations and interactions with striatins

Biochemical Journal, Jun 11, 2021

The Ser/Thr kinase MAP4K4, like other GCKIV kinases, has N-terminal kinase and C-terminal citron homology (CNH) domains. MAP4K4 can activate c-Jun N-terminal kinases (JNKs), and studies in the heart suggest it links oxidative stress to JNKs and heart failure. In other systems, MAP4K4 is regulated in striatin-interacting phosphatase and kinase (STRIPAK) complexes, in which one of three striatins tethers PP2A adjacent to a kinase to keep it dephosphorylated and inactive. Our aim was to understand how MAP4K4 is regulated in cardiomyocytes. The rat MAP4K4 gene was not properly defined. We identified the first coding exon of the rat gene using 5′-RACE, cloned the full-length sequence and confirmed alternative splicing of MAP4K4 in rat cardiomyocytes. We identified an additional α-helix C-terminal to the kinase domain that is important for kinase activity. In further studies, FLAG-MAP4K4 was expressed in HEK293 cells or cardiomyocytes. The Ser/Thr protein phosphatase inhibitor calyculin A (CalA) induced MAP4K4 hyperphosphorylation, with phosphorylation of the activation loop and extensive phosphorylation of the linker between the kinase and CNH domains. This required kinase activity. MAP4K4 associated with myosin in untreated cardiomyocytes, and this was lost with CalA treatment. FLAG-MAP4K4 associated with all three striatins in cardiomyocytes, indicative of regulation within STRIPAK complexes and consistent with activation by CalA. Computational analysis suggested the interaction was direct and mediated via coiled-coil domains. Surprisingly, FLAG-MAP4K4 inhibited JNK activation by H2O2 in cardiomyocytes and increased myofibrillar organisation. Our data identify MAP4K4 as a STRIPAK-regulated kinase in cardiomyocytes, and suggest it regulates the cytoskeleton rather than activating JNKs.

In silico Identification and Characterization of Protein-Ligand Binding Sites

Springer eBooks, 2016

Protein-ligand binding site prediction methods aim to predict, from amino acid sequence, protein-ligand interactions, putative ligands and ligand binding site residues using either sequence information, structural information or a combination of both. In silico characterisation of protein-ligand interactions has become extremely important to help determine a protein's functionality, as in vivo based functional elucidation is unable to keep pace with the current growth of sequence databases. Additionally, in vitro biochemical functional elucidation is time consuming, costly and may not be feasible for large scale analysis, such as drug discovery. Thus, in silico prediction of protein-ligand interactions needs to be utilized to aid in functional elucidation. Here we briefly discuss protein function prediction, prediction of protein-ligand interactions, the Critical Assessment of Techniques for Protein Structure Prediction (CASP) and the Continuous Automated Model EvaluatiOn (CAMEO) competitions, along with their role in shaping the field. We also discuss, in detail, our cutting-edge web-server method FunFOLD for the structurally informed prediction of protein-ligand interactions. Furthermore, we provide a step-by-step guide on using the FunFOLD webserver and FunFOLD3 downloadable application, along with some real world examples, where the FunFOLD methods have been used to aid functional elucidation.

Automated tertiary structure prediction with accurate local model quality assessment using the IntFOLD-TS method

Proteins, 2011

The IntFOLD-TS method was developed according to the guiding principle that model quality assessment (QA) would be the most critical stage of our template-based modeling pipeline. Thus, the IntFOLD-TS method first generates numerous alternative models, using in-house versions of several different sequence-structure alignment methods, which are then ranked in terms of global quality using our top performing QA method, ModFOLDclust2. In addition to the predicted global quality scores, the predictions of local errors are also provided in the resulting coordinate files, using scores that represent the predicted deviation of each residue in the model from the equivalent residue in the native structure. The IntFOLD-TS method was found to generate high quality 3D models for many of the CASP9 targets, whilst also providing highly accurate predictions of their per-residue errors. This important information may help to make the 3D models that are produced by the IntFOLD-TS method more useful for guiding future experimental work.

Prediction of global and local model quality in CASP8 using the ModFOLD server

Proteins, 2009

The ability to rank and select the best model is important in protein structure prediction. Model Quality Assessment Programs (MQAPs) are programs developed to perform this task. They can be divided into three categories based on the information they use. Consensus-based methods use the similarity to other models, structure-based methods use features calculated from the structure and evolutionary-based methods use the sequence similarity between a model and a template. These methods can be trained to predict the overall global quality of a model, that is, how much a model is likely to differ from the native structure. The methods can also be trained to pinpoint which local regions in a model are likely to be incorrect. In CASP7, we participated with three predictors of global and four of local quality using information from the three categories described above. The results show that the MQAP using consensus, Pcons, was significantly better at predicting both global and local quality compared with MQAPs using only structure- or sequence-based information.

Quality Estimates for 3D Protein Models

Zinc binding to Muscle LIM Protein regulates its structure and interaction with binding partners

The FASEB Journal, Apr 1, 2019

Proteins and Their Interacting Partners: An Introduction to Protein–Ligand Binding Site Prediction Methods with a Focus on

Methods in molecular biology, 2021

Proteins are essential molecules with a diverse range of functions; elucidating their biological and biochemical characteristics can be difficult and time consuming using in vitro and/or in vivo methods. Additionally, in vivo protein-ligand binding site elucidation is unable to keep pace with current growth in sequencing, leaving the majority of new protein sequences without known functions. Therefore, the development of new methods, which aim to predict the protein-ligand interactions and ligand-binding site residues directly from amino acid sequences, is becoming increasingly important. In silico prediction can utilise either sequence information, structural information or a combination of both. In this chapter, we will discuss the broad range of methods for ligand-binding site prediction from protein structure and we will describe our method, FunFOLD3, for the prediction of protein-ligand interactions and ligand-binding sites based on template-based modelling. Additionally, we will provide step-by-step instructions for using the FunFOLD3 downloadable application, along with examples from the Critical Assessment of Techniques for Protein Structure Prediction (CASP) where FunFOLD3 has been used to aid ligand and ligand-binding site prediction. Finally, we will introduce our newer method, FunFOLD3-D, a version of FunFOLD3 which aims to improve template-based protein-ligand binding site prediction through the integration of docking, using AutoDock Vina.

Aligning Sequences to Structures

Humana Press eBooks, 2008

Most newly sequenced proteins are likely to adopt a similar structure to one which has already been experimentally determined. For this reason, the most successful approaches to protein structure prediction have been template-based methods. Such prediction methods attempt to identify and model the folds of unknown structures by aligning the target sequences to a set of representative template structures within a fold library. In this chapter, I discuss the development of template-based approaches to fold prediction, from the traditional techniques to the recent state-of-the-art methods. I also discuss the recent development of structural annotation databases, which contain models built by aligning the sequences from entire proteomes against known structures. Finally, I run through a practical step-by-step guide for aligning target sequences to known structures and contemplate the future direction of template-based structure prediction.

Model Quality Prediction

John Wiley & Sons, Inc. eBooks, Sep 7, 2010

ModFOLD8: accurate global and local quality estimates for 3D protein models

Nucleic Acids Research, May 8, 2021

Methods for estimating the quality of 3D models of proteins are vital tools for driving the acceptance and utility of predicted tertiary structures by the wider bioscience community. Here we describe the major updates to ModFOLD, which has maintained its position as a leading server for the prediction of global and local quality of 3D protein models over the past decade (>20 000 unique external users). ModFOLD8 is the latest version of the server, which combines the strengths of multiple pure-single and quasi-single model methods. Improvements have been made to the web server interface and there have been successive increases in prediction accuracy, which were achieved through integration of newly developed scoring methods and advanced deep learning-based residue contact predictions. Each version of the ModFOLD server has been independently blind tested in the biennial CASP experiments, as well as being continuously evaluated via the CAMEO project. In CASP13 and CASP14, the ModFOLD7 and ModFOLD8 variants ranked among the top 10 quality estimation methods according to almost every official analysis. Prior to CASP14, ModFOLD8 was also applied for the evaluation of SARS-CoV-2 protein models as part of the CASP Commons 2020 initiative. The ModFOLD8 server is freely available at: https://www.reading.ac.uk/bioinf/ModFOLD/.

Proteins and Their Interacting Partners: An Introduction to Protein–Ligand Binding Site Prediction Methods

International Journal of Molecular Sciences, Dec 15, 2015

Elucidating the biological and biochemical roles of proteins, and subsequently determining their interacting partners, can be difficult and time consuming using in vitro and/or in vivo methods, and consequently the majority of newly sequenced proteins will have unknown structures and functions. However, in silico methods for predicting protein-ligand binding sites and protein biochemical functions offer an alternative practical solution. The characterisation of protein-ligand binding sites is essential for investigating new functional roles, which can impact the major biological research spheres of health, food, and energy security. In this review we discuss the role in silico methods play in 3D modelling of protein-ligand binding sites, along with their role in predicting biochemical functionality. In addition, we describe in detail some of the key alternative in silico prediction approaches that are available, as well as discussing the Critical Assessment of Techniques for Protein Structure Prediction (CASP) and the Continuous Automated Model EvaluatiOn (CAMEO) projects, and their impact on developments in the field. Furthermore, we discuss the importance of protein function prediction methods for tackling 21st century problems.

PINOT: An Intuitive Resource for Integrating Protein-Protein Interactions

The past decade has seen the rise of omics data for the understanding of biological systems in health and disease. This wealth of data includes protein-protein interaction (PPI) data derived from both low- and high-throughput assays, which are curated into multiple databases that capture the extent of available information from the peer-reviewed literature. Although these curation efforts are extremely useful, reliably downloading and integrating PPI data from the variety of available repositories is challenging and time consuming. Here we present a novel user-friendly web-resource called PINOT (Protein Interaction Network Online Tool; available at http://www.reading.ac.uk/bioinf/PINOT/PINOT_form.html) to optimise the collection and processing of PPI data from the IMEx consortium associated repositories (members and observers) and from WormBase for constructing, respectively, human and C. elegans PPI networks. Users submit a query containing a list of proteins of interest for which PINOT will mine PPIs. PPI data are downloaded, merged, quality checked, and confidence scored based on the number of distinct methods and publications in which each interaction has been reported. Examples of PINOT applications are provided to highlight the performance, the ease of use and the potential applications of this tool.
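The merge-and-score step described in this abstract can be sketched in a few lines. The snippet merges interaction records that refer to the same protein pair and counts the distinct detection methods and publications for each pair. The record layout, the placeholder protein and publication names, and the choice of summing the two counts into one score are all assumptions made for illustration; the abstract does not specify how PINOT combines them.

```python
from collections import defaultdict

def score_interactions(records):
    """Merge PPI records and count distinct methods and publications.

    records: iterable of (protein_a, protein_b, method, publication_id).
    The pair is treated as unordered, so (A, B) and (B, A) are merged.
    The combined score (methods + publications) is an illustrative
    assumption, not the scoring rule used by PINOT.
    """
    merged = defaultdict(lambda: {"methods": set(), "publications": set()})
    for protein_a, protein_b, method, publication_id in records:
        pair = tuple(sorted((protein_a, protein_b)))
        merged[pair]["methods"].add(method)
        merged[pair]["publications"].add(publication_id)
    scored = {}
    for pair, evidence in merged.items():
        n_methods = len(evidence["methods"])
        n_publications = len(evidence["publications"])
        scored[pair] = {
            "methods": n_methods,
            "publications": n_publications,
            "score": n_methods + n_publications,
        }
    return scored

# Hypothetical records with placeholder names.
records = [
    ("A", "B", "two hybrid", "pub1"),
    ("B", "A", "pull down", "pub2"),
    ("A", "B", "two hybrid", "pub2"),
    ("A", "C", "two hybrid", "pub3"),
]
scored = score_interactions(records)
for pair, summary in sorted(scored.items()):
    print(pair, summary)
```

Sorting each pair before using it as a key means an interaction reported in either orientation contributes to the same entry, so repeated reports raise the counts only when they add a new method or publication.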

Evolutionary rewiring of bacterial regulatory networks

Microbial Cell, Jul 6, 2015

Research paper thumbnail of Are the integrin binding motifs within SARS CoV-2 spike protein and MHC class II alleles playing the key role in COVID-19?

Frontiers in Immunology, Jul 10, 2023

The previous studies on the RGD motif (aa403-405) within the SARS CoV-2 spike (S) protein recepto... more The previous studies on the RGD motif (aa403-405) within the SARS CoV-2 spike (S) protein receptor binding domain (RBD) suggest that the RGD motif binding integrin(s) may play an important role in infection of the host cells. We also discussed the possible role of two other integrin binding motifs that are present in S protein: LDI (aa585-587) and ECD (661-663), the motifs used by some other viruses in the course of infection. The MultiFOLD models for protein structure analysis have shown that the ECD motif is clearly accessible in the S protein, whereas the RGD and LDI motifs are partially accessible. Furthermore, the amino acids that are present in Epstein-Barr virus protein (EBV) gp42 playing very important role in binding to the HLA-DRB1 molecule and in the subsequent immune response evasion, are also present in the S protein heptad repeat-2. Our MultiFOLD model analyses have shown that these amino acids are clearly accessible on the surface in each S protein chain as monomers and in the homotrimer complex and bind to HLA-DRB1 b chain. Therefore, they may have the identical role in SARS CoV-2 immune evasion as in EBV infection. The prediction analyses of the MHC class II binding peptides within the S protein have shown that the RGD motif is present in the core 9-mer peptide IRGDEVRQI within the two HLA-DRB1*03:01 and HLA-DRB3*01.01 strong binding 15-mer peptides suggesting that RGD motif may be the potential immune epitope. Accordingly, infected HLA-DRB1*03:01 or HLA-DRB3*01.01 positive individuals may develop high affinity anti-RGD motif antibodies that react with the RGD motif in the host proteins, like fibrinogen, thrombin or von Willebrand factor, affecting haemostasis or participating in autoimmune disorders.

Research paper thumbnail of The binding site distance test score: a robust method for the assessment of predicted protein binding sites

Bioinformatics, Sep 22, 2010

Motivation: We propose a novel method for scoring the accuracy of protein binding site prediction... more Motivation: We propose a novel method for scoring the accuracy of protein binding site predictions-the Binding-site Distance Test (BDT) score. Recently, the Matthews Correlation Coefficient (MCC) has been used to evaluate binding site predictions, both by developers of new methods and by the assessors for the community-wide prediction experiment-CASP8. While being a rigorous scoring method, the MCC does not take into account the actual 3D location of the predicted residues from the observed binding site. Thus, an incorrectly predicted site that is nevertheless close to the observed binding site will obtain an identical score to the same number of non-binding residues predicted at random. The MCC is somewhat affected by the subjectivity of determining observed binding residues and the ambiguity of choosing distance cutoffs. By contrast the BDT method produces continuous scores ranging between 0 and 1, relating to the distance between the predicted and observed residues. Residues predicted close to the binding site will score higher than those more distant, providing a better reflection of the true accuracy of predictions. The CASP8 function predictions were evaluated using both the MCC and BDT methods and the scores were compared. The BDT was found to strongly correlate with the MCC scores while also being less susceptible to the subjectivity of defining binding residues. We therefore suggest that this new simple score is a potentially more robust method for future evaluations of protein-ligand binding site predictions.

Research paper thumbnail of ModFOLD6: an accurate web server for the global and local quality estimation of 3D protein models

Nucleic Acids Research, Apr 29, 2017

Methods that reliably estimate the likely similarity between the predicted and native structures ... more Methods that reliably estimate the likely similarity between the predicted and native structures of proteins have become essential for driving the acceptance and adoption of three-dimensional protein models by life scientists. ModFOLD6 is the latest version of our leading resource for Estimates of Model Accuracy (EMA), which uses a pioneering hybrid quasisingle model approach. The ModFOLD6 server integrates scores from three pure-single model methods and three quasi-single model methods using a neural network to estimate local quality scores. Additionally, the server provides three options for producing global score estimates, depending on the requirements of the user: (i) ModFOLD6 rank, which is optimized for ranking/selection, (ii) ModFOLD6 cor, which is optimized for correlations of predicted and observed scores and (iii) ModFOLD6 global for balanced performance. The ModFOLD6 methods rank among the top few for EMA, according to independent blind testing by the CASP12 assessors. The ModFOLD6 server is also continuously automatically evaluated as part of the CAMEO project, where significant performance gains have been observed compared to our previous server and other publicly available servers. The ModFOLD6 server is freely available at: http://www.reading.ac.uk/bioinf/ModFOLD/.

Research paper thumbnail of Rapid model quality assessment for protein structure predictions using the comparison of multiple models without structural alignments

Bioinformatics, Nov 6, 2009

Motivation: The accurate prediction of the quality of 3D models is a key component of successful ... more Motivation: The accurate prediction of the quality of 3D models is a key component of successful protein tertiary structure prediction methods. Currently, clustering-or consensus-based Model Quality Assessment Programs (MQAPs) are the most accurate methods for predicting 3D model quality; however, they are often CPU intensive as they carry out multiple structural alignments in order to compare numerous models. In this study, we describe ModFOLDclustQ-a novel MQAP that compares 3D models of proteins without the need for CPU intensive structural alignments by utilizing the Q measure for model comparisons. The ModFOLDclustQ method is benchmarked against the top established methods in terms of both accuracy and speed. In addition, the ModFOLDclustQ scores are combined with those from our older ModFOLDclust method to form a new method, ModFOLDclust2, that aims to provide increased prediction accuracy with negligible computational overhead. Results: The ModFOLDclustQ method is competitive with leading clustering-based MQAPs for the prediction of global model quality, yet it is up to 150 times faster than the previous version of the ModFOLDclust method at comparing models of small proteins (<60 residues) and over five times faster at comparing models of large proteins (>800 residues). Furthermore, a significant improvement in accuracy can be gained over the previous clustering-based MQAPs by combining the scores from ModFOLDclustQ and ModFOLDclust to form the new ModFOLDclust2 method, with little impact on the overall time taken for each prediction.

Research paper thumbnail of The ModFOLD server for the quality assessment of protein structural models

Bioinformatics, Jan 9, 2008

The reliable assessment of the quality of protein structural models is fundamental to the progres... more The reliable assessment of the quality of protein structural models is fundamental to the progress of structural bioinformatics. The ModFOLD server provides access to two accurate techniques for the global and local prediction of the quality of 3D models of proteins. Firstly ModFOLD, which is a fast Model Quality Assessment Program (MQAP) used for the global assessment of either single or multiple models. Secondly ModFOLDclust, which is a more intensive method that carries out clustering of multiple models and provides per-residue local quality assessment.

Research paper thumbnail of Insertion and Deletion Events, Their Molecular Mechanisms, and Their Impact on Sequence Alignments

University of California Press eBooks, Nov 3, 2009

Research paper thumbnail of Using Local Protein Model Quality Estimates to Guide a Molecular Dynamics-Based Refinement Strategy

Research paper thumbnail of MAP4K4 expression in cardiomyocytes: multiple isoforms, multiple phosphorylations and interactions with striatins

Biochemical Journal, Jun 11, 2021

The Ser/Thr kinase MAP4K4, like other GCKIV kinases, has N-terminal kinase and C-terminal citron ... more The Ser/Thr kinase MAP4K4, like other GCKIV kinases, has N-terminal kinase and C-terminal citron homology (CNH) domains. MAP4K4 can activate c-Jun N-terminal kinases (JNKs), and studies in the heart suggest it links oxidative stress to JNKs and heart failure. In other systems, MAP4K4 is regulated in striatin-interacting phosphatase and kinase (STRIPAK) complexes, in which one of three striatins tethers PP2A adjacent to a kinase to keep it dephosphorylated and inactive. Our aim was to understand how MAP4K4 is regulated in cardiomyocytes. The rat MAP4K4 gene was not properly defined. We identified the first coding exon of the rat gene using 5 0-RACE, we cloned the fulllength sequence and confirmed alternative-splicing of MAP4K4 in rat cardiomyocytes. We identified an additional α-helix C-terminal to the kinase domain important for kinase activity. In further studies, FLAG-MAP4K4 was expressed in HEK293 cells or cardiomyocytes. The Ser/Thr protein phosphatase inhibitor calyculin A (CalA) induced MAP4K4 hyperphosphorylation, with phosphorylation of the activation loop and extensive phosphorylation of the linker between the kinase and CNH domains. This required kinase activity. MAP4K4 associated with myosin in untreated cardiomyocytes, and this was lost with CalA-treatment. FLAG-MAP4K4 associated with all three striatins in cardiomyocytes, indicative of regulation within STRIPAK complexes and consistent with activation by CalA. Computational analysis suggested the interaction was direct and mediated via coiled-coil domains. Surprisingly, FLAG-MAP4K4 inhibited JNK activation by H 2 O 2 in cardiomyocytes and increased myofibrillar organisation. Our data identify MAP4K4 as a STRIPAK-regulated kinase in cardiomyocytes, and suggest it regulates the cytoskeleton rather than activates JNKs.

In silico Identification and Characterization of Protein-Ligand Binding Sites

Springer eBooks, 2016

Protein-ligand binding site prediction methods aim to predict, from amino acid sequence, protein-ligand interactions, putative ligands and ligand binding site residues using either sequence information, structural information or a combination of both. In silico characterisation of protein-ligand interactions has become extremely important to help determine a protein's functionality, as in vivo based functional elucidation is unable to keep pace with the current growth of sequence databases. Additionally, in vitro biochemical functional elucidation is time consuming, costly and may not be feasible for large scale analysis, such as drug discovery. Thus, in silico prediction of protein-ligand interactions needs to be utilized to aid in functional elucidation. Here we briefly discuss protein function prediction, prediction of protein-ligand interactions, the Critical Assessment of Techniques for Protein Structure Prediction (CASP) and the Continuous Automated Model EvaluatiOn (CAMEO) competitions, along with their role in shaping the field. We also discuss, in detail, our cutting-edge web-server method FunFOLD for the structurally informed prediction of protein-ligand interactions. Furthermore, we provide a step-by-step guide on using the FunFOLD webserver and FunFOLD3 downloadable application, along with some real world examples, where the FunFOLD methods have been used to aid functional elucidation.

Automated tertiary structure prediction with accurate local model quality assessment using the IntFOLD-TS method

Proteins, 2011

The IntFOLD‐TS method was developed according to the guiding principle that the model quality assessment (QA) would be the most critical stage for our template‐based modeling pipeline. Thus, the IntFOLD‐TS method firstly generates numerous alternate models, using in‐house versions of several different sequence‐structure alignment methods, which are then ranked in terms of global quality using our top performing QA method—ModFOLDclust2. In addition to the predicted global quality scores, the predictions of local errors are also provided in the resulting coordinate files, using scores that represent the predicted deviation of each residue in the model from the equivalent residue in the native structure. The IntFOLD‐TS method was found to generate high quality 3D models for many of the CASP9 targets, whilst also providing highly accurate predictions of their per‐residue errors. This important information may help to make the 3D models that are produced by the IntFOLD‐TS method more useful for guiding future experimental work. Proteins 2011; © 2011 Wiley‐Liss, Inc.

Prediction of global and local model quality in CASP8 using the ModFOLD server

Proteins, 2009

The ability to rank and select the best model is important in protein structure prediction. Model Quality Assessment Programs (MQAPs) are programs developed to perform this task. They can be divided into three categories based on the information they use. Consensus based methods use the similarity to other models, structure-based methods use features calculated from the structure and evolutionary based methods use the sequence similarity between a model and a template. These methods can be trained to predict the overall global quality of a model, that is, how much a model is likely to differ from the native structure. The methods can also be trained to pinpoint which local regions in a model are likely to be incorrect. In CASP7, we participated with three predictors of global and four of local quality using information from the three categories described above. The result shows that the MQAP using consensus, Pcons, was significantly better at predicting both global and local quality compared with MQAPs using only structure or sequence based information.

Quality Estimates for 3D Protein Models

Zinc binding to Muscle LIM Protein regulates its structure and interaction with binding partners

The FASEB Journal, Apr 1, 2019

Proteins and Their Interacting Partners: An Introduction to Protein–Ligand Binding Site Prediction Methods with a Focus on

Methods in molecular biology, 2021

Proteins are essential molecules with a diverse range of functions; elucidating their biological and biochemical characteristics can be difficult and time consuming using in vitro and/or in vivo methods. Additionally, in vivo protein-ligand binding site elucidation is unable to keep pace with current growth in sequencing, leaving the majority of new protein sequences without known functions. Therefore, the development of new methods, which aim to predict the protein-ligand interactions and ligand-binding site residues directly from amino acid sequences, is becoming increasingly important. In silico prediction can utilise either sequence information, structural information or a combination of both. In this chapter, we will discuss the broad range of methods for ligand-binding site prediction from protein structure and we will describe our method, FunFOLD3, for the prediction of protein-ligand interactions and ligand-binding sites based on template-based modelling. Additionally, we will provide step-by-step instructions for using the FunFOLD3 downloadable application along with examples from the Critical Assessment of Techniques for Protein Structure Prediction (CASP) where FunFOLD3 has been used to aid ligand and ligand-binding site prediction. Finally, we will introduce our newer method, FunFOLD3-D, a version of FunFOLD3 which aims to improve template-based protein-ligand binding site prediction through the integration of docking, using AutoDock Vina.

Aligning Sequences to Structures

Humana Press eBooks, 2008

Most newly sequenced proteins are likely to adopt a similar structure to one which has already been experimentally determined. For this reason, the most successful approaches to protein structure prediction have been template-based methods. Such prediction methods attempt to identify and model the folds of unknown structures by aligning the target sequences to a set of representative template structures within a fold library. In this chapter, I discuss the development of template-based approaches to fold prediction, from the traditional techniques to the recent state-of-the-art methods. I also discuss the recent development of structural annotation databases, which contain models built by aligning the sequences from entire proteomes against known structures. Finally, I run through a practical step-by-step guide for aligning target sequences to known structures and contemplate the future direction of template-based structure prediction.

Model Quality Prediction

John Wiley & Sons, Inc. eBooks, Sep 7, 2010

ModFOLD8: accurate global and local quality estimates for 3D protein models

Nucleic Acids Research, May 8, 2021

Methods for estimating the quality of 3D models of proteins are vital tools for driving the acceptance and utility of predicted tertiary structures by the wider bioscience community. Here we describe the major updates to ModFOLD, which has maintained its position as a leading server for the prediction of global and local quality of 3D protein models over the past decade (>20 000 unique external users). ModFOLD8 is the latest version of the server, which combines the strengths of multiple pure-single and quasi-single model methods. Improvements have been made to the web server interface and there have been successive increases in prediction accuracy, which were achieved through integration of newly developed scoring methods and advanced deep learning-based residue contact predictions. Each version of the ModFOLD server has been independently blind tested in the biennial CASP experiments, as well as being continuously evaluated via the CAMEO project. In CASP13 and CASP14, the ModFOLD7 and ModFOLD8 variants ranked among the top 10 quality estimation methods according to almost every official analysis. Prior to CASP14, ModFOLD8 was also applied for the evaluation of SARS-CoV-2 protein models as part of the CASP Commons 2020 initiative. The ModFOLD8 server is freely available at: https://www.reading.ac.uk/bioinf/ModFOLD/.

Proteins and Their Interacting Partners: An Introduction to Protein–Ligand Binding Site Prediction Methods

International Journal of Molecular Sciences, Dec 15, 2015

Elucidating the biological and biochemical roles of proteins, and subsequently determining their interacting partners, can be difficult and time consuming using in vitro and/or in vivo methods, and consequently the majority of newly sequenced proteins will have unknown structures and functions. However, in silico methods for predicting protein-ligand binding sites and protein biochemical functions offer an alternative practical solution. The characterisation of protein-ligand binding sites is essential for investigating new functional roles, which can impact the major biological research spheres of health, food, and energy security. In this review we discuss the role in silico methods play in 3D modelling of protein-ligand binding sites, along with their role in predicting biochemical functionality. In addition, we describe in detail some of the key alternative in silico prediction approaches that are available, as well as discussing the Critical Assessment of Techniques for Protein Structure Prediction (CASP) and the Continuous Automated Model EvaluatiOn (CAMEO) projects, and their impact on developments in the field. Furthermore, we discuss the importance of protein function prediction methods for tackling 21st century problems.

PINOT: An Intuitive Resource for Integrating Protein-Protein Interactions

The past decade has seen the rise of omics data for the understanding of biological systems in health and disease. This wealth of data includes protein-protein interaction (PPI) data derived from both low and high-throughput assays, which is curated into multiple databases that capture the extent of available information from the peer-reviewed literature. Although these curation efforts are extremely useful, reliably downloading and integrating PPI data from the variety of available repositories is challenging and time consuming. Here we present a novel user-friendly web-resource called PINOT (Protein Interaction Network Online Tool; available at http://www.reading.ac.uk/bioinf/PINOT/PINOT_form.html) to optimise the collection and processing of PPI data from the IMEx consortium associated repositories (members and observers) and from WormBase for constructing, respectively, human and C. elegans PPI networks. Users submit a query containing a list of proteins of interest for which PINOT will mine PPIs. PPI data is downloaded, merged, quality checked, and confidence scored based on the number of distinct methods and publications in which each interaction has been reported. Examples of PINOT applications are provided to highlight the performance, the ease of use and the potential applications of this tool.
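
The scoring idea in the abstract above (counting the distinct methods and publications that report each interaction) can be illustrated with a minimal sketch. This is not the PINOT implementation: the record layout, the function name `score_interactions`, the placeholder protein and publication identifiers, and the choice to sum the two counts into a single score are all assumptions made for illustration.

```python
from collections import defaultdict


def score_interactions(records):
    """Count distinct methods and publications per interaction.

    `records` is an iterable of (protein_a, protein_b, method, publication)
    tuples. This layout is hypothetical, not PINOT's actual data format.
    """
    methods = defaultdict(set)
    publications = defaultdict(set)
    for protein_a, protein_b, method, publication in records:
        # Treat A-B and B-A as the same undirected interaction.
        pair = tuple(sorted((protein_a, protein_b)))
        methods[pair].add(method)
        publications[pair].add(publication)

    scores = {}
    for pair in methods:
        n_methods = len(methods[pair])
        n_publications = len(publications[pair])
        scores[pair] = {
            "methods": n_methods,
            "publications": n_publications,
            # Assumption: a simple sum; the real weighting may differ.
            "score": n_methods + n_publications,
        }
    return scores


# Placeholder records, not real interaction data.
example_records = [
    ("protA", "protB", "two hybrid", "pub1"),
    ("protB", "protA", "pull down", "pub2"),
    ("protA", "protB", "pull down", "pub2"),  # duplicate evidence, counted once
    ("protA", "protC", "two hybrid", "pub3"),
]
example_scores = score_interactions(example_records)
```

Merging on an order-independent pair key means the same interaction reported in either direction by different repositories collapses into one entry before counting, which is the property any such integration step would need.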

Evolutionary rewiring of bacterial regulatory networks

Microbial Cell, Jul 6, 2015