Joke Reumers - Academia.edu
Papers by Joke Reumers
An important gap in the relation between artificial evolutionary systems and their biological counterpart is the inability of artificial models to construct functional hierarchical structures in an emergent way. For composite structures to emerge and prosper in a biological hierarchical system, some form of selection is required at each level. In this paper we examine an abstract model of immune networks in terms of the selection dynamics present at the individual level and at the network level. In this context, a property of the individuals ...
This paper introduces an algorithm that investigates whether the effect of an intervention is identifiable from a multi-agent causal model. A multi-agent causal model consists of a collection of agents each having access to a nondisjoint subset of the variables constituting the domain. Every agent has a causal model, determined by nonexperimental data and an acyclic causal diagram over its variables. Since in some cases nonexperimental data can be explained by more than one causal model, the effect of an intervention cannot necessarily be calculated. The algorithm under investigation in this paper tests whether the assumptions made in a causal model are sufficient to calculate the effect of an intervention (i.e. whether the effect of an intervention is identifiable). It is a distributed algorithm with a minimum amount of inter-agent communication concerning solely shared variables and where the local causal models of each agent are kept confidential.
... The name ARCHeMEDES was invented by the second author, in the spirit of the great inventor Archimedes, the "e" for e-learning and "arch" for architecture. 3.2. Issues in architectural design courses 1 Integrative education aims at making students aware of their learning ...
Human Mutation, 2009
Functional requirements shaped proteins into globular structures. Under these structural constraints, which require both regular secondary structure and a hydrophobic core, protein aggregation is an unavoidable corollary to protein structure. However, as aggregation results in reduced fitness, natural selection will tend to eliminate strongly aggregating sequences. The analysis of distribution and variation of aggregation patterns in the human proteome using the TANGO algorithm confirms the findings of a previous study on several proteomes: the flanks of aggregation-prone regions are enriched with charged residues and proline, the so-called gatekeeper residues. Moreover, in this study, we observed a widespread redundancy in gatekeeper usage. Interestingly, aggregating regions from key proteins such as p53 or huntingtin are among the most extensive “gatekept” sequences. As a consequence, mutations that remove gatekeepers could result in a strong increase in disease susceptibility. In a set of disease-associated mutations from the UniProt database, we find a strong enrichment of mutations that disrupt gatekeeper motifs. Closer inspection of a number of case studies indicates clearly that removing gatekeepers may play a determining role in widely varying disorders, such as van der Woude syndrome (VWS), X-linked Fabry disease (FD), and limb-girdle muscular dystrophy. Hum Mutat 0, 1–7, 2009. © 2009 Wiley-Liss, Inc.
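The gatekeeper idea in the abstract above can be illustrated with a minimal sketch. It is not the TANGO algorithm and not the authors' analysis: the function name, the flank width of three residues, and the choice of D, E, K and R as the charged residues are all assumptions made for illustration, and the aggregation-prone region is supplied by hand rather than predicted.

```python
# Hypothetical illustration: count gatekeeper residues flanking a region.
# Assumption: "charged residues and proline" is taken to mean D, E, K, R, P.
GATEKEEPERS = set("DEKRP")


def count_flank_gatekeepers(sequence, start, end, flank=3):
    """Count gatekeeper residues in the flanks of sequence[start:end].

    `start` and `end` are 0-based, end-exclusive indices of a region that
    some external predictor has marked as aggregation-prone. `flank` is the
    number of residues inspected on each side (an assumed value).
    """
    left = sequence[max(0, start - flank):start]
    right = sequence[end:end + flank]
    return sum(1 for residue in left + right if residue in GATEKEEPERS)


# Toy sequence: the hydrophobic stretch VVVIVV sits at positions 3..8.
toy_sequence = "AKRVVVIVVPEA"
flank_count = count_flank_gatekeepers(toy_sequence, 3, 9)
```

In this toy case the left flank `AKR` contributes K and R and the right flank `PEA` contributes P and E. A mutation that replaced one of those four residues with a hydrophobic one would lower the count, which is the kind of gatekeeper disruption the abstract describes.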
Nucleic Acids Research, 2005
Single nucleotide polymorphisms (SNPs) are an increasingly important tool for genetic and biomedical research. However, the accumulated sequence information on allelic variation is not matched by an understanding of the effect of SNPs on the functional attributes or 'molecular phenotype' of a protein.
Human Mutation, 2008
In one genetic study, the high temperature requirement A2 (HTRA2) mitochondrial protein has been associated with increased risk for sporadic Parkinson disease (PD). One missense mutation, p.Gly399Ser, in its C-terminal PDZ domain (from the initial letters of the postsynaptic density 95, PSD-95; discs large; and zonula occludens-1, ZO-1 proteins [Kennedy, 1995]) resulted in defective protease activation, and induced mitochondrial dysfunction when overexpressed in stably transfected cells. Here we examined the contribution of genetic variability in HTRA2 to PD risk in an extended series of 266 Belgian PD patients and 273 control individuals. Mutation analysis identified a novel p.Arg404Trp mutation within the PDZ domain predicted to freeze HTRA2 in an inactive form. Moreover, we identified six patient-specific variants in 5′ and 3′ regulatory regions that might affect HTRA2 expression as supported by data of luciferase reporter gene analyses. Our study confirms a role of the HTRA2 mitochondrial protein in PD susceptibility through mutations in its functional PDZ domain. In addition, it extends the HTRA2 mutation spectrum to functional variants possibly affecting transcriptional activity. The latter underpins a previously unrecognized role for altered HTRA2 expression as a risk factor relevant to parkinsonian neurodegeneration. Hum Mutat 29(6), 832–840, 2008. © 2008 Wiley-Liss, Inc.
Nature Chemical Biology, 2011
A growing number of diseases are associated with inappropriate depositions of protein aggregates, especially neurological disorders and systemic amyloidoses [1]. During malignancy, proteins are usually uncontrollably overexpressed or structurally affected because of genetic mutations, resulting in changes in activity and protein-protein interactions in cancer cells [2]. It remains, however, largely unexplored whether aggregation of tumor suppressors and/or oncogenes could contribute to the induction or progression of malignancy.
PLOS Computational Biology, 2008
As modeling of changes in backbone conformation still lacks a computationally efficient solution, we developed a discretisation of the conformational states accessible to the protein backbone similar to the successful rotamer approach in side chains. The BriX fragment database, consisting of fragments from 4 to 14 residues long, was realized through identification of recurrent backbone fragments from a non-redundant set of high-resolution protein structures. BriX contains an alphabet of more than 1,000 frequently observed conformations per peptide length for 6 different variation levels. Analysis of the performance of BriX revealed an average structural coverage of protein structures of more than 99% within a root mean square distance (RMSD) of 1 Angstrom. Globally, we are able to reconstruct protein structures with an average accuracy of 0.48 Angstrom RMSD. As expected, regular structures are well covered, but, interestingly, many loop regions that appear irregular at first glance are also found to form a recurrent structural motif, albeit with lower frequency of occurrence than regular secondary structures. Larger loop regions could be completely reconstructed from smaller recurrent elements, between 4 and 8 residues long. Finally, we observed that a significant number of short sequences tend to display strong structural ambiguity between alpha helix and extended conformations. When the sequence length increases, this so-called sequence plasticity is no longer observed, illustrating the context dependency of polypeptide structures. Citation: Baeten L, Reumers J, Tur V, Stricher F, Lenaerts T, et al. (2008) Reconstruction of Protein Backbones from the BriX Collection of Canonical Protein Fragments. PLoS Comput Biol 4(5): e1000083.
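The RMSD figures quoted in the abstract above measure how far two sets of matched backbone atoms lie from each other. As a minimal sketch of that quantity only, the function below computes the root mean square distance between two equal-length coordinate lists. It assumes the two fragments are already superimposed; the optimal superposition step that structure comparison normally involves is deliberately left out, and the function name and toy coordinates are hypothetical rather than taken from BriX.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square distance between two matched lists of (x, y, z) points.

    Assumes the structures are already superimposed; no rotation or
    translation is fitted here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy example: the second fragment is the first one shifted by 1 Angstrom in x.
fragment = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
shifted = [(x + 1.0, y, z) for (x, y, z) in fragment]
shift_rmsd = rmsd(fragment, shifted)
```

Because every atom in the toy example is displaced by exactly 1 Angstrom, the RMSD is 1 Angstrom, which corresponds to the coverage threshold mentioned in the abstract.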
Nature Methods, 2010
Nature Methods, Vol. 7, No. 3, March 2010, p. 237
Nucleic Acids Research, 2007
Single nucleotide polymorphisms (SNPs) are, together with copy number variation, the primary source of variation in the human genome. SNPs are associated with altered response to drug treatment, susceptibility to disease and other phenotypic variation. Furthermore, during genetic screens for disease-associated mutations in groups of patients and control individuals, the distinction between disease-causing mutation and polymorphism is often unclear. Annotation of the functional and structural implications of single nucleotide changes thus provides valuable information to interpret and guide experiments. The SNPeffect and PupaSuite databases are now synchronized to deliver annotations for both non-coding and coding SNPs, as well as annotations for the SwissProt set of human disease mutations. In addition, SNPeffect now contains predictions of Tango2: an improved aggregation detector, and Waltz: a novel predictor of amyloid-forming sequences, as well as improved predictors for regions that are recognized by the Hsp70 family of chaperones. The new PupaSuite version incorporates predictions for SNPs in silencers and miRNAs including their targets, as well as additional methods for predicting SNPs in TFBSs and splice sites. Predictions for mouse and rat genomes have also been added. In addition, a PupaSuite web service has been developed to enable programmatic data access. The combined database holds annotations for 4 965 073 regulatory as well as 133 505 coding human SNPs and 14 935 disease mutations, and phenotypic descriptions of 43 797 human proteins and is accessible via http://snpeffect.vib.be and
Bioinformatics/computer Applications in The Biosciences, 2006
Single nucleotide polymorphisms (SNPs) constitute the most fundamental type of genetic variation in human populations. About 75 000 of these reported variations cause an amino acid change in the translated protein. An important goal in genomic research is to understand how this variability affects protein function, and whether or not particular SNPs are associated with disease susceptibility. Accordingly, the SNPeffect database uses sequence- and structure-based bioinformatics tools to predict the effect of nonsynonymous SNPs on the molecular phenotype of proteins. SNPeffect analyses the effect of SNPs on three categories of functional properties: (1) structural and thermodynamic properties affecting protein dynamics and stability, (2) the integrity of functional and binding sites, and (3) changes in posttranslational processing and cellular localization of proteins. The search interface of the database can be used to search specifically for polymorphisms that are predicted to cause a change in one of these properties. Now based on the Ensembl human databases, the SNPeffect database has been remodeled to better fit an automatically updatable structure. The current edition holds the molecular phenotype of 74 567 nsSNPs in 23 426 proteins. Availability: SNPeffect can be accessed through http://snpeffect.vib.be Supplementary Material: Statistics on the contents of the database, figures on the workflow used to create the database and information on the used sources and tools is available at http://
Nucleic Acids Research, 2010
Although protein-peptide interactions are estimated to constitute up to 40% of all protein interactions, relatively little information is available for the structural details of these interactions. Peptide-mediated interactions are a prime target for drug design because they are predominantly present in signaling and regulatory networks. A reliable data set of nonredundant protein-peptide complexes is indispensable as a basis for modeling and design, but current data sets for protein-peptide interactions are often biased towards specific types of interactions or are limited to interactions with small ligands. In PepX (http://pepx.switchlab.org), we have designed an unbiased and exhaustive data set of all protein-peptide complexes available in the Protein Data Bank with peptide lengths up to 35 residues. In addition, these complexes have been clustered based on their binding interfaces rather than sequence homology, providing a set of structurally diverse protein-peptide interactions. The final data set contains 505 unique protein-peptide interface clusters from 1431 complexes. Thorough annotation of each complex with both biological and structural information facilitates searching for and browsing through individual complexes and clusters. Moreover, we provide an additional source of data for peptide design by annotating peptides with naturally occurring backbone variations using fragment clusters from the BriX database.
The Open Biology Journal, 2010
The folding of polypeptides into stable globular protein structures requires protein sequences with a relatively high hydrophobicity and secondary structure propensity. These biophysical properties, however, also favor protein aggregation via the formation of intermolecular beta-sheets and, as a result, globular structure and aggregation are inextricable properties of protein polypeptides. Aggregates that are enriched in beta-sheet structures have been found in diseased tissues in association with at least twenty different human disorders, and the effects of aggregation on protein function include simple loss-of-function but also often a gain of toxicity. Given both the ubiquity and the potentially lethal consequences of protein aggregation, negative selective pressure strongly minimizes aggregation. Various evolutionary strategies keep aggregation in check, including (1) the optimisation of the thermodynamic stability of the protein, which precludes aggregation by burial of the aggregation-prone regions in solvent-inaccessible regions of the structure, (2) segregation between folding nuclei and aggregation nuclei within a protein sequence, (3) the placement of so-called gatekeeper residues at the flanks of aggregating segments, which reduce the aggregation rate of (partially) unfolded proteins, and (4) molecular chaperones that target aggregation-nucleating sequences directly, thereby further suppressing aggregation in a cellular environment. In this review we describe the intrinsic features built into protein sequence and structure that protect against aggregation.
Nucleic Acids Research, 2006
We have developed a web tool, PupaSuite, for the selection of single nucleotide polymorphisms (SNPs) with potential phenotypic effect, specifically oriented to help in the design of large-scale genotyping projects. PupaSuite uses a collection of data on SNPs from heterogeneous sources and a large number of pre-calculated predictions to offer a flexible and intuitive interface for selecting an optimal set of SNPs. It improves the functionality of the PupaSNP and PupasView programs and implements new facilities such as the analysis of users' data to derive haplotypes with functional information. A new estimator of the putative effect of polymorphisms that uses evolutionary information has been included. SNPeffect database predictions have also been included. The PupaSuite web interface is accessible through http://pupasuite.bioinfo.cipf.es and through
BMC Bioinformatics, 2009
Background: Linking structural effects of mutations to functional outcomes is a major issue in structural bioinformatics, and many tools and studies have shown that specific structural properties such as stability and residue burial can be used to distinguish neutral variations and disease associated mutations.
PLOS Computational Biology, 2011
We previously showed the existence of selective pressure against protein aggregation by the enrichment of aggregation-opposing 'gatekeeper' residues at strategic places along the sequence of proteins. Here we analyzed the relationship between protein lifetime and protein aggregation by combining experimentally determined turnover rates, expression data, structural data and chaperone interaction data on a set of more than 500 proteins. We find that selective pressure on protein sequences against aggregation is not homogeneous but that short-living proteins on average have a higher aggregation propensity and fewer chaperone interactions than long-living proteins. We also find that short-living proteins are more often associated with deposition diseases. These findings suggest that the efficient degradation of high-turnover proteins is sufficient to preclude aggregation, but also that factors that inhibit proteasomal activity, such as physiological ageing, will primarily affect the aggregation of short-living proteins.
Nucleic Acids Research, 2008
Single nucleotide polymorphisms (SNPs) are, together with copy number variation, the primary source of variation in the human genome. SNPs are associated with altered response to drug treatment, susceptibility to disease and other phenotypic variation. Furthermore, during genetic screens for disease-associated mutations in groups of patients and control individuals, the distinction between disease-causing mutation and polymorphism is often unclear. Annotation of the functional and structural implications of single nucleotide changes thus provides valuable information to interpret and guide experiments. The SNPeffect and PupaSuite databases are now synchronized to deliver annotations for both non-coding and coding SNPs, as well as annotations for the SwissProt set of human disease mutations. In addition, SNPeffect now contains predictions of Tango2: an improved aggregation detector, and Waltz: a novel predictor of amyloid-forming sequences, as well as improved predictors for regions that are recognized by the Hsp70 family of chaperones. The new PupaSuite version incorporates predictions for SNPs in silencers and miRNAs including their targets, as well as additional methods for predicting SNPs in TFBSs and splice sites. Predictions for mouse and rat genomes have also been added. In addition, a PupaSuite web service has been developed to enable programmatic data access. The combined database holds annotations for 4 965 073 regulatory as well as 133 505 coding human SNPs and 14 935 disease mutations, and phenotypic descriptions of 43 797 human proteins and is accessible via http://snpeffect.vib.be and
An important gap in the relation between artificial evolutionary systems and their biological cou... more An important gap in the relation between artificial evolutionary systems and their biological counterpart is the inability of artificial models to construct functional hierarchical structures in an emergent way. For composite structures to emerge and prosper in a biological hierarchical system, some form of selection is required at each level. In this paper we examine an abstract model of immune networks in terms of the selection dynamics present at the individual level and at the network level. In this context, a property of the individuals ...
This paper introduces an algorithm that investigates whether the effect of an intervention is ide... more This paper introduces an algorithm that investigates whether the effect of an intervention is identifiable from a multi-agent causal model. A multi-agent causal model consists of a collection of agents each having access to a nondisjoint subset of the variables constituting the domain. Every agent has a causal model, determined by nonexperimental data and an acyclic causal diagram over its variables. Since in some cases nonexperimental data can be explained by more than one causal model, the effect of an intervention can not necessarily be calculated. The algorithm under investigation in this paper tests whether the assumptions made in a causal model are sufficient to calculate the effect of an intervention (i.e. whether the effect of an intervention is identifiable). It is a distributed algorithm with a minimum amount of inter-agent communication concerning solely shared variables and where the local causal models of each agent are kept confidential.
... The name ARCHeMEDES was invented by the second author, in the spirit of the great inventor Ar... more ... The name ARCHeMEDES was invented by the second author, in the spirit of the great inventor Archimedes, the "e" for e-learning and "arch" for architecture. 3.2. Issues in architectural design courses 1 Integrative education aims at making students aware of their learning ...
... The name ARCHeMEDES was invented by the second author, in the spirit of the great inventor Ar... more ... The name ARCHeMEDES was invented by the second author, in the spirit of the great inventor Archimedes, the "e" for e-learning and "arch" for architecture. 3.2. Issues in architectural design courses 1 Integrative education aims at making students aware of their learning ...
Human Mutation, 2009
Functional requirements shaped proteins into globular structures. Under these structural constrai... more Functional requirements shaped proteins into globular structures. Under these structural constraints, which require both regular secondary structure and a hydrophobic core, protein aggregation is an unavoidable corollary to protein structure. However, as aggregation results in reduced fitness, natural selection will tend to eliminate strongly aggregating sequences. The analysis of distribution and variation of aggregation patterns in the human proteome using the TANGO algorithm confirms the findings of a previous study on several proteomes: the flanks of aggregation-prone regions are enriched with charged residues and proline, the so-called gatekeeper-residues. Moreover, in this study, we observed a widespread redundancy in gatekeeper usage. Interestingly, aggregating regions from key proteins such as p53 or huntingtin are among the most extensive “gatekept” sequences. As a consequence, mutations that remove gatekeepers could therefore result in a strong increase in disease-susceptibility. In a set of disease-associated mutations from the UniProt database, we find a strong enrichment of mutations that disrupt gatekeeper motifs. Closer inspection of a number of case studies indicates clearly that removing gatekeepers may play a determining role in widely varying disorders, such as van der Woude syndrome (VWS), X-linked Fabry disease (FD), and limb-girdle muscular dystrophy. Hum Mutat 0, 1–7, 2009. © 2009 Wiley-Liss, Inc.
Nucleic Acids Research, 2005
Single nucleotide polymorphisms (SNPs) are an increasingly important tool for genetic and biomedi... more Single nucleotide polymorphisms (SNPs) are an increasingly important tool for genetic and biomedical research. However, the accumulated sequence information on allelic variation is not matched by an understanding of the effect of SNPs on the functional attributes or 'molecular phenotype' of a protein.
Human Mutation, 2008
In one genetic study, the high temperature requirement A2 (HTRA2) mitochondrial protein has been ... more In one genetic study, the high temperature requirement A2 (HTRA2) mitochondrial protein has been associated with increased risk for sporadic Parkinson disease (PD). One missense mutation, p.Gly399Ser, in its C-terminal PDZ domain (from the initial letters of the postsynaptic density 95, PSD-95; discs large; and zonula occludens-1, ZO-1 proteins [Kennedy, 1995]) resulted in defective protease activation, and induced mitochondrial dysfunction when overexpressed in stably transfected cells. Here we examined the contribution of genetic variability in HTRA2 to PD risk in an extended series of 266 Belgian PD patients and 273 control individuals. Mutation analysis identified a novel p.Arg404Trp mutation within the PDZ domain predicted to freeze HTRA2 in an inactive form. Moreover, we identified six patient-specific variants in 5′ and 3′ regulatory regions that might affect HTRA2 expression as supported by data of luciferase reporter gene analyses. Our study confirms a role of the HTRA2 mitochondrial protein in PD susceptibility through mutations in its functional PDZ domain. In addition, it extends the HTRA2 mutation spectrum to functional variants possibly affecting transcriptional activity. The latter underpins a previously unrecognized role for altered HTRA2 expression as a risk factor relevant to parkinsonian neurodegeneration. Hum Mutat 29(6), 832–840, 2008. © 2008 Wiley-Liss, Inc.
Nature Chemical Biology, 2011
A growing number of diseases are associated with inappropriate depositions of protein aggregates,... more A growing number of diseases are associated with inappropriate depositions of protein aggregates, especially neurological disorders and systemic amyloidoses 1 . During malignancy, proteins are usually uncontrollably overexpressed or structurally affected because of genetic mutations, resulting in changes in activity and protein-protein interactions in cancer cells 2 . It remains, however, largely unexplored whether aggregation of tumor suppressors and/or oncogenes could contribute to the induction or progression of malignancy.
PLOS Computational Biology, 2008
As modeling of changes in backbone conformation still lacks a computationally efficient solution,... more As modeling of changes in backbone conformation still lacks a computationally efficient solution, we developed a discretisation of the conformational states accessible to the protein backbone similar to the successful rotamer approach in side chains. The BriX fragment database, consisting of fragments from 4 to 14 residues long, was realized through identification of recurrent backbone fragments from a non-redundant set of high-resolution protein structures. BriX contains an alphabet of more than 1,000 frequently observed conformations per peptide length for 6 different variation levels. Analysis of the performance of BriX revealed an average structural coverage of protein structures of more than 99% within a root mean square distance (RMSD) of 1 Angstrom. Globally, we are able to reconstruct protein structures with an average accuracy of 0.48 Angstrom RMSD. As expected, regular structures are well covered, but, interestingly, many loop regions that appear irregular at first glance are also found to form a recurrent structural motif, albeit with lower frequency of occurrence than regular secondary structures. Larger loop regions could be completely reconstructed from smaller recurrent elements, between 4 and 8 residues long. Finally, we observed that a significant amount of short sequences tend to display strong structural ambiguity between alpha helix and extended conformations. When the sequence length increases, this so-called sequence plasticity is no longer observed, illustrating the context dependency of polypeptide structures. Citation: Baeten L, Reumers J, Tur V, Stricher F, Lenaerts T, et al. (2008) Reconstruction of Protein Backbones from the BriX Collection of Canonical Protein Fragments. PLoS Comput Biol 4(5): e1000083.
Nature Methods, 2010
Articles nAture methods | VOL.7 NO.3 | MARCH 2010 | 237
PLOS Computational Biology, 2008
As modeling of changes in backbone conformation still lacks a computationally efficient solution,... more As modeling of changes in backbone conformation still lacks a computationally efficient solution, we developed a discretisation of the conformational states accessible to the protein backbone similar to the successful rotamer approach in side chains. The BriX fragment database, consisting of fragments from 4 to 14 residues long, was realized through identification of recurrent backbone fragments from a non-redundant set of high-resolution protein structures. BriX contains an alphabet of more than 1,000 frequently observed conformations per peptide length for 6 different variation levels. Analysis of the performance of BriX revealed an average structural coverage of protein structures of more than 99% within a root mean square distance (RMSD) of 1 Angstrom. Globally, we are able to reconstruct protein structures with an average accuracy of 0.48 Angstrom RMSD. As expected, regular structures are well covered, but, interestingly, many loop regions that appear irregular at first glance are also found to form a recurrent structural motif, albeit with lower frequency of occurrence than regular secondary structures. Larger loop regions could be completely reconstructed from smaller recurrent elements, between 4 and 8 residues long. Finally, we observed that a significant amount of short sequences tend to display strong structural ambiguity between alpha helix and extended conformations. When the sequence length increases, this so-called sequence plasticity is no longer observed, illustrating the context dependency of polypeptide structures. Citation: Baeten L, Reumers J, Tur V, Stricher F, Lenaerts T, et al. (2008) Reconstruction of Protein Backbones from the BriX Collection of Canonical Protein Fragments. PLoS Comput Biol 4(5): e1000083.
Nucleic Acids Research, 2007
Single nucleotide polymorphisms (SNPs) are, together with copy number variation, the primary source of variation in the human genome. SNPs are associated with altered response to drug treatment, susceptibility to disease and other phenotypic variation. Furthermore, during genetic screens for disease-associated mutations in groups of patients and control individuals, the distinction between disease-causing mutation and polymorphism is often unclear. Annotation of the functional and structural implications of single nucleotide changes thus provides valuable information to interpret and guide experiments. The SNPeffect and PupaSuite databases are now synchronized to deliver annotations for both non-coding and coding SNPs, as well as annotations for the SwissProt set of human disease mutations. In addition, SNPeffect now contains predictions of Tango2, an improved aggregation detector, and Waltz, a novel predictor of amyloid-forming sequences, as well as improved predictors for regions that are recognized by the Hsp70 family of chaperones. The new PupaSuite version incorporates predictions for SNPs in silencers and miRNAs including their targets, as well as additional methods for predicting SNPs in TFBSs and splice sites. Predictions for mouse and rat genomes have also been added. In addition, a PupaSuite web service has been developed to enable programmatic data access. The combined database holds annotations for 4 965 073 regulatory as well as 133 505 coding human SNPs and 14 935 disease mutations, and phenotypic descriptions of 43 797 human proteins and is accessible via http://snpeffect.vib.be and
Bioinformatics/computer Applications in The Biosciences, 2006
Single nucleotide polymorphisms (SNPs) constitute the most fundamental type of genetic variation in human populations. About 75 000 of these reported variations cause an amino acid change in the translated protein. An important goal in genomic research is to understand how this variability affects protein function, and whether or not particular SNPs are associated with disease susceptibility. Accordingly, the SNPeffect database uses sequence- and structure-based bioinformatics tools to predict the effect of nonsynonymous SNPs on the molecular phenotype of proteins. SNPeffect analyses the effect of SNPs on three categories of functional properties: (1) structural and thermodynamic properties affecting protein dynamics and stability, (2) the integrity of functional and binding sites, and (3) changes in posttranslational processing and cellular localization of proteins. The search interface of the database can be used to search specifically for polymorphisms that are predicted to cause a change in one of these properties. Now based on the Ensembl human databases, the SNPeffect database has been remodeled to better fit an automatically updatable structure. The current edition holds the molecular phenotype of 74 567 nsSNPs in 23 426 proteins. Availability: SNPeffect can be accessed through http://snpeffect.vib.be Supplementary Material: Statistics on the contents of the database, figures on the workflow used to create the database and information on the sources and tools used are available at http://
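The distinction the abstract above draws between SNPs in general and nonsynonymous SNPs (those that change the encoded amino acid) can be illustrated with the standard genetic code. This is a hedged sketch for illustration only, not part of SNPeffect: the function name `classify_snp` and its arguments are hypothetical, and it assumes the codon is given on the coding strand in DNA letters.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_snp(codon, position, alt_base):
    """Classify a single-base change within one codon.

    codon:    reference codon on the coding strand, e.g. "GAG"
    position: 0-based index of the changed base within the codon
    alt_base: the variant base, one of T, C, A, G

    Returns (reference amino acid, variant amino acid, label), where
    "*" denotes a stop codon.
    """
    codon = codon.upper()
    alt_base = alt_base.upper()
    if codon not in CODON_TABLE or alt_base not in BASES:
        raise ValueError("expected a DNA codon and a DNA base")
    if not 0 <= position <= 2:
        raise ValueError("position must be 0, 1 or 2")
    variant = codon[:position] + alt_base + codon[position + 1:]
    ref_aa = CODON_TABLE[codon]
    alt_aa = CODON_TABLE[variant]
    if ref_aa == alt_aa:
        label = "synonymous"
    elif alt_aa == "*":
        label = "nonsense"
    else:
        label = "nonsynonymous"
    return ref_aa, alt_aa, label


print(classify_snp("GAG", 1, "T"))  # Glu -> Val
print(classify_snp("GAA", 2, "G"))  # Glu -> Glu
print(classify_snp("TAT", 2, "A"))  # Tyr -> stop
```

Only changes labelled nonsynonymous (or nonsense) alter the protein sequence, which is the class of variants whose molecular phenotype the database described above sets out to predict.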
Nucleic Acids Research, 2010
Although protein-peptide interactions are estimated to constitute up to 40% of all protein interactions, relatively little information is available on the structural details of these interactions. Peptide-mediated interactions are a prime target for drug design because they are predominantly present in signaling and regulatory networks. A reliable data set of nonredundant protein-peptide complexes is indispensable as a basis for modeling and design, but current data sets for protein-peptide interactions are often biased towards specific types of interactions or are limited to interactions with small ligands. In PepX (http://pepx.switchlab.org), we have designed an unbiased and exhaustive data set of all protein-peptide complexes available in the Protein Data Bank with peptide lengths up to 35 residues. In addition, these complexes have been clustered based on their binding interfaces rather than sequence homology, providing a set of structurally diverse protein-peptide interactions. The final data set contains 505 unique protein-peptide interface clusters from 1431 complexes. Thorough annotation of each complex with both biological and structural information facilitates searching for and browsing through individual complexes and clusters. Moreover, we provide an additional source of data for peptide design by annotating peptides with naturally occurring backbone variations using fragment clusters from the BriX database.
The Open Biology Journal, 2010
The folding of polypeptides into stable globular protein structures requires protein sequences with a relatively high hydrophobicity and secondary structure propensity. These biophysical properties, however, also favor protein aggregation via the formation of intermolecular beta-sheets and, as a result, globular structure and aggregation are inextricable properties of protein polypeptides. Aggregates that are enriched in beta-sheet structures have been found in diseased tissues in association with at least twenty different human disorders, and the effects of aggregation on protein function include simple loss of function but also often a gain of toxicity. Given both the ubiquity and the potentially lethal consequences of protein aggregation, negative selective pressure strongly minimizes aggregation. Various evolutionary strategies keep aggregation in check, including (1) the optimisation of the thermodynamic stability of the protein, which precludes aggregation by burial of the aggregation-prone regions in solvent-inaccessible regions of the structure, (2) segregation between folding nuclei and aggregation nuclei within a protein sequence, (3) the placement of so-called gatekeeper residues at the flanks of aggregating segments, which reduce the aggregation rate of (partially) unfolded proteins, and (4) molecular chaperones that target aggregation-nucleating sequences directly, thereby further suppressing aggregation in a cellular environment. In this review we describe the intrinsic features built into protein sequence and structure that protect against aggregation.
Nucleic Acids Research, 2006
We have developed a web tool, PupaSuite, for the selection of single nucleotide polymorphisms (SNPs) with potential phenotypic effect, specifically oriented to help in the design of large-scale genotyping projects. PupaSuite uses a collection of data on SNPs from heterogeneous sources and a large number of pre-calculated predictions to offer a flexible and intuitive interface for selecting an optimal set of SNPs. It improves the functionality of the PupaSNP and PupasView programs and implements new facilities such as the analysis of users' data to derive haplotypes with functional information. A new estimator of the putative effect of polymorphisms, which uses evolutionary information, has been included. SNPeffect database predictions have also been included. The PupaSuite web interface is accessible through http://pupasuite.bioinfo.cipf.es and through
BMC Bioinformatics, 2009
Background: Linking structural effects of mutations to functional outcomes is a major issue in structural bioinformatics, and many tools and studies have shown that specific structural properties such as stability and residue burial can be used to distinguish neutral variations from disease-associated mutations.
PLOS Computational Biology, 2011
We previously showed the existence of selective pressure against protein aggregation by the enrichment of aggregation-opposing 'gatekeeper' residues at strategic places along the sequence of proteins. Here we analyzed the relationship between protein lifetime and protein aggregation by combining experimentally determined turnover rates, expression data, structural data and chaperone interaction data on a set of more than 500 proteins. We find that selective pressure on protein sequences against aggregation is not homogeneous: short-living proteins on average have a higher aggregation propensity and fewer chaperone interactions than long-living proteins. We also find that short-living proteins are more often associated with deposition diseases. These findings suggest that the efficient degradation of high-turnover proteins is sufficient to preclude aggregation, but also that factors that inhibit proteasomal activity, such as physiological ageing, will primarily affect the aggregation of short-living proteins.
Nucleic Acids Research, 2008