Sanne Abeln - Academia.edu

Papers by Sanne Abeln

Lab Course MolSim Exercises 2013

DETECTION OF BREAKPOINTS BASED ON COPY NUMBER ABERRATION PROFILES

HOW TO INTEGRATE MULTI-OMICS DATA

COMPUTATIONAL ANALYSIS OF FORCE INDUCED PROTEIN UNFOLDING

The hydrophobic temperature dependence of amino acids directly calculated from protein structures

PLoS computational biology, 2015

The hydrophobic effect is the main driving force in protein folding. One can estimate the relative strength of this hydrophobic effect for each amino acid by mining a large set of experimentally determined protein structures. However, the hydrophobic force is known to be strongly temperature dependent. This temperature dependence is thought to explain the denaturation of proteins at low temperatures. Here we investigate if it is possible to extract this temperature dependence directly from a large set of protein structures determined at different temperatures. Using NMR structures filtered for sequence identity, we were able to extract hydrophobicity propensities for all amino acids at five different temperature ranges (spanning 265-340 K). These propensities show that the hydrophobicity becomes weaker at lower temperatures, in line with current theory. Alternatively, one can conclude that the temperature dependence of the hydrophobic effect has a measurable influence on protein str...

Quantifying the Displacement of Mismatches in Multiple Sequence Alignment Benchmarks

PLOS ONE, 2015

Multiple Sequence Alignment (MSA) methods are typically benchmarked on sets of reference alignments. The quality of the alignment can then be represented by the sum-of-pairs (SP) or column (CS) scores, which measure the agreement between a reference and corresponding query alignment. Both the SP and CS scores treat mismatches between a query and reference alignment as equally bad, and do not take into account the separation between two amino acids in the query alignment that should have been matched according to the reference alignment. This is significant since the magnitude of alignment shifts is often of relevance in biological analyses, including homology modeling and MSA refinement/manual alignment editing. In this study we develop a new alignment benchmark scoring scheme, SPdist, that takes the degree of discordance of mismatches into account by measuring the sequence distance between mismatched residue pairs in the query alignment. Using this new score along with the standard SP score, we investigate the discriminatory behavior of the new score by assessing how well six different MSA methods perform with respect to BAliBASE reference alignments. The SP score and the SPdist score yield very similar outcomes when the reference and query alignments are close. However, for more divergent reference alignments the SPdist score is able to distinguish between methods that keep alignments approximately close to the reference and those exhibiting larger shifts. We observed that by using SPdist together with SP scoring we were able to better delineate the alignment quality difference between alternative MSA methods. With a case study we exemplify why it is important, from a biological perspective, to consider the separation of mismatches. The SPdist scoring scheme has been implemented in the VerAlign web server (http://www.ibi.vu.nl/programs/veralignwww/). The code for calculating SPdist score is also available upon request.
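
To make the contrast in this abstract concrete, the following is a minimal Python sketch, not the authors' code and not the published SPdist formula. It assumes alignments are given as lists of equal-length gapped strings with `-` as the gap character, and that the reference and query list the same sequences in the same order. The function names (`aligned_pairs`, `sp_score`, `mean_column_shift`) are hypothetical. `sp_score` is the usual fraction of reference residue pairs recovered by the query; `mean_column_shift` is a simplified stand-in for a distance-aware score, measuring how many query columns apart the residues of each reference pair end up.

```python
from itertools import combinations


def residue_columns(row):
    """Return a list mapping residue index -> column index for one gapped row."""
    return [col for col, ch in enumerate(row) if ch != "-"]


def aligned_pairs(alignment):
    """Collect every pair of residues that share a column.

    Each residue is identified as (sequence index, residue index).
    """
    pairs = set()
    counters = [0] * len(alignment)  # next residue index per sequence
    for col in range(len(alignment[0])):
        present = []
        for seq_idx, row in enumerate(alignment):
            if row[col] != "-":
                present.append((seq_idx, counters[seq_idx]))
                counters[seq_idx] += 1
        pairs.update(combinations(present, 2))
    return pairs


def sp_score(reference, query):
    """Fraction of reference residue pairs that are also aligned in the query."""
    ref = aligned_pairs(reference)
    if not ref:
        return 0.0
    return len(ref & aligned_pairs(query)) / len(ref)


def mean_column_shift(reference, query):
    """Illustrative assumption: mean column offset, in the query, between
    residues that the reference places in the same column."""
    ref = aligned_pairs(reference)
    if not ref:
        return 0.0
    cols = [residue_columns(row) for row in query]
    total = sum(abs(cols[s1][r1] - cols[s2][r2]) for (s1, r1), (s2, r2) in ref)
    return total / len(ref)


reference = ["ACGT", "ACGT"]
small_shift = ["ACGT-", "-ACGT"]      # second sequence shifted by one column
large_shift = ["ACGT---", "---ACGT"]  # second sequence shifted by three columns
```

In this toy case both shifted queries recover none of the reference pairs, so the SP score cannot tell them apart, while the column-offset measure separates a one-column shift from a three-column shift.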

NGS-eval: NGS Error analysis and novel sequence VAriant detection tooL

Nucleic Acids Research, 2015

Massively parallel sequencing of microbial genetic markers (MGMs) is used to uncover the species composition in a multitude of ecological niches. These sequencing runs often contain a sample with known composition that can be used to evaluate the sequencing quality or to detect novel sequence variants. With NGS-eval, the reads from such (mock) samples can be used to (i) explore the differences between the reads and their references and to (ii) estimate the sequencing error rate. This tool maps these reads to references and calculates as well as visualizes the different types of sequencing errors. Clearly, sequencing errors can only be accurately calculated if the reference sequences are correct. However, even with known strains, it is not straightforward to select the correct references from databases. We previously analysed a pyrosequencing dataset from a mock sample to estimate sequencing error rates and detected sequence variants in our mock community, allowing us to obtain an accurate error estimation. Here, we demonstrate the variant detection and error analysis capability of NGS-eval with Illumina MiSeq reads from the same mock community. While tailored towards the field of metagenomics, this server can be used for any type of MGM-based reads. NGS-eval is available at http://www.ibi.vu.nl/programs/ngsevalwww/.
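
As a rough illustration of what counting "different types of sequencing errors" can look like once a read has been mapped to its reference, here is a small Python sketch. It is not NGS-eval's implementation: the input format (a pair of equal-length gapped strings from a pairwise alignment), the function names `count_errors` and `error_rate`, and the choice of denominator (all non-empty alignment columns) are assumptions made for this example.

```python
from collections import Counter


def count_errors(ref_aln, read_aln):
    """Classify each column of a pairwise read-to-reference alignment.

    A gap in the reference is an insertion in the read; a gap in the read
    is a deletion; differing bases are a mismatch.
    """
    if len(ref_aln) != len(read_aln):
        raise ValueError("aligned strings must have equal length")
    counts = Counter()
    for ref_base, read_base in zip(ref_aln, read_aln):
        if ref_base == "-" and read_base == "-":
            continue  # empty column, carries no information
        if ref_base == "-":
            counts["insertion"] += 1
        elif read_base == "-":
            counts["deletion"] += 1
        elif ref_base != read_base:
            counts["mismatch"] += 1
        else:
            counts["match"] += 1
    return counts


def error_rate(counts):
    """Errors per alignment column (assumed definition for this sketch)."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return (total - counts["match"]) / total


counts = count_errors("ACGT-ACGT", "ACCTTAC-T")
rate = error_rate(counts)
```

As the abstract notes, such counts are only meaningful when the reference is correct: a true sequence variant in the sample would otherwise be tallied as a sequencing error.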

Competition between surface adsorption and folding of fibril-forming polypeptides

Physical review. E, Statistical, nonlinear, and soft matter physics, 2015

Self-assembly of polypeptides into fibrillar structures can be initiated by planar surfaces that interact favorably with certain residues. Using a coarse-grained model, we systematically studied the folding and adsorption behavior of a β-roll forming polypeptide. We find that there are two different folding pathways depending on the temperature: (i) at low temperature, the polypeptide folds in solution into a β-roll before adsorbing onto the attractive surface; (ii) at higher temperature, the polypeptide first adsorbs in a disordered state and folds while on the surface. The folding temperature increases with increasing attraction as the folded β-roll is stabilized by the surface. Surprisingly, further increasing the attraction lowers the folding temperature again, as strong attraction also stabilizes the adsorbed disordered state, which competes with folding of the polypeptide. Our results suggest that to enhance the folding, one should use a weakly attractive surface. They also ex...

Fast and accurate calculation of protein-protein interaction: contribution of surface and interface

Evolutionary relevance of structural fragments

Distance-based MSA benchmarking scoring scheme: The importance of measuring the reference-query alignment shifts

Protein fold evolution on completed genomes: distinguishing between young and old folds

Linking evolution of protein structures through fragments

His research interests are centered around ageing. This includes the study of the molecular pathways that extend lifespan. His research uses both computational and wetlab techniques for modelling gene regulatory networks and predicting transcription factor binding sites.

A Simple Lattice Model Reproduces Folding Specificity and Aggregation Behaviour of Proteins

CANDIDATE DRIVER GENE DISCOVERY THROUGH NETWORK AIDED CLUSTERING OF TUMOR MUTATION PROFILES

THE TRAIT ROAD TOWARDS TOOLS FOR SUSTAINABLE MOLECULAR DATA MANAGEMENT AND ANALYSIS

Interplay between Folding and Assembly of Fibril-Forming Polypeptides

Phys. Rev. Lett., 2013

Polypeptides can self-assemble into hierarchically organized fibrils consisting of a stack of individually folded polypeptides driven together by hydrophobic interaction. Using a coarse-grained model, we systematically studied this self-assembly as a function of temperature and hydrophobicity of the residues on the outside of the building block. We find the self-assembly can occur via two different pathways, a random aggregation-folding route and a templated-folding process, thus indicating a strong coupling between folding and assembly. The simulation results can explain experimental evidence that assembly through stacking of folded building blocks is rarely observed at the experimental concentrations. The model thus provides a generic picture of hierarchical fibril formation.

Fold usage on genomes and protein fold evolution

Proteins: Structure, Function, and Bioinformatics, 2005

We review fold usage on completed genomes to explore protein structure evolution. The patterns of presence or absence of folds on genomes give us insights into the relationships between folds, the age of different folds and how we have arrived at the set of folds we see today. We examine the relationships between different measures which describe protein fold usage, such as the number of copies of a fold per genome, the number of families per fold, and the number of genomes a fold occurs on. We obtained these measures of fold usage by searching for the structural domains on 157 completed genome sequences from all three kingdoms of life. In our comparisons of these measures we found that bacteria have relatively more distinct folds on their genomes than archaea. Eukaryotes were found to have many more copies of a fold on their genomes. If we separate out the different fold classes, the alpha/beta class has relatively fewer distinct folds on large genomes, more copies of a fold on bacteria and more folds occurring in all three kingdoms simultaneously. These results possibly indicate that most alpha/beta folds originated earlier than other folds. The expected power law distribution is observed for copies of a fold per genome and we found a similar distribution for the number of families per fold. However, a more complicated distribution appears for fold occurrence across genomes, which strongly depends on fold class and kingdom. We also show that there is not a clear relationship between the three measures of fold usage. A fold which occurs on many genomes does not necessarily have many copies on each genome. Similarly, folds with many copies do not necessarily have many families or vice versa. Proteins 2005;60:690-700.

A Simple Lattice Model That Captures Protein Folding, Aggregation and Amyloid Formation

PLoS ONE, 2014

The ability of many proteins to convert from their functional soluble state to amyloid fibrils can be attributed to intermolecular beta strand formation. Such amyloid formation is associated with neurodegenerative disorders like Alzheimer's and Parkinson's. Molecular modelling can play a key role in providing insight into the factors that make proteins prone to fibril formation. However, fully atomistic models are computationally too expensive to capture the length and time scales associated with fibril formation. As the ability to form fibrils is the rule rather than the exception, much insight can be gained from the study of coarse-grained models that capture the key generic features associated with amyloid formation. Here we present a simple lattice model that can capture both protein folding and beta strand formation. Unlike standard lattice models, this model explicitly incorporates the formation of hydrogen bonds and the directionality of side chains. The simplicity of our model makes it computationally feasible to investigate the interplay between folding, amorphous aggregation and fibril formation, and maintains the capability of classic lattice models to simulate protein folding with high specificity. In our model, the folded proteins contain structures that resemble naturally occurring beta-sheets, with alternating polar and hydrophobic amino acids. Moreover, fibrils with intermolecular cross-beta strand conformations can be formed spontaneously out of multiple short hydrophobic peptide sequences. The formation of hydrogen bonds, both in folded structures and in fibrils, is strongly dependent on the amino acid sequence, indicating that hydrogen-bonding interactions alone are not strong enough to initiate the formation of beta sheets. This result agrees with experimental observations that beta sheet and amyloid formation is strongly sequence dependent, with hydrophobic sequences being more prone to form such structures. Our model should open the way to a systematic study of the interplay between the factors that lead to amyloid formation. Citation: Abeln S, Vendruscolo M, Dobson CM, Frenkel D (2014) A Simple Lattice Model That Captures Protein Folding, Aggregation and Amyloid Formation. PLoS ONE 9(1): e85185.

Disordered Flanks Prevent Peptide Aggregation

PLoS Computational Biology, 2008
