Mario Compiani | Università degli studi di Camerino UNICAM
Papers by Mario Compiani
Lettere al Nuovo Cimento, Oct 1, 1983
Springer eBooks, 1990
Classifier systems use both dynamical and inferential operations in order to perform complex tasks. Their relationship with neural networks is briefly discussed. While dynamical features are crucial for learning, we investigate their role in the solutions which are finally discovered by the system, showing that some truly dynamical solutions exist and that they can perform a specific task with a smaller number of classifiers than purely “inferential” solutions. The stability of the rules under the action of genetic operators is also briefly discussed.
Franco Angeli eBooks, 2007
Il Nuovo Cimento B, 1995
In the absence of experimental facts to support the study of the origins of the Universe and black holes, reliance must be shifted to theory, especially thermodynamics. Rather peculiar thermodynamic properties have been attributed to superstring theories, which deal with the unification of the four known forces during the evolution of the early Universe, and to black holes, the end product of gravitational collapse. These properties include lack of concavity, with the consequence of negative heat capacities, and non-extensivity. It has been proposed that superadditivity rather than concavity constitutes the essence of the second law. Here we refute such claims and show that concavity determines the natural evolution of thermodynamic processes.
Lettere al Nuovo Cimento, 1983
Oxford University Press eBooks, Nov 27, 1996
Physical Review E, 1997
In the first part of this paper we study the performance of a single-layer perceptron that is expected to classify patterns into classes in the case where the mapping to be learned is corrupted by noise. Extending previous results concerning the statistical behavior of perceptrons, we distinguish two mutually exclusive kinds of noise (I noise and R noise) and study their effect on the statistical information that can be drawn from the output. In the presence of I noise, the learning stage results in the convergence of the output to the probabilities that the input occurs in each class. R noise, on the contrary, perturbs the learning of probabilities to the extent that the performance of the perceptron deteriorates and the network becomes equivalent to a random predictor. We derive an analytical expression for the efficiency of classification of inputs affected by strong R noise. We argue that, from the standpoint of the efficiency score, the network is equivalent to a device performing biased random flights in the space of the weights, which are ruled by the statistical information stored by the network during the learning stage. The second part of the paper is devoted to the application of our model to the prediction of protein secondary structures where one has to deal with the effects of R noise. Our results are shown to be consistent with data drawn from experiments and simulations of the folding process. In particular, the existence of coding and noncoding traits of the protein is properly rationalized in terms of R-noise intensity. In addition, our model provides a justification of the seeming existence of a relationship between the prediction efficiency and the amount of R noise in the sequence-to-structure mapping. Finally, we define an entropylike parameter that is useful as a measure of R noise.
Proceedings / ... International Conference on Intelligent Systems for Molecular Biology ; ISMB. International Conference on Intelligent Systems for Molecular Biology, 1999
A data base of minimally frustrated alpha helical segments is defined by filtering a set comprising 822 non-redundant proteins, which contain 4783 alpha helical structures. The data base definition is performed using a neural network-based alpha helix predictor, whose outputs are rated according to an entropy criterion. A comparison with the presently available experimental results indicates that a subset of the data base contains the experimentally detected initiation sites of protein folding and also protein fragments which fold into stable isolated alpha helices. This suggests using the data base (and/or the predictor) to highlight patterns which govern the stability of alpha helices in proteins and the helical behavior of isolated protein fragments.
Proceedings / ... International Conference on Intelligent Systems for Molecular Biology ; ISMB. International Conference on Intelligent Systems for Molecular Biology, 1995
Radial basis function neural networks are trained on a data base comprising 38 globular proteins of well-resolved crystallographic structure and the corresponding free energy contributions to the overall protein stability (as computed partially from crystallographic analysis and partially with multiple regression from experimental thermodynamic data by Ponnuswamy and Gromiha (1994)). Starting from the residue sequence and using as input code the percentage of each residue and the total residue number of the protein, it is found with a cross-validation method that neural networks can optimally predict the free energy contributions due to hydrogen bonds, hydrophobic interactions and the unfolded state. Terms due to electrostatic and disulfide bonding free energies are poorly predicted. This is also the case when other input codes, including the percentage of secondary structure type of the protein and/or residue-pair information, are used. Furthermore, trained on the computed and/or experim...
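The input code mentioned in this abstract (the percentage of each residue plus the total residue number) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the authors' implementation: the function name `composition_input`, the ordering of the 20 one-letter codes, and the absence of any scaling of the length term are all hypothetical choices.

```python
from collections import Counter

# Assumed ordering of the 20 standard residues (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_input(sequence):
    """Return 20 residue percentages followed by the chain length.

    Hypothetical helper: the abstract only states that percentages and
    the total residue number are used, not how they are arranged.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    # Percentage of each residue type; letters outside the 20 are ignored.
    percentages = [100.0 * counts.get(aa, 0) / len(seq) for aa in AMINO_ACIDS]
    return percentages + [float(len(seq))]


# Toy sequence, not taken from the data base described above.
vector = composition_input("ACDAAK")
```

For this toy sequence the vector has 21 entries: 20 percentages and the length as the final entry.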
Applications and Science of Artificial Neural Networks II, 1996
Theoretical Chemistry Accounts: Theory, Computation, and Modeling (Theoretica Chimica Acta), 1999
Protein secondary structures result both from short-range and long-range interactions. Here neural networks are used to implement a procedure to detect regions of the protein backbone where local interactions have an overwhelming effect in determining the formation of stretches in α-helical conformation. Within the framework of a modular view of protein folding we have argued that these structures correspond to the initiation sites of folding. The hypothesis to be tested in this paper is that sequence identity, besides ensuring similarity of the three-dimensional conformation, also entails similar folding mechanisms. In particular, we compare the location and sequence variability of the initiation sites extracted from a set of proteins homologous to horse heart cytochrome. We present evidence that the initiation sites conserve their position in the aligned sequences and exhibit a more reduced variability in the residue composition than the rest of the protein.
The Journal of Physical Chemistry B, 2005
A diffusion-collision-like model is proposed for helical proteins with three-state folding dynamics. The model generalizes a previous scheme based on the dynamics of putatively essential parts of the protein (foldons) that was successfully tested on proteins with two-state folding. We show that the extended model, unlike the original one, allows satisfactory calculation of the folding rate and reconstruction of the salient steps of the folding pathway of two proteins with three-state folding (Im7 and p16). The dramatic reduction of variables achieved by focusing on the foldons makes our model a good candidate for a minimal description of the folding process also for three-state folders. Finally, the applicability of the foldon diffusion-collision model to two-state and three-state folders suggests that different folding mechanisms are amenable to conceptually homogeneous descriptions. The implications for a unification of the variety of folding theories so far proposed for helical proteins are discussed.
SAR and QSAR in Environmental Research, 2000
SAR and QSAR in Environmental Research, 2002
Computational tools can bridge the gap between sequence and protein 3D structure based on the notion that information is to be retrieved from the databases and that knowledge-based methods can help in approaching a solution of the protein-folding problem. To this aim our group has implemented neural network-based predictors capable of performing with some success in different tasks, including predictions of the secondary structure of globular and membrane proteins, the topology of membrane proteins and porins, and stable alpha-helical segments suited for protein design. Moreover, we have developed methods for predicting contact maps in proteins and the probability of finding a cysteine in a disulfide bridge, tools which can contribute to the goal of predicting the 3D structure starting from the sequence (the so-called ab initio prediction). All our predictors take advantage of evolutionary information derived from the structural alignments of homologous (evolutionarily related) proteins and taken from the sequence and structure databases. When it is necessary to build models for proteins of unknown spatial structure, which have very little homology with other proteins of known structure, non-standard techniques need to be developed and the tools for protein structure prediction may help in protein modeling. The results of a recent simulation performed in our lab highlight the role of high-performance computing technology and of tools of computational biology in protein modeling and peptidomimetic design.
Proteins: Structure, Function, and Genetics, 2000
The most stringent test for predictive methods of protein secondary structure is whether identical short sequences that are known to be present with different conformations in different proteins known at atomic resolution can be correctly discriminated. In this study, we show that the prediction efficiency for this type of segment in unrelated proteins reaches an average accuracy per residue ranging from about 72 to 75% (depending on the alignment method used to generate the input sequence profile) only when methods of the third generation are used. A comparison of different methods based on segment statistics (2nd generation methods) and/or also including evolutionary information (3rd generation methods) indicates that the discrimination of the different conformations of identical segments is dependent on the method used for the prediction. Accuracy is similar when methods that perform similarly on secondary structure prediction are tested. When evolutionary information is taken into account, as compared to single sequence input, the number of correctly discriminated pairs is increased twofold. The results also highlight the predictive capability of neural networks for identical segments whose conformation differs in different proteins. Proteins 2000;41:535-544.
Proteins: Structure, Function, and Bioinformatics, 2006
In this article we use mutation studies as a benchmark for a minimal model of the folding process of helical proteins. The model ascribes a pivotal role to the collisional dynamics of a few crucial residues (foldons) and predicts the folding rates by exploiting information drawn from the protein sequence. We show that our model rationalizes the effects of point mutations on the kinetics of folding. The folding times of two proteins and their mutants are predicted. Stability and location of foldons have a critical role as the determinants of protein folding. This allows us to elucidate two main mechanisms for the kinetic effects of mutations. First, it turns out that the mutations eliciting the most notable effects alter protein stability through stabilization or destabilization of the foldons. Secondly, the folding rate is affected via a modification of the foldon topology by those mutations that lead to the birth or death of foldons. The few mispredicted folding rates of some mutants hint at the limits of the current version of the folding model proposed in the present article. The performance of our folding model declines when the mutated residues are subject to strong long-range forces. That foldons are the critical targets of mutation studies has notable implications for design strategies and is of particular interest for addressing the kinetic regulation of single proteins in the general context of the overall dynamics of the interactome.
Proceedings of the National Academy of Sciences, 1998
The analysis of the information flow in a feed-forward neural network suggests that the output of the network can be used to compute a structural entropy for the sequence-to-secondary structure mapping. On this basis, we formulate a minimum entropy criterion for the identification of minimally frustrated traits with helical conformation that correspond to initiation sites of protein folding. The entropy of protein segments can be viewed as a nucleation propensity that is useful to characterize putative regions where folding is likely to be initiated with the formation of stretches of α-helices under the predominant influence of local interactions. Our procedure is successfully tested in the search for initiation sites of protein folding for which independent experimental and computational evidence exists. Our results lend support to the view that folding is a hierarchical event in which, in harmony with the minimal frustration principle, the final conformation preserves structural m...
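As a rough illustration of how an entropy might be computed from network outputs, the Python sketch below treats the normalized outputs for one residue as a probability distribution and takes its Shannon entropy, then averages over a segment. The abstract does not give the actual formula, so the normalization, the natural logarithm, the averaging over residues, and the names `shannon_entropy` and `segment_entropy` are all assumptions made for illustration.

```python
import math


def shannon_entropy(outputs):
    """Shannon entropy (in nats) of non-negative outputs, after normalization.

    Assumption: outputs are normalized to sum to 1 before use.
    """
    total = sum(outputs)
    if total <= 0:
        raise ValueError("outputs must have a positive sum")
    probs = [o / total for o in outputs]
    # Terms with p == 0 contribute nothing to the sum.
    return -sum(p * math.log(p) for p in probs if p > 0)


def segment_entropy(per_residue_outputs):
    """Mean per-residue entropy over a segment (hypothetical aggregation)."""
    if not per_residue_outputs:
        raise ValueError("empty segment")
    entropies = [shannon_entropy(o) for o in per_residue_outputs]
    return sum(entropies) / len(entropies)


# Toy outputs for a two-residue segment over three structural classes.
confident = shannon_entropy([1.0, 0.0, 0.0])   # a fully decided output
uncertain = shannon_entropy([1.0, 1.0, 1.0])   # a maximally undecided output
mean_entropy = segment_entropy([[1.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
```

Under these assumptions, a low-entropy segment is one where the outputs are sharply peaked on a single class, which is the sense in which a minimum entropy criterion would single out confidently predicted stretches.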