Claudio Borile - Academia.edu
Papers by Claudio Borile
arXiv (Cornell University), Jun 10, 2021
The constant growth in the number of malware (software or code fragments potentially harmful to computers and information networks) and the use of sophisticated evasion and obfuscation techniques have seriously hindered classic signature-based approaches. On the other hand, malware detection systems based on machine learning techniques have started to offer a promising alternative to standard approaches, drastically reducing analysis time and proving more robust against evasion and obfuscation techniques. In this paper, we propose a malware taxonomic classification pipeline able to classify Windows Portable Executable files (PEs). Given an input PE sample, it is first classified as either malicious or benign. If malicious, the pipeline further analyzes it in order to establish its threat type, family, and behavior(s). We tested the proposed pipeline on the open source dataset EMBER, containing approximately 1 million PE samples, analyzed through static analysis. The malware detection results obtained are comparable to other academic works in the current state of the art and, in addition, we provide an in-depth classification of malicious samples. The models used in the pipeline provide interpretable results, which can help security analysts better understand the decisions taken by the automated pipeline.
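The two-stage flow described in this abstract (detect first, then characterize only the malicious samples) can be sketched as a simple cascade. This is a minimal illustration, not the authors' implementation: the names `Verdict` and `TaxonomicPipeline` and the threshold-based toy classifiers are hypothetical stand-ins for trained models operating on static-analysis feature vectors.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence

Features = Sequence[float]  # assumed: a numeric feature vector from static analysis


@dataclass
class Verdict:
    malicious: bool
    threat_type: Optional[str] = None
    family: Optional[str] = None
    behaviors: List[str] = field(default_factory=list)


@dataclass
class TaxonomicPipeline:
    """Hypothetical cascade: one detector, then three downstream classifiers."""
    detector: Callable[[Features], bool]
    type_clf: Callable[[Features], str]
    family_clf: Callable[[Features], str]
    behavior_clf: Callable[[Features], List[str]]

    def classify(self, x: Features) -> Verdict:
        # Stage 1: benign samples stop here and get no further labels.
        if not self.detector(x):
            return Verdict(malicious=False)
        # Stage 2: only malicious samples are characterized in depth.
        return Verdict(
            malicious=True,
            threat_type=self.type_clf(x),
            family=self.family_clf(x),
            behaviors=self.behavior_clf(x),
        )


# Toy stand-ins for trained models (illustrative thresholds and labels only).
pipeline = TaxonomicPipeline(
    detector=lambda x: x[0] > 0.5,
    type_clf=lambda x: "trojan" if x[1] > 0.5 else "worm",
    family_clf=lambda x: "family_a" if x[1] > 0.5 else "family_b",
    behavior_clf=lambda x: ["network"] if x[1] > 0.5 else [],
)

benign = pipeline.classify([0.1, 0.9])
suspicious = pipeline.classify([0.9, 0.9])
```

Keeping the detector separate from the downstream classifiers means those classifiers only ever see samples already flagged as malicious, which mirrors the order of operations stated in the abstract.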
Journal of Statistical Mechanics: Theory and Experiment, 2015
Neutral theories have played a crucial and revolutionary role in fields such as population genetics and biogeography. These theories are critical by definition, in the sense that the overall growth rate of each single allele/species/type vanishes. Thus each species in a neutral model sits at the edge between invasion and extinction, allowing for the coexistence of symmetric/neutral types. However, in finite systems, mono-dominated states are inevitably reached in relatively short times owing to demographic fluctuations, leaving us with an unsatisfactory framework to rationalize empirically observed long-term coexistence. Here, we scrutinize the effect of heterogeneity in quasi-neutral theories, in which there can be a mild local preference for some of the competing species at some sites, even if the overall species symmetry is maintained. As we show here, mild biases at a small fraction of locations suffice to induce robust and durable overall species coexistence, even in regions arbitrarily far from the biased locations. This result stems from the long-range nature of the underlying critical bulk dynamics and has a number of implications, for example in conservation ecology, as it suggests that constructing local, species-specific "sanctuaries" for different competing species can result in a global enhancement of biodiversity, even in regions arbitrarily distant from the protected refuges.
Physical Review Letters, 2012
Spontaneous symmetry breaking plays a fundamental role in many areas of condensed matter and particle physics. A fundamental problem in ecology is the elucidation of the mechanisms responsible for biodiversity and stability. Neutral theory, which makes the simplifying assumption that all individuals (such as trees in a tropical forest), regardless of the species they belong to, have the same prospect of reproduction, death, etc., yields gross patterns that are in accord with empirical data. We explore the possibility of birth and death rates that depend on the population density of species while treating the dynamics in a species-symmetric manner. We demonstrate that the dynamical evolution can lead to a stationary state characterized simultaneously by both biodiversity and spontaneously broken neutral symmetry.
Journal of Statistical Physics, 2014
Neutral models aspire to explain biodiversity patterns in ecosystems where species differences can be neglected, as might occur at a specific trophic level, and perfect symmetry is assumed between species. Voter-like models capture the essential ingredients of the neutral hypothesis and represent a paradigm for other disciplines such as social studies and chemical reactions. In a system where each individual can interact with all the other members of the community, the typical time to reach an absorbing state with a single species scales linearly with the community size. Here we show, by using a rigorous approach within a large deviation principle and confirming previous approximate and numerical results, that in a heterogeneous voter model the typical time to reach an absorbing state scales exponentially with the system size, suggestive of an asymptotic active phase.
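The linear scaling quoted for the homogeneous, fully connected case can be checked with a small simulation. The sketch below covers only that baseline, not the heterogeneous model analyzed in the paper; it assumes two species, an even initial split, and time measured in generations (update steps divided by community size). The function names are illustrative.

```python
import random


def absorption_time(n_individuals: int, rng: random.Random) -> float:
    """Generations until one species takes over, starting from an even split."""
    n_a = n_individuals // 2  # current number of individuals of species A
    steps = 0
    while 0 < n_a < n_individuals:
        # Pick a random individual, then a distinct random individual to copy.
        copier_is_a = rng.random() < n_a / n_individuals
        others_a = n_a - 1 if copier_is_a else n_a
        model_is_a = rng.random() < others_a / (n_individuals - 1)
        if copier_is_a and not model_is_a:
            n_a -= 1
        elif model_is_a and not copier_is_a:
            n_a += 1
        steps += 1
    return steps / n_individuals


def mean_absorption_time(n_individuals: int, trials: int = 200, seed: int = 0) -> float:
    """Average absorption time over independent runs with a fixed seed."""
    rng = random.Random(seed)
    total = sum(absorption_time(n_individuals, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    for size in (10, 20, 40):
        print(size, mean_absorption_time(size))
```

Doubling the community size should roughly double the mean absorption time in this baseline; the exponential scaling reported in the abstract would require adding site-dependent preferences, which this sketch deliberately omits.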
Journal of Statistical Mechanics: Theory and Experiment, 2013
We study systems with two symmetric absorbing states, such as the voter model and variations of it, which have been broadly used as minimal neutral models in genetics, population ecology, sociology, etc. We analyze the effects of a key ingredient ineluctably present in most real applications: random-field-like quenched disorder. In accord with simulations and previous findings, coexistence between the two competing states/opinions turns out to be strongly favored by disorder in the standard voter model; indeed, a disorder-induced phase transition is generated for any finite system size in the presence of an arbitrarily small spontaneous-inversion rate (which prevents absorbing states from being stable). For non-linear versions of the voter model, a general theory (by Al Hammal et al.) explains that the spontaneous breaking of the up/down symmetry and an absorbing-state phase transition can occur either together or separately, giving rise to two different scenarios. Here, we show that the presence of quenched disorder in non-linear voter models does not allow the separation of the up-down (Ising-like) symmetry breaking from the active-to-absorbing phase transition in low-dimensional systems: both phenomena can occur only simultaneously, as a consequence of the well-known Imry-Ma argument generalized to these non-equilibrium problems. When the two phenomena occur in unison, resulting in a genuinely non-equilibrium ("Generalized Voter") transition, the Imry-Ma argument is violated and the symmetry can be spontaneously broken even in low dimensions.
BMC Bioinformatics, 2011
Background: Classification and naming is a key step in the analysis, understanding and adequate management of living organisms. However, where to set limits between groups can be puzzling, especially in clonal organisms. Within the Mycobacterium tuberculosis complex (MTC), the etiological agent of tuberculosis (TB), experts first identified several groups according to their pattern at repetitive sequences, especially at the CRISPR locus (spoligotyping), and to their epidemiological relevance. Most groups, such as "Beijing", found good support when tested with other loci. However, other groups, such as the T family and T1 subfamily (belonging to the "Euro-American" lineage), correspond to non-monophyletic groups and still need to be refined. Here, we propose to use a method called Affinity Propagation, which has been successfully used in image categorization, to identify relevant patterns at the CRISPR locus in MTC. Results: To adequately infer the relative divergence time...
Current Bioinformatics
Background: The new paradigm of precision medicine has brought increasing interest in survival prediction based on the integration of multi-omics and multi-source data. Several models have been developed to address this task, but their performance varies widely depending on the specific disease and is often poor on noisy datasets, as in the case of non-small cell lung cancer (NSCLC). Objective: The aim of this work is to introduce a novel computational approach, named multi-omic two-layer SVM (mtSVM), and to exploit it to obtain a survival-based risk stratification of NSCLC patients from an ongoing observational prospective cohort clinical study named PROMOLE. Methods: The model implements a model-based integration by means of a two-layer feed-forward network of FastSurvivalSVMs, and it can be used to obtain individual survival estimates or survival-based risk stratification. Despite being designed for NSCLC, its range of applicability can potentially cover the full spectrum ...