Giuseppe Narzisi | New York Genome Center

Papers by Giuseppe Narzisi

SUTTA: Scoring-and-Unfolding Trimmed Tree Assembler

We have developed a novel algorithmic framework for assembling haplotypic genome sequences, and thus address a key open problem in the study of populations and polymorphisms, which cannot be solved with the currently available genotypic sequences.

Feature-by-feature – evaluating de novo sequence assembly

The whole-genome sequence assembly (WGSA) problem is among the most studied problems in computational biology. Despite the availability of a plethora of tools (i.e., assemblers), all claiming to have solved the WGSA problem, little has been done to systematically compare their accuracy and power.

Reevaluating Assembly Evaluations with Feature Response Curves: GAGE and Assemblathons

In just the last decade, a multitude of bio-technologies and software pipelines have emerged to revolutionize genomics. To further their central goal, they aim to accelerate and improve the quality of de novo whole-genome assembly starting from short DNA sequences/reads. However, the performance of each of these tools is contingent on the length and quality of the sequencing data, the structure and complexity of the genome sequence, and the resolution and quality of long-range information.

Hawkeye and AMOS: visualizing and assessing the quality of genome assemblies

Since its launch in 2004, the open-source AMOS project has released several innovative DNA sequence analysis applications including: Hawkeye, a visual analytics tool for inspecting the structure of genome assemblies; the Assembly Forensics and FRCurve pipelines for systematically evaluating the quality of a genome assembly; and AMOScmp, the first comparative genome assembler.

Complexities, catastrophes and cities: Unraveling emergency dynamics

Author contributions: BM designed research; GN, VM and BM performed research; GN, VM and BM contributed new analytical tools; LN, DR and MT contributed to the clinical aspects of the study design and development; GN, VM and BM wrote the paper; and GN, VM, LN, DR, MT, LH, and IP reviewed the paper.

Emergency response planning for a potential sarin gas attack in Manhattan using agent-based models

In this paper, we describe the agent-based modeling (ABM), simulation and analysis of a potential sarin gas attack at the Port Authority Bus Terminal on the island of Manhattan in New York City, USA. The streets and subways of Manhattan have been modeled as a non-planar graph. The people at the terminal are modeled as agents initially moving randomly, but with a resultant drift velocity towards their destinations, e.g., workplaces.
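The movement rule in this abstract (random motion with a net drift toward a destination, on a graph) can be illustrated with a biased random walk. This is only a minimal sketch, not the paper's model: the toy graph `GRAPH`, the `drift` probability, and the use of hop distance to pick the "toward the destination" neighbor are all assumptions made for illustration.

```python
import random
from collections import deque

# Hypothetical toy street graph (undirected adjacency lists); not taken from the paper.
GRAPH = {
    "terminal": ["a", "b"],
    "a": ["terminal", "b", "office"],
    "b": ["terminal", "a", "c"],
    "c": ["b", "office"],
    "office": ["a", "c"],
}

def bfs_distances(graph, target):
    """Hop distance from every reachable node to `target`."""
    dist = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def step(graph, dist, node, drift, rng):
    """Move along one edge: toward the destination with probability
    `drift`, otherwise to a uniformly random neighbor."""
    neighbors = graph[node]
    if rng.random() < drift:
        return min(neighbors, key=lambda n: dist[n])
    return rng.choice(neighbors)

def walk(graph, start, target, drift, rng, max_steps=10_000):
    """Return the list of nodes visited until `target` is reached
    (or until `max_steps` moves have been made)."""
    dist = bfs_distances(graph, target)
    path = [start]
    while path[-1] != target and len(path) <= max_steps:
        path.append(step(graph, dist, path[-1], drift, rng))
    return path

if __name__ == "__main__":
    rng = random.Random(0)
    print(walk(GRAPH, "terminal", "office", drift=0.3, rng=rng))
```

With `drift=1.0` the agent follows a shortest path; with `drift=0.0` it performs an unbiased random walk. Intermediate values give the "random but drifting" behavior the abstract describes.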

Improved Assembly Accuracy by Integrating Base-Calling, Error Correction and Assembly

Motivation. With the recent advent of a multitude of next-generation sequencing (NGS) technologies (characterized by high throughput but relatively shorter read length), de novo DNA sequence assembly has again become one of the most prominent problems in genomics and computational biology. Although algorithmic improvements play an important role in sequence assembly, the complexity of the problem is strongly reduced if higher quality (low error rate) sequences can be generated.

Reevaluating Assembly Evaluations using Feature Analysis: GAGE and Assemblathons Supplementary material

FRCbam computes features using a sliding window of size W. By default W is set to 1 kbp, and in each step it slides by 200 bp. Let A denote a genome to be assembled (i.e., the desired output). Let $R = \{r^1_1, r^2_1, \ldots, r^1_n, r^2_n\}$ denote the set of sequenced paired reads from A. Pairs are at a known estimated distance, d (with standard deviation, v) and with known orientations. FRCbam input is:
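The excerpt breaks off before listing the inputs. Its sliding-window step, though, can be sketched in a few lines. The defaults below (window of 1000 bp, step of 200 bp) come from the excerpt; everything else is an assumption for illustration, including the function names, the half-open interval convention, the truncation of the last window at the contig end, and the idea of counting feature positions per window. This is not FRCbam's actual code.

```python
def sliding_windows(contig_length, window=1000, step=200):
    """Yield half-open (start, end) intervals that slide along a contig.

    Assumption: the final window is truncated at the contig end.
    """
    if contig_length <= 0:
        return
    start = 0
    while True:
        end = min(start + window, contig_length)
        yield (start, end)
        if end == contig_length:
            break
        start += step

def count_per_window(positions, contig_length, window=1000, step=200):
    """Count how many (hypothetical) feature positions fall in each window."""
    return [
        (start, end, sum(1 for p in positions if start <= p < end))
        for start, end in sliding_windows(contig_length, window, step)
    ]

if __name__ == "__main__":
    for row in count_per_window([100, 250, 1450], contig_length=1500):
        print(row)
```

Because the step is smaller than the window, consecutive windows overlap by 800 bp, so a single position can contribute to several windows.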

An Experimental Multi-Objective Study of the SVM Model Selection problem

Support vector machines (SVMs) are a powerful method for both regression and classification. However, any SVM formulation requires the user to set two or more parameters that govern the training process, and these parameters can strongly affect the resulting performance. Moreover, the design of learning systems is inherently a multi-objective optimization problem. It requires finding a suitable trade-off between at least two conflicting objectives: model complexity and accuracy.
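The trade-off between complexity and accuracy is usually expressed through Pareto dominance: a parameter setting is kept only if no other setting is at least as good on both objectives and strictly better on one. The sketch below illustrates that filter. It assumes both objectives are minimized (a complexity measure and an error rate), and the candidate values are made-up placeholders, not results from the paper; how the paper measures complexity is not stated in this excerpt.

```python
def dominates(a, b):
    """True if objective vector `a` Pareto-dominates `b` (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(candidates):
    """Keep the candidates whose objectives are not dominated by any other candidate."""
    return [
        c for c in candidates
        if not any(dominates(o["objectives"], c["objectives"]) for o in candidates)
    ]

# Placeholder candidates: (complexity, error) pairs invented for illustration only.
CANDIDATES = [
    {"C": 0.1, "gamma": 0.01, "objectives": (40, 0.12)},
    {"C": 1.0, "gamma": 0.1, "objectives": (55, 0.08)},
    {"C": 10.0, "gamma": 0.1, "objectives": (90, 0.08)},   # dominated by the second entry
    {"C": 100.0, "gamma": 1.0, "objectives": (150, 0.05)},
    {"C": 1.0, "gamma": 1.0, "objectives": (60, 0.15)},    # dominated by the first entry
]

if __name__ == "__main__":
    for c in pareto_front(CANDIDATES):
        print(c)
```

The front contains several settings rather than a single winner, which is the point of treating model selection as a multi-objective problem: the final choice among non-dominated settings is left to the user.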

Agent modeling of a sarin attack in Manhattan

In this paper, we describe the agent-based modeling (ABM), simulation and analysis of a potential sarin gas attack at the Port Authority Bus Terminal on the island of Manhattan in New York City, USA. The streets and subways of Manhattan have been modeled as a non-planar graph. The people at the terminal are modeled as agents initially moving randomly, but with a resultant drift velocity towards their destinations, e.g., workplaces.

Complexities, catastrophes and cities: Emergency dynamics in varying scenarios and urban topologies

Complex Systems are often characterized by agents capable of interacting with each other dynamically, often in non-linear and non-intuitive ways. Trying to characterize their dynamics often results in partial differential equations that are difficult, if not impossible, to solve. A large city or a city-state is an example of such an evolving and self-organizing complex environment that efficiently adapts to different and numerous incremental changes to its social, cultural and technological infrastructure [2].

An immunological algorithm for global numerical optimization

Numerical optimization of given objective functions is a crucial task in many real-life problems. The present article introduces an immunological algorithm for continuous global optimization problems, called opt-IA. Several biologically inspired algorithms have been designed during the last few years and have been shown to perform very well on standard test beds for numerical optimization.

Determination of protein structure and dynamics combining immune algorithms and pattern search methods

Natural proteins quickly fold into a complicated three-dimensional structure. Evolutionary algorithms have been used to predict the native structure with the lowest energy conformation of the primary sequence of a given protein. Successful structure prediction requires a free energy function sufficiently close to the true potential for the native state, as well as a method for exploring the conformational space.

A multi-objective evolutionary approach to the protein structure prediction problem

The protein structure prediction (PSP) problem is concerned with the prediction of the folded, native, tertiary structure of a protein given its sequence of amino acids. It is a challenging and computationally open problem, as proven by the numerous methodological attempts and the research effort applied to it in the last few years.

Computational Studies of Peptide and Protein Structure Prediction Problems via Multiobjective Evolutionary Algorithms

Finding the native structure of a protein starting from its amino acid sequence remains one of the most challenging open problems in bioinformatics and molecular biology. The Protein Structure Prediction (PSP) problem has been tackled from many different directions. The common approach is to cast it in the form of a global single-objective optimization problem using energy functions to evaluate the physical state of the conformations.

Supplementary Material for “Complexities, Catastrophes and Cities: Emergency Dynamics in Varying Scenarios and Urban Topologies”

One of the main issues in ABM is to build models at the appropriate level of description, using the requisite level of detail in order to produce a system that serves its analytical purpose. The details of our model have been summarized below from [2, 3, 4, 5, 6]. Table 1 shows the main parameters that the user can modify.

Yahoo! Clusty: Adding real-time clustering functionality to the Yahoo! web search engine

Yahoo! Clusty is a Clustering Meta-search Engine (MSE) that allows users to send queries to Yahoo!. The returned snippets are grouped into homogeneous groups by topic. The objective of this project has been to create a flexible MSE for the Yahoo! web search engine. The purpose is to present the results returned for a query in a more structured format that allows the user to easily explore them by category.

Scoring-and-Unfolding Trimmed Tree Assembler: Algorithms for Assembling Genome Sequences Accurately and Efficiently

By Giuseppe Narzisi. A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy, Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, May 2011. Advisor: Bud Mishra. © Giuseppe Narzisi, 2011. All rights reserved. To Valentina & my family.

Clonal Selection Algorithms: A Comparative Case Study Using Effective Mutation Potentials

Artificial Immune Systems, Jan 1, 2005

This paper presents a comparative study of two important Clonal Selection Algorithms (CSAs): CLONALG and opt-IA. To understand the performance of both algorithms in depth, we deal with four different classes of problems: toy problems (one-counting and trap functions), pattern recognition, numerical optimization problems and an NP-complete problem (the 2D HP model for the protein structure prediction problem). Two possible versions of CLONALG have been implemented and tested. The experimental results show a better overall performance of opt-IA with respect to CLONALG. Considering the results obtained, we can claim that CSAs represent a new class of Evolutionary Algorithms for effectively performing searching, learning and optimization tasks.
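For readers unfamiliar with the toy problems named above, the sketch below gives one common textbook form of each on bit strings. The one-counting function simply counts ones. The trap function shown is a basic fully deceptive variant, chosen here as an assumption; the paper's own trap parameterization may differ.

```python
def one_counting(bits):
    """One-counting (OneMax): fitness is the number of ones in the bit string."""
    return sum(bits)

def simple_trap(bits):
    """A basic deceptive trap (assumed form, not necessarily the paper's).

    The global optimum is the all-ones string, but every other string is
    rewarded for having fewer ones, which pulls local search toward all zeros.
    """
    n = len(bits)
    ones = sum(bits)
    return n if ones == n else n - 1 - ones

if __name__ == "__main__":
    for candidate in ([1, 1, 1, 1], [1, 0, 1, 1], [0, 0, 0, 0]):
        print(candidate, one_counting(candidate), simple_trap(candidate))
```

On one-counting, each bit flipped from 0 to 1 improves fitness, so the landscape is easy. On the trap, the same flips reduce fitness everywhere except at the optimum, which is what makes it a useful stress test for mutation operators.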

A class of Pareto archived evolution strategy algorithms using immune inspired operators for ab-initio protein structure prediction

Applications on Evolutionary Computing, Jan 1, 2005

In this work we investigate the applicability of a multiobjective formulation of the Ab-Initio Protein Structure Prediction (PSP) to medium size protein sequences (46-70 residues). In particular, we introduce a modified version of Pareto Archived Evolution Strategy (PAES) which makes use of immune inspired computing principles and which we will denote by "I-PAES". Experimental results on a test bed of five proteins from PDB show that PAES, (1+1)-PAES, and its modified version I-PAES are optimal multiobjective optimization algorithms and that the introduced mutation operators, mut1 and mut2, are effective for the PSP problem. The proposed I-PAES is comparable with other evolutionary algorithms proposed in the literature, both in terms of best solution found and computational cost.
