Sami Khuri - Academia.edu
Papers by Sami Khuri
Advances in computing technology and the affordability of software and high-performance graphics hardware enabled rapid growth of visual tools. Today, not only very expensive workstations, but also low-cost PCs are capable of running computationally demanding visualization systems. Algorithm visualizations or the graphic depictions of algorithms in execution are being used in explaining, designing, analysing algorithms, and in debugging, fine-tuning, and
ACM SIGCSE Bulletin, 1994
This paper introduces a geometric representation that can be applied to illustrate the complexity of some combinatorial optimization problems. In this work, it is applied to the 0/1 knapsack problem and to a special case of a scheduling problem. This representation gives insight into the difference between tractable and intractable problems. It can therefore be used as a stepping stone to compare polynomial (P) and nondeterministic polynomial (NP) problems, before venturing into the world of NP-completeness.
Proceedings of the 1990 ACM annual conference on Cooperation - CSC '90, 1990
Proceedings of the 1st conference on Integrating technology into computer science education - ITiCSE '96, 1996
This paper presents an overview of visualization in Computer Science instruction. It is broken down in the following fashion. First, we present the motivation for using visualization and visual techniques in instruction. This is followed by a discussion of when the use of visualization is most appropriate. We then consider a broad spectrum of uses of visualization in Computer Science instruction. This spectrum is organized from passive to active in terms of a student's involvement with the visualization tools. Types of visualizations are then categorized.
A phylogenetic tree represents the evolutionary history of a group of organisms. In this work, we introduce a novel interactive tool for constructing phylogenetic trees: the phylogenetic tree construction package. The package supports four well-known algorithms: unweighted pair group method using arithmetic average (UPGMA), neighbor joining, Fitch-Margoliash, and maximum parsimony.
… on mathematics and engineering techniques in …, 2004
As more research centers embark on sequencing new genomes, the problem of DNA fragment assembly for shotgun sequencing is growing in importance and complexity. Accurate and fast assembly is a crucial part of any sequencing project and many algorithms have been developed to tackle it. Since the DNA fragment assembly problem is NP-hard, exact solutions are very difficult to obtain. In this work, we present four heuristic algorithms, which we designed, implemented and tested. We compare the algorithms and the data structures of the four heuristics and present results of our experiments. We also compare our results with the assemblies produced by the well-known packages PHRAP and CAP3.
Genetic Algorithms within the Framework of …, 1994
Theoretical analysis of fitness functions in genetic algorithms has included the use of Walsh functions [14]. They form a convenient basis for the expansion of fitness functions [3]. These orthogonal, rectangular functions have also been used to compute the average fitness values of schemata [5]. This work explores the use of Haar functions [7] for the same purposes. While $2^\ell$ non-zero terms are required for the expansion of a given function as a linear combination of Walsh functions, at most $\ell + 1$ non-zero terms are required with the Haar expansion, where $\ell$ is the size of each binary string in the solution space. Similarly, Haar coefficients require less computation than their Walsh counterparts. The total number of terms required for the expansion of the fitness function at a given point using Haar is of order $2^\ell$, substantially less than Walsh's $2^{2\ell}$. A comparison of Haar functions and Walsh functions with respect to fitness averages shows that the use of Haar functions will reduce c...
DNA fragment assembly is an essential step in DNA sequencing projects. Since DNA sequencers output fragments, the original genome must be reconstructed from these small reads. In this paper, a new fragment assembly algorithm, Pattern Matching based String Graph Assembler (PMSGA), is presented. The algorithm uses multipattern matching to detect overlaps and a minimum cost flow algorithm to detect repeats. Special care was taken to reduce the algorithm’s run time without compromising the quality of the assembly. PMSGA was compared with well-known fragment assemblers. The algorithm is faster than other assemblers. PMSGA produced high quality assemblies with prokaryotic data sets. The results for eukaryotic data are comparable with other assemblers.
Proceedings of the 2017 ACM SIGCSE Technical Symposium on Computer Science Education, 2017
In this paper, we describe the Minor in Bioinformatics that we created to better prepare students, especially women, in acquiring computational and programming skills. Our program was motivated by the fact that women are underrepresented in computer science and in other information technology-related fields. We aim to recruit biology undergraduates, who are more than 60% female, to the new cohort-based integrative interdisciplinary Minor in Bioinformatics program. By rooting this new computational program in biological concepts and questions, we plan to interest and educate biology students in computational methods, which can be applied to complex questions in the growing field of bioinformatics. We expect that the Minor in Bioinformatics program will serve as a general framework for establishing similar interdisciplinary programs at large institutions and small colleges.
Proceedings of the First IEEE Conference on Evolutionary Computation. IEEE World Congress on Computational Intelligence
The results obtained from the application of a genetic algorithm, GENEsYs, to the NP-complete maximum independent set problem are reported in this work. In contrast to many other genetic algorithm based approaches that use domain-specific knowledge, the approach presented here relies on a graded penalty term component of the fitness function to penalize infeasible solutions. The method is applied to several large problem instances of the maximum independent set problem. The results clearly indicate that genetic algorithms can be successfully used as heuristics for finding good approximate solutions for this highly constrained optimization problem.
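A graded penalty of the kind this abstract mentions can be illustrated with a small sketch. The exact fitness function used with GENEsYs is not given here, so the form below (selected-vertex count minus a fixed penalty per violated edge) and the names `penalized_fitness`, `edges`, and `penalty` are assumptions made purely for illustration.

```python
from typing import Iterable, Sequence, Tuple

Edge = Tuple[int, int]


def penalized_fitness(bits: Sequence[int], edges: Iterable[Edge], penalty: float) -> float:
    """Score a candidate vertex set encoded as a 0/1 string.

    Assumed form (not taken from the paper): number of selected vertices
    minus `penalty` for every edge whose two endpoints are both selected.
    The penalty is "graded" because it grows with the number of violations.
    """
    selected = sum(bits)
    # An edge is violated when both of its endpoints are in the candidate set.
    violations = sum(1 for u, v in edges if bits[u] and bits[v])
    return selected - penalty * violations


# Hypothetical example: the path graph 0-1-2-3.
edges = [(0, 1), (1, 2), (2, 3)]
feasible = [1, 0, 1, 0]    # {0, 2} is an independent set
infeasible = [1, 1, 1, 0]  # edges (0, 1) and (1, 2) are violated

print(penalized_fitness(feasible, edges, penalty=4))
print(penalized_fitness(infeasible, edges, penalty=4))
```

With a penalty at least as large as the number of vertices, any infeasible string scores below every feasible one, while strings with fewer violations still rank above strings with more.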
eLife, Sep 5, 2017
Predators and prey co-evolve, each maximizing their own fitness, but the effects of predator-prey interactions on cellular and molecular machinery are poorly understood. Here, we study this process using the predator Caenorhabditis elegans and the bacterial prey Streptomyces, which have evolved a powerful defense: the production of nematicides. We demonstrate that upon exposure to Streptomyces at their head or tail, nematodes display an escape response that is mediated by bacterially produced cues. Avoidance requires a predicted G-protein-coupled receptor, SRB-6, which is expressed in five types of amphid and phasmid chemosensory neurons. We establish that species of Streptomyces secrete dodecanoic acid, which is sensed by SRB-6. This behavioral adaptation represents an important strategy for the nematode, which utilizes specialized sensory organs and a chemoreceptor that is tuned to recognize the bacteria. These findings provide a window into the molecules and organs used in the co...
Novatica Revista De La Asociacion De Tecnicos De Informatica, 2001
Proceedings of the 22nd annual ACM computer science conference on Scaling up: meeting the challenge of complexity in real-world computing applications - CSC '94, 1994
An Evolutionary Approach to Combinatorial Optimization Problems. Sami Khuri, Department of Mathematics & Computer Science, San ...
Studies in Fuzziness and Soft Computing, 2003
... Authors, Enrique Alba, Universidad de Málaga, Complejo Tecnológico, Campus de Teatinos, 29071 Málaga, Spain. Sami Khuri, Department of Mathematics & Computer Science, San José State University, One Washington Square, San José, CA. ... Sami Khuri: colleagues. ...
Proceedings of the 1997 ACM symposium on Applied computing - SAC '97, 1997
In this paper, applications of heuristic techniques for solving the terminal assignment (TA) problem are investigated. The task here is to assign terminals to concentrators in such a way that each terminal is assigned to one (and only one) concentrator and the aggregate capacity of all terminals assigned to any concentrator does not overload that concentrator, i.e., is within the concentrator's capacity. Under these two hard constraints, an assignment with the lowest possible cost is sought. The proposed cost is taken to be the distance between a terminal and a concentrator. The heuristic techniques we investigate in this article include greedy-based algorithms, genetic algorithms (GA), and grouping genetic algorithms (GGA) [4]. We elaborate on the different heuristics we use, and compare the solutions yielded by them.
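The two hard constraints and the distance-based cost described above can be made concrete with a minimal greedy sketch. It is not the paper's algorithm: the nearest-feasible-concentrator rule, the processing order, and the names `greedy_assign` and `total_cost` are assumptions chosen only to illustrate the problem statement.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]


def greedy_assign(terminals: Sequence[Point], demands: Sequence[float],
                  concentrators: Sequence[Point],
                  capacities: Sequence[float]) -> Optional[List[int]]:
    """Give each terminal, in input order, the nearest concentrator with room.

    Returns one concentrator index per terminal, or None when some terminal
    fits nowhere (the greedy pass can fail even if a feasible assignment exists).
    """
    remaining = list(capacities)
    assignment: List[int] = []
    for t, demand in zip(terminals, demands):
        # Rank concentrators by Euclidean distance to this terminal.
        order = sorted(range(len(concentrators)),
                       key=lambda c: math.dist(t, concentrators[c]))
        chosen = next((c for c in order if remaining[c] >= demand), None)
        if chosen is None:
            return None
        remaining[chosen] -= demand  # capacity constraint
        assignment.append(chosen)    # exactly one concentrator per terminal
    return assignment


def total_cost(terminals: Sequence[Point], concentrators: Sequence[Point],
               assignment: Sequence[int]) -> float:
    """Sum of terminal-to-concentrator distances for an assignment."""
    return sum(math.dist(t, concentrators[c]) for t, c in zip(terminals, assignment))


# Hypothetical instance: three terminals, two concentrators of capacity 3.
terminals = [(0.0, 0.0), (1.0, 0.0), (9.0, 0.0)]
demands = [2.0, 2.0, 1.0]
concentrators = [(0.0, 0.0), (10.0, 0.0)]
capacities = [3.0, 3.0]

assignment = greedy_assign(terminals, demands, concentrators, capacities)
print(assignment, total_cost(terminals, concentrators, assignment))
```

In this instance the second terminal is pushed to the far concentrator because the near one is already too full, which is the kind of myopic choice that a genetic algorithm search over whole assignments can avoid.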
Proceedings of the 1997 ACM symposium on Applied computing - SAC '97, 1997
Walsh functions are orthogonal, rectangular functions that take values ±1 and form a convenient basis for the expansion of genetic algorithm fitness functions. Since their introduction into genetic algorithms [2, 8], they have been used to compute the average fitness values of schemata, to decide whether functions are hard or easy for genetic algorithms, and to design deceptive functions for the genetic algorithm. In [10], Haar functions were introduced as an alternative to Walsh functions and it was shown that Haar functions are in general more computationally advantageous. This paper revisits Haar functions, albeit with a slight variation to [10]'s definition, uses the functions to construct fully deceptive functions for the genetic algorithm (as was done with Walsh functions [5]), and studies fast Haar transforms and fast Walsh-Haar transforms.
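As a small illustration of expanding a fitness function in the Walsh basis, the sketch below uses the bitwise convention common in the genetic algorithm literature, where the j-th Walsh function at x is +1 or -1 according to the parity of the 1 bits in `j AND x`. The indexing convention and the function names are assumptions, and the direct summation shown is the naive method rather than any of the fast transforms studied in the paper.

```python
from typing import List, Sequence


def walsh(j: int, x: int) -> int:
    """Walsh function psi_j(x): +1 if j AND x has an even number of 1 bits, else -1."""
    return -1 if bin(j & x).count("1") % 2 else 1


def walsh_coefficients(f: Sequence[float]) -> List[float]:
    """Coefficients w_j = (1 / 2^l) * sum_x f(x) * psi_j(x), with len(f) == 2^l."""
    n = len(f)
    return [sum(f[x] * walsh(j, x) for x in range(n)) / n for j in range(n)]


def reconstruct(w: Sequence[float], x: int) -> float:
    """Evaluate the expansion f(x) = sum_j w_j * psi_j(x)."""
    return sum(w[j] * walsh(j, x) for j in range(len(w)))


# Hypothetical fitness on 2-bit strings: the number of 1 bits in x.
fitness = [0.0, 1.0, 1.0, 2.0]
w = walsh_coefficients(fitness)
print(w)
print([reconstruct(w, x) for x in range(4)])
```

Because the basis is orthogonal, evaluating the expansion at every point recovers the original fitness table, which is a quick way to check an implementation.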
2005 IEEE Computational Systems Bioinformatics Conference - Workshops (CSBW'05)
Genome sequencing opened a new era in genetics allowing the study of genomes at the nucleotide level. However, the chosen method of sequencing produced large numbers of nucleotide fragments which had to be reassembled. The re-assembly of string fragments is known to be NP-hard. We report the results of our fast heuristic implementation for reassembling DNA fragments based on a unique approach to the problem called "A Structured Pattern Matching Approach to Shotgun Sequence Assembly" (AMASS), created by Sun Kim. The algorithm's main idea is taken from the biological concept of probe hybridization where certain strands of nucleic acids are identified by short, unique sequences of bases that are contained within much longer DNA strands.
ACM SIGCSE Bulletin, 1986
This paper describes an original method for introducing linear recurrence relations. Boolean expressions are represented by binary trees and the counting of the internal nodes of these trees yields linear recurrence relations. The method allows the students to create their own family of Boolean expressions, to draw the corresponding binary trees, to deduce the recurrence relation representing the number of nodes in the trees, and finally, to solve and check the solutions of these relations.
Lecture Notes in Computer Science, 1996
In this paper we compare the effects of using various stochastic operators with the nonunicost set-covering problem. Four different crossover operators are compared to a repair heuristic which consists in transforming infeasible strings into feasible ones. These stochastic operators are incorporated in GENEsYs [2], the genetic algorithm we apply to problem instances of the set-covering problem we draw from well known test problems. GENEsYs uses a simple fitness function that has a graded penalty term to penalize infeasibly bred strings. The results are compared to a non-GA-based algorithm based on the greedy technique. Our computational results are then compared, shedding some light on the effects of using different operators, a penalty function, and a repair heuristic on a highly constrained combinatorial optimization problem.