Roberto Solar - Academia.edu

Papers by Roberto Solar

Towards rapid population genetics forward-in-time simulations

Winter Simulation Conference, Dec 3, 2017

Computer simulations are an important tool for current research in population and evolutionary genetics. They help researchers understand the dynamics of complex evolutionary processes that cannot be predicted analytically. The basic idea is to generate synthetic data sets of genetic polymorphisms under user-specified scenarios describing the evolutionary history and genetic architecture of a species. In this work, we focus on forward-in-time simulations, which represent the most powerful but also the most compute-intensive approach for simulating the genetic material of a population. We present a highly optimized forward-in-time simulation library called Libgdrift, specially designed to create large sets of replicated simulations. Our simulation library uses code optimizations such as spatial locality and a two-phase data compression approach, which allow fast simulation executions while reducing memory storage. Results show that our proposal can improve on the performance reported by well-known simulation software.
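To make the idea of replicated forward-in-time simulation concrete, here is a minimal sketch assuming the textbook haploid Wright-Fisher model with pure genetic drift. It is not Libgdrift's API or its optimized implementation; the function names `wright_fisher` and `replicate` and all parameters are hypothetical, chosen only for illustration.

```python
import random


def wright_fisher(pop_size, init_freq, generations, rng):
    """Follow the count of one allele forward in time (haploid, drift only)."""
    count = round(pop_size * init_freq)
    trajectory = [count]
    for _ in range(generations):
        freq = count / pop_size
        # Each individual of the next generation inherits the allele
        # independently with probability equal to its current frequency.
        count = sum(1 for _ in range(pop_size) if rng.random() < freq)
        trajectory.append(count)
    return trajectory


def replicate(n_replicates, pop_size, init_freq, generations, seed=0):
    """Run independent replicates that share one seeded random generator."""
    rng = random.Random(seed)
    return [
        wright_fisher(pop_size, init_freq, generations, rng)
        for _ in range(n_replicates)
    ]


if __name__ == "__main__":
    for run in replicate(n_replicates=3, pop_size=100, init_freq=0.5, generations=10):
        print(run)
```

Every generation is simulated explicitly for every replicate, which is why the forward-in-time approach becomes compute-intensive as population size and replicate count grow.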

AMEDS-tool: an automatic tool to model and simulate large scale systems

Simulating the cost of applications running on large clusters of processors poses difficulties in model definition and simulation. In this paper we propose a methodology to ease this burden. The user specifies a model graphically as a timed coloured Petri net, using a tool such as CPN Tools. The model is automatically converted into an XML specification. A code generator then converts the XML file into C++ code, which is linked to a simulation kernel. The result is an efficient and scalable simulation program that can be executed sequentially or in parallel, on either a single multi-core processor or a cluster of multi-core processors. We illustrate the suitability of our proposal by modelling and simulating the cost of message passing in a fat-tree network, which is commonly used to support communication in clusters of processors.

SSSTree v2.0: búsqueda por similitud en espacios métricos con solapamientos de planos

Listas de clusters usando centros espacialmente dispersos para búsquedas por similitud en espacios métricos

Implementación de un digesto digital paralelo para búsquedas por similitud sobre documentos

Evaluation of Static/Dynamic Cache for Similarity Search Engines

Lecture Notes in Computer Science, 2016

In large-scale search systems, where it is important to achieve high query throughput, cache strategies are a feasible tool for achieving this goal. A number of efficient cache strategies devised for exact query search in different application domains have been proposed so far. In similarity query search on metric spaces, it is necessary to consider additional design requirements aimed at producing good-quality approximate results from the cache content. In this paper, we propose a Static/Dynamic cache strategy for metric spaces that takes advantage of the results of static cache miss operations and their associated distance evaluations to increase the overall performance of the cache. We present an experimental evaluation of the performance obtained with our strategy for different query selection/replacement strategies.

Dynamic load balance for approximate parallel simulations with consistent hashing

Parallel simulation is a powerful tool to evaluate the performance of large-scale systems. However, when it comes to simulating large-scale Web search engines, parallel execution can introduce imbalance among processors: event occurrence is driven by user behavior, which is unpredictable, so events take place in different parts of the system in an irregular manner. In this paper, we study the impact of load balance strategies on the performance of a parallel simulation strategy. In particular, we present a consistent hashing load balance algorithm that aims to reduce queuing waiting times, evenly distribute the cost of executing events among processors, and, more importantly, restrict the migration of logical processes to neighbor processors. We use a Web search engine composed of services deployed on a cluster of processors as the application case study. Our simulations are driven by actual query log traces. We evaluate our proposed load balance algorithm in ...
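The neighbor-only migration property mentioned in the abstract above can be illustrated with a generic consistent hashing ring. This is a minimal sketch of the general technique, not the paper's load balance algorithm: the class `ConsistentHashRing`, the helper `ring_position`, the single ring point per processor, and the choice of SHA-256 are all assumptions made for illustration.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a point on the ring (first 8 bytes of its SHA-256 digest)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class ConsistentHashRing:
    """Hypothetical ring with one point per processor (no virtual nodes)."""

    def __init__(self, processors):
        self._points = sorted((ring_position(p), p) for p in processors)

    def owner(self, lp_id: str) -> str:
        """Return the first processor at or after the logical process's position."""
        positions = [pos for pos, _ in self._points]
        idx = bisect.bisect_left(positions, ring_position(lp_id)) % len(self._points)
        return self._points[idx][1]

    def remove(self, processor: str) -> None:
        """Drop a processor; its logical processes fall to its ring successor."""
        self._points = [(pos, p) for pos, p in self._points if p != processor]


if __name__ == "__main__":
    ring = ConsistentHashRing(["p0", "p1", "p2", "p3"])
    lps = [f"lp{i}" for i in range(10)]
    before = {lp: ring.owner(lp) for lp in lps}
    ring.remove("p2")
    moved = {lp: ring.owner(lp) for lp in lps if before[lp] == "p2"}
    print(before)
    print(moved)
```

With one point per processor, removing a processor reassigns only the logical processes it owned, and all of them go to the same ring neighbor; every other assignment is unchanged.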

A graph-based cache for large-scale similarity search engines

The Journal of Supercomputing

A Service-Oriented Platform for Approximate Bayesian Computation in Population Genetics

Journal of Computational Biology

Construction Strategies on Metric Structures for Similarity Search

CLEI Electronic Journal

The List of Clusters (LC) is an effective technique to index high-dimensional metric spaces. LC is an array-type structure for similarity search based on clustering. Sparse Spatial Selection (SSS) is a new structure based on pivots for similarity search in metric spaces. This array-type structure has shown good performance during the search as compared to other methods. This work shows different construction strategies on LC; for instance, the use of SSS as a general selection method of pivots or centers, among others. The article also shows the advantages of the use of other techniques, like keeping the distance between the objects and the cluster center, apart from revising the effects of recursive application of such methods. Finally, the influence of the use of Voronoi partitions for the distribution of the objects within the structure will be shown. Preliminary experimental results show that the new versions of LC have a better performance in terms of distance evaluation than the original data structure and other renowned struct...
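The abstract above does not spell out the selection rule, so the following is a minimal sketch under stated assumptions: it uses the rule commonly given for Sparse Spatial Selection (an object becomes a center when it lies at least `alpha * max_dist` away from every center chosen so far), followed by a Voronoi-style assignment that keeps each object's distance to its center. The names `sss_select` and `voronoi_assign`, the parameter `alpha`, and the use of Euclidean distance are hypothetical and do not come from the article.

```python
import math


def sss_select(objects, dist, alpha, max_dist):
    """Pick centers that are at least alpha * max_dist apart from each other."""
    centers = []
    for obj in objects:
        # The first object always qualifies because all() of an empty sequence is True.
        if all(dist(obj, c) >= alpha * max_dist for c in centers):
            centers.append(obj)
    return centers


def voronoi_assign(objects, centers, dist):
    """Place each object in the cluster of its nearest center, keeping the distance."""
    clusters = [[] for _ in centers]
    for obj in objects:
        distances = [dist(obj, c) for c in centers]
        nearest = distances.index(min(distances))
        clusters[nearest].append((obj, distances[nearest]))
    return clusters


if __name__ == "__main__":
    points = [(0, 0), (1, 0), (10, 0), (11, 0), (0, 1)]
    # Assumption: the maximum pairwise distance is computed exhaustively here.
    max_dist = max(math.dist(a, b) for a in points for b in points)
    centers = sss_select(points, math.dist, alpha=0.4, max_dist=max_dist)
    for center, cluster in zip(centers, voronoi_assign(points, centers, math.dist)):
        print(center, cluster)
```

Storing the object-to-center distance with each entry is what later lets a search discard objects through the triangle inequality without evaluating their distance to the query.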

Estrategias de Construcción sobre Estructuras Métricas para Búsquedas por Similitud

Listas de Clusters usando Centros Espacialmente Dispersos para Búsquedas por Similaridad en Espacios Métricos

SSSTree v2.0: Búsqueda por Similitud en Espacios Métricos con Solapamiento de Planos

Load Balance Strategies for DEVS Approximated Parallel and Distributed Discrete-Event Simulations

2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, 2015

Approximate parallel simulation of web search engines

Proceedings of the 2013 ACM SIGSIM conference on Principles of advanced discrete simulation - SIGSIM-PADS '13, 2013

Improving Communication Patterns for Distributed Cluster-based Individual-oriented Fish School Simulations

Procedia Computer Science, 2013

Parallel discrete event simulation (PDES) has proven to be a useful paradigm for simulating complex and large-scale models. An individual-oriented approach allows modelers to capture complex emergent global behaviors generated by simple local interactions, as observed in self-organized systems. Usually, this type of simulation is highly expensive in terms of computing and communication. On the one hand, we can reduce the computing involved in individual interactions by developing a robust partitioning method. On the other hand, we have to be able to efficiently handle a huge number of individuals interacting with other individuals stored in the memory of remote processors. In this work we analyze and compare three communication strategies for our distributed cluster-based individual-oriented fish school simulator: synchronous and asynchronous message passing (via MPI) and bulk-synchronous parallel (BSP). For this type of simulation, the main contributions of our work are: a) we show that distributed time-driven simulations do not always improve performance when using synchronous communication strategies, and b) we show that asynchronous communication strategies are more efficient. In addition, we have verified that the bulk-synchronous parallel method is scalable.

High performance individual-oriented simulation using complex models

Procedia Computer Science, 2010

Computational simulation has been used as a powerful tool to represent the dynamical behavior of systems based on complex analytic models. These types of models have two main drawbacks: a) limitations due to the degree of abstraction needed to simulate them, and b) the high computing power needed to simulate heavily simplified models. The computing power available today can overcome these limitations and perform quicker simulations of complex models that are closer to reality. In this paper, the experiments and performance analysis of a distributed simulation for a complex individual-oriented model (fish schools) are presented. The fish school simulator can work with large models that include large numbers of fish (> 10^6 individuals), predators, and obstacles in the simulated world.

Proximity Load Balancing for Distributed Cluster-based Individual-oriented Fish School Simulations

Procedia Computer Science, 2012

Partitioning and load balancing are highly important issues in distributed individual-oriented simulation. Choosing how to distribute individuals across the distributed environment can be a crucial factor when executing the simulation. Partitioning an individual-oriented system should be efficient in order to reduce the communication involved in interactions between individuals belonging to different logical processes. Furthermore, if the individual-oriented model exhibits mobility patterns, we should be able to maintain the load balance in order to preserve global application performance. In this work, we present a proximity load balancing strategy for a distributed cluster-based individual-oriented fish school simulator. On the one hand, we implement a robust cluster-based partitioning method by means of a covering radius criterion and Voronoi diagrams. We use a proximity criterion to distribute individuals on the distributed architecture. On the other hand, we propose a proximity load balancing strategy in order to maintain application performance as the simulation progresses.
