Domingo Gimenez - Academia.edu

Uploads

Papers by Domingo Gimenez

Parallelism on Hybrid Metaheuristics for Vector Autoregression Models

2018 International Conference on High Performance Computing & Simulation (HPCS), 2018

A Parallel Programming Course Based on an Execution Time-Energy Consumption Optimization Problem

2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2016

Optimizing Metaheuristics and Hyperheuristics through Multi-level Parallelism on a Many-Core System

2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2016

Sobre el papel de la programación paralela en los nuevos planes de estudios de informática

Improving Linear Algebra Computation on NUMA Platforms through Auto-tuned Nested Parallelism

2012 20th Euromicro International Conference on Parallel, Distributed and Network-based Processing, 2012

On the development and optimization of hybrid parallel codes for Integral Equation formulations

Modeling the behaviour of linear algebra algorithms with message-passing

Proceedings Ninth Euromicro Workshop on Parallel and Distributed Processing, 2001

Modeling the behaviour of linear algebra algorithms is a useful basis for designing linear algebra software for high performance computers. Such a model enables us to predict the execution time of the routines as a function of a number of parameters, which fall into two groups: parameters whose values can be chosen by the user (number of processors, processor grid configuration, distribution of data in the system, block size), and parameters that specify the characteristics of the target architecture (arithmetic cost, and the start-up and word-sending costs of a communication operation). A linear algebra library can thus be designed so that each routine selects the values of the first group of parameters that are expected to give the optimum execution time, and then solves the problem. Such a library could be employed by non-expert users to solve scientific or engineering problems, because they do not need to determine the values of these parameters. The design methodology is analysed with one-sided block Jacobi methods for the symmetric eigenvalue problem. Variants for a logical ring and a logical rectangular mesh of processors are considered. An analytical model of the algorithm is developed, and its behaviour is analysed with message passing using MPI on an SGI Origin 2000. With the parameters chosen by our model, the execution time drops from about 50% above the optimum to just 2% above it.
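As a rough illustration of this kind of analytical model (the cost formulas, constants, and function names below are ours, purely illustrative, not the paper's actual model), a routine can predict its execution time from the architecture parameters and search the user-tunable parameters for the minimum:

```python
from itertools import product

def predicted_time(n, rows, cols, block,
                   t_arith=1e-9, t_startup=1e-6, t_word=1e-8):
    """Toy model: arithmetic work shared over p processors, plus message
    start-up and per-word communication costs. The cost expressions are
    illustrative placeholders, not a real routine's model."""
    p = rows * cols
    flops = 2 * n ** 3 / p                   # share of O(n^3) arithmetic
    messages = (n // block) * (rows + cols)  # one exchange per block step
    words = messages * block * n / max(rows, cols)
    return flops * t_arith + messages * t_startup + words * t_word

def best_config(n, p, blocks=(32, 64, 128, 256)):
    """Pick the grid shape and block size minimizing the modeled time."""
    grids = [(r, p // r) for r in range(1, p + 1) if p % r == 0]
    return min(product(grids, blocks),
               key=lambda gb: predicted_time(n, gb[0][0], gb[0][1], gb[1]))

grid, block = best_config(2048, 16)
print(grid, block)
```

The point of the sketch is the division of labour: the architecture parameters (`t_arith`, `t_startup`, `t_word`) are measured once per machine, while the search over grid and block size runs cheaply on the model instead of on the actual routine.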

Analysis of the Influence of the Compiler on Multicore Performance

2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing, 2010

Empirical Installation of Linear Algebra Shared-Memory Subroutines for Auto-Tuning

International Journal of Parallel Programming, 2013

The introduction of auto-tuning techniques in linear algebra shared-memory routines is analyzed. Information obtained during the installation of the routines is used at run time to take decisions that reduce the total execution time. The study is carried out with routines at different levels (matrix multiplication, LU and Cholesky factorizations, and routines for symmetric or general linear systems) and with calls to multithreaded routines from the LAPACK and PLASMA libraries. Medium-sized NUMA and large cc-NUMA systems are used in the experiments. This variety of routines, libraries and systems allows us to draw general conclusions about the methodology for auto-tuning linear algebra shared-memory routines. Satisfactory execution times are obtained with the proposed methodology.
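The install-time/run-time split described in the abstract can be sketched as follows (a minimal illustration with a toy kernel; the function names and the tuned parameter are our assumptions, not the paper's code): at "installation" we time the routine for several parameter values and store the winner per problem size, and at run time we simply look the decision up.

```python
import time

def blocked_sum(n, block):
    """Toy kernel standing in for a linear algebra routine whose speed
    depends on a tunable parameter (here, a block size)."""
    s = 0.0
    for i in range(0, n, block):
        s += sum(range(i, min(i + block, n)))
    return s

def install(kernel, sizes, params, runs=3):
    """Install-time phase: time the kernel for every (size, parameter)
    pair and keep the fastest parameter for each size."""
    decisions = {}
    for n in sizes:
        best, best_t = None, float("inf")
        for p in params:
            t0 = time.perf_counter()
            for _ in range(runs):
                kernel(n, p)
            t = time.perf_counter() - t0
            if t < best_t:
                best, best_t = p, t
        decisions[n] = best
    return decisions

def tuned_call(kernel, decisions, n):
    """Run-time phase: reuse the installed decision for the closest size."""
    nearest = min(decisions, key=lambda s: abs(s - n))
    return kernel(n, decisions[nearest])

decisions = install(blocked_sum, sizes=[1000, 10000], params=[64, 256, 1024])
```

Interpolating from the nearest installed size keeps the run-time overhead to a dictionary lookup, at the cost of a longer installation phase on each new system.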

Parallelizing the Computation of Green Functions for Computational Electromagnetism Problems

2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

Improving Metaheuristics for Mapping Independent Tasks into Heterogeneous Memory-Constrained Systems

Lecture Notes in Computer Science, 2008

This paper shows different strategies for improving some metaheuristics for the solution of a task mapping problem. Independent tasks with different computational costs and memory requirements are scheduled on a heterogeneous system with computational heterogeneity and memory constraints. The tuned methods proposed in this work could be used for optimizing realistic systems, such as scheduling independent processes onto a processor farm.
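A minimal sketch of this kind of problem and of one simple metaheuristic for it (hill climbing from a greedy start; the instance encoding and all names are ours, purely illustrative of the problem class, not the paper's methods):

```python
import random

def makespan(assign, cost, speed):
    """Completion time of the most loaded processor."""
    load = [0.0] * len(speed)
    for t, p in enumerate(assign):
        load[p] += cost[t] / speed[p]
    return max(load)

def feasible(assign, mem, cap):
    """Check the per-processor memory constraints."""
    used = [0] * len(cap)
    for t, p in enumerate(assign):
        used[p] += mem[t]
    return all(u <= c for u, c in zip(used, cap))

def local_search(cost, mem, speed, cap, iters=2000, seed=0):
    """Greedy feasible start (largest tasks first, to the processor that
    finishes earliest), then accept random single-task moves that keep
    feasibility and reduce the makespan."""
    rng = random.Random(seed)
    n_proc = len(speed)
    assign = [0] * len(cost)
    load, used = [0.0] * n_proc, [0] * n_proc
    for t in sorted(range(len(cost)), key=lambda t: -cost[t]):
        cands = [p for p in range(n_proc) if used[p] + mem[t] <= cap[p]]
        p = min(cands, key=lambda q: load[q] + cost[t] / speed[q])
        assign[t] = p
        load[p] += cost[t] / speed[p]
        used[p] += mem[t]
    best = makespan(assign, cost, speed)
    for _ in range(iters):
        t, p = rng.randrange(len(cost)), rng.randrange(n_proc)
        old = assign[t]
        assign[t] = p
        if feasible(assign, mem, cap) and makespan(assign, cost, speed) < best:
            best = makespan(assign, cost, speed)
        else:
            assign[t] = old  # revert infeasible or non-improving moves
    return assign, best
```

The `speed` vector captures the computational heterogeneity and `cap` the memory constraints; a real metaheuristic would replace the single-move neighbourhood with, e.g., tabu search or a genetic operator.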

Processes Distribution of Homogeneous Parallel Linear Algebra Routines on Heterogeneous Clusters

2005 IEEE International Conference on Cluster Computing, 2005

This paper presents a self-optimization methodology for parallel linear algebra routines on heterogeneous systems. For each routine, a series of decisions is taken automatically in order to obtain an execution time close to the optimum, without rewriting the routine's code. These decisions include the number of processes to generate, the heterogeneous distribution of these processes over the network of processors, and the logical topology of the generated processes. Different heuristics have been used to reduce the search space of such decisions. The experiments were performed with a parallel LU factorization routine similar to the ScaLAPACK one, and good results were obtained on different heterogeneous platforms.
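One of the decisions above, distributing a fixed number of processes over nodes of different speeds, can be sketched as a proportional apportionment (a hedged illustration under our own assumptions, not the paper's heuristics):

```python
def distribute(total, speeds):
    """Give each node a share of the processes proportional to its
    measured relative speed, then hand out the rounding remainder by
    largest fractional part."""
    s = sum(speeds)
    shares = [total * v / s for v in speeds]
    counts = [int(x) for x in shares]
    rest = total - sum(counts)
    for i in sorted(range(len(speeds)),
                    key=lambda i: shares[i] - counts[i], reverse=True)[:rest]:
        counts[i] += 1
    return counts

# e.g. 7 processes over nodes with relative speeds 3 and 1
print(distribute(7, [3, 1]))  # → [5, 2]
```

Starting the search from such a proportional distribution, and only exploring nearby assignments, is one way to shrink the decision space the abstract mentions.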

Optimizing Shared-memory Hyperheuristics on Top of Parameterized Metaheuristics

Procedia Computer Science, 2014

A High Performance Computing Course Guided by the LU Factorization

Procedia Computer Science, 2014

Using Metaheuristics in a Parallel Computing Course

Lecture Notes in Computer Science, 2008

This paper explains the use of metaheuristic techniques in a parallel computing course. In the practical sessions of the course, different metaheuristics are applied to a mapping problem in which processes are assigned to processors in an environment that is heterogeneous in both computation and network. The parallelization of the metaheuristics is also considered.

Parameterized Schemes of Metaheuristics: Basic Ideas and Applications With Genetic Algorithms, Scatter Search, and GRASP

IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2013

Some optimization problems can be tackled only with metaheuristic methods, and to obtain a satisfactory metaheuristic it is necessary to develop and experiment with various methods and to tune them for each particular problem. A unified scheme for metaheuristics facilitates their development by reusing the basic functions. In our proposal, the unified scheme is improved by adding transitional parameters. These parameters are included in each of the functions, in such a way that different values of the parameters yield different metaheuristics or combinations of metaheuristics. Thus, the unified parameterized scheme eases the development of metaheuristics and their application. In this paper, we present the basic ideas of the parameterization of metaheuristics. The methodology is tested by applying local and global search methods (greedy randomized adaptive search procedure (GRASP), genetic algorithms, and scatter search), and their combinations, to three scientific problems: obtaining satisfactory simultaneous equation models from a set of values of the variables, a task-to-processor assignment problem with independent tasks and memory constraints, and the p-hub median location-allocation problem. Index Terms: genetic algorithms (GAs), greedy randomized adaptive search procedure (GRASP), parameterized metaheuristics, scatter search (SS), unified metaheuristics.
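To make the idea concrete, here is a minimal sketch of such a parameterized scheme (the parameter names, the driver, and the bitstring minimization problem are ours, purely illustrative, not the paper's scheme): a single loop whose numeric parameters select which metaheuristic it behaves like.

```python
import random

def unified(fitness, n, pop_size=1, n_combine=0, improve_steps=0,
            iters=50, seed=0):
    """Unified parameterized scheme, minimizing `fitness` over bitstrings.
    pop_size=1, n_combine=0, improve_steps>0 -> iterated local search
    (GRASP-like improvement phase); pop_size>1, n_combine>0 -> a
    genetic-algorithm-like method. n_combine>0 requires pop_size>=2."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(iters):
        for _ in range(n_combine):      # combination step (GA-like)
            a, b = rng.sample(pop, 2)
            cut = rng.randrange(1, n)
            pop.append(a[:cut] + b[cut:])
        for s in pop:                   # improvement step (local search)
            for _ in range(improve_steps):
                i = rng.randrange(n)
                before = fitness(s)
                s[i] ^= 1
                if fitness(s) > before:
                    s[i] ^= 1           # revert a worsening flip
        pop.sort(key=fitness)
        pop = pop[:pop_size]            # selection: keep the best
    return pop[0], fitness(pop[0])
```

The appeal of the parameterization is that tuning a metaheuristic becomes a search over a few integers (`pop_size`, `n_combine`, `improve_steps`, ...), which a hyperheuristic can itself optimize.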

Obtaining Simultaneous Equation Models through a Unified Shared-Memory Scheme of Metaheuristics

2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum, 2011

The Spanish Parallel Programming Contests and its Use as an Educational Resource

2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

Auto-tuning methodology to represent landform attributes on multicore and multi-GPU systems

Proceedings of the 2013 International Workshop on Programming Models and Applications for Multicores and Manycores, 2013

Hybrid-parallel Algorithms for 2D Green's Functions

Procedia Computer Science, 2013
