Gabriel Mateescu - Academia.edu

Papers by Gabriel Mateescu

High performance grid computing - HPGC

2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW), 2010

The seventh workshop will be held in conjunction with IPDPS 2010 in Atlanta. It will give a forum to researchers and engineers to present their results in grid and distributed computing. Special areas of interest will be grid middleware, grid applications, grid ...

HEP Applications on the Grid Canada Testbed

arXiv (Cornell University), May 16, 2003

A Grid testbed has been established using resources at 12 sites across Canada involving researchers from particle physics as well as other fields of science. We describe our use of the testbed with the BaBar Monte Carlo production and the ATLAS data challenge software. In each case the remote sites have no application-specific software stored locally and instead access the software and data via AFS and/or GridFTP from servers located in Victoria. In the case of BaBar, an Objectivity database server was used for data storage. We present the results of a series of initial tests of the Grid testbed using both BaBar and ATLAS applications. The initial results demonstrate the feasibility of using generic Grid resources for HEP applications.

Parallel Algorithm for the Solution of High Order Discretization of Elliptic PDEs

Fluids Engineering, 1998

We present a parallel preconditioned-GMRES algorithm for solving second-order elliptic PDEs defined on rectangular domains with Dirichlet and Neumann boundary conditions, and discretized with piecewise Hermite bicubics. The parallel performance of the algorithm is assessed by way of numerical experiments.

The GridX1 computational Grid: from a set of service-specific protocols to a service-oriented approach

Near Real Time Processing Chain for Suomi NPP Satellite Data

Federating Grids: LCG meets Canadian HEPGrid

A large number of Grids have been developed worldwide. Despite being mostly based on the same underlying middleware, the Globus Toolkit, they are generally not inter-operable for a variety of reasons. We present a method of federating those disparate grids which are based on the Globus Toolkit, together with a concrete example of interfacing the LHC Computing Grid (LCG) with HEPGrid. HEPGrid consists of shared resources, at several Canadian research institutes, which are exposed via Globus gatekeepers, and makes use of Condor-G for resource advertisement, matchmaking and job submission. An LCG Computing Element (CE) based at the TRIUMF Laboratory hosts a HEPGrid User Interface (UI) that is contained within a custom JobManager. This JobManager appears in the LCG information system as a normal CE publishing an aggregation of the HEPGrid resources. The interface interprets the incoming job in terms of HEPGrid UI usage, submits it onto HEPGrid, and implements the JobManager 'poll...

HPGC Introduction

2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum, 2011

BaBar MC production on the Canadian grid using a web services approach

Journal of Physics: Conference Series, 2008

The present paper highlights the approach used to design and implement a web services based BaBar Monte Carlo (MC) production grid using Globus Toolkit version 4. The grid integrates the resources of two clusters at the University of Victoria, using the ClassAd mechanism provided by the Condor-G metascheduler. Each cluster uses the Portable Batch System (PBS) as its local resource

Derandomized Vector Sorting

An instance of the vector sorting problem is a sequence of k-dimensional vectors of length n. A solution to the problem is a permutation of the vectors such that in each dimension the length of the longest decreasing subsequence is O(sqrt(n)). A random permutation solves the problem. Here we derandomize the obvious probabilistic algorithm and obtain a deterministic O(kn^3.5) time algorithm that solves the vector sorting problem. We also apply the algorithm to a book embedding problem.

A domain decomposition algorithm for Hermite collocation problems

Time Series Model Mining with Similarity-Based Neuro-fuzzy Networks and Genetic Algorithms: A Parallel Implementation

Lecture Notes in Computer Science, 2002

... 1 National Research Council of Canada Institute for Information Technology 1200 Montreal Road, Ottawa ON K1A 0R6, Canada julio.valdes@nrc.ca 2 National Research Council of Canada Information Management Services Branch 100 Sussex Drive, Ottawa ON K1A 0R6 ...

Optimizing matrix transposes using a POWER7 cache model and explicit prefetching

Proceedings of the Second International Workshop, Nov 13, 2011

We consider the problem of efficiently computing matrix transposes on the POWER7 architecture. We develop a matrix transpose algorithm that uses cache blocking, cache prefetching and data alignment. We model the POWER7 data cache and memory concurrency and use the model to predict the memory throughput of the proposed matrix transpose algorithm. The performance of our matrix transpose algorithm is up to five times higher than that of the dgetmo routine of the Engineering and Scientific Subroutine Library and is 2.5 times higher than that of the code generated by compiler-inserted prefetching. Numerical experiments indicate a good agreement between the predicted and the measured memory throughput.

A New Model for Probabilistic Information Retrieval on the Web

Statistics: A Series of Textbooks and Monographs, 2004

Memory Synthesis Using AI Methods

Overcoming the processor communication overhead in MPI applications

Parallel Computing with OpenMP on distributed shared memory platforms

Numerical Experiments With Parallel Orderings For ILU Preconditioners

Electronic Transactions on Numerical Analysis (ETNA)

Incomplete factorization preconditioners such as ILU, ILUT and MILU are well-known robust general-purpose techniques for solving linear systems on serial computers. However, they are difficult to parallelize efficiently. Various techniques have been used to parallelize these preconditioners, such as multicolor orderings and subdomain preconditioning. These techniques may degrade the performance and robustness of ILU preconditionings. The purpose of this paper is to perform numerical experiments to compare these techniques in order to assess what are the most effective ways to use ILU preconditioning for practical problems on serial and parallel computers.

Optimizing matrix transposes using a POWER7 cache model and explicit prefetching

Proceedings of the second international workshop on Performance modeling, benchmarking and simulation of high performance computing systems - PMBS '11, 2011

We consider the problem of efficiently computing matrix transposes on the POWER7 architecture. We develop a matrix transpose algorithm that uses cache blocking, cache prefetching and data alignment. We model the POWER7 data cache and memory concurrency and use the model to predict the memory throughput of the proposed matrix transpose algorithm. The performance of our matrix transpose algorithm is up to five times higher than that of the dgetmo routine of the Engineering and Scientific Subroutine Library and is 2.5 times higher than that of the code generated by compiler-inserted prefetching. Numerical experiments indicate a good agreement between the predicted and the measured memory throughput.

The GridX1 computational Grid: from a set of service-specific protocols to a service-oriented approach

21st International Symposium on High Performance Computing Systems and Applications (HPCS'07), 2007

GridX1 is a computational Grid designed and built to link resources at a number of research institutions across Canada. Building upon the experience of designing, deploying and operating the first generation of GridX1, we have designed a second-generation, web-services-based, computational Grid. The second generation of GridX1 leverages the Web Services Resource Framework, implemented by the Globus Toolkit version 4. The value added by GridX1 includes metascheduling, file staging, resource registry and resource monitoring.

Seamless and Secure Authentication for Grid Portals

Web Information Systems and Technologies, 2005

Grid portals typically store user grid credentials in a credential repository. Credential repositories allow users to access Grid portals from any machine having a Web browser, but their usage requires several authentication steps. Current portals require users to explicitly go through these steps, thereby hindering their usability. In this paper we present intuitive and easy to use tools to manage certificates. We also describe the integration of Grid Security Infrastructure authentication into a Java-based SSH terminal tool. Based on these tools, we build an innovative portal authentication mechanism that enables transparent delegation of credentials between clients, grid portal and the credential repository.
