Giuliano Laccetti | Università degli Studi di Napoli "Federico II"
Papers by Giuliano Laccetti
Lecture Notes in Computer Science, 2023
2022 21st International Symposium on Parallel and Distributed Computing (ISPDC)
2022 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid)
Internet and Distributed Computing Systems, 2019
The K-means algorithm is one of the most widely used methods in data mining and statistical data analysis to partition several objects into K distinct groups, called clusters, on the basis of their similarities. The main problem of this algorithm is that it requires the number of clusters as an input, but in real life it is very difficult to fix such a value in advance. In this work we propose a parallel modified K-means algorithm where the number of clusters is increased at run time in an iterative procedure until a given cluster quality metric is satisfied. To improve the performance of the procedure, at each iteration two new clusters are created by splitting only the cluster with the worst value of the quality metric. Furthermore, experiments in a multi-core CPU-based environment are presented.
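The following is a minimal sketch of the iterative splitting strategy summarized in this abstract, assuming scikit-learn's KMeans as the base clusterer and the per-cluster mean squared distance to the centroid as the quality metric; neither assumption comes from the paper, and the multi-core execution is not shown.

```python
# Minimal sketch of the "split the worst cluster" idea described above.
# Assumptions (not from the paper): scikit-learn's KMeans as the base clusterer,
# per-cluster mean squared distance to the centroid as the quality metric, and a
# fixed threshold as the stopping criterion.
import numpy as np
from sklearn.cluster import KMeans

def cluster_quality(points, centroid):
    """Mean squared distance of the cluster's points to its centroid (lower is better)."""
    return np.mean(np.sum((points - centroid) ** 2, axis=1))

def splitting_kmeans(data, threshold=1.0, max_clusters=32):
    k = 2
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(data)
    while k < max_clusters:
        centroids = np.array([data[labels == c].mean(axis=0) for c in range(k)])
        qualities = [cluster_quality(data[labels == c], centroids[c]) for c in range(k)]
        worst = int(np.argmax(qualities))
        if qualities[worst] <= threshold:          # every cluster satisfies the metric
            break
        # Split only the worst cluster into two new clusters.
        sub = data[labels == worst]
        sub_labels = KMeans(n_clusters=2, n_init=10).fit_predict(sub)
        new_labels = labels.copy()
        new_labels[labels == worst] = np.where(sub_labels == 0, worst, k)
        labels, k = new_labels, k + 1
    return labels, k
```

In this sketch, splitting the worst cluster replaces one cluster with two, so K grows by one per iteration until every cluster satisfies the threshold.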
Parallel Processing and Applied Mathematics, 2018
Future Generation Computer Systems, 2020
Persistent scatterer interferometry (PSI) techniques using amplitude analysis and considering a temporal deformation model for PS pixel selection are unable to identify PS pixels in rural areas lacking human-made structures. In contrast, high rates of land subsidence lead to significant phase-unwrapping errors in a recently developed PSI algorithm (StaMPS) that applies phase stability and amplitude analysis to select the PS pixels in rural areas. The objective of this paper is to present an enhanced algorithm based on PSI to estimate the deformation rate in rural areas undergoing high and nearly constant rates of deformation. The proposed approach integrates the strengths of all of the existing PSI algorithms in PS pixel selection and phase unwrapping. PS pixels are first selected based on the amplitude information and phase-stability estimation as performed in StaMPS. The phase-unwrapping step, including the deformation rate and phase-ambiguity estimation, is then performed using least-squares ambiguity decorrelation adjustment (LAMBDA). The atmospheric phase screen (APS) and the nonlinear deformation contribution to the phase are estimated by applying a high-pass temporal filter to the residuals derived from the LAMBDA method. The final deformation rate and the ambiguity parameter are re-estimated after subtracting the APS and the nonlinear deformation from the initial phase. The proposed method is applied to 22 ENVISAT ASAR images of the southwestern Tehran basin captured between 2003 and 2008. A quantitative comparison with the results obtained from leveling and GPS measurements demonstrates the significant improvement of the PSI technique.
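As an illustration of the temporal filtering step mentioned in this abstract, the sketch below takes the temporally high-pass component of the per-pixel phase residuals as the APS plus nonlinear deformation contribution; the Gaussian low-pass kernel and its width are assumptions, not the paper's exact filter.

```python
# Minimal sketch of the temporal filtering step described above: the high-pass
# component of the per-pixel phase residuals is taken as the APS plus nonlinear
# deformation contribution. The Gaussian low-pass kernel and its width are
# illustrative assumptions.
import numpy as np
from scipy.ndimage import gaussian_filter1d

def highpass_temporal(residuals, sigma=2.0):
    """residuals: array of shape (n_acquisitions, n_ps_pixels), phase in radians."""
    lowpass = gaussian_filter1d(residuals, sigma=sigma, axis=0, mode="nearest")
    return residuals - lowpass   # temporally high-pass part: APS + nonlinear deformation

# After subtracting this contribution, the deformation rate and phase ambiguities
# would be re-estimated (e.g., via LAMBDA) on the corrected phase.
```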
Data Assimilation (DA) is an uncertainty quantification technique used to incorporate observed data into a prediction model in order to improve numerical forecasts. The forecasting model used for producing oceanographic predictions in the Caspian Sea is the Regional Ocean Modeling System (ROMS). Here we discuss the computational issues we are facing in a DA software we are developing, named S3DVAR, which implements a Scalable Three-Dimensional Variational Data Assimilation model for assimilating sea surface temperature (SST) values in the Caspian Sea, using observations provided by the Group for High Resolution Sea Surface Temperature (GHRSST). We present the algorithmic strategies we employ and the numerical issues on data collected in the two months that show the most significant variability in water temperature: August and March.
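For context, a three-dimensional variational scheme of this kind minimizes the standard 3D-Var cost function; the sketch below shows that minimization in its simplest dense form, with illustrative placeholder matrices B, R and observation operator H, and without the scalable decomposition the paper addresses.

```python
# Minimal sketch of the standard 3D-Var cost function minimized by a variational
# scheme such as the one described above. B_inv, R_inv and H are illustrative
# placeholders (background-error and observation-error inverse covariances, and
# a linear observation operator); the paper's scalable decomposition is not shown.
import numpy as np
from scipy.optimize import minimize

def three_dvar(xb, y, H, B_inv, R_inv):
    """xb: background state, y: observations (e.g. SST), H: linear observation operator."""
    def cost(x):
        dxb = x - xb
        dy = H @ x - y
        return 0.5 * dxb @ B_inv @ dxb + 0.5 * dy @ R_inv @ dy
    def grad(x):
        return B_inv @ (x - xb) + H.T @ R_inv @ (H @ x - y)
    res = minimize(cost, xb, jac=grad, method="L-BFGS-B")
    return res.x   # analysis state
```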
High Performance Scientific Computing Using Distributed Infrastructures, 2016
2022 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid)
Proceedings of the 1st ACM workshop on Data grids for eScience - DaGreS '09, 2009
Because of the dynamic and heterogeneous nature of a grid infrastructure, the client/server paradigm is a common programming model for these environments, where the client submits requests to several geographically remote servers to execute already deployed applications on its own data. According to this model, the applications are usually decomposed into independent tasks that are solved concurrently by the servers (the so-called Data Grid applications). On the other hand, since many scientific applications are characterized by very large sets of input data and by dependencies among subproblems, avoiding unnecessary synchronizations and data transfers is a difficult task. This work addresses the problem of implementing a strategy for efficient task scheduling and data management in the presence of data dependencies among subproblems of the same linear algebra application. For the purpose of the experiments, the NetSolve distributed computing environment has been used, and some minor changes have been introduced to the underlying Distributed Storage Infrastructure in order to implement the proposed strategies.
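As a rough illustration of the data-locality idea behind such a strategy, the sketch below assigns a task with data dependencies to the server that already holds most of its input blocks, so that only the missing blocks are transferred; the server names, block sizes and policy are illustrative assumptions, and the NetSolve/DSI interfaces are not reproduced here.

```python
# Minimal sketch of a data-locality-aware assignment policy: a task is scheduled
# on the server that already caches most of its input blocks, minimizing transfers.
# Server names, sizes and the policy itself are illustrative assumptions.
def pick_server(task_inputs, server_storage, block_size):
    """task_inputs: set of data-block ids; server_storage: dict server -> set of held block ids."""
    best_server, best_cost = None, float("inf")
    for server, held in server_storage.items():
        missing = task_inputs - held
        cost = len(missing) * block_size          # bytes to move to this server
        if cost < best_cost:
            best_server, best_cost = server, cost
    return best_server, best_cost

# Example: a task needing blocks {"A11", "A12", "B1"} goes to the server already
# caching two of them, so only one block is transferred.
servers = {"s1": {"A11", "A12"}, "s2": {"B1"}}
print(pick_server({"A11", "A12", "B1"}, servers, block_size=8 * 1024**2))
```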
We provide a multilevel approach for analysing the performance of parallel algorithms. The main outcome of this approach is that the algorithm is described by a set of operators that are related to each other according to the problem decomposition. The decomposition level determines the granularity of the algorithm. A set of block matrices (decomposition and execution) highlights fundamental characteristics of the algorithm, such as its inherent parallelism and its sources of overhead.
Concurrency and Computation: Practice and Experience, 2020
In this work, we perform the feasibility analysis of a multi-grained parallel version of the Zielonka Recursive (ZR) algorithm exploiting coarse- and fine-grained concurrency. Coarse-grained parallelism relies on a suitable splitting of the problem, that is, a decomposition of the graph into its Strongly Connected Components (SCC) or a splitting of the formula generating the game, while fine-grained parallelism is introduced inside the Attractor, which is the most intensive computational kernel. This configuration is new and addressed for the first time in this article. The innovation starts from the introduction of properly defined metrics for the strong and weak scaling of the algorithm. From the analysis of the values of these metrics for the fine-grained algorithm, we can infer the expected performance of the multi-grained parallel algorithm running in a distributed and hybrid computing environment. Results confirm that, while fine-grained parallelism has a clear performance limitation, the performance gain we can expect from employing multilevel parallelism is significant.
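The Attractor named in this abstract is the standard attractor computation on parity games; a sequential Python sketch follows, with the game representation (owner map, successor and predecessor lists) chosen purely for illustration. The fine-grained parallelism discussed in the article would distribute the frontier expansion, which is not shown here.

```python
# Minimal sequential sketch of the attractor computation, the most intensive
# kernel of the Zielonka Recursive algorithm. The game representation is an
# illustrative assumption: owner[v] gives the player owning node v, succ/pred
# give successor and predecessor lists.
def attractor(nodes, owner, succ, pred, player, target):
    """Set of nodes from which `player` can force the play into `target`."""
    attr = set(target)
    # successors still outside the attractor, used for opponent-owned nodes
    out_count = {v: len(succ[v]) for v in nodes}
    frontier = list(target)
    while frontier:
        u = frontier.pop()
        for v in pred[u]:
            if v in attr:
                continue
            if owner[v] == player:
                attr.add(v); frontier.append(v)     # player picks the edge into attr
            else:
                out_count[v] -= 1
                if out_count[v] == 0:               # opponent has no escape left
                    attr.add(v); frontier.append(v)
    return attr
```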
Proceedings of 36th International Conference on High Energy Physics — PoS(ICHEP2012), 2013
The development of a computing model for the next generation of Super Flavour Factories, like SuperB and SuperKEKB, presents significant challenges. With a nominal luminosity above 10^36 cm^-2 s^-1, we estimate that, after a few years of operation, the size of the data sample will be of the order of 500 PB and the amount of CPU required to process it will be close to 5000 kHEP-SPEC06 (HEP-SPEC06 is the HEP-wide benchmark for measuring CPU performance). The new many-core and multi-core technologies need to be effectively exploited in order to manage very large data sets, and this has a potentially large impact on the computing model for SuperB. In addition, the computing resources available to SuperB, as is already the case for the LHC experiments, will be distributed and accessed through a Grid or a cloud infrastructure, and a suite of efficient and reliable tools needs to be provided to the users. A dedicated research program to explore these issues is in progress and is presented here.
The rise of new multicore CPUs introduced new challenges in the design of concurrent data structures: in addition to traditional requirements like correctness, linearizability and progress, scalability is of paramount importance. It is a common opinion that these demands are partially in conflict with each other, so that in these computational environments it is necessary to relax the requirements on the traditional features of the data structures. In this paper we introduce a relaxed approach to the management of heap-based priority queues on multicore CPUs, with the aim of realizing a tradeoff between efficiency and sequential correctness. The approach is based on sharing information among only a small number of cores, so as to improve performance without completely losing the features of the data structure. The results obtained on a numerical algorithm show significant benefits in terms of parallel efficiency.
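A minimal sketch of the relaxation idea follows, assuming cores are partitioned into small groups that each share one lock-protected binary heap, so a pop returns the minimum of the caller's group rather than the global minimum; this is an illustrative simplification, not the paper's data structure.

```python
# Minimal sketch of a relaxed, group-shared priority queue: each small group of
# cores shares one lock-protected heap, so contention is limited to the group and
# a pop may return an element that is not the global minimum (the relaxation).
# This is an illustrative simplification, not the paper's data structure.
import heapq
import threading

class GroupedPriorityQueue:
    def __init__(self, n_groups=4):
        self._heaps = [[] for _ in range(n_groups)]
        self._locks = [threading.Lock() for _ in range(n_groups)]
        self._n = n_groups

    def push(self, core_id, priority, item):
        g = core_id % self._n                      # information shared only inside the group
        with self._locks[g]:
            heapq.heappush(self._heaps[g], (priority, item))

    def pop(self, core_id):
        g = core_id % self._n
        with self._locks[g]:                       # group-local minimum, relaxed w.r.t. global order
            if self._heaps[g]:
                return heapq.heappop(self._heaps[g])
        return None
```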