Momme Allalen - Academia.edu
Papers by Momme Allalen
Journal of Open Source Software, 2022
QXTools is a framework for simulating quantum circuits using tensor network methods. Weak simulation is the primary use case: given a quantum circuit and an input state, QXTools efficiently calculates the probability amplitude of a given output configuration or set of configurations. From these amplitudes one can sample from the output distribution using random sampling approaches. QXTools is intended for researchers interested in simulating circuits larger than those possible with full wave-function simulators, or in the research and development of tensor network circuit simulation methods. See Brennan et al. (2021) for more complete background and scaling results, and Brayford et al. (2021) for details about deploying in containerised environments.
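The weak-simulation idea above, contracting a tensor network of gates to obtain a single output amplitude, can be sketched in a few lines of plain NumPy. This is an illustrative toy for a two-qubit circuit (H then CNOT), not the QXTools API:

```python
import numpy as np

# Gate tensors: H is a 2x2 matrix, CNOT a rank-4 tensor
# with indices (out0, out1, in0, in1).
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
CNOT = np.zeros((2, 2, 2, 2))
for a in range(2):
    for b in range(2):
        CNOT[a, (a + b) % 2, a, b] = 1.0

zero = np.array([1.0, 0.0])  # |0> input on each qubit

def amplitude(out0, out1):
    """Contract the network <out0 out1| CNOT (H x I) |00> to one amplitude."""
    basis = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
    return np.einsum("abij,ik,k,j,a,b->",
                     CNOT, H, zero, zero, basis[out0], basis[out1])
```

For this Bell-state circuit only the 00 and 11 amplitudes are non-zero (each 1/sqrt(2)); a tensor network simulator generalises exactly this contraction to many qubits, where choosing a good contraction order is the hard part.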
Graphics Processing Units (GPUs) were originally developed for computer gaming and other graphical tasks, but for many years they have been exploited for general-purpose computing across a number of areas. They offer advantages over traditional CPUs because they have greater computational capability and use high-bandwidth memory systems (memory bandwidth being the main bottleneck for many scientific applications). This Best Practice Guide describes GPUs: it includes information on how to get started with programming GPUs, which cannot be used in isolation but as "accelerators" in conjunction with CPUs, and how to get good performance. Focus is given to NVIDIA GPUs, which are the most widespread today. In Section 2, "The GPU Architecture", the GPU architecture is described, with a focus on the latest "Pascal" generation of NVIDIA GPUs, and attention is given to the architectural r...
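The guide's point that memory bandwidth, not peak compute, bounds many scientific codes is usually captured with the roofline model: attainable throughput is the minimum of the compute peak and bandwidth times arithmetic intensity. A minimal sketch, with hypothetical hardware figures not taken from the guide:

```python
def roofline_gflops(peak_gflops, bandwidth_gbs, intensity_flops_per_byte):
    """Attainable GFLOP/s under the roofline model:
    min(compute peak, memory bandwidth * arithmetic intensity)."""
    return min(peak_gflops, bandwidth_gbs * intensity_flops_per_byte)

# Hypothetical figures; a streaming kernel has ~0.25 flop/byte,
# so both devices are bandwidth-bound, and the GPU wins on bandwidth.
gpu = roofline_gflops(peak_gflops=5000.0, bandwidth_gbs=700.0,
                      intensity_flops_per_byte=0.25)
cpu = roofline_gflops(peak_gflops=1000.0, bandwidth_gbs=100.0,
                      intensity_flops_per_byte=0.25)
```

At such low arithmetic intensity neither device comes close to its compute peak, which is why high-bandwidth GPU memory matters more than raw FLOP/s for these applications.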
2021 3rd International Workshop on Containers and New Orchestration Paradigms for Isolated Environments in HPC (CANOPIE-HPC), 2021
In this paper, to categorize and detect pneumonia from a collection of chest X-ray image samples, we propose a deep learning technique based on object detection, convolutional neural networks, and transfer learning. The proposed model is a combination of the pre-trained model (VGG19) and our designed architecture. The Guangzhou Women and Children's Medical Center in Guangzhou, China provided the chest X-ray dataset used in this study. There are 5,000 samples in the dataset, with 1,583 healthy samples and 4,273 pneumonia samples. Preprocessing techniques such as contrast limited adaptive histogram equalization (CLAHE) and brightness preserving bi-histogram equalization (BBHE) were also used to improve accuracy. Due to the imbalance of the dataset, we adopted some training techniques to improve the learning process of the samples. This network achieved over 99% accuracy thanks to the proposed architecture, which combines two models: the pre-trained VGG19 as feature extractor and our designed convolutional neural network (CNN).
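The preprocessing step named above, CLAHE, is typically applied via a library (e.g. OpenCV's `cv2.createCLAHE`). As a simplified, dependency-free illustration of the underlying idea, here is plain global histogram equalization in NumPy; CLAHE adds tiling and contrast limiting on top of this, so this sketch is a stand-in, not the paper's pipeline:

```python
import numpy as np

def hist_equalize(img):
    """Global histogram equalization for an 8-bit grayscale image.
    Maps each gray level through the normalized cumulative histogram,
    stretching the used intensity range to the full 0..255 scale."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]  # first non-zero CDF value
    lut = np.round((cdf - cdf_min) / (img.size - cdf_min) * 255).astype(np.uint8)
    return lut[img]
```

On a low-contrast X-ray this spreads the gray levels apart, which is the same effect CLAHE achieves locally per tile while limiting noise amplification.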
2021 IEEE/ACM Second International Workshop on Quantum Computing Software (QCS), 2021
Tensor network methods are incredibly effective for simulating quantum circuits, thanks to their ability to efficiently represent and manipulate the wave-functions of large interacting quantum systems. We describe the challenges faced when scaling tensor network simulation approaches to Exascale compute platforms and introduce QuantEx, a framework for tensor network circuit simulation at Exascale.
In spring 2015, the Leibniz Supercomputing Centre (Leibniz-Rechenzentrum, LRZ) installed its new Peta-Scale system SuperMUC Phase2. Selected users were invited to a 28-day extreme scale-out block operation during which they were allowed to use the full system for their applications. The following projects participated in the extreme scale-out workshop: BQCD (Quantum Physics), SeisSol (Geophysics, Seismics), GPI-2/GASPI (Toolkit for HPC), Seven-League Hydro (Astrophysics), ILBDC (Lattice Boltzmann CFD), Iphigenie (Molecular Dynamics), FLASH (Astrophysics), GADGET (Cosmological Dynamics), PSC (Plasma Physics), waLBerla (Lattice Boltzmann CFD), Musubi (Lattice Boltzmann CFD), Vertex3D (Stellar Astrophysics), CIAO (Combustion CFD), and LS1-Mardyn (Material Science). The projects were allowed to use the machine exclusively during the 28-day period, which corresponds to a total of 63.4 million core-hours, of which 43.8 million core-hours were used by the applications, resulting in a ut...
arXiv, 2015
The European project Mont-Blanc, which ran from October 2011 to June 2015, aimed to develop an approach to Exascale computing based on embedded power-efficient technology. The main goals of the project were to i) build an HPC prototype using currently available energy-efficient embedded technology, ii) design a next-generation system to overcome the limitations of the prototype, and iii) port a set of representative Exascale applications to the system. This article summarises the contributions of the Leibniz Supercomputing Centre (LRZ) and the Juelich Supercomputing Centre (JSC), Germany, to the Mont-Blanc project.
SC16: International Conference for High Performance Computing, Networking, Storage and Analysis, 2016
High-performance computing (HPC) is recognized as one of the pillars of further advances in science, industry, medicine, and education. Current HPC systems are being developed to overcome emerging challenges in order to reach the Exascale level of performance, which is expected by the year 2020. The much larger embedded and mobile market allows for rapid development of IP blocks and provides more flexibility in designing an application-specific SoC, in turn giving the possibility of balancing performance, energy efficiency, and cost. In the Mont-Blanc project, we advocate for HPC systems to be built from such commodity IP blocks, currently used in embedded and mobile SoCs. As a first demonstrator of such an approach, we present the Mont-Blanc prototype: the first HPC system built with commodity SoCs, memories, and NICs from the embedded and mobile domain, and off-the-shelf HPC networking, storage, cooling, and integration solutions. We present the system's architecture and an evaluation covering both performance and energy efficiency. Further, we compare the system's abilities against a production-level supercomputer. Finally, we discuss parallel scalability and estimate the maximum scalability point of this approach across a set of applications.
With the rapidly growing demand for computing power, new accelerator-based architectures have entered the world of high-performance computing over roughly the last five years. GPGPUs in particular have recently become very popular; however, programming GPGPUs using languages like CUDA or OpenCL is cumbersome and error-prone. To overcome these difficulties, Intel developed its own Many Integrated Core (MIC) architecture, which can be programmed using standard parallel programming techniques like OpenMP and MPI. At the beginning of 2013, the first production-level cards, named Intel Xeon Phi, came on the market. Intel has considered LRZ a leading research centre for evaluating coprocessors based on the MIC architecture since 2010, under strict NDA. Since the Intel Xeon Phi is now generally available, we can share our experience of programming Intel's new MIC architecture.
arXiv, 2021
The simulation of quantum circuits using the tensor network method is very computationally demanding and requires significant High Performance Computing (HPC) resources to find an efficient contraction order and to perform the contraction of the large tensor networks. In addition, researchers want a workflow that is easy to customize, reproduce, and migrate to different HPC systems. In this paper, we discuss the issues associated with deploying the QuantEx quantum computing simulation software within containers on different HPC systems, and we compare the performance of the containerized software with the software running on "bare metal". Keywords: containers, quantum computing, software development, HPC
As the complexity and size of challenges in science and engineering are continually increasing, it is highly important that applications are able to scale strongly to very large numbers of cores (>100,000) so that HPC systems can be utilised efficiently. This paper presents results of strong scaling tests performed with an MPI-only and a hybrid MPI + OpenMP version of the Lattice QCD application BQCD on the European Tier-0 system SuperMUC at LRZ.
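Why strong scaling to >100,000 cores is hard can be made concrete with Amdahl's law, which is not taken from the paper but illustrates how even a tiny non-parallelizable fraction caps the achievable speedup:

```python
def amdahl_speedup(serial_fraction, cores):
    """Amdahl's-law speedup for a fixed problem size: the serial part
    takes serial_fraction of the single-core time and does not shrink
    as cores are added."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

def parallel_efficiency(serial_fraction, cores):
    """Speedup divided by core count: 1.0 means perfect strong scaling."""
    return amdahl_speedup(serial_fraction, cores) / cores
```

With even 1% serial work, 100,000 cores yield a speedup below 100, i.e. under 0.1% efficiency; this is why strong-scaling studies such as the BQCD tests focus on driving the serial and communication fractions toward zero.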
November 2002, "Magister". Supervisors: Dr. H. Bouzar (Tizi-Ouzou, Algeria) with the collaboration of Dr. V. Pierron-Bohnes and Dr. C. Goyhenex (Strasbourg, France). Mouloud Mammeri University, Tizi-Ouzou (Algeria) and Louis Pasteur University, Strasbourg (France).