Progress report on the online processing upgrade at the NA62 experiment

GPU real-time processing in NA62 trigger system

Journal of Physics: Conference Series

A commercial Graphics Processing Unit (GPU) is used to build a fast Level 0 (L0) trigger system, tested parasitically with the TDAQ (Trigger and Data Acquisition) systems of the NA62 experiment at CERN. In particular, the parallel computing power of the GPU is exploited to perform real-time fitting in the Ring Imaging CHerenkov (RICH) detector. Direct GPU communication through an FPGA-based board is used to reduce the data transmission latency. The performance of the system for multi-ring reconstruction, obtained during the NA62 physics run, is presented.

GPU-based Real-time Triggering in the NA62 Experiment

arXiv (Cornell University), 2016

Over the last few years the GPGPU (General-Purpose computing on Graphics Processing Units) paradigm has represented a remarkable development in the world of computing. Computing for High-Energy Physics is no exception: several works have demonstrated the effectiveness of integrating GPU-based systems in the high level trigger of different experiments. On the other hand, the use of GPUs in low level trigger systems, characterized by stringent real-time constraints such as a tight time budget and high throughput, poses several challenges. In this paper we focus on the low level trigger of the CERN NA62 experiment, investigating the use of real-time computing on GPUs in this synchronous system. Our approach aims at harvesting the GPU computing power to build refined physics-related trigger primitives for the RICH detector in real time, since knowledge of the Cerenkov ring parameters makes it possible to build stringent conditions for data selection at trigger level. Latencies of all components of the trigger chain have been analyzed, showing that networking is the most critical one. To keep the latency of the data transfer task under control, we devised NaNet, an FPGA-based PCIe Network Interface Card (NIC) with GPUDirect capabilities. For the processing task, we developed specific multiple-ring trigger algorithms that leverage the parallel architecture of GPUs and increase the processing throughput to keep up with the high event rate. Results obtained during the first months of the 2016 NA62 run are presented and discussed.
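As an illustration of what "ring parameters" means here, the following is a minimal sketch of a single-ring fit. It assumes an algebraic least-squares circle fit to hit positions; the abstract does not state which fitting method the authors use, and the function name `fit_ring` and the hit format are hypothetical. A GPU implementation would run many such fits in parallel, one per event or per ring candidate.

```python
import math


def fit_ring(hits):
    """Algebraic least-squares circle fit (illustrative, not the NA62 code).

    hits: list of (x, y) hit positions.
    Returns (centre_x, centre_y, radius).
    """
    n = len(hits)
    if n < 3:
        raise ValueError("need at least three hits to fit a ring")

    # Work in coordinates centred on the mean hit position.
    mean_x = sum(x for x, _ in hits) / n
    mean_y = sum(y for _, y in hits) / n

    suu = svv = suv = suuu = svvv = suvv = svuu = 0.0
    for x, y in hits:
        u, v = x - mean_x, y - mean_y
        suu += u * u
        svv += v * v
        suv += u * v
        suuu += u * u * u
        svvv += v * v * v
        suvv += u * v * v
        svuu += v * u * u

    # Solve the 2x2 normal equations for the centre (uc, vc).
    det = suu * svv - suv * suv
    if abs(det) < 1e-12:
        raise ValueError("hits are collinear; no ring can be fitted")
    rhs_u = 0.5 * (suuu + suvv)
    rhs_v = 0.5 * (svvv + svuu)
    uc = (rhs_u * svv - rhs_v * suv) / det
    vc = (rhs_v * suu - rhs_u * suv) / det

    radius = math.sqrt(uc * uc + vc * vc + (suu + svv) / n)
    return mean_x + uc, mean_y + vc, radius
```

The fit has a closed form (sums followed by a 2x2 solve), so its cost per ring is fixed and data-independent, which is the kind of property a latency-bounded trigger needs.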

The FPGA based Trigger and Data Acquisition system for the CERN NA62 experiment

Journal of Instrumentation, 2014

The main goal of the NA62 experiment at CERN is to measure the branching ratio of the ultra-rare $K^+ \to \pi^+ \nu\bar{\nu}$ decay, collecting about 100 events to test the Standard Model of Particle Physics. Readout uniformity of sub-detectors, scalability, efficient online selection and lossless high-rate readout are key issues. The TDCB and TEL62 boards are the common building blocks of the NA62 TDAQ system. TDCBs measure hit times from sub-detectors; TEL62s process and store them in a buffer, extracting only those requested by the trigger system following the matching of trigger primitives produced inside the TEL62s themselves. During the NA62 Technical Run at the end of 2012, the TALK board was used as a prototype version of the L0 Trigger Processor.
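The store-then-extract behaviour described for the TEL62 can be sketched in software as a time-ordered buffer queried around a trigger timestamp. This is only a conceptual model under stated assumptions: the class `HitBuffer`, its method names, the `(time_ns, channel)` hit format and the symmetric matching window are all hypothetical, and the real boards implement this logic in FPGA firmware.

```python
import bisect


class HitBuffer:
    """Time-ordered hit store (conceptual model, not the TEL62 firmware)."""

    def __init__(self):
        self._times = []  # hit times in ns, kept sorted
        self._hits = []   # (time_ns, channel) tuples, same order as _times

    def store(self, time_ns, channel):
        """Insert a hit while keeping the buffer sorted by time."""
        i = bisect.bisect_right(self._times, time_ns)
        self._times.insert(i, time_ns)
        self._hits.insert(i, (time_ns, channel))

    def extract(self, trigger_time_ns, half_window_ns):
        """Return the hits within +/- half_window_ns of the trigger time."""
        lo = bisect.bisect_left(self._times, trigger_time_ns - half_window_ns)
        hi = bisect.bisect_right(self._times, trigger_time_ns + half_window_ns)
        return self._hits[lo:hi]
```

For example, after storing hits at 10, 25, 26 and 40 ns, `extract(25.0, 2.0)` returns only the two hits at 25 and 26 ns; everything outside the window stays in the buffer and is never read out.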

GPUs for the realtime low-level trigger of the NA62 experiment at CERN

2015

A pilot project for the use of GPUs (Graphics Processing Units) in online triggering applications for high energy physics (HEP) experiments is presented. GPUs offer a highly parallel architecture in which most of the chip resources are devoted to computation. Moreover, they deliver large computing power using a limited amount of space and power. The application of online parallel computing on GPUs is shown for the synchronous low level trigger of the NA62 experiment at CERN. Direct GPU communication through an FPGA-based board has been exploited to reduce the data transmission latency, and results of a first field test at CERN are highlighted. This work is part of a wider project named GAP (GPU Application Project), intended to study the use of GPUs in real-time applications in both HEP and medical imaging.

NaNet-10: a 10GbE network interface card for the GPU-based low-level trigger of the NA62 RICH detector

Journal of Instrumentation, 2016

A GPU-based low level (L0) trigger is currently integrated in the experimental setup of the RICH detector of the NA62 experiment to assess the feasibility of building more refined physics-related trigger primitives and thus improving the trigger discriminating power. To ensure the real-time operation of the system, a dedicated data transport mechanism has been implemented: an FPGA-based Network Interface Card (NaNet-10) receives data from detectors and forwards them with low, predictable latency to the memory of the GPU performing the trigger algorithms. Results of the reconstruction of ring-shaped hit patterns are reported and discussed. Keywords: Data processing methods; Trigger concepts and systems (hardware and software); Online farms and online filtering.

GPUs for fast triggering and pattern matching at the CERN experiment NA62

2009 IEEE Nuclear Science Symposium Conference Record (NSS/MIC), 2009

In high energy physics experiments the trigger system is crucial to reduce the quantity of data recorded on tape and the acquisition bandwidth requirements. This is particularly true in rare-decay experiments. The NA62 experiment aims at measuring the branching ratio of $K^+ \to \pi^+ \nu\bar{\nu}$, predicted in the Standard Model (SM) at the level of $\sim 10^{-10}$. In this paper we describe the idea of using a commercial video card processor (GPU) to construct a fast and effective trigger system, at both the hardware and software levels. Because it relies on off-the-shelf technology, in continuous development for other purposes, the architecture described could easily be exported to other experiments to build a versatile and fully customizable trigger system.
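A rough back-of-the-envelope relation shows why such a small branching ratio puts pressure on the trigger. This is an illustration rather than a calculation from the paper: $N_K$ denotes the number of kaon decays observed and $A$ the signal acceptance, a symbol introduced here with no value assumed.

```latex
N_{\mathrm{sig}} \approx N_K \cdot \mathrm{BR}(K^+ \to \pi^+ \nu\bar{\nu}) \cdot A
\quad\Longrightarrow\quad
N_K \approx \frac{N_{\mathrm{sig}}}{\mathrm{BR} \cdot A}
```

With $\mathrm{BR} \sim 10^{-10}$, each signal event requires of order $10^{10}/A$ kaon decays, so almost everything the detector sees is background that the trigger must reject online.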

L0TP+: the Upgrade of the NA62 Level-0 Trigger Processor

2020

The L0TP+ initiative is aimed at the upgrade of the FPGA-based Level-0 Trigger Processor (L0TP) of the NA62 experiment at CERN for the post-LS2 data taking, which is expected to happen at 100% of design beam intensity, corresponding to about $3.3 \times 10^{12}$ protons per pulse on the beryllium target used to produce the kaon beam. Although tests performed at the end of 2018 showed substantial robustness of the L0TP system even at full beam intensity, there are several reasons to motivate such an upgrade: i) avoid FPGA platform obsolescence, ii) make room for improvements in the firmware design leveraging a more capable FPGA device, iii) add new functionalities, iv) support the ×4 beam intensity increase foreseen in future experiment upgrades. We singled out the Xilinx Virtex UltraScale+ VCU118 development board as the ideal platform for the project. L0TP+ seamless integration into the current NA62 TDAQ system and exact matching of L0TP functionalities represent the main requirements and ...

Trigger algorithm development on FPGA-based Compute Nodes

2009 16th IEEE-NPSS Real Time Conference, 2009

Based on the ATCA computation architecture and Compute Nodes (CN), investigation and implementation work has been carried out for the HADES and PANDA trigger algorithms. We present our designs for HADES track reconstruction processing, Cherenkov ring recognition, Time-Of-Flight processing, electromagnetic shower recognition, and the PANDA straw tube tracking algorithm. They will appear as co-processors in the uniform system design to undertake the detector-specific computing. The algorithm principles are explained and the hardware designs are described in the paper. The current progress shows that it is feasible to implement these algorithms on FPGAs. Experimental results also demonstrate the performance speedup compared to alternative software solutions, as well as the potential for high-speed parallel/pipelined processing in Data Acquisition and Trigger systems.

Real-time use of GPUs in NA62 experiment

… Networks and Their …, 2012

We describe a pilot project for the use of GPUs in a real-time triggering application in the early trigger stages at the CERN NA62 experiment, and the results of the first field tests together with a prototype data acquisition (DAQ) system. This pilot project within NA62 aims at integrating GPUs into the central L0 trigger processor, and also at using them as fast online processors for computing trigger primitives. Several TDC-equipped sub-detectors with sub-nanosecond time resolution will participate in the first-level NA62 trigger (L0), fully integrated with the data-acquisition system, to reduce the readout rate of all sub-detectors to 1 MHz, using multiplicity information asynchronously computed over time frames of a few ns, both for positive sub-detectors and for vetoes. The online use of GPUs would allow the computation of more complex trigger primitives already at this first trigger level.
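The multiplicity-based selection mentioned above can be sketched as counting hits per fixed-width time frame and keeping the frames that reach a threshold. This is a minimal sketch under stated assumptions: the function name `multiplicity_primitives`, the frame width and the threshold are hypothetical parameters, not values taken from the paper.

```python
from collections import Counter


def multiplicity_primitives(hit_times_ns, frame_ns, min_hits):
    """Count hits per time frame and keep frames with enough hits.

    hit_times_ns: iterable of hit times in ns.
    frame_ns:     frame width in ns (assumed value, chosen by the caller).
    min_hits:     minimum multiplicity for a frame to yield a primitive.
    Returns a sorted list of (frame_index, multiplicity) pairs.
    """
    # Floor division assigns each hit to the frame that contains it.
    counts = Counter(int(t // frame_ns) for t in hit_times_ns)
    return sorted((frame, n) for frame, n in counts.items() if n >= min_hits)
```

With hits at 0.5, 1.2, 3.9, 6.3, 7.5, 12.1, 13.0, 13.5 and 14.9 ns, a 5 ns frame and a threshold of 3, the function returns `[(0, 3), (2, 4)]`: frame 1 holds only two hits and is dropped. A veto detector would use the same counts with the opposite logic, rejecting frames that contain activity.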

Real-time heterogeneous stream processing with NaNet in the NA62 experiment

Journal of Physics: Conference Series, 2018

The use of GPUs to implement general purpose computational tasks, known as GPGPU for some fifteen years, has reached maturity. Applications take advantage of the parallel architectures of these devices in many different domains. Over the last few years several works have demonstrated the effectiveness of integrating GPU-based systems in the high level trigger of various HEP experiments. On the other hand, the use of GPUs in the DAQ and low level trigger systems, characterized by stringent real-time constraints, poses several challenges. To this end we devised NaNet, an FPGA-based PCI-Express Network Interface Card design capable of direct (zero-copy) data transfer to and from CPU and GPU memory (GPUDirect) while processing incoming and outgoing data streams online. The board also supports multiple link technologies (1/10/40GbE and custom ones). The validity of our approach has been tested in the context of the NA62 CERN experiment, harvesting the computing power of latest-generation NVIDIA Pascal GPUs and of the FPGA hosted by NaNet to build in real time refined physics-related primitives for the RICH detector (i.e. the Cerenkov ring parameters) that enable more stringent conditions for data selection in the low level trigger.


Related papers