High-Performance Reconfigurable Computing
Related papers
Quantitative analysis of FPGA-based database searching
2001
This paper reports two contributions to the theory and practice of using reconfigurable hardware to implement search engines based on hashing techniques. The first contribution concerns technology-independent optimisations involving run-time reconfiguration of the hash functions; a quantitative framework is developed for estimating design trade-offs, such as the amount of temporary storage versus reconfiguration time. The second contribution concerns methods for optimising implementations in Xilinx FPGA technology, which achieve different trade-offs in cell utilisation, reconfiguration time and critical path delay; a quantitative analysis of these trade-offs is provided.
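As a rough illustration of the trade-off the paper quantifies, the following software sketch models a hash-based search table whose hash function can be reconfigured at run time: swapping the hash parameters forces every stored key through temporary storage and a rehash, mirroring the storage-versus-reconfiguration-time trade-off. The multiplier/shift hash family and all identifiers are illustrative assumptions, not the paper's actual design.

```cpp
// Minimal software model of a hash-based search table whose hash function
// can be "reconfigured" at run time.  Swapping the hash parameters forces
// all stored keys to be rehashed via temporary storage, which is the
// trade-off the paper quantifies.  The hash family and names below are
// illustrative assumptions, not the paper's design.
#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

struct HashConfig { uint64_t multiplier; unsigned shift; };   // hypothetical hash family

static size_t hash_key(const std::string& key, const HashConfig& cfg, size_t buckets) {
    uint64_t h = 1469598103934665603ULL;                      // FNV-1a style seed
    for (unsigned char c : key) { h ^= c; h *= cfg.multiplier; }
    return (h >> cfg.shift) % buckets;
}

class ReconfigurableHashTable {
public:
    ReconfigurableHashTable(size_t buckets, HashConfig cfg) : table_(buckets), cfg_(cfg) {}

    void insert(const std::string& key) {
        table_[hash_key(key, cfg_, table_.size())].push_back(key);
    }

    bool contains(const std::string& key) const {
        for (const auto& k : table_[hash_key(key, cfg_, table_.size())])
            if (k == key) return true;
        return false;
    }

    // "Reconfigure" the hash function: every stored key is moved through
    // temporary storage and rehashed under the new configuration.
    void reconfigure(HashConfig new_cfg) {
        std::vector<std::string> temp;                        // temporary storage cost
        for (auto& bucket : table_)
            for (auto& k : bucket) temp.push_back(std::move(k));
        for (auto& bucket : table_) bucket.clear();
        cfg_ = new_cfg;
        for (auto& k : temp) insert(k);                       // reconfiguration time cost
    }

private:
    std::vector<std::vector<std::string>> table_;
    HashConfig cfg_;
};

int main() {
    ReconfigurableHashTable t(64, {1099511628211ULL, 3});
    t.insert("record_A");
    t.insert("record_B");
    t.reconfigure({40503ULL, 5});                             // swap hash parameters
    std::cout << t.contains("record_A") << "\n";              // still found after rehash
}
```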
Scalable Architecture for Computationally Intensive Applications
IASET, 2013
High-performance computing platforms are used for workloads with complex computational requirements, significant processing-time demands, or large volumes of data to process. With the advent of low-cost Field Programmable Gate Arrays (FPGAs), building hardware with a parallel architecture for computationally intensive applications has become practical, since FPGAs offer massively parallel architectures. This paper presents an FPGA-based, scalable parallel architecture for the hardware implementation of computationally intensive applications. The aim of this work is to design a reconfigurable, parallel and scalable high-performance computing platform to accelerate computations. The cryptanalysis of the Advanced Encryption Standard (AES) algorithm is used as a proof of concept.
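The scalability argument can be illustrated with a toy software model of key-space partitioning, with CPU threads standing in for the parallel FPGA engines; check_key() is a placeholder for the AES known-plaintext test, and all sizes are illustrative assumptions rather than the paper's design.

```cpp
// Toy model of the key-space partitioning behind a scalable brute-force
// search: the range is split evenly across N parallel engines (CPU threads
// standing in for FPGA cores).  check_key() is a placeholder; a real AES
// cryptanalysis core would expand the candidate key and compare a known
// plaintext/ciphertext pair.  Sizes are deliberately tiny.
#include <atomic>
#include <cstdint>
#include <iostream>
#include <thread>
#include <vector>

static bool check_key(uint64_t key) {                 // placeholder predicate
    return key == 0xBEEF42ULL;                        // pretend this is the secret key
}

int main() {
    const unsigned engines = 8;                       // degree of parallelism: scale freely
    const uint64_t key_space = 1ULL << 24;            // reduced key space for the sketch
    std::atomic<uint64_t> found{UINT64_MAX};

    std::vector<std::thread> pool;
    for (unsigned e = 0; e < engines; ++e) {
        pool.emplace_back([&, e] {
            const uint64_t begin = key_space / engines * e;
            const uint64_t end   = (e + 1 == engines) ? key_space
                                                      : key_space / engines * (e + 1);
            for (uint64_t k = begin; k < end && found.load() == UINT64_MAX; ++k)
                if (check_key(k)) { found.store(k); break; }
        });
    }
    for (auto& t : pool) t.join();
    std::cout << "found key: 0x" << std::hex << found.load() << "\n";
}
```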
Reconfigurable Platforms for High Performance Processing
2007
This work deals with reconfigurable computing platforms for high-speed simulation of physical phenomena based on numerical models of algebraic linear systems. This type of simulation is of great importance in research centres such as CENPES/Petrobras, which develops geophysical-processing applications for oil and gas prospecting. Currently, these applications are implemented on conventional PC clusters. A new approach to this type of problem is presented here, based on reconfigurable computing systems using Field Programmable Gate Array (FPGA) technology, together with its implications for hardware/software partitioning, the operating system, memory connections, communication and device drivers. Such technologies make appreciable gains possible in terms of performance, both in electric power and in processing speed, compared to conventional clusters. This solution also reduces cost when applied to the massive, highly complex, large-data computations commonly found in scientific computing.
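As a hint of the kind of kernel such a platform offloads, here is a small Jacobi-iteration sketch for a dense linear system A·x = b; the matrix, size and convergence threshold are illustrative assumptions, not taken from the paper.

```cpp
// Jacobi iteration for A*x = b, the kind of algebraic-linear-system kernel
// an FPGA platform would accelerate.  The 3x3 diagonally dominant system
// and the tolerance are illustrative assumptions only.
#include <cmath>
#include <iostream>
#include <vector>

int main() {
    std::vector<std::vector<double>> A = {{4, 1, 0}, {1, 5, 1}, {0, 1, 3}};
    std::vector<double> b = {5, 7, 4}, x(3, 0.0);

    for (int it = 0; it < 100; ++it) {
        std::vector<double> x_new(3);
        for (int i = 0; i < 3; ++i) {
            double s = b[i];
            for (int j = 0; j < 3; ++j)
                if (j != i) s -= A[i][j] * x[j];       // subtract off-diagonal terms
            x_new[i] = s / A[i][i];
        }
        double diff = 0.0;
        for (int i = 0; i < 3; ++i) diff += std::fabs(x_new[i] - x[i]);
        x = x_new;
        if (diff < 1e-9) break;                        // converged
    }
    std::cout << x[0] << " " << x[1] << " " << x[2] << "\n";   // approx. 1 1 1
}
```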
Integration of FPGAs in Database Management Systems: Challenges and Opportunities
Datenbank-Spektrum, 2018
With the exponential growth in the volume, velocity, and variety of the data produced every day, online analytical processing (OLAP) is becoming increasingly challenging. FPGAs offer hardware reconfiguration to enable query-specific pipelined and parallel data processing, with the potential to maximize throughput and speedup as well as energy and resource efficiency. However, dynamically configuring hardware accelerators to match a given OLAP query is a complex task. Furthermore, resource limitations restrict the coverage of OLAP operators. As a consequence, query optimization through partitioning the processing onto components of heterogeneous hardware/software systems seems a promising direction. While there exists work on operator placement for heterogeneous systems, it mainly targets systems combining multi-core CPUs with GPUs. The inclusion of FPGAs, which uniquely offer efficient and high-throughput pipelined processing at the expense of potential reconfiguration overheads, is still an open problem. We postulate that this challenge can only be met in a scalable fashion by providing cooperative optimization between global and FPGA-specific optimizers. We demonstrate how this is addressed in two current research projects on FPGA-based query processing.
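A toy cost model illustrates the placement decision the paper motivates: an operator is routed to the FPGA only when its pipelined speedup amortises the one-off reconfiguration overhead. The numbers and the linear cost formula are illustrative assumptions, not drawn from either research project.

```cpp
// Toy operator-placement cost model: route an operator to the FPGA only if
// its pipelined per-tuple speedup amortises the one-off reconfiguration
// time.  All costs and the linear model are illustrative assumptions.
#include <iostream>
#include <string>
#include <vector>

struct Operator {
    std::string name;
    double tuples;            // tuples to process
    double cpu_ns_per_tuple;  // software cost per tuple
    double fpga_ns_per_tuple; // pipelined hardware cost per tuple
    double reconfig_ms;       // partial-reconfiguration overhead
};

int main() {
    std::vector<Operator> plan = {
        {"filter",    1e8, 20.0, 1.0, 90.0},
        {"aggregate", 1e5, 50.0, 2.0, 90.0},
    };
    for (const auto& op : plan) {
        double cpu_ms  = op.tuples * op.cpu_ns_per_tuple  / 1e6;
        double fpga_ms = op.tuples * op.fpga_ns_per_tuple / 1e6 + op.reconfig_ms;
        std::cout << op.name << " -> " << (fpga_ms < cpu_ms ? "FPGA" : "CPU")
                  << " (cpu " << cpu_ms << " ms, fpga " << fpga_ms << " ms)\n";
    }
    // Large scans win on the FPGA; small operators stay on the CPU because
    // the reconfiguration overhead dominates.
}
```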
Design and implementation of a database filter for BLAST acceleration
2009 Design, Automation & Test in Europe Conference & Exhibition, 2009
BLAST is a very popular computational biology algorithm. Since it is computationally expensive, it is a natural target for acceleration research, and many reconfigurable architectures have been proposed that offer significant improvements.
Hardware acceleration of database applications
2013
General-purpose computing platforms have long been favored over customized computational setups, due to their simpler usability and significantly reduced development time. These general-purpose machines use the Von Neumann architectural model, which suffers from the sequential nature of computing and a heavy reliance on memory offloading. This dissertation proposes the use of hardware accelerators such as Field Programmable Gate Arrays (FPGAs) and Graphics Processing Units (GPUs) as a substitute for, or co-processor to, general-purpose CPUs, with a focus on database applications, where large amounts of data are queried in a time-critical manner. This dissertation shows that hardware platforms allow data to be processed in a streaming (single-pass) and massively parallel manner, speeding up computation by several orders of magnitude compared to general-purpose CPUs. The complexity of programming these parallel platforms is abstracted from developers, as hardware constructs are automatically generated from high-level application languages and/or specifications.

This dissertation explores the hardware acceleration of XML path and twig filtering using novel dynamic programming algorithms. Publish-subscribe systems represent the state of the art in information dissemination to multiple users. Current XML-based publish-subscribe systems provide users with considerable flexibility, allowing the formulation of complex queries on the content as well as the (tree) structure of the streaming messages. Messages that contain one or more matches for a given user profile (query) are forwarded to the user.

This dissertation further studies FPGA-based architectures for processing expressive motion patterns on continuous spatio-temporal streams. Complex motion patterns are described as highly flexible, variable-enhanced regular expressions over a spatial alphabet that can be implicitly or explicitly anchored to the time domain. Using FPGAs, thousands of queries are matched in parallel. The challenges in handling several constructs of the assumed query language are explored, with a study of the trade-offs between expressiveness, scalability and matching accuracy (eliminating false positives).

Finally, the first parallel Golomb-Rice (GR) integer decompression FPGA-based architecture is detailed, allowing the decoding of unmodified GR streams at a deterministic rate of several bytes (multiple integers) per hardware cycle. Integer decompression is a first step in the querying of inverted indexes.
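Since the last contribution names a concrete operation, a bit-serial software reference for Golomb-Rice decoding is sketched below; the unary-quotient-plus-k-bit-remainder layout and MSB-first bit order are common conventions assumed for illustration, not the dissertation's exact stream format.

```cpp
// Bit-serial reference for Golomb-Rice decoding, the operation the parallel
// FPGA decoder performs on multiple integers per cycle.  The layout (unary
// quotient terminated by a zero, then a k-bit remainder, MSB-first) is a
// common convention assumed here for illustration.
#include <cstdint>
#include <iostream>
#include <vector>

class BitReader {
public:
    explicit BitReader(const std::vector<uint8_t>& bytes) : bytes_(bytes) {}
    int next_bit() {
        if (pos_ >= 8 * bytes_.size()) return -1;             // end of stream
        int bit = (bytes_[pos_ / 8] >> (7 - pos_ % 8)) & 1;   // MSB-first
        ++pos_;
        return bit;
    }
private:
    const std::vector<uint8_t>& bytes_;
    size_t pos_ = 0;
};

// Decode one Rice-coded integer with parameter k.
static int64_t rice_decode(BitReader& in, unsigned k) {
    int64_t q = 0;
    int bit;
    while ((bit = in.next_bit()) == 1) ++q;                   // unary quotient
    if (bit < 0) return -1;
    int64_t r = 0;
    for (unsigned i = 0; i < k; ++i) r = (r << 1) | in.next_bit();
    return (q << k) | r;
}

int main() {
    // Values 5 and 9 encoded with k = 2:
    // 5 -> q=1,r=1 -> "10"+"01";  9 -> q=2,r=1 -> "110"+"01"
    // Concatenated "100111001" packs (with zero padding) into 0x9C, 0x80.
    std::vector<uint8_t> stream = {0x9C, 0x80};
    BitReader reader(stream);
    int64_t a = rice_decode(reader, 2);
    int64_t b = rice_decode(reader, 2);
    std::cout << a << " " << b << "\n";                       // prints: 5 9
}
```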
Implementation of bloom filters in reconfigurable hardware for tracing network attacks
IFAC Proceedings Volumes, 2006
A Bloom filter is a data structure for representing a set of strings in order to support membership queries. It was first introduced in 1970 for database query matching. Recently this structure has been rediscovered and is widely used in the area of network processing. The main problems that can be solved using Bloom filters are datagram traceback, multi-pattern matching, packet classification and malicious-code fingerprinting. In this article we describe our experiences with the implementation of Bloom filters in Field-Programmable Gate Arrays for tracing network attacks. The prepared module can operate at a throughput of over 1 Gbps and can store several seconds of traffic using less than 262,144 kB of memory.
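For readers unfamiliar with the structure, a compact software model of a Bloom filter follows: m bits, k hash functions, no false negatives but a tunable false-positive rate. The double-hashing construction from two FNV-1a variants is a standard choice made here for illustration, not the paper's hardware hash functions.

```cpp
// Compact software model of a Bloom filter: m bits, k hash positions per
// item, no false negatives, tunable false-positive rate.  The double-hashing
// scheme built from two FNV-1a variants is a standard construction used
// here for illustration, not the paper's hardware hash functions.
#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

class BloomFilter {
public:
    BloomFilter(size_t m_bits, unsigned k) : bits_(m_bits, false), k_(k) {}

    void add(const std::string& item) {
        for (unsigned i = 0; i < k_; ++i) bits_[index(item, i)] = true;
    }
    bool possibly_contains(const std::string& item) const {
        for (unsigned i = 0; i < k_; ++i)
            if (!bits_[index(item, i)]) return false;   // definitely absent
        return true;                                    // present, or a false positive
    }

private:
    static uint64_t fnv1a(const std::string& s, uint64_t seed) {
        uint64_t h = seed;
        for (unsigned char c : s) { h ^= c; h *= 1099511628211ULL; }
        return h;
    }
    size_t index(const std::string& item, unsigned i) const {
        uint64_t h1 = fnv1a(item, 1469598103934665603ULL);
        uint64_t h2 = fnv1a(item, 0x9E3779B97F4A7C15ULL) | 1;   // odd stride
        return (h1 + i * h2) % bits_.size();
    }

    std::vector<bool> bits_;
    unsigned k_;
};

int main() {
    BloomFilter bf(1 << 20, 4);                 // ~1 Mbit, 4 hash functions
    bf.add("packet-digest-1");                  // e.g. a hashed packet signature
    std::cout << bf.possibly_contains("packet-digest-1") << " "
              << bf.possibly_contains("never-seen") << "\n";   // 1 and (almost surely) 0
}
```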
Language classification using n-grams accelerated by FPGA-based Bloom filters
Proceedings of the 1st international workshop on High-performance reconfigurable computing technology and applications: held in conjunction with SC07, 2007
N-gram (n-character sequences in text documents) counting is a well-established technique for classifying the language of the text in a document. In this paper, n-gram processing is accelerated through the use of reconfigurable hardware on the XtremeData XD1000 system. Our design employs parallelism at multiple levels, with parallel Bloom filters accessing on-chip RAM, parallel language classifiers, and parallel document processing. In contrast to another hardware implementation (the HAIL algorithm), which uses off-chip SRAM for lookup, our highly scalable implementation uses only on-chip memory blocks. Our implementation of end-to-end language classification runs at 85× the speed of comparable software and 1.45× that of the competing hardware design.
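The classification rule itself is simple and can be sketched in software: slide an n-character window over the document and count hits against per-language reference n-gram sets (std::unordered_set standing in for the parallel on-chip Bloom filters); the language with the most hits wins. The tiny training strings are illustrative only.

```cpp
// Software sketch of n-gram language classification: count each document
// trigram's hits against per-language reference sets (std::unordered_set
// stands in for the on-chip Bloom filters) and pick the language with the
// most hits.  The toy training text is illustrative only.
#include <iostream>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

static std::unordered_set<std::string> ngrams(const std::string& text, size_t n) {
    std::unordered_set<std::string> out;
    for (size_t i = 0; i + n <= text.size(); ++i) out.insert(text.substr(i, n));
    return out;
}

int main() {
    const size_t n = 3;
    // Per-language reference n-grams built from toy training text.
    std::vector<std::pair<std::string, std::unordered_set<std::string>>> models = {
        {"english", ngrams("the quick brown fox jumps over the lazy dog", n)},
        {"german",  ngrams("der schnelle braune fuchs springt ueber den faulen hund", n)},
    };

    std::string document = "the dog jumps over the fox";
    std::string best;
    size_t best_hits = 0;
    for (const auto& [lang, model] : models) {
        size_t hits = 0;
        for (size_t i = 0; i + n <= document.size(); ++i)
            if (model.count(document.substr(i, n))) ++hits;   // filter lookup stand-in
        if (hits > best_hits) { best_hits = hits; best = lang; }
        std::cout << lang << ": " << hits << " n-gram hits\n";
    }
    std::cout << "classified as: " << best << "\n";
}
```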
EABF: Energy efficient self-adaptive Bloom filter for network packet processing
2012 IEEE International Conference on Communications (ICC), 2012
The future Internet requires rethinking the network infrastructure to balance computing capacity against energy-sustainable techniques. As one of the compute-intensive components, Bloom filters are widely used for network packet processing. In this paper, an energy-efficient self-adaptive Bloom filter, EABF, is proposed to balance power and performance, especially for high-performance networks. The basic idea is to give the Bloom filter the capability to adjust the number of active hash functions automatically according to the current workload. This adaptation is governed by control policies; three policies are presented and compared. We also give a method to implement EABF in hardware for higher performance, as a two-stage FPGA-based platform in which Stage 1 is always active and Stage 2, a secondary stage, is activated only when necessary. The platform can also be extended to multiple stages. A control circuit is designed to flexibly switch the working stage and to reduce both dynamic and static power consumption. Analysis and experiments show that the dynamic two-stage EABF achieves almost the same power savings as the best fixed scheme; unlike fixed schemes, which may incur much longer latency, EABF maintains a latency of nearly one clock cycle, the same as a regular Bloom filter.
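A behavioural sketch of the two-stage idea, assuming a simple split of the hash functions: stage 1 (a small, always-on group) rejects most negatives, and stage 2 is consulted only when stage 1 reports a possible match, which is where the energy saving comes from. Hash construction, sizes and the stage split are illustrative, not the EABF hardware design.

```cpp
// Behavioural model of a two-stage Bloom filter: a small always-on stage 1
// rejects most negatives; stage 2 is consulted (and would be powered up)
// only when stage 1 says "maybe".  Hash construction, sizes and the stage
// split are illustrative assumptions, not the EABF hardware.
#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

static uint64_t fnv1a(const std::string& s, uint64_t seed) {
    uint64_t h = seed;
    for (unsigned char c : s) { h ^= c; h *= 1099511628211ULL; }
    return h;
}

class TwoStageBloom {
public:
    TwoStageBloom(size_t m, unsigned k1, unsigned k2)
        : stage1_(m, false), stage2_(m, false), k1_(k1), k2_(k2) {}

    void add(const std::string& item) {
        for (unsigned i = 0; i < k1_; ++i) stage1_[idx(item, i)] = true;
        for (unsigned i = 0; i < k2_; ++i) stage2_[idx(item, k1_ + i)] = true;
    }

    bool query(const std::string& item) {
        for (unsigned i = 0; i < k1_; ++i)
            if (!stage1_[idx(item, i)]) return false;   // rejected by the always-on stage
        ++stage2_wakeups_;                              // stage 2 activated only now
        for (unsigned i = 0; i < k2_; ++i)
            if (!stage2_[idx(item, k1_ + i)]) return false;
        return true;
    }

    size_t stage2_wakeups() const { return stage2_wakeups_; }

private:
    size_t idx(const std::string& item, unsigned i) const {
        uint64_t h1 = fnv1a(item, 1469598103934665603ULL);
        uint64_t h2 = fnv1a(item, 0x9E3779B97F4A7C15ULL) | 1;
        return (h1 + i * h2) % stage1_.size();
    }
    std::vector<bool> stage1_, stage2_;
    unsigned k1_, k2_;
    size_t stage2_wakeups_ = 0;
};

int main() {
    TwoStageBloom f(1 << 16, 2, 3);                     // 2 always-on + 3 on-demand hashes
    f.add("flow:10.0.0.1->10.0.0.2");
    std::cout << f.query("flow:10.0.0.1->10.0.0.2") << " "
              << f.query("flow:10.0.0.9->10.0.0.3") << "\n";
    std::cout << "stage-2 wakeups: " << f.stage2_wakeups() << "\n";
}
```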
FPGA-accelerated Information Retrieval: High-efficiency document filtering
2009 International Conference on Field Programmable Logic and Applications, 2009
Power consumption in data centres is a growing issue, as the cost of power for computation and cooling has become dominant. An emerging challenge is the development of "environmentally friendly" systems. In this paper we present a novel application of FPGAs to the acceleration of Information Retrieval algorithms, specifically the filtering of streams/collections of documents against topic profiles. Our results show that FPGA acceleration can yield speed-ups of up to a factor of 20 for large profiles. * W. Vanderbauwhede acknowledges support from the EPSRC. The authors thank F. Larsson of Mitrionics for technical support with the Mitrion SDK, and the IR Facility (www.ir-facility.org) and Matrixware GmbH for supporting this project.
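A minimal software model of profile-based document filtering, assuming a bag-of-weighted-terms profile: a document's score is the sum of the weights of the profile terms it contains, and documents scoring above a threshold are forwarded. Profiles, weights and the threshold are illustrative, not the paper's evaluation setup.

```cpp
// Minimal model of profile-based document filtering: score each document by
// summing the weights of profile terms it contains, and forward it when the
// score exceeds a threshold.  Profile, weights and threshold are
// illustrative assumptions, not the paper's evaluation setup.
#include <iostream>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

int main() {
    // Topic profile: term -> weight.
    std::unordered_map<std::string, double> profile = {
        {"fpga", 2.0}, {"reconfigurable", 1.5}, {"retrieval", 1.0}, {"filtering", 1.0},
    };
    const double threshold = 2.5;

    std::vector<std::string> stream = {
        "fpga acceleration of document filtering for information retrieval",
        "a survey of relational database indexing techniques",
    };

    for (const auto& doc : stream) {
        double score = 0.0;
        std::istringstream words(doc);
        std::string w;
        while (words >> w) {
            auto it = profile.find(w);
            if (it != profile.end()) score += it->second;   // profile term hit
        }
        std::cout << (score >= threshold ? "FORWARD " : "drop    ")
                  << "(score " << score << "): " << doc << "\n";
    }
}
```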