Linear QR architecture for a single chip adaptive beamformer

Multidimensional-DSP Beamformers Using the ROACH-2 FPGA Platform

Electronics

Antenna array-based multi-dimensional infinite impulse response (IIR) digital beamformers are employed in a multitude of radio frequency (RF) applications, ranging from electronically scanned radar and radio telescopes to long-range detection and target tracking. A method to design 3D IIR beam filters using 2D IIR beam filters is described. A cascaded 2D IIR beam filter architecture based on a systolic array architecture is proposed as an alternative for an existing radar application. Differential-form transfer functions and polyphase structures are employed in the design to raise the speed of operation to the gigahertz range. The feasibility of a practical implementation of a 4-phase polyphase 2D IIR beam filter is explored. A digital hardware prototype is designed, implemented and tested using a ROACH-2 Field Programmable Gate Array (FPGA) platform fitted with a Xilinx Virtex-6 SX475T FPGA chip and multi-input analog-to-digital converter (ADC) boards set to a maximum sampling rate of 960 MHz. The article describes a method to build a 3D IIR beamformer using polyphase structures. The technical specifications of an existing phased-array radar application are also compared with those of the proposed 3D IIR beamformer to show that the proposed method is a better alternative for such applications.

Real-time QRD-based beamforming on an FPGA platform

2006

This paper describes the architecture, design flow and verification process for the FPGA implementation of a real-time beamformer. One of the challenges in realizing this class of processing is the implementation of the linear algebra operations required to form the least-squares solution to the normal equations. We describe the FPGA realization of a flexible QRD-based approach to this problem in which the system parameters (row and column dimensions) can be supplied to the beamformer module at run-time. The design and FPGA implementation of the beamformer architecture and verification framework are described, along with implementation considerations for the Xilinx Virtex™-4 family of FPGAs. A model-based FPGA design flow called System Generator™ [4], based on The MathWorks Simulink® visual programming environment, was used exclusively to generate the implementation. The use of this tool chain for hardware verification is discussed. The FPGA resource utilization and performance of the QRD processor are reported.
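
The QRD-based least-squares step mentioned in this abstract can be illustrated with a minimal software sketch. It assumes real-valued data and floating-point arithmetic, and the function name `givens_qr_solve` is hypothetical; it is not the paper's FPGA design, only an instance of the general technique of triangularizing with Givens rotations and back-substituting, with the row and column dimensions taken from the inputs at call time.

```python
import math

def givens_qr_solve(A, b):
    """Least-squares solution of A x ~ b via Givens-rotation QR (real-valued sketch)."""
    m, n = len(A), len(A[0])
    # Augment [A | b] so each rotation also updates the right-hand side.
    R = [list(row) + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        for row in range(col + 1, m):
            a, c = R[col][col], R[row][col]
            r = math.hypot(a, c)
            if r == 0.0:
                continue  # nothing to annihilate
            cs, sn = a / r, c / r
            # Rotate the two rows so that R[row][col] becomes zero.
            for k in range(col, n + 1):
                upper, lower = R[col][k], R[row][k]
                R[col][k] = cs * upper + sn * lower
                R[row][k] = -sn * upper + cs * lower
    # Back substitution on the upper-triangular n x n block.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = R[i][n] - sum(R[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / R[i][i]
    return x

# Fit y = x0 + x1 * t to three points that lie on the line y = 1 + t.
coeffs = givens_qr_solve([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]], [1.0, 2.0, 3.0])
```

Working on the data matrix directly, rather than forming the normal equations explicitly, avoids squaring the condition number, which is one common motivation for QRD-based solvers.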

A Reconfigurable Systolic Array Architecture for Multicarrier Wireless and Multirate Applications

International Journal of Reconfigurable Computing, 2009

A reconfigurable systolic array (RSA) architecture that supports the realization of DSP functions for multicarrier wireless and multirate applications is presented. The RSA consists of coarse-grained processing elements that can be configured as complex DSP functions that are the basic building blocks of Polyphase-FIR filters, phase shifters, DFTs, and Polyphase-DFT circuits. The homogeneous characteristic of the RSA architecture, where each reconfigurable processing element (PE) cell is connected to its nearest neighbors via configurable switch (SW) elements, enables array expansion for parallel processing and facilitates time sharing computation of high-throughput data by individual PEs. For DFT circuit configurations, an algorithmic optimization technique has been employed to reduce the overall number of vector-matrix products to be mapped on the RSA. The hardware complexity and throughput of the RSA-based DFT structures have been evaluated and compared against several conventional modular FFT realizations. Designs and circuit implementations of the PE cell and several RSAs configured as DFT and Polyphase filter circuits are also presented. The RSA architecture offers significant flexibility and computational capacity for applications that require real time reconfiguration and high-density computing.
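
As an illustration of the polyphase FIR building block named in this abstract, the sketch below splits a filter into M branches and checks it against direct filtering followed by downsampling. The function names and the decimation setting are assumptions made for illustration; the RSA's actual processing-element mapping is not described here.

```python
def fir_filter(h, x):
    """Direct-form FIR: y[n] = sum_k h[k] * x[n - k], with zero initial state."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(x))
    ]

def polyphase_decimate(h, x, M):
    """Filter x with h and keep every M-th output, using M polyphase branches."""
    # Branch p holds the taps h[p], h[p + M], h[p + 2M], ...
    branches = [h[p::M] for p in range(M)]
    n_out = (len(x) + M - 1) // M
    y = [0.0] * n_out
    for p, taps in enumerate(branches):
        for n in range(n_out):
            for k, coeff in enumerate(taps):
                idx = (n - k) * M - p  # input sample feeding this tap
                if 0 <= idx < len(x):
                    y[n] += coeff * x[idx]
    return y

h = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [float(v) for v in range(1, 11)]
reference = fir_filter(h, x)[::4]       # filter, then discard 3 of every 4 outputs
polyphase = polyphase_decimate(h, x, 4) # same outputs, no discarded work
```

Each branch only computes the products that contribute to a retained output, which is why the structure suits multirate hardware where branches run in parallel at the lower rate.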

Custom architecture for multicore audio beamforming systems

ACM Transactions on Embedded Computing Systems, 2013

The audio Beamforming (BF) technique utilizes microphone arrays to extract acoustic sources recorded in a noisy environment. In this article, we propose a new approach for rapid development of multicore BF systems. A review of the literature reveals that the majority of such experimental and commercial audio systems are based on desktop PCs, due to their high-level programming support and potential for rapid system development. However, these approaches introduce performance bottlenecks, excessive power consumption, and increased overall cost. Systems based on DSPs require very low power, but their performance is still limited. Custom hardware solutions alleviate the aforementioned drawbacks; however, designers primarily focus on performance optimization without providing a high-level interface for system control and test. In order to address the aforementioned problems, we propose a custom platform-independent architecture for reconfigurable audio BF systems. To evaluate our proposal, we implement our architecture as a heterogeneous multicore reconfigurable processor and map it onto FPGAs. Our approach combines the software flexibility of General-Purpose Processors (GPPs) with the computational power of multicore platforms. In order to evaluate our system, we compare it against a BF software application implemented on a low-power Atom 330, a mid-range Core2 Duo, and a high-end Core i3. Experimental results suggest that our proposed solution can extract up to 16 audio sources in real time under a 16-microphone setup. In contrast, under the same setup, the Atom 330 cannot extract any audio sources in real time, while the Core2 Duo and the Core i3 can process in real time only up to 4 and 6 sources, respectively. Furthermore, a Virtex4-based BF system consumes more than an order of magnitude less energy than the aforementioned GPP-based approaches.

Parameterisable QR core

Conference Record of the Thirty-Third Asilomar Conference on Signals, Systems, and Computers (Cat. No.CH37020), 1999

The design of a generic QR core for adaptive beamforming is presented. The work relies on an existing mapping technique that can be applied to a triangular QR array in such a way as to allow the generation of a range of QR architectures. All scheduling of data inputs and retiming to include processor latency has been included within the generic representation.

Configurable Universal Processing Module for DSP Cellular Arrays

The architecture of a configurable Universal Processing Module (UPM) for the optimal real-time implementation of most digital signal processing (DSP) algorithms is presented. Field-Programmable Gate Arrays (FPGAs) are employed to implement cellular arrays based on the proposed UPM. Transforms, adaptive filters, lattice structures, function generators, digital correlators and convolvers are among the DSP functions that can be executed in real time with these UPM-based cellular arrays. The DSP processors employ floating-point arithmetic for flexibility and greater precision. Finite Impulse Response (FIR) filter cells are constructed and employed to realize Infinite Impulse Response (IIR) filters. The same cells can, moreover, be configured to function as all-zero, all-pole and pole-zero lattice filters. Cross-correlation arrays are shown to be among the possible implementations using the cellular structures. The parallel architecture of the proposed cellular arrays leads to fast and efficient evaluation of most transforms, such as the Fast Fourier, discrete Hilbert, discrete Hartley, discrete cosine, discrete Hankel, Walsh-Hadamard, fast Generalized Walsh and other generalized spectral analysis transforms. Other realizations include function generators that employ Chebyshev polynomials to generate trigonometric, inverse trigonometric, exponential, logarithmic, Gamma and Bessel functions. To obtain high processing speed, combinatorial logic is used wherever possible in implementing arithmetic operations.

Scalable Fixed Point QRD Core Using Dynamic Partial Reconfiguration

A Givens rotation based scalable QRD core which utilizes an efficient pipelined and unfolded 2D multiply and accumulate (MAC) based systolic array architecture with dynamic partial reconfiguration (DPR) capability is proposed. The square root and inverse square root operations in the Givens rotation algorithm are handled using a modified look-up table (LUT) based Newton-Raphson method, thereby reducing the area by 71% and latency by 50% while operating at a frequency 49% higher than the existing boundary cell architectures. The proposed architecture is implemented on a Xilinx Virtex-6 FPGA for any real matrices of size 𝑚 × 𝑛, where 4 ≤ 𝑛 ≤ 8 and 𝑚 ≥ 𝑛, by dynamically inserting or removing the partial modules. The evaluation results demonstrate a significant reduction in latency, area, and power as compared to other existing architectures. The functionality of the proposed core is evaluated for a variable length adaptive equalizer.
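
The boundary-cell computation described above can be sketched in software as a LUT-seeded Newton-Raphson inverse square root feeding the Givens coefficients. The abstract does not specify the paper's "modified" method, so the table size, the normalization range, the iteration count and all names below are assumptions, and floating point stands in for the fixed-point hardware arithmetic.

```python
import math

# Hypothetical 16-entry seed table for arguments normalized to [1, 4);
# each entry is 1/sqrt at the midpoint of its bin.
LUT_BITS = 4
LUT = [
    1.0 / math.sqrt(1.0 + 3.0 * (i + 0.5) / (1 << LUT_BITS))
    for i in range(1 << LUT_BITS)
]

def inv_sqrt(x, iterations=2):
    """Approximate 1/sqrt(x) for x > 0: LUT seed, then Newton-Raphson refinement."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    # Normalize x = m * 4**e with m in [1, 4), so 1/sqrt(x) = (1/sqrt(m)) * 2**(-e).
    m, e = x, 0
    while m >= 4.0:
        m /= 4.0
        e += 1
    while m < 1.0:
        m *= 4.0
        e -= 1
    idx = min(int((m - 1.0) / 3.0 * (1 << LUT_BITS)), (1 << LUT_BITS) - 1)
    y = LUT[idx]
    for _ in range(iterations):
        y = y * (1.5 - 0.5 * m * y * y)  # Newton step for f(y) = 1/y**2 - m
    return y * 2.0 ** (-e)

def givens_coefficients(a, b):
    """Return (c, s, r) with c*a + s*b = r and -s*a + c*b = 0 (approximately)."""
    norm_sq = a * a + b * b
    if norm_sq == 0.0:
        return 1.0, 0.0, 0.0
    inv_r = inv_sqrt(norm_sq)
    # r = sqrt(norm_sq) is recovered as norm_sq * (1/sqrt(norm_sq)): no divider needed.
    return a * inv_r, b * inv_r, norm_sq * inv_r

c, s, r = givens_coefficients(3.0, 4.0)  # close to (0.6, 0.8, 5.0)
```

Only multiplications and additions appear after the table lookup, which is the usual reason this formulation is attractive for a hardware boundary cell; accuracy is set by the table size and the number of iterations.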

Design and Realization of Array Signal Processor VLSI Architecture for Phased Array System

A method for implementing an array signal processor for phased array radars is presented. The array signal processor can receive and process planar array antenna inputs. It is based on the application of adaptive digital beamformers using FPGAs. The adaptive filter algorithm used here is the Inverse QR Decomposition based Recursive Least Squares (IQRD-RLS) [1] algorithm. An FPGA-based array signal processor is suitable for phased array radar receivers, where speed, accuracy and numerical stability are of utmost importance. Using the IQRD-RLS algorithm, optimal weights are calculated in much less time than with the conventional QRD-RLS algorithm. A customized multiple-FPGA board comprising three Kintex-7 FPGAs is employed to implement the array signal processor. The proposed architecture can form multiple beams from planar array antenna elements.

Design and Chip Implementation of a SMI/MVDR Dual-Mode Beamformer for Wireless MIMO Communication Systems

IEEE Access

This paper presents a low-complexity chip design supporting dual-mode beamforming, i.e., sampling matrix inversion (SMI) and the minimum variance distortionless response (MVDR), for wireless Multiple-Input Multiple-Output (MIMO) communication systems. The auto-correlation matrix inversion is the critical computing kernel shared by the two beamforming schemes. To alleviate the computing complexity, the auto-correlation matrix is approximated by a Toeplitz counterpart, which can be decomposed efficiently by applying the Cholesky decomposition and the Schur algorithm. This leads to an O(N³) to O(N²) complexity reduction, where N is the matrix size, while preserving computing parallelism for the hardware design. In addition, a diagonal loading technique is employed to mitigate the stability problem when the matrix is ill-conditioned. Simulation results indicate that no performance loss is observed due to the algorithm simplification measures. A systolic array based mapping procedure converts the two beamforming algorithms to a unified hardware accelerator design with 80% shared circuitry. Complex-valued divisions are achieved by adopting a hardware-efficient coordinate rotation digital computer (CORDIC) scheme. In the chip implementation, a TSMC 90 nm UTM process technology is used and the design specs largely follow the requirements of the IEEE 802.11ac standard. The core size of the chip design is 0.68 mm². The measurement results show that the chip can operate at up to 200 MHz with a power consumption of 49.03 mW. It can complete the computation of a new beamforming vector (of size 8) every 0.64 µs and exhibits the highest throughput among the 6 compared designs.
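
For reference, the MVDR weight vector with diagonal loading can be written as w = (R + δI)⁻¹a / (aᴴ(R + δI)⁻¹a), where R is the auto-correlation matrix and a the steering vector. The sketch below computes it with a generic complex Gaussian elimination, which costs O(N³); it is not the Toeplitz/Schur decomposition or the CORDIC arithmetic used in the chip, and the function names and the loading value are assumptions for illustration.

```python
def solve_linear(A, b):
    """Solve A x = b for a small complex system (Gaussian elimination, partial pivoting)."""
    n = len(A)
    M = [list(row) + [bi] for row, bi in zip(A, b)]  # augmented matrix [A | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) == 0.0:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        for row in range(col + 1, n):
            factor = M[row][col] / M[col][col]
            for k in range(col, n + 1):
                M[row][k] -= factor * M[col][k]
    x = [0j] * n
    for i in range(n - 1, -1, -1):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def mvdr_weights(R, a, loading=0.0):
    """MVDR weights for correlation matrix R and steering vector a, with diagonal loading."""
    n = len(R)
    # Diagonal loading: add a small constant to the diagonal to improve conditioning.
    loaded = [[R[i][j] + (loading if i == j else 0.0) for j in range(n)] for i in range(n)]
    u = solve_linear(loaded, a)                               # u = (R + loading*I)^-1 a
    denom = sum(ai.conjugate() * ui for ai, ui in zip(a, u))  # a^H u
    return [ui / denom for ui in u]

# A rank-deficient R that cannot be inverted without loading.
weights = mvdr_weights([[1.0, 1.0], [1.0, 1.0]], [1.0, 1.0], loading=0.1)
```

By construction the weights satisfy the distortionless constraint wᴴa = 1, and the example shows the role of loading: without it the singular matrix makes the solve fail.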