MICHAEL P. BEKAKOS - Academia.edu
Papers by MICHAEL P. BEKAKOS
Concurrent exploitation of multiple pipes arrangements on 3D mesh(d) architectures
Neural, Parallel and Scientific Computations
The VLSI systolic processor arrays introduced herein implement the new B&F elimination technique. The proposed architectural design consists of slightly more complex cells and leads, eventually, to the fastest concurrent bidirectional B&F complete tridiagonal linear solver. The generalization of this fastest concurrent B&F complete approach to multiple pipes arrangements on a 3D Mesh(d) follows. Finally, a generalized bidirectional Gauss eliminator is proposed for dense matrices. All VLSI constraints for simplicity, regularity in data flow and local communication are fulfilled.
Multioscillator cosinor models for optimal curve-fit of time series data
Nonlinear Analysis: Theory, Methods & Applications, 2001
Well defined generative lexicon with grammatical order versus text tagging
An FPGA Hardware Parallel Implementation of the DES Algorithm
Neural, Parallel and Scientific Computations
The Journal of Supercomputing, 2007
In this paper, a regular bidirectional linear systolic array (RBLSA) for computing all-pairs shortest paths of a given directed graph is designed. The obtained array is optimal with respect to the number of processing elements (PEs) for a given problem size, and its execution time has been minimized. To obtain an RBLSA with the optimal number of PEs, the inner computation space of the systolic algorithm is accommodated to the projection direction vector. Finally, FPGA-based reprogrammable systems are revolutionizing certain types of computation and digital logic, since as logic emulation systems they offer some orders of magnitude speedup over software simulation; herein, an FPGA realization of the RBLSA is investigated and the performance evaluation results are discussed.
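For orientation, the all-pairs shortest paths problem named in this abstract can be stated as a short sequential computation. The sketch below uses the classical Floyd-Warshall recurrence as a software baseline; it is not the RBLSA design, and the function name and the example graph are illustrative assumptions rather than anything taken from the paper.

```python
import math

INF = math.inf  # marks "no direct edge" in the weight matrix


def all_pairs_shortest_paths(weights):
    """Sequential Floyd-Warshall baseline (assumed here; not the RBLSA itself).

    weights[i][j] is the edge weight from node i to node j,
    INF if there is no edge, and 0 on the diagonal.
    """
    n = len(weights)
    dist = [row[:] for row in weights]  # work on a copy
    for k in range(n):            # allow node k as an intermediate stop
        for i in range(n):
            for j in range(n):
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return dist


# Hypothetical 3-node directed graph: 0 -> 1 (3), 1 -> 2 (1), 2 -> 0 (2)
example = [
    [0, 3, INF],
    [INF, 0, 1],
    [2, INF, 0],
]
result = all_pairs_shortest_paths(example)
```

The triple loop performs n^3 relaxation steps one after another; a systolic array spreads this kind of regular, locally dependent computation across processing elements.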
VHDL Code Automatic Generator for Systolic Arrays
2006 2nd International Conference on Information & Communication Technologies
Systolic arrays speed up scientific computations with inherent parallelization by exploiting massive data pipeline parallelism. In addition, they offer short and problem-size independent signal paths, predictable performance, scalability, and simple design and test. In this paper, a server-based software tool for the automatic generation of VHDL code describing systolic array topologies is presented. The input parameters of the tool are several essential factors for the architectural description of systolic arrays (SA): the interconnection topology of the systolic array, i.e., linear, mesh or hex-connected; the size of the systolic array, i.e., the number of processing elements (PEs) in each dimension; the function of the PE, i.e., the relation between the output and the input ports of every PE; and finally the bit length of the PE ports, i.e., the data word size of every port.
Synthesis of a unidirectional systolic array for matrix–vector multiplication
Mathematical and Computer Modelling, 2006
In this paper we present a procedure, based on data dependencies and space–time transformations of index space, to design a unidirectional linear systolic array (ULSA) for computing a matrix–vector product. The obtained array is optimal with respect to the number of processing elements (PEs) for a given problem size. The execution time of the array is the minimal possible for
Systolic bandwidth and profile reduction of sparse matrices on pipenets
Nonlinear Analysis, Jan 1, 1997
FPGA Implementation of a Unidirectional Systolic Array Generator for Matrix-Vector Multiplication
2007 IEEE International Conference on Signal Processing and Communications, 2007
Systolic arrays may prove ideal structures for the representation and mapping of many numerical and non-numerical scientific applications. In particular, some formulations of Dynamic Programming (DP), a commonly used technique for solving a wide variety of discrete optimization problems such as scheduling, string-editing, packaging, and inventory management, can be solved in parallel on systolic arrays as
A FPGA-based Dewavefront Array Prototype Implementing the Quadrant Interlocking Factorization Method
2006 2nd International Conference on Information & Communication Technologies
Systolic processing offers the possibility of solving a large number of standard problems on multicellular computing devices with autonomous cells (processing elements, PEs). The resulting systolic arrays exploit the underlying parallelism of many computationally intensive problems and offer a vital and effective way of handling them. Advances in technology and especially in VLSI and FPGA have an ongoing
A Grid infrastructure of custom processing elements for scientific computations
WIT Transactions on State of the Art in Science and Engineering, 2006
Parallel cyclic odd-even reduction algorithms for solving Toeplitz tridiagonal equations on MIMD computers
Parallel Computing, 1993
Human genome and Bioinformatics: A survey
Multimedia Databases and distributed systems: A survey
Quadrant Interlocking Factorization on Systolic and Wavefront Array Processors
Series in Machine Perception and Artificial Intelligence, 2002
WSEAS TRANSACTIONS ON COMPUTERS, 2022
New and more reliable metrics are always in demand. In this paper, a new metric is proposed for the evaluation of high performance computing (HPC) platforms in conjunction with their energy consumption. The aim of the new metric is to reliably compare different HPC systems with respect to their energy efficiency. The metric provides a means to rank supercomputers of similar capabilities, avoiding the misleading results of metrics like performance-per-watt, which is currently used for ranking systems, as in the Green500 list, where systems with totally different sizes and capabilities are ranked consecutively. An example of this misuse for two adjacent systems in the Green500 list is discussed. A comparative study of the energy efficiency of three high performance computing platforms with different architectures, using the proposed metric, is presented.
WIT Transactions on Information and Communication Technologies, 1970
This paper investigates an efficient parallel technique for reducing sparse matrices that can be applied to analysis tables. Matrices of this kind spend a great amount of memory space on zero entries and, hence, a subtle compaction scheme is necessary. The benefit of the parallel approach introduced herein is that a very compact form results, which contributes to a greatly reduced time when accessing the given data structure.
A load balancing fault-tolerant algorithm for heterogeneous cluster environments
Neural Parallel Sci. Comput., 2009
Herein, a fault-tolerant parallel pattern matching algorithm with load balancing support is presented, targeting clusters of workstations that are either homogeneous or heterogeneous in terms of computational power. The algorithm is capable of handling up to (n-1) faults, introduced at any time, with n being the total number of cluster nodes. It handles both permanent faults and transient failures caused by network delay, which are temporarily treated as permanent; thus, nodes may return at any time. The experimental results show that the algorithm returns reliable results within acceptable time limits.
FPGA Implementation of a Unidirectional Systolic Array Generator for Matrix-Vector Multiplication
Systolic arrays may prove ideal structures for the representation and mapping of many numerical and non-numerical scientific applications. In particular, some formulations of Dynamic Programming (DP), a commonly used technique for solving a wide variety of discrete optimization problems such as scheduling, string editing, packaging, and inventory management, can be solved in parallel on systolic arrays as matrix-vector products. Systolic arrays usually have a very high rate of I/O and are well suited for intensive parallel operations. Herein is a description of the FPGA hardware implementation of a matrix-vector multiplication algorithm designed to produce a unidirectional systolic array representation.
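As an illustration of how a matrix-vector product can be laid out on a linear array with one-way data flow, the following is a minimal cycle-by-cycle simulation in software. It is a sketch under simple assumptions (one PE per output element, each PE reading its own matrix row locally, vector elements moving one PE to the right per cycle) and is not the generator or the array described in the paper; the function name `systolic_matvec` is hypothetical.

```python
def systolic_matvec(A, x):
    """Simulate a simple linear array computing y = A x.

    Assumed layout (illustrative only): PE i accumulates y[i] in place,
    and the elements of x enter at PE 0 and move one PE to the right
    per clock cycle, so PE i meets x[j] at cycle i + j.
    """
    n, m = len(A), len(x)
    y = [0] * n
    pipe = [None] * n  # pipe[i] holds (column index, value) currently at PE i
    for t in range(n + m - 1):       # cycles until the last x leaves PE n-1
        pipe = [None] + pipe[:-1]    # unidirectional shift to the right
        if t < m:
            pipe[0] = (t, x[t])      # feed the next vector element into PE 0
        for i in range(n):           # all PEs work in the same cycle
            if pipe[i] is not None:
                j, xj = pipe[i]
                y[i] += A[i][j] * xj  # local multiply-accumulate
    return y


A = [[1, 2], [3, 4], [5, 6]]
x = [1, 1]
y = systolic_matvec(A, x)
```

With n rows and m columns the simulated array finishes in n + m - 1 cycles, compared with n * m multiply-accumulate steps when executed sequentially.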
WIT Transactions on Biomedicine and Health, 1970
In this work we address the problem of determining the set of functions that representatively describe 24h blood pressure and heart rate waveforms in the population of hypertensive patients. Curve fitting was conducted for sets of both linear and non-linear equations. The aim of the study was to investigate the probability of easily reproducing the 24h intra-arterial waveform when the corresponding extra-arterial waveform is available.