MICHAEL P. BEKAKOS

Papers by MICHAEL P. BEKAKOS

Concurrent exploitation of multiple pipes arrangements on 3D mesh(d) architectures

Neural, Parallel and Scientific Computations

The VLSI systolic processor arrays introduced herein implement the new B&F elimination technique. The proposed architectural design consists of slightly more complex cells and leads, eventually, to the fastest concurrent bidirectional B&F complete tridiagonal linear solver. This fastest concurrent B&F complete approach is then generalized to multiple pipes arrangements on a 3D Mesh(d). Finally, a generalized bidirectional Gauss eliminator is proposed for dense matrices. All VLSI constraints for simplicity, regularity in data flow and local communication are fulfilled.

Multioscillator cosinor models for optimal curve-fit of time series data

Nonlinear Analysis: Theory, Methods & Applications, 2001

Well defined generative lexicon with grammatical order versus text tagging

An FPGA Hardware Parallel Implementation of the DES Algorithm

Neural, Parallel and Scientific Computations

Computing all-pairs shortest paths on a linear systolic array and hardware realization on a reprogrammable FPGA platform

The Journal of Supercomputing, 2007

In this paper a regular bidirectional linear systolic array (RBLSA) for computing all-pairs shortest paths of a given directed graph is designed. The obtained array is optimal with respect to the number of processing elements (PEs) for a given problem size. The execution time of the array has been minimized. To obtain an RBLSA with an optimal number of PEs, the inner computation space of the systolic algorithm is accommodated to the projection direction vector. Finally, FPGA-based reprogrammable systems are revolutionizing certain types of computation and digital logic, since as logic emulation systems they offer some orders of magnitude speedup over software simulation; herein, an FPGA realization of the RBLSA is investigated and the performance evaluation results are discussed.

VHDL Code Automatic Generator for Systolic Arrays

2006 2nd International Conference on Information & Communication Technologies

Systolic arrays speed up scientific computations with inherent parallelization, by exploiting massive data pipeline parallelism. In addition, they offer short and problem-size independent signal paths, predictable performance, scalability, and simple design and test. In this paper, a server-based software tool for the automatic generation of VHDL code describing systolic array topologies is presented. Input parameters of the tool are several essential factors for the architectural description of systolic arrays (SA): the interconnection topology of the systolic array, i.e., linear, mesh or hex-connected; the size of the systolic array, i.e., the number of processing elements (PEs) in each dimension; the function of the PE, i.e., the relation between the output and the input ports of every PE; and finally the bit length of the PE ports, i.e., the data word size of every port.

Synthesis of a unidirectional systolic array for matrix–vector multiplication

Mathematical and Computer Modelling, 2006

In this paper we present a procedure, based on data dependencies and space–time transformations of index space, to design a unidirectional linear systolic array (ULSA) for computing a matrix–vector product. The obtained array is optimal with respect to the number of processing elements (PEs) for a given problem size. The execution time of the array is the minimal possible for

Systolic bandwidth and profile reduction of sparse matrices on pipenets

Nonlinear Analysis, Jan 1, 1997

FPGA Implementation of a Unidirectional Systolic Array Generator for Matrix-Vector Multiplication

2007 IEEE International Conference on Signal Processing and Communications, 2007

Systolic arrays may prove ideal structures for the representation and the mapping of many applications concerning various numerical and non-numerical scientific applications. Especially, some formulation of Dynamic Programming (DP) - a commonly used technique for solving a wide variety of discrete optimization problems, such as scheduling, string-editing, packaging, and inventory management - can be solved in parallel on systolic arrays as

A FPGA-based Dewavefront Array Prototype Implementing the Quadrant Interlocking Factorization Method

2006 2nd International Conference on Information & Communication Technologies

Systolic processing offers the possibility of solving a large number of standard problems on multicellular computing devices with autonomous cells (processing elements - PEs). The resulting systolic arrays exploit the underlying parallelism of many computationally intensive problems and offer a vital and effective way of handling them. Advances in technology and especially in VLSI and FPGA have an ongoing

A Grid infrastructure of custom processing elements for scientific computations

WIT Transactions on State of the Art in Science and Engineering, 2006

Parallel cyclic odd-even reduction algorithms for solving Toeplitz tridiagonal equations on MIMD computers

Parallel Computing, 1993

Human genome and Bioinformatics: A survey

Multimedia Databases and distributed systems: A survey

Quadrant Interlocking Factorization on Systolic and Wavefront Array Processors

Series in Machine Perception and Artificial Intelligence, 2002

Action: A New Metric for Evaluating the Energy Efficiency on High Performance Computing Platforms (ranked on Green500 List)

WSEAS TRANSACTIONS ON COMPUTERS, 2022

New and more reliable metrics are always in demand. In this paper, a new metric is proposed for the evaluation of high performance computing platforms in conjunction with their energy consumption. The aim of the new metric is to reliably compare different HPC systems concerning their energy efficiency. The metric provides a means to rank supercomputers of similar capabilities, avoiding the misleading results of metrics like performance-per-watt, currently used for ranking systems, as in the Green500 list, where systems with totally different sizes and capabilities are ranked consecutively. An example of this misuse for two adjacent systems in the Green500 list is discussed. A comparative study of the energy efficiency of three high performance computing platforms with different architectures, using the proposed metric, is presented.

An Efficient Parallel Approach To Reduce Sparse Matrices With Invariant Entries

WIT Transactions on Information and Communication Technologies, 1970

This paper investigates an efficient parallel technique for reducing sparse matrices that can be applied to analysis tables. Matrices of this kind spend a great amount of memory space on zero entries and, hence, a subtle compaction scheme is necessary. The benefit of the parallel approach introduced herein is that a very compact form results, which contributes to a greatly reduced time when accessing the given data structure.

A load balancing fault-tolerant algorithm for heterogeneous cluster environments

Neural Parallel Sci. Comput., 2009

Herein, a fault-tolerant parallel pattern matching algorithm with load balancing support is presented, targeting both homogeneous and heterogeneous clusters of workstations from the perspective of computational power. The algorithm is capable of handling up to (n-1) faults, introduced at any time, with n being the total number of cluster nodes. It handles either permanent faults or transient failure situations, temporarily treated as permanent, due to network delay; thus, nodes may be returned at any time. The experimental results show that the algorithm is capable of returning reliable results within acceptable time limits.

FPGA Implementation of a Unidirectional Systolic Array Generator for Matrix-Vector Multiplication

Systolic arrays may prove ideal structures for the representation and the mapping of many applications concerning various numerical and non-numerical scientific applications. Especially, some formulation of Dynamic Programming (DP) - a commonly used technique for solving a wide variety of discrete optimization problems, such as scheduling, string editing, packaging, and inventory management - can be solved in parallel on systolic arrays as matrix-vector products. Systolic arrays usually have a very high rate of I/O and are well suited for intensive parallel operations. Herein is a description of the FPGA hardware implementation of a matrix-vector multiplication algorithm designed to produce a unidirectional systolic array representation.

A Statistical Approach To Curve-fitting Exploitation Of Biomedical Waveforms

WIT Transactions on Biomedicine and Health, 1970

In this work we address the problem of determining the set of functions that representatively describe 24h blood pressure and heart rate waveforms in the population of hypertensive patients. Curve fitting was conducted for a set of both linear and non-linear equations. The aim of the study was to investigate the probability of easily reproducing the 24h intra-arterial waveform if the corresponding extra-arterial waveform was available.
