Tarek El-ghazawi - Academia.edu

Papers by Tarek El-ghazawi

Performance optimization of combined variable-cost computations and I/O

Lecture Notes in Computer Science, 1997

For applications involving large data sets that yield variable-cost computations, achieving both efficient I/O and load balancing can be particularly challenging, though both are performance-critical. In this work, we introduce a data scheduling approach that integrates several optimization techniques, including dynamic allocation, prefetching, and asynchronous I/O and communications. We show that good scalability is obtained by both hiding the I/O latency and appropriately balancing the workloads. We use a statistical metric for data skewness to further improve the performance by selecting among data-scheduling techniques. We test our approach on sparse benchmark matrices for matrix-vector computations and show experimentally that our method can accurately predict the relative performance of different input/output schemes for a given data set and choose the best technique accordingly.

A statistically-based multi-algorithmic approach for load-balancing sparse matrix computations

Proceedings of 6th Symposium on the Frontiers of Massively Parallel Computation (Frontiers '96)

M03---Reconfigurable supercomputing

Proceedings of the 2006 ACM/IEEE conference on Supercomputing - SC '06, 2006

The synergistic advances in high-performance computing and reconfigurable computing, based on field programmable gate arrays (FPGAs), have resulted in hybrid parallel systems of microprocessors and FPGAs. Such systems support both fine-grain and coarse-grain parallelism and can dynamically tune their architecture to fit various applications. Programming these systems can be quite challenging, as programming FPGA devices can involve hardware design.

A performance study of job management systems

Concurrency and Computation: Practice and Experience, 2004

An Empirical Comparative Study of Job Management Systems

Job Management Systems (JMSs) are an important component of grid software infrastructure. With many JMSs available commercially and in the public domain, it is difficult to choose an optimum JMS for a given computing environment. All previous comparisons of JMSs were only conceptual in character. In this paper, we present the results of the first empirical study of JMSs reported in the literature. Four popular systems, LSF, PBS Pro, Sun Grid Engine/CODINE, and Condor, were included in our study. The study has revealed ...

Electrooptic Nonlinear Activation Functions for Vector Matrix Multiplications in Optical Neural Networks

Advanced Photonics 2018 (BGPP, IPR, NP, NOMA, Sensors, Networks, SPPCom, SOF)

Primer on silicon neuromorphic photonic processors: architecture and compiler

Nanophotonics

Microelectronic computers have encountered challenges in meeting all of today’s demands for information processing. Meeting these demands will require the development of unconventional computers employing alternative processing models and new device physics. Neural network models have come to dominate modern machine learning algorithms, and specialized electronic hardware has been developed to implement them more efficiently. A silicon photonic integration industry promises to bring manufacturing ecosystems normally reserved for microelectronics to photonics. Photonic devices have already found simple analog signal processing niches where electronics cannot provide sufficient bandwidth and reconfigurability. In order to solve more complex information processing problems, they will have to adopt a processing model that generalizes and scales. Neuromorphic photonics aims to map physical models of optoelectronic systems to abstract models of neural networks. It represents a new opportu...

A Winograd-based Integrated Photonics Accelerator for Convolutional Neural Networks

IEEE Journal of Selected Topics in Quantum Electronics

Towards Energy-Quality Scaling in Deep Neural Networks

Software for Brain Network Simulations: A Comparative Study

Frontiers in Neuroinformatics

Optimization of Selected Remote Sensing Algorithms for Many-Core Architectures

IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing

Optimizing thin client caches for mobile cloud computing

Concurrency and Computation: Practice and Experience, 2017

MorphoNoC: Exploring the Design Space of a Configurable Hybrid NoC using Nanophotonics

Microprocessors and Microsystems, 2017

Novel Models of Visual Topographic Map Alignment in the Superior Colliculus

PLoS Computational Biology, 2016

The establishment of precise neuronal connectivity during development is critical for sensing the external environment and informing appropriate behavioral responses. In the visual system, many connections are organized topographically, which preserves the spatial order of the visual scene. The superior colliculus (SC) is a midbrain nucleus that integrates visual inputs from the retina and primary visual cortex (V1) to regulate goal-directed eye movements. In the SC, topographically organized inputs from the retina and V1 must be aligned to facilitate integration. Previously, we showed that retinal input instructs the alignment of V1 inputs in the SC in a manner dependent on spontaneous neuronal activity; however, the mechanism of activity-dependent instruction remains unclear. To begin to address this gap, we developed two novel computational models of visual map alignment in the SC that incorporate distinct activity-dependent components. First, a Correlational Model assumes that V...

Performance bounds of partial run-time reconfiguration in high-performance reconfigurable computing

Proceedings of the 1st international workshop on High-performance reconfigurable computing technology and applications held in conjunction with SC07 - HPRCTA '07, 2007

Enabling PGAS Productivity with Hardware Support for Shared Address Mapping: A UPC Case Study

2014 IEEE Intl Conf on High Performance Computing and Communications, 2014 IEEE 6th Intl Symp on Cyberspace Safety and Security, 2014 IEEE 11th Intl Conf on Embedded Software and Syst, Aug 1, 2014

Low latency elliptic curve cryptography accelerators for NIST curves over binary fields

Proceedings. 2005 IEEE International Conference on Field-Programmable Technology, 2005

The seasonal to interannual Earth Science Information Partner: a distributed data and information broker

IGARSS '98. Sensing and Managing the Environment. 1998 IEEE International Geoscience and Remote Sensing. Symposium Proceedings. (Cat. No.98CH36174), 1998

On the Implementation of Information Retrieval as Sparse Matrix Application

Overhead and Scalability Measurements on the Cray T3D and Intel Paragon Systems
