Fathy Eassa | King AbdulAziz University (KAU) Jeddah, Saudi Arabia
Papers by Fathy Eassa
In the next decade, new high-performance computing (HPC) architectures, algorithms, and corrections to existing technologies are expected in order to reach exascale computing power and speed. Parallelism is becoming a core point of emphasis for achieving HPC. Given the advantages of parallelism, the GPU is a unit that provides better performance for achieving HPC in exascale computing systems. So far, many programming models have been introduced to program GPUs, such as CUDA, OpenGL, and OpenCL, yet these models still have a number of limitations that require close examination to fix. To enhance the performance of GPU programming in OpenGL, we propose an OpenGL-based testing tool architecture for exascale computing systems. This testing architecture detects errors in OpenGL code and enforces writing the code correctly.
Electronics
In the Internet of Things (IoT), technological developments have increased the significance of federated cloud systems with integrated cloud providers for exchange transactions. Monolithic IoT systems implement service-oriented architecture (SOA), which is complex for supporting scalability and communicating transactions in a federated cloud system. One weakness of conventional security methods is that they depend on a centralized party, which means there is a single point of failure for the system. In contrast, blockchain (BC) and microservice (MS) technologies allow services to be split into independent tasks. In this research paper, we introduce BC security managers based on MS technology for federated cloud systems in an IoT environment. In addition, we present the design of the Federation Security System Manager (FSSM) MS with interoperability features. This enables the exchange of transactions between permissioned BC managers at different cloud providers, with some constraints. Fu...
Applied Sciences
As the development of high-performance computing (HPC) is growing, exascale computing is on the horizon. Therefore, it is imperative to develop parallel systems, such as graphics processing units (GPUs) and programming models, that can effectively utilise the powerful processing resources of exascale computing. A tri-level programming model comprising message passing interface (MPI), compute unified device architecture (CUDA), and open multi-processing (OpenMP) models may significantly enhance the parallelism, performance, productivity, and programmability of the heterogeneous architecture. However, the use of multiple programming models often leads to unexpected errors and behaviours during run-time. It is also difficult to detect such errors in high-level parallel programming languages. Therefore, this present study proposes a parallel hybrid testing tool that employs both static and dynamic testing techniques to address this issue. The proposed tool was designed to identify the r...
IEEE Access, 2022
The world is facing a growth in the amount and variety of data generated by both users and machines. Despite the exponential increases, the tools and technologies developed to manage these data volumes are not intended to meet security and data protection requirements. Additionally, most of the current big data security systems are offered by a centralized third party, which is vulnerable to many security threats. Blockchain technology plays a significant role by addressing modern technology concerns such as decentralization, non-tampering, trust, data ownership, and traceability, giving it great potential to protect personal information. This research presents a new big data security solution that is empowered by blockchain technology and incorporates fragmentation, encryption, and access control techniques. Our proposed fragmentation algorithm takes into account the data owner's demand for encryption to be added to the fragmentation process. Furthermore, data fragments will be stored in the distributed manner offered by the big data environment, resulting in an additional layer of data protection. In order to achieve an optimal security solution, we aim to enhance big data security with acceptable overhead and avoid the encryption overhead for non-sensitive and low-sensitive data portions. We present the results of our implemented techniques to highlight that the overheads (in terms of computation time) introduced by our solution are negligible relative to its security and privacy gains. INDEX TERMS Big data security, blockchain, fragmentation, access control, auditing.
Electronics, Feb 25, 2022
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY
IEEE Access, 2023
The Software Defect Prediction (SDP) method forecasts the occurrence of defects at the beginning of the software development process. Early fault detection decreases the overall cost of software and improves its dependability. However, no effort has been made to address it in high-performance software. The contribution of this paper is predicting and correcting software defects in the Message Passing Interface (MPI) based on machine learning (ML). The system predicts defects, including deadlocks, race conditions, and mismatches, by dividing the model into three stages: training, testing, and prediction. The training phase extracts and combines the features as well as the label and then trains the classifier. During the testing phase, these features are extracted and classified. The prediction phase takes MPI code as input and determines whether it includes defects. If it discovers a defect, the correction subsystem corrects it. We collected 40 MPI codes in C++, including all MPI communication. Results show that the NB classifiers achieve high accuracy, precision, and recall, each about 1. INDEX TERMS High-performance computing, software defect prediction, semantic features, message passing interface, parallel programming.
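To make the deadlock defect class mentioned above concrete, the following is a minimal sketch of a classic wait-for-graph check. It is not the ML-based predictor the abstract describes: the function name `find_deadlock_cycle` and the dictionary encoding of "rank A is blocked waiting on rank B" are assumptions made purely for illustration.

```python
def find_deadlock_cycle(wait_for):
    """Return a list of ranks forming a wait cycle, or None if there is none.

    wait_for maps each rank to the single rank it is blocked on
    (or None if it is not blocked). This encoding is hypothetical.
    """
    for start in wait_for:
        seen = []
        node = start
        # Follow the chain of "blocked on" edges until it ends or repeats.
        while node is not None and node not in seen:
            seen.append(node)
            node = wait_for.get(node)
        if node is not None:
            # The chain re-entered a rank already visited: that is a cycle.
            return seen[seen.index(node):]
    return None


# Two ranks that both post a blocking receive before sending wait on each other.
both_receive_first = {0: 1, 1: 0}
# Rank 0 waits on rank 1, but rank 1 is free to proceed and will unblock it.
ordered_exchange = {0: 1, 1: None}

print(find_deadlock_cycle(both_receive_first))
print(find_deadlock_cycle(ordered_exchange))
```

The first scenario corresponds to the familiar pattern in which every process calls a blocking receive before its matching send; the second shows that a one-directional wait is not a deadlock.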
IEEE Access, 2023
In software development systems, the maintenance process of software systems has attracted the attention of researchers due to its importance in fixing the defects discovered during software testing by using bug reports (BRs), which include detailed information such as the description, status, reporter, assignee, priority, and severity of the bug. The main problem in this process is how to analyze these BRs to discover all defects in the system, which is a tedious and time-consuming task if done manually because the number of BRs increases dramatically. Thus, an automated solution is the best option. Most current research focuses on automating this process from different aspects, such as detecting the severity or priority of the bug. However, it does not consider the nature of the bug, which is a multi-class classification problem. This paper solves this problem by proposing a new prediction model to analyze BRs and predict the nature of the bug. The proposed model constructs an ensemble machine learning algorithm using natural language processing (NLP) and machine learning techniques. We simulate the proposed model by using a publicly available dataset for two online software bug repositories (Mozilla and Eclipse), which includes six classes: Program Anomaly, GUI, Network or Security, Configuration, Performance, and Test-Code. The simulation results show that the proposed model can achieve better accuracy than most existing models, namely, 90.42% without text augmentation and 96.72% with text augmentation. INDEX TERMS Software maintenance, nature classification, ensemble machine learning algorithm, natural language processing, bug reports, machine learning.
IET Software, Aug 1, 2020
Exascale computing systems (ECS) are anticipated to perform at Exaflop speed (10^18 operations per second) with power consumption <20 MW. This ultrascale performance requires a thousand-fold speedup over current Petascale systems. For future high-performance computing (HPC), power consumption is one of the vital challenges to achieving Exaflops through the traditional way of increasing clock speed. One standard way to attain such significant performance is through massive parallelism. In the early stages, it is hard to decide on the promising parallel programming approach that can provide massive parallelism to attain ExaFlops. This article commences with a short description and implementation of algorithms of various hybrid parallel programming models (PPMs) for homogeneous and heterogeneous cluster systems. Furthermore, the authors evaluated performance and power consumption in these hybrid models by implementing them in two HPC benchmarking applications: square matrix multiplication and a Jacobi iterative solver for the two-dimensional Laplace equation. The results demonstrated that the heterogeneous hybrid (MPI + X) outperformed the homogeneous parallel programming (MPI + OpenMP) model. This empirical investigation of hybrid PPMs is a leading step for researchers and development communities to select a promising model for emerging ECS.
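For readers unfamiliar with the second benchmark, here is a minimal serial sketch of a Jacobi iterative solver for the two-dimensional Laplace equation. It is not the authors' hybrid implementation: the function names, the grid size, the boundary values, and the tolerance are all assumptions chosen for illustration. In a hybrid model, the rows of the grid would typically be split across MPI ranks and the inner loops handed to threads or an accelerator.

```python
def jacobi_laplace(grid, tol=1e-6, max_iters=10_000):
    """Jacobi iteration on a rectangular grid with fixed boundary values.

    Each interior point is replaced by the average of its four neighbours,
    always read from the previous iterate. Returns (solution, iterations).
    """
    rows, cols = len(grid), len(grid[0])
    current = [row[:] for row in grid]
    for iteration in range(1, max_iters + 1):
        nxt = [row[:] for row in current]  # boundary rows/columns are copied unchanged
        max_change = 0.0
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                nxt[i][j] = 0.25 * (current[i - 1][j] + current[i + 1][j]
                                    + current[i][j - 1] + current[i][j + 1])
                max_change = max(max_change, abs(nxt[i][j] - current[i][j]))
        current = nxt
        if max_change < tol:
            return current, iteration
    return current, max_iters


def make_grid(n, top=100.0):
    """Hypothetical test problem: n x n grid, top edge held at `top`, other edges at 0."""
    grid = [[0.0] * n for _ in range(n)]
    grid[0] = [top] * n
    return grid


solution, iterations = jacobi_laplace(make_grid(5))
print(iterations, [round(v, 3) for v in solution[2]])
```

Because every update reads only the previous iterate, all interior points can be computed independently within one sweep, which is what makes this kernel a common parallel benchmark.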
International Journal of Advanced Computer Science and Applications, 2017
Here, we present the design and architecture of an Agent-based Manager for Grid Cloud Systems (AMGCS) that uses software agents to ensure independence and scalability when the number of resources and jobs increases. AMGCS handles IaaS resources (Infrastructure-as-a-Service: compute, storage, and physical resources) and schedules compute-intensive jobs for execution over available resources based on QoS criteria, with optimized task execution and high resource utilization, through the capabilities of grid clouds. This prototype design and implementation has been tested and has shown the ability to increase the reliability and performance of cloud applications by distributing their tasks to more than one cloud system, hence increasing the reliability of user jobs and complex tasks submitted from regular machines.
Advances in Science, Technology and Engineering Systems Journal, 2019
Building massively parallel applications has become increasingly important with the coming Exascale-related technologies. To build these applications, a combination of programming models is needed to increase the system's parallelism. One of these combinations is the dual-programming model (MPI + X), which has many structures that increase parallelism in heterogeneous systems that include CPUs and GPUs. The MPI + OpenACC programming model has many advantages and features that increase parallelism on heterogeneous architectures and support different platforms with more performance, productivity, and programmability. The main problem in building systems with different programming models is that it is a hard job for programmers and more error-prone, and the result is not easy to test. Also, testing parallel applications is a difficult task because of the non-deterministic behavior of parallel applications. Even after detecting the errors and modifying the source code, it is not easy to determine whether the errors have been corrected or remain hidden. Furthermore, integrating two different programming models inside the same application makes it even more difficult to test. Also, the misuse of OpenACC can lead to several run-time errors that compilers cannot detect, and programmers will not know about them. To solve this problem, we propose a parallel hybrid testing tool for detecting run-time errors in systems implemented in C++ and MPI + OpenACC. The hybrid techniques combine static and dynamic testing techniques for detecting real and potential run-time errors by analyzing the source code and monitoring it during run time. Using parallel hybrid techniques will reduce the testing time and cover a wide range of errors. Also, we propose a new assertion language to help detect potential run-time errors. Finally, to the best of our knowledge, identifying and classifying OpenACC errors has not been done before, and there is no parallel testing tool designed to test applications programmed using the dual-programming model MPI + OpenACC or the single-programming model OpenACC.
IEEE Access, 2022
Recently, incorporating more than one programming model into a system designed for high performance computing (HPC) has become a popular solution to implementing parallel systems. Since traditional programming languages, such as C, C++, and Fortran, do not support parallelism at the level of multi-core processors and accelerators, many programmers add one or more programming models to achieve parallelism and accelerate computation efficiently. These models include Open Accelerators (OpenACC) and Open Multi-Processing (OpenMP), which have recently been used with various models, including Message Passing Interface (MPI) and Compute Unified Device Architecture (CUDA). Due to the difficulty of predicting the behavior of threads, runtime errors cannot be predicted. The compiler cannot identify runtime errors such as data races, race conditions, deadlocks, or livelocks. Many studies have been conducted on the development of testing tools to detect runtime errors when using programming models, such as the combinations of OpenACC with MPI and OpenMP with MPI. Although more applications use OpenACC and OpenMP together, no testing tools have been developed to test these applications to date. This paper presents a testing tool for detecting runtime errors using a static testing technique. This tool can detect actual and potential runtime errors during the integration of the OpenACC and OpenMP models into systems developed in C++. The tool implements error dependency graphs, which are proposed in this paper. Additionally, a dependency graph of the errors is provided, along with a classification of runtime errors that result from combining the two programming models mentioned earlier.
Simulation, Aug 1, 1999
The major advantages of parallel processing systems are their great reliability and high performance. A class of massively parallel computing systems is the data flow machines. These machines work on the basis of data flow rather than control flow. This paper presents a reliability analysis of data flow machines using a graph theoretical approach. Three machines are considered here. They are the MIT, DDP and LAU static data flow machines. The data flow graph has been employed as a natural tool for representing that class of machines. The isomorphism between Petri nets and data flow graphs has been exploited to detect whether the consistency constraints are satisfied during various operational conditions. Such a graph is extended so that a timed data flow model has been constructed. This model integrates both the reliability features dependent on the system structure and the performance characteristics dependent on the components' behavior. Moreover, a productivity index is introduced for evaluating the three machines.
International journal of cloud applications and computing, Jul 1, 2021
The variety of cloud services (CSs) that are described, their non-uniform naming conventions, and their heterogeneous types and features make cloud service discovery a difficult problem. Therefore, an intelligent cloud service discovery framework (CSDF) is needed for discovering the appropriate services that meet the user's requirements. This study proposes a CSDF for extracting cloud service attributes (CSAs) based on classification, ontology, and agents. Multiple-phase classification with topic modeling has been implemented using different machine learning techniques to increase the efficiency of CSA extraction. CSAs that are represented in different formats have been extracted and represented in a comprehensive ontology to enhance the efficiency and effectiveness of the framework. The experimental results showed that the multiple-phase classification methods with topic modeling for CSs using a support vector machine (SVM) obtained a high accuracy (87.90%) compared to other methods. In addition, the results of extracting CSAs showed high values for precision, recall, and f-measure of 99.24%, 99.24%, and 99.24%, respectively, for the JavaScript Object Notation (JSON) format, followed by 99.05%, 97.20%, and 98.11% for table formats, and lower values for the text format (90.63%, 86.57%, and 88.55%).
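The relationship between the three reported metrics can be checked with a few lines of arithmetic. The sketch below assumes that "f-measure" here means the balanced F1 score, that is, the harmonic mean of precision and recall; the abstract does not state which variant was used, and the function name `f_measure` is chosen only for illustration.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Inputs and output share the same scale (percentages here).
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall values quoted in the abstract, in percent.
text_format = f_measure(90.63, 86.57)
json_format = f_measure(99.24, 99.24)

print(round(text_format, 2), round(json_format, 2))
```

Under that assumption, the text-format and JSON-format figures quoted above are reproduced to two decimal places, and equal precision and recall necessarily give an F-measure of the same value.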
IEEE Access, 2019
With the continued increase in the use of High-Performance Computing (HPC) in scientific fields, the need for programming models that target heterogeneous architectures with less programming effort has become important in scientific applications. OpenACC is a high-level parallel programming model used with the FORTRAN, C, and C++ programming languages to accelerate programmers' code with fewer changes and less effort, which reduces programmer workloads and makes it easier to use and learn. Also, OpenACC has been increasingly used in many top supercomputers around the world, and three of the top five HPC applications in Intersect360 Research are currently using OpenACC. However, when programmers use OpenACC to parallelize their code without correctly understanding OpenACC directives and their usage or following OpenACC instructions, they can cause run-time errors that range from wrong results to performance issues and other undefined behaviors. In addition, building parallel systems by using a higher-level programming model increases the possibility of introducing errors, and the parallel applications thus have non-deterministic behavior, which makes testing and detecting their run-time errors a challenging task. Although there are many testing tools that detect run-time errors, they are still inadequate for detecting errors that occur in applications implemented in high-level parallel programming models, especially OpenACC-related applications. As a result, OpenACC errors that cannot be detected by compilers should be identified, and their causes should be explained. In this paper, our contribution is introducing new static techniques for detecting OpenACC errors, as well as classifying, for the first time, the errors that can occur in OpenACC software programs. Finally, to the best of our knowledge, there is no published work to date that identifies or classifies OpenACC-related errors, nor is there a testing tool designed to test OpenACC applications and detect their run-time errors. INDEX TERMS OpenACC, OpenACC run-time errors, OpenACC error classifications, OpenACC testing tool, Static approach for OpenACC application.
Location Based Services (LBS) have become widespread recently, especially with the great increase in the number of devices and applications that use the Global Positioning System and its related services. However, these applications and services have created new threats and challenges related to the privacy of users’ data, in addition to performance. The service provider (SP) can frequently trace a user's locations and analyze them to reveal sensitive information about the user's life, such as customs, religion, hobbies, job, ethics, social status, friends, and many other private matters. This research proposes a new approach titled “Triple Cache Approach”. This approach combines peer cooperation methods with cache techniques, with the novel idea of using three caches: one in each user device and two in the access point of each cell/area. The proposed approach solves the drawbacks of previous approaches related to performance, cache hit ratio, level of privacy, and trust.
International Journal of Modern Education and Computer Science, Jun 8, 2016
During the last decade, heterogeneous systems have been emerging for high performance computing [1]. In order to achieve high performance computing (HPC), existing technologies and programming models aim for rapid growth toward intra-node parallelism [2]. Current high-computation systems and applications demand a massive level of computation power. In the last few years, the graphical processing unit (GPU) has been introduced as an alternative to the conventional CPU for highly parallel computing applications, both for general-purpose and graphics processing. Rather than using the traditional way of coding algorithms serially on a single CPU, many multithreading programming models have been introduced, such as CUDA, OpenMP, and MPI, to perform parallel processing using multiple cores. These parallel programming models support the data driven multithreading (DDM) principle [3]. In this paper, we present a performance-based preliminary evaluation of these programming models and compare them with the conventional single-CPU serial processing system. We implemented a massive computational operation, complex matrix multiplication, for performance evaluation. We used a data driven multithreaded HPC system for performance evaluation and present the results with a comprehensive analysis of these parallel programming models for HPC parallelism.
The Internet of Things (IoT) refers to a vast number of things (e.g., sensors) that are connected to the internet to share data, such as in smart cities, intelligent transportation, and healthcare. However, these data may be subject to privacy-breaching attacks that pose considerable challenges in the development and deployment of IoT. Therefore, many solutions have been introduced that assume the inclusion of trusted parties (e.g., service providers) to increase the level of privacy. For example, the blind approach depends on asymmetric keys, in which the third parties anonymize the user's identity before forwarding the encrypted query to the service provider, without being able to peruse the transferred data. But what happens if the anonymizer sends the request to the service provider itself, without the user's knowledge? In this case, it will be able to read the answers and breach the user's privacy. In our research, we propose an enhanced technique for the blind approach that enables users to detect any unauthorized access to their data and then take action. A case study is presented in the paper to demonstrate the effectiveness of the enhanced technique in addressing the drawback of the original blind approach and thereby improving the level of user privacy in IoT.
IEEE Access, 2018
The emerging high-performance computing Exascale supercomputing system, which is anticipated to be available in 2020, will unravel many scientific mysteries. This extraordinary processing framework will achieve a thousand-fold increase in computing power compared with the current Petascale framework. The prospective framework will help development communities and researchers move from conventional homogeneous frameworks to heterogeneous frameworks that combine energy-efficient GPU devices with traditional CPUs. Present technologies face several challenges in achieving ExaFlops performance through the Ultrascale framework. Massive parallelism is one of these challenges, and it requires a novel low-power parallel programming approach for attaining massive performance. This paper introduces a new parallel programming model that achieves massive parallelism by combining coarse-grained and fine-grained parallelism over inter-node and intra-node computation, respectively. The proposed framework is a tri-hybrid of MPI, OpenMP, and compute unified device architecture (MOC) that computes input data over a heterogeneous framework. We implemented the proposed model in a linear algebraic dense matrix multiplication application and compared the quantified metrics with well-known basic linear algebra subroutine libraries, such as the CUDA basic linear algebra subroutines library and KAUST basic linear algebra subprograms. MOC outperformed all implemented methods and achieved massive performance while consuming less power. The proposed MOC approach can be considered an initial and leading model for dealing with emerging Exascale computing systems.
In next decade, for exascale high computing power and speed, new high performance computing (HPC)... more In next decade, for exascale high computing power and speed, new high performance computing (HPC) architectures, algorithms and corrections in existing technologies are expected. In order to achieve HPC parallelism is becoming a core emphasizing point. Keeping in view the advantages of parallelism, GPU is a unit that provides the better performance to achieve HPC in exascale computing system. So far, many programming models have been introduced to program GPU like CUDA, OpenGL, and OpenCL etc. and still there are number of limitations for these models that are required a deep glance to fix them. In order to enhance the performance in GPU programming in OpenGL, we have proposed an OpenGL based testing tool architecture for exascale computing system. This testing architecture detects the errors from OpenGL code and enforce to write the code in accurate way.
Electronics
In the Internet of Things (IoT), technological developments have increased the significance of fe... more In the Internet of Things (IoT), technological developments have increased the significance of federated cloud systems with integrated cloud providers for exchange transactions. Monolithic IoT systems implement service-oriented architecture (SOA), which is complex for supporting scalability and communicating transactions in a federated cloud system. One weakness of conventional security methods is that they depend on a centralized party, which means there is a single point of failure for the system. In contrast, blockchain (BC) and microservice (MS) technologies allow services to split for independent tasks. In this research paper, we introduce BC security managers based on MS technology for federated cloud systems in an IoT environment. In addition, we present the design of the Federation Security System Manager (FSSM) MS with interoperability features. This enables the exchange of transactions between permissioned BC managers at different cloud providers, with some constraints. Fu...
Applied Sciences
As the development of high-performance computing (HPC) is growing, exascale computing is on the h... more As the development of high-performance computing (HPC) is growing, exascale computing is on the horizon. Therefore, it is imperative to develop parallel systems, such as graphics processing units (GPUs) and programming models, that can effectively utilise the powerful processing resources of exascale computing. A tri-level programming model comprising message passing interface (MPI), compute unified device architecture (CUDA), and open multi-processing (OpenMP) models may significantly enhance the parallelism, performance, productivity, and programmability of the heterogeneous architecture. However, the use of multiple programming models often leads to unexpected errors and behaviours during run-time. It is also difficult to detect such errors in high-level parallel programming languages. Therefore, this present study proposes a parallel hybrid testing tool that employs both static and dynamic testing techniques to address this issue. The proposed tool was designed to identify the r...
IEEE Access, 2022
The world is facing a growth in the amount and variety of data generated by both users and machin... more The world is facing a growth in the amount and variety of data generated by both users and machines. Despite the exponential increases, the tools and technologies developed to manage these data volumes are not intended to meet security and data protection requirements. Additionally, most of the current big data security systems are offered by a centralized third party, which is vulnerable to many security threats. Blockchain technology plays a significant role by addressing modern technology concerns such as decentralization, non-tampering, trust, data ownership, and traceability, making it great potential to protect personal information. This research presents a new big data security solution empowered by blockchain technology and incorporates fragmentation, encryption, and access control techniques. Our proposed fragmentation algorithm takes into account the data owner's demand for encryption to be added to the fragmentation process. Furthermore, data fragments will be stored in the distributed manner offered by the big data environment, resulting in an additional layer of data protection. In order to achieve an optimal security solution, we aim to enhance big data security with acceptable overhead and avoid the encryption overhead for non-sensitive and low-sensitive data portions. We present the results of our implemented techniques to highlight that the overheads (in terms of computation time) introduced by our solution are negligible relative to its security and privacy gains. INDEX TERMS Big data security, blockchain, fragmentation, access control, auditing.
Electronics, Feb 25, 2022
This article is an open access article distributed under the terms and conditions of the Creative... more This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY
IEEE Access, 2023
The Software Defect Prediction (SDP) method forecasts the occurrence of defects at the beginning ... more The Software Defect Prediction (SDP) method forecasts the occurrence of defects at the beginning of the software development process. Early fault detection will decrease the overall cost of software and improve its dependability. However, no effort has been made in high-performance software to address it. The contribution of this paper is predicting and correcting software defects in the Message Passing Interface (MPI) based on machine learning (ML). This system predicts defects including deadlock, race conditions, and mismatch, by dividing the model into three stages: training, testing, and prediction. The training phase extracts and combines the features as well as the label and then trains on classification. During the testing phase, these features are extracted and classified. The prediction phase inputs the MPI code and determines whether it includes defects. If it discovers a defect, the correction subsystem corrects it. We collected 40 MPI codes in C++, including all MPI communication. Results show the NB classifiers have high accuracy, precision, and recall, which are about 1. INDEX TERMS High-performance computing, software defect prediction, semantic features, message passing interface, parallel programming.
IEEE Access, 2023
In software development, the maintenance of software systems has attracted the attention of researchers because of its importance in fixing the defects discovered during software testing by using bug reports (BRs), which include detailed information such as the description, status, reporter, assignee, priority, and severity of the bug. The main problem in this process is how to analyze these BRs to discover all defects in the system, which is a tedious and time-consuming task if done manually because the number of BRs increases dramatically. Thus, an automated solution is preferable. Most current research focuses on automating this process from different aspects, such as detecting the severity or priority of the bug. However, these studies did not consider the nature of the bug, which is a multi-class classification problem. This paper addresses this problem by proposing a new prediction model to analyze BRs and predict the nature of the bug. The proposed model constructs an ensemble machine learning algorithm using natural language processing (NLP) and machine learning techniques. We simulate the proposed model by using a publicly available dataset for two online software bug repositories (Mozilla and Eclipse), which includes six classes: Program Anomaly, GUI, Network or Security, Configuration, Performance, and Test-Code. The simulation results show that the proposed model can achieve better accuracy than most existing models, namely, 90.42% without text augmentation and 96.72% with text augmentation. INDEX TERMS Software maintenance, nature classification, ensemble machine learning algorithm, natural language processing, bug reports, machine learning.
IET Software, Aug 1, 2020
Exascale computing systems (ECS) are anticipated to perform at Exaflop speed (10^18 operations per second) with power consumption below 20 MW. This ultrascale performance requires a thousand-fold speedup over current Petascale systems. For future high-performance computing (HPC), power consumption is one of the vital challenges to achieving Exaflops through the traditional approach of increasing clock speed. One standard way to attain such significant performance is through massive parallelism. In the early stages, it is hard to decide which parallel programming approach can provide the massive parallelism needed to attain Exaflops. This article commences with a short description and implementation of algorithms of various hybrid parallel programming models (PPMs) for homogeneous and heterogeneous cluster systems. Furthermore, the authors evaluated performance and power consumption in these hybrid models by implementing them in two HPC benchmark applications: square matrix multiplication and a Jacobi iterative solver for the two-dimensional Laplace equation. The results demonstrated that the heterogeneous hybrid model (MPI + X) outperformed the homogeneous parallel programming model (MPI + OpenMP). This empirical investigation of hybrid PPMs is a leading step for researchers and development communities to select a promising model for emerging ECS.
International Journal of Advanced Computer Science and Applications, 2017
Here, we present the design and architecture of an Agent-based Manager for Grid Cloud Systems (AMGCS) that uses software agents to ensure independence and scalability when the number of resources and jobs increases. AMGCS handles IaaS resources (Infrastructure-as-a-Service: compute, storage, and physical resources) and schedules compute-intensive jobs for execution over available resources based on QoS criteria, with optimized task execution and high resource utilization, through the capabilities of grid clouds. This prototype design and implementation has been tested and has shown the ability to increase the reliability and performance of cloud applications by distributing their tasks to more than one cloud system, thereby increasing the reliability of user jobs and complex tasks submitted from regular machines.
Advances in Science, Technology and Engineering Systems Journal, 2019
Building massively parallel applications has become increasingly important with the coming Exascale-related technologies. To build these applications, a combination of programming models is needed to increase the system's parallelism. One of these combinations is the dual-programming model (MPI + X), which has many structures that increase parallelism in heterogeneous systems that include CPUs and GPUs. The MPI + OpenACC programming model has many advantages and features that increase parallelism on heterogeneous architectures and support different platforms with more performance, productivity, and programmability. The main problem in building systems with different programming models is that it is a hard job for programmers and is more error-prone, and the resulting systems are not easy to test. Also, testing parallel applications is a difficult task because of the nondeterministic behavior of parallel applications. Even after detecting the errors and modifying the source code, it is not easy to determine whether the errors have been corrected or remain hidden. Furthermore, integrating two different programming models inside the same application makes it even more difficult to test. Also, the misuse of OpenACC can lead to several run-time errors that compilers cannot detect, and programmers will not know about them. To solve this problem, we propose a parallel hybrid testing tool for detecting run-time errors in systems implemented in C++ and MPI + OpenACC. The hybrid techniques combine static and dynamic testing techniques to detect real and potential run-time errors, both by analyzing the source code and during run time. Using parallel hybrid techniques will reduce the testing time and cover a wide range of errors. Also, we propose a new assertion language to help detect potential run-time errors.
Finally, to the best of our knowledge, identifying and classifying OpenACC errors has not been done before, and there is no parallel testing tool designed to test applications programmed by using the dual-programming model MPI + OpenACC or the single-programming model OpenACC.
IEEE Access, 2022
Recently, incorporating more than one programming model into a system designed for high performance computing (HPC) has become a popular solution to implementing parallel systems. Since traditional programming languages, such as C, C++, and Fortran, do not support parallelism at the level of multi-core processors and accelerators, many programmers add one or more programming models to achieve parallelism and accelerate computation efficiently. These models include Open Accelerators (OpenACC) and Open Multi-Processing (OpenMP), which have recently been used with various models, including Message Passing Interface (MPI) and Compute Unified Device Architecture (CUDA). Due to the difficulty of predicting the behavior of threads, runtime errors cannot be predicted. The compiler cannot identify runtime errors such as data races, race conditions, deadlocks, or livelocks. Many studies have been conducted on the development of testing tools to detect runtime errors when using programming models, such as the combinations of OpenACC with MPI and OpenMP with MPI. Although more applications use OpenACC and OpenMP together, no testing tools have been developed to test these applications to date. This paper presents a testing tool for detecting runtime errors using a static testing technique. This tool can detect actual and potential runtime errors during the integration of the OpenACC and OpenMP models into systems developed in C++. The tool implements error dependency graphs, which are proposed in this paper. Additionally, a classification is provided of the runtime errors that result from combining the two programming models mentioned earlier.
Simulation, Aug 1, 1999
The major advantages of parallel processing systems are their great reliability and high performance. A class of massively parallel computing systems is the data flow machines. These machines work on the basis of data flow rather than control flow. This paper presents a reliability analysis of data flow machines using a graph theoretical approach. Three machines are considered here. They are the MIT, DDP and LAU static data flow machines. The data flow graph has been employed as a natural tool for representing that class of machines. The isomorphism between Petri nets and data flow graphs has been exploited to detect whether the consistency constraints are satisfied during various operational conditions. Such a graph is extended so that a timed data flow model has been constructed. This model integrates both the reliability features dependent on the system structure and the performance characteristics dependent on the components' behavior. Moreover, a productivity index is introduced for evaluating the three machines.
International journal of cloud applications and computing, Jul 1, 2021
The variety of cloud services (CSs) that are described, their non-uniform naming conventions, and their heterogeneous types and features make cloud service discovery a difficult problem. Therefore, an intelligent cloud service discovery framework (CSDF) is needed for discovering the appropriate services that meet the user's requirements. This study proposes a CSDF for extracting cloud service attributes (CSAs) based on classification, ontology, and agents. Multiple-phase classification with topic modeling has been implemented using different machine learning techniques to increase the efficiency of CSA extraction. CSAs that are represented in different formats have been extracted and represented in a comprehensive ontology to enhance the efficiency and effectiveness of the framework. The experimental results showed that the multiple-phase classification methods with topic modeling for CSs using a support vector machine (SVM) obtained a high accuracy (87.90%) compared to other methods. In addition, the results of extracting CSAs showed high values for precision, recall, and f-measure of 99.24%, 99.24%, and 99.24%, respectively, for the JavaScript Object Notation (JSON) format, followed by 99.05%, 97.20%, and 98.11% for table formats, with lower values for the text format (90.63%, 86.57%, and 88.55%).
IEEE Access, 2019
With the continued increase in the use of High-Performance Computing (HPC) in scientific fields, the need for programming models that target heterogeneous architectures with less programming effort has become important in scientific applications. OpenACC is a high-level parallel programming model used with the FORTRAN, C, and C++ programming languages to accelerate programmers' code with fewer changes and less effort, which reduces programmer workloads and makes it easier to use and learn. Also, OpenACC has been increasingly used in many top supercomputers around the world, and three of the top five HPC applications in Intersect360 Research are currently using OpenACC. However, when programmers use OpenACC to parallelize their code without correctly understanding OpenACC directives and their usage or following OpenACC instructions, they can cause run-time errors that range from wrong results to performance issues and other undefined behavior. In addition, building parallel systems by using a higher-level programming model increases the possibility of introducing errors, and parallel applications have nondeterministic behavior, which makes testing and detecting their run-time errors a challenging task. Although there are many testing tools that detect run-time errors, they are still inadequate for detecting errors that occur in applications implemented in high-level parallel programming models, especially OpenACC-related applications. As a result, OpenACC errors that cannot be detected by compilers should be identified, and their causes should be explained. In this paper, our contribution is introducing new static techniques for detecting OpenACC errors, as well as classifying, for the first time, the errors that can occur in OpenACC software programs.
Finally, to the best of our knowledge, there is no published work to date that identifies or classifies OpenACC-related errors, nor is there a testing tool designed to test OpenACC applications and detect their run-time errors. INDEX TERMS OpenACC, OpenACC run-time errors, OpenACC error classifications, OpenACC testing tool, Static approach for OpenACC application.
Location Based Services (LBS) have become widespread recently, especially with the great increase in the number of devices and applications that use the Global Positioning System and its related services. However, these applications and services have created new threats and challenges related to the privacy of users' data, in addition to performance. The service provider (SP) can frequently trace a user's locations and analyze them to reveal sensitive information about the user's life, such as customs, religion, hobbies, job, ethics, social status, friends, and many other private matters. This research proposes a new approach titled the "Triple Cache Approach". This approach combines peer cooperation methods with cache techniques through the novel idea of using three caches: one in each user device and two in the access point of each cell/area. The proposed approach addresses the drawbacks of previous approaches, which relate to performance, cache hit ratio, level of privacy, and trust.
International Journal of Modern Education and Computer Science, Jun 8, 2016
During the last decade, heterogeneous systems have been emerging for high performance computing [1]. In order to achieve high performance computing (HPC), existing technologies and programming models are moving rapidly toward intra-node parallelism [2]. Current high-computation systems and applications demand a massive level of computation power. In the last few years, the graphics processing unit (GPU) has been introduced as an alternative to the conventional CPU for highly parallel computing applications, both for general-purpose and graphics processing. Rather than using the traditional way of coding algorithms serially on a single CPU, many multithreading programming models, such as CUDA, OpenMP, and MPI, have been introduced to enable parallel processing using multiple cores. These parallel programming models support the data driven multithreading (DDM) principle [3]. In this paper, we present a performance-based preliminary evaluation of these programming models and compare them with the conventional single-CPU serial processing system. For the performance evaluation, we implemented a massive computational operation, namely complex matrix multiplication. We used a data driven multithreaded HPC system for the performance evaluation and present the results with a comprehensive analysis of these parallel programming models for HPC parallelism.
The Internet of Things (IoT) refers to a vast number of things (e.g., sensors) that are connected to the internet to share data, such as in smart cities, intelligent transportation, and healthcare. However, these data may be subject to privacy-breaching attacks that pose considerable challenges in the development and deployment of IoT. Therefore, many solutions have been introduced that assume the inclusion of trusted parties (e.g., service providers) to increase the level of privacy. For example, the blind approach solution depends on asymmetric keys, in which the third parties anonymize the user's identity before forwarding the encrypted query to the service provider without being able to peruse the transferred data. However, if the anonymizer itself sends the request to the service provider without the user's knowledge, it will be able to read the answers and violate the privacy of the user. In our research, we propose an enhanced technique for the blind approach that enables users to detect any unauthorized access to their data and then take action. A case study is presented in the paper to demonstrate the effectiveness of the enhanced technique in addressing the drawback of the main blind approach and thereby improving the level of user privacy in IoT.
IEEE Access, 2018
The emerging high-performance computing Exascale supercomputing system, which is anticipated to be available in 2020, will unravel many scientific mysteries. This extraordinary processing system will accomplish a thousand-fold increase in computing power compared with the current Petascale system. The prospective system will help development communities and researchers move from conventional homogeneous systems to heterogeneous systems that combine energy-efficient GPU devices with traditional CPUs. Current technologies face several challenges in achieving ExaFlops performance through the Ultrascale system. Massive parallelism is one of these challenges, and it requires a novel low-power parallel programming approach to attain massive performance. This paper introduces a new parallel programming model that achieves massive parallelism by combining coarse-grained and fine-grained parallelism over inter-node and intra-node computation, respectively. The proposed framework is a tri-hybrid of MPI, OpenMP, and compute unified device architecture (MOC) that computes input data over a heterogeneous system. We implemented the proposed model in a linear algebraic dense matrix multiplication application and compared the quantified metrics with well-known basic linear algebra subroutine libraries, such as the CUDA basic linear algebra subroutines library and KAUST basic linear algebra subprograms. MOC outperformed all implemented methods and achieved massive performance while consuming less power. The proposed MOC approach can be considered an initial and leading model for dealing with emerging Exascale computing systems.