AbdAllah A. AlHabshy | AlAzhar University, Cairo
Papers by AbdAllah A. AlHabshy
Computers, Materials & Continua
This study focuses on meeting the challenges of big data visualization by using data reduction methods based on feature selection, in order to reduce the volume of big data and minimize model training time (Tt) while maintaining data quality. We contributed to meeting these challenges using the embedded "Select from model (SFM)" method with the "Random forest Importance algorithm (RFI)" and comparing it with the filter "Select percentile (SP)" method based on the chi-square "Chi2" tool for selecting the most important features, which are then fed into a classification process using the logistic regression (LR) algorithm and the k-nearest neighbor (KNN) algorithm. The classification accuracy (AC) of LR is also compared with that of KNN in Python on eight data sets to see which method produces the best result when feature selection methods are applied. The study concluded that feature selection methods have a significant impact on the analysis and visualization of the data after removing repetitive data and data that do not affect the goal. After several comparisons, the study suggests SFMLR: SFM based on the RFI algorithm for feature selection, with the LR algorithm for data classification. The proposal proved its efficacy when its results were compared with recent literature.
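The comparison described in this abstract can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' code: the breast-cancer dataset, the 50th-percentile cutoff, the 70/30 split, and the function name `compare_feature_selection` are all hypothetical choices, and the paper's eight data sets and hyperparameters are not reproduced here.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel, SelectPercentile, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler


def compare_feature_selection(random_state=0):
    """Return test accuracy for each selector/classifier pair (assumed setup)."""
    # Stand-in dataset (an assumption); its features are non-negative.
    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=random_state, stratify=y
    )

    # Embedded method: keep features whose random-forest importance
    # exceeds the mean importance (SelectFromModel's default threshold).
    # Filter method: keep the top 50% of features ranked by chi-square score.
    selectors = {
        "SFM": lambda: SelectFromModel(
            RandomForestClassifier(n_estimators=100, random_state=random_state)
        ),
        "SP": lambda: SelectPercentile(chi2, percentile=50),
    }
    classifiers = {
        "LR": lambda: LogisticRegression(max_iter=1000),
        "KNN": lambda: KNeighborsClassifier(n_neighbors=5),
    }

    scores = {}
    for sel_name, make_selector in selectors.items():
        for clf_name, make_classifier in classifiers.items():
            # MinMaxScaler keeps training features in [0, 1], as chi2
            # requires non-negative inputs when it is fitted.
            model = make_pipeline(MinMaxScaler(), make_selector(), make_classifier())
            model.fit(X_train, y_train)
            scores[sel_name + clf_name] = model.score(X_test, y_test)
    return scores


if __name__ == "__main__":
    for name, accuracy in compare_feature_selection().items():
        print(f"{name}: {accuracy:.3f}")
```

The keys of the returned dictionary (`SFMLR`, `SFMKNN`, `SPLR`, `SPKNN`) follow the abstract's abbreviations; the accuracies depend on the dataset and split and say nothing about the results reported in the paper.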
2022 International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA)
IEEE Access
The growth of the Internet of Things (IoT) generates new processing, networking infrastructure, data storage, and management capabilities. This massive volume of data may be used to provide high-value information for decision support and data-intensive science research, etc. However, owing to the nature of IoT in distribution, virtualisation, cloud integration, and internet connectivity, the IoT environment is prone to various cyber-attacks and security issues. Hence, the increasing frequency and potency of recent attacks and constantly evolving attack vectors necessitate the development of improved detection methods. Therefore, this study proposes a distributed computing-based security model to safeguard big data systems. The proposed ensemble multi binary attack model (EMBAM) is an intrusion detection system (IDS) that offers a unique anomaly-based IDS to detect normal behaviour and abnormal attack(s), for example, threats in a network. EMBAM ensembles multiple binary classifiers into a single model through stacking. The core binary model is a decision tree classifier with hyperparameters optimised using the grid search method. The use of multiple binary classifiers allows each binary classifier to adopt the limitations of the others. Empirical analysis of the experimental profile of the EMBAM has been discussed with eight-plus state-of-the-art methods using performance metrics, such as accuracy, detection rate, precision, specificity, false alarm rate, and F1-score. EMBAM can recognise multiple attack types as a star plug and play advantageous in a highly dynamic scheme. The proposed approach outperforms existing approaches on the UNSW-NB15 dataset and yields competitive results on the CICIDS2017 dataset. INDEX TERMS Anomaly-based IDS, ensemble learning, intrusion detection system, machine learning.
Al-Azhar Bulletin of Science, 2020
Relational databases are usually used for data storage and retrieval. They are suitable for limited data volumes. But when it comes to big data, we need more flexible databases that can handle semi-structured and unstructured data. These databases are called NoSQL (Not only SQL) databases. This type of database was developed to interact with large volumes of data. NoSQL databases provide many features such as scalability, availability, replication models, file sharing, and schema-free design. This paper's main purpose is to present a comparative study of the five main categories of NoSQL databases: key-value stores, document stores, column family stores, graph stores, and object stores. It also discusses the well-known database management systems in each of these five categories. The comparison criteria used are performance, scalability, flexibility, complexity, and functionality. Moreover, this paper presents an overview of big data concepts. It briefly discusses SQL databases versus NoSQL databases in terms of their high-level characteristics. Furthermore, this paper emphasizes the advantages and disadvantages of NoSQL databases. It illustrates the query languages in both SQL and NoSQL databases and presents the most common uses for each category to help users choose the most convenient DBMS for their organization.
Al-Azhar Bulletin of Science, 2021
We live in a time when data streams in by the second, which makes intrusion detection a more difficult and tiresome task; in turn, intrusion detection systems require an efficient and improved mechanism to detect intrusive activities. Moreover, handling the size, complexity, and availability of big data requires techniques that can create beneficial knowledge from huge streams of information, which imposes challenges on both the design and the management of Intrusion Detection Systems (IDSs) and Intrusion Prevention Systems (IPSs) in terms of performance, sustainability, security, reliability, privacy, energy consumption, fault tolerance, scalability, and flexibility. IDSs and IPSs utilize various methodologies to guarantee the security, accessibility, and reliability of enterprise computer networks. This paper presents a comprehensive study of distributed intrusion detection systems in big data, and presents intrusion detection and prevention techniques that utilize machine learning and big data analytics in distributed intrusion detection systems.
PeerJ Computer Science, 2020
Background: As the COVID-19 crisis endures and the virus continues to spread globally, the need for collecting epidemiological data and patient information also grows exponentially. The race against the clock to find a cure and a vaccine for the disease means researchers require storage of increasingly large and diverse types of information; for doctors following patients, recording symptoms and reactions to treatments, the need for storage flexibility is only surpassed by the necessity of storage security. The volume, variety, and variability of COVID-19 patient data require storage in NoSQL database management systems (DBMSs). But with a multitude of existing NoSQL DBMSs, there is no straightforward way for institutions to select the most appropriate. And more importantly, they suffer from security flaws that would render them inappropriate for the storage of confidential patient data. Motivation: This paper develops an innovative solution to remedy the aforementioned shortcomings. ...
Int. J. Netw. Secur., 2019
Securing data over an open network is one of the most critical problems in network security. To secure data, an encryption algorithm should be used. The Hill cipher is one of the most famous encryption algorithms. Although the Hill cipher is not strong enough and is vulnerable to many types of attacks, it still plays a significant role in educational systems. The original Hill cipher is vulnerable to known plaintext attack. In the last decade, the Hill cipher has received much attention, and researchers have proposed many modifications to enhance its security. In this paper we shall show that “A Modified Hill Cipher Based on Circulant Matrices” is vulnerable to both known plaintext attack and chosen plaintext attack. Moreover, we will introduce a new mode of operation which can be used with any block cipher. Then we will propose a new enhanced encryption algorithm. After that, we shall provide a security analysis and efficiency eva...
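The known plaintext weakness of the original Hill cipher mentioned above can be illustrated with a short sketch. It covers only the classic 2x2 cipher over a 26-letter alphabet, not the circulant-matrix variant or the new algorithm analysed in the paper; the key, the sample text, and all function names are hypothetical. Because encryption is the linear map C = K·P (mod 26), an attacker holding two plaintext blocks whose matrix P is invertible modulo 26 can compute K = C·P⁻¹.

```python
M = 26  # alphabet size for A-Z


def to_nums(text):
    """Map uppercase letters to 0..25."""
    return [ord(ch) - ord("A") for ch in text]


def to_text(nums):
    """Map numbers back to uppercase letters."""
    return "".join(chr(n % M + ord("A")) for n in nums)


def mat_mul(a, b):
    """Multiply two 2x2 matrices modulo M."""
    return [
        [(a[i][0] * b[0][j] + a[i][1] * b[1][j]) % M for j in range(2)]
        for i in range(2)
    ]


def mat_inv(m):
    """Invert a 2x2 matrix modulo M; raises ValueError if not invertible."""
    det = (m[0][0] * m[1][1] - m[0][1] * m[1][0]) % M
    det_inv = pow(det, -1, M)  # modular inverse, Python 3.8+
    return [
        [(m[1][1] * det_inv) % M, (-m[0][1] * det_inv) % M],
        [(-m[1][0] * det_inv) % M, (m[0][0] * det_inv) % M],
    ]


def encrypt(key, plaintext):
    """Encrypt an even-length uppercase string, one 2-letter block at a time."""
    nums = to_nums(plaintext)
    out = []
    for i in range(0, len(nums), 2):
        p0, p1 = nums[i], nums[i + 1]
        out.append((key[0][0] * p0 + key[0][1] * p1) % M)
        out.append((key[1][0] * p0 + key[1][1] * p1) % M)
    return to_text(out)


def recover_key(plaintext, ciphertext):
    """Known plaintext attack: solve K = C * P^-1 (mod 26) from two blocks."""
    p, c = to_nums(plaintext[:4]), to_nums(ciphertext[:4])
    # Each column holds one 2-letter block.
    P = [[p[0], p[2]], [p[1], p[3]]]
    C = [[c[0], c[2]], [c[1], c[3]]]
    return mat_mul(C, mat_inv(P))


if __name__ == "__main__":
    secret_key = [[3, 3], [2, 5]]  # hypothetical key, determinant 9 is invertible mod 26
    ciphertext = encrypt(secret_key, "HELP")
    print(ciphertext)
    print(recover_key("HELP", ciphertext))
```

The attack needs only as many known blocks as the key has rows, provided the resulting plaintext matrix is invertible modulo 26; if it is not, `mat_inv` raises `ValueError` and a different pair of blocks has to be tried.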
New Technologies, Mobility and Security
Alexandria Engineering Journal, 2021
Medical image segmentation is important for disease diagnosis and for supporting medical decision systems. The study proposes an efficient 3D semantic segmentation deep learning model “3D-DenseUNet-569” for liver and tumor segmentation. The proposed 3D-DenseUNet-569 is a fully 3D semantic segmentation model with a significantly deeper network and fewer trainable parameters. The proposed model adopts Depthwise Separable Convolution (DS-Conv) as opposed to traditional convolution. The DS-Conv significantly decreases GPU memory requirements and computational cost and achieves high performance. The proposed 3D-DenseUNet-569 utilizes DenseNet connections and UNet links, which preserve low-level features and produce effective results. The results of an experimental study on the standard LiTS dataset demonstrate that the 3D-DenseUNet-569 model is effective and efficient with respect to related studies.
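The parameter saving that depthwise separable convolution offers over a standard 3D convolution can be checked with simple arithmetic. The sketch below is illustrative only: the kernel size and channel counts are hypothetical, bias terms are ignored, and the function names are not taken from the paper. A standard 3D convolution has k³·C_in·C_out weights, while the separable form has k³·C_in weights for the depthwise step plus C_in·C_out for the 1x1x1 pointwise step.

```python
def standard_conv3d_params(kernel, c_in, c_out):
    """Weights in a standard 3D convolution (bias terms ignored)."""
    return kernel ** 3 * c_in * c_out


def ds_conv3d_params(kernel, c_in, c_out):
    """Weights in a depthwise separable 3D convolution (bias terms ignored)."""
    depthwise = kernel ** 3 * c_in  # one k x k x k filter per input channel
    pointwise = c_in * c_out        # 1 x 1 x 1 convolution mixing channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer: 3x3x3 kernel, 64 input channels, 128 output channels.
    standard = standard_conv3d_params(3, 64, 128)
    separable = ds_conv3d_params(3, 64, 128)
    print(standard, separable, round(standard / separable, 1))
```

For this assumed layer the standard convolution needs 221,184 weights against 9,920 for the separable version, roughly a 22-fold reduction; the actual saving in 3D-DenseUNet-569 depends on its layer configuration, which the abstract does not give.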
The 7th International Conference on Information Technology, 2015