Zunera Jalil | Air University, Islamabad

Papers by Zunera Jalil

Data Augmentation-based Novel Deep Learning Method for Deepfaked Images Detection

ACM Transactions on Multimedia Computing, Communications, and Applications, Apr 13, 2023

Recent advances in artificial intelligence have led to deepfake images, enabling users to replace a real face with a genuine one. Deepfake images have recently been used to malign public figures, politicians, and even average citizens. Deepfake but realistic images have been used to stir political dissatisfaction, blackmail, propagate false news, and even carry out bogus terrorist attacks. Thus, distinguishing real images from fakes has become more challenging. To address these issues, this study employs transfer learning and data augmentation techniques to classify deepfake images. For experimentation, 190,335 RGB-resolution deepfake and real images and image augmentation methods are used to prepare the dataset. The experiments use the following deep learning models: convolutional neural network (CNN), Inception V3, visual geometry group (VGG19) and VGG16 with a transfer learning approach. Essential evaluation metrics (accuracy, precision, recall, F1-score, confusion matrix and AUC-ROC curve score) a...

Authorship Analysis with Machine Learning

Cyber Forensics with Machine Learning

Deepfake Audio Detection via MFCC Features Using Machine Learning

Deep learning for religious and continent-based toxic content detection and classification

Scientific Reports

With time, numerous online communication platforms have emerged that allow people to express themselves, increasing the dissemination of toxic language, such as racism, sexual harassment, and other negative behaviors that are not accepted in polite society. As a result, toxic language identification in online communication has emerged as a critical application of natural language processing. Numerous academic and industrial researchers have recently studied toxic language identification using machine learning algorithms. However, nontoxic comments that include particular identity descriptors, such as Muslim, Jewish, White, and Black, were assigned unrealistically high toxicity ratings in several machine learning models. This research analyzes and compares modern deep learning algorithms for multilabel toxic comment classification. We explore two scenarios: the first is a multilabel classification of religious toxic comments, and the second is a multilabel classification of ...

Efficient Approach for Anomaly Detection in Internet of Things Traffic Using Deep Learning

Wireless Communications and Mobile Computing

The network intrusion detection system (NIDS) is a significant research milestone in information security. A NIDS can scan and analyze the network to detect an attack or anomaly, which may be a continuing intrusion or an intrusion that has just occurred. During the pandemic, cybercriminals realized that home networks were riddled with vulnerabilities due to a lack of security and computational limitations. A fundamental difficulty in NIDS design is providing an effective, robust, lightweight, and rapid framework to perform real-time intrusion detection. This research proposes an efficient, functional cybersecurity approach based on machine/deep learning algorithms to detect anomalies using a lightweight network-based IDS. A lightweight, real-time, network-based anomaly detection system can be used to secure connected IoT devices. The UNSW-NB15 dataset is used to evaluate the proposed approach, DeepNet, and compare its results with other state-of-the-art techniques. For the classifica...

A Novel FCM and DT based Segmentation and Profiling Approach for Customer Relationship Management

2022 2nd International Conference on Artificial Intelligence (ICAI)

A Novel Benchmark Dataset for COVID-19 Detection during Third Wave in Pakistan

Computational Intelligence and Neuroscience

Coronavirus disease (COVID-19) is a highly severe infection caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The polymerase chain reaction (PCR) test is essential to confirm COVID-19 infection, but it has certain limitations: reagents are scarce, the test is time-consuming, and it requires expert clinicians. Clinicians suggest that the PCR test is not a reliable automated COVID-19 patient detection system. This study proposes a machine learning-based approach to evaluate the role of PCR in COVID-19 detection. We collect real data containing 603 COVID-19 samples from the Pakistan Institute of Medical Sciences (PIMS) Hospital in Islamabad, Pakistan, during the third COVID-19 wave. The experiments are separated into two sets. The first set comprises 24 features, including PCR test results, whereas the second comprises 24 features without PCR test. The findings demonstrate that the decision tree achieves the best detection rate for positive and negative COVID-19...

Classification of Non-Functional Requirements From IoT Oriented Healthcare Requirement Document

Frontiers in Public Health, 2022

The Internet of Things (IoT) involves a set of devices that aid in achieving a smart environment. IoT-oriented healthcare systems provide monitoring services for patients' data and help take immediate steps in an emergency. Currently, machine learning-based techniques are adopted to ensure security and other non-functional requirements in smart healthcare systems. However, no attention is given to classifying the non-functional requirements from requirement documents. The manual process of classifying the non-functional requirements from documents is error-prone and laborious. Missing non-functional requirements in the Requirement Engineering (RE) phase results in an IoT-oriented healthcare system with compromised security and performance. In this research, an experiment is performed in which non-functional requirements are classified from the IoT-oriented healthcare system's requirement document. The machine learning algorithms considered for classification are Logistic Re...

Email Classification using LSTM: A Deep Learning Technique

2021 International Conference on Cyber Warfare and Security (ICCWS)

Electronic mail has been in use for decades, and more than four billion users access their emails using different domains and servers. Emails are considered an official way of communication in remote working modes and in online businesses. Email labeling can reduce the effort needed to manage this communication. So far, email classification has sorted emails into categories such as Spam, Non-spam, Junk, and social media. However, email classification that takes into account the types of cybercrimes committed through email has not been done. Emails can be labeled as spam, phishing, fraudulent, harassing, or bullying, or can be general/normal emails. This identification is one of the most challenging tasks for both email service providers and consumers. Several spam identification models have previously been proposed and tested, but very limited work has been done so far on the multi-class classification of emails, that is, classification into more than the two classes of spam and ham. In this paper, we propose a solution to classify emails into four classes: fraudulent, suspicious, harassment, and normal. A deep learning approach named Long Short-Term Memory (LSTM) with stratified sampling has been used to identify the email classes. An effort has also been made to balance the input dataset using over-sampling methods. The proposed model obtained a classification accuracy of more than 90% with stratified sampling only and more than 95% when data balancing techniques were applied to the dataset.

Authorship identification using ensemble learning

Scientific Reports

With time, textual data is proliferating, primarily through the publication of articles. With this rapid increase in textual data, anonymous content is also increasing. Researchers are searching for alternative strategies to identify the author of an unknown text. There is a need to develop a system to identify the actual author of unknown texts based on a given set of writing samples. This study presents a novel approach based on ensemble learning, DistilBERT, and conventional machine learning techniques for authorship identification. The proposed approach extracts the valuable characteristics of the author using a count vectorizer and bi-gram term frequency-inverse document frequency (TF-IDF). An extensive and detailed dataset, “All the news”, is used in this study for experimentation. The dataset is divided into three subsets (article1, article2, and article3). We limit the scope of the dataset and select ten authors in the first scope and 20 authors in the second scope for exp...

Email Classification and Forensics Analysis using Machine Learning

IEEE

Emails have been used as a reliable, secure, and formal mode of communication for a long time. With fast and secure communication technologies, reliance on email has increased as well. The massive increase in email data has made managing emails a major challenge. So far, emails can be classified and grouped based on sender, size, and date. However, there is a need to detect and classify emails based on their contents. Several approaches have been used in the past for content-based classification of emails as spam or non-spam. In this paper, we propose a multi-label email classification approach to organize emails. An efficient classification method has been proposed for forensic investigations of massive email data (e.g., a disk image of an email server). This method would help investigators in email-related crime investigations. A comparative study of machine learning algorithms identified Logistic Regression as the method that achieves the highest accuracy compared to Naive Bayes, Stochastic Gradient Descent, Random Forest, and Support Vector Machine. Experiments conducted on benchmark data sets showed that Logistic Regression performs best, with an accuracy of 91.9% with bi-gram features.

SeFACED: Semantic-Based Forensic Analysis and Classification of E-Mail Data Using Deep Learning

IEEE Access PP(99), 2021

Artificial Intelligence (AI) in combination with the Internet of Things (IoT), called AIoT, is an emerging trend in industrial applications and is capable of intelligent decision-making with self-driven analytics. With their extensive usage in diverse scenarios, IoT devices generate bulk data that can be manipulated by attackers to disrupt normal operations and services. Hence, there is a dire need for proactive data analyses that can prevent cyber-attacks and crimes. To investigate crimes involving electronic mail (email), analysis of both the header and the email body is required, since the semantics of communication help to identify the source of potential evidence. With the continued growth of data shared via emails, investigators now face the daunting challenge of extracting the required semantic information from the bulk of emails, which delays the investigation process. This gives criminals an edge in erasing the footprints of their malicious acts. The existing keyword-based search and filtration techniques often return extraneous, short-sequence emails and skip meaningful information. To overcome this limitation, we designed a novel, efficient approach called SeFACED that uses a Long Short-Term Memory (LSTM) based Gated Recurrent Neural Network (GRU) for multiclass email classification. SeFACED caters not only to short sequences but also to long dependencies of 1000+ characters. SeFACED focuses on tuning the LSTM-based GRU parameters to attain the best performance, which is assessed by comparing it with traditional Machine Learning (ML) and Deep Learning (DL) models and state-of-the-art studies on the subject. Experimental results on self-extended benchmark datasets show that SeFACED outperforms existing methods while keeping the classification process robust and reliable.

A Large-Scale Benchmark Dataset for Anomaly Detection and Rare Event Classification for Audio Forensics

Privacy of Web Browsers: A Challenge in Digital Forensics

Lecture Notes in Electrical Engineering, 2022

Evading obscure communication from spam emails

Mathematical Biosciences and Engineering, 2021

Spam is any form of annoying and unsought digital communication sent in bulk and may contain offensive content carrying viruses and cyber-attacks. The voluminous increase in spam has necessitated developing more reliable and vigorous artificial intelligence-based anti-spam filters. Besides text, an email sometimes contains multimedia content such as audio, video, and images. However, text-centric email spam filtering employing text classification techniques remains today's preferred choice. In this paper, we show that text pre-processing techniques nullify the detection of malicious contents in an obscure communication framework. We use the SpamAssassin corpus with and without text pre-processing and examine it using machine learning (ML) and deep learning (DL) algorithms to classify emails as ham or spam. The proposed DL-based approach consistently outperforms ML models. In the first stage, using pre-processing techniques, the long short-term memory (LSTM) model achieves the...

Future Smart Cities: Requirements, Emerging Technologies, Applications, Challenges, and Future Aspects

Email Classification and Forensics Analysis using Machine Learning

2021 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computing, Scalable Computing & Communications, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/IOP/SCI), 2021

SeFACED: Semantic-Based Forensic Analysis and Classification of E-Mail Data Using Deep Learning
