Android Malware Classification Using Optimized Ensemble Learning Based on Genetic Algorithms (original) (raw)

Optimizing Android Malware Detection Via Ensemble Learning

International Journal of Interactive Mobile Technologies (iJIM)

Android operating system has become very popular, with the highest market share, amongst all other mobile operating systems due to its open source nature and users friendliness. This has brought about an uncontrolled rise in malicious applications targeting the Android platform. Emerging trends of Android malware are employing highly sophisticated detection and analysis avoidance techniques such that the traditional signature-based detection methods have become less potent in their ability to detect new and unknown malware. Alternative approaches, such as the Machine learning techniques have taken the lead for timely zero-day anomaly detections. The study aimed at developing an optimized Android malware detection model using ensemble learning technique. Random Forest, Support Vector Machine, and k-Nearest Neighbours were used to develop three distinct base models and their predictive results were further combined using Majority Vote combination function to produce an ensemble model...

Android Malware Detection System Based on Ensemble Learning

The rapid advancement of smartphones, as well as their widespread use, has resulted in a significant increase in new security concerns. Malware’s covert techniques make signature-based anti-virus/anti-malware solutions difficult to detect. The features used in such solutions are extracted from static or dynamic analysis. In this paper, an Android malware detection system has been proposed. It consists of two main subsystems that work in parallel, one has been trained for benign labeled apps while the second one has been trained on malware labeled apps. Each subsystem is based on an ensemble approach that consists of OC-SVM, LOF, and modified isolation forest (M-iForest) classifiers. Each subsystem used three one-class classifiers to take the decision in each subsystem independently. Moreover, each subsystem used both features that are extracted from static and dynamic malware analysis. The evaluation has been conducted based on two An-droid malware benchmark datasets which are DREBI...

Evaluation of Advanced Ensemble Learning Techniques for Android Malware Detection

Vietnam Journal of Computer Science

Android is the most well-known portable working framework having billions of dynamic clients worldwide that pulled in promoters, programmers, and cybercriminals to create malware for different purposes. As of late, wide-running inquiries have been led on malware examination and identification for Android gadgets while Android has likewise actualized different security controls to manage the malware issues, including a User ID (UID) for every application, framework authorizations. In this paper, we advance and assess various kinds of machine learning (ML) by applying ensemble-based learning systems for identifying Android malware related to a substring-based feature selection (SBFS) strategy for the classifiers. In the investigation, we have broadened our previous work where it has been seen that the ensemble-based learning techniques acquire preferred outcome over the recently revealed outcome by directing the DREBIN dataset, and in this manner they give a solid premise to building ...

High Accuracy Android Malware Detection Using Ensemble Learning

With over 50 billion downloads and more than 1.3 million apps in Google’s official market, Android has continued to gain popularity amongst smartphone users worldwide. At the same time there has been a rise in malware targeting the platform, with more recent strains employing highly sophisticated detection avoidance techniques. As traditional signature based methods become less potent in detecting unknown malware, alternatives are needed for timely zero-day discovery. Thus this paper proposes an approach that utilizes ensemble learning for Android malware detection. It combines advantages of static analysis with the efficiency and performance of ensemble machine learning to improve Android malware detection accuracy. The machine learning models are built using a large repository of malware samples and benign apps from a leading antivirus vendor. Experimental results and analysis presented shows that the proposed method which uses a large feature space to leverage the power of ensemble learning is capable of 97.3 % to 99% detection accuracy with very low false positive rates.

Empirical Study on Intelligent Android Malware Detection based on Supervised Machine Learning

International Journal of Advanced Computer Science and Applications, 2020

The increasing number of mobile devices using the Android operating system in the market makes these devices the first target for malicious applications. In recent years, several Android malware applications were developed to perform certain illegitimate activities and harmful actions on mobile devices. In response, specific tools and anti-virus programs used conventional signature-based methods in order to detect such Android malware applications. However, the most recent Android malware apps, such as zero-day, cannot be detected through conventional methods that are still based on fixed signatures or identifiers. Therefore, the most recently published research studies have suggested machine learning techniques as an alternative method to detect Android malware due to their ability to learn and use the existing information to detect the new Android malware apps. This paper presents the basic concepts of Android architecture, Android malware, and permission features utilized as effective malware predictors. Furthermore, a comprehensive review of the existing static, dynamic, and hybrid Android malware detection approaches is presented in this study. More significantly, this paper empirically discusses and compares the performances of six supervised machine learning algorithms, known as K-Nearest Neighbors (K-NN), Decision Tree (DT), Support Vector Machine (SVM), Random Forest (RF), Naïve Bayes (NB), and Logistic Regression (LR), which are commonly used in the literature for detecting malware apps.

Empirical Analysis of Forest Penalizing Attribute and Its Enhanced Variations for Android Malware Detection

Applied Sciences

As a result of the rapid advancement of mobile and internet technology, a plethora of new mobile security risks has recently emerged. Many techniques have been developed to address the risks associated with Android malware. The most extensively used method for identifying Android malware is signature-based detection. The drawback of this method, however, is that it is unable to detect unknown malware. As a consequence of this problem, machine learning (ML) methods for detecting and classifying malware applications were developed. The goal of conventional ML approaches is to improve classification accuracy. However, owing to imbalanced real-world datasets, the traditional classification algorithms perform poorly in detecting malicious apps. As a result, in this study, we developed a meta-learning approach based on the forest penalizing attribute (FPA) classification algorithm for detecting malware applications. In other words, with this research, we investigated how to improve Androi...

An Evaluation of some Machine Learning Algorithms for the detection of Android Applications Malware

ASTESJ, 2020

Android Operating system (OS) has been used much more than all other mobile phone's OS turning android OS to a major point of attack. Android Application installation serves as a major avenue through which attacks can be perpetrated. Permissions must be first granted by the users seeking to install these third-party applications. Some permissions can be subtle escaping the attentions of the users. Some of these permissions can have adverse effects like spying on the users, unauthorized retrieval and transference of the data and so on. This calls for the need of a heuristic method for the identification and detection of malware. In this discourse, testing of classification algorithms including Random forest, Naïve Bayes, Random Tree, BayesNet, Decision Table, Multi-layer perceptron (MLP), Bagging, Sequential Minimal Optimization (SMO)/Support-Vector Machine (SVM), KStar and IBK (also known as K Nearest Neighbours classifier (KNN)) was carried out to decide which algorithm performs best in android malware detection. Two dataset was used in this study and were gotten from figshare. They were trained and tested in the Waikato Environment for Knowledge Analysis (WEKA). The performance metrics used are Root Mean Square Error (RMSE), Accuracy, Receiver Operating Curve (ROC), False positive rate, F-measure, Precision and recall. It was discovered that the best performance with an accuracy of 99.4% was the multi-layer perceptron on the first dataset. Random Forest has the best performance with accuracy, 98.9% on the second dataset. The implication of this is that MLP or random forest can be used to detect android application malwares.

LinRegDroid: Detection of Android Malware Using Multiple Linear Regression Models-Based Classifiers

IEEE Access, 2022

In this study, a framework for Android malware detection based on permissions is presented. This framework uses multiple linear regression methods. Application permissions, which are one of the most critical building blocks in the security of the Android operating system, are extracted through static analysis, and security analyzes of applications are carried out with machine learning techniques. Based on the multiple linear regression techniques, two classifiers are proposed for permission-based Android malware detection. These classifiers are compared on four different datasets with basic machine learning techniques such as support vector machine, k-nearest neighbor, Naive Bayes, and decision trees. In addition, using the bagging method, which is one of the ensemble learning, different classifiers are created, and the classification performance is increased. As a result, remarkable performances are obtained with classification algorithms based on linear regression models without the need for very complex classification algorithms.

Behaviour-based Malware Detection in Mobile AndroidPlatforms Using Machine Learning Algorithms

J. Wirel. Mob. Networks Ubiquitous Comput. Dependable Appl., 2021

During the last few years, several approaches have been proposed for detection of Android malware Apps, each usually using its own dataset. Generating a representative Android malware dataset to evaluate malware detection approaches is a challenging task. Recently, the Canadian Institute for Cybersecurity released the CICAndMal2017 dataset, which includes recent and sophisticated Android samples spanning between five distinct categories: Adware, Ransomware, SMS malware, Scareware, and Benign. The best classification result obtained for this dataset was with a Precision of 95.3%, achieved with the Random Forest algorithm, using Permissions and Intents as static features. In this paper, we investigate the usage of nine machine learning algorithms to classify malware in the above mentioned dataset. The comparison of the obtained results is performed with the ones obtained with Random Forest, including performance evaluation (in terms of Precision, Recall, F-Measure, and Accuracy) and resource usage (in terms of execution time and CPU and memory consumption). Besides, we also investigate the use of a non-sliding Bag of System Calls algorithm with the above mentioned machine learning algorithms. It is shown that the Adaboost algorithm, using the Random Forest as a base estimator, leads to the best classification results with an Accuracy of 98.24%, a Precision of 99.31% (for malware), and an F1-Measure of 95.05% (for malware), at the cost of a larger execution time than Random Forest.

Malware detection in android mobile platform using machine learning algorithms

2017 International Conference on Infocom Technologies and Unmanned Systems (Trends and Future Directions) (ICTUS), 2017

Malware has always been a problem in regards to any technological advances in the software world. Thus, it is to be expected that smart phones and other mobile devices are facing the same issues. In this paper, a practical and effective anomaly based malware detection framework is proposed with an emphasis on Android mobile computing platform. A dataset consisting of both benign and malicious applications (apps) were installed on an Android device to analyze the behavioral patterns. We first generate the system metrics (feature vector) from each app by executing it in a controlled environment. Then, a variety of machine learning algorithms: Decision Tree, K Nearest Neighbor, Logistic Regression, Multilayer Perceptron Neural Network, Naive Bayes, Random Forest, and Support Vector Machine are used to classify the app as benign or malware. Each algorithm is assessed using various performance criteria to identify which ones are more suitable to detect malicious software. The results suggest that Random Forest and Support Vector Machine provide the best outcomes thus making them the most effective techniques for malware detection.