Optimizing Android Malware Detection Via Ensemble Learning

Android Malware Detection System Based on Ensemble Learning

The rapid advancement and widespread use of smartphones have led to a significant increase in new security concerns. Malware’s covert techniques make it difficult for signature-based anti-virus/anti-malware solutions to detect. The features used in such solutions are extracted from static or dynamic analysis. In this paper, an Android malware detection system is proposed. It consists of two main subsystems that work in parallel: one is trained on benign-labeled apps, while the other is trained on malware-labeled apps. Each subsystem is based on an ensemble of three one-class classifiers, OC-SVM, LOF, and a modified isolation forest (M-iForest), and reaches its decision independently. Moreover, each subsystem uses features extracted from both static and dynamic malware analysis. The evaluation has been conducted on two Android malware benchmark datasets, which are DREBI...
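The two-subsystem arrangement described in this abstract can be illustrated with a small sketch. The abstract is truncated and does not state how votes are combined, so the majority rule inside each subsystem, the "uncertain" outcome, and the function names `majority_inlier` and `combine_subsystems` are all assumptions made for illustration. Votes follow the common convention of +1 for "inlier" and -1 for "outlier".

```python
from typing import Sequence


def majority_inlier(votes: Sequence[int]) -> bool:
    """Return True when more than half of the one-class votes are +1 (inlier).

    Assumption: a simple majority rule; the paper's actual rule is not given
    in the abstract.
    """
    inliers = sum(1 for v in votes if v == 1)
    return inliers > len(votes) / 2


def combine_subsystems(benign_votes: Sequence[int],
                       malware_votes: Sequence[int]) -> str:
    """Combine the benign-trained and malware-trained subsystems.

    benign_votes:  votes from the three classifiers trained on benign apps.
    malware_votes: votes from the three classifiers trained on malware apps.
    Assumption: when the subsystems agree with each other (both accept or
    both reject the sample), the result is reported as "uncertain".
    """
    looks_benign = majority_inlier(benign_votes)
    looks_malware = majority_inlier(malware_votes)
    if looks_benign and not looks_malware:
        return "benign"
    if looks_malware and not looks_benign:
        return "malware"
    return "uncertain"


# Hypothetical votes from (OC-SVM, LOF, M-iForest) in each subsystem.
print(combine_subsystems([1, 1, -1], [-1, -1, 1]))
```

In practice the votes would come from fitted one-class models; only the aggregation step is sketched here.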

High Accuracy Android Malware Detection Using Ensemble Learning

With over 50 billion downloads and more than 1.3 million apps in Google’s official market, Android has continued to gain popularity amongst smartphone users worldwide. At the same time, there has been a rise in malware targeting the platform, with more recent strains employing highly sophisticated detection avoidance techniques. As traditional signature-based methods become less potent in detecting unknown malware, alternatives are needed for timely zero-day discovery. Thus, this paper proposes an approach that utilizes ensemble learning for Android malware detection. It combines the advantages of static analysis with the efficiency and performance of ensemble machine learning to improve Android malware detection accuracy. The machine learning models are built using a large repository of malware samples and benign apps from a leading antivirus vendor. The experimental results and analysis presented show that the proposed method, which uses a large feature space to leverage the power of ensemble learning, is capable of 97.3% to 99% detection accuracy with very low false positive rates.

Evaluation of Advanced Ensemble Learning Techniques for Android Malware Detection

Vietnam Journal of Computer Science

Android is the most popular mobile operating system, with billions of active users worldwide, which has attracted advertisers, hackers, and cybercriminals to create malware for various purposes. In recent years, wide-ranging research has been conducted on malware analysis and detection for Android devices, while Android has also implemented various security controls to deal with malware, including a user ID (UID) for every application and system permissions. In this paper, we advance and evaluate various kinds of machine learning (ML) models by applying ensemble-based learning techniques for detecting Android malware, in conjunction with a substring-based feature selection (SBFS) method for the classifiers. In this study, we have extended our previous work, where it was observed that ensemble-based learning techniques obtain better results than previously reported results on the DREBIN dataset, and thus they provide a solid basis for building ...

Android Malware Classification Using Optimized Ensemble Learning Based on Genetic Algorithms

Sustainability

The continuous increase in Android malware applications (apps) represents a significant danger to the privacy and security of users’ information. Therefore, effective and efficient Android malware app-classification techniques are needed. This paper presents a method for Android malware classification using optimized ensemble learning based on genetic algorithms. The suggested method is divided into two steps. First, a base learner is used to handle various machine learning algorithms, including support vector machine (SVM), logistic regression (LR), gradient boosting (GB), decision tree (DT), and AdaBoost (ADA) classifiers. Second, a meta learner RF-GA, utilizing genetic algorithm (GA) to optimize the parameters of a random forest (RF) algorithm, is employed to classify the prediction probabilities from the base learner. The genetic algorithm is used to optimize the parameter settings in the RF algorithm in order to obtain the highest Android malware classification accuracy. The ef...

Empirical Study on Intelligent Android Malware Detection based on Supervised Machine Learning

International Journal of Advanced Computer Science and Applications, 2020

The increasing number of mobile devices using the Android operating system in the market makes these devices the first target for malicious applications. In recent years, several Android malware applications were developed to perform illegitimate activities and harmful actions on mobile devices. In response, specific tools and anti-virus programs have used conventional signature-based methods to detect such Android malware applications. However, the most recent Android malware apps, such as zero-day malware, cannot be detected through conventional methods that are still based on fixed signatures or identifiers. Therefore, recently published research studies have suggested machine learning techniques as an alternative method to detect Android malware, due to their ability to learn from existing information and use it to detect new Android malware apps. This paper presents the basic concepts of Android architecture, Android malware, and permission features utilized as effective malware predictors. Furthermore, a comprehensive review of the existing static, dynamic, and hybrid Android malware detection approaches is presented in this study. More significantly, this paper empirically discusses and compares the performance of six supervised machine learning algorithms, namely K-Nearest Neighbors (K-NN), Decision Tree (DT), Support Vector Machine (SVM), Random Forest (RF), Naïve Bayes (NB), and Logistic Regression (LR), which are commonly used in the literature for detecting malware apps.

An Ensemble Approach Based on Fuzzy Logic Using Machine Learning Classifiers for Android Malware Detection

Applied Sciences

In this study, a fuzzy logic-based dynamic ensemble (FL-BDE) model was proposed to detect malware targeting the Android operating system. The FL-BDE model has a structure that combines the processing power of machine learning (ML)-based methods with the decision-making power of the Mamdani-type fuzzy inference system (FIS). In this structure, six different ML-based methods, namely logistic regression (LR), Bayes point machine (BPM), boosted decision tree (BDT), neural network (NN), decision forest (DF) and support vector machine (SVM), were used to benefit from their scores. However, through an approach involving voting and routing, only the scores of the three ML-based methods that were more successful in classifying either the negative or the positive instances were sent to the FIS to be combined. During the combining process, the FIS processed the incoming inputs and determined the malicious application score. Experimental studies were perfo...

DroidNMD: Network-based Malware Detection in Android Using an Ensemble of One-Class Classifiers

Modares Journal of Electrical Engineering, 2016

During the past few years, the amount of malware designed for Android devices has increased dramatically. To confront Android malware, some anomaly detection techniques have been proposed that are able to detect zero-day malware, but they often produce many false alarms, which makes them impractical for real-world use. In this paper, we address this problem by presenting DroidNMD, an ensemble-based anomaly detection technique that focuses on the network behavior of Android applications in order to detect Android malware. DroidNMD constructs an ensemble classifier consisting of multiple heterogeneous one-class classifiers and uses an ordered weighted averaging (OWA) operator to aggregate the outputs of the one-class classifiers. Our work is motivated by the observation that combining multiple one-class classifiers often produces higher overall classification accuracy than any individual one-class classifier. We demonstrate the effectiveness of DroidNMD using a real dataset of Android...
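For readers unfamiliar with the ordered weighted averaging (OWA) operator mentioned in this abstract, the following is a minimal sketch of the standard operator: the inputs are sorted in descending order and then combined with a fixed weight vector that sums to 1. The scores and weights shown are hypothetical, and the sketch does not reproduce how DroidNMD chooses its weights, which the abstract does not describe.

```python
from typing import Sequence


def owa(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered weighted average of classifier scores.

    The weights apply to rank positions (largest score first), not to
    particular classifiers. Weights must be non-negative and sum to 1.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    ordered = sorted(scores, reverse=True)  # descending order
    return sum(w * s for w, s in zip(weights, ordered))


# Hypothetical anomaly scores from three one-class classifiers.
scores = [0.2, 0.9, 0.5]
print(owa(scores, [1, 0, 0]))        # all weight on the top rank: the maximum
print(owa(scores, [0, 0, 1]))        # all weight on the last rank: the minimum
print(owa(scores, [0.5, 0.3, 0.2]))  # a blend that emphasises high scores
```

The choice of weight vector moves the operator between "or-like" behaviour (maximum) and "and-like" behaviour (minimum), with the plain mean as the uniform-weight case.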

An Evaluation of some Machine Learning Algorithms for the detection of Android Applications Malware

ASTESJ, 2020

The Android operating system (OS) is used far more than any other mobile OS, which makes it a major point of attack. Application installation serves as a major avenue through which attacks can be perpetrated. Users seeking to install third-party applications must first grant permissions. Some permissions are subtle and escape users' attention, and some can have adverse effects such as spying on users or unauthorized retrieval and transfer of data. This calls for a heuristic method for identifying and detecting malware. In this work, classification algorithms including Random Forest, Naïve Bayes, Random Tree, BayesNet, Decision Table, Multi-layer Perceptron (MLP), Bagging, Sequential Minimal Optimization (SMO)/Support Vector Machine (SVM), KStar and IBK (also known as the K-Nearest Neighbours (KNN) classifier) were tested to decide which performs best in Android malware detection. Two datasets, obtained from figshare, were used in this study. The classifiers were trained and tested in the Waikato Environment for Knowledge Analysis (WEKA). The performance metrics used are Root Mean Square Error (RMSE), accuracy, Receiver Operating Characteristic (ROC) curve, false positive rate, F-measure, precision and recall. The best performance on the first dataset was achieved by the multi-layer perceptron, with an accuracy of 99.4%. Random Forest performed best on the second dataset, with an accuracy of 98.9%. This implies that MLP or Random Forest can be used to detect Android malware applications.
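Several of the metrics named in this abstract (accuracy, precision, recall, false positive rate, F-measure) can be derived from the four cells of a binary confusion matrix. The sketch below shows those standard formulas, treating malware as the positive class. The counts are hypothetical and are not taken from the study, and the function name `binary_metrics` is introduced only for illustration.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute common detection metrics from confusion-matrix counts.

    tp: malware correctly flagged      fp: benign apps wrongly flagged
    tn: benign apps correctly passed   fn: malware missed
    Ratios with a zero denominator are reported as 0.0.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "false_positive_rate": fpr,
        "f_measure": f_measure,
    }


# Hypothetical counts for a test set of 1000 apps.
for name, value in binary_metrics(tp=90, fp=10, tn=890, fn=10).items():
    print(f"{name}: {value:.4f}")
```

Reporting the false positive rate alongside accuracy matters when benign apps far outnumber malware, since accuracy alone can look high even if many benign apps are flagged.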

Malware detection in android mobile platform using machine learning algorithms

2017 International Conference on Infocom Technologies and Unmanned Systems (Trends and Future Directions) (ICTUS), 2017

Malware has always been a problem with regard to any technological advance in the software world. Thus, it is to be expected that smartphones and other mobile devices face the same issues. In this paper, a practical and effective anomaly-based malware detection framework is proposed, with an emphasis on the Android mobile computing platform. A dataset consisting of both benign and malicious applications (apps) was installed on an Android device to analyze behavioral patterns. We first generate the system metrics (feature vector) for each app by executing it in a controlled environment. Then, a variety of machine learning algorithms (Decision Tree, K Nearest Neighbor, Logistic Regression, Multilayer Perceptron Neural Network, Naive Bayes, Random Forest, and Support Vector Machine) are used to classify the app as benign or malware. Each algorithm is assessed using various performance criteria to identify which ones are more suitable for detecting malicious software. The results suggest that Random Forest and Support Vector Machine provide the best outcomes, making them the most effective techniques for malware detection.

Empirical Analysis of Forest Penalizing Attribute and Its Enhanced Variations for Android Malware Detection

Applied Sciences

As a result of the rapid advancement of mobile and internet technology, a plethora of new mobile security risks has recently emerged. Many techniques have been developed to address the risks associated with Android malware. The most extensively used method for identifying Android malware is signature-based detection. The drawback of this method, however, is that it is unable to detect unknown malware. As a consequence of this problem, machine learning (ML) methods for detecting and classifying malware applications were developed. The goal of conventional ML approaches is to improve classification accuracy. However, owing to imbalanced real-world datasets, the traditional classification algorithms perform poorly in detecting malicious apps. As a result, in this study, we developed a meta-learning approach based on the forest penalizing attribute (FPA) classification algorithm for detecting malware applications. In other words, with this research, we investigated how to improve Androi...