Swati Narwane - Academia.edu

Papers by Swati Narwane

Determinants of Blockchain- Machine Learning Adoption in Additive Manufacturing

International Journal of Industrial and Systems Engineering

Determinants of Machine Learning Adoption in a Manufacturing Supply Chain

Is handling unbalanced datasets for machine learning uplifts system performance?: A case of diabetic prediction

Diabetes & Metabolic Syndrome: Clinical Research & Reviews

Detection of URL based phishing attacks using machine learning: A Survey

International Journal of Engineering Development and Research, Jun 1, 2019

Phishing is a fraudulent attempt to obtain sensitive personal information, such as passwords, usernames, and bank details like credit/debit card details, by masquerading as a reliable organization in electronic communication. It usually redirects users to a website that looks similar to a legitimate one. The phishing website appears the same as the legitimate website and directs the user to a page to enter personal details on the fake site. System administration is very important these days, as any failure can be detected and solved instantly. System administrators also need to define rules and configure firewall settings to avoid phishing attacks through URLs. Researchers have been studying various machine learning algorithms to predict and prevent phishing attacks. Machine learning algorithms can improve the accuracy of the prediction. In machine learning, no single algorithm works best for every problem, and this is especially relevant for supervised learning. A single machine learning algorithm gives good accuracy in predicting phishing attacks, but better accuracy requires something more. The proposed system predicts URL-based phishing attacks with maximum accuracy. We discuss various machine learning algorithms that can help in decision making and prediction. We use more than one algorithm to improve prediction accuracy. Naive Bayes and Random Forest are used in the proposed system to detect URL-based phishing attacks. The hybrid approach, which combines these two algorithms, will increase accuracy.

Designing a Model to Detect Diabetes using Machine Learning

International journal of engineering research and technology, 2019

Many interesting and important applications of machine learning are found in medical organizations. The notion of machine learning has swiftly become very appealing to the healthcare industry. Predictions and analyses made by the research community on medical datasets help people take proper care and precautions to prevent diseases. Different methods are applied extensively to medical datasets to develop decision support systems for disease prediction. This paper explains various aspects of machine learning and the types of algorithms that can help in decision making and prediction. We also discuss various applications of machine learning in the field of medicine, focusing on the prediction of diabetes. Diabetes is one of the fastest-growing diseases in the world and requires continuous monitoring. To address this, we explore various machine learning algorithms that can help in early prediction of the disease.

Detection of URL based Phishing Attacks using Machine Learning

International journal of engineering research and technology, 2019

Phishing is a fraudulent attempt to obtain sensitive personal information, such as passwords, usernames, and bank details like credit/debit card details, by masquerading as a reliable organization in electronic communication. The phishing website appears the same as the legitimate website and directs the user to a page to enter personal details on the fake site. Machine learning algorithms can improve the accuracy of the prediction. The proposed method predicts URL-based phishing attacks from features and also gives maximum accuracy. The method uses uniform resource locator (URL) features. We identified features that phishing site URLs contain, and the proposed method employs those features for phishing detection. The proposed system predicts URL-based phishing attacks with maximum accuracy. We discuss various machine learning algorithms that can help in decision making and prediction. We shall use more than one algorithm to get better accuracy of predictio...
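As a rough illustration of what URL-based features can look like, the sketch below extracts a handful of lexical properties from a URL string. The abstract does not list the features the paper actually uses, so the function name `url_features` and every feature in it (URL length, an `@` symbol, a raw IP address as host, and so on) are assumptions chosen for illustration only.

```python
import ipaddress
from urllib.parse import urlparse


def url_features(url: str) -> dict:
    """Extract a few illustrative lexical features from a URL string.

    The feature set is hypothetical; it is not taken from the paper.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""

    # Check whether the host is a raw IP address rather than a domain name.
    try:
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False

    return {
        "url_length": len(url),
        "has_at_symbol": "@" in url,
        "host_is_ip": host_is_ip,
        "num_dots_in_host": host.count("."),
        "has_hyphen_in_host": "-" in host,
        "uses_https": parsed.scheme == "https",
    }
```

A dictionary like this could be turned into a numeric vector and passed to any classifier; which features are actually informative would have to be established on labelled data.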

A Novel Approach to Handle Class Imbalance in Machine Learning

International journal of engineering research and technology, 2019

Machine learning is the study of algorithms that a system uses to effectively perform a specific task. It relies on patterns and inference instead of explicit instructions. In real-world classification, there is always some level of class imbalance. This problem arises when the classes do not make up equal shares of a data set. It is important to adapt the metrics and methods properly to the goals of the data set. Class imbalance means that many machine learning algorithms have low predictive accuracy for the infrequently occurring class. In this paper, we discuss this problem and look into different approaches used to solve the class imbalance issue. The paper surveys approaches for reducing class imbalance in data sets, covering both data-level approaches and algorithm-level approaches. We discuss oversampling and undersampling methods for overcoming the data imbalance problem. ...

A novel approach to handle class imbalance : A Survey

International Journal of Engineering Development and Research, 2019

Machine learning is the study of algorithms that a system uses to effectively perform a specific task. It relies on patterns and inference instead of explicit instructions. In real-world classification, there is usually some level of class imbalance. This problem arises when the classes do not make up equal shares of a data set. It is essential to adapt the metrics and methods properly to the goals of the data set. Class imbalance means that many machine learning algorithms have low predictive accuracy for the infrequently occurring class. In this paper, we discuss this problem and look into different approaches used to solve the class imbalance issue. The paper surveys approaches for reducing class imbalance in data sets, covering both data-level approaches and algorithm-level approaches. We discuss oversampling and undersampling methods for overcoming the data imbalance problem.
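The two data-level ideas named in the abstract can be sketched in their simplest random form. This is a minimal illustration, not the survey's own procedure: the function names `random_oversample` and `random_undersample` and the `seed` parameter are hypothetical, and the survey may cover more elaborate variants.

```python
import random
from collections import defaultdict


def _group_by_label(samples, labels):
    """Collect samples into a dict keyed by class label."""
    groups = defaultdict(list)
    for x, y in zip(samples, labels):
        groups[y].append(x)
    return groups


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples until every class is as large as the largest class."""
    rng = random.Random(seed)
    groups = _group_by_label(samples, labels)
    target = max(len(group) for group in groups.values())
    out_x, out_y = [], []
    for label, group in groups.items():
        # Draw extra copies (with replacement) only for classes below the target size.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_x.extend(group + extra)
        out_y.extend([label] * target)
    return out_x, out_y


def random_undersample(samples, labels, seed=0):
    """Keep a random subset of each class, sized to match the smallest class."""
    rng = random.Random(seed)
    groups = _group_by_label(samples, labels)
    target = min(len(group) for group in groups.values())
    out_x, out_y = [], []
    for label, group in groups.items():
        # Sample without replacement so no retained sample is duplicated.
        out_x.extend(rng.sample(group, target))
        out_y.extend([label] * target)
    return out_x, out_y
```

Oversampling keeps all the original data but repeats minority samples, while undersampling discards majority samples; in practice either would normally be applied to the training split only.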

Effects of Class Imbalance Using Machine Learning Algorithms

International Journal of Applied Evolutionary Computation, 2021

Class imbalance is a major hurdle for machine learning-based systems. The data set is the backbone of machine learning and must be studied in order to handle class imbalance. The purpose of this paper is to investigate the effect of class imbalance on data sets. The proposed methodology determines the model accuracy for a given class distribution. To find possible solutions, the behaviour of an imbalanced data set was investigated. The study considers two case studies, with the data set divided into class distributions ranging from balanced to unbalanced. Testing with training and test data was carried out for standard machine learning algorithms. Model accuracy for each class distribution was measured with the training data set. Further, the built model was tested with each individual binary class. Results show that, to improve system performance, it is essential to work on class imbalance problems. The study concludes that the system produces biased results due to the majority class. In the fut...

Machine Learning and Class Imbalance: A Literature Survey

Industrial Engineering Journal, 2019

The rapid growth in technologies and inexpensive internet connections has increased the volume of data generated. This data can be used to derive a great deal of information and patterns. Data sets are an essential part of Machine Learning (ML) techniques, but modern data sets suffer from class imbalance, and ML does not work very well with unbalanced data sets. In this context, this paper aims to provide a systematic literature review of unbalanced data sets for ML. The collected papers on the class imbalance problem for ML were grouped into four major categories: binary class imbalance, multi-class imbalance, binary and multi-class imbalance, and rare events class imbalance. The survey focused on various issues in class imbalance for ML. The purpose of the present paper is to help scholars and readers understand the impact of class imbalance on ML. This article contributes to the understanding of the role of unbalanced data sets and their impact on predictive systems.

Dimensionality Reduction of Unbalanced Datasets: Principal Component Analysis

2021 Asian Conference on Innovation in Technology (ASIANCON), 2021

In this digital world, sharing information is very easy and cost-effective, resulting in a large amount of high-dimensional data in a variety of domains such as healthcare and finance. Data available in the healthcare domain is used for disease diagnosis using Machine Learning (ML) models. The data set is the heart of the machine learning model, but the performance of such a model will not be satisfactory if the data set is unbalanced. One important point is that sensitivity (true positive rate) and specificity (true negative rate) can be used as performance measures along with accuracy. For ML-based healthcare systems, sensitivity plays a vital role. To balance an unbalanced data set, the primary step is data preprocessing, such as feature selection and feature extraction. The proposed method is a feature extraction method, Principal Component Analysis (PCA). Experimentation was done on the Pima diabetic data set, and both accuracy and sensitivity were calculated. The results obtained showed that PCA is a better option for dimensionality reduction and that it also helps to improve the performance of the systems.
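The performance measures mentioned in the abstract follow directly from the confusion matrix: sensitivity is TP / (TP + FN), specificity is TN / (TN + FP), and accuracy is (TP + TN) / total. The sketch below computes them for binary labels; the function names and the `positive` parameter are hypothetical, and returning 0.0 when a denominator is empty is a convention assumed here.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for binary labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth == positive:
            fn += 1
        elif pred == positive:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity (true positive rate), and specificity (true negative rate)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }
```

On a made-up toy case with two positives among ten samples, a model that always predicts the negative class reaches an accuracy of 0.8 while its sensitivity is 0.0, which is why accuracy alone can be misleading on unbalanced data.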
