Mafas Raheem - Academia.edu
Papers by Mafas Raheem
International Journal of Advanced Computer Science and Applications
Tourism research has benefitted from the worldwide spread and development of social networking services. People nowadays are more likely to rely on internet resources to plan their vacations. Thus, travel recommendation systems are designed to sift through the mammoth amount of data and identify the ideal travel destinations for users. Moreover, it has been shown that the increasing availability and popularity of geotagged data significantly impacts the destination decision. However, most current research concentrates on reviews and textual information to develop the recommendation model. Therefore, the proposed travel recommendation model examines the collective behaviour and connections between users based on geotagged data to provide personalized suggestions for individuals. The model was developed using the user-based collaborative filtering technique. The matrix factorization model was selected as the collaborative filtering technique to compute user similarities due to its adaptability in dealing with sparse rating matrices. The recommendation model generates prediction values to recommend the most appropriate locations. Finally, the performance of the proposed model was assessed against the popularity and random models with a test design based on Mean Average Precision (MAP), Root Mean Square Error (RMSE), and Mean Absolute Error (MAE). The findings indicated that the proposed matrix factorization model has an average MAP of 0.83, with RMSE and MAE values of 1.36 and 1.24, respectively. The proposed model achieved significantly higher MAP values and the lowest RMSE and MAE values compared to the two baseline models. The comparison shows that the proposed model is effective in providing personalized suggestions to users based on their past visits.
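To make the technique and two of the metrics named in this abstract concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not say how the factorization was trained, so plain stochastic gradient descent with L2 regularization is assumed here, and the names `train_mf`, `predict`, `rmse` and `mae`, the hyperparameters, and the small visit-score triples are all hypothetical.

```python
import math
import random


def train_mf(ratings, n_users, n_items, k=2, lr=0.01, reg=0.02, epochs=500, seed=0):
    """Factorize observed (user, item, rating) triples into user factors P
    and item factors Q using SGD (an assumed training method)."""
    rng = random.Random(seed)
    P = [[rng.uniform(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            # Error between the observed rating and the current dot product.
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def predict(P, Q, u, i):
    """Predicted score for user u and item i: dot product of their factors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))


def rmse(actual, predicted):
    """Root Mean Square Error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical (user, location, score) triples; only observed pairs are listed,
# which is how a sparse rating matrix is usually represented.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
P, Q = train_mf(ratings, n_users=3, n_items=3)

actual = [r for _, _, r in ratings]
fitted = [predict(P, Q, u, i) for u, i, _ in ratings]
print("RMSE on observed pairs:", rmse(actual, fitted))
print("MAE on observed pairs:", mae(actual, fitted))
```

Because only observed pairs enter the loss, the sketch never needs the full user-by-location matrix, which is the usual reason factorization is considered well suited to sparse data. A real evaluation would compute RMSE and MAE on held-out pairs rather than on the training pairs used here.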
International Journal of Data Science
The research aims to perform a meaningful human resource analysis of data science employment, using the strong influence of specialized skill sets to assist salary prediction. With the explosive development of big data, a data science job shortage has occurred, along with a demand for highly accurate recruitment to hire suitable professionals for specific data science roles. To achieve such outcomes, current data science employment trends were analyzed based on a secondary dataset. Useful analytical insights for securing jobs and for better career development were provided through the main dashboard. In addition, the significant in-demand data science skill variables were identified for more effective model building. In particular, certain data pre-processing techniques were performed extensively to prepare and optimize the dataset for the mentioned human resource analytics purposes. The ensemble model was selected as the most suitable salary prediction model with the lowest Average Squared Error (...
Indian Journal Of Science And Technology
Objectives: To examine whether the integration of social media features from YouTube videos and Spotify audio features can effectively predict music popularity. Methods: A dataset is constructed by collecting newly released tracks from May to August 2021. Audio features are acquired from Spotify, while social media features are obtained from the official videos on YouTube. Music popularity is defined using five metrics derived from the Spotify Top 200 daily chart performance to measure diverse aspects of the songs' success (Length, Max, Sum, Mean, and Debut). The predicted popularity takes three levels: Low, Medium, and High. During model implementation, four machine learning models were trained on the dataset in two stages: first on audio features only, and then on both audio and social media features. Findings: At the second stage, random forest outperformed the other three models with the best results for the four evaluation metrics. In detail, the model achieved an accuracy of 79.6%, macro-precision of 74.5%, macro-recall of 73.2%, and macro F1-score of 73.1% on average across the five popularity metrics used. Moreover, the results from both experimental stages showed that incorporating social media variables significantly increased model performance relative to the use of audio features only, with margins of improvement ranging from 10% to 60%. This demonstrates that YouTube-based social media features can help industry practitioners identify potentially popular hits. Novelty: This research appears to be the first study to date in the Hit Song Science domain that utilizes social media data from YouTube for the prediction of hit songs. Furthermore, it promotes the prediction of potential hits by using audio features and social media data jointly.
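The macro-averaged metrics reported in this abstract can be illustrated with a short Python sketch. It assumes the common convention that macro F1 is the unweighted mean of the per-class F1 scores; the abstract does not state which convention was used. The function name `macro_scores` and the six example labels are hypothetical and are not data from the study.

```python
def macro_scores(y_true, y_pred, labels):
    """Return (macro-precision, macro-recall, macro-F1), each computed as the
    unweighted mean of the per-class values over `labels`."""
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        # Guard against division by zero when a class is never predicted or never present.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Made-up popularity labels for six tracks.
labels = ["Low", "Medium", "High"]
y_true = ["Low", "Low", "Medium", "Medium", "High", "High"]
y_pred = ["Low", "Medium", "Medium", "Medium", "High", "Low"]

accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
macro_p, macro_r, macro_f1 = macro_scores(y_true, y_pred, labels)
print(accuracy, macro_p, macro_r, macro_f1)
```

Macro averaging gives each popularity level equal weight regardless of how many tracks fall into it, which is why it can differ noticeably from accuracy when the classes are imbalanced.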
For an organization to gain an edge over its competitors, it must be able to make the right decisions. Using machine learning models such as XGBoost and Artificial Neural Network during data analysis presents a way to obtain the information necessary for making decisions. However, there is a need to secure both the data and the analysis results to maintain data integrity whilst ensuring that malicious actors cannot access them. The solution is to merge data analytics with a private blockchain by storing the data and the results on the blockchain. Using smart contracts for access control ensures that only users with permission can access the data, whilst its integrity is maintained since it cannot be modified. Keywords: Data Analytics, Blockchain, Artificial Neural
India is an agriculture-based economy with 18% of its total Gross Domestic Product (GDP) coming from different agricultural products. Agriculture 4.0, with modern technologies and robots for precision farming, is shaping the future of agriculture in many places. In this research, the latest technologies, such as data science and machine learning algorithms, are applied to understand the different factors contributing to a profitable crop in India. These methods are applied to historical data collected from different Indian government websites and publicly available data sets. This research provides a crop recommendation system with the prime motive of improving the economic welfare of farmers. Multiple factors such as cost of planting, cost of harvesting, rainfall, crop demand, cost of seed, cost of fertilizer and yield of crop are considered to generate a more accurate prediction of whether a crop will be profitable or not. Keywords: Machine Learning, Recommender System &
In this digital era, online product reviews are ubiquitous. Many shoppers tend to read online product reviews before making an actual purchase, as doing so minimizes the risk of purchasing an undervalued product. Relatively few research works address the techniques and methods used to find the correlation between online product reviews and actual purchases. It is found that review valence, review quantity, review timeliness and review length are closely related to the actual online purchase. Also, the use of opinion mining/sentiment analysis on consumers' reviews of their online purchases features prominently in prior research. Therefore, this paper aims to summarize the effect of online reviews on the actual online purchase and the use of sentiment analysis on online reviews in motivating the actual online purchase. The research findings showed all the selected elements except the revie...
Journal of Physics: Conference Series, 2020
Recommender models for personalized marketing empower businesses to provide personalized recommendations of goods or services to customers to fulfil their requirements, thus ultimately improving the customer buying experience. Various recommender models powered by robust machine learning algorithms were reviewed in terms of their methods and techniques to appraise their performance in personalized marketing campaigns. Recommender models can be broadly categorized into four types: content-based, collaborative-based, knowledge-based and hybrid-based. The content-based recommendation is suitable when the system, user or product is new, and classification and regression algorithms are mostly implemented in this case. The collaborative-based recommendation is suitable when a more accurate prediction is required, and neighbour-based models, Bayesian methods, rule-based models, decision trees, and latent matrix factorization models may be implemented in this scenario. Knowledge-based recommender...
The internet and social media platforms have made massive quantities of information available to users worldwide. Numerous internet sources present categorical views on activities, goods and services, beliefs or even the mood expressed by online users. In this competitive business world, various industries, especially e-commerce, make extensive use of sentiment analysis to increase productivity and make better business decisions. Sentiment analysis is a branch of analytics that has proven to be one of the significant instruments for revealing actionable insights from very large text databases across many domains. This paper presents a comprehensive overview of sentiment analysis and relevant techniques in the e-commerce sector, which is always keen to find out consumers’ opinions of its goods and services. It starts with the notion of sentiment assessment, which has emerged as a method for understanding clients’ emotions. It also describes the ...
Understanding user behavior and the popularity of events has become a widespread research topic in this competitive business world. Precise event attendance prediction enables effective organization, planning and resource allocation, which are vital for facilitating event participation and community development. Accurate event popularity prediction is beneficial to event organizers, enabling them to plan ahead on manpower, advertisement, logistics and many other aspects. This paper presents a comprehensive research review of event attendance prediction in event-based social networks (EBSNs). It first identifies the key factors that influence offline attendance at an event, and then looks into the methods used in developing a predictive model, which also leads to the development of an event recommender system. Finally, different predictive models that have been proposed for event attendance prediction are explored in this context.
Webology, 2021
The exponential growth of social media has spurred an increase in the propagation of hate. Recent evidence shows that hate speech on social media is detrimental to the mental and physical health of individuals. Thus, there is an emerging need for automated hate speech detection. Automated hate speech detection rests on the intersection between Natural Language Processing (NLP) techniques and machine learning models. An introduction to NLP and its utilities, as well as commonly employed features and classification methods in hate speech detection, are discussed. Hate speech detection in non-English languages is needed to tackle this emergent issue in countries where multiple languages are used. Hence, an overview of the current literature on hate speech detection in non-English languages is covered too. Challenges in the field of hate speech detection are explored, and the importance of standardized methodologies for building corpora and data sets is emphasized.
The International Journal of Recent Technology and Engineering (IJRTE), 2021
Diabetes has become a well-known and lethal disease in low- and medium-income countries. People struggle to overcome this deadly condition because of current lifestyles, food habits and genetic transmission. Medical practitioners provide advice to prevent the diabetic condition and medications to control it, as the disease has no permanent cure. However, detecting the disease is a tedious process, and deploying machine learning predictive models for smart diagnosis/detection is vital in the healthcare domain nowadays. Though several machine learning models have been built in this regard, deploying a Deep Neural Network has received less attention. Therefore, a Deep Neural Network model was built with the support of complete preprocessing, class balancing, normalization, a feature selection process and hyper-parameter tuning using the cross-validated searching technique. The model achieved 88% accuracy and an ROC score of 0.88, standing out as a promising predic...
International Journal of Advanced Computer Science and Applications, 2020
The big data revolution has resonated in the banking sector, especially in dealing with massive amounts of data. Banks have the opportunity to learn about customers' opinions and satisfaction regarding their products by analyzing the data gathered every day. Banks can thus transform these data into high-quality information that allows them to improve their business, especially in credit cards, which are becoming a short-term business for banks nowadays. Further, sentiment analysis has become highly important in the field of data analytics, especially as customers' opinions have a huge impact on making profitable business decisions. The outcome of the sentiment analysis helps banks identify the deficiencies of their products and improve them to satisfy customers. From the sentiment analysis, 45% of the customers were negative, 30% were positive and 25% were neutral towards the credit card facility offered by the commercial banks. Also, the prediction of credit card customer satisfaction will contribute significantly to creating new opportunities for banks to enhance their promotion as well as the credit card business in future. The Random Forest algorithm was applied in three experiments utilizing the normal data, the balanced data and the optimized model with the normal data. The optimized model with the normal data obtained the highest accuracy of 87.38%, followed by the normal dataset at 85.82%, while the balanced dataset had the lowest accuracy at 82.83%.
Regular, 2020
Diabetes is a well-known and common disease around the world. Diabetes causes many anomalies in the body and leaves patients on long-term medication. Detecting diabetes has relied on laborious medical tests, which delays patients in learning their test results. However, data mining and machine learning approaches are at the frontline in supporting the health care domain to make effective predictions in this regard. This paper elaborates on predicting Type 2 Diabetes Mellitus using classification models. A suitable secondary dataset was used to build classification models, and the most suitable model was selected via valid performance measures. To this end, Random Forest, Support Vector Machine, Naïve Bayes and Artificial Neural Network models were built. Based on the performance measures, Random Forest was identified as the most suitable classifier, with an accuracy of 90% and recall and precision values of 0.90.
International Journal of Recent Technology and Engineering (IJRTE), 2020
Big data has revolutionized every field of life, including human learning. The field of education has progressed in the past couple of decades and, in addition, rapid growth in the number of educational institutions has created tough competition. The massive accumulation of data in the educational sector has created great scope for EDM (Educational Data Mining) with the support of robust predictive models. It is necessary to regularly examine the performance of students to help them perform better, which in turn helps to maintain the reputation of the institution. This study proposed a predictive model through which student performance can be forecasted depending upon various characteristics. The KDD (Knowledge Discovery in Databases) methodology was followed stepwise in this study to develop predictive models of student performance. The data balancing techniques such as SMOTE (Synthetic Minority Over-sampling Technique) and ADASYN (Adaptiv...
International Journal of Recent Technology and Engineering, 2020
Business Analysis has become one of the crucial elements of any business in this data-driven business world. It is at the frontline, where data analytics supports strategic management in making effective decisions with immense computing power. This paper investigates the big data problems of Adventure Works Cycles (AWC) by using analytical techniques and integrating different methods of knowledge discovery and data mining via descriptive and predictive analytics. The descriptive analytics revealed the prevailing business condition, which could aid in making effective decisions. Consequently, an empirical study was performed to explore different types of predictive models to predict future occurrences. Furthermore, a comparative analysis of different predictive algorithms provides evidence that the High-Performance Forest algorithm is particularly effective in predicting future occurrences, with an accuracy of 80%, an ROC index of 0.878 and a cumulative lift value of 1....
Journal of Computational and Theoretical Nanoscience, 2019
In this era, big data is the most common buzzword across different industries due to its capabilities for collecting, processing, storing and analysing data. The advancement of E-Commerce has paved the way for merchants and customers to meet online to satisfy their requirements by exchanging goods and services at a reasonable cost. The challenges and opportunities of big data, with an emphasis on data privacy and security, are a widely discussed topic among businesses, especially E-Commerce merchants. Several reviews are available that emphasize big data opportunities and challenges with regard to privacy and security. However, a comprehensive review of E-Commerce that thematically highlights the tools and technologies has not been given enough consideration. Therefore, the purpose of this study is to review the state-of-the-art technologies for privacy and security in E-Commerce platforms. The identified cryptographic technologies were also discussed with the rational standpoint...
2022 IEEE International Conference on Distributed Computing and Electrical Circuits and Electronics (ICDCECE)
International Journal of Advanced Computer Science and Applications
Tourism research has benefitted from the worldwide spread and development of social networking se... more Tourism research has benefitted from the worldwide spread and development of social networking services. People nowadays are more likely to rely on internet resources to plan their vacations. Thus, travel recommendation systems are designed to sift through the mammoth amount of data and identify the ideal travel destinations for the users. Moreover, it is shown that the increasing availability and popularity of geotagged data significantly impacts the destination decision. However, most current research concentrates on reviews and textual information to develop the recommendation model. Therefore, the proposed travel recommendation model examines the collective behaviour and connections between users based on geotagged data to provide personalized suggestions for individuals. The model was developed using the user-based collaborative filtering technique. The matrix factorization model was selected as the collaborative filtering technique to compute user similarities due to its adaptability in dealing with sparse rating matrices. The recommendation model generates prediction values to recommend the most appropriate locations. Finally, the model performance of the proposed model was assessed against the popularity and random models using the test design established using Mean Average Precision (MAP), Root Mean Square Error (RMSE), and Mean Absolute Error (MAE). The findings indicated that the proposed matrix factorization model has an average MAP of 0.83, with RMSE and MAE values being 1.36 and 1.24, respectively. The proposed model got significantly higher MAP values and the lowest RMSE and MAE values compared to the two baseline models. The comparison shows that the proposed model is effective in providing personalized suggestions to users based on their past visits.
International Journal of Data Science
The research aims to perform meaningful human resource analysis on data science employment using ... more The research aims to perform meaningful human resource analysis on data science employment using the strong influences of specialized skills set with assisting salary prediction. With explosive big data development, a data science job shortage has occurred with high accurate recruitment demand to hire suitable professionals for specific data science roles. To achieve such outcomes, the current data science employment trends were analyzed based on a secondary dataset. Useful analytics insights for job securement and better career development were provided through the main dashboard. Besides, the significant in-demand data science skill variables were also identified for further effective model building. Particularly, certain data pre-processing techniques were performed extensively to prepare and optimize the dataset for the mentioned human resource analytics purposes. The ensemble model was selected as the most suitable salary prediction model with the lowest Average Squared Error (...
Indian Journal Of Science And Technology
Objectives: To examine whether the integration of Social Media features from YouTube videos and S... more Objectives: To examine whether the integration of Social Media features from YouTube videos and Spotify audio features can effectively predict music popularity. Methods: A dataset is constructed by collecting newly released tracks from May to August 2021. Audio features are acquired from Spotify while social media features are obtained from the official videos on YouTube. Music popularity is defined using five metrics derived from the Spotify Top 200 daily chart performance to measure diverse aspects of the songs' success (Length, Max, Sum, Mean, and Debut). The predicted popularity has three target variables, ranging from Low, Medium to High popularity. During model implementation, four machine learning models were trained on the dataset in two different stages such as purely audio features and both audio and social media features respectively. Findings: At the second stage, random forest outperformed the other three models with the best results for the four-evaluation metrics. In detail, the model generated accuracy of 79.6%, macro-precision of 74.5%, macro-recall of 73.2%, and macro F1-scores of 73.1% on average across the five-popularity metrics used. Moreover, the results from both experimental stages showed that the incorporation of social media variables significantly increased the model performances relative to the use of audio features only, with the margins of improvement ranging from 10% to 60%. This demonstrates that YouTube-based social media features are beneficial for the use of industry practitioners to identify potentially popular hits. Novelty: This research appears to be the first study to date in the Hit Song Science domain that utilizes Social Media data from YouTube for the prediction of hit songs. Furthermore, it promotes the prediction of potential hits by using audio features and social media data jointly.
In order for an organization to gain an edge over their competitor, they must be able to make the... more In order for an organization to gain an edge over their competitor, they must be able to make the right decision. Using machine learning models such as XGBoost and Artificial Neural Network during data analysis presents a way to obtain the information necessary for making decision. However, there is a need to secure both the data and the results from analysis to maintain data integrity whilst ensuring that malicious actors will not be able to access. The solution is to merge data analytics together with a private blockchain by storing the data and the result on the blockchain. Using smart contracts for access control, this ensures only users with permission will be able to access whilst maintaining the integrity of the data since it cannot be modified. Keywords— Data Analytics, Blockchain, Artificial Neural
India is an agriculture-based economy with 18% of its total Gross Domestic Product (GDP) coming f... more India is an agriculture-based economy with 18% of its total Gross Domestic Product (GDP) coming from different agricultural products. Agriculture 4.0 with modern technologies and robots for precision farming is shaping the future of agriculture in many places. In this research latest technologies like data science and machine learning algorithms are applied to understand different factors contributing to a profitable crop in India. These methods are applied on historical data collected from different Indian government web sites and publicly available data sets. This research provides a crop recommendation system with a prime motive of creating economic welfare of farmers. Multiple factors such as cost of planting, cost of harvesting, rainfall, crop demand, cost of seed, cost of fertilizer and yield of crop are considered to generate a more accurate prediction of whether a crop will be profitable or not. Keywords—Machine Learning, Recommender System &
In this digital era, online product reviews are ubiquity. Many shoppers are prone to read online ... more In this digital era, online product reviews are ubiquity. Many shoppers are prone to read online product reviews before making an actual purchase where it would minimize the risk of purchasing the undervalued product. A smaller number of research works are being found concerning the numerous techniques and methods followed in finding the correlation among online product reviews and actual purchase. It is found that review valence, review quantity, review timeliness and review length are intimately related with the actual online purchase. Also, the use of opinion mining/sentiment analysis on the consumers' reviews about their online purchases has been understood very prominently from the researches. Therefore, this paper aims to summarize the effect of online reviews and its impact towards the actual online purchase and the use of sentiment analysis on online reviews motivating towards actual online purchase. The research findings showed all the selected elements except the revie...
Journal of Physics: Conference Series, 2020
Recommender models for personalized marketing empower businesses to provide personalized recommen... more Recommender models for personalized marketing empower businesses to provide personalized recommendations of goods or services to customers to fulfil their requirements, thus ultimately improves the customer buying experience. Various recommender models powered by robust machine learning algorithms were reviewed on the methods and techniques to appraise its performance concerning the personalized marketing campaigns. Recommender models can be broadly categorized into four types such as content-based, collaborative-based, knowledge-based and hybrid-based. The content-based recommendation is suitable when the system, user or product is new where classification and regression algorithms are mostly implemented. The collaborative-based recommendation is suitable when a more accurate prediction is required where Neighbour-based models, Bayesian methods, rule-based models, decision trees, and latent matrix factorization models may be implemented in this scenario. Knowledge-based recommender...
The internet and social media platforms have made available massive quantities of information to ... more The internet and social media platforms have made available massive quantities of information to users worldwide. Numerous internet sources are present on categorical views of activities, goods and services, beliefs or perhaps the mood created by the online dwellers. In this competitive business world, various industries especially e-commerce immensely use sentiment analysis to increase productivity and make better business decisions. Sentiment Analysis is an associate degree in the field of analytics which has proven to be one of the significant instruments to reveal actionable insights using very big text databases from plentiful domains. This paper tackles a comprehensive overview of sentiment analysis and relevant techniques in e-commerce sector that is always keen to find out about the consumers’ opinions of their goods and services. It starts with the notion of an assessment of sentiments that have emerged as a method for understanding clients’ emotions. It also describes the ...
Understanding users behaviors and the popularity of events has become one of the widespread rese... more Understanding users behaviors and the popularity of events has become one of the widespread research topics in this competitive business world. Precise event attendance prediction enables effective organization, planning & resource allocation, vital for the facilitation of event participation and community development. Accurate event popularity prediction is beneficial to event organizers, enabling them to plan ahead on manpower, advertisement, logistics and many more aspects. This paper covers a comprehensive research review of event attendance prediction in the event-based social network (EBSNs) phenomena. This paper first identifies the key factors that influence the attendance for an event offline, and then look into the methods engaged in developing a predictive model which leads to the development of an event recommender system too. Finally, different predictive models that have been proposed for event attendance prediction are also explored in this context.
Webology, 2021
The exponential growth of social media has spurred an increase in the propagation of hate nowaday... more The exponential growth of social media has spurred an increase in the propagation of hate nowadays. Recent evidence shows that hate speech on social media is detrimental to the mental and physical health of individuals. Thus, there is an emerging need for automated hate speech detection. Automated hate speech detection rests on the intersection between Natural Language Processing (NLP) techniques and machine learning models. An introduction of NLP and its utilities, as well as commonly employed features and classification methods in hate speech detection, are discussed. Hate speech detection in non-English languages is needed to tackle this emergent issue in countries where multiple languages are used. Hence, an overview of the current literature on hate speech detection in non-English languages are covered too. Challenges in the field of hate speech detection are explored and the importance of standardized methodologies for building corpora and data sets are emphasized.
The International Journal of Recent Technology and Engineering (IJRTE), 2021
Diabetes has become a well-known and lethal disease in low- and middle-income countries. People struggle to overcome this deadly condition because of current lifestyles, food habits, and genetic transmission. Because the disease has no permanent cure, medical practitioners provide advice to prevent diabetes and medication to control it. However, detecting the disease is a tedious process, and deploying machine learning predictive models for smart diagnosis and detection is vital in the healthcare domain nowadays. Although several machine learning models have been built for this purpose, the deployment of Deep Neural Networks has received less attention. Therefore, a Deep Neural Network model was built with the support of complete preprocessing, class balancing, normalization, feature selection, and hyper-parameter tuning using a cross-validated search technique. The model achieved 88% accuracy and a ROC score of 0.88, standing out as a promising predic...
International Journal of Advanced Computer Science and Applications, 2020
The big data revolution has resonated in the banking sector, especially in dealing with massive amounts of data. Banks have the opportunity to learn about customers' opinions and satisfaction regarding their products by analyzing the data gathered every day. Banks can thus transform these data into high-quality information that allows them to improve their business, especially in credit cards, which is becoming a short-term business for banks nowadays. Further, sentiment analysis has become important in the field of data analytics, as customers' opinions have a large impact on making profitable business decisions. The outcome of the sentiment analysis helps banks identify the deficiencies of their products and improve them to satisfy customers. The sentiment analysis showed that 45% of customers were negative, 30% were positive, and 25% were neutral towards the credit card facility offered by the commercial banks. Also, predicting credit card customer satisfaction will contribute significantly to creating new opportunities for banks to enhance their promotions as well as their credit card business in future. The Random Forest algorithm was applied in three experiments using the normal data, the balanced data, and an optimized model with the normal data. The optimized model with the normal data obtained the highest accuracy of 87.38%, followed by the normal dataset at 85.82%; the balanced dataset had the lowest accuracy at 82.83%.
Regular, 2020
Diabetes is a well-known, common disease among people around the world. It causes many anomalies in the body and leaves patients on long-term medication. Diabetes has been detected via laborious medical tests, which delay patients' access to their results. However, data mining and machine learning approaches are at the forefront of supporting the healthcare domain in making effective predictions in this regard. This paper discusses predicting Type 2 Diabetes Mellitus using classification models. A suitable secondary dataset was used to build classification models, and the most suitable model was selected using valid performance measures. To this end, Random Forest, Support Vector Machine, Naïve Bayes, and Artificial Neural Network models were built. Based on the performance measures, Random Forest was identified as the most suitable classifier, with an accuracy of 90% and recall and precision values of 0.90.
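As a rough illustration of the performance measures named in this abstract (accuracy, precision, and recall), the sketch below computes them for a binary classifier in plain Python. The function name `binary_metrics` and the toy labels are hypothetical and are not taken from the paper; its actual dataset and evaluation code are not shown here.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision and recall for binary labels.

    Hypothetical helper for illustration only; not the paper's code.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    return {
        # Share of all predictions that were correct
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of predicted positives that were truly positive
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of true positives that the model found
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up labels (1 = diabetic, 0 = non-diabetic), purely illustrative
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Comparing several classifiers on the same held-out data with measures like these is one common way to arrive at a "most suitable model" decision of the kind the abstract describes.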
International Journal of Recent Technology and Engineering (IJRTE), 2020
Big data has revolutionized every field of life, including human learning. The field of education has progressed over the past couple of decades, and in addition, rapid growth in the number of educational institutions has created tough competition. The massive accumulation of data in the educational sector has created great scope for EDM (Educational Data Mining) with the support of robust predictive models. It is necessary to examine students' performance regularly to help them perform better, which in turn helps maintain the reputation of the institution. This study proposed a predictive model through which student performance can be forecast based on various characteristics. The KDD (Knowledge Discovery in Databases) methodology was followed stepwise in this study to develop predictive models of student performance. The data balancing techniques such as SMOTE (Synthetic Minority Over-sampling Technique) and ADASYN (Adaptiv...
International Journal of Recent Technology and Engineering, 2020
Business analysis has become one of the crucial elements of any business in this data-driven business world. It is at the frontline, where data analytics supports strategic management in making effective decisions with immense computing power. This paper investigates the big data problems of Adventure Works Cycles (AWC) using analytical techniques and integrates different methods of knowledge discovery and data mining via descriptive and predictive analytics. The descriptive analytics revealed the prevailing business condition, which could aid effective decision-making. Consequently, an empirical study was performed to explore different types of predictive models for predicting future occurrences. Furthermore, a comparative analysis using different predictive algorithms provides evidence that the High-Performance Forest algorithm is particularly effective at predicting future occurrences, with an accuracy of 80%, a ROC index of 0.878, and a cumulative lift value of 1....
Journal of Computational and Theoretical Nanoscience, 2019
In this era, big data is the most common buzzword across industries because of its capabilities for collecting, processing, storing, and analysing data. The advancement of E-Commerce has paved the way for merchants and customers to meet online and satisfy their requirements by exchanging goods and services at a reasonable cost. The challenges and opportunities of big data, with an emphasis on data privacy and security, are a widely discussed topic among businesses, especially E-Commerce merchants. Several reviews are available that emphasize big data opportunities and challenges with regard to privacy and security. However, a comprehensive review of E-Commerce that thematically highlights the tools and technologies has not been given enough consideration. Therefore, the purpose of this study is to review the state-of-the-art technologies for privacy and security in E-Commerce platforms. The identified cryptographic technologies were also discussed with the rational standpoint...
2022 IEEE International Conference on Distributed Computing and Electrical Circuits and Electronics (ICDCECE)