Prediction of Heart Disease by Clustering and Classification Techniques

Prediction of Heart Disease Using Classification Based Data Mining Techniques

Smart Innovation, Systems and Technologies, 2014

Approximately 19 million people die from heart disease worldwide every year. A heart patient shows several symptoms, and it is difficult to attribute them to heart disease across the many stages of disease progression. Data mining, as a way to extract hidden patterns from a clinical dataset, is applied to a database in this analysis. All available classification algorithms are compared with each other to achieve the highest accuracy. To further increase the correctness of the solution, the dataset is preprocessed by different unsupervised and supervised algorithms. The two data mining tasks needed to develop the classifier are clustering and classification. In K-means clustering, the selection of initial points affects the results of the algorithm, both in the number of clusters found and in their centroids. Methods to enhance the k-means clustering algorithm are discussed; these methods improve efficiency, accuracy and performance. To improve cluster quality, normalization is used as a preprocessing stage so that the Euclidean distance yields nearer centers, which reduces the number of iterations and therefore the computational time compared with plain k-means clustering. Finally, the classifiers are developed with logistic regression using the data extracted by K-means clustering. The techniques adopted in the design of the classifier perform relatively well, giving better classification results than the clustering techniques.
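The normalization-then-clustering step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes min-max scaling as the normalization (the abstract does not say which scheme was used), uses plain Lloyd's k-means rather than the enhanced variant, and the function names and the `patients` rows are invented for the example. The logistic regression stage is omitted.

```python
import math


def min_max_normalize(rows):
    """Rescale every column to [0, 1] so no single attribute dominates the distance."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def kmeans(points, initial_centroids, max_iter=100):
    """Plain Lloyd's k-means; returns (centroids, labels, iterations used)."""
    centroids = [list(c) for c in initial_centroids]
    labels = None
    for iteration in range(1, max_iter + 1):
        # Assignment step: each point goes to its nearest centroid (Euclidean).
        new_labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            return centroids, labels, iteration
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels, max_iter


# Hypothetical [age, cholesterol] rows, invented for illustration only.
patients = [[29, 180], [35, 190], [62, 300], [58, 310]]
scaled = min_max_normalize(patients)
centroids, labels, iterations = kmeans(scaled, [scaled[0], scaled[-1]])
print(labels, iterations)
```

Because the result depends on `initial_centroids`, passing different starting points is a simple way to observe the sensitivity to initial point selection that the abstract mentions.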

Heart Disease Prediction with Data Mining Clustering Algorithms

International Journal of Computing, Communication and Instrumentation Engineering

A vast number of people suffer from heart malfunction worldwide every year. Heart disease shows various symptoms, so in many cases it is hard to diagnose a patient as a heart patient. Data mining, as a solution for extracting hidden patterns from a clinical dataset, is applied to a database in this research. The database consists of 209 instances and 8 attributes. All available clustering algorithms are compared to achieve the highest accuracy. To further increase the accuracy of the solution, the dataset is preprocessed by different supervised and unsupervised algorithms. The system was implemented in WEKA, and prediction accuracy is compared across 5 stages and 40 approaches. Three clusters with an accuracy of 100% are introduced as the highest-performing algorithms.

Heart Disease Prediction using K Nearest Neighbour and K Means Clustering

The widespread application of data mining in fields like e-business, marketing and retail has led to its application in other industries and in the healthcare sector. The healthcare environment is information rich but knowledge poor. Data mining techniques have been commonly used to extract knowledge from medical databases. Today the medical field has come a long way in treating patients with various kinds of diseases. Among the most menacing is heart disease, which cannot be detected with the naked eye and strikes suddenly when its limits are reached. Bad medical decisions can cause the death of a patient, which no hospital can afford. To achieve correct and cost-effective treatment, computer-based decision support systems can be developed to make good decisions. Many hospitals use hospital information systems to manage their healthcare or patient data. These systems produce huge amounts of data in the form of images, text, charts and numbers. K nearest neighbor and K means are used to support medical decision making efficiently. Keywords: K Nearest Neighbor, K Means.
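As an illustration of the K nearest neighbor technique named in this abstract, the sketch below labels a new record by majority vote among its closest training records. It is a minimal example under assumptions: the function name, the choice of Euclidean distance, the value `k=3` and the toy feature rows are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training rows."""
    nearest = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical, already-normalized feature rows; 1 = disease, 0 = no disease.
train_x = [[0.10, 0.20], [0.20, 0.10], [0.15, 0.25],
           [0.80, 0.90], [0.90, 0.80], [0.85, 0.95]]
train_y = [0, 0, 0, 1, 1, 1]

print(knn_predict(train_x, train_y, [0.2, 0.2]))
print(knn_predict(train_x, train_y, [0.9, 0.9]))
```

An odd `k` avoids tied votes in the two-class case; features should be on comparable scales before distances are computed.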

PREDICTION OF HEART DISEASE USING K-MEANS and ARTIFICIAL NEURAL NETWORK as HYBRID APPROACH to IMPROVE ACCURACY

International Journal of Engineering and Technology, 2017

The heart is an important organ of the human body, and life is completely dependent on its efficient working. Cardiovascular diseases (CVD) are among the most challenging diseases in terms of reducing the patient count. According to a survey conducted by the WHO, about 17 million people die around the globe due to cardiovascular diseases, i.e. 29.20% of all deaths, mostly in developing countries. Thus there is a need to tackle the complicated problem of CVD using advanced data mining techniques, in order to discover knowledge for heart disease prediction. In this paper, we propose an efficient hybrid algorithmic approach for heart disease prediction. It determines and extracts unknown knowledge of heart disease using a hybrid combination of the K-means clustering algorithm and an artificial neural network. In our proposed model we considered 14 of the 74 attributes of the UCI Heart Disease Data Set [19]. The technique uses medical attributes such as age, weight, gender, blood pressure and cholesterol level for prediction. It uses the k-means algorithm to group the attributes and the back-propagation technique in neural networks for prediction. The main objective of this paper is to develop a prototype for predicting heart diseases with a higher accuracy rate.

Keywords: Heart disease, K-means, artificial neural network, cardiovascular diseases

I. INTRODUCTION

Above the age of 30, heart attack or CVD is a common problem seen across the population. Along with changing lifestyle, factors such as smoking, alcohol, cholesterol level, obesity, high blood pressure and diabetes increase the risk of heart problems. However, recent studies say that, by combining artificial intelligence with medical science, we can help prevent such diseases. Data mining plays a vital role in the healthcare domain.

Data mining and machine learning are emerging as fields of high importance for providing prognosis and a deeper understanding of medical data [9]. In an earlier survey, the World Health Organization (WHO) estimated that 17 million deaths occur worldwide every year due to heart diseases [12]. Prediction using data mining techniques can give accurate results for heart diseases. Such prediction can answer complicated queries for detecting heart disease and thus assist medical practitioners in making smart clinical decisions. Researchers suggest that applying data mining techniques to identify effective treatments for patients can improve practitioner performance. Researchers have been investigating and applying different data mining techniques in the diagnosis of heart disease to identify which technique provides the most reliable accuracy. Different data mining techniques have been used to help health care departments in the diagnosis of heart disease [13] [14]. Those most frequently used focus on classification: Naïve Bayes, decision tree, and neural network. One such system used back-propagation in a neural network, which is stated to be the best prediction algorithm. The system models a non-linear relationship between the data and the target output. The BP algorithm is adaptive and tolerant of noisy data and other outliers present in medical data. In our proposed system, we propose a hybrid approach to predict or diagnose heart disorders using the UCI heart disease dataset [11] by combining the K-means and ANN algorithms. The main goal is to obtain a high prediction accuracy rate.

The rest of the paper is organized as follows: after the introduction, a literature survey is given in Section II. Section III specifies the proposed system architecture and the flow chart of the implementation steps. The next section specifies the steps required for the K-means and ANN algorithms. In Section IV the experimental results are summarized.

Novel Approach for Heart Disease using Data Mining Techniques

International Journal of Advance Research, Ideas and Innovations in Technology, 2016

Data mining is the process of analysing large sets of data and extracting meaning from them. It helps in predicting future trends and patterns, supporting businesses in decision making. Various algorithms are currently available for clustering data. Existing work used K-means clustering, the C4.5 algorithm and MAFIA (Maximal Frequent Itemset Algorithm) for a heart disease prediction system and achieved an accuracy of 89%. Since there is considerable scope for improvement, in this paper we implement various other algorithms for clustering and classifying data and aim to achieve higher accuracy than the present algorithm. Several parameters have been proposed for heart disease prediction systems, but there has always been a need for better parameters or algorithms to improve their performance.

A New Data Preparation Method Based on Clustering Algorithms for Diagnosis Systems of Heart and Diabetes Diseases

Journal of Medical Systems, 2014

The most important factors that prevent pattern recognition from functioning rapidly and effectively are the noisy and inconsistent data in databases. This article presents a new data preparation method based on clustering algorithms for the diagnosis of heart and diabetes diseases. In this method, a new modified K-means algorithm is used in a clustering-based data preparation system to eliminate noisy and inconsistent data, and Support Vector Machines are used for classification. This newly developed approach was tested in the diagnosis of heart diseases and diabetes, which are prevalent within society and figure among the leading causes of death. The data sets used in the diagnosis of these diseases are the Statlog (Heart), the SPECT images and the Pima Indians Diabetes data sets obtained from the UCI database. The proposed system achieved classification success rates of 97.87 %, 98.18 % and 96.71 % on these data sets, respectively. Classification accuracies for these data sets were obtained using the 10-fold cross-validation method. According to the results, the performance of the proposed method is highly successful compared to other results attained, and seems very promising for pattern recognition applications.
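The 10-fold cross-validation used to obtain these accuracies can be sketched as below. This is a generic illustration of the evaluation protocol only: the modified K-means preparation step and the SVM are not reproduced, the helper names are hypothetical, and `fit` and `predict` stand for any classifier supplied by the caller. The sketch assumes at least `k` samples and does no shuffling or stratification, which a real evaluation would normally add.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def cross_validated_accuracy(xs, ys, fit, predict, k=10):
    """Mean held-out accuracy; assumes len(xs) >= k so no fold is empty."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        hits = sum(predict(model, xs[i]) == ys[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)
```

Each sample is held out exactly once, so the reported accuracy is an average over ten models, each trained on roughly nine tenths of the data.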

ANALYSIS OF CLASSIFICATION ALGORITHMS USING HEART DISEASES DATA SET FOR PREDICTION ITS ACCURACIES

Heart disease is a major cause of human death, and predicting it at an early stage can save lives. Many classification algorithms are available in data mining; we selected a few of them for heart disease prediction and measured their accuracies. Different algorithms give different levels of accuracy. This paper compares the accuracies of a few classification algorithms, namely Random Tree, Naive Bayes, Decision Tree and Random Forest, and then uses K-Means clustering. The hungarian_csv, cleveland.csv and switzerland.csv heart disease data sets from the UCI repository, with 1272 instances and 14 regular attributes (age, sex, cp, restbps, chol, fbs, restecg, thalach, exang, oldpeak, slope, ca, thalm, num), were used for the analysis. RapidMiner Studio is a data science software platform, developed by the company of the same name, that provides an integrated environment for machine learning, data mining, predictive analytics and business analysis. The different measures and results were tabulated and charted.

Identification and Predicting Heart Disease with Data Mining methods-A Survey

2018

Data mining mechanisms make it possible to create proactive decision-making systems. Data mining methods can be applied in environments where decision making usually involves considerable time and complexity. In this paper we consider several mechanisms in which data mining methods are used for the prediction of heart disease. Data mining techniques, specifically Decision Tree, Naïve Bayes, Neural Network, K-means Clustering, affiliation arrangement and Support Vector Machine algorithms, are examined on a heart disease database. This paper gives a general review of heart disease diagnosis using different data mining strategies. These data mining procedures take less time and make the diagnosis of heart disease easier and earlier, with great precision, so as to improve heart health. This paper investigates distinct data mining strategies which are used in healthcare for the diagnosis of heart diseases utilizing da...

Cardiovascular Disease Analysis Using Supervised and Unsupervised Data Mining Techniques

Cardiovascular diseases are the main cause of death around the world. Every year, more people die from these diseases than from any other cause. According to World Health Organization data, more than 17.5 million people died from this cause in 2012, representing 31% of all deaths registered worldwide. Data mining techniques are widely used for the analysis of diseases, including cardiovascular conditions, and the techniques used in the method proposed in this research are decision trees, support vector machines, Bayesian networks and k-nearest neighbors. Apart from these techniques, it was necessary to use a clustering method to segment the data according to diagnosis. As a result, the Simple K-Means clustering method and the support vector machines technique obtained the best levels in metrics such as precision (97%), coverage (97%), true positive rate (97%) and false positive rate (0.02%). This can be taken as evidence that the proposed method can be used with confidence as decision-making support to diagnose a patient with cardiovascular disease.
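The metrics reported above can be computed from the four confusion-matrix counts. The sketch below is a minimal illustration, assuming that "coverage" corresponds to recall (which is by definition the same quantity as the true positive rate); the function name and the example labels are hypothetical and unrelated to the paper's data.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Precision, recall (true positive rate) and false positive rate."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    def ratio(num, den):
        # Return 0.0 instead of failing when a denominator is empty.
        return num / den if den else 0.0

    return {
        "precision": ratio(tp, tp + fp),            # TP / (TP + FP)
        "recall": ratio(tp, tp + fn),               # TP / (TP + FN)
        "false_positive_rate": ratio(fp, fp + tn),  # FP / (FP + TN)
    }


# Invented labels: 4 diseased and 6 healthy patients.
metrics = binary_metrics([1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
                         [1, 1, 1, 0, 1, 0, 0, 0, 0, 0])
print(metrics)
```

Precision and recall share the numerator TP but divide by different totals, so equal values, as in the abstract, imply equal numbers of false positives and false negatives.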

Algorithms Used for Optimizing K-Means for Heart Disease Diagnosis

The heart is a significant organ of the human body, and life depends on its proper functioning. It is often difficult to diagnose a patient as a heart patient. For this purpose, data mining can be used to uncover hidden patterns in a clinical dataset. In this paper, we study ways to optimize the K-Means algorithm by overcoming its drawbacks, which may help in building a heart disease prediction system based on it. We present a study of the advanced data mining techniques and hybrid algorithms that could be used to optimize K-Means and increase the prediction accuracy of a heart disease prediction system.