Performance Evaluation of Data Mining Techniques Using Cancer Dataset

In recent years, data mining (DM) has attracted great attention in the healthcare industry and in society as a whole. This research work creates clusters from two cancer datasets and analyzes the performance of partition-based algorithms. Two partition-based algorithms, K-means Plus and Affinity Propagation, are implemented, and a comparative analysis is carried out on two datasets, Colon and Leukemia. Performance is measured by the number of correctly classified clusters and the average accuracy. The Affinity Propagation algorithm is efficient for clustering the cancer datasets. The outcome of this work is suitable for analyzing the behavior of cancer in the oncology departments of cancer centers. The ultimate goal of this research is to find out which type of dataset and algorithm is most suitable for the analysis of cancer data.

Introduction

Data mining is one of the most important areas of research and is used in practice in domains such as finance, education, clinical research, healthcare, and agriculture, with the aim of discovering useful information from large datasets. This research uses different data mining techniques to cluster medical data. Data mining tasks can be categorized into two types: supervised and unsupervised. Supervised tasks have datasets that contain both explanatory variables and dependent variables; the objective is to discover the associations between them. Unsupervised tasks, on the other hand, have datasets that contain only explanatory variables, and the objective is to explore the data and generate hypotheses about its hidden structure. Clustering is one of the most common unsupervised data mining methods for exploring the hidden structures embedded in a dataset. Clustering is the process of grouping abstract objects into classes of similar objects.
A cluster of data objects can be treated as one group. Cluster analysis first partitions the set of data into groups based on data similarity and then assigns labels to the groups. The main advantage of clustering over classification is that it adapts to changes and helps single out useful features that distinguish different groups.
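
The two-step procedure described above (partition by similarity, then label the groups) can be sketched in plain Python. This is a minimal illustration, not the authors' implementation. It assumes that "K-means Plus" refers to K-means with k-means++ seeding, it uses eight synthetic 2-D points rather than the Colon or Leukemia data, and it scores the result by giving each cluster its majority class, which is one common way to count correctly classified items. All function names and the toy labels "A" and "B" are hypothetical.

```python
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_init(points, k, rng):
    """k-means++ seeding: each new centre is drawn with probability
    proportional to its squared distance from the nearest chosen centre."""
    centres = [rng.choice(points)]
    while len(centres) < k:
        weights = [min(sq_dist(p, c) for c in centres) for p in points]
        centres.append(rng.choices(points, weights=weights, k=1)[0])
    return centres


def kmeans(points, k, seed=0, max_iter=100):
    """Step 1: partition the points into k groups by similarity."""
    rng = random.Random(seed)
    centres = kmeanspp_init(points, k, rng)
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(k), key=lambda j: sq_dist(p, centres[j])) for p in points
        ]
        if new_assignment == assignment:  # converged: nothing moved
            break
        assignment = new_assignment
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old centre if a cluster becomes empty
                centres[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignment


def clustering_accuracy(assignment, true_labels):
    """Step 2: label each cluster with its majority class and return
    the fraction of points whose class matches their cluster's label."""
    correct = 0
    for cluster in set(assignment):
        labels = [t for a, t in zip(assignment, true_labels) if a == cluster]
        correct += Counter(labels).most_common(1)[0][1]
    return correct / len(true_labels)


# Synthetic toy data (an assumption for illustration, not the paper's datasets).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
true_labels = ["A", "A", "A", "A", "B", "B", "B", "B"]

assignment = kmeans(points, k=2)
accuracy = clustering_accuracy(assignment, true_labels)
print(f"accuracy = {accuracy:.2f}")
```

Majority-class labeling needs known classes for scoring only; the partitioning step itself never sees `true_labels`, which is what makes the task unsupervised.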
