Comprehensive Study and Analysis of Partitional Data Clustering Techniques

A Detailed Study and Analysis of different Partitional Data Clustering Techniques

International Journal of Innovative Research in Science, Engineering and Technology, 2014

Data clustering is significant in application areas such as text mining, fraud detection, health care, image processing and bioinformatics. Because it applies to so many domains, researchers have proposed a wide range of techniques in the literature. Data clustering is one of the important tasks that make up data mining. Clustering can be classified into different types such as partitional, hierarchical, spectral, density-based, grid-based and model-based. Among these, partitional clustering is the most widely used in applications because the computation involved is not very complex. Hence a lot of research has been carried out on clustering using the partitional method. This paper proposes a comprehensive study of the partitional clustering techniques used in the literature, which also provides insight into recent problems in the area. In this...

A Comparative Study on Partition-based Clustering Methods

2018

Cluster analysis is one of the essential data analysis tools; it separates a group of data objects into sets of similar objects called clusters. In partition-based clustering methods, identifying the initial centroids is a challenging task. This paper presents a review of some partition-based clustering methods that improve the selection of initial centroid values and enhance the quality of clustering to some extent. Keywords: data mining, spatial data clustering, partition-based method, k-means
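To make the initial-centroid problem concrete, here is a minimal sketch of one well-known seeding strategy, k-means++ style sampling. The abstract above does not say which methods the paper reviews, so the choice of technique is an assumption, and the function names `squared_distance` and `kmeans_pp_init` are hypothetical. It also assumes the data contain at least `k` distinct points.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_pp_init(points, k, seed=0):
    """Pick k initial centroids using k-means++ style seeding (illustrative)."""
    rng = random.Random(seed)
    # The first centroid is drawn uniformly at random from the data.
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        # Weight each point by its squared distance to the nearest chosen centroid.
        weights = [min(squared_distance(p, c) for c in centroids) for p in points]
        if sum(weights) == 0:
            raise ValueError("fewer than k distinct points")
        # Points far from every chosen centroid are more likely to be picked next.
        centroids.append(rng.choices(points, weights=weights, k=1)[0])
    return centroids
```

Because an already chosen point has weight zero, the seeds are always distinct data points, which avoids the degenerate start that purely uniform sampling can produce.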

Applications of Partition based Clustering Algorithms: A Survey

Data mining is one of the interesting research areas in database technology. In data mining, a cluster is a set of data objects that are similar to one another within the same cluster and dissimilar to the objects in other clusters. Clustering is an efficient method in data mining for processing huge data sets. The core methodology of clustering is used in many domains, such as the analysis of academic results of institutions. The methods are also well suited to machine learning, clustering of medical datasets, pattern recognition, image mining, information retrieval and bioinformatics. Clustering algorithms are categorized according to different research phenomena. A variety of algorithms have recently emerged and have been applied effectively to real-life data mining problems. This survey focuses mainly on partition-based clustering algorithms, namely k-Means, k-Medoids and Fuzzy c-Means. In particular, they are applied mostly to medical data sets. The purpose of the survey is to explore their various applications in different domains.
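Of the three algorithms named above, Fuzzy c-Means differs from k-Means and k-Medoids in that each object receives a degree of membership in every cluster rather than a single label. The sketch below shows only the standard membership update for one point; the function name `fuzzy_memberships` and the default fuzzifier `m=2.0` are assumptions for illustration, not details taken from the survey.

```python
import math


def fuzzy_memberships(point, centers, m=2.0):
    """Membership of one point in each cluster (standard fuzzy c-means update).

    m > 1 is the fuzzifier; larger values give softer memberships.
    """
    dists = [math.dist(point, c) for c in centers]
    # A point that coincides with one or more centers belongs fully to them.
    zero = [i for i, d in enumerate(dists) if d == 0.0]
    if zero:
        return [1.0 / len(zero) if i in zero else 0.0 for i in range(len(centers))]
    exponent = 2.0 / (m - 1.0)
    # u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1)); the memberships sum to 1.
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]
```

For example, a point at distance 1 from one center and 3 from another gets memberships of about 0.9 and 0.1 with `m=2.0`, whereas k-Means would simply assign it to the nearer center.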

A Study of Different Partitioning Clustering Technique

In the field of software, data mining is very useful for identifying interesting patterns and trends in large amounts of data stored in different databases and data repositories. Clustering is used to extract unknown patterns from large sets of electronically stored data in business and real-time applications. Clustering is a division of data into different groups. Data are grouped into clusters with high intra-group similarity and low inter-group similarity [2]. Clustering is an unsupervised learning technique. It is a useful technique that is applied in many areas such as marketing studies, DNA analysis, text mining and web document classification. In a large database with many attributes, the clustering task is very complex. There are many methods to deal with these problems. In this paper we discuss the different partitioning-based methods, such as K-Means, K-Medoids and Fuzzy K-Means, and compare the advantages and disadvantages of these techniques.

Evaluation of Partitional and Hierarchical Clustering Techniques

IJCSMC, 2019

Machine learning algorithms are broadly classified into supervised, unsupervised and semi-supervised learning algorithms. Supervised learning algorithms are classified into classification and regression techniques, whereas unsupervised learning algorithms are classified into clustering and dimensionality reduction. This paper deals with the evaluation of clustering techniques under unsupervised learning. Clustering is the process of grouping data with similar properties into a single group. There are several clustering techniques available, such as partitional clustering, hierarchical clustering, fuzzy clustering, density-based clustering and model-based clustering. This paper focuses on the analysis and evaluation of K-means clustering, a partitional method, and divisive clustering, a hierarchical method. The results of the evaluation show that K-means clustering holds up better on large datasets and also takes less time than hierarchical clustering.

Constraint Based Partitional Clustering – a Comprehensive Study and Analysis

2014

Data clustering is the concept of forming a predefined number of clusters in which the data points within each cluster are very similar to each other and the data points in different clusters are dissimilar to each other. Clustering is widely used in domains such as bioinformatics, medical data, imaging, marketing studies and crime analysis. The popular types of clustering techniques are partitional, hierarchical, spectral, density-based, mixture-modelling etc. Partitional clustering is widely used in most applications since it is computationally inexpensive. An analysis of the research available on partitional clustering gives insight into recent problems in the partitional clustering domain. In this paper, nine research articles from 2005 to 2013 are surveyed in order to analyse the different concepts used in constraint-based partitional clustering techniques. Also, a comparative analysis is carried out, to find out the ...

Analytical Comparison of Some Traditional Partitioning based and Incremental Partitioning based Clustering Methods

International Journal of Computer Applications, 2012

Data clustering is a highly valuable field of computational statistics and data mining. It can be considered the most important unsupervised learning technique, as it deals with finding structure in a collection of unlabeled data. Clustering is the division of data into groups of similar objects. A major difficulty in the design of data clustering algorithms is that, in the majority of applications, new data are dynamically appended to an existing database, and it is not feasible to perform clustering from scratch every time new data instances are added. Clustering algorithms that handle the incremental updating of data points are known as incremental clustering. In this paper, the authors review partition-based clustering methods, mainly K-means and DBSCAN, and provide a detailed comparison of the traditional and incremental clustering methods for both.
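The idea of updating clusters without reclustering from scratch can be illustrated with a sequential (online) k-means step, in which each newly appended point moves only its nearest centroid. This is a generic sketch and not necessarily the incremental method compared in the paper; the function name `add_point` and the running `counts` list are assumptions.

```python
def add_point(centroids, counts, point):
    """Fold one new point into an existing k-means model, in place.

    centroids: list of centroids, each a list of floats
    counts:    number of points already assigned to each centroid
    Returns the index of the centroid that absorbed the point.
    """
    # Find the nearest centroid by squared Euclidean distance.
    j = min(
        range(len(centroids)),
        key=lambda i: sum((x - c) ** 2 for x, c in zip(point, centroids[i])),
    )
    counts[j] += 1
    # Running-mean update: shift the centroid toward the new point by 1/n.
    centroids[j] = [c + (x - c) / counts[j] for c, x in zip(centroids[j], point)]
    return j
```

Each insertion costs time proportional to the number of centroids, independent of how many points the database already holds, which is the practical appeal of incremental updating.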

DIVISIVE HIERARCHICAL CLUSTERING USING PARTITIONING METHODS

IAEME PUBLICATION, 2013

Clustering is the process of partitioning a set of data into subsets. Clustering is carried out so that similar data are collected in one group and dissimilar data in another. Clustering can be done using many methods, such as partitioning methods, hierarchical methods and density-based methods. A hierarchical method creates a hierarchical decomposition of the given set of data objects. In each successive iteration, a cluster is split into smaller clusters, until eventually each object is in its own cluster or a termination condition holds. In this paper, a partitioning method is combined with a hierarchical method to form better and improved clusters. We have used various algorithms to obtain better and improved clusters.
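One common way to combine a partitioning method with a divisive hierarchy is bisecting k-means: repeatedly take the cluster with the largest squared error and split it in two with a 2-means step. The abstract does not state which algorithms the paper uses, so the sketch below is an assumption, with hypothetical names throughout and a deterministic farthest-point initialisation chosen only to keep the example reproducible.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def sse(cluster):
    """Sum of squared distances from each point to the cluster mean."""
    m = mean(cluster)
    return sum(sq_dist(p, m) for p in cluster)


def two_means(points, iterations=20):
    """Split points in two; assumes at least two distinct points."""
    first = points[0]
    far = max(points, key=lambda p: sq_dist(p, first))
    centers = [first, far]  # start from the first point and the point farthest from it
    groups = None
    for _ in range(iterations):
        new_groups = ([], [])
        for p in points:
            nearer = 0 if sq_dist(p, centers[0]) <= sq_dist(p, centers[1]) else 1
            new_groups[nearer].append(p)
        if not new_groups[0] or not new_groups[1]:
            break  # keep the last valid split
        groups = new_groups
        centers = [mean(groups[0]), mean(groups[1])]
    return groups


def bisecting_kmeans(points, k):
    """Divisive clustering: split the worst cluster until there are k clusters."""
    clusters = [list(points)]
    while len(clusters) < k:
        worst = max(clusters, key=sse)
        if sse(worst) == 0:
            break  # every remaining cluster holds identical points
        clusters.remove(worst)
        clusters.extend(two_means(worst))
    return clusters
```

Stopping at `k` clusters plays the role of the termination condition mentioned above; letting the loop run until every cluster has zero error would continue the decomposition down to groups of identical objects.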

AN OVERVIEW ON CLUSTERING METHODS

IOSR Journal of Engineering, 2012

Clustering is a common technique for statistical data analysis, which is used in many fields, including machine learning, data mining, pattern recognition, image analysis and bioinformatics. Clustering is the process of grouping similar objects into different groups, or more precisely, the partitioning of a data set into subsets so that the data in each subset are similar according to some defined distance measure. This paper covers clustering algorithms, their benefits and their applications. The paper concludes by discussing some limitations.