Novel Two-Dimensional Visualization Approaches for Multivariate Centroids of Clustering Algorithms

Ranked Centroid Projection: A Data Visualization Approach With Self-Organizing Maps

IEEE Transactions on Neural Networks, 2008

The self-organizing map (SOM) is an efficient tool for visualizing high-dimensional data. In this paper, the clustering and visualization capabilities of the SOM, especially in the analysis of textual data, i.e., document collections, are reviewed and further developed. A novel clustering and visualization approach based on the SOM is proposed for the task of text mining. The proposed approach first transforms the document space into a multidimensional vector space by means of document encoding. Afterwards, a growing hierarchical SOM (GHSOM) is trained and used as a baseline structure to automatically produce maps with various levels of detail. Following the GHSOM training, the new projection method, namely the ranked centroid projection (RCP), is applied to project the input vectors to a hierarchy of 2-D output maps. The RCP is used as a data analysis tool as well as a direct interface to the data. In a set of simulations, the proposed approach is applied to an illustrative data set and two real-world scientific document collections to demonstrate its applicability.

Improving cluster visualization in self-organizing maps: Application in gene expression data analysis

Computers in Biology and Medicine, 2007

Cluster analysis is one of the crucial steps in gene expression pattern (GEP) analysis. It leads to the discovery or identification of temporal patterns and coexpressed genes. GEP analysis involves high-dimensional multivariate data, which demand appropriate tools. A good alternative for grouping many multidimensional objects is the self-organizing map (SOM), an unsupervised neural network algorithm able to find relationships among data. The SOM groups the data and maps them topologically. However, it may be difficult to identify clusters with the usual visualization tools for the SOM. We propose a simple algorithm to identify and visualize clusters in the SOM (the RP-Q method). The RP is a new node-adaptive attribute that moves in a two-dimensional virtual space, imitating the movement of the codebook vectors of the SOM net in the input space. The Q statistic evaluates the SOM structure, providing an estimate of the number of clusters underlying the data set. The SOM-RP-Q algorithm permits the visualization of clusters in the SOM and their node patterns. The algorithm was evaluated on several simulated and real GEP data sets. Results show that the proposed algorithm successfully displays the underlying cluster structure directly from the SOM and is robust to different net sizes.

Data Clustering and Topology Preservation Using 3D Visualization of Self Organizing Maps

The self-organizing map (SOM) is regarded as an excellent computational tool for data mining and data exploration. The SOM creates a set of prototype vectors representing the data set and carries out a topology-preserving projection from the high-dimensional input space onto a low-dimensional grid, such as a regular two-dimensional (2D) grid or 2D map. The 2D-SOM technique can be used effectively to visualize and explore the properties of the data. It has been applied in numerous areas such as pattern recognition, robotics, bioinformatics, and the life sciences, including the clustering of complex gene expression patterns. In this paper, the structure of the traditional 2D-SOM is extended to a three-dimensional SOM (3D-SOM), with the purpose of clustering data directly in 3D-SOM space instead of into 2D-SOM clusters. The work mainly involves extensions of the SOM algorithm, in particular the number, relation, and structural arrangement of its output neurons, the neighbourhood weight update process, and distance calculation along the three (x, y, z) axes. The proposed method is demonstrated by computing a 3D-SOM visualization of the iris flower data set using a high-level programming language. The performance of the 2D-SOM and 3D-SOM is investigated and discussed in terms of quantization error, topographic error, and computational time. The experimental results show that the 3D-SOM forms a 3D data representation and has slightly higher quantization error and computational time than the 2D-SOM, but preserves topology better.
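The two quality measures named in this abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the tiny lattice, and the neighbour rule (two units are adjacent when their grid coordinates differ by exactly 1 along a single axis) are assumptions made for illustration. Because grid positions are plain coordinate tuples, the same code applies to 2D and 3D lattices.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def two_best_units(sample, codebook):
    """Indices of the closest and second-closest prototype vectors."""
    order = sorted(range(len(codebook)), key=lambda i: dist(sample, codebook[i]))
    return order[0], order[1]

def quantization_error(data, codebook):
    """Mean distance from each sample to its best-matching unit (BMU)."""
    total = sum(dist(s, codebook[two_best_units(s, codebook)[0]]) for s in data)
    return total / len(data)

def topographic_error(data, codebook, grid):
    """Fraction of samples whose two best units are not grid neighbours.

    Assumption: units are neighbours when their grid coordinates differ
    by exactly 1 along a single axis (rectangular lattice, 2D or 3D).
    """
    errors = 0
    for s in data:
        first, second = two_best_units(s, codebook)
        step = sum(abs(p - q) for p, q in zip(grid[first], grid[second]))
        if step != 1:
            errors += 1
    return errors / len(data)

# Hypothetical 2 x 2 x 1 lattice; grid positions are (x, y, z) tuples.
grid = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
data = [(0.1, 0.0), (0.9, 0.0), (0.5, 0.5)]

qe = quantization_error(data, codebook)
te = topographic_error(data, codebook, grid)
```

A lower quantization error means the prototypes sit closer to the data; a lower topographic error means that units which are close in input space are also adjacent on the lattice.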

Comparison of self-organizing map with K-means hierarchical clustering for bioinformatics applications

2004 IEEE International Joint Conference on Neural Networks (IEEE Cat. No.04CH37541)

The Self-Organizing Map (SOM) has emerged as one of the popular choices for clustering data; however, when it comes to point density accuracy of codebooks or reliability and interpretability of the map, the SOM leaves much to be desired. In this paper, we compare the newly developed K-Means Hierarchical (KMH) clustering algorithm to the SOM. We also introduce a new initialization scheme for K-means that improves codebook placement, and propose a novel visualization scheme that combines Principal Component Analysis (PCA) and the Minimal Spanning Tree (MST) in an arrangement that, unlike the SOM, ensures reliability of the visualization. A practical application of the algorithm is demonstrated on a challenging bioinformatics problem.

The Self Organizing Map as a Tool for Cluster Analysis

Menemui Matematik (Discovering Mathematics), 2016

The self-organizing map is among the most widely accepted unsupervised learning algorithms for cluster analysis. It is an important tool used to map high-dimensional data sets onto a low-dimensional discrete lattice of neurons. This feature is used for clustering and classifying data. Clustering is the process of grouping data elements into classes or clusters so that items in each class or cluster are as similar to each other as possible. In this paper, we present an overview of the self-organizing map, its architecture, its applications, and its training algorithm. Computer simulations on sample data for clustering problems are analyzed.

An improved Kohonen self-organizing map clustering algorithm for high-dimensional data sets

Indonesian Journal of Electrical Engineering and Computer Science, 2021

Manipulating high-dimensional data has been a major research challenge in computer science in recent years. Many clustering algorithms have been proposed to classify such data, and the Kohonen self-organizing map (KSOM) is one of them. However, this algorithm has drawbacks such as overlapping clusters and non-linear separability problems. Therefore, in this paper, we propose an improved KSOM (I-KSOM) that reduces these problems by measuring distances among objects with the EISEN Cosine correlation formula. As far as we know, no previous work has used EISEN Cosine correlation distance measurements to classify high-dimensional data sets. To test the robustness of the proposed KSOM, we carry out experiments on several popular data sets: Iris, Seeds, Glass, Vertebral column, and Wisconsin breast cancer. Our proposed algorithm shows better results than the original KSOM and another modified KSOM in terms of predictive performance with topographic and quant...
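As an illustration of swapping the distance measure in the best-matching-unit search, here is a minimal Python sketch. The abstract does not give the formula, so the form used here, 1 - |x . y| / (||x|| ||y||), which is the uncentered correlation commonly attributed to Eisen et al., is an assumption, and the function names and prototype values are hypothetical.

```python
import math

def eisen_cosine_distance(x, y):
    """1 - |x . y| / (||x|| * ||y||), an uncentered-correlation distance.

    Assumption: this is the form of the Eisen cosine correlation distance
    meant in the abstract; the paper's exact definition is not shown there.
    """
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        raise ValueError("distance is undefined for an all-zero vector")
    return 1.0 - abs(dot) / (norm_x * norm_y)

def best_matching_unit(sample, codebook):
    """Index of the prototype with the smallest distance to the sample."""
    return min(range(len(codebook)),
               key=lambda i: eisen_cosine_distance(sample, codebook[i]))

codebook = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # hypothetical prototypes
bmu = best_matching_unit((3.0, 2.9), codebook)   # selects index 2
```

Under this form the distance depends only on the angle between vectors, so a sample and any non-zero scaled copy of it (including a sign-flipped copy) are at distance zero. Whether that invariance is desirable depends on the data.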

Clustering of the Self-Organizing Map

The self-organizing map (SOM) is an excellent tool in the exploratory phase of data mining. It projects the input space onto prototypes of a low-dimensional regular grid that can be effectively utilized to visualize and explore properties of the data. When the number of SOM units is large, similar units need to be grouped, i.e., clustered, to facilitate quantitative analysis of the map and the data. In this paper, different approaches to clustering of the SOM are considered. In particular, the use of hierarchical agglomerative clustering and partitive clustering using k-means is investigated. The two-stage procedure (first using the SOM to produce the prototypes, which are then clustered in the second stage) is found to perform well when compared with direct clustering of the data and to reduce the computation time.
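The second stage of the two-stage procedure can be sketched in a few lines of Python. This is an illustrative sketch, not the paper's code. Stage one is assumed to be done already, so the prototype values below are made-up stand-ins for trained SOM units. Average linkage is one choice among the agglomerative variants, and all function names are hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerate(prototypes, n_clusters):
    """Average-linkage agglomerative clustering of prototype vectors.

    Returns a list giving the cluster label of each prototype.
    """
    clusters = [[i] for i in range(len(prototypes))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [dist(prototypes[i], prototypes[j])
                         for i in clusters[a] for j in clusters[b]]
                linkage = sum(pairs) / len(pairs)
                if best is None or linkage < best[0]:
                    best = (linkage, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])  # merge the closest pair
        del clusters[b]
    labels = [0] * len(prototypes)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

def label_data(data, prototypes, prototype_labels):
    """Give each sample the cluster label of its nearest prototype."""
    return [prototype_labels[min(range(len(prototypes)),
                                 key=lambda i: dist(s, prototypes[i]))]
            for s in data]

# Stage 1 is assumed done: these stand in for trained SOM prototypes.
prototypes = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.4), (5.0, 5.0), (5.4, 4.9)]
data = [(0.1, 0.1), (4.9, 5.1), (0.3, 0.2)]

prototype_labels = agglomerate(prototypes, n_clusters=2)
data_labels = label_data(data, prototypes, prototype_labels)
```

The pairwise linkage search runs over the prototypes rather than the raw samples, which is where the reduction in computation time comes from when the map has far fewer units than the data set has points.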

A Comparison Study: Clustering using Self-Organizing Map and K-means Algorithm

2016

Clustering is now applied in many different fields of study. Many methods have been proposed, but the most widely used is the K-means algorithm. Neural networks have also been used for clustering, and the most popular neural network method for clustering is the Self-Organizing Map (SOM). Both methods have recently become among the most popular and powerful. Many scholars have tried to employ both methods and compare their performance, and many papers have been published to reveal which one outperforms the other. However, there is still no definitive answer: different scholars reach different conclusions. In this study, SOM and K-means are compared using three popular data sets. Percent misclassified and output visualization graphs (separately and simultaneously with PCA) are presented to verify the comparison results.

Enhanced Clustering Analysis and Visualization Using Kohonen’s Self-Organizing Feature Map Networks

2008

Cluster analysis is the name given to a diverse collection of techniques that can be used to classify objects (e.g., individuals, quadrats, species). While Kohonen's Self-Organizing Feature Map (SOFM) or Self-Organizing Map (SOM) networks have been successfully applied as a classification tool in various problem domains, including speech recognition, image data compression, image or character recognition, robot control, and medical diagnosis, their potential as a robust substitute for clustering analysis remains relatively unresearched. SOM networks combine competitive learning with dimensionality reduction by smoothing the clusters with respect to an a priori grid, and they provide a powerful tool for data visualization. In this paper, the SOM is used to create a toroidal mapping of a two-dimensional lattice to perform cluster analysis on the results of a chemical analysis of wines produced in the same region of Italy but derived from three different cultivars, referred to as the "wine recognition data" located in the University of California-Irvine database. The results are encouraging, and it is believed that the SOM would make an appealing and powerful decision-support tool for clustering tasks and for data visualization.
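What makes a lattice toroidal is that each axis wraps around, so units on opposite edges are neighbours. The Python sketch below shows one way to compute grid distance under that assumption. The function names and the Gaussian neighbourhood form are illustrative choices and are not taken from the paper.

```python
import math

def toroidal_grid_distance(a, b, rows, cols):
    """Squared grid distance between units a and b on a torus.

    a and b are (row, col) positions; each axis wraps around, so the
    first and last rows (and columns) are adjacent.
    """
    d_row = abs(a[0] - b[0])
    d_col = abs(a[1] - b[1])
    d_row = min(d_row, rows - d_row)  # take the shorter way round
    d_col = min(d_col, cols - d_col)
    return d_row ** 2 + d_col ** 2

def neighbourhood(a, b, rows, cols, sigma):
    """Gaussian neighbourhood weight for a SOM update (assumed form)."""
    d2 = toroidal_grid_distance(a, b, rows, cols)
    return math.exp(-d2 / (2.0 * sigma ** 2))

# On a 5 x 5 torus, opposite corners are only one step apart per axis.
corner_distance = toroidal_grid_distance((0, 0), (4, 4), rows=5, cols=5)
```

On a flat 5 x 5 grid the same two corners would be at squared distance 32; on the torus the value is 2, so no unit sits on a border with a truncated neighbourhood.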

Expanding Self-Organizing Map for data visualization and cluster analysis

Information Sciences, 2004

The Self-Organizing Map (SOM) is a powerful tool in the exploratory phase of data mining. It is capable of projecting high-dimensional data onto a regular, usually 2-dimensional grid of neurons with good neighborhood preservation between the two spaces. However, due to the dimensional conflict, neighborhood preservation cannot always lead to perfect topology preservation. In this paper, we establish an Expanding SOM (ESOM) to better preserve topology between the two spaces. Besides the neighborhood relationship, our ESOM can detect and preserve an ordering relationship using an expanding mechanism. The computational complexity of the ESOM is comparable with that of the SOM. Our experimental results demonstrate that the ESOM constructs better mappings than the classic SOM, especially in terms of the topological error. Furthermore, clustering results generated by the ESOM are more accurate than those obtained by the SOM.