Cluster Analysis of Information in Complex Networks

A clustering approach for exploring the Internet structure

2004 23rd IEEE Convention of Electrical and Electronics Engineers in Israel, 2004

This paper proposes several clustering algorithms to explore the topology of the AS-graph. Using these algorithms, we are able to view the Internet topology, as encapsulated by the AS-graph, at three levels of abstraction: (1) a low-resolution view, which shows a coarse cluster cover of the whole AS-graph, (2) a mid-resolution view, showing the relationships between dense cores inside the coarse clusters, and (3) a high-resolution view of individual high-density cores.

Identification of clusters in the Web graph based on link topology

Seventh International Database Engineering and Applications Symposium, 2003. Proceedings., 2003

The Web graph has recently been used to model the link structure of the Web. Studies of such graphs can yield valuable insights into Web algorithms for crawling, searching, and discovery of Web communities. This paper proposes a new approach to clustering the Web graph. The proposed algorithm identifies a small subset of the graph as "core" members of clusters, and then incrementally constructs the clusters by a selection criterion. Two qualitative criteria are proposed to measure the quality of graph clustering. We have implemented our algorithm and tested it on a set of arbitrary graphs with good results. Applications of our approach include graph drawing and Web visualization.

Web Data Clustering

Studies in Computational Intelligence, 2009

This chapter provides a survey of some clustering methods relevant to clustering Web elements for better information access. We start with classical methods of cluster analysis that seem to be relevant in approaching the clustering of Web data. Graph clustering is also described, since its methods contribute significantly to clustering Web data. The use of artificial neural networks for clustering has the same motivation. Based on previously presented material, the core of the chapter provides an overview of approaches to clustering in the Web environment. In particular, we focus on clustering Web search results, in which clustering search engines arrange the search results into groups around a common theme. We conclude with some general considerations concerning the justification of so many clustering algorithms and their application in the Web environment.

CLUSTERING TO ANALYZE ACADEMIC SOCIAL NETWORKS

A social network is a group of individuals with diverse social interactions among them. The network is large-scale and distributed [...]. Quantitative analysis of networks is a need of [...] and, in turn, of society. Clustering helps us group people with similar characteristics in dense social networks. We have considered similarity measures for statistical [...]. When a social network is represented as a graph with members as nodes and their relations as edges, graph mining is suitable for statistical analysis. We have chosen academic social network nodes to simplify network analysis. [...] similarity between unstructured data elements extracted from the social network.

Network Analysis of Works on Clustering and Classification from Web of Science

Studies in Classification, Data Analysis, and Knowledge Organization, 2010

Web of Science (WoS) is a database that provides information about current and past articles published in over 10,000 of the most prestigious, high-impact research journals in the world from 1970 onward. A file with full information records about selected articles can be downloaded and further analyzed. We collected from WoS complete records on articles from the Journal of Classification, articles citing these articles, and articles in WoS cited by them at least 10 times. A special program, WoS2Pajek, was developed for converting such data into Pajek network files. The citation network between articles, networks of articles × authors, articles × keywords, articles × journals, and the partition according to publication year were obtained from the data. These networks were analyzed in order to identify the most important authors, works, and topics that have been involved in the field in the last decades.

Clustering in complex networks. I. General formalism

Physical Review E, 2006

We develop a full theoretical approach to clustering in complex networks. A key concept is introduced, the edge multiplicity, that measures the number of triangles passing through an edge. This quantity extends the clustering coefficient in that it involves the properties of two vertices, not just one. The formalism is completed with the definition of a three-vertex correlation function, which is the fundamental quantity describing the properties of clustered networks. The formalism suggests new metrics that are able to thoroughly characterize transitive relations. A rigorous analysis of several real networks, which makes use of the new formalism and the new metrics, is also provided. It is also found that clustered networks can be classified into two main groups: the weak and the strong transitivity classes. In the first class, edge multiplicity is small, with triangles being disjoint. In the second class, edge multiplicity is high, and so triangles share many edges. As we shall see in the following paper, the class a network belongs to has strong implications for its percolation properties.
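As an illustration of the edge multiplicity described in this abstract, the following is a minimal sketch, not the paper's own code. It assumes an undirected simple graph given as a list of edges with mutually comparable node labels; the function names `edge_multiplicity` and `clustering_from_multiplicity` and the example graph are hypothetical. The second function relies on the standard identity that summing the multiplicities of the edges incident to a vertex counts each triangle at that vertex twice.

```python
def edge_multiplicity(edges):
    """Map each undirected edge (u, v), u < v, to the number of triangles through it."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops (assumption: simple graph)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    mult = {}
    for u in adj:
        for v in adj[u]:
            if u < v:
                # Each common neighbour of u and v closes one triangle on edge (u, v).
                mult[(u, v)] = len(adj[u] & adj[v])
    return mult


def clustering_from_multiplicity(edges):
    """Local clustering coefficient of every vertex, derived from edge multiplicities."""
    twice_triangles = {}  # sum of multiplicities of incident edges = 2 * triangles at vertex
    degree = {}
    for (u, v), m in edge_multiplicity(edges).items():
        for x in (u, v):
            twice_triangles[x] = twice_triangles.get(x, 0) + m
            degree[x] = degree.get(x, 0) + 1
    return {
        x: twice_triangles[x] / (degree[x] * (degree[x] - 1)) if degree[x] > 1 else 0.0
        for x in degree
    }


# Hypothetical example: two triangles (0-1-2 and 1-2-3) sharing edge (1, 2), plus a pendant edge.
example_edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4)]
multiplicities = edge_multiplicity(example_edges)
coefficients = clustering_from_multiplicity(example_edges)
```

In this example the shared edge (1, 2) has multiplicity 2 while the pendant edge (3, 4) has multiplicity 0, which is the kind of per-edge information a per-vertex clustering coefficient alone does not expose.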

Web Graph Clustering for Displays and Navigation of Cyberspace

Applications and Techniques

This chapter presents a new approach to clustering graphs, and applies it to Web graph display and navigation. The proposed approach takes advantage of the linkage patterns of graphs, and utilizes an affinity function in conjunction with the k-nearest neighbor. This chapter uses Web graph clustering as an illustrative example, and offers a potentially more applicable method to mine structural information from data sets, with the hope of informing readers of another aspect of data mining and its applications.

Practical challenges that arise when clustering the web using spectral methods

This is a report on an implementation of a spectral clustering algorithm for classifying very large internet sites, with special emphasis on the practical problems encountered in developing such a data mining system. Remarkably, some of these technical difficulties are due to fundamental issues pertaining to the mathematics involved, and are not treated properly in the literature. Others are inherent to the functions and numerical methods proper to the high-level technical computing programming environment that we use. We will point out what these practical challenges are and how to solve them.

Cluster analysis in document networks

Data Mining IX, 2008

Text or document clustering is a subset of the larger field of data clustering and has been one of the research hotspots in text mining. On the other hand, recent studies have shown that many real systems may be represented as complex networks with astonishingly similar properties. In this work a document corpus is represented as a complex network of documents, in which the nodes represent the documents and the edges are weighted according to the similarities among documents. The detection of community structures in complex networks can be seen as cluster analysis in document networks. Recently, community detection algorithms based on spectral properties of the underlying network have shown good results. The main motivation for applying those methods is that they have been shown to be robust to the high dimensionality of the feature space and also to the inherent data sparsity resulting from text representation in the vector space model. The aim of this paper is to present the application of community structure algorithms to text mining. Experiments have been carried out on document clustering problems taken from the 20 Newsgroups document corpus to evaluate the performance of the proposed approach.
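The document network described in this abstract can be sketched as follows. This is only an illustrative construction under stated assumptions, not the paper's method: it assumes whitespace tokenization, raw term-frequency vectors, and cosine similarity as the edge weight (the abstract does not name the similarity measure). The names `cosine`, `document_network`, and the `threshold` parameter are hypothetical, and the community detection step itself is not shown.

```python
import math
from collections import Counter
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(x * x for x in a.values()))
    norm_b = math.sqrt(sum(x * x for x in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def document_network(docs, threshold=0.0):
    """Return {(i, j): weight} for document pairs whose similarity exceeds threshold."""
    # Assumption: lower-cased whitespace tokens stand in for a real text pipeline.
    vectors = [Counter(doc.lower().split()) for doc in docs]
    edges = {}
    for i, j in combinations(range(len(vectors)), 2):
        weight = cosine(vectors[i], vectors[j])
        if weight > threshold:
            edges[(i, j)] = weight
    return edges


# Hypothetical toy corpus: documents 0 and 1 share vocabulary, document 2 does not.
docs = [
    "graph clustering network",
    "network graph community",
    "cooking pasta recipe",
]
network = document_network(docs)
```

The resulting weighted edge dictionary is the kind of input a community detection algorithm would then partition; documents with no shared terms simply get no edge.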