A Social Hypertext Model for Finding Community In Blogs

Discovering Web Communities in the Blogspace

2007 40th Annual Hawaii International Conference on System Sciences (HICSS'07), 2007

With the emergence of a range of second-generation Internet-based services such as weblogs and their hosting services, many loosely organized communities of bloggers have started to form around common interests. Those communities have quickly evolved into a new information and knowledge dissemination channel. Most current search engines cannot discover weblog communities through regular keyword search. In this paper we propose a new way of collecting and preparing information for weblog community discovery. A weblog community is treated as a social network, and the data collection stage focuses on gaining knowledge of the strength of social ties between weblogs. The strength of social ties and the clustering feature of social networks are used to extract communities from a large blogspace. We also develop a few metrics to rank communities as well as individual members in the community. We report several experimental results on "web services" communities.

Discovery of blog communities based on mutual awareness

2006

Blogs have many fast-growing communities on the Internet. Discovering such communities in the blogosphere is important for sustaining and encouraging new blogger participation. We focus on extracting communities based on two key insights: (a) communities form due to individual blogger actions that are mutually observable; (b) semantics of the hyperlink structure are different from traditional web analysis problems. Our approach involves developing computational models for mutual awareness that incorporate the specific action type, frequency and time of occurrence. We use the mutual awareness feature with a ranking-based community extraction algorithm to discover communities. To validate our approach, four performance measures are used on the WWW2006 Blog Workshop dataset and the NEC focused blog dataset with excellent quantitative results. The extracted communities also prove to be semantically cohesive with respect to their topics of interest.
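As an illustration only, the idea of a mutual awareness score built from action type, frequency and time of occurrence can be sketched as follows. The abstract does not give the actual model, so the action weights, the half-life decay and the use of the minimum of the two directed scores are all assumptions made for this sketch, and the names `ACTION_WEIGHTS`, `awareness` and `mutual_awareness` are hypothetical.

```python
from collections import defaultdict

# Hypothetical weights per action type; the source does not give values.
ACTION_WEIGHTS = {"comment": 1.0, "trackback": 0.8, "blogroll": 0.5}


def awareness(actions, now, half_life=30.0):
    """Directed awareness score for each (actor, target) pair.

    actions: iterable of (actor, target, action_type, day) tuples.
    Each action adds its type weight, halved every `half_life` days,
    so repeated (frequent) and recent actions count for more.
    """
    scores = defaultdict(float)
    for actor, target, kind, day in actions:
        age = now - day
        scores[(actor, target)] += ACTION_WEIGHTS[kind] * 0.5 ** (age / half_life)
    return scores


def mutual_awareness(actions, now, half_life=30.0):
    """Symmetric score per pair: the weaker of the two directed scores.

    Pairs with awareness in only one direction are left out (assumption).
    """
    directed = awareness(actions, now, half_life)
    mutual = {}
    for (a, b), score in directed.items():
        back = directed.get((b, a), 0.0)
        if a < b and back > 0.0:
            mutual[(a, b)] = min(score, back)
    return mutual
```

A ranking-based extraction step, as named in the abstract, would then operate on these pairwise scores; that step is not sketched here.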

Hyper-community detection in the blogosphere

Proceedings of second ACM SIGMM workshop on Social media - WSM '10, 2010

Most existing work on learning community structure in social networks is graph-based, with links among the members often represented as an adjacency matrix encoding direct pairwise associations between members. In this paper, we propose a method to group online communities in the blogosphere based on the topics learnt from the content blogged. We then consider a different type of online community formulation: the sentiment-based grouping of online communities. The problem of sentiment-based clustering for community structure discovery is rich with many interesting open aspects to be explored. We propose a novel approach for addressing hyper-community detection based on users' sentiment. We employ a nonparametric clustering method to automatically discover hidden hyper-communities and present the results obtained from a large dataset.

Bloggers Behavior and Emergent Communities in Blog Space

Computing Research Repository, 2009

Interactions between users in cyberspace may lead to phenomena different from those observed in common social networks. Here we analyse large data sets about users and the Blogs on which they write and comment, mapped onto a bipartite graph. In such an enlarged Blog space we trace user activity over time, which results in robust temporal patterns of user-Blog behavior and the emergence of communities. With spectral methods applied to the projection onto a weighted user network, we detect clusters of users related by their common interests and habits. Our results suggest that different mechanisms may play a role in the case of very popular Blogs. Our analysis makes a suitable basis for theoretical modeling of the evolution of cyber communities and for practical study of the data, in particular for an efficient search of interesting Blog clusters and further retrieval of their contents by text analysis.
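The projection of a bipartite user-Blog graph onto a weighted user network can be sketched as below. This is a minimal illustration, not the paper's procedure: the weighting (number of Blogs two users share) is an assumption, and the function name `project_to_users` is hypothetical. The spectral clustering step is not shown; it would take the resulting weighted network as input.

```python
from collections import defaultdict
from itertools import combinations


def project_to_users(user_blog_edges):
    """Project a bipartite user-Blog graph onto a weighted user-user graph.

    user_blog_edges: iterable of (user, blog) pairs.
    Returns {(u, v): weight} with u < v, where weight counts the Blogs
    on which both users were active (an assumed weighting).
    """
    users_by_blog = defaultdict(set)
    for user, blog in user_blog_edges:
        users_by_blog[blog].add(user)

    weights = defaultdict(int)
    for users in users_by_blog.values():
        # Every pair of users active on the same Blog gains one unit of weight.
        for u, v in combinations(sorted(users), 2):
            weights[(u, v)] += 1
    return dict(weights)
```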

Blog Community Discovery and Evolution Based on Mutual Awareness Expansion

IEEE/WIC/ACM International Conference on Web Intelligence (WI'07), 2007

There are information needs involving costly decisions that cannot be efficiently satisfied through conventional web search engines. Alternatively, community-centric search can provide multiple viewpoints to facilitate decision making. We propose to discover and model the temporal dynamics of thematic communities based on mutual awareness, where the awareness arises due to observable blogger actions and the expansion of mutual awareness leads to community formation. Given a query, we construct a directed action graph that is time-dependent, and weighted with respect to the query. We model the process of mutual awareness expansion using a random walk process and extract communities based on the model. We propose an interaction space based representation to quantify community dynamics. Each community is represented as a vector in the interaction space and its evolution is determined by a novel interaction correlation method. We have conducted experiments with a real-world blog dataset and obtained promising results for detection as well as insightful results for community evolution.
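A random walk on a weighted directed action graph can be illustrated with the sketch below. It uses a random walk with restart from a seed blogger, which is a stand-in chosen for this example and not necessarily the random walk model of the paper; the function name `random_walk_scores`, the restart probability and the handling of nodes without outgoing edges are assumptions.

```python
def random_walk_scores(edges, seed, restart=0.15, steps=50):
    """Random walk with restart on a weighted directed graph.

    edges: {source: {target: weight}}, e.g. query-weighted blogger actions.
    Returns the visiting probability of each node after `steps` updates.
    """
    nodes = set(edges)
    for targets in edges.values():
        nodes.update(targets)

    prob = {n: 0.0 for n in nodes}
    prob[seed] = 1.0
    for _ in range(steps):
        nxt = {n: 0.0 for n in nodes}
        for node, mass in prob.items():
            targets = edges.get(node, {})
            total = sum(targets.values())
            if total == 0:
                # No outgoing actions: send the walker back to the seed (assumption).
                nxt[seed] += (1 - restart) * mass
                continue
            for target, weight in targets.items():
                nxt[target] += (1 - restart) * mass * weight / total
        # With probability `restart` the walker jumps back to the seed.
        nxt[seed] += restart * sum(prob.values())
        prob = nxt
    return prob
```

Nodes with high scores are the ones the walker reaches often from the seed, so a candidate community could be read off by keeping the top-scoring nodes; the cut-off rule is left open here.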

Structural and temporal analysis of the blogosphere through community factorization

Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '07, 2007

The blogosphere has unique structural and temporal properties since blogs are typically used as communication media among human individuals. In this paper, we propose a novel technique that captures the structure and temporal dynamics of blog communities. In our framework, a community is a set of blogs that communicate with each other triggered by some events (such as a news article). The community is represented by its structure and temporal dynamics: a community graph indicates how often one blog communicates with another, and a community intensity indicates the activity level of the community that varies over time. Our method, community factorization, extracts such communities from the blogosphere, where the communication among blogs is observed as a set of subgraphs (i.e., threads of discussion). This community extraction is formulated as a factorization problem in the framework of constrained optimization, in which the objective is to best explain the observed interactions in the blogosphere over time. We further provide a scalable algorithm for computing solutions to the constrained optimization problems. Extensive experimental studies on both synthetic and real blog data demonstrate that our technique is able to discover meaningful communities that are not detectable by traditional methods.
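To give a feel for extraction by factorization, the sketch below applies plain nonnegative matrix factorization with multiplicative updates to a matrix whose rows are blog-to-blog links and whose columns are time steps. This is a generic technique swapped in for illustration, not the community factorization algorithm of the paper; the matrix layout, the function names and the rank `k` are assumptions. Each column of `w` plays the role of a community graph (how strongly each link belongs to a community) and each row of `h` the role of that community's intensity over time.

```python
import random


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frob_error(x, w, h):
    """Squared Frobenius error between x and the product w * h."""
    approx = matmul(w, h)
    return sum((xv - av) ** 2 for xr, ar in zip(x, approx) for xv, av in zip(xr, ar))


def factorize(x, k, iters=200, seed=0, eps=1e-9):
    """Approximate nonnegative x (links x time) as w (links x k) times h (k x time)."""
    rng = random.Random(seed)
    n, t = len(x), len(x[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(t)] for _ in range(k)]
    for _ in range(iters):
        # Multiplicative update for h keeps every entry nonnegative.
        wt = transpose(w)
        num, den = matmul(wt, x), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(t)] for i in range(k)]
        # Matching update for w.
        ht = transpose(h)
        num, den = matmul(x, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(n)]
    return w, h
```

In a toy matrix where two links are active only in the early time steps and two others only in the late ones, a rank-2 factorization can separate the two groups, with each row of `h` peaking in the corresponding period.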