Event detection in Twitter: A machine-learning approach based on term pivoting

Event Detection in Twitter by Weighting Tweet's Features

2020

In recent years, people have spent a lot of time on social networks, using them as a place to comment on personal or public events. As a result, a large amount of information is generated and shared daily in these networks. Using this information can help authorities react to events accurately and in a timely manner. In this study, the social network investigated is Twitter. The main idea of this research is to differentiate among tweets based on some of their features. The study investigates the performance of event detection when three attributes of tweets are weighted: the follower count, the retweet count, and the user location. The results show that the average execution time and the precision of event detection in the presented method improved by 27% and 31%, respectively, compared with the base method. Another result of this research is that the presented method can detect all events, including both hot events and less important ones.

Social event detection on twitter

Web Engineering, 2012

Various applications are developed today on top of microblogging services like Twitter. In order to engineer Web applications which operate on microblogging data, there is a need for appropriate filtering techniques to identify messages. In this paper, we focus on detecting Twitter messages (tweets) that report on social events. We introduce a filtering pipeline that exploits textual features and n-grams to classify messages into event-related and non-event-related tweets. We analyze the impact of preprocessing techniques, achieving accuracies higher than 80%. Further, since our proposed filtering pipeline requires training data, we present a strategy to automate the labeling of that data. When tested on our dataset, this semi-automated method achieves an accuracy of 79%, with results comparable to the manual labeling approach.

Event Detection in Twitter: A Content and Time-Based Analysis

ArXiv, 2021

The detection of events from online social networks is a recent, evolving field that attracts researchers from across a spectrum of disciplines and domains. Here we report a time-series analysis for predicting events. In particular, we evaluated the frequency distribution of top n-grams of terms over time, focusing on two indicators: high-frequency n-grams over short periods of time and high-frequency n-grams over long periods of time. Both indicators can refer to certain aspects of events as they evolve. To evaluate the model’s accuracy in detecting events, we built and used a Twitter dataset of the most popular hashtags that surrounded the well-documented protests that occurred at the University of Missouri (Mizzou) in late 2015.
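As a rough illustration of the kind of analysis this abstract describes, the sketch below counts the most frequent n-grams per time window. It is not the authors' code: the function names `ngrams` and `top_ngrams_by_window`, the whitespace tokenization, and the use of a `strftime` pattern to define the window are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from datetime import datetime
from typing import Dict, Iterable, List, Tuple


def ngrams(tokens: List[str], n: int) -> List[str]:
    """Return the contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_ngrams_by_window(
    tweets: Iterable[Tuple[datetime, str]],
    n: int = 2,
    window: str = "%Y-%m-%d",  # hypothetical choice: one bucket per day
    k: int = 3,
) -> Dict[str, List[Tuple[str, int]]]:
    """Count n-grams per time bucket and keep the k most frequent in each.

    `window` is a strftime pattern, so "%Y-%m-%d" gives short (daily)
    buckets and "%Y-%m" gives long (monthly) buckets.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    for timestamp, text in tweets:
        bucket = timestamp.strftime(window)
        # Naive tokenization: lowercase and split on whitespace.
        counts[bucket].update(ngrams(text.lower().split(), n))
    # Sort buckets chronologically (the patterns above sort lexically).
    return {b: c.most_common(k) for b, c in sorted(counts.items())}
```

Calling the function twice, once with a daily pattern and once with a monthly one, gives one possible reading of the short-period and long-period indicators. An n-gram that tops a single day but not the month would count as a short-lived burst under this reading.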

Tweets analysis for event detection

Ingénierie des systèmes d'information, 2016

Social media systems have proven to be valuable platforms for information and communication, particularly during events such as natural disasters, for example the earthquakes, tsunami, and nuclear emergencies in Japan in 2011. This behavior leads to the accumulation of an enormous amount of information. However, finding relevant posts can be a challenging task, since the relevance of a post depends on its content, its author, and the characteristics of the tweet. In addition, identifying tweets that describe a specific type of event is challenging due to the high complexity and variety of event descriptions. These challenges present a big opportunity for Natural Language Processing (NLP) and Information Extraction (IE) technology to enable new large-scale data-analysis applications. Taking all of these difficulties into account, this paper proposes a new metric to improve the results of searches in microblogs. The metric combines content relevance, tweet relevance, and author relevance. The paper also develops a Natural Language Processing method for extracting temporal information about events from posts, specifically tweets. Our approach is based on a methodology of temporal marker classes and on a contextual exploration method. To evaluate our model, we built a knowledge management system, using a collection of 10 thousand tweets about current events in 2014 and 2015.

Event detection in Tweets

2016

Andrei-Bogdan Baran (andrei.baran@info.uaic.ro) and Adrian Iftene (adiftene@info.uaic.ro), “Alexandru Ioan Cuza” University, Faculty of Computer Science, General Berthelot, No. 16

Abstract: Twitter is among the fastest-growing online social networking services, with more than 140 million users producing over 400 million tweets per day. It enables users to post status updates (tweets) about a huge variety of topics to a network of followers using various communication services such as cell phones, e-mails, Web interfaces, or other third-party applications. Monitoring and analyzing this rich and continuous user-generated content can lead to obtaining valuable information about local and global news and events, because virtually any person witnessing or involved in any event is nowadays able to disseminate real-time information, which can reach the other side of the world as the ev...

Exploring a Scalable Solution to Identifying Events in Noisy Twitter Streams

Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2015, 2015

The unprecedented use of social media through smartphones and other web-enabled mobile devices has enabled the rapid adoption of platforms like Twitter. Event detection has found many applications on the web, including breaking news identification and summarization. The recent increase in the usage of Twitter during crises has attracted researchers to focus on detecting events in tweets. However, current solutions have focused on static Twitter data. The necessity to detect events in a streaming environment during fast paced events such as a crisis presents new opportunities and challenges. In this paper, we investigate event detection in the context of real-time Twitter streams as observed in real-world crises. We highlight the key challenges in this problem: the informal nature of text, and the high-volume and high-velocity characteristics of Twitter streams. We present a novel approach to address these challenges using single-pass clustering and the compression distance to efficiently detect events in Twitter streams. Through experiments on large Twitter datasets, we demonstrate that the proposed framework is able to detect events in near real-time and can scale to large and noisy Twitter streams.
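The combination of single-pass clustering with a compression distance can be sketched roughly as follows. This is a minimal illustration and not the paper's framework: it assumes the normalized compression distance (NCD) computed with `zlib`, compares each incoming tweet only with the first tweet of every existing cluster, and uses an arbitrary `threshold` value. The function names are hypothetical.

```python
import zlib
from typing import List


def compressed_size(text: str) -> int:
    """Size in bytes of the zlib-compressed UTF-8 encoding of `text`."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def ncd(a: str, b: str) -> float:
    """Normalized compression distance: near 0 for similar texts, near 1 otherwise."""
    ca, cb = compressed_size(a), compressed_size(b)
    cab = compressed_size(a + " " + b)
    return (cab - min(ca, cb)) / max(ca, cb)


def single_pass_cluster(tweets: List[str], threshold: float = 0.6) -> List[List[str]]:
    """Assign each tweet, in arrival order, to the closest cluster or start a new one.

    Assumption for this sketch: a cluster is represented by its first tweet,
    and `threshold` is an illustrative value rather than a tuned one.
    """
    clusters: List[List[str]] = []
    for tweet in tweets:
        best, best_dist = None, threshold
        for cluster in clusters:
            dist = ncd(tweet, cluster[0])
            if dist < best_dist:
                best, best_dist = cluster, dist
        if best is None:
            clusters.append([tweet])  # no cluster is close enough
        else:
            best.append(tweet)
    return clusters
```

Each tweet is processed once and never revisited, which is what makes the approach suitable for streams. The compression distance needs no tokenization or vocabulary, so it tolerates informal spelling, although on very short texts the fixed overhead of the compressor makes the distance noisy.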

What’s Happening Around the World? A Survey and Framework on Event Detection Techniques on Twitter

Journal of Grid Computing

In the last few years, Twitter has become a popular platform for sharing opinions, experiences, news, and views in real time. Twitter presents an interesting opportunity for detecting events happening around the world. The content (tweets) published on Twitter is short and poses diverse challenges for detecting and interpreting event-related information. This article provides insights into ongoing research and helps in understanding recent research trends and techniques used for event detection using Twitter data. We classify techniques and methodologies according to event types, orientation of content, event detection tasks, their evaluation, and common practices. We highlight the limitations of existing techniques and accordingly propose solutions to address the shortcomings. We propose a framework called EDoT based on the research trends, common practices, and techniques used for detecting events on Twitter. EDoT can serve as a guideline for developing event detection methods, especially for researchers who are new to this area. We also describe and compare data collection techniques, the effectiveness and shortcomings of various Twitter and non-Twitter-based features, and discuss various evaluation measures and benchmarking methodologies. Finally, we discuss the trends, limitations, and future directions for detecting events on Twitter.

TwitterNews: Real time event detection from the Twitter data stream

Research in event detection from Twitter streaming data has been gaining momentum in the last couple of years. Although such data is noisy and often contains misleading information, Twitter can be a rich source of information if harnessed properly. In this paper, we propose a scalable event detection system, TwitterNews, to detect and track newsworthy events in real time from Twitter. TwitterNews provides a novel approach, combining a random-indexing-based term vector model with locality-sensitive hashing, that aids in performing incremental clustering of tweets related to various events within a fixed time. TwitterNews also incorporates an effective strategy to deal with the cluster fragmentation issue prevalent in incremental clustering. The set of candidate events generated by TwitterNews is then filtered to report the newsworthy events along with an automatically selected representative tweet from each event cluster. Finally, we evaluate the effectiveness of TwitterNews, ...
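To make the idea of pairing random indexing with locality-sensitive hashing more concrete, here is a small sketch of how the two could drive incremental clustering. It is not the TwitterNews implementation: the dimensions, the random-hyperplane hash, the cosine threshold `min_sim`, and all function names are assumptions chosen for illustration, and the paper's handling of cluster fragmentation is not reproduced.

```python
import math
import random
import zlib
from typing import Dict, List, Tuple

DIM = 64      # length of each random index vector (illustrative)
NONZERO = 4   # number of +1/-1 entries per term (illustrative)
BITS = 8      # number of hyperplanes, i.e. bits in the LSH key (illustrative)


def index_vector(term: str) -> List[float]:
    """Random indexing: a sparse ternary vector derived deterministically from the term."""
    rng = random.Random(zlib.crc32(term.encode("utf-8")))
    vec = [0.0] * DIM
    for pos in rng.sample(range(DIM), NONZERO):
        vec[pos] = rng.choice((-1.0, 1.0))
    return vec


def tweet_vector(text: str) -> List[float]:
    """Represent a tweet as the sum of the index vectors of its terms."""
    vec = [0.0] * DIM
    for term in text.lower().split():
        for i, value in enumerate(index_vector(term)):
            vec[i] += value
    return vec


_plane_rng = random.Random(0)
PLANES = [[_plane_rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(BITS)]


def lsh_key(vec: List[float]) -> Tuple[bool, ...]:
    """Random-hyperplane LSH: one sign bit per hyperplane."""
    return tuple(sum(p * v for p, v in zip(plane, vec)) >= 0 for plane in PLANES)


def cosine(a: List[float], b: List[float]) -> float:
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def cluster_stream(tweets: List[str], min_sim: float = 0.5) -> List[dict]:
    """Incrementally cluster tweets, comparing only against clusters in the same bucket."""
    buckets: Dict[Tuple[bool, ...], List[dict]] = {}
    clusters: List[dict] = []
    for text in tweets:
        vec = tweet_vector(text)
        key = lsh_key(vec)
        best, best_sim = None, min_sim
        for cluster in buckets.get(key, []):
            sim = cosine(vec, cluster["sum"])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            best = {"sum": [0.0] * DIM, "tweets": []}
            buckets.setdefault(key, []).append(best)
            clusters.append(best)
        # Keep a running sum as the centroid; cosine ignores its scale.
        best["sum"] = [s + v for s, v in zip(best["sum"], vec)]
        best["tweets"].append(text)
    return clusters
```

The hash key limits each comparison to clusters that landed in the same bucket, so the cost per tweet stays roughly constant as the number of clusters grows. The trade-off is that similar tweets hashed to different buckets start separate clusters, which is one way cluster fragmentation can arise.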

An Approximate Model for Event Detection From Twitter Data

IEEE Access, 2020

The abundance and real-time availability of Twitter data have proved beneficial in detecting events in various domains such as emergency situations, crime detection, public health, and place recommendations. Nevertheless, two critical challenges arise when detecting events using social media data. The first is the uncertainty in capturing the contextual relationship among tweets, which results from the limited contextual information available in short tweets. The second is the high computation cost of event detection due to massive data processing. Earlier research addressing these challenges has tried to capture contextual information using dense vector representations of text, leveraging deep neural word embedding models such as Word2Vec and GloVe. However, these models are trained on the Euclidean vector space, which fails to amalgamate the directional information of the vectors with the semantic information in text, incurring high computational costs. To target both problems simultaneously, we propose modeling Twitter data as a graph-of-sentences, which retains the contextual relationships while maintaining lower computational cost. The proposed model captures contextual information using JoSE, a spherical vector representation leveraging the word-word and word-paragraph semantic co-occurrence statistics in a spherical generative model. Furthermore, the framework uses the weighted-graph model to capture all the relationships among the Twitter data efficiently. The graph is further pruned with the help of the graph component filtering approach. The graph clustering model, employed to detect the events, leverages the edge weights and the partial-k clustering approach, maintaining low computation costs. Experiments on the annotated benchmark Twitter dataset and the real-world datasets show run-time performance improved by up to 30% while maintaining qualitative performance (F1-score) comparable to the state-of-the-art models.

Index terms: Graph based event detection, social media data, uncertain clustering, word2vec, doc2vec, Jose twitter graph.