What Kind of #Communication is Twitter? Mining #Psycholinguistic Cues for Emergency Coordination
Related papers
What Kind of #Conversation is Twitter? Mining #Psycholinguistic Cues for Emergency Coordination
Elsevier, 2013
The information overload created by social media messages in emergency situations challenges response organizations to find targeted content and users. We aim to select useful messages by detecting the presence of conversation as an indicator of coordinated citizen action. Using simple linguistic indicators associated with conversation analysis in social science, we model the presence of conversation in the communication landscape of Twitter in a large corpus of 1.5M tweets for various disaster and non-disaster events spanning different periods, lengths of time and varied social significance. Within Replies, Retweets and tweets that mention other Twitter users, we found that domain-independent, linguistic cues distinguish likely conversation from non-conversation in this online (mediated) communication. We demonstrate that conversation subsets within Replies, Retweets and tweets that mention other Twitter users potentially contain more information than non-conversation subsets. Information density also increases for tweets that are not Replies, Retweets or mentioning other Twitter users, as long as they reflect conversational properties. From a practical perspective, we have developed a model for trimming the candidate tweet corpus to identify a much smaller subset of data for submission to deeper, domain-dependent semantic analyses for the identification of actionable information nuggets for coordinated emergency response.
The information overload created by social media messages in emergency situations challenges response organizations to find targeted content and users. We aim to select useful messages by detecting the presence of conversation as an indicator of coordinated citizen action. Using simple linguistic indicators drawn from conversation analysis in social science, we model the presence of coordination in the communication landscape of Twitter using a corpus of 1.5 million tweets for various disaster and non-disaster events spanning different periods, lengths of time, and varied social significance. Within replies, retweets and tweets that mention other Twitter users, we found that domain-independent, linguistic cues distinguish likely conversation from non-conversation in this online form of mediated communication. We demonstrate that these likely conversation subsets potentially contain more information than non-conversation subsets, whether or not the tweets are replies, retweets, or mention other Twitter users, as long as they reflect conversational properties. From a practical perspective, we have developed a model for trimming the candidate tweet corpus to identify a much smaller subset of data for submission to deeper, domain-dependent semantic analyses for the identification of actionable information nuggets for coordinated emergency response. (A. Hampton), valerie@knoesis.org (V.L. Shalin).
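The trimming step described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' model: the cue lexicon (`CUES`), the function names, and the `min_cue_types` threshold are all assumptions introduced here, since the abstract does not list the actual linguistic indicators.

```python
import re

# Hypothetical cue lexicon: illustrative only, NOT the feature set used in the paper.
CUES = {
    "second_person": re.compile(r"\b(you|your|yours|u)\b", re.IGNORECASE),
    "first_person": re.compile(r"\b(i|we|my|our|me|us)\b", re.IGNORECASE),
    "dialogue_marker": re.compile(
        r"\b(thanks|thank you|please|ok|okay|yes|no|sorry)\b", re.IGNORECASE
    ),
    "question": re.compile(r"\?"),
}


def cue_counts(text):
    """Count occurrences of each cue type in a tweet."""
    return {name: len(pattern.findall(text)) for name, pattern in CUES.items()}


def is_likely_conversation(text, min_cue_types=2):
    """Flag a tweet when at least `min_cue_types` distinct cue types occur.

    The threshold is an assumed parameter, chosen only for illustration.
    """
    counts = cue_counts(text)
    return sum(1 for c in counts.values() if c > 0) >= min_cue_types


def trim_corpus(tweets, min_cue_types=2):
    """Keep only tweets flagged as likely conversation."""
    return [t for t in tweets if is_likely_conversation(t, min_cue_types)]
```

Because the cues are domain-independent, a filter of this shape could be applied before any disaster-specific analysis, with the retained subset passed on to the deeper semantic stage.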
Lingue e Linguaggi, 2020
In recent years, social media platforms have had a tremendous impact on the online world due to their effectiveness in multimodal communication events (Herring 2001). Social media users benefit from the digital nature of such interactions to gather data for different purposes, including discourse-related ones (Zappavigna 2012). Hashtags, in this sense, have proved to be an effective tool that is used to broaden communication but also to prompt real-life actions, especially to face and manage critical or emergency situations (Olteanu et al. 2015). This specific device is more likely to be used effectively on Twitter, a popular micro-blog used as an information aggregator and catalyst for action (Zappavigna 2015). Following previous studies focusing on the same topic (Burnap et al. 2014; Hughes, Palen 2009), the paper examines examples of context-based words and/or purpose-specific hashtags to explore their use by different sets of users with diverse intentions and aims. Twitter data are retrieved by means of real-time data mining tools (Brooker et al. 2016) to create relevant keyword- or hashtag-based sample corpora dealing with emergency situations (Aug-Sep 2017: two terror attacks and a natural disaster) which have caused remarkable media exposure. Data from such corpora identify some relevant words used in such situations, grouped according to several variables such as event-related, channel-dependent or sentiment-based criteria. Furthermore, an aggregated analysis is carried out in order to retrieve the most common patterns used to highlight performativity, thus emphasising the role of purpose-specific communication. Finally, a comparison of aggregated corpora with different tool-specific features (retweets) highlights the importance of such tool-specific devices in magnifying the range of communication effectiveness occurring in a proper 'online discourse community' (Herring 2008).
2013
Disasters such as Hurricane Sandy in 2012 result in extensive social media traffic, using networking platforms such as Twitter, as citizens report on their situations, identify needs and attempt to distribute resources. We address the challenge of finding relevant, actionable tweets from this large volume with an information filtering model. Driven primarily by concern for coordination, the initial domain independent analysis incorporates psycholinguistic theory to filter for potential messages of cooperation. The subsequent domain dependent analysis leverages a lightweight, language-driven, disaster-related domain model to extract resource references (e.g., food, shelter, etc.) in its first phase. Using a lexicon of verbs concerning the transfer of property, combined with simple syntactic frames, a second phase of domain dependent analysis assists in the identification of a particular kind of tacit cooperation, in the declarations of resource needs and availability. The results populate an annotated information repository to support the presentation of organized, actionable information nuggets regarding resource needs and availability at varying levels of abstraction. Computationally grounding the abstractions in raw data enables complex querying ability for who-what-where in coordination. Initial evaluation of the annotations relative to human judgment shows fair to good agreement. In addition to the potential benefits to the formal emergency response community of a filtered and organized corpus, the results serve as a benchmark for evaluating more computationally intensive efforts and characterizing the patterns of language behavior for coordination during a disaster.
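The second domain-dependent phase described above (a verb lexicon combined with simple syntactic frames) might look something like the sketch below. The word lists, the frame definition (a verb followed by a resource term within a few words), and all function names are assumptions made for illustration; the abstract does not give the actual lexicon or frames.

```python
import re

# Hypothetical lexicons: illustrative stand-ins, not the paper's domain model.
RESOURCES = ["food", "water", "shelter", "blankets", "clothes", "generator"]
NEED_VERBS = ["need", "needs", "want", "wants", "require", "requires"]
OFFER_VERBS = ["donate", "donating", "give", "giving",
               "offer", "offering", "provide", "providing"]


def _frame(verbs):
    """Build a simple frame: verb, up to three intervening words, resource."""
    return re.compile(
        r"\b(?P<verb>%s)\b(?:\s+\w+){0,3}?\s+(?P<resource>%s)\b"
        % ("|".join(verbs), "|".join(RESOURCES)),
        re.IGNORECASE,
    )


FRAMES = {"need": _frame(NEED_VERBS), "offer": _frame(OFFER_VERBS)}


def annotate(text):
    """Return (kind, verb, resource) tuples for each frame match in a tweet."""
    annotations = []
    for kind, pattern in FRAMES.items():
        for match in pattern.finditer(text):
            annotations.append(
                (kind, match.group("verb").lower(), match.group("resource").lower())
            )
    return annotations
```

Annotations of this kind could then be stored alongside the raw tweet, which is what makes who-what-where style queries over needs and offers possible.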
Aid is Out There: Looking for Help from Tweets during a Large Scale Disaster
The 2011 Great East Japan Earthquake caused a wide range of problems, and as countermeasures, many aid activities were carried out. Many of these problems and aid activities were reported via Twitter. However, most problem reports and corresponding aid messages were not successfully exchanged between victims and local governments or humanitarian organizations, overwhelmed by the vast amount of information. As a result, victims could not receive necessary aid and humanitarian organizations wasted resources on redundant efforts. In this paper, we propose a method for discovering matches between problem reports and aid messages. Our system contributes to problem-solving in a large scale disaster situation by facilitating communication between victims and humanitarian organizations.
A language-agnostic approach to exact informative tweets during emergency situations
2017 IEEE International Conference on Big Data (Big Data), 2017
In this paper, we propose a machine learning approach to automatically classify non-informative and informative content shared on Twitter during disasters caused by natural hazards. In particular, we leverage previously sampled and labeled datasets of messages posted on Twitter during or in the aftermath of natural disasters. Starting from results obtained in previous studies, we propose a language-agnostic model. We define a base feature set considering only Twitter-specific metadata of each tweet, using classification results from this set as a reference. We introduce an additional feature, called the Source Feature, which is computed considering the device or platform used to post a tweet, and we evaluate its contribution in improving the classifier accuracy. Index Terms: disaster relief; social media analysis; classification; machine learning; real-world traces.
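One way a source-based feature could be derived is sketched below. It assumes the tweet's `source` metadata arrives as an HTML anchor string, as in Twitter's classic REST API payloads; the category mapping (`SOURCE_CATEGORIES`) and the one-hot encoding are hypothetical choices made here, not the Source Feature as defined by the authors.

```python
import re

# Matches the visible text of an anchor such as <a href="...">Twitter for iPhone</a>.
_ANCHOR_TEXT = re.compile(r">([^<]*)<")

# Hypothetical mapping from client names to coarse categories (illustration only).
SOURCE_CATEGORIES = {
    "Twitter for iPhone": "mobile",
    "Twitter for Android": "mobile",
    "Twitter Web Client": "web",
}
CATEGORY_ORDER = ["mobile", "web", "other"]


def source_name(source_field):
    """Extract the client name from the raw source field."""
    match = _ANCHOR_TEXT.search(source_field)
    return (match.group(1) if match else source_field).strip()


def source_category(source_field):
    """Map a raw source field to a coarse category, defaulting to 'other'."""
    return SOURCE_CATEGORIES.get(source_name(source_field), "other")


def source_one_hot(source_field):
    """Encode the category as a one-hot list usable next to other metadata features."""
    category = source_category(source_field)
    return [int(category == name) for name in CATEGORY_ORDER]
```

Since the feature depends only on metadata and not on the tweet text, it stays language-agnostic in the same sense as the base feature set.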
Helping Crisis Responders Find the Informative Needle in the Tweet Haystack
ArXiv, 2018
Crisis responders are increasingly using social media, data and other digital sources of information to build a situational understanding of a crisis in order to design an effective response. However, as the availability of such data increases, so does the challenge of identifying relevant information in it. This paper presents a successful automatic approach to handling this problem. Messages are filtered for informativeness based on a definition of the concept drawn from prior research and crisis response experts. Informative messages are tagged for actionable data -- for example, people in need, threats to rescue efforts, changes in environment, and so on. In all, eight categories of actionability are identified. The two components -- informativeness and actionability classification -- are packaged together as an openly-available tool called Emina (Emergent Informativeness and Actionability).
Extracting Valuable Information from Twitter during Natural Disasters
Social media is a vital source of information during any major event, especially natural disasters. However, the exponential increase in the volume of social media data brings a corresponding increase in conversational data that does not provide valuable information, especially in the context of disaster events, thus diminishing people's ability to find the information that they need in order to organize relief efforts, find help, and potentially save lives. This project focuses on the development of a Bayesian approach to the classification of tweets (posts on Twitter) during Hurricane Sandy in order to distinguish "informational" from "conversational" tweets. We designed an effective set of features and used them as input to Naïve Bayes classifiers. In comparison to a "bag of words" approach, the new feature set provides similar results in the classification of tweets. However, the designed feature set contains only 9 features compared with more than 3000 features for "bag of words." When the feature set is combined with "bag of words", accuracy reaches 85.2914%. If integrated into disaster-related systems, our approach can serve as a boon to any person or organization seeking to extract useful information in the midst of a natural disaster.
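To make the idea of a compact feature set feeding a Naïve Bayes classifier concrete, here is a small self-contained sketch. The four binary features in `extract_features` are hypothetical stand-ins (the abstract mentions 9 features but does not list them), and the Bernoulli variant with add-one smoothing is an assumed choice, not necessarily what the project used.

```python
import math
import re


def extract_features(text):
    """Hypothetical binary features; the project's actual 9 features are not given."""
    return (
        int("http" in text.lower()),        # contains a link
        int("@" in text),                   # mentions another user
        int(bool(re.search(r"\d", text))),  # contains a digit
        int("?" in text),                   # contains a question mark
    )


class BernoulliNaiveBayes:
    """Bernoulli Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, X, y):
        self.classes = sorted(set(y))
        n_features = len(X[0])
        self.log_prior = {}
        self.prob = {}
        for c in self.classes:
            rows = [x for x, label in zip(X, y) if label == c]
            self.log_prior[c] = math.log(len(rows) / len(X))
            # Smoothed estimate of P(feature_j = 1 | class c).
            self.prob[c] = [
                (sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                for j in range(n_features)
            ]
        return self

    def predict_one(self, x):
        def score(c):
            total = self.log_prior[c]
            for xj, pj in zip(x, self.prob[c]):
                total += math.log(pj if xj else 1 - pj)
            return total

        return max(self.classes, key=score)


def train(labeled_tweets):
    """Fit a classifier from (text, label) pairs."""
    X = [extract_features(text) for text, _ in labeled_tweets]
    y = [label for _, label in labeled_tweets]
    return BernoulliNaiveBayes().fit(X, y)
```

A handful of hand-designed features keeps the model small and easy to inspect, which is the trade-off the abstract highlights against a 3000-feature bag-of-words representation.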
In times of mass emergency, vast amounts of data are generated via computer-mediated communication (CMC) that are difficult to manually cull and organize into a coherent picture. Yet valuable information is broadcast, and can provide useful insight into time- and safety-critical situations if captured and analyzed properly and rapidly. We describe an approach for automatically identifying messages communicated via Twitter that contribute to situational awareness, and explain why it is beneficial for those seeking information during mass emergencies.