Collecting Facebook posts and WhatsApp chats. Corpus compilation of private social media messages

Text, Speech and Dialogue, 2016

Abstract

This paper describes the compilation of a social media corpus of Facebook posts and WhatsApp chats. Authentic messages were voluntarily donated by Dutch youths between 12 and 23 years old. Social media nowadays constitute a fundamental part of youths’ private lives, constantly connecting them to friends and family via computer-mediated communication (CMC). The social networking site Facebook and the mobile phone chat application WhatsApp are currently quite popular in the Netherlands. Several relevant issues concerning corpus compilation are discussed, including website creation, promotion, metadata collection, and intellectual property rights and ethical approval. The application that was created for scraping Facebook posts from users’ timelines, with their consent, can serve as an example for future data collection. The Facebook and WhatsApp messages were collected for a sociolinguistic study of Dutch youths’ written CMC, of which a preliminary analysis is presented, but they also constitute a valuable data source for further research.

