Hate speech, Censorship, and Freedom of Speech: The Changing Policies of Reddit

Tuning Out Hate Speech on Reddit: Automating Moderation and Detecting Toxicity in the Manosphere

AoIR Selected Papers of Internet Research, 2020

Over the past two years social media platforms have been struggling to moderate at scale. At the same time, they have come under fire for failing to mitigate the risks of perceived ‘toxic’ content or behaviour on their platforms. In an effort to better cope with content moderation, to combat hate speech, ‘dangerous organisations’ and other bad actors present on platforms, discussion has turned to the role that automated machine-learning (ML) tools might play. This paper contributes to thinking about the role and suitability of ML for content moderation on community platforms such as Reddit and Facebook. In particular, it looks at how ML tools operate (or fail to operate) effectively at the intersection between online sentiment within communities and social and platform expectations of acceptable discourse. Through an examination of the r/MGTOW subreddit we problematise current understandings of the notion of ‘toxicity’ as applied to cultural or social sub-communities online and explai...

Measuring and Characterizing Hate Speech on News Websites

12th ACM Conference on Web Science, 2020

The Web has become the main source for news acquisition. At the same time, news discussion has become more social: users can post comments on news articles or discuss news articles on other platforms like Reddit. These features empower and enable discussions among the users; however, they also act as the medium for the dissemination of toxic discourse and hate speech. The research community lacks a general understanding of what types of content attract hateful discourse and of the possible effects of social networks on the commenting activity on news articles. In this work, we perform a large-scale quantitative analysis of 125M comments posted on 412K news articles over the course of 19 months. We analyze the content of the collected articles and their comments using temporal analysis, user-based analysis, and linguistic analysis, to shed light on what elements attract hateful comments on news articles. We also investigate commenting activity when an article is posted on either 4chan's Politically Incorrect board (/pol/) or six selected subreddits. We find statistically significant increases in hateful commenting activity around real-world divisive events like the "Unite the Right" rally in Charlottesville and political events like the second and third 2016 US presidential debates. Also, we find that articles that attract a substantial number of hateful comments have different linguistic characteristics when compared to articles that do not attract hateful comments. Furthermore, we observe that the posting of a news article on either /pol/ or the six subreddits is correlated with an increase in (hateful) commenting activity on that article.

Reddit quarantined: can changing platform affordances reduce hateful material online?

Internet policy review, 2020

This paper studies the efficacy of Reddit's quarantine, increasingly implemented on the platform as a means of restricting and reducing misogynistic and other hateful material. Using the case studies of r/TheRedPill and r/Braincels, the paper argues the quarantine successfully cordoned off affected subreddits and associated hateful material from the rest of the platform. It did not, however, reduce the levels of hateful material within the affected spaces. Instead many users reacted by leaving Reddit for less regulated spaces, with Reddit making this hateful material someone else's problem. The paper argues therefore that the success of the quarantine as a policy response is mixed. Issue 4. This paper is part of Trust in the system, a special issue of Internet Policy Review guest-edited by Péter Mezei and Andreea Verteş-Olteanu. Content moderation is an integral part of the political economy of large social media platforms (Gillespie, 2018). While social media companies position themselves as platforms which offer unlimited potential of free expression (Gillespie, 2010), these same sites have always engaged in some form of content moderation (Marantz, 2019). In recent years, in response to increasing pressure from the public, lawmakers and advertisers, many large social media companies have given up much of their free speech rhetoric and have become more active in regulating abusive, misogynistic, racist and homophobic language on their platforms. This has occurred in particular through banning and restricting users and channels (Marantz, 2019). In 2018 for example, a number of large social media companies banned the high-profile conspiracy theorist Alex Jones and his platform InfoWars from their platforms (Hern, 2018), while in 2019 the web infrastructure company Cloudflare deplatformed the controversial site 8chan (Prince, 2019). 
In 2020 a number of platforms even began regulating material from President Donald Trump, with Twitter placing fact-checks and warnings on some of his tweets and the platform Twitch temporarily suspending his account (Copland and Davis, 2020). As one of the largest digital platforms in the world, Reddit has not been immune from this pressure. Built upon a reputation of being a bastion of free speech (Ohanian, 2013), Reddit has historically resisted censoring its users, despite the prominence of racist, misogynistic, homophobic and explicitly violent material on the platform (for examples,

Hate Speech on Social Media: Content Moderation in Context

Connecticut Law Review, 2020

For all practical purposes, the policy of social media companies to suppress hate speech on their platforms means that the longstanding debate in the United States about whether to limit hate speech in the public square has been resolved in favor of vigorous regulation. Nonetheless, revisiting these debates provides insights essential for developing more empirically-based and narrowly tailored policies regarding online hate. First, a central issue in the hate speech debate is the extent to which hate speech contributes to violence. Those in favor of more robust regulation claim a connection to violence, while others dismiss these arguments as tenuous. The data generated by social media, however, now allow researchers to empirically test whether there are measurable harms resulting from hate speech. These data can assist in formulating evidence-based policies to address the most significant harms of hate speech, while avoiding overbroad regulation. Second, reexamining the U.S. debate about hate speech also reveals the serious missteps of social media policies that prohibit hate speech without regard to context. The policies that social media companies have developed define hate speech solely with respect to the content of the message. As the early advocates of limits on hate speech made clear, the meaning, force, and consequences of speech acts are deeply contextual, and it is impossible to understand the harms of hate speech without reference to political realities and power asymmetries. Regulation that is abstracted from context will inevitably be overbroad. This Article revisits these debates and considers how they map onto the platform law of content moderation, where emerging evidence indicates a correlation between hate speech online, virulent nationalism, and violence against minorities and activists. 
It concludes by advocating specific recommendations to bring greater consideration of context into the speech-regulation policies and procedures of social media companies.

Toxic Speech and Limited Demand for Content Moderation on Social Media

The American political science review, 2024

When is speech on social media toxic enough to warrant content moderation? Platforms impose limits on what can be posted online, but also rely on users' reports of potentially harmful content. Yet we know little about what users consider inadmissible to public discourse and what measures they wish to see implemented. Building on past work, we conceptualize three variants of toxic speech: incivility, intolerance, and violent threats. We present results from two studies with pre-registered randomized experiments (Study 1, N = 5,130; Study 2, N = 3,734) to examine how these variants causally affect users' content moderation preferences. We find that while both the severity of toxicity and the target of the attack matter, the demand for content moderation of toxic speech is limited. We discuss implications for the study of toxicity and content moderation as an emerging area of research in political science with critical implications for platforms, policymakers, and democracy more broadly.

Hiding hate speech: political moderation on Facebook

Media, Culture & Society

Facebook facilitates more extensive dialogue between citizens and politicians. However, communicating via Facebook has also put pressure on political actors to administrate and moderate online debates in order to deal with uncivil comments. Based on a platform analysis of Facebook’s comment moderation functions and interviews with eight political parties’ communication advisors, this study explored how political actors conduct comment moderation. The findings indicate that these actors acknowledge being responsible for moderating debates. Since turning off the comment section is impossible in Facebook, moderators can choose to delete or hide comments, and these arbiters tend to use the latter in order to avoid an escalation of conflicts. The hide function makes comments invisible to participants in the comment section, but the hidden texts remain visible to those who made the comment and their network. Thus, the users are unaware of being moderated. In this paper, we argue that hidi...

Combining CDA and topic modeling: Analyzing discursive connections between Islamophobia and anti-feminism on an online forum

In this article we present an analysis of the discursive connections between Islamophobia and anti-feminism on a large Internet forum. We argue that the incipient shift from traditional media toward user-driven social media brings with it new media dynamics, relocating the (re)production of societal discourses and power structures and thus bringing about new ways in which discursive power is exercised. This clearly motivates the need to critically engage this field. Our research is based on the analysis of a corpus consisting of over 50 million posts, collected from the forum using custom web crawlers. In order to approach this vast material of unstructured text, we suggest a novel methodological synergy combining critical discourse analysis (CDA) and topic modeling – a type of statistical model for the automated categorization of large quantities of texts developed in computer science. By rendering an overview or 'content map' of the corpus, topic modeling provides an enriching complement to CDA, aiding discovery and adding analytical rigor.

Dynamics of online hate and misinformation

Scientific Reports, 2021

Online debates are often characterised by extreme polarisation and heated discussions among users. The presence of hate speech online is becoming increasingly problematic, making necessary the development of appropriate countermeasures. In this work, we perform hate speech detection on a corpus of more than one million comments on YouTube videos through a machine learning model, trained and fine-tuned on a large set of hand-annotated data. Our analysis shows that there is no evidence of the presence of "pure haters", meant as active users posting exclusively hateful comments. Moreover, consistent with the echo chamber hypothesis, we find that users skewed towards one of the two categories of video channels (questionable, reliable) are more prone to use inappropriate, violent, or hateful language within their opponents' community. Interestingly, users loyal to reliable sources use on average more toxic language than their counterparts. Finally, we find that the overall toxicity of the discussion increases with its length, measured both in terms of the number of comments and time. Our results show that, consistent with Godwin's law, online debates tend to degenerate towards increasingly toxic exchanges of views. Public debates on social media platforms are often heated and polarised 1-3. Back in the 90s, Mike Godwin coined a theorem, today known as Godwin's law, stating that "As an online discussion grows longer, the probability of a comparison involving Nazis or Hitler approaches to one". More recently, with the advent of social media, an increasing number of people are reporting exposure to online hate speech 4, leading institutions and online platforms to investigate possible solutions and countermeasures 5. 
To prevent and counter the spread of hate speech online, for example, the European Commission agreed with Facebook, Microsoft, Twitter, YouTube, Instagram, Snapchat, Dailymotion, Jeuxvideo.com, and TikTok on a "Code of conduct on countering illegal hate speech online" 6. In addition to fuelling the toxicity of the online debate, hate speech may have severe offline consequences. Some researchers hypothesised a causal link between online hate and offline violence 7-9. Furthermore, there is empirical evidence that online hate may induce fear of offline repercussions 10. However, detecting and countering hate speech is complicated. There are still ambiguities in the very definition of hate speech, with academic and relevant stakeholders providing their own interpretations 4, including social media companies such as Facebook 11, Twitter 12, and YouTube 13. We use the term "hate speech" to cover the whole spectrum of language used in online debates, from normal, acceptable speech to the extreme of inciting violence. On the extreme end, violent speech covers all forms of expression which spread, incite, promote or justify racial hatred, xenophobia, antisemitism or other forms of hatred based on intolerance, including: intolerance expressed by aggressive nationalism and ethnocentrism, discrimination and hostility against minorities, migrants and people of immigrant origin 14. Less extreme forms of unacceptable speech include inappropriate (e.g., profanity) and offensive language (e.g., dehumanisation, offensive remarks), which is not illegal, but deteriorates public discourse and can lead to a more radicalised society. In this work, we analyse a corpus of more than one million comments on Italian YouTube videos related to COVID-19 to unveil the dynamics and trends of online hate. First, we manually annotate a large corpus of YouTube comments for hate speech, and train and fine-tune a hate speech deep learning model to accurately detect it. 
Then, we apply the model to the entire corpus, aiming to characterise the behaviour of users producing hate, and shed light on the (possible) relationship between the consumption of misinformation and usage of hate and toxic language. The reason for performing hate speech detection on the Italian language is twofold: First, Italy was one of the countries most affected by the COVID-19 pandemic and especially by the early application of non-pharmaceutical interventions (a strict lockdown began on March 9, 2020). By forcing people to stay at home, this event increased internet use and was likely to inflame public debate and foment hate speech against specific targets such as the government and politicians. Second, Italian is a less studied language

Hateful and Other Negative Communication in Online Commenting Environments: Content, Structure and Targets

Acta Informatica Pragensia

Information and communication technologies are increasingly interacting with modern societies. One specific manifestation of this interaction concerns hateful and other negative comments in online environments. Various terms appear to denote this communication, from flaming, indecency and intolerance to hate speech. However, there is still a lack of an umbrella term that broadly captures this communication. Therefore, this paper introduces the concept of socially unacceptable discourse, which serves as the basis for an empirical study that evaluated online comments scraped from the Facebook pages of the three most-visited Slovenian news outlets. Machine-learning algorithms were used to narrow the focus to topics related to refugees and LGBT rights. Ten thousand comments were manually coded to identify and structure socially unacceptable discourse. The results show that about half of all comments belonged to this type of discourse, with a surprisingly stable level and structure across media (i.e., right-wing versus mainstream) and topics. Most of these comments could also be considered a potential violation of hate speech legislation. In the context of these findings, the political and ideological consequences and implications of mediatised emotions are discussed.

Mapping the terrain of hate: identifying and analyzing online communities and political parties engaged in hate speech against Muslims and LGBTQ+ communities

International journal of data science and analytics, 2024

This study investigates the impact of X on political discourse and hate speech in Finland, focusing on Muslim and LGBTQ+ communities from 2018 to 2023. During this period, these groups have experienced increased hate speech and a concerning surge in hate crimes. Utilizing network analysis methods, we identified online communities and examined the interactions between Finnish MPs and these communities. Our investigation centered on uncovering the emergence of networks propagating hate speech, assessing the involvement of political figures, and exploring the formation dynamics of digital communities. Employing agenda-setting theory and methodologies including text classification, topic modeling, network analysis, and correspondence analysis, the research uncovers varied communication patterns in retweet and mention networks. Retweet networks tend to be more fragmented and smaller, with participation primarily from far-right Finns Party MPs, whereas mention networks exhibit wider political representation, including members from all parties. Findings highlight the Finns Party MPs' significant role in fostering divisive, emotionally charged communications within politically segregated retweet communities, contrasting with their broader engagement in mention networks. The study underscores the necessity for cross-party efforts to combat hate speech, promote inclusive dialogue, and mitigate political polarization.