Stories filed under: "case study"
Content Moderation Case Study: YouTube Doubles Down On Questionable 'Graphic Content' Enforcement Before Reversing Course (2020)
from the moderation-road-rage dept
Summary:
YouTube creators have frequently complained about the opaque and frustrating nature of the platform’s appeals process for videos that are restricted or removed for violating its Community Guidelines. Beyond simply removing content, these takedowns can be severely damaging to creators, as they can result in “strikes” against a channel. Strikes incur temporary restrictions on the user’s ability to upload content and use other site features, and enough strikes can ultimately lead to permanent channel suspension.
Creators can appeal these strikes, but many complain that the response to appeals is inconsistent, and that rejections are deemed “final” without providing insight into the decision-making process or any further recourse. One such incident in 2020 involving high-profile creators drew widespread attention online and resulted in a rare apology and reversal of course by YouTube.
On August 24, 2020, YouTube creator MoistCr1TiKaL (aka Charlie White, who also uses the handle penguinz0), who at the time had nearly six million subscribers, posted a video in which he reacted to a viral 2014 clip of a supposed “road rage” incident involving people dressed as popular animated characters. The authenticity of the original video is unverified, and many viewers suspect it was staged for comedic purposes: the supposed “violence” it portrays appears to be fake, and the target of the “attack” appears uninjured. Soon after posting his reaction video, White received a strike for “graphic content with intent to shock” and the video was removed. On September 1, White revealed on Twitter that he had appealed the strike, but the appeal was rejected.
White then posted a video expressing his anger at the situation, and pointed out that another high-profile YouTube creator, Markiplier (aka Mark Fischbach), had posted his own reaction to the same viral video nearly four years earlier but had not received a strike. Fischbach agreed with White and asked YouTube to address the inconsistency. To the surprise of both creators, YouTube responded by issuing a strike to Fischbach’s video as well.
The incident resulted in widespread backlash online, and the proliferation of the #AnswerUsYouTube hashtag on Twitter, with fans of both creators demanding a reversal of the strikes and/or more clarity on how the platform makes these enforcement decisions.
Company considerations:
- If erroneous strikes are inevitable given the volume of content being moderated, what are the necessary elements of an appeals process to ensure creators have adequate recourse and receive satisfactory explanations for final decisions?
- What are the conditions under which off-platform attention to a content moderation decision should result in further manual review and potential reversals outside the normal appeals process?
- How can similar consideration be afforded to creators who face erroneous strikes and rejected appeals, but do not have large audiences who will put off-platform pressure on the company?
Issue considerations:
- How can companies balance the desire to directly respond to controversies involving highly popular creators with the desire to employ consistent, equitable processes for all creators?
- How should platforms harmonize their enforcement decisions when they are alerted to clear contradictions between the decisions on similar pieces of content?
Resolution:
On September 2, a few hours after Fischbach announced his strike and White expressed his shock at that decision, the TeamYouTube Twitter account replied to White and to Fischbach with an apology, stating that it had restored both videos and reversed both strikes, and calling the initial decision “an over-enforcement of our policies.” Both creators expressed their appreciation for the reversal, while also noting that they hope the company makes changes to prevent similar incidents from occurring in the future. Since such reversals by YouTube are quite rare, and apologies even rarer, the story sparked widespread coverage in a variety of outlets.
Originally posted to the Trust and Safety Foundation website.
Filed Under: case study, content moderation, enforcement, road rage
Companies: youtube
Content Moderation Case Study: Twitter Removes 'Verified' Badge In Response To Policy Violations (2017)
from the verified-or-a-stamp-of-approval? dept
Summary: Many social networks allow users to adopt a pseudonym as their identity on the platform. Since users can pick whatever name they want, they can also pretend to be someone else, which creates certain challenges: how does a site that allows pseudonyms identify which account belongs to the actual person and which is merely an impostor? Some companies, such as Facebook, went the route of requiring users to use their real names. Twitter went another way, allowing pseudonyms.
But what can a company do when multiple accounts claim to be the same, often famous, person?
In 2009, Twitter began experimenting with a program to “verify” celebrities.
The initial intent of this program was to identify which Twitter account actually belongs to the person or organization of that Twitter handle (or name). Twitter’s announcement of this feature explains it in straightforward terms:
With this feature, you can easily see which accounts we know are ‘real’ and authentic. That means we’ve been in contact with the person or entity the account is representing and verified that it is approved. (This does not mean we have verified who, exactly, is writing the tweets.)
This also does not mean that accounts without the ‘Verified Account’ badge are fake. The vast majority of accounts on the system are not impersonators, and we don’t have the ability to check 100% of them. For now, we’ve only verified a handful of accounts to help with cases of mistaken identity or impersonation.
From the start, Twitter denoted “verified” accounts with a now industry-standard “blue checkmark.” In the initial announcement, Twitter noted that this was experimental, and the company did not have time to verify everyone who wanted to be verified. It was not until 2016 that Twitter first opened up an application process for anyone to get verified.
A year later, in late 2017, the company closed the application process, noting that people were interpreting “verification” as a stamp of endorsement, which it had not intended. Recognizing this unintended perception, Twitter began removing verification checkmarks from accounts that violated certain policies, starting with high-profile white supremacists.
While this policy received some criticism for “blurring” the line between speakers and speech, it was a recognition that the checkmark was being read as an “endorsement” of people whose views and actions (even those off of Twitter) Twitter did not wish to endorse. In that way, removing verification became a content moderation tool: a subtle form of negative endorsement.
Even though those users were “verified” as authentic, Twitter recognized that being verified was a privilege and that removing it was a tool in the content moderation toolbox. Rather than suspending or terminating accounts, the company said that it would also consider removing the verification on accounts that violated its new hateful conduct and abusive behavior policies.
Company Considerations:
- What is the purpose of a verification system on social media? Should it just be to prove that a person is who they say they are, or should it also signal some kind of endorsement? How should the company develop a verification system to match that purpose?
- If the public views verification as a form of endorsement, how important is it for a company to reflect that in its verification program? Are there any realistic ways to have the program not be considered an endorsement?
- Under what conditions does it make sense to use removal of verification as a content moderation tool? Is removing verification an effective content moderation tool? If not, are there ways to make it more effective?
Issue Considerations:
- What are the consequences of using the verification (and de-verification) process as a content moderation tool to “punish” rule violators?
- What are both the risks and benefits of embracing verification as a form of endorsement?
- Are there other subtle forms of content moderation similar to the removal of privileges like the blue checkmark, and how effective can they be?
Resolution: It took many years until Twitter reopened its verification system, and then it did so only in a very limited manner. The system has already run into problems, as journalists discovered multiple fake accounts that were verified.
However, a larger concern over the new verification rules is that they allow for significant subjective decision-making by the company over how the rules are applied. Activist Albert Fox Cahn explained how the new program is making it “too easy” for journalists to get verified but “too difficult” for activists, showing the challenging nature of any such program.
“When Angela Lang, founder and executive director of the Milwaukee-based civic engagement group BLOC, decided to get a checkmark, she thought, ‘I’ve done enough. Let’s check out how to be verified.’ Despite Lang and BLOC’s nationally recognized work on Black civic engagement, she found herself shut out. When Detroit-based activist and Data 4 Black Lives national organizing director Tawana Petty applied, her request was promptly rejected. Posting on the platform that refused to verify her, Petty said, ‘Unbelievable that creating a popular hashtag would even be a requirement. This process totally misses the point of why so many of us want to be verified.’ Petty told me, ‘I still live with the anxiety that my page might be duplicated and my contacts will be spammed.’ Previously, she was forced to shut down pages on other social media platforms to protect loved ones from this sort of abuse.
“According to anti-racist economist Kim Crayton, verification is important because ‘that blue check automatically means that what you have to say is of value, and without it, particularly if you’re on the front lines, particularly if you’re a Black woman, you’re questioned.’ As Lang says, ‘Having that verification is another way of elevating those voices as trusted messengers.’ According to Virginia Eubanks, an associate professor of political science at the University at Albany, SUNY, and author of Automating Inequality, ‘The blue check isn’t about social affirmation, it’s a safety issue. Someone cloning my account could leave my family or friends vulnerable and could leave potential sources open to manipulation.’” — Albert Fox Cahn
Originally published to the Trust & Safety Foundation website.
Filed Under: case study, content moderation, endorsement, verification badges
Companies: twitter
Content Moderation Case Study: Sensitive Mental Health Information Is Also A Content Moderation Challenge (2020)
from the tricky-questions dept
Summary: Talkspace is a well-known app that connects licensed therapists with clients, usually by text. Like many other services online, it acts as a form of “marketplace” for therapists and those in the market for therapy. While there are ways to connect with those therapists by voice or video, the most common form of interaction is by text messages via the Talkspace app.
A recent NY Times profile detailed many concerns about the platform, including claims that it generated fake reviews, lied about events like the 2016 election leading to an increase in usage, and that there were conflicts between growing usage and providing the best mental health care for customers. It also detailed how Talkspace and similar apps face significant content moderation challenges — some unique to the type of content that the company manages.
Considering that so much of Talkspace’s usage involves text-based communications, there are questions concerning how Talkspace handles and protects that information.
The article also reveals that the company would sometimes review therapy sessions and act on the information learned. While the company claims it only does this to make sure that therapists are doing a good job, the article suggests it is often used for marketing purposes as well.
Karissa Brennan, a New York-based therapist, provided services via Talkspace from 2015 to 2017, including to Mr. Lori. She said that after she provided a client with links to therapy resources outside of Talkspace, a company representative contacted her, saying she should seek to keep her clients inside the app.
“I was like, ‘How do you know I did that?’” Ms. Brennan said. “They said it was private, but it wasn’t.”
The company says this would only happen if an algorithmic review flagged the interaction for some reason (for example, if the therapist recommended medical marijuana to a client). Ms. Brennan says that to the best of her recollection, she had sent a link to an anxiety worksheet.
There was also a claim that researchers at the company would share information gleaned from looking at transcripts with others at the company:
The anonymous data Talkspace collects is not used just for medical advancements; it’s used to better sell Talkspace’s product. Two former employees said the company’s data scientists shared common phrases from clients’ transcripts with the marketing team so that it could better target potential customers.
The company disputes this. “We are a data-focused company, and data science and clinical leadership will from time to time share insights with their colleagues,” Mr. Reilly said. “This can include evaluating critical information that can help us improve best practices.”
He added: “It never has and never will be used for marketing purposes.”
Decisions to be made by Talkspace:
- How should private conversations between clients and therapists be handled? Should those conversations be viewable by employees of Talkspace?
- Will reviews (automated or by human) of these conversations raise significant privacy concerns? Or is it needed to provide quality therapeutic results to clients?
- What kinds of employee access rules and controls need to be put on therapy conversations?
- How should any research by the company be handled?
- What kinds of content need to be reviewed on the platform, and should it be reviewed by humans, technology, or both?
- Should the company even have access to this data at all?
Questions and policy implications to consider:
- What tradeoffs exist between providing more access to therapy in an easier format and the privacy questions raised by storing this information?
- How effective is this form of treatment for clients?
- What kinds of demands does this put on therapists — and does being monitored change (for better or for worse) the kind of support they provide?
- Are current regulatory frameworks concerning mental health information appropriate for app-based therapy sessions?
Resolution: Talkspace insists that it is working hard to provide a better service to clients who are looking to communicate with therapists, and challenges many of the claims made in the article. Talkspace’s founders also wrote a response to the article that, while claiming to “welcome” scrutiny, questioned the competency of the reporter who wrote the NY Times story. They also argued that most of the negative claims in the Times piece came from disgruntled former workers — and that some of it is outdated and no longer accurate.
The company also argued that it is HIPAA/HITECH and SOC 2 approved and has never had a malpractice claim in its network. The company insists that access to the content of transcripts is greatly limited:
To be clear: only the company’s Chief Medical Officer and Chief Technology Officer hold the “keys” to access original transcripts, and they both need to agree to do so. This has happened just a handful of times in the company’s history, typically only when a client points to particular language when reporting a therapist issue that cannot be resolved without seeing the original text. In these rare cases, Talkspace gathers affirmative consent from the client to view that original text: both facts which were made clear to the Times in spoken and written interviews. Only Safe-Harbor de-identified transcripts (a “safe harbor” version of a transcript removes any specific identifiers of the individual and of the individual’s relatives, household members, employers and geographical identifiers etc.) are ever used for research or quality control.
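The “safe harbor” approach described in that statement amounts to stripping a defined set of identifiers from a transcript before anyone uses it for research. The toy sketch below illustrates the idea with a few regular expressions; the patterns and placeholders are assumptions for illustration only, and real HIPAA Safe Harbor de-identification covers 18 categories of identifiers (names, geography, dates, contact details, record numbers, and so on) with far more rigor than this.

```python
# Toy sketch of identifier redaction in the spirit of the "safe harbor"
# de-identification described above. Real HIPAA Safe Harbor removal covers
# 18 identifier categories and is far more rigorous than these regexes.

import re

PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
    "DATE":  re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
}

def redact(transcript: str) -> str:
    """Replace matched identifiers with bracketed placeholders."""
    for label, pattern in PATTERNS.items():
        transcript = pattern.sub(f"[{label}]", transcript)
    return transcript

print(redact("Call me at 555-867-5309 or jane@example.com before 4/12/2021."))
# Call me at [PHONE] or [EMAIL] before [DATE].
```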
Filed Under: case study, content moderation, mental health information
Companies: talkspace
Content Moderation Case Study: Ask.fm Responds After A Teen's Suicide Is Linked To Bullying On The Site (August 2013)
from the difficult-content-moderation-questions dept
Summary: After a UK teen took her own life in response to bullying on the social networking site ask.fm, her father asked both the site and the UK government to take corrective measures to prevent further tragedies. This wasn’t an isolated incident. Reports linked multiple suicides to bullying on the teen-centered site.
Ask.fm’s problems with bullying and other abuse appeared to be far greater than those observed on other social media sites. Part of this appeared to be due to the site’s user base, which was much younger than more-established social media platforms. This — combined with the option to create anonymous accounts — seemed to have made ask.fm a destination for abusive users. What moderation existed before these problems became headline news was apparently ineffective, resulting in a steady stream of horrific stories until the site began to make serious efforts to curb a problem now too big to ignore.
Ask.fm’s immediate response to both the teen’s father and UK Prime Minister David Cameron’s criticism (Cameron called for a boycott of the site) was to point to existing moderation efforts put in place to deter bullying and other terms of service violations.
After major companies pulled their advertising, ask.fm pledged to assist police in investigating the circumstances behind the teen’s suicide, as well as consult with a law firm to see if moderation efforts could be improved. It also hired more moderators and a safety officer, and made its “Report” button more prominent.
More than a year after ask.fm became the target of criticism around the world, the site implemented its first Safety Advisory Board. The group of experts on teens and their internet use was tasked with reducing the amount of bullying on the platform and making it safer for its young users.
More significantly, ask.fm’s founders — who were viewed as unresponsive to criticism — were removed by the site’s new owners, InterActiveCorp (IAC). IAC pledged to work more closely with US law enforcement and safety experts to improve moderation efforts.
Decisions to be made by ask.fm:
- Should anonymous accounts be eliminated (or stop-gapped by gathering IP address/personal info) to limit abusive behavior?
- Does catering to a younger user base create unique problems not found at sites that skew older?
- Would more transparency about moderation efforts/features nudge more users towards reporting abuse?
- Should the site directly intervene when moderators notice unhealthy/unwanted user interactions?
Questions and policy implications to consider:
- Given the international reaction to the teen’s suicide, does a minimal immediate response make the perceived problem worse?
- Does having a teen user base increase the risk of direct regulation or unfavorable legislation, given the increased privacy protections for minors in many countries?
- Are moderation efforts resulting from user reports vetted periodically to ensure the company isn’t making bullying/trolling problems worse by allowing abusive users to get others suspended or banned?
Resolution: When immediate steps did little to deter criticism, ask.fm formed a Safety Committee and, ultimately, dismissed founders that appeared to be unresponsive to users’ concerns. The site made changes to its moderation strategies, hired more moderators, and made users more aware of the features they could use to report users and avoid unwanted interactions.
Filed Under: bullying, case study, content moderation, suicide
Companies: ask.fm
Content Moderation Case Study: Detecting Sarcasm Is Not Easy (2018)
from the kill-me-now dept
Summary: Content moderation becomes even more difficult when words or phrases carry additional meaning beyond their most literal reading. One very clear example is sarcasm, in which a word or phrase is used either to mean the opposite of its literal meaning or as a greatly exaggerated way to express humor.
In March of 2018, facing increasing criticism regarding certain content that was appearing on Twitter, the company did a mass purge of accounts, including many popular accounts that were accused of simply copying and retweeting jokes and memes that others had created. Part of the accusation against those that were shut down was that there was a network of accounts (referred to as “Tweetdeckers” for their use of the Twitter application Tweetdeck) who would agree to mass retweet some of those jokes and memes. Twitter suggested that these retweet brigades were inauthentic and thus banned from the platform.
In the midst of all of these suspensions, however, there was another set of accounts and content suspended, allegedly for talking about “self-harm.” Twitter has policies against glorifying self-harm, which it had just updated a few weeks before this new round of bans.
However, in trying to apply those policies, Twitter took down a number of tweets in which people sarcastically used the phrase “kill me.” This included suddenly suspending accounts even though many of the offending tweets were years old. It appeared that Twitter may have simply run a search on “kill me” or other similar words and phrases, including “kill myself,” “cut myself,” “hang myself,” “suicide,” or “I wanna die.”
While some of these may indicate intentions of self-harm, in many other cases they were clearly sarcastic or just people saying odd things, and yet Twitter temporarily suspended many of those accounts and asked the users to delete the tweets. In at least some cases, the messages from Twitter did include encouraging words, such as “Please know that there are people out there who care about you, and you are not alone,” language that suggests a response specifically aimed at concerns about self-harm. But that language did not appear in all of the messages.
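To illustrate why that kind of phrase matching sweeps up sarcasm, here is a minimal sketch of a hypothetical keyword flagger of the sort described above. It is an assumption for illustration only, not Twitter's actual system, and it shows that a joking "kill me" and a genuinely worrying message look identical to a literal phrase match.

```python
# Minimal sketch of a phrase-list flagger, assuming the kind of keyword
# search described above. NOT Twitter's actual system; it only illustrates
# why such matching catches sarcasm and years-old jokes.

SELF_HARM_PHRASES = [
    "kill me", "kill myself", "cut myself", "hang myself",
    "suicide", "i wanna die",
]

def flag_tweet(text: str) -> bool:
    """Return True if any listed phrase appears anywhere in the tweet."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in SELF_HARM_PHRASES)

# A sarcastic tweet and a genuinely concerning one are indistinguishable
# to this check: both are flagged.
print(flag_tweet("ugh, Mondays. kill me"))                 # True (sarcasm)
print(flag_tweet("i don't want to be here, i wanna die"))  # True (possible risk)
```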
Decisions to be made by Twitter:
- How do you handle situations where users indicate they may engage in self-harm?
- Should such content be removed or are there other approaches?
- How do you distinguish between sarcastic phrases and real threats of self-harm?
- What is the best way to track and monitor claims of self-harm? Does a keyword or key phrase list search help?
- Does automated tracking of self-harm messages work? Or is it better to rely on user reports?
- Does it change if the supposed messages regarding self-harm are years old?
Questions and policy implications to consider:
- Is suspending people for self-harm likely to prevent the harm? Or does it just hide useful information from friends, family, and officials who might help?
- Detecting sarcasm creates many challenges; should internet platforms be the arbiters of what counts as reasonable sarcasm? Or must they take all content literally?
- Automated solutions to detect things like self-harm may cover a wider corpus of material, but are also more likely to misunderstand context. How should these issues be balanced?
Resolution: This continues to be a challenge for various platforms, including Twitter. The company has continued to tweak its policies regarding self-harm over the years, including partnering with suicide prevention groups in various locations to help those who indicate that they are considering self-harm.
Filed Under: case study, content moderation, kill me, sarcasm
Companies: twitter
Content Moderation Case Study: Facebook Responds To A Live-streamed Mass Shooting (March 2019)
from the live-content-moderation dept
Summary: On March 15, 2019, the unimaginable happened. A Facebook user — utilizing the platform’s live-streaming option — filmed himself shooting mosque attendees in Christchurch, New Zealand.
By the end of the shooting, the shooter had killed 51 people and injured 49. Only the first shooting was live-streamed, but Facebook was unable to end the stream before it had been viewed by a few hundred users and shared by a few thousand more.
The stream was removed by Facebook almost an hour after it appeared, thanks to user reports. The moderation team began working immediately to find and delete re-uploads by other users. Violent content is generally a clear violation of Facebook’s terms of service, but context does matter. Not every video of violent content merits removal, but Facebook felt this one did.
The delay in response was partly due to limitations in Facebook’s automated moderation efforts. As Facebook admitted roughly a month after the shooting, the shooter’s use of a head-mounted camera made it much more difficult for its AI to make a judgment call on the content of the footage.
Facebook’s efforts to keep this footage off the platform continue to this day. The footage has migrated to other platforms and file-sharing sites — an inevitability in the digital age. Even with moderators knowing exactly what they’re looking for, platform users are still finding ways to post the shooter’s video to Facebook. Some of this is due to the sheer number of uploads moderators are dealing with. The Verge reported the video was re-uploaded 1.5 million times in the 48 hours following the shooting, with 1.2 million of those automatically blocked by moderation AI.
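A common building block for that kind of automated re-upload blocking is media fingerprinting: record a fingerprint of every confirmed copy of the video and check new uploads against the list. The sketch below is a simplified, hypothetical illustration (not Facebook's actual pipeline) using exact cryptographic hashes; its last line shows why exact matching alone fails once a file is re-encoded or trimmed, which is part of what pushes platforms toward perceptual hashing and AI classifiers.

```python
# Minimal sketch of exact-hash matching against a blocklist of known
# violating videos. Hypothetical illustration only: real systems rely on
# perceptual hashes and classifiers precisely because re-encoding or
# cropping a file changes its cryptographic hash.

import hashlib

def sha256_of(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

BLOCKED_HASHES = set()

def register_violation(video_bytes: bytes) -> None:
    """Add a confirmed violating video's fingerprint to the blocklist."""
    BLOCKED_HASHES.add(sha256_of(video_bytes))

def is_blocked(upload_bytes: bytes) -> bool:
    """Exact-match check: only byte-identical re-uploads are caught."""
    return sha256_of(upload_bytes) in BLOCKED_HASHES

original = b"...original video bytes..."
register_violation(original)
print(is_blocked(original))             # True: identical re-upload blocked
print(is_blocked(original + b"\x00"))   # False: any modification slips through
```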
Decisions to be made by Facebook:
- Should the moderation of live-streamed content involve more humans if algorithms aren’t up to the task?
- When live-streamed content is reported by users, are automated steps in place to reduce visibility or sharing until a determination can be made on deletion?
- Will making AI moderation of livestreams more aggressive result in over-blocking and unhappy users?
- Do the risks of allowing content that can’t be moderated prior to posting outweigh the benefits Facebook gains from giving users this option?
- Is it realistic to “draft” Facebook users into the moderation effort by giving certain users additional moderation powers to deploy against marginal content?
Questions and policy implications to consider:
- Given the number of local laws Facebook attempts to abide by, is allowing questionable content to stay “live” still an option?
- Does newsworthiness outweigh local legal demands (laws, takedown requests) when making judgment calls on deletion?
- Does the identity of the perpetrator of violent acts change the moderation calculus (for instance, a police officer shooting a citizen, rather than a member of the public shooting other people)?
- Can Facebook realistically speed up moderation efforts without sacrificing the ability to make nuanced calls on content?
Resolution: Facebook reacted quickly to user reports and terminated the livestream and the user’s account. It then began the never-ending work of taking down uploads of the recording by other users. It also changed its rules governing livestreams in hopes of deterring future incidents. The new guidelines provide for temporary and permanent bans of users who livestream content that violates Facebook’s terms of service, as well as prevent these accounts from buying ads. The company also continues to invest in improving its automated moderation efforts in hopes of preventing streams like this from appearing on users’ timelines.
Filed Under: case study, christchurch, content moderation, live streaming, new zealand, shooting
Companies: facebook
Content Moderation Case Study: Twitter Acts To Remove Accounts For Violating The Terms Of Service By Buying/Selling Engagement (March 2018)
from the fake-followers dept
Summary: After an investigation by BuzzFeed uncovered several accounts trafficking in paid access to “decks” — Tweetdeck accounts from which buyers could mass-retweet their own tweets to make them go “viral” — Twitter acted to shut down the abusive accounts.
Most of the accounts were run by teens who leveraged the tools provided by Twitter-owned Tweetdeck to provide mass exposure to tweets for paying customers. Until Twitter acted, users who saw their tweets go viral under other users’ names tried to police the problem by naming paid accounts and putting them on blocklists.
Twitter’s Rules expressly forbid users from “artificially inflating account interactions.” But most accounts were apparently removed under Twitter’s anti-spam policy — one it beefed up after BuzzFeed published its investigation. The biggest change was the removal of the ability to simultaneously retweet tweets from several different accounts, rendering these “decks” built by “Tweetdeckers” mostly useless. Tweetdeckers responded by taking a manual approach to faux virality, sending direct messages requesting mutual retweets of posted content.
Unlike other corrective actions taken by Twitter in response to mass abuse, this cleanup process appears to have resulted in almost no collateral damage. Some users complained their follower counts had dropped, but this was likely the result of near-simultaneous moderation efforts targeting bot accounts.
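The behavior being policed here, many accounts retweeting the same tweets at nearly the same moment, leaves a fairly distinctive footprint. The sketch below is a hypothetical illustration of one way to surface it, counting account pairs that repeatedly co-retweet within a short window; it reflects my own assumption about the kind of signal involved, not Twitter's disclosed anti-spam logic.

```python
# Hypothetical sketch of coordinated-retweet detection: count how often two
# accounts retweet the same tweet within a short window. Not Twitter's actual
# anti-spam system; just an illustration of the underlying signal.

from collections import defaultdict
from itertools import combinations

# (account, tweet_id, unix_timestamp) retweet events
events = [
    ("deck_a", 101, 1000), ("deck_b", 101, 1002), ("deck_c", 101, 1003),
    ("deck_a", 102, 2000), ("deck_b", 102, 2001),
    ("normal_user", 101, 5000),
]

WINDOW_SECONDS = 10
co_retweets = defaultdict(int)

by_tweet = defaultdict(list)
for account, tweet_id, ts in events:
    by_tweet[tweet_id].append((account, ts))

for tweet_id, retweeters in by_tweet.items():
    for (a, ta), (b, tb) in combinations(retweeters, 2):
        if abs(ta - tb) <= WINDOW_SECONDS:
            co_retweets[tuple(sorted((a, b)))] += 1

# Pairs that repeatedly co-retweet inside the window look coordinated.
suspicious = {pair: n for pair, n in co_retweets.items() if n >= 2}
print(suspicious)   # {('deck_a', 'deck_b'): 2}
```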
Decisions to be made by Twitter:
- Do additional moderation efforts — AI or otherwise — need to be deployed to detect abuse of Twitter Rules?
- How often do these efforts mistakenly target legitimately “viral” content?
- Will altering Tweetdeck features harm users who aren’t engaged in the buying and selling of “engagement”?
- Will power users or those seeking to abuse the rules move to other third-party offerings to avoid moderation efforts?
- Is there any way to neutralize “retweet for retweet” requests in direct messages without raising concerns about user privacy?
Questions and policy implications to consider:
- Does targeting spam more aggressively risk alienating advertisers who rely on repetitive/scheduled posts and active user engagement?
- Does spam (in whatever form — including the manufactured virality seen above) still provide some value for Twitter as a company, considering it relies on active users and engagement to secure funding and/or sell ad space to companies?
- Do viral posts still add value for Twitter users, even if the source of the virality is illegitimate?
- Will increased moderation of spam reduce user engagement during events where advertising efforts and user engagement are routinely expected to increase (elections, sporting events, etc.)?
Resolution: Twitter moved quickly to disable and delete accounts linked to the marketing of user engagement. It chose to use its anti-spam rules as justification for account removals, even though users were allegedly engaged in other violations of the terms of service. The buying and selling of Twitter followers — along with retweets and likes — continues to be a problem, but Twitter clearly has a toolset in place that is effective against the behavior seen here. Due to the reliance on spam rules, the alterations to Tweetdeck — a favorite of Twitter power users — appear to have done minimal damage to legitimate users who enjoy the advantages of this expanded product.
Filed Under: buying engagement, case study, content moderation, fake followers, tweetdeck
Companies: twitter
Content Moderation Case Study: Social Media Services Respond When Recordings Of Shooting Are Uploaded By The Person Committing The Crimes (August 2015)
from the real-time-decision-making dept
Summary: The ability to instantly upload recordings and stream live video has made content moderation much more difficult. Uploads to YouTube have surpassed 500 hours of content every minute (as of May 2019), making it impossible for any form of moderation to catch everything.
The same goes for Twitter and Facebook. Facebook’s user base exceeds two billion worldwide. Over 500 million tweets are posted to Twitter every day (as of May 2020). Algorithms and human moderators are incapable of catching everything that violates terms of service.
When the unthinkable happened on August 26, 2015, these two social media services responded swiftly. But even their swift efforts weren’t enough. The videos posted by Vester Lee Flanagan, a disgruntled former employee of CBS affiliate WDBJ in Virginia, showed him tracking down a WDBJ journalist and cameraman and shooting them both.
Both platforms removed the videos and deactivated Flanagan’s accounts. Twitter’s response took only minutes. But the spread of the videos had already begun, leaving moderators to try to track down duplicates before they could be seen and duplicated yet again. Many of these ended up on YouTube, where moderation efforts to contain the spread still left several reuploads intact. This was enough to instigate an FTC complaint against Google, filed by the father of the journalist killed by Flanagan. Google responded by stating it was still removing every copy of the videos it could locate, using a combination of AI and human moderation.
Users of Facebook and Twitter raised a novel complaint in the wake of the shooting, demanding “autoplay” be opt-in — rather than the default setting — to prevent them from inadvertently viewing disturbing content.
Moderating content as it is created continues to pose challenges for Facebook, Twitter, and YouTube — all of which allow live-streaming.
Decisions to be made by social media platforms:
- What efforts are being put in place to better handle moderation of streaming content?
- What efforts — AI or otherwise — are being deployed to potentially prevent the streaming of criminal acts? Which ones should we adopt?
- Once notified of objectionable content, how quickly should we respond?
- Are there different types of content that require different procedures for responding rapidly?
- What is the internal process for making moderation decisions on breaking news over streaming?
- While the benefits of auto-playing content are clear for social media platforms, is making this the default option a responsible decision — not just for potentially-objectionable content but for users who may be using limited mobile data?
Questions and policy implications to consider:
- Given increasing Congressional pressure to moderate content (and similar pressure from other governments around the world), are platforms willing to “over-block” content to demonstrate their compliance with these competing demands? If so, will users seek out other services if their content is mistakenly blocked or deleted?
- If objectionable content is the source for additional news reporting or is of public interest (like depictions of violence against protesters, etc.), do these concerns override moderation decisions based on terms of service agreements?
- Does the immediate removal of criminal evidence from public view hamper criminal investigations?
- Are all criminal acts of violence considered violations of content guidelines? What if the crime is being committed by government agents or law enforcement officers? What if the video is of a criminal act being performed by someone other than the person filming it?
Resolution: All three platforms have made efforts to engage in faster, more accurate moderation of content. Live-streaming presents new challenges for all three platforms, which are being met with varying degrees of success. These three platforms are dealing with millions of uploads every day, ensuring objectionable content will still slip through and be seen by hundreds, if not thousands, of users before it can be targeted and taken down.
Content like this is a clear violation of terms of service agreements, making removal — once notified and located — straightforward. But being able to “see” it before dozens of users do remains a challenge.
Filed Under: case study, content moderation, live video, moderation, streaming, video streaming
Content Moderation Case Study: Facebook Nudity Filter Blocks Historical Content And News Reports About The Error (June 2020)
from the content-moderation-is-hard dept
Summary: Though social media networks take a wide variety of evolving approaches to their content policies, most have long maintained relatively broad bans on nudity and sexual content, and have heavily employed automated takedown systems to enforce these bans. Many controversies have arisen from this, leading some networks to adopt exceptions in recent years: Facebook now allows images of breastfeeding, child-birth, post-mastectomy scars, and post-gender-reassignment surgery photos, while Facebook-owned Instagram is still developing its exception for nudity in artistic works. However, even with exceptions in place, the heavy reliance on imperfect automated filters can obstruct political and social conversations, and block the sharing of relevant news reports.
One such instance occurred on June 11, 2020, following controversial comments by Australian Prime Minister Scott Morrison, who stated in a radio interview that “there was no slavery in Australia”. This sparked widespread condemnation and rebuttals from both the public and the press, pointing to the long history of enslavement of Australian Aboriginals and Pacific Islanders in the country. One Australian Facebook user posted a late 19th century photo from the State Library of Western Australia, depicting Aboriginal men chained together by their necks, along with a statement:
Kidnapped, ripped from the arms of their loved ones and forced into back-breaking labour: The brutal reality of life as a Kanaka worker – but Scott Morrison claims “there was no slavery in Australia”
Facebook removed the post and image for violating its policy against nudity, although no genitals are visible, and restricted the user’s account. The Guardian Australia contacted Facebook to determine if this decision was made in error and, the following day, Facebook restored the post and apologized to the user, explaining that it was an erroneous takedown caused by a false positive in the automated nudity filter. However, at the same time, Facebook continued to block posts that included The Guardian’s news story about the incident, which featured the same photo, and placed 30-day suspensions on some users who attempted to share it. Facebook’s community standards report shows that in the first three months of 2020, 39.5 million pieces of content were removed for nudity or sexual activity, over 99% of those takedowns were automated, 2.5 million appeals were filed, and 613,000 of the takedowns were reversed.
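Those reported figures are worth translating into rates. The snippet below is just back-of-the-envelope arithmetic on the numbers quoted above; it shows that only a small fraction of automated removals are appealed at all, while roughly a quarter of the appeals that are filed succeed.

```python
# Rough rates derived from Facebook's reported Q1 2020 figures quoted above.
removals = 39_500_000   # pieces removed for nudity/sexual activity
appeals = 2_500_000     # appeals filed
reversals = 613_000     # takedowns reversed

print(f"Appealed: {appeals / removals:.1%} of removals")    # ~6.3%
print(f"Reversed: {reversals / appeals:.1%} of appeals")    # ~24.5%
print(f"Reversed: {reversals / removals:.1%} of removals")  # ~1.6%
```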
Decisions to be made by Facebook:
- Can nudity filters be improved to result in fewer false-positives, and/or is more human review required?
- For appeals of automated takedowns, what is an adequate review and response time?
- Should automated nudity filters be applied to the sharing of content from major journalistic sources such as The Guardian?
- Should questions about content takedowns from major news organizations be prioritized over those from regular users?
- Should 30-day suspensions and similar account restrictions be manually reviewed only if the user files an appeal?
Questions and policy implications to consider:
- Should automated filter systems be able to trigger account suspensions and restrictions without human review?
- Should content that has been restored in one instance be exempted from takedown, or flagged for automatic review, when it is shared again in the future in different contexts?
- How quickly can erroneous takedowns be reviewed and reversed, and is this sufficient when dealing with current, rapidly-developing political conversations?
- Should nudity policies include exemptions for historical material, even when such material does include visible genitals, such as occurred in a related 2016 controversy over a Vietnam War photo?
- Should these policies take into account the source of the content?
- Should these policies take into account the associated messaging?
Resolution: Facebook’s restoration of the original post was undermined by its simultaneous blocking of The Guardian’s news reporting on the issue. After receiving dozens of reports from its readers that they were blocked from sharing the article and in some cases suspended for trying, The Guardian reached out to Facebook again and, by Monday, June 15, 2020, users were able to share the article without restriction. The difference in response times between the original incident and the blocking of posts is possibly attributable to the fact that the latter came to the fore on a weekend, but this meant that critical reporting on an unfolding political issue was blocked for several days while the subject was being widely discussed online.
Photo credit (first photo): State Library of Western Australia
Filed Under: case study, consistency, content moderation, historical content, nudity, reporting
Companies: facebook
Content Moderation Case Study: Talking About Racism On Social Media (2019)
from the what's-racist,-and-what's-a-discussion dept
Summary: With social media platforms taking a more aggressive stance regarding racist, abusive, and hateful language, those efforts sometimes end up blocking conversations about race and racism itself. The heightened risk Black users face of having posts removed or accounts suspended has been referred to as “Facebooking while Black.”
As covered in USA Today, the situations can become complicated quickly:
A post from poet Shawn William caught [Carolyn Wysinger’s] eye. “On the day that Trayvon would’ve turned 24, Liam Neeson is going on national talk shows trying to convince the world that he is not a racist.” While promoting a revenge movie, the Hollywood actor confessed that decades earlier, after a female friend told him she’d been raped by a black man she could not identify, he’d roamed the streets hunting for black men to harm.
For Wysinger, an activist whose podcast The C-Dubb Show frequently explores anti-black racism, the troubling episode recalled the nation’s dark history of lynching, when charges of sexual violence against a white woman were used to justify mob murders of black men.
“White men are so fragile,” she fired off, sharing William’s post with her friends, “and the mere presence of a black person challenges every single thing in them.”
Facebook quickly deleted this post, claiming that it violated the site’s “hate speech” policies. She was also warned that attempting to repost the content would lead to her being banned for 72 hours.
Facebook’s rules state that an attack on a “protected characteristic” — such as race, gender, sexuality or religion — violates its “hate speech” policies. In this case, Wysinger’s post was treated as speech targeting a group based on a protected characteristic (here, “white men”) and was thus flagged for deletion.
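A deliberately context-blind version of that rule is easy to sketch. The snippet below is a hypothetical illustration only, not Facebook's actual classifier (which combines machine learning models and human review); it simply shows that a rule keyed to protected characteristics alone has no notion of speaker, history, or whether a post is itself commentary on racism.

```python
# Hypothetical, deliberately simplified sketch of a context-blind
# "protected characteristic" rule. Not Facebook's actual system; it only
# illustrates why a post like Wysinger's gets flagged.

PROTECTED_GROUP_TERMS = {"white men", "black women", "muslims", "immigrants"}
ATTACK_PHRASES = {"are so fragile", "are inferior", "are all criminals"}

def flag_for_hate_speech(post: str) -> bool:
    """Flag any post pairing a protected-group term with an attack phrase,
    regardless of speaker, intent, or surrounding conversation."""
    text = post.lower()
    return (any(group in text for group in PROTECTED_GROUP_TERMS)
            and any(phrase in text for phrase in ATTACK_PHRASES))

print(flag_for_hate_speech(
    "White men are so fragile, and the mere presence of a black person "
    "challenges every single thing in them."))   # True: flagged
```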
Questions to consider:
- How should a site handle sensitive conversations regarding discrimination?
- If a policy defines “protected characteristics,” are all groups defined by one of those characteristics to be treated equally?
- If so, is that in itself a form of disparate treatment for historically oppressed groups?
- If not, does that risk accusations of bias?
- Is there any way to take wider context into account during human or technological reviews?
- Should the race/gender/sexuality/religion of the speaker be taken into account? What about the target of the speech?
- Is there a way to determine if a comment is “speaking up” to power or “speaking down” from a position of power?
Resolution: In the case described above, Wysinger chose not to risk losing her Facebook access for any amount of time and simply did not repost the statement about Liam Neeson. Facebook, for its part, has continued to adapt and adjust its policies. It streamlined its “appeals” process to try to deal with many of these kinds of cases, and has announced (after two years of planning and discussion) the first members of its Facebook Oversight Board, an independent body that will be tasked with reviewing particularly tricky content takedown cases on the platform.
Filed Under: case study, content moderation, content moderation case study, racism