github – Techdirt
As Free Speech Enthusiast Elon Plans To Release Twitter’s Source Code, Twitter Desperately Seeking Identity Of FreeSpeechEnthusiast Who Leaked Twitter Source Code
from the troll-speech-enthusiast dept
Ever since Elon Musk made his initial bid to buy Twitter, he’s talked about “open sourcing” the algorithm. He mentioned it last April in the first interview he gave, on the TED stage, to talk about his plans with Twitter. And since taking over the company at the end of October, he’s mentioned it over and over again.
Indeed, on February 21st, he promised that Twitter would release its “algorithm” as open source code “next week.”
Elon Musk (Feb 21): “Say what you want about me, but I acquired the world’s largest non-profit for $44B lol”
Derek Smart (replying to @elonmusk): “Right. Now open source it, then we’ll be truly impressed.”
Elon Musk (replying to @dsmart): “Prepare to be disappointed at first when our algorithm is made open source next week, but it will improve rapidly!”
And then, two weeks ago, he announced that “all code used to recommend tweets” will be released as open source on March 31st (i.e., this Friday).
Elon Musk (Mar 17): “Twitter will open source all code used to recommend tweets on March 31st. Our ‘algorithm’ is overly complex & not fully understood internally. People will discover many silly things, but we’ll patch issues as soon as they’re found! We’re developing a simplified approach to serve more compelling tweets, but it’s still a work in progress. That’ll also be open source. Providing code transparency will be incredibly embarrassing at first, but it should lead to rapid improvement in recommendation quality. Most importantly, we hope to earn your trust.”
Who knows if he’ll meet his deadline this time (he has a habit of missing them).
However, over the weekend something vaguely interesting happened: it was revealed that someone had already, um, “open sourced” at least some of Twitter’s source code by posting a repository of it to GitHub. This came to light via a DMCA notice that Twitter sent to GitHub, followed by a DMCA subpoena demanding the identity of the person who posted it, along with anyone who downloaded it.
Now, I initially wasn’t going to write about this. Leaks happen, and I think it’s perfectly fine for Twitter to issue the DMCA takedown for such a leak. But what caught my attention was the username of the leaker. According to the DMCA notice, the leaker went by “FreeSpeechEnthusiast,” and their account is (at the moment) still up on GitHub, showing a single contribution on January 3rd (which makes me wonder if the code was sitting there for anyone to find for a whole month and a half).
That name choice takes this from a garden-variety leak to an ultimate troll attempt against admitted troll Elon Musk. After all, Musk himself continually (if ridiculously) refers to himself as a “free speech absolutist.”
So, given both Elon’s repeated promises to reveal the source code and his publicly stated (if often violated) commitment to “free speech,” the leak of the source code by someone using the name FreeSpeechEnthusiast seems designed as a direct troll of Musk, goading him into exposing his own hypocrisy (which is way easier than many people may have thought).
Well played, FreeSpeechEnthusiast, well played.
As for the actual leak, again, it’s not clear how much source code was actually leaked or how problematic it is. As I understand it (and would expect), the full source code for Twitter is cumbersome and complex. Releasing a full dump of it would be difficult even if authorized, so I’m guessing it’s not everything.
And while you can find lots of quotes from “cybersecurity experts” about how this may expose vulnerabilities, my guess is that the risk of that is actually fairly low at first? Given enough time, yes, someone can probably find some messy code and some vulnerabilities, but Twitter had (at one time) lots of engineers who were focused on finding and patching those vulnerabilities themselves, and so whatever remains is likely nothing obvious, and anyone going through the code now would first have to figure out how it all worked, which may be no easy task in the first place.
Indeed, this is why, from the beginning, I’ve said that Elon’s promises to open source the code were mostly meaningless: there are almost no examples of companies taking large, complex proprietary codebases, open sourcing them, and having anything valuable come of it, because there’s so much baggage and complexity that even figuring out what the hell anything really does is a struggle.
This is also why Musk’s announced plan to fix things that people find in the code he still promises to release this week seems a bit silly, as there’s a reasonable interpretation of it as: “we fired everyone who understands our code, so we’re going to open it up to get engineers to clean up our code for free for the world’s richest man.”
It’s also why the better approach would have been to improve the API and allow more developers to build more tools, services, and features on top of Twitter, but Elon’s already killed off that whole idea.
In the end, this particular story isn’t likely to be that big a deal, but it seemed worth commenting on solely for the lulz of the epic trolling job whoever leaked the code did in highlighting Musk’s hypocrisy. Again.
Filed Under: copyright, dmca, elon musk, free speech, freespeechenthusiast, leak, open source, release, source code, subpoena, troll
Companies: github, twitter
If GitHub Copilot Is A Copyright Problem, Perhaps The Problem Is Copyright
from the ai-did-not-write-this-article dept
Last week a new GitHub Copilot investigation website created by Matthew Butterick brought the conversation about GitHub’s Copilot project back to the front of mind for many people, myself included. Copilot, a tool trained on public code that is designed to auto-suggest code to programmers, has been greeted by excitement, curiosity, skepticism, and concern since it was announced.
The GitHub Copilot investigation site’s arguments build on previous work by Butterick, as well as thoughtful analysis by Bradley M. Kuhn at the Software Freedom Conservancy. I find the arguments contained in these pieces convincing in some places and not as convincing in others, so I’m writing this post in the hopes that it helps me begin to sort it all out.
At this point, Copilot strikes me as a tool that replaces googling for stack overflow answers. That seems like something that could be useful. It also seems plausible that training such a tool on public software repositories (including open source repositories) could be allowed under US copyright law. That may change if or when Copilot evolves, which makes this discussion a fruitful one to be having right now.
Both Butterick and Kuhn combine legal and social/cultural arguments in their pieces. This blog post starts with the social/cultural arguments because they are more interesting right now, and may impact the legal analysis as facts evolve in the future. Butterick and Kuhn make related arguments, so I’ll do my best to be clear which specific version of a point I’m engaging with at any given time. As will probably become clear, I generally find Kuhn’s approach and framing more insightful (which isn’t to say that Butterick’s lacks insight!).
What is Copilot, Really?
A large part of this discussion seems to turn on the best way to think about and analogize what Copilot is doing (the actual Copilot page does a pretty good job of illustrating how one might use it).
Butterick seems to think that the correct way to think about Copilot is as a search engine that points users to a specific part of a specific (often open source) software package. In his words, it is “a convenient alternative interface to a large corpus of open-source code”. He worries that this “selfish interface to open-source software” is built around “**just give me what I want!**” (emphasis his).
The selfish approach may deliver users to what they think they want, but in doing so hides the community that exists around the software and removes critical information that the code is licensed under an open source license that comes with obligations. If I understand the argument correctly, over time this act of hiding the community will drain open source software of its vitality. That makes Copilot a threat to open source software as a sustainable concept.
But…
The concern about hiding open source software’s community resonates with me. At the same time, Butterick’s starting point strikes me as off, at least in terms of how I search for answers to coding questions.
This is probably a good place to pause and note that I am a Very Bad coder who, nonetheless, does create some code that tends to be openly licensed and is just about always built on other open source code. However, I have nowhere near the skills required to make a meaningful technical contribution to someone else’s code.
Today, my “convenient alternative interface” to finding answers when I need to solve coding problems is google. When I run into a coding problem, I either describe what I am trying to do or just paste the error message I’m getting into google. If I’m lucky, google will then point me to stack overflow, or a blog post, or documentation pages, or something similar. I don’t think that I have ever answered a coding question by ending up in a specific portion of open source code in a public repo. If I did, it seems unlikely that code – even if it had great comments – would get me where I was going on its own, because I would not have the context required to quickly understand that it answered my question.
This distinction between “take me to part of open source code” (Butterick’s view) and “help me do this one thing” (my view) is important because when I look at the Copilot website, it feels like Copilot is currently marketed as a potentially useful stack overflow commenter, not someone with an encyclopedic knowledge of where that problem was solved in other open source code. Butterick experimented with Copilot in June and described the output as “This is the code I would expect from a talented 12-year-old who learned about JavaScript yesterday and prime numbers today.” That’s right at my level!
If you ask Copilot a question like “how can I parse this list and return a different kind of list?,” in most cases (but, as Butterick points out, not all!) it seems to respond with an answer synthesized from many different public code repositories instead of just pointing to a single “best answer” repo. That makes Copilot more of a stack overflow explorer than a public code explorer, albeit one that is itself trained by exploring public code. That feels like it reduces the type of harm that Butterick describes.
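To make the “synthesized answer” idea concrete, here is a hypothetical exchange of the kind described above. The prompt comment and the completion are both invented for illustration — this is not actual Copilot output, just the flavor of generic, stack-overflow-grade code such a tool tends to produce:

```python
# Hypothetical prompt a user might type as a comment:
# parse a list of "name:score" strings and return a dict mapping name -> score

def parse_scores(lines):
    """The kind of generic completion a code-suggestion tool might offer."""
    scores = {}
    for line in lines:
        name, score = line.split(":", 1)
        scores[name.strip()] = int(score.strip())
    return scores

print(parse_scores(["ada: 10", "grace: 12"]))  # {'ada': 10, 'grace': 12}
```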
Use at Your Own Risk
Butterick and Kuhn also raise concerns about the fact that Copilot does not make any guarantees about the quality of code it suggests. Although this is a reasonable concern to have, it does not strike me as particularly unique to Copilot. Expecting Copilot to provide license-cleared and working code every time is benchmarking it against an unrealistic status quo.
While useful, the code snippets I find in stack overflow/blog post/whatever are rarely properly licensed and are always “use at your own risk” (to the extent that they even work). Butterick and Kuhn’s concerns in this area feel equally applicable to most of my stack overflow/blog post answers. Copilot’s documentation is fairly explicit about the value of the code it suggests (“We recommend you take the same precautions when using code generated by GitHub Copilot that you would when using any code you didn’t write yourself.”), for whatever that is worth.
Will Copilot Create One Less Reason to Interact Directly with Open Source Code?
In Butterick’s view, another downside of this “just give me what I want” service is that it reduces the number of situations where someone might knowingly interact with open source code directly. How often do most users interact directly with open source code? As noted above, I interact with a lot of other people’s open source software as an extremely grateful user and importer of libraries, but not as a contributor. So Copilot would shift my direct deep interaction with open source code from zero to zero.
Am I an outlier? Nadia Asparouhova (née Eghbal)’s excellent book Working in Public provides insight into open source software grounded in user behavior on GitHub. In it, she tracks how most users of open source software are not part of the software’s active developer community:
“This distribution – where one or a few developers do most of the work, followed by a long tail of casual contributors, and many more passive users – is now the norm, not the exception, in open source.”
She also suggests that there may be too much community around some open source software projects, which is interesting to consider in light of Butterick’s concern about community depletion:
“The problem facing maintainers today is not how to get more contributors but how to manage a high volume of frequent, low-touch interactions. These developers aren’t building communities; they’re directing air traffic.”
That suggests that I am not necessarily an outlier. But maybe users like me don’t really matter in the grand scheme of open source software development. If Butterick is correct about Copilot’s impact on more active open source software developers, that could be a big problem.
Furthermore, even if users like me are representative today, and Copilot is not currently good enough to pull people away from interacting with open source code, might it be in the future?
“Maybe?” feels like the only reasonable answer to that question. As Kuhn points out, “AI is usually slow-moving, and produces incremental change far more often than it produces radical change.” Kuhn rightly argues that slow-moving change is not a reason to ignore a possible future threat. At the same time, it does present the possibility that a much better Copilot might itself be operating in an environment that has been subject to other radical changes. These changes might enhance or reduce that future Copilot’s negative impacts.
Where does that leave us? The kind of casual interaction with open source code that Butterick is concerned about may happen less than one might expect. At the same time, today’s Copilot does not feel like a replacement for someone who wants to take a deeper dive into a specific piece of open source software. A different version of Copilot might, but it is hard to imagine the other things that might be different in the event that version existed. Today’s version of Copilot does not feel like it quite manifests the threat described by Butterick.
Copilot is Trained on Open Source, Not Trained on Open Source
For some reason, I went into this research thinking that Copilot had explicitly been trained on open source software. That’s not quite right. Copilot was trained on public GitHub repositories. Those include many repositories of open source software. They also include many repositories of code that is just public, with no license, or a non-open license, or something else. So Copilot was trained on open source software in the sense that its training data includes a great deal of open source software. It was not trained on open source software in the sense that its training data only consists of open source software, or that its developers specifically sought out open source software as training data.
This distinction also happens to highlight an evolving trend in the open source world, where creators conflate public code with openly licensed code. As Asparouhova notes:
“But the GitHub generation of open source developers doesn’t see it that way, because they prioritize convenience over freedom (unlike free software advocates) or openness (unlike early open source advocates). Members of this generation aren’t aware of, nor do they really care about, the distinction between free and open source software. Neither are they fired up about evangelizing the idea of open source itself. They just publish their code on GitHub because, as with any other form of online content today, sharing is the default.”
As a lawyer who works with open source, I think the distinction between “openly/freely licensed” and “public” matters a lot. However, it may not be particularly important to people using publicly available software (regardless of the license) to get deeper into coding. While this may be a problem that is exacerbated by Copilot, I don’t know that Copilot fundamentally alters the underlying dynamics that feed it.
Is This Legal?
As noted at the top, and attested to by the body of this post so far, this post starts with the cultural and social critiques of Copilot because that is a richer area for exploration at this stage in the game. Nonetheless, the critiques are – quite reasonably – grounded in legal concerns.
Fair Use
The legal concerns are mostly about copyright and fair use. Normally, in order to make copies of software, you need permission from the creator. Open source software licenses grant those permissions in return for complying with specific obligations, like crediting the original creator.
However, if the copy being made of the software is protected by fair use, the copier does not need permission from the creator and can ignore any obligations in a license. In this case, GitHub is not complying with any open source licensing requirements because it believes that its copies are protected by fair use. Since it does not need permission, it does not need to comply with license requirements (although sometimes there are good reasons to comply with the social intent of licenses even if they are not legally binding…). It has said as much, although it (and its parent company Microsoft) has declined to elaborate further.
I read Butterick as implying that GitHub and Microsoft’s silence on the details of its fair use claim means that the claim itself is weak: “Why couldn’t Microsoft produce any legal authority for its position? Because [Kuhn and the Software Freedom Conservancy] is correct: there isn’t any.”
I don’t think that characterization is fair. Even if they believe that their claim is strong, GitHub cannot assume that it is so strong as to avoid litigation over the issue (see, e.g. the existence of the GitHub Copilot investigation website itself). They have every reason to avoid pre-litigating the fair use issue via blog post and press release, keeping their powder dry until real litigation.
Kuhn has a more nuanced (and correct, as far as I’m concerned) take on how to interpret the questions: “In fact, these areas are so substantially novel that almost every issue has no definitive answers”. While it is totally reasonable to push back on any claims that the law around this question is settled in GitHub’s favor (Kuhn, again, “We should simply ignore GitHub’s risible claim that the “fair use question” on machine learning is settled.”), that is very different than suggesting that it is settled against GitHub.
How will this all shake out? It’s hard to say. Google scanned all the books in order to create search and analytics tools, claiming that their copies were protected by fair use. They were sued by The Authors Guild in the Second Circuit. Google won that case. Is scanning books to create search and analytics tools the same as scanning code to create AI-powered autocomplete? In some ways yes? In other ways no?
Google also won a case before the Supreme Court where they relied on fair use to copy API calls. But TVEyes lost a case where they attempted to rely on fair use in recording all television broadcasts in order to make it easy to find and provide clips. And the Supreme Court is currently considering a case involving Warhol paintings of Prince that could change fair use in unexpected ways. As Kuhn noted, we’re in a place of novel questions with no definitive answers.
What About the ToS?
As Franklin Graves pointed out, it’s also possible that GitHub’s Terms of Service allow it to use anything in any repo to build Copilot without worrying about additional copyright permissions. If that’s the case, they won’t even need to get to the fair use part of the argument. Of course, there are probably good reasons that GitHub is not working hard to publicize the fact that their ToS might give them lots of room when it comes to making use of user uploads to the site.
Where Does That Leave Things?
To start with, I think it is responsible for advocates to get out ahead of things like this. As Kuhn points out:
”As such, we should not overestimate the likelihood that these new systems will both accelerate proprietary software development, while we simultaneously fail to prevent copylefted software from enabling that activity. The former may not come to pass, so we should not unduly fret about the latter, lest we misdirect resources. In short, AI is usually slow-moving, and produces incremental change far more often than it produces radical change. The problem is thus not imminent nor the damage irreversible. However, we must respond deliberately with all due celerity — and begin that work immediately.”
At the same time, I’m not convinced that Copilot is a problem. Is it possible that a future version of Copilot would starve open source software of its community, or allow people to effectively rebuild open source code outside of the scope of the original license? It is, but it seems like that version of Copilot would be meaningfully different from the current version in ways that feel hard to anticipate. Today’s Copilot feels more like a fast lane to possibly-useful stack overflow answers than an index that can provide unattributed snippets of all open source software.
As it is, the acute threat Copilot presents to open source software today feels relatively modest. And the benefits could be real. There are uses of today’s Copilot that could make it easier for more people to get into coding – even open source coding. Sometimes the answer of a talented 12-year-old is exactly what you need to get over the hump.
Of course, GitHub can be right about fair use AND Copilot can be useful AND it would still be quite reasonable to conclude that you want to pull your code from GitHub. That’s true even if, as Butterick points out, GitHub being right about fair use means that code anywhere on the internet could be included in future versions of Copilot.
I’m glad that the Software Freedom Conservancy is getting out ahead of this and taking the time to be thoughtful about what it means. I’m also curious to see if Butterick ends up challenging things in a way that directly tests the fair use questions.
Finally, this entire discussion may also end up being a good example of why copyright is not the best tool to use against concerns about ML dataset building. Looking to copyright for solutions has the potential to stretch copyright law in strange directions, cause unexpected side effects, and misaddress the thing you really care about. That is something that I am always wary of, and a prior that informs my analysis here. Of course, Amanda Levandowski makes precisely the opposite argument in her article Resisting Face Surveillance with Copyright Law.
Michael Weinberg is the Executive Director of NYU’s Engelberg Center for Innovation Law and Policy and Board President of Open Source Hardware Association. This article is reposted with permission from Michael Weinberg’s blog.
Filed Under: ai, ai generated code, code, copilot, copyright, developers, fair use, open source
Companies: github, microsoft
Content Moderation Case Study: GitHub Attempts To Moderate Banned Words Contained In Hosted Repositories (2015)
from the word-filters dept
Summary: GitHub solidified its position as the world’s foremost host of open source software not long after its formation in 2008. Twelve years after its founding, GitHub is host to 190 million repositories and 40 million users.
Even though its third-party content is software code, GitHub still polices this content for violations of its terms of service. Some violations are more overt, like possible copyright infringement. But much of it is a bit tougher to track down.
A GitHub user found themself targeted by a GitHub demand to remove certain comments from their code. The user’s code contained the word “retard” — a term that, while offensive in certain contexts, isn’t offensive when used as a verb to describe an intentional delay in progress or development. But rather than inform the user of this violation, GitHub chose to remove the entire repository, causing users who had forked this code to lose access to their repositories as well.
It wasn’t until the user demanded an explanation that GitHub finally provided one. In an email sent to the user, GitHub said the code contained content the site viewed as “unlawful, offensive, threatening, libelous, defamatory, pornographic, obscene, or otherwise objectionable.” More specifically, GitHub told the user to remove the words “retard” and “retarded,” restoring the repository for 24 hours to allow this change to be made.
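For context, the verb sense that GitHub’s filter tripped over is ordinary engineering vocabulary. A hypothetical snippet (everything here is invented for illustration) shows how the word can appear in perfectly innocuous code:

```python
# Hypothetical engine-control code: "retard" here is the standard
# engineering verb meaning "delay," with no offensive meaning at all.
RETARD_STEP_DEGREES = 4.0

def retard_ignition(base_timing_degrees, knock_detected):
    """Retard (delay) spark timing by a fixed step when engine knock is detected."""
    if knock_detected:
        return base_timing_degrees - RETARD_STEP_DEGREES
    return base_timing_degrees

print(retard_ignition(10.0, knock_detected=True))  # 6.0
```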
Decisions for GitHub:
- Is the blanket banning of certain words a wise decision, considering the idiosyncratic language of coding (and coders)?
- Should GitHub account for downstream repositories that may be negatively affected by removal of the original code when making content moderation decisions, and how?
- Could banned words inside code comments be moderated by only removing the comments, which would avoid impacting the functionality of the code?
Questions and policy implications to consider:
- Is context considered when moderating possible terms of service violations?
- Is it possible to police speech effectively when the content hosted isn’t what’s normally considered speech?
- Does proactive moderation of certain terms deter users from deploying code designed to offend?
Resolution: The user’s repository was ultimately restored after the offending terms were removed. So were the repositories that relied on the original code GitHub decided was too offensive to allow to remain unaltered.
Unfortunately for GitHub, this drew attention to its less-than-consistent approach to terms of service violations. Searches for words considered “offensive” by GitHub turned up dozens of other potential violations — none of which appeared to have been targeted for removal despite the inclusion of far more offensive terms/code/notes.
And the original offending code was modified with a tweak that replaced the word “retard” with the word “git” — terms that are pretty much interchangeable in other parts of the world. The not-so-subtle dig at GitHub and its inability to detect nuance may have pushed the platform towards reinstating content it had perhaps pulled too hastily.
Originally posted on the Trust & Safety Foundation website.
Filed Under: code, content moderation, repositories
Companies: github
GitHub, EFF Push Back Against RIAA, Reinstate Youtube-dl Repository
from the DEAR-RIAA-YOU-ARE-CORDIALLY-INVITED-TO-GFY dept
A few weeks ago, the RIAA hurled a DMCA takedown notice at an unlikely target: GitHub. The code site was ordered to take down its repositories of youtube-dl, software that allowed users to download local copies of video and audio hosted at YouTube and other sites.
The RIAA made some noise about copyright infringement (citing notes in the code pointing to Vevo videos uploaded by major labels) before getting down to business. This was a Section 1201 complaint — one that claimed the software illegally circumvented copyright protection schemes applied to videos by YouTube.
The takedown notice demanded removal of the code, ignoring the fact that there are plenty of non-infringing uses for a tool like this. It ignored Supreme Court precedent stating that tools with significant non-infringing uses cannot be considered de facto tools of infringement. It also ignored the reality of the internet: that targeting one code repository wouldn’t erase anything from dozens of other sites hosting the same code, or the fact that engaging in an overblown, unjustified takedown demand would only increase demand for (and use of) the software.
Youtube-dl is a tool used by plenty of non-infringers. It isn’t just for downloading Taylor Swift videos (to use one of the RIAA’s examples). As Parker Higgins pointed out, plenty of journalists and accountability activists use the software to create local copies of videos so they can be examined in far more detail than YouTube’s rudimentary tools allow.
John Bolger, a software developer and systems administrator who does freelance and data journalism, recounted the experience of reporting an award-winning investigation as the News Editor of the college paper the Hunter Envoy in 2012. In that story, the Envoy used video evidence to contradict official reports denying a police presence at an on-campus Occupy Wall Street protest.
“In order to reach my conclusions about the NYPD’s involvement… I had to watch this video hundreds of times—in slow motion, zoomed in, and looping over critical moments—in order to analyze the video I had to watch and manipulate it in ways that are just not possible” using the web interface. YouTube-dl is one effective method for downloading the video at the maximum possible resolution.
At the time, GitHub remained silent on the issue, suggesting it was beyond its control. Developers who’d worked on the youtube-dl project reported being hit with legal threats of their own from the RIAA.
There’s finally some good news to report. The EFF has taken up GitHub/youtube-dl’s case and is pushing back. A letter [PDF] from the EFF to GitHub’s DMCA agent gets into the tech weeds to contradict the RIAA’s baseless “circumvention” claims and the haphazard copyright infringement claims it threw in to muddy the waters.
First, youtube-dl does not infringe or encourage the infringement of any copyrighted works, and its references to copyrighted songs in its unit tests are a fair use. Nevertheless, youtube-dl’s maintainers are replacing these references.
Second, youtube-dl does not violate Section 1201 of the DMCA because it does not “circumvent” any technical protection measures on YouTube videos. Similarly, the “signature” or “rolling cipher” mechanism employed by YouTube does not prevent copying of videos.
There’s far more in the letter, but this explains it pretty succinctly in layman’s terms:
youtube-dl works the same way as a browser when it encounters the signature mechanism: it reads and interprets the JavaScript program sent by YouTube, derives the “signature” value, and sends that value back to YouTube to initiate the video stream. youtube-dl contains no password, key, or other secret knowledge that is required to access YouTube videos. It simply uses the same mechanism that YouTube presents to each and every user who views a video.
We presume that this “signature” code is what RIAA refers to as a “rolling cipher,” although YouTube’s JavaScript code does not contain this phrase. Regardless of what this mechanism is called, youtube-dl does not “circumvent” it as that term is defined in Section 1201(a) of the Digital Millennium Copyright Act, because YouTube provides the means of accessing these video streams to anyone who requests them. As a federal appeals court recently ruled, one does not “circumvent” an access control by using a publicly available password. Digital Drilling Data Systems, L.L.C. v. Petrolink Services, 965 F.3d 365, 372 (5th Cir. 2020). Circumvention is limited to actions that “descramble, decrypt, avoid, bypass, remove, deactivate or impair a technological measure,” without the authority of the copyright owner… Because youtube-dl simply uses the “signature” code provided by YouTube in the same manner as any browser, rather than bypassing or avoiding it, it does not circumvent, and any alleged lack of authorization from YouTube or the RIAA is irrelevant.
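To make the letter’s description concrete, here is a schematic sketch of that flow. Every function, step, and parameter below is a hypothetical stand-in: the real transformation steps live in the player JavaScript that YouTube serves to every viewer, and a client like youtube-dl extracts and executes them at runtime, just as a browser does:

```python
# Schematic sketch only. Every step below is a hypothetical stand-in for
# transformations that the publicly served player JavaScript defines and
# that a client (a browser, or youtube-dl) executes at runtime.
# There is no password or secret key involved anywhere in the flow.

def reverse(sig):
    return sig[::-1]

def splice(sig, n):
    return sig[n:]

def swap(sig, n):
    sig = sig[:]
    sig[0], sig[n % len(sig)] = sig[n % len(sig)], sig[0]
    return sig

def derive_signature(scrambled):
    """Apply the (hypothetical) steps the player JS would specify, in order."""
    sig = list(scrambled)
    sig = reverse(sig)
    sig = splice(sig, 2)
    sig = swap(sig, 7)
    return "".join(sig)

# The derived value is then sent back with the stream request, exactly as a
# browser would send it after running the same JavaScript.
print(derive_signature("abcdefghijklmnop"))
```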
GitHub’s post on the subject explains the situation more fully, breaking down what the site’s obligations are under the DMCA and what it does to protect users from abuse of this law. It also states that it’s overhauling its response process for Section 1201 circumvention claims to provide even more protection for coders using the site. Going forward, takedown notices will be forwarded to GitHub’s legal team, and if there’s any question about a notice’s legitimacy, GitHub will err on the side of USERS and leave the targeted repositories up until more facts are in. This puts it at odds with almost every major platform hosting third-party content, which almost always errs on the side of the complainant.
And the cherry on top is the establishment of a $1 million legal defense fund for developers by GitHub. This will assist developers in fighting back against bogus claims and give them access to legal advice and possible representation from the EFF and the Software Freedom Law Center.
Youtube-dl is back up. And the RIAA is now the one having to play defense. It will have to do better than its slapdash, precedent-ignoring, deliberately-confusing takedown notice to kill a tool that can be used as much for good as for infringement.
Filed Under: circumvention, copyright, copyright 1201, copyright 512, counternotice, recording software, takedowns, youtube-dl
Companies: eff, github, riaa, youtube
RIAA Tosses Bogus Claim At Github To Get Video Downloading Software Removed
from the mumbo-and/or-jumbo dept
The RIAA is still going after downloaders, years after targeting downloaders proved to be a waste of time and a PR catastrophe. It’s not actually thinking about suing the end users of certain programs, but it has targeted Github with a takedown notice for hosting youtube-dl, a command line video downloader that downloads videos from (obviously) YouTube and other video sites.
Not that this is going to be any more effective than suing file sharers. The software has been downloaded countless times and forked into new projects hosted (and distributed) elsewhere.
Github has posted the RIAA’s takedown request, which looks a lot like a DMCA notice for copyright infringement. But it isn’t actually targeting infringement. As Parker Higgins pointed out on Twitter, the RIAA — after saying a bunch of stuff about copyright infringement — is actually claiming this software violates Section 1201 of the DMCA, which deals with circumvention of copyright protection schemes.
The request lists a bunch of Github URLs as “copyright violations.” But these aren’t actually copyright violations. A little further down the RIAA gets to the point.
The clear purpose of this source code is to (i) circumvent the technological protection measures used by authorized streaming services such as YouTube, and (ii) reproduce and distribute music videos and sound recordings owned by our member companies without authorization for such use. We note that the source code is described on GitHub as “a command-line program to download videos from YouTube.com and a few more sites.”
So, it’s not really about copyright infringement. The RIAA tries to blur that line a bit by saying the source code includes a short list of videos the program can download — all three of which are videos owned by major labels. Then the RIAA goes a step further, basically claiming that any software that can download YouTube videos violates Section 1201 of the DMCA and only exists to engage in copyright infringement.
The source code is a technology primarily designed or produced for the purpose of, and marketed for, circumventing a technological measure that effectively controls access to copyrighted sound recordings on YouTube…
[T]he youtube-dl source code available on Github (which is the subject of this notice) circumvents YouTube’s rolling cipher to gain unauthorized access to copyrighted audio files, in violation of YouTube’s express terms of service, and in plain violation of Section 1201 of the Digital Millennium Copyright Act, 17 U.S.C. §1201.
This suggests the primary use of youtube-dl is to violate the law. There are plenty of non-infringing uses for this software, including the downloading of CC-licensed videos and those created by the US government, which are public domain. Basically, the RIAA is mashing up the takedown notice provision of DMCA 512 to try to remove code it claims (incorrectly) is violating DMCA 1201… while ignoring the Supreme Court’s ruling in Sony v. Universal that says that tools with substantial non-infringing uses (in that case — oh look! — a video recording tool) are not by themselves infringing.
Making blanket statements like these is irresponsible and misleading, but that’s the sort of thing we’ve come to expect from entities like the RIAA. It’s the same questionable claim the MPAA made back in 2014, when it demanded third-party hosts remove Popcorn Time repositories because the software could be used to engage in copyright infringement. It didn’t make sense six years ago. It doesn’t make any more sense now.
Added to all the stupidity is the fact that the RIAA appears to be threatening anyone even loosely connected to the youtube-dl project. A couple of contributors to the project over the years have reported they’ve received legal threats from the RIAA for working on unrelated code and maintaining the repository.
The RIAA is welcome to continue its mostly-fruitless fight against copyright infringement. But it needs to do so honestly and do it without causing collateral damage to people who haven’t engaged in infringement. The RIAA has no claim here. Github isn’t engaging in infringement or circumvention. The software isn’t either, not until someone uses it to accomplish this. If the RIAA has a problem with end users, it needs to take its complaints to them. This is just more bullshit being brought by an entity with enough heft it will rarely be challenged, even when it’s in the wrong.
Filed Under: copyright, dmca, dmca 1201, dmca 512, downloading, recording, youtube-dl
Companies: github, riaa, youtube
Reluctant To Block Embarrassing Coronavirus Material Held On GitHub, China Targets The People Who Put It There
from the rewriting-history dept
Over the years, Techdirt has written many stories about the various forms that censorship has taken in China. The coronavirus pandemic has added an extra dimension to the situation. China is evidently trying to erase certain aspects of the disease’s history. In particular, it seeks to deny its likely role in acting as the breeding ground for COVID-19, and to downplay how it infected the rest of the world after the initial outbreak in Wuhan. As the New York Times put it: “China is trying to rewrite its role, leveraging its increasingly sophisticated global propaganda machine to cast itself as the munificent, responsible leader that triumphed where others have stumbled.” Quartz reports on a new front in this campaign to re-cast China’s actions. Volunteers in China working on a project called Terminus2049, which aims to preserve key digital records of the coronavirus outbreak, are now targets of a crackdown:
During the outbreak, the project shifted its focus to storing articles including a Chinese magazine’s interview (link in Chinese) with Wuhan doctor Ai Fen, who said she was the first to reveal the existence of the epidemic but who was later reprimanded. The article, first published in March, was taken down within hours of publication, spurring a race among internet users who used various creative ways, including coded language and emojis, to keep the article alive. Terminus2049 also preserved a strongly worded critique (link in Chinese) aimed at Chinese leader Xi Jinping penned by outspoken professor Xu Zhangrun. In the essay, Xu attacked Beijing’s social controls and censorship. He was later reportedly placed under house arrest and his account has been suspended on WeChat.
For obvious reasons, the Chinese authorities are not saying whether the actions taken against three of the volunteers are specifically because of the coronavirus material, but it certainly seems likely given the fate that has met other COVID-19 whistleblowers, critics and journalists. Terminus2049 is hosted on Microsoft’s GitHub, as were other similar projects that aimed to preserve coronavirus memories — including those that were critical of the Chinese government and its response to the outbreak. The reason GitHub is popular for this kind of non-coding material is that its importance as a resource for Chinese programmers has become so great that the authorities in the country have so far been unwilling to block access to it. Since they can’t remove the embarrassing posts, they target the people behind the projects, as the latest moves confirm. Unless activists can keep their identities hidden — something that is hard in a society where surveillance is pervasive — this kind of reprisal is an ever-present risk. As such, it is one of the most powerful weapons that the authorities can deploy in order to silence unwanted voices.
Follow me @glynmoody on Twitter, Diaspora, or Mastodon.
Filed Under: china, covid-19, crackdown, digital records, free speech, transparency
Companies: github, microsoft
Abbott Laboratories Sends Heavy-Handed Copyright Threat To Shut Down Diabetes Community Tool For Accessing Blood-Sugar Data
from the not-very-sweet,-not-very-clever dept
One of the most important recent developments in the world of diabetes has been the arrival of relatively low-cost continuous blood glucose monitors. These allow people with diabetes to take frequent readings of their blood sugar levels without needing to use painful finger sticks every time. That, in turn, allows users to fine-tune their insulin injections, with big health benefits for both the short- and long-term. The new devices can be read by placing a smartphone close to them. People use an app that gathers the data from the unit, which is typically placed on the back of the upper arm with an adhesive.
One of the long-awaited technological treatments for diabetes is the “closed-loop” system, also called an “artificial pancreas”. Here, readings from a continuous glucose device are used to adjust an insulin pump in response to varying blood sugar levels — just as the pancreas does. The idea is to free those with diabetes from needing to monitor their levels all the time. Instead, software with appropriate algorithms does the job in the background.
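As a rough illustration of the loop being described — and nothing more — here is a toy sketch. All names, thresholds, and numbers are invented placeholders; a real closed-loop system relies on clinically validated dosing algorithms, not arithmetic like this:

```python
# Toy illustration of one control cycle of a closed-loop system.
# Every name, threshold, and dose here is an invented placeholder --
# real systems use clinically validated algorithms with safety limits.

TARGET_MGDL = 110        # hypothetical target glucose level (mg/dL)
SENSITIVITY = 40.0       # hypothetical mg/dL drop per unit of insulin

def control_cycle(read_glucose, command_pump):
    """Read the monitor, compare to target, nudge the pump."""
    glucose = read_glucose()                     # from the continuous monitor
    if glucose > TARGET_MGDL:
        correction = (glucose - TARGET_MGDL) / SENSITIVITY
        command_pump(correction)                 # adjust the insulin pump
    # A real controller would also predict trends, track insulin already
    # on board, and enforce hard safety limits.

# Example wiring with stubbed-out devices:
control_cycle(
    read_glucose=lambda: 180,
    command_pump=lambda units: print(f"deliver {units:.2f} units"),
)
```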
Closed-loop systems are still being developed by pharma companies. In the meantime, many people have taken things into their own hands, and built DIY artificial pancreas systems from existing components, writing the control code themselves. One popular site for sharing help on the topic is Diabettech, with “information about [continuous glucose monitoring] systems, DIY Closed Loops, forthcoming insulins and a variety of other aspects.”
A few months back there was a post on Diabettech about some code posted to GitHub. A patch to Abbott Laboratories’ LibreLink app allowed data from the same company’s FreeStyle Libre continuous monitor to be accessed by other apps running on a smartphone. In particular, it enabled the blood-sugar data to be used by a program called xDrip, which provides “sophisticated charting, customization and data entry features as well as a predictive simulation model.” Innocent enough, you might think. But not according to Abbott Laboratories, which sent in the legal heavies waving the DMCA:
It has come to Abbott’s attention that a software project titled “Libre2-patched-App” has been uploaded to GitHub, Inc.’s (“GitHub”) website and creates unauthorized derivative works of Abbott’s LibreLink program (the “Infringing Software”). The Infringing Software is available at https://github.com/user987654321resu/Libre2-patched-App. In addition to offering the Infringing Software, the project provides instructions on how to download the Infringing Software, circumvent Abbott’s technological protection measures by disassembling the LibreLink program, and use the Infringing Software to modify the LibreLink program.
The patch is no longer available on GitHub. The original Diabettech post suggested that analyzing the Abbott app was permitted under EU law (pdf):
Perhaps surprisingly, this seems to be covered by the European Software Directive in article 6 which was implemented in member states years back, which allows for decompilation of the code by a licensed user in order to enable interoperability with another application (xDrip in this case).
As Cory Doctorow points out in his discussion of these events, in the US the DMCA has a similar exemption for reverse engineering:
a person who has lawfully obtained the right to use a copy of a computer program may circumvent a technological measure that effectively controls access to a particular portion of that program for the sole purpose of identifying and analyzing those elements of the program that are necessary to achieve interoperability of an independently created computer program with other programs, and that have not previously been readily available to the person engaging in the circumvention, to the extent any such acts of identification and analysis do not constitute infringement under this title.
Legal issues aside, there is a larger point here. As the success of open source software over the last twenty years has shown, one of the richest stores of new ideas for a product is its user community. Companies that embrace that group are able to draw on what is effectively a global research and development effort. Abbott is not just wrong to bully people looking to derive greater benefit from its products by extending them in interesting ways, it is extremely stupid. It is throwing away the enthusiasm and creativity of the very people it should be supporting and working with as closely as it can.
Follow me @glynmoody on Twitter, Diaspora, or Mastodon.
Filed Under: artificial pancreas, blood sugar, blood sugar data, copyright, data, diabetes, diabettech, diy, dmca, healthcare, librelink, reverse engineering, xdrip
Companies: abbott labs, github
Class Action Lawsuit Hopes To Hold GitHub Responsible For Hosting Data From Capital One Breach
from the into-the-breach dept
As soon as the Capital One breach was announced, you knew the lawsuits would follow. Handling the sensitive info of millions of people carelessly is guaranteed to net the handler a class-action lawsuit or two, but this one — filed by law firm Tycko & Zavareei — adds a new twist.
The 28-page lawsuit filed Thursday in the U.S. District Court for the Northern District of California asserted that GitHub “actively encourages (at least) friendly hacking.”
It notes that the hacked Capital One information was posted online for months and alleges that the company violated state law by failing to remove the information. “GitHub had an obligation, under California law, to keep off (or to remove from) its site Social Security numbers and other Personal Information,” the suit says.
Weird legal theory, but one that could possibly be stretched to target some of the $7.5 billion Microsoft paid to acquire GitHub. But it takes a lot of novel legal arguments to hold a third party responsible for content posted by a user, even if the content contained a ton of sensitive personal info.
The lawsuit [PDF] alleges GitHub knew about the contents of this posting since the middle of April, but did not remove it until the middle of July after being notified of its contents by another GitHub user. The theory the law firm is pushing is that GitHub was obligated to scan uploads for “sensitive info” and proactively remove third-party content. The lawsuit argues GitHub is more obligated than most because (gasp!) it encourages hacking and hackers.
GitHub knew or should have known that obviously hacked data had been posted to GitHub.com. Indeed, GitHub actively encourages (at least) friendly hacking as evidenced by, inter alia, GitHub.com’s “Awesome Hacking” page
GitHub had an obligation, under California law, to keep off (or to remove from) its site Social Security numbers and other Personal Information.
Further, pursuant to established industry standards, GitHub had an obligation to keep off (or to remove from) its site Social Security numbers and other Personal Information.
The “industry standards” the lawsuit references are voluntary moderation efforts engaged in by social media platforms. Certainly no platform would want to be known as the habitual host of exfiltrated credit card data, but removing offensive or plainly illegal content is one thing; removing strings of numbers from a site hosting an unusually large number of strings of numbers is quite another. The law firm feels this assertion helps its case. It probably doesn’t.
Moreover, Social Security numbers are readily identifiable: they are nine digits in the XXX-XX-XXXX sequence. Individuals’ contact information such as addresses are similarly readily identifiable.
Thus, it is substantially easier to identify—and remove—such sensitive data. GitHub nonetheless chose not to.
Nine digits in a sequence. Oh, like phone numbers. And phone numbers tend to be found near addresses, especially when coders and developers are using GitHub as an offshoot of LinkedIn, posting their personal info for employers to find. Even long lists of personal info wouldn’t necessarily be innately suspicious. Employers and recruiters looking for people with certain skills have probably compiled all of this freely-provided personal info for easy reference. It’s not as easy to moderate content as the litigants believe.
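A minimal sketch shows why pattern matching on “nine digits” over-flags. The sample strings are made up; none contains a Social Security number, yet a scanner built on the lawsuit’s theory flags all of them:

```python
import re

# A scanner built on the lawsuit's theory: flag anything shaped like an SSN.
DASHED = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # XXX-XX-XXXX
BARE = re.compile(r"\b\d{9}\b")                # nine digits, unformatted

samples = [
    "routing number 021000021",   # 9-digit bank routing numbers match BARE
    "order id 123456789",         # so does any 9-digit identifier
    "part no. 123-45-6789",       # a (hypothetical) part number matches DASHED
]

for text in samples:
    if DASHED.search(text) or BARE.search(text):
        print("flagged:", text)   # all three are flagged; none is an SSN
```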
But this belief, if backed by a judge, could add GitHub’s money to the pool of damages. Things will get a lot more interesting once GitHub responds to unintentionally hilarious assertions like these:
GitHub knew or should have known that the Personal Information of Plaintiffs and the Class was sensitive information that is valuable to identity thieves and cyber criminals. GitHub also knew of the serious harms that could result through the wrongful disclosure of the Personal Information of Plaintiffs and the Class.
As an entity that not only allows for such sensitive information to be instantly, publicly displayed, but one that also arguably encourages it, GitHub is morally culpable, given the prominence of security breaches today, particularly in the financial industry.
Well, we’ll see how “morally culpable” stands up in court, where “legally culpable” is the actual standard. GitHub will rely on Section 230 to be dismissed from this case, and rightly so. The person responsible for posting sensitive data exfiltrated from Capital One is, unsurprisingly, the person who posted the sensitive data exfiltrated from Capital One. Capital One has a duty to protect the information it gathers from customers. A third party site with hosting capabilities does not, and it’s not nearly as easy to moderate and proactively remove content as this lawsuit says it is.
Filed Under: class action, data breach
Companies: capital one, github
What Happens When The US Government Tries To Take On The Open Source Community?
from the maybe-we-are-about-to-find-out dept
Last year, Microsoft bought the popular code repository GitHub. As Techdirt wrote at the time, many people were concerned by this takeover of a key open source resource by a corporate giant that has frequently proved unfriendly to free software. In the event, nothing worrying has happened — until this:
GitHub this week told Anatoliy Kashkin, a 21-year-old Russian citizen who lives in Crimea, that it had “restricted” his GitHub account “due to US trade controls”.
As the ZDNet article explains, a user in Iran encountered the same problems. Naturally, many people saw this as precisely the kind of danger they were worried about when Microsoft bought GitHub. The division’s CEO, Nat Friedman, used Twitter to explain what exactly was happening, and why:
To comply with US sanctions, we unfortunately had to implement new restrictions on private repos and paid accounts in Iran, Syria, and Crimea.
Public repos remain available to developers everywhere — open source repos are NOT affected.
He went on to note:
The restrictions are based on place of residence and location, not on nationality or heritage. If someone was flagged in error, they can fill out a form to get the restrictions lifted on their account within hours.
Users with restricted private repos can also choose to make them public. Our understanding of the law does not give us the option to give anyone advance notice of restrictions.
We’re not doing this because we want to; we’re doing it because we have to. GitHub will continue to advocate vigorously with governments around the world for policies that protect software developers and the global open source community.
The most important aspect of this latest move by GitHub is that open source projects are unaffected, and that even those who are hit by the bans can get around them by moving from private to public repositories. Friedman rightly points out that as a company based in the US, GitHub doesn’t have much scope for ignoring US laws.
However, this incident does raise some important questions. For example, what happens if the US government decides that it wants to prevent programmers in certain countries from accessing open source repositories on GitHub as well? That would go against a fundamental aspect of free software, which is that it can be used by anyone, for anything — including for bad stuff.
This question has already come up before, when President Trump issued the executive order “Securing the Information and Communications Technology and Services Supply Chain”, a thinly-disguised attack on the Chinese telecoms giant Huawei. As a result of the order, Google blocked Huawei’s access to updates of Android. Some Chinese users were worried they were about to lose access to GitHub, which is just as crucial for software development in China as elsewhere. GitHub said that wasn’t the case, but it’s not hard to imagine the Trump administration putting pressure on GitHub’s owner, Microsoft, to toe the line at some point in the future.
More generally, the worry has to be that the US government will attempt to dictate to all global free software projects who may and may not use their code. That’s something that the well-known open source and open hardware hacker Bunnie Huang has written about at length, in a blog post entitled “Open Source Could Be a Casualty of the Trade War”. It’s well worth reading and pondering, because the relatively minor recent problems with GitHub could turn out to be a prelude to a far more serious clash of cultures.
Follow me @glynmoody on Twitter, Diaspora, or Mastodon.
Filed Under: crimea, doj, open source software, sanctions, trade wars, us sanctions
Companies: github, microsoft
Activism & Doxing: Stephen Miller, ICE And How Internet Platforms Have No Good Options
from the and-for-fun,-the-cfaa-and-scraping dept
Last month, at the COMO Content Moderation Summit in Washington DC, I co-ran a “You Make the Call” session with Emma Llanso from CDT. The idea was to turn the audience into a content moderation/trust & safety team of a fictionalized social media platform. We showed numerous examples of content or accounts that were “flagged” and then showed the associated terms of service, and had the entire audience vote on what to do. One of the fictional examples involved someone posting a link to a third-party website “contactinfo.com” claiming to have the personal phone and email contact info of Harvey Weinstein and urging people “you know what to do!” with a hashtag. The relevant terms of service included this: “You may not post personal information about others without their consent.”
The audience voting was pretty mixed on this. 47% of the audience punted on the question, choosing to escalate it to a supervisor as they felt they couldn’t decide whether to leave the content up or take it down. 32% felt it should just be taken down. 10% said to just leave it up and another 10% said to put a content warning flag on the content. We joked a bit during the session that some of these examples were “ripped from the headlines” but apparently we predicted the headlines in this case, because there are two stories this week that touch on exactly this kind of thing.
Example one is the story that came out yesterday, in which Twitter chose to start locking the accounts of users who were either tweeting Trump senior advisor Stephen Miller’s cell phone number, or merely linking to a Splinternews article that published his cell phone number (which I’m guessing has since been changed…).
Splinternews decided to publish Miller’s phone number after multiple news reports attributed to Miller the inhumane* decision to separate the children of asylum seekers from their parents, a plan Miller has defended. Other reports noted that Miller is enjoying all of the controversy over the policy. Splinternews, citing Donald Trump’s own history of giving out the phone numbers of people who anger him, thought it was only fair that people be able to reach out to Miller.
This is — for fairly obvious reasons — a controversial decision. I think most news organizations would never do such a thing. Not surprisingly, the number spread rapidly on Twitter, and Twitter started locking all of those accounts until the tweets were removed. That seems at least well within reason under Twitter’s rules that explicitly state:
You may not publish or post other people’s private information without their express authorization and permission.
But that question gets a lot sketchier when it comes to locking the accounts of people who merely linked to the Splinternews article. As in our fictionalized example, those people are not actually publishing or posting anyone’s private info. They are posting a link to a third party that purports to have that information. And, of course, in this case the situation is complicated even more than our fictionalized example, because Splinternews is a news organization (owned by Univision), and Twitter has also said that it has a “newsworthy” exception to its rules.
Personally, I think it’s the wrong call to lock the accounts of those linking to the news story, but… as we discovered in our own sample version, it’s not an easy call, and lots of people have strong opinions one way or the other. Indeed, part of the reason why Twitter may have decided to do this was that supporters of Trump/Miller started calling out the article as an example of doxxing, and claiming that leaving it up showed that Twitter was unfairly biased against them. It is a no-win situation.
And, of course, it wouldn’t take long before people started coming up with clever workarounds. Parker Higgins, citing the infamous 09F9 controversy (in which the MPAA tried to censor the revelation of a cryptographic key that broke the MPAA’s preferred DRM, and people responded by posting variations on the code, including a color chart in which the hex codes of the colors were the code), posted an image consisting of nothing but two blocks of color, with the phone number tucked into the colors’ hex codes:
Would Twitter lock his account for posting a two-color image? At some point, the whole thing gets… crazy. That’s not to argue that revealing someone’s private cell phone number is a good thing — no matter how you feel about Miller or the border policy. But just on the content moderation side, it puts Twitter in a no-win situation in which people are going to be pissed off no matter what it does. Oh, and of course, it also helped create something of a Streisand Effect. I certainly hadn’t heard about the Splinternews article, or that people were passing around Miller’s phone number, until the story broke about Twitter whac’ing at moles on its site.
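As an aside, the trick Higgins borrowed is simple enough to sketch in a few lines. Any string of decimal digits is also a string of valid hex digits, so it can be chunked into six-character groups, each of which reads as a CSS color. Here’s a minimal sketch, with a made-up placeholder number and a function name of my own invention (the exact chunking scheme is just one plausible way to do it):

```python
# Illustrative sketch of the 09F9-style color trick. The number below
# is a made-up placeholder, not anyone's actual phone number.
def digits_to_colors(digits: str) -> list[str]:
    """Pad a digit string to a multiple of six, then read it as hex colors."""
    padded = digits.ljust(-(-len(digits) // 6) * 6, "0")
    return ["#" + padded[i:i + 6] for i in range(0, len(padded), 6)]

print(digits_to_colors("2025550123"))  # ['#202555', '#012300']
# Decoding is just the reverse: strip the '#'s and concatenate.
```

Render those colors side by side and you get an innocuous-looking image that still carries the number, which is exactly why this kind of enforcement turns into a game of whac-a-mole.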
And that takes us to the second example, which happened a day earlier — and was also in response to people’s quite reasonable* anger about the border policy. Sam Lavigne decided to make something of a public statement about how he felt about ICE by scraping** LinkedIn for profile information on everyone who works at ICE (and who has a LinkedIn public profile). His database included 1595 ICE employees. He wrote a Medium blog post about this and posted the repository to Github; another user, Russel Neiss, created a Twitter account (@iceHRgov) that tweeted out info about each of those employees from that database. Notice that none of those are linked. That’s because all three companies took them down (though you can still find archives of the Medium post). There was also an archive of the Github repository, but it has since been memory-holed as well.
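For context on the mechanics (and only the mechanics): what Lavigne did is conceptually just “fetch public pages, extract names and titles, save them to a file.” Below is a minimal sketch of that general technique, assuming a hypothetical plain-HTML staff directory; the URL and CSS selectors are invented stand-ins, and this is emphatically not a working LinkedIn scraper, since LinkedIn sits behind logins and JavaScript rendering (and, per the second footnote below, scraping it invites lawsuits):

```python
# A minimal sketch of generic public-page scraping, assuming a
# hypothetical plain-HTML directory at example.org; the URL and the
# CSS classes are invented for illustration. This is NOT a LinkedIn
# scraper and will not work against LinkedIn.
import csv

import requests
from bs4 import BeautifulSoup

resp = requests.get("https://example.org/staff", timeout=10)
resp.raise_for_status()
soup = BeautifulSoup(resp.text, "html.parser")

with open("profiles.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["name", "title"])
    for card in soup.select(".profile-card"):  # hypothetical markup
        name = card.select_one(".name")
        title = card.select_one(".title")
        writer.writerow([
            name.get_text(strip=True) if name else "",
            title.get_text(strip=True) if title else "",
        ])
```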
Again… this raises a lot of questions. Github claimed that it removed the page for “violating community guidelines” — specifically around “doxxing and harassment, and violating a third party’s privacy.” Medium claimed that the post violated rules against “doxxing” and specifically the “aggregation of publicly available information to target, shame, blackmail, harass, intimidate, threaten or endanger.” Twitter, in Twitter’s usual way, is not commenting. LinkedIn put out a statement saying: “We do not support or condone what immigration authorities are doing at the border, but we can’t allow the illegal use of our member data. We will take appropriate action to ensure our members’ data is protected and used properly.”
Many people point out that all of this feels kind of ridiculous, seeing as this is all public info that the individuals chose to reveal about themselves on a public website. While Medium’s expansive definition of doxxing makes things interesting by including an intent standard for releasing the info, even if it is publicly available, the whole thing, again, demonstrates how complex this is. I know that some people will claim that these are easy calls — but, just for fun, try flipping the equation a bit. If you’re anti-Trump, how would you feel if a prominent alt-right person compiled and posted your info — even if publicly available — on a site where alt-right folks gather, with the clear intent of having hordes of Trump trolls harass you? Be careful about the precedent you set.
If it were up to me, I think I would have come down differently than Medium, Github and Twitter in this case. My rationale: (1) all of this info was public information; (2) those individuals chose to place it on a public website, knowing it was public; (3) they are all employed by the federal government, meaning they are public servants; and (4) while the compilation was done by someone who is clearly against the border policy, Lavigne never encouraged or suggested harassment of ICE agents. Instead, he wrote: “While I don’t have a precise idea of what should be done with this data set, I leave it here with the hope that researchers, journalists and activists will find it useful.” He separately noted that he believed “it’s important to document what’s happening, and by whom.” That seems to actually make a strong point in favor of leaving the data up, as there is value in documenting what’s happening.
That said, reasonable people can disagree on the question of what is the appropriate way for different platforms to handle these situations (even if there should be no disagreement about how inhumane the policy at the border has been*), taking into account that this situation could play out with very different players in the future, and that there is value in being consistent.
This is the very point that we were demonstrating with the game we ran at COMO. Many people seem to think that content moderation decisions are easy: you just take down the content that is bad, and leave up the content that is good. But it’s pretty rare that content is so easily classified. There is an enormous gray area — and much of it involves nuance and context, which are not always easy to come by — and it may look incredibly different depending on where you sit and what kind of world you think we live in. I still think there are strong arguments that the platforms should have left much of the content discussed in this post up, but I’m not the one making that call.
When we ran that game in DC last month, it was notable that on every single example we used — even the ones we thought were “easy calls” — every option in the game was selected by at least some audience members. That is, there was not a single situation in our examples in which everyone agreed on what should be done. Indeed, since there were four options, and every one of them was chosen by at least one person in every single example, it shows just how difficult it really is to make these calls. They are subjective. And what plays into that subjective decision-making includes your own views, your own perspective, your own reading of the content and the rules — and sometimes third-party factors, including how people are reacting and what public pressure you’re getting (in both directions). It is an impossible situation.
This is also why the various calls to mandate that platforms do this or face legal liability are even more ridiculous and dangerous. There are no “right” answers to these decisions. There are solutions that seem better to lots of people, but plenty of others will disagree. If you think you know the “right” way that all of these questions should be handled, I guarantee you’re wrong, and if you were in charge of these platforms, you’d end up feeling just as conflicted.
This is why it’s really time to start thinking about and talking about better solutions. Simply calling on platforms to be the final arbiters of what goes online and what stays offline is not a workable solution.
* Just a side note: if you are among the small minority of ethically challenged individuals who get upset that I describe the policy as inhumane: fuck off. The policy is inhumane, and if you’re defending it, you should seriously take time to re-evaluate your ethics and your life choices. On a separate note, if you are among the people who are then going to try to justify this policy as “but Obama/others did it too,” the same applies. Whataboutism is no argument here. The policy is inhumane no matter who did it, and pointing out that others did it too doesn’t change that. And, as inhumane as it may have been in the past, it has been severely ramped up. There is no defense for it. Attempting to defend it only serves to out yourself as a horrible person who has issues. Seriously: get help.
** This doesn’t even really fit with this story, but scraping LinkedIn is (stupidly) incredibly dangerous. LinkedIn has a history of suing people for scraping public info off of the site. And even if it’s lost some of those cases, the company appears to take a pretty aggressive stance towards scrapers. We can argue about how ridiculous that is, but, dammit, this post is already too long talking about other stuff, so we’ll have to discuss it separately.
Filed Under: activism, content moderation, doxing, harassment, ice, internet platforms, phone numbers, stephen miller, takedowns
Companies: github, linkedin, medium, twitter