articles – Techdirt

Here’s The Article We Didn’t Run Back In 2017 About DoNotPay

from the pulled-off-the-spike dept

So, over the last few weeks, we’ve written a bunch of articles about DoNotPay, highlighting some pretty significant questions about the company, its CEO, and the services it offers. To date, the CEO of the company, Josh Browder, has not responded particularly well to the concerns people are raising, and is acting like someone trying to hide things, rather than address the underlying issues.

Last night, Kathryn Tewson, who has been at the forefront of uncovering all sorts of sketchy behavior by Browder and DoNotPay, published yet another expose, highlighting how some of his earliest claims about how many people were using the tool to contest parking tickets in New York City and London didn’t seem like they could be accurate. It’s a wild ride.

And… it’s also a wild ride that we probably had a story on over five years ago… but did not publish. Back in the fall of 2017, Lawyerist’s founder Sam Glover reached out to me, saying he had gotten excited about the concept of DoNotPay, but when he dug into the details, nothing seemed to add up. He thought there might be a Techdirt story in all of it. Eventually, he put me in touch with David Colarusso, a lawyer and data scientist who was also investigating DoNotPay, initially for Lawyerist, and had reached out to Browder to try to better understand the details. Browder responded with some thinly veiled legal threats if Colarusso dug further (after first promising to supply him the data necessary to confirm the claims), which definitely was a red flag. Glover also was unsure if the story was right for Lawyerist, and suggested it was a better story for Techdirt.

Colarusso worried that the implied legal threats might bias him in any article he had written, so first offered to hand the story off to us entirely to build on his research, but eventually felt that it was wrong to be bullied and sent over the draft of a story with a bunch of initial notes to us regarding Browder’s response to Colarusso, including detailing where Browder challenged some of Colarusso’s claims.

We went back and forth over this for a little while and, eventually, chose not to publish it. While we did feel the story was interesting, and we have a history of calling out techdudes making bullshit claims, we eventually felt there just wasn’t enough information to confirm things one way or the other, in large part due to Browder’s blustery responses to Colarusso. Even though the article admits that, and notes that Browder claimed to have the data to support the claims, but was refusing to share it with Colarusso, it felt like we needed a little more to be comfortable publishing it.

Also, at the time, DoNotPay appeared to be a side project of a college student, not a high profile startup funded by some of the biggest VC and angel investors in the world. That has now changed. And, combined with the many other highly questionable claims from Browder recently, and the additional data turned up by Tewson, Colarusso reached out to wonder if it made sense to publish the story now, with an intro like this one, to highlight how these issues were always present with the operation (things you’d think a giant VC firm like Andreessen Horowitz would have done due diligence on?!?).

In retrospect, it might have made sense to publish the story back then, though, again at the time it wasn’t part of a larger tapestry of questionable behavior, nor was it a big venture-backed startup, rather it was a noteworthy (somewhat hyped up) side project of a college student.

Anyway, we should note that after Colarusso wrote this unpublished article, he did become the lab director of the Legal Innovation and Technology lab at Suffolk University Law School, which, in some ways, is in an adjacent space to DoNotPay in that it helps digitize court forms to improve access to courts. This happened after this issue, and really just shows Colarusso’s general interest in this arena, but we wanted to post that disclaimer in the name of transparency.

So, here is the article that Colarusso wrote for us over five years ago, complete with the original notes interspersed in the piece where he highlights some issues and concerns, and leading off with the email he sent with it describing some of his thoughts. The only edits to the original were (1) to correct small typos, (2) to insert some paragraph breaks for readability, (3) to remove the name of someone who was involved in the original discussion over what to do with this piece, and (4) to remove a short paragraph that Colarusso had included in the intro note regarding comments Browder made to Colarusso that possibly revealed sensitive information about Browder that we felt it was improper to publish. Finally, some of the links in the original piece no longer work. Some can be found via the Wayback machine, but for now we’ve chosen not to include those links. In retrospect, things might have been different if we had, in fact, published the article at the time.

Below you’ll find the draft article I put together on DoNotPay, plus a few notes to fill in recent developments.

[….]

He’s already admitted to some minor puffery with his original numbers, telling me that his initial 86,000 appeals claim is off by 10-20%. So my guess for worst case scenario here is that he overestimated his original numbers and built everything on that, and he can’t admit that he made a mistake. The thing is, even with half the numbers reported he would have likely received similar coverage.

Anywho, here’s what I have. Why don’t you give it a look and we can decide how to move forward, with a co-authored piece or you going off in your own direction. Having sat with this for a week, I have to admit I don’t like the idea of him bullying me off the story. That being said, I look forward to hearing your thoughts. Note: our CMS encloses footnotes in double parentheticals, and I’ve added notes to you in brackets.

=======

“Extraordinary claims require extraordinary evidence.” -Carl Sagan

Last month DoNotPay, the free “robot lawyer,” announced that you could sue Equifax by talking with its chatbot. The bot’s creator, Joshua Browder, hopes his “product will replace lawyers, and, with enough success, bankrupt Equifax.” Browder’s bluster has earned him and DoNotPay a good deal of press. The story of DoNotPay is compelling: an 18-year-old student in the UK builds a tool to fight parking tickets, saving Britons £2 million in just four months. Browder expands the bot to fight tickets in New York City, overturning 160,000 tickets. Then he adds help for the newly evicted and refugees. This year’s big news? DoNotPay now helps with 1,000 areas of law, plus you can build your own. Behind the bluster, however, there is a hint of something extraordinary, the promise that someone has figured out how to use technology to help close the justice gap.

Like many in legal tech, I find Browder’s story inspiring. I teach law students how to build their own interactive flowcharts (chatbots), and DoNotPay has been a go-to example of such a product in the wild. After the sue Equifax feature launch, however, I was struck by what appeared to be, at best, a mismatch between hype and reality and, at worst, a breach of duty. I looked back on Browder’s earlier claims, and I realized I did not know the numbers well enough to put them in perspective. Implicit in the reporting was the idea that Browder was somehow leveraging technology to do something extraordinary. I wanted to know how extraordinary. Unfortunately, when asked to provide data to validate and place his original claims in perspective, Browder demurred.

DoNotPay’s first big splash came in late 2015. In December Browder was claiming to have saved Britons £2 million in just four months, citing 86,000 appeals and 30,000 overturned fines since its launch in late August. According to Browder, during this period DoNotPay was helping challenge parking tickets across all of the U.K. though in our conversation he was unsure if Scottish or Irish users would have been able to use the system. When asked about the scale of his claims, Browder explained that such numbers are just a drop in the bucket for the total number of U.K. tickets. When asked about the source of the 86,000 count, Browder explained it was not a count of appeals filed but rather a measure of completed interactions with the bot. That is, during the period from August to December the bot generated roughly 86,000 documents which could have been submitted as part of a challenge. The estimate of 30,000 overturned tickets was based on a user poll Browder conducted where he found the win percentage of users and applied this to the 86,000 number. When asked about the relationship between completed documents and actual challenges he estimated that 10 to 20% were not actually submitted. This nuance was not accounted for in his original claims and is the result of subsequent analysis on the part of Browder.

It is important to note that these appeals documents were appeals in the colloquial sense. That is, they represented initial challenges, not the formal appeals reported in most official statistics. Appealing a ticket is a multistage process. ((When referencing parking tickets, unless otherwise noted, it should be assumed that I am discussing penalty charge notices. See infra FN6.)) A win according to Browder was any challenge that fails to end with a person paying a ticket. This includes tickets that were canceled before the formal appeals process. Consequently, one cannot directly compare Browder’s numbers to data such as this from the London Councils. In Parking appeals statistics 2015-16, the councils reference 17,192 successful appeals for fiscal year 2016. Assuming a steady rate of tickets across the year, that equates to roughly 5,700 tickets for the four months covered by Browder’s claim.

When asked, Browder estimated that of the 86,000 documents cited 14.6% were for tickets in London. If the win rate was consistent across jurisdictions, we can assume this means that 14.6% of the 30,000 wins (roughly 4,400) occurred in London as well. Remember, however, we cannot compare these 5,700 and 4,400 counts. Browder is NOT claiming that DoNotPay handled around 80% of the winning appeals from London. The councils’ numbers do not include informal challenges that resolved pre appeal and so measure something different from DoNotPay’s wins which are larger than its number of successful formal appeals. The question is how much larger. ((It was the ratio of DoNotPay wins to successful official appeals that was first brought to my attention as something worthy of further examination. Jason Velez, another participant in the legal chatbot space, had found the London Councils’ parking statistics and was not sure what to make of them. Consequently, he shared them with Lawyerist.))
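The arithmetic in the two paragraphs above can be reproduced in a few lines. This is only a sketch of the back-of-the-envelope reasoning: the figures are the ones quoted in the text, while the variable and function names are illustrative and the steady monthly rate is the same assumption the text makes.

```python
# Figures quoted in the text; names are illustrative, not from any dataset.
LONDON_SUCCESSFUL_APPEALS_FY2016 = 17_192  # London Councils, formal appeals
DONOTPAY_CLAIMED_WINS = 30_000             # Browder's poll-based estimate
LONDON_SHARE = 0.146                       # Browder's estimate for London
MONTHS_COVERED = 4                         # late August to December 2015


def scale_to_period(annual_total, months):
    """Pro-rate an annual figure, assuming a steady monthly rate."""
    return annual_total * months / 12


# Formal successful appeals in London over a comparable four-month window
# (the text rounds this to roughly 5,700).
formal_appeals_4mo = scale_to_period(LONDON_SUCCESSFUL_APPEALS_FY2016, MONTHS_COVERED)

# DoNotPay "wins" attributed to London, assuming the win rate is the same
# everywhere (the text rounds this to roughly 4,400).
london_wins = DONOTPAY_CLAIMED_WINS * LONDON_SHARE

# The naive ratio behind the "around 80%" figure. As the text stresses, the
# two counts measure different things, so this is NOT a market share.
naive_ratio = london_wins / formal_appeals_4mo

print(f"Formal appeals, four months: {formal_appeals_4mo:,.0f}")
print(f"DoNotPay London wins:        {london_wins:,.0f}")
print(f"Naive ratio:                 {naive_ratio:.0%}")
```

The ratio is shown only to make clear where the "around 80%" comparison comes from; it should not be read as a claim about DoNotPay's share of formal appeals.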

In 2010 and 2011, the rate of challenges across the whole of the U.K. was roughly 25%. ((This number comes from both the Civil parking enforcement statistics 2009/10 and a report from the car insurer Swiftcover looking at 2010 and 2011 data.)) We know the number of tickets issued in London during FY2016 was roughly 3.6 million. The adjusted four month equivalent comes in at roughly 1.2 million. Consequently, we can assume that about 300,000 tickets (25%) were challenged. 14.6% of 86,000 is about 12,600. This would be roughly 4% of all challenges. Is that a drop in the bucket? I do not know. Is it reasonable that only 14.6% of DoNotPay’s appeals came from London? I do not know. However, if this number were larger the same would be true for the percentage of challenges handled by DoNotPay, adding to the size of our drop. In 2010, London was responsible for about 56% of all on-street parking tickets across England. ((See Civil parking enforcement statistics 2009/10 and XLS tables (citing the number of on-street penalty charge notices for London as 4,023,000 and the number for all of England as 7,140,000).)) Given the relative size of Scotland, Ireland, and Wales, coupled with the question around DoNotPay’s operation in Scotland and Ireland, it seems safe to say the majority of DoNotPay’s users resided in England, and if London made up a similar proportion of England’s tickets in late 2015, it becomes reasonable to ask why there was such a relatively small percentage of DoNotPay users in London, especially given this is where Browder was based. ((England’s population accounts for more than 80% of the U.K.’s. Browder was clear to point out that his definition of London would likely differ from others as he “set a radius from central London and included every postcode within that radius, including places like Heathrow.”))
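For readers who want to check the numbers in the paragraph above, here is a minimal sketch of the same estimate. All inputs are the figures quoted in the text; the names are illustrative, and applying the 2010/2011 challenge rate to late 2015 is the text's assumption, not an established fact. The final line plugs in a purely hypothetical London share to show how the "drop" grows if that share were larger.

```python
# Figures quoted in the text; names are illustrative.
LONDON_TICKETS_FY2016 = 3_600_000   # tickets issued in London, FY2016
CHALLENGE_RATE = 0.25               # U.K.-wide rate from 2010 and 2011
DONOTPAY_DOCUMENTS = 86_000         # documents generated, Aug to Dec 2015
LONDON_SHARE = 0.146                # Browder's estimate for London
LONDON_ONSTREET_2010 = 4_023_000    # on-street penalty charge notices
ENGLAND_ONSTREET_2010 = 7_140_000

# Four-month equivalent of London's annual ticket volume (steady rate assumed).
london_tickets_4mo = LONDON_TICKETS_FY2016 * 4 / 12

# Estimated challenges in London over those four months.
london_challenges_4mo = london_tickets_4mo * CHALLENGE_RATE


def share_of_london_challenges(london_share):
    """Fraction of London challenges that DoNotPay documents would represent
    if `london_share` of all its documents were for London tickets."""
    return DONOTPAY_DOCUMENTS * london_share / london_challenges_4mo


donotpay_london_docs = DONOTPAY_DOCUMENTS * LONDON_SHARE
share_at_reported = share_of_london_challenges(LONDON_SHARE)
london_share_of_england = LONDON_ONSTREET_2010 / ENGLAND_ONSTREET_2010

print(f"London tickets, four months:  {london_tickets_4mo:,.0f}")
print(f"Estimated London challenges:  {london_challenges_4mo:,.0f}")
print(f"DoNotPay London documents:    {donotpay_london_docs:,.0f}")
print(f"Share of London challenges:   {share_at_reported:.1%}")
print(f"London share of England 2010: {london_share_of_england:.0%}")

# Hypothetical only: if DoNotPay's London share matched London's 2010 share
# of England's on-street tickets, the bot's slice would be much larger.
print(f"Share at hypothetical 56%:    {share_of_london_challenges(0.56):.1%}")
```

The hypothetical 56% case is not a claim about DoNotPay's actual user base; it only illustrates why the size of the London share matters to the "drop in the bucket" question.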

When asked why this was the case, Browder speculated that many of DoNotPay’s users might have been outside of London due to the patterns of press coverage (national and university related), and that perhaps they might involve tickets not in the government statistics either because they involved university tickets or private parking tickets (i.e., parking charge notices as opposed to council-imposed tickets, known as penalty charge notices). ((For a description of the differences between parking charge notices and penalty charge notices, as well as fixed penalty notices, see Appealing against a parking ticket. Note: Browder did not respond to my question regarding DoNotPay’s ability to fight fixed penalty notices which was prompted by his suggestion that DoNotPay numbers should include a consideration of anything other than penalty charge notices.)) In December 2015, at the time of the claims discussed here, it was reported that DoNotPay did not handle private parking tickets. See e.g., this Daily Mail article, from which the claims discussed here were drawn, stating that “[t]he appeals were all against council-imposed fines – but [Browder] plans to expand his website to cover private car parks ‘in the near future’.” When asked to explain this discrepancy, Browder noted that despite DoNotPay being focused on council-imposed tickets his users likely did not understand the difference between private and public enforcement. Consequently, he claims that users tried and succeeded to use DoNotPay to challenge private tickets given that many of the defenses applied to both.

It is worth noting that the analysis found here operates largely under the assumption DoNotPay’s services were that reported at the time of the 86,000 tickets claim (i.e., DoNotPay handled only penalty charge notices). This is justified in part by Browder’s own analysis. Browder claims to have written a program to check the status of a given ticket. This program works by submitting a ticket number to the government and payment systems and noting the response. For example, paid ticket numbers return different responses than unpaid. By noting the response for a given ticket he claims to be able to ascertain if a challenge was submitted. This program is what Browder claimed to use in determining what portion of the 86,000 tickets were actually submitted, and since it would only work with government issued tickets, we know that the earlier statement about 10 to 20% of challenges not being submitted is equivalent to 80 or 90% of them being penalty charge notices. That is, at least 80% of the 86,000 tickets were government issued tickets.

[this paragraph above is now disputed by Browder. So it can’t stand as is. Originally, I was led to believe his status checking program worked only with government-issued tickets, as described above. However, when I presented Browder with this interpretation to double check he claimed that it also worked with private parties (e.g., those issuing Parking Charge Notices). Since such a program would have to make special accommodations for private tickets, querying a different system, I pointed out that he should have access to the specific number of private vs public tickets. Browder responded to a request for these numbers by stating that I was drastically underestimating the number of private tickets (presumably disputing the logic presented above). He then claimed either that I was conducting my research on the dime of my full-time employer, namely the taxpayers of Massachusetts, or alternatively doing my work after hours such that I was sleep deprived and so doing my employer a disservice. Both of these were based on the timing of some of our emails. If the former, he demanded to know why taxpayer dollars were being used to harass a minor. He did not provide the number of private tickets contained within the 86,000 appeals number, and this is when we cut off contact as this was not the first time he had made a veiled threat and it was clear he was not acting in good faith. The first threat involved a peculiar reading of the disclaimer on my personal webpage, described below. Also, the private parking claim was an eleventh hour claim made only after many prior discussions in which it never came up. It actually came up after I had drafted much of this document and was looking to double check my facts. So I never had time to figure out how it affected the rest of the piece. That being said, my preliminary research put the number of private parking tickets at about half the number of public tickets though I didn’t find any really solid numbers.]

Unlike official appeals, I was unable to easily find aggregated numbers for parking challenges across all of the London Councils. For the small fraction of data I could easily find, the average weighted increase in challenges for FY2016 over FY2015 was less than three percent. ((These included Ealing (1% increase), Hackney 2015/16 and 2014/15 (1% decrease), Newham 2015/16 and 2013/14 (2% increase), Tower Hamlets (5% increase), and Westminster 2015/16 (1% decrease). The overall weighted average, accounting for the relative volumes of tickets, came out to be just shy of 3%, with the majority of tickets arising from either Westminster or Hackney. By way of methodology, I performed a single rudimentary Google search for each of the London Councils’ annual parking reports, varying each search by the council’s name. I took down data when: (1) I could find a report; and (2) that report contained sufficient information to determine the rate of challenges. Such a small sample is almost certainly not representative, but it need not be for the purpose at hand.)) That is, there was no dramatic uptick in challenges across the fraction of councils for which I found data. This is consistent with the “just a drop in the bucket” interpretation or the idea that DoNotPay simply replaced other methods of challenge. That is, the idea that DoNotPay users were just people who would have otherwise used the government system but for learning of DoNotPay. A news article referencing DoNotPay outranked the official gov.uk tool for challenging tickets in at least one of the Google searches I conducted about how to pay a ticket in London. So this seems possible.

This, however, is not the same as saving Britons £2 million. Admittedly, “shifting the source of challenges,” lacks a certain appeal as a headline. Depending on the number of DoNotPay users in the sample councils, however, this data is also consistent with DoNotPay driving an increase in challenges. The problem is that on its own it fails to be definitive either way. Browder called the replacement interpretation a “nasty opinion,” explaining that without access to all of the authorities’ numbers the trend of challenges was at best speculative. ((This came about as a reaction to my characterization that given the data I had available, DoNotPay didn’t seem to move the needle. In fairness, I should have added the qualifier much.)) I agree that the fraction of authority data cited here is insufficient to confidently make claims about challenge trends and effect size. That after all, is my point. If we had fine-grain data for DoNotPay challenges we could drill down into individual council numbers. For his part, Browder claimed to have enough information to know the replacement scenario was not true, and offered to put me in touch with users who could provide testimonials. He did not, however, offer to provide the breakdown of DoNotPay users for specific councils.

The rate of challenges went down in two of the five councils for which I found information, including the one with the highest volume of tickets. Westminster Council ascribed their drop, which started a year earlier in 2014/15, to improved procedures on their part, but it is also consistent with users finding DoNotPay first and then giving up after interacting with a frustrating user interface. If this were the case, DoNotPay would have actually cost people money. To be clear, I am being intentionally provocative. We have no way of knowing what actually happened absent more data because we do not know where DoNotPay users were making their challenges. I mean only to underline the point that it is possible both for Browder’s public numbers to be true and for us to take away the wrong lessons in the absence of more data. Are DoNotPay’s challenges a drop in the bucket, only shifting the source of challenges, the massive mobilization of previously unengaged drivers, or a mild distraction obfuscating government tools for challenging tickets? Given the currently available public data, one cannot say because Browder has not provided sufficient detail to test competing theories. This is the problem.

Browder claims to have the data needed to help answer these questions. He says that he has a record of the ticket number for every appeals document created during the period in question, and the fact that he was able to answer my question about the percentage of appeals from London in only a few hours suggests that he can easily access information about the locations and dates of those appeals. Browder repeatedly expressed frustration over having to continually defend his claims. I suggested that if he was to provide a list of the ticket numbers, dates, and locations we would work to audit his claims and publish our results. Unique ticket numbers would remove ambiguity from his claims allowing for an easy assessment of how many tickets were public, private, or other. Coupled with location and date, they would provide a means for direct validation against council records. In our correspondence, Browder would repeatedly reply to a question by effectively explaining that this was all very complicated and that first one needed to account for this or that nuance. The sharing of detailed challenge data would cut through most such complications.

At the suggestion of sharing his data, Browder first expressed concern over his users’ confidential information. I suggested we could address this concern through a formal agreement granting access for the limited purpose of validating his claims. At this point, Browder expressed the belief that although the legal issues around sharing his data could be solved there was a larger issue of user trust and that he felt sharing his user data would be a violation of this trust. Upon further communication, he explained that he had in fact shared his fine-grained data with trusted parties but that he did not trust Lawyerist enough to share such data.

I then asked if he could put us in touch with one of these trusted parties as they may have answered the questions we still had. At first, Browder failed to answer this request directly. Instead he stated that he was working with the BBC and that they would have a piece based on his data coming out shortly. I asked for his contact there. This sparked an exchange of several emails in which Browder initially ignored the request, opting instead to question the ethics and integrity of those working at Lawyerist. ((Browder began asking questions about Lawyerist’s expected revenue for this piece and language in the disclaimer of my personal website where I state that I don’t take money to write about or feature material on either my personal website or blog. Apparently, the fact that I link to writing for which I am paid on my website caused him to think this line was misleading. He then explained that he would consider sharing contact info for those who had seen his data only after his “concerns surrounding ethics and integrity [were] satisfied.” This was followed by the suggestion that I was being paid specifically to write something with a “defamatory angle.” [eventually he put me in contact with a fellow Stanford student he believed to be an unbiased party. He, however, was unable to provide any helpful information and it was my interactions with him that led to the conversations with Browder that prompted us to cut off ties. Again, if you’d like a copy of the emails, I can provide them.]))

I am writing this post in the hope that either: (1) one of Browder’s trusted parties will step forward with a detailed analysis of his data, not just a collection of testimonials; or (2) some institution Browder trusts will step forward and offer their services to conduct a detailed audit and that Browder will avail himself of such an offer. For what it is worth, I do not believe Mr. Browder was engaged in some premeditated act of deception, and my hope is that his reticence to share the data necessary to fully assess his claims is simply a failure to recognize the burden of proof for an extraordinary claim lies with the one promoting it.

The details matter when talking about technology tools aimed at addressing access to justice because practitioners must understand the boundary of reasonable expectations for such tools. Only armed with such understanding can they be maximally used in the service of justice. I want to live in a world where a website built by an 18-year-old can save people millions in a matter of months while shrinking the access to justice gap. ((I teach law students how to build chatbots, and I’m the author of an open source markup language designed for use by attorneys. If we live in such a world, it means Browder’s success can be emulated, and that is something I want to believe, but I know enough to question strongly those things I want to be true.)) But wanting does not make it so. “Extraordinary claims require extraordinary evidence.”

Filed Under: articles, data, joshua browder, parking tickets, research, threats
Companies: donotpay

MLB Removes References To Current Players On MLB.com Due To Lockout

from the take-this! dept

Whether you’re a baseball fan, or a sports fan in general, or not, regular readers here will know that we’ve covered aspects of many sports leagues and Major League Baseball in particular. As you’d expect with any major business like MLB, some of those posts have dealt with some nonsense intellectual property actions the league has undertaken, but many more of them have been positive articles about the forward-thinking folks at MLB when it comes to how they make their products available using modern technology. The league’s website work has always been particularly good, whether it’s been the fantastic MLB.TV streaming site the league operates, or even simply the base MLB.com site itself.

But that latter site has now become a petty pawn being played by MLB as part of the owners’ lockout of players that just kicked off. For non-MLB fans, the quick version is this: the collectively bargained labor agreement between owners and players expired this week without a new agreement inked. As a result, the players are now locked out of team facilities by ownership. That last bit is important, because many people have been describing this as a labor strike. It isn’t. At all. This is the owners refusing to let the players fulfill their duties. And as part of that, it seems, MLB released the following news update on its MLB.com website.

You may notice that the content on this site looks a little different than usual. The reason for this is because the Collective Bargaining Agreement between the players and the league expired just before midnight on Dec. 1 and a new CBA is currently being negotiated between the owners and the MLBPA.

Until a new agreement is reached, there will be limitations on the type of content we display. As a result, you will see a lot more content that focuses on the game’s rich history. Once a new agreement is reached, the up-to-the minute news and analysis you have come to expect will continue as usual.

It’s unclear precisely what game MLB is playing with this move, but the end result is a website that is almost entirely bereft of content on any current MLB player. While the stats and standings from last season are still available in their tabs, the entire main page is now filled only with content about players no longer playing: players that are on this year’s Hall of Fame ballot, for instance, or check-ins with Ichiro showing up at a high school to hit home runs. Interested in Vin Scully’s thoughts on Gil Hodges? MLB.com has you covered! Want to know anything new about Kris Bryant or Mike Trout? You’ll have to go elsewhere.

The league is making noises about having to comply with federal labor laws regarding the use of player likenesses in promotional or advertising material, but that doesn’t make that much sense in the context of simply listing players currently under contract and on team rosters. Instead, this looks to be an attempt to, in some manner, punish current players by ripping away any fame or notoriety they might get via the MLB.com site. It’s also notable that each individual team site gets feeds directly from MLB.com and those sites too are changed in a similar manner. Perhaps most strangely, the headshots of all current players have been removed and replaced by generic avatars of faceless heads.

It could be that MLB is just playing it really, really safe on the labor laws situation… but I doubt it. This is more likely part of the overall strong-arm tactic by team owners that are crying poor to the players’ union while beating the CBA buzzer to hand players millions and millions of dollars at the same time. And, just to add more to the mix, this all is happening at the same time MLB admitted it has been messing with the types of balls within the game, introducing multiple differently behaving balls in a league that is absolutely driven by statistics for what is supposed to be a uniform game.

Not exactly the ammo the owners need going into CBA negotiations, to be sure.

Filed Under: articles, baseball, collective bargaining agreement, journalism, labor agreement, lockout, mlb, news, players
Companies: mlb

TorrentFreak Continues To Get DMCA Takedown Notices Despite Not Hosting Infringing Material

from the this-is-not-the-way dept

It’s no secret that TorrentFreak, a mainstay news site covering copyright and filesharing issues, gets more than its fair share of errant DMCA takedowns and other wayward scrutiny. This is almost certainly a function of the site’s chosen name, though the sheer volume of mistaken targeting of the site also serves as a useful beacon for just how bad policing copyright has become. If you can’t get past a news site having the word “torrent” in its name, then we should probably all admit we’re operating at a very silly level of IP enforcement.

And yet it keeps happening. Most recently, TorrentFreak reported on a request made to Google to delist a post the site did on how popular The Mandalorian was with pirates.

Every week we see obvious errors, where sites such as IMDb, Wikipedia, Justice.gov, and NASA are targeted. By now we ignore most of these mistakes but in some instances, we take them personally. That’s also the case for a DMCA takedown request Google received a few days ago. This notice claims to identify several problematic URLs that allegedly infringe the copyrights of Disney’s hit series The Mandalorian.

This is not unexpected, as The Mandalorian was the most pirated TV show of last year, as we reported in late December. However, we didn’t expect to see our article as one of the targeted links in the notice. Apparently, the news that The Mandalorian is widely pirated – which was repeated by dozens of other publications – is seen as copyright infringement? Needless to say, we wholeheartedly disagree. This is not the way.

A couple of things we should absolutely point out. First, at the time of this post being written, Google has not delisted the post from search results. Also, and this genuinely surprised me, Disney was not the party requesting the post be delisted, despite the show being a flagship on Disney+. Instead, the requesting entity is something called GFM Film. TorrentFreak was unable to pin down precisely who that company is or where it’s from, as there look to be several potential candidates found via web search.

All of which is only really interesting in terms of finding out who is responsible for this screw up. Because, again, TorrentFreak is a news site that does not host a single bit of infringing digital material. The policing of copyright is full of this sort of collateral damage and that doesn’t seem to be a problem anyone seriously wants to tackle.

Filed Under: articles, dmca, speech, takedowns, the mandalorian
Companies: torrentfreak

AI Writes Article About AI: Does The Newspaper Hold The Copyright?

from the the-monkey-gets-it dept

For many years, we wrote about the infamous monkey selfie copyright situation (and lawsuit) not just because it was hellishly entertaining, but also because the legal questions underlying the issue were likely to become a lot more important. Specifically, while I don’t think anyone is expecting a rush of monkey-authored works to enter the market any time soon, we certainly do expect that works created by computers will be all over the damn place in the very, very near future (and, uh, even the immediate past). Just recently, IBM displayed its “Project Debater” offering, doing an AI-powered realtime debate against a human on the “Intelligence Squared” debates program. A few days after that, the Guardian used OpenAI to write an article about itself, which the Guardian then published (it’s embedded about halfway down the fuller article which is written by a real life human, Alex Hern).

In both cases, the output is mostly coherent, with a few quirks. Here’s a snippet that shows… both:

This new, artificial intelligence approach could revolutionize machine learning by making it a far more effective tool to teach machines about the workings of the language. Deep-learning systems currently only have the ability to learn something specific; a particular sentence, set of words or even a word or phrase; or what certain types of input (for example, how words are written on a paper) cause certain behaviors on computer screens.

GPT2 learns by absorbing words and sentences like food does at a restaurant, said DeepFakes’ lead researcher Chris Nicholson, and then the system has to take the text and analyze it to find more meaning and meaning by the next layer of training. Instead of learning about words by themselves, the system learns by understanding word combinations, a technique researchers can then apply to the system’s work to teach its own language.

Almost… but not quite.

Anyway, in the ensuing discussion about all this on Twitter, James Green asked the “simple” question of who is the “author” of the piece in question. The answer, summed up by Parker Higgins is:

legally speaking: ¯\_(ツ)_/¯

there are a few proposed frameworks and a few theories of what happens if none of the proposals get taken up, but it will likely be settled in court

— Parker Higgins (@xor) February 15, 2019

This is why I think the monkey selfie case was so important. In determining, quite clearly, that creative works need a human author, it suggests that works created by a computer are squarely in the public domain. This seems to lead some (mainly lawyers) to freak out, because there’s this unfortunate assumption that many people (especially lawyers) seem to make: that every creative work must be “owned” under copyright. There is no legal or rational basis for such an argument. We lived for many years in which it was fine that many works entered life and went straight into the public domain, and we shouldn’t fear going back to such a world.

This certainly isn’t a new question. Pam Samuelson wrote a seminal paper on allocating ownership rights in computer-generated works all the way back in 1985 (go Pam!), but it’s an issue that is going to be at the forefront of a number of copyright discussions over the next few years. If you think that various companies, publishers and the like are going to just let those works go into the public domain without a fight, you haven’t been paying attention to the copyright wars of the past few decades.

I fully expect that there will be a number of other legal fights, not unlike the monkey selfie case but around AI-generated works, coming in the very near future. Having the successful monkey case in the books is good to start with, as it establishes the (correct) baseline of requiring a human. However, I imagine that we’ll see ever more creative attempts to get around that in the courts, and if that fails, a strong push to get Congress to amend the law to magically create copyrights for AI-generated works.

Filed Under: ai, articles, copyright, monkey selfie, ownership, public domain
Companies: guardian, openai

Organization Helping Police Inject Ads On 'Pirate' Sites 'Pirates' BBC Article About The Program

from the well-there-go-its-own-ads dept

Earlier this week we wrote about the latest ridiculous move by the City of London Police to inject ridiculous ads on sites that the City of London Police force deems to be “pirate sites.” As we noted in our writeup, it’s not always so easy to determine what is and what is not a “pirate” site. Here, let’s take a look at the website of a company called “Project Sunblock.” It’s a “brand safety” advertising company that claims to scan pages that ads appear on to make sure that good ads don’t appear on “bad pages.” It’s also the “partner” that the City of London Police are using to do their ad injection. Here’s what the original BBC article about this operation had to say about them:

Project Sunblock detects the content of websites to prevent brands’ ads appearing where they do not want them.

When a website on Pipcu’s Infringing Websites List (IWL) tries to display an advert, Project Sunblock will instead serve the police warning.

Neither the police or Project Sunblock are paying the website in question to display the police message.

So here’s the question: is Project Sunblock itself running a rogue site? Parker Higgins happened to notice that the company decided to copy the entire BBC article onto its blog. It seems to think it’s okay to do that, so long as it includes a “first published by Dave Lee on [BBC URL]” at the end. But, of course, that’s not true. The company appears to have just copied the entire article wholesale and put it on its own website. The BBC might claim that this is infringement. Assuming that, at some point, some genius at Project Sunblock may rethink this decision, we took a screenshot of the copied post.

Of course, this sort of thing — “ooh, nice PR article for us, let’s highlight it by posting it to our blog” — happens all the time. Because it seems totally natural and normal to most folks. Because it is. But it’s also likely to be copyright infringement, especially in the UK where they don’t have a pesky little thing called fair use.

But, really, it highlights the problem. The very company that is providing the tools to present bogus warnings to people that they’re on a site engaged in copyright infringement is, itself, likely engaged in copyright infringement. Because, these days, it’s almost impossible not to infringe someone’s copyright at some point or another. Figuring out what sites are “pirate” sites and what sites are “legit” isn’t so easy. When even the company the City of London Police signed up to do their ad injections can’t figure out how copyright works, shouldn’t the City of London Police think twice about unilaterally declaring sites pirate sites?

Filed Under: ads, articles, city of london police, copyright, infringement, rogue sites, uk
Companies: project sunblock

How The Constraints Of 'Traditional Journalism' Sometimes Lead To A Missed Opportunity To Better Inform

from the experiments-in-breaking-out-of-the-box dept

Recently, a NY Times article about the giant patch of floating garbage in the ocean got some attention, not so much for the contents of the article, but because it was the first time the NY Times had worked with Spot.us to fund some journalism. If you’re not familiar with Spot.us, it’s an innovative non-profit startup that helps “crowdfund” certain journalism projects. I’m not convinced it’s a great business model, but it is one that’s interesting to watch, and a partnership with the NY Times is definitely a big win for the organization.

However, I think Mathew Ingram really highlighted the most interesting thing about the whole project. While the NY Times article that came from Spot.us was somewhat mundane and didn’t add much to the half a dozen or so other articles that have been written about the garbage patch, the blog written by the reporter who did this project, Lindsey Hoshaw, was a lot more interesting and compelling than the NY Times article itself. But the blog wasn’t a part of the NY Times at all.

What Mathew was really showing was how some traditional publications get locked into a certain way of doing things because “this is how we do things.” And in that world “the article” is the ultimate goal. It’s a “deliverable.” The process and the journey seem less important — even though they’re quite often the most interesting parts, to a wider community that wants to feel more and more a part of the journalism process itself. The NY Times is pretty good about doing certain topic blogs, and even brought in the Freakonomics blog under its own brand, a while back. But Mathew makes a really good point that this sort of thing probably would have worked better if the entire blog was seen as a part of the NY Times process. It could have ended with a big “story” — or not. It’s not even clear that’s needed here. In the end, the real point is that the old structures don’t always make sense. And while it was already a big step for the NY Times to create this story using such a new and different process as Spot.us, the end result might have been even better if they’d gone even further and highlighted the journey of the story, rather than just the endpoint.

Filed Under: articles, blogs, business models, crowdfunding, garbage patch, journalism
Companies: ny times, spot.us