translation – Techdirt

Stories filed under: "translation"

How Refugee Applications Are Being Lost In (Machine) Translation

from the AI-not-I dept

As you may have noticed, headlines are full of the wonders of chatbots and generative AI these days. Although often presented as huge breakthroughs, in many ways they build on machine learning techniques that have been around for years. These older systems have been deployed in real-life situations for some time, which means they provide valuable information about the possible pitfalls of using AI for serious tasks. Here is a typical example of what has been happening in the world of machine translation when applied to refugee applications for asylum, as reported on the Rest of the World site:

A crisis translator specializing in Afghan languages, Mirkhail was working with a Pashto-speaking refugee who had fled Afghanistan. A U.S. court had denied the refugee’s asylum bid because her written application didn’t match the story told in the initial interviews.

In the interviews, the refugee had first maintained that she’d made it through one particular event alone, but the written statement seemed to reference other people with her at the time — a discrepancy large enough for a judge to reject her asylum claim.

After Mirkhail went over the documents, she saw what had gone wrong: An automated translation tool had swapped the “I” pronouns in the woman’s statement to “we.”

That’s a tiny difference, and one that today’s machine translation programs can easily miss, especially for languages where training materials are still scarce. And yet the shift from singular “I” to plural “we” can have life-changing consequences: in the case above, whether asylum was granted to a refugee fleeing Afghanistan. There are other problems too:

Based in New York, the Refugee Translation Project works extensively with Afghan refugees, translating police reports, news clippings, and personal testimonies to bolster claims that asylum seekers have a credible fear of persecution. When machine translation is used to draft these documents, cultural blind spots and failures to understand regional colloquialisms can introduce inaccuracies. These errors can compromise claims in the rigorous review so many Afghan refugees experience.

In the future it is likely that the number of people seeking asylum will increase, not least because of environmental refugees who are fleeing lands made uninhabitable by climate change. Their applications for asylum elsewhere are likely to involve a wider range of lesser-known languages. Turning to machine translation will be a natural move by the authorities, since it takes time and resources to recruit specialist human translators.

The new generation of AI tools and their high-profile abilities will encourage this trend, as well as their use to evaluate applications and to make recommendations about whether they should be accepted. The Rest of the World article points out that OpenAI, the company that is behind ChatGPT, updated its user policies in late March with the following as “Disallowed usage of our models”:

High risk government decision-making, including:

Governments trying to save money will doubtless use them anyway. It will be important for courts and others dealing with asylum claims to bear this in mind when there seem to be serious discrepancies in refugees’ applications. They may be all in the (machine’s) mind.

Follow me @glynmoody on Mastodon.

Filed Under: afghanistan, ai, asylum, chatbots, chatgpt, climate crisis, machine learning, openai, pashto, refugees, translation
Companies: openai

Meta’s AI division has announced two exciting new projects in the field of machine translation:

The first is No Language Left Behind, where we are building a new advanced AI model that can learn from languages with fewer examples to train from, and we will use it to enable expert-quality translations in hundreds of languages, ranging from Asturian to Luganda to Urdu. The second is Universal Speech Translator, where we are designing novel approaches to translating from speech in one language to another in real time so we can support languages without a standard writing system as well as those that are both written and spoken.

The No Language Left Behind technology could have a major impact on how people around the world use the Internet, particularly in the way they access key scientific and medical resources. It would allow people to translate material in one of the more prevalent languages used online, such as English or Spanish, into their own local language once it has been included in the No Language Left Behind project. There’s a crying need for this, for reasons the following Wikipedia article makes clear:

Slightly over half of the homepages of the most visited websites on the World Wide Web are in English, with varying amounts of information available in many other languages. Other top languages are Russian, Spanish, Turkish, Persian, French, German and Japanese.

Of the more than 7,000 existing languages, only a few hundred are recognized as being in use for Web pages on the World Wide Web.

Unfortunately, Meta’s grand vision is unlikely to be realized – because of copyright. Unless online material is released under a permissive license such as the ones devised by Creative Commons, it will be necessary to obtain permission from the copyright holder before a full translation can be made using Facebook’s new tools. It will only take a few high-profile lawsuits from bullying publishers to frighten people away from daring to translate mainstream online articles into their own, poorly-served language without a license.

And so, once again, copyright maximalism will throttle an exciting chance to make the world a better, fairer place by improving access to knowledge – and all to preserve the sanctity of an outdated intellectual monopoly.

Follow me @glynmoody on Twitter, Diaspora, or Mastodon. Originally posted to WalledCulture.

Filed Under: ai, copyright, fair use, translation
Companies: facebook, meta

Facebook Translate Error Lands Palestinian Man In Israeli Detention

from the bad-morning dept

Like many people today, I have a decent amount of experience using Facebook’s language translations. With Geigners the world over, including an eyebrow-raising number of them in South America, I’ve found the translations to be a useful if imperfect way for me to interact with distant family members that reside in countries and continents far from the States. Imperfect is a key word there, however, as some of the garbled nonsense that results from translations can range from somewhat funny to downright perplexing. It goes without saying that relying on a website’s translation alone to interpret someone writing in a different language is a harrowing experience fraught with error.

Or maybe I should say that all of that should go without saying, because it seems that Israeli police relied solely on Facebook’s translation services, leading them to arrest a Palestinian man who was apparently just trying to be congenial.

A smiling Palestinian construction worker posted a photo of himself leaning against a bulldozer and holding a cup of coffee and a cigarette. He posted the photo on Facebook along with “good morning” in Arabic.

Israeli police, relying on Facebook’s translation service, believed the post said “attack them.” Haaretz reported, “The automatic translation service offered by Facebook uses its own proprietary algorithms. It translated ‘good morning’ as ‘attack them’ in Hebrew and ‘hurt them’ in English.”

To be clear, it took a lot of unhappy coincidences for this story to occur in the first place. To start, the difference between the two phrases in Arabic mostly amounts to a single letter. Add to it that the man’s Facebook post showed him in front of a bulldozer, a vehicle that has in the past been used as a weapon to attack Israeli people and buildings, and you can start to see how the warning bells for Israeli police had begun to sound. Now mix in that this Palestinian man was on the job constructing the Beitar Illit Israeli settlement, which itself has been a source of controversy in the past, and you might be tempted to forgive the Israeli police for briefly detaining this Palestinian man.

Except that pretty much every Arabic speaker that has taken an even cursory glance at the post immediately identified the translation error.

Haaretz explained, “Arabic speakers explained that English transliteration used by Facebook is not an actual word in Arabic but could look like the verb ‘to hurt’ — even though any Arabic speaker could clearly see the transliteration did not match the translation.”

Anyone who might want to suggest that the Israeli Police have no access to Arabic speakers they could have run this past does so at the risk of their own credibility. Put more frankly, relying on a Facebook translation to arrest a man who was in fact doing nothing more than being blandly amiable is pretty ridiculous. Given the reputation of Israel’s security services, I would have expected better.

Filed Under: arrest, israel, palestine, translation
Companies: facebook

Capcom Manually DMCAs English Translation Of Ace Attorney Game Not Available In English

from the language-as-drm dept

In gaming circles, Capcom is often seen as the company that brought you the Street Fighter and Resident Evil series of games. More recently, Capcom has become notable for its Ace Attorney series of games as well. But in intellectual property circles, Capcom will always be the game studio that pimped SOPA to the public, foisted broken DRM on its customers, and treated Resident Evil customers to both a secondary-market-killing DRM that allowed only one play-through of the game and the removal of promised features, alerting customers only after sales had begun rolling in. I think it’s fair to say, in other words, that Capcom has been known to be almost cartoonishly pernicious.

Speaking of which, Capcom also recently shut down a fan-translated play-through of an Ace Attorney game only available in Japan. Consistency!

Dai Gyakuten Saiban is an Ace Attorney spin-off starring an ancestor of Phoenix Wright in feudal Japan that has not been released in English. For O and Garbage, who run a Dai Gyakuten Saiban YouTube channel, it’s their favorite Ace Attorney game.

“Since I have an import 3DS, I bought the game just to try it out,” she said over reddit private messages. “Dai Gyakuten Saiban drew me in with it’s aesthetics, and then caught me in a death grip with Asougi [the main character’s rival].” Their shared passion for the game led them to translate it over a period of about 8 months. Their videos consisted of footage of the game as they played it without commentary, with subtitles added using YouTube’s subtitling options. They finished just in time for the announcement of Dai Gyakuten Saiban 2. “We both loved the game a lot,” O said, “and it was a shame that not everyone would be able to experience it because it lacked a localization.”

Ok, so a couple of things to note here. First, the videos in question are quite old. It seems they began the series in 2015, so we’re talking a couple of years here. Second, O and Garbage say they purposefully made sure there were no ads or monetization on the videos. They were trying to share the game with others that didn’t have access to it, not make coin. Third, I’ve found nothing to suggest that any English version of the game is even planned, never mind set for release. Most references for the game suggest there is no planned release anywhere outside of Japan. Given that it’s already a few years old, the likelihood of translated versions is beginning to drop. So, we have a fan translation of a game play-through in a language for which there is no planned release, with an audience in a market for which there is no planned release. And Capcom took it down. Why?

I already know what you’re thinking: “Probably a ContentID or bot-driven DMCA notice is to blame.” Nooooooope.

Sunday, June 25th, O discovered that the entirety of their translated Dai Gyakuten Saiban videos had been taken down by Capcom. The copy of the takedown notice they showed me indicated that they were manually detected, and not a victim of the automated “Content ID” system that is sometimes overzealous in how it flags gameplay videos. I reached out to Capcom about this and they declined to comment.

So Capcom manually took down this fan translation, apparently believing that language is a form of DRM and gamers ought to have to learn Japanese and buy the only version of the game that exists in order to get any sort of peek at a play-through. Keep in mind we’re talking about a play-through without ads or monetization on it. I’m struggling to come up with an explanation for why Capcom would do this other than…they’re just mean, I guess? Mean to very real fans of its games that just wanted to show off how cool the game was to those that had no shot of getting it for themselves because Capcom didn’t make it available to them.

While she’s not as frustrated as she was when she first found out, O and Garbage are both “bummed,” as Garbage puts it. But neither of them have very many regrets about starting the project in the first place.

“There wasn’t an earth shattering revelation or pull to me doing this,” Garbage said. “I just wanted to share a game that was inaccessible.”

8 months of work down the drain. And for what?

Filed Under: ace attorney, censorship, dmca, fans, translation
Companies: capcom

Here we go again with copyright taking content away from the public, rather than the other way around. You’ve probably heard about everything going on in Greece these days, with the big vote and the fight over Greek debt and how it will deal with it. Leading up to it, my social media stream suddenly filled up with people linking to a story at Medium with an English translation by Gavin Schalliol of an interview famed economist Thomas Piketty gave to the German publication DIE ZEIT. Whether you like/agree with Piketty or not (and I’m in the camp that thinks he’s overrated), the interview itself was pretty interesting, making a key point that has gotten lost in much of the debate: that for all the pressure that Germany has been putting on Greece to repay its debts, Germany itself didn’t repay its debts after World War II (or earlier wars). Lots of people have been talking about it, and tons of English-language news reports wrote up the story, with nearly all of them linking to Schalliol’s translation. Just for example, here’s the Washington Post, the Huffington Post, Quartz, Slate, Business Insider, Fortune, Marketwatch, and Vox, all of whom link to Schalliol’s translation on Medium.

But, if you visit it now, you will not see the translation. Instead, you see a notice that says:

I am currently in touch with DIE ZEIT to ensure my compliance with German copyright law. Updates will follow very soon. The original German interview with Thomas Piketty can be found here.

To be fair, it’s quite likely that Schalliol’s translation violated the copyright in the original. While some may debate whether or not a translation should ever really be subject to copyright (nothing is actually copied), it is pretty well settled that translations are derivative works, and as such are subject to copyright. However, the simple fact is that DIE ZEIT did not choose to publish an English translation, and even if it now chooses to do so, that will happen after the big vote rather than before, when Schalliol initially published his translation.

It’s that translation that spread the interview far and wide and made it a big part of the public discussion over how Greece should deal with the German-led EU proposal, which it eventually voted down. I’m sure the copyright system supporters among you will leap to the defense of DIE ZEIT and the fact that, by law, its “rights” were violated. But, if you take a step back and look at the overall situation, it’s difficult to see how the world is better off under such a result. If Schalliol had never been able to publish his translation, it’s likely that Piketty’s comments would have had a much smaller and more limited audience, limiting the role they played in the overall discussion. The translation likely didn’t have much of an impact on the end result, but at the very least, it helped provide a lot of context to people around the globe.

And, it’s difficult to argue DIE ZEIT was somehow worse off. First, most of the articles actually linked back to the original as well, likely driving some amount of traffic. But, more importantly, it’s difficult to argue that Schalliol’s translation was a substitute for the original, given that even considering the small population that speaks both languages, it’s likely that Schalliol’s translation was almost entirely read by an audience that did not see the original and could not read it even if they wanted to.

If the intention of copyright is to better encourage the dissemination of ideas and knowledge, as we’re often told, then shouldn’t that kind of thing be encouraged, rather than discouraged? Instead, we get yet another story of copyright stepping in to stifle a public discussion of ideas.

Filed Under: copyright, debt, derivative work, gavin schalliol, germany, greece, thomas piketty, translation
Companies: die zeit

Square Enix Nixes 3 Years Of Fan Translation Work On PSP, Despite Not Releasing English Version For PSP

from the fantastic dept

When it comes to the title holder for shooting down anything interesting made by fans that in any way involves their IP, Square Enix probably takes the trophy. The company that insists that DRM is forever also insists that fan-made games, films, and even weapon replicas shall not exist. Part of the reason Square Enix is found doing this is that it has created and/or owned some truly beloved franchises in the video game medium, including the Chrono Trigger and Final Fantasy franchises. The fans of these properties are exceptionally devoted and passionate to and about them, which naturally leads to the wish to expand the universes even further through their own creation. That Square Enix wields a level 99 copyright hammer at all of these efforts is an unfortunate slap in the face to some of its biggest fans and best customers. It’s a crappy situation all around.

But it’s when the company does this kind of bullying with the timing of a CIA extraordinary rendition agent that we have to wonder if Square Enix is run by masochists. The latest example of this concerns Final Fantasy Type-0, an RPG released for the PSP, a handheld console barely holding on to any relevance in the industry. See, the game came out three years ago, in 2011, but only in Japan and with no English-language version having ever been released. A group of Final Fantasy fans, spearheaded by someone going by the handle SkyBladeCloud, began working on an English translation. That was over two years ago. The proposed patch and its development amassed a decent following.

If Square Enix wasn’t going to release the game in English, well, hey, at least we could all still play it. Over the next two years, Square stayed silent about the fate of Type-0 in the west. Though Square’s executives would occasionally drop vague hints about the game in interviews, there was no concrete news, and the few times I did ask Square about the game, they sent over non-answers like “we have nothing to announce at this time.” Meanwhile, the fan translation team kept plugging away, and at the time, project lead SkyBladeCloud said he wasn’t concerned about legal repercussions.

“I’m not worried since I live in Spain and different laws apply,” Sky told me in an e-mail earlier this year.

Fast forward to mid-2014 when this entire thing turns into the kind of shit-show that leaves everyone looking dirty. In March of this year, the translators announced the patch would be ready in August. Despite the fact that the project had received a decent amount of attention, it was only then that Square Enix’s lawyers reached out to SkyBladeCloud and informed him that the team’s efforts would be fought by the company. They also made some mention of finding some common ground that would keep everyone happy and on the level, though Square Enix has in the past been known to be a turncoat when it comes to those kinds of efforts. Still, non-disclosure agreements were signed and talks went on. People contributing to the translation project discussed internally not releasing their patch if Square Enix actually announced an English release of Type-0, the theorized reason for the company’s lawyers finally reaching out. All of that discussion ceased, however, when SkyBladeCloud suddenly announced the patch would release in early June instead, despite it being incomplete and not ready for prime-time. It was downloaded roughly 100,000 times. Two days later, Square Enix dropped the other shoe.

On Tuesday, June 10, Square dropped a bombshell of their own: Type-0 would be coming west, not for handheld systems but as a high-definition remake for the Xbox One and PlayStation 4. (A consequent Vita announcement flub left a bad taste in some fans’ mouths, and led many of them back toward the fan translation patch.)

Despite denials from SkyBladeCloud, pretty much everyone who knows this story is speculating that he knew the Square Enix announcement was coming and released the patch early out of spite, given a speculated ugly turn of tenor in talks with Square Enix and its lawyers. The timing certainly fits like a jigsaw puzzle piece. As does the sudden legal flurry set forth by Square Enix’s lawyers which, despite SkyBladeCloud’s earlier theory, caused him to take down the patch and all related online content referring to it. In its place he put up an announcement:

Unfortunately I’m forced to remove my posts and pages related to the popular Final Fantasy Type-0 fan translation project. That’s right, certain game company thinks that threats and false accusations are the way to treat its biggest fans. For the time being I can’t answer questions related to this matter, but I’ll write a more comprehensive post about all this once I get the chance. I hope you understand, and as always I appreciate your support (that I might need more that ever in the near future). Thank you very much:

~Sky

While SkyBladeCloud’s antics might be shady, and they certainly fractured his translation team in a serious way, he isn’t wrong: this is all unnecessary. The simple fact is that Square Enix now clearly has no intention of releasing an English version of a 3-plus year old game on the console for which the team was translating. Sure, they’re releasing it on some of the newer consoles, but many PSP owners may not have those consoles. The end result is going to be a whole lot of Final Fantasy fans being unable to play the game at all, simply because Square Enix decided to use its copyright hammer.

That certainly won’t win Square Enix any fans, even if some of the folks doing the translation handled themselves poorly.

Filed Under: fans, final fantasy type-o, psp, translation
Companies: square enix

DailyDirt: Automated English Translations Are Fun

from the urls-we-dig-up dept

Automated language translations have made some pretty big advances over the years, but sometimes the results are hilarious because they’re so wrong. We don’t mean to pick on Google Translate here — since it’s just one of many automated solutions for translating foreign languages — but automated Engrish can be pretty funny.

If you’d like to read more awesome and interesting stuff, check out this unrelated (but not entirely random!) Techdirt post.

Filed Under: call me maybe, fresh prince of bel air, funny, speech technology, translation
Companies: google

DailyDirt: Computers To Talk For Us

from the urls-we-dig-up dept

Computers help people communicate all the time, but they can be really helpful for people who have no voice at all (e.g., Stephen Hawking). Synthetic speech technologies are getting better, with better algorithms to generate more human-like speech and cloud-based systems that allow processor-intensive software to run on handheld devices. Here are just a few examples of some computer-created voices.

Filed Under: asr, blizzard challenge, speech synthesis, stephen hawking, text-to-speech, translation, voice

Translating Chris Dodd's Sanctimonious Bluster On Internet Protests Into English

from the thank-us-later,-chris dept

Following the MPAA’s “statement” concerning today’s internet blackout, Kevin Marks offered up a useful translation for us to post.

WASHINGTON–The following is a statement by Senator Chris Dodd, Chairman and CEO of the Motion Picture Association of America, Inc. (MPAA) on the so-called “Blackout Day” protesting anti-piracy legislation:

Senator and CEO – let’s lead with the revolving door promises to politicians

“Only days after the White House and chief sponsors of the legislation responded to the major concern expressed by opponents and then called for all parties to work cooperatively together,

Why are my former colleagues listening to their constituents about legislation? Don’t they stay bought?

some technology business interests are resorting to stunts that punish their users or turn them into their corporate pawns, rather than coming to the table to find solutions to a problem that all now seem to agree is very real and damaging.

Maybe if we keep saying copyright infringement is a real problem without evidence, they’ll believe it.

It is an irresponsible response and a disservice to people who rely on them for information and use their services.

How dare they edit their sites unless we force them to under penalty of perjury and felony convictions?

It is also an abuse of power given the freedoms these companies enjoy in the marketplace today.

Tomorrow was supposed to be different, that’s why we bought this legislation.

It’s a dangerous and troubling development when the platforms that serve as gateways to information intentionally skew the facts to incite their users in order to further their corporate interests.

Being the gateways and skewing the facts is our job, dammit.

A so-called “blackout” is yet another gimmick, albeit a dangerous one, designed to punish elected and administration officials who are working diligently to protect American jobs from foreign criminals.

I am high as a kite

It is our hope that the White House and the Congress will call on those who intend to stage this “blackout” to stop the hyperbole and PR stunts and engage in meaningful efforts to combat piracy.”

What have the Romans done for us? Apart from instantaneous global communications, digital audio and video editing, the DVD, Blu-ray, Digital projection, movie playback devices in everyone’s pockets and handbags…

Filed Under: blackouts, chris dodd, condescension, pipa, protect ip, protests, sopa, translation
Companies: mpaa

DailyDirt: Universal Translators Would Be Nice

from the urls-we-dig-up dept

Just about every science fiction story that involves aliens has to come up with some way for different languages to be translated and understood. Babel Fish, C-3PO and Star Trek’s “universal translator” all served this purpose. But, it would be revolutionary for technology just to translate between different human languages. Here are some quick links on the topic of communication research.

Filed Under: aliens, dolphins, language, translation
Companies: seti