Two Giants of AI Team Up to Head Off the Robot Apocalypse
There’s nothing new about worrying that superintelligent machines may endanger humanity, but the idea has lately become hard to avoid.
A spurt of progress in artificial intelligence as well as comments by figures such as Bill Gates—who declared himself “in the camp that is concerned about superintelligence”—have given new traction to nightmare scenarios featuring supersmart software. Now two leading centers in the current AI boom are trying to bring discussion about the dangers of smart machines down to Earth. Google's DeepMind, the unit behind the company’s artificial Go champion, and OpenAI, the nonprofit lab funded in part by Tesla’s Elon Musk, have teamed up to make practical progress on a problem they argue has attracted too many headlines and too few practical ideas: How do you make smart software that doesn’t go rogue?
“If you’re worried about bad things happening, the best thing we can do is study the relatively mundane things that go wrong in AI systems today,” says Dario Amodei, a curly-haired researcher on OpenAI’s small team working on AI safety. “That seems less scary and a lot saner than kind of saying, ‘You know, there’s this problem that we might have in 50 years.’” OpenAI and DeepMind contributed to a position paper last summer calling for more concrete work on near-term safety challenges in AI.
The two organizations have now followed through on that call with a new paper describing a machine learning system that learns a new task from human pointers, rather than figuring out its own, potentially unpredictable, approach. Amodei says the project shows it’s possible to do practical work right now on making machine learning systems less able to produce nasty surprises. (The project could be seen as Musk’s money going roughly where his mouth has already been; in a 2014 appearance at MIT, he described work on AI as “summoning the demon.”)
None of DeepMind’s researchers were available to comment, but spokesperson Jonathan Fildes wrote in an email that the company hopes the continuing collaboration will inspire others to work on making machine learning less likely to misbehave. “In the area of AI safety, we need to establish best practices that are adopted across as many organizations as possible,” he wrote.
The first problem OpenAI and DeepMind took on is that software powered by so-called reinforcement learning doesn’t always do what its masters want it to do—and sometimes kind of cheats. The technique, which is hot in AI right now, has software figure out a task by experimenting with different actions and sticking with those that maximize a virtual reward or score, meted out by a piece of code that works like a mathematical motivator. It was instrumental to the victory of DeepMind’s AlphaGo over human champions at the board game Go, and is showing promise in making robots better at manipulating objects.
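To make the trial-and-error idea concrete, here is a minimal sketch in Python. It is a toy illustration, not the method used for AlphaGo or in the new paper: the three actions, their scores, and the `reward` and `train` functions are all made up for this example. The agent tries each action, keeps a running average of the reward each one earns, and mostly sticks with the best one while occasionally experimenting.

```python
import random

def reward(action: str) -> float:
    """Hypothetical 'mathematical motivator': a fixed score per action."""
    scores = {"left": 0.1, "stay": 0.3, "right": 1.0}
    return scores[action]

def train(actions, steps=200, epsilon=0.1, seed=0):
    """Trial and error: experiment a little, otherwise repeat what scored best."""
    rng = random.Random(seed)
    totals = {a: 0.0 for a in actions}
    counts = {a: 0 for a in actions}

    def average(a):
        # Average reward seen so far for action a (0.0 if never tried).
        return totals[a] / counts[a] if counts[a] else 0.0

    for _ in range(steps):
        untried = [a for a in actions if counts[a] == 0]
        if untried:
            action = untried[0]            # try every action at least once
        elif rng.random() < epsilon:
            action = rng.choice(actions)   # occasional random experiment
        else:
            action = max(actions, key=average)  # stick with the best so far
        totals[action] += reward(action)
        counts[action] += 1

    return max(actions, key=average), counts

best_action, times_tried = train(["left", "stay", "right"])
print(best_action, times_tried)
```

Everything the agent learns here flows from the `reward` function, which is why getting that function right matters so much.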
But crafting the mathematical motivator, or reward function, such that the system will do the right thing is not easy. For complex tasks with many steps, it’s mind-bogglingly difficult (imagine trying to mathematically define a scoring system for tidying up your bedroom), and even for seemingly simple ones the results can be surprising. When OpenAI set a reinforcement learning agent to play the boat-racing game CoastRunners, for example, it surprised its creators by figuring out a way to score points by driving in circles rather than completing the course.
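A toy example shows how a reasonable-looking scoring rule can reward the wrong behavior. The sketch below is not CoastRunners and does not reproduce OpenAI's experiment; the track length, point values, and both policies are invented for illustration. The hand-written reward pays 10 points each time a respawning target is hit and 50 points once for finishing.

```python
TRACK_LENGTH = 20   # hypothetical: the finish line is at position 20
TARGET_AT = 3       # hypothetical: a bonus target that respawns after each hit

def score(policy, steps=100):
    """Run a policy under the hand-written reward and return (points, finished)."""
    position, points, finished = 0, 0, False
    for _ in range(steps):
        position += policy(position)      # policy returns +1 or -1
        if position == TARGET_AT:
            points += 10                  # reward for hitting the target
        if position >= TRACK_LENGTH:
            points += 50                  # one-time reward for finishing
            finished = True
            break
    return points, finished

def race_to_finish(position):
    """What the designer intended: always move forward."""
    return 1

def circle_target(position):
    """What a score-maximizer may find: shuttle back and forth over the target."""
    return 1 if position < TARGET_AT else -1

print("race to finish:", score(race_to_finish))
print("circle target: ", score(circle_target))
```

Under this scoring rule the circling policy collects far more points than the one that finishes the course, even though it never completes the task the designer had in mind.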
DeepMind and OpenAI’s solution is to have reinforcement learning software take feedback from human trainers instead, and use their input to define its virtual reward system. They hired contractors to give feedback to AI agents via an interface that repeatedly asks which of two short video clips of the AI agent at work is closer to the desired behavior.
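One way to turn "which clip is better?" answers into a reward function is to fit a model so that preferred clips get higher predicted reward. The sketch below is an assumption-laden illustration, not the paper's implementation: each clip is reduced to two made-up features, the human trainer is simulated by a hidden scoring rule (`hidden_score`), and the reward model is a simple linear one trained with a logistic (Bradley-Terry style) preference model.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predicted_reward(weights, clip):
    """Linear reward model: weighted sum of a clip's features."""
    return sum(w * x for w, x in zip(weights, clip))

def prob_first_preferred(weights, clip_a, clip_b):
    """Model's probability that a trainer would pick clip_a over clip_b."""
    return sigmoid(predicted_reward(weights, clip_a) - predicted_reward(weights, clip_b))

def hidden_score(clip):
    # Stand-in for human judgment (hypothetical): rotation is good, falling is bad.
    rotation, fell_over = clip
    return rotation - 2.0 * fell_over

def make_comparisons(n, seed=0):
    """Simulate n trainer verdicts as (clip_a, clip_b, a_was_preferred)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        a = [rng.random(), rng.random()]
        b = [rng.random(), rng.random()]
        data.append((a, b, hidden_score(a) > hidden_score(b)))
    return data

def fit_reward(comparisons, n_features, lr=0.5, epochs=50):
    """Gradient ascent on the log-likelihood of the observed preferences."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for clip_a, clip_b, a_preferred in comparisons:
            error = (1.0 if a_preferred else 0.0) - prob_first_preferred(weights, clip_a, clip_b)
            for i in range(n_features):
                weights[i] += lr * error * (clip_a[i] - clip_b[i])
    return weights

comparisons = make_comparisons(200)
weights = fit_reward(comparisons, n_features=2)
print("learned reward weights:", weights)
```

The learned weights can then stand in for a hand-written reward function in an ordinary reinforcement learning loop, so the agent is scored by a model of what the trainers preferred rather than by a rule someone had to spell out in code.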
A simple simulated robot called a Hopper learned to do a backflip after receiving 900 of those virtual thumbs-up verdicts from the AI trainers while it tried different movements. With thousands of bits of feedback, a version of the system learned to play Atari games such as Pong and got to be better than a human player at the driving game Enduro. Right now this approach requires too much human supervision to be very practical for teaching complex tasks, but Amodei says the results already hint at how this could be a powerful way to make AI systems more aligned with what humans want of them.
It took less than an hour of humans giving feedback to get Hopper to land that backflip, compared to the two hours it took an OpenAI researcher to craft a reward function that ultimately produced a much less elegant flip. “It looks super awkward and kind of twitchy,” says Amodei. “The backflip we trained from human feedback is better because what’s a good backflip is kind of an aesthetic human judgment.” You can see how complex tasks such as cleaning your home might also be easier to specify correctly with a dash of human feedback than with code alone.
Making AI systems that can soak up goals and motivations from humans has emerged as a major theme in the expanding project of making machines that are both safe and smart. For example, researchers affiliated with UC Berkeley’s Center for Human-Compatible AI are experimenting with getting robots such as autonomous cars or home assistants to take advice or physical guidance from people. “Objectives shouldn’t be a thing you just write down for a robot; they should actually come from people in a collaborative process,” says Anca Dragan, coleader of the center.
She hopes the idea can catch on in the industry beyond DeepMind and OpenAI’s explorations, and says companies already run into problems that might be prevented by infusing some human judgment into AI systems. In 2015, Google hurriedly tweaked its photo recognition service after it tagged photos of black people as gorillas.
Longer term, Amodei says, spending the next few years working on making existing, modestly smart machine learning systems more aligned with human goals could also lay the groundwork for our potential future face-off with superintelligence. “When, someday, we do face very powerful AI systems, we can really be experts in how to make them interact with humans,” he says. If it happens, perhaps the first superintelligent machine to open its electronic eyes will gaze at us with empathy.