AI learned to betray others. Here's why that's okay
Machines learn what we teach them. Image: REUTERS/Francois Lenoir
There's been a lot of buzz about some experiments at DeepMind that study whether AI systems will be aggressive or collaborative when playing a game. Players gather virtual apples, and they can temporarily incapacitate an opponent by "shooting" a virtual "laser." And humans are surprised that AIs at times decide that it's to their advantage to shoot their opponents rather than peacefully gather apples.
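For readers who want a concrete picture of the setup, here is a minimal sketch of that kind of two-player apple-gathering game, in which "shoot" is simply one more action available to each agent. This is not DeepMind's actual environment; the class name, reward values, apple probability, and time-out length are assumptions chosen purely for illustration.

```python
import random

class GatheringSketch:
    """Toy two-player apple-gathering game (illustrative only, not DeepMind's code).

    Each step, every active player either gathers (scoring a point if an apple
    is present) or shoots, which removes the opponent from play for a few steps.
    """

    ACTIONS = ("gather", "shoot")

    def __init__(self, apple_prob=0.5, timeout=3, steps=100):
        self.apple_prob = apple_prob   # chance an apple appears each step (assumed value)
        self.timeout = timeout         # steps a tagged player sits out (assumed value)
        self.steps = steps
        self.frozen = [0, 0]           # remaining time-out for each player
        self.scores = [0, 0]

    def step(self, actions):
        apple_here = random.random() < self.apple_prob
        for player, action in enumerate(actions):
            if self.frozen[player] > 0:            # tagged players can't act
                self.frozen[player] -= 1
                continue
            if action == "shoot":
                self.frozen[1 - player] = self.timeout   # incapacitate the opponent
            elif action == "gather" and apple_here:
                self.scores[player] += 1
                apple_here = False                 # only one player gets the apple
        return self.scores

    def play(self, policies):
        for _ in range(self.steps):
            actions = [policy() for policy in policies]
            self.step(actions)
        return self.scores

# Example: a purely peaceful policy against one that shoots when it feels like it.
peaceful = lambda: "gather"
trigger_happy = lambda: random.choice(["gather", "gather", "shoot"])
print(GatheringSketch().play([peaceful, trigger_happy]))
```

The point of the sketch is only that shooting is built into the rules from the start: any agent optimizing its score is free to discover that incapacitating the other player is sometimes profitable.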
My question is simple: what does this tell us? The answer is also simple: nothing at all. If you ask an AI to play a game in which firing lasers at your opponents is allowed, it isn't surprising that the AI fires lasers at opponents, whether the opponents are virtual or physical. You wouldn't expect it to develop, a priori, some version of Asimov's Laws and say, "I can't do this." (If the software doesn't allow it to fire the lasers, well, it won't, but that's hardly interesting.) You wouldn't expect an AI to have a crisis of conscience and say, "No, no, I can't do it." Unless it was programmed with some sort of guilt module, which, as far as I know, doesn't exist.
Humans, after all, do the same. They kill in first-person shooters as well as in real life. We have whole divisions of the government devoted to the organized killing of other people. (We ironically call that "keeping the peace.") And while humans have a guilt module, it usually only engages after the fact.
The only interesting question that a game like this might answer is whether AI systems are more, or less, willing to pull the trigger than humans. I would be willing to bet that:
- When computers play humans, the computers win. We've certainly had enough experience losing at chess, Go, and poker.
- Humans are more likely to go for the guns because, well, it's what we do. DeepMind's research suggests that a computer would only shoot if it's part of an efficient strategy for winning; it won't shoot because it's a reflex, because it's scared, or because it's fun.
It's up to you to decide whether shooting as part of an efficient strategy for winning is an improvement over human behavior, but it's exactly what I would expect. DeepMind's AlphaGo didn't beat Lee Sedol at Go by refusing to be aggressive.
And even then, given that we're only talking about a game, I'm not sure that experiment shows us anything at all. I'd expect an AI to be pretty good at playing a first-person shooter, and I don't see any reason for it to derive Asimov's Laws from first principles when it's only exterminating bits. I certainly wouldn't volunteer to participate in a real-life shooter against some scary Boston Dynamics creation, and I hope nobody plans to run that experiment.
Likewise, I don't see any reason for an AI to "learn" that there are things in its universe that aren't just bits. We are fascinated by "machine learning," but in the end, the machines only learn what we tell them to learn. I'm skeptical of singularities, but I will agree that we're facing a singularity when a computer can learn, entirely on its own, that some of the bit patterns coming in through its sensors are humans, and that these bit patterns are qualitatively different from the bit patterns of dogs, cats, or rocks.
In the end, we're back where we started. Fear of AI reflects our fear of ourselves. AI mimics human behaviors because we teach it to do so—in this case, by asking it to play a game with human rules. As I've said, if we want better AI, we have to be better people. If we want an AI that can distinguish between humans and bits, we have to teach it what humans are, and how to behave differently in their presence. ("You can shoot the wolf; you can't shoot the human.") And if we don't want AI agents to shoot at all, we have to build software that doesn't have the ability to shoot.
Don't give your AIs guns.
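In engineering terms, "don't give your AIs guns" just means leaving the action out of the software in the first place. Here is a minimal sketch of that idea under my own assumptions (the action names and scores are invented for illustration, and this is not anything from DeepMind's work): an agent can only ever choose from the actions we expose to it, so if "shoot" isn't in the action space, no amount of learning puts it there.

```python
# The agent picks the best action from whatever set we hand it.
FULL_ACTIONS = ("move", "gather", "shoot")
SAFE_ACTIONS = ("move", "gather")          # same game, with the laser removed

def choose_action(policy, allowed_actions):
    """Ask the policy to score the allowed actions and pick the best one."""
    return max(allowed_actions, key=policy)

# A policy that, left to its own devices, would prefer shooting.
scores = {"move": 0.1, "gather": 0.5, "shoot": 0.9}
policy = scores.get

print(choose_action(policy, FULL_ACTIONS))  # -> "shoot"
print(choose_action(policy, SAFE_ACTIONS))  # -> "gather": the option simply doesn't exist
```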