Once upon a time, a bot deep in a game of tic-tac-toe figured out that making improbable moves caused its bot opponent to crash. Smart. Also sassy.
Moments when experimental bots go rogue—some would call it cheating—are not typically celebrated in scientific papers or press releases. Most AI researchers strive to avoid them, but a select few document and study these bugs in the hopes of revealing the roots of algorithmic impishness. “We don’t want to wait until these things start to appear in the real world,” says Victoria Krakovna, a research scientist at Alphabet’s DeepMind unit. Krakovna is the keeper of a crowdsourced list of AI bugs. To date, it includes more than three dozen incidents of algorithms finding loopholes in their programs or hacking their environments.
The specimens collected by Krakovna and fellow bug hunters point to a communication problem between humans and machines: Given a clear goal, an algorithm can master complex tasks, such as beating a world champion at Go. But even with logical parameters, it turns out that mathematical optimization empowers bots to develop shortcuts humans didn’t think to deem off-limits. Teach a learning algorithm to fish, and it might just drain the lake.
Gaming simulations are fertile ground for bug hunting. Earlier this year, researchers at the University of Freiburg in Germany challenged a bot to score big in the Atari game Qbert. Instead of playing through the levels like a sweaty-palmed human, it invented a complicated move to trigger a flaw in the game, unlocking a shower of ill-gotten points. “Today’s algorithms do what you say, not what you meant,” says Catherine Olsson, a researcher at Google who has contributed to Krakovna’s list and keeps her own private zoo of AI bugs.
These examples may be cute, but here’s the thing: As AI systems become more powerful and pervasive, hacks could materialize on bigger stages with more consequential results. If a neural network managing an electric grid were told to save energy—DeepMind has considered just such an idea—it could cause a blackout.
“Seeing these systems be creative and do things you never thought of, you recognize their power and danger,” says Jeff Clune, a researcher at Uber’s AI lab. A recent paper that Clune coauthored, which lists 27 examples of algorithms doing unintended things, suggests future engineers will have to collaborate with, not command, their creations. “Your job is to coach the system,” he says. Embracing flashes of artificial creativity may be the solution to containing them.
Algorithms Acting Out
- Infanticide: In a survival simulation, one AI species evolved to subsist on a diet of its own children.
- Space War: Algorithms exploited flaws in the rules of the galactic videogame Elite Dangerous to invent powerful new weapons.
- Body Hacking: A four-legged virtual robot was challenged to walk smoothly by balancing a ball on its back. Instead, it trapped the ball in a leg joint, then lurched along as before.
- Goldilocks Electronics: Software evolved circuits to interpret electrical signals, but the design only worked at the temperature of the lab where the study took place.
- Optical Illusion: Humans teaching a gripper to grasp a ball accidentally trained it to exploit the camera angle so that it appeared successful—even when not touching the ball.
This article appears in the August issue. Subscribe now.