One way to measure progress in artificial intelligence is to chart victories by algorithms over champions of increasingly challenging games—checkers, chess, and, in 2016, Go. On Wednesday, five bots sought to extend AI’s mastery to e-sports, in the fantasy battle game Dota 2. They failed, as a team of pro gamers from Brazil called paiN defended humanity’s honor—for now.
A crowd of thousands in Vancouver’s hockey arena watched the bots battle paiN over 52 tense minutes packed with spells and firebolts. The humans won decisively. The human-machine contest was a side event to The International, a Dota 2 tournament that boasts the biggest purse in e-sports, at $25 million.
The match suggests that the best pro gamers maintain an edge over the best algorithms. A warm-up match earlier this month, in which the bots defeated a team of Dota experts who provide commentary on pro games, had raised expectations that AI was about to claim another scalp. OpenAI, the nonprofit research group that built the bots, is targeting Dota 2 because, appearances aside, it is mathematically more complex than chess or Go. In the game, five-person teams pick characters such as spiders, sorcerers, and centaurs, and then fight to destroy each other’s bases.
The way the bots lost highlights a limitation of machine learning, the technique driving the current AI boom, in which machines learn tricky tasks through experience and data. Mathematically rendering data into software that makes decisions works well for some tasks, such as speech recognition, but doesn’t easily create impressive powers of strategy or planning.
OpenAI’s software racked up more kills than paiN, spooking the tournament’s commentators with perfectly timed and coordinated attacks that could seem impossible for its human opponents to withstand. But the bots lagged strategically, squandering opportunities to gather and direct the resources needed for overall victory.
“The bots are still very good at moment-to-moment, but they seem bad at macro-level decisions,” tweeted Mike Cook, who researches games and AI at Falmouth University in the UK and the Max Planck Institute for Software Systems in Germany.
That combination of precise tactics but wobbly strategy may reflect the way OpenAI’s bots learned to play Dota. They taught themselves the game from scratch using a technique called reinforcement learning, which is also at the heart of some of Google parent Alphabet’s AI ambitions.
In reinforcement learning, software figures out a task through trial and error. It takes on a challenge over and over, trying different actions, and sticks with those that work. OpenAI’s bots prepared for Wednesday’s match by playing millions of sped-up games of Dota against clones of themselves.
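The trial-and-error loop described above can be illustrated with a toy example. This is a minimal sketch, not OpenAI’s training code: it assumes a made-up game with three possible actions, each with a fixed chance of winning, and the names `train_bandit`, `win_probs`, and `epsilon` are hypothetical. The learner mostly repeats the action that has worked best so far and occasionally tries another at random.

```python
import random


def train_bandit(win_probs, episodes=5000, epsilon=0.1, seed=0):
    """Learn by trial and error which action wins most often.

    win_probs is a hypothetical stand-in for the game: the chance that
    each action leads to a win. The learner never sees these numbers
    directly; it only observes the outcome of each try.
    """
    rng = random.Random(seed)
    n_actions = len(win_probs)
    value = [0.0] * n_actions  # running estimate of each action's payoff
    tries = [0] * n_actions    # how often each action has been tried

    for _ in range(episodes):
        if rng.random() < epsilon:
            # Explore: occasionally try an action at random.
            action = rng.randrange(n_actions)
        else:
            # Exploit: stick with the action that has worked best so far.
            action = max(range(n_actions), key=lambda a: value[a])

        reward = 1.0 if rng.random() < win_probs[action] else 0.0

        # Update the running average payoff for the chosen action.
        tries[action] += 1
        value[action] += (reward - value[action]) / tries[action]

    return value, tries


if __name__ == "__main__":
    value, tries = train_bandit([0.2, 0.5, 0.8])
    print("estimated payoffs:", [round(v, 2) for v in value])
    print("times each action was tried:", tries)
```

Even this toy learner needs thousands of tries to settle on the best of three actions, which hints at why a game as complex as Dota demands millions of practice games.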
That’s very different from how humans approach problems. A novice can, fortunately, become a Dota pro in far fewer than a million games by understanding the game’s goals and learning how to create productive strategies. Bots based on reinforcement learning, at least today, don’t engage with the game at a higher level. They are driven by predicting the best action in any given moment. “It’s reactive, they look at a state of the world and come up with something to do now,” says Ben Recht, a professor at the University of California, Berkeley.
Susan Zhang, a software engineer who worked on OpenAI’s Dota project, says that shortcoming showed during Wednesday’s loss. During training, the bots look at most 14 minutes ahead when judging the effects of actions they take. “They simply don’t have any mechanism to ‘plan’ for more than 14 minutes at a time,” she says. “This definitely contributes to the lack of long term strategy that we see.”
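The article does not say how that 14-minute limit is built into training. One common way reinforcement learning limits how far ahead a system looks is discounting, in which a reward counts for less the further in the future it arrives. The sketch below assumes that approach purely for illustration; the function names and the 10-step half-life are hypothetical, not values from OpenAI.

```python
def discount_for_half_life(half_life_steps):
    """Discount factor at which a reward's weight halves every half_life_steps."""
    return 0.5 ** (1.0 / half_life_steps)


def discounted_return(rewards, gamma):
    """Total value of a reward sequence, weighting each by gamma ** delay."""
    total = 0.0
    # Work backward so each earlier step discounts everything after it once.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


if __name__ == "__main__":
    gamma = discount_for_half_life(10)  # hypothetical half-life of 10 steps
    for delay in (0, 10, 50, 100):
        # A single reward of 1.0 that arrives after `delay` steps.
        rewards = [0.0] * delay + [1.0]
        print(delay, round(discounted_return(rewards, gamma), 4))
```

Under this scheme a payoff far enough in the future contributes almost nothing to the value of an action taken now, so a learner has little incentive to set up plays that pay off only late in a long game. Discounting fades rewards gradually rather than cutting them off at a fixed point, so it is only a rough analogy for the limit Zhang describes.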
For the same reason, the bots don’t react well to situations they didn’t encounter during training. “If it sees something that it hasn’t seen before, it’s hard for it to adjust immediately,” says Zhang.
That doesn’t mean reinforcement learning can’t be powerful. By giving paiN a competitive game, OpenAI’s bots have already attained a level beyond earlier videogame bots.
The nonprofit credits that to how it leveraged advances in graphics processors to put more computational power behind the technique. That makes trickier problems tractable, by giving learning algorithms the millions of tries they need to figure out tactics that work. Earlier this month, OpenAI used the same general approach behind its Dota bots to give a five-fingered robotic hand impressive dexterity. Elon Musk, who cofounded OpenAI, left its board earlier this year, saying he wanted to avoid conflicts of interest with Tesla.
PaiN’s bot-squashing session Wednesday could prove to be just a historical blip in the relentless advance of AI, perhaps as soon as Thursday. Greg Brockman, an OpenAI cofounder who is also the organization’s chief technology officer, says the bots will play two more games this week, although he did not provide details. “What we proved today is we’re right there at the edge of human ability,” he says.
Speaking before Wednesday’s result, Recht of Berkeley told WIRED he expected OpenAI’s bots to beat a pro team—if not in its first matchup then soon after. But he pushed back against the notion this could signal machines will soon be better than humans at a wide range of typical jobs.
“The game environment is very constrained and simple because we want it to be fun to play,” he says. “That’s good for algorithms that need millions of repeatable simulations, but it takes away the challenges and unpredictability of the real world.”