Not long ago, in 2016, the Go-playing artificial intelligence AlphaGo astonished the world by defeating 18-time world champion Lee Sedol 4 to 1, and news of the feat spread quickly. The brains behind this mighty “robotic brain” are none other than the Google DeepMind team, pioneers in the field of artificial intelligence research and applications. Continuing its work on game-playing agents, DeepMind created AlphaZero, an agent that, instead of being built for a single game like AlphaGo, could learn on its own to play a range of board games. It defeated the champion programs in Chess (Stockfish) and Shogi (Elmo), and even beat its Go-playing predecessor AlphaGo Zero with a 61% win rate, gradually expanding the dominance of AI across board games (Silver). Still, no one expected the leap when DeepMind suddenly announced its newest agent, AlphaStar, which would learn to play one of the most famous and complex online video games in history, the real-time strategy (RTS) game StarCraft 2.
Unlike traditional board games, which in the end offer a finite number of moves, StarCraft 2 plays out in real time with an effectively unbounded number of states and possibilities. There are 3 races to choose from: Terran, Protoss, and Zerg. Each race has over 30 deployable units and buildings, many with their own abilities, along with over 20 unique upgrades. PVP (player vs player) mode takes this complexity to the next level, opening up boundless opportunities for timing and unit combinations, and with them a practically unlimited range of strategies and playstyles.
To complicate matters further, each player's view is restricted to a small, zoomed-in, movable rectangle, with only a minimap in the bottom-left corner to glance at activity in the areas they can currently see. Outside the zones covered by their own units and buildings, players have no visibility or information at all, including their opponents’ movements, units, and buildings, unless they actively scout with units or certain abilities. While worrying about setting up a defense or stocking up troops for offense, players also need to mine and expand for resources such as minerals and gas in order to sustain the production of units and buildings throughout the game, and they may also split off troops to harass enemy mining lines or to defend against harassment. Taken together, this means a good player must sustain highly efficient and accurate actions from beginning to end in order to maximize their resources and opportunities. A common metric, APM (actions per minute), measures this pace, and intuitively, the higher the better.
A computer can, of course, act at an unfairly high APM that no human can reach. But with so many rules, possibilities, and restrictions, how could an AI agent even learn to function properly, let alone master these concepts and deploy strategies complex enough to defeat human experts? The leap from board games to RTS games seemed far too big. Almost everyone had doubts about how the robotic brain would perform in this new and exotic realm.
Taking on the challenge nonetheless, the Google DeepMind team built AlphaStar around a deep neural network and first trained it with supervised learning on human game replays provided by Blizzard, so that the agent could pick up and imitate many of the strategies real players use in different scenarios. The progress was clear: at this stage the agent could already defeat the hardest level of the built-in computer AI in StarCraft 2 an absurd 95% of the time. For the next training phase, DeepMind opted for a creative multi-agent reinforcement learning setup resembling the game's actual competitive ladder. Agents derived from the previous model played against each other, with the winner ranking up and the loser ranking down. Along the way, specific agents were assigned the objective of targeting and defeating a particular agent or group of agents. This created a diverse environment in which agents repeatedly explored new strategies and exploited weaknesses in those already known to work, improving and consolidating their strategy selection and counter-strategies over time.
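The ladder idea described above can be illustrated with a small toy simulation. This is only a sketch under stated assumptions, not DeepMind's training procedure: the Elo-style rating update, the step size `K`, the starting rating of 1000, the function names, and the fixed hidden `skills` are all hypothetical stand-ins (real agents would keep learning between matches).

```python
import random

K = 32  # rating step size; an assumed value, not taken from AlphaStar


def expected_score(r_a, r_b):
    """Elo-style probability that the player rated r_a beats the one rated r_b."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))


def update(r_winner, r_loser, k=K):
    """Return new (winner, loser) ratings: the winner ranks up, the loser ranks down."""
    gain = k * (1.0 - expected_score(r_winner, r_loser))
    return r_winner + gain, r_loser - gain


def run_ladder(skills, rounds, seed=0):
    """Play random pairings between agents and track their ladder ratings.

    `skills` maps a (hypothetical) agent name to a fixed hidden strength that
    decides match outcomes; in a real league the agents would be improving.
    """
    rng = random.Random(seed)
    names = list(skills)
    ratings = {name: 1000.0 for name in names}
    for _ in range(rounds):
        a, b = rng.sample(names, 2)
        p_a_wins = skills[a] / (skills[a] + skills[b])
        winner, loser = (a, b) if rng.random() < p_a_wins else (b, a)
        ratings[winner], ratings[loser] = update(ratings[winner], ratings[loser])
    return ratings


if __name__ == "__main__":
    league = {"agent_a": 9.0, "agent_b": 3.0, "agent_c": 1.0}
    for name, rating in sorted(run_ladder(league, rounds=500).items()):
        print(f"{name}: {rating:.0f}")
```

Because every point the winner gains is taken from the loser, the total rating in the toy league stays constant, and stronger agents drift toward the top of the ladder over many matches.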
After up to 200 years' worth of gameplay training per agent, a final agent was created by combining and condensing the top-performing strategies discovered (Silver). DeepMind then decided to pit it against top professional StarCraft 2 players. For a match against humans to be fair, AlphaStar should have its view restricted and its APM limited, akin to the human experience. Google DeepMind addressed the first by training a version of AlphaStar that must control the camera view itself rather than rely on its previous “omnispective god mode”. The second turned out to be already satisfied: because it had learned from human replays, AlphaStar played at a mean of only 280 APM, much lower than the approximate Korean-pro average of 350 APM.
Even while playing at a human-like pace, AlphaStar astonished the world. In its match against the professional TLO, a top Zerg player who also plays Protoss at Grandmaster level, AlphaStar mercilessly defeated the pro 5 to 0, exhibiting human-like plays throughout the series as well as some intriguing, previously unseen strategies. These strategies are perhaps discoveries beyond current human understanding. Just a week later, AlphaStar was pitted against MaNa, an absolute top player ranked 13th in the 2018 world championship and a top-10 Protoss player. Yet again, AlphaStar managed to bring down the great master, displaying diverse advanced strategies, deployments, and insane micromanagement of its units. GGs were typed in the chat, and the AI claimed the triumph, again scoring 5 to 0.
AlphaStar's glorious feats are also heroic feats of cutting-edge AI research. DeepMind has once again erased people's doubts, proving the potential of AI and its applications by mastering one of the most remarkable and complicated video games ever created. Players and non-players alike should appreciate the meticulous hard work of the researchers and, together as a species, look forward to the benefits and contributions that AI will bring to humanity in the coming years.