For years now, AI researchers have been leveraging game environments to train computer models to react to complicated scenarios and make decisions accordingly. In some ways, the trial-and-error process mimics how children learn about the world around them.
With the unprecedented growth of computational power and funding, artificial agents trained using machine learning have not only mastered classic games such as Chess, Go, and StarCraft; they have proven capable of leaping to superhuman performance levels.
In game theory, a perfect information game such as Go is a contest in which each player can see everything that will affect the outcome. Imperfect information games, where important information is hidden and players must evaluate all possible outcomes when making decisions, are much more difficult environments.
Mahjong is a perfect example of an imperfect information game. The tile-based strategy game is commonly played by four players using a set of 144 tiles based on Chinese characters and symbols. Players take turns drawing and discarding tiles until one can build a winning hand with 14 tiles.
Machine learning researchers are increasingly using imperfect information game environments to examine and develop artificial agents. Similar to StarCraft, Mahjong combines skill, strategy, calculation and sometimes a little bit of luck.
University of Technology Sydney Professor Sanjiang Li and Researcher Xueqing Yan recently published a mathematical and AI study of Mahjong. The pair used a basic ruleset, “Mahjong – 0,” played with three types of tiles:
- Bamboos: B1, B2, …, B9, each with four identical tiles
- Characters: C1, C2, …, C9, each with four identical tiles
- Dots: D1, D2, …, D9, each with four identical tiles
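The size of this reduced tile set follows directly from the list above: three suits, nine ranks per suit, and four copies of each tile. The short Python sketch below enumerates it. The names `SUITS`, `RANKS`, `COPIES` and `build_tile_set` are illustrative only and do not come from the paper.

```python
from collections import Counter

SUITS = ("B", "C", "D")  # Bamboos, Characters, Dots
RANKS = range(1, 10)     # ranks 1 through 9
COPIES = 4               # four identical tiles of each kind


def build_tile_set():
    """Return every tile in the reduced set as a string such as 'B1'."""
    return [
        f"{suit}{rank}"
        for suit in SUITS
        for rank in RANKS
        for _ in range(COPIES)
    ]


tiles = build_tile_set()
print(len(tiles))            # 108 tiles in total (3 * 9 * 4)
print(len(set(tiles)))       # 27 distinct kinds of tile
print(Counter(tiles)["B1"])  # 4 copies of each kind
```

That gives 108 tiles, compared with the 144 used in the full game.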
The machine takes turns drawing and discarding tiles, tasked with evaluating the quality of its hand and deciding which tile to discard in order to increase the chance of completing a winning hand.
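The article does not spell out what counts as a winning hand. Assuming the common convention that a complete 14-tile hand consists of four melds (three identical tiles, or three consecutive tiles of one suit) plus one pair, a completeness check might look like the sketch below. The function names `melds_only` and `is_complete` are hypothetical, and this is not presented as the authors' algorithm.

```python
SUITS = ("B", "C", "D")  # Bamboos, Characters, Dots


def melds_only(counts):
    """Return True if one suit's rank counts split entirely into melds."""
    # Find the lowest rank still present in this suit.
    for i, c in enumerate(counts):
        if c > 0:
            break
    else:
        return True  # nothing left, so everything was grouped
    # Option 1: use three identical tiles of the lowest rank.
    if counts[i] >= 3:
        counts[i] -= 3
        ok = melds_only(counts)
        counts[i] += 3
        if ok:
            return True
    # Option 2: use the lowest rank as the start of a three-tile run.
    if i <= 6 and counts[i + 1] > 0 and counts[i + 2] > 0:
        for j in (i, i + 1, i + 2):
            counts[j] -= 1
        ok = melds_only(counts)
        for j in (i, i + 1, i + 2):
            counts[j] += 1
        if ok:
            return True
    return False


def is_complete(hand):
    """Assumed rule: 14 tiles forming four melds plus one pair."""
    if len(hand) != 14:
        return False
    counts = {s: [0] * 9 for s in SUITS}
    for tile in hand:
        counts[tile[0]][int(tile[1:]) - 1] += 1
    if any(c > 4 for suit in counts.values() for c in suit):
        return False  # no tile kind has more than four copies
    # Try each possible pair, then check that the rest is all melds.
    for suit in SUITS:
        for rank in range(9):
            if counts[suit][rank] >= 2:
                counts[suit][rank] -= 2
                ok = all(melds_only(counts[s]) for s in SUITS)
                counts[suit][rank] += 2
                if ok:
                    return True
    return False


hand = "B1 B2 B3 B4 B5 B6 B7 B8 B9 C1 C1 C1 D5 D5".split()
print(is_complete(hand))  # True
```

An agent deciding which tile to discard could use a check like this as a building block, for example by asking how many tile changes separate the current hand from a complete one.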
The research paper also proposes possible future approaches for developing AI for Mahjong:
- Extend the basic version of Mahjong – 0 to include more tiles, such as winds and dragons
- Adjust the set of legal 14-tile hands according to different rules
- Consider that different complete 14-tile hands can have different scores
Mahjong remains one of the most popular imperfect information games yet to be studied by AI researchers. Last month, Google AI and DeepMind open-sourced their Hanabi Learning Environment for collaborative multi-agent learning research, based on the cooperative two-to-five-player card game Hanabi. They also proposed an experimental framework for the research community to evaluate algorithmic advances and assess the performance of current state-of-the-art techniques.
Following on the successes of powerful artificial agents such as AlphaZero and AlphaStar in Go and StarCraft, might we soon see an AlphaMahjong rise to superhuman level?
Although games continue to be a popular testing ground for teaching various algorithms or agents to perform a variety of tasks, they are generally regarded as a step toward a greater goal. As DeepMind Research Scientist Oriol Vinyals puts it: “The mission of DeepMind is to build artificial general intelligence.”
The paper Let’s Play Mahjong is on arXiv.
Journalist: Fangyu Cai | Editor: Michael Sarazen