Nowhere is the idea that artificial intelligence outperforms humans more tangible than in the world of board games such as chess and go. From one day to the next, AI made the leap from amateur player to grandmaster. What is really behind it?
On 15 March 2016, the computer programme AlphaGo won the final game of a five-game match against the South Korean Lee Sedol, one of the best go players of the time, deciding the match 4-1. The completely unexpected victory is seen as a major breakthrough in the development of AI. South Korea was so shocked by it that it promptly decided to invest hundreds of millions of dollars more in AI. According to sources cited by The New York Times, the breakthrough was also decisive for China’s ambition to be the world leader in AI by 2030.
In chess and checkers, computers had surpassed humans much earlier. Chess computer Deep Blue’s victory over world chess champion Garry Kasparov in 1997 made world news. For chess players themselves, it was not a huge shock: it had simply been coming for years.
Yet chess players, too, were confronted with the new powers AI developed in their game. In 2017, Google’s techies had AlphaZero play chess against the strongest ‘classical’ chess programme of the day, which it defeated by an overwhelming margin. The programme had learned to play chess this impressively by playing games against itself for just nine hours.
Remarkably, AlphaZero turned out to play in a far more human and interesting style than the strongest chess programmes to date, as if it understood the game on a deeper level – which, of course, it did not really.
But why did AlphaGo and AlphaZero suddenly perform so differently from well-known programmes? To answer that question, we need to look in a little more detail at how we taught computers to play these games.
Searching trees
Board games like chess, checkers and go have always been a favourite subject within artificial intelligence research. They are games in which both players have the same information about the position at all times, unlike card games such as bridge and poker; and chance plays no role, as it does in backgammon. In 1928, the mathematician John von Neumann laid the foundation for AI research on these games in his paper Zur Theorie der Gesellschaftsspiele.
So how does such a programme work? In essence, it’s quite simple: first it looks at which moves the player can make on their turn, then it looks at what the opponent can do in response to each of those moves, then it looks again at what is possible after all those different answers, … and you soon find that the tree of variants becomes huge. A game of tic-tac-toe takes nine moves at most, which gives you over 250,000 different games, but for a computer program it is still a piece of cake to calculate them all.
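To make that tree of variants concrete, here is a minimal sketch in Python (my own illustration, not code from any real engine) that walks the complete game tree of tic-tac-toe and counts every possible game. It arrives at 255,168 – the ‘over 250,000’ mentioned above.

```python
# Count every possible game of tic-tac-toe by walking the full
# tree of variants. A game ends as soon as one side completes a
# line or the board is full.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals

def winner(board):
    for a, b, c in WIN_LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None

def count_games(board, player):
    if winner(board) is not None or all(s is not None for s in board):
        return 1                                 # game over: count it
    total = 0
    for i in range(9):
        if board[i] is None:                     # try every legal move...
            board[i] = player
            total += count_games(board, 'O' if player == 'X' else 'X')
            board[i] = None                      # ...and take it back
    return total

print(count_games([None] * 9, 'X'))              # prints 255168
```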
In chess, however, it is different. From the initial position, after three moves back and forth, you already have more than nine million possible positions. And in go the number of possibilities is several orders of magnitude higher; there, the number of branches is astronomical. In these games, a program can only calculate a limited part of the tree of variants, and many branches not deeply enough to arrive at a win, a loss or a draw. So, when writing a program, there are two challenges: how do you evaluate a position at the ends of your tree of variants, and which parts of that tree do you search the deepest?
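The first of those challenges can be illustrated by extending the toy sketch above (again my own code, not any engine’s actual algorithm): a search that stops at a fixed depth and, where it has to stop before the game is decided, falls back on a rough evaluation function instead of a definite result. Tic-tac-toe keeps the example small; a real chess program works the same way, only with a vastly more elaborate evaluation.

```python
# Depth-limited search in 'negamax' form: a position's value is the
# negation of its value for the opponent. At depth 0 we no longer
# calculate further but guess, using a crude heuristic.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
             (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def evaluate(board, player):
    """Heuristic: lines still open to me minus lines still open to
    the opponent. A stand-in for a real evaluation function."""
    other = 'O' if player == 'X' else 'X'
    mine = sum(all(board[i] != other for i in line) for line in WIN_LINES)
    theirs = sum(all(board[i] != player for i in line) for line in WIN_LINES)
    return mine - theirs

def negamax(board, player, depth):
    other = 'O' if player == 'X' else 'X'
    for a, b, c in WIN_LINES:                    # has anyone already won?
        if board[a] is not None and board[a] == board[b] == board[c]:
            return 100 if board[a] == player else -100
    moves = [i for i in range(9) if board[i] is None]
    if not moves:
        return 0                                 # board full: a draw
    if depth == 0:
        return evaluate(board, player)           # cut off: guess, don't calculate
    best = -1000
    for m in moves:
        board[m] = player
        best = max(best, -negamax(board, other, depth - 1))
        board[m] = None
    return best
```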
Alan Turing wrote probably the very first chess programme, Turochamp, in 1948. In the absence of computers, the steps of his algorithm had to be executed on paper. A move thus took more than half an hour, but it worked! Unfortunately, only one original game has survived, but in 2004 chess software producer ChessBase managed to make a reconstruction of Turing’s algorithm.
Turing’s evaluation function was very similar to how you learn to look at a position as a novice chess player. Among other things, it counts ‘the wood’: he gave a pawn the value 1, a knight 3, a bishop 3½, a rook 5 and a queen 10. Turochamp obviously does not calculate very deeply: one move back and forth, and it calculates a bit further only in those variants where there is then a more or less forced move (an immediate recapture, dealing with a check).
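That material count is easy to express in a few lines of Python. The values are Turing’s; the piece letters (uppercase for White, lowercase for Black, as in standard chess notation) and the example position are my own illustration.

```python
# Turing's piece values; the king is not counted, since losing it
# simply ends the game.
VALUES = {'P': 1, 'N': 3, 'B': 3.5, 'R': 5, 'Q': 10}

def material(pieces):
    """Material balance for a string of piece letters on the board:
    positive means White is ahead, negative means Black is."""
    score = 0.0
    for p in pieces:
        value = VALUES.get(p.upper(), 0)
        score += value if p.isupper() else -value
    return score

# Example: White has a queen and a rook (15), Black two rooks and a
# knight (13), so White is two points of wood ahead.
print(material('QRrrn'))  # prints 2.0
```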
Basically, Deep Blue, almost 50 years later, did not do much differently from Turochamp. It managed to beat Kasparov by combining the brute computational power of special hardware with an excellent position evaluation devised by a development team that included strong chess players. In addition, the program used other human knowledge in the form of an opening database and databases for endgames with just a few pieces, which had already been fully calculated to the end.
In Deep Blue, artificial intelligence mainly meant that a program mimicked the thinking process of human chess players much faster and more accurately, using as input the accumulated experience we had gained from studying the game for a long time. Even with Deep Blue’s successors, we saw little real development in this respect. But it can be done differently, with self-learning algorithms.
The magic of go
Although anyone could soon run a chess program on their smartphone that no grandmaster could beat, go still seemed too difficult. In 2014, WIRED published an article on the state of affairs in which experts said it would take another ten years before the computer would really be a match for humans at go. A couple even said in that story: computers will never succeed.
Go is more difficult for a computer than chess because of the much larger number of possibilities, but the evaluation function also poses problems. People often turn out to be unsure themselves why they think one move is better than another. ‘It’s something subconscious, that you train through years and years of playing. I’ll see a move and be sure it’s the right one, but won’t be able to tell you exactly how I know. I just see it,’ says Michael Redmond, the only Western player ever to reach the highest level, in the WIRED story. This also gives the game something elusive, almost magical. Lee Sedol in the same article: ‘In the Western world, you have chess, but go is incomparably more subtle and intellectual.’
The involvement of DeepMind (acquired by Google in 2014) changed this completely. The company develops neural networks for all sorts of purposes: predicting how proteins fold, language applications, analysing medical scans. Using neural networks for games is an attractive way to demonstrate its capabilities to a large audience.
Neural networks
In training AlphaGo, DeepMind used two neural networks: one that predicts moves based on known games, and another that builds an evaluation function. Step one was the learning material for the networks: as many as 30 million moves by high-level players were shown to the machine. Then the developers made the program play many, many games against itself; this way, the most successful strategies could become ingrained.
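For the curious, here is a toy sketch of what those two networks could look like, written in PyTorch (my choice of library; the sizes are purely illustrative and far smaller than the real thing): the policy network maps a position to a probability for every point on the board, the value network maps a position to a single estimate of who is winning.

```python
import torch
import torch.nn as nn

BOARD = 19 * 19  # points on a go board

# 'Policy' network: position in, a probability per possible move out.
policy_net = nn.Sequential(
    nn.Linear(BOARD, 256), nn.ReLU(),
    nn.Linear(256, BOARD), nn.Softmax(dim=-1),
)

# 'Value' network: position in, one number out – the evaluation.
value_net = nn.Sequential(
    nn.Linear(BOARD, 256), nn.ReLU(),
    nn.Linear(256, 1), nn.Tanh(),     # -1: sure loss, +1: sure win
)

position = torch.zeros(1, BOARD)      # empty board: 0 empty, +1/-1 stones
move_probs = policy_net(position)     # which moves look promising?
evaluation = value_net(position)      # how good is this position?
```

During training, the weights in both networks are adjusted again and again; during the match against Sedol they were, as described below, frozen.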
After this training, the evaluation function was ‘frozen’: while playing the match against Sedol, AlphaGo did not learn anything new. No human knowledge of the game was embedded in the evaluation function, and it is actually a black box for us as well. AlphaGo sometimes played moves that top players could find no reason for at all based on their experience, but which turned out to be excellent as the game progressed.
After its success at go, DeepMind developed a more general network. This model, AlphaZero, trains itself without being shown human games first. In 2017 it beat Stockfish, the strongest classical chess program at the time (playing against a human was, after all, no longer a real challenge).
The research and the match games can be read back in a paper published in Science in 2018. While the nine-hour workout may strike you as very short, it is worth remembering that a lot of computing power was used for it: AlphaZero used a staggering 1,920 computer processors and 280 video cards to play as many as 44 million games against itself. The programme is not available to the ordinary chess player, but there are now open-source programmes, such as Leela Chess Zero, based on the same principles.
For chess, AI has brought new impetus at the top level; in go, things seem to be a bit different for the time being. Chess players have slowly been able to get used to the role of computers. The programmes gradually got better and better, strong chess players learned to use them in preparation for their own games, and they even gained new insights through AlphaZero’s play that can be seen in grandmaster games.
In go, it went in one fell swoop from programmes that competed nicely at the amateur level to the elusive and virtually unbeatable AlphaGo. In 2019, three years after his defeat, Lee Sedol threw in the towel as a professional. He recently told the New York Times that he had been utterly surprised that the programme could beat him: ‘I could no longer enjoy the game, so I retired.’
Translation of the article I wrote for Skepter 37.3 (2024)