Standing on the shoulders of giants

by Albert Silver
9/18/2019 – The expression conveys the idea that someone’s work and achievements were only possible thanks to the predecessors they built upon, and this is unquestionably true of Fat Fritz. You may think you know when or where the tale starts, but what if we told you that it all really started with a visionary computer scientist 60 years ago, in 1959? | Photo: IBM


The start of the revolution

When you think of AI or machine learning you may conjure up images of AlphaZero, or even some science fiction reference such as HAL 9000 from 2001: A Space Odyssey. However, the true forefather, who set the stage for all of this, was the great Arthur Samuel.

Samuel was a computer scientist, visionary, and pioneer, who wrote the first checkers program for the IBM 701 in the early 1950s. His program, "Samuel’s Checkers Program", was first shown to the general public on TV on February 24th, 1956, and the impact was so powerful that IBM stock went up 15 points overnight (a huge jump at that time). This program also helped set the stage for all the modern chess programs we have come to know so well, with features like look-ahead, an evaluation function, and a minimax search that he would later develop into alpha-beta pruning. So while he may have been one of the forefathers of chess engines such as Stockfish, how does that make him a forefather of AI?
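The look-ahead and pruning ideas Samuel pioneered remain the skeleton of classical engines to this day. A minimal sketch in Python may make them concrete — the toy game tree, evaluation scores, and depth below are invented purely for illustration, not taken from any real engine:

```python
def alphabeta(node, depth, alpha, beta, maximizing, evaluate, children):
    """Minimax search with alpha-beta pruning.

    `evaluate` scores a leaf position; `children` lists a node's successors.
    Branches that cannot influence the final choice are cut off as soon as
    beta <= alpha -- the pruning that Samuel's work foreshadowed.
    """
    kids = children(node)
    if depth == 0 or not kids:
        return evaluate(node)
    if maximizing:
        best = float("-inf")
        for child in kids:
            best = max(best, alphabeta(child, depth - 1, alpha, beta,
                                       False, evaluate, children))
            alpha = max(alpha, best)
            if beta <= alpha:
                break  # opponent would never allow this line
        return best
    else:
        best = float("inf")
        for child in kids:
            best = min(best, alphabeta(child, depth - 1, alpha, beta,
                                       True, evaluate, children))
            beta = min(beta, best)
            if beta <= alpha:
                break
        return best

# Toy two-ply tree: the leaves carry hypothetical evaluation scores.
tree = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
scores = {"a1": 3, "a2": 5, "b1": 2, "b2": 9}
value = alphabeta("root", 2, float("-inf"), float("inf"), True,
                  lambda n: scores.get(n, 0), lambda n: tree.get(n, []))
```

In this toy position the search settles on a score of 3, and note that leaf `b2` is never evaluated: once branch `b` is shown to be worse than branch `a`, it is pruned.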

Arthur Samuel plays checkers with an IBM 704 computer in Poughkeepsie, New York | Photo: history-computer.com

Arthur Samuel was not content to simply develop the world’s first checkers program, and he began to develop the first techniques of actual machine learning — a term he himself coined — including experiments in which his program played itself thousands of times to try to improve. He laid the groundwork for future developments in the field of reinforcement learning, and published his seminal paper “Some Studies in Machine Learning Using the Game of Checkers” in July 1959. He began with rote learning, but soon moved on to techniques that would become the precursor to Temporal Difference Learning, which would lead to a pivotal point in reinforcement learning 30 years later.
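The temporal-difference idea Samuel anticipated can be stated in a few lines: nudge the estimated value of a position toward the value of the position that follows it. A minimal TD(0) sketch — the states, rewards, and learning rate here are made up for illustration:

```python
def td0_update(values, state, next_state, reward, alpha=0.1, gamma=1.0):
    """One TD(0) step: move V(s) toward reward + gamma * V(s')."""
    td_error = reward + gamma * values[next_state] - values[state]
    values[state] += alpha * td_error
    return values

# Hypothetical three-state episode: start -> mid -> win (reward 1 at the end).
values = {"start": 0.0, "mid": 0.0, "win": 0.0}
td0_update(values, "mid", "win", reward=1.0)    # "mid" learns from the win
td0_update(values, "start", "mid", reward=0.0)  # "start" learns from "mid"
```

After these two updates, credit for the win has already begun trickling back from the final position toward the opening one — the same backing-up of evaluations that, scaled to a neural network, powered TD-Gammon.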

While Samuel’s program still relied on hand-crafted values that it merely fine-tuned, it wasn’t until 1992 that the first truly ‘Zero’ neural network was developed for a game: Gerald Tesauro’s groundbreaking TD-Gammon. The image on the left, from IBM's Think Magazine, December 1992, shows Gerald operating his backgammon program.

TD-Gammon was the first time true model-free reinforcement learning (this is what DeepMind’s ‘Zero’ means) was applied to a board game, and from the very start it produced extraordinary results. The idea of model-free reinforcement learning is that the neural network, the AI’s ‘brain’, starts with only bare-bones knowledge of the game, and must learn how to play it entirely on its own.

The concept was not invented by the IBM researcher; even Tesauro was building on the work of Richard Sutton [pictured right, from his home page], a distinguished research scientist at DeepMind and a professor of computing science at the University of Alberta.

Nevertheless, Gerald Tesauro was the first person to apply model-free reinforcement learning to a board game. TD-Gammon used a combination of techniques to create a neural network model that learned from pure self-play games, from which it developed its own strategies. Prior to it, backgammon programs had been pitifully weak, trounced by experts with ease. TD-Gammon developed a very fine positional sense, and its strategies were to become the foundation of modern backgammon theory. After 1.5 million self-play games, it had reached a standard that could challenge the best players of the day. Two-time backgammon world champion Bill Robertie wrote a book about his time studying and playing with it called “Learning from the Machine”. This concept was to be the same one that informed Matthew Sadler’s book on AlphaZero, Game Changer, over 25 years later. Also, much like DeepMind was to do decades later, Dr. Tesauro published his work and all the details in a scientific paper, “Temporal Difference Learning and TD-Gammon”, in the Communications of the ACM in 1995. Today you can find a version ported to TensorFlow.

This changed everything, and it led to the birth of ever greater backgammon neural networks that could provide world-class competition as well as world-class analysis. The first great program to follow and raise the standard was Jellyfish, after which came Snowie, and even a magnificent open-source project: GNU Backgammon, which to this day is the second strongest backgammon software available. It too can be found at its source site. For documentation, refer to my online manual, “All About GNU”.

DeepMind, Go and Leela

Curiously, in spite of the enormous success in backgammon, there was no comparable follow-up for other board games. In chess, the earliest fully developed neural network engine was Stoofvlees, a 2007 program by Gian-Carlo Pascutto, trained from grandmaster games. While considerably ahead of its time in concept, it lacked the hardware needed to run at speeds that might make it truly competitive. Years later, another neural network effort would be made: Giraffe by Matthew Lai, who soon joined the DeepMind team that developed AlphaZero.

AlphaGo logo

It wasn’t until 2016 that we saw a repeat of history of sorts, as DeepMind used a new and powerful technique known as Deep Reinforcement Learning to produce the breakthrough program AlphaGo. Although it demonstrated clear superiority over one of the top Go players in the world — the Korean genius Lee Sedol — it still used a combination of human games and self-play games. The project, headed by David Silver, published a paper, and it was an immediate bolt of lightning to the Go world: top commercial programs such as Crazy Stone all came out with Deep Learning versions based on the new techniques.

Online, a new free program came out with a nice interface that also implemented these new techniques, and its author was none other than Gian-Carlo Pascutto (pictured right). He had already written a top engine called Deep Sjeng using the standard engine techniques employed by Fritz, Rybka, Houdini and others, but had since moved into computer Go. Here was a field that had been wide open prior to AlphaGo, with barely a program that could play at master level, much less world champion caliber. The name of his pet project was Leela.

Then in 2017, DeepMind took a bold step forward by adopting an advanced neural network structure developed at Microsoft Research called Residual Networks, or ResNets for short. In their paper, DeepMind attributes no less than 600 Elo of improvement to this structural change. It also marked a shift to the model-free reinforcement learning used by Tesauro, which they dubbed ‘Zero’. This meant that the new Go program was free of any outside knowledge or biases, and was taught nothing beyond the rules of the game. Everything it learned would be the product of massive self-play and the conclusions it reached on its own. Tens of millions of games later, AlphaGo Zero, as this new version was called, was head and shoulders above AlphaGo, which had already been the unchallenged number one. Analysts, including some of the best players in the world, were blown away. They saw clear areas where the groundwork for new theory was being laid, and thanks to the fresh scientific article detailing it all, there was now a roadmap to this Holy Grail of Go.
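The residual idea itself is disarmingly simple: instead of learning a full mapping directly, each block learns a correction that is added back to its input, y = x + F(x), which lets information and gradients flow cleanly through very deep stacks. A toy sketch in plain Python — the fixed transform below is an arbitrary stand-in for a learned layer, not DeepMind’s actual architecture:

```python
def residual_block(x, transform):
    """y = x + F(x): the block outputs its input plus a learned correction."""
    fx = transform(x)
    return [xi + fi for xi, fi in zip(x, fx)]

# Stand-in "layer": a fixed elementwise transform (a real network learns this).
double_shift = lambda v: [2 * vi + 1 for vi in v]

x = [1.0, 2.0, 3.0]
y = residual_block(x, double_shift)                      # [4.0, 7.0, 10.0]
identity = residual_block(x, lambda v: [0.0] * len(v))   # F(x) = 0 gives y = x
```

The second call shows why deep residual stacks train so well: a block that has learned nothing (F(x) = 0) simply passes its input through unchanged, so adding more blocks can never make the network worse at representing what it already knows.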

Again, Gian-Carlo Pascutto took up the gauntlet, but faced a new challenge: resources. While DeepMind might boast supercomputers to generate the tens of millions of games needed to create their god of Go, Pascutto had no such option, and the problem was daunting. Consider that even with the most powerful computer on the market, with the most powerful GPU (graphics processing unit) needed to accelerate the work, only a few hundred games per hour could be created. There might be a roadmap to the Holy Grail, but it would require a decade to get there. He therefore created a new open-source project called Leela Zero and added a brilliant idea: a client program anyone could download, which would generate self-play games and automatically send them to a centralized server. The idea was to leverage the computing power of fans and dreamers alike and let them help build this community-driven AlphaGo Zero neural network, available to all. Needless to say, the actual search and training algorithms were those published by DeepMind in their papers. Pundits from chess were fascinated, envious, and in complete agreement: these newfangled neural networks might solve Go, which is more pattern recognition than brute calculation, but they would not work for chess. Right?

AlphaZero and Leela Chess Zero

To call it a clap of thunder in a clear blue sky would be a gross understatement. In December 2017, DeepMind announced AlphaZero, a new, improved version of their AlphaGo Zero template, this time applied to three games: Go, shogi, and… chess! The result: slack-jawed chess players around the world. DeepMind claimed not only to have produced a neural network that played chess at the highest level, but one that did so under extraordinary conditions. First of all, it used almost 1000 times fewer nodes per second than the reigning number one, Stockfish — a handicap that would cost at least 500 Elo were Stockfish itself trying this stunt. Second, it played positional chess to make a top GM weep with joy, with speculative plays and attacks that no conventional engine would try (unless configured with suicidal settings).

Had this been their first announcement they might have been met with derision and disbelief, but with the track record of AlphaGo behind them, it just set chess players dreaming. Sergey Karjakin, the World Championship challenger of 2016, said that he would pay a million dollars to have AlphaZero. Whether or not this was a genuine monetary offer, the spirit of the comment was widely shared. Once again, DeepMind published the recipe to their success.

Demis Hassabis

An outsider might wonder why DeepMind would even bother, given the massive success and publicity they had garnered from AlphaGo, but the key lay in the founder of DeepMind: Demis Hassabis. Hassabis had been a chess prodigy in his own right, reaching master level (Elo 2300+) at the age of 13. While he may have strayed from the chess path as a professional player, his first love was not to be forgotten. With the success of AlphaGo Zero, and the development pipeline by now so finely tuned, it was time to see if something special could be done in chess as well. The rest, as they say, is history.

Almost immediately after the news and the publication of the first draft of the paper, someone tried to convert Gian-Carlo Pascutto’s open-source Leela Zero code to chess, but the result was a non-functional mess. This time Gary Linscott (pictured right) came to the rescue, a programmer who was a main developer of and contributor to the chess engine Stockfish. He cleaned up the code, and with the help of other skilled programmers equally eager to see it work, they set about making the transition from Go to chess. As one of them later commented, though, without the Leela Zero codebase to work from, they would have been unlikely even to attempt this mammoth project. I wrote about this development in my article Leela Chess Zero: AlphaZero for the PC in April 2018.

Within about a month the code had been tested and worked, and, giving credit where credit was due, it was called Leela Chess Zero. Just as in the case of Leela Zero (the Go program), a key component was to leverage community resources via a client program that anyone could run on their own computer. Again, producing sufficient games to train the project posed a major challenge. DeepMind claimed to have needed roughly 44 million games to train the AlphaZero chess neural network, and while they managed this feat in a matter of hours with the help of 5,000 processors specially designed for AI development, the average Joe could come nowhere near this. Nevertheless, LCZero had been born.

Since there was no desire to reinvent the wheel, a number of basic pieces of Stockfish code were used to enable the switch to chess, such as the move generator. Another, more significant problem was the raw speed of the search and the binary. Using their custom-made AI processors known as TPUs, DeepMind had not really needed to worry about speed. With just four TPUs they reached an average of over 60,000 nodes per second, while a user with a 1080 Ti, the most powerful GPU at the time, could barely hope for 2,000 nodes per second with a similar network running in the LCZero binary. Two key contributors were to team up and solve these problems: Alexander Lyashuk and Ankan Banerjee.

Alexander Lyashuk [left, image from Chessprogramming.org], also known as Crem on the official Discord channel, is a lead programmer in the Leela Chess project, overseeing the contributions made by others and vetting the quality of the code to keep it from getting bogged down in bugs and issues. He was also the original author of the new lc0 (as in Leela Chess Zero) binary, which replaced the old Stockfish code with clean code that would no longer raise eyebrows regarding its origins.

Ankan Banerjee [image right, from Chessprogramming.org] is a programmer who has written drivers for Nvidia video cards, and with his singular knowledge and skills he hand-wrote code that leverages their speed, leading to a five- to ten-fold increase. Suddenly that same 1080 Ti was producing upwards of 9,000 nodes per second instead of the previous 1,000-2,000, and when the new generation of cards came out supporting FP16, he wrote further code so that the new RTX 2080 was flying at 30,000 nodes per second.

While many continued to fine-tune and debug the main binary, it still used the core search advocated and designed by DeepMind. Naturally, modern additions were included such as tablebase support, but it is still AlphaZero at its heart.

Progress was steady, if not without hiccups. Some of the training parameters at first had to be guessed at and tested, since the preprint that DeepMind posted gave the essentials but failed to share some seemingly minor, yet important, details. Eventually, with time, the biggest issues were overcome, multiple iterations of the main network were trained, and it realized its dream of defeating Stockfish under the rigorous conditions of TCEC, a highly respected online computer tournament.

Deus X and Fat Fritz

I entered the fray around March of 2018 as merely one more enthusiast, fascinated and confused. When I read about DeepMind’s disavowal of outside content such as human or engine games, I was mystified. Granted, computer Go before AlphaZero was almost rudimentary by comparison, so using engine games from other sources made little sense there, but chess was different. It had enormous databases of incredibly high-quality games, whether human or engine, which surely surpassed the quality of a self-play game played in seconds. I approached numerous experts and tried to convince at least one to undertake the experiment, but no one would bite. Not only was there an almost instant refusal based on the word of DeepMind, but there was a secondary school I came to dub ‘zeroists’, who refused even to consider anything that was not ‘zero’ — meaning it had to adhere to the philosophy of zero outside input of any form.

As I explored the codebase it dawned on me that I might be able to try this on my own. That is the beauty of open-source, and in this case the tools were mostly in place. The training code used to train the Leela Chess network, itself based on the AlphaZero paper, would gladly accept games from any source, so long as they fit the format it expected. In theory, the binary could convert them, but since the approach was deemed a dead end, the code was broken and no one had any interest in repairing it. One contributor familiar with it, Dietrich Kappe, came to my rescue; he had been playing with this idea himself and shared fixes that allowed PGNs to be converted into trainable content.

Another substantial challenge was that I had nowhere near the 44 million games AlphaZero used. This time, scrutinizing the literature on neural network training, I came across methods that differed from those used by AlphaZero, and by extension Leela, promising to accelerate both development and the quality of the result. Over the course of months of constant experimentation, this and many other changes led to surprising results. This first neural network was called Deus X (i.e. Deus Ex Machina) and was trained entirely on human games, achieving a success beyond what I had hoped. Running in the lc0 binary as a new neural network module, it played in a strong, distinctive style that was the result of its study of human games.

Encouraged by the feedback of observers and the Leela developers, I went on to pursue more ambitious projects. One big help was Daniel Uranga, an Argentine programmer, who made numerous changes to both the binary and the training code to aid my efforts. After all, the purpose of my efforts was not to reproduce AlphaZero or Leela, but to try something different. Eventually this larger and more ambitious project began to shine, and ChessBase threw in its support as well, which is what led to Fat Fritz.

"What on Earth is that?"

Still, much like every project in this article, while Fat Fritz stands unquestionably on the shoulders of giants, it too brings something new to the table, and will enrich your chess analysis and enjoyment.

Born in the US, he grew up in Paris, France, where he completed his Baccalaureat, and after college moved to Rio de Janeiro, Brazil. He had a peak rating of 2240 FIDE, and was a key designer of Chess Assistant 6. In 2010 he joined the ChessBase family as an editor and writer at ChessBase News. He is also a passionate photographer with work appearing in numerous publications.
Discussion and Feedback

Bertman Bertman 9/20/2019 08:54
@RayLopez First of all, no relation to my namesake. Second, there is no Alpha-Beta in it. It uses MCTS, which averages the scores of the options, rather than seeking a final best move and issuing its score. Also, it chooses moves by the number of visits. I recommend you read the paper on AlphaZero linked in the article as it will provide more information. Lastly, you are 100% correct that 'zero' does not eliminate biases per se; it merely eliminates the biases of any outside source of information it might use. For example, if during its learning it decides that the Ruy Lopez Berlin is the best opening, you can be sure that the vast majority of its self-play games will use that opening, meaning there is a bias, just not a human or external bias.
Mr TambourineMan Mr TambourineMan 9/19/2019 09:16
Yes, Lachesis, I thought about it first! And I feel extremely honored to just be mentioned by Frederic. It made my day!

Hallo Fred! One day I hope to be able to visit ChessBase and hope you are there then. But as slow as I am, you probably have others in top positions for the company, or you'll have retired by then. You should know that a special loyalty to ChessBase has developed since 1989, the year I got a copy of Fritz 1 on a floppy from a friend. And to be able to use it, I had to buy my first computer; it was bananas, but I coughed up €1,000 in today's monetary value for a simple IBM. Pls tell Matthias this story. Okay, nothing new. Just saying.
RayLopez RayLopez 9/19/2019 02:51
As a sometime hobbyist programmer myself (C#), I think, without even spending a few minutes to research it, this statement is misleading: "This meant that the new Go program was now free of any knowledge or biases, and was taught nothing beyond the rules of the game". Probably not. I'm pretty sure the Alpha-Beta algorithm is used in Go, just like with neural net chess programs. Alpha-Beta is nothing more than "he moves there, I do this; he moves there, I do that" for a couple of moves. Since you cannot do this for more than a few ply due to O(n!) complexity, the trick is the evaluation function at the end of your moves, which in neural networks is informed more by actual games and patterns achieved rather than strict mechanical rules and numerical evaluations like 2R = p + Q. Another way of putting this is prior chess engine programmers underestimated GM M. Tal. And is David Silver related to Albert Silver?
besominov besominov 9/18/2019 08:20
"the impact was so powerful that IBM stock went up 15 points overnight"

So that's where they got the idea to rig the match with Kasparov.
mikolov mikolov 9/18/2019 06:38
First I want to compliment Mr. Silver on a very well written article. Second I am intrigued in where will these types of engines lead us in the understanding of chess.
ortsac2014 ortsac2014 9/18/2019 06:05
Any possibility of Fat Fritz competing in TCEC?
Lachesis Lachesis 9/18/2019 01:54
Excellent question Mr TambourineMan, I wish I would have thought of it lol :)
Frederic Frederic 9/18/2019 12:57
@Mr TambourineMan: interesting thought. I will discuss with Albert and Matthias.
Mr TambourineMan Mr TambourineMan 9/18/2019 11:28
Very good article Albert! Could Fat Fritz perhaps be helped by Lets Check function?
Lachesis Lachesis 9/18/2019 11:15
I have always enjoyed and appreciated articles by Albert Silver. Wish he will write more for Chessbase in the future.