How a neural network is made

by Albert Silver
2/21/2021 – To say that Fat Fritz 2 has been making waves is an understatement. In this article the author describes the creation of this powerful new neural network, which runs inside a slightly modified Stockfish. You will also learn the difference between the search and the neural network, what makes Fat Fritz different, and all the considerations and work that went into its development.


If you ever wanted to know the details of the difference between the search and the neural network, or the challenges of such a lengthy project, the article that follows should give you a much clearer picture. That said, nothing therein contradicts any part of the description made in the author's launch article.

What is more important in Formula 1? The car or the driver? 

In the world of chess engines with neural networks, the search is that race car and engine, and the neural network is the driver. A superfast engine alone might be able to attain record speeds, but it takes a great driver's judgement and skill to steer that car to victory. Likewise, even the best driver in the world can compensate for a slow car only so much.

The search is an amazing part of any chess engine and is designed to allow the best conclusions to be reached as quickly and efficiently as possible. However, even the best search is of little use if it does not have a strong evaluation to know which moves to search, which to extend deeper, and which to reject or stop analyzing. In the world of traditional engines, Stockfish's search is the best of its kind.
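
To make the division of labour concrete, here is a deliberately minimal sketch of the idea, not Stockfish's actual code: a bare-bones negamax search with alpha-beta pruning, where `evaluate`, `legal_moves` and the `position.play` interface are stand-ins for whatever a real engine provides.

```python
# A minimal negamax search with alpha-beta pruning. The search machinery
# (recursion, cutoffs) is generic; the chess "judgement" lives entirely in
# evaluate(), which here stands in for a hand-crafted evaluation or a
# neural network. (Illustrative sketch; not Stockfish's actual code.)

def negamax(position, depth, alpha, beta, evaluate, legal_moves):
    moves = legal_moves(position)
    if depth == 0 or not moves:
        return evaluate(position)      # the "driver": judges the position
    best = -float("inf")
    for move in moves:                 # the "car": raw exploration speed
        child = position.play(move)    # assumed immutable-position interface
        score = -negamax(child, depth - 1, -beta, -alpha, evaluate, legal_moves)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:              # a sharper evaluation prunes earlier
            break
    return best
```

The better the evaluation, the earlier the `alpha >= beta` cutoffs fire, which is exactly why the same search grafted onto a stronger "brain" reaches better conclusions faster.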

While the McLaren MP4/6 was the fastest car on the circuit in its time, it was the combination with the great Ayrton Senna that brought it so many wins. | By wileynorwichphoto - Flickr: Senna @ USGP 1991, CC BY 2.0

Nevertheless, make no mistake: the same search paired with two distinct neural networks will reach different conclusions and analyze different moves to different depths. There will certainly be similar and even identical main lines at times, since the best move is still the best move, but even in perfect play there is room for style and preference.

Both Fat Fritz 1.0 and the new Fat Fritz 2.0 use the searches of the open-source Lc0 and Stockfish respectively, and when you purchase Fat Fritz you are certainly not paying for the free Lc0 and Stockfish they come bundled with. You are buying the unique Fat Fritz neural networks that come with them, which will provide new ideas that should enrich any player's analysis. While Elo is important to measure general performance, it is only one aspect, which is why multiple points of view are always useful and important. If I had my own private Karpov to analyze for me, I'd be over the moon, but I would not reject the idea of my own private Kasparov as well, would you?

What follows is a detailed account of how the Fat Fritz 2 neural network came to be, sharing all the thought and work that went into its development. A preview shared with a friend and grandmaster garnered one comment: "That's a lot of work." Indeed it was.

Fat Fritz 2 - the Lc0 net

After the successful release of the neural network Fat Fritz 1.0 in November 2019, which ran in Lc0 (the Leela Chess binary), I soon began work on its successor, a larger neural network that would run in it as well.

This new ‘Fat Fritz 2’ was to be a much larger set of weights, using a different size than the one the Leela project had been developing, and incorporating some new ideas. I tested and trained a variety of combinations over a few weeks until I hit upon the one that performed best.

By February training had already begun, and I tentatively aimed for December to finish it. The uncertainty was because producing data for a larger network is incredibly time-consuming, even with the fourteen 2080 Ti GPUs I had put to its service. The financial risk here was all mine, but passion for my work has always been a key driving force. The larger network planned meant a slowdown of over twofold, and with 80-90 net iterations planned I was lucky to be able to train one every three days.

NNUE or not?

Initial progress was swift, but then in July 2020 NNUE began making waves in the chess world. A programmer by the name of Tanuki, part of a group of Shogi programmers, had been working to bring the NNUE technology to Stockfish, as explained in the initial release article. Though he had been at it for months, it was only in July that results began to show the real promise of this new direction. Tools were made available to train one's own net, so I decided to try it out. An early build of Stockfish with an NNUE net was available, and the results were spectacular. I knew it was still early days, but I now saw that my project might reach December already 'obsolete'. I was tortured. I had already spent thousands of dollars training the current Lc0-based neural network, and if I rode the NNUE wave all of this would be for naught.

Should I finish what I had started regardless? Or should I jump headlong onto the NNUE bandwagon? If I switched, I had another problem: I could perfectly well create a new neural network to run in it, but if I had nothing fresh to bring to the table in neural network ideas, there was no point.

On the plus side, Stockfish made this possible since it is licensed under the GPL, and the entire purpose of the GPL is to allow distribution and commercialization while setting out the conditions under which this must be done. You will find it in projects everywhere, from free derivatives such as Cfish to professional apps like PlayMagnus. Still, it only made sense to go down this path if I could add genuine value by offering a strong, interesting and attractive neural network.

A lovely logo created and sent by Ali Ahmed (click on it to see the beautiful detail)

The first thing I noticed was that almost everyone in the Stockfish ecosphere seemed to be following a similar net construction plan: a network with 256 neurons, trained on Stockfish evaluations generated at 8 plies deep (four moves) and then at 12 plies deep (six moves). Progress was rapid as the talented NN trainer Sergio Vieri undertook to train the net used by the community. With huge resources at his disposal he labored on the Stockfish net for nearly two months, finally stopping in September when he was no longer able to improve it. All other attempts I saw either tried to repeat his success (none did) or to improve his net with further training (also none did). Aside from some tuning of the net's output, the net has remained untouched and unimproved since then.
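
For readers curious what "a network with 256 neurons" means in practice, here is a rough PyTorch sketch of the classic early Stockfish NNUE layout (often called HalfKP). The exact feature counts and the integer quantization used in the real engine are simplified here, so treat it as an illustration rather than the shipping architecture.

```python
import torch
import torch.nn as nn

NUM_FEATURES = 41024  # HalfKP: (own king square x piece x square) inputs per side
HIDDEN = 256          # the near-ubiquitous width discussed above

class NNUE(nn.Module):
    def __init__(self, hidden=HIDDEN):
        super().__init__()
        # feature transformer, applied to each side's sparse inputs
        self.ft = nn.Linear(NUM_FEATURES, hidden)
        self.l1 = nn.Linear(2 * hidden, 32)
        self.l2 = nn.Linear(32, 32)
        self.out = nn.Linear(32, 1)   # scalar evaluation of the position

    def forward(self, own_feats, opp_feats):
        # clipped ReLU keeps activations in [0, 1], which is what lets the
        # engine run the net in fast integer arithmetic after training
        crelu = lambda x: torch.clamp(x, 0.0, 1.0)
        acc = torch.cat([crelu(self.ft(own_feats)),
                         crelu(self.ft(opp_feats))], dim=1)
        x = crelu(self.l1(acc))
        x = crelu(self.l2(x))
        return self.out(x)
```

The "efficiently updatable" part of NNUE is that first layer: between consecutive positions only a handful of input features flip, so the engine updates the accumulator incrementally move by move instead of recomputing it from scratch.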

A few scattered efforts were made to build larger nets and smaller nets, though all met with uninspiring results. In fact, the results were so unimpressive that when asked about larger nets on the official Stockfish Discord, a community chat area, developers would reply with comments such as "It has been tried, but is a dead end." I did not disagree that it had been tried, but calling it a dead end seemed premature.

Going all-in

I decided that two things could lead to something really interesting, if they worked. The first was to use the evaluations of Fat Fritz 1.0 instead of those of an engine such as Stockfish. After all, there was no question that the residual networks introduced by DeepMind were still far more powerful, so if I could leverage the evaluations of Fat Fritz 1.0, I might end up with something quite special. This was an essential idea, but it needed technical expertise I did not possess to make happen.
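
In outline the idea is simple, even if the engineering was not. Here is a schematic sketch, with entirely hypothetical helper names, of what using the evaluations of Fat Fritz 1.0 amounts to: label a large pool of positions with the teacher net's scores and store them as training pairs for the NNUE student.

```python
# Schematic only: 'teacher' is assumed to wrap the Lc0-based Fat Fritz 1.0
# net and return a score for a position given as a FEN string; the file
# format here is purely illustrative.

def generate_training_data(positions, teacher, out_path):
    with open(out_path, "w") as f:
        for fen in positions:
            score = teacher.evaluate(fen)   # the teacher's judgement, not Stockfish's
            f.write(f"{fen};{score}\n")     # one labelled position per line
```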

Enter my close friend, Daniel Uranga, who was instrumental in allowing me to experiment with so many ideas. It was still up to me to train the nets, draw conclusions, and more, but he was the engineer who made these experiments possible. I immediately rented a fleet of powerful GPUs to start producing data for the project, knowing that whatever the choice this data would be needed.

With the implementation of Fat Fritz moves and evaluations taken care of, there was now the question of the size of the neural network. Stockfish's choice of 256 neurons was nearly ubiquitous, even adopted in the Shogi world, and the few tries at larger or smaller nets had ended in very disappointing results, at least 50 Elo behind.

Bigger works after all

It was by far the most exhausting investigation and took several weeks. I trained nets in every permutation possible: bigger nets, smaller nets, deeper nets, shallower nets, and so on. They all had to be trained deeply to be sure the results were trustworthy, and they all had to be tested at each stage over tens of thousands of games to ensure the measurements were equally reliable. Testing on only 1,000 games would mean the result could still be off by as much as 30 Elo, and I would not be able to know whether my choice was correct.
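
The 30 Elo figure is easy to sanity-check. An Elo difference follows from a match score via the standard logistic model, Elo = 400·log10(s/(1−s)), and propagating the score's statistical error through that formula gives the margin. A back-of-the-envelope check, ignoring draws (which only shrink the variance), shows why a thousand games is nowhere near enough:

```python
import math

def elo_margin(games, score=0.5, z=1.96):
    """Approximate +/- Elo error at confidence z (1.96 ~ 95%, 2.58 ~ 99%),
    treating every game as win/loss; draws would reduce the margin."""
    se = math.sqrt(score * (1 - score) / games)           # std. error of the score
    delo_ds = 400 / (math.log(10) * score * (1 - score))  # slope of Elo(score)
    return z * se * delo_ds

print(round(elo_margin(1_000)))    # ~22 Elo at 95%, close to 30 at 99%
print(round(elo_margin(50_000)))   # ~3 Elo: hence tens of thousands of games
```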

Finally I hit upon the network size used: 512 neurons, double the size of the one used inside Stockfish and other projects. Needless to say, this was a real find, and it will no doubt lead to similar endeavors by other hobbyists and developers. Its details can be found on GitHub, and while it might seem a small thing, remember that this was previously considered a dead end to explore.

Still, none of this guaranteed a great result, just a promising one. I might still come out of all this empty-handed, well behind, with only a nice ‘flavor’ network to show for it and a significant financial loss to the tune of $16,000. I had no backing in any of this, and if it failed, it would be entirely out of my own pocket.

Training and improving Fat Fritz 2

The initial training took over nine days around the clock to build the net, and that was with a fast 32-thread machine. When this phase finally concluded, the results were beyond what I had hoped: running on a binary identical to the one Stockfish 12 used, it was coming out ahead in their head-to-head matches. Obviously, this wasn't about beating Stockfish, but rather the net it used, since they both employed the same binary. Regardless of what happened after, the project was a clear success.

The next phase was the hardest and most frustrating by far. If I had thought to reproduce Sergio Vieri's impressive progression by training the net ever further using a form of reinforcement learning, I was disabused of the notion very quickly, despite employing over 700 fast computer threads to produce special data from this new NNUE net.

Over a period of two months, well over a thousand nets were created and tested around the clock, yet only two actually showed measurable progress. You can imagine how draining this was. I would get up in the middle of the night to start new tests so as not to waste any computer time. This effort was made using special scripts shared by Dietrich Kappe, a tireless developer who wrote them to generate data in his own work.
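
For those wondering what "created and tested around the clock" looks like, the loop itself is conceptually simple; it is the statistics that are brutal. A schematic, with hypothetical helper functions passed in as placeholders (this is not Kappe's actual script):

```python
# Schematic of the improvement grind: generate data with the current best
# net, train a challenger, and promote it only if it beats the noise floor.
# generate_data, train and match are hypothetical placeholders.

def improvement_loop(best_net, iterations, generate_data, train, match):
    for _ in range(iterations):
        data = generate_data(best_net, threads=700)   # fresh labelled positions
        candidate = train(best_net, data)             # retrain from the champion
        elo_gain, error = match(candidate, best_net, games=20_000)
        if elo_gain > error:                          # progress must exceed the margin
            best_net = candidate                      # rare: a handful in 1,000+ tries
    return best_net
```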

I feared someone would soon take up the gauntlet Sergio had thrown down and carry the Stockfish net to new heights of wondrous progress. But it soon became clear I was not the only one struggling to keep up with his exceptional work. No one had managed even to equal it, never mind beat it, and some had resources that dwarfed even my princely cloud rentals.

The test results

My tests were all done against Stockfish itself at each iteration. Why Stockfish? Because it is the standard to which all must be held. I needed at the very least to know whether I was presenting a subpar product to users. I knew it would play differently in many situations, but would it also be weaker? This was my concern. Whether it adds enough value to warrant purchasing is always an individual call, of course. If your only concern is Elo, then there is no compelling reason to buy Fat Fritz 2, but if you want a powerful and different perspective, you will be well served. It is worth adding that I do not have the time or interest to test against a field of engines a hundred or more Elo behind, so ultimately it is perfectly possible it would beat the king but be less efficient at sweeping up the rest. I could live with that.

It has been tested in a deep match of ten thousand games against the February 11 build of Stockfish, which has since become the new Stockfish 13. This match was played by an independent tester who used a fixed eight-move-deep book drawn from a base of 2.1 million games played on Lichess in 2020 between opponents both rated 2400+.

Some may feel that the comparisons in strength made with Stockfish are an attempt to put it down, but it is just the opposite: it is acknowledging without question that Stockfish is the standard to which all others must be held. Note: the GUI's output can be a bit confusing. It shows a 16 Elo difference, not 32.
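
To put the number in context: under the same logistic model sketched earlier, a 16 Elo edge means scoring about 52.3%, and over a 10,000-game match the 95% error bar is only about ±7 Elo, so the measured edge comfortably exceeds the noise. A quick check:

```python
import math

# Quick check: the score implied by a 16 Elo edge, and the ~95% error bar
# on a 10,000-game match (draws ignored, so the margin is an upper bound).
score = 1 / (1 + 10 ** (-16 / 400))                       # ~0.523
se = math.sqrt(score * (1 - score) / 10_000)              # std. error of the score
margin = 1.96 * se * 400 / (math.log(10) * score * (1 - score))
print(f"score {score:.3f}, margin +/- {margin:.1f} Elo")  # ~ +/- 6.8 Elo
```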

It is this result, and others of its type, that have prompted the claim that it is the no. 1 of its kind. Without question it depends on the powerful search provided by the open-source project Stockfish, but if search were all it took, then NNUE would not have made a difference and Stockfish 12 would have been a modest jump ahead of Stockfish 11, instead of the massive leap it actually was thanks to a powerful brain. It is the combination of a great neural network with that search that makes it what it is.

Fat Fritz 2 is a net trained on the games and evaluations of Fat Fritz in a uniquely large architecture, and combined with the fantastic Stockfish it is something I feel will bring valuable analysis and ideas to all players. I am unreservedly proud of it.

A sample of its play

Here is a game shared by a user, played with 36 threads at a 15m + 15s time control. A Classical King's Indian was contested between Stockfish (Feb 11 build) and Fat Fritz 2. The first nine moves were fed to them, and after that they were on their own. The dynamic, uncompromising play by Black is striking and spectacular.

 

 

