Optimizing Fat Fritz, the top rated engine in the world

by Albert Silver
11/25/2019 – A new edition of the Computer Chess Ratings List, published on November 23, ranks Fat Fritz as number one. Now the engine is available to consumers through the new Fritz 17 release. A recurring question from users new to Fat Fritz and Leela, and even from veterans, is: how do I modify the engine's settings, and how do I obtain the best performance both with a standard GPU and with a top-end system sporting two powerful graphics cards? Read on to find out!

Fritz 17 - The giant PC chess program, now with Fat Fritz

The most popular chess program offers you everything you will need as a dedicated chess enthusiast, with innovative training methods for amateurs and professionals alike.

A new paradigm

CCRL, the Computer Chess Ratings List, is one of the oldest and longest-running chess engine rating lists still active, going strong for 15 years now. You can find full tests of engines going back to Chess Tiger 2004! Keeping up with the times, it now also tests with graphics cards in order to include neural network engines: not just Leela and Fat Fritz, but also Allie and Stoofvlees. But it is Fat Fritz, running on an RTX 2080 GPU, that tops the current (November 23rd) list. While we don't believe that engine vs engine competition is the best reason to use our new flagship engine (its strengths as an effective analysis partner go beyond mere rating), it's still a valuable independent metric.

CCRL standings

November 23rd results matrix, top 12 engines (best versions only)

The website is very sophisticated with an impressive range of filters, details and statistics. By default, it will display only the best versions of any specific engine, but if you click on the Complete List link, it will show every version and hardware setup used. 

A breakdown of the individual match results can be found at the bottom of the list, or by clicking on each entry. All games are available for download.

Getting the most performance from your engine

One of the most fascinating aspects of neural networks is their reliance (for now) on a graphics processing unit (GPU) to achieve the best results. Some consider this 'unfair' when comparing them to a conventional chess engine such as Stockfish, Komodo and so many more, since the neural network gets both a Central Processing Unit (CPU) and a GPU. However, this is a very misleading way of describing what goes on. The reason is twofold:

  1. The GPU isn’t used for the search itself. The search is conducted on the CPU just as in any engine, while the huge weights file, containing millions of values that make up the network’s understanding of chess, is evaluated on the GPU for each node to produce its assessment. Even with a top-of-the-line GPU, only two CPU cores are used to run the search. More will actually hurt speed, since the bottleneck is how fast the GPU can evaluate the neural network, not the CPU’s search calculations.
  2. The classic engine can certainly use two CPU cores, or four, or 32, or 128. Every added core just increases the speed of its calculations. Fat Fritz and Leela gain nothing from mountains of additional CPU cores.

As a result, saying the neural network has an advantage thanks to the GPU is incorrect since it cannot benefit from many CPU cores as conventional engines can. It is simply different.

Understanding that a conventional engine is different from a neural network is fine, but how can one compare them in a balanced situation? Sadly the answer is not straightforward.

How to compare: The AlphaZero ratio

When DeepMind published their paper on AlphaZero and the results against Stockfish 8, they gave a crucial piece of information that, at the time, probably seemed a curiosity at best: the ratio. All the results they published were based on two things: the time control, and above all the overall speed advantage in nodes per second that Stockfish had: 900 times more nodes per second (on average). 

Source: DeepMind

Why was this important? Because when neural networks came to the PC as a practical way to reproduce AlphaZero, the only way to compare results was to set up a match under the same conditions (around 63,000 nodes per second for AlphaZero vs 58,100,000 for Stockfish). This means that if Fat Fritz or Leela is running at 10,000 nodes per second, Stockfish should be running at around 9,000,000 nodes per second. Tip the balance too far either way and you will clearly be favouring one side over the other.
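The arithmetic behind this rule of thumb can be sketched in a few lines of Python. The function name and the rounded 900-to-1 default are illustrative choices made for this example, not part of any engine or of DeepMind's tooling.

```python
# Illustrative sketch: scale Stockfish's speed to match the AlphaZero ratio.
# ALPHAZERO_RATIO is the rounded figure quoted in the article; the exact
# quotient of the published speeds (58,100,000 / 63,000) is roughly 922.
ALPHAZERO_RATIO = 900


def stockfish_target_nps(nn_nps, ratio=ALPHAZERO_RATIO):
    """Nodes per second Stockfish should reach for a balanced match."""
    return nn_nps * ratio


print(stockfish_target_nps(10_000))   # 9000000
print(round(58_100_000 / 63_000))     # 922
```

If your own measurements stray far from the target, the match conditions favour one side, so adjust the thread count or hardware before reading much into the score.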

The Diesel Dragster

One curiosity DeepMind also shared in their first preprint was how AlphaZero needed a minimum search depth to really show its strength.

Performance of AlphaZero and Stockfish, plotted against time per move. (source: DeepMind)

In a nutshell, it shows that at very shallow depths even AlphaZero lost, and lost badly, to Stockfish, but right around 30 thousand nodes per move it broke even, and it only pulled ahead beyond that. Please note that this was nodes per move, not per second. For AlphaZero, on its impressive hardware, that meant about half a second of analysis, but a slower GPU may need more time. Fat Fritz, for all its wonderful creativity, has illustrated this in spades. GM Moradiabadi has noted how principled Fat Fritz’s play was, meaning that if the position called for material sacrifices for activity, it would not hesitate, even if this led to sharp, double-edged positions. It is perhaps for this reason that very shallow searches (fast games or weak hardware) can lead to disappointing scorelines, as the complications need a certain amount of calculation to be resolved.
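To get a feel for what the break-even point means on your own hardware, divide the nodes needed per move by the engine's speed. The sketch below does just that; the function name is made up for this example, and the two slower speeds are hypothetical stand-ins for weaker GPUs, not measurements.

```python
def seconds_to_reach(nodes_per_move, nps):
    """Seconds of thinking needed to search `nodes_per_move` nodes at `nps`."""
    if nps <= 0:
        raise ValueError("nps must be positive")
    return nodes_per_move / nps


BREAK_EVEN_NODES = 30_000  # approximate break-even point quoted above

# 63,000 nps is AlphaZero's published speed; the other two are hypothetical.
for nps in (63_000, 10_000, 3_000):
    secs = seconds_to_reach(BREAK_EVEN_NODES, nps)
    print(f"{nps:>6} nps -> {secs:.1f} s per move")
```

At AlphaZero's speed this comes to roughly half a second per move, while a card ten or twenty times slower needs several seconds just to reach the same point.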

However, if you do take these factors into consideration, you can get superlative results. Here is a 100-game match against Stockfish Dev (November 6 build) using the TCEC 16 openings, played at 10 minutes per game with a 5-second increment.

The hardware was such that Stockfish ran at around 27 million nodes per second (based on start position) on 32 threads, while Fat Fritz (a newer build) was running at 30 thousand nodes per second, for a perfect 900 to 1 ratio as explained above. This also means that for each move, Fat Fritz was getting a minimum of 400 thousand nodes. Remember that AlphaZero was run in far longer games reaching easily 30 million nodes per move, with Stockfish getting 900 times that.

 

So why ‘Diesel Dragster’? The idea was to convey a racer that may be slow to get to speed, but that has a fantastic top speed once it gets there.

Optimal configurations

If you look at the internal settings of both Fat Fritz and Leela, you will notice many values that are completely different, such as the CPUCT, the CPUCT factor, and so on. These values are the result of a heavily automated process known as CLOP, which helps determine the best-performing values for an engine. They will vary from engine to engine, or in this case from neural network to neural network, so do not assume that one network's best values will work for another. They might easily cripple the other neural network instead of helping it.

Still, as a small secret shared now with the readers: the tuning process was extended to a full 170 hours for Fat Fritz, and its new optimal values are slightly different than the ones that came with the first release. While they will be included in the next engine update, feel free to use them now:

Just right-click on the engine pane and select Properties to open the UCI options.

Change the cpuct to 3.56 (instead of 3.67), the cpuctfactor to 2.74 (instead of 2.54), and the Policy Temperature to 1.84 (instead of 1.87). You can see these values and where they are above.
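For anyone driving the engine from a script rather than through the Fritz interface, the same three changes can be expressed as standard UCI setoption commands. This is only a minimal sketch: the option names below assume lc0-style spelling (CPuct, CPuctFactor, PolicyTemperature), which may not match what your build exposes, so check them against the Properties dialog before use.

```python
# Assumed UCI option names (lc0-style spelling); verify them against the
# names shown in the engine's Properties dialog before relying on them.
TUNED_VALUES = {
    "CPuct": 3.56,
    "CPuctFactor": 2.74,
    "PolicyTemperature": 1.84,
}


def setoption_commands(options):
    """Format each option as a standard UCI 'setoption' command line."""
    return [
        f"setoption name {name} value {value}"
        for name, value in options.items()
    ]


for line in setoption_commands(TUNED_VALUES):
    print(line)
```

Each printed line can be sent to the engine's standard input once the UCI handshake has completed.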

Nvidia 16xx video cards

If you own a machine that has one of the newer Nvidia mid-range cards such as the GTX 1650, GTX 1650 Ti, GTX 1660 or GTX 1660 Ti, you can enjoy a nice speed boost by changing the UCI option called Backend to cudnn-fp16. Be sure to leave the number of threads at 2, though. More than that will hurt performance.

Multiple GPUs

One scenario that was not covered at all in the installation process is the matter of multiple GPUs. There is no question you can get maximum results with more than one GPU and, unlike games in the past, which required a special connection linking the two cards, all you need is to have them installed and to make a few changes in the UCI options.

The exact changes are:

Backend — Multiplexing
Backend Options — (backend=cudnn-fp16,gpu=0),(backend=cudnn-fp16,gpu=1)
Threads — 4
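The Backend Options string follows a regular pattern, with one parenthesized entry per card, so it can be generated rather than typed by hand. The helper below is a hypothetical sketch: it reproduces the two-GPU string given above exactly, and extending the same pattern to three or more cards is an assumption that has not been tested here.

```python
def multiplexing_options(gpu_count, backend="cudnn-fp16"):
    """Build the Backend Options string for `gpu_count` identical GPUs."""
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    # One "(backend=...,gpu=N)" entry per card, numbered from 0.
    return ",".join(
        f"(backend={backend},gpu={index})" for index in range(gpu_count)
    )


print(multiplexing_options(2))
# (backend=cudnn-fp16,gpu=0),(backend=cudnn-fp16,gpu=1)
```

Paste the resulting string into the Backend Options field after setting Backend to Multiplexing.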

Conclusion

Hopefully you now find yourself armed with the means to get the most out of your Fat Fritz or Leela, and know what to expect.

For more, check out all stories on Fat Fritz and the new Fritz 17.



Born in the US, he grew up in Paris, France, where he completed his Baccalaureat, and after college moved to Rio de Janeiro, Brazil. He had a peak rating of 2240 FIDE, and was a key designer of Chess Assistant 6. In 2010 he joined the ChessBase family as an editor and writer at ChessBase News. He is also a passionate photographer with work appearing in numerous publications.