Fat Fritz and the KID

by Albert Silver
8/26/2019 – While the title may sound like the name of the latest spaghetti western, it is actually a look at how Fat Fritz may change the way players view the King's Indian Defense, notably the Classical Main Line, long maligned by generations of engines. See how it compares with the latest Stockfish in this opening, and why its unique development gives it such a broad outlook on openings in general.

Evaluating a classic

If there is a standard opening whose reputation has been repeatedly pummeled by engines over the years, it is the King's Indian Defense, especially the Classical main lines with the two sides rolling their pawns forward. While it enjoyed its spot in the sun with its last great proponent, Garry Kasparov, his inability to overcome his rival Vladimir Kramnik with it, which led him to stop playing it altogether, was a hammer blow. Almost as if dictated by fate, the rise of chess engines came around this time, with Fritz at the top, later followed by Rybka, Houdini and more. They all seemed to condemn the opening with extremely positive evaluations for White, which were only reversed by a mistake or blunder. The conclusion one might draw from this is that while the KID might offer plenty of practical chances, it is fundamentally suspect.

Or is it?

Fat Fritz offers a very different perspective on this, and it is not just because it is a neural network. It is also because of how it was trained and developed. When I began training it, the idea was not only to offer it the bulk of human genius from Mega Database, but to see it learn about all the openings that exist. This does not mean it will necessarily like them, just that if it doesn’t, its opinion will have come from the practical study of millions of positions. This highlights one of the major differences compared to the ‘zero’ approach used by AlphaZero. The ‘zero’ approach accurately claims to remove all human bias, since in effect it learns solely from games played against itself, but its proponents fail to highlight that this instead introduces significant biases of its own.

The published article on AlphaZero, and even the excellent book on it by GM Matthew Sadler and Natasha Regan, clearly show that while AlphaZero may be incredibly strong, it is also a super specialist in very specific openings. When allowed to steer the games towards openings of its choice, it did unusually well, but if greater variety were enforced the results were much less impressive. Was this because some of these other openings were inherently flawed or because its skills were not as universally strong in all of them? I suspect a bit of both.

Consider the graphs published by DeepMind on the time spent on various openings in its own self-play, meaning the percentage of time in those 44 million games that it laboured on each opening.

Here are three:

The original article from which these are taken explains:

Matches starting from the most popular human openings. AlphaZero plays against Stockfish in chess. In the left bar, AlphaZero plays white, starting from the given position; in the right bar, AlphaZero plays black. Each bar shows the results from AlphaZero’s perspective: win (green), draw (gray), or loss (red). The percentage frequency of self-play training games in which this opening was selected by AlphaZero is plotted against the duration of training, in hours.

As you can see from the first diagram in the list, the first three moves that might lead to the King’s Indian were barely ever touched by AlphaZero compared to other choices. Even the Caro-Kann (the third diagram) enjoyed far greater interest. The results against Stockfish 8 were also less than inspiring, with more draws and losses as Black than in any of the others, as well as fewer wins. Fat Fritz is very different, and in my testing against Stockfish 10 under similar conditions (meaning a 1000 to 1 speed difference), it is an opening where it will outscore the otherwise great engine quite significantly.

In my testing I use standard opening suites with balanced positions that enforce an opening, but not the exact continuation. In my longest test match*, when the Classical King’s Indian was played, Fat Fritz (an earlier development version) won both games, as White and as Black.

 

Here are a few key positions that illustrate some of the more startling differences: 

 

This is the position after 21.Kh1 and it is astonishing to see that Stockfish, playing black, thinks it is better with a -0.83 evaluation, while Fat Fritz thinks it is +2.02! It is hard to imagine a more striking disagreement.

 

Six moves later Stockfish is beginning to see the light and now agrees White is better, if only by +0.25, while Fat Fritz evaluates itself at +3.29. Looking at the position, it is really hard to understand how Stockfish can be so unconcerned. In the past, with no diverging opinions, the conventional wisdom was to not argue with the engine, and just accept it knew what it was talking about. Now there is a diverging opinion, and a result to reinforce it as well.


In the second game in which they repeated the opening with reversed colours, we are treated to an example of how Fat Fritz handles the dynamic needs of Black’s position.

 

The move is 20…xe4! A shot Stockfish did not expect. Though Fat Fritz found it on its own, it is only fair to point out that it has been played several times in correspondence play, and once even by GM John Nunn. Fat Fritz found it and played it in 14 seconds according to the timestamp in the game file.

Another clear divergence can be seen several moves later, when after 29…f4 the position stands so: 

 

Here is a condition that many players will recognize: the infamous 0.00. For the last couple of moves Stockfish 10 has judged the position as 0.00, and even for the next couple of moves after will continue to claim it is 0.00. However, at this point Fat Fritz is becoming quite happy with its position as black and thinks it is -1.22 in its favor. A few moves later, Stockfish will change its tune and from there on it never recovers.

So, am I suggesting the King’s Indian is winning or even better? No, but I feel confident that if anything it shows the fundamental vitality of the opening itself, and that Fat Fritz has a much better understanding overall of the needs and methods of the situations that arise from the King's Indian than most other engines. Using it as an analytical tool can lead to the discovery of plenty of gold nuggets, and I have no doubt there will be other openings that will also benefit. 


*Stockfish was running on 32 threads (each 3.7 GHz), while Fat Fritz ran on an underclocked RTX 2080 (and 2 CPU threads). The games were 15 minutes plus 10 seconds increment. Ponder was off.
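
The "1000 to 1 speed difference" can be checked with simple arithmetic. The sketch below assumes the approximate node rates the author quotes in the reader discussion (about 30 million nodes per second for Stockfish 10 and about 30 thousand for Fat Fritz). The function and constant names are illustrative only, and the ratio compares reported node counts, not the raw computation done by the hardware.

```python
# Approximate node rates quoted by the author in the discussion below.
STOCKFISH_NPS = 30_000_000  # Stockfish 10, measured over 60 s on the start position
FAT_FRITZ_NPS = 30_000      # Fat Fritz on the underclocked RTX 2080


def speed_ratio(fast_nps: int, slow_nps: int) -> float:
    """Return how many nodes the faster engine reports per node of the slower one."""
    if slow_nps <= 0:
        raise ValueError("slow_nps must be positive")
    return fast_nps / slow_nps


ratio = speed_ratio(STOCKFISH_NPS, FAT_FRITZ_NPS)
print(f"{ratio:.0f} to 1")
```

With these figures the ratio comes out at 1000 to 1, which is the number cited in the article.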




Born in the US, he grew up in Paris, France, where he completed his Baccalaureat, and after college moved to Rio de Janeiro, Brazil. He had a peak rating of 2240 FIDE, and was a key designer of Chess Assistant 6. In 2010 he joined the ChessBase family as an editor and writer at ChessBase News. He is also a passionate photographer with work appearing in numerous publications.
Discussion and Feedback


dakchung 9/4/2019 07:31
"This is the position after 21.Kh1 and it is astonishing to see that Stockfish, playing black, thinks it is better with a -0.83 evaluation, while Fat Fritz thinks it is +2.02 ! It is hard to imagine a more striking disagreement!" Well, my Stockfish 260819 says +1.03 (depth 26), +1.95 (depth 30), +1.66 (depth 34) and +1.81 (depth 45).
chesspasky 8/27/2019 09:18
I wonder if somebody is going to make this kind of engine available for any computer, because Leela is a pain to install.
celeje 8/27/2019 03:36
@ Albert Silver

I cannot help you on this topic. It is clear you have no engineering background. It is also clear you have absolutely no knowledge of how computers work. It's not even about "how the GPU is used in lc0". You don't know about GPUs for anything. It's not magic.

When you insist on writing false statements, innocent readers could be misled. Peace.
Albert Silver 8/27/2019 03:13
@celeje I cannot help you on this topic. Not only is it clear you have absolutely no clue how the GPU is used in lc0, but what is more you insist on trying to give lessons based on that very ignorance. Peace.
celeje 8/27/2019 11:29
@ Albert Silver:
You are totally wrong. You seem to think nodes measure computer usage. They don't. That's like thinking miles measure time. You don't understand how computers work.
Albert Silver 8/26/2019 11:59
@celeje My only recommendation to you is to read the code. It is your understanding of the GPU's function in running lc0 and producing nodes that is lacking.
celeje 8/26/2019 11:02
@ Albert Silver: Nodes are totally meaningless as an indicator of the hardware speed being used.

Lc0 does not even define "nodes" the same way as Stockfish does, but even if it did that would still not tell you how much calculation they are doing per second.

You are not understanding what the GPU does in this. Please ask someone with an engineering background on the Lc0 team if you dispute this.
Albert Silver 8/26/2019 07:12
@celeje The RTX GPU doesn't do any operations when running Fat Fritz, so I'm afraid the number you quoted is meaningless. The GPU's purpose is essentially to take the position, feed it to the NN, and spit out the evaluation of the NN back to the CPU, which is where the actual search is conducted. This input-output process is actually the biggest bottleneck in its calculations. The raw numbers of the match were 30 million nodes per second for SF10 (based on 60 seconds calculating the start position), and 30 thousand nodes per second for Fat Fritz. This is what the 1000 to 1 speed difference means.
celeje 8/26/2019 06:51
@AdopePlayer: The GPU, which the NN engine uses, does far more computations than a CPU, so the GPU has the true speed advantage.

(But often articles don't use sensible measures of speed, so they will sometimes claim opposite things.)

The RTX 2080 GPU does 20 trillion operations per second (but the note at the end of the article says it was "underclocked"). I don't know what the CPU numbers are.
AdopePlayer 8/26/2019 06:12
Wonder what 1000 to 1 speed difference means.
paulmurphy 8/26/2019 02:49
Next project:
Somebody spend a few ducats in the engine cloud and have Fat Fritz rehabilitate
the King's Gambit!
KevinC 8/26/2019 12:24
"Fat Fritz" always reminds me of Nagasaki, even if you add "and the kid" to it.