How data analysis has changed the NBA and top-level chess

by Roger Lorenz
4/26/2024 – Inspired by a data analysis of the style of play in American basketball, Roger Lorenz has carried out a similar analysis for top-level chess. There have indeed been significant changes over the last few decades, and not necessarily positive ones.


Using data analysis to improve results

I recently came across a comment on X (formerly Twitter) by grandmaster and world champion coach Peter Heine Nielsen. In it, he refers to the book Hoop Atlas: Mapping the Remarkable Transformation of the Modern NBA, which will be published in May 2024. The author advertises the book on X with the illustration above. It shows the changes in shot selection in the NBA from the 2003/2004 season to the 2023/2024 season [Goldsberry 2024].

For those readers who are not so familiar with basketball and the NBA, a few notes. The National Basketball Association (NBA for short) is by far the strongest and most popular basketball league in the world. The NBA currently consists of 30 teams from the USA and Canada. The following illustration shows the basketball court.

Figure 2: The basketball court (Source: Wikipedia)

Successful shots from beyond the three-point line score 3 points, shots from inside the line score 2 points. The probability that a shooter converts a shot decreases with increasing distance from the basket. It is therefore not surprising that 3-point shots have always been popular near the baseline, where the distance between the three-point line and the basket is slightly shorter than in the centre of the court (6.70m vs. 7.24m).

Figure 1 (teaser picture) shows a major change in the way NBA teams play. Whereas in the past (2003/2004, left side of the figure) shooters utilised the two-point range more frequently, today (2023/2024, right side of the figure) they concentrate on three-point shots and shots close to the basket. There are hardly any shots from mid-range. There is also an increasing tendency to take three-point shots one or two steps behind the three-point line.

Anyone who has watched an NBA game recently (e.g. on YouTube) can confirm this. While smaller players such as Stephen Curry (1.88m tall) are now also moving more aggressively to the basket, taller players such as Nikola Jokić (2.11m tall) can now also score three-pointers very well. And with Luka Dončić, you sometimes get the impression that he takes his three-point shots just past the halfway line.

Figure 3: Current top shooters in the NBA: Stephen Curry, Nikola Jokić and Luka Dončić (Source: Wikipedia)

These changes in the game are certainly no coincidence. All NBA teams now have a team of data analysts who advise the coaches. Their analyses will have revealed that a three-point shot yields more points on average than a two-point shot from mid-range. This realisation will then have had an impact on the signing of new players, so that there are now, for example, more players who are good at three-point shots or can find good shooting positions close to the basket. Training reinforces the approach: teams now practise such shots, and the plays that create them, more often.
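The reasoning behind this can be illustrated with a simple expected-value calculation. The following is only a minimal sketch: the function names are my own, and it says nothing about how NBA analysts actually model shot value.

```python
def expected_points(points: int, success_rate: float) -> float:
    """Average points per attempt for a shot worth `points`."""
    return points * success_rate


def break_even_three_point_rate(two_point_rate: float) -> float:
    """Three-point success rate at which a three is worth as much,
    on average, as a two-point shot converted at `two_point_rate`."""
    return 2 * two_point_rate / 3
```

With purely hypothetical success rates of 45% from mid-range and 35% from beyond the line, the two-point shot averages 0.9 points per attempt and the three-point shot 1.05. In this example the three-pointer would be the better choice as soon as it is converted more than 30% of the time.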

Traditional basketball fans in the United States are not happy about this development. There are currently discussions about moving the three-point line further back and abolishing the three-point line near the baseline altogether. But there will certainly be no changes in the short term.

What about chess?

I have the impression that we are seeing a similar development in chess. We chess players don't throw a ball at a basket. But we choose our openings carefully. And I think that we see much less diversity of openings in top-level chess these days than we used to.

So I checked whether my impression is correct. To do this, I compared the openings played by top players in 1990 (when Kasparov, Karpov and Korchnoi were still active) with those played in 2019 (chosen to avoid distortions due to the many online tournaments that took place during the Covid pandemic). I defined a top player as a player in the TOP100 of the rating list.

Defining the term top player raised the first problem. The ELO ratings of the top players have risen significantly since 1990. In 1990, an ELO rating of 2540 was enough to be in the TOP100, whereas in 2019 the threshold was 2654.

Figure 4: Development of ELO ratings for top players since 1990 (author's analysis based on the FIDE ELO lists from July of each year)

I used the Mega2023 database from ChessBase as my data source. I first saved this database in PGN format. I then extracted the following data per game:

  • Event
  • Date
  • ELO White
  • ELO Black
  • Result
  • ECO-Code
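An extraction of this kind could be sketched in Python as follows. This is not the script I used, only an illustration: it assumes the export uses the standard PGN tag names (Event, Date, WhiteElo, BlackElo, Result, ECO), and the function name iter_game_headers is made up for this example.

```python
import re
from typing import Dict, Iterable, Iterator

# A PGN header line looks like: [ECO "B90"]
TAG_RE = re.compile(r'^\[(\w+)\s+"(.*)"\]$')

# Standard PGN tag names for the six fields listed above (an assumption
# about the export; adjust if the file uses different names).
WANTED = ("Event", "Date", "WhiteElo", "BlackElo", "Result", "ECO")


def iter_game_headers(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict with the wanted header fields per game.

    Missing tags are returned as empty strings. The moves are skipped.
    """
    tags: Dict[str, str] = {}
    in_movetext = False
    for raw in lines:
        line = raw.strip()
        match = TAG_RE.match(line)
        if match:
            name, value = match.group(1), match.group(2)
            # A header line after moves, or a repeated tag name,
            # means that a new game has started.
            if tags and (in_movetext or name in tags):
                yield {key: tags.get(key, "") for key in WANTED}
                tags = {}
            in_movetext = False
            tags[name] = value
        elif line:
            in_movetext = True
    if tags:
        yield {key: tags.get(key, "") for key in WANTED}
```

Because the function only needs an iterable of lines, an open file handle can be passed in directly, so a large export does not have to be loaded into memory at once. The correct text encoding for the file is something to check against the actual export.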

In the next step, I filtered the data as follows:

  • Keep all games from 1990 in which both players had an ELO rating >= 2540
  • Keep all games from 2019 in which both players had an ELO rating >= 2654
  • Remove all games without a result or without an ECO code (this should also remove all Chess960 games)
  • Try to remove rapid and blitz games. To do this, I removed games whose event field contained patterns such as "Blitz", "Rapid", "5'" or "Titled Tuesday". I certainly didn't catch all rapid and blitz games, but hopefully enough to avoid distortion by such games.
  • Remove duplicates
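The filter steps above might look roughly like this in Python. It is a sketch under several assumptions: each game is a dict of strings with the keys Event, Date, WhiteElo, BlackElo, Result and ECO; the pattern list contains only the four examples named above; and two games count as duplicates when all six fields are identical, which is my choice for this illustration and could in rare cases merge two different games.

```python
from typing import Dict, Iterable, List

THRESHOLDS = {1990: 2540, 2019: 2654}            # TOP100 rating limit per year
VALID_RESULTS = {"1-0", "0-1", "1/2-1/2"}        # "*" means no result
FAST_PATTERNS = ("blitz", "rapid", "5'", "titled tuesday")


def to_int(text) -> int:
    """Convert a header value to int; missing or broken values become 0."""
    try:
        return int(text)
    except (TypeError, ValueError):
        return 0


def keep_game(game: Dict[str, str]) -> bool:
    """Apply the year, rating, result, ECO and event-name filters."""
    year = to_int((game.get("Date") or "")[:4])
    min_elo = THRESHOLDS.get(year)
    if min_elo is None:
        return False
    if to_int(game.get("WhiteElo")) < min_elo:
        return False
    if to_int(game.get("BlackElo")) < min_elo:
        return False
    if game.get("Result") not in VALID_RESULTS or not game.get("ECO"):
        return False
    event = (game.get("Event") or "").lower()
    return not any(pattern in event for pattern in FAST_PATTERNS)


def filter_games(games: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return the games that pass all filters, without duplicates."""
    seen = set()
    kept = []
    for game in games:
        if not keep_game(game):
            continue
        key = tuple(sorted(game.items()))  # assumed duplicate criterion
        if key not in seen:
            seen.add(key)
            kept.append(game)
    return kept
```

Matching on substrings of the event name is deliberately crude. As noted above, it will miss some rapid and blitz events, and a pattern such as "5'" can also hit event names that merely contain it.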

I then used this data to determine the 30 most common ECO codes and their frequency for 2019 and 1990. These are listed in the following figure:

Figure 5: The 30 most frequent openings (identified by the ECO code) in 2019 and 1990. Entries marked in green can be found in both lists

The illustration shows two things. Firstly, the selection of openings has changed. Only 7 of the top 30 openings from 1990 are still in the top 30 list in 2019. No opening in the top 10 from 1990 is still in the top 10 in 2019. Openings such as the French, Pirc, Dutch or King's Indian have completely dropped out of the list of the 30 most-played openings. Instead, the Italian, the Berlin Defence and the Petroff now prevail.

Secondly, the frequency with which the top openings are played has changed significantly. In 2019, the 30 most common ECO codes account for over 50% of the games. In 1990, it took 57 ECO codes to reach 50%.
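Both measures, the most frequent ECO codes and the number of codes needed to cover half of the games, can be computed with a simple frequency count. The sketch below assumes a plain list of ECO codes per year, one entry per game, and the function names are chosen only for this illustration.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple


def top_codes(eco_codes: Iterable[str], n: int = 30) -> List[Tuple[str, int, float]]:
    """Return the n most common ECO codes as (code, games, share of all games)."""
    counts = Counter(eco_codes)
    total = sum(counts.values())
    return [(code, games, games / total) for code, games in counts.most_common(n)]


def codes_needed_for_share(eco_codes: Iterable[str], share: float = 0.5) -> int:
    """Number of most common ECO codes needed to cover `share` of all games."""
    counts = Counter(eco_codes)
    total = sum(counts.values())
    covered = 0
    for rank, (_, games) in enumerate(counts.most_common(), start=1):
        covered += games
        if covered >= share * total:
            return rank
    return 0  # only reached for an empty list


def shared_top_codes(codes_a: Iterable[str], codes_b: Iterable[str], n: int = 30) -> Set[str]:
    """ECO codes that appear in the top n of both lists."""
    top_a = {code for code, _ in Counter(codes_a).most_common(n)}
    top_b = {code for code, _ in Counter(codes_b).most_common(n)}
    return top_a & top_b
```

Calling codes_needed_for_share on the codes of one year gives the kind of figure quoted above, and shared_top_codes gives the entries that would be marked in green in Figure 5.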

Why is that? I generated the following statistics.

Figure 6: Development of the rate of draws in TOP games since 1990

The chart shows that the rate of draws in games between top players has risen sharply since 1990. While this rate was around 57% in 1990, it rose to almost 65% in 2019.
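A draw rate per year can be calculated along the following lines. This is a sketch with assumed field names (Date and Result, as in the PGN headers). It expects that the games have already been restricted to top players and classical time controls, and it counts only games with a definite result.

```python
from collections import defaultdict
from typing import Dict, Iterable

DRAW = "1/2-1/2"
FINISHED = {"1-0", "0-1", DRAW}


def draw_rate_by_year(games: Iterable[Dict[str, str]]) -> Dict[str, float]:
    """Share of drawn games per year, keyed by the four-digit year string."""
    totals = defaultdict(int)
    draws = defaultdict(int)
    for game in games:
        result = game.get("Result")
        if result not in FINISHED:
            continue  # skip unfinished games or games without a result
        year = (game.get("Date") or "")[:4]
        totals[year] += 1
        if result == DRAW:
            draws[year] += 1
    return {year: draws[year] / totals[year] for year in totals}
```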

All in all, I come to the conclusion that players marshalling the black pieces in particular are trying to reduce risks through their choice of openings. They are rewarded with higher draw odds. Risky openings such as the French, Pirc, Dutch or King's Indian are now only used by the top players as a surprise weapon.

A real success story from Black's point of view. It is therefore unlikely that this will change in the future.

Closing remarks

The statistics show that my hunch was right. Top-level chess now concentrates on just a few openings. Openings that represent a higher risk, especially for Black, are hardly ever seen any more.

Like the traditional basketball fans in the United States, I am not happy with this development. I also think it's a shame that this is opening up a gap between top chess and the rest. In the past, many (myself included) tried to imitate the top players in the opening. When Garry Kasparov played the Najdorf or the King's Indian, these were openings that were also encountered in the lower divisions. However, I can't remember seeing the Berlin Defence in a team or amateur tournament in the last 20 years.

I suppose I'll just have to accept that. Unfortunately, there are no solutions in chess similar to moving the three-point line in basketball.

Sources

[Goldsberry 2024]: Kirk Goldsberry, Hoop Atlas: Mapping the Remarkable Transformation of the Modern NBA, Mariner Books, 2024



Roger Lorenz studied Computer Science in Bonn in the 1980s and afterwards worked for many years as a project manager and consultant. Since retiring he has had more time for his hobbies, which include playing chess, chess history and computer chess engines. He is a member of the chess club Bonn/Beuel and the Chess History and Literature Society. You can contact Roger through his homepage.