The Elo ratings: Inflation or Deflation?

by ChessBase
10/7/2021 – In a recent article, Walter Wolf took a close look at the current development of Elo ratings. IM Dirk Sebastian and IM Martin Voigt, both from Hamburg, have also examined recent rating trends and try to answer the question of whether there is currently Elo inflation or deflation.

By Martin Voigt

Is there Elo inflation or deflation?

In light of a recent article by Walter Wolf, we want to give another perspective on the subject of Elo ratings as an indicator of playing strength.

Introduction

Is a player rated 2600 in 1990 or 2000 comparable to one rated 2600 in 2019? Was the earlier player stronger? Weaker?

On the one hand, there are many more players now with 2600+ or even 2700+ than there were 20 or 30 years ago, which suggests that there is Elo inflation and that high ratings are easier to attain nowadays. On the other hand, players might have become much better thanks to improved training possibilities such as chess databases, strong engines and a faster spread of information. Furthermore, the player pool has grown, giving more people the opportunity to reach a considerable playing strength.

Approach

To evaluate playing strength, we analysed more than 300,000 games, checked each played move with Stockfish, and calculated the average centipawn loss (acpl). The centipawn loss (cpl) of a move is the difference between the evaluation after the Stockfish move and the evaluation after the played move. (We capped the cpl at 500.) The acpl tells only half the truth about playing strength, as it doesn't take the complexity of positions into account, but it is a sufficient measure for our purpose: judging the playing strength of large groups of players.
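The definition above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, the evaluations are assumed to be given in centipawns from the point of view of the player who moved, and treating negative differences as zero loss is an assumption of this sketch that the article does not state.

```python
def centipawn_loss(best_eval_cp, played_eval_cp, cap=500):
    """Centipawn loss of one move, capped at `cap` (500 in the article).

    Both evaluations are assumed to be in centipawns from the mover's
    point of view (an assumption of this sketch).
    """
    loss = best_eval_cp - played_eval_cp
    # Assumption: a played move that evaluates better than the engine's
    # choice counts as zero loss rather than a negative loss.
    return min(max(loss, 0), cap)


def average_centipawn_loss(move_evals, cap=500):
    """Mean capped loss over (best_eval_cp, played_eval_cp) pairs."""
    losses = [centipawn_loss(best, played, cap) for best, played in move_evals]
    if not losses:
        raise ValueError("no moves to average")
    return sum(losses) / len(losses)


# Hypothetical evaluations for three moves: an engine-approved move,
# a small inaccuracy, and a blunder whose loss is capped at 500.
sample = [(30, 30), (50, -20), (100, -900)]
acpl = average_centipawn_loss(sample)
```

The cap keeps a single blunder from dominating the average, which matters when the figure is meant to describe a large group of players rather than one game.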

2600 includes all players with Elo 2551-2650

There is a decrease in acpl, meaning that players made smaller mistakes as the years went by. And the better play isn't just caused by improved knowledge of opening theory.

This graphic shows the acpl for players around 2600 by timeframe and move number. We combined moves in steps of 5; e.g. the value at move 35 incorporates moves 31 to 35. (Typically, the acpl is highest before the time control at move 40. Not even the introduction of increments changed that.)

The decline in acpl can be observed for all ratings, not only in games of strong GMs and professionals.
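The two grouping conventions used in the charts (rating bands such as "2600" covering 2551-2650, and moves combined in steps of 5) could be sketched as follows. The helper names and the input format, a list of (move number, cpl) pairs, are hypothetical, and integer ratings are assumed.

```python
from collections import defaultdict


def rating_band(elo):
    """Map an integer rating to its band label, e.g. 2551..2650 -> 2600."""
    return ((elo + 49) // 100) * 100


def move_bucket(move_number, step=5):
    """Map a move number to the upper end of its bucket, e.g. 31..35 -> 35."""
    return ((move_number + step - 1) // step) * step


def acpl_by_move_bucket(records, step=5):
    """Average the cpl values of (move_number, cpl) pairs per move bucket."""
    totals = defaultdict(lambda: [0, 0])  # bucket -> [sum of cpl, count]
    for move_number, cpl in records:
        bucket = move_bucket(move_number, step)
        totals[bucket][0] += cpl
        totals[bucket][1] += 1
    return {b: total / count for b, (total, count) in sorted(totals.items())}


# Hypothetical data: two moves fall into the 31-35 bucket, one into 36-40.
buckets = acpl_by_move_bucket([(31, 10), (35, 30), (36, 50)])
```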

Summary: We see rating deflation during this millennium. A player in 2019 was stronger than a player with the same rating in 2000.

What happened?

In 1993, FIDE lowered the rating floor from 2205 to 2005, in 2001 to 1800, in 2004 to 1600, and in 2006 to 1400. Later, the floor dropped even further, to 1000. The acpl of players with a similar rating didn't change significantly between 1993 and 2004.

But after the rating floor was brought down to 1600, the acpl went down.

An attempted explanation

Nowadays, many young players get a rating early in their chess development. They enter the list with a low rating which increases over time, taking points from the player pool in the process, thereby leading to rating deflation.

Adolescents have a higher K-factor. That speeds up their gains but doesn't stop seasoned players from losing rating points against their still underrated opponents.
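A small worked example may make the mechanism concrete. The sketch below uses the commonly stated logistic form of the Elo expected score and ignores FIDE's lookup tables and its rule limiting the rating difference. The ratings (2000 and 1500) and K values (20 and 40) are illustrative assumptions, not figures from the analysis.

```python
def expected_score(rating, opponent_rating):
    """Expected score under the logistic Elo model (400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400))


def rating_change(rating, opponent_rating, score, k):
    """Rating points gained (or lost) from one game with result `score`."""
    return k * (score - expected_score(rating, opponent_rating))


# Hypothetical scenario: a seasoned 2000 player draws a junior who is
# listed at 1500 but already plays at roughly 2000 strength.
seasoned_change = rating_change(2000, 1500, 0.5, k=20)  # about -8.9
junior_change = rating_change(1500, 2000, 0.5, k=40)    # about +17.9
```

In this scenario the draw is a fair result between equally strong players, yet the seasoned player loses almost nine points because the formula only sees the junior's listed rating. The junior's higher K-factor raises the junior's own gain but leaves the seasoned player's loss unchanged.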

Modification proposal

Young players and players with few rated games who clearly outperform their old rating could be rated at their performance. That way, points would be lost less often merely because the opponent is underrated.
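One way such a rule might look is sketched below. This is only an illustration of the idea, not a worked-out proposal: the performance rating uses a logistic inversion rather than FIDE's official table, and the thresholds (`max_games`, `margin`) and function names are hypothetical.

```python
import math


def performance_rating(opponent_ratings, score):
    """Approximate performance rating via the inverted logistic Elo curve.

    `score` is the total number of points scored against the opponents.
    """
    n = len(opponent_ratings)
    if n == 0:
        raise ValueError("no games played")
    fraction = score / n
    if not 0 < fraction < 1:
        raise ValueError("undefined for 0% or 100% scores in this sketch")
    average = sum(opponent_ratings) / n
    return average + 400 * math.log10(fraction / (1 - fraction))


def rating_for_opponents(listed_rating, performance, games_played,
                         max_games=30, margin=200):
    """Rating used when computing the opponents' rating changes.

    Hypothetical rule: a player with few rated games who outperforms the
    listed rating by at least `margin` is treated at performance level.
    """
    if games_played <= max_games and performance - listed_rating >= margin:
        return performance
    return listed_rating


# Hypothetical junior listed at 1500 who scores 3/4 against 1800 average.
perf = performance_rating([1700, 1800, 1900, 1800], 3.0)
used = rating_for_opponents(1500, perf, games_played=12)
```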

Discuss

czyzyk78 1/17/2022 05:35
Another simple approach to reduce the rating deflation would be to introduce K=5 for a seasoned player meeting an opponent who hasn't yet played, say, 200 games. The number "200" is just an example; it should be determined from an analysis of players' ratings and should reflect the average number of games a player needs in order to stabilize his/her rating.
tom_70 10/9/2021 11:49
I remember reading many years ago that once a player passes 2700, every 50 Elo points is considered a "class" better. If you look at the top rankings, those proportions still seem pretty accurate.
Theochessman 10/8/2021 10:02
Why doesn't USCF switch to FIDE ratings anyway? FIDE is the world organisation of chess, remember? I believe there's even a separate Canadian rating system. What a mess... Now everybody knows a USCF rating is about 150 higher than FIDE anyway. So a 2000 USCF is only 1850 of "real" strength.
Also, I think the online ratings like Lichess and Chess.com should be fine-tuned so they are more similar to real ratings. Now an expert can have a 2600 online rating, and GMs are 3100+. It would be handy for everyone if an OTB rating and an online rating could be similar.
PhishMaster 10/7/2021 11:08
The problem is even worse in the U.S., where you have young players who play in very few FIDE events but continue to get much better quickly over time, and their USCF rating is MUCH higher. Not including the two GMs, I recently played three kids in only my second FIDE-rated event in 40+ years. The kids were all around 2000 USCF, but were only 1500s FIDE.

So I lost to the two GMs, but had to beat three 2000s to gain not very many points because they are so underrated in FIDE.
Aighearach 10/7/2021 07:53
The ratings don't measure your chess strength, they measure your ranking against other players. Inflation would be if the players at some percentile were now higher rated. If the objective level of play increases an amount equal to a rating category, there has been no inflation or deflation. Of course with better training tools, players are stronger now. And with a growing player pool, there should be more people at any rating level. Stable ratings mean the proportions at different rating levels are the same.
Doug Eckert 10/7/2021 07:15
The authors are stating there is deflation compared to 20 years ago. But the system is producing relatively correct ratings among existing players. I am relatively old at 56, and for certain the players today are stronger than those from 30 years ago. I would guess it is about 150 - 200 points.

The biggest issue with the rating system is dealing with rapidly improving youngsters. I would propose dividing the higher rated player's K-factor by the lower rated player's K-factor and then applying this fraction to the rating result. This will have the impact of lowering the rating impact on the higher rated player by either 50% or 75%. If the player wins, the dilution to their rating increase will be quite small. But the dilution for a draw or a loss will be much greater, which should reflect a closer value to the result.
Jack Nayer 10/7/2021 04:47
I used to teach statistics at university level. I never wrote about chess ratings, but it has always been clear to me that there is deflation. The higher the rating, the more deflation. How could it be otherwise - as if an enormously bigger pool, more available information, books, DVDs, chess programs, trainers and rising incomes in some countries would not make a difference. No rating system is perfect. The Elo system is far from perfect.
hansj 10/7/2021 04:26
This matter must be dealt with. We can have neither inflation nor deflation. The chess universe needs to be stable.