Demythologizing the chess player's Elo rating

Preamble

The invention and adoption of the Elo Rating System may be the best thing, though not a perfect one, that ever happened to chess playing and organizing. The chessplayer’s Elo rating has been overvalued by players, by organizers and, for that matter, by FIDE. This has serious consequences, and before it becomes an unmanageable problem FIDE should take positive steps to correct the erroneous perception: very soon, preferably now.

Short of requiring everyone to read Prof. Elo’s book, The Rating of Chessplayers, Past and Present, FIDE should send a circular to all its member national federations and to all chess publications, and require all chessplayers (amateurs and professionals alike) to read it. The circular should explain the nature, significance, application and limitations of the rating system.

I have prepared the following article for that purpose. I trust it will reach a far wider audience if it is published on your website.

Demythologizing the chess player's Elo rating

The invention and adoption of the Elo rating system may be the best thing, though not a perfect one, that ever happened to chess playing and organizing. The late Professor Arpad E. Elo expressed strong sentiments over the inordinate importance being attributed to the rating. He regretted that the Elo Rating System contributed greatly to the prevailing opinion that regards chess as first and foremost a sport. As a result, the chessplayer’s Elo Rating has been overvalued by top-ranked players, by organizers of major and prestigious chess competitions and, for that matter, even by FIDE itself.

This problem has serious undesirable consequences, among them:

Top players will tend to protect their ratings from decreasing, at great cost. When a rating is given undue importance, its preservation or improvement will, of necessity, lead the player to abandon his desire to create and innovate. The resulting proliferation of colorless draws is inimical to the interests of organizers, sponsors and spectators.
In many a chessplayer and chess aficionado, it will foster an attitude of irrational respect (a euphemism for fear) for higher-rated opponents and unwarranted contempt for lower-rated ones.
Organizers and sponsors of competitions will misguidedly confine their choice of participants to a narrow field of high-rated players, to the exclusion of the great majority of their equally qualified colleagues.
Organizers of competitions will pay excessive appearance fees to high-rated players, on the latter’s demand, depleting funds that should otherwise go into the prizes, whose legitimate recipients are the winners of tournaments.
FIDE can be misdirected in its policymaking efforts regarding International Titles and Ratings requirements and regulations.
FIDE can unjustifiably be compelled, at considerable cost, to grant requests for recalculation to restore a few rating points lost because some game results were accidentally excluded from the rating calculation.
FIDE can wrongfully be dragged into costly legal disputes in cases where individual ratings are inadvertently miscalculated on account of a clerical error such as a data omission. Players would argue that the failure to receive an invitation to a chess competition caused economic loss, since participation in a prestigious tournament means the receipt of an appearance fee and the possibility of winning substantial prize money.

FIDE, and all quarters concerned, should act immediately to address the problem and rectify this erroneous perception of the real significance of a chessplayer’s Elo Rating. Short of requiring everyone to read Prof. Elo’s book, The Rating of Chessplayers, Past and Present, FIDE should send to all its member national federations and to all chess publications, and require all chessplayers (amateurs and professionals alike) to read, a circular explaining that:

The Elo Rating System is a statistical system. Though the calculation processes involved are mathematical (they use formulas that are precise mathematical statements), the underlying concepts employed in the derivation of these formulas (such as probabilities, confidence intervals, margins of error and measures of reliability) are statistical in nature. Even the data that form the basis of the calculations are themselves fluctuating measures of human performance, which is subject to variability. Professor Arpad E. Elo described the process eloquently for the benefit and appreciation of the layman in each of us: “The measurement of the rating of an individual chessplayer might be well compared with the measurement of the position of a cork bobbing up and down on the surface of agitated water with a yardstick tied to a rope which is swaying in the wind” and held by a trembling hand [the last phrase is mine].

Therefore, the Elo Rating of an individual player is not a mathematically precise figure. It is statistically derived, with an accuracy that improves with the amount of data (game results) on which it is based. A measurement based on the standard 30 games provides a rating that is 95% probable to be within plus or minus 100 Elo points of its true value. That is why it is not certain that the higher-rated player will always beat his lower-rated opponent. Take this simplistic illustration: Player A, rated Elo 2600, plays against Player B, rated Elo 2500. The true strength of A lies in the range of 2500 to 2700, while that of B lies in the range of 2400 to 2600. If A plays languidly at 2550 while B plays inspired chess at 2550, we will have an even match and the game should result in a draw!

In the Elo Rating System, the absolute value of the player’s rating is meaningless. It is the difference in ratings between players that has significance, and it represents their relative scoring capabilities. (As far as this pair of chessplayers alone is concerned, Alexander Grischuk's Elo 2773 and Wesley So's Elo 2673 might just as well be changed arbitrarily to Elo 2000 and Elo 1900, respectively.)
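
The point can be checked numerically. The sketch below is only an illustration: it assumes the commonly used logistic approximation of the expected score, which this article does not specify, and the function name expected_score is a hypothetical one chosen for the example.

```python
def expected_score(rating: float, opponent_rating: float) -> float:
    """Expected score (between 0 and 1) under the logistic approximation.

    Assumption: this curve is a common stand-in for the Elo expectancy;
    the article itself does not say which curve is meant.
    """
    return 1.0 / (1.0 + 10.0 ** ((opponent_rating - rating) / 400.0))


# The pair from the text, then the same pair shifted down by 773 points.
actual_pair = expected_score(2773, 2673)
shifted_pair = expected_score(2000, 1900)

print(f"2773 vs 2673: {actual_pair:.3f}")
print(f"2000 vs 1900: {shifted_pair:.3f}")
```

Both lines print the same value, about 0.64, because only the 100-point difference enters the calculation.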

When the rating system is operated on a continuous basis, as FIDE does now, ratings are computed after each event by the current rating formula: Rn = Ro + K(W - We), where Rn is the new rating, Ro the old rating, K a coefficient that sets how many rating points a single game is worth, W the actual score and We the expected score. The self-correcting characteristics of this equation, when applied continually with statistically adequate interplay within the rating pool, will automatically generate proper relative ratings after sufficient time. Therefore, minor errors in rating calculations due to data omissions will not affect the overall integrity of the system in the long run.
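
The self-correcting behavior can be sketched in a few lines. This is an idealized illustration, not FIDE's procedure. It assumes the logistic approximation of the expected score, K = 10, a hypothetical player whose proper rating is 2600 but whose published rating is 20 points too low after a data omission, an opponent rated 2600 in every game, and results that exactly match the player's true expectation (real results fluctuate). All names in the code are hypothetical.

```python
def expected_score(rating: float, opponent_rating: float) -> float:
    """Expected score under the logistic approximation (an assumption)."""
    return 1.0 / (1.0 + 10.0 ** ((opponent_rating - rating) / 400.0))


def update_rating(old_rating: float, k: float, actual: float, expected: float) -> float:
    """The formula from the text: Rn = Ro + K(W - We)."""
    return old_rating + k * (actual - expected)


TRUE_RATING = 2600.0   # hypothetical "proper" rating of the player
OPPONENT = 2600.0      # hypothetical opponent rating, the same every game
K = 10.0               # assumed coefficient


def simulate(start_rating: float, games: int) -> float:
    """Apply the update game by game, starting from a mistaken rating."""
    rating = start_rating
    for _ in range(games):
        # Idealization: the player scores exactly his true expectation.
        actual = expected_score(TRUE_RATING, OPPONENT)
        expected = expected_score(rating, OPPONENT)
        rating = update_rating(rating, K, actual, expected)
    return rating


for n in (0, 30, 100, 300):
    print(n, round(simulate(2580.0, n), 1))
```

Under these assumptions the printed rating climbs steadily from 2580 toward 2600: while the published rating is too low, the expected score is understated, so each game returns a little more than it should until the gap closes.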

When all the participants in a tournament have ratings that fall within a rating interval of 200 Elo points, the players are said to belong to one playing class and good all-around competition results. No one is badly outclassed and no one badly outclasses the field. The weakest player on his good day will play about as well as the strongest player on the latter’s off day.

In choosing a particular participant to invite to a chess event, the organizer should not rely on the player’s rating as the sole criterion for selection but should also weigh the overall character of the player, which will certainly have a greater impact on the conduct and success of the competition.

With the foregoing demythologizing of the chessplayer’s Elo Rating, we have come to realize that its obsessive valuation by players, organizers and FIDE alike has no foundation in fact.


Previous articles by Elmer Dumlao Sangalang
