GM Dr. John Nunn, England
I just thought I’d make some comments on the K-Factor debate.
Changing the K-factor from 10 to 20 for the whole Elo list is a radical change,
the most radical in the 40-year history of the Elo system. What is curious is
that in all the debate nobody seems to have set out exactly what the supposed
advantages to this change are. Certainly, people’s ratings will go up
and down faster, but why should this make the ratings ‘better’?
Jeff Sonas states "Using a more dynamic K-factor (24 instead of 10) would
result in ratings that more accurately predict players' future results, and
thus I would call those ratings more ‘accurate’." But where’s
the proof of this? It strikes me that this statement is in fact very unlikely
to be true. The performance of players varies considerably from one tournament
to the next. Increasing the K-factor effectively places greater weight on the
most recent results, so that the rating is based less on an average of results
and more on the latest tournament. Since this is subject to a wide random variation,
there seems no particular reason to believe that the rating will more accurately
reflect a player’s strength or better predict future results.
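Nunn's variance argument can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions not stated in the letter: a player of fixed true strength, decisive games only, and opponents of exactly equal strength, so that any rating movement is pure noise.

```python
import random

def expected(r_player, r_opp):
    # Standard logistic Elo expectancy
    return 1 / (1 + 10 ** ((r_opp - r_player) / 400))

def simulate(k, results, true_strength=2500):
    """Track a rating over a series of win/loss results (1/0)
    against opponents of exactly true_strength."""
    rating = true_strength
    history = []
    for s in results:
        rating += k * (s - expected(rating, true_strength))
        history.append(rating)
    return history

def spread(history):
    tail = history[len(history) // 2:]   # discard the burn-in half
    mean = sum(tail) / len(tail)
    return (sum((r - mean) ** 2 for r in tail) / len(tail)) ** 0.5

random.seed(1)
# Same coin-flip results for both K values: the player is exactly as
# strong as the field, so the rating should ideally not move at all.
results = [random.random() < 0.5 for _ in range(4000)]

print(spread(simulate(10, results)))   # smaller random wobble
print(spread(simulate(24, results)))   # larger random wobble
```

Under these assumptions the K=24 rating wanders noticeably further from the player's true strength than the K=10 rating, which is the sense in which a higher K amplifies random variation rather than accuracy.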
The argument put forward by GM Bartlomiej Macieja that an increase in the
K-factor is a necessary consequence of more frequent rating lists doesn't stand
up to examination. The K-factor and the frequency of rating lists are unrelated
to one another. Rating change depends on the number of games you have played.
If you have played 40 games in six months, it doesn't make any difference whether
FIDE publishes one rating list at the end of the six months or one every day; you've
still played the same number of games and the change in your rating should be the same.
I am against the increase in the K-factor, for two reasons. The first is that
the Elo system has worked well for 40 years, and while all rating systems have
their defects, during this time the Elo system has gained respect as a good
indicator of current playing strength. Why then change it in such a dramatic way?
The second reason doesn’t seem to have been mentioned so far, but I
think this is the reason why many top players are against the K=20 change. With
qualification to many important events, including the World Championship, being
based on ratings, obtaining a higher rating can be extremely valuable. In the
past there has been a certain amount of ‘rating cheating’, ranging
from the buying of individual games to the construction of entire imaginary
tournaments. With the stakes so high, this will doubtless also occur in the
future. The problem is that it is much easier to cheat on your rating with K=20.
With 20 points at stake in each game, it only takes a small amount of cheating
to cause a massive surge in your rating. What the top players are concerned
about is that the places in elite tournaments and even the World Championship
which should rightfully go to them will instead go to ‘rating cheats’.
Of course, you can cheat on your rating with K=10 too, but why make the task
so much easier?
On the whole, there seems no real evidence that K=20 will result in a more
accurate rating system, while there are a number of risks and disadvantages.
GM Michal Krasenkow, Gorzow Wlkp., Poland
For me personally the FIDE decision to increase the K-factor was a
bolt from the blue. Before the Dresden Congress it was only discussed in a purely
theoretical manner within a small group of specialists. And suddenly it was
put into practice – without wider discussion in the world chess community,
without any computer simulation (was it really difficult to recalculate the
events of, say, 2006-2008 according to the proposed rules?). Is it the right
way to make such revolutionary changes?
My opinion is that the K-factor can and should be increased, to 15 or 20 –
that can be a subject of discussion and – let me repeat – of computer
simulation. What is absolutely ridiculous is the introduction of K=30 for "before
2400" players. One of the main ideas of the Elo system was that the winner
of a single game got as many rating points as the loser lost. That was infringed
by the introduction of K=15 for players who have never reached 2400, which IMHO
was the main cause of rating inflation in recent decades (until the most recent
years, when the "350 rule", combined with game-by-game calculation,
became an even stronger inflation factor). What will happen when we introduce
K=30 – God knows. Definitely, the ratings of "before 2400" players
will have nothing to do with their playing strength, but rather with the number
of games they manage to play during the two-month rating period. 36 games,
i.e. two open tournaments a month – nothing special for an active player,
and he needn't do it "intentionally" – will lead to a 1.5-fold rating
"overleap", as Mr. Lorscheid showed in his example (i.e. a player rated
2300, with an average performance of 2400, will get to 2450!). I think it is obvious
to everyone that some inertia in rating changes is better than such an overshoot. Besides,
the inflation of ratings, with a lot of players getting 2400+ with K=30 and
then dropping back with K=20, will increase to a level no-one can even predict.
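Lorscheid's example can be reproduced directly from the update formula Rn = Ro + K(W − We). A minimal sketch, assuming 36 games against opponents rated 2400 with a 50% score (i.e. a 2400 performance), the logistic approximation of FIDE's expectancy table, and the period-based practice of evaluating every game against the rating at the start of the period:

```python
def expected(r_player, r_opp):
    # Logistic approximation of the Elo expectancy table
    return 1 / (1 + 10 ** ((r_opp - r_player) / 400))

def period_update(rating, k, opponents, score):
    """Period-style calculation: every game in the rating period
    is evaluated against the rating at the period's start."""
    we = sum(expected(rating, opp) for opp in opponents)
    return rating + k * (score - we)

# 36 games vs. 2400-rated opposition, scoring 18/36 (a 2400 performance)
new_rating = period_update(2300, 30, [2400] * 36, 18.0)
print(round(new_rating))  # lands above 2400: the rating overshoots the performance
```

The 2300 player ends the period rated roughly 2450, i.e. some fifty points above the level he actually performed at, which is the "overleap" the letter describes.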
Therefore, in the present situation I fully support the decision to halt the
changes. Then the wider discussion should be reopened and a computer simulation
of the different versions of the changes (K=15 for all; K=20 for all, etc.) should be
made; only after that can a new rating calculation system be introduced.
Elmer Dumlao Sangalang, Manila, the Philippines
The Current Rating Rn is given by the formula Rn = Ro + K(W - We), where W is
the actual score and We the expected score. Ro (the original rating) is based
on No games, and No depends on K (the rating-point value of a single game).
If K = 10, No = 80. If K = 15, No = 50. If K = 25, No = 30.
During a competition, the player plays N new games. The Current Rating formula
performs the operation of averaging the latest performance in N games into the
prior rating so as to smoothly diminish the effect of the earlier performances,
while retaining the full contribution of the latest performance. For the smooth
blending of the new into the old, the number of games to be newly rated should
not exceed the number of games on which Ro is based.
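The blending effect of K can be seen numerically. A small sketch of the update formula, plus one common rule of thumb (an assumption of mine, not stated in the letter) that No ≈ 800/K, which lands near, though not exactly on, the quoted figures of 80, 50 and 30:

```python
def current_rating(r_old, k, score, expected_score):
    # Rn = Ro + K * (W - We): the larger K, the more the latest
    # performance displaces the prior rating
    return r_old + k * (score - expected_score)

# Same tournament result (outscoring expectancy by 2 points),
# blended into the same prior rating under three K values:
for k in (10, 15, 25):
    print(k, current_rating(2350, k, 7.0, 5.0))   # 2370, 2380, 2400

# Rough rule of thumb relating K to the number of games No that
# effectively back a rating (close to Sangalang's 80 / 50 / 30):
for k in (10, 15, 25):
    print(k, round(800 / k))   # 80, 53, 32
```

The same two-point overperformance moves the K=25 rating two and a half times as far as the K=10 rating, which is precisely the trade-off between responsiveness and the number of games a rating effectively rests on.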
The reliability of a rating depends on the number of games used in the calculation
of the rating. With 30 games, the rating is 95% reliable. With 50 games, 98.8%.
With 80 games, 99.7%. When K = 25, we are satisfied with a rating that's based
on 30 games, which is only 95% credible. With K = 15, we settle for a rating
based on 50 games, which is 98.8% credible. With K = 10, we want a rating to
be based on 80 games so that it will be 99.7% credible.
The number of times the Rating List is produced has nothing to do with the
choice of K. The claim that two players should have the same rating if they
score the same number of points against a common set of opponents in the same
tournament is equivalent to demanding that their Performance Rating in the single
event count as their Current Rating. The Current Rating is different from the
Performance Rating. The Current Rating is made up of several Performance Ratings.
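The distinction can be checked numerically: two players with the same score against the same field share one performance rating, yet their current ratings still differ, because We depends on each player's own prior rating. A sketch with illustrative numbers (9 games against a 2400-rated field, both players scoring 5):

```python
def expected(r_player, r_opp):
    # Logistic approximation of the Elo expectancy table
    return 1 / (1 + 10 ** ((r_opp - r_player) / 400))

def current_rating(r_old, k, opponents, score):
    we = sum(expected(r_old, opp) for opp in opponents)
    return r_old + k * (score - we)

field = [2400] * 9
# Identical results against identical opposition...
a = current_rating(2500, 10, field, 5.0)   # dips slightly below 2500
b = current_rating(2300, 10, field, 5.0)   # rises above 2300
# ...yet the current ratings stay far apart, because each blends the
# shared performance into a different prior rating.
print(round(a), round(b))
```

Demanding equal ratings after this event would mean replacing the current rating with the single-event performance rating, which is exactly the conflation the letter warns against.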
FIDE should not support the increase in the K Factor if it does not want the
reliability of ratings to be diminished.
Johan Ostergaard, Copenhagen, Denmark
Jeff Sonas is brilliant – I only wish that the people at FIDE
spent more time reading and understanding his results. Does he really do all
this work for free?
John Rood, Holbrook, MA, USA
This is the infamous Jeff Sonas who around 2005 or so predicted it would be
a long time before computers surpassed the top grandmasters in playing skill.
Bostjan Sirnik, Ljubljana, Slovenia
As a response to this debate I have decided to share with you my concerns about
the effects of the new rating formulas on rating inflation/deflation.
I suggest empirically testing the following "common sense" conclusions
that can be drawn about the proposed new rating system:
– The increase of the K-factor will boost the variance of players' individual ratings.
– Consequently, players rated under 2400, with the higher K-factor, will have
a much higher probability of eventually reaching the 2400-point limit and thus
getting the lower K-factor (without any significant improvement in their playing strength).
– Consequently, the percentage of players rated under 2400 but holding the
lower K-factor will become much higher than it is today.
– Basically, this will lead to a significant deflation of rating points in
the pool of "just under 2400" players. It is even possible that
this will undermine the stability (and validity) of the whole rating system.
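One mechanism behind such pool-level drift is easy to check: with different K-factors on either side of the 2400 line, the winner's gain no longer matches the loser's loss (the zero-sum property Krasenkow's letter mentions). A sketch with illustrative ratings, assuming the proposed K=30 below 2400 and K=20 above, and the logistic expectancy formula:

```python
def expected(r_player, r_opp):
    # Logistic approximation of the Elo expectancy table
    return 1 / (1 + 10 ** ((r_opp - r_player) / 400))

# A 2390 player (K=30 under the proposal) beats a 2450 player (K=20)
winner_gain = 30 * (1 - expected(2390, 2450))   # positive change
loser_loss = 20 * (0 - expected(2450, 2390))    # negative change

net_created = winner_gain + loser_loss
print(winner_gain, loser_loss, net_created)  # net_created > 0: points injected into the pool
```

A loss in the other direction drains points from the pool instead, so the net drift depends on which flows dominate – exactly the kind of question that only the full historical simulation these letters call for could settle.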
Although these concerns cover only a minor part of the whole picture, I believe
that they should be discussed and, above all, empirically examined. I also want
to point out my impression that FIDE chose a highly irresponsible and unprofessional
way to put into practice such an important and revolutionary set of changes as
the new formulas behind the rating system certainly are. In his letter
published here at chessbase.com Jeff Sonas comments that "the question
of rating inflation is a very difficult one and really the only way to tackle
it is to look at actual data and see the result of various approaches".
GM Bartlomiej Macieja also
asked FIDE to perform statistical studies with the available empirical data:
"Instead of waiting for a year or two in order to show consequences of
the change of the value of the K-factor, it is much better and faster to calculate
results from last two or even five years using the new value of the K-factor."
It's incomprehensible that FIDE has the empirical data about chess ratings
from the past but is not willing to use it before making such decisions.
If FIDE lacks the know-how, then at least it could make these historical data
public and invite the chess players with the appropriate expertise (e.g. Mr.
Sonas) to contribute their analyses and conclusions. Obviously, corrections
to the current rating system must be made, but forcing ad hoc solutions
without scientific justification will do more damage than good.
Mark Adams, Wales
I have been a rating officer for 25 years and I am convinced that we, as chess
players, have lost the plot with ratings. I suggest we get rid of ratings and
go back to the old system of classes, where progress is determined by norms.
Imagine sitting down to a game and not worrying about losing rating points!
A cure for the 'too many draws' issue? There will still be a need to determine
some form of ranking for the top players – why not use a system similar
to tennis, where you get points for winning events? OK, too radical for most,
as ratings are ingrained in a chess player's psyche. But try to think beyond this
and maybe you'll agree?