Ratings Summit in Athens
A preliminary report by Jeff Sonas
On June 11-12, FIDE held a special meeting in Athens, Greece to discuss the
implications of changes to the FIDE rating system, especially the increase of
the K-factor. The K-factor controls how rapidly a player's rating responds
to their recent results. The increase had been previously agreed to by the
General Assembly, and was scheduled to go into effect at the start of July.
Ratings experts from around the world were brought together to recommend a course
of action to the Presidential Board (which met a few days later in Krakow, Poland),
and FIDE Deputy President Georgios Makropoulos chaired the two-day Athens meeting.
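To make the role of the K-factor concrete, here is a minimal sketch of the standard Elo update for a single game. It assumes the usual logistic expected-score curve (FIDE's regulations work from a lookup table that this curve approximates), and the function names and sample ratings are illustrative only.

```python
def expected_score(own_rating: float, opp_rating: float) -> float:
    """Logistic Elo expectation for a single game (value between 0 and 1)."""
    return 1.0 / (1.0 + 10.0 ** ((opp_rating - own_rating) / 400.0))


def updated_rating(own_rating: float, opp_rating: float,
                   score: float, k: float) -> float:
    """Rating after one game; score is 1 (win), 0.5 (draw) or 0 (loss)."""
    return own_rating + k * (score - expected_score(own_rating, opp_rating))


# A win against an equally rated opponent: the expectation is 0.5,
# so the gain is exactly K / 2.
print(updated_rating(2500, 2500, 1.0, k=10))  # 2505.0
print(updated_rating(2500, 2500, 1.0, k=20))  # 2510.0
```

Doubling K doubles the rating change produced by the same result, which is why the choice of K matters so much more than most other parameters of the system.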
Meeting participants (left to right): FIDE Executive Director David Jarrett,
FIDE Deputy President Georgios Makropoulos, Nick Faulks, GM Bartlomiej Macieja,
Jeff Sonas, GM John Nunn, FIDE Qualification Commission Chairman Mikko Markkula,
Vladimir Kukaev. Not shown (because he took the picture): FIDE Treasurer Nigel Freeman.
I should say first of all that I was very impressed with the FIDE decision
to hold this meeting, as well as their conduct during the meeting. Whatever
the procedural problems in the past that led to this situation, emergency discussion
and analysis were certainly called for at this juncture, and it was a very productive
meeting. I have nothing but good things to say about FIDE throughout this process.
One point to emphasize is that for this meeting at least, the only two options
were to recommend that the Presidential Board accept, or reject, the specific
decisions of the General Assembly, so we did not spend too much time on debating
the "perfect solution". After analyzing the issue at length during
our meeting, the majority clearly felt that the doubling of the K-factor was
not necessarily an improvement, and that the matter was sufficiently unclear
as to require further analysis. In fact my personal feeling, after looking
at the data and running many simulations, goes beyond that statement: doubling
the K-factor could quite likely be a very poor move. Therefore
we recommended to the Presidential Board that it refer the K-factor increase
back to committee, while supporting several other changes to the rating calculation,
including the move to 6 lists per year (rather than 4 lists per year) as well
as changing the "350-point rule" to a "400-point rule".
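For readers unfamiliar with the point rule: as generally described, it caps the rating difference that is fed into the expected-score calculation, so that a difference larger than the cap is treated as exactly the cap. The sketch below assumes that reading and uses the logistic approximation to the expected score; the function names are hypothetical.

```python
def capped_difference(own_rating: float, opp_rating: float,
                      cap: float = 400.0) -> float:
    """Rating difference limited to the range [-cap, +cap]."""
    return max(-cap, min(cap, own_rating - opp_rating))


def expected_score_capped(own_rating: float, opp_rating: float,
                          cap: float = 400.0) -> float:
    """Logistic expected score computed from the capped difference."""
    diff = capped_difference(own_rating, opp_rating, cap)
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


# A 2700 player facing a 2100 player: the 600-point gap is cut to the cap.
print(capped_difference(2700, 2100, cap=350))  # 350
print(capped_difference(2700, 2100, cap=400))  # 400
```

Under the higher cap the favorite is expected to score slightly more against a much weaker opponent, and so gains slightly less for winning.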
I am currently working on a longer writeup summarizing the technical discussion,
as well as additional analysis that was inspired by the brainstorming during
the meeting itself and subsequent informal discussions with other
attendees. For now I would just like to point out a few highlights from the
It might seem strange to have supported the move to 6 lists per year, while
at the same time rejecting the K-factor increase. You would think these
two decisions in combination would actually make the rating system less
dynamic rather than more. This is true, but it is important to recognize
the relative magnitude of the two factors. A change to the K-factor makes
a big difference, whereas changing from 4 lists/year to 6 lists/year makes
a very small difference. Even increasing the K-factor for established players
from 10 to 11 would be a severe overreaction to the move from 4 lists/year
to 6 lists/year. I looked at the average rating change for each player
across various rating formulas, and my calculations indicated that the proper
increase in K-factor, to account for the reduced rating "responsiveness"
associated with the move from 4 lists/year to 6 lists/year, would be to
increase the K-factor from 10 to 10.2. So if we decide to increase the
K-factor significantly, it ought to be for reasons other than the move from
4 lists/year to 6 lists/year.
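The direction of this effect can be illustrated with a deliberately simple toy model. It is not the simulation described above: it assumes a single under-rated player, identical opponents, a logistic expected score, and deterministic results equal to what the player's (hypothetical) true strength predicts. All parameter values are invented.

```python
def expected_score(own_rating: float, opp_rating: float) -> float:
    """Logistic Elo expectation for a single game."""
    return 1.0 / (1.0 + 10.0 ** ((opp_rating - own_rating) / 400.0))


def rating_after_one_year(lists_per_year: int, k: float = 10.0,
                          start: float = 2500.0, true_strength: float = 2600.0,
                          opponent: float = 2500.0,
                          games_per_year: int = 60) -> float:
    """Toy model: all games in a rating period are rated against the
    published rating, which only changes when a new list appears."""
    rating = start
    games_per_period = games_per_year / lists_per_year
    # Deterministic stand-in for results: the player scores exactly what
    # their (hypothetical) true strength predicts in every game.
    score_per_game = expected_score(true_strength, opponent)
    for _ in range(lists_per_year):
        surplus = score_per_game - expected_score(rating, opponent)
        rating += k * games_per_period * surplus
    return rating


four = rating_after_one_year(4)
six = rating_after_one_year(6)
print(f"4 lists/year: {four:.1f}, 6 lists/year: {six:.1f}")
```

In this toy setup the under-rated player climbs slightly less with 6 lists than with 4, because each batch of games is rated against a fresher, higher rating. The size of the gap depends entirely on the made-up inputs and should not be read as reproducing the 10 to 10.2 figure.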
I felt a particular responsibility for this situation because I advocated
moving to a universal K-factor of 24 seven years ago, and this conclusion
was used to partially justify the decision to double the K-factor. My earlier
analysis included optimizing a single universal K-factor, rather than the
three-tiered K-factor that FIDE currently has (where some players have K=10,
some players have K=15, and some players have K=25). In fact the vast majority
of players do have K=15 already. Additionally my earlier analysis was performed
on a smaller, unofficial subset of games from 1994-2001, whereas my more
recent analysis covered the entire dataset – all players and all results
– for the whole FIDE rating list from 1999-2009. My older model also did
not have a mechanism for introducing new provisional players into the system,
whereas my latest model actually matches just about all of the FIDE rating
regulations. For all these reasons I feel that the recent analysis is more
reliable, and I am very grateful that FIDE was so forthcoming with the data
for my analysis. I will be writing much more about my conclusions regarding
the K-factor as well as my simulation model itself.
GM Bartlomiej Macieja held the consistent view that the ratings are so
important, being used for direct qualification into the Candidates and World
Cup events, that if we have to make the calculations somewhat more complex
and it results in more accurate ratings, this is a good tradeoff. For instance,
two players may be so close in their ratings that their positions might
be reversed if we adjusted for number of games played with White vs. Black,
or if we used a different K-factor. We discussed this at some length informally
after the meeting, and others felt that the rating system is a nice balance
right now between simplicity and accuracy, and were concerned about the
introduction of further complexity into it. Certainly another way to tackle
this problem would be to change how the qualification process works, to
make it depend more upon direct over-the-board results and less on the precise
rating calculation. This is a very difficult issue.
GM John Nunn had surveyed several of the top players, including both younger
and older players from the top ten, and shared some of the results of that
survey. They all seemed to feel that the current system was working fine.
Obviously it is in their interest to be conservative because their ratings
are already high, but some of these players (especially the younger ones)
are still improving, and thus it would also seem to be in their interest
to support a more dynamic system. Nevertheless they pretty much all said
that they didn't see a reason to change the K-factor.
GM Dmitriy Jakovenko had previously written an article for the ACP website
regarding the increase of the K-factor, and this article was discussed extensively
during our meeting. His description of various K-factors as reflecting
various beliefs in how important your last 20 games are, compared to all
previous games, was quite useful in conveying the impact of the K-factor.
It was a very helpful perspective and provided a subjective assessment that
people could make themselves, instead of just having to trust the analysis
of people like me with all our statistics. The subjective assessment shouldn't
be the only one but it absolutely helps in framing the discussion.
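A rough way to put numbers on this framing, offered as a back-of-the-envelope sketch and not as the calculation in Jakovenko's article, is to linearize the Elo update. Near an expected score of 0.5, each game pulls the rating a fraction K·ln(10)/1600 of the way toward the level implied by that game, so the weight left on everything before the last n games is the complement of that fraction raised to the power n. The function name and the choice of evenly matched games are assumptions.

```python
import math


def weight_of_recent_games(k: float, n_games: int = 20,
                           expected: float = 0.5) -> float:
    """Approximate share of the current rating owed to the last n games.

    Linearized Elo update: each game shrinks the influence of the old
    rating by a factor (1 - alpha), with alpha = K * dE/dR, where
    dE/dR = ln(10) / 400 * E * (1 - E) is the slope of the logistic curve.
    """
    alpha = k * math.log(10) / 400.0 * expected * (1.0 - expected)
    return 1.0 - (1.0 - alpha) ** n_games


for k in (10, 15, 20):
    print(f"K={k}: last 20 games carry about "
          f"{weight_of_recent_games(k):.0%} of the weight")
```

Under these assumptions, a higher K shifts a visibly larger share of the rating onto the most recent 20 games, which is the trade-off the subjective assessment asks people to judge for themselves.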
Deputy President Makropoulos did much more than just run the meeting, and
had several insightful points that motivated me to reassess some of my analysis. For
instance he didn't completely agree with my approach of evaluating the "accuracy"
of rating systems by comparing expected score against actual score, because
we expect improving players to outperform their rating and thus perhaps
some of that should be considered "improvement" in the players
rather than just error in the rating itself. He was also resistant to the
idea that we must look back to Professor Elo's writings and opinions to
determine the ideal course of action, preferring instead that we start from
the assumption that the current situation is the default that players are
accustomed to. I agree that the burden of proof should fall upon people
advocating a change in the current system, even if that change would involve
moving more toward what Elo himself originally envisioned.
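The kind of accuracy check under discussion can be sketched as follows. This is only an illustration under assumed details (a logistic expected score, a mean squared error, and invented sample games), not the evaluation actually used in the analysis. It also reports the mean signed surplus, which is the quantity the Deputy President's objection concerns: if it is positive for a group of players, some of the apparent error may simply be improvement.

```python
from typing import Iterable, Tuple

Game = Tuple[float, float, float]  # (player rating, opponent rating, score)


def expected_score(own_rating: float, opp_rating: float) -> float:
    """Logistic Elo expectation for a single game."""
    return 1.0 / (1.0 + 10.0 ** ((opp_rating - own_rating) / 400.0))


def prediction_error(games: Iterable[Game]) -> Tuple[float, float]:
    """Return (mean squared error, mean signed surplus) over the games.

    The surplus of a game is actual score minus expected score.
    """
    surpluses = [score - expected_score(own, opp) for own, opp, score in games]
    if not surpluses:
        raise ValueError("need at least one game")
    mse = sum(s * s for s in surpluses) / len(surpluses)
    bias = sum(surpluses) / len(surpluses)
    return mse, bias


# Invented results for a hypothetical improving junior rated 2300.
sample = [(2300, 2300, 1.0), (2300, 2300, 0.5), (2300, 2300, 1.0)]
mse, bias = prediction_error(sample)
print(f"MSE={mse:.3f}, mean surplus={bias:+.3f}")
```

Comparing rating formulas by the squared error alone treats the whole surplus as rating error; looking at the signed surplus separately is one simple way to see how much of it is systematic.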
Finally, one topic that received considerable discussion was "rating
inflation" – what does it actually mean, is there evidence of it, and
what causes it? This was relevant to the K-factor discussion because a
higher K-factor does seem to increase "inflation", depending on
your definition of the word, of course! In addition if top players' ratings
are really increasing too fast then it could devalue the Grandmaster title,
which is based upon an absolute rating cutoff. Inflation was certainly
a topic where there was some disagreement among the experts regarding the
fundamental answers to all three of the above questions, and certainly requires
further investigation. I will be writing much more about this, and I anticipate
lively discussion among the mathematically-inclined readers in particular.
Jeff Sonas in Athens
K Factor Meeting in Athens
Prior to the Presidential Board in Krakow, a meeting was held in Athens to
discuss the proposed changes to the rating system and in particular the increase
in the K factor. A small group of experts gathered to discuss the matter and
to make recommendations, if needed, regarding the decisions taken in Dresden.
The two-day meeting was chaired by the FIDE Deputy President, Georgios Makropoulos,
and included the FIDE Treasurer Nigel Freeman, FIDE Executive Director David
Jarrett, FIDE Qualification Commission Chairman Mikko Markkula, FIDE Qualification
Commission Councillor Nick Faulks, Jeff Sonas from California, GMs John Nunn
and Bartlomiej Macieja, plus Vladimir Kukaev from the Ratings Office in Elista.
The meeting focused on the K factor issue but also dealt with a number of other
matters including possible inflation in the rating system. The meeting supported
the move to 6 lists per year and the increase of the ‘350 point rule’
to the ‘400 point rule’, but felt that the increase in the K factor should
be referred back to the Qualification Commission, and this recommendation was
agreed at the Presidential Board.
Jeff Sonas and Bartlomiej Macieja both produced interesting and detailed material
and John Nunn had carried out a survey of top players. FIDE is indebted to them
for their hard work in the preparation for this meeting. In addition, the ACP
forwarded an important paper by GM Dmitriy Jakovenko.
It is intended that this group continues to cooperate and meet again next year.
The great K-Factor debate on ChessBase.com
Ratings Summit in Athens
22.06.2009 – On June 11-12, FIDE held a special
meeting in Athens, Greece to discuss the implications of changes to the
FIDE rating system, especially the increase of the K-factor. Ratings
experts from around the world (including John Nunn and GM Bartlomiej Macieja)
were brought together to recommend a course of action to the Presidential
Sonas reports on the meeting.
Rating and K-factor: wrapping up the debate
11.05.2009 – The discussion regarding the
K-factor – the rate at which ratings go up or down when they are calculated
– reaches its climax with a wrap-up article by Dr John Nunn, grandmaster
and mathematician, who evaluates the arguments that have been presented
by the different parties. After this it is up to FIDE, which has already
initiated positive steps to settle the matter. Final
Thompson: Leave the K-factor alone!
07.05.2009 – The debate on
whether to increase the rate of change of the Elo list continues. Today
we received an interesting letter from Ken Thompson, the father of Unix
and C, and a pioneer of computer chess. Ken believes that the current
rating system isn't broken and that the status quo is better than change.
If anything the ratings should be published more often – every day if
Rating debate (6): Here comes the proof!
04.05.2009 – "I couldn't believe my eyes when
I read GM John Nunn's opinion," writes GM Bartlomiej Macieja (pronunciation
supplied), the original initiator of this debate. He presents proof for
the fact, challenged by Nunn, that the K-factor and the frequency of rating
lists are related to one another. Other readers have also weighed in;
a wrap-up reply by John Nunn will appear soon. Long,
Rating debate: is 24 the ideal K-factor?
03.05.2009 – FIDE decided to speed up the change
in their ratings calculations, then turned more cautious about it. Polish
GM Bartlomiej Macieja criticised them for balking, and Jeff Sonas provided
compelling statistical reasons for changing the K-factor to 24. Finally
John Nunn warned of the disadvantages of changing a well-functioning system.
Here are some more interesting
Nunn on the K-factor: show me the proof!
30.04.2009 – With the debate raging over FIDE's
decision to change or not to change the K-factor used in calculating players'
ratings, we are glad to receive an important message from our voice-of-reason
grandmaster. Dr John Nunn says "there seems no real evidence that K=20
will result in a more accurate rating system, while there are a number
of risks and disadvantages." His
explanation and reader feedback.
Macieja: the FIDE General Assembly must decide
30.04.2009 – "Using the FIDE Laws of Chess
terminology, the move has been made, and no takeback is any longer possible."
Polish GM Bartlomiej Macieja is insisting that the decision to increase
the K-factor in rating calculations is not just necessary and good in
the current tournament situation, it is in fact irrevocable and can only
be legally changed by the body that passed it. Open
FIDE: We support the increase of the K-factor
29.04.2009 – Yesterday we published a letter
by GM Bartlomiej Macieja asking the World Chess Federation not to delay
the decision to increase the K-factor in their ratings calculation. Today
we received a reply to Macieja's passionate appeal from FIDE, outlining
the reasons for the actions. In addition interesting letters from our
readers, including one from statistician Jeff Sonas. Opinions
Macieja: The increase of the K-factor is essential
28.04.2009 – Yesterday we reported
that FIDE had decided not simply to change the K-Factor in its rating
calculation, but to publish two parallel lists for a year and then review
the results. Today we received a passionate appeal by GM Bartlomiej Macieja
not to delay the decision but increase the K-factor immediately. In fact
he advocated recalculating the lists of the last two or even five years.
the debate begin.
FIDE: Anand-Topalov bidding, K-Factor
27.04.2009 – The World Chess Federation has
opened the bidding for the next World Championship match between Viswanathan
Anand and Veselin Topalov, scheduled for April 2010. At the same time
FIDE has reacted to concerns of players and decided not to simply change
the K-Factor in its rating calculation, but to in fact publish two parallel
lists for a year and then review the results. Press