Impressions from FIDE rating conference 2010
By Michalis Kaloumenos
I was well prepared to attend the FIDE rating conference held in Athens, Greece,
from June 1st to June 4th. I made a plan and wrote down questions, seeking
answers that would help me complete my task. I introduced myself and told the
participants that I wanted to write, in simple words, an article about the
rating system and the proceedings of the meeting, so that ordinary people could
understand the debate. “Good luck,” said Stewart Reuben, and I soon found out
that his wish was wise: there were no easy answers to my questions!

The participants of FIDE rating conference 2010 (left to right): Mikko Markkula
(Chairman of FIDE Qualification Commission), Stewart Reuben (Secretary of FIDE
QC), Nick Faulks (Councillor of FIDE QC), David Jarrett (FIDE Executive Director
and moderator of the panel), Jeff Sonas, GM Bartlomiej Macieja and Vladimir
Kukaev (Director of FIDE Elista Office)
I joined the Wednesday afternoon session, when Jeff Sonas presented his graphs
regarding the “rating inflation” problem. The well-known statistician has spent
years analyzing the rating system using the FIDE lists as input. In order to
apply his own ideas retroactively to past years, he even reconstructed
tournament tables from chess databases, because prior to 2006 FIDE required
only a player’s final score and the average Elo of his opponents in order to
calculate ratings, not the individual results in detail. However, despite so
many years of analysis, the “rating inflation” problem lacks a widely accepted
definition, and before a situation can be accepted as a problem, a definition
that identifies it is required. So Jeff Sonas drew a graph of the Elo rating of
the player ranked number five (also #10, #20, #50 and #100) on the FIDE lists
over the past years and found that the rating of this #5 player has risen.

Well, every coin has a flip side. A common observation related to the
“rating inflation” problem is that the May 2010 list includes 37 players rated
above 2700, compared to only 11 in the July 2000 list. When the rating system
was introduced in 1971, only Bobby Fischer was above this threshold. From this
point of view Bartlomiej Macieja had an explanation. He presented a graph with
the number of players rated in intervals of 100 points. But instead of using
the absolute number of players in every interval, he used the ratio of these
players to the total population of the rating list. The result was amazing:
this normalization showed that the distribution of players across the intervals
did not change over the years. According to Bartek, the “rating inflation”
problem is a result of the expansion of the population. Nowadays there are
almost 112,000 players with FIDE Elo ratings, compared to 33,384 in 2000.
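Macieja’s normalization is easy to reproduce in a few lines. The sketch below (with invented toy rating lists; the real FIDE lists are far larger) bins players into 100-point intervals and divides each bin’s count by the total list size, which is exactly the ratio he plotted:

```python
# Sketch of Macieja's normalization, using invented toy data.
from collections import Counter

def interval_shares(ratings):
    """Return {interval_floor: fraction of all rated players} per 100-point band."""
    counts = Counter((r // 100) * 100 for r in ratings)
    total = len(ratings)
    return {band: counts[band] / total for band in sorted(counts)}

# Hypothetical lists: the 2010 one has the same shape but a larger population.
list_2000 = [2250, 2310, 2340, 2405, 2460, 2520, 2610, 2705]
list_2010 = list_2000 * 3 + [2255, 2350, 2430, 2550]
```

On real list data, Macieja’s point is that these per-interval shares stay roughly constant across the years even though the raw counts (and the count of 2700+ players) grow with the population.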

This example is characteristic of the difference in approach between the two
main speakers of the conference. Both of them want to improve the accuracy of
the rating formula, but they understand the simple word “accuracy” in different
ways. For Jeff Sonas the formula is accurate if it provides an expected result
distribution identical to the actual result distribution under all circumstances
(unrated players, established players, higher rated, lower rated). Wherever a
deviation is observed, an adjustment is necessary. GM Bartlomiej Macieja, on
the contrary, wants to be accurately ranked on the FIDE rating list. (A simple
reason for this: sometimes Elo points alone qualify players for FIDE tournaments.)
Do you see the difference?
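To make Jeff Sonas’s notion of accuracy concrete: the Elo formula predicts an expected score for every pairing, and his test compares those predictions against actual results in bulk. A minimal sketch of the standard Elo expectation, on the usual 400-point logistic scale:

```python
def expected_score(r_a, r_b):
    """Standard Elo expected score of player A against player B:
    1 / (1 + 10 ** ((R_b - R_a) / 400))."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

# Equal ratings predict an even score; a 200-point edge predicts roughly 0.76.
print(expected_score(2400, 2400))           # 0.5
print(round(expected_score(2600, 2400), 2)) # 0.76
```

If, over many real games, players with a 200-point edge score noticeably more or less than this prediction, that is exactly the kind of deviation for which Sonas would want the formula adjusted.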
For Jeff, the numbers provide the required input for improving the rating
system. For Bartek, further interpretation of the numbers is needed; this
research may lead to new parameters that must be taken into account. (For
example, did you ever notice that there are players who select their tournaments
with care, so that a possible bad performance will not result in a big loss of
points?)
When David Jarrett opened the topic “towards the future”, Bartlomiej Macieja
recommended a radical change: completely abandon the Elo system and adopt the
Glicko system instead. He compared Arpad Elo’s system to Newton’s mechanics
and Mark Glickman’s system to Einstein’s theory of relativity; that is, the
Elo system is a special case of the Glicko system. Glicko allows the evaluation
and use of quality factors that the Elo system completely ignores. However,
constructing an accurate, widely accepted formula that everybody agrees on is
far from obvious.

Jeff Sonas – always in front of his computer
It suddenly became clear to me that this situation was not a two-pole debate
but rather a triangle. It is FIDE’s responsibility to organize and conduct a
creative dialogue among all interested parties in order to find the right
solution, one able to survive for at least the next ten years. As David Jarrett
pointed out (and everybody agreed), they would rather prepare carefully a solid
proposal for the 2012 General Assembly and patch the holes with plaster until
then.
From a technical point of view every solution is feasible. As Vladimir Kukaev
confirmed, six lists per year provide more accurate ratings than four lists
per year. Even a monthly list is possible, if it is so decided. In fact, the
system provides live rating changes (available in each player’s individual
profile) as soon as tournament results enter the system. However, it is
impossible to force tournament directors to update the players’ ratings every
day. In addition, tournament organizers do not submit the results themselves:
the National Federations are the members of FIDE accredited to fulfil this
task. As a result, several days usually pass before results enter the system.

Vladimir Kukaev is responsible for publishing the FIDE rating list six times
per year
Bear in mind that even a slight change to the present system might cause strong
objections. For example, a single K-factor (K=25) for all players may lead
high-rated players to deliberate partial inactivity. Another example: doubling
the number of games required for an unrated player (currently 9) to obtain his
initial Elo rating is an accepted mathematical route towards a more accurate
first entry. But I believe that certain National Federations would not like
this, especially if they struggle to develop chess in their country and bring
many newcomers into the system with limited financial support that does not
allow them to organize many tournaments. Further changes (not to mention a
revolutionary one) require diplomacy, politics and a well-prepared proposal
that can finally be voted on by the FIDE General Assembly.
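Why a single K=25 worries high-rated players is easy to see from the Elo update rule, delta = K × (actual score − expected score): a larger K amplifies every swing. A short sketch (the 2750-vs-2550 pairing is an invented example):

```python
def expected_score(r_a, r_b):
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def rating_change(k, score, expected):
    """Elo update: points gained or lost after one game."""
    return k * (score - expected)

# A 2750 player drawing a 2550 player (expected score ~0.76):
e = expected_score(2750, 2550)
print(round(rating_change(10, 0.5, e), 1))  # about -2.6 with K=10
print(round(rating_change(25, 0.5, e), 1))  # about -6.5 with K=25
```

With K=25 the same draw costs a top player two and a half times as many points, which makes avoiding slightly weaker opposition, or simply not playing at all, more tempting.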
So what was this conference all about? The main speakers presented the data
from their recent analyses, examined the significance of every parameter of
the formula, exchanged opinions with the rest of the participants, and finally
the team decided to write down what changes they should consider in the future,
if any changes to the present system are required. I would rather not report
this part of the proceedings, for two reasons: a) this conference had a purely
advisory purpose, since the participants are going to propose possible changes
to the Presidential Board and then address the General Assembly, and b) I
expect that an official announcement will soon become available. Furthermore,
I expect that Jeff Sonas and Bartlomiej Macieja will publish their own ideas
and conclusions and provide all the technical details in order to feed a public
dialogue. In the end everyone was satisfied with the particularly productive
four days they spent in Athens. As Mikko Markkula pointed out, the FIDE rating
system is stable and highly appreciated, and the conference found the right
path of discussion to make it even better.
Mikko Markkula (Chairman of FIDE QC). In front of him a heavily used copy
of Arpad Elo’s book “The Rating of Chessplayers, Past and Present” (currently
published by Ishi Press International). Bartek’s copy was in much better condition.

Michalis Kaloumenos and Stewart Reuben
It seems that as chess develops globally, ratings become a more complex and
dynamically changing system that requires continuous attention and re-evaluation.
Was it easier to calculate ratings in the past? I really do not know, but given
that the first list, from July 1971, included only 592 players, producing it
must have been an easier task. By the way, I discovered that this first list
was allegedly prepared by Mrs. Elo alone (not Mr. Elo) in her kitchen, and she
didn’t even use a calculator. This amazing piece of information was
confidentially revealed to me by SR, a respectable 71-year-old Englishman who
wishes his name to remain secret.
Appendix
Just as I was about to mail my article to ChessBase, I received an e-mail
from GM Bartlomiej Macieja stating:
Jeff has the following approach: he wants the player ranked Nth (for instance
10th or 100th) to have the same rating throughout history. If that doesn't happen,
he says we have inflation or deflation. For me, such an approach is not very
interesting. If I want to see which player was closer to the top in his time,
I simply check his position on a rating list published at that time. Why do I
need to duplicate columns in order to get the same information?
What I would like to measure is what happens to the ratings of players who
keep their level. If their rating increases, I call it inflation; if their
rating decreases, I call it deflation.
It is clear to everybody that Kasparov is much stronger than Steinitz
was. In Jeff's approach, the ratings of Kasparov and Steinitz would be more or
less the same. You can see it in his "chessmetrics ratings".

In my approach, there should be a huge gap in ratings between
Kasparov and Steinitz. You can look at it also from another side. As there
are more and more good players, in Jeff's approach a player who keeps his level
should see his rating decrease. It may lead to situations in which players
would not like to play (to be active), because on average, in Jeff's approach,
they would lose rating by playing games. In fact there is no escape for anybody,
because Jeff wants to punish inactive players as well.
In other words, in Jeff's approach a player needs to make constant progress
to keep his rating stable. In my approach, a player needs to keep his level
in order to keep his rating.
A few words about the author
Michalis Kaloumenos is an electrical and computer engineer who graduated from
the National Technical University of Athens. He lives in Athens with his wife
and three children. Michalis is a ChessBase software expert. From 2006 to 2009
he was responsible for the column “chess and computers” for the Greek chess
magazine “Skaki gia olous”. He conducted and edited many interviews for the
magazine including one with Georgios Makropoulos and another one with chess
engine Fritz 10. His current chess project is the construction and management
of the yet-to-be-launched www.e-skaki.gr
website (only in Greek), together with his old friends from the editorial team
of the magazine.
Copyright
ChessBase