Ranking chess players according to the quality of their moves

by Frederic Friedel
4/27/2017 – How do you rate players from different periods? An AI researcher has undertaken to do it based not on the results of the games, but on the quality of the moves played. Jean-Marc Alliot used a strong chess engine running on a 640-processor cluster to analyse over two million positions that occurred in 26,000 games of World Champions since Steinitz. From this he produced a table of probable results between players of different eras. Example: Carlsen would have beaten Smyslov 57:43.


Artificial Intelligence evaluates chess champions

The Elo rating system in chess, well known to all of us, is based on the results of players against each other. Designed by the Hungarian-born physics professor and chess master Árpád Imre Élo and adopted by FIDE in 1970, the system is used to predict the probability of rated players winning or losing their games against other rated players. If a player performs better or worse than predicted, rating points are added to or deducted from his rating. However, the Elo system does not take into account the quality of the moves played during a game and is therefore unable to reliably rank players who have played at different periods in history.
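The mechanism described above can be sketched in a few lines of Python. This is only an illustration of the standard logistic expected-score formula and the linear rating update, not code from the article or from any federation; the K-factor of 20 and the two ratings are assumptions chosen for the example.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (between 0 and 1) of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_rating(rating: float, expected: float, actual: float, k: float = 20.0) -> float:
    """New rating after one game.

    actual is 1 for a win, 0.5 for a draw, 0 for a loss.
    k is the K-factor; 20 is an assumed value for this sketch.
    """
    return rating + k * (actual - expected)


# Hypothetical ratings, for illustration only.
e = expected_score(2800, 2700)        # about 0.64
after_win = update_rating(2800, e, 1.0)
after_loss = update_rating(2800, e, 0.0)
```

A player who scores above the expectation gains points and one who scores below it loses points, which is all the system looks at; the moves themselves never enter the calculation.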

Now computer scientist and AI researcher Jean-Marc Alliot of the Institut de Recherche en Informatique de Toulouse has come up with a new system (and reported on it in the journal of the International Computer Games Association) that does exactly that: rank players by evaluating the quality of their actual moves. He does this by comparing the moves of World Champions with those of a strong chess engine – the program Stockfish running on a supercomputer. The assumption is that the engine is executing almost perfect moves.

Alliot has evaluated 26,000 games played by World Champions since Steinitz, estimating the probability of their making a mistake – and the magnitude of the mistake – for each position in their games. From this he derived a probabilistic model for each player, and used it to compute the win/draw/loss probability for any given match between any two players. The predictions, he says, have proven not only to be extremely close to the results from actual encounters between the players, but they also fare better than those based on Elo scores. The results, he claims, demonstrate that the level of chess players has been steadily increasing. The current world champion, Magnus Carlsen, tops the list, while Bobby Fischer is third.
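To give a feel for what "probability and magnitude of a mistake" could mean in practice, here is a deliberately simplified Python sketch. It is not Alliot's model: the function names, the 50-centipawn threshold and the sample evaluations are all assumptions made for illustration. It compares the engine's evaluation of the best move with its evaluation of the move actually played, both in centipawns from the mover's point of view.

```python
from statistics import mean


def move_losses(evals):
    """Loss per move in centipawns: best-move eval minus played-move eval."""
    return [max(0, best - played) for best, played in evals]


def error_profile(evals, threshold_cp=50):
    """Summarise how often, and by how much, a player deviates from the engine.

    threshold_cp is an assumed cut-off for calling a deviation a mistake.
    """
    if not evals:
        raise ValueError("need at least one evaluated position")
    losses = move_losses(evals)
    mistakes = [loss for loss in losses if loss >= threshold_cp]
    return {
        "mistake_rate": len(mistakes) / len(losses),
        "mean_mistake_cp": mean(mistakes) if mistakes else 0.0,
        "mean_loss_cp": mean(losses),
    }


# Hypothetical (best_eval, played_eval) pairs, not taken from any real game.
sample = [(30, 30), (45, 10), (120, 20), (0, -5)]
profile = error_profile(sample)
```

A real model would condition these statistics on the position (for example on how good or bad it already is) before turning them into match predictions; the sketch only shows the raw ingredients.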

Here are predictions of game results between the different world champions in their best year:

              Ca  Kr  Fi  Ka  An  Kh  Sm  Pe  Kp  Ks
Carlsen       --  52  54  54  57  58  57  58  56  60
Kramnik       49  --  52  52  55  56  56  57  55  59
Fischer       47  49  --  51  53  57  56  57  56  59
Kasparov      47  49  50  --  53  54  54  54  53  57
Anand         44  46  48  48  --  54  52  53  53  57
Khalifman     43  45  44  47  47  --  50  51  52  53
Smyslov       43  45  45  47  49  51  --  50  51  53
Petrosian     43  44  45  47  49  50  51  --  52  53
Karpov        44  46  45  48  48  49  50  49  --  51
Kasimdzhanov  41  43  42  45  45  48  48  48  50  --
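The figures in the table read like expected scores out of 100 (Carlsen 57 against Smyslov 43, for instance). Assuming that is what they are, a win/draw/loss prediction can be turned into such a number by counting a win as one point and a draw as half a point. The short Python sketch below shows the arithmetic; the three probabilities in the example are invented for illustration and are not taken from the paper.

```python
def expected_score_percent(p_win: float, p_draw: float, p_loss: float) -> float:
    """Expected score in percent: a win counts 1 point, a draw 0.5, a loss 0."""
    total = p_win + p_draw + p_loss
    if abs(total - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return 100.0 * (p_win + 0.5 * p_draw)


# Hypothetical probabilities that would give roughly a 57:43 prediction.
score = expected_score_percent(0.30, 0.54, 0.16)
opponent_score = 100.0 - score
```

Note that very different win/draw/loss splits can produce the same expected score, so a single number per pairing hides how drawish the predicted match is.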

Under current conditions, Alliot feels, this new ranking method cannot immediately replace the Elo system, which is easier to set up and implement. However, increases in computing power will make it possible to extend the new method to an ever-growing pool of players in the near future.

Read the full detailed paper published by Jean-Marc Alliot in the ICGA Journal, Volume 39-1, April 2017. Mathematically proficient readers are welcome to comment on his method and his results.

Editor-in-Chief of the ChessBase News Page. Studied Philosophy and Linguistics at the Universities of Hamburg and Oxford, graduating with a thesis on speech act theory and moral language. He started a university career but switched to science journalism, producing documentaries for German TV. In 1986 he co-founded ChessBase.
Discussion and Feedback

satman 4/27/2017 11:53
So Khalifman at his best would have had the edge on both Petrosian and Karpov.
You learn something every day!
nimzobob 4/27/2017 11:44
Are there some factors that are not being considered here? Sub-optimal moves can be played for a reason: to reduce counterplay in a winning position, to complicate the game, to give better winning chances, to keep the game alive, etc.

What is the margin of error here? When the margins are very small it is easy to lie with statistics :-)
WildKid 4/27/2017 11:36
I have read the whole paper with some attention. On the whole it seems to be a good piece of work, but I do have the following suggestions for improvement.

A) The methodology of measuring the mean difference between chosen and optimal moves is biased in favor of 'safe' players who stick to balanced lines, and thus rarely make big mistakes. It would evaluate Petrosian as a much better player than Tal, for example. Tal would lose points for 'unsound' sacrifices where the correct defense is hard to see, and also for possibly suboptimal play in highly tactical situations that still may result in a won game. The algorithm could be improved by giving credit to players when their OPPONENTS frequently make mistakes (presumably because they are in positions that are very difficult for a human to evaluate for either side, but that some players such as Tal and Mamedyarov(?) prefer).

B) Another improvement would allow e.g. Stockfish to retrospectively re-evaluate a position where evaluation of subsequent moves proves the initial engine evaluation to be mistaken. For example, there is a very famous game where Kasparov, as White, sacrificed a knight, and engines ruled the sacrifice unsound. However, three or four moves later, the engines evaluate the position as good for White, without being able to identify any error in Black's play. The algorithm as written would punish Kasparov for disagreeing with the engine, rather than reward him for being smarter than it!

That said, I think this paper is a good start.