Prize competition: can you out-predict Elo?

by ChessBase
8/18/2010 – The Elo rating system has done good service for half a century. But can we devise a more accurate method for measuring chess strength and predicting results? Kaggle, an organisation that provides a platform for data-prediction competitions, is giving statisticians 65,000 games which they must use to predict the results of 7,800 other games. There are prizes for the best new systems.

Elo versus the Rest of the World

This year marks the 40th anniversary of the historic “USSR versus the Rest of the World” chess tournament, won by the Soviet team in 1970. More recently we watched the famous “Kasparov versus the World” Internet chess game, won by Garry Kasparov in 1999 against a consultation team of more than 50,000 opponents. And next month Magnus Carlsen takes on the World in the RAW World Chess Challenge.

Now Jeff Sonas and Kaggle have teamed up to present a new competition in this grand tradition: it is “Elo versus the Rest of the World”, pitting hundreds of statisticians against Arpad Elo’s greatest legacy: the Elo rating system.

Arpad Elo, a Hungarian-born physicist and chess master, invented the Elo system half a century ago. Marvelous in its simplicity, it is easily the most widely used method for rating chess players, and has often been used to determine who is invited to play in top tournaments. The formula has also been applied to other contests, ranging from soccer to video games to trading-card games.

However, it has never really been demonstrated that Elo ratings are more accurate than other approaches would be. "Elo versus the Rest of the World", an online competition hosted by Kaggle, will benchmark Elo's system against other systems submitted by statisticians worldwide. The top performers have the opportunity to help shape the future of chess ratings. Competitors can "train" their rating systems using the results (downloadable from the Kaggle website) of more than 65,000 chess games played in recent years. They then submit their predictions for an additional 7,800+ games. The website instantly judges the accuracy of each entry's predictions, and top performers are tracked by a public leaderboard for all to see.  Already more than 90 teams of statisticians have submitted entries.
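For context on the benchmark being challenged, here is a minimal sketch of an Elo-style predictor in Python. The logistic expected-score formula on a 400-point scale is standard Elo; the K-factor of 20 and the example ratings are assumptions for illustration, not the competition's exact benchmark settings.

    # Minimal Elo-style sketch: expected score (the prediction) and the rating
    # update after a game. Parameters here are common conventions, not the
    # competition's official benchmark settings.

    def expected_score(rating_a, rating_b):
        """Expected score for player A against player B (1 = win, 0.5 = draw, 0 = loss)."""
        return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

    def update(rating_a, rating_b, score_a, k=20.0):
        """Return new ratings for A and B after a game in which A scored score_a."""
        e_a = expected_score(rating_a, rating_b)
        return rating_a + k * (score_a - e_a), rating_b + k * (e_a - score_a)

    # Example: a 2400 player draws a 2500 player and gains a few points.
    print(expected_score(2400, 2500))  # about 0.36
    print(update(2400, 2500, 0.5))     # roughly (2402.8, 2497.2)

The expected score doubles as a probabilistic prediction: a 100-point rating advantage corresponds to roughly a 64% expected score for the stronger player, which is the kind of per-game prediction being evaluated.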

Top performers who share their methodology receive Amazon.com gift vouchers, while the highest finishers can select prizes from among several chess software packages donated by ChessBase, including the latest versions of ChessBase, Big Database, and Fritz, as well as training DVDs featuring world champions.


First prize is a copy of Fritz signed by all-time greats Garry Kasparov, Viswanathan Anand, Anatoly Karpov, and Vladimir Kramnik!

The 15-week competition has only just kicked off, but things are already looking difficult for Elo: within the first 24 hours, several submissions had surpassed the Elo benchmark. It isn't a surprise that today's experts are able to outperform Elo. After all, the system was developed long before we could easily process large amounts of game data. However, we expected Elo might lead the list for a bit longer!

The competition will also benchmark other well-known rating systems. For example, Professor Mark Glickman developed the Glicko and Glicko-2 systems, which extend the Elo system by introducing additional parameters to track the reliability and volatility of player ratings. Ken Thompson used a linearly weighted performance rating over a player's last 100 games to calculate the professional ratings used years ago by the PCA (Professional Chess Association). And Jeff Sonas developed Chessmetrics ratings specifically to maximize predictive power. Perhaps these systems will hold up better!
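As a rough illustration of the recency weighting behind Thompson's approach, the sketch below computes a linearly weighted performance rating over a player's most recent games. The linear weights and the 400-point score adjustment are illustrative assumptions, not Thompson's exact formula.

    # Rough sketch, under assumptions: each of a player's most recent games
    # (up to 100) gets a weight that grows linearly with recency; the rating is
    # the weighted average opponent rating plus a linear adjustment for the
    # weighted score.

    def weighted_performance(games, window=100):
        """games: list of (opponent_rating, score) pairs, oldest first; score is 1, 0.5 or 0."""
        recent = games[-window:]
        weights = [i + 1 for i in range(len(recent))]   # oldest game weight 1, newest weight n
        total = float(sum(weights))
        avg_opponent = sum(w * opp for w, (opp, _) in zip(weights, recent)) / total
        weighted_pct = sum(w * s for w, (_, s) in zip(weights, recent)) / total
        return avg_opponent + 400.0 * (2.0 * weighted_pct - 1.0)

    # Example: three games against 2500-rated opposition, winning only the most recent.
    print(weighted_performance([(2500, 0), (2500, 0.5), (2500, 1)]))  # about 2633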
