Prize competition: can you out-predict Elo?

8/18/2010 – The Elo rating system has done good service for half a century. But can we devise a more accurate method for measuring chess strength and predicting results? Kaggle, an organisation that provides a platform for data-prediction competitions, is giving statisticians 65,000 games, which they must use to predict the results of 7,800 other games. There are prizes for the best new systems.

Elo versus the Rest of the World

This year marks the 40th anniversary of the historic “USSR versus the Rest of the World” chess tournament, won by the Soviet team in 1970. More recently we watched the famous “Kasparov versus the World” Internet chess game, won by Garry Kasparov in 1999 against a consultation team of more than 50,000 opponents. And next month Magnus Carlsen takes on the World in the RAW World Chess Challenge.

Now Jeff Sonas and Kaggle have teamed up to present a new competition in this grand tradition: it is “Elo versus the Rest of the World”, pitting hundreds of statisticians against Arpad Elo’s greatest legacy: the Elo rating system.

Arpad Elo, a Hungarian-born physicist and chess master, invented the Elo system half a century ago. Marvelous in its simplicity, it is easily the most widely used method for rating chess players, and has often been used to determine who is invited to play in top tournaments. The formula has also been applied to other contests, ranging from soccer to video games to trading-card games.

However, it has never really been demonstrated that Elo ratings are more accurate than other approaches would be. "Elo versus the Rest of the World", an online competition hosted by Kaggle, will benchmark Elo's system against other systems submitted by statisticians worldwide. The top performers have the opportunity to help shape the future of chess ratings. Competitors can "train" their rating systems using the results (downloadable from the Kaggle website) of more than 65,000 chess games played in recent years. They then submit their predictions for an additional 7,800+ games. The website instantly judges the accuracy of each entry's predictions, and top performers are tracked on a public leaderboard for all to see. Already more than 90 teams of statisticians have submitted entries.
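For readers who want to see what the train-then-predict workflow looks like in practice, here is a minimal Python sketch of an Elo-style baseline. It uses the standard Elo expected-score and update formulas, but the K-factor of 20, the starting rating of 1500, the game tuple layout and the function names are all illustrative assumptions. It is not the benchmark code used in the competition, and it does not reproduce the competition's data format or scoring rule.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def train_elo(games, k=20.0, initial=1500.0):
    """Replay games in order and return a dict mapping player -> rating.

    games: iterable of (white, black, white_score) tuples, where
    white_score is 1.0 (win), 0.5 (draw) or 0.0 (loss).
    k and initial are assumed values chosen for illustration.
    """
    ratings = {}
    for white, black, white_score in games:
        rating_white = ratings.get(white, initial)
        rating_black = ratings.get(black, initial)
        expected_white = expected_score(rating_white, rating_black)
        # Each player moves by k times (actual score - expected score).
        ratings[white] = rating_white + k * (white_score - expected_white)
        ratings[black] = rating_black + k * (expected_white - white_score)
    return ratings


def predict(ratings, white, black, initial=1500.0):
    """Predicted score for White; unseen players get the initial rating."""
    return expected_score(ratings.get(white, initial),
                          ratings.get(black, initial))


if __name__ == "__main__":
    # Hypothetical training games, not taken from the competition data.
    training = [("A", "B", 1.0), ("B", "C", 0.5), ("A", "C", 1.0)]
    ratings = train_elo(training)
    print(ratings)
    print(predict(ratings, "A", "B"))
```

A competing system would replace `train_elo` and `predict` with its own logic and be compared with this baseline on games that were held back from training.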

Top performers who share their methodology receive Amazon.com gift vouchers. Alternatively, the highest finishers can select prizes from among several chess software packages donated by ChessBase, including the latest versions of ChessBase, Big Database, and Fritz, as well as training DVDs featuring world champions.


First prize is a copy of Fritz signed by all-time greats Garry Kasparov,
Viswanathan Anand, Anatoly Karpov, and Vladimir Kramnik!

The 15-week competition has only just kicked off, but already things are looking difficult for Elo. Within the first 24 hours, several submissions had surpassed the Elo benchmark. It isn't a surprise that today's experts are able to outperform Elo. After all, the system was developed long before we could easily process large amounts of game data. However, we expected Elo might lead the list for a bit longer!

The competition will also benchmark other well-known rating systems. For example, Professor Mark Glickman developed the Glicko and Glicko-2 systems, which extend the Elo system by introducing additional parameters to track the reliability and volatility of player ratings. Ken Thompson used a linearly weighted performance rating, from a player's last 100 games, to calculate the Professional chess ratings used years ago by the PCA. And Jeff Sonas developed Chessmetrics ratings specifically to maximize predictive power. Perhaps these systems will hold up better!
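To give a feel for what a "linearly weighted performance rating" over a player's most recent games might look like, here is a rough Python sketch. It is only an illustration of the general idea, not Ken Thompson's actual formula. The weighting scheme (weights 1, 2, ..., n from oldest to newest game), the linear performance approximation (average opponent rating plus 800 times the score fraction above 50%) and the function name are assumptions made for this example.

```python
def linearly_weighted_performance(results, window=100):
    """Linearly weighted performance rating over the most recent games.

    results: list of (opponent_rating, score) pairs, oldest first, with
    score 1.0 for a win, 0.5 for a draw and 0.0 for a loss.
    window: how many of the most recent games to use (assumed 100).
    """
    recent = results[-window:]
    if not recent:
        raise ValueError("at least one game is required")

    # Oldest game in the window gets weight 1, the newest gets weight n.
    weights = range(1, len(recent) + 1)
    total_weight = sum(weights)

    avg_opponent = sum(w * opp for w, (opp, _) in zip(weights, recent)) / total_weight
    avg_score = sum(w * score for w, (_, score) in zip(weights, recent)) / total_weight

    # Assumed linear approximation: an even score performs at the
    # average opponent rating; each extra 1% of score adds 8 points.
    return avg_opponent + 800.0 * (avg_score - 0.5)


if __name__ == "__main__":
    # Hypothetical results, oldest first.
    history = [(2500, 0.5), (2600, 1.0), (2550, 0.0), (2700, 1.0)]
    print(linearly_weighted_performance(history))
```

Because recent games carry more weight, a player on a hot streak is rated higher here than a plain average over the same games would suggest.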
