
Can you out-predict Elo? – Competition update

21.9.2010 - Can we devise a more accurate method for measuring chess strength and predicting results than Elo, which has done good service for half a century? Jeff Sonas has given statisticians 65,000 games, which they must use to predict the results of 7,800 other games. The idea is to find out who can outperform Elo. In the lead is 28-year-old Portuguese biochemist Filipe Maia. Current status.
 

Kaggle is a platform that allows companies, researchers, governments and other organizations to post their problems and have statisticians worldwide compete to predict the future (produce the best forecasts) or predict the past (find the best insights hiding in data). Statisticians on Kaggle are rated and ranked based on past performance so the competition host will know who are the smartest people in the room. Statistician Jeff Sonas and Kaggle teamed up to present a new competition in this grand tradition: it is “Elo versus the Rest of the World”, pitting hundreds of statisticians against Arpad Elo’s greatest legacy: the Elo rating system. We reported on this.

Progress Update: Elo versus the Rest of the World

By Jeff Sonas

We have just passed the halfway mark of the "Elo vs. the Rest of the World" contest, scheduled to end on November 14th. The contest is based upon the premise that a primary purpose of any chess rating system is to accurately assess the current strength of players, and we can measure the accuracy of a rating system by seeing how well the ratings do at predicting players' results in upcoming events. The winner of the contest will be the one whose rating system does the best job at predicting the results of a set of 7,800 games played recently among players rated 2200+.
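
To make this premise concrete, here is a minimal sketch of how a rating system turns two ratings into a prediction, and how such predictions could be scored against actual results. It uses the standard Elo expected-score formula. The error measure (root-mean-square error), the function names, and the three sample games are assumptions for illustration only; they are not the contest's actual scoring rule or data, and the sketch ignores White's first-move advantage.

```python
import math

def elo_expected_score(rating_white, rating_black):
    """Standard Elo expected score for White (1 = win, 0.5 = draw, 0 = loss)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_black - rating_white) / 400.0))

def rmse(predictions, results):
    """Root-mean-square error between predicted and actual scores (assumed metric)."""
    squared = [(p - r) ** 2 for p, r in zip(predictions, results)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical games: (White rating, Black rating, White's actual score)
games = [(2400, 2400, 0.5), (2600, 2200, 1.0), (2300, 2500, 0.0)]

predictions = [elo_expected_score(w, b) for w, b, _ in games]
results = [score for _, _, score in games]

# A lower error means the ratings predicted the results more accurately.
error = rmse(predictions, results)
```

Any rating system that outputs an expected score for each game can be plugged in place of `elo_expected_score` and compared on the same footing.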

So far we have had an unprecedented level of participation, with 162 different teams submitting entries to the contest! There is also a very active discussion forum to promote the free flow of ideas, although many teams are still hesitant to share too many details about their approach (especially considering that the winner will receive a copy of Fritz signed by Garry Kasparov, Viswanathan Anand, Anatoly Karpov, and Viktor Korchnoi). Both ChessBase and Kaggle have donated generous prizes, to be awarded to top-performing participants who are willing to share their methodology publicly.


First prize is a copy of Fritz signed by all-time greats Garry Kasparov,
Viswanathan Anand, Anatoly Karpov, and Viktor Korchnoi!

The Kaggle website provides access to a training dataset of 65,000 recent chess games, which participants can use to train their rating systems, as well as a test dataset (a list of 7,800 recent chess games whose outcomes must be predicted and submitted). The website (which knows the results of those 7,800 games) automatically scores each submission and maintains a public leaderboard of each participant's best-performing entry. A more complete score for each entry is kept secret so that participants cannot "decode" the leaderboard, and this secret list determines the final winners, to be announced at the end of the contest.
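
The split between a public leaderboard score and a more complete secret score can be sketched as follows. This is only an illustration of the idea: the 20% public fraction, the random seed, the mean-squared-error metric, and all function names are assumptions, since the details of how the contest divides or scores the test set are not given here.

```python
import random

def split_indices(n_games, public_fraction=0.2, seed=0):
    """Randomly pick which test games feed the public leaderboard (fraction is assumed)."""
    indices = list(range(n_games))
    random.Random(seed).shuffle(indices)
    cut = int(n_games * public_fraction)
    return indices[:cut], indices[cut:]

def mean_squared_error(predictions, results, indices):
    """Mean squared error over the selected games only."""
    return sum((predictions[i] - results[i]) ** 2 for i in indices) / len(indices)

def score_submission(predictions, results, public_idx):
    """Return (public score, complete score); only the first would be displayed."""
    public = mean_squared_error(predictions, results, public_idx)
    complete = mean_squared_error(predictions, results, range(len(results)))
    return public, complete

# Hypothetical test set of ten games, with a submission that predicts a draw every time.
results = [1.0, 0.5, 0.0, 1.0, 0.5, 0.5, 0.0, 1.0, 0.5, 1.0]
predictions = [0.5] * len(results)

public_idx, hidden_idx = split_indices(len(results))
public_score, complete_score = score_submission(predictions, results, public_idx)
```

Because the public score covers only a subset of the games, a participant who tunes an entry against it repeatedly learns little about the remaining results, which is the point of keeping the complete score hidden.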

A wide range of approaches has been used, including almost every known chess rating system as well as other attempts involving neural networks, machine learning, data mining, business intelligence tools, and artificial intelligence. In fact over 1,600 different entries have been submitted so far, and we anticipate far more submissions as the competition heats up over the final seven weeks.


Filipe Maia, a PhD student in Janos Hajdu's Molecular
Biophysics group at Uppsala University

The #1 spot is currently held by Portuguese physicist Filipe Maia, who confesses to little knowledge about statistics or chess ratings, but is nevertheless managing to lead the competition! He is also the author of El Turco, the first-ever Portuguese chess engine. Out of the current top ten teams on the leaderboard, seven use variants of the Chessmetrics rating system, two are modified Elo systems, and one is a "home-grown variant of ensemble recursive binary partitioning". That last approach belongs to the #3 team on the public leaderboard, a team known as "Old Dogs With New Tricks". This team is a collaborative effort between Dave Slate and Peter Frey, both prominent leaders in computer chess for many years.

Although the "Old Dogs With New Tricks" team clearly has a lot of chess expertise, and the #2 spot is held by Israeli mathematician and chess player Uri Blass (FIDE rating 2051), the top ten or twenty teams consist primarily of mathematicians, data miners, and other scientists with minimal direct experience of chess or chess ratings. This suggests that experts on chess rating theory might still have a lot to learn from experts in other fields, which of course is one of the desired outcomes of this contest! We have attracted interest from around the globe, with the top twenty made up of participants from Portugal, Israel, USA, Germany, Australia, UK, Singapore, Denmark, and Ecuador.

As the organizer of the contest, I have "benchmarked" several prominent rating systems, starting with Chessmetrics, Elo, PCA, and Glicko/Glicko-2. Other systems (including TrueSkill) will also be benchmarked in the near future. A "benchmark" consists of implementing those systems, optimizing any parameters for predictive power, submitting predictions based on their ratings, and publicly describing the details of the methodology in the discussion forum. These benchmark entries help other competitors to gauge the success of their own entries and to get some ideas of what other people have tried in the past. If you are interested in learning more about any of the benchmarked systems, you can find detailed descriptions in the discussion forum on the contest website.
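
As a rough illustration of "optimizing any parameters for predictive power", the sketch below tunes a single parameter of a plain Elo system, the K-factor, by picking the value that best predicts a held-out set of games. It is not the benchmark methodology itself (that is described in the contest forum). The starting rating of 2300, the candidate K values, the squared-error measure, the player labels, and the function names are all assumptions made for the example.

```python
def expected_score(rating_white, rating_black):
    """Standard Elo expected score for White."""
    return 1.0 / (1.0 + 10.0 ** ((rating_black - rating_white) / 400.0))

def train_elo(games, k, initial=2300.0):
    """Rate players by processing games in order; `initial` is an assumed start rating."""
    ratings = {}
    for white, black, score in games:
        rating_w = ratings.get(white, initial)
        rating_b = ratings.get(black, initial)
        surprise = score - expected_score(rating_w, rating_b)
        ratings[white] = rating_w + k * surprise
        ratings[black] = rating_b - k * surprise
    return ratings

def holdout_error(ratings, games, initial=2300.0):
    """Mean squared error of the ratings' predictions on games not used for training."""
    total = 0.0
    for white, black, score in games:
        predicted = expected_score(ratings.get(white, initial), ratings.get(black, initial))
        total += (score - predicted) ** 2
    return total / len(games)

def best_k(train_games, holdout_games, candidates):
    """Grid search: return the K-factor with the lowest held-out prediction error."""
    return min(candidates, key=lambda k: holdout_error(train_elo(train_games, k), holdout_games))

# Hypothetical games: (White, Black, White's score)
train_games = [("A", "B", 1.0), ("A", "C", 1.0), ("B", "C", 0.5), ("A", "B", 1.0)]
holdout_games = [("A", "C", 1.0), ("B", "A", 0.0)]

tuned_k = best_k(train_games, holdout_games, [0, 10, 20, 40])
tuned_error = holdout_error(train_elo(train_games, tuned_k), holdout_games)
```

With real data the held-out games would come from a later period than the training games, so that the tuning mimics the contest's task of predicting upcoming results.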

Currently, out of 162 teams, the benchmarks hold the following rankings:

Chessmetrics Benchmark: #10
Glicko-2 Benchmark: #38
Glicko Benchmark: #39
PCA Benchmark: #66
Elo Benchmark: #82

Thus it is becoming increasingly clear that there are many alternative approaches that seem more accurate than the Elo system, being more effective at measuring players' current strength and predicting players' results in upcoming events. However, predictive power and accuracy are not the only yardsticks to use in assessing a rating system; it is clear that inertia, familiarity, and simplicity are powerful advantages of the Elo system…

The first half of the contest has been a great success and we look forward to a very competitive and productive second half! Please come visit the contest website and join the fun!
