Elo Meter - The test that calculates your Elo

by Albert Silver
10/11/2021 – What if you could get an estimate of your Elo rating just by doing a single test, as opposed to playing a couple of dozen games at long time controls? It isn’t an exaggeration to say this sounds like one of those inane personality ‘quizzes’ you see in supermarket magazines. Yet a university study set out to do just that. Try it yourself!


When I was shown this site, my first thought was that this was a bit of a gimmick, a bit like the 'IQ tests' that crop up here and there, and where you congratulate yourself for being a 'genius'. Insert smiley here. However, the credentials are quite a bit more substantial, as are the supporting papers behind the theory.

What is Elometer?

The home page is clean and clear. Start your motors!

In a nutshell, the idea is to answer a test set of 76 positions to the best of your ability. Some have cut-and-dried answers, while others are far more flexible and allow for a second-best or third-best move. Choosing a second-best move, where one exists, will not be adjudicated as a fail, and your final rating will be adjusted accordingly. The positions vary in both difficulty and content: the initial positions are all completely elementary, but the challenge soon ramps up, after which the difficulty varies wildly, or such was my impression.

This is the first position, and as you can see, it is as easy as can be. This won't last long, but it does mean it is almost impossible not to get some right.

After you finish the test, they ask a number of questions regarding your conduct in the test, such as how often you play, how long you have played serious chess (30+ years for me, with caveats), how seriously you took the test, and of course whether you looked up the answers to any of the actual questions with an engine or database. Frankly, I cannot see the point of doing that, since the only person you'd be fooling is yourself.

Of the 76 positions, I recognized four. Two were actually pawn endgames I had recently seen in Dvoretsky's Endgame Manual.

The science behind the test set and how it arrives at its results is really the most fascinating part of it. The authors of the test are Birk Diedenhofen and Jochen Musch from the University of Duesseldorf, Institute of Experimental Psychology. After you have finished the test, and the small questionnaire, the authors explain:

We used item response theory (or "latent trait theory"; Hambleton, Swaminathan, & Rogers, 1991) to derive an estimate of your playing strength based on your answers to a set of chess problems with known properties. To arrive at this estimate, we employed the two-parameter Birnbaum model (Lord, 1980) which allows items to differ a) in difficulty and b) in discriminatory power. The set of chess problems we used was taken from the "Amsterdam Chess Test" developed by van der Maas & Wagenmakers (2005), who presented their chess problems to a sample of 259 participants at a Dutch open tournament. The national Elo rating of these participants ranged from 1169 to 2629. Using a subset of the items of this test (the Choose-A-Move item set A and B), we were able to compute a maximum likelihood estimate of your ELO rating based on a prediction formula regressing the latent ability estimates of the Birnbaum model on the ELO ratings of the comparison sample. Using the test information function, we were also able to compute a 95% confidence interval for this estimate.

What is Item Response Theory?

In psychometrics, item response theory (IRT) (also known as latent trait theory, strong true score theory, or modern mental test theory) is a paradigm for the design, analysis, and scoring of tests, questionnaires, and similar instruments measuring abilities, attitudes, or other variables.

It is based on the application of related mathematical models to testing data, and is often regarded as superior to classical test theory. It is also the science behind famous higher education tests such as the Graduate Record Examination (GRE) and the Graduate Management Admission Test (GMAT).

IRT is based on the idea that the probability of a correct/keyed response to an item is a mathematical function of person and item parameters. The person parameter is usually a single latent trait; examples include general intelligence or the strength of an attitude. Parameters on which items are characterized include their difficulty and their discrimination (slope or correlation), which represents how steeply the rate of success of individuals varies with their ability. (source: Wikipedia)
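For the two-parameter model the Elometer authors mention, that function is usually written as a logistic curve. The notation below is the standard textbook one, not something taken from the Elometer site: \theta_j is the ability of player j, b_i the difficulty of position i, and a_i its discrimination.

```latex
P(X_{ij} = 1 \mid \theta_j) = \frac{1}{1 + e^{-a_i(\theta_j - b_i)}}
```

A player whose ability equals the difficulty of a position (\theta_j = b_i) solves it with probability 1/2, and the larger a_i is, the more sharply that probability rises as ability increases past b_i.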

How good is it?

This does raise the question of how accurate the estimates are. I absolutely don't question the sincerity and integrity of their efforts, but I do know I was last rated 2149 FIDE a bit over a decade ago, when I played my last serious games. Though I recently resumed training, as shared in a previous article, "Study Chess with Me - the video series", good sense tells me the result I got is more Walter Mitty than reality.

Hard to know what to make of this, but I doubt I am alone

Nevertheless, it is obviously better to have it mistakenly tell me I am higher rated than to have it tell me I am lucky to break 1800, but still... Insert second smiley here.

Regardless of the obvious grain of salt it must be taken with, if you don't have any massive ego issues involved with such things, it is certainly a good workout that will take you a couple of hours, and is definitely fun. We're chess players after all, so solving positions is what we are all about, right?

Link to Elo Meter

If you wish to share in my study plan, feel free to join the series. I am certain that at the very least this has helped me produce a better result than without it.

 


Born in the US, Albert Silver grew up in Paris, France, where he completed his Baccalaureat, and after college moved to Rio de Janeiro, Brazil. He had a peak rating of 2240 FIDE, and was a key designer of Chess Assistant 6. In 2010 he joined the ChessBase family as an editor and writer at ChessBase News. He is also a passionate photographer with work appearing in numerous publications, and the content creator of the YouTube channel, Chess & Tech.
Discussion and Feedback



vikas2200 2 hours ago
I got 2429 rating estimation. But, my actual rating revolves around 2000.
Albert Silver 10/14/2021 03:57
Hi all, so I got a reply today from the author and it was very much what we theorized. Here is his reply in full:

"Hi Albert,

Thanks for your mail! We had some server issues because of the high
number of visitors. I hope the Elometer is now working as expected again.

Kind regards,
Birk"

Signature:
Dr. Birk Diedenhofen
Department of Experimental Psychology
University of Duesseldorf
Albert Silver 10/14/2021 07:19
@shivasundar - the night this article went up, and seeing the issues some had, I wrote to both of the authors listed at the site. I pointed them to this article and the comments. I have not received any replies.
shivasundar 10/14/2021 12:24
I am getting Fatal Test Exception after several hours of working through the problems as well (40 problems). I started again, and got to 10, then 2 - got the exception again and again. There is some issue with the server or the web app itself @Albert Silver. Please reach out to the people who maintain the site...
chardan 10/13/2021 10:20
Unfortunately, this crashed on me so often that I never was able to complete it-- it seems to have improved over the last day, but I still got alllmost done only to find my session unrecoverable.
hansj 10/13/2021 12:23
@Theochessman:
I guess the solution was not 1.e8 (queen or rook), but 1.Ng6.
Longhorn 10/13/2021 06:25
Enjoyed taking the Chess test. Some problems pushed my abilities. Had a 2244 Elo result. The Elometer site authors have put together a good and fair test. Perhaps a couple of strong masters can contribute, refining the test further. The site operated smoothly for me.
Theochessman 10/12/2021 10:33
OK, I did the test. I did it very quickly, only looking at each position for about 5 seconds, and after a very long, tiring day and a terrible night of sleep. Still I got the ridiculously high score of 2167. It also seemed to me that almost all of the problems were completely tactical. Not much positional skill had to be involved. Another point to mention is that in one position there was a promotion and it didn't let me underpromote to a rook. I could only make the promotion move, when promoting to a queen would have led to stalemate.
The idea of this test is ok, but it doesn't seem to me to be a very accurate estimate. First of all because the set of problems is way too small. Also because it's very one-sided. Only tactical, and 75% of the problems were beginner-level easy.
mc1483 10/12/2021 11:36
Not a good idea to advertise a site that cannot bear many connections at the same time.
ChrisHolmes 10/12/2021 09:25
I just get a blank screen with a little rotating icon showing the browser is trying to load the page. After 10 minutes I gave up.
Great idea but a victim of its own success - not enough bandwidth to handle the demand.
Rambus 10/12/2021 01:11
Mine was higher than I expected at 2265 (95% CI 2126 - 2404). I don't have (and never had) any rating other than Lichess blitz of ≈2000 (3+0 games only). Perhaps there should be a time limit - given enough time, most of us can spot something. Many problems were solved instantly, but my screen saver (which comes on after 5 minutes) activated many times.
capaping 10/11/2021 08:37
I have tried it three times but after about ten positions I always get the same error message. It looks like they are not able to handle the volume of CB readers who are attempting to use their Elo Meter. Fingers crossed they will fix this bug some time soon.
GM Luis Lopez 10/11/2021 07:48
I just went to the page, but it seems to be having problems: it takes a long time to load, and when it does, it shows "Fatal test exception encountered. Test halted."
Albert Silver 10/11/2021 07:34
I hear you. It might be the site is being surprised by the volume of ChessBase readers, I cannot say. In any case, I wrote the authors a short while ago advising them of the problem. I'll post whatever reply I get here later, but since it is after hours in Germany, I don't expect a reply, or solution, before tomorrow.
nirvana1963 10/11/2021 07:21
@Albert Silver - I used Firefox for the first test, the second time I used Edge and after a couple of positions I got the same error message.
Albert Silver 10/11/2021 07:00
I have no idea what to tell you. Have you checked if it is a browser issue? For reference, I used Chrome on my desktop. I was about 45 positions deep when I had to stop for a half hour to walk my dog. I left it open and resumed as soon as I got back.
TimSpanton 10/11/2021 06:55
I answered the first problem and immediately got: Fatal test exception encountered. Test halted.
hansj 10/11/2021 06:15
I got that "Fatal test exception ..." too. Half way through – even though I did not commit any fatal blunders myself.
nirvana1963 10/11/2021 05:15
@Albert Silver - I'm not so sure about that. When I completed almost half of the test in a few hours today I got this message "Fatal test exception encountered. Test halted." Have to start all over again, bit annoying...
Albert Silver 10/11/2021 04:49
@nirvana1963 - While true, you can leave the tab open in your browser. There is no time limit after all.
nirvana1963 10/11/2021 04:28
Great tool! It would be nice if there was an option to save your results and get back later. It seems you have to do 76 puzzles in one run and that's quite a task!
psamant 10/11/2021 03:10
I tried it and got a rating that was more than I expected, so there seems to be some bias, perhaps intentional, to increase rating so as not to disappoint casual users!
skent 10/11/2021 11:57
Yes, they calculate it. But it would be fairer if they first estimated your rating and only then asked what your real rating is.