In Part I we looked at the historical results of chess games played between the strongest grandmasters and the strongest chess computers. The surprising conclusion was this: top grandmasters and top chess computers are dead-even, and they have been stuck that way for some time. Neither side has actually won a match from the other in five years, and the last seven events between grandmasters rated 2700+ and chess computers have all been drawn. How long can this deadlock last?
You're probably thinking, "Okay, sure, they're even now, but isn't it inevitable that computers will just get better and better and eventually leave the humans behind?" Well, yes, computers will definitely get better, but don't forget that human players are improving too! Human players are constantly getting better at chess; it's just super-hard to measure this statistically, because improving humans play against other improving humans. It is NOT inevitable that computers will surpass humans.
Let's look at some numbers and graphs. At first glance, it seems like the top computers must be gaining considerable ground on the top humans. For instance, it is quite easy to calculate a performance rating for top computers in their games against FIDE-rated humans. I did this for 742 games over the past fifteen years, grouped into three-year spans (considering only computers that were among the top eight engines in the world at the time). Here are the results:
The overall performance rating of computers against humans has increased at roughly 30 Elo points per year. This would suggest that however fast humans are improving, the computers are improving even faster: thirty Elo points a year faster. However, if you split these same human opponents into two groups, based on a rating cutoff of 2550, an interesting pattern emerges. Computers are not doing any better against 2550+ players right now than they were a few years ago.
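For readers who want to try the performance-rating calculation themselves, here is a minimal sketch. The article does not say exactly which formula was used, so this assumes the common logistic approximation (average opponent rating plus a rating difference derived from the percentage score). The function name and the sample ratings are hypothetical and are not taken from the 742-game data set.

```python
import math


def performance_rating(opponent_ratings, total_score):
    """Approximate a performance rating under the logistic Elo model.

    opponent_ratings: ratings of the opponents faced, one per game
    total_score: points scored (win = 1, draw = 0.5, loss = 0)

    Assumption: this is the textbook logistic approximation, not
    necessarily the method used for the figures in this article.
    """
    games = len(opponent_ratings)
    if games == 0:
        raise ValueError("need at least one game")
    fraction = total_score / games
    if not 0.0 < fraction < 1.0:
        # The logistic formula diverges for a 0% or 100% score.
        raise ValueError("performance rating is undefined for 0% or 100% scores")
    average_opponent = sum(opponent_ratings) / games
    return average_opponent + 400.0 * math.log10(fraction / (1.0 - fraction))


# Hypothetical example: an even score against four made-up opponents
# gives a performance rating equal to their average rating.
sample_ratings = [2500, 2550, 2600, 2650]
print(round(performance_rating(sample_ratings, 2.0)))  # 2575
```

Splitting the opponents at a cutoff such as 2550 just means running the same calculation separately on each group of games.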
Although computers have certainly been more successful in 2001-2003 than ever before, that is only due to the fact that they are finally starting to dominate the humans who aren't in the top 200 in the world. As you can see in that graph, computers are doing no better today, against 2550+ opposition, than they were three years ago. Let's look at this event-by-event. First, I'll show you the results against the weaker group, over the past 5-6 years:
Up through mid-2001, a +4 score by a computer in an event against rated humans was almost unheard-of. In the two or three years since then, every single computer that played in a tournament against sub-2550 humans has scored between +4 and +7 against them. Seven straight events, and five different computers, but they all scored at least +4.
On the other hand, you don't see the same level of improvement by computers against the stronger players (in the 2550-2700 range). In fact, you don't see ANY improvement by computers. Let's look at the events since 1998 where computers faced opponents in that rating class:
Computers are becoming more and more dominant against everyone but the top 200 players in the world. That is leading to an overall performance rating for computers that is getting higher and higher. However, the players in the top 200 are holding their ground even against the latest and greatest computers. Perhaps that group will soon shrink down to only the top 100, or the top 50, but not inevitably, and not irreversibly. As you can see from my previous graphics, there is no sign that the top 200 players are losing ground at all against the top computers.
The top 20 humans (the 2700+ crowd) are managing a long string of drawn matches against computers, and the rest of the top-200 is averaging the same 35% to 40% score that they did a few years ago. So, amazing as it may seem, I don't see any evidence that the top computers are suddenly going to take over the chess world. Of course the top computers are improving, mostly through hardware and software upgrades, but somehow the top humans are improving just as fast, in other ways.
In Part I, I mentioned that there were two key questions to answer. Let's review those questions and see what my answers have turned out to be:
Question #1: A decade ago, top grandmasters were undeniably stronger than chess computers. There was a large gap in strength, roughly 300 Elo points. In chess terms, if a top grandmaster had played 100 games against a top computer, the grandmaster would have won the match by a score of about 85-15 (roughly speaking). In the past ten years, computers have certainly reduced the gap. How large is the gap right now, and who is ahead?
Answer #1: There is no measurable gap right now between top computers and top-20 grandmasters. They have been deadlocked for about five years. The remaining top-200 grandmasters, as a group, are slightly weaker than the top computers.
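The 85-15 figure in Question #1 follows from the standard Elo expected-score formula. Here is a minimal sketch of that arithmetic; the function name is just for illustration.

```python
def expected_score(rating_gap):
    """Expected score (0 to 1) for the stronger side under the standard
    Elo logistic formula, given a rating advantage in Elo points."""
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))


# A 300-point advantage over a 100-game match.
stronger = expected_score(300)
print(f"{100 * stronger:.0f}-{100 * (1 - stronger):.0f}")  # 85-15
```

A gap of zero gives exactly 50%, which is what "no measurable gap" looks like on the scoreboard.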
Question #2: Who is improving faster, top grandmasters or chess computers? What can we say about how the situation will be different in one year, or ten years, or fifty years?
Answer #2: There is no measurable difference in how fast either group is improving. There is nothing (yet) to suggest that one or the other will suddenly pull ahead. It may well be that in ten years the top computers and top grandmasters will still be deadlocked.
We are at a unique point in chess history, an unprecedented state of dynamic balance. The top computers have caught up with the top grandmasters, and right now I'm not convinced that the computers will zoom past the grandmasters. Everything depends on whether computers or grandmasters can improve faster, starting today. It may even be that the top humans can figure out how to improve their own anti-computer play, to the point that they will pull ahead again. Perhaps Garry Kasparov can lead the way once more.
There is much more to say about how computers and grandmasters are improving, relative to each other. We'll compare and contrast the various ways that computers and grandmasters are improving, and how we can try to measure those improvements, starting with Part III next week.
Jeff Sonas is a statistical chess analyst who has written dozens of articles since 1999 for several chess websites. He has invented a new rating system and used it to generate 150 years of historical chess ratings for thousands of players. You can explore these ratings on his Chessmetrics website. Jeff is also Chief Architect for Ninaza, providing web-based medical software for clinical trials.