The future is here – AlphaZero learns chess

by Albert Silver
12/6/2017 – Imagine this: you tell a computer system how the pieces move — nothing more. Then you tell it to learn to play the game. And a day later — yes, just 24 hours — it has figured the game out to a level at which it convincingly beats the strongest programs in the world! DeepMind, the company that recently created the strongest Go program in the world, turned its attention to chess and came up with this spectacular result.


DeepMind and AlphaZero

About three years ago, DeepMind, a Google-owned company that specializes in AI development, turned its attention to the ancient game of Go. Go had been the one game that had eluded all computer efforts to reach world class; even at the time of the announcement, that goal was thought to be a decade away, which shows how large the gap was. When a public challenge match was organized against the legendary Lee Sedol, a South Korean whose track record places him among the greatest players ever, everyone expected an interesting spectacle but a certain win for the human. The question was not even whether the program AlphaGo would win or lose, but how much closer it had come to the Holy Grail. The result was a crushing 4-1 victory and a revolution in the Go world. In spite of a great deal of second-guessing by the elite, who could not accept the loss, they eventually came to terms with the reality of AlphaGo: a machine that was among the very best, albeit not unbeatable, since it had lost a game after all.

The saga did not end there. A year later a new updated version of AlphaGo was pitted against the world number one of Go, Ke Jie, a young Chinese whose genius is not without parallels to Magnus Carlsen in chess. At the age of just 16 he won his first world title and by the age of 17 was the clear world number one. That had been in 2015, and now at age 19, he was even stronger. The new match was held in China itself, and even Ke Jie knew he was most likely a serious underdog. There were no illusions anymore. He played superbly but still lost by a perfect 3-0, a testimony to the amazing capabilities of the new AI.

Many chess players and pundits had wondered how it would do in the noble game of chess. There were serious doubts about just how successful it might be. Go is a huge, long game played on a 19x19 grid, in which all pieces are identical and none of them ever moves. Calculating ahead as in chess is an exercise in futility, so pattern recognition is king. Chess is very different. There is no questioning the value of knowledge and pattern recognition in chess, but the royal game is supremely tactical, and a lot of knowledge can be compensated for by simply outcalculating the opponent. This has been true not only of computer chess, but of humans as well.

However, there were some very startling results in the last few months that need to be understood. DeepMind's interest in Go did not end with that match against the world number one. You might ask yourself: what more was there to do after that? Beat him 20-0 instead of just 3-0? No, of course not. However, the super Go program became an internal litmus test of sorts. Its standard was unquestioned and quantified, so if one wanted to test a new self-learning AI and see how good it was, then throwing it at Go and comparing it with the AlphaGo program was a ready way to measure it.

A new AI was created, called AlphaZero, and it differed in several striking ways. The first was that it was not shown tens of thousands of master games in Go to learn from; it was shown none. Not a single one. It was given only the rules, without any other information. The result was a shock. Within just three days, the completely self-taught program was stronger than the version that had beaten Lee Sedol, a result the previous AI had needed over a year to achieve. Within three weeks it was beating the strongest AlphaGo, the one that had defeated Ke Jie. What is more: while 48 highly specialized processors had been used to create the Lee Sedol version, this new version used only four!

Graph showing the relative evolution of AlphaZero | Source: DeepMind

AlphaZero learns chess

Approaching chess might still seem an odd choice. After all, although DeepMind had already shown near-revolutionary breakthroughs with Go, that was a game that had yet to be 'solved'. Chess had its Deep Blue moment 20 years ago, and today even a good smartphone can beat the world number one. What is there to prove exactly?

Garry Kasparov is seen chatting with Demis Hassabis, founder of DeepMind | Photo: Lennart Ootes

It needs to be remembered that Demis Hassabis, the founder of DeepMind, has a profound chess connection of his own. He was a chess prodigy in his own right, and at age 13 was the second highest rated player under 14 in the world, behind only Judit Polgar. He eventually left the chess track to pursue other things, such as founding his own PC video game company at age 17, but the link is there. Still, there was a burning question on everyone's mind: just how well would AlphaZero do if it were focused on chess? Would it just be very smart, but get smashed by the number-crunching engines of today, where a single ply is often the difference between winning and losing? Or would something special come of it?

Professor David Silver explains how AlphaZero was able to progress much more quickly when it had to learn everything on its own, as opposed to analyzing large amounts of data. The efficiency of a principled algorithm was the most important factor.

A new paradigm 

On December 5 the DeepMind group published a new paper on the arXiv preprint server hosted by Cornell University, titled "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm", and the results were nothing short of staggering. AlphaZero had done more than just master the game; it had attained new heights in ways considered inconceivable. The proof of the pudding is in the eating, of course, so before going into some of the fascinating nitty-gritty details, let's cut to the chase. It played a match against Stockfish 8, the 2016 TCEC world champion version of the strongest engine of the day, and won by an incredible score of 64 : 36. Not only that: AlphaZero had zero losses (28 wins and 72 draws)!
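
As a back-of-the-envelope check (using the standard Elo expected-score formula, an editorial calculation rather than anything from the paper), a 64 percent score corresponds to a rating advantage of roughly 100 Elo points:

```python
# Rough Elo estimate from the published match score, using the standard
# logistic expected-score model. An editorial back-of-the-envelope check,
# not a calculation taken from the DeepMind paper.
import math

wins, draws, losses = 28, 72, 0
score = (wins + 0.5 * draws) / (wins + draws + losses)   # 0.64

elo_gap = -400 * math.log10(1 / score - 1)               # ~100 Elo points
print(f"Score {score:.2f} -> rating advantage of about {elo_gap:.0f} Elo")
```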

Stockfish needs no introduction to ChessBase readers, but it is worth noting that the engine was calculating nearly 900 times as many positions per second! Indeed, AlphaZero was examining roughly 80 thousand positions per second, while Stockfish, running on a PC with 64 threads (likely a 32-core machine), was churning through 70 million positions per second. To understand how big a deficit that is: a version of Stockfish slowed down by a factor of 900 would search roughly eight moves less deeply. How is this possible?
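
Where does a figure like "roughly eight moves" come from? The sketch below is an editorial illustration that converts the speed gap into extra search depth for a few plausible effective branching factors of an alpha-beta engine; the branching factors are assumptions, not values from the DeepMind paper. With an effective branching factor of about 1.5 per ply, a 900-fold speed advantage is indeed worth on the order of eight full moves.

```python
# How much extra search depth does a ~900x speed advantage buy an
# alpha-beta engine? The answer depends on the effective branching factor
# (EBF) per ply; the EBF values below are assumptions for illustration,
# not figures from the DeepMind paper.
import math

speed_ratio = 70_000_000 / 80_000   # Stockfish nps / AlphaZero nps, about 875x

for ebf in (1.5, 2.0, 2.5):
    extra_plies = math.log(speed_ratio, ebf)
    print(f"EBF {ebf}: about {extra_plies:.0f} extra plies "
          f"(roughly {extra_plies / 2:.0f} full moves)")
```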

The paper "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm" at Cornell University

The paper explains:

“AlphaZero compensates for the lower number of evaluations by using its deep neural network to focus much more selectively on the most promising variations – arguably a more “human-like” approach to search, as originally proposed by Shannon. Figure 2 shows the scalability of each player with respect to thinking time, measured on an Elo scale, relative to Stockfish or Elmo with 40ms thinking time. AlphaZero’s MCTS scaled more effectively with thinking time than either Stockfish or Elmo, calling into question the widely held belief that alpha-beta search is inherently superior in these domains.”

This diagram shows that the longer AlphaZero had to think, the more it improved compared to Stockfish

In other words, instead of the hybrid brute-force approach that has been the core of chess engines to date, it went in a completely different direction, opting for an extremely selective search that emulates how humans think. A top player may be able to outcalculate a weaker player in both consistency and depth, but that is still a joke compared to what even the weakest computer programs do. It is the human's sheer knowledge and ability to filter out so many moves that allows them to reach the standard they do. Remember that although Garry Kasparov lost to Deep Blue, it is not at all clear that the machine was genuinely stronger than him even then, despite reaching speeds of 200 million positions per second. If AlphaZero is really able to use its understanding not only to compensate for examining 900 times fewer positions, but to surpass the engines that examine them, then we are looking at a major paradigm shift.
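
To make the contrast concrete, here is a minimal sketch of the PUCT-style selection rule that guides AlphaZero's Monte Carlo tree search, the mechanism the paper credits for its extreme selectivity. The node structure, constant and names below are simplified stand-ins for illustration, not DeepMind's actual implementation.

```python
# Minimal sketch of PUCT-style move selection in a Monte Carlo tree search.
# Promising moves (high policy prior or high average value) are visited far
# more often than the rest, so most of the tree is never expanded at all.
# Simplified for illustration; not DeepMind's actual code.
import math
from dataclasses import dataclass, field

C_PUCT = 1.5  # exploration constant (illustrative value)

@dataclass
class Node:
    prior: float                 # policy-network probability of the move leading here
    visit_count: int = 0
    value_sum: float = 0.0
    children: dict = field(default_factory=dict)   # move (str) -> Node

    def q(self) -> float:
        """Mean evaluation of this subtree so far."""
        return self.value_sum / self.visit_count if self.visit_count else 0.0

def select_child(node: Node):
    """Pick the child maximising Q + U, the PUCT criterion."""
    total_visits = sum(child.visit_count for child in node.children.values())

    def puct(child: Node) -> float:
        exploration = C_PUCT * child.prior * math.sqrt(total_visits + 1) / (1 + child.visit_count)
        return child.q() + exploration

    return max(node.children.items(), key=lambda item: puct(item[1]))
```

A full engine would repeat this selection down the tree, expand a leaf, evaluate it with the neural network and back the result up the path; the point of the sketch is simply that visit counts guided by the network, not a fixed search depth, decide where the effort goes.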

How does it play?

Since AlphaZero did not benefit from any chess knowledge beyond the rules (no games, no opening theory), it had to discover opening theory on its own. And do recall that this is the result of only 24 hours of self-learning. The team produced fascinating graphs showing the openings it discovered, as well as the ones it gradually rejected as it grew stronger!

Professor David Silver, lead scientist behind AlphaZero, explains how AlphaZero learned openings in Go, and gradually began to discard some in favor of others as it improved. The same is seen in chess.

In the diagram above, we can see that in the early games AlphaZero was quite enthusiastic about playing the French Defense, but after two hours (this is so humiliating) began to play it less and less.

The Caro-Kann fared a good deal better, and held a prime spot in AlphaZero's opening choices until it also gradually filtered it out. So what openings did AlphaZero actually like or choose by the end of its learning process? The English Opening and the Queen's Gambit!

The paper was also accompanied by ten games to share the results. It needs to be said that these are very different from the usual fare of engine games. If Karpov had been a chess engine, he might have been called AlphaZero. There is a relentless positional boa-constrictor approach that is simply unheard of. Modern chess engines are focused on activity, and have special safeguards to avoid blocked positions, as they have no understanding of them and often find themselves in a dead end before they realize it. AlphaZero has no such prejudices or issues, and seems to thrive on snuffing out the opponent's play. It is singularly impressive, and what is astonishing is how it is also able to find tactics that the engines seem blind to.

 

This position from Game 5 of the ten published arose after 20...Kh8. The completely disjointed array of Black's pieces is striking, and AlphaZero came up with the fantastic 21.Bg5!! After analyzing the move and its consequences, there is no question that it is the killer move here, and while my laptop cannot produce 70 million positions per second, I gave the position to Houdini 6.02, running at 9 million positions per second. It analyzed for one full hour and was unable to find 21.Bg5!!

A screenshot of Houdini 6.02 after an hour of analysis

Here is another little gem of a shot, in which AlphaZero had completely stymied Stockfish positionally and then wrapped things up with some nice tactics. Look at this incredible sequence in game nine:

 

Here AlphaZero played the breathtaking 30. Bxg6!! The point is obviously that 30...fxg6 runs into 31. Qxe6+, but how do you continue after the game's 30...Bxg5 31. Qxg5 fxg6?

 

Here AlphaZero continued with 32. f5!! and after 32...Rg8 33. Qh6 Qf7 34. f6 obtained a deadly bind, which it worked into a win 20 moves later. Time to get out a thesaurus for all the synonyms of 'amazing'.

What lies ahead

So where does this leave chess, and what does it mean in general? This is a game-changer, a term that is so often used and abused, but there is no other way of describing it. Deep Blue was a breakthrough moment, but its result was thanks to highly specialized hardware whose sole purpose was to play chess and nothing else. If one had tried to make it play Go, for example, it would never have worked. This completely open-ended AI, able to learn from a minimum of information and take that learning to levels hitherto unimagined, is not a threat to 'beat' us at any number of activities; it is a promise to analyze problems such as disease and famine in ways that might conceivably lead to genuine solutions.

For chess, this will likely lead to genuine breakthrough engines following in these footsteps. That is what happened in Go. For years and years, Go programs had been more or less stuck where they were, unable to make any meaningful advances, and then along came AlphaGo. It was not because AlphaGo offered some inspiration to 'try harder'; it was because, just as here, a paper was published detailing all the techniques and algorithms developed and used, so that others could follow in its footsteps. And they did. Within a couple of months, new versions of top programs such as Crazy Stone began offering updated engines with deep learning, which brought improvements of hundreds (plural) of Elo points. This is no exaggeration.

Within a couple of months, the revolutionary techniques used to create AlphaGo began to appear in top PC programs of Go

The paper on chess offers similar information, allowing anyone to do what they did. Obviously they will not have the benefit of the specialized TPUs, processors designed especially for this kind of deep-learning training, but they are not required to either. It also bears remembering that this was done without the benefit of many of the specialized techniques and tricks of chess programming. Who is to say the two cannot be combined for even greater results? Even the DeepMind team think it bears investigating:

"It is likely that some of these techniques could further improve the performance of AlphaZero; however, we have focused on a pure self-play reinforcement learning approach and leave these extensions for future research."

Replay the ten games between AlphaZero and Stockfish 8 (70 million NPS)

Born in the US, he grew up in Paris, France, where he completed his Baccalaureat, and after college moved to Rio de Janeiro, Brazil. He had a peak rating of 2240 FIDE, and was a key designer of Chess Assistant 6. In 2010 he joined the ChessBase family as an editor and writer at ChessBase News. He is also a passionate photographer with work appearing in numerous publications, and the content creator of the YouTube channel, Chess & Tech.

Discuss

wasmaster wasmaster 12/9/2017 04:10
A few notes on AlphaGo Zero.
1) I've seen some false statements about the conditions of the match:

- According to the DeepMind paper, in the "evaluation" (the match vs. Stockfish) the hardware was pretty comparable for both contestants.
"...we used Stockfish version 8 (official Linux release) as a baseline program, using 64 CPU threads and a hash size of 1GB" and AlphaZero "... was executed on a single machine with 4 TPUs [Tensor processing units - think graphics cards on steroids]". The training, prior to the match, used many more TPUs (500?).

- Stockfish was not neutered by removing its opening book. AFAIK, it didn't use an ending table base, but not sure if that mattered.
- AlphaZero was "only" 200 ELO stronger than Stockfish. However, that misses the point. Chess almost certainly has an ELO limit: a rating at which a player can reliably achieve an "ideal" result (which is almost certainly a draw) against a perfect opponent. I'm guessing that the asymptote in rating for AlphaZero is partially due to Stockfish being close enough to this limit to be able to draw an appreciable number of games against any engine or even perfect play.
I don't have proof of this, but here's an analogy:
Time for a reset of ELO -- you and every other tournament player get your rating reset to 1500. You're going to play lots of games to reestablish your rating and everyone else's. Oh, and the game is Tic-tac-toe (aka noughts and crosses, Xs and Os, ...). What do you think the highest legitimate rating will be? 1550? 1520? 1501?
In Checkers/Draughts, this happened in the mid-1990s. The last man/machine match ended up +1/-0/=31 to the machine, and play was very close to optimal by both sides. Any machine would be unable to achieve a rating more than a few hundred points above a human playing almost error-free checkers. This is what I expect as engines get stronger -- machine vs. machine matches will have an increasing percentage of draws, and engines/AIs will not be able to achieve significantly better results with large differences in hardware and algorithms/AI. (Go probably has a MUCH higher ELO limit.)
jsaldea12 jsaldea12 12/9/2017 12:33
AlphaGo is a breakthrough, just like the flat screen TV and LED that have now become part of our lives. But it appears the potential of AlphaGo's complete mastery of computerized algorithms, with sets of rules, patterns, etc. being obeyed totally at the speed of light, has opened a new dimension, as I said, like the flat screen TV and LED. In astronomy, for instance, it may be able to prove whether black holes exist. To me, they do not exist; it is against the laws of physics: the bigger the fire, the bigger the light. AlphaGo may be able to explain why the gravity of the earth is all attraction, NO REPULSION (see article on the internet). In medicine, it may be able to discover cures for cancer, viruses, etc. These are just samples of what AlphaGo's complete mastery of algorithms can do. I congratulate Demis Hassabis for making this Nobel breakthrough.

Jose s. aldea dec. 9, 2017
Scientist-inventor
Masquer Masquer 12/9/2017 12:10
@tourthefarce
AlphaZero is 100 Elo points above the crippled SF8 version it played against, based on the score in the 100 game match.
tourthefarce tourthefarce 12/8/2017 09:56
Did anyone calculate AlphaZero's rating based on this match with Stockfish?
pcst pcst 12/8/2017 07:26
Example that I won stockfish 8 64 too. I think Alpha zero will move like Stockfish some of game. I like to test Bug mode with Alpha zero.
[Event ""]
[Site ""]
[Date "24/07/2017 8:26:43"]
[Round "1"]
[White "Human"]
[Black "Stockfish 8 64"]
[Opening "Petrov's defence"]
[Eco "C42"]
[TimeControl "0.5+0 (Min.+Inc.)"]
[Result "1-0"]

{[%clk 0:00:30] [%clk 0:00:30] } 1. e4 {[%clk 0:00:30] } e5 {[%clk 0:00:27]
} 2. Nf3 {[%clk 0:00:29] } Nf6 {[%clk 0:00:26] } 3. d3 {[%clk 0:00:28]
} Nc6 {[%clk 0:00:25] } 4. Bd2 {[%clk 0:00:28] } d5 {[%clk 0:00:24] } 5.
Nc3 {[%clk 0:00:27] } d4 {[%clk 0:00:23] } 6. Ne2 {[%clk 0:00:26] } Bc5
{[%clk 0:00:22] } 7. h3 {[%clk 0:00:26] } O-O {[%clk 0:00:22] } 8. Qc1
{[%clk 0:00:25] } Be6 {[%clk 0:00:21] } 9. Ng3 {[%clk 0:00:24] } a5 {[%clk
0:00:20] } 10. Be2 {[%clk 0:00:24] } a4 {[%clk 0:00:19] } 11. Bh6 {[%clk
0:00:21] } gxh6 {[%clk 0:00:19] } 12. Qxh6 {[%clk 0:00:21] } Bb4+ {[%clk
0:00:18] } 13. c3 {[%clk 0:00:19] } dxc3 {[%clk 0:00:18] } 14. O-O {[%clk
0:00:18] } cxb2 {[%clk 0:00:17] } 15. Ng5 {[%clk 0:00:17] } bxa1=Q {[%clk
0:00:16] } 16. Nh5 {[%clk 0:00:17] } Qxf1+ {[%clk 0:00:15] } 17. Bxf1 {[%clk
0:00:16] } a3 {[%clk 0:00:14] } 18. Qg7# {[%clk 0:00:15] } 1-0
e-mars e-mars 12/8/2017 06:54
People don't understand and just argue about Stockfish being crippled and so on... They don't realise this is basically a starting point, with AlphaZero being in a "prototype" phase. Even under such conditions (Stockfish crippled), a prototype, after only a few hours of learning, can achieve such a result; you can imagine what it can do after days, weeks, or months of learning.

Downside: I don't think you will see a "portable", commercial (PC version) of AlphaZero any time soon. So no TCEC I am afraid, unless they provide some sort of AlphaZero instance connected to TCEC (but as far as I know part of the TCEC rules is that engines use the same hardware...).

Maybe the ICGA can organise something.
dysanfel dysanfel 12/8/2017 06:44
After 31.Qxc7 it looks like Stockfish is better. I cannot believe that AlphaZero won that position. It is uncanny.
mrburns123 mrburns123 12/8/2017 06:39
@sgbowcaster On page 4 of their paper you can see, that AlphaZero didn't make much progress anymore after 200k out of 700k learning steps: https://arxiv.org/pdf/1712.01815.pdf
phenomenonly phenomenonly 12/8/2017 06:03
Incredibly impressive, as is the game shown in the first diagram with 21.Bg5!! For those who are interested in this game and its most interesting variations, an analysis can be found at http://www.sklauffen.de/wordpress/wp-content/uploads/2017/12/2017-12-04-AlphaZero-Stockfish8.htm .
Quitch Quitch 12/8/2017 06:02
sgbowcaster, because they aren't interested in chess so much as in proving how effective their algorithm is. It took ~4 hours for AlphaZero to go from learning chess from scratch to being able to beat the best engine that over a thousand years of human knowledge of the game could create. Chess is just a ruleset to them, a way to demonstrate their neural net.
KingZor KingZor 12/8/2017 05:43
Igor Freiberger, I stand corrected. The whole report is rather slapdash. An amazing accomplishment, but lousy reporting.
sgbowcaster sgbowcaster 12/8/2017 05:03
Why didn't they let AlphaZero learn for 3 or 4 days instead of a few hours and then write a paper on that?????
Bov Bov 12/8/2017 01:51
@jsaldea12
I assume the position is wrong and the Knight has to stand somewhere else than in d5 ?!? Else please give the solution after 2.Nxb6
oxygenes oxygenes 12/8/2017 01:22
@jsaldea12
I am not a big puzzle solver, but I have a feeling you probably have the wrong data.
Your puzzle is really great; if I remember right, it is a modified version of your previous puzzle, where you first gave us the wrong position and then showed the wrong solution. I am just curious how you want to reach mate in 9 after 1.gxh6 gxh6 2.Ba7-b6 Nd5xb6? Or is this puzzle some sort of helpmate?
tjallen tjallen 12/8/2017 01:15
In the linked paper, under Domain Knowledge, the authors say:
"5. Chess and shogi games exceeding a maximum number of steps (determined by typical game length) were terminated and assigned a drawn outcome; "
The authors do not indicate what game-length was chosen for chess, nor where this fact is used. Is a maximum game-length used in the matches against stockfish? In the training matches against itself? In its own evaluation function MCTS or subtree lengths? I don't see where they say they use this parameter. How would this affect the match outcome?
Thinkler Thinkler 12/8/2017 11:36
I know the hype is real. But please read the paper! Do not blindly spread wrong information. There is a difference between AlphaZero and AlphaGo Zero. This article is full of mistakes. But it might be too late already.
googyi googyi 12/8/2017 11:22
@cloudmann Just let your Stockfish 8 run until it reaches depth 49-50 and you will see 35. Nc4.
algorithmy algorithmy 12/8/2017 10:32
I'm not impressed at all. Sounds like a scam to me, or at best like an ill-constructed and biased experiment done by people who may know something about programming but not too much about chess!
weerogue weerogue 12/8/2017 10:19
Probably a little bit of unfair play somewhere along the way here, but still absolutely astonishing results. I just played through game 3 and enjoyed it enormously! AZ sacs a pawn, then another, then the exchange, and finally Black is in complete Zugzwang and has nothing to do but throw away material with each move; it is incredible!!
cloudmann cloudmann 12/8/2017 08:25
<canu>
I agree, that after analysing even the first game, where SF8 is White in a C65 match, there was something not quite kosher going on here. For example, in that 1st game, check out SF8's move #'s 29 (g3), 35 (Nc4), 36 (Rc1), and 38 (Nf3), and ask yourself if something isn't awry.
My version of SF8 would have never made those particular moves.
jsaldea12 jsaldea12 12/8/2017 06:51
THE POSITION AND SOLUTION TO THE 9 MOVER CHESS PUZZLE IS AS FOLLOWS:

Position: White: Pa2, Ka3, Pa4, Ba7, Ba8, Pb5, Pc2, Pd6,Pe2, Nf3, Ng1, Pg5
Black Pc3. Kc4, Pc5, B-c8, N-d5,Pd7, Pe3, P-g3, P-f6, Pg7, Ph6

Solution 1: (1)Pg5xPf6 or Pg5xPh6…PxP(2) B- b6...P-h5 (3) B-d8…P-h4 (4) bd8xpf6…p-h3 (5) b-g5…B-b7 (6) BxB…P-h2 (7) BxPe3…NxB (8)N-e5…K-d4 (9)Ng1- f3 Mate.

I was not expecting this. Sorry, I messed up the celebration of DeepMind. I was expecting AlphaGo to make mincemeat of the puzzle after what it has done to Stockfish. After 24 hours of waiting, at least now I feel assured that this puzzle, 30 years in the making, has no loopholes anymore. Merry Xmas in advance!!!

JOSE S. ALDEA
Dec. 8, 2017 2:00PM
kgraham63 kgraham63 12/8/2017 06:45
How about the computers teach humans how to play better chess instead of using all this processing power to beat the human player. This would be a true renaissance in the next generation of AI.
Masquer Masquer 12/8/2017 06:09
The play of AlphaZero in the 10 games published appears to be very impressive, to say the least. It would not surprise me if it is truly stronger than Stockfish. However, the conditions under which the match(es) were held are not entirely clear. Whatever is stated in the paper appears to be definitely unfavorable to Stockfish, which vitiates the result and makes the whole test *unscientific*.

Let's enumerate:

1. Stockfish 8 was used, which is ~40 Elo below the latest development version of SF, that's available to everyone
2. The time control was 1 minute for each move, crippling SF's time management algorithm (where SF could use more time in the opening and in critical positions)
3. While SF used 64 processor cores, it was only allowed to use 1 GB of hash memory, a ridiculously low amount, which would only take away from its ability to use those 64 cores effectively
4. SF was not allowed to use a book (not that it has a native book), but AlphaZero's own prior training basically made it possible for it (A0) to play with one (in effect)

Despite all these serious handicaps, the given result of the 100 games (28 wins and 72 draws, or 64-36 in favor of AlphaZero) only shows a 100 Elo advantage over the crippled-down SF8.

The paper is ambiguous as to the number of games played between AlphaZero and SF8 (some say 100 games, others assume 1200). It only published 10 of the games. This by itself is very sloppy and unscientific.

In conclusion, because of these deficiencies the community has every right to ask for better test conditions and more clarity from the AlphaZero team!
pcst pcst 12/8/2017 05:55
Make it real guy don't keep it secret chess is play to be best year by year. Who need is best. Accept it year by year. It so long until we go to another planet it not finish yet.
rubinsteinak rubinsteinak 12/8/2017 05:50
@Martin_Cz I think the oddities you're finding are down to two reasons: 1. SF8 was selectively handicapped with a mere 1 GB of RAM. I'm not an expert on how chess programs work, but basically SF stores the positions it looks at and their evaluations in a "hash table" that resides in RAM. The less RAM, the less SF can store during the game. This makes the program much more inefficient in time use and affects the depth of search. 2. An arbitrary 1 min/move rule was imposed, which again, is not optimal for SF search routine. Combine these two factors, and you get oddities like the ones you've found, plus others are finding even more egregious mistakes. More can be read here: https://www.chess.com/blog/maelic/new-computer-chess-champion-not-yet

I think DeepMind did try A0 against the "real" SF8 and it didn't go very well. Then they started tweaking different settings and creating arbitrary time limits until they got the results they wanted. What I don't understand is why they thought they'd get away with it?
pcst pcst 12/8/2017 05:45
I win stockfish 8 64 55 games I think I can play to win alpha zero too. In year bug mode. Email me thaichessengland@gmail.com
pcst pcst 12/8/2017 05:38
Let play me 0.5 minute per game I think I will find bug of Alpha zero. Test 1 Day. OR CAN Alpha zero play Thai chess. Regard
Martin_Cz Martin_Cz 12/8/2017 04:44
Also, in the same game, what puzzles me are two moves:
16. .... Nb7. Why would anyone put a knight almost in the corner??? Isn't it a terrible spot? My machine briefly "mentions" that move in a few seconds analysis, but then comes up with - according to its evaluation - better alternatives: 16. ... Nf5 / Bc4 / Ne8.
Later, it doesn't value the move 18. ... h6 very much either (overall evaluation -0,28) while it prefers 18. .... d5 with the evaluation -1,13.
But again: My machine is rather outdated, and I'm no expert at computer chess either, so if anyone cares to run a deep analysis, it would be cool! Anyway, what struck my amateur chess player eye in the position where you praised the move 21. Bg5 by Alpha were the awkwardly placed black pieces on the queen side. So I wondered why would Stockfish place a knight on b7 if there is still another knight on b8, as well as the pawns on d7 and c6, hindering the development of the b8 knight. And surprise surprise: my Stockfish doesn't like the move Nb7 very much either! Am I missing something?
Martin_Cz Martin_Cz 12/8/2017 04:26
Hi! Can someone with a strong machine (or you guys at chessbase) do a proper analysis of the mentioned move 21. Bg5 and the following variants? I only have an old Stockfish 2.1.1 JA 64 bit, with Intel Core i5, CPU 2.5 GHz, 4 GB RAM, so my 5 year old laptop may be missing something, but it doesn't simply support your claim that: 'After analyzing it and the consequences, there is no question this is the killer move here.' After 21. ... f5 and 22. Qf4 as in the game, it thinks 22. ... hxg5 is stronger than the move 22. ... Nc5 played in the game. It offers the following line: 23. Nxg5 Qxh5 24. Bf3 Qg6 25. Kg2 Kg8 26. Re7 with possible answers by black: 26. ... Ne8 / Nc5 / Qd6, and my machine does NOT see how to win for white and gives the evaluation zero. It is even ok with the move 21. hxg5 and the following sample line: 22. Jxg5 Qg8 23. Qh4 Bd3 24. h6 Re8 25. hxg7 Kxg7 26. Qd4+ f6 27. Qxd3 fxg5 28. Qf5 or Rxe8 for white etc. and the evaluation is also 0.00.
celeje celeje 12/8/2017 03:22
@reddawg07: You mention IBM's Watson. That is basically a scam and the masses are too ignorant to realize they've been scammed. That's why it's important not to allow any cheating or fudging by Google/DeepMind/Alpha. If they behave like IBM, that's not good.
TommyCB TommyCB 12/8/2017 02:47
ChessBase:

"It played a match against the latest and greatest version of Stockfish"

Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm:

"In the Methods we describe these augmentations, focusing on the
2016 Top Chess Engine Championship (TCEC) world-champion Stockfish"

The 2016 TCEC version is over a year old, so not the latest and greatest.
Mr TambourineMan Mr TambourineMan 12/8/2017 02:16
More results will come over time. As for the objection that Stockfish was handicapped, especially in that it had no opening book while AlphaZero in practice had its self-taught opening book: that is of course a bit unsportsmanlike.

When Kasparov played against Deep Blue, he had no opening book, and could not even take a time out and go to the library to check, while the computer could check immediately, built in, all Kasparov had was his self-learning opening skills. Deep Blue, however, was able to consult millions of games and also had a pre-programmed opening book with all Kasparov games and also specially designed to use against Kasparov. Also a little unsportsmanlike.

As far as opening books for brute-force computers are concerned, I have always seen it as an Achilles heel that their playing strength is so much weaker without an opening book, and to some extent in the endgame without endgame databases. So what has been done here is a little like proving the emperor is naked by first pulling off his clothes and then yelling that the emperor is naked. It was not told that way in H.C. Andersen's tale, although the ending is still the same.
Igor Freiberger Igor Freiberger 12/8/2017 02:16
LetoAtreides82: see the legend for Table 2. Here they clearly explain that "The plot shows the proportion of self-play training games in which AlphaZero played each opening, against training time. We also report the win/draw/loss results of *100 game AlphaZero vs. Stockfish matches* starting from each opening, as either white (w) or black (b), from AlphaZero’s perspective."

There is no doubt they played 12 matches of 100 games between AZ and Stockfish for the 12 chosen openings. Mixing two different kinds of data in one graphic is not very scientific, but the whole paper suffers from this approach.
dofski dofski 12/8/2017 01:16
@tjallen
You may be right with A. ie no further improvement possible. But why should this be for A0? No explanation is given ? This doesn't seem right to me.

I do not know whether you think B. is really a possibility. If A0 had "solved" chess then surely it would not show the range of win, draw results obtained. It is interesting it did so very much better with white.
tjallen tjallen 12/8/2017 12:55
Looks like SF and other engines are pruning away good moves in positions where multiple pieces are left en prise. Also they seem to be pruning away good moves in positions where an equalizing recapture is beyond the move horizon. That is something A0 has exposed.
canu canu 12/8/2017 12:33
Something very fishy took place in this match. After analysing the games, I believe Stockfish was run on below-standard settings.
tjallen tjallen 12/8/2017 12:30
dofski - I also noticed the asymptotic training graphs of A0, especially for chess. There was no improvement from 300K to 700K training sets, it seems stuck at about 3600 elo. I figure there are at least two possible explanations:
A. The A0 algorithm does not allow for more chess improvement (at least by self-play), or
B. A0 is playing objectively perfect chess and there is no room for improvement.
I do not think this is any problem with the elo system.
JactaEst JactaEst 12/8/2017 12:18
'At one time I had a book titled "The World's Worst Joke Book" or something similar to that. One anecdote was that of a man asking the strongest computer does God exist. The reply was it does now.'

It's actually a famous sci-fi (very) short story from the 60's I think:
'All the computers in the world were linked together and asked 'Is there a God?'
'There is now' came the reply.'
JactaEst JactaEst 12/7/2017 11:57
Yes - I think they will have to play such a match to put the valid criticism that they played a crippled Stockfish to bed.

Nevertheless in several of the published games which it lost Stockfish came out of the opening believing itself to be dead level or even ahead - so it does seem that AZ simply evaluates positions better than Stockfish.
dofski dofski 12/7/2017 11:20
@fgkdjlkag
"Obviously there is an upper elo limit, I'm not sure why this is considered interesting or significant? "
I had a quick surf and there are suggestions that 3400-3600 may be a maximum. But it is still not clear to me whether this is a function of the Elo system itself or of the machines.

Suppose a machine X has a rating of 3600 against other powerful machines. Is it being suggested that it would be impossible to design a machine Y which beats X 3:1 on average, and which should therefore have a higher Elo?