Alpha Zero: Comparing "Orangutans and Apples"

by André Schulz
12/13/2017 – In time for the start of the London Chess Classic, DeepMind, a subsidiary of Google, published a remarkable report about the success of its "Machine Learning" project Alpha Zero. Alpha Zero is a chess program that won a 100-game match against Stockfish by a large margin. But some questions remain. Reactions from chess professionals and fans.


From Zero to Chess

The company DeepMind Technologies was founded in London in 2010 by Demis Hassabis, Shane Legg and Mustafa Suleyman. In January 2014, Google bought the start-up for an undisclosed amount, estimated at about USD 500 million. The company became Google DeepMind, with the stated vision to "understand artificial intelligence"; to that end it seeks to combine insights into how the human brain works with the methods of "Machine Learning".

Machine learning

In October 2015, DeepMind had its first big success with the game of Go. Go is a very complex game that requires strategic skill in particular. For a long time it had been impossible to translate the requirements of Go into mathematical formulas that would allow Go programs to compete with the best human players. But with special self-learning heuristics the DeepMind program AlphaGo got better and better and was finally strong enough to beat Go professionals. In October 2015, AlphaGo defeated several-time European Champion Fan Hui, and in March 2016 the program won 4:1 against the South Korean Go professional Lee Sedol, a 9-dan player; both matches were played under tournament conditions.

The architecture of the AlphaGo program is based on the interaction of two neural networks: a "policy network" that proposes candidate moves and a "value network" that evaluates positions. A Monte Carlo tree search ties the two networks together. With the help of a database of 30 million moves from human games, the program learned to predict the moves of human players.


In the match against Fan Hui, AlphaGo ran on a computer cluster of 1202 CPUs and 178 GPUs and used 40 "search threads". In the following match against Lee Sedol it had 1920 CPUs and 280 GPUs at its disposal. For the learning phase before the matches, the Google Cloud platform with its Tensor Processing Units (TPUs, ASICs for the TensorFlow software library) was used.

In May 2017 AlphaGo took part in the "Wuzhen Future of Go Summit 2017" in Wuzhen, China, and won three games against the world's number one, Ke Jie. The program also won against five leading Go players who could consult with each other during the game.

The next development step was the program AlphaGo Zero, about which DeepMind published a report in October 2017. AlphaGo Zero started from zero and with a reduced hardware setup: the program knew the rules of Go but had no prior knowledge of the game whatsoever, and it improved solely by playing against itself. Four Tensor Processing Units served as hardware. Trained with TensorFlow, it took AlphaGo Zero only three days to play better than the previous AlphaGo version that had beaten the best human Go player; the new program defeated its predecessor 100-0.

Since Hassabis had been a good chess player as a junior it did not come as a surprise when DeepMind turned to chess after its success with Go. From the beginning of computer development chess has been considered the touchstone of artificial intelligence (AI).

(Above) Monte Carlo method applied to approximating the value of π. After placing 30,000 random points, the estimate for π is within 0.07% of the actual value. | Source: By nicoguaro CC BY 3.0, via Wikimedia Commons

DeepMind's Video about AlphaGo Zero

The last big leap forward in computer chess came a bit more than ten years ago, when Fabien Letouzey introduced a new approach to tree search with his program "Fruit". Vasik Rajlich, the developer of Rybka, improved this approach significantly. Rybka was later decompiled, and several programmers used its code as a starting point for further developed and improved chess programs of their own.

The basis of all these programs is an optimised alpha-beta search, in which evaluation parameters (material, piece activity, king safety, control of squares, etc.) determine the best moves for both sides. The more lines the search can discard as irrelevant, the more efficient it becomes, and the deeper the program can look into the critical main line. The program that searches deeper wins against the others; nevertheless, the drawing rate in top-level computer chess is very high.




Alpha Zero's Monte Carlo tree search is a completely different approach: at every point the program plays a number of games against itself, always starting from the current position, and in the end it counts the results to form an evaluation. The authors describe this approach in more detail in their paper "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm".

In the learning phase (training), Alpha Zero used 5000 "first-generation" TPUs from Google's hardware park to play games against itself, while 64 "second-generation" TPUs were used to train the neural network. After only four hours of training, Alpha Zero already played better than Stockfish.

During the training phase Alpha Zero also played matches against Stockfish, always a hundred games, 50 with White and 50 with Black, each series starting from one of twelve popular openings. Alpha Zero won the majority of these matches but not all of them: in the Queen's Gambit the program lost 1-2 with Black (47 games were drawn). In the Grünfeld (which DeepMind erroneously calls the "King's Indian") Alpha Zero lost 0-2 with Black, while 48 games ended in a draw. In the Kan Variation of the Sicilian it lost 3-7 with 40 draws. With colours reversed, Alpha Zero always won clearly.

Source: DeepMind

The "well-trained" Alpha Zero program then played a 100 game match against Stockfish, in which it used a computer with four TPUs while Stockfish was running on hardware with "64 threads". In 25 of the 28 games Alpha Zero won in this match Alpha Zero was playing with White but with Black it won only three games. That is a very unusual result. Usually, there's 55% statistical difference between White and Black in chess. In Go and Shogi matches the difference between playing with White and playing with Black was much less marked.

Source: DeepMind

Incidentally, this result corresponds to a score of about 64 percent, or an Elo difference of roughly 100 points, somewhat less than the gap between Magnus Carlsen and a 2700 player.

GM Daniel King shares AlphaZero vs. Stockfish highlights:





Reaction and reception

The reaction of the international press was enthusiastic, comparable to the reaction when Deep Blue won its match against Garry Kasparov 20 years ago. Back then the value of IBM shares rose considerably, and Google DeepMind would certainly not be unhappy if the same happened to its parent company. But the reactions were also markedly uncritical, with breathless reporting along the lines of: a great supercomputer just taught itself chess in a couple of hours and is now better than the best chess program; mankind has taken a great step forward (to where?). After all, this is the very impression the publication was meant to create.

On Chess.com, Tord Romstad from the Stockfish team had the following to say about the match:

The match results by themselves are not particularly meaningful because of the rather strange choice of time controls and Stockfish parameter settings: The games were played at a fixed time of 1 minute/move, which means that Stockfish has no use of its time management heuristics (lot of effort has been put into making Stockfish identify critical points in the game and decide when to spend some extra time on a move; at a fixed time per move, the strength will suffer significantly).

He goes on to note that the version of Stockfish used was not the most current one, and that the specifics of its hardware setup were unusual and untested. Conversely, the "4 hours of learning" is misleading considering the enormous hardware resources behind that training run.

But in any case, Stockfish vs AlphaZero is very much a comparison of apples to orangutans. One is a conventional chess program running on ordinary computers, the other uses fundamentally different techniques and is running on custom designed hardware that is not available for purchase (and would be way out of the budget of ordinary users if it were).

Romstad concedes that the comparison between two entirely different approaches has its charms, and that it may provide more stimulus for future development than the previous races in computer chess, in which one program using the same calculation methods is only slightly better than another.

Viswanathan Anand

Anand after Round 6 | Source: Saint Louis Chess Club on YouTube

Several players weighed in on the London Chess Classic live webcast, some of the most interesting remarks came from Viswanathan Anand:

"Oviously this four hour thing is not too relevant — though it's a nice punchline — but it's obviously very powerful hardware, so it's equal to my laptop sitting for a couple of decades. I think the more relevant thing is that it figured everything out from scratch and that is scary and promising if you look at it...I would like to think that it should be a little bit harder. It feels annoying that you can work things out with just the rules of chess that quickly."

Indeed, for chess players who work with computer programs, the breakthrough of Alpha Zero is, for now, of no practical use at all: in the short run, no adequate hardware for Alpha Zero will be available. For chess programmers, the results of the research project were rather disillusioning. Even if an Alpha Zero-style program were at some point to run on common hardware, the powerful development environment required to train it would still be unaffordable. But if the project eventually spawns an open-source cousin, and the necessary computing performance becomes available, it could spell the end of the individual and varied chess programs we know today. Until then, the likes of Houdini and Komodo remain top dogs in the chess engine market.

GM Larry Kaufman from the Komodo team lauded the news with a caveat on Facebook:

Yes, it's big news. I'm not sure yet how it will affect what we do. It depends on whether Google releases the program or keeps it proprietary. It wasn't a fair match in all respects, but nevertheless impressive.




Other grandmasters took to Twitter:

IM Sagar Shah of ChessBase India recorded an instructive lecture at the Indian Institute of Technology Madras on the role of technology in chess, and discussed AlphaZero at length, including analysis of some of its games:

To sum up

The DeepMind team achieved a remarkable success with the Alpha Zero project. It showed that it is possible to use a Monte Carlo method to reach an enormous playing strength after only a short training period — if you use the Google Cloud with 5000 TPUs for training, of course!

Unfortunately, the comparison with Stockfish is misleading. The Stockfish program ran on massively parallel hardware which, if one understands Tord Romstad correctly, it can exploit only to a limited extent. It is not clear precisely how the two hardware setups ought to be compared. The match was played without an opening book and without endgame tablebases, both of which are integral components of a program like Stockfish. And the chosen time control is highly unusual, arguably nonsensical, in chess, particularly in computer chess.

Of the 100 games of the match, DeepMind only published ten wins by Alpha Zero, unfortunately without information about search depths and evaluations.

But the entire chess world is eagerly awaiting more experiments and information on future development plans.

 

Links:

Translation from German: Johannes Fischer
Additional reporting: Macauley Peterson




André Schulz started working for ChessBase in 1991 and is an editor of ChessBase News.

desiertos desiertos 4/7/2018 04:53
No orangutans and apples. Stockfish certainly knew what it was doing: it drew many games.
celeje celeje 3/31/2018 01:58
@ enfant:
You can even more safely predict that DeepMind will never allow Alpha Zero to play against "any strong engine that is free to play its best game".
It's not hard to work out why.
enfant enfant 3/24/2018 06:34
RE: Alpha Zero Strongest Chess Engine??

That would seem to be the implication.

But alas, in chess, there is just no free lunch.

I think I can safely predict that Alpha Zero will lose to any strong engine that is
free to play its best game. And that Alpha Zero is a "Wizard of Oz", hiding behind
the curtain that it cannot foresee complex tactical variations very well.
celeje celeje 1/4/2018 11:29
@DrCliche: Thanks for the info. What exactly is Stockfish Master? (Did it have any of the add-ons they talk about?)
Did Master have the normal 40/40 time limits?
How much hash?

Oh, another thing: is there an official rating for this Stockfish Master?
DrCliche DrCliche 1/3/2018 02:51
@astokes Such an experiment had already been proposed, started, and finished on the FishCooking forums before you even made your comment. The results were that a properly configured Stockfish Master utterly crushed a handicapped Stockfish 8.

Limitations:

- The manner of handicapping was as close as they could get to Deep Mind's configuration based on the available information, that is, Stockfish 8 was given exactly one minute per move and only 1GB of hash.
- Both engines were running single threaded on the same machine, which out of necessity was normal consumer hardware, making it significantly slower than the hardware used by Deep Mind for either engine.
- Perhaps out of habit, games used FishTest's default 2-move opening book rather than starting from scratch. Those positions include some that significantly favor white.
- Stockfish Master wasn't using an opening book or endgame tablebases.

Anyway, the match score was 26 wins, 72 draws, and 2 losses for Stockfish Master, very similar to AlphaZero's score. Both losses occurred when defending against two of the openings that significantly favor white. (In supplementary games where openings were predetermined, AlphaZero also lost games against the misconfigured Stockfish 8. For example, AlphaZero had 20 wins, 71 draws, and 9 losses in the Sicilian. In 2 of those losses, AlphaZero was white!)
jsaldea12 jsaldea12 12/25/2017 02:32
Proposed fair match between Alphazero Vs. Carlsen or So

Thank you Celeje and Astokes for your comments. I actually sent that chess puzzle to the author, respected Demis Hassabis, author of Alphazero, ( said Alphazero, super-giant chess computer that totally demolished stockfish, current world super- computer chess champion), a number of times, expecting to be demolished like stockfish, but the good Dr. Hassabis did not respond. Ir appears clear even Alphazero algorithm, has a limit. And it sets me thinking. In that match between super computer against Kasparov, the time maybe the same. THIS IS UNFAIR. That engine computer can flash a move at the speed of light a second. But human naturally is thousand or million of times slower. Thus, to make the match fair and equal, it is proposed to arrange a match, say between Alphazero against sworld chess champion Carlsen or GM So, the super giant chess computer is given 1 minute to make one move while the human is given 1 hour to make a move. I kike ti see this,I can say this because Alphazero has not proven it can solve my 30 years in the making complex chess puzzle, 9 mover or shortened 7 mover. I still say human is superior to machine.

Merry Xmas, happy new year!!!!

jsaldea12 Dec. 25, 2017
celeje celeje 12/23/2017 03:24
@e-mars: I just saw your last comment addressed to me. I didn't notice it before, because it's surrounded by spam posts unrelated to the topic. Please re-read what I wrote.
1. They are not reviewing their own paper, you know.
The reviewers are unlikely to be chess experts. The reviewers are unlikely to be familiar with computer chess either. Even chess pros aren't all that familiar with computer chess. They just use the programs.
2. It's unknown whether any of the authors has any interest in computer chess. Probably not. It's not part of AI. I didn't actually say anything in the post you responded to about the authors' knowledge of chess or computer chess. I mentioned what was reported, that they have refused to answer the chess journalist's questions. That's all.
3. I don't think the authors necessarily have to know about chess or computer chess, but maybe if they knew more about computer chess they would not have mucked up the Stockfish setup, which now seems clear to us but maybe not to the reviewers. We'll see.
celeje celeje 12/23/2017 05:54
@astokes: I agree it's a simple thing. I agree with the test you describe. But your conclusion is the complete opposite of the truth.

It's not people whimpering that's the problem. It's DeepMind being very secretive and looking like they have stuff to hide.
I'm sure plenty of people would be happy to set up an "anointed" Stockfish. The problem is trying to get DeepMind to release the version & configuration of their "crippled" Stockfish.
astokes astokes 12/22/2017 11:47
It's a simple thing.

Configure a straightjacket Stockfish (jack) exactly the same as the one used by Google. Have it play 100 games against the anointed Stockfish (current version, best hardware, opening book, endgame tables, etc.) playing under roughly 40 minutes per 40 move time controls, and see whether anointed Stockfish can kick jack Stockfish in the nuts _anywhere_ near as hard as AlphaZero managed to do.

For my money, there's far too much whimpering about the non-ideal match conditions.

Anointed Stockfish should put up, or shut up.
jsaldea12 jsaldea12 12/22/2017 03:07
Revised 7 mover chess puzzle, no illegal position:
White to mate black in 7 moves. (Never mind dizzying 8 mover) • White to mate black in 7 moves: • Position: White: Ka3, Pa4, P-a5, Ba8, B-b6,Pc2, Pd6,Pe2, Nf3, Ng1, Pg5 Black: Pa6 Pc3. Kc4, Pc5, B-c8, N-d5,Pd7, Pe3, Pf5, Pg7, Ph5
It is still complex 7 mover chess puzzle. Can AlphaZero, by itself algorithm, without human intervention, solve it dispatchly..Please let us know•respected Dr. Demis Hazzalbis.
Jsaldea12 dec. 21, 2017
jsaldea12 jsaldea12 12/20/2017 12:49
I agree with chess expert, Mark Erenbutg, Judge, world chess cup, that chess puzzles must adhere to rules of play of chess
albitex albitex 12/19/2017 06:21
Just now
But to learn, Alphazero had to play 700,000 games; is that not how you create a book? Online, an engine takes weeks, months, to create a good book; Alphazero, with a 600-core supercomputer, took one day. It has not shown learning!
Learning would be shown if after a month of play it continues to improve.
jsaldea12 jsaldea12 12/19/2017 01:15
Reverted 9 mover chess puzzle for Alphazero:
My chess puzzle, reverted to mate in nine, below, thanks to Mark Erenburg, arbiter, world chess cup, for giving me opportunities, pushing me to correct myself, although again he considers the posaition illegal?? (in puzzle, it is not illegal) until I made the puzzle complex too good a challenge for chess computers to solve, even, with due respect, to AlphaZero to solve by itself algorithm, without human intervention.
White to mate black in 9 moves:
Position:
White: Pa2, Ka3, Pa4, P-a6, Ba7, Ba8, P-b5,Pc2, Pd6,Pe2, Nf3, Ng1, Pg5
Black: Pc3. Kc4, Pc5, B-c8, N-d5,Pd7, Pe3, P-f5, P-g3, Pg7, Ph6
Solution 1 : (1) Pg6 P-h5 (2) B-b6 P-h4 (3) B-d8…P-h3 (4) b-g5…P-f4 (5) BxPf4…B-b7 (6) BxB Ph2 (7) BxPe3…NxB (8)N-e5…K-d4 (9)Ng1- f3 Mate.

Solution 11: (1) Pg5xPh6…PxP(2) B- b6...P-h5 (3) B-d8…P-h4 (4) B-g5 Pf6 (5) BxP Bb7(6) BxB Nb4, (7) BxPe3…NxB (8)N-e5…K-d4 (9)Ng1- f3 Mate.
Best wishes.
Jose S. Aldea
Dec. 19.2017 (9AM)
celeje celeje 12/18/2017 01:32
@fgkdjlkag : Re. " it's curious and heartening that practically every point made by the IM AI researcher was already made in the comments here"
I don't want to sound critical of him, but I don't think they were really original thoughts. I also think there's a little bit of using his IM title to claim authority. But none of his comments were about specific chess moves.
DrCliche DrCliche 12/17/2017 07:57
@bullwinkle Stockfish doesn't have "an opening book". If you want Stockfish to use one, you have to supply your own book and book interface, and none were specified in the detailed methods of the paper. One could argue, however, that if you want to breathlessly claim to have "convincingly defeated a world-champion program" (and more!), you should configure that program for maximum playing strength, to the best of your resources and abilities. That is, a leading opening book (e.g. Cerebellum), 6-piece tablebases (though Google is one of the few places that could reasonably manage 7), and TCEC or better hardware. Deep Mind's batting 0-for-3 so far.
fons2 fons2 12/17/2017 06:07
@ jsaldea12

[FEN "B1b5/B2p2p1/p2Pp2p/P1pn1pP1/P1k5/K1p2Np1/P1P1P3/6N1 w - -"]

1. e3 Nxe3 2. Ne5+ Kd4 3. Ngf3#
jsaldea12 jsaldea12 12/17/2017 04:30
Would appreciate AlphaZero to solve by itself, below 8 mover puzzle

Pushing the puzzle to 8,9,10 movers up increase complications exponentally, a thousand, a hundred thousand possibilities, and I have no chess computer to aid me, it is dizzying mentally. But this puzzle is not only very complicated, it is most beautiful.
I believe super computers have their limit. But alpha Zero is entirely different. It obeys and moves by itself and solves by command. That is why I would appreciate Dr.Demis Hassabis Alpha Zero, by itself, without human intervention, solve this revised puzzle, below, shortened to 8 mover. jsa `12. 12.17.17
white mates black in 8 moves
Corrected Position: White: Pa2, Ka3, Pa4, P-a5, Ba7, Ba8, Pc2, Pd6,Pe2, Nf3, Ng1, Pg5
Black Pa6,P Pc3. Kc4, Pc5, B-c8, N-d5,Pd7, Pe3, P-f5, P-g3, Pg7, Ph6
Regards and thanks.
jose s. aldea dec. 17.17
e-mars e-mars 12/16/2017 09:01
@celeje Dharshan Kumaran is a GM. Demis Hassabis was a chess prodigy. Just to name a few.
They DO know about chess :-)
jsaldea12 jsaldea12 12/16/2017 11:55
You are right, Boy, i made a booboo.. This one is not so good, reduced, but white mates black in 8 moves
Corrected Position: White: Pa2, Ka3, Pa4, P-a5, Ba7, Ba8, Pc2, Pd6,Pe2, Nf3, Ng1, Pg5
Black Pa6,P Pc3. Kc4, Pc5, B-c8, N-d5,Pd7, Pe3, P-f5, P-g3, Pg7, Ph6

Solution 1: (1) Pg5xPh6…PxP(2) B- b6...P-h5 (3) B-d8…P-h4 (4) b-g5…P-f4 (5) BxPf4…P-h3 (6) BxPe3…NxB (7)N-e5…K-d4 (8)Ng1- f3 Mate.
Solution 2 MORE BEAUTIFUL SOLUTION : (1) Pg6 P-h5 (2) B-b6 P-h4 (3) B-d8…P-h3 (4) b-g5…P-f4 (5) BxPf4…P-h2 (6) BxPe3…NxB (7)N-e5…K-d4 (8)Ng1- f3 Mate.
I think this is now chicken pie to AlphaZero. Thank you all and regards
Jsaldea12 dec. 16, 2017.
celeje celeje 12/16/2017 08:04
@fgkdjlkag: It's currently under review. The authors have used that as an excuse to avoid answering any questions about it. The problem is the reviewers may or may not know much about chess or computer chess. If they just know about AI then they may not pick up flaws.
When it is published, it will be interesting to see what journal it is, to see whether their policy really is to forbid authors to make public comments before publication. Some journals have that policy, most don't.
bullwinkle bullwinkle 12/16/2017 06:30
What is the source of the statement that Stockfish is not using its opening book? The paper says nothing about that. And is there any example from the games where Stockfish does not follow known theory when given the opportunity?
peteypabpro peteypabpro 12/16/2017 04:58
@fgkdjlkag it wasn't published; it was just uploaded to arxiv.
mdamien mdamien 12/16/2017 04:43
@Martin.chrz I would not at all be surprised if Stockfish were playing at a crippled strength, although I recognize that others feel strongly that it wasn't crippled. I'm guessing, though, that all would agree that the version of Stockfish they used would still mop the board with any human player. What impressed me with the games they released (and granted, these were cherry picked) is that it's not a style you typically see in computer play, such long-term positional sacrifices. Sure, Magnus might be able to do it against a 2200 (perhaps if he were intentionally handicapping himself over the board) but not against even a crippled Stockfish. The point is that the AlphaZero team would seem to have a different sort of chess AI, regardless of how it might fare in a match against an in-form Stockfish.
fgkdjlkag fgkdjlkag 12/16/2017 04:08
@PCMorphy72, it's curious and heartening that practically every point made by the IM AI researcher in the link you posted was already made in the comments here (including the previous thread on the topic).

Much blame also has to be given to the journal that published the piece. But it is representative of the quality of published articles today. In most/all fields, reviewers are not paid and hence only do a cursory evaluation of a manuscript before accepting it. They could have asked for revisions. They could have rejected the piece. But there is also a conflict of interest - they want to be the journal to publish the piece from Google that is going to create an international media buzz (even if unjustified).
peteypabpro peteypabpro 12/15/2017 10:12
@mdamien I'm not sure the claim that "it's inescapable to conclude that the world leading, professional computer scientists at Deep Mind used a gimped, gutted, misconfigured Stockfish intentionally" is well-thought out...
idlivadai idlivadai 12/15/2017 11:55
By not making public the 90 remaining games of AlphaZero with Stockfish, there seems to be an effort to hide a few things... if you are going to shock the chess world with something new and claim you have turned it upside down, then you need to make the games public... By removing Stockfish's access to its opening books or tablebases, you have removed its teeth merely to prove your point. Seems to be merely a stunt and advertisement for a product. Why does their team never answer the various concerns raised by the chess-playing public?
celeje celeje 12/15/2017 11:44
@Martin.chrz: I don't think there's too much doubt about the 1200 games. They were not part of the training. Otherwise it wouldn't be self-training. I think they just did them to illustrate it can play the popular human openings well. They talk about a separate 100 games as the match, because in the 100 games AZ could play whatever it wanted. So the 1200 games are almost certainly not training, but they are AZ at full strength.
celeje celeje 12/15/2017 11:33
@ DrCliche: Thanks for taking the time to write a long reply.
celeje celeje 12/15/2017 11:32
@peteypabpro: I see. Yes, Tord Romstad was quoted as saying "far more search threads than has ever received any significant amount of testing".
But I did not take Tord to mean "too many threads".
I took Tord as meaning the program never plays and competes on superhardware, so all those developing it haven't spent time on that sort of thing. If it did, they would have optimized it for multiple processors and got a huge improvement. So it's playing better with more threads, but not as much as it could.
Martin.chrz@ortenix.cz Martin.chrz@ortenix.cz 12/15/2017 05:33
To PCMorphy72:
Thumbs up to that critical analysis of the AI researcher.
Those are my questions and doubts too!
Martin.chrz@ortenix.cz Martin.chrz@ortenix.cz 12/15/2017 05:26
To MDamien: You wrote: "Apart from these interesting points, the select games released from the match are simply astounding. [...] Then, to realize that the defending side is not a player from the 1800's, but a machine heretofore known to simply make no tactical mistakes ... it's breathtaking"

Before you get carried away too much, you should realize what the disscussion here (and elsewhere) is about. DeepMind crippled Stockfish's strength considerably - probably on purpose, so that they could claim they were able to build the strongest machine. How much they lowered the strength exactly is unknown. However, enough to beat it.

But as DrCliche pointed out: "A correctly configured Stockfish Master with books, ETBs, and reasonable hash size would have had similarly dominating results against Deep Mind's Stockfish 8 [....]"

Can't you really see the problem here? It's easy to beat a much weaker opponent, even with sacrificing material and using space advantage and ... and... whatever. It's like Magnus playing a 2200-rated player. In such games you could probably muse about "beautiful sacrifices", "not caring about material, just focusing on beautiful tactics and excellent attacks" etc. And Magnus would get away with it, because the weaker player would collapse under too many threats and strong attacking ideas. Well, it's not that easy if Magnus faces a similarly strong opponent! You have to play with much more caution generally, and if you sacrifice too much material and your stronger opponent defends successfully against your attack, not only may you NOT win, but you could easily LOSE such games...
Then, you would probably ask: "Where's that beauty, where are those breath-taking moves we saw previously in his games?"
The difference between Alpha and even the crippled Stockfish isn't as big as between Magnus and a 2200-player, of course, but the parallel is correct.

Once they (the guys at DeepMind) face Stockfish at its full strength, we'll see if Alpha is able to come up with such "beautiful ideas" and "breath-taking moves". My guess is Stockfish (or Komodo or Houdini) simply won't allow such positions where those moves would be possible.

And yes: I want to see a proper match! I want to see Alpha's proper strength! Perhaps it IS genius, perhaps it WILL play beautiful moves! What DeepMind achieved in 'go' has been incredible!
But this? Releasing only 10 winning games (out of 100, or out of 1200, who knows?) is NO PROOF of Alpha's strength. It's just a laugh.
Sadly...
Martin.chrz@ortenix.cz Martin.chrz@ortenix.cz 12/15/2017 04:49
Did the author of this article (or Chessbase) try to get comments from DeepMind on some of the disputed questions? Namely:

- Can they resolve the 1200 games conundrum?
- Was or was not Stockfish allowed to use opening libraries and endgame tablebases?
- Was there any reason they used the ridiculously small RAM for Stockfish?
- Is it true that: "they completely circumvented time management and used not 1 min/move TC but movetime uci command. This means that search was interrupted in the middle of the iteration [...]"? (see the post of DrCliche) If so, why?
- How can they explain that lots of other versions of Stockfish (run on even mediocre machines) discard some of the moves "their" version of SF made quite quickly and easily? How do they explain it? Does or does it not bother them?
- Do they realize that they crippled their version of Stockfish? How many ELO points do they think "their" Stockfish version really had? Follow-up question:
- How can they boast that their AlphaZero "destroyed the strongest chess computer to date"?
- Would they be willing to face the three strongest chess computers (Stockfish, Komodo and Houdini) under agreed conditions? On what HW? Equal HW? Or at least their super-computer against versions of the 3 above mentioned chess programs that are usually used at the TCEC championships? [Note: Of course, it's anyone's guess how such a match would unfold, but my guess is that it would be enough for the current chess programs to refute Alpha's strength with their full and USUAL power.]

I think you'd be able to think of a few more questions yourself.

Really: Why not ask them directly? They make lots of publicity for themselves. If they are serious about their contribution to science and/or chess programming, they should probably NOT dodge these questions.
Otherwise, it's just a guessing game of the chess / AI community in TOO many aspects.
mdamien mdamien 12/15/2017 01:10
@DrCliche What's with the rational, well-thought-out posts? Chess needs more blitz posting to keep the game lively and adversarial.
DrCliche DrCliche 12/15/2017 12:12
@celeje I don't know where that information comes from, it could simply be a supposition trying to explain the observation that Deep Mind Stockfish played worse than real Stockfish does on an old phone.

"Fail low" is somewhat esoteric chess programming terminology. Here's a pretty straightforward explanation from Crafty programmer Robert Hyatt: http://www.open-chess.org/viewtopic.php?f=5&t=2754. If you're interested in learning more about chess programming, https://chessprogramming.wikispaces.com/ is a good resource with lots of explanatory links including the one I just gave you.

Anyway, as I understand it, what happens when an engine isn't allowed to resolve a fail low is that there will be lots of lines that got pruned at very low depth and haven't actually been properly evaluated. Those moves will have as an upper bound the aspiration window's lower bound. That is, all that is known about those moves is "hey, these moves are probably at least this bad, they might be a lot worse."

So when the engine is chugging along and all of a sudden the evaluation for what it thought was the best move plunges below the aspiration window's lower bound (because some deep refutation was discovered), the engine is left with a large collection of alternate moves that all have the same score: the aspiration window's lower bound.

But those alternate moves haven't actually been evaluated particularly deeply. Their scores are just upper bounds set when they were pruned. The engine needs to actually widen its aspiration window and search more deeply along more lines to get better approximations for their real evaluations. If the search is just randomly interrupted in the middle of attempting to resolve a fail low situation, its list of moves will look like this:

"Best" move: Pruned move that hasn't been deepened yet with a score equal to the old aspiration window's lower bound.
2nd best move: Ditto.
3rd best move: Ditto.
etc.
...
nth best move: The former best move that had a deep refutation.
(n+1)th best move: Pruned move that got deepened and pruned again for falling below the new aspiration window's even lower bound.
(n+2)th best move: Ditto.
etc.

So the engine will be forced to just randomly pick between moves about which nothing is known except their probable upper bounds. Odds are the move at the top of the list will be bad. Stockfish can recognize when fail lows happen, and normally doesn't allow itself to pick a move when its current "best" move is unexplored and simply bounded above by the old aspiration window's lower bound. But when you slap Stockfish in the face and say, "STOP EVERYTHING AND GIVE ME A MOVE RIGHT NOW", well, $#!+ happens.

(Caveat: I'm not an expert chess programmer and haven't studied Stockfish's code in great detail, but I believe my explanation here is essentially correct.)
e-mars e-mars 12/14/2017 11:57
@celeje and also more threads is not synonymous with more power if you don't provide more RAM. It is like too many cars stuck on a countryside road.
peteypabpro peteypabpro 12/14/2017 11:52
@celeje in the post that @PCMorphy linked to

https://medium.com/@josecamachocollados/is-alphazero-really-a-scientific-breakthrough-in-ai-bf66ae1c84f2

"Tord Romstad also pointed out to the fact that Stockfish “was playing with far more search threads than has ever received any significant amount of testing”
celeje celeje 12/14/2017 11:41
@peteypabpro: Can you point us to where any "Stockfish defender" has claimed "too many threads"? I don't see any such claim on this webpage.
peteypabpro peteypabpro 12/14/2017 11:21
I think it's funny that the Stockfish defenders are simultaneously complaining that it was run on weak hardware compared to AlphaZero but also too strong of hardware ("too many threads")
celeje celeje 12/14/2017 10:38
@DrCliche: Re. "they completely circumvented time management and used not 1 min/move TC but movetime uci command. This means that search was interrupted in the middle of the iteration, does not matter what happened at that point - not resolved fail low or best move change."
How did the writer of those words know this had happened (since the AZ team is basically hiding and not giving any details)?
What does "resolve a fail low" mean?
DrCliche DrCliche 12/14/2017 09:02
@davidrimshnick It doesn't get "stuck" at depth 39, it merely takes more time to deepen the search the deeper you go. Let it keep running and you will get to depth 41, though it might take a while if your computer is slow.

Regardless, on TCEC10's beefy hardware, all the top engines were routinely reaching middlegame depths in the 40s, with Stockfish consistently going the deepest. Deep Mind claimed their copy of Stockfish was calculating 70-80 million nodes per second, which is ostensibly faster than the TCEC machine! However 1GB hash for 64 threads is ABSURDLY low, resulting in massive playing strength reduction. Ideally, you'd have a minimum of 1GB hash per thread, though 2GB per thread would be better. I can't conceive of a reason the Deep Mind team couldn't have easily made that happen. Google literally has the largest computing infrastructure in the world. They probably have a million times the CPU and memory resources used for Deep Mind's copy of Stockfish just sitting idle at any given moment.

Also, from the Stockfish forums, "they completely circumvented time management and used not 1 min/move TC but movetime uci command. This means that search was interrupted in the middle of the iteration, does not matter what happened at that point - not resolved fail low or best move change. To me it looks like substantial elo loss in comparison even to 1min/move time control."

So not only did Deep Mind needlessly gimp Stockfish by forcing it to use precisely 1 min/move rather than letting it manage its own time (resulting in massive playing strength reduction), the Deep Mind team didn't even set *that* up correctly, resulting in situations where moves were selected essentially at random because Stockfish wasn't allowed to resolve a fail low. (And, indeed, if you go through the published games with a correctly configured Stockfish on even modest home hardware, you'll find many instances, often multiple *per game* where your copy of Stockfish will almost instantly reject whatever move Deep Mind's Stockfish ended up playing. And if you let your evaluation keep running to high depths, the move that Deep Mind's Stockfish played will never enter the top PVs.)

A correctly configured Stockfish Master with books, ETBs, and reasonable hash size would have had similarly dominating results against Deep Mind's Stockfish 8-, though the wins wouldn't look "beautiful" to a human because Stockfish's playing style is considerably less "human" than AlphaZero's at the moment. It's quite possible, even probable, that AlphaZero as tested wasn't even close to the strongest chess playing entity in the world (much less the strongest, as Deep Mind's paper coyly leads you to believe), despite the fact it was essentially running on a supercomputer.

All that being said, AlphaZero is clearly capturing and successfully applying chess knowledge that current engines don't understand as readily. There's little doubt that neural network based chess evaluation can and will be used to improve top engines over the coming years. There's also little doubt that AlphaZero's purposefully very general approach would have benefited from intelligent application of domain-specific knowledge and more advanced search strategies. For example, using tablebases to 100% correctly score endgames during training would probably have made AlphaZero stronger. It's also probably true that while playing, a search algorithm that mixes alpha-beta with MCTS would be an improvement over MCTS alone. (You might need to stick to 100% MCTS during training, though, since alpha-beta propagates errors rather than averaging them out.)