Shredderbases

by ChessBase
11/11/2006 – The whole chess world is abuzz with a strange new word on everyone's lips: Shredderbases. What are they? What do they do? Is mankind doomed? And, if not, what's for dinner? We tackle at least half of these questions in the latest ChessBase Workshop.

ChessBase 18 - Mega package ChessBase 18 - Mega package

Winning starts with what you know
The new version 18 offers completely new possibilities for chess training and analysis: playing style analysis, search for strategic themes, access to 6 billion Lichess games, player preparation by matching Lichess games, download Chess.com games with built-in API, built-in cloud engine and much more.

More...

It never fails -- as soon as you get a bit of technology figured out, here comes something new.

I've written extensively about the Thompson endgame databases vs. the Nalimov tablebases numerous times in the past, but here's a quick refresher. Back in the early 1990's, Bell Labs' Ken Thompson developed a database format that chess engines could use to play the endgame perfectly. For example, there'd be no more dithering around with a Knight + Bishop vs. lone King ending, in which a chess engine would hack around, be unable to force the mate, and turn the game into a draw. Once the material on the board hit N+B vs. K, the engine would access the database and play the endgame out to mate perfectly.

But the Thompson databases had some drawbacks. First of all, the specific material balance covered by an endgame database had to be physically present on the board before the chess engine could access the database. Using the above example, if the material balance was N+B+P vs. a lone King, the database wouldn't come into play until the player on the short end of the stick managed to cop off that extra pawn; consequently, the engine still might conceivably mess around and turn that won game into a draw before the pawn came off the board and the endgame database kicked in.

A second limiting factor was the fact that the databases gave perfect info regarding the sequence and number of moves to win, draw, or change of material balance. What does this mean for the typical chess engine? In, say, a R+P vs K ending, the engine would know the moves and distance to win, draw, or promotion, but wouldn't know what would happen after the promotion occurred. It might know that it was five moves to pawn promotion, but couldn't automatically switch to a Q+R vs K database to extend that analysis out further -- it had to wait until after the promotion occurred before it could switch databases to the proper Q+R vs. K to find out what comes next.

Eugene Nalimov's tablebases, which came along later in the 90's, solved both of these problems. The tablebases could be "linked" together to account for what happens after a pawn promotes or material is exchanged off of the board. And chess engines could use the tablebases "on the fly"; if, for example, during analysis of a R+3P vs K+P ending, the engine saw a simplified position after material exchanges (say R+P vs K) out there somewhere in its search, it could access the tablebases immediately without waiting for that material balance to physically appear on the board and then use that information when deciding on the proper move in the present (R+3P vs K+P) position.

This was pretty hot stuff and completely changed the complexion of computer chess. It used to be that if a human player could exchange down to a simple endgame position against a computer chess engine, the human player often actually increased his chances of winning; the human player's better endgame technique would carry the day against the limited search horizon of the computer (there was a time when I, an average patzer, could easily beat the then-best computer engines on the market in a simple King-and-pawn ending because, in my knowledge of endgame principles and technique, I could "see farther ahead" than the computer could when it calculated every move but only out to nine or ten plies). But with the new Nalimov tablebases, computers started playing endgames like a house afire, and that old "anti-computer" trick of swapping down just didn't cut it any longer.

OK, that's all well and good, but there's still a fly in the ointment as far as chess engines were concerned. They still have to access the tablebases from a physical disk, and that takes a bit of time -- it's still greasy fast, but not as fast as things might be if the tablebases could be loaded and accessed in memory (i.e. RAM). This is pretty well known to computer savvy folks, but you might not know this if you've not encountered it yourself. The slowest form of data access is from removable media: floppy disks are the slowest while CDs and DVDs are somewhat quicker. But accessing data from the hard drive is quicker still; this is why the instructions to many computer games and programs recommend that you install the whole magilla to your hard drive if you've got the space, instead of doing a partial install in which some files (typically sound and sometimes graphics) have to be pulled from the CD/DVD when they're needed.

Hard drive access is physically faster than removable disk access, but accessing data from RAM is way, way quicker still (and the reason why many 3D shoot 'em up computer games require you to have a video card containing a high amount of separate video RAM). Data gets sucked up and stored temporarily in RAM where it can be accessed like lightning by a particular program. (And this also explains why you might need to restart your computer after you've run several programs in succession. When you exit a program, it's supposed to "release" the RAM it's using and make it available for other programs. Often, though, not all of the RAM gets released, and the program leaves a "footprint" behind, hogging up some of the RAM it was supposed to let go of. After you run a bunch of programs, a big chunk of RAM is still being held by these footprints, gumming up the works. Restarting your computer removes these "footprints" and makes all of your RAM available again).

So what's the problem? If you read my previous ChessBase Workshop column about Fritz Endgame Turbo 3, you already know that some of these tablebase files are big honkers individually, and the whole set is just plain enormous. There's just no freakin' way your machine will have enough RAM to store all of the tablebase files. You can alleviate the problem somewhat by making your tablebase RAM cache larger (under Tools/Options), but this is a double-edged sword: the more RAM you allocate to tablebases, the less will be available for the engine's hash tables during the middlegame.

How do you lick this problem? Shredder's programmers found an obvious solution: make the tablebases smaller by using a different, tighter compression technique when storing them. This allows the three, four, and five piece tablebases to fit into 157 MB of RAM -- compare that with the seven-and-a-half gigabytes required by the Nalimov tablebases. Now, admittedly, this isn't going to help a guy like me much; my six year old machine has 512 MB of RAM, so loading the Shredderbases into RAM will still take a chunk of away from my engine's hash tables. But for users of newer computers, most of which come with 1 GB of RAM minimum, this is some amazing stuff.

Get this -- accessing the Shredderbases from RAM is about 1000 times faster than accessing the Nalimov tablebases from the hard drive. Yowza!

Once again, though, there's a tradeoff -- the more you compress data, the slower the access time (comparatively speaking). But this gives the programmers a nice chunk of middleground to stand upon. The "Deep" version of Shredder 10 assumes (of course) a multi-processor machine and, naturally, a higher amount of RAM than is available on a regular single-processor computer (the idea being that if you can afford a multi-processor job, you're not going to skimp on RAM). Thus there's a second version of the Shredderbases, not as tightly compressed as the standard version but still more compressed than the Nalimov tablebases, which can be loaded into 441 MB RAM. The engine can access these tablebases ten times faster than the regular versions, and ten thousand times faster than the Nalimov tablebases.

Sorry, I have to go lie down now -- my head is spinning. But if you want to read more about Shredder 10 and Deep Shredder 10, you can get the skinny here.

One last note before I pass out. I've received numerous e-mails from users wanting to know the technical details of the Shredderbases' compression technique. I don't have the details, and I wouldn't be allowed to give them out even if I did. Sorry.

I'm off to hibernate. Until next week, have fun!

You can e-mail me with your comments on ChessBase Workshop. All responses will be read, and sending an e-mail to this address grants us permission to use it in a future column. No tech support questions, please.


© 2006, Steven A. Lopez. All rights reserved.


Reports about chess: tournaments, championships, portraits, interviews, World Championships, product launches and more.

Discuss

Rules for reader comments

 
 

Not registered yet? Register