John Hartmann: Diving into Databases

by John Hartmann
12/11/2015 – "You can buy the fanciest GUI," writes John Hartmann, "and you can collect all of the strongest engines around, but if you’re working with poor quality data, your research will suffer for it." He takes a look at the chess databases that are available for ambitious players and takes a special look at the Big and Mega Databases from ChessBase, each currently with 6.46 million games.

Chess Book Reviews: Diving into Databases

When I was in high school and learning the basics of computer science, I was taught an acronym to underscore the importance of having clean data to work with: GIGO, or ‘Garbage in, Garbage out.’ You can have all the fantastic algorithms and formulas you like, but if your data is in poor shape, you’ll never come close to the results you desire.

The same is true of chess data. You can buy the fanciest GUI (graphical user interface) the market has to offer, and you can collect all of the strongest engines around, but if you’re working with poor quality data, your research will suffer for it.

Big / MegaBase 2016

There’s no way around it. You need a large reference database if you’re going to do any serious chess research or study. Online databases like chess-db.com, chessgames.com and ChessBase’s own online database are no substitute. They require Internet connections and you can’t easily manipulate online data. The largest and most well-known of these reference databases are Big Database (BigBase) and Mega Database (MegaBase) 2016 from ChessBase.

BigBase and MegaBase each contain over 6.46 million games running from the earliest recorded games through October of this year. The database is searchable by player, tournament, and annotator (among other things), and you can access various indices or ‘keys’ for openings, endgames, strategic and tactical themes. Note the last three keys are not accessible in the default ChessBase 12/13 settings. You can access them by going to Options – Misc – Use ‘Theme Keys.’

You might suspect, given the name of the product, that each year brings a new version of the database to the market. And you would be correct to do so. The 2015 release of MegaBase contained 6,161,344 games, and the data wranglers at ChessBase have bumped that total to 6,466,288 in the 2016 edition. About half of the roughly 305,000 added games have appeared in issues of ChessBase Magazine and ChessBase Magazine Extra, but 166,692 of them are entirely new to the ecosystem.

While the majority are from 2014 and 2015 events, there are some historical additions as well. Among them are 18 games played by Botvinnik, 14 by Alekhine, and 9 by Spassky.
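For readers who want to check the arithmetic behind those totals, here is a minimal sketch in Python. It uses only the three counts quoted above; the variable names are mine, and the split between "new" and "previously published" games assumes that every added game falls into exactly one of those two groups.

```python
# Game counts quoted in the review (variable names are illustrative).
MEGA_2015 = 6_161_344         # games in the 2015 release
MEGA_2016 = 6_466_288         # games in the 2016 release
NEW_TO_ECOSYSTEM = 166_692    # added games never published by ChessBase before

# Games added between the two editions.
added = MEGA_2016 - MEGA_2015

# Assumption: every added game is either new or previously published
# in ChessBase Magazine / ChessBase Magazine Extra.
previously_published = added - NEW_TO_ECOSYSTEM
share_new = NEW_TO_ECOSYSTEM / added

print(f"Added in 2016 edition:  {added:,}")
print(f"Previously published:   {previously_published:,}")
print(f"Share entirely new:     {share_new:.1%}")
```

Under that assumption the split is close to even, which is consistent with the "about half" description.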

There are a number of similarities between BigBase and MegaBase. The number of games in each product is identical, as are the indices and keys. So what distinguishes them? MegaBase comes with two additional features that BigBase lacks: the inclusion of annotated games and a year’s worth of weekly updates. [MegaBase also comes with an updated version of PlayerBase, which collects rating data and pictures for thousands of players, but since I don’t use the feature, I will refrain from commenting on it.]

The 2016 version of MegaBase includes over seventy-five thousand games with named annotators. This represents an increase of 3,425 annotated games over the 2015 edition. While regulars like Atalik, Ftacnik and Marin provide notes to Super-GM games, there are also analyzed games by lesser-known combatants. Hundreds of annotated games from John Donaldson and Elliot Winslow are new to this edition, all of which come from amateur contests at the Mechanics Institute in the past few years.

MegaBase also comes with an update service, where weekly downloads of 5,000 games are provided for a year. As a point of comparison, we are currently at update number 49 for MegaBase 2015, and 245,713 games have been added to the database with all updates included.

This means, by the way, that not every game submitted to ChessBase is included in these weekly updates. Apples-to-apples comparisons aren’t possible, but about sixty thousand or so games are in the 2016 database and not in the fully updated 2015 version.
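The "sixty thousand or so" figure can be reproduced with a few lines of Python. This is a rough sketch using only the counts quoted in this review; it assumes that every game in the fully updated 2015 database also appears in the 2016 edition, which is exactly the kind of like-for-like assumption that cannot be verified from the outside. The variable names are mine.

```python
# Counts quoted in the review (variable names are illustrative).
MEGA_2015_BASE = 6_161_344    # MegaBase 2015 as shipped
MEGA_2016 = 6_466_288         # MegaBase 2016 as shipped
UPDATE_GAMES = 245_713        # games added to MegaBase 2015 via weekly updates
UPDATES_SO_FAR = 49           # number of weekly updates released so far

# Size of MegaBase 2015 with all updates applied.
updated_2015 = MEGA_2015_BASE + UPDATE_GAMES

# Assumption: the updated 2015 database is a subset of the 2016 edition,
# so the difference counts games found only in the 2016 edition.
only_in_2016 = MEGA_2016 - updated_2015

# Average size of a weekly update, for comparison with the advertised 5,000.
avg_per_update = UPDATE_GAMES / UPDATES_SO_FAR

print(f"Fully updated MegaBase 2015: {updated_2015:,}")
print(f"Only in MegaBase 2016:       {only_in_2016:,}")
print(f"Average games per update:    {avg_per_update:,.0f}")
```

The difference comes out just under sixty thousand, and the average update size sits very close to the advertised 5,000 games per week.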

BigBase and MegaBase are the preeminent reference databases available today. They are not perfect. Tim Harding has remarked on problems (some of which appear to have been fixed) with Blackburne’s games, for example, and John Watson never played in the 1966 British U14 Championship. Doubtless there remain plenty of tournaments, like the 1995 MCC/ACF Summer International (whose bulletin sits on my desk), just waiting to be entered into the computer. But no other database comes close to these two in terms of comprehensiveness and cleanliness of data. Anyone doing serious chess work, from openings to history to biography, needs one of these two products.

BigBase 2016 is available for download or post for €59.90 ($55.42 without VAT for those outside the EU). MegaBase 2016, which includes the annotated games, the weekly updates and the PlayerBase, costs €159.90 ($147.93 without VAT), and an update from a previous version of MegaBase costs €59.90 ($55.42 without VAT). The Update option comes with the annotated games, weekly updates, etc.

...

There is no substitute for having a large research database such as MegaBase or BigBase at your disposal for pre-game preparation, opening research and general chess study. Because MegaBase comes with annotated games, weekly updates and the PlayerBase, it is the premier database product on the market today. Serious opening analysts and correspondence players should absolutely consider supplementing BigBase or MegaBase with CorrBase.

Not everyone can afford MegaBase. For those on a budget, BigBase is an adequate stand-in for MegaBase. For those less interested in historical games and more in recent examples, Mark Crowther’s complete The Week in Chess database is perhaps a more worthy and cost-effective replacement.

Mega Database 2016

Languages: English, German
ISBN: 978-3-86681-502-5
EAN: 9783866815025
Delivery: Download, Post
Level: Any
Price: €159.90 or €134.37 without VAT

The exclusive annotated database. Contains more than 6.4 million games from 1560 to 2015 in the highest ChessBase quality standard. 68,500 games contain commentary from top players, with ChessBase opening classification with more than 100,000 key positions, direct access to players, tournaments, middlegame themes, endgames. The largest top-class annotated database in the world. The most recent games of the database are from the middle of September 2015. Mega 2016 also features a new edition of the playerbase (requires ChessBase 12 or 13). As usual, this is where most of the work was done. As the player index already contains more than 319,000 entries, it made sense to use an adapted playerbase which includes about 390,000 names. In the process, the photo database was also extended and now contains 35,000 pictures.

Incl. Online Mega-Update 2016: With ChessBase 12 or 13 you can download games for Mega 2016 for the whole year, a total of approximately 200,000! That means your Mega 2016 will remain up to date from January to December.

Order Mega 2016 in the ChessBase Shop
Order Big Database 2016 for €59.90 (€50.34 without VAT)


John is the Digital Editor for uschess.org, the website for the US Chess Federation (US Chess), and he writes the monthly review column for Chess Life magazine. He won the 2018 Best Column Award from the Chess Journalists of America. You can find him on the web at Chess Book Reviews and First Look Chess.
