Database basics - part 1

by ChessBase
8/24/2004 – In the new ChessBase Workshop, columnist Steve Lopez gets back to basics with the first entry in a new series on how to use chess databases to improve your own chess skills. In our first installment, we examine what a database is and why using one as part of your chess study is important.



by Steve Lopez

Technical writing isn't like any other kind of writing. It's not (necessarily) telling a story. It's not like creating a poem in iambic pentameter or a haiku in which strict rules of form must be followed. It's not like writing a newspaper article in which the "five W's" should be provided in the opening paragraph. And writing a weekly technical column for a general and varied audience is a somewhat dicey proposition. You're walking a high wire, in that you don't want to make things so complex that it shoots over the heads of beginners while, at the same time, you need to provide enough "meat" for the more advanced members of the audience.

I was reminded of this the other night as I was browsing a used book store's History section. I was all set to buy a "general overview" of the American Civil War when I suddenly realized that I don't need it. I've read a dozen such books and I derived absolutely nothing new from the last four or five of them. So I saved my hard-earned ducats by putting the book back on the shelf. "Maybe somebody new to the ACW will benefit from it," I thought. And that's when a new thought hit me: I'd "graduated" from the proverbial "Civil War 101 class" ages ago. Consequently, there are many bits of knowledge I take for granted that a reader new to that historical period wouldn't know.

Tonight as I was tearing up the highway between the ditches on I-70 I made yet another related connection by reversing the thought. There are a lot of things I take for granted as a chess writer that might not be "common knowledge" for new readers of this column. It's not that I'm terribly smart (I'm not) but rather that I have over a decade of experience with chess computer software and there are things I learned in the early going that aren't at all obvious to someone who's just bought his first PC chess program. Over the years in which I've been writing this weekly column I've tried to strike a happy medium between the beginning and advanced software users. I've tried to make the articles as simple as an "X for Dummies" book while simultaneously providing enough meat for the folks who've been around the block a few times.

But I've recently been reminded that some readers have become lost in the shuffle. Sure, almost anyone can install a chess program and be playing a game against the computer within a few minutes. However, there are a lot of extra features that baffle new users, features that I take completely for granted.

One of these is the chess database. I'm discovering that there are a lot of recent converts to electronic chess tools who have no idea what a database is or what one is supposed to accomplish by using one. So in the next few ChessBase Workshops we're going to look at chess databases -- and we're going to do it from scratch with no assumptions being made from my end. We'll start with generalities and then move on to the specifics. We'll use Fritz8 as our "example" program, but a fair little bit of the ground we'll cover will be applicable to any chess software program that contains database functions.

I'm going to try a different approach with these columns. There will be places where extra exposition will be useful but not required. So I'm going to add this extra explanation in the form of footnotes. I can already hear some of you groaning but, trust me, it'll be pretty painless. The extra footnoted material will appear in red lettering and will be interspersed throughout the articles instead of appearing at the end of each piece. If you don't want to read them you can easily skip down to the next paragraph that appears in standard black type. If the approach works (and I think it will), great. If not, it was an effort worth the attempt.

The first thing we'll need to do is define a "database". A database is any collection of related information. When new PC users think of a database, they think of some monstrous master collection of information containing all of the world's knowledge (like that computer in the original version of the movie Rollerball or the Enterprise's computer on Star Trek). Sure, that description works but it's not very accurate. A database doesn't have to be "monstrous" -- heck, it doesn't even have to be what most people think of when they hear the word "information". I run an online game league and keep a database of the league's players: names, nicknames ("handles"), e-mail addresses, etc. It's not a huge group; maybe a couple of dozen players participate. But this collection of useful (to me, at least) information by definition constitutes a database.

Back in the pre-electronic age, one would have needed to use a somewhat modified definition for the term "database": any organized collection of related information. Let's look at a familiar example from everyday life: your local telephone directory. Unless you live someplace like Elk Hills, Wyoming (population 45), the phone book would be totally unusable (and therefore worthless) if it wasn't organized. If a resident of Elk Hills needs to look up a phone number, it's not crucial for that list to be an organized one since it'll fit comfortably on a single sheet of paper; he can just visually scan the list until he finds the number he needs. But a denizen of a major metropolitan area needs organization in his list or else he's gonna find bupkis when he needs a particular number.

The standard organizational method for a printed telephone book is to list people alphabetically by last name. This has worked wonderfully well for decades. If you need to find Ralph Callahan's phone number you just flip to the "C's", then to "Ca", and so on until you find "Callahan". Then you check the first names/initials (also conveniently alphabetized) until you see the "R's". And you should quickly find your old pal Ralphie's number.

There are other ways to organize telephone directories. Way back in the day you used to be able to purchase a printed phone book in which the listings were arranged in numerical order by phone number. But early telemarketers found this to be a useful tool for locating people to annoy at dinnertime, so you generally can't get these books anymore. Another approach that's still in use in some places is a "city directory" which groups listings by neighborhoods/street names. But the tried and true method is still the alphabetical phone book, which has worked very well.
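For readers who like to see an idea in code, here's a minimal Python sketch of the printed phone book's trick. Everything in it is made up for illustration: the names, the numbers, and the find_number helper. The only point is that an alphabetized list lets you jump to the right spot instead of reading every line, and that a "numerical order" book is just the same data sorted on a different field.

```python
from bisect import bisect_left

# A made-up phone book: (last name, first name, number) entries,
# typed in here in no particular order.
entries = [
    ("Callahan", "Ralph", "555-0142"),
    ("Abbott", "Jane", "555-0101"),
    ("Zimmer", "Paul", "555-0199"),
    ("Callahan", "Alice", "555-0117"),
]

# "Organizing" the book means sorting by last name, then first name.
phone_book = sorted(entries)

def find_number(book, last, first):
    """Binary-search a sorted book, like flipping to the C's, then "Ca"."""
    # A (last, first) pair sorts just before any entry that starts with it.
    i = bisect_left(book, (last, first))
    if i < len(book) and book[i][:2] == (last, first):
        return book[i][2]
    return None  # no such listing

print(find_number(phone_book, "Callahan", "Ralph"))

# The old "numerical order" book: same entries, sorted by phone number.
by_number = sorted(entries, key=lambda entry: entry[2])
```

Note that find_number only works because the list was sorted first; hand it the unsorted entries and it can miss a name that's sitting right there.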

The approach does have occasional (sometimes embittering) limitations. Say you meet a really hot gal named Evelyn down at the nightclub and she winds up inviting you home. You gladly accompany her and spend a wonderful evening at her place over on Market Street. As you're groggily leaving the next morning, she breathlessly whispers, "Call me?" as you depart. It's not until after you get home that you realize that she never gave you her number, you never found out her last name, and that you were so hungover when you left her place that you didn't take note of which building she lives in -- and Market's a pretty long street.

So what do you do? If all you have is an alphabetical phone book, you're sunk; enjoy the memory and learn to live with the loss. If you have a "city directory" style of book available you have at least a shot at finding her number: just look up all the people on Market whose first names start with "E". Of course if she gave you a fake first name this won't help you either -- but at least there's a ray of hope.

This is exactly why electronic databases (as opposed to paper ones) are becoming so popular. With a decent electronic database you at least have a shot at pulling up the info you need in a fast, easy manner. In the above case, you'd load the software, do a search with "Market" as the "Street" parameter and "E" as the first initial, and maybe score a few hits; with luck one of them will be the cutie from down at the club.

That's why we have to take the word "organized" out of our definition of the word "database" when we're dealing with the electronic medium. An electronic database doesn't even have to be organized. The master list of information can be thrown together in any old haphazard manner; in fact if you were to look at a printout of the contents of a lot of electronic databases you'd see that there's really no rhyme or reason to the way they're organized. The entries don't need to be in alphabetical order, numerical order, or any other kind of "order". They can be thrown together any old way because the search tools for an electronic database (even an unorganized one) allow you to pull up the information you need, and you can oftentimes do it much more quickly than you could if you were using an organized print database.
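Here's a rough Python sketch of that "Market Street" search, assuming a tiny directory held in memory. The records, the field names, and the search function are all invented for the example. Notice that the records are in no order at all and the search still works; a real database engine does something smarter than reading every record, but the idea is the same.

```python
# Hypothetical directory records, stored in no particular order.
directory = [
    {"first": "Ralph", "last": "Callahan", "street": "Elm", "phone": "555-0142"},
    {"first": "Evelyn", "last": "Hart", "street": "Market", "phone": "555-0163"},
    {"first": "Edward", "last": "Stone", "street": "Market", "phone": "555-0120"},
    {"first": "Evelyn", "last": "Moss", "street": "Pine", "phone": "555-0188"},
]

def search(records, street=None, first_initial=None):
    """Return every record that matches the parameters that were filled in.

    A parameter left as None is ignored, like a blank field in a search form.
    """
    hits = []
    for rec in records:
        if street is not None and rec["street"] != street:
            continue
        if first_initial is not None and not rec["first"].startswith(first_initial):
            continue
        hits.append(rec)
    return hits

# "Market" as the street parameter and "E" as the first initial.
hits = search(directory, street="Market", first_initial="E")
for rec in hits:
    print(rec["first"], rec["last"], rec["phone"])
```

With this made-up data the search turns up two people on Market whose first names start with "E", which is about what you'd expect: a short list of candidates rather than a guaranteed answer.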

In the words of Cliff Stoll, this is some hot damn stuff. Nowadays there's no need for people to spend countless hours organizing the raw material and for other people to spend countless more hours hunting up specific bits of information. The organizational part is gone and the search times have been cut to seconds.

Here's a personal example. I'm a Civil War historian and one of the legendary primary sources in my field is a 130-volume set of books called The Official Records of the War of the Rebellion. You can purchase printed versions of the books, though it'll cost you dang near three large to buy them all and you might have to add a room to your house in order to store them. Alternatively, you can do what I did: buy the complete collection on CD. Believe me, this has saved me an incalculable amount of time and, considering what my time as a writer/researcher is worth, the CD paid for itself in the first week I owned it. I once tried looking up all the references to a small West Virginia battle in the paper version; I hunted for hours and still didn't find all of the material (The Official Records is a truly abominably badly organized work). I did the same thing using the battle's name as a search parameter in the CD's search window and got all of the references in a couple of seconds.

You probably do this all the time without even thinking about it. Every time you use Google for an Internet search, you're searching a database; the Internet is the world's largest database and is absolutely the worst organized. You can try it right now if you like. Do a Google search for "Britney Spears" and "topless" and you'll instantly have links to hundreds of websites. Of course, this doesn't mean that you're necessarily going to find what you're looking for. You'll get a bazillion hits but none of 'em will have ol' Brit with her duds off -- but they will contain countless offers to sell you such a pic. Heh -- P.T. Barnum was right all along.

OK already -- I can hear you asking what all of this has to do with chess. A chess database is a searchable collection of chess games. You can use a program's search tools to find the information you desire by entering parameters -- you tell the program what to look for and it'll find the games that meet your requirements. And a chess database doesn't even have to be organized in any sort of chronological or alphabetical manner. The games can be stored in any old haphazard order and the search mask will still pull up what you need. [1]

[1] Of course, there are reasons why you might want to have a database that's compiled in an organized manner. If you want to create a tournament crosstable using ChessBase/Fritz software, all of the games of that tournament should be "blocked" together in the database (although you can also do a search for that tournament's games and then create a crosstable from the list of hits). And it's much easier to visually scan down an organized list as a means of "browsing" the database's contents without necessarily doing a search.
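To make the idea concrete, here's a small Python sketch of a chess database search. It is not how ChessBase or Fritz stores or searches games; the players, events, field names, and both functions are invented for illustration, and a real database would hold the moves as well as the header information. The games are deliberately stored out of order, and the second function builds a simple score table (a stripped-down stand-in for a crosstable) from the hits of a tournament search.

```python
from collections import defaultdict

# Made-up game headers, stored in no particular order.
games = [
    {"white": "Adams", "black": "Baker", "event": "Club Ch. 2004",
     "opening": "Sicilian", "result": "1-0"},
    {"white": "Clark", "black": "Adams", "event": "Spring Open",
     "opening": "Petroff", "result": "1/2-1/2"},
    {"white": "Baker", "black": "Clark", "event": "Club Ch. 2004",
     "opening": "Sicilian", "result": "0-1"},
    {"white": "Clark", "black": "Adams", "event": "Club Ch. 2004",
     "opening": "King's Indian", "result": "1/2-1/2"},
]

def search_games(db, player=None, event=None, opening=None):
    """Return the games matching every parameter that was filled in."""
    hits = []
    for game in db:
        if player is not None and player not in (game["white"], game["black"]):
            continue
        if event is not None and game["event"] != event:
            continue
        if opening is not None and game["opening"] != opening:
            continue
        hits.append(game)
    return hits

# Points for (White, Black) under each result.
POINTS = {"1-0": (1.0, 0.0), "0-1": (0.0, 1.0), "1/2-1/2": (0.5, 0.5)}

def score_table(hits):
    """Total each player's points, highest first (ties broken by name)."""
    totals = defaultdict(float)
    for game in hits:
        white_pts, black_pts = POINTS[game["result"]]
        totals[game["white"]] += white_pts
        totals[game["black"]] += black_pts
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))

# Find one tournament's games wherever they sit, then total the scores.
club_games = search_games(games, event="Club Ch. 2004")
for name, points in score_table(club_games):
    print(f"{name:<8}{points}")
```

The tournament's three games are scattered through the list with another event's game mixed in, yet the search gathers them up and the table comes out the same as if they'd been "blocked" together.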

Now why would you find a chess database useful? The answers are as varied as the number of chessplayers using databases. Historical researchers and writers often use databases in their work; I used to write a series of articles on great chessplayers from the "Golden Age" and frequently used a database to find and review their games, often as a means of locating games to include in the articles. Correspondence players are always using databases to look up opening variations and positions as a means of evaluating strong moves to use in their postal games. Some folks just like to play through great games of the past -- I know a lot of database users who no longer actively play chess themselves but enjoy replaying the games of classic chess contests. And dang near everyone who owns a chess program with database capabilities will sooner or later create a database of his or her own games as a record of their own chess exploits; the first thing I did when I got my copy of Knightstalker back in 1992 was to create a database of my USCF over-the-board games (and it's a database I still use all these years later).

By far the most compelling use for a chess database is to use it to improve our own chessplaying skills. That's why most chess books [2] contain example games; the author explains a concept and then illustrates it with one or more actual games in which that idea appeared.

[2] The reason I say "most" is that there are a few notable chess books which contain no games, just an explanation of terms or concepts. Bruce Pandolfini's Weapons of Chess jumps immediately to mind here.

While it's certainly most beneficial to have some sort of "guiding hand" showing you the way, it's not always a requirement. Many chessplayers learn a lot through a form of "osmosis": playing over a lot of games and gradually seeing commonalities and patterns emerging in them. [3] This is a major reason why using a database is so beneficial. You as a player will see common patterns from game to game and while it's not advisable (or even possible) to blindly ape the moves of strong players, seeing and understanding their techniques in these common circumstances and being able to adapt them to the unique (but similar) circumstances in your own games will certainly improve your chess results.

[3] Studies have shown that pattern recognition is a very important component of an individual player's overall chess skill. Chess is a game in which similar general patterns are frequently seen in dissimilar specific positions. The ability to recognize these patterns and adapt a "standard" general set of mental tools and procedures to specific and unique positions is unarguably a major part of a strong chessplayer's skill set.

We've discussed the "what" and the "why". We'll look at the "how" starting with next week's ChessBase Workshop. Until then, have fun!


© 2004, Steven A. Lopez. All rights reserved.

