Study of square utilization and occupancy

3/13/2015 – Inspired by the chess visualization study by Seth Kadish, featured at ChessBase last year, Devin Camenares, a professor in biology, created a suite of tools to analyze the square utilization and occupancy for any and all pieces and squares, across all moves and positions in all games in a database. He then applied this to the games of Fischer and Carlsen.


By Devin Camenares

Introduction

As a chess player, I always seek ways to improve my game. As a scientist, I look for hidden connections within large sets of biological data. Finally, as a professor, I’m interested in finding new ways to explain and demonstrate the scientific method to my students. I’ve often wondered if I could combine these three endeavors: is it possible to leverage large databases of chess ‘data’ to apply a scientific approach to the game? While I can’t claim to have an answer to this open-ended question, I have created a suite of simple, easy to use, and freely accessible tools that might be helpful in chess research.

These tools can be used to analyze the number of moves a particular piece makes to any given square, or the number of positions where you find a particular piece on a given square. In other words, you can determine the square utilization and occupancy for any and all pieces and squares, across all moves and positions in all games in a PGN database. In addition, there is a simple tool to reformat PGN or create a chessboard heatmap. I was motivated to develop these tools for several reasons. First, the coding was good practice for the development of other programs needed for my life science research (i.e. a nice way to teach myself JavaScript). I was also inspired by the chess visualization study by Seth Kadish, featured at ChessBase last year, among other chess studies.
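To make the idea of square utilization concrete, here is a minimal Python sketch. It is an illustration only, not the code behind the tools described in this article. It assumes the moves of a game have already been extracted from the PGN as a flat list of SAN tokens, with comments and variations removed; the function name `utilization` is hypothetical.

```python
import re
from collections import Counter

# A square is a file letter followed by a rank digit; the destination
# is the last such pair in a SAN token.
SQUARE = re.compile(r"[a-h][1-8]")

def utilization(san_moves, color="white"):
    """Count how often each piece of `color` lands on each square.

    `san_moves` is a flat list of SAN tokens in game order
    (White's move, Black's move, White's move, ...).
    """
    counts = Counter()
    start = 0 if color == "white" else 1
    rank = "1" if color == "white" else "8"
    for san in san_moves[start::2]:
        move = san.rstrip("+#!?")
        if move in ("O-O", "O-O-O"):
            # Castling moves two pieces; record both destinations.
            king_file, rook_file = ("g", "f") if move == "O-O" else ("c", "d")
            counts[("K", king_file + rank)] += 1
            counts[("R", rook_file + rank)] += 1
            continue
        # Drop a promotion suffix such as "=Q" before locating the square.
        move = move.split("=")[0]
        squares = SQUARE.findall(move)
        if not squares:
            continue  # not a move token (e.g. a result such as "1-0")
        piece = move[0] if move[0] in "KQRBN" else "P"
        counts[(piece, squares[-1])] += 1
    return counts

# An Open Sicilian move order: 1.e4 c5 2.Nf3 d6 3.d4 cxd4 4.Nxd4
game = ["e4", "c5", "Nf3", "d6", "d4", "cxd4", "Nxd4"]
white = utilization(game)
```

Summing the resulting counters over every game in a database gives the per-piece utilization; adding up the entries that share a square gives the aggregate figure for all pieces of one color.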

In fact, the tools I created can recreate some of the work done by Kadish; namely, looking at the aggregate square utilization for the white pieces, as played by Fischer or Carlsen.

Fischer versus Carlsen as White

In the figures above, a chessboard heatmap is shown, with the inset numbers representing the number of times a White piece landed on that square across all the games in the database used. The darker the square color, the higher the value. It’s not surprising that there is a lot of ‘traffic’ in the center of the board, and that there are only minor differences between these two world champions.
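For readers who want to see how counts map onto a board, the sketch below renders a plain-text stand-in for such a heatmap in Python. It is only an illustration under stated assumptions: the function name `render_board`, the five-step shade ramp, and the sample counts are all made up for this example and do not come from the Fischer or Carlsen data.

```python
def render_board(counts, shades=" .:*#"):
    """Return an 8x8 text board, rank 8 on top, one shade and count per square."""
    peak = max(counts.values(), default=0)
    rows = []
    for rank in "87654321":
        cells = []
        for file in "abcdefgh":
            value = counts.get(file + rank, 0)
            # Scale the count onto the shade ramp; darker means more visits.
            level = 0 if peak == 0 else value * (len(shades) - 1) // peak
            cells.append(f"{shades[level]}{value:>4}")
        rows.append(rank + " " + " ".join(cells))
    return "\n".join(rows)

# Made-up counts, for illustration only.
sample = {"e4": 40, "d4": 55, "f3": 30, "c3": 20, "g1": 5}
print(render_board(sample))
```

The shade is chosen relative to the busiest square, so two boards drawn this way can be compared for their pattern but not for their absolute numbers.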

In addition to looking at square utilization, I also created a tool for examining square occupancy. Applying this to the same dataset reveals a different pattern:

Comparing both the square utilization and occupancy, we might conclude that both players move their central pawns, develop pieces through the center, and generally keep their Kingside structure intact (in particular, there is usually something placed on g2, whether it be a pawn or bishop). While looking at the square utilization data alone might lead you to think Fischer preferred 1.d4, both datasets together make the situation clear: he preferred to first plant a pawn on the e4 square and then open lines and mobilize his pieces through the d4 square (i.e. any Open Sicilian, which starts with 1.e4 but more heavily utilizes the d4 square, with moves such as 3.d4 cxd4 4.Nxd4).
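The difference between the two measures is easy to state in code: utilization counts arrivals, while occupancy counts every position in which a piece is standing on a square. The Python sketch below illustrates the occupancy side only. It assumes the positions of a game are already available as FEN piece-placement fields (producing them from a PGN file requires a move generator, which is not shown), and the function name `occupancy` is hypothetical rather than part of the tools described here.

```python
from collections import Counter

def occupancy(placements, pieces="KQRBNP"):
    """Count, per square, the positions in which one of `pieces` stands there.

    `placements` holds the first field of a FEN string for each position.
    Uppercase letters are White pieces, lowercase letters are Black pieces.
    """
    counts = Counter()
    for placement in placements:
        for rank, row in zip("87654321", placement.split("/")):
            file_index = 0
            for char in row:
                if char.isdigit():
                    file_index += int(char)  # a run of empty squares
                    continue
                if char in pieces:
                    counts["abcdefgh"[file_index] + rank] += 1
                file_index += 1
    return counts

positions = [
    "rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR",      # after 1.e4
    "rnbqkbnr/pp1ppppp/8/2p5/4P3/8/PPPP1PPP/RNBQKBNR",    # after 1...c5
    "rnbqkbnr/pp1ppppp/8/2p5/4P3/5N2/PPPP1PPP/RNBQKB1R",  # after 2.Nf3
]
white = occupancy(positions)
knights = occupancy(positions, pieces="N")
```

In this tiny example the e4 pawn is counted in all three positions even though it moved only once, which is why a square can rank high for occupancy while ranking low for utilization.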

Difference Makers in Fischer’s Games

We can also take a closer look at the above data, and examine the square utilization and occupancy for individual pieces. For this type of analysis, I have found it more interesting to consider Fischer’s wins and losses with white as separate databases. Shown as an example is the utilization and occupancy (or ‘traffic’ and ‘parking’) of the white knights.

You can see that the patterns for wins and losses are broadly similar, with slight differences. After normalizing for the number of moves in each set, we can subtract the values for the losses from those for the wins. In essence, we are looking for piece placements or square utilizations that are found more often in wins or in losses. This ‘differential’ data set can also be plotted as a heatmap; here, positive numbers (orange) represent squares that occur more often in White wins, while negative numbers (blue) represent squares that occur more often in losses. (The scale is such that a value of 10 represents a square that is featured one percentage point more in wins than in losses.)
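The normalize-and-subtract step can be sketched in a few lines of Python. This is one plausible reading of the procedure, not a transcription of the actual tool: it assumes normalization means dividing each square's count by the total for its data set, and it uses a factor of 1000 so that a value of 10 corresponds to one percentage point. The function name `differential` and the sample counts are invented for illustration.

```python
def differential(wins, losses, scale=1000):
    """Per-square difference between two count maps after normalization.

    Each map is converted to fractions of its own total, so databases of
    different sizes are comparable. With scale=1000, a result of 10 means
    the square's share is one percentage point higher in wins.
    """
    win_total = sum(wins.values())
    loss_total = sum(losses.values())
    if win_total == 0 or loss_total == 0:
        raise ValueError("both data sets need at least one count")
    result = {}
    for square in sorted(set(wins) | set(losses)):
        win_share = wins.get(square, 0) / win_total
        loss_share = losses.get(square, 0) / loss_total
        result[square] = round(scale * (win_share - loss_share), 1)
    return result

# Made-up knight counts, for illustration only.
wins = {"f3": 50, "f6": 30, "d5": 20}    # 100 moves in total
losses = {"f3": 30, "f6": 5, "d5": 15}   # 50 moves in total
diff = differential(wins, losses)
```

Because both sets are reduced to shares that each sum to one, the differential values sum to zero: a square can only gain share in wins if other squares lose it.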

Perhaps unsurprisingly, the data suggests that an advanced knight, particularly one travelling to the f6 square, is more often found in Fischer’s wins as White. While the square utilization data highlights transient travel through f6, the occupancy analysis paints a different picture: a stable knight on f3, controlling central dark squares, also correlates with White victories. An interesting result is the positive differential for a b1 knight. One possible explanation is that this represents games in which Fischer scored a quick and violent win, a game so short that he did not need to complete his development. The same analysis can be done with any piece in this database, white or black.

The data for the white king in Fischer’s games can also be explained by common chess wisdom. An uncastled king stuck on e1 appears more often in losses, while a castled king correlates with wins. Advanced king moves on the kingside, most likely in the endgame, appear to contribute to success. I can’t help but wonder if this reflects in part Fischer’s choice of the Ruy Lopez Exchange Variation.

Difference Makers in the Opening

Studying which squares make a difference in a player’s games could be interesting, but is perhaps not that useful unless you plan on facing Fischer or Carlsen in your next club tournament. More interesting, and perhaps more useful, is to take the same approach with an opening database, comparing datasets from wins and losses in your favorite opening. As one final example, shown below is the differential data for the white queen in several thousand games with the Sicilian Sveshnikov played in the last fifteen years.

From this we might conclude that White should strive to use their queen in attacking the light squares on the black kingside. Conversely, if White is spending time using the queen to cover squares like d3 and d2, Black has probably already seized the initiative. While interesting, it’s important to note that the data only hints at correlation; the idea that it is important for the queen to penetrate these squares thus remains a hypothesis to be further tested by examining the games themselves (but at least you have a target to examine!).

Ideally, I would continue and expand this analysis, and I may occasionally post results at my blog, Science on the Squares (which also contains more detailed information on the process and the heatmaps). However, with the start of a new semester my professional responsibilities place a formidable demand on my time. I invite any readers to use these tools in their own chess research, and to share their results and insights with the chess community, here or elsewhere.

About the Author

Devin Camenares is an Assistant Professor in the Department of Biological Sciences at Kingsborough Community College, and resides in Yonkers, NY. He attended Rutgers University as an undergraduate, and later earned a Ph.D. in Molecular Biology from Stony Brook University. If he is not teaching, doing research, or spending time with his new wife Vicki, you will probably find him staring at a familiar set of 64 squares.


Discussion and Feedback



sciencesquares 3/21/2015 02:01
Thanks everybody for the interest and comments. I hope you enjoyed the article. Especially thanks to Niima, fentropy, NimzoCapa for your kind words.

@Karbuncle: Good point; still, I think the fact that we can get objective data that proves this is worthwhile. Perhaps we can use the approach to uncover other principles, or maybe they can be used when designing a chess engine (Both of these are speculation, of course).

@brabo_hf: As fentropy points out, I am not trying to develop a chess equation. What I have done is create a tool that simply gives you data about a PGN database. Many already use aggregate data from a PGN file, in the form of win-loss statistics. Neither replaces traditional study methods for chess improvement, but may augment one's efforts. For creation or use of this program, a FIDE rating is not required.

@NimzoCapa: a very astute observation! I recently posted the entire datasets for Carlsen and Fischer on my blog, and I comment on their Bishop use. Examining this confirms your idea, that Fischer uses b5 more often (especially in wins)

@genem: My program can give you information on any of the 6 White pieces or 6 Black pieces. You can see this for some of the PGNs I've analyzed on recent posts at my blog. Interesting thoughts about Fischer Random Chess (or just Fischer Chess); currently, my program cannot handle different starting positions. However, I suspect that you are correct about the changes we would see.

-Devin
genem 3/14/2015 08:53
I would like to see Devin Camenares modify his software to produce heat maps for each of the 8 white and 8 black officers.

With chess limited to endlessly reusing the same one start setup, it is rare to see (for example) a white knight on square e3, even though e3 is a major square. New heat maps from Camenares would highlight the strangling of chess by the limit of only one start setup.
This rarity of Ne3 (especially any early Ne3) hints at how much of what chess could be is hidden from us by the lack of a second often-reused start setup.

How do White's pieces coordinate to fight for the center when Ne3 is a common move and Nf3 is rare? Or when B-N5 pins are rare?
There are interesting answers, but we will never know until a second start setup gains significant reuse.

Chess960, aka Fischer Random Chess (FRC), goes too far. Better to anoint one additional start setup for frequent reuse.

"Discard the 'Random' from Fischer Random Chess!"
NimzoCapa 3/14/2015 05:08
I don't think it's important whether this will help one improve one's chess or not. I didn't come in with any such expectations, and I find the data to be interesting. I would like to see more of such charts, just out of general curiosity. For example, are all world champions this similar? Here are a couple observations about the slight differences in Fischer and Carlsen. Fischer uses b5-a4-b3 a bit more -- perhaps something to do with playing Spanish and Sicilian variations where the bishop maneuvers its way to b3. On the other hand, c4 is more prominently shown in Carlsen's chart, probably due to playing a lot more 1.d4/Nf3/c4 games.
brabo_hf 3/13/2015 06:47
Just reread the introduction.
"As a chess player, I always seek ways to improve my game....I’ve often wondered if I could combine these three endeavors: is it possible to leverage large databases of chess ‘data’ to apply a scientific approach to the game? "

The author doesn't even have an official fide rating so sorry but I find this article not serious.
fentropy 3/13/2015 02:54
"...it’s important to note that the data only gives hints at correlation" - - If you read and interpret his comments properly, you will find that he does not claim this is a tool to directly improve your Chess, rather this is a program that visualizes the tendencies of where the pieces are residing and where they are going in a specific player database. He hints that once general tendencies are identified, the reader may want to 'zoom in' using their own detective skills.

stop the hating, as this is quite an achievement to create a statistical piece mobilization program.

+1 from me.

fentropy
brabo_hf 3/13/2015 11:12
You can't put chess in a mathematical model. I even strongly advise players not to use this tool to improve as it can very well be counterproductive.
Karbuncle 3/13/2015 07:32
So what you can deduce from all this is that classical fundamentals of chess apply even at the very top: Develop to control the center and castle. Amazing, isn't it?
Niima 3/13/2015 05:39
Interesting article. Thank you Devin.