AlphaZero/Kramnik: Exploring new chess variants

Assessing Game Balance with AlphaZero

The following are excerpts from a 97-page scientific treatise submitted by Nenad Tomašev (DeepMind), Ulrich Paquet (DeepMind), Demis Hassabis (DeepMind) and Vladimir Kramnik (World Chess Champion 2000–2007). We will excerpt this paper in multiple parts, providing you with example games for your own evaluation.

AlphaZero is a reinforcement learning system that can learn near-optimal strategies for any rule set from scratch without any human supervision, and provides an in silico alternative for game balance assessment. In their paper the team demonstrate the potential of AlphaZero to be used as a tool for creative exploration and design of new chess variants. Given the increasing depth of known chess opening theory, the high percentage of draws in professional play, and the non-negligible number of games that end while both players are still in their home preparation, there has recently been an increasing interest in chess variants, such as Fischer Random Chess.

In their study, the team has used AlphaZero to explore nine chess variants that involve atomic changes to the rules of chess, keeping the game close to the original, while allowing for novel strategic and tactical patterns. By effectively simulating decades of human play in a matter of hours, they are able to answer what the games between strong human players would potentially look like, if these variants were to be adopted. In this process, they identified several variants of chess that appear to be very dynamic and interesting. The findings demonstrate the rich possibilities that lie beyond the modern chess rules. They state:

Rule design is a critical part of game development, and small alterations of game rules can have a large effect on the overall playability and game dynamics. Fine-tuning and balancing rule sets in games is often a time-consuming, laborious process and automating the balancing process is an open area of research, where machine learning and evolutionary methods have recently been used to help game designers balance the games more efficiently. Here we examine the potential of AlphaZero to be used as an exploration tool for investigating game balance and game dynamics under different rule sets in board games, taking chess as an example use case.

Popular games often evolve over time, and modern-day chess is no exception. The original game of chess is thought to have been conceived in India in the sixth century, from where it initially spread to Persia, then the Muslim world and later to Europe and globally. In medieval times, European chess was still largely based on Shatranj, an early variant originating from the Sasanian Empire that was based on the Indian Chaturanga. Notably, the queen and the bishop (alfin) moves were much more restricted, and the pieces were not as powerful as those in modern chess. Castling did not exist, but the king’s leap and the queen’s leap existed instead as special first king and queen moves. Apart from checkmate, it was also possible to win by baring the opposite king, leaving the piece isolated with the entirety of its army having been captured. In Shatranj, stalemate was considered a win, whereas these days it is considered a draw.

The evolution of chess variants over the centuries can be viewed through the lens of changes in search space complexity and the expected final outcome uncertainty throughout the game, the latter being emphasized by modern rules and seen as important for the overall entertainment value. Modern chess was introduced in the fifteenth century, and is one of the most popular games to date, captivating the imagination of players around the world.

The interest in further development of chess has not subsided, especially considering a decreasing number of decisive games in professional chess and an increasing reliance on theory and home preparation with chess engines. This trend, coupled with curiosity and desire to tinker with such an inspiring game, has given rise to many variants of chess that have been proposed over the years. These variants involve alterations to the board, the piece placement, or the rules, to offer players “something subtle, sparkling, or amusing which cannot be done in ordinary chess” (Beasley, 1998). Probably the most well-known and popular chess variant is the so-called Chess960 or Fischer Random Chess, where pieces on the first rank are placed in one of 960 random permutations, making theoretical preparation infeasible.

AlphaZero demonstrated state-of-the-art results in playing Go, chess, and shogi, learning from self-play without any human supervision. In doing so, AlphaZero demonstrated a unique playing style at the time, later analysed in Game Changer (Sadler & Regan, 2019). This in turn gave rise to new projects like Leela Chess Zero and improvements in existing chess engines. CrazyAra employs a related approach for playing the Crazyhouse chess variant, though it involved pre-training from existing human games. A model-based extension of the original system was shown to generalise to domains like Atari, while maintaining its performance on chess even without an exact environment simulator. AlphaZero has also shown promise beyond game environments, as a recent application of the model to global optimisation of quantum dynamics suggests.

This is how the AlphaZero team describe the project:

Rule Alterations

There are many ways in which the rules of chess could be altered and in this work we limit ourselves to considering atomic changes that keep the game as close as possible to classical chess. In some cases, secondary changes needed to be made to the 50-move rule to avoid potentially infinite games. The idea was to try to preserve the symmetry and the aesthetic appeal of the original game, while hoping to uncover dynamic variants with new opening, middlegame or endgame patterns and a novel body of opening theory. With that in mind, we did not consider any alterations involving changes to the board itself, the number of pieces, or their arrangement. Such changes were outside of the scope of this initial exploration. Rule alterations that we examine are listed in Table 1. The variants in Table 1 are by no means new to this paper, and many are guised under other names: Self-capture is sometimes referred to as “Reform Chess” or “Free Capture Chess”, while Pawn-back is called “Wren’s Game” by Pritchard (1994). None have yet come under intense scrutiny, and the impact of counting stalemate as a win is a lingering open question in the chess community.

Each of the hypothetical rule alterations listed in Table 1 could potentially affect the game either in desired or undesired ways. As an example, consider No-castling chess. One possible outcome of disallowing castling is that it would result in an aggressive playing style and attacking games, given that the kings are more exposed during the game and it takes time to get them to safety. Yet, the inability to easily safeguard one’s own king might make attacking itself a poor choice, due to the counterattacking opportunities that open up for the defending side. In Classical chess, players usually castle prior to launching an attack. Therefore, such a change could alternatively be seen as leading to unenterprising play and a much more restrained approach to the game.

Historically, the only way to assess such ideas would have been for a large number of human players to play the game over a long period of time, until enough experience and understanding has been accumulated. Not only is this a long process, but it also requires the support of a large number of players to begin with. With AlphaZero, we can automate this process and simulate the equivalent of decades of human play within a day, allowing us to test these hypotheses in silico and observe the emerging patterns and theory for each of the considered variations of the game.

Variant	Primary rule change	Secondary rule change
No-castling	Castling is disallowed throughout the game	--
No-castling (10)	Castling is disallowed for the first 10 moves (20 plies)	--
Pawn one square	Pawns can only move by one square	--
Stalemate=win	Forcing stalemate is a win rather than a draw	--
Torpedo	Pawns can move by 1 or 2 squares anywhere on the board. En passant can happen anywhere on the board.	--
Semi-torpedo	Pawns can move by two squares both from the 2nd and the 3rd rank	--
Pawn-back	Pawns can move backwards by one square, but only back to the 2nd/7th rank for White/Black	Pawn moves do not count towards the 50-move rule
Pawn-sideways	Pawns can also move laterally by one square. Captures are unchanged, diagonally upwards	Sideways pawn moves do not count towards the 50-move rule
Self-capture	It is possible to capture one’s own pieces	--

Table 1. A list of considered alterations to the rules of chess.
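To make the pawn-related rows of Table 1 concrete, here is a minimal Python sketch of how the non-capturing pawn moves might differ between variants. It is not taken from the paper or from AlphaZero. The function name `pawn_quiet_moves`, the variant labels and the (file, rank) coordinates are hypothetical choices for illustration. The sketch covers a White pawn on an empty board and ignores captures, en passant, blocking pieces and promotion. It also assumes that "only back to the 2nd rank" means a pawn may never retreat behind the 2nd rank, and that a pawn standing on the 2nd rank may always take a double step; the excerpt does not spell out either point.

```python
def pawn_quiet_moves(variant, file, rank):
    """Non-capturing destinations for a White pawn on (file, rank), both 1..8.

    Illustrative only: empty board, no captures, no en passant, no promotion.
    Variant labels are hypothetical names for the rows of Table 1.
    """
    steps = [(0, 1)]  # a single step forward is available in every variant

    # Double step forward, depending on the variant.
    if variant == "torpedo":
        steps.append((0, 2))          # 1 or 2 squares anywhere on the board
    elif variant == "semi_torpedo":
        if rank in (2, 3):
            steps.append((0, 2))      # double step from the 2nd and 3rd rank
    elif variant == "pawn_one_square":
        pass                          # never more than one square
    elif rank == 2:
        steps.append((0, 2))          # classical rule (assumed for other variants)

    # Extra directions introduced by two of the variants.
    if variant == "pawn_back" and rank > 2:
        steps.append((0, -1))         # assumption: never behind the 2nd rank
    if variant == "pawn_sideways":
        steps += [(-1, 0), (1, 0)]    # lateral move by one square

    squares = [(file + df, rank + dr) for df, dr in steps]
    # Drop destinations that fall off the board.
    return [(f, r) for f, r in squares if 1 <= f <= 8 and 1 <= r <= 8]


if __name__ == "__main__":
    for name in ("classical", "pawn_one_square", "torpedo",
                 "semi_torpedo", "pawn_back", "pawn_sideways"):
        print(name, pawn_quiet_moves(name, 5, 3))
```

Running the script lists the destinations for a pawn on e3 under each rule set. The variants that do not concern pawn movement (No-castling, Stalemate=win, Self-capture) would simply fall through to the classical branch in this sketch.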

Qualitative assessment

To evaluate the differences in play between the set of chess variations considered in this study, we couple the quantitative assessment of the variations with expert analysis based on a large set of representative games. While the overall decisiveness and opening diversity add to the appeal of any chess variation, the subjective questions of aesthetic value and the types of positions, moves and patterns that arise are not possible to fully capture quantitatively. For providing a deep qualitative assessment of the appeal of these chess variations, we rely on the experience of chess grandmaster Vladimir Kramnik, an ex-world chess champion and an authority on the game. By characterising typical patterns, we hope to provide players with insights to help them judge for themselves if they would find some of these chess variants interesting enough to try out in practice. What we provide here are preliminary findings.

In the following weeks we will present you with the results and example games and positions from each of these variants. For today you can check out the first in the list, No Castling Chess, which we already described (Vladimir Kramnik proposes an exciting chess variant!).

In fact, we organised a full tournament with OTB play, under the supervision of Vladimir Kramnik himself. The result: a dramatic reduction in the number of drawn games: First ever "no-castling" tournament results in 89% decisive games!
