Embracing Risk in Tournaments

On the Importance of Embracing Risk in Tournaments

By Darse Billings

In two-player games like chess, if we want to know who the best player is, we usually play tournaments. However, the player who wins the most tournaments is not necessarily the best player!

Read that last sentence again. The objectively best player may not be the favourite to win a tournament. That is a fact.

Here's the key point to explain this apparent riddle: some players have a propensity to draw more games, and that hurts their chances of winning a tournament. A quiet, positional style reduces the risk of a loss, whereas a sharp tactical style tends to win or lose games rather than draw them. The more conservative player might be the strongest player in the world – the favourite to win a match against any other player – but still win fewer than his fair share of tournaments. (And by "his", we of course mean "his, her, or its".)

In terms of game theory, a tournament with n players is really an n-player meta-game, even though each game within the tournament is played between two players.

Over many tournaments, the strongest player will earn the highest average score, and the best win-loss record, but might nevertheless be far behind the leaders in terms of most tournament wins (first place, or equal first).

Conversely, the sharp win-or-lose player will win more than his fair share of tournaments. The sharp player will also have more than his fair share of last-place finishes, but that is relatively unimportant. (In terms of game theory, we say that the utility function is non-uniform: finishing first matters far more than avoiding last.) The higher variance in outcomes spreads the probability distribution of final scores away from the centre, resulting in a higher probability of finishing first (or last). We might say "variance is our friend".
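The variance argument can be checked with exact arithmetic rather than simulation. The sketch below is only an illustration under stated assumptions: every game is independent, a game is drawn with a fixed probability, and a decisive game is a 50-50 coin flip. The 10-game event and the draw rates of 20% and 80% are hypothetical values chosen for the example, as are the function names.

```python
def score_distribution(num_games, p_draw):
    """Exact distribution of a player's total score over num_games.

    The result is indexed in half-points, so index 2 means 1 point.
    Assumes independent games; decisive games are split 50-50.
    """
    p_win = p_loss = (1.0 - p_draw) / 2.0
    dist = [1.0]  # before any game: zero half-points with certainty
    for _ in range(num_games):
        nxt = [0.0] * (len(dist) + 2)
        for half_points, prob in enumerate(dist):
            nxt[half_points] += prob * p_loss      # loss: +0
            nxt[half_points + 1] += prob * p_draw  # draw: +1/2
            nxt[half_points + 2] += prob * p_win   # win: +1
        dist = nxt
    return dist


def prob_at_least(dist, points):
    """Probability of scoring at least `points` (a multiple of 0.5)."""
    return sum(dist[int(round(points * 2)):])


def mean_score(dist):
    """Expected total score in points."""
    return sum(prob * half_points / 2.0 for half_points, prob in enumerate(dist))


sharp = score_distribution(10, 0.2)  # hypothetical sharp player
quiet = score_distribution(10, 0.8)  # hypothetical quiet player

print("mean scores:", mean_score(sharp), mean_score(quiet))
print("P(score >= 7.5):", prob_at_least(sharp, 7.5), prob_at_least(quiet, 7.5))
```

Both players average 5 points out of 10, but the sharp player reaches a score of 7.5 or better far more often. By symmetry he is equally likely to score 2.5 or worse.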

To demonstrate this fact empirically, a simple computer simulation was written that reflects players of equal strength but different styles. Each player is defined with a number that reflects their personal proclivity toward a drawn game.

Let's look at the two extremes. Player A (aggressive) abhors draws, and will do anything within reason to avoid them, always looking for a decisive result, win or lose. Player Q (quiet) never takes any unnecessary risks, strives to accrue small advantages, hoping to convert them into a win in the endgame, but is always content with a draw if that doesn't work out. If the game is not a draw, then both players have a 50-50 chance of winning or losing, so they are of objectively equal strength.

We will assign each player a value for "drawishness", from 0% to 100%. Player A is our 0% model, Player Q is 100%. Other players are in between these extremes. When two players play each other, we will average their drawishness values to determine the probability that the game ends in a draw. In the case of Player A versus Player Q, 50% of the games will end in a draw, 25% in a win for A, and 25% in a win for Q. If players with drawishness of 40% and 80% play each other, then 60% of games end in draws, 20% wins, 20% losses.
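The averaging rule above is small enough to write out directly. This is a minimal sketch of the rule as described, not the original simulation code; the function name is our own.

```python
def outcome_probabilities(draw_a, draw_b):
    """Return (P(a wins), P(draw), P(b wins)) for two equally strong players.

    draw_a and draw_b are the players' drawishness values in [0, 1].
    """
    p_draw = (draw_a + draw_b) / 2.0  # average the two drawishness values
    p_decisive = 1.0 - p_draw         # the game has a winner otherwise
    return p_decisive / 2.0, p_draw, p_decisive / 2.0


# Player A (0%) versus Player Q (100%)
print([round(p, 3) for p in outcome_probabilities(0.0, 1.0)])
# Players with drawishness 40% and 80%
print([round(p, 3) for p in outcome_probabilities(0.4, 0.8)])
```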

Now we will define a roster of players, and play many complete round-robin tournaments. The same conclusions will hold for Swiss system tournaments – even more so, because short-term variance is even more important for that format. In these experiments we will include all players using 10% increments in drawishness, giving us 11 players in total.

We ran 1.1 million round-robin tournaments. For each tournament, all the players sharing first place split the tournament win evenly (for example, a three-way tie for first would give each player one-third of a tournament win). Since all 11 players are of equal strength, each would be expected to win 100,000 tournaments, all else being equal.
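For readers who want to repeat the experiment, here is a rough sketch of such a simulation. It is not the original program: the single round-robin format (each pair meets once), the function names, the fixed random seed, and the much smaller number of tournaments are all assumptions made to keep the example short and fast.

```python
import random


def play_game(rng, draw_a, draw_b):
    """Return (points for a, points for b) for one game between equals."""
    p_draw = (draw_a + draw_b) / 2.0
    if rng.random() < p_draw:
        return 0.5, 0.5
    # Decisive game: equal strength, so a fair coin decides the winner.
    return (1.0, 0.0) if rng.random() < 0.5 else (0.0, 1.0)


def simulate(num_tournaments, drawishness, seed=0):
    """Play single round-robins; return each player's share of tournament wins.

    Players tied for first split one tournament win evenly.
    """
    rng = random.Random(seed)
    n = len(drawishness)
    wins = [0.0] * n
    for _ in range(num_tournaments):
        scores = [0.0] * n
        for i in range(n):
            for j in range(i + 1, n):
                a, b = play_game(rng, drawishness[i], drawishness[j])
                scores[i] += a
                scores[j] += b
        best = max(scores)  # scores are multiples of 0.5, so == is exact
        leaders = [i for i, s in enumerate(scores) if s == best]
        for i in leaders:
            wins[i] += 1.0 / len(leaders)
    return wins


roster = [k / 10 for k in range(11)]  # drawishness 0%, 10%, ..., 100%
wins = simulate(10_000, roster)
fair_share = 10_000 / len(roster)
for draw, w in zip(roster, wins):
    print(f"drawishness {draw:.0%}: {w:8.1f} wins ({w / fair_share:.2f} x fair share)")
```

With far fewer tournaments than the 1.1 million used here, the individual counts will be noisier, but the trend from Player A down to Player Q should still be visible.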

However, style is important for winning tournaments, and we find that the sharper styles win more tournaments than drawish styles. In fact, Player A wins almost 50% more than his fair share of tournaments, while Player Q wins less than half of his! The complete results are shown in Table 1 (with the number of tournament wins rounded off to the nearest thousand).

Table 1. Tournament wins among 11 players of equal strength, but different playing styles.

We previously asserted that an objectively stronger player might win fewer tournaments than his weaker peers, based on differences in style. To demonstrate this, we now add a strength factor to the definition of each player. This is a relative factor that determines the probability of winning a game that ends in a win or a loss.

For example, a strong player with a strength factor of 70% playing against a weaker player with a strength factor of 50% would end up winning 60% of the decisive games between them. If the two players have an average drawishness factor of 50%, then the total outcome from a long series of games would be 50% draws, 30% wins for the stronger player, and 20% wins for the weaker player. In this example, the difference in strength between these players would be roughly 40 rating points.
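The arithmetic in this example can be written out as follows. The formula for decisive games (50% plus half the difference in strength factors) is inferred from the 70%-versus-50% example and is an assumption about the model, not a quotation from it. The rating conversion uses the standard logistic Elo expectation formula, which is likewise our choice; the function names are our own.

```python
import math


def game_probabilities(strength_a, strength_b, draw_a, draw_b):
    """Return (P(a wins), P(draw), P(b wins)).

    Assumed rule: a wins a decisive game with probability
    0.5 + (strength_a - strength_b) / 2, clamped to [0, 1].
    """
    p_draw = (draw_a + draw_b) / 2.0
    p_a_decisive = 0.5 + (strength_a - strength_b) / 2.0
    p_a_decisive = min(1.0, max(0.0, p_a_decisive))
    p_a_win = (1.0 - p_draw) * p_a_decisive
    p_b_win = (1.0 - p_draw) * (1.0 - p_a_decisive)
    return p_a_win, p_draw, p_b_win


def elo_gap(expected_score):
    """Rating difference implied by an expected score (logistic Elo model)."""
    return 400.0 * math.log10(expected_score / (1.0 - expected_score))


p_win, p_draw, p_loss = game_probabilities(0.70, 0.50, 0.50, 0.50)
score = p_win + 0.5 * p_draw  # expected points per game for the stronger player
print(f"win {p_win:.2f}, draw {p_draw:.2f}, loss {p_loss:.2f}, score {score:.2f}")
print(f"implied rating gap: {elo_gap(score):.0f} points")
```

Under these assumptions the stronger player scores 55%, and the logistic formula puts the gap at about 35 points, in line with the rough figure quoted above.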

We will now run tournaments where the strongest players are also the most drawish players. Although they earn the highest scores on average, this does not automatically translate into tournament victories.

In Table 2, we allow the 11 players with increasingly drawish styles to be progressively stronger by a 1% increase, or roughly two rating points per step. We find that the weakest players still win the most tournaments, while the strongest player wins by far the fewest.

Table 2. Tournament wins among 11 players of increasing strength but increasingly drawish playing styles.
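The experiment behind Table 2 can be sketched along the same lines. Again this is an illustration rather than the original code. The single round-robin format, the baseline strength of 50%, the decisive-game rule (50% plus half the strength difference, inferred from the earlier worked example), and all names are assumptions.

```python
import random


def simulate(num_tournaments, players, seed=0):
    """players is a list of (strength, drawishness) pairs.

    Returns each player's share of tournament wins; ties for first
    place split one win evenly.
    """
    rng = random.Random(seed)
    n = len(players)
    wins = [0.0] * n
    for _ in range(num_tournaments):
        scores = [0.0] * n
        for i in range(n):
            for j in range(i + 1, n):
                (s_i, d_i), (s_j, d_j) = players[i], players[j]
                if rng.random() < (d_i + d_j) / 2.0:
                    scores[i] += 0.5  # drawn game
                    scores[j] += 0.5
                elif rng.random() < 0.5 + (s_i - s_j) / 2.0:
                    scores[i] += 1.0  # decisive game won by player i
                else:
                    scores[j] += 1.0  # decisive game won by player j
        best = max(scores)
        leaders = [i for i, s in enumerate(scores) if s == best]
        for i in leaders:
            wins[i] += 1.0 / len(leaders)
    return wins


def make_roster(strength_step, base_strength=0.5):
    """Eleven players: drawishness rises by 10% and strength by strength_step."""
    return [(base_strength + k * strength_step, k / 10) for k in range(11)]


roster = make_roster(0.01)  # 1% stronger per step
wins = simulate(10_000, roster)
for (strength, draw), w in zip(roster, wins):
    print(f"strength {strength:.2f}, drawishness {draw:.0%}: {w:8.1f} wins")
```

Passing a larger `strength_step` to `make_roster` gives steeper strength gradients for comparison.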

In Table 3, we repeat the experiment, with double the increase in objective playing strengths. Now the weakest players are winning fewer tournaments, but so are the strongest players. Although the distribution is flatter, the players winning the most tournaments are those of moderate strength, with moderate acceptance of risk. We can see that style has a slightly greater impact than strength under these conditions.

Table 3. Tournament wins among 11 players of increasing strength but increasingly drawish playing styles.

In Table 4, we double the rate of increasing strength again, giving an even greater objective advantage for the more conservative players. Now the balance is tipped in favour of the stronger players, but the strongest player in the group is still below par in total wins.

Table 4. Tournament wins among 11 players of increasing strength but increasingly drawish playing styles.

In fact, even if we skew the rating differences to a 200-point spread overall, the strongest player still doesn't win the most tournaments! This is shown in Table 5.

Table 5. Tournament wins among 11 players of rapidly increasing strength but increasingly drawish playing styles.

Note that the best player never loses a single game to the worst player in this scenario, but does draw half of the time, for a 75% overall score. Despite the large disparity in playing strengths, the strongest player still does not dominate the tournament circuit, ranking behind the other strong players with slightly less conservative playing styles.

Under these extreme conditions the weaker players seldom win tournaments, but they nevertheless score those upsets much more frequently than their strength alone would predict.

To put these results in perspective, Table 6 looks at the same range of playing strengths, but with all players having the same propensity toward drawn games (50% draws in all cases). We can see that under these conditions, the strongest player wins by far the most events – twice as many as Player Q in the above setting, and almost one third of all the tournaments played. Player 6, the middle player with 50% drawishness, has not changed, but wins half as many tournaments now that the top players are unshackled. Clearly, style counts for a lot when it comes to winning tournaments!

Table 6. Tournament wins among 11 players of rapidly increasing strength and equally drawish playing styles (50% draws).

A few caveats are needed to prevent possible misinterpretation of these results. First, we've shown that all else being equal, a player who tends to score a win and a loss is better off than a player who tends to score two draws. However, if trying to avoid a draw increases the chance of a loss more than the chance of a win, then all bets are off. You cannot casually sacrifice expected value just to increase variance.

Second, the utility of a draw changes as a tournament progresses. As an extreme example, suppose Player A leads by a half point over player Q going into the last round of a major tournament, and both are well clear of the rest of the field. Then a draw is as good as a win for A, but is worthless to Q. To maximize their chances, they'd both better be willing to do a role reversal!

Finally, it is not a simple dichotomy between positional and tactical styles. Although they are correlated with high and low draw frequencies, they are not the same thing. Familiar tactical positions may lead to more draws than novel positional situations. Moreover, some positions demand tactics, others demand an emphasis on strategic considerations. Trying to force a square peg into a round hole is not the best approach. As Sun Tzu said, the best style is no style – you must be able to fluidly adapt to the circumstances.

For those who understand the dynamics of tournament strategy, none of these conclusions should come as a surprise. Extreme results and increased risk go hand in hand – on both ends of the spectrum. A player who is willing to risk the embarrassment of finishing last also enjoys a decided advantage toward achieving the highest glory.

In the past, tournament results may have counted more toward the impression of a player's strength than match results. Perhaps the drawish masters were better than their reputations. Perhaps players like Petrosian deserve more credit than they usually receive. Perhaps the chess world should begin distinguishing between the top tournament players (such as Anand and Topalov), and the top match players (such as Kramnik). Food for thought.

Regardless, the lesson is clear: if you want to win tournaments, play to win!

Biography

Dr. Darse Billings holds a Ph.D. in computer science, and has a keen interest in games of all descriptions. He is the architect of the strongest poker-playing programs in the world, having founded that area of artificial intelligence research in 1992. In 2000, his Lines of Action program, Mona, won the de facto world championship, defeating all of the top human players, and winning every game it ever played.

Darse retired from competitive chess in 1989, with a rating of 2065, having suffered from a drawish positional style.
