What were the odds of Hou Yifan’s pairings in 2017?

by Johannes Meijer
2/2/2018 – The Gibraltar Masters wrapped up Thursday, with Levon Aronian in first place. This year round ten passed without incident, in contrast to 2017 when, on February 2nd, the story of the day was a rare scandal involving women's World Champion Hou Yifan deliberately losing a game in protest of the high number of women she was paired against. She was further confounded when a similarly unlikely string of pairings happened in October at the Isle of Man Open. Johannes Meijer looks at the odds in detail. Hou did not return to Gibraltar in 2018, but instead competed in the Tata Steel Chess Masters. | Photo: Alina l'Ami


One year later

Imagine you are at a tournament with 255 players, of whom 43 are female. You are to play ten rounds. How many female opponents would you expect to face? Three? Five? I am pretty sure you wouldn't say seven. Yet, this was exactly the number of female players Hou Yifan faced at the Gibraltar Open 2017 when, a year ago today, she threw her last game in protest of these seemingly odd pairings.

How probable is such a pairing? Could it have happened by chance at all?

Let's expose this strange occurrence to statistics. In this article, we will do it for you — and uncover the truth about Hou Yifan's pairings in Gibraltar. In the article Investigating Hou’s pairings, Grandmaster John Nunn told Albert Silver that precisely calculating the odds would be an exercise in futility. This seems indeed to be the case, but it is however possible to estimate these odds with the aid of a simple probability model.

One of the goals of the organizers of the Tradewise Gibraltar Masters tournaments is to attract as many world-class chess players as possible at the end of January. They do this by offering very attractive prizes for all and special prizes for women. Many women attend and compete in Gibraltar head-to-head with strong and very strong male grandmasters.

Obviously that was the intention of world champion Hou Yifan when she travelled to Gibraltar in January 2017. She had decided to focus on playing the strongest competition she could, rather than pursue women-only honours, but in Gibraltar she had to play seven women and three men. That wasn’t exactly according to plan. Beforehand she must have expected to face no women, as in 2015, or just two, as in 2012. Hou Yifan was clearly affected by her pairings and threw her last game in protest.

[The contemporaneous interview with Hou is no longer available on the Gibraltar Chess YouTube channel, but can still be found on YouTube, as well as the live commentary surrounding the incident.]

Hou Yifan

Hou Yifan explaining her protest in 2017

Let’s have a look at the probability of her unusual pairings. How do we calculate the probability that a participant of a given Gibraltar Open, with a total of N participants of which K are women, that lasts n rounds, has to face k women, with k a number between 0 and n? This looks very much like an experiment where we have a population of (N-K) black and K white balls, draw a ball at random n times without replacement, and count the number k of successes, i.e. white balls, which in this case mark the female players. Without replacement implies that after drawing a ball we don’t put it back. In this way we avoid drawing the same ball twice or more. Said in a different way: a chess player doesn’t have to play the same opponent twice or more. This problem was solved a long time ago and leads to the hypergeometric distribution.
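The urn model above can be written down in a few lines. The sketch below is not taken from the original analysis: it assumes the standard hypergeometric formula p(k) = C(K,k)·C(N-K,n-k)/C(N,n), and the function name hypergeom_pmf is chosen here purely for illustration. It uses the 2017 figures quoted in this article (N=255, K=43, n=10).

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """Probability of drawing exactly k white balls in n draws without
    replacement from N balls of which K are white."""
    if k < 0 or k > n or k > K or n - k > N - K:
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Gibraltar 2017 as quoted in the article: 255 players, 43 women, 10 rounds.
N, K, n = 255, 43, 10
pmf = [hypergeom_pmf(k, N, K, n) for k in range(n + 1)]

# Average number of female opponents under the model (equals n * K / N).
mean = sum(k * p for k, p in enumerate(pmf))

for k, p in enumerate(pmf):
    print(f"k={k:2d}  p(k)={p:.6f}")
print(f"expected number of female opponents: {mean:.2f}")
```

Under this model the average number of female opponents is n·K/N, a little under two, which gives a sense of how far out in the tail seven lies.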

A question that has to be answered is whether the hypergeometric probability model adequately describes the reality of an open chess tournament. This is not obvious at all. Everybody who has played in an open tournament knows that at the end of the tournament he or she will tend to finish at about the same place as at the beginning of the tournament. His or her gain or loss of rating points will be close to zero. This is true for most of the participants but not for the youngsters who hope to gain rating points and for the veterans who hope to limit the loss of rating points. While playing in an open tournament, after a few rounds you start to recognize the faces around you. The group of players with the same score becomes smaller after each round and the pairings seem to become less random. So preserving some doubt that our model describes adequately the reality of an open tournament seems more than justified.

In order to test the hypergeometric model, I had a close look at the last six Gibraltar Open tournaments. Specifically, I counted the number of participants that faced k women during the n rounds of the tournament, with k a number between 0 and n. For the results of these investigations see table 1. The critical reader will notice that in this table a number of games are missing. This is because some women took a day off during one or more rounds, and because each year I overlooked one or two games that were played by women.

Table 1

Table 1. The observed number of players facing k women during n=10 rounds.

Looking at this table we notice that in 2012, one participant, Soumya Swaminathan, faced six women and that in 2017 something even more remarkable occurred when two participants, Ketevan Arakhamia-Grant and Bodda Pratyusha, faced six women and one participant, Hou Yifan, faced seven women. It is astounding that all four players are women. One might think of a conspiracy if it wasn’t for the fact that a computer took care of the pairings.

Information about the participants that faced five, six and seven female players during these six tournaments can be found in table 2. The name of Andy Baert from Belgium appears twice in this table.  In 2014 and 2016 he attracted a combined total of ten female chess players.

Table 2

Table 2. The players that faced k=5, 6 or 7 women during the Gibraltar Opens of 2012-2017.

The next step is to compare the observed numbers of players that faced k women with the numbers of players that are predicted by the hypergeometric model. The results of my calculations for the year 2013 can be found in table 3. The total number of participants of the Tradewise Gibraltar Masters 2013 was N=247, there were K=34 women and the tournament lasted n=10 rounds. Three of the 340 games played by the female participants are missing.

Table 3

Table 3. The Expected and Observed number of players facing k women during n=10 rounds.

For the calculation of the probabilities p(k), the online calculator of René Vápeník can be used. The observed numbers of players that faced k women are given in the Obs(k) column, and the expected numbers of players in the Exp(k) column. In order to check the "goodness of fit" of the hypergeometric distribution I used Pearson’s chi-squared test. With the online calculator of Kristopher J. Preacher it could be determined that p(χ^2=4.16) = 0.53. This probability is much higher than p=0.05, a conventional criterion for statistical significance. If the p-value lies above 0.05, the data give us no reason to reject the model; otherwise we have to reject it. We conclude that for the Gibraltar Open 2013 our model is certainly acceptable.
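For readers who prefer code to online calculators, here is a minimal sketch of the same test. It is an illustration, not the calculators named above: the function names are invented here, the observed and expected counts in the first example are made up to show the arithmetic, and the five degrees of freedom used for the p-value are an assumption (the article does not state them) that happens to come close to the quoted 0.53.

```python
from math import exp, lgamma, log

def chi2_statistic(observed, expected):
    """Pearson's statistic: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-squared variable with
    df degrees of freedom, using the power series for the regularized
    lower incomplete gamma function (adequate for moderate x)."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    denom = a
    for _ in range(1000):
        denom += 1.0
        term *= z / denom
        total += term
        if term < total * 1e-15:
            break
    lower = total * exp(-z + a * log(z) - lgamma(a))
    return 1.0 - lower

# Made-up counts, only to show how the statistic is formed.
stat = chi2_statistic([12, 30, 8], [10.0, 32.0, 8.0])

# The article's chi-squared value; df=5 is an assumption, not from the source.
p_value = chi2_sf(4.16, 5)
print(f"chi-squared (toy data) = {stat:.3f}")
print(f"p(chi-squared = 4.16, df = 5) = {p_value:.2f}")
```

A p-value well above 0.05 means the observed counts are compatible with the model; it does not prove the model correct.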

Table 4

Table 4. The χ^2 and p(χ^2) values of the Gibraltar Opens 2012-2017.

The results of chi-squared "goodness of fit" tests for the years 2012-2017 can be found in table 4 above. According to these results, the expected values agree well with the observed values in all six cases. We observe that for the hypergeometric model five of the six values of p(χ^2) lie far above the critical value of p=0.05. Only the value of p(χ^2) for 2017 lies rather close to the critical value of p=0.05. These results confirm that we can use the hypergeometric model to estimate the probability that a participant faces k women during the n rounds of a Gibraltar Open tournament. The unexpected result suggests that the pairing process in Gibraltar was quite random.

For the Tradewise Gibraltar Masters 2017 tournament, with N=255, K=43 and n=10, I found that the odds that a participant had to face k=7 female opponents were 1 in 5311. With 255 participants per tournament, we could therefore expect that, after playing roughly 5311/255 ≈ 21 tournaments with the same characteristics, one of the participants would eventually face seven female chess players. So far 15 tournaments have been played in Gibraltar, most of them, however, with slightly fewer female participants. Anyway, the goddess of chance, Tyche (which means luck in Greek), decided that the honor of playing seven female opponents would be bestowed during the Tradewise Gibraltar Masters 2017 tournament on Hou Yifan, the world’s strongest female player. An honor that left her utterly confused.
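The two figures in this paragraph can be checked with a short calculation. This is a sketch under the same hypergeometric assumption as the rest of the article; the step from the odds to the number of tournaments (dividing by the 255 participants) is how I read the text, not something stated explicitly.

```python
from math import comb

# Gibraltar 2017 as quoted above: 255 players, 43 women, 10 rounds, 7 women faced.
N, K, n, k = 255, 43, 10, 7

# Hypergeometric probability that one given participant faces exactly 7 women.
p7 = comb(K, k) * comb(N - K, n - k) / comb(N, n)
odds = 1 / p7

# With N participants per event, roughly odds / N events are needed
# before one such case is expected.
tournaments = odds / N

print(f"odds: 1 in {odds:.0f}")
print(f"tournaments needed: about {tournaments:.0f}")
```

The result should land close to the 1 in 5311 and the roughly 21 tournaments quoted above.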

Rd.  Bo.  SNo  Title  Name                 Rtg   FED  Pts.  Res.  w-we
1    22   143  WGM    Pourkashiyan Atousa  2303  IRI  5,0   s 1   0,11
2    19   85   GM     Zhukova Natalia      2447  UKR  5,5   w 1   0,24
3    11   47   GM     Muzychuk Anna        2558  UKR  6,5   s ½   -0,13
4    16   51   GM     Muzychuk Mariya      2546  UKR  6,0   w 1   0,36
5    4    5    GM     Adams Michael        2751  ENG  7,5   s 0   -0,36
6    19   81   GM     Cramling Pia         2454  SWE  5,5   w ½   -0,25
7    20   78   IM     Ider Borya           2463  FRA  5,5   s 1   0,26
8    15   38   GM     Ju Wenjun            2583  CHN  7,0   w 0   -0,59
9    23   66   IM     Batsiashvili Nino    2492  GEO  6,0   s 1   0,29
10   17   37   GM     Lalith Babu M R      2587  IND  7,0   w 0   -0,59

Two other female players, Ketevan Arakhamia-Grant and Bodda Pratyusha, who each played six female opponents, kept her company. Together, these three events made it a unique tournament. In retrospect, I believe that in Gibraltar, Tyche wanted to honor all female chess players in a highly original way.

After Gibraltar, Hou Yifan played in mixed tournaments in Sharjah, Shenzhen, Karlsruhe & Baden-Baden, Moscow, Geneva, Biel and Tbilisi. Well, slightly mixed, because she was the only woman among forty-six men. She ended this string of tournaments on a high note, winning the Biel tournament in August ahead of grandmasters like Etienne Bacrot and Pentala Harikrishna.

Hou Yifan

Hou Yifan in Biel, 2017 | Photo: Pascal Simon

Hou Yifan’s next stop was the Isle of Man in the Irish Sea where she played in the Chess.com Isle of Man Open Masters 2017. This tournament was her first really mixed tournament after the Tradewise Gibraltar Masters 2017. The big surprise that awaited Hou Yifan on the Isle of Man was that in the first, second, third and fourth round her opponents were all women. It must have spooked her. What was Tyche trying to tell Yifan?

Data for the expected and observed number of participants of the Isle of Man Open 2017 that played k women during the n=9 rounds of this tournament can be found in table 5. The six games that are missing weren’t played. The differences between the values in the Exp(k) and Obs(k) columns are all very small, and consequently the value of p(χ^2) is close to one, so there is almost a 100% match between expected and observed pairings. In this particular case, this agreement between prediction and reality is almost too good to be true. One of the predictions of the hypergeometric probability model, with N=159, K=22 and n=9, was that approximately three participants would face k=4 female players. These players turned out to be Andrew J. Ledger, Michael Babar and Hou Yifan. Completely explained by statistics: nothing strange happened at all!

Table 5

Table 5. The Expected and Observed number of players facing k women during n=9 rounds.

During the Isle of Man Open, a rather improbable event occurred that went unnoticed.  As reported by Sagar Shah in Indians at the Isle of Man (October 15th, 2017), there were K=30 players from India among the N=159 participants. It turned out that the number of players that would face k Indian players during the n=9 rounds of the tournament could be predicted rather well with the hypergeometric model. According to this model, we could expect that 1.76 participants would face k=5 Indian players and that 0.23 participants would face k=6 Indian players. The two participants that faced five Indians were Boris Gelfand and Varuzhan Akobian, while Adhiban Baskaran must have been quite amazed that he had to face six players, five of them in a row, from his native country. The wheel of fortune clearly turns in mysterious ways and not only for Hou Yifan!

Rd.  Bo.  SNo  Title  Name                      Rtg   FED  Pts.  Res.  we    w-we   K   rtg+/-
1    6    7    GM     Gelfand Boris             2737  ISR  5,0   w ½   0,41  0,09   10  0,90
2    36   86          Raja Harshit              2423  IND  4,0   s ½   0,81  -0,31  10  -3,10
3    30   91   IM     Degtiarev Evgeny          2412  GER  4,0   w 1   0,82  0,18   10  1,80
4    21   63   IM     Nihal Sarin               2483  IND  5,0   s 1   0,74  0,26   10  2,60
5    15   97   IM     Harsha Bharathakoti       2394  IND  5,0   w 0   0,83  -0,83  10  -8,30
6    23   45   GM     Sunilduth Lyna Narayanan  2568  IND  6,0   s 0   0,64  -0,64  10  -6,40
7    33   84   GM     Sundararajan Kidambi      2426  IND  4,0   w 1   0,80  0,20   10  2,00
8    24   78   IM     Swayams Mishra            2444  IND  5,0   s ½   0,79  -0,29  10  -2,90
9    23   68   IM     Batsiashvili Nino         2472  GEO  4,5   w 1   0,76  0,24   10  2,40
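The Isle of Man predictions quoted above can be reproduced in the same way. The sketch below assumes that the expected number of participants facing k members of a group is simply N times the hypergeometric probability p(k); this is my reading of how the Exp(k) values were obtained, and the function name is invented for illustration.

```python
from math import comb

def expected_players(k, N, K, n):
    """Expected number of participants facing exactly k members of a
    group of size K, taken as N * p(k) under the hypergeometric model."""
    return N * comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Isle of Man 2017 as quoted in the article: 159 players, 9 rounds.
N, n = 159, 9

women_k4 = expected_players(4, N, 22, n)    # 22 women in the field
indians_k5 = expected_players(5, N, 30, n)  # 30 players from India
indians_k6 = expected_players(6, N, 30, n)

print(f"players expected to face 4 women:   {women_k4:.2f}")
print(f"players expected to face 5 Indians: {indians_k5:.2f}")
print(f"players expected to face 6 Indians: {indians_k6:.2f}")
```

The values should come out near the "approximately three", 1.76 and 0.23 given in the text.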

In his article Investigating Hou’s pairings, Albert Silver wrote that the Gibraltar pairings and the Isle of Man pairings have been thoroughly checked. They were found to be beyond any suspicion of wrongdoing. Silver:

"These findings have been passed on to Yifan, who was forced to wonder what bizarre twist of fate had made her victim to such an incredible series of coincidences. Nevertheless, she was reassured, as were we all, that everything was in order."

It is nice that our hypergeometric probability model provides extra evidence for these findings.


Johannes lives with his wife in sunny Cochabamba, Bolivia. Now that he is retired he writes articles about chess and mathematics, e.g., Famous numbers on a chessboard (2010) and The Golden Triangle (2010), and from time to time he still plays chess. In 2014 he won the prize for the best veteran (Millor Veterà) during the 32e Open International d’Andorra.

Discussion and Feedback

Quanber 2/7/2018 10:57
FramiS wrote : "It is strange that people don't understand that at a Swiss tournament no drawing of lots is involved. The pairings are done according to a strict deterministic rule: the Dutch pairing system in this case"

What a strange attitude to the subject. Every tournament system is created through a series of man-made rules and ideas. The algorithms may look strictly deterministic, but Kurt Gödel has taught us that this is a shallow picture. There will always be situations where the system delivers different solutions to the same problem. In addition, the tournaments have to choose Swiss variations or discretionary rules like

01) what score if no opponent ( 1 or ½ ? )
02) can people from same nation meet each other in last round ?
03) special rules for first two rounds ?

Just to mention a few variations. Every discretionary rule creates variations of pairings. Maybe sometimes unwanted variations, so any system should constantly be subject to discussion and changes. This was the situation here, which many of us agree about.

I don’t know about the chosen Swiss variations in this tournament. It is not mentioned on the home page. But I know that the Swiss system should be under constant development, exactly because we want to make it as fair as possible. The later rounds have a much greater bearing on the final results than the earlier rounds. So some speculate in this and create a poor start so they are paired against weaker opposition. Can we improve the system?

The Swiss System is most likely superior compared to randomized pairings for a single open, but one can produce solid mathematical argument in favor of randomized pairings over a long sample of tournaments. So maybe we should even drop the Dutch lady and stick to real fairness ? Why should the guy in the middle always expect to get crushed in the first round by the guy in the top ?

Nowhere in any of my comments have I ever said that the organisers of Tradewise Gibraltar deliberately cheated Hou Yifan with the pairings. On the contrary, already from my first comments I made it clear that there was no indication of that. I only entered the discussion because I found the chosen mathematical model to analyse the result unsatisfactory. But I wish to thank Johannes Meijer for at least trying to understand the odd result of the pairings.

In my opinion Hou Yifan has been subject to extreme abuse because she reacted in shock to a completely unlikely pairing (7 female players in 9 rounds). A human reaction is met with all these Kafka commissioners who tell us that everything is according to the rules. We are not living in a Kafka universe. We try to help each other to improve through thinking.

My interest is to investigate mathematical string theories with rules for combining the strings that produce the actual observed incompleteness in the description of a physical system (quantum indeterminacy). Can we create indeterminacy from strict mathematical algorithms? Because our present knowledge is that nothing is strictly deterministic in the actual Universe. Maybe the solution is the Dutch pairing system?

FramiS 2/7/2018 02:15
It is strange that people don't understand that at a Swiss tournament no drawing of lots is involved. The pairings are done according to a strict deterministic rule: the Dutch pairing system in this case. Therefore there is no need to calculate any probabilities. Just check the pairings according to the rules. You can even do it manually without a computer if you are painstaking enough. Here are the pairing rules: https://www.fide.com/fide/handbook.html?id=167&view=article.

It has already been done by IA Roberto Ricca ( http://arbiters.fide.com/images/stories/downloads/2017/FIDE_Arbiters_Magazine_No_4_-_February_2017.pdf ) and all pairings of Hou Yifan were correct, but everyone in doubt can do it by themselves.

moderncheckers 2/7/2018 01:55
I am happy that Quanber has gotten more reasonable and agreed that there was nothing wrong with the pairings in the tournament.
As for the GOF test, it is debatable whether it could be applied, as the distributions of the ratings of males and females were different (I guess the first one had a higher standard deviation, for example), and, as written in the very article we are commenting:
"While playing in an open tournament, after a few rounds you start to recognize the faces around you. The group of players with the same score becomes smaller after each round and the pairings seem to become less random. So preserving some doubt that our model describes adequately the reality of an open tournament seems more than justified."
This model is only a rough approximation, and the author of the article was aware of it.

lajosarpad 2/7/2018 11:54
@Quanber

I do not see Internet police here. You are able to formulate your opinion without censorship and other people are able to do the same. If someone disagrees with you, then it is rude to assume that there is any Internet police present. People have different opinions and none of us assumed you are from the Internet police, even though you were quite aggressive in some remarks. I would rather not quote them here, to preserve the constructive tone we have achieved with great difficulty. If you formulate your opinion, you will always find people who disagree with you and if you have strong arguments, then you might convince them. There is no need to victimize yourself talking about abuses of rights or Internet police when we speak about the pairings of Hou Yifan. If you continue to do so, I will certainly optimize my time by not reading your comments. If you come up with arguments you believe in, then I will honestly tell you when I agree and when I disagree and why. I am a male. When you speak about the level of males on chessbase, do you think it is correct towards me? When you speak about the number of male decision makers as something bad, you imply that male people tend to be unjust towards women when money talks. Yet, if FIDE was run by women, it would not be impossible to reach a similar scenario. Organizing the rapid and blitz world championships in Saudi Arabia was a wrong decision, but the cause of it was the greed of some people and not their gender.

lajosarpad 2/7/2018 11:39
@Quanber

I repeat my main point: other pairings were unlikely as well, especially before the event. However, if you simulate the pairings, taking the results for granted, the likelihood of the pairings increases. It is good that she stands up for her rights, but this is telling more about her than the organizers. I personally have no problems if FIDE or the organizers are mostly male. I would not object if the majority were female either. I simply do not care about their gender, but rather about the quality of their work. The important factor is whether they do a good job. Organizing the rapid and blitz world championships in Saudi Arabia was a serious abuse of women's rights, for instance, so it is good to criticize this. I do not think you moved away from the topic, but please, do not start a discussion about Trump and other political figures in this discussion, to defame Moderncheckers. Moderncheckers gave strong arguments. He, unlike me, took the time to actually analyze the statistics. I stated that I consider that to be irrelevant here, because one could pick a large number of grandmasters from any Swiss tournament whose pairing was unlikely. Is this showing that they were all cheated? Nope, this is the nature of the probability of pairings. Your main argument was an attempt to prove that the pairings of Hou Yifan were unlikely. If simulations of pairings yield similar results, then the null hypothesis that there was no manipulation increases in strength. If not, then the null hypothesis still remains. To have any convincing suspicion we need to see facts about the organizers, like a history of manipulation, or an expressed will to have such pairings before the tournament, or the impossibility (not improbability) of the pairings she had with the same software and the same version. The organizers are human beings and they have the right not to be defamed unless there is an extremely good argument to do so.
Nobody called Hou Yifan an idiot and she is a respectable super grandmaster. However, in this case she was incorrect, when she had cast suspicion on people without strong arguments because she did not like her pairings. Yes, she is a human being, as anyone else, even we, males are humans with rights and we should stand up for the dignity of the organizers unless we have proof of their wrongdoings. Yes, Hou Yifan's dignity should be respected as well, but I do not consider the criticism of her discussed action to be disrespectful or incorrect. We know for a fact that she had cast suspicion on the organizers, since she expressed that she did not like the pairings, she gave away a game and asked for explanations for her pairings. And here we are, more than a year after the event, still discussing this suspicion, in the form of probabilities and statistics. She won a supertournament, that is a great achievement. I would rather discuss that when we talk about Hou Yifan instead of the sad mistake she made last year.

@Moderncheckers

You gave strong arguments, I do not think you need to get personal when Quanber bashes you. I can assure you that I am not interested about the bashing she has done to you, nor the bashing you have done to her. Let's keep the level of the discussion. Please, notice that since my comments yesterday Quanber was far more correct than before, so I clearly see the possibility to have a civilized discussion. She tended to lose her calmness, so we should not give her any reason to do so. And if she loses calmness even then, then we can point out the fallacies she is making. I believe she understands that she will not come out in good shape from this discussion if she bashes males or the other people present in this discussion. Note, that I am only addressing her arguments rather than her person. If she comes up with a fallacy, it will be pointed out. If she comes out with good arguments, then she will be rewarded with counter-arguments.

Quanber 2/7/2018 08:10
The agenda for the article written by Johannes Meijer is given in the article above :

"Let's expose this strange occurrence to statistics. In this article, we will do it for you — and uncover the truth about Hou Yifan's pairings in Gibraltar. In the article Investigating Hou’s pairings, Grandmaster John Nunn told Albert Silver that precisely calculating the odds would be an exercise in futility. This seems indeed to be the case, but it is however possible to estimate these odds with the aid of a simple probability model."

Now that was the agenda: to uncover the truth about Hou Yifan's pairings and estimate the odds using a simple probability model.

I related exactly to this agenda. My agenda was to criticise this simple probability model, a GOF test, which demands the existence of a simple random sampling. Assume we have a population of X objects. The sample taken from the population consists of Y objects. If all possible samples of Y objects are equally likely to occur, the sampling method is simple random sampling. Simple random sampling makes it possible to define a confidence interval around a sample mean, and therefore to use a hypergeometric distribution.

It is clear that this is not the case in Swiss system tournaments, where players are paired with opponents who have done equally well. Players don't face the same opponent more than once and don't play with the same colour more than twice. The pairing program will first try to find a solution where the player alternates between white and black from game to game.

Through simple, logical and down-to-earth manual investigation of Hou Yifan's tournament result we found that the likely number of possible female opponents in the 10 rounds would be far less than demanded by a GOF test. In fact it would be close to 1 in many of the 10 rounds (42, 11, 1, 1, 0, 1, 1, 1, 3, 1).

Unlikely things happen every day. We should not make them likely using bad statistics. There is nothing wrong with the pairings in the tournament. But if Hou Yifan was told the same explanation for the pairings as given in the article, I can fully understand her reaction.

Moderncheckers (MC) wrote: "You gradually move away from the topic of the article, which is the statistical analysis of the pairings as a whole, and engage in much more specific analysis of Hou's pairings in each round".

No, I did not move away from the topic. Moderncheckers moved away. The agenda was not a general analysis of the pairings as a whole, but to use a GOF test to prove that the pairings were not that unlikely in relation to what one would expect. The agenda was nothing less than to uncover the truth about Hou Yifan's pairings. The article fails to do so, using the wrong statistics and creating a most unwanted suspicion of foul play. There was nothing of that.

End of story. I have no more time for FIDE's internet police; I need to do my daily work.

moderncheckers 2/6/2018 10:11
@Quanber You repeat yourself, and in your round-by-round analysis you hide the information about the ratings of all players, which I showed, and which makes it crystal-clear why Hou got the pairings she got. If you doubt it, then tell me, in which round the pairings were wrong, and against whom she should be paired in that round according to you.

"The modern Trump way of dealing with reality. FIDE´s own internet police. It is absolutely useless to discuss anything with you. But I am happy so many others got the point", you write. How typical. There must surely be something wrong with me that makes the discussion useless, because you have to defend your points and cannot win by yelling "conspiration!", "FIDE is run by men!", "I am a Mathematician!", and calling other people names? And who are the "others", who "got the point"?

Quanber 2/6/2018 09:17
Before round 4: 15 possible opponents / 1 female / 7 %
Hou has 2½ points and needs to play white. There are 15 possible opponents.
Muzychuk Anna (2558): excluded, already played
Batsiashvili Nino (2492): excluded, as she needs to have white
Muzychuk Mariya (2546): OK, as she needs to have black

There is precisely one and only one female player who fits the condition. Hou plays white against Muzychuk Mariya (2546) and wins.



Before round 5: 7 possible opponents / 0 female / 0 %
Hou has 3½ points and needs to play black. There are no female players to choose from.
Hou plays black against Michael Adams (2751) and loses.


Before round 6: 17 possible opponents / 1 female / 6 %
Hou has 3½ points and needs to play white. Female players:
Muzychuk Anna (2558): excluded, already played
Lagno Kateryna (2530): excluded, as she needs to have white
Batchimeg Tuvshintugs (2390): excluded, as she needs to have white
Cramling Pia (2454): OK, as she needs to have black

There is precisely one and only one female player who fits the condition. Hou plays white against Pia Cramling. It is a draw.


Before round 7: 14 possible opponents / 1 female / 7 % (male: 93 %)
Hou has 4 points and needs to play black. Female players:
Lagno Kateryna (2530): excluded, as she needs to have black
Cramling Pia (2454): excluded, already played
Petra Papp (2352): OK, as she needs to have white

There is precisely one and only one female player who fits the condition.
However, Hou is given a male player who does not fit the normal conditions. The male player gets two whites in a row.



Before round 8: 8 possible opponents / 1 female / 13 %
Hou has 5 points and needs to play white. Female players:

Ju Wenjun (2583): OK, as she needs to have black
Muzychuk Anna (2558): excluded, already played

There is precisely one and only one female player who fits the condition. Hou plays white against Ju Wenjun (2583) and loses.


Before round 9: 15 possible opponents / 3 female / 20 %
Hou has 5 points and needs to play black. Female players:

Muzychuk Mariya (2546): excluded, already played
Batsiashvili Nino (2492): OK, as she needs to have white
Cramling Pia (2454): excluded, already played
Zatonskih Anna (2443): OK, as she needs to play white
Khotenashvili Bela (2430): OK, as she needs to play white
Pustovoitova Daria (2407): excluded, as she needs to have black

There is finally more than one female player who fits the conditions. Hou plays black against Batsiashvili Nino (2492) and wins.

Before round 10: 11 possible opponents / 1 female / 9 %
Hou has 6 points and needs to play white. Female players:

Stefanova Antoaneta (2512): excluded, as she needs to have white
Lagno Kateryna (2530): excluded, as she needs to have white
Muzychuk Mariya (2546): excluded, already played
Zatonskih Anna (2443): OK, as she needs to have black

There is precisely one and only one female player who fits the condition. However, Hou plays against a male player, Lalith Babu M R (2587), and loses.

Quanber 2/6/2018 09:16
Moderncheckers (MC) wrote:

"This way, you gradually move away from the topic of the article, which is the statistical analysis of the pairings as a whole, and engage in much more specific analysis of Hou's pairings in each round"

No, it was not. The topic was whether a GOF test could be used to prove that the pairings were OK. The answer is negative.

You manipulate all the time. There is absolutely no information in your round tables. Pure rubbish. Like reading lines from a phone book until the opponents get tired.

A player will alternate between black and white each round if possible, and be paired with players on the same number of points if possible. That gives Hou Yifan a much lower number of possible female pairings than the GOF test assumes. The actual pairing constraints give a much lower probability than the one assumed in a GOF test.

I have seen through your approach. It happens every time someone criticises FIDE and organizers, from trousers to scarves to pairings. Then they get attacked back with black smoke and manipulative rubbish. The modern Trump way of dealing with reality. FIDE's own internet police. It is absolutely useless to discuss anything with you. But I am happy so many others got the point.



Before round 1: 254 possible opponents / 42 female / 17 %
Hou played black against Pourkashiyan Atousa (Elo 2303) and won.

Before round 2: 107 possible opponents / 12 female / 11 %
Hou had 1 point, as did 114 other players. She needed to have white, which drastically reduces the possibilities. She played white against Zhukova Natalia (2447) and won again.


Before round 3: 18 possible opponents / 1 female / 6 %
Hou has 2 points and now it starts getting interesting. Only 37 players had 2 points. Hou needs to have black in this round, which reduces the number of possible opponents to 18. Could it be a female player?
Ju Wenjun (2583): excluded, as she needs to have black
Muzychuk Anna (2558): OK, as she needs to have white
Lagno Kateryna (2530): excluded, as she needs to have black
Stefanova Antoaneta (2512): excluded, as she needs to have black
Batsiashvili Nino (2492): excluded, as she needs to have black
Batchimeg Tuvshintugs (2390): excluded, as she needs to have black

There is precisely one and only one female player who fits the condition. Hou plays black against Muzychuk Anna (2558). It's a draw.
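
The round 3 filter described above (same score, not yet played, opposite colour due) can be written out as a short sketch. The tuple layout and the function name `eligible` are made up for illustration. The names, ratings and colours are the ones listed in this comment, and a real pairing program applies more rules than these two.

```python
# Female players on 2 points before round 3, as listed in the comment:
# (name, rating, colour she is due next, has she already played Hou?)
candidates = [
    ("Ju Wenjun", 2583, "black", False),
    ("Muzychuk Anna", 2558, "white", False),
    ("Lagno Kateryna", 2530, "black", False),
    ("Stefanova Antoaneta", 2512, "black", False),
    ("Batsiashvili Nino", 2492, "black", False),
    ("Batchimeg Tuvshintugs", 2390, "black", False),
]


def eligible(players, hou_colour):
    """Keep players who have not met Hou and are due the opposite colour."""
    needed = "white" if hou_colour == "black" else "black"
    return [name for name, _rating, colour, played in players
            if colour == needed and not played]


# Hou is due black in round 3, so her opponent must be due white.
print(eligible(candidates, hou_colour="black"))  # ['Muzychuk Anna']
```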
moderncheckers moderncheckers 2/6/2018 08:00
@Quanber I showed the fallacy of your previous arguments. Now you come back with new ones: you are going through the pairings round-by-round, counting women and men with the same number of points as Hou. This way, you gradually move away from the topic of the article, which is the statistical analysis of the pairings as a whole, and engage in much more specific analysis of Hou's pairings in each round.

Please go one step further, then, and consider also the ratings. When there are several players with the same number of points, the pairings are not randomly chosen among them. If there are 2N players with the same number of points, they are sorted by ratings, and the player number 1 plays against N+1, player number 2 against N+2, ..., and player number N plays against 2N. There are of course further corrections to prevent playing several games in a row with the same color of pieces. If the number of players is odd, then the procedure is similar, except one player who has to be matched against someone with a different number of points.
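
The top-half against bottom-half rule just described can be sketched in a few lines. This is only an illustration of that one rule for an even-sized score group. The function name `pair_score_group` and the sample players are invented, and the colour corrections and odd-group floaters mentioned above are deliberately left out.

```python
def pair_score_group(players):
    """Pair a score group of 2N players: rank i meets rank i + N.

    players: list of (name, rating) tuples, all on the same score.
    """
    if len(players) % 2:
        raise ValueError("this sketch only handles an even number of players")
    ranked = sorted(players, key=lambda p: p[1], reverse=True)
    half = len(ranked) // 2
    return [(ranked[i][0], ranked[i + half][0]) for i in range(half)]


# Six hypothetical players on the same score.
group = [("A", 2650), ("B", 2600), ("C", 2550),
         ("D", 2500), ("E", 2450), ("F", 2400)]
print(pair_score_group(group))  # [('A', 'D'), ('B', 'E'), ('C', 'F')]
```

The point of the sketch is that, once scores and ratings are fixed, nothing in this step is left to chance.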

GM Bator Sambuev at chesstalk dot com forum went through all the pairings, and let me copy here what he wrote:

"Looks like she got opponents strictly by rating.

Round 1
21 21 GM Iturrizaga Bonelli Eduardo 2652 0 1 - 0 0 FM Lopez Mulet Inigo 2304 142
22 143 WGM Pourkashiyan Atousa 2303 0 0 - 1 0 GM Hou Yifan 2651 22
23 23 GM Piorun Kacper 2651 0 1 - 0 0 IM Povah Nigel E 2298 144

Round 2
16 18 GM Fressinet Laurent 2660 1 ½ - ½ 1 GM Cramling Pia 2454 81
17 20 GM Howell David W L 2655 1 1 - 0 1 IM Khademalsharieh Sarasadat 2452 83
18 88 GM Khotenashvili Bela 2430 1 0 - 1 1 GM Iturrizaga Bonelli Eduardo 2652 21
19 22 GM Hou Yifan 2651 1 1 - 0 1 GM Zhukova Natalia 2447 85
20 90 IM Wemmers Xander 2424 1 0 - 1 1 GM Piorun Kacper 2651 23
21 24 GM Anton Guijarro David 2650 1 1 - 0 1 GM Womacka Mathias 2435 87
22 26 GM Sethuraman S.P. 2637 1 1 - 0 1 GM Sundararajan Kidambi 2420 91

Round 3
9 45 GM Grigoriants Sergey 2564 2 ½ - ½ 2 GM Howell David W L 2655 20
10 21 GM Iturrizaga Bonelli Eduardo 2652 2 1 - 0 2 IM Kollars Dmitrij 2500 62
11 47 GM Muzychuk Anna 2558 2 ½ - ½ 2 GM Hou Yifan 2651 22
12 23 GM Piorun Kacper 2651 2 1 - 0 2 IM Bellahcene Bilel 2493 64
13 49 GM Mastrovasilis Athanasios 2551 2 ½ - ½ 2 GM Anton Guijarro David 2650 24
14 63 IM Liang Awonder 2496 2 ½ - ½ 2 GM Sethuraman S.P. 2637 26

Round 4
13 18 GM Fressinet Laurent 2660 2½ 1 - 0 2½ GM Muzychuk Anna 2558 47
14 50 GM Schroeder Jan-Christian 2550 2½ ½ - ½ 2½ GM Ganguly Surya Shekhar 2657 19
15 20 GM Howell David W L 2655 2½ 1 - 0 2½ GM Mastrovasilis Athanasios 2551 49
16 22 GM Hou Yifan 2651 2½ 1 - 0 2½ GM Muzychuk Mariya 2546 51
17 24 GM Anton Guijarro David 2650 2½ 1 - 0 2½ IM Liang Awonder 2496 63
18 26 GM Sethuraman S.P. 2637 2½ 1 - 0 2½ IM Batsiashvili Nino 2492 66

Round 5
2 20 GM Howell David W L 2655 3½ ½ - ½ 3½ GM Vachier-Lagrave Maxime 2796 2
3 3 GM Nakamura Hikaru 2785 3½ 1 - 0 3½ GM Iturrizaga Bonelli Eduardo 2652 21
4 5 GM Adams Michael 2751 3½ 1 - 0 3½ GM Hou Yifan 2651 22
5 24 GM Anton Guijarro David 2650 3½ 1 - 0 3½ GM Gelfand Boris 2721 10
6 15 GM Zvjaginsev Vadim 2679 3½ ½ - ½ 3½ GM Sethuraman S.P. 2637 26
moderncheckers moderncheckers 2/6/2018 08:00
Round 6
14 10 GM Gelfand Boris 2721 3½ 1 - 0 3½ IM Steinberg Nitzan 2486 69
15 11 GM Naiditsch Arkadij 2702 3½ 1 - 0 3½ IM Salomon Johan 2470 73
16 62 IM Kollars Dmitrij 2500 3½ ½ - ½ 3½ GM Shankland Samuel L 2674 17
17 18 GM Fressinet Laurent 2660 3½ 1 - 0 3½ IM Javakhishvili Lela 2455 80
18 66 IM Batsiashvili Nino 2492 3½ 0 - 1 3½ GM Iturrizaga Bonelli Eduardo 2652 21
19 22 GM Hou Yifan 2651 3½ ½ - ½ 3½ GM Cramling Pia 2454 81
20 32 GM Vocaturo Daniele 2606 3½ 1 - 0 3½ GM Womacka Mathias 2435 87
21 33 GM Fridman Daniel 2594 3½ 1 - 0 3½ IM Godart Francois 2381 109
22 68 IM Krysa Leandro 2491 3½ 0 - 1 3½ GM Lalith Babu M R 2587 37
23 72 GM Debashis Das 2472 3½ 1 - 0 3½ GM Blomqvist Erik 2574 41
24 94 IM Carlstedt Jonathan 2413 3½ ½ - ½ 3½ GM Grigoriants Sergey 2564 45
25 46 GM Donchenko Alexander 2559 3½ ½ - ½ 3½ GM Arakhamia-Grant Ketevan 2370 117

Round 7
17 60 GM Mikhalevski Victor 2504 4 ½ - ½ 4 GM Kovalenko Igor 2684 14
18 17 GM Shankland Samuel L 2674 4 1 - 0 4 IM Paehtz Elisabeth 2468 76
19 20 GM Howell David W L 2655 4 1 - 0 4 IM Kollars Dmitrij 2500 62
20 78 IM Ider Borya 2463 4 0 - 1 4 GM Hou Yifan 2651 22
21 23 GM Piorun Kacper 2651 4 1 - 0 4 IM Cheng Bobby 2452 82
22 25 GM Gupta Abhijeet 2645 4 0 - 1 4 IM Carlstedt Jonathan 2413 94
23 98 IM Docx Stefan 2405 4 0 - 1 4 GM Sethuraman S.P. 2637 26
24 104 FM Garriga Cazorla Pere 2386 4 0 - 1 4 GM Gopal G.N. 2579 40
25 117 GM Arakhamia-Grant Ketevan 2370 4 0 - 1 4 GM Deac Bogdan-Daniel 2572 42

Round 8
13 18 GM Fressinet Laurent 2660 5 1 - 0 5 GM Vocaturo Daniele 2606 32
14 48 GM Huzman Alexander 2557 5 0 - 1 5 GM Howell David W L 2655 20
15 22 GM Hou Yifan 2651 5 0 - 1 5 GM Ju Wenjun 2583 38
16 94 IM Carlstedt Jonathan 2413 5 ½ - ½ 5 GM Piorun Kacper 2651 23
17 26 GM Sethuraman S.P. 2637 5 1 - 0 5 IM Dragnev Valentin 2492 67
18 14 GM Kovalenko Igor 2684 4½ 1 - 0 5 GM Grigoriants Sergey 2564 45

Kovalenko previously played Dragnev

Round 9
21 57 GM Lemos Damian 2516 5 ½ - ½ 5 GM Ganguly Surya Shekhar 2657 19
22 21 GM Iturrizaga Bonelli Eduardo 2652 5 1 - 0 5 GM Cramling Pia 2454 81
23 66 IM Batsiashvili Nino 2492 5 0 - 1 5 GM Hou Yifan 2651 22
24 67 IM Dragnev Valentin 2492 5 ½ - ½ 5 GM Oparin Grigoriy 2625 29
25 32 GM Vocaturo Daniele 2606 5 ½ - ½ 5 IM Santos Ruiz Miguel 2484 70
26 35 GM Istratescu Andrei 2593 5 1 - 0 5 IM Cheng Bobby 2452 82
27 86 IM Zatonskih Anna 2443 5 1 - 0 5 GM Antipov Mikhail Al. 2580 39
28 75 IM Esserman Marc 2468 5 ½ - ½ 5 GM Gopal G.N. 2579 40

Antipov previously played Esserman.

I didn't notice any manipulations," wrote Sambuev. Neither do I.

This should end the discussion.
Quanber Quanber 2/6/2018 06:33
A colleague of mine has done the painstaking work of investigating the colours, to make the statistics even more precise. That is, which female players have the same points as Hou Yifan in each round and are due the correct colour. The result is somewhat scary. In the first two rounds Hou has a 17 % and an 11 % chance of meeting a female opponent.

But in rounds 3, 4, 6 and 8 (4 rounds) there is precisely one and only one female player who fits the criteria
(same points and correct colour) to play Hou. And she gets them all!

Round 1: female, Pourkashiyan Atousa (2303) / 42 options (17 %)
Round 2: female, Zhukova Natalia (2447) / 11 options (11 %)
Round 3: female, Muzychuk Anna (2558) / 1 option (6 %)
Round 4: female, Muzychuk Mariya (2546) / 1 option (7 %)
Round 5: male, Michael Adams (2751) / no options (M 100 %)
Round 6: female, Cramling Pia (2454) / 1 option (6 %)
Round 7: male, Ider Borya (2463) / 1 option (M 93 %)
Round 8: female, Ju Wenjun (2583) / 1 option (13 %)
Round 9: female, Batsiashvili Nino (2492) / 3 options (20 %)
Round 10: male, Lalith Babu M R (2587) / 1 option (M 81 %)

The numbers of possible female opponents for all 10 rounds are: (42, 11, 1, 1, 0, 1, 1, 1, 3, 1).
Apart from the first two rounds, Hou had pretty much one single option of a female opponent in each round,
and got it.

Pairing round 1: 254 possible opponents / 42 female / 17 %
Pairing round 2: 107 possible opponents / 12 female / 11 %
Pairing round 3: 18 possible opponents / 1 female / 6 %
There is precisely one and only one female player who fits the condition. Hou plays black against Muzychuk Anna (2558). It's a draw.

Pairing round 4: 15 possible opponents / 1 female / 7 %
There is precisely one and only one female player who fits the condition. Hou plays white against Muzychuk Mariya (2546) and wins.

Pairing round 5: 7 possible opponents / 0 female / 0 %
There is no female player to choose. Hou plays black against Michael Adams (2751) and loses.

Pairing round 6: 17 possible opponents / 1 female / 6 %
There is precisely one and only one female player who fits the condition. Hou plays white against Pia Cramling. It is a draw.

Pairing round 7: 14 possible opponents / 1 female / male: 93 %
Petra Papp (2352) is OK, as she needs to have white. There is precisely one and only one female player who fits the condition. However, Hou is given a male player who does not fit the normal conditions. The male player gets white twice in a row.

Pairing round 8: 8 possible opponents / 1 female / 13 %
There is precisely one and only one female player who fits the condition. Hou plays white against Ju Wenjun (2583) and loses.

Pairing round 9: 15 possible opponents / 3 female / 20 %
Hou has 5 points and needs to play black. Female players:
Batsiashvili Nino (2492): OK, as she needs to have white
Zatonskih Anna (2443): OK, as she needs to have white
Khotenashvili Bela (2430): OK, as she needs to have white

There is finally more than one female player who fits the conditions. Hou plays black
against Batsiashvili Nino (2492) and wins.

Before round 10: 11 possible opponents / 1 female / 9 %
Hou has 6 points and needs to play white. Zatonskih Anna (2443) is OK, as she needs to have black. However, Hou plays against male player Lalith Babu M R (2587) and loses.
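
For anyone who wants to recompute the percentages, here is a minimal sketch using the "possible opponents / female" counts quoted per round in this comment. It assumes, purely for illustration, that every eligible opponent in a round is equally likely; the pairing rules do not work that way, so the sum is only a rough yardstick. The variable names are invented.

```python
# (possible opponents, eligible female opponents) for rounds 1 to 10,
# copied from the per-round figures in this comment.
rounds = [(254, 42), (107, 12), (18, 1), (15, 1), (7, 0),
          (17, 1), (14, 1), (8, 1), (15, 3), (11, 1)]

# Share of eligible opponents who are female, round by round.
shares = [female / possible for possible, female in rounds]

for number, ((possible, female), share) in enumerate(zip(rounds, shares), start=1):
    print(f"Round {number}: {female}/{possible} = {share:.1%}")

# Under the equal-likelihood assumption, the expected number of female
# opponents over the ten rounds is simply the sum of the shares.
expected_women = sum(shares)
print(f"Expected female opponents: {expected_women:.2f}")
```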
Quanber Quanber 2/6/2018 01:37
Sorry, something went wrong in the upload of the table. Here is a summary of the approximate
numbers:

Before round 1: 254 possible opponents / 42 female / 17 %
Before round 2: 114 possible opponents / 14 female / 12 %
Before round 3: 36 possible opponents / 3 female / 8 %
Before round 4: 29 possible opponents / 2 female / 7 %
Before round 5: 13 possible opponents / 0 female / 0 %
Before round 6: 33 possible opponents / 4 female / 12 %
Before round 7: 27 possible opponents / 3 female / 11 %
Before round 8: 20 possible opponents / 3 female / 11 %

Only the first round gives a 17 % chance of a female opponent.

In the only round (5) where there was no female player to pair her with, she got a man!

And then again after round 9, where she complained.
Quanber Quanber 2/6/2018 01:24
The answers from lajosarpad and moderncheckers have made me analyse in detail whether Hou Yifan was actually cheated in the pairings. No one can conclude this with certainty. There will always be an element of doubt. But the result of my further evaluation is that the pairings are indeed far more unlikely than I originally thought.

Based on the Swiss tournament system, one is forced to count how many players have the same number of points as Hou after each round and how many of those are female. In this process we should of course exclude those female players she has already played.

There are some restrictions on her pairing that reduce the number in the first round. But let's choose the highest possible percentage, 17 %. After she won the first round, the number of possible players she could meet dropped to 114, of which only 12 % were female. The second win kept her among the top 70 players for the rest of the tournament, and the share of possible female opponents dropped to an average of about 10 %.


Hou's pairings (No. and Elo are Hou's own; Prev. opp. is her opponent in the previous round):

Rd  No.  Elo   Prev. opp.  Opp. Elo  Result  Score  Places   Possible  Female  Share
 1  22   2651                                                254       42      17 %
 2  22   2651  143F        2303      1       1      1 - 115  114       14      12 %
 3  22   2651  85F         2447      1       2      1 - 37   36        3       8 %
 4  22   2651  47F         2558      ½       2½     10 - 40  29        2       7 %
 5  22   2651  51F         2546      1       3½     3 - 16   13        0       0 %
 6  22   2651  5M          2751      0       3½     24 - 58  33        4       12 %
 7  22   2651  81F         2454      ½       4      30 - 58  27        3       11 %
 8  22   2651  78M         2463      1       5      16 - 35  19        1       5 %
 9  22   2651  38F         2583      0       5      41 - 69  27        3       11 %
10  22   2651  66F         2492      1       6      20 - 41  20        3       15 %


Notice that female players made up 17 % of the tournament. This is the basic probability used to develop the GOF test. The article explains why the pairings, with a basic probability of 17 % for meeting a female player, were in agreement with a GOF test with a p-value above 5 % (the normally chosen statistical significance level).

My conclusion is that the basic probability is far lower, and the whole method is insufficient to conclude anything. It fails to count the possible female opponents correctly in each round. I might have failed to count
all the female players myself (I don't know all the names), but I could not find any more than these low numbers.

Are there other indications that Hou Yifan could be subject to abuse? The answer is unfortunately yes.

Hou Yifan is one of the few female players who has always stood up for her rights. She refuses to play in tournaments where she has to dress according to some religious traditions, in spite of the huge amount of money involved. Few players have her high morals. I really admire her.

When I talk to high-ranked players who play in big tournaments, the general attitude is that FIDE has an indirect influence on all events. They don't like players who stand up for their rights. They want discipline and let the money talk. The history of chess is full of these stories, and I find the current development very offensive.

I am not being offensive to men. But I think it is about time female players stood up for their rights. FIDE is an organization run by men, which has been under severe criticism for being spineless when money talks.

Imagine a tournament where women could dress as they please, but men have to wear a scarf around the head. Maybe next year's Tradewise Gibraltar?
lajosarpad lajosarpad 2/6/2018 09:49
@Quanber

"Unfortunately the male level here at Chessbase.com. First we abuse Hou Yifan with wrong statistics, claiming that she is an idiot."

I think you are wrong when you say that on chessbase the male users are calling Hou Yifan an idiot. I have googled this and the first article I found was this one with your comment. So, if you happen to say the truth, please, show us the exact articles and comments on chessbase where she was being called an idiot and show me how this became a tendency of the male users. And even if someone has called her such (which, again, I am finding difficult to believe) on chessbase, that does not make all the male users or the majority of them abusing Hou Yifan. And coming up with the gender of your arguing partners to discredit them is quite idiotic, not worthy of a mathematician as you are according to your claim.

You appeal to sex and nationality. I don't care about your sex and your nationality. Answer the arguments with arguments, or spare my time.

"I only entered this discussion to point out that a GOF test would not be a good way to prove that Hou Yifan was not cheated."

And by chance your very first comment contained baseless and rude assumptions about me. You only entered to discuss the validity of the GOF test? Yeah, right...
lajosarpad lajosarpad 2/6/2018 09:20
@Quanber

the article is about the odds of Hou Yifan's pairings. The reason we are even talking about the odds of her pairings is that she had cast suspicions on the organizers. So, when I am speaking about the odds of other players and about the suspicion, I am focusing on the subject, not on something else, as you falsely imply. My point, which you failed to acknowledge in your response was that using the unlikely nature of an event to "prove" that the event is impossible is a fallacy and Hou Yifan has done it when she had cast suspicion on the organizers.

You have stated that you are a mathematician. Will you be automatically right in an argument just because you are a mathematician? No, you might be wrong. As a matter of fact, stating that you are a mathematician with the clear intention to make others believe you are right is a fallacy called appeal to authority (https://en.wikipedia.org/wiki/Argument_from_authority) and a mathematician should know better than to use such fallacies. By the way, I am a mathematician too. But I do not intend to make you accept my position just because I am a mathematician.


"As you don't care about the article itself, your comment becomes a disgrace towards Hou Yifan, who had all the reason to be amazed, and never got a good explanation. This situation continues with chessbase.com's article."

You have 0 knowledge about what I care, so please do not use premature assumptions, since that's not a really plausible approach and in this case it is even an argumentum ad hominem fallacy (https://en.wikipedia.org/wiki/Ad_hominem). Since your assumption is incorrect, the conclusion is baseless. I consider casting suspicions on the organizers based solely on the unlikely nature of her pairings to be shameful. When she casts suspicion on others, she should come with a proof, or very good arguments, something she clearly failed to do. Asking the organizers to explain the pairings, giving up a game and clearly stating she was unhappy with the pairings is clearly showing that she was suspicious. If she was not unhappy with the pairings, then we would not speak about them even after a year. And we speak about them because she was unhappy with it. By the way, asking the organizers to explain the situation is a fallacy called shifting the burden of proof (https://en.wikipedia.org/wiki/Argument_from_ignorance), since in our case, the null hypothesis is that the pairings were not manipulated. If one wants us to believe this was not the case, then she should do the proving and not the organizers.

I did not allocate time to calculate the exact probability of her pairings, because the value in question is not meaningful. Whatever that value is, we can find another, not suspicious event with equal or less probability, which did happen.
Quanber Quanber 2/6/2018 03:06
Dear Jenyes,

It is absolutely imperative in the use of a GOF test that the sampling method is simple random sampling. Assume we have a population of X objects. The sample taken from the population consists of Y objects. If all possible samples of Y objects are equally likely to occur, the sampling method is simple random sampling.

It is equally clear that this is not the case here. In Swiss system tournaments players are paired with opponents who have done equally well. Players don't face the same opponent more than once. Players often don't play with the same color more than twice, and in the end, the difference between the number of white and black games should be no more than one. Players of the same federation cannot be paired in the last round to avoid match fixing, or for political situations, players of certain federations cannot face each other.

As for the formulation of a null hypothesis and the choice of the level of statistical significance
(p-value), these methods demand that we have simple random sampling. Simple random sampling makes it possible to define a confidence interval around a sample mean, and therefore to use a hypergeometric distribution (a discrete probability distribution).
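
To make the simple-random-sampling model concrete, here is a minimal sketch of the hypergeometric calculation it would imply: 10 opponents drawn without replacement from 254 possible opponents, 42 of them female (the round 1 figures quoted elsewhere in this thread). This is not the article's GOF test, and the whole point of this comment is that a Swiss pairing is not such a draw. The function names are invented for illustration.

```python
from math import comb


def hypergeom_pmf(k, population, successes, draws):
    """P(exactly k successes) when drawing without replacement."""
    return (comb(successes, k) * comb(population - successes, draws - k)
            / comb(population, draws))


def hypergeom_tail(k_min, population, successes, draws):
    """P(at least k_min successes) under the same model."""
    upper = min(successes, draws)
    return sum(hypergeom_pmf(k, population, successes, draws)
               for k in range(k_min, upper + 1))


# 254 possible opponents, 42 female, 10 rounds, at least 7 female opponents.
p = hypergeom_tail(7, population=254, successes=42, draws=10)
print(f"P(at least 7 women in 10 random draws) = {p:.6f}")
```

Whatever number this prints only describes a purely random draw; it says nothing about pairings constrained by score, colour and rating.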

Statistical analysis is not appropriate when non-random sampling methods are used. The article and Moderncheckers ignore these observations.

It is also a mistake to assume that if one finds the pairings unlikely, then one assumes that the organizer has cheated Hou Yifan. This is the sad story Moderncheckers tries to spread.
The formulation of a null hypothesis (given we had a good method to test) should instead focus on the possibility that the restrictions on the pairings could make a very unlikely situation more likely. If one analyses this more deeply, round by round, Hou might end up having an understandable explanation.

Which is what she still deserves.
jenyes jenyes 2/6/2018 12:50
I think the article does a great job of explaining the math but it misses the point. That is, some results are simply undesirable. It does nothing to remove this or similar results from the realm of possibility altogether.
Quanber Quanber 2/6/2018 12:31
To Moderncheckers

You fail to understand everything in my comment.

Broken English
My broken English does not live up to your superior standard. So of course, here you start your journey. Make jokes about my spelling and sentences. Education and so on. It is so typical. If I expressed myself in my native language (from a very small country) you would not understand a single word. But then again, you don't have to understand anything. The rest of the world just has to adapt to your superior language, right? It's so pathetic to abuse me here. Try to let your hatred go and stick to the subject.

GOF test
A GOF test is a primitive statistical analysis, where one assumes that the elements of the sample space all have the same probabilities. This is clearly not the case here, and I will not discuss that any further. I only entered this discussion to point out that a GOF test would not be a good way to prove that Hou Yifan was not cheated.

Was Hou Yifan cheated in the pairings?
I have never expressed that conclusion. I only said that she was not given a proper explanation prior to the last round of why she met 7 women in 9 rounds. Her reaction was frustration. I fully understand her.


Distribution of female players ELO
43 women participated out of 255 players, or 17 %. Hou entered the tournament with the natural expectation that if she did OK with points, she would meet players around her own rating. And she did well, having 6 points out of 9 before the last round. Winning the last round would have given her a shared second place.

From starting rank no. 1 on the list (2827) down to Hou Yifan (2651) the ELO difference was 176. If we go a further 176 ELO down below Hou Yifan we end at rating 2475. The result is:

ELO 2827-2652: 21 players above, zero female players.
ELO 2651: 22. Hou Yifan
ELO 2651-2475:
38. Ju Wenjun 2583
47. Muzychuk Anna 2558
51. Muzychuk Mariya 2546
54. Lagno Kateryna 2530
56. Gunina Valentina 2524
58. Stefanova Antoaneta 2512
66. Batsiashvili Nino 2492

At least these numbers are not up for discussion. So there are zero female players above Hou and 7 below her, given the same range. In total (7/65) ~ 11 %

In the final result the first 100 players have ELO ratings higher than 2297, with a few exceptions, which seem to be men (2-3 %).

You fail to understand that the average ELO rating of the women, 2297, is actually so low, given Hou Yifan's actual performance, that she had all the reason to be amazed about the pairings. So this is the whole point here:

If Hou Yifan could not understand the pairings, then stick to some healthy arguments. Don't make a big article about a GOF test that explains why it is all quite natural. It is not.
chessdrummer chessdrummer 2/5/2018 05:17
The entire argument is ridiculous, but I have heard similar cases applied to ethnicity and nationality. There is no functional difference between a male who is 2500 and a female who is 2500. Either we accept women as equal competitors in chess or not. The issue is that she sees herself as the highest-rated woman player and believes she should be playing top players who happen to be men. She seeks separation. If Hou Yifan faced all men of exactly the same rating, would she complain? Probably not. This is not a chess issue. It is about gender perception.
rgorn rgorn 2/5/2018 05:04
@RayLopez:

http://www.chess-results.com/tnr257693.aspx?lan=1

They used Swiss-Manager from Heinz Herzog, and there is a link to the Swiss-Manager tournament file. If you have a copy of this program, you can recalculate the pairings yourself.

See also last year's discussions for people who manually checked the pairings of selected rounds.
moderncheckers moderncheckers 2/5/2018 05:06
@RayLopez: To be precise, Quanber did not argue that the significance level chosen was wrong, this is your new input.

What is usually known as "p-hacking" is choosing data and/or the significance level (i.e., the α-level, which is what you by sloppiness call the p-value; this is the maximal p-value which implies the rejection of the null hypothesis) specifically in such a way as to prove that the observation was significant.

Setting the α-level at 5% means that even provided the null hypothesis is true, it might be rejected once out of twenty times, not what you again sloppily stated.
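
The "once out of twenty times" reading of the α-level can be checked with a small simulation. The sketch below is a generic two-sided z-test on data for which the null hypothesis is true by construction; it is unrelated to the pairing data, and the function name and its parameters are invented for illustration.

```python
import random
from statistics import NormalDist


def false_positive_rate(alpha, trials=10_000, sample_size=30, seed=1):
    """Fraction of true-null experiments rejected by a two-sided z-test."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # The null hypothesis (mean 0, standard deviation 1) really holds.
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        z = sum(sample) / sample_size * sample_size ** 0.5
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


print(false_positive_rate(0.05))  # close to 0.05
print(false_positive_rate(0.01))  # close to 0.01
```

Lowering α from 0.05 to 0.01 makes a wrongful rejection of a true null hypothesis rarer, which is the direction argued for in this comment.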

Here the null hypothesis is that the organizers of the tournament did not cheat in the pairing. Cheating is a serious accusation, and in my opinion the null hypothesis should be rejected only if the evidence against it is overwhelming. But you are saying that the α-level set at 0.05, which is the most commonly chosen value in all of statistics, is too low, and that one should set it higher, so that (by definition of the α-level) the probability of erroneously rejecting the null hypothesis is higher. Now this is not how I imagine justice, and not how I imagine law courts should work, if they used statistics to confirm or reject accusations.

"What about using a 1% level?", you wrote. If I understand your intentions, you wrote it as a rhetorical question, but in fact, yes, using a 1% level would make sense here, considering the gravity of the accusation, as it would make the probability of wrongly accusing the organizers provided they are innocent lower.
RayLopez RayLopez 2/5/2018 04:06
@moderncheckers - Quanber made a good point however when pointing out a p-value of only 5% was taken for statistical significance. This is too low. It's a form of p-hacking. What about using a 1% level? p-value of 5% means once out of twenty times your statistical test is inconclusive.

@rgorn, others: I'd like to see the study that replicates the pairings. FIDE rules allow for manual pairings.
moderncheckers moderncheckers 2/5/2018 12:57
Dear Quanber the self-styled mathematician (or, as you write "Mathematician" – with the capital M, no less!),

now that you have written what you had wanted to write, even without looking at the starting list, please take a seat for a while and do the math.

"One of the major flaw in this analysis is the fact that the distribution of the female players is not equal. They have an average lower rating than the men. With some few exceptions. Hou Yifan is one of the exceptions, " you write.

Well, this is the list of ratings of the woman players at that tournament: {2651, 2583, 2558, 2546, 2530, 2524, 2512, 2492, 2468, 2455, 2454, 2452, 2452, 2447, 2443, 2430, 2418, 2413, 2407, 2406, 2390, 2387, 2387, 2378, 2378, 2375, 2370, 2370, 2364, 2352, 2303, 2259, 2247, 2227, 2164, 2108, 2095, 1966, 1954, 1924, 1913, 1829, 1707}. The average rating is 2327.63. Now please calculate the average rating of all the players, male and female. The result you will get is 2296.73.
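
The women's average is easy to recheck from the list above; a minimal sketch follows. The all-player figure of 2296.73 cannot be rechecked this way, because the full starting list is not reproduced in this thread.

```python
from statistics import mean

# Ratings of the 43 women, copied from the list in this comment.
women = [2651, 2583, 2558, 2546, 2530, 2524, 2512, 2492, 2468, 2455, 2454,
         2452, 2452, 2447, 2443, 2430, 2418, 2413, 2407, 2406, 2390, 2387,
         2387, 2378, 2378, 2375, 2370, 2370, 2364, 2352, 2303, 2259, 2247,
         2227, 2164, 2108, 2095, 1966, 1954, 1924, 1913, 1829, 1707]

print(len(women))             # 43
print(round(mean(women), 2))  # 2327.63
```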

Further you write: "Imagine that you are given the number 22 and have to pick another number from 1 - 255 . Out of these numbers there are 43 blue numbers, and 212 red numbers. The blue numbers are mostly placed at the bottom of the list ( numbers higher than 180 ) "

How can you say that they are at the bottom of the list? Most of the women were rated over 2350, and the bottom of the starting list of the tournament was almost exclusively male! Is it so hard for you to imagine a tournament where the average rating of the women is higher than that of the men?

Moreover, notice that the greatest "density" of women players, so to speak, was around 2450, that is not far from Hou's rating. If those women performed a bit above their ratings, or Hou a bit below hers, then the probability of her being paired with a woman should be quite high.

"As a Mathematician it offends me when someone use mathematics to prove they are right with false mathematical arguments," you write. Well, then you should feel offended by yourself this time.

Your comments are offensive not only for yourself though, but also for other users. First you denigrate all the women, by taking it for granted that their average rating must be lower than men's, that they are indeed necessarily "mostly placed at the bottom of the list". How is that not sexist? Then, to show that perhaps you are not sexist but a misanthrope, having the same contempt for all, you denigrate indiscriminately all the men, writing about "the male level here on Chessbase.com" as of something that is uniformly low. You also write wrongly that people call Hou an "idiot", which of course nobody in their sane mind would do. Perhaps because you like to call people "idiots" yourself? And finally, you denigrate all the chess world calling it an "aggressive selfie universe".

Could you please move your hatred out of this discussion?

As for me, I have a lot of respect for Hou's playing strength, but I find her way of protesting by throwing a game to be unsportspersonlike, and would not approve of such protest no matter who the protester was and whether the protest was statistically justified (which this protest turned out not to be, anyway).
rgorn rgorn 2/5/2018 12:47
What I learned last year is that in Gibraltar they use a program that implements FIDE rules to determine the pairings for each round. The first round pairings are determined by the players' rankings, and every other round's pairings are determined by the results of the previous round. FIDE rules are actually an algorithm, so everything is 100% deterministic and there isn't any element of chance. Manual intervention was minimal and did not concern Hou's pairings. Several people have reproduced the pairings of Gibraltar 2017 with this program, some even manually applied FIDE rules to check the program got it right.

The only thing that is not 100% deterministic are of course the results of each round. That's something that could be simulated with an appropriate model. Input: Some 128 Players with respective ELO. Calculate initial pairings according to algorithm. Run simulation for results of round 1. Calculate pairings for round 2. Run simulation for results of round 2. Etc. After the tournament is finished, count how many women Hou played. Then run the whole tournament simulation again and again. Calculate how many women Hou played on average. Plus every other number that might be of interest.
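
A rough sketch of such a simulation loop is below. Everything in it is a simplifying assumption rather than the FIDE algorithm or the Gibraltar field: the 128 players and their ratings are synthetic, about 17 % are randomly marked female, games have no draws and are decided by the standard Elo expectation, and the pairing step ignores colours and repeat pairings. All names are invented for illustration.

```python
import random


def expected_score(r_a, r_b):
    """Standard Elo expectation for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def simulate(ratings, is_female, hero, rounds, rng):
    """Play one simplified Swiss event; return how many women `hero` met."""
    n = len(ratings)
    scores = [0] * n
    women_met = 0
    for _ in range(rounds):
        # Build score groups, each ordered by rating (highest first).
        groups = {}
        for i in sorted(range(n), key=lambda i: -ratings[i]):
            groups.setdefault(scores[i], []).append(i)
        pairs, floater = [], []
        for s in sorted(groups, reverse=True):
            bracket = floater + groups[s]
            # An odd bracket drops its last player to the next score group.
            floater = [bracket.pop()] if len(bracket) % 2 else []
            half = len(bracket) // 2
            pairs += [(bracket[k], bracket[k + half]) for k in range(half)]
        for i in floater:  # odd field: the leftover player gets a bye
            scores[i] += 1
        for a, b in pairs:
            if hero in (a, b):
                women_met += is_female[b if a == hero else a]
            if rng.random() < expected_score(ratings[a], ratings[b]):
                scores[a] += 1
            else:
                scores[b] += 1
    return women_met


def average_women_met(trials=200, seed=0):
    """Average number of female opponents over many simulated events."""
    rng = random.Random(seed)
    n = 128
    ratings = sorted((rng.randint(1900, 2800) for _ in range(n)), reverse=True)
    is_female = [rng.random() < 0.17 for _ in range(n)]
    hero = 10  # a hypothetical strong player standing in for Hou
    is_female[hero] = True
    total = sum(simulate(ratings, is_female, hero, 10, rng) for _ in range(trials))
    return total / trials


print(average_women_met())
```

To say anything about Gibraltar 2017, the synthetic field would have to be replaced by the real starting list and the pairing step by a faithful implementation of the rules.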
Quanber Quanber 2/4/2018 11:28
Typical comment from AlphaZero. Unfortunately the male level here at Chessbase.com. First we abuse Hou Yifan with wrong statistics, claiming that she is an idiot. When the flaws in the use of a GOF test are demonstrated, we abuse her with something else.

It is like some members of the National Rifle Association: Yes, 20 000 Americans die every year from
gunshots. But you know what, even more people die from car accidents! So let's first ban cars!

I guess Hou Yifan will soon be lost to chess, as she starts at Oxford University. She got that opportunity
right before Tata Steel. So her thoughts were on a bright future, far away from the aggressive selfie
universe called chess, where we all do rather simple things, but think they are important. They are not.
AlphaZero1 AlphaZero1 2/4/2018 09:06
Well, at least she got 9 men in Tata. Not complaining now, I guess.
Quanber Quanber 2/4/2018 05:43
This discussion is not about criticizing the organizers' pairings.

It is about using a Goodness of Fit test to "prove" that the pairings were not particularly unlikely. According to the article, it was above 5% and therefore acceptable.

My argument, as a mathematician, is that the use of a GOF (Goodness of Fit) test is simply not acceptable, because the initial conditions do not fit a hypergeometric model.

Please focus on the subject, lajosarpad.

The pairings might be extremely unlikely and still not be based on cheating. I agree completely.

The cheating first starts with this article, when one explains that the result, 7 female opponents out of 10, is actually within the limits of a GOF test. Now that is nonsense, based on ignorance of the statistical parameters, as I explained. As a mathematician, it offends me when someone uses false mathematical arguments to prove they are right.

As you don't care about the article itself, your comment becomes a disgrace towards Hou Yifan, who had every reason to be amazed and never got a good explanation. This situation continues with chessbase.com's article.

Notice that Hou Yifan never accused anybody of cheating. She simply could not understand the pairings, and before female opponent no. 7 she did ask the organizers for a good explanation. She never got one.
lajosarpad lajosarpad 2/4/2018 12:28
maxharmonist is right. If we pick any player's pairings and calculate their probability, we will find similarly unlikely events, yet some people are suspicious of Yifan's pairings and not of anyone else's. It is rude to assume that the organizers manually interfered with the pairings, implying that they were lying when they denied this, so if we do that, we should have an extremely good reason. The unlikely nature of Hou Yifan's pairings is not a good reason, because if we apply this reasoning consistently, we will find that all Swiss tournaments with a large number of players resulted in pairings as unlikely as these. So if we believe this reasoning, we should assume that the organizers of virtually all Swiss tournaments manually altered the pairings, which is clearly false. I am not saying I know for a fact that the organizers did not manipulate the pairings in this case, but I am saying that we should see very strong arguments before accepting such claims.

Also, accepting the suspicions in this case on the basis of the weakly founded arguments we have seen would lead to a dead end, where we would have to heed the suspicions of every player who uses a similar fallacy to whine about their pairings.
maxharmonist maxharmonist 2/4/2018 07:13
"I believe some people independently confirmed the computer pairings"

Many did, and everyone was just as amazed at Hou throwing a game because she thought the organizers had tweaked her pairings. That the pairings turn out exactly as they do for any given player in these events is obviously a very low-probability outcome, statistically.
joshuar joshuar 2/4/2018 03:58
I believe some people independently confirmed the computer pairings using the same software. The question I have... if you delete the Male/Female entries in forming the player lists, does the very same program spit out the same results? Maybe there is a bug that is considering gender in its pairing algorithm.
RayLopez RayLopez 2/3/2018 08:06
The author writes, as justification of Hou's bizarre pairings: "For the Tradewise Gibraltar Masters 2017 tournament, with N=255, K=43 and n=10, I found that the odds that a participant had to face k=7 female opponents were 1 in 5311." What are the odds that the strongest woman player would be the one in 5311? Probably one in a million. I think what happened is that the organizers manually 'tweaked' the Swiss pairing program, which is permitted by the rules. I often notice in Swiss tournaments played in foreign countries that contestants from the same home country get paired with one another, even though the odds of this happening so frequently are small. I suspect the organizers, as is their right, manually override the Swiss pairing program for both the first and the last rounds, and maybe even for rounds in between. Swiss pairing is very complicated (the Dutch Swiss system, not the Norwegian Monrad system), with 'floaters' and byes and so on, so I am pretty sure organizers routinely tweak the Swiss Manager program or its equivalent. Much ado about nothing IMO, but that's an aside. I would just as soon switch from the Dutch Swiss to the Monrad Swiss system anyway, and do away with the "you cannot play the same player twice" taboo, which seems artificial. If the best match for you is a previously played player, why not play them again?
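For reference, the quoted figure can be checked directly from the hypergeometric formula. The short Python sketch below uses only the parameters quoted from the article (N=255, K=43, n=10, k=7); the function name is illustrative, and the "7 or more" tail is added here only for comparison and is not a number taken from the article.

```python
from math import comb

def hypergeom_pmf(N, K, n, k):
    """P(exactly k women among n opponents drawn without replacement
    from N players of whom K are women)."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

N, K, n = 255, 43, 10            # parameters as quoted from the article
p_exact = hypergeom_pmf(N, K, n, 7)
p_tail = sum(hypergeom_pmf(N, K, n, k) for k in range(7, n + 1))

print(f"P(exactly 7) = {p_exact:.6f}, about 1 in {1 / p_exact:.0f}")
print(f"P(7 or more) = {p_tail:.6f}, about 1 in {1 / p_tail:.0f}")
```

The "exactly 7" line should come out close to the quoted 1 in 5311. Note that this model treats every opponent as a uniformly random draw from the whole field, which is the assumption other commenters in this thread dispute.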
Quanber Quanber 2/3/2018 03:31
All statistical analysis depends critically on a correct evaluation of the initial conditions.

One of the major flaws in this analysis is that the female players are not evenly distributed across the field. On average they have a lower rating than the men, with a few exceptions. Hou Yifan is one of the exceptions.

Imagine that you are given the number 22 and have to pick another number from 1 to 255. Of these numbers, 43 are blue and 212 are red. The blue numbers are mostly placed at the bottom of the list (numbers higher than 180).

However, there is a restriction on your choice. You cannot just pick a number at random. You have to pick a number that is not too far from your own. I think everybody can then understand that the probability of picking a red number rises above what a random distribution model would give.

This is the situation here.

Hou Yifan is rated rather high compared to the other female players, and after some rounds she can only meet players with roughly the same score. This drastically reduces the number of women she can possibly meet.

Hou Yifan was ranked 22nd out of 255 participants. To my knowledge the next highest ranked woman in the tournament was no. 38, Ju Wenjun.

In order to give a fair probability of meeting 7 women, one must calculate the probability round by round, based on how many men and women Hou Yifan can actually meet given her actual score in that round. That would involve many more men, and only a few women, compared with a general hypergeometric model.

The conclusion is clear. One cannot in any sound way use a general Goodness of Fit test based on the hypergeometric model if the population is not representative. It is simply bad statistics.
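A round-by-round calculation of this kind can be sketched as follows, in Python. The sketch assumes the rounds can be treated as independent, with a separate probability of meeting a woman in each round, which is itself a simplification. The per-round value of 0.05 for a highly ranked player is a made-up illustrative number, not something derived from the Gibraltar data, and the function name is hypothetical.

```python
def women_count_distribution(per_round_probs):
    """Distribution of the total number of female opponents when round r
    yields a female opponent with probability per_round_probs[r],
    rounds assumed independent (a Poisson binomial distribution)."""
    dist = [1.0]                          # before any round: 0 women for sure
    for p in per_round_probs:
        new = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            new[k] += q * (1.0 - p)       # this round's opponent is a man
            new[k + 1] += q * p           # this round's opponent is a woman
        dist = new
    return dist

rounds = 10
# Every opponent drawn uniformly from the other 254 players, 43 of them women.
uniform = [43 / 254] * rounds
# HYPOTHETICAL: a top-ranked player whose score group contains few women.
restricted = [0.05] * rounds

tail_uniform = sum(women_count_distribution(uniform)[7:])
tail_restricted = sum(women_count_distribution(restricted)[7:])
print(f"P(7 or more), uniform pool:    {tail_uniform:.2e}")
print(f"P(7 or more), restricted pool: {tail_restricted:.2e}")
```

With real data, each entry of the list would be the share of women among the opponents actually available to the player in that round, given her score at that point.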

Best regards

Quanber
lajosarpad lajosarpad 2/3/2018 02:19
I do not know whether any conspiracy theory regarding the pairings is accurate, but the null hypothesis in such cases is that the organizers are innocent, so unless we get further information, the best assumption we can make is that they are innocent. Hou Yifan was acting as if they were not innocent, and the best argument she could come up with was that the probability of playing so many women was very small indeed. However, this is the so-called appeal to probability fallacy (https://en.wikipedia.org/wiki/Appeal_to_probability), assuming that not having that pairing is much more probable than having it.

If we consider the probability of any other player's pairings being exactly what they happened to be, we will find similarly slim chances, yet some consider those unlikely pairings normal and this one suspicious. This is the so-called Texas sharpshooter fallacy (https://en.wikipedia.org/wiki/Texas_sharpshooter_fallacy).

So, unless someone comes up with some proof of the suspicion, it is incorrect to give any importance to the conspiracy theories regarding the pairings. The infinite nature of time, space and matter makes sure that extremely unlikely events will happen regularly. Are they miracles or conspiracies? They might be, but if we apply Occam's razor to them we will find that assuming miracles or conspiracies is unnecessary and therefore scientifically unsound.
moderncheckers moderncheckers 2/3/2018 02:19
Great article! If your name is Meijer (as in Meijer G-function), then it is no wonder you are good at math. Next time, Hou Yifan should ask herself what the probability is that a 2650 player is invited to as many elite tournaments as she is.
CostaMaison3 CostaMaison3 2/3/2018 11:49
The pairing is not purely random. It considers the players' scores at the end of each round (e.g. a player who has scored 6 so far will more likely face a player with a score close to 6). This narrows the probability space when estimating the probability of facing a female player after each round.

In addition, the rating range of the female players is narrower than that of the males. The males are rated from 1600 to 2800+, whereas the females' range is narrower. This increases the likelihood of a female player being paired against another female player.
WALLFISH WALLFISH 2/3/2018 10:21
I think it would be more straightforward to snapshot the ratings and run simulations of the tournament, both pairings and games.
Omoplata Omoplata 2/3/2018 08:41
Excellent article; very interesting to see a thorough statistical analysis of this, and it's easy to see why Hou Yifan was suspicious/upset at the time, as it was such a rare occurrence.
tom_70 tom_70 2/3/2018 03:37
She was outclassed in the Tata Steel this year. Protest is one thing, but playing in a section you have ZERO chance of winning isn't smart either.
Heavygeardiver Heavygeardiver 2/3/2018 02:19
Now we know she was bamboozled!!