Chinese New Year Chinese Chess

Chinese New Year Chinese Chess?

Hi again! At the time of writing, February is fast approaching, and with it comes a long holiday that some of us will celebrate: Chinese New Year! In light of the coming festivities, I thought I would do something related to Chinese culture, though not so much the New Year itself.

And so I stumbled upon a rather interesting dataset. This dataset contains close to 10,000 games of an online version of Chinese Chess.

The dataset is split into two parts. The first contains game-level information, such as the players' IDs and ELO ratings and the winner of each game. The second contains the moves made in each game.

Game-level data:

   gameID  game_datetime         blackID          blackELO  redID            redELO  winner
---------  --------------------  --------------  ---------  --------------  -------  -------
 57380690  2017-02-10 13:11:10   levinhson            1382  baochang           1495  red
 57380691  2017-02-10 13:21:43   phucnguyen           1269  quanem             1282  red
 57380692  2017-02-10 13:23:04   pgm2785g             1253  swf2016g           1186  black
 57380693  2017-02-10 13:09:58   cbk4833g             1209  ftc484g            1426  black

Move-level data:

   gameID   turn  side    move
---------  -----  ------  -----
 57380690      1  red     C2.5
 57380690      2  red     H2+3
 57380690      3  red     R1.2
 57380690      4  red     P7+1

To start off, let’s assume that red always moves first, and investigate whether moving first necessarily leads to a win!

From the chart below, well, it does seem that starting first has its advantage, but is it really the case?

To dig deeper, let’s first take a look at the total number of moves made in each game!

Interestingly, from the chart below, we see a spike at 1 move. Come to think of it, it’s hardly surprising, considering that the game is played on the web, with minimal entry requirements and literally zero exit restrictions (you can leave at any time during the match).

And so, our next step is to remove games which are obviously not meaningful from our data (i.e. games where players exit before it is even possible to have made enough moves to win). But first, we need to find out the minimum number of moves one player needs to win. Well, the answer is 4!! This is a double-cannon attack that traps the king in its position while both of its advisors are poorly placed to defend it. Even then, looking at the move data for games where either player makes only 4 moves, we do not find any evidence of such a case happening.

So, to (somewhat arbitrarily) filter out games with early quitters (who are possibly noobs), I decided to set the cut-off at 20 total moves per game.
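The filter described above can be sketched in a few lines. This is only an illustration in Python, not the code used for this post (which appears to have been written in R); the tuple layout and the function name games_with_min_moves are assumptions that mirror the columns of the move-level table.

```python
from collections import Counter

# Hypothetical rows shaped like the move-level table: (gameID, turn, side, move).
moves = [
    (1, 1, "red", "C2.5"),
    (1, 1, "black", "H8+7"),
    (2, 1, "red", "C2.5"),  # game 2 was abandoned after a single move
]

def games_with_min_moves(moves, min_total=20):
    """Return the set of gameIDs that have at least `min_total` moves in total."""
    totals = Counter(game_id for game_id, _turn, _side, _move in moves)
    return {game_id for game_id, n in totals.items() if n >= min_total}

# With the toy rows above, a cut-off of 2 keeps only game 1.
kept = games_with_min_moves(moves, min_total=2)
```

The default cut-off of 20 matches the threshold chosen above; the toy call uses 2 only so that the tiny example keeps something.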

Starting the match right is always important, for any player of any skill level. So, as a first move (pun intended), I thought we could look at the most popular first moves. Note that because Chinese Chess is a turn-based game, it is only right that we focus on the first move by the red player, and not the black player, as the black player’s move is likely to be in response to the red player’s first move.

But before we move on to the chart below, let’s go through the notations used for the moves.

First, the letters stand for the pieces themselves:

  • K - King
  • A - Advisor (pieces next to the king)
  • E - Elephant
  • H - Horse
  • R - Chariot (also called the Rook or Car)
  • C - Cannon
  • P - Pawn

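Second, there are typically two numbers in the notation. Positions across the board (the files) are numbered 1 to 9 from right to left for each player. The first number is the file that the piece currently stands on. The second number is the file the piece moves to whenever the move changes file (horizontal moves, and the diagonal moves of the Horse, Elephant and Advisor); for a straight forward or backward move by a Chariot, Cannon, Pawn or King, it is instead the number of steps moved. For example, C2.5 moves the Cannon on file 2 sideways to file 5, while P7+1 pushes the Pawn on file 7 forward by one step.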

Lastly, there are three symbols that help to identify the kind of move that the piece is making, namely, + (plus sign), - (minus sign), and . (full stop). The + sign is a movement forward on the board (or into enemy territory), while the - sign is a movement backward on the board, and . is a horizontal movement on the board.
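To make the notation concrete, here is a minimal Python sketch that splits a move string into its four parts. The function name parse_move and the returned field names are made up for illustration, and the sketch only handles the plain four-character form shown in the data sample (anything else returns None).

```python
import re

PIECES = {"K": "King", "A": "Advisor", "E": "Elephant", "H": "Horse",
          "R": "Chariot", "C": "Cannon", "P": "Pawn"}
DIRECTIONS = {"+": "forward", "-": "backward", ".": "horizontal"}

# piece letter, starting file, direction symbol, second number
MOVE_RE = re.compile(r"^([KAEHRCP])([1-9])([+\-.])([1-9])$")

def parse_move(move):
    """Split a move such as 'C2.5' into its parts, or return None if it does not match."""
    m = MOVE_RE.match(move.strip())
    if m is None:
        return None  # other notation variants are deliberately not handled here
    piece, start, sign, number = m.groups()
    return {
        "piece": PIECES[piece],
        "from_file": int(start),
        "direction": DIRECTIONS[sign],
        "number": int(number),  # destination file or number of steps, depending on the move
    }

cannon_to_centre = parse_move("C2.5")
```

For C2.5 this gives a Cannon starting on file 2 and moving horizontally, with 5 as the second number.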

Red Player’s First Move

From the chart below, we find that the top two most common first moves of the red player involve the movement of either one of the Cannons to the centre of the board.

In fact, the Cannon is the first piece that is moved almost 68% of the time by the red player, and within the context of the game, this is likely to be taken as an offensive move as it immediately puts pressure on the opponent’s Pawn in the 5th position. Right behind at 11% is the movement of the elephant, which is likely to be viewed as a defensive move (given that the elephant is not able to cross the river). We also see the movement of the Pawn and Horse, with both coming in at about 9% each of all first moves.

It doesn’t make any practical sense to be moving the rest of the chess pieces for your first move, so I think we can safely ignore these and move on.

Piece Moved  Number  Proportion
C              6470        0.68
E              1088        0.11
P               903        0.09
H               831        0.09
A               214        0.02
R                32        0.00
K                 4        0.00
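A table like the one above can be tallied with a simple counter. The sketch below is a Python illustration under assumptions: the rows are invented toy data in the shape of the move-level table, and first_move_piece_shares is a hypothetical helper, not the code behind the post.

```python
from collections import Counter

# Hypothetical rows shaped like the move-level table: (gameID, turn, side, move).
moves = [
    (1, 1, "red", "C2.5"), (1, 1, "black", "H8+7"),
    (2, 1, "red", "C8.5"), (2, 1, "black", "C8.5"),
    (3, 1, "red", "E3+5"), (3, 1, "black", "C2.5"),
]

def first_move_piece_shares(moves, side="red"):
    """Map piece letter -> (count, proportion) for the given side's first move."""
    first = [move[0] for _game, turn, s, move in moves if s == side and turn == 1]
    counts = Counter(first)
    total = sum(counts.values())
    return {piece: (n, round(n / total, 2)) for piece, n in counts.most_common()}

shares = first_move_piece_shares(moves)
```

The same arithmetic applied to the counts in the table (6470 Cannon first moves out of 9542 games in total) gives the 0.68 shown for the Cannon.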

How does the opponent react?

Next, let us take a quick look at how the black player reacts depending on the move that the red player makes. We visualize this with a simple Sankey chart using the googleVis package.

There’s quite a lot of information in this chart, so let’s break it down into a few parts:

  1. Red Player moves Cannon - There are two main reactions to this move, given that the black player’s Pawn in position 5 is now under threat. The black player either makes a defensive move by moving his Horse to protect the Pawn, or chooses to make a “copycat” offensive move by moving the Cannon to a similar position. We also see some instances where the black player moves the Elephant, but one wonders about the reason behind this move, given that it leaves the Pawn in position 5 open to capture.
  2. Red Player moves Elephant, Pawn or Horse - We also see that should the red player choose to move a piece other than the Cannon, the black player takes this opportunity to put the red player under some pressure by moving the Cannon.

So… does your first move matter?

Next, based on just the first moves of each player, let’s see if it’s possible to tell if there’s a clear winner.

From the chart below, it’s not surprising to see that the first moves are not telling of a winner, and the game is really very open even after the first moves. This likely means that no matter what move you make at the start, there should still be a way for you to recover and win the match.

Since the first moves of each player are not telling, we then move on to look at the early game and see if it provides some hint of a possible winner.

Early Game

To understand the impact of the early game on the outcome, let’s focus on games which are well developed. These are games where each player makes at least 20 moves. I also keep only the first 10 moves of each player, for easier data manipulation.

Moving the Chariot Early

There is a saying in Chinese that “if you don’t move your chariot by move three, you will lose the battle”. We can never be entirely sure if such sayings continue to hold true, so what better way than to test it out. (From the chart below, we do find that most players try to move their chariot by the 3rd move.)

To do so, I first found, for each game and player, the move on which the player’s chariot was first moved, and then compared whether players were more likely to win if they moved their chariot by the third move.

Next, when we split each of the bars further into winners and losers, we find that the probability of winning doesn’t seem to depend on when you move your chariot. It seems easier to believe now that this old Chinese saying may no longer be true.
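The comparison described above can be sketched as follows. This is a Python illustration under assumptions, not the post's own code: the rows and the winners mapping are toy data, both function names are hypothetical, and any game whose winner is neither side simply counts as a loss for both players.

```python
# Hypothetical rows shaped like the move-level table: (gameID, turn, side, move).
moves = [
    (1, 3, "red", "R1.2"), (1, 5, "black", "R9.8"),
    (2, 1, "red", "C2.5"), (2, 2, "black", "R1+1"),
]
winners = {1: "red", 2: "black"}  # gameID -> winning side (toy data)

def first_chariot_turn(moves):
    """Map (gameID, side) -> earliest turn on which that side moved a Chariot (R)."""
    first = {}
    for game_id, turn, side, move in moves:
        if move.startswith("R"):
            key = (game_id, side)
            first[key] = min(turn, first.get(key, turn))
    return first

def win_rate_by_early_chariot(moves, winners, cutoff=3):
    """Win rate of players who did (True) or did not (False) move a Chariot by `cutoff`."""
    first = first_chariot_turn(moves)
    tally = {True: [0, 0], False: [0, 0]}  # [wins, players]
    players = {(game_id, side) for game_id, _turn, side, _move in moves}
    for game_id, side in players:
        early = first.get((game_id, side), float("inf")) <= cutoff
        tally[early][1] += 1
        tally[early][0] += winners.get(game_id) == side
    return {k: (w / n if n else None) for k, (w, n) in tally.items()}

rates = win_rate_by_early_chariot(moves, winners)
```

In the toy data both early movers happen to win, so the two rates are 1.0 and 0.0; on the real data the post finds no such gap.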

Moving the same piece too often

Another concept in Chinese Chess is efficiency, which is to move as many pieces as possible into advantageous positions so that your important troops can “control important positions” on the board as early as possible.

What this implies is that the first 10 moves of the match could be relatively crucial in determining the outcome of the match. For this last section, I will investigate if players who moved a single piece too often have a higher tendency to lose the game.

From the chart below, we can see that players typically move a particular type of piece no more than 3 times during the first 10 moves of the match. Hence, we will focus on the ~13% of players who moved a single type of piece 4 times or more during the first 10 moves of the match.
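One way to build such a per-player count is sketched below in Python. It is an assumption that the count used in the regression is the maximum number of times any single piece type was moved; the helper name max_piece_repeats and the rows are likewise made up for illustration.

```python
from collections import Counter

# Hypothetical rows shaped like the move-level table: (gameID, turn, side, move).
moves = [
    (1, 1, "red", "C2.5"), (1, 2, "red", "H2+3"), (1, 3, "red", "R1.2"),
    (1, 4, "red", "R2+4"), (1, 5, "red", "R2.6"), (1, 6, "red", "R6+1"),
    (1, 11, "red", "R6.7"),  # beyond the first 10 turns, so ignored
]

def max_piece_repeats(moves, first_n=10):
    """Map (gameID, side) -> most moves made with one piece type in the first `first_n` turns."""
    counts = {}
    for game_id, turn, side, move in moves:
        if turn <= first_n:
            counts.setdefault((game_id, side), Counter())[move[0]] += 1
    return {key: max(c.values()) for key, c in counts.items()}

repeats = max_piece_repeats(moves)
```

Here the red player moved a Chariot 4 times within the first 10 turns, so this player would fall into the "4 times or more" group.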

And… here are the results! The marginal effect below is negative, which is in line with the idea that moving a variety of pieces in the first 10 moves lets your pieces take up important positions that will stand you in good stead for the rest of the match. However, the saving grace is that the effect is not large: each additional move of the same type of piece in the first 10 moves reduces the probability of winning by only about 0.4 percentage points, and with a p-value of about 0.34 the estimate is not statistically distinguishable from zero.

Call:
probitmfx(formula = is_winner ~ count, data = eff, atmean = T)

Marginal Effects:
           dF/dx  Std. Err.       z  P>|z|
count -0.0041348  0.0043565 -0.9491 0.3426
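The z and P>|z| columns above follow directly from the estimate and its standard error. The short Python sketch below redoes that arithmetic with the standard library only; the helper name z_and_p is made up, and a two-sided test against the standard normal distribution is assumed.

```python
import math

def z_and_p(estimate, std_err):
    """Return the z statistic and the two-sided p-value under a standard normal."""
    z = estimate / std_err
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p

# Values taken from the marginal-effects table above.
z, p = z_and_p(-0.0041348, 0.0043565)
print(round(z, 4), round(p, 4))
```

The printed values should agree with the reported z and P>|z| up to rounding, which is a quick way to sanity-check how to read the table: the estimated effect is less than one standard error away from zero.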

Hope this inspires you to have a quick game of Chinese Chess during the upcoming Chinese New Year Holidays!

Data source obtained from Kaggle. Credit to Chang Hsin Lee for compiling this amazing dataset. Data can be accessed from this link.