Zero Sum Games¶

Motivating example: changing Rock Paper Scissors¶

If we modify the game of Rock Paper Scissora to add a single new strategy “Spock”:

Spock smashes Scissors and Rock.
Paper disproves Spock.

The other modification is that the game is no longer symmetric: only the row player can use Spock.

The mathematical representation of this game is given by:

\[\begin{split}A = \begin{pmatrix} 0 & -1 & 1 \\ 1 & 0 & -1\\ -1 & 1 & 0\\ 1 & -1 & 1 \end{pmatrix}\end{split}\]

Is there a way that the Row player can play that guarantees a particular expected proportion of wins?

Optimising worst case outcomes: the min-max and max-min strategies¶

Given a zero sum game defined by \(A\in\mathbb{R}^{m\times n}\) and a column player with strategy \(y\in\mathbb{R}^{n}\), the row player is aiming to find a best response strategy \(x\in\mathbb{R}^{m}\) which corresponds to:

\[\max_{x\in\mathcal{S}_1} xAy^T\]

This corresponds to finding the maximum row of \(Ay^T\):

\[\max_{i \leq m} (Ay^T)_i\]

The column player, through their choice of \(y\) will be able to define the upper bound \(v\) of \(\max_{i \leq m} (Ay^T)_i\). In fact, as the game is zero sum, there will be aiming to choose \(y\) such that the upper bound \(v\) is as low as possible.

Thus,

\[\max_{x\in\mathcal{S}_1} xAy^T = \max_{i \leq m} (Ay^T)_i = \min\{v \in\mathbb{R} \;|\;Ay^T \leq \mathbb{1}v\}\]

Thus, the min-max strategy \(y\) of the column player is a solution to the following linear program:

\[\min_{y, v}v\]

Subject to:

\[\begin{split}\begin{align*} Ay^T &\leq \mathbb{1}v \\ y&\in\mathcal{S}_2 \end{align*}\end{split}\]

In this case, \(v\) is called the min-max value of the game.

The corresponding max-min strategy \(x\) of the row player is a solution to the following linear program:

\[\max_{x, u}u\]

Subject to:

\[\begin{split}\begin{align*} xA &\geq \mathbb{1}u \\ x\in\mathcal{S}_1 \end{align*}\end{split}\]

In this case, \(v\) is called the max-min value of the game.

For the modified rock paper scissors game the max-min strategy \(x\) of the row player will be a solution to the following linear program:

\[\max_{x, u}u\]

Subject to:

\[\begin{split}\begin{align*} x_2 - x_3 + x_4 &\geq u \\ -x_1 + x_3 - x_4 &\geq u \\ x_1 - x_3 + x_4 &\geq u \\ x_1 + x_2 + x_3 + x_4 &= 1\\ x_i &\geq 0 \text{ for all } i\in\{1, 2, 3, 4\} \end{align*}\end{split}\]

Question

For the zero sum game Matching Pennies: with payoff matrix \(A = \begin{pmatrix} 1 & -1\\ -1 & 1 \end{pmatrix}\).

What linear program is the max-min strategy \(x\) a solution to?
What is the max-min strategy?

Answer

The max-min strategy is a solution to the following linear program:

\[\max_{x, u}u\]

Subject to:

\[\begin{split}\begin{align*} x_1 - x_2 &\geq u \\ -x_1 + x_2&\geq u \\ x_1 + x_2 &= 1\\ x_i &\geq 0 \text{ for all } i\in\{1, 2\} \end{align*}\end{split}\]

Given that \(x_1+x_2=1\) this linear program corresponds to:

\[\max_{x_1, u}u\]

Subject to:

\[\begin{split}\begin{align*} 2x_1 - 1 &\geq u \\ -2x_1 + 1&\geq u \\ 0 \leq x_1 &\leq 1\\ \end{align*}\end{split}\]

These contraints can be rewritten as:

\[\begin{split}\begin{align*} x_1 &\geq \frac{1 + u}{2} \\ x_1 &\leq \frac{1 - u}{2} \\ 0 &\leq x_1 \leq 1\\ \end{align*}\end{split}\]

This implies that \(\frac{u + 1}{2}\leq x_1\leq \frac{1 - u}{2}\) which implies \(u \leq -u\) which is only true when \(u=0\). When \(u=0\) the constraints become: \(\frac{1}{2} \leq x_1\leq \frac{1}{2}\) giving: \(x_1=1/2\).

The max-min strategy is thus: \(x=(1/2, 1/2)\).

Re formulation of the linear program¶

In a Zero Sum Game, given a row player payoff matrix \(A\) with \(m\) rows and \(n\) columns, the following linear program will give the max-min strategy and value:

\[\min_{x\in\mathbb{R}^{(m + 1)\times 1}} cx\]

Subject to:

\[\begin{split}\begin{align*} M_{\text{ub}}x &\leq b_{\text{ub}} \\ M_{\text{eq}}x &= b_{\text{eq}} \\ x_i &\geq 0&&\text{ for }i\leq m \end{align*}\end{split}\]

Where the coefficients of the linear program are defined by:

\[\begin{split}\begin{align*} c &= (\underbrace{0, \dots, 0}_{m}, -1) && c\in\{0, 1\}^{1 \times (m + 1)}\\ M_{\text{ub}} &= \begin{pmatrix}(-A^T)_{11}&\dots&(-A^T)_{1m}&1\\ \vdots &\ddots&\vdots &1\\ (-A^T)_{n1}&\dots&(-A^T)_{nm}&1\end{pmatrix} && M\in\mathbb{R}^{n\times (m + 1)}\\ b_{\text{ub}} &= (\underbrace{0, \dots, 0}_{n})^T && b_{\text{ub}}\in\{0\}^{n\times 1}\\ M_{\text{eq}} &= (\underbrace{1, \dots, 1}_{m}, 0) && M_{\text{eq}}\in\{0, 1\}^{1\times(m + 1)}\\ b_{\text{eq}} &= 1 \\ \end{align*}\end{split}\]

This reformulation is in fact how the linear program is written in Nashpy’s source code.

For the modified rock paper scissors game the coefficients of the linear system are given by:

\[\begin{split}\begin{align*} c &= (0, 0, 0, 0, -1)\\ M_{\text{ub}} &= \begin{pmatrix} 0 & -1 & 1 & -1 & 1\\ 1 & 0 & -1 & 1 & 1\\ -1& 1.& 0.& -1.& 1 \end{pmatrix}\\ b_{\text{ub}} &= \begin{pmatrix}0\\0\\0\end{pmatrix}\\ M_{\text{eq}} &= (1, 1, 1, 1, 0)\\ b_{\text{eq}} &= 1 \\ \end{align*}\end{split}\]

Question

Obtain the coefficients of the reformulated linear system for the zero sum game Matching Pennies: with payoff matrix \(A = \begin{pmatrix} 1 & -1\\ -1 & 1 \end{pmatrix}\).

Answer

Here are the coefficients of the reformulated linear system:

\[\begin{split}\begin{align*} c &= (0, 0, -1)\\ M_{\text{ub}} &= \begin{pmatrix} -1 & 1 & 1\\ 1 & -1 & 1\\ \end{pmatrix}\\ b_{\text{ub}} &= \begin{pmatrix}0\\0\end{pmatrix}\\ M_{\text{eq}} &= (1, 1, 0)\\ b_{\text{eq}} &= 1 \\ \end{align*}\end{split}\]

The minimax theorem¶

The minimax theorem [vonNeumann1928] states that if there exists optimal values of the:

max-min value \(u\) and the max-min strategy \(x\).
min-max value \(v\) and the min-max strategy \(y\).

then \(u=v\).

This can be proved using the Linear Program Duality Theorem.

Note that this answers the question posed at the end of Motivating example: changing Rock Paper Scissors: through a choice of strategy the row player can ensure they obtain the value of the game which is equal to the max-min value and the min-max value.

Exercises¶

Obtain the coefficients of the reformulated linear system for the zero sum games with the following payoff matrices:

\(A = \begin{pmatrix} 3 & -1\\ -1 & 2 \end{pmatrix}\).
\(A = \begin{pmatrix} -1 & -1\\ -1 & 3 \end{pmatrix}\).
\(A = \begin{pmatrix} 2 & 1 & -3\\ -3 & -1 & 3 \end{pmatrix}\).
\(A = \begin{pmatrix} 3 & -2 & 0\\ -3 & 0 & 3 \\ 0 & 2 & -5 \end{pmatrix}\).

Using Nashpy¶

See Use the minimax theorem for guidance of how to use Nashpy to find the Nash equilibria of Zero sum games using the mini max theorem.