.. _zero-sum-games:

Zero Sum Games
==============

.. _motivating-example-zero-sum-games:

Motivating example: changing Rock Paper Scissors
------------------------------------------------

If we modify the game of :ref:`Rock Paper Scissora
<motivating-example-strategy-for-rps>` to add a single new strategy "Spock":

- Spock smashes Scissors and Rock.
- Paper disproves Spock.

The other modification is that the game is no longer symmetric: only the row
player can use the Well.

The mathematical representation of this game is given by:

.. math::

   A = \begin{pmatrix}
   0  & -1 & 1 \\
   1  & 0  & -1\\
   -1 & 1  & 0\\
   1  & -1 & 1
   \end{pmatrix}

Is there a way that the Row player can play that guarantees a particular
expected proportion of wins?

Optimising worst case outcomes: the min-max and max-min strategies
------------------------------------------------------------------

Given a zero sum game defined by :math:`A\in\mathbb{R}^{m\times n}` and a column
player with strategy :math:`y\in\mathbb{R}^{n}`, the row player is aiming to
find a :ref:`best response strategy <definition-of-best-response>`
:math:`x\in\mathbb{R}^{m}` which corresponds to:

.. math::

   \max_{x\in\mathcal{S}_1} xAy^T

This corresponds to finding the maximum row of :math:`Ay^T`:

.. math::

   \max_{i \leq m} (Ay^T)_i

The column player, through their choice of :math:`y` will be able to define the
upper bound :math:`v` of :math:`\max_{i \leq m} (Ay^T)_i`. In fact, as the game
is zero sum, there will be aiming to choose :math:`y` such that the upper bound
:math:`v` is as low as possible.

Thus,

.. math::

   \max_{x\in\mathcal{S}_1} xAy^T = \max_{i \leq m} (Ay^T)_i = \min\{v \in\mathbb{R} \;|\;Ay^T \leq \mathbb{1}v\}

Thus, the *min-max* strategy :math:`y` of the column player is a solution to the
following :ref:`linear program <definition-of-a-linear-program>`:

.. math::

   \min_{y, v}v

Subject to:

.. math::

   \begin{align*}
        Ay^T &\leq \mathbb{1}v \\
        y&\in\mathcal{S}_2
   \end{align*}

In this case, :math:`v` is called the *min-max* value of the game.

The corresponding *max-min* strategy :math:`x` of the row player is a solution
to the following :ref:`linear program <definition-of-a-linear-program>`:

.. math::

   \max_{x, u}u

Subject to:

.. math::

   \begin{align*}
        xA &\geq \mathbb{1}u \\
        x\in\mathcal{S}_1
   \end{align*}

In this case, :math:`v` is called the *max-min* value of the game.

For the :ref:`modified rock paper scissors game
<motivating-example-zero-sum-games>` the *max-min* strategy :math:`x` of the row
player will be a solution to the following linear program:

.. math::

   \max_{x, u}u

Subject to:

.. math::

   \begin{align*}
        x_2 - x_3 + x_4 &\geq u \\
        -x_1 + x_3 - x_4 &\geq u \\
        x_1 - x_3 + x_4 &\geq u \\
        x_1 + x_2 + x_3 + x_4 &= 1\\
        x_i &\geq 0 \text{ for all } i\in\{1, 2, 3, 4\}
   \end{align*}

.. admonition:: Question
   :class: note

   For the zero sum game :ref:`Matching Pennies <matching-pennies>`: with payoff
   matrix :math:`A = \begin{pmatrix} 1 & -1\\ -1 & 1 \end{pmatrix}`.

   1. What linear program is the *max-min* strategy :math:`x` a solution to?
   2. What is the *max-min* strategy?

.. admonition:: Answer
   :class: caution, dropdown

   1. The *max-min* strategy is a solution to the following linear program:
   
   .. math::

      \max_{x, u}u

   Subject to:

   .. math::

      \begin{align*}
           x_1 - x_2 &\geq u \\
           -x_1 + x_2&\geq u \\
           x_1 + x_2 &= 1\\
           x_i &\geq 0 \text{ for all } i\in\{1, 2\}
      \end{align*}

   2. Given that :math:`x_1+x_2=1` this linear program corresponds to:

   .. math::

      \max_{x_1, u}u

   Subject to:

   .. math::

      \begin{align*}
           2x_1 - 1 &\geq u \\
           -2x_1 + 1&\geq u \\
           0 \leq x_1 &\leq 1\\
      \end{align*}

   These contraints can be rewritten as:

   .. math::

      \begin{align*}
           x_1 &\geq \frac{1 + u}{2} \\
           x_1 &\leq \frac{1 - u}{2} \\
           0 &\leq x_1 \leq 1\\
      \end{align*}


   This implies that :math:`\frac{u + 1}{2}\leq x_1\leq \frac{1 - u}{2}` which
   implies :math:`u \leq -u` which is only true when :math:`u=0`. When
   :math:`u=0` the constraints become: :math:`\frac{1}{2} \leq x_1\leq
   \frac{1}{2}` giving: :math:`x_1=1/2`.

   The max-min strategy is thus: :math:`x=(1/2, 1/2)`.


.. _formulation-of-linear-program:

Re formulation of the linear program
------------------------------------

In a :ref:`Zero Sum Game <definition-of-zero-sum-game>`, given a row player
payoff matrix :math:`A` with :math:`m` rows and :math:`n` columns, the following
linear program will give the max-min strategy and value:

.. math::

   \min_{x\in\mathbb{R}^{(m + 1)\times 1}} cx

Subject to:

.. math::

   \begin{align*}
        M_{\text{ub}}x &\leq b_{\text{ub}} \\
        M_{\text{eq}}x &= b_{\text{eq}} \\
        x_i            &\geq 0&&\text{ for }i\leq m
   \end{align*}

Where the coefficients of the linear program are defined by:

.. math::

   \begin{align*}
       c &= (\underbrace{0, \dots, 0}_{m}, -1) && c\in\{0, 1\}^{1 \times (m + 1)}\\
       M_{\text{ub}} &= \begin{pmatrix}(-A^T)_{11}&\dots&(-A^T)_{1m}&1\\
                                       \vdots     &\ddots&\vdots           &1\\
                                       (-A^T)_{n1}&\dots&(-A^T)_{nm}&1\end{pmatrix} && M\in\mathbb{R}^{n\times (m + 1)}\\
       b_{\text{ub}} &= (\underbrace{0, \dots, 0}_{n})^T && b_{\text{ub}}\in\{0\}^{n\times 1}\\
       M_{\text{eq}} &= (\underbrace{1, \dots, 1}_{m}, 0) && M_{\text{eq}}\in\{0, 1\}^{1\times(m + 1)}\\
       b_{\text{eq}} &= 1 \\
   \end{align*}

This reformulation is in fact how the linear program is written in Nashpy's
source code.

For the :ref:`modified rock paper scissors game
<motivating-example-zero-sum-games>` the coefficients of the linear system are
given by:

.. math::

   \begin{align*}
       c &= (0, 0, 0, 0, -1)\\
       M_{\text{ub}} &= \begin{pmatrix}
           0 & -1 &  1 & -1 &  1\\
           1 &  0 & -1 &  1 &  1\\
           -1&  1.&  0.& -1.&  1
           \end{pmatrix}\\
       b_{\text{ub}} &= \begin{pmatrix}0\\0\\0\end{pmatrix}\\
       M_{\text{eq}} &= (1, 1, 1, 1, 0)\\
       b_{\text{eq}} &= 1 \\
   \end{align*}

.. admonition:: Question
   :class: note

   Obtain the coefficients of the reformulated linear system for the zero sum
   game :ref:`Matching Pennies <matching-pennies>`: with payoff matrix :math:`A
   = \begin{pmatrix} 1 & -1\\ -1 & 1 \end{pmatrix}`.

.. admonition:: Answer
   :class: caution, dropdown

   Here are the coefficients of the reformulated linear system:

   .. math::

      \begin{align*}
          c &= (0, 0, -1)\\
          M_{\text{ub}} &= \begin{pmatrix}
              -1 & 1 &  1\\
              1 &  -1 & 1\\
              \end{pmatrix}\\
          b_{\text{ub}} &= \begin{pmatrix}0\\0\end{pmatrix}\\
          M_{\text{eq}} &= (1, 1, 0)\\
          b_{\text{eq}} &= 1 \\
      \end{align*}


.. _the-minimax-theorem:

The minimax theorem
-------------------

The minimax theorem [vonNeumann1928]_ states that if there exists optimal
values of the:

1. *max-min* value :math:`u` and the *max-min* strategy :math:`x`.
2. *min-max* value :math:`v` and the *min-max* strategy :math:`y`.

then :math:`u=v`.

This can be proved using the :ref:`linear_program_duality_theorem`.

Note that this answers the question posed at the end of 
:ref:`motivating-example-zero-sum-games`: through a choice of strategy the row 
player can ensure they obtain the value of the game which is equal to the
*max-min* value and the *min-max* value.

Exercises
---------


Obtain the coefficients of the reformulated linear system for the zero sum
games with the following payoff matrices:

1. :math:`A = \begin{pmatrix} 3 & -1\\ -1 & 2 \end{pmatrix}`.
2. :math:`A = \begin{pmatrix} -1 & -1\\ -1 & 3 \end{pmatrix}`.
3. :math:`A = \begin{pmatrix} 2 & 1 & -3\\ -3 & -1 & 3 \end{pmatrix}`.
4. :math:`A = \begin{pmatrix} 3 & -2 & 0\\ -3 & 0 & 3 \\ 0 & 2 & -5 \end{pmatrix}`.

Using Nashpy
------------

See :ref:`how-to-use-minimax` for guidance of how to use Nashpy to
find the Nash equilibria of Zero sum games using the mini max theorem.