.. _best-responses-discussion:

Best responses
==============

.. _motivating-example-matching-pennies:

Motivating example: Best Responses in Matching Pennies
------------------------------------------------------

Considering the game :ref:`matching-pennies`:

..  math::

    A = \begin{pmatrix}
    1 & -1\\
    -1 & 1
    \end{pmatrix}
    \qquad
    B = \begin{pmatrix}
    -1 & 1\\
    1 & -1
    \end{pmatrix}

If the row player knows that the column player is playing the :ref:`strategy
<strategies-discussion>` :math:`\sigma_c=(0, 1)` the utility of the row player
is maximised by playing :math:`\sigma_r=(0, 1)`.

In this case :math:`\sigma_r` is referred to as a **best response** to
:math:`\sigma_c`.

Alternatively, if the column player knows that the row player is playing the
:ref:`strategy <strategies-discussion>` :math:`\sigma_r=(0, 1)` the column
player's best response is :math:`\sigma_c=(1, 0)`.

.. _definition-of-best-response:

Definition of a best response in a normal form game
---------------------------------------------------

In a two player game :math:`(A,B)\in{\mathbb{R}^{m\times n}}^2` a strategy
:math:`\sigma_r^*`  of the row player is a best response to a column players'
strategy :math:`\sigma_c` if and only if:

.. math::

   \sigma_r^*=\text{argmax}_{\sigma_r\in \mathcal{S}_1}\sigma_rA\sigma_c^T.

Where :math:`\mathcal{S}_1` denotes the :ref:`space of all
strategies<definition-of-strategy-spaces-in-normal-form-games>` for the first
player.

Similarly a mixed strategy :math:`\sigma_c^*`  of the column player is a best
response to a row players' strategy :math:`\sigma_r` if and only if:

.. math::

   \sigma_c^*=\text{argmax}_{\sigma_c\in \mathcal{S}_2}\sigma_rB\sigma_c^T.


.. admonition:: Question
   :class: note

   For the :ref:`Prisoners Dilemma <prisoners-dilemma>`:

   What is the row player's best response to either of the actions of the
   column player?

.. admonition:: Answer
   :class: caution, dropdown

   Recalling that :math:`A` is given by:

   .. math::

      A = \begin{pmatrix}
      3 & 0\\
      5 & 1
      \end{pmatrix}

   Against the first action of the column player the best response is to choose
   the second action which gives a utility of 5. This can be expressed as:

   .. math::

      \text{argmax}_{i\in\mathcal{A}_1}A_{i1}=2

   Against the second action of the column player the best response is to choose
   the second action which gives a utility of 1. This can be expressed as:

   .. math::

      \text{argmax}_{i\in\mathcal{A}_1}A_{i2}=2


   The row player's best response to either of the actions of the column player
   is :math:`\sigma_r^*=(1,0)`. This can be expressed as:

   .. math::

      \text{argmax}_{i\in\mathcal{A}_1}A_{ij}=2\text{ for all }j\in\mathcal{A}_2

.. _definition_of_a_degenerate_game:

Definition of a degenerate game
-------------------------------

A two player game is called non degenerate if no mixed strategy of support size
:math:`k` has more than :math:`k` pure best responses.

.. admonition:: Question
   :class: note

   Show that the following game is degenerate:

   .. math::

      A =
      \begin{pmatrix}
           0 & -1 &  1\\
          -1 &  0 &  1\\
          -1 &  1 &  0
      \end{pmatrix}

.. admonition:: Answer
   :class: caution, dropdown

   The third column has two pure best responses.

.. _best_responses_in_2_by_2_games:

Generic best responses in 2 by 2 games
--------------------------------------

In two player normal form games with :math:`|A_1|=|A_2|=2`: a 2 by 2 game, the
utility of a row player playing :math:`\sigma_r=(x, 1 - x)` against a strategy
:math:`\sigma_c = (y, 1 - y)` is linear in :math:`x`:

.. math::

   u_r(\sigma_r, \sigma_c) &= (x, 1 - x) A (y, 1 - y) ^T \\
                           &= A_{11}xy + A_{12}x(1-y) + A_{21}(1-x)y + A_{22}(1-x)(1-y) \\
                           &= a x + b

where:

.. math::

   a &=  A_{11}y + A_{12}(1 - y) - A_{21}y - A_{22}(1 - y)\\
   b &=  A_{21}y + A_{22}(1 - y)

This observation allows us to obtain the best response :math:`\sigma_r^*`
against any :math:`\sigma_c = (y, 1 - y)`.

For example, consider :ref:`matching-pennies`. Below is a plot of
:math:`u_r(\sigma_r, \sigma_c)` as a function of :math:`y` for :math:`\sigma_r
\in \{(1, 0), (0, 1)\}`.

.. plot::

   import matplotlib.pyplot as plt
   import nashpy as nash
   import numpy as np

   A = np.array([[1, -1], [-1, 1]])
   game = nash.Game(A)
   ys = [0, 1]
   sigma_rs = [(1, 0), (0, 1)]
   u_rs = [[game[sigma_r, (y, 1 - y)][0] for y in ys] for sigma_r in sigma_rs]
   plt.plot(ys, u_rs[0], label="$(A\sigma_c^T)_1$")
   plt.plot(ys, u_rs[1], label="$(A\sigma_c^T)_2$")
   plt.xlabel("$\sigma_c=(y, 1-y)$")
   plt.title("Utility to row player")
   plt.legend()

Given that the utilities in both cases are linear, the best response to any
value of :math:`y \ne 1/2` is either :math:`(1, 0)` or :math:`(0, 1)`.
The best response :math:`\sigma_r^*` is given by:

.. math::

   \sigma_r ^* = \begin{cases}
                    (1, 0),& \text{ if } y > 1/2\\
                    (0, 1),& \text{ if } y < 1/2\\
                    \text{indifferent},& \text{ if } y=1/2
                 \end{cases}

.. _best_responses_condition:

.. admonition:: Question
   :class: note

   For the :ref:`matching-pennies` game:

   What is the column player's best response as a function of :math:`x` where
   :math:`\sigma_r=(x, 1 - x)`.

.. admonition:: Answer
   :class: caution, dropdown

   Recalling that :math:`B` is given by:

   .. math::

      B = \begin{pmatrix}
      -1 & 1\\
      1 & -1
      \end{pmatrix}

   This gives:

   .. math::

      u_c(\sigma_r, (1, 0)) =& -x + (1-x)= 1 - 2x\\
      u_c(\sigma_r, (0, 1)) =& x - (1-x)= -1 + 2x


   Here is a plot of the utilities:

   .. plot::

      import matplotlib.pyplot as plt
      import nashpy as nash

      xs = np.array([0, 1])
      u_cs = [1 - 2 * xs, - 1 + 2 * xs]
      plt.plot(xs, u_cs[0], label="$(\sigma_rB)_1$")
      plt.plot(xs, u_cs[1], label="$(\sigma_rB)_2$")
      plt.xlabel("$\sigma_r=(x, 1-x)$")
      plt.title("Utility to column player")
      plt.legend()

   The best response is given by:

   .. math::

      \sigma_c ^* = \begin{cases}
                        (1, 0),& \text{ if } x < 1/2\\
                        (0, 1),& \text{ if } x > 1/2\\
                        \text{indifferent},& \text{ if } x=1/2
                     \end{cases}

.. _best_response_condition:

General condition for a best response
-------------------------------------

In a two player game :math:`(A,B)\in{\mathbb{R}^{m\times n}}^2` a strategy
:math:`\sigma_r^*`  of the row player is a best response to a column players'
strategy :math:`\sigma_c` if and only if:

.. math::

   {\sigma_{r^*}}_i > 0 \Rightarrow (A\sigma_c^T)_i = \text{max}_{k \in \mathcal{A}_2}(A\sigma_c ^ T)_k \text{ for all }i \in \mathcal{A}_1


Proof
*****

:math:`(A\sigma_c^T)_i` is the utility of the row player when they play their
:math:`i^{\text{th}}` action. Thus:

.. math::

   \sigma_rA\sigma_c^T=\sum_{i=1}^{m}{\sigma_r}_i(A\sigma_c^T)_i

Let :math:`u=\max_{k}(A\sigma_c^T)_k` giving:

.. math::

   \sigma_rA\sigma_c^T&=\sum_{i=1}^{m}{\sigma_r}_i(u - u + (A\sigma_c^T)_i)\\
                      &=\sum_{i=1}^{m}{\sigma_r}_iu - \sum_{i=1}^{m}{\sigma_r}_i(u - (A\sigma_c^T)_i)\\
                      &=u - \sum_{i=1}^{m}{\sigma_r}_i(u - (A\sigma_c^T)_i)

We know that :math:`u - (A\sigma_c^T)_i\geq 0`, thus the largest
:math:`\sigma_rA\sigma_c^T` can be is :math:`u` which occurs if and only if
:math:`{\sigma_r}_i > 0 \Rightarrow (A\sigma_c^T)_i = u` as required.

.. admonition:: Question
   :class: note

   For the :ref:`Rock Paper Scissors <motivating-example-strategy-for-rps>`
   game:

   Which of the following pairs of strategies are best responses to each other:

   1. :math:`\sigma_r=(0, 0, 1) \text{ and } \sigma_c=(0, 1/2, 1/2)`
   2. :math:`\sigma_r=(1/3, 1/3, 1/3) \text{ and } \sigma_c=(0, 1/2, 1/2)`
   3. :math:`\sigma_r=(1/3, 1/3, 1/3) \text{ and } \sigma_c=(1/3, 1/3, 1/3)`

.. admonition:: Answer
   :class: caution, dropdown

   Recalling that :math:`A` and :math:`B` are given by:


   .. math::

      A = \begin{pmatrix}
      0  & -1 & 1 \\
      1  & 0  & -1\\
      -1 & 1  & 0\\
      \end{pmatrix}

   .. math::

      B = - A = \begin{pmatrix}
      0  & 1 & -1 \\
      -1  & 0  & 1\\
      1 & -1  & 0\\
      \end{pmatrix}

   We can apply the best response condition to each pairs of strategies:

   1. :math:`A\sigma_c^T = \begin{pmatrix}0\\ -1/2\\ 1/2\\\end{pmatrix}`.
      :math:`\text{max}(A\sigma_c^T)=1/2`. The only :math:`i` for which
      :math:`{\sigma_r}_i > 0` is :math:`i=3` and
      :math:`(A\sigma_c^T)_3=\text{max}(A\sigma_c^T)` thus :math:`\sigma_r`
      **is a best response to** :math:`\sigma_c`.  :math:`\sigma_rB = (1, -1,
      0)`.  :math:`\text{max}(\sigma_rB)=1`. The values of :math:`i` for
      which :math:`{\sigma_c}_i > 0` are :math:`i=2` and :math:`i=3` but
      :math:`(\sigma_r B)_2 \ne \text{max}(\sigma_r B)` thus :math:`\sigma_c`
      **is not a best response to** :math:`\sigma_r`.
   2. :math:`A\sigma_c^T = \begin{pmatrix}0\\ -1/2\\ 1/2\\\end{pmatrix}`.
      :math:`\text{max}(A\sigma_c^T)=1/2`. The values of :math:`i` for which
      :math:`{\sigma_r}_i > 0` are :math:`i=1`, :math:`i=2` and :math:`i=3`
      however, :math:`(A\sigma_c^T)_2 \ne \text{max}(A\sigma_c^T)` thus
      :math:`\sigma_r` **is not a best response to** :math:`\sigma_c`.
      :math:`\sigma_rB = (0, 0, 0)`.  :math:`\text{max}(\sigma_rB)=0`. The
      values of :math:`i` for which :math:`{\sigma_c}_i > 0` are :math:`i=2`
      and :math:`i=3` and :math:`(\sigma_r B)_2 = (\sigma_r B)_3=
      \text{max}(\sigma_r B)` thus :math:`\sigma_c` **is a best response to**
      :math:`\sigma_r`.
   3. :math:`A\sigma_c^T = \begin{pmatrix}0\\ 0\\ 0\\\end{pmatrix}`.
      :math:`\text{max}(A\sigma_c^T)=0`. The values of :math:`i` for which
      :math:`{\sigma_r}_i > 0` are :math:`i=1`, :math:`i=2` and :math:`i=3`
      and :math:`(A\sigma_c^T)_1=(A\sigma_c^T)_2 = (A\sigma_c^T)_3
      =\text{max}(A\sigma_c^T)` thus :math:`\sigma_r` **is a best response
      to** :math:`\sigma_c`.  :math:`\sigma_rB = (0, 0, 0)`.
      :math:`\text{max}(\sigma_rB)=0`. The values of :math:`i` for which
      :math:`{\sigma_c}_i > 0` are :math:`i=1`, :math:`i=2` and :math:`i=3`
      and :math:`(\sigma_r B)_1 =(\sigma_r B)_2 = (\sigma_r B)_3=
      \text{max}(\sigma_r B)` thus :math:`\sigma_c` **is a best response to**
      :math:`\sigma_r`.


Definition of Nash equilibrium
------------------------------

In a two player game :math:`(A, B)\in {\mathbb{R}^{m \times n}} ^ 2`,
:math:`(\sigma_r, \sigma_c)` is a Nash equilibria if :math:`\sigma_r` is a best
response to :math:`\sigma_c` and :math:`\sigma_c` is a best response to
:math:`\sigma_r`.

Exercises
---------


1. For the following games identify the best responses:

   1. 

   .. math:: 

      A = 
      \begin{pmatrix}
      2 & 1\\
      1 & 1\end{pmatrix}
      \qquad
      B =
      \begin{pmatrix}
         1 & 1\\
         1 & 3\end{pmatrix}

   2. 

   .. math::

      A = 
      \begin{pmatrix}
      2  & 1 & 3 & 17\\
      27 & 3 & 1 & 1\\
      4  & 6 & 7 & 18
      \end{pmatrix}
      \qquad
      B =
      \begin{pmatrix}
      11 & 9  & 10 & 22\\
      0  & 1  & 1  & 0\\
      2  & 10 & 12 & 0
      \end{pmatrix}

   3. 

   .. math:: 

      A =
      \begin{pmatrix}
      3  & 3 & 2 \\
      2  & 1 & 3 
      \end{pmatrix}
      \qquad
      B =
      \begin{pmatrix}
      2  & 1 & 3 \\
      2  & 3 & 2 
      \end{pmatrix}

   4. 

   .. math::

      A =
      \begin{pmatrix}
      3 & -1\\
      2 & 7\end{pmatrix}
      \qquad
      B =
      \begin{pmatrix}
      -3 & 1\\
      1 & -6\end{pmatrix}

2. Represent the following game in normal form:

      Assume two neighbouring countries have at their disposal very
      destructive armies. If both countries attack each other the
      countries’ civilian population will suffer 10 thousand casualties.
      If one country attacks whilst the other remains peaceful, the
      peaceful country will lose 15 thousand casualties but would also
      retaliate causing the offensive country 13 thousand casualties. If
      both countries remain peaceful then there are no casualties.

   1. Clearly state the players and strategy sets.
   2. Plot the utilities to both countries assuming that they play a
      mixed strategy while the other country remains peaceful.
   3. Plot the utilities to both countries assuming that they play a
      mixed strategy while the other country attacks.
   4. Obtain the best responses of each player.

3. Construct a degenerate :math:`3\times 3` game.

Using Nashpy
------------

See :ref:`how-to-check-best-responses` for guidance of how to
use Nashpy to check if a strategy is a best response.