# Best responses¶

## Motivating example: Best Responses in Matching Pennies¶

Considering the game Matching Pennies:

$\begin{split}A = \begin{pmatrix} 1 & -1\\ -1 & 1 \end{pmatrix} \qquad B = \begin{pmatrix} -1 & 1\\ 1 & -1 \end{pmatrix}\end{split}$

If the row player knows that the column player is playing the strategy $$\sigma_c=(0, 1)$$ the utility of the row player is maximised by playing $$\sigma_r=(0, 1)$$.

In this case $$\sigma_r$$ is referred to as a best response to $$\sigma_c$$.

Alternatively, if the column player knows that the row player is playing the strategy $$\sigma_r=(0, 1)$$ the column player’s best response is $$\sigma_c=(1, 0)$$.

## Definition of a best response in a normal form game¶

In a two player game $$(A,B)\in{\mathbb{R}^{m\times n}}^2$$ a strategy $$\sigma_r^*$$ of the row player is a best response to a column players’ strategy $$\sigma_c$$ if and only if:

$\sigma_r^*=\text{argmax}_{\sigma_r\in \mathcal{S}_1}\sigma_rA\sigma_c^T.$

Where $$\mathcal{S}_1$$ denotes the space of all strategies for the first player.

Similarly a mixed strategy $$\sigma_c^*$$ of the column player is a best response to a row players’ strategy $$\sigma_r$$ if and only if:

$\sigma_c^*=\text{argmax}_{\sigma_c\in \mathcal{S}_2}\sigma_rB\sigma_c^T.$

Question

For the Prisoners Dilemma:

What is the row player’s best response to either of the actions of the column player?

## Generic best responses in 2 by 2 games¶

In two player normal form games with $$|A_1|=|A_2|=2$$: a 2 by 2 game, the utility of a row player playing $$\sigma_r=(x, 1 - x)$$ against a strategy $$\sigma_c = (y, 1 - y)$$ is linear in $$x$$:

$\begin{split}u_r(\sigma_r, \sigma_c) &= (x, 1 - x) A (y, 1 - y) ^T \\ &= A_{11}xy + A_{12}x(1-y) + A_{21}(1-x)y + A_{22}(1-x)(1-y) \\ &= a x + b\end{split}$

where:

$\begin{split}a &= A_{11}y + A_{12}(1 - y) - A_{21}y - A_{22}(1 - y)\\ b &= A_{21}y + A_{22}(1 - y)\end{split}$

This observation allows us to obtain the best response $$\sigma_r^*$$ against any $$\sigma_c = (y, 1 - y)$$.

For example, consider Matching Pennies. Below is a plot of $$u_r(\sigma_r, \sigma_c)$$ as a function of $$y$$ for $$\sigma_r \in \{(1, 0), (0, 1)\}$$.

Given that the utilities in both cases are linear, the best response to any value of $$y \ne 1/2$$ is either $$(1, 0)$$ or $$(0, 1)$$. The best response $$\sigma_r^*$$ is given by:

$\begin{split}\sigma_r ^* = \begin{cases} (1, 0),& \text{ if } y > 1/2\\ (0, 1),& \text{ if } y < 1/2\\ \text{indifferent},& \text{ if } y=1/2 \end{cases}\end{split}$

Question

For the Matching Pennies game:

What is the column player’s best response as a function of $$x$$ where $$\sigma_r=(x, 1 - x)$$.

## General condition for a best response¶

In a two player game $$(A,B)\in{\mathbb{R}^{m\times n}}^2$$ a strategy $$\sigma_r^*$$ of the row player is a best response to a column players’ strategy $$\sigma_c$$ if and only if:

${\sigma_{r^*}}_i > 0 \Rightarrow (A\sigma_c^T)_i = \text{max}_{k \in \mathcal{A}_2}(A\sigma_c ^ T)_k \text{ for all }i \in \mathcal{A}_1$

### Proof¶

$$(A\sigma_c^T)_i$$ is the utility of the row player when they play their $$i^{\text{th}}$$ action. Thus:

$\sigma_rA\sigma_c^T=\sum_{i=1}^{m}{\sigma_r}_i(A\sigma_c^T)_i$

Let $$u=\max_{k}(A\sigma_c^T)_k$$ giving:

$\begin{split}\sigma_rA\sigma_c^T&=\sum_{i=1}^{m}{\sigma_r}_i(u - u + (A\sigma_c^T)_i)\\ &=\sum_{i=1}^{m}{\sigma_r}_iu - \sum_{i=1}^{m}{\sigma_r}_i(u - (A\sigma_c^T)_i)\\ &=u - \sum_{i=1}^{m}{\sigma_r}_i(u - (A\sigma_c^T)_i)\end{split}$

We know that $$u - (A\sigma_c^T)_i\geq 0$$, thus the largest $$\sigma_rA\sigma_c^T$$ can be is $$u$$ which occurs if and only if $${\sigma_r}_i > 0 \Rightarrow (A\sigma_c^T)_i = u$$ as required.

Question

For the Rock Paper Scissors game:

Which of the following pairs of strategies are best responses to each other:

1. $$\sigma_r=(0, 0, 1) \text{ and } \sigma_c=(0, 1/2, 1/2)$$

2. $$\sigma_r=(1/3, 1/3, 1/3) \text{ and } \sigma_c=(0, 1/2, 1/2)$$

3. $$\sigma_r=(1/3, 1/3, 1/3) \text{ and } \sigma_c=(1/3, 1/3, 1/3)$$

## Definition of Nash equilibrium¶

In a two player game $$(A, B)\in {\mathbb{R}^{m \times n}} ^ 2$$, $$(\sigma_r, \sigma_c)$$ is a Nash equilibria if $$\sigma_r$$ is a best response to $$\sigma_c$$ and $$\sigma_c$$ is a best response to $$\sigma_r$$.

## Using Nashpy¶

See Check if a strategy is a best response for guidance of how to use Nashpy to check if a strategy is a best response.