.. _zero-sum-games: Zero Sum Games ============== .. _motivating-example-zero-sum-games: Motivating example: changing Rock Paper Scissors ------------------------------------------------ If we modify the game of :ref:`Rock Paper Scissora ` to add a single new strategy "Spock": - Spock smashes Scissors and Rock. - Paper disproves Spock. The other modification is that the game is no longer symmetric: only the row player can use the Well. The mathematical representation of this game is given by: .. math:: A = \begin{pmatrix} 0 & -1 & 1 \\ 1 & 0 & -1\\ -1 & 1 & 0\\ 1 & -1 & 1 \end{pmatrix} Is there a way that the Row player can play that guarantees a particular expected proportion of wins? Optimising worst case outcomes: the min-max and max-min strategies ------------------------------------------------------------------ Given a zero sum game defined by :math:`A\in\mathbb{R}^{m\times n}` and a column player with strategy :math:`y\in\mathbb{R}^{n}`, the row player is aiming to find a :ref:`best response strategy ` :math:`x\in\mathbb{R}^{m}` which corresponds to: .. math:: \max_{x\in\mathcal{S}_1} xAy^T This corresponds to finding the maximum row of :math:`Ay^T`: .. math:: \max_{i \leq m} (Ay^T)_i The column player, through their choice of :math:`y` will be able to define the upper bound :math:`v` of :math:`\max_{i \leq m} (Ay^T)_i`. In fact, as the game is zero sum, there will be aiming to choose :math:`y` such that the upper bound :math:`v` is as low as possible. Thus, .. math:: \max_{x\in\mathcal{S}_1} xAy^T = \max_{i \leq m} (Ay^T)_i = \min\{v \in\mathbb{R} \;|\;Ay^T \leq \mathbb{1}v\} Thus, the *min-max* strategy :math:`y` of the column player is a solution to the following :ref:`linear program `: .. math:: \min_{y, v}v Subject to: .. math:: \begin{align*} Ay^T &\leq \mathbb{1}v \\ y&\in\mathcal{S}_2 \end{align*} In this case, :math:`v` is called the *min-max* value of the game. The corresponding *max-min* strategy :math:`x` of the row player is a solution to the following :ref:`linear program `: .. math:: \max_{x, u}u Subject to: .. math:: \begin{align*} xA &\geq \mathbb{1}u \\ x\in\mathcal{S}_1 \end{align*} In this case, :math:`v` is called the *max-min* value of the game. For the :ref:`modified rock paper scissors game ` the *max-min* strategy :math:`x` of the row player will be a solution to the following linear program: .. math:: \max_{x, u}u Subject to: .. math:: \begin{align*} x_2 - x_3 + x_4 &\geq u \\ -x_1 + x_3 - x_4 &\geq u \\ x_1 - x_3 + x_4 &\geq u \\ x_1 + x_2 + x_3 + x_4 &= 1\\ x_i &\geq 0 \text{ for all } i\in\{1, 2, 3, 4\} \end{align*} .. admonition:: Question :class: note For the zero sum game :ref:`Matching Pennies `: with payoff matrix :math:`A = \begin{pmatrix} 1 & -1\\ -1 & 1 \end{pmatrix}`. 1. What linear program is the *max-min* strategy :math:`x` a solution to? 2. What is the *max-min* strategy? .. admonition:: Answer :class: caution, dropdown 1. The *max-min* strategy is a solution to the following linear program: .. math:: \max_{x, u}u Subject to: .. math:: \begin{align*} x_1 - x_2 &\geq u \\ -x_1 + x_2&\geq u \\ x_1 + x_2 &= 1\\ x_i &\geq 0 \text{ for all } i\in\{1, 2\} \end{align*} 2. Given that :math:`x_1+x_2=1` this linear program corresponds to: .. math:: \max_{x_1, u}u Subject to: .. math:: \begin{align*} 2x_1 - 1 &\geq u \\ -2x_1 + 1&\geq u \\ 0 \leq x_1 &\leq 1\\ \end{align*} These contraints can be rewritten as: .. math:: \begin{align*} x_1 &\geq \frac{1 + u}{2} \\ x_1 &\leq \frac{1 - u}{2} \\ 0 &\leq x_1 \leq 1\\ \end{align*} This implies that :math:`\frac{u + 1}{2}\leq x_1\leq \frac{1 - u}{2}` which implies :math:`u \leq -u` which is only true when :math:`u=0`. When :math:`u=0` the constraints become: :math:`\frac{1}{2} \leq x_1\leq \frac{1}{2}` giving: :math:`x_1=1/2`. The max-min strategy is thus: :math:`x=(1/2, 1/2)`. .. _formulation-of-linear-program: Re formulation of the linear program ------------------------------------ In a :ref:`Zero Sum Game `, given a row player payoff matrix :math:`A` with :math:`m` rows and :math:`n` columns, the following linear program will give the max-min strategy and value: .. math:: \min_{x\in\mathbb{R}^{(m + 1)\times 1}} cx Subject to: .. math:: \begin{align*} M_{\text{ub}}x &\leq b_{\text{ub}} \\ M_{\text{eq}}x &= b_{\text{eq}} \\ x_i &\geq 0&&\text{ for }i\leq m \end{align*} Where the coefficients of the linear program are defined by: .. math:: \begin{align*} c &= (\underbrace{0, \dots, 0}_{m}, -1) && c\in\{0, 1\}^{1 \times (m + 1)}\\ M_{\text{ub}} &= \begin{pmatrix}(-A^T)_{11}&\dots&(-A^T)_{1m}&1\\ \vdots &\ddots&\vdots &1\\ (-A^T)_{n1}&\dots&(-A^T)_{nm}&1\end{pmatrix} && M\in\mathbb{R}^{n\times (m + 1)}\\ b_{\text{ub}} &= (\underbrace{0, \dots, 0}_{n})^T && b_{\text{ub}}\in\{0\}^{n\times 1}\\ M_{\text{eq}} &= (\underbrace{1, \dots, 1}_{m}, 0) && M_{\text{eq}}\in\{0, 1\}^{1\times(m + 1)}\\ b_{\text{eq}} &= 1 \\ \end{align*} This reformulation is in fact how the linear program is written in Nashpy's source code. For the :ref:`modified rock paper scissors game ` the coefficients of the linear system are given by: .. math:: \begin{align*} c &= (0, 0, 0, 0, -1)\\ M_{\text{ub}} &= \begin{pmatrix} 0 & -1 & 1 & -1 & 1\\ 1 & 0 & -1 & 1 & 1\\ -1& 1.& 0.& -1.& 1 \end{pmatrix}\\ b_{\text{ub}} &= \begin{pmatrix}0\\0\\0\end{pmatrix}\\ M_{\text{eq}} &= (1, 1, 1, 1, 0)\\ b_{\text{eq}} &= 1 \\ \end{align*} .. admonition:: Question :class: note Obtain the coefficients of the reformulated linear system for the zero sum game :ref:`Matching Pennies `: with payoff matrix :math:`A = \begin{pmatrix} 1 & -1\\ -1 & 1 \end{pmatrix}`. .. admonition:: Answer :class: caution, dropdown Here are the coefficients of the reformulated linear system: .. math:: \begin{align*} c &= (0, 0, -1)\\ M_{\text{ub}} &= \begin{pmatrix} -1 & 1 & 1\\ 1 & -1 & 1\\ \end{pmatrix}\\ b_{\text{ub}} &= \begin{pmatrix}0\\0\end{pmatrix}\\ M_{\text{eq}} &= (1, 1, 0)\\ b_{\text{eq}} &= 1 \\ \end{align*} .. _the-minimax-theorem: The minimax theorem ------------------- The minimax theorem [vonNeumann1928]_ states that if there exists optimal values of the: 1. *max-min* value :math:`u` and the *max-min* strategy :math:`x`. 2. *min-max* value :math:`v` and the *min-max* strategy :math:`y`. then :math:`u=v`. This can be proved using the :ref:`linear_program_duality_theorem`. Note that this answers the question posed at the end of :ref:`motivating-example-zero-sum-games`: through a choice of strategy the row player can ensure they obtain the value of the game which is equal to the *max-min* value and the *min-max* value. Exercises --------- Obtain the coefficients of the reformulated linear system for the zero sum games with the following payoff matrices: 1. :math:`A = \begin{pmatrix} 3 & -1\\ -1 & 2 \end{pmatrix}`. 2. :math:`A = \begin{pmatrix} -1 & -1\\ -1 & 3 \end{pmatrix}`. 3. :math:`A = \begin{pmatrix} 2 & 1 & -3\\ -3 & -1 & 3 \end{pmatrix}`. 4. :math:`A = \begin{pmatrix} 3 & -2 & 0\\ -3 & 0 & 3 \\ 0 & 2 & -5 \end{pmatrix}`. Using Nashpy ------------ See :ref:`how-to-use-minimax` for guidance of how to use Nashpy to find the Nash equilibria of Zero sum games using the mini max theorem.