Tuesday, June 30, 2015

Exponentially fast computation of Nash equilibria in matrix games: a weird inequality for polyhedral convex functions


Introduction: General considerations

[Figure: example of a bounded polyhedron, a.k.a. polytope]
Definition. A proper convex function (the qualifier "proper" means the function is not identically equal to $+\infty$) $g: \mathbb{R}^p \rightarrow [-\infty, +\infty]$ is said to be polyhedral if its epigraph $\mathrm{epi}(g) := \{(x, t) \in \mathbb{R}^{p+1} \mid t \ge g(x)\}$ is a (convex) polyhedron.

We have the following useful characterization (see [Section 2, Murota et al. 2013]): $g$ is polyhedral iff there exist a matrix $A = (a_i) \in \mathbb{R}^{m \times p}$ and a vector $c = (c_i) \in \mathbb{R}^m$ s.t. $g(x) = \text{max}\{\langle a_i, x \rangle + c_i \mid 1 \le i \le m\}$, $\forall x \in \mathrm{dom}(g)$. As usual, the set $\mathrm{dom}(g) := \{x \in \mathbb{R}^p \mid g(x) < +\infty\}$ is the effective domain of $g$; it is non-empty since $g$ is proper, by assumption. Given a compact convex subset $C \subseteq \mathrm{dom}(g)$, let $\mathrm{Opt}_C(g) \subseteq C$ be the set of minimizers of $g$ on $C$. When there is no confusion about the set $C$, we will omit the subscript and simply write $\mathrm{Opt}(g)$. Note that $\mathrm{Opt}_C(g)$ is itself convex.
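To make the max-of-affine characterization concrete, here is a minimal numerical sketch; the function name `polyhedral_eval` and the particular $A$, $c$ are illustrative choices, not from any reference:

```python
import numpy as np

def polyhedral_eval(A, c, x):
    """Evaluate g(x) = max_i <a_i, x> + c_i, where the rows a_i of A
    and the entries c_i of c encode the affine pieces of g."""
    return np.max(A @ x + c)

# Example: g(x) = |x| on the real line, i.e. max(x, -x) with two pieces.
A = np.array([[1.0], [-1.0]])
c = np.zeros(2)
print(polyhedral_eval(A, c, np.array([-3.0])))  # 3.0
```

The epigraph of this $g$ is the region above the "V" shape, the intersection of two half-spaces of $\mathbb{R}^2$, hence a polyhedron, as the definition requires.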


Minimax for matrix games and the weird inequality

It's not hard to see that the Nash equilibrium problem for a matrix game with payoff matrix $A \in \mathbb{R}^{n \times m}$ (so that $\langle Ax, y\rangle$ makes sense for $x \in \Delta_m$, $y \in \Delta_n$)
\begin{eqnarray}
\underset{x \in \Delta_m}{\text{minimize }}\underset{y \in \Delta_n}{\text{maximize }}\langle Ax, y\rangle
\end{eqnarray}
can be written as
\begin{eqnarray}
\underset{(x,y) \in \Delta_m \times \Delta_n}{\text{minimize }}G(x, y),
\end{eqnarray}
where
\begin{eqnarray}
\begin{split}
G(x, y) & := \underset{v \in \Delta_n}{\text{max }}\langle Ax, v\rangle - \underset{u \in \Delta_m}{\text{min }}\langle Au, y\rangle = \text{max}\{\langle Ax, v\rangle - \langle Au, y\rangle \mid (u,v) \in \Delta_m \times \Delta_n\} \\
&= \text{max}\{(Ax)_k - (A^Ty)_j \mid 1 \le k \le n, \ 1 \le j \le m\}
\end{split}
\end{eqnarray} is the primal-dual gap function; the last equality holds because a linear function attains its extrema over a simplex at the vertices, so restricting $(u, v)$ to the vertex pairs $(e_j, e_k)$ of $\Delta_m \times \Delta_n$ exhibits $G$ as a pointwise maximum of $mn$ affine functions of $(x, y)$. Thus $G$ is a polyhedral convex function on $\Delta_m \times \Delta_n$. Note that $G(x, y) \ge 0$ for every pair of strategies $(x, y) \in \Delta_m \times \Delta_n$, with equality iff $(x,y) \in \mathrm{Opt}(G)$. Also note that the elements of $\mathrm{Opt}(G)$ are simply the Nash equilibria of the game.
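Since the inner maximization and minimization are over simplices, they reduce to coordinate-wise max/min, so the gap function is cheap to evaluate. A minimal sketch (the function name `gap` and the toy matching-pennies payoff matrix are illustrative assumptions, not from the post):

```python
import numpy as np

def gap(A, x, y):
    """Primal-dual gap of the matrix game min_x max_y <Ax, y>,
    with x in the m-simplex, y in the n-simplex (A is n-by-m).
    Linear functions attain their extrema over a simplex at vertices,
    so the inner max/min over strategies reduce to max/min over coordinates."""
    return np.max(A @ x) - np.min(A.T @ y)

# Matching-pennies-style toy game: the uniform strategies form the
# unique Nash equilibrium, so the gap there should vanish.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
x = y = np.array([0.5, 0.5])
print(gap(A, x, y))  # 0.0, i.e. (x, y) is a Nash equilibrium
```

For any non-equilibrium pair the same function returns a strictly positive value, consistent with $G \ge 0$ with equality exactly on $\mathrm{Opt}(G)$.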

Definition. For a given tolerance $\epsilon > 0$, we call a pair of strategies $(x^*,y^*) \in \Delta_m \times \Delta_n$ a Nash $\epsilon$-equilibrium if $G(x^*, y^*) \le \epsilon$.

A remarkable result by [Andrew Gilpin et al.] (Lemma 3) says that for any matrix game, there exists a scalar $\delta > 0$ such that
 \begin{eqnarray} \mathrm{dist}((x, y), \mathrm{Opt}(G)) \le \dfrac{G(x, y)}{\delta}, \forall (x, y) \in \Delta_m \times \Delta_n, \label{eq:weirdo}
\end{eqnarray}

where $\mathrm{dist}((x, y), \mathrm{Opt}(G)) := \mathrm{min}\{\|(u,v) - (x,y)\| \mid (u, v) \in \mathrm{Opt}(G)\}$ is the distance of $(x, y)$ from $\mathrm{Opt}(G)$. Using this bound in conjunction with Nesterov smoothing, the authors derived an algorithm that computes a Nash $\epsilon$-equilibrium in $\mathcal{O}\left(\left(\|A\| / \delta\right) \ln\left(1 / \epsilon\right)\right)$ iterations. Note, however, that the supremum of the $\delta$'s verifying \eqref{eq:weirdo} can be arbitrarily small (in fact, we can explicitly construct games for which it is arbitrarily close to $0$), and so the exponentially fast convergence may not be experienced empirically on some ill-conditioned games. To be continued...
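For a game whose equilibrium set is known, the best constant $\delta$ in the inequality can be probed empirically by sampling strategy pairs and taking the smallest observed ratio $G(x,y)/\mathrm{dist}((x,y),\mathrm{Opt}(G))$. A hypothetical sketch for the matching-pennies game, whose unique equilibrium is the pair of uniform strategies (all names here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Matching pennies: Opt(G) is the single point (1/2, 1/2, 1/2, 1/2),
# so dist((x, y), Opt(G)) is just the Euclidean distance to that point.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
opt = np.array([0.5, 0.5, 0.5, 0.5])

def gap(A, x, y):
    return np.max(A @ x) - np.min(A.T @ y)

# Any valid delta must satisfy delta <= G(z) / dist(z, Opt) for all z,
# so the minimum sampled ratio upper-bounds the best delta for this game.
ratios = []
for _ in range(10000):
    x = rng.dirichlet(np.ones(2))   # random point of the 2-simplex
    y = rng.dirichlet(np.ones(2))
    d = np.linalg.norm(np.concatenate([x, y]) - opt)
    if d > 1e-9:
        ratios.append(gap(A, x, y) / d)
print(min(ratios))  # empirical upper bound on delta for this game
```

For this particular game one can check by hand that the ratio stays between $\sqrt{2}$ and $2$, so $\delta$ is well-behaved here; the ill-conditioned games alluded to above are precisely those where this minimum ratio collapses toward $0$.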
