Well-Supported versus Approximate Nash Equilibria: Query Complexity of Large Games

Xi Chen, Yu Cheng, Bo Tang

Introduction

The celebrated theorem of Nash [Nas50] states that every finite game has an equilibrium point. The solution concept of Nash equilibrium (NE) has been tremendously influential in economics and social sciences ever since (e.g. see [HR04]). The complexity and efficient approximation of NE have been studied intensively during the past decade, and much progress has been made (e.g., see [LMM03, AKV05, BVV05, KT07, TS07, DGP09, CDT09, EY10, Meh14, DP15, BPR15, CDO15, DKT15, Rub15, DKS15, Bar15]).

In this paper, we study the randomized query complexity of computing an $\epsilon$-approximate Nash equilibrium (ANE) in large games, for some constant $\epsilon>0$. Given a game $\mathcal{G}$ with $n$ players and $\alpha$ actions for each player, we index the players by the set $[n]=\{1,\ldots,n\}$ and index the actions by the set $[\alpha]=\{1,\ldots,\alpha\}$. Recall that an $\epsilon$-ANE of $\mathcal{G}$ is a mixed strategy profile

$$\boldsymbol{x}=(\boldsymbol{x}_{1},\ldots,\boldsymbol{x}_{n})$$

in which each player $i$ plays an $\epsilon$-best response $\boldsymbol{x}_{i}$ to the other players' strategies $\boldsymbol{x}_{-i}$ (see the precise definition in Section 2). Here we follow the convention and write $\boldsymbol{x}_{-i}:=(\boldsymbol{x}_{1},\ldots,\boldsymbol{x}_{i-1},\boldsymbol{x}_{i+1},\ldots,\boldsymbol{x}_{n})$ for the strategies of players other than $i$ in $\boldsymbol{x}$. Since the notion of ANE (as well as that of well-supported Nash equilibria to be discussed below) is additive, we always assume that the payoff functions of games considered throughout this paper take values between $0$ and $1$.

For the (payoff) query model, an oracle algorithm with unlimited computational power is given an approximation parameter $\epsilon$ , the number of players $n$ and the number of actions $\alpha$ in an unknown game $\mathcal{G}$ , and needs to find an $\epsilon$ -ANE of $\mathcal{G}$ . The algorithm has oracle access to the payoff functions of players in $\mathcal{G}$ : For each round, the algorithm can adaptively query a pure strategy profile $\boldsymbol{a}\in[\alpha]^{n}$ , and receives the payoff of every player with respect to $\boldsymbol{a}$ . We are interested in the number of queries needed by any randomized oracle algorithm for this task. Note that a trivial upper bound is $\alpha^{n}$ , by simply querying all the pure strategy profiles.
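The query model can be made concrete with a small sketch. The class name `PayoffOracle`, the helper `learn_whole_game`, and the toy game are hypothetical and serve only as an illustration; actions are indexed from 0 for convenience. The sketch counts queries and shows that the trivial strategy uses exactly $\alpha^{n}$ of them.

```python
from itertools import product
from typing import Callable, Sequence, Tuple

Profile = Tuple[int, ...]  # pure strategy profile, actions indexed 0..alpha-1


class PayoffOracle:
    """Hypothetical wrapper: answers payoff queries and counts them."""

    def __init__(self, n: int, alpha: int,
                 payoff: Callable[[Profile], Sequence[float]]) -> None:
        self.n, self.alpha, self._payoff = n, alpha, payoff
        self.queries = 0

    def query(self, a: Profile) -> Tuple[float, ...]:
        """Return the payoffs of all n players under the pure profile a."""
        assert len(a) == self.n and all(0 <= k < self.alpha for k in a)
        self.queries += 1
        return tuple(self._payoff(a))


def learn_whole_game(oracle: PayoffOracle) -> dict:
    """Trivial strategy: query every pure profile (alpha**n queries)."""
    return {a: oracle.query(a)
            for a in product(range(oracle.alpha), repeat=oracle.n)}


# Toy 3-player binary-action game: a player earns 1 when matching player 0.
toy = PayoffOracle(
    3, 2, lambda a: [1.0 if a[i] == a[0] else 0.0 for i in range(3)])
table = learn_whole_game(toy)
```

Once the whole payoff table is known, an equilibrium can be found with no further queries, which is why only the number of queries (and not the running time) is bounded in this model.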

The query complexity of (approximate) Nash equilibria and related solution concepts has received considerable attention recently, e.g. see [FGGS13, HN13, FS14, GR14, Bab14, BB15, GT15]. Below we review results that are most relevant to our work.

Turning to the harder, but perhaps more interesting, problem of approximating Nash equilibria under the payoff query model, the deterministic lower bound of [HN13] for $(1/2)$-approximate correlated equilibria (CE) directly implies the same bound for $(1/2)$-ANE, since any $\epsilon$-ANE is by definition also an $\epsilon$-CE. For the randomized query complexity of NE, Babichenko [Bab14] showed that any randomized algorithm requires $2^{\Omega(n)}$ queries to find an $\epsilon$-well-supported Nash equilibrium (WSNE) in a binary-action, $n$-player game (see Theorem 1.4). Recall that an $\epsilon$-WSNE of a game is a mixed strategy profile $\boldsymbol{x}$ in which the probability of player $i$ playing action $j$ is positive only when action $j$ is an $\epsilon$-best response with respect to $\boldsymbol{x}_{-i}$ (see Section 2 for the precise definition). By definition, an $\epsilon$-WSNE is also an $\epsilon$-ANE, but the converse is not true. Following a well-known connection between WSNE and ANE [DGP09] (and using random samples to approximate expected payoffs), Babichenko [Bab14] showed that the same $2^{\Omega(n)}$ bound holds for the randomized query complexity of $\epsilon$-ANE, but only when $\epsilon=O(1/n)$. The randomized query complexity of $\epsilon$-ANE in large games, an arguably more natural relaxation of exact NE compared to WSNE, remains an open problem when $\epsilon>0$ is a constant.

Our Results

For binary-action, $n$ -player games, we show that $2^{\Omega(n/\log n)}$ queries are required for any randomized algorithm to find an $\epsilon$ -ANE, for some constant $\epsilon>0$ . To state the result, we use $\text{{QC}}_{p}(\textbf{ANE}(n,\epsilon))$ , for some $p>0$ , to denote the smallest $T$ such that there exists a randomized oracle algorithm that uses no more than $T$ queries and outputs an $\epsilon$ -ANE with probability at least $p$ , given any unknown binary-action, $n$ -player game. Our main result is the following lower bound on $\text{{QC}}_{p}(\textbf{ANE}(n,\epsilon))$ :

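Theorem 1.1. There exist two constants $\epsilon>0$ and $c>0$ such that

$$\text{QC}_{p}(\textbf{ANE}(n,\epsilon))=2^{\Omega(n/\log n)},\quad\text{where } p=2^{-cn/\log n}.$$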

Our lower bound answers an open problem posed by Hart and Nisan [HN13] and by Babichenko [Bab14]. Our result shows that, in terms of their query complexities, finding an $\epsilon$ -ANE is almost as hard as finding an $\epsilon$ -WSNE in a large game, even for constant $\epsilon>0$ . It also implies the following corollary regarding the rate of convergence of $k$ -queries dynamics (see [Bab14] for the definition).

There exist two constants $\epsilon>0$ and $c>0$ such that no $k$ -queries dynamic can converge to an $\epsilon$ -ANE in $\smash{2^{\Omega(n/\log n)}/k}$ steps with probability at least $2^{-{cn}/{\log n}}$ in all binary-action and $n$ -player games.

In addition to the randomized query complexity, our proof of Theorem 1.1 yields a polynomial-time reduction from the problem of finding an $\epsilon$-WSNE to that of finding an $(\epsilon^{\prime}=\Omega(\epsilon))$-ANE in a succinct game with a fixed number of actions. (Recall that a polynomial-time reduction from a total search problem $A$ to a total search problem $B$ is a pair $(f,g)$ of polynomial-time computable functions such that: 1) for every input instance $x$ of $A$, $f(x)$ is an input instance of $B$; and 2) for every solution $y$ to $f(x)$ in $B$, $g(y)$ is a solution to $x$ in $A$.) Following the definition from [PR08], we say that an $\alpha$-action succinct game is a pair $(n,U)$, where $n$ is the number of players and $U$ is a (multi-output) Boolean circuit that, given a pure strategy profile $\boldsymbol{a}\in[\alpha]^{n}$ (encoded in binary), outputs the payoffs of all $n$ players with respect to $\boldsymbol{a}$ in the game. We show that such a reduction exists for every fixed number of actions $\alpha$ (Theorem 1.3).

Approximate vs Well-Supported Nash Equilibria

Let $\text{{QC}}_{p}(\textbf{WSNE}(n,\epsilon))$ , for some $p>0$ , denote the smallest $T$ such that there exists a randomized oracle algorithm that uses no more than $T$ queries and outputs an $\epsilon$ -WSNE with probability at least $p$ , given any unknown binary-action, $n$ -player game. Babichenko showed that

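Theorem 1.4 ([Bab14]). There exist two constants $\epsilon>0$ and $c>0$ such that

$$\text{QC}_{p}(\textbf{WSNE}(n,\epsilon))=2^{\Omega(n)},\quad\text{where } p=2^{-cn}.$$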

Given the result of Babichenko as above, the same exponential lower bound for the randomized query complexity of $\epsilon$ -ANE, for small enough constant $\epsilon>0$ , would follow immediately if

Given oracle access to $\mathcal{G}$ and any $\epsilon^{\prime}$ -ANE of $\mathcal{G}$ , where $\epsilon^{\prime}=c(\alpha)\cdot\epsilon$ for some constant $c>0$ that only depends on $\alpha$ , there is a query-efficient procedure that outputs an $\epsilon$ -WSNE of $\mathcal{G}$ .

However, the best such procedure known is the following result from [DGP09]. The parameters were subsequently improved in [Bab14], where the number of queries needed was also analyzed:

Given oracle access to $\mathcal{G}$ and any $\epsilon^{2}/(16n)$ -ANE of $\mathcal{G}$ , there is a procedure that outputs an $\epsilon$ -WSNE of $\mathcal{G}$ , where $n$ is the number of players, using $\text{poly}(\alpha,n,1/\epsilon)$ payoff queries.

The procedure is very natural: For each player, reallocate probabilities on actions with a relatively low expected payoff to a best-response action. By Theorem 1.4, such a procedure implies the same exponential lower bound for $\epsilon$ -ANE [Bab14] but only when $\epsilon$ is $O(1/n)$ .

No better procedure is known. By definition, an ANE imposes a slightly weaker condition on each player than a WSNE does. More specifically, given mixed strategies of other players $\boldsymbol{x}_{-i}$, for an $\epsilon$-WSNE, $\boldsymbol{x}_{i}$ must be supported on actions that are $\epsilon$-best responses to $\boldsymbol{x}_{-i}$, while in an $\epsilon$-ANE, $\boldsymbol{x}_{i}$ can be any mixed strategy that yields an overall $\epsilon$-best response to $\boldsymbol{x}_{-i}$. For example, $\boldsymbol{x}_{i}$ may put $1-\epsilon$ probability on best-response actions while putting $\epsilon$ probability on arbitrary other actions. On the one hand, this makes WSNE much easier to analyze and control in hardness reductions, which is why it played a critical role in characterizing the complexity of Nash equilibria, starting with the work of [DGP09], and later in [CDT09] and subsequent works. On the other hand, as the $\epsilon$ of interest in [DGP09, CDT09] is either exponentially or polynomially small, any hardness result for $\epsilon$-WSNE yields the same result for $\epsilon$-ANE (by combining the procedure of [DGP09] described above and a folklore padding argument).

Our Approach

While we were not able to improve the procedure of [DGP09, Bab14], we prove Theorem 1.1 via a query-efficient reduction from the problem of finding a WSNE to that of finding an ANE:

Given any $\alpha$ -action, $n$ -player game $\mathcal{G}$ and any parameter $\epsilon>0$ , one can define a new $\alpha$ -action game $\mathcal{G}^{\prime}$ with a slightly larger set of $O(\alpha^{2}\log(n/\epsilon)\cdot n)$ players such that

To answer each payoff query on $\mathcal{G}^{\prime}$ , it suffices to make $\alpha n$ payoff queries on $\mathcal{G}$ ;

There is a procedure that, given any $\epsilon$ -ANE $\boldsymbol{x}$ of $\mathcal{G}^{\prime}$ , outputs a $(4\alpha\epsilon)$ -WSNE $\boldsymbol{y}$ of $\mathcal{G}$ , with no payoff oracle access to $\mathcal{G}$ or $\mathcal{G}^{\prime}$ .

Our reduction is presented in Section 3. Theorem 1.1 then follows directly from the lower bound of Babichenko [Bab14] on the randomized query complexity of WSNE (in Theorem 1.4). Theorem 1.3 follows from the fact that: 1) the payoff entries of $\mathcal{G}^{\prime}$ are easy to compute; and 2) the procedure to obtain $\boldsymbol{y}$ from $\boldsymbol{x}$ runs in time polynomial in the length of the binary representation of $\boldsymbol{x}$ , when the number of actions $\alpha$ is bounded.

Recall that in the procedure of [DGP09, Bab14], an $\epsilon$ -WSNE is obtained from an $\epsilon^{\prime}$ -ANE with $\epsilon^{\prime}=\epsilon^{2}/(16n)$ by reallocating probabilities on actions with relatively low expected payoff (formally, actions with payoff $\Omega(\epsilon)$ lower than the best response) to best-response actions. From the definition of ANE, no player can have probability more than $O(\epsilon^{\prime}/\epsilon)=O(\epsilon/n)$ on actions with low payoff in an $\epsilon^{\prime}$ -ANE. Thus, the procedure changes the expected payoff of each player on each action by at most $n\cdot O(\epsilon/n)=O(\epsilon)$ since it changes the mixed strategy of each player by $O(\epsilon/n)$ . It follows that the new mixed strategy profile is an $\epsilon$ -WSNE. The blow up of a factor of $n$ from $\epsilon^{\prime}$ to $\epsilon$ is precisely due to the cumulative impact on a player’s expected payoff imposed by small changes to all other players’ mixed strategies.
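The reallocation step described above can be sketched for a single player as follows. This is an illustration only: the function name `reallocate` and the parameter `gap` are hypothetical (the text only specifies a gap of order $\Omega(\epsilon)$), and the expected action payoffs are assumed to be given exactly, whereas the actual procedure estimates them with sampled payoff queries.

```python
from typing import List, Sequence


def reallocate(x_i: Sequence[float], payoffs: Sequence[float],
               gap: float) -> List[float]:
    """Move the mass on actions more than `gap` below the best payoff
    onto a best-response action.

    x_i[k]     : probability that the player plays action k
    payoffs[k] : expected payoff of action k against the other players
    gap        : ASSUMED cutoff, standing in for the Omega(eps) of the text
    """
    best = max(range(len(x_i)), key=lambda k: payoffs[k])
    y_i = list(x_i)
    for k, v in enumerate(payoffs):
        if v < payoffs[best] - gap:
            y_i[best] += y_i[k]  # shift the low-payoff mass to the best action
            y_i[k] = 0.0
    return y_i


# Action 2 is far from optimal, so its 0.04 mass moves to action 0.
y = reallocate([0.9, 0.06, 0.04], [1.0, 0.95, 0.2], gap=0.1)
```

Each player's strategy moves only by the mass that was sitting on low-payoff actions, but every other player sees the sum of these moves over all $n$ players, which is the source of the factor $n$ discussed above.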

Our reduction from WSNE to ANE overcomes this obstacle by constructing a new and slightly larger game $\mathcal{G}^{\prime}$ with $O(n\log{n})$ players, where each player $i$ in the original $n$ -player game $\mathcal{G}$ is now simulated by a group of $O(\log{n})$ players indexed by $(i,j)$ in the new game $\mathcal{G}^{\prime}$ . The payoff function of player $(i,j)$ in $\mathcal{G}^{\prime}$ is exactly the same as that of player $i$ in $\mathcal{G}$ , but is defined with respect to the aggregate action of each group of players in $\mathcal{G}^{\prime}$ by taking the majority among each group.

We then show that an $\epsilon$-WSNE of $\mathcal{G}$ can be recovered from an $\epsilon^{\prime}$-ANE of $\mathcal{G}^{\prime}$, with $\epsilon^{\prime}=\Omega(\epsilon)$, by 1) computing the distribution of the majority action of each group and 2) truncating the small entries in each distribution. Intuitively, by focusing on the aggregate behavior of each group of $O(\log n)$ independent players in $\mathcal{G}^{\prime}$, we make sure that the mixed strategies obtained from Step 1) are highly concentrated on actions with close-to-best expected payoffs, and actions with low payoffs can only appear as the majority action of a group with probability $O(\epsilon/n)$. Therefore in Step 2), we only need to truncate entries with probability $O(\epsilon/n)$, and the remaining positive entries correspond to close-to-best actions. We can also control the effect of this truncation at the same time, because when the number of actions is bounded, the aggregate behavior of each group changes by at most $O(\epsilon/n)$. This allows us to show that the result is an $\epsilon$-WSNE of the original game $\mathcal{G}$.
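The concentration effect behind Step 1) can be checked numerically in the binary-action case. The sketch below rests on assumptions made only for illustration: each of the $s$ players in a group plays the low-payoff action independently with probability at most $q=1/4$, and the group size is $s=\lceil 8\ln(n/\epsilon)\rceil$ (the constant 8 is illustrative). Under these assumptions Hoeffding's inequality bounds the probability that the action reaches half of the group by $e^{-s/8}\leq\epsilon/n$, and the exact binomial tail computed here is smaller still.

```python
from math import ceil, comb, log


def group_size(n: int, eps: float) -> int:
    """ASSUMED group size ceil(8 ln(n / eps)); the constant 8 is illustrative."""
    return ceil(8 * log(n / eps))


def majority_tail(s: int, q: float) -> float:
    """P[Binomial(s, q) >= s / 2]: the chance that an action played with
    probability q by each of s independent players reaches half the group."""
    half = (s + 1) // 2  # smallest integer that is at least s / 2
    return sum(comb(s, t) * q ** t * (1 - q) ** (s - t)
               for t in range(half, s + 1))


n, eps = 100, 0.1
s = group_size(n, eps)
tail = majority_tail(s, 0.25)
```

For comparison, a single player (a group of size 1) plays the low-payoff action with probability $q$ itself, so grouping reduces this probability from a constant to at most $\epsilon/n$.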

Organization

The rest of the paper is organized as follows. We first give formal definitions of ANE and WSNE in Section 2. In Section 3 we present the reduction from WSNE to ANE for large games, and then use it to prove Theorem 1.1 and Theorem 1.3 in Section 4. We conclude and discuss open problems in Section 5.

Preliminaries

A game $\mathcal{G}$ is a triple $(n,\alpha,\boldsymbol{u})$, where $n$ is the number of players, $\alpha$ is the number of actions for each player, and $\boldsymbol{u}=(u_{1},\dots,u_{n})$ are the payoff functions, one for each player. We always use $[n]=\{1,\ldots,n\}$ to denote the set of players and $[\alpha]=\{1,\ldots,\alpha\}$ to denote the set of actions for each player. Since we are interested in additive approximations, each $u_{i}$ maps $[\alpha]^{n}$ to $[0,1]$.

Let $\Delta_{\alpha}$ denote the set of probability distributions over $[\alpha]$. A mixed strategy profile of $\mathcal{G}$ is then a tuple $\boldsymbol{x}=(\boldsymbol{x}_{1},\dots,\boldsymbol{x}_{n})$ of mixed strategies, where $\boldsymbol{x}_{i}\in\Delta_{\alpha}$ denotes the mixed strategy of player $i$. Given $\boldsymbol{x}$, we use $\boldsymbol{x}_{-i}$ to denote the tuple of mixed strategies of all players other than $i$. As a shorthand, we write $u_{i}(\boldsymbol{x})$ to denote the expected payoff of player $i$ with respect to $\boldsymbol{x}$, and write $u_{i}(a,\boldsymbol{x}_{-i})$ to denote the expected payoff of player $i$ playing action $a\in[\alpha]$ with respect to $\boldsymbol{x}_{-i}$:

$$u_{i}(\boldsymbol{x})=\mathbb{E}_{\boldsymbol{a}\sim\boldsymbol{x}}\big[u_{i}(\boldsymbol{a})\big]\quad\text{and}\quad u_{i}(a,\boldsymbol{x}_{-i})=\mathbb{E}_{\boldsymbol{a}_{-i}\sim\boldsymbol{x}_{-i}}\big[u_{i}(a,\boldsymbol{a}_{-i})\big].$$

Next we define approximate and well-supported Nash equilibria.

Given $\epsilon>0$, an $\epsilon$-approximate Nash equilibrium of an $\alpha$-action and $n$-player game $\mathcal{G}(n,\alpha,\boldsymbol{u})$ is a mixed strategy profile $\boldsymbol{x}=(\boldsymbol{x}_{1},\dots,\boldsymbol{x}_{n})$ such that for every player $i\in[n]$:

$$u_{i}(\boldsymbol{x})\geq u_{i}(a,\boldsymbol{x}_{-i})-\epsilon\quad\text{for every action } a\in[\alpha].$$

Given $\epsilon>0$, an $\epsilon$-well-supported Nash equilibrium of $\mathcal{G}(n,\alpha,\boldsymbol{u})$ is a mixed strategy profile $\boldsymbol{x}=(\boldsymbol{x}_{1},\dots,\boldsymbol{x}_{n})$ such that for every player $i\in[n]$ and every action $a$ in the support of $\boldsymbol{x}_{i}$:

$$u_{i}(a,\boldsymbol{x}_{-i})\geq u_{i}(b,\boldsymbol{x}_{-i})-\epsilon\quad\text{for every action } b\in[\alpha].$$
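The difference between the two definitions can be seen on a tiny example. The sketch below is an illustration under stated assumptions: the helper names (`action_payoffs`, `is_ane`, `is_wsne`) and the toy game are hypothetical, actions are indexed from 0, and expected payoffs are computed by exact enumeration, which is feasible only for very small games.

```python
from itertools import product
from math import prod
from typing import Callable, List, Sequence, Tuple

Profile = Tuple[int, ...]
Payoff = Callable[[int, Profile], float]  # u(i, a) in [0, 1]
TOL = 1e-12  # slack for floating-point comparisons


def action_payoffs(u: Payoff, x: Sequence[Sequence[float]], i: int) -> List[float]:
    """Return [u_i(a, x_{-i}) for each action a], by exact enumeration."""
    n, alpha = len(x), len(x[0])
    others = [j for j in range(n) if j != i]
    out = [0.0] * alpha
    for rest in product(range(alpha), repeat=n - 1):
        p = prod(x[j][k] for j, k in zip(others, rest))
        if p == 0.0:
            continue
        for a in range(alpha):
            out[a] += p * u(i, rest[:i] + (a,) + rest[i:])
    return out


def is_ane(u: Payoff, x: Sequence[Sequence[float]], eps: float) -> bool:
    """Every player's mixed strategy is an eps-best response overall."""
    for i in range(len(x)):
        pay = action_payoffs(u, x, i)
        mixed = sum(p * v for p, v in zip(x[i], pay))
        if mixed < max(pay) - eps - TOL:
            return False
    return True


def is_wsne(u: Payoff, x: Sequence[Sequence[float]], eps: float) -> bool:
    """Every action played with positive probability is an eps-best response."""
    for i in range(len(x)):
        pay = action_payoffs(u, x, i)
        if any(p > 0 and v < max(pay) - eps - TOL for p, v in zip(x[i], pay)):
            return False
    return True


def u(i: int, a: Profile) -> float:
    """Toy game: player 0 earns 1 for action 0, player 1 always earns 0.5."""
    return (1.0 if a[0] == 0 else 0.0) if i == 0 else 0.5


# Player 0 puts probability 0.1 on a bad action: fine for ANE, not for WSNE.
x = [[0.9, 0.1], [1.0, 0.0]]
```

With $\epsilon=0.1$, the profile `x` passes `is_ane` but fails `is_wsne`, matching the observation in Section 1 that a player may place $\epsilon$ probability on poor actions in an $\epsilon$-ANE.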

Finally, we give a formal definition of succinct games [PR08].

An $\alpha$ -action succinct game is a pair $(n,U)$ , where $n$ is the number of players and $U$ is a (multi-output) Boolean circuit that, given any pure strategy profile $\boldsymbol{a}\in[\alpha]^{n}$ (encoded in binary), outputs the payoffs of all $n$ players with respect to $\boldsymbol{a}$ in the game. The input size of $(n,U)$ is the size of the circuit $U$ .

A Reduction from Well-Supported to Approximate Nash Equilibria

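Given an $\alpha$-action, $n$-player game $\mathcal{G}(n,\alpha,\boldsymbol{u})$ and $\epsilon\in(0,1)$, we define a new game $\mathcal{G}^{\prime}(sn,\alpha,\boldsymbol{u}^{\prime})$ with $sn$ players, where $s=O(\alpha^{2}\log(n/\epsilon))$ is the number of players in $\mathcal{G}^{\prime}$ that simulate each player of $\mathcal{G}$.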

We then show that, given an $\epsilon$ -ANE $\boldsymbol{x}$ of the new game $\mathcal{G}^{\prime}$ , one can compute a $(4\alpha\epsilon)$ -WSNE $\boldsymbol{y}$ of $\mathcal{G}$ without making any payoff queries to $\mathcal{G}$ or $\mathcal{G}^{\prime}$ .

For each player $i\in[n]$ in $\mathcal{G}$, we introduce a group of $s$ players in $\mathcal{G}^{\prime}$, indexed by $(i,j)$ with $j\in[s]$, and use $u^{\prime}_{i,j}$ to denote the payoff function of player $(i,j)$. Given any pure strategy profile $\boldsymbol{a}=(a_{i,j}:i\in[n],j\in[s])$, we define the payoff $u^{\prime}_{i,j}(\boldsymbol{a})$ of player $(i,j)$ as follows. First, for each $i\in[n]$, let $\bar{a}_{i}\in[\alpha]$ denote the majority action played by the $i$-th group (players $(i,j)$, $j\in[s]$) in the pure strategy profile $\boldsymbol{a}$ (break ties by choosing the action with the smallest index). Write $\boldsymbol{\bar{a}}=(\bar{a}_{1},\ldots,\bar{a}_{n})$. Next, the payoff of player $(i,j)$ under $\boldsymbol{a}$ is defined as

$$u^{\prime}_{i,j}(\boldsymbol{a}):=u_{i}(a_{i,j},\boldsymbol{\bar{a}}_{-i}),\tag{1}$$

that is, the payoff of player $i$ in $\mathcal{G}$ when player $i$ plays $a_{i,j}$ and every other player $i^{\prime}\neq i$ plays the majority action $\bar{a}_{i^{\prime}}$ of group $i^{\prime}$.

This completes the definition of $\mathcal{G}^{\prime}$ . The lemma below follows directly from the definition.

Lemma 3.1. To answer a payoff query on $\mathcal{G}^{\prime}$, it suffices to make $\alpha n$ payoff queries on $\mathcal{G}$.

Proof. By the definition of $\mathcal{G}^{\prime}$, the payoffs $u^{\prime}_{i,j}(\boldsymbol{a})$ for all $(i,j)$ with $i\in[n]$ and $j\in[s]$ are determined by

$$u_{i}(k,\boldsymbol{\bar{a}}_{-i})\quad\text{for all } i\in[n]\text{ and } k\in[\alpha],$$

for which $\alpha n$ payoff queries on $\mathcal{G}$ suffice.
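The simulation of a query on $\mathcal{G}^{\prime}$ can be sketched as follows. The function names are hypothetical, actions are indexed from 0, and the sketch assumes the payoff rule $u^{\prime}_{i,j}(\boldsymbol{a})=u_{i}(a_{i,j},\boldsymbol{\bar{a}}_{-i})$, where $\boldsymbol{\bar{a}}$ collects the group majorities with ties broken toward the smallest index.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Profile = Tuple[int, ...]


def majority(group: Sequence[int]) -> int:
    """Most frequent action in a group; ties go to the smallest index."""
    counts = Counter(group)
    top = max(counts.values())
    return min(k for k, c in counts.items() if c == top)


def query_g_prime(query_g: Callable[[Profile], Sequence[float]],
                  a: Sequence[Sequence[int]], alpha: int) -> List[List[float]]:
    """Payoffs u'_{i,j}(a) for all (i, j), using alpha * n queries to G.

    a[i][j] is the action of player (i, j). ASSUMPTION: u'_{i,j}(a) equals
    u_i(a_{i,j}, abar_{-i}), where abar is the vector of group majorities.
    """
    n = len(a)
    abar = [majority(group) for group in a]
    # table[i][k] = u_i(k, abar_{-i}); exactly one query per (i, k) pair.
    table = [[0.0] * alpha for _ in range(n)]
    for i in range(n):
        for k in range(alpha):
            profile = tuple(abar[:i] + [k] + abar[i + 1:])
            table[i][k] = query_g(profile)[i]
    return [[table[i][a_ij] for a_ij in a[i]] for i in range(n)]


calls = []


def toy_g(profile: Profile) -> List[float]:
    """Toy 2-player game: both players earn 1 when their actions match."""
    calls.append(profile)
    return [1.0 if profile[0] == profile[1] else 0.0] * 2


payoffs = query_g_prime(toy_g, [[0, 0, 1], [1, 1, 0]], alpha=2)
```

In the usage above the group majorities are $(0,1)$, so a member of either group is rewarded exactly when it deviates from its own group's majority toward the other group's, and the simulation makes $\alpha n=4$ queries to the toy game.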

We conclude our reduction by proving the following lemma:

Lemma 3.2. Given any $\epsilon$-ANE $\boldsymbol{x}$ of $\mathcal{G}^{\prime}$, one can compute a $(4\alpha\epsilon)$-WSNE $\boldsymbol{y}$ of $\mathcal{G}$ without making any payoff queries on $\mathcal{G}$ or $\mathcal{G}^{\prime}$. Moreover, when $\alpha$ is a constant, the computation of $\boldsymbol{y}$ from $\boldsymbol{x}$ can be done in time polynomial in the number of bits needed in the binary representation of $\boldsymbol{x}$ and $1/\epsilon$.

Proof. Let $\boldsymbol{x}=(\boldsymbol{x}_{i,j}:i\in[n],j\in[s])$ be an $\epsilon$-ANE of $\mathcal{G}^{\prime}$. For each group $i\in[n]$ and action $k\in[\alpha]$, let

$$\bar{x}_{i,k}:=\Pr_{\boldsymbol{a}\sim\boldsymbol{x}}\big[\bar{a}_{i}=k\big].\tag{2}$$

Recall that $\bar{a}_{i}$ is the majority action played by players $(i,j)$ , $j\in[s]$ , in the pure strategy profile $\boldsymbol{a}$ . Then by definition, each $\boldsymbol{\bar{x}}_{i}=(\bar{x}_{i,1},\ldots,\bar{x}_{i,\alpha})$ is a probability distribution over $[\alpha]$ .
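The distribution $\boldsymbol{\bar{x}}_{i}$ of a group's majority action can be computed by direct enumeration, as in the following sketch. The function name is hypothetical and actions are indexed from 0; the players of a group randomize independently, as in any mixed strategy profile.

```python
from itertools import product
from math import prod
from typing import List, Sequence


def majority_distribution(group: Sequence[Sequence[float]]) -> List[float]:
    """Distribution of the majority action of one group.

    group[j][k] is the probability that player (i, j) plays action k.
    Enumerates all alpha**s pure profiles of the s players in the group.
    """
    alpha = len(group[0])
    xbar = [0.0] * alpha
    for a in product(range(alpha), repeat=len(group)):
        p = prod(x_j[k] for x_j, k in zip(group, a))
        counts = [a.count(k) for k in range(alpha)]
        # list.index returns the first maximum, so ties go to the smallest action.
        xbar[counts.index(max(counts))] += p
    return xbar


# Three players, each playing action 1 with probability 1/4.
xbar = majority_distribution([[0.75, 0.25]] * 3)
```

In this example action 1 is the majority only when at least two of the three players choose it, which happens with probability $3\cdot(1/4)^{2}(3/4)+(1/4)^{3}=5/32$, already below the individual probability $1/4$.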

Next we define a mixed strategy $\boldsymbol{y}=(\boldsymbol{y}_{1},\ldots,\boldsymbol{y}_{n})$ of $\mathcal{G}$ , and show that $\boldsymbol{y}$ is a $(4\alpha\epsilon)$ -WSNE. Let

It is clear that $\boldsymbol{y}_{i}=(y_{i,1},\ldots,y_{i,\alpha})$ is a probability distribution over $[\alpha]$ .
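One way to realize the truncation step is sketched below. The rule used here is an assumption made for illustration, not necessarily the exact definition of $\boldsymbol{y}_{i}$: entries of $\boldsymbol{\bar{x}}_{i}$ below a threshold of $\epsilon/n$ are set to zero and the surviving entries are renormalized. Under this rule the removed mass, and hence the total variation distance to $\boldsymbol{\bar{x}}_{i}$, is less than $\alpha\epsilon/n$.

```python
from typing import List, Sequence


def truncate(xbar_i: Sequence[float], eps: float, n: int) -> List[float]:
    """Zero out entries below eps / n and renormalize the rest (ASSUMED rule)."""
    threshold = eps / n
    kept = [p if p >= threshold else 0.0 for p in xbar_i]
    # The largest entry is at least 1 / alpha, so total > 0 when eps / n <= 1 / alpha.
    total = sum(kept)
    return [p / total for p in kept]


def tv_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Total variation distance between two distributions on the same actions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


xbar_i = [0.96, 0.03, 0.01]
y_i = truncate(xbar_i, eps=0.2, n=10)  # threshold eps / n = 0.02
```

Here only the last entry falls below the threshold, so `y_i` is supported on the first two actions and differs from `xbar_i` by a total variation distance of 0.01.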

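Note that the total variation distance between $\boldsymbol{\bar{x}}_{j}$ and $\boldsymbol{y}_{j}$ is at most $\alpha\epsilon/n$ for each $j\in[n]$. So by coupling and applying the union bound over the $n$ groups, and since payoffs take values in $[0,1]$, we have that for every player $i\in[n]$ and every action $k\in[\alpha]$,

$$\big|u_{i}(k,\boldsymbol{\bar{x}}_{-i})-u_{i}(k,\boldsymbol{y}_{-i})\big|\leq\alpha\epsilon.\tag{6}$$

By the definition (1) of the payoff function $u^{\prime}_{i,j}$, we have

$$u^{\prime}_{i,j}(k,\boldsymbol{x}_{-(i,j)})=u_{i}(k,\boldsymbol{\bar{x}}_{-i}),\tag{7}$$

where $\boldsymbol{x}_{-(i,j)}$ denotes the mixed strategies of all players other than $(i,j)$ in $\boldsymbol{x}$.

Combining (6) and (7), we have that for every player $(i,j)$, $j\in[s]$, and every action $k\in[\alpha]$:

$$\big|u^{\prime}_{i,j}(k,\boldsymbol{x}_{-(i,j)})-u_{i}(k,\boldsymbol{y}_{-i})\big|\leq\alpha\epsilon.$$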

For the running time, when $\alpha$ is a constant, to compute $\bar{x}_{i,k}$ in (2) one needs to go through

$$\alpha^{s}=(n/\epsilon)^{O(1)}$$

many pure strategy profiles of players $(i,j)$, $j\in[s]$. Thus $\boldsymbol{y}$ can be computed in time polynomial in the number of bits needed in the binary representation of $\boldsymbol{x}$ and $1/\epsilon$.

Proofs of Theorems 1.1 and 1.3

We use the query-efficient reduction given above to prove Theorem 1.1 and Theorem 1.3.

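Proof of Theorem 1.1. By Theorem 1.4, there exist two constants $\epsilon^{\prime}>0$ and $c^{\prime}>0$ such that

$$\text{QC}_{p}(\textbf{WSNE}(n^{\prime},\epsilon^{\prime}))=2^{\Omega(n^{\prime})},\quad\text{where } p=2^{-c^{\prime}n^{\prime}}.$$

Let $n=8n^{\prime}\cdot\left\lceil\ln(n^{\prime}/\epsilon^{\prime})\right\rceil$ and $\epsilon=\epsilon^{\prime}/8$, so that $4\alpha\epsilon=\epsilon^{\prime}$ for $\alpha=2$. It follows from Lemma 3.1 and Lemma 3.2 that

$$\text{QC}_{p}(\textbf{ANE}(n,\epsilon))\geq\frac{\text{QC}_{p}(\textbf{WSNE}(n^{\prime},\epsilon^{\prime}))}{2n^{\prime}}=2^{\Omega(n^{\prime})},$$

since each payoff query on $\mathcal{G}^{\prime}$ can be answered with $\alpha n^{\prime}=2n^{\prime}$ payoff queries on $\mathcal{G}$, and any $\epsilon$-ANE of $\mathcal{G}^{\prime}$ can be converted into an $\epsilon^{\prime}$-WSNE of $\mathcal{G}$ with no further queries.

The theorem then follows from $n^{\prime}=\Omega(n/\log n)$.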

Proof of Theorem 1.3. From Lemma 3.2, it suffices to show that, given any $\alpha$-action succinct game $\mathcal{G}=(n,U)$, one can construct, in polynomial time, a Boolean circuit $U^{\prime}$ that implements the payoff functions of players in $\mathcal{G}^{\prime}$. This can be done by following the definition of $\mathcal{G}^{\prime}$ in the previous section, as the payoffs of a pure strategy profile $\boldsymbol{a}$ in $\mathcal{G}^{\prime}$ depend only (in a straightforward fashion) on the payoffs of $\alpha n=O(n)$ easy-to-compute profiles of $\mathcal{G}$.

Conclusion

In this paper, we present a simple and efficient reduction from the problem of finding a WSNE to that of finding an ANE in large games with a bounded number of actions. Our results complement the existing study on relations between WSNE and ANE. As an application, we obtain a lower bound on the randomized query complexity of $\epsilon$ -ANE for some constant $\epsilon>0$ . It would be interesting to see other applications of our reduction in understanding the complexity of Nash equilibria. It also remains an open problem to remove the $\log n$ factor in the exponent of our lower bound, i.e. to show that the number of queries needed to reach an $\epsilon$ -ANE is indeed $2^{\Omega(n)}$ . This $\log n$ factor shows up because we simulate each player in the original game with $O(\log n)$ players in the new game. Is there a more efficient simulation that uses fewer players?