Well-Supported versus Approximate Nash Equilibria: Query Complexity of Large Games

Xi Chen, Yu Cheng, Bo Tang

Introduction

The celebrated theorem of Nash [Nas50] states that every finite game has an equilibrium point. The solution concept of Nash equilibrium (NE) has been tremendously influential in economics and social sciences ever since (e.g. see [HR04]). The complexity and efficient approximation of NE have been studied intensively during the past decade, and much progress has been made (e.g., see [LMM03, AKV05, BVV05, KT07, TS07, DGP09, CDT09, EY10, Meh14, DP15, BPR15, CDO15, DKT15, Rub15, DKS15, Bar15]).

In this paper, we study the randomized query complexity of computing an ϵ\epsilon-approximate Nash equilibrium (ANE) in large games, for some constant ϵ>0\epsilon>0. Given a game G\mathcal{G} with nn players and α\alpha actions for each player, we index the players by the set [n]={1,,n}[n]=\{1,\ldots,n\} and index the actions by the set [α]={1,,α}[\alpha]=\{1,\ldots,\alpha\}. Recall that an ϵ\epsilon-ANE of G\mathcal{G} is a mixed strategy profile

in which each player ii plays an ϵ\epsilon-best response xi\boldsymbol{x}_{i} to other players’ strategies xi\boldsymbol{x}_{-i} We follow the convention and write xi:=(x1,,xi1,xi+1,,xn)\boldsymbol{x}_{-i}:=(\boldsymbol{x}_{1},\ldots,\boldsymbol{x}_{i-1},\boldsymbol{x}_{i+1},\ldots,\boldsymbol{x}_{n}), strategies of players other than ii in x\boldsymbol{x}. (see the precise definition in Section 2). Since the notion of ANE (as well as that of well-supported Nash equilibria to be discussed below) is additive, we always assume that the payoff functions of games considered throughout this paper take values between and 11.

For the (payoff) query model, an oracle algorithm with unlimited computational power is given an approximation parameter ϵ\epsilon, the number of players nn and the number of actions α\alpha in an unknown game G\mathcal{G}, and needs to find an ϵ\epsilon-ANE of G\mathcal{G}. The algorithm has oracle access to the payoff functions of players in G\mathcal{G}: For each round, the algorithm can adaptively query a pure strategy profile a[α]n\boldsymbol{a}\in[\alpha]^{n}, and receives the payoff of every player with respect to a\boldsymbol{a}. We are interested in the number of queries needed by any randomized oracle algorithm for this task. Note that a trivial upper bound is αn\alpha^{n}, by simply querying all the pure strategy profiles.

The query complexity of (approximate) Nash equilibria and related solution concepts has received considerable attention recently, e.g. see [FGGS13, HN13, FS14, GR14, Bab14, BB15, GT15]. Below we review results that are most relevant to our work.

Turning to the harder, but perhaps more interesting, problem of approximating Nash equilibria under the payoff query model, the deterministic lower bound of [HN13] for (1/2)(1/2)-CE directly implies the same bound for (1/2)(1/2)-ANE, since any ϵ\epsilon-ANE by definition is also an ϵ\epsilon-CE. For the randomized query complexity of NE, Babichenko [Bab14] showed that any randomized algorithm requires 2Ω(n)2^{\Omega(n)} queries to find an ϵ\epsilon-well-supported Nash equilibrium (WSNE), in a binary-action, nn-player game (see Theorem 1.4). Recall that an ϵ\epsilon-WSNE of a game is a mixed strategy profile x\boldsymbol{x} in which the probability of player ii playing action jj is positive only when action jj is an ϵ\epsilon-best response with respect to xi\boldsymbol{x}_{-i} (see Section 2 for the precise definition). By definition, an ϵ\epsilon-WSNE is also an ϵ\epsilon-ANE but the inverse is not true. Following a well-known connection between WSNE and ANE [DGP09] (and using random samples to approximate expected payoffs), Babichenko [Bab14] showed that the same 2Ω(n)2^{\Omega(n)} bound holds for the randomized query complexity of ϵ\epsilon-ANE, but only when ϵ=O(1/n)\epsilon=O(1/n). The randomized query complexity of ϵ\epsilon-ANE in large games, an arguably more natural relaxation of exact NE compared to WSNE, remains an open problem when ϵ>0\epsilon>0 is a constant.

Our Results

For binary-action, nn-player games, we show that 2Ω(n/logn)2^{\Omega(n/\log n)} queries are required for any randomized algorithm to find an ϵ\epsilon-ANE, for some constant ϵ>0\epsilon>0. To state the result, we use QCp(ANE(n,ϵ))\text{{QC}}_{p}(\textbf{ANE}(n,\epsilon)), for some p>0p>0, to denote the smallest TT such that there exists a randomized oracle algorithm that uses no more than TT queries and outputs an ϵ\epsilon-ANE with probability at least pp, given any unknown binary-action, nn-player game. Our main result is the following lower bound on QCp(ANE(n,ϵ))\text{{QC}}_{p}(\textbf{ANE}(n,\epsilon)):

There exist two constants ϵ>0\epsilon>0 and c>0c>0 such that

Our lower bound answers an open problem posed by Hart and Nisan [HN13] and by Babichenko [Bab14]. Our result shows that, in terms of their query complexities, finding an ϵ\epsilon-ANE is almost as hard as finding an ϵ\epsilon-WSNE in a large game, even for constant ϵ>0\epsilon>0. It also implies the following corollary regarding the rate of convergence of kk-queries dynamics (see [Bab14] for the definition).

There exist two constants ϵ>0\epsilon>0 and c>0c>0 such that no kk-queries dynamic can converge to an ϵ\epsilon-ANE in 2Ω(n/logn)/k\smash{2^{\Omega(n/\log n)}/k} steps with probability at least 2cn/logn2^{-{cn}/{\log n}} in all binary-action and nn-player games.

In addition to the randomized query complexity, our proof of Theorem 1.1 yields a polynomial-time reduction Recall that a polynomial-time reduction from total search problem AA to total search problem BB is a pair (f,g)(f,g) of polynomial-time computable functions such that: 1) for every input instance xx of AA, f(x)f(x) is an input instance of BB; 2) for every solution yy to f(x)f(x) in BB, g(y)g(y) is a solution to xx in AA. from the problem of finding an ϵ\epsilon-WSNE to that of finding an (ϵ=Ω(ϵ))(\epsilon^{\prime}=\Omega(\epsilon))-ANE in a succinct game with a fixed number of actions. Following the definition from [PR08], we say that an α\alpha-action succinct game is a pair (n,U)(n,U), where nn is the number of players and UU is a (multi-output) Boolean circuit that, given a pure strategy profile a[α]n\boldsymbol{a}\in[\alpha]^{n} (encoded in binary), outputs the payoffs of all nn players with respect to a\boldsymbol{a} in the game. We show that

Approximate vs Well-Supported Nash Equilibria

Let QCp(WSNE(n,ϵ))\text{{QC}}_{p}(\textbf{WSNE}(n,\epsilon)), for some p>0p>0, denote the smallest TT such that there exists a randomized oracle algorithm that uses no more than TT queries and outputs an ϵ\epsilon-WSNE with probability at least pp, given any unknown binary-action, nn-player game. Babichenko showed that

There exist two constants ϵ>0\epsilon>0 and c>0c>0 such that

Given the result of Babichenko as above, the same exponential lower bound for the randomized query complexity of ϵ\epsilon-ANE, for small enough constant ϵ>0\epsilon>0, would follow immediately if

Given oracle access to G\mathcal{G} and any ϵ\epsilon^{\prime}-ANE of G\mathcal{G}, where ϵ=c(α)ϵ\epsilon^{\prime}=c(\alpha)\cdot\epsilon for some constant c>0c>0 that only depends on α\alpha, there is a query-efficient procedure that outputs an ϵ\epsilon-WSNE of G\mathcal{G}.

However, the best such procedure known is the following result from [DGP09]. The parameters are subsequently improved in [Bab14], where the number of queries needed is also analyzed:

Given oracle access to G\mathcal{G} and any ϵ2/(16n)\epsilon^{2}/(16n)-ANE of G\mathcal{G}, there is a procedure that outputs an ϵ\epsilon-WSNE of G\mathcal{G}, where nn is the number of players, using poly(α,n,1/ϵ)\text{poly}(\alpha,n,1/\epsilon) payoff queries.

The procedure is very natural: For each player, reallocate probabilities on actions with a relatively low expected payoff to a best-response action. By Theorem 1.4, such a procedure implies the same exponential lower bound for ϵ\epsilon-ANE [Bab14] but only when ϵ\epsilon is O(1/n)O(1/n).

No better procedure is known. By definition, an ANE poses a slightly weaker condition on each player compared to that of a WSNE. More specifically, given mixed strategies of other players xi\boldsymbol{x}_{-i}, for an ϵ\epsilon-WSNE, xi\boldsymbol{x}_{i} must be supported on actions that are ϵ\epsilon-best responses to xi\boldsymbol{x}_{-i}, while in an ϵ\epsilon-ANE, xi\boldsymbol{x}_{i} can be any mixed strategy that yields an overall ϵ\epsilon-best response to xi\boldsymbol{x}_{-i}. For example, xi\boldsymbol{x}_{i} may put 1ϵ1-\epsilon probability on best-response actions while putting ϵ\epsilon probability on any other actions. On the one hand, this makes WSNE much easier to analyze and control in hardness reductions, which is why it played a critical role in characterizing the complexity of Nash equilibria, starting with the work of [DGP09], later in [CDT09] and subsequent works. On the other hand, as the ϵ\epsilon being of interest in [DGP09, CDT09] is either exponentially or polynomially small, any hardness result for ϵ\epsilon-WSNE yields the same result for ϵ\epsilon-ANE (by combining the procedure of [DGP09] described above and a folklore padding argument).

Our Approach

While we were not able to improve the procedure of [DGP09, Bab14], we prove Theorem 1.1 via a query-efficient reduction from the problem of finding a WSNE to that of finding a ANE:

Given any α\alpha-action, nn-player game G\mathcal{G} and any parameter ϵ>0\epsilon>0, one can define a new α\alpha-action game G\mathcal{G}^{\prime} with a slightly larger set of O(α2log(n/ϵ)n)O(\alpha^{2}\log(n/\epsilon)\cdot n) players such that

To answer each payoff query on G\mathcal{G}^{\prime}, it suffices to make αn\alpha n payoff queries on G\mathcal{G};

There is a procedure that, given any ϵ\epsilon-ANE x\boldsymbol{x} of G\mathcal{G}^{\prime}, outputs a (4αϵ)(4\alpha\epsilon)-WSNE y\boldsymbol{y} of G\mathcal{G}, with no payoff oracle access to G\mathcal{G} or G\mathcal{G}^{\prime}.

Our reduction is presented in Section 3. Theorem 1.1 then follows directly from the lower bound of Babichenko [Bab14] on the randomized query complexity of WSNE (in Theorem 1.4). Theorem 1.3 follows from the fact that: 1) the payoff entries of G\mathcal{G}^{\prime} are easy to compute; and 2) the procedure to obtain y\boldsymbol{y} from x\boldsymbol{x} runs in time polynomial in the length of the binary representation of x\boldsymbol{x}, when the number of actions α\alpha is bounded.

Recall that in the procedure of [DGP09, Bab14], an ϵ\epsilon-WSNE is obtained from an ϵ\epsilon^{\prime}-ANE with ϵ=ϵ2/(16n)\epsilon^{\prime}=\epsilon^{2}/(16n) by reallocating probabilities on actions with relatively low expected payoff (formally, actions with payoff Ω(ϵ)\Omega(\epsilon) lower than the best response) to best-response actions. From the definition of ANE, no player can have probability more than O(ϵ/ϵ)=O(ϵ/n)O(\epsilon^{\prime}/\epsilon)=O(\epsilon/n) on actions with low payoff in an ϵ\epsilon^{\prime}-ANE. Thus, the procedure changes the expected payoff of each player on each action by at most nO(ϵ/n)=O(ϵ)n\cdot O(\epsilon/n)=O(\epsilon) since it changes the mixed strategy of each player by O(ϵ/n)O(\epsilon/n). It follows that the new mixed strategy profile is an ϵ\epsilon-WSNE. The blow up of a factor of nn from ϵ\epsilon^{\prime} to ϵ\epsilon is precisely due to the cumulative impact on a player’s expected payoff imposed by small changes to all other players’ mixed strategies.

Our reduction from WSNE to ANE overcomes this obstacle by constructing a new and slightly larger game G\mathcal{G}^{\prime} with O(nlogn)O(n\log{n}) players, where each player ii in the original nn-player game G\mathcal{G} is now simulated by a group of O(logn)O(\log{n}) players indexed by (i,j)(i,j) in the new game G\mathcal{G}^{\prime}. The payoff function of player (i,j)(i,j) in G\mathcal{G}^{\prime} is exactly the same as that of player ii in G\mathcal{G}, but is defined with respect to the aggregate action of each group of players in G\mathcal{G}^{\prime} by taking the majority among each group.

We then show that an ϵ\epsilon-WSNE of G\mathcal{G} can be recovered from an ϵ\epsilon^{\prime}-ANE of G\mathcal{G}^{\prime}, with ϵ=Ω(ϵ)\epsilon^{\prime}=\Omega(\epsilon), by 1) computing the distribution of the majority action of each group and 2) truncating the small entries in each distribution. Intuitively, by focusing on the aggregate behavior of each group of O(logn)O(\log n) independent players in G\mathcal{G}^{\prime}, we make sure that the mixed strategies obtained from Step 1) are highly concentrated on actions with close-to-best expected payoffs, and actions with low payoffs can only appear as the majority action of a group with probability O(ϵ/n)O(\epsilon/n). Therefore in Step 2), we only need to truncate entries with probability O(ϵ/n)O(\epsilon/n), and the remaining positive entries would correspond to close-to-best actions. We can also control the effect of this truncation at the same time, because when the number of actions are bounded, the aggregate behavior of each group changes by at most O(ϵ/n)O(\epsilon/n). This allows us to show that the result is an ϵ\epsilon-WSNE of the original game G\mathcal{G}.

Organization

The rest of the paper is organized as follows. We first give formal definitions of ANE and WSNE in Section 2. In Section 3 we present the reduction from WSNE to ANE for large games, and then use it to prove Theorem 1.1 and Theorem 1.3 in Section 4. We conclude and discuss open problems in Section 5.

Preliminaries

A game G\mathcal{G} is a triple (n,α,u)(n,\alpha,\boldsymbol{u}), where nn is the number of players, α\alpha is the number of actions for each player, and u=(u1,,un)\boldsymbol{u}=(u_{1},\dots,u_{n}) are the payoff functions, one for each player. We always use [n]={1,,n}[n]=\{1,\ldots,n\} to denote the set of players and [α]={1,,α}[\alpha]=\{1,\ldots,\alpha\} to denote the set of actions for each player. Since we are interested in additive approximations, each uiu_{i} maps [α]n[\alpha]^{n} to $$.

Let Δα\Delta_{\alpha} denote the set of probability distributions over [α][\alpha]. A mixed strategy profile of G\mathcal{G} is then a tuple x=(x1,,xn)\boldsymbol{x}=(\boldsymbol{x}_{1},\dots,\boldsymbol{x}_{n}) of mixed strategies, where xiΔα\boldsymbol{x}_{i}\in\Delta_{\alpha} denotes the mixed strategy of player ii. Given x\boldsymbol{x}, we use xi\boldsymbol{x}_{-i} to denote the tuple of mixed strategies of all players other than ii. As a shorthand, we write ui(x)u_{i}(\boldsymbol{x}) to denote the expected payoff of player ii with respect to x\boldsymbol{x}, and write ui(a,xi)u_{i}(a,\boldsymbol{x}_{-i}) to denote the expected payoff of player ii playing action a\in\big{[}\alpha\big{]} with respect to xi\boldsymbol{x}_{-i}:

Next we define approximate and well-supported Nash equilibria.

Given ϵ>0\epsilon>0, an ϵ\epsilon-approximate Nash equilibrium of an α\alpha-action and nn-player game G(n,α,u)\mathcal{G}(n,\alpha,\boldsymbol{u}) is a mixed strategy profile x=(x1,,xn)\boldsymbol{x}=(\boldsymbol{x}_{1},\dots,\boldsymbol{x}_{n}) such that for every player i[n]i\in[n]:

Given ϵ>0\epsilon>0, an ϵ\epsilon-well-supported Nash equilibrium of G(n,α,u)\mathcal{G}(n,\alpha,\boldsymbol{u}) is a mixed strategy profile x=(x1,,xn)\boldsymbol{x}=(\boldsymbol{x}_{1},\dots,\boldsymbol{x}_{n}) such that for every player i[n]i\in[n] and every action aa in the support of xi\boldsymbol{x}_{i}:

Finally, we give a formal definition of succinct games [PR08].

An α\alpha-action succinct game is a pair (n,U)(n,U), where nn is the number of players and UU is a (multi-output) Boolean circuit that, given any pure strategy profile a[α]n\boldsymbol{a}\in[\alpha]^{n} (encoded in binary), outputs the payoffs of all nn players with respect to a\boldsymbol{a} in the game. The input size of (n,U)(n,U) is the size of the circuit UU.

A Reduction from Well-Supported to Approximate Nash Equilibria

Given an α\alpha-action, nn-player game G(n,α,u)\mathcal{G}(n,\alpha,\boldsymbol{u}) and ϵ(0,1)\epsilon\in(0,1), we define a new game G(sn,α,u)\mathcal{G}^{\prime}(sn,\alpha,\boldsymbol{u}^{\prime}) with snsn players, where

We then show that, given an ϵ\epsilon-ANE x\boldsymbol{x} of the new game G\mathcal{G}^{\prime}, one can compute a (4αϵ)(4\alpha\epsilon)-WSNE y\boldsymbol{y} of G\mathcal{G} without making any payoff queries to G\mathcal{G} or G\mathcal{G}^{\prime}.

For each player i[n]i\in[n] in G\mathcal{G}, we introduce a group of ss players in G\mathcal{G}^{\prime}, indexed by (i,j)(i,j) with j[s]j\in[s], and use ui,ju^{\prime}_{i,j} to denote the payoff function of player (i,j)(i,j). Given any pure strategy profile a=(ai,j:i[n],j[s])\boldsymbol{a}=(a_{i,j}:i\in[n],j\in[s]), we define the payoff ui,j(a)u^{\prime}_{i,j}(\boldsymbol{a}) of player (i,j)(i,j) as follows. First, for each i[n]i\in[n], let aˉi[α]\bar{a}_{i}\in[\alpha] denote the majority action played by the ii-th group (players (i,j)(i,j), j[s]j\in[s]) in the pure strategy profile a\boldsymbol{a} (break ties by choosing the action with the smallest index). Write aˉ=(aˉ1,,aˉn)\boldsymbol{\bar{a}}=(\bar{a}_{1},\ldots,\bar{a}_{n}). Next, the payoff of player (i,j)(i,j) under a\boldsymbol{a} is defined as

This completes the definition of G\mathcal{G}^{\prime}. The lemma below follows directly from the definition.

To answer a payoff query on G\mathcal{G}^{\prime}, it suffices to make αn\alpha n payoff queries on G\mathcal{G}.

By the definition of G\mathcal{G}^{\prime}, ui,j(a)u^{\prime}_{i,j}(\boldsymbol{a})’s for all (i,j)(i,j), i[n]i\in[n] and j[s]j\in[s], are determined by

for which αn\alpha n payoff queries on G\mathcal{G} suffice.

We conclude our reduction by proving the following lemma:

Given any ϵ\epsilon-ANE x\boldsymbol{x} of G\mathcal{G}^{\prime}, one can compute a (4αϵ)(4\alpha\epsilon)-WSNE y\boldsymbol{y} of GG without making any payoff queries on G\mathcal{G} or G\mathcal{G}^{\prime}. Moreover, when α\alpha is a constant, the computation of y\boldsymbol{y} from x\boldsymbol{x} can be done in time polynomial in the number of bits needed in the binary representation of x\boldsymbol{x} and 1/ϵ1/\epsilon.

Let x=(xi,j:i[n],j[α])\boldsymbol{x}=(\boldsymbol{x}_{i,j}:i\in[n],j\in[\alpha]) be an ϵ\epsilon-ANE of G\mathcal{G}^{\prime}. For each group ii and action k[α]k\in[\alpha], let

Recall that aˉi\bar{a}_{i} is the majority action played by players (i,j)(i,j), j[s]j\in[s], in the pure strategy profile a\boldsymbol{a}. Then by definition, each xˉi=(xˉi,1,,xˉi,α)\boldsymbol{\bar{x}}_{i}=(\bar{x}_{i,1},\ldots,\bar{x}_{i,\alpha}) is a probability distribution over [α][\alpha].

Next we define a mixed strategy y=(y1,,yn)\boldsymbol{y}=(\boldsymbol{y}_{1},\ldots,\boldsymbol{y}_{n}) of G\mathcal{G}, and show that y\boldsymbol{y} is a (4αϵ)(4\alpha\epsilon)-WSNE. Let

It is clear that yi=(yi,1,,yi,α)\boldsymbol{y}_{i}=(y_{i,1},\ldots,y_{i,\alpha}) is a probability distribution over [α][\alpha].

But note that, the total variation distance between xˉj\boldsymbol{\bar{x}}_{j} and yj\boldsymbol{y}_{j} for each j[n]j\in[n] is at most αϵ/n\alpha\epsilon/n. So by coupling and applying union bound, we have that

By the definition (1) of the payoff function ui,ju^{\prime}_{i,j}, we have

Combining (6) and (7), we have that for every player (i,j)(i,j), j[s]j\in[s]:

For the running time, when α\alpha is a constant, to compute xˉi,k\bar{x}_{i,k} in (2) one needs to go through

many pure strategy profiles of players (i,j)(i,j), j[s]j\in[s]. Thus y\boldsymbol{y} can be computed in time polynomial in the number of bits needed in the binary representation of x\boldsymbol{x} and 1/ϵ1/\epsilon.

Proofs of Theorems 1.1 and 1.3

We use the query-efficient reduction given above to prove Theorem 1.1 and Theorem 1.3.

By Theorem 1.4, there exist two constants ϵ>0\epsilon^{\prime}>0 and c>0c^{\prime}>0 such that

Let n=8nln(n/ϵ)n=8n^{\prime}\cdot\left\lceil\ln(n^{\prime}/\epsilon^{\prime})\right\rceil and ϵ=8ϵ\epsilon=8\epsilon^{\prime}. It follows from Lemma 3.1 and Lemma 3.2 that

The theorem then follows from n=Ω(n/logn)n^{\prime}=\Omega(n/\log n).

From Lemma 3.2, it suffices to show that, given any α\alpha-action succinct game G=(n,U)\mathcal{G}=(n,U), one can construct, in polynomial time, a Boolean circuit UU^{\prime} that implements the payoff functions of players in G\mathcal{G}^{\prime}. This can be done by following the definition of G\mathcal{G}^{\prime} in the previous section, as the payoffs of a pure strategy profile a\boldsymbol{a} in G\mathcal{G}^{\prime} only depends (in a straight-forward fashion) on the payoffs of αn=O(n)\alpha n=O(n) easy-to-compute profiles of G\mathcal{G}.

Conclusion

In this paper, we present a simple and efficient reduction from the problem of finding a WSNE to that of finding an ANE in large games with a bounded number of actions. Our results complement the existing study on relations between WSNE and ANE. As an application, we obtain a lower bound on the randomized query complexity of ϵ\epsilon-ANE for some constant ϵ>0\epsilon>0. It would be interesting to see other applications of our reduction in understanding the complexity of Nash equilibria. It also remains an open problem to remove the logn\log n factor in the exponent of our lower bound, i.e. to show that the number of queries needed to reach an ϵ\epsilon-ANE is indeed 2Ω(n)2^{\Omega(n)}. This logn\log n factor shows up because we simulate each player in the original game with O(logn)O(\log n) players in the new game. Is there a more efficient simulation that uses fewer players?

References