Can Almost Everybody be Almost Happy? PCP for PPAD and the Inapproximability of Nash

Yakov Babichenko, Christos Papadimitriou, Aviad Rubinstein

Introduction

It is known that finding an ε\varepsilon-approximate Nash equilibrium in a two-person game:

is \PPAD-complete if ε\varepsilon is inversely polynomial ;

but can be solved in quasipolynomial time for any fixed ε>0\varepsilon>0 ; and

the smallest known polynomially attainable approximation ratio is still over 3103\over 10 .

These three facts articulate rather dramatically the mystery, by now almost a decade old, of the problem’s approximability.

Can the inversely polynomial inapproximability bound of be improved to constant? Unlikely, because by this would imply a quasipolynomial algorithm for the iconic problem in PPAD:

Conventional wisdom, supported by black-box lower bounds , is that this problem requires exponential time to solve. In direct analogy with the exponential time hypothesis , we state the following:

In this paper, we identify a plausible new conjecture, a strengthening of Conjecture 1 in a natural and novel direction, which implies that the quasipolynomial approximation algorithm of is optimum. In particular, we define the (ε,δ)(\varepsilon,\delta)-Gcircuit to be the problem of finding values for the variables (lines) that ε\varepsilon-approximately satisfy a fraction of at least (1δ)(1-\delta) of the constraints (gates).

As we have mentioned, this latter statement, albeit with δ=0\delta=0, is known to follow from only Conjecture 1, for some ε>0\varepsilon>0 .

The (ε,δ)(\varepsilon,\delta)-Gcircuit problem is a constraint satisfaction problem: each line/mixed strategy is a real variable, and each gate/player defines a constraint. Since we are considering ε\varepsilon-approximate satisfaction of the constraints, each variable need be represented using only a constant (depending on ε\varepsilon) number of bits. Therefore, a satisfying assignment (truncations of a Nash equilibrium) can be distinguished from an unsatisfying one (one violating the (εδ)(\varepsilon-\delta)-relaxation) by querying a constant (proportional to 1/δ1/\delta) number of bits of the input, determined at random (but of course not uniformly so). This suggests that Conjecture 2 can be interpreted as a probabilistically checkable proof formulation of Gcircuit: informally, it states that “\PPAD has a PCP” (that is, a complete problem whose witnesses can be verified by examining, at random, a finite number of bits).

The main result of this paper, explained next, reveals another similarity between the PCP formulation of NP and the (ε,δ)(\varepsilon,\delta)-Gcircuit problem: The intractability of the latter problem implies, among several other inapproximability theorems, the strongest possible inapproximability result for the 2-player Nash equilibrium problem, arguably the central open question in the area.

The Main Result

We denote by ε\varepsilon-2Nash the problem of finding an ε\varepsilon-Nash equilibrium in a bimatrix game. Our main result is the following:

Our proof, given in Section 3, employs the technique of “birthday repetition,” pioneered by and used in different contexts by , and in particular by to show intractability of a Nash equilibrium-related problem. Starting from a polymatrix game with two strategies per player and in which the player interactions form a cubic bipartite graph with nn nodes on each side, the players on both sides are broken into blocks of size n\sqrt{n}. The game is simulated by a two-player game, in which each player simulates the nodes in one of the sides of the bipartite graph by choosing a block and a strategy for each node in it; that is, the total number of actions of each of the two players is about 2n2^{\sqrt{n}}, and such is the complexity of the reduction (this is necessary if one wishes to rule out better than quasipolynomial algorithms). The interactions between blocks, plus certain particular side games played by the two players, ensure a faithful enough simulation of the original multimatrix game.

In addition to Theorem 1, we prove two other complexity consequences of Conjecture 2, solving certain open problems in the area: An improved inapproximability result for the problem of relative (multiplicative) approximation of the Nash equilibrium first established by Dskalakis , and an inapproximability result for the problem of finding a competitive equilibrium with equal incomes and indivisible goods when one seeks to minimize the Gini index of the income distribution.

The (ε,δ)𝜀𝛿(\varepsilon,\delta)-Gcircuit and Weak Approximate Nash

Generalized circuits are similar to the standard algebraic circuits, the main difference being that generalized circuits contain cycles, which allow them to verify fixed points of continuous functions. We restrict the class of generalized circuits to include only a particular list of gates described below. Formally,

[Generalized circuits, ] A generalized circuit S{\cal S} is a pair (V,T)\left(V,{\cal{T}}\right), where VV is a set of nodes and T{\cal{T}} is a collection of gates. Every gate TTT\in{\cal{T}} is a 5-tuple T=G(ζv1,v2v)T=G\left(\zeta\mid v_{1},v_{2}\mid v\right), in which G{Gζ,G×ζ,G=,G+,G,G<,G,G,G¬}G\in\{G_{\zeta},G_{\times\zeta},G_{=},G_{+},G_{-},G_{<},G_{\vee},G_{\wedge},G_{\neg}\} is the type of the gate; ζR{nil}\zeta\in{R}\cup\left\{nil\right\} is a (optional) real parameter; v1,v2V{nil}v_{1},v_{2}\in V\cup\{nil\} are the first and second input nodes of the gate (one or both of them may be missing); and vVv\in V is the output node; no two distinct gates have the same output.

Alternatively, we can think of each gate as a constraint on the values on the incoming and outgoing wires. We are interested in the following constraint satisfaction problem: given a generalized circuit, find an assignment to all the wires that simultaneously satisfies all the gates. When every gate computes a continuous function of the incoming wires, a solution must exist by Brouwer’s fixed point theorem.

In particular, we are interested in the approximate version of this CSP, where we must approximately satisfy most of the constraints.

Given a generalized circuit S=(V,T){\cal S}=\left(V,{\cal T}\right), (ε,δ)(\varepsilon,\delta)-Gcircuit is the problem of finding an assignment that (ε,δ)(\varepsilon,\delta)-approximately satisfies it.

In an (ε,δ)\left(\varepsilon,\delta\right)-weak approximate Nash equilibrium at most a δ\delta-fraction of the players can gain more than ε\varepsilon by deviating.

The reduction from (ε,δ)(\varepsilon,\delta)-Gcircuit to (ε,δ)\left(\varepsilon^{\prime},\delta^{\prime}\right)-weak approximate Nash equilibrium follows analogously to the reduction in from ε\varepsilon-Gcircuit to ε\varepsilon^{\prime}-approximate Nash equilibrium.

The first step is to reduce (ε,δ)(\varepsilon,\delta)-Gcircuit to \big{(}\Theta(\varepsilon^{2}),\Theta(\varepsilon\cdot\delta)\big{)}-Gcircuit with fan-out 22. The naive way to do this is to replace larger fan-outs with a binary tree of G=G_{=} gates. Daskalakis et al. successfully use this method for exponentially small ε\varepsilon, but for constant ε\varepsilon the iterative application of G=G_{=} gates accumulates too much noise (Θ(εlogn)\Theta(\varepsilon\cdot\log n)). Instead, we follow and focus on logical gates whose noise does not accumulate. Given an output of a logical gate (G<,G,G,G¬G_{<},G_{\vee},G_{\wedge},G_{\neg}), we can introduce a binary tree (with even depth) of G¬G_{\neg} gates that copy the original output instead of having a single gate with a large fan-out. When we begin with an output of an arithmetic gate (Gζ,G×ζ,G=,G+,GG_{\zeta},G_{\times\zeta},G_{=},G_{+},G_{-}), we first use Θ(1/ε)\Theta(1/\varepsilon) gates of types G=G_{=} and G<G_{<} to parse it into an ε\varepsilon-precision unary representation. Then, we copy each bit in the unary representation using a tree of G¬G_{\neg} gates. Finally, we (approximately) recover the original value from each copy of the unary representation using Θ(1/ε)\Theta(1/\varepsilon) gates of types G×ζG_{\times\zeta} and G+G_{+}. Notice that the number of new gates introduced for each gate at the leaves of the binary tree is Θ(1/ε)\Theta(1/\varepsilon). Therefore, if the solution to the new (fan-out 22) instance violates an O(δε)O(\delta\cdot\varepsilon)-fraction of the gate-constraints, then it induces a solution to the original (arbitrary fan-out) instance that violates at most a δ\delta-fraction of the constraints. (See the full version of for more details.)

We now reduce (ε,δ)(\varepsilon,\delta)-Gcircuit with fan-out 22 to (ε,δ)\left(\varepsilon^{\prime},\delta^{\prime}\right)-weak approximate Nash equilibrium. Daskalakis et al. construct, for each gate in the generalized circuit, a game gadget with a few players whose mixed strategies at approximate equilibrium simulate the computation carried by the gate. In other words, every approximate equilibrium of the gate gadget induces an approximately satisfying assignment to the corresponding input and output lines in the generalized circuit. See Lemma 1 for an example of such a gadget. The gadgets are concatenated by identifying the “output player” of one gadget with the “input player(s)” of the next gadget(s) in the generalized circuit. Since each gadget is composed of only a constant number of players, if all but a δ\delta^{\prime}-fraction of the players play approximately at equilibrium, all but a δ=Θ(δ)\delta=\Theta(\delta^{\prime})-fraction of the gates are violated by the induced assignment to the generalized circuit.

The above construction suffices to show hardness for an (ε^,δ)\left(\hat{\varepsilon},\delta^{\prime}\right)-weak well-supported Nash equilibrium, i.e. one for where all but a δ\delta^{\prime}-fraction of the players, every action in the support is ε^\hat{\varepsilon}-approximately best response. Finally, for constant degree graphical games, it is not hard to derive an (ε^,δ)\left(\hat{\varepsilon},\delta^{\prime}\right)-weak well-supported Nash equilibrium from any (ε,δ)\left(\varepsilon^{\prime},\delta^{\prime}\right)-weak approximate Nash equilibrium for ε=Θ(ε^2)\varepsilon^{\prime}=\Theta(\hat{\varepsilon}^{2}) [26, Lemma 4].

In the other direction, from (ε,δ)\left(\varepsilon^{\prime},\delta^{\prime}\right)-weak approximate Nash equilibrium to (ε,δ)(\varepsilon,\delta)-Gcircuit, it suffices to construct a generalized circuit that computes an ε\varepsilon^{\prime}-approximate best response of each player (except a δ\delta^{\prime}-fraction). Notice that in a graphical game with a constant number of actions per player, given a profile of mixed strategies, one only needs a constant number of arithmetic gates to compute a best response of any player. In particular, to compute an ε\varepsilon-approximate best response it suffices to use gates of finite precision ε=Θ(ε)\varepsilon^{\prime}=\Theta(\varepsilon). Finally, since the number of gates per player is constant, if at most a δ\delta-fraction of the gates are ε\varepsilon-unsatisfied, they can appear in the best-response computation for at most a δ=Θ(δ)\delta^{\prime}=\Theta(\delta)-fraction of the players. ∎

Let v1v_{1},v2v_{2}, and ww be players in a graphical game, and suppose that the payoffs of v2v_{2} and ww are as follows.

Then, in every ε\varepsilon-NE p[v2]=min(ζp[v1],1)±ε\mathbf{p}\left[v_{2}\right]=\min\left(\zeta\mathbf{p}\left[v_{1}\right],1\right)\pm\varepsilon, where p[u]\mathbf{p}\left[u\right] denotes the probability that uu assigns to strategy 11.

(Sketch) If p[v2]>ζp[v1]+ε\mathbf{p}\left[v_{2}\right]>\zeta\mathbf{p}\left[v_{1}\right]+\varepsilon, then in every ε\varepsilon-NE p[w]=1\mathbf{p}\left[w\right]=1, which contradicts p[v2]>ε\mathbf{p}\left[v_{2}\right]>\varepsilon. Similarly, if p[v2]<min(ζp[v1],1)ε\mathbf{p}\left[v_{2}\right]<\min\left(\zeta\mathbf{p}\left[v_{1}\right],1\right)-\varepsilon, then p[w]=0\mathbf{p}\left[w\right]=0, which yields a contradiction when p[v2]<1ε\mathbf{p}\left[v_{2}\right]<1-\varepsilon. ∎

Proof of Theorem 1

The proof is a reduction from weak approximation of Nash equilibrium in a polymatrix game (recall Theorem 2) to the two-player problem. The two players simultaneously play three games: the main game, which is the heart of the reduction; and two games based on a construction due to Althofer , which impose structural properties of any approximate Nash equilibrium (interestingly, the oblivious lower bound of uses the same game). The final payoff of a player is the sum of payoffs in all three games. For convenience of notation, the payoffs in each game will be in ;inordertonormalizethepayoffsinthefinalgameto; in order to normalize the payoffs in the final game to one should multiply by 1/31/3 the payoffs in all three games.

We let each of the two players “control” the vertices on one side of the bipartite graphical game (we henceforth use players only for the players in the bimatrix game, and refer to the players in the graphical polymatrix game as vertices). For ease of notation, we assume wlog that each player controls an equal number of vertices, nn.

We partition the vertices of each player into n/kn/k disjoint subsets of size at most 2k=2n2k=2\sqrt{n}, such that every two subsets share at most 1818 edges. By Lemma 6, we can efficiently find such a partition. Let (S1,,Sn/k)\left(S_{1},\dots,S_{n/k}\right) and (T1,,Tn/k)\left(T_{1},\dots,T_{n/k}\right) denote the partitions of the respective players. Each action of the players corresponds to a choice of a subset (out of n/kn/k subsets in the partition), and a choice of an action for each vertex in the subset (out of at most 22k2^{2k} vectors of actions). Together, the main game has (nk22k)×(nk22k)\left(\frac{n}{k}\cdot 2^{2k}\right)\times\left(\frac{n}{k}\cdot 2^{2k}\right) action profiles. When players choose actions (Si,αSi)\left(S_{i},\vec{\alpha}_{S_{i}}\right) and (Tj,βTj)\left(T_{j},\vec{\beta}_{T_{j}}\right), the payoff of the row player is the sum of the payoffs over all shared edges of SiS_{i} and TjT_{j}, when they play the respective strategies from αSi\vec{\alpha}_{S_{i}} and βTj\vec{\beta}_{T_{j}} (here we use the polymatrix structure of the payoffs in the graphical game; the payoffs are defined over edges and therefore payoffs of a certain vertex can be defined even though not necessarily all its neighbours are playing). Finally, we normalize by λ/18\lambda/18, where λ=Θ(δ2)\lambda=\Theta(\delta^{2}) is a small constant that satisfies λ>ε\lambda>\varepsilon^{\prime}. Similarly the payoff of the second player is derived from the payoffs of the vertices in TjT_{j}.

Althofer’s games

Altofer’s game is an asymmetric hide-and-seek win-lose game over ll locations. The hider chooses a location i[l]i\in[l], the seeker chooses a subset B[l]B\subset[l] of of size B=l/2|B|=l/2. The hider wins if iBi\notin B, the seeker wins if iBi\in B. Namely, the payoff function is given by

In Althofer’s game each player can guarantee 12\frac{1}{2} by uniform play, therefore the value of the game is 12\frac{1}{2}. Althofer’s game enjoys the following strong property: in every ε\varepsilon-approximate Nash equilibrium, the hider must play a mixed strategy that is O(ε)O(\varepsilon)-close to the uniform distribution in total variation distance; see .

2 Structure of an equilibrium

For a mixed action xx of player 11 we denote by x(Si)x(S_{i}) the total probability that the player chooses to control the vertices SiS_{i}; i.e.,

If (x,y)(x,y) is an ε\varepsilon^{\prime}-Nash equilibrium, then i[n/k]x(Si)kn=O(λ)\sum_{i\in[n/k]}\left|x\left(S_{i}\right)-\frac{k}{n}\right|=O\left(\lambda\right).

In the proof we use the following Lemma from :

Let {ai}i[l]\{a_{i}\}_{i\in[l]} be real numbers satisfying the following properties for some θ>0\theta>0: (1) a1a2...ala_{1}\geq a_{2}\geq...\geq a_{l}; (2) i[l]ai=0\sum_{i\in[l]}a_{i}=0; and (3) i[l/2]aiθ\sum_{i\in[l/2]}a_{i}\leq\theta. Then i[l]ai4θ\sum_{i\in[l]}|a_{i}|\leq 4\theta.

In order apply Lemma 3, we denote ai=x(Si)kna_{i}=x(S_{i})-\frac{k}{n} and we assume wlog that x(S1)x(S2)...x(Sn/k)x(S_{1})\geq x(S_{2})\geq...\geq x(S_{n/k}). Then the first two conditions hold. Regarding the third condition, we argue that i[l/2]x(Si)3λ\sum_{i\in[l/2]}x(S_{i})\leq 3\lambda this will complete the proof.

Player 1 can guarantee a payoff of 1/21/2 in the sum of Althofer’s games as a hider and the main game, by playing uniformly and choosing an arbitrary actions for the controlled vertices (for instance, αSi=0\vec{\alpha}_{S_{i}}=\vec{0}). Assume by contradiction that i[l/2]x(Si)>3λ\sum_{i\in[l/2]}x(S_{i})>3\lambda. Since Player 2 is ε\varepsilon^{\prime}-best replying, his payoff in the Althofer game (as a seeker) is at least 12+3λε\frac{1}{2}+3\lambda-\varepsilon^{\prime} (because he can get 1/2+3λ1/2+3\lambda by choosing the set [l/2][l/2]). Therefore, Player 1’s payoff in the Althofer game (as a hider) is at most 123λ+ε\frac{1}{2}-3\lambda+\varepsilon^{\prime}. If we add to it his payoffs in the main game, then his payoff is at most 122λ+ε\frac{1}{2}-2\lambda+\varepsilon^{\prime}. Therefore, player 1 can increase his payoff by 2λε>ε2\lambda-\varepsilon^{\prime}>\varepsilon^{\prime} by deviating to the uniform distribution over the locations (as a hider), and maintaining his mixed action as a seeker. ∎

3 Completing the proof

Any mixed strategy xx of player 1 in the bimatrix game induces a mixed strategy of all the vertices in iSi\cup_{i}S_{i} in the obvious way. Vertex sSis\in S_{i} plays the action 11 with probability pp where pp is the conditional probability

If the event “Player 1 controls SiS_{i}” occurs with probability 0, then we define (arbitrarily) that p=1p=1. Similarly, any mixed strategy yy of player 2 induces a mixed strategy of all the vertices in iTi\cup_{i}T_{i}.

We claim that if (x,y)\left(x,y\right) is an ε\varepsilon^{\prime}-approximate Nash equilibrium of the bimatrix game, then the induced mixed-strategies profile is an (ε,δ)\left(\varepsilon,\delta\right)-approximate Nash equilibrium of the original graphical game.

By Lemma 2 and Markov’s inequality, for a (1O(λ))\left(1-O\left(\sqrt{\lambda}\right)\right)-fraction of the subsets, the row player distributes within a (1±O(λ))\left(1\pm O\left(\sqrt{\lambda}\right)\right)-factor of the correct weight (k/nk/n); i.e. x(Si)kn[1O(λ),1+O(λ)]x\left(S_{i}\right)\in\frac{k}{n}\cdot\left[1-O\left(\sqrt{\lambda}\right),1+O\left(\sqrt{\lambda}\right)\right]. Let us restrict our attention only to those subsets, and call them good. We say that a vertex is good if it and all of its neighbors belong to good subsets. Since the game graph is of bounded degree, we again have that a (1O(λ))\left(1-O\left(\sqrt{\lambda}\right)\right)-fraction of the vertices is good.

Consider any good vertex whose induced mixed strategy is not ε\varepsilon-optimal in the original game given that the rest of the vertices also play according to their induced strategies. Then, changing its strategy in the bimatrix game (while leaving all other marginals the same), would increase the payoff of its player by at least

where the (1O(λ))2(kn)2\left(1-O\left(\sqrt{\lambda}\right)\right)^{2}\cdot\left(\frac{k}{n}\right)^{2} term corresponds to the probabilities that the subsets corresponding to both the vertex and any of its neighbors are played; the (1O(λ))ε\left(1-O\left(\sqrt{\lambda}\right)\right)\cdot\varepsilon term corresponds to the improvement of the vertex in the bimatrix game, where instead of summing the payoff in all three edges (as in the graphical game) these three edges have weights {1+γj}j=1,2,3\{1+\gamma_{j}\}_{j=1,2,3} for γj=O(λ)|\gamma_{j}|=O(\sqrt{\lambda}); finally λ/18\lambda/18 is the normalization. The right side follows by plugging in k=nk=\sqrt{n}.

If (by a contradiction) a δ\delta-fraction of the vertices have ε\varepsilon-improvement strategies; then one of the players has (δ/2O(λ))n\left(\delta/2-O\left(\sqrt{\lambda}\right)\right)\cdot n good vertices with an ε\varepsilon-improvementHere we use our choice of λ=Θ(δ2)\lambda=\Theta(\delta^{2}), which guarantees that δ/2O(λ)=Ω(δ)\delta/2-O\left(\sqrt{\lambda}\right)=\Omega(\delta). This player can benefit Ω(ελ/n)\Omega\left(\varepsilon\cdot\lambda/n\right) from a deviation of each vertex. So his total improvement from all deviations simultaneously is Ω(εδλ)\Omega\left(\varepsilon\cdot\delta\cdot\lambda\right) - which is impossible when (x,y)(x,y) is an ε\varepsilon^{\prime}-approximate Nash equilibrium for ε=Θ(εδλ)\varepsilon^{\prime}=\Theta(\varepsilon\cdot\delta\cdot\lambda).

Implications for Relative ε𝜀\varepsilon-approximate Nash

Daskalakis defines a notion of relative (sometimes also called multiplicative , as opposed to the more standard additive) ε\varepsilon-Nash equilibrium, and proves that in two-player games with payoffs in [1,1]\left[-1,1\right], finding a relative ε\varepsilon-well supported Nash equilibrium is \PPAD-complete. In particular, he concludes that the quasi-polynomial algorithm of Lipton et al. cannot achieve this notion of approximate equilibrium.

This result has two caveats: (1) Through the use of positive and negative payoffs, the gain from deviation is large compared to the expected payoff, only because the latter is small due to cancellation of positive and negative payoffs. Namely, the gain from deviation may be very small compared to the expected magnitude of the payoff. (2) It only applies to the more restrictive notion (thus rendering the hardness result weaker) of well-supported approximate equilibrium; i.e. an equilibrium where every action in the support has to be approximately optimal. Showing \PPAD-hardness for both positive payoffs and general (non well-supported) approximate equilibrium were left as open questions in . Recently, the first question was settled in where it was shown that finding a relative ε\varepsilon-well-supported Nash equilibrium with positive payoffs is indeed \PPAD-complete.

Here, assuming Conjecture 2, we settle the second question: we show that finding any relative ε\varepsilon-approximate Nash equilibrium is \PPAD-complete. Furthermore, in our hard instance the row player has only positive payoffs and the column player has only negative payoffs, and so there is no cancellation of payoffs as in the construction of .

Assuming Conjecture 2 there exists a constant ε=Θ(εδ3)>0\varepsilon^{\prime}=\Theta\left(\varepsilon\cdot\delta^{3}\right)>0 such that finding a relative ε\varepsilon^{\prime}-approximate Nash equilibrium (ANE) in a bimatrix game where the row player’s payoffs are non-negative and the column player’s payoffs are non-positive is \PPAD-complete.

Proving a similar theorem when both players have positive payoffs remains an interesting open question. In fact, we do not know of any instances with positive payoffs where all ε\varepsilon-approximate Nash equilibria must have large (e.g., linear, or even super-logarithmic) support.

We reduce from the problem of finding an (additive) (ε,δ)\left(\varepsilon,\delta\right)-Nash in a bipartite, degree 33, polymatrix game with two actions per player. We construct a main game where the bimatrix game players (henceforth just players) control the nodes of the polymatrix game, and two side games that guarantee that each player randomizes (approximately) uniformly over all her nodes.

We let the row player “control” the nodes on one side of the bipartite game graph, and let the column player control the nodes on the other side. Namely, let nn be the number of nodes on each side of the graph; each player has 2n2n actions, each corresponding to a choice of node and an action for that node. When the players play strategies that correspond to adjacent nodes in the graphs, they receive the payoffs of the corresponding nodes, scaled (by a small positive constant η=O(δ2)\eta=O\left(\delta^{2}\right)) and shifted to fit in the intervals: [1,1+η]\left[1,1+\eta\right] for the row player, and [(1+η),1]\left[-\left(1+\eta\right),-1\right] for the column player. If the nodes played do not share an edge in the bipartite game graph, the utility for both players is zero. Notice that if either player chooses her node uniformly at random (and chooses the action for that node arbitrarily), then the expected payoff is in 3n[1,1+η]\frac{3}{n}\cdot\left[1,1+\eta\right] for the row player, and in 3n[(1+η),1]\frac{3}{n}\cdot\left[-\left(1+\eta\right),-1\right] for the column player.

We also let the players play two hide-and-seek win-lose zero-sum games over a space of nn actions. In both games, the row player is chasing the column player. In each game, if they pick the same strategy, the row player receives payoff 11 (and the column player receives 1-1); otherwise the payoffs are . Finally, in the first side game we identify the row player’s strategies with her choice of nodes in the main game. Namely, if she plays node ii in the main game and the column player chose ii in the first side game, then her payoff from this side game is 11. Similarly, we identify the column player’s strategies in the second game with her choice of nodes in the main game.

We proceed by showing that in every relative ε\varepsilon^{\prime}-ANE, the row player’s utility is approximately 5/n5/n, and the column player’s utility is approximately 5/n-5/n (Lemma 4); then we show that in every relative ε\varepsilon^{\prime}-ANE, both players randomize approximately uniformly over their nodes (Lemma 5); finally we use these two observations to complete the proof (Subsection 4.2). ∎

Given mixed strategies (x,y)\left(x,y\right), we let x(i)x\left(i\right) denote the total probability that the row player assigns to node ii, and analogously for y(j)y\left(j\right). We also let x(y)x^{*}\left(y\right) and y(x)y^{*}\left(x\right) denote the corresponding best responses of each player. Finally, let UR(x;y)U_{R}\left(x;y\right) and UC(y;x)U_{C}\left(y;x\right) denote the expected payoffs for the row and column players, respectively.

If (x,y)\left(x,y\right) is a relative ε\varepsilon^{\prime}-ANE, then

Observe that the main game is relative η\eta-approximately zero-sum; i.e. for any pure strategy profile (x,y)\left(x^{\prime},y^{\prime}\right) the expected utilities UR\mboxmain(x;y),UC\mboxmain(y;x)U_{R}^{\mbox{main}}\left(x^{\prime};y^{\prime}\right),U_{C}^{\mbox{main}}\left(y^{\prime};x^{\prime}\right) of the row and column player, respectively, satisfy:

By linearity of expectation and convexity of the absolute value function, this continues to hold when (x,y)\left(x^{\prime},y^{\prime}\right) are mixed strategies. Furthermore, since the side games are exactly zero-sum and have the same signs as the main game, the same follows for the payoffs from the entire game:

In particular, the above inequality holds for (x,y)\left(x,y\right). Thus, the upper bounds in (1) and (2) follow from the lower bounds in (2) and (1), respectively.

Finally, (1ε)5nUR(x;y)\left(1-\varepsilon^{\prime}\right)\cdot\frac{5}{n}\leq U_{R}\left(x;y\right) in every relative ε\varepsilon^{\prime}-ANE, because the row player can guarantee an expected payoff of at least 5n\frac{5}{n} by randomizing uniformly over all her strategies. Similarly, UC(y;x)(1+ε)(3(1+η)+2n)U_{C}\left(y;x\right)\geq-\left(1+\varepsilon^{\prime}\right)\left(\frac{3\left(1+\eta\right)+2}{n}\right) because the column player can guarantee a payoff of at least 3(1+η)+2n-\frac{3\left(1+\eta\right)+2}{n} by randomizing uniformly over all her strategies.∎

There exists a constant λ=Θ(δ2)\lambda=\Theta\left(\delta^{2}\right) such that for every (x,y)\left(x,y\right) relative ε\varepsilon^{\prime}-ANE, x(i)1nλ\sum\left|x\left(i\right)-\frac{1}{n}\right|\leq\lambda and y(j)1nλ\sum\left|y\left(j\right)-\frac{1}{n}\right|\leq\lambda.

Assume by contradiction that x(i)1n>λ\sum\left|x\left(i\right)-\frac{1}{n}\right|>\lambda, then in particular there must exist an i[n]i\in\left[n\right] such that x(i)<(1λ/2)/nx\left(i\right)<\left(1-\lambda/2\right)/n. When the column player chooses her strategy uniformly at random in the main game and in the second side game and plays strategy ii in the first side game, her expected payoff is at least

Therefore, in any relative ε\varepsilon^{\prime}-ANE her expected utility is at least

Which contradicts the upper bound in (2) when we take λ\lambda sufficiently large, e.g. λ=10η\lambda=10\eta.

Similarly, if y(j)1n>λ\sum\left|y\left(j\right)-\frac{1}{n}\right|>\lambda, there must exist an j[n]j\in\left[n\right] such that y(j)>(1+λ/2)/ny\left(j\right)>\left(1+\lambda/2\right)/n. Therefore, the row player can guarantee a payoff of at least

Thus by relative ε\varepsilon^{\prime}-ANE,

which contradicts the upper bound in (1). ∎

2 Completing the proof of Theorem 3

Now, given a relative ε\varepsilon^{\prime}-ANE (x,y)\left(x,y\right), we can take, for each node ii, the mixed strategy induced by the probabilities x(i:1)/x(i)x\left(i:1\right)/x\left(i\right) and x(i:2)/x(i)x\left(i:2\right)/x\left(i\right) that the row player assigns to each action (respectively, y(j:1)/y(j)y\left(j:1\right)/y\left(j\right) and y(j:2)/y(j)y\left(j:2\right)/y\left(j\right) assigned by the column player). We claim that this strategy profile is an (ε,δ)\left(\varepsilon,\delta\right)-approximate equilibrium for the polymatrix game. Assume by contradiction that this is not the case.

By Lemma 5 and Markov’s inequality, a (1O(λ))\left(1-O\left(\sqrt{\lambda}\right)\right)-fraction of the nodes are played within λn\frac{\sqrt{\lambda}}{n} of the correct probability 1/n1/n. For a node ii controlled by the row player, we say that it is good if x(i)1/nλn\left|x\left(i\right)-1/n\right|\leq\frac{\sqrt{\lambda}}{n} and y(j)1/nλn    jN(i)\left|y\left(j\right)-1/n\right|\leq\frac{\sqrt{\lambda}}{n}\;\;\forall j\in{\cal N}\left(i\right), and analogously for column player’s nodes. Since the graph has bounded degree, a (1O(λ))\left(1-O\left(\sqrt{\lambda}\right)\right)-fraction of the nodes are good.

Let ii be any good node who has an ε\varepsilon-improving deviation from her induced strategy in the polymatrix game. If the player who controls ii makes the corresponding deviation in the two-player game, she increases her expected payoff by at least εη1n2(1O(λ))\varepsilon\cdot\eta\cdot\frac{1}{n^{2}}\cdot\left(1-O\left(\sqrt{\lambda}\right)\right). (We multiply by η\eta to account for the scaling; by 1/n21/n^{2} for the probability that this player plays node ii and the other player plays a neighbor of ii; and by (1O(λ))\left(1-O\left(\sqrt{\lambda}\right)\right) to correct for the deviation from 1/n1/n in the latter probabilities.) By our assumption that the induced strategy profile is not an (ε,δ)\left(\varepsilon,\delta\right)-approximate equilibrium for the polymatrix game, at least one of the players has at least (δO(λ))n\left(\delta-O\left(\sqrt{\lambda}\right)\right)n good nodes with ε\varepsilon-improving deviations. Summing her gains from those deviations, we get that this player has can improve her expected payoff by at least [εη(1O(λ))](δO(λ))/n\left[\varepsilon\cdot\eta\cdot\left(1-O\left(\sqrt{\lambda}\right)\right)\right]\left(\delta-O\left(\sqrt{\lambda}\right)\right)/n.

However, this is a contradiction since it follows from Lemma 4, that (x,y)\left(x,y\right) is also an (additive) ε(6n)\varepsilon^{\prime}\cdot\left(\frac{6}{n}\right)-ANE. ∎

Implications for Fairness Mechanisms

Competitive equilibrium from equal incomes (CEEI) is a well-known fair allocation mechanism : We give all agents a unit of money, and price the goods in such a way that the market clears. It is also well-known that when goods are indivisible or utilities are non-linear, an equilibrium may not exist. However, Budish proves that an approximate CEEI still exists. This concept of approximate equilibrium is used in practical system for allocating seats in courses to business school students .

Budish measures the proximity to a perfect CEEI via two parameters: a solution is an (α,β)\left(\alpha,\beta\right)-CEEI if the clearing error of the competitive equilibrium is less than α\alpha, and all the incomes are between 11 and 1+β1+\beta. Budish shows that an (α,β)\left(\alpha,\beta\right) always exists for some favorable α=α\alpha=\alpha^{*}, and any β>0\beta>0. Recently, showed a reduction from ε\varepsilon-Gcircuit with fan-out 22, to the problem of finding an (α,Θ(ε/log(1/ε)))\left(\alpha^{*},\Theta\left(\varepsilon/\log\left(1/\varepsilon\right)\right)\right)-CEEI. In particular, when combined with the results of , this implies that it is \PPAD-complete to find an (α,β)\left(\alpha^{*},\beta\right)-CEEI for some constant β>0\beta>0.

The β\beta parameter used in Budish’s formulation is an imperfect way of measuring income inequalities, as it may be set by a single outlier. Perhaps the best known, and most widely used, measure of income inequality is the Gini index (e.g. ) (see the Appendix for the definition). In fact, the Gini index is used to assess the performance of the class seat assignment system used in practice .

Assuming Conjecture 2, finding an income assignment and prices with low market clearing error and near-optimal Gini index is PPAD-complete.

See the appendix for the precise definitions and statement of the result.

Discussion

The purpose of this paper is to showcase an important open problem:

reduce End of the Line to (ε,δ)(\varepsilon,\delta)-Gcircuit

— i.e., show that Conjecture 1 implies Conjecture 2. As we mentioned, such a reduction would imply a “PCP for \PPAD”; i.e. it would imply that there exists a probabilistically checkable proof for a \PPAD-complete problem.

The equivalent statement for the class of \NP-complete problem, is the celebrated PCP Theorem . What can we learn from our experience with constructing PCP’s for \NP? First of all, we have many different constructions of PCPs for \NP, and this may be seen as circumstantial evidence that constructing a PCP for \PPAD is possible. Equally important, there are several constructions of PCPs for \NP that are near-linear (e.g. ); notice that for the application to ε\varepsilon-Nash in bimatrix games we need the proof length to be sub-quadratic.

The next question one should ask is, whether we can adapt the techniques used in the proofs of the PCP Theorem to the class of \PPAD-complete problems. We briefly and informally sketch some of our thoughts on the matter. All the proofs of the PCP Theorems that we are aware of (including ) compose an inner verifier with an outer verifier. The outer verifier in Dinur’s proof is combinatorial in nature, and it seems plausible that the same or similar techniques could be modified to fit the generalized circuit graph. Unfortunately, all inner verifiers that we know are discrete in nature, whereas our characterization of \PPAD with Gcircuit (or Nash) is inherently based on continuous constraints. Thus the following interesting (and somewhat open-ended) questions arise: Can we find a characterization of \PPAD via a constraint satisfaction problem whose constraints are discrete in nature? Alternatively, can one construct an inner (non-efficient) verifier using the constraints specified in the definition of the Gcircuit problem?

Many thanks to Boaz Barak, Paul Cristiano, Muli Safra, and Madhu Sudan for inspiring discussions about probabilistic checkable proofs.

References

Appendix A Finding a good partition

Let G=(U,V,E)G=\left(U,V,E\right) be a bipartite dd-regular graph with n=U=Vn=\left|U\right|=\left|V\right|. Then we can efficiently find partitions S1,,Sn/kS_{1},\dots,S_{n/k} and T1,,Tn/kT_{1},\dots,T_{n/k} of UU and VV, respectively, to disjoint subsets, such that each subset has size at most 2k=2n2k=2\sqrt{n}, and:

Let S1,,Sn/kS_{1},\dots,S_{n/k} be an arbitrary partition of UU into disjoint subsets of size exactly kk. We inductively place the vertices of VV into subsets T1,,Tn/kT_{1},\dots,T_{n/k}, while maintaining the desiderata that each subset TjT_{j} is of size at most 2k2k, and for every i,ji,j the number of edges from SiS_{i} to TjT_{j} is at most 2d2k2/n2d^{2}k^{2}/n.

It is left to show that for any partial partition of VV, there is a subset TJT_{J} into which we can place the next vertex vv. In expectation, every subset TjT_{j} has less than kk vertices. Therefore by Markov’s inequality, less than half of the subsets have 2k2k vertices or more. vv has neighbors in at most dd subsets SiS_{i}. Recall that the SiS_{i}’s are of size exactly kk. Thus each SiS_{i} with a neighbor of vv has, in expectation over jj, less than dk2/ndk^{2}/n neighbors in TjT_{j}. Using Markov’s inequality again, less than a 1/(2d)1/(2d)-fraction of the subsets TjT_{j} contain 2d2k2/n2d^{2}k^{2}/n neighbors of SiS_{i}. In total, we lose less than half the subsets for the size desideratum, and less than 1/(2d)1/(2d)-fraction of the subsets for each SiS_{i} containing a neighbor of vv. Therefore there always remains at least one subset TjT_{j} to which we can add vv. ∎

Appendix B The course allocation problem

Even though the approximate CEEI and the existence theorem in are applicable to a broad range of allocation problems, we shall describe our results in the language of the course allocation problem.

We are given a set of MM courses with integer capacities (the supply) (qj)j=1M(q_{j})_{j=1}^{M}, and a set of NN students, where each student ii has a set Ψi2M\Psi_{i}\subseteq 2^{M} of permissible course bundles, with each bundle containing at most kMk\leq M courses. The set Ψi\Psi_{i} encodes both scheduling constraints (e.g., courses that meet at the same time) and any constraints specific to student ii (e.g. prerequisites).

Each student ii has a strict ordering over her permissible schedules, denoted by i\preccurlyeq_{i}. We allow arbitrarily complex preferences — in particular, students may regard courses as substitutes or complements. More formally:

Course Allocation Problem The input to a course allocation problem consists of:

For each student ii a set of course bundles (Ψi)i=1N\left(\Psi_{i}\right)_{i=1}^{N}.

The students’ reported preferences, (i)i=1N\left(\preccurlyeq_{i}\right)_{i=1}^{N},

The course capacities, (qj)j=1M\left(q_{j}\right)_{j=1}^{M}, and

The output to a course allocation problem consists of:

Prices for each course (pj)j=1M(p_{j}^{*})_{j=1}^{M},

Allocations for each student(xi)i=1N(x_{i}^{*})_{i=1}^{N}, and

Budgets for each student (bi)i=1N\left(b_{i}^{*}\right)_{i=1}^{N}.

The quality of an allocation is evaluated based on its proximity to market clearing and income equality. As for the definition of market clearing error, it suffices for our purposes to require that no course is over-subscribed or under-subscribed by more than α\alpha^{*} students (we refer the curious reader to for the precise definition). The Gini coefficient, which measures income inequality, is discussed in the following subsection.

Given a distribution of incomes DD, the Lorenz curve plots, for each xx\in, the cumulative wealth owned by the bottom xx-fraction of the population. Let F_{D}^{-1}(x)=\sup\big{\{}y\colon\Pr_{z\sim D}[z\leq y]\leq x\big{\}}, and define LD(x)=(0xFD1(x))/(01FD1(x))L_{D}(x)=\left(\int_{0}^{x}F_{D}^{-1}(x)\right)/\left(\int_{0}^{1}F_{D}^{-1}(x)\right). Then the Lorenz curve is the graph \big{(}x,L_{D}(x)\big{)}, for xx\in. Notice that FD1(x)F_{D}^{-1}(x) is monotonically non-decreasing, so the Lorenz curve is convex.

The Gini index is defined as the ratio of the area between the \big{(}x,x\big{)} line and the Lorenz curve (by the convexity, xLD(x)xx\geq L_{D}(x)\forall x\in), divided by the entire area under the \big{(}x,x\big{)} line (the latter is always 1/21/2): GD=201(xLD(x))dx=1201LD(x)dxG_{D}=2\int_{0}^{1}(x-L_{D}(x))dx=1-2\int_{0}^{1}L_{D}(x)dx

In general, a smaller Gini index corresponds to a more equal distribution of wealth. For example, when all incomes are exactly equal, the Lorenz curve is exactly equal to the \big{(}x,x\big{)} line, and the Gini index is . On the other extreme, when one person has all the wealth, the area under the Lorenz curve goes to as the population size increases, and the Gini index approaches 11.

B.2 Intractability of approximate CEEI with near-optimal Gini index

Assuming Conjecture 2, there exists some constant γ>0\gamma>0 such that finding an allocation with market clearing error α\alpha^{*} and Gini coefficient γ\gamma is \PPAD-complete.

The rest of this section is devoted to sketching a proof of Theorem 4. In the next subsection, we briefly outline the reduction of from generalized circuits with fan-out 22 to approximate CEEI. (Recall that in Section 2 we show how to convert any generalized circuit to one with fan-out 22.) Then, we show that after normalizing the median budget to 1+ε/21+\varepsilon^{\prime}/2, in any allocation which does not correspond to a valid solution to the (ε,δ)(\varepsilon,\delta)-Gcircuit instance, a δ\delta^{\prime}-fraction of the students have budgets either at most 11 or at least 1+ε1+\varepsilon^{\prime}, (for some constants ε=Θ(ε/log(1/ε))\varepsilon^{\prime}=\Theta\left(\varepsilon/\log(1/\varepsilon)\right) and δ=Θ(δ/log(1/ε))\delta^{\prime}=\Theta\left(\delta/\log(1/\varepsilon)\right)). Then, the proof is complete with the following lemma:

Let the median income be 1+ε/21+\varepsilon^{\prime}/2, and suppose that a δ\delta^{\prime}-fraction of the population has income at most 11 (resp. at least 1+ε1+\varepsilon^{\prime}). Then the Gini index is at least γ=Θ(δε)=Θ(δε/log2(1/ε))\gamma=\Theta(\delta^{\prime}\cdot\varepsilon^{\prime})=\Theta(\delta\cdot\varepsilon/\log^{2}(1/\varepsilon)).

The total income of the poorer half of the population is at most (1/2δ)(1+ε/2)+δ=(1/2)(1+ε/2)δε/2(1/2-\delta^{\prime})(1+\varepsilon^{\prime}/2)+\delta^{\prime}=(1/2)(1+\varepsilon^{\prime}/2)-\delta^{\prime}\cdot\varepsilon^{\prime}/2, whereas the richer half of the population has a total income of at least (1/2)(1+ε/2)(1/2)(1+\varepsilon^{\prime}/2). Therefore, the Lorenz curve’s value at 1/21/2 is bounded by

We can now use elementary geometry to bound the area under the Lorenz curve:

Therefore the Gini index is at least Θ(δε)\Theta(\delta^{\prime}\cdot\varepsilon^{\prime}). A similar argument works when a δ\delta^{\prime}-fraction of the population has income at least 1+ε1+\varepsilon^{\prime}.

B.3 From generalized circuits to Course Allocation

reduce generalized circuits to Course Allocation, by constructing a gadget for each gate of the generalized circuit. (In fact, a few gadgets per gate are required to handle a subtle issue that call ”course-size amplification”.) We provide the gadget for the NOT gate as an example, and then describe the properties of the ’s reduction that we need to complete the proof.

We henceforth normalize the prices and budgets in every assignment to the Course Allocation instance such that the median budget is 1+ε/21+\varepsilon^{\prime}/2.

Let nx>6αn_{x}>6\alpha^{*} and suppose that the economy contains the following courses:

c1xc_{{1-x}} with capacity q1x=2nx/3q_{{1-x}}=2n_{x}/3 (the “output course”);

nxn_{x} students interested only in the schedule {cx,c1x}\left\{c_{x},c_{{1-x}}\right\};

and suppose further that at most n1x=nx/6n_{{1-x}}=n_{x}/6 other students are interested in course c1xc_{{1-x}}.

Then in any normalized approximate CEEI with market clearing error at most α\alpha^{*}, at least one of the following must hold:

A constant fraction of the nxn_{x} students have budgets either less than 11 or greater than 1+ε1+\varepsilon^{\prime}.

If p1x>1px+εp_{{1-x}}^{*}>1-p_{x}^{*}+\varepsilon^{\prime}, then none of the nxn_{x} students can afford the bundle {cx,c1x}\left\{c_{x},c_{{1-x}}\right\} - except those whose budget is greater than 1+ε1+\varepsilon^{\prime}. Other than them, there are at most n1x=nx/6n_{{1-x}}=n_{x}/6 students enrolled in the c1xc_{{1-x}} - much less than the capacity 2nx/32n_{x}/3. Therefore for the market clearing error to be less than α=nx/6\alpha^{*}=n_{x}/6, nx/3n_{x}/3 of the students must have budget greater than 1+ε1+\varepsilon^{\prime}

On the other hand, if p1x<1pxp_{{1-x}}^{*}<1-p_{x}^{*}, then all nxn_{x} students can afford the bundle {cx,c1x}\left\{c_{x},c_{{1-x}}\right\} - except for those whose budget is less than 11. Therefore in order to satisfy the market clearing requirement, at least nx/6n_{x}/6 students must have budget less than 11.

Similarly, provide gadgets for all the gates in the definition of the Gcircuit problem, and show that they can be concatenated to simulate the computation on the generalized circuit. For each gate of the generalized circuit, ’s reduction uses at most Θ(1/log(1/ε))\Theta(1/\log(1/\varepsilon)) (in particular, a constant) number of gadgets. Furthermore, the number of students that participate in each of those gadgets is approximately the same (Θ(α)\Theta(\alpha^{*})). Therefore, for every gate which is not ε\varepsilon-satisfied, there are at least Θ(α)\Theta(\alpha^{*}) students whose budgets are either at most 11 or at least 1+ε1+\varepsilon^{\prime}. Thus if the assignment to the Course Allocation problem does not correspond to an (ε,δ)(\varepsilon,\delta)-approximate solution to Gcircuit, then a δ=Θ(δ/log(1/ε))\delta^{\prime}=\Theta(\delta/\log(1/\varepsilon))-fraction of the students must have budgets at most 11 or at least 1+ε1+\varepsilon^{\prime}. Applying Lemma 7 completes the proof of Theorem 4. ∎