Can Almost Everybody be Almost Happy? PCP for PPAD and the Inapproximability of Nash

Yakov Babichenko, Christos Papadimitriou, Aviad Rubinstein

Introduction

It is known that finding an $\varepsilon$ -approximate Nash equilibrium in a two-person game:

is \PPAD-complete if $\varepsilon$ is inversely polynomial ;

but can be solved in quasipolynomial time for any fixed $\varepsilon>0$ ; and

the smallest known polynomially attainable approximation ratio is still over $3\over 10$ .

These three facts articulate rather dramatically the mystery, by now almost a decade old, of the problem’s approximability.

Can the inversely polynomial inapproximability bound of be improved to constant? Unlikely, because by this would imply a quasipolynomial algorithm for the iconic problem in PPAD:

Conventional wisdom, supported by black-box lower bounds , is that this problem requires exponential time to solve. In direct analogy with the exponential time hypothesis , we state the following:

In this paper, we identify a plausible new conjecture, a strengthening of Conjecture 1 in a natural and novel direction, which implies that the quasipolynomial approximation algorithm of is optimum. In particular, we define the $(\varepsilon,\delta)$ -Gcircuit to be the problem of finding values for the variables (lines) that $\varepsilon$ -approximately satisfy a fraction of at least $(1-\delta)$ of the constraints (gates).

As we have mentioned, this latter statement, albeit with $\delta=0$ , is known to follow from only Conjecture 1, for some $\varepsilon>0$ .

The $(\varepsilon,\delta)$ -Gcircuit problem is a constraint satisfaction problem: each line/mixed strategy is a real variable, and each gate/player defines a constraint. Since we are considering $\varepsilon$ -approximate satisfaction of the constraints, each variable need be represented using only a constant (depending on $\varepsilon$ ) number of bits. Therefore, a satisfying assignment (truncations of a Nash equilibrium) can be distinguished from an unsatisfying one (one violating the $(\varepsilon-\delta)$ -relaxation) by querying a constant (proportional to $1/\delta$ ) number of bits of the input, determined at random (but of course not uniformly so). This suggests that Conjecture 2 can be interpreted as a probabilistically checkable proof formulation of Gcircuit: informally, it states that “\PPAD has a PCP” (that is, a complete problem whose witnesses can be verified by examining, at random, a finite number of bits).

The main result of this paper, explained next, reveals another similarity between the PCP formulation of NP and the $(\varepsilon,\delta)$ -Gcircuit problem: The intractability of the latter problem implies, among several other inapproximability theorems, the strongest possible inapproximability result for the 2-player Nash equilibrium problem, arguably the central open question in the area.

The Main Result

We denote by $\varepsilon$ -2Nash the problem of finding an $\varepsilon$ -Nash equilibrium in a bimatrix game. Our main result is the following:

Our proof, given in Section 3, employs the technique of “birthday repetition,” pioneered by and used in different contexts by , and in particular by to show intractability of a Nash equilibrium-related problem. Starting from a polymatrix game with two strategies per player and in which the player interactions form a cubic bipartite graph with $n$ nodes on each side, the players on both sides are broken into blocks of size $\sqrt{n}$ . The game is simulated by a two-player game, in which each player simulates the nodes in one of the sides of the bipartite graph by choosing a block and a strategy for each node in it; that is, the total number of actions of each of the two players is about $2^{\sqrt{n}}$ , and such is the complexity of the reduction (this is necessary if one wishes to rule out better than quasipolynomial algorithms). The interactions between blocks, plus certain particular side games played by the two players, ensure a faithful enough simulation of the original multimatrix game.

In addition to Theorem 1, we prove two other complexity consequences of Conjecture 2, solving certain open problems in the area: An improved inapproximability result for the problem of relative (multiplicative) approximation of the Nash equilibrium first established by Dskalakis , and an inapproximability result for the problem of finding a competitive equilibrium with equal incomes and indivisible goods when one seeks to minimize the Gini index of the income distribution.

The (ε,δ)𝜀𝛿(\varepsilon,\delta)-Gcircuit and Weak Approximate Nash

Generalized circuits are similar to the standard algebraic circuits, the main difference being that generalized circuits contain cycles, which allow them to verify fixed points of continuous functions. We restrict the class of generalized circuits to include only a particular list of gates described below. Formally,

[Generalized circuits, ] A generalized circuit ${\cal S}$ is a pair $\left(V,{\cal{T}}\right)$ , where $V$ is a set of nodes and ${\cal{T}}$ is a collection of gates. Every gate $T\in{\cal{T}}$ is a 5-tuple $T=G\left(\zeta\mid v_{1},v_{2}\mid v\right)$ , in which $G\in\{G_{\zeta},G_{\times\zeta},G_{=},G_{+},G_{-},G_{<},G_{\vee},G_{\wedge},G_{\neg}\}$ is the type of the gate; $\zeta\in{R}\cup\left\{nil\right\}$ is a (optional) real parameter; $v_{1},v_{2}\in V\cup\{nil\}$ are the first and second input nodes of the gate (one or both of them may be missing); and $v\in V$ is the output node; no two distinct gates have the same output.

Alternatively, we can think of each gate as a constraint on the values on the incoming and outgoing wires. We are interested in the following constraint satisfaction problem: given a generalized circuit, find an assignment to all the wires that simultaneously satisfies all the gates. When every gate computes a continuous function of the incoming wires, a solution must exist by Brouwer’s fixed point theorem.

In particular, we are interested in the approximate version of this CSP, where we must approximately satisfy most of the constraints.

Given a generalized circuit ${\cal S}=\left(V,{\cal T}\right)$ , $(\varepsilon,\delta)$ -Gcircuit is the problem of finding an assignment that $(\varepsilon,\delta)$ -approximately satisfies it.

In an $\left(\varepsilon,\delta\right)$ -weak approximate Nash equilibrium at most a $\delta$ -fraction of the players can gain more than $\varepsilon$ by deviating.

The reduction from $(\varepsilon,\delta)$ -Gcircuit to $\left(\varepsilon^{\prime},\delta^{\prime}\right)$ -weak approximate Nash equilibrium follows analogously to the reduction in from $\varepsilon$ -Gcircuit to $\varepsilon^{\prime}$ -approximate Nash equilibrium.

The first step is to reduce $(\varepsilon,\delta)$ -Gcircuit to $\big{(}\Theta(\varepsilon^{2}),\Theta(\varepsilon\cdot\delta)\big{)}$ -Gcircuit with fan-out $2$ . The naive way to do this is to replace larger fan-outs with a binary tree of $G_{=}$ gates. Daskalakis et al. successfully use this method for exponentially small $\varepsilon$ , but for constant $\varepsilon$ the iterative application of $G_{=}$ gates accumulates too much noise ( $\Theta(\varepsilon\cdot\log n)$ ). Instead, we follow and focus on logical gates whose noise does not accumulate. Given an output of a logical gate ( $G_{<},G_{\vee},G_{\wedge},G_{\neg}$ ), we can introduce a binary tree (with even depth) of $G_{\neg}$ gates that copy the original output instead of having a single gate with a large fan-out. When we begin with an output of an arithmetic gate ( $G_{\zeta},G_{\times\zeta},G_{=},G_{+},G_{-}$ ), we first use $\Theta(1/\varepsilon)$ gates of types $G_{=}$ and $G_{<}$ to parse it into an $\varepsilon$ -precision unary representation. Then, we copy each bit in the unary representation using a tree of $G_{\neg}$ gates. Finally, we (approximately) recover the original value from each copy of the unary representation using $\Theta(1/\varepsilon)$ gates of types $G_{\times\zeta}$ and $G_{+}$ . Notice that the number of new gates introduced for each gate at the leaves of the binary tree is $\Theta(1/\varepsilon)$ . Therefore, if the solution to the new (fan-out $2$ ) instance violates an $O(\delta\cdot\varepsilon)$ -fraction of the gate-constraints, then it induces a solution to the original (arbitrary fan-out) instance that violates at most a $\delta$ -fraction of the constraints. (See the full version of for more details.)

We now reduce $(\varepsilon,\delta)$ -Gcircuit with fan-out $2$ to $\left(\varepsilon^{\prime},\delta^{\prime}\right)$ -weak approximate Nash equilibrium. Daskalakis et al. construct, for each gate in the generalized circuit, a game gadget with a few players whose mixed strategies at approximate equilibrium simulate the computation carried by the gate. In other words, every approximate equilibrium of the gate gadget induces an approximately satisfying assignment to the corresponding input and output lines in the generalized circuit. See Lemma 1 for an example of such a gadget. The gadgets are concatenated by identifying the “output player” of one gadget with the “input player(s)” of the next gadget(s) in the generalized circuit. Since each gadget is composed of only a constant number of players, if all but a $\delta^{\prime}$ -fraction of the players play approximately at equilibrium, all but a $\delta=\Theta(\delta^{\prime})$ -fraction of the gates are violated by the induced assignment to the generalized circuit.

The above construction suffices to show hardness for an $\left(\hat{\varepsilon},\delta^{\prime}\right)$ -weak well-supported Nash equilibrium, i.e. one for where all but a $\delta^{\prime}$ -fraction of the players, every action in the support is $\hat{\varepsilon}$ -approximately best response. Finally, for constant degree graphical games, it is not hard to derive an $\left(\hat{\varepsilon},\delta^{\prime}\right)$ -weak well-supported Nash equilibrium from any $\left(\varepsilon^{\prime},\delta^{\prime}\right)$ -weak approximate Nash equilibrium for $\varepsilon^{\prime}=\Theta(\hat{\varepsilon}^{2})$ [26, Lemma 4].

In the other direction, from $\left(\varepsilon^{\prime},\delta^{\prime}\right)$ -weak approximate Nash equilibrium to $(\varepsilon,\delta)$ -Gcircuit, it suffices to construct a generalized circuit that computes an $\varepsilon^{\prime}$ -approximate best response of each player (except a $\delta^{\prime}$ -fraction). Notice that in a graphical game with a constant number of actions per player, given a profile of mixed strategies, one only needs a constant number of arithmetic gates to compute a best response of any player. In particular, to compute an $\varepsilon$ -approximate best response it suffices to use gates of finite precision $\varepsilon^{\prime}=\Theta(\varepsilon)$ . Finally, since the number of gates per player is constant, if at most a $\delta$ -fraction of the gates are $\varepsilon$ -unsatisfied, they can appear in the best-response computation for at most a $\delta^{\prime}=\Theta(\delta)$ -fraction of the players. ∎

Let $v_{1}$ , $v_{2}$ , and $w$ be players in a graphical game, and suppose that the payoffs of $v_{2}$ and $w$ are as follows.

Then, in every $\varepsilon$ -NE $\mathbf{p}\left[v_{2}\right]=\min\left(\zeta\mathbf{p}\left[v_{1}\right],1\right)\pm\varepsilon$ , where $\mathbf{p}\left[u\right]$ denotes the probability that $u$ assigns to strategy $1$ .

(Sketch) If $\mathbf{p}\left[v_{2}\right]>\zeta\mathbf{p}\left[v_{1}\right]+\varepsilon$ , then in every $\varepsilon$ -NE $\mathbf{p}\left[w\right]=1$ , which contradicts $\mathbf{p}\left[v_{2}\right]>\varepsilon$ . Similarly, if $\mathbf{p}\left[v_{2}\right]<\min\left(\zeta\mathbf{p}\left[v_{1}\right],1\right)-\varepsilon$ , then $\mathbf{p}\left[w\right]=0$ , which yields a contradiction when $\mathbf{p}\left[v_{2}\right]<1-\varepsilon$ . ∎

Proof of Theorem 1

The proof is a reduction from weak approximation of Nash equilibrium in a polymatrix game (recall Theorem 2) to the two-player problem. The two players simultaneously play three games: the main game, which is the heart of the reduction; and two games based on a construction due to Althofer , which impose structural properties of any approximate Nash equilibrium (interestingly, the oblivious lower bound of uses the same game). The final payoff of a player is the sum of payoffs in all three games. For convenience of notation, the payoffs in each game will be in $; in order to normalize the payoffs in the final game to$ one should multiply by $1/3$ the payoffs in all three games.

We let each of the two players “control” the vertices on one side of the bipartite graphical game (we henceforth use players only for the players in the bimatrix game, and refer to the players in the graphical polymatrix game as vertices). For ease of notation, we assume wlog that each player controls an equal number of vertices, $n$ .

We partition the vertices of each player into $n/k$ disjoint subsets of size at most $2k=2\sqrt{n}$ , such that every two subsets share at most $18$ edges. By Lemma 6, we can efficiently find such a partition. Let $\left(S_{1},\dots,S_{n/k}\right)$ and $\left(T_{1},\dots,T_{n/k}\right)$ denote the partitions of the respective players. Each action of the players corresponds to a choice of a subset (out of $n/k$ subsets in the partition), and a choice of an action for each vertex in the subset (out of at most $2^{2k}$ vectors of actions). Together, the main game has $\left(\frac{n}{k}\cdot 2^{2k}\right)\times\left(\frac{n}{k}\cdot 2^{2k}\right)$ action profiles. When players choose actions $\left(S_{i},\vec{\alpha}_{S_{i}}\right)$ and $\left(T_{j},\vec{\beta}_{T_{j}}\right)$ , the payoff of the row player is the sum of the payoffs over all shared edges of $S_{i}$ and $T_{j}$ , when they play the respective strategies from $\vec{\alpha}_{S_{i}}$ and $\vec{\beta}_{T_{j}}$ (here we use the polymatrix structure of the payoffs in the graphical game; the payoffs are defined over edges and therefore payoffs of a certain vertex can be defined even though not necessarily all its neighbours are playing). Finally, we normalize by $\lambda/18$ , where $\lambda=\Theta(\delta^{2})$ is a small constant that satisfies $\lambda>\varepsilon^{\prime}$ . Similarly the payoff of the second player is derived from the payoffs of the vertices in $T_{j}$ .

Althofer’s games

Altofer’s game is an asymmetric hide-and-seek win-lose game over $l$ locations. The hider chooses a location $i\in[l]$ , the seeker chooses a subset $B\subset[l]$ of of size $|B|=l/2$ . The hider wins if $i\notin B$ , the seeker wins if $i\in B$ . Namely, the payoff function is given by

In Althofer’s game each player can guarantee $\frac{1}{2}$ by uniform play, therefore the value of the game is $\frac{1}{2}$ . Althofer’s game enjoys the following strong property: in every $\varepsilon$ -approximate Nash equilibrium, the hider must play a mixed strategy that is $O(\varepsilon)$ -close to the uniform distribution in total variation distance; see .

2 Structure of an equilibrium

For a mixed action $x$ of player $1$ we denote by $x(S_{i})$ the total probability that the player chooses to control the vertices $S_{i}$ ; i.e.,

If $(x,y)$ is an $\varepsilon^{\prime}$ -Nash equilibrium, then $\sum_{i\in[n/k]}\left|x\left(S_{i}\right)-\frac{k}{n}\right|=O\left(\lambda\right)$ .

In the proof we use the following Lemma from :

Let $\{a_{i}\}_{i\in[l]}$ be real numbers satisfying the following properties for some $\theta>0$ : (1) $a_{1}\geq a_{2}\geq...\geq a_{l}$ ; (2) $\sum_{i\in[l]}a_{i}=0$ ; and (3) $\sum_{i\in[l/2]}a_{i}\leq\theta$ . Then $\sum_{i\in[l]}|a_{i}|\leq 4\theta$ .

In order apply Lemma 3, we denote $a_{i}=x(S_{i})-\frac{k}{n}$ and we assume wlog that $x(S_{1})\geq x(S_{2})\geq...\geq x(S_{n/k})$ . Then the first two conditions hold. Regarding the third condition, we argue that $\sum_{i\in[l/2]}x(S_{i})\leq 3\lambda$ this will complete the proof.

Player 1 can guarantee a payoff of $1/2$ in the sum of Althofer’s games as a hider and the main game, by playing uniformly and choosing an arbitrary actions for the controlled vertices (for instance, $\vec{\alpha}_{S_{i}}=\vec{0}$ ). Assume by contradiction that $\sum_{i\in[l/2]}x(S_{i})>3\lambda$ . Since Player 2 is $\varepsilon^{\prime}$ -best replying, his payoff in the Althofer game (as a seeker) is at least $\frac{1}{2}+3\lambda-\varepsilon^{\prime}$ (because he can get $1/2+3\lambda$ by choosing the set $[l/2]$ ). Therefore, Player 1’s payoff in the Althofer game (as a hider) is at most $\frac{1}{2}-3\lambda+\varepsilon^{\prime}$ . If we add to it his payoffs in the main game, then his payoff is at most $\frac{1}{2}-2\lambda+\varepsilon^{\prime}$ . Therefore, player 1 can increase his payoff by $2\lambda-\varepsilon^{\prime}>\varepsilon^{\prime}$ by deviating to the uniform distribution over the locations (as a hider), and maintaining his mixed action as a seeker. ∎

3 Completing the proof

Any mixed strategy $x$ of player 1 in the bimatrix game induces a mixed strategy of all the vertices in $\cup_{i}S_{i}$ in the obvious way. Vertex $s\in S_{i}$ plays the action $1$ with probability $p$ where $p$ is the conditional probability

If the event “Player 1 controls $S_{i}$ ” occurs with probability 0, then we define (arbitrarily) that $p=1$ . Similarly, any mixed strategy $y$ of player 2 induces a mixed strategy of all the vertices in $\cup_{i}T_{i}$ .

We claim that if $\left(x,y\right)$ is an $\varepsilon^{\prime}$ -approximate Nash equilibrium of the bimatrix game, then the induced mixed-strategies profile is an $\left(\varepsilon,\delta\right)$ -approximate Nash equilibrium of the original graphical game.

By Lemma 2 and Markov’s inequality, for a $\left(1-O\left(\sqrt{\lambda}\right)\right)$ -fraction of the subsets, the row player distributes within a $\left(1\pm O\left(\sqrt{\lambda}\right)\right)$ -factor of the correct weight ( $k/n$ ); i.e. $x\left(S_{i}\right)\in\frac{k}{n}\cdot\left[1-O\left(\sqrt{\lambda}\right),1+O\left(\sqrt{\lambda}\right)\right]$ . Let us restrict our attention only to those subsets, and call them good. We say that a vertex is good if it and all of its neighbors belong to good subsets. Since the game graph is of bounded degree, we again have that a $\left(1-O\left(\sqrt{\lambda}\right)\right)$ -fraction of the vertices is good.

Consider any good vertex whose induced mixed strategy is not $\varepsilon$ -optimal in the original game given that the rest of the vertices also play according to their induced strategies. Then, changing its strategy in the bimatrix game (while leaving all other marginals the same), would increase the payoff of its player by at least

where the $\left(1-O\left(\sqrt{\lambda}\right)\right)^{2}\cdot\left(\frac{k}{n}\right)^{2}$ term corresponds to the probabilities that the subsets corresponding to both the vertex and any of its neighbors are played; the $\left(1-O\left(\sqrt{\lambda}\right)\right)\cdot\varepsilon$ term corresponds to the improvement of the vertex in the bimatrix game, where instead of summing the payoff in all three edges (as in the graphical game) these three edges have weights $\{1+\gamma_{j}\}_{j=1,2,3}$ for $|\gamma_{j}|=O(\sqrt{\lambda})$ ; finally $\lambda/18$ is the normalization. The right side follows by plugging in $k=\sqrt{n}$ .

If (by a contradiction) a $\delta$ -fraction of the vertices have $\varepsilon$ -improvement strategies; then one of the players has $\left(\delta/2-O\left(\sqrt{\lambda}\right)\right)\cdot n$ good vertices with an $\varepsilon$ -improvementHere we use our choice of $\lambda=\Theta(\delta^{2})$ , which guarantees that $\delta/2-O\left(\sqrt{\lambda}\right)=\Omega(\delta)$ . This player can benefit $\Omega\left(\varepsilon\cdot\lambda/n\right)$ from a deviation of each vertex. So his total improvement from all deviations simultaneously is $\Omega\left(\varepsilon\cdot\delta\cdot\lambda\right)$ - which is impossible when $(x,y)$ is an $\varepsilon^{\prime}$ -approximate Nash equilibrium for $\varepsilon^{\prime}=\Theta(\varepsilon\cdot\delta\cdot\lambda)$ .

Implications for Relative ε𝜀\varepsilon-approximate Nash

Daskalakis defines a notion of relative (sometimes also called multiplicative , as opposed to the more standard additive) $\varepsilon$ -Nash equilibrium, and proves that in two-player games with payoffs in $\left[-1,1\right]$ , finding a relative $\varepsilon$ -well supported Nash equilibrium is \PPAD-complete. In particular, he concludes that the quasi-polynomial algorithm of Lipton et al. cannot achieve this notion of approximate equilibrium.

This result has two caveats: (1) Through the use of positive and negative payoffs, the gain from deviation is large compared to the expected payoff, only because the latter is small due to cancellation of positive and negative payoffs. Namely, the gain from deviation may be very small compared to the expected magnitude of the payoff. (2) It only applies to the more restrictive notion (thus rendering the hardness result weaker) of well-supported approximate equilibrium; i.e. an equilibrium where every action in the support has to be approximately optimal. Showing \PPAD-hardness for both positive payoffs and general (non well-supported) approximate equilibrium were left as open questions in . Recently, the first question was settled in where it was shown that finding a relative $\varepsilon$ -well-supported Nash equilibrium with positive payoffs is indeed \PPAD-complete.

Here, assuming Conjecture 2, we settle the second question: we show that finding any relative $\varepsilon$ -approximate Nash equilibrium is \PPAD-complete. Furthermore, in our hard instance the row player has only positive payoffs and the column player has only negative payoffs, and so there is no cancellation of payoffs as in the construction of .

Assuming Conjecture 2 there exists a constant $\varepsilon^{\prime}=\Theta\left(\varepsilon\cdot\delta^{3}\right)>0$ such that finding a relative $\varepsilon^{\prime}$ -approximate Nash equilibrium (ANE) in a bimatrix game where the row player’s payoffs are non-negative and the column player’s payoffs are non-positive is \PPAD-complete.

Proving a similar theorem when both players have positive payoffs remains an interesting open question. In fact, we do not know of any instances with positive payoffs where all $\varepsilon$ -approximate Nash equilibria must have large (e.g., linear, or even super-logarithmic) support.

We reduce from the problem of finding an (additive) $\left(\varepsilon,\delta\right)$ -Nash in a bipartite, degree $3$ , polymatrix game with two actions per player. We construct a main game where the bimatrix game players (henceforth just players) control the nodes of the polymatrix game, and two side games that guarantee that each player randomizes (approximately) uniformly over all her nodes.

We let the row player “control” the nodes on one side of the bipartite game graph, and let the column player control the nodes on the other side. Namely, let $n$ be the number of nodes on each side of the graph; each player has $2n$ actions, each corresponding to a choice of node and an action for that node. When the players play strategies that correspond to adjacent nodes in the graphs, they receive the payoffs of the corresponding nodes, scaled (by a small positive constant $\eta=O\left(\delta^{2}\right)$ ) and shifted to fit in the intervals: $\left[1,1+\eta\right]$ for the row player, and $\left[-\left(1+\eta\right),-1\right]$ for the column player. If the nodes played do not share an edge in the bipartite game graph, the utility for both players is zero. Notice that if either player chooses her node uniformly at random (and chooses the action for that node arbitrarily), then the expected payoff is in $\frac{3}{n}\cdot\left[1,1+\eta\right]$ for the row player, and in $\frac{3}{n}\cdot\left[-\left(1+\eta\right),-1\right]$ for the column player.

We also let the players play two hide-and-seek win-lose zero-sum games over a space of $n$ actions. In both games, the row player is chasing the column player. In each game, if they pick the same strategy, the row player receives payoff $1$ (and the column player receives $-1$ ); otherwise the payoffs are . Finally, in the first side game we identify the row player’s strategies with her choice of nodes in the main game. Namely, if she plays node $i$ in the main game and the column player chose $i$ in the first side game, then her payoff from this side game is $1$ . Similarly, we identify the column player’s strategies in the second game with her choice of nodes in the main game.

We proceed by showing that in every relative $\varepsilon^{\prime}$ -ANE, the row player’s utility is approximately $5/n$ , and the column player’s utility is approximately $-5/n$ (Lemma 4); then we show that in every relative $\varepsilon^{\prime}$ -ANE, both players randomize approximately uniformly over their nodes (Lemma 5); finally we use these two observations to complete the proof (Subsection 4.2). ∎

Given mixed strategies $\left(x,y\right)$ , we let $x\left(i\right)$ denote the total probability that the row player assigns to node $i$ , and analogously for $y\left(j\right)$ . We also let $x^{*}\left(y\right)$ and $y^{*}\left(x\right)$ denote the corresponding best responses of each player. Finally, let $U_{R}\left(x;y\right)$ and $U_{C}\left(y;x\right)$ denote the expected payoffs for the row and column players, respectively.

If $\left(x,y\right)$ is a relative $\varepsilon^{\prime}$ -ANE, then

Observe that the main game is relative $\eta$ -approximately zero-sum; i.e. for any pure strategy profile $\left(x^{\prime},y^{\prime}\right)$ the expected utilities $U_{R}^{\mbox{main}}\left(x^{\prime};y^{\prime}\right),U_{C}^{\mbox{main}}\left(y^{\prime};x^{\prime}\right)$ of the row and column player, respectively, satisfy:

By linearity of expectation and convexity of the absolute value function, this continues to hold when $\left(x^{\prime},y^{\prime}\right)$ are mixed strategies. Furthermore, since the side games are exactly zero-sum and have the same signs as the main game, the same follows for the payoffs from the entire game:

In particular, the above inequality holds for $\left(x,y\right)$ . Thus, the upper bounds in (1) and (2) follow from the lower bounds in (2) and (1), respectively.

Finally, $\left(1-\varepsilon^{\prime}\right)\cdot\frac{5}{n}\leq U_{R}\left(x;y\right)$ in every relative $\varepsilon^{\prime}$ -ANE, because the row player can guarantee an expected payoff of at least $\frac{5}{n}$ by randomizing uniformly over all her strategies. Similarly, $U_{C}\left(y;x\right)\geq-\left(1+\varepsilon^{\prime}\right)\left(\frac{3\left(1+\eta\right)+2}{n}\right)$ because the column player can guarantee a payoff of at least $-\frac{3\left(1+\eta\right)+2}{n}$ by randomizing uniformly over all her strategies.∎

There exists a constant $\lambda=\Theta\left(\delta^{2}\right)$ such that for every $\left(x,y\right)$ relative $\varepsilon^{\prime}$ -ANE, $\sum\left|x\left(i\right)-\frac{1}{n}\right|\leq\lambda$ and $\sum\left|y\left(j\right)-\frac{1}{n}\right|\leq\lambda$ .

Assume by contradiction that $\sum\left|x\left(i\right)-\frac{1}{n}\right|>\lambda$ , then in particular there must exist an $i\in\left[n\right]$ such that $x\left(i\right)<\left(1-\lambda/2\right)/n$ . When the column player chooses her strategy uniformly at random in the main game and in the second side game and plays strategy $i$ in the first side game, her expected payoff is at least

Therefore, in any relative $\varepsilon^{\prime}$ -ANE her expected utility is at least

Which contradicts the upper bound in (2) when we take $\lambda$ sufficiently large, e.g. $\lambda=10\eta$ .

Similarly, if $\sum\left|y\left(j\right)-\frac{1}{n}\right|>\lambda$ , there must exist an $j\in\left[n\right]$ such that $y\left(j\right)>\left(1+\lambda/2\right)/n$ . Therefore, the row player can guarantee a payoff of at least

Thus by relative $\varepsilon^{\prime}$ -ANE,

which contradicts the upper bound in (1). ∎

2 Completing the proof of Theorem 3

Now, given a relative $\varepsilon^{\prime}$ -ANE $\left(x,y\right)$ , we can take, for each node $i$ , the mixed strategy induced by the probabilities $x\left(i:1\right)/x\left(i\right)$ and $x\left(i:2\right)/x\left(i\right)$ that the row player assigns to each action (respectively, $y\left(j:1\right)/y\left(j\right)$ and $y\left(j:2\right)/y\left(j\right)$ assigned by the column player). We claim that this strategy profile is an $\left(\varepsilon,\delta\right)$ -approximate equilibrium for the polymatrix game. Assume by contradiction that this is not the case.

By Lemma 5 and Markov’s inequality, a $\left(1-O\left(\sqrt{\lambda}\right)\right)$ -fraction of the nodes are played within $\frac{\sqrt{\lambda}}{n}$ of the correct probability $1/n$ . For a node $i$ controlled by the row player, we say that it is good if $\left|x\left(i\right)-1/n\right|\leq\frac{\sqrt{\lambda}}{n}$ and $\left|y\left(j\right)-1/n\right|\leq\frac{\sqrt{\lambda}}{n}\;\;\forall j\in{\cal N}\left(i\right)$ , and analogously for column player’s nodes. Since the graph has bounded degree, a $\left(1-O\left(\sqrt{\lambda}\right)\right)$ -fraction of the nodes are good.

Let $i$ be any good node who has an $\varepsilon$ -improving deviation from her induced strategy in the polymatrix game. If the player who controls $i$ makes the corresponding deviation in the two-player game, she increases her expected payoff by at least $\varepsilon\cdot\eta\cdot\frac{1}{n^{2}}\cdot\left(1-O\left(\sqrt{\lambda}\right)\right)$ . (We multiply by $\eta$ to account for the scaling; by $1/n^{2}$ for the probability that this player plays node $i$ and the other player plays a neighbor of $i$ ; and by $\left(1-O\left(\sqrt{\lambda}\right)\right)$ to correct for the deviation from $1/n$ in the latter probabilities.) By our assumption that the induced strategy profile is not an $\left(\varepsilon,\delta\right)$ -approximate equilibrium for the polymatrix game, at least one of the players has at least $\left(\delta-O\left(\sqrt{\lambda}\right)\right)n$ good nodes with $\varepsilon$ -improving deviations. Summing her gains from those deviations, we get that this player has can improve her expected payoff by at least $\left[\varepsilon\cdot\eta\cdot\left(1-O\left(\sqrt{\lambda}\right)\right)\right]\left(\delta-O\left(\sqrt{\lambda}\right)\right)/n$ .

However, this is a contradiction since it follows from Lemma 4, that $\left(x,y\right)$ is also an (additive) $\varepsilon^{\prime}\cdot\left(\frac{6}{n}\right)$ -ANE. ∎

Implications for Fairness Mechanisms

Competitive equilibrium from equal incomes (CEEI) is a well-known fair allocation mechanism : We give all agents a unit of money, and price the goods in such a way that the market clears. It is also well-known that when goods are indivisible or utilities are non-linear, an equilibrium may not exist. However, Budish proves that an approximate CEEI still exists. This concept of approximate equilibrium is used in practical system for allocating seats in courses to business school students .

Budish measures the proximity to a perfect CEEI via two parameters: a solution is an $\left(\alpha,\beta\right)$ -CEEI if the clearing error of the competitive equilibrium is less than $\alpha$ , and all the incomes are between $1$ and $1+\beta$ . Budish shows that an $\left(\alpha,\beta\right)$ always exists for some favorable $\alpha=\alpha^{*}$ , and any $\beta>0$ . Recently, showed a reduction from $\varepsilon$ -Gcircuit with fan-out $2$ , to the problem of finding an $\left(\alpha^{*},\Theta\left(\varepsilon/\log\left(1/\varepsilon\right)\right)\right)$ -CEEI. In particular, when combined with the results of , this implies that it is \PPAD-complete to find an $\left(\alpha^{*},\beta\right)$ -CEEI for some constant $\beta>0$ .

The $\beta$ parameter used in Budish’s formulation is an imperfect way of measuring income inequalities, as it may be set by a single outlier. Perhaps the best known, and most widely used, measure of income inequality is the Gini index (e.g. ) (see the Appendix for the definition). In fact, the Gini index is used to assess the performance of the class seat assignment system used in practice .

Assuming Conjecture 2, finding an income assignment and prices with low market clearing error and near-optimal Gini index is PPAD-complete.

See the appendix for the precise definitions and statement of the result.

Discussion

The purpose of this paper is to showcase an important open problem:

reduce End of the Line to $(\varepsilon,\delta)$ -Gcircuit

— i.e., show that Conjecture 1 implies Conjecture 2. As we mentioned, such a reduction would imply a “PCP for \PPAD”; i.e. it would imply that there exists a probabilistically checkable proof for a \PPAD-complete problem.

The equivalent statement for the class of \NP-complete problem, is the celebrated PCP Theorem . What can we learn from our experience with constructing PCP’s for \NP? First of all, we have many different constructions of PCPs for \NP, and this may be seen as circumstantial evidence that constructing a PCP for \PPAD is possible. Equally important, there are several constructions of PCPs for \NP that are near-linear (e.g. ); notice that for the application to $\varepsilon$ -Nash in bimatrix games we need the proof length to be sub-quadratic.

The next question one should ask is, whether we can adapt the techniques used in the proofs of the PCP Theorem to the class of \PPAD-complete problems. We briefly and informally sketch some of our thoughts on the matter. All the proofs of the PCP Theorems that we are aware of (including ) compose an inner verifier with an outer verifier. The outer verifier in Dinur’s proof is combinatorial in nature, and it seems plausible that the same or similar techniques could be modified to fit the generalized circuit graph. Unfortunately, all inner verifiers that we know are discrete in nature, whereas our characterization of \PPAD with Gcircuit (or Nash) is inherently based on continuous constraints. Thus the following interesting (and somewhat open-ended) questions arise: Can we find a characterization of \PPAD via a constraint satisfaction problem whose constraints are discrete in nature? Alternatively, can one construct an inner (non-efficient) verifier using the constraints specified in the definition of the Gcircuit problem?

Many thanks to Boaz Barak, Paul Cristiano, Muli Safra, and Madhu Sudan for inspiring discussions about probabilistic checkable proofs.

References

Appendix A Finding a good partition

Let $G=\left(U,V,E\right)$ be a bipartite $d$ -regular graph with $n=\left|U\right|=\left|V\right|$ . Then we can efficiently find partitions $S_{1},\dots,S_{n/k}$ and $T_{1},\dots,T_{n/k}$ of $U$ and $V$ , respectively, to disjoint subsets, such that each subset has size at most $2k=2\sqrt{n}$ , and:

Let $S_{1},\dots,S_{n/k}$ be an arbitrary partition of $U$ into disjoint subsets of size exactly $k$ . We inductively place the vertices of $V$ into subsets $T_{1},\dots,T_{n/k}$ , while maintaining the desiderata that each subset $T_{j}$ is of size at most $2k$ , and for every $i,j$ the number of edges from $S_{i}$ to $T_{j}$ is at most $2d^{2}k^{2}/n$ .

It is left to show that for any partial partition of $V$ , there is a subset $T_{J}$ into which we can place the next vertex $v$ . In expectation, every subset $T_{j}$ has less than $k$ vertices. Therefore by Markov’s inequality, less than half of the subsets have $2k$ vertices or more. $v$ has neighbors in at most $d$ subsets $S_{i}$ . Recall that the $S_{i}$ ’s are of size exactly $k$ . Thus each $S_{i}$ with a neighbor of $v$ has, in expectation over $j$ , less than $dk^{2}/n$ neighbors in $T_{j}$ . Using Markov’s inequality again, less than a $1/(2d)$ -fraction of the subsets $T_{j}$ contain $2d^{2}k^{2}/n$ neighbors of $S_{i}$ . In total, we lose less than half the subsets for the size desideratum, and less than $1/(2d)$ -fraction of the subsets for each $S_{i}$ containing a neighbor of $v$ . Therefore there always remains at least one subset $T_{j}$ to which we can add $v$ . ∎

Appendix B The course allocation problem

Even though the approximate CEEI and the existence theorem in are applicable to a broad range of allocation problems, we shall describe our results in the language of the course allocation problem.

We are given a set of $M$ courses with integer capacities (the supply) $(q_{j})_{j=1}^{M}$ , and a set of $N$ students, where each student $i$ has a set $\Psi_{i}\subseteq 2^{M}$ of permissible course bundles, with each bundle containing at most $k\leq M$ courses. The set $\Psi_{i}$ encodes both scheduling constraints (e.g., courses that meet at the same time) and any constraints specific to student $i$ (e.g. prerequisites).

Each student $i$ has a strict ordering over her permissible schedules, denoted by $\preccurlyeq_{i}$ . We allow arbitrarily complex preferences — in particular, students may regard courses as substitutes or complements. More formally:

Course Allocation Problem The input to a course allocation problem consists of:

For each student $i$ a set of course bundles $\left(\Psi_{i}\right)_{i=1}^{N}$ .

The students’ reported preferences, $\left(\preccurlyeq_{i}\right)_{i=1}^{N}$ ,

The course capacities, $\left(q_{j}\right)_{j=1}^{M}$ , and

The output to a course allocation problem consists of:

Prices for each course $(p_{j}^{*})_{j=1}^{M}$ ,

Allocations for each student $(x_{i}^{*})_{i=1}^{N}$ , and

Budgets for each student $\left(b_{i}^{*}\right)_{i=1}^{N}$ .

The quality of an allocation is evaluated based on its proximity to market clearing and income equality. As for the definition of market clearing error, it suffices for our purposes to require that no course is over-subscribed or under-subscribed by more than $\alpha^{*}$ students (we refer the curious reader to for the precise definition). The Gini coefficient, which measures income inequality, is discussed in the following subsection.

Given a distribution of incomes $D$ , the Lorenz curve plots, for each $x\in$ , the cumulative wealth owned by the bottom $x$ -fraction of the population. Let $F_{D}^{-1}(x)=\sup\big{\{}y\colon\Pr_{z\sim D}[z\leq y]\leq x\big{\}}$ , and define $L_{D}(x)=\left(\int_{0}^{x}F_{D}^{-1}(x)\right)/\left(\int_{0}^{1}F_{D}^{-1}(x)\right)$ . Then the Lorenz curve is the graph $\big{(}x,L_{D}(x)\big{)}$ , for $x\in$ . Notice that $F_{D}^{-1}(x)$ is monotonically non-decreasing, so the Lorenz curve is convex.

The Gini index is defined as the ratio of the area between the $\big{(}x,x\big{)}$ line and the Lorenz curve (by the convexity, $x\geq L_{D}(x)\forall x\in$ ), divided by the entire area under the $\big{(}x,x\big{)}$ line (the latter is always $1/2$ ): $G_{D}=2\int_{0}^{1}(x-L_{D}(x))dx=1-2\int_{0}^{1}L_{D}(x)dx$

In general, a smaller Gini index corresponds to a more equal distribution of wealth. For example, when all incomes are exactly equal, the Lorenz curve is exactly equal to the $\big{(}x,x\big{)}$ line, and the Gini index is . On the other extreme, when one person has all the wealth, the area under the Lorenz curve goes to as the population size increases, and the Gini index approaches $1$ .

B.2 Intractability of approximate CEEI with near-optimal Gini index

Assuming Conjecture 2, there exists some constant $\gamma>0$ such that finding an allocation with market clearing error $\alpha^{*}$ and Gini coefficient $\gamma$ is \PPAD-complete.

The rest of this section is devoted to sketching a proof of Theorem 4. In the next subsection, we briefly outline the reduction of from generalized circuits with fan-out $2$ to approximate CEEI. (Recall that in Section 2 we show how to convert any generalized circuit to one with fan-out $2$ .) Then, we show that after normalizing the median budget to $1+\varepsilon^{\prime}/2$ , in any allocation which does not correspond to a valid solution to the $(\varepsilon,\delta)$ -Gcircuit instance, a $\delta^{\prime}$ -fraction of the students have budgets either at most $1$ or at least $1+\varepsilon^{\prime}$ , (for some constants $\varepsilon^{\prime}=\Theta\left(\varepsilon/\log(1/\varepsilon)\right)$ and $\delta^{\prime}=\Theta\left(\delta/\log(1/\varepsilon)\right)$ ). Then, the proof is complete with the following lemma:

Let the median income be $1+\varepsilon^{\prime}/2$ , and suppose that a $\delta^{\prime}$ -fraction of the population has income at most $1$ (resp. at least $1+\varepsilon^{\prime}$ ). Then the Gini index is at least $\gamma=\Theta(\delta^{\prime}\cdot\varepsilon^{\prime})=\Theta(\delta\cdot\varepsilon/\log^{2}(1/\varepsilon))$ .

The total income of the poorer half of the population is at most $(1/2-\delta^{\prime})(1+\varepsilon^{\prime}/2)+\delta^{\prime}=(1/2)(1+\varepsilon^{\prime}/2)-\delta^{\prime}\cdot\varepsilon^{\prime}/2$ , whereas the richer half of the population has a total income of at least $(1/2)(1+\varepsilon^{\prime}/2)$ . Therefore, the Lorenz curve’s value at $1/2$ is bounded by

We can now use elementary geometry to bound the area under the Lorenz curve:

Therefore the Gini index is at least $\Theta(\delta^{\prime}\cdot\varepsilon^{\prime})$ . A similar argument works when a $\delta^{\prime}$ -fraction of the population has income at least $1+\varepsilon^{\prime}$ .

B.3 From generalized circuits to Course Allocation

reduce generalized circuits to Course Allocation, by constructing a gadget for each gate of the generalized circuit. (In fact, a few gadgets per gate are required to handle a subtle issue that call ”course-size amplification”.) We provide the gadget for the NOT gate as an example, and then describe the properties of the ’s reduction that we need to complete the proof.

We henceforth normalize the prices and budgets in every assignment to the Course Allocation instance such that the median budget is $1+\varepsilon^{\prime}/2$ .

Let $n_{x}>6\alpha^{*}$ and suppose that the economy contains the following courses:

$c_{{1-x}}$ with capacity $q_{{1-x}}=2n_{x}/3$ (the “output course”);

$n_{x}$ students interested only in the schedule $\left\{c_{x},c_{{1-x}}\right\}$ ;

and suppose further that at most $n_{{1-x}}=n_{x}/6$ other students are interested in course $c_{{1-x}}$ .

Then in any normalized approximate CEEI with market clearing error at most $\alpha^{*}$ , at least one of the following must hold:

A constant fraction of the $n_{x}$ students have budgets either less than $1$ or greater than $1+\varepsilon^{\prime}$ .

If $p_{{1-x}}^{*}>1-p_{x}^{*}+\varepsilon^{\prime}$ , then none of the $n_{x}$ students can afford the bundle $\left\{c_{x},c_{{1-x}}\right\}$ - except those whose budget is greater than $1+\varepsilon^{\prime}$ . Other than them, there are at most $n_{{1-x}}=n_{x}/6$ students enrolled in the $c_{{1-x}}$ - much less than the capacity $2n_{x}/3$ . Therefore for the market clearing error to be less than $\alpha^{*}=n_{x}/6$ , $n_{x}/3$ of the students must have budget greater than $1+\varepsilon^{\prime}$

On the other hand, if $p_{{1-x}}^{*}<1-p_{x}^{*}$ , then all $n_{x}$ students can afford the bundle $\left\{c_{x},c_{{1-x}}\right\}$ - except for those whose budget is less than $1$ . Therefore in order to satisfy the market clearing requirement, at least $n_{x}/6$ students must have budget less than $1$ .

Similarly, provide gadgets for all the gates in the definition of the Gcircuit problem, and show that they can be concatenated to simulate the computation on the generalized circuit. For each gate of the generalized circuit, ’s reduction uses at most $\Theta(1/\log(1/\varepsilon))$ (in particular, a constant) number of gadgets. Furthermore, the number of students that participate in each of those gadgets is approximately the same ( $\Theta(\alpha^{*})$ ). Therefore, for every gate which is not $\varepsilon$ -satisfied, there are at least $\Theta(\alpha^{*})$ students whose budgets are either at most $1$ or at least $1+\varepsilon^{\prime}$ . Thus if the assignment to the Course Allocation problem does not correspond to an $(\varepsilon,\delta)$ -approximate solution to Gcircuit, then a $\delta^{\prime}=\Theta(\delta/\log(1/\varepsilon))$ -fraction of the students must have budgets at most $1$ or at least $1+\varepsilon^{\prime}$ . Applying Lemma 7 completes the proof of Theorem 4. ∎