Tree Polymatrix Games are PPAD-hard

Argyrios Deligkas, John Fearnley, Rahul Savani

Introduction

A polymatrix game is a succinctly represented many-player game. The players are represented by vertices in an interaction graph, where each edge of the graph specifies a two-player game that is to be played by the adjacent vertices. Each player picks a pure strategy, or action, and then plays that action in all of the edge-games that they are involved with. They then receive the sum of the payoffs from each of those games. A Nash equilibrium prescribes a mixed strategy to each player, with the property that no player has an incentive to unilaterally deviate from their assigned strategy.

Constant-action polymatrix games have played a central role in the study of equilibrium computation. The classical \PPAD-hardness result for finding Nash equilibria in bimatrix games uses constant-action polymatrix games as an intermediate step in the reduction . Rubinstein later showed that there exists a constant ϵ>0\epsilon>0 such that computing an ϵ\epsilon-approximate Nash equilibrium in two-action bipartite polymatrix games is \PPAD-hard , which was the first result of its kind to give hardness for constant ϵ\epsilon.

These hardness results create polymatrix games whose interaction graphs contain cycles. This has lead researchers to study acyclic polymatrix games, with the hope of finding tractable cases. Kearns, Littman, and Singh claimed to produce a polynomial-time algorithm for finding a Nash equilibrium in a two-action tree graphical game , where graphical games are a slight generalization of polymatrix games. However, their algorithm does not work, which was pointed out by Elkind, Goldberg, and Goldberg , who also showed that the natural fix gives an exponential-time algorithm.

Elkind, Goldberg, and Goldberg also show that a Nash equilibrium can be found in polynomial time for two-action graphical games whose interaction graphs contain only paths and cycles. They also show that finding a Nash equilibrium is \PPAD-hard when the interaction graph has pathwidth at most four, but there appears to be some issues with their approach (see Appendix A). Later work of Barman, Ligett, and Piliouras provided a QPTAS for constant-action tree polymatrix games, and then Ortiz and Irfan gave an FPTAS for this case. All three papers, , leave as a main open problem the question of whether it is possible to find a Nash equilibrium in a tree polymatrix in polynomial time.

Our contribution. In this work we show that finding a Nash equilibrium in twenty-action tree polymatrix games is \PPAD-hard. Combined with the known \PPADcontainment of polymatrix games , this implies that the problem is \PPAD-complete. This is the first hardness result for polymatrix (or graphical) games in which the interaction graph is acyclic, and decisively closes the open question raised by prior work: tree polymatrix games cannot be solved in polynomial time unless \PPADis equal to P\mathtt{P}.

Our reduction produces a particularly simple class of interaction graphs: all of our games are played on caterpillar graphs (see Figure 3) which consist of a single path with small one-vertex branches affixed to every node. These graphs have pathwidth 11, so we obtain a stark contrast with prior work: two-action path polymatrix games can be solved in polynomial time , but twenty-action pathwidth-1-caterpillar polymatrix games are \PPAD-hard.

Our approach is founded upon Mehta’s proof that 2D-LinearFIXP is \PPAD-hard . We show that her reduction can be implemented by a synchronous arithmetic circuit with constant width. We then embed the constant-width circuit into a caterpillar polymatrix game, where each player in the game is responsible for simulating all gates at a particular level of the circuit. This differs from previous hardness results , where each player is responsible for simulating exactly one gate from the circuit.

Along the way, we also substantially strengthen Mehta’s hardness result for LinearFIXP\mathtt{LinearFIXP}. She showed \PPAD-hardness for finding an exact fixed point of a 2D-LinearFIXP instance, and an ϵ\epsilon-fixed point of a kD-LinearFIXP instance, where ϵ\epsilon is polynomially small. We show \PPAD-hardness for finding an ϵ\epsilon-fixed point of a 2D-LinearFIXP instance when ϵ\epsilon is any constant less than (21)/20.2071(\sqrt{2}-1)/2\approx 0.2071. So we have lifted the hardness regime from polynomially small approximations in kk-dimensions to constant approximations in two-dimensions, and our constant is substantial when compared to the trivial upper bound of 0.50.5.

Related work. The class \PPADwas defined by Papadimitriou . Years later, Daskalakis, Goldberg, and Papadimitriou (DGP) proved \PPAD-hardness for graphical games and 3-player normal form games. Chen, Deng, and Teng (CDT) extended this result to 2-player games and proved that there is no FPTAS for the problem unless \PPAD=P\PPAD=\mathtt{P}. The observations made by CDT imply that DGP’s result also holds for polymatrix games with constantly-many actions (but with cycles in the interaction graph) for an exponentially small ϵ\epsilon. More recently, Rubinstein showed that there exists a constant ϵ>0\epsilon>0 such that computing an ϵNE\epsilon-NE in binary-action bipartite polymatrix games is \PPAD-hard (again with cycles in the interaction graph).

Etessami and Yiannakakis defined the classes FIXP\mathtt{FIXP} and LinearFIXP\mathtt{LinearFIXP} and they proved that LinearFIXP=\PPAD\mathtt{LinearFIXP}=\PPAD. Mehta strengthened these results by proving that two-dimensional LinearFIXP\mathtt{LinearFIXP} equals \PPAD, building on the result of Chen and Deng who proved that 2D-discrete Brouwer is \PPAD-hard .

On the positive side, Cai and Daskalakis , proved that NE can be efficiently found in polymatrix games where every 2-player game is zero-sum. Ortiz and Irfan and Deligkas, Fearnley, and Savani produced QPTASs for polymatrix games of bounded treewidth (in addition to the FPTAS of for tree polymatrix games mentioned above). For general polymatrix games, the only positive result to date is a polynomial-time algorithm to compute a (12+δ)(\frac{1}{2}+\delta)-NE . Finally, an empirical study on algorithms for exact and approximate NE in polymatrix games can be found in .

Preliminaries

Polymatrix games. An nn-player polymatrix game is defined by an undirected interaction graph G=(V,E)G=(V,E) with nn vertices, where each vertex represents a player, and the edges of the graph specify which players interact with each other. Each player in the game has mm actions, and each edge (v,u)E(v,u)\in E of the graph is associated with two m×mm\times m matrices Av,uA^{v,u} and Au,vA^{u,v} which specify a bimatrix game that is to be played between the two players, where Av,uA^{v,u} specifies the payoffs to player vv from their interaction with player uu.

Each player in the game selects a single action, and then plays that action in all of the bimatrix games with their neighbours in the graph. Their payoff is the sum of the payoffs that they obtain from each of the individual bimatrix games.

A mixed strategy for player ii is a probability distribution over the mm actions of that player, a strategy profile is a vector s=(s1,s2,,sn)\mathbf{s}=(s_{1},s_{2},\ldots,s_{n}) where sis_{i} is a mixed strategy for player ii. The vector of expected payoffs for player ii under strategy profile s\mathbf{s} is pi(s):=(i,j)EAi,jsj\mathbf{p}_{i}(\mathbf{s}):=\sum_{(i,j)\in E}A^{i,j}s_{j}. The expected payoff to player ii under s\mathbf{s} is sipi(s)s_{i}\cdot\mathbf{p}_{i}(\mathbf{s}). A strategy profile is a mixed Nash equilibrium if sipi(s)=maxsipi(s)s_{i}\cdot\mathbf{p}_{i}(\mathbf{s})=\max_{s_{i}}\mathbf{p}_{i}(\mathbf{s}) for all ii, which means that no player can unilaterally change their strategy in order to obtain a higher expected payoff. In this paper we are interested in the problem of computing a Nash equilibrium of a tree polymatrix game, which is a polymatrix game in which the interaction graph is a tree.

Arithmetic circuits. For the purposes of this paper, each gate in an arithmetic circuit will operate only on values that lie in the range $.Inourconstruction,wewillusefourspecificgates,calledconstantintroductiondenotedby. In our construction, we will use four specific gates, called constant introduction denoted byc,boundedadditiondenotedby, bounded addition denoted by+^{b},boundedsubtractiondenotedby, bounded subtraction denoted by-^{b},andboundedmultiplicationbyaconstantdenotedby, and bounded multiplication by a constant denoted by*^{b}c$. These gates are formally defined as follows.

cc is a gate with no inputs that outputs some fixed constant cc\in.

Given inputs x,yx,y\in the gate x+by:=min(x+y,1)x+^{b}y:=\min\left(x+y,1\right).

Given inputs x,yx,y\in the gate xby:=max(xy,0)x-^{b}y:=\max\left(x-y,0\right).

Given an input xx\in, and a constant c0c\geq 0, the gate xbc:=min(xc,1)x*^{b}c:=\min\left(x*c,1\right).

These gates perform their operation, but also clip the output value so that it lies in the range $.Notethattheconstant. Note that the constantcinthein the*^{b}c$ gate is specified as part of the gate. Multiplication of two inputs is not allowed.

We will build arithmetic circuits that compute functions of the form dd^{d}\rightarrow^{d}. A circuit C=(I,G)C=(I,G) consists of a set I={in1,in2,,ind}I=\{\texttt{in}_{1},\texttt{in}_{2},\dots,\texttt{in}_{d}\} containing dd input nodes, and a set G={g1,g2,,gk}G=\{g_{1},g_{2},\dots,g_{k}\} containing kk gates. Each gate gig_{i} has a type from the set {c,+b,b,bc}\{c,+^{b},-^{b},*^{b}c\}, and if the gate has one or more inputs, these are taken from the set IGI\cup G. The connectivity structure of the gates is required to be a directed acyclic graph.

The depth of a gate, denoted by d(g)d(g) is the length of the longest path from that gate to an input. We will build synchronous circuits, meaning that all gates of the form gx=gy+bgzg_{x}=g_{y}+^{b}g_{z} satisfy d(gx)=1+d(gy)=1+d(gz)d(g_{x})=1+d(g_{y})=1+d(g_{z}), and likewise for gates of the form gx=gybgzg_{x}=g_{y}-^{b}g_{z}. There are no restrictions on cc-gates, or bc*^{b}c-gates.

The width of a particular level ii of the circuit is defined to be w(i)={gj  :  d(gj)=i}w(i)=|\{g_{j}\;:\;d(g_{j})=i\}|, which is the number of gates at that level. The width of a circuit is defined to be w(C)=maxiw(i)w(C)=\max_{i}w(i), which is the maximum width taken over all the levels of the circuit.

Straight line programs. A convenient way of specifying an arithmetic circuit is to write down a straight line program (SLP) .

Each line of an SLP consists of a statement of the form v \leftarrow op, where v is a variable, and op consists of exactly one arithmetic operation from the set set {c,+b,b,bc}\{c,+^{b},-^{b},*^{b}c\}. The inputs to the gate can be any variable that is defined before the line, or one of the inputs to the circuit. We permit variables to be used on the left hand side in more than one line, which effectively means that we allow variables to be overwritten.

It is easy to turn an SLP into a circuit. Each line is turned into a gate, and if variable v is used as the input to gate gg, then we set the corresponding input of gg to be the gate gg^{\prime} that corresponds to the line that most recently assigned a value to v. SLP 1 above specifies a circuit with four gates, and the output of the circuit will be 0.750.75 +b\texttt{+}^{b} in1\texttt{in}_{1}.

For the sake of brevity, we also allow if statements and for loops in our SLPs. These two pieces of syntax can be thought of as macros that help us specify a straight line program concisely. The arguments to an if statement or a for loop must be constants that do not depend on the value of any gate in the circuit. When we turn an SLP into a circuit, we unroll every for loop the specified number of times, and we resolve every if statement by deleting the block if the condition does not hold. So the example above produces a circuit with seven gates: two gates correspond to the lines x \leftarrow in1 *b\texttt{*}^{b} 1 and out1 \leftarrow x *b\texttt{*}^{b} 1, while there are five gates corresponding to the line x \leftarrow x +b\texttt{+}^{b} 0.1, since there are five copies of the line remaining after we unroll the loop and resolve the if statements. The output of the resulting circuit will be 0.50.5 +b\texttt{+}^{b} in1\texttt{in}_{1}.

Liveness of variables and circuit width. Our ultimate goal will be to build circuits that have small width. To do this, we can keep track of the number of variables that are live at any one time in our SLPs. A variable v is live at line ii of an SLP if both of the following conditions are met.

There exists a line with index jij\leq i that assigns a value to v.

There exists a line with index kik\geq i that uses the value assigned to v as an argument.

The number of variables that are live at line ii is denoted by live(i)\operatorname{live}(i), and the number of variables used by an SLP is defined to be maxilive(i)\max_{i}\operatorname{live}(i), which is the maximum number of variables that are live at any point in the SLP. The following is proved in Appendix B.

An SLP that uses ww variables can be transformed into a polynomial-size synchronous circuit of width ww.

Hardness of 2D-Brouwer

In this section, we consider the following problem. It is a variant of two-dimensional Brouwer that uses only our restricted set of bounded gates.

Given an arithmetic circuit F:22F:^{2}\rightarrow^{2} using gates from the set {c, +b+^{b}, b-^{b}, b*^{b} c}, find x2x\in^{2} such that F(x)=xF(x)=x.

As a starting point for our reduction, we will show that this problem is \PPAD-hard. Our proof will follow the work of Mehta , who showed that the closely related 2D-LinearFIXP problem is \PPAD-hard. There are two differences between 2D-Brouwer and 2D-LinearFIXP.

2D-LinearFIXP takes a circuit that uses gates from the set {c,+,,c,max,min}\{c,+,-,*c,\max,\min\}, where none of these gates bound their outputs to be in $$.

In this section, we present an altered version of Mehta’s reduction, which will show that finding an ϵ\epsilon-solution to 2D-Brouwer is \PPAD-hard for a constant ϵ\epsilon.

Discrete Brouwer. The starting point for Mehta’s reduction is the two-dimensional discrete Brouwer problem, which is known to be \PPAD-hard . This problem is defined over a discretization of the unit square 2^{2} into a grid of points G={0,1/2n,2/2n,,(2n1)/2n}2G=\{0,1/2^{n},2/2^{n},\dots,(2^{n}-1)/2^{n}\}^{2}. The input to the problem is a Boolean circuit C:G{1,2,3}C:G\rightarrow\{1,2,3\} the assigns one of three colors to each point. The coloring will respect the following boundary conditions.

We have C(2n12n,i)=C(i,2n12n)=3C(\frac{2^{n}-1}{2^{n}},i)=C(i,\frac{2^{n}-1}{2^{n}})=3 for all i>0i>0.

These conditions can be enforced syntactically by modifying the circuit. The problem is to find a grid square that is trichromatic, meaning that all three colors appear on one of the four points that define the square.

Given a Boolean circuit C:{0,1}n×{0,1}n{1,2,3}C:\{0,1\}^{n}\times\{0,1\}^{n}\rightarrow\{1,2,3\} that satisfies the boundary conditions, find a point x,y{0,1}nx,y\in\{0,1\}^{n} such that, for each color i{1,2,3}i\in\{1,2,3\}, there exists a point (x,y)(x^{\prime},y^{\prime}) with C(x,y)=iC(x^{\prime},y^{\prime})=i where x{x,x+1}x^{\prime}\in\{x,x+1\} and y{y,y+1}y^{\prime}\in\{y,y+1\}.

Our first deviation from Mehta’s reduction is to insist on the following stronger boundary condition, which is shown in Figure 1(a).

We have C(i,j)=1C(i,j)=1 for all ii, and for all jϵj\leq\epsilon.

We have C(i,j)=2C(i,j)=2 for all j>ϵj>\epsilon, and for all iϵi\leq\epsilon.

We have C(i,j)=C(j,i)=3C(i,j)=C(j,i)=3 for all i>ϵi>\epsilon, and all j1ϵj\geq 1-\epsilon.

The original boundary conditions placed constraints only on the outermost grid points, while these conditions place constraints on a border of width ϵ\epsilon. We call this modified problem ϵ\epsilon-ThickDisBrouwer, which is the same as DiscreteBrouwer, except that the function is syntactically required to satisfy the new boundary conditions.

It is not difficult to produce a polynomial time reduction from DiscreteBrouwer to ϵ\epsilon-ThickDisBrouwer. It suffices to increase the number of points in the grid, and then to embed the original DiscreteBrouwer instance into the [ϵ,1ϵ]2[\epsilon,1-\epsilon]^{2} square in the middle of the instance. The proof of the following lemma can be found in Appendix C.

DiscreteBrouwer can be reduced in polynomial time to ϵ\epsilon-ThickDisBrouwer.

Embedding the grid in 2^{2}. We now reduce ϵ\epsilon-ThickDisBrouwer to 2D-Brouwer. One of the keys steps of the reduction is to map points from the continuous space 2^{2} to the discrete grid GG. Specifically, given a point xx\in, we would like to determine the nn bits that define the integer x2n\lfloor x\cdot 2^{n}\rfloor.

Mehta showed that this mapping from continuous points to discrete points can be done by a linear arithmetic circuit. Here we give a slightly different formulation that uses only gates from the set {c,+b,b,bc}\{c,+^{b},-^{b},*^{b}c\}. Let LL be a fixed constant that will be defined later.

SLP 3 extracts the first bit of the number xx\in. The first three lines of the program compute the value b=(xb0.5)bLb=(x-^{b}0.5)*^{b}L. There are three possibilities.

If 0.5<x<0.5+1/L0.5<x<0.5+1/L, then bb will be some number strictly between and 11.

The first two cases correctly decode the first bit of xx, and we call these cases good decodes. We will call the third case a bad decode, since the bit has not been decoded correctly.

SLP 4 extracts the first nn bits of xx, by extracting each bit in turn, starting with the first bit. The three lines after each extraction erase the current first bit of xx, and then multiply xx by two, which means that the next extraction will give us the next bit of xx. If any of the bit decodes are bad, then this procedure will break, meaning that we only extract the first nn bits of xx in the case where all decodes are good. We say that xx is well-positioned if the procedure succeeds, and poorly-positioned otherwise.

Multiple samples. The problem of poorly-positioned points is common in \PPAD-hardness reductions. Indeed, observe that we cannot define an SLP that always correctly extracts the first nn bits of xx, since this would be a discontinuous function, and all gates in our arithmetic circuits compute continuous functions. As in previous works, this is resolved by taking multiple samples around a given point. Specifically, for the point p2p\in^{2}, we sample kk points p1p_{1}, p2p_{2}, …, pkp_{k} where pi=p+(i1)(1(k+1)2n+1,1(k+1)2n+1).p_{i}=p+(i-1)\left(\frac{1}{(k+1)\cdot 2^{n+1}},\frac{1}{(k+1)\cdot 2^{n+1}}\right). Mehta proved that there exists a setting for LL that ensures that there are at most two points that have poorly positioned coordinates. We have changed several details, and so we provide our own statement and proof here. The proof can be found in Appendix D.

If L=(k+2)2n+1L=(k+2)\cdot 2^{n+1}, then at most two of the points p1p_{1} through pkp_{k} have poorly-positioned coordinates.

Evaluating a Boolean circuit. Once we have decoded the bits for a well-positioned point, we have a sequence of 0/1 variables. It is easy to simulate a Boolean circuit on these values.

The operator ¬  x\lnot\;x can be simulated by 1bx1-^{b}x.

The operator xyx\lor y can be simulated by x+byx+^{b}y.

The operator xyx\land y can be simulated by applying De Morgan’s laws and using \lor and ¬\lnot.

Recall that CC outputs one of three possible colors. We also assume, without loss of generality, that CC gives its output as a one-hot vector. This means that there are three Boolean outputs x1,x2,x3{0,1}3x_{1},x_{2},x_{3}\in\{0,1\}^{3} of the circuit. The color 11 is represented by the vector (1,0,0)(1,0,0), the color 22 is represented as (0,1,0)(0,1,0), and color 33 is represented as (0,0,1)(0,0,1). If the simulation is applied to a point with well-positioned coordinates, then the circuit will output one of these three vectors, while if it is applied to a point with poorly positioned coordinates, then the circuit will output some value x3x\in^{3} that has no particular meaning.

The output. The key idea behind the reduction is that each color will be mapped to a displacement vector, as shown in Figure 1(b). Here we again deviate from Mehta’s reduction, by giving different vectors that will allow us to prove our approximation lower bound.

Color 11 will be mapped to the vector (0,1)ϵ(0,1)\cdot\epsilon.

Color 22 will be mapped to the vector (1,12)ϵ(1,1-\sqrt{2})\cdot\epsilon.

Color 33 will be mapped to the vector (1,12)ϵ(-1,1-\sqrt{2})\cdot\epsilon.

These are irrational coordinates, but in our proofs we argue that a suitably good rational approximation of these vectors will suffice. We average the displacements over the kk different sampled points to get the final output of the circuit. Suppose that xijx_{ij} denotes output ii from sampled point jj. Our circuit will compute

Finally, we specify F:22F:^{2}\rightarrow^{2} to compute F(x,y)=(x+dispxϵ,y+dispyϵ)F(x,y)=(x+\texttt{disp}_{x}\cdot\epsilon,y+\texttt{disp}_{y}\cdot\epsilon).

Completing the proof. To find an approximate fixed point of FF, we must find a point where both dispx\texttt{disp}_{x} and dispy\texttt{disp}_{y} are close to zero. The dotted square in Figure 1(b) shows the set of displacements that satisfy x(0,0)(21)ϵ\|x-(0,0)\|_{\infty}\leq(\sqrt{2}-1)\cdot\epsilon, which correspond to the displacements that would be (21)ϵ(\sqrt{2}-1)\cdot\epsilon-fixed points.

The idea is that, if we do not sample points of all three colors, then we cannot produce a displacement that is strictly better than an (21)ϵ(\sqrt{2}-1)\cdot\epsilon-fixed point. For example, if we only have points of colors 1 and 2, then the displacement will be some point on the dashed line between the red and blue vectors in Figure 1(b). This line touches the box of (21)ϵ(\sqrt{2}-1)\cdot\epsilon-fixed points, but does not enter it. It can be seen that the same property holds for the other pairs of colors: we specifically chose the displacement vectors in order to maximize the size of the inscribed square shown in Figure 1(b).

The argument is complicated by the fact that two of our sampled points may have poorly positioned coordinates, which may drag the displacement towards (0,0)(0,0). However, this effect can be minimized by taking a large number of samples. We show show the following lemma.

Let ϵ<(21)ϵ\epsilon^{\prime}<(\sqrt{2}-1)\cdot\epsilon be a constant. There is a sufficiently large constant kk such that, if xF(x)<ϵ\|x-F(x)\|_{\infty}<\epsilon^{\prime}, then xx is contained in a trichromatic square.

The proof of Lemma 3.5 can be found in Appendix E. Since ϵ\epsilon can be fixed to be any constant strictly less than 0.50.5, we obtain the following.

Given a 2D-Brouwer instance, it is \PPAD-hard to find a point x2x\in^{2} s.t. xF(x)<(21)/20.2071\|x-F(x)\|_{\infty}<(\sqrt{2}-1)/2\approx 0.2071.

Reducing 2D-Brouwer to 2D-LinearFIXP is easy, since the gates {c,+b,b,bc}\{c,+^{b},-^{b},*^{b}c\} can be simulated by the gates {c,+,,c,max,min}\{c,+,-,*c,\max,\min\}. This implies that it is \PPAD-hard to find an ϵ\epsilon-fixed point of a 2D-LinearFIXP instance with ϵ<(21)/2\epsilon<(\sqrt{2}-1)/2.

It should be noted that an ϵ\epsilon-approximate fixed point can be found in polynomial time if the function has a suitably small Lipschitz constant, by trying all points in a grid of width ϵ\epsilon. We are able to obtain a lower bound for constant ϵ\epsilon because our functions have exponentially large Lipschitz constants.

Hardness of 2D-Brouwer with a constant width circuit

In our reduction from 2D-Brouwer to tree polymatrix games, the number of actions in the game will be determined by the width of the circuit. This means that the hardness proof from the previous section is not a sufficient starting point, because it produces 2D-Brouwer instances that have circuits with high width. In particular, the circuits will extract 2n2n bits from the two inputs, which means that the circuits will have width at least 2n2n.

Since we desire a constant number of actions in our tree polymatrix game, we need to build a hardness proof for 2D-Brouwer that produces a circuit with constant width. In this section we do exactly that, by reimplementing the reduction from the previous section using gadgets that keep the width small.

Bit packing. We adopt an idea of Elkind, Goldberg, and Goldberg , to store many bits in a single arithmetic value using a packed representation. Given bits b1,b2,,bk{0,1}b_{1},b_{2},\dots,b_{k}\in\{0,1\}, the packed representation of these bits is the value packed(b1,b2,,bk):=i=1kbi/2i\operatorname{packed}(b_{1},b_{2},\dots,b_{k}):=\sum_{i=1}^{k}b_{i}/2^{i}. We will show that the reduction from the previous section can be performed while keeping all Boolean values in a single variable that uses packed representation.

Working with packed variables. We build SLPs that work with this packed representation, two of which are shown below.

The FirstBit SLP combines the ideas from SLPs 3 and 4 to extract the first bit from a value xx\in. Repeatedly applying this SLP allows us to read out each bit of a value in sequence. The Clear SLP uses this to set some bits of a packed variable to zero. It takes as input a set of indices II, and a packed variable x=packed(b1,b2,,bk)x=\operatorname{packed}(b_{1},b_{2},\dots,b_{k}). At the end of the SLP we have x=packed(b1,b2,,bk)x=\operatorname{packed}(b^{\prime}_{1},b^{\prime}_{2},\dots,b^{\prime}_{k}) where bi=0b^{\prime}_{i}=0 whenever iIi\in I, and bi=bib^{\prime}_{i}=b_{i} otherwise.

It first copies xx to a fresh variable xx^{\prime}. The bits of xx^{\prime} are then read-out using FirstBit. Whenever a bit bib_{i} with iIi\in I is decoded from xx^{\prime}, we subtract bi/2ib_{i}/2^{i} from xx. If bi=1b_{i}=1, then this sets the corresponding bit of xx to zero, and if bi=0b_{i}=0, then this leaves xx unchanged.

We want to minimize the the width of the circuit that we produce, so we keep track of the number of extra variables used by our SLPs. For FirstBit, this is zero, while for clear this is two, since that SLP uses the fresh variables xx^{\prime} and bb.

Packing and unpacking bits. We implement two SLPs that manipulated packed variables. The Pack(x, y, S) operation allows us to extract bits from yy\in, and store them in xx, while the Unpack(x, y, S) operation allows us to extract bits from xx to create a value yy\in. This is formally specified in the following lemma, which is proved in Appendix F.

Suppose that we are given x=packed(b1,b2,,bk)\mathtt{x}=\operatorname{packed}(b_{1},b_{2},\dots,b_{k}), a variable y\mathtt{y}\in, and a sequence of indices S=s1,s2,,sj.S=\langle s_{1},s_{2},\dots,s_{j}\rangle. Let yjy_{j} denote the jjth bit of yy. The following SLPs can be implemented using at most two extra variables.

Pack(x, y, S) modifies x\mathtt{x} so that x=packed(b1,b2,,bk)\mathtt{x}=\operatorname{packed}(b^{\prime}_{1},b^{\prime}_{2},\dots,b^{\prime}_{k}) where bi=yjb^{\prime}_{i}=y_{j} whenever there exists an index sjSs_{j}\in S with sj=is_{j}=i, and bi=bib^{\prime}_{i}=b_{i} otherwise.

Unpack(x, y, S) modifies y so that y=y+bi=1jbsi/2i\mathtt{y}=\mathtt{y}+^{b}\sum_{i=1}^{j}b_{s_{i}}/2^{i}

Simulating a Boolean operations. As described in the previous section, the reduction only needs to simulate or- and not-gates. Given x=packed(b1,b2,,bk)\mathtt{x}=\operatorname{packed}(b_{1},b_{2},\dots,b_{k}), and three indices i1,i2,i3i_{1},i_{2},i_{3}, we implement two SLPs, which both modify xx so that x=packed(b1,b2,,bk)\mathtt{x}=\operatorname{packed}(b^{\prime}_{1},b^{\prime}_{2},\dots,b^{\prime}_{k}). SLP 7 implements Or(x,i1,i2,i3)\mathtt{Or(x,i_{1},i_{2},i_{3})}, which ensures that bi3=bi1bi2b^{\prime}_{i_{3}}=b_{i_{1}}\lor b_{i_{2}}, and bi=bib^{\prime}_{i}=b_{i} for ii3i\neq i_{3}. SLP 8 implements Not(x,i1,i2)\mathtt{Not(x,i_{1},i_{2})}, which ensures that bi2=¬bi1b^{\prime}_{i_{2}}=\lnot b_{i_{1}}, and bi=bib^{\prime}_{i}=b_{i} for ii2i\neq i_{2}.

These two SLPs simply unpack the input bits, perform the operation, and then pack the result into the output bit. The Or SLP uses the Unpack operation to set a=bi1+bbi2\mathtt{a}=b_{i_{1}}+^{b}b_{i_{2}}. Both SLPs use three extra variables: the fresh variable a is live throughout, and the pack and unpack operations use two extra variables. The variable b in the Not SLP is not live concurrently with a pack or unpack, and so does not increase the number of live variables. These two SLPs can be used to simulate a Boolean circuit using at most three extra variables.

Let CC be a Boolean circuit with nn inputs and kk gates. Suppose that x=packed(b1,,bn)x=\operatorname{packed}(b_{1},\dots,b_{n}), gives values for the inputs of the circuit. There is an SLP Simulate(C, x) that uses three extra variables, and modifies xx so that x=packed(b1,,bn,bn+1,,bn+k)x=\operatorname{packed}(b_{1},\dots,b_{n},b_{n+1},\dots,b_{n+k}), where bn+ib_{n+i} is the output of gate ii of the circuit.

Implementing the reduction. Finally, we can show that the circuit built in Theorem 3.6 can be implemented by an SLP that uses at most 8 variables. This SLP cycles through each sampled point in turn, computes the xx and yy displacements by simulating the Boolean circuit, and then adds the result to the output. The following theorem is proved in Appendix H

Given a 2D-Brouwer instance, it is \PPAD-hard to find a point x2x\in^{2} with xF(x)<212\|x-F(x)\|_{\infty}<\frac{\sqrt{2}-1}{2} even for a synchronous circuit of width eight.

Hardness for tree polymatrix games

Now we show that finding a Nash equilibrium of a tree polymatrix game is \PPAD-hard. We reduce from the low-width 2D-Brouwer problem, whose hardness was shown in Theorem 4.3. Throughout this section, we suppose that we have a 2D-Brouwer instance defined by a synchronous arithmetic circuit FF of width eight and depth nn. The gates of this circuit will be indexed as gi,jg_{i,j} where 1i81\leq i\leq 8 and 1jn1\leq j\leq n, meaning that gi,jg_{i,j} is the iith gate on level jj.

Modifying the circuit. The first step of the reduction is to modify the circuit. First, we modify the circuit so that all gates operate on values in [0,0.1][0,0.1], rather than $.Weintroducetheoperators. We introduce the operators+^{b}_{0.1},,-^{b}_{0.1},and, and*^{b}_{0.1},whichboundtheiroutputstobein, which bound their outputs to be in[0,0.1].Thefollowinglemma,provedinAppendixI,statesthatwecanrewriteourcircuitusingthesenewgates.Thetransformationsimplydividesall. The following lemma, proved in Appendix I, states that we can rewrite our circuit using these new gates. The transformation simply divides allc$-gates in the circuit by ten.

Given an arithmetic circuit F:22F:^{2}\rightarrow^{2} that uses gates from {c,+b,b,b}\{c,+^{b},-^{b},*^{b}\}, we can construct a circuit F:[0,0.1]2[0,0.1]2F^{\prime}:[0,0.1]^{2}\rightarrow[0,0.1]^{2} that uses the gates from {c,+0.1b,0.1b,0.1b}\{c,+^{b}_{0.1},-^{b}_{0.1},*^{b}_{0.1}\}, so that F(x,y)=(x,y)F(x,y)=(x,y) if and only if F(x/10,y/10)=(x/10,y/10)F^{\prime}(x/10,y/10)=(x/10,y/10).

Next we modify the structure of the circuit by connecting the two outputs of the circuit to its two inputs. Suppose, without loss of generality, that g7,1g_{7,1} and g8,1g_{8,1} are the inputs and that g7,ng_{7,n} and g8,ng_{8,n} are outputs. Note that the equality x=yx=y can be implemented using the gate x=y0.1b1x=y*^{b}_{0.1}1. We add the following extra equalities, which are shown in Figure 2.

We add gates g9,n1=g7,ng_{9,n-1}=g_{7,n} and g10,n1=g8,ng_{10,n-1}=g_{8,n}.

For each jj in the range 2j<n12\leq j<n-1, we add g9,j=g9,j+1g_{9,j}=g_{9,j+1} and g10,j=g10,j+1g_{10,j}=g_{10,j+1}.

We modify g7,1g_{7,1} so that g7,1=g9,2g_{7,1}=g_{9,2}, and we modify g8,1g_{8,1} so that g8,1=g10,2g_{8,1}=g_{10,2}.

Note that these gates are backwards: they copy values from higher levels in the circuit to lower levels, and so the result is not a circuit, but a system of constraints defined by gates, with some structural properties. Firstly, each gate gi,jg_{i,j} is only involved in constraints with gates of the form gi,j+1g_{i^{\prime},j+1} and gi,j1g_{i^{\prime},j-1}. Secondly, finding values for the gates that satisfy all of the constraints is \PPAD-hard, since by construction such values would yield a fixed point of FF.

The polymatrix game. The polymatrix game will contain three types of players.

For each i=1,,ni=1,\ldots,n, we have a variable player viv_{i}.

For each i=1,,n1i=1,\ldots,n-1, we have a constraint player cic_{i}, who is connected to viv_{i} and vi+1v_{i+1}.

For each i=1,,2n1i=1,\ldots,2n-1, we have a mix player mim_{i}. If ii is even, then mim_{i} is connected to ci/2c_{i/2}. If ii is odd, then mim_{i} is connected to v(i+1)/2v_{(i+1)/2}.

The structure of this game is shown in Figure 3. Each player has twenty actions, which are divided into ten pairs, xix_{i} and xˉi\bar{x}_{i} for i=1,,10i=1,\ldots,10.

Forcing mixing. The role of the mix players is to force the variable and constraint players to play specific mixed strategies: for every variable or constraint player jj, we want sj(xi)+sj(xˉi)=0.1s_{j}(x_{i})+s_{j}(\bar{x}_{i})=0.1 for all ii, which means that the same amount of probability is assigned to each pair of actions. To force this, each mix player plays a high-stakes hide-and-seek against their opponent, which is shown in Figure 4. This zero-sum game is defined by a 20×2020\times 20 matrix ZZ and a constant MM. The payoff ZijZ_{ij} is defined as follows. If i{xa,xˉa}i\in\{x_{a},\bar{x}_{a}\} and j{xa,xˉa}j\in\{x_{a},\bar{x}_{a}\} for some aa, then Zij=MZ_{ij}=M. Otherwise, Zij=0Z_{ij}=0. For each ii the player mim_{i} plays against player jj, which is either a constraint player cic_{i^{\prime}} or a variable player viv_{i^{\prime}}. We define the payoff matrix Ami,j=ZA^{m_{i},j}=Z and Gj,mi=ZG^{j,m_{i}}=-Z.

The following lemma, proved in Appendix J, shows that if MM is suitably large, then the variable and constraint players must allocate probability 0.10.1 to each of the ten action pairs.

Suppose that all payoffs in the games between variable and constraint players use payoffs in the range [P,P][-P,P]. If M>40PM>40\cdot P then in every mixed Nash equilibrium s\mathbf{s}, the action sjs_{j} of every variable and constraint player jj satisfies sj(xi)+sj(xˉi)=0.1s_{j}(x_{i})+s_{j}(\bar{x}_{i})=0.1 for all ii.

Gate gadgets. We now define the payoffs for variable and constraint players. Actions xix_{i} and xˉi\bar{x}_{i} of variable player vjv_{j} will represent the output of gate gi,jg_{i,j}. Specifically, the probability that player vjv_{j} assigns to action xix_{i} will be equal to the output of gi,jg_{i,j}. In this way, the strategy of variable player vjv_{j} will represent the output of every gate at level jj of the circuit. The constraint player cjc_{j} enforces all constraints between the gates at level jj and the gates at level j+1j+1. To simulate each gate, we will embed one of the gate gadgets from Figure 5, which originated from the reduction of DGP , into the bimatrix games that involve cjc_{j}.

The idea is that, for the constraint player to be in equilibrium, the variable players must play xix_{i} with probabilities that exactly simulate the original gate. Lemma 5.2 allows us to treat each gate independently: each pair of actions xix_{i} and si\mathbf{s}_{i} must receive probability 0.10.1 in total, but the split of probability between xix_{i} and si\mathbf{s}_{i} is determined by the gate gadgets.

Formally, we construct the payoff matrices Avi,ciA^{v_{i},c_{i}} and Aci,vi+1A^{c_{i},v_{i+1}} for all i<ni<n by first setting each payoff to . Then, for each gate, we embed the corresponding gate gadget from Figure 5 into the matrices. For each gate ga,jg_{a,j}, we take the corresponding game from Figure 5, and embed it into the rows xax_{a} and xˉa\bar{x}_{a} of a constraint player’s matrix. The diagrams specify specific actions of the constraint and variable players that should be modified.

For gates that originated in the circuit, the gadget is always embedded into the matrices Avj1,cj1A^{v_{j-1},c_{j-1}} and Acj1,vjA^{c_{j-1},v_{j}}, the synchronicity of the circuit ensures that the inputs for level jj gates come from level j1j-1 gates. We have also added extra multiplication gates that copy values from the output of the circuit back to the input. These gates are of the form gi,j=gi,j+1g_{i,j}=g_{i^{\prime},j+1}, and are embedded into the matrices Avj,cjA^{v_{j},c_{j}} and Acj,vj+1A^{c_{j},v_{j+1}}.

The following lemma, proved in Appendix 5.3, states that, in every Nash equilibrium, the strategies of the variable players exactly simulate the gates that have been embedded.

In every mixed Nash equilibrium s\mathbf{s} of the game, the following are satisfied for each gate gi,jg_{i,j}.

If gi,j=cg_{i,j}=c, then svj(xi)=cs_{v_{j}}(x_{i})=c.

If gi,j=gi1,j1  +0.1b  gi2,j1g_{i,j}=g_{i_{1},j-1}\;+^{b}_{0.1}\;g_{i_{2},j-1}, then svj(xi)=svj1(xi1)  +0.1b  svj1(xi2)s_{v_{j}}(x_{i})=s_{v_{j-1}}(x_{i_{1}})\;+^{b}_{0.1}\;s_{v_{j-1}}(x_{i_{2}}).

If gi,j=gi1,j1  0.1b  gi2,j1g_{i,j}=g_{i_{1},j-1}\;-^{b}_{0.1}\;g_{i_{2},j-1}, then svj(xi)=svj1(xi1)  0.1b  svj1(xi2)s_{v_{j}}(x_{i})=s_{v_{j-1}}(x_{i_{1}})\;-^{b}_{0.1}\;s_{v_{j-1}}(x_{i_{2}}).

If gi,j=gi1,j  0.1b  cg_{i,j}=g_{i_{1},j^{\prime}}\;*^{b}_{0.1}\;c, then svj(xi)=svj(xi1)  0.1b  cs_{v_{j}}(x_{i})=s_{v_{j^{\prime}}}(x_{i_{1}})\;*^{b}_{0.1}\;c.

Lemma 5.3 says that, in every Nash equilibrium of the game, the strategies of the variable players exactly simulate the gates, which by construction means that they give us a fixed point of the circuit FF. Also note that it is straightforward to give a path decomposition for our interaction graph, where each node in the decomposition contains exactly two vertices from the game, meaning that the graph has pathwidth 1. So we have proved the following.

It is \PPAD-hard to find a Nash equilibrium of a tree polymatrix game, even when all players have at most twenty actions and the interaction graph has pathwidth 1.

Open questions

For polymatrix games, the main open question is to find the exact boundary between tractability and hardness. Twenty-action pathwidth-1 tree polymatrix games are hard, but two-action path polymatrix games can be solved in polynomial time . What about two-action tree polymatrix games, or path-polymatrix games with more than two actions?

For 2D-Brouwer and 2D-LinearFIXP, the natural question is: for which ϵ\epsilon is it hard to find an ϵ\epsilon-fixed point? We have shown that it is hard for ϵ=0.2071\epsilon=0.2071, while the case for ϵ=0.5\epsilon=0.5 is trivial, since the point (0.5,0.5)(0.5,0.5) must always be a 0.50.5-fixed point. Closing the gap between these two numbers would be desirable.

References

Appendix A An issue with the lower bound in [9]

This section refers to the result in , which purports to show that finding a Nash equilibrium in a graphical game of pathwidth four is \PPAD-hard. Like this paper, their proof reduces from discrete Brouwer, but unlike this paper and other work , the proof attempts to carry out the reduction entirely using Boolean values. In other words, there is no step (like Lemmas 3.3 and 3.4 in this paper), where the Boolean outputs of the circuit are converted to arithmetic values. In all reductions of this type, this is carried out by averaging over multiple copies of the circuit, with the understanding that some of the circuits may give nonsensical outputs.

It is difficult to see how a reduction that avoids this step could work. This is because the expected payoff for a player in a polymatrix game is a continuous function of the other player’s strategies. But attempting to reduce directly from a Boolean circuit would produce a function that is discontinuous.

It seems very likely that the proof in can be repaired by including an explicit averaging step, and it this may still result in a graph that has bounded pathwidth, though it is less clear that the pathwidth would still be four. On the other hand, our work makes this less pressing, since the repaired result would still be subsumed by our lower bound for polymatrix games with pathwidth one.

Appendix B Proof of Lemma 2.1

The idea is to make each level of the circuit correspond to a line of the SLP. We assume that all for loops have been unrolled, and that all if statements have been resolved. Suppose that the resulting SLP has kk lines, and furthermore assume that at each line of the SLP, we have an indexed list v1,v2,,vlv_{1},v_{2},\dots,v_{l} of the variables that are live on each line, where of course we have lwl\leq w.

We will build a circuit with kwk\cdot w gates, and will index those gates as gi,jg_{i,j}, where 1ik1\leq i\leq k is a line, and 1jw1\leq j\leq w is a variable. The idea is that the gate gi,jg_{i,j} will compute the value of the jjth live variable on line ii. The gate gi,jg_{i,j} will be constructed as follows.

If there are fewer than jj variables live at line kk of the SLP, then gi,jg_{i,j} is a dummy cc-gate.

If line ii of the SLP is vjv_{j} \leftarrow op, then we define gi,j=opg_{i,j}=\mathtt{op}. If op uses a variable x\mathtt{x} as an input, then by definition, this variable must be live on line i1i-1, and so we find the index jj^{\prime} for x on line i1i-1, and we substitute gi1,jg_{i-1,j^{\prime}} for x in op. We do this for both arguments in the case where op is +b+^{b} or b-^{b}.

If line ii of the SLP does not assign a value to vjv_{j}, then by definition, the variable must be live on line i1i-1. As before, let jj^{\prime} be the index of this variable on line i1i-1. We define gi,j=gi1,jb1g_{i,j}=g_{i-1,j^{\prime}}*^{b}1.

It is not difficult to see that this circuit exactly simulates the SLP. Moreover, by construction, we have d(gi,j)=id(g_{i,j})=i. Hence, each level of the circuit has width exactly ww, and so the overall width of the circuit is ww.

Appendix C Proof of Lemma 3.3

Suppose that we are given a DiscreteBrouwer instance defined by a circuit CC over the grid Gn={0,1/2n,2/2n,,(2n1)/2n}2G_{n}=\{0,1/2^{n},2/2^{n},\dots,(2^{n}-1)/2^{n}\}^{2}. Let nn^{\prime} be an integer such that 2n/2n<(12ϵ)2^{n}/2^{n^{\prime}}<(1-2\epsilon). We will build an ϵ\epsilon-ThickDisBrouwer instance defined by a circuit CC^{\prime} over the grid Gn={0,1/2n,2/2n,,(2n1)/2n}2G_{n^{\prime}}=\{0,1/2^{n^{\prime}},2/2^{n^{\prime}},\dots,(2^{n^{\prime}}-1)/2^{n^{\prime}}\}^{2}. We will embed the original instance in the center of the new instance, where the point (x0,y0)=(0.52n1/2n,0.52n1/2n)(x_{0},y_{0})=(0.5-2^{n-1}/2^{n^{\prime}},0.5-2^{n-1}/2^{n^{\prime}}) in GG^{\prime} will correspond to the point (0,0)(0,0) in GG. We use the following procedure to determine the color of a point (x,y)Gn(x,y)\in G_{n^{\prime}}.

If 0xx02n0\leq x-x_{0}\leq 2^{n} and 0yy02n0\leq y-y_{0}\leq 2^{n}, then C(x,y)=C(xx0,yy0)C^{\prime}(x,y)=C(x-x_{0},y-y_{0}).

Otherwise, if xx0<0x-x_{0}<0, then C(x,y)=1C(x,y)=1.

Otherwise, if yy00y-y_{0}\leq 0, then C(x,y)=2C(x,y)=2.

where the second inequality used the definition of nn^{\prime}. Moreover

where again the second inequality used the definition of nn^{\prime}. The same inequalities hold for y0y_{0}. Hence, the first step of our procedure perfectly embeds the original instance into the new instance, while the other steps ensure that the ϵ\epsilon-ThickDisBrouwer boundary conditions hold.

Points in the boundary cannot be solutions, because the boundary constraints ensure that at least one of the three colors will be missing. Hence, every solution of CC^{\prime} on GG^{\prime} must also be a solution of CC on GG.

Appendix D Proof of Lemma 3.4

Observe that SLP 3 produces a bad decode if and only if xx is in the range [0.5,0.5+1/L)[0.5,0.5+1/L). Since SLP 4 extracts nn bits, multiplying xx by two each time, it follows that one of the decodes will fail if

Hence, the point pi=(pi1,pi2)p_{i}=(p_{i}^{1},p_{i}^{2}) has a poorly-positioned coordinate if there is some integer aa such that pi1I(a)p_{i}^{1}\in I(a), or pi2I(a)p_{i}^{2}\in I(a). For a fixed dimension j{1,2}j\in\{1,2\}, we have two properties.

There cannot be two points pip_{i} and pip_{i^{\prime}} such that pijp_{i}^{j} and pijp_{i^{\prime}}^{j} both lie in the same interval I(a)I(a). This is because the width of the interval is

where the final term is the defined difference between pijp_{i}^{j} and pi+1jp_{i+1}^{j}.

There cannot be two distinct indices aa and aa^{\prime} such that pijI(a)p_{i}^{j}\in I(a) and pijI(a)p_{i^{\prime}}^{j}\in I(a^{\prime}). This is because the distance between p1jp_{1}^{j} and pkjp_{k}^{j} is at most

whereas the distance between any two consecutive intervals I(a)I(a) and I(a+1)I(a+1) is at least

From these two facts, it follows that there is at most one point that has a poorly-positioned coordinate in dimension jj, so there can be at most two points that have poorly positioned coordinates.

Appendix E Proof of Lemma 3.5

We argue that if xF(x)<ϵ/2\|x-F(x)\|_{\infty}<\epsilon^{\prime}/2, then there exist three indices i1i_{1}, i2,i_{2}, and i3i_{3} such that pijp_{i_{j}} has well-positioned coordinates, and that the lower-left corner of the square containing pijp_{i_{j}} has color jj.

Suppose for the sake of contradiction that this is not true. Then there must be a color that is missing, and there are two cases to consider.

First suppose that color 1 is missing. Since there are at most two points with poorly-positioned coordinates, we know that we have at least k2k-2 points jj for which x2j=1x_{2j}=1 or x3j=1x_{3j}=1. Hence we have

where the 2/k2/k term comes from the fact that the poorly positioned points can maximize dispy\texttt{disp}_{y} by fixing x1j=1x_{1j}=1 and x2j=x3j=0x_{2j}=x_{3j}=0, and thus can contribute at most 2ϵ/k2\cdot\epsilon/k to the sum.

As kk tends to infinity, the right-hand side converges to (12)ϵ(1-\sqrt{2})\cdot\epsilon. Since ϵ<ϵ\epsilon^{\prime}<\epsilon, we can choose a sufficiently large constant kk such that dispy<(12)ϵ\texttt{disp}_{y}<(1-\sqrt{2})\cdot\epsilon^{\prime}. Now, observing that 121-\sqrt{2} is negative, we get the following

Now suppose that one of colors 2 or 3 is missing. We will consider the case where color 3 is missing, as the other case is symmetric. As before, since there are at most two points with poorly-positioned coordinates, we know that we have at least k2k-2 points jj for which x1j=1x_{1j}=1 or x2j=1x_{2j}=1. One of the two following cases applies.

At least (21)k2(\sqrt{2}-1)\cdot k-2 well-positioned points satisfy x2j=1x_{2j}=1. If this is the case, then we have

where we have used the fact that there are no well positioned points with color 3, and the fact that the poorly-positioned points cannot reduce the sum by more than 2ϵk\frac{2\cdot\epsilon}{k}.

As kk tends to infinity, the right-hand side tends to (21)ϵ(\sqrt{2}-1)\cdot\epsilon, so there is a sufficiently large constant kk such that dispx>(21)ϵ\texttt{disp}_{x}>(\sqrt{2}-1)\cdot\epsilon^{\prime}, and so xF(X)>(21)ϵ\|x-F(X)\|_{\infty}>(\sqrt{2}-1)\cdot\epsilon^{\prime}.

At least k(21)kk-(\sqrt{2}-1)\cdot k well-positioned points satisfy x1j=1x_{1j}=1. In this case we have

The first line of this inequality uses the fact that we have no well-positioned points with color 3, and that the poorly-positioned points can reduce the sum by at most 2ϵk\frac{2\cdot\epsilon}{k}. The second line substitutes the bounds that we have for x1jx_{1j} and x2jx_{2j}. The third line uses the fact that 21\sqrt{2}-1 is a solution of the equation x=1xx2x=1-x-x^{2}.

As in the other two cases, this means that we can choose a sufficiently large constant kk such that xF(X)>(21)ϵ\|x-F(X)\|_{\infty}>(\sqrt{2}-1)\cdot\epsilon^{\prime}.

Next we observe that the arguments given above all continue to hold if we substitute a sufficiently precise rational approximation 2\sqrt{2} in our displacement vector calculation. This is because all three arguments prove that some expression converges to (21)ϵ>(21)ϵ(\sqrt{2}-1)\cdot\epsilon>(\sqrt{2}-1)\cdot\epsilon^{\prime}, thus we can replace 2\sqrt{2} with any suitably close rational that ensures that the expressions converge to (x1)ϵ>(21)ϵ(x-1)\cdot\epsilon>(\sqrt{2}-1)\cdot\epsilon^{\prime} for some xx.

So far we have shown that there exist three well-positioned points pi1p_{i_{1}}, pi2p_{i_{2}}, and pi3p_{i_{3}} that have three distinct colors. To see that xx is contained within a trichromatic square, it suffices to observe that pkp11/2k\|p_{k}-p_{1}\|_{\infty}\leq 1/2^{k}, which means that all three points must be contained in squares that are adjacent to the square containing xx.

Appendix F Proof of Lemma 4.1

We construct SLPs for both of the operations.

Packing bits. The Pack operation is implemented by the following SLP.

SLP 9 implements the pack operation. It begins by clearing the bits referenced by the sequence SS. It then copies y to y’, and destructively extracts the first jj bits of y’. These bits are then stored at the correct index in x by the final line of the for loop. In total, this SLP uses two additional variables y’ and b. Two extra variables are used by Clear, but these stop being live after the first line, before y’ and b become live.

Unpacking bits. The Unpack operation is implemented by the following SLP.

SLP 10 implements the unpacking operation. It first copies x to x’, and then destructively extracts the first kk bits of x’. Whenever a bit referred to by SS is extracted from x’, it is first multiplied by 12sj\frac{1}{2^{s_{j}}}, which puts it at the correct position, and is then added to y. This SLP uses the two additional variables x’ and b.

Appendix G Proof of Lemma 4.2

Simulating a Boolean circuit. Let gn+1,gn+2,,gn+k\langle g_{n+1},g_{n+2},\dots,g_{n+k}\rangle be the gates of the circuit, and suppose, without loss of generality, that the gates have been topologically ordered. The following SLP will simulate the circuit CC.

Assuming that the first nn bits of xx already contain the packed inputs of the circuit, SLP 11 implements the operation Simulate(C,x)\mathtt{Simulate(C,x)} that computes the output of each gate. This simply iterates through and simulates each gate. The SLP introduce no new variables, and so it uses three additional live variables in total, which come from the Or and Not operations.

Appendix H Proof of Theorem 4.3

Dealing with the output. Recall that our Boolean circuit will output three bits, and that these bits determine which displacement vector is added to the output of the arithmetic circuit. We now build an SLP that does this conversion. It implements AddVector(x,i,outx,outy,k,dx,dy)\mathtt{AddVector(x,i,\texttt{out}_{x},\texttt{out}_{y},k,d_{x},d_{y})}, where x=packed(b1,b2,,bn)x=\operatorname{packed}(b_{1},b_{2},\dots,b_{n}), ini\leq n is an index, outx\texttt{out}_{x} and outy\texttt{out}_{y} are variables, kk is an integer, and dx,dyd_{x},d_{y}\in. After this procedure, we should have outx=outx+dxbi/k\texttt{out}_{x}=\texttt{out}_{x}+d_{x}\cdot b_{i}/k, and outy=outy+dybi/k\texttt{out}_{y}=\texttt{out}_{y}+d_{y}\cdot b_{i}/k. SLP 12 does this operation. It uses three extra variables in total: the fresh variable a is live throughout, and the two unpack operations use two extra variables.

Implementing the reduction. Finally, we can implement the reduction from DiscreteBrouwer to 2D-Brouwer. We will assume that we have been given a Boolean circuit CC that takes 2n2n inputs, where the first nn input bits correspond to the xx coordinate, and the second nn input bits correspond to the yy coordinate. Recall that we have required that CC gives its output as a one-hot vector. We assume that the three output bits of CC are indexed n+k2n+k-2, n+k1n+k-1, and n+kn+k, corresponding to colors 11, 22, and 33, respectively.

SLP 13 implements the reduction. The variables inx\texttt{in}_{x} and iny\texttt{in}_{y} hold the inputs to the circuit, while the variables outx\texttt{out}_{x} and outy\texttt{out}_{y} are the outputs. The SLP first copies the inputs to the outputs, and then modifies the outputs using the displacement vectors. Each iteration of the for loop computes the computes the displacement contributed by the point pip_{i} (defined in the previous section). This involves decoding the first nn bits of both inx\texttt{in}_{x} and iny\texttt{in}_{y}, which can be done via the pack operation, simulating the circuit on the resulting bits, and then adding the correct displacement vectors to outx\texttt{out}_{x} and outy\texttt{out}_{y}.

The correctness of this SLP follows from our correctness proof for Theorem 3.6, since all we have done in this section is reimplement while using a small number of live variables. In total, this SLP uses four extra variables. All of the macros use at most three extra variables, and the fresh variable x during these macros. Since inx\texttt{in}_{x} iny\texttt{in}_{y}, outx\texttt{out}_{x} and outy\texttt{out}_{y} are all live throughout as well, this gives us 8 live variables in total.

Appendix I Proof of Lemma 5.1

The circuit FF^{\prime} consists of gates gi,jg^{\prime}_{i,j} for each 1i81\leq i\leq 8 and 1jn1\leq j\leq n.

If gi,j=cg_{i,j}=c, then gi,j=c/10g^{\prime}_{i,j}=c/10.

If gi,j=ga,b+bgx,yg_{i,j}=g_{a,b}+^{b}g_{x,y}, then gi,j=ga,b  +0.1b  gx,yg^{\prime}_{i,j}=g^{\prime}_{a,b}\;+^{b}_{0.1}\;g^{\prime}_{x,y}.

If gi,j=ga,bbgx,yg_{i,j}=g_{a,b}-^{b}g_{x,y}, then gi,j=ga,b  0.1b  gx,yg^{\prime}_{i,j}=g^{\prime}_{a,b}\;-^{b}_{0.1}\;g^{\prime}_{x,y}.

If gi,j=ga,bbcg_{i,j}=g_{a,b}*^{b}c, then gi,j=ga,b  0.1b  cg^{\prime}_{i,j}=g^{\prime}_{a,b}\;*^{b}_{0.1}\;c.

Let (x,y)2(x,y)\in^{2}. It is not difficult to show by induction, that if we compute F(x,y)F(x,y) and F(x/10,y/10)F^{\prime}(x/10,y/10), then gi,j=gi,j/10g^{\prime}_{i,j}=g_{i,j}/10 for all ii and jj. Hence, F(x,y)=(x,y)F(x,y)=(x,y) if and only if F(x/10,y/10)=(x/10,y/10)F^{\prime}(x/10,y/10)=(x/10,y/10).

Appendix J Proof of Lemma 5.2

For the sake of contradiction, suppose that there is a Nash equilibrium s\mathbf{s} in which there is some variable or constraint player jj that fails to satisfy this equality. Let II be the subset of indices that maximize the expression sj(xi)+sj(xˉi)s_{j}(x_{i})+s_{j}(\bar{x}_{i}), ie., II contains the pairs that player jj plays with highest probability. Note that since player jj does not play all pairs uniformly, II does not contain every index, so let JJ be the non-empty set of indices not in II.

Let mkm_{k} be the mix player who plays against player jj. By construction, the actions xix_{i} and xˉi\bar{x}_{i} have payoff (sj(xi)+sj(xˉi))M\left(s_{j}(x_{i})+s_{j}(\bar{x}_{i})\right)\cdot M for mkm_{k}. Since s\mathbf{s} is a Nash equilibrium, mkm_{k} may only place probability on actions that are best responses, which means that he may only place probability on the actions xix_{i} and xˉi\bar{x}_{i} when iIi\in I.

Let ii be an index that maximizes smk(xi)+smk(xˉi)s_{m_{k}}(x_{i})+s_{m_{k}}(\bar{x}_{i}) for player mkm_{k}. By the above argument, we have iIi\in I. The actions xix_{i} and xˉi\bar{x}_{i} for player jj give payoff at most

The first expression uses 2P2P as the maximum possible payoff that player jj can obtain from the two other games in which he is involved. The first inequality uses the fact that ii was the pair with maximal probability, and there are exactly 10 pairs. The second inequality uses the fact that M/10>4PM/10>4P.

On the other hand, let ii^{\prime} be an index in JJ. By the argument above, we have smk(xi)+smk(xˉi)=0s_{m_{k}}(x_{i^{\prime}})+s_{m_{k}}(\bar{x}_{i^{\prime}})=0. Hence, the payoff of actions xix_{i^{\prime}} and xˉi\bar{x}_{i^{\prime}} to player jj is at least 2P-2P, since that is the lowest payoff that he can obtain from the other two games in which he is involved.

But now we have arrived at our contradiction. Player jj places non-zero probability on at least one action xix_{i} or xˉi\bar{x}_{i} with iIi\in I that is not a pure best response. Hence s\mathbf{s} cannot be a Nash equilibrium.

Appendix K Proof of Lemma 5.3

We can actually prove this lemma for all four gates simultaneously. Let jj^{\prime} be the index constraint player into which the gate gadget is embedded. Observe that all four games for the four gate types have a similar structure: The payoffs for actions xix_{i} and xˉi\bar{x}_{i} for player vjv_{j} are identical across all four games, and the payoff of action xix_{i} for cjc_{j^{\prime}} are also identical; the only thing that differs between the gates is the payoff to player cjc_{j^{\prime}} for action xˉi\bar{x}_{i}. We describe these differences using a function ff.

For cc-gates, we define f(s)=cf(\mathbf{s})=c.

For +0.1b+^{b}_{0.1}-gates, we define f(s)=svj1(xi1)  +  svj1(xi1)f(\mathbf{s})=s_{v_{j-1}}(x_{i_{1}})\;+\;s_{v_{j-1}}(x_{i_{1}}).

For 0.1b-^{b}_{0.1}-gates, we define f(s)=svj1(xi1)    svj1(xi1)f(\mathbf{s})=s_{v_{j-1}}(x_{i_{1}})\;-\;s_{v_{j-1}}(x_{i_{1}}).

For 0.1b*^{b}_{0.1}-gates, we define f(s)=svj(xi1)    cf(\mathbf{s})=s_{v_{j^{\prime}}}(x_{i_{1}})\;*\;c.

Observe that the payoff of action xˉi\bar{x}_{i} to player cjc_{j^{\prime}} is f(s)f(\mathbf{s}). To prove the lemma, we must show that player vjv_{j} plays xix_{i} with probability

If f(s)0f(\mathbf{s})\leq 0, then we argue that svj(xi)=0s_{v_{j}}(x_{i})=0. Suppose for the sake of contradiction that player vjv_{j} places non-zero probability on action xix_{i}. Then action xix_{i} for player cjc_{j^{\prime}} will have payoff strictly greater than zero, whereas action xˉi\bar{x}_{i} will have payoff f(s)0f(\mathbf{s})\leq 0. Hence, in equilibrium, cjc_{j^{\prime}} cannot play action xˉi\bar{x}_{i}. Lemma 5.2 then implies that player cjc_{j^{\prime}} must play xix_{i} with probability 0.10.1. If cjc_{j^{\prime}} does this, then the payoff to vjv_{j} for xix_{i} will be zero, and the payoff to vjv_{j} for xˉi\bar{x}_{i} will be 0.10.1. This means that vjv_{j} places non-zero probability on an action that is not a best response, and so is a contradiction.

If f(s)0.1f(\mathbf{s})\geq 0.1, then we argue that svj(xi)=0.1s_{v_{j}}(x_{i})=0.1. Suppose for the sake of contradiction with Lemma 5.2 that svj(xˉi)>0s_{v_{j}}(\bar{x}_{i})>0. Observe that the payoff to player cjc_{j^{\prime}} of action xˉi\bar{x}_{i} is f(s)0.1f(\mathbf{s})\geq 0.1, whereas the payoff to player cjc_{j^{\prime}} of action xix_{i} is svj(xi)<0.1s_{v_{j}}(x_{i})<0.1. So to be in equilibrium and consistent with Lemma 5.2, player cjc_{j^{\prime}} must place 0.10.1 probability on action xˉi\bar{x}_{i}, and probability on action xix_{i}. But this means that the payoff of action xˉi\bar{x}_{i} to player vjv_{j} is zero, while the payoff of action xix_{i} to player vjv_{j} is 0.10.1. Hence player vjv_{j} has placed non-zero probability on an action that is not a pure best response, and so we have our contradiction.

If 0<f(s)<0.10<f(\mathbf{s})<0.1, then we argue that svj(xi)=f(s)s_{v_{j}}(x_{i})=f(\mathbf{s}). We first prove that player cjc_{j^{\prime}} must play both xix_{i} and xˉi\bar{x}_{i} with positive probability.

If player cjc_{j^{\prime}} does not play xˉi\bar{x}_{i} then player vjv_{j} will not play xix_{i}, and player cjc_{j^{\prime}} will receive payoff , but in this scenario he could get f(s)>0f(\mathbf{s})>0 by playing xˉi\bar{x}_{i} instead of his current strategy.

If player cjc_{j^{\prime}} does not play xix_{i} then player vjv_{j} will not play xˉi\bar{x}_{i}. Player cjc_{j^{\prime}} will receive payoff f(s)f(\mathbf{s}) for playing xˉi\bar{x}_{i}, but in this scenario he could receive payoff 1>f(s)1>f(\mathbf{s}) for playing xix_{i} instead.

In order for player cjc_{j^{\prime}} to mix over xix_{i} and xˉi\bar{x}_{i} in equilibrium, their payoffs must be equal. This is only the case when svj(xi)=f(s)s_{v_{j}}(x_{i})=f(\mathbf{s}).