Sum of Squares Lower Bounds from Pairwise Independence

Boaz Barak, Siu On Chan, Pravesh Kothari

Introduction

Constraint Satisfaction Problems (CSP) are among the most natural computational problems, and yet their computational complexity is not fully understood. In particular several works have studied the notion of Approximation Resistance, which loosely speaking means that the best polynomial-time approximation algorithm is simply the one that outputs a random assignment. Under Khot’s Unique Games Conjecture much is known about this property. In particular Austrin and Mossell showed if the UGC is true, then, for every predicate P:{0,1}k{0,1}P:\{0,1\}^{k}\rightarrow\{0,1\}, if there exists a pairwise independent distribution μ\mu over P1(1)P^{-1}(1) (i.e., a distribution μ\mu such that for every ij[k]i\neq j\in[k], the marginal μiμj\mu_{i}\mu_{j} is the uniform distribution over {0,1}2\{0,1\}^{2}), then PP is approximation resistant. Austrin and Håstad used this to establish (under the UGC) fairly tight bounds on the threshold at which a random predicate of a particular density becomes approximation resistant. However, there is no consensus whether the UGC is true. Assuming only PNP\mathbf{P}\neq\mathbf{NP}, the best known bound is by Chan who showed that a predicate is approximation resistant if it contains a distribution μ\mu as above satisfying the additional condition that it is uniform over a subspace VGF(2)kV\subseteq GF(2)^{k}. This algebraic structure is a fairly strong condition. In particular if we choose P:{0,1}k{0,1}P:\{0,1\}^{k}\rightarrow\{0,1\} to be a random predicate conditioned on P1(1)=t|P^{-1}(1)|=t (where t{12k}t\in\{1\ldots 2^{k}\} is some parameter), then PP will satisfy the first condition (supporting a pairwise independent distribution) with high probability as long as t>ck2t>ck^{2} for some constant cc while it will not satisfy the second condition even for tt as large as exp(k/5)\exp(k/5) (see Observation A.1).

Another line of work has been concerned with proving unconditional lower bounds for these problems on restricted families of algorithms. These works considered convex relaxations for CSPs, where we say that a CSP is approximation resistant for some relaxation R\mathcal{R} if there is an instance for which a random assignment is essentially optimal, but the relaxation value is 1o(1)1-o(1) (namely, the relaxation “thinks” that it’s possible to satisfy almost all constraints). Interestingly, the unconditional results match the conditional ones. That is, for certain weaker relaxations (namely, the Sherali-Adams linear programming hierarchy or Sherali-Adams augmented with the basic semidefinite program), there are unconditional results for the same predicates that were shown approximation-resistant under the UGC . (This is of course not a coincidence, as the UGC is intimately connected with some of these weaker relaxations .) In contrast, for the stronger Sum of Squares (SOS) (also known as Lasserre) relaxation , the previously known results utilized the same conditions as in Chan’s NP-hardness result (and in fact inspired Chan’s work).

In this work we show that the pairwise independence condition suffices for lower bounds even for this stronger Sum-of-Squares hierarchy. This result is interesting in its own right and, based on past experience, could also be viewed as suggesting that it may be possible to improve the UGC-based results to results based on PNP\mathbf{P}\neq\mathbf{NP}.

Our results actually hold for a more general setting than showing approximation-resistance of predicates, and so to state them we need to introduce some notation. Roughly speaking, we show that for every kk and an arbitrarily small ϵ>0\epsilon>0, there exists a set I={C1,,Cm}\mathcal{I}=\{C_{1},\ldots,C_{m}\} of kk-tuples of literals (i.e. variables or their negations) over the variables x1,,xnx_{1},\ldots,x_{n} such that (1) for every assignment xx to the variables, the induced distribution on {0,1}k\{0,1\}^{k} obtained by taking a random i[m]i\in[m] and looking at the literals in CiC_{i} is ϵ\epsilon-close to the uniform distribution on {0,1}k\{0,1\}^{k} but (2) for every pairwise independent distribution μ\mu over {0,1}k\{0,1\}^{k}, there is a relaxation-solution that “cheats” the Ω(n)\Omega(n)-degree SOS relaxation to think that there is a distribution D\mathcal{D} over assignments (i.e. {0,1}k\{0,1\}^{k}) such that for every i[m]i\in[m], the projection of D\mathcal{D} to the literals in CiC_{i} is distributed according to μ\mu. This immediately implies that predicates supporting a pairwise independent distribution are approximation-resistant for this relaxation. We now formally state our results:

The Sum-of-Squares hierarchy can be thought of as optimizing over pseudo-expectations; see the survey and the references therein, as well as the lecture notes . For notational convenience, we will use variables over {±1}\{\pm 1\} instead of {0,1}\{0,1\}. A literal is a function f:{±1}n{±1}f:\{\pm 1\}^{n}\rightarrow\{\pm 1\} such that f(x)=xif(x)=x_{i} or f(x)=xif(x)=-x_{i} for some ii. If C=(f1,,fk)C=(f_{1},\ldots,f_{k}) is a kk-tuple of literals then we denote by C(x)C(x) the tuple (f1(x),,fk(x))(f_{1}(x),\ldots,f_{k}(x)). Our main result is the following:

For every x{±1}nx\in\{\pm 1\}^{n}, the distribution {C(x)}\{C(x)\} where CC is chosen at random in I\mathcal{I} is within ϵ\epsilon statistical distance to the uniform distribution over {±1}k\{\pm 1\}^{k}.

The following immediate corollary implies that predicates supporting pairwise independent distributions are approximation-resistant for Ω(n)\Omega(n)-degree SOS:

For every ϵ>0\epsilon>0 and P:{±1}k{0,1}P:\{\pm 1\}^{k}\rightarrow\{0,1\}, if there exists a pairwise independent distribution μ\mu supported on P1(1)P^{-1}(1) then there exists δ>0\delta>0 such that for all nn there is a set I={C1,,Cm}\mathcal{I}=\{C_{1},\ldots,C_{m}\} of kk-tuples of literals over x1,,xnx_{1},\ldots,x_{n} such that

The value of the δn\delta n-degree Max-PP SOS relaxation for the fraction of satisfiable constraints on the instance I\mathcal{I} is 11.

The instance I=(C1,,Cm)\mathcal{I}=(C_{1},\ldots,C_{m}) is actually obtained at random (with some pruning of a small fraction of the constraints, or alternatively, with some loss in the “perfect completeness” condition). Thus our results can also be thought as giving some evidence to a conjecture of Barak, Kindler and Steurer that no polynomial-time algorithm (including in particular the SOS algorithm) can beat the basic semidefinite program on approximating random CSP instances.

Throughout this paper we restrict ourselves to the Boolean case, and do not consider extensions to a larger alphabet, though our methods may be useful in this case as well.

2 Related works

Grigoriev proved in 1999 that (in the language of this paper) 3XOR is approximation resistant for the degree Ω(n)\Omega(n) Sum-of-Squares hierarchy. Grigoriev’s work in fact predated the papers of Parrilo and Lasserre proposing the SOS hierarchy, and so he used the different (but equivalent) language of Positivstellensatz Calculus proofs. (Also, as far we know, he did not note that these proofs can be efficiently found via a semidefinite program.) Grigoriev’s result was rediscovered in 2008 by Schoenebeck , who also noted that it implies approximation resistance for 3SAT and some other CSPs as well. Tulsiani (see also Chan ) further generalized these results and in particular showed that every predicate that contains a pairwise independent subgroup is approximation resistant for Ω(n)\Omega(n)-degree SOS. Both Tulsiani and Schoenebeck follow Grigoriev’s technique of reducing SOS lower bounds to resolution width lower bounds. As far as we know, no other SOS integrality gaps for approximating CSPs were known, and there are very few SOS lower bounds in general, most notably Grigoriev’s lower bound for knapsack and the very recent result by Meka, Potechin and Wigderson for the planted clique problem (personal communication).

Arora, Bollobás, Lovász and Tourlakis obtained integrality gaps for the Lovász-Schrijver linear programming hierarchy for Vertex Cover. Schoenebeck, Trevisan and Tulsiani showed that Max-Cut is approximation resistant for Ω(n)\Omega(n) levels of the Lovász-Schrijver hierarchy, and these results have been strengthened to the stronger Sherali-Adams hierarchy . The famous Goemans-Williamson algorithm shows that Max-Cut is not approximation resistant for even the degree 22 SOS hierarchy, further underscoring the difference between these relaxations.

Perhaps closest to our work are the papers of Benabbas, Georgiou, Magen, and Tulsiani who showed that predicates containing a pairwise independent distribution are approximation resistant for Ω(n)\Omega(n) rounds of the Sherali Adams hierarchy, even when one adds the degree 22 SOS constraints. Indeed, our pseudo-distribution agrees with theirs, though we describe it somewhat differently, and most importantly, need a completely different argument to show that it is positive semi-definite. Our work is also inspired by the pseudo-expectation view of the SOS hierarchy as advocated in the papers .

Overview of our proof

We now review the construction of the instance, as well as the pseudo-expectation operator, and then discuss how we come up with these local orthogonal functions. As mentioned above, our instance I=(C1,,Cm)\mathcal{I}=(C_{1},\ldots,C_{m}) will simply be a random instance, which we think of as a kk-uniform hypergraph with mm hyperedges C1,,CmC_{1},\ldots,C_{m}. After some pruning we can assume this hypergraph has girth Ω(logn)\Omega(\log n).If we don’t prune these clauses then our proof guarantees that for 1o(1)1-o(1) fraction of the clauses we get the marginal distribution to be μ\mu. It is possible that this can be upgraded to all of the clauses at the expense of some additional complication, but we have not checked whether or not that’s the case. By a simple Chernoff + union bound argument, if m>cnm>cn for a sufficiently large constant cc then for every assignment x{±1}nx\in\{\pm 1\}^{n}, the induced distribution {Ci(x)}i[m]\{C_{i}(x)\}_{i\sim[m]} will be ϵ\epsilon-close to the uniform distribution. For this informal overview, suppose that we merely want to establish the existence of a degree dd pseudo-expectation operator for some large constant dd. Note that this means that sets of at most dd (or even 2d2^{d}) variables form a forest (i.e. disjoint collection of trees) in this hypergraph.

The above overview can be converted into a full proof with some care when d=o(log(n))d=o(\log{(n)}) by exploiting the acyclicity of all subgraphs involved. Extending to d=Ω(n)d=\Omega(n) , however, introduces additional subtleties. When dd exceeds Ω(log(n))\Omega(\log{(n)}), subgraphs induced by dd vertices of I\mathcal{I} can have cycles. An immediate effect of this is that the the definition of a closed set that we gave before no longer yields consistent local distributions on any collection of dd variables. An example of a problem that arises when cycles can exist on a set of vertices is illustrated in Figure 2. To fix this, we define a stronger notion of closed set SS that guarantees that all paths of length at most 33 between any two vertices in SS are completely contained inside SS. This notion of closures differs from the one that Benabbas et. al. use. An appeal to the expansion property of I\mathcal{I} (instead of high girth as before) can be used to show that the closure of a set SS is at most a constant factor larger than S|S|. Similarly, as before, we need to show that there exists a (suitably defined) ball, Ball(A)Ball(A) around any set AA of variables (of size at most dd) such that the correlations with any other set BB of size at most dd are “captured” by the intersection of Ball(A)Ball(A) and BB. This needs a more careful argument. In particular, the correlations (even in the low girth case) are actually not necessarily captured by the intersection of Ball(A)Ball(A) with BB, but rather with some set BB^{\prime} that is related, but not identical to BB. However, the crucial property that we require is that the set Bin=Ball(A)BB_{in}=Ball(A)\cap B^{\prime} satisfies (1) if BB came before AA in the ordering, then so will BinB_{in} and (2) Bin+BBall(A)B|B_{in}|+|B\setminus Ball(A)|\leq|B|. This second property is more complicated to prove in the case where B|B| can be much larger than the girth bound, but turns out to hold there as well. The bottom line is that with additional care however, the high level picture provided by this overview can indeed be implemented and we give a full analysis based on the local Gram-Schmidt like process in Section 6.

Preliminaries

We collect some standard definitions and notation here. A (k,n)(k,n)-instance is a kk-uniform hypergraph I={C1,,Cm}\mathcal{I}=\{C_{1},\ldots,C_{m}\} over [n][n] so that every hyperedge (also known as a clause) C=(i1,,ik)IC=(i_{1},\ldots,i_{k})\in\mathcal{I} is labeled by a string σ=σC{±1}k\sigma=\sigma^{C}\in\{\pm 1\}^{k}. We identify a clause CC with the function that maps x{±1}nx\in\{\pm 1\}^{n} to y1,,yky_{1},\ldots,y_{k} where yj=σjxijy_{j}=\sigma_{j}x_{i_{j}}. We will sometimes also consider CC as a tuple of the literals (σi1xi1,,σikxik)(\sigma_{i_{1}}x_{i_{1}},\ldots,\sigma_{i_{k}}x_{i_{k}}). We write V(C)V(C) for the variables involved in (or covered by) a clause CC and similarly for V[n]V\subseteq[n] we write C(V)\mathcal{C}(V) for the set of all clauses CC such that V(C)VV(C)\subseteq V. For any x{1,1}nx\in\{-1,1\}^{n}, we write xAx_{A} to denote the tuple of coordinates in the subset A[n]A\subseteq[n]. If x{1,1}Ax\in\{-1,1\}^{A} and y{1,1}By\in\{-1,1\}^{B} for disjoint sets AA and BB, we will write xyx\circ y for the string in {1,1}AB\{-1,1\}^{A\cup B} that projects to xx for coordinates in AA and to yy for coordinates in BB.

Unless explicitly mentioned, the base of all logarithms appearing in the paper is assumed to be 22. We consider the arity of our tuples kk to be a constant and so OO notation may hide the dependence on kk.

We now define some standard ideas in the context of hypergraphs.

The degree of GG is the maximum number of hyperedges that intersect with any given hyperedge in GG. The length of the shortest cycle in GG is said to be the girth of GG. For any vertices u,vu,v of a hypergraph GG, we define the distance, dist(u,v)\mathsf{dist}(u,v) of u,vu,v in GG as the minimum number of hyperedges in any path that joins uu and vv in GG. For S,TS,T, subsets of vertices, we define dist(S,T)=defminsS,tTdist(s,t).\mathsf{dist}(S,T)\stackrel{{\scriptstyle\textrm{def}}}{{=}}\min_{s\in S,t\in T}\mathsf{dist}(s,t).

Next, we define the notion of expansion in a kk-uniform hypergraph GG:

A kk-uniform constraint hypergraph GG is said to be (r,β)(r,\beta)-expanding if any collection C\mathcal{C} of at most rr hyperedges of GG cover at least (k1β)C(k-1-\beta)|\mathcal{C}| vertices of GG, i.e. {vCCvC}(k1β)C|\{v\mid\exists C\in\mathcal{C}\text{, }v\in C\}|\geq(k-1-\beta)|\mathcal{C}|. We call β\beta, the coefficient of expansion of GG.

Let I\mathcal{I} be a (k,n)(k,n) instance. We now describe the properties of the (k,n)(k,n) instances that we need and give a construction for them in Section B of the Appendix by taking a random instance and removing a few clauses. Specifically, we show the existence of nice instances, the ones that satisfy the properties described in the lemma below:

Fix 1>ϵ,δ01>\epsilon,\delta\geq 0 and γekk2\gamma\geq e^{k}k^{2}. Then, there exists a kk-uniform constraint hypergraph GG with γn\gamma n edges such that for η=(1/γ2)2/δ\eta=(1/\gamma^{2})^{2/\delta}, 1/τ=4log2(γk2)1/\tau=4\log_{2}(\gamma k^{2}), GG:

has girth \frakfamilygτlog(n)\text{\frakfamily g}\geq\tau\log{(n)}

We will use this lemma with any given ϵ\epsilon (the soundness slack), δ=1200\delta=\frac{1}{200} and γ=ekk2/ϵ2\gamma=e^{k}k^{2}/\epsilon^{2}. We will call the instances that satisfy the conditions of the lemma above as nice.

For such instances, it is also easy to prove the soundness part (part (i)) of Theorem 1.2 (see Section B.1 of the Appendix) which we record in the following lemma.

For every ϵ>0\epsilon>0 and kk, if nn is sufficiently large then there exists a nice (k,n)(k,n)-instance I\mathcal{I} with the property that for every x{±1}nx\in\{\pm 1\}^{n}, the distribution {C(x)}CI\{C(x)\}_{C\in\mathcal{I}} is ϵ\epsilon-close in total variation distance to the uniform distribution on {±1}k\{\pm 1\}^{k}.

Closed sets, and the definition of the pseudo-expectation

We first define the concept of closed sets that is central to our argument.

For every R1R\geq 1, a set A[n]A\subseteq[n] is RR-closed if for every v,vAv,v^{\prime}\in A, any path of length at most RR between vv and vv^{\prime} is contained in AA. We say that AA is closed if it is 33-closed.

We define the RR-closure of AA, denoted by clR(A)\mathsf{cl}_{R}(A), to be the intersection of all sets BB such that ABA\subseteq B and BB is RR-closed. The closure of AA, denoted by cl(A)\mathsf{cl}(A), is the 33-closure of AA.

Readers familiar with the definition of closure (or advice set) in the work of or will find the definition of closure above slightly different. The main difference is that our definition allows us to have some nice properties such as uniqueness and that the intersection of two closed sets is closed, which are very helpful for our proof. We stress however that the actual pseudo-expectation is the same as that of those works.

Next, we give a constructive definition of closure of a set.

Given S[n]S\subseteq[n] and any R<min{\frakfamilyg/2,12β}R<\min\{\text{\frakfamily g}/2,\frac{1}{2\beta}\}, the RR-closure of SS can be obtained by the following procedure run on SS: Set A:=A:=\emptyset. For every v,vV(A)Sv,v^{\prime}\in V(A)\cup S such that there is a path of length at most RR between vv and vv^{\prime} in I\mathcal{I} not contained in AA, add every clause in the path to AA. Output V(A)SV(A)\cup S.

Observe that the procedure terminates as there are only finitely many clauses. Further, the output is closed by virtue of the termination of the procedure. By induction on the time at which a path is added in the procedure, it is easy to show that every closed set containing SS must contain the path. Thus, V(A)V(A) is a closed set containing AA and every clause CC such that V(C)V(A)V(C)\subseteq V(A) satisfies V(C)clR(S)V(C)\subseteq\mathsf{cl}_{R}(S). The lemma now follows by the minimality of clR(S)\mathsf{cl}_{R}(S). ∎

Next, we bound the size of clR(S)\mathsf{cl}_{R}(S).

For any R<min{\frakfamilyg/2,12β}R<\min\{\text{\frakfamily g}/2,\frac{1}{2\beta}\} and S[n]S\subseteq[n] such that Sηn10R|S|\leq\frac{\eta n}{10R}. Then, C(clR(S))2RS|\mathcal{C}(\mathsf{cl}_{R}(S))|\leq 2R|S| and clR(S)2RkS|\mathsf{cl}_{R}(S)|\leq 2Rk|S|.

Consider the procedure described in Lemma 4.3. Let SisoclR(S)S^{iso}\subseteq\mathsf{cl}_{R}(S) be the isolated vertices in clR(S)\mathsf{cl}_{R}(S). Observe that one cannot add any isolated vertices in the procedure and thus SisoSS^{iso}\subseteq S. Define S=SSisoS^{\prime}=S\setminus S^{iso}. Then, clR(S)=clR(S)Siso\mathsf{cl}_{R}(S)=\mathsf{cl}_{R}(S^{\prime})\cup S^{iso}.

If the process terminates before adding a total of q=S1Rβq=\frac{|S^{\prime}|}{\frac{1}{R}-\beta} clauses, then there’s nothing to prove, since SSηn10R|S^{\prime}|\leq|S|\leq\frac{\eta n}{10R} yields that qηn5q\leq\frac{\eta n}{5}. Thus, suppose, for the sake of a contradiction, that the procedure adds >q>q clauses and let ithi^{th} round of the procedure be the first round where the number of clauses added exceeds qq.

Let Ci\mathcal{C}_{i} be the set of clauses added in the procedure till the ithi^{th} round and let SiS^{\prime}_{i} be the set of variables obtained by taking the union of variables covered by the clauses added and SS^{\prime}. Further, suppose that the ithi^{th} round adds qiq_{i} clauses. Then, Ciq+qi<ηn|\mathcal{C}_{i}|\leq q+q_{i}<\eta n and thus, Ci\mathcal{C}_{i} must satisfy the expansion requirement: V(Ci)(q+qi)(k1β)|V(\mathcal{C}_{i})|\geq(q+q_{i})(k-1-\beta). On the other hand, any new path of length jRj\leq R added in a round adds at most jk(j1)2jk-(j-1)-2 new vertices. Thus, on an average, every one of the at most jj new clauses added in any round of the procedure contribute at most: k11/jk11/Rk-1-1/j\leq k-1-1/R new vertices. Thus, SiS+(q+qi)(k11/R)|S^{\prime}_{i}|\leq|S^{\prime}|+(q+q_{i})(k-1-1/R).

This yields that S(q+qi)(1/Rβ)>S|S^{\prime}|\geq(q+q_{i})\cdot(1/R-\beta)>|S^{\prime}| using that q=S1Rβq=\frac{|S^{\prime}|}{\frac{1}{R}-\beta}. This is a contradiction.

The size claimed in the lemma now follows by observing that 1Rβ12R\frac{1}{R}-\beta\geq\frac{1}{2R} and that every clause contributes at most kk new variables. ∎

The following lemma summarizes the simple properties of the closures defined here.

For any R<\frakfamilyg/2R<\text{\frakfamily g}/2, if AA and BB are RR-closed and then so is ABA\cap B.

If ABA\subseteq B then clR(A)clR(B)\mathsf{cl}_{R}(A)\subseteq\mathsf{cl}_{R}(B).

Every connected component of clR(A)\mathsf{cl}_{R}(A) of size 2\geq 2 intersects AA in at least two elements.

Let A=A1A2AmA=A_{1}\cup A_{2}\cup\ldots A_{m}. Then, cl(A)=cl(i=1mcl(Ai))\mathsf{cl}(A)=\mathsf{cl}(\cup_{i=1}^{m}\mathsf{cl}(A_{i})).

If there are two vertices v,vv,v^{\prime} in ABA\cap B such that dist(v,v)R\mathsf{dist}(v,v^{\prime})\leq R, then since both AA and BB are closed, both of them should contain the unique (since R<\frakfamilyg/2R<\text{\frakfamily g}/2) path between them.

By definition, clR(B)\mathsf{cl}_{R}(B) is an RR-closed set containing BAB\supseteq A and hence if clR(A)clR(B)\mathsf{cl}_{R}(A)\nsubseteq\mathsf{cl}_{R}(B) then clR(A)clR(B)\mathsf{cl}_{R}(A)\cap\mathsf{cl}_{R}(B) would be an even smaller RR-closed set that contains AA, contradicting the minimality of clR(A)\mathsf{cl}_{R}(A).

Suppose otherwise that there is some connected component SS of clR(A)\mathsf{cl}_{R}(A) with S2|S|\geq 2 intersecting AA with at most one element {x}\{x\}, then we claim that B=(clR(A)S){x}B=(\mathsf{cl}_{R}(A)\setminus S)\cup\{x\} is an RR-closed set containing AA. Clearly, BAB\supseteq A. Now suppose for the sake of contradiction that there were two vertices vvv\neq v^{\prime} of distance at most RR in BB whose path is not in BB. Then since BclR(A)B\subseteq\mathsf{cl}_{R}(A) and clR(A)\mathsf{cl}_{R}(A) is RR-closed, the path between vv and vv^{\prime} must have had a vertex uS{x}u\in S\setminus\{x\}. But since one of vv or vv^{\prime} must be different than xx (say vv^{\prime}), we get by contradiction that vv^{\prime} was connected to SS in clR(A)\mathsf{cl}_{R}(A).

Let B=cl(i=1mcl(Ai))B=\mathsf{cl}(\cup_{i=1}^{m}\mathsf{cl}(A_{i})). Since cl(A)\mathsf{cl}(A) is closed and contains i=1mAi\cup_{i=1}^{m}A_{i}, Bcl(A)B\subseteq\mathsf{cl}(A). If Bcl(A)B\neq\mathsf{cl}(A), then, Bi=1mAiB\supseteq\cup_{i=1}^{m}A_{i} and is closed contradicting the minimality of cl(A)\mathsf{cl}(A). ∎

The definition and the proof of consistency of the local distribution we define were shown by Benabbas et. al. for the weaker notion of closures they used (in order to define linear round solutions in the Sherali Adams hierarchy). The argument for our notion of closure is similar but we include it here for the sake of completeness.

For every set S[n]S\subseteq[n], Sd|S|\leq d, let cl(S)\mathsf{cl}(S) be the closure of SS and suppose ISI_{S} is the set of isolated variables in cl(S)\mathsf{cl}(S). Define C(cl(S))\mathcal{C}(\mathsf{cl}(S)) be all clauses CC such that V(C)cl(S)V(C)\subseteq\mathsf{cl}(S). Then, we set:

where xCx_{C} the projection of xx on to the coordinates in V(C)V(C), and Zcl(S)=2kC(cl(S))cl(S)Z_{\mathsf{cl}(S)}=2^{k|\mathcal{C}(\mathsf{cl}(S))|-|\mathsf{cl}(S)|} (1)\geq 1). Observe that the above expression tells us that the marginal distribution of νcl(S)\nu_{cl(S)} over ISI_{S} is uniform. We extend the notation above and write νT\nu_{T} for the marginal of νcl(T)\nu_{\mathsf{cl}(T)} on variables in TT.

We now show that νcl(S)\nu_{\mathsf{cl}(S)} defined above is indeed a probability distribution over cl(S)\mathsf{cl}(S).

Let AA and BB be closed sets such that ABA\subseteq B and C(B)ηn|\mathcal{C}(B)|\leq\eta n. Then,

νA\nu_{A} is a valid probability distribution: x{1,1}AνA(x)=1\sum_{x\in\{-1,1\}^{A}}\nu_{A}(x)=1.

ν\nu is locally consistent: for every x{1,1}Sx\in\{-1,1\}^{S}, νA(x)=y{1,1}BAνB(xy)\nu_{A}(x)=\sum_{y\in\{-1,1\}^{B\setminus A}}\nu_{B}(x\circ y).

The following claim that we record as a lemma will be useful in the proof.

There exists an ordering C1,C2,,CrC_{1},C_{2},\ldots,C_{r} of clauses in CA,B\mathcal{C}_{A,B} and a partition of BAB\setminus A into sets F1V(C1),F2V(C2),,FrV(Cr)F_{1}\subseteq V(C_{1}),F_{2}\subseteq V(C_{2}),\ldots,F_{r}\subseteq V(C_{r}) such that for every jrj\leq r, Fjk2|F_{j}|\geq k-2 and Fj(i>jV(Ci))=F_{j}\cap\left(\cup_{i>j}V(C_{i})\right)=\emptyset.

We first complete the proof of Lemma 4.6 and then prove Lemma 4.7.

Let ZA=2A+kC(A)Z_{A}=2^{-|A|+k|\mathcal{C}(A)|} and ZB=2B+kC(B)Z_{B}=2^{-|B|+k|\mathcal{C}(B)|}. Let CA,B=C(B)C(A)\mathcal{C}_{A,B}=\mathcal{C}(B)\setminus\mathcal{C}(A). Using (1), we have:

To simplify notation, we will write μi\mu_{i} for μCi\mu_{C_{i}} and xix^{i} for xV(Ci)x_{V(C_{i})} where x{1,1}nx\in\{-1,1\}^{n}. We have, using the ordering given by Lemma 4.7. Then,

Now, i=1rV(Cr)Fr=krBA\sum_{i=1}^{r}|V(C_{r})\setminus F_{r}|=kr-|B\setminus A|. Further, B+kC(B)kr+BA=A+kC(A)-|B|+k|\mathcal{C}(B)|-kr+|B\setminus A|=-|A|+k|\mathcal{C}(A)|. Thus, ZB2i=1rV(Cr)Fr=ZAZ_{B}\cdot 2^{-\sum_{i=1}^{r}|V(C_{r})\setminus F_{r}|}=Z_{A} completing the proof. ∎

For every CCA,BC\in\mathcal{C}_{A,B} define Γ(C)={vV(C)CCCA,BvV(C)}\Gamma(C)=\{v\in V(C)\mid\forall C^{\prime}\neq C\in\mathcal{C}_{A,B}\text{, }v\notin V(C^{\prime})\}. For any collection C\mathcal{C} of clauses in CA,B\mathcal{C}_{A,B}, let Δ(C)=CCΓ(C)\Delta(\mathcal{C})=|\cup_{C\in\mathcal{C}}\Gamma(C)|. Similarly, define ΓA(C)=Γ(C)A\Gamma_{A}(C)=\Gamma(C)\setminus A and ΔA(C)=CCΓA(C)\Delta_{A}(\mathcal{C})=|\cup_{C\in\mathcal{C}}\Gamma_{A}(C)|. We make the following claim:

For any CCA,B\mathcal{C}\subseteq\mathcal{C}_{A,B}, ΔA(C)(k5/22β)C.\Delta_{A}(\mathcal{C})\geq(k-5/2-2\beta)|\mathcal{C}|.

We first complete the proof of the lemma using the claim. Since ΔA(CA,B)(k5/22β)CA,B\Delta_{A}(\mathcal{C}_{A,B})\geq(k-5/2-2\beta)|\mathcal{C}_{A,B}| and β<1/10\beta<1/10, there exists a clause CC such that ΓA(C)k2|\Gamma_{A}(C)|\geq k-2. Now V(C)AΓA(C)V(C)\setminus A\supseteq\Gamma_{A}(C) and thus V(C)Ak2|V(C)\setminus A|\geq k-2. We place this clause at the beginning of the ordering, call it C1C_{1} and set F1=V(C)AF_{1}=V(C)\setminus A. We now iterate with CA,B{C}\mathcal{C}_{A,B}\setminus\{C\} to complete the construction, obtain a clause C2CA,BC1C_{2}\in\mathcal{C}_{A,B}\setminus C_{1} such that ΓA(C2)k2|\Gamma_{A}(C_{2})|\geq k-2. Since ΓA(C1)\Gamma_{A}(C_{1}) cannot intersect ΓA(C2)\Gamma_{A}(C_{2}), we can now set F2=V(C2)V(C1)F_{2}=V(C_{2})\setminus V(C_{1}). Continuing this way yields the required ordering and partition of BAB\setminus A. ∎

Fix any C\mathcal{C} and consider any (maximally) connected subgraph with edges CC\mathcal{C}^{\prime}\subseteq\mathcal{C}. If C\mathcal{C}^{\prime} consists of a single clause CC, then V(C)A1|V(C)\cap A|\leq 1 (since AA is closed) and V(C)V(C)=V(C)\cap V(C^{\prime})=\emptyset for any CCCC^{\prime}\neq C\in\mathcal{C}. Thus, ΓA(C)k1\Gamma_{A}(\mathcal{C}^{\prime})\geq k-1.

Now suppose C\mathcal{C}^{\prime} consists of at least 22 clauses. We first claim that Δ(C)(k22β)C\Delta(\mathcal{C}^{\prime})\geq(k-2-2\beta)|\mathcal{C}^{\prime}|. To see this, observe that C\mathcal{C}^{\prime} is a collection of at most ηn\eta n clauses in I\mathcal{I} and thus, V(C)(k1β)C|V(\mathcal{C}^{\prime})|\geq(k-1-\beta)|\mathcal{C}|. Further, every vV(C)CCΓ(C)v\in V(\mathcal{C}^{\prime})\setminus\cup_{C\in\mathcal{C}^{\prime}}\Gamma(C) belongs to at least two different clauses in C\mathcal{C}^{\prime} and thus, (k1β)CV(C)Δ(C)+(kCΔ(C))/2(k-1-\beta)|\mathcal{C}^{\prime}|\leq|V(\mathcal{C}^{\prime})|\leq\Delta(\mathcal{C}^{\prime})+(k|\mathcal{C}^{\prime}|-\Delta(\mathcal{C}^{\prime}))/2. Rearranging gives Δ(C)(k22β)C\Delta(\mathcal{C}^{\prime})\geq(k-2-2\beta)|\mathcal{C}^{\prime}|.

Next, we claim that that for every vV(C)Av\in V(\mathcal{C}^{\prime})\cap A there exists a pair of clauses C,CC,C^{\prime} such that V(CC)A={v}V\left(C\cup C^{\prime}\right)\cap A=\{v\}. Consider any clause CCC\in\mathcal{C} such that V(C)A={v}V(C)\cap A=\{v\}. If there is another clause CC^{\prime} such that V(C)A={v}V(C^{\prime})\cap A=\{v\}, then observe that V(C)V(C^{\prime}) cannot intersect AA in any other element (since AA is closed) and thus we can let C,CC,C^{\prime} be the pair as above, corresponding to vv. Otherwise, there exists a clause CC^{\prime} such that CCC^{\prime}\in\mathcal{C} such that V(C)V(C)V(C^{\prime})\cap V(C)\neq\emptyset (since V(C)V(\mathcal{C}^{\prime}) is connected) and V(C)A=V(C^{\prime})\cap A=\emptyset (as otherwise there would be a path between two distinct vertices of AA, of length at most 22 outside of AA). Further, observe that all such pairs are disjoint. This is because if some pairs intersect, then they induce a path of length at most 33 between two distinct vertices of AA that is not contained in AA (violating the 33 closedness of AA). Thus, V(C)AC/2|V(\mathcal{C}^{\prime})\cap A|\leq|\mathcal{C}^{\prime}|/2. Thus, we must have: ΔA(C)Δ(C)C/2(k22β)CC/2=(k5/22β)C\Delta_{A}(\mathcal{C}^{\prime})\geq\Delta(\mathcal{C}^{\prime})-|\mathcal{C}^{\prime}|/2\geq(k-2-2\beta)|\mathcal{C}^{\prime}|-|\mathcal{C}^{\prime}|/2=(k-5/2-2\beta)|\mathcal{C}^{\prime}|.

Since for every connected component C\mathcal{C}^{\prime} inside C\mathcal{C} we have that ΔA(C)(k5/22β)C,\Delta_{A}(\mathcal{C}^{\prime})\geq(k-5/2-2\beta)|\mathcal{C}^{\prime}|, we must have ΔA(C)(k5/22β)C\Delta_{A}(\mathcal{C})\geq(k-5/2-2\beta)|\mathcal{C}| as promised. This completes the proof of claim. ∎

Suppose AA and BB are closed disjoint sets such that ABA\cup B is closed. Then, νAB(x)=νA(xA)νB(xB)\nu_{A\cup B}(x)=\nu_{A}(x_{A})\cdot\nu_{B}(x_{B}).

We now define the pseudo-expectation operator associated with the local distributions {νT}Ts\{\nu_{T}\}_{|T|\leq s}:

Let I\mathcal{I} be a nice (k,n)(k,n) instance and μ\mu a pairwise independent distribution over {±1}k\{\pm 1\}^{k}. Then the family of local distributions {νX}X[n],X<d\{\nu_{X}\}_{X\subseteq[n],|X|<d} for s=ηn/6s=\eta n/6 satisfies:

Completeness: For every clause CC of I\mathcal{I}, νV(C)=μ\nu_{V(C)}=\mu.

Consistency: for every ST[n]S\subseteq T\subseteq[n], Td|T|\leq d, the marginal of νT\nu_{T} on to SS is νS\nu_{S}.

The completeness property follows from (1) and C(V(C))={C}\mathcal{C}(V(C))=\{C\}. The consistency property follows from Lemma 4.6. ∎

Local Distribution on Unions

In this section we make an important step towards showing the positivity property of our pseudo-distribution by showing that if two sets AA and BB are sufficiently closed, then the local distribution on ABA\cup B is only determined by the clauses that are contained in AA or in BB. In particular, this implies that if AA and BB are disjoint then the distribution on AA is independent of the distribution of BB. The main result of this section is the following expression for the local distribution on the union of AA and BB where AA is RR-closed for a sufficiently large constant RR and BB is closed.

Suppose AA is RR-closed for R100R\geq 100 and BB is closed. Then, for any x{1,1}ABx\in\{-1,1\}^{A\cup B},

where ZA,B=2kC(AB)ABZ_{A,B}=2^{k|\mathcal{C}(A\cup B)|-|A\cup B|}.

We make two convenient definitions before proceeding, see Figure 3:

For any two closed sets AA and BB, a path PP of length at most 33 is said to be a bridge path for the pair A,BA,B if PA=PB=1|P\cap A|=|P\cap B|=1.

For any two closed sets AA and BB, a path PP of length at most 33 is said to be a bridge-closure path for the pair A,BA,B, if there exists a bridge path PP^{\prime} such that PP=1|P^{\prime}\cap P|=1 and PB=1|P\cap B|=1 but CA=C\cap A=\emptyset.

As a final remark, observe that the example from Figure 1 shows that AA and BB being 22-closed is not enough to guarantee the statement of the lemma. While we believe that at least one of the sets out of AA and BB should be RR-closed for some R>3R>3 for the lemma to hold, currently, we do not have any example of a counter example demonstrating this point. We now proceed with the actual proof.

Let D=cl(AB)D=\mathsf{cl}(A\cup B). Let CA,B\mathcal{C}_{A,B} and CB\mathcal{C}_{B} be the set of bridge paths and bridge closure paths of BB for the pair A,BA,B, respectively. Observe that V(CA,B)V(CB)DV(\mathcal{C}_{A,B})\cup V(\mathcal{C}_{B})\subseteq D. We now show that these are the only extra clauses in DD:

The first observation describes how bridge paths and bridge-closure paths intersect.

For any distinct P1,P2CA,BP_{1},P_{2}\in\mathcal{C}_{A,B}, P1P2ABP_{1}\cap P_{2}\subseteq A\cup B.

For any distinct P1,P2CBP_{1},P_{2}\in\mathcal{C}_{B}, P1P2V(P)BP_{1}\cap P_{2}\subseteq V(P)\cup B where PP is a bridge path.

For any PCBP\in\mathcal{C}_{B} and PCA,BP^{\prime}\in\mathcal{C}_{A,B}, V(P)V(P)1|V(P)\cap V(P^{\prime})|\leq 1.

Suppose P1,P2CBP_{1},P_{2}\in\mathcal{C}_{B} are such that P1PP_{1}\cap P\neq\emptyset and P2PP_{2}\cap P^{\prime}\neq\emptyset for some bridge paths PPP\neq P^{\prime}. Then, P1P2=P_{1}\cap P_{2}=\emptyset.

If the claim weren’t true, then there must be a path of length 6\leq 6 between two vertices of AA which violating that AA is RR-closed.

Suppose first that there is a bridge path PP such that PP1P\cap P_{1}\neq\emptyset and PP2P\cap P_{2}\neq\emptyset. If either of P1P_{1} or P2P_{2} intersect PP in more than one element, then there is a cycle of length at most 66 in GG which violates the fact that GG has Ω(1)\Omega(1) girth. If P1P_{1} and P2P_{2} intersect in an element not contained in V(P)V(P), then, again there is a cycle of length at most 99 in GG violating the high girth of GG. Similarly, if P1,P2P_{1},P_{2} intersect inside BB, then, they cannot intersect outside of BB and further, they cannot both intersect the same bridge path as it would yield a cycle of length at most 99 in GG. Thus in both the cases, P1P2V(P)BP_{1}\cap P_{2}\subseteq V(P)\cup B for some bridge path PP.

Otherwise there is a cycle of length at most 66 in GG violating that GG has girth ω(1)\omega(1).

If not, then if PPA=1|P\cap P^{\prime}\cap A|=1 then there is a cycle of length 1212 in the graph, contradicting our assumption on the girth. Otherwise PPA=2|P\cap P^{\prime}\cap A|=2 which means that there is a path of length at most 1212 between two distinct vertices of AA.

The next observation shows that there is no path of length at most 33 that connects two bridge paths, two bridge-closure paths or two bridge-bridge-closure paths that are not contained in ABA\cup B.

There is no path PP of length at most 33 not contained in AA that connects a bridge path PP^{\prime} and AA.

There is no path of length at most 33 not contained in AA that connects PCA,BP\in\mathcal{C}_{A,B} with PCBP^{\prime}\in\mathcal{C}_{B}.

There is no path of length at most 33 connecting distinct P,PCBP,P^{\prime}\in\mathcal{C}_{B}.

Otherwise there is a path of length at most 66 between two vertices of AA not contained in AA, violating the fact that AA is RR closed.

Otherwise there is a path of length at most 1212 between two vertices of AA, violating that AA is RR closed.

Otherwise there is a path of length at most 1818 not contained in AA, connecting two vertices of AA.

The following claim is now a consequence of the claims above:

For any CC such that V(C)ABV(C)\nsubseteq A\cup B but V(C)DV(C)\subseteq D, CCA,BCBC\in\mathcal{C}_{A,B}\cup\mathcal{C}_{B}.

Consider the iterative procedure of building the closure of ABA\cup B by adding paths one by one in some order. Let PP be the first path in this order that violates the claim. Then, either PP intersects two bridge paths or a bridge path and AA or a bridge path and a bridge-closure path or two bridge-closure paths. Each of these situations is explicitly barred by the claims above. This completes the argument. ∎

Let Z=2kC(D)D=2kC(AB)+kCA,BDZ=2^{k|\mathcal{C}(D)|-|D|}=2^{k|\mathcal{C}(A\cup B)|+k|\mathcal{C}_{A,B}|-|D|}. Observe that Z22CA,B=ZA,BZ\cdot 2^{-2|\mathcal{C}_{A,B}|}=Z_{A,B}. For every clause CCA,BCBC\in\mathcal{C}_{A,B}\cup\mathcal{C}_{B}, let VC=V(C)(AB)V_{C}^{\prime}=V(C)\setminus(A\cup B) and VC=V(C)(AB)V_{C}^{\prime\prime}=V(C)\cap(A\cap B). Similarly, let D=D(AB)D^{\prime}=D\setminus(A\cup B) and D=D(AB)D^{\prime\prime}=D\cap(A\cup B). Next, we claim:

Let D=V1V2D^{\prime}=V_{1}\cup V_{2} such that V1V2=V_{1}\cap V_{2}=\emptyset defined by V1=DV(CA,B))V_{1}=D^{\prime}\setminus V(\mathcal{C}_{A,B})) and V2=DV1V_{2}=D^{\prime}\setminus V_{1}.

In this section, we prove our main result. Our proof will follow easily from the following lemma which is the main result of this section.

The rest of this section is devoted to proving Lemma 6.1.

Our aim is to build an order on the ([n]d){[n]\choose{\leq d}}, in which to process them for our local orthogonalization procedure. We start with an arbitrary ordering on the clauses of I\mathcal{I}, e.g. for every CIC\in\mathcal{I} we define a unique index ζ(C)[m]\zeta(C)\in[m]. We say that ABA\prec B if:

C(cl(A))\mathcal{C}(\mathsf{cl}(A)) is smaller than C(cl(B))\mathcal{C}(\mathsf{cl}(B)) in lexicographic order of ζ\zeta. That is, ABA\prec B if the maximum index ζ(C)\zeta(C) for Ccl(A)C\in\mathsf{cl}(A) is smaller than this maximum for cl(B)\mathsf{cl}(B), and if they are equal we break ties by the second largest index and so on. We define π(cl(A))\pi(\mathsf{cl}(A)) to be the index of cl(A)\mathsf{cl}(A) according to this ordering. (Note that π\pi is a permutation on distinct closures, and so if cl(A)cl(B)\mathsf{cl}(A)\neq\mathsf{cl}(B) then π(cl(A))π(cl(B))\pi(\mathsf{cl}(A))\neq\pi(\mathsf{cl}(B)).)

If C(cl(A))=C(cl(B))\mathcal{C}(\mathsf{cl}(A))=\mathcal{C}(\mathsf{cl}(B)) then we say that ABA\prec B if A<B|A|<|B|.

If C(cl(A))=C(cl(B))\mathcal{C}(\mathsf{cl}(A))=\mathcal{C}(\mathsf{cl}(B)) and A=B|A|=|B| then we break ties arbitrarily.

For i=0,,Mi=0,\ldots,M, we let AiA_{i} denote the ithi^{th} set in this ordering. Note that A0=A_{0}=\emptyset and A1,,AnA_{1},\ldots,A_{n} are the singleton elements {1},,{n}\{1\},\ldots,\{n\} (in some arbitrary order). We will write χi\chi_{i} for χAi\chi_{A_{i}} in the following to reduce clutter.

2 Local Orthogonalization

Set R=100R=100. Define the ithi^{th} local correlated space as

Invoking Lemma 4.12, it suffices to show that clR(Ai)<s=ηn/6|\mathsf{cl}_{R}(A_{i})|<s=\eta n/6. This follows by noting that Aid|A_{i}|\leq d and clR(Ai)2Rkd=200ηn/10000=ηn/500s|\mathsf{cl}_{R}(A_{i})|\leq 2Rkd=200\eta n/10000=\eta n/500\leq s (Lemma 4.4). ∎

The following simple lemma would be very useful.

Now suppose for the sake of contradiction that

for some δ>0\delta>0. (If the expectation is negative then we can take g-g.) Let f=χˉiϵgf=\bar{\chi}_{i}-\epsilon g. We have:

and so if ϵ\epsilon is sufficiently small then

contradicting our choice of χˉi\bar{\chi}_{i}. ∎

3 Global Orthogonality lemma

We will need the following observation for the proof which we record before proceeding:

Suppose HH is a connected kk-uniform hypergraph such that there exist a subset of vertices, UU, U2|U|\geq 2 satisfying: dist(u,v)>R\mathsf{dist}(u,v)>R for every distinct u,vUu,v\in U. Then, HH must have at least UR2\frac{|U|R}{2} hyperedges.

Observe that the collection of balls of radius R/2R/2 around any vertex in uUu\in U are all disjoint and contain at least one path (due to connectedness of HH). ∎

Fix any j<ij<i and let A=AiA=A_{i} and B=AjB=A_{j}. Let

For every xBbdyx\in B_{bdy} we call any associated clause WW as in the definition above as a boundary clause. Let Bout=BclR(Ai)B_{out}=B\setminus\mathsf{cl}_{R}(A_{i}) and Bin=B(BoutBbdy)B_{in}=B\setminus(B_{out}\cup B_{bdy}) and Brest=B(BoutBin)B_{rest}=B\setminus(B_{out}\cup B_{in}). Note that BbdyB_{bdy} is not necessarily a subset of BB. Next, we make two useful claims:

We will show that BbdyBout|B_{bdy}|\leq|B_{out}|. This immediately yields the claim by observing that dB=Bin+Bout+BrestBin+Bbdyd\geq|B|=|B_{in}|+|B_{out}|+|B_{rest}|\geq|B_{in}|+|B_{bdy}|. We note that the proof of this claim is significantly simpler in the case that B<R/2|B|<R/2. Proving it in the case when RR is a constant and B=Ω(n)|B|=\Omega(n) is one of the main technical ingredients in getting the proof sketched in the overview to work for Ω(n)\Omega(n) rounds of the SOS hierarchy.

Let Q[n]Q\subseteq[n] be a (maximally) connected component in the subgraph defined by the hyperedges C(cl(B))C(clR(A))\mathcal{C}(\mathsf{cl}(B))\setminus\mathcal{C}(\mathsf{cl}_{R}(A)). Let Qbdy=BbdyQQ_{bdy}=B_{bdy}\cap Q and Qout=BoutQQ_{out}=B_{out}\cap Q. BbdyB_{bdy} is thus partitioned into QbdyQ_{bdy} for every possible maximally connected subgraphs QQ. It is thus enough to prove that QbdyQout|Q_{bdy}|\leq|Q_{out}| for any fixed QQ.

Observe that QclR(A)=QbdyQ\cap\mathsf{cl}_{R}(A)=Q_{bdy}. If QclR(A)=Q\cap\mathsf{cl}_{R}(A)=\emptyset, then, there is nothing to prove. If Qbdy={v}Q_{bdy}=\{v\}, then, QQ contains V(Wv)V(W_{v}) where WvW_{v} is a boundary clause associated with vv. If QQ contains no vertex of BoutB_{out}, then, observe that cl(B)(Q{v})\mathsf{cl}(B)\setminus(Q\setminus\{v\}) is a closed set containing BB contradicting the minimality of cl(B)\mathsf{cl}(B). Thus, in this case, QbdyQout|Q_{bdy}|\leq|Q_{out}|.

Now suppose for Qbdy2|Q_{bdy}|\geq 2. Then, vertices in QbdyQ_{bdy} are connected through clauses in QQ. On the other hand, since AA is RR-closed, for any u,vQbdyu,v\in Q_{bdy}, any path that uses clauses from QQ between u,vu,v must be of length at least R+1R+1. Applying Lemma 6.6, we observe that C(Q)QbdyR/2|\mathcal{C}(Q)|\geq|Q_{bdy}|R/2.

Next, we claim that Qcl(QbdyQout)Q\subseteq\mathsf{cl}(Q_{bdy}\cup Q_{out}). It is easy to complete the proof once we have this claim: observe that

Rearranging yields that QoutQbdyR/266|Q_{out}|\geq|Q_{bdy}|\cdot\frac{R/2-6}{6}. Using R24R\geq 24 yields that QoutQbdy|Q_{out}|\geq|Q_{bdy}|.

We now proceed to show that Qcl(QbdyQout)Q\subseteq\mathsf{cl}(Q_{bdy}\cup Q_{out}). By Lemma 4.5 (4), cl(B)=cl(BinBbdyBout)\mathsf{cl}(B)=\mathsf{cl}(B_{in}\cup B_{bdy}\cup B_{out}). Let B=BinBbdyBout(QbdyQout)B^{\prime}=B_{in}\cup B_{bdy}\cup B_{out}\setminus(Q_{bdy}\cup Q_{out}). Then, by another application of Lemma 4.5 (4), cl(B)=cl(cl(B)cl(QbdyQout))\mathsf{cl}(B)=\mathsf{cl}(\mathsf{cl}(B^{\prime})\cup\mathsf{cl}(Q_{bdy}\cup Q_{out})). In other words, one can build the closure of BB by first building the closure of BB^{\prime} and QbdyQoutQ_{bdy}\cup Q_{out} (Step 11) and then taking the closure of the unions of the obtained sets (Step 22). Clearly, the final output contains every clause in C(Q)\mathcal{C}(Q). If we show that (1) C(cl(B))C(Q)=\mathcal{C}(\mathsf{cl}(B^{\prime}))\cap\mathcal{C}(Q)=\emptyset and that (2) no clause from C(Q)\mathcal{C}(Q) is added in the step 22, then every clause in C(Q)\mathcal{C}(Q) must be added in the procedure to build cl(QbdyQout)\mathsf{cl}(Q_{bdy}\cup Q_{out}) and thus we are done. We now proceed to show the two statements above.

(1): First observe that cl(B)\mathsf{cl}(B^{\prime}) itself can be built by building the closure of BinB_{in} (and cl(Bin)clR(A)C(cl(Bin))C(Q)=\mathsf{cl}(B_{in})\subseteq\mathsf{cl}_{R}(A)\Rightarrow\mathcal{C}(\mathsf{cl}(B_{in}))\cap\mathcal{C}(Q)=\emptyset), the closure of BoutBbdy(QbdyQout)B_{out}\cup B_{bdy}\setminus(Q_{bdy}\cup Q_{out}) (that cannot intersect any clause from C(Q)\mathcal{C}(Q) as then QQ must include a vertex from BoutBbdy(QbdyQout)B_{out}\cup B_{bdy}\setminus(Q_{bdy}\cup Q_{out}), a contradiction) and finally taking the closure of their union. This last step cannot add a clause in QQ: every path PP added connects cl(Bin)\mathsf{cl}(B_{in}) and cl(BoutBbdy(QbdyQout))\mathsf{cl}(B_{out}\cup B_{bdy}\setminus(Q_{bdy}\cup Q_{out})). If PP is contained in clR(A)\mathsf{cl}_{R}(A), then, there is nothing to prove. Otherwise PP must pass (exactly once) through a boundary vertex. If PP contains a clause from C(Q)\mathcal{C}(Q), then, if PP passes through a boundary vertex not in QbdyQ_{bdy}, then this enlarges QQ violating that QQ is a maximally connected component. If, on the other hand, PP passes through a boundary vertex in QbdyQ_{bdy}, then, PP connects BoutQB_{out}\setminus Q with QQ violating the maximality of QQ. Thus, C(cl(B))\mathcal{C}(\mathsf{cl}(B^{\prime})) cannot include any clause from C(Q)\mathcal{C}(Q).

(2): Consider the step 22 of the procedure to build cl(B)\mathsf{cl}(B). In this step, we add paths (of length at most 33) that connect cl(B)\mathsf{cl}(B^{\prime}) and cl(QbdyQout)\mathsf{cl}(Q_{bdy}\cup Q_{out}). For any such path PP, if PP includes some clause CC from C(Q)\mathcal{C}(Q) then it crosses out of clR(A)\mathsf{cl}_{R}(A) (exactly once) and thus must pass through a boundary vertex. By maximality of QQ, we must have that PBbdyQbdyP\cap B_{bdy}\in Q_{bdy} and PC(clR(A))C(Q)P\setminus\mathcal{C}(\mathsf{cl}_{R}(A))\subseteq\mathcal{C}(Q). On the other hand, the part of PP that connects some vertex in QbdyQ_{bdy} to cl(QbdyQout)\mathsf{cl}(Q_{bdy}\cup Q_{out}) is of length at most 33 and thus must be contained in cl(QbdyQout)\mathsf{cl}(Q_{bdy}\cup Q_{out}). Thus every edge in PC(clR(A))P\setminus\mathcal{C}(\mathsf{cl}_{R}(A)) is present in C(cl(QbdyQout)\mathcal{C}(\mathsf{cl}(Q_{bdy}\cup Q_{out}) and thus CC(Q)C\in\mathcal{C}(Q).

Suppose BoutB_{out}\neq\emptyset. Then, for every SBinBbdyS\subseteq B_{in}\cup B_{bdy}, SAS\prec A.

Since BoutB_{out}\neq\emptyset, cl(B)cl(A)\mathsf{cl}(B)\neq\mathsf{cl}(A). Thus, π(cl(B))<π(cl(A))\pi(\mathsf{cl}(B))<\pi(\mathsf{cl}(A)). Now, BinBbdyd|B_{in}\cup B_{bdy}|\leq d from Claim 6.7. Thus, every subset SBinBbdyS\subseteq B_{in}\cup B_{bdy} has a well-defined ordering w.r.t ([n]d){[n]\choose{\leq d}}. Further, for every such SS, cl(S)cl(B)\mathsf{cl}(S)\subseteq\mathsf{cl}(B) (Lemma 4.5) and thus, π(cl(S))π(cl(B))\pi(\mathsf{cl}(S))\leq\pi(\mathsf{cl}(B)). Hence, SAS\prec A. ∎

Now, χB=χBinχBrestχBout\chi_{B}=\chi_{B_{in}}\chi_{B_{rest}}\chi_{B_{out}} and we can write

Consider an arbitrary assignment zz to BAB\setminus A and γ{±1}Bbdy\gamma\in\{\pm 1\}^{|B_{bdy}|} to xBbdyx_{B_{bdy}}. Let \mathds1Bbdy=γ\mathds{1}_{B_{bdy}=\gamma} be the function that on input x{±1}nx\in\{\pm 1\}^{n} outputs 11 if xBbdy=γx_{B_{bdy}}=\gamma and zero otherwise.

Lemma 5.1 gives the expression for the local distribution on clR(A)cl(B)\mathsf{cl}_{R}(A)\cup\mathsf{cl}(B). Using the expression, we have:

for ever SclR(A)S\subseteq\mathsf{cl}_{R}(A), Sd|S|\leq d.

Now BinBbdyBout|B_{in}\cup B_{bdy}|\leq|B_{out}| (Claim 6.7) and \mathds1Bbdy=γSpan{χSSBbdy}\mathds{1}_{B_{bdy}}=\gamma\in\mathsf{Span}\{\chi_{S}\mid S\subseteq B_{bdy}\}:

Each index set SS of the characters above is a subset of BB and thus SAiS\prec A_{i} (invoking Claim 6.8 along with the fact π(cl(B))<π(cl(A))\pi(\mathsf{cl}(B))<\pi(\mathsf{cl}(A)) ). Thus, χBinχBrest\mathds1Bbdy=γVi\chi_{B_{in}}\chi_{B_{rest}}\cdot\mathds{1}_{B_{bdy}=\gamma}\in V_{i}. Using Lemma 6.3, thus,

We can now complete the proof of Lemma 6.1.

Acknowledgements

Thanks to Ryan O’Donnell, Li-Yang Tan, and David Steurer for fruitful discussions and the anonymous reviewers for their valuable comments and suggestions on a previous version of this paper.

References

Appendix A Random sparse predicates

Consider a random sparse predicate PP on kk variables and accepting P1(1)=t|P^{-1}(1)|=t assignments. If t=exp(o(k))t=\exp(o(k)), we now show that PP does not support a pairwise independent subgroup with high probability, as kk tends to infinity. Here the randomness corresponds to choosing P1(1)P^{-1}(1) to be a tt-sized subset of {0,1}k\{0,1\}^{k} uniformly at random.

Under the condition of the observation, P1(1)P^{-1}(1) does not contain any pairwise independent subgroup, because any such a subgroup contains an affine subspace of dimension 22.

Let v1,,vtP1(1)v_{1},\dots,v_{t}\in P^{-1}(1) be an enumeration of vectors in P1(1)P^{-1}(1). Note that if P1(1)P^{-1}(1) contains a subspace of dimension 2, then there are 1a<b<ct1\leq a<b<c\leq t such that this subspace is exactly the affine span of va,vb,vcv_{a},v_{b},v_{c}.

For a fixed choice of the triple (a,b,c)(a,b,c), conditioning on the event that va,vb,vcv_{a},v_{b},v_{c} span an affine subspace of dimension 22, the remaining vector from this affine subspace also belongs to P1(1)P^{-1}(1) with probability at most t/2kt/2^{k}. Taking a union bound over (a,b,c)(a,b,c) (at most t3t^{3} such choices), we see that P1(1)P^{-1}(1) contains an affine subspace with probability at most t4/2kt^{4}/2^{k}. ∎

Appendix B Constructing nice instances

In this section, we show the existence of nice instances of constraint hypergraphs and prove Theorem 3.4.

Fix 1>ϵ,δ01>\epsilon,\delta\geq 0 and γekk2\gamma\geq e^{k}k^{2}. Then, there exists a kk-uniform constraint hypergraph GG with γn\gamma n edges such that for η=(1/γ2)2/δ\eta=(1/\gamma^{2})^{2/\delta}, τ=4log2(γk2)\tau=4\log_{2}(\gamma k^{2}), GG:

has girth \frakfamilyglog(n)/τ\text{\frakfamily g}\geq\log{(n)}/\tau, and

We first choose a random graph GG by choosing every kk uniform hyperedge, independently, with probability p=4γk!/nk1p=4\gamma\cdot k!/n^{k-1}. Our final hypergraph will be obtained by removing hyperedges from GG.

For GG chosen as above, with probability at least 1/31/3,

has between 2γn2\gamma n and 6γn6\gamma n edges.

has at most n1/4log(n)n^{1/4}\log{(n)} cycles of length at most g and

We first show that the claim above is enough to complete the proof of the lemma. We define GG^{\prime} to be the hypergraph obtained by removing every cycle of length at most g.By the claim above, the total number of hyperedges removed in this process, for a large enough nn, is at most γn\gamma n. Observe that the last property in the statement of the theorem is immediately satisfied by GG^{\prime}. Further, since GG^{\prime} is obtained only by removing hyperedges from GG, GG^{\prime} still enjoys (ηn,δ)(\eta n,\delta)-expansion. Thus, GG^{\prime} is a constraint hypergraph that satisfies the requirements of the lemma. Finally, the total number of edges removed is sublinear in nn and thus GG^{\prime} has at least γn\gamma n edges for a large enough nn.

We now move on to complete the proof of the claim above:

The expected number of edges in GG is given by p(nk)=4γn(1k1n)k14γn(1(k1)2n)p\cdot{n\choose k}=4\gamma n(1-\frac{k-1}{n})^{k-1}\geq 4\gamma n(1-\frac{(k-1)^{2}}{n}). By an application of Chernoff bound, the probability that the number of edges does not lie in the interval [2γn,6γn][2\gamma n,6\gamma n] is at most 2eγn162e^{\frac{-\gamma n}{16}}.

Next, consider any collection of ss clauses and let us compute the probability that they cover at most cscs variables for some c=k1δc=k-1-\delta. This probability, is then upper bounded by

Using that (csk)(cs)k/k!{{cs}\choose k}\leq(cs)^{k}/k! and the approximation (xy)(xey)y{x\choose y}\leq\left(\frac{xe}{y}\right)^{y}, we can upper bound the above expression by:

Using that c=k1δc=k-1-\delta and that δ<1\delta<1 now yields an upper bound of

Thus, using that γ>ekk2\gamma>e^{k}k^{2} and that ss satisfies sn(1/γ2)2/δ\frac{s}{n}\leq(1/\gamma^{2})^{2/\delta} makes the above probability at most (1/γ2)s(1/\gamma^{2})^{s}.

By an application of Markov’s inequality, with probability at least 7/87/8 over the draw of hyperedges of GG, the number of cycles of length at most \frakfamilyg=14logγk2n\text{\frakfamily g}=\frac{1}{4}\log_{\gamma k^{2}}n are at most

By a union bound, now, all the three properties above can be ensured with probability at least 1/31/3.

In this section, we show that after fixing the underlying hyperedges GG of an instance, with high probability over the literals on constraints, all assignments are very close to a random assignment. Here closeness is measured with respect to the distribution {C(x)}\{C(x)\} as one chooses a uniformly random constraint among all hyperedges of the hypergraph.

Let GG be any hypergraph with mm hyperedges. Let I\mathcal{I} be an instance with the same underlying hypergraph as GG, and with the literals in all clauses be chosen uniformly at random. We have the following lemma.

Suppose m=Ω(2O(k)ϵ2n)m=\Omega(2^{O(k)}\epsilon^{-2}n). With high probability over the choice of literals, for any assignment x{±1}nx\in\{\pm 1\}^{n}, the distribution {C(x)}\{C(x)\} with CC chosen uniformly at random in I\mathcal{I} is within ϵ\epsilon statistical distance to the uniform distribution over {±1}k\{\pm 1\}^{k}.

Now suppose the signs of the literals from I\mathcal{I} for every constraint are chosen uniformly at random, keeping the underlying subhypergraph fixed. Then μI,x[y]\mu_{\mathcal{I},x}[y] is now a random variable depending on the randomness of the literals. For each ii, the indicator \mathds1Ci(x)=y\mathds{1}_{C_{i}(x)=y} equals 11 with probability 1/2k1/2^{k}, and equals with the remaining probability (over the randomness of the signs of the literals on the ii-th constraint), and the random variables \mathds1Ci(x)=y\mathds{1}_{C_{i}(x)=y} are independent of each other for different ii. Therefore μI,x[y]\mu_{\mathcal{I},x}[y] is the average of mm independent {0,1}\{0,1\}-indicator random variables, each being 11 with probability 1/2k1/2^{k}. By Chernoff–Hoeffding bound, we have μI,x[y]1/2k>η|\mu_{\mathcal{I},x}[y]-1/2^{k}|>\eta with probability at most 2exp(η2m/2k+1)2\exp(-\eta^{2}m/2^{k+1}). By a union bound over all assignments x{±1}nx\in\{\pm 1\}^{n}, the maximum deviation of μI,x[y]\mu_{\mathcal{I},x}[y] from 1/2k1/2^{k} (over all xx) exceeds η\eta with probability at most 2exp(η2m/2k+1+nlog2)2\exp(-\eta^{2}m/2^{k+1}+n\log 2). Letting η=ϵ/2k\eta=\epsilon/2^{k}, we see that

as long as m=Ω(2O(k)ϵ2n)m=\Omega(2^{O(k)}\epsilon^{-2}n).

Now the distribution {Ci(x)}\{C_{i}(x)\} for a random i[m]i\in[m] has statistical distance at least ϵ\epsilon implies that μI,x[y]1/2kϵ/2k|\mu_{\mathcal{I},x}[y]-1/2^{k}|\geq\epsilon/2^{k} for some yy. By a union bound over all y{±1}ky\in\{\pm 1\}^{k}, the distribution {Ci(x)}\{C_{i}(x)\} is close in statistical distance to the uniform distribution on {±1}k\{\pm 1\}^{k} except with probability exp(O(k)Ω(n))\exp(O(k)-\Omega(n)), assuming m=Ω(2O(k)ϵ2n)m=\Omega(2^{O(k)}\epsilon^{-2}n). ∎