Sum of Squares Lower Bounds from Pairwise Independence

Boaz Barak, Siu On Chan, Pravesh Kothari

Introduction

Constraint Satisfaction Problems (CSP) are among the most natural computational problems, and yet their computational complexity is not fully understood. In particular, several works have studied the notion of Approximation Resistance, which loosely speaking means that the best polynomial-time approximation algorithm is simply the one that outputs a random assignment. Under Khot’s Unique Games Conjecture (UGC) much is known about this property. In particular, Austrin and Mossel showed that if the UGC is true, then, for every predicate $P:\{0,1\}^{k}\rightarrow\{0,1\}$ , if there exists a pairwise independent distribution $\mu$ over $P^{-1}(1)$ (i.e., a distribution $\mu$ such that for every $i\neq j\in[k]$ , the marginal $\mu_{i,j}$ of $\mu$ on coordinates $i$ and $j$ is the uniform distribution over $\{0,1\}^{2}$ ), then $P$ is approximation resistant. Austrin and Håstad used this to establish (under the UGC) fairly tight bounds on the threshold at which a random predicate of a particular density becomes approximation resistant. However, there is no consensus on whether the UGC is true. Assuming only $\mathbf{P}\neq\mathbf{NP}$ , the best known bound is by Chan, who showed that a predicate is approximation resistant if it contains a distribution $\mu$ as above satisfying the additional condition that it is uniform over a subspace $V\subseteq GF(2)^{k}$ . This algebraic structure is a fairly strong condition. In particular, if we choose $P:\{0,1\}^{k}\rightarrow\{0,1\}$ to be a random predicate conditioned on $|P^{-1}(1)|=t$ (where $t\in\{1,\ldots,2^{k}\}$ is some parameter), then $P$ will satisfy the first condition (supporting a pairwise independent distribution) with high probability as long as $t>ck^{2}$ for some constant $c$ , while it will not satisfy the second condition even for $t$ as large as $\exp(k/5)$ (see Observation A.1).
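To make the pairwise independence condition concrete, the following minimal sketch checks it by brute force for a distribution given as an explicit table. The helper names `is_pairwise_independent` and `uniform_on`, and the two example supports, are our own illustrative choices and do not come from the works cited above.

```python
from fractions import Fraction
from itertools import combinations, product


def is_pairwise_independent(mu, k):
    """Return True if every pair of coordinates of mu is uniform on {0,1}^2.

    mu maps k-bit tuples to probabilities (Fractions) that sum to 1.
    """
    for i, j in combinations(range(k), 2):
        for a, b in product((0, 1), repeat=2):
            mass = sum(p for y, p in mu.items() if y[i] == a and y[j] == b)
            if mass != Fraction(1, 4):
                return False
    return True


def uniform_on(support):
    """Uniform distribution on a list of distinct tuples."""
    return {y: Fraction(1, len(support)) for y in support}


# Illustrative example: the even-parity strings of length 3 (the 3XOR predicate).
even_parity = [y for y in product((0, 1), repeat=3) if sum(y) % 2 == 0]
# Illustrative non-example: the single satisfying assignment of 3-bit AND.
all_ones = [(1, 1, 1)]

print(is_pairwise_independent(uniform_on(even_parity), 3))  # True
print(is_pairwise_independent(uniform_on(all_ones), 3))     # False
```

The check is exponential in $k$ only through the size of the table, which is adequate for the constant-arity predicates considered here.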

Another line of work has been concerned with proving unconditional lower bounds for these problems on restricted families of algorithms. These works considered convex relaxations for CSPs, where we say that a CSP is approximation resistant for some relaxation $\mathcal{R}$ if there is an instance for which a random assignment is essentially optimal, but the relaxation value is $1-o(1)$ (namely, the relaxation “thinks” that it’s possible to satisfy almost all constraints). Interestingly, the unconditional results match the conditional ones. That is, for certain weaker relaxations (namely, the Sherali-Adams linear programming hierarchy or Sherali-Adams augmented with the basic semidefinite program), there are unconditional results for the same predicates that were shown approximation-resistant under the UGC. (This is of course not a coincidence, as the UGC is intimately connected with some of these weaker relaxations.) In contrast, for the stronger Sum of Squares (SOS) (also known as Lasserre) relaxation, the previously known results utilized the same conditions as in Chan’s NP-hardness result (and in fact inspired Chan’s work).

In this work we show that the pairwise independence condition suffices for lower bounds even for this stronger Sum-of-Squares hierarchy. This result is interesting in its own right and, based on past experience, could also be viewed as suggesting that it may be possible to improve the UGC-based results to results based on $\mathbf{P}\neq\mathbf{NP}$ .

Our results actually hold for a more general setting than showing approximation-resistance of predicates, and so to state them we need to introduce some notation. Roughly speaking, we show that for every $k$ and an arbitrarily small $\epsilon>0$ , there exists a set $\mathcal{I}=\{C_{1},\ldots,C_{m}\}$ of $k$ -tuples of literals (i.e. variables or their negations) over the variables $x_{1},\ldots,x_{n}$ such that (1) for every assignment $x$ to the variables, the induced distribution on $\{0,1\}^{k}$ obtained by taking a random $i\in[m]$ and looking at the literals in $C_{i}$ is $\epsilon$ -close to the uniform distribution on $\{0,1\}^{k}$ but (2) for every pairwise independent distribution $\mu$ over $\{0,1\}^{k}$ , there is a relaxation-solution that “cheats” the $\Omega(n)$ -degree SOS relaxation to think that there is a distribution $\mathcal{D}$ over assignments (i.e. over $\{0,1\}^{n}$ ) such that for every $i\in[m]$ , the projection of $\mathcal{D}$ to the literals in $C_{i}$ is distributed according to $\mu$ . This immediately implies that predicates supporting a pairwise independent distribution are approximation-resistant for this relaxation. We now formally state our results:

The Sum-of-Squares hierarchy can be thought of as optimizing over pseudo-expectations; see the survey and the references therein, as well as the lecture notes. For notational convenience, we will use variables over $\{\pm 1\}$ instead of $\{0,1\}$ . A literal is a function $f:\{\pm 1\}^{n}\rightarrow\{\pm 1\}$ such that $f(x)=x_{i}$ or $f(x)=-x_{i}$ for some $i$ . If $C=(f_{1},\ldots,f_{k})$ is a $k$ -tuple of literals then we denote by $C(x)$ the tuple $(f_{1}(x),\ldots,f_{k}(x))$ . Our main result is the following:

For every $x\in\{\pm 1\}^{n}$ , the distribution $\{C(x)\}$ where $C$ is chosen at random in $\mathcal{I}$ is within $\epsilon$ statistical distance to the uniform distribution over $\{\pm 1\}^{k}$ .

The following immediate corollary implies that predicates supporting pairwise independent distributions are approximation-resistant for $\Omega(n)$ -degree SOS:

For every $\epsilon>0$ and $P:\{\pm 1\}^{k}\rightarrow\{0,1\}$ , if there exists a pairwise independent distribution $\mu$ supported on $P^{-1}(1)$ then there exists $\delta>0$ such that for all $n$ there is a set $\mathcal{I}=\{C_{1},\ldots,C_{m}\}$ of $k$ -tuples of literals over $x_{1},\ldots,x_{n}$ such that

The value of the $\delta n$ -degree Max- $P$ SOS relaxation for the fraction of satisfiable constraints on the instance $\mathcal{I}$ is $1$ .

The instance $\mathcal{I}=(C_{1},\ldots,C_{m})$ is actually obtained at random (with some pruning of a small fraction of the constraints, or alternatively, with some loss in the “perfect completeness” condition). Thus our results can also be thought of as giving some evidence for a conjecture of Barak, Kindler and Steurer that no polynomial-time algorithm (including in particular the SOS algorithm) can beat the basic semidefinite program on approximating random CSP instances.

Throughout this paper we restrict ourselves to the Boolean case, and do not consider extensions to a larger alphabet, though our methods may be useful in this case as well.

Related works

Grigoriev proved in 1999 that (in the language of this paper) 3XOR is approximation resistant for the degree $\Omega(n)$ Sum-of-Squares hierarchy. Grigoriev’s work in fact predated the papers of Parrilo and Lasserre proposing the SOS hierarchy, and so he used the different (but equivalent) language of Positivstellensatz Calculus proofs. (Also, as far as we know, he did not note that these proofs can be efficiently found via a semidefinite program.) Grigoriev’s result was rediscovered in 2008 by Schoenebeck, who also noted that it implies approximation resistance for 3SAT and some other CSPs as well. Tulsiani (see also Chan) further generalized these results and in particular showed that every predicate that contains a pairwise independent subgroup is approximation resistant for $\Omega(n)$ -degree SOS. Both Tulsiani and Schoenebeck follow Grigoriev’s technique of reducing SOS lower bounds to resolution width lower bounds. As far as we know, no other SOS integrality gaps for approximating CSPs were known, and there are very few SOS lower bounds in general, most notably Grigoriev’s lower bound for knapsack and the very recent result by Meka, Potechin and Wigderson for the planted clique problem (personal communication).

Arora, Bollobás, Lovász and Tourlakis obtained integrality gaps for the Lovász-Schrijver linear programming hierarchy for Vertex Cover. Schoenebeck, Trevisan and Tulsiani showed that Max-Cut is approximation resistant for $\Omega(n)$ levels of the Lovász-Schrijver hierarchy, and these results have been strengthened to the stronger Sherali-Adams hierarchy. The famous Goemans-Williamson algorithm shows that Max-Cut is not approximation resistant for even the degree $2$ SOS hierarchy, further underscoring the difference between these relaxations.

Perhaps closest to our work are the papers of Benabbas, Georgiou, Magen, and Tulsiani, who showed that predicates containing a pairwise independent distribution are approximation resistant for $\Omega(n)$ rounds of the Sherali-Adams hierarchy, even when one adds the degree $2$ SOS constraints. Indeed, our pseudo-distribution agrees with theirs, though we describe it somewhat differently and, most importantly, need a completely different argument to show that it is positive semi-definite. Our work is also inspired by the pseudo-expectation view of the SOS hierarchy as advocated in earlier papers.

Overview of our proof

We now review the construction of the instance, as well as the pseudo-expectation operator, and then discuss how we come up with these local orthogonal functions. As mentioned above, our instance $\mathcal{I}=(C_{1},\ldots,C_{m})$ will simply be a random instance, which we think of as a $k$ -uniform hypergraph with $m$ hyperedges $C_{1},\ldots,C_{m}$ . After some pruning we can assume this hypergraph has girth $\Omega(\log n)$ . (If we don’t prune these clauses then our proof guarantees that for a $1-o(1)$ fraction of the clauses we get the marginal distribution to be $\mu$ . It is possible that this can be upgraded to all of the clauses at the expense of some additional complication, but we have not checked whether or not that’s the case.) By a simple Chernoff and union bound argument, if $m>cn$ for a sufficiently large constant $c$ then for every assignment $x\in\{\pm 1\}^{n}$ , the induced distribution $\{C_{i}(x)\}_{i\sim[m]}$ will be $\epsilon$ -close to the uniform distribution. For this informal overview, suppose that we merely want to establish the existence of a degree $d$ pseudo-expectation operator for some large constant $d$ . Note that this means that sets of at most $d$ (or even $2^{d}$ ) variables form a forest (i.e. disjoint collection of trees) in this hypergraph.

The above overview can be converted into a full proof with some care when $d=o(\log{(n)})$ by exploiting the acyclicity of all subgraphs involved. Extending to $d=\Omega(n)$ , however, introduces additional subtleties. When $d$ exceeds $\Omega(\log{(n)})$ , subgraphs induced by $d$ vertices of $\mathcal{I}$ can have cycles. An immediate effect of this is that the definition of a closed set that we gave before no longer yields consistent local distributions on any collection of $d$ variables. An example of a problem that arises when cycles can exist on a set of vertices is illustrated in Figure 2. To fix this, we define a stronger notion of closed set $S$ that guarantees that all paths of length at most $3$ between any two vertices in $S$ are completely contained inside $S$ . This notion of closure differs from the one that Benabbas et al. use. An appeal to the expansion property of $\mathcal{I}$ (instead of high girth as before) can be used to show that the closure of a set $S$ is at most a constant factor larger than $|S|$ . Similarly, as before, we need to show that there exists a (suitably defined) ball, $Ball(A)$ , around any set $A$ of variables (of size at most $d$ ) such that the correlations with any other set $B$ of size at most $d$ are “captured” by the intersection of $Ball(A)$ and $B$ . This needs a more careful argument. In particular, the correlations (even in the low girth case) are actually not necessarily captured by the intersection of $Ball(A)$ with $B$ , but rather with some set $B^{\prime}$ that is related, but not identical, to $B$ . However, the crucial property that we require is that the set $B_{in}=Ball(A)\cap B^{\prime}$ satisfies (1) if $B$ came before $A$ in the ordering, then so will $B_{in}$ and (2) $|B_{in}|+|B\setminus Ball(A)|\leq|B|$ . This second property is more complicated to prove in the case where $|B|$ can be much larger than the girth bound, but turns out to hold there as well. The bottom line is that, with additional care, the high-level picture provided by this overview can indeed be implemented, and we give a full analysis based on the local Gram-Schmidt-like process in Section 6.

Preliminaries

We collect some standard definitions and notation here. A $(k,n)$ -instance is a $k$ -uniform hypergraph $\mathcal{I}=\{C_{1},\ldots,C_{m}\}$ over $[n]$ so that every hyperedge (also known as a clause) $C=(i_{1},\ldots,i_{k})\in\mathcal{I}$ is labeled by a string $\sigma=\sigma^{C}\in\{\pm 1\}^{k}$ . We identify a clause $C$ with the function that maps $x\in\{\pm 1\}^{n}$ to $(y_{1},\ldots,y_{k})$ where $y_{j}=\sigma_{j}x_{i_{j}}$ . We will sometimes also consider $C$ as a tuple of the literals $(\sigma_{1}x_{i_{1}},\ldots,\sigma_{k}x_{i_{k}})$ . We write $V(C)$ for the variables involved in (or covered by) a clause $C$ and similarly for $V\subseteq[n]$ we write $\mathcal{C}(V)$ for the set of all clauses $C$ such that $V(C)\subseteq V$ . For any $x\in\{-1,1\}^{n}$ , we write $x_{A}$ to denote the tuple of coordinates in the subset $A\subseteq[n]$ . If $x\in\{-1,1\}^{A}$ and $y\in\{-1,1\}^{B}$ for disjoint sets $A$ and $B$ , we will write $x\circ y$ for the string in $\{-1,1\}^{A\cup B}$ that projects to $x$ for coordinates in $A$ and to $y$ for coordinates in $B$ .

Unless explicitly mentioned, the base of all logarithms appearing in the paper is assumed to be $2$ . We consider the arity of our tuples $k$ to be a constant and so $O$ notation may hide the dependence on $k$ .

We now define some standard ideas in the context of hypergraphs.

The degree of $G$ is the maximum number of hyperedges that intersect any given hyperedge in $G$ . The girth of $G$ is the length of the shortest cycle in $G$ . For any vertices $u,v$ of a hypergraph $G$ , we define the distance $\mathsf{dist}(u,v)$ between $u$ and $v$ in $G$ as the minimum number of hyperedges in any path that joins $u$ and $v$ in $G$ . For subsets of vertices $S,T$ , we define $\mathsf{dist}(S,T)\stackrel{\mathrm{def}}{=}\min_{s\in S,t\in T}\mathsf{dist}(s,t)$ .
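As an illustration of the distance just defined, the sketch below computes $\mathsf{dist}(u,v)$ and $\mathsf{dist}(S,T)$ by breadth-first search. The representation of a hypergraph as a list of vertex tuples and the function names are our own assumptions; returning `None` for disconnected vertices is likewise a convention chosen only for this example.

```python
from collections import deque


def hypergraph_dist(hyperedges, u, v):
    """Minimum number of hyperedges on a path joining u and v (None if disconnected)."""
    incident = {}
    for edge in hyperedges:
        for w in edge:
            incident.setdefault(w, []).append(edge)
    dist = {u: 0}
    queue = deque([u])
    while queue:
        w = queue.popleft()
        if w == v:
            return dist[w]
        # Every vertex sharing a hyperedge with w is one hyperedge further away.
        for edge in incident.get(w, []):
            for nxt in edge:
                if nxt not in dist:
                    dist[nxt] = dist[w] + 1
                    queue.append(nxt)
    return None


def set_dist(hyperedges, S, T):
    """dist(S, T): the minimum of dist(s, t) over s in S and t in T."""
    values = [hypergraph_dist(hyperedges, s, t) for s in S for t in T]
    values = [d for d in values if d is not None]
    return min(values) if values else None


# Illustrative 3-uniform chain of three hyperedges.
chain = [(0, 1, 2), (2, 3, 4), (4, 5, 6)]
print(hypergraph_dist(chain, 0, 6))  # 3
```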

Next, we define the notion of expansion in a $k$ -uniform hypergraph $G$ :

A $k$ -uniform constraint hypergraph $G$ is said to be $(r,\beta)$ -expanding if every collection $\mathcal{C}$ of at most $r$ hyperedges of $G$ covers at least $(k-1-\beta)|\mathcal{C}|$ vertices of $G$ , i.e. $|\{v\mid\exists C\in\mathcal{C}\text{, }v\in C\}|\geq(k-1-\beta)|\mathcal{C}|$ . We call $\beta$ the coefficient of expansion of $G$ .
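For very small hypergraphs the expansion condition can be checked directly from the definition. The following sketch does so by enumerating all collections of at most $r$ hyperedges; the function name and the two toy hypergraphs are hypothetical, and the enumeration is exponential in $r$ , so this is meant only to illustrate the definition.

```python
from itertools import combinations


def is_expanding(hyperedges, k, r, beta):
    """Brute-force check of (r, beta)-expansion; exponential in r, small inputs only."""
    for size in range(1, r + 1):
        for collection in combinations(hyperedges, size):
            covered = set().union(*collection)
            if len(covered) < (k - 1 - beta) * size:
                return False
    return True


# A chain of hyperedges overlapping in single vertices expands well.
chain = [(0, 1, 2), (2, 3, 4), (4, 5, 6)]
# Three hyperedges sharing two vertices cover only 5 < (3 - 1 - 0.1) * 3 vertices.
dense = [(0, 1, 2), (0, 1, 3), (0, 1, 4)]

print(is_expanding(chain, k=3, r=3, beta=0.1))  # True
print(is_expanding(dense, k=3, r=3, beta=0.1))  # False
```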

Let $\mathcal{I}$ be a $(k,n)$ instance. We now describe the properties of the $(k,n)$ instances that we need and give a construction for them in Section B of the Appendix by taking a random instance and removing a few clauses. Specifically, we show the existence of nice instances, the ones that satisfy the properties described in the lemma below:

Fix $1>\epsilon,\delta\geq 0$ and $\gamma\geq e^{k}k^{2}$ . Then, there exists a $k$ -uniform constraint hypergraph $G$ with $\gamma n$ edges such that for $\eta=(1/\gamma^{2})^{2/\delta}$ , $1/\tau=4\log_{2}(\gamma k^{2})$ , $G$ :

has girth $\mathfrak{g}\geq\tau\log{(n)}$

We will use this lemma with any given $\epsilon$ (the soundness slack), $\delta=\frac{1}{200}$ and $\gamma=e^{k}k^{2}/\epsilon^{2}$ . We call instances that satisfy the conditions of the lemma above nice.

For such instances, it is also easy to prove the soundness part (part (i)) of Theorem 1.2 (see Section B.1 of the Appendix) which we record in the following lemma.

For every $\epsilon>0$ and $k$ , if $n$ is sufficiently large then there exists a nice $(k,n)$ -instance $\mathcal{I}$ with the property that for every $x\in\{\pm 1\}^{n}$ , the distribution $\{C(x)\}_{C\in\mathcal{I}}$ is $\epsilon$ -close in total variation distance to the uniform distribution on $\{\pm 1\}^{k}$ .
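The soundness condition can be evaluated exhaustively on toy instances. The sketch below represents a clause as a pair `(variables, signs)` following the notation of this section and computes the worst-case statistical distance from uniform over all $2^{n}$ assignments. The function names and the tiny instance are our own assumptions, and the brute-force search is feasible only for very small $n$ .

```python
from itertools import product


def apply_clause(clause, x):
    """Evaluate a clause (variables, signs) on x in {+1,-1}^n: y_j = sigma_j * x_{i_j}."""
    variables, signs = clause
    return tuple(s * x[i] for i, s in zip(variables, signs))


def distance_from_uniform(instance, x, k):
    """Statistical distance between C(x), for a uniform clause C, and uniform on {+1,-1}^k."""
    counts = {}
    for clause in instance:
        y = apply_clause(clause, x)
        counts[y] = counts.get(y, 0) + 1
    m = len(instance)
    return sum(abs(counts.get(y, 0) / m - 2.0 ** -k)
               for y in product((1, -1), repeat=k)) / 2


def worst_case_distance(instance, n, k):
    """Maximum of distance_from_uniform over all 2^n assignments (brute force)."""
    return max(distance_from_uniform(instance, x, k)
               for x in product((1, -1), repeat=n))


# Toy instance: one variable triple with all eight sign patterns.
all_signs = [((0, 1, 2), s) for s in product((1, -1), repeat=3)]
print(worst_case_distance(all_signs, n=3, k=3))  # 0.0
```

In this toy instance every assignment sees each pattern in $\{\pm 1\}^{3}$ exactly once, so the distance is zero; a single clause on its own is at distance $7/8$ from uniform.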

Closed sets, and the definition of the pseudo-expectation

We first define the concept of closed sets that is central to our argument.

For every $R\geq 1$ , a set $A\subseteq[n]$ is $R$ -closed if for every $v,v^{\prime}\in A$ , any path of length at most $R$ between $v$ and $v^{\prime}$ is contained in $A$ . We say that $A$ is closed if it is $3$ -closed.

We define the $R$ -closure of $A$ , denoted by $\mathsf{cl}_{R}(A)$ , to be the intersection of all sets $B$ such that $A\subseteq B$ and $B$ is $R$ -closed. The closure of $A$ , denoted by $\mathsf{cl}(A)$ , is the $3$ -closure of $A$ .

Readers familiar with the definition of closure (or advice set) in previous works will find the definition of closure above slightly different. The main difference is that our definition allows us to have some nice properties such as uniqueness and that the intersection of two closed sets is closed, which are very helpful for our proof. We stress however that the actual pseudo-expectation is the same as that of those works.

Next, we give a constructive definition of closure of a set.

Given $S\subseteq[n]$ and any $R<\min\{\mathfrak{g}/2,\frac{1}{2\beta}\}$ , the $R$ -closure of $S$ can be obtained by the following procedure run on $S$ : Set $A:=\emptyset$ . For every $v,v^{\prime}\in V(A)\cup S$ such that there is a path of length at most $R$ between $v$ and $v^{\prime}$ in $\mathcal{I}$ not contained in $A$ , add every clause in the path to $A$ . Output $V(A)\cup S$ .

Observe that the procedure terminates as there are only finitely many clauses. Further, the output is $R$ -closed by virtue of the termination of the procedure. By induction on the time at which a path is added in the procedure, it is easy to show that every $R$ -closed set containing $S$ must contain the path. Thus, $V(A)\cup S$ is an $R$ -closed set containing $S$ , and every clause $C$ such that $V(C)\subseteq V(A)\cup S$ satisfies $V(C)\subseteq\mathsf{cl}_{R}(S)$ . The lemma now follows by the minimality of $\mathsf{cl}_{R}(S)$ . ∎
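The closure procedure can be sketched in code as follows. This is only an illustration for small hypergraphs: we assume that a path of length $j$ means a sequence of $j$ distinct clauses linked through distinct vertices, we represent clauses as tuples of vertices (ignoring signs), and the function name `r_closure` is ours. The sketch does not verify the hypothesis $R<\min\{\mathfrak{g}/2,\frac{1}{2\beta}\}$ of the lemma.

```python
def r_closure(clauses, S, R):
    """Sketch of the R-closure procedure on a small hypergraph.

    clauses: list of tuples of vertices; S: iterable of vertices.
    Assumption: a path of length j is a sequence of j distinct clauses
    linked through distinct vertices.
    """
    incident = {}
    for idx, clause in enumerate(clauses):
        for v in clause:
            incident.setdefault(v, []).append(idx)

    added = set()    # indices of the clauses placed in A
    closed = set(S)  # the current vertex set V(A) ∪ S

    def find_new_path():
        # Depth-first search for a path of length <= R between two vertices
        # of `closed` that uses at least one clause not yet in A.
        for start in closed:
            stack = [(start, (), (start,))]
            while stack:
                v, path, seen = stack.pop()
                for idx in incident.get(v, []):
                    if idx in path:
                        continue
                    new_path = path + (idx,)
                    for w in clauses[idx]:
                        if w in seen:
                            continue
                        if w in closed and not set(new_path) <= added:
                            return new_path
                        if len(new_path) < R:
                            stack.append((w, new_path, seen + (w,)))
        return None

    while True:
        path = find_new_path()
        if path is None:
            return closed
        added.update(path)
        for idx in path:
            closed.update(clauses[idx])


# Illustrative chain of three clauses plus one disjoint clause.
chain = [(0, 1, 2), (2, 3, 4), (4, 5, 6), (7, 8, 9)]
print(sorted(r_closure(chain, {0, 6}, R=3)))  # [0, 1, 2, 3, 4, 5, 6]
print(sorted(r_closure(chain, {0, 6}, R=2)))  # [0, 6]
```

Each successful search adds at least one new clause to $A$ , which mirrors the termination argument in the proof above.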

Next, we bound the size of $\mathsf{cl}_{R}(S)$ .

Let $R<\min\{\mathfrak{g}/2,\frac{1}{2\beta}\}$ and let $S\subseteq[n]$ be such that $|S|\leq\frac{\eta n}{10R}$ . Then, $|\mathcal{C}(\mathsf{cl}_{R}(S))|\leq 2R|S|$ and $|\mathsf{cl}_{R}(S)|\leq 2Rk|S|$ .

Consider the procedure described in Lemma 4.3. Let $S^{iso}\subseteq\mathsf{cl}_{R}(S)$ be the isolated vertices in $\mathsf{cl}_{R}(S)$ . Observe that one cannot add any isolated vertices in the procedure and thus $S^{iso}\subseteq S$ . Define $S^{\prime}=S\setminus S^{iso}$ . Then, $\mathsf{cl}_{R}(S)=\mathsf{cl}_{R}(S^{\prime})\cup S^{iso}$ .

If the process terminates before adding a total of $q=\frac{|S^{\prime}|}{\frac{1}{R}-\beta}$ clauses, then there’s nothing to prove, since $|S^{\prime}|\leq|S|\leq\frac{\eta n}{10R}$ yields that $q\leq\frac{\eta n}{5}$ . Thus, suppose, for the sake of contradiction, that the procedure adds more than $q$ clauses and let the $i^{th}$ round of the procedure be the first round where the number of clauses added exceeds $q$ .

Let $\mathcal{C}_{i}$ be the set of clauses added in the procedure up to and including the $i^{th}$ round and let $S^{\prime}_{i}$ be the set of variables obtained by taking the union of the variables covered by the clauses added and $S^{\prime}$ . Further, suppose that the $i^{th}$ round adds $q_{i}$ clauses. Then, $|\mathcal{C}_{i}|\leq q+q_{i}<\eta n$ and thus, $\mathcal{C}_{i}$ must satisfy the expansion requirement: $|V(\mathcal{C}_{i})|\geq(q+q_{i})(k-1-\beta)$ . On the other hand, any new path of length $j\leq R$ added in a round adds at most $jk-(j-1)-2$ new vertices. Thus, on average, each of the at most $j$ new clauses added in any round of the procedure contributes at most $k-1-1/j\leq k-1-1/R$ new vertices. Thus, $|S^{\prime}_{i}|\leq|S^{\prime}|+(q+q_{i})(k-1-1/R)$ .

This yields that $|S^{\prime}|\geq(q+q_{i})\cdot(1/R-\beta)>|S^{\prime}|$ using that $q=\frac{|S^{\prime}|}{\frac{1}{R}-\beta}$ . This is a contradiction.
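For convenience, the counting behind this contradiction can be written out in one chain. The derivation below only rearranges the bounds already stated in the proof; it uses $V(\mathcal{C}_{i})\subseteq S^{\prime}_{i}$ , and the positivity of $\frac{1}{R}-\beta$ , which follows from $R<\frac{1}{2\beta}$ .

```latex
\begin{align*}
jk-(j-1)-2 &= j\Bigl(k-1-\tfrac{1}{j}\Bigr) \le j\Bigl(k-1-\tfrac{1}{R}\Bigr)
  && \text{(new vertices from a path of length } j\le R\text{)}\\
(q+q_i)(k-1-\beta) &\le |V(\mathcal{C}_i)| \le |S'_i|
  \le |S'| + (q+q_i)\Bigl(k-1-\tfrac{1}{R}\Bigr)\\
\Longrightarrow\quad |S'| &\ge (q+q_i)\Bigl(\tfrac{1}{R}-\beta\Bigr)
  > q\Bigl(\tfrac{1}{R}-\beta\Bigr) = |S'| .
\end{align*}
```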

The size claimed in the lemma now follows by observing that $\frac{1}{R}-\beta\geq\frac{1}{2R}$ and that every clause contributes at most $k$ new variables. ∎

The following lemma summarizes the simple properties of the closures defined here.

For any $R<\mathfrak{g}/2$ , if $A$ and $B$ are $R$ -closed then so is $A\cap B$ .

If $A\subseteq B$ then $\mathsf{cl}_{R}(A)\subseteq\mathsf{cl}_{R}(B)$ .

Every connected component of $\mathsf{cl}_{R}(A)$ of size $\geq 2$ intersects $A$ in at least two elements.

Let $A=A_{1}\cup A_{2}\cup\ldots\cup A_{m}$ . Then, $\mathsf{cl}(A)=\mathsf{cl}(\cup_{i=1}^{m}\mathsf{cl}(A_{i}))$ .

If there are two vertices $v,v^{\prime}$ in $A\cap B$ such that $\mathsf{dist}(v,v^{\prime})\leq R$ , then since both $A$ and $B$ are $R$ -closed, both of them should contain the unique (since $R<\mathfrak{g}/2$ ) path between them.

By definition, $\mathsf{cl}_{R}(B)$ is an $R$ -closed set containing $B\supseteq A$ and hence if $\mathsf{cl}_{R}(A)\nsubseteq\mathsf{cl}_{R}(B)$ then $\mathsf{cl}_{R}(A)\cap\mathsf{cl}_{R}(B)$ would be an even smaller $R$ -closed set that contains $A$ , contradicting the minimality of $\mathsf{cl}_{R}(A)$ .

Suppose otherwise that there is some connected component $S$ of $\mathsf{cl}_{R}(A)$ with $|S|\geq 2$ intersecting $A$ in at most one element $\{x\}$ . We claim that $B=(\mathsf{cl}_{R}(A)\setminus S)\cup\{x\}$ is an $R$ -closed set containing $A$ . Clearly, $B\supseteq A$ . Now suppose for the sake of contradiction that there were two vertices $v\neq v^{\prime}$ of distance at most $R$ in $B$ whose path is not in $B$ . Then since $B\subseteq\mathsf{cl}_{R}(A)$ and $\mathsf{cl}_{R}(A)$ is $R$ -closed, the path between $v$ and $v^{\prime}$ must have had a vertex $u\in S\setminus\{x\}$ . But since one of $v$ or $v^{\prime}$ must be different from $x$ (say $v^{\prime}$ ), we get that $v^{\prime}\notin S$ is connected to $S$ in $\mathsf{cl}_{R}(A)$ , contradicting the fact that $S$ is a connected component.

Let $B=\mathsf{cl}(\cup_{i=1}^{m}\mathsf{cl}(A_{i}))$ . Since $\mathsf{cl}(A)$ is closed and contains $\cup_{i=1}^{m}A_{i}$ , $B\subseteq\mathsf{cl}(A)$ . If $B\neq\mathsf{cl}(A)$ , then, $B\supseteq\cup_{i=1}^{m}A_{i}$ and is closed contradicting the minimality of $\mathsf{cl}(A)$ . ∎

The definition and the proof of consistency of the local distribution we define were shown by Benabbas et al. for the weaker notion of closures they used (in order to define linear round solutions in the Sherali-Adams hierarchy). The argument for our notion of closure is similar but we include it here for the sake of completeness.

For every set $S\subseteq[n]$ , $|S|\leq d$ , let $\mathsf{cl}(S)$ be the closure of $S$ and suppose $I_{S}$ is the set of isolated variables in $\mathsf{cl}(S)$ . Define $\mathcal{C}(\mathsf{cl}(S))$ to be the set of all clauses $C$ such that $V(C)\subseteq\mathsf{cl}(S)$ . Then, we set:

where $x_{C}$ is the projection of $x$ onto the coordinates in $V(C)$ , and $Z_{\mathsf{cl}(S)}=2^{k|\mathcal{C}(\mathsf{cl}(S))|-|\mathsf{cl}(S)|}$ ( $\geq 1$ ). Observe that the above expression tells us that the marginal distribution of $\nu_{\mathsf{cl}(S)}$ over $I_{S}$ is uniform. We extend the notation above and write $\nu_{T}$ for the marginal of $\nu_{\mathsf{cl}(T)}$ on the variables in $T$ .
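The role of the normalization $Z_{\mathsf{cl}(S)}$ can be checked numerically on a toy example. The sketch below assumes that the local distribution has the product form $\nu(x)=Z\cdot\prod_{C}\mu(C(x))$ , an assumption on our part that is consistent with the constants $Z_{A},Z_{B}$ used in the proof of consistency below. The function names, the choice of $\mu$ (uniform on triples in $\{\pm 1\}^{3}$ with product $1$ ) and the two-clause example are hypothetical.

```python
from fractions import Fraction
from itertools import product


def local_distribution(vertices, clauses, mu, k):
    """Assumed form: nu(x) = Z * prod_C mu(C(x)), Z = 2^(k*|clauses| - |vertices|).

    vertices: ordered list of the variables in the closed set.
    clauses: list of (variables, signs) pairs contained in the set.
    mu: dict from k-tuples over {+1,-1} to Fractions.
    """
    Z = Fraction(2) ** (k * len(clauses) - len(vertices))
    nu = {}
    for values in product((1, -1), repeat=len(vertices)):
        x = dict(zip(vertices, values))
        weight = Z
        for variables, signs in clauses:
            y = tuple(s * x[i] for i, s in zip(variables, signs))
            weight *= mu.get(y, Fraction(0))
        nu[values] = weight
    return nu


def marginal_on_clause(nu, vertices, clause):
    """Distribution of C(x) when x is drawn from nu (zero entries dropped)."""
    variables, signs = clause
    out = {}
    for values, p in nu.items():
        x = dict(zip(vertices, values))
        y = tuple(s * x[i] for i, s in zip(variables, signs))
        out[y] = out.get(y, Fraction(0)) + p
    return {y: p for y, p in out.items() if p != 0}


# mu: uniform on triples whose product is +1 (pairwise independent).
mu = {y: Fraction(1, 4) for y in product((1, -1), repeat=3)
      if y[0] * y[1] * y[2] == 1}
# Two clauses sharing variable 2, plus the isolated variable 5.
vertices = [0, 1, 2, 3, 4, 5]
clauses = [((0, 1, 2), (1, 1, 1)), ((2, 3, 4), (1, -1, 1))]

nu = local_distribution(vertices, clauses, mu, k=3)
print(sum(nu.values()))  # 1
```

In this example the marginal of `nu` on each of the two clauses is again `mu`, and the isolated variable is uniform.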

We now show that $\nu_{\mathsf{cl}(S)}$ defined above is indeed a probability distribution over $\mathsf{cl}(S)$ .

Let $A$ and $B$ be closed sets such that $A\subseteq B$ and $|\mathcal{C}(B)|\leq\eta n$ . Then,

$\nu_{A}$ is a valid probability distribution: $\sum_{x\in\{-1,1\}^{A}}\nu_{A}(x)=1$ .

$\nu$ is locally consistent: for every $x\in\{-1,1\}^{A}$ , $\nu_{A}(x)=\sum_{y\in\{-1,1\}^{B\setminus A}}\nu_{B}(x\circ y)$ .

The following claim that we record as a lemma will be useful in the proof.

There exists an ordering $C_{1},C_{2},\ldots,C_{r}$ of clauses in $\mathcal{C}_{A,B}$ and a partition of $B\setminus A$ into sets $F_{1}\subseteq V(C_{1}),F_{2}\subseteq V(C_{2}),\ldots,F_{r}\subseteq V(C_{r})$ such that for every $j\leq r$ , $|F_{j}|\geq k-2$ and $F_{j}\cap\left(\cup_{i>j}V(C_{i})\right)=\emptyset$ .

We first complete the proof of Lemma 4.6 and then prove Lemma 4.7.

Let $Z_{A}=2^{-|A|+k|\mathcal{C}(A)|}$ and $Z_{B}=2^{-|B|+k|\mathcal{C}(B)|}$ . Let $\mathcal{C}_{A,B}=\mathcal{C}(B)\setminus\mathcal{C}(A)$ . Using (1), we have:

To simplify notation, we will write $\mu_{i}$ for $\mu_{C_{i}}$ and $x^{i}$ for $x_{V(C_{i})}$ where $x\in\{-1,1\}^{n}$ . Using the ordering given by Lemma 4.7, we have:

Now, $\sum_{i=1}^{r}|V(C_{i})\setminus F_{i}|=kr-|B\setminus A|$ . Further, $-|B|+k|\mathcal{C}(B)|-kr+|B\setminus A|=-|A|+k|\mathcal{C}(A)|$ . Thus, $Z_{B}\cdot 2^{-\sum_{i=1}^{r}|V(C_{i})\setminus F_{i}|}=Z_{A}$ , completing the proof. ∎
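The exponent bookkeeping in the last step can be spelled out as follows. It uses only that $F_{1},\ldots,F_{r}$ partition $B\setminus A$ with $F_{i}\subseteq V(C_{i})$ and $|V(C_{i})|=k$ , that $r=|\mathcal{C}_{A,B}|=|\mathcal{C}(B)|-|\mathcal{C}(A)|$ , and that $A\subseteq B$ .

```latex
\begin{align*}
\sum_{i=1}^{r} |V(C_i)\setminus F_i|
  &= \sum_{i=1}^{r} \bigl(k - |F_i|\bigr) = kr - |B\setminus A| ,\\
-|B| + k|\mathcal{C}(B)| - kr + |B\setminus A|
  &= -\bigl(|A| + |B\setminus A|\bigr) + k\bigl(|\mathcal{C}(A)| + r\bigr) - kr + |B\setminus A|\\
  &= -|A| + k|\mathcal{C}(A)| .
\end{align*}
```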

For every $C\in\mathcal{C}_{A,B}$ define $\Gamma(C)=\{v\in V(C)\mid\forall C^{\prime}\neq C\in\mathcal{C}_{A,B}\text{, }v\notin V(C^{\prime})\}$ . For any collection $\mathcal{C}$ of clauses in $\mathcal{C}_{A,B}$ , let $\Delta(\mathcal{C})=|\cup_{C\in\mathcal{C}}\Gamma(C)|$ . Similarly, define $\Gamma_{A}(C)=\Gamma(C)\setminus A$ and $\Delta_{A}(\mathcal{C})=|\cup_{C\in\mathcal{C}}\Gamma_{A}(C)|$ . We make the following claim:

For any $\mathcal{C}\subseteq\mathcal{C}_{A,B}$ , $\Delta_{A}(\mathcal{C})\geq(k-5/2-2\beta)|\mathcal{C}|.$

We first complete the proof of the lemma using the claim. Since $\Delta_{A}(\mathcal{C}_{A,B})\geq(k-5/2-2\beta)|\mathcal{C}_{A,B}|$ and $\beta<1/10$ , the average of $|\Gamma_{A}(C)|$ over $C\in\mathcal{C}_{A,B}$ exceeds $k-3$ , so there exists a clause $C$ such that $|\Gamma_{A}(C)|\geq k-2$ . Now $V(C)\setminus A\supseteq\Gamma_{A}(C)$ and thus $|V(C)\setminus A|\geq k-2$ . We place this clause at the beginning of the ordering, call it $C_{1}$ and set $F_{1}=V(C)\setminus A$ . We now iterate with $\mathcal{C}_{A,B}\setminus\{C\}$ to complete the construction, obtaining a clause $C_{2}\in\mathcal{C}_{A,B}\setminus\{C_{1}\}$ such that $|\Gamma_{A}(C_{2})|\geq k-2$ . Since $\Gamma_{A}(C_{1})$ cannot intersect $\Gamma_{A}(C_{2})$ , we can now set $F_{2}=V(C_{2})\setminus V(C_{1})$ . Continuing this way yields the required ordering and partition of $B\setminus A$ . ∎

Fix any $\mathcal{C}$ and consider any (maximally) connected subgraph with edges $\mathcal{C}^{\prime}\subseteq\mathcal{C}$ . If $\mathcal{C}^{\prime}$ consists of a single clause $C$ , then $|V(C)\cap A|\leq 1$ (since $A$ is closed) and $V(C)\cap V(C^{\prime})=\emptyset$ for any $C^{\prime}\neq C\in\mathcal{C}$ . Thus, $\Delta_{A}(\mathcal{C}^{\prime})\geq k-1$ .

Now suppose $\mathcal{C}^{\prime}$ consists of at least $2$ clauses. We first claim that $\Delta(\mathcal{C}^{\prime})\geq(k-2-2\beta)|\mathcal{C}^{\prime}|$ . To see this, observe that $\mathcal{C}^{\prime}$ is a collection of at most $\eta n$ clauses in $\mathcal{I}$ and thus, $|V(\mathcal{C}^{\prime})|\geq(k-1-\beta)|\mathcal{C}^{\prime}|$ . Further, every $v\in V(\mathcal{C}^{\prime})\setminus\cup_{C\in\mathcal{C}^{\prime}}\Gamma(C)$ belongs to at least two different clauses in $\mathcal{C}^{\prime}$ and thus, $(k-1-\beta)|\mathcal{C}^{\prime}|\leq|V(\mathcal{C}^{\prime})|\leq\Delta(\mathcal{C}^{\prime})+(k|\mathcal{C}^{\prime}|-\Delta(\mathcal{C}^{\prime}))/2$ . Rearranging gives $\Delta(\mathcal{C}^{\prime})\geq(k-2-2\beta)|\mathcal{C}^{\prime}|$ .
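The rearrangement in the last step is elementary; writing $\Delta=\Delta(\mathcal{C}^{\prime})$ for brevity, it reads:

```latex
\begin{align*}
(k-1-\beta)|\mathcal{C}'|
  &\le \Delta + \frac{k|\mathcal{C}'| - \Delta}{2}
   = \frac{\Delta}{2} + \frac{k|\mathcal{C}'|}{2}\\
\Longrightarrow\quad
\Delta &\ge 2(k-1-\beta)|\mathcal{C}'| - k|\mathcal{C}'|
   = (k-2-2\beta)|\mathcal{C}'| .
\end{align*}
```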

Next, we claim that for every $v\in V(\mathcal{C}^{\prime})\cap A$ there exists a pair of clauses $C,C^{\prime}$ such that $V\left(C\cup C^{\prime}\right)\cap A=\{v\}$ . Consider any clause $C\in\mathcal{C}$ such that $V(C)\cap A=\{v\}$ . If there is another clause $C^{\prime}$ such that $V(C^{\prime})\cap A=\{v\}$ , then observe that $V(C^{\prime})$ cannot intersect $A$ in any other element (since $A$ is closed) and thus we can let $C,C^{\prime}$ be the pair as above, corresponding to $v$ . Otherwise, there exists a clause $C^{\prime}\in\mathcal{C}$ such that $V(C^{\prime})\cap V(C)\neq\emptyset$ (since $V(\mathcal{C}^{\prime})$ is connected) and $V(C^{\prime})\cap A=\emptyset$ (as otherwise there would be a path between two distinct vertices of $A$ , of length at most $2$ , outside of $A$ ). Further, observe that all such pairs are disjoint. This is because if some pairs intersect, then they induce a path of length at most $3$ between two distinct vertices of $A$ that is not contained in $A$ (violating the $3$ -closedness of $A$ ). Thus, $|V(\mathcal{C}^{\prime})\cap A|\leq|\mathcal{C}^{\prime}|/2$ . Thus, we must have: $\Delta_{A}(\mathcal{C}^{\prime})\geq\Delta(\mathcal{C}^{\prime})-|\mathcal{C}^{\prime}|/2\geq(k-2-2\beta)|\mathcal{C}^{\prime}|-|\mathcal{C}^{\prime}|/2=(k-5/2-2\beta)|\mathcal{C}^{\prime}|$ .

Since for every connected component $\mathcal{C}^{\prime}$ inside $\mathcal{C}$ we have that $\Delta_{A}(\mathcal{C}^{\prime})\geq(k-5/2-2\beta)|\mathcal{C}^{\prime}|,$ we must have $\Delta_{A}(\mathcal{C})\geq(k-5/2-2\beta)|\mathcal{C}|$ as promised. This completes the proof of the claim. ∎

Suppose $A$ and $B$ are closed disjoint sets such that $A\cup B$ is closed. Then, $\nu_{A\cup B}(x)=\nu_{A}(x_{A})\cdot\nu_{B}(x_{B})$ .

We now define the pseudo-expectation operator associated with the local distributions $\{\nu_{T}\}_{|T|\leq d}$ :

Let $\mathcal{I}$ be a nice $(k,n)$ instance and $\mu$ a pairwise independent distribution over $\{\pm 1\}^{k}$ . Then the family of local distributions $\{\nu_{X}\}_{X\subseteq[n],|X|\leq d}$ for $d=\eta n/6$ satisfies:

Completeness: For every clause $C$ of $\mathcal{I}$ , $\nu_{V(C)}=\mu$ .

Consistency: for every $S\subseteq T\subseteq[n]$ , $|T|\leq d$ , the marginal of $\nu_{T}$ on to $S$ is $\nu_{S}$ .

The completeness property follows from (1) and $\mathcal{C}(V(C))=\{C\}$ . The consistency property follows from Lemma 4.6. ∎

Local Distribution on Unions

In this section we make an important step towards showing the positivity property of our pseudo-distribution by showing that if two sets $A$ and $B$ are sufficiently closed, then the local distribution on $A\cup B$ is only determined by the clauses that are contained in $A$ or in $B$ . In particular, this implies that if $A$ and $B$ are disjoint then the distribution on $A$ is independent of the distribution of $B$ . The main result of this section is the following expression for the local distribution on the union of $A$ and $B$ where $A$ is $R$ -closed for a sufficiently large constant $R$ and $B$ is closed.

Suppose $A$ is $R$ -closed for $R\geq 100$ and $B$ is closed. Then, for any $x\in\{-1,1\}^{A\cup B}$ ,

where $Z_{A,B}=2^{k|\mathcal{C}(A\cup B)|-|A\cup B|}$ .

We make two convenient definitions before proceeding, see Figure 3:

For any two closed sets $A$ and $B$ , a path $P$ of length at most $3$ is said to be a bridge path for the pair $A,B$ if $|P\cap A|=|P\cap B|=1$ .

For any two closed sets $A$ and $B$ , a path $P$ of length at most $3$ is said to be a bridge-closure path for the pair $A,B$ if there exists a bridge path $P^{\prime}$ such that $|P^{\prime}\cap P|=1$ and $|P\cap B|=1$ but $P\cap A=\emptyset$ .

As a final remark, observe that the example from Figure 1 shows that $A$ and $B$ being $2$ -closed is not enough to guarantee the statement of the lemma. While we believe that at least one of the sets $A$ and $B$ should be $R$ -closed for some $R>3$ for the lemma to hold, we currently do not have a counterexample demonstrating this point. We now proceed with the actual proof.

Let $D=\mathsf{cl}(A\cup B)$ . Let $\mathcal{C}_{A,B}$ and $\mathcal{C}_{B}$ be the set of bridge paths and bridge closure paths of $B$ for the pair $A,B$ , respectively. Observe that $V(\mathcal{C}_{A,B})\cup V(\mathcal{C}_{B})\subseteq D$ . We now show that these are the only extra clauses in $D$ :

The first observation describes how bridge paths and bridge-closure paths intersect.

For any distinct $P_{1},P_{2}\in\mathcal{C}_{A,B}$ , $P_{1}\cap P_{2}\subseteq A\cup B$ .

For any distinct $P_{1},P_{2}\in\mathcal{C}_{B}$ , $P_{1}\cap P_{2}\subseteq V(P)\cup B$ where $P$ is a bridge path.

For any $P\in\mathcal{C}_{B}$ and $P^{\prime}\in\mathcal{C}_{A,B}$ , $|V(P)\cap V(P^{\prime})|\leq 1$ .

Suppose $P_{1},P_{2}\in\mathcal{C}_{B}$ are such that $P_{1}\cap P\neq\emptyset$ and $P_{2}\cap P^{\prime}\neq\emptyset$ for some bridge paths $P\neq P^{\prime}$ . Then, $P_{1}\cap P_{2}=\emptyset$ .

If the claim were not true, then there would be a path of length $\leq 6$ between two vertices of $A$ , violating the fact that $A$ is $R$ -closed.

Suppose first that there is a bridge path $P$ such that $P\cap P_{1}\neq\emptyset$ and $P\cap P_{2}\neq\emptyset$ . If either $P_{1}$ or $P_{2}$ intersects $P$ in more than one element, then there is a cycle of length at most $6$ in $G$ , which violates the fact that $G$ has girth $\omega(1)$ . If $P_{1}$ and $P_{2}$ intersect in an element not contained in $V(P)$ , then, again, there is a cycle of length at most $9$ in $G$ , violating the high girth of $G$ . Similarly, if $P_{1},P_{2}$ intersect inside $B$ , then they cannot intersect outside of $B$ and, further, they cannot both intersect the same bridge path, as this would yield a cycle of length at most $9$ in $G$ . Thus, in both cases, $P_{1}\cap P_{2}\subseteq V(P)\cup B$ for some bridge path $P$ .

Otherwise there is a cycle of length at most $6$ in $G$ violating that $G$ has girth $\omega(1)$ .

If not, then if $|P\cap P^{\prime}\cap A|=1$ then there is a cycle of length $12$ in the graph, contradicting our assumption on the girth. Otherwise $|P\cap P^{\prime}\cap A|=2$ which means that there is a path of length at most $12$ between two distinct vertices of $A$ .

The next observation shows that there is no path of length at most $3$ that connects two bridge paths, two bridge-closure paths or two bridge-bridge-closure paths that are not contained in $A\cup B$ .

There is no path $P$ of length at most $3$ not contained in $A$ that connects a bridge path $P^{\prime}$ and $A$ .

There is no path of length at most $3$ not contained in $A$ that connects $P\in\mathcal{C}_{A,B}$ with $P^{\prime}\in\mathcal{C}_{B}$ .

There is no path of length at most $3$ connecting distinct $P,P^{\prime}\in\mathcal{C}_{B}$ .

Otherwise there is a path of length at most $6$ between two vertices of $A$ not contained in $A$ , violating the fact that $A$ is $R$ -closed.

Otherwise there is a path of length at most $12$ between two vertices of $A$ , violating the fact that $A$ is $R$ -closed.

Otherwise there is a path of length at most $18$ not contained in $A$ , connecting two vertices of $A$ .

The following claim is now a consequence of the claims above:

For any $C$ such that $V(C)\nsubseteq A\cup B$ but $V(C)\subseteq D$ , $C\in\mathcal{C}_{A,B}\cup\mathcal{C}_{B}$ .

Consider the iterative procedure of building the closure of $A\cup B$ by adding paths one by one in some order. Let $P$ be the first path in this order that violates the claim. Then, either $P$ intersects two bridge paths or a bridge path and $A$ or a bridge path and a bridge-closure path or two bridge-closure paths. Each of these situations is explicitly barred by the claims above. This completes the argument. ∎

Let $Z=2^{k|\mathcal{C}(D)|-|D|}=2^{k|\mathcal{C}(A\cup B)|+k|\mathcal{C}_{A,B}|-|D|}$ . Observe that $Z\cdot 2^{-2|\mathcal{C}_{A,B}|}=Z_{A,B}$ . For every clause $C\in\mathcal{C}_{A,B}\cup\mathcal{C}_{B}$ , let $V_{C}^{\prime}=V(C)\setminus(A\cup B)$ and $V_{C}^{\prime\prime}=V(C)\cap(A\cup B)$ . Similarly, let $D^{\prime}=D\setminus(A\cup B)$ and $D^{\prime\prime}=D\cap(A\cup B)$ . Next, we claim:

Let $D^{\prime}=V_{1}\cup V_{2}$ with $V_{1}\cap V_{2}=\emptyset$ , where $V_{1}=D^{\prime}\setminus V(\mathcal{C}_{A,B})$ and $V_{2}=D^{\prime}\setminus V_{1}$ .

In this section, we prove our main result. Our proof will follow easily from the following lemma which is the main result of this section.

The rest of this section is devoted to proving Lemma 6.1.

Our aim is to build an order on the sets in ${[n]\choose{\leq d}}$ , in which to process them in our local orthogonalization procedure. We start with an arbitrary ordering on the clauses of $\mathcal{I}$ , i.e., for every $C\in\mathcal{I}$ we define a unique index $\zeta(C)\in[m]$ . We say that $A\prec B$ if:

$\mathcal{C}(\mathsf{cl}(A))$ is smaller than $\mathcal{C}(\mathsf{cl}(B))$ in lexicographic order of $\zeta$ . That is, $A\prec B$ if the maximum index $\zeta(C)$ for $C\in\mathsf{cl}(A)$ is smaller than this maximum for $\mathsf{cl}(B)$ , and if they are equal we break ties by the second largest index and so on. We define $\pi(\mathsf{cl}(A))$ to be the index of $\mathsf{cl}(A)$ according to this ordering. (Note that $\pi$ is a permutation on distinct closures, and so if $\mathsf{cl}(A)\neq\mathsf{cl}(B)$ then $\pi(\mathsf{cl}(A))\neq\pi(\mathsf{cl}(B))$ .)

If $\mathcal{C}(\mathsf{cl}(A))=\mathcal{C}(\mathsf{cl}(B))$ then we say that $A\prec B$ if $|A|<|B|$ .

If $\mathcal{C}(\mathsf{cl}(A))=\mathcal{C}(\mathsf{cl}(B))$ and $|A|=|B|$ then we break ties arbitrarily.

For $i=0,\ldots,M$ , we let $A_{i}$ denote the $i^{th}$ set in this ordering. Note that $A_{0}=\emptyset$ and $A_{1},\ldots,A_{n}$ are the singleton elements $\{1\},\ldots,\{n\}$ (in some arbitrary order). We will write $\chi_{i}$ for $\chi_{A_{i}}$ in the following to reduce clutter.
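The ordering just described can be sketched as a sort key. This is only an illustration under stated assumptions: `closure_clause_indices` is a hypothetical oracle returning $\{\zeta(C)\mid C\in\mathcal{C}(\mathsf{cl}(A))\}$ (computing closures is outside this sketch), a closure with fewer clauses is treated as smaller when one index list is a prefix of the other, and the final "arbitrary" tie-break is taken to be the sorted element list.

```python
from itertools import combinations

def order_key(A, closure_clause_indices):
    """Sort key for the order on sets described in the text.

    closure_clause_indices(A) is a hypothetical oracle (an assumption of
    this sketch) returning the set of indices zeta(C) of the clauses in
    the closure of A.
    """
    # Compare the largest clause index first, then the second largest, ...
    indices = tuple(sorted(closure_clause_indices(A), reverse=True))
    # Ties on the closure are broken by |A|, then by an arbitrary but
    # fixed rule (here: the sorted elements of A).
    return (indices, len(A), tuple(sorted(A)))

def ordered_sets(n, d, closure_clause_indices):
    """Return all subsets of {1, ..., n} of size at most d, in order."""
    subsets = [frozenset(c)
               for r in range(d + 1)
               for c in combinations(range(1, n + 1), r)]
    return sorted(subsets, key=lambda A: order_key(A, closure_clause_indices))
```

With any oracle that assigns no clauses to the empty set and to singletons, the list starts with $\emptyset$ followed by the $n$ singletons, matching the remark about $A_{0},\ldots,A_{n}$ .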

2 Local Orthogonalization

Set $R=100$ . Define the $i^{th}$ local correlated space as

Invoking Lemma 4.12, it suffices to show that $|\mathsf{cl}_{R}(A_{i})|<s=\eta n/6$ . This follows by noting that $|A_{i}|\leq d$ and $|\mathsf{cl}_{R}(A_{i})|\leq 2Rkd=200\eta n/10000=\eta n/50<s$ (Lemma 4.4). ∎

The following simple lemma will be very useful.

Now suppose for the sake of contradiction that

for some $\delta>0$ . (If the expectation is negative then we can take $-g$ .) Let $f=\bar{\chi}_{i}-\epsilon g$ . We have:

and so if $\epsilon$ is sufficiently small then

contradicting our choice of $\bar{\chi}_{i}$ . ∎

3 Global Orthogonality lemma

We will need the following observation for the proof which we record before proceeding:

Suppose $H$ is a connected $k$ -uniform hypergraph such that there exists a subset of vertices $U$ with $|U|\geq 2$ satisfying $\mathsf{dist}(u,v)>R$ for every distinct $u,v\in U$ . Then, $H$ must have at least $\frac{|U|R}{2}$ hyperedges.

Observe that the balls of radius $R/2$ around the vertices $u\in U$ are pairwise disjoint, and each of them contains a path of length $R/2$ starting at its center (since $H$ is connected and $|U|\geq 2$ ), and hence at least $R/2$ hyperedges. ∎
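The observation can be checked on a small example. The sketch below is illustrative only: it measures distance as the number of hyperedges on a shortest path, and the `spider` construction (several paths glued at a common centre, with $k=2$ ) is a hypothetical test instance, not an object from the paper.

```python
from collections import deque

def hyperedge_distances(edges, source):
    """BFS distances from source, counted in hyperedges traversed."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for e in edges:
            if u in e:
                for v in e:
                    if v not in dist:
                        dist[v] = dist[u] + 1
                        queue.append(v)
    return dist

def pairwise_far(edges, U, R):
    """True if every two distinct vertices of U are at distance > R."""
    for u in U:
        dist = hyperedge_distances(edges, u)
        if any(dist.get(v, float("inf")) <= R for v in U if v != u):
            return False
    return True

def spider(arms, length):
    """2-uniform hypergraph: `arms` paths of `length` edges sharing a centre."""
    edges, leaves = [], []
    for a in range(arms):
        prev = "centre"
        for step in range(1, length + 1):
            cur = (a, step)
            edges.append(frozenset({prev, cur}))
            prev = cur
        leaves.append(prev)
    return edges, leaves

# Three arms of length 4: the leaves are pairwise at distance 8 > R = 7,
# and the hypergraph has 12 >= |U| * R / 2 = 10.5 hyperedges.
edges, leaves = spider(3, 4)
R = 7
bound_holds = pairwise_far(edges, leaves, R) and 2 * len(edges) >= len(leaves) * R
```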

Fix any $j<i$ and let $A=A_{i}$ and $B=A_{j}$ . Let

For every $x\in B_{bdy}$ , we call any associated clause $W$ as in the definition above a boundary clause. Let $B_{out}=B\setminus\mathsf{cl}_{R}(A_{i})$ and $B_{in}=B\setminus(B_{out}\cup B_{bdy})$ and $B_{rest}=B\setminus(B_{out}\cup B_{in})$ . Note that $B_{bdy}$ is not necessarily a subset of $B$ . Next, we make two useful claims:

We will show that $|B_{bdy}|\leq|B_{out}|$ . This immediately yields the claim by observing that $d\geq|B|=|B_{in}|+|B_{out}|+|B_{rest}|\geq|B_{in}|+|B_{bdy}|$ . We note that the proof of this claim is significantly simpler in the case that $|B|<R/2$ . Proving it in the case when $R$ is a constant and $|B|=\Omega(n)$ is one of the main technical ingredients in getting the proof sketched in the overview to work for $\Omega(n)$ rounds of the SOS hierarchy.

Let $Q\subseteq[n]$ be a maximal connected component in the subgraph defined by the hyperedges $\mathcal{C}(\mathsf{cl}(B))\setminus\mathcal{C}(\mathsf{cl}_{R}(A))$ . Let $Q_{bdy}=B_{bdy}\cap Q$ and $Q_{out}=B_{out}\cap Q$ . $B_{bdy}$ is thus partitioned into the sets $Q_{bdy}$ over all maximal connected components $Q$ . It is thus enough to prove that $|Q_{bdy}|\leq|Q_{out}|$ for any fixed $Q$ .

Observe that $Q\cap\mathsf{cl}_{R}(A)=Q_{bdy}$ . If $Q\cap\mathsf{cl}_{R}(A)=\emptyset$ , then, there is nothing to prove. If $Q_{bdy}=\{v\}$ , then, $Q$ contains $V(W_{v})$ where $W_{v}$ is a boundary clause associated with $v$ . If $Q$ contains no vertex of $B_{out}$ , then, observe that $\mathsf{cl}(B)\setminus(Q\setminus\{v\})$ is a closed set containing $B$ contradicting the minimality of $\mathsf{cl}(B)$ . Thus, in this case, $|Q_{bdy}|\leq|Q_{out}|$ .

Now suppose $|Q_{bdy}|\geq 2$ . Then, vertices in $Q_{bdy}$ are connected through clauses in $Q$ . On the other hand, since $A$ is $R$ -closed, for any $u,v\in Q_{bdy}$ , any path between $u$ and $v$ that uses clauses from $Q$ must be of length at least $R+1$ . Applying Lemma 6.6, we observe that $|\mathcal{C}(Q)|\geq|Q_{bdy}|R/2$ .

Next, we claim that $Q\subseteq\mathsf{cl}(Q_{bdy}\cup Q_{out})$ . It is easy to complete the proof once we have this claim: observe that

Rearranging yields that $|Q_{out}|\geq|Q_{bdy}|\cdot\frac{R/2-6}{6}$ . Using $R\geq 24$ yields that $|Q_{out}|\geq|Q_{bdy}|$ .

We now proceed to show that $Q\subseteq\mathsf{cl}(Q_{bdy}\cup Q_{out})$ . By Lemma 4.5 (4), $\mathsf{cl}(B)=\mathsf{cl}(B_{in}\cup B_{bdy}\cup B_{out})$ . Let $B^{\prime}=B_{in}\cup B_{bdy}\cup B_{out}\setminus(Q_{bdy}\cup Q_{out})$ . Then, by another application of Lemma 4.5 (4), $\mathsf{cl}(B)=\mathsf{cl}(\mathsf{cl}(B^{\prime})\cup\mathsf{cl}(Q_{bdy}\cup Q_{out}))$ . In other words, one can build the closure of $B$ by first building the closure of $B^{\prime}$ and $Q_{bdy}\cup Q_{out}$ (Step $1$ ) and then taking the closure of the unions of the obtained sets (Step $2$ ). Clearly, the final output contains every clause in $\mathcal{C}(Q)$ . If we show that (1) $\mathcal{C}(\mathsf{cl}(B^{\prime}))\cap\mathcal{C}(Q)=\emptyset$ and that (2) no clause from $\mathcal{C}(Q)$ is added in the step $2$ , then every clause in $\mathcal{C}(Q)$ must be added in the procedure to build $\mathsf{cl}(Q_{bdy}\cup Q_{out})$ and thus we are done. We now proceed to show the two statements above.

(1): First observe that $\mathsf{cl}(B^{\prime})$ itself can be built by building the closure of $B_{in}$ (and $\mathsf{cl}(B_{in})\subseteq\mathsf{cl}_{R}(A)\Rightarrow\mathcal{C}(\mathsf{cl}(B_{in}))\cap\mathcal{C}(Q)=\emptyset$ ), the closure of $B_{out}\cup B_{bdy}\setminus(Q_{bdy}\cup Q_{out})$ (that cannot intersect any clause from $\mathcal{C}(Q)$ as then $Q$ must include a vertex from $B_{out}\cup B_{bdy}\setminus(Q_{bdy}\cup Q_{out})$ , a contradiction) and finally taking the closure of their union. This last step cannot add a clause in $Q$ : every path $P$ added connects $\mathsf{cl}(B_{in})$ and $\mathsf{cl}(B_{out}\cup B_{bdy}\setminus(Q_{bdy}\cup Q_{out}))$ . If $P$ is contained in $\mathsf{cl}_{R}(A)$ , then, there is nothing to prove. Otherwise $P$ must pass (exactly once) through a boundary vertex. If $P$ contains a clause from $\mathcal{C}(Q)$ , then, if $P$ passes through a boundary vertex not in $Q_{bdy}$ , then this enlarges $Q$ violating that $Q$ is a maximally connected component. If, on the other hand, $P$ passes through a boundary vertex in $Q_{bdy}$ , then, $P$ connects $B_{out}\setminus Q$ with $Q$ violating the maximality of $Q$ . Thus, $\mathcal{C}(\mathsf{cl}(B^{\prime}))$ cannot include any clause from $\mathcal{C}(Q)$ .

(2): Consider the step $2$ of the procedure to build $\mathsf{cl}(B)$ . In this step, we add paths (of length at most $3$ ) that connect $\mathsf{cl}(B^{\prime})$ and $\mathsf{cl}(Q_{bdy}\cup Q_{out})$ . For any such path $P$ , if $P$ includes some clause $C$ from $\mathcal{C}(Q)$ then it crosses out of $\mathsf{cl}_{R}(A)$ (exactly once) and thus must pass through a boundary vertex. By maximality of $Q$ , we must have that $P\cap B_{bdy}\in Q_{bdy}$ and $P\setminus\mathcal{C}(\mathsf{cl}_{R}(A))\subseteq\mathcal{C}(Q)$ . On the other hand, the part of $P$ that connects some vertex in $Q_{bdy}$ to $\mathsf{cl}(Q_{bdy}\cup Q_{out})$ is of length at most $3$ and thus must be contained in $\mathsf{cl}(Q_{bdy}\cup Q_{out})$ . Thus every edge in $P\setminus\mathcal{C}(\mathsf{cl}_{R}(A))$ is present in $\mathcal{C}(\mathsf{cl}(Q_{bdy}\cup Q_{out}))$ and thus $C\in\mathcal{C}(Q)$ .

Suppose $B_{out}\neq\emptyset$ . Then, for every $S\subseteq B_{in}\cup B_{bdy}$ , $S\prec A$ .

Since $B_{out}\neq\emptyset$ , $\mathsf{cl}(B)\neq\mathsf{cl}(A)$ . Thus, $\pi(\mathsf{cl}(B))<\pi(\mathsf{cl}(A))$ . Now, $|B_{in}\cup B_{bdy}|\leq d$ from Claim 6.7. Thus, every subset $S\subseteq B_{in}\cup B_{bdy}$ has a well-defined ordering w.r.t ${[n]\choose{\leq d}}$ . Further, for every such $S$ , $\mathsf{cl}(S)\subseteq\mathsf{cl}(B)$ (Lemma 4.5) and thus, $\pi(\mathsf{cl}(S))\leq\pi(\mathsf{cl}(B))$ . Hence, $S\prec A$ . ∎

Now, $\chi_{B}=\chi_{B_{in}}\chi_{B_{rest}}\chi_{B_{out}}$ and we can write

Consider an arbitrary assignment $z$ to $B\setminus A$ and $\gamma\in\{\pm 1\}^{|B_{bdy}|}$ to $x_{B_{bdy}}$ . Let $\mathds{1}_{B_{bdy}=\gamma}$ be the function that on input $x\in\{\pm 1\}^{n}$ outputs $1$ if $x_{B_{bdy}}=\gamma$ and zero otherwise.

Lemma 5.1 gives the expression for the local distribution on $\mathsf{cl}_{R}(A)\cup\mathsf{cl}(B)$ . Using the expression, we have:

for every $S\subseteq\mathsf{cl}_{R}(A)$ , $|S|\leq d$ .

Now $|B_{in}\cup B_{bdy}|\leq d$ (Claim 6.7) and $\mathds{1}_{B_{bdy}=\gamma}\in\mathsf{Span}\{\chi_{S}\mid S\subseteq B_{bdy}\}$ :

Each index set $S$ of the characters above is a subset of $B$ and thus $S\prec A_{i}$ (invoking Claim 6.8 along with the fact $\pi(\mathsf{cl}(B))<\pi(\mathsf{cl}(A))$ ). Thus, $\chi_{B_{in}}\chi_{B_{rest}}\cdot\mathds{1}_{B_{bdy}=\gamma}\in V_{i}$ . Using Lemma 6.3, thus,

We can now complete the proof of Lemma 6.1.

Acknowledgements

Thanks to Ryan O’Donnell, Li-Yang Tan, and David Steurer for fruitful discussions and the anonymous reviewers for their valuable comments and suggestions on a previous version of this paper.

Appendix A Random sparse predicates

Consider a random sparse predicate $P$ on $k$ variables and accepting $|P^{-1}(1)|=t$ assignments. If $t=\exp(o(k))$ , we now show that $P$ does not support a pairwise independent subgroup with high probability, as $k$ tends to infinity. Here the randomness corresponds to choosing $P^{-1}(1)$ to be a $t$ -sized subset of $\{0,1\}^{k}$ uniformly at random.

Under the condition of the observation, $P^{-1}(1)$ does not contain any pairwise independent subgroup, because any such subgroup contains an affine subspace of dimension $2$ .

Let $v_{1},\dots,v_{t}\in P^{-1}(1)$ be an enumeration of the vectors in $P^{-1}(1)$ . Note that if $P^{-1}(1)$ contains an affine subspace of dimension $2$ , then there are $1\leq a<b<c\leq t$ such that this subspace is exactly the affine span of $v_{a},v_{b},v_{c}$ .

For a fixed choice of the triple $(a,b,c)$ , conditioning on the event that $v_{a},v_{b},v_{c}$ span an affine subspace of dimension $2$ , the remaining vector from this affine subspace also belongs to $P^{-1}(1)$ with probability at most $t/2^{k}$ . Taking a union bound over $(a,b,c)$ (at most $t^{3}$ such choices), we see that $P^{-1}(1)$ contains an affine subspace with probability at most $t^{4}/2^{k}$ . ∎
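The event bounded in this proof is easy to test by brute force for small $k$ . The following sketch is illustrative only: vectors of $\{0,1\}^{k}$ are encoded as integer bitmasks (an encoding chosen here, not in the paper), so that addition over $GF(2)$ is bitwise XOR, and the function names are hypothetical.

```python
from itertools import combinations

def contains_affine_plane(accepting):
    """True if the set contains a 2-dimensional affine subspace of GF(2)^k.

    Over GF(2), any three distinct vectors a, b, c are affinely
    independent, and their affine span is {a, b, c, a ^ b ^ c}.
    """
    S = set(accepting)
    return any((a ^ b ^ c) in S for a, b, c in combinations(sorted(S), 3))

def union_bound(t, k):
    """The bound t^4 / 2^k from the proof, capped at 1."""
    return min(1.0, t ** 4 / 2 ** k)
```

For instance, `contains_affine_plane({0, 1, 2, 3})` holds since $0\oplus 1\oplus 2=3$ , while `{0, 1, 2, 4}` contains no such subspace. The bound is nontrivial only when $t^{4}<2^{k}$ , e.g. `union_bound(2**4, 20)` evaluates to $2^{-4}$ .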

Appendix B Constructing nice instances

In this section, we show the existence of nice instances of constraint hypergraphs and prove Theorem 3.4.

has girth $\mathfrak{g}\geq\log{(n)}/\tau$ , and

We first choose a random graph $G$ by choosing every $k$ uniform hyperedge, independently, with probability $p=4\gamma\cdot k!/n^{k-1}$ . Our final hypergraph will be obtained by removing hyperedges from $G$ .

For $G$ chosen as above, with probability at least $1/3$ ,

has between $2\gamma n$ and $6\gamma n$ edges.

has at most $n^{1/4}\log{(n)}$ cycles of length at most $\mathfrak{g}$ and

We first show that the claim above is enough to complete the proof of the lemma. We define $G^{\prime}$ to be the hypergraph obtained by removing every cycle of length at most $\mathfrak{g}$ . By the claim above, the total number of hyperedges removed in this process, for a large enough $n$ , is at most $\gamma n$ . Observe that the last property in the statement of the theorem is immediately satisfied by $G^{\prime}$ . Further, since $G^{\prime}$ is obtained only by removing hyperedges from $G$ , $G^{\prime}$ still enjoys $(\eta n,\delta)$ -expansion. Thus, $G^{\prime}$ is a constraint hypergraph that satisfies the requirements of the lemma. Finally, the total number of edges removed is sublinear in $n$ and thus $G^{\prime}$ has at least $\gamma n$ edges for a large enough $n$ .

We now move on to complete the proof of the claim above:

The expected number of edges in $G$ is $p\cdot{n\choose k}=4\gamma n\prod_{j=1}^{k-1}(1-\frac{j}{n})\geq 4\gamma n(1-\frac{k-1}{n})^{k-1}\geq 4\gamma n(1-\frac{(k-1)^{2}}{n})$ , and it is at most $4\gamma n$ . By an application of the Chernoff bound, the probability that the number of edges does not lie in the interval $[2\gamma n,6\gamma n]$ is at most $2e^{\frac{-\gamma n}{16}}$ .
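The expectation above can be sanity-checked numerically. The sketch below is a plain evaluation of the formulas in the text; the function name and the sample parameter values are illustrative assumptions (in particular, the small value of $\gamma$ is chosen for readability and does not satisfy the paper's requirement on $\gamma$ ).

```python
from math import comb, factorial

def edge_statistics(n, k, gamma):
    """Return (p, expected number of hyperedges, lower bound from the text)."""
    # Each k-subset of [n] is included independently with probability p.
    p = 4 * gamma * factorial(k) / n ** (k - 1)
    expected = p * comb(n, k)
    # Lower bound 4 * gamma * n * (1 - (k - 1)^2 / n) stated in the text.
    lower = 4 * gamma * n * (1 - (k - 1) ** 2 / n)
    return p, expected, lower

# Illustrative values only: n = 1000, k = 3, gamma = 2.
p, expected, lower = edge_statistics(1000, 3, 2)
```

Here the expectation lies between the stated lower bound and $4\gamma n=8000$ , well inside the interval $[2\gamma n,6\gamma n]$ .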

Next, consider any collection of $s$ clauses and let us compute the probability that they cover at most $cs$ variables for some $c=k-1-\delta$ . This probability is upper bounded by

Using that ${{cs}\choose k}\leq(cs)^{k}/k!$ and the approximation ${x\choose y}\leq\left(\frac{xe}{y}\right)^{y}$ , we can upper bound the above expression by:

Using that $c=k-1-\delta$ and that $\delta<1$ now yields an upper bound of

Thus, using that $\gamma>e^{k}k^{2}$ and that $s$ satisfies $\frac{s}{n}\leq(1/\gamma^{2})^{2/\delta}$ makes the above probability at most $(1/\gamma^{2})^{s}$ .

By an application of Markov’s inequality, with probability at least $7/8$ over the draw of hyperedges of $G$ , the number of cycles of length at most $\mathfrak{g}=\frac{1}{4}\log_{\gamma k^{2}}n$ is at most

By a union bound, now, all the three properties above can be ensured with probability at least $1/3$ .

In this section, we show that after fixing the underlying hyperedges $G$ of an instance, with high probability over the literals on constraints, all assignments are very close to a random assignment. Here closeness is measured with respect to the distribution $\{C(x)\}$ as one chooses a uniformly random constraint among all hyperedges of the hypergraph.

Let $G$ be any hypergraph with $m$ hyperedges. Let $\mathcal{I}$ be an instance with the same underlying hypergraph as $G$ , and with the literals in all clauses chosen uniformly at random. We have the following lemma.

Suppose $m=\Omega(2^{O(k)}\epsilon^{-2}n)$ . With high probability over the choice of literals, for any assignment $x\in\{\pm 1\}^{n}$ , the distribution $\{C(x)\}$ with $C$ chosen uniformly at random in $\mathcal{I}$ is within $\epsilon$ statistical distance to the uniform distribution over $\{\pm 1\}^{k}$ .

Now suppose the signs of the literals from $\mathcal{I}$ for every constraint are chosen uniformly at random, keeping the underlying hypergraph fixed. Then $\mu_{\mathcal{I},x}[y]$ is a random variable depending on the randomness of the literals. For each $i$ , the indicator $\mathds{1}_{C_{i}(x)=y}$ equals $1$ with probability $1/2^{k}$ , and equals $0$ with the remaining probability (over the randomness of the signs of the literals on the $i$ -th constraint), and the random variables $\mathds{1}_{C_{i}(x)=y}$ are independent of each other for different $i$ . Therefore $\mu_{\mathcal{I},x}[y]$ is the average of $m$ independent $\{0,1\}$ -indicator random variables, each being $1$ with probability $1/2^{k}$ . By the Chernoff–Hoeffding bound, we have $|\mu_{\mathcal{I},x}[y]-1/2^{k}|>\eta$ with probability at most $2\exp(-\eta^{2}m/2^{k+1})$ . By a union bound over all assignments $x\in\{\pm 1\}^{n}$ , the maximum deviation of $\mu_{\mathcal{I},x}[y]$ from $1/2^{k}$ (over all $x$ ) exceeds $\eta$ with probability at most $2\exp(-\eta^{2}m/2^{k+1}+n\log 2)$ . Letting $\eta=\epsilon/2^{k}$ , we see that

as long as $m=\Omega(2^{O(k)}\epsilon^{-2}n)$ .

Now, if the distribution $\{C_{i}(x)\}$ for a random $i\in[m]$ has statistical distance at least $\epsilon$ from the uniform distribution on $\{\pm 1\}^{k}$ , then $|\mu_{\mathcal{I},x}[y]-1/2^{k}|\geq\epsilon/2^{k}$ for some $y$ . By a union bound over all $y\in\{\pm 1\}^{k}$ , the distribution $\{C_{i}(x)\}$ is within $\epsilon$ statistical distance of the uniform distribution on $\{\pm 1\}^{k}$ except with probability $\exp(O(k)-\Omega(n))$ , assuming $m=\Omega(2^{O(k)}\epsilon^{-2}n)$ . ∎
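To see how the requirement $m=\Omega(2^{O(k)}\epsilon^{-2}n)$ arises, one can solve the bound $2\exp(-\eta^{2}m/2^{k+1}+n\log 2)$ with $\eta=\epsilon/2^{k}$ for $m$ . The sketch below does this for a single fixed $y$ ; the function names and the `target` failure probability are assumptions introduced for illustration, and the bound is handled in log-space to avoid floating-point overflow.

```python
from math import ceil, log

def log_failure_bound(m, n, k, eps):
    """Natural log of 2 * exp(-eta^2 * m / 2^(k+1) + n * log 2), eta = eps / 2^k."""
    eta = eps / 2 ** k
    return log(2) - eta ** 2 * m / 2 ** (k + 1) + n * log(2)

def clauses_needed(n, k, eps, target):
    """Smallest m (up to rounding) making the bound at most `target`."""
    eta = eps / 2 ** k
    return ceil(2 ** (k + 1) * (n * log(2) + log(2 / target)) / eta ** 2)
```

Since $\eta^{-2}=2^{2k}\epsilon^{-2}$ , the returned value is of order $2^{3k+1}\epsilon^{-2}n$ for fixed `target`, which is consistent with the stated form $2^{O(k)}\epsilon^{-2}n$ .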