Sum of Squares Lower Bounds from Pairwise Independence
Boaz Barak, Siu On Chan, Pravesh Kothari
Introduction
Constraint Satisfaction Problems (CSP) are among the most natural computational problems, and yet their computational complexity is not fully understood. In particular several works have studied the notion of Approximation Resistance, which loosely speaking means that the best polynomial-time approximation algorithm is simply the one that outputs a random assignment. Under Khot’s Unique Games Conjecture much is known about this property. In particular Austrin and Mossell showed if the UGC is true, then, for every predicate , if there exists a pairwise independent distribution over (i.e., a distribution such that for every , the marginal is the uniform distribution over ), then is approximation resistant. Austrin and Håstad used this to establish (under the UGC) fairly tight bounds on the threshold at which a random predicate of a particular density becomes approximation resistant. However, there is no consensus whether the UGC is true. Assuming only , the best known bound is by Chan who showed that a predicate is approximation resistant if it contains a distribution as above satisfying the additional condition that it is uniform over a subspace . This algebraic structure is a fairly strong condition. In particular if we choose to be a random predicate conditioned on (where is some parameter), then will satisfy the first condition (supporting a pairwise independent distribution) with high probability as long as for some constant while it will not satisfy the second condition even for as large as (see Observation A.1).
Another line of work has been concerned with proving unconditional lower bounds for these problems on restricted families of algorithms. These works considered convex relaxations for CSPs, where we say that a CSP is approximation resistant for some relaxation if there is an instance for which a random assignment is essentially optimal, but the relaxation value is (namely, the relaxation “thinks” that it’s possible to satisfy almost all constraints). Interestingly, the unconditional results match the conditional ones. That is, for certain weaker relaxations (namely, the Sherali-Adams linear programming hierarchy or Sherali-Adams augmented with the basic semidefinite program), there are unconditional results for the same predicates that were shown approximation-resistant under the UGC . (This is of course not a coincidence, as the UGC is intimately connected with some of these weaker relaxations .) In contrast, for the stronger Sum of Squares (SOS) (also known as Lasserre) relaxation , the previously known results utilized the same conditions as in Chan’s NP-hardness result (and in fact inspired Chan’s work).
In this work we show that the pairwise independence condition suffices for lower bounds even for this stronger Sum-of-Squares hierarchy. This result is interesting in its own right and, based on past experience, could also be viewed as suggesting that it may be possible to improve the UGC-based results to results based on .
Our results actually hold for a more general setting than showing approximation-resistance of predicates, and so to state them we need to introduce some notation. Roughly speaking, we show that for every and an arbitrarily small , there exists a set of -tuples of literals (i.e. variables or their negations) over the variables such that (1) for every assignment to the variables, the induced distribution on obtained by taking a random and looking at the literals in is -close to the uniform distribution on but (2) for every pairwise independent distribution over , there is a relaxation-solution that “cheats” the -degree SOS relaxation to think that there is a distribution over assignments (i.e. ) such that for every , the projection of to the literals in is distributed according to . This immediately implies that predicates supporting a pairwise independent distribution are approximation-resistant for this relaxation. We now formally state our results:
The Sum-of-Squares hierarchy can be thought of as optimizing over pseudo-expectations; see the survey and the references therein, as well as the lecture notes . For notational convenience, we will use variables over instead of . A literal is a function such that or for some . If is a -tuple of literals then we denote by the tuple . Our main result is the following:
For every , the distribution where is chosen at random in is within statistical distance to the uniform distribution over .
The following immediate corollary implies that predicates supporting pairwise independent distributions are approximation-resistant for -degree SOS:
For every and , if there exists a pairwise independent distribution supported on then there exists such that for all there is a set of -tuples of literals over such that
The value of the -degree Max- SOS relaxation for the fraction of satisfiable constraints on the instance is .
The instance is actually obtained at random (with some pruning of a small fraction of the constraints, or alternatively, with some loss in the “perfect completeness” condition). Thus our results can also be thought as giving some evidence to a conjecture of Barak, Kindler and Steurer that no polynomial-time algorithm (including in particular the SOS algorithm) can beat the basic semidefinite program on approximating random CSP instances.
Throughout this paper we restrict ourselves to the Boolean case, and do not consider extensions to a larger alphabet, though our methods may be useful in this case as well.
2 Related works
Grigoriev proved in 1999 that (in the language of this paper) 3XOR is approximation resistant for the degree Sum-of-Squares hierarchy. Grigoriev’s work in fact predated the papers of Parrilo and Lasserre proposing the SOS hierarchy, and so he used the different (but equivalent) language of Positivstellensatz Calculus proofs. (Also, as far we know, he did not note that these proofs can be efficiently found via a semidefinite program.) Grigoriev’s result was rediscovered in 2008 by Schoenebeck , who also noted that it implies approximation resistance for 3SAT and some other CSPs as well. Tulsiani (see also Chan ) further generalized these results and in particular showed that every predicate that contains a pairwise independent subgroup is approximation resistant for -degree SOS. Both Tulsiani and Schoenebeck follow Grigoriev’s technique of reducing SOS lower bounds to resolution width lower bounds. As far as we know, no other SOS integrality gaps for approximating CSPs were known, and there are very few SOS lower bounds in general, most notably Grigoriev’s lower bound for knapsack and the very recent result by Meka, Potechin and Wigderson for the planted clique problem (personal communication).
Arora, Bollobás, Lovász and Tourlakis obtained integrality gaps for the Lovász-Schrijver linear programming hierarchy for Vertex Cover. Schoenebeck, Trevisan and Tulsiani showed that Max-Cut is approximation resistant for levels of the Lovász-Schrijver hierarchy, and these results have been strengthened to the stronger Sherali-Adams hierarchy . The famous Goemans-Williamson algorithm shows that Max-Cut is not approximation resistant for even the degree SOS hierarchy, further underscoring the difference between these relaxations.
Perhaps closest to our work are the papers of Benabbas, Georgiou, Magen, and Tulsiani who showed that predicates containing a pairwise independent distribution are approximation resistant for rounds of the Sherali Adams hierarchy, even when one adds the degree SOS constraints. Indeed, our pseudo-distribution agrees with theirs, though we describe it somewhat differently, and most importantly, need a completely different argument to show that it is positive semi-definite. Our work is also inspired by the pseudo-expectation view of the SOS hierarchy as advocated in the papers .
Overview of our proof
We now review the construction of the instance, as well as the pseudo-expectation operator, and then discuss how we come up with these local orthogonal functions. As mentioned above, our instance will simply be a random instance, which we think of as a -uniform hypergraph with hyperedges . After some pruning we can assume this hypergraph has girth .If we don’t prune these clauses then our proof guarantees that for fraction of the clauses we get the marginal distribution to be . It is possible that this can be upgraded to all of the clauses at the expense of some additional complication, but we have not checked whether or not that’s the case. By a simple Chernoff + union bound argument, if for a sufficiently large constant then for every assignment , the induced distribution will be -close to the uniform distribution. For this informal overview, suppose that we merely want to establish the existence of a degree pseudo-expectation operator for some large constant . Note that this means that sets of at most (or even ) variables form a forest (i.e. disjoint collection of trees) in this hypergraph.
The above overview can be converted into a full proof with some care when by exploiting the acyclicity of all subgraphs involved. Extending to , however, introduces additional subtleties. When exceeds , subgraphs induced by vertices of can have cycles. An immediate effect of this is that the the definition of a closed set that we gave before no longer yields consistent local distributions on any collection of variables. An example of a problem that arises when cycles can exist on a set of vertices is illustrated in Figure 2. To fix this, we define a stronger notion of closed set that guarantees that all paths of length at most between any two vertices in are completely contained inside . This notion of closures differs from the one that Benabbas et. al. use. An appeal to the expansion property of (instead of high girth as before) can be used to show that the closure of a set is at most a constant factor larger than . Similarly, as before, we need to show that there exists a (suitably defined) ball, around any set of variables (of size at most ) such that the correlations with any other set of size at most are “captured” by the intersection of and . This needs a more careful argument. In particular, the correlations (even in the low girth case) are actually not necessarily captured by the intersection of with , but rather with some set that is related, but not identical to . However, the crucial property that we require is that the set satisfies (1) if came before in the ordering, then so will and (2) . This second property is more complicated to prove in the case where can be much larger than the girth bound, but turns out to hold there as well. The bottom line is that with additional care however, the high level picture provided by this overview can indeed be implemented and we give a full analysis based on the local Gram-Schmidt like process in Section 6.
Preliminaries
We collect some standard definitions and notation here. A -instance is a -uniform hypergraph over so that every hyperedge (also known as a clause) is labeled by a string . We identify a clause with the function that maps to where . We will sometimes also consider as a tuple of the literals . We write for the variables involved in (or covered by) a clause and similarly for we write for the set of all clauses such that . For any , we write to denote the tuple of coordinates in the subset . If and for disjoint sets and , we will write for the string in that projects to for coordinates in and to for coordinates in .
Unless explicitly mentioned, the base of all logarithms appearing in the paper is assumed to be . We consider the arity of our tuples to be a constant and so notation may hide the dependence on .
We now define some standard ideas in the context of hypergraphs.
The degree of is the maximum number of hyperedges that intersect with any given hyperedge in . The length of the shortest cycle in is said to be the girth of . For any vertices of a hypergraph , we define the distance, of in as the minimum number of hyperedges in any path that joins and in . For , subsets of vertices, we define
Next, we define the notion of expansion in a -uniform hypergraph :
A -uniform constraint hypergraph is said to be -expanding if any collection of at most hyperedges of cover at least vertices of , i.e. . We call , the coefficient of expansion of .
Let be a instance. We now describe the properties of the instances that we need and give a construction for them in Section B of the Appendix by taking a random instance and removing a few clauses. Specifically, we show the existence of nice instances, the ones that satisfy the properties described in the lemma below:
Fix and . Then, there exists a -uniform constraint hypergraph with edges such that for , , :
has girth
We will use this lemma with any given (the soundness slack), and . We will call the instances that satisfy the conditions of the lemma above as nice.
For such instances, it is also easy to prove the soundness part (part (i)) of Theorem 1.2 (see Section B.1 of the Appendix) which we record in the following lemma.
For every and , if is sufficiently large then there exists a nice -instance with the property that for every , the distribution is -close in total variation distance to the uniform distribution on .
Closed sets, and the definition of the pseudo-expectation
We first define the concept of closed sets that is central to our argument.
For every , a set is -closed if for every , any path of length at most between and is contained in . We say that is closed if it is -closed.
We define the -closure of , denoted by , to be the intersection of all sets such that and is -closed. The closure of , denoted by , is the -closure of .
Readers familiar with the definition of closure (or advice set) in the work of or will find the definition of closure above slightly different. The main difference is that our definition allows us to have some nice properties such as uniqueness and that the intersection of two closed sets is closed, which are very helpful for our proof. We stress however that the actual pseudo-expectation is the same as that of those works.
Next, we give a constructive definition of closure of a set.
Given and any , the -closure of can be obtained by the following procedure run on : Set . For every such that there is a path of length at most between and in not contained in , add every clause in the path to . Output .
Observe that the procedure terminates as there are only finitely many clauses. Further, the output is closed by virtue of the termination of the procedure. By induction on the time at which a path is added in the procedure, it is easy to show that every closed set containing must contain the path. Thus, is a closed set containing and every clause such that satisfies . The lemma now follows by the minimality of . ∎
Next, we bound the size of .
For any and such that . Then, and .
Consider the procedure described in Lemma 4.3. Let be the isolated vertices in . Observe that one cannot add any isolated vertices in the procedure and thus . Define . Then, .
If the process terminates before adding a total of clauses, then there’s nothing to prove, since yields that . Thus, suppose, for the sake of a contradiction, that the procedure adds clauses and let round of the procedure be the first round where the number of clauses added exceeds .
Let be the set of clauses added in the procedure till the round and let be the set of variables obtained by taking the union of variables covered by the clauses added and . Further, suppose that the round adds clauses. Then, and thus, must satisfy the expansion requirement: . On the other hand, any new path of length added in a round adds at most new vertices. Thus, on an average, every one of the at most new clauses added in any round of the procedure contribute at most: new vertices. Thus, .
This yields that using that . This is a contradiction.
The size claimed in the lemma now follows by observing that and that every clause contributes at most new variables. ∎
The following lemma summarizes the simple properties of the closures defined here.
For any , if and are -closed and then so is .
If then .
Every connected component of of size intersects in at least two elements.
Let . Then, .
If there are two vertices in such that , then since both and are closed, both of them should contain the unique (since ) path between them.
By definition, is an -closed set containing and hence if then would be an even smaller -closed set that contains , contradicting the minimality of .
Suppose otherwise that there is some connected component of with intersecting with at most one element , then we claim that is an -closed set containing . Clearly, . Now suppose for the sake of contradiction that there were two vertices of distance at most in whose path is not in . Then since and is -closed, the path between and must have had a vertex . But since one of or must be different than (say ), we get by contradiction that was connected to in .
Let . Since is closed and contains , . If , then, and is closed contradicting the minimality of . ∎
The definition and the proof of consistency of the local distribution we define were shown by Benabbas et. al. for the weaker notion of closures they used (in order to define linear round solutions in the Sherali Adams hierarchy). The argument for our notion of closure is similar but we include it here for the sake of completeness.
For every set , , let be the closure of and suppose is the set of isolated variables in . Define be all clauses such that . Then, we set:
where the projection of on to the coordinates in , and (. Observe that the above expression tells us that the marginal distribution of over is uniform. We extend the notation above and write for the marginal of on variables in .
We now show that defined above is indeed a probability distribution over .
Let and be closed sets such that and . Then,
is a valid probability distribution: .
is locally consistent: for every , .
The following claim that we record as a lemma will be useful in the proof.
There exists an ordering of clauses in and a partition of into sets such that for every , and .
We first complete the proof of Lemma 4.6 and then prove Lemma 4.7.
Let and . Let . Using (1), we have:
To simplify notation, we will write for and for where . We have, using the ordering given by Lemma 4.7. Then,
Now, . Further, . Thus, completing the proof. ∎
For every define . For any collection of clauses in , let . Similarly, define and . We make the following claim:
For any ,
We first complete the proof of the lemma using the claim. Since and , there exists a clause such that . Now and thus . We place this clause at the beginning of the ordering, call it and set . We now iterate with to complete the construction, obtain a clause such that . Since cannot intersect , we can now set . Continuing this way yields the required ordering and partition of . ∎
Fix any and consider any (maximally) connected subgraph with edges . If consists of a single clause , then (since is closed) and for any . Thus, .
Now suppose consists of at least clauses. We first claim that . To see this, observe that is a collection of at most clauses in and thus, . Further, every belongs to at least two different clauses in and thus, . Rearranging gives .
Next, we claim that that for every there exists a pair of clauses such that . Consider any clause such that . If there is another clause such that , then observe that cannot intersect in any other element (since is closed) and thus we can let be the pair as above, corresponding to . Otherwise, there exists a clause such that such that (since is connected) and (as otherwise there would be a path between two distinct vertices of , of length at most outside of ). Further, observe that all such pairs are disjoint. This is because if some pairs intersect, then they induce a path of length at most between two distinct vertices of that is not contained in (violating the closedness of ). Thus, . Thus, we must have: .
Since for every connected component inside we have that we must have as promised. This completes the proof of claim. ∎
Suppose and are closed disjoint sets such that is closed. Then, .
We now define the pseudo-expectation operator associated with the local distributions :
Let be a nice instance and a pairwise independent distribution over . Then the family of local distributions for satisfies:
Completeness: For every clause of , .
Consistency: for every , , the marginal of on to is .
The completeness property follows from (1) and . The consistency property follows from Lemma 4.6. ∎
Local Distribution on Unions
In this section we make an important step towards showing the positivity property of our pseudo-distribution by showing that if two sets and are sufficiently closed, then the local distribution on is only determined by the clauses that are contained in or in . In particular, this implies that if and are disjoint then the distribution on is independent of the distribution of . The main result of this section is the following expression for the local distribution on the union of and where is -closed for a sufficiently large constant and is closed.
Suppose is -closed for and is closed. Then, for any ,
where .
We make two convenient definitions before proceeding, see Figure 3:
For any two closed sets and , a path of length at most is said to be a bridge path for the pair if .
For any two closed sets and , a path of length at most is said to be a bridge-closure path for the pair , if there exists a bridge path such that and but .
As a final remark, observe that the example from Figure 1 shows that and being -closed is not enough to guarantee the statement of the lemma. While we believe that at least one of the sets out of and should be -closed for some for the lemma to hold, currently, we do not have any example of a counter example demonstrating this point. We now proceed with the actual proof.
Let . Let and be the set of bridge paths and bridge closure paths of for the pair , respectively. Observe that . We now show that these are the only extra clauses in :
The first observation describes how bridge paths and bridge-closure paths intersect.
For any distinct , .
For any distinct , where is a bridge path.
For any and , .
Suppose are such that and for some bridge paths . Then, .
If the claim weren’t true, then there must be a path of length between two vertices of which violating that is -closed.
Suppose first that there is a bridge path such that and . If either of or intersect in more than one element, then there is a cycle of length at most in which violates the fact that has girth. If and intersect in an element not contained in , then, again there is a cycle of length at most in violating the high girth of . Similarly, if intersect inside , then, they cannot intersect outside of and further, they cannot both intersect the same bridge path as it would yield a cycle of length at most in . Thus in both the cases, for some bridge path .
Otherwise there is a cycle of length at most in violating that has girth .
If not, then if then there is a cycle of length in the graph, contradicting our assumption on the girth. Otherwise which means that there is a path of length at most between two distinct vertices of .
The next observation shows that there is no path of length at most that connects two bridge paths, two bridge-closure paths or two bridge-bridge-closure paths that are not contained in .
There is no path of length at most not contained in that connects a bridge path and .
There is no path of length at most not contained in that connects with .
There is no path of length at most connecting distinct .
Otherwise there is a path of length at most between two vertices of not contained in , violating the fact that is closed.
Otherwise there is a path of length at most between two vertices of , violating that is closed.
Otherwise there is a path of length at most not contained in , connecting two vertices of .
The following claim is now a consequence of the claims above:
For any such that but , .
Consider the iterative procedure of building the closure of by adding paths one by one in some order. Let be the first path in this order that violates the claim. Then, either intersects two bridge paths or a bridge path and or a bridge path and a bridge-closure path or two bridge-closure paths. Each of these situations is explicitly barred by the claims above. This completes the argument. ∎
Let . Observe that . For every clause , let and . Similarly, let and . Next, we claim:
Let such that defined by and .
In this section, we prove our main result. Our proof will follow easily from the following lemma which is the main result of this section.
The rest of this section is devoted to proving Lemma 6.1.
Our aim is to build an order on the , in which to process them for our local orthogonalization procedure. We start with an arbitrary ordering on the clauses of , e.g. for every we define a unique index . We say that if:
is smaller than in lexicographic order of . That is, if the maximum index for is smaller than this maximum for , and if they are equal we break ties by the second largest index and so on. We define to be the index of according to this ordering. (Note that is a permutation on distinct closures, and so if then .)
If then we say that if .
If and then we break ties arbitrarily.
For , we let denote the set in this ordering. Note that and are the singleton elements (in some arbitrary order). We will write for in the following to reduce clutter.
2 Local Orthogonalization
Set . Define the local correlated space as
Invoking Lemma 4.12, it suffices to show that . This follows by noting that and (Lemma 4.4). ∎
The following simple lemma would be very useful.
Now suppose for the sake of contradiction that
for some . (If the expectation is negative then we can take .) Let . We have:
and so if is sufficiently small then
contradicting our choice of . ∎
3 Global Orthogonality lemma
We will need the following observation for the proof which we record before proceeding:
Suppose is a connected -uniform hypergraph such that there exist a subset of vertices, , satisfying: for every distinct . Then, must have at least hyperedges.
Observe that the collection of balls of radius around any vertex in are all disjoint and contain at least one path (due to connectedness of ). ∎
Fix any and let and . Let
For every we call any associated clause as in the definition above as a boundary clause. Let and and . Note that is not necessarily a subset of . Next, we make two useful claims:
We will show that . This immediately yields the claim by observing that . We note that the proof of this claim is significantly simpler in the case that . Proving it in the case when is a constant and is one of the main technical ingredients in getting the proof sketched in the overview to work for rounds of the SOS hierarchy.
Let be a (maximally) connected component in the subgraph defined by the hyperedges . Let and . is thus partitioned into for every possible maximally connected subgraphs . It is thus enough to prove that for any fixed .
Observe that . If , then, there is nothing to prove. If , then, contains where is a boundary clause associated with . If contains no vertex of , then, observe that is a closed set containing contradicting the minimality of . Thus, in this case, .
Now suppose for . Then, vertices in are connected through clauses in . On the other hand, since is -closed, for any , any path that uses clauses from between must be of length at least . Applying Lemma 6.6, we observe that .
Next, we claim that . It is easy to complete the proof once we have this claim: observe that
Rearranging yields that . Using yields that .
We now proceed to show that . By Lemma 4.5 (4), . Let . Then, by another application of Lemma 4.5 (4), . In other words, one can build the closure of by first building the closure of and (Step ) and then taking the closure of the unions of the obtained sets (Step ). Clearly, the final output contains every clause in . If we show that (1) and that (2) no clause from is added in the step , then every clause in must be added in the procedure to build and thus we are done. We now proceed to show the two statements above.
(1): First observe that itself can be built by building the closure of (and ), the closure of (that cannot intersect any clause from as then must include a vertex from , a contradiction) and finally taking the closure of their union. This last step cannot add a clause in : every path added connects and . If is contained in , then, there is nothing to prove. Otherwise must pass (exactly once) through a boundary vertex. If contains a clause from , then, if passes through a boundary vertex not in , then this enlarges violating that is a maximally connected component. If, on the other hand, passes through a boundary vertex in , then, connects with violating the maximality of . Thus, cannot include any clause from .
(2): Consider the step of the procedure to build . In this step, we add paths (of length at most ) that connect and . For any such path , if includes some clause from then it crosses out of (exactly once) and thus must pass through a boundary vertex. By maximality of , we must have that and . On the other hand, the part of that connects some vertex in to is of length at most and thus must be contained in . Thus every edge in is present in and thus .
Suppose . Then, for every , .
Since , . Thus, . Now, from Claim 6.7. Thus, every subset has a well-defined ordering w.r.t . Further, for every such , (Lemma 4.5) and thus, . Hence, . ∎
Now, and we can write
Consider an arbitrary assignment to and to . Let be the function that on input outputs if and zero otherwise.
Lemma 5.1 gives the expression for the local distribution on . Using the expression, we have:
for ever , .
Now (Claim 6.7) and :
Each index set of the characters above is a subset of and thus (invoking Claim 6.8 along with the fact ). Thus, . Using Lemma 6.3, thus,
We can now complete the proof of Lemma 6.1.
Acknowledgements
Thanks to Ryan O’Donnell, Li-Yang Tan, and David Steurer for fruitful discussions and the anonymous reviewers for their valuable comments and suggestions on a previous version of this paper.
References
Appendix A Random sparse predicates
Consider a random sparse predicate on variables and accepting assignments. If , we now show that does not support a pairwise independent subgroup with high probability, as tends to infinity. Here the randomness corresponds to choosing to be a -sized subset of uniformly at random.
Under the condition of the observation, does not contain any pairwise independent subgroup, because any such a subgroup contains an affine subspace of dimension .
Let be an enumeration of vectors in . Note that if contains a subspace of dimension 2, then there are such that this subspace is exactly the affine span of .
For a fixed choice of the triple , conditioning on the event that span an affine subspace of dimension , the remaining vector from this affine subspace also belongs to with probability at most . Taking a union bound over (at most such choices), we see that contains an affine subspace with probability at most . ∎
Appendix B Constructing nice instances
In this section, we show the existence of nice instances of constraint hypergraphs and prove Theorem 3.4.
Fix and . Then, there exists a -uniform constraint hypergraph with edges such that for , , :
has girth , and
We first choose a random graph by choosing every uniform hyperedge, independently, with probability . Our final hypergraph will be obtained by removing hyperedges from .
For chosen as above, with probability at least ,
has between and edges.
has at most cycles of length at most g and
We first show that the claim above is enough to complete the proof of the lemma. We define to be the hypergraph obtained by removing every cycle of length at most g.By the claim above, the total number of hyperedges removed in this process, for a large enough , is at most . Observe that the last property in the statement of the theorem is immediately satisfied by . Further, since is obtained only by removing hyperedges from , still enjoys -expansion. Thus, is a constraint hypergraph that satisfies the requirements of the lemma. Finally, the total number of edges removed is sublinear in and thus has at least edges for a large enough .
We now move on to complete the proof of the claim above:
The expected number of edges in is given by . By an application of Chernoff bound, the probability that the number of edges does not lie in the interval is at most .
Next, consider any collection of clauses and let us compute the probability that they cover at most variables for some . This probability, is then upper bounded by
Using that and the approximation , we can upper bound the above expression by:
Using that and that now yields an upper bound of
Thus, using that and that satisfies makes the above probability at most .
By an application of Markov’s inequality, with probability at least over the draw of hyperedges of , the number of cycles of length at most are at most
By a union bound, now, all the three properties above can be ensured with probability at least .
In this section, we show that after fixing the underlying hyperedges of an instance, with high probability over the literals on constraints, all assignments are very close to a random assignment. Here closeness is measured with respect to the distribution as one chooses a uniformly random constraint among all hyperedges of the hypergraph.
Let be any hypergraph with hyperedges. Let be an instance with the same underlying hypergraph as , and with the literals in all clauses be chosen uniformly at random. We have the following lemma.
Suppose . With high probability over the choice of literals, for any assignment , the distribution with chosen uniformly at random in is within statistical distance to the uniform distribution over .
Now suppose the signs of the literals from for every constraint are chosen uniformly at random, keeping the underlying subhypergraph fixed. Then is now a random variable depending on the randomness of the literals. For each , the indicator equals with probability , and equals with the remaining probability (over the randomness of the signs of the literals on the -th constraint), and the random variables are independent of each other for different . Therefore is the average of independent -indicator random variables, each being with probability . By Chernoff–Hoeffding bound, we have with probability at most . By a union bound over all assignments , the maximum deviation of from (over all ) exceeds with probability at most . Letting , we see that
as long as .
Now the distribution for a random has statistical distance at least implies that for some . By a union bound over all , the distribution is close in statistical distance to the uniform distribution on except with probability , assuming . ∎