SoS and Planted Clique: Tight Analysis of MPW Moments at all Degrees and an Optimal Lower Bound at Degree Four

Samuel B. Hopkins, Pravesh K. Kothari, Aaron Potechin

Introduction

Let G(n,p)G(n,p) be the Erdős-Rényi random graph where each edge is present in GG with probability pp independently of others. It is an easy calculation that the largest clique in GG(n,12)G\sim G(n,\frac{1}{2}) is of size (2+o(1))log(n)(2+o(1))\cdot\log{(n)} with high probability. Recovering such a clique using an efficient algorithm has been a long standing open question in theoretical computer science. As early as 1976, Karp [Kar76] suggested the impossibility of finding cliques of size even (1+ε)log(n)(1+\varepsilon)\log{(n)} for any constant ε>0\varepsilon>0 in polynomial time. Karp’s conjecture was remarkably prescient and has stood ground after nearly 44 decades of research.

Lack of algorithmic progress on the question motivated Jerrum [Jer92] and Kucera [Kuc95] to consider a relaxed version known as the planted clique problem. In this setting, we are given a graph GG obtained by planting a clique of size ω\omega on a graph sampled according to G(n,12)G(n,\frac{1}{2}). Information theoretically, the added clique is identifiable as long a ωlog(n)\omega\gg\log{(n)}. The goal is to recover the added clique via an efficient algorithm for as small an ω\omega as possible. This variant is also connected to the question of finding large communities in social networks and the problem of signal finding in molecular biology [PS000]. Despite attracting a lot of attention, the best known polynomial time algorithm can only find planted cliques when their size ω=Ω(n)\omega=\Omega(\sqrt{n}) [AKS98, FK03]. The LS+ semi-definite programming hierarchy leads to the state of the art trade off: planted cliques of size ωn2d\omega\approx\sqrt{\frac{n}{2^{d}}} can be recovered in time nO(d)n^{O(d)} for any d=O(log(n)d=O(\log{(n)}.

Recently, this difficulty of finding cliques of size ωn\omega\ll\sqrt{n} has led to an increasing confidence in planted clique being a candidate for an average case hard problem and has inspired new research directions in cryptography [ABW10], property testing [AAK+07], machine learning [BR13], algorithmic game theory [DBL09, ABC11] and mathematical finance [DBL10].

In this paper we are interested in understanding the Sum of Squares (SoS, also known as Lasserre) semi-definite programming (SDP) hierachy [Las01, Par00] for the planted clique problem. This is a family of algorithms, paramterized by a number dd called the degree, where the dthd^{th} algorithm takes nO(d)n^{O(d)} time to execute. The sum of squares hierarchy can be viewed as a common generalization and extension of both linear programming and spectral techniques, and as such has been remarkably successful in combinatorial optimization. In particular it captures the state of the art algorithms for problems such as Sparsest Cut [ARV04], MAX CUT [GW95], Unique Games/Small Set Expansion [ABS10, BRS11, GS12]. Recently, [BBH+12] showed that a polynomial time algorithm from this hierarchy solves all known integrality gap instances of the Unique Games problem, and similar results have been shown for the hard instances of MAX-CUT [DMN13] and Balanced Separator [OZ13]. Very recently, [LRS15] showed that the sum of squares algorithm is in fact optimal amongst all efficient algorithms based on semidefinite programming for a large class of problems that includes constraint satisfaction and the traveling salesman problem. Moreover, Barak, Kelner and Steurer [BKS14, BKS15] used the SoS hierarchy to give improved algorithms to average case problems such as the dictionary learning problem and the planted sparse vector problem that have at least some similarity to the planted clique problem.Thus several researchers have asked whether the SoS hierarchy can yield improved algorithms for this problem as well.

The first published work along these lines was of Meka, Potechin and Wigderson [MPW15] who showed that for every d2d\geqslant 2, the degree 2d2d SoS cannot find planted cliques of size smaller than n12d\approx n^{\frac{1}{2d}}.We use \approx to denote equality up to factors polylogarithmic in nn (the size of the graph) and with an arbitrary dependence on the degree parameter dd. Deshpande and Montanari [DM15] independently proved a tighter lower bound of n1/3\approx n^{1/3} for the case of degree 44. In the main result of this paper, we extend the prior works and show that the first non trivial extension of the spectral algorithm, namely the SoS algorithm of degree 44, cannot find cliques of size n\approx\sqrt{n}, a bound optimal within polylog(n)\operatorname{poly}\log{(n)} factors. Our lower bound for degree 44 is obtained by a careful “correction” to the certificate used by [MPW15] and [DM15] in their lower bounds.

A similar result was obtained in an independent work by Raghavendra and Schramm [RS15]. In our second main result, we give a tight analysis of the certificate considered by [MPW15] and [DM15] and show that it yields a lower bound of n1d+1n^{\frac{1}{d+1}}.

The certificate of [MPW15, DM15] is sufficient to show an Ω(n/2d)\Omega(\sqrt{n/2^{d}}) lower bound for the degree dd LS+LS+ hierarchy [FK03] (which is a weaker SDP that also runs in time nO(d)n^{O(d)}). However, a generalization of an argument of Kelner (see Section 10) shows that this is not the case for the SoS hierarchy, and our analysis of this certificate is tight. Hence our work shows that to get stronger lower bounds for higher degree SOS it is necessary and sufficient to utilize more complicated constructions of certificates than those used for weaker hierarchies. Whether this additional complexity results in an asymptotically better tradeoff between the running time and clique size remains a tantalizing open question.

Technical Overview

The SoS semidefinite programming hierarchy yields a convex programming relaxation for the planted clique problem. That is, we derive from the input graph GG a convex program PG\mathcal{P}_{G} such that if the graph had a clique of size ω\omega then PG\mathcal{P}_{G} is feasible. To show that the program fails to solve the planted clique problem with parameter ω\omega, we show that with high probability there is a solution (known as a certificate) for the program PG\mathcal{P}_{G} even when GG is a random graph from G(n,1/2)G(n,1/2) (which in particular will not have a clique of size logn\gg\log n).

It is known that such an approach cannot work to obtain a n\approx\sqrt{n} lower bound for the SoS program of degree 44 and higher. That is, this natural certificate does not satisfy the conditions of the SoS program. Hence to obtain our lower bound for degree 44 SoS we need to consider a more complicated certificate, that can be thought of as making a global “correction” to the simple certificate of [MPW15, DM15]. For higher degrees, we have not yet been able to analyze the corresponding complex certificate, but we are able to give a tight analysis of the simple certificate, showing that it certifies an ωn1/(d+1)\omega\approx n^{1/(d+1)} lower bound on degree 2d2d SoS relaxation. The key technical difficulty in both our work and prior ones is analyzing the spectrum of random matrices that have dependent entries. Deshpande and Montanari [DM15] achieved such an analysis by a tour-de-force of case analysis specifically tailored for the degree 44 case. However, the complexity of this argument makes it unwieldy to extend to either the case of the more complex certificate or the case of analyzing the simple certificate at higher degrees. Thus, key to our analysis is a more principled representation-theoretic approach, inspired by Grigoriev [Gri01a], to analyzing the spectrum of these kind of matrices. We hope this approach would be of use in further results for both the planted clique and other problems.

We now give an informal overview of the SoS program for planted clique, the [MPW15, DM15] certificate, our correction to it, and our analysis. See Section 3 and [BS14] for details.

2 The “Simple Moments”

The simplest form of such a pseudo-distribution is to set

3 There is such a thing as too simple

4 Fixing the simple moments

for every clique SS. Note that when ω=εn\omega=\varepsilon\sqrt{n}, the correction factor would typically be of the form 1±Θ(ε)1\pm\Theta(\varepsilon).While it might seem that there is a chance for these pseudo-expectations to be negative, if ω<n/polylog(n)\omega<\sqrt{n}/\operatorname{polylog}(n) then it is exceedingly unlikely that there will exists an SS such that rS>n/ω|r_{S}|>n/\omega, and so we ignore this issue in this overview.

If we now focus on the contribution of the n4n^{4} terms where the jkj_{k}’s are all distinct, we see that each such set SS yields the term

5 Analyzing the corrected moments

The above gives some intuition why the corrected moments might be better than the simple moments for one set of polynomials. But a priori it is not at all clear that those polynomials encapsulated all the issues with the simple moments. Moreover, it is also unclear whether or not the correction itself could introduce additional issues, and create new types of negative eigenvectors. Ruling out these two possibilities is the crux of our analysis.

Because of the dependencies in the random matrix M\mathcal{M}, the deviation from expectation varies according to the eigenspace. Thus, in [DM15], the deviation from the expectation is analyzed by first decomposing along the eigenspaces V0,V1,V2V_{0},V_{1},V_{2}.

A second technical idea is required to carry out this decomposition. Because of the symmetries present in the spaces V0,V1,V2V_{0},V_{1},V_{2}, this decomposition is very nearly the same as splitting up the matrix M\mathcal{M} in an ostensibly unrelated way. Each entry M(I,J)\mathcal{M}(I,J) of M\mathcal{M} for I,J([n]2)I,J\in{{[n]}\choose{2}} is the 0/10/1 indicator for the presence of a clique on IJI\cup J. This indicator is just the AND of all the ±1\pm 1 indicators gbg_{b} for the presence of the edges bE(IJ)b\in\mathcal{E}(I\cup J). Taking a Fourier decomposition of this suggests a way to decompose M\mathcal{M} as \mathcal{M}=\sum_{\text{subsetsSofedgesonof edges on4nodes}}\mathcal{M}_{S},This is not quite the whole picture, see Section 7.1.1. where the matrix MS\mathcal{M}_{S} corresponds to the Fourier character SS. The matrices MS\mathcal{M}_{S} can be matched up to the subspaces V0,V1,V2V_{0},V_{1},V_{2} in such a way that those matrices with larger spectral norm (corresponding to larger deviations from expectation) have subspaces with smaller eigenvalues in their kernels!

The foregoing is missing one subtlety. Some monomial matrices MS\mathcal{M}_{S} do not match nicely to a single subspace. Instead they form cross terms: for example, having V2V_{2} in the left kernel but not in the right kernel (not all these matrices need be symmetric). In fact, it is just such a matrix which keeps the simple moments from remaining PSD beyond ωn1/3\omega\approx n^{1/3}.

For the four nodes a1,a2,b1,b2a_{1},a_{2},b_{1},b_{2}, consider the monomial ga1,b1ga1,b2g_{a_{1},b_{1}}g_{a_{1},b_{2}} and the corresponding matrix MS({a1,a2},{b1,b2})(ω4/n4)ga1,b1ga1,b2\mathcal{M}_{S}(\{a_{1},a_{2}\},\{b_{1},b_{2}\})\approx(\omega^{4}/n^{4})g_{a_{1},b_{1}}g_{a_{1},b_{2}}. The entries MS\mathcal{M}_{S} do not depend on a1a_{1}, and so there are many repeated rows, which creates a much larger spectral norm for MS\mathcal{M}_{S} than if it had independent entries. [DM15] prove the (tight) bound MSω4/n7/2\|\mathcal{M}_{S}\|\approx\omega^{4}/n^{7/2}. At the same time, it turns out only to have V2V_{2} in its left kernel, not its right one. Appealing to the above picture, in order to have ω4/n7/2ω5/2/n2\omega^{4}/n^{7/2}\ll\omega^{5/2}/n^{2}, we must have ωn1/3\omega\ll n^{1/3}.

We make one further observation about the matrix MS\mathcal{M}_{S}^{\prime} from the previous section: its rows are the tensor squares of the ±1\pm 1 neighborhood indicator vectors rir_{i} from above. Our fix to the simple moments, described above as adjusting individual pseudo-expectations, amounts roughly to adding to M\mathcal{M} the matrix (ω5/n5)i(ri2)(ri2)(\omega^{5}/n^{5})\sum_{i}(r_{i}^{\otimes 2})(r_{i}^{\otimes 2})^{\dagger}. This carves out of V2V_{2} (our worst subspace from an eigenvalue perspective) a new subspaces space V1.5V_{1.5} with eigenvalues lower-bounded by λ1.5ω5/nω2/n2\lambda_{1.5}\approx\omega^{5}/n\gg\omega^{2}/n^{2}. Now instead of matching the bad matrix MS\mathcal{M}_{S} to V1V_{1} and V2V_{2} as a cross term, we can match it to V1V_{1} and V1.5V_{1.5} as a cross term. Then we only need ω4/n7/2λ1λ1.5ω4/n7/2\omega^{4}/n^{7/2}\ll\sqrt{\lambda_{1}\lambda_{1.5}}\approx\omega^{4}/n^{7/2}. With some care in the details, the above picture can be made precise.

However, a crucial point is that the matrix N=(ω5/n5)i(ri2)(ri2)N=(\omega^{5}/n^{5})\sum_{i}(r_{i}^{\otimes 2})(r_{i}^{\otimes 2})^{\dagger} doesn’t satisfy the clique constraints (in that all entries I,JI,J with IJI\cup J not a clique should be ). A chunk of our proof goes into analyzing the discrepancy between the matrix NN and its zeroed out version. Our analysis here requires the use of new combinatorial tools (Section 6.4) combined with the trace moment method.

As we have alluded to already, a key technical step in our proof is to show that certain Fourier-decomposed matrices of the form discussed above have some of the subspaces V0,V1,V2V_{0},V_{1},V_{2} in their kernels. In the analysis of simple moments for degree 44, [DM15] use explicit entries for canonical forms of eigenvectors in V0,V1V_{0},V_{1} to accomplish this. However this approach hits analytical roadblocks for the analysis in case of higher degrees. Canonical forms for the eigenvectors are hard to pin down explicitly from the literature in algebraic combinatorics.

A similar approach was utilized heavily by Grigoriev [Gri01a] to prove a sum of squares lower bound for the knapsack problem. While for degree 44 either explicit eigenvectors or our approach will work, although the latter takes some more elbow grease, ours is absolutely vital for our tight analysis of the MPW moments for the higher degrees. We hope that such an approach will be useful for proving better (approaching ωn\omega\approx\sqrt{n}) integrality gap for Sum of Squares relaxations of higher degree for Planted Clique and other related problems.

The analysis of the MPW operator at higher degrees also presents other new challenges that do not show up in the special case of degree 44 analyzed in [DM15]. [DM15] deal with the optimization version of the degree 44 SOS program which could be potentially weaker than the one we analyze here (and thus our lower bound is technically stronger). Working with the “optimization” version simplifies the analysis in [DM15] a little bit as the matrix M\mathcal{M} has entries that only have local dependence on the graph GG. We explicitly work with the feasibility version of the degree 4 SOS program and thus, must deal with the additional complexity of the entries of M\mathcal{M} having a global dependence. As in [MPW15], we deal with this situation by separating M\mathcal{M} into matrices LL and Δ\Delta such that LL has only local dependence on the graph GG. [MPW15] deal with Δ\Delta by a simple entrywise bound, however, employing such a bound yields no improvement over the bound proved in [MPW15] for us. It turns out that we have to do a fine grained analysis of the Δ\Delta matrix itself by a decomposition for Δ\Delta such that each piece is essentially only locally dependent on the graph. Once we have such a decomposition, our ideas from the analysis of LL can be extended to that setting as well.

Finally, our argument for analyzing the spectral norms of each of the pieces of encountered in the decompositions also needs to be much more general than in case of [DM15] to handle higher degrees. For this, we identify a simple combinatorial structure that controls the norm bounds and allows a general hammer for computing the norms of all the matrices that appear in this analysis. Our proofs here are based on the trace power method and build on the combinatorial techniques in [DM15].

6 Preview of Technical Toolkit

In this section we give a preview of the key lemmas that allow us to carry out the analyses described thus far. We have simplified some issues for the sake of exposition; details may be found in Section 6.

We are concerned with the matrices in the aforementioned Fourier decomposition. Let B[d]×[d]B\subseteq[d]\times[d] be a bipartite graph on 2d2d vertices. Let QBQ_{B} be an ([n]d)×([n]d){{[n]}\choose{d}}\times{{[n]}\choose{d}} matrix with entries

(We are ignoring what happens if IJI\cap J\neq\emptyset.)

This first lemma bounds the spectral norm of such a matrix in terms of the shape of BB.

We also want to show that these matrices have nontrivial kernels, so we can bound their negative eigenvalues against the parts of the simple moments with larger positive eigenvalues. The following allows us to carry out this matching of Fourier decomposition matrices to eigenspaces V0,,VdV_{0},\ldots,V_{d} of the expectation matrix.

7 Related Work

There’s a large amount of work on understanding Linear and Semidefinite Programming based hierarchies. A detailed survey on the sum of squares hierarchy and references to works related can be found in [BS14]. The earliest works on proving SoS lower bounds were due to Grigoriev [Gri01a, Gri01b] who showed that degree Ω(n)\Omega(n) SoS does not beat the random assignment for 3SAT or 3XOR even on random instances from a natural distribution. Some of these lower bounds were rediscovered by Schoenebeck [Sch08]. Lower bounds for SOS essentially rely on gadget reductions from 3SAT or 3XOR and this approach has been understood in some detail [Tul09, BCV+12]. An exception to this methodology is the recent work of Barak et al. in proving SoS lower bounds for pairwise independent CSPs [BCK15]. Even though the lower bounds for CSPs are for random instances, the average-case nature of the problem does not show up as a main analytic issue. There has recently been a surge of interest in understanding the performance of SoS on average-case problems of interest in machine learning, both in proving upper and lower bounds [HSS15, BM15, MW15, GM15, BKS15].

For the planted clique problem, Feige and Krauthgamer gave an analysis of the performance of the LS+ semidefinite heirarchy tight to within constants [FK00, FK03] giving the state of the art algorithm for finding planted cliques in any fixed polynomial time. Other algorithmic techniques not based on convex relaxations have been studied and shown to fail for planted clique beyond ωn\omega\approx\sqrt{n}, most prominantly Markov Chain Monte Carlo (MCMC) [Jer92]. Recently, Feldman et. al. [FGR+13] showed a lower bound for (a variant of) the planted clique problem in the restricted class of statistical algorithms that generalize MCMC based methods and many other algorithmic techniques. Frieze and Kannan [FK08] proposed an approach for the planted clique problem through optimizing a degree-33 polynomial related to the random graph. Such polynomials are NP hard to optimize in the worst case but the belief is that the random nature of the polynomials might be helpful. This approach was generalized to higher degree polynomials by Brubaker and Vempala [BV09].

There has also been recent work on variants of the problem that define Gaussian versions of the planted clique and more generally, the hidden submatrix problems showing, for example, strong inditinguishability results about the spectrum of the associated matrices with and without planting [MRZ14]. Finally, the present work builds heavily on independent papers of Meka, Potechin, and Wigderson [MPW15] and Deshpande and Montanari [DM15], which we have already thoroughly discussed.

Section 3 contains preliminaries. Section 4 contains definitions and the necessary background on the simple moments, a.k.a. the MPW operator. Section 5 contains the formal definition of our corrected degree 44 moments. Section 6 lays out the technical framework for the analyses of the corrected degree 44 moments and for the tightened bounds on the MPW operator at higher degrees. Here we define the Fourier decompositions alluded to above and carry out representation-theoretic arguments about their kernels. Section 7 and Section 8 use the tools we have built thus far to prove the main theorems. In Section 9 we prove a technical concentration result for small subgraphs of G(n,1/2)G(n,1/2) required for the analysis. In Section 10 we sketch Kelner’s argument showing that our analysis of the MPW moments is nearly optimal.

Preliminaries

We will use the following general notation in the paper.

GG will denote a draw from G(n,12)G(n,\frac{1}{2}) unless otherwise stated.

For a square symmetric matrices Q,RQ,R, we write QRQ\succeq R to mean QRQ-R is positive semidefinite.

For any matrix MM, M||M|| denotes its largest singular value, or, equivalently, M=maxx:x2=1Mx2.||M||=\max_{x:||x||_{2}=1}||Mx||^{2}.

For matrices M,NM,N of same dimensions, MNM\odot N denotes their entrywise or Hadamard product, i.e., (MN)(I,J)=M(I,J)N(I,J)(M\odot N)(I,J)=M(I,J)\cdot N(I,J) for every I,JI,J.

For a graph GG and any set of two vertices of GG, ee, geg_{e} denotes the {1,1}\{-1,1\} indicator of the edge ee being present in GG. That is, ge=+1g_{e}=+1 if ee is an edge in GG and 1-1 otherwise.

For a set II of vertices of GG, E(I)=(I2)\mathcal{E}(I)={I\choose 2}, the set of all pairs from II.

For a pair of subsets of vertices I,JI,J of GG, Eext(I,J)=E(IJ)(E(I)E(J)),\mathcal{E}_{ext}(I,J)=\mathcal{E}(I\cup J)\setminus(\mathcal{E}(I)\cup\mathcal{E}(J)), the set of cut edges between II and JJ.

For a subspace VV, ΠV\Pi_{V} denotes the projector to VV.

Following [BBH+12] and many subsequent papers, we work with SoS using the language of pseudo-expectations.

Let p0,,pkPdnp_{0},\ldots,p_{k}\in\mathcal{P}_{d}^{n}. If a pseudo-expectation satisfying the constraints {p0=0,,pk=0}\{p_{0}=0,\ldots,p_{k}=0\} exists, it can be found in time nO(d)n^{O(d)}. If none exists, a certificate of infeasibility of these equations is found instead.

The following observations (actually both the same observation in different forms) will come in handy in our analysis.

by our assumptions on PiMPjP_{i}MP_{j} and Cauchy-Schwarz. For each i,ji,j, we know 1k(Pixλi+Pixλj)2kPixPjxλiλj\frac{1}{k}(\|P_{i}x\|\lambda_{i}+\|P_{i}x\|\lambda_{j})\geqslant\frac{2}{k}\|P_{i}x\|\|P_{j}x\|\sqrt{\lambda_{i}\lambda_{j}}, which implies that the whole expression is nonnegative. ∎

The MPW Operator

For any set I[n]I\subseteq[n], let xIx_{I} be the monomial ΠiIxi\Pi_{i\in I}x_{i}.

Further, we set C2d=C2d(G)C_{2d}=C_{2d}(G) be the number of 2d2d-cliques in GG.

Our analysis improves upon the analysis in [MPW15] and generalizes the improved analysis for the special case of d=2d=2 done in [DM15]. By a generalization of the counter example due to Kelner, our analysis can actually be shown to be tight. We defer the details of the counter example to the full version. In the remaining part of this section, we begin the task of proving Theorem 4.4 by introducing certain simplifications and computing the eigen values of the expected value of the matrix under G(n,12)G(n,\frac{1}{2}).

Observe that for any I,JI,J, M(I,J)\mathcal{M}^{\prime}(I,J) is chosen so as to depend only on the edges with one end point in II and the other in JJ. Intuitively, this corresponds to thinking of II and JJ as being cliques in GG by addition of some edges. Moreoever, M(I,J)\mathcal{M}^{\prime}(I,J) is chosen so that M(I,J)=M(I,J)\mathcal{M}^{\prime}(I,J)=M(I,J) whenever I,JI,J are actually cliques in GG. Thus, as noted above, we have the following fact (which is Lemma 5.1 in [MPW15]).

M\mathcal{M} is PSD if M\mathcal{M}^{\prime} is PSD.

Thus, the following lemma completes the proof of Theorem 4.4.

2 The Expectation Matrix

We first describe the entries of the matrix M\mathcal{M}^{\prime}.

Next, we need a basic fact about the (shared) eigenspaces of all the set symmetric matrices, in particular, their number and dimensions which follows from the following well known result from classical theory of Johnson schemes.

For 0jd0\leqslant j\leqslant d, dim(Vj)=(nj)(nj1)dim(V_{j})={n\choose j}-{n\choose{j-1}}.

Using a nice basis for the matrices in J\mathcal{J}, one can obtain the following estimates of the eigenvalues of EE on ViV_{i} for each 0id0\leqslant i\leqslant d:

Let ω<n2d3d2d1\omega<\frac{n-2d}{3d2^{d-1}} and dω/2d\leqslant\omega/2. Let λj(E)\lambda_{j}(E) be the eigenvalue of EE on VjV_{j} as defined in Fact 4.9. Then,

The Corrected Operator for Degree Four

In this section, we present the pseudodistribution that we will use to show an almost optimal lower bound on the degree 44 SOS algorithm. Our pseudodistribution is obtained by “correcting" the one described in the previous section. The correction itself is inspired by an explicit polynomial described by Kelner who showed that the pseudodistribution from the previous section for degree 44 does not satisfy positivity for ωn1/3\omega\gg n^{1/3}.

Let γ>0\gamma>0 be a real parameter to be chosen later. Let L\operatorname{\mathcal{L}} be the linear operator on the linear space of homogeneous, multilinear polynomials of degree 44 such that:

For ω>0\omega>0 and cωc\ll\omega, there exists xx, x=ω±O(c/ω3)x=\omega\pm O(c/\omega^{3}), such that (x4)=(ω4)+c{x\choose 4}={\omega\choose 4}+c.

We first use the correction operator L=Lγ\operatorname{\mathcal{L}}=\operatorname{\mathcal{L}}_{\gamma} to define the corrected moments on all multilinear monomials of degree 44. For every S[n]S\subseteq[n], S=4|S|=4, we set:

Then we know there exists ω=ω±O(c/ω3)>0\omega^{\prime}=\omega\pm O(c/\omega^{3})>0 satisfying (ω4)=def(ω4)+c{\omega^{\prime}\choose 4}\stackrel{{\scriptstyle\textrm{def}}}{{=}}{\omega\choose 4}+c (using Fact 5.2). Thus, the degree 44 moments we defined “think” ixi=ω\sum_{i}x_{i}=\omega^{\prime}. We use this relationship to extend the definition to all the monomials. For every S[n]S\subseteq[n], S=3|S|=3,

Furthermore, if GG(n,1/2)G\sim G(n,1/2), then with probability 1O(n25)1-O(n^{-25}), ωω<O(γlog(n)2ω2/n5/2)|\omega^{\prime}-\omega|<O(\gamma\log(n)^{2}\omega^{2}/n^{5/2}).

There is ω0=Ω(n/polylogn)\omega_{0}=\Omega(\sqrt{n}/\operatorname{polylog}n) and γ=Θ(1)\gamma=\Theta(1) so that for ωω0\omega\leqslant\omega_{0}, with probability at least 11/n1-1/n over the draw of GG(n,12)G\sim G(n,\frac{1}{2}), N0\mathcal{N}^{\prime}\succ 0.

We prove the first item; the others are similar.

The proof is by several applications of McDiarmid’s inequality. By a standard Chernoff bound there is a universal constant C0C_{0} so that for every s[n]s\in[n] and every i,j,k[n]i,j,k\in[n], with probabilty 1O(n40)1-O(n^{-40}),

for some other universal constant C1C_{1}. So by McDiarmid’s inequality there is C2C_{2} so that with probability 1O(n34)1-O(n^{-34}),

By a similar argument there is C3C_{3} so that for every s,i[n]s,i\in[n], with probability 1O(n30)1-O(n^{-30}),

for some other constant C4C_{4}. The result follows by a final application of McDiarmid’s inequality (we lose a factor of nn at this step as opposed to the n\sqrt{n} at previous steps because there are n2\approx n^{2} edges to be revealed). ∎

We can now complete the proof of Lemma 5.5 using Lemma 5.7 and Fact 5.2.

In Section 6, we develop some general tools for analyzing the matrices that we encounter before going on to prove Theorem 4.4 and Lemma 5.6.

Tools

We provide background in the required tools from basic representation theory below.

For a finite dimensional complex vector space VV, let Hom(V,V)Hom(V,V) be the set of all linear maps from VV into VV. For any finite group GG and π:GHom(V,V)\pi:G\rightarrow Hom(V,V), the pair (π,V)(\pi,V) is said to be a representation of GG if π\pi satisfies, for any g1,g2Gg_{1},g_{2}\in G,

where the “\cdot” on the LHS corresponds to the group operation and on the RHS, the composition of linear maps on VV. When the map π\pi is clear from the context (as some natural action of the group GG on VV), we abuse notation and just say that VV is a representation of GG.

Let (π,V)(\pi,V) be a representation of a group GG. A subspace WVW\subseteq V is said to be a subrepresentation if for every wWw\in W, π(g)wW\pi(g)w\in W for every gGg\in G. That is, WW is a stable or invariant subspace for all the linear maps π(g)\pi(g), one for each gGg\in G. Observe that in this case, (π,W)(\pi,W) is another representation of GG. A representation (π,V)(\pi,V) of GG is said to be irreducible if for any subspace WW invariant under all the linear maps π(g)\pi(g) for gGg\in G, W=VW=V or W={0}W=\{0\}.

Suppose (π,V)(\pi,V) and (π,W)(\pi^{\prime},W) are representations of a group GG. Suppose L:VWL:V\rightarrow W is a linear map such that for any gGg\in G and vVv\in V,

Then, for any irreducible representation ViVV_{i}\subseteq V under π\pi, L(Vi)WL(V_{i})\subseteq W is an irreducible representation in WW under π\pi^{\prime}.

2 Eigenspaces of the Set Symmetric Matrices

The set of all set symmetric matrices is known as the Johnson scheme in algebraic combinatorics. All such matrices commute and thus share eigenspaces. While the matrices in the Johnson scheme are well studied, the description of the eigenspaces in the literature is hard to use for the purpose of our proofs. We thus take a more direct approach and use basic representation theory in what follows to identify a simple symmetry condition on the eigenspaces of set symmetric matrices which will be useful to understand the spectral properties of the matrices we study.

Then, vV0V1Vtv\in V_{0}\oplus V_{1}\oplus\cdots\oplus V_{t}.

3 Kernels of Patterned Matrices

In this section we design some general tools to understand the spectral structure of matrices that have restricted variations around the set symmetric structure discussed in the previous section. The main tool we will use to establish these results is Lemma 6.4 shown in the previous section. Before moving on to this task, we describe a high level overview of what we intend to do. The following paragraph can be skipped to dive directly into the technical details without the loss of continuity.

The study of the eigenspaces of set symmetric matrices lets us completely understand the spectral structure of the expectation matrix EE. In the next section when we analyze the spectrum of M\mathcal{M}^{\prime}, we will encounter matrices that depend on the underlying graph GG and thus are not set symmetric. However, if the dependence on the underlying graph GG is in some sense limited, we hope that some of the nice algebraic properties that set symmetry grants us should perhaps continue to hold. In our case, we will be able to decompose EE into various pieces and for each of these pieces, the entry at (I,J)(I,J) has dependence on the graph GG based only on the status (edge vs no edge) of a small number of pairs (i,j)I×J(i,j)\in I\times J. The goal of this section is to develop tools to understand certain (coarse) spectral properties of such matrices.

and the right automorphism group of BB as

We are now ready to define patterned matrices:

When q=0q=0, we write QB,fQ_{B,f} for the corresponding patterned matrix.

The following result describes the kernels of certain symmetrized sums of QB,Z,fQ_{B,Z,f} and is the main claim of this section.

We first show that uu above is well defined in that the definition does not depend on the specific subset ITI_{T} used so long as ITTI_{T}\supseteq T. We adopt the notation (that ignores the “direction” of action) Bσ=defσBB^{\sigma}\stackrel{{\scriptstyle\textrm{def}}}{{=}}\sigma\circ B only for the calculations that follow.

We can equivalently write the claim above as:

This completes the proof using Lemma 6.4. ∎

4 Concentration for Locally Random Matrices over G​(n,12)𝐺𝑛12G(n,\frac{1}{2})

For d2d\geqslant 2, d=O(log(n))d=O(\log{(n)}), and a bipartite graph BBB\in\mathcal{B}, let Q=QB,fQ=Q_{B,f} be a patterned matrix with f(x)=bBxbf(x)=\prod_{b\in B}x_{b}. That is,

The next main result of this section considers a different class of matrices that appear in the analysis in Section 8.

Let U×U\subseteq\times be a bipartite graph on 44 vertices and suppose UU is nonempty. Let M([n]2)×([n]2)M\in{{[n]}\choose{2}}\times{{[n]}\choose{2}} be a matrix with the entry at {a1,a2}\{a_{1},a_{2}\}, {b1,b2}\{b_{1},b_{2}\} for a1a2a_{1}\leqslant a_{2} and b1b2b_{1}\leqslant b_{2} are given by

The proofs of both these results are based on the standard idea of analyzing the trace of higher powers of a matrix to prove bounds on its spectral norm. The proof of Lemma 6.11 is similar to the proofs via the trace power method for bounding the norms of matrices as presented in [DM15]. The general format we present here will come in handy for multiple applications to various matrices in Section 7. Lemma 6.12 deals with somewhat more complicated matrices that appear in the analysis of the corrected operator for degree 44 lower bound. Nevertheless, as is common in such proofs, the analysis is based on a combinatorial analysis of the terms that make non zero contribution to the trace powers combined with the simplifying effect of random partitioning based arguments. We describe the details of the proof in the following section.

It is enough to show that QQ^{\prime} occurs as a principal submatrix of QQ. For this, take the submatrix of rows and columns of MM indexed by tuples (a1,,ad)(a_{1},\ldots,a_{d}) in sorted order, i.e., with a1a2ada_{1}\leqslant a_{2}\leqslant\ldots a_{d}. ∎

We will use the following lemma to break dependencies in certain random matrices by decomposing them into matrices whose entries, while still dependent, have additional structure.

Suppose Q(I,J)=0Q(I,J)=0 when IJI\cap J\neq\emptyset. Let (S11,,Sk1),,(S1r,,Skr)(S_{1}^{1},\ldots,S_{k}^{1}),\ldots,(S_{1}^{r},\ldots,S_{k}^{r}) be a sequence of partition of [n][n] into kk bins. Each partition induces a matrix based on QQ as follows:

Then, there is a family of partitions (S11,,Sk1),,(S1r,,Skr)(S_{1}^{1},\ldots,S_{k}^{1}),\ldots,(S_{1}^{r},\ldots,S_{k}^{r}) such that Q=i=1rQiQ=\sum_{i=1}^{r}Q_{i} with rO(kklogn)r\leqslant O(k^{k}\log n).

where T(a1,,ad),(b1,,bd)iT^{i}_{(a_{1},\ldots,a_{d}),(b_{1},\ldots,b_{d})} is the set of indices jj so that j,(a1,,ad),(b1,,bd)j,(a_{1},\ldots,a_{d}),(b_{1},\ldots,b_{d}) respect the partition (S1i,,Ski,Ti)(S_{1}^{i},\ldots,S_{k}^{i},T^{i}) and do not respect any partition (S1i,,Ski,Ti)(S_{1}^{i^{\prime}},\ldots,S_{k}^{i^{\prime}},T^{i^{\prime}}) for any i<ii^{\prime}<i.

Then, there is a family (S11,,Sk1,T1),,(S1r,,Skr,Tr)(S_{1}^{1},\ldots,S_{k}^{1},T^{1}),\ldots,(S_{1}^{r},\ldots,S_{k}^{r},T^{r}) of partitions of [n][n] so that Q=i=1rQiQ=\sum_{i=1}^{r}Q_{i} with rO((k+1)k+1logn)r\leqslant O((k+1)^{k+1}\log n).

We present the proof of 1; the proof of 2 is almost identical. For rr to be chosen later, we pick partitions (S11,,Sk1),,(S1r,,Skr)(S_{1}^{1},\ldots,S_{k}^{1}),\ldots,(S_{1}^{r},\ldots,S_{k}^{r}) uniformly at random and independently so that each is partition of [n][n] into sets of size n/kn/k each.

Call (a1,,ad),(b1,,bd)(a_{1},\ldots,a_{d}),(b_{1},\ldots,b_{d}) good at step ii if aj,bjSjia_{j},b_{j}\in S_{j}^{i} for every j<kj<k and aj,bjSkia_{j},b_{j}\in S_{k}^{i} if jkj\geqslant k. It is enough to show that after rO(kklogn)r\leqslant O(k^{k}\log n) steps the probability that every {a1,,ad,b1,,bd}\{a_{1},\ldots,a_{d},b_{1},\ldots,b_{d}\} of size 2d2d is good at some step iri\leqslant r.

Fix some (a1,,ad),(b1,,bd)(a_{1},\ldots,a_{d}),(b_{1},\ldots,b_{d}) with {a1,,ad,b1,,bd}=2d|\{a_{1},\ldots,a_{d},b_{1},\ldots,b_{d}\}|=2d. It is good at step ii with probability at least kkk^{-k}. Since the steps are independent, after rr steps

which is at most 1/n10d1/n^{10d} for some r=O(kklogn)r=O(k^{k}\log n).

Taking a union bound over all O(n2d)O(n^{2d}) tuples (a1,,ad),(b1,,bd)(a_{1},\ldots,a_{d}),(b_{1},\ldots,b_{d}) with {a1,,ad,b1,,bd}=2d|\{a_{1},\ldots,a_{d},b_{1},\ldots,b_{d}\}|=2d completes the proof. ∎

The usefulness of the above definition is captured by the following claim:

4.2 Graph-Theoretic Definitions and Lemmas

In this section, we set up some notation and definitions helpful in our proofs of the main results of this section. The next few definitions and notation are generalization of the ones used in [DM15] to general degrees dd and are useful in the proof of Lemma 6.11.

Let GG be a graph. A labeled UU-ribbon RR is a tuple (R,F)(R,F) where RR is a UU-ribbon and F:RGF:R\rightarrow G is a map labeling each vertex of RR with a vertex in GG. We require that for (u,v)(u,v) an edge in RR, F(u)F(v)F(u)\neq F(v).

Let (R,F)(R,F) be a labeled UU-ribbon where UU has 2d2d vertices. We say (R,F)(R,F) is disjoint if for every ii,

Let (R,F)(R,F) be a labeled UU-ribbon where UU has 2d2d vertices. We say that (R,F)(R,F) is contributing if no element of the multiset {(F(u),F(v)):(u,v)R}\{(F(u),F(v))\,:\,(u,v)\in R\} occurs with odd multiplicity.

The following combinatorial lemma will serve as a tool in the proofs of the main results for this section.

The assumption on UU implies that RR contains the cycles

The next few definitions and notation are needed in the proof of Lemma 6.12.

Where GG is a graph, a labeled fancy UU-ribbon is a tuple (R,F)(R,F) where RR is a fancy UU-ribbon and F:RGF:R\rightarrow G labels each vertex of RR with a vertex in GG. We require for any edge (u,v)R(u,v)\in R that F(u)F(v)F(u)\neq F(v).

When UU is empty the proof is similar: there are two paths, {ai1,bi1,ci}\{a_{i}^{1},b_{i}^{1},c_{i}\} and {ai2,bi2,c2}\{a_{i}^{2},b_{i}^{2},c_{2}\}. ∎

4.3 Proofs of Lemma 6.11 and Lemma 6.12

By Lemma 6.13 it is enough to prove the analogous claims for the nd×ndn^{d}\times n^{d} matrix QQ with entries given by

By multiplying QQ by suitable permutation matrices P,PP,P^{\prime} to give PQPPQP^{\prime}, we may assume in the 22-matching case above that the matching is {(1,1),(2,2)}\{(1,1),(2,2)\} and in the nonempty graph case that the edge contained is (1,1)(1,1) (where we think of the vertex set of UU as [d]×[d][d]\times[d]). Note that Q=PQP\|Q\|=\|PQP^{\prime}\|.

We apply Lemma 6.14 to obtain a family of matrices {Qi}i[r]\{Q_{i}\}_{i\in[r]} for some r=O(33logn)=O(logn)r=O(3^{3}\log n)=O(\log n) satisfying Q=iQiQ=\sum_{i}Q_{i}. On any entry (a1,,ad),(b1,,bd)(a_{1},\ldots,a_{d}),(b_{1},\ldots,b_{d}) on which QiQ_{i} is nonzero it is equal to QQ at that entry, and furthermore for each QiQ_{i} there is a partition (S1i,,S3i)(S_{1}^{i},\ldots,S_{3}^{i}) of [n][n] so that if Qi[(a1,,ad),(b1,,bd)]0Q_{i}[(a_{1},\ldots,a_{d}),(b_{1},\ldots,b_{d})]\neq 0 then a1,b1S1i,a2,b2S2ia_{1},b_{1}\in S_{1}^{i},a_{2},b_{2}\in S_{2}^{i}, and aj,bjS3ia_{j},b_{j}\in S_{3}^{i} for all j>2j>2.

Taking a union bound over the logn\log n matrices QiQ_{i}, we get that

and so by the triangle inequality applied to M=iQi\|M\|=\|\sum_{i}Q_{i}\|, we get

We first handle the case when UU is non empty. By Lemma 6.13 it is enough to prove the analogous statement for the n2×n2n^{2}\times n^{2} matrix, also by abuse of notation denoted QQ, which is the sum of matrices (abusing notation again) QkQ_{k} with entries given by

By multiplication with an appropriate permutation matrix (which cannot change the spectral norm), we may assume that UU contains the edge (1,1)(1,1). We begin with 2 from Lemma 6.14, whose hypotheses are satisfied by our convention ya,a=0y_{a,a}=0. This gives a family Q1,,QrQ_{1},\ldots,Q_{r} with r=O(33logn)=O(logn)r=O(3^{3}\log n)=O(\log n) so that i=1rQi\sum_{i=1}^{r}Q_{i} and a corresponding family of partitions (S11,S21,T1),,(S1r,S2r,Tr)(S^{1}_{1},S^{1}_{2},T^{1}),\ldots,(S_{1}^{r},S_{2}^{r},T^{r}). Lemma 6.14 guarantees that Qi[(a1,a2),(b1,b2)]=kTyk,a1yk,a2yk,b1yk,b2(i,j)Uyai,bjQ_{i}[(a_{1},a_{2}),(b_{1},b_{2})]=\sum_{k\in T}y_{k,a_{1}}y_{k,a_{2}}y_{k,b_{1}}y_{k,b_{2}}\prod_{(i,j)\in U}y_{a_{i},b_{j}} for some TTiT\subseteq T^{i} when a1,b1S1ia_{1},b_{1}\in S_{1}^{i} and a2,b2S2ia_{2},b_{2}\in S_{2}^{i} and is zero otherwise.

By a union bound and triangle inequality, we then get

The proof in the case of UU empty is similar, using Lemma 6.23 in the empty UU case.

Analyzing Deviations for the Degree-d𝑑d MPW Operator

In this section, we use the tools developed in Section 6 to analyze the spectrum of the deviation matrix D=MED=\mathcal{M}^{\prime}-E and prove Lemma 4.7.

As noted in Section 7, we decompose M=E+D\mathcal{M}^{\prime}=E+D. For any I,J([n]d)I,J\in{{[n]}\choose{d}}, D(I,J)D(I,J) depends on a) deg(IJ)\deg(I\cup J) and b) whether Eext(I,J)G\mathcal{E}_{ext}(I,J)\subseteq G. If D(I,J)D(I,J) depended only on b) above, then it could be decomposed into a sum of patterned matrices defined in Section 6; analyzing these is tractable. Our first step is thus to get rid of the dependence on deg(IJ)\deg(I\cup J)—the only part depending on the entire graph. We will obtain a matrix LL that depends only on whether Eext(I,J)G\mathcal{E}_{ext}(I,J)\subseteq G or not (and thus is “locally random" in the sense of [MPW15]).

and p(i)=2(di)2p(i)=2^{-(d-i)^{2}} for each ii. We set L(I,J)L(I,J) for every I,J([n]d)I,J\in{{[n]}\choose{d}} to be

The first is regarding the high level approach. The approach of [DM15] used explicit expressions for a canonical set of eigenvectors in V1V_{1} to obtain similar conclusions as us for the case of d=2d=2. This approach gets unwieldy very quickly because the explicit entries of eigenvectors for ViV_{i} for i>1i>1 are hard to work with [BI84]. We tackle this issue by developing an argument that doesn’t need explicit entries of the eigenvectors. Instead, we use basic representation theory (Section 6.3) to identify a set of symmetries satisfies by vectors in ViV_{i} for each ii and use it obtain the conclusions we require.

Second, [DM15] deal with the optimization version of the degree 44 SOS program which, as noted in the introduction, could be potentially weaker than the one we analyze here (and thus our lower bound is technically stronger). This simplifies the analysis in [DM15] a little bit as the matrix Δ\Delta defined above is identically zero for the operator analyzed. We explicitly work with the feasibility version of the degree 4 SOS program and thus, must deal with the additional complexity of handling Δ\Delta. It turns out that we have to do a fine grained analysis of the Δ\Delta matrix itself. The decomposition we use for Δ\Delta is somewhat different from the case of LL even though, the analysis of each piece of the decomposition proceeds similar to the case of LL.

Third, for the special case of d=2d=2, essentially the only matrix one has to analyze is the L0L_{0}, the matrix obtained by zeroing out all entries (I,J)(I,J) in LL such that IJI\cap J\neq\emptyset: a uniform bound on spectral norm of the remaining component suffices. However, for higher dd, one has to deal with the “non-disjoint" entries with some care and an argument analogous to the one in [DM15] fails to show PSDness of M\mathcal{M}^{\prime} beyond ωn12d\omega\approx n^{\frac{1}{2d}} giving no asymptotic improvement over [MPW15].

Finally, our argument for analyzing the spectral norms of each of the pieces also needs to be much more general than in case of [DM15] to handle higher degrees. For this, we identify a simple combinatorial structure (size of maximum matchings in appropriate bipartite graphs on 2d2d vertices) that controls the bounds and could also be used to obtain slick proofs of the conclusions required in [DM15] in the context of analyzing M\mathcal{M}^{\prime} for d=2d=2. Our combinatorial argument itself is a generalization of the one given by [DM15] for this case.

With probability at least 11n1-\frac{1}{n} over the draw of GG(n,12)G\sim G(n,\frac{1}{2}), each 0id0\leqslant i\leqslant d satisfies

The next lemma does a (even more) fine grained analysis of the spectrum of Δ\Delta:

With probability at least 11n1-\frac{1}{n} over the draw of GG(n,12)G\sim G(n,\frac{1}{2}), for each 0id0\leqslant i\leqslant d:

We can now use Lemma 7.2 and Lemma 7.3 to complete the proof of Lemma 4.7.

For each 0i,jd0\leqslant i,j\leqslant d, we compute ΠiMΠj\Pi_{i}\mathcal{M}^{\prime}\Pi_{j} and use Lemma 3.4. We write M=E+L+Δ\mathcal{M}^{\prime}=E+L+\Delta. First, ΠiEΠj=0\Pi_{i}^{\dagger}E\Pi_{j}=0 whenever iji\neq j as Πi\Pi_{i} are projectors to eigenspaces of EE. Let λ0,λ1,,λd\lambda_{0},\lambda_{1},\ldots,\lambda_{d} be the eigenvalues of EE on eigenspaces V0,V1,,VdV_{0},V_{1},\ldots,V_{d}. From Lemma 4.10, we have:

In what follows, all our statements hold with probability at least 1O(1)/n1-O(1)/n:

Using Lemma 7.2 and Lemma 7.3, for every i2i\geqslant 2,

Then, it is easy to check that for any ω=O(n1d+1)\omega=O(n^{\frac{1}{d+1}}),

Next, we bound the cross terms Πi(L+Δ)Πj|\Pi_{i}(L+\Delta)\Pi_{j}| for iji\neq j. Again, using Lemma 7.2 and Lemma 7.3, we have for i,j2i,j\geqslant 2:

For ωO(n1d+1)\omega\leqslant O(n^{\frac{1}{d+1}}), it is again easy to check that, for i,j2i,j\geqslant 2, the above expression is at most 2dλiλj\frac{2}{d}\sqrt{\lambda_{i}\lambda_{j}}.

In the case when one of i,ji,j is at most 11, we have the bound

In this case, it is easy to check that so long as ω=O(n1d+1/polylog(n))\omega=O(n^{\frac{1}{d+1}}/\operatorname{poly}\log{(n)}),

By an application of Lemma 3.4, the proof is complete. ∎

We first describe the high level idea of the proof.

We start by decomposing L=q=0dLqL=\sum_{q=0}^{d}L_{q} where Lq(I,J)=L(I,J)L_{q}(I,J)=L(I,J) if IJ=q|I\cap J|=q and otherwise. Notice that each LqL_{q} then is obtained by a scaling an appropriate 0/10/1 matrix.

Most illuminating is the disjoint case L0L_{0}, which is nonzero only at entries I,JI,J with IJ=I\cap J=\emptyset. For any disjoint I,JI,J, L0(I,J)L_{0}(I,J) depends only whether Eext(I,J)G\mathcal{E}_{ext}(I,J)\subseteq G, which, one could write as an appropriately scaled AND function of the indicators geg_{e} of edges eEext(I,J)e\in\mathcal{E}_{ext}(I,J). We can expand this AND function in the monomial (parities of subsets of geg_{e} variables) basis. Each such monomial corresponds to the bipartite graph BB that contains the pairs eEext(I,J)e\in\mathcal{E}_{ext}(I,J) that constitute the monomial. This gives a decomposition of L0L_{0} into 2d212^{d^{2}}-1 (since the constant term is , LL being zero mean) components, L0BL_{0}^{B} for each non empty, labeled bipartite graph on [d]×[d][d]\times[d].

We can bound the spectral norm of each of the pieces L0BL_{0}^{B} by direct application of tools derived in Section 6.4. The main work in this section goes into showing that depending upon the structure of BB, an appropriate selection of subspaces ViV_{i} lie in left or right kernels of L0BL_{0}^{B}. Thus, for a fixed term ΠiLΠj\Pi_{i}L\Pi_{j}, some L0BL_{0}^{B} do not contribute. We identify the maximum spectral norm among contributing terms to obtain the final bound.

To accomplish this goal, we rely heavily on the tools built in Section 6.3 which give us a handle on the symmetries of the eigenspaces V0,V1,,VdV_{0},V_{1},\ldots,V_{d}. This requires some work based on representation theory of finite groups and is presented in Lemma 6.4 and Lemma 6.9.

The case of LqL_{q} for q0q\neq 0 needs even finer decomposition. We decompose each LqL_{q} (q>0q>0) into matrices that identify the “pattern" of the qq intersecting vertices. In [MPW15] a similar idea is used to reduce the task of bounding the spectral norm of LqL_{q} to a calculation similar to one in the case of L0L_{0}. However, unlike [MPW15], we also require properties of the kernels of the components of the decomposition. After restricting to a fixed intersection pattern of qq vertices, we thus resort to using a generalization of the kernel analysis used for the L0L_{0} case. We now proceed with the proof plan as described beginning with the decomposition of each LqL_{q}.

1.1 Decomposing L𝐿L

We start by decomposing LL further as L=q=0dLqL=\sum_{q=0}^{d}L_{q} where for any I,J([n]d)I,J\in{{[n]}\choose{d}},

Recall that B\mathcal{B} is the set of all bipartite labeled graphs with left and right vertex sets labeled by [d][d] and [d][d]. Recall also from Section 6.3 that for any I,J[n]I,J\subseteq[n] with I=J=d|I|=|J|=d, the graph ζI,J(B)\zeta_{I,J}(B) is a copy of BB on vertex sets I,JI,J where the correspondence between II and [d][d] and JJ and [d][d] is determined by the sorting map ζ\zeta. Finally, recall that for a graph GG on [n][n], we let gbg_{b} be the ±1\pm 1 indicator for the presence of edge bb in GG, and by convention gb=0g_{b}=0 when b=(i,i)b=(i,i) for any i[n]i\in[n]. For any BBB\in\mathcal{B}, define an ([n]d)×([n]d){{[n]}\choose{d}}\times{{[n]}\choose{d}} matrix

We think of L0L_{0} as a rescaling and centering of a 0/10/1 matrix whose entries are the AND of the ±1\pm 1 indicators for the edges in Eext(I,J)\mathcal{E}_{ext}(I,J). Decomposing these ANDs into monomials over those ±1\pm 1 indicators, we see that each monomial corresponds exactly to one bipartite graph BB, and the centering of L0L_{0} corresponds to removing the constant monomial, which corresponds to the empty bipartite graph. Every other monomial recieves equal weight 2d22^{-d^{2}} in this expansion, and so from these observations it becomes routine to verify that

Similarly, we further decompose LqL_{q} for q>0q>0. Here things are a bit more involved. Let us motivate our decomposition by understanding the structure of the matrix LqL_{q} for q>0q>0 a little bit. Consider an entry (I,J)(I,J) such that IJ=KI\cap J=K. Then, Eext(I,J)Eext(IK,JK)\mathcal{E}_{ext}(I,J)\subseteq\mathcal{E}_{ext}(I\setminus K,J\setminus K). Thus, the edge structure in the bipartite subgraph on vertex sets IKI\setminus K and JKJ\setminus K decides the value of L(I,J)L(I,J) for any graph GG and we can hope to a get a patterned matrix. We now follow this intuition.

We now define a matrix which is nonzero only on entries which intersect in at least qq places:

Finally, it is again easy to verify that:

1.2 Spectral Analysis of L𝐿L

In order to prove Lemma 7.2, we will first use the decomposition described in the previous section to write LqL_{q} as a sum of appropriate patterned matrices. We will then partition the sum into groups, each group corresponding to an equivalence class of (left or right) similar bipartite graphs BB. We will infer some properties about the kernel and finally use the spectral norm bounds from Section 6.4 to complete the proof.

With probability at least 11/n101-1/n^{10} over the draw of GG(n,12)G\sim G(n,\frac{1}{2}),

Before proving the three claims above, we show how they imply Lemma 7.2:

For the second part, fix an i2i\geqslant 2. We first show that some terms in the decomposition in (7.3) do not contribute to ΠiLqΠi\left|\Pi_{i}^{\dagger}L_{q}\Pi_{i}\right|.

In the remaining part of this section, we complete the proofs of Claims 7.6 and 7.5.

1.3 Proof of Claims

In this section, we obtain quick proofs of the three claims above using the tools developed in Section 6.

The proof is by appealing to Fact 3.3. Observe that

The next is a direct application of Lemma 6.9.

Finally, we prove Claim 7.5 using Lemma 6.11.

where K[n]K\subseteq[n] is some fixed subset of size qq such that KI=KJ=K\cap I^{\prime}=K\cap J^{\prime}=\emptyset.

2 Proof of Lemma 7.3

We now move on to analyzing the spectrum of the matrix Δ\Delta.

The high level plan of the proof is similar to that of Lemma 7.2. We define Δi\Delta_{i} for each 0id0\leqslant i\leqslant d as follows:

We further split Δq=K:K=qΔq,K,\Delta_{q}=\sum_{K:|K|=q}\Delta_{q,K}, where:

First, we observe that Δ0=0\Delta_{0}=0. This is because deg(IJ)\deg(I\cup J) when II and JJ are disjoint is exactly 11 and doesn’t depend on the graph. Thus, Δ=i=1nΔi.\Delta=\sum_{i=1}^{n}\Delta_{i}. As before we would like to spot patterned matrices in each Δi\Delta_{i} to show that appropriate eigenspaces VjV_{j} lie in the kernel of Δi\Delta_{i}. In case of Δq\Delta_{q}, however, there’s a difference how this needs to be done. This is because each entry (I,J)(I,J) of Δi\Delta_{i} potentially depends on the edges from every vertex in the graph GG to II and JJ. This is unlike the case of LL where the (I,J)(I,J) entry depends only on the edges between II and JJ (in fact that’s the reason we separated LL from Δ\Delta in the analysis). Nevertheless, we give a decomposition below that will help us make claims similar to the ones in the case of analyzing LL in this case too.

Let us first explain the main idea in the decomposition. The entry Δq(I,J)\Delta_{q}(I,J) depends on two events: a) whether Eext(I,J)G\mathcal{E}_{ext}(I,J)\subseteq G and b) the number of subsets S[n]S\subseteq[n] of size S=q|S|=q that form 2d2d-cliques with IJI\cup J. The main observation that motivates our decomposition is the following: in the event that Eext(I,J)G\mathcal{E}_{ext}(I,J)\subseteq G, the deviation in deg(IJ)\deg(I\cup J) is completely captured (up to low order terms) by just the number of vertices ss that has an edge to all of IJI\cup J in GG. This allows us to write entries of Δq\Delta_{q} as a sum of contribution to the deviation due to each vertex ss separately. For the case of q=1q=1, this argument is in fact exact and there are no low order terms. When q>1q>1, the contributions due to individual vertices contribute the bulk of the deviation and only low order terms remain.

From here on, we are in a situation similar to the one encountered in analyzing LL in the previous subsection. We show that the components in the decomposition with large spectral norm do not contribute to quadratic forms over eigenspaces with small eigenvalues of the expectation matrix EE using the idea of patterned matrices from Section 6.3. We show that the remaining components have small spectral norm using the combinatorial techniques combined with the trace moment method developed in Section 6.4.

Using the matrices above, we can show the following approximate factorization for the entries of Δq,K\Delta_{q,K}:

for η=22IJ+(IJ+12)(2d2)(nI)2dI1(2dI1)!.\eta=2^{-2|I\cup J|+{{|I\cup J|+1}\choose 2}-{{2d}\choose 2}}\frac{(n-|I|)^{2d-|I|-1}}{(2d-|I|-1)!}.

Let AIJA_{I\cup J} be the set of vertices ss in GG not in IJI\cup J so that (s,i)G(s,i)\in G for all iIJi\in I\cup J. By definition, if IJI\cup J is a clique,

and hence the lemma follows from Theorem 9.4. ∎

We will need another definition before proceeding: For each ii,

As in the case of analyzing LL, we define the filled in versions of es,Kie^{i}_{s,K} as follows:

We start by giving norm bounds on all the matrices involved in the decomposition in Lemma 7.7.

Then, s[n],K:K=qQ=R(q)\sum_{s\in[n],K:|K|=q}Q=R^{(q)} in the sense of Definition 6.15. Thus, by Fact 6.16 R(q)22dR.\|R^{(q)}\|\leqslant 2^{2d}\|R\|. Thus, we focus on bounding R\|R\| in the following. We will use the trace moment method for this purpose and the argument is similar to the ones made in Section 6.4 to develop the general purpose spectral concentration results. For this reason, we will be a bit more terse than in the case of the other applications of the trace moment method before. We set up the notation for the general RR as above and specialize the combinatorial reasoning for each of the specific matrices involved later.

We now investigate when does a term in the expansion above contribute a non-zero value to the LHS.

We can now complete the proof of Lemma 7.3 using Lemma 7.8and Lemma 7.9.

For each qq, let AqA_{q} be the expression given by Lemma 7.7 to approximate the entries of Δq\Delta_{q}. Then, we have for any i,ji,j:

Analyzing Deviations for the Corrected Degree-4 Operator

The goal of this section is to prove Lemma 5.6, that is, to show that N0\mathcal{N}^{\prime}\succ 0. The proof is organized into 55 main claims that we next present.

(Recall that C4C_{4} is the number of 44-cliques in GG.) We then set

where M\mathcal{M}^{\prime} is the filled in MPW matrix (Definition 4.5).

Our first claim shows that it is enough to prove PSDness of N\mathcal{N}:

For any γ,c\gamma,c there is ω0=Ω(n/log(n)cγ)\omega_{0}=\Omega(\sqrt{n/\log(n)c\gamma}) so that for any ωω0\omega\leqslant\omega_{0} with probability 1O(n10)1-O(n^{-10}) if Nω2n2/cI\mathcal{N}\succ\omega^{2}n^{2}/c\cdot I then N0\mathcal{N}^{\prime}\succ 0.

Let ΠWa\Pi_{W_{a}} be the projector to WaW_{a} for every a{0,1,1.5,2}a\in\{0,1,1.5,2\}.

For every γ\gamma there is ω0=Ω(γn/log(n)2)\omega_{0}=\Omega(\sqrt{\gamma n}/\log(n)^{2}) so that for ωω0\omega\leqslant\omega_{0}, with probability 1O(n10)1-O(n^{-10}),

Next, we analyze the spectrum of LL. Here at last we see the main technical improvement in these corrected moments—the cross term between V1V_{1} and V2V_{2} has become a cross term between W1W_{1} and W2W_{2} and has much-reduced norm.

Next, we bound the spectral norm of Δ\Delta. This is a direct corollary of the more general bound in Lemma 7.3, but to have a self-contained proof of the degree-4 case we also give a proof later in this section.

Let M\mathcal{M}^{\prime} be the (n2)×(n2){n\choose 2}\times{n\choose 2} filled-in matrix for the degree-4 MPW moments with clique size ω\omega (Definition 4.5). Let Δ\Delta be as in Definition 7.1. With probability 1O(n10)1-O(n^{-10}), ΔO(ω3n3/2log(n)2)\|\Delta\|\leqslant O(\omega^{3}n^{3/2}\log(n)^{2}).

Let GG(n,1/2)G\sim G(n,1/2). With probability 1O(n10)1-O(n^{-10}), RR0O(γω5n1/2log(n)2)\|\mathcal{R}-\mathcal{R}_{0}\|\leqslant O(\gamma\omega^{5}n^{1/2}\log(n)^{2}).

The proofs of these lemmas follow, but first we complete the proof of Lemma 5.6 and hence of Theorem 5.4.

By Lemma 8.1, it will be enough to exhibit c,γ=polylognc,\gamma=\operatorname{polylog}n and ω1=Ω(n/log(n)cγ)\omega_{1}=\Omega(\sqrt{n/\log(n)c\gamma}) so that N(ω2n2/c)I\mathcal{N}\succ(\omega^{2}n^{2}/c)\cdot I with probability 1O(n10)1-O(n^{-10}) when ωω1\omega\leqslant\omega_{1}. Then our final bound will be given by the minimum of ω1\omega_{1} and ω0\omega_{0} of Lemma 8.1. (Recall that γ\gamma is a parameter inside N\mathcal{N}.) In the following, all that we claim happens with probability at least 1n91-n^{-9} by a union bound.

First of all, by Lemma 8.5, for every γ\gamma there is c=O(min{n/ω3γ,1})c=O(\min\{n/\omega^{3}\gamma,1\}) so that

We assume c=c(γ)c=c(\gamma) is chosen in this way.

For any γ\gamma, by Lemma 8.5 if we choose ωn/γlog(n)2\omega\leqslant\sqrt{n}/\gamma\log(n)^{2} then RR0o(ω2n2)\|\mathcal{R}-\mathcal{R}_{0}\|\leqslant o(\omega^{2}n^{2}). Adding RR0\mathcal{R}-\mathcal{R}_{0} to the previous equation,

By the same reasoning using Lemma 8.4 to add Δ\Delta to the previous equation, we have for the same choice of ω\omega:

So it just remains to add LL to the left-hand side.

Let pp be as in Lemma 8.3, which implies that

Choosing γ=O(p2log(n))\gamma=O(p^{2}\log(n)) we get that this is o(ω4n2γω5n)o(\sqrt{\omega^{4}n^{2}\cdot\gamma\omega^{5}n}). So using Lemma 3.5 to add (8.3) and (8.4), we obtain

Together with Lemma 3.5 and (8.5), this implies the lemma, for γp2logp\gamma\geqslant p^{2}\log p, c=c(γ)c=c(\gamma) as above, and ω0min{n/plog(n)2,n/γlog(n)2}\omega_{0}\leqslant\min\{\sqrt{n}/p\log(n)^{2},\sqrt{n}/\gamma\log(n)^{2}\}. ∎

We start with the easy parts. Note that ΠW2LΠW2=ΠW2ΠV2LΠV2ΠW2\Pi_{W_{2}}L\Pi_{W_{2}}=\Pi_{W_{2}}\Pi_{V_{2}}L\Pi_{V_{2}}\Pi_{W_{2}}, so the bound

2 Lower-Degree Cleanup (Lemma 8.1)

Every i,j,k[n]i,j,k\in[n] with i,j,ki,j,k all distinct satisfies

With probability 1O(n20)1-O(n^{-20}) when GG(n,1/2)G\sim G(n,1/2) by Lemma 5.7, we get that 1/(ω2)(ω3)=O(1/ω2)1/(\omega^{\prime}-2)(\omega^{\prime}-3)=O(1/\omega^{2}). By the same,

which is o(ω2/n2)o(\omega^{2}/n^{2}) for ω=o(n)\omega=o(\sqrt{n}) and γ=o(ω)\gamma=o(\omega). ∎

We first observe that an eigenvalue lower bound on N\mathcal{N} implies the same on the (principal) submatrix indexed only by cliques (in this case, edges) in GG. This submatrix is equal to C4N+ErrC_{4}\mathcal{N}^{\prime}+\text{Err}, where

We break Err into two parts so that Err2+Err3=Err\text{Err}_{2}+\text{Err}_{3}=\text{Err}:

Note that Err3\text{Err}_{3} consists of the off-diagonal nonzero entries of Err, while Err2\text{Err}_{2} contains the diagonal entries of Err. We start by showing a bound on Err3\|\text{Err}_{3}\|.

Next we bound Err2\text{Err}_{2}. Since it is diagonal, it is enough to give an entrywise bound.

3 Eigenvalue Lower Bound for the Correction

The following is the main claim for this section.

We will need the following graph theoretic machinery for the moment method bound.

The labeled diamond ribbon (R,F)(R,F) is contributing if no element of the multiset {(F(x),F(y)) such that (x,y)R}\{(F(x),F(y))\text{ such that }(x,y)\in R\} occurs with odd multiplicity. It is disjoint if the sets

By our disjointness assumption, every element of the multiset {F(si),F(ti)}\{F(s_{i}),F(t_{i})\} must occur with multiplicity at least two and similarly for {F(ui)}\{F(u_{i})\} and {F(vi)}\{F(v_{i})\}. ∎

Note that the matrix R=s(rs2)(rs2)R=\sum_{s}(r_{s}^{\otimes 2})(r_{s}^{\otimes 2})^{\dagger} has row and column spaces both VV. Note also that it factors as SSSS^{\dagger}, where SS is the ([n]2)×n{{[n]}\choose{2}}\times n matrix whose columns are the vectors rs2r_{s}^{\otimes 2}. Thus, it will be enough to show that

where here II is the n×nn\times n identity matrix.

For this, consider the matrix SSS^{\dagger}S indexed by vertices s,t[n]s,t\in[n]. It has entries

So, zeroing this matrix on the diagonal, it is enough to prove that

Let H:=SS([n]2)IH\mathrel{\mathop{:}}=S^{\dagger}S-{{[n]}\choose{2}}I. Let Hi,jH_{i,j} for iji\neq j be given by

Then H=ijHi,jH=\sum_{i\neq j}H_{i,j}. Note that Hi,j(s,t)=0H_{i,j}(s,t)=0 if i{s,t}i\in\{s,t\} or j{s,t}j\in\{s,t\}. Thus the obvious generalization of Lemma 6.14 to the two-parameter family Hi,jH_{i,j} applies. This gives us a family of matrices H1,,HrH^{1},\ldots,H^{r} for some r=O(logn)r=O(\log n) and a corresponding family of partitions (S11,S22,S31,S41,S51,S61),,(S1r,,S6r)(S_{1}^{1},S_{2}^{2},S_{3}^{1},S_{4}^{1},S_{5}^{1},S_{6}^{1}),\ldots,(S_{1}^{r},\ldots,S_{6}^{r}) of [n][n].

These will be such that i=1rHi=H\sum_{i=1}^{r}H^{i}=H, and

4 Proofs of Remaining Lemmas

Note that for I,JI,J disjoint we have Δ(I,J)=0\Delta(I,J)=0. We bound the maximal sum across any row of Δ\Delta. With probability 1O(n10)1-O(n^{-10}) every off-diagonal entry off Δ\Delta is at most O(ω3nlog(n)2)O(\omega^{3}\sqrt{n}\log(n)^{2}) in absolute value [MPW15, Theorem 10.1]. For each I([n]2)I\in{{[n]}\choose{2}}, we then get JIΔ(I,J)O(ω3n3/2log(n)2)\sum_{J\neq I}|\Delta(I,J)|\leqslant O(\omega^{3}n^{3/2}\log(n)^{2}). At the same time, The diagonal entries are each at most O(ω2n3/2log(n)2)O(\omega^{2}n^{3/2}\log(n)^{2}) with similar probability, again by [MPW15, Theorem 10.1]. ∎

Now it is enough to bound R0R\|\mathcal{R}_{0}-\mathcal{R}\|. This is the deviation introduced by zeroing non-clique entries. Note that each entry I,JI,J of R\mathcal{R} can be decomposed as a function of the underlying graph edge variables:

where we recall that for an edge (u,v)(u,v), the variable g(u,v)g_{(u,v)} is the ±1\pm 1 indicator for that edge. Each of the entries in the sum over nonempty SS above corresponds to a matrix of the form bounded in Lemma 6.12, where we conclude that each has spectral norm at most O(n3/2log(n)2)O(n^{3/2}\log(n)^{2}) with probability 1O(n10)1-O(n^{-10}). We conclude (also using C4n4C_{4}\approx n^{4}, see [MPW15, Theorem 10.3]) that R0RO(γω5nlog(n)2)\|\mathcal{R}_{0}-\mathcal{R}\|\leqslant O(\gamma\omega^{5}\sqrt{n}\log(n)^{2}) as desired. ∎

In this section, we prove the following large-deviation bounds on the number of xx-cliques a random G(n,1/2)G(n,1/2) graph contains and on degG(I)\deg_{G}(I). Similar results (which are likely sufficient for our needs) appear in the literature; see [Ruc88, Vu01, JLR11] for instance. We provide these proofs for completeness. A coarser concentration result for degG(I)\deg_{G}(I) appears in [MPW15].

For a graph GG, define Nx(G)N_{x}(G) to be the number of xx-cliques in GG.

Unless otherwise specified, in this section GG(n,1/2)G\sim G(n,1/2).

This first theorem gives the large deviation bound for the number of cliques of size xx in GG.

For all ε(0,1)\varepsilon\in(0,1), for all xx, if n>x2(2eelnε)(2e+2elnε)n>{x^{2}}(2e-e\ln{\varepsilon})(2e+2-e\ln{\varepsilon}) then

We also want a large deviations inequality for the number of cliques of size 2d2d that a clique of size d<2dd^{\prime}<2d participates in in GG. Moreover, to carry out the eigenspace splitting arguments needed for Lemma 7.3, we want to know the dependence of this deviation on the number of vertices adjacent to every vertex in the dd^{\prime}-clique. The following theorem serves both these purposes.

Given any I[n]I\subseteq[n], let AIA_{I} be the set of all vertices not in II which are adjacent to all vertices in II.

There is a universal constant CC so that for any I[n]I\subseteq[n] of size at most 2d2d, if II is a clique in GG then for any ε(0,1)\varepsilon\in(0,1), if n100d222d(3lnε)2n\geqslant 100{d^{2}}2^{2d}(3-\ln{\varepsilon})^{2} then

The key lemma in proving Theorem 9.2 is the following, which bounds how often subsets of GG of size xx share (potential) edges.

If x2x\geqslant 2 and nx2q(q+2)n\geqslant{x^{2}}q(q+2) then there are at most 2nxqq(x2x!)q2n^{xq-q}\left(\frac{{x^{2}}}{x!}\right)^{q} multi-sets S={V1,,Vq}S=\{V_{1},\cdots,V_{q}\} of subsets Vi[n]V_{i}\subseteq[n] of size Vi=x|V_{i}|=x such that that for all jj there exists an iji\neq j such that ViVj2|V_{i}\cap V_{j}|\geqslant 2.

Using Lemma 9.5, we can bound the deviation of the number of xx-cliques in GG from its expected value; we carry this out now.

Define X=V:VG,V=x(1V2(x2))X=\sum_{V:V\subseteq G,|V|=x}{\left(1_{V}-2^{-{x\choose 2}}\right)} where 1V=11_{V}=1 if VV is a clique in GG and otherwise.

X=Nx(G)2(x2)(nx)X=N_{x}(G)-2^{-{x\choose 2}}{n\choose x}

If x2x\geqslant 2 and nx2q(q+2)n\geqslant{x^{2}}q(q+2) then E[Xq]<2q!(x2x!nx1)qE[X^{q}]<2q!\left(\frac{{x^{2}}}{x!}n^{x-1}\right)^{q}.

The result is trivial for x=0x=0 and x=1x=1 so we may assume that x2x\geqslant 2. Using Corollary 9.8 and Markov’s inequality,

Thus, we just need to give an upper bound on min{positive even q}{2q!εq}\min_{\{\text{positive even q}\}}\{\sqrt[q]{\frac{2q!}{\varepsilon}}\}. For all positive even qq, 2q!qq2q!\leqslant q^{q} so this expression is upper bounded by qεq\frac{q}{\sqrt[q]{\varepsilon}}. We now try to minimize qεq\frac{q}{\sqrt[q]{\varepsilon}} over all positive even qq. Taking the derivative of this expression with respect to qq yields 1εq+lnεqεq\frac{1}{\sqrt[q]{\varepsilon}}+\frac{\ln{\varepsilon}}{q\sqrt[q]{\varepsilon}}. Setting this to yields q=lnεq=-\ln{\varepsilon}. However, we require qq to be even so we take qq to be the smallest positive even integer which is greater than lnε-\ln{\varepsilon}. Now q<2lnεq<2-\ln{\varepsilon} and εqε(lnε)=(elnε)1lnε=1e\sqrt[q]{\varepsilon}\geqslant\sqrt[(-\ln{\varepsilon})]{\varepsilon}=\left(e^{\ln{\varepsilon}}\right)^{\frac{1}{-\ln{\varepsilon}}}=\frac{1}{e}. Putting everything together, for this qq, 2q!εqqεq<2eelnε\sqrt[q]{\frac{2q!}{\varepsilon}}\leqslant\frac{q}{\sqrt[q]{\varepsilon}}<2e-e\ln{\varepsilon}. Plugging this in gives

All that is left is to check that nx2q(q+2)n\geqslant{x^{2}}q(q+2) for this qq to make sure that our application of Corollary 9.8 was valid. Since n>x2(2eelnε)(2e+2elnε)n>{x^{2}}(2e-e\ln{\varepsilon})(2e+2-e\ln{\varepsilon}), this holds, as needed. ∎

Now that we have proven Theorem 9.2, we will derive Theorem 9.4 from Theorem 9.2. The idea is that conditioned on II being a clique, by Theorem 9.2, degG(I)\deg_{G}(I) is primarily determined by AI|A_{I}|, which can be easily shown to be tightly concentrated around its expected value. We start with the following lemma

If n>dn>d then for any I[n]I\subseteq[n] of size less than dd, if we first determine all of the edges incident to elements of II (which determines AIA_{I}) then if II is a clique, when we look at the remainder of the graph, for any ε1(0,1)\varepsilon_{1}\in(0,1),

so long as the following conditions hold:

(dI)2IAInI11(d-|I|)\left|\frac{2^{|I|}|A_{I}|}{n-|I|}-1\right|\leqslant 1

AI>d2(2eelnε1)(2e+2elnε1)|A_{I}|>{d^{2}}(2e-e\ln{\varepsilon_{1}})(2e+2-e\ln{\varepsilon_{1}})

To prove Lemma 9.9 we require the following results; proofs of the more elementary ones are deferred to Section 9.1.

For all nonnegative integers nn and kk where k<nk<n, 0nkk!(nk)k22nnkk!0\leqslant\frac{n^{k}}{k!}-{n\choose k}\leqslant\frac{k^{2}}{2n}\frac{n^{k}}{k!}

Note that nkj=0k1(nj)nkj=0k1(1jn)nk(1j=0k1jn)nk(1k22n)n^{k}\geqslant\prod_{j=0}^{k-1}{(n-j)}\geqslant n^{k}\prod_{j=0}^{k-1}{(1-\frac{j}{n})}\geqslant n^{k}(1-\sum_{j=0}^{k-1}{\frac{j}{n}})\geqslant n^{k}(1-\frac{k^{2}}{2n})

This implies that 0nkj=0k1(nj)k22nnk0\leqslant n^{k}-\prod_{j=0}^{k-1}{(n-j)}\leqslant\frac{k^{2}}{2n}{n^{k}} and dividing everything by k!k! gives the claimed result. ∎

If n>dIn>d\geqslant|I| then 2(I2)(d2)(nI)dI(dI)!2(I2)(d2)(nIdI)2(I2)(d2)ndI1\left|2^{{{|I|}\choose 2}-{d\choose 2}}\frac{(n-|I|)^{d-|I|}}{(d-|I|)!}-2^{{{|I|}\choose 2}-{d\choose 2}}{{n-|I|}\choose{d-|I|}}\right|\leqslant 2^{{{|I|}\choose 2}-{d\choose 2}}n^{d-|I|-1}

Applying Proposition 9.10 on nIn-|I| and dId-|I| gives

Multiplying this equation by 2(I2)(d2)2^{{{|I|}\choose 2}-{d\choose 2}} gives the claimed result. ∎

For all nonnegative integers xx and dd where xdx\leqslant d, x(dx)+(x2)(d2)=(dx2)x(d-x)+{x\choose 2}-{d\choose 2}=-{{d-x}\choose 2}.

For any nonnegative integer kk and any xx such that kx1|kx|\leqslant 1, (1+x)k(1+kx)k2x2\left|(1+x)^{k}-(1+kx)\right|\leqslant{k^{2}}{x^{2}}

Eventually in the course of proving Lemma 9.9 we will break

into pieces. The following lemmas offer the necessary bounds on each piece.

If (dI)2IAInI11(d-|I|)\left|\frac{2^{|I|}|A_{I}|}{n-|I|}-1\right|\leqslant 1 then

Applying Proposition 9.13 with x=2IAInI1x=\frac{2^{|I|}|A_{I}|}{n-|I|}-1 and k=(dI)k=(d-|I|), since (dI)2IAInI11(d-|I|)\left|\frac{2^{|I|}|A_{I}|}{n-|I|}-1\right|\leqslant 1,

Multiplying this equation by 2(I2)(d2)(nI)dI(dI)!\frac{2^{{{|I|}\choose 2}-{d\choose 2}}(n-|I|)^{d-|I|}}{(d-|I|)!} and using Proposition 9.12 with x=Ix=|I| gives

If AI>dI|A_{I}|>d-|I| then 2(dI2)(AIdI)2(dI2)AIdI(dI)!2(dI2)AIdI1\left|2^{-{{d-|I|}\choose 2}}{{|A_{I}|}\choose{d-|I|}}-2^{-{{d-|I|}\choose 2}}\frac{|A_{I}|^{d-|I|}}{(d-|I|)!}\right|\leqslant 2^{-{{d-|I|}\choose 2}}|A_{I}|^{d-|I|-1}

Applying Proposition 9.10 on AI|A_{I}| and dId-|I| gives

Multiplying this equation by 2(dI2)2^{-{{d-|I|}\choose 2}} gives the claimed result. ∎

For all ε1(0,1)\varepsilon_{1}\in(0,1), if AI>d2(2eelnε1)(2e+2elnε1)|A_{I}|>{d^{2}}(2e-e\ln{\varepsilon_{1}})(2e+2-e\ln{\varepsilon_{1}}) then

This lemma follows immediately from applying Theorem 9.2 on the random graph GG restricted to the vertices AIA_{I}. ∎

into four parts and analyze each one separately.

2(I2)(d2)(nI)dI(dI)!2(I2)(d2)(nIdI)2^{{{|I|}\choose 2}-{d\choose 2}}\frac{(n-|I|)^{d-|I|}}{(d-|I|)!}-2^{{{|I|}\choose 2}-{d\choose 2}}{{n-|I|}\choose{d-|I|}}

2(dI2)AIdI(dI)!2(I2)(d2)(nI)dI(dI)!(AInI2I)2(I+12)(d2)(nI)dI1(dI1)!2^{-{{d-|I|}\choose 2}}\frac{|A_{I}|^{d-|I|}}{(d-|I|)!}-2^{{{|I|}\choose 2}-{d\choose 2}}\frac{(n-|I|)^{d-|I|}}{(d-|I|)!}-\left(|A_{I}|-\frac{n-|I|}{2^{|I|}}\right)\frac{2^{{{|I|+1}\choose 2}-{d\choose 2}}(n-|I|)^{d-|I|-1}}{(d-|I|-1)!}

2(dI2)(AIdI)2(dI2)AIdI(dI)!2^{-{{d-|I|}\choose 2}}{{|A_{I}|}\choose{d-|I|}}-2^{-{{d-|I|}\choose 2}}\frac{|A_{I}|^{d-|I|}}{(d-|I|)!}

degG(I)2(dI2)(AIdI)\deg_{G}(I)-2^{-{{d-|I|}\choose 2}}{{|A_{I}|}\choose{d-|I|}}

Combining Lemma 9.11, Lemma 9.14, Lemma 9.15, and Lemma 9.16, we have that under the given conditions,

The result now reduces to showing the following equation

which follows from the facts that I<d|I|<d, AIn|A_{I}|\leqslant n, and (dI)2(dI)!2\frac{(d-|I|)^{2}}{(d-|I|)!}\leqslant 2. ∎

To use Lemma 9.9 to prove Theorem 9.4, we need probabilistic bounds on AI|A_{I}|.

There is a universal constant CC so that for all ε2(0,1)\varepsilon_{2}\in(0,1), P[AI2I(nI)>C(2lnε2)n]<ε2P\left[||A_{I}|-2^{-|I|}(n-|I|)|>C(2-\ln{\varepsilon_{2}})\sqrt{n}\right]<\varepsilon_{2}

We have all we need now to prove Theorem 9.4.

The result is trivial if I=d|I|=d and follows immediately from Theorem 9.2 if I=0|I|=0 so we may assume that 0<I<d0<|I|<d.

In what follows, CC and CC^{\prime} denotes universal constants which may vary from line to line. Now recall that by Lemma 9.9, for any ε1(0,1)\varepsilon_{1}\in(0,1),

so long as the following conditions hold:

(dI)2IAInI11(d-|I|)\left|\frac{2^{|I|}|A_{I}|}{n-|I|}-1\right|\leqslant 1

AI>d2(2eelnε1)(2e+2elnε1)|A_{I}|>{d^{2}}(2e-e\ln{\varepsilon_{1}})(2e+2-e\ln{\varepsilon_{1}})

Taking ε1=ε2=ε2\varepsilon_{1}=\varepsilon_{2}=\frac{\varepsilon}{2}, plugging Lemma 9.17 into these equations and using the union bound, we have that

so long as the corresponding conditions hold. Assuming these conditions hold for now, since I<n16|I|<\frac{n}{16}, I<d|I|<d, (dI)2(dI)!2\frac{(d-|I|)^{2}}{(d-|I|)!}\leqslant 2, and 2I2(I2)=2(I+12)2^{|I|}2^{{|I|}\choose 2}=2^{{|I|+1}\choose 2},

as needed. For the first part of Theorem Theorem 9.4, note that this implies that

Taking nCd222d(3lnε)2n\geqslant C{d^{2}}2^{2d}(3-\ln{\varepsilon})^{2} and d2d\geqslant 2,

as needed. All that is left is to check the conditions for Lemma 9.9, which are as follows.

(dI)2Ie(3lnε)nnI1(d-|I|)\frac{2^{|I|}e(3-\ln{\varepsilon})\sqrt{n}}{n-|I|}\leqslant 1

2I(nI)>d2e(3lnε)(3e+2elnε)+e(3lnε)n2^{-|I|}(n-|I|)>{d^{2}}e(3-\ln{\varepsilon})(3e+2-e\ln{\varepsilon})+e(3-\ln{\varepsilon})\sqrt{n}

These conditions are true if n4d222d(3lnε)2n\geqslant 4{d^{2}}2^{2d}(3-\ln{\varepsilon})^{2}. To see this, note that since d>I>0d>|I|>0 and I<n16|I|<\frac{n}{16}

(dI)2Ie(3lnε)nd2Ie(3lnε)nIe44d222d(3lnε)2nInI(d-|I|)2^{|I|}e(3-\ln{\varepsilon})\sqrt{n}\leqslant d2^{|I|}e(3-\ln{\varepsilon})\sqrt{n}-|I|\leqslant\frac{e}{4}\sqrt{4{d^{2}}2^{2d}(3-\ln{\varepsilon})^{2}}\sqrt{n}-|I|\leqslant n-|I|

2Id2e(3lnε)(3e+2elnε)<2Id2e23e+23e(3lnε)2<102Id2(3lnε)2516n2^{|I|}{d^{2}}e(3-\ln{\varepsilon})(3e+2-e\ln{\varepsilon})<2^{|I|}{d^{2}}e^{2}\frac{3e+2}{3e}(3-\ln{\varepsilon})^{2}<10\cdot{2^{|I|}}{d^{2}}(3-\ln{\varepsilon})^{2}\leqslant\frac{5}{16}n

2Ie(3lnε)n<42I(3lnε)(2d2d(3lnε))n22^{|I|}e(3-\ln{\varepsilon})\sqrt{n}<4\cdot{2^{|I|}}(3-\ln{\varepsilon})\left(2d{2^{d}}(3-\ln{\varepsilon})\right)\leqslant\frac{n}{2}

Dividing the first statement by nIn-|I| gives the first condition. Using the second and third statements,

Dividing this by 2I2^{|I|} gives the second condition, as needed. ∎

For each multi-set S={V1,,Vk}S=\{V_{1},\cdots,V_{k}\} of kk xx-cliques, define the constraint graph HSH_{S} as follows.

E(HS)={(Vi,Vj):ViVj2}E(H_{S})=\{(V_{i},V_{j}):|V_{i}\cap V_{j}|\geqslant 2\}

Let’s first bound the number of SS such that HSH_{S} is connected.

For any x,k2x,k\geqslant 2, there are at most nkx2k+2k!2x4(x42(x!))kn^{kx-2k+2}k!\frac{2}{x^{4}}\left(\frac{{x^{4}}}{2(x!)}\right)^{k} multi-sets SS of kk xx-cliques such that HSH_{S} is connected.

Since HSH_{S} is connected, we can order {V1,,Vk}\{V_{1},\cdots,V_{k}\} so that for all j>1j>1 there is an i<ji<j such that (Vi,Vj)E(HS)(V_{i},V_{j})\in E(H_{S}). Assuming this is the case, we have at most (nx){n\choose x} choices for V1V_{1}. For each j>1j>1, there are two vertices in VjV_{j} which are contained in some ii where i<ji<j. There are at most jj choices for this ii and then there are (d2){d\choose 2} choices for which two vertices of ViV_{i} are contained in VjV_{j}. There are at most (nx2){{n}\choose{x-2}} choices for the other x2x-2 vertices of VjV_{j} so for each j>1j>1 there are at most j(x2)(nx2)j{x\choose 2}{{n}\choose{x-2}} choices for VjV_{j}. Putting everything together, there are at most

multi-sets SS of kk xx-cliques such that HSH_{S} is connected. ∎

Now consider the number of multi-sets SS of qq xx-cliques such that HSH_{S} has tt connected components with sizes s1,,sts_{1},\cdots,s_{t}. Using Lemma 9.19, there are at most

We now total this up over all possible t,s1,,stt,s_{1},\cdots,s_{t}. For the special case that all connected components of HSH_{S} have size 22, there are at most nxqq(x2x!)q{n^{xq-q}\left(\frac{{x^{2}}}{x!}\right)^{q}} such SS. We will show that this term contributes more than all of the other terms combined, which implies that the total number of SS is at most 2nxqq(x2x!)q2{n^{xq-q}\left(\frac{{x^{2}}}{x!}\right)^{q}}, as needed.

For a given tt, i=1tsi!2t1(q+22t)!2t(q+2)q2t\prod_{i=1}^{t}{{s_{i}}!}\leqslant 2^{t-1}{(q+2-2t)!}\leqslant{2^{t}}{(q+2)^{q-2t}}. Also, each of the tt components of HSH_{S} must have at least two vertices so to determine the sizes s1,,sts_{1},\cdots,s_{t} it is sufficient to decide how to distribute the q2tq-2t extra vertices among the tt connected components of HSH_{S}. There are at most (qt1t1)qq2t{{q-t-1}\choose{t-1}}\leqslant q^{q-2t} ways to do this. Thus, the total contribution for terms of a given tt is at most

Since nx2q(q+2)n\geqslant{x^{2}}q(q+2), the nxqq(x2x!)q{n^{xq-q}\left(\frac{{x^{2}}}{x!}\right)^{q}} term which comes from t=q2t=\frac{q}{2} contributes more than all the other terms combined, as needed. ∎

Rearranging this equation gives (d2)=(x2)+dx+(dx2){d\choose 2}={x\choose 2}+dx+{{d-x}\choose 2} which just says that if we want to pick two elements in [1,d][1,d] we can either pick two elements in [1,x][1,x], one element from [1,x][1,x] and one element from [x+1,d][x+1,d], or two elements from [x+1,d][x+1,d]. ∎

Optimality of MPW Analysis

In this section we sketch an argument due to Kelner showing that the MPW moments are not PSD when ωn1/(d+1)\omega\gg n^{1/(d+1)}.

With high probability, the MW moments are not PSD when ωn1d+1\omega\gg n^{\frac{1}{d+1}}. In particular, if ωn1d+1\omega\gg n^{\frac{1}{d+1}} then for all ss, for some appropriately chosen CC, with high probability,

We will be using the following proposition heavily.

We have the equation that jxj=ω\sum_{j}{x_{j}}=\omega so

For all IV(G)I\subseteq V(G) and all mm such that I+m2d|I|+m\leqslant 2d,

Analyzing the third term is trickier. For the third term, taking K=IΔJK=I\Delta J for each term,

For the second part, for each x[0,d]x\in[0,d] there are at most (n2x)\binom{n}{2x} KK such that KV(G){s}K\subseteq V(G)\setminus\{s\} and K=2x|K|=2x. Thus,

From our concentration bounds, with high probability, for all xx the corresponding term on the right is O(ωd+xlgnn)O(\frac{{\omega}^{d+x}\lg{n}}{n}) which is O(ωd+xn)O(\frac{\omega^{d+x}}{n}). For all xdx\leqslant d, this is much smaller than Ω(ω2d+1n)\Omega(\frac{{\omega}^{2d+1}}{n}). Thus we may ignore the second part.

For the first part, the values rs(i)r_{s}(i) are completely independent of the values yKy_{K}. When we take the expected value of

over the edges incident to ss, only the x=0x=0 term remains and we obtain (ωdd)\binom{\omega-d}{d}. When we take the variance of

over the edges incident to ss, only the square terms remain so we obtain

From our concentration bounds on the yKy_{K}, with high probability this is

Acknowledgements

We thank Jonathan Kelner, Ankur Moitra, Aviad Rubinstein, and David Steurer for many helpful discussions. We especially thank Boaz Barak for introducing the problem to us, for numerous conversations which led to many of the proofs in this paper, and for invaluable help and advice in writing it.

S.B.H. acknowledges the support of an NSF Graduate Research Fellowship under award no. 1144153.

References