Delocalization and Diffusion Profile for Random Band Matrices

László Erdős, Antti Knowles, Horng-Tzer Yau, Jun Yin

Introduction

Typically, $W$ is a mesoscopic scale, larger than the lattice spacing but smaller than the diameter $L$ of the system: $1\ll W\ll L$. These models are natural interpolations between random Schrödinger operators with short-range quantum transitions, such as the Anderson model [And], and mean-field random matrices, such as Wigner matrices [Wig]. In particular, random band matrices may be used to model the Anderson metal-insulator phase transition, which we briefly outline.

The analysis of the trace of the single Green function yields the limiting spectral density of $H$, which is the Wigner semicircle law provided the band width $W$ diverges as $L\to\infty$. For band matrices the semicircle law on large scales, corresponding to spectral parameter $\eta>0$ independent of $N$, was given in [MPK]. More recently, a semicircle law on small scales, in which $\eta\ll 1$, was derived in [EYY1] and generalized in [EKYY4]. The results of [EYY1, EKYY4] are summarized in Lemma 3.4 below. As an application of our method, we prove a further improvement of the semicircle law in Theorem 2.2 below.

The main new ingredient in this paper is the self-consistent equation for the matrix $T$, whose entries

$$T_{xy}:=\sum_{i}s_{xi}\lvert G_{iy}\rvert^{2}$$

are local averages of $|G_{xy}|^{2}$. We show in Theorem 4.1 below that $T$ satisfies a self-consistent equation of the form

where $D$ is the matrix of second moments of $f$ (see (8.1) below). In order to give the leading-order behaviour of $T$, we use $|m|^{2}=1-\alpha\eta+O(\eta^{2})$ (see (3.5) below), where

Therefore the Fourier transform of $T$ is approximately given by

in the regime $\lvert p\rvert\ll W^{-1}$ and $\eta\ll 1$. This corresponds to the diffusion approximation on scales larger than $W$ with an effective diffusion constant $D_{\rm eff}$. In the language of diagrammatic perturbation theory, the change from $D$ to $D_{\rm eff}$ has the interpretation of a self-energy renormalization. This result coincides with Equation (1.5.5) of [Sp], which was obtained by computing the sum of ladder diagrams in a high-moment expansion.

The main result of this paper is a justification of this heuristic argument in a certain range of parameters. The error term ${\mathcal{E}}$ contains fluctuations of local averages. Roughly speaking, we need to control the size of $\sum_{x}\big[|G_{xy}|^{2}-P_{x}|G_{xy}|^{2}\big]$, where $P_{x}$ denotes partial expectation with respect to the matrix entries in the $x$-th row (see $\widetilde{T}_{xy}$ in (4.5) below). Unfortunately, $|G_{xy}|^{2}$ and $|G_{x'y}|^{2}$ for $x\neq x'$ are not independent; in fact they are strongly correlated for small $\eta$. Estimating high moments of these averages requires an unwrapping of the hierarchical correlation structure among several resolvent matrix entries. The necessary estimates are quite involved. They are a special case of the more general Fluctuation Averaging Theorem, which is published separately in [EKY2] and was originally developed for application in the current paper. There have been several previous results in this direction; see [EYY2, Lemma 5.2], [EYY3, Lemma 4.1], [EKYY1, Theorem 5.6], and [PY, Theorem 3.2]. The Fluctuation Averaging Theorem generalizes these ideas to arbitrary monomials of $G$ and exploits an additional cancellation mechanism in averages of $|G_{xy}|^{2}$ that is not present in averages of $G_{xx}$. For more details, see [EKY2].

Formulation of the results

for some fixed $\delta>0$. The parameter $L$ is the fundamental large quantity of our model. Define the $d$-dimensional discrete torus

where $Z_{L,W}$ is a normalization constant chosen so that $S$ is a stochastic matrix:

In particular, we may consider the two classical symmetry classes of random matrices: real symmetric and complex Hermitian. For real symmetric band matrices we assume

For complex Hermitian band matrices we assume

in addition to (2.5). A common way to satisfy (2.9) is to choose the real and imaginary parts of $\zeta_{ij}$ to be independent with identical variance. As in [EKY2], our results also hold without this assumption, but we omit the details of this generalization to avoid needless complications.
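As a purely numerical illustration of the model (it plays no role in the proofs), the following sketch builds the variance matrix $S$ on the one-dimensional torus and samples a real symmetric band matrix with $\mathbb{E}h_{ij}^{2}=s_{ij}$. The function names, the indicator profile $f=\mathbf{1}_{[-1/2,1/2]}$, and the Gaussian entry distribution are assumptions made here for concreteness; the text allows much more general profiles and entry distributions.

```python
import numpy as np

def band_variance_profile(N, W):
    """Variance matrix s_ij = f([i-j]_N / W) / Z on the 1D torus with N sites.

    Assumption (illustrative only): f is the indicator of [-1/2, 1/2].
    Returns the matrix S and the normalization constant Z.
    """
    idx = np.arange(N)
    diff = (idx[:, None] - idx[None, :]) % N
    dist = np.minimum(diff, N - diff)       # periodic distance |[i-j]_N|
    f = (dist <= W / 2).astype(float)       # unnormalized profile
    Z = f[0].sum()                          # makes every row of S sum to 1
    return f / Z, Z

def sample_real_symmetric_band(N, W, rng):
    """Real symmetric H with centered Gaussian entries of variance s_ij."""
    S, _ = band_variance_profile(N, W)
    A = rng.standard_normal((N, N)) * np.sqrt(S)
    # Keep the diagonal and upper triangle, then mirror the strict upper part.
    return np.triu(A) + np.triu(A, 1).T
```

With this choice of $f$ the normalization constant equals the number of lattice points within distance $W/2$, so it differs from $W$ by a bounded amount in $d=1$.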

From the definition of $S$ it is easy to see that $Z_{L,W}=W^{d}+O(W^{d-1})$. In particular,

The following definition introduces a notion of a high-probability bound that is suited for our purposes.

for large enough $N\geqslant N_{0}(\varepsilon,D)$. Unless stated otherwise, throughout this paper the stochastic domination will always be uniform in all parameters apart from the parameter $\delta$ in (2.1) and the sequence of constants $\mu_{p}$ in (2.11); thus, $N_{0}(\varepsilon,D)$ also depends on $\delta$ and $\mu_{p}$. If $X$ is stochastically dominated by $\Psi$, uniformly in $u$, we use the equivalent notations

For example, using Chebyshev’s inequality and (2.11) one easily finds that

so that we may also write $h_{ij}=O_{\prec}((s_{ij})^{1/2})$. The relation $\prec$ satisfies the familiar algebraic rules of order relations. The general statements are formulated later in Lemma 3.3.

We remark that Definition 2.1 is tailored to the assumption that (2.11) holds for any $p$. If (2.11) only holds for some large but fixed $p$ then all of our results still hold, but in a somewhat weaker sense. Indeed, the control of the exceptional events in our theorems is expressed via the relation $\prec$. If only finitely many moments are assumed to be finite in (2.11), then the exponents $\varepsilon$ and $D$ in the definition of $\prec$ cannot be chosen to be arbitrary, and will in fact depend on $p$. Repeating our arguments under this weaker assumption would require us to follow all of these exponents through the entire proof. Our assumption that (2.11) holds for any $p$ streamlines our statements and proofs, by avoiding the need to keep track of the precise values of these parameters.

Throughout the following we make use of a spectral parameter

We choose and fix two arbitrary (small) global constants $\gamma>0$ and $\kappa>0$. All of our estimates will depend on $\kappa$ and $\gamma$, and we shall often omit the explicit mention of this dependence. Set

We introduce the Stieltjes transform of Wigner’s semicircle law, defined by

It is well known that the Stieltjes transform $m$ is characterized by the unique solution of

with $\operatorname{Im}m(z)>0$ for $\operatorname{Im}z>0$. Thus we have

To avoid confusion, we remark that the Stieltjes transform $m$ was denoted by $m_{sc}$ in the papers [ESY1, ESY2, ESY3, ESY4, ESY5, ESY6, ESY7, ESYY, EYY1, EYY2, EYY3, EKYY1, EKYY2], in which $m$ had a different meaning from (2.14).
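For orientation, here is a minimal numerical sketch of $m$. We assume the standard normalization in which the semicircle law is supported on $[-2,2]$ and $m$ solves $m+1/m+z=0$; the displayed equations are not reproduced in this text, so this form and the function name `m_semicircle` are assumptions. Taking the imaginary part of that equation gives $1-|m|^{2}=\eta|m|^{2}/\operatorname{Im}m$, which the usage below checks.

```python
import cmath

def m_semicircle(z):
    """Root of m + 1/m + z = 0 with Im m > 0, for a spectral parameter with Im z > 0.

    Assumes the normalization in which the semicircle law lives on [-2, 2].
    """
    root = cmath.sqrt(z * z - 4)
    m = (-z + root) / 2
    if m.imag <= 0:
        # The two roots multiply to 1; exactly one has positive imaginary part.
        m = (-z - root) / 2
    return m

# Usage: the selected root has Im m > 0 and |m| < 1.
z = 0.3 + 0.05j
m = m_semicircle(z)
residual = abs(m + 1 / m + z)
identity_gap = abs((1 - abs(m) ** 2) - z.imag * abs(m) ** 2 / m.imag)
```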

and denote its entries by $G_{ij}(z)$. In the following sections we list our main results on the resolvent matrix entries.

We conclude this section by introducing some notation that will be used throughout the paper. We use $C$ to denote a generic large positive constant, which may depend on some fixed parameters and whose value may change from one expression to the next. Similarly, we use $c$ to denote a generic small positive constant. For two positive quantities $A_{N}$ and $B_{N}$ we sometimes use the notation $A_{N}\asymp B_{N}$ to mean $cA_{N}\leqslant B_{N}\leqslant CA_{N}$. Moreover, we use $A_{N}\ll B_{N}$ to mean that there exists a constant $c>0$ such that $A_{N}\leqslant N^{-c}B_{N}$; we also use $A_{N}\gg B_{N}$ to denote $B_{N}\ll A_{N}$. (Note that these latter conventions are nonstandard.) Finally, we introduce the Japanese bracket $\langle x\rangle:=\sqrt{1+\lvert x\rvert^{2}}$. Most quantities in this paper depend on the spectral parameter $z$, which we however mostly omit from the notation.

For simplicity, here we state our main results assuming that $d=1$ and that $f$ satisfies the decay condition

The generalization of our results to $d>1$ and slowly decaying $f$ is straightforward, and will be given in Section 8. We emphasize that the core of our argument, given in Sections 3–5, is valid in general, independent of the dimension.

2. Improved local semicircle law for resolvent entries and delocalization

Throughout this section we assume $d=1$ and (2.17). The Wigner semicircle law states that the normalized trace, $\frac{1}{N}\operatorname{Tr}G(z)$, is asymptotically given by $m(z)$. In fact, this asymptotics holds even for individual matrix entries. Our first theorem controls the ($z$-dependent) random variable

For the following we introduce the deterministic control parameter $\Phi\equiv\Phi^{(N)}(z)$ through

Assume $d=1$ and (2.17). Suppose moreover that

Clearly, the assumption $\eta\gg N^{2}/W^{3}$ can be replaced with the stronger assumption $\eta\gg W^{-1/2}$. The assumption $N\ll W^{5/4}$ is technical; to see why it is needed, see (6.3) in the proof of Theorem 2.2 below. In the regime (2.19), Theorem 2.2 improves the earlier result

proved in [EYY1] (see Lemma 3.4 below). In fact, the estimate (2.20) is optimal, as may be seen from (2.38) and the first estimate of (2.30) below. By spectral decomposition of $G$ one easily finds that

Thus, in the regime where $\Lambda$ is bounded, the average of $\lvert G_{xy}\rvert^{2}$ is of order $(N\eta)^{-1}$. Here we introduced the notation $G^{*}(z):=(G(z))^{*}=(H-\bar{z})^{-1}$, which we shall use throughout the following.
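The identity obtained from spectral decomposition, $\sum_{y}\lvert G_{xy}\rvert^{2}=(GG^{*})_{xx}=\operatorname{Im}G_{xx}/\eta$, holds for any Hermitian matrix and is easy to confirm numerically. The sketch below does so for a small Gaussian band matrix; the helper names, the indicator profile, and the Gaussian entries are illustrative assumptions, not part of the text.

```python
import numpy as np

def sample_band_matrix(N, W, rng):
    """Real symmetric Gaussian band matrix with E h_ij^2 = s_ij (illustrative)."""
    idx = np.arange(N)
    diff = (idx[:, None] - idx[None, :]) % N
    dist = np.minimum(diff, N - diff)
    S = (dist <= W / 2).astype(float)
    S /= S[0].sum()                       # rows of S sum to 1
    A = rng.standard_normal((N, N)) * np.sqrt(S)
    return np.triu(A) + np.triu(A, 1).T

def resolvent(H, z):
    """G(z) = (H - z)^{-1}."""
    return np.linalg.inv(H - z * np.eye(H.shape[0]))

def ward_identity_error(H, z):
    """Largest deviation over x of sum_y |G_xy|^2 from Im G_xx / Im z."""
    G = resolvent(H, z)
    lhs = (np.abs(G) ** 2).sum(axis=1)
    rhs = G.diagonal().imag / z.imag
    return float(np.max(np.abs(lhs - rhs)))
```

Dividing the identity by $N$ shows why a bounded $\operatorname{Im}G_{xx}$ forces the average of $\lvert G_{xy}\rvert^{2}$ over $y$ to be of order $(N\eta)^{-1}$.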

The bound (2.20) implies an estimate on the Stieltjes transform of the empirical spectral density, $m_{N}(z):=N^{-1}\operatorname{Tr}G(z)$. Under the assumptions of Theorem 2.2 and the conditions (2.19), we have

we leave the details to the reader. We remark that (2.23) is the simplest form of the fluctuation averaging mechanism (see Section 3.1). A concise proof of (2.23) can be found in [EKYY4, Theorem 4.6].

For $\eta\leqslant(W/N)^{2}$ we have $\Phi^{2}=(N\eta)^{-1}$, and the bound (2.20) therefore shows that all off-diagonal entries of $G$ have a magnitude comparable with the average of their magnitudes. We say that the resolvent is completely delocalized. Complete delocalization of the resolvent implies that the eigenvectors are completely delocalized in a weak sense. The precise formulation is given in Proposition 7.1 below. By choosing $\eta$ such that $W^{-1/2}\leqslant\eta\leqslant(W/N)^{2}$ and invoking Proposition 7.1 we obtain the following corollary.

Assume $d=1$ and (2.17). If $N\ll W^{5/4}$ then the eigenvectors of $H$ are completely delocalized in the sense of Proposition 7.1 below.

This corollary improves the result in [EK1, EK2], where complete eigenvector delocalization (in a slightly weaker sense; see Remark 2.7 below) was proved under the condition $N\ll W^{7/6}$. It was observed in Section 11 of [EK1] that the graphical perturbative renormalization scheme of [EK1, EK2] faces a fundamental barrier at $N=W^{6/5}$. The reason for this barrier is that a large family of graphs whose contribution was subleading for $N\ll W^{6/5}$ in fact yields a leading-order contribution for $N\geqslant W^{6/5}$ if estimated individually. The cancellation mechanism among these subleading graphs has so far not been identified. As evidenced by Corollary 2.4, our present approach goes beyond this barrier.

3. Diffusion profile

Note that the matrix $\Theta=(\Theta_{xy})$ solves the equation

which is obtained from (1.1) by dropping the error term ${\mathcal{E}}$. Clearly, $\Theta_{xy}$ is translation invariant, i.e. $\Theta_{xy}=\Theta_{u0}$ with $u=[x-y]_{N}$. Moreover, $\Theta_{xy}>0$ for all $x,y$. Indeed, this follows immediately from the geometric series representation

which converges by $|m|<1$ (see (3.6) below) and the trivial bound $0\leqslant(S^{n})_{xy}\leqslant 1$, as follows from (2.3).
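The geometric series can be explored numerically. The sketch below assumes the closed form $\Theta=|m|^{2}S(1-|m|^{2}S)^{-1}=\sum_{n\geqslant 1}|m|^{2n}S^{n}$, which is our reading of the displays missing from this text, together with an indicator profile and the relation $m+1/m+z=0$; all function names are hypothetical. It compares the closed form with a truncated series and checks positivity, translation invariance, and the total mass $\sum_{x}\Theta_{x0}=\operatorname{Im}m/\eta$ quoted later in the text.

```python
import cmath
import numpy as np

def m_semicircle(z):
    """Root of m + 1/m + z = 0 with Im m > 0 (assumed normalization)."""
    root = cmath.sqrt(z * z - 4)
    m = (-z + root) / 2
    return m if m.imag > 0 else (-z - root) / 2

def torus_profile(N, W):
    """Stochastic matrix S with an indicator profile of width W (illustrative)."""
    idx = np.arange(N)
    diff = (idx[:, None] - idx[None, :]) % N
    dist = np.minimum(diff, N - diff)
    S = (dist <= W / 2).astype(float)
    return S / S[0].sum()

def theta_closed_form(S, m):
    """Theta = |m|^2 S (1 - |m|^2 S)^{-1} (assumed closed form)."""
    a = abs(m) ** 2
    return (a * S) @ np.linalg.inv(np.eye(S.shape[0]) - a * S)

def theta_series(S, m, n_terms):
    """Partial sum of |m|^{2n} S^n over n = 1, ..., n_terms."""
    a = abs(m) ** 2
    term = a * S
    total = term.copy()
    for _ in range(n_terms - 1):
        term = a * (term @ S)
        total += term
    return total
```

Since every term of the series is entrywise nonnegative and $S$ connects all sites after finitely many steps, strict positivity of $\Theta_{xy}$ is visible directly in the partial sums.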

(We normalize by $W^{-2}$ to account for the fact that the distribution $s_{u0}$ has variance $O(W^{2})$.) It is easy to see that

a precise computation is given in (5.2) below.

Note that $T$ is not symmetric, but our results also hold for $T_{xy}$ replaced with the quantities $\sum_{j}s_{yj}\lvert G_{xj}\rvert^{2}$ or $\sum_{i,j}s_{xi}s_{yj}\lvert G_{ij}\rvert^{2}$.

Assume $d=1$ and (2.17). Suppose that $N\ll W^{5/4}$ and $(W/N)^{2}\leqslant\eta\leqslant 1$. Then

Note that the total mass of the distribution $\lvert G_{x0}\rvert^{2}$ may be computed explicitly by spectral decomposition of $G$: assuming $\Lambda\prec\Psi$ we have

in agreement with the corresponding statement (2.28) for the deterministic limiting profile.

We expect that (2.30) should in fact hold under the weaker conditions $\eta\gg\frac{1}{N}$ and $N\ll W^{2}$. The improved local semicircle law (2.20) should also hold under these weaker conditions. In particular, this would imply complete delocalization of the eigenvectors for all $N\ll W^{2}$. One obstacle is that a non-trivial control on $\Lambda$ in the regime $\eta\leqslant\frac{1}{W}$ is difficult to obtain.

The resolvent is controlled for $\eta\geqslant W^{-1/2}$ (instead of $\eta\gg W^{-1/3}$).

The control on the profile is pointwise in $x$ and $y$ (instead of in a weak sense on the scale $W\eta^{-1/2}$).

The estimates hold with high probability (instead of in expectation).

However, the result in the current paper is not uniform in $N$, unlike that of [EK1, EK2].

We conclude this section with an asymptotic result on the deterministic profile $\Theta_{x0}$. Since we are interested in large values of $x$, we need to consider the small-momentum behaviour of the Fourier transform of $\Theta_{x0}$. Using the small-$p$ expansion (1.2) and (1.4), we therefore find that $\Theta_{x0}\approx\theta_{x}$, where we defined the $N$-periodic function

Moreover, if $(W/N)^{2}\leqslant\eta\leqslant 1$ and $N\leqslant W^{2}$, we have the sharp upper bound $\Theta_{xy}\leqslant C\Upsilon_{xy}$.

where in the last step we used the elementary identities (3.3) and (3.5) below. In fact, the calculation (2.39) is a mere consistency check (to leading order) since $\sum_{x}\Theta_{x0}=\frac{\operatorname{Im}m}{\eta}$; see (5.2) below. We conclude that the average height of the profile is of order $(N\eta)^{-1}$. The peak of the exponential profile has height of order $(W\sqrt{\eta})^{-1}$, which dominates over the average height if and only if $\eta\gg(W/N)^{2}$. The regime $\eta\gg(W/N)^{2}$ corresponds to the regime where $\eta$ is sufficiently large that complete delocalization has not taken place, and the profile is mostly concentrated in the region $|x-y|\leqslant W\eta^{-1/2}\ll N$.

These scenarios are best understood in a dynamical picture in which $\eta$ is decreased down from $1$. The ensuing dynamics of $\theta$ corresponds to the diffusion approximation, where the quantum problem is replaced with a random walk of step-size of order $W$. On a configuration space consisting of $N$ sites, such a random walk will reach an equilibrium beyond time scales $(N/W)^{2}$. As observed in Remark 2.7, $\eta^{-1}$ plays the role of time $t$, so that in this dynamical picture equilibrium is reached for $t\sim\eta^{-1}\gg(N/W)^{2}$. Figure 1 illustrates this diffusive spreading of the profile for different values of $\eta$.
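The crossover at $\eta\sim(W/N)^{2}$ can be seen in a small numerical experiment on the deterministic profile. The sketch below computes one row of $\Theta$ by Fourier inversion and returns the ratio of its peak to its average. It assumes the form $\widehat{\Theta}(p)=|m|^{2}\widehat{S}(p)/(1-|m|^{2}\widehat{S}(p))$, an indicator profile, energy $E=0$ by default, and the relation $m+1/m+z=0$; these choices and the function names are ours, made only for illustration.

```python
import cmath
import numpy as np

def m_semicircle(z):
    """Root of m + 1/m + z = 0 with Im m > 0 (assumed normalization)."""
    root = cmath.sqrt(z * z - 4)
    m = (-z + root) / 2
    return m if m.imag > 0 else (-z - root) / 2

def theta_row(N, W, eta, E=0.0):
    """Profile x -> Theta_{x0} on the torus of N sites, via the discrete Fourier transform."""
    u = np.arange(N)
    dist = np.minimum(u, N - u)
    s = (dist <= W / 2).astype(float)
    s /= s.sum()                              # first row of S
    a = abs(m_semicircle(complex(E, eta))) ** 2
    s_hat = np.fft.fft(s).real                # real because s_u = s_{-u}
    return np.fft.ifft(a * s_hat / (1 - a * s_hat)).real

def peak_to_average(N, W, eta):
    """Ratio of the maximum of the profile to its mean value."""
    theta = theta_row(N, W, eta)
    return float(theta.max() / theta.mean())
```

For instance, with $N=400$ and $W=20$ the crossover scale is $(W/N)^{2}=0.0025$: the ratio is large for $\eta=0.5$ and close to one for $\eta=10^{-5}$, in line with the picture of a random walk that has equilibrated.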

4. Delocalization with a small mean-field component

In this section we continue to assume $d=1$ and (2.17). We now consider a related model

The effect of adding a small Wigner component of size $\varepsilon$ is that the imaginary part of the spectral parameter effectively increases from $\eta$ to $\eta+\varepsilon$ in the local semicircle law and in the diffusion approximation. In particular, we can eliminate the condition $N\leqslant W^{5/4}$ and still obtain delocalization for $H_{\varepsilon}$ provided $\varepsilon$ is not too small. These results are summarized in the following theorem. In order to state it, we introduce the control parameter

which is analogous to $\Phi$ defined in (2.18).

Suppose that $\eta(\eta+\varepsilon)\gg W^{-1}$. Moreover, suppose that $N\ll W^{5/4}$ or $\eta+\varepsilon\gg W^{-1/2}$. Then

Suppose that $\varepsilon+\eta\gg W^{-1/2}$ and

Then the resolvent is completely delocalized:

If $\varepsilon\gg(N/W^{2})^{2/3}$ then the eigenvectors of $H_{\varepsilon}$ are completely delocalized in the sense of Proposition 7.1.

This theorem formulates only the bounds concerning delocalization, i.e. the counterparts of Theorem 2.2 and Corollary 2.4. Similarly to Theorem 2.5, a non-trivial profile can be proved for the average of $|G_{xy}|^{2}$. The profile is visible in the regime $N\eta\geqslant W\sqrt{\varepsilon+\eta}$, and it is given by

where the approximation is valid in the regime $|x-y|\ll N$. The details of the precise formulation and the proof are left to the reader.

Preliminaries

In this subsection we introduce some further notations and collect some basic facts that will be used throughout the paper. Throughout this section we work in the general $d$-dimensional setting of Section 2.1.

For $T\subset\{1,\dots,N\}$ we define $H^{(T)}$ by

Moreover, we define the resolvent of $H^{(T)}$ through

Let $X\equiv X(H)$ be a random variable. For $i\in\{1,\dots,N\}$ define the operations $P_{i}$ and $Q_{i}$ through

We call $P_{i}$ partial expectation in the index $i$. Moreover, we say that $X$ is independent of $T\subset\{1,\dots,N\}$ if $X=P_{i}X$ for all $i\in T$.

Suppose that $X(u,v)\prec\Psi(u,v)$ uniformly in $u\in U$ and $v\in V$. If $\lvert V\rvert\leqslant N^{C}$ for some constant $C$ then

Suppose that $X_{1}(u)\prec\Psi_{1}(u)$ uniformly in $u$ and $X_{2}(u)\prec\Psi_{2}(u)$ uniformly in $u$. Then

The claims (i) and (ii) follow from a simple union bound. The claim (iii) follows from Chebyshev’s inequality, using a high-moment estimate combined with Jensen’s inequality for partial expectation. We omit the details. ∎

Note that if for any $\varepsilon>0$ and $p\geqslant 1$ we have

for large enough $N$ (depending on $\varepsilon$ and $p$) then $X\prec\Psi$ by Chebyshev’s inequality. Moreover, if $X\leqslant\Psi$ almost surely, then $X\prec\Psi$. Hence $O_{\prec}(\Psi)$ describes a larger class of random variables than $O(\Psi)$.

We need the following bound on $\Lambda$.

Away from the spectral edges, i.e. for $z\in\mathbf{S}$, this bound was proved in Proposition 3.3 of [EYY1]. In [EYY1], the matrix entries $x_{ij}$ were assumed to have at most subexponential tails (a stronger assumption than (2.11) for all $p$), but the proof of [EYY1] extends trivially to our case. See [EKYY4] for a simplified and generalized alternative proof.

The following result collects some elementary facts about $m$.

The identity (3.3) follows by taking the imaginary part of (2.15). The estimate (3.4) was proved in [EYY2, Lemma 4.2]. From (2.16) we find $\operatorname{Im}m=1/\alpha+O(\eta)$, from which (3.5) follows easily using (3.3). Finally, (3.6) follows from Lemma 4.2 in [EYY2] combined with (3.3) and (3.4). ∎

The following resolvent identities form the backbone of all of our proofs. They first appeared in [EYY1, Lemmas 4.1 and 4.2] and [EKYY2, Lemma 6.10]. The idea behind them is that a resolvent entry $G_{ij}$ depends strongly on the $i$-th and $j$-th columns of $H$, but weakly on all other columns. The first set of identities (called Family A) determines how to make a resolvent entry $G_{ij}$ independent of an additional index $k\neq i,j$. The second set of identities (Family B) expresses the dependence of a resolvent entry $G_{ij}$ on the entries in the $i$-th or in the $j$-th column of $H$.
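Both families are instances of the Schur complement formula and can be verified numerically. The sketch below checks, for $T=\emptyset$, the standard forms $G_{ij}=G_{ij}^{(k)}+G_{ik}G_{kj}/G_{kk}$ for $k\neq i,j$ and $G_{ij}=-G_{ii}\sum_{k\neq i}h_{ik}G_{kj}^{(i)}$ for $i\neq j$. Since the displayed identities are not reproduced in this text, we assume these standard forms; the function names and the convention of zero-filling the removed row and column are ours.

```python
import numpy as np

def resolvent(H, z):
    """G(z) = (H - z)^{-1}."""
    return np.linalg.inv(H - z * np.eye(H.shape[0]))

def minor_resolvent(H, z, k):
    """Resolvent of H with row and column k removed, kept in the original labels.

    Convention used here: row and column k of the result are filled with zeros.
    """
    N = H.shape[0]
    keep = [a for a in range(N) if a != k]
    Gk = np.zeros((N, N), dtype=complex)
    Gk[np.ix_(keep, keep)] = resolvent(H[np.ix_(keep, keep)], z)
    return Gk

def family_a_residual(H, z, i, j, k):
    """|G_ij - G_ij^(k) - G_ik G_kj / G_kk| for k different from i and j."""
    G = resolvent(H, z)
    Gk = minor_resolvent(H, z, k)
    return abs(G[i, j] - Gk[i, j] - G[i, k] * G[k, j] / G[k, k])

def family_b_residual(H, z, i, j):
    """|G_ij + G_ii sum_{k != i} h_ik G_kj^(i)| for i different from j."""
    G = resolvent(H, z)
    Gi = minor_resolvent(H, z, i)
    # Row i of Gi is zero, so summing over all k omits k = i automatically.
    return abs(G[i, j] + G[i, i] * (H[i, :] @ Gi[:, j]))
```

Both residuals vanish up to rounding for any Hermitian matrix, which reflects the fact that the identities are purely algebraic and use no randomness.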

For any Hermitian matrix $H$ and $T\subset\{1,\dots,N\}$ the following identities hold.

For $i,j,k\notin T$ and $k\neq i,j$, we have

For $i,j\notin T$ satisfying $i\neq j$ we have

The deterministic control parameter $\Psi$ is admissible if

A typical example of an admissible control parameter is

If $\Psi$ is admissible then the lower bound in (3.9) together with (2.12) ensures that $h_{ij}\prec\Psi$.

The following lemma gives an expansion formula for the diagonal entries of $G$.

Suppose that $\Lambda\prec\Psi$ for some admissible $\Psi$. Defining

The claim is an immediate consequence of Equations (9.1) and (9.2) in [EKY2]. (Related but less explicit formulas were also obtained in [EYY3].) ∎

In this section we collect the necessary results from [EKY2]. The following proposition is a special case of the Fluctuation Averaging Theorem of [EKY2].

Suppose that $\Lambda\prec\Psi$ for some admissible control parameter $\Psi$. Then

All of these estimates follow immediately from Theorem 4.8, Lemma B.1, and Proposition B.2 of [EKY2], recalling that by assumption $\operatorname{Im}m\geqslant c_{\kappa}$ by (3.6). ∎

The important quantity on the right-hand sides of (3.12), (3.13) and (3.14) is $\Psi$. The additional factors $M^{-1/4}$ are a technical nuisance (a necessary one, however: Proposition 3.9 would be false without the factors of $M^{-1/4}$; see [EKY2, Remark 4.10]), but their precise form will play some role in the large-$\eta$ regime, where $M^{-1/4}$ is not negligible compared to $\Psi$.

To interpret these estimates, we note that each summand in (3.12), (3.13), and (3.14) has a naive size given by $\Psi^{k}$, where $k$ is the number of off-diagonal resolvent entries in the summand. Without averaging, this naive size would be a sharp upper bound. In the second estimate in (3.12) the averaging does not improve the bound since $G_{\mu a}G_{a\mu}^{*}=|G_{\mu a}|^{2}$ is positive. In all other estimates, the monomial on the left-hand side either has a nontrivial phase or its expectation is zero thanks to $Q_{a}$. Proposition 3.9 asserts that in these cases the averaged quantity is smaller than its individual summands. Note that this averaging of fluctuations is effective even though the entries of $G$ may be strongly correlated. How many additional factors of $\Psi$ one gains depends on the structure of the left-hand side in a subtle way; see Theorem 4.8 of [EKY2] for the precise statement. For the applications in this paper the second bound in (3.13) is especially important; here the averaging yields a gain of two extra factors of $\Psi$.

We remark that all these bounds also hold if the weight functions $s_{\rho a}$ are replaced with a more general weight function. The precise definition is given in Definition 4.4 of [EKY2]. All the weights used in this paper satisfy Definition 4.4 of [EKY2].

We also note that averaging in indices can be replaced by expectations. We shall need the following special case of Theorem 4.15 of [EKY2].

Suppose that $\Lambda\prec\Psi$ for some admissible control parameter $\Psi$. Then for $a\neq\mu,\nu$

Self-consistent equation for $T$

After these preparations, we now move on to the main arguments of this paper. Throughout this section we work in the general $d$-dimensional setting of Section 2.1. In this section we derive a self-consistent equation for $T$, given in Theorem 4.1, whose error terms are controlled precisely using the fluctuation averaging from Proposition 3.9. In Section 5 we solve this self-consistent equation; the result is given in Proposition 5.1.

Suppose that $\Lambda\prec\Psi$ for some admissible control parameter $\Psi$. Then we have

where the matrix entries of the error satisfy

The naive size of $T_{xy}$ is of order $\Psi^{2}$. Notice that the error term in the self-consistent equation (4.2) is smaller by two orders. This improvement is essentially due to the second estimate of (3.13).

Instead of averaging in the first index of the resolvent in the definition (2.29) of $T$, we could have averaged in the second, resulting in the quantity $T_{xy}':=\sum_{j}\lvert G_{xj}\rvert^{2}s_{jy}$. Then $T'$ satisfies the self-consistent equation

where ${\mathcal{E}}'$ also satisfies (4.3).

Before the proof we mention that this result also gives a self-consistent equation for the two-sided averaged quantity

Taking the average $\sum_{y}s_{yz}$ of (4.1), we get the following corollary.

Suppose that $\Lambda\prec\Psi$ for some admissible control parameter $\Psi$. Then we have

where ${\mathcal{E}}$ and $\widetilde{{\mathcal{E}}}$ each satisfy (4.3).

The rest of this section is devoted to the proof of Theorem 4.1. We begin by writing

Then by the second formula in (3.13), we have

Notice that (3.13) applies only to the summands $i\neq y$ in (4.5). The estimate for the summand $i=y$ follows from

where we used that $(G_{yy}-m)\prec\Psi$ (see (3.11)) and that $\Psi$ is admissible, and in particular $\Psi M^{-1}\leqslant\Psi^{2}M^{-1/2}$.

We shall compute $\sum_{i}s_{xi}P_{i}|G_{iy}|^{2}$ up to error terms of order $\Psi^{4}$. We have the following result.

Suppose that $\Lambda\prec\Psi$ for some admissible control parameter $\Psi$. Then

(It is possible to improve the last error term in (4.7), but we shall not need this.) Before proving Lemma 4.4, we show how it implies Theorem 4.1.

Equation (4.1) is an immediate consequence of (4.8), (4.6), and (4.5). Hence Theorem 4.1 follows from Lemma 4.4. ∎

Throughout the following we shall repeatedly need the simple estimate

where the first step follows from $\Lambda\prec\Psi$, the second from the fact that $\Psi$ is admissible, and the last from (3.4). In particular, for $k\neq i,j$, from (3.7) we get the estimate

We start the proof of Lemma 4.4 with the case $i\neq y$. Using (3.11) we get

where in the last step we used $Z_{i}\prec\Psi$ and the large deviation bound (see Lemma B.2)

We may now compute the contribution of the main term in (4.11) to $P_{i}\lvert G_{iy}\rvert^{2}$. Still assuming $i\neq y$, we find

In the third step we used (4.9), and in the last step we added the missing term $k=i$ to obtain $T_{iy}$; the resulting error term is $O_{\prec}(\Psi^{2}M^{-1})$ since $i\neq y$. Next, using (4.9) we get

In the second step, using (3.7) and (4.9), we inserted an upper index $i$ as a preparation to taking the partial expectation $P_{i}$. We obtain

Now we take the partial expectation in $i$ in (4.15). Using that

by Proposition 3.10 and (4.10), we find that $P_{i}$ applied to the second term in (4.15) results in a quantity $O_{\prec}\big(\Psi^{2}(\Psi+M^{-1/4})^{2}\big)$. Thus the contribution of the main term in (4.11) to $P_{i}|G_{iy}|^{2}$ is

Next, we look at the contribution of the term with a $Z$ in (4.11):

Now we remove the upper indices at the expense of an error of size $O_{\prec}(\Psi^{4})$, and then add back the exceptional summation index $i$ as before. This gives

where in the second step we used (3.14); the various cases of coinciding indices $k,l,y$ are easily dealt with using the bound $M^{-1/2}\leqslant\Psi$.

As remarked above, in the real symmetric case (2.8) the pairing $c=k$, $d=l$ is also possible. This gives rise to the additional error term

Combining (4.11), (4.16) and (4.17) yields

for $i\neq y$. This proves (4.7) for the case $i\neq y$.

Here we used that $G_{yy}-m\prec\Psi$ and that $P_{y}(G_{yy}-m)\prec\Psi^{2}+M^{-1/2}$ by (3.11). It is possible to compute this term to high order in $\Psi$, but we shall not need this.

For the proof of (4.8) we run almost the same argument as above but now we aim at removing all upper indices $i$. We first consider the summands $i\neq y$. From (4.15) we get

where we removed the upper index $i$ using (3.7), and included the summand $k=i$ at the expense of a negligible error term. Taking the average $\sum_{i}^{(y)}s_{xi}$ of the second term on the right-hand side yields

In the first step we just added the exceptional index $i=y$, and estimated the additional terms with $i=y$ using $s_{xy}s_{yk}\leqslant M^{-1}s_{yk}\leqslant M^{-2}$ as well as $G_{ky}G_{yy}G_{yk}^{*}\prec\delta_{ky}+\Psi^{2}$. In the second step we used (3.14). Note that the gain comes from the summation index $i$.

Thus the contribution of the main term of (4.11) to $\sum_{i}^{(y)}s_{xi}P_{i}|G_{iy}|^{2}$ is

The contributions of the error terms in (4.11) to $\sum_{i\neq y}s_{xi}P_{i}|G_{iy}|^{2}$ are of order $O_{\prec}\bigl(\Psi^{4}+\Psi^{2}M^{-1/2}\bigr)$; this is true even without averaging (see (4.17)). Thus we have

Finally, we consider the case $i=y$. From (4.20) we get

This formula provides the missing summands $i=y$ in (4.24) and hence yields (4.8). ∎

Solving the equation for $T$

Thus, the entries $\Pi_{ij}$ of $\Pi$ are all equal to $1/N$, and $S\Pi=\Pi S=\Pi$ since $S$ is stochastic by (2.3). The complementary projection is denoted by $\overline{\Pi}:=1-\Pi$.

We perform this splitting on $T_{xy}$ only in the $x$ coordinate, regarding $y$ as fixed. Thus, we split

here the last step follows easily by spectral decomposition of $G$. We can use the local semicircle law, Lemma 3.4, to get

It is instructive to perform the same averaging with the deterministic profile $\Theta$:

Having dealt with the component $\Pi T$ in (5.1), we devote the rest of this section to the component $\overline{\Pi}\,T$. The following proposition contains the main result of this section.

Suppose that $\Lambda\prec\Psi$ for some admissible control parameter $\Psi$. Then we have for all $y$

Multiplying (4.2) by $\overline{\Pi}$ from the left yields

where we used that $S\Pi=\Pi S=\Pi$. Therefore

Note that $(\overline{\Pi}\,T)_{xy}=T_{xy}-\overline{T}_{y}$. Using (5.6) we therefore get (5.3), whose error term satisfies

This completes the proof of (5.3) and (5.4). ∎

Next, we estimate $|G_{ij}-\delta_{ij}m|^{2}$ in terms of $T_{ij}$. In other words, we derive pointwise estimates on $G_{ij}$ from estimates on the averaged quantity $T_{xy}$. This gives rise to an improved bound on $\Lambda$, which we may plug back into Proposition 5.1. Thus we get a self-improving scheme which may be iterated.

Suppose that $\Lambda\prec\Psi$ with some admissible control parameter $\Psi$ and $T_{ij}\prec\Omega_{ij}^{2}$ for a family of admissible control parameters $\Omega_{ij}$ indexed by a pair $(i,j)$ (see Definition 3.7). Then

(Here we write $\Omega_{ij}^{2}:=(\Omega_{ij})^{2}$.)

We fix the index $j$ throughout the proof. First let $i\neq j$. Then (3.8) gives

We shall use the large deviation bounds from Theorem B.1 to estimate the sum. For that we shall need a bound on

where in the first step we used (3.7) and (4.9). Since $G_{ii}\prec 1$, we get from (5.9) and Theorem B.1 (i) that

To estimate $G_{ii}-m$, we use (3.11) to get

where in the second step we used Theorem B.1 (i) and (ii), with the bounds $G_{kk}^{(i)}\prec 1$ and

This last estimate follows along the lines of (5.10), whereby the error terms resulting from the removal of the upper indices are estimated by Cauchy-Schwarz; we omit the details. Finally, $M^{-1}$ can be absorbed into $\sum_{k}\Omega_{ik}^{2}s_{ki}$ by admissibility of $\Omega_{ij}$. ∎

We may now combine Proposition 5.1 and Lemma 5.3 in an iterative self-improving scheme, which results in an improved bound on $\Lambda$.

Suppose that $\Lambda\prec\Psi$ and $T_{ij}\prec\Omega^{2}$ for all $i$ and $j$, where $\Psi$ and $\Omega$ are admissible control parameters. Then

We apply Lemma 5.3 to the constant control parameter $\Omega_{ij}=\Omega$ for each $i,j$. Since $T_{ij}\prec\Omega^{2}$ for all $i,j$, Lemma 5.3 yields

Now we can iterate this estimate, with $\Omega^{2}+\Psi^{4}$ taking the role of $\Psi^{2}$ in controlling $\Lambda^{2}$. Thus after one iteration we get

After $k$ iterations we get $\Lambda^{2}\prec\Omega^{2}+\Psi^{2^{k}}$. Since $\Omega$ and $\Psi$ are admissible, we have $\Psi^{2^{k}}\prec\Omega^{2}$ for $k\sim|\log\gamma|$. This completes the proof. ∎
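The doubling of the exponent in each step can be seen in a toy computation. The sketch below iterates the scalar map $x\mapsto\Omega^{2}+x^{2}$ starting from $x=\Psi^{2}$; it is only a caricature of the argument, in which constants and the relation $\prec$ are ignored, and the function name and numerical values are ours.

```python
def iterate_bound(psi_sq, omega_sq, steps):
    """Iterate x -> omega_sq + x**2 from x = psi_sq and return all iterates.

    Toy model of the self-improving bound: psi_sq plays the role of Psi^2
    and omega_sq the role of Omega^2; all constants are set to one.
    """
    history = [psi_sq]
    for _ in range(steps):
        history.append(omega_sq + history[-1] ** 2)
    return history

# Usage: starting from Psi^2 = 0.3 with Omega^2 = 1e-6, a handful of steps
# brings the bound down to essentially Omega^2.
history = iterate_bound(0.3, 1e-6, 6)
```

Because the contribution of the initial bound is squared at every step, the number of steps needed grows only logarithmically in the ratio of the exponents of $\Omega$ and $\Psi$, which is why a fixed finite number of iterations suffices.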

First we show that for large enough LL the Euclidean matrix norm satisfies

with some positive constant c1c_{1} depending on the profile ff. Since the matrix entries sijs_{ij} are translation invariant (see (2.2)), it is sufficient to compute its Fourier transform as defined in (5.13). Using the property Su^(p)=S^(p)u^(p)\widehat{Su}(p)=\widehat{S}(p)\widehat{u}(p), the fact that Π^(p)=δp0\widehat{\Pi}(p)=\delta_{p0}, and Plancherel’s identity, we find

The last step follows easily from S^(p)1+δ\widehat{S}(p)\geqslant-1+\delta (recall (2.4)) and the representation

where in the second step we used pxπ\lvert p\cdot x\rvert\leqslant\pi, and in the last step p2π/L\lvert p\rvert\geqslant 2\pi/L.
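The size of the spectral gap of $S$ away from the zero mode can be checked numerically on a small example. The sketch below is only illustrative: it assumes $d=1$, a Gaussian profile $f(u)=e^{-u^2}$, and sample values of $L$ and $W$, none of which are taken from the text. It computes $\widehat{S}(p)$ on the lattice momenta $p\in\frac{2\pi}{L}\mathbb{Z}$ and compares $1-\max_{p\neq 0}\widehat{S}(p)$ with $(W/L)^2$.

```python
import math


def s_hat(k, L, W):
    """Fourier transform of s_x = f(x / W) / Z at p = 2*pi*k/L, with f(u) = exp(-u**2)."""
    xs = range(-(L // 2) + 1, L // 2 + 1)        # one period of the torus Z / L Z (L even)
    weights = [math.exp(-(x / W) ** 2) for x in xs]
    Z = sum(weights)                             # normalisation so that sum_x s_x = 1
    p = 2.0 * math.pi * k / L
    # The profile is symmetric, so only the cosine part contributes.
    return sum(w * math.cos(p * x) for w, x in zip(weights, xs)) / Z


def spectral_gap(L, W):
    """Return 1 - max over nonzero lattice momenta of S_hat(p)."""
    return 1.0 - max(s_hat(k, L, W) for k in range(1, L // 2 + 1))


if __name__ == "__main__":
    L, W = 200, 10
    gap = spectral_gap(L, W)
    print(gap, gap / (W / L) ** 2)
```

For this particular profile the ratio of the gap to $(W/L)^2$ is of order one, in line with a bound of the form $1 - c_1 (W/L)^2$ with a profile-dependent constant $c_1$.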

From (3.6) we get 1m2cη1-|m|^{2}\geqslant c\eta, which, combined with (5.15), yields

In order to prove (5.6), we first observe that S1\|S\|_{\infty}\leqslant 1 as follows from the estimate

Delocalization bounds

In this section we prove our main results – Theorems 2.2, 2.5, and 2.11. We return to the one-dimensional case, d=1d=1, and continue to assume (2.17). In particular, we write NN instead of LL. The simple extension to higher dimensions is given in Section 8.

Suppose that ΛΨ\Lambda\prec\Psi for some admissible control parameter Ψ\Psi. Then (5.3) together with (5.4), (5.1), and (2.38) yield

Recalling Corollary 5.4, we have therefore proved

i.e. the upper bound Λ2Ψ2\Lambda^{2}\prec\Psi^{2} can be replaced with the stronger bound (6.2).

We can now iterate (6.2), exactly as in the proof of Corollary 5.4. We start the iteration with Ψ0:=(Wη)1/2\Psi_{0}\mathrel{\mathop{:}}=(W\eta)^{-1/2}; see Lemma 3.4. Explicitly, the iteration reads

From (6.2) and Lemma 3.4 we get that Λ2Ψk\Lambda^{2}\prec\Psi_{k} for any fixed kk.

In order to perform the iteration, we require

Thus we get the conditions NW5/4N\ll W^{5/4} and ηN2/W3\eta\gg N^{2}/W^{3}. (Here we used (2.1)). Satisfying these two conditions is the reason we need to impose the restriction on WW in Theorem 2.2, Corollary 2.4, and Theorem 2.5. Using (6.3) and the fact that Φ\Phi is by definition admissible, it is now easy to see that there is a finite constant kk, which depends on the implicit constants cc in \ll and \gg above, such that Ψk2CΦ2\Psi_{k}^{2}\leqslant C\Phi^{2}. This concludes the proof of Theorem 2.2.

6.2. Delocalization with profile: proof of Theorem 2.5

By assumption we have (W/N)2η1(W/N)^{2}\leqslant\eta\leqslant 1, so that in particular Φ2=W1η1/2=:Ψ2\Phi^{2}=W^{-1}\eta^{-1/2}=\mathrel{\mathop{:}}\Psi^{2}. Note that this Ψ\Psi is admissible. From (2.20) we get ΛΨ\Lambda\prec\Psi. Now observe that ImmNη=ΠxyImmη\frac{\operatorname{Im}m}{N\eta}=\Pi_{xy}\frac{\operatorname{Im}m}{\eta} for all xx and yy, as well as

by (3.3) and the property ΠS=SΠ=Π\Pi S=S\Pi=\Pi. Thus (5.3) together with (5.4) and (5.1) implies the first estimate of (2.30), since in the regime η(W/N)2\eta\geqslant(W/N)^{2} and W5/4NW^{5/4}\gg N the error term (5.4) is bounded by

The second estimate of (2.30) follows from the first one and (4.7).

Next, (2.31) follows by using (2.33) in (2.30).

Finally, using Lemma 5.3 with Ωij2=Υij\Omega_{ij}^{2}=\Upsilon_{ij} and Ψ:=W1/2η1/4\Psi\mathrel{\mathop{:}}=W^{-1/2}\eta^{-1/4}, we obtain

Here we used that Ψ4\Psi^{4} can be absorbed into (Nη)1Υij(N\eta)^{-1}\leqslant\Upsilon_{ij} and in the last summation kΥikski\sum_{k}\Upsilon_{ik}s_{ki} can be absorbed into ΥiiCWη\Upsilon_{ii}\geqslant\frac{C}{W\sqrt{\eta}}. This proves (2.32), and hence concludes the proof of Theorem 2.5.

6.3. Delocalization with a small mean-field component: proof of Theorem 2.11

with some positive constant cc. This implies that (5.5) and (5.6) can be improved to

Suppose now that ΛΨ\Lambda\prec\Psi for some admissible control parameter Ψ\Psi. Then the statement of Proposition 5.1 is modified to

Notice that the Fourier transforms of SS and SεS_{\varepsilon} (defined by (5.13)) satisfy

Here we treated the zero mode p=0p=0 separately; it is given by

where in the last step we used (3.3). The error term in (6.9) is estimated using a similar calculation.

Notice that the coefficient of S^(p)\widehat{S}(p) in the denominator of (6.9) is now m2(1ε)=1ε(1ε)αη+O(η2)|m|^{2}(1-\varepsilon)=1-\varepsilon-(1-\varepsilon)\alpha\eta+O(\eta^{2}), where we used (3.5). The results and the proof of Proposition 2.8 remain unchanged when SS is replaced with SεS_{\varepsilon}, except that αη\alpha\eta must be replaced with (1ε)αη+ε(1-\varepsilon)\alpha\eta+\varepsilon on the right-hand side of (2.36), and the whole expression is multiplied by an additional factor (1ε)(1-\varepsilon). Moreover, instead of (2.38), we now have

Recall the definition (2.20) of Φε\Phi_{\varepsilon}. Following the proof of Theorem 2.2, instead of (6.2) we now obtain

As in Section 6.1, we can iterate (6.11) under the conditions

(Note that the a priori estimate (Wη)1(W\eta)^{-1} is still determined by WW despite the small mean-field component. In Lemma 3.4 it is given by (Mη)1/2(M\eta)^{-1/2} where M=(maxijsij)1(εN1+W1)1WM=(\max_{ij}s_{ij})^{-1}\sim(\varepsilon N^{-1}+W^{-1})^{-1}\sim W.) The first condition of (6.12) holds if

In order to get complete delocalization of the resolvent, i.e. Λ2(Nη)1\Lambda^{2}\prec(N\eta)^{-1}, we require ΛΦε2\Lambda\prec\Phi_{\varepsilon}^{2} as well as

which ensures that Φε=(Nη)1\Phi_{\varepsilon}=(N\eta)^{-1}. Hence we get complete delocalization of the resolvent provided that (6.13), (6.14), and (6.16) hold. This concludes the proof of part (ii).

If $\varepsilon\gg(N/W^{2})^{2/3}$ then there exists an $\eta$ such that the assumptions of part (ii) are met. Hence part (ii) and Proposition 7.1 yield part (iii). This concludes the proof of Theorem 2.11.

Complete delocalization of eigenvectors

Let ε>0\varepsilon>0 and define the random subset of eigenvector indices through

Suppose that ΛΨ\Lambda\prec\Psi for some admissible control parameter Ψ\Psi. Let ηηN\eta\equiv\eta_{N} be a sequence satisfying M1+γη1M^{-1+\gamma}\leqslant\eta\ll 1. Suppose that

uniformly in EIE\in I, where in the first step we used the spectral decomposition of GG. Thus, for all xx, the map yηImmGyx2y\mapsto\frac{\eta}{\operatorname{Im}m}\lvert G_{yx}\rvert^{2} is approximately a probability distribution on {1,,N}\{1,\dots,N\}. Roughly, (7.1) states that this probability distribution is supported on the order of NN sites of {1,,N}\{1,\dots,N\}. More precisely, (7.1) yields (introducing the standard basis vector δx\delta_{x} defined by (δx)(y):=δxy(\delta_{x})(y)\mathrel{\mathop{:}}=\delta_{xy}), for any fixed xx,

uniformly in EIE\in I. Here in the third step we used (7.2) and (7.1), and in the last step the upper bound ηMc\eta\leqslant M^{-c} and the fact that Ψ\Psi is admissible.
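The normalisation behind the probability-distribution picture above is the identity $\sum_y \lvert G_{yx}\rvert^2 = \operatorname{Im} G_{xx}/\eta$, which holds for the resolvent $G=(H-z)^{-1}$ of any Hermitian $H$ with $z=E+\mathrm{i}\eta$. The following is a minimal numerical sketch of this identity. It assumes a small real symmetric periodic band matrix with Gaussian entries; the helper names, the matrix size, and the parameters are illustrative and not taken from the text.

```python
import math
import random


def band_matrix(n, w, seed=0):
    """Real symmetric n x n matrix with Gaussian entries in a periodic band of width w."""
    rng = random.Random(seed)
    h = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            dist = min(j - i, n - (j - i))       # distance on the torus Z / nZ
            if dist <= w:
                h[i][j] = h[j][i] = rng.gauss(0.0, 1.0) / math.sqrt(2 * w + 1)
    return h


def resolvent_column(h, z, x):
    """Solve (H - z) g = e_x by Gauss-Jordan elimination with partial pivoting."""
    n = len(h)
    # Augmented matrix [H - z | e_x] with complex entries.
    a = [[complex(h[i][j]) - (z if i == j else 0) for j in range(n)]
         + [1.0 + 0j if i == x else 0j] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [a[r][k] - factor * a[col][k] for k in range(n + 1)]
    return [a[i][n] / a[i][i] for i in range(n)]


def ward_identity_sides(h, z, x):
    """Return (sum_y |G_yx|^2, Im G_xx / eta) for z = E + i*eta."""
    g = resolvent_column(h, z, x)
    return sum(abs(v) ** 2 for v in g), g[x].imag / z.imag


if __name__ == "__main__":
    print(ward_identity_sides(band_matrix(12, 2, seed=1), complex(0.3, 0.1), 4))
```

The two returned numbers agree up to rounding error, so after dividing by $\operatorname{Im} G_{xx}/\eta$ the weights $\lvert G_{yx}\rvert^2$ sum to one.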

Therefore we may estimate the left-hand side by its square root to get the bound

Similarly, we may estimate the second term of (7.4) using

Combining (7.4) with (7.3), (7.5), and (7.6), we get

Setting ζ=ε\zeta=\sqrt{\varepsilon} in (7.8) therefore yields

Extension to higher dimensions and a slowly decaying band

In this section we extend Theorem 2.2, Corollary 2.4, and Theorem 2.5 in two directions: higher dimensions $d$ and a slowly decaying band.

The multidimensional analogues of the slowly decaying profile are left to the reader, as is the formulation of these extensions if a small mean-field component is added to the band matrix. All these results can be obtained in a straightforward manner following the proofs for the one-dimensional case with a rapidly decaying ff.

Let d=2,3,d=2,3,\dots and assume (2.17). Then there is a constant CC such that

In order to state the precise form of the profile Θxy\Theta_{xy}, we define the covariance matrix DDWD\equiv D_{W} through

Since D>0D_{\infty}>0 we get Dc>0D\geqslant c>0 uniformly in WW.

Next, we define the dd-dimensional Yukawa potential

where χ\chi is a smooth function satisfying χ(q)=1\chi(q)=1 for q1/2\lvert q\rvert\leqslant 1/2 and χ(q)=0\chi(q)=0 for q1\lvert q\rvert\geqslant 1, φ\varphi is a Schwartz function satisfying φ=1\int\varphi=1, and φt(x):=tdφ(x/t)\varphi_{t}(x)\mathrel{\mathop{:}}=t^{-d}\varphi(x/t). (In fact, φ(D1/2x)\varphi(D^{-1/2}x) is the Fourier transform of χ(q)\chi(q).) The second step of (8.2) follows by Poisson summation; see Appendix A and in particular (A.14) for more details. The following lemma gives the precise error bounds in the approximation (8.2).

Let d=1,2,3,d=1,2,3,\dots and assume (2.17). Then

The convolution in (8.2) smooths out the Yukawa potential on the scale xWx\approx W. The error terms in (8.3) are negligible compared to the main term θx\theta_{x} in the regime WxCWη1/2W\ll|x|\leqslant CW\eta^{-1/2}. Therefore the approximation θ\theta is meaningful from the profile scale Wη1/2W\eta^{-1/2} down to the band scale WW. The actual choice of the function χ\chi in (8.2) is immaterial in the relevant regime xW|x|\gg W, as long as χ\chi is equal to one in a neighbourhood of the origin.

Next, we state the counterparts of Theorem 2.2, Corollary 2.4, and Theorem 2.5 in the higher-dimensional setting. Their proofs are trivial modifications of the proofs of their one-dimensional counterparts, using Lemmas 8.1 and 8.2.

Let d=2,3,d=2,3,\dots and assume (2.17). Suppose moreover that LW1+d/4L\ll W^{1+d/4} and ηL2/Wd+2\eta\gg L^{2}/W^{d+2}. Then we have

Let d=2,3,d=2,3,\dots and assume (2.17). If LW1+d/4L\ll W^{1+d/4} then the eigenvectors of HH are completely delocalized in the sense of Proposition 7.1.

Let d=2,3,d=2,3,\dots and assume (2.17). Suppose that LW1+d/4L\ll W^{1+d/4} and (W/L)2η1(W/L)^{2}\leqslant\eta\leqslant 1. Then

Moreover, the analogues of (2.31) and (2.32) hold with

where KK is an arbitrary, fixed, positive integer.

8.2. Slowly decaying band

In this section we make the following assumption on the band shape. Suppose that d=1d=1 and ff is smooth and symmetric, and satisfies

for some fixed β(0,2)\beta\in(0,2). Here hh is a symmetric function satisfying

for some fixed h0>0h_{0}>0. Note that by definition ff is smooth and symmetric, so that h(x)=O(x1+β)h(x)=O(\lvert x\rvert^{1+\beta}) near the origin.

In order to avoid technical issues arising from the periodicity of SS, we cut off the tail of ff at scales xNx\approx N. Thus we set

here σ\sigma is a smooth, symmetric bump function satisfying σ(x)=1\sigma(x)=1 for xa\lvert x\rvert\leqslant a and σ(x)=0\sigma(x)=0 for xb\lvert x\rvert\geqslant b, where 0<a<b<1/20<a<b<1/2. As usual, ZZ is a normalization constant.

The following lemma is the analogue of Lemma 5.2. Its proof is similar to that of Lemma 5.2; the key input is Lemma A.2 (iii).

Suppose that d=1d=1 and that (8.6) and (8.7) hold. Then

Next, we give the sharp upper bound on the peak of the profile.

Suppose that d=1d=1 and that (8.6) and (8.7) hold. Then

In order to describe the asymptotic shape of the profile, we define

which plays a role similar to the unrenormalized diffusion constant DD from (2.26). Moreover, define the function

which is bounded for β>1\beta>1. It is easy to check that for β>1\beta>1 and x1\lvert x\rvert\geqslant 1 we have

with an explicitly computable constant Cβ>0C_{\beta}>0.

Suppose that d=1d=1 and that (8.6) and (8.7) hold for some β>1\beta>1. Suppose moreover that

The matrix Θ\Theta is the resolvent of a superdiffusive operator, whose symbol in Fourier space is BWpβB\lvert Wp\rvert^{\beta}. Thus, under the identification t=η1t=\eta^{-1} from Remark 2.7, we find that the associated dynamics scales according to xWt1/βx\sim Wt^{1/\beta} instead of the diffusive scaling xWt1/2x\sim Wt^{1/2}.
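The superdiffusive length scale can be read off from the symbol by a short heuristic computation. The sketch below does not track constants or the precise normalisation of $\Theta$, both of which are assumptions here.

```latex
% Heuristic only: constants and the exact normalisation of \Theta are not tracked.
\[
  \widehat{\Theta}(p) \;\approx\; \frac{1}{\eta + B\lvert Wp\rvert^{\beta}}
  \quad\Longrightarrow\quad
  \widehat{\Theta}(p) \text{ is of order } \eta^{-1}
  \text{ only for } \lvert p\rvert \lesssim W^{-1}\eta^{1/\beta},
\]
\[
  \text{so that } \Theta_{xy} \text{ varies on the scale }
  \lvert x-y\rvert \sim W\eta^{-1/\beta} = W t^{1/\beta},
  \qquad t = \eta^{-1}.
\]
% Formally setting \beta = 2 recovers the diffusive scale W t^{1/2}.
```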

We may now state the counterparts of Theorem 2.2, Corollary 2.4, and Theorem 2.5 for the slowly decaying band. Their proofs are trivial modifications of those for the strongly decaying band, using Lemmas 8.7 and 8.8.

Suppose that d=1d=1 and that (8.6) and (8.7) hold. Suppose moreover that NW1+1/2βN\ll W^{1+1/2\beta} and η(N/W)β/W\eta\gg(N/W)^{\beta}/W. Then we have

Suppose that d=1d=1 and that (8.6) and (8.7) hold. If NW1+1/2βN\ll W^{1+1/2\beta} then the eigenvectors of HH are completely delocalized in the sense of Proposition 7.1.

Suppose that d=1d=1 and that (8.6) and (8.7) hold for some β1\beta\geqslant 1. Suppose that NW1+1/2βN\ll W^{1+1/2\beta} and (W/N)βη1(W/N)^{\beta}\leqslant\eta\leqslant 1. Then

Moreover, the analogues of (2.31) and (2.32) hold.

Appendix A The deterministic profile

In this appendix we establish bounds and asymptotics for the deterministic profile Θxy\Theta_{xy}.

where gg is a bounded smooth function satisfying g(q)=g(q)g(q)=g(-q). Clearly, f^\widehat{f} is real and f^1\|\widehat{f}\|_{\infty}\leqslant 1. Moreover, we claim that for any ε>0\varepsilon>0 there exists an ε>0\varepsilon^{\prime}>0 such that

indeed, this follows easily from the identity

As a guide for intuition, we have S^W(q)f^(q)\widehat{S}_{W}(q)\approx\widehat{f}(q), as can be seen from

Thus, our proof consists in controlling the error in the approximation

As a first step, we establish basic properties of S^W\widehat{S}_{W} that are analogous to (A.2) and (A.1).

The function S^W\widehat{S}_{W} is smooth with uniformly bounded derivatives, real, and symmetric with S^W(q)1\lvert\widehat{S}_{W}(q)\rvert\leqslant 1 and S^W(0)=1\widehat{S}_{W}(0)=1. Moreover, it has the following properties.

For any ε>0\varepsilon>0 there exists an ε>0\varepsilon^{\prime}>0 such that

for large enough WW (depending on ε\varepsilon).

There exists a smooth function gWg_{W} whose derivatives are bounded uniformly in WW such that

The proof of (i) is similar to that of (A.2).

(recall that qπW\lvert q\rvert\leqslant\pi W) and that

to estimate the main term, and (2.1) to estimate the error term with $K\sim 1/\delta$. We also have the trivial bound $|\widehat{S}_{W}|\leqslant 1$. Thus we have

We can iterate the above argument for the main term in (A.8), thus obtaining higher order divided differences of ff. Since ff is smooth and decays rapidly, (A.6) follows for k=0k=0. The proof for k>0k>0 is analogous.

Now (iii) follows from the fact that the function $h(q) := \bigl(1-q^{2}/2-\cos(q)\bigr)q^{-4}$ is smooth and its derivatives are bounded. ∎
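The smoothness of $h$ at the origin can be seen from the Taylor series of the cosine. The following expansion is a standard computation, added here for illustration.

```latex
\[
  h(q) \;=\; \frac{1 - q^{2}/2 - \cos q}{q^{4}}
       \;=\; -\sum_{k \geqslant 2} \frac{(-1)^{k}\, q^{2k-4}}{(2k)!}
       \;=\; -\frac{1}{24} + \frac{q^{2}}{720} - \frac{q^{4}}{40320} + \cdots
\]
```

The series converges for all $q$, so $h$ extends smoothly through $q=0$ with $h(0)=-1/24$. For $\lvert q\rvert\geqslant 1$ the explicit formula shows directly that $h$ and its derivatives are bounded.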

with some positive constant cc. By (A.5) we have on the support of χ ⁣\overline{\chi}\!\,

for some ε\varepsilon^{\prime} depending on ε\varepsilon. Then from (A.3) we have

extended to the whole real line, is smooth and its derivatives are bounded uniformly in NN and WW (by (A.9)). (These bounds may of course depend on ε\varepsilon). Moreover, R(q)=OK(qK)R(q)=O_{K}(\langle q\rangle^{-K}) for any KK; see (A.6). By summation by parts, as in (A.8), we find that for such a function we have

Now we consider the first term in (A.10). For the following we use Ai(q,η,N,W)A_{i}(q,\eta,N,W) with i=1,2,3,i=1,2,3,\dots to denote functions that are smooth in qq and whose qq-derivatives are uniformly bounded in qq, η\eta, WW, and NN. Using the Taylor expansion (A.7) and (3.5), we have (omitting the arguments for brevity)

This gives (again omitting the arguments)

where we introduced the new variable r:=η1/2qr\mathrel{\mathop{:}}=\eta^{-1/2}q. By definition, A1,,A6A_{1},\dots,A_{6} and their qq-derivatives are uniformly bounded. Since Dc>0D\geqslant c>0 and rεη1/2r\leqslant\varepsilon\eta^{-1/2} on the support of χ\chi, we find that for small enough ε\varepsilon the denominator of the second line of (A.12) is bounded away from zero, uniformly in rr, η\eta, WW, and NN. We therefore conclude that FN,W,ηF_{N,W,\eta} is smooth and its derivatives (in the variable rr) are uniformly bounded.

Using summation by parts, exactly as in (A.8), we get

Here we used that the sum on the left-hand side ranges over a set of size O(N/W)O(N/W) due to the factor χ\chi in the definition of FN,W,ηF_{N,W,\eta}. Therefore (A.12) and (A.13) imply that the first term of (A.10) is given by

Notice that the error term in (A.11) is smaller than in (A.14). Next, we remove the factor χ\chi from the main term, exactly as in (A.11). Plugging this into (A.10) yields

We can extend the summation in the main term

where the error term on the right-hand side is of order O(W2)O(W^{-2}). Thus we have

The main term can be computed by the Poisson summation formula

In order to prove (2.38), it suffices to analyse the asymptotics of the expression

We consider two cases. If $\eta\geqslant\bigl(\frac{W}{N}\bigr)^{2}$ then $R\asymp\frac{1}{W\sqrt{\eta}}$. On the other hand, if $\eta\leqslant\bigl(\frac{W}{N}\bigr)^{2}$ we use an integral approximation to get

This concludes the proof of (2.38), and hence of Proposition 2.8.
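The Poisson summation step and the two regimes of $\eta$ can be illustrated with a toy computation in $d=1$. The sketch below assumes that the main term has the form $(\eta + D W^2 p^2)^{-1}$ with $D=1$ and with $\alpha$ set to one; these constants, the function names, and the sample values of $N$, $W$, $\eta$ are illustrative assumptions. It compares the lattice sum over $p\in\frac{2\pi}{N}\mathbb{Z}$, $\lvert p\rvert\leqslant\pi$, with the periodised exponential profile obtained from Poisson summation.

```python
import math


def lattice_sum(x, N, W, eta, D=1.0):
    """(1/N) * sum over p = 2*pi*k/N, -pi < p <= pi, of cos(p*x) / (eta + D*(W*p)**2).

    Assumes N is even.
    """
    total = 0.0
    for k in range(-(N // 2) + 1, N // 2 + 1):
        p = 2.0 * math.pi * k / N
        total += math.cos(p * x) / (eta + D * (W * p) ** 2)
    return total / N


def periodized_exponential(x, N, W, eta, D=1.0):
    """Sum over images j of g(x + j*N), with g(x) = exp(-kappa*|x|) / (2*W*sqrt(D*eta)).

    g is the inverse Fourier transform on the real line of 1 / (eta + D*(W*p)**2);
    the geometric series over the images is summed in closed form for 0 <= x <= N.
    """
    kappa = math.sqrt(eta / D) / W
    amplitude = 1.0 / (2.0 * W * math.sqrt(D * eta))
    return amplitude * math.cosh(kappa * (N / 2.0 - x)) / math.sinh(kappa * N / 2.0)


if __name__ == "__main__":
    for x, N, W, eta in [(0, 4000, 10, 1e-2), (100, 4000, 10, 1e-2), (0, 400, 10, 1e-6)]:
        print(x, N, eta, lattice_sum(x, N, W, eta), periodized_exponential(x, N, W, eta))
```

In this toy computation the peak at $x=0$ is close to $\frac{1}{2W\sqrt{D\eta}}$ when $\eta\geqslant (W/N)^2$, and close to the zero-mode contribution $\frac{1}{N\eta}$ when $\eta\ll (W/N)^2$. The small discrepancy between the two functions comes from the momentum cutoff $\lvert p\rvert\leqslant\pi$ in the lattice sum.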

A.2. Higher dimensions: proofs of Lemmas 8.1 and 8.2

We follow the argument from the proof of Proposition 2.8 in the previous section, and merely sketch the differences. We use the dd-dimensional lattices

Note that, unlike in the proof of Proposition 2.8, we keep the cutoff function χ\chi in the main term since the function (η+qDq)1(\eta+q\cdot Dq)^{-1} is not integrable in higher dimensions.

The main term of (A.17) can be computed using Poisson summation:

Using that $V(x)\asymp\lvert x\rvert^{2-d}$ near the origin, we find $\lVert V*\varphi_{\sqrt{\alpha\eta}}\rVert_{\infty}\leqslant CW^{-d}$. By treating the two cases $\eta\leqslant\bigl(\frac{W}{N}\bigr)^{2}$ and $\eta\geqslant\bigl(\frac{W}{N}\bigr)^{2}$ separately, we find exactly as in the last paragraph of the proof of Proposition 2.8 that (A.18) is bounded by $CW^{-d}+C(N\eta)^{-1}$.

What remains therefore is the estimate of the error term containing RR in (A.17). To that end, we write

We need a more precise bound on the error term of (A.17) than the bound CWdCW^{-d} from the proof of Lemma 8.1. In fact, we claim that

The proof of (A.20) is a rather laborious exercise in Taylor expansion whose details we omit. The basic strategy is similar to the analysis of (A.12), except that we expand S^W\widehat{S}_{W} up to order d/2+2d/2+2 (instead of 44). This completes the proof of Lemma 8.2. ∎

A.3. Slowly decaying band: proof of Lemma 8.8 and Proposition 8.9

We begin by proving the following auxiliary result, which gives the relevant asymptotics of S^W\widehat{S}_{W}. For q0q\neq 0 define

We also set b(0):=0b(0)\mathrel{\mathop{:}}=0, so that bb is continuous.

Suppose that d=1d=1 and that (8.6) and (8.7) hold. Then the following are true.

Part (i) is proved similarly to (A.6), using summation by parts.

Let χ\chi be a smooth, symmetric bump function satisfying χ(x)=1\chi(x)=1 for x1\lvert x\rvert\leqslant 1 and χ(x)=0\chi(x)=0 for x2\lvert x\rvert\geqslant 2. Write χ ⁣:=1χ\overline{\chi}\!\,\mathrel{\mathop{:}}=1-\chi. We introduce the splitting

on the right-hand side of (A.23). It is easy to check that the first two terms give a contribution of order $O(q^{2})$. The last term of the splitting gives rise to

where the last step follows from a mid-point Riemann sum approximation. Now a change of variables u=qxu=qx easily yields (A.22).

Part (iii) follows from part (ii) using an argument similar to (5.15). ∎

where the first term is the contribution of the low modes qη1/β\lvert q\rvert\leqslant\eta^{1/\beta} and the second term the contribution of the high modes qη1/β\lvert q\rvert\geqslant\eta^{1/\beta}, which may be replaced with an integral and estimated using Lemma A.2. We omit the details. ∎

We proceed similarly to the proof of Proposition 2.8. We choose a cutoff scale ε\varepsilon, and denote by χ\chi the bump function from the proof of Lemma A.2. The scale ε\varepsilon satisfies η1/βε1\eta^{1/\beta}\ll\varepsilon\ll 1, and will be chosen by optimizing at the end of the proof.

We use the expansions (A.22) and (3.5). Thus we find, as in the proof of Proposition 2.8,

where χ\chi is a smooth bump function as in the proof of Proposition 2.8 and

Note that for qQ{0}q\in Q\setminus\{0\} we have b(q)cb(q)\geqslant c. Using (A.21) we may therefore estimate, as in (A.12), to get

for some $c>0$, where we used (8.10). Setting $\varepsilon := \eta^{1-1/\beta}$ and applying Poisson summation yields

Now (8.11) follows by noting that by (8.9), under the assumption (8.10), only the term k=0k=0 is of leading order. ∎

Appendix B Multilinear large deviation estimates

In this appendix we give a generalization of the large deviation estimate of Corollary B.3 in EYY1. The proof is simpler and the statement is formulated under the assumption (2.11) instead of the stronger subexponential decay assumption. Moreover, since the current proof does not rely on the Burkholder inequality, it is trivially generalizable to arbitrary multilinear estimates.

Throughout the following we consider random variables XX satisfying

(i) Suppose that $\bigl(\sum_{i}\lvert b_{i}\rvert^{2}\bigr)^{1/2}\prec\Psi$. Then $\sum_{i}b_{i}X_{i}\prec\Psi$.

(ii) Suppose that $\bigl(\sum_{i\neq j}\lvert a_{ij}\rvert^{2}\bigr)^{1/2}\prec\Psi$. Then $\sum_{i\neq j}a_{ij}X_{i}X_{j}\prec\Psi$.

(iii) Suppose that $\bigl(\sum_{i,j}\lvert a_{ij}\rvert^{2}\bigr)^{1/2}\prec\Psi$. Then $\sum_{i,j}a_{ij}X_{i}Y_{j}\prec\Psi$.

If all of the above random variables depend on an index uu and the hypotheses of (i) – (iii) are uniform in uu, then so are the conclusions.
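The mechanism behind the linear estimate, namely that $\sum_i b_i X_i$ is controlled by the $\ell^2$-norm of the coefficients, can be checked exactly in the simplest case. The sketch below assumes uniform random signs $X_i=\pm 1$, which have mean zero and unit variance; whether they match every detail of (B.1) is an assumption, and the function name and coefficients are illustrative. It enumerates all sign patterns and computes the second and fourth moments.

```python
from itertools import product


def exact_moment(b, p):
    """E |sum_i b_i X_i|^p for independent uniform signs X_i = +-1, by full enumeration."""
    n = len(b)
    total = 0.0
    for signs in product((-1.0, 1.0), repeat=n):
        total += abs(sum(bi * si for bi, si in zip(b, signs))) ** p
    return total / 2 ** n


if __name__ == "__main__":
    b = [0.5, -1.0, 0.25, 2.0, 1.5, -0.75, 0.1, 0.3]
    B2 = sum(x * x for x in b)
    print(exact_moment(b, 2), B2)            # second moment equals B^2
    print(exact_moment(b, 4), 3 * B2 ** 2)   # fourth moment is at most 3 * B^4
```

For random signs the second moment equals $B^2=\sum_i\lvert b_i\rvert^2$ exactly and the fourth moment is at most $3B^4$, a special case of moment bounds of the form $\lVert\sum_i b_i X_i\rVert_p\leqslant C_p B$.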

The rest of this appendix is devoted to the proof of Theorem B.1. Our proof in fact generalizes trivially to arbitrary multilinear estimates for quantities of the form i1,,ikai1ik(u)Xi1(u)Xik(u)\sum_{i_{1},\dots,i_{k}}^{*}a_{i_{1}\dots i_{k}}(u)X_{i_{1}}(u)\cdots X_{i_{k}}(u), where the star indicates that the summation indices are constrained to be distinct.

We first recall the following version of the Marcinkiewicz-Zygmund inequality.

Let X1,,XNX_{1},\dots,X_{N} be a family of independent random variables each satisfying (B.1) and suppose that the family (bi)(b_{i}) is deterministic. Then

The proof is a simple application of Jensen’s inequality. Writing $B^{2} := \sum_{i}\lvert b_{i}\rvert^{2}$, we get, by the classical Marcinkiewicz-Zygmund inequality stroock in the first line, that

Next, we prove the following intermediate result.

Let X1,,XN,Y1,,YNX_{1},\dots,X_{N},Y_{1},\dots,Y_{N} be independent random variables each satisfying (B.1), and suppose that the family (aij)(a_{ij}) is deterministic. Then for all p2p\geqslant 2 we have

Note that (bj)(b_{j}) and (Yj)(Y_{j}) are independent families. By conditioning on the family (bj)(b_{j}), we therefore get from Lemma B.2 and the triangle inequality that

Let X1,,XNX_{1},\dots,X_{N} be independent random variables each satisfying (B.1), and suppose that the family (aij)(a_{ij}) is deterministic. Then we have

The proof relies on the identity (valid for iji\neq j)

where the sum ranges over nonempty subsets II and JJ. Now we may estimate

As remarked above, the proof of Lemma B.4 may be easily extended to multilinear expressions of the form i1,,ikai1ikXi1Xik\sum_{i_{1},\dots,i_{k}}^{*}a_{i_{1}\dots i_{k}}X_{i_{1}}\cdots X_{i_{k}}.
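The decoupling step in the proof of Lemma B.4 rests on a counting fact that is easy to verify by brute force. The sketch below assumes the identity has the usual form $1 = Z_N^{-1}\sum_{I\sqcup J}\mathbf{1}(i\in I)\,\mathbf{1}(j\in J)$ for $i\neq j$, where the sum runs over splittings of $\{1,\dots,N\}$ into two nonempty sets and $Z_N=2^{N-2}$. Since the display is not reproduced above, this normalisation is an assumption, and the function name is illustrative.

```python
from itertools import product


def count_splits(n, i, j):
    """Count ordered splittings (I, J) of {0, ..., n-1} with i in I and j in J."""
    count = 0
    # labels[k] == 0 means k belongs to I, labels[k] == 1 means k belongs to J.
    for labels in product((0, 1), repeat=n):
        if labels[i] == 0 and labels[j] == 1:
            count += 1  # I and J are automatically nonempty here
    return count


if __name__ == "__main__":
    n = 5
    print({count_splits(n, i, j) for i in range(n) for j in range(n) if i != j})
```

Every pair $i\neq j$ is separated by the same number $2^{N-2}$ of splittings, so a sum over $i\neq j$ can be rewritten as an average over splittings in which the index sets $I$ and $J$ are disjoint. The random variables indexed by $I$ and by $J$ are then independent, which is the setting of Lemma B.3.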

We may now complete the proof of Theorem B.1.

The proof is a simple application of Chebyshev’s inequality. Part (i) follows from Lemma B.2, part (ii) from Lemma B.4, and part (iii) from Lemma B.3. We give the details for part (ii).

for arbitrary $D$. In the second step we used the definition of $\bigl(\sum_{i\neq j}|a_{ij}|^{2}\bigr)^{1/2}\prec\Psi$ with parameters $\varepsilon/2$ and $D+1$. In the last step we used Lemma B.4 by conditioning on $(a_{ij})$. Given $\varepsilon$ and $D$, there is a large enough $p$ such that the first term on the last line is bounded by $N^{-D-1}$. Since $\varepsilon$ and $D$ were arbitrary, the proof is complete.

The claimed uniformity in uu in the case that aija_{ij} and XiX_{i} depend on an index uu also follows from the above estimate. ∎

References