Random matrices: Sharp concentration of eigenvalues

Terence Tao, Van Vu

Introduction

The purpose of this paper is to sharpen the existing bounds on the eigenvalue counting function NI=NI(Wn)N_{I}=N_{I}(W_{n}) of a (normalized) Wigner matrix Wn=1nMnW_{n}=\frac{1}{\sqrt{n}}M_{n}, and related quantities such as the Stieltjes transform sWn(z)s_{W_{n}}(z) and individual eigenvalues λi(Wn)\lambda_{i}(W_{n}). Let us first state the Wigner random matrix model which we will use.

Let n1n\geq 1 be an integer (which we view as a parameter going off to infinity; in particular, nn is understood to be large enough that quantities such as loglogn\log\log n are well-defined and positive). An n×nn\times n Wigner matrix MnM_{n} is defined to be a random Hermitian n×nn\times n matrix Mn=(ξij)1i,jnM_{n}=(\xi_{ij})_{1\leq i,j\leq n}, in which the ξij\xi_{ij} for 1ijn1\leq i\leq j\leq n are jointly independent with ξji=ξij\xi_{ji}=\overline{\xi_{ij}} (in particular, the ξii\xi_{ii} are real-valued). For 1i<jn1\leq i<j\leq n, we require that the ξij\xi_{ij} have mean zero and variance one, while for 1i=jn1\leq i=j\leq n we require that the ξij\xi_{ij} (which are necessarily real) have mean zero and variance σ2\sigma^{2} for some σ2>0\sigma^{2}>0 independent of i,j,ni,j,n. For simplicity, we will also assume that for each 1i<jn1\leq i<j\leq n, the real and imaginary parts Reξij{\operatorname{Re}}\xi_{ij}, Imξij{\operatorname{Im}}\xi_{ij} are independent. We refer to the distributions Reξij{\operatorname{Re}}\xi_{ij}, Imξij{\operatorname{Im}}\xi_{ij} for 1i<jn1\leq i<j\leq n and ξii\xi_{ii} for 1in1\leq i\leq n as the atom distributions of MnM_{n}, and view them as fixed while nn goes off to infinity.

We say that the Wigner matrix ensemble obeys Condition C0 if we have the exponential decay condition

for all 1i,jn1\leq i,j\leq n and tCt\geq C^{\prime}, and some constants C,CC,C^{\prime} (independent of i,j,ni,j,n).

Two Wigner matrices Mn=(ξij)1i,jnM_{n}=(\xi_{ij})_{1\leq i,j\leq n} and Mn=(ξij)1i,jnM^{\prime}_{n}=(\xi^{\prime}_{ij})_{1\leq i,j\leq n} are said to have matching moments to order mm for some m0m\geq 0 if one has

for all 1i,jn1\leq i,j\leq n and all natural numbers k,l0k,l\geq 0 with k+lmk+l\leq m. As we are assuming the real and imaginary parts to be independent, this condition simplifies to the conditions

for all 1i,jn1\leq i,j\leq n and all 0km0\leq k\leq m. If we only require (2) or (3) to hold in the off-diagonal case iji\neq j (resp. in the diagonal case i=ji=j), we say that MnM_{n} and MnM^{\prime}_{n} match moments to order mm off the diagonal (resp. on the diagonal).

We observe four basic examples of Wigner matrices:

In the symmetric Bernoulli ensemble, ξij\xi_{ij} equals +1+1 with probability 1/21/2 and 1-1 with probability 1/21/2 for all 1i,jn1\leq i,j\leq n, and σ2=1\sigma^{2}=1.

In the complex Hermitian Bernoulli ensemble, Reξij,Imξij{\operatorname{Re}}\xi_{ij},{\operatorname{Im}}\xi_{ij} for 1i<jn1\leq i<j\leq n and ξii\xi_{ii} for 1in1\leq i\leq n all equal +1+1 with probability 1/21/2 and 1-1 with probability 1/21/2, and σ2=1\sigma^{2}=1.

Note that we do not require the off-diagonal ξij\xi_{ij}, 1i<jn1\leq i<j\leq n (or the diagonal ξii\xi_{ii}, 1in1\leq i\leq n) to be identically distributed. This lack of an identical distribution hypothesis will be convenient when we apply the Lindeberg exchange strategy , in which one Wigner matrix is compared to another one by exchanging the entries of the former matrix with the latter oneMore precisely, we exchange the diagonal entries one at a time, and the off-diagonal entries two at a time, in order to preserve the Hermitian property throughout. at a time. As such, the intermediate stages of this exchange process need not have identically distributed entries, even if the initial and final matrices do.

The hypothesis of independence of real and imaginary parts is imposed purely to simplify the exposition, and can easily be removed at the cost of some more complicated notation; in particular, the simpler moment matching condition (3) must be replaced by the more complicated condition (2). See Remark 23.

In this paper, we will mostly deal with the (coarse-scale) normalization Wn:=1nMnW_{n}:=\frac{1}{\sqrt{n}}M_{n} of MnM_{n} of the Wigner matrix, and more specifically with the eigenvalue counting function

Let MnM_{n} be a Wigner Hermitian matrix obeying Condition C0. Then for any fixed interval II (independent of nn), one has

See for instance for a proof of this theorem and for historical background. Condition C0 can be omitted from this law, but we retain the hypothesis as it will be needed for the subsequent results discussed below.

If we use o(x)o(x) to denote a quantity that goes to zero as nn\to\infty after dividing by xx, we can reformulate Theorem 3 as the assertion that the asymptotic

holds with probability 1o(1)1-o(1) for each fixed II.

One can also phrase the semicircular law in terms of the individual eigenvalues λi(Wn)\lambda_{i}(W_{n}). If for each 1in1\leq i\leq n we define the classical location γi\gamma_{i} of the normalised ithi^{\operatorname{th}} eigenvalue by the formula

then the Wigner semicircular law (combined with an almost sure bound of (2+o(1))n(2+o(1))\sqrt{n} for the operator norm of MnM_{n}, due to Bai and Yin ) is equivalent to the assertion that one has

for any given 1in1\leq i\leq n, with probability 1o(1)1-o(1).

In this paper we investigate sharper versions of the semicircular law (known in the literature as local semicircular laws), which improve upon the error terms and failure probabilities in (4) and (6), and in which the interval II is now allowed to depend on nn.

We first discuss the case of the Gaussian Unitary Ensemble (GUE), which is the most well-understood case, as the joint distribution of the eigenvalues is given by a determinantal point process. Because of this, it is known that for any interval II, the random variable NI(Wn)N_{I}(W_{n}) in the GUE case obeys a law of the form

where the ηi=ηi,n,I\eta_{i}=\eta_{i,n,I} are jointly independent indicator random variables (i.e. they take values in {0,1}\{0,1\}); see e.g. [3, Corollary 4.2.24]. The mean and variance of NI(Wn)N_{I}(W_{n}) can also be computed in the GUE case with a high degree of accuracy:

Let MnM_{n} be drawn from GUE, let Wn:=1nMnW_{n}:=\frac{1}{\sqrt{n}}M_{n}, and let I=[,x]I=[-\infty,x] for some real number xx (which may depend on nn). Let ε>0{\varepsilon}>0 be independent of nn.

(Bulk case) If x[2+ε,2ε]x\in[-2+{\varepsilon},2-{\varepsilon}], then

(Variance bound) If one has x[2,2ε]x\in[-2,2-{\varepsilon}] and n2/3(2+x)n^{2/3}(2+x)\to\infty as nn\to\infty, one has

In particular, one has VarNI(Wn)=O(logn)\mathbf{Var}N_{I}(W_{n})=O(\log n) in this regime.

Here of course we use X=O(Y)X=O(Y), XYX\ll Y or YXY\gg X to denote the estimate XCY|X|\leq CY for some quantity CC independent of nn. We will also use cc to denote various small positive constants c>0c>0 independent of nn (but possibly depending on the constants in Condition C0).

See [21, Lemmas 2.1, 2.2, 2.3]. Note that the normalization conventions in differ by a factor of 2\sqrt{2} from the ones used hereThere is a slight inaccuracy in the statement of [21, Lemma 2.2], namely that the main term of 423πn(1t)3/2\frac{4\sqrt{2}}{3\pi}n(1-t)^{3/2} in that lemma should be replaced with the more accurate main term 2nπt11x2 dx\frac{2n}{\pi}\int_{t}^{1}\sqrt{1-x^{2}}\ dx (which is what actually comes out of the proof of [21, Lemma 2.2]). These two main terms differ by O(1)O(1) in the regime t=1O(n2/5)t=1-O(n^{-2/5}) as can be seen from a Taylor expansion, but they differ by more than O(1)O(1) outside of this regime.. ∎

By combining these estimates with a well-known inequality of Bennett , we obtain a concentration estimate for NI(Wn)N_{I}(W_{n}) in the GUE case:

Let MnM_{n} be drawn from GUE, let Wn:=1nMnW_{n}:=\frac{1}{\sqrt{n}}M_{n}, and let II be an interval. Then one has

By the triangle inequality we may take I=[,x]I=[-\infty,x] for some real number xx. As ρsc\rho_{\operatorname{sc}} is supported on $andhastotalmassand has total mass1,wesee(usingthetrivialbounds, we see (using the trivial bounds0\leq N_{I}(W_{n})\leq nandandN_{I}(W_{n})\leq N_{J}(W_{n})wheneverwheneverI\subset J)thatwithoutlossofgeneralitywemayassume) that without loss of generality we may assumex\in.By(7)andTheorem4,. By (7) and Theorem 4,N_{I}(W_{n})isthenthesumofindependentindicatorfunctions,andthemeanis then the sum of independent indicator functions, and the mean\muandvarianceand variance\sigma^{2}$ of this sum is given by

and σ2=O(logn)\sigma^{2}=O(\log n) respectively. Bennett’s inequality (see , or [23, p.29]) then asserts that

where ϕ(x):=(1+x)log(1+x)x\phi(x):=(1+x)\log(1+x)-x. Since ϕ(x)x\phi(x)\gg x when x1x\gg 1, the claim followsIndeed, this argument shows a slightly better bound than exp(cT)\exp(-cT). One can also use Bernstein’s inequality to also obtain the exp(cT)\exp(-cT) bound if desired.. ∎

Let us say that an event holds with overwhelming probability if it occurs with probability 1O(nA)1-O(n^{-A}) for each fixed AA. From the above corollary we see in particular that in the GUE case, one has

with overwhelming probability for each fixed II, and an easy union bound argument (ranging over all intervals II in, say, $whoseendpointsareamultipleofwhose endpoints are a multiple ofn^{-100}(say))thenshowsthatthisisalsotrueuniformlyin(say)) then shows that this is also true uniformly inI$ as well.

By using a general result of Costin and Lebowitz , one can also obtain a central limit theorem for NI(Wn)N_{I}(W_{n}) as long as II is not too small; see . Such results have also been recently been extended to more general Wigner matrices in . However, such theorems will not be the focus of the current paper.

Now we turn from the GUE case to more general Wigner ensembles. There has been much interest in recent years in obtaining concentration results for NI(Wn)N_{I}(W_{n}) (and for closely related objects, such as the Stieltjes transform sWn(z):=1ntrace(Wnz)1s_{W_{n}}(z):=\frac{1}{n}\operatorname{trace}(W_{n}-z)^{-1} of WnW_{n}) for short intervals II, due to the applicability of such results to establishing various universality properties of such matrices; see . The previous best result in this direction was by Erdős, Yau, and Yin (see also for a variant):

Let MnM_{n} be a Wigner matrix obeying Condition C0, and let Wn:=1nMnW_{n}:=\frac{1}{\sqrt{n}}M_{n}. Then, for any interval II, one has

for all TlogAloglognnT\geq\log^{A\log\log n}n, and some constant A>0A>0.

One can reformulate (8) equivalently as the assertion that

In particular, this theorem asserts that with overwhelming probability one has

for all intervals II. The proof of the above theorem is somewhat lengthy, requiring a delicate analysis of the self-consistent equation of the Stieltjes transform of WnW_{n}. A recent preprint of Götze and Tikhomirov has claimedAt the current time of writing, the preprint is being revised to address some gaps in the proofs of some lemmas in that paper, specifically Lemmas 5.2 and 5.3 from (private communication). an improvement to this result, namely that

with probability 1O(exp(clogn(loglogn)α))1-O(\exp(-c\log n(\log\log n)^{\alpha})) for certain explicit exponents C,αC,\alpha. This claim would imply as a consequence that for any interval II, NI(Wn)N_{I}(W_{n}) has variance O(logO(1)n)O(\log^{O(1)}n).

Comparing Theorem 7 with the previous results for the GUE case, we see that there is a loss of a double logarithm loglogn\log\log n in the exponent. The first main result of this paperWe would like to thank M. Ledoux for a private conversation that led to this question. is to remove this double logarithmic loss, at least under an additional vanishing moment assumption:

Let MnM_{n} be a Wigner matrix obeying Condition C0, and let Wn:=1nMnW_{n}:=\frac{1}{\sqrt{n}}M_{n}. Assume that MnM_{n} matches moments with GUE to third order off the diagonal (i.e. Reξij,Imξij{\operatorname{Re}}\xi_{ij},{\operatorname{Im}}\xi_{ij} have variance 1/21/2 and third moment zero). Then, for any interval II, one has

This estimate is phrased for any TT, but the bound only becomes non-trivial when TlogCnT\gg\log^{C}n for some sufficiently large CC. In that regime, we see that this result removes the double-logarithmic factor from Theorem 7; it is also comparable to the result (9) from when T=logO(1)nT=\log^{O(1)}n (though not with as sharp a set of exponents as , and one also needs an additional moment matching hypothesis), but gives additional large deviation bounds when TT is much larger than logO(1)n\log^{O(1)}n.

As we are assuming Re(ξij){\operatorname{Re}}(\xi_{ij}) and Im(ξij){\operatorname{Im}}(\xi_{ij}) to be independent, the moment matching condition simplifies to the constraints that ERe(ξij)2=EIm(ξij)2=12{\mathbf{E}}{\operatorname{Re}}(\xi_{ij})^{2}={\mathbf{E}}{\operatorname{Im}}(\xi_{ij})^{2}=\frac{1}{2} and ERe(ξij)3=EIm(ξij)3=0{\mathbf{E}}{\operatorname{Re}}(\xi_{ij})^{3}={\mathbf{E}}{\operatorname{Im}}(\xi_{ij})^{3}=0. However, it is possible to extend this theorem to the case when the real and imaginary parts of ξij\xi_{ij} are not independent; see Remark 23.

The constant cc in the bound in Theorem 8 is quite decent in several cases. For instance, if the atom variables of MnM_{n} are Bernoulli or have sub-gaussian tail, then we can set c=2/5o(1)c=2/5-o(1) by optimizing our arguments (details omitted). If we assume 4 matching moments rather than 3, then we can set c=1c=1 (see Remark 26), matching the bound in Corollary 5. It is an interesting question to determine the best value of cc. The value of cc in is implicit and rather small.

We prove Theorem 8 in Sections 2-4. Our argument differs from that in in that it only uses a relatively crude analysis of the self-consistent equation to obtain some preliminary bounds on the Stieltjes transform and on NIN_{I} (which were also essentially implicit in previous literature). Instead, the bulk of the argument relies on using the Lindeberg swapping strategy to deduce concentration of NI(Wn)N_{I}(W_{n}) in the non-GUE case from the concentration results in the GUE case provided by Corollary 5. In order to keep the error terms in this swapping under control, three matching momentsCompare with the “four moment theorem” from . We need one less moment here because we are working at “mesoscopic” scales (in which the number of eigenvalues involved is much larger than 11) rather than at “microscopic” scales. However, in Theorem 14 below, only one eigenvalue is involved, making the problem microscopic enough to require four moments instead of three. are needed.

Very roughly speaking, the main idea of the argument is to show that high moments such as

are quite stable (in a multiplicative sense) if one swaps (the real or imaginary part of) one of the entries of WnW_{n} (and its adjoint) with another random variable that matches the moments of the original entry to third order. For technical reasons, however, we do not quite manipulate NI(Wn)N_{I}(W_{n}) directly, but instead work with a proxy for this quantity, namely a certain integral of the Stieltjes transform of WnW_{n}. As observed in , the Lindeberg swapping argument is quite simple to implement at the level of the Stieltjes transform (due to the simplicity of the resolvent identities, when compared against the rather complicated Taylor expansions of individual eigenvalues used in ).

The result in Theorem 8 is well suited for controlling eigenvalues in the bulk of the spectrum, but is not sufficient by itself to control eigenvalues at the edge, and in particular the largest eigenvalue λ1(Wn)\lambda_{1}(W_{n}) and the smallest eigenvalue λn(Wn)\lambda_{n}(W_{n}). However, it is known that these eigenvalues are highly concentrated around +2+2 and 2-2 respectively. In the GUE case, we have the following concentration result of Aubrun :

Let MnM_{n} be drawn from GUE, let Wn:=1nMnW_{n}:=\frac{1}{\sqrt{n}}M_{n}. Then one has

As is well known, the random variable n2/3(λ1(Wn)2)n^{2/3}(\lambda_{1}(W_{n})-2) in fact converges in distribution to the Tracy-Widom law . However, we will not focus on this law here. The exponent 3/23/2 on the right-hand side cannot be improved (indeed, it matches the decay rate of the Tracy-Widom law); see for further discussion.

This result was partially extended to the Wigner case in :

Let MnM_{n} be a Wigner matrix obeying Condition C0, and let Wn:=1nMnW_{n}:=\frac{1}{\sqrt{n}}M_{n}. Then one has

for all TlogAloglognnT\geq\log^{A\log\log n}n, for some A>0A>0 independent of nn. By symmetry, one also has

As before, we can reformulate (10) equivalently as the assertion that

Our second main result is to remove the double logarithm from Theorem 13, at the cost of requiring matching GUE to fourth order rather than to third order:

Let MnM_{n} be a Wigner matrix obeying Condition C0, and let Wn:=1nMnW_{n}:=\frac{1}{\sqrt{n}}M_{n}. Assume that MnM_{n} matches moments with GUE to fourth order off the diagonal and second order on the diagonal (i.e. σ2=1\sigma^{2}=1). Then one has

for any T>0T>0. By symmetry, one then also has

We will derive Theorem 14 from Theorem 11 in Section 5 using the same techniques used to derive Theorem 8 from Corollary 5.

By combining Theorem 8 and Theorem 14 one can “solve” for individual eigenvalues λi(Wn)\lambda_{i}(W_{n}) to obtain an appropriate concentration (localization) result:

Let MnM_{n} be a Wigner matrix obeying Condition C0, and let Wn:=1nMnW_{n}:=\frac{1}{\sqrt{n}}M_{n}. Assume that MnM_{n} matches moments with GUE to fourth order off the diagonal and second order on the diagonal. Then for any 1in1\leq i\leq n, we have

If we assume only three matching moments, then the above estimate still holds provided that we have the additional hypothesis

for some fixed c>0c^{\prime}>0 (where the constant cc above is allowed to depend on cc^{\prime}).

The second part of this corollary significantly improves [30, Theorem 29]. (As a matter of fact, the original proof of this theorem has a gap in it; see [33, Appendix A] for a further discussion.)

First assume four matching moments. By Theorems 8, 14 and the union bound, we see that outside of an event of probability nO(1)exp(cTc)n^{O(1)}\exp(-cT^{c}), we have

for all intervals II, as well as the bounds

Some elementary estimation of the semicircular density ρsc\rho_{\operatorname{sc}} and its integrals Iρsc(y) dy\int_{I}\rho_{\operatorname{sc}}(y)\ dy (cf. [17, §5]) then gives

for all 1in1\leq i\leq n. The claim follows (possibly after adjusting TT by a multiplicative factor).

Now suppose we only have three matching moments. Then by Theorem 8 and the union bound, we may assume that

for all II. In particular (setting II equal to [2,+)[2,+\infty) or (,2](-\infty,-2]) this implies that 2λi(Wn)2-2\leq\lambda_{i}(W_{n})\leq 2 whenever min(i,n+1i)Tc\min(i,n+1-i)\geq T^{c^{\prime}}. One can then argue as before. ∎

The results in this paper also hold if one replaces the GUE ensemble by the GOE ensemble, in which case one considers real symmetric Wigner matrices instead of Hermitian Wigner matrices, with the off-diagonal ξij\xi_{ij} having mean zero, variance one, and third moment zero (if there are three matching moments) and fourth moment equal to 33 (if there are four matching moments). To do this, one needs to replace Theorem 4 and Theorem 11 by their GOE counterparts. The GOE version of Theorem 4 was established by O’Rourke . The GOE version of Theorem 11 follows from the results in . In principle, one might be able to use other ensembles (such as the gaussian divisible matrices ) to match moments with, which would allow one to remove the moment conditions almost entirely. We will not pursue these matters here.

We are indebted to the anonymous referees for several suggestions and corrections.

Reduction to the Stieltjes transform

We now begin the proof of Theorem 8. The first step is to replace the counting function NI=NI(Wn)N_{I}=N_{I}(W_{n}) with the Stieltjes transform sWns_{W_{n}}, defined by the formula

for any complex number zz with positive imaginary part. We can express this Stieltjes transform as a Riemann-Stieltjes integral

which gives a clear connection between the Stieltjes transform and the counting function; in the converse direction, we have the identity

whenever EE is not an eigenvalue of WnW_{n}, showing that (in principle at least) we can reconstruct the eigenvalue counting function from the Stieltjes transform.

Using the heuristic dN(,x)nρsc(x) dxdN_{(-\infty,x)}\approx n\rho_{\operatorname{sc}}(x)\ dx from (4), we thus expect from (14) to have sWnsscs_{W_{n}}\approx s_{\operatorname{sc}}, where

As is well known (see e.g. ), sscs_{\operatorname{sc}} can be evaluated explicitly via contour integrationFor instance, one can observe that 1πImssc(x±1ε)\frac{1}{\pi}{\operatorname{Im}}s_{\operatorname{sc}}(x\pm\sqrt{-1}{\varepsilon}) converges to ±ρsc(x)\pm\rho_{sc}(x) as ε0+{\varepsilon}\to 0^{+}, and then apply the Cauchy integral formula to sscs_{\operatorname{sc}} around the slit $$.

where z24\sqrt{z^{2}-4} is the branch of the square root that is asymptotic to zz at infinity. In particular, sscs_{\operatorname{sc}} exactly obeys the self-consistent equation

In the case of GUE, we may easily formalize this heuristic with the assistance of Corollary 5:

Let MnM_{n} be drawn from GUE, and Wn:=1nMnW_{n}:=\frac{1}{\sqrt{n}}M_{n}. Then for any T>0T>0 and any complex number z=E+1ηz=E+\sqrt{-1}\eta with η>0\eta>0, one has

We may assume that TlognT\gg\log n, as the claim is trivial otherwise. Let T1lognT_{1}\gg\log n be chosen later. From Corollary 5 and the union bound, we see that with probability 1O(nO(1)exp(cT1))1-O(n^{O(1)}\exp(-cT_{1})), one has

for all intervals II in $whoseendpointsaremultiplesofwhose endpoints are multiples ofn^{-100},andhenceforallintervals, and hence for all intervalsI$. In particular,

for all xx. On the other hand, from (14) and integration by parts, one has

The error term on the right-hand side evaluates to O(T1nη)O(\frac{T_{1}}{n\eta}). The claim then follows by choosing T1T_{1} to be a small multiple of TT. ∎

We will use this proposition to obtain a similar concentration result for Wigner matrices:

Let MnM_{n} be a Wigner matrix obeying Condition C0, and let Wn:=1nMnW_{n}:=\frac{1}{\sqrt{n}}M_{n}. Assume that MnM_{n} matches moments with GUE to third order off the diagonal. Then for any T>0T>0 and any complex number z=E+1ηz=E+\sqrt{-1}\eta with EE\in and 0<ηn1000<\eta\ll n^{100}, one has

We prove this theorem in later sections. Let us assume it for now, and use it to establish Theorem 8. The basic idea (which is standard in the Stieltjes transform approach to the local semicircle law) is to use a truncated form of (15). Let Mn,Wn,T,KM_{n},W_{n},T,K be as in the above theorem. By the triangle inequality, we may take I=(,E)I=(-\infty,E) for some real number EE; from the support of ρsc\rho_{\operatorname{sc}}, we may assume that EE\in. We may also take Tlog100nT\gg\log^{100}n (say), as the claim is trivial otherwise.

Let T1T/lognlog99nT_{1}\gg T/\log n\gg\log^{99}n be a quantity to be chosen later, and set η0:=T1/n\eta_{0}:=T_{1}/n. Applying Theorem 18 and the union bound, we see that outside of an event of probability at most

for all values of η\eta between η0\eta_{0} and n100n^{100} which are integer multiples of n1000n^{-1000}. On the other hand, in this range one easily verifies that the functions ηsWn(E+1η)\eta\mapsto s_{W_{n}}(E+\sqrt{-1}\eta) and ηssc(E+1η)\eta\mapsto s_{\operatorname{sc}}(E+\sqrt{-1}\eta) are Lipschitz with Lipschitz norm at most O(n200)O(n^{200}) (say). As a consequence, we conclude (after conditioning outside of the above exceptional event) that (19) holds for all η\eta between η0\eta_{0} and n100n^{100}.

By conditioning on another event of probability at most (18), we may assume that all entries of MnM_{n} are of size at most O(n)O(n) (say). Among other things, this implies that all eigenvalues λi(Wn)\lambda_{i}(W_{n}) are (very crudely) of size at most O(n20)O(n^{20}).

Since ηη0=T1/n\eta\geq\eta_{0}=T_{1}/n, we conclude from (19) and (16) that

We conclude thatOne could also have used Proposition 30 at this juncture.

for all ηη0\eta\geq\eta_{0} (note that this claim is trivial for ηn100\eta\geq n^{100}).

Next, if we integrate (19) and use the triangle inequality, we observe that

Let us now evaluate the left-hand side. From the definition of the Stieltjes transform, we may rewrite it as

where Arg{\operatorname{Arg}} is the standard branch of the argument on the upper half-plane.

Since EE\in and λi(Wn)=O(n20)\lambda_{i}(W_{n})=O(n^{20}), we have

(say). Also, from elementary trigonometry one has

We may therefore write the left-hand side of (21) as

(compare with (15)). On the other hand, from (20) and dyadic decomposition (recalling that λi(Wn)=O(n20)\lambda_{i}(W_{n})=O(n^{20})) one has

Choosing T1T_{1} to be a small multiple of T/lognT/\log n (and bounding T1cT_{1}^{c} from below by TcO(logn)T^{c^{\prime}}-O(\log n) for some sufficiently small c>0c^{\prime}>0), we obtain Theorem 8 as desired.

It remains to deduce Theorem 18 from Proposition 17. This will be the objective of the next few sections.

The moment method, and the Lindeberg strategy

Given a matrix Wn=1nMnW_{n}=\frac{1}{\sqrt{n}}M_{n} and a complex number z=E+1ηz=E+\sqrt{-1}\eta, define the quantity A(Wn)=A(Wn,z)A(W_{n})=A(W_{n},z) by the formula

This quantity describes the normalised deviation of the Stieltjes transform of WnW_{n} from the semicircular law at zz. In this notation, Proposition 17 becomes the assertion that

whenever T>0T>0, EE\in, and 0<ηn1000<\eta\ll n^{100}, when MnM_{n} is drawn from a Wigner matrix obeying Condition C0, and with Reξij{\operatorname{Re}}\xi_{ij} and Imξij{\operatorname{Im}}\xi_{ij} having variance 1/21/2 and third moment zero for 1i<jn1\leq i<j\leq n.

To deduce (23) from (22) we will use the moment method combined with the Lindeberg exchange strategy; more specifically, we will show that a high moment EA(Wn)k{\mathbf{E}}A(W_{n})^{k} for some large even number kk (which one should think of, in practice, as comparable to TT) is stable under the operation of replacing (the real or imaginary part of) one entry of MnM_{n} (and its transpose) with another entry with a number of matching moments. The Lindeberg exchange strategy is by now a standard tool in establishing universality properties for Wigner matrices , , ; the main novelty hereVery recently , a similar application of the Lindeberg exchange strategy to a high moment of a spectral statistic was used to establish some related concentration results. We thank Antti Knowles for bringing this preprint to our attention. is the application of that strategy to a high moment EA(Wn)k{\mathbf{E}}A(W_{n})^{k} (as opposed to a quantity such as EG(A(Wn)){\mathbf{E}}G(A(W_{n})) for some smooth test function GG).

Let us now make the strategy more precise. Let us call two Wigner matrices Mn,MnM_{n},M^{\prime}_{n} real-adjacent, or adjacent for short, if their respective atom variables ξij,ξij\xi_{ij},\xi^{\prime}_{ij} are equal except for a single choice of (i,j)=(a,b)(i,j)=(a,b) and its transpose (i,j)=(b,a)(i,j)=(b,a), and such that ξab,ξab\xi_{ab},\xi^{\prime}_{ab} either have identical real parts, or identical imaginary parts. Thus, a Wigner matrix MnM^{\prime}_{n} adjacent to MnM_{n} is formed by changing the real or imaginary part of a single entry of MnM_{n} and its adjoint, leaving the other components of MnM_{n} unchanged. The main technical step is then to establish the following proposition.

Let Mn,MnM_{n},M^{\prime}_{n} be two adjacent Wigner matrices obeying Condition C0, whose moments match to order mm for some fixed m=O(1)m=O(1). Let z=E+1ηz=E+\sqrt{-1}\eta for some EE\in and 0<ηn1000<\eta\ll n^{100}, and set Wn:=1nMnW_{n}:=\frac{1}{\sqrt{n}}M_{n} and Wn:=1nMnW^{\prime}_{n}:=\frac{1}{\sqrt{n}}M^{\prime}_{n}. Then for any even integer klognk\geq\log n, one has

Let us assume this proposition for now and establish Theorem 18. Let n,Mn,Wn,E,η,z,Tn,M_{n},W_{n},E,\eta,z,T be as in that theorem. We may assume that TlogC0nT\geq\log^{C_{0}}n (say) for some sufficiently large absolute constant C0C_{0}, as the claim is trivial otherwise; we may also assume that TηnT\leq\eta n, since the claim follows from existing local semicircle laws (in particular, Corollary 32). In particular, we may now assume that TnO(1)T\leq n^{O(1)} and ηlogC0n/n\eta\geq\log^{C_{0}}n/n. Our task is now to show that

On the other hand, if MnM^{\prime}_{n} is drawn from GUE and Wn:=1nMnW^{\prime}_{n}:=\frac{1}{\sqrt{n}}M^{\prime}_{n}, then from Proposition 17 one has

for all T>0T>0. In particular, for any klognk\geq\log n, one has

We can replace MnM^{\prime}_{n} with MnM_{n} in a sequence of n2n^{2} exchanges from one Wigner matrix to a real-adjacent one; n2nn^{2}-n of these exchanges arise by swapping the real or imaginary part of an off-diagonal entry ξij\xi_{ij} of MnM^{\prime}_{n} (and its transpose ξji\xi_{ji}) with the corresponding component of MnM_{n}, and nn of these exchanges arise by swapping a diagonal entry ξii\xi_{ii} of MnM^{\prime}_{n} with the corresponding entry of MnM_{n}. We perform these exchanges in an arbitrary order. By hypothesis, for the n2nn^{2}-n off-diagonal exchanges one has matching moments to order m=3m=3, while for the diagonal exchanges one has matching moments to order m=1m=1. Let Mn=Mn0,Mn1,,Mnn2=MnM_{n}=M^{0}_{n},M^{1}_{n},\ldots,M^{n^{2}}_{n}=M^{\prime}_{n} denote the sequence of exchanges from MnM_{n} to Mnn2M^{n^{2}}_{n}, and let Wn0,,Wnn2W^{0}_{n},\ldots,W^{n^{2}}_{n} be the associated rescaled Wigner matrices. By Proposition 19 one has

for 0a<n20\leq a<n^{2}, where mam_{a} is equal to 33 for n2nn^{2}-n choices of aa and equal to 11 for nn choices of aa. Concatenating these bounds, we conclude that for any klognk\geq\log n one has

If we set kk to be the largest even integer less than Tc0T^{c_{0}} for some absolute constant c0c_{0}, and if C0C_{0} is sufficiently large depending on c0c_{0}, we obtain (25) as desired, thanks to the assumptions logC0nTnη\log^{C_{0}}n\leq T\leq n\eta.

An inspection of the above argument reveals that we in fact have the slight refinement

in the regime T(nη)cT\leq(n\eta)^{c}, since in this regime we may take kk to be a small multiple of TT (rounded off to the nearest even integer, of course). Unfortunately, this refinement does not appear to immediately offer any significant improvement to the conclusion of Theorem 8.

It remains to establish Proposition 19. This will be achieved in the next section.

Stability of high moments

We now prove Proposition 19. We introduce a definition:

An elementary matrix is a matrix which has one of the following forms

As Mn,MnM_{n},M^{\prime}_{n} are real-adjacent, one can write

for some elementary matrix VV, some random matrix Mn0M_{n}^{0}, and some real random variables ξ,ξ\xi,\xi^{\prime} independent of Mn0M_{n}^{0} that match moments to mthm^{\operatorname{th}} order and obey the exponential decay condition

for all tCt\geq C^{\prime} and some C,C>0C,C^{\prime}>0.

We now recall some (deterministic) resolvent stability results concerning matrices of the form Mn0+tVM_{n}^{0}+tV. Define the matrix norm R(,1)\|R\|_{(\infty,1)} of a n×nn\times n matrix R=(Rij)1i,j1R=(R_{ij})_{1\leq i,j\leq 1} by the formula

Let Mn0M_{n}^{0} be a Hermitian matrix, let VV be an elementary matrix, and let tt be a real number. Let z:=E+1ηz:=E+\sqrt{-1}\eta be a complex number with η>0\eta>0. Write

Furthermore, if we set st:=1ntraceRts_{t}:=\frac{1}{n}\operatorname{trace}R_{t}, then we have the Taylor expansion

for any fixed nonnegative m=O(1)m=O(1), where the coefficients cjc_{j} are independent of tt and obey the bounds

See [32, Lemma 12] and [32, Proposition 13]. ∎

Our objective is to establish (24). From Corollary 33 we see that

with probability 1O(nO(1)exp((nη)c)1-O(n^{O(1)}\exp(-(n\eta)^{c}), while from (28) we certainly have ξ=o(n)\xi=o(\sqrt{n}) with 1O(nO(1)exp((nη)c)1-O(n^{O(1)}\exp(-(n\eta)^{c}). Hence by the first conclusion of Proposition 22 (with Mn0M_{n}^{0} and VV replaced with Mn0+ξVM_{n}^{0}+\xi V, and setting tt equal to ξ-\xi) we have

with probability 1O(nO(1)exp((nη)c))1-O(n^{O(1)}\exp(-(n\eta)^{c})). Using the crude bound A(Wn)=O(nO(1))A(W_{n})=O(n^{O(1)}), we may thus condition Mn0M_{n}^{0} to be fixed and obeying (30), since the contribution of the event where (30) fails to EA(Wn)k{\mathbf{E}}A(W_{n})^{k} is O(nO(k)exp((nη)c))O(n^{O(k)}\exp(-(n\eta)^{c})).

By Proposition 22, we thus see that whenever ξ=o(n)\xi=o(\sqrt{n}), one has

where the coefficients A0,ajA_{0},a_{j} are deterministic (and in particular independent of ξ,ξ\xi,\xi^{\prime}, though they can depend on η,n\eta,n), and aja_{j} obeys the bound aj=O(1)a_{j}=O(1).

Suppose first that A0k|A_{0}|\leq k. Then one has

whenever ξ=o(n)\xi=o(\sqrt{n}), which gives a net contribution of O(k)kO(k)^{k} to EA(Wn)k{\mathbf{E}}|A(W_{n})|^{k}; meanwhile, from (28), the case when ξn\xi\gg\sqrt{n} contributes at most O(nO(k)exp((nη)c))O(n^{O(k)}\exp(-(n\eta)^{c})). Thus we may assume that A0>k|A_{0}|>k. Thus we have

for some deterministic coefficients b1,,bm=O(1)b_{1},\ldots,b_{m}=O(1), and assuming that ξ=o(n)\xi=o(\sqrt{n}). Raising this to the kthk^{\operatorname{th}} power (after using Taylor’s theorem with remainder to expand (1+1kx)k(1+\frac{1}{k}x)^{k} to mthm^{\operatorname{th}} order in the regime x=o(1)x=o(1)), we conclude that

for some deterministic coefficients d1,,dm=O(1)d_{1},\ldots,d_{m}=O(1) (which are allowed to depend on kk), whenever ξ=o(n)\xi=o(\sqrt{n}). Taking (conditional) expectations in ξ\xi (using (28) and the trivial bound A(Wn)=O(nO(1))A(W_{n})=O(n^{O(1)}) to handle the tail event when ξn|\xi|\gg\sqrt{n}) we conclude that

Since ξ\xi and ξ\xi^{\prime} match to order kk, we obtain the claim. This concludes the proof of Proposition 19 and hence Theorem 8.

It is possible to adapt the above arguments to the case when Reξij{\operatorname{Re}}\xi_{ij} and Imξij{\operatorname{Im}}\xi_{ij} are not assumed to be independent. The main new difficulty is that instead of swapping the real and imaginary parts of a single entry ξab\xi_{ab} of MnM_{n} (and its transpose ξba\xi_{ba}) separately, one has to swap them together. This requires one to consider perturbations of the form

where V1,V2V_{1},V_{2} are two distinct elementary random variables, and ξ1,ξ2\xi_{1},\xi_{2} are real random variables that are not necessarily independent and obeying the exponential decay hypothesis (28). However, it is possible to extend Proposition 22 without much difficulty to the case of two-parameter perturbations and perform a similar argument to that given above. We omit the details.

Extreme eigenvalues

We now prove Theorem 14, by combining the arguments in previous sections with some ideas from (and in particular, demonstrating a concentration of ImsWn(E+1η){\operatorname{Im}}s_{W_{n}}(E+\sqrt{-1}\eta) that is better than 1/nη1/n\eta for some energy E>2E>2). By symmetry, it suffices to prove the bound for λ1(Wn)\lambda_{1}(W_{n}). We may of course assume that nn is large.

By standard large deviation estimates, one has

for any E3E\geq 3; seeOne could also use the earlier estimates in or ; see also for more discussion. [16, Lemma 7.2]. This already deals with the case when n2/3Tn100n^{2/3}\leq T\leq n^{100} (say), and the case T>n100T>n^{100} can be handled by crudely bounding λ1(Wn)\lambda_{1}(W_{n}) by, say, the Frobenius norm of WnW_{n} and using Condition C0. Thus we may restrict attention to the regime Tn2/3T\leq n^{2/3}, and show that

We may assume that TlogC0nT\geq\log^{C_{0}}n for some suitably large absolute constant C0C_{0}, as the claim is trivial otherwise.

Suppose that λ1(Wn)\lambda_{1}(W_{n}) was in the interval [2+n2/3T,3][2+n^{-2/3}T,3]. Set η:=n2/3\eta:=n^{-2/3}, and let B(Wn)B(W_{n}) denote the quantity

where EE is the closest multiple of n2/3n^{-2/3} in [2+n2/3T,3][2+n^{-2/3}T,3] to λ1(Wn)\lambda_{1}(W_{n}). Thus, by the union bound, it will suffice to show that

Let MnM^{\prime}_{n} be drawn from GUE, and set Wn:=1nMnW^{\prime}_{n}:=\frac{1}{\sqrt{n}}M^{\prime}_{n}. By Theorem 11, we have

outside of an event of probability O(exp(cT3/2))O(\exp(-cT^{3/2})); in particular, we have

Also, from Corollary 5 and the union bound we see that outside of an event of probability O(nO(1)exp(cT))O(n^{O(1)}\exp(-cT)), one has

(say) for all intervals II. In particular, outside of this event, we have

for all k1k\geq 1, using the bound ρsc(y)=O((2y)1/2)\rho_{sc}(y)=O((2-y)^{1/2}) when y<2y<2.

From (34), (35), (32), and dyadic decomposition one easily establishes that

outside of an event of probability O(nO(1)exp(cTc))O(n^{O(1)}\exp(-cT^{c})).

Let lognkn0.01\log n\leq k\leq n^{0.01} be an integer to be chosen later. Since we may trivially bound ImsWn(E+1η){\operatorname{Im}}s_{W_{n}}(E+\sqrt{-1}\eta) by nO(1)n^{O(1)}, we conclude that

We claim the following stability result for EB(Wn)k{\mathbf{E}}B(W_{n})^{k}, analogous to Proposition 19:

Let Mn,MnM_{n},M^{\prime}_{n} be two adjacent Wigner matrices obeying Condition C0, whose moments match to order mm for some fixed m=O(1)m=O(1). Set Wn:=1nMnW_{n}:=\frac{1}{\sqrt{n}}M_{n} and Wn:=1nMnW^{\prime}_{n}:=\frac{1}{\sqrt{n}}M_{n}. Then for any integer lognkn0.1\log n\leq k\leq n^{0.1}, one has

Applying this proposition n2nn^{2}-n times with m=4m=4 and nn times with m=2m=2 we conclude that

and thus (using (36) and the hypothesis kn0.01k\leq n^{0.01})

The desired claim (33) then follows from Markov’s inequality by taking k=Tc0k=T^{c_{0}} for some sufficiently small c0>0c_{0}>0 (and assuming C0C_{0} sufficiently large depending on c0>0c_{0}>0).

It remains to establish Proposition 24. As in the previous section, we write

for some elementary matrix VV, some random matrix Mn0M_{n}^{0}, and some real random variables ξ,ξ\xi,\xi^{\prime} independent of Mn0M_{n}^{0} that match moments to mthm^{\operatorname{th}} order and obey the exponential decay condition (28). Arguing exactly as before, we may condition Mn0M_{n}^{0} to be a deterministic matrix for which

Using Proposition 22 as before, we see that

for some deterministic coefficients B0B_{0} and aj=O(1)a_{j}=O(1), whenever ξ=o(n)\xi=o(\sqrt{n}).

Suppose first that B01/200|B_{0}|\leq 1/200. Then one has B(Wn)1/100|B(W_{n})|\leq 1/100 whenever ξ=o(n)\xi=o(\sqrt{n}), and so this case contributes O(100k)+O(nO(k)exp(cnc))O(100^{-k})+O(n^{O(k)}\exp(-cn^{c})) to (37), which is acceptable. Thus we may restrict attention to the case when B0>1/200|B_{0}|>1/200. Then we may write

whenever ξ=o(n)\xi=o(\sqrt{n}), where the bj=O(1)b_{j}=O(1) are deterministic coefficients.

Suppose now that ξ=O(n0.3)\xi=O(n^{0.3}). Since kn0.01k\leq n^{0.01}, we may perform a Taylor expansion of (1+x)k(1+x)^{k} to order mm for x=O(n0.2)x=O(n^{-0.2}) and conclude that

in this regime, where the cj=O(1)c_{j}=O(1) are deterministic coefficients (which are allowed to depend in kk). Taking expectations as in the preceding section, and using (28) to handle those ξ\xi with ξn0.3|\xi|\geq n^{0.3}, we conclude that

and similarly for EB(Wn)k{\mathbf{E}}B(W^{\prime}_{n})^{k}; and the claim follows from the matching moments hypothesis.

As in Remark 23, it is possible to extend these arguments to the case when Re(ξij){\operatorname{Re}}(\xi_{ij}) and Im(ξij){\operatorname{Im}}(\xi_{ij}) are not independent; we leave the details to the interested reader.

Note that when one has four matching moments rather than three, the error terms are more favorable by a factor of n\sqrt{n}, giving some additional room to vary the parameters of the argument by small powers of nn. Because of this, it is possible to modify the proof of Theorem 18 to conclude in this case that

in the regime 0<Tnc0<T\leq n^{c} for a sufficiently small cc. This is achieved by arguing as in this section, except that one allows the resolvent R0(,1)\|R_{0}\|_{(\infty,1)} to be as large as O(nc)O(n^{c}) rather than O(1)O(1) in order to keep the failure probability bounded by O(nO(1)exp(nc))O(n^{O(1)}\exp(-n^{c})) rather than O(nO(1)exp((nη)c))O(n^{O(1)}\exp(-(n\eta)^{c})). We omit the details. As a consequence, we can sharpen the conclusion of Theorem 8 to

when 0<Tnc0<T\leq n^{c} and MnM_{n} matches moments with GUE to fourth order off the diagonal and second order on the diagonal.

Appendix A Local semicircle law

In this appendix we establish some preliminary local semicircle law estimates, following the treatment in and . As the methods used here are now standard, and the results very close to those in and , we shall be somewhat brief in our treatment.

We first recall a concentration estimate of Hanson and Wright .

for all tCt\geq C^{\prime} and 1in1\leq i\leq n, and some C,C>0C,C^{\prime}>0 independent of nn. Let AA be an n×nn\times n matrix. Then for any T>0T>0, one has

outside of an event of probability O(exp(cTc))O(\exp(-cT^{c})).

See [16, Lemma B.1]. (Note that a factor of σ\sigma is missing from the statement of the exponential decay hypothesis in the lemma as stated in , which is needed in order to reduce to the σ=1\sigma=1 case.) ∎

outside of an event of probability O(exp(cdc))O(\exp(-cd^{c})).

Apply the preceding proposition with A:=πVA:=\pi_{V} (so traceA=traceAA=d\operatorname{trace}A=\operatorname{trace}A^{*}A=d) and T:=d1/2/10T:=d^{1/2}/10. ∎

We can also use Talagrand’s inequality as in , combining with a truncation argument (to bound each entries by some properly chosen quantity KK). In the case when the atom variables have very fast decay (such as sub-gaussian) or bounded (such as Bernoulli), this calculation will actually lead to a decent bound on the value of cc in Theorem 8.

We can now establish a crude upper bound on the counting function NIN_{I} of a Wigner matrix.

Let MnM_{n} be a Wigner matrix obeying Condition C0, and let Wn:=1nMnW_{n}:=\frac{1}{\sqrt{n}}M_{n}. Then for any interval II, one has

outside of an event of probability O(nO(1)exp(c(nI)c))O(n^{O(1)}\exp(-c(n|I|)^{c})).

Fix II, which we write as I=[Eη,E+η]I=[E-\eta,E+\eta]. Suppose that

for some sufficiently large absolute constant CC to be chosen later. We will show that this leads to a contradiction outside of an event of probability O(nO(1)exp(c(nη)c)O(n^{O(1)}\exp(-c(n\eta)^{c}).

On the other hand, we can write the Stieltjes transform sWns_{W_{n}} in terms of the coefficients RijR_{ij} of the resolvent as

Thus, by the pigeonhole principle, we have

for some 1in1\leq i\leq n. By symmetry (and conceding a factor of nn in the failure probability estimates) we may take i=ni=n.

Now, a standard Schur complement computation (see e.g. [30, Lemma 42]) shows that

where R(n)(z)=(Wn(n)z)1R^{(n)}(z)=(W_{n}^{(n)}-z)^{-1} is the resolvent corresponding to the n1×n1n-1\times n-1 matrix Wn(n)W_{n}^{(n)} formed by removing the nthn^{\operatorname{th}} row and column from WnW_{n}, ξnn\xi_{nn} is the bottom right entry of MnM_{n}, and XX is the rightmost column of WnW_{n} (after removing the bottom entry 1nξnn\frac{1}{\sqrt{n}}\xi_{nn}). In particular, using the trivial bound Im1z1Imz|{\operatorname{Im}}\frac{1}{z}|\leq\frac{1}{|{\operatorname{Im}}z|}, we conclude that

Now, by the Cauchy interlacing law, Wn(n)W_{n}^{(n)} has Cnη\gg Cn\eta consecutive eigenvalues in II. There are O(n2)O(n^{2}) possibilities for the starting and ending index of these eigenvalues. If we let VV be the space spanned by the corresponding eigenvectors, then dim(V)Cnη\dim(V)\gg Cn\eta, and from the spectral theorem we see that

outside of an event of probability O(exp(c(nη)c))O(\exp(-c(n\eta)^{c})). If CC is sufficiently large, the claim follows. ∎

This gives rise to a self-consistent equation:

Let MnM_{n} be a Wigner matrix obeying Condition C0, and let Wn:=1nMnW_{n}:=\frac{1}{\sqrt{n}}M_{n}. Then for any z=E+1ηz=E+\sqrt{-1}\eta with E=O(1)E=O(1) and 0<ηn1000<\eta\ll n^{100}, and all 1in1\leq i\leq n, one has

outside of an event of probability O(nO(1)exp(c(nη)c))O(n^{O(1)}\exp(-c(n\eta)^{c})). In particular, by the union bound, we have

outside of an event of probability O(nO(1)exp(c(nη)c))O(n^{O(1)}\exp(-c(n\eta)^{c})).

We can assume that nηlog100nn\eta\geq\log^{100}n (say), as the claim is trivial otherwise. By symmetry, it will suffice to establish

outside of an event of probability O(nO(1)exp(c(nη)c))O(n^{O(1)}\exp(-c(n\eta)^{c})). By (39), this statement is equivalent to

By Condition C0, one has 1nξnn=o(1)\frac{1}{\sqrt{n}}\xi_{nn}=o(1) outside of an event of probability O(exp(cnc))O(\exp(-cn^{c})), which is certainly acceptable; so our task is now to show that

outside of an event of probability O(nO(1)exp(c(nη)c))O(n^{O(1)}\exp(-c(n\eta)^{c})).

From the Cauchy interlacing law (cf. [30, §5.2]) we know that

By Proposition 30 and the union bound, we may assume outside of an event of probability O(nO(1)exp(c(nη)c))O(n^{O(1)}\exp(-c(n\eta)^{c})), one has

for all intervals II of width at least η\eta centered at EE. By interlacing, we may also conclude

for such intervals. Inserting this bound into (42), we conclude that

If we then apply Proposition 27 with T:=(nη)1/4T:=(n\eta)^{1/4} (say), using the hypothesis that nηlog100nn\eta\geq\log^{100}n (so that 1/(nη)c=o(1)1/(n\eta)^{c}=o(1) for any c>0c>0) we conclude (41) outside of an event of order O(nO(1)exp(c(nη)c))O(n^{O(1)}\exp(-c(n\eta)^{c})), as required. ∎

We can combine this proposition with a standard stability analysis of the self-consistent equation (40) to conclude a crude version of the local semicircle law:

Let MnM_{n} be a Wigner matrix obeying Condition C0, and let Wn:=1nMnW_{n}:=\frac{1}{\sqrt{n}}M_{n}. Then for any z=E+1ηz=E+\sqrt{-1}\eta with E=O(1)E=O(1) and 0<ηn1000<\eta\ll n^{100}, and all 1in1\leq i\leq n, one has

outside of an event with probability O(nO(1)exp(c(nη)c))O(n^{O(1)}\exp(-c(n\eta)^{c})).

We note that this corollary is essentially [17, Theorem 3.1]; in the statement of the result in the additional constraint ηlogCloglogn/n\eta\geq\log^{C\log\log n}/n for some constant CC is imposed, but this constraint is not actually used in the proof, at least if one is not concerned with obtaining the best possible bounds for the o(1)o(1) error terms. For the convenience of the reader, we sketch the proof of this corollary below.

As before we may assume that ηlog100n/n\eta\geq\log^{100}n/n; we may also assume that nn is large. By Proposition 31, we may assume that (40) holds.

Let us first dispose of the case when η\eta is large, say η100\eta\geq 100. In this case, the imaginary part of sWn(z)+z+o(1)s_{W_{n}}(z)+z+o(1) is at least 100o(1)100-o(1), and hence by (40) one has sWn(z)1/100+o(1)|s_{W_{n}}(z)|\leq 1/100+o(1); inserting this back into (40) (and using (16)) one obtains sWn(z)ssc(z)1/10|s_{W_{n}}(z)-s_{\operatorname{sc}}(z)|\leq 1/10 (say). One can then deduce (44) from (40) (and (17)) by a routine application of the contraction mapping theorem.

Henceforth we assume that η<100\eta<100, so that z=O(1)z=O(1). Then equation (40) already implies that sWn(z)=O(1)s_{W_{n}}(z)=O(1), since (40) cannot hold if sWn(z)|s_{W_{n}}(z)| is too large. We may thus multiply out the denominator and conclude that

Since the two solutions to the quadratic equation s2+zs+1=0s^{2}+zs+1=0 are s=ssc(z)s=s_{\operatorname{sc}}(z) and s=zssc(z)s=-z-s_{\operatorname{sc}}(z), we conclude that

outside of an event with probability O(nO(1)exp(c(nη)c))O(n^{O(1)}\exp(-c(n\eta)^{c})).

We apply this fact with zz replaced by an arbitrary complex numbers ζ\zeta with Re(ζ)=O(1){\operatorname{Re}}(\zeta)=O(1) and ηIm(ζ)1\eta\leq{\operatorname{Im}}(\zeta)\ll 1, and whose real and imaginary parts are multiples of n100n^{-100} (say). By the union bound, the probability of the failure event is still O(nO(1)exp(c(nη)c))O(n^{O(1)}\exp(-c(n\eta)^{c})). We may then remove the latter hypotheses using the fact that sWns_{W_{n}} and sscs_{\operatorname{sc}} have Lipschitz constant O(n)O(n) in this region, and conclude that outside of an event of probability O(nO(1)exp(c(nη)c))O(n^{O(1)}\exp(-c(n\eta)^{c})), one has

for all ζ\zeta with Re(ζ)=O(1){\operatorname{Re}}(\zeta)=O(1) and ηIm(ζ)1\eta\leq{\operatorname{Im}}(\zeta)\ll 1. On the other hand, if one has Im(ζ)c{\operatorname{Im}}(\zeta)\geq c for some absolute constant c>0c>0, then the second possibility in (46) cannot occur for nn large enough, because sWn(ζ)s_{W_{n}}(\zeta) necessarily has positive imaginary part. A continuity argument then shows that the first option in (46) holds for all ζ\zeta in the indicated regionWhen ζ\zeta approaches the edges ±2\pm 2 of the spectrum, thus ζ=±2+o(1)\zeta=\pm 2+o(1), the two options in (46) begin to overlap, but in that regime one can deduce the first option from the second (with a slightly worse o(1)o(1) error) and so the claim made in the text is still valid.. This gives (44). Among other things, this shows that sWn(z)+z1|s_{W_{n}}(z)+z|\gg 1 (thanks to (17)), and then from (17) and the second part of Proposition 31 we obtain (45). ∎

For our applications, we will also need bounds on the coefficient norm

Let MnM_{n} be a Wigner matrix obeying Condition C0, and let Wn:=1nMnW_{n}:=\frac{1}{\sqrt{n}}M_{n}. Then for any z=E+1ηz=E+\sqrt{-1}\eta with E=O(1)E=O(1) and 0<ηn1000<\eta\ll n^{100}, one has

outside of an event with probability O(nO(1)exp(c(nη)c))O(n^{O(1)}\exp(-c(n\eta)^{c})).

Again, we may assume η>log100n/n\eta>\log^{100}n/n. By the union bound, it suffices to show for each 1i,jn1\leq i,j\leq n that

outside of an event with probability O(nO(1)exp(c(nη)c))O(n^{O(1)}\exp(-c(n\eta)^{c})). In the diagonal case i=ji=j, this follows directly from (45), so suppose that iji\neq j. In this case, we may use the Schur complement identity

where R(i)(z)R^{(i)}(z) is the resolvent associated to the n1×n1n-1\times n-1 matrix Wn(i)W_{n}^{(i)} formed by removing the ithi^{\operatorname{th}} row and column from WnW_{n}, and Kij(ij)K_{ij}^{(ij)} is the quantity

outside of an event of probability O(nO(1)exp(c(nη)c))O(n^{O(1)}\exp(-c(n\eta)^{c})). But by Proposition 27 (viewing the n2×n2n-2\times n-2 matrix (Wn(ij)z)1(W_{n}^{(ij)}-z)^{-1} as the upper-right block of a nilpotent 2(n2)×2(n2)2(n-2)\times 2(n-2) matrix, and concatenating XiX_{i} and XjX_{j} together), one has

outside of an event of probability O(exp(cTc))O(\exp(-cT^{c})), for any T>0T>0. But by repeating the derivation of (43), one has

If one then sets T=O(nη)T=O(\sqrt{n\eta}), one obtains the claim. ∎

We remark that the above argument in fact shows that we may improve the bound R(z)ij=O(1)R(z)_{ij}=O(1) to R(z)ij=O(1(nη)1/2δ)R(z)_{ij}=O(\frac{1}{(n\eta)^{1/2-\delta}}) for any fixed δ>0\delta>0; compare with [17, Theorem 3.1]. However, this improvement is not used in this paper.

References