Bulk Universality for Wigner Matrices

Laszlo Erdos, Sandrine Peche, Jose A. Ramirez, Benjamin Schlein, Horng-Tzer Yau

Introduction

The fundamental reason why random matrices have been used to model many large systems is the belief that their local eigenvalue statistics are universal. This is generally referred to as the universality of random matrices. It is well known that the local behavior of eigenvalues near the spectral edge and in the bulk is governed by the Tracy-Widom law and by the Dyson sine kernel, respectively. Since the seminal work of Dyson for the Gaussian Unitary Ensemble (GUE), universality both at the edge and in the bulk has been proven for very general classes of unitary invariant ensembles in the past two decades. For non-unitary ensembles, the most natural examples are the Wigner matrix ensembles, i.e., random matrices with independent identically distributed entries. The edge universality for these ensembles was proved by Soshnikov using the moment method; the bulk universality remained unknown due to the lack of a method to analyze local spectral properties of large matrices inside the spectrum. For ensembles of the form

where $\widehat{H}$ is a Wigner matrix, $V$ is an independent standard GUE matrix and $a$ is a positive constant of order one (independent of $N$), the bulk universality was proved by Johansson. (Strictly speaking, the range of the parameter $a$ in Johansson's work depends on the energy $E$. This restriction was later removed by Ben Arous and Péché, who also extended this approach to Wishart ensembles.)

The approach of Johansson is partly based on the asymptotic analysis of an explicit formula by Brézin-Hikami for the correlation functions of the eigenvalues of $\widehat{H}+aV$. This matrix can also be generated by a stochastic flow

and the evolution of the eigenvalues is given by the Dyson Brownian motion. The result of Johansson thus states that the bulk universality holds for times of order one. The eigenvalue distribution of GUE is in fact the invariant measure of Dyson Brownian motion. (Rigorously speaking, the Brownian motion has to be replaced by an Ornstein-Uhlenbeck process, but we will neglect this subtlety.) It is thus tempting to derive the universality of $\widehat{H}+\sqrt{s}V$ via the convergence to equilibrium. We have recently carried out this approach; the key observation is that the sine kernel, as a property of local statistics, depends almost exclusively on the convergence to local equilibrium. With this method we have reduced the necessary time to $N^{-1+\xi}$ for any $\xi>1/4$. Note that the relaxation time to local equilibrium is $N^{-1}$; the additional exponent $\xi$ is due to technical reasons.

From stochastic calculus, one can see that the typical distance between the corresponding eigenvalues of $\widehat{H}+\sqrt{s}V$ and $\widehat{H}$ is of order $(s/N)^{1/2}$. Thus the bulk universality of $\widehat{H}$ would hold if we could prove the Dyson sine kernel for time $s\ll 1/N$. On the other hand, for time smaller than $1/N$, the eigenvalues do not move on the scale $1/N$ and the dynamical consideration seems to be pointless. In this paper, we provide an approach to address the comparison of eigenvalues between $\widehat{H}+\sqrt{s}V$ and $\widehat{H}$. To describe the idea, we now introduce the notation.
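The scale comparison in the preceding paragraph can be made concrete with a small numerical sketch. The parametrization $s=N^{-1+\xi}$ and the function name below are illustrative assumptions, not notation from the text; the sketch only evaluates the ratio of the heuristic displacement $(s/N)^{1/2}$ to the eigenvalue spacing $1/N$, which equals $N^{\xi/2}$.

```python
import math


def displacement_over_spacing(n_matrix: int, xi: float) -> float:
    """Ratio of the heuristic displacement (s/N)^{1/2} to the spacing 1/N.

    Assumes the illustrative parametrization s = N^{-1+xi};
    algebraically the ratio is N^{xi/2}.
    """
    s = n_matrix ** (-1.0 + xi)
    return math.sqrt(s / n_matrix) * n_matrix


if __name__ == "__main__":
    for xi in (-0.5, 0.0, 0.5):
        # xi < 0: eigenvalues move less than one spacing; xi > 0: more.
        print(xi, displacement_over_spacing(10**6, xi))
```

For $\xi<0$ the ratio tends to zero, which is the regime $s\ll 1/N$ in which the two spectra are locally indistinguishable; for $\xi>0$ the eigenvalues move by many spacings, so a direct comparison is no longer available.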

Suppose the real and imaginary parts of the offdiagonal matrix elements evolve according to the Ornstein-Uhlenbeck (OU) process

with the reversible measure $\mu({\rm d}x)=e^{-x^{2}}{\rm d}x$ and initial distribution $u_{0}=u$ (strictly speaking, a differently normalized OU process is used for the diagonal elements but we omit this detail here). Under this process, the matrix evolves as

and the expectation and variance of the matrix entries remain constant. Notice that for small time $t$, $t\approx a^{2}$ when compared with (1.1), after a trivial rescaling.

The initial distribution of all the matrix elements is $F\,{\rm d}\mu^{\otimes n}=(u\;{\rm d}\mu)^{\otimes n}$ with $n=N^{2}$. Let ${\mathcal{L}}$ be the generator on the product space and $e^{t{\mathcal{L}}}:=(e^{tL})^{\otimes n}$ be the dynamics of the OU process for all the matrix elements. The joint probability distribution of the matrix elements at time $t$ is then given by

Suppose that for some small $t$, say, $t=N^{-1+\lambda}$ with $\lambda>0$, we know the local eigenvalue correlation function w.r.t. $F_{t}$. Let

be the total variation norm between $F_{t}$ and $F$. In order to approximate the correlation functions of $F$ by those of $F_{t}$ in a weak sense (tested against bounded observables), we need $Var(F,F_{t})\to 0$. Heuristically, $Var(F,F_{t})\sim tN^{2}$ and this requires that $t\ll N^{-2}$, which is far from the time scale $t\geq N^{-1+\xi}$ for which the sine kernel has been proven. For observables on short scales, an effective speed of convergence for the total variation is needed. For example, to test a local observable with two variables on scale $1/N$, as in the case of the Dyson sine kernel, one has to prove $Var(F,F_{t})=o(N^{-2})$.

Although the heuristic bound $Var(F,F_{t})\sim tN^{2}$ can be improved to $Var(F,F_{t})\sim tN$, further improvement seems to be impossible. Thus we are unable to obtain even the weaker bound $Var(F,F_{t})=o(1)$ for $t>1/N$. The main observation in the current paper is that, while we cannot compare $F$ with $F_{t}$, it suffices to prove the existence of some function $G$ for which the correlation functions with respect to $e^{t{\mathcal{L}}}G$ can be computed for $t\geq N^{-1+\lambda}$ and $Var(F,e^{t{\mathcal{L}}}G)=o(N^{-2})$. Since the necessary input to compute the correlation functions is the validity of the semicircle law on short scales, which we have proved for a wide class of distributions $\nu$, the choice of $G$ is essentially dominated by the condition $Var(F,e^{t{\mathcal{L}}}G)=o(N^{-2})$. Note that $G$ itself may depend on $t$. Since $F=e^{t{\mathcal{L}}}(e^{-t{\mathcal{L}}}F)$, we could, in principle, choose $G=e^{-t{\mathcal{L}}}F=[e^{-tL}]^{\otimes n}F$. But the diffusive dynamics cannot be reversed except for a very special class of initial data $G$. However, we only have to reverse the dynamics approximately, and the choice $G_{t}=\big[1-tL+\frac{1}{2}t^{2}L^{2}\big]^{\otimes n}F$ turns out to be sufficient. In this case, $e^{t{\mathcal{L}}}G_{t}-F=O(N^{2}t^{3})$ and we will show that

Furthermore, under some mild regularity condition on $F$, $G_{t}$ is in the class for which we can establish the local semicircle law. We will call this argument the method of time reversal.
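The cubic order of the time reversal error can be checked on a single mode. The following is a minimal sketch under a simplifying assumption that is not made in the text: we look at one eigenfunction of $L$ with eigenvalue $-\lambda\leq 0$, so that $e^{tL}$ acts as $e^{-s}$ and $1-tL+\frac{1}{2}t^{2}L^{2}$ acts as $1+s+\frac{1}{2}s^{2}$ with $s=t\lambda$. The function names are hypothetical.

```python
import math


def reversal_error(s: float) -> float:
    """Multiplier of one mode in e^{tL} g_t - v, with s = t * lam.

    Assumption (for illustration only): the mode is an eigenfunction of L
    with eigenvalue -lam, so e^{tL} -> exp(-s) and
    1 - tL + t^2 L^2 / 2 -> 1 + s + s^2 / 2.
    """
    return math.exp(-s) * (1.0 + s + 0.5 * s * s) - 1.0


def first_order_error(s: float) -> float:
    """Same quantity when only 1 - tL is used as the approximate inverse."""
    return math.exp(-s) * (1.0 + s) - 1.0


if __name__ == "__main__":
    for s in (1e-1, 1e-2):
        # The ratios approach -1/6 and -1/2 respectively as s -> 0.
        print(s, reversal_error(s) / s**3, first_order_error(s) / s**2)
```

Expanding the exponential gives $e^{-s}(1+s+\frac{1}{2}s^{2})-1=-\frac{1}{6}s^{3}+O(s^{4})$, whereas the first order choice $1-tL$ leaves an error of order $s^{2}$; this is the single-variable mechanism behind the $t^{3}$ in the bound above.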

We now summarize the assumptions on the initial distribution. Let the probability measure of the real and imaginary parts of the off-diagonal matrix elements be of the form

with some constants $\delta>0$, $C$ and $C^{\prime}$. In Section 5 we explain how to relax this latter condition to exponential decay,

with some constants $C,C^{\prime}$ (in fact, some high power law decay is sufficient). We assume that the first moment of $\nu$ is zero and the variance is $\frac{1}{2}$

We assume the conditions (1.5), (1.6) and (1.8) for $\widetilde{V}$ as well with the variance changed to 1.

Let $p_{N}(x_{1},x_{2},\ldots,x_{N})$ denote the probability density of the eigenvalues and for any $k=1,2,\ldots,N$, let

be the $k$-point correlation function. With our choice of the variance of $\nu$, the density $p^{(1)}_{N}(x)$ is supported in $[-2,2]+o(1)$ and in the $N\to\infty$ limit it converges to the Wigner semicircle law given by the density

With similar methods we can also prove that the higher order rescaled correlation functions,

converge in the weak sense to $\det\big(f(a_{i}-a_{j})\big)_{1\leq i,j\leq k}$ where $f(\tau)=\frac{\sin\pi\tau}{\pi\tau}$; however, this statement requires more regularity conditions on $V$. The proof of the sine kernel for $e^{t{\mathcal{L}}}G_{t}$ immediately implies the convergence of the higher order correlation functions with respect to the evolved measure. To conclude for the higher order correlation functions with respect to $F$, however, one needs to improve the accuracy in (1.4). This can be achieved by approximating the backward evolution $e^{-t{\mathcal{L}}}$ to a higher order. For example, using $G_{t}=\big[1-tL+\frac{1}{2!}(-tL)^{2}+\ldots+\frac{1}{(m-1)!}(-tL)^{m-1}\big]^{\otimes n}F$ will improve the bound (1.4) to $t^{2m}N^{2}$, modulo $N^{\varepsilon}$ corrections, if $V$ is $2m$-times differentiable with bounds similar to (1.5).
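For orientation, the limiting object $\det\big(f(a_{i}-a_{j})\big)$ is easy to evaluate numerically. The sketch below is only an illustration of the formula; the function names are hypothetical and the determinant is computed by plain Gaussian elimination to keep the example free of external libraries.

```python
import math


def sine_kernel(tau: float) -> float:
    """f(tau) = sin(pi tau) / (pi tau), extended by f(0) = 1."""
    return 1.0 if tau == 0.0 else math.sin(math.pi * tau) / (math.pi * tau)


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return d


def limiting_correlation(points):
    """det( f(a_i - a_j) ) for the rescaled positions a_1, ..., a_k."""
    return det([[sine_kernel(x - y) for y in points] for x in points])


if __name__ == "__main__":
    print(limiting_correlation([0.0, 0.5]))       # equals 1 - f(0.5)^2
    print(limiting_correlation([0.3, 0.3]))       # coinciding points
    print(limiting_correlation([0.0, 0.4, 1.1]))  # a three-point value
```

For $k=2$ the determinant reduces to $1-f(a_{1}-a_{2})^{2}$, which vanishes when the two points coincide; this is the level repulsion encoded in the sine kernel.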

We now state our result concerning the eigenvalue gap distribution. For any $s>0$ and $|u|<2$ we define the density of eigenvalue pairs with distance less than $s/N\varrho_{sc}(u)$ in the vicinity of $u$ by

where $t_{N}=N^{-1+\delta}$ for some $0<\delta<1$.

Suppose the probability measure of the matrix elements satisfies conditions (1.5), (1.6) and (1.8). Let ${\mathcal{K}}_{\alpha}$ be the operator acting on $L^{2}((0,\alpha))$ with kernel $\frac{\sin\pi(x-y)}{\pi(x-y)}$. Then for any $u$ with $|u|<2$ and for any $s>0$ we have

where $\det$ denotes the Fredholm determinant of the compact operator $1-{\mathcal{K}}_{\alpha}$.
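The Fredholm determinant $\det(1-{\mathcal{K}}_{\alpha})$ can be approximated numerically by replacing the integral operator with a quadrature matrix. The sketch below uses Gauss-Legendre quadrature on $(0,\alpha)$; this discretization is a standard numerical device and is our choice for illustration, not a method used in the paper. All function names and the number of nodes are assumptions.

```python
import math


def gauss_legendre(m):
    """Nodes and weights of m-point Gauss-Legendre quadrature on [-1, 1]."""
    nodes, weights = [], []
    for i in range(1, m + 1):
        x = math.cos(math.pi * (i - 0.25) / (m + 0.5))  # initial guess
        dp = 1.0
        for _ in range(100):  # Newton iteration on the Legendre polynomial P_m
            p0, p1 = 1.0, x
            for k in range(2, m + 1):
                p0, p1 = p1, ((2 * k - 1) * x * p1 - (k - 1) * p0) / k
            dp = m * (x * p1 - p0) / (x * x - 1.0)  # P_m'(x)
            dx = p1 / dp
            x -= dx
            if abs(dx) < 1e-15:
                break
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights


def sine_kernel(x, y):
    d = x - y
    return 1.0 if d == 0.0 else math.sin(math.pi * d) / (math.pi * d)


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return d


def gap_probability(alpha, m=20):
    """Quadrature approximation of det(1 - K_alpha) on L^2((0, alpha))."""
    nodes, weights = gauss_legendre(m)
    xs = [0.5 * alpha * (x + 1.0) for x in nodes]  # map [-1, 1] to [0, alpha]
    ws = [0.5 * alpha * w for w in weights]
    a = [[(1.0 if i == j else 0.0)
          - math.sqrt(ws[i]) * sine_kernel(xs[i], xs[j]) * math.sqrt(ws[j])
          for j in range(m)] for i in range(m)]
    return det(a)


if __name__ == "__main__":
    for alpha in (0.01, 0.5, 1.0, 2.0):
        print(alpha, gap_probability(alpha))
```

For small $\alpha$ the value is close to $1-\alpha$, since the trace of ${\mathcal{K}}_{\alpha}$ equals $\alpha$, and it decreases as the interval grows.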

As a corollary of Theorem 1.2, one can easily show that the probability to find no eigenvalue in the interval $[u,u+\alpha/(\varrho_{sc}(u_{0})N)]$, after averaging in an interval of size $N^{-1+\delta}$ around $u_{0}\in(-2,2)$, is given by $\det(1-{\mathcal{K}}_{\alpha})$, the same as in the case of GUE. Note that assuming more regularity on the exponent of the density $u(x)=e^{-U(x)}$, we can get a better bound on the convergence rate (by approximating the backward evolution $e^{-t{\mathcal{L}}}$ to a higher order) and therefore avoid the averaging over $u$.

The proof of Theorems 1.1 and 1.2 consists of two main parts. In Section 2 we prove the approximation (1.4) under precise conditions on the initial distribution $u=e^{-V}$. In Section 3 we prove the sine kernel for the distribution $e^{t{\mathcal{L}}}G_{t}$ with $t=N^{-1+\lambda}$ for any $\lambda>0$, which is the optimal time scale for such a result. Our approach is to recast the formula for the correlation function, which becomes unstable for $t\ll 1$, into a more symmetric form (Proposition 3.2) so that it is stable for all times down to $t=N^{-1+\lambda}$. The saddle point analysis can then be achieved with the local semicircle law. Finally, we complete the proofs of the main theorems in Section 4.

The method of time reversal described previously is very general and should be applicable to a wide range of models. More significantly, it explains the origin of the universality, i.e., the universality comes from the “time reversal”. To summarize, the universality consists of the following observations: (1) The local statistics are determined by the local equilibrium measures. (2) The relaxation to local equilibria takes place in a short time. (3) The original distribution can be well-approximated by the distribution of the Dyson Brownian motion for a short time with initial data given by an approximate inverse flow. To implement this scheme, a key input is to estimate the fluctuations of the empirical density of eigenvalues in short scales.

Shortly after this manuscript appeared on the arXiv, we learned that our main result was also obtained by Tao and Vu under essentially no regularity conditions on the initial distribution $\nu$ provided the third moment of $\nu$ vanishes. Some partial results for the Gaussian orthogonal ensembles are also obtained and we refer the reader to the preprint for more details.

Conventions. We will use the letters $C$ and $c$ to denote general constants whose precise values are irrelevant and may change from line to line. These constants may depend on the constants in (1.5)–(1.8).

Method of Time Reversal

Recall the Ornstein-Uhlenbeck process from (1.3) with the reversible measure $\mu({\rm d}x)=\mu(x){\rm d}x=e^{-x^{2}}{\rm d}x$. Let $u$ be a positive density with respect to $\mu$, i.e. $\int u\,{\rm d}\mu=1$, and write $u(x)=\exp(-V(x))$.

Let $V$ satisfy the conditions (1.5), (1.6) with some $k$ and (1.8). Let $\lambda>0$ be sufficiently small and $t=N^{-1+\lambda}$. Define a cutoff initial density as

where $\theta$ is a smooth cutoff function satisfying $\theta(x)=1$ for $|x|\leq 1$ and $\theta(x)=0$ for $|x|\geq 2$, and $c_{N}$ and $d_{N}$ are chosen such that $v(x){\rm d}\mu(x)$ is a probability density with zero expectation. Denote ${\mathcal{L}}=L^{\otimes n}$, $F=u^{\otimes n}$ and $F_{c}=v^{\otimes n}$ with $n=N^{2}$.

with some $c>0$ depending on $k$ and $\lambda$.

(ii) $g_{t}:=(1-tL+\frac{1}{2}t^{2}L^{2})v$ is a probability density with respect to ${\rm d}\mu$ and for $G_{t}:=[g_{t}]^{\otimes n}$ we have

where $C$ depends on $\lambda$ and on the constants in (1.5), (1.6).

In the formulation of this proposition we have not taken into account that in our application the diagonal elements of the matrix evolve under a differently normalized OU process with generator $\widetilde{L}=\frac{1}{2}\partial_{x}^{2}-\frac{x}{2}\partial_{x}$ with invariant measure $e^{-x^{2}/2}{\rm d}x$. This modification is only notational and does not affect the validity of the estimates (2.1) and (2.2).

Proof. From condition (1.6) the estimate (2.1) follows directly by noting that the constants $c_{N}$ and $d_{N}$ are subexponentially small in $N$. For the proof of (2.2), we first control the evolution of each matrix element under the OU process (1.3). We assume that for the initial density $v$

hold with some positive constants $A_{1},A_{2}$ and $A_{3}$. Set $g_{t}=(1-tL+\frac{1}{2}t^{2}L^{2})v$ for some $t>0$ and note that $g_{t}$ is a probability density with respect to $\mu$ if

Note that by the monotonicity preserving property of the Ornstein-Uhlenbeck kernel and by (2.3), we have

Here we used the fact that $e^{sL}v\leq e^{sA_{1}}v$ under the first condition in (2.3), which follows from integrating the inequality

where we used (2.6), (2.5) and finally (2.4).

Now we consider the evolution of the product density $F_{c}=v^{\otimes n}$; note that $\int F_{c}\;{\rm d}\mu^{\otimes n}=1$. Applying the same procedure to each variable, we have

as long as $A_{3}^{2}t^{6}n$ is bounded. In our application $n=N^{2}$, thus (2.9) will imply (2.2) provided that

which will also guarantee (2.7). It is straightforward to check that the density $v(x)$ satisfies (2.3) with constants $A_{j}$ subject to (2.4) and (2.10). This completes the proof.
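One way to see why a single-variable error of order $A_{3}^{2}t^{6}$ becomes an error of order $A_{3}^{2}t^{6}n$ on the product space is the standard tensorization identity for the quantity $D(f,g)=\int|f/g-1|^{2}g$, namely $1+D(f^{\otimes n},g^{\otimes n})=(1+D(f,g))^{n}$. We state this as our reading of the step "applying the same procedure to each variable"; the displayed formulas of the proof are not reproduced here. The sketch below verifies the identity by brute force for discrete distributions; the function names and the sample vectors are hypothetical.

```python
from itertools import product


def chi2(f, g):
    """D(f, g) = sum |f/g - 1|^2 g for discrete probability vectors."""
    return sum((fi / gi - 1.0) ** 2 * gi for fi, gi in zip(f, g))


def chi2_product(f, g, n):
    """D of the n-fold product measures, by enumerating the product space."""
    total = 0.0
    for idx in product(range(len(f)), repeat=n):
        fp = gp = 1.0
        for i in idx:
            fp *= f[i]
            gp *= g[i]
        total += (fp / gp - 1.0) ** 2 * gp
    return total


if __name__ == "__main__":
    f = [0.5, 0.3, 0.2]
    g = [0.4, 0.4, 0.2]
    d1 = chi2(f, g)
    for n in (1, 2, 4):
        # The two printed numbers agree up to rounding.
        print(n, chi2_product(f, g, n), (1.0 + d1) ** n - 1.0)
```

When $nD(f,g)$ stays bounded, $(1+D(f,g))^{n}-1$ is comparable to $nD(f,g)$, which is consistent with the boundedness requirement on $A_{3}^{2}t^{6}n$.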

Sine kernel for the time evolved measure

We use the contour integral representation for the correlation functions of the eigenvalues of a matrix of the form $H=\widehat{H}+aV$, where $V$ is a GUE matrix. We will apply this result for the matrix

where, apart from a trivial prefactor $e^{-t/2}$, $G_{t}$ plays the role of $\widehat{H}$ and $a=(e^{t}-1)^{1/2}\approx t^{1/2}$. In order to be able to use the formula given in Proposition 1.1 of to analyze $H=\widehat{H}+aV$, we rescale the variance of ${\rm d}\nu$ from $\frac{1}{2}$ to $\frac{1}{8}+\frac{1}{2}a^{2}$ which changes the semicircle law for $H=\widehat{H}+aV$ to

In particular, the support changes from $[-2,2]$ to $[-\sqrt{1+4a^{2}},\sqrt{1+4a^{2}}]$. Since eventually $a$ will go to zero, the condition $|u|<2$ in Theorem 1.1 to be away from the spectral edge changes to the condition $|u|<1$, which we assume in the sequel. The semicircle law for $\widehat{H}$ will also change from the one given in (1.10) to

In the rest of this Section we will use (3.3). The main result of this section is

Proof. Using Proposition 1.1 of , the (symmetrized) distribution of the eigenvalues $x=(x_{1},\ldots,x_{N})$ of $H=\widehat{H}+aV$ for any fixed $\widehat{H}$ is given by

where $y=(y_{1},\ldots,y_{N})$ are the eigenvalues of the Wigner matrix $\widehat{H}$, with the choice $S=a^{2}/N$. Note that

where $y(\widehat{H})=(y_{1}(\widehat{H}),\ldots,y_{N}(\widehat{H}))$ are the eigenvalues of the Wigner matrix $\widehat{H}$. We will choose ${\mathcal{Y}}_{N}$ to be the event that the points $y=(y_{1},\ldots,y_{N})$ follow the semicircle law (3.3). The limit of the correlation functions of $q_{S}(x,y)$ will be computed starting from the next section in Proposition 3.3.

with some sufficiently small $\eta_{0}<1$ and we set

(after taking the supremum over all energies, which can be controlled by taking energies on a grid of spacing $\eta$). Note that the variance of the matrix elements in was different (see remark at the beginning of Section 3.1) but this does not change the estimates. The condition C1) of on the Gaussian decay for the initial density $g_{t}\mu=(1-tL+\frac{1}{2}t^{2}L^{2})v\mu$ is clearly satisfied by (2.3) and (1.6). Combining the estimate (3.9) with Proposition 3.3 and with the argument after (3.6), we have proved Proposition 3.1.

We compute the correlation functions of $q_{S}(x;y)$ in $x$, for any fixed $y\in{\mathcal{Y}}_{N}$:

Note that this definition of the correlation functions differs from the definition of $R_{m}^{N}$ given in ; the relation being

The following representation is based on the formula in , but it is more stable and suitable for analysis for very short time.

The correlation functions can be represented as

Proof of Proposition 3.2. From Eq. (2.18) in , we have

The change of variables $w=(1-\beta)r+\beta w^{\prime}$, $z=(1-\beta)r+\beta z^{\prime}$ leads to

for every $\beta$. Taking the derivative in $\beta$ at $\beta=1$, and removing the primes from the new integration variables, we find the identity

Using that $H^{\prime}_{v}(w)=w-v+S\sum_{j=1}^{N}1/(w-y_{j})$, we find

The second term on the r.h.s. is just $(v-u)\frac{\partial}{\partial v}{\mathcal{K}}_{N}(u,v)$. Therefore

Integrating back over $v$, starting from $u$, we find that

At this point the contours of integration can be modified; since the singularity $1/(w-z)$ has been removed, they are now allowed to cross. This completes the proof of the proposition.

Let $\kappa>0$. For any sequence $y=y^{(N)}\in{\mathcal{Y}}_{N}$ with the choice $S=N^{-2+\lambda}$ we have

uniformly for $|u|\leq 1-\kappa$ and for $\alpha,\beta$ in a compact set. Moreover, the correlation functions satisfy

uniformly for $|u|\leq 1-\kappa$ and for $\alpha_{1},\ldots,\alpha_{m}$ in a compact set.

Proof. The statement in (3.15) follows directly from (3.14) and (3.11), so it is sufficient to prove (3.14). We will prove (3.14) in the form

for any sequence $u^{(N)}$ with $|u^{(N)}-u_{*}|\leq C/N$ and for every fixed $u_{*}$ with $|u_{*}|<1-\kappa$. In order to get (3.14), we take $u^{(N)}=u_{*}+\alpha/N\varrho(u_{*})$ with $u_{*}=u$.

Saddle points

This is equivalent to finding the zeros of a polynomial of degree $N+1$. There are $N-1$ real roots and two complex roots, called $q_{N}^{\pm}$, that are complex conjugates of each other

We will work with $q_{N}:=q_{N}^{+}$; the analysis of the other saddle is analogous. Clearly $|Re\;q_{N}|\leq K$ for some large $K$.

The solutions of this latter equation (for small $t$) are given by

where we also used the equation (3.23) for $q^{\pm}$. We set $q=q^{+}$.

We need to know that $f_{N}^{\prime\prime}\neq 0$ at the $q_{N}$ saddle.

It follows from (3.8) that for $y\in{\mathcal{Y}}$ we have

We compare $q$ and $q_{N}$. We have from (3.22)

First we show that for the only solution to (3.26) with positive imaginary part we have $Im\;q_{N}\geq\eta$. This is a fixed point argument.

for some large constant $C$. Since $y\in{\mathcal{Y}}$, we know that

with $F^{\prime}(z)=O(t)$ if $z\in\Xi$. Thus $|F_{N}^{\prime}(z)|\leq 1/2$ for $z\in\Xi$, so $F_{N}$ is a contraction on $\Xi$ and thus (3.26) has a unique solution, which is $q_{N}$.

Evaluating the integrals

Using Laplace asymptotics, we compute the integrals in (3.17). We choose the horizontal curves $\gamma_{\pm}$ to pass through the two saddles $q^{\pm}=a\pm ib$ of $f$ (see (3.24)), i.e. we set ${\omega}=b$ (see the definition of $\gamma^{\pm}$ after (3.12)). The vertical line $\Gamma$ is shifted to pass through the saddles, i.e. $Re\;\Gamma=a$. Moreover, if necessary, we deform $\Gamma$ in an $O(N^{-1})$-neighborhood of $a$ so that $\min_{j}\mbox{dist}(\Gamma,y_{j})\geq N^{-2}$ and $\mbox{dist}(\Gamma,a_{N})\geq N^{-2}$; this is always possible.

according to whether $Im\;z$ and $Im\;w$ are positive or negative, e.g.

where $\Gamma_{+}=\Gamma\cap\{w\;:\;Im\,w\geq 0\}$ and $\Gamma_{-}=\Gamma\cap\{w\;:\;Im\,w\leq 0\}$. We will work on $A^{++}$; the other three integrals are treated similarly.

The main contribution to the integral $A^{++}$ will come from an $\varepsilon$-neighborhood in $z$ and $w$ of the saddle point $q_{N}=q_{N}^{+}$. The radius $\varepsilon$ will be chosen such that after a local change of variable $f$ and $f_{N}$ become quadratic near the saddle. We now explain the local change of variable.

with $\phi(q)=0$, $\phi^{\prime}(q)=\sqrt{tf^{\prime\prime}(q)}$ such that

we also assume that $\varepsilon\leq\eta$. We will choose $\varepsilon=ct$ with a small $c$, depending on $u$. We have

from the explicit formula (3.23), so (3.32) is satisfied. Note that $\phi^{\prime}(q)=\sqrt{tf^{\prime\prime}(q)}=1+O(t)$.

We have a similar change of variables for $f_{N}$, i.e. $\phi_{N}$ with the properties that

For $y\in{\mathcal{Y}}$, we have $f^{\prime\prime}_{N}(q_{N})=t^{-1}\big[1+O(N^{-\lambda/4})\big]$ and $|f^{\prime\prime\prime}_{N}(z)|\leq Ct^{-2}N^{-\lambda/4}$ by (3.25) and (3.33), thus we can choose $\varepsilon=ct$ for some small constant $c\leq\sqrt{1-u^{2}}$.

Moreover we have $|\phi_{N}(z)|\leq C|z-q_{N}|$ for $|z-q|\leq ct$, so by the Cauchy formula $|\phi^{\prime}_{N}(z)|\leq C$ and $|\phi^{\prime\prime}_{N}(z)|\leq Ct^{-1}$ for $|z-q|\leq ct$ (maybe after reducing $c$). The same formulas hold for $\phi$ as well. We also have

where in the first term we used (3.25) and in the second we used $|f^{\prime\prime\prime}_{N}(z)|\leq Ct^{-2}$.

for any $z$ with $|z-q|\leq ct$. Therefore the maps $\phi$ and $\phi_{N}$ are $C^{1}$-close within $D_{\varepsilon}$ and both of them are $C^{1}$-close to the shift map $z\to z-q$.

We first consider the $z$ integration. Recall that $q_{N}=q^{+}_{N}=a_{N}+ib_{N}$ from (3.24). We fix a small positive constant $c_{1}\ll 1$ and we define the domains

where $\varepsilon=ct$. Recall that $\gamma^{+}$ was the horizontal line going through $q=a+ib$, the saddle of $f$. We will deform $\gamma^{+}$ to $\gamma_{N}^{+}$ so that it passes through $q_{N}$ and it matches with $\gamma^{+}$ at the points $a_{N}\pm 2\varepsilon+ib$. Within the regime $|Re\;z-a_{N}|\leq\varepsilon$, we define $\gamma_{N}^{+}$ by the requirement that $Im\;\phi_{N}=0$ along $\gamma_{N}^{+}$. Since $\phi_{N}(z)$ is close to the map $z\to z-q_{N}$ by (3.36), clearly $\gamma_{N}^{+}$ is an almost horizontal curve in a small neighborhood of $q_{N}$, so it remains in $W$ until it reaches the vertical lines $|Re\;z-a_{N}|=\varepsilon$. In the regime $\varepsilon\leq|Re\;z-a_{N}|\leq 2\varepsilon$, we require that $\gamma_{N}^{+}$ matches with $\gamma^{+}$ at the points $a_{N}\pm 2\varepsilon+ib$ and that it remains in the wedge $W$. In the outside regime, $|Re\;z-a_{N}|\geq 2\varepsilon$, we set $\gamma_{N}^{+}=\gamma^{+}$; in particular $\gamma_{N}^{+}\subset W\cup\Omega$ (see Fig. 1).

Proof. The second statement (3.40) follows from the normal form (3.35) and the fact that for $z\in W$ we have $|Im\;(z-q_{N})|\leq c_{1}|Re(z-q_{N})|$, i.e. $Re(z-q_{N})^{2}\geq 0$, and $\phi_{N}$ is close to the map $z\to z-q_{N}$ in $W$, so $Re[\phi_{N}(z)]^{2}\geq 0$ for $z\in W$.

For the first statement, we assume $x\geq a$; the case $x\leq a$ is analogous. We get by explicit calculation

(the error is absorbed since $|x-a|\geq ct/2$ for $x+iy\in\Omega^{*}$). Since $Re[f_{N}(z)-f_{N}(q_{N})]\geq 0$ on the vertical lines $|x-a|=\varepsilon/2$, $|y-b|\leq c_{1}\varepsilon/2$, we can integrate the inequality (3.41) to obtain (3.39).

which holds for $|y|\geq\eta$. By explicit computation, and using $f^{\prime}(a+ib)=0$,

if $|y|\leq\frac{1}{2}\sqrt{1-u^{2}}$, $|x-a|\leq Ct$ for some large $C$. Thus we have

where $\varepsilon=ct$ with a small $c$ as before and a similar lower bound holds for $y-b\leq-\varepsilon/2$. Defining

analogously to $W$ before, we easily obtain

The regimes $0\leq y\leq\eta$ and $y\geq\frac{1}{2}\sqrt{1-u^{2}}$ are treated directly. We use

from (3.42), if $\eta_{0}$ is sufficiently small, see (3.7).

If $y\geq\frac{1}{2}\sqrt{1-u^{2}}$, then

and thus $Re\;f_{N}(x+iy)\leq-y^{2}/4t$ in this regime. Summarizing these results, we have

We can define a new contour $\Gamma_{N}^{+}$ similar to $\gamma_{N}^{+}$. It follows the path where $\phi_{N}$ has zero imaginary part when $|Im\;w-b|\leq\varepsilon/2$ and then it returns to $\Gamma^{+}$ when $|Im\;w-b|\geq\varepsilon$. We recall that $\min_{j}\mbox{dist}(\Gamma_{N}^{+},y_{j})\geq N^{-2}$ and $\mbox{dist}(\Gamma,a_{N})\geq N^{-2}$ by the choice of $\Gamma$.

With the paths $\gamma_{N}^{+}$ and $\Gamma_{N}^{+}$ defined, we can now do the integration

if $|z-(a+ib)|\leq\varepsilon$, $|w-(a+ib)|\leq\varepsilon$. In order to make sure that these bounds are satisfied, we fix the constant $r=\text{Re }q_{N}(u_{*})$ in (3.17). Here $q_{N}(u_{*})$ is the unique solution with positive imaginary part of the saddle point equation (3.22), with $u$ (which is actually a shorthand notation for $u^{(N)}$) replaced by the fixed $u_{*}$. Note that, since $|u^{(N)}-u_{*}|\leq C/N$, we find that the real part of the exponent of $h_{N}(w)$ (see (3.20)) is bounded, $|r-\text{Re}\,w|/t\rho\leq C$, as $w$ runs through $\Gamma$.

This choice also guarantees that, away from the saddle,

that hold for $|Im\;z|\geq\eta$, $Im\;w\geq 0$. These bounds follow from (3.19), (3.20) and (3.21); when $w$ is near the real axis, we also used that $\Gamma_{N}$ is away from the $y_{j}$’s.

The integration in $A^{++}$ (see (3.47)) will be divided into regimes near the saddle $q_{N}$ (“inside”) or away from the saddle (“outside”):

Recall that $|q_{N}-q|=o(t)$ and $q=q^{+}=a+ib$ (see (3.24)). For example

where $\chi$ is the characteristic function of the interval $[-\varepsilon,\varepsilon]$. The other $A$’s are defined analogously.

The integral of the exponential term is bounded by

Taking into account (3.48) and (3.49), we see that $|A_{io}|\leq e^{-cNt}$ since $t=N^{-1+\lambda}$. Similarly we can bound all other terms with an outside part. When $|Re\,z-a|\geq ct\gg N^{-1}$, then the exponential growth of $h_{N}$ in (3.49) will be controlled by the Gaussian decay of

Finally, we have to compute the contribution of the saddle, i.e. the term $A_{ii}$. We let $\widetilde{\gamma}$ be the part of $\gamma_{N}^{+}$ with $|Re\;\gamma_{N}-a|\leq\varepsilon$ and similarly define $\widetilde{\Gamma}$. Recall that $Im\;\phi_{N}=0$ on $\widetilde{\gamma}$. From a standard Laplace asymptotics calculation, we have

while the main term in the bracket on the r.h.s. of (3.51) is of order $t^{-1}$. Analogously performing the ${\rm d}w$ integration, we obtain that

where we also used $g_{N}(q_{N},q_{N})=f_{N}^{\prime\prime}(q_{N})$ following from (3.21). So far we considered the saddle $q_{N}=q_{N}^{+}$ with positive imaginary part for both the $z$ and $w$ integrals. The same calculation can be performed at the saddle $z=w=q_{N}^{-}$. The mixed case, when $z$ is integrated near one of the saddles and $w$ is near the other one, gives zero contribution, since $g_{N}(q_{N}^{-},q_{N}^{+})=g_{N}(q_{N}^{+},q_{N}^{-})=0$ by (3.21). Adding up the contributions of the two relevant saddles, $z=w=q_{N}^{+}$ and $z=w=q_{N}^{-}$, and taking into account the opposite orientations of the two pieces of $\gamma_{N}$, one obtains

where we used the choice $r=\text{Re }q^{\pm}_{N}(u_{*})$ (see after (3.48)), which guarantees that $|r-\text{Re}\,q_{N}^{\pm}|\to 0$ as $N\to\infty$, and the equations (3.16), (3.24), and (3.28). This completes the proof of Proposition 3.3.

Proof of the main theorems

Proof of Theorem 1.1. We follow the notations of Proposition 2.1. In Proposition 3.1 we have shown that the sine kernel holds for the measure $e^{t{\mathcal{L}}}G_{t}$ if $t=N^{-1+\lambda}$. More precisely, let $p_{N,t}(x)$ denote the density function of the eigenvalues $x=(x_{1},\ldots,x_{N})$ w.r.t. $e^{t{\mathcal{L}}}G_{t}$ and let $p_{N,t}^{(2)}$ be the two point correlation function, defined analogously to (1.9). Similarly, we define $p_{N,c}(x)$ and $p_{N,c}^{(2)}$ for the eigenvalue density and two point correlation function w.r.t. the truncated measure $F_{c}=v^{\otimes n}$.

for any $|u|<2$ and with the notation $\varrho=\varrho_{sc}(u)$. (We remark that $p_{N,t}^{(2)}$ was denoted by $\widetilde{p}_{N}^{(2)}$ in Proposition 3.1 and the condition $|u|<2$ is translated into $|u|<1$ after rescaling.)

To prove (1.11), we thus only need to control the difference as follows

with some $c>0$ as $N\to\infty$. To estimate $(II)$, we have

Using (4.1) for the observable $|O|$ instead of $O$, the second factor on the r.h.s. of (4.2) is bounded. Since $O$ is bounded, the first factor is smaller than

Here in the first step we used that the quantity $D(f,g)=\int|f/g-1|^{2}g$ for two probability measures $f$ and $g$ decreases when taking marginals. In the second step, we used that $D(f,g)$ decreases when passing from the probability laws of the matrix elements to the induced probability laws for the eigenvalues. Finally, we used the estimate (2.2). This completes the proof of Theorem 1.1.
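The two elementary facts about $D(f,g)$ used in this step, that it decreases when taking marginals and that it controls the total variation through $\int|f-g|\leq D(f,g)^{1/2}$ (Cauchy-Schwarz), can be checked on a toy example. The sketch below uses discrete distributions on a $2\times 2$ grid; the function names and the numerical values are hypothetical and chosen only for illustration.

```python
import math


def chi2(f, g):
    """D(f, g) = sum |f/g - 1|^2 g over a flat list of probabilities."""
    return sum((fi / gi - 1.0) ** 2 * gi for fi, gi in zip(f, g))


def total_variation(f, g):
    """sum |f - g|, the discrete analogue of the L^1 distance."""
    return sum(abs(fi - gi) for fi, gi in zip(f, g))


def first_marginal(joint):
    """Marginal of a joint table (list of rows) on the first coordinate."""
    return [sum(row) for row in joint]


def flatten(joint):
    return [p for row in joint for p in row]


if __name__ == "__main__":
    f = [[0.1, 0.5], [0.2, 0.2]]      # hypothetical joint law
    g = [[0.25, 0.25], [0.25, 0.25]]  # hypothetical reference law
    d_joint = chi2(flatten(f), flatten(g))
    d_marg = chi2(first_marginal(f), first_marginal(g))
    tv = total_variation(flatten(f), flatten(g))
    print(d_joint, d_marg, tv, math.sqrt(d_joint))
```

In this example the marginal value of $D$ is strictly smaller than the joint one, and the total variation is bounded by $D^{1/2}$, in line with the two steps of the estimate above.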

by recalling (3.5). The second term can be estimated by using $|\Lambda|\leq N$ and (3.9) as

For the first term in (4.4), we use the inclusion-exclusion principle to compute

with $\varrho=\varrho(u)$ (see (3.2)) and recall that $\widetilde{p}^{(m)}_{N,y,S}$ denote the correlation functions of $q_{S}(x,y)$ (see (3.10)). After a change of variables,

where the factor $m$ comes from considering the integration sector $z_{1}\leq z_{j}$, $j\geq 2$. Taking $N\to\infty$ and using Proposition 3.3, we get

where in the last determinant term we set $a_{1}=0$. The interchange of the limit and the summation can be justified by noting that the inclusion-exclusion principle guarantees that (4.6) is an alternating series where the difference between the sum and its $M$-term truncation can be controlled by the $(M+1)$-th term for any $M$. We note that the left hand side of (4.8) is $\int_{0}^{s}p(\alpha){\rm d}\alpha$, where $p(\alpha)$ is the second derivative of the Fredholm determinant $\det(1-{\mathcal{K}}_{\alpha})$ (see (1.13)). Combining (4.8) with the estimate (4.5), we have

After rescaling (3.1), we also conclude that the limit of the expectation of $\Lambda$ with respect to the time evolved ensemble $e^{t{\mathcal{L}}}G_{t}$ (see Proposition 2.1) is given by the right hand side of (4.9).

Finally, the difference of the expectation of $\Lambda$ with respect to the measure $e^{t{\mathcal{L}}}G_{t}$ and w.r.t. the initial ensemble $F$ vanishes since $|\Lambda|\leq N$ and $\mbox{Var}(e^{t{\mathcal{L}}}G_{t},F)\leq CN^{-2+4\lambda}$ (see (2.1) and (2.2)). This completes the proof of Theorem 1.2.

Some extensions and comments

In this section we explain how to relax some of the conditions on the initial distribution $\nu$.

We first explain how to extend our proof to include distributions $\nu$ with compact support. Take for example a density w.r.t. the Gaussian measure ${\rm d}\mu(x)=e^{-x^{2}}{\rm d}x$ that is given by a nice bump function $u(x)$ supported in $[-1,1]$ decaying like $(1\pm x)^{m}$ near the boundary $x=\pm 1$. Clearly, for any $m$ fixed this distribution violates the assumptions of Theorem 1.1. We now show that for $m$ large enough, it is still possible to prove the universality. Define a new distribution with density

with a small parameter $\tau>0$ to be determined later. Near the edge $1$ we have $Lq(1-y)\lesssim Cy^{m-2}$ for $0\leq y\ll 1$ with some $m$-dependent constant $C$. We thus need the condition

to guarantee that $(1-tL)q$ is a probability density. This inequality holds if

The other conditions concerning $L^{2}$ and $L^{3}$ (see (2.3)) can be handled similarly. Choosing $\tau=Ct^{1/2}$, the total variation norm is bounded by

Since $n=N^{2}$ and $t=N^{-1+\varepsilon}$, we have

Let, say, $m\geq 9$, then the error term will be smaller than $N^{-2-\delta}$ with some $\delta>0$ and this will imply Theorem 1.1 for the initial distribution $u$. The modification of $u$ in (5.1) can certainly be more sophisticated to reduce the exponent $m$.

Therefore, Theorem 4.1 of holds with the estimates taking the form

References