Universality of Random Matrices and Local Relaxation Flow

Laszlo Erdos, Benjamin Schlein, Horng-Tzer Yau

Introduction

A central question concerning random matrices is the universality conjecture which states that local statistics of eigenvalues are determined by the symmetries of the ensembles but are otherwise independent of the details of the distributions. There are two types of universalities: the edge universality and the bulk universality concerning the interior of the spectrum. The edge universality is commonly approached via the moment method while the bulk universality was proven for very general classes of unitary invariant ensembles (see, e.g. and references therein) based on detailed analysis of orthogonal polynomials. The most prominent non-unitary ensembles are the Wigner matrices, i.e., random matrices with i.i.d. matrix elements that follow a general distribution. The bulk universality for Hermitian Wigner ensembles was first established in for ensembles with smooth distributions. The later work by Tao and Vu did not assume smoothness but it required some moment condition which was removed later in . Our approach to prove the universality was based on the following three steps.

Step 1. Local semicircle law. It states that the number of eigenvalues in a spectral window containing about $N^{\varepsilon}$ eigenvalues is given by the semicircle law with a very high probability. The factor $N^{\varepsilon}$ can be improved to any sufficiently large constant at the expense of a deterioration of the probability estimate.
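As a rough numerical illustration of this statement, the following Python sketch compares the number of eigenvalues of one sampled matrix in a window with the semicircle prediction. It is only a sanity check under stated assumptions: the matrix is Gaussian with entry variance $1/N$ (an assumed normalization that places the spectrum near $[-2,2]$), the window is far wider than the windows containing about $N^{\varepsilon}$ eigenvalues that the law is concerned with, and the helper names `semicircle_cdf`, `sample_symmetric_wigner` and `count_in_window` are hypothetical.

```python
import numpy as np

def semicircle_cdf(x):
    """CDF of the density (1/(2*pi)) * sqrt(4 - x^2) supported on [-2, 2]."""
    x = np.clip(x, -2.0, 2.0)
    return 0.5 + x * np.sqrt(4.0 - x**2) / (4.0 * np.pi) + np.arcsin(x / 2.0) / np.pi

def sample_symmetric_wigner(n, rng):
    """Symmetric Gaussian matrix: variance 1/n off the diagonal, 2/n on it (assumed scaling)."""
    a = rng.standard_normal((n, n))
    return (a + a.T) / np.sqrt(2.0 * n)

def count_in_window(eigs, lo, hi):
    """Number of eigenvalues in the closed interval [lo, hi]."""
    return int(np.sum((eigs >= lo) & (eigs <= hi)))

rng = np.random.default_rng(0)
n = 400
eigs = np.linalg.eigvalsh(sample_symmetric_wigner(n, rng))
lo, hi = -0.5, 0.5
observed = count_in_window(eigs, lo, hi)
predicted = n * (semicircle_cdf(hi) - semicircle_cdf(lo))
print(observed, round(predicted, 1))
```

Shrinking the window while increasing `n` gives a feel for how the agreement behaves on smaller scales, although a single sample proves nothing about probabilities.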

Step 2. Universality for Gaussian divisible ensembles. The Gaussian divisible ensembles are given by matrices of the form

$$\widehat{H}+\sqrt{s}V, \tag{1.1}$$

where $\widehat{H}$ is a Wigner matrix, $V$ is an independent standard GUE matrix and $s>0$. Johansson and the later improvement in proved that the bulk universality holds for ensembles of the form (1.1) if $s>0$ is independent of $N$. In the work , this result was extended to $s=N^{-1+\varepsilon}$ for any $\varepsilon>0$. The key ingredient for this extension was the local semicircle law.
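A minimal Python sketch of such a matrix may help fix ideas. The normalizations are assumptions made for this illustration (entries of both matrices have second moment $1/N$, and the sum is not rescaled, so its entries have variance $(1+s)/N$); the uniform entry distribution and the function names are hypothetical choices, not taken from the text.

```python
import numpy as np

def sample_gue(n, rng):
    """Hermitian Gaussian matrix with E|v_ij|^2 = 1/n (assumed GUE normalization)."""
    a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (a + a.conj().T) / (2.0 * np.sqrt(n))

def sample_hermitian_wigner(n, rng):
    """Hermitian matrix with uniform (non-Gaussian) entries, E|h_ij|^2 = 1/n."""
    half = np.sqrt(1.5)  # uniform on [-half, half] has variance 1/2
    re = rng.uniform(-half, half, (n, n))
    im = rng.uniform(-half, half, (n, n))
    upper = np.triu(re + 1j * im, k=1)
    diag = np.diag(rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), n))  # variance 1
    return (upper + upper.conj().T + diag) / np.sqrt(n)

def gaussian_divisible(h_hat, v, s):
    """Matrix of the form h_hat + sqrt(s) * v with s > 0."""
    return h_hat + np.sqrt(s) * v

rng = np.random.default_rng(0)
n = 100
s = n ** (-1.0 + 0.2)  # s = N^{-1 + epsilon} with epsilon = 0.2
m = gaussian_divisible(sample_hermitian_wigner(n, rng), sample_gue(n, rng), s)
eigs = np.linalg.eigvalsh(m)
```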

Step 3. Approximation by Gaussian divisible ensembles. For any given Wigner matrix $H$, we find another Wigner matrix $\widehat{H}$ so that the eigenvalue statistics of $H$ and $\widehat{H}+\sqrt{s}V$ are close to each other. The choice of $\widehat{H}$ is given by a reverse heat flow argument.

Johansson’s proof of the universality of Hermitian Wigner ensembles relied on the asymptotic analysis of an explicit formula by Brézin-Hikami for the correlation functions of the eigenvalues of $\widehat{H}+\sqrt{s}V$. Unfortunately, the similar formula for GOE is not very explicit and the corresponding result is not available. On the other hand, the eigenvalue distribution of the matrix $\widehat{H}+\sqrt{s}V$ is the same as that of $\widehat{H}+V(s)$, where the matrix elements of $V(s)$ are independent standard Brownian motions with variance $s/N$. Dyson observed that the evolution of the eigenvalues of the flow $s\to\widehat{H}+V(s)$ is given by a system of coupled stochastic differential equations (SDE), commonly called the Dyson Brownian motion (DBM).

If we replace the Brownian motions by the Ornstein-Uhlenbeck processes, the resulting dynamics on the eigenvalues, which we still call DBM, has the GUE or GOE eigenvalue distributions as the invariant measures depending on the symmetry type of the ensembles. Thus the result of Johansson can be interpreted as stating that the local statistics of GUE is reached via DBM for time of order one. In fact, by analyzing the dynamics of DBM with ideas from the hydrodynamical limit, we have extended Johansson’s result to $s\gg N^{-3/4}$. The key observation of is that the local statistics of eigenvalues depend exclusively on the approach to local equilibrium. This method avoids the usage of explicit formulae for correlation functions, but the identification of local equilibria, unfortunately, still uses explicit representations of correlation functions by orthogonal polynomials (following e.g. ), and the extension to other ensembles is not a simple task.

Therefore, the universality for symmetric random matrices remained open and the only partial result is Theorem 23 of for Wigner matrices with the first four moments of the matrix elements matching those of GOE. The approach of consisted of three similar steps as outlined above. For Step 2, it used the result of . For Step 3, a four moment comparison theorem for individual eigenvalues was proved in and the local semicircle law (Step 1) was one of the key inputs in this proof.

In this paper, we introduce a general approach to prove local ergodicity of DBM, partially motivated by the previous work . In this approach the analysis of orthogonal polynomials and the use of explicit formulae are completely eliminated, and the method applies to both Hermitian and symmetric ensembles. In fact, the heart of the proof is a convex analysis and it applies to $\beta$-ensembles for any $\beta\geq 1$. The model-specific information required to complete this approach involves only rough estimates on the accuracy of the local density of eigenvalues. We expect this method to apply to a very general class of models. More detailed explanations will be given in Section 3.

Statement of Main Results

To fix the notation, we will present the case of symmetric Wigner matrices; the modification to the Hermitian case is straightforward and will be omitted. The extension to the quaternion self-dual case is also standard, see, e.g. for the notations and setup. On the other hand, the main theorem on DBM (Theorem 2.1) is valid for general $\beta$-ensembles. Thus all notations for matrices will be restricted to symmetric matrices but all results for flows will be stated and proved for general $\beta$-ensembles. We first explain our general result about DBM and in Section 2.2 we will present its application to Wigner matrices.

The joint distributions of the eigenvalues ${\bf{x}}=(x_{1},x_{2},\ldots,x_{N})$ of the Gaussian Unitary Ensemble (GUE) and the Gaussian Orthogonal Ensemble (GOE) are given by the following measure

where $\beta=1$ for GOE and $\beta=2$ for GUE. We will sometimes use $\mu$ to denote the density of the measure as well, i.e., $\mu({\bf{x}}){\rm d}{\bf{x}}=\mu({\rm d}{\bf{x}})$. We consider $\mu$ defined on the ordered set

and this measure is well-defined for all $\beta>0$. The Dyson Brownian motion (DBM) is characterized by the generator

acting on $L^{2}(\mu)$. The DBM is reversible with respect to $\mu$ with the Dirichlet form

where $\partial_{j}=\partial_{x_{j}}$. Notice that we have added a drift $\frac{\beta}{4}x_{i}\partial_{i}$ so that the DBM is reversible w.r.t. $\mu$. The original definition by Dyson in was slightly different; it contained no drift term.

Denote the distribution of the process at the time $t$ by $f_{t}({\bf x})\mu({\rm d}{\bf x})$. Then $f_{t}$ satisfies

The corresponding stochastic differential equation for x(t){\bf x}(t) is now given by (see, e.g. Section 12.1 of )

where $\{B_{i}\;:\;1\leq i\leq N\}$ is a collection of independent Brownian motions. The well-posedness of DBM on $\Sigma_{N}$ has been proved in Section 4.3.1 of , see the Appendix for some more details. This step requires $\beta\geq 1$ which we will assume from now on.

The dynamics given by (2.4) and (2.5) with $\beta=1,2,4$ can be realized by the evolution of the eigenvalues of symmetric, Hermitian and quaternion self-dual matrix ensembles, but the dynamics is well-defined for $\beta\geq 1$ independently of the original matrix models. Our main result, Theorem 2.1, is valid for all $\beta\geq 1$.
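To make the remark about general $\beta$ concrete, here is a hedged Euler-Maruyama sketch in Python that evolves $N$ points directly, without any matrix. It assumes that the SDE has the form $dx_{i}=dB_{i}/\sqrt{N}+\beta\big(-\tfrac{1}{4}x_{i}+\tfrac{1}{2N}\sum_{j\neq i}(x_{i}-x_{j})^{-1}\big)dt$, which is consistent with the drift $\frac{\beta}{4}x_{i}\partial_{i}$ mentioned above but is an assumption of this illustration. The re-sorting after each step is a purely numerical safeguard against crossings caused by the time discretization; the function names are hypothetical.

```python
import numpy as np

def dbm_drift(x, beta):
    """Assumed DBM drift: beta * (-x_i / 4 + (1 / (2N)) * sum_{j != i} 1 / (x_i - x_j))."""
    n = x.size
    diff = x[:, None] - x[None, :]
    np.fill_diagonal(diff, np.inf)  # 1 / inf = 0 removes the j = i term
    return beta * (-x / 4.0 + (1.0 / diff).sum(axis=1) / (2.0 * n))

def simulate_dbm(x0, beta, t_final, dt, rng):
    """Euler-Maruyama steps for the assumed SDE; returns the ordered points at t_final."""
    x = np.sort(np.asarray(x0, dtype=float))
    n = x.size
    for _ in range(int(round(t_final / dt))):
        noise = rng.standard_normal(n) * np.sqrt(dt / n)
        # Sorting keeps the configuration in the ordered set after a discrete step.
        x = np.sort(x + dbm_drift(x, beta) * dt + noise)
    return x

rng = np.random.default_rng(1)
x0 = np.linspace(-1.5, 1.5, 20)
x_final = simulate_dbm(x0, beta=2.0, t_final=0.1, dt=1e-4, rng=rng)
```

A small time step is needed because the interaction is singular at short distances; this scheme is meant for experimentation only and carries no accuracy guarantee.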

Similarly, the correlation functions of the equilibrium measure are denoted by

is the density of the semicircle law and it is well-known that $\varrho_{sc}$ is also the density w.r.t. the measure $\mu$ in the limit $N\to\infty$ for $\beta\geq 1$. Define

and let $\gamma_{j}$ be the classical location of the $j$-th eigenvalue
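For orientation, the classical locations can be computed numerically. The Python sketch below assumes the usual conventions, namely $\varrho_{sc}(x)=\frac{1}{2\pi}\sqrt{(4-x^{2})_{+}}$ and $\gamma_{j}$ defined as the $j/N$ quantile of the semicircle law; both are assumptions of this illustration, and the function names are hypothetical.

```python
import numpy as np

def semicircle_cdf(x):
    """CDF of the density (1/(2*pi)) * sqrt(4 - x^2) supported on [-2, 2]."""
    x = np.clip(x, -2.0, 2.0)
    return 0.5 + x * np.sqrt(4.0 - x**2) / (4.0 * np.pi) + np.arcsin(x / 2.0) / np.pi

def classical_locations(n, tol=1e-12):
    """gamma_j for j = 1..n, solving semicircle_cdf(gamma_j) = j / n by bisection."""
    targets = np.arange(1, n + 1) / n
    lo = np.full(n, -2.0)
    hi = np.full(n, 2.0)
    while np.max(hi - lo) > tol:
        mid = 0.5 * (lo + hi)
        below = semicircle_cdf(mid) < targets
        lo = np.where(below, mid, lo)   # root lies to the right of mid
        hi = np.where(below, hi, mid)   # root lies at or to the left of mid
    return 0.5 * (lo + hi)

gammas = classical_locations(50)
```

With this convention the last location sits at the upper spectral edge, and the locations are symmetric in the sense that $\gamma_{N-j}=-\gamma_{j}$.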

Our key result on the local ergodicity of DBM is the following theorem.

Suppose the initial density $f_{0}$ satisfies $S_{\mu}(f_{0}):=\int f_{0}\log f_{0}{\rm d}\mu\leq CN^{m}$ with some fixed exponent $m$ independent of $N$. Let $f_{t}$ be the solution of the forward equation (2.4). Suppose that the following three assumptions are satisfied for all sufficiently large $N$.

There exist constants ${\mathfrak{b}}>0$ and ${\mathfrak{c}}>0$ such that

Convention. We will use the letters $C$ and $c$ to denote general positive constants whose precise values are irrelevant and may change from line to line.

The relaxation time to global equilibrium for the DBM is of order one in our scaling. The simplest way to see this is via the Bakry-Emery theorem which states that, roughly speaking, the relaxation time is the inverse of the lower bound to the Hessian of the Hamiltonian ${\mathcal{H}}$. In our case ${\mathcal{H}}^{\prime\prime}\geq I$, and this implies that the relaxation time is of order one. On the other hand, it was conjectured by Dyson that the relaxation time to local equilibrium is of order $N^{-1}$. Theorem 2.1 asserts that the relaxation time to local equilibrium is less than $N^{-\zeta}$. Although this is far from proving Dyson’s conjecture, it is the first effective estimate that shows that the local equilibrium is approached much faster than the global one. Moreover, this result suffices to prove the bulk universality of Wigner matrices when combined with the reverse heat flow ideas introduced in . We remark that the concept of local equilibrium is used vaguely here and in Dyson’s paper. In principle, there are many local equilibria depending on boundary conditions and the uniqueness is a tough question especially now that the interaction is long ranged and singular.

The proof of Theorem 2.1 is based on the introduction of the pseudo equilibrium measure which we now explain. It is common that the global and the local equilibrium are reached at different time scales for interacting particle systems, of which DBM is a special case. On the other hand, the hydrodynamical approach for the DBM yields very complicated estimates. The main reason for the complications is that the equilibrium measure of DBM has a logarithmic two body interaction that is both long range and singular at short distances. Hence the proof of the uniqueness of “local equilibrium measures” is very complicated and we were able to carry it out only for the Hermitian case, because several identities involving orthogonal polynomials are valid only for the $\beta=2$ case. However, there are two key observations from this study:

The local statistics do not depend on the long range part of the logarithmic interaction; in other words, we can cut off the interactions between far away particles without changing the local statistics.

The relaxation time for the gradient flow associated with the local equilibrium with a fixed boundary condition is much smaller than the global relaxation time of the DBM, which is of order one.

To finesse the difficulty associated with the uniqueness of local equilibria, we define the pseudo equilibrium measure, $\omega$, by cutting off the long range interactions of the equilibrium measure $\mu$ and show that $\omega$, $\mu$ and $f_{t}\mu$ all have the same local statistics. The key idea behind the last assertion is to estimate the relative entropy of the solution to the DBM, $f_{t}\mu$, relative to the pseudo equilibrium measure $\omega$. Since the pseudo equilibrium measure is not a global equilibrium measure, the entropy will not decay monotonically as in the case of the relative entropy w.r.t. the equilibrium measure. More precisely, the time derivative of the relative entropy, under the flow of DBM, w.r.t. the pseudo equilibrium measure consists of two terms (see Theorem 3.5): (i) a dissipation term given by the Dirichlet form of $f_{t}\mu$ w.r.t. the pseudo equilibrium measure; (ii) an error term due to the fact that the pseudo equilibrium measure is not the true equilibrium measure.

Since the logarithmic interactions between far away particles can be approximated by a mean-field potential obtained from the local density, the error term in (ii) can be controlled if we know the local density of particles w.r.t. the distribution $f_{t}\mu$. The precise conditions are the Assumptions (1)–(3). In the special case of $\beta=1,2$, when the DBM is generated by symmetric or Hermitian matrix ensembles, these assumptions will be verified in Lemma 2.2 by using the local semicircle law; the case of $\beta=4$ is similar and some details are given in . For other values of $\beta$ it is an open question to verify the corresponding assumptions. Given that the error term in (ii) can be bounded, we obtain an estimate on the Dirichlet form of $f_{t}\mu$ w.r.t. the pseudo equilibrium measure. The key question is whether this estimate alone is sufficient to pin down the local statistics. For this purpose, we note that the Dirichlet form w.r.t. $\omega$ generates a new gradient flow, the local relaxation flow. The global relaxation time of the local relaxation flow, determined by the convexity of the pseudo equilibrium measure, is much shorter than that of the standard DBM. This leads to strong estimates on the local relaxation flow and in particular, it identifies the local statistics. The details of the entropy estimates and the local relaxation flow will appear in Section 3. We now state the main application of Theorem 2.1, the universality of symmetric Wigner ensembles.

2.2 Universality of symmetric Wigner ensemble

We remark that (2.15) implies that $\nu$ has a Gaussian decay, i.e.

for some $\delta_{0}>0$. We require that $\widetilde{\nu}$ also satisfies (2.15). In this paper, all conditions and statements involving $\nu$ apply to $\widetilde{\nu}$ as well, but for simplicity of presentation we will not mention $\widetilde{\nu}$ every time.

where $\gamma$ is the standard Gaussian distribution with variance one. For the diagonal elements, the Ornstein-Uhlenbeck process should be replaced by the one reversible w.r.t. the Gaussian measure with variance two, due to the convention that the variances of the diagonal elements are equal to two. The Ornstein-Uhlenbeck process (2.17) induces a stochastic process on the eigenvalues; it is well-known that the process on the eigenvalues is given by the DBM (2.5) with $\beta=1$. Notice that we used the Ornstein-Uhlenbeck process so that the resulting DBM is reversible w.r.t. $\mu$.
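The passage from the matrix process to the eigenvalue process can be sketched numerically. The Python code below assumes that each off-diagonal entry follows the Ornstein-Uhlenbeck process with generator $\frac{1}{2}\partial_{x}^{2}-\frac{x}{2}\partial_{x}$ (reversible w.r.t. the standard Gaussian), that the diagonal entries follow the analogous process with stationary variance two and the same relaxation rate, and that the matrix is $N^{-1/2}$ times the entry array. Under these assumptions the entries at time $t$ can be sampled exactly. The uniform initial distribution is only a convenient non-Gaussian example and is not claimed to satisfy the hypotheses of the paper; all function names are hypothetical.

```python
import numpy as np

def sample_goe_unscaled(n, rng):
    """Symmetric Gaussian array: variance 1 off the diagonal, 2 on the diagonal."""
    a = rng.standard_normal((n, n))
    return (a + a.T) / np.sqrt(2.0)

def sample_symmetric_uniform(n, rng):
    """Symmetric array with uniform entries: variance 1 off the diagonal, 2 on it."""
    off = np.triu(rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), (n, n)), k=1)
    diag = np.diag(rng.uniform(-np.sqrt(6.0), np.sqrt(6.0), n))
    return off + off.T + diag

def ou_step(x, t, rng):
    """Exact OU transition over time t: e^{-t/2} x + sqrt(1 - e^{-t}) * Gaussian."""
    decay = np.exp(-t / 2.0)
    return decay * x + np.sqrt(1.0 - decay**2) * sample_goe_unscaled(x.shape[0], rng)

def eigenvalues(x):
    """Ordered eigenvalues of the rescaled matrix x / sqrt(N)."""
    return np.linalg.eigvalsh(x / np.sqrt(x.shape[0]))

rng = np.random.default_rng(2)
n = 200
x_0 = sample_symmetric_uniform(n, rng)
x_t = ou_step(x_0, t=n ** (-0.5), rng=rng)
eigs_t = eigenvalues(x_t)
```

Because the transition is sampled exactly, there is no time-discretization error at the matrix level; the eigenvalue path between time $0$ and $t$ is not resolved by this sketch.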

Our goal is to apply Theorem 2.1 with $\beta=1$ and for this purpose, we need to verify Assumptions 1-3. Assumption 3 follows from the local semicircle law, Theorem 5.1, stated later in Section 5. Assumptions 1 and 2 can be verified if the measure $\nu$ satisfies the logarithmic Sobolev inequality (2.15). The precise statement is the following lemma.

Suppose the assumption (2.15) on the distribution $\nu$ of the matrix elements holds. Then there are positive numbers ${\mathfrak{a}}$, ${\mathfrak{b}}$ and ${\mathfrak{c}}$, depending on $\theta$ from (2.15), such that (2.10) and (2.11) hold.

From this lemma, for symmetric Wigner matrices whose matrix element distributions satisfy the LSI, the assumptions of Theorem 2.1 are satisfied. Hence the correlation functions w.r.t. $f_{t}\mu$ and the GOE equilibrium measure $\mu^{(\beta=1)}_{N}$ are identical in the large $N$ limit for some $t=N^{-\zeta}$ in the sense that (2.13) holds. Together with the reverse heat flow argument, we have the following universality theorem for local statistics of Wigner ensembles whose matrix element distribution is smooth and satisfies the logarithmic Sobolev inequality. Denote by $p_{N}^{(k)}$ the correlation functions of the eigenvalues of the symmetric Wigner ensemble. Let $p_{N,GOE}^{(k)}$ be the correlation functions of the eigenvalues of GOE, i.e., the correlation functions of the equilibrium measure $\mu^{(\beta=1)}_{N}$. It is well-known that $p_{N,GOE}^{(k)}$ can be computed explicitly (see, e.g. Section 7 of ).

Suppose the distribution $\nu$ for the matrix elements satisfies the logarithmic Sobolev inequality (2.15). Assume that $\nu$ has a positive density $\nu(x)=e^{-U(x)}$ such that for any $j$ there are constants $C_{1},C_{2}$, depending on $j$, such that

Then for any $|E|<2$ and for any $k\geq 1$, we have

Theorem 2.3 is a simple corollary of Theorem 2.1 and the method of the reverse heat flow . It will be proved briefly in Section 4. Though we stated the universality in terms of correlation functions, it also holds for the eigenvalue gap distribution and we omit the obvious statement (the analogous statement for the Hermitian case was formulated in Theorem 1.2 of ).

In the following corollary, by using Theorem 15 of , we remove all assumptions from Theorem 2.3 except for a decay condition and a technical condition that $\nu$ is supported in at least three points. This latter technical condition was removed in our later paper , where we generalized our approach to a broader class of random matrix ensembles.

Suppose the distribution $\nu$ of the matrix elements has mean zero, variance one and a tail with a subexponential decay, i.e. it satisfies that

for some constants $C,\mathfrak{q}>0$. Assume that $\nu$ is supported in at least three points. Then the conclusion (2.19) of Theorem 2.3 holds.

Proof. Let $m_{j}$ denote the moments of $\nu$

(ii) the derivative bounds (2.18) hold, and (iii) the logarithmic Sobolev inequality (2.15) holds. It is easy to argue that such a measure $\widehat{\nu}$ exists. Consider the space of all measures satisfying (2.18) with a finite LSI constant. Since the condition (2.18) and the finite LSI constant condition are preserved under small smooth perturbations, which form an infinite dimensional family, there is enough freedom to choose perturbations so as to match the first four moments as long as $m_{4}>m_{3}^{2}+1$. An elementary detailed proof of this fact is given in Lemma C.1. of . Therefore, $\widehat{\nu}$ satisfies the assumption of Theorem 2.3 and thus (2.19) holds for the measure $\widehat{\nu}$. Recall that Theorem 15 in asserts that the local eigenvalue statistics for matrices whose matrix element distributions match up to the first four moments are the same in the limit $N\to\infty$ (strictly speaking, this theorem was proved only for Hermitian matrices, but the parallel version for symmetric ensembles holds as well, see the remark at the end of Section 1.6 in ). This proves the corollary.
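The condition $m_{4}>m_{3}^{2}+1$ can be explored numerically. For a law with mean zero and variance one, the non-strict inequality $m_{4}\geq m_{3}^{2}+1$ is a classical fact, with equality exactly for laws supported on two points, which is consistent with the requirement of at least three support points. The Python sketch below evaluates the gap $m_{4}-m_{3}^{2}-1$ for a few discrete laws after standardizing them; the function names and the example distributions are hypothetical choices for illustration.

```python
import numpy as np

def standardized_moments(points, probs):
    """Third and fourth moments of a discrete law after centering and scaling to variance 1."""
    points = np.asarray(points, dtype=float)
    probs = np.asarray(probs, dtype=float)
    mean = np.sum(probs * points)
    std = np.sqrt(np.sum(probs * (points - mean) ** 2))
    z = (points - mean) / std
    return float(np.sum(probs * z**3)), float(np.sum(probs * z**4))

def moment_gap(points, probs):
    """The quantity m_4 - m_3^2 - 1, which must be positive for the matching argument."""
    m3, m4 = standardized_moments(points, probs)
    return m4 - m3**2 - 1.0

gap_symmetric_two_point = moment_gap([-1.0, 1.0], [0.5, 0.5])
gap_skewed_two_point = moment_gap([0.0, 1.0], [0.8, 0.2])
gap_three_point = moment_gap([-1.0, 0.0, 1.0], [0.25, 0.5, 0.25])
print(gap_symmetric_two_point, gap_skewed_two_point, gap_three_point)
```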

Pseudo equilibrium measure and Entropy Dissipation Estimates

The key idea to prove Theorem 2.1 is an estimate on the time to local equilibrium for the DBM. However, to estimate this time to local equilibrium, we need to introduce a different flow, the local relaxation flow, defined as the gradient flow of the pseudo equilibrium measure. The pseudo equilibrium measure is a measure which has the local statistics of the $\beta$ ensemble but has a strong convexity property. Fix a positive number $\eta$ with $N^{-1/6}\ll\eta\ll 1$, and for the rest of this paper let $\varepsilon>0$ be a small positive number which we will not specify. Let $\gamma_{j}^{\pm}:=\gamma_{j}\pm\eta N^{-\varepsilon}$ and define the mean field potential of eigenvalues far away from the $j$-th one as

where the summation is over all $k\in\{1,2,\ldots,N\}$ such that $|k-j|>N\eta$. For $x\geq\gamma_{j}^{+}$, we extend $W_{j}$ by

and similarly for $x\leq\gamma_{j}^{-}$. In other words, $W_{j}$ is just the simplest convex extension of the function defined by (3.1) on $I_{j}$. This modification will avoid the singularities at $x=\gamma_{k}$. Notice that this is purely a technical device since we will show in (5.26) of Proposition 5.9 that the probability of the regime $I_{j}^{c}$ is negligible in the sense that

The pseudo equilibrium measure ${\omega}_{N}={\omega}$ is defined by

Recall that the relative entropy with respect to a measure $\lambda$ is defined by

The local relaxation flow is defined to be the reversible dynamics w.r.t. $\omega$ characterized by the generator $\widetilde{L}$ defined by

for $x\in I_{j}$. Note that for any $k$ with $|k-j|>N\eta$, we have $\gamma_{k}\not\in 2I_{j}$, where $2I_{j}$ is the doubling of the interval $I_{j}$. Moreover, for $k=j\pm N\eta$ we have $|\gamma_{k}-\gamma_{j}|\leq C\eta^{2/3}$ for some constant $C$, and so $|x-\gamma_{k}|\leq C\eta^{2/3}$ for $x\in I_{j}$. Thus we obtain that

with some positive constant $c$, using $\beta\geq 1$. Since $W_{j}$ was defined by a convex extension outside $I_{j}$, the same bound holds for any $x$:

i.e., the mean field potential is uniformly convex with the convexity bound given in (3.8).

The potential $W$ is chosen to satisfy the two convexity properties (3.8) and (3.18), and there are many other possible choices for $W$. For example, without changing the form of $W$ given in (3.1), a more natural choice for $\gamma_{j}$ would be

This may somewhat improve the constant in the estimate (2.10), but the analysis is more complicated and we will not pursue this choice in this paper.

The following theorem is our main result on the local ergodicity of DBM.

Suppose that $S_{\mu}(f_{0}|\psi)\leq CN^{m}$ for some $m$ fixed. Let $\tau:=\eta^{1/3}N^{\delta}$ for some $\delta>0$. Define

Then for any $J\subset\{1,2,\ldots,N-n\}$ we have

We emphasize that Theorem 3.1 applies to all $\beta\geq 1$ ensembles and the only assumption concerning the distribution $f_{t}$ is in (3.9). Notice that the first error term becomes large for $\delta$ large, i.e., if $\tau$ is large. The first ingredient to prove Theorem 3.1 is the analysis of the local relaxation flow. The following theorem shows that the local relaxation flow satisfies an entropy dissipation estimate and its equilibrium measure satisfies the logarithmic Sobolev inequality.

(Dirichlet Form Dissipation Estimate) Suppose (3.8) holds. Consider the equation

with reversible measure $\omega$. Set $R:=\eta^{1/6}$. Then we have

with a universal constant $C$. Thus the relaxation time to equilibrium is of order $R^{2}=\eta^{1/3}$ and we have

The notation $R=\eta^{1/6}$ was introduced so that this result and Theorem 4.2 in are identical. The scale parameter $R$ has a meaning in , but it is purely a choice of convention here. The proof given below follows the argument in and it was outlined in this context in Section 5.1 of . The new observation is the additional second term on the r.h.s. of (3.13), corresponding to “local Dirichlet form dissipation”. The estimate (3.14) on this additional term will play a key role in this paper.

Proof. In it was shown that, with the notation $h=\sqrt{q}$, we have

imply that the Hessian of $\widetilde{\mathcal{H}}$ is bounded from below as

with some positive constant $C$. This proves (3.13) and (3.14) since $R^{2}=\eta^{1/3}$. Inserting the inequality

and integrating the resulting equation, we prove (3.15). Inserting (3.15) into (3.19) we have

The proof of (3.17) requires an integration by parts and the boundary term at xi=xjx_{i}=x_{j} (explained in Section 5.1. of ) should vanish. In the Appendix we will justify this technical step.

with some constant $C$ depending only on $G$.

Proof. Without loss of generality, we consider only the case $J=\{1,\ldots,N-n\}$. Let $q_{t}$ satisfy

with an initial condition $q_{0}$. We first compare $q_{\tau}$ with $q_{\infty}=1$. Using the entropy inequality,

and the exponential decay of the entropy (3.16), we have

To compare $q_{0}$ with $q_{\tau}$, by differentiation, we have

From the Schwarz inequality and $\partial q=2\sqrt{q}\partial\sqrt{q}$ the last term is bounded by

since $G$ is smooth and compactly supported. This proves the Lemma.

Notice that if we use only the entropy dissipation and the Dirichlet form, the main term on the right hand side of (3.20) will become $C\sqrt{S_{\omega}(q)\tau}$. Hence by exploiting the Dirichlet form dissipation coming from the second term on the r.h.s. of (3.13), we gain the crucial factor $N^{-1/2}$ in the estimate.

The second ingredient to prove Theorem 3.1 is the following entropy and Dirichlet form estimates.

(Entropy and Dirichlet Form Estimates) Suppose the assumptions of Theorem 3.1 hold. Recall that $\tau=\eta^{1/3}N^{\delta}$ and define $g_{t}:=f_{t}/\psi$ so that $S_{\mu}(f_{t}|\psi)=S_{\omega}(g_{t})$. Then the entropy and the Dirichlet form satisfy the estimates:

Proof. First we need the following relative entropy identity from .

Let $f_{t}$ be a probability density satisfying $\partial_{t}f_{t}=Lf_{t}$. Then for any probability density $\psi_{t}$ we have

In our setting, $\psi$ is independent of $t$ and $L$ satisfies (3.6). Hence we have

Since the middle term on the right hand side vanishes, we have from the Schwarz inequality

Together with the LSI (3.15) and (3.9), we have

for $t\leq\tau$. Since $S_{\omega}(g_{0})=S_{\mu}(f_{0}|\psi)\leq CN^{m}$ and $\tau/2\gg R^{2}$, the last inequality proves (3.22).

Integrating (3.25) from $t=\tau/2$ to $t=\tau$ and using the monotonicity of the Dirichlet form in time, we have proved (3.23) with the choice of $\tau$.

Proof of Theorem 3.1. Fix $\tau=R^{2}N^{\delta}=\eta^{1/3}N^{\delta}$ and let $q_{0}:=g_{\tau}=f_{\tau}/\psi$ with $f_{0}$ satisfying the assumption of Theorem 3.5, i.e., $S_{\mu}(f_{0}|\psi)\leq CN^{m}$ for some $m$ and (3.9) holds. Using (3.23), we have

Clearly, equation (3.27) also holds for the special choice $f_{0}=1$ (for which $f_{\tau}=1$), i.e. local statistics of $\mu$ and ${\omega}$ can be compared. Hence we can replace the measure $\omega$ in (3.27) by $\mu$ and this proves Theorem 3.1.

Proof of Theorem 2.3 and Theorem 2.1

We first prove Theorem 2.3 assuming that Theorem 2.1 holds. Our main tool is the reverse heat flow argument from . Recall that the distribution of the matrix element is given by a measure $\nu$ and the generator of the Ornstein-Uhlenbeck process is $A$ (2.17). The probability distribution of all matrix elements is $\nu^{\otimes n}$, $n=N^{2}$. The joint probability distribution of the matrix elements at time $t$ as every matrix element evolves under the Ornstein-Uhlenbeck process is given by

where we recall that $\gamma$ is the standard Gaussian measure.

Fix a positive integer $K$. Suppose that $\nu=u{\rm d}\gamma$ satisfies the subexponential decay condition (2.20) and the regularity condition (2.18) for all $j\leq K$. Then there is a small constant $\alpha_{K}$, depending only on $K$, such that for any positive $t\leq\alpha_{K}$ there exists a probability density $g_{t}$ w.r.t. $\gamma$ with mean zero and variance one such that

for some $C>0$ depending on $K$. Furthermore, $g_{t}$ can be chosen such that if the logarithmic Sobolev inequality (2.15) holds for the measure $\nu=u\gamma$, then it holds for $g_{t}\gamma$ as well, with the logarithmic Sobolev constant changing by a factor of at most $2$.

Furthermore, let ${\cal A}=A^{\otimes n}$, $F=u^{\otimes n}$ with $n=N^{2}$ and set $G_{t}:=g_{t}^{\otimes n}$. Then we also have

Proof. This proposition can be proved following the reverse heat flow idea from . Define $\theta(x)=\theta_{0}(t^{\alpha}x)$ with some small positive $\alpha>0$ depending on $K$, where $\theta_{0}$ is a smooth cutoff function satisfying $\theta_{0}(x)=1$ for $|x|\leq 1$ and $\theta_{0}(x)=0$ for $|x|\geq 2$. Set

By assumption (2.18), $h_{s}$ is positive and

for any $s\leq t$ if $t$ is small enough.

Define $v_{s}=e^{sA}h_{s}$ and by definition, $v_{0}=u$. Then

Since the Ornstein-Uhlenbeck flow is a contraction in $L^{1}({\rm d}\gamma)$, together with (2.18), we have

Notice that $h_{t}$ may not be normalized as a probability density w.r.t. $\gamma$ but it is easy to check that there is a constant $c_{t}=1+O(t^{M})$, for any $M>0$, such that $c_{t}h_{t}$ is a probability density. Clearly,

and the same formulas hold if $h_{t}$ is replaced by $v_{t}$ since the OU flow preserves expectation and variance. Let $g_{t}$ be defined by

Then $g_{t}$ is a probability density w.r.t. $\gamma$ with zero mean and variance $1$. It is easy to check that the total variation norm of $h_{t}-g_{t}$ is smaller than any power of $t$. Using again the contraction property of $e^{tA}$ and (4.4), we get

Now we check the LSI constant for $g_{t}$. Recall that $g_{t}$ was obtained from $h_{t}$ by translation and dilation. By definition of the LSI constant, the translation does not change it. The dilation changes the constant, but since our dilation constant is nearly one, the LSI constant changes by a factor that is also nearly one. So we only have to compare the LSI constants between ${\rm d}\nu=u{\rm d}\gamma$ and $c_{t}h_{t}{\rm d}\gamma$. From (4.3) and the fact that $c_{t}$ is nearly one, the LSI constant changes by a factor less than $2$. This proves the claim on the LSI constant.

and this completes the proof of Proposition 4.1.

We now apply Theorem 2.1 to the initial distribution given by the eigenvalues of the symmetric Wigner ensemble with distribution $g_{\tau}\gamma$ where $\tau=N^{-\zeta}$. By Proposition 4.1, the LSI constant of $g_{\tau}\gamma$ is bounded by the initial LSI constant of $\nu$ by a factor of at most two. Thus we can apply Lemma 2.2 to verify Assumptions 1 and 2 of Theorem 2.1. Assumption 3 follows from the local semicircle law, Theorem 5.1. Thus the correlation functions of the eigenvalues of the ensemble with distribution $(e^{\tau A}g_{\tau})\gamma$ are the same as those of GOE in the sense of (2.13). Finally, using (4.2), we can approximate the $k$-point correlation function w.r.t. $(e^{\tau A}g_{\tau})\gamma$ by the one w.r.t. $\nu$ after choosing $K$ sufficiently large so that $N^{2}\tau^{K}=N^{2-K\zeta}=o(N^{-k})$. The additional smallness factor $N^{-k}$ for the estimate on the total variation in (4.2) is necessary to conclude the convergence of the $k$-point correlation function, since it is rescaled by a factor $N$ in each variable. We also used the trivial fact that the total variation distance of two eigenvalue distributions is bounded by the total variation distance of the distributions of the full matrix ensembles. Finally we remark that the $b\to 0$ limit in (2.19) is needed to replace $\varrho_{sc}(E)$ in (2.13) with $\varrho_{sc}(E^{\prime})$ in (2.19) using the continuity of $O$. This concludes the proof of Theorem 2.3.
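The choice of $K$ above is simple exponent arithmetic: $N^{2-K\zeta}=o(N^{-k})$ holds exactly when $2-K\zeta<-k$, that is, when $K>(k+2)/\zeta$. The small Python helper below, whose name is hypothetical, returns the smallest such integer. It ignores the dependence of constants on $K$ and is meant only to make the bookkeeping explicit.

```python
import math

def smallest_admissible_K(k, zeta):
    """Smallest integer K with 2 - K * zeta < -k, i.e. K > (k + 2) / zeta."""
    return math.floor((k + 2) / zeta) + 1

# Example: for the 1-point function with zeta = 1/2, K must exceed 6.
K_example = smallest_admissible_K(1, 0.5)
print(K_example)
```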

Proof of Theorem 2.1. Step 1. The first step is to show that the right hand side of (3.11) vanishes in the large $N$ limit for $\eta=N^{-\varepsilon_{3}}$ with $\varepsilon_{3}$ small enough provided that the estimates (2.10), (2.11) hold. By (2.11), $x_{j}\in I_{j}$ (recall the definition of $I_{j}$ from (3.1)) with a very high probability. In this paper we will say that an event holds with a very high probability if the complement event has a probability that is subexponentially small in $N$, i.e., it is bounded by $C\exp(-N^{\varepsilon})$ with some fixed $\varepsilon>0$. From the definition of $b_{j}$ (3.6), we have

Notice that the function $g(x):=\frac{\mathrm{sgn}(x)}{|x|+\eta}$ satisfies

as long as $x$ and $y$ have the same sign. In our case, $x_{j}-x_{k}$ and $x_{j}-\gamma_{k}$ have the same sign as long as

The last inequality holds with a very high probability due to (2.11) provided $\varepsilon_{3}$ is smaller than ${\mathfrak{b}}$. We remark that this is the only place where we used Assumption 2. Thus, with a very high probability, we have

The contribution to $\Lambda$ of the exceptional event is negligible, since its probability is subexponentially small in $N$ and $|b_{j}|\leq C\eta\leq CN^{\varepsilon_{3}}$. Thus, recalling the definition of $Q$ from (2.9) and the definition of $\Lambda$ from (3.9), we can bound the error term on the right hand side of (3.11) by

provided that (2.10) holds and $\varepsilon_{3}$ and $\delta$ are small enough, depending on ${\mathfrak{a}}$.

Step 2. From (3.11) to correlation functions. Equation (3.11) shows that for a special class of observables, depending only on rescaled differences of the points xjx_{j}, the expectations w.r.t. ftμf_{t}\mu and w.r.t. the equilibrium measure μ\mu are identical in the large NN limit. But the class of observables in (2.13) of Theorem 2.1 is somewhat bigger and we need to extend our result to them. Without the dE{\rm d}E^{\prime} integration in (2.13), the observable would strongly depend on a fixed energy EE^{\prime} and could not be approximated by observables depending only on differences of xjx_{j}. A small averaging in EE^{\prime} remedies this problem.

We will consider EE, bb and nn fixed, i.e., the constants in this proof may depend on these three parameters. We start with the identity

We will set Yi,m=0Y_{i,{\bf{m}}}=0 if i+mn>Ni+m_{n}>N. We have to show that

Let MM be an NN-dependent parameter chosen at the end of the proof. Let

and note that Sn(M)Mn1|S_{n}(M)|\leq M^{n-1}. To prove (4.12), it is sufficient to show that

hold for any τη1/3Nδ\tau\geq\eta^{1/3}N^{\delta} (note that τ=\tau=\infty corresponds to the equilibrium, f=1f_{\infty}=1), where η1/3Nδ\eta^{1/3}N^{\delta} is chosen in Theorem 2.1 and η\eta is chosen in Step 1.

Case 1: Small m{\bf{m}} case; proof of (4.13).

After performing the dE{\rm d}E^{\prime} integration, we will eventually apply Theorem 3.1 to the function

For any EE and 0<ξ<b0<\xi<b define sets of integers J=JE,b,ξJ=J_{E,b,\xi} and J±=JE,b,ξ±J^{\pm}=J^{\pm}_{E,b,\xi} by

where γi\gamma_{i} was defined in (2.8). Clearly JJJ+J^{-}\subset J\subset J^{+}. With these notations, we have

The error term ΩJ,m+\Omega^{+}_{J,{\bf{m}}}, defined by (4.16) indirectly, comes from those i∉J+i\not\in J^{+} indices, for which xi[Eb,E+b]+O(N1)x_{i}\in[E-b,E+b]+O(N^{-1}) since Yi,m(E,x)=0Y_{i,{\bf{m}}}(E^{\prime},{\bf{x}})=0 unless xiEC/N|x_{i}-E^{\prime}|\leq C/N, the constant depending on the support of OO. Thus

for any sufficiently large NN assuming ξ1/N\xi\gg 1/N and using that OO is a bounded function. The additional N1N^{-1} factor comes from the dE{\rm d}E^{\prime} integration. Taking the expectation with respect to the measure fτdμf_{\tau}{\rm d}\mu and applying the Schwarz inequality, we get

using Assumption 1 (2.10). We can also estimate

where the error term ΞJ,m+\Xi^{+}_{J,{\bf{m}}}, defined by (4.18), comes from indices iJi\in J^{-} such that xi∉[Eb,E+b]+O(1/N)x_{i}\not\in[E-b,E+b]+O(1/N). It satisfies the same bound (4.17) as ΩJ,m+\Omega^{+}_{J,{\bf{m}}}. By the continuity of ϱ\varrho, the density of γi\gamma_{i}’s is bounded by CNCN, thus J+JCNξ|J^{+}\setminus J^{-}|\leq CN\xi and JJCNξ|J\setminus J^{-}|\leq CN\xi. Therefore, summing up the formula (4.15) for iJi\in J, we obtain from (4.16) and (4.18)

for each mSn{\bf{m}}\in S_{n}. A similar lower bound can be obtained analogously, and after choosing ξ=N1/4\xi=N^{-1/4}, we obtain

Adding up (4.19) for all mSn(M){\bf{m}}\in S_{n}(M), we get

and the same estimate holds for the equilibrium, i.e., if we set τ=\tau=\infty in (4.20). Subtracting these two formulas and applying (3.11) from Theorem 3.1 to each summand on the second term in (4.19) and using (4.10), we conclude that

we obtain that (4.21) vanishes as NN\to\infty, and this proves (4.13).

Case 2: Large m{\bf{m}} case; proof of (4.14).

as NN\to\infty. Inserting this into (4.24), this completes the proof of (4.14) and the proof of Theorem 2.1.

Local semicircle law and proof of Lemma 2.2

We first recall the local semicircle law concerning the eigenvalues x1<x2<<xNx_{1}<x_{2}<\ldots<x_{N} of HH. Let

be the Stieltjes transform of the empirical eigenvalue distribution at spectral parameter z=E+iηz=E+i\eta, η>0\eta>0, and let

be the Stieltjes transform of the semicircle distribution. In Theorem 4.1 of we proved the following version of the local semicircle law (we remark that, contrary to what is stated in Theorem 4.1 of , condition (2.5) of is not needed and has not been used in the proof):

Assume that the distribution ν\nu of the matrix elements of the symmetric Wigner matrix ensemble satisfies (2.15) with some constant θ\theta and assume that yy is such that (logN)4/Ny1(\log N)^{4}/N\leq|y|\leq 1. Then for any K>0K>0 there exist positive constants δ0\delta_{0}, CC and c>0c>0, depending only on KK and θ\theta, such that for any xK|x|\leq K we have

for all 0δδ00\leq\delta\leq\delta_{0} and for all NN large enough.
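As a numerical companion to the definitions preceding the theorem, the following sketch evaluates the Stieltjes transform of the semicircle distribution. It assumes the normalization ϱsc(x)=12π(4x2)+\varrho_{sc}(x)=\frac{1}{2\pi}\sqrt{(4-x^{2})_{+}} on [2,2][-2,2] and the convention msc(z)=ϱsc(x)(xz)1dxm_{sc}(z)=\int\varrho_{sc}(x)(x-z)^{-1}{\rm d}x, under which mscm_{sc} is the root of m2+zm+1=0m^{2}+zm+1=0 with positive imaginary part. The displayed definition is not reproduced in this text, so these conventions, the function names and the quadrature check are assumptions.

```python
import cmath
import math


def rho_sc(x: float) -> float:
    """Semicircle density on [-2, 2] (assumed normalization)."""
    return math.sqrt(max(4.0 - x * x, 0.0)) / (2.0 * math.pi)


def m_sc(z: complex) -> complex:
    """Root of m^2 + z*m + 1 = 0 with Im m > 0, valid for Im z > 0."""
    root = cmath.sqrt(z * z - 4.0)
    m_plus = (-z + root) / 2.0
    m_minus = (-z - root) / 2.0
    # The two roots multiply to 1, so exactly one has positive imaginary part.
    return m_plus if m_plus.imag > 0 else m_minus


def m_sc_quadrature(z: complex, n: int = 20000) -> complex:
    """Midpoint-rule approximation of the integral of rho_sc(x)/(x - z) over [-2, 2]."""
    h = 4.0 / n
    total = 0j
    for i in range(n):
        x = -2.0 + (i + 0.5) * h
        total += rho_sc(x) / (x - z)
    return total * h
```

Taking the imaginary part at small η\eta recovers the density, for example `m_sc(1e-6j).imag / math.pi` is close to `rho_sc(0.0)`. This is the mechanism by which control of m(z)m(z) at scale η\eta translates into control of the eigenvalue density at that scale.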

As a corollary of the local semicircle law, the number of eigenvalues up to a fixed energy EE can be estimated. The precise statement is the following proposition and it was proven in Proposition 4.2, equation (4.14) of .

Assume that the distribution ν\nu of the matrix elements of the symmetric Wigner matrix ensemble satisfies (2.15) with some finite constant θ\theta. Let

be the expectation of the empirical distribution function of the eigenvalues and recall the definition of nsc(E)n_{sc}(E) from (2.7). Then there exists a constant C>0C>0, depending only on the constant θ\theta in (2.15), such that

The local semicircle law implies that the local density of eigenvalues is bounded, but the estimate in Theorem 5.1 deteriorates near the spectral edges. The following upper bound on the number of eigenvalues in an interval provides uniform control near the edges. This lemma was essentially proved in Theorem 4.6 of using ideas from an earlier version, Theorem 5.1 of . For the convenience of the reader, a detailed proof is given in Appendix C.

We remark that for Theorem 5.1 and Lemma 5.3 it is sufficient to assume only the Gaussian decay condition (2.16) instead of the logarithmic Sobolev inequality (2.15).

We can now start to prove Lemma 2.2. Since the logarithmic Sobolev inequality holds for ν\nu (2.15), it also holds for νt\nu_{t}; for a proof see Lemma B.1 in Appendix B and recall that νt\nu_{t} is the convolution of ν\nu with the Ornstein-Uhlenbeck kernel, which itself satisfies the logarithmic Sobolev inequality. Moreover, the LSI constant of νt\nu_{t} is bounded uniformly for all t>0t>0, since it is the maximum of the LSI constant θ\theta of ν\nu and the LSI constant of the Ornstein-Uhlenbeck kernel, which is bounded uniformly in time. Therefore Lemma 2.2 follows immediately from its time independent version:

Suppose that the distribution ν\nu of the matrix elements of the symmetric Wigner ensemble satisfies (2.15) with some finite constant θ\theta. Then there exist positive constants a{\mathfrak{a}}, b{\mathfrak{b}} and c{\mathfrak{c}} such that

hold for the eigenvalues xjx_{j} of HH and for any NN0=N0(θ,a,b,c)N\geq N_{0}=N_{0}(\theta,{\mathfrak{a}},{\mathfrak{b}},{\mathfrak{c}}).

The proof of Lemma 5.4 is divided into two steps. In the first step, Section 5.1, we estimate the fluctuation of the eigenvalues xjx_{j} around their mean values using the logarithmic Sobolev inequality. In the second step, Section 5.2, we estimate the deviation of the mean location of xjx_{j} from the classical location γj\gamma_{j} using (5.3).

the expected location of xjx_{j}. We start with an estimate on the expected location of the extreme eigenvalues:

Suppose that the probability measure ν\nu of the matrix entries satisfies

for some c0>0c_{0}>0 (this condition is satisfied, in particular, under (2.15), see (2.16)). Then for any δ>0\delta>0 we have

with some constant C0C_{0} depending on δ\delta and c0c_{0}.

Proof. For any MM, define the probability measure

Denote by ζMN=ijζM\zeta_{M}^{N}=\bigotimes_{i\leq j}\zeta_{M} (νN\nu^{N} resp.) the probability law of the random matrices whose matrix elements are distributed according to ζM\zeta_{M} (ν\nu resp.). As usual, we neglect the fact that distribution ν\nu should be replaced with ν~\widetilde{\nu} for the diagonal elements i=ji=j. Since the number of index pairs iji\leq j is of order N2N^{2}, the total variational norm between ζMN\zeta_{M}^{N} and νN\nu^{N} is bounded by

From Theorem 1.4 of we obtain that for any δ>0\delta>0

holds almost surely w.r.t. ζMN\zeta_{M}^{N}. It follows from (5.11) that xNx_{N} is bounded w.r.t. the distribution νN\nu^{N} as well, up to a subexponentially small probability. To estimate the tail of xNx_{N} w.r.t. νN\nu^{N}, we use that \max_{j}|x_{j}|^{2}\leq\mbox{Tr}\,H^{2} and the trivial large deviation bound based upon (5.8),

that holds for any K>0K>0 with constants C,cC,c depending on c0c_{0}. We thus obtain that the expectations of xNx_{N} w.r.t. these two measures satisfy

Thus we have proved that, for any δ>0\delta>0,

with some constant CC depending on c0c_{0} and δ\delta. A similar lower bound holds for α1\alpha_{1}.
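The total variation step in the proof above relies on a standard fact: the total variation distance between two nn-fold product measures is at most nn times the distance between the single factors. With nn of order N2N^{2} matrix entries, this gives the factor N2N^{2} in the bound. The sketch below checks this subadditivity by brute-force enumeration on a small discrete example; the distributions and helper names are hypothetical and serve only as an illustration.

```python
from itertools import product


def tv(p, q):
    """Total variation distance between two discrete laws on the same finite set."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def product_law(p, n):
    """Probabilities of all n-tuples of i.i.d. draws from p, in lexicographic order."""
    probs = []
    for outcome in product(range(len(p)), repeat=n):
        w = 1.0
        for i in outcome:
            w *= p[i]
        probs.append(w)
    return probs


def check_subadditivity(p, q, n):
    """Return (TV of the n-fold products, n times the TV of the factors)."""
    return tv(product_law(p, n), product_law(q, n)), n * tv(p, q)
```

Since both product laws are enumerated in the same order, `tv` compares them entry by entry; the enumeration has `len(p) ** n` terms, so this is only feasible for small examples.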

Next we estimate the fluctuations of xjx_{j}:

with a constant CC depending on ε\varepsilon and θ\theta.

Proof. First order perturbation theory of the eigenvalue xjx_{j} of HH shows that

Using (2.15) and the Bobkov-Götze concentration inequality (Theorem 2.1 of ), we have for any T>0T>0

after optimizing for TT and using that xjCN1/2|\nabla x_{j}|\leq CN^{-1/2} from above. This proves (5.14).
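The optimization over TT in the last step follows the usual Chernoff pattern. The display below is a sketch under stated assumptions: FF stands for xjx_{j} viewed as a function of the matrix entries, the exponential moment bound is written with an unspecified constant cc depending on the normalization of the logarithmic Sobolev inequality, and the displays of the proof itself are not reproduced here.

```latex
% Assumed exponential moment bound (generic form; c is an unspecified constant):
\mathbb{E}\, e^{T(F-\mathbb{E}F)} \le \exp\!\big(c\,\theta\,T^{2}\,\|\nabla F\|_{\infty}^{2}\big)
\qquad \text{for all } T>0 .
% Markov's inequality then gives, for any t>0,
\mathbb{P}\big(F-\mathbb{E}F\ge t\big)
  \le \exp\!\big(-Tt + c\,\theta\,T^{2}\,\|\nabla F\|_{\infty}^{2}\big).
% Optimizing with T = t / (2 c \theta \|\nabla F\|_\infty^2):
\mathbb{P}\big(F-\mathbb{E}F\ge t\big)
  \le \exp\!\Big(-\frac{t^{2}}{4\,c\,\theta\,\|\nabla F\|_{\infty}^{2}}\Big)
  \le \exp\!\big(-c'\,N\,t^{2}\big)
\qquad \text{if } \|\nabla F\|_{\infty}^{2}\le C N^{-1}.
```

Applying the same argument to F-F gives the two-sided bound, so fluctuations of xjx_{j} on scales much larger than N1/2N^{-1/2} have subexponentially small probability.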

The following proposition is a refinement of Proposition 5.6:

Fix a sufficiently small constant δ>0\delta>0 and set κ=N1/18+δ\kappa=N^{-1/18+\delta}. Then for any index ii with CNκ3/2iN(1Cκ3/2)CN\kappa^{3/2}\leq i\leq N(1-C\kappa^{3/2}) we have

The constants CC and cc depend on δ\delta and θ\theta but are independent of NN.

As a preparation for the proof of Proposition 5.7, we need the following estimate on the tail of the gap distribution.

Let E<2|E|<2. Denote by xαx_{\alpha} the largest eigenvalue below EE and assume that αN1\alpha\leq N-1. Then there exist positive constants CC and cc, depending only on the Sobolev constant θ\theta in (2.15), such that for any MM that satisfies c(logN)4/(2E)MCN(2E)c(\log N)^{4}/(2-|E|)\leq M\leq CN(2-|E|), we have

This lemma was proven in Theorem E1 of , see also Theorem 3.3 of , and the proof will not be repeated here. We only mention the main idea, that the local semicircle law, Theorem 5.1, provides a positive lower bound on the number of eigenvalues in any interval II of size |I|\geq A(\log N)^{4}/\big[N\big|2-|x|\big|\big] around the point xx with a sufficiently large constant AA. In particular, it follows that there is at least one eigenvalue in each such interval II with a very high probability.

Proof of Proposition 5.7. We choose M,KM,K positive numbers, depending on NN, such that

with some sufficiently small c>0c>0 and large C>0C>0 constants. Let

Consider an index ii with CNκ3/2iN(1Cκ3/2)CN\kappa^{3/2}\leq i\leq N(1-C\kappa^{3/2}); then 2γiCκ|2-|\gamma_{i}||\geq C\kappa. We first show that 2xiCκ2|2-|x_{i}||\geq C\kappa^{2} with a very high probability. Suppose, on the contrary, that xi<2+Cκ2x_{i}<-2+C\kappa^{2} for some iCNκ3/2i\geq CN\kappa^{3/2} (the case xi2Cκ2x_{i}\geq 2-C\kappa^{2} is treated analogously). From (5.9) and (5.14) it follows that x12C0N1/4+δx_{1}\geq-2-C_{0}N^{-1/4+\delta} with a very high probability. But then the interval [2C0N1/4+δ,2+Cκ2][-2-C_{0}N^{-1/4+\delta},-2+C\kappa^{2}] of length C0N1/4+δ+Cκ2κ3/2C_{0}N^{-1/4+\delta}+C\kappa^{2}\ll\kappa^{3/2} would contain CNκ3/2CN\kappa^{3/2} eigenvalues, an event with an extremely low probability by (5.4).

Knowing that 2xiκ2|2-|x_{i}||\geq\kappa^{2} with a very high probability, we can use (5.17) to conclude that for any index ii with CNκ3/2iN(1Cκ3/2)CN\kappa^{3/2}\leq i\leq N(1-C\kappa^{3/2}) we have

Similarly to the calculation in Theorem 3.1 of , by using the logarithmic Sobolev inequality (2.15), we have

This bound holds for any index ii with the remark that if i<Ki<K or i>NKi>N-K, then the averaging over the indices jj is done asymmetrically.

Combining this estimate with (5.21) we have, apart from a set of very small probability, that

for any CNκ3/2iN(1Cκ3/2)CN\kappa^{3/2}\leq i\leq N(1-C\kappa^{3/2}). Taking expectation, and using the tail estimate (5.13) to control xNx_{N} on the event of very small probability where (5.23) may not hold, we also obtain for these ii indices that

Subtracting the last two inequalities yields

and combining this bound with the estimate (5.14) for the extreme indices, we obtain

The inequalities (5.15) and (5.16) now follow if we choose the parameters as

which choice is compatible with the conditions (5.18).

5.2 Deviation of the eigenvalues from their classical locations

Proposition 5.9 below estimates the distance of the eigenvalues from their location given by the semicircle law. This will justify that the convex extension of the potential WjW_{j} affects only regimes of very small probability.

For any small δ>0\delta>0 and for any j=1,2,Nj=1,2,\ldots N we have

The constants CC and cc depend on δ\delta and θ\theta but are independent of NN.

We remark that in the bulk αmγm\alpha_{m}-\gamma_{m} is expected to be bounded by O(N1+ε)O(N^{-1+\varepsilon}) (in the Hermitian case it was proven in , see also ); near the edges one expects αmγmO(N2/3)\alpha_{m}-\gamma_{m}\sim O(N^{-2/3}). Our estimate is not optimal, but it gives a short proof that is sufficient for our purpose. We remark that after this paper was submitted, these conjectures were proven in .

to be the counting function of the expected locations of the eigenvalues. We compare n~(E)\widetilde{n}(E) with n(E)n(E) defined in (5.2). Using the fluctuation bound (5.14), we have

and the second term is analogous. We thus have

For the energy range E3|E|\leq 3, we use (5.28):

with an ε\varepsilon dependent constant.

To estimate αmγm|\alpha_{m}-\gamma_{m}|, we can assume, without loss of generality, that αmγm\alpha_{m}\geq\gamma_{m}, the other case is treated analogously. Let λ>0\lambda>0 be a parameter that will be optimized later. Set m0=[CNλ3/2]m_{0}=[CN\lambda^{3/2}] with a sufficiently large constant CC. Since nsc(2+δ)δ3/2n_{sc}(-2+\delta)\sim\delta^{3/2}, for any small δ>0\delta>0, the parameter λ\lambda is roughly the energy difference from the edge to the m0m_{0}-th eigenvalue.
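The edge scaling nsc(2+δ)δ3/2n_{sc}(-2+\delta)\sim\delta^{3/2} and the resulting relation between the index m0m_{0} and the energy scale λ\lambda can be illustrated numerically. The sketch assumes the semicircle density 12π(4x2)+\frac{1}{2\pi}\sqrt{(4-x^{2})_{+}}, for which nscn_{sc} has a closed form, and assumes that the classical location γj\gamma_{j} of (2.8) solves nsc(γj)=j/Nn_{sc}(\gamma_{j})=j/N; both definitions lie outside this section, so they are labeled assumptions here, as are the function names.

```python
import math


def n_sc(E: float) -> float:
    """Integrated semicircle density on (-inf, E] (assumed normalization on [-2, 2])."""
    if E <= -2.0:
        return 0.0
    if E >= 2.0:
        return 1.0
    return 0.5 + E * math.sqrt(4.0 - E * E) / (4.0 * math.pi) + math.asin(E / 2.0) / math.pi


def classical_location(j: int, N: int, iters: int = 200) -> float:
    """Solve n_sc(gamma) = j/N by bisection on [-2, 2] (assumed definition of gamma_j)."""
    target = j / N
    lo, hi = -2.0, 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if n_sc(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def edge_distance_approx(j: int, N: int) -> float:
    """Leading-order edge distance from n_sc(-2 + d) ~ (2 / (3*pi)) * d^(3/2)."""
    return (3.0 * math.pi * j / (2.0 * N)) ** (2.0 / 3.0)
```

Near the lower edge the exact location and the leading-order approximation agree to within a few percent, which is the sense in which an index of order Nλ3/2N\lambda^{3/2} corresponds to a distance of order λ\lambda from the edge.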

From the property nsc(2+δ)δ3/2n_{sc}(-2+\delta)\sim\delta^{3/2} for small δ\delta, we have

Combining this estimate with (5.32), we have

for any mm with m0mNm0m_{0}\leq m\leq N-m_{0}.

For the extreme indices, we use that if mm0m\leq m_{0}, then from Lemma 5.5 and (5.33), we have

with CC depending on ε\varepsilon. Similar estimates hold at the upper edge of the spectrum, i.e. for mNm0m\geq N-m_{0}. Choosing λ=CN1/5+ε\lambda=CN^{-1/5+\varepsilon}, we conclude the proof of (5.27). The proof of (5.26) then follows from (5.14) and this concludes the proof of Proposition 5.9.

The following Proposition is a strengthening of the bound (5.32) used previously.

for any δ>0\delta>0 and with a constant CC depending on δ\delta and θ\theta.

Proof. Recalling the definition of Φ\Phi from (5.19), we will prove that

which gives (5.34) with the choice of parameters (5.25). We proceed similarly to the proof of Proposition 5.9 but we notice that in addition to (5.28), a stronger bound on n(E)n~(E)|n(E)-\widetilde{n}(E)| is available for EI:=[E,E+]E\in I:=[E_{-},E_{+}], where E±:=±(2C2κ)E_{\pm}:=\pm(2-C_{2}\kappa), with some large constant C2C_{2} and setting κ:=N1/18+δ\kappa:=N^{-1/18+\delta} as in Proposition 5.7. To obtain an improved bound, note that for any EE in this interval

To see this inequality, define the random index

The estimate (5.36) will then follow if we prove that j0j1j_{0}\leq j_{1}, i.e. αj0E+Φ\alpha_{j_{0}}\leq E+\Phi, with a very high probability. By (5.26) we have, with a very high probability, that

Therefore, with a very high probability, γj0\gamma_{j_{0}} is in the CN1/5+δCN^{-1/5+\delta} vicinity of EIE\in I, and thus CNκ3/2j0N(1Cκ3/2)CN\kappa^{3/2}\leq j_{0}\leq N(1-C\kappa^{3/2}) holds for any fixed CC if C2C_{2} in the definition of E±E_{\pm} is sufficiently large. Thus xj0αj0Φ|x_{j_{0}}-\alpha_{j_{0}}|\leq\Phi with a very high probability by (5.15), so xj0Ex_{j_{0}}\leq E implies αj0E+Φ\alpha_{j_{0}}\leq E+\Phi and this proves (5.36).

is analogous. Finally, by the Lipschitz continuity of n~(E)\widetilde{n}(E) on a scale bigger than (logN)2/N(\log N)^{2}/N, we have

where we also used that ΦCexp(cNδ)\Phi\geq C\exp(-cN^{\delta}).

Define the interval J=[2C1N1/4+δ,2+C1N1/4+δ]J=[-2-C_{1}N^{-1/4+\delta},2+C_{1}N^{-1/4+\delta}] with a constant C1C_{1} larger than the constant C0C_{0} in (5.9). Using (5.30), we have

since JICκ+N1/4+δCκ|J\setminus I|\leq C\kappa+N^{-1/4+\delta}\leq C\kappa. Finally, n~(E)0\widetilde{n}(E)\equiv 0 for E<2C0N1/4+δE<-2-C_{0}N^{-1/4+\delta} and n~(E)1\widetilde{n}(E)\equiv 1 for E>2+C0N1/4+δE>2+C_{0}N^{-1/4+\delta} by (5.9). Since C1>C0C_{1}>C_{0}, combining these estimates with the fluctuation bound (5.14) and with the tail estimate (5.13), we obtain that n(E)(1-n(E))\leq C\exp\big[-cN^{1/4}\big] for any EJcE\in J^{c}, and it decays exponentially for large E|E|, therefore

Collecting all these estimates, inserting them into (5.38) and using (5.3), we obtain (5.35) and conclude the proof of Proposition 5.10.

Finally, we can complete the proof of Lemma 5.4. By (5.16) and (5.34), we have

apart from a set of probability C\exp\big[-cN^{\delta}\big]. Combining it with the tail estimate (5.13) on maxxk\max|x_{k}|, we obtain (5.5) with any a<1/18{\mathfrak{a}}<1/18. The inequality (5.6) in Lemma 5.4 follows immediately from (5.26) with any c>0{\mathfrak{c}}>0 sufficiently small and with b<1/5c{\mathfrak{b}}<1/5-{\mathfrak{c}}.

Appendix A Some Properties of the Eigenvalue Process

In the main part of the paper we did not specify the function spaces in which the equations (2.4) and (3.12) are solved. In this appendix we summarize some basic properties of these equations. In particular, we justify the integration by parts in (3.17). For simplicity, we consider the most singular β=1\beta=1 case only.

The Dyson Brownian motion as a stochastic process was rigorously constructed in Section 4.3.1 of . It was proved that the eigenvalues do not collide with probability one and thus (2.4) holds in a weak sense on the open set ΣN\Sigma_{N}. The coefficients of LL have a (xixj)1(x_{i}-x_{j})^{-1} singularity near the coalescence hyperspace xi=xjx_{i}=x_{j}. We focus only on the single collision singularities, i.e. on the case j=i±1j=i\pm 1. By the ordering of the eigenvalues, higher order collision points form a zero measure set on the boundary of ΣN\Sigma_{N} and can thus be neglected. In an open neighborhood near the coalescence hyperspace xi=xi+1x_{i}=x_{i+1}, the generator has the form

after a change of variables, v=12(xi+xi+1)v=\frac{1}{2}(x_{i}+x_{i+1}), u=12(xi+1xi)u=\frac{1}{2}(x_{i+1}-x_{i}), where LregL_{reg} has regular coefficients. The boundary condition at u=0u=0 is given by the standard boundary condition of the generator of the Bessel process, u2+1uu\partial_{u}^{2}+\frac{1}{u}\partial_{u}, which is uf(u)0uf^{\prime}(u)\to 0 as u0+u\to 0+. Thus LL is defined on functions fC2(ΣN)f\in C^{2}(\Sigma_{N}) with sufficient decay at infinity and with boundary conditions

The generator L~\widetilde{L} of (3.12) differs from LL only in drift terms with bounded coefficients, hence the boundary conditions of L~\widetilde{L} and LL coincide. Finally, we need some non-vanishing and regularity property of the solution of (3.12):

Let ΩΣN\Omega\subset\Sigma^{N} be a bounded open set such that

i.e. Ω\overline{\Omega} intersects at most one of the coalescent hyperplanes, namely the hyperplane {x1=x2}\{x_{1}=x_{2}\}. Then any weak solution qt(x)q_{t}({\bf{x}}) of (3.12) with boundary conditions (A.1) is C2C^{2} on (t,x)R+×Ω(t,{\bf{x}})\in R_{+}\times\overline{\Omega} and for any t>0t>0 we have

where LregL_{reg} is an elliptic operator with second derivatives in the y{\bf{y}} variables and with bounded coefficients on the compact set Φ(Ω)\Phi(\overline{\Omega}). The solution in the new coordinates is q~t(u,y)=qt(Φ1(u,y))\widetilde{q}_{t}(u,{\bf{y}})=q_{t}(\Phi^{-1}(u,{\bf{y}})). Introducing a function q^t(a,b,y):=q~t(a2+b2,y)\widehat{q}_{t}(a,b,{\bf{y}}):=\widetilde{q}_{t}(\sqrt{a^{2}+b^{2}},{\bf{y}}) defined in N+1N+1 variables, we see that q^t\widehat{q}_{t} satisfies tq^t=L^q^t\partial_{t}\widehat{q}_{t}=\widehat{L}\widehat{q}_{t}, where

i.e. L^\widehat{L} is elliptic with bounded coefficients in the new variables. Notice that the boundary condition (A.1) implies that, in the two dimensional plane of (a,b)(a,b), the support of the test function for the equation tq^t=L^q^t\partial_{t}\widehat{q}_{t}=\widehat{L}\widehat{q}_{t} is allowed to include the origin (0,0)(0,0).
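The reason the substitution u=a2+b2u=\sqrt{a^{2}+b^{2}} removes the singularity is that the Bessel-type generator u2+1uu\partial_{u}^{2}+\frac{1}{u}\partial_{u} is the radial part of the two-dimensional Laplacian a2+b2\partial_{a}^{2}+\partial_{b}^{2}. The sketch below checks this with finite differences for the hypothetical test function q~(u)=cosu\widetilde{q}(u)=\cos u, which satisfies uq~(u)0u\widetilde{q}^{\prime}(u)\to 0 as u0+u\to 0+; the test function, step size and helper names are assumptions made for illustration only.

```python
import math


def q_tilde(u: float) -> float:
    """Hypothetical radial test function; u * q_tilde'(u) -> 0 as u -> 0+."""
    return math.cos(u)


def bessel_generator(u: float) -> float:
    """(d^2/du^2 + (1/u) d/du) applied to cos(u), computed analytically for u > 0."""
    return -math.cos(u) - math.sin(u) / u


def q_hat(a: float, b: float) -> float:
    """Lift of q_tilde to the (a, b) plane via u = sqrt(a^2 + b^2)."""
    return q_tilde(math.hypot(a, b))


def laplacian_2d(a: float, b: float, h: float = 1e-3) -> float:
    """Five-point finite-difference approximation of (d_a^2 + d_b^2) q_hat at (a, b)."""
    return (
        q_hat(a + h, b) + q_hat(a - h, b)
        + q_hat(a, b + h) + q_hat(a, b - h)
        - 4.0 * q_hat(a, b)
    ) / (h * h)
```

At (a,b)=(0.3,0.4)(a,b)=(0.3,0.4), i.e. u=0.5u=0.5, the two expressions agree up to discretization error, and at the origin the lifted function has a finite Laplacian close to 2-2 even though the coefficient 1/u1/u is singular there.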

By standard parabolic regularity, we obtain that the solution is C2C^{2} and is bounded from above and below.

This lemma justifies the integration by parts in (3.17). Since qC2q\in C^{2} is bounded away from zero, h=qh=\sqrt{q} has no singularity on the coalescence lines. Since the function exp(H~)\exp(-\widetilde{\mathcal{H}}) vanishes whenever xi=xjx_{i}=x_{j} for some iji\neq j, the boundary terms of the form

Appendix B Logarithmic Sobolev inequality for convolution measures

for any ff with f(x)KH(x)dx=1\int f(x)K\ast H(x){\rm d}x=1. Here =d/dx\nabla={\rm d}/{\rm d}x.

Proof. The following proof is really a special case of the martingale approach used in to prove LSI. Let

For any fixed yy, from the LSI w.r.t. the measure K(xy)dxK(x-y){\rm d}x, the first term on the right hand side is bounded by

Integrating by parts, we can rewrite the last term as

where we have used f(x)=2f(x)f(x)f^{\prime}(x)=2\sqrt{f(x)}\nabla\sqrt{f(x)} and the Schwarz inequality. Combining these inequalities, we have proved the Lemma.

Appendix C Proof of Lemma 5.3

For any 1kN1\leq k\leq N, let H(k)H^{(k)} denote the (N1)×(N1)(N-1)\times(N-1) minor that is obtained from the Wigner matrix HH by removing the kk-th row and column. Let a(k)=(hk1,hk2,hk,k1,hk,k+1,hkN)t{\bf{a}}^{(k)}=(h_{k1},h_{k2},\ldots h_{k,k-1},h_{k,k+1},\ldots h_{kN})^{t} be the kk-th column of HH without the hkkh_{kk} element. Let λ1(k)<λ2(k)<<λN1(k)\lambda_{1}^{(k)}<\lambda_{2}^{(k)}<\ldots<\lambda_{N-1}^{(k)} be the eigenvalues and u1(k),u2(k),{\bf{u}}_{1}^{(k)},{\bf{u}}_{2}^{(k)},\ldots the corresponding eigenvectors of H(k)H^{(k)} and set

It is well known (see, e.g. Lemma 2.5 of ), that the eigenvalues of H(k)H^{(k)} and HH are interlaced for each kk, i.e.

Expressing the resolvent G=(Hz)1G=(H-z)^{-1} of HH at a spectral parameter z=E+iηz=E+i\eta, η>0\eta>0, in terms of the resolvent of H(k)H^{(k)}, we obtain

By considering only the imaginary part, we obtain

where we restricted the α\alpha summation in (C.3) only to eigenvalues lying in II.

For each k=1,2,Nk=1,2,\ldots N, we define the event

if KK is sufficiently large, recalling that η=I(logN)2/N\eta=|I|\geq(\log N)^{2}/N. On the complement event, Ω~c\widetilde{\Omega}^{c}, we have from (C.4) that

i.e. NI(C/δ)1/2Nη{\mathcal{N}}_{I}\leq(C/\delta)^{1/2}N\eta. Choosing KK sufficiently large, we obtain (5.4) from (C.5). This proves Lemma 5.3.

holds for any I{\cal I}, where m=Im=|{\cal I}| is the cardinality of the index set I{\cal I}.

Proof of Lemma C.1. We will need the following result of Hanson and Wright , extended to non-symmetric variables by Wright . We remark that this statement can also be extended to complex random variables (Proposition 4.5 ).

Let bjb_{j}, j=1,2,Nj=1,2,\ldots N be a sequence of real i.i.d. random variables with distribution dν{\rm d}\nu satisfying the Gaussian decay (2.16) for some δ0>0\delta_{0}>0. Let ajka_{jk}, j,k=1,2,Nj,k=1,2,\ldots N be arbitrary real numbers and let A{\cal A} be the N×NN\times N matrix with entries Ajk:=ajk{\cal A}_{jk}:=|a_{jk}|. Define

Then there exists a constant c>0c>0, depending only on δ0,D\delta_{0},D from (2.16), such that for any δ>0\delta>0

where A:=(\text{Tr}\,{\cal A}{\cal A}^{t})^{1/2}=\big[\sum_{j,k}|a_{jk}|^{2}\big]^{1/2}.
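The scale AA in the statement above is the Hilbert-Schmidt (Frobenius) norm of the coefficient matrix. The following sketch computes it in the two ways given in the definition, once through the trace of AAt{\cal A}{\cal A}^{t} and once through the entries ajka_{jk}; the function names and the sample matrix are hypothetical.

```python
import math


def scale_via_trace(a):
    """A = (Tr(A_abs A_abs^t))^(1/2), where A_abs has entries |a_jk|."""
    n = len(a)
    abs_a = [[abs(v) for v in row] for row in a]
    # (A_abs A_abs^t)_{jl} = sum_k |a_jk| |a_lk|
    prod = [
        [sum(abs_a[j][k] * abs_a[l][k] for k in range(n)) for l in range(n)]
        for j in range(n)
    ]
    return math.sqrt(sum(prod[j][j] for j in range(n)))


def scale_via_entries(a):
    """A = (sum_{j,k} |a_jk|^2)^(1/2)."""
    return math.sqrt(sum(v * v for row in a for v in row))
```

For the matrix with rows (1, -2) and (3, -4), both functions return the square root of 30, independently of the signs of the entries.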

Acknowledgement: We thank Jun Yin for several helpful comments and pointing out some errors in the preliminary versions of this paper. We are also grateful to the referees for their suggestions to improve the presentation.

References