Universality of Random Matrices and Local Relaxation Flow

Laszlo Erdos, Benjamin Schlein, Horng-Tzer Yau

Introduction

A central question concerning random matrices is the universality conjecture which states that local statistics of eigenvalues are determined by the symmetries of the ensembles but are otherwise independent of the details of the distributions. There are two types of universalities: the edge universality and the bulk universality concerning the interior of the spectrum. The edge universality is commonly approached via the moment method while the bulk universality was proven for very general classes of unitary invariant ensembles (see, e.g. and references therein) based on detailed analysis of orthogonal polynomials. The most prominent non-unitary ensembles are the Wigner matrices, i.e., random matrices with i.i.d. matrix elements that follow a general distribution. The bulk universality for Hermitian Wigner ensembles was first established in for ensembles with smooth distributions. The later work by Tao and Vu did not assume smoothness but it required some moment condition which was removed later in . Our approach to prove the universality was based on the following three steps.

Step 1. Local semicircle law. It states that the number of eigenvalues in a spectral window containing about $N^{\varepsilon}$ eigenvalues is given by the semicircle law with a very high probability. The factor $N^{\varepsilon}$ can be improved to any sufficiently large constant at the expense of a deterioration of the probability estimate.

Step 2. Universality for Gaussian divisible ensembles. The Gaussian divisible ensembles are given by matrices of the form

$$\widehat{H}+\sqrt{s}\,V, \tag{1.1}$$

where $\widehat{H}$ is a Wigner matrix, $V$ is an independent standard GUE matrix and $s>0$ . Johansson and the later improvement in proved that the bulk universality holds for ensembles of the form (1.1) if $s>0$ is independent of $N$ . In the work , this result was extended to $s=N^{-1+\varepsilon}$ for any $\varepsilon>0$ . The key ingredient for this extension was the local semicircle law.

Step 3. Approximation by Gaussian divisible ensembles. For any given Wigner matrix $H$, we find another Wigner matrix $\widehat{H}$ so that the eigenvalue statistics of $H$ and $\widehat{H}+\sqrt{s}V$ are close to each other. The choice of $\widehat{H}$ is given by a reverse heat flow argument.

Johansson’s proof of the universality of Hermitian Wigner ensembles relied on the asymptotic analysis of an explicit formula by Brézin-Hikami for the correlation functions of the eigenvalues of $\widehat{H}+\sqrt{s}V$. Unfortunately, the analogous formula for GOE is not very explicit and the corresponding result is not available. On the other hand, the eigenvalue distribution of the matrix $\widehat{H}+\sqrt{s}V$ is the same as that of $\widehat{H}+V(s)$, where the matrix elements of $V(s)$ are independent Brownian motions with variance $s/N$. Dyson observed that the evolution of the eigenvalues of the flow $s\to\widehat{H}+V(s)$ is given by a system of coupled stochastic differential equations (SDE), commonly called the Dyson Brownian motion (DBM).

If we replace the Brownian motions by the Ornstein-Uhlenbeck processes, the resulting dynamics on the eigenvalues, which we still call DBM, has the GUE or GOE eigenvalue distributions as the invariant measures depending on the symmetry type of the ensembles. Thus the result of Johansson can be interpreted as stating that the local statistics of GUE is reached via DBM for time of order one. In fact, by analyzing the dynamics of DBM with ideas from the hydrodynamical limit, we have extended Johansson’s result to $s\gg N^{-3/4}$ . The key observation of is that the local statistics of eigenvalues depend exclusively on the approach to local equilibrium. This method avoids the usage of explicit formulae for correlation functions, but the identification of local equilibria, unfortunately, still uses explicit representations of correlation functions by orthogonal polynomials (following e.g. ), and the extension to other ensembles is not a simple task.

Therefore, the universality for symmetric random matrices remained open and the only partial result is Theorem 23 of for Wigner matrices with the first four moments of the matrix elements matching those of GOE. The approach of consisted of three similar steps as outlined above. For Step 2, it used the result of . For Step 3, a four moment comparison theorem for individual eigenvalues was proved in and the local semicircle law (Step 1) was one of the key inputs in this proof.

In this paper, we introduce a general approach to prove local ergodicity of DBM, partially motivated by the previous work . In this approach the analysis of orthogonal polynomials and the use of explicit formulae are completely eliminated, and the method applies to both Hermitian and symmetric ensembles. In fact, the heart of the proof is a convex analysis and it applies to $\beta$-ensembles for any $\beta\geq 1$. The model specific information required to complete this approach involves only rough estimates on the accuracy of the local density of eigenvalues. We expect this method to apply to a very general class of models. More detailed explanations will be given in Section 3.

Statement of Main Results

To fix the notation, we will present the case of symmetric Wigner matrices; the modification to the Hermitian case is straightforward and will be omitted. The extension to the quaternion self-dual case is also standard, see, e.g. for the notations and setup. On the other hand, the main theorem on DBM (Theorem 2.1) is valid for general $\beta$ -ensembles. Thus all notations for matrices will be restricted to symmetric matrices but all results for flows will be stated and proved for general $\beta$ -ensembles. We first explain our general result about DBM and in Section 2.2 we will present its application to Wigner matrices.

The joint distributions of the eigenvalues ${\bf{x}}=(x_{1},x_{2},\ldots,x_{N})$ of the Gaussian Unitary Ensemble (GUE) and the Gaussian Orthogonal Ensemble (GOE) are given by the following measure

where $\beta=1$ for GOE and $\beta=2$ for GUE. We will sometimes use $\mu$ to denote the density of the measure as well, i.e., $\mu({\bf{x}}){\rm d}{\bf{x}}=\mu({\rm d}{\bf{x}})$ . We consider $\mu$ defined on the ordered set

and this measure is well-defined for all $\beta>0$ . The Dyson Brownian motion (DBM) is characterized by the generator

acting on $L^{2}(\mu)$ . The DBM is reversible with respect to $\mu$ with the Dirichlet form

where $\partial_{j}=\partial_{x_{j}}$ . Notice that we have added a drift $\frac{\beta}{4}x_{i}\partial_{i}$ so that the DBM is reversible w.r.t. $\mu$ . The original definition by Dyson in was slightly different; it contained no drift term.
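The role of the drift term can be made explicit with a short computation. The sketch below assumes the normalization $\mathcal{H}({\bf x})=\frac{\beta N}{4}\sum_{i}x_{i}^{2}-\beta\sum_{i<j}\log|x_{j}-x_{i}|$ with $\mu\propto e^{-\mathcal{H}}$, and a Dirichlet form carrying the factor $1/(2N)$. These constants are an assumption, chosen because they reproduce the drift $\frac{\beta}{4}x_{i}\partial_{i}$ mentioned above; they are not a quotation of the displayed formulas. Boundary terms in the integration by parts are taken to vanish.

```latex
\begin{align*}
D(f) &= \frac{1}{2N}\sum_{j=1}^{N}\int(\partial_j f)^2\,\mathrm{d}\mu
      = -\int f\,Lf\,\mathrm{d}\mu
\quad\Longrightarrow\quad
L = \frac{1}{2N}\sum_{i=1}^{N}\partial_i^2
    - \frac{1}{2N}\sum_{i=1}^{N}(\partial_i\mathcal{H})\,\partial_i ,\\
\partial_i\mathcal{H} &= \frac{\beta N}{2}\,x_i
      - \beta\sum_{j\neq i}\frac{1}{x_i-x_j},\\
L &= \sum_{i=1}^{N}\frac{1}{2N}\,\partial_i^2
   + \sum_{i=1}^{N}\Big(-\frac{\beta}{4}\,x_i
   + \frac{\beta}{2N}\sum_{j\neq i}\frac{1}{x_i-x_j}\Big)\partial_i .
\end{align*}
```

Under this assumed normalization the confining quadratic part of $\mathcal{H}$ is exactly what produces the linear drift, while the logarithmic interaction produces the repulsion between neighbouring points.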

Denote the distribution of the process at the time $t$ by $f_{t}({\bf x})\mu({\rm d}{\bf x})$ . Then $f_{t}$ satisfies

The corresponding stochastic differential equation for ${\bf x}(t)$ is now given by (see, e.g. Section 12.1 of )

where $\{B_{i}\;:\;1\leq i\leq N\}$ is a collection of independent Brownian motions. The well-posedness of DBM on $\Sigma_{N}$ has been proved in Section 4.3.1 of , see the Appendix for some more details. This step requires $\beta\geq 1$ which we will assume from now on.

The dynamics given by (2.4) and (2.5) with $\beta=1,2,4$ can be realized by the evolution of the eigenvalues of symmetric, Hermitian and quaternion self-dual matrix ensembles, but the dynamics is well-defined for $\beta\geq 1$ independently of the original matrix models. Our main result, Theorem 2.1, is valid for all $\beta\geq 1$.
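To get a feeling for the dynamics, one can discretize the SDE numerically. The following is a minimal sketch, assuming the SDE has the form ${\rm d}x_{i}=N^{-1/2}{\rm d}B_{i}+\big[-\frac{\beta}{4}x_{i}+\frac{\beta}{2N}\sum_{j\neq i}(x_{i}-x_{j})^{-1}\big]{\rm d}t$, which is consistent with the drift $\frac{\beta}{4}x_{i}\partial_{i}$ quoted above. The Euler-Maruyama scheme and the function names `dbm_drift`, `dbm_euler_step` and `simulate_dbm` are illustrative choices, not part of the paper.

```python
import math
import random


def dbm_drift(x, beta):
    """Drift of the assumed DBM SDE:
    -beta/4 * x_i + beta/(2N) * sum_{j != i} 1/(x_i - x_j)."""
    n = len(x)
    drift = []
    for i in range(n):
        repulsion = sum(1.0 / (x[i] - x[j]) for j in range(n) if j != i)
        drift.append(-beta / 4.0 * x[i] + beta / (2.0 * n) * repulsion)
    return drift


def dbm_euler_step(x, beta, dt, rng):
    """One Euler-Maruyama step: drift * dt plus Gaussian noise of variance dt / N."""
    n = len(x)
    drift = dbm_drift(x, beta)
    noise_std = math.sqrt(dt / n)
    return [x[i] + drift[i] * dt + rng.gauss(0.0, noise_std) for i in range(n)]


def simulate_dbm(x0, beta, dt, steps, seed=0):
    """Run `steps` Euler-Maruyama steps from the (sorted) initial points x0.

    Caveat: the exact dynamics keeps the points ordered for beta >= 1,
    but a coarse time step can make the discretized points cross.
    """
    rng = random.Random(seed)
    x = sorted(x0)
    for _ in range(steps):
        x = dbm_euler_step(x, beta, dt, rng)
    return x
```

A quick sanity check of the assumed drift: for $N=2$ the deterministic part vanishes at $x=(-1/\sqrt{2},1/\sqrt{2})$, where confinement and repulsion balance, and the pairwise repulsion terms always cancel in the sum of the drifts.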

Similarly, the correlation functions of the equilibrium measure are denoted by

is the density of the semicircle law and it is well-known that $\varrho_{sc}$ is also the density w.r.t. the measure $\mu$ in the limit $N\to\infty$ for $\beta\geq 1$ . Define

and let $\gamma_{j}$ be the classical location of the $j$ -th eigenvalue
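The classical locations are easy to compute numerically. The sketch below assumes the standard definitions $\varrho_{sc}(x)=\frac{1}{2\pi}\sqrt{(4-x^{2})_{+}}$ and $\int_{-\infty}^{\gamma_{j}}\varrho_{sc}(x){\rm d}x=j/N$; the closed form of the distribution function and the bisection solver are illustrative choices, and the function names are hypothetical.

```python
import math


def semicircle_cdf(x):
    """Distribution function of the semicircle law supported on [-2, 2]."""
    if x <= -2.0:
        return 0.0
    if x >= 2.0:
        return 1.0
    return (0.5
            + x * math.sqrt(4.0 - x * x) / (4.0 * math.pi)
            + math.asin(x / 2.0) / math.pi)


def classical_location(j, n, tol=1e-12):
    """Solve semicircle_cdf(gamma_j) = j / n by bisection on [-2, 2]."""
    target = j / n
    lo, hi = -2.0, 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if semicircle_cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def classical_locations(n):
    """Return [gamma_1, ..., gamma_n]."""
    return [classical_location(j, n) for j in range(1, n + 1)]
```

With this convention $\gamma_{N}=2$ sits at the upper spectral edge, and the symmetry of the semicircle law gives $\gamma_{j}=-\gamma_{N-j}$.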

Our key result on the local ergodicity of DBM is the following theorem.

Suppose the initial density $f_{0}$ satisfies $S_{\mu}(f_{0}):=\int f_{0}\log f_{0}{\rm d}\mu\leq CN^{m}$ with some fixed exponent $m$ independent of $N$ . Let $f_{t}$ be the solution of the forward equation (2.4). Suppose that the following three assumptions are satisfied for all sufficiently large $N$ .

There exist constants ${\mathfrak{b}}>0$ and ${\mathfrak{c}}>0$ such that

Convention. We will use the letters $C$ and $c$ to denote general positive constants whose precise values are irrelevant and they may change from line to line.

The relaxation time to global equilibrium for the DBM is of order one in our scaling. The simplest way to see this is via the Bakry-Emery theorem which states that, roughly speaking, the relaxation time is the inverse of the lower bound to the Hessian of the Hamiltonian ${\mathcal{H}}$. In our case ${\mathcal{H}}^{\prime\prime}\geq I$, and this implies that the relaxation time is of order one. On the other hand, it was conjectured by Dyson that the relaxation time to local equilibrium is of order $N^{-1}$. Theorem 2.1 asserts that the relaxation time to local equilibrium is less than $N^{-\zeta}$. Although this is far from proving Dyson’s conjecture, it is the first effective estimate that shows that the local equilibrium is approached much faster than the global one. Moreover, this result suffices to prove the bulk universality of Wigner matrices when combined with the reverse heat flow ideas introduced in . We remark that the concept of local equilibrium is used vaguely here and in Dyson’s paper. In principle, there are many local equilibria depending on boundary conditions, and the uniqueness is a difficult question, especially since the interaction is long range and singular.

The proof of Theorem 2.1 is based on the introduction of the pseudo equilibrium measure which we now explain. It is common that the global and the local equilibrium are reached at different time scales for interacting particle systems, of which DBM is a special case. On the other hand, the hydrodynamical approach for the DBM yields very complicated estimates. The main reason for the complications is that the equilibrium measure of DBM has a logarithmic two body interaction that is both long range and singular at short distances. Hence the proof of the uniqueness of “local equilibrium measures” is very complicated and we were able to carry it out only for the Hermitian case, because several identities involving orthogonal polynomials are valid only for the $\beta=2$ case. However, there are two key observations from this study:

The local statistics do not depend on the long range part of the logarithmic interaction; in other words, we can cut off the interactions between far away particles without changing the local statistics.

The relaxation time for the gradient flow associated with the local equilibrium with a fixed boundary condition is much smaller than the global relaxation time of the DBM, which is of order one.

To finesse the difficulty associated with the uniqueness of local equilibria, we define the pseudo equilibrium measure, $\omega$, by cutting off the long range interactions of the equilibrium measure $\mu$ and show that $\omega$, $\mu$ and $f_{t}\mu$ all have the same local statistics. The key idea in proving the last assertion is to estimate the relative entropy of the solution to the DBM, $f_{t}\mu$, relative to the pseudo equilibrium measure $\omega$. Since the pseudo equilibrium measure is not a global equilibrium measure, the entropy will not decay monotonically as in the case of the relative entropy w.r.t. the equilibrium measure. More precisely, the time derivative of the relative entropy, under the flow of DBM, w.r.t. the pseudo equilibrium measure consists of two terms (see Theorem 3.5): (i) a dissipation term of Dirichlet form of $f_{t}\mu$ w.r.t. the pseudo equilibrium measure; (ii) an error term due to the fact that the pseudo equilibrium measure is not the true equilibrium measure.

Since the logarithmic interactions between far away particles can be approximated by a mean-field potential obtained from using the local density, the error term in (ii) can be controlled if we know the local density of particles w.r.t the distribution $f_{t}\mu$ . The precise conditions are the Assumptions (1)–(3). In the special case of $\beta=1,2$ , when the DBM is generated by symmetric matrix ensembles, these assumptions will be verified in Lemma 2.2 by using the local semicircle law; the case of $\beta=4$ is similar and some details are given in . For other values of $\beta$ it is an open question to verify the corresponding assumptions. Given that the error term in (ii) can be bounded, we obtain an estimate on the Dirichlet form of $f_{t}\mu$ w.r.t. the pseudo equilibrium measure. The key question is whether this estimate alone is sufficient to pin down the local statistics. For this purpose, we note that the Dirichlet form w.r.t. $\omega$ generates a new gradient flow, the local relaxation flow. The global relaxation time of the local relaxation flow, determined by the convexity of the pseudo equilibrium measure, is much shorter than that of the standard DBM. This leads to strong estimates on the local relaxation flow and in particular, it identifies the local statistics. The details of the entropy estimates and the local relaxation flow will appear in Section 3. We now state the main application of Theorem 2.1, the universality of symmetric Wigner ensembles.

Universality of symmetric Wigner ensemble

We remark that (2.15) implies that $\nu$ has a Gaussian decay, i.e.

for some $\delta_{0}>0$. We require that $\widetilde{\nu}$ also satisfies (2.15). In this paper, all conditions and statements involving $\nu$ apply to $\widetilde{\nu}$ as well, but for simplicity of presentation we will not mention $\widetilde{\nu}$ each time.

where $\gamma$ is the standard Gaussian distribution with variance one. For the diagonal elements, the Ornstein-Uhlenbeck process should be replaced by the one reversible w.r.t. the Gaussian measure with variance two, due to the convention that the variances of the diagonal elements are equal to two. The Ornstein-Uhlenbeck process (2.17) induces a stochastic process on the eigenvalues; it is well-known that the process on the eigenvalues is given by the DBM (2.5) with $\beta=1$. Notice that we used the Ornstein-Uhlenbeck process so that the resulting DBM is reversible w.r.t. $\mu$.

Our goal is to apply Theorem 2.1 with $\beta=1$ and for this purpose, we need to verify Assumptions 1-3. Assumption 3 follows from the local semicircle law, Theorem 5.1, stated later in Section 5. Assumptions 1 and 2 can be verified if the measure $\nu$ satisfies the logarithmic Sobolev inequality (2.15). The precise statement is the following lemma.

Suppose the assumption (2.15) on the distribution $\nu$ of the matrix elements holds. Then there are positive numbers ${\mathfrak{a}}$ , ${\mathfrak{b}}$ and ${\mathfrak{c}}$ , depending on $\theta$ from (2.15), such that (2.10) and (2.11) hold.

From this Lemma, for symmetric Wigner matrices whose matrix element distributions satisfy the LSI, the assumptions of Theorem 2.1 are satisfied. Hence the correlation functions w.r.t. $f_{t}\mu$ and the GOE equilibrium measure $\mu^{(\beta=1)}_{N}$ are identical in the large $N$ limit for some $t=N^{-\zeta}$ in the sense that (2.13) holds. Together with the reverse heat flow argument, we have the following universality theorem for local statistics of Wigner ensembles whose matrix element distribution is smooth and satisfies the logarithmic Sobolev inequality. Denote by $p_{N}^{(k)}$ the correlation functions of the eigenvalues of the symmetric Wigner ensemble. Let $p_{N,GOE}^{(k)}$ be the correlation functions of the eigenvalues of GOE, i.e., the correlation functions of the equilibrium measure $\mu^{(\beta=1)}_{N}$ . It is well-known that $p_{N,GOE}^{(k)}$ can be computed explicitly (see, e.g. Section 7 of ).

Suppose the distribution $\nu$ for the matrix elements satisfies the logarithmic Sobolev inequality (2.15). Assume that $\nu$ has a positive density $\nu(x)=e^{-U(x)}$ such that for any $j$ there are constants $C_{1},C_{2}$ , depending on $j$ , such that

Then for any $|E|<2$ and for any $k\geq 1$ , we have

Theorem 2.3 is a simple corollary of Theorem 2.1 and the method of the reverse heat flow . It will be proved briefly in Section 4. Though we stated the universality in terms of correlation functions, it also holds for the eigenvalue gap distribution and we omit the obvious statement (the analogous statement for the Hermitian case was formulated in Theorem 1.2 of ).

In the following corollary, by using Theorem 15 of , we remove all assumptions from Theorem 2.3 except for a decay condition and a technical condition that $\nu$ is supported in at least three points. This latter technical condition was removed in our later paper , where we generalized our approach to a broader class of random matrix ensembles.

Suppose the distribution $\nu$ of the matrix elements has mean zero, variance one and a tail with a subexponential decay, i.e. it satisfies that

for some constants $C,\mathfrak{q}>0$ . Assume that $\nu$ is supported in at least three points. Then the conclusion (2.19) of Theorem 2.3 holds.

Proof. Let $m_{j}$ denote the moments of $\nu$

(ii) the derivative bounds (2.18) hold, and (iii) the logarithmic Sobolev inequality (2.15) holds. It is easy to argue that such a measure $\widehat{\nu}$ exists. Consider the space of all measures satisfying (2.18) with a finite LSI constant. Since the condition (2.18) and the finite LSI constant condition are preserved under small smooth perturbations, which form an infinite dimensional family, there is enough freedom to choose perturbations so as to match the first four moments as long as $m_{4}>m_{3}^{2}+1$. An elementary detailed proof of this fact is given in Lemma C.1. of . Therefore, $\widehat{\nu}$ satisfies the assumption of Theorem 2.3 and thus (2.19) holds for the measure $\widehat{\nu}$. Recall that Theorem 15 in asserts that the local eigenvalue statistics for matrices whose matrix element distributions match up to the first four moments are the same in the limit $N\to\infty$ (strictly speaking, this theorem was proved only for Hermitian matrices, but the parallel version for symmetric ensembles holds as well, see the remark at the end of Section 1.6 in ). This proves the corollary.

Pseudo equilibrium measure and Entropy Dissipation Estimates

The key idea to prove Theorem 2.1 is an estimate on the time to local equilibrium for the DBM. However, to estimate this time to local equilibrium, we need to introduce a different flow, the local relaxation flow, defined as the gradient flow of the pseudo equilibrium measure. The pseudo equilibrium measure is a measure which has the local statistics of the $\beta$ ensemble but has a strong convexity property. Fix a positive number $\eta$ with $N^{-1/6}\ll\eta\ll 1$ , and for the rest of this paper let $\varepsilon>0$ be a small positive number which we will not specify. Let $\gamma_{j}^{\pm}:=\gamma_{j}\pm\eta N^{-\varepsilon}$ and define the mean field potential of eigenvalues far away from the $j$ -th one as

where the summation is over all $k\in\{1,2,\ldots,N\}$ such that $|k-j|>N\eta$ . For $x\geq\gamma_{j}^{+}$ , we extend $W_{j}$ by

and similarly for $x\leq\gamma_{j}^{-}$ . In other words, $W_{j}$ is just the simplest convex extension of the function defined by (3.1) on $I_{j}$ . This modification will avoid the singularities at $x=\gamma_{k}$ . Notice that this is purely a technical device since we will show in (5.26) of Proposition 5.9 that the probability of the regime $I_{j}^{c}$ is negligible in the sense that

The pseudo equilibrium measure ${\omega}_{N}={\omega}$ is defined by

Recall that the relative entropy with respect to a measure $\lambda$ is defined by

The local relaxation flow is defined to be the reversible dynamics w.r.t. $\omega$ characterized by the generator $\widetilde{L}$ defined by

for $x\in I_{j}$ . Note that for any $k$ with $|k-j|>N\eta$ , we have $\gamma_{k}\not\in 2I_{j}$ , where $2I_{j}$ is the doubling of the interval $I_{j}$ . Moreover, for $k=j\pm N\eta$ we have $|\gamma_{k}-\gamma_{j}|\leq C\eta^{2/3}$ for some constant $C$ , and so $|x-\gamma_{k}|\leq C\eta^{2/3}$ for $x\in I_{j}$ . Thus we obtain that

with some positive constant $c$ , using $\beta\geq 1$ . Since $W_{j}$ was defined by a convex extension outside $I_{j}$ , the same bound holds for any $x$ :

i.e., the mean field potential is uniformly convex with the convexity bound given in (3.8).
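The convexity bound can be checked numerically. The sketch below assumes that the mean field potential has the form $W_{j}(x)=-\frac{\beta}{N}\sum_{k:|k-j|>N\eta}\log|x-\gamma_{k}|$ on $I_{j}$, so that $W_{j}^{\prime\prime}(x)=\frac{\beta}{N}\sum_{k:|k-j|>N\eta}(x-\gamma_{k})^{-2}$, and that $\gamma_{k}$ is the $k/N$ quantile of the semicircle law on $[-2,2]$. Both are assumptions made for illustration, as are all function names. The code evaluates $\eta^{1/3}W_{j}^{\prime\prime}(\gamma_{j})$, which should stay bounded away from zero if $W_{j}^{\prime\prime}\geq c\,\eta^{-1/3}$.

```python
import math


def semicircle_cdf(x):
    """Distribution function of the semicircle law supported on [-2, 2]."""
    if x <= -2.0:
        return 0.0
    if x >= 2.0:
        return 1.0
    return (0.5
            + x * math.sqrt(4.0 - x * x) / (4.0 * math.pi)
            + math.asin(x / 2.0) / math.pi)


def classical_location(j, n, tol=1e-12):
    """Solve semicircle_cdf(gamma_j) = j / n by bisection."""
    target = j / n
    lo, hi = -2.0, 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if semicircle_cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def mean_field_second_derivative(j, gammas, beta, eta):
    """Assumed W_j''(gamma_j): (beta/N) * sum over |k - j| > N*eta of (gamma_j - gamma_k)^-2."""
    n = len(gammas)
    x = gammas[j - 1]
    total = 0.0
    for k in range(1, n + 1):
        if abs(k - j) > n * eta:
            total += 1.0 / (x - gammas[k - 1]) ** 2
    return beta * total / n


def scaled_convexity_profile(n, beta, eta):
    """Return eta^(1/3) * W_j''(gamma_j) for j = 1, ..., n."""
    gammas = [classical_location(j, n) for j in range(1, n + 1)]
    scale = eta ** (1.0 / 3.0)
    return [scale * mean_field_second_derivative(j, gammas, beta, eta)
            for j in range(1, n + 1)]
```

For instance, `min(scaled_convexity_profile(400, 1.0, 0.05))` gives a rough idea of the constant $c$ for that choice of parameters. In this experiment the convexity is weakest near the spectral edges, where only one side of the spectrum contributes, which is consistent with the $\eta^{2/3}$ spacing scale used in the argument above.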

The potential $W$ is chosen to satisfy the two convexity properties: (3.8) and (3.18) and there are many other possible choices for $W$ . For example, without changing the form of $W$ given in (3.1), a more natural choice for $\gamma_{j}$ would be

This may somewhat improve the constant in the estimate (2.10), but the analysis is more complicated and we will not pursue this choice in this paper.

The following theorem is our main result on the local ergodicity of DBM.

Suppose that $S_{\mu}(f_{0}|\psi)\leq CN^{m}$ for some $m$ fixed. Let $\tau:=\eta^{1/3}N^{\delta}$ for some $\delta>0$ . Define

Then for any $J\subset\{1,2,\ldots,N-n\}$ we have

We emphasize that Theorem 3.1 applies to all $\beta\geq 1$ ensembles and the only assumption concerning the distribution $f_{t}$ is in (3.9). Notice that the first error term becomes large for $\delta$ large, i.e., if $\tau$ is large. The first ingredient to prove Theorem 3.1 is the analysis of the local relaxation flow. The following theorem shows that the local relaxation flow satisfies an entropy dissipation estimate and its equilibrium measure satisfies the logarithmic Sobolev inequality.

(Dirichlet Form Dissipation Estimate) Suppose (3.8) holds. Consider the equation

with reversible measure $\omega$ . Denote by $R:=\eta^{1/6}$ . Then we have

with a universal constant $C$ . Thus the relaxation time to equilibrium is of order $R^{2}=\eta^{1/3}$ and we have

The notation $R=\eta^{1/6}$ was introduced so that this result and Theorem 4.2 in are identical. The scale parameter $R$ has a meaning in , but it is purely a choice of convention here. The proof given below follows the argument in and it was outlined in this context in Section 5.1 of . The new observation is the additional second term on the r.h.s of (3.13), corresponding to “local Dirichlet form dissipation”. The estimate (3.14) on this additional term will play a key role in this paper.

Proof. In it was shown that, with the notation $h=\sqrt{q}$ , we have

imply that the Hessian of $\widetilde{\mathcal{H}}$ is bounded from below as

with some positive constant $C$ . This proves (3.13) and (3.14) since $R^{2}=\eta^{1/3}$ . Inserting the inequality

and integrating the resulting equation, we prove (3.15). Inserting (3.15) into (3.19) we have

The proof of (3.17) requires an integration by parts and the boundary term at $x_{i}=x_{j}$ (explained in Section 5.1. of ) should vanish. In the Appendix we will justify this technical step.
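The passage from a logarithmic Sobolev inequality to exponential decay of the entropy is a standard Gronwall argument. As a sketch, assume that $q_{t}$ solves $\partial_{t}q_{t}=\widetilde{L}q_{t}$, that the Dirichlet form is normalized as $D_{\omega}(f)=\frac{1}{2N}\sum_{j}\int(\partial_{j}f)^{2}{\rm d}\omega$, and that the LSI reads $S_{\omega}(q)\leq CR^{2}D_{\omega}(\sqrt{q})$. The numerical constants below depend on these assumed normalizations and need not match those of the displayed estimates.

```latex
\begin{align*}
\partial_t S_\omega(q_t)
 &= \int (\widetilde{L}q_t)\,\log q_t\,\mathrm{d}\omega
  = -\frac{1}{2N}\sum_{j}\int\frac{(\partial_j q_t)^2}{q_t}\,\mathrm{d}\omega
  = -4\,D_\omega(\sqrt{q_t}),\\
S_\omega(q_t) \le CR^2\,D_\omega(\sqrt{q_t})
 &\quad\Longrightarrow\quad
 \partial_t S_\omega(q_t) \le -\frac{4}{CR^2}\,S_\omega(q_t)
 \quad\Longrightarrow\quad
 S_\omega(q_t) \le e^{-4t/(CR^2)}\,S_\omega(q_0).
\end{align*}
```

In particular the entropy decays on the time scale $R^{2}=\eta^{1/3}$, which is the relaxation time referred to above.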

with some constant $C$ depending only on $G$ .

Proof. Without loss of generality, we consider only the case $J=\{1,\ldots,N-n\}$ . Let $q_{t}$ satisfy

with an initial condition $q_{0}$ . We first compare $q_{\tau}$ with $q_{\infty}=1$ . Using the entropy inequality,

and the exponential decay of the entropy (3.16), we have

To compare $q_{0}$ with $q_{\tau}$ , by differentiation, we have

From the Schwarz inequality and $\partial q=2\sqrt{q}\partial\sqrt{q}$ the last term is bounded by

since $G$ is smooth and compactly supported. This proves the Lemma.

Notice that if we use only the entropy dissipation and the Dirichlet form, the main term on the right hand side of (3.20) becomes $C\sqrt{S_{\omega}(q)\tau}$. Hence by exploiting the Dirichlet form dissipation coming from the second term on the r.h.s. of (3.13), we gain the crucial factor $N^{-1/2}$ in the estimate.

The second ingredient to prove Theorem 3.1 is the following entropy and Dirichlet form estimates.

(Entropy and Dirichlet Form Estimates) Suppose the assumptions of Theorem 3.1 hold. Recall that $\tau=\eta^{1/3}N^{\delta}$ and define $g_{t}:=f_{t}/\psi$ so that $S_{\mu}(f_{t}|\psi)=S_{\omega}(g_{t})$ . Then the entropy and the Dirichlet form satisfy the estimates:

Proof. First we need the following relative entropy identity from .

Let $f_{t}$ be a probability density satisfying $\partial_{t}f_{t}=Lf_{t}$ . Then for any probability density $\psi_{t}$ we have

In our setting, $\psi$ is independent of $t$ and $L$ satisfies (3.6). Hence we have

Since the middle term on the right hand side vanishes, we have from the Schwarz inequality

Together with the LSI (3.15) and (3.9), we have

for $t\leq\tau$ . Since $S_{\omega}(g_{0})=S_{\mu}(f_{0}|\psi)\leq CN^{m}$ and $\tau/2\gg R^{2}$ , the last inequality proves (3.22).

Integrating (3.25) from $t=\tau/2$ to $t=\tau$ and using the monotonicity of the Dirichlet form in time, we have proved (3.23) with the choice of $\tau$ .

Proof of Theorem 3.1. Fix $\tau=R^{2}N^{\delta}=\eta^{1/3}N^{\delta}$ and let $q_{0}:=g_{\tau}=f_{\tau}/\psi$ with $f_{0}$ satisfying the assumption of Theorem 3.5, i.e., $S_{\mu}(f_{0}|\psi)\leq CN^{m}$ for some $m$ and (3.9) holds. Using (3.23), we have

Clearly, equation (3.27) also holds for the special choice $f_{0}=1$ (for which $f_{\tau}=1$ ), i.e. local statistics of $\mu$ and ${\omega}$ can be compared. Hence we can replace the measure $\omega$ in (3.27) by $\mu$ and this proves Theorem 3.1.

Proof of Theorem 2.3 and Theorem 2.1

We first prove Theorem 2.3 assuming that Theorem 2.1 holds. Our main tool is the reverse heat flow argument from . Recall that the distribution of the matrix element is given by a measure $\nu$ and the generator of the Ornstein-Uhlenbeck process is $A$ (2.17). The probability distribution of all matrix elements is $\nu^{\otimes n}$ , $n=N^{2}$ . The joint probability distribution of the matrix elements at time $t$ as every matrix element evolves under the Ornstein-Uhlenbeck process is given by

where we recall that $\gamma$ is the standard Gaussian measure.

Fix a positive integer $K$. Suppose that ${\rm d}\nu=u\,{\rm d}\gamma$ satisfies the subexponential decay condition (2.20) and the regularity condition (2.18) for all $j\leq K$. Then there is a small constant $\alpha_{K}$, depending only on $K$, such that for any positive $t\leq\alpha_{K}$ there exists a probability density $g_{t}$ w.r.t. $\gamma$ with mean zero and variance one such that

for some $C>0$ depending on $K$ . Furthermore, $g_{t}$ can be chosen such that if the logarithmic Sobolev inequality (2.15) holds for the measure $\nu=u\gamma$ , then it holds for $g_{t}\gamma$ as well, with the logarithmic Sobolev constant changing by a factor of at most $2$ .

Furthermore, let ${\cal A}=A^{\otimes n}$ , $F=u^{\otimes n}$ with $n=N^{2}$ and set $G_{t}:=g_{t}^{\otimes n}$ . Then we also have

Proof. This proposition can be proved following the reverse heat flow idea from . Define $\theta(x)=\theta_{0}(t^{\alpha}x)$ with some small positive $\alpha>0$ depending on $K$ , where $\theta_{0}$ is a smooth cutoff function satisfying $\theta_{0}(x)=1$ for $|x|\leq 1$ and $\theta_{0}(x)=0$ for $|x|\geq 2$ . Set

By assumption (2.18), $h_{s}$ is positive and

for any $s\leq t$ if $t$ is small enough.

Define $v_{s}=e^{sA}h_{s}$ and by definition, $v_{0}=u$ . Then

Since the Ornstein-Uhlenbeck semigroup is a contraction in $L^{1}({\rm d}\gamma)$, together with (2.18), we have

Notice that $h_{t}$ may not be normalized as a probability density w.r.t. $\gamma$, but it is easy to check that there is a constant $c_{t}=1+O(t^{M})$, for any $M>0$, such that $c_{t}h_{t}$ is a probability density. Clearly,

and the same formulas hold if $h_{t}$ is replaced by $v_{t}$ since the OU flow preserves expectation and variance. Let $g_{t}$ be defined by

Then $g_{t}$ is a probability density w.r.t. $\gamma$ with zero mean and variance $1$ . It is easy to check that the total variation norm of $h_{t}-g_{t}$ is smaller than any power of $t$ . Using again the contraction property of $e^{tA}$ and (4.4), we get

Now we check the LSI constant for $g_{t}$ . Recall that $g_{t}$ was obtained from $h_{t}$ by translation and dilation. By definition of the LSI constant, the translation does not change it. The dilation changes the constant, but since our dilation constant is nearly one, the change of LSI constant is also nearly one. So we only have to compare the LSI constants between ${\rm d}\nu=u{\rm d}\gamma$ and $c_{t}h_{t}{\rm d}\gamma$ . From (4.3) and that $c_{t}$ is nearly one, the LSI constant changes by a factor less than $2$ . This proves the claim on the LSI constant.

and this completes the proof of Proposition 4.1.

We now apply Theorem 2.1 to the initial distribution given by the eigenvalues of the symmetric Wigner ensemble with distribution $g_{\tau}\gamma$ where $\tau=N^{-\zeta}$ . By Proposition 4.1, the LSI constant of $g_{\tau}\gamma$ is bounded by the initial LSI constant of $\nu$ by a factor of at most two. Thus we can apply Lemma 2.2 to verify Assumptions 1 and 2 of Theorem 2.1. Assumption 3 follows from the local semicircle law, Theorem 5.1. Thus the correlation functions of the eigenvalues of the ensemble with distribution $(e^{\tau A}g_{\tau})\gamma$ are the same as those of GOE in the sense of (2.13). Finally, using (4.2), we can approximate the $k$ -point correlation function w.r.t. $(e^{\tau A}g_{\tau})\gamma$ by the one w.r.t. $\nu$ after choosing $K$ sufficiently large so that $N^{2}\tau^{K}=N^{2-K\zeta}=o(N^{-k})$ . The additional smallness factor $N^{-k}$ for the estimate on the total variation in (4.2) is necessary to conclude the convergence of the $k$ -point correlation function, since it is rescaled by a factor $N$ in each variable. We also used the trivial fact that the total variation distance of two eigenvalue distributions is bounded by the total variation distance of the distributions of the full matrix ensembles. Finally we remark that the $b\to 0$ limit in (2.19) is needed to replace $\varrho_{sc}(E)$ in (2.13) with $\varrho_{sc}(E^{\prime})$ in (2.19) using the continuity of $O$ . This concludes the proof of Theorem 2.3.
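The choice of $K$ in the last step is elementary arithmetic: $N^{2-K\zeta}=o(N^{-k})$ holds exactly when $2-K\zeta<-k$, i.e. $K>(2+k)/\zeta$. The helper below, whose name `min_moment_order` is purely illustrative, returns the smallest such integer using exact rational arithmetic so that borderline cases are not affected by rounding.

```python
import math
from fractions import Fraction


def min_moment_order(k, zeta):
    """Smallest integer K with 2 - K * zeta < -k, i.e. K > (2 + k) / zeta.

    `zeta` should be an exact value (int, Fraction, or a string such as "1/10").
    """
    zeta = Fraction(zeta)
    if zeta <= 0:
        raise ValueError("zeta must be positive")
    threshold = Fraction(2 + k) / zeta
    # floor + 1 gives the strict inequality even when threshold is an integer
    return math.floor(threshold) + 1
```

For example, with $\zeta=1/10$ and $k=2$ one needs $K>40$, so `min_moment_order(2, Fraction(1, 10))` returns 41.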

Proof of Theorem 2.1. Step 1. The first step is to show that the right hand side of (3.11) vanishes in the large $N$ limit for $\eta=N^{-\varepsilon_{3}}$ with $\varepsilon_{3}$ small enough provided that the estimates (2.10), (2.11) hold. By (2.11), $x_{j}\in I_{j}$ (recall the definition of $I_{j}$ from (3.1)) with a very high probability. In this paper we will say that an event holds with a very high probability if the complement event has a probability that is subexponentially small in $N$ , i.e., it is bounded by $C\exp(-N^{\varepsilon})$ with some fixed $\varepsilon>0$ . From the definition of $b_{j}$ (3.6), we have

Notice that the function $g(x):=\frac{\mbox{sgn}(x)}{|x|+\eta}$ satisfies

as long as $x$ and $y$ have the same sign. In our case, $x_{j}-x_{k}$ and $x_{j}-\gamma_{k}$ have the same sign as long as

The last inequality holds with a very high probability due to (2.11) provided $\varepsilon_{3}$ is smaller than ${\mathfrak{b}}$. We remark that this is the only place where we used Assumption 2. Thus, with a very high probability, we have

The contribution to $\Lambda$ of the exceptional event is negligible, since its probability is subexponentially small in $N$ and $|b_{j}|\leq C\eta\leq CN^{\varepsilon_{3}}$ . Thus, recalling the definition of $Q$ from (2.9) and the definition of $\Lambda$ from (3.9), we can bound the error term on the right hand side of (3.11) by

provided that (2.10) holds and $\varepsilon_{3}$ and $\delta$ are small enough, depending on ${\mathfrak{a}}$ .
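The elementary property of $g(x)=\mbox{sgn}(x)/(|x|+\eta)$ used in Step 1 can be verified directly. The computation below is a sketch of one way to obtain a Lipschitz-type bound for arguments of the same sign; the bound used in the text may be stated in a different form. The case $x,y<0$ follows from $g(-x)=-g(x)$, and no such bound can hold across the origin, where $g$ jumps by $2/\eta$.

```latex
\begin{align*}
x,y>0:\qquad g(x)-g(y)
 &= \frac{1}{x+\eta}-\frac{1}{y+\eta}
  = \frac{y-x}{(x+\eta)(y+\eta)},\\
\text{hence}\qquad |g(x)-g(y)|
 &= \frac{|x-y|}{(|x|+\eta)(|y|+\eta)}
 \;\le\; \frac{|x-y|}{\eta^{2}} .
\end{align*}
```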

Step 2. From (3.11) to correlation functions. The equation (3.11) shows that for a special class of observables, depending only on rescaled differences of the points $x_{j}$ , the expectations w.r.t. $f_{t}\mu$ and w.r.t the equilibrium measure $\mu$ are identical in the large $N$ limit. But the class of observables in (2.13) of Theorem 2.1 is somewhat bigger and we need to extend our result to them. Without the ${\rm d}E^{\prime}$ integration in (2.13), the observable would strongly depend on a fixed energy $E^{\prime}$ and could not be approximated by observables depending only on differences of $x_{j}$ . Taking a small averaging in $E^{\prime}$ remedies this problem.

We will consider $E$ , $b$ and $n$ fixed, i.e., the constants in this proof may depend on these three parameters. We start with the identity

We will set $Y_{i,{\bf{m}}}=0$ if $i+m_{n}>N$ . We have to show that

Let $M$ be an $N$ -dependent parameter chosen at the end of the proof. Let

and note that $|S_{n}(M)|\leq M^{n-1}$ . To prove (4.12), it is sufficient to show that

hold for any $\tau\geq\eta^{1/3}N^{\delta}$ (note that $\tau=\infty$ corresponds to the equilibrium, $f_{\infty}=1$ ), where $\eta^{1/3}N^{\delta}$ is chosen in Theorem 2.1 and $\eta$ is chosen in Step 1.

Case 1: Small ${\bf{m}}$ case; proof of (4.13).

After performing the ${\rm d}E^{\prime}$ integration, we will eventually apply Theorem 3.1 to the function

For any $E$ and $0<\xi<b$ define sets of integers $J=J_{E,b,\xi}$ and $J^{\pm}=J^{\pm}_{E,b,\xi}$ by

where $\gamma_{i}$ was defined in (2.8). Clearly $J^{-}\subset J\subset J^{+}$ . With these notations, we have

The error term $\Omega^{+}_{J,{\bf{m}}}$ , defined implicitly by (4.16), comes from those indices $i\not\in J^{+}$ for which $x_{i}\in[E-b,E+b]+O(N^{-1})$ , since $Y_{i,{\bf{m}}}(E^{\prime},{\bf{x}})=0$ unless $|x_{i}-E^{\prime}|\leq C/N$ , with the constant depending on the support of $O$ . Thus

for any sufficiently large $N$ assuming $\xi\gg 1/N$ and using that $O$ is a bounded function. The additional $N^{-1}$ factor comes from the ${\rm d}E^{\prime}$ integration. Taking the expectation with respect to the measure $f_{\tau}{\rm d}\mu$ , and by a Schwarz inequality, we get

using Assumption 1 (2.10). We can also estimate

where the error term $\Xi^{+}_{J,{\bf{m}}}$ , defined by (4.18), comes from indices $i\in J^{-}$ such that $x_{i}\not\in[E-b,E+b]+O(1/N)$ . It satisfies the same bound (4.17) as $\Omega^{+}_{J,{\bf{m}}}$ . By the continuity of $\varrho$ , the density of $\gamma_{i}$ ’s is bounded by $CN$ , thus $|J^{+}\setminus J^{-}|\leq CN\xi$ and $|J\setminus J^{-}|\leq CN\xi$ . Therefore, summing up the formula (4.15) for $i\in J$ , we obtain from (4.16) and (4.18)

for each ${\bf{m}}\in S_{n}$ . A similar lower bound can be obtained analogously, and after choosing $\xi=N^{-1/4}$ , we obtain

Adding up (4.19) for all ${\bf{m}}\in S_{n}(M)$ , we get

and the same estimate holds for the equilibrium, i.e., if we set $\tau=\infty$ in (4.20). Subtracting these two formulas and applying (3.11) from Theorem 3.1 to each summand on the second term in (4.19) and using (4.10), we conclude that

we obtain that (4.21) vanishes as $N\to\infty$ , and this proves (4.13).

Case 2: Large ${\bf{m}}$ case; proof of (4.14).

as $N\to\infty$ . Inserting this into (4.24) completes the proof of (4.14) and the proof of Theorem 2.1.

Local semicircle law and proof of Lemma 2.2

We first recall the local semicircle law concerning the eigenvalues $x_{1}<x_{2}<\ldots<x_{N}$ of $H$ . Let

be the Stieltjes transform of the empirical eigenvalue distribution at spectral parameter $z=E+i\eta$ , $\eta>0$ , and let

be the Stieltjes transform of the semicircle distribution. In Theorem 4.1 of we proved the following version of the local semicircle law (we remark that, contrary to what is stated in Theorem 4.1 of , condition (2.5) of is not needed and has not been used in the proof):

Assume that the distribution $\nu$ of the matrix elements of the symmetric Wigner matrix ensemble satisfies (2.15) with some constant $\theta$ and assume that $y$ is such that $(\log N)^{4}/N\leq|y|\leq 1$ . Then for any $K>0$ there exist positive constants $\delta_{0}$ , $C$ and $c>0$ , depending only on $K$ and $\theta$ , such that for any $|x|\leq K$ we have

for all $0\leq\delta\leq\delta_{0}$ and for all $N$ large enough.
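The two Stieltjes transforms compared by the local semicircle law can be evaluated numerically. The sketch below is an illustration under stated assumptions, not the construction used in the proof: it assumes the normalizations $m_{N}(z)=\frac{1}{N}\sum_{j}(x_{j}-z)^{-1}$ and a semicircle law supported on $[-2,2]$ (so that $m_{sc}$ solves $m^{2}+zm+1=0$ with ${\rm Im}\,m>0$ ), it uses Gaussian entries, and all function names are ours.

```python
import numpy as np


def sample_symmetric_wigner(n: int, rng: np.random.Generator) -> np.ndarray:
    """Real symmetric matrix with Gaussian entries of variance 1/n off the diagonal."""
    a = rng.standard_normal((n, n))
    return (a + a.T) / np.sqrt(2.0 * n)


def empirical_stieltjes(eigs: np.ndarray, z: complex) -> complex:
    """m_N(z) = (1/N) * sum_j 1 / (x_j - z)."""
    return complex(np.mean(1.0 / (eigs - z)))


def semicircle_stieltjes(z: complex) -> complex:
    """Root of m^2 + z*m + 1 = 0 with positive imaginary part (requires Im z > 0)."""
    root = np.sqrt(complex(z) ** 2 - 4.0)
    m = (-z + root) / 2.0
    # The two roots have imaginary parts of opposite sign; keep the one with Im m > 0.
    return complex(m if m.imag > 0 else (-z - root) / 2.0)


rng = np.random.default_rng(0)
eigs = np.linalg.eigvalsh(sample_symmetric_wigner(400, rng))
z = 0.3 + 0.2j
print(abs(empirical_stieltjes(eigs, z) - semicircle_stieltjes(z)))
```

Decreasing the imaginary part of $z$ at fixed $N$ makes the comparison noisier, in line with the lower bound on $|y|$ required in the theorem.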

As a corollary of the local semicircle law, the number of eigenvalues up to a fixed energy $E$ can be estimated. The precise statement is the following proposition and it was proven in Proposition 4.2, equation (4.14) of .

Assume that the distribution $\nu$ of the matrix elements of the symmetric Wigner matrix ensemble satisfies (2.15) with some finite constant $\theta$ . Let

be the expectation of the empirical distribution function of the eigenvalues and recall the definition of $n_{sc}(E)$ from (2.7). Then there exists a constant $C>0$ , depending only on the constant $\theta$ in (2.15), such that

The local semicircle law implies that the local density of eigenvalues is bounded, but the estimate in Theorem 5.1 deteriorates near the spectral edges. The following upper bound on the number of eigenvalues in an interval provides uniform control near the edges. This lemma was essentially proved in Theorem 4.6 of using ideas from an earlier version, Theorem 5.1 of . For the convenience of the reader, a detailed proof is given in Appendix C.

We remark that for Theorem 5.1 and Lemma 5.3 it is sufficient to assume only the Gaussian decay condition (2.16) instead of the logarithmic Sobolev inequality (2.15).

We can now start to prove Lemma 2.2. Since the logarithmic Sobolev inequality holds for $\nu$ (2.15), it holds for $\nu_{t}$ as well; for a proof see Lemma B.1 in Appendix B and recall that $\nu_{t}$ is the convolution of $\nu$ with the Ornstein-Uhlenbeck kernel which itself satisfies the logarithmic Sobolev inequality. Moreover, the LSI constant of $\nu_{t}$ is bounded uniformly for all $t>0$ , since it is the maximum of the LSI constant $\theta$ of $\nu$ and the LSI constant of the Ornstein-Uhlenbeck kernel, which is bounded uniformly in time. Therefore Lemma 2.2 follows immediately from its time independent version:

Suppose that the distribution $\nu$ of the matrix elements of the symmetric Wigner ensemble satisfies (2.15) with some finite constant $\theta$ . Then there exist positive constants ${\mathfrak{a}}$ , ${\mathfrak{b}}$ and ${\mathfrak{c}}$ such that

hold for the eigenvalues $x_{j}$ of $H$ and for any $N\geq N_{0}=N_{0}(\theta,{\mathfrak{a}},{\mathfrak{b}},{\mathfrak{c}})$ .

The proof of Lemma 5.4 is divided into two steps. In the first step, Section 5.1, we estimate the fluctuation of the eigenvalues $x_{j}$ around their mean values using the logarithmic Sobolev inequality. In the second step, Section 5.2, we estimate the deviation of the mean location of $x_{j}$ from the classical location $\gamma_{j}$ using (5.3).

Let $\alpha_{j}:={\mathbb{E}}\,x_{j}$ denote the expected location of $x_{j}$ . We start with an estimate on the expected location of the extreme eigenvalues:

Suppose that the probability measure $\nu$ of the matrix entries satisfies

for some $c_{0}>0$ (this condition is satisfied, in particular, under (2.15), see (2.16)). Then for any $\delta>0$ we have

with some constant $C_{0}$ depending on $\delta$ and $c_{0}$ .

Proof. For any $M$ , define the probability measure

Denote by $\zeta_{M}^{N}=\bigotimes_{i\leq j}\zeta_{M}$ ( $\nu^{N}$ resp.) the probability law of the random matrices whose matrix elements are distributed according to $\zeta_{M}$ ( $\nu$ resp.). As usual, we neglect the fact that the distribution $\nu$ should be replaced with $\widetilde{\nu}$ for the diagonal elements $i=j$ . Since the number of index pairs $i\leq j$ is of order $N^{2}$ , the total variation distance between $\zeta_{M}^{N}$ and $\nu^{N}$ is bounded by

From Theorem 1.4 of we obtain that for any $\delta>0$

holds almost surely w.r.t. $\zeta_{M}^{N}$ . It follows from (5.11) that $x_{N}$ is bounded w.r.t. the distribution $\nu^{N}$ as well, up to a subexponentially small probability. To estimate the tail of $x_{N}$ w.r.t. $\nu^{N}$ , we use that $\max_{j}|x_{j}|^{2}\leq\mbox{Tr}\,H^{2}$ and the trivial large deviation bound based upon (5.8),

that holds for any $K>0$ with constants $C,c$ depending on $c_{0}$ . We thus obtain that the expectations of $x_{N}$ w.r.t. these two measures satisfy

Thus we have proved that, for any $\delta>0$ ,

with some constant $C$ depending on $c_{0}$ and $\delta$ . A similar lower bound holds for $\alpha_{1}$ .
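The deterministic inequality $\max_{j}|x_{j}|^{2}\leq\mbox{Tr}\,H^{2}$ used in the proof, and the location of the largest eigenvalue near $2$ , can be observed on a sample. The sketch below is only an illustration: it assumes Gaussian entries with off-diagonal variance $1/N$ , and the variable names are ours.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300
a = rng.standard_normal((n, n))
h = (a + a.T) / np.sqrt(2.0 * n)      # symmetric, off-diagonal variance 1/n

eigs = np.linalg.eigvalsh(h)           # ascending: x_1 <= ... <= x_N
trace_h2 = float(np.sum(h * h))        # Tr H^2 equals the sum of squared entries
max_sq = float(np.max(eigs ** 2))      # max_j |x_j|^2

print("largest eigenvalue:", eigs[-1])
print("max |x_j|^2 =", max_sq, "<= Tr H^2 =", trace_h2)
```

Since $\mbox{Tr}\,H^{2}=\sum_{j}x_{j}^{2}$ is of order $N$ , this inequality is very crude; it is only needed to control the tail on an event of subexponentially small probability.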

Next we estimate the fluctuations of $x_{j}$ :

with a constant $C$ depending on $\varepsilon$ and $\theta$ .

Proof. First order perturbation theory of the eigenvalue $x_{j}$ of $H$ shows that

Using (2.15) and the Bobkov-Götze concentration inequality (Theorem 2.1 of ), we have for any $T>0$

after optimizing for $T$ and using that $|\nabla x_{j}|\leq CN^{-1/2}$ from above. This proves (5.14).
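The perturbation formula referred to above is not reproduced in this copy. As a hedged illustration, the sketch below checks the standard first order statement that, for a simple eigenvalue $x_{j}$ of a real symmetric $H$ with normalized eigenvector ${\bf{u}}_{j}$ , the derivative of $x_{j}$ along a symmetric direction $A$ is ${\bf{u}}_{j}\cdot A{\bf{u}}_{j}$ . It also evaluates the squared gradient with respect to the independent entries $h_{ik}$ , $i\leq k$ , which is at most $2$ ; if the gradient is taken in unscaled variables $\sqrt{N}h_{ik}$ (an assumption on our part), this yields a bound of order $N^{-1/2}$ . All names are ours.

```python
import numpy as np


def eigenvalue_derivative(h: np.ndarray, a: np.ndarray, j: int) -> float:
    """d/dt of the j-th eigenvalue of h + t*a at t = 0, equal to u_j^T a u_j."""
    _, vecs = np.linalg.eigh(h)
    u = vecs[:, j]
    return float(u @ a @ u)


def finite_difference(h: np.ndarray, a: np.ndarray, j: int, eps: float = 1e-6) -> float:
    """Central difference approximation of the same derivative."""
    plus = np.linalg.eigvalsh(h + eps * a)[j]
    minus = np.linalg.eigvalsh(h - eps * a)[j]
    return float((plus - minus) / (2.0 * eps))


def gradient_norm_squared(h: np.ndarray, j: int) -> float:
    """Squared norm of the gradient of x_j in the independent entries h_ik, i <= k."""
    _, vecs = np.linalg.eigh(h)
    u2 = vecs[:, j] ** 2
    # d x_j / d h_ii = u_i^2 and d x_j / d h_ik = 2 u_i u_k for i < k,
    # so the squared norm equals 2 * (sum_i u_i^2)^2 - sum_i u_i^4 <= 2.
    return float(2.0 * np.sum(u2) ** 2 - np.sum(u2 ** 2))


rng = np.random.default_rng(4)
n, j = 50, 20
b = rng.standard_normal((n, n))
h = (b + b.T) / np.sqrt(2.0 * n)
c = rng.standard_normal((n, n))
a = (c + c.T) / np.sqrt(2.0 * n)       # symmetric perturbation direction

print(eigenvalue_derivative(h, a, j), finite_difference(h, a, j))
print(gradient_norm_squared(h, j))
```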

The following proposition is a refinement of Proposition 5.6:

Fix a sufficiently small constant $\delta>0$ and set $\kappa=N^{-1/18+\delta}$ . Then for any index $i$ with $CN\kappa^{3/2}\leq i\leq N(1-C\kappa^{3/2})$ we have

The constants $C$ and $c$ depend on $\delta$ and $\theta$ but are independent of $N$ .

As a preparation for the proof of Proposition 5.7, we need the following estimate on the tail of the gap distribution.

Let $|E|<2$ . Denote by $x_{\alpha}$ the largest eigenvalue below $E$ and assume that $\alpha\leq N-1$ . Then there exist positive constants $C$ and $c$ , depending only on the Sobolev constant $\theta$ in (2.15), such that for any $M$ satisfying $c(\log N)^{4}/(2-|E|)\leq M\leq CN(2-|E|)$ , we have

This lemma was proven in Theorem E1 of , see also Theorem 3.3 of , and the proof will not be repeated here. We only mention the main idea, that the local semicircle law, Theorem 5.1, provides a positive lower bound on the number of eigenvalues in any interval $I$ of size $|I|\geq A(\log N)^{4}/\big{[}N\big{|}2-|x|\big{|}\big{]}$ around the point $x$ with a sufficiently large constant $A$ . In particular, it follows that there is at least one eigenvalue in each such interval $I$ with a very high probability.
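The main idea mentioned above, that the semicircle law predicts the number of eigenvalues in small intervals, can be observed numerically. The sketch below compares the eigenvalue count in a short bulk interval with $N$ times the semicircle mass of that interval. It assumes Gaussian entries and a semicircle law on $[-2,2]$ ; the interval and the names are illustrative choices of ours.

```python
import math
import numpy as np


def semicircle_mass(lo: float, hi: float) -> float:
    """Mass that the semicircle law on [-2, 2] assigns to the interval [lo, hi]."""
    def cdf(e: float) -> float:
        e = max(-2.0, min(2.0, e))
        return 0.5 + e * math.sqrt(4.0 - e * e) / (4.0 * math.pi) + math.asin(e / 2.0) / math.pi
    return cdf(hi) - cdf(lo)


rng = np.random.default_rng(3)
n = 1000
a = rng.standard_normal((n, n))
eigs = np.linalg.eigvalsh((a + a.T) / np.sqrt(2.0 * n))

lo, hi = -0.1, 0.1
observed = int(np.sum((eigs >= lo) & (eigs <= hi)))
expected = n * semicircle_mass(lo, hi)
print(observed, expected)
```

In particular the interval is not empty, which is the qualitative statement behind the gap estimate.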

Proof of Proposition 5.7. We choose $M,K$ positive numbers, depending on $N$ , such that

with some sufficiently small $c>0$ and large $C>0$ constants. Let

Consider an index $i$ with $CN\kappa^{3/2}\leq i\leq N(1-C\kappa^{3/2})$ , then $|2-|\gamma_{i}||\geq C\kappa$ . We first show that $|2-|x_{i}||\geq C\kappa^{2}$ with a very high probability. Suppose, on the contrary, that $x_{i}<-2+C\kappa^{2}$ for some $i\geq CN\kappa^{3/2}$ (the case $x_{i}\geq 2-C\kappa^{2}$ is treated analogously). From (5.9) and (5.14) it follows that $x_{1}\geq-2-C_{0}N^{-1/4+\delta}$ with a very high probability. But then the interval $[-2-C_{0}N^{-1/4+\delta},-2+C\kappa^{2}]$ of length $C_{0}N^{-1/4+\delta}+C\kappa^{2}\ll\kappa^{3/2}$ would contain $CN\kappa^{3/2}$ eigenvalues, an event with an extremely low probability by (5.4).
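The comparison of lengths used above is an exponent count. With $\kappa=N^{-1/18+\delta}$ we have $\kappa^{3/2}=N^{-1/12+3\delta/2}$ , hence

```latex
\frac{\kappa^{2}}{\kappa^{3/2}} = \kappa^{1/2} = N^{-\frac{1}{36}+\frac{\delta}{2}},
\qquad
\frac{N^{-1/4+\delta}}{\kappa^{3/2}}
  = N^{-\frac{1}{4}+\delta+\frac{1}{12}-\frac{3\delta}{2}}
  = N^{-\frac{1}{6}-\frac{\delta}{2}} .
```

Both ratios tend to zero as $N\to\infty$ for $0<\delta<1/18$ , which is the range where $\kappa\to 0$ .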

Knowing that $|2-|x_{i}||\geq\kappa^{2}$ with a very high probability, we can use (5.17) to conclude that for any index $i$ with $CN\kappa^{3/2}\leq i\leq N(1-C\kappa^{3/2})$ we have

Similarly to the calculation in Theorem 3.1 of , by using the logarithmic Sobolev inequality (2.15), we have

This bound holds for any index $i$ with the remark that if $i<K$ or $i>N-K$ , then the averaging over the indices $j$ is done asymmetrically.

Combining this estimate with (5.21) we have, apart from a set of very small probability, that

for any $CN\kappa^{3/2}\leq i\leq N(1-C\kappa^{3/2})$ . Taking expectation, and using the tail estimate (5.13) to control $x_{N}$ on the event of very small probability where (5.23) may not hold, we also obtain for these $i$ indices that

Subtracting the last two inequalities yields

and combining this bound with the estimate (5.14) for the extreme indices, we obtain

The inequalities (5.15) and (5.16) now follow if we choose the parameters as

which choice is compatible with the conditions (5.18).

5.2 Deviation of the eigenvalues from their classical locations

Proposition 5.9 below estimates the distance of the eigenvalues from their location given by the semicircle law. This will justify that the convex extension of the potential $W_{j}$ affects only regimes of very small probability.

For any small $\delta>0$ and for any $j=1,2,\ldots N$ we have

The constants $C$ and $c$ depend on $\delta$ and $\theta$ but are independent of $N$ .

We remark that in the bulk $\alpha_{m}-\gamma_{m}$ is expected to be bounded by $O(N^{-1+\varepsilon})$ (in the Hermitian case it was proven in , see also ); near the edges one expects $\alpha_{m}-\gamma_{m}\sim O(N^{-2/3})$ . Our estimate is not optimal, but it gives a short proof that is sufficient for our purpose. We remark that after submitting this paper, these conjectures were proven in .

to be the counting function of the expected locations of the eigenvalues. We compare $\widetilde{n}(E)$ with $n(E)$ defined in (5.2). Using the fluctuation bound (5.14), we have

and the second term is analogous. We thus have

For the energy range $|E|\leq 3$ , we use (5.28):

with an $\varepsilon$ dependent constant.

To estimate $|\alpha_{m}-\gamma_{m}|$ , we can assume, without loss of generality, that $\alpha_{m}\geq\gamma_{m}$ , the other case is treated analogously. Let $\lambda>0$ be a parameter that will be optimized later. Set $m_{0}=[CN\lambda^{3/2}]$ with a sufficiently large constant $C$ . Since $n_{sc}(-2+\delta)\sim\delta^{3/2}$ , for any small $\delta>0$ , the parameter $\lambda$ is roughly the energy difference from the edge to the $m_{0}$ -th eigenvalue.
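The edge behaviour $n_{sc}(-2+\delta)\sim\delta^{3/2}$ and its consequence for the classical locations can be checked directly. The definitions (2.7) and (2.8) are not reproduced in this section, so the sketch below assumes the usual conventions: $n_{sc}$ is the distribution function of the semicircle law on $[-2,2]$ and $\gamma_{j}$ solves $n_{sc}(\gamma_{j})=j/N$ . The function names are ours.

```python
import math


def n_sc(e: float) -> float:
    """Integral of sqrt(4 - x^2) / (2*pi) over [-2, e] (semicircle distribution function)."""
    if e <= -2.0:
        return 0.0
    if e >= 2.0:
        return 1.0
    return 0.5 + e * math.sqrt(4.0 - e * e) / (4.0 * math.pi) + math.asin(e / 2.0) / math.pi


def classical_location(j: int, n: int) -> float:
    """Solve n_sc(gamma) = j / n by bisection (assumed convention for gamma_j)."""
    lo, hi = -2.0, 2.0
    target = j / n
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if n_sc(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


delta = 1e-3
# Near the lower edge, n_sc(-2 + delta) is close to (2 / (3*pi)) * delta^(3/2).
print(n_sc(-2.0 + delta) / delta ** 1.5, 2.0 / (3.0 * math.pi))

n, m0 = 10 ** 6, 1000
# Distance of gamma_{m0} from the edge, compared with (3*pi*m0 / (2*n))^(2/3).
print(classical_location(m0, n) + 2.0, (3.0 * math.pi * m0 / (2.0 * n)) ** (2.0 / 3.0))
```

Inverting the edge asymptotics gives $\gamma_{m_{0}}+2\approx(3\pi m_{0}/2N)^{2/3}$ , so an index $m_{0}$ of order $N\lambda^{3/2}$ corresponds to a distance of order $\lambda$ from the edge.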

From the property $n_{sc}(-2+\delta)\sim\delta^{3/2}$ for small $\delta$ , we have

Combining this estimate with (5.32), we have

for any $m$ with $m_{0}\leq m\leq N-m_{0}$ .

For the extreme indices, we use that if $m\leq m_{0}$ , then from Lemma 5.5 and (5.33), we have

with $C$ depending on $\varepsilon$ . Similar estimates hold at the upper edge of the spectrum, i.e. for $m\geq N-m_{0}$ . Choosing $\lambda=CN^{-1/5+\varepsilon}$ , we conclude the proof of (5.27). The proof of (5.26) then follows from (5.14) and this concludes the proof of Proposition 5.9.

The following Proposition is a strengthening of the bound (5.32) used previously.

for any $\delta>0$ and with a constant $C$ depending on $\delta$ and $\theta$ .

Proof. Recalling the definition of $\Phi$ from (5.19), we will prove that

which gives (5.34) with the choice of parameters (5.25). We proceed similarly to the proof of Proposition 5.9 but we notice that in addition to (5.28), a stronger bound on $|n(E)-\widetilde{n}(E)|$ is available for $E\in I:=[E_{-},E_{+}]$ , where $E_{\pm}:=\pm(2-C_{2}\kappa)$ , with some large constant $C_{2}$ and setting $\kappa:=N^{-1/18+\delta}$ as in Proposition 5.7. To obtain an improved bound, note that for any $E$ in this interval

To see this inequality, define the random index

The estimate (5.36) will then follow if we prove that $j_{0}\leq j_{1}$ , i.e. $\alpha_{j_{0}}\leq E+\Phi$ , with a very high probability. By (5.26) we have, with a very high probability, that

Therefore, with a very high probability, $\gamma_{j_{0}}$ is in the $CN^{-1/5+\delta}$ vicinity of $E\in I$ , and thus $CN\kappa^{3/2}\leq j_{0}\leq N(1-C\kappa^{3/2})$ holds for any fixed $C$ if $C_{2}$ in the definition of $E_{\pm}$ is sufficiently large. Thus $|x_{j_{0}}-\alpha_{j_{0}}|\leq\Phi$ with a very high probability by (5.15), so $x_{j_{0}}\leq E$ implies $\alpha_{j_{0}}\leq E+\Phi$ and this proves (5.36).

is analogous. Finally, by the Lipschitz continuity of $\widetilde{n}(E)$ on a scale bigger than $(\log N)^{2}/N$ , we have

where we also used that $\Phi\geq C\exp(-cN^{\delta})$ .

Define the interval $J=[-2-C_{1}N^{-1/4+\delta},2+C_{1}N^{-1/4+\delta}]$ with a constant $C_{1}$ larger than the constant $C_{0}$ in (5.9). Using (5.30), we have

since $|J\setminus I|\leq C\kappa+N^{-1/4+\delta}\leq C\kappa$ . Finally, $\widetilde{n}(E)\equiv 0$ for $E<-2-C_{0}N^{-1/4+\delta}$ and $\widetilde{n}(E)\equiv 1$ for $E>2+C_{0}N^{-1/4+\delta}$ by (5.9). Since $C_{1}>C_{0}$ , combining these estimates with the fluctuation (5.14) and with the tail estimate (5.13), we obtain that $n(E)(1-n(E))\leq C\exp\big{[}-cN^{1/4}\big{]}$ for any $E\in J^{c}$ , and it decays exponentially for large $|E|$ , therefore

Collecting all these estimates, inserting them into (5.38) and using (5.3), we obtain (5.35) and conclude the proof of Proposition 5.10.

Finally, we can complete the proof of Lemma 5.4. By (5.16) and (5.34), we have

apart from a set of probability $C\exp\big{[}-cN^{\delta}\big{]}$ . Combining it with the tail estimate (5.13) on $\max|x_{k}|$ , we obtain (5.5) with any ${\mathfrak{a}}<1/18$ . The inequality (5.6) in Lemma 5.4 follows immediately from (5.26) with any ${\mathfrak{c}}>0$ sufficiently small and with ${\mathfrak{b}}<1/5-{\mathfrak{c}}$ .

Appendix A Some Properties of the Eigenvalue Process

In the main part of the paper we did not specify the function spaces in which the equations (2.4) and (3.12) are solved. In this appendix we summarize some basic properties of these equations. In particular, we justify the integration by parts in (3.17). For simplicity, we consider the most singular $\beta=1$ case only.

The Dyson Brownian motion as a stochastic process was rigorously constructed in Section 4.3.1 of . It was proved that the eigenvalues do not collide with probability one and thus (2.4) holds in a weak sense on the open set $\Sigma_{N}$ . The coefficients of $L$ have a $(x_{i}-x_{j})^{-1}$ singularity near the coalescence hyperspace $x_{i}=x_{j}$ . We focus only on the single collision singularities, i.e. on the case $j=i\pm 1$ . By the ordering of the eigenvalues, higher order collision points form a zero measure set on the boundary of $\Sigma_{N}$ and can thus be neglected. In an open neighborhood near the coalescence hyperspace $x_{i}=x_{i+1}$ , the generator has the form

after a change of variables, $v=\frac{1}{2}(x_{i}+x_{i+1})$ , $u=\frac{1}{2}(x_{i+1}-x_{i})$ , where $L_{reg}$ has regular coefficients. The boundary condition at $u=0$ is given by the standard boundary condition of the generator of the Bessel process, $\partial_{u}^{2}+\frac{1}{u}\partial_{u}$ , which is $uf^{\prime}(u)\to 0$ as $u\to 0+$ . Thus $L$ is defined on functions $f\in C^{2}(\Sigma_{N})$ with sufficient decay at infinity and with boundary conditions

The generator $\widetilde{L}$ of (3.12) differs from $L$ only in drift terms with bounded coefficients, hence the boundary conditions of $\widetilde{L}$ and $L$ coincide. Finally, we need some non-vanishing and regularity property of the solution of (3.12):

Let $\Omega\subset\Sigma_{N}$ be a bounded open set such that

i.e. $\overline{\Omega}$ intersects at most one of the coalescence hyperplanes, namely $\{x_{1}=x_{2}\}$ . Then any weak solution $q_{t}({\bf{x}})$ of (3.12) with boundary conditions (A.1) is $C^{2}$ on $(t,{\bf{x}})\in{\mathbb{R}}_{+}\times\overline{\Omega}$ and for any $t>0$ we have

where $L_{reg}$ is an elliptic operator with second derivatives in the ${\bf{y}}$ variables and with bounded coefficients on the compact set $\Phi(\overline{\Omega})$ . The solution in the new coordinates is $\widetilde{q}_{t}(u,{\bf{y}})=q_{t}(\Phi^{-1}(u,{\bf{y}}))$ . Introducing a function $\widehat{q}_{t}(a,b,{\bf{y}}):=\widetilde{q}_{t}(\sqrt{a^{2}+b^{2}},{\bf{y}})$ defined in $N+1$ variables, we see that $\widehat{q}_{t}$ satisfies $\partial_{t}\widehat{q}_{t}=\widehat{L}\widehat{q}_{t}$ , where

i.e. $\widehat{L}$ is elliptic with bounded coefficients in the new variables. Notice that the boundary condition (A.1) implies that, in the two dimensional plane of $(a,b)$ , the support of the test function for the equation $\partial_{t}\widehat{q}_{t}=\widehat{L}\widehat{q}_{t}$ is allowed to include the origin $(0,0)$ .
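The reason the lift to the variables $(a,b)$ removes the singularity is the radial form of the two dimensional Laplacian. As a short check, with $u=\sqrt{a^{2}+b^{2}}$ and with the prefactors of the generator omitted (they are not reproduced in this copy):

```latex
\partial_a \widehat{q} = \frac{a}{u}\,\partial_u \widetilde{q},
\qquad
\partial_a^{2} \widehat{q}
  = \frac{a^{2}}{u^{2}}\,\partial_u^{2}\widetilde{q}
  + \Big(\frac{1}{u}-\frac{a^{2}}{u^{3}}\Big)\partial_u \widetilde{q},
\qquad\text{and likewise for } b,
\\[4pt]
\big(\partial_a^{2}+\partial_b^{2}\big)\widehat{q}
  = \partial_u^{2}\widetilde{q}
  + \Big(\frac{2}{u}-\frac{a^{2}+b^{2}}{u^{3}}\Big)\partial_u\widetilde{q}
  = \Big(\partial_u^{2}+\frac{1}{u}\,\partial_u\Big)\widetilde{q} .
```

Thus the Bessel-type operator $\partial_{u}^{2}+\frac{1}{u}\partial_{u}$ acting on $\widetilde{q}$ corresponds to the regular operator $\partial_{a}^{2}+\partial_{b}^{2}$ acting on $\widehat{q}$ .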

By standard parabolic regularity, we obtain that the solution is $C^{2}$ and is bounded from above and below.

This lemma justifies the integration by parts in (3.17). Since $q\in C^{2}$ and it is separated away from zero, $h=\sqrt{q}$ has no singularity on the coalescence lines. Since the function $\exp(-\widetilde{\mathcal{H}})$ vanishes whenever $x_{i}=x_{j}$ for some $i\neq j$ , the boundary terms of the form

Appendix B Logarithmic Sobolev inequality for convolution measures

for any $f$ with $\int f(x)K\ast H(x){\rm d}x=1$ . Here $\nabla={\rm d}/{\rm d}x$ .

Proof. The following proof is really a special case of the martingale approach used in to prove LSI. Let

For any fixed $y$ , from the LSI w.r.t. the measure $K(x-y){\rm d}x$ , the first term on the right hand side is bounded by

Integrating by parts, we can rewrite the last term as

where we have used $f^{\prime}(x)=2\sqrt{f(x)}\nabla\sqrt{f(x)}$ and the Schwarz inequality. Combining these inequalities, we have proved the Lemma.

Appendix C Proof of Lemma 5.3

For any $1\leq k\leq N$ , let $H^{(k)}$ denote the $(N-1)\times(N-1)$ minor that is obtained from the Wigner matrix $H$ by removing the $k$ -th row and column. Let ${\bf{a}}^{(k)}=(h_{k1},h_{k2},\ldots h_{k,k-1},h_{k,k+1},\ldots h_{kN})^{t}$ be the $k$ -th column of $H$ without the $h_{kk}$ element. Let $\lambda_{1}^{(k)}<\lambda_{2}^{(k)}<\ldots<\lambda_{N-1}^{(k)}$ be the eigenvalues and ${\bf{u}}_{1}^{(k)},{\bf{u}}_{2}^{(k)},\ldots$ the corresponding eigenvectors of $H^{(k)}$ and set

It is well known (see, e.g. Lemma 2.5 of ), that the eigenvalues of $H^{(k)}$ and $H$ are interlaced for each $k$ , i.e.

Expressing the resolvent $G=(H-z)^{-1}$ of $H$ at a spectral parameter $z=E+i\eta$ , $\eta>0$ , in terms of the resolvent of $H^{(k)}$ , we obtain

By considering only the imaginary part, we obtain

where we restricted the $\alpha$ summation in (C.3) only to eigenvalues lying in $I$ .
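Both ingredients of this argument, the interlacing of the eigenvalues of $H$ and $H^{(k)}$ and the expression of the diagonal resolvent entry through the minor, can be verified numerically. The displays (C.1)-(C.3) are not reproduced in this copy; the sketch below assumes the standard Schur complement form $G_{kk}=\big[h_{kk}-z-{\bf{a}}^{(k)}\cdot(H^{(k)}-z)^{-1}{\bf{a}}^{(k)}\big]^{-1}$ for a real symmetric $H$ , and the variable names are ours.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 8, 3
a = rng.standard_normal((n, n))
h = (a + a.T) / np.sqrt(2.0 * n)

keep = [i for i in range(n) if i != k]
minor = h[np.ix_(keep, keep)]           # H^(k): k-th row and column removed
col = h[keep, k]                         # a^(k): k-th column without h_kk

x = np.linalg.eigvalsh(h)                # x_1 <= ... <= x_N
lam = np.linalg.eigvalsh(minor)          # lambda_1 <= ... <= lambda_{N-1}

# Interlacing: x_1 <= lambda_1 <= x_2 <= ... <= lambda_{N-1} <= x_N.
interlaced = bool(np.all(x[:-1] <= lam + 1e-12) and np.all(lam <= x[1:] + 1e-12))

# Diagonal resolvent entry computed directly and through the minor.
z = 0.1 + 0.05j
g_kk = np.linalg.inv(h - z * np.eye(n))[k, k]
minor_resolvent = np.linalg.inv(minor - z * np.eye(n - 1))
schur = 1.0 / (h[k, k] - z - col @ minor_resolvent @ col)

print(interlaced, abs(g_kk - schur))
```

Expanding the quadratic form in the eigenbasis of $H^{(k)}$ turns it into a sum over the eigenvalues $\lambda_{\alpha}^{(k)}$ , which is the sum that is restricted to eigenvalues in $I$ above.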

For each $k=1,2,\ldots N$ , we define the event

if $K$ is sufficiently large, recalling that $\eta=|I|\geq(\log N)^{2}/N$ . On the complement event, $\widetilde{\Omega}^{c}$ , we have from (C.4) that

i.e. ${\mathcal{N}}_{I}\leq(C/\delta)^{1/2}N\eta$ . Choosing $K$ sufficiently large, we obtain (5.4) from (C.5). This proves Lemma 5.3.

holds for any ${\cal I}$ , where $m=|{\cal I}|$ is the cardinality of the index set ${\cal I}$ .

Proof of Lemma C.1. We will need the following result of Hanson and Wright , extended to non-symmetric variables by Wright . We remark that this statement can also be extended to complex random variables (Proposition 4.5 ).

Let $b_{j}$ , $j=1,2,\ldots N$ be a sequence of real i.i.d. random variables with distribution ${\rm d}\nu$ satisfying the Gaussian decay (2.16) for some $\delta_{0}>0$ . Let $a_{jk}$ , $j,k=1,2,\ldots N$ be arbitrary real numbers and let ${\cal A}$ be the $N\times N$ matrix with entries ${\cal A}_{jk}:=|a_{jk}|$ . Define

Then there exists a constant $c>0$ , depending only on $\delta_{0},D$ from (2.16), such that for any $\delta>0$

where $A:=(\text{Tr}\,{\cal A}{\cal A}^{t})^{1/2}=\big{[}\sum_{j,k}|a_{jk}|^{2}\big{]}^{1/2}$ .
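The definition of $X$ and the tail bound are not reproduced in this copy. As a hedged illustration of why $A$ is the natural scale, the sketch below assumes the usual centered quadratic form $X=\sum_{j,k}a_{jk}\big[b_{j}b_{k}-{\mathbb{E}}\,b_{j}b_{k}\big]$ with i.i.d. standard Gaussian $b_{j}$ , for which $\mbox{Var}\,X\leq 2A^{2}$ . It only checks this second moment scale on simulated data, not the exponential tail bound of the proposition; all names are ours.

```python
import math
import random


def centered_quadratic_form(a, b):
    """X = sum_{j,k} a[j][k] * (b[j]*b[k] - E[b_j b_k]) for i.i.d. standard normal b."""
    n = len(b)
    total = 0.0
    for j in range(n):
        for k in range(n):
            mean = 1.0 if j == k else 0.0   # E[b_j b_k] for standard normal entries
            total += a[j][k] * (b[j] * b[k] - mean)
    return total


def frobenius_norm(a):
    """A = (sum_{j,k} |a_jk|^2)^(1/2)."""
    return math.sqrt(sum(v * v for row in a for v in row))


random.seed(0)
n, trials = 8, 4000
a = [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
samples = [
    centered_quadratic_form(a, [random.gauss(0.0, 1.0) for _ in range(n)])
    for _ in range(trials)
]

big_a = frobenius_norm(a)
rms = math.sqrt(sum(s * s for s in samples) / trials)            # empirical root mean square of X
tail = sum(abs(s) >= 4.0 * big_a for s in samples) / trials      # empirical P(|X| >= 4A)
print(rms / big_a, tail)
```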

Acknowledgement: We thank Jun Yin for several helpful comments and pointing out some errors in the preliminary versions of this paper. We are also grateful to the referees for their suggestions to improve the presentation.