Universality of Random Matrices and Local Relaxation Flow
Laszlo Erdos, Benjamin Schlein, Horng-Tzer Yau
Introduction
A central question concerning random matrices is the universality conjecture which states that local statistics of eigenvalues are determined by the symmetries of the ensembles but are otherwise independent of the details of the distributions. There are two types of universalities: the edge universality and the bulk universality concerning the interior of the spectrum. The edge universality is commonly approached via the moment method while the bulk universality was proven for very general classes of unitary invariant ensembles (see, e.g. and references therein) based on detailed analysis of orthogonal polynomials. The most prominent non-unitary ensembles are the Wigner matrices, i.e., random matrices with i.i.d. matrix elements that follow a general distribution. The bulk universality for Hermitian Wigner ensembles was first established in for ensembles with smooth distributions. The later work by Tao and Vu did not assume smoothness but it required some moment condition which was removed later in . Our approach to prove the universality was based on the following three steps.
Step 1. Local semicircle law. It states that the number of eigenvalues in a spectral window containing about eigenvalues is given by the semicircle law with a very high probability . The factor can be improved to any sufficiently large constant at the expense of deterioration of the probability estimate.
Step 2. Universality for Gaussian divisible ensembles. The Gaussian divisible ensembles are given by matrices of the form
where is a Wigner matrix, is an independent standard GUE matrix and . Johansson and the later improvement in proved that the bulk universality holds for ensembles of the form (1.1) if is independent of . In the work , this result was extended to for any . The key ingredient for this extension was the local semicircle law.
Step 3. Approximation by Gaussian divisible ensembles. For any given Wigner matrix , we find another Wigner matrix so that the eigenvalue statistics of and are close to each other. The choice of is given by a reverse heat flow argument.
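The counting statement behind the local semicircle law can be checked numerically. The following is a minimal sketch, assuming a GOE sample normalized so that the off-diagonal entries have variance 1/N and the limiting spectrum is [-2, 2]; the function names `sample_goe`, `semicircle_cdf` and `compare_window` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np


def sample_goe(n, rng):
    """Symmetric Gaussian matrix: off-diagonal variance 1/n, diagonal variance 2/n."""
    a = rng.standard_normal((n, n))
    return (a + a.T) / np.sqrt(2.0 * n)


def semicircle_cdf(x):
    """Distribution function of the semicircle density sqrt(4 - x^2) / (2 pi) on [-2, 2]."""
    x = float(np.clip(x, -2.0, 2.0))
    return 0.5 + x * np.sqrt(4.0 - x * x) / (4.0 * np.pi) + np.arcsin(x / 2.0) / np.pi


def compare_window(n, a, b, seed=0):
    """Return (observed, predicted) eigenvalue counts in the window [a, b]."""
    rng = np.random.default_rng(seed)
    eigs = np.linalg.eigvalsh(sample_goe(n, rng))
    observed = int(np.sum((eigs >= a) & (eigs <= b)))
    predicted = n * (semicircle_cdf(b) - semicircle_cdf(a))
    return observed, predicted


if __name__ == "__main__":
    obs, pred = compare_window(400, -0.5, 0.5)
    print(f"observed {obs}, semicircle prediction {pred:.1f}")
```

Shrinking the window while keeping many eigenvalues inside it gives a rough feel for the scales on which the semicircle prediction remains accurate; the sketch does not attempt to reproduce the probability estimates of the theorem.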
Johansson’s proof of the universality of Hermitian Wigner ensembles relied on the asymptotic analysis of an explicit formula by Brézin-Hikami for the correlation functions of the eigenvalues of . Unfortunately, the similar formula for GOE is not very explicit and the corresponding result is not available. On the other hand, the eigenvalue distribution of the matrix is the same as that of , where the matrix elements of are independent standard Brownian motions with variance . Dyson observed that the evolution of the eigenvalues of the flow is given by a system of coupled stochastic differential equations (SDE), commonly called the Dyson Brownian motion (DBM) .
If we replace the Brownian motions by the Ornstein-Uhlenbeck processes, the resulting dynamics on the eigenvalues, which we still call DBM, has the GUE or GOE eigenvalue distributions as the invariant measures depending on the symmetry type of the ensembles. Thus the result of Johansson can be interpreted as stating that the local statistics of GUE is reached via DBM for time of order one. In fact, by analyzing the dynamics of DBM with ideas from the hydrodynamical limit, we have extended Johansson’s result to . The key observation of is that the local statistics of eigenvalues depend exclusively on the approach to local equilibrium. This method avoids the usage of explicit formulae for correlation functions, but the identification of local equilibria, unfortunately, still uses explicit representations of correlation functions by orthogonal polynomials (following e.g. ), and the extension to other ensembles is not a simple task.
Therefore, the universality for symmetric random matrices remained open and the only partial result is Theorem 23 of for Wigner matrices with the first four moments of the matrix elements matching those of GOE. The approach of consisted of three similar steps as outlined above. For Step 2, it used the result of . For Step 3, a four moment comparison theorem for individual eigenvalues was proved in and the local semicircle law (Step 1) was one of the key inputs in this proof.
In this paper, we introduce a general approach to prove local ergodicity of DBM, partially motivated by the previous work . In this approach the analysis of orthogonal polynomials or explicit formulae is completely eliminated and the method applies to both Hermitian and symmetric ensembles. In fact, the heart of the proof is a convex analysis and it applies to -ensembles for any . The model specific information required to complete this approach involves only rough estimates on the accuracy of the local density of eigenvalues. We expect this method to apply to a very general class of models. More detailed explanations will be given in Section 3.
Statement of Main Results
To fix the notation, we will present the case of symmetric Wigner matrices; the modification to the Hermitian case is straightforward and will be omitted. The extension to the quaternion self-dual case is also standard, see, e.g. for the notations and setup. On the other hand, the main theorem on DBM (Theorem 2.1) is valid for general -ensembles. Thus all notations for matrices will be restricted to symmetric matrices but all results for flows will be stated and proved for general -ensembles. We first explain our general result about DBM and in Section 2.2 we will present its application to Wigner matrices.
The joint distributions of the eigenvalues of the Gaussian Unitary Ensemble (GUE) and the Gaussian Orthogonal Ensemble (GOE) are given by the following measure
where for GOE and for GUE. We will sometimes use to denote the density of the measure as well, i.e., . We consider defined on the ordered set
and this measure is well-defined for all . The Dyson Brownian motion (DBM) is characterized by the generator
acting on . The DBM is reversible with respect to with the Dirichlet form
where . Notice that we have added a drift so that the DBM is reversible w.r.t. . The original definition by Dyson in was slightly different; it contained no drift term.
Denote the distribution of the process at the time by . Then satisfies
The corresponding stochastic differential equation for is now given by (see, e.g. Section 12.1 of )
where is a collection of independent Brownian motions. The well-posedness of DBM on has been proved in Section 4.3.1 of , see the Appendix for some more details. This step requires which we will assume from now on.
The dynamics given by (2.4) and (2.5) with can be realized by the evolution of the eigenvalues of symmetric, Hermitian and quaternion self-dual matrix ensembles, but the dynamics is well-defined for independently of the original matrix models. Our main result, Theorem 2.1, is valid for all .
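For orientation, the objects introduced above can be written in one consistent normalization. The block below is a sketch under an assumed scaling (equilibrium density supported on [-2, 2], diffusion constant 1/(2N)); the constants are an assumption of this sketch, chosen so that the generator is reversible with respect to the equilibrium measure, and the symbols $\mathcal{H}$, $Z_\beta$ and $B_i$ are the usual Hamiltonian, normalization and Brownian motions.

```latex
% Assumed normalization; constants chosen so that L is reversible w.r.t. \mu.
\mu(d\mathbf{x}) = \frac{e^{-\mathcal{H}(\mathbf{x})}}{Z_\beta}\, d\mathbf{x},
\qquad
\mathcal{H}(\mathbf{x}) = N\Big[\beta \sum_{i=1}^N \frac{x_i^2}{4}
  - \frac{\beta}{N} \sum_{i<j} \log |x_j - x_i| \Big],

L = \sum_{i=1}^N \frac{1}{2N}\,\partial_i^2
  + \sum_{i=1}^N \Big( -\frac{\beta}{4}\, x_i
  + \frac{\beta}{2N} \sum_{j \ne i} \frac{1}{x_i - x_j} \Big)\partial_i
  \;=\; \frac{1}{2N}\big( \Delta - \nabla \mathcal{H} \cdot \nabla \big),

D(f) = -\int f\, L f \, d\mu = \sum_{j=1}^N \frac{1}{2N} \int (\partial_j f)^2 \, d\mu,

dx_i = \frac{dB_i}{\sqrt{N}}
  + \Big( -\frac{\beta}{4}\, x_i
  + \frac{\beta}{2N} \sum_{j \ne i} \frac{1}{x_i - x_j} \Big)\, dt .
```

With this choice the linear term in the drift is exactly the confining term that makes the dynamics reversible, and the repulsive term is the gradient of the logarithmic interaction.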
Similarly, the correlation functions of the equilibrium measure are denoted by
is the density of the semicircle law and it is well-known that is also the density w.r.t. the measure in the limit for . Define
and let be the classical location of the -th eigenvalue
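The classical locations can be computed directly from the semicircle distribution function. The following is a minimal sketch, assuming the classical location of the j-th eigenvalue is the point at which the semicircle distribution function on [-2, 2] equals j/N; the helper names `semicircle_cdf` and `classical_location` are hypothetical.

```python
import math


def semicircle_cdf(x):
    """Distribution function of the semicircle density sqrt(4 - x^2) / (2 pi) on [-2, 2]."""
    x = max(-2.0, min(2.0, x))
    return 0.5 + x * math.sqrt(4.0 - x * x) / (4.0 * math.pi) + math.asin(x / 2.0) / math.pi


def classical_location(j, n, tol=1e-12):
    """Solve semicircle_cdf(gamma) = j / n by bisection on [-2, 2]."""
    target = j / n
    lo, hi = -2.0, 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if semicircle_cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    n = 10
    print([round(classical_location(j, n), 4) for j in range(1, n + 1)])
```

Bisection is used because the distribution function is monotone; near the spectral edges the density vanishes like a square root, so the locations are spaced more widely there than in the bulk.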
Our key result on the local ergodicity of DBM is the following theorem.
Suppose the initial density satisfies with some fixed exponent independent of . Let be the solution of the forward equation (2.4). Suppose that the following three assumptions are satisfied for all sufficiently large .
There exist constants and such that
Convention. We will use the letters and to denote general positive constants whose precise values are irrelevant and they may change from line to line.
The relaxation time to global equilibrium for the DBM is of order one in our scaling. The simplest way to see this is via the Bakry-Emery theorem which states that, roughly speaking, the relaxation time is the inverse of the lower bound to the Hessian of the Hamiltonian . In our case , and this implies that the relaxation time is of order one. On the other hand, it was conjectured by Dyson that the relaxation time to local equilibrium is of order . Theorem 2.1 asserts that the relaxation time to local equilibrium is less than . Although this is far from proving Dyson’s conjecture, it is the first effective estimate that shows that the local equilibrium is approached much faster than the global one. Moreover, this result suffices to prove the bulk universality of Wigner matrices when combined with the reverse heat flow ideas introduced in . We remark that the concept of local equilibrium is used vaguely here and in Dyson’s paper. In principle, there are many local equilibria depending on boundary conditions, and the uniqueness is a tough question, especially since the interaction is long range and singular.
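The convexity computation behind the Bakry-Emery argument can be sketched as follows. The block assumes the normalization $\mathcal{H}(\mathbf{x}) = \beta N \sum_i x_i^2/4 - \beta \sum_{i<j} \log|x_j - x_i|$ and a generator of the form $\frac{1}{2N}(\Delta - \nabla\mathcal{H}\cdot\nabla)$; both are assumptions of this sketch rather than formulas quoted from the text.

```latex
% For any vector v in R^N, the quadratic part contributes beta N / 2
% and the logarithmic interaction contributes a nonnegative term:
\big\langle \mathbf{v}, \nabla^2 \mathcal{H}(\mathbf{x})\, \mathbf{v} \big\rangle
  = \frac{\beta N}{2} \sum_{i=1}^N v_i^2
  + \beta \sum_{i<j} \frac{(v_i - v_j)^2}{(x_i - x_j)^2}
  \;\ge\; \frac{\beta N}{2}\, \|\mathbf{v}\|^2 .

% The generator carries the prefactor 1/(2N), so the resulting decay rate is,
% up to a universal constant factor,
\frac{1}{2N} \cdot \frac{\beta N}{2} = \frac{\beta}{4},
% which is independent of N, i.e. the global relaxation time is of order one.
```

The second term in the Hessian is large when neighbouring points are close; it is discarded in the global bound.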
The proof of Theorem 2.1 is based on the introduction of the pseudo equilibrium measure which we now explain. It is common that the global and the local equilibrium are reached at different time scales for interacting particle systems, of which DBM is a special case. On the other hand, the hydrodynamical approach for the DBM yields very complicated estimates. The main reason for the complications is that the equilibrium measure of DBM has a logarithmic two body interaction that is both long range and singular at short distances. Hence the proof of the uniqueness of “local equilibrium measures” is very complicated and we were able to carry it out only for the Hermitian case, because several identities involving orthogonal polynomials are valid only for the case. However, there are two key observations from this study:
The local statistics do not depend on the long range part of the logarithmic interaction; in other words, we can cut off the interactions between far away particles without changing the local statistics.
The relaxation time for the gradient flow associated with the local equilibrium with a fixed boundary condition is much smaller than the global relaxation time of the DBM, which is of order one.
To finesse the difficulty associated with the uniqueness of local equilibria, we define the pseudo equilibrium measure, , by cutting off the long range interactions of the equilibrium measure and show that , and all have the same local statistics. The key idea behind the last assertion is to estimate the relative entropy of the solution to the DBM, , relative to the pseudo equilibrium measure . Since the pseudo equilibrium measure is not a global equilibrium measure, the entropy will not decay monotonically as in the case of the relative entropy w.r.t. the equilibrium measure. More precisely, the time derivative of the relative entropy, under the flow of DBM, w.r.t. the pseudo equilibrium measure consists of two terms (see Theorem 3.5): (i) a dissipation term of Dirichlet form of w.r.t. the pseudo equilibrium measure; (ii) an error term due to the fact that the pseudo equilibrium measure is not the true equilibrium measure.
Since the logarithmic interactions between far away particles can be approximated by a mean-field potential obtained from using the local density, the error term in (ii) can be controlled if we know the local density of particles w.r.t. the distribution . The precise conditions are the Assumptions (1)–(3). In the special case of , when the DBM is generated by symmetric matrix ensembles, these assumptions will be verified in Lemma 2.2 by using the local semicircle law; the case of is similar and some details are given in . For other values of it is an open question to verify the corresponding assumptions. Given that the error term in (ii) can be bounded, we obtain an estimate on the Dirichlet form of w.r.t. the pseudo equilibrium measure. The key question is whether this estimate alone is sufficient to pin down the local statistics. For this purpose, we note that the Dirichlet form w.r.t. generates a new gradient flow, the local relaxation flow. The global relaxation time of the local relaxation flow, determined by the convexity of the pseudo equilibrium measure, is much shorter than that of the standard DBM. This leads to strong estimates on the local relaxation flow and in particular, it identifies the local statistics. The details of the entropy estimates and the local relaxation flow will appear in Section 3. We now state the main application of Theorem 2.1, the universality of symmetric Wigner ensembles.
Universality of symmetric Wigner ensemble
We remark that (2.15) implies that has a Gaussian decay, i.e.
for some . We require that also satisfies (2.15). In this paper, all conditions and statements involving apply to as well, but for simplicity of the presentation we will not mention this every time.
where is the standard Gaussian distribution with variance one. For the diagonal element, the Ornstein-Uhlenbeck process should be replaced by the one reversible w.r.t. the Gaussian measure with variance two due to the convention that the variances of the diagonal elements are equal to two. The Ornstein-Uhlenbeck process (2.17) induces a stochastic process on the eigenvalues; it is well-known that the process on the eigenvalue is given by the DBM (2.5) with . Notice that we used the Ornstein-Uhlenbeck process so that the resulting DBM is reversible w.r.t. .
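The matrix-level Ornstein-Uhlenbeck flow can be sampled at a fixed time without simulating the eigenvalue SDE. The following is a minimal sketch, assuming the stationary OU process with unit-variance Gaussian invariant measure for the rescaled off-diagonal entries, so that the time-t matrix is distributed as exp(-t/2) H_0 + sqrt(1 - exp(-t)) V with V an independent GOE matrix; the helper names and the choice of a sign-valued initial matrix are hypothetical.

```python
import numpy as np


def sample_goe(n, rng):
    """Symmetric Gaussian matrix: off-diagonal variance 1/n, diagonal variance 2/n."""
    a = rng.standard_normal((n, n))
    return (a + a.T) / np.sqrt(2.0 * n)


def sample_sign_wigner(n, rng):
    """Symmetric matrix with +-1/sqrt(n) off-diagonal and +-sqrt(2/n) diagonal entries."""
    signs = rng.choice([-1.0, 1.0], size=(n, n))
    upper = np.triu(signs, k=1)
    diag = np.sqrt(2.0) * np.diag(np.diag(signs))
    return (upper + upper.T + diag) / np.sqrt(n)


def ou_matrix_flow(h0, t, rng):
    """Law of the entrywise OU flow at time t, started from h0 (variances are preserved)."""
    v = sample_goe(h0.shape[0], rng)
    return np.exp(-t / 2.0) * h0 + np.sqrt(1.0 - np.exp(-t)) * v


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    h0 = sample_sign_wigner(200, rng)
    eigs = np.linalg.eigvalsh(ou_matrix_flow(h0, 0.5, rng))
    print(eigs.min(), eigs.max())
```

Sampling the matrix and diagonalizing it sidesteps the singular drift of the eigenvalue dynamics, at the price of giving only the time-t marginal rather than a path.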
Our goal is to apply Theorem 2.1 with and for this purpose, we need to verify Assumptions 1-3. Assumption 3 follows from the local semicircle law, Theorem 5.1, stated later in Section 5. Assumptions 1 and 2 can be verified if the measure satisfies the logarithmic Sobolev inequality (2.15). The precise statement is the following lemma.
Suppose the assumption (2.15) on the distribution of the matrix elements holds. Then there are positive numbers , and , depending on from (2.15), such that (2.10) and (2.11) hold.
From this Lemma, for symmetric Wigner matrices whose matrix element distributions satisfy the LSI, the assumptions of Theorem 2.1 are satisfied. Hence the correlation functions w.r.t. and the GOE equilibrium measure are identical in the large limit for some in the sense that (2.13) holds. Together with the reverse heat flow argument, we have the following universality theorem for local statistics of Wigner ensembles whose matrix element distribution is smooth and satisfies the logarithmic Sobolev inequality. Denote by the correlation functions of the eigenvalues of the symmetric Wigner ensemble. Let be the correlation functions of the eigenvalues of GOE, i.e., the correlation functions of the equilibrium measure . It is well-known that can be computed explicitly (see, e.g. Section 7 of ).
Suppose the distribution for the matrix elements satisfies the logarithmic Sobolev inequality (2.15). Assume that has a positive density such that for any there are constants , depending on , such that
Then for any and for any , we have
Theorem 2.3 is a simple corollary of Theorem 2.1 and the method of the reverse heat flow . It will be proved briefly in Section 4. Though we stated the universality in terms of correlation functions, it also holds for the eigenvalue gap distribution and we omit the obvious statement (the analogous statement for the Hermitian case was formulated in Theorem 1.2 of ).
In the following corollary, by using Theorem 15 of , we remove all assumptions from Theorem 2.3 except for a decay condition and a technical condition that is supported in at least three points. This latter technical condition was removed in our later paper , where we generalized our approach to a broader class of random matrix ensembles.
Suppose the distribution of the matrix elements has mean zero, variance one and a tail with a subexponential decay, i.e. it satisfies that
for some constants . Assume that is supported in at least three points. Then the conclusion (2.19) of Theorem 2.3 holds.
Proof. Let denote the moments of
(ii) the derivative bounds (2.18) hold, and (iii) the logarithmic Sobolev inequality (2.15) holds. It is easy to argue that such a measure exists. Consider the space of all measures satisfying (2.18) with a finite LSI constant. Since the condition (2.18) and the finite LSI constant condition are preserved under small smooth perturbations which are infinite dimensional, there is enough freedom to choose perturbations so as to match the first four moments as long as . An elementary detailed proof of this fact is given in Lemma C.1. of . Therefore, satisfies the assumption of Theorem 2.3 and thus (2.19) holds for the measure . Recall that Theorem 15 in asserts that the local eigenvalue statistics for matrices whose matrix element distributions match up to the first four moments are the same in the limit (strictly speaking, this theorem was proved only for Hermitian matrices, but the parallel version for symmetric ensembles holds as well, see the remark at the end of Section 1.6 in ). This proves the corollary.
Pseudo equilibrium measure and Entropy Dissipation Estimates
The key idea to prove Theorem 2.1 is an estimate on the time to local equilibrium for the DBM. However, to estimate this time to local equilibrium, we need to introduce a different flow, the local relaxation flow, defined as the gradient flow of the pseudo equilibrium measure. The pseudo equilibrium measure is a measure which has the local statistics of the ensemble but has a strong convexity property. Fix a positive number with , and for the rest of this paper let be a small positive number which we will not specify. Let and define the mean field potential of eigenvalues far away from the -th one as
where the summation is over all such that . For , we extend by
and similarly for . In other words, is just the simplest convex extension of the function defined by (3.1) on . This modification will avoid the singularities at . Notice that this is purely a technical device since we will show in (5.26) of Proposition 5.9 that the probability of the regime is negligible in the sense that
The pseudo equilibrium measure is defined by
Recall that the relative entropy with respect to a measure is defined by
The local relaxation flow is defined to be the reversible dynamics w.r.t. characterized by the generator defined by
for . Note that for any with , we have , where is the doubling of the interval . Moreover, for we have for some constant , and so for . Thus we obtain that
with some positive constant , using . Since was defined by a convex extension outside , the same bound holds for any :
i.e., the mean field potential is uniformly convex with the convexity bound given in (3.8).
The potential is chosen to satisfy the two convexity properties: (3.8) and (3.18) and there are many other possible choices for . For example, without changing the form of given in (3.1), a more natural choice for would be
This may somewhat improve the constant in the estimate (2.10), but the analysis is more complicated and we will not pursue this choice in this paper.
The following theorem is our main result on the local ergodicity of DBM.
Suppose that for some fixed. Let for some . Define
Then for any we have
We emphasize that Theorem 3.1 applies to all ensembles and the only assumption concerning the distribution is in (3.9). Notice that the first error term becomes large for large, i.e., if is large. The first ingredient to prove Theorem 3.1 is the analysis of the local relaxation flow. The following theorem shows that the local relaxation flow satisfies an entropy dissipation estimate and its equilibrium measure satisfies the logarithmic Sobolev inequality.
(Dirichlet Form Dissipation Estimate) Suppose (3.8) holds. Consider the equation
with reversible measure . Denote by . Then we have
with a universal constant . Thus the relaxation time to equilibrium is of order and we have
The notation was introduced so that this result and Theorem 4.2 in are identical. The scale parameter has a meaning in , but it is purely a choice of convention here. The proof given below follows the argument in and it was outlined in this context in Section 5.1 of . The new observation is the additional second term on the r.h.s of (3.13), corresponding to “local Dirichlet form dissipation”. The estimate (3.14) on this additional term will play a key role in this paper.
Proof. In it was shown that, with the notation , we have
imply that the Hessian of is bounded from below as
with some positive constant . This proves (3.13) and (3.14) since . Inserting the inequality
and integrating the resulting equation, we prove (3.15). Inserting (3.15) into (3.19) we have
The proof of (3.17) requires an integration by parts and the boundary term at (explained in Section 5.1. of ) should vanish. In the Appendix we will justify this technical step.
with some constant depending only on .
Proof. Without loss of generality, we consider only the case . Let satisfy
with an initial condition . We first compare with . Using the entropy inequality,
and the exponential decay of the entropy (3.16), we have
To compare with , by differentiation, we have
From the Schwarz inequality and the last term is bounded by
since is smooth and compactly supported. This proves the Lemma.
Notice that if we use only the entropy dissipation and Dirichlet form, the main term on the right hand side of (3.20) will become . Hence by exploiting the Dirichlet form dissipation coming from the second term on the r.h.s. of (3.13), we gain the crucial factor in the estimate.
The second ingredient to prove Theorem 3.1 is the following entropy and Dirichlet form estimates.
(Entropy and Dirichlet Form Estimates) Suppose the assumptions of Theorem 3.1 hold. Recall that and define so that . Then the entropy and the Dirichlet form satisfy the estimates:
Proof. First we need the following relative entropy identity from .
Let be a probability density satisfying . Then for any probability density we have
In our setting, is independent of and satisfies (3.6). Hence we have
Since the middle term on the right hand side vanishes, we have from the Schwarz inequality
Together with the LSI (3.15) and (3.9), we have
for . Since and , the last inequality proves (3.22).
Integrating (3.25) from to and using the monotonicity of the Dirichlet form in time, we have proved (3.23) with the choice of .
Proof of Theorem 3.1. Fix and let with satisfying the assumption of Theorem 3.5, i.e., for some and (3.9) holds. Using (3.23), we have
Clearly, equation (3.27) also holds for the special choice (for which ), i.e. local statistics of and can be compared. Hence we can replace the measure in (3.27) by and this proves Theorem 3.1.
Proof of Theorem 2.3 and Theorem 2.1
We first prove Theorem 2.3 assuming that Theorem 2.1 holds. Our main tool is the reverse heat flow argument from . Recall that the distribution of the matrix element is given by a measure and the generator of the Ornstein-Uhlenbeck process is (2.17). The probability distribution of all matrix elements is , . The joint probability distribution of the matrix elements at time as every matrix element evolves under the Ornstein-Uhlenbeck process is given by
where we recall that is the standard Gaussian measure.
Fix a positive integer . Suppose that satisfies the subexponential decay condition (2.20) and the regularity condition (2.18) for all . Then there is a small constant , depending only on , such that for any positive there exists a probability density w.r.t. with mean zero and variance one such that
for some depending on . Furthermore, can be chosen such that if the logarithmic Sobolev inequality (2.15) holds for the measure , then it holds for as well, with the logarithmic Sobolev constant changing by a factor of at most .
Furthermore, let , with and set . Then we also have
Proof. This proposition can be proved following the reverse heat flow idea from . Define with some small positive depending on , where is a smooth cutoff function satisfying for and for . Set
By assumption (2.18), is positive and
for any if is small enough.
Define and by definition, . Then
Since the Ornstein-Uhlenbeck semigroup is a contraction in , together with (2.18), we have
Notice that may not be normalized as a probability density w.r.t. but it is easy to check that there is a constant , for any positive, such that is a probability density. Clearly,
and the same formulas hold if is replaced by since the OU flow preserves expectation and variance. Let be defined by
Then is a probability density w.r.t. with zero mean and variance . It is easy to check that the total variation norm of is smaller than any power of . Using again the contraction property of and (4.4), we get
Now we check the LSI constant for . Recall that was obtained from by translation and dilation. By definition of the LSI constant, the translation does not change it. The dilation changes the constant, but since our dilation constant is nearly one, the change of LSI constant is also nearly one. So we only have to compare the LSI constants between and . From (4.3) and that is nearly one, the LSI constant changes by a factor less than . This proves the claim on the LSI constant.
and this completes the proof of Proposition 4.1.
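The mechanism behind the reverse heat flow can be illustrated on polynomials. The following is a toy sketch, assuming the one-dimensional OU generator A = (1/2) d^2/dx^2 - (x/2) d/dx, whose eigenfunctions are the probabilists' Hermite polynomials with A He_k = -(k/2) He_k, and assuming the backward step is approximated by a truncated power series of the inverse semigroup; the precise construction used in the proof (cutoff function, positivity, normalization) is not reproduced, and the function names are hypothetical.

```python
import math

import numpy as np


def forward_ou(coeffs, t):
    """Apply exp(tA) to sum_k c_k He_k: each coefficient is damped by exp(-k t / 2)."""
    k = np.arange(len(coeffs))
    return coeffs * np.exp(-k * t / 2.0)


def truncated_reverse_ou(coeffs, t, order):
    """Apply sum_{m < order} (-tA)^m / m! to sum_k c_k He_k (truncated inverse flow)."""
    k = np.arange(len(coeffs))
    factor = sum((k * t / 2.0) ** m / math.factorial(m) for m in range(order))
    return coeffs * factor


def reconstruction_error(coeffs, t, order):
    """Largest coefficient error after running backward (truncated) and then forward."""
    recovered = forward_ou(truncated_reverse_ou(coeffs, t, order), t)
    return float(np.max(np.abs(recovered - coeffs)))


if __name__ == "__main__":
    c = np.array([1.0, 0.5, -0.3, 0.2, 0.1])
    for order in (1, 2, 4):
        print(order, reconstruction_error(c, 0.01, order))
```

Running the truncated backward flow and then the exact forward flow recovers the initial coefficients up to an error that shrinks as the truncation order grows, which is the toy analogue of the smallness of the total variation distance in the proposition.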
We now apply Theorem 2.1 to the initial distribution given by the eigenvalues of the symmetric Wigner ensemble with distribution where . By Proposition 4.1, the LSI constant of is bounded by the initial LSI constant of by a factor of at most two. Thus we can apply Lemma 2.2 to verify Assumptions 1 and 2 of Theorem 2.1. Assumption 3 follows from the local semicircle law, Theorem 5.1. Thus the correlation functions of the eigenvalues of the ensemble with distribution are the same as those of GOE in the sense of (2.13). Finally, using (4.2), we can approximate the -point correlation function w.r.t. by the one w.r.t. after choosing sufficiently large so that . The additional smallness factor for the estimate on the total variation in (4.2) is necessary to conclude the convergence of the -point correlation function, since it is rescaled by a factor in each variable. We also used the trivial fact that the total variation distance of two eigenvalue distributions is bounded by the total variation distance of the distributions of the full matrix ensembles. Finally we remark that the limit in (2.19) is needed to replace in (2.13) with in (2.19) using the continuity of . This concludes the proof of Theorem 2.3.
Proof of Theorem 2.1. Step 1. The first step is to show that the right hand side of (3.11) vanishes in the large limit for with small enough provided that the estimates (2.10), (2.11) hold. By (2.11), (recall the definition of from (3.1)) with a very high probability. In this paper we will say that an event holds with a very high probability if the complement event has a probability that is subexponentially small in , i.e., it is bounded by with some fixed . From the definition of (3.6), we have
Notice that function satisfies
as long as and have the same sign. In our case, and have the same sign as long as
The last inequality holds with a very high probability due to (2.11) provided is smaller than . We remark that this is the only place where we used Assumption 2. Thus, with a very high probability, we have
The contribution to of the exceptional event is negligible, since its probability is subexponentially small in and . Thus, recalling the definition of from (2.9) and the definition of from (3.9), we can bound the error term on the right hand side of (3.11) by
provided that (2.10) holds and and are small enough, depending on .
Step 2. From (3.11) to correlation functions. The equation (3.11) shows that for a special class of observables, depending only on rescaled differences of the points , the expectations w.r.t. and w.r.t. the equilibrium measure are identical in the large limit. But the class of observables in (2.13) of Theorem 2.1 is somewhat bigger and we need to extend our result to them. Without the integration in (2.13), the observable would strongly depend on a fixed energy and could not be approximated by observables depending only on differences of . Taking a small averaging in remedies this problem.
We will consider , and fixed, i.e., the constants in this proof may depend on these three parameters. We start with the identity
We will set if . We have to show that
Let be an -dependent parameter chosen at the end of the proof. Let
and note that . To prove (4.12), it is sufficient to show that
hold for any (note that corresponds to the equilibrium, ), where is chosen in Theorem 2.1 and is chosen in the Step 1.
Case 1: Small case; proof of (4.13).
After performing the integration, we will eventually apply Theorem 3.1 to the function
For any and define sets of integers and by
where was defined in (2.8). Clearly . With these notations, we have
The error term , defined by (4.16) indirectly, comes from those indices, for which since unless , the constant depending on the support of . Thus
for any sufficiently large assuming and using that is a bounded function. The additional factor comes from the integration. Taking the expectation with respect to the measure , and by a Schwarz inequality, we get
using Assumption 1 (2.10). We can also estimate
where the error term , defined by (4.18), comes from indices such that . It satisfies the same bound (4.17) as . By the continuity of , the density of ’s is bounded by , thus and . Therefore, summing up the formula (4.15) for , we obtain from (4.16) and (4.18)
for each . A similar lower bound can be obtained analogously, and after choosing , we obtain
Adding up (4.19) for all , we get
and the same estimate holds for the equilibrium, i.e., if we set in (4.20). Subtracting these two formulas and applying (3.11) from Theorem 3.1 to each summand on the second term in (4.19) and using (4.10), we conclude that
we obtain that (4.21) vanishes as , and this proves (4.13).
Case 2: Large case; proof of (4.14).
as . Inserting this into (4.24), this completes the proof of (4.14) and the proof of Theorem 2.1.
Local semicircle law and proof of Lemma 2.2
We first recall the local semicircle law concerning the eigenvalues of . Let
be the Stieltjes transform of the empirical eigenvalue distribution at spectral parameter , , and let
be the Stieltjes transform of the semicircle distribution. In Theorem 4.1 of we proved the following version of the local semicircle law (we remark that, contrary to what is stated in Theorem 4.1 of , condition (2.5) of is not needed and has not been used in the proof):
Assume that the distribution of the matrix elements of the symmetric Wigner matrix ensemble satisfies (2.15) with some constant and assume that is such that . Then for any there exist positive constants , and , depending only on and , such that for any we have
for all and for all large enough.
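The two Stieltjes transforms compared in the local semicircle law are easy to evaluate numerically. The following is a minimal sketch, assuming the normalization in which the semicircle density is supported on [-2, 2], so that its Stieltjes transform is the root of m^2 + z m + 1 = 0 with positive imaginary part in the upper half plane; the helper names are hypothetical.

```python
import cmath

import numpy as np


def sample_goe(n, rng):
    """Symmetric Gaussian matrix: off-diagonal variance 1/n, diagonal variance 2/n."""
    a = rng.standard_normal((n, n))
    return (a + a.T) / np.sqrt(2.0 * n)


def empirical_stieltjes(eigs, z):
    """m(z) = (1/N) sum_j 1 / (lambda_j - z) for a spectral parameter z with Im z > 0."""
    return complex(np.mean(1.0 / (eigs - z)))


def semicircle_stieltjes(z):
    """Root of m^2 + z m + 1 = 0 with positive imaginary part (requires Im z > 0)."""
    z = complex(z)
    s = cmath.sqrt(z * z - 4.0)
    m = (-z + s) / 2.0
    if m.imag <= 0.0:
        m = (-z - s) / 2.0
    return m


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    eigs = np.linalg.eigvalsh(sample_goe(400, rng))
    z = 0.3 + 0.1j
    print(empirical_stieltjes(eigs, z), semicircle_stieltjes(z))
```

The imaginary part of the spectral parameter plays the role of the window size: taking it much larger than 1/N keeps many eigenvalues in the effective window, which is the regime where the two transforms are expected to agree.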
As a corollary of the local semicircle law, the number of eigenvalues up to a fixed energy can be estimated. The precise statement is the following proposition and it was proven in Proposition 4.2, equation (4.14) of .
Assume that the distribution of the matrix elements of the symmetric Wigner matrix ensemble satisfies (2.15) with some finite constant . Let
be the expectation of the empirical distribution function of the eigenvalues and recall the definition of from (2.7). Then there exists a constant , depending only on the constant in (2.15), such that
The local semicircle law implies that the local density of eigenvalues is bounded, but the estimate in Theorem 5.1 deteriorates near the spectral edges. The following upper bound on the number of eigenvalues in an interval provides a uniform control near the edges. This lemma was essentially proved in Theorem 4.6 of using ideas from an earlier version, Theorem 5.1 of . For the convenience of the reader, a detailed proof is given in Appendix C.
We remark that for Theorem 5.1 and Lemma 5.3 it is sufficient to assume only the Gaussian decay condition (2.16) instead of the logarithmic Sobolev inequality (2.15).
We can now start to prove Lemma 2.2. Since the logarithmic Sobolev inequality holds for (2.15), it holds for as well; for a proof see Lemma B.1 in Appendix B and recall that is the convolution of with the Ornstein-Uhlenbeck kernel which itself satisfies the logarithmic Sobolev inequality. Moreover, the LSI constant of is bounded uniformly for all , since it is the maximum of the LSI constant of and the LSI constant of the Ornstein-Uhlenbeck kernel, which is bounded uniformly in time. Therefore Lemma 2.2 follows immediately from its time independent version:
Suppose that the distribution of the matrix elements of the symmetric Wigner ensemble satisfies (2.15) with some finite constant . Then there exist positive constants , and such that
hold for the eigenvalues of and for any .
The proof of Lemma 5.4 is divided into two steps. In the first step, Section 5.1, we estimate the fluctuation of the eigenvalues around their mean values using the logarithmic Sobolev inequality. In the second step, Section 5.2, we estimate the deviation of the mean location of from the classical location using (5.3).
the expected location of . We start with an estimate on the expected location of the extreme eigenvalues:
Suppose that the probability measure of the matrix entries satisfies
for some (this condition is satisfied, in particular, under (2.15), see (2.16)). Then for any we have
with some constant depending on and .
Proof. For any , define the probability measure
Denote by ( resp.) the probability law of the random matrices whose matrix elements are distributed according to ( resp.). As usual, we neglect the fact that distribution should be replaced with for the diagonal elements . Since the number of index pairs is of order , the total variational norm between and is bounded by
From Theorem 1.4 of we obtain that for any
holds almost surely w.r.t. . It follows from (5.11) that is bounded w.r.t. the distribution as well, up to a subexponentially small probability. To estimate the tail of w.r.t. , we use that and the trivial large deviation bound based upon (5.8),
that holds for any with constants depending on . We thus obtain that the expectations of w.r.t. these two measures satisfy
Thus we have proved that, for any ,
with some constant depending on and . A similar lower bound holds for .
Next we estimate the fluctuations of :
with a constant depending on and .
Proof. First order perturbation theory of the eigenvalue of shows that
Using (2.15) and the Bobkov-Götze concentration inequality (Theorem 2.1 of ), we have for any
after optimizing for and using that from above. This proves (5.14).
The following proposition is a refinement of Proposition 5.6:
Fix a sufficiently small constant and set . Then for any index with we have
The constants and depend on and but are independent of .
As a preparation for the proof of Proposition 5.7, we need the following estimate on the tail of the gap distribution.
Let . Denote by the largest eigenvalue below and assume that . Then there exist positive constants and , depending only on the Sobolev constant in (2.15), such that for any that satisfy , we have
This lemma was proven in Theorem E1 of , see also Theorem 3.3 of , and the proof will not be repeated here. We only mention the main idea, that the local semicircle law, Theorem 5.1, provides a positive lower bound on the number of eigenvalues in any interval of size $|I|\geq A(\log N)^{4}/\big[N\big|2-|x|\big|\big]$ around the point with a sufficiently large constant . In particular, it follows that there is at least one eigenvalue in each such interval with a very high probability.
Proof of Proposition 5.7. We choose positive numbers, depending on , such that
with some sufficiently small and large constants. Let
Consider an index with , then . We first show that with a very high probability. Suppose, on the contrary, that for some (the case is treated analogously). From (5.9) and (5.14) it follows that with a very high probability. But then the interval of length would contain eigenvalues, an event with an extremely low probability by (5.4).
Knowing that with a very high probability, we can use (5.17) to conclude that for any index with we have
Similarly to the calculation in Theorem 3.1 of , by using the logarithmic Sobolev inequality (2.15), we have
This bound holds for any index with the remark that if or , then the averaging over the indices is done asymmetrically.
Combining this estimate with (5.21) we have, apart from a set of very small probability, that
for any . Taking expectation, and using the tail estimate (5.13) to control on the event of very small probability where (5.23) may not hold, we also obtain for these indices that
Subtracting the last two inequalities yields
and combining this bound with the estimate (5.14) for the extreme indices, we obtain
The inequalities (5.15) and (5.16) now follow if we choose the parameters as
a choice that is compatible with the conditions (5.18).
5.2 Deviation of the eigenvalues from their classical locations
Proposition 5.9 below estimates the distance of the eigenvalues from their location given by the semicircle law. This will justify that the convex extension of the potential affects only regimes of very small probability.
For any small and for any we have
The constants and depend on and but are independent of .
We remark that in the bulk is expected to be bounded by (in the Hermitian case it was proven in , see also ); near the edges one expects . Our estimate is not optimal, but it gives a short proof that is sufficient for our purpose. We remark that after submitting this paper, these conjectures were proven in .
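To make the notion of classical location concrete, the following sketch computes the locations predicted by the semicircle law by inverting its distribution function with bisection. It assumes the convention that the $j$-th classical location $\gamma_j$ solves $n_{sc}(\gamma_j)=j/N$ with the semicircle density $\sqrt{4-x^{2}}/(2\pi)$ on $[-2,2]$. This convention, the symbol $\gamma_j$ and the function names are assumptions made for illustration and may differ from the definitions used in the paper.

```python
import math

def semicircle_cdf(energy):
    """Integral of sqrt(4 - x^2)/(2*pi) from -2 up to `energy` (clamped to [-2, 2])."""
    e = max(-2.0, min(2.0, energy))
    return 0.5 + e * math.sqrt(4.0 - e * e) / (4.0 * math.pi) + math.asin(e / 2.0) / math.pi

def classical_location(j, n, tol=1e-12):
    """Solve semicircle_cdf(gamma) = j/n for gamma in [-2, 2] by bisection."""
    target = j / n
    lo, hi = -2.0, 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if semicircle_cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

n = 1000
gammas = [classical_location(j, n) for j in range(1, n + 1)]

# Spacing of consecutive classical locations in the middle and at the lower edge.
bulk_gap = gammas[n // 2] - gammas[n // 2 - 1]
edge_gap = gammas[1] - gammas[0]
print(bulk_gap, edge_gap)
```

The spacing between consecutive classical locations is of order $1/N$ in the bulk and considerably larger at the edge, where the density vanishes. This is one reason why bulk and edge estimates are treated separately.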
to be the counting function of the expected locations of the eigenvalues. We compare with defined in (5.2). Using the fluctuation bound (5.14), we have
and the second term is analogous. We thus have
For the energy range , we use (5.28):
with an dependent constant.
To estimate , we can assume, without loss of generality, that , the other case is treated analogously. Let be a parameter that will be optimized later. Set with a sufficiently large constant . Since , for any small , the parameter is roughly the energy difference from the edge to the -th eigenvalue.
From the property for small , we have
Combining this estimate with (5.32), we have
for any with .
For the extreme indices, we use that if , then from Lemma 5.5 and (5.33), we have
with depending on . Similar estimates hold at the upper edge of the spectrum, i.e. for . Choosing , we conclude the proof of (5.27). The proof of (5.26) then follows from (5.14) and this concludes the proof of Proposition 5.9.
The following proposition is a strengthening of the bound (5.32) used previously.
for any and with a constant depending on and .
Proof. Recalling the definition of from (5.19), we will prove that
which gives (5.34) with the choice of parameters (5.25). We proceed similarly to the proof of Proposition 5.9 but we notice that in addition to (5.28), a stronger bound on is available for , where , with some large constant and setting as in Proposition 5.7. To obtain an improved bound, note that for any in this interval
To see this inequality, define the random index
The estimate (5.36) will then follow if we prove that , i.e. , with a very high probability. By (5.26) we have, with a very high probability, that
Therefore, with a very high probability, is in the vicinity of , and thus holds for any fixed if in the definition of is sufficiently large. Thus with a very high probability by (5.15), so implies and this proves (5.36).
is analogous. Finally, by the Lipschitz continuity of on a scale bigger than , we have
where we also used that .
Define the interval with a constant larger than the constant in (5.9). Using (5.30), we have
since . Finally, for and for by (5.9). Since , combining these estimates with the fluctuation (5.14) and with the tail estimate (5.13), we obtain that $n(E)(1-n(E))\leq C\exp\big[-cN^{1/4}\big]$ for any , and it decays exponentially for large , therefore
Collecting all these estimates, inserting them into (5.38) and using (5.3), we obtain (5.35) and conclude the proof of Proposition 5.10.
Finally, we can complete the proof of Lemma 5.4. By (5.16) and (5.34), we have
apart from a set of probability $C\exp\big[-cN^{\delta}\big]$. Combining it with the tail estimate (5.13) on , we obtain (5.5) with any . The inequality (5.6) in Lemma 5.4 follows immediately from (5.26) with any sufficiently small and with .
Appendix A Some Properties of the Eigenvalue Process
In the main part of the paper we did not specify the function spaces in which the equations (2.4) and (3.12) are solved. In this appendix we summarize some basic properties of these equations. In particular, we justify the integration by parts in (3.17). For simplicity, we consider the most singular case only.
The Dyson Brownian motion as a stochastic process was rigorously constructed in Section 4.3.1 of . It was proved that the eigenvalues do not collide with probability one and thus (2.4) holds in a weak sense on the open set . The coefficients of have a singularity near the coalescence hyperspace . We focus only on the single collision singularities, i.e. on the case . By the ordering of the eigenvalues, higher order collision points form a zero measure set on the boundary of and can thus be neglected. In an open neighborhood near the coalescence hyperspace , the generator has the form
after a change of variables, , , where has regular coefficients. The boundary condition at is given by the standard boundary condition of the generator of the Bessel process, , which is as . Thus is defined on functions with sufficient decay at infinity and with boundary conditions
The generator of (3.12) differs from only in drift terms with bounded coefficients, hence the boundary conditions of and coincide. Finally, we need some non-vanishing and regularity property of the solution of (3.12):
Let be a bounded open set such that
i.e. intersects at most one of the coalescent hyperplanes, namely the . Then any weak solution of (3.12) with boundary conditions (A.1) is on and for any we have
where is an elliptic operator with second derivatives in the variables and with bounded coefficients on the compact set . The solution in the new coordinates is . Introducing a function defined in variables, we see that satisfies , where
i.e. is elliptic with bounded coefficients in the new variables. Notice that the boundary condition (A.1) implies that, in the two dimensional plane of , the support of the test function for the equation is allowed to include the origin .
By standard parabolic regularity, we obtain that the solution is and is bounded from above and below.
This lemma justifies the integration by parts in (3.17). Since and it is separated away from zero, has no singularity on the coalescence lines. Since the function vanishes whenever for some , the boundary terms of the form
Appendix B Logarithmic Sobolev inequality for convolution measures
for any with . Here .
Proof. The following proof is really a special case of the martingale approach used in to prove LSI. Let
For any fixed , from the LSI w.r.t. the measure , the first term on the right hand side is bounded by
Integrating by parts, we can rewrite the last term as
where we have used and the Schwarz inequality. Combining these inequalities, we have proved the Lemma.
Appendix C Proof of Lemma 5.3
For any , let denote the minor that is obtained from the Wigner matrix by removing the -th row and column. Let be the -th column of without the element. Let be the eigenvalues and the corresponding eigenvectors of and set
It is well known (see, e.g. Lemma 2.5 of ), that the eigenvalues of and are interlaced for each , i.e.
Expressing the resolvent of at a spectral parameter , , in terms of the resolvent of , we obtain
By considering only the imaginary part, we obtain
where we restricted the summation in (C.3) only to eigenvalues lying in .
For each , we define the event
if is sufficiently large, recalling that . On the complement event, , we have from (C.4) that
i.e. . Choosing sufficiently large, we obtain (5.4) from (C.5). This proves Lemma 5.3.
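The interlacing property used at the start of this proof can be checked numerically. The sketch below is an illustration, not part of the argument. It uses an equivalent formulation in terms of counting functions: for every energy $x$, the number of eigenvalues of the full matrix below $x$ exceeds the corresponding number for the minor by either 0 or 1. The Gaussian test matrix and the helper names are assumptions made for the example. Eigenvalues are counted through Sylvester's law of inertia.

```python
import random

def count_below(matrix, energy):
    """Number of eigenvalues below `energy`: count the negative pivots in the
    unpivoted LDL^T elimination of (matrix - energy * I)."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    for i in range(n):
        a[i][i] -= energy
    negatives = 0
    for k in range(n):
        pivot = a[k][k]
        if pivot == 0.0:
            raise ZeroDivisionError("zero pivot; shift the energy slightly")
        if pivot < 0.0:
            negatives += 1
        for i in range(k + 1, n):
            factor = a[i][k] / pivot
            for j in range(k + 1, n):
                a[i][j] -= factor * a[k][j]
    return negatives

def remove_row_and_column(matrix, k):
    """Minor obtained by deleting the k-th row and the k-th column."""
    n = len(matrix)
    return [[matrix[i][j] for j in range(n) if j != k] for i in range(n) if i != k]

rng = random.Random(1)
n = 8
h = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i, n):
        h[i][j] = h[j][i] = rng.gauss(0.0, (1.0 / n) ** 0.5)

minor = remove_row_and_column(h, 0)
test_points = [-4.0 + 0.0137 + 0.1 * t for t in range(80)]
# Interlacing is equivalent to every difference being 0 or 1.
differences = [count_below(h, x) - count_below(minor, x) for x in test_points]
print(differences)
```

The printed differences take only the values 0 and 1, which restates the fact that exactly one eigenvalue of the minor lies between two consecutive eigenvalues of the full matrix.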
holds for any , where is the cardinality of the index set .
Proof of Lemma C.1. We will need the following result of Hanson and Wright , extended to non-symmetric variables by Wright . We remark that this statement can also be extended to complex random variables (Proposition 4.5 ).
Let , be a sequence of real i.i.d. random variables with distribution satisfying the Gaussian decay (2.16) for some . Let , be arbitrary real numbers and let be the matrix with entries . Define
Then there exists a constant , depending only on from (2.16), such that for any
where $A:=(\mathrm{Tr}\,\mathcal{A}\mathcal{A}^{t})^{1/2}=\big[\sum_{j,k}|a_{jk}|^{2}\big]^{1/2}$.
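The scale $A$ in this bound can be illustrated with a short simulation. The sketch assumes that the quantity being controlled is the centered quadratic form $X=\sum_{j,k}a_{jk}\big[b_{j}b_{k}-\mathbb{E}\,b_{j}b_{k}\big]$, which is the usual Hanson-Wright setting; the displayed definition is not reproduced above, so this form is an assumption. It also takes the $b_j$ to be standard Gaussian, one admissible choice with Gaussian decay. The code first checks the identity defining $A$ and then samples $X$ to see that its typical size is of order $A$.

```python
import math
import random

def trace_a_at(a):
    """Tr(A A^t), computed from the diagonal entries of the matrix product."""
    n = len(a)
    return sum(sum(a[i][k] * a[i][k] for k in range(n)) for i in range(n))

def frobenius_norm(a):
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for row in a for x in row))

def centered_quadratic_form(a, b):
    """X = sum_{j,k} a_jk (b_j b_k - delta_jk), using E b_j b_k = delta_jk
    for independent standard Gaussian b_j (an assumption of this sketch)."""
    n = len(a)
    total = 0.0
    for j in range(n):
        for k in range(n):
            total += a[j][k] * (b[j] * b[k] - (1.0 if j == k else 0.0))
    return total

rng = random.Random(2)
n = 6
a = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]  # not symmetric
scale = frobenius_norm(a)

samples = [
    centered_quadratic_form(a, [rng.gauss(0.0, 1.0) for _ in range(n)])
    for _ in range(4000)
]
mean_square = sum(x * x for x in samples) / len(samples)
# Empirical frequency of |X| exceeding a few multiples of A.
tail = {m: sum(abs(x) >= m * scale for x in samples) / len(samples) for m in (1, 2, 4)}
print(scale, math.sqrt(mean_square), tail)
```

In the Gaussian case the variance of $X$ is at most $2A^{2}$, so the empirical root mean square stays below a small multiple of $A$, and the frequencies of large deviations drop as the threshold grows.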
Acknowledgement: We thank Jun Yin for several helpful comments and pointing out some errors in the preliminary versions of this paper. We are also grateful to the referees for their suggestions to improve the presentation.