On Finite Rank Deformations of Wigner Matrices

Alessandro Pizzo, David Renfrew, Alexander Soshnikov

Introduction and Formulation of Main Results

Let XN=1NWNX_{N}=\frac{1}{\sqrt{N}}W_{N} be a random real symmetric (Hermitian) Wigner matrix with independent entries up from the diagonal. In the real symmetric case, we assume that the off-diagonal entries

are independent random variables such that

are independent random variables (that are also independent from the off-diagonal entries), such that

In a similar fashion, in the Hermitian case, we assume that the off-diagonal entries

are independent random variables such that

are independent centered random variables, independent from the off-diagonal entries, with uniformly bounded third moment of the absolute values.

For a real symmetric (Hermitian) matrix MM of order N,N, its empirical distribution of the eigenvalues is defined as μM=1Ni=1Nδλi,\mu_{M}=\frac{1}{N}\sum_{i=1}^{N}\delta_{\lambda_{i}}, where λ1λN\lambda_{1}\geq\ldots\geq\lambda_{N} are the (ordered) eigenvalues of M.M. Wigner semicircle law (see e.g. , , ) states that almost surely the empirical distribution μXN\mu_{X_{N}} of a random real symmetric (Hermitian) Wigner matrix XNX_{N} converges weakly to the nonrandom limiting distribution μsc.\mu_{sc}. The limiting distribution μsc\mu_{sc} is known as the semicircle distribution. It is absolutely continuous with respect to the Lebesgue measure and has the compact support [2σ,2σ].[-2\sigma,2\sigma]. The density of the Wigner semicircle distribution is given by

converges to φ(x)\*dμsc(dx)\int\varphi(x)\*d\mu_{sc}(dx) almost surely; here and throughout the paper, we use the notation trN=1NTr\text{tr}_{N}=\frac{1}{N}\text{Tr} to denote the normalized trace.

The Stieltjes transform of the semicircle law is

In this paper, we study the fluctuations of the outliers in the spectrum of finite-dimensional deformations of Wigner matrices. Starting with the pioneering work by Füredi and Komlós , there have been several results on finite rank perturbations of matrices with i.i.d. entries, in particular , , , , , , , , , . We also note several papers on the eigenvalues of sample covariance matrices of spiked population models (, , , ).

This manuscript can be viewed as a companion paper to our recent works and on the non-Gaussian fluctuation of the matrix entries of regular functions of Wigner matrices. However, no knowledge of the machinery used in and is required, and the paper can be read independently from these papers.

Here WNW_{N} is a random real symmetric (Hermitian) Wigner matrix as defined in (1.1-1.4) ((1.5-1.7)), and ANA_{N} is a deterministic Hermitian matrix of fixed finite rank r.r. We assume that the eigenvalues of ANA_{N} and their multiplicities are fixed. Let

be the eigenvalues of ANA_{N} each with fixed multiplicity kjk_{j}. Clearly, the eigenvalue θj0=0\theta_{j_{0}}=0 has multiplicity NrN-r and jj0kj=r.\sum_{j\neq j_{0}}k_{j}=r.

The first theorem of this section, Theorem 1.1, concerns the convergence of the extreme eigenvalues of the deformed random matrix. Let us denote ρθ=θ+σ2θ.\rho_{\theta}=\theta+\frac{\sigma^{2}}{\theta}. We shall use the shorthand notation ρj\rho_{j} for ρθj.\rho_{\theta_{j}}. Theorem 1.1 was originally proved by Capitaine, Donati-Martin, and Feral in in the case when the common marginal distribution of the matrix entries is symmetric and satisfies a Poincaré inequality.

Let XN=1NWNX_{N}=\frac{1}{\sqrt{N}}W_{N} be a random real symmetric (Hermitian) Wigner matrix satisfying (1.1-1.4) (respectively (1.5-1.7)). Let Jσ+J_{\sigma^{+}} be the number of jj’s such that θj>σ\theta_{j}>\sigma and JσJ_{\sigma^{-}} be the number of jj’s such that θj<σ\theta_{j}<-\sigma.

For all j=1,,Jσ+j=1,\ldots,J_{\sigma^{+}} and i=1,,kji=1,\ldots,k_{j}, λk1++kj1+iρj,\lambda_{k_{1}+\ldots+k_{j-1}+i}\to\rho_{j},

λk1++kJσ++12σ,\lambda_{k_{1}+\ldots+k_{J_{\sigma^{+}}}+1}\to 2\sigma,

λk1++kJJσ2σ,\lambda_{k_{1}+\ldots+k_{J-J_{\sigma^{-}}}}\to-2\sigma,

For all j=JJσ+1,,Jj=J-J_{\sigma^{-}}+1,\ldots,J and i=1,,kji=1,\ldots,k_{j}, λk1++kj1+iρj.\lambda_{k_{1}+\ldots+k_{j-1}+i}\to\rho_{j}.

In other words, the first k1k_{1} largest eigenvalues of MNM_{N} converge to ρ1,\rho_{1}, the next k2k_{2} largest eigenvalues converge to ρ2,,\rho_{2},\ldots, the Jσ+J_{\sigma^{+}}th bunch of the largest eigenvalues converge to ρJσ+,\rho_{J_{\sigma^{+}}}, the next largest eigenvalue converges to 2\*σ2\*\sigma (since it corresponds to a nonnegative eigenvalue of ANA_{N} which is not bigger than σ\sigma), etc.

If random variables (WN)ij, 1ijN,(W_{N})_{ij},\ 1\leq i\leq j\leq N, satisfy a Poincaré inequality (1.12) with constant υi,j,N\upsilon_{i,j,N} uniformly bounded from zero, υi,j,Nυ>0,\upsilon_{i,j,N}\geq\upsilon>0, the convergence holds with probability one.

Note that the Poincaré inequality tensorizes and the probability measures satisfying the Poincaré inequality have subexponential tails (, ) . In particular, if the marginal distributions of the matrix entries of WNW_{N} satisfy the Poincaré inequality with constant υ>0,\upsilon>0, then the joint distribution of (WN)ij, 1ijN,(W_{N})_{ij},\ 1\leq i\leq j\leq N, also satisfies the Poincaré inequality with the same constant υ.\upsilon. By a standard scaling argument, we note that if the marginal distributions of the matrix entries of WNW_{N} satisfy the Poincaré inequality with υ>0\upsilon>0 then the marginal distributions of the matrix entries of XN=1N\*WNX_{N}=\frac{1}{\sqrt{N}}\*W_{N} satisfy the Poincaré inequality with constant N\*\upsilon.\

Theorem 1.1 follow from Theorem 1.2 formulated below. Theorem 1.2 is concerned with the distribution of the outliers, i.e. the eigenvalues of MNM_{N} corresponding to θj>σ.\theta_{j}>\sigma. Namely, we are interested in the fluctuation of the outliers around ρj, 1jJσ+.\rho_{j},\ 1\leq j\leq J_{\sigma^{+}}. Let us consider a fixed eigenvalue θj\theta_{j} of ANA_{N} such that θj>σ.\theta_{j}>\sigma. In general, if one does not assume some additional information about the structure of the eigenvectors of ANA_{N} corresponding to θj,\theta_{j}, the sequence of random vectors

Let XN=1NWNX_{N}=\frac{1}{\sqrt{N}}W_{N} be a random real symmetric (Hermitian) Wigner matrix defined in (1.1-1.4) (respectively (1.5-1.7)). Let 1jJσ+,1\leq j\leq J_{\sigma^{+}}, so the eigenvalue θj\theta_{j} of ANA_{N} satisfies θj>σ\theta_{j}>\sigma. Then the sequence of random vectors

is bounded in probability. In addition, if the marginal distributions of the matrix entries of WNW_{N} satisfy the Poincaré inequality (1.12) with constant υi,j,N\upsilon_{i,j,N} uniformly bounded from zero, the following holds with probability 11

Theorem 1.2 clearly implies parts (a) and (d) of Theorem 1.1. To see that parts (b) and (c) of Theorem 1.1 also follow, we note that for any fixed positive integer l1l\geq 1 the ll-th largest eigenvalue of XNX_{N} converges in probability to 2\*σ.2\*\sigma. This is a simple consequence of the convergence of the largest eigenvalue of XNX_{N} to 2\*σ2\*\sigma and the semicircle law. Then the interlacing property and Theorem 1.2 imply the desired result.

The bound (1.15) means that there exists a sufficiently large deterministic constant C=C(σ,υ,θ1,,θr)>0,C=C(\sigma,\upsilon,\theta_{1},\ldots,\theta_{r})>0, such that with probability 11

To study the fluctuations of the outliers in more detail, we consider two special cases following .

Case A (“The eigenvectors don’t spread out”)

Case B (“The eigenvectors are delocalized”)

The ll^{\infty} norm of every orthonormal eigenvector of ANA_{N} corresponding to θj\theta_{j} goes to zero as N.N\to\infty.

The next theorem is a consequence of Proposition 1.1 below and Theorems 1.1 and 1.5 in . We use a standard notation β=1\beta=1 in the real symmetric case and β=2\beta=2 in the Hermitian case.

Let XN=1NWNX_{N}=\frac{1}{\sqrt{N}}W_{N} be a random real symmetric (Hermitian) Wigner matrix defined in (1.1-1.4) (respectively (1.5-1.7)) such that the off-diagonal entries (WN)ij, 1i<jN,(W_{N})_{ij},\ 1\leq i<j\leq N, are i.i.d. real (complex) random variables with probability distribution μ\mu and the diagonal entries (WN)ii, 1i<N,(W_{N})_{ii},\ 1\leq i<N, are i.i.d. random variables with probability distribution μ1.\mu_{1}. In Case A, the kjk_{j}-dimensional vector

converges in distribution to the distribution of the ordered eigenvalues of the kj×kjk_{j}\times k_{j} random matrix VjV_{j} defined as

(i) WjW_{j} is a Wigner random matrix of size KjK_{j} with the same marginal distribution of the matrix entries as WN,W_{N},

(ii) HjH_{j} is a real symmetric (Hermitian) Gaussian matrix of size Kj,K_{j}, independent of Wj,W_{j}, with centered independent entries Hst, 1stKj,(ReHst,ImHst, 1s<tKj, Hpp,1pKjH_{st},\ 1\leq s\leq t\leq K_{j},(\operatorname{\mathfrak{Re}}H_{st},\operatorname{\mathfrak{Im}}H_{st},\ 1\leq s<t\leq K_{j},\ H_{pp},1\leq p\leq K_{j} in the Hermitian case) with the variance of the entries given by

(iii) UjU_{j} is a Kj×kjK_{j}\times k_{j} such that the (KjK_{j}-dimensional) columns of UjU_{j} are written from the first KjK_{j} coordinates of the orthonormal eigenvectors corresponding to θj.\theta_{j}.

In , Theorem 1.3 was proved for symmetric marginal distribution satisfying the Poincaré inequality (1.12) under an additional technical assumption that k=o(\sqrt{N}),\ where kk is defined in the paragraph above (1.16).

Using Theorems 4.1 and 4.2 from , one can extend the results of Theorem 1.3 to the case when the entries of WNW_{N} are not identically distributed provided the distribution of the entries (WN)il, 1i,lKj(W_{N})_{il},\ 1\leq i,l\leq K_{j} does not depend on N.N.

Let XN=1NWNX_{N}=\frac{1}{\sqrt{N}}W_{N} be a random real symmetric (Hermitian) Wigner matrix defined in (1.1-1.4) (respectively (1.5-1.7)) such that the distribution of the entries (WN)il, 1i,lKj(W_{N})_{il},\ 1\leq i,l\leq K_{j} does not depend on N.N. Let us assume that the limits

Then in case A, the results of Theorem 1.3 hold with κ4(μ)\kappa_{4}(\mu) in (1.18) replaced by

Let XN=1NWNX_{N}=\frac{1}{\sqrt{N}}W_{N} be a random real symmetric (Hermitian) Wigner matrix defined in (1.1-1.4) (respectively (1.5-1.7)) such that the off-diagonal entries (WN)ij, 1i<jN,(W_{N})_{ij},\ 1\leq i<j\leq N, are i.i.d. random variables with probability distribution μ\mu and the diagonal entries (WN)ii, 1i<N,(W_{N})_{ii},\ 1\leq i<N, are i.i.d. random variables with probability distribution μ1.\mu_{1}. In Case B, the kjk_{j}-dimensional vector

converges in distribution to the distribution of the (ordered) eigenvalues of a kj×kjk_{j}\times k_{j} GOE (GUE) matrix with the variance of the matrix entries given by θj2\*σ2θj2σ2\frac{\theta_{j}^{2}\*\sigma^{2}}{\theta_{j}^{2}-\sigma^{2}} provided k=o(N).k=o(\sqrt{N}).

We recall that kk has been defined above as the minimal number of canonical basis vectors e1,rNe_{1},\ldots r_{N} required to span the eigenvectors corresponding to the eigenvalues θ1,θJσ+.\theta_{1},\ldots\theta_{J_{\sigma^{+}}}.

Theorem 1.5 is an immediate extension of the result of Capitaine, Donati-Martin, and Féral from to our setting since their arguments apply essentially unchanged as soon as Theorem 1.1 is established.

It should be noted that Benaych-Georges, Guionnet, and Maida consider in perturbations of a random Wigner matrix by a finite rank random matrix with eigenvectors that are either independent copies of a random vector vv with i.i.d. centered components satisfying the log-Sobolev inequality or are obtained by Gram-Schmidt orthonormalization of such independent copies. The distribution of the outliers is given in Proposition 5.3. of . Let us denote the distribution of the first component of vv by ν.\nu. If the fourth cumulant κ4(ν)\kappa_{4}(\nu) of ν\nu vanishes, the limiting distribution of the outliers is similar to the result of Theorem 1.5, and given by the distribution of the ordered eigenvalues of a GOE (GUE) matrix. If the fourth cumulant does not vanish, one has to add a diagonal matrix with i.i.d. real Gaussian entries to a GOE (GUE) matrix.

One of the most important results of , concerns the distribution of the “sticking” eigenvalues (i.e. the eigenvalues that correspond to θj<σ).|\theta_{j}|<\sigma). In Theorem 5.3 of , Benaych-Georges, Guionnet, and Maida prove that their limiting distribution is given by the Tracy-Widom law.

Let us briefly describe a key ingredient of the proofs of Theorems 1.2-1.4. We use the notation

Let us consider a fixed eigenvalue θj\theta_{j} of ANA_{N} such that θj>σ\theta_{j}>\sigma and denote by v(1),,v(kj)v^{(1)},\ldots,v^{(k_{j})} the orthonormal eigenvectors of ANA_{N} that correspond to the eigenvalue θj.\theta_{j}. Denote by ΞN(j)\Xi^{(j)}_{N} the kj×kjk_{j}\times k_{j} matrix with the entries

where we recall that \rho_{j}=\theta_{j}+\frac{\sigma^{2}}{\theta_{j}}\. The following proposition plays an important part in our proofs.

Let y1ykjy_{1}\geq\ldots\geq y_{k_{j}} be the ordered eigenvalues of the matrix ΞN(j).\Xi^{(j)}_{N}. Then

It should be mentioned that the key part of the proof of Proposition 1.1 is a lemma from which is stated as Lemma 4.2 in Section 4. Proposition 1.1 indicates that the question of the limiting distribution of the outliers of the spectrum of the deformed Wigner matrix MNM_{N} can be reduced to the question about the limiting distribution of the entries of (1.25).

Without additional assumptions on u(N)u^{(N)} and v(N),v^{(N)}, the sequence

does not necessarily converge in distribution. However, one can show that it is tight.

Let XN=1NWNX_{N}=\frac{1}{\sqrt{N}}W_{N} be a random real symmetric (Hermitian) Wigner matrix defined in (1.1-1.4) (respectively (1.5-1.7)). Then the following statements hold:

where Const(σ2,m5,c3)Const(\sigma^{2},m_{5},c_{3}) depends on σ2,m5,\sigma^{2},m_{5}, and c3.c_{3}.

(iii) If the marginal distributions of the entries of WNW_{N} satisfy the Poincaré inequality (1.12) with a uniform constant υ>0\upsilon>0, and ff is a Lipschitz continuous function on [2\*σδ,2\*σ+δ][-2\*\sigma-\delta,2\*\sigma+\delta] that satisfies a subexponential growth condition

for some positive constants aa and b,b, then

where fL,δ|f|_{\mathcal{L},\delta} is defined in (1.32),

and υ\upsilon is the constant in the Poincaré inequality (1.12).

We finish this section by formulating our last theorem, Theorem 1.7, which allows us to extend Theorem 1.3 (see Remark 5.1 in Section 5). Assume that that the off-diagonal entries (WN)ij, 1i<jN,(W_{N})_{ij},\ 1\leq i<j\leq N, are i.i.d. random variables with probability distribution μ\mu and the diagonal entries (WN)ii, 1i<N,(W_{N})_{ii},\ 1\leq i<N, are i.i.d. random variables with probability distribution μ1.\mu_{1}.

converges in distribution as N.N\to\infty. Without loss of generality, we will consider the real symmetric case; the Hermitian case is essentially identical. Let mm be an arbitrary fixed positive integer. Denote by R(m)(z)R^{(m)}(z) the m×mm\times m upper-left corner of the matrix RN(z).R_{N}(z). Theorem 1.1 in states that a matrix-valued random field

with values in the space of complex symmetric m×mm\times m matrices, converges in finite-dimensional distributions to a random field

where W(m)W^{(m)} is the m×mm\times m upper-left corner submatrix of a Wigner matrix WN, gσ(z)W_{N},\ g_{\sigma}(z) is the Stieltjes transform (1.9) of the Wigner semicircle law, and

is a Gaussian random field with the covariance matrix given by the formulas (1.18)-(1.23) in the real-symmetric case and (1.50)-(1.55) in the Hermitian case in . It is important to note that Yij(z), 1ijm,Y_{ij}(z),\ 1\leq i\leq j\leq m, are independent random processes for different indices (ij).(ij).

Let us extend the definition of Υ(z)\Upsilon(z) to that of an infinite-dimensional matrix \Upsilon(z)_{pq},\ 1\leq p,q<\infty,\ using the formulas (1.18)-(1.23) (respectively (1.50)-(1.55)) from . Thus, the r.h.s. in (1.43) defines now the m×mm\times m upper-left corner of the infinite matrix Υ(z).\Upsilon(z). Then Theorem 1.1 of implies that

Let XN=1NWNX_{N}=\frac{1}{\sqrt{N}}W_{N} be a random real symmetric (Hermitian) Wigner matrix defined in (1.1-1.4) (respectively (1.5-1.7)) such that that the off-diagonal entries (WN)ij, 1i<jN,(W_{N})_{ij},\ 1\leq i<j\leq N, are i.i.d. random variables with probability distribution μ\mu and the diagonal entries (WN)ii, 1i<N,(W_{N})_{ii},\ 1\leq i<N, are i.i.d. random variables with probability distribution μ1.\mu_{1}.

converges weakly to the joint distribution of  up,Υ(z)uq,  1p,ql.\ \langle u_{p},\Upsilon(z)u_{q}\rangle,\ \ 1\leq p,q\leq l.

We would like to thank A. Guionnet for bringing our attention to the preprints and .

Mathematical Expectation and Variance of Resolvent Sesquilinear Form

This section is devoted to the proof of the main building block Theorem 1.6, namely Proposition 2.1.

When it does not lead to ambiguity we will use the shorthand notation, RijR_{ij}, for the ijij-th entry (RN(z))ij,(R_{N}(z))_{ij}, of the resolvent matrix RN(z).R_{N}(z).

In the case when u(N)u^{(N)} and v(N)v^{(N)} are standard basis vectors, u=ei, v=ej,u=e_{i},\ v=e_{j}, the mathematical expectation and the variance of u(N),RN(z)v(N)\langle u^{(N)},R_{N}(z)v^{(N)}\rangle have been studied in . In particular, it has been shown there in Proposition 2.1 and (3.27) that

In , Erdös, Yau, and Yin studied generalized Wigner matrices (defined at the beginning of Section 2 of ), and obtained the following estimates provided the marginal distributions have subexponential tails

where 0<ϕ<1, C1,c>00<\phi<1,\ C\geq 1,c>0 are some constants, 4/ϕlClogN/loglogN,4/\phi\leq l\leq C\log N/\log\log N, N1\*(logN)10\*l<Imz10, Rez5σ,N^{-1}\*(\log N)^{10\*l}<\operatorname{\mathfrak{Im}}z\leq 10,\ |\operatorname{\mathfrak{Re}}z|\leq 5\sigma, and NN is sufficiently large.

It follows from our proofs that the error term on the r.h.s. of (2.2) can be replaced by O\left(\frac{\min(\|u\|_{1},\|v\|_{1})}{|\operatorname{\mathfrak{Im}}z|^{7}\*N}\right),\ where u1=i=1Nui.\|u\|_{1}=\sum_{i=1}^{N}|u_{i}|.

The rest of the section is devoted to the proof of Proposition 2.1.

Without loss of generality, we can restrict our attention to the real symmetric case. The proof in the Hermitian case is very similar. We start by proving (2.2). Using (z\*INXN)\*RN(z)=IN,(z\*I_{N}-X_{N})\*R_{N}(z)=I_{N}, we write

where ηN\eta_{N} is defined in (2.1), and rNr_{N} contains the third and the fourth cumulant terms corresponding to p=2p=2 and p=3p=3 in the decoupling formula (6.1) for ik,i\not=k, and the error terms due to the truncation of the decoupling formula (6.1) for iki\not=k at p=3p=3 and for i=ki=k at p=1.p=1.

where by κ3(i,k)\kappa_{3}(i,k) we denote the third cumulant of (WN)ik.(W_{N})_{ik}. We note that

uniformly in iki\not=k and N.N. To estimate the absolute value of the first term in (2.15), we first sum with respect to jj and then use the Cauchy-Schwarz inequality and (6.7) to obtain

To estimate the absolute value of the second term in (2.15), we write

Finally, we bound the last of the third cumulant terms in (2.15) as

Combining the bounds (2.16-2.18), we see that the contribution of the third cumulant terms to rNr_{N} in (2.12-2.13) is bounded from above by O(1Imz3\*N).O\left(\frac{1}{|\operatorname{\mathfrak{Im}}z|^{3}\*\sqrt{N}}\right). The fourth cumulant terms give

To estimate the absolute value of the first term in (2.19), we note that

(6.7), and the fact that the fourth cumulants of (WN)ik(W_{N})_{ik} are uniformly bounded in absolute value by some constant Const(m5).Const(m_{5}).

To estimate the second term in (2.19), we write

The other two terms in (2.19) are estimated in a similar fashion. Each of them is O(N\*u\*vImz2).O\left(\frac{N\*\|u\|\*\|v\|}{|\operatorname{\mathfrak{Im}}z|^{2}}\right). Therefore, the fourth cumulant terms give the contribution O(1N\*Imz4)O\left(\frac{1}{N\*|\operatorname{\mathfrak{Im}}z|^{4}}\right) to rNr_{N} in (2.12-2.13).

Finally, we estimate the error terms due to the truncation of the decoupling formula at p=3p=3 for iki\not=k and at p=1p=1 for i=k.i=k. Here, we treat the error term due to the truncation of the decoupling formula at p=3p=3 for ik.i\not=k. The second error term can be treated in a similar way. To estimate the error term, we have to consider expressions of the following form

where a,b,c,d,e,f,p,q,s\in\{i,k\},\ the supremum in (2.22) is considered over the resolvents R(l)=(zXN(l))1, l=1,5R^{(l)}=(z-X_{N}^{(l)})^{-1},\ l=1,\ldots 5 of rank two perturbations XN(l)=XN+x\*EikX_{N}^{(l)}=X_{N}+x\*E_{ik} of XNX_{N} with (Eik)jh=δij\*δkh+δih\*δkj.(E_{ik})_{jh}=\delta_{ij}\*\delta_{kh}+\delta_{ih}\*\delta_{kj}. Estimating each entry of R(l)R^{(l)} by 1Imz,\frac{1}{|\operatorname{\mathfrak{Im}}z|}, taking into account that

and using the fact that the fifth cumulants of the off-diagonal entries of WNW_{N} are uniformly bounded, we bound (2.22) from above by O\left(\frac{1}{N\*|\operatorname{\mathfrak{Im}}z|^{5}}\right).\

Combining the estimates of the third and the fourth cumulant terms and the truncation error term, we can rewrite the Master equation (2.12) as

where we recall that by PlP_{l} we denote a polynomial of degree ll with positive coefficients that do not depend on N.N.

which is exactly the estimate (2.2) of Proposition 2.1.

To prove (2.3), we note that (2.25-2.26), (2.28) and (2.6) imply

Now, we turn our attention to the proof of (2.4). The key part of the proof is the following lemma.

where rNr_{N} contains the third and the fourth cumulant terms corresponding to p=2p=2 and p=3p=3 in (6.1) for k=ik=i, and the error due to the truncation of the decoupling formula (6.1) at p=3p=3 for kik\not=i and at p=1p=1 for k=i.k=i. Clearly,

Using (2.42) and (2.47), one can write the last term in (2.39) as

The third cumulant terms in rNr_{N} in (2.40) can be written as

We are going to estimate the terms (2.53-2.55) separately. We start with the last two. We claim that both (2.54) and (2.55) are O(1Imz4\*N)O\left(\frac{1}{|\operatorname{\mathfrak{Im}}z|^{4}\*N}\right). Indeed, consider first (2.54). It follows from (6.4-6.5), (2.42), and (2.47), that it is equal to

Combining (2.59) and (2.58), we estimate (2.57) as O(1Imz4\*N).O\left(\frac{1}{|\operatorname{\mathfrak{Im}}z|^{4}\*N}\right). The other terms in (2.56) can be estimated in a similar way, which implies that (2.54) is O(1Imz4\*N)O\left(\frac{1}{|\operatorname{\mathfrak{Im}}z|^{4}\*N}\right).

Now, we turn our attention to (2.55). Using (2.43-2.45) and (2.48), one can rewrite (2.55) as

We estimate (2.60). The subsums (2.61-2.63) can be estimated in a similar way. The summation with respect to jj in (2.60) gives

Combining the last two bounds, we obtain that (2.60) is O(1Imz4\*N).O\left(\frac{1}{|\operatorname{\mathfrak{Im}}z|^{4}\*N}\right).

Finally, let us estimate (2.53). It can be written as

The subsums (2.64) and (2.66) are bounded from above by O(1Imz4\*N).O\left(\frac{1}{|\operatorname{\mathfrak{Im}}z|^{4}\*N}\right). The calculations are very similar to the ones used above and are left to the reader. The subsum (2.65) can be written as

It follows from the estimates in (2.17) that one has a deterministic upper bound

Combining the estimates (2.53-2.70), we obtain that the third cumulant term (2.52) contributing to rNr_{N} in (2.38) can be written as

Somewhat long but straightforward calculations using (6.4-6.5) and (2.42-2.51) show that the fourth cumulant term in rNr_{N} in (2.38) can be estimated from above by O\left(\frac{1}{|\operatorname{\mathfrak{Im}}z|^{5}\*N}\right).\ Since the calculations are very similar to those in (2.19- 2.21), we leave the details to the reader. In a similar fashion, the error terms in rN,r_{N}, due to the truncation of the decoupling formula at p=3p=3 for iki\not=k and at p=1p=1 for i=ki=k are bounded from above by O\left(\frac{1}{|\operatorname{\mathfrak{Im}}z|^{6}\*N}\right).\ The considerations are similar to those given in the analysis of (2.22).

Combining (2.41), (2.50-2.51), (2.71-2.72), and the bounds on the fourth cumulant term and the error terms discussed in the above paragraph, one rewrites the Master equation (2.38-2.39) as

Subtracting the r.h.s. in (2.35) from the r.h.s. in (2.76), we obtain (2.32). Lemma 2.1 is proven. ∎

Now, we are ready to finish the proof of Proposition 2.1. To obtain the estimate (2.4) from (2.32), we use the same arguments as in Section 3 of and Section 2 of . We note (see e.g. (3.9) in ) that

where the constant LL is chosen sufficiently large so that the O(P4(Imz1)N)O\left(\frac{P_{4}(|\operatorname{\mathfrak{Im}}z|^{-1})}{N}\right) term on the r.h.s. of (2.77) is at most 1/21/2 in absolute value. Multiplying both sides of (2.32) by gN(z),g_{N}(z), and using (6.8), we obtain that

for zON.z\in\mathcal{O}_{N}. It follows from (2.78) that

On the other hand, if ImzL\*N1/4,|\operatorname{\mathfrak{Im}}z|\leq L\*N^{-1/4}, then L4N\*Imz41.\frac{L^{4}}{N\*|\operatorname{\mathfrak{Im}}z|^{4}}\geq 1. Since u(N),RN(z)v(N)1Imz,|\langle u^{(N)},R_{N}(z)v^{(N)}\rangle|\leq\frac{1}{|\operatorname{\mathfrak{Im}}z|}, we have

for zz such that ImzL\*N1/4.|\operatorname{\mathfrak{Im}}z|\leq L\*N^{-1/4}. Combining (2.79) and (2.80), we obtain (2.4). This finishes the proof of Proposition 2.1. ∎

Proof of Theorem 1.6

Our exposition follows closely the ones in Section 3 of and Section 4 of . In order to extend the estimates of Proposition 2.1 to a more general class of test functions, we use the Helffer Sjöstrand functional calculus (see , ).

To prove (1.34), we let l=7l=7 in (3.2) and assume that ff has compact support. It follows from (2.2) that

uniformly on {z:Rezsupp(f), Imz1},\{z:\operatorname{\mathfrak{Re}}z\in supp(f),\ |\operatorname{\mathfrak{Im}}z|\leq 1\}, and C2C_{2} is a constant depending on supp(f).supp(f). We conclude that the second term on the r.h.s. of (3.8) can be estimated as follows

where χf\chi_{f} and χσ\chi_{\sigma} are the characteristic functions of the support of ff and of σ\sigma respectively, and LL is such that supp(f)[L,L].supp(f)\subset[-L,L]. This proves (1.34).

where z=x+iy, w=s+it.z=x+iy,\ w=s+it. Taking into account (2.4), we get

Plugging (3.5) with l=4l=4 in (3.15), we prove (1.33). Thus, we have proved the parts (i) and (ii) of Theorem 1.6.

Now, let us assume that the marginal distributions of the entries of WNW_{N} satisfy the Poincaré inequality (1.12) with a uniform constant υ\upsilon and prove the parts (iii)-(v), i.e. the estimates (1.37), (1.39), and (1.40). Since the proof of (1.37-1.40) is very similar to the proof of Proposition 3.3 in , we discuss here only the main ingredients.

where the Hilbert-Schmidt norm is defined as

In particular, if uu and vv are unit vectors, then

is a complex-valued Lipschitz continuous function on the space of N×NN\times N real symmetric (Hermitian) matrices with the Lipschitz constant

The second observation is that joint distribution of the matrix entries

of XNX_{N} satisfies the Poincaré inequality with the constant 12\*N\*υ\frac{1}{2}\*N\*\upsilon since the Poincaré inequality tensorizes (, ). Therefore, for any complex-valued Lipschitz continuous function of the matrix entries with the Lipschitz constant GL,|G|_{\mathcal{L}}, the distribution of G(XN)G(X_{N}) has exponential tails (see e.g. Lemma 4.4.3 and Exercise 4.4.5 in ), i.e.

Applying (3.19) to the spectral norm X\|X\| of the matrix XNX_{N} and using the universality results for the largest eigenvalues (see and references therein), we obtain

Outliers in the Spectrum of Finite Rank Perturbations of Wigner Matrices

This section is devoted to the proof of Theorem 1.2

is decreasing and  gσ(2\*σ+0)=1/σ.\ g_{\sigma}(2\*\sigma+0)=1/\sigma. Let us choose δ>0\delta>0 in such a way that

i.e. for all θj\theta_{j} that correspond to the outliers (so θj>σ\theta_{j}>\sigma). Let

where ζN(x)\zeta_{N}(x) is defined in (4.6).

where x_{i+1}-x_{i}=N^{-1/3},\ 0\leq i\leq l(N)-1,\ and xl(N)1L<xl(N).x_{l(N)-1}\leq L<x_{l}(N). Clearly, the number of elements in the sequence is O(N1/3).O(N^{1/3}). We have

uniformly in 0il(N)0\leq i\leq l(N) and N\geq 1.\ Indeed,

Now, we are ready to start the proof of Theorem 1.2. Let us denote by u(1),,u(r),u^{(1)},\ldots,u^{(r)}, the orthonormal eigenvectors of ANA_{N} corresponding to the non-zero eigenvalues. We recall that we used the notation θ1>>θj0=0>>θJ\theta_{1}>\ldots>\theta_{j_{0}}=0>\ldots>\theta_{J} for the (fixed) eigenvalues of AN,A_{N}, and denoted the (fixed) multiplicity of θj\theta_{j} by kjk_{j}. The zero eigenvalue θj0=0\theta_{j_{0}}=0 has multiplicity Nr.N-r. Clearly, jj0kj=r.\sum_{j\neq j_{0}}k_{j}=r. Let us denote by Θ\Theta the r×rr\times r diagonal matrix built from the non-zero eigenvalues of AN,A_{N},

Let us also denote by UNU_{N} the N×rN\times r matrix whose columns are given by the orthonormal eigenvectors u(1),,u(r)u^{(1)},\ldots,u^{(r)} of AN.A_{N}. Clearly,

For any x[2\*σ+2δ,L],x\in[2\*\sigma+2\delta,L], we define the r×rr\times r matrix ΞN(x)\Xi_{N}(x) as follows. Let

The first step in the proof of Theorem 1.2 is the following lemma from .

Suppose that xx is not an eigenvalue of XN.X_{N}. Then xx is an eigenvalue of XN+ANX_{N}+A_{N} with multiplicity n1n\geq 1 if and only if gσ(x)g_{\sigma}(x) is an eigenvalue of the r×rr\times r matrix

For the convenience of the reader, we sketch the proof of Lemma 4.2 below.

Let x∉Sp(XN).x\not\in Sp(X_{N}). Therefore RN(x)=(x\*INXN)1R_{N}(x)=(x\*I_{N}-X_{N})^{-1} is well defined, and

We obtain that for x∉Sp(XN)x\not\in Sp(X_{N}) that  xSp(XN+AN)\ x\in Sp(X_{N}+A_{N}) if and only if

where one uses the identity \det(I-B\*C)=\det(I-C\*B).\ Rewriting

Proposition 1.1 plays an important role in the proof of Theorem 1.2. Before we prove Proposition 1.1, we need to introduce some notations and prove Lemma 4.3.

Consider a family of r×rr\times r matrices ZN(x)Z_{N}(x) defined in (4.23) for x[2\*σ+2δ,L].x\in[2\*\sigma+2\delta,L]. Fix an eigenvalue θj\theta_{j} of ANA_{N} such that θj>σ\theta_{j}>\sigma and use the notation v(1),,v(kj)v^{(1)},\ldots,v^{(k_{j})} for the eigenvectors of ANA_{N} that correspond to the eigenvalue \theta_{j}.\ Without loss of generality we can assume that j=1.j=1. We do it just to simplify notations. The case 1<jJσ+1<j\leq J_{\sigma^{+}} is identical. We recall that ΞN(j)\Xi^{(j)}_{N} is defined in (1.25) as the kj×kjk_{j}\times k_{j} submatrix of ΞN(ρj)\Xi_{N}(\rho_{j}) restricted to the rows and columns corresponding to v(i), 1ikj.v^{(i)},\ 1\leq i\leq k_{j}. The central role in the proof of Proposition 1.1 is played by the following lemma.

Let ZN(x), x[2\*σ+2δ,L],Z_{N}(x),\ x\in[2\*\sigma+2\delta,L], be as in (4.23), with ΞN(x)\Xi_{N}(x) defined in (4.22), and Θ\Theta defined in (4.20). Let

be the ordered eigenvalues of ZN(x).Z_{N}(x). Then, for sufficiently large constant C>0,C>0,

in probability, i.e. N\*(zi(ρ1)1θ1)\sqrt{N}\*(z_{i}(\rho_{1})-\frac{1}{\theta_{1}}) is bounded in probability,  1ik1.\ 1\leq i\leq k_{1}.

We claim that (4.27) follows from Lemma 4.1. Indeed, (4.11) and (4.6) imply that

as N.N\to\infty. Since zi(x)zi(y)ZN(x)ZN(y)=1N\*ΞN(y)ΞN(y), 1ir,|z_{i}(x)-z_{i}(y)|\leq\|Z_{N}(x)-Z_{N}(y)\|=\frac{1}{\sqrt{N}}\*\|\Xi_{N}(y)-\Xi_{N}(y)\|,\ 1\leq i\leq r, we conclude that (4.29) implies (4.27).

in probability. Indeed, the entries of the r×rr\times r matrix ΞN(x)\Xi_{N}(x) are bounded in probability since the expectation and variance of

almost surely. Thus, ΞN(x)\|\Xi_{N}(x)\| is also bounded in probability. Since the first k1k_{1} eigenvalues of Θ1\Theta^{-1} are equal to 1θ1,\frac{1}{\theta_{1}}, we obtain (4.28). Lemma 4.3 is proven. ∎

Now, we are ready to prove Proposition 1.1.

By Lemma 4.2, the outliers of XN+ANX_{N}+A_{N} are given by those values of x[2\*σ+δ,M]x\in[2\*\sigma+\delta,M] such that

We recall that gσ(x)g_{\sigma}(x) is a monotonically decreasing function on [2\*σ+δ,M][2\*\sigma+\delta,M] and

Since for 1ik1,1\leq i\leq k_{1}, (4.28) gives us that zi(ρ1)gσ(ρ1)=O(1N)z_{i}(\rho_{1})-g_{\sigma}(\rho_{1})=O(\frac{1}{\sqrt{N}}) in probability, it follows from (4.31) and (4.27) that with probability going to 1,1, there exist M>x1x2xk1>2\*σ+δM>x_{1}\geq x_{2}\geq\ldots\geq x_{k_{1}}>2\*\sigma+\delta such that g_{\sigma}(x_{i})=z_{i}(x_{i}),\ 1\leq i\leq k_{1},\ and

in probability. Applying (4.27) one more time, we get that

in probability. By a standard perturbation theory argument (see e.g. section XII.1 in ), one proves that the first k1k_{1} smallest eigenvalues of the matrix ZN(ρ1)Z_{N}(\rho_{1}) differ from the (increasingly ordered) eigenvalues of the k1×k1k_{1}\times k_{1} matrix 1θ1\*Id1N\*ΞN(m)\frac{1}{\theta_{1}}\*Id-\frac{1}{\sqrt{N}}\*\Xi^{(m)}_{N} by at most O\left(\frac{1}{N}\right),\ in probability, where the matrix ΞN(m)\Xi^{(m)}_{N} has been defined in (1.25). To see this, we use the following standard lemma from the perturbation theory

Let BB be an n×nn\times n real symmetric (Hermitian) matrix that can be written in the block form as B=(Bij)i,j=1,2,B=(B_{ij})_{i,j=1,2}, where BijB_{ij} is an ni×njn_{i}\times n_{j} matrix. Suppose that all eigenvalues of B11B_{11} are smaller than all eigenvalues of B22B_{22} and the gap between the spectra of B11B_{11} and B22B_{22} is at least Const>0.Const>0. In addition, suppose that the operator norm of the offdiagonal block B12B_{12} is bounded from above by ϵ,\epsilon, so that B12=B21ϵ.\|B_{12}\|=\|B_{21}\|\leq\epsilon.

Then there exists const(Const,n)const(Const,n) such that the first n1n_{1} smallest eigenvalues of BB differ from the (increasingly ordered) eigenvalues of B11B_{11} by at most const\*ϵ2.const\*\epsilon^{2}.

and λn1+1λn1>Const.\lambda_{n_{1}+1}-\lambda_{n_{1}}>Const. Then it is easy to see that

is an approximate eigenvector of BB with the approximate eigenvalue λ1\lambda_{1} such that

Since (Bλj)\*ejϵ, 1jn,\|(B-\lambda_{j})\*e_{j}\|\leq\epsilon,\ 1\leq j\leq n, and λjλ1Const, n1<jn,\lambda_{j}-\lambda_{1}\geq Const,\ n_{1}<j\leq n, we obtain that

The result of the lemma can be immediately extended by induction to the case of m×mm\times m block matrices B=(Bij)1i,jm.B=(B_{ij})_{1\leq i,j\leq m}. To apply it in our setting, we note that the k1×k1k_{1}\times k_{1} matrix 1θ1\*Id1N\*ΞN(m)\frac{1}{\theta_{1}}\*Id-\frac{1}{\sqrt{N}}\*\Xi^{(m)}_{N} is the upper-left block of ZN(ρ1).Z_{N}(\rho_{1}). The other diagonal blocks of ZN(ρ1)Z_{N}(\rho_{1}) are given by ΞN(i), 1im1\Xi^{(i)}_{N},\ 1\leq i\leq m-1 defined in (1.25). Since the operator norms of the off-diagonal blocks of ZN(ρ1)Z_{N}(\rho_{1}) are O(N1/2)(see(\refandor)),O(N^{-1/2})(see(\ref{andor})), the desired statement follows.

where y1yk1y_{1}\geq\ldots\geq y_{k_{1}} are the eigenvalues of the matrix ΞN(m).\Xi^{(m)}_{N}. The result of Proposition 1.1 now follows from (4.36) and (4.33). ∎

Since the eigenvalues of the matrix ΞN(m)(ρ1)\Xi^{(m)}_{N}(\rho_{1}) are bounded in probability, the first part of Theorem 1.2, i.e. (1.14), follows from (1.26) in Proposition 1.1.

almost surely, where Const1>0, Const2>0Const_{1}>0,\ Const_{2}>0 are sufficiently large, improving (4.11). Reasoning as before, (4.37) implies that

almost surely for sufficiently large constant Const3>0.Const_{3}>0. Thus, we have

almost surely, which implies (1.15) since gσ(ρ1)=1θ1.g_{\sigma}(\rho_{1})=\frac{1}{\theta_{1}}. Theorem 1.2 is proven. ∎

Proof of Theorems 1.3, 1.4, and 1.7

In this section, we prove Theorems 1.3, 1.4, and 1.7. We start with Theorem 1.3.

Let θj>σ\theta_{j}>\sigma be an eigenvalue of ANA_{N} with the multiplicity kj.k_{j}. Let us assume that Case A takes place. Thus, without loss of generality, we can assume that the eigenvectors of ANA_{N} corresponding to the eigenvalue θj\theta_{j} belong to Span(e1,,eKj),Span(e_{1},\ldots,e_{K_{j}}), where KjK_{j} is a fixed positive integer. As always, we consider the real symmetric case. The treatment of the Hermitian case is very similar. Consider a Kj×kjK_{j}\times k_{j} matrix UjU_{j} such that the (KjK_{j}-dimensional) columns of UjU_{j} are filled by the first KjK_{j} coordinates of the kjk_{j} orthonormal vectors of ANA_{N} corresponding to the eigenvalue θj.\theta_{j}. We recall that the remaining NKjN-K_{j} coordinates of these orthonormal vectors are zero. Let us denote by RN(Kj)(z)R_{N}^{(K_{j})}(z) the upper-left Kj×KjK_{j}\times K_{j} submatrix of the resolvent matrix RN(z)=(z\*INXN)1.R_{N}(z)=(z\*I_{N}-X_{N})^{-1}. Finally, we define the random matrix-valued field

We recall that Theorem 1.1 in states that ΥN(z)\Upsilon_{N}(z) converges weakly in finite-dimensional distributions to a random field

Now, Theorem 1.3 follows from Proposition 1.1 in this paper, and Theorem 1.1 in , since

Theorem 1.3 is proven. The proof of Theorem 1.4 is very similar to the given proof of Theorem 1.3. One has to use Theorems 4.1 and 4.2 and Remark 4.1 in that generalize Theorems 1.1 and 1.5 in to the non-i.i.d. case, and replace κ4(μ)\kappa_{4}(\mu) in (5.4) with κ4(i), 1iKj.\kappa_{4}(i),\ 1\leq i\leq K_{j}.

Now, we turn to the proof of Theorem 1.7.

converges weakly to the joint distribution of  up(n),Υ(z)uq(n),  1p,ql.\ \langle u^{(n)}_{p},\Upsilon(z)u^{(n)}_{q}\rangle,\ \ 1\leq p,q\leq l. Choosing nn sufficiently large, we can make

arbitrary small uniformly in Nn.N\geq n. Indeed, the variance in (5.7) is bounded by O(\|u_{p}-u^{(n)}_{p}\|^{2}+\|u_{q}-u^{(n)}_{q}\|^{2})\ since the entries of Υ(z)\Upsilon(z) are i.i.d random variables with bounded variance on the diagonal and i.i.d. random variables with bounded variance off the diagonal. In addition,

and we can use the bounds (1.33) and (1.34) in Theorem 1.6 rewritten as

are arbitrary small (uniformly in NN) provided one chooses nn sufficiently large. This finishes the proof. ∎

Theorem 1.7 allows the following extension of Theorem 1.3:

where uN(p)u^{(p)}_{N} denotes the projection of u(p)u^{(p)} onto the subspace spanned by the first NN standard basis vectors e1,,eN.e_{1},\ldots,e_{N}. Let UNU_{N} be the N×rN\times r matrix whose columns are given by the vectors u^{(1)}_{N},\ldots,u^{(r)}_{N}.\ Also denote by Θ\Theta the r×rr\times r diagonal matrix

The result of Theorem 1.3 can be extended for such ANA_{N}, with the matrix VjV_{j} given by

Appendix

The appendix contains several basic formulas used throughout the paper.

and can be immediately verified by integration by parts.

Next, we write a basic resolvent identity. For any two Hermitian matrices X1X_{1} and X2X_{2} and non-real zz we have:

As a corollary of (6.3), one has the following formulas. If XX is a real symmetric matrix with resolvent RR then

In a similar way, if XX is a Hermitian matrix then

Finally, we will use the following properties of the resolvent:

where by Sp(X)Sp(X) we denote the spectrum of a real symmetric (Hermitian) matrix X.X. The bound (6.6) implies

Therefore, all entries of the resolvent matrix are bounded by Im(z)1|\operatorname{\mathfrak{Im}}(z)|^{-1}. In a similar fashion, we have the following bound for the Stieltjes transform, g(z)g(z), of any probability measure:

References