The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices

Florent Benaych-Georges, Raj Rao Nadakuditi

Introduction

Let $X_n$ be an $n\times n$ symmetric (or Hermitian) matrix with eigenvalues $\lambda_1(X_n),\ldots,\lambda_n(X_n)$ and $P_n$ be an $n\times n$ symmetric (or Hermitian) matrix with rank $r\leq n$ and non-zero eigenvalues $\theta_1,\ldots,\theta_r$. A fundamental question in matrix analysis is the following:

How are the eigenvalues and eigenvectors of $X_n+P_n$ related to the eigenvalues and eigenvectors of $X_n$ and $P_n$?

When $X_n$ and $P_n$ are diagonalized by the same eigenvectors, we have $\lambda_i(X_n+P_n)=\lambda_j(X_n)+\lambda_k(P_n)$ for an appropriate choice of indices $i,j,k\in\{1,\ldots,n\}$. In the general setting, however, the answer is complicated by the fact that the eigenvalues and eigenvectors of their sum depend on the relationship between the eigenspaces of the individual matrices.

In this scenario, one can use Weyl’s interlacing inequalities and Horn inequalities to obtain coarse bounds for the eigenvalues of the sum in terms of the eigenvalues of $X_n$. When the norm of $P_n$ is small relative to the norm of $X_n$, tools from perturbation theory (see [22, Chapter 6]) can be employed to improve the characterization of the bounded set in which the eigenvalues of the sum must lie. Exploiting any special structure in the matrices allows us to refine these bounds, but this is about as far as the theory goes: instead of exact answers we must resort to a system of coupled inequalities. Describing the behavior of the eigenvectors of the sum is even more complicated.

Surprisingly, adding some randomness to the eigenspaces permits further analytical progress. Specifically, if the eigenspaces are assumed to be “in generic position with respect to each other”, then in place of eigenvalue bounds we have simple, exact answers that are to be interpreted probabilistically. These results bring into focus a phase transition phenomenon of the kind illustrated in Figure 1 for the eigenvalues and eigenvectors of $X_n+P_n$ and $X_n\times(I_n+P_n)$. A precise statement of the results may be found in Section 2.

The development of the eigenvector aspect is another contribution that we would like to highlight. Generally speaking, the eigenvector question has received less attention in random matrix theory and in free probability theory. A notable exception is the recent body of work on the eigenvectors of spiked Wishart matrices, which corresponds to $\mu_X$ being the Marčenko-Pastur measure. In this paper, we extend those results for multiplicative models of the kind $(I+P_n)^{1/2}X_n(I+P_n)^{1/2}$ to the setting where $\mu_X$ is an arbitrary probability measure and obtain new results for the eigenvectors for additive models of the form $X_n+P_n$.

Our proofs rely on the derivation of master equation representations of the eigenvalues and eigenvectors of the perturbed matrix and the subsequent application of concentration inequalities for random vectors uniformly distributed on high dimensional unit spheres to these implicit master equation representations. Consequently, our technique is simpler, more general and brings into focus the source of the phase transition phenomenon. The underlying methods can be, and have been, adapted to study the extreme singular values and singular vectors of deformations of rectangular random matrices, as well as the fluctuations and the large deviations of our model.

The paper is organized as follows. In Section 2, we state the main results and present the integral transforms alluded to above. Section 3 presents some examples. An outline of the proofs is presented in Section 4. Exact master equation representations of the eigenvalues and the eigenvectors of the perturbed matrices are derived in Section 5 and utilized in Section 6 to prove the main results. Technical results needed in these proofs have been relegated to the Appendix.

Main results

Let $X_n$ be an $n\times n$ symmetric (or Hermitian) random matrix whose ordered eigenvalues we denote by $\lambda_1(X_n)\geq\cdots\geq\lambda_n(X_n)$. Let $\mu_{X_n}$ be the empirical eigenvalue distribution, i.e., the probability measure defined as

$$\mu_{X_n}=\frac{1}{n}\sum_{j=1}^{n}\delta_{\lambda_j(X_n)}.$$

Assume that the probability measure $\mu_{X_n}$ converges almost surely weakly, as $n\longrightarrow\infty$, to a non-random compactly supported probability measure $\mu_X$. Let $a$ and $b$ be, respectively, the infimum and supremum of the support of $\mu_X$. We suppose that the smallest and largest eigenvalues of $X_n$ converge almost surely to $a$ and $b$, respectively.

For a given $r\geq 1$, let $\theta_1\geq\cdots\geq\theta_r$ be deterministic non-zero real numbers, chosen independently of $n$. For every $n$, let $P_n$ be an $n\times n$ symmetric (or Hermitian) random matrix having rank $r$ with its $r$ non-zero eigenvalues equal to $\theta_1,\ldots,\theta_r$. Let the index $s\in\{0,\ldots,r\}$ be defined such that $\theta_1\geq\cdots\geq\theta_s>0>\theta_{s+1}\geq\cdots\geq\theta_r$.

Recall that a symmetric (or Hermitian) random matrix is said to be orthogonally invariant (or unitarily invariant) if its distribution is invariant under the action of the orthogonal (or unitary) group under conjugation.

We suppose that $X_n$ and $P_n$ are independent and that either $X_n$ or $P_n$ is orthogonally (or unitarily) invariant.

2.2. Notation

Throughout this paper, for $f$ a function and $c\in\mathbb{R}$, we set $f(c^+):=\lim_{z\downarrow c}f(z)$ and $f(c^-):=\lim_{z\uparrow c}f(z)$; we also let $\overset{\textrm{a.s.}}{\longrightarrow}$ denote almost sure convergence. The ordered eigenvalues of an $n\times n$ Hermitian matrix $M$ will be denoted by $\lambda_1(M)\geq\cdots\geq\lambda_n(M)$. Lastly, for a subspace $F$ of a Euclidean space $E$ and a vector $x\in E$, we denote the norm of the orthogonal projection of $x$ onto $F$ by $\langle x,F\rangle$.

2.3. Extreme eigenvalues and eigenvectors under additive perturbations

Consider the rank $r$ additive perturbation of the random matrix $X_n$ given by

$$\widetilde{X}_n=X_n+P_n.$$

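Theorem 2.1. The extreme eigenvalues of $\widetilde{X}_n$ exhibit the following behavior as $n\longrightarrow\infty$. We have that for each $1\leq i\leq s$,

$$\lambda_i(\widetilde{X}_n)\overset{\textrm{a.s.}}{\longrightarrow}\begin{cases}G_{\mu_X}^{-1}(1/\theta_i) & \text{if } \theta_i>1/G_{\mu_X}(b^+),\\ b & \text{otherwise,}\end{cases}$$

while for each fixed $i>s$, $\lambda_i(\widetilde{X}_n)\overset{\textrm{a.s.}}{\longrightarrow}b$.

Similarly, for the smallest eigenvalues, we have that for each $0\leq j<r-s$,

$$\lambda_{n-j}(\widetilde{X}_n)\overset{\textrm{a.s.}}{\longrightarrow}\begin{cases}G_{\mu_X}^{-1}(1/\theta_{r-j}) & \text{if } \theta_{r-j}<1/G_{\mu_X}(a^-),\\ a & \text{otherwise,}\end{cases}$$

while for each fixed $j\geq r-s$, $\lambda_{n-j}(\widetilde{X}_n)\overset{\textrm{a.s.}}{\longrightarrow}a$.

Here,

$$G_{\mu_X}(z)=\int\frac{1}{z-t}\,d\mu_X(t)\quad\text{for } z\notin\operatorname{supp}\mu_X$$

is the Cauchy transform of $\mu_X$, $G_{\mu_X}^{-1}(\cdot)$ is its functional inverse, and $1/\pm\infty$ stands for $0$.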

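Theorem 2.2. Consider $i_0\in\{1,\ldots,r\}$ such that $1/\theta_{i_0}\in(G_{\mu_X}(a^-),G_{\mu_X}(b^+))$. For each $n$, define

$$\widetilde{\lambda}_{i_0}:=\begin{cases}\lambda_{i_0}(\widetilde{X}_n) & \text{if } \theta_{i_0}>0,\\ \lambda_{n-r+i_0}(\widetilde{X}_n) & \text{if } \theta_{i_0}<0,\end{cases}$$

and let $\widetilde{u}$ be a unit-norm eigenvector of $\widetilde{X}_n$ associated with the eigenvalue $\widetilde{\lambda}_{i_0}$. Then we have, as $n\longrightarrow\infty$,

$$|\langle\widetilde{u},\ker(\theta_{i_0}I_n-P_n)\rangle|^2\overset{\textrm{a.s.}}{\longrightarrow}\frac{-1}{\theta_{i_0}^2\,G_{\mu_X}^{\prime}(\rho)},$$

where $\rho=G_{\mu_X}^{-1}(1/\theta_{i_0})$ is the limit of $\widetilde{\lambda}_{i_0}$, and

$$\langle\widetilde{u},\oplus_{i\neq i_0}\ker(\theta_iI_n-P_n)\rangle\overset{\textrm{a.s.}}{\longrightarrow}0.$$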

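Theorem 2.3. When $r=1$, let the sole non-zero eigenvalue of $P_n$ be denoted by $\theta$. Suppose that

$$\frac{1}{\theta}\notin(G_{\mu_X}(a^-),G_{\mu_X}(b^+)),\quad\text{and}\quad\begin{cases}G_{\mu_X}^{\prime}(b^+)=-\infty & \text{if } \theta>0,\\ G_{\mu_X}^{\prime}(a^-)=-\infty & \text{if } \theta<0.\end{cases}$$

For each $n$, let $\widetilde{u}$ be a unit-norm eigenvector of $\widetilde{X}_n$ associated with either the largest or smallest eigenvalue depending on whether $\theta>0$ or $\theta<0$, respectively. Then we have

$$\langle\widetilde{u},\ker(\theta I_n-P_n)\rangle\overset{\textrm{a.s.}}{\longrightarrow}0\quad\text{as } n\longrightarrow\infty.$$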

The following proposition allows us to assert that in many classical matrix models, such as Wigner or Wishart matrices, the above phase transitions actually occur with a finite threshold. The proposition is phrased in terms of $b$, the supremum of the support of $\mu_X$, but also applies for $a$, the infimum of the support of $\mu_X$. The proof relies on a straightforward computation which we omit.

Proposition 2.4. Assume that the limiting eigenvalue distribution $\mu_X$ has a density $f_{\mu_X}$ with a power decay at $b$, i.e., that, as $t\to b$ with $t<b$, $f_{\mu_X}(t)\sim c(b-t)^{\alpha}$ for some exponent $\alpha>-1$ and some constant $c$. Then:

$$G_{\mu_X}(b^+)<\infty\iff\alpha>0\quad\text{and}\quad G_{\mu_X}^{\prime}(b^+)=-\infty\iff\alpha\leq 1,$$

so that the phase transitions in Theorems 2.1 and 2.3 manifest for $\alpha=1/2$.

Under additional hypotheses on the manner in which the empirical eigenvalue distribution of $X_n$ converges almost surely to $\mu_X$ as $n\longrightarrow\infty$, Theorem 2.2 can be generalized to any eigenvalue with limit $\rho$ equal either to $a$ or $b$ such that $G_{\mu_X}^{\prime}(\rho)$ is finite. In the same way, Theorem 2.3 can be generalized for any value of $r$. The specific hypothesis has to do with requiring the spacings between the $\lambda_i(X_n)$’s to be more “random matrix like” and exhibit repulsion instead of being “independent sample like” with possible clumping. We plan to develop this line of inquiry in a separate paper.

2.4. Extreme eigenvalues and eigenvectors under multiplicative perturbations

We maintain the same hypotheses as before so that the limiting probability measure $\mu_X$, the index $s$ and the rank $r$ matrix $P_n$ are defined as in Section 2.1. In addition, we assume that for every $n$, $X_n$ is a non-negative definite matrix and that the limiting probability measure $\mu_X$ is not the Dirac mass at zero.

Consider the rank $r$ multiplicative perturbation of the random matrix $X_n$ given by

$$\widetilde{X}_n=X_n\times(I_n+P_n).$$

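Theorem 2.6. The extreme eigenvalues of $\widetilde{X}_n$ exhibit the following behavior as $n\longrightarrow\infty$. We have that for $1\leq i\leq s$,

$$\lambda_i(\widetilde{X}_n)\overset{\textrm{a.s.}}{\longrightarrow}\begin{cases}T_{\mu_X}^{-1}(1/\theta_i) & \text{if } \theta_i>1/T_{\mu_X}(b^+),\\ b & \text{otherwise,}\end{cases}$$

while for each fixed $i>s$, $\lambda_i(\widetilde{X}_n)\overset{\textrm{a.s.}}{\longrightarrow}b$.

In the same way, for the smallest eigenvalues, for each $0\leq j<r-s$,

$$\lambda_{n-j}(\widetilde{X}_n)\overset{\textrm{a.s.}}{\longrightarrow}\begin{cases}T_{\mu_X}^{-1}(1/\theta_{r-j}) & \text{if } \theta_{r-j}<1/T_{\mu_X}(a^-),\\ a & \text{otherwise,}\end{cases}$$

while for each fixed $j\geq r-s$, $\lambda_{n-j}(\widetilde{X}_n)\overset{\textrm{a.s.}}{\longrightarrow}a$.

Here,

$$T_{\mu_X}(z)=\int\frac{t}{z-t}\,d\mu_X(t)\quad\text{for } z\notin\operatorname{supp}\mu_X$$

is the $T$-transform of $\mu_X$, $T_{\mu_X}^{-1}(\cdot)$ is its functional inverse, and $1/\pm\infty$ stands for $0$.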

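Theorem 2.7. Consider $i_0\in\{1,\ldots,r\}$ such that $1/\theta_{i_0}\in(T_{\mu_X}(a^-),T_{\mu_X}(b^+))$. For each $n$, define

$$\widetilde{\lambda}_{i_0}:=\begin{cases}\lambda_{i_0}(\widetilde{X}_n) & \text{if } \theta_{i_0}>0,\\ \lambda_{n-r+i_0}(\widetilde{X}_n) & \text{if } \theta_{i_0}<0,\end{cases}$$

and let $\widetilde{u}$ be a unit-norm eigenvector of $\widetilde{X}_n$ associated with the eigenvalue $\widetilde{\lambda}_{i_0}$. Then we have, as $n\longrightarrow\infty$,

$$|\langle\widetilde{u},\ker(\theta_{i_0}I_n-P_n)\rangle|^2\overset{\textrm{a.s.}}{\longrightarrow}\frac{-1}{\theta_{i_0}^2\,\rho\,T_{\mu_X}^{\prime}(\rho)+\theta_{i_0}},$$

where $\rho=T_{\mu_X}^{-1}(1/\theta_{i_0})$ is the limit of $\widetilde{\lambda}_{i_0}$, and

$$\langle\widetilde{u},\oplus_{i\neq i_0}\ker(\theta_iI_n-P_n)\rangle\overset{\textrm{a.s.}}{\longrightarrow}0.$$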

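Theorem 2.8. When $r=1$, let the sole non-zero eigenvalue of $P_n$ be denoted by $\theta$. Suppose that

$$\frac{1}{\theta}\notin(T_{\mu_X}(a^-),T_{\mu_X}(b^+)),\quad\text{and}\quad\begin{cases}T_{\mu_X}^{\prime}(b^+)=-\infty & \text{if } \theta>0,\\ T_{\mu_X}^{\prime}(a^-)=-\infty & \text{if } \theta<0.\end{cases}$$

For each $n$, let $\widetilde{u}$ be a unit-norm eigenvector of $\widetilde{X}_n$ associated with either the largest or smallest eigenvalue depending on whether $\theta>0$ or $\theta<0$, respectively. Then, we have

$$\langle\widetilde{u},\ker(\theta I_n-P_n)\rangle\overset{\textrm{a.s.}}{\longrightarrow}0\quad\text{as } n\longrightarrow\infty.$$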

Proposition 2.9. Assume that the limiting eigenvalue distribution $\mu_X$ has a density $f_{\mu_X}$ with a power decay at $b$ (or $a$ or both), i.e., that, as $t\to b$ with $t<b$, $f_{\mu_X}(t)\sim c(b-t)^{\alpha}$ for some exponent $\alpha>-1$ and some constant $c$. Then:

$$T_{\mu_X}(b^+)<\infty\iff\alpha>0\quad\text{and}\quad T_{\mu_X}^{\prime}(b^+)=-\infty\iff\alpha\leq 1,$$

so that the phase transitions in Theorems 2.6 and 2.8 manifest for $\alpha=1/2$.

The analogue of Remark 2.5 also applies here.

Consider the matrix $S_n=(I_n+P_n)^{1/2}X_n(I_n+P_n)^{1/2}$. The matrices $S_n$ and $\widetilde{X}_n=X_n(I_n+P_n)$ are related by a similarity transformation and hence share the same eigenvalues and consequently the same limiting eigenvalue behavior in Theorem 2.6. Additionally, if $\widetilde{u}_i$ is a unit-norm eigenvector of $\widetilde{X}_n$ then $\widetilde{w}_i=(I_n+P_n)^{1/2}\widetilde{u}_i$ is an eigenvector of $S_n$ and the unit-norm eigenvector $\widetilde{v}_i=\widetilde{w}_i/\|\widetilde{w}_i\|$ satisfies

It follows that we obtain the same phase transition behavior and that when $1/\theta_i\in(T_{\mu_X}(a^-),T_{\mu_X}(b^+))$,

so that the analogue of Theorems 2.7 and 2.8 for the eigenvectors of $S_n$ holds.

2.5. The Cauchy and $T$ transforms in free probability theory

The Cauchy transform of a compactly supported probability measure $\mu$ on the real line is defined as:

$$G_{\mu}(z)=\int\frac{d\mu(t)}{z-t}\quad\text{for } z\notin\operatorname{supp}\mu.$$

If $[a,b]$ denotes the convex hull of the support of $\mu$, then

$$G_{\mu}(a^-):=\lim_{z\uparrow a}G_{\mu}(z)\quad\text{and}\quad G_{\mu}(b^+):=\lim_{z\downarrow b}G_{\mu}(z)$$

exist in $[-\infty,0)$ and $(0,+\infty]$, respectively, and $G_{\mu}(\cdot)$ realizes decreasing homeomorphisms from $(-\infty,a)$ onto $(G_{\mu}(a^-),0)$ and from $(b,+\infty)$ onto $(0,G_{\mu}(b^+))$. Throughout this paper, we shall denote by $G_{\mu}^{-1}(\cdot)$ the inverses of these homeomorphisms, even though $G_{\mu}$ can also define other homeomorphisms on the holes of the support of $\mu$.

In free probability theory, the $R$-transform, defined as

$$R_{\mu}(z):=G_{\mu}^{-1}(z)-\frac{1}{z},$$

is the analogue of the logarithm of the Fourier transform for free additive convolution. The free additive convolution of probability measures on the real line is denoted by the symbol $\boxplus$ and can be characterized as follows.

Let $A_n$ and $B_n$ be independent $n\times n$ symmetric (or Hermitian) random matrices that are invariant, in law, by conjugation by any orthogonal (or unitary) matrix. Suppose that, as $n\longrightarrow\infty$, $\mu_{A_n}\longrightarrow\mu_A$ and $\mu_{B_n}\longrightarrow\mu_B$. Then, free probability theory states that $\mu_{A_n+B_n}\longrightarrow\mu_A\boxplus\mu_B$, a probability measure which can be characterized in terms of the $R$-transform as

$$R_{\mu_A\boxplus\mu_B}(z)=R_{\mu_A}(z)+R_{\mu_B}(z).$$

The connection between free additive convolution and $G_{\mu}^{-1}$ (via the $R$-transform) and the appearance of $G_{\mu}^{-1}$ in Theorem 2.1 could be of independent interest to free probabilists.

2.5.2. The $T$-transform and its relation to multiplicative free convolution

In the case where $\mu\neq\delta_0$ and the support of $\mu$ is contained in $[0,+\infty)$, one also defines its $T$-transform

$$T_{\mu}(z)=\int\frac{t}{z-t}\,d\mu(t)\quad\text{for } z\notin\operatorname{supp}\mu,$$

which realizes decreasing homeomorphisms from $(-\infty,a)$ onto $(T_{\mu}(a^-),0)$ and from $(b,+\infty)$ onto $(0,T_{\mu}(b^+))$. Throughout this paper, we shall denote by $T_{\mu}^{-1}$ the inverses of these homeomorphisms, even though $T_{\mu}$ can also define other homeomorphisms on the holes of the support of $\mu$.

In free probability theory, the $S$-transform, defined as

$$S_{\mu}(z):=\frac{1+z}{z\,T_{\mu}^{-1}(z)},$$

is the analogue of the Fourier transform for free multiplicative convolution. The free multiplicative convolution of two probability measures $\mu_A$ and $\mu_B$ is denoted by the symbol $\boxtimes$ and can be characterized as follows.

Let $A_n$ and $B_n$ be independent $n\times n$ symmetric (or Hermitian) positive-definite random matrices that are invariant, in law, by conjugation by any orthogonal (or unitary) matrix. Suppose that, as $n\longrightarrow\infty$, $\mu_{A_n}\longrightarrow\mu_A$ and $\mu_{B_n}\longrightarrow\mu_B$. Then, free probability theory states that $\mu_{A_n\cdot B_n}\longrightarrow\mu_A\boxtimes\mu_B$, a probability measure which can be characterized in terms of the $S$-transform as

$$S_{\mu_A\boxtimes\mu_B}(z)=S_{\mu_A}(z)\,S_{\mu_B}(z).$$

The connection between free multiplicative convolution and $T_{\mu}^{-1}$ (via the $S$-transform) and the appearance of $T_{\mu}^{-1}$ in Theorem 2.6 could be of independent interest to free probabilists.

2.6. Extensions

Theorem 2.1 can easily be adapted to describe the phase transition in the eigenvalues of $X_n+P_n$ which fall in the “holes” of the support of $\mu_X$. Consider $c<d$ such that almost surely, for $n$ large enough, $X_n$ has no eigenvalue in the interval $(c,d)$. This implies that $G_{\mu_X}$ induces a decreasing homeomorphism, which we shall denote by $G_{\mu_X,(c,d)}$, from the interval $(c,d)$ onto the interval $(G_{\mu_X}(d^-),G_{\mu_X}(c^+))$. Then it can be proved that almost surely, for $n$ large enough, $X_n+P_n$ has no eigenvalue in the interval $(c,d)$, except if some of the $1/\theta_i$’s are in the interval $(G_{\mu_X}(d^-),G_{\mu_X}(c^+))$, in which case for each such index $i$, one eigenvalue of $X_n+P_n$ has limit $G_{\mu_X,(c,d)}^{-1}(1/\theta_i)$ as $n\longrightarrow\infty$.

Theorem 2.1 can also easily be adapted to the case where $X_n$ itself has isolated eigenvalues in the sense that some of its eigenvalues have limits outside the support of $\mu_X$. More formally, let us replace the assumption that the smallest and largest eigenvalues of $X_n$ tend to the infimum $a$ and the supremum $b$ of the support of $\mu_X$ by the following one.

Moreover, $\lambda_{1+p^+}(X_n)\overset{\textrm{a.s.}}{\longrightarrow}b$ and $\lambda_{n-(1+p^-)}(X_n)\overset{\textrm{a.s.}}{\longrightarrow}a$.

The previous remark forms the basis for an iterative application of our theorems to other perturbational models, such as $\widetilde{X}=\sqrt{X}(I+P)\sqrt{X}+Q$ for example. Another way to deal with such perturbations is to first derive the corresponding master equation representations that describe how the eigenvalues and eigenvectors of $\widetilde{X}$ are related to the eigenvalues and eigenvectors of $X$ and the perturbing matrices, along the lines of Proposition 5.1 for additive or multiplicative perturbations of Hermitian matrices.

Let $G$ be an $n\times m$ Gaussian random matrix with independent real (or complex) entries that are normally distributed with mean zero and variance $1$. Then the matrix $X=GG^*/m$ is orthogonally (or unitarily) invariant. Hence one can choose an orthonormal basis $(U_1,\ldots,U_n)$ of eigenvectors of $X$ such that the matrix $U$ with columns $U_1,\ldots,U_n$ is Haar-distributed. When $G$ is a Gaussian-like matrix, in the sense that its entries are i.i.d. with mean zero and variance one, then upon placing adequate restrictions on the higher order moments, for a non-random unit norm vector $x_n$, the vector $U^*x_n$ will be close to uniformly distributed on the unit real (or complex) sphere. Since our proofs rely heavily on the properties of unit norm vectors uniformly distributed on the $n$-sphere, they could possibly be adapted to the setting where the unit norm vectors are close to uniformly distributed.

Suppose that $P_n$ is a random matrix independent of $X_n$, with exactly $r$ non-zero eigenvalues given by $\theta_1^{(n)},\ldots,\theta_r^{(n)}$. Let $\theta_i^{(n)}\overset{\textrm{a.s.}}{\longrightarrow}\theta_i$ as $n\longrightarrow\infty$. Using [22, Cor. 6.3.8] as in Section 6.2.3, one can easily see that our results will also apply in this case.

The analogues of Remarks 2.11, 2.12, 2.14 and 2.15 for the multiplicative setting also hold here. In particular, Wishart matrices with $c>1$ (cf. Section 3.2) give an illustration of the case where there is a hole in the support of $\mu_X$.

Examples

We now illustrate our results with some concrete computations. The key to applying our results lies in being able to compute the Cauchy or $T$ transforms of the probability measure $\mu_X$ and their associated functional inverses. In what follows, we focus on settings where the transforms and their inverses can be expressed in closed form. In settings where the transforms are algebraic, so that they can be represented as solutions of polynomial equations, previously developed techniques and software can be utilized. In more complicated settings, one will have to resort to numerical techniques.

Let $X_n$ be an $n\times n$ symmetric (or Hermitian) matrix with independent, zero mean, normally distributed entries with variance $\sigma^2/n$ on the diagonal and $\sigma^2/(2n)$ on the off diagonal. It is known that the spectral measure of $X_n$ converges almost surely to the famous semi-circle distribution with density

$$d\mu_X(x)=\frac{\sqrt{4\sigma^2-x^2}}{2\sigma^2\pi}\,dx\quad\text{for } x\in[-2\sigma,2\sigma].$$

It is known that the extreme eigenvalues converge almost surely to the endpoints of the support. Associated with the spectral measure, we have

$$G_{\mu_X}(z)=\frac{z-\operatorname{sgn}(z)\sqrt{z^2-4\sigma^2}}{2\sigma^2}\quad\text{for } z\in(-\infty,-2\sigma)\cup(2\sigma,+\infty),$$

$G_{\mu_X}(\pm 2\sigma)=\pm 1/\sigma$ and $G_{\mu_X}^{-1}(1/\theta)=\theta+\frac{\sigma^2}{\theta}$.
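As a numerical sanity check on these transforms, the following minimal sketch evaluates the Cauchy transform of the semi-circle law by quadrature and compares it with the closed form $G_{\mu_X}(z)=\bigl(z-\operatorname{sgn}(z)\sqrt{z^2-4\sigma^2}\bigr)/(2\sigma^2)$ for $|z|>2\sigma$. The function names and the substitution $t=2\sigma\cos\varphi$ are illustrative choices, not part of the paper.

```python
import numpy as np

def cauchy_semicircle_numeric(z, sigma=1.0, num_points=20000):
    """Midpoint-rule approximation of G(z) = int dmu(t) / (z - t) for the semi-circle law.

    With t = 2*sigma*cos(phi), the semi-circle law becomes (2/pi)*sin(phi)^2 dphi on [0, pi].
    """
    phi = (np.arange(num_points) + 0.5) * np.pi / num_points
    integrand = (2.0 / np.pi) * np.sin(phi) ** 2 / (z - 2.0 * sigma * np.cos(phi))
    return float(integrand.sum() * (np.pi / num_points))

def cauchy_semicircle_closed(z, sigma=1.0):
    """Closed form of G(z), valid for real z with |z| > 2*sigma."""
    return float((z - np.sign(z) * np.sqrt(z * z - 4.0 * sigma**2)) / (2.0 * sigma**2))

def cauchy_inverse(theta, sigma=1.0):
    """G^{-1}(1/theta) = theta + sigma^2/theta, valid for |theta| > sigma."""
    return theta + sigma**2 / theta
```

For instance, with $\sigma=1$ and $\theta=2$, `cauchy_inverse(2.0)` returns $2.5$, and both evaluations of $G$ at $2.5$ should agree with $1/\theta$ up to quadrature error.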

Thus for a $P_n$ with $r$ non-zero eigenvalues $\theta_1\geq\cdots\geq\theta_s>0>\theta_{s+1}\geq\cdots\geq\theta_r$, by Theorem 2.1, we have for $1\leq i\leq s$,

$$\lambda_i(X_n+P_n)\overset{\textrm{a.s.}}{\longrightarrow}\begin{cases}\theta_i+\dfrac{\sigma^2}{\theta_i} & \text{if } \theta_i>\sigma,\\ 2\sigma & \text{otherwise,}\end{cases}$$

as $n\longrightarrow\infty$. This result has already been established in the literature for the symmetric case and for the Hermitian case. Remark 2.14 explains why our results should hold for the more general Wigner matrices considered there.
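The eigenvalue limit can be checked by simulation. The sketch below is only an illustration under stated assumptions: it uses a real symmetric Gaussian matrix normalized so that its spectrum fills $[-2\sigma,2\sigma]$ (off-diagonal variance $\sigma^2/n$, diagonal variance $2\sigma^2/n$, a normalization chosen here for the simulation), a rank-one perturbation $\theta uu^*$ with $u$ uniform on the sphere, and hypothetical function names.

```python
import numpy as np

def gaussian_wigner(n, sigma, rng):
    """Symmetric Gaussian matrix whose spectrum fills [-2*sigma, 2*sigma] as n grows.

    Assumed normalization: off-diagonal variance sigma^2/n, diagonal variance 2*sigma^2/n.
    """
    a = rng.normal(scale=sigma / np.sqrt(n), size=(n, n))
    return (a + a.T) / np.sqrt(2.0)

def simulated_top_eigenvalue(n, theta, sigma=1.0, seed=0):
    """Largest eigenvalue of X_n + theta * u u^T for a uniformly random unit vector u."""
    rng = np.random.default_rng(seed)
    x = gaussian_wigner(n, sigma, rng)
    u = rng.normal(size=n)
    u /= np.linalg.norm(u)
    return float(np.linalg.eigvalsh(x + theta * np.outer(u, u))[-1])

def predicted_top_eigenvalue(theta, sigma=1.0):
    """Limit stated above: theta + sigma^2/theta above the threshold, 2*sigma otherwise."""
    return theta + sigma**2 / theta if theta > sigma else 2.0 * sigma
```

For moderate $n$ the simulated value fluctuates around the prediction, so any comparison should allow a tolerance that shrinks as $n$ grows.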

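In the setting where $r=1$ and $P=\theta\,uu^*$, let $\widetilde{u}$ be a unit-norm eigenvector of $X_n+P_n$ associated with its largest eigenvalue. By Theorems 2.2 and 2.3, we have

$$|\langle\widetilde{u},u\rangle|^2\overset{\textrm{a.s.}}{\longrightarrow}\begin{cases}1-\dfrac{\sigma^2}{\theta^2} & \text{if } \theta\geq\sigma,\\ 0 & \text{if } \theta<\sigma.\end{cases}$$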

3.2. Multiplicative perturbation of a random Wishart matrix

Let $G_n$ be an $n\times m$ real (or complex) matrix with independent, zero mean, normally distributed entries with variance $1$. Let $X_n=G_nG_n^*/m$. It is known that, as $n,m\longrightarrow\infty$ with $n/m\to c>0$, the spectral measure of $X_n$ converges almost surely to the famous Marčenko-Pastur distribution with density

$$d\mu_X(x)=\frac{1}{2\pi cx}\sqrt{(b-x)(x-a)}\,\mathbb{1}_{[a,b]}(x)\,dx+\max\left(0,1-\frac{1}{c}\right)\delta_0,$$

where $a=(1-\sqrt{c})^2$ and $b=(1+\sqrt{c})^2$. It is known that the extreme eigenvalues converge almost surely to the endpoints of this support.

Associated with this spectral measure, we have

$$T_{\mu_X}(z)=\frac{z-c-1-\operatorname{sgn}(z-a)\sqrt{(z-a)(z-b)}}{2c}\quad\text{for } z\in\mathbb{R}\setminus[a,b],$$

$T_{\mu_X}(b^+)=1/\sqrt{c}$, $T_{\mu_X}(a^-)=-1/\sqrt{c}$ and

$$T_{\mu_X}^{-1}(z)=\frac{(z+1)(cz+1)}{z}.$$

When $c>1$, there is an atom at zero so that the smallest eigenvalue of $X_n$ is identically zero. For simplicity, let us consider the setting when $c<1$ so that the extreme eigenvalues of $X_n$ converge almost surely to $a$ and $b$. Thus for $P_n$ with $r$ non-zero eigenvalues $\theta_1\geq\cdots\geq\theta_s>0>\theta_{s+1}\geq\cdots\geq\theta_r$, with $l_i:=\theta_i+1$, for $c<1$, by Theorem 2.6, we have for $1\leq i\leq s$,

$$\lambda_i(X_n(I_n+P_n))\overset{\textrm{a.s.}}{\longrightarrow}\begin{cases}l_i\left(1+\dfrac{c}{l_i-1}\right) & \text{if } l_i-1>\sqrt{c},\\ b & \text{otherwise,}\end{cases}$$

as $n\longrightarrow\infty$. An analogous result for the smallest eigenvalue may be similarly derived by making the appropriate substitution for $a$ in Theorem 2.6. Consider the matrix $S_n=(I_n+P_n)^{1/2}X_n(I_n+P_n)^{1/2}$. The matrix $S_n$ may be interpreted as a Wishart distributed sample covariance matrix with “spiked” covariance $I_n+P_n$. By Remark 2.10, the above result applies for the eigenvalues of $S_n$ as well. This result for the largest eigenvalue of spiked sample covariance matrices, and its counterpart for the extreme eigenvalues, were established in earlier work.
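The spiked covariance interpretation lends itself to a quick simulation. The following sketch, with hypothetical function names and a real Gaussian $G_n$, builds $S_n$ for a rank-one $P_n=\theta uu^*$ and compares its largest eigenvalue with the value $l\,(1+c/(l-1))$, $l=\theta+1$, that Theorem 2.6 gives above the threshold $\theta>\sqrt{c}$. It relies on the identity $(I+\theta uu^*)^{1/2}=I+(\sqrt{1+\theta}-1)uu^*$ for a unit vector $u$.

```python
import numpy as np

def spiked_wishart_top_eigenvalue(n, m, theta, seed=0):
    """Largest eigenvalue of S_n = (I+P)^{1/2} X (I+P)^{1/2} with X = G G^T / m, P = theta u u^T."""
    rng = np.random.default_rng(seed)
    g = rng.normal(size=(n, m))
    x = g @ g.T / m
    u = rng.normal(size=n)
    u /= np.linalg.norm(u)
    # Square root of I + theta*u*u^T for a unit vector u.
    root = np.eye(n) + (np.sqrt(1.0 + theta) - 1.0) * np.outer(u, u)
    return float(np.linalg.eigvalsh(root @ x @ root)[-1])

def predicted_wishart_top(theta, c):
    """Limit of the largest eigenvalue for theta > 0: l*(1 + c/(l-1)) above sqrt(c), else b."""
    ell = 1.0 + theta
    if theta > np.sqrt(c):
        return ell * (1.0 + c / (ell - 1.0))
    return (1.0 + np.sqrt(c)) ** 2
```

With $c=n/m=0.25$ and $\theta=2$, the predicted limit is $3.375$; the simulated eigenvalue for a few hundred rows should land close to it, up to finite-size fluctuations.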

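In the setting where $r=1$ and $P=\theta\,uu^*$, let $l=\theta+1$ and let $\widetilde{u}$ be a unit-norm eigenvector of $X_n(I+P_n)$ associated with its largest (or smallest, depending on whether $l>1$ or $l<1$) eigenvalue. By Theorem 2.8, we have

$$|\langle\widetilde{u},u\rangle|^2\overset{\textrm{a.s.}}{\longrightarrow}\begin{cases}\dfrac{(l-1)^2-c}{(l-1)\left[c(l+1)+l-1\right]} & \text{if } |l-1|\geq\sqrt{c},\\ 0 & \text{if } |l-1|<\sqrt{c}.\end{cases}$$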

Let $\widetilde{v}$ be a unit eigenvector of $S_n=(I_n+P_n)^{1/2}X_n(I_n+P_n)^{1/2}$ associated with its largest (or smallest, depending on whether $l>1$ or $l<1$) eigenvalue. Then, by Theorem 2.8 and Remark 2.10, we have

$$|\langle\widetilde{v},u\rangle|^2\overset{\textrm{a.s.}}{\longrightarrow}\begin{cases}\dfrac{1-\frac{c}{(l-1)^2}}{1+\frac{c}{l-1}} & \text{if } |l-1|\geq\sqrt{c},\\ 0 & \text{if } |l-1|<\sqrt{c}.\end{cases}$$

This result was previously established for the eigenvector associated with the largest eigenvalue. We generalize it to the eigenvector associated with the smallest one.

We note that symmetry considerations imply that when $X$ is a Wigner matrix then $-X$ is a Wigner matrix as well. Thus an analytical characterization of the largest eigenvalue of a Wigner matrix directly yields a characterization of the smallest eigenvalue as well. This trick cannot be applied for Wishart matrices since Wishart matrices do not exhibit the symmetries of Wigner matrices. Consequently, the smallest and largest eigenvalues and their associated eigenvectors of Wishart matrices have to be treated separately. Our results facilitate such a characterization.

Outline of the proofs

We now provide an outline of the proofs. We focus on Theorems 2.1, 2.2 and 2.3, which describe the phase transition in the extreme eigenvalues and associated eigenvectors of $X+P$ (the index $n$ in $X_n$ and $P_n$ has been suppressed for brevity). An analogous argument applies for the multiplicative perturbation setting.

Consider the setting where $r=1$, so that $P=\theta\,uu^*$, with $u$ being a unit norm column vector. Since either $X$ or $P$ is assumed to be invariant, in law, under orthogonal (or unitary) conjugation, one can, without loss of generality, suppose that $X=\operatorname{diag}(\lambda_1,\ldots,\lambda_n)$ and that $u$ is uniformly distributed on the unit $n$-sphere.

The eigenvalues of $X+P$ are the solutions of the equation

$$\det(zI-(X+P))=0.$$

Equivalently, for $z$ such that $zI-X$ is invertible, we have

$$zI-(X+P)=(zI-X)\cdot\left(I-(zI-X)^{-1}P\right).$$

Consequently, a simple argument reveals that $z$ is an eigenvalue of $X+P$ and not an eigenvalue of $X$ if and only if $1$ is an eigenvalue of the matrix $(zI-X)^{-1}P$. But $(zI-X)^{-1}P=(zI-X)^{-1}\theta\,uu^*$ has rank one, so its only non-zero eigenvalue will equal its trace, which in turn is equal to $\theta G_{\mu_n}(z)$, where $\mu_n$ is a “weighted” spectral measure of $X$, defined by

$$\mu_n=\sum_{k=1}^{n}|u_k|^2\,\delta_{\lambda_k}\quad(\text{the } u_k\text{’s are the coordinates of } u).$$

Thus any $z$ outside the spectrum of $X$ is an eigenvalue of $X+P$ if and only if

$$\sum_{k=1}^{n}\frac{|u_k|^2}{z-\lambda_k}=:G_{\mu_n}(z)=\frac{1}{\theta}.\tag{1}$$

Equation (1) describes the relationship between the eigenvalues of $X+P$ and the eigenvalues of $X$ and the dependence on the coordinates of the vector $u$ (via the measure $\mu_n$).
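The rank-one master equation can be verified numerically for any fixed diagonal $X$ and unit vector $u$. The sketch below is an illustration under stated assumptions, with hypothetical function names: it takes $\theta>0$ and a real $u$ with non-zero first coordinates, uses the fact that $z\mapsto\theta\sum_k|u_k|^2/(z-\lambda_k)-1$ is decreasing to the right of $\max_k\lambda_k$, and locates the largest eigenvalue by bisection on $(\max_k\lambda_k,\max_k\lambda_k+\theta]$.

```python
import numpy as np

def secular(z, lam, u, theta):
    """theta * G_{mu_n}(z) - 1, where mu_n = sum_k |u_k|^2 delta_{lam_k}."""
    return theta * np.sum(np.abs(u) ** 2 / (z - lam)) - 1.0

def top_eigenvalue_by_bisection(lam, u, theta, iterations=100):
    """Largest root of the secular equation for theta > 0.

    The root lies in (max(lam), max(lam) + theta], where secular() is decreasing.
    """
    lo, hi = float(np.max(lam)), float(np.max(lam)) + theta
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if secular(mid, lam, u, theta) > 0.0:
            lo = mid  # still to the left of the root
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

The root found this way can be compared with the largest eigenvalue returned by a dense eigensolver applied to $\operatorname{diag}(\lambda_1,\ldots,\lambda_n)+\theta uu^*$.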

This is where randomization simplifies analysis. Since $u$ is a random vector with uniform distribution on the unit $n$-sphere, we have that for large $n$, $|u_k|^2\approx\frac{1}{n}$ with high probability. Consequently, we have $\mu_n\approx\mu_X$ so that $G_{\mu_n}(z)\approx G_{\mu_X}(z)$. Inverting equation (1) after substituting these approximations yields the location of the largest eigenvalue to be $G_{\mu_X}^{-1}(1/\theta)$ as in Theorem 2.1.

The phase transition for the extreme eigenvalues emerges because under our assumption that the limiting probability measure $\mu_X$ is compactly supported on $[a,b]$, the Cauchy transform $G_{\mu_X}$ is defined outside $[a,b]$ and, unlike what happens for $G_{\mu_n}$, we do not always have $G_{\mu_X}(b^+)=+\infty$. Consequently, when $1/\theta<G_{\mu_X}(b^+)$, we have that $\lambda_1(\widetilde{X})\approx G_{\mu_X}^{-1}(1/\theta)$ as before. However, when $1/\theta\geq G_{\mu_X}(b^+)$, the phase transition manifests and $\lambda_1(\widetilde{X})\approx\lambda_1(X)=b$.

An extension of these arguments for fixed $r>1$ yields the general result and constitutes the most transparent justification, as sought in earlier work, for the emergence of this phase transition phenomenon in such perturbed random matrix models. We rely on concentration inequalities to make the arguments rigorous.

4.2. Eigenvectors phase transition

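Let $\widetilde{u}$ be a unit eigenvector of $X+P$ associated with the eigenvalue $z$ that satisfies (1). From the relationship $(X+P)\widetilde{u}=z\widetilde{u}$, we deduce that, for $P=\theta\,uu^*$,

$$(zI-X)\widetilde{u}=P\widetilde{u}=\theta\,uu^*\widetilde{u}=(\theta\,u^*\widetilde{u})\,u,$$

implying that $\widetilde{u}$ is proportional to $(zI-X)^{-1}u$. Since $\widetilde{u}$ has unit norm and $G_{\mu_n}(z)=1/\theta$ by (1),

$$|\langle\widetilde{u},\ker(\theta I-P)\rangle|^2=|u^*\widetilde{u}|^2=\frac{\left(u^*(zI-X)^{-1}u\right)^2}{u^*(zI-X)^{-2}u}=\frac{1}{\theta^2\int\frac{d\mu_n(t)}{(z-t)^2}}.\tag{2}$$

Equation (2) describes the relationship between the eigenvectors of $X+P$ and the eigenvalues of $X$ and the dependence on the coordinates of the vector $u$ (via the measure $\mu_n$).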
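Both facts used here, that $\widetilde{u}$ is proportional to $(zI-X)^{-1}u$ and that normalizing it gives $|\langle\widetilde{u},u\rangle|^2=1/\bigl(\theta^2\sum_k|u_k|^2/(z-\lambda_k)^2\bigr)$ once $\theta\sum_k|u_k|^2/(z-\lambda_k)=1$, are exact finite-$n$ identities and can be checked directly. The sketch below assumes a real unit vector $u$ and a diagonal $X$; the function name is hypothetical.

```python
import numpy as np

def top_eigenvector_check(lam, u, theta):
    """Compare the top eigenvector of diag(lam) + theta*u*u^T with (zI - X)^{-1} u.

    Returns (alignment, overlap, overlap_formula):
      alignment        = |cos angle| between the eigenvector and (zI - X)^{-1} u,
      overlap          = <u_tilde, u>^2 computed from the eigenvector,
      overlap_formula  = 1 / (theta^2 * sum_k u_k^2 / (z - lam_k)^2).
    """
    eigvals, eigvecs = np.linalg.eigh(np.diag(lam) + theta * np.outer(u, u))
    z, u_tilde = eigvals[-1], eigvecs[:, -1]
    resolvent_u = u / (z - lam)  # (zI - X)^{-1} u for diagonal X
    direction = resolvent_u / np.linalg.norm(resolvent_u)
    alignment = float(abs(np.dot(direction, u_tilde)))
    overlap = float(np.dot(u_tilde, u) ** 2)
    overlap_formula = float(1.0 / (theta**2 * np.sum(u**2 / (z - lam) ** 2)))
    return alignment, overlap, overlap_formula
```

An alignment equal to one (up to rounding) confirms proportionality, and the two overlap values should coincide.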

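Here too, randomization simplifies analysis since for large $n$, we have $\mu_n\approx\mu_X$ and $z\approx\rho$. Consequently,

$$\int\frac{d\mu_n(t)}{(z-t)^2}\approx\int\frac{d\mu_X(t)}{(\rho-t)^2}=-G_{\mu_X}^{\prime}(\rho),$$

so that when $1/\theta<G_{\mu_X}(b^+)$, which implies that $\rho>b$, we have

$$|\langle\widetilde{u},\ker(\theta I-P)\rangle|^2\overset{\textrm{a.s.}}{\longrightarrow}\frac{1}{\theta^2\int\frac{d\mu_X(t)}{(\rho-t)^2}}=\frac{-1}{\theta^2G_{\mu_X}^{\prime}(\rho)}>0,$$

whereas when $1/\theta\geq G_{\mu_X}(b^+)$ and $G_{\mu_X}$ has infinite derivative at $\rho=b$, we have

$$\langle\widetilde{u},\ker(\theta I-P)\rangle\overset{\textrm{a.s.}}{\longrightarrow}0.$$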

An extension of these arguments for fixed $r>1$ yields the general result and brings into focus the connection between the eigenvalue phase transition and the associated eigenvector phase transition. As before, concentration inequalities allow us to make these arguments rigorous.

The exact master equations for the perturbed eigenvalues and eigenvectors

In this section, we provide the $r$-dimensional analogues of the master equations (1) and (2) employed in our outline of the proof.

Let us fix some positive integers $1\leq r\leq n$. Let $X_n=\operatorname{diag}(\lambda_1,\ldots,\lambda_n)$ be a diagonal $n\times n$ matrix and $P_n=U_{n,r}\Theta U_{n,r}^*$, with $\Theta=\operatorname{diag}(\theta_1,\ldots,\theta_r)$ an $r\times r$ diagonal matrix and $U_{n,r}$ an $n\times r$ matrix with orthonormal columns (i.e., $U_{n,r}^*U_{n,r}=I_r$).

a) Then any $z\notin\{\lambda_1,\ldots,\lambda_n\}$ is an eigenvalue of $\widetilde{X}_n:=X_n+P_n$ if and only if the $r\times r$ matrix

$$M_n(z):=I_r-U_{n,r}^*(zI_n-X_n)^{-1}U_{n,r}\Theta\tag{4}$$

is singular. In this case,

$$\dim\ker(zI_n-\widetilde{X}_n)=\dim\ker M_n(z)$$

and for all $x\in\ker(zI_n-\widetilde{X}_n)$, we have $U_{n,r}^*x\in\ker M_n(z)$ and

$$x=(zI_n-X_n)^{-1}U_{n,r}\Theta U_{n,r}^*x.\tag{6}$$

b) Let $u^{(n)}_{k,l}$ denote the $(k,l)$-th element of the $n\times r$ matrix $U_{n,r}$ for $k=1,\ldots,n$ and $l=1,\ldots,r$. Then for all $i,j=1,\ldots,r$, the $(i,j)$-th entry of the matrix $I_r-U_{n,r}^*(zI_n-X_n)^{-1}U_{n,r}\Theta$ can be expressed as

$$\mathbb{1}_{i=j}-\theta_j\,G_{\mu_{i,j}^{(n)}}(z),\tag{7}$$

where $\mu_{i,j}^{(n)}$ is the complex measure defined by

$$\mu_{i,j}^{(n)}=\sum_{k=1}^{n}\overline{u^{(n)}_{k,i}}\,u^{(n)}_{k,j}\,\delta_{\lambda_k}$$

and $G_{\mu_{i,j}^{(n)}}$ is the Cauchy transform of $\mu_{i,j}^{(n)}$.

c) In the setting where $\widetilde{X}=X_n\times(I_n+P_n)$ and $P_n=U_{n,r}\Theta U_{n,r}^*$ as before, we obtain the analog of a) by replacing every occurrence, in (4) and (6), of $(zI_n-X_n)^{-1}$ with $(zI_n-X_n)^{-1}X_n$. We obtain the analog of b) by replacing the Cauchy transform in (7) with the $T$-transform.

Proof. Part a) is proved, for example, in [2, Th. 2.3]. Part b) follows from a straightforward computation of the $(i,j)$-th entry of $U_{n,r}^*(zI_n-X_n)^{-1}U_{n,r}\Theta$. Part c) can be proved in the same way. $\square$

Proof of Theorem 2.1

The sequence of steps described below yields the desired proof:

Then, we utilize the “master equations” of Section 5 to express the extreme eigenvalues of $\widetilde{X}_n$ as the $z$’s such that a certain random $r\times r$ matrix $M_n(z)$ is singular.

We then exploit convergence properties of certain analytical functions (derived in the appendix) to prove that almost surely, $M_n(z)$ converges to a certain diagonal matrix $M_{G_{\mu_X}}(z)$, uniformly in $z$.

We then invoke a continuity lemma (see Lemma 6.1, derived next) to claim that almost surely, the $z$’s such that $M_n(z)$ is singular (i.e. the extreme eigenvalues of $\widetilde{X}_n$) converge to the $z$’s such that $M_{G_{\mu_X}}(z)$ is singular.

We conclude the proof by noting that, for our setting, the $z$’s such that $M_{G_{\mu_X}}(z)$ is singular are precisely the $z$’s such that for some $i\in\{1,\ldots,r\}$, $G_{\mu_X}(z)=\frac{1}{\theta_i}$. Part (ii) of Lemma 6.1, about the rank of $M_n(z)$, will be useful to assert that when the $\theta_i$’s are pairwise distinct, the multiplicities of the isolated eigenvalues are all equal to one.

We now prove a continuity lemma that will be used in the proof of Theorem 2.1. We note that nothing in its hypotheses is random. As hinted earlier, we will invoke it to localize the extreme eigenvalues of $\widetilde{X}_n$.

$G(z)\longrightarrow 0$ as $|z|\longrightarrow\infty$.

and denote by $z_1>\cdots>z_p$ the $z$’s such that $M_G(z)$ is singular, where $p\in\{0,\ldots,r\}$ is equal to the number of $i$’s such that $G(a^-)<1/\theta_i<G(b^+)$.

for $n$ large enough, for each $i$, $M_n(z_{n,i})$ has rank $r-1$.

Proof. Note firstly that the $z$’s such that $M_G(z)$ is singular are the $z$’s such that for a certain $j\in\{1,\ldots,r\}$,

$$1-\theta_jG(z)=0.\tag{9}$$

Since the $\theta_j$’s are pairwise distinct, for any $z$, there cannot exist more than one $j\in\{1,\ldots,r\}$ such that (9) holds. As a consequence, for all $z$, the rank of $M_G(z)$ is either $r$ or $r-1$. Since the set of matrices with rank at least $r-1$ is open in the set of $r\times r$ matrices, once (i) is proved, (ii) will follow.

Let us now prove (i). Note first that by c), there exists $R>\max\{|a|,|b|\}$ such that for all $z$ with $|z|\geq R$, $|G(z)|\leq\min_{i}\frac{1}{2|\theta_{i}|}$. For any such $z$, $|\det M_{G}(z)|>2^{-r}$. By e), it follows that for $n$ large enough, $M_{n}(z)$ is invertible whenever $|z|\geq R$, so the $z$'s such that $M_{n}(z)$ is singular satisfy $|z|<R$. By d), it even follows that the $z$'s such that $M_{n}(z)$ is singular satisfy $z\in[-R,R]$.

(H) the number of $z$'s in $(c,d)$ such that $\det M_{n}(z)=0$, denoted by $\operatorname{Card}_{c,d}(n)$, tends to $\operatorname{Card}_{c,d}$, the cardinality of the $i$'s in $\{1,\ldots,p\}$ such that $c<z_{i}<d$.

To prove (H), by additivity, one can suppose that $c$ and $d$ are close enough to have $\operatorname{Card}_{c,d}=0$ or $1$. Let us define $\gamma$ to be the circle with diameter $[c,d]$. By a) and since $c,d\notin\{z_{1},\ldots,z_{p}\}$, $\det M_{G}(\cdot)$ does not vanish on $\gamma$, thus

$$\operatorname{Card}_{c,d}=\frac{1}{2i\pi}\oint_{\gamma}\frac{\partial_{z}\det M_{G}(z)}{\det M_{G}(z)}\,\mathrm{d}z=\lim_{n\to\infty}\frac{1}{2i\pi}\oint_{\gamma}\frac{\partial_{z}\det M_{n}(z)}{\det M_{n}(z)}\,\mathrm{d}z,$$

the last equality following from e). It follows that for $n$ large enough, $\operatorname{Card}_{c,d}(n)=\operatorname{Card}_{c,d}$ (note that since $\operatorname{Card}_{c,d}=0$ or $1$, no ambiguity due to the orders of the zeros has to be taken into account here). $\square$
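The counting step of this proof rests on the argument principle: the number of zeros of $\det M_{G}$ inside $\gamma$ equals $\frac{1}{2i\pi}\oint_{\gamma}f'/f$ with $f=\det M_{G}$. The sketch below evaluates this integral numerically under two assumptions that are ours and not the lemma's: $G$ is the Cauchy transform of the semicircle law on $[-2,2]$, and $M_{G}(z)=\operatorname{diag}(1-\theta_{1}G(z),\ldots,1-\theta_{r}G(z))$. The function names are hypothetical.

```python
import numpy as np

def G(z):
    """Cauchy transform of the semicircle law on [-2, 2], analytic off the support."""
    z = np.asarray(z, dtype=complex)
    return (z - np.sqrt(z - 2) * np.sqrt(z + 2)) / 2

def dG(z):
    """Derivative of G with respect to z."""
    z = np.asarray(z, dtype=complex)
    return (1 - z / (np.sqrt(z - 2) * np.sqrt(z + 2))) / 2

def count_zeros(thetas, c, d, num_points=4000):
    """Zeros of det M_G(z) = prod_i (1 - theta_i G(z)) inside the circle of diameter [c, d]."""
    center, radius = (c + d) / 2, (d - c) / 2
    phi = 2 * np.pi * np.arange(num_points) / num_points
    w = radius * np.exp(1j * phi)          # z - center on the circle
    z = center + w
    # Logarithmic derivative f'/f of f = det M_G, summed over the diagonal factors.
    log_derivative = sum(-t * dG(z) / (1 - t * G(z)) for t in thetas)
    # (1 / 2 i pi) * contour integral, trapezoid rule with dz = i * w * dphi.
    integral = np.mean(log_derivative * w)
    return int(round(integral.real))

thetas = (2.0, 3.0)   # zeros of det M_G at 2 + 1/2 and 3 + 1/3
print(count_zeros(thetas, 2.2, 2.8), count_zeros(thetas, 2.2, 4.0), count_zeros(thetas, 2.6, 3.0))
```

The trapezoid rule is used because it converges very quickly for periodic analytic integrands, so the rounded value is a reliable integer as long as no zero lies on the circle.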

2. Proof of Theorem 2.1

Note that Weyl's interlacing inequalities imply that for all $1\leq i\leq n$,

where we employ the convention that $\lambda_{k}(X_{n})=-\infty$ if $k>n$ and $+\infty$ if $k\leq 0$. It follows that the empirical spectral measure of $\widetilde{X}_{n}$ converges almost surely to $\mu_{X}$, because the empirical spectral measure of $X_{n}$ does as well.
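The interlacing can be checked numerically. The sketch below assumes that the inequalities in question take the standard form $\lambda_{i+(r-s)}(X_{n})\leq\lambda_{i}(\widetilde{X}_{n})\leq\lambda_{i-s}(X_{n})$, where eigenvalues are sorted in decreasing order and $s$ is the number of positive $\theta_{j}$'s. The helper name and the test matrices are hypothetical.

```python
import numpy as np

def check_interlacing(n=60, thetas=(3.0, 1.5, -2.0), seed=1, tol=1e-9):
    """Check lam_{i+(r-s)}(X) <= lam_i(X + P) <= lam_{i-s}(X) for all 1 <= i <= n."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((n, n))
    x = (a + a.T) / 2
    r = len(thetas)
    s = sum(t > 0 for t in thetas)                      # number of positive theta_j
    q, _ = np.linalg.qr(rng.standard_normal((n, r)))    # orthonormal columns
    p = (q * np.array(thetas)) @ q.T                    # rank-r perturbation
    lam = np.sort(np.linalg.eigvalsh(x))[::-1]          # decreasing order
    lam_tilde = np.sort(np.linalg.eigvalsh(x + p))[::-1]

    def lam_ext(k):
        # 1-based index with the convention -inf if k > n and +inf if k <= 0.
        if k > n:
            return -np.inf
        if k <= 0:
            return np.inf
        return lam[k - 1]

    return all(
        lam_ext(i + (r - s)) - tol <= lam_tilde[i - 1] <= lam_ext(i - s) + tol
        for i in range(1, n + 1)
    )

print(check_interlacing())
```

The inequalities are deterministic, so the check should hold for every seed; the small tolerance only absorbs floating-point rounding.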

Since $a$ and $b$ belong to the support of $\mu_{X}$, we have, for all $i\geq 1$ fixed,

it follows that for all $i\geq 1$ fixed, $\lambda_{i}(X_{n})\overset{\textrm{a.s.}}{\longrightarrow}b$ and $\lambda_{n+1-i}(X_{n})\overset{\textrm{a.s.}}{\longrightarrow}a$.

From (10), we deduce the following two relations, (11) and (12): for all $i\geq 1$ fixed, we have

and for all $i>s$ (resp. $i\geq r-s$) fixed, we have

In this section, we assume the eigenvalues $\theta_{1},\ldots,\theta_{r}$ of the perturbing matrix $P_{n}$ to be pairwise distinct. In the next section, we shall remove this hypothesis by an approximation process.

For a momentarily fixed $n$, let the eigenvalues of $X_{n}$ be denoted by $\lambda_{1}\geq\ldots\geq\lambda_{n}$. Consider orthogonal (or unitary) $n\times n$ matrices $U_{X}$, $U_{P}$ that diagonalize $X_{n}$ and $P_{n}$, respectively, such that

The spectrum of $X_{n}+P_{n}$ is identical to the spectrum of the matrix

Since we have assumed that $X_{n}$ or $P_{n}$ is orthogonally (or unitarily) invariant and that they are independent, $U_{n}$ is a Haar-distributed orthogonal (or unitary) matrix that is also independent of $(\lambda_{1},\ldots,\lambda_{n})$ (see the first paragraph of the proof of [21, Th. 4.3.5] for additional details).
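For numerical experiments, a Haar-distributed orthogonal matrix can be sampled from the QR factorization of a Gaussian matrix. The sketch below uses this standard recipe, which is not taken from the text; the sign correction removes the ambiguity of the factorization so that the law of the output is Haar. The function name is hypothetical.

```python
import numpy as np

def haar_orthogonal(n, rng):
    """Sample an n x n orthogonal matrix from the Haar measure via QR of a Gaussian matrix."""
    z = rng.standard_normal((n, n))
    q, r = np.linalg.qr(z)
    # Multiply column j of q by the sign of r[j, j]; without this step the
    # distribution depends on the sign convention of the QR routine.
    return q * np.sign(np.diag(r))

rng = np.random.default_rng(0)
u = haar_orthogonal(5, rng)
print(np.allclose(u.T @ u, np.eye(5)))
```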

Recall that the largest eigenvalue $\lambda_{1}(X_{n})\overset{\textrm{a.s.}}{\longrightarrow}b$, while the smallest eigenvalue $\lambda_{n}(X_{n})\overset{\textrm{a.s.}}{\longrightarrow}a$. Let us now consider the eigenvalues of $\widetilde{X}_{n}$ which lie outside $[\lambda_{n}(X_{n}),\lambda_{1}(X_{n})]$. By Proposition 5.1-a) and an application of the identity in Proposition 5.1-b), these eigenvalues are precisely the numbers $z\notin[\lambda_{n}(X_{n}),\lambda_{1}(X_{n})]$ such that the $r\times r$ matrix

is singular. Recall that in (14), $G_{\mu_{i,j}^{(n)}}(z)$, for $i,j=1,\ldots,r$, is the Cauchy transform of the random complex measure defined by

where $u_{k,i}$ and $u_{k,j}$ are the $(k,i)$-th and $(k,j)$-th entries of the orthogonal (or unitary) matrix $U_{n}$ in (13) and $\lambda_{k}$ is the $k$-th largest eigenvalue of $X_{n}$ as in the first term in (13).
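The reduction to an $r\times r$ singularity condition can be verified numerically. The sketch below assumes, as a working hypothesis of ours, that the matrix in (14) is $M_{n}(z)=I_{r}-[\theta_{j}G_{\mu_{i,j}^{(n)}}(z)]_{i,j=1}^{r}$ with $G_{\mu_{i,j}^{(n)}}(z)=\sum_{k}\overline{u_{k,i}}\,u_{k,j}/(z-\lambda_{k})$, and checks that its determinant vanishes at the outlying eigenvalues of $\widetilde{X}_{n}$. The function name `m_n` and all numerical values are hypothetical.

```python
import numpy as np

def m_n(z, lam, u_r, thetas):
    """Assumed form of (14): I_r - [theta_j * sum_k conj(u_ki) u_kj / (z - lam_k)]."""
    g = u_r.conj().T @ (u_r / (z - lam)[:, None])   # r x r matrix of Cauchy transforms
    return np.eye(len(thetas)) - g * np.asarray(thetas)[None, :]

rng = np.random.default_rng(2)
n, thetas = 300, np.array([3.0, 2.0])
a = rng.standard_normal((n, n))
lam = np.sort(np.linalg.eigvalsh((a + a.T) / np.sqrt(2 * n)))[::-1]

# First r columns of a Haar-distributed orthogonal matrix (QR with sign fix).
q, r_fac = np.linalg.qr(rng.standard_normal((n, n)))
u_r = (q * np.sign(np.diag(r_fac)))[:, :len(thetas)]

# Perturbed matrix in the basis where X_n is diagonal.
x_tilde = np.diag(lam) + (u_r * thetas) @ u_r.T
outliers = np.sort(np.linalg.eigvalsh(x_tilde))[::-1][:len(thetas)]
dets = [np.linalg.det(m_n(z, lam, u_r, thetas)) for z in outliers]
print(outliers, dets)
```

The vanishing of the determinants is an exact consequence of the identity $\det(I-AB)=\det(I-BA)$, so the printed values should be zero up to rounding error.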

We now note that Hypotheses a), b) and c) of Lemma 6.1 are satisfied and follow from the definition of the Cauchy transform $G_{\mu_{X}}$. Hypothesis d) of Lemma 6.1 follows from the fact that $\widetilde{X}_{n}$ is Hermitian, while Hypothesis e) has been established in (17).

Let us recall that the eigenvalues of $\widetilde{X}_{n}$ which lie outside $[\lambda_{n}(X_{n}),\lambda_{1}(X_{n})]$ are precisely those values $z_{n}$ where the matrix $M_{n}(z_{n})$ is singular. As a consequence, we are now in a position where Theorem 2.1 follows by invoking Lemma 6.1. Indeed, by Lemma 6.1, if

then there exist sequences $(z_{n,1}),\ldots,(z_{n,p})$ converging respectively to $z_{1},\ldots,z_{p}$ such that for any $\varepsilon>0$ small enough, for $n$ large enough, the eigenvalues of $\widetilde{X}_{n}$ that lie outside $[a-\varepsilon,b+\varepsilon]$ are exactly $z_{n,1},\ldots,z_{n,p}$. Moreover, (5) and Lemma 6.1-(ii) ensure that for $n$ large enough, these eigenvalues have multiplicity one.

We now treat the case where the $\theta_{i}$'s are not assumed to be pairwise distinct.

We want to prove that for all $1\leq i\leq s$, $\lambda_{i}(\widetilde{X}_{n})\overset{\textrm{a.s.}}{\longrightarrow}\rho_{\theta_{i}}$ and that for all $0\leq j<r-s$, $\lambda_{n-j}(\widetilde{X}_{n})\overset{\textrm{a.s.}}{\longrightarrow}\rho_{\theta_{r-j}}$.

We shall treat only the case of the largest eigenvalues (the case of the smallest ones can be treated in the same way). So let us fix $1\leq i\leq s$ and $\varepsilon>0$.

There is $\eta>0$ such that $|\rho_{\theta}-\rho_{\theta_{i}}|\leq\varepsilon$ whenever $|\theta-\theta_{i}|\leq\eta$. Consider pairwise distinct nonzero real numbers $\theta^{\prime}_{1}>\cdots>\theta^{\prime}_{r}$ such that for all $j=1,\ldots,r$, $\theta_{j}$ and $\theta^{\prime}_{j}$ have the same sign and

It implies that $|\rho_{\theta^{\prime}_{i}}-\rho_{\theta_{i}}|\leq\varepsilon$. With the notation in Section 6.2.2, for each $n$, we define

Note that by [22, Cor. 6.3.8], we have, for all $n$,

Theorem 2.1 can be applied to $X_{n}+P^{\prime}_{n}$ (because $\theta_{1}^{\prime},\ldots,\theta_{r}^{\prime}$ are pairwise distinct). It follows that almost surely, for $n$ large enough,

By the triangle inequality, almost surely, for $n$ large enough,

so that $\lambda_{i}(X_{n}+P_{n})\overset{\textrm{a.s.}}{\longrightarrow}\rho_{\theta_{i}}$. $\square$
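The approximation argument relies on the stability of Hermitian eigenvalues under small perturbations. The sketch below illustrates this with the Weyl-type bound $|\lambda_{i}(A)-\lambda_{i}(B)|\leq\|A-B\|_{\mathrm{op}}$; we assume, without having the cited corollary at hand, that it provides a bound of this kind. When $P_{n}$ and $P^{\prime}_{n}$ share eigenvectors, $\|P_{n}-P^{\prime}_{n}\|_{\mathrm{op}}=\max_{j}|\theta_{j}-\theta^{\prime}_{j}|$. All numerical values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200
a = rng.standard_normal((n, n))
x = (a + a.T) / np.sqrt(2 * n)
q, _ = np.linalg.qr(rng.standard_normal((n, 3)))       # shared eigenvectors of P and P'

thetas = np.array([2.0, 2.0, -1.5])                      # a repeated value
thetas_prime = thetas + np.array([1e-2, -1e-2, 5e-3])    # pairwise distinct, same signs

def perturbed_spectrum(th):
    """Eigenvalues of X + sum_j th_j q_j q_j^T in decreasing order."""
    return np.sort(np.linalg.eigvalsh(x + (q * th) @ q.T))[::-1]

gap = np.max(np.abs(perturbed_spectrum(thetas) - perturbed_spectrum(thetas_prime)))
bound = np.max(np.abs(thetas - thetas_prime))            # operator norm of P - P'
print(gap, bound)
```

The printed gap should not exceed the bound, which is what allows the limit for distinct $\theta^{\prime}_{j}$'s to be transferred to the original $\theta_{j}$'s.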

Proof of Theorem 2.2

The eigenvectors of $X_{n}+P_{n}$ are precisely $U_{X}$ times the eigenvectors of

Consequently, Theorem 2.2 will be proved once the result is established in the setting where $X_{n}=\operatorname{diag}(\lambda_{1},\ldots,\lambda_{n})$ and $P_{n}=U_{n}\operatorname{diag}(\theta_{1},\ldots,\theta_{r},0,\ldots,0)U_{n}^{*}$, where $U_{n}$ is a Haar-distributed orthogonal (or unitary) matrix.

Let $r_{0}$ be the number of $i$'s such that $\theta_{i}=\theta_{i_{0}}$. Up to a reindexing of the $\theta_{i}$'s (which are then no longer decreasing; this fact does not affect our proof), one can suppose that $i_{0}=1$, $\theta_{1}=\cdots=\theta_{r_{0}}$. This choice implies that, for each $n$, $\ker(\theta_{1}I_{n}-P_{n})$ is the linear span of the first $r_{0}$ columns $u_{1},\ldots,u_{r_{0}}$ of $U_{n}$. By construction, these columns are orthonormal. Hence, we will have proved Theorem 2.2 if we can prove that as $n\longrightarrow\infty$,

As before, for every $n$ and for all $z$ outside the spectrum of $X_{n}$, define the $r\times r$ random matrix:

where, for all $i,j=1,\ldots,r$, $\mu_{i,j}^{(n)}$ is the random complex measure defined by (15).

We have established in Theorem 2.1 that because $\theta_{i_{0}}>1/G_{\mu_{X}}(b^{+})$, $\widetilde{\lambda}_{i_{0}}\overset{\textrm{a.s.}}{\longrightarrow}\rho=G_{\mu_{X}}^{-1}(1/\theta_{i_{0}})\notin[a,b]$ as $n\longrightarrow\infty$. It follows that:

Proposition 5.1 a) states that for $n$ large enough that $\widetilde{\lambda}_{i_{0}}$ is not an eigenvalue of $X_{n}$, the $r\times 1$ vector

is in the kernel of the $r\times r$ matrix $M_{n}(z_{n})$, with $\|U_{n,r}^{*}\widetilde{u}\|_{2}\leq 1$.

Thus by (20), any limit point of $U_{n,r}^{*}\widetilde{u}$ is in the kernel of the matrix on the right-hand side of (21), i.e. has its last $r-r_{0}$ coordinates equal to zero.

Thus (19) holds and we have proved Theorem 2.2-b). We now establish (18).

By (6), one has that for all $n$, the eigenvector $\widetilde{u}$ of $\widetilde{X}_{n}$ associated with the eigenvalue $\widetilde{\lambda}_{i_{0}}$ can be expressed as:

As $\widetilde{\lambda}_{i_{0}}\overset{\textrm{a.s.}}{\longrightarrow}\rho\notin[a,b]$, the sequence $(\widetilde{\lambda}_{i_{0}}I_{n}-X_{n})^{-1}$ is bounded in operator norm so that by (19), $\|\widetilde{u}^{\prime\prime}\|\overset{\textrm{a.s.}}{\longrightarrow}0$. Since $\|\widetilde{u}\|=1$, this implies that $\|\widetilde{u}^{\prime}\|\overset{\textrm{a.s.}}{\longrightarrow}1$.

Since we assumed that $\theta_{i_{0}}=\theta_{1}=\cdots=\theta_{r_{0}}$, we must have that:

By Proposition 9.3, we have that for all $i\neq j$, $\mu_{i,j}^{(n)}$ converges almost surely to the null measure (its total mass is $\langle u_{i},u_{j}\rangle=0$), while for all $i$, $\mu_{i,i}^{(n)}\overset{\textrm{a.s.}}{\longrightarrow}\mu_{X}$. Thus, since $z_{n}\overset{\textrm{a.s.}}{\longrightarrow}\rho\notin[a,b]$, we have that for all $i,j=1,\ldots,r_{0}$,

Combining the relationship in (22) with the fact that $\|\widetilde{u}^{\prime}\|\overset{\textrm{a.s.}}{\longrightarrow}1$ yields (18), and we have proved Theorem 2.2-a). $\square$
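The eigenvector statement can also be probed numerically. The sketch below again assumes the semicircle law on $[-2,2]$ and a rank-one perturbation with $\theta>1$, for which the limiting squared projection $-1/(\theta^{2}G^{\prime}_{\mu_{X}}(\rho))$ evaluates to $1-1/\theta^{2}$. The specific law, the function name and the matrix size are illustrative assumptions.

```python
import numpy as np

def overlap_with_spike(n, theta, seed=0):
    """Squared inner product between the top eigenvector of X + theta u u^T and u."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((n, n))
    x = (a + a.T) / np.sqrt(2.0 * n)        # GOE-like, spectrum close to [-2, 2]
    u = rng.standard_normal(n)
    u /= np.linalg.norm(u)                  # uniformly distributed unit vector
    _, vecs = np.linalg.eigh(x + theta * np.outer(u, u))
    return float(np.dot(vecs[:, -1], u) ** 2)   # eigh sorts eigenvalues increasingly

theta = 2.0
predicted = 1.0 - 1.0 / theta**2
observed = overlap_with_spike(800, theta)
print(predicted, observed)
```

As with the eigenvalues, agreement is only approximate at finite $n$.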

Proof of Theorem 2.3

Let us assume that $\theta>0$. The proof supplied below can be easily adapted to the setting where $\theta<0$.

We denote the coordinates of $u$ by $u^{(n)}_{1},\ldots,u^{(n)}_{n}$ and define, for each $n$, the random probability measure

The $r=1$ setting of Proposition 5.1-b) states that the eigenvalues of $X_{n}+P_{n}$ which are not eigenvalues of $X_{n}$ are the solutions of

Since $G_{\mu_{X}^{(n)}}(z)$ decreases from $+\infty$ to $0$ for increasing values of $z\in(\lambda_{1},+\infty)$, we have that $\lambda_{1}(X_{n}+P_{n})=:\widetilde{\lambda}_{1}>\lambda_{1}$. Reproducing the arguments leading to (3) in Section 4.2 yields the relationship:

By Theorem 2.1, we have that $\widetilde{\lambda}_{1}\overset{\textrm{a.s.}}{\longrightarrow}b$ so that

so that by (23), $|\langle\widetilde{u},\ker(\theta I_{n}-P_{n})\rangle|^{2}\overset{\textrm{a.s.}}{\longrightarrow}0$, thereby proving Theorem 2.3. $\square$
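The rank-one argument can be followed step by step in a small computation. The sketch below assumes that the equation for the eigenvalues is $\sum_{k}|u_{k}|^{2}/(z-\lambda_{k})=1/\theta$ and that the relationship (23) reads $|\langle\widetilde{u},u\rangle|^{2}=\big(\theta^{2}\sum_{k}|u_{k}|^{2}/(\widetilde{\lambda}_{1}-\lambda_{k})^{2}\big)^{-1}$; both forms are our reading of the missing displays. It uses a GOE-like $X_{n}$ with $\theta=0.5$, below the semicircle threshold $1$. The function name is hypothetical.

```python
import numpy as np

def secular_root(lam, weights, theta, iters=200):
    """Largest solution of sum_k weights_k / (z - lam_k) = 1/theta, theta > 0, by bisection."""
    lo, hi = lam.max(), lam.max() + theta     # the root lies in (lam_1, lam_1 + theta]
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        with np.errstate(divide="ignore"):
            value = np.sum(weights / (mid - lam))
        # The left-hand side decreases in z on (lam_1, +inf).
        if value > 1.0 / theta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = np.random.default_rng(5)
n, theta = 500, 0.5
a = rng.standard_normal((n, n))
lam = np.linalg.eigvalsh((a + a.T) / np.sqrt(2 * n))    # X_n in its own eigenbasis
u = rng.standard_normal(n)
u /= np.linalg.norm(u)

vals, vecs = np.linalg.eigh(np.diag(lam) + theta * np.outer(u, u))
top = secular_root(lam, u**2, theta)
overlap_formula = 1.0 / (theta**2 * np.sum(u**2 / (top - lam) ** 2))
overlap_direct = float(np.dot(vecs[:, -1], u) ** 2)
print(top, vals[-1], overlap_formula, overlap_direct)
```

The root of the scalar equation and the formula for the projection should match the direct eigendecomposition up to rounding, and the top eigenvalue should sit near the edge $b=2$.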

We omit the details of the proofs of Theorems 2.6–2.8, since these are straightforward adaptations of the proofs of Theorems 2.1–2.3 that can be obtained by following the prescription in Proposition 5.1-c).

Appendix: convergence of weighted spectral measures

We now establish a lemma on the weak convergence of complex measures that will be useful in proving Proposition 9.3. We note that the counterpart of this lemma for probability measures is well known. We did not find any reference in the standard literature to the “complex measures version” stated next, so we provide a short proof.

which can be made arbitrarily small by appropriately choosing $g$. The tightness hypothesis ensures that such a $g$ can always be found. This proves that $(\mu_{n})$ converges weakly to $\mu$. The uniform convergence follows from a straightforward application of Ascoli's Theorem. $\square$

2. Convergence of weighted spectral measures

b) Suppose that $\frac{1}{n}(x_{1}+x_{2}+\cdots+x_{n})$ converges almost surely to a deterministic limit $l$. Then

Proof. We use Lemma 9.1. Note first that almost surely, since $\sup_{n,k}|\lambda_{k}|<\infty$, both sequences are tight. Moreover, we have

References