The Isotropic Semicircle Law and Deformation of Wigner Matrices

Antti Knowles, Jun Yin

Introduction

Random matrices were introduced by Wigner Wig in the 1950s to model the excitation spectra of large atomic nuclei, and have since been the subject of intense mathematical investigation. In this paper we study Wigner matrices – random matrices whose entries are independent up to symmetry constraints – that have been deformed by a finite-rank perturbation. By Weyl’s eigenvalue interlacing inequalities, such a deformation does not influence the global statistics of the eigenvalues. Thus, the empirical eigenvalue densities of deformed and undeformed Wigner matrices have the same large-scale asymptotics, and are governed by Wigner’s famous semicircle law. However, the behaviour of individual eigenvalues may change dramatically under a deformation. In particular, deformed Wigner matrices may exhibit outliers, eigenvalues located away from the bulk spectrum. Such models were first investigated by Füredi and Komlós FKoml . Subsequently, much progress SoshPert ; FP ; CDMF1 ; CDMF2 ; CDMF3 ; BGN ; BGGM1 ; BGGM2 has been made in the analysis of the spectrum of such deformed matrix models. See e.g. SoshPert for a review of recent developments. Analogous deformations of covariance matrices, so-called spiked population models, as well as generalizations thereof, were studied in BY1 ; BY2 ; BS .

The phase transition takes place on the scale $d=1+wN^{-1/3}$ where $w$ is of order one. This may be heuristically understood as follows. The largest eigenvalues of $H$ are known to fluctuate on the scale $N^{-2/3}$ around $2$ . The critical scale for $d$ , i.e. the scale on which the outlier is separated from $2$ by a gap of order $N^{-2/3}$ , is therefore $d=1+wN^{-1/3}$ (since in that case $d+d^{-1}=2+w^{2}N^{-2/3}+O(w^{3}N^{-1})$ ). In BBP ; Pec ; BV1 ; BV2 , the authors established the weak convergence as $N\to\infty$

where $\lambda_{N}(A)$ denotes the largest eigenvalue of $A$ . Moreover, the asymptotics in $w$ of the law $\Lambda_{w}$ were analysed in BBP ; Pec ; BV1 ; BV2 ; Bthesis : as $w\to+\infty$ , the law $\Lambda_{w}$ converges to a Gaussian; as $w\to-\infty$ , the law $\Lambda_{w}$ converges to the Tracy-Widom- $\beta$ distribution (where $\beta=1$ for GOE and $\beta=2$ for GUE). As mentioned above, the results of BBP ; Pec ; BV1 ; BV2 also apply to rank- $k$ deformations, where the picture is similar; each eigenvalue $d_{i}\in[-1,1]^{c}$ gives rise to an outlier located around $d_{i}+d_{i}^{-1}$ , while eigenvalues $d_{i}\in(-1,1)$ do not change the statistics of the extremal eigenvalues of $\widetilde{H}$ .
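The outlier picture described above can be checked numerically. The following is a minimal sketch, assuming a GOE matrix normalised so that its spectrum fills $[-2,2]$ and a rank-one deformation along a coordinate vector; the matrix size, the seed, and the helper names `goe` and `top_eigenvalue_rank_one` are illustrative choices, not taken from the text.

```python
import numpy as np

def goe(n, rng):
    """Sample an n x n GOE matrix normalised so the spectrum fills [-2, 2]."""
    x = rng.standard_normal((n, n))
    return (x + x.T) / np.sqrt(2 * n)

def top_eigenvalue_rank_one(n, d, seed=0):
    """Largest eigenvalue of H + d * v v^T for a GOE matrix H and a unit vector v."""
    rng = np.random.default_rng(seed)
    h = goe(n, rng)
    v = np.zeros(n)
    v[0] = 1.0  # a deterministic unit vector (illustrative choice)
    return np.linalg.eigvalsh(h + d * np.outer(v, v))[-1]

n = 500
top_super = top_eigenvalue_rank_one(n, d=2.0)  # d > 1: an outlier near d + 1/d is expected
top_sub = top_eigenvalue_rank_one(n, d=0.5)    # |d| < 1: the top eigenvalue should stay near 2
print(top_super, 2.0 + 1.0 / 2.0)
print(top_sub)
```

For $d=2$ the largest eigenvalue should land close to $d+d^{-1}=2.5$ , while for $d=0.5$ it should remain near the spectral edge $2$ ; the agreement is only approximate at finite $N$ .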

The proofs of BBP ; Pec use an asymptotic analysis of Fredholm determinants, while those of BV1 ; BV2 use an explicit tridiagonal representation of $H$ ; both of these approaches rely heavily on the Gaussian nature of $H$ . In order to study the phase transition for non-Gaussian matrix ensembles, and in particular address the question of spectral universality, a different approach is needed. Interestingly, it was observed in CDMF1 ; CDMF2 ; CDMF3 that the distribution of the outliers is not universal, and may depend on the geometry of the eigenvectors of $A$ . The non-universality of the outliers was further investigated in SoshPert .

In the present paper we take $H$ to be a real symmetric or complex Hermitian Wigner matrix, and $A$ to be a rank- $k$ deterministic matrix whose symmetry class (real symmetric or complex Hermitian) coincides with that of $H$ . We make the following assumptions on the perturbation $A$ .

The eigenvalues $d_{1},\dots,d_{k}$ of $A$ may depend on $N$ ; they satisfy $\bigl{\lvert}\lvert d_{i}\rvert-1\bigr{\rvert}\geqslant(\log N)^{C\log\log N}N^{-1/3}$ , i.e., on the scale of the phase transition, the eigenvalues of $A$ are separated from the transition points by at least a logarithmic factor.

The eigenvectors of $A$ are arbitrary orthonormal vectors.

Our main results on the spectrum of $H+A$ may be informally summarized as follows.

The non-outliers “stick” to eigenvalues of the undeformed matrix $H$ (Theorem 2.7). In particular, the extremal bulk eigenvalues of $H+A$ are universal.

We identify the distribution of the outliers of $H+A$ (Theorem 2.14).

A key ingredient in our proof is a generalization of the local semicircle law. The study of the local semicircle law was initiated in ESY1 ; ESY3 ; it provides a key step towards establishing universality for Wigner matrices ESY4 ; ESY6 ; EYY2 ; EYY3 ; TV1 ; TV2 . The strongest versions of the local semicircle law, proved in EYY3 ; EKYY1 ; EKYY2 , give precise estimates on the local eigenvalue density, down to scales containing $N^{\varepsilon}$ eigenvalues. In fact, as formulated in EYY3 , the local semicircle law gives optimal high-probability estimates on the quantity

where $m(z)$ denotes the Stieltjes transform of Wigner’s semicircle law and $G(z)=(H-z)^{-1}$ is the resolvent of $H$ . Starting from such estimates on (1.1), the two following facts are established in EYY3 .

The eigenvalue density is governed by Wigner’s semicircle law down to scales containing $N^{\varepsilon}$ eigenvalues.

Eigenvalue rigidity: optimal high-probability bounds on the eigenvalue locations.
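As a numerical illustration of these statements, the sketch below compares the resolvent $G(z)=(H-z)^{-1}$ of a sampled GOE matrix with $m(z)$ times the identity. It is a minimal sketch under stated assumptions: the spectral parameter, matrix size, and seed are arbitrary, and the quadratic form $\langle v,G(z)v\rangle$ with a fixed unit vector $v$ is used here as an assumed stand-in for the isotropic quantity studied in this paper.

```python
import numpy as np

def m_sc(z):
    """Stieltjes transform of the semicircle law: root of m^2 + z m + 1 = 0 with Im m > 0."""
    s = np.sqrt(z * z - 4 + 0j)
    m = (-z + s) / 2
    return m if m.imag > 0 else (-z - s) / 2

rng = np.random.default_rng(1)
n = 400
x = rng.standard_normal((n, n))
h = (x + x.T) / np.sqrt(2 * n)          # GOE, spectrum close to [-2, 2]

z = 0.5 + 0.5j                          # illustrative spectral parameter, eta = Im z
g = np.linalg.inv(h - z * np.eye(n))    # resolvent G(z) = (H - z)^{-1}
m = m_sc(z)

trace_error = abs(np.trace(g) / n - m)            # |N^{-1} Tr G - m|
entry_error = np.max(np.abs(g - m * np.eye(n)))   # max_{i,j} |G_ij - delta_ij m|
v = np.ones(n) / np.sqrt(n)                       # a deterministic unit vector
isotropic_error = abs(v @ g @ v - m)              # |<v, G v> - m|
print(trace_error, entry_error, isotropic_error)
```

All three errors should be small compared with $\lvert m(z)\rvert$ , with the averaged quantity noticeably smaller than the entrywise one.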

Another key ingredient in the proof of universality of random matrices is the Green function comparison method introduced in EYY2 . It uses a Lindeberg replacement strategy, which previously appeared in the context of random matrix theory in Chat ; TV1 ; TV2 . A fundamental input in the Green function comparison method is a precise control on the matrix entries of $G$ , which is provided by the local semicircle law. The Green function comparison method has subsequently been applied to proving the spectral universality of adjacency matrices of random graphs EKYY1 ; EKYY2 as well as the universality of eigenvectors of Wigner matrices KY1 .

In this paper, we extend the local semicircle law to the isotropic local semicircle law, which gives optimal high-probability estimates on the quantity

In Section 2, we introduce basic definitions and state our results. In a first part, we state the isotropic semicircle law (Theorem 2.2) and some important corollaries, such as the isotropic delocalization estimate (Theorem 2.5). The second part of Section 2 is devoted to the spectra of deformed Wigner matrices. Our main results are deviation estimates on the eigenvalue locations (Theorem 2.7) and the distribution of the outliers (Theorem 2.14). In subsequent remarks we discuss some special cases of interest, in particular making the link to the previous results of CDMF1 ; CDMF2 ; CDMF3 ; SoshPert .

The remainder of this paper is devoted to proofs. As it turns out, the proof of the isotropic local semicircle law is considerably simpler if the third moments of the matrix entries of $H$ vanish. This case is dealt with in Section 3. The proof is based on the Green function comparison method and the local semicircle law of EYY3 . In Section 4, we give the additional arguments needed to extend the isotropic local semicircle law to arbitrary matrix entries. We remark that the Green function comparison method has traditionally been used EYY2 ; EKYY2 ; KY1 to obtain limiting distributions of smooth, bounded observables that depend on the resolvent $G$ . In this paper we use it in a novel setting: to obtain high-probability bounds on a fluctuating error.

In Section 5 we use the isotropic semicircle law to obtain an improved estimate outside of the classical spectrum $[-2,2]$ , and prove the isotropic delocalization result, which yields optimal high-probability bounds on projections of the eigenvectors of $H$ onto arbitrary deterministic vectors.

Finally, Section 7 contains the proof of Theorem 2.14, the distribution of the outliers. The proof consists of four main steps.

Let $H$ be the Wigner matrix we are interested in. We introduce a cutoff $\varepsilon_{N}$ (equal to $\varphi^{-D}$ in the notation of Section 7.3). We define $\widehat{H}$ as the Wigner matrix obtained from $H$ by replacing the $(i,j)$ -th entry of $H$ with a Gaussian whenever $\lvert v_{i}\rvert\leqslant\varepsilon_{N}$ and $\lvert v_{j}\rvert\leqslant\varepsilon_{N}$ . We choose $\varepsilon_{N}$ large enough that most entries of $\widehat{H}$ are Gaussian. We shall compare $H$ with a Gaussian matrix $V$ via the intermediate matrix $\widehat{H}$ . In this step, (iii), we compare $\widehat{H}$ with $V$ .
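The construction of $\widehat{H}$ can be sketched as follows. This is an illustration only: the sign-valued Wigner matrix, the vector $v$ , the numerical cutoff `eps`, and the helper name `almost_gaussian_matrix` are hypothetical choices, and the real symmetric case is used for simplicity.

```python
import numpy as np

def almost_gaussian_matrix(w, gauss, v, eps):
    """Replace entry (i, j) of w by the Gaussian entry whenever |v_i| <= eps and |v_j| <= eps."""
    small = np.abs(v) <= eps
    mask = np.outer(small, small)       # True where both indices carry small weight
    return np.where(mask, gauss, w), mask

rng = np.random.default_rng(2)
n = 200
# Hypothetical non-Gaussian Wigner matrix: symmetric random signs with variance 1/n.
signs = rng.choice([-1.0, 1.0], size=(n, n))
w = (np.triu(signs) + np.triu(signs, 1).T) / np.sqrt(n)
x = rng.standard_normal((n, n))
gauss = (x + x.T) / np.sqrt(2 * n)      # GOE with the same normalisation

# Hypothetical unit vector with a few large components and many small ones.
v = np.full(n, 1.0)
v[:5] = 20.0
v /= np.linalg.norm(v)

h_hat, mask = almost_gaussian_matrix(w, gauss, v, eps=0.1)
print(mask.mean())                      # fraction of entries replaced by Gaussians
```

Because a unit vector can have only few components larger than the cutoff, most entries end up Gaussian, and the entries that keep their original law are exactly those touching a large component of $v$ .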

Our proof relies on a block expansion of $\widehat{H}$ , which expresses the distribution of the difference

in terms of a sum of independent random variables ( $\Gamma_{1},\dots,\Gamma_{6}$ in the notation of Section 7.3) whose laws may be explicitly computed.

In the final step, we use the Green function comparison method to analyse the difference

By definition of $\widehat{H}$ , whenever the entry $(i,j)$ of $H$ differs from that of $\widehat{H}$ , we have $\lvert v_{i}\rvert\leqslant\varepsilon_{N}$ and $\lvert v_{j}\rvert\leqslant\varepsilon_{N}$ . As a consequence, as it turns out, the Green function comparison method is applicable. Of special note in this comparison argument is a shift in the mean of the outlier (arising from the second term on the right-hand side of (7.50)), depending on the third moments of the entries of $H$ .

Acknowledgements

We are grateful to Alex Bloemendal, Paul Bourgade, László Erdős, and Horng-Tzer Yau for helpful comments.

Results

We use the abbreviation GOE/GUE to mean GOE if $H$ is a real symmetric Wigner matrix with Gaussian entries and GUE if $H$ is a complex Hermitian Wigner matrix with Gaussian entries. We assume that the entries of $H$ have uniformly subexponential decay, i.e. that there exists a constant $\vartheta>0$ such that

for all $i,j$ . Note that we do not assume the entries of $H$ to be identically distributed.

The following quantities will appear throughout this paper. We choose a fixed but arbitrary constant $\Sigma\geqslant 3$ . We define the logarithmic control parameter

The parameter $\zeta$ will play the role of a fixed positive constant, which simultaneously dictates the power of $\varphi$ in large deviations estimates and characterizes the decay of probability of exceptional events, according to the following definition.

Let $\zeta>0$ . We say that an $N$ -dependent event $\Xi$ holds with $\zeta$ -high probability if there is some constant $C$ such that

which will be used as the argument of Stieltjes transforms and resolvents. In the following we shall often use the notation $E=\operatorname{Re}z$ and $\eta=\operatorname{Im}z$ without further comment. Let

denote the density of the local semicircle law, and

its Stieltjes transform. To avoid confusion, we remark that the Stieltjes transform $m$ was denoted by $m_{sc}$ in the papers ESY1 ; ESY2 ; ESY3 ; ESY4 ; ESY5 ; ESY6 ; ESY7 ; ESYY ; EYY1 ; EYY2 ; EYY3 ; EKYY1 ; EKYY2 , in which $m$ had a different meaning from (2.4). It is well known that the Stieltjes transform $m$ satisfies the identity

For $\eta>0$ we define the resolvent of $H$ through

We denote by $C$ a generic positive large constant, whose value may change from one expression to the next. If this constant depends on some parameters $\alpha$ , we indicate this by writing $C_{\alpha}$ . Finally, for two positive quantities $A_{N}$ and $B_{N}$ we use the notation $A_{N}\asymp B_{N}$ to mean $C^{-1}A_{N}\leqslant B_{N}\leqslant CA_{N}$ for some positive constant $C$ .

The isotropic local semicircle law

Fix $\zeta>0$ . Then there exists a constant $C_{\zeta}$ such that

for some large enough constant $C_{0}$ depending on $\zeta$ .

Away from the asymptotic spectrum $[-2,2]$ , Theorem 2.2 can be strengthened as follows.

Fix $\zeta>0$ and $\Sigma\geqslant 3$ . Then there exist constants $C_{1}$ and $C_{\zeta}$ such that for any

Using a simple lattice argument combined with the Lipschitz continuity of $z\mapsto G(z)$ , one can easily strengthen the statement (2.7) of Theorem 2.2 to a simultaneous high probability statement for all $z$ , as in (3.16) below. For more details, see e.g. Corollary 3.19 in EKYY1 .

Similarly, mimicking the proof of Lemma 7.2 below, we find

For an $N\times N$ matrix $A$ we denote by $\lambda_{1}(A)\leqslant\lambda_{2}(A)\leqslant\cdots\leqslant\lambda_{N}(A)$ the nondecreasing sequence of eigenvalues of $A$ . Moreover, we denote by $\sigma(A)$ the spectrum of $A$ . It is convenient to abbreviate the (random) eigenvalues of $H$ by

For any integers $a$ and $b$ satisfying $1\leqslant a<b\leqslant N/2$ and

with $\zeta$ -high probability. Here $C_{0}$ is the constant from Theorem 2.2. By symmetry, a similar result holds for the eigenvectors $\alpha\geqslant N/2$ .

If the third moments of the entries of $H$ vanish in the sense of (2.8), then we have the stronger statement

with $\zeta$ -high probability. The second inequality implies

with $\zeta$ -high probability. Compare this with the first inequality of (2.15).

Finite-rank deformation of Wigner matrices

We shall study the spectrum of the deformed matrix

We abbreviate the eigenvalues of $\widetilde{H}$ by

In order to state our results, we order the eigenvalues of $D$ , i.e. we assume that $d_{1}\leqslant\dots\leqslant d_{k}$ . Define the numbers

As we shall see, $k^{-}$ is the number of outliers to the left of the bulk and $k^{+}$ the number of outliers to the right of the bulk. We shall always assume that $k^{-}$ and $k^{+}$ are independent of $N$ .

denote the $k^{-}+k^{+}$ indices associated with the outliers. For $i\in O$ abbreviate the associated eigenvalue index by

Choose a sequence $\psi\equiv\psi_{N}$ satisfying $1\leqslant\psi\leqslant N^{\mathfrak{b}}$ . Suppose that

for all $i=1,\dots,k$ . Then for $i\in O$ we have

In CDMF1 , Capitaine, Donati-Martin, and Féral proved that $\mu_{\alpha(i)}\to\theta(d_{i})$ almost surely for all $i\in O$ , under the assumptions that (i) $D$ does not depend on $N$ and (ii) the law of the entries of $H$ is symmetric and satisfies a Poincaré inequality. Subsequently, the assumption (ii) was relaxed by Pizzo, Renfrew, and Soshnikov SoshPert . In fact, in SoshPert the authors proved, assuming (i), that the sequence $\sqrt{N}(\mu_{\alpha(i)}-\theta(d_{i}))$ is bounded in probability for all $i\in O$ .

In BGGM1 ; BGGM2 , Benaych-Georges, Guionnet, and Maïda considered deformations of Wigner matrices by finite-rank random matrices whose eigenvalues are independent of $N$ and whose eigenvectors are either independent copies of a random vector with i.i.d. centred components satisfying a log-Sobolev inequality or are obtained by Gram-Schmidt orthonormalization of such independent copies. For these random perturbation models, they established eigenvalue sticking estimates similar to (2.21).

Provided one is only interested in the locations of the outliers, i.e. (2.20), one can set $\psi=1$ in Theorem 2.7.

We shall refer to the eigenvalues in (2.20), i.e. $\mu_{1},\dots,\mu_{k^{-}},\mu_{N-k^{+}+1},\dots,\mu_{N}$ , as the outliers, and to the eigenvalues in (2.21), i.e. $\mu_{k^{-}+1},\dots,\mu_{\varphi^{K}},\mu_{N-\varphi^{K}},\dots,\mu_{N-k^{+}}$ , as the extremal bulk eigenvalues.

The phase transition associated with $d_{i}$ happens on the scale $d_{i}=1+a_{i}N^{-1/3}$ where $a_{i}$ is of order one. The condition (2.19) is optimal (up to powers of $\varphi$ ) in the sense that the power of $N$ in (2.19) cannot be reduced. Indeed, in BBP ; Pec ; BV1 ; BV2 it is established that, for rank-one deformations of GOE/GUE with $d=1+aN^{-1/3}$ and $a$ of order one, $\mu_{N}$ fluctuates on the scale $N^{-2/3}$ and its distribution differs from that of $\lambda_{N}$ . (For simplicity of presentation, we consider rank-one deformations, although the results of BBP ; Pec ; BV1 ; BV2 hold for rank- $k$ deformations.) Hence in that case (2.21) cannot hold for $\psi\gg 1$ . See also Remark 2.13 below for a more detailed discussion of the qualitative behaviour of eigenvalues of $\widetilde{H}$ as $d_{i}$ crosses a transition point.

Note that the location $\theta(d_{i})$ of the outlier associated with $d_{i}=1+a_{i}N^{-1/3}$ satisfies $\theta(d_{i})=2+N^{-2/3}a_{i}^{2}+O(a_{i}^{3}N^{-1})$ . In comparison, the largest eigenvalue of $H$ fluctuates on a scale $N^{-2/3}$ around $2$ .
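The expansion of $\theta(d_{i})$ can be verified by a short computation. The sketch below assumes $\theta(d)=d+d^{-1}$ , consistent with the outlier location $d_{i}+d_{i}^{-1}$ quoted in the introduction, and writes $\delta=aN^{-1/3}$ with $a$ bounded.

```latex
\begin{align*}
\theta(d) &= d + \frac{1}{d}, \qquad d = 1 + \delta, \quad \delta = a N^{-1/3}, \\
\frac{1}{1+\delta} &= 1 - \delta + \delta^{2} - \delta^{3} + O(\delta^{4}), \\
\theta(1+\delta) &= 2 + \delta^{2} - \delta^{3} + O(\delta^{4})
  = 2 + a^{2} N^{-2/3} + O(a^{3} N^{-1}).
\end{align*}
```

The linear terms in $\delta$ cancel, which is why a displacement of $d$ by $N^{-1/3}$ moves the outlier only by $N^{-2/3}$ , the same scale as the edge fluctuations of $H$ .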

An immediate corollary of Theorem 2.7 is the universality of the extremal bulk eigenvalues of $\widetilde{H}$ . In other words, under the assumption $\lvert\lvert d_{i}\rvert-1\rvert\geqslant\varphi^{C_{2}+1}N^{-1/3}$ for all $i$ , the statistics of the extremal bulk eigenvalues of $\widetilde{H}$ coincide with those of GOE/GUE.

The parameter $\psi$ describes how strongly the extremal bulk eigenvalues of $\widetilde{H}$ stick to extremal eigenvalues of $H$ . If $d_{i}$ is within distance $CN^{-1/3}$ of a transition point $\pm 1$ , one does not expect the eigenvalues of $\widetilde{H}$ to stick to the eigenvalues of $H$ . For very weak sticking on the scale $N^{-2/3}\varphi^{-1}$ , corresponding to $\psi=\varphi$ , the eigenvalues $d_{i}$ have to satisfy $\bigl{\lvert}\lvert d_{i}\rvert-1\bigr{\rvert}\geqslant\varphi^{C_{2}+1}N^{-1/3}$ . In particular, we may allow outliers at a distance $\varphi^{2C_{2}+2}N^{-2/3}$ from the spectral edge.

On the other hand, in order to obtain strong sticking on the scale $N^{-1+\varepsilon}$ , corresponding to $\psi=N^{1/3-\varepsilon}$ , the eigenvalues $d_{i}$ have to satisfy $\bigl{\lvert}\lvert d_{i}\rvert-1\bigr{\rvert}\geqslant\varphi^{C_{2}}N^{-\varepsilon}$ . Now the outliers have to lie at a distance of at least $\varphi^{2C_{2}}N^{-2\varepsilon}$ from the spectral edge.

Thus, Theorem 2.7 gives a clear picture of what happens to the extremal bulk eigenvalues as $d_{i}$ passes a transition point $\pm 1$ . For definiteness, consider the case where $d_{i}$ is varied from $1-c$ to $1+c$ for some small $c>0$ , and all other eigenvalues of $D$ are kept constant. Consider an extremal bulk eigenvalue near $+2$ , say $\mu_{\alpha}$ . By Theorem 2.7, for $d_{i}\leqslant 1-\varphi^{C_{2}+1}N^{-1/3}$ , $\mu_{\alpha}$ sticks to $\lambda_{\beta}$ where $\beta:=\alpha+k^{+}$ . As $d_{i}$ approaches $1$ , the eigenvalue $\mu_{\alpha}$ progressively detaches itself from $\lambda_{\beta}$ . Theorem 2.7 allows one to follow this behaviour down to $\lvert d_{i}-1\rvert=\varphi^{C_{2}+1}N^{-1/3}$ . Below this scale, as $d_{i}$ passes $1$ , the eigenvalue $\mu_{\alpha}$ “jumps” from the vicinity of $\lambda_{\beta}$ to the vicinity of $\lambda_{\beta+1}$ . This jump happens in the range $d_{i}\in[1-\varphi^{C_{2}+1}N^{-1/3},1+\varphi^{C_{2}+1}N^{-1/3}]$ . After the jump, i.e. for $d_{i}\geqslant 1+\varphi^{C_{2}+1}N^{-1/3}$ , the eigenvalue $\mu_{\alpha}$ sticks to $\lambda_{\beta+1}$ instead of $\lambda_{\beta}$ , provided that $\beta<N$ . If $\beta=N$ , then $\mu_{\alpha}$ escapes from the bulk spectrum and becomes an outlier. This jump happens simultaneously for all extremal bulk eigenvalues near $+2$ , and is accompanied by the creation of an outlier. This may be expressed as $(k^{0},k^{+})\mapsto(k^{0}-1,k^{+}+1)$ . Meanwhile, the extremal bulk eigenvalues on the other side of the spectrum, i.e. near $-2$ , remain unaffected by the transition, and continue sticking to the same eigenvalues of $H$ they stuck to before the transition.
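The sticking of a non-outlier to an eigenvalue of $H$ can be observed in a small experiment. The sketch below is an illustration under assumed parameters: a GOE matrix, a rank-one deformation with $d=0.3$ (so that $k^{+}=0$ and $\mu_{N}$ is expected to stick to $\lambda_{N}$ ), and an arbitrary seed. It also checks the Weyl interlacing inequalities for a nonnegative rank-one perturbation.

```python
import numpy as np

rng = np.random.default_rng(4)
n, d = 500, 0.3                          # |d| < 1: no outlier is expected
x = rng.standard_normal((n, n))
h = (x + x.T) / np.sqrt(2 * n)           # GOE, spectrum close to [-2, 2]
v = np.ones(n) / np.sqrt(n)              # deterministic unit vector

lam = np.linalg.eigvalsh(h)                        # eigenvalues of H, ascending
mu = np.linalg.eigvalsh(h + d * np.outer(v, v))    # eigenvalues of the deformed matrix

shift_top = mu[-1] - lam[-1]       # distance from mu_N to lambda_N
edge_gap = lam[-1] - lam[-2]       # spacing of the two largest eigenvalues of H
print(shift_top, edge_gap, n ** (-2 / 3))
```

In typical runs the shift `shift_top` is much smaller than the edge scale $N^{-2/3}$ , in line with the sticking picture; the comparison is qualitative at this matrix size.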

Next, we identify the distribution of the outliers. We introduce the customary symmetry index $\beta$ , by definition equal to $1$ if $H$ is real symmetric and $2$ if $H$ is complex Hermitian. In order to state our result, we define the moment matrices $M^{(3)}=(M^{(3)}_{ij})$ and $M^{(4)}=(M^{(4)}_{ij})$ of $H$ through

There is a constant $C_{2}$ such that the following holds. Suppose that

for all $i=1,\dots,k$ . Suppose moreover that for all $i\in O$ we have

and $\Upsilon_{i}$ , a random variable independent of $\Pi_{i}$ with law

Then we have, for all $i\in O$ and all bounded and continuous $f$ ,

The condition (2.24) has the following interpretation. Let $i\in O$ and assume for definiteness that $d_{i}>1$ . If $j$ is not associated with an outlier on the right-hand side of the bulk, i.e. if $d_{j}<1$ , then $d_{i}-d_{j}$ is bounded from below by the right-hand side of (2.24), as follows from (2.23). Hence the condition (2.24) is only needed to ensure that the outliers are not too close to each other; in fact, this condition is optimal (up to the factor $\varphi^{C_{2}}$ ) in guaranteeing that the distributions of the outliers have essentially no overlap. Indeed, by Theorem 2.7 we know that $\mu_{\alpha(i)}$ lies with $\zeta$ -high probability in an interval of length $2\varphi^{C_{3}}N^{-1/2}(d_{i}-1)^{1/2}$ centred around $\theta(d_{i})$ . Moreover, differentiating (2.18) yields

Imposing the condition $\lvert\theta(d_{j})-\theta(d_{i})\rvert\geqslant\varphi^{C_{3}}N^{-1/2}(d_{i}-1)^{1/2}$ leads to (2.24) (with $C_{2}$ increased if necessary so that $C_{2}\geqslant C_{3}$ ). In fact, in BBP ; Pec ; SoshPert it was proved (for $D$ independent of $N$ ) that the distribution associated with degenerate outliers is not Gaussian.
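The step from the separation of $\theta(d_{i})$ and $\theta(d_{j})$ to a separation of $d_{i}$ and $d_{j}$ can be sketched as follows. The displays (2.18) and (2.24) are not reproduced here, so the computation assumes $\theta(d)=d+d^{-1}$ and treats $d_{i}>1$ as bounded with $d_{j}$ close to $d_{i}$ ; it is a heuristic first-order argument, not the precise statement.

```latex
\begin{align*}
\theta(d) &= d + d^{-1}, \qquad \theta'(d) = 1 - d^{-2} = \frac{(d-1)(d+1)}{d^{2}}, \\
\lvert \theta(d_{j}) - \theta(d_{i}) \rvert
  &\approx \theta'(d_{i}) \, \lvert d_{j} - d_{i} \rvert
  \asymp (d_{i} - 1) \, \lvert d_{j} - d_{i} \rvert, \\
(d_{i} - 1) \, \lvert d_{j} - d_{i} \rvert \gtrsim \varphi^{C_{3}} N^{-1/2} (d_{i} - 1)^{1/2}
  &\iff
  \lvert d_{j} - d_{i} \rvert \gtrsim \varphi^{C_{3}} N^{-1/2} (d_{i} - 1)^{-1/2}.
\end{align*}
```

The closer $d_{i}$ is to the transition point $1$ , the flatter $\theta$ becomes, and the larger the separation of $d_{i}$ and $d_{j}$ needed to keep the corresponding outliers apart.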

and $\Upsilon^{\prime}$ is a centred Gaussian, independent of $\Pi^{\prime}$ , with variance

Proof of Theorem 2.2, Case A

In this section we prove Theorem 2.2 in the case A, i.e. where the first three moments of the entries of $H$ coincide with those of GOE/GUE.

For definiteness, we consider the case where $H$ is a complex Hermitian Wigner matrix; the proof for real symmetric Wigner matrices is the same. By Markov’s inequality, in order to prove Theorem 2.2 it suffices to prove the following result.

The rest of this section is devoted to the proof of Proposition 3.1.

the distance from $E$ to the spectral edges $\pm 2$ . In the following we use the notations

without further comment. The following lemma collects some useful properties of $m$ , the Stieltjes transform of the semicircle law.

For $\lvert z\rvert\leqslant 2\Sigma$ we have

(Here the implicit constants depend on $\Sigma$ .)

The proof is an elementary calculation; see Lemma 4.2 in EYY2 . ∎

In addition to $\Psi$ , we shall make use of a larger control parameter $\Phi$ , defined as

From Lemma 3.2 we find, for any $z$ satisfying $\lvert z\rvert\leqslant 2\Sigma$ ,

where $A_{N}\lesssim B_{N}$ means $A_{N}\leqslant CB_{N}$ for some constant $C$ .

We shall often need to consider minors of $H$ , which are the content of the following definition.

We shall also need the following resolvent identities, proved in Lemma 4.2 of EYY1 and Lemma 6.10 of EKYY2 .

It is an immediate consequence of (3.6) that

Next, we record some basic large deviations estimates.

Let $a_{1},\dots,a_{N},b_{1},\dots,b_{M}$ be independent random variables with zero mean and unit variance. Assume that there is a constant $\vartheta>0$ such that

Then there exists a constant $\rho\equiv\rho(\vartheta)>1$ such that, for any $\zeta>0$ and any deterministic complex numbers $A_{i}$ and $B_{ij}$ , we have with $\zeta$ -high probability

The estimates (3.12) – (3.14) were proved in Appendix B of EYY1 . The estimate (3.15) follows easily from (3.12) in two steps. Defining $A_{i}:=\sum_{j}B_{ij}b_{j}$ , (3.12) yields $\lvert A_{i}\rvert\leqslant\varphi^{\rho\zeta}\bigl{(}{\sum_{j}\lvert B_{ij}\rvert^{2}}\bigr{)}^{1/2}$ with $\zeta$ -high probability. Since the families $\{A_{i}\}$ and $\{a_{i}\}$ are independent, (3.15) follows by using (3.12) again. ∎

Finally, we quote the following results which are proved in Theorems 2.1 and 2.2 of EYY3 . (Recall that we use the notation $m$ for the quantity denoted by $m_{sc}$ in EYY3 .)

Fix $\zeta>0$ . Then there exists a constant $C_{\zeta}$ such that the event

Denote by $\gamma_{1}\leqslant\gamma_{2}\leqslant\cdots\leqslant\gamma_{N}$ the classical locations of the eigenvalues of $H$ , defined through

Fix $\zeta>0$ . Then there exists a constant $C_{\zeta}$ such that

for all $\alpha=1,\dots,N$ with $\zeta$ -high probability.

After these preparations, we may prove the key tool behind the proof of Proposition 3.1. It will be used as input in the Green function comparison method, throughout Sections 3.3, 3.4, and 4. Let us sketch its importance in the Green function comparison method. Anticipating the notation from the proof of Lemma 3.9, we shall have to estimate quantities of the form

where the right-hand side is a resolvent expansion of the left-hand side. The first matrix product on the right-hand side may be written as

For any $\zeta>0$ there exists a constant $C_{\zeta}$ such that

with $\zeta$ -high probability for some constant $C_{\zeta}$ . By spectral decomposition one easily finds that

with $\zeta$ -high probability provided that $\eta>\varphi^{C_{\zeta}}$ for some large enough $C_{\zeta}$ . Setting

we therefore conclude, using first (3.8) and then (3.10), that

with $\zeta$ -high probability. Thus we find for $\eta\geqslant\varphi^{2C_{\zeta}}N^{-1}$

where in the last inequality we used the rough bound $\lvert G_{11}(z)\rvert\leqslant\eta^{-1}\leqslant N$ . Thus (3.20) for GOE/GUE follows from (3.5) and the estimate

From now on we work on the product space generated by the Wigner matrix $H=(N^{-1/2}W_{ij})_{i,j}$ and the GOE/GUE matrix $(N^{-1/2}V_{ij})_{i,j}$ . We fix a bijective ordering map on the index set of the independent matrix elements,

and denote by $H_{\gamma}=(h^{\gamma}_{ij})$ , $\gamma=0,\dots,\gamma_{\rm max}$ , the Wigner matrix whose upper-triangular entries are defined by

In particular, $H_{0}$ is a GOE/GUE matrix and $H_{\gamma_{\rm max}}=H$ .
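The interpolating family $H_{0},\dots,H_{\gamma_{\rm max}}$ can be sketched in a few lines. The defining display is not reproduced above, so the code assumes the natural reading: entries whose index under the ordering map is at most $\gamma$ are taken from $W$ , the remaining ones from $V$ . The row-by-row ordering, the real symmetric setting, and the helper name `interpolating_matrices` are illustrative choices.

```python
import numpy as np
from itertools import combinations_with_replacement

def interpolating_matrices(w, v):
    """Yield H_0, ..., H_gamma_max, swapping one independent entry of V for W at a time."""
    n = w.shape[0]
    # One possible ordering map: enumerate the pairs (i, j) with i <= j row by row.
    order = list(combinations_with_replacement(range(n), 2))
    h = v / np.sqrt(n)                  # H_0 is built from the Gaussian entries only
    yield h.copy()
    for a, b in order:
        h[a, b] = w[a, b] / np.sqrt(n)  # replace the entry (a, b) ...
        h[b, a] = w[b, a] / np.sqrt(n)  # ... and its mirror image (b, a)
        yield h.copy()

rng = np.random.default_rng(5)
n = 4
signs = rng.choice([-1.0, 1.0], size=(n, n))
w = np.triu(signs) + np.triu(signs, 1).T    # hypothetical non-Gaussian entries W_ij
x = rng.standard_normal((n, n))
v = (x + x.T) / np.sqrt(2)                  # GOE entries V_ij before the N^{-1/2} scaling

chain = list(interpolating_matrices(w, v))
print(len(chain) - 1)                       # gamma_max = n (n + 1) / 2
```

Consecutive matrices in the chain differ in at most the two entries $(a,b)$ and $(b,a)$ , which is what makes a term-by-term comparison of the resolvents feasible.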

Let $E^{(ij)}$ denote the matrix whose matrix elements are given by $E^{(ij)}_{kl}:=\delta_{ik}\delta_{jl}$ . Fix $\gamma\geqslant 1$ and let $(a,b)$ be determined by $\phi(a,b)=\gamma$ . We shall compare $H_{\gamma-1}$ with $H_{\gamma}$ for each $\gamma$ and then sum up the differences. Note that the matrices $H_{\gamma-1}$ and $H_{\gamma}$ differ only in the entries $(a,b)$ and $(b,a)$ , and they can be written as

here the matrix $Q$ satisfies $Q_{ab}=Q_{ba}=0$ .

which are well-defined for $\eta>0$ since $Q$ and $H_{\gamma}$ are self-adjoint. Using the notation $G^{\gamma}:=(H_{\gamma}-z)^{-1}$ , we have the telescopic sum

Now we choose $K=10$ in (3.26). Applying Theorem 3.6 to the Wigner matrix $S$ , using the rough bound $\lVert R\rVert\leqslant\eta^{-1}\leqslant N$ to estimate the remainder term in (3.26), and recalling (2.1), we find

with $2\zeta$ -high probability. Here we also used (3.5). Throughout the proof we shall tacitly make use of the bound $\lvert R_{ij}\rvert\leqslant C$ with $2\zeta$ -high probability, as follows from (3.27).
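The resolvent expansions (3.25) and (3.26) are not reproduced here. The following sketch checks numerically the standard finite resolvent expansion that arguments of this type rely on, namely $S=\sum_{k=0}^{K}(-N^{-1/2}RV)^{k}R+(-N^{-1/2}RV)^{K+1}S$ for $R=(Q-z)^{-1}$ and $S=(Q+N^{-1/2}V-z)^{-1}$ . It assumes that this is the form intended; the matrix size, the indices $(a,b)$ , and the helper name `expand` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n, a, b = 6, 1, 4
z = 0.3 + 0.5j

x = rng.standard_normal((n, n))
q = (x + x.T) / np.sqrt(2 * n)
q[a, b] = q[b, a] = 0.0                     # Q has vanishing (a, b) and (b, a) entries

v = np.zeros((n, n))
v[a, b] = v[b, a] = rng.standard_normal()   # V carries the single replaced entry

r = np.linalg.inv(q - z * np.eye(n))                    # R = (Q - z)^{-1}
s = np.linalg.inv(q + v / np.sqrt(n) - z * np.eye(n))   # S = (Q + N^{-1/2} V - z)^{-1}

def expand(r, v, s, n, order):
    """Return sum_{k <= order} (-N^{-1/2} R V)^k R plus the exact remainder term."""
    step = -(r @ v) / np.sqrt(n)
    total = np.zeros_like(r)
    power = np.eye(n, dtype=complex)
    for _ in range(order + 1):
        total += power @ r
        power = power @ step
    return total + power @ s                # power equals (-N^{-1/2} R V)^{order + 1}

print(np.max(np.abs(expand(r, v, s, n, order=4) - s)))
```

The identity is exact for every order; what the choice of $K$ controls is the size of the remainder, which carries $K+1$ factors of $N^{-1/2}V$ and can therefore absorb a rough bound such as $\lVert R\rVert\leqslant\eta^{-1}\leqslant N$ .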

Next, setting $K=1$ in (3.25), recalling (2.1), and using Lemma 3.8, we find

with $2\zeta$ -high probability. Now (3.28), (3.5), and Lemma 3.8 yield

After these preparations, we may start to estimate

We choose $K=4$ in (3.25) and introduce the notation $S-R=\sum_{k=1}^{4}Y_{k}$ , whereby $Y_{k}$ has $k$ factors $V$ . We write

where $\mathcal{A}$ depends on the randomness only through $Q$ and the first three moments of $V_{ab}$ .

Abbreviating $r_{\gamma}:=(1-\mathcal{E}_{\gamma})^{-1}(1+\mathcal{E}_{\gamma})\geqslant 1$ we therefore find

Since (3.20) holds for GOE/GUE, we have the initial estimate $X_{0}\leqslant\bigl{(}{\varphi^{C_{\zeta}}\Phi}\bigr{)}^{n}$ . Iteration therefore yields

Next, we observe that $\sum_{\gamma}\mathcal{E}_{\gamma}\leqslant 1$ . Since $0\leqslant\mathcal{E}_{\gamma}\leqslant 1/2$ , we find $\prod_{\gamma}r_{\gamma}\leqslant C$ . This implies
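For completeness, the bound on the product can be spelled out using only $0\leqslant\mathcal{E}_{\gamma}\leqslant 1/2$ , $\sum_{\gamma}\mathcal{E}_{\gamma}\leqslant 1$ , and the definition of $r_{\gamma}$ ; the explicit constant $e^{4}$ is one admissible choice of $C$ .

```latex
\begin{align*}
r_{\gamma} = \frac{1 + \mathcal{E}_{\gamma}}{1 - \mathcal{E}_{\gamma}}
  &= 1 + \frac{2 \mathcal{E}_{\gamma}}{1 - \mathcal{E}_{\gamma}}
  \leqslant 1 + 4 \mathcal{E}_{\gamma}
  \leqslant e^{4 \mathcal{E}_{\gamma}}
  \qquad (0 \leqslant \mathcal{E}_{\gamma} \leqslant 1/2), \\
\prod_{\gamma} r_{\gamma}
  &\leqslant \exp\Bigl( 4 \sum_{\gamma} \mathcal{E}_{\gamma} \Bigr)
  \leqslant e^{4}.
\end{align*}
```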

Using Lemma 3.8 and (3.29), we get the bound

with $2\zeta$ -high probability, where in the second step we used Lemma 3.10 below and $s+t\leqslant\varphi^{\zeta}$ , and in the third step the inequality $x^{m-a}y^{a}\leqslant(x+y)^{m}$ . Here $D>0$ is some constant to be chosen later, and $C_{\zeta,D}$ denotes a constant depending on $\zeta$ and $D$ . For the following it will be convenient to abbreviate

Next, we observe that (3.30) and (3.5) imply

for all $n\leqslant\varphi^{\zeta}$ and $N$ large enough. Therefore choosing $D\equiv D_{\zeta}$ large enough we get from (3.40)

Therefore (3.33) follows using (3.35) if we can prove that

for all $s,t$ . We check that all terms on the left-hand side of (3.41) are bounded, for all $s,t\geqslant 0$ , by the right-hand side of (3.41). The first term is trivial: $N^{-\max\{4,s,t\}/2}\leqslant N^{-2}$ . The second term is bounded by

The third term is bounded similarly. Finally, the last term is bounded by

where $E$ denotes a quantity bounded by the three previous terms. This completes the proof of (3.41), and hence of (3.33). ∎

What remains is to prove the following elementary result.

By convexity of the function $x\mapsto x^{m}$ we have, for any $\lambda\in(0,1)$ ,

Choosing $\lambda=1/m$ yields the claim. ∎

We now conclude the proof of Proposition 3.1. By polarization and linearity, it is enough to prove the following result.

For the GOE/GUE matrix $H_{0}$ we get from Theorem 3.6, as in the proof of Lemma 3.9, that

In order to perform the comparison step, we write, similarly to (3.32),

where $\mathcal{B}$ depends on the randomness only through $Q$ and the first three moments of $V_{ab}$ , and

Using Lemma 3.9, (3.4), and (3.5) we find that the right-hand side of (3.44) is bounded by

Therefore (3.43) and (3.44) yield (3.42), exactly as in the paragraph following (3.34).

What remains therefore is to prove (3.44). Using (3.37), (3.5), and Lemma 3.10 we get, for arbitrary $D>0$ ,

with $2\zeta$ -high probability. Therefore we get, similarly to (3.40),

with $2\zeta$ -high probability, where we used (3.30), $N^{-1/2}\leqslant\Psi$ , and Lemma 3.10. Choosing $D>0$ large enough and recalling (3.41) yields (3.44). (We omit the details of the analysis on the low-probability event, which are similar to those following (3.40).) This concludes the proof of Lemma 3.11. ∎

Proof of Theorem 2.2, Case B

In this section we prove Theorem 2.2 in the case B, i.e. we impose no condition on the third moments of the entries of $H$ , and $\Psi(z)$ satisfies (2.9). By Markov’s inequality, it suffices to prove the following result.

The rest of this section is devoted to the proof of Proposition 4.1. We take over the notation of Section 3, which we use throughout this section without further comment.

The following (trivial) observation will be needed in the next section: The constant $C_{0}$ may be increased at will without changing $C_{\zeta}$ in (4.2).

The main technical estimate behind the proof of Lemma 4.2 is the following lemma. Recall the setup (3.21) of the Green function comparison, and in particular the definitions (3.23).

Fix $\zeta>0$ . Then there are constants $C_{0}$ and $C_{1}$ , both depending on $\zeta$ , such that if (2.9) holds with constant $C_{0}$ then we have the following. For any $a,b$ we have

Before proving Lemma 4.3, we use it to complete the proof of Lemma 4.2.

Let $B\subset\{1,\dots,N\}^{2}$ denote the subset

Now (4.2) follows from (4.3) and (4.5), by repeating the argument after (3.34). ∎

Before proving Lemma 4.3, we record the following lower bound on $\eta$ .

The claim follows immediately from $(N\eta)^{-1}\leqslant\Psi\leqslant\varphi^{-C_{0}/3}N^{-1/6}$ . ∎
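The statement of the lemma is not reproduced above. As a sketch, the lower bound on $\eta$ that the quoted chain of inequalities yields is obtained by rearranging its two outer terms:

```latex
\begin{align*}
\frac{1}{N\eta} \leqslant \varphi^{-C_{0}/3} N^{-1/6}
  \quad\Longleftrightarrow\quad
  \eta \geqslant \varphi^{C_{0}/3} N^{-1} N^{1/6} = \varphi^{C_{0}/3} N^{-5/6}.
\end{align*}
```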

Note that the proof of (3.33) did not use the assumption (2.8). In particular, all statements in the proof of Lemma 3.9 after (3.35) remain true in the case B. By (3.33), it is enough to prove

for $m=1,2,3$ as well as, assuming (4.4),

for $m=1,2,3$ . In order to prove (4.7) and (4.8), we distinguish four cases depending on $m$ and whether $a=b$ . Recall from (3.35) that

Case (i): $a=b$ and $m\leqslant 3$ . Similarly to (3.37), we find

with $2\zeta$ -high probability, where we used that $1\leqslant m\leqslant 3$ . Therefore Lemma 3.10 yields

which is (4.8). In particular, we have also proved (4.7). Here we omit the details of the estimate on the event of low probability, which are analogous to those following (3.40).

Case (ii): $a\neq b$ and $m=3$ . By (4.9), we have $s=t=3$ . From (3.37) we get

with $2\zeta$ -high probability. Together with (3.4) and (3.39), this yields

with $2\zeta$ -high probability and for any $D>0$ . Choosing $D$ and $C_{0}$ in (2.9) large enough, we get from (2.1), (4.6), Lemma 3.10, and $N^{-1/2}\leqslant\Phi$ that

with $2\zeta$ -high probability. Now (4.8), and hence also (4.7), follows easily (we omit the details of the analysis on the low-probability event).

Case (iii): $a\neq b$ and $m=2$ . Consider first the case $s=t=2$ . Then $A_{2,3,2,2}$ (see (3.36) and (3.31)) is a finite sum of $O(1)$ terms of the form

(The other terms can be obtained from (4.13) by permutation of indices and complex conjugation of factors.) We shall estimate the contribution of $X_{1}$ ; the other terms are dealt with in exactly the same way. Note the presence of an off-diagonal resolvent matrix element $R_{ba}$ , as required by the condition $s=t=2$ . From (3.27) and (4.12) we get, with $m=s=t=2$ , that

with $2\zeta$ -high probability. Note the factor $\Psi$ arising from the estimate of $R_{ba}$ . Choosing $D$ and $C_{0}$ large enough, and recalling (2.9), we find using Lemma 3.10 that

with $2\zeta$ -high probability. This yields (4.8) and hence also (4.7).

Let us therefore consider the case $s=3$ and $t=1$ . (The case $s=1$ and $t=3$ is estimated in the same way.) Using the bounds $\Phi\geqslant(N\eta)^{-1}$ and $\Phi\geqslant N^{-1/2}$ , we find

with $2\zeta$ -high probability, for $D$ and $C_{0}$ large enough. This yields (4.7) in the case $s=3$ and $t=1$ .

In order to prove the stronger bound (4.8) in the case $s=3$ and $t=1$ , we note that (3.29), (3.4), (3.5), and the assumption (4.4) yield

(Again, the other terms can be obtained from $X_{2}$ by permutation of indices and complex conjugation of factors.) We shall show that

We split $R_{bb}=(R_{bb}-m)+m$ in the definition of $X_{2}$ . The first resulting term is estimated, using (3.27), by

Now we observe that, using the bound (3.27), we may repeat the proof of Lemma 3.8 to the letter to find that its statement holds with $(G,\mathcal{G})$ replaced with $(R,\mathcal{R})$ . Thus we find

with $2\zeta$ -high probability, where in the second step we used (3.39) and (4.4), and in the last step (3.5). Using (3.27), (4.4), and $\Phi\geqslant(N\eta)^{-1}$ , we therefore find

with $2\zeta$ -high probability, for any $D\geqslant 0$ . Therefore (3.39) and (4.16) yield

with $2\zeta$ -high probability. Using (2.9), (4.6), and Lemma 3.10, we find that the right-hand side is bounded by

In order to compare the quantities in the brackets, we use (3.6), (3.27), and (4.16) to get

with $2\zeta$ -high probability. In particular, we get from (3.39) and (4.16) that

with $2\zeta$ -high probability, where in the last step we used (4.21) and $n\leqslant\varphi^{\zeta}$ . Now (4.23) follows easily for large enough $C_{0}$ in (2.9), using (2.9) and (4.6). This concludes the proof of (4.18) and hence of (4.17).

Case (iv): $a\neq b$ and $m=1$ . Similarly to (4.15), one easily finds the weak bound (4.7). Let us therefore assume (4.4) and prove (4.8). It suffices to prove that

where $X_{3}$ stands for any of the following expressions:

Here we used that $h_{ab}$ and $h_{ba}$ are independent of $R$ . (Up to an immaterial renaming of indices and complex conjugation, all terms in $A_{1,3}$ are covered by one of these three cases.) Applying the splittings $R_{aa}=m+(R_{aa}-m)$ and $R_{bb}=m+(R_{bb}-m)$ , we find that it suffices to prove (4.27) for $X_{3}$ being any of

Next, applying the splitting (4.19) to the last line, we find that it suffices to prove (4.27) for $X_{3}$ being any of

For $X_{3}$ in (4.28a), we find from (3.27), (4.16), and (4.22) that

with $2\zeta$ -high probability, from which (4.27) easily follows using (2.9), (4.6), (3.39), and Lemma 3.10, having chosen $D$ and $C_{0}$ in (2.9) large enough.

Using (3.12), (3.4), (3.6), and (3.27), we find

with $2\zeta$ -high probability. For the second part of $X_{3}$ resulting from the splitting of $R_{ab}$ , we therefore get the estimate

with $2\zeta$ -high probability, where we used (4.26), (2.1), (3.12), (3.6), and (4.24). Together with (3.6), (3.27), (4.16), and (4.4), we therefore find

with $2\zeta$ -high probability. Recalling (4.21), (4.25), Lemma 3.10, and the usual rough estimate on the complementary low-probability event, a telescopic estimate in (4.30) therefore gives

We deal with the last term by applying (3.6) twice, followed by

itself an immediate consequence of (3.6). This gives the graded expansion

with $2\zeta$ -high probability. Thus we write

From (4.31), (4.33), (3.39), and Lemma 3.10 we therefore get

The second summand of (4.35) consists of $n$ terms of the form

Recalling (4.33), we estimate this as above by

What remains is to estimate the third summand in (4.35). From (4.33) and (4.31) we get

We now conclude the proof of Proposition 4.1. By polarization and linearity, it is enough to prove the following result.

for $m=1,2,3$ . Here $C_{1}$ is a large enough constant depending on $\zeta$ .

Assuming that (4.37) and (4.39) have been proved, we get the claim (4.36) from (3.44) and Lemma 4.3 applied to $S$; the details are identical to those of the proof of Lemma 4.2 and the argument following (3.34).

The proof of (4.37) and (4.39) is similar to the proof of (4.7) and (4.8). The key input is the a priori bound

with $2\zeta$ -high probability, which follows from (4.2) and Markov’s inequality. Throughout the proof, we shall consistently (and without further mention) make use of the inequality

which follows from the elementary inequality $x^{m}y^{n-m}\leqslant x^{n}+y^{n}$ for $x,y\geqslant 0$ , Lemma 3.10, and the estimate
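For completeness, here is a one-line justification of the elementary inequality used here. It assumes $0\leqslant m\leqslant n$, which is the range in which the inequality is applied.

```latex
x^{m} y^{n-m}
\;\leqslant\; \max\{x,y\}^{m}\,\max\{x,y\}^{n-m}
\;=\; \max\{x,y\}^{n}
\;\leqslant\; x^{n} + y^{n},
\qquad x,y \geqslant 0,\quad 0 \leqslant m \leqslant n.
```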

with $2\zeta$ -high probability (as follows from (3.30)). Moreover, as in (4.16), we find that (4.38) implies

As in the proof of Lemma 4.3, we consider four cases.

Case (i): $a=b$ and $m\leqslant 3$ . This is easily dealt with using (3.45); we omit further details.

Case (ii): $a\neq b$ and $m=3$ . Recall that in this case we have $t=s=3$ . From (4.11) we get

with $2\zeta$ -high probability. Therefore using (4.40), (3.4), and $\Psi\geqslant cN^{-1/2}$ we get

with $2\zeta$ -high probability, where in the last step we used (2.9). Choosing $D$ large enough yields (4.39), and hence also (4.37).

Case (iii): $a\neq b$ and $m=2$ . In the case $s=t=2$ , the estimate is similar to the estimate of $X_{1}$ in (4.13). Using (4.40), (3.4), and $\Psi\geqslant cN^{-1/2}$ we get

with $2\zeta$ -high probability, from which (4.39), and hence also (4.37), easily follows.

Next, consider the case $s=3$ and $t=1$ . In order to prove (4.37), we estimate using (4.40) and (3.29), similarly to (4.15),

with $2\zeta$-high probability, from which (4.37) follows. Let us therefore prove (4.39), assuming (4.38). Using (4.41) and (4.40), we find

with $2\zeta$ -high probability. We need to prove that

As for (4.18), by splitting $R_{bb}=(R_{bb}-m)+m$ and using (3.27), we find that it is enough to prove

As for (4.18), we use the splitting (4.19). Using (3.27), (4.40), and (4.6), we find that the bounds

Case (iv): $a\neq b$ and $m=1$ . In order to prove (4.37), we use (4.40) to get

with $2\zeta$ -high probability, from which (4.37) easily follows using $\Psi\geqslant N^{-1/2}$ .

As for (4.27), in order to prove (4.37) and (4.39) it suffices to prove the following claim. For $X_{3}$ being any expression in (4.28a) – (4.28c), we have

Note that from (4.20) and (4.40) we get that

If $X_{3}$ is any expression in (4.28a), we get from Lemma 3.8, (3.27), (4.40), and (4.48) that

with $2\zeta$ -high probability. Now (4.47), and in particular (4.46), follows easily (note that we did not assume (4.38)).

Next, let $X_{3}$ be an expression in (4.28b). From Lemma 3.8, (3.27), (4.40), and (4.48) we get

with $2\zeta$ -high probability. Then the argument from the proof of Lemma 4.2 can be applied almost unchanged, and we get (4.47) assuming (4.38). ∎

Proof of Theorems 2.3 and 2.5

By Lemma 3.2, if $\eta\leqslant\kappa$ and $\lvert E\rvert>2$ then the control parameter on the right-hand side of (2.10) can also be expressed as

where $\kappa\equiv\kappa_{E}$ was defined in (3.2).

By polarization and linearity, it is enough to prove that

It remains therefore to establish (5.2) when $0\leqslant\eta\leqslant\eta_{0}$ . Define

By (5.1) and (5.2) at $z_{0}$ , it is enough to prove that

which, by Lemma 3.2, implies that $m^{\prime}\asymp(\kappa+\eta)^{-1/2}=O(\kappa^{-1/2})$ . Therefore we get

Next, by Theorem 3.7 we have $E\geqslant\lambda_{N}+\eta_{0}$ with $\zeta$ -high probability provided $C_{1}$ is large enough. Therefore, since $\eta\leqslant\eta_{0}\leqslant E-\lambda_{N}\leqslant E-\lambda_{\alpha}$ with $\zeta$ -high probability for all $\alpha\leqslant N$ , we get

with $\zeta$ -high probability, by (5.2) at $z_{0}$ and the estimate $\operatorname{Im}m(z_{0})\leqslant CN^{-1/2}\kappa^{-1/4}$ . Finally, we estimate the real part from

with $\zeta$ -high probability, where in the last step we used that $\eta_{0}\leqslant E-\lambda_{N}$ . Combining (5.6) and (5.7) completes the proof of (5.4). ∎

We begin with (2.14), whose proof is immediate. Using Theorem 2.2 with Condition A and Remark 2.4, we find

with $\zeta$ -high probability, where we used Theorem 3.7 to ensure that $\lambda_{\alpha}\in[-\Sigma,\Sigma]$ with $\zeta$ -high probability. Choosing $\eta=\varphi^{\zeta}N^{-1}$ yields (2.14).

where $\gamma_{\alpha}$ is the classical location of the $\alpha$ -th eigenvalue defined in (3.17). Then we get

where in the first step we used Theorem 3.7 to conclude that $(\lambda_{\alpha}-E)^{2}\leqslant\varphi^{C_{\zeta}}\eta^{2}$ for $a\leqslant\alpha\leqslant b$ . In order to invoke Theorem 2.2 with Condition B, we have to satisfy (2.9). Recalling Lemma 3.2, we find that (2.9) holds provided that

where we abbreviated $\kappa\equiv\kappa_{E}$ . From (3.17) we get

for $\alpha\leqslant N/2$ , from which we deduce, recalling $E=\gamma_{\alpha}$ ,

Since $b^{2/3}-a^{2/3}\geqslant b^{-1/3}(b-a)/2$ , we find that (5.9), and hence (2.9), holds under the condition (2.12).
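The elementary inequality invoked here follows from the mean value theorem. The sketch below assumes $0\leqslant a<b$, which is the situation for the indices $a\leqslant\alpha\leqslant b$ considered in this proof; the intermediate point $\xi$ is introduced only for this illustration.

```latex
b^{2/3} - a^{2/3}
\;=\; \tfrac{2}{3}\,\xi^{-1/3}\,(b-a)
\quad \text{for some } \xi \in (a,b),
\qquad\text{hence}\qquad
b^{2/3} - a^{2/3}
\;\geqslant\; \tfrac{2}{3}\, b^{-1/3}\,(b-a)
\;\geqslant\; \tfrac{1}{2}\, b^{-1/3}\,(b-a).
```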

Therefore we may apply Theorem 2.2 to the right-hand side of (5.8) to get

with $\zeta$ -high probability, where we used Lemma 3.2. The claim now follows from the elementary inequalities

For future use, we record the following consequence of Theorem 2.5 which is useful in combination with dyadic decompositions. For any integer $K\leqslant N/4$ we have

Eigenvalue locations: proof of Theorem 2.7

We begin by collecting a few well-known tools from linear algebra, on which our analysis of the deformed spectrum relies.

We use the following representation of the eigenvalues of $\widetilde{H}$ , which was already used in several papers on finite-rank deformations of random matrices SoshPert ; BGGM1 ; BGGM2 ; BGN .
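As a numerical illustration of this kind of representation, the following sketch checks the standard determinant criterion on a small example. It rests on assumptions not quoted from the text: the deformation is taken to be $\widetilde{H}=H+VDV^{*}$ with $V$ having orthonormal columns, the resolvent is $G(x)=(H-x)^{-1}$, and the $k\times k$ matrix is $M(x)=D^{-1}+V^{*}G(x)V$. The function name `deformation_matrix` and all parameter values are hypothetical.

```python
import numpy as np

def deformation_matrix(H, V, d, x):
    """Return M(x) = D^{-1} + V^* G(x) V with G(x) = (H - x)^{-1}.

    Assumed convention (not quoted from the paper): the deformed matrix is
    H + V diag(d) V^*, and all entries of d are nonzero.
    """
    N = H.shape[0]
    G = np.linalg.inv(H - x * np.eye(N))
    return np.diag(1.0 / np.asarray(d, dtype=float)) + V.conj().T @ G @ V

rng = np.random.default_rng(0)
N, k = 200, 2

# GOE-like matrix with off-diagonal variance 1/N (spectrum close to [-2, 2]).
A = rng.standard_normal((N, N))
H = (A + A.T) / np.sqrt(2 * N)

# Orthonormal columns for V and a rank-2 deformation with |d_i| > 1.
V, _ = np.linalg.qr(rng.standard_normal((N, k)))
d = np.array([3.0, -2.5])
H_def = H + V @ np.diag(d) @ V.T

# Largest eigenvalue of the deformed matrix: an outlier, far from sigma(H).
top = np.linalg.eigvalsh(H_def)[-1]

# M(x) should be (numerically) singular at x = top and regular away from it.
sv_at_outlier = np.linalg.svd(deformation_matrix(H, V, d, top), compute_uv=False)[-1]
sv_generic = np.linalg.svd(deformation_matrix(H, V, d, top + 0.5), compute_uv=False)[-1]
print(sv_at_outlier, sv_generic)
```

The check is restricted to an outlier on purpose: there $H-x$ is well conditioned, so the smallest singular value of $M(x)$ can be computed reliably.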

For the convenience of the reader, we give the simple proof. The claim follows from the computation

We shall also make use of the well-known Weyl’s interlacing property, summarized in the following lemma.
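The interlacing property for a rank-one perturbation can be checked numerically. The sketch below assumes the standard form of Weyl's interlacing for $d\geqslant 0$ and eigenvalues in increasing order, namely $\lambda_{\alpha}(H)\leqslant\lambda_{\alpha}(H+dvv^{*})\leqslant\lambda_{\alpha+1}(H)$; the helper `check_interlacing` and the tolerance are hypothetical choices made for this illustration.

```python
import numpy as np

def check_interlacing(H, v, d, tol=1e-10):
    """Check lam_a <= mu_a <= lam_{a+1} for mu = spec(H + d v v^T), d >= 0.

    Eigenvalues are in increasing order; tol absorbs rounding errors.
    """
    lam = np.linalg.eigvalsh(H)
    mu = np.linalg.eigvalsh(H + d * np.outer(v, v))
    lower_ok = np.all(lam <= mu + tol)            # eigenvalues move up
    upper_ok = np.all(mu[:-1] <= lam[1:] + tol)   # but not past the next one
    return bool(lower_ok and upper_ok)

rng = np.random.default_rng(1)
N = 100
A = rng.standard_normal((N, N))
H = (A + A.T) / np.sqrt(2 * N)
v = rng.standard_normal(N)
v /= np.linalg.norm(v)

print(check_interlacing(H, v, 0.5))   # positive rank-one deformation
print(check_interlacing(H, v, 5.0))   # strong deformation, one outlier
print(check_interlacing(H, v, -1.0))  # wrong sign: the lower bound fails
```

For a negative $d$ the roles of $H$ and $H+dvv^{*}$ are exchanged, which is why the last call reports a violation of the inequality as stated for $d\geqslant 0$.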

6.2 Warmup: the rank-one case

For the following we note the elementary estimate

Fix $\zeta>0$ . Then there is a constant $C_{\zeta}$ such that the following holds. For $0\leqslant d\leqslant 1$ we have

with $\zeta$ -high probability. For $1\leqslant d\leqslant\Sigma-1$ we have

By symmetry, an analogous result holds for $d\leqslant 0$ .
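The two regimes can be observed in a small simulation. This is only a sketch under stated assumptions: $H$ is sampled as a GOE-like matrix with off-diagonal variance $1/N$, the deformation is $d\,vv^{*}$ with $v$ the first coordinate vector, and the outlier is compared with $\theta(d)=d+d^{-1}$ as described in the introduction. The function names and parameter values are hypothetical, and no error rates are claimed.

```python
import numpy as np

def largest_eigenvalue(N, d, seed=0):
    """Largest eigenvalue of H + d v v^T for a GOE-like H and v = e_1."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((N, N))
    H = (A + A.T) / np.sqrt(2 * N)   # off-diagonal variance 1/N
    v = np.zeros(N)
    v[0] = 1.0
    return np.linalg.eigvalsh(H + d * np.outer(v, v))[-1]

def theta(d):
    """Predicted outlier location d + 1/d for d > 1."""
    return d + 1.0 / d

N = 500
supercritical = largest_eigenvalue(N, 2.0)   # expected near theta(2) = 2.5
subcritical = largest_eigenvalue(N, 0.5)     # expected to stay near the edge 2
print(supercritical, theta(2.0))
print(subcritical)
```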

The key identity (here we ignore the possibility that $\mu_{N}\in\sigma(H)$; since the law of $H$ is absolutely continuous, it is easy to check that the interlacing inequalities in Lemma 6.2 are strict with probability one, see e.g. the proof of Lemma 6.7) for the proof is

where in the last step we used $d\geqslant 1+\varphi^{D}N^{-1/3}$ . In order to prove the second relation of (6.4), we differentiate (5.5) and use Lemma 3.2 to get

Therefore we get from (6.5) and the mean value theorem applied to $m^{\prime}$ that

Therefore (6.4) follows from $m^{\prime}(\theta(d))\asymp(d-1)^{-1}$ .
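The asymptotics of $m^{\prime}(\theta(d))$ can be made explicit. The computation below assumes the usual conventions, which are not restated in this section: $m$ solves $m(z)+m(z)^{-1}+z=0$ with $\lvert m(z)\rvert<1$ outside $[-2,2]$, and $\theta(d)=d+d^{-1}$.

```latex
% For d > 1, the root of m + 1/m + \theta(d) = 0 with |m| < 1 is
m(\theta(d)) = -\frac{1}{d}.
% Differentiating m + m^{-1} + z = 0 in z gives m'(1 - m^{-2}) + 1 = 0, so
m'(z) = \frac{m(z)^{2}}{1 - m(z)^{2}},
\qquad
m'(\theta(d)) = \frac{d^{-2}}{1 - d^{-2}} = \frac{1}{(d-1)(d+1)}.
% Since 2 < d + 1 \leqslant \Sigma for 1 < d \leqslant \Sigma - 1,
\frac{1}{\Sigma\,(d-1)} \;\leqslant\; m'(\theta(d)) \;\leqslant\; \frac{1}{2\,(d-1)}.
```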

Now choose $D$ large enough that $x_{-}(d)\geqslant 2+\varphi^{C_{1}}N^{-2/3}$ for $d\geqslant\varphi^{D}N^{-2/3}$ . Thus (6.3) and (6.4) yield

What remains is the case $d\leqslant 1-\varphi^{D}N^{-1/3}$. Choose $x:=2+\varphi^{C_{1}}N^{-2/3}$ where $C_{1}$ is a large constant to be chosen later. For large enough $C_{1}$ we find from Theorem 2.3

with $\zeta$ -high probability. From (3.3) we find

with $\zeta$ -high probability. (The first inequality follows from Lemma 6.2.)

Next, abbreviate $q:=\varphi^{C_{2}}$ for some large constant $C_{2}$ to be chosen later. Using Theorem 3.7 we estimate, for $\lambda_{N}\leqslant\mu_{N}\leqslant x$ and large enough $C_{2}$,

with $\zeta$ -high probability. In the second inequality we estimated the contribution of the eigenvalues $\alpha\geqslant N/2$ using the dyadic decomposition

and the delocalization estimate (5.11). A similar (in fact easier) dyadic decomposition works for the remaining eigenvalues $\alpha<N/2$ and yields the last term of the second line. Moreover, we have

with $\zeta$ -high probability, by Theorems 3.7 and 2.5. Recalling (6.7) and (6.8), we have therefore proved that

6.3 The permissible region

The rest of this section is devoted to the proof of Theorem 2.7.

We choose an event, denoted by $\Xi$ , of $\zeta$ -high probability on which the following statements hold.

All statements of Theorems 2.2, 2.3, 2.5, and 3.7 hold.

We note that such a $\Xi$ exists. As explained in Section 6.1, we assume without loss of generality that the law of $H$ is absolutely continuous. Then conditions (i) and (ii) hold almost surely; we omit the standard proof. That condition (iii) holds with $\zeta$ -high probability is a consequence of Theorems 2.2, 2.3, 2.5, and 3.7 (see also Remark 2.4).

For the whole remainder of the proof of Theorem 2.7, we choose and fix an arbitrary realization $H\equiv H^{\omega}$ with $\omega\in\Xi$ . Thus, the randomness of $H$ only comes into play in ensuring that $\Xi$ is of $\zeta$ -high probability. The rest of the argument is entirely deterministic.

For $\widetilde{C}_{2}>0$ define the sets

Let $\widetilde{K}>0$ denote a constant to be chosen later, and define

We shall only consider eigenvalues of $\widetilde{H}$ in $S(\widetilde{K})$ for some large but fixed $\widetilde{K}$ .

Let $\widetilde{C}_{3}>0$ denote some large constant to be chosen later. Define the intervals

First we prove (6.11). By definition of $\Xi$ (see Theorem 3.7), we find that (6.11) holds if

First we consider the case $x\geqslant 2+\varphi^{\widetilde{C}_{2}}N^{-2/3}$ . On $\Xi$ we have

provided $\widetilde{C}_{2}$ is large enough (see Theorem 3.7). In particular, by (6.15) and the definition of $\Xi$ , we have $x\notin\sigma(H)$ . By increasing $\widetilde{C}_{2}$ if necessary we may assume that $\widetilde{C}_{2}\geqslant C_{1}$ , where $C_{1}$ is the constant from Theorem 2.3. Therefore we get from Theorem 2.3 and Lemma 3.2 that

for all $y\in[-\Sigma,\Sigma]$ . (We include an imaginary part $y\neq 0$ for later applications of (6.16); for the purposes of this proof we set $y=0$ .)

Let $i\in\{1,\dots,k^{+}\}$ . Then we may repeat to the letter the argument in the proof of Theorem 6.3 leading to (6.4). Provided that $\widetilde{C}_{3}\geqslant C_{\zeta}+2$ , where $C_{\zeta}$ is the constant in (6.16), we therefore get that

for some $c>0$ depending on $\Sigma$ . It is now easy to put all the estimates associated with $i=1,\dots,k$ together. Recalling (6.16) and choosing $\widetilde{C}_{2}$ large enough yields, for $C_{\zeta}$ denoting the constant from (6.16),

We conclude (here we use the well-known fact that if $\lambda\in\sigma(A+B)$ then $\operatorname{dist}(\lambda,\sigma(A))\leqslant\lVert B\rVert$) from (6.16) that $M(x)$ is regular if (6.17) holds.

Our aim is to prove that $M(x)$ is regular for any $x$ satisfying (6.19). Once this is done, the regularity of $M(x)$ for $x$ satisfying (6.18) or (6.19) will imply (6.12). Choose $\eta:=N^{-2/3}\widetilde{\psi}^{-1}$ and estimate

where in the second step we used (6.19). Therefore, by definition of $\Xi$ (see also Theorem 2.2) and Lemma 3.2, we get (recall that $\widetilde{\psi}\geqslant 1$)

This implies, for any $x$ satisfying (6.19), that

for all $i$ , we find that $M(x)$ is regular provided $\widetilde{C}_{2}$ is chosen large enough that

This completes the analysis of the case (6.19). The case

is handled similarly. This completes the proof. ∎

6.4 The initial configuration

The functions $g$ and $f_{N}$ are holomorphic on and inside $\mathcal{C}$ (for large enough $N$ ). Moreover, by construction of $\mathcal{C}$ , the function $g$ has precisely one zero inside $\mathcal{C}$ , namely at $z=\theta(d_{i}^{+})$ . Next, we have

where the second inequality follows from (6.16). The claim now follows from Rouché’s theorem. The eigenvalues near $\theta(d_{i}^{-})$ , $i=1,\dots,k^{-}$ , are handled similarly. ∎

Before moving on, we record the following result on rank-one deformations.

Now the claim follows by approximating an arbitrary matrix $A$ by matrices in $E$ , and by using the Lipschitz continuity of the map $A\mapsto\lambda_{i}(A)$ . ∎

We now deal with the extremal bulk eigenvalues.

Similarly, we have for all $\alpha$ satisfying $\lambda_{\alpha}\leqslant-2+\varphi^{\widetilde{K}}N^{-2/3}$ that

We only prove the first statement; the proof of the second one is almost identical. Abbreviate $\delta^{\prime}:=\delta/2$.

Before sketching the proof of the above claim, we show how to use it to conclude the argument. By Proposition 6.6, there are at least $k^{+}$ eigenvalues in $(x_{+}^{N},\infty)$ . Recall that by assumption $k^{0}=0$ , i.e. $\lvert d_{i}\rvert>1$ for all $i$ . Therefore using interlacing, i.e. a repeated application of Lemma 6.2, we conclude that there are exactly $k^{+}$ eigenvalues in $(x_{+}^{N},\infty)$ . From the above claim we find that there is at least one eigenvalue in $[x_{-}^{N},x_{+}^{N}]$ . Using interlacing we find that there are at most $k^{+}+1$ eigenvalues in $[x_{-}^{N},\infty)$ . We conclude that there is exactly one eigenvalue in $[x_{-}^{N},x_{+}^{N}]$ . We may move on to the $(N-1)$ -th eigenvalue: we have proved that there are (i) at least $k^{+}+1$ eigenvalues in $[x_{-}^{N},\infty)$ (from the previous step), (ii) at least one eigenvalue in $[x_{-}^{N-1},x_{+}^{N-1}]$ (from the claim), and (iii) at most $k^{+}+2$ eigenvalues in $[x_{-}^{N-1},\infty)$ (from interlacing); we conclude that there is exactly one eigenvalue in $[x_{-}^{N-1},x_{+}^{N-1}]$ . Continuing in this fashion concludes the proof.

Let us now complete the sketch of the proof of the above claim. Assume for simplicity that $H$ and $\widetilde{H}$ have no common eigenvalues. From Lemma 6.1 we find that $x$ is an eigenvalue of $\widetilde{H}$ if and only if the matrix $M(x)$ , defined in (6.14), is singular. Thus, we have to prove that there is an $x\in[x_{-}^{\alpha},x_{+}^{\alpha}]$ such that $M(x)$ is singular. The idea of the argument is to do a spectral decomposition of $G$ , and resum all terms not associated with $\lambda_{\alpha}$ to get something close to $\operatorname{Re}m(x)\approx-1$ . More precisely, we write

Now we turn towards the detailed proof in the general case. Since eigenvalues of $H$ may be separated by less than $N^{-1+\delta^{\prime}}$ , we begin by clumping together eigenvalues of $H$ which are separated by less than $N^{-1+\delta^{\prime}}$ . More precisely, we construct a partition $\mathcal{A}=(A_{q})_{q}$ of $\{1,\dots,N\}$ , defined as the finest partition in which $\alpha$ and $\beta$ belong to the same block if $\lvert\lambda_{\alpha}-\lambda_{\beta}\rvert\leqslant N^{-1+\delta^{\prime}}$ . Thus, each block consists of a sequence of consecutive integers. We order the blocks of $\mathcal{A}$ in a “decreasing” fashion, in such a way that if $q<r$ then $\lambda_{\alpha}>\lambda_{\beta}$ for all $\alpha\in A_{q}$ and $\beta\in A_{r}$ .
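The clumping step can be sketched in code. For sorted eigenvalues, the finest partition with the stated property is obtained by cutting wherever two consecutive eigenvalues are separated by more than the threshold. The function name `eigenvalue_blocks` is hypothetical, indices are 0-based (the text uses $1,\dots,N$), and the input is assumed to be sorted in increasing order.

```python
def eigenvalue_blocks(eigs, gap):
    """Partition indices of increasingly sorted eigenvalues into blocks.

    Two consecutive eigenvalues share a block when they differ by at most
    `gap`; by transitivity each block is a run of consecutive indices.
    Blocks are returned in "decreasing" order, largest eigenvalues first.
    """
    if len(eigs) == 0:
        return []
    blocks = []
    current = [0]
    for i in range(1, len(eigs)):
        if eigs[i] - eigs[i - 1] <= gap:
            current.append(i)        # close to its predecessor: same block
        else:
            blocks.append(current)   # gap too large: start a new block
            current = [i]
    blocks.append(current)
    return blocks[::-1]

# Toy spectrum with threshold 0.1 (standing in for N^{-1 + delta'}).
print(eigenvalue_blocks([0.0, 0.05, 0.2, 0.21, 0.5], 0.1))
```

Note that a block may have diameter larger than the threshold, since membership is decided by chains of close neighbours rather than by pairwise distances.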

We now derive a bound on the size of the blocks near the edge. Roughly, we shall show that if $\lambda\in A_{q}$ and $\lambda\geqslant 2-\varphi^{C}N^{-2/3}$ then $\lvert A_{q}\rvert\leqslant\varphi^{C^{\prime}}$ . Let $C_{4}$ be a large constant to be chosen later. Now choose $\alpha$ and $\beta$ satisfying $0\leqslant\alpha\leqslant\beta\leqslant\varphi^{C_{4}}$ such that $N-\alpha$ and $N-\beta$ belong to the same block. Then by definition of $\Xi$ and $\mathcal{A}$ we have

where we used the statement of Theorem 3.7 and the definition (3.17). Thus we get the condition

We conclude that if $\alpha$ and $\beta$ satisfy $0\leqslant\alpha\leqslant\beta\leqslant\varphi^{C_{4}}$ and $N-\alpha$ and $N-\beta$ belong to the same block, then

Let $\alpha_{*}$ denote the largest integer such that $\lambda_{N-\alpha_{*}}\geqslant 2-\varphi^{\widetilde{K}}N^{-2/3}$ . In particular, by definition of $\Xi$ (see Theorem 3.7) we have

Now we choose $C_{4}\equiv C_{4}(\zeta,\widetilde{K})$ large enough that

Next, define $Q$ through $N-\alpha_{*}\in A_{Q}$ . Therefore we get from (6.23) and (6.24) that any $\alpha\leqslant\varphi^{C_{4}}$ such that $N-\alpha\in A_{Q}$ satisfies

Since blocks are contiguous, we conclude that

for each $q=1,\dots,Q$ . Moreover, by definition of $\Xi$ (see Theorem 3.7), we find

for all $q=1,\dots,Q$ and all $\alpha$ such that $N-\alpha\in A_{q}$ .

Now we are ready for the main argument. Pick $q\in\{1,\dots,Q\}$ and abbreviate

which will serve to count eigenvalues. (Note that $x_{0}^{q}=a^{q}-N^{-1+\delta^{\prime}}/3$ and $x_{1}^{q}=b^{q}+N^{-1+\delta^{\prime}}/3$.) The interval $[x_{0}^{q},x_{1}^{q}]$ contains precisely those eigenvalues of $H$ that are in $A_{q}$, and its endpoints $x_{0}^{q}$ and $x_{1}^{q}$ are at a distance greater than $N^{-1+\delta^{\prime}}/3$ from any eigenvalue of $H$. Thus, $[x_{0}^{q},x_{1}^{q}]$ is the correct generalization of the interval $[x_{-}^{\alpha},x_{+}^{\alpha}]$ from the sketch given at the beginning of this proof.

In order to avoid problems with exceptional events, we add some randomness to $D$ . Recall that $D$ satisfies (6.21). Let $\Delta$ be a $k\times k$ Hermitian random matrix whose upper triangular entries are independent and have an absolutely continuous law supported in the unit disk. For $\varepsilon>0$ define

From now on we use “almost surely” to mean almost surely with respect to the randomness of $\Delta$ . Our main goal is to prove that for each $\varepsilon>0$ , almost surely, there are at least $\lvert A_{q}\rvert$ eigenvalues of $\widetilde{H}^{\varepsilon}$ in $[{x_{0}^{q},x_{1}^{q}}]\setminus\sigma(H)$ . (Having done this, we shall deduce, by taking $\varepsilon\to 0$ , that $\widetilde{H}$ has at least $\lvert A_{q}\rvert$ eigenvalues in $[x_{0}^{q},x_{1}^{q}]$ .)

Then (assuming $x\notin\sigma(H)$ ) we know that $x\in\sigma(\widetilde{H}^{\varepsilon})$ if and only if $M^{\varepsilon}(x)$ is singular. Split

Let $x\in[x_{0}^{q},x_{1}^{q}]$. Similarly to the proof of (6.20), we choose $\eta:=N^{-1+\delta^{\prime}}$ and estimate

where we used that $\lvert x-\lambda_{\alpha}\rvert\geqslant 2N^{-1+\delta^{\prime}}/3$ for $\alpha\notin A_{q}$ . Moreover,

where $R(x)$ is continuous in $x$ and independent of $\Delta$ . Compare this to (6.22) in the sketch given at the beginning of the proof. By Theorem 2.5, for $\alpha\in A_{q}$ we have

Now we give the full proof. Recall that $\lvert d_{i}\rvert>1$ is independent of $N$ for all $i$. Thus we get from (6.26) and (6.27) that, for large enough $N$ and small enough $\varepsilon$, all eigenvalues of $M^{\varepsilon}(x_{0}^{q})$ are negative. (Here we used that $\lvert\lambda_{\alpha}-x_{0}^{q}\rvert\geqslant N^{-1+\delta^{\prime}}/3$ for $\alpha\in A_{q}$.) We shall vary $t$ continuously from $0$ to $1$ and count the number of eigenvalues crossing the origin. Let $L:=\lvert A_{q}\rvert$ and denote by

the values of $t$ at which $x_{t}^{q}\in\sigma(H)$. (Recall that the eigenvalues of $H$ are distinct.) It is also convenient to write $s_{0}=0$ and $s_{L+1}=1$. For $t\in[0,1]\setminus\{s_{1},\dots,s_{L}\}$, let

denote the ordered eigenvalues of $M^{\varepsilon}(x_{t}^{q})$ . We record the following fundamental properties of $e_{1}^{\varepsilon}(t),\dots,e_{k}^{\varepsilon}(t)$ .

For all $i=1,\dots,k$ , we have $e_{i}^{\varepsilon}(0)<0$ for $N$ large enough and $\varepsilon$ small enough (depending on $N$ ).

(In particular, both one-sided limits exist.)

Property (i) was proved after (6.27). Property (ii) follows from (6.26). Property (iii) follows from Lemma 6.7, using (6.26) and the fact that $R(x)$ is continuous.

Moreover, the two following claims are true almost surely.

If $e_{i}^{\varepsilon}(t)=0$ for some $t\in[0,1]\setminus\{s_{1},\dots,s_{L}\}$ then $e_{j}^{\varepsilon}(t)\neq 0$ for all $j\neq i$.

From $(*)$ we conclude that, almost surely, $M^{\varepsilon}(x)$ is singular in at least $L$ points in the set $[x_{0}^{q},x_{1}^{q}]\setminus\sigma(H)$ . Therefore $\widetilde{H}^{\varepsilon}$ has almost surely at least $L$ eigenvalues in $[x_{0}^{q},x_{1}^{q}]$ . Taking $\varepsilon\to 0$ , we find that $\widetilde{H}$ has at least $L=\lvert A_{q}\rvert$ eigenvalues in $[x_{0}^{q},x_{1}^{q}]$ .

What remains is to prove that $\widetilde{H}$ has at most $\lvert A_{q}\rvert$ eigenvalues in $[x_{0}^{q},x_{1}^{q}]$ . We prove this using interlacing, similarly to the corresponding argument given in the sketch at the beginning of the proof. Together with Proposition 6.6, we have proved that there are at least $\lvert A_{1}\rvert+k^{+}$ eigenvalues of $\widetilde{H}$ in $[x_{0}^{1},\infty)$ . By interlacing (i.e. a repeated application of Lemma 6.2), we find that there are at most $\lvert A_{1}\rvert+k^{+}$ eigenvalues of $\widetilde{H}$ in $[x_{0}^{1},\infty)$ . We deduce, again using Proposition 6.6, that there are exactly $\lvert A_{1}\rvert$ eigenvalues of $\widetilde{H}$ in $[x_{0}^{1},x_{1}^{1}]$ .

We have proved that there are at least $\lvert A_{1}\rvert+\lvert A_{2}\rvert+k^{+}$ eigenvalues of $\widetilde{H}$ in $[x_{0}^{2},\infty)$ . Using eigenvalue interlacing, we find that there are at most $\lvert A_{1}\rvert+\lvert A_{2}\rvert+k^{+}$ eigenvalues of $\widetilde{H}$ in $[x_{0}^{2},\infty)$ . We conclude that there are exactly $\lvert A_{2}\rvert$ eigenvalues of $\widetilde{H}$ in $[x_{0}^{2},x_{1}^{2}]$ .

We may now repeat this argument for $q=3,4,\dots,Q$ , to get that $\widetilde{H}$ has exactly $\lvert A_{q}\rvert$ eigenvalues in $[x_{0}^{q},x_{1}^{q}]$ , for $q=1,2,\dots,Q$ . Moreover, by (6.25), we find for any $\alpha\in A_{q}$ that

6.5 Bootstrapping and conclusion of the proof of Theorem 2.7

It is easy to see that such a path exists. Informally, condition (ii) states that if the allowed regions for the outliers $i$ and $j$ do not overlap at time $t=1$ (i.e. the outliers can be distinguished), then they may not overlap at any earlier time.

We continue to work at fixed $N$ and with a fixed realization $H\equiv H^{\omega}$ with $\omega\in\Xi$ . Let $\widetilde{C}_{2}$ and $\widetilde{C}_{3}$ be the constants from Proposition 6.5, and choose $\delta>0$ such that $\widetilde{\psi}\leqslant N^{1/3-\delta}$ . Define

and abbreviate $\mu_{\alpha}(t)=\lambda_{\alpha}(\widetilde{H}(t))$ . By Propositions 6.6 and 6.8, we have that

In order to invoke a continuity argument, we note that Proposition 6.5 yields

for all $t\in[0,1]$. Moreover, since $t\mapsto\widetilde{H}(t)$ is continuous, we find that $\mu_{\alpha}(t)$ is continuous in $t\in[0,1]$ for all $\alpha$.

for all $t\in[0,1]$, and in particular for $t=1$.

where the second inequality follows from the definition of $I_{i}^{+}(\cdot)$ . This yields

where the constant $C$ depends only on $\Sigma$ . Thus we get

where the last inequality follows from (6.13). Repeating this estimate of $\theta(d_{j+1}^{+}(1))-\theta(d_{j}^{+}(1))$ for the remaining $j\in B_{i}$ , we find

What remains is the analysis of the extremal bulk eigenvalues. Once again, we make use of a continuity argument. As before, we only consider positive eigenvalues, $\lambda_{\alpha}\geqslant 2-\varphi^{\widetilde{K}}N^{-2/3}$ for some $\widetilde{K}$ to be chosen below. Note that by interlacing, Lemma 6.2, we have

(using the convention that $\lambda_{\alpha}=+\infty$ for $\alpha>N$ ). Recall the role of $K$ from the assumptions of Theorem 2.7. Therefore using the definition of $\Xi$ (see Theorem 3.7), we find that there is a $\widetilde{K}=\widetilde{K}(K)$ such that if $\alpha\geqslant N-\varphi^{K}$ then

Let now $\alpha$ satisfy $N-\varphi^{K}\leqslant\alpha\leqslant N-k^{+}$ . Using (6.30), (6.31), and Proposition 6.5, we find

for all $t\in[0,1]$. In addition, we know the two following facts about $\mu_{\alpha}(t)$, for all $t\in[0,1]$.

$\mu_{\alpha}(t)$ satisfies the interlacing bound (6.34) for all $t\in[0,1]$.

Let $B_{\alpha}$ be the set of $\beta=1,\dots,N$ such that $\lambda_{\beta}$ and $\lambda_{\alpha}$ are in the same connected component of $I^{0}$ . Thus we conclude from (i) and (ii) that

completes the proof of Theorem 2.7 (recall the definition (6.10)).

Distribution of the outliers: proof of Theorem 2.14

The following proposition reduces the problem to analysing a single explicit random variable.

There is a constant $C_{2}$ , depending on $\zeta$ , such that the following holds. Suppose that

for all $i=1,\dots,k$ . Suppose moreover that for all $i\in O$ (2.24) holds. Recall the definitions (2.16) and (2.17). Then we have for all $i\in O$

Before proving Proposition 7.1, we record the following auxiliary result.

Let $C_{1}$ denote the constant from Theorem 2.3. For any

By symmetry, we may assume that $x\geqslant 0$ . Moreover, (7.2) follows from (7.1) and polarization.

We therefore prove (7.1) for $x\geqslant 0$ . We have

Choose $x\geqslant 2+N^{-2/3}\varphi^{C_{1}}$ and abbreviate $\kappa\equiv\kappa_{x}$ . Thus we get, for $\eta\geqslant\varphi^{\zeta}N^{-1}$ ,

with $\zeta$ -high probability, where in the last step we used Theorem 3.7. (In the proof of Theorem 2.3, the constant $C_{1}$ was chosen large enough for this application of Theorem 3.7; see (5.6).) A similar calculation using the definition (2.4) yields

Therefore we get, using Theorem 2.3 and Lemma 3.2,

with $\zeta$-high probability. Choosing $\eta:=N^{-1/6}\kappa^{3/4}$ yields the claim. ∎

We only prove the claim for the case $d_{i}>1$ ; the case $d_{i}<-1$ is handled similarly.

For $2+\varphi^{C_{1}}N^{-2/3}\leqslant x\leqslant\Sigma$ , where $C_{1}$ is the constant from Theorem 2.3, we define the $k\times k$ Hermitian matrices $A(x)$ and $\widetilde{A}(x)$ through

For the rest of the proof we fix $i\in O$ satisfying $d_{i}>1$. We abbreviate $\theta_{i}:=\theta(d_{i})$. We begin by comparing the eigenvalues of $\widetilde{A}(\theta_{i})$ and $D^{-1}$. Define the eigenvalue index $r\equiv r(i)=1,\dots,k$ through

with $\zeta$ -high probability for $j=1,\dots,k$ . In particular,

with $\zeta$ -high probability. Moreover, (7.4) and the condition (2.24) yield, for $j\neq i$ ,

with $\zeta$ -high probability, provided $C_{2}$ is chosen large enough. We therefore conclude that

with $\zeta$ -high probability, provided $C_{2}$ is large enough.

Next, we compare the eigenvalues of $A(\theta_{i})$ and $\widetilde{A}(\theta_{i})$ using second-order perturbation theory (the first-order correction vanishes by definition of $\widetilde{A}$ and $A$ ). Theorem 2.3 yields

with $\zeta$ -high probability. Therefore (7.6) and nondegenerate second-order perturbation theory yield, for large enough $C_{2}$ ,

Next, we analyse $A(x)$ and make the link to $\mu_{\alpha(i)}$ . From Lemma 7.2 we find

with $\zeta$ -high probability. In particular, we have for all $j=1,\dots,k$ that

with $\zeta$ -high probability, provided that $2+\varphi^{C_{1}}N^{-2/3}\leqslant x,y\leqslant\Sigma$ .

Recall the definition (2.17) of $\alpha(i)$ . From Lemma 6.1 and Theorem 3.7, we know that $\mu_{\alpha(i)}$ is characterized by the property that there is a $q\equiv q(i)\in\{1,\dots,k\}$ such that

with $\zeta$ -high probability. Provided $C_{2}$ is large enough (depending on $C_{3}$ ), it is easy to see from (7.9) that

with $\zeta$ -high probability. Thus we find, using (7.8), (7.9), and (7.10), that for large enough $C_{2}$ we have

with $\zeta$ -high probability. (Here we absorbed the constant $C_{3}$ into $C_{\zeta}$ .)

We now prove that $q=r$ with $\zeta$ -high probability provided $C_{2}$ is large enough. Assume by contradiction that $q\neq r$ . Then we get, using Theorem 2.3 and the condition (2.24), that

with $\zeta$ -high probability. Moreover, (7.8), (7.9), and (7.10) yield

with $\zeta$ -high probability, where in the last step we used (6.5). Together with (7.12), this yields the desired contradiction provided $C_{2}$ is large enough. Hence $q=r$ .

Putting (7.3), (7.11), and (7.7) together, we get

with $\zeta$ -high probability. Thus we find that, for all $x$ between $\theta_{i}$ and $\mu_{\alpha(i)}$ , we have

with $\zeta$ -high probability, where we used (6.5) and (7.9). Using (6.2), (7.10), and (6.5), we conclude that

with $\zeta$ -high probability. The claim now follows for large enough $C_{2}$ , using the identity (6.2). ∎

7.2 The GOE/GUE case

By Proposition 7.1, it is enough to analyse the random variable

and we abbreviated $\theta\;\equiv\;\theta(d)$ . For definiteness, we choose $d>1$ in the following.
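
As a numerical illustration of this setting (it plays no role in the proof), the following Python sketch samples a GOE matrix, adds a rank-one deformation of strength $d>1$ , and compares the largest eigenvalue with $\theta(d)$ , taken here to be $d+d^{-1}$ , the outlier location recalled in the introduction. The matrix size `N`, the value of `d`, the choice of a coordinate vector as the deformation direction, and the function names are assumptions made for the example.

```python
import numpy as np

def goe(N, rng):
    """Sample an N x N GOE matrix normalised so that the spectrum fills [-2, 2]."""
    A = rng.standard_normal((N, N))
    return (A + A.T) / np.sqrt(2 * N)

def top_eigenvalue_rank_one(N, d, rng):
    """Largest eigenvalue of H + d * v v^T for a GOE matrix H and unit vector v."""
    H = goe(N, rng)
    v = np.zeros(N)
    v[0] = 1.0  # any unit vector would do, by orthogonal invariance of GOE
    return np.linalg.eigvalsh(H + d * np.outer(v, v))[-1]

rng = np.random.default_rng(1)
N, d = 400, 2.0
theta = d + 1.0 / d
lam = top_eigenvalue_rank_one(N, d, rng)
print(lam, theta)
```

For $d$ well above $1$ the largest eigenvalue separates from the bulk edge at $2$ and lies close to $\theta(d)$ .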

The following notion of convergence of random variables is convenient for our needs.

Two sequences of random variables, $\{A_{N}\}$ and $\{B_{N}\}$ , are asymptotically equal in distribution, denoted $A_{N}\overset{d}{\sim}B_{N}$ , if they are tight and satisfy

Clearly, $A_{N}\overset{d}{\sim}B_{N}$ if $A_{N}\overset{d}{=}B_{N}$ for all $N$ .

where in the last step we used the boundedness of $f^{\prime}$ . ∎

Let $\{A_{N}\}$ , $\{A_{N}^{\prime}\}$ , $\{B_{N}\}$ , and $\{B_{N}^{\prime}\}$ be sequences of random variables. Suppose that $A_{N}\overset{d}{\sim}A_{N}^{\prime}$ , $B_{N}\overset{d}{\sim}B_{N}^{\prime}$ , $A_{N}$ and $B_{N}$ are independent, and $A_{N}^{\prime}$ and $B_{N}^{\prime}$ are independent. Then

Next, we observe that $A_{N}+B_{N}$ and $A_{N}^{\prime}+B_{N}^{\prime}$ are tight. Therefore, recalling Remark 7.5, we find that it suffices to prove

$f\in C_{c}^{\infty}$ . Denoting by $\hat{f}$ the Fourier transform of $f$ , we find

Let $H$ be a GOE/GUE matrix. Assume that $d$ satisfies (7.14). Then for large enough $C_{2}$ we have

In order to estimate the error term in (7.16), we write

Using (3.6) to estimate $G_{22}^{(1)}-G_{22}$ , as well as Theorem 2.3, Lemma 3.5, and Lemma 3.2, we therefore find that

From (7.16), (7.17), (7.18), and (7.19), we conclude that there exist random variables $\widetilde{R}_{1}$ and $\widetilde{R}_{2}$ satisfying

with $\zeta$ -high probability, the rough bound

In order to infer the distribution of $Y_{1}$ from (7.22), we observe that the random variables $Y_{2}$ and $W$ are independent. Also, $Y_{1}\overset{d}{=}Y_{2}$ . Recalling Theorem 2.3 and (3.6), we find the bounds

with $\zeta$ -high probability, and the rough bounds

Next, let $B$ and $Z_{2}$ be independent random variables whose laws are given by

we find that $Z_{1}\overset{d}{=}Z_{2}$ . Moreover, a standard moment calculation and the definition of $W$ yield

provided $C_{2}$ in (7.14) is large enough. Here we used that

For the induction step, we assume that (7.28) holds for all $k^{\prime}\leqslant k-1$ . From (7.22) we find

We estimate the summands on the left-hand side by

In order to conclude the proof of (7.28), we deduce from (7.26) that

Using the induction assumption (7.28) for $k^{\prime}=k-l$ , (7.29), and the condition $l\geqslant 2$ , we get from (7.31), (7.32), and (7.27) that

for large enough $C_{2}$ . This concludes the proof of (7.28).

for any continuous bounded function $f$ . Next, we estimate

with $\zeta$ -high probability, where in the second step we used Lemma 7.2, (5.5), and Lemma 3.2 to estimate the first term, and Theorem 2.3 and (6.1) to estimate the second term. Therefore

with $\zeta$ -high probability, where in the second step we used (7.29). Therefore (7.33), the fact that $Z\overset{d}{=}Z_{1}$ , and dominated convergence yield

The claim now follows from Lemma 7.10 below. ∎

Let $\{\xi_{N}\}$ be a bounded deterministic sequence. Let $A_{\infty},A_{1},A_{2},\dots$ be random variables such that $A_{N}$ converges weakly to $A_{\infty}$ . Then we have for any bounded continuous function $f$

The claim now follows by dominated convergence. ∎

7.3 The almost-GOE/GUE case

(i) We compare the original Wigner matrix $H$ with $\widehat{H}$ , a Wigner matrix obtained from $H$ by replacing the $(i,j)$ -th entry of $H$ with a Gaussian whenever $\lvert v_{i}\rvert\leqslant\varphi^{-D}$ and $\lvert v_{j}\rvert\leqslant\varphi^{-D}$ .

(ii) We compare the matrix $\widehat{H}$ to a Gaussian matrix.

Step (ii) is performed in this section. To simplify notation, we write $H$ instead of $\widehat{H}$ throughout this section. Step (i) is performed using Green function comparison in Section 7.4 below.

The following shorthand will prove useful.

Let $\{\sigma_{N}\}$ be a bounded positive sequence. If $A_{N}$ and $B_{N}$ are independent random variables with $B_{N}\overset{d}{\sim}\mathcal{N}(0,\sigma_{N}^{2})$ , and if $S_{N}\overset{d}{\sim}A_{N}+B_{N}$ , then we write

As before, we consistently drop the spectral parameter $z=\theta$ from our notation.

where we used that $SH_{0}S^{*}\overset{d}{=}H_{0}$ and the fact that $A$ , $B$ , and $H_{0}$ are independent.

with $\zeta$ -high probability. In order to prove (7.36), write

We consider four cases. First, if $1\leqslant i\neq j\leqslant M$ we find using (3.15) that

with $\zeta$ -high probability. Second, if $1\leqslant i\leqslant M$ we find using (3.13) and (3.14) that

with $\zeta$ -high probability. Third, for $i=M+1$ we have by (3.13)

with $\zeta$ -high probability. Finally, for $1\leqslant i<j=M+1$ we have by (3.15)

with $\zeta$ -high probability. This completes the proof of (7.36).
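
The next step relies on Schur's formula for a block of the resolvent. As a minimal sketch of that identity in generic form, the following Python snippet checks numerically that the upper-left block of $(H-z)^{-1}$ equals $(A-z-B(H_{1}-z)^{-1}B^{*})^{-1}$ , where $A$ is the upper-left block of $H$ , $H_{1}$ the lower-right block, and $B$ the off-diagonal block. The sizes, the spectral parameter, and the names `A`, `B`, `resolvent` are assumptions for the illustration and need not match the block decomposition used in the proof.

```python
import numpy as np

def resolvent(H, z):
    """Return (H - z)^{-1} for a square matrix H and scalar z."""
    return np.linalg.inv(H - z * np.eye(H.shape[0]))

rng = np.random.default_rng(0)
N, m = 8, 3           # full size and size of the upper-left block (hypothetical)
X = rng.standard_normal((N, N))
H = (X + X.T) / np.sqrt(2 * N)   # real symmetric test matrix
z = 3.0                           # spectral parameter outside the bulk [-2, 2]

A = H[:m, :m]         # upper-left block
B = H[:m, m:]         # off-diagonal block
H1 = H[m:, m:]        # lower-right minor

G = resolvent(H, z)
G1 = resolvent(H1, z)
# Schur complement formula for the upper-left block of the resolvent
# (B.T equals the adjoint of B because H is real symmetric).
schur_block = np.linalg.inv(A - z * np.eye(m) - B @ G1 @ B.T)
print(np.max(np.abs(G[:m, :m] - schur_block)))
```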

Next, abbreviate $G_{1}(z):=(H_{1}-z)^{-1}$ . Since $N^{1/2}(N-M-1)^{-1/2}H_{1}$ is an $(N-M-1)\times(N-M-1)$ GOE/GUE matrix, we find from (7.36), Theorem 2.3, and Lemma 3.2 that

with $\zeta$ -high probability. Therefore Schur’s formula yields

Similarly, using (3.13) and (3.14) we find that

with $\zeta$ -high probability, using (3.15) that

with $\zeta$ -high probability, and using (3.13) that

with $\zeta$ -high probability. Using Theorem 2.3 applied to $G_{1}$ (recall that $F$ and $H_{1}$ are independent), we therefore get from (7.39) that

with $\zeta$ -high probability. We write this as

Next, we identify the asymptotic laws of $\Gamma_{1},\dots,\Gamma_{6}$ . There is nothing to be done with $\Gamma_{1}$ . By definition,

The variance of the term in parentheses is

Since $\lvert w_{i}\rvert\leqslant\varphi^{-D}$ , we get from the Central Limit Theorem and Lemma 7.10 that

we find from the Central Limit Theorem and Lemma 7.10 that

Thus we conclude from the Central Limit Theorem and Lemma 7.10 that

Next, (7.43) – (7.47) imply that $\nu N^{1/2}\Gamma_{2},\dots,\nu N^{1/2}\Gamma_{6}$ are tight (as $N$ -dependent random variables). Moreover, an easy variance calculation shows that $\nu N^{1/2}\Gamma_{1}$ is also tight. Therefore we get from (7.35), (7.42), (7.43) – (7.47), Lemma 7.7, and Lemma 7.8 that (recall the notation from Definition 7.11)

the Central Limit Theorem, Lemma 7.10, and Lemma 7.8 we find

7.4 Conclusion of the proof of Theorem 2.14

Define a new Wigner matrix $\widehat{H}=(\widehat{h}_{ij})=(N^{-1/2}\widehat{W}_{ij})$ through

Thus, $\widehat{H}$ satisfies the assumptions of Proposition 7.12. Let

be the set of matrix indices to be replaced. Similarly to (3.21), we choose a bijective map $\phi\colon J_{D}\to\{1,\dots,\gamma_{\rm max}(D)\}$ and denote by $H_{\gamma}=(h_{ij}^{\gamma})$ the matrix defined by

In particular, $H_{0}=\widehat{H}$ and $H_{\gamma_{\rm max}(D)}=H$ . Let now $(a,b)\in J_{D}$ satisfy $\phi(a,b)=\gamma$ . Similarly to (3.22), we write

Thus we have the rough bound $\lvert x\rvert\leqslant N^{4}$ which we shall tacitly use in the following. We use the notation (3.23), which gives rise to the quantities $x_{R},x_{S},x_{T}$ defined through (7.49) with $G$ replaced by $R,S,T$ respectively. We may now state the main comparison estimate.
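
The entry-by-entry interpolation between $\widehat{H}$ and $H$ can be sketched as follows. The snippet is a toy illustration under stated assumptions: it takes $J_{D}$ to consist of the index pairs $(a,b)$ with $a\leqslant b$ at which both $\lvert v_{a}\rvert$ and $\lvert v_{b}\rvert$ fall below a threshold (a stand-in for $\varphi^{-D}$ ), uses the enumeration order of the pairs as the bijection $\phi$ , and lets $H_{\gamma}$ agree with $H$ on the first $\gamma$ pairs and with $\widehat{H}$ elsewhere. The Bernoulli entries, the vector `v`, the threshold, and all names are hypothetical.

```python
import numpy as np

def swap_sequence(H, H_hat, v, threshold):
    """Return [H_0, ..., H_gamma_max] with H_0 = H_hat and H_gamma_max = H.

    Consecutive matrices differ in a single entry (a, b) and its mirror (b, a).
    """
    N = len(v)
    small = np.abs(v) <= threshold
    # Index pairs to be swapped; list order plays the role of phi (assumption).
    J = [(a, b) for a in range(N) for b in range(a, N) if small[a] and small[b]]
    current = H_hat.copy()
    sequence = [current.copy()]
    for a, b in J:
        current[a, b] = H[a, b]
        current[b, a] = H[b, a]
        sequence.append(current.copy())
    return sequence

rng = np.random.default_rng(0)
N = 6
v = np.array([0.9, 0.4, 0.1, 0.1, 0.05, 0.05])
v /= np.linalg.norm(v)
threshold = 0.2  # stand-in for varphi^{-D}

# Original Wigner matrix with symmetric Bernoulli entries (illustrative choice).
signs = rng.choice([-1.0, 1.0], size=(N, N))
H = (np.triu(signs) + np.triu(signs, 1).T) / np.sqrt(N)

# H_hat: Gaussian entries wherever both components of v are small, H elsewhere.
g = rng.standard_normal((N, N))
gauss = (np.triu(g) + np.triu(g, 1).T) / np.sqrt(N)
small = np.abs(v) <= threshold
H_hat = np.where(np.outer(small, small), gauss, H)

sequence = swap_sequence(H, H_hat, v, threshold)
print(len(sequence) - 1)  # number of swaps, i.e. gamma_max in this toy example
```

Because consecutive matrices differ in one entry only, the total change of an observable can be written as a telescoping sum over $\gamma$ , each term of which is controlled by a one-entry comparison.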

where $A_{ab}$ satisfies $\lvert A_{ab}\rvert\leqslant\varphi^{-1}$ ,

Before proving Lemma 7.13, we show how it implies Theorem 2.14.

Applying (7.50) and (7.51) with $f$ replaced by $f^{\prime}$ yields

Subtracting this from (7.50) and using $\lvert A_{ab}\rvert\leqslant\varphi^{-1}$ yields

We now iterate (7.53), starting at $\gamma=1$ and $q=0$ . Using that $\sum_{a,b}\widehat{\mathcal{E}}_{ab}\leqslant C$ and $\sum_{a,b}\lvert Y_{ab}\rvert\leqslant C$ , we find after $\gamma_{\rm max}$ iterations of (7.53)

Moreover, using $\lvert v_{a}\rvert\leqslant\varphi^{-D}$ and $\lvert v_{b}\rvert\leqslant\varphi^{-D}$ , we find that

Using Lemma 7.2, it is now easy to remove the imaginary part $N^{-4}$ of $z$ to get

Since $\widehat{H}$ satisfies the assumptions of Proposition 7.12, we find

using the notation of Definition 7.11. Now Theorem 2.14 follows from Proposition 7.1 and Lemma 7.7. ∎

with $\zeta$ -high probability. This yields

with $\zeta$ -high probability for some constant $\widetilde{C}_{\zeta}$ . Now choose $D\geqslant\widetilde{C}_{\zeta}+1$ . By definition of $J_{D}$ , we have that $\lvert v_{a}\rvert\leqslant\varphi^{-D}$ and $\lvert v_{b}\rvert\leqslant\varphi^{-D}$ . Therefore

with $\zeta$ -high probability. This yields

Using (7.54), it is easy to check that $y_{1}$ is bounded by the right-hand side of (7.55), and that

with $\zeta$ -high probability. In particular,

with $\zeta$ -high probability. Moreover, using $\lvert v_{a}\rvert\leqslant\varphi^{-D}$ , $\lvert v_{b}\rvert\leqslant\varphi^{-D}$ , (7.57) for $k=2$ , and the fact that $y_{1}$ is bounded by the right-hand side of (7.55), we find that

with $\zeta$ -high probability, provided $D$ is chosen large enough. Similarly, using (7.57) we find that $\lvert y_{k}\rvert\lvert y_{k^{\prime}}\rvert\leqslant\varphi^{-1}\widehat{\mathcal{E}}_{ab}$ for $k,k^{\prime}\geqslant 2$ for large enough $D$ . Thus we conclude from (7.56) that

depends on the randomness only through $R$ and the first two moments of $W_{ab}$ . Moreover, from (7.57) and the fact that $y_{1}$ is bounded by the right-hand side of (7.55), we conclude that $\lvert A_{ab}\rvert\leqslant\varphi^{-1}$ .

If $a=b$ , it is easy to see from (7.57) and the definition of $Y_{ab}$ that

with $\zeta$ -high probability. Therefore it suffices to prove that

with $\zeta$ -high probability. We only deal with the first term of $y_{3,0}$ ; the second one is dealt with analogously. Recalling the definition of $Y_{ab}$ , we conclude that, in order to establish (7.59), it suffices to prove

with $\zeta$ -high probability; here we used that $R$ is independent of $W_{ab}$ .

see (4.20). The second and third terms are estimated using (7.54) and Theorem 2.3:

with $\zeta$ -high probability. Moreover, since $R^{(a)}=T^{(a)}$ , we find from Lemma (3.12), Theorem 2.3, and (3.8) that

What remains is to estimate the right-hand side of (7.64). Defining

with $\zeta$ -high probability. Using (7.63) and using that the derivative of $f$ is bounded, we may estimate the first term of (7.64) as

with $\zeta$ -high probability. Thus we may estimate the third term of (7.64) by