Limits of spiked random matrices I

Alex Bloemendal, Bálint Virág

Introduction

Contemporary problems typically involve high dimensional data, meaning that $p$ is large as well, perhaps on the same order as $n$ or even larger. In this setting, say with null covariance $\Sigma=I$, the sample eigenvalues may no longer concentrate around the population eigenvalue 1 but rather spread out over a certain compact interval. If $p/n\to c$ with $0<c\leq 1$, Marčenko and Pastur (1967) proved that a.s. the empirical spectral distribution $\frac{1}{p}\sum_{k}\delta_{\lambda_{k}/n}$ converges weakly to the continuous distribution with density

\[ \frac{\sqrt{(b-x)(x-a)}}{2\pi c\,x}\,\mathbf{1}_{[a,b]}(x), \]

where $a=(1-\sqrt{c})^{2}$ and $b=(1+\sqrt{c})^{2}$. (The singular case $c>1$ is similar by the obvious duality between $n$ and $p$, except that the $p-n$ zero eigenvalues become an atom at zero of mass $1-c^{-1}$.) This Marčenko-Pastur law is the analogue of Wigner’s semicircle law in this setting of multiplicative rather than additive symmetrization (see also Silverstein and Bai 1995). The assumption of Gaussian entries may be significantly relaxed.
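As a quick numerical sanity check on the limiting law, the following sketch integrates the Marčenko-Pastur density over its support $[a,b]$. It assumes the standard form of the density, $\sqrt{(b-x)(x-a)}/(2\pi c x)$ on $[a,b]$; the function names `mp_density` and `mp_moment` and the midpoint quadrature are illustrative choices, not taken from the paper.

```python
import math


def mp_density(x, c):
    """Marchenko-Pastur density for a ratio 0 < c <= 1 (standard form, assumed here)."""
    a = (1.0 - math.sqrt(c)) ** 2
    b = (1.0 + math.sqrt(c)) ** 2
    if x <= a or x >= b:
        return 0.0
    return math.sqrt((b - x) * (x - a)) / (2.0 * math.pi * c * x)


def mp_moment(c, power, steps=100_000):
    """Midpoint-rule approximation of the integral of x**power times the density."""
    a = (1.0 - math.sqrt(c)) ** 2
    b = (1.0 + math.sqrt(c)) ** 2
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h
        total += (x ** power) * mp_density(x, c)
    return total * h


c = 0.5
mass = mp_moment(c, 0)    # total mass, expected close to 1
mean = mp_moment(c, 1)    # first moment, expected close to the population eigenvalue 1
second = mp_moment(c, 2)  # second moment, expected close to 1 + c
```

The first moment recovers the population eigenvalue 1, while the variance $c$ quantifies how far the sample eigenvalues spread when $p$ is comparable to $n$.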

Often one is primarily interested in the largest eigenvalues, as for example in the widely practiced statistical method of principal components analysis. Here the goal is a good low-dimensional projection of a high-dimensional data set, i.e. one that captures most of the variance; the structure of the significant trends and correlations is estimated using the largest sample eigenvalues and their eigenvectors. The challenge is to determine which observed eigenvalues actually represent structure in the population, and understanding the behaviour in the null case is therefore an essential first step.

In the null case the first-order behaviour is simple: $\frac{1}{n}\lambda_{k}\to b$ a.s. for each fixed $k$ as $n\to\infty$, i.e. none have limits beyond the edge of the support of the limiting spectral distribution (Geman 1980, Yin et al. 1988). More interestingly, the fluctuations are no longer asymptotically Gaussian but are rather those now recognized as universal at a real symmetric or Hermitian random matrix soft edge: they are on the order $n^{-2/3}$, asymptotically distributed according to the appropriate Tracy-Widom law. The latter were introduced by Tracy and Widom (1994, 1996) as limiting largest eigenvalue distributions for the Gaussian ensembles (see also Forrester 1993) and have since been found to occur in diverse probabilistic models. The limit theorems for sample covariance matrices were proved by Johansson (2000) in the complex case and by Johnstone (2001) in the real case (see Soshnikov 2002 for the first universality results here). Restrictions $c\neq 0,\infty$ on the limiting dimensional ratio were removed by El Karoui (2003) (see also Péché 2009).

Now often referred to as the BBP transition, this picture is relevant in various applications. Within mathematics it has been applied to the TASEP model of interacting particles on the line (Ben Arous and Corwin 2011). Spiked complex Wishart matrices occur in problems in wireless communications (Telatar 1999). With these two exceptions, however, most applications involve data that are real rather than complex. They include economics and finance—Harding (2008) used the phase transition to explain an old standard example of the failure of PCA—and medical and population genetics—Patterson et al. (2006) discuss its role in attempting to answer such questions as “Given genotype data, is it from a homogeneous population?” Further applications include speech recognition, statistical learning and the physics of mixtures (see Johnstone 2007, Paul 2007, Féral and Péché 2009 for references). In general, asymptotic distributions in the non-null cases are relevant when evaluating the power of a statistical test (Johnstone 2007).

Despite these developments, the conjectured BBP picture for spiked real Wishart matrices has proven elusive even in the rank one case. The difficulty is with the joint eigenvalue density: The complex case involves an integral over the unitary group that BBP analyzed via the Harish-Chandra-Itzykson-Zuber integral, a tool originating in representation theory that appears to have no straightforward analogue over the orthogonal group. Much is known, however. At the level of a law of large numbers, the phase transition is described by Baik and Silverstein (2006); a related separation phenomenon was observed already by Bai and Silverstein (1998, 1999). A broad generalization of the results on a.s. limits is developed by Benaych-Georges and Nadakuditi (2009) and dubbed “spiked free probability theory”. Paul (2007), Bai and Yao (2008) prove Gaussian central limit theorems in the supercritical regime. Féral and Péché (2009) prove Tracy-Widom fluctuations in the subcritical regime under the scaling assumptions of BBP. Interestingly, Wang (2008) obtained a critical limiting distribution for certain rank one spiked quaternion Wishart matrices.

Since this article was first posted, Mo (2011) gave a different treatment of the real rank one case. Despite the difficulties mentioned, he succeeds with the standard program of obtaining forms for the joint eigenvalue and largest eigenvalue distributions and doing asymptotic analysis on the latter. His description of the limiting distribution naturally looks very different from ours. See Forrester (2011) for some remarks on the two treatments and an alternative construction of the “general $\beta$” model we now introduce.

We bypass the eigenvalue density altogether; our starting point is rather a reduction of the matrix to tridiagonal form via Householder’s algorithm, a well-known tool in numerical analysis. Trotter (1984) observed that the algorithm interacts nicely with the Gaussian structure, using the resulting forms to derive the Wigner semicircle and Marčenko-Pastur laws without going through their moments. Observing the similarity of the forms in the $\beta=1,2,4$ cases, Dumitriu and Edelman (2002) introduced interpolating matrix ensembles for all $\beta>0$ whose eigenvalue density is given by Dyson’s Coulomb or log gas model

where $v$ is the Hermite or the Laguerre weight and $Z$ is a normalizing factor (see Forrester 2010 for more on such models). Incidentally, Trotter’s argument applies to these general $\beta$ analogues and establishes Wigner semicircle and Marčenko-Pastur laws in this setting. An extension to more general weights is part of a forthcoming work of Krishnapur et al. (2011+).
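To make the tridiagonal models concrete, here is a minimal sketch that samples a general $\beta$ Hermite tridiagonal matrix and locates its largest eigenvalue. The normalization is an assumption of the sketch (diagonal entries $\sqrt{2/\beta}$ times standard Gaussians, off-diagonal entries $\chi_{(n-k)\beta}/\sqrt{\beta}$, which puts the top edge near $2\sqrt{n}$), and the Sturm-count bisection is a generic numerical method rather than anything taken from the paper. The helper names are hypothetical.

```python
import math
import random


def sample_beta_hermite(n, beta, rng):
    """Diagonal and off-diagonal of an n x n tridiagonal beta-Hermite matrix.

    Assumed normalization: diagonal sqrt(2/beta) * N(0,1) and
    off-diagonal chi_{(n-k) beta} / sqrt(beta) for k = 1, ..., n-1.
    """
    diag = [math.sqrt(2.0 / beta) * rng.gauss(0.0, 1.0) for _ in range(n)]
    # chi_d squared is Gamma(shape=d/2, scale=2), so chi_d / sqrt(beta) is the root below.
    off = [math.sqrt(rng.gammavariate((n - k) * beta / 2.0, 2.0) / beta)
           for k in range(1, n)]
    return diag, off


def count_below(diag, off, x):
    """Number of eigenvalues strictly below x, via the Sturm (LDL^T pivot) recursion."""
    count = 0
    d = 1.0
    for i in range(len(diag)):
        b2 = off[i - 1] ** 2 if i > 0 else 0.0
        d = diag[i] - x - b2 / d
        if d == 0.0:
            d = 1e-30  # nudge an exact zero pivot to keep the recursion defined
        if d < 0.0:
            count += 1
    return count


def largest_eigenvalue(diag, off, tol=1e-10):
    """Bisection for the top eigenvalue, started from Gershgorin bounds."""
    n = len(diag)
    radius = [(abs(off[i - 1]) if i > 0 else 0.0) + (abs(off[i]) if i < n - 1 else 0.0)
              for i in range(n)]
    lo = min(diag[i] - radius[i] for i in range(n))
    hi = max(diag[i] + radius[i] for i in range(n))
    while hi - lo > tol * max(1.0, abs(lo), abs(hi)):
        mid = 0.5 * (lo + hi)
        if count_below(diag, off, mid) == n:
            hi = mid  # every eigenvalue lies below mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


rng = random.Random(0)
n, beta = 400, 2.0
diag, off = sample_beta_hermite(n, beta, rng)
top = largest_eigenvalue(diag, off)
ratio = top / math.sqrt(n)  # expected near 2 under the assumed normalization
```

Because only $2n-1$ random numbers are needed, such models can be sampled for any $\beta>0$ and for sizes far beyond what a dense matrix would allow.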

The second step is to consider the tridiagonal ensemble as a discrete random Schrödinger operator (i.e. discrete Laplacian plus random potential) and then take a scaling limit at the soft edge to obtain a certain continuum random Schrödinger operator on the half-line. This “stochastic operator approach to random matrix theory” was pioneered by Edelman and Sutton (2007), Sutton (2005); in the soft edge case their heuristics were proved by Ramírez et al. (2011), who in particular established joint convergence of the largest eigenvalues. Our method is directly based on the latter work and we refer to it throughout by the initials RRV. The key point is that both steps can be adapted to the setting of rank one perturbations. As we will see, the limiting operator feels the perturbation in the boundary condition at the origin.

In order to state our results, we now recall the stochastic Airy operator introduced by Edelman and Sutton (2007). Formally this is the random Schrödinger operator

We will see that, almost surely, $\mathcal{H}_{\beta,w}$ is bounded below with purely discrete, simple spectrum $\{\Lambda_{0}<\Lambda_{1}<\cdots\}$ for all $w\in(-\infty,\infty]$. This fact will be established simultaneously with the standard variational characterization: in Proposition 2.8, we show in particular that $\Lambda_{k}$ and the corresponding eigenfunction $f_{k}$ are given recursively by

in which we consider only candidates $f$ for which the first integral is finite, and the stochastic integral is defined pathwise via integration by parts. Recall from RRV that the distribution $F_{\beta,\infty}$ of $-\Lambda_{0}$ in the Dirichlet case $w=+\infty$ may be taken as a definition of Tracy-Widom($\beta$) for general $\beta>0$, a one-parameter family of distributions interpolating between those at the standard values $\beta=1,2,4$. Fixing $\beta$, the distributions $F_{\beta,w}$ for finite $w$ may be thought of as a family of deformations of Tracy-Widom($\beta$). We note that the pathwise dependence of $\mathcal{H}_{\beta,w}$ on the Brownian motion allows the operators to be coupled over $w$ in a natural way.

Let $\lambda_{1}>\dots>\lambda_{n\wedge p}$ be the nonzero eigenvalues of $S$. Then, jointly for $k=1,2,\ldots$ in the sense of finite-dimensional distributions, we have

Work of Féral and Péché (2009) immediately allows extension of the previous theorem in the real and complex spiked Wishart cases to more general real and complex spiked sample covariance matrices. More precisely, the i.i.d. multivariate Gaussian columns of the data matrix $X$ may be replaced with i.i.d. columns having zero mean and rank one spiked diagonal covariance, and satisfying some moment conditions. These authors make the same assumptions on the dimension ratio as BBP, but the null case universality result of Péché (2009) suggests these could be removed.

We prove Theorem 1.1 by establishing a more general technical result, Theorem 2.10 in Section 2. The latter theorem gives conditions under which the low-lying eigenvalues and corresponding eigenvectors of a large random symmetric tridiagonal matrix converge in law to those of a random Schrödinger operator on the half-line with a given potential and homogeneous boundary condition at the origin. Verifying the hypotheses for suitably scaled spiked Laguerre matrices will be relatively straightforward; we do it in Section 3. The approach follows that of RRV, where the null case of Theorem 1.1 is treated.

One advantage of such an approach is that it immediately yields results for other matrix models as well. In particular, finite-rank additive perturbations of Gaussian orthogonal, unitary and symplectic ensembles (GO/U/SE) have received considerable attention. The analogue of the BBP theorem in the perturbed GUE setting was established by Péché (2006), Desrosiers and Forrester (2006). Bassler et al. (2010) treat an interesting generalization and mention some applications to physics. We consider a simple additive rank one perturbation of the GOE obtained by shifting the mean of every entry by the same constant $\mu/\sqrt{n}$. By orthogonal invariance, this has the same effect on the spectrum as shifting the (1,1) entry by $\sqrt{n}\,\mu$. With this perturbation, the usual tridiagonalization procedure works; the resulting form is the $\beta=1$ case of

As in the spiked real Wishart setting, the critical regime for the rank one perturbed GOE has resisted description. We show that the phase transition in the perturbed Hermite ensemble has the same characterization as the one in the Laguerre ensemble.
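The effect of the shifted (1,1) entry can be observed numerically. The sketch below builds the tridiagonal form of the GOE under an assumed normalization (diagonal $\sqrt{2}$ times standard Gaussians, off-diagonal $\chi_{n-k}$, top edge near $2\sqrt{n}$), adds $\sqrt{n}\,\mu$ to the first diagonal entry, and reports the top eigenvalue divided by $\sqrt{n}$. The first-order prediction used for comparison, roughly $2$ for $\mu\leq 1$ and $\mu+1/\mu$ for $\mu>1$, is the standard one for this normalization and is quoted here as background rather than derived. All function names are hypothetical.

```python
import math
import random


def count_below(diag, off, x):
    """Number of eigenvalues strictly below x (Sturm count for a tridiagonal matrix)."""
    count = 0
    d = 1.0
    for i in range(len(diag)):
        b2 = off[i - 1] ** 2 if i > 0 else 0.0
        d = diag[i] - x - b2 / d
        if d == 0.0:
            d = 1e-30
        if d < 0.0:
            count += 1
    return count


def largest_eigenvalue(diag, off, tol=1e-10):
    """Bisection for the top eigenvalue, started from Gershgorin bounds."""
    n = len(diag)
    radius = [(abs(off[i - 1]) if i > 0 else 0.0) + (abs(off[i]) if i < n - 1 else 0.0)
              for i in range(n)]
    lo = min(diag[i] - radius[i] for i in range(n))
    hi = max(diag[i] + radius[i] for i in range(n))
    while hi - lo > tol * max(1.0, abs(lo), abs(hi)):
        mid = 0.5 * (lo + hi)
        if count_below(diag, off, mid) == n:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def perturbed_goe_top(n, mus, seed=0):
    """Top eigenvalue / sqrt(n) of the tridiagonal GOE model, one value per shift mu.

    The same underlying matrix is reused for every mu (a coupling over mu);
    only the (1,1) entry is shifted by sqrt(n) * mu.
    """
    rng = random.Random(seed)
    diag = [math.sqrt(2.0) * rng.gauss(0.0, 1.0) for _ in range(n)]
    off = [math.sqrt(rng.gammavariate((n - k) / 2.0, 2.0)) for k in range(1, n)]
    ratios = []
    for mu in mus:
        shifted = [diag[0] + math.sqrt(n) * mu] + diag[1:]
        ratios.append(largest_eigenvalue(shifted, off) / math.sqrt(n))
    return ratios


# Subcritical shift (mu = 0.5) versus supercritical shift (mu = 2).
sub, sup = perturbed_goe_top(400, [0.5, 2.0])
```

With $\mu=2$ the top eigenvalue separates from the bulk edge, while with $\mu=0.5$ it stays at the edge; the critical window around $\mu=1$ is the regime that the results here describe.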

Let $\lambda_{1}>\dots>\lambda_{n}$ be the eigenvalues of $G$. Then, jointly for $k=0,1,\ldots$ in the sense of finite-dimensional distributions, we have

where $\Lambda_{0}<\Lambda_{1}<\cdots$ are the eigenvalues of $\mathcal{H}_{\beta,w}$. Furthermore, the convergence holds jointly with respect to the natural couplings over all $\{\mu_{n}\},w$ satisfying (1.6).

The remarks following the previous theorem apply also to this theorem; the universality issue is discussed in Féral and Péché (2007).

The limit of a rank one perturbed general $\beta$ soft edge thus seems to be universal, just as at $\beta=2$. We offer two alternative descriptions.

Fix $\beta>0$ and let $\Lambda_{0}$ be the ground state energy of $\mathcal{H}_{\beta,w}$ where $w\in(-\infty,\infty]$. The distribution $F_{\beta,w}(x)=\operatorname{\mathbf{P}}_{\beta,w}(-\Lambda_{0}\leq x)$ has the following alternative characterizations.

(RRV) Consider the stochastic differential equation

and let $\operatorname{\mathbf{P}}_{(x_{0},w)}$ be the Itō diffusion measure on paths $\{p_{x}\}_{x\geq x_{0}}$ started from $p_{x_{0}}=w$. A path almost surely either explodes to $-\infty$ in finite time or grows like $p_{x}\sim\sqrt{x}$ as $x\to\infty$, and we have

has a unique bounded solution, and we have $F_{\beta,w}(x)=F(x,w)$ for $w\in(-\infty,\infty)$. We recover the Tracy-Widom($\beta$) distribution $F_{\beta,\infty}(x)=\lim_{w\to\infty}F(x,w)$.

These characterizations can be extended to the higher eigenvalues; details appear in Section 4.
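The diffusion description lends itself to a crude Monte Carlo experiment. The sketch below assumes the Riccati diffusion has the form $dp_{x}=\frac{2}{\sqrt{\beta}}\,db_{x}+(x-p_{x}^{2})\,dx$ and that $F_{\beta,w}(x_{0})$ is the probability that a path started from $p_{x_{0}}=w$ never explodes; both are recalled from RRV rather than read off from the text above. The Euler-Maruyama step, the explosion threshold `floor` and the cutoff `x_end` are illustrative truncations, so the output is only a rough estimate.

```python
import math
import random


def survival_probability(x0, w, beta, paths=400, dt=0.01,
                         x_end=8.0, floor=-30.0, seed=0):
    """Monte Carlo estimate of the probability that the path never explodes.

    Assumed dynamics: dp = (2 / sqrt(beta)) db + (x - p^2) dx, started at p(x0) = w.
    A path is declared exploded once it falls below `floor`; a path that is still
    positive at x = x_end is counted as non-exploding (a truncation of this sketch).
    """
    rng = random.Random(seed)
    noise = 2.0 / math.sqrt(beta) * math.sqrt(dt)
    survived = 0
    for _ in range(paths):
        x, p = x0, w
        alive = True
        while x < x_end:
            # Euler-Maruyama step for the Riccati diffusion.
            p += (x - p * p) * dt + noise * rng.gauss(0.0, 1.0)
            x += dt
            if p < floor:
                alive = False
                break
        if alive and p > 0.0:
            survived += 1
    return survived / paths


high = survival_probability(3.0, 2.0, beta=2.0)   # far right: explosion is rare
low = survival_probability(-6.0, 0.0, beta=2.0)   # far left: explosion is nearly certain
```

Starting far to the right, the drift holds the path near $\sqrt{x}$ and the estimate is close to 1; starting far to the left, the drift $x-p^{2}$ is strongly negative and almost every path explodes.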

In RRV the diffusion characterization is derived with classical tools, namely the Riccati transformation and Sturm oscillation theory. We review the relevant facts in Section 4 before proceeding to the boundary value problem. While the latter characterization amounts to a straightforward reformulation of the former, it is appealing in that it involves no stochastic objects. It also turns out to offer a good way of evaluating the distributions numerically (Bloemendal and Sutton 2011+). Most interestingly, however, it provides a sought-after connection with known integrable structure at $\beta=2,4$.

To wit, let $u(x)$ be the Hastings-McLeod solution of the homogeneous Painlevé II equation

Equation (1.15) is one member of the Lax pair for the Painlevé II equation. The functions $f,g$ can also be defined in terms of the solution of the associated Riemann-Hilbert problem; analysis of the latter yields some information about $u,f,g$ summarized in Facts 5.1 and 5.2 below. The following theorem expresses the relationship between the objects just defined and the general $\beta$ characterization at $\beta=2,4$. The proof is given in Section 5.

hold and follow directly from Theorem 1.7 and Facts 5.1 and 5.2.

The formula for $F_{2,w}$ is given by Baik (2006), although it appeared earlier in work of Baik and Rains (2000, 2001) in a very different context. The formula for $F_{4,w}$ appears in Baik and Rains (2000, 2001) in a disguised form; the $w=0$ case is obtained by Wang (2008), but it is a new result in this context for $w\neq 0,\infty$. In the $\beta=4$ case we thus use our characterization to prove a guess.

In particular, we recover the Painlevé II representations of Tracy and Widom at these $\beta$ in a novel and simple way.

The latter distribution is known to be $F_{1,\infty}(x)$ (Tracy and Widom 1996). Unfortunately we lack an independent proof.

A number of points remain somewhat mysterious. Most obviously, we lack a connection in the $\beta=1$ case; while the literature previously did not even suggest a guess, it would now be illuminating to reconcile (1.9), (1.10) with the formula obtained by Mo (2011). Even at $\beta=2,4$ it seems there should be a more direct way to derive or at least understand the connection. From the point of view of the PDE (1.9), some kind of extra structure appears to be present at certain special values of the parameter $\beta$; what about other values? From the point of view of nonlinear special functions, we have shown directly, independently of any limit theorems, how the well-studied Hastings-McLeod solution admits characterization in terms of a simple linear parabolic boundary value problem in the plane.

We close this introduction by advertising the sequel, in which we treat the general spiked model with analogous methods.

The limit of a spiked tridiagonal ensemble

In this section we strengthen the argument of RRV to apply in the rank one spiked cases. The main convergence result will be applied in the next section to the tridiagonal forms described in the introduction.

Theorem 2.10 below generalizes Theorem 5.1 of RRV in a natural way, giving conditions under which the low-lying eigenvalues and corresponding eigenvectors of a random symmetric tridiagonal matrix converge in law to those of a random Schrödinger operator on the half-line with a given potential and homogeneous boundary condition at the origin. We include substantial parts of the original argument both for completeness and to highlight the new material; see Anderson et al. (2009) for another presentation of the original argument in a special case.

Let $(y_{n,i;j})_{j=0,\ldots,n}$, $i=1,2$ be two discrete-time real-valued random processes with $y_{n,i;0}=0$, and let $w_{n}$ be a real-valued random variable. Embed the processes as above. Define a “potential” matrix (or operator)

respectively. We denote this random matrix also as $H_{n}$, and call it a spiked tridiagonal ensemble. (We could have absorbed $w_{n}$ into $y_{n,1}$ as an additive constant, but keep it separate for reasons that will soon be apparent.)

As in RRV, convergence rests on a few key assumptions on the random variables just introduced. By choice, no additional scalings will be required.

Assumption 1 (Tightness and convergence). There exists a continuous random process $\{y(x)\}_{x\geq 0}$ with $y(0)=0$ such that

with respect to the compact-uniform topology on paths.

Assumption 2 (Growth and oscillation bounds). There is a decomposition

with $\eta_{n,i;j}\geq 0$ such that for some deterministic unbounded nondecreasing continuous functions $\overline{\eta}(x)>0$, $\zeta(x)\geq 1$ not depending on $n$, and random constants $\kappa_{n}\geq 1$ defined on the same probability spaces, the following hold: The $\kappa_{n}$ are tight in distribution, and for each $n$ we have almost surely

for all $x,\xi\in[0,n/m_{n}]$ with $\left\lvert\xi-x\right\rvert\leq 1$.

Assumption 3 (Critical or subcritical spiking). For some nonrandom $w\in(-\infty,\infty]$, we have

The necessity of the first and third assumptions will be evident when we define a continuum limit and prove convergence. The more technical second assumption ensures tightness of the matrix eigenvalues; its limiting version (derived in the next subsection) will guarantee discreteness of the limiting spectrum. Lastly, we note that for given $y_{n}$ the models may be coupled over different choices of $w_{n}$.

Reduction to deterministic setting

In the next subsection we will define a limiting object in terms of $y$ and $w$; we want to prove that the discrete models converge to this continuum limit in law. We reduce the problem to a deterministic convergence statement as follows. First, select any subsequence. It will be convenient to extract a further subsequence so that certain additional tight sequences converge jointly in law; Skorokhod’s representation theorem (see Ethier and Kurtz 1986) says this convergence can be realized almost surely on a single probability space. We may then proceed pathwise.

In detail, consider (2.4)–(2.8). Note in particular that the upper bound of (2.5) shows that the piecewise linear process $\left\{\int_{0}^{x}\eta_{n,i}\right\}_{x\geq 0}$ is tight in distribution under the compact-uniform topology for $i=1,2$. Given a subsequence, we pass to a further subsequence so that the following distributional limits exist jointly:

for $i=1,2$, where convergence in the first two lines is in the compact-uniform topology. We realize (2.9) pathwise a.s. on some probability space and continue in this deterministic setting.

Without further reference to the subsequences, we will assume this situation for the remainder of the section.

Limiting operator and variational characterization

Formally, the limit of the spiked tridiagonal ensemble $H_{n}$ will be the eigenvalue problem

where $\mathcal{H}=-d^{2}/dx^{2}+y^{\prime}(x)$ and $w\in(-\infty,\infty]$ is fixed. If $w=+\infty$, the boundary condition is to be interpreted as $f(0)=0$; we refer to this as the Dirichlet case, and it will require special treatment in what follows. The primary object for us will be a symmetric bilinear form associated with the eigenvalue problem (2.10).

and an associated Hilbert space $L^{*}$ as the closure of $C_{0}^{\infty}$ under this norm. Note that our $L^{*}$ differs slightly from the one in RRV. We register some basic facts about $L^{*}$ functions.

Any $f\in L^{*}$ is uniformly Hölder(1/2)-continuous, satisfies $\left\lvert f(x)\right\rvert\leq\left\lVert f\right\rVert_{*}$ for all $x$, and in the Dirichlet case has $f(0)=0$.

We have $\left\lvert f(y)-f(x)\right\rvert=\left\lvert\int_{x}^{y}f^{\prime}\right\rvert\leq\left\lVert f^{\prime}\right\rVert\left\lvert y-x\right\rvert^{1/2}$. For $f\in C_{0}^{\infty}$ we have $f(x)^{2}=-\int_{x}^{\infty}{(f^{2})}^{\prime}\leq 2\left\lVert f^{\prime}\right\rVert\left\lVert f\right\rVert\leq\left\lVert f\right\rVert_{*}^{2}$; an $L^{*}$-bounded sequence in $C_{0}^{\infty}$ therefore has a compact-uniformly convergent subsequence, so we can extend this bound to $f\in L^{*}$ and conclude further that $f(0)=0$ in the Dirichlet case. ∎

For future reference, we also record some compactness properties of the $L^{*}$-norm.

Every $L^{*}$-bounded sequence has a subsequence converging in the following modes: (i) weakly in $L^{*}$, (ii) derivatives weakly in $L^{2}$, (iii) uniformly on compacts, and (iv) in $L^{2}$.

(i) and (ii) are just Banach-Alaoglu; (iii) is the previous fact and Arzelà-Ascoli again; (iii) implies $L^{2}$ convergence locally, while the uniform bound on $\int\overline{\eta}f_{n}^{2}$ produces the uniform integrability required for (iv). Note that the weak limit in (ii) really is the derivative of the limit function, as one can see by integrating against functions $\mathbf{1}_{[0,x]}$ and using pointwise convergence. ∎

We introduce a symmetric bilinear form on $C_{0}^{\infty}\times C_{0}^{\infty}$ by

dropping the last term in the Dirichlet case. (We could have absorbed $w$ into $y$ as an additive constant in the finite case, but prefer to keep the boundary term separate.) Formally, $\mathcal{H}_{y,w}\!\left(\varphi,f\right)$ is just $\left\langle\varphi,\mathcal{H}f\right\rangle$; notice how the mixed boundary condition is built “implicitly” into the form, while the Dirichlet boundary condition is built “explicitly” into the space.
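The distinction between the two kinds of boundary condition shows up directly in a finite-difference discretization. The following sketch discretizes $-f^{\prime\prime}+y^{\prime}(x)f$ on a truncated interval with the boundary condition $f^{\prime}(0)=wf(0)$, treating the form as $\int (f^{\prime})^{2}+\int f^{2}\,dy+wf(0)^{2}$ (an assumption consistent with the formal operator above). For finite $w$ the boundary term is added to the first diagonal entry; in the Dirichlet case the node at the origin is removed from the space. The grid, the truncation length, the lumped mass at the origin and all names are choices of this sketch. With `beta=None` the potential is the deterministic $y^{\prime}(x)=x$, for which the ground state energies are known: minus the first zero of $\mathrm{Ai}$ (about 2.3381) for Dirichlet and minus the first zero of $\mathrm{Ai}^{\prime}$ (about 1.0188) for $w=0$.

```python
import math
import random


def count_below(diag, off, x):
    """Number of eigenvalues strictly below x (Sturm count for a tridiagonal matrix)."""
    count = 0
    d = 1.0
    for i in range(len(diag)):
        b2 = off[i - 1] ** 2 if i > 0 else 0.0
        d = diag[i] - x - b2 / d
        if d == 0.0:
            d = 1e-30
        if d < 0.0:
            count += 1
    return count


def ground_state(diag, off, tol=1e-10):
    """Smallest eigenvalue by bisection, started from Gershgorin bounds."""
    n = len(diag)
    radius = [(abs(off[i - 1]) if i > 0 else 0.0) + (abs(off[i]) if i < n - 1 else 0.0)
              for i in range(n)]
    lo = min(diag[i] - radius[i] for i in range(n))
    hi = max(diag[i] + radius[i] for i in range(n))
    while hi - lo > tol * max(1.0, abs(lo), abs(hi)):
        mid = 0.5 * (lo + hi)
        if count_below(diag, off, mid) >= 1:
            hi = mid  # at least one eigenvalue lies below mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def airy_matrix(w, h=0.01, length=10.0, beta=None, rng=None):
    """Tridiagonal discretization on [0, length] with f'(0) = w f(0); w = inf is Dirichlet."""
    n = int(round(length / h))

    def potential(j, width):
        v = j * h  # deterministic part y'(x) = x at the grid point x = j h
        if beta is not None:
            # White noise (2 / sqrt(beta)) b' averaged over a cell of the given width.
            v += 2.0 / math.sqrt(beta) * rng.gauss(0.0, math.sqrt(width)) / width
        return v

    interior = [2.0 / h ** 2 + potential(j, h) for j in range(1, n)]
    if math.isinf(w):
        # Dirichlet: the condition is built into the space, so node 0 is dropped.
        return interior, [-1.0 / h ** 2] * (n - 2)
    # Finite w: the boundary term enters the form at node 0 (a half-width cell).
    first = 2.0 / h ** 2 + 2.0 * w / h + potential(0, h / 2.0)
    return [first] + interior, [-math.sqrt(2.0) / h ** 2] + [-1.0 / h ** 2] * (n - 2)


dirichlet = ground_state(*airy_matrix(math.inf))
neumann = ground_state(*airy_matrix(0.0))
robin = ground_state(*airy_matrix(1.0))
```

Passing `beta=2.0` and `rng=random.Random(0)` to `airy_matrix` gives one random sample of the discretized operator; the ground state energy increases with $w$, which is the monotone coupling over the boundary parameter.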

There are constants $c,C>0$ so that the following bounds hold for all $f\in C_{0}^{\infty}$:

In particular, $\mathcal{H}_{y,w}\!\left(\cdot,\cdot\right)$ extends uniquely to a continuous symmetric bilinear form on $L^{*}\times L^{*}$ satisfying the same bounds.

For the first two terms of (2.11), we use the decomposition $y=\int\eta+\omega$ from the previous subsection. Integrating the $\int\eta$ term by parts, the limiting version of (2.5) easily yields

Break up the $\omega$ term as follows. The moving average $\overline{\omega}_{x}=\int_{x}^{x+1}\omega$ is differentiable with $\overline{\omega}_{x}^{\prime}=\omega_{x+1}-\omega_{x}$; writing $\omega=\overline{\omega}+(\omega-\overline{\omega})$, we have

The limiting version of (2.7) gives $\max\bigl(\left\lvert\omega_{\xi}-\omega_{x}\right\rvert,\left\lvert\omega_{\xi}-\omega_{x}\right\rvert^{2}\bigr)\leq C_{\varepsilon}+\varepsilon\overline{\eta}(x)$ for $\left\lvert\xi-x\right\rvert\leq 1$, where $\varepsilon$ can be made small. In particular, the first term above is bounded absolutely by $\varepsilon\left\lVert f\right\rVert_{*}^{2}+C_{\varepsilon}\left\lVert f\right\rVert^{2}$. Averaging, we also get $\left\lvert\overline{\omega}_{x}-\omega_{x}\right\rvert\leq(C_{\varepsilon}+\varepsilon\overline{\eta}(x))^{1/2}$; Cauchy-Schwarz then bounds the second term above absolutely by $\sqrt{\varepsilon}\int_{0}^{\infty}{(f^{\prime})}^{2}+\frac{1}{\sqrt{\varepsilon}}\int_{0}^{\infty}{f^{2}(C_{\varepsilon}+\varepsilon\overline{\eta})}$ and thus by $\sqrt{\varepsilon}\left\lVert f\right\rVert_{*}^{2}+C_{\varepsilon}^{\prime}\left\lVert f\right\rVert^{2}$. Now combine all the terms and set $\varepsilon$ small to obtain the result.

For the boundary term $wf(0)^{2}$, it suffices to obtain a bound of the form $f(0)^{2}\leq\varepsilon\left\lVert f\right\rVert_{*}^{2}+C^{\prime\prime}_{\varepsilon}\left\lVert f\right\rVert^{2}$. But $f(0)^{2}\leq 2\left\lVert f^{\prime}\right\rVert\left\lVert f\right\rVert$ from the proof of Fact 2.1 gives such a bound with $C^{\prime\prime}_{\varepsilon}=1/\varepsilon$.

The $L^{*}$ form bound follows from the fact that the $L^{*}$-norm dominates the $L^{2}$-norm. We obtain the quadratic form bound $\left\lvert\mathcal{H}_{y,w}\!\left(f,f\right)\right\rvert\leq C\left\lVert f\right\rVert^{2}_{*}$; it is a standard Hilbert space fact that it may be polarized to a bilinear form bound (see e.g. Halmos 1957). ∎

Call $(\Lambda,f)$ an eigenvalue-eigenfunction pair if $f\in L^{*}$, $\left\lVert f\right\rVert=1$, and for all $\varphi\in C_{0}^{\infty}$ we have

Note that (2.13) then automatically holds for all $\varphi\in L^{*}$, by $L^{*}$-continuity of both sides.

This definition represents a weak or distributional version of the problem (2.10). As further justification, integrate by parts to write the definition

In the Dirichlet case the first term on the right is replaced with $f^{\prime}(0)$. On the one hand (2.14) shows that $f^{\prime}$ has a continuous version, and the equation may be taken to hold everywhere. In particular, $f$ satisfies the boundary condition of (2.10) at the origin. On the other hand, (2.14) is a straightforward integrated version of the eigenvalue equation in which the potential term has been interpreted via integration by parts. This equation will be useful in Lemma 2.7 below and is the starting point for a rigorous derivation of (1.7) in the stochastic Airy case.

The requirement $f\in L^{*}$ in Definition 2.4 is a technical convenience. Regarding regularity, we need $f$ at least absolutely continuous to make sense of the eigenvalue equation in either an integrated or a distributional sense; we have seen, however, that solutions are in fact $C^{1}$. Regarding behaviour at infinity, the diffusion picture developed by RRV shows a dichotomy: almost all solutions of the eigenvalue equation grow super-exponentially at infinity, except for the eigenfunctions which decay sub-exponentially.

We now characterize eigenvalue-eigenfunction pairs variationally. It is easy to see that each eigenspace is finite-dimensional: a sequence of normalized eigenfunctions must have an $L^{2}$-convergent subsequence by (2.12) and Fact 2.2. By the same argument, eigenvalues can accumulate only at infinity. In fact, more is true:

By linearity, it suffices to show a solution of (2.14) with $f^{\prime}(0)=f(0)=0$ must vanish identically. Integrate by parts to write

which implies that $\left\lvert f^{\prime}(x)\right\rvert\leq C(x)\int_{0}^{x}\left\lvert f^{\prime}\right\rvert$ with some $C(x)<\infty$ increasing in $x$. Gronwall’s lemma then gives $f^{\prime}(x)=0$ for all $x\geq 0$. ∎

The eigenfunction corresponding to a given eigenvalue is thus uniquely specified with the additional sign normalization $-\frac{\pi}{2}<\arg\bigl(f(0),f^{\prime}(0)\bigr)\leq\frac{\pi}{2}$. We order eigenvalue-eigenfunction pairs by their eigenvalues. As usual, it follows from the symmetry of the form that distinct eigenfunctions are $L^{2}$-orthogonal.

There is a well-defined $(k+1)$st lowest eigenvalue-eigenfunction pair $(\Lambda_{k},f_{k})$; it is given recursively by the minimum and minimizer in the variational problem

Since we must have $\Lambda_{k}\to\infty$, essentially $\{\Lambda_{0},\Lambda_{1},\ldots\}$ exhausts the full spectrum and the operator has compact resolvent. We do not make this precise.

Proceed inductively, minimizing now over $\{f\in L^{*}:\left\lVert f\right\rVert=1,f\perp f_{0},\dots,f_{k-1}\}$. Again, $L^{2}$-convergence of a minimizing sequence guarantees that the limit remains admissible; as before, the limit is in fact a minimizer; conclude by applying the arguments of the previous paragraph in the ortho-complement. The preceding lemma guarantees that $\Lambda_{0}<\Lambda_{1}<\cdots$, and that the corresponding eigenfunctions $f_{0},f_{1},\ldots$ are uniquely determined. ∎

Statement

Suppose that $H_{n}$ as in (2.1) satisfies Assumptions 1–3 and let $(\lambda_{n,k},v_{n,k})$ be its $(k+1)$st lowest eigenvalue-eigenvector pair. Define the corresponding form $\mathcal{H}_{y,w}$ as in (2.11) and let $(\Lambda_{k},f_{k})$ be its a.s. defined $(k+1)$st lowest eigenvalue-eigenfunction pair. Then, jointly for all $k=0,1,\ldots$ in the sense of finite-dimensional distributions, we have $\lambda_{n,k}\Rightarrow\Lambda_{k}$ and $v_{n,k}\Rightarrow_{L^{2}}f_{k}$ as $n\to\infty$. The convergence holds jointly over different $w_{n},w$ for given $y_{n},y$.

Essentially, the resolvent matrices (precomposed with the corresponding finite-rank projections) are converging to the continuum resolvent in $L^{2}$-operator norm. We do not define the resolvent operator here.

The proof will be given over the course of the next two subsections. Recall that we proceed in the subsequential almost-sure context of the previous subsection.

Tightness

noting that the additional term in the Dirichlet case is nonnegative for sufficiently large $n$.

As in the continuum version, the Dirichlet boundary condition must be put explicitly into the norm (see also Lemma 2.15 below). The case considered in RRV has $w_{n}=m_{n}$ in our notation; though it is somewhat hidden in the definitions, the $L^{*}_{n}$-norm used there contains a term $m_{n}v_{0}^{2}$.

The derivative and potential terms may be handled exactly as in RRV (proof of Lemma 5.6). For the spike term $w_{n}v_{0}^{2}$ we recall Assumption 3. In the $w<\infty$ case the $w_{n}$ are bounded, so it suffices to obtain a bound of the form $v_{0}^{2}\leq\varepsilon\left\lVert v\right\rVert_{*n}^{2}+C_{\varepsilon}\left\lVert v\right\rVert^{2}$ for each $\varepsilon>0$ where $\varepsilon,C_{\varepsilon}$ do not depend on $n$. Mimicking the continuum version in the proof of Fact 2.1, we have

which gives the desired bound with $C_{\varepsilon}=1/\varepsilon$.

In the Dirichlet case, start with (2.15) but with the spike term left out (both of the form and the norm); it can be easily added back in by simply ensuring that $c\leq 1$ and $C\geq 1$. ∎

If $w_{n}\to-\infty$ then the lower bound in Lemma 2.13 breaks down: the lowest eigenvalue of $H_{n}$ really is going to $-\infty$. This is the supercritical regime.

Convergence

We begin with a lemma, a discrete-to-continuous version of Fact 2.2.

Let $f_{n}\to f$ be as in the hypothesis and conclusion of Lemma 2.15. Then for all $\varphi\in C_{0}^{\infty}$ we have $\left\langle\varphi,H_{n}f_{n}\right\rangle\to\mathcal{H}_{y,w}\!\left(\varphi,f\right)$. In particular, $\mathcal{P}_{n}\varphi\to\varphi$ in this way and so

Note that if $f_{n}\to_{L^{2}}f$, $g_{n}$ is $L^{2}$-bounded and $g_{n}\to g$ weakly in $L^{2}$, then $\left\langle f_{n},g_{n}\right\rangle\to\left\langle f,g\right\rangle$. Therefore $\left\langle\varphi,D_{n}^{\dagger}D_{n}f_{n}\right\rangle=\left\langle D_{n}\varphi,D_{n}f_{n}\right\rangle\to\left\langle\varphi^{\prime},f^{\prime}\right\rangle$. The potential term converges as in RRV (proof of Lemma 5.7). Moreover, the spike term converges to the boundary term:

where in the Dirichlet case the left side vanishes for $n$ large because $\varphi$ is supported away from 0.

For the second statement, the uniform $L^{*}_{n}$ bound follows from the following observations: $\bigl\lVert(\mathcal{P}_{n}\varphi)\sqrt{1+\overline{\eta}}\bigr\rVert=\left\lVert\mathcal{P}_{n}\varphi\sqrt{1+\overline{\eta}}\right\rVert\leq\left\lVert\varphi\sqrt{1+\overline{\eta}}\right\rVert$; for $n$ large enough that $R_{n}\varphi=\varphi$ we have $\left\lVert D_{n}\mathcal{P}_{n}\varphi\right\rVert=\left\lVert\mathcal{P}_{n}D_{n}\varphi\right\rVert\leq\left\lVert D_{n}\varphi\right\rVert\leq\left\lVert\varphi^{\prime}\right\rVert$ (Young’s inequality); and in the Dirichlet case, the extra term vanishes for $n$ large. The convergence is easy: $\mathcal{P}_{n}\varphi\to\varphi$ compact-uniformly and in $L^{2}$, and for $g\in L^{2}$ we have $\left\langle g,D_{n}\mathcal{P}_{n}\varphi\right\rangle=\left\langle\mathcal{P}_{n}g,D_{n}\varphi\right\rangle\to\left\langle g,\varphi^{\prime}\right\rangle$.

Finally, we recall the argument of RRV to put all the pieces together.

First we show that for all $k$ we have $\underline{\lambda}_k=\liminf\lambda_{n,k}\ge\Lambda_k$. Assume that $\underline{\lambda}_k<\infty$. The eigenvalues of $H_n$ are uniformly bounded below by Lemma 2.13, so there is a subsequence along which $(\lambda_{n,1},\dots,\lambda_{n,k})\to(\xi_1,\dots,\xi_k=\underline{\lambda}_k)$. By the same lemma the corresponding eigenvector sequences have $L^*_n$-norm uniformly bounded; pass to a further subsequence so that they all converge as in Lemma 2.15. The limit functions are orthonormal, and by Lemma 2.16 they are eigenfunctions with eigenvalues $\xi_k$. There are therefore $k$ distinct eigenvalues at most $\underline{\lambda}_k$, as required.

We proceed by induction, assuming the conclusion of the theorem up to $k-1$. First find $f_k^\varepsilon\in C_0^\infty$ with $\lVert f_k^\varepsilon-f_k\rVert_*<\varepsilon$. Consider the vector

The $L^*_n$-norm of the sum term is uniformly bounded by $C\varepsilon$: indeed, the $\lVert v_{n,j}\rVert_{*n}$ are uniformly bounded by Lemma 2.13, while the coefficients satisfy $\lvert\langle v_{n,j},f_k^\varepsilon\rangle\rvert\le\lVert f_k^\varepsilon-f_k\rVert+\lVert v_{n,j}-f_j\rVert<2\varepsilon$ for large $n$. By the variational characterization in finite dimensions, and the uniform $L^*_n$ form bound on $\langle\cdot,H_n\cdot\rangle$ (Lemma 2.13) together with the uniform bound on $\lVert\mathcal{P}_nf_k^\varepsilon\rVert_{*n}$ (Lemma 2.16), we then have

where $o_\varepsilon(1)\to 0$ as $\varepsilon\to 0$. But (2.16) of Lemma 2.16 provides $\lim\langle\mathcal{P}_nf_k^\varepsilon,H_n\mathcal{P}_nf_k^\varepsilon\rangle=\mathcal{H}_{y,w}(f_k^\varepsilon,f_k^\varepsilon)$, so the right hand side of (2.17) is

Now letting $\varepsilon\to 0$, we conclude $\limsup\lambda_{n,k}\le\Lambda_k$.

Thus $\lambda_{n,k}\to\Lambda_k$; Lemmas 2.13 and 2.16 imply that any subsequence of the $v_{n,k}$ has a further subsequence converging in $L^2$ to some $g\in L^*$ with $(\Lambda_k,g)$ an eigenvalue-eigenfunction pair. But then $g=f_k$, and so $v_{n,k}\to_{L^2}f_k$. ∎

Application to Wishart and Gaussian models

We now apply Theorem 2.10 to prove Theorems 1.1 and 1.5. The first step is to obtain the tridiagonal forms. Then, after recalling the derivation of the scaling limit at the soft edge, we verify Assumptions 1–3 for certain scalings of the perturbation.

We explain how to tridiagonalize a rank one spiked real Wishart matrix; the algorithm is basically the usual one described by Trotter (1984) with a few careful choices. We restrict for the moment to the case $n\ge p$, but lift this restriction in Remark 3.1 below. For a given $p\times n$ data matrix $X$ we will construct a pair of orthogonal matrices $O\in O(p),\ O'\in O(n)$ so that $W=OXO'$ becomes lower bidiagonal; then $X$ and $W$ have the same singular values and $WW^\dagger$ is a symmetric tridiagonal matrix with the same eigenvalues as $XX^\dagger$. Further, the structure of $X$ and $O,O'$ will be such that the entries of $W$ are independent with explicit known distributions.

Reflect the second column of $O_1XO_1'O_2'$ as follows: leaving $\langle e_1,e_2\rangle$ invariant, reflect the orthogonal component of the column into the positive $e_3$ direction via left multiplication by $O_2\in\{I_2\}\oplus O(p-2)$, chosen independently of the other columns.

Continue in this way, alternately reflecting rows and columns while leaving the results of previous steps untouched.
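As a rough numerical illustration of the alternating reflections just described, the following sketch bidiagonalizes a generic $p\times n$ matrix with Householder reflections and checks that the singular values are unchanged. It assumes $n\ge p$; the helper names `householder` and `lower_bidiagonalize` are hypothetical, and the sketch makes no attempt to reproduce the careful choices (each reflection independent of the remaining rows and columns) on which the distributional statements rely.

```python
import numpy as np

def householder(v):
    """Symmetric orthogonal H with H @ v = |v| e_1 (identity if v is already aligned)."""
    v = np.asarray(v, dtype=float)
    H = np.eye(v.size)
    norm = np.linalg.norm(v)
    u = v.copy()
    u[0] -= norm
    unorm = np.linalg.norm(u)
    if norm == 0.0 or unorm <= 1e-14 * norm:
        return H
    u /= unorm
    return H - 2.0 * np.outer(u, u)

def lower_bidiagonalize(X):
    """Return W = O X O' lower bidiagonal, for a p x n input with n >= p."""
    W = np.array(X, dtype=float)
    p, n = W.shape
    assert n >= p
    for j in range(p):
        # Row step: right-multiply so that row j vanishes to the right of column j.
        W[:, j:] = W[:, j:] @ householder(W[j, j:])
        if j < p - 1:
            # Column step: left-multiply so that column j vanishes below row j + 1.
            W[j + 1:, :] = householder(W[j + 1:, j]) @ W[j + 1:, :]
    return W

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 7))
W = lower_bidiagonalize(X)
sv_X = np.linalg.svd(X, compute_uv=False)
sv_W = np.linalg.svd(W, compute_uv=False)
print(np.allclose(sv_X, sv_W))
```

Each step acts only on rows or columns that earlier steps have already cleared elsewhere, which is why the results of previous steps are left untouched.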

The result is that with $O'=O_1'\cdots O_p'$ and $O=O_{p-1}\cdots O_1$ we have

where $\{\widetilde{\chi}_{n-j}\}_{j=0}^{p-1}$ and $\{\chi_{p-j}\}_{j=1}^{p-1}$ are independent Chi random variables of parameters given by their indices. We have truncated the $n-p$ rightmost columns of zeros to obtain a $p\times p$ matrix, leaving the product $WW^\dagger$ unchanged. We will actually work with $W^\dagger W$ below, which has the same eigenvalues.
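To make the bidiagonal form concrete, here is a minimal sampling sketch in the real case with $n\ge p$. The index pattern follows the description above (diagonal $\widetilde{\chi}_n,\dots,\widetilde{\chi}_{n-p+1}$, subdiagonal $\chi_{p-1},\dots,\chi_1$). The function name `sample_bidiagonal` and the parameter `ell` are hypothetical, and the placement of the spike as a factor $\sqrt{\ell}$ on the $(1,1)$ entry only is an assumption made for illustration; with the default `ell=1.0` the sketch is the null case.

```python
import numpy as np

def sample_bidiagonal(n, p, ell=1.0, rng=None):
    """Lower bidiagonal p x p matrix with independent chi entries (real case, n >= p)."""
    rng = np.random.default_rng() if rng is None else rng
    # chi_k is the square root of a chi-square variable with k degrees of freedom.
    diag = np.sqrt(rng.chisquare(np.arange(n, n - p, -1)))  # chi~_n, ..., chi~_{n-p+1}
    sub = np.sqrt(rng.chisquare(np.arange(p - 1, 0, -1)))   # chi_{p-1}, ..., chi_1
    diag[0] *= np.sqrt(ell)  # ASSUMPTION: the spike rescales only the (1,1) entry
    return np.diag(diag) + np.diag(sub, k=-1)

n, p = 400, 100
W = sample_bidiagonal(n, p, rng=np.random.default_rng(1))
top = np.linalg.eigvalsh(W.T @ W)[-1] / n   # largest eigenvalue of W^T W, divided by n
edge = (1 + np.sqrt(p / n)) ** 2            # right edge b of the Marchenko-Pastur support
print(top, edge)
```

In the null case the printed value of `top` should sit close to `edge`, with a discrepancy on the scale $n^{-2/3}$.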

Attempting the above procedure in the case $n<p$ produces a lower bidiagonal matrix $W$ with $n+1$ nonzero rows. The matrix $W^\dagger W$ is now $n\times n$, has the same nonzero eigenvalues as $XX^\dagger$, and looks just like it does in the $n\ge p$ case except for a discrepancy in the bottom-right corner. The two cases may in fact be unified if one agrees that $\chi_0=0$; then $W$ is $(n\wedge p+1)\times(n\wedge p)$ and has the form (1.2) with $\beta=1$, while $W^\dagger W$ is $(n\wedge p)\times(n\wedge p)$.

The perturbed GOE/GUE/GSE ensembles are even easier to tridiagonalize; as in the Wishart case, the usual procedure of Trotter (1984) works without modification. Starting with an $n\times n$ GOE matrix $M$ with a perturbation in the (1,1) entry, the upshot is that for certain $O_1,\dots,O_{n-1}$ with $O_j\in\{I_j\}\oplus O(n-j)$ the conjugated matrix $O_{n-1}\cdots O_1MO_1^\dagger\cdots O_{n-1}^\dagger$ has the form (1.5) with $\beta=1$. We do not detail it further here.

Scaling limit

respectively. The usual centering and rescaling for fluctuations at the soft edge—as well as the operator limit itself—can be predicted using the approximations

valid for $k$ large, where $g$ is a suitably coupled standard Gaussian. We briefly reproduce the heuristic argument.

To leading order, the top-left corner of $S$ has $n+p$ on the diagonal and $\sqrt{np}$ on the off-diagonal. So the top-left corner of

is approximately an unscaled discrete Laplacian. If time is scaled by $m^{-1}$, space has to be scaled by $m^2$ for this to converge to $\frac{d^2}{dx^2}$. The next order terms for the $j$’th diagonal and off-diagonal entries of $S$, where $j\ll n\wedge p$, are respectively

(we have indexed the $g$’s to match the corresponding $\chi$’s). The total noise per unit (unscaled) time is like $\tfrac{2}{\sqrt{\beta}}\bigl(\sqrt{n}+\sqrt{p}\bigr)g$; convergence to $\tfrac{2}{\sqrt{\beta}}$ times standard Gaussian white noise $b_x'$ then requires $\bigl(\sqrt{n}+\sqrt{p}\bigr)m^2/\sqrt{np}=m^{1/2}$. The averaged part of the potential requires $\bigl(2+\sqrt{p/n}+\sqrt{n/p}\bigr)m^2/\sqrt{np}=m^{-1}$ to converge to the function $-x$. Fortunately these two scaling requirements match perfectly; we set

and set the integrated limiting potential to

where $b_x$ is a standard Brownian motion. Note that

so the conditions $m\to\infty$, $m=o(n\wedge p)$ are met by merely having $n,p\to\infty$ together.
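As a consistency check, the following short computation uses only the two scaling requirements stated above to confirm that they are compatible and that they force $m$ to be of order $(n\wedge p)^{1/3}$. It is a sketch of the arithmetic, not a reproduction of the displayed definitions.

```latex
\begin{align*}
\text{noise requirement:}\quad
  & m^{3/2}=\frac{\sqrt{np}}{\sqrt{n}+\sqrt{p}},\\
\text{potential requirement:}\quad
  & m^{3}=\frac{\sqrt{np}}{2+\sqrt{p/n}+\sqrt{n/p}}
        =\frac{np}{(\sqrt{n}+\sqrt{p})^{2}}
        =\Bigl(\frac{\sqrt{np}}{\sqrt{n}+\sqrt{p}}\Bigr)^{2},\\
\text{hence both give}\quad
  & m=\Bigl(\frac{\sqrt{np}}{\sqrt{n}+\sqrt{p}}\Bigr)^{2/3}
     =\Bigl(\frac{1}{\sqrt{n}}+\frac{1}{\sqrt{p}}\Bigr)^{-2/3},\\
\text{and since}\quad
  & \frac{1}{\sqrt{n\wedge p}}\le\frac{1}{\sqrt{n}}+\frac{1}{\sqrt{p}}\le\frac{2}{\sqrt{n\wedge p}},
  \qquad
  2^{-2/3}(n\wedge p)^{1/3}\le m\le(n\wedge p)^{1/3}.
\end{align*}
```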

We now carefully decompose $H_{n,p}$ as in (2.1). In (2.2),(2.3) there is a little freedom between $y_{n,1;1}$ and $w_n$, but only up to an additive constant in $y_{n,1}$ that tends to zero in probability anyway. Thus we may as well set $y_{n,1;1}=0$ to fix $w_n$ and $y_{n,i}$. Assumptions 1 and 2 (the CLT (2.4) and required tightness (2.5)–(2.7) for the potential terms $y_{n,i}$) are then verified exactly as in the final subsection of RRV.

It remains to consider Assumption 3. We have

as in (1.4). We want to show that, in this case, $w_n\to w$ in probability; it is certainly enough to show that $w_n-\overline{w}_n\to 0$ in probability.

Second order heuristics say the error terms are on the order $(n\wedge p)^{-1/6}$ or $m^{-1/2}$, and $L^2$ estimates easily provide the rigour. All we need is that $\chi_k^2$ has mean $k$ and variance $2k$. We have

Turning now to the perturbed $\beta$-Hermite ensemble, take $G_n=G_n^{\beta,\mu_n}$ as in (1.5). With heuristic motivation similar to that in the previous proof, set

and $y(x)$ as before. Decompose $H_n$ as in (2.1). Again, the verification of Assumptions 1 and 2 on $y_{n,i}$ proceeds as in RRV (Lemmas 6.2, 6.3). Moving on to Assumption 3, we have

as in (1.6), the difference is $w_n-\overline{w}_n=-n^{-1/6}\sqrt{2/\beta}\,g_1$. It follows that $w_n-\overline{w}_n\to 0$ in probability, which completes the proof of Theorem 1.5.

Alternative characterizations of the laws

In this section we prove Theorem 1.7 and its extension to higher eigenvalues.

The diffusion characterization is developed in RRV. The starting point is an application of the classical Riccati map $p=f'/f$ to the eigenvalue equation (2.10), or rigorously to (2.14); the result is the first order differential equation

understood also in the integrated sense. The boundary condition at the origin becomes the initial value

and a zero of $f$ would have $p$ explode to $-\infty$ and immediately restart at $+\infty$.

The stated path properties of (1.7) appear also in RRV (Propositions 3.7 and 3.9).

Boundary value problem

Briefly, the boundary value problem is just the Kolmogorov backward equation for a hitting probability of the diffusion. We assume the diffusion representation $F_{\beta,w}(x)=\kappa_{(x,w)}(\{\infty\})$ for the distribution of $-\Lambda_0$.

For each fixed $x$, $F_{\beta,w}(x)$ is nondecreasing and continuous in $w\in(-\infty,\infty]$ and tends to zero as $w\to-\infty$.

There are in fact almost-sure counterparts of these assertions that describe how $\Lambda_0$ depends on $w$ for each Brownian path, but we do not need them here.

The monotonicity is a consequence of uniqueness of the diffusion path from each space-time point: two paths started from $(x,w_0)$ and $(x,w_1)$ with $w_0<w_1$ never cross, so if the upper path explodes to $-\infty$ then the lower path must do so as well. The continuity is a general property of statistics of diffusions: $\kappa_{(x,p_x)}(\{\infty\})$ is a martingale, so $F_{\beta,w}(x)$ is in fact space-time harmonic. (Again, the behaviour at $w=+\infty$ may be understood by changing coordinates.)

The final assertion is that for fixed $x_0$ explosion becomes certain as $w\to-\infty$. It may be verified by a domination argument involving the ODE (4.2) (time-shifted as above so that $\lambda=0$ and the initial time is $x_0$), whose paths explode simultaneously with those of (1.7). Given $\varepsilon>0$, let $M$ be such that $\mathbf{P}(\sup_{x\in[x_0,x_0+1]}\lvert b_x\rvert>M)<\varepsilon$. It is easy to check that for $r_0$ sufficiently negative, the solution of $r'=x-(r+M)^2$ with initial value $r(x_0)=r_0$ explodes to $-\infty$ before time $x_0+1$. Now consider the solution of $q'=x-(q+b)^2$ with $q(x_0)\le r_0\le-M$. With probability $1-\varepsilon$ we have $q'(x)\le r'(x)$ whenever $q(x)=r(x)$, so the paths never cross and $q$ explodes as well. ∎
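The deterministic step in the argument above, that the comparison equation $r'=x-(r+M)^2$ blows up within unit time once $r_0$ is sufficiently negative, can be checked numerically. The sketch below uses a fixed-step Runge-Kutta scheme; the function name `explosion_time`, the step size, the blow-up threshold, and the parameter values $x_0=0$, $M=2$, $r_0=-10$ are all illustrative assumptions, not values from the text.

```python
def explosion_time(x0, r0, M, h=1e-4, blowup=-1e6):
    """Integrate r' = x - (r + M)^2 from (x0, r0) over [x0, x0 + 1] with classical RK4.

    Returns the first grid time at which r < blowup, or None if no blow-up is detected.
    """
    def rhs(x, r):
        return x - (r + M) ** 2

    x, r = x0, r0
    for _ in range(int(round(1.0 / h))):
        k1 = rhs(x, r)
        k2 = rhs(x + h / 2, r + h / 2 * k1)
        k3 = rhs(x + h / 2, r + h / 2 * k2)
        k4 = rhs(x + h, r + h * k3)
        r += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        x += h
        if r < blowup:  # treat crossing the threshold as explosion to -infinity
            return x
    return None

t_explode = explosion_time(x0=0.0, r0=-10.0, M=2.0)  # very negative start
t_mild = explosion_time(x0=5.0, r0=0.0, M=0.0)       # mild start, for contrast
print(t_explode, t_mild)
```

For the first call, comparison with $s'=1-s^2$ for $s=r+M$ started at $s=-8$ shows that the exact solution explodes before time $\operatorname{arccoth}(8)\approx 0.126$, well inside the unit window; for the second call the solution stays bounded on $[5,6]$.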

Writing $L$ for the space-time generator of the SDE (1.7), the PDE (1.9) is simply the equation $LF=0$. Therefore the hitting probability $F(x,w)=F_{\beta,w}(x)$ satisfies the PDE. The boundary behaviour (1.10) follows from Lemma 4.1 and the fact that $F(\cdot,w)$ is a distribution function for each $w$. Specifically, the lower part of the boundary behaviour follows from the fact that $F(x,w)$ is increasing in $x$ and $F(x,w)\to 0$ as $w\to-\infty$ for each $x$. The upper part follows from the fact that $F(x,w)$ is increasing in $w$ and $F(x,w)\to 1$ for fixed $w$ as $x\to\infty$.

As promised, we indicate how the laws of the higher eigenvalues $\Lambda_1,\Lambda_2,\ldots$ may be characterized in terms of the PDE (1.9). The characterization is inductive and follows from (4.3) by reasoning just as in the preceding proof.

Let $F_{(0)}(x,w)=\mathbf{P}_{\beta,w}(-\Lambda_0<x)$. For each $k=1,2,\ldots$, the boundary value problem

has a unique bounded solution $F_{(k)}$, and we have $\mathbf{P}_{\beta,w}(-\Lambda_k<x)=F_{(k)}(x,w)$ for $w\in(-\infty,\infty)$; further, $\mathbf{P}_{\beta,\infty}(-\Lambda_k<x)=\lim_{w\to\infty}F_{(k)}(x,w)$.

Connection with Painlevé II

We now prove Theorem 1.9 and Corollary 1.10. We will need some standard facts about the function $u(x)$ defined by (1.11),(1.12) and the derived functions $v(x),\ E(x),\ F(x)$ defined in (1.13),(1.14).

$E(x)=O(e^{-cx^{3/2}})$ for some $c>0$ as $x\to+\infty$.

We will also take for granted some additional information about the functions $f(x,w)$, $g(x,w)$ defined by (1.15),(1.16).

These properties follow from an analysis of the associated Riemann-Hilbert problem with the special monodromy data corresponding to the Hastings-McLeod solution (see Fokas et al. 2006). They are proved in Baik and Rains (2001) except for (iv) which goes back to Deift and Zhou (1995). Interestingly (1.16) and (5.1) are interchangeable in that the latter also uniquely determines a solution of (1.15); this fact does not depend on the specific solution of (1.11) specified by (1.12). By contrast, (5.2) does depend on (1.12). Equations (1.15),(5.3) constitute a so-called Lax pair for the Painlevé II equation (1.11). (It is in fact a simple transformation of the standard Flaschka-Newell Lax pair.) The consistency condition of this overdetermined system—i.e. that the partials commute—is the Painlevé II equation.

and substitute. The coefficient of $g$ vanishes and the coefficient of $f$ is

Differentiating, we see that this quantity is constant by (1.11). As all terms vanish in the limit as $x\to\infty$, the constant is zero.

It is a little more work to get boundedness and the boundary behaviour (1.10) this time. Dropping the scale factors on $x,w$, consider

Clearly $G>0$. For fixed $w$, $G\to 1$ as $x\to\infty$ by (5.5) and the fact that $E^{-1/2}-E^{1/2}=O(e^{-cx^{3/2}})$ while $g=O(e^{wx})$ from (5.4). Now by (5.3) we have

which is positive for $w\le 0$. Boundedness in the lower half-plane $\{w\le 0\}$ follows, as does the lower boundary behaviour using (5.2).

The upper boundary behaviour follows as well. Indeed, as $x,w\to\infty$ together the coefficient of $g$ vanishes while the coefficient of $f$ tends to 1; the $g$-term then vanishes while the $f$-term tends to 1 as in the $\beta=2$ case.

These identities are straightforward consequences of the theorem, (1.16) and (5.1). ∎

Acknowledgements The second author is very grateful to José Ramírez for conversations that helped this project go forward. The first author is indebted to Alexander Its for his patient and thorough explanations. We would like to thank Jinho Baik, Alexei Borodin, Peter Forrester, Arno Kuijlaars, Eric Rains, Brian Rider, Brian Sutton, Dong Wang and Ofer Zeitouni for interesting and helpful discussions, as well as AIM and MSRI for providing stimulating environments in December 2009 and September 2010 workshops. The work of the first author was supported in part by an NSERC postgraduate scholarship held at the University of Toronto, and that of the second author by the Canada Research Chair program and the NSERC DAS program.

References