Lower bounds on the smallest eigenvalue of a sample covariance matrix

Pavel Yaskov

Introduction

Lower bounds on the smallest eigenvalue of a sample covariance matrix (or a Gram matrix) play a crucial role in least squares problems in high-dimensional statistics (see, for example, ). These problems motivate the present work.

In this paper we derive sharp lower bounds for $\lambda_{p}(n^{-1}\mathbf{X}_{pn}\mathbf{X}_{pn}^{\top})$, where $\lambda_{p}(A)$ is the smallest eigenvalue of a $p\times p$ matrix $A$. We try to impose as few restrictions on the components of $X_{p}$ as possible. In the proofs we use the same strategy as Srivastava and Vershynin.
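To make the target quantity concrete, the display below rewrites it through the variational characterisation of the smallest eigenvalue of a symmetric matrix. It is only a sketch, under the assumption (suggested by the notation $X_{pk}$ used in the proofs, but not stated in this section) that the columns of $\mathbf{X}_{pn}$ are the vectors $X_{p1},\ldots,X_{pn}$:

$$
\lambda_{p}\Big(n^{-1}\mathbf{X}_{pn}\mathbf{X}_{pn}^{\top}\Big)
=\min_{v\in\mathbb{R}^{p},\ \|v\|=1}\ \frac{1}{n}\sum_{k=1}^{n}\big(X_{pk}^{\top}v\big)^{2}.
$$

In this form, a lower bound on the smallest eigenvalue is a lower bound on the empirical second moments of the projections $X_{pk}^{\top}v$ that holds uniformly over all unit directions $v$.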

Main results

whenever $y\leqslant C_{2}K_{p}^{2}$ and $Z=Z(p,n)$ as above.

Useful bounds for $c_{p}(a)$ and $C_{p}(a)$ in terms of $L_{p}(\alpha)$ and $M_{p}(\alpha)$ are given in the following proposition.

In addition, for all $\alpha\in(0,2]$ and each $a>0$, $C_{p}(a)$ is bounded from above by

Applications

We now describe different corollaries of Theorem 2.1 and Theorem 2.2. The next corollary extends Theorem 1.3 in and Theorem 3.1 in (for $A_{i}=X_{pi}X_{pi}^{\top}$).

One may further weaken the assumptions in Corollary 3.1. Namely, one may assume that $M_{p}(\alpha)<\infty$ for some $\alpha\in(0,2)$. The conclusion of Corollary 3.1 will still hold with some $C_{\alpha}>0$ that depends only on $\alpha$ and $M_{p}(\alpha)$. In the case $\alpha=2$, one would have a lower bound of the form $1-C_{2}\sqrt{y\log(e/y)}$ with $C_{2}>0$ depending only on $M_{p}(2)$.

Theorems 2.1 and 2.2 improve Theorem 2.1 in as the next corollary shows.

The same conclusion holds if $L_{p}(2)<\infty$ and $n\geqslant 16L_{p}(2)\varepsilon^{-2}p$.
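The sample-size condition can be read as a bound on the ratio $p/n$. As a short check, assuming $\varepsilon>0$ and writing $C=\sqrt{L_{p}(2)}$ and $y=\varepsilon^{2}/(16C^{2})$ (the choice made in the proof of Corollary 3.3):

$$
n\geqslant 16L_{p}(2)\varepsilon^{-2}p
\iff
\frac{p}{n}\leqslant\frac{\varepsilon^{2}}{16L_{p}(2)}=\frac{\varepsilon^{2}}{16C^{2}}=y,
\qquad\text{in which case}\qquad
C\sqrt{y}=\frac{\varepsilon}{4}.
$$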

Let us formulate the final corollary that improves Theorem 3.1 in for small $K_{p}$.

The range of applicability of Corollary 3.4 is very wide. Namely, there exists a universal constant $K>0$ such that $K_{p}\geqslant K$ for a very large class of isotropic random vectors $X_{p}$. By Corollary 3.4, this means that $\lambda_{p}(n^{-1}\mathbf{X}_{pn}\mathbf{X}_{pn}^{\top})$ is bounded away from zero by a universal constant.

The existence of $K$ follows from results related to Kashin’s decomposition theorem. The infinite-dimensional version of this theorem is given in Kashin (for a proof, see ). It states the following.

There is a universal constant $K>0$ such that $L_{2}(0,1)=H_{1}\oplus H_{2}$ for some linear subspaces $H_{i}\subset L_{2}(0,1)$, $i=1,2$, such that $\|x\|_{1}\geqslant K\|x\|_{2}$ for all $x\in H_{1}\cup H_{2}$, where $\|x\|_{d}$ is the standard norm in $L_{d}(0,1)$, $d=1,2$.
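On $(0,1)$ the reverse comparison between the two norms always holds, so the theorem says that the $L_{1}$ and $L_{2}$ norms are equivalent on each $H_{i}$. A short verification via the Cauchy-Schwarz inequality, for any $x\in L_{2}(0,1)$:

$$
\|x\|_{1}=\int_{0}^{1}|x(t)|\cdot 1\,dt
\leqslant\Big(\int_{0}^{1}|x(t)|^{2}\,dt\Big)^{1/2}\Big(\int_{0}^{1}1\,dt\Big)^{1/2}
=\|x\|_{2}.
$$

Combined with the theorem, this gives $K\|x\|_{2}\leqslant\|x\|_{1}\leqslant\|x\|_{2}$ for all $x\in H_{1}\cup H_{2}$.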

Proofs

In the proofs of Theorem 2.1 and Theorem 2.2, we follow the strategy of Srivastava and Vershynin. The key step is the following lemma.

The proof of Lemma 4.1 is given in the Appendix.

The strategy itself is as follows. Let $A_{0}$ be the $p\times p$ zero matrix and

Put $l_{k}=l_{k-1}+\Delta_{k}$ for $1\leqslant k\leqslant n$, where

Let $U$ and $V$ be non-negative random variables. Then, for all $a>0$,

Proof of Theorem 2.1. In Lemma 4.3, take $X_{p}=X_{pk}$,

here and below, all inequalities involving conditional expectations hold almost surely. By (2), the latter implies that

Assume first that $C^{2}=L_{p}(2)<\infty$ and $p/n\leqslant y$ for some $y>0$. Define $X_{p}^{\top}AX_{p}$ and $X_{p}^{\top}BX_{p}$ in the same way as in (3). Then, by Lemma 4.3 (with $\varphi=1/(3b)$),

Taking $\varphi=\sqrt{y}/(2C)$ in (2), we get $p/(n\varphi)\leqslant y/\varphi=2C\sqrt{y}$ and

Finally, consider the case with $K_{p}>0$ (the case with $K_{p}=0$ is trivial). By Lemma 4.3 with $b=(3\varphi)^{-1}$ and $\varphi=1/4$,

Taking $p/n\leqslant y=K_{p}^{2}/16$ in (2), we get

Taking $y=La^{-1-\alpha/2}$, we get the desired inequality.

Consider the case $\alpha=2$. By Theorem 2.2 with $y=p/n$ and $C=\sqrt{L_{p}(2)}$,

Proof of Corollary 3.3. Set $L=L_{p}(\alpha)$ for given $\alpha\in(0,2)$. By Proposition 2.3,

Similarly, taking $y=\varepsilon^{2}/(16C^{2})$ for $C=\sqrt{L_{p}(2)}$ in Theorem 2.2, we get that

Proof of Corollary 3.4. Let $C_{0},C_{1},C_{2}>0$ be such that the second bound in Theorem 2.2 holds. Then, for $p/n\leqslant C_{2}K_{p}^{2},$

Putting $C_{0}^{*}=C_{0}/2$, $C_{1}^{*}=C_{0}^{2}/(8C_{1}^{2})$ and $C_{2}^{*}=C_{2}$, we finish the proof.

Appendix

Proof of Lemma 4.1. By Lemma 2.2 in Srivastava and Vershynin, if $A-(l+\Delta)I_{p}\succ 0$ and $q(l+\Delta,v)/[1+Q(l+\Delta,v)]\geqslant\Delta$, then

since $\Delta\leqslant 1/(3\varphi)$ by construction.

By Bernoulli’s inequality, $(1-x)^{3}\geqslant 1-3x$ whenever $x\in[0,1]$. Hence,

where the last equality holds by the definition of $\Delta$.

Proof of Lemma 4.2. We have

for all $a>0$. By the Cauchy-Schwarz inequality,

This gives the first inequality. Letting $a$ tend to infinity, we get the second inequality.

The last inequality also follows from the Cauchy-Schwarz inequality. Namely,

Fix $j\in\{1,\ldots,p\}$ and $b>0.$ By Lemma 4.2,

We have $x^{2}/(x+c)\geqslant x-c$ for all $x,c\geqslant 0$. This yields that
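The elementary inequality used here can be checked directly. A one-line sketch, for $x,c\geqslant 0$ with $x+c>0$ (so that the quotient is defined):

$$
(x-c)(x+c)=x^{2}-c^{2}\leqslant x^{2}
\quad\Longrightarrow\quad
\frac{x^{2}}{x+c}\geqslant x-c .
$$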

Hence, $C\leqslant 5C_{p}(a)/3.$ Combining all estimates together yields

Consider the function $f(x)=x^{2}/(1+b^{-1}x)$, $x\geqslant 0$. Its derivative is $f'(x)=x(2+b^{-1}x)/(1+b^{-1}x)^{2}$, which is non-negative for $x\geqslant 0$.

Now consider the case with $L_{p}(2)<\infty$. By Lemma 4.2,

To finish the proof, we only need to note that

Proof of Lemma 4.4. Since $e^{-x}\leqslant 1-x+x^{2}/2$ for all $x\geqslant 0,$ we have
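The inequality for $e^{-x}$ invoked here is a standard second-order Taylor bound. One way to verify it, sketched with an auxiliary function $g$ that is introduced only for this check:

$$
g(x)=1-x+\frac{x^{2}}{2}-e^{-x},\qquad g(0)=0,\qquad g'(x)=-1+x+e^{-x}\geqslant 0\quad\text{for } x\geqslant 0,
$$

where the last inequality is the bound $e^{-x}\geqslant 1-x$. Hence $g$ is non-decreasing on $[0,\infty)$ and $g(x)\geqslant 0$ there, which is the stated inequality.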
