A simple proof that random matrices are democratic

Mark A. Davenport, Jason N. Laska, Petros T. Boufounos, Richard G. Baraniuk

Introduction

We consider linear measurements of a signal ${\mathbf{x}}$ of length $N$ of the form

$${\mathbf{y}}=\boldsymbol{\Phi}{\mathbf{x}},\tag{1}$$

where $\boldsymbol{\Phi}$ is an $M\times N$ matrix that models the measurement system. The hope is that we can design $\boldsymbol{\Phi}$ so that ${\mathbf{x}}$ can be accurately recovered even when $M\ll N$. In general this is not possible, but if ${\mathbf{x}}$ is $K$-sparse, meaning that it has only $K$ nonzero entries, then it is possible to design a $\boldsymbol{\Phi}$ that preserves the information about ${\mathbf{x}}$ using only $M=O(K\log(N/K))$ measurements. The most commonly studied $\boldsymbol{\Phi}$ that satisfy this bound on $M$ are random, i.e., each entry of $\boldsymbol{\Phi}$ is drawn independently from some suitable distribution. We will focus our attention on such $\boldsymbol{\Phi}$.

Among the advantages of random measurements is a property commonly referred to as democracy. While it is rarely defined rigorously in the literature, democracy is usually taken to mean that each measurement contributes a similar amount of information about the signal ${\mathbf{x}}$ to the compressed representation ${\mathbf{y}}$. The term was originally introduced with respect to quantization, i.e., a democratic quantizer would ensure that each bit is given “equal weight.” As the CS framework developed, it became empirically clear that CS systems exhibited this property with respect to compression. Others have described democracy to mean that each measurement is equally important (or unimportant). Despite the fact that democracy is so frequently touted as an advantage of random measurements, it has received little analytical attention in the CS context. Perhaps more surprisingly, the property has not been explicitly exploited in applications until recently.

The fact that random measurements are democratic seems intuitive; when using random measurements, each measurement is a randomly weighted sum of a large fraction (or all) of the entries of ${\mathbf{x}}$ , and since the weights are chosen independently at random, no preference is given to any particular entries. More concretely, suppose that the measurements $y_{1},y_{2},\ldots,y_{M}$ are independent and identically distributed (i.i.d.) according to some distribution $f_{Y}$ , as is the case for the $\boldsymbol{\Phi}$ considered in this report. Now suppose that we select $\widetilde{M}<M$ of the $y_{i}$ at random (or according to some procedure that is independent of ${\mathbf{y}}$ ). Then clearly, we are left with a length- $\widetilde{M}$ measurement vector $\widetilde{{\mathbf{y}}}$ such that each $\widetilde{y}_{i}\sim f_{Y}$ . Stated another way, if we set $D=M-\widetilde{M}$ , then there is no difference between collecting $\widetilde{M}$ measurements and collecting $M$ measurements and deleting $D$ of them, provided that this deletion is done independently of the actual values of ${\mathbf{y}}$ .

However, following this line of reasoning will ultimately lead to a rather weak definition of democracy. To see this, consider the case where the measurements are deleted by an adversary. By adaptively deleting the entries of ${\mathbf{y}}$ one can change the distribution of $\widetilde{{\mathbf{y}}}$ . For example, the adversary can delete the $D$ largest elements of ${\mathbf{y}}$ , thereby skewing the distribution of $\widetilde{{\mathbf{y}}}$ . In many cases, especially if the same matrix $\boldsymbol{\Phi}$ will be used repeatedly with different measurements being deleted each time, it would be far better to know that any $\widetilde{M}$ measurements will be sufficient to reconstruct the signal. This is a significantly stronger requirement.
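The contrast between value-independent and adversarial deletion can be illustrated with a small simulation. This is only a sketch under stated assumptions: the measurements are modeled as i.i.d. standard Gaussians, "largest" is read as largest in magnitude, and the helper names (`delete_random`, `delete_largest`) are hypothetical.

```python
import random

def mean_abs(values):
    """Average magnitude, used here as a crude summary of the distribution."""
    return sum(abs(v) for v in values) / len(values)

def delete_random(y, d, rng):
    """Delete d entries chosen independently of the values in y."""
    keep = sorted(rng.sample(range(len(y)), len(y) - d))
    return [y[i] for i in keep]

def delete_largest(y, d):
    """Adversarial deletion (assumed form): drop the d largest-magnitude entries."""
    order = sorted(range(len(y)), key=lambda i: abs(y[i]))
    keep = sorted(order[:len(y) - d])
    return [y[i] for i in keep]

rng = random.Random(0)
y = [rng.gauss(0.0, 1.0) for _ in range(2000)]  # stand-in for i.i.d. measurements
d = 500
y_random = delete_random(y, d, rng)
y_adversarial = delete_largest(y, d)

# Random deletion leaves the summary roughly unchanged; adversarial deletion shrinks it.
print(mean_abs(y), mean_abs(y_random), mean_abs(y_adversarial))
```

The surviving entries after adversarial deletion are exactly the smallest-magnitude ones, so their distribution is truncated and no longer matches $f_{Y}$.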

In order to formally define this stronger notion of democracy, we must first describe the properties that a matrix must satisfy to ensure stable reconstruction. Towards that end, we recall the definition of the restricted isometry property (RIP) for the matrix $\boldsymbol{\Phi}$ .

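Definition 1. A matrix $\boldsymbol{\Phi}$ satisfies the RIP of order $K$ with constant $\delta\in(0,1)$ if

$$(1-\delta)\|{\mathbf{x}}\|_{2}^{2}\leq\|\boldsymbol{\Phi}{\mathbf{x}}\|_{2}^{2}\leq(1+\delta)\|{\mathbf{x}}\|_{2}^{2}\tag{2}$$

holds for all ${\mathbf{x}}$ such that $\|{\mathbf{x}}\|_{0}\leq K$.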

Much is known about matrices that satisfy the RIP, but for our purposes it suffices to note that if we draw a random $M\times N$ matrix $\boldsymbol{\Phi}$ whose entries $\phi_{ij}$ are i.i.d. sub-Gaussian random variables, then provided that

$$M=O\left(K\log(N/K)/\delta^{2}\right),$$

we have that with high probability $\boldsymbol{\Phi}$ will satisfy the RIP of order $K$ with constant $\delta$.

The RIP also provides us with a way to quantify our notion of democracy. To do so, we first establish some notation that will prove useful throughout this report. Let $\Gamma\subset\{1,2,\ldots,M\}$. By $\boldsymbol{\Phi}^{\Gamma}$ we mean the $|\Gamma|\times N$ matrix obtained by selecting the rows of $\boldsymbol{\Phi}$ indexed by $\Gamma$. Alternatively, if $\Lambda\subset\{1,2,\dots,N\}$, then we use $\boldsymbol{\Phi}_{\Lambda}$ to indicate the $M\times|\Lambda|$ matrix obtained by selecting the columns of $\boldsymbol{\Phi}$ indexed by $\Lambda$. Following prior work, we now formally define democracy as follows.

Definition 2. Let $\boldsymbol{\Phi}$ be an $M\times N$ matrix, and let $\widetilde{M}\leq M$ be given. We say that $\boldsymbol{\Phi}$ is $(\widetilde{M},K,\delta)$-democratic if for all $\Gamma$ such that $|\Gamma|\geq\widetilde{M}$ the matrix $\boldsymbol{\Phi}^{\Gamma}$ satisfies the RIP of order $K$ with constant $\delta$.
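To make the definition concrete, the following sketch checks democracy exhaustively in the simplest case $K=1$, where the RIP only constrains the squared norm of each column. It is an illustration under stated assumptions, not a practical test: the function names are hypothetical, indices are 0-based, the toy matrix is deterministic rather than random, and the enumeration over all $\Gamma$ is feasible only for very small $M$.

```python
from itertools import combinations

def select_rows(phi, gamma):
    """Row submatrix Phi^Gamma (gamma holds 0-based row indices)."""
    return [phi[i] for i in gamma]

def rip_constant_order1(mat):
    """Smallest delta with (1 - delta) <= ||column||^2 <= (1 + delta) for every column."""
    n_cols = len(mat[0])
    worst = 0.0
    for j in range(n_cols):
        sq_norm = sum(row[j] ** 2 for row in mat)
        worst = max(worst, abs(sq_norm - 1.0))
    return worst

def is_democratic_order1(phi, m_tilde, delta):
    """Check the (m_tilde, 1, delta)-democracy condition by enumerating every Gamma."""
    m = len(phi)
    for size in range(m_tilde, m + 1):
        for gamma in combinations(range(m), size):
            if rip_constant_order1(select_rows(phi, gamma)) > delta:
                return False
    return True

# Hypothetical toy matrix: every column has unit norm, every entry has magnitude 1/2.
phi = [[0.5, 0.5], [0.5, -0.5], [-0.5, 0.5], [0.5, 0.5]]
print(is_democratic_order1(phi, 3, 0.3))  # True: any 3 rows give squared column norm 0.75
print(is_democratic_order1(phi, 3, 0.2))  # False: |0.75 - 1| exceeds 0.2
```

For $K>1$ the same enumeration would additionally require the extreme singular values of every $K$-column submatrix, which is why the exhaustive check does not scale.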

In Section 2 below we present a simple proof that Gaussian matrices are democratic and demonstrate how the proof can be extended to sub-Gaussian matrices. The core of this proof has appeared previously, but it is included in full in this report. In Section 3 we discuss the implications of the result and alternative interpretations. Section 4 contains the additional theorems required by the proof.

Random matrices are democratic

We now demonstrate that certain randomly generated matrices are democratic. While the theorem actually holds (with different constants) for the more general class of sub-Gaussian matrices, for simplicity we restrict our attention to Gaussian matrices. We provide discussion of the sub-Gaussian case in Section 4.

Theorem 1. Let $\boldsymbol{\Phi}$ be an $M\times N$ matrix with elements $\phi_{ij}$ drawn according to $\mathcal{N}(0,1/M)$ and let $\widetilde{M}\leq M$, $K<\widetilde{M}$, and $\delta\in(0,1)$ be given. Define $D=M-\widetilde{M}$. If

$$M=C_{1}(K+D)\log\left(\frac{N+M}{K+D}\right),\tag{4}$$

then with probability exceeding $1-3e^{-C_{2}M}$ we have that $\boldsymbol{\Phi}$ is $(\widetilde{M},K,\delta/(1-\delta))$-democratic, where $C_{1}$ is arbitrary and $C_{2}=(\delta/8)^{2}-\log(42e/\delta)/C_{1}.$
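Although $C_{1}$ is arbitrary, the probability bound $1-3e^{-C_{2}M}$ is only informative when $C_{2}>0$. The sketch below evaluates $C_{2}$ and the threshold on $C_{1}$ at which it becomes positive. The function names are hypothetical, and the logarithm is assumed to be the natural logarithm.

```python
import math

def c2(delta, c1):
    """C_2 = (delta/8)^2 - log(42 e / delta) / C_1, with log taken as natural log."""
    return (delta / 8.0) ** 2 - math.log(42.0 * math.e / delta) / c1

def smallest_useful_c1(delta):
    """Value of C_1 at which C_2 crosses zero; larger C_1 gives C_2 > 0."""
    return math.log(42.0 * math.e / delta) / (delta / 8.0) ** 2

delta = 0.5
threshold = smallest_useful_c1(delta)
print(threshold, c2(delta, 2.0 * threshold))
```

The threshold grows like $\log(1/\delta)/\delta^{2}$ as $\delta$ shrinks, so the constants here are pessimistic for practical problem sizes.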

Proof. Our proof consists of two main steps. We begin by defining the $M\times(N+M)$ matrix ${\mathbf{A}}=[{\mathbf{I}}~{}\boldsymbol{\Phi}]$ formed by appending $\boldsymbol{\Phi}$ to the $M\times M$ identity matrix. Theorem 2, proved in Section 4, demonstrates that under the assumptions in the theorem statement, with probability exceeding $1-3e^{-C_{2}M}$ we have that ${\mathbf{A}}$ satisfies the RIP of order $K+D$ with constant $\delta$. The second step is to use this fact to show that all possible $\widetilde{M}\times N$ submatrices of $\boldsymbol{\Phi}$ satisfy the RIP of order $K$ with constant $\delta/(1-\delta)$.

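Towards this end, we let $\Gamma\subset\{1,2,\ldots,M\}$ be an arbitrary subset of rows such that $|\Gamma|\geq\widetilde{M}$. Define $\Lambda=\{1,2,\ldots,M\}\setminus\Gamma$ and note that $|\Lambda|\leq D$. Additionally, let

$${\mathbf{P}}_{\Lambda}={\mathbf{A}}_{\Lambda}{\mathbf{A}}_{\Lambda}^{\dagger}\tag{5}$$

be the orthogonal projector onto $\mathcal{R}({\mathbf{A}}_{\Lambda})$, i.e., the range, or column space, of ${\mathbf{A}}_{\Lambda}$, where ${\mathbf{A}}_{\Lambda}^{\dagger}=({\mathbf{A}}_{\Lambda}^{T}{\mathbf{A}}_{\Lambda})^{-1}{\mathbf{A}}_{\Lambda}^{T}$ denotes the Moore-Penrose pseudoinverse of ${\mathbf{A}}_{\Lambda}$. Furthermore, we define

$${\mathbf{P}}_{\Lambda}^{\perp}={\mathbf{I}}-{\mathbf{P}}_{\Lambda}\tag{6}$$

as the orthogonal projector onto the orthogonal complement of $\mathcal{R}({\mathbf{A}}_{\Lambda})$. In words, this projector nulls the columns of ${\mathbf{A}}$ corresponding to the index set $\Lambda$. Now, note that $\Lambda\subset\{1,2,\ldots,M\}$, so ${\mathbf{A}}_{\Lambda}={\mathbf{I}}_{\Lambda}$. Thus,

$${\mathbf{P}}_{\Lambda}={\mathbf{I}}_{\Lambda}{\mathbf{I}}_{\Lambda}^{\dagger}={\mathbf{I}}_{\Lambda}({\mathbf{I}}_{\Lambda}^{T}{\mathbf{I}}_{\Lambda})^{-1}{\mathbf{I}}_{\Lambda}^{T}={\mathbf{I}}_{\Lambda}{\mathbf{I}}_{\Lambda}^{T}={\mathbf{I}}(\Lambda),$$

where we use ${\mathbf{I}}(\Lambda)$ to denote the $M\times M$ matrix with all zeros except for ones on the diagonal entries corresponding to the columns indexed by $\Lambda$. (We distinguish the $M\times M$ matrix ${\mathbf{I}}(\Lambda)$ from the $M\times|\Lambda|$ matrix ${\mathbf{I}}_{\Lambda}$: in the former case we replace columns not indexed by $\Lambda$ with zero columns, while in the latter we remove these columns to form a smaller matrix.) Similarly, we have

$${\mathbf{P}}_{\Lambda}^{\perp}={\mathbf{I}}-{\mathbf{P}}_{\Lambda}={\mathbf{I}}(\Gamma).$$

Thus, we observe that the matrix ${\mathbf{P}}_{\Lambda}^{\perp}{\mathbf{A}}={\mathbf{I}}(\Gamma){\mathbf{A}}$ is simply the matrix ${\mathbf{A}}$ with zeros replacing all entries on any row $i$ such that $i\notin\Gamma$, i.e., $({\mathbf{P}}_{\Lambda}^{\perp}{\mathbf{A}})^{\Gamma}={\mathbf{A}}^{\Gamma}$ and $({\mathbf{P}}_{\Lambda}^{\perp}{\mathbf{A}})^{\Lambda}=\boldsymbol{0}$. Furthermore, Theorem 3 states that for ${\mathbf{A}}$ satisfying the RIP of order $K+D$ with constant $\delta$, we have that

$$\left(1-\frac{\delta}{1-\delta}\right)\|{\mathbf{u}}\|_{2}^{2}\leq\|{\mathbf{P}}_{\Lambda}^{\perp}{\mathbf{A}}{\mathbf{u}}\|_{2}^{2}\leq(1+\delta)\|{\mathbf{u}}\|_{2}^{2}$$

holds for all ${\mathbf{u}}\in\mathbb{R}^{N+M}$ such that $\|{\mathbf{u}}\|_{0}\leq K+D-|\Lambda|$ and $\mathrm{supp}({\mathbf{u}})\cap\Lambda=\emptyset$. Since $|\Lambda|\leq D$, this covers every $K$-sparse ${\mathbf{u}}$ supported on the last $N$ columns of ${\mathbf{A}}$, for which ${\mathbf{P}}_{\Lambda}^{\perp}{\mathbf{A}}{\mathbf{u}}={\mathbf{I}}(\Gamma)\boldsymbol{\Phi}{\mathbf{x}}$ with ${\mathbf{x}}$ denoting the last $N$ entries of ${\mathbf{u}}$. Because $\|{\mathbf{I}}(\Gamma)\boldsymbol{\Phi}{\mathbf{x}}\|_{2}=\|\boldsymbol{\Phi}^{\Gamma}{\mathbf{x}}\|_{2}$, the matrix $\boldsymbol{\Phi}^{\Gamma}$ satisfies the RIP of order $K$ with constant $\delta/(1-\delta)$, which establishes the theorem. ∎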
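The claim that projecting away the identity columns indexed by $\Lambda$ simply zeroes the corresponding rows of ${\mathbf{A}}$ can be checked on a toy example. This is a minimal sketch, assuming 0-based indices and a small hypothetical $\boldsymbol{\Phi}$; the helper names are not from the source.

```python
def identity_columns(m, lam):
    """M x |Lambda| matrix I_Lambda: the identity with only the columns in lam kept."""
    return [[1.0 if i == j else 0.0 for j in lam] for i in range(m)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def complement_projector(m, lam):
    """P_perp = I - I_Lambda I_Lambda^T, using the fact that I_Lambda^T I_Lambda = I."""
    i_lam = identity_columns(m, lam)
    p = matmul(i_lam, transpose(i_lam))
    return [[(1.0 if i == j else 0.0) - p[i][j] for j in range(m)] for i in range(m)]

# Hypothetical toy sizes: M = 4, N = 3, deleted rows Lambda = {1, 3} (0-based).
m = 4
phi = [[0.1, -0.2, 0.3], [0.4, 0.5, -0.6], [-0.7, 0.8, 0.9], [1.0, -1.1, 1.2]]
a = [[1.0 if i == j else 0.0 for j in range(m)] + phi[i] for i in range(m)]  # A = [I Phi]
lam = [1, 3]
masked = matmul(complement_projector(m, lam), a)

for row in masked:
    print(row)
```

Rows 1 and 3 of `masked` are zero, while rows 0 and 2 are unchanged, matching the description of ${\mathbf{I}}(\Gamma){\mathbf{A}}$.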

Discussion

Observe that we require roughly $O(D\log(N))$ additional measurements to ensure that $\boldsymbol{\Phi}$ is $(\widetilde{M},K,\delta)$-democratic compared to the number of measurements required to simply ensure that $\boldsymbol{\Phi}$ satisfies the RIP of order $K$. This seems intuitive; if we wish to be robust to the loss of any $D$ measurements while retaining the RIP of order $K$, then we should expect to take at least $D$ additional measurements. This is not unique to the CS framework. For instance, by oversampling, i.e., sampling faster than the minimum required Nyquist rate, uniform sampling systems can also improve robustness with respect to the loss of measurements. However, a benefit of the democratic CS system is that the number of additional measurements needed grows more slowly than in the Nyquist case. To see this, consider the case where we lose $D$ samples or measurements. For a fixed time period, suppose that sampling the signal at the Nyquist rate yields $N$ samples. To be robust to the loss of a contiguous block of $D$ samples, we must sample at $D+1$ times the Nyquist rate, yielding $DN$ additional samples. In contrast, the number of additional measurements needed for a CS measurement system to be democratic is $O(D\log(N))$, given by (4). Thus, the number of additional samples required by a Nyquist sampler depends linearly on $D$ and $N$, while the number of additional measurements for democratic CS systems is still linear in $D$ but only logarithmic in $N$. If $N$ is large, this can result in tremendous savings. Note also that for a fixed $N$ and $K$, by driving $M$ higher a CS measurement system can be robust to the loss of a large fraction of the acquired measurements, whereas in Nyquist oversampling, the fraction of (consecutive) samples that can be dropped can never exceed $1/N$.
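The scaling comparison above can be made concrete with a few numbers. The sketch below is illustrative only: the constant `c` hidden in the $O(D\log(N))$ term is a hypothetical placeholder set to 1, so the CS figures indicate growth rates rather than actual measurement counts.

```python
import math

def nyquist_extra_samples(n, d):
    """Sampling at (D + 1) times the Nyquist rate gives (D + 1) N samples, i.e. D N extra."""
    return d * n

def cs_extra_measurements(n, d, c=1.0):
    """O(D log N) extra measurements; c is a hypothetical constant, not from the source."""
    return c * d * math.log(n)

for n in (256, 2048, 16384):
    d = 10
    print(n, nyquist_extra_samples(n, d), round(cs_extra_measurements(n, d), 1))
```

Increasing $N$ by a factor of 64 multiplies the Nyquist overhead by 64 but less than doubles the logarithmic term.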

In some applications, this difference may have significant impact. For example, in finite dynamic range quantizers, the measurements saturate when their magnitude exceeds some level. Thus, when uniformly sampling with a low saturation level, if one sample saturates, then the likelihood that any of the neighboring samples will saturate is high, and significant oversampling may be required to ensure any benefit. However, in CS, if many adjacent measurements were to saturate, then for only a slight increase in the number of measurements we can mitigate this kind of error by simply rejecting the saturated measurements; the fact that $\boldsymbol{\Phi}$ is democratic ensures that this strategy will be effective.

In addition to robustness, Theorem 1 implies that reconstruction from a subset of CS measurements is stable to the loss of a potentially larger number of measurements than anticipated. To see this, suppose that an $M\times N$ matrix $\boldsymbol{\Phi}$ is $(M-D,K,\delta)$-democratic, but consider the situation where $D+\widetilde{D}$ measurements are dropped. It is clear from the proof of Theorem 1 that if $\widetilde{D}<K$, then the resulting matrix $\boldsymbol{\Phi}^{\Gamma}$ will satisfy the RIP of order $K-\widetilde{D}$ with constant $\delta$. Thus, if we define $\widetilde{K}=(K-\widetilde{D})/2$, then, from standard CS recovery guarantees, the reconstruction error is bounded by

$$\|{\mathbf{x}}-\widehat{{\mathbf{x}}}\|_{2}\leq C_{3}\frac{\|{\mathbf{x}}-{\mathbf{x}}_{\widetilde{K}}\|_{1}}{\sqrt{\widetilde{K}}},$$

where $\widehat{{\mathbf{x}}}$ denotes the reconstruction, ${\mathbf{x}}_{\widetilde{K}}$ denotes the best $\widetilde{K}$-term approximation of ${\mathbf{x}}$, and $C_{3}$ is an absolute constant depending on $\boldsymbol{\Phi}$ that can be bounded using the constants derived in Theorem 1. Thus, if $\widetilde{D}$ is small then the additional error caused by dropping too many measurements will also be relatively small. In contrast, there is simply no analog to this kind of stability result for uniform sampling with linear reconstruction. When the number of dropped samples exceeds $D$ (where $D$ represents the oversampling factor described above), there are no guarantees as to the accuracy of the reconstruction.

Numerical exploration

As discussed previously, the democracy property is a stronger condition than the RIP. To illustrate this point, we perform a numerical simulation. Specifically, we would like to compare the case where the measurements are dropped at random versus the case where the dropped measurements are selected by an adversary. Ideally, we would like to know whether the resulting matrices satisfy the RIP. Of course, this experiment is impossible to perform for two reasons. First, determining whether a matrix satisfies the RIP is computationally intractable, as it would require checking all possible $K$-dimensional submatrices of $\boldsymbol{\Phi}^{\Gamma}$. Second, in the adversarial setting one would also have to search for the worst possible $\Gamma$, which is intractable for the same reason. Thus, we instead perform a far simpler experiment, which serves as a very rough proxy for the experiment we would like to perform.

The experiment proceeds over $100$ trials as follows. We fix the parameters $N=2048$ and $K=13$ and vary $M$ in the range $(0,380)$. In each trial we draw a new matrix $\Phi$ with $\phi_{ij}\sim\mathcal{N}(0,1/M)$ and a new signal with $K$ nonzero coefficients, also drawn from a Gaussian distribution, and then normalize the signal so that $\|{\mathbf{x}}\|_{2}=1$. Over each set of trials we estimate two quantities:

the maximum $D$ such that we achieve exact reconstruction for a randomly selected $(M-D)\times N$ submatrix of $\Phi$ on each of the $100$ trials;

the maximum $D$ such that we achieve exact reconstruction for $R=300$ randomly selected $(M-D)\times N$ submatrices of $\Phi$ on each of the $100$ trials.

Ideally, the second case should consider all $(M-D)\times N$ submatrices of $\Phi$ rather than just 300 submatrices, but as this is not possible (for reasons discussed above) we simply perform a random sampling of the space of possible submatrices. Note also that exact recovery on one signal is also not proof that the matrix satisfies the RIP, although failure is proof that the matrix does not.
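The data-generation side of one trial can be sketched as follows. This covers only the setup described above; the reconstruction step is deliberately omitted because the source does not name the recovery algorithm. All function names are hypothetical, the sizes $M=100$ and $D=20$ are arbitrary illustrative choices, and the signal support is assumed to be chosen uniformly at random.

```python
import math
import random

def draw_matrix(m, n, rng):
    """M x N matrix with i.i.d. N(0, 1/M) entries."""
    sigma = 1.0 / math.sqrt(m)
    return [[rng.gauss(0.0, sigma) for _ in range(n)] for _ in range(m)]

def draw_sparse_signal(n, k, rng):
    """K-sparse signal with Gaussian nonzeros, normalized to unit l2 norm."""
    x = [0.0] * n
    for j in rng.sample(range(n), k):  # assumed: support drawn uniformly at random
        x[j] = rng.gauss(0.0, 1.0)
    norm = math.sqrt(sum(v * v for v in x))
    return [v / norm for v in x]

def drop_rows_at_random(phi, d, rng):
    """Return a random (M - D) x N row submatrix and the kept row indices."""
    keep = sorted(rng.sample(range(len(phi)), len(phi) - d))
    return [phi[i] for i in keep], keep

def measure(rows, x):
    """Measurements y = Phi x restricted to the given rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in rows]

rng = random.Random(1)
n, k, m, d = 2048, 13, 100, 20
phi = draw_matrix(m, n, rng)
x = draw_sparse_signal(n, k, rng)
phi_sub, kept = drop_rows_at_random(phi, d, rng)
y_sub = measure(phi_sub, x)  # input to whichever sparse recovery solver is used
print(len(y_sub), sum(1 for v in x if v != 0.0))
```

For the second quantity, `drop_rows_at_random` would be called $R=300$ times per trial and recovery would have to succeed for every draw.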

The results of this experiment are depicted in Figure 1. The circles denote data points, with the empty circles corresponding to the random selection experiment and the solid circles corresponding to the democracy experiment. The lines denote the best linear fit for each data set where $D>0$, with the dashed line corresponding to the random selection experiment and the solid line corresponding to the democracy experiment.

The maximum $D$ corresponding to the random selection experiment grows linearly in $M$ (with coefficient 1) once the minimum number of measurements required for RIP, denoted by $M^{\prime}$ , is reached. This is because beyond this point at most $D=M-M^{\prime}$ measurements can be discarded. As demonstrated by the plot, $M^{\prime}\approx 90$ for this experiment. For the democracy experiment $M^{\prime}\approx 150$ , larger than for the RIP experiment. Furthermore, the maximum $D$ for democracy grows more slowly than for the random selection case, which indicates that to be robust to the loss of any $D$ measurements, $CD$ additional measurements, with $C>1$ , are actually necessary.

Theorems

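In this section, we prove the two supporting theorems used in the proof of Theorem 1. We begin by demonstrating that the matrix ${\mathbf{A}}=[{\mathbf{I}}~{}\boldsymbol{\Phi}]$ satisfies the RIP. To do so, we first establish the following lemma, which closely parallels a concentration result from prior work (equation (4.3) there). The lemma demonstrates that for any ${\mathbf{u}}$, if we draw $\boldsymbol{\Phi}$ at random, then $\|{\mathbf{A}}{\mathbf{u}}\|_{2}$ is concentrated around $\|{\mathbf{u}}\|_{2}$.

Lemma 1. Let $\boldsymbol{\Phi}$ be an $M\times N$ matrix with elements $\phi_{ij}$ drawn i.i.d. according to $\mathcal{N}(0,1/M)$ and let ${\mathbf{A}}=[{\mathbf{I}}~{}\boldsymbol{\Phi}]$. Furthermore, let ${\mathbf{u}}\in\mathbb{R}^{N+M}$ be an arbitrary vector with first $M$ entries denoted by ${\mathbf{w}}$ and last $N$ entries denoted by ${\mathbf{x}}$. Then for any $\eta\in(0,1)$,

$$\mathbb{E}\left(\|{\mathbf{A}}{\mathbf{u}}\|_{2}^{2}\right)=\|{\mathbf{u}}\|_{2}^{2}\tag{9}$$

and

$$\mathbb{P}\left(\left|\|{\mathbf{A}}{\mathbf{u}}\|_{2}^{2}-\|{\mathbf{u}}\|_{2}^{2}\right|\geq 2\eta\|{\mathbf{u}}\|_{2}^{2}\right)\leq 3e^{-M\eta^{2}/8}.\tag{10}$$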

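Proof. We first note that since ${\mathbf{A}}{\mathbf{u}}={\mathbf{w}}+\boldsymbol{\Phi}{\mathbf{x}}$, we have that

$$\|{\mathbf{A}}{\mathbf{u}}\|_{2}^{2}=\|{\mathbf{w}}+\boldsymbol{\Phi}{\mathbf{x}}\|_{2}^{2}=\|{\mathbf{w}}\|_{2}^{2}+2{\mathbf{w}}^{T}\boldsymbol{\Phi}{\mathbf{x}}+\|\boldsymbol{\Phi}{\mathbf{x}}\|_{2}^{2}.\tag{11}$$

Since the entries of $\boldsymbol{\Phi}$ are i.i.d. zero-mean Gaussian with variance $1/M$, we have $\mathbb{E}\left(\|\boldsymbol{\Phi}{\mathbf{x}}\|_{2}^{2}\right)=\|{\mathbf{x}}\|_{2}^{2}$ and $2{\mathbf{w}}^{T}\boldsymbol{\Phi}{\mathbf{x}}\sim\mathcal{N}\left(0,4\|{\mathbf{w}}\|_{2}^{2}\|{\mathbf{x}}\|_{2}^{2}/M\right)$. Taking expectations in (11) therefore gives $\mathbb{E}\left(\|{\mathbf{A}}{\mathbf{u}}\|_{2}^{2}\right)=\|{\mathbf{w}}\|_{2}^{2}+\|{\mathbf{x}}\|_{2}^{2}$, and since $\|{\mathbf{u}}\|_{2}^{2}=\|{\mathbf{w}}\|_{2}^{2}+\|{\mathbf{x}}\|_{2}^{2}$, this establishes (9).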
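The two statements of the lemma, that $\|{\mathbf{A}}{\mathbf{u}}\|_{2}^{2}$ equals $\|{\mathbf{u}}\|_{2}^{2}$ in expectation and deviates by a factor $2\eta$ with probability at most $3e^{-M\eta^{2}/8}$, can be sanity-checked by Monte Carlo simulation. The sketch below is a rough check under assumed toy sizes ($M=64$, $N=32$, $\eta=0.5$, all hypothetical), not a verification of the bound.

```python
import math
import random

def apply_a(w, x, m, rng):
    """Compute A u = w + Phi x for a fresh Phi with i.i.d. N(0, 1/M) entries."""
    sigma = 1.0 / math.sqrt(m)
    return [w[i] + sum(rng.gauss(0.0, sigma) * xj for xj in x) for i in range(m)]

def simulate(m, n, eta, trials, seed=0):
    """Return (mean ||Au||^2 / ||u||^2, empirical tail frequency, tail bound)."""
    rng = random.Random(seed)
    w = [rng.gauss(0.0, 1.0) for _ in range(m)]  # first M entries of a fixed u
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]  # last N entries of the same u
    u_sq = sum(v * v for v in w) + sum(v * v for v in x)
    total, exceed = 0.0, 0
    for _ in range(trials):
        au_sq = sum(v * v for v in apply_a(w, x, m, rng))
        total += au_sq
        if abs(au_sq - u_sq) >= 2.0 * eta * u_sq:
            exceed += 1
    return total / trials / u_sq, exceed / trials, 3.0 * math.exp(-m * eta ** 2 / 8.0)

ratio, tail, bound = simulate(m=64, n=32, eta=0.5, trials=300)
print(ratio, tail, bound)
```

The empirical tail frequency is typically far below the bound, which reflects the slack introduced by the union bound and the Gaussian tail estimate.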

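We now turn to (10). Using standard concentration arguments, one can show that

$$\mathbb{P}\left(\left|\|\boldsymbol{\Phi}{\mathbf{x}}\|_{2}^{2}-\|{\mathbf{x}}\|_{2}^{2}\right|\geq\eta\|{\mathbf{x}}\|_{2}^{2}\right)\leq 2e^{-M\eta^{2}/8}.\tag{12}$$

As noted above, $2{\mathbf{w}}^{T}\boldsymbol{\Phi}{\mathbf{x}}\sim\mathcal{N}\left(0,4\|{\mathbf{w}}\|_{2}^{2}\|{\mathbf{x}}\|_{2}^{2}/M\right)$. Hence, we have that

$$\mathbb{P}\left(\left|2{\mathbf{w}}^{T}\boldsymbol{\Phi}{\mathbf{x}}\right|\geq\eta\|{\mathbf{w}}\|_{2}\|{\mathbf{x}}\|_{2}\right)=2Q\left(\frac{\eta\|{\mathbf{w}}\|_{2}\|{\mathbf{x}}\|_{2}}{2\|{\mathbf{w}}\|_{2}\|{\mathbf{x}}\|_{2}/\sqrt{M}}\right)=2Q\left(\sqrt{M}\eta/2\right),$$

where $Q(\cdot)$ denotes the tail integral of the standard Gaussian distribution. From the standard tail bound $Q(z)\leq\frac{1}{2}e^{-z^{2}/2}$ (equation (13.48) of the cited reference) we have that

$$\mathbb{P}\left(\left|2{\mathbf{w}}^{T}\boldsymbol{\Phi}{\mathbf{x}}\right|\geq\eta\|{\mathbf{w}}\|_{2}\|{\mathbf{x}}\|_{2}\right)\leq e^{-M\eta^{2}/8}.\tag{13}$$

Thus, combining (12) and (13) we obtain that with probability at least $1-3e^{-M\eta^{2}/8}$ we have that both

$$(1-\eta)\|{\mathbf{x}}\|_{2}^{2}\leq\|\boldsymbol{\Phi}{\mathbf{x}}\|_{2}^{2}\leq(1+\eta)\|{\mathbf{x}}\|_{2}^{2}\tag{14}$$

and

$$-\eta\|{\mathbf{w}}\|_{2}\|{\mathbf{x}}\|_{2}\leq 2{\mathbf{w}}^{T}\boldsymbol{\Phi}{\mathbf{x}}\leq\eta\|{\mathbf{w}}\|_{2}\|{\mathbf{x}}\|_{2}.\tag{15}$$

Using (11), we can combine (14) and (15) to obtain

$$\|{\mathbf{A}}{\mathbf{u}}\|_{2}^{2}\leq\|{\mathbf{w}}\|_{2}^{2}+(1+\eta)\|{\mathbf{x}}\|_{2}^{2}+\eta\|{\mathbf{w}}\|_{2}\|{\mathbf{x}}\|_{2}\leq(1+\eta)\|{\mathbf{u}}\|_{2}^{2}+\eta\|{\mathbf{w}}\|_{2}\|{\mathbf{x}}\|_{2}\leq(1+2\eta)\|{\mathbf{u}}\|_{2}^{2},$$

where the last inequality follows from the fact that $\|{\mathbf{w}}\|_{2}\|{\mathbf{x}}\|_{2}\leq\|{\mathbf{u}}\|_{2}\|{\mathbf{u}}\|_{2}$. Similarly, we also have that

$$\|{\mathbf{A}}{\mathbf{u}}\|_{2}^{2}\geq(1-2\eta)\|{\mathbf{u}}\|_{2}^{2},$$

which establishes (10). ∎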

We note that while the above proof assumes that the entries of $\boldsymbol{\Phi}$ are Gaussian, this proof holds with essentially no modifications for a wide class of sub-Gaussian distributions. A random variable $X$ is sub-Gaussian if there exists a constant $C>0$ such that

$$\mathbb{E}\left(e^{Xt}\right)\leq e^{C^{2}t^{2}/2}$$

holds for all $t\in\mathbb{R}$.

Using Lemma 1, we now demonstrate that the matrix ${\mathbf{A}}$ satisfies the RIP provided that $M$ is sufficiently large.

Theorem 2. Let $\boldsymbol{\Phi}$ be an $M\times N$ matrix with elements $\phi_{ij}$ drawn according to $\mathcal{N}(0,1/M)$ and let ${\mathbf{A}}=[{\mathbf{I}}~{}\boldsymbol{\Phi}]$. If

$$M=C_{1}(K+D)\log\left(\frac{N+M}{K+D}\right),$$

then with probability exceeding $1-3e^{-C_{2}M}$ we have that ${\mathbf{A}}$ satisfies the RIP of order $(K+D)$ with constant $\delta$, where $C_{1}$ is arbitrary and $C_{2}=(\delta/8)^{2}-\log(42e/\delta)/C_{1}.$

Proof. First note that it is enough to prove that ${\mathbf{A}}$ satisfies (2) in the case $\|\mathbf{x}\|_{2}=1$, since ${\mathbf{A}}$ is linear. Next, fix an index set $J\subset\{1,2,\ldots,N+M\}$ with $|J|=K+D$, and let $X_{J}$ denote the $(K+D)$-dimensional subspace spanned by the columns of ${\mathbf{A}}$ indexed by $J$. We choose a finite set of points $S_{J}$ such that $S_{J}\subseteq X_{J}$, $\|\mathbf{s}\|_{2}\leq 1$ for all $\mathbf{s}\in S_{J}$, and for all $\mathbf{x}\in X_{J}$ with $\|\mathbf{x}\|_{2}\leq 1$ we have

$$\min_{\mathbf{s}\in S_{J}}\|\mathbf{x}-\mathbf{s}\|_{2}\leq\epsilon.$$

One can show, by a standard covering argument, that such a set $S_{J}$ exists with $|S_{J}|\leq(3/\epsilon)^{K+D}$. We then repeat this process for each possible index set $J$, and collect all the sets $S_{J}$ together:

$$S=\bigcup_{J:|J|=K+D}S_{J}.$$

There are $\binom{N+M}{K+D}\leq\left(e\frac{N+M}{K+D}\right)^{K+D}$ possible index sets $J$, and hence $|S|\leq\left(\frac{3e}{\epsilon}\frac{N+M}{K+D}\right)^{K+D}$. We now use the union bound to apply Lemma 1 to this set of points such that, with probability exceeding

$$1-3\left(\frac{3e}{\epsilon}\frac{N+M}{K+D}\right)^{K+D}e^{-M\eta^{2}/8},\tag{20}$$

we have

$$\sqrt{1-2\eta}\,\|\mathbf{s}\|_{2}\leq\|{\mathbf{A}}\mathbf{s}\|_{2}\leq\sqrt{1+2\eta}\,\|\mathbf{s}\|_{2}\quad\text{for all }\mathbf{s}\in S.$$

We now define $\Sigma_{K+D}=\{{\mathbf{x}}:\|{\mathbf{x}}\|_{0}\leq K+D\}$. We define $B$ as the smallest number such that

$$\|{\mathbf{A}}\mathbf{x}\|_{2}\leq\sqrt{B}\,\|\mathbf{x}\|_{2}\tag{21}$$

holds for all $\mathbf{x}\in\Sigma_{K+D}$ with $\|\mathbf{x}\|_{2}\leq 1$.

Our goal is to show that $\sqrt{B}\leq\sqrt{1+\delta}$. For this, we recall that for any $\mathbf{x}\in\Sigma_{K+D}$ with $\|\mathbf{x}\|_{2}\leq 1$, we can pick an $\mathbf{s}\in S$ such that $\|\mathbf{x}-\mathbf{s}\|_{2}\leq\epsilon$ and such that $\mathbf{x}-\mathbf{s}\in\Sigma_{K+D}$ (since if $\mathbf{x}\in X_{J}$, we can pick $\mathbf{s}\in S_{J}\subset X_{J}$ satisfying $\|\mathbf{x}-\mathbf{s}\|_{2}\leq\epsilon$). In this case we have

$$\|{\mathbf{A}}\mathbf{x}\|_{2}\leq\|{\mathbf{A}}\mathbf{s}\|_{2}+\|{\mathbf{A}}(\mathbf{x}-\mathbf{s})\|_{2}\leq\sqrt{1+2\eta}+\sqrt{B}\epsilon.$$

Since by definition $B$ is the smallest number for which (21) holds, we obtain $\sqrt{B}\leq\sqrt{1+2\eta}+\sqrt{B}\epsilon$, which upon rearranging yields $\sqrt{B}\leq\sqrt{1+2\eta}/(1-\epsilon)$. One can show that by setting $\epsilon=\delta/14$ and $\eta=\delta/(2\sqrt{2})$, we have that $\sqrt{1+2\eta}/(1-\epsilon)\leq\sqrt{1+\delta}$, which establishes the upper inequality in (2). The lower inequality follows from this since, for $\|\mathbf{x}\|_{2}=1$ (so that $\|\mathbf{s}\|_{2}\geq 1-\epsilon$),

$$\|{\mathbf{A}}\mathbf{x}\|_{2}\geq\|{\mathbf{A}}\mathbf{s}\|_{2}-\|{\mathbf{A}}(\mathbf{x}-\mathbf{s})\|_{2}\geq\sqrt{1-2\eta}\,(1-\epsilon)-\sqrt{1+\delta}\,\epsilon\geq\sqrt{1-\delta},$$

where the last inequality again holds with $\epsilon=\delta/14$ and $\eta=\delta/(2\sqrt{2})$. This establishes the theorem. To arrive at the formula for $C_{2}$ we first bound the result in (20) using

and then we replace $(K+D)\log((N+M)/(K+D))$ with $M/C_{1}$ . After simplification, this yields $C_{2}=\eta^{2}/8-\log(3e/\epsilon)/C_{1}$ . By substituting the values for $\epsilon$ and $\eta$ , we obtain the desired result. ∎
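The constants in this proof are easy to check numerically. The sketch below verifies, on a grid of $\delta$ values, that the choices $\epsilon=\delta/14$ and $\eta=\delta/(2\sqrt{2})$ give $\sqrt{1+2\eta}/(1-\epsilon)\leq\sqrt{1+\delta}$, that they turn $\eta^{2}/8-\log(3e/\epsilon)/C_{1}$ into the stated $C_{2}$, and that the binomial bound used to count index sets holds for one sample size. Function names and the sample sizes are hypothetical, and a grid check is of course not a proof.

```python
import math

def eps_eta(delta):
    """The choices epsilon = delta / 14 and eta = delta / (2 sqrt(2))."""
    return delta / 14.0, delta / (2.0 * math.sqrt(2.0))

def upper_inequality_holds(delta):
    """Check sqrt(1 + 2 eta) / (1 - epsilon) <= sqrt(1 + delta)."""
    eps, eta = eps_eta(delta)
    return math.sqrt(1.0 + 2.0 * eta) / (1.0 - eps) <= math.sqrt(1.0 + delta)

def c2_terms(delta):
    """Return (eta^2 / 8, 3 e / epsilon), expected to equal ((delta/8)^2, 42 e / delta)."""
    eps, eta = eps_eta(delta)
    return eta ** 2 / 8.0, 3.0 * math.e / eps

def binomial_bound_holds(total, k):
    """Check C(total, k) <= (e * total / k)^k."""
    return math.comb(total, k) <= (math.e * total / k) ** k

grid = [i / 100.0 for i in range(1, 100)]
print(all(upper_inequality_holds(d) for d in grid))
print(c2_terms(0.5), ((0.5 / 8.0) ** 2, 42.0 * math.e / 0.5))
print(binomial_bound_holds(2048 + 380, 20))
```

The upper inequality is nearly tight as $\delta\to 1$, which suggests the constant 14 was chosen to make the bound hold over the whole interval $(0,1)$.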

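In Theorem 3 below, we show that the matrix ${\mathbf{P}}_{\Lambda}^{\perp}{\mathbf{A}}$ satisfies a modified version of the RIP. We begin with an elementary lemma that is a straightforward generalization of a known result (Lemma 2.1 of the cited reference) and states that RIP operators approximately preserve inner products between sparse vectors.

Lemma 2. Let ${\mathbf{u}},{\mathbf{v}}$ be given, and suppose that a matrix ${\mathbf{A}}$ satisfies the RIP of order $\max(\|{\mathbf{u}}+{\mathbf{v}}\|_{0},\|{\mathbf{u}}-{\mathbf{v}}\|_{0})$ with isometry constant $\delta$. Then

$$\left|\langle{\mathbf{A}}{\mathbf{u}},{\mathbf{A}}{\mathbf{v}}\rangle-\langle{\mathbf{u}},{\mathbf{v}}\rangle\right|\leq\delta\|{\mathbf{u}}\|_{2}\|{\mathbf{v}}\|_{2}.$$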

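Proof. We first assume that $\|{\mathbf{u}}\|_{2}=\|{\mathbf{v}}\|_{2}=1$. From the fact that

$$\|{\mathbf{u}}\pm{\mathbf{v}}\|_{2}^{2}=\|{\mathbf{u}}\|_{2}^{2}+\|{\mathbf{v}}\|_{2}^{2}\pm 2\langle{\mathbf{u}},{\mathbf{v}}\rangle=2\pm 2\langle{\mathbf{u}},{\mathbf{v}}\rangle$$

and since ${\mathbf{A}}$ satisfies the RIP, we have that

$$(1-\delta)(2\pm 2\langle{\mathbf{u}},{\mathbf{v}}\rangle)\leq\|{\mathbf{A}}{\mathbf{u}}\pm{\mathbf{A}}{\mathbf{v}}\|_{2}^{2}\leq(1+\delta)(2\pm 2\langle{\mathbf{u}},{\mathbf{v}}\rangle).$$

From the parallelogram identity we obtain

$$\langle{\mathbf{A}}{\mathbf{u}},{\mathbf{A}}{\mathbf{v}}\rangle=\frac{1}{4}\left(\|{\mathbf{A}}{\mathbf{u}}+{\mathbf{A}}{\mathbf{v}}\|_{2}^{2}-\|{\mathbf{A}}{\mathbf{u}}-{\mathbf{A}}{\mathbf{v}}\|_{2}^{2}\right)\leq\frac{(1+\langle{\mathbf{u}},{\mathbf{v}}\rangle)(1+\delta)-(1-\langle{\mathbf{u}},{\mathbf{v}}\rangle)(1-\delta)}{2}=\langle{\mathbf{u}},{\mathbf{v}}\rangle+\delta.$$

Similarly, one can show that $\langle{\mathbf{A}}{\mathbf{u}},{\mathbf{A}}{\mathbf{v}}\rangle\geq\langle{\mathbf{u}},{\mathbf{v}}\rangle-\delta$, and thus $|\langle{\mathbf{A}}{\mathbf{u}},{\mathbf{A}}{\mathbf{v}}\rangle-\langle{\mathbf{u}},{\mathbf{v}}\rangle|\leq\delta$. The result follows for ${\mathbf{u}}$, ${\mathbf{v}}$ with arbitrary norm from the bilinearity of the inner product. ∎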
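Two ingredients of this argument can be checked numerically: the polarization form of the parallelogram identity, $\langle{\mathbf{p}},{\mathbf{q}}\rangle=\frac{1}{4}(\|{\mathbf{p}}+{\mathbf{q}}\|_{2}^{2}-\|{\mathbf{p}}-{\mathbf{q}}\|_{2}^{2})$, and the algebra that combines the RIP upper bound on $\|{\mathbf{A}}{\mathbf{u}}+{\mathbf{A}}{\mathbf{v}}\|_{2}^{2}$ with the lower bound on $\|{\mathbf{A}}{\mathbf{u}}-{\mathbf{A}}{\mathbf{v}}\|_{2}^{2}$. This is a small sketch with hypothetical helper names; `c` stands for $\langle{\mathbf{u}},{\mathbf{v}}\rangle$ with unit-norm ${\mathbf{u}}$, ${\mathbf{v}}$.

```python
import random

def inner(p, q):
    return sum(a * b for a, b in zip(p, q))

def polarization_gap(p, q):
    """|<p, q> - (||p + q||^2 - ||p - q||^2) / 4|, which should vanish up to rounding."""
    plus = [a + b for a, b in zip(p, q)]
    minus = [a - b for a, b in zip(p, q)]
    return abs(inner(p, q) - 0.25 * (inner(plus, plus) - inner(minus, minus)))

def upper_bound_step(c, delta):
    """(1/4) [(1 + delta)(2 + 2c) - (1 - delta)(2 - 2c)], which simplifies to c + delta."""
    return 0.25 * ((1.0 + delta) * (2.0 + 2.0 * c) - (1.0 - delta) * (2.0 - 2.0 * c))

rng = random.Random(0)
p = [rng.gauss(0.0, 1.0) for _ in range(10)]
q = [rng.gauss(0.0, 1.0) for _ in range(10)]
print(polarization_gap(p, q))
print(upper_bound_step(0.3, 0.2))
```

Swapping the roles of the two RIP bounds gives the matching lower bound $c-\delta$.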

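Theorem 3. Suppose that ${\mathbf{A}}$ satisfies the RIP of order $K$ with isometry constant $\delta$, and let $\Lambda\subset\{1,2,\ldots,N\}$. Define ${\mathbf{P}}^{\perp}_{\Lambda}$ as in (6). If $|\Lambda|<K$ then

$$\left(1-\frac{\delta}{1-\delta}\right)\|{\mathbf{u}}\|_{2}^{2}\leq\|{\mathbf{P}}^{\perp}_{\Lambda}{\mathbf{A}}{\mathbf{u}}\|_{2}^{2}\leq(1+\delta)\|{\mathbf{u}}\|_{2}^{2}$$

for all ${\mathbf{u}}$ such that $\|{\mathbf{u}}\|_{0}\leq K-|\Lambda|$ and $\mathrm{supp}({\mathbf{u}})\cap\Lambda=\emptyset$.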

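Proof. From the definition of ${\mathbf{P}}^{\perp}_{\Lambda}$ in (6), we may decompose ${\mathbf{P}}^{\perp}_{\Lambda}{\mathbf{A}}{\mathbf{u}}$ as ${\mathbf{P}}^{\perp}_{\Lambda}{\mathbf{A}}{\mathbf{u}}={\mathbf{A}}{\mathbf{u}}-{\mathbf{P}}_{\Lambda}{\mathbf{A}}{\mathbf{u}}$. Since ${\mathbf{P}}_{\Lambda}$ is an orthogonal projection, we can write

$$\|{\mathbf{A}}{\mathbf{u}}\|_{2}^{2}=\|{\mathbf{P}}_{\Lambda}{\mathbf{A}}{\mathbf{u}}\|_{2}^{2}+\|{\mathbf{P}}^{\perp}_{\Lambda}{\mathbf{A}}{\mathbf{u}}\|_{2}^{2}.\tag{26}$$

Our goal is to show that $\|{\mathbf{A}}{\mathbf{u}}\|_{2}\approx\|{\mathbf{P}}^{\perp}_{\Lambda}{\mathbf{A}}{\mathbf{u}}\|_{2}$, or equivalently, that $\|{\mathbf{P}}_{\Lambda}{\mathbf{A}}{\mathbf{u}}\|_{2}$ is small. Towards this end, we note that since ${\mathbf{P}}_{\Lambda}{\mathbf{A}}{\mathbf{u}}$ is orthogonal to ${\mathbf{P}}^{\perp}_{\Lambda}{\mathbf{A}}{\mathbf{u}}$, we have

$$\langle{\mathbf{P}}_{\Lambda}{\mathbf{A}}{\mathbf{u}},{\mathbf{A}}{\mathbf{u}}\rangle=\langle{\mathbf{P}}_{\Lambda}{\mathbf{A}}{\mathbf{u}},{\mathbf{P}}_{\Lambda}{\mathbf{A}}{\mathbf{u}}+{\mathbf{P}}^{\perp}_{\Lambda}{\mathbf{A}}{\mathbf{u}}\rangle=\|{\mathbf{P}}_{\Lambda}{\mathbf{A}}{\mathbf{u}}\|_{2}^{2}.$$

Since ${\mathbf{P}}_{\Lambda}$ projects onto $\mathcal{R}({\mathbf{A}}_{\Lambda})$, there exists a ${\mathbf{z}}$ with $\mathrm{supp}({\mathbf{z}})\subseteq\Lambda$ such that ${\mathbf{P}}_{\Lambda}{\mathbf{A}}{\mathbf{u}}={\mathbf{A}}{\mathbf{z}}$. Because $\mathrm{supp}({\mathbf{u}})\cap\Lambda=\emptyset$, we have $\langle{\mathbf{u}},{\mathbf{z}}\rangle=0$, so the RIP and Lemma 2 give $|\langle{\mathbf{A}}{\mathbf{z}},{\mathbf{A}}{\mathbf{u}}\rangle|\leq\delta\|{\mathbf{z}}\|_{2}\|{\mathbf{u}}\|_{2}\leq\frac{\delta}{1-\delta}\|{\mathbf{A}}{\mathbf{z}}\|_{2}\|{\mathbf{A}}{\mathbf{u}}\|_{2}$, and hence

$$\|{\mathbf{P}}_{\Lambda}{\mathbf{A}}{\mathbf{u}}\|_{2}\leq\frac{\delta}{1-\delta}\|{\mathbf{A}}{\mathbf{u}}\|_{2}.$$

Since we trivially have that $\|{\mathbf{P}}_{\Lambda}{\mathbf{A}}{\mathbf{u}}\|_{2}\geq 0$, we can combine this with (26) to obtain

$$\left(1-\left(\frac{\delta}{1-\delta}\right)^{2}\right)\|{\mathbf{A}}{\mathbf{u}}\|_{2}^{2}\leq\|{\mathbf{P}}^{\perp}_{\Lambda}{\mathbf{A}}{\mathbf{u}}\|_{2}^{2}\leq\|{\mathbf{A}}{\mathbf{u}}\|_{2}^{2}.$$

Since $\|{\mathbf{u}}\|_{0}\leq K$, we can use the RIP to obtain

$$\left(1-\left(\frac{\delta}{1-\delta}\right)^{2}\right)(1-\delta)\|{\mathbf{u}}\|_{2}^{2}\leq\|{\mathbf{P}}^{\perp}_{\Lambda}{\mathbf{A}}{\mathbf{u}}\|_{2}^{2}\leq(1+\delta)\|{\mathbf{u}}\|_{2}^{2},$$

which simplifies to $\left(1-\frac{\delta}{1-\delta}\right)\|{\mathbf{u}}\|_{2}^{2}\leq\|{\mathbf{P}}^{\perp}_{\Lambda}{\mathbf{A}}{\mathbf{u}}\|_{2}^{2}\leq(1+\delta)\|{\mathbf{u}}\|_{2}^{2}$, as claimed. ∎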
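The constant $\delta/(1-\delta)$ that appears in Theorem 1 can be traced to a short simplification. Assuming the lower bound before simplification has the form $\left(1-(\delta/(1-\delta))^{2}\right)(1-\delta)\|{\mathbf{u}}\|_{2}^{2}$, i.e., a factor $1-(\delta/(1-\delta))^{2}$ from the projection step multiplied by the factor $1-\delta$ from the RIP (our reading of the argument), the algebra can be worked out as:

```latex
\begin{align*}
\left(1-\frac{\delta^{2}}{(1-\delta)^{2}}\right)(1-\delta)
&= (1-\delta)-\frac{\delta^{2}}{1-\delta}
 = \frac{(1-\delta)^{2}-\delta^{2}}{1-\delta} \\
&= \frac{1-2\delta}{1-\delta}
 = 1-\frac{\delta}{1-\delta}.
\end{align*}
```

The resulting lower constant is positive only for $\delta<1/2$, so the modified RIP is informative only in that regime.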
