On the singularity probability of discrete random matrices

Jean Bourgain, Van Vu, Philip Matchett Wood

Introduction

Let $n$ be a large integer and $M_{n}$ be an $n$ by $n$ random matrix whose entries are independent (but not necessarily identically distributed) discrete random variables taking values in the complex numbers. The problem of estimating the probability that $M_{n}$ is singular is a basic problem in the theory of random matrices and combinatorics. The goal of this paper is to give a bound that applies to a large variety of distributions. The general statement (Theorem 2.2) is a bit technical, so we will first discuss a few corollaries concerning special cases.

The most famous special case is when the entries of $M_{n}$ are independent identically distributed (i.i.d.) Bernoulli random variables (taking values $\pm 1$ with probability $1/2$ ). The following conjecture has been open for quite some time:

For $M_{\pm 1,n}$ an $n$ by $n$ matrix with each entry an i.i.d. Bernoulli random variable taking the values $+1$ and $-1$ each with probability $1/2$ ,

It is easy to verify that the singularity probability is at least $(1/2)^{n}$ by considering the probability that there are two equal rows (or columns).

Even in the case of i.i.d. Bernoulli random variables, proving that the singularity probability is $o(1)$ is not trivial. It was first done by Komlós in 1967 (see also ; generalizes Komlós’s bound to other integer distributions). The first exponential bound was proven by Kahn, Komlós, and Szemerédi , who showed that $\Pr(M_{\pm 1,n}\mbox{ is singular})\leq.999^{n}$ . This upper bound was improved upon by Tao and Vu in to $.958^{n}$ . A more significant improvement was obtained by the same authors in :

This improvement was made possible through the discovery of a new theorem [11, Theorem 5.2] (which was called the Structure Theorem in ), which gives a complete characterization of a set with certain additive properties. The Structure Theorem (to be more precise, a variant of it) will play a critical role in the current paper as well.

Our general result has the following corollary in the Bernoulli case:

which gives a slight improvement over Inequality (1) (since $1/\sqrt{2}\approx 0.7071<.75$ ).

Let us now discuss a more general class of random matrices. Consider the random variable $\gamma^{(\mu)}$ defined by

and let $M_{\pm 1,n}^{(\mu)}$ be an $n$ by $n$ matrix with each entry an independent copy of $\gamma^{(\mu)}$ . The random variable $\gamma^{(\mu)}$ plays an important role in , and the matrices $M_{\pm 1,n}^{(\mu)}$ are of interest in their own right. In fact, giving zero a large weight is a natural thing to do when one would like to (randomly) sparsify a matrix, a common operation used in randomized algorithms (the values of $\pm 1$ , as the reader will see, are not so critical). Our general result implies the following upper bounds:

Note that Inequality (5) implies Inequality (1) and that Inequality (6) implies Inequality (2) (in both cases setting $\mu=1$ ).

Figure 1 summarizes the upper bounds from Inequalities (4), (5), and (6) and also includes the following lower bounds:

These lower bounds can be derived by computing the probability that one row is all zeros (Inequality (7)) or that there is a dependency between two rows (Inequality (8)). Note that in the case where $\mu\leq 1/2$ , the upper bound in Inequality (4) asymptotically equals the lower bound in Inequality (7), and thus our result is the best possible in this case. We also used a Maple program to derive the formulas for lower bounds resulting from a dependency between three, four, or five rows; however, these lower bounds were inferior to those in Inequality (7) and Inequality (8).

We will now present another corollary of the main theorem that has a somewhat different flavor. In this corollary, we treat partially random matrices, which may have many deterministic rows. Our method allows us to obtain exponential bounds so long as there are still at most $c\ln n$ random rows, where $c>0$ is a particular constant.

Notice that the case $\mathfrak{f}=0$ and $p=1/2$ also implies Inequality (2).

The focus of this paper is optimizing the base of the exponent in bounds on the singularity probability for discrete random matrices. One main tool in this optimization is the use of a structure theorem similar to [11, Theorem 5.2] (see Theorem 6.1 below); however, using such a theorem requires additional assumptions to be placed on the values that can appear as entries, and in particular, this is why we assume in Corollary 1.2 that the set $S$ has cardinality $\left|S\right|\leq O(1)$ and that $\mathfrak{f}\leq c\ln n$ . If one is interested in an exponential bound where there are no conditions on $\mathfrak{f}$ or on the set $S$ (at the expense of having an unspecified constant for the base of the exponential), one can follow the analysis in , which does not make use of a structure theorem, along with ideas in this paper to get a result of the following form:

For every $\epsilon>0$ there exists $\delta>0$ such that the following holds. Let $N_{\mathfrak{f},n}$ be an $n$ by $n$ complex matrix in which $\mathfrak{f}$ rows contain fixed, non-random entries and where the other rows contain entries that are independent discrete random variables. If the fixed rows have co-rank $k$ and if for every random entry $\alpha$ , we have $\max_{x}\Pr(\alpha=x)\leq 1-\epsilon$ , then for all sufficiently large $n$

Note that Theorem 1.4 holds for any $\mathfrak{f}$ and $k$ , and so in particular, an exponential bound on the singularity probability is achieved whenever $k=0$ and $\mathfrak{f}\leq cn$ , where $c<1$ is a constant. Also note that the theorem allows the random entries to have discrete distributions taking infinitely many values. Corollary 3.6 proves a version of Theorem 1.4 with a much better exponential bound, given some additional conditions.

The structure of the rest of the paper is as follows. In Section 2 we define $p$ -bounded of exponent $r$ and state the main theorem of this paper. In Section 3, we discuss some corollaries of Theorem 2.2. In particular, we will:

prove general bounds on the singularity probability for discrete random matrices with entries that have symmetric distributions and with entries that have asymmetric distributions;

Prove a version of Corollary 1.2 (namely, Corollary 3.5) that holds for up to $o(n)$ fixed rows, assuming that the entries in the fixed rows take integer values between $-C$ and $C$ for any positive constant $C$ ; and

prove that the probability that random matrices with integer entries have a rational eigenvalue is exponentially small.

The general theorem

To prove the results in Inequalities (1) and (2) (and also the results in and ), one basic idea is to replace entries of a random matrix with independent copies of the random variable $\gamma^{(\mu)}$ or $2\gamma^{(\mu)}$ (see Equation (3)). One key idea in proving the more general results of the current paper is replacing the entries of a random matrix with more complicated symmetric discrete random variables.

The following definition lies at the heart of our analysis.

Let $p$ be a positive constant such that $0<p<1$ and let $r$ be a positive integer constant. A random variable $\alpha$ taking values in the integers (or, respectively, the integers modulo some large prime $Q$ ) is $p$ -bounded of exponent $r$ if

$q\leq\min_{x}\Pr(\beta^{(\mu)}=x)$ and $\max_{x}\Pr(\beta^{(\mu)}=x)\leq p$ , and

We will define $p$ -bounded of exponent $r$ for collections of random variables below, but first we note that the conditions above are easy to verify in practice. In particular, if we have a symmetric random variable

where the equality on the right-hand side is a simple expected value computation.

We say that a collection of random variables $\{\alpha_{jk}\}_{j,k=1}^{n}$ is $p$ -bounded of exponent $r$ if each $\alpha_{jk}$ is $p$ -bounded of exponent $r$ with the same constants $p$ , $q$ , and $r$ ; and, importantly, the same value of $\mu=1-p$ . We also make the critical assumption that the set of all values that can be taken by the $\beta^{(\mu)}_{jk}$ has cardinality $O(1)$ (a relaxation of this assumption is discussed in Remark 8.5). However, the definition of $\beta^{(\mu)}_{jk}$ is otherwise allowed to vary with $j$ and $k$ . Also, we will use $S$ to denote the set of all possible values taken by the random variables $\alpha_{jk}$ , and we will assume that the cardinality of $S$ is at most $\left|S\right|\leq n^{o(n)}$ .

Let $p$ be a positive constant such that $0<p<1$ , let $r$ be a positive integer constant, and let $S$ be a generalized arithmetic progression in the complex numbers with rank $O(1)$ (independent of $n$ ) and with cardinality at most $\left|S\right|\leq n^{o(n)}$ . Let $N_{n}$ be an $n$ by $n$ matrix with entries $\alpha_{jk}$ , each of which is an independent random variable taking values in $S$ . If the collection of random variables $\{\alpha_{jk}\}_{1\leq j,k\leq n}$ is $p$ -bounded of exponent $r$ , then

In the motivating examples of Section 1 (excluding Corollary 1.2), we discussed the case where the entries of the matrix are i.i.d.; however, in general the distributions of the entries are allowed to differ (and even depend on $n$ ), so long as the entries all take values in the same structured set $S$ described above. The condition that $S$ has additive structure seems to be an artifact of the proof (in particular, at certain points in the proof of Theorem 6.1, we need the set $\left\{\sum_{j=1}^{n}x_{j}:x_{j}\in S\mbox{ for all }j\right\}$ to have cardinality at most $n^{o(n)}$ ). The easiest way to guarantee that $S$ has the required structure is to assume that the set of values taken by all the $\alpha_{jk}$ has cardinality at most $O(1)$ , and this is the approach we take for the corollaries in Section 3, since it also makes it easy to demonstrate that the collection of entries is $p$ -bounded of exponent $r$ .

Some corollaries of Theorem 2.2

In each corollary, we will use the definition of $p$ -bounded of exponent 1 and of exponent 2. The definition of $p$ -bounded of exponent 2 is particularly useful, since then the absolute value on the left-hand side of Inequality (10) is automatically dealt with; however, when $\mu$ is small (for example whenever $\mu\leq 1/2$ ), one can get better bounds by using $p$ -bounded of exponent 1. We have not yet found an example where the best possible bound from Theorem 2.2 is found by using $p$ -bounded of an exponent higher than 2.

To prove Inequality (4), we note for $0\leq\mu\leq\frac{1}{2}$ that (using the definition in Equation (3) of $\gamma^{(\mu)}$ )

and thus $\gamma^{(\mu)}$ is $(1-\mu)$ -bounded of exponent 1 (i.e., take $\beta^{(\mu)}:=\gamma^{(\mu)}$ ), and so Inequality (4) follows from Theorem 2.2.

To prove Inequality (5), we note for $\frac{1}{2}\leq\mu\leq 1$ that

(the inequality above may be checked by squaring both sides and expanding as polynomials in $\cos(2\pi t)$ ). Thus, we can take

to see that $\gamma^{(\mu)}$ is $\displaystyle\left(\frac{2\mu+1}{4}\right)$ -bounded of exponent 1, and so Inequality (5) follows from Theorem 2.2.

To prove Inequality (6), we note for $0\leq\mu\leq 1$ that

to see that $\gamma^{(\mu)}$ is $\displaystyle\left(1-2\mu+\frac{3}{2}\mu^{2}\right)$ -bounded of exponent 2, and so Inequality (6) follows from Theorem 2.2.

2 Matrices with entries having symmetric distributions

In this subsection, we will prove a singularity bound for an $n$ by $n$ matrix $N_{n}^{(\mu)}$ for which each entry is a symmetric discrete random variable taking the value 0 with probability $1-\mu$ .

Let $S$ be a set of complex numbers with cardinality $\left|S\right|\leq O(1)$ . If $N_{n}^{(\mu)}$ is an $n$ by $n$ matrix in which each entry is an independent symmetric complex random variable taking values in $S$ and taking the value 0 with probability $1-\mu$ , then

In particular, the same upper bounds as in Inequalities (4), (5), and (6) (which are shown in Figure 1) apply to the singularity probability for $N_{n}^{(\mu)}$ .

Let $\alpha_{ij}$ be an entry of $N_{n}^{(\mu)}$ . Since $\alpha_{ij}$ is symmetric and takes the value 0 with probability $1-\mu$ , we may write $\alpha_{ij}=\gamma^{(\mu)}_{ij}\eta_{ij}$ , where $\gamma^{(\mu)}_{ij}$ is an independent copy of $\gamma^{(\mu)}$ as defined in Equation (3) and $\eta_{ij}$ is a random variable that shares no values with $-\eta_{ij}$ . This description of $\alpha_{ij}$ was inspired by , and it allows us to condition on $\eta_{ij}$ and then use the remaining randomness in $\gamma^{(\mu)}_{ij}$ to get a bound on the singularity probability. In particular,

where the sum runs over all $(n^{2})$ -tuples $(c_{ij})_{1\leq i,j\leq n}$ of possible values taken by random variables $\eta_{ij}$ . Since $\sum_{(c_{ij})}\Pr(\{\eta_{ij}=c_{ij}\})=1$ , we can complete the proof by proving an exponential bound on $\Pr(N_{n}^{(\mu)}\mbox{ is singular}|\{\eta_{ij}=c_{ij}\})$ , and we will use Theorem 2.2 for this task.

We have thus shown that the entries of $\left.N_{n}^{(\mu)}\rule{0.0pt}{11.0pt}\right|_{\{\eta_{ij}=c_{ij}\}}$ are

Applying Theorem 2.2 completes the proof. ∎

Corollary 3.1 is tight for $0\leq\mu\leq\frac{1}{2}$ , since the probability of a row of all zeroes occurring is $(1-\mu+o(1))^{n}$ ; however, for any specific case, Theorem 2.2 can usually prove better upper bounds than those given by Corollary 3.1.

For example, consider the case of a matrix $M_{\{\pm 2,\pm 1\},n}^{(\mu)}$ with each entry an independent copy of the symmetric random variable

For $M_{\{\pm 2,\pm 1\},n}^{(\mu)}$ as defined above, we have

By the definition of $\alpha^{(\mu)}$ we have

(i.e., the right-hand side of the equation above is non-negative for such $\mu$ ), which proves the first bound.

for $0\leq\mu\leq 1$ , which proves the second bound. ∎

We also have the following lower bounds for the singularity probability of $M_{\{\pm 2,\pm 1\},n}^{(\mu)}$ :

The results of Corollary 3.2 and the corresponding lower bounds are shown in Figure 2, and one should note that the upper bounds are substantially better than those guaranteed by Corollary 3.1.

3 Random matrices with entries having arbitrary distributions

A useful feature of the definition of $p$ -bounded of exponent 2 is that it lets one bound the singularity probability of matrices with independent discrete random variables that are asymmetric.

We will need the following slightly more general corollary in Section 3.4. For a set $A$ and an integer $m$ , we will use the notation $mA:=\{\sum_{j=1}^{m}a_{j}:a_{j}\in A\}$ and $A^{m}:=\{\prod_{j=1}^{m}a_{j}:a_{j}\in A\}$ .

Note that that Corollary 3.4 implies Corollary 3.3 by taking $X_{n}$ to be the matrix of all zeroes.

Let $\alpha_{ij}$ be an entry in $N_{n}$ . Our goal is to describe $\alpha_{ij}$ in a two-step random process, condition on one of the steps, and then use the randomness in the other step to bound the singularity probability. The conditioning approach is the same as that used in the symmetric case (Corollary 3.1) and was inspired by . The conditioning argument is useful since some entries of the random matrix may take some values with very small probability (i.e. probability less than any constant); recall that while the entries of the random matrix always take values in a fixed set $S$ of cardinality $O(1)$ , the distributions of those random variables within $S$ are allowed to vary with $n$ . (Note that making use of Remark 8.5 would provide an alternate way of dealing with entries that take some values with very small probability.)

Say that $\alpha_{ij}$ takes the values $v_{1},\ldots,v_{t}$ with probabilities $\varrho_{1},\ldots,\varrho_{t}$ , respectively, where $\varrho_{1}\geq\varrho_{2}\geq\cdots\geq\varrho_{t}$ . Define new random variables $\eta_{ijk}$ such that for some $i_{0}$ and $i_{1}$ , the values taken by $\eta_{ijk}$ are $v_{i_{0}},v_{i_{0}+1},\ldots,v_{i_{0}+i_{1}}$ with corresponding probabilities $\varrho_{i_{0}}/p_{k},\varrho_{i_{0}+1}/p_{k},\ldots,\varrho_{i_{0}+i_{1}}/p_{k}$ , where $p_{k}:=\sum_{i=1}^{i_{1}}\varrho_{i_{0}+i}$ . Thus, we can write

As in the proof of Corollary 3.1, we will condition on the values taken by the $\eta_{ijk}$ in order to prove a bound on the singularity probability. We have that

where the sum runs over all possible values $(c_{ijk})$ that the $\eta_{ijk}$ can take. Thus, it is sufficient to prove a bound on the singularity probability for the random matrix $X_{n}+\left.N_{n}\rule{0.0pt}{11.0pt}\right|_{\{\eta_{ijk}=c_{ijk}\}}$ which has random entries

where $x_{ij}$ and the $c_{ijk}$ are constants.

Combining Case 1 and Case 2, we have that the collection $\{x_{ij}+\widetilde{\alpha}_{ij}\}$ is $\left(p+\epsilon\right)$ -bounded of exponent 2, and so by and by Theorem 2.2 we have that $\Pr(X_{n}+\left.N_{n}\rule{0.0pt}{11.0pt}\right|_{\{\eta_{ijk}=c_{ijk}\}}\mbox{ is singular})\leq\left(\sqrt{p+\epsilon}+o(1)\right)^{n}$ .

The constant $\epsilon>0$ was chosen arbitrarily, and so letting $\epsilon$ tend to zero, we get that

4 Partially random matrices

In this subsection, we prove a bound on the singularity probability for partly random matrices where many rows are deterministic.

Corollary 3.5 applies to partly random matrices with $\mathfrak{f}=o(n)$ fixed, non-random rows containing integers bounded by a constant and with random entries taking at most $O(1)$ values in the complex numbers. Corollary 1.2, on the other hand, holds with the fixed entries also allowed to take values in the complex numbers and gives a sligtly better bound, but additionally requires $\mathfrak{f}\leq O(\ln n)$ (which is far smaller in general than $o(n)$ ). Proving Corollary 1.2 requires mirroring the entire argument used to prove the main theorem (Theorem 2.2) in the case where $\mathfrak{f}$ rows contain fixed, non-random entires, and we discuss this argument in Section 9. Proving Corollary 3.5, however, can be done directly from Theorem 2.2, as we will show below. First, we will state a generalization of Corollary 3.5.

To obtain Corollary 3.6 from Corollary 3.5, find a collection $\mathcal{C}$ of $\mathfrak{f}-k$ linearly independent rows among the deterministic rows. Replace the rest of the deterministic rows with a collection $\mathcal{C}^{\prime}$ of rows containing integer values between $-K$ and $K$ such that $\mathcal{C}^{\prime}$ is linearly independent from $\mathcal{C}$ . Finally, apply Corollary 3.5 to the new partially random matrix whose deterministic rows are from $\mathcal{C}\cup\mathcal{C}^{\prime}$ , thus proving Corollary 3.6.

By reordering the rows and columns, we may write

where $A$ is an $\mathfrak{f}$ by $\mathfrak{f}$ non-random invertible matrix, $B$ is an $\mathfrak{f}$ by $n-\mathfrak{f}$ non-random matrix, $C$ is an $n-\mathfrak{f}$ by $\mathfrak{f}$ random matrix, and $D$ is an $n-\mathfrak{f}$ by $n-\mathfrak{f}$ random matrix. Note that $N_{\mathfrak{f},n}$ is singular if and only if there exists a vector $\mathbf{v}$ such that $N_{\mathfrak{f},n}\mathbf{v}=0$ . Let $\mathbf{v}_{1}$ be the first $\mathfrak{f}$ coordinates of $\mathbf{v}$ and let $\mathbf{v}_{2}$ be the remaining $n-\mathfrak{f}$ coordinates. Then $N_{\mathfrak{f},n}\mathbf{v}=0$ if and only if

Since $A$ is invertible, these two equations are satisfied if and only if $(-CA^{-1}B+D)\mathbf{v}_{2}=0$ , that is, if and only if the random matrix $-CA^{-1}B+D$ is singular.

We want to show that every entry that can appear in $-CA^{-1}B$ is an element of $n^{o(n)}\left(S\cup\{-1,0,1\}\right)^{O(1)}$ . By the cofactor formula for $A^{-1}$ , we know that the $i,j$ entry of $A^{-1}$ is $(-1)^{i+j}(\det A_{ij})/\det A$ , where $A_{ij}$ is the $\mathfrak{f}-1$ by $\mathfrak{f}-1$ matrix formed by deleting the $i$ -th row and $j$ -th column of $A$ . Thus, $A^{-1}=\frac{1}{\det A}\widetilde{A}$ , where the $i,j$ entry of $\widetilde{A}$ is $(-1)^{i+j}\det A_{ij}$ . By the volume formula for the determinant, we know that $\left|\det A\right|$ is at most the product of the lengths of the row vectors of $A$ ; and thus $\left|\det A\right|\leq n^{o(n)}$ (here we need that $A$ has integer entries between $-K$ and $K$ , where $K$ is a constant, and that $\mathfrak{f}\leq o(n)$ ). Similarly, we have $\left|\det A_{ij}\right|\leq n^{o(n)}$ . Every entry of $\widetilde{A}$ is thus in $n^{o(n)}\{-1,0,1\}$ , every entry of $C$ is in $S$ , and every entry of $B$ is in $O(1)\{-1,0,1\}$ ; thus, every entry of $-C\widetilde{A}B$ is an element of $n^{o(n)}(S\cup\{-1,0,1\})$ .

Conditioning on the values taken by all the entries in $C$ , we have

where the sum runs over all possible matrices $(c_{ij})$ that $C$ can produce. Considering the entries in $C=(c_{ij})$ to be fixed (note that $A$ and $B$ are fixed by assumption), we now need to bound

Note that every entry of $-(c_{ij})\widetilde{A}B$ is an element of $n^{o(n)}\left(S\cup\{-1,0,1\}\right)^{O(1)}$ and that the random matrix $(\det A)D$ has entries that take values in the fixed set $\{(\det A)s:s\in S\}$ having cardinality $O(1)$ . Thus, by Corollary 3.4, we have that

Plugging this bound back into Equation (14) completes the proof. ∎

5 Integer matrices and rational eigenvalues

Let $\eta_{k}$ be the random variable taking the values $-k,-k+1,\ldots,k-1,k$ each with equal probability, and let $M_{n}$ be the $n$ by $n$ matrix where each entry is an independent copy of $\eta_{k}$ . In , Martin and Wong show that for any $\epsilon>0$ ,

where $c(n,\epsilon)$ is a constant depending on $n$ and $\epsilon$ . (One goal in is to study this bound as $k$ goes to $\infty$ while $n$ is fixed, which is why $c(n,\epsilon)$ is allowed to depend on $n$ .)

Below, we prove a similar result for random integer matrices with entries between $-k$ and $k$ (with $k$ fixed), where we allow each entry to have a different (independent) distribution and we also allow the distributions to be very general.

Fix a positive integer $k$ , and let $M_{k,n}$ be a random integer matrix with independent entries, each of which takes values in the set $\{-k,-k+1,\ldots,k-1,k\}$ . Let $c$ be a constant such that for every entry $\alpha$ , we have $\max_{-k\leq x\leq k}\Pr(\alpha=x)\leq c/k$ . Then

where the $o(1)$ term goes to zero as $n$ goes to $\infty$ .

For example, in the case where each independent entry has the uniform distribution on $\{-k,-k+1,\ldots,k-1,k\}$ (as in ), one can set $c=1/2$ in the corollary above.

The proof given below follows the same outline as the main theorem of , with Corollary 1.2 replacing an appeal to [7, Lemma 1].

The characteristic polynomial for $M_{k,n}$ is monic with integer coefficients, and thus the only possible rational eigenvalues are integers (by the rational roots theorem). Every eigenvalue of $M_{k,n}$ has absolute value at most $nk$ (see [7, Lemma 4]); thus, the only possible integer eigenvalues are between $-nk$ and $nk$ .

The matrix $M_{k,n}$ has $\lambda$ as an eigenvalue if and only if $M_{k,n}-\lambda I$ is singular (where $I$ is the $n$ by $n$ identity matrix). By Corollary 1.2 (with $\mathfrak{f}=0$ ), we have

Random matrices with complex entries: A reduction technique

the map $\phi_{Q}$ is injective on $S$ , and

for any $n$ by $n$ matrix $(s_{ij})_{1\leq i,j\leq n}$ with entries $s_{ij}\in S$ , we have

Let $p$ be a constant such that $0<p\leq 1$ and let $D$ be a characteristic zero integral domain. Let $S\subset D$ have cardinality $\left|S\right|\leq O(1)$ . If $N_{n}$ is an $n$ by $n$ matrix with independent random entries, each taking values in $S$ , such that for every entry $\alpha$ , we have $\max_{x}\Pr(\alpha=x)\leq p$ , then

Proof of the main theorem (Theorem 2.2)

Let $X_{i}:=(\alpha_{i,1},\ldots,\alpha_{i,n})$ denote the $i$ -th row of $N_{n}$ . We note that $N_{n}$ has determinant zero if and only if there is a linear dependency among the rows $X_{i}$ . It has been shown (see [10, Lemma 5.1] and also ) that the dominant contribution to the singularity probability comes from the $X_{i}$ spanning a hyperplane (of dimension $n-1$ ). In particular,

where $A_{V}$ denotes the event that $X_{1},\ldots,X_{n}$ span $V$ , and non-trivial means that $V$ contains the origin, $V$ is spanned by vectors in $S^{n}$ (where $S$ is the set of all possible values that can occur in $N_{n}$ ), and $\Pr(X_{i}\in V)>0$ for all $i$ .

As in , we will divide the non-trivial hyperplanes into $n^{2}$ classes, since it is then sufficient to show that the sum of $\Pr(A_{V})$ over all $V$ in a particular class is at most $(p^{1/r}+o(1))^{n}$ .

For $d_{\pm}=0$ , we define $\operatorname{Gr}(0)$ to be the set of all non-trivial hyperplanes such that

We will refer to $d_{\pm}$ as the combinatorial dimension of $V$ .

Note that $\operatorname{Gr}(d_{\pm})=\emptyset$ for $d_{\pm}\geq n-1+1/n$ (by Lemma B.1). We will consider hyperplanes $V$ with combinatorial dimension in three main regions: $d_{\pm}$ small, $d_{\pm}$ medium-sized, and $d_{\pm}$ large. The two lemmas and the proposition below suffice to prove Theorem 2.2.

The reasoning here is the same as in [11, Lemma 2.3], making use of fact that $\Pr(X_{i}\in V)\leq\max_{1\leq i\leq n}\Pr(X_{i}\in V)\leq p^{n-d_{\pm}}\leq\delta^{n}$ . In particular,

which completes the proof since the summing the right-hand side over all $V$ is at most $n\max_{i}\Pr(X_{i}\in V)$ (note that an instance of the vectors $\{X_{j}\}_{1\leq j\leq n}\setminus\{X_{i}\}$ can span at most one hyperplane). ∎

where $k$ is the number of nonzero coordinates in the normal vector to $V$ . Combining the two inequalities above shows that $k\leq n/2$ .

(Lemma A.2 is a natural generalization of [4, Section 3.1]; see also , [10, Lemma 5.1], and [2, Lemma 14.10].) ∎

The results in and were derived using the ideas that we will use for the unexceptional medium combinatorial dimension case. The idea of considering the exceptional case separately in (and using tools from additive combinatorics in the exceptional case) is what lead to the improvement of Inequality (1), which gives a bound of asymptotically $\left(\frac{3}{4}\right)^{n}$ , over the $.999^{n}$ bound in .

2 Proof of the medium combinatorial dimension

Before defining exceptional and unexceptional hyperplanes, we will need some new notation. By assumption, the collection of random variables $\{\alpha_{ij}\}_{1\leq i,j\leq n}$ is $p$ -bounded of exponent $r$ with a constant $\mu=1-p$ , with random variables $\beta^{(\mu)}_{ij}$ corresponding to each $\alpha_{ij}$ , and with a constant $0<q\leq p$ (see Definition 2.1). We also define a constant slightly smaller than $\mu$ , namely $\displaystyle\underline{\mu}:=\mu-\frac{\epsilon_{0}}{100}$ . We will let $Y_{i}:=(y_{i,1},\ldots,y_{i,n}):=(\beta^{(\underline{\mu})}_{i,1},\ldots,\beta^{(\underline{\mu})}_{i,n})$ denote another row vector that corresponds to the row vector $X_{i}$ ( $\beta^{(\underline{\mu})}_{i,j}$ comes from the definition of $p$ -bounded of exponent $r$ ). Also, we will let

Consider a hyperplane $V$ of medium combinatorial dimension (that is, $d_{\pm}$ satisfies the condition in Proposition 5.4). We say $V$ is unexceptional if there exists an $i_{0}$ where $1\leq i_{0}\leq n$ and there exists a $k_{0}$ where $1\leq k_{0}\leq r$ such that

We say $V$ is exceptional if for every $i$ where $1\leq i\leq n$ and for every $k$ where $1\leq k\leq r$ we have

Proposition 5.4 follows from the two lemmas below, so long as $\epsilon_{1}$ is chosen suitably small with respect to $\epsilon_{0}$ , $c_{m}$ , and $r$ .

We will prove Lemma 5.6 in Section 5.3, and we will prove Lemma 5.7 in Section 6.

3 The unexceptional medium combinatorial dimension case

The general idea for the case of an unexceptional hyperplane $V$ is to replace some of the rows $X_{i}$ in the matrix $N_{n}$ with rows that concentrate more sharply on the subspace $V$ . In the case where the exponent $r=1$ , replacing a row $X_{i}$ with $Y_{i}:=(\beta^{(\underline{\mu})}_{i,1},\ldots,\beta^{(\underline{\mu})}_{i,n})$ is successful; however, in the exponent $r=2$ case, for example, replacing the entire row results in a bound that is off by an exponential factor. We solve this problem by replacing $X_{i}$ with only half of $Y_{i}$ (with the other half of the entries being zero). This idea easily extends to any integer $r\geq 2$ and is the motivation for defining the vectors $Z^{*}_{i,k}$ to have all zeros except for roughly $n/r$ coordinates, as is done in Equation (17). The basic utility of $Z^{*}_{i_{0},k_{0}}$ (from Definition 5.5) is that it concentrates more sharply on the unexceptional subspace $V$ than the vector $X_{i}$ for any $i$ .

Let $Z^{*}_{i_{0},k_{0}}$ be the vector from the definition of unexceptional (Definition 5.5) such that $\Pr(X_{i}\in V)<\epsilon_{1}\Pr(Z^{*}_{i_{0},k_{0}}\in V)$ for every $i$ , and set $Z:=Z^{*}_{i_{0},j_{0}}$ . Let $m$ be the closest integer to $\frac{c_{m}\epsilon_{0}n}{r}$ , where $c_{m}$ is a small positive absolute constant (for example, in , $c_{m}$ is taken to be $\frac{1}{100}$ ). Finally, let $Z_{1},\ldots,Z_{m}$ be copies of $Z$ , independent of each other and of $X_{1},\ldots,X_{n}$ .

Let $B_{V,m}$ be the event that $Z_{1},\ldots,Z_{m}$ are linearly independent and lie in $V$ . Then,

The argument follows the same reasoning as [11, Lemma 4.4], however, the quantity $2^{d_{\pm}-n}$ in should be replaced by $\max_{1\leq i\leq n}\Pr(X_{i}\in V)$ . Details are provided in Appendix B. ∎

To conclude the proof of Lemma 5.6, we follow the “row-swapping” argument at the end of [11, Section 4], with the small change of bounding $\Pr(X_{i}\in V)$ by $\displaystyle\max_{1\leq i\leq n}\Pr(X_{i}\in V)$ , which we use in place of the quantity $2^{d_{\pm}-n}$ . Details are provided in Appendix B.

Analyzing the exceptional medium combinatorial dimension case

The approach for exceptional $V$ in is very different from that used in the unexceptional case or in the large or small combinatorial dimension cases. Using some powerful tools from additive combinatorics, the general idea is to put exceptional hyperplanes $V$ in correspondence with a particular additive structure called a generalized arithmetic progression, and then to show that the number of the particular generalized arithmetic progressions that arise in this way is exceedingly small. The key to this approach is a structure theorem—namely, [11, Theorem 5.3]. In this section, we state a slightly modified structure theorem (Theorem 6.1), and then we show how to use Theorem 6.1 to prove Lemma 5.7. In the beginning of Section 7, we outline the changes needed to prove the the structure theorem for our current context, and in Sections 7 and 8 we provide details.

Before stating the structure theorem, we need some definitions and notation. A generalized arithmetic progression of rank $\mathfrak{r}$ is a set of the form

We will use the notation $mP$ , where $m$ is a positive integer, to denote the set $\{\sum_{i=1}^{m}x_{i}:x_{i}\in P\}$ and the notation $P^{m}$ , where $m$ is a positive integer, to denote the set $\{\prod_{i=1}^{m}x_{i}:x_{i}\in P\}$ . If $P$ is a generalized arithmetic progression of rank $\mathfrak{r}$ , then so is $mP$ , while $P^{m}$ , on the other hand, is a generalized arithmetic progression of rank at most $\mathfrak{r}^{m}$ . Also note that $\left|mP\right|\leq m^{\mathfrak{r}}\left|P\right|$ and that $\left|P^{m}\right|\leq\left|P\right|^{m}$ .

Given an exceptional hyperplane $V$ , there exists a representation of the form

and $M_{1},\ldots,M_{\mathfrak{r}}\geq 1$ with the volume bound

(i) (Scaled defining coordinates lie in a progression) The symmetric generalized arithmetic progression

We will discuss the proof of the structure theorem in Sections 7 and 8. In the remainder of this section, we will discuss how to use the structure theorem to prove Lemma 5.7.

Fix $d_{\pm}$ of medium combinatorial dimension (see Proposition 5.4). Using independence of the rows, we have

In [11, Section 5], it is shown using Theorem 6.1(i) and (ii) and Gaussian-type methods (and the fact that $\mathfrak{r}$ is bounded by a constant) that

Plugging the volume bound on $M_{1}\cdots M_{\mathfrak{r}}$ into the previous displayed inequality, we have

Halász-type arguments

The proof of the structure theorem has two main ingredients: tools from additive combinatorics, and Halász-type arguments using discrete Fourier analysis. Our proof of Theorem 6.1 will follow the proof of [11, Theorem 5.2] very closely. We will use results about additive combinatorics from [11, Section 6] directly, and we will discuss below the extent to which the Halász-type arguments of [11, Section 7] need to be modified to work for our current context. The proof of Theorem 6.1 will be given in Section 8 using results from the current section, [11, Section 6], [11, Section 7], and [11, Section 8]. Our Section 8 follows [11, Section 8] closely, with a few modifications to prove rational $T$ -commensurability instead of only rational commensurability.

In this section we discuss modifications to the lemmas in [11, Section 7] that are needed in order to prove Theorem 6.1.

We will use $e_{Q}(\cdot)$ to denote the primitive character

where the last line is an application of Hölder’s inequality.

where $\underline{\mu}:=\mu-\frac{\epsilon_{0}}{100}$ , as defined in Section 5.2. Note that $\displaystyle f(\xi)=\prod_{j=1}^{n}f_{j}(\xi)$ .

We will need the following analog of [11, Lemma 7.1]:

This inequality may be proven pointwise (for each $j$ after expanding out the definition of $g_{k}$ ) using the convexity of the $\log$ function, just as in the proof of [11, Lemma 7.1] (see also [10, Lemma 7.1]. ∎

( $\mu p_{j,1}\geq 2q$ since $\min_{x}\Pr(\beta^{(\mu)}_{j}=x)\geq q$ by Definition 2.1).

Thus, there is a constant $C(\epsilon_{2},q,r)$ such that

for every $\xi\in\Lambda$ . (E.g., the constant $C(\epsilon_{2},q,r):=\left(\frac{50r}{q}\ln\left(\frac{1}{\epsilon_{2}}\right)\right)^{1/2}$ suffices.)

There exists a constant $C$ depending on $\epsilon_{-1},\epsilon_{0},\epsilon_{1},\epsilon_{2},q,r$ , and $\mu$ such that

Furthermore, for every integer $k\geq 4$ we have

Our goal is to bound $\sum_{\xi\in\Lambda}f(\xi)$ from above and below, and then pass to bounds on $\left|\Lambda\right|$ using the fact that $\epsilon_{2}\leq f(\xi)\leq 1$ for all $\xi\in\Lambda$ .

We can choose $\epsilon_{2}$ sufficiently small with respect to $\epsilon_{1}$ and $1-\underline{\mu}/\mu$ so that, for example,

for some constant $C$ . This completes the proof of Lemma 7.2. ∎

We now state and prove a lemma showing that $f(\xi)$ is at least a constant for all $\xi\in 4\Lambda$ . In , the lemma below is unnecessary because an inequality following from [11, Inequality (30)] (which corresponds to Inequality (30)) and the triangle inequality suffices.

Let $\Lambda$ and $f$ be defined as in Equation (28) and Equation (25), respectively. If $\xi\in 4\Lambda$ , then

Note that $c(\epsilon_{-1},\epsilon_{2})$ is a constant.

Note that Inequality (29) implies that for any $\xi^{\prime}\in\Lambda$ we have

Thus, by the triangle inequality, we have for any $\xi\in 4\Lambda$ that

Fix $\xi\in 4\Lambda$ . Let $k_{0}$ be the number of indices $j$ such that

and without loss of generality, say that these indices are $j=1,2,\ldots,k_{0}$ . Squaring Inequality (34), we see that $\frac{k_{0}}{200\mu}\leq\frac{1600r}{\mu}\ln\left(\frac{1}{\epsilon_{2}}\right)$ , and so we have

which is a constant. Thus, for the vast majority of the indices $j$ , namely $j=k_{0}+1,k_{0}+2,\ldots,n$ , we have

Note that $0\leq\left\|x\right\|_{\Lambda}\leq 1$ for all $x$ and that the triangle inequality holds: $\left\|x+y\right\|_{\Lambda}\leq\left\|x\right\|_{\Lambda}+\left\|y\right\|_{\Lambda}$ . We also have that

Thus, squaring Inequality (30) and summing over all $\xi\in\Lambda$ , we have

In the next section, we will complete the proof of the structure theorem using the lemma above.

Proof of the Structure Theorem (Theorem 6.1)

The key to proving the structure theorem is an application of Freiman’s Theorem for finite fields.

The symmetric generalized arithmetic progression $P$ is close to what is needed for Theorem 6.1, since it satisfies the required volume and rank bounds. We will show below that $P$ can be altered in ways that preserve Inequalities (37) and (38) (except possibly for changing the implicit constants) so that $P$ satisfies conditions (i), (ii), and (iii) of Theorem 6.1.

There is an absolute constant $C_{0}\geq 1$ such that the following holds. Let $P$ be a symmetric progression of rank $\mathfrak{r}$ in a abelian group $G$ , such that every nonzero element of $G$ has order at least $\mathfrak{r}^{C_{0}\mathfrak{r}^{3}}|P|$ . Then there exists a proper symmetric generalized arithmetic progression $P^{\prime}$ of rank at most $\mathfrak{r}$ containing $P$ such that

Furthermore, if $P$ is not proper and $\mathfrak{r}\geq 2$ , then $P^{\prime}$ can be chosen to have rank an most $\mathfrak{r}-1$

One can conclude Lemma 8.2 from the proof of [11, Lemma 9.3] (the only difference is noting that the rank can be reduced by at least 1 if $P$ is not proper to begin with). Note that we can always choose $Q$ larger than $\mathfrak{r}^{C_{0}\mathfrak{r}^{3}}|P|\leq O\left(\frac{1}{p}\right)^{n}$ .

Furthermore, if $r\geq 2$ and if there is some $j$ for which $P$ is not ( $k_{j},x_{j}$ )-proper, then $P^{\prime}$ can be chosen to have rank at most $\mathfrak{r}-1$ .

We proceed by induction on the rank $\mathfrak{r}$ . For the base case, let $\mathfrak{r}=1$ and consider $x_{j}\in P$ such that $k_{j}x_{j}\in P$ . Since $P$ has rank 1 in this case, we have that $x_{j}=\Phi_{P}(x_{j})v_{1}$ and $k_{j}x_{j}=\Phi_{P}(k_{j}x_{i})v_{1}$ . Combining these two equations we have $k_{j}\Phi_{P}(x_{j})v_{1}=\Phi_{P}(k_{j}x_{j})v_{1}$ , and dividing by $v_{1}$ (note that we may assume that $v_{1}\neq 0$ ), we see that $k_{j}\Phi_{P}(x_{j})=\Phi_{P}(k_{j}x_{j})$ . Thus $P$ is $(k_{j},x_{j})$ -proper for every $j$ .

For $\mathfrak{r}\geq 2$ , we may assume that there is some $j_{0}$ such that $k_{j_{0}}\Phi_{P}(x_{j_{0}})\neq\Phi_{P}(k_{j_{0}}x_{j_{0}})$ (i.e., we assume that $P$ is not $(k_{j_{0}},x_{j_{0}})$ -proper). We may assume that $P$ has the form $\{m_{1}v_{1}+\cdots+m_{\mathfrak{r}}v_{\mathfrak{r}}:\left|m_{i}\right|<M_{i}/2\}$ . Let $M:=(M_{1},\ldots,M_{\mathfrak{r}})$ , and let $(-M/2,M/2)$ denote the box $\{(m_{1},\ldots,m_{\mathfrak{r}}):\left|m_{i}\right|<M_{i}/2\}$ .

Let $\underline{k}$ be the largest integer such that $\Phi_{P}(\underline{k}x_{j_{0}})=\underline{k}\Phi_{P}(x_{j_{0}})$ , so $1\leq\underline{k}<k_{j_{0}}$ and $\Phi_{P}((\underline{k}+1)x_{j_{0}})\neq(\underline{k}+1)\Phi_{P}(x_{j_{0}})$ . Since $\underline{k}x_{j_{0}}\in P$ and $x_{j_{0}}\in P$ , we know that $\Phi_{P}(x_{j_{0}})\in(-M/2,M/2)$ and $\Phi_{P}(\underline{k}x_{j_{0}})=\underline{k}\Phi_{P}(x_{j_{0}})\in(-M/2,M/2)$ ; and thus, $(\underline{k}+1)\Phi_{P}(x_{j_{0}})\in(-M,M)$ . This shows that $2P$ , which has dimensions $2M=(2M_{1},\ldots,2M_{\mathfrak{r}})$ , is not proper, since it has two distinct representations for $(\underline{k}+1)x_{j_{0}}$ .

We can now apply Lemma 8.2 to $2P$ , thus finding a proper symmetric generalized arithmetic progression $P^{\prime}$ of rank at most $\mathfrak{r}-1$ containing $2P$ (which contains $P$ ) such that

Since $P^{\prime}$ has rank at most $\mathfrak{r}-1$ , we have by induction that there exists $P^{\prime\prime}$ a proper symmetric generalized arithmetic progression of rank at most $\mathfrak{r}-1$ containing $P^{\prime}$ and such that

and such that $P^{\prime\prime}$ is ( $k_{j},x_{j}$ )-proper for every $j$ . Choosing $C_{1}\geq 2C_{0}$ (for example) guarantees that $\mathfrak{r}^{C_{1}(\mathfrak{r}-1)^{4}}\mathfrak{r}^{2C_{0}\mathfrak{r}^{3}}\leq\mathfrak{r}^{C_{1}\mathfrak{r}^{4}}$ , which completes the induction. ∎

Since not all the $\alpha_{i}$ are zero, we may assume that $\alpha_{\mathfrak{r}}\neq 0$ . Setting $w=v_{\mathfrak{r}}/\alpha_{\mathfrak{r}}$ so that $v_{\mathfrak{r}}-w\alpha_{\mathfrak{r}}=0$ , we see that $P$ is contained in the symmetric generalized arithmetic progression

with rank $\mathfrak{r}-1$ , dimensions $M_{1},\ldots,M_{\mathfrak{r}-1}$ (which are the same as the corresponding dimensions for $P$ ), and basis vectors $v_{i}^{\prime}:=v_{i}-\alpha_{i}v_{\mathfrak{r}}/\alpha_{\mathfrak{r}}$ . By construction $|P^{\prime}|\leq|P|$ . ∎

If $P$ is proper, then do nothing; otherwise apply Lemma 8.2.

If $P$ satisfies the three properties given in steps 1, 2, and 3, halt; otherwise, return to step 1.

Thus, all that is left to prove is part (iii), the claim of rational $T$ -commensurability. Though we will not need it in the current section, one should recall that Theorem 6.1 is only useful when $\left|n^{o(n)}T^{O(1)}\right|=n^{o(n)}$ , where $T$ is the symmetric generalized arithmetic progression containing $\{-1,0,1\}$ and all possible values taken by the $\beta^{(\mu)}_{ij}$ and the $\alpha_{ij}$ (see Section 6).

We say that a set $W$ economically $T$ -spans a set $U$ if each $u\in U$ can be represented as a highly $T$ -rational linear combination of elements in $W$ , where each coefficient may be expressed as $a/b$ where $a,b\in n^{o(n)}T^{O(1)}$ and where the implicit constants in the $o(\cdot)$ and $O(\cdot)$ notation are uniform over $U$ .

Comparing our definitions with those from [11, Section 8], we note that “highly rational” means the same thing as “highly $\{-1,0,1\}$ -rational”, and “economically spans” means the same thing as “economically $\{-1,0,1\}$ -spans”. Thus, it is clear that any highly rational number is also highly $T$ -rational for any $T$ containing $\{-1,0,1\}$ , and also the statement “ $W$ economically spans $U$ ” implies “ $W$ economically $T$ -spans $U$ ” for any set $T$ containing $\{-1,0,1\}$ . The remainder of this section paraphrases (with some notational changes) the latter portion of [11, Section 8].

Let $s$ be the smallest integer such that there exists a subset of cardinality $s$ of $\{v_{1},\ldots,v_{\mathfrak{r}}\}$ (by renumbering, say the set is $\{v_{1},\ldots,v_{s}\}$ ) so that for some nonzero $d\in n^{o(n)}T^{O(1)}$ and some $c_{ij}\in n^{o(n)}T^{O(1)}$ we have

Note that the line above is the only place in the proof where we use the assumption from the definition of $p$ -bounded of exponent $r$ that the $\beta^{(\mu)}_{ij}$ take values in a set with cardinality $O(1)$ . As is evidenced here, the following weaker assumption suffices instead: say that for each $1\leq i\leq n$ there exists a set $B_{i}$ such that $\left|B_{i}\right|=O(1)$ and such that $\beta^{(\mu)}_{i1},\beta^{(\mu)}_{i2},\ldots,\beta^{(\mu)}_{in}$ each take a nonzero value in $B_{i}$ with probability at least $q$ . In fact, this weaker assumption also replaces the assumption in the definition of $p$ -bounded of exponent $r$ that $q\leq\min_{x}\Pr(\beta^{(\mu)}_{ij}=x)$ for every $i,j$ : It suffices for each $\beta^{(\mu)}_{ij}$ to take one value in $B_{i}$ with probability at least $q$ , instead of taking every value with probability at least $q$ .

Plugging this last equation into Equation (39), we arrive at

Thus, we have completed the proof of the structure theorem (Theorem 6.1). $\square$

A generalization: 𝔣𝔣\mathfrak{f} rows have fixed, non-random values

In this section, we will give a generalization of Theorem 2.2 to the case where the random matrix $N_{n}$ has $\mathfrak{f}\leq O(\ln n)$ rows that are assumed to be linearly independent and contain fixed, non-random entries. The proof of the generalized result is very similar to the proof of Theorem 2.2, and we will sketch the main differences in the two proofs below.

Let $\mathfrak{f}$ be an integer between 1 and $n$ , let $S$ be a subset of a ring, and let $N_{\mathfrak{f},n}$ be an $n$ by $n$ matrix defined as follows. For $1\leq i\leq\mathfrak{f}$ and $1\leq j\leq n$ , let the entries $s_{ij}$ of $N_{\mathfrak{f},n}$ be fixed (non-random) elements of $S$ such that the rows $(s_{i,1},\ldots,s_{i,n})$ for $1\leq i\leq\mathfrak{f}$ are linearly independent. For $\mathfrak{f}+1\leq i\leq n$ and $1\leq j\leq n$ , let the entries $\alpha_{ij}$ of $N_{\mathfrak{f},n}$ be discrete finite random variables taking values in $S$ . Thus,

Let $p$ be a positive constant such that $0<p<1$ , let $r$ be a positive integer constant, and let $S$ be a generalized arithmetic progression in the complex numbers with rank $O(1)$ (independent of $n$ ) and with cardinality at most $\left|S\right|\leq n^{o(n)}$ . Consider the matrix $N_{\mathfrak{f},n}$ with entries in $S$ (see Definition 9.1 above), where $\mathfrak{f}\leq\left(\frac{r}{2\ln(1/p)}-o(1)\right)\ln n$ . If the collection of random variables $\{\alpha_{jk}\}_{\mathfrak{f}+1\leq j\leq n,1\leq k\leq n}$ is $p$ -bounded of exponent $r$ , then

Note that the bound on the singularity probability of $N_{\mathfrak{f},n}$ for $r\geq 2$ is the same as in Theorem 2.2 (since for $r\geq 2$ , we have $n/r\ll n-c\ln n=n-\mathfrak{f}$ ). This is a reflection of the fact that only the large dimension case uses the randomness in all the rows simultaneously, and in that case the exponential bound does not depend on $r$ . Generally speaking, the best known lower bounds on the singularity probability of a discrete random matrix come from a dependency among at most two random rows, and since $N_{\mathfrak{f},n}$ certainly has more than two random rows, the upper bounds given in Theorem 9.2 seem reasonable.

Theorem 9.2 leads to Corollary 1.2 by following a conditioning argument very similar to that given in Section 3.3.

The proof of Theorem 9.2 follows the same lines of reasoning as that of Theorem 2.2. In this subsection, we will state the main lemmas with the necessary modifications, and we will mention a few important considerations when making the modifications.

Note that Equation (15), which reduces the question of singularity to one of the rows spanning non-trivial hyperplane of dimension $n-1$ holds in the current context, using the same definition of $A_{V}$ and “non-trivial hyperplane” (both are defined after Equation (15) in Section 5.1).

For $d_{\pm}=0$ , we define $\operatorname{Gr}_{\mathfrak{f}}(0)$ to be the set of all non-trivial hyperplanes such that

We will refer to $d_{\pm}$ as the combinatorial dimension of $V$ .

The proof is the same as that for Lemma 5.2; also see , , . ∎

The proof is the same as that for Lemma 5.3, except now we appeal to Lemma A.2 with $\mathfrak{f}>0$ . Note that we must assume $\mathfrak{f}\leq n/2$ in order to apply Lemma A.2. See also ,,. ∎

Consider a hyperplane $V$ of medium combinatorial dimension (that is, $d_{\pm}$ satisfies the condition in Proposition 9.6). We say $V$ is unexceptional if there exists an $i_{0}$ where $\mathfrak{f}+1\leq i_{0}\leq n$ and there exists a $k_{0}$ where $1\leq k_{0}\leq r$ such that

We say $V$ is exceptional if for every $i$ where $\mathfrak{f}+1\leq i\leq n$ and for every $k$ where $1\leq k\leq r$ we have

If $\mathfrak{f}\leq\frac{c_{\mathfrak{f}}\epsilon_{0}n}{r}$ for some positive constant $c_{\mathfrak{f}}$ , then we have

The proof follows in the same way as that for Lemma 5.6; however, when replacing rows $X_{i}$ of $N_{\mathfrak{f},n}$ with rows $Z_{i}$ that concentrate more sharply on $V$ , we must take care to only replace random rows of $N_{\mathfrak{f},n}$ (i.e., rows $X_{1},\ldots,X_{\mathfrak{f}}$ must not be replaced by $Z_{i}$ ). See Appendix B for details. ∎

In the exceptional case, The same structure theorem (Theorem 6.1) holds, leading to the following lemma.

If $\mathfrak{f}\leq\left(\frac{r}{2\ln(1/p)}-o(1)\right)\ln n$ , then

Note that this upper bound is dramatically worse than the analogous upper bound in Lemma 5.7 of $n^{-\frac{n}{2}+o(n)}$ .

As in Lemma 5.7, the main step in the proof is applying the structure theorem (Theorem 6.1). In the current context, Inequality (20) holds with $n-\mathfrak{f}$ as the exponent instead of $n$ (since there are only $n-\mathfrak{f}$ random rows). If we combine this modified version of Inequality (20) with Inequality (21), then we have the bound

Acknowledgments

We would like to thank Kevin Costello for helpful conversations on the conditioning argument in Subsections 3.2 and 3.3. Also the third author would like to thank the National Defense Science and Engineering Fellowship and the National Science Foundation Graduate Research Fellowship for helping fund this work.

Appendix A Two background results

For $Q$ sufficiently large with respect to $q$ , $r$ , and $n$ , it is clear that we have

Our proof is closely modeled on the proof of [12, Corollary 7.13]. Let $\beta^{(\mu)}_{j}$ be the symmetric random variables from the definition of $p$ -bounded of exponent $r$ corresponding to $\alpha_{j}$ (see Equation (9)). Then, we can compute

Combining the above inequalities with Inequality (44) and following the proof of [12, Corollary 7.13] to bound the integral, we have

A.2 A generalization of a lemma due to Komlós [6]

This lemma is a generalization of the result in (see also [2, Lemma 14.10], [4, Section 3.1], and [10, Lemma 5.3]).

where the constant $c$ can be taken to be $c\geq 2\ln(100/p)$ , and where $\underline{0}$ denotes the zero vector.

Let $E_{k}=\{\mbox{there exists }v\in\Omega_{1}\mbox{ with at most$ k $nonzero coordinates such that }N_{\mathfrak{f},n}\cdot v=0\}$ . Clearly,

Let $S$ be the set of all possible values that could appear as entries in $N_{\mathfrak{f},n}$ , and let $\left.N_{\mathfrak{f},n}\right|_{j_{1},\ldots,j_{k}}$ be the $n$ by $k$ matrix consisting of columns $j_{1},\ldots,j_{k}$ of $N_{\mathfrak{f},n}$ . Following [6, Lemma 2] (see also [2, Lemma 14.10] and [10, Lemma 5.3]) we can write

Let $U(k,p,q)$ be a uniform upper bound for $\Pr(\mbox{row$ i $is in$ H $})$ , where $\mathfrak{f}+1\leq i\leq n$ and $q$ is the constant from Definition 2.1 (here, we mean uniform with respect to the index sets $\{j_{1},\ldots,j_{k}\}$ and $\{i_{1},\ldots,i_{k}\}$ ). Then we have

since $k-1$ fixed rows of $\left.N_{\mathfrak{f},n}\right|_{j_{1},\ldots,j_{k}}$ can span at most 1 hyperplane $H$ of dimension $k-1$ .

Summing the bounds in Inequalities (45) and (46) completes the proof. ∎

Appendix B The unexceptional case with 𝔣𝔣\mathfrak{f} fixed rows

This section is adapted from the proof of [11, Lemma 4.1], and proves Lemma 5.6 by setting $\mathfrak{f}=0$ . Assume that $\mathfrak{f}\leq\frac{c_{\mathfrak{f}}\epsilon_{0}n}{r}$ , and let $m$ be the closest integer to $\frac{c_{m}\epsilon n}{r}$ . Let $Z_{1},\ldots,Z_{m}$ be i.i.d. copies of the unexceptional row vector $Z^{*}_{i_{0},k_{0}}$ from Definition 9.7, so $\epsilon_{1}\Pr(Z_{i}\in V)>\Pr(X_{i}\in V)$ for all $\mathfrak{f}+1\leq i\leq n$ . We will need the following version of the Weighted Odlyzko Lemma:

[cf. [11, Lemma 4.3] or [4, Section 3.2]] For $1\leq i$ , let $W_{i-1}$ be an ( $\mathfrak{f}+i-1$ )-dimensional subspace containing $X_{1},\ldots,X_{\mathfrak{f}}$ (which are fixed, linearly independent row vectors). Then

Since $W_{i-1}$ has dimension $\mathfrak{f}+i-1$ , there exists a set of $\mathfrak{f}+i-1$ “determining” coordinates such that if a vector $V\in W_{i-1}$ , then the $\mathfrak{f}+i-1$ “determining” coordinates determine the values of the remaining $n-\mathfrak{f}-i+1$ coordinates. Since the maximum probability that any of the $n/r$ random coordinates in $Z_{i}$ takes a given value is at most $1-\underline{\mu}=p+\frac{\epsilon_{0}}{100}$ , and since there are at least $\frac{n}{r}-\mathfrak{f}-i+1$ of the random coordinates in $Z_{i}$ that are not among the “determining” coordinates, we have the desired upper bound. ∎

Let $V_{0}:=\operatorname{Span}\{X_{1},\ldots,X_{\mathfrak{f}}\}$ , the space spanned by the $\mathfrak{f}$ fixed rows, and for $1\leq i\leq m$ let $B_{V,i}$ be the event that $Z_{1},\ldots,Z_{m}$ are linearly independent in $V\setminus V_{0}$ . We have the following analog of Lemma 5.8 (and also [11, Lemma 4.4]):

Let $m$ , $\mathfrak{f}$ , and $B_{V,m}$ be as defined above. Then,

where $B_{V,0}$ denotes the full space of the $Z_{i}$ . Conditioning on a particular instance of $Z_{1},\ldots,Z_{i-1}$ in $B_{V,i-1}$ , we have that

where $W_{i-1}$ denotes the ( $\mathfrak{f}+i-1$ )-dimensional space spanned by $X_{1},\ldots,X_{\mathfrak{f}}$ and $Z_{1},\ldots,Z_{i-1}$ . We will now establish a uniform bound that does not depend on which particular instance of $Z_{1},\ldots,Z_{i-1}$ in $B_{V,i-1}$ that we fixed by conditioning. By the definition of unexceptional, we have

and by the Weighted Odlyzko Lemma (see Lemma B.1), we have

Using Taylor’s Theorem with remainder (for example), one can show that

and plugging this estimate back into Inequality (47) we get

To conclude Lemma 9.8 (which implies Lemma 5.6 by setting $\mathfrak{f}=0$ ), we will proceed as in the proof for [11, Lemma 4.1].

Let $Z_{1},\ldots,Z_{m}$ be i.i.d. copies of $Z^{*}_{i_{0},k_{0}}$ that are independent of the random rows $X_{\mathfrak{f}+1},\ldots,X_{n}$ . Using independence and Bayes’ Identity we have

Because the $Z_{i}$ are linearly independent in $V\setminus V_{0}$ , we know that there is a subset $I\subset\{\mathfrak{f}+1,\mathfrak{f}+2,\ldots,n\}$ of cardinality $\left|I\right|=m$ , such that $\{Z_{1},\ldots,Z_{m}\}\cup\{X_{i}:i\notin I\}$ spans $V$ . Let $C_{V,I}$ be the event that $\{Z_{1},\ldots,Z_{m}\}\cup\{X_{i}:i\notin I\}$ spans $V$ . Then we have

Summing the above inequality over all unexceptional $V$ (note that $\sum_{V}\Pr(C_{V,I})\leq 1$ ) and combining with the bound for $\Pr(A_{V})$ above gives us

This completes the proof of the estimate for unexceptional $V$ .