On the singularity probability of discrete random matrices

Jean Bourgain, Van Vu, Philip Matchett Wood

Introduction

Let nn be a large integer and MnM_{n} be an nn by nn random matrix whose entries are independent (but not necessarily identically distributed) discrete random variables taking values in the complex numbers. The problem of estimating the probability that MnM_{n} is singular is a basic problem in the theory of random matrices and combinatorics. The goal of this paper is to give a bound that applies to a large variety of distributions. The general statement (Theorem 2.2) is a bit technical, so we will first discuss a few corollaries concerning special cases.

The most famous special case is when the entries of MnM_{n} are independent identically distributed (i.i.d.) Bernoulli random variables (taking values ±1\pm 1 with probability 1/21/2). The following conjecture has been open for quite some time:

For M±1,nM_{\pm 1,n} an nn by nn matrix with each entry an i.i.d. Bernoulli random variable taking the values +1+1 and 1-1 each with probability 1/21/2,

It is easy to verify that the singularity probability is at least (1/2)n(1/2)^{n} by considering the probability that there are two equal rows (or columns).

Even in the case of i.i.d. Bernoulli random variables, proving that the singularity probability is o(1)o(1) is not trivial. It was first done by Komlós in 1967 (see also ; generalizes Komlós’s bound to other integer distributions). The first exponential bound was proven by Kahn, Komlós, and Szemerédi , who showed that Pr(M±1,n\mboxissingular).999n\Pr(M_{\pm 1,n}\mbox{ is singular})\leq.999^{n}. This upper bound was improved upon by Tao and Vu in to .958n.958^{n}. A more significant improvement was obtained by the same authors in :

This improvement was made possible through the discovery of a new theorem [11, Theorem 5.2] (which was called the Structure Theorem in ), which gives a complete characterization of a set with certain additive properties. The Structure Theorem (to be more precise, a variant of it) will play a critical role in the current paper as well.

Our general result has the following corollary in the Bernoulli case:

which gives a slight improvement over Inequality (1) (since 1/20.7071<.751/\sqrt{2}\approx 0.7071<.75).

Let us now discuss a more general class of random matrices. Consider the random variable γ(μ)\gamma^{(\mu)} defined by

and let M±1,n(μ)M_{\pm 1,n}^{(\mu)} be an nn by nn matrix with each entry an independent copy of γ(μ)\gamma^{(\mu)}. The random variable γ(μ)\gamma^{(\mu)} plays an important role in , and the matrices M±1,n(μ)M_{\pm 1,n}^{(\mu)} are of interest in their own right. In fact, giving zero a large weight is a natural thing to do when one would like to (randomly) sparsify a matrix, a common operation used in randomized algorithms (the values of ±1\pm 1, as the reader will see, are not so critical). Our general result implies the following upper bounds:

Note that Inequality (5) implies Inequality (1) and that Inequality (6) implies Inequality (2) (in both cases setting μ=1\mu=1).

Figure 1 summarizes the upper bounds from Inequalities (4), (5), and (6) and also includes the following lower bounds:

These lower bounds can be derived by computing the probability that one row is all zeros (Inequality (7)) or that there is a dependency between two rows (Inequality (8)). Note that in the case where μ1/2\mu\leq 1/2, the upper bound in Inequality (4) asymptotically equals the lower bound in Inequality (7), and thus our result is the best possible in this case. We also used a Maple program to derive the formulas for lower bounds resulting from a dependency between three, four, or five rows; however, these lower bounds were inferior to those in Inequality (7) and Inequality (8).

We will now present another corollary of the main theorem that has a somewhat different flavor. In this corollary, we treat partially random matrices, which may have many deterministic rows. Our method allows us to obtain exponential bounds so long as there are still at most clnnc\ln n random rows, where c>0c>0 is a particular constant.

Notice that the case f=0\mathfrak{f}=0 and p=1/2p=1/2 also implies Inequality (2).

The focus of this paper is optimizing the base of the exponent in bounds on the singularity probability for discrete random matrices. One main tool in this optimization is the use of a structure theorem similar to [11, Theorem 5.2] (see Theorem 6.1 below); however, using such a theorem requires additional assumptions to be placed on the values that can appear as entries, and in particular, this is why we assume in Corollary 1.2 that the set SS has cardinality SO(1)\left|S\right|\leq O(1) and that fclnn\mathfrak{f}\leq c\ln n. If one is interested in an exponential bound where there are no conditions on f\mathfrak{f} or on the set SS (at the expense of having an unspecified constant for the base of the exponential), one can follow the analysis in , which does not make use of a structure theorem, along with ideas in this paper to get a result of the following form:

For every ϵ>0\epsilon>0 there exists δ>0\delta>0 such that the following holds. Let Nf,nN_{\mathfrak{f},n} be an nn by nn complex matrix in which f\mathfrak{f} rows contain fixed, non-random entries and where the other rows contain entries that are independent discrete random variables. If the fixed rows have co-rank kk and if for every random entry α\alpha, we have maxxPr(α=x)1ϵ\max_{x}\Pr(\alpha=x)\leq 1-\epsilon, then for all sufficiently large nn

Note that Theorem 1.4 holds for any f\mathfrak{f} and kk, and so in particular, an exponential bound on the singularity probability is achieved whenever k=0k=0 and fcn\mathfrak{f}\leq cn, where c<1c<1 is a constant. Also note that the theorem allows the random entries to have discrete distributions taking infinitely many values. Corollary 3.6 proves a version of Theorem 1.4 with a much better exponential bound, given some additional conditions.

The structure of the rest of the paper is as follows. In Section 2 we define pp-bounded of exponent rr and state the main theorem of this paper. In Section 3, we discuss some corollaries of Theorem 2.2. In particular, we will:

prove general bounds on the singularity probability for discrete random matrices with entries that have symmetric distributions and with entries that have asymmetric distributions;

Prove a version of Corollary 1.2 (namely, Corollary 3.5) that holds for up to o(n)o(n) fixed rows, assuming that the entries in the fixed rows take integer values between C-C and CC for any positive constant CC; and

prove that the probability that random matrices with integer entries have a rational eigenvalue is exponentially small.

The general theorem

To prove the results in Inequalities (1) and (2) (and also the results in and ), one basic idea is to replace entries of a random matrix with independent copies of the random variable γ(μ)\gamma^{(\mu)} or 2γ(μ)2\gamma^{(\mu)} (see Equation (3)). One key idea in proving the more general results of the current paper is replacing the entries of a random matrix with more complicated symmetric discrete random variables.

The following definition lies at the heart of our analysis.

Let pp be a positive constant such that 0<p<10<p<1 and let rr be a positive integer constant. A random variable α\alpha taking values in the integers (or, respectively, the integers modulo some large prime QQ) is pp-bounded of exponent rr if

qminxPr(β(μ)=x)q\leq\min_{x}\Pr(\beta^{(\mu)}=x) and maxxPr(β(μ)=x)p\max_{x}\Pr(\beta^{(\mu)}=x)\leq p, and

We will define pp-bounded of exponent rr for collections of random variables below, but first we note that the conditions above are easy to verify in practice. In particular, if we have a symmetric random variable

where the equality on the right-hand side is a simple expected value computation.

We say that a collection of random variables {αjk}j,k=1n\{\alpha_{jk}\}_{j,k=1}^{n} is pp-bounded of exponent rr if each αjk\alpha_{jk} is pp-bounded of exponent rr with the same constants pp, qq, and rr; and, importantly, the same value of μ=1p\mu=1-p. We also make the critical assumption that the set of all values that can be taken by the βjk(μ)\beta^{(\mu)}_{jk} has cardinality O(1)O(1) (a relaxation of this assumption is discussed in Remark 8.5). However, the definition of βjk(μ)\beta^{(\mu)}_{jk} is otherwise allowed to vary with jj and kk. Also, we will use SS to denote the set of all possible values taken by the random variables αjk\alpha_{jk}, and we will assume that the cardinality of SS is at most Sno(n)\left|S\right|\leq n^{o(n)}.

Let pp be a positive constant such that 0<p<10<p<1, let rr be a positive integer constant, and let SS be a generalized arithmetic progression in the complex numbers with rank O(1)O(1) (independent of nn) and with cardinality at most Sno(n)\left|S\right|\leq n^{o(n)}. Let NnN_{n} be an nn by nn matrix with entries αjk\alpha_{jk}, each of which is an independent random variable taking values in SS. If the collection of random variables {αjk}1j,kn\{\alpha_{jk}\}_{1\leq j,k\leq n} is pp-bounded of exponent rr, then

In the motivating examples of Section 1 (excluding Corollary 1.2), we discussed the case where the entries of the matrix are i.i.d.; however, in general the distributions of the entries are allowed to differ (and even depend on nn), so long as the entries all take values in the same structured set SS described above. The condition that SS has additive structure seems to be an artifact of the proof (in particular, at certain points in the proof of Theorem 6.1, we need the set {j=1nxj:xjS\mboxforallj}\left\{\sum_{j=1}^{n}x_{j}:x_{j}\in S\mbox{ for all }j\right\} to have cardinality at most no(n)n^{o(n)}). The easiest way to guarantee that SS has the required structure is to assume that the set of values taken by all the αjk\alpha_{jk} has cardinality at most O(1)O(1), and this is the approach we take for the corollaries in Section 3, since it also makes it easy to demonstrate that the collection of entries is pp-bounded of exponent rr.

Some corollaries of Theorem 2.2

In each corollary, we will use the definition of pp-bounded of exponent 1 and of exponent 2. The definition of pp-bounded of exponent 2 is particularly useful, since then the absolute value on the left-hand side of Inequality (10) is automatically dealt with; however, when μ\mu is small (for example whenever μ1/2\mu\leq 1/2), one can get better bounds by using pp-bounded of exponent 1. We have not yet found an example where the best possible bound from Theorem 2.2 is found by using pp-bounded of an exponent higher than 2.

To prove Inequality (4), we note for 0μ120\leq\mu\leq\frac{1}{2} that (using the definition in Equation (3) of γ(μ)\gamma^{(\mu)})

and thus γ(μ)\gamma^{(\mu)} is (1μ)(1-\mu)-bounded of exponent 1 (i.e., take β(μ):=γ(μ)\beta^{(\mu)}:=\gamma^{(\mu)}), and so Inequality (4) follows from Theorem 2.2.

To prove Inequality (5), we note for 12μ1\frac{1}{2}\leq\mu\leq 1 that

(the inequality above may be checked by squaring both sides and expanding as polynomials in cos(2πt)\cos(2\pi t)). Thus, we can take

to see that γ(μ)\gamma^{(\mu)} is (2μ+14)\displaystyle\left(\frac{2\mu+1}{4}\right)-bounded of exponent 1, and so Inequality (5) follows from Theorem 2.2.

To prove Inequality (6), we note for 0μ10\leq\mu\leq 1 that

to see that γ(μ)\gamma^{(\mu)} is (12μ+32μ2)\displaystyle\left(1-2\mu+\frac{3}{2}\mu^{2}\right)-bounded of exponent 2, and so Inequality (6) follows from Theorem 2.2.

2 Matrices with entries having symmetric distributions

In this subsection, we will prove a singularity bound for an nn by nn matrix Nn(μ)N_{n}^{(\mu)} for which each entry is a symmetric discrete random variable taking the value 0 with probability 1μ1-\mu.

Let SS be a set of complex numbers with cardinality SO(1)\left|S\right|\leq O(1). If Nn(μ)N_{n}^{(\mu)} is an nn by nn matrix in which each entry is an independent symmetric complex random variable taking values in SS and taking the value 0 with probability 1μ1-\mu, then

In particular, the same upper bounds as in Inequalities (4), (5), and (6) (which are shown in Figure 1) apply to the singularity probability for Nn(μ)N_{n}^{(\mu)}.

Let αij\alpha_{ij} be an entry of Nn(μ)N_{n}^{(\mu)}. Since αij\alpha_{ij} is symmetric and takes the value 0 with probability 1μ1-\mu, we may write αij=γij(μ)ηij\alpha_{ij}=\gamma^{(\mu)}_{ij}\eta_{ij}, where γij(μ)\gamma^{(\mu)}_{ij} is an independent copy of γ(μ)\gamma^{(\mu)} as defined in Equation (3) and ηij\eta_{ij} is a random variable that shares no values with ηij-\eta_{ij}. This description of αij\alpha_{ij} was inspired by , and it allows us to condition on ηij\eta_{ij} and then use the remaining randomness in γij(μ)\gamma^{(\mu)}_{ij} to get a bound on the singularity probability. In particular,

where the sum runs over all (n2)(n^{2})-tuples (cij)1i,jn(c_{ij})_{1\leq i,j\leq n} of possible values taken by random variables ηij\eta_{ij}. Since (cij)Pr({ηij=cij})=1\sum_{(c_{ij})}\Pr(\{\eta_{ij}=c_{ij}\})=1, we can complete the proof by proving an exponential bound on Pr(Nn(μ)\mboxissingular{ηij=cij})\Pr(N_{n}^{(\mu)}\mbox{ is singular}|\{\eta_{ij}=c_{ij}\}), and we will use Theorem 2.2 for this task.

We have thus shown that the entries of Nn(μ){ηij=cij}\left.N_{n}^{(\mu)}\rule{0.0pt}{11.0pt}\right|_{\{\eta_{ij}=c_{ij}\}} are

Applying Theorem 2.2 completes the proof. ∎

Corollary 3.1 is tight for 0μ120\leq\mu\leq\frac{1}{2}, since the probability of a row of all zeroes occurring is (1μ+o(1))n(1-\mu+o(1))^{n}; however, for any specific case, Theorem 2.2 can usually prove better upper bounds than those given by Corollary 3.1.

For example, consider the case of a matrix M{±2,±1},n(μ)M_{\{\pm 2,\pm 1\},n}^{(\mu)} with each entry an independent copy of the symmetric random variable

For M{±2,±1},n(μ)M_{\{\pm 2,\pm 1\},n}^{(\mu)} as defined above, we have

By the definition of α(μ)\alpha^{(\mu)} we have

(i.e., the right-hand side of the equation above is non-negative for such μ\mu), which proves the first bound.

for 0μ10\leq\mu\leq 1, which proves the second bound. ∎

We also have the following lower bounds for the singularity probability of M{±2,±1},n(μ)M_{\{\pm 2,\pm 1\},n}^{(\mu)}:

The results of Corollary 3.2 and the corresponding lower bounds are shown in Figure 2, and one should note that the upper bounds are substantially better than those guaranteed by Corollary 3.1.

3 Random matrices with entries having arbitrary distributions

A useful feature of the definition of pp-bounded of exponent 2 is that it lets one bound the singularity probability of matrices with independent discrete random variables that are asymmetric.

We will need the following slightly more general corollary in Section 3.4. For a set AA and an integer mm, we will use the notation mA:={j=1maj:ajA}mA:=\{\sum_{j=1}^{m}a_{j}:a_{j}\in A\} and Am:={j=1maj:ajA}A^{m}:=\{\prod_{j=1}^{m}a_{j}:a_{j}\in A\}.

Note that that Corollary 3.4 implies Corollary 3.3 by taking XnX_{n} to be the matrix of all zeroes.

Let αij\alpha_{ij} be an entry in NnN_{n}. Our goal is to describe αij\alpha_{ij} in a two-step random process, condition on one of the steps, and then use the randomness in the other step to bound the singularity probability. The conditioning approach is the same as that used in the symmetric case (Corollary 3.1) and was inspired by . The conditioning argument is useful since some entries of the random matrix may take some values with very small probability (i.e. probability less than any constant); recall that while the entries of the random matrix always take values in a fixed set SS of cardinality O(1)O(1), the distributions of those random variables within SS are allowed to vary with nn. (Note that making use of Remark 8.5 would provide an alternate way of dealing with entries that take some values with very small probability.)

Say that αij\alpha_{ij} takes the values v1,,vtv_{1},\ldots,v_{t} with probabilities ϱ1,,ϱt\varrho_{1},\ldots,\varrho_{t}, respectively, where ϱ1ϱ2ϱt\varrho_{1}\geq\varrho_{2}\geq\cdots\geq\varrho_{t}. Define new random variables ηijk\eta_{ijk} such that for some i0i_{0} and i1i_{1}, the values taken by ηijk\eta_{ijk} are vi0,vi0+1,,vi0+i1v_{i_{0}},v_{i_{0}+1},\ldots,v_{i_{0}+i_{1}} with corresponding probabilities ϱi0/pk,ϱi0+1/pk,,ϱi0+i1/pk\varrho_{i_{0}}/p_{k},\varrho_{i_{0}+1}/p_{k},\ldots,\varrho_{i_{0}+i_{1}}/p_{k}, where pk:=i=1i1ϱi0+ip_{k}:=\sum_{i=1}^{i_{1}}\varrho_{i_{0}+i}. Thus, we can write

As in the proof of Corollary 3.1, we will condition on the values taken by the ηijk\eta_{ijk} in order to prove a bound on the singularity probability. We have that

where the sum runs over all possible values (cijk)(c_{ijk}) that the ηijk\eta_{ijk} can take. Thus, it is sufficient to prove a bound on the singularity probability for the random matrix Xn+Nn{ηijk=cijk}X_{n}+\left.N_{n}\rule{0.0pt}{11.0pt}\right|_{\{\eta_{ijk}=c_{ijk}\}} which has random entries

where xijx_{ij} and the cijkc_{ijk} are constants.

Combining Case 1 and Case 2, we have that the collection {xij+α~ij}\{x_{ij}+\widetilde{\alpha}_{ij}\} is (p+ϵ)\left(p+\epsilon\right)-bounded of exponent 2, and so by and by Theorem 2.2 we have that Pr(Xn+Nn{ηijk=cijk}\mboxissingular)(p+ϵ+o(1))n\Pr(X_{n}+\left.N_{n}\rule{0.0pt}{11.0pt}\right|_{\{\eta_{ijk}=c_{ijk}\}}\mbox{ is singular})\leq\left(\sqrt{p+\epsilon}+o(1)\right)^{n}.

The constant ϵ>0\epsilon>0 was chosen arbitrarily, and so letting ϵ\epsilon tend to zero, we get that

4 Partially random matrices

In this subsection, we prove a bound on the singularity probability for partly random matrices where many rows are deterministic.

Corollary 3.5 applies to partly random matrices with f=o(n)\mathfrak{f}=o(n) fixed, non-random rows containing integers bounded by a constant and with random entries taking at most O(1)O(1) values in the complex numbers. Corollary 1.2, on the other hand, holds with the fixed entries also allowed to take values in the complex numbers and gives a sligtly better bound, but additionally requires fO(lnn)\mathfrak{f}\leq O(\ln n) (which is far smaller in general than o(n)o(n)). Proving Corollary 1.2 requires mirroring the entire argument used to prove the main theorem (Theorem 2.2) in the case where f\mathfrak{f} rows contain fixed, non-random entires, and we discuss this argument in Section 9. Proving Corollary 3.5, however, can be done directly from Theorem 2.2, as we will show below. First, we will state a generalization of Corollary 3.5.

To obtain Corollary 3.6 from Corollary 3.5, find a collection C\mathcal{C} of fk\mathfrak{f}-k linearly independent rows among the deterministic rows. Replace the rest of the deterministic rows with a collection C\mathcal{C}^{\prime} of rows containing integer values between K-K and KK such that C\mathcal{C}^{\prime} is linearly independent from C\mathcal{C}. Finally, apply Corollary 3.5 to the new partially random matrix whose deterministic rows are from CC\mathcal{C}\cup\mathcal{C}^{\prime}, thus proving Corollary 3.6.

By reordering the rows and columns, we may write

where AA is an f\mathfrak{f} by f\mathfrak{f} non-random invertible matrix, BB is an f\mathfrak{f} by nfn-\mathfrak{f} non-random matrix, CC is an nfn-\mathfrak{f} by f\mathfrak{f} random matrix, and DD is an nfn-\mathfrak{f} by nfn-\mathfrak{f} random matrix. Note that Nf,nN_{\mathfrak{f},n} is singular if and only if there exists a vector v\mathbf{v} such that Nf,nv=0N_{\mathfrak{f},n}\mathbf{v}=0. Let v1\mathbf{v}_{1} be the first f\mathfrak{f} coordinates of v\mathbf{v} and let v2\mathbf{v}_{2} be the remaining nfn-\mathfrak{f} coordinates. Then Nf,nv=0N_{\mathfrak{f},n}\mathbf{v}=0 if and only if

Since AA is invertible, these two equations are satisfied if and only if (CA1B+D)v2=0(-CA^{-1}B+D)\mathbf{v}_{2}=0, that is, if and only if the random matrix CA1B+D-CA^{-1}B+D is singular.

We want to show that every entry that can appear in CA1B-CA^{-1}B is an element of no(n)(S{1,0,1})O(1)n^{o(n)}\left(S\cup\{-1,0,1\}\right)^{O(1)}. By the cofactor formula for A1A^{-1}, we know that the i,ji,j entry of A1A^{-1} is (1)i+j(detAij)/detA(-1)^{i+j}(\det A_{ij})/\det A, where AijA_{ij} is the f1\mathfrak{f}-1 by f1\mathfrak{f}-1 matrix formed by deleting the ii-th row and jj-th column of AA. Thus, A1=1detAA~A^{-1}=\frac{1}{\det A}\widetilde{A}, where the i,ji,j entry of A~\widetilde{A} is (1)i+jdetAij(-1)^{i+j}\det A_{ij}. By the volume formula for the determinant, we know that detA\left|\det A\right| is at most the product of the lengths of the row vectors of AA; and thus detAno(n)\left|\det A\right|\leq n^{o(n)} (here we need that AA has integer entries between K-K and KK, where KK is a constant, and that fo(n)\mathfrak{f}\leq o(n)). Similarly, we have detAijno(n)\left|\det A_{ij}\right|\leq n^{o(n)}. Every entry of A~\widetilde{A} is thus in no(n){1,0,1}n^{o(n)}\{-1,0,1\}, every entry of CC is in SS, and every entry of BB is in O(1){1,0,1}O(1)\{-1,0,1\}; thus, every entry of CA~B-C\widetilde{A}B is an element of no(n)(S{1,0,1})n^{o(n)}(S\cup\{-1,0,1\}).

Conditioning on the values taken by all the entries in CC, we have

where the sum runs over all possible matrices (cij)(c_{ij}) that CC can produce. Considering the entries in C=(cij)C=(c_{ij}) to be fixed (note that AA and BB are fixed by assumption), we now need to bound

Note that every entry of (cij)A~B-(c_{ij})\widetilde{A}B is an element of no(n)(S{1,0,1})O(1)n^{o(n)}\left(S\cup\{-1,0,1\}\right)^{O(1)} and that the random matrix (detA)D(\det A)D has entries that take values in the fixed set {(detA)s:sS}\{(\det A)s:s\in S\} having cardinality O(1)O(1). Thus, by Corollary 3.4, we have that

Plugging this bound back into Equation (14) completes the proof. ∎

5 Integer matrices and rational eigenvalues

Let ηk\eta_{k} be the random variable taking the values k,k+1,,k1,k-k,-k+1,\ldots,k-1,k each with equal probability, and let MnM_{n} be the nn by nn matrix where each entry is an independent copy of ηk\eta_{k}. In , Martin and Wong show that for any ϵ>0\epsilon>0,

where c(n,ϵ)c(n,\epsilon) is a constant depending on nn and ϵ\epsilon. (One goal in is to study this bound as kk goes to \infty while nn is fixed, which is why c(n,ϵ)c(n,\epsilon) is allowed to depend on nn.)

Below, we prove a similar result for random integer matrices with entries between k-k and kk (with kk fixed), where we allow each entry to have a different (independent) distribution and we also allow the distributions to be very general.

Fix a positive integer kk, and let Mk,nM_{k,n} be a random integer matrix with independent entries, each of which takes values in the set {k,k+1,,k1,k}\{-k,-k+1,\ldots,k-1,k\}. Let cc be a constant such that for every entry α\alpha, we have maxkxkPr(α=x)c/k\max_{-k\leq x\leq k}\Pr(\alpha=x)\leq c/k. Then

where the o(1)o(1) term goes to zero as nn goes to \infty.

For example, in the case where each independent entry has the uniform distribution on {k,k+1,,k1,k}\{-k,-k+1,\ldots,k-1,k\} (as in ), one can set c=1/2c=1/2 in the corollary above.

The proof given below follows the same outline as the main theorem of , with Corollary 1.2 replacing an appeal to [7, Lemma 1].

The characteristic polynomial for Mk,nM_{k,n} is monic with integer coefficients, and thus the only possible rational eigenvalues are integers (by the rational roots theorem). Every eigenvalue of Mk,nM_{k,n} has absolute value at most nknk (see [7, Lemma 4]); thus, the only possible integer eigenvalues are between nk-nk and nknk.

The matrix Mk,nM_{k,n} has λ\lambda as an eigenvalue if and only if Mk,nλIM_{k,n}-\lambda I is singular (where II is the nn by nn identity matrix). By Corollary 1.2 (with f=0\mathfrak{f}=0), we have

Random matrices with complex entries: A reduction technique

the map ϕQ\phi_{Q} is injective on SS, and

for any nn by nn matrix (sij)1i,jn(s_{ij})_{1\leq i,j\leq n} with entries sijSs_{ij}\in S, we have

Let pp be a constant such that 0<p10<p\leq 1 and let DD be a characteristic zero integral domain. Let SDS\subset D have cardinality SO(1)\left|S\right|\leq O(1). If NnN_{n} is an nn by nn matrix with independent random entries, each taking values in SS, such that for every entry α\alpha, we have maxxPr(α=x)p\max_{x}\Pr(\alpha=x)\leq p, then

Proof of the main theorem (Theorem 2.2)

Let Xi:=(αi,1,,αi,n)X_{i}:=(\alpha_{i,1},\ldots,\alpha_{i,n}) denote the ii-th row of NnN_{n}. We note that NnN_{n} has determinant zero if and only if there is a linear dependency among the rows XiX_{i}. It has been shown (see [10, Lemma 5.1] and also ) that the dominant contribution to the singularity probability comes from the XiX_{i} spanning a hyperplane (of dimension n1n-1). In particular,

where AVA_{V} denotes the event that X1,,XnX_{1},\ldots,X_{n} span VV, and non-trivial means that VV contains the origin, VV is spanned by vectors in SnS^{n} (where SS is the set of all possible values that can occur in NnN_{n}), and Pr(XiV)>0\Pr(X_{i}\in V)>0 for all ii.

As in , we will divide the non-trivial hyperplanes into n2n^{2} classes, since it is then sufficient to show that the sum of Pr(AV)\Pr(A_{V}) over all VV in a particular class is at most (p1/r+o(1))n(p^{1/r}+o(1))^{n}.

For d±=0d_{\pm}=0, we define Gr(0)\operatorname{Gr}(0) to be the set of all non-trivial hyperplanes such that

We will refer to d±d_{\pm} as the combinatorial dimension of VV.

Note that Gr(d±)=\operatorname{Gr}(d_{\pm})=\emptyset for d±n1+1/nd_{\pm}\geq n-1+1/n (by Lemma B.1). We will consider hyperplanes VV with combinatorial dimension in three main regions: d±d_{\pm} small, d±d_{\pm} medium-sized, and d±d_{\pm} large. The two lemmas and the proposition below suffice to prove Theorem 2.2.

The reasoning here is the same as in [11, Lemma 2.3], making use of fact that Pr(XiV)max1inPr(XiV)pnd±δn\Pr(X_{i}\in V)\leq\max_{1\leq i\leq n}\Pr(X_{i}\in V)\leq p^{n-d_{\pm}}\leq\delta^{n}. In particular,

which completes the proof since the summing the right-hand side over all VV is at most nmaxiPr(XiV)n\max_{i}\Pr(X_{i}\in V) (note that an instance of the vectors {Xj}1jn{Xi}\{X_{j}\}_{1\leq j\leq n}\setminus\{X_{i}\} can span at most one hyperplane). ∎

where kk is the number of nonzero coordinates in the normal vector to VV. Combining the two inequalities above shows that kn/2k\leq n/2.

(Lemma A.2 is a natural generalization of [4, Section 3.1]; see also , [10, Lemma 5.1], and [2, Lemma 14.10].) ∎

The results in and were derived using the ideas that we will use for the unexceptional medium combinatorial dimension case. The idea of considering the exceptional case separately in (and using tools from additive combinatorics in the exceptional case) is what lead to the improvement of Inequality (1), which gives a bound of asymptotically (34)n\left(\frac{3}{4}\right)^{n}, over the .999n.999^{n} bound in .

2 Proof of the medium combinatorial dimension

Before defining exceptional and unexceptional hyperplanes, we will need some new notation. By assumption, the collection of random variables {αij}1i,jn\{\alpha_{ij}\}_{1\leq i,j\leq n} is pp-bounded of exponent rr with a constant μ=1p\mu=1-p, with random variables βij(μ)\beta^{(\mu)}_{ij} corresponding to each αij\alpha_{ij}, and with a constant 0<qp0<q\leq p (see Definition 2.1). We also define a constant slightly smaller than μ\mu, namely μ:=μϵ0100\displaystyle\underline{\mu}:=\mu-\frac{\epsilon_{0}}{100}. We will let Yi:=(yi,1,,yi,n):=(βi,1(μ),,βi,n(μ))Y_{i}:=(y_{i,1},\ldots,y_{i,n}):=(\beta^{(\underline{\mu})}_{i,1},\ldots,\beta^{(\underline{\mu})}_{i,n}) denote another row vector that corresponds to the row vector XiX_{i} (βi,j(μ)\beta^{(\underline{\mu})}_{i,j} comes from the definition of pp-bounded of exponent rr). Also, we will let

Consider a hyperplane VV of medium combinatorial dimension (that is, d±d_{\pm} satisfies the condition in Proposition 5.4). We say VV is unexceptional if there exists an i0i_{0} where 1i0n1\leq i_{0}\leq n and there exists a k0k_{0} where 1k0r1\leq k_{0}\leq r such that

We say VV is exceptional if for every ii where 1in1\leq i\leq n and for every kk where 1kr1\leq k\leq r we have

Proposition 5.4 follows from the two lemmas below, so long as ϵ1\epsilon_{1} is chosen suitably small with respect to ϵ0\epsilon_{0}, cmc_{m}, and rr.

We will prove Lemma 5.6 in Section 5.3, and we will prove Lemma 5.7 in Section 6.

3 The unexceptional medium combinatorial dimension case

The general idea for the case of an unexceptional hyperplane VV is to replace some of the rows XiX_{i} in the matrix NnN_{n} with rows that concentrate more sharply on the subspace VV. In the case where the exponent r=1r=1, replacing a row XiX_{i} with Yi:=(βi,1(μ),,βi,n(μ))Y_{i}:=(\beta^{(\underline{\mu})}_{i,1},\ldots,\beta^{(\underline{\mu})}_{i,n}) is successful; however, in the exponent r=2r=2 case, for example, replacing the entire row results in a bound that is off by an exponential factor. We solve this problem by replacing XiX_{i} with only half of YiY_{i} (with the other half of the entries being zero). This idea easily extends to any integer r2r\geq 2 and is the motivation for defining the vectors Zi,kZ^{*}_{i,k} to have all zeros except for roughly n/rn/r coordinates, as is done in Equation (17). The basic utility of Zi0,k0Z^{*}_{i_{0},k_{0}} (from Definition 5.5) is that it concentrates more sharply on the unexceptional subspace VV than the vector XiX_{i} for any ii.

Let Zi0,k0Z^{*}_{i_{0},k_{0}} be the vector from the definition of unexceptional (Definition 5.5) such that Pr(XiV)<ϵ1Pr(Zi0,k0V)\Pr(X_{i}\in V)<\epsilon_{1}\Pr(Z^{*}_{i_{0},k_{0}}\in V) for every ii, and set Z:=Zi0,j0Z:=Z^{*}_{i_{0},j_{0}}. Let mm be the closest integer to cmϵ0nr\frac{c_{m}\epsilon_{0}n}{r}, where cmc_{m} is a small positive absolute constant (for example, in , cmc_{m} is taken to be 1100\frac{1}{100}). Finally, let Z1,,ZmZ_{1},\ldots,Z_{m} be copies of ZZ, independent of each other and of X1,,XnX_{1},\ldots,X_{n}.

Let BV,mB_{V,m} be the event that Z1,,ZmZ_{1},\ldots,Z_{m} are linearly independent and lie in VV. Then,

The argument follows the same reasoning as [11, Lemma 4.4], however, the quantity 2d±n2^{d_{\pm}-n} in should be replaced by max1inPr(XiV)\max_{1\leq i\leq n}\Pr(X_{i}\in V). Details are provided in Appendix B. ∎

To conclude the proof of Lemma 5.6, we follow the “row-swapping” argument at the end of [11, Section 4], with the small change of bounding Pr(XiV)\Pr(X_{i}\in V) by max1inPr(XiV)\displaystyle\max_{1\leq i\leq n}\Pr(X_{i}\in V), which we use in place of the quantity 2d±n2^{d_{\pm}-n}. Details are provided in Appendix B.

Analyzing the exceptional medium combinatorial dimension case

The approach for exceptional VV in is very different from that used in the unexceptional case or in the large or small combinatorial dimension cases. Using some powerful tools from additive combinatorics, the general idea is to put exceptional hyperplanes VV in correspondence with a particular additive structure called a generalized arithmetic progression, and then to show that the number of the particular generalized arithmetic progressions that arise in this way is exceedingly small. The key to this approach is a structure theorem—namely, [11, Theorem 5.3]. In this section, we state a slightly modified structure theorem (Theorem 6.1), and then we show how to use Theorem 6.1 to prove Lemma 5.7. In the beginning of Section 7, we outline the changes needed to prove the the structure theorem for our current context, and in Sections 7 and 8 we provide details.

Before stating the structure theorem, we need some definitions and notation. A generalized arithmetic progression of rank r\mathfrak{r} is a set of the form

We will use the notation mPmP, where mm is a positive integer, to denote the set {i=1mxi:xiP}\{\sum_{i=1}^{m}x_{i}:x_{i}\in P\} and the notation PmP^{m}, where mm is a positive integer, to denote the set {i=1mxi:xiP}\{\prod_{i=1}^{m}x_{i}:x_{i}\in P\}. If PP is a generalized arithmetic progression of rank r\mathfrak{r}, then so is mPmP, while PmP^{m}, on the other hand, is a generalized arithmetic progression of rank at most rm\mathfrak{r}^{m}. Also note that mPmrP\left|mP\right|\leq m^{\mathfrak{r}}\left|P\right| and that PmPm\left|P^{m}\right|\leq\left|P\right|^{m}.

Given an exceptional hyperplane VV, there exists a representation of the form

and M1,,Mr1M_{1},\ldots,M_{\mathfrak{r}}\geq 1 with the volume bound

(i) (Scaled defining coordinates lie in a progression) The symmetric generalized arithmetic progression

We will discuss the proof of the structure theorem in Sections 7 and 8. In the remainder of this section, we will discuss how to use the structure theorem to prove Lemma 5.7.

Fix d±d_{\pm} of medium combinatorial dimension (see Proposition 5.4). Using independence of the rows, we have

In [11, Section 5], it is shown using Theorem 6.1(i) and (ii) and Gaussian-type methods (and the fact that r\mathfrak{r} is bounded by a constant) that

Plugging the volume bound on M1MrM_{1}\cdots M_{\mathfrak{r}} into the previous displayed inequality, we have

Halász-type arguments

The proof of the structure theorem has two main ingredients: tools from additive combinatorics, and Halász-type arguments using discrete Fourier analysis. Our proof of Theorem 6.1 will follow the proof of [11, Theorem 5.2] very closely. We will use results about additive combinatorics from [11, Section 6] directly, and we will discuss below the extent to which the Halász-type arguments of [11, Section 7] need to be modified to work for our current context. The proof of Theorem 6.1 will be given in Section 8 using results from the current section, [11, Section 6], [11, Section 7], and [11, Section 8]. Our Section 8 follows [11, Section 8] closely, with a few modifications to prove rational TT-commensurability instead of only rational commensurability.

In this section we discuss modifications to the lemmas in [11, Section 7] that are needed in order to prove Theorem 6.1.

We will use eQ()e_{Q}(\cdot) to denote the primitive character

where the last line is an application of Hölder’s inequality.

where μ:=μϵ0100\underline{\mu}:=\mu-\frac{\epsilon_{0}}{100}, as defined in Section 5.2. Note that f(ξ)=j=1nfj(ξ)\displaystyle f(\xi)=\prod_{j=1}^{n}f_{j}(\xi).

We will need the following analog of [11, Lemma 7.1]:

This inequality may be proven pointwise (for each jj after expanding out the definition of gkg_{k}) using the convexity of the log\log function, just as in the proof of [11, Lemma 7.1] (see also [10, Lemma 7.1]. ∎

(μpj,12q\mu p_{j,1}\geq 2q since minxPr(βj(μ)=x)q\min_{x}\Pr(\beta^{(\mu)}_{j}=x)\geq q by Definition 2.1).

Thus, there is a constant C(ϵ2,q,r)C(\epsilon_{2},q,r) such that

for every ξΛ\xi\in\Lambda. (E.g., the constant C(ϵ2,q,r):=(50rqln(1ϵ2))1/2C(\epsilon_{2},q,r):=\left(\frac{50r}{q}\ln\left(\frac{1}{\epsilon_{2}}\right)\right)^{1/2} suffices.)

There exists a constant CC depending on ϵ1,ϵ0,ϵ1,ϵ2,q,r\epsilon_{-1},\epsilon_{0},\epsilon_{1},\epsilon_{2},q,r, and μ\mu such that

Furthermore, for every integer k4k\geq 4 we have

Our goal is to bound ξΛf(ξ)\sum_{\xi\in\Lambda}f(\xi) from above and below, and then pass to bounds on Λ\left|\Lambda\right| using the fact that ϵ2f(ξ)1\epsilon_{2}\leq f(\xi)\leq 1 for all ξΛ\xi\in\Lambda.

We can choose ϵ2\epsilon_{2} sufficiently small with respect to ϵ1\epsilon_{1} and 1μ/μ1-\underline{\mu}/\mu so that, for example,

for some constant CC. This completes the proof of Lemma 7.2. ∎

We now state and prove a lemma showing that f(ξ)f(\xi) is at least a constant for all ξ4Λ\xi\in 4\Lambda. In , the lemma below is unnecessary because an inequality following from [11, Inequality (30)] (which corresponds to Inequality (30)) and the triangle inequality suffices.

Let Λ\Lambda and ff be defined as in Equation (28) and Equation (25), respectively. If ξ4Λ\xi\in 4\Lambda, then

Note that c(ϵ1,ϵ2)c(\epsilon_{-1},\epsilon_{2}) is a constant.

Note that Inequality (29) implies that for any ξΛ\xi^{\prime}\in\Lambda we have

Thus, by the triangle inequality, we have for any ξ4Λ\xi\in 4\Lambda that

Fix ξ4Λ\xi\in 4\Lambda. Let k0k_{0} be the number of indices jj such that

and without loss of generality, say that these indices are j=1,2,,k0j=1,2,\ldots,k_{0}. Squaring Inequality (34), we see that k0200μ1600rμln(1ϵ2)\frac{k_{0}}{200\mu}\leq\frac{1600r}{\mu}\ln\left(\frac{1}{\epsilon_{2}}\right), and so we have

which is a constant. Thus, for the vast majority of the indices jj, namely j=k0+1,k0+2,,nj=k_{0}+1,k_{0}+2,\ldots,n, we have

Note that 0xΛ10\leq\left\|x\right\|_{\Lambda}\leq 1 for all xx and that the triangle inequality holds: x+yΛxΛ+yΛ\left\|x+y\right\|_{\Lambda}\leq\left\|x\right\|_{\Lambda}+\left\|y\right\|_{\Lambda}. We also have that

Thus, squaring Inequality (30) and summing over all ξΛ\xi\in\Lambda, we have

In the next section, we will complete the proof of the structure theorem using the lemma above.

Proof of the Structure Theorem (Theorem 6.1)

The key to proving the structure theorem is an application of Freiman’s Theorem for finite fields.

The symmetric generalized arithmetic progression PP is close to what is needed for Theorem 6.1, since it satisfies the required volume and rank bounds. We will show below that PP can be altered in ways that preserve Inequalities (37) and (38) (except possibly for changing the implicit constants) so that PP satisfies conditions (i), (ii), and (iii) of Theorem 6.1.

There is an absolute constant C01C_{0}\geq 1 such that the following holds. Let PP be a symmetric progression of rank r\mathfrak{r} in a abelian group GG, such that every nonzero element of GG has order at least rC0r3P\mathfrak{r}^{C_{0}\mathfrak{r}^{3}}|P|. Then there exists a proper symmetric generalized arithmetic progression PP^{\prime} of rank at most r\mathfrak{r} containing PP such that

Furthermore, if PP is not proper and r2\mathfrak{r}\geq 2, then PP^{\prime} can be chosen to have rank an most r1\mathfrak{r}-1

One can conclude Lemma 8.2 from the proof of [11, Lemma 9.3] (the only difference is noting that the rank can be reduced by at least 1 if PP is not proper to begin with). Note that we can always choose QQ larger than rC0r3PO(1p)n\mathfrak{r}^{C_{0}\mathfrak{r}^{3}}|P|\leq O\left(\frac{1}{p}\right)^{n}.

Furthermore, if r2r\geq 2 and if there is some jj for which PP is not (kj,xjk_{j},x_{j})-proper, then PP^{\prime} can be chosen to have rank at most r1\mathfrak{r}-1.

We proceed by induction on the rank r\mathfrak{r}. For the base case, let r=1\mathfrak{r}=1 and consider xjPx_{j}\in P such that kjxjPk_{j}x_{j}\in P. Since PP has rank 1 in this case, we have that xj=ΦP(xj)v1x_{j}=\Phi_{P}(x_{j})v_{1} and kjxj=ΦP(kjxi)v1k_{j}x_{j}=\Phi_{P}(k_{j}x_{i})v_{1}. Combining these two equations we have kjΦP(xj)v1=ΦP(kjxj)v1k_{j}\Phi_{P}(x_{j})v_{1}=\Phi_{P}(k_{j}x_{j})v_{1}, and dividing by v1v_{1} (note that we may assume that v10v_{1}\neq 0), we see that kjΦP(xj)=ΦP(kjxj)k_{j}\Phi_{P}(x_{j})=\Phi_{P}(k_{j}x_{j}). Thus PP is (kj,xj)(k_{j},x_{j})-proper for every jj.

For r2\mathfrak{r}\geq 2, we may assume that there is some j0j_{0} such that kj0ΦP(xj0)ΦP(kj0xj0)k_{j_{0}}\Phi_{P}(x_{j_{0}})\neq\Phi_{P}(k_{j_{0}}x_{j_{0}}) (i.e., we assume that PP is not (kj0,xj0)(k_{j_{0}},x_{j_{0}})-proper). We may assume that PP has the form {m1v1++mrvr:mi<Mi/2}\{m_{1}v_{1}+\cdots+m_{\mathfrak{r}}v_{\mathfrak{r}}:\left|m_{i}\right|<M_{i}/2\}. Let M:=(M1,,Mr)M:=(M_{1},\ldots,M_{\mathfrak{r}}), and let (M/2,M/2)(-M/2,M/2) denote the box {(m1,,mr):mi<Mi/2}\{(m_{1},\ldots,m_{\mathfrak{r}}):\left|m_{i}\right|<M_{i}/2\}.

Let k\underline{k} be the largest integer such that ΦP(kxj0)=kΦP(xj0)\Phi_{P}(\underline{k}x_{j_{0}})=\underline{k}\Phi_{P}(x_{j_{0}}), so 1k<kj01\leq\underline{k}<k_{j_{0}} and ΦP((k+1)xj0)(k+1)ΦP(xj0)\Phi_{P}((\underline{k}+1)x_{j_{0}})\neq(\underline{k}+1)\Phi_{P}(x_{j_{0}}). Since kxj0P\underline{k}x_{j_{0}}\in P and xj0Px_{j_{0}}\in P, we know that ΦP(xj0)(M/2,M/2)\Phi_{P}(x_{j_{0}})\in(-M/2,M/2) and ΦP(kxj0)=kΦP(xj0)(M/2,M/2)\Phi_{P}(\underline{k}x_{j_{0}})=\underline{k}\Phi_{P}(x_{j_{0}})\in(-M/2,M/2); and thus, (k+1)ΦP(xj0)(M,M)(\underline{k}+1)\Phi_{P}(x_{j_{0}})\in(-M,M). This shows that 2P2P, which has dimensions 2M=(2M1,,2Mr)2M=(2M_{1},\ldots,2M_{\mathfrak{r}}), is not proper, since it has two distinct representations for (k+1)xj0(\underline{k}+1)x_{j_{0}}.

We can now apply Lemma 8.2 to 2P2P, thus finding a proper symmetric generalized arithmetic progression PP^{\prime} of rank at most r1\mathfrak{r}-1 containing 2P2P (which contains PP) such that

Since PP^{\prime} has rank at most r1\mathfrak{r}-1, we have by induction that there exists PP^{\prime\prime} a proper symmetric generalized arithmetic progression of rank at most r1\mathfrak{r}-1 containing PP^{\prime} and such that

and such that PP^{\prime\prime} is (kj,xjk_{j},x_{j})-proper for every jj. Choosing C12C0C_{1}\geq 2C_{0} (for example) guarantees that rC1(r1)4r2C0r3rC1r4\mathfrak{r}^{C_{1}(\mathfrak{r}-1)^{4}}\mathfrak{r}^{2C_{0}\mathfrak{r}^{3}}\leq\mathfrak{r}^{C_{1}\mathfrak{r}^{4}}, which completes the induction. ∎

Since not all the αi\alpha_{i} are zero, we may assume that αr0\alpha_{\mathfrak{r}}\neq 0. Setting w=vr/αrw=v_{\mathfrak{r}}/\alpha_{\mathfrak{r}} so that vrwαr=0v_{\mathfrak{r}}-w\alpha_{\mathfrak{r}}=0, we see that PP is contained in the symmetric generalized arithmetic progression

with rank r1\mathfrak{r}-1, dimensions M1,,Mr1M_{1},\ldots,M_{\mathfrak{r}-1} (which are the same as the corresponding dimensions for PP), and basis vectors vi:=viαivr/αrv_{i}^{\prime}:=v_{i}-\alpha_{i}v_{\mathfrak{r}}/\alpha_{\mathfrak{r}}. By construction PP|P^{\prime}|\leq|P|. ∎

If PP is proper, then do nothing; otherwise apply Lemma 8.2.

If PP satisfies the three properties given in steps 1, 2, and 3, halt; otherwise, return to step 1.

Thus, all that is left to prove is part (iii), the claim of rational TT-commensurability. Though we will not need it in the current section, one should recall that Theorem 6.1 is only useful when no(n)TO(1)=no(n)\left|n^{o(n)}T^{O(1)}\right|=n^{o(n)}, where TT is the symmetric generalized arithmetic progression containing {1,0,1}\{-1,0,1\} and all possible values taken by the βij(μ)\beta^{(\mu)}_{ij} and the αij\alpha_{ij} (see Section 6).

We say that a set WW economically TT-spans a set UU if each uUu\in U can be represented as a highly TT-rational linear combination of elements in WW, where each coefficient may be expressed as a/ba/b where a,bno(n)TO(1)a,b\in n^{o(n)}T^{O(1)} and where the implicit constants in the o()o(\cdot) and O()O(\cdot) notation are uniform over UU.

Comparing our definitions with those from [11, Section 8], we note that “highly rational” means the same thing as “highly {1,0,1}\{-1,0,1\}-rational”, and “economically spans” means the same thing as “economically {1,0,1}\{-1,0,1\}-spans”. Thus, it is clear that any highly rational number is also highly TT-rational for any TT containing {1,0,1}\{-1,0,1\}, and also the statement “WW economically spans UU” implies “WW economically TT-spans UU” for any set TT containing {1,0,1}\{-1,0,1\}. The remainder of this section paraphrases (with some notational changes) the latter portion of [11, Section 8].

Let ss be the smallest integer such that there exists a subset of cardinality ss of {v1,,vr}\{v_{1},\ldots,v_{\mathfrak{r}}\} (by renumbering, say the set is {v1,,vs}\{v_{1},\ldots,v_{s}\}) so that for some nonzero dno(n)TO(1)d\in n^{o(n)}T^{O(1)} and some cijno(n)TO(1)c_{ij}\in n^{o(n)}T^{O(1)} we have

Note that the line above is the only place in the proof where we use the assumption from the definition of pp-bounded of exponent rr that the βij(μ)\beta^{(\mu)}_{ij} take values in a set with cardinality O(1)O(1). As is evidenced here, the following weaker assumption suffices instead: say that for each 1in1\leq i\leq n there exists a set BiB_{i} such that Bi=O(1)\left|B_{i}\right|=O(1) and such that βi1(μ),βi2(μ),,βin(μ)\beta^{(\mu)}_{i1},\beta^{(\mu)}_{i2},\ldots,\beta^{(\mu)}_{in} each take a nonzero value in BiB_{i} with probability at least qq. In fact, this weaker assumption also replaces the assumption in the definition of pp-bounded of exponent rr that qminxPr(βij(μ)=x)q\leq\min_{x}\Pr(\beta^{(\mu)}_{ij}=x) for every i,ji,j: It suffices for each βij(μ)\beta^{(\mu)}_{ij} to take one value in BiB_{i} with probability at least qq, instead of taking every value with probability at least qq.

Plugging this last equation into Equation (39), we arrive at

Thus, we have completed the proof of the structure theorem (Theorem 6.1). \square

A generalization: 𝔣𝔣\mathfrak{f} rows have fixed, non-random values

In this section, we will give a generalization of Theorem 2.2 to the case where the random matrix NnN_{n} has fO(lnn)\mathfrak{f}\leq O(\ln n) rows that are assumed to be linearly independent and contain fixed, non-random entries. The proof of the generalized result is very similar to the proof of Theorem 2.2, and we will sketch the main differences in the two proofs below.

Let f\mathfrak{f} be an integer between 1 and nn, let SS be a subset of a ring, and let Nf,nN_{\mathfrak{f},n} be an nn by nn matrix defined as follows. For 1if1\leq i\leq\mathfrak{f} and 1jn1\leq j\leq n, let the entries sijs_{ij} of Nf,nN_{\mathfrak{f},n} be fixed (non-random) elements of SS such that the rows (si,1,,si,n)(s_{i,1},\ldots,s_{i,n}) for 1if1\leq i\leq\mathfrak{f} are linearly independent. For f+1in\mathfrak{f}+1\leq i\leq n and 1jn1\leq j\leq n, let the entries αij\alpha_{ij} of Nf,nN_{\mathfrak{f},n} be discrete finite random variables taking values in SS. Thus,

Let pp be a positive constant such that 0<p<10<p<1, let rr be a positive integer constant, and let SS be a generalized arithmetic progression in the complex numbers with rank O(1)O(1) (independent of nn) and with cardinality at most Sno(n)\left|S\right|\leq n^{o(n)}. Consider the matrix Nf,nN_{\mathfrak{f},n} with entries in SS (see Definition 9.1 above), where f(r2ln(1/p)o(1))lnn\mathfrak{f}\leq\left(\frac{r}{2\ln(1/p)}-o(1)\right)\ln n. If the collection of random variables {αjk}f+1jn,1kn\{\alpha_{jk}\}_{\mathfrak{f}+1\leq j\leq n,1\leq k\leq n} is pp-bounded of exponent rr, then

Note that the bound on the singularity probability of Nf,nN_{\mathfrak{f},n} for r2r\geq 2 is the same as in Theorem 2.2 (since for r2r\geq 2, we have n/rnclnn=nfn/r\ll n-c\ln n=n-\mathfrak{f}). This is a reflection of the fact that only the large dimension case uses the randomness in all the rows simultaneously, and in that case the exponential bound does not depend on rr. Generally speaking, the best known lower bounds on the singularity probability of a discrete random matrix come from a dependency among at most two random rows, and since Nf,nN_{\mathfrak{f},n} certainly has more than two random rows, the upper bounds given in Theorem 9.2 seem reasonable.

Theorem 9.2 leads to Corollary 1.2 by following a conditioning argument very similar to that given in Section 3.3.

The proof of Theorem 9.2 follows the same lines of reasoning as that of Theorem 2.2. In this subsection, we will state the main lemmas with the necessary modifications, and we will mention a few important considerations when making the modifications.

Note that Equation (15), which reduces the question of singularity to one of the rows spanning non-trivial hyperplane of dimension n1n-1 holds in the current context, using the same definition of AVA_{V} and “non-trivial hyperplane” (both are defined after Equation (15) in Section 5.1).

For d±=0d_{\pm}=0, we define Grf(0)\operatorname{Gr}_{\mathfrak{f}}(0) to be the set of all non-trivial hyperplanes such that

We will refer to d±d_{\pm} as the combinatorial dimension of VV.

The proof is the same as that for Lemma 5.2; also see , , . ∎

The proof is the same as that for Lemma 5.3, except now we appeal to Lemma A.2 with f>0\mathfrak{f}>0. Note that we must assume fn/2\mathfrak{f}\leq n/2 in order to apply Lemma A.2. See also ,,. ∎

Consider a hyperplane VV of medium combinatorial dimension (that is, d±d_{\pm} satisfies the condition in Proposition 9.6). We say VV is unexceptional if there exists an i0i_{0} where f+1i0n\mathfrak{f}+1\leq i_{0}\leq n and there exists a k0k_{0} where 1k0r1\leq k_{0}\leq r such that

We say VV is exceptional if for every ii where f+1in\mathfrak{f}+1\leq i\leq n and for every kk where 1kr1\leq k\leq r we have

If fcfϵ0nr\mathfrak{f}\leq\frac{c_{\mathfrak{f}}\epsilon_{0}n}{r} for some positive constant cfc_{\mathfrak{f}}, then we have

The proof follows in the same way as that for Lemma 5.6; however, when replacing rows XiX_{i} of Nf,nN_{\mathfrak{f},n} with rows ZiZ_{i} that concentrate more sharply on VV, we must take care to only replace random rows of Nf,nN_{\mathfrak{f},n} (i.e., rows X1,,XfX_{1},\ldots,X_{\mathfrak{f}} must not be replaced by ZiZ_{i}). See Appendix B for details. ∎

In the exceptional case, The same structure theorem (Theorem 6.1) holds, leading to the following lemma.

If f(r2ln(1/p)o(1))lnn\mathfrak{f}\leq\left(\frac{r}{2\ln(1/p)}-o(1)\right)\ln n, then

Note that this upper bound is dramatically worse than the analogous upper bound in Lemma 5.7 of nn2+o(n)n^{-\frac{n}{2}+o(n)}.

As in Lemma 5.7, the main step in the proof is applying the structure theorem (Theorem 6.1). In the current context, Inequality (20) holds with nfn-\mathfrak{f} as the exponent instead of nn (since there are only nfn-\mathfrak{f} random rows). If we combine this modified version of Inequality (20) with Inequality (21), then we have the bound

Acknowledgments

We would like to thank Kevin Costello for helpful conversations on the conditioning argument in Subsections 3.2 and 3.3. Also the third author would like to thank the National Defense Science and Engineering Fellowship and the National Science Foundation Graduate Research Fellowship for helping fund this work.

Appendix A Two background results

For QQ sufficiently large with respect to qq, rr, and nn, it is clear that we have

Our proof is closely modeled on the proof of [12, Corollary 7.13]. Let βj(μ)\beta^{(\mu)}_{j} be the symmetric random variables from the definition of pp-bounded of exponent rr corresponding to αj\alpha_{j} (see Equation (9)). Then, we can compute

Combining the above inequalities with Inequality (44) and following the proof of [12, Corollary 7.13] to bound the integral, we have

A.2 A generalization of a lemma due to Komlós [6]

This lemma is a generalization of the result in (see also [2, Lemma 14.10], [4, Section 3.1], and [10, Lemma 5.3]).

where the constant cc can be taken to be c2ln(100/p)c\geq 2\ln(100/p), and where 0\underline{0} denotes the zero vector.

Let E_{k}=\{\mbox{there exists }v\in\Omega_{1}\mbox{ with at mostknonzero coordinates such that }N_{\mathfrak{f},n}\cdot v=0\}. Clearly,

Let SS be the set of all possible values that could appear as entries in Nf,nN_{\mathfrak{f},n}, and let Nf,nj1,,jk\left.N_{\mathfrak{f},n}\right|_{j_{1},\ldots,j_{k}} be the nn by kk matrix consisting of columns j1,,jkj_{1},\ldots,j_{k} of Nf,nN_{\mathfrak{f},n}. Following [6, Lemma 2] (see also [2, Lemma 14.10] and [10, Lemma 5.3]) we can write

Let U(k,p,q)U(k,p,q) be a uniform upper bound for \Pr(\mbox{rowiisinis inH}), where f+1in\mathfrak{f}+1\leq i\leq n and qq is the constant from Definition 2.1 (here, we mean uniform with respect to the index sets {j1,,jk}\{j_{1},\ldots,j_{k}\} and {i1,,ik}\{i_{1},\ldots,i_{k}\}). Then we have

since k1k-1 fixed rows of Nf,nj1,,jk\left.N_{\mathfrak{f},n}\right|_{j_{1},\ldots,j_{k}} can span at most 1 hyperplane HH of dimension k1k-1.

Summing the bounds in Inequalities (45) and (46) completes the proof. ∎

Appendix B The unexceptional case with 𝔣𝔣\mathfrak{f} fixed rows

This section is adapted from the proof of [11, Lemma 4.1], and proves Lemma 5.6 by setting f=0\mathfrak{f}=0. Assume that fcfϵ0nr\mathfrak{f}\leq\frac{c_{\mathfrak{f}}\epsilon_{0}n}{r}, and let mm be the closest integer to cmϵnr\frac{c_{m}\epsilon n}{r}. Let Z1,,ZmZ_{1},\ldots,Z_{m} be i.i.d. copies of the unexceptional row vector Zi0,k0Z^{*}_{i_{0},k_{0}} from Definition 9.7, so ϵ1Pr(ZiV)>Pr(XiV)\epsilon_{1}\Pr(Z_{i}\in V)>\Pr(X_{i}\in V) for all f+1in\mathfrak{f}+1\leq i\leq n. We will need the following version of the Weighted Odlyzko Lemma:

[cf. [11, Lemma 4.3] or [4, Section 3.2]] For 1i1\leq i, let Wi1W_{i-1} be an (f+i1\mathfrak{f}+i-1)-dimensional subspace containing X1,,XfX_{1},\ldots,X_{\mathfrak{f}} (which are fixed, linearly independent row vectors). Then

Since Wi1W_{i-1} has dimension f+i1\mathfrak{f}+i-1, there exists a set of f+i1\mathfrak{f}+i-1 “determining” coordinates such that if a vector VWi1V\in W_{i-1}, then the f+i1\mathfrak{f}+i-1 “determining” coordinates determine the values of the remaining nfi+1n-\mathfrak{f}-i+1 coordinates. Since the maximum probability that any of the n/rn/r random coordinates in ZiZ_{i} takes a given value is at most 1μ=p+ϵ01001-\underline{\mu}=p+\frac{\epsilon_{0}}{100}, and since there are at least nrfi+1\frac{n}{r}-\mathfrak{f}-i+1 of the random coordinates in ZiZ_{i} that are not among the “determining” coordinates, we have the desired upper bound. ∎

Let V0:=Span{X1,,Xf}V_{0}:=\operatorname{Span}\{X_{1},\ldots,X_{\mathfrak{f}}\}, the space spanned by the f\mathfrak{f} fixed rows, and for 1im1\leq i\leq m let BV,iB_{V,i} be the event that Z1,,ZmZ_{1},\ldots,Z_{m} are linearly independent in VV0V\setminus V_{0}. We have the following analog of Lemma 5.8 (and also [11, Lemma 4.4]):

Let mm, f\mathfrak{f}, and BV,mB_{V,m} be as defined above. Then,

where BV,0B_{V,0} denotes the full space of the ZiZ_{i}. Conditioning on a particular instance of Z1,,Zi1Z_{1},\ldots,Z_{i-1} in BV,i1B_{V,i-1}, we have that

where Wi1W_{i-1} denotes the (f+i1\mathfrak{f}+i-1)-dimensional space spanned by X1,,XfX_{1},\ldots,X_{\mathfrak{f}} and Z1,,Zi1Z_{1},\ldots,Z_{i-1}. We will now establish a uniform bound that does not depend on which particular instance of Z1,,Zi1Z_{1},\ldots,Z_{i-1} in BV,i1B_{V,i-1} that we fixed by conditioning. By the definition of unexceptional, we have

and by the Weighted Odlyzko Lemma (see Lemma B.1), we have

Using Taylor’s Theorem with remainder (for example), one can show that

and plugging this estimate back into Inequality (47) we get

To conclude Lemma 9.8 (which implies Lemma 5.6 by setting f=0\mathfrak{f}=0), we will proceed as in the proof for [11, Lemma 4.1].

Let Z1,,ZmZ_{1},\ldots,Z_{m} be i.i.d. copies of Zi0,k0Z^{*}_{i_{0},k_{0}} that are independent of the random rows Xf+1,,XnX_{\mathfrak{f}+1},\ldots,X_{n}. Using independence and Bayes’ Identity we have

Because the ZiZ_{i} are linearly independent in VV0V\setminus V_{0}, we know that there is a subset I{f+1,f+2,,n}I\subset\{\mathfrak{f}+1,\mathfrak{f}+2,\ldots,n\} of cardinality I=m\left|I\right|=m, such that {Z1,,Zm}{Xi:iI}\{Z_{1},\ldots,Z_{m}\}\cup\{X_{i}:i\notin I\} spans VV. Let CV,IC_{V,I} be the event that {Z1,,Zm}{Xi:iI}\{Z_{1},\ldots,Z_{m}\}\cup\{X_{i}:i\notin I\} spans VV. Then we have

Summing the above inequality over all unexceptional VV (note that VPr(CV,I)1\sum_{V}\Pr(C_{V,I})\leq 1) and combining with the bound for Pr(AV)\Pr(A_{V}) above gives us

This completes the proof of the estimate for unexceptional VV.

References